What is a CBOW model and when to use it?

This recipe explains what a CBOW model is and when to use it.

Recipe Objective

What is a CBOW model and when should you use it? As discussed earlier, CBOW is one of the two Word2Vec architectures, the other being skip-gram. CBOW (Continuous Bag of Words) predicts the current word from the context words within a specific window around it. The input layer contains the context words and the output layer contains the current word. Between them sits the hidden layer, whose size is the number of dimensions in which we want to represent each word. CBOW is generally faster to train than skip-gram and tends to work well for frequent words, which makes it a reasonable choice when training speed on a large corpus matters.
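To make the "context words in, current word out" idea concrete, here is a minimal sketch of how (context, target) training pairs could be built from one tokenized sentence. The function name `cbow_pairs` and the example sentence are illustrative assumptions, and the sketch leaves out details that real implementations add, such as subsampling of frequent words.

```python
def cbow_pairs(tokens, window=2):
    """Yield (context_words, target_word) pairs for a CBOW-style model.

    Hypothetical helper for illustration only; not part of gensim.
    """
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]       # up to `window` words before
        right = tokens[i + 1:i + 1 + window]      # up to `window` words after
        context = left + right
        if context:  # skip tokens that have no neighbours at all
            yield context, target


sentence = ["alice", "was", "beginning", "to", "get", "very", "tired"]
pairs = list(cbow_pairs(sentence, window=2))

for context, target in pairs:
    print(context, "->", target)
```

Each pair is one training example: the model sees the context list and is asked to predict the target word in the middle.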

Step 1 - Import the necessary libraries

from nltk.tokenize import sent_tokenize, word_tokenize
import warnings
warnings.filterwarnings(action = 'ignore')
import gensim
from gensim.models import Word2Vec

Here we import the necessary packages and set the warnings filter to 'ignore', because some warnings may come up when we run the program and they can be ignored.

Step 2 - Load the sample data

# Read the whole text file into a single string; the with block closes the file afterwards
with open("/content/alice_in_wonderland.txt", "r") as sample:
    s = sample.read()

Step 3 - Replace the newline characters with spaces

f = s.replace("\n", " ")

Step 4 - Iterate and tokenize

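import nltk
# Tokenizer data used by sent_tokenize and word_tokenize
# (newer NLTK releases may ask for 'punkt_tab' instead)
nltk.download('punkt')
data = []
for i in sent_tokenize(f):          # one sentence at a time
    temp = []
    for j in word_tokenize(i):      # one word token at a time
        temp.append(j.lower())
    data.append(temp)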

Here data starts out as an empty list. The outer for loop iterates over each sentence in the text, and the inner for loop tokenizes that sentence into lowercase words. The result is a list of sentences, where each sentence is itself a list of tokens.
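The shape of data matters more than the tokenizer that produced it. The sketch below uses a simple regular-expression tokenizer as a rough stand-in for NLTK (an assumption made so that the example runs without downloads; it is much cruder than sent_tokenize and word_tokenize) to show the nested list structure that the model expects.

```python
import re


def simple_tokenize(text):
    """Split text into sentences, then into lowercase word tokens.

    Rough stand-in for NLTK's tokenizers: sentences are split on ., ! or ?
    followed by whitespace, which NLTK handles far more carefully.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Words become one token each; every punctuation mark is its own token.
    return [re.findall(r"\w+|[^\w\s]", s.lower()) for s in sentences if s]


text = "Alice was tired. She saw a White Rabbit!"
data = simple_tokenize(text)
print(data)
```

Whatever tokenizer is used, the output should be a list of token lists, one inner list per sentence.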

Step 5 - Create a CBOW model

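# sg = 0 (the default) selects CBOW; sg = 1 would select skip-gram
# gensim 4.0 renamed the size parameter to vector_size; older releases use size = 100
model1 = gensim.models.Word2Vec(data, min_count = 1, vector_size = 100, window = 5, sg = 0)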
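To give a feel for what a CBOW model computes for one training example, here is a toy forward pass in plain Python. It is only a sketch under simplifying assumptions: the vocabulary, the 4-dimensional vectors, and the names `W_in`, `W_out` and `cbow_forward` are made up for illustration, and it uses a full softmax over the vocabulary, whereas gensim by default trains with negative sampling instead.

```python
import math
import random

random.seed(0)

vocab = ["alice", "was", "very", "tired", "rabbit"]
word_to_id = {w: i for i, w in enumerate(vocab)}
dim = 4  # toy embedding size (the recipe uses 100)

# Input (context) and output (target) weight matrices, randomly initialised.
W_in = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]
W_out = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]


def cbow_forward(context_words):
    """Return a probability for every vocabulary word given the context."""
    ids = [word_to_id[w] for w in context_words]
    # Hidden layer: the average of the context word vectors.
    hidden = [sum(W_in[i][d] for i in ids) / len(ids) for d in range(dim)]
    # Output layer: one score per vocabulary word, turned into probabilities
    # with a numerically stable softmax.
    scores = [sum(h * w for h, w in zip(hidden, row)) for row in W_out]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = cbow_forward(["alice", "very"])
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
```

Training adjusts both matrices so that the true middle word receives a higher probability; the rows of the input matrix are what end up being used as word vectors.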

Step 6 - Print the result of CBOW model

# The trained word vectors live on model1.wv
print("Cosine similarity between 'alice' " + "and 'wonderland' - CBOW : ", model1.wv.similarity('alice', 'wonderland'))
print("Cosine similarity between 'alice' " + "and 'machines' - CBOW : ", model1.wv.similarity('alice', 'machines'))
Cosine similarity between 'alice' and 'wonderland' - CBOW :  0.99817955
Cosine similarity between 'alice' and 'machines' - CBOW :  0.9881186
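The similarity score printed above is the cosine of the angle between two word vectors. As a minimal sketch of that formula, here it is written out in plain Python on small made-up vectors; the function name `cosine_similarity` and the vectors are illustrative assumptions, not gensim internals.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]   # points in the same direction as a
c = [3.0, 0.0, -1.0]  # orthogonal to a (their dot product is 0)

print(cosine_similarity(a, b))
print(cosine_similarity(a, c))
```

Vectors pointing the same way score close to 1 and unrelated (orthogonal) vectors score close to 0, so a higher value means the model places the two words closer together.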
