What is a cbow model and when to use it?

What is a cbow model and when to use it?

What is a cbow model and when to use it?

This recipe explains what is a cbow model and when to use it


Recipe Objective

What is a cbow model and when to use it? As we have discussed earlier only about Word2vec and CBOW comes under Word2Vec. CBOW (Continuous Bag of Words) which predicts the current word given the context of words within a specific window. The output layer containing the current word and the input layer contains context words. The other layer which is called a hidden layer it contains the number of dimensions where we want to represent the current word present at the output layer.

Step 1 - Import the necessary libraries

from nltk.tokenize import sent_tokenize, word_tokenize import warnings warnings.filterwarnings(action = 'ignore') import gensim from gensim.models import Word2Vec

Here we have imported the necessary packages along with the warnings and kept them as ignore because we know that there might be some warnings coming up when we run our program, but that can be ignored.

Step 2 - load the sample data

sample = open("/content/alice_in_wonderland.txt", "r") s = sample.read()

Step 3 - Replace the escape character with spaces

f = s.replace("\n", " ")

Step 4 - Iterate and tokenize

import nltk nltk.download('punkt') data = [] for i in sent_tokenize(f): temp = [] for j in word_tokenize(i): temp.append(j.lower()) data.append(temp)

Here we are taking a list as variable named data which is initially empty, after that we are going to take a for loop which will iterate through each sentence present in the text file, and the second for loop will tokenize the sentences into words.

Step 5 - Create a CBOW model

model1 = gensim.models.Word2Vec(data, min_count = 1, size = 100, window = 5)

Step 6 - Print the result of CBOW model

print("Cosine similarity between 'alice' " + "and 'wonderland' - CBOW : ", model1.similarity('alice', 'wonderland')) print("Cosine similarity between 'alice' " + "and 'machines' - CBOW : ", model1.similarity('alice', 'machines'))
Cosine similarity between 'alice' and 'wonderland' - CBOW :  0.99817955
Cosine similarity between 'alice' and 'machines' - CBOW :  0.9881186

Relevant Projects

Loan Eligibility Prediction using Gradient Boosting Classifier
This data science in python project predicts if a loan should be given to an applicant or not. We predict if the customer is eligible for loan based on several factors like credit score and past history.

Mercari Price Suggestion Challenge Data Science Project
Data Science Project in Python- Build a machine learning algorithm that automatically suggests the right product prices.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Topic modelling using Kmeans clustering to group customer reviews
In this Kmeans clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Ecommerce product reviews - Pairwise ranking and sentiment analysis
This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.

Machine Learning or Predictive Models in IoT - Energy Prediction Use Case
In this machine learning and IoT project, we are going to test out the experimental data using various predictive models and train the models and break the energy usage.

Data Science Project on Wine Quality Prediction in R
In this R data science project, we will explore wine dataset to assess red wine quality. The objective of this data science project is to explore which chemical properties will influence the quality of red wines.

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.