How to use Glove embedings?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to use Glove embedings?

How to use Glove embedings?

This recipe helps you use Glove embedings

0

Recipe Objective

How to use Glove embedings? As we have already discussed about Embeddings or Word Embedding and what are they. So Glove Embedding is also another method of creating Word Embeddings. Lets understand more about it.

So Glove Embeddings which is Global vectors for word representation are the method in which we will a take the corpus and will iterate through it and get the co-occurence of each word present in the corpus. We will get a co-occurence matrix through this, the words which occur next to each other will get a value of 1, if they are one word apart then 1/2, if they are two words apart then 1/3 and so on.

Lets get a better clarification by taking an example.

Example :

It is a lovely morning !

Good Morning!!

Is it a lovely morning ?

Step 1 - Import the necessary libraries

import itertools from gensim.models.word2vec import Text8Corpus from glove import Corpus, Glove

Step 2 - Store the sample text file in a variable called sentences

sentences = list(itertools.islice(Text8Corpus('/content/alice_in_wonderland.txt'),None))

Step 3 - Store the Corpus into a variable

corpus = Corpus()

Step 4 - fit the sentences into corpus with a window size of 10

corpus.fit(sentences, window=10)

Step 5 - Store the Glove in a varibale

glove = Glove(no_components=100, learning_rate=0.05)

Step 6 - Perform the training

glove.fit(corpus.matrix, epochs=30, no_threads=4, verbose=True)
Epoch 0
Epoch 1
Epoch 2
Epoch 3
Epoch 4
Epoch 5
Epoch 6
Epoch 7
Epoch 8
Epoch 9
Epoch 10
Epoch 11
Epoch 12
Epoch 13
Epoch 14
Epoch 15
Epoch 16
Epoch 17
Epoch 18
Epoch 19
Epoch 20
Epoch 21
Epoch 22
Epoch 23
Epoch 24
Epoch 25
Epoch 26
Epoch 27
Epoch 28
Epoch 29

Here we are going the fit the Glove i.e performing 30 training epochs with 4 threads

Step 7 - Add our corpus dictionary to glove dictionary

glove.add_dictionary(corpus.dictionary)

Step 8 - Test with some words

glove.most_similar('man')
[('sobs.', 0.9555372382921672),
 ('reasons.', 0.9555298747918248),
 ('signify:', 0.9551492193112306),
 ('`chop', 0.954856860860499)]
glove.most_similar('this', number=10)
[('time', 0.9964498350533215),
 ('once', 0.9964002559452605),
 ('more', 0.9963721925296446),
 ('any', 0.9955253094062864),
 ('about', 0.9950879007354146),
 ('which', 0.9948399539941413),
 ('turned', 0.9942261952259767),
 ('is', 0.9941542169966086),
 ('them', 0.994141679802586)]
glove.most_similar('Adventures', number=10)
[('hedgehogs,', 0.9398138500036824),
 ("THAT'S", 0.93867888598354),
 ('soup,', 0.9355306192717532),
 ('familiarly', 0.9338930212646674),
 ('showing', 0.9334707250469283),
 ("Turtle's", 0.9328493584474263),
 ('blades', 0.9318802670676556),
 ('heads.', 0.9318625356540701),
 ("refreshments!'", 0.9315206030115342)]
glove.most_similar('girl', number=10)
[('dispute', 0.8922924095994201),
 ('proper', 0.889211005639865),
 ('hurry.', 0.8875119118249284),
 ('remark,', 0.8874202609802221),
 ('bringing', 0.88048150664503),
 ('dog', 0.8769020310475344),
 ('tree.', 0.8758689282289073),
 ('fast', 0.8754031020409732),
 ('rules', 0.8743036670054224)]
glove.most_similar('Alice', number=10)
[('thought', 0.99722981845681),
 ('he', 0.985967266394433),
 ('her,', 0.9848540529024399),
 ('She', 0.984218370767349),
 ('not,', 0.9834714497587523),
 ('much', 0.9827468801839833),
 ("I'm", 0.9826786300945485),
 ('got', 0.9825505635825527),
 ("I've", 0.982494375644852)]

From the above we have seen that how to use glove embeddings for word representation, the above examples specifies us about how it performs.

Relevant Projects

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Credit Card Fraud Detection as a Classification Problem
In this data science project, we will predict the credit card fraud in the transactional dataset using some of the predictive models.

Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Data Science Project on Wine Quality Prediction in R
In this R data science project, we will explore wine dataset to assess red wine quality. The objective of this data science project is to explore which chemical properties will influence the quality of red wines.

Customer Market Basket Analysis using Apriori and Fpgrowth algorithms
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Human Activity Recognition Using Smartphones Data Set
In this deep learning project, you will build a classification system where to precisely identify human fitness activities.

Deep Learning with Keras in R to Predict Customer Churn
In this deep learning project, we will predict customer churn using Artificial Neural Networks and learn how to model an ANN in R with the keras deep learning package.

Machine Learning or Predictive Models in IoT - Energy Prediction Use Case
In this machine learning and IoT project, we are going to test out the experimental data using various predictive models and train the models and break the energy usage.