How to use GloVe embeddings? We have already discussed embeddings, or word embeddings, and what they are. GloVe embedding is another method of creating word embeddings; let's understand more about it.
GloVe (Global Vectors for word representation) is a method in which we take a corpus, iterate through it, and record how often each pair of words co-occurs. This gives us a co-occurrence matrix: words that occur next to each other contribute a value of 1, words that are one word apart contribute 1/2, words that are two words apart contribute 1/3, and so on, i.e. the contribution decays as 1/distance within the context window (see the sketch after the example sentences below).
Let's make this clearer with an example.
Example:
It is a lovely morning!
Good Morning!!
Is it a lovely morning?
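Before using a library, here is a minimal, hypothetical sketch (plain Python, not the glove package itself) of how such a 1/distance-weighted co-occurrence matrix could be built for the sentences above; the window size and the simple lowercased tokenization are assumptions for illustration.

from collections import defaultdict

# Hypothetical sketch: 1/distance-weighted co-occurrence counts
sentences = [
    "it is a lovely morning".split(),
    "good morning".split(),
    "is it a lovely morning".split(),
]

window = 4                     # assumed context window size
cooc = defaultdict(float)      # (word, context_word) -> weighted count

for tokens in sentences:
    for i in range(len(tokens)):
        # Look at neighbours up to `window` positions to the right;
        # record the count symmetrically for both orderings.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            distance = j - i   # 1 for adjacent words, 2 for one word apart, ...
            cooc[(tokens[i], tokens[j])] += 1.0 / distance
            cooc[(tokens[j], tokens[i])] += 1.0 / distance

print(cooc[("lovely", "morning")])   # 2.0: adjacent in two sentences
print(cooc[("it", "a")])             # 1.5: adjacent once (1) and one word apart once (1/2)

In practice we do not build this matrix by hand; the glove-python package, together with gensim's Text8Corpus reader, does it for us, as the following example on the text of Alice in Wonderland shows.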
import itertools
from gensim.models.word2vec import Text8Corpus
from glove import Corpus, Glove

# Read the text as a list of tokenized sentences
sentences = list(itertools.islice(Text8Corpus('/content/alice_in_wonderland.txt'), None))

# Build the word-word co-occurrence matrix with a context window of 10
corpus = Corpus()
corpus.fit(sentences, window=10)

# Train 100-dimensional GloVe vectors on the co-occurrence matrix
glove = Glove(no_components=100, learning_rate=0.05)
glove.fit(corpus.matrix, epochs=30, no_threads=4, verbose=True)
Epoch 0
Epoch 1
Epoch 2
...
Epoch 29
Here we fit the GloVe model, i.e. we perform 30 training epochs with 4 threads; with verbose=True, each epoch is printed as it completes.
# Attach the corpus dictionary so words can be queried by string
glove.add_dictionary(corpus.dictionary)
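Once the dictionary is attached, we can also read the raw learned vector for any word directly from the model: in glove-python, word_vectors holds the embedding matrix and dictionary maps each word to its row.

# Look up the learned 100-dimensional vector for a word
vector = glove.word_vectors[glove.dictionary['man']]
print(vector.shape)   # (100,)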
# Find the words most similar to 'man'
glove.most_similar('man')
[('sobs.', 0.9555372382921672), ('reasons.', 0.9555298747918248), ('signify:', 0.9551492193112306), ('`chop', 0.954856860860499)]
glove.most_similar('this', number=10)
[('time', 0.9964498350533215), ('once', 0.9964002559452605), ('more', 0.9963721925296446), ('any', 0.9955253094062864), ('about', 0.9950879007354146), ('which', 0.9948399539941413), ('turned', 0.9942261952259767), ('is', 0.9941542169966086), ('them', 0.994141679802586)]
glove.most_similar('Adventures', number=10)
[('hedgehogs,', 0.9398138500036824), ("THAT'S", 0.93867888598354), ('soup,', 0.9355306192717532), ('familiarly', 0.9338930212646674), ('showing', 0.9334707250469283), ("Turtle's", 0.9328493584474263), ('blades', 0.9318802670676556), ('heads.', 0.9318625356540701), ("refreshments!'", 0.9315206030115342)]
glove.most_similar('girl', number=10)
[('dispute', 0.8922924095994201), ('proper', 0.889211005639865), ('hurry.', 0.8875119118249284), ('remark,', 0.8874202609802221), ('bringing', 0.88048150664503), ('dog', 0.8769020310475344), ('tree.', 0.8758689282289073), ('fast', 0.8754031020409732), ('rules', 0.8743036670054224)]
glove.most_similar('Alice', number=10)
[('thought', 0.99722981845681), ('he', 0.985967266394433), ('her,', 0.9848540529024399), ('She', 0.984218370767349), ('not,', 0.9834714497587523), ('much', 0.9827468801839833), ("I'm", 0.9826786300945485), ('got', 0.9825505635825527), ("I've", 0.982494375644852)]
Above we have seen how to use GloVe embeddings for word representation, and the queries show how the model performs. The neighbours look noisy here (tokens such as 'sobs.' still carry punctuation, and asking for number=10 returns 9 neighbours because the query word itself is excluded) since the corpus is a single small book with no preprocessing; on a larger, cleaned corpus the similarities become far more meaningful.
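If we want to reuse the trained embeddings later, the model can be persisted to disk; a minimal sketch, assuming glove-python's pickle-based save/load helpers:

# Save the trained model and reload it later
# (assumes the save/load methods provided by glove-python)
glove.save('glove_alice.model')
glove_reloaded = Glove.load('glove_alice.model')
glove_reloaded.most_similar('man')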