Explain the skip-gram with subwords model from word2vec.
As discussed earlier, skip-gram predicts the surrounding context words within a specific window, given the current word. The input layer holds the current word and the output layer holds the context words. The size of the hidden layer is the number of dimensions in which we want to represent the current word. Subwords are pieces of a word made of some of its consecutive letters; for example, "gi" and "rl" are subwords of "girl". In the subword variant of skip-gram, which fastText implements, a word's vector is built from the vectors of its subwords, so words that share pieces also share information. Let's look at skip-gram with subwords in practice.
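To make the window idea concrete before running the library, here is a minimal sketch of how (current word, context word) training pairs can be formed. The function name `skipgram_pairs`, the fixed window size, and the sample sentence are illustrative assumptions; real implementations also vary the window size and subsample frequent words.

```python
def skipgram_pairs(tokens, window=2):
    """Return (current_word, context_word) pairs for a symmetric window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # left edge of the window
        hi = min(len(tokens), i + window + 1)   # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                          # skip the current word itself
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical sample sentence, split on whitespace.
tokens = "alice was beginning to get very tired".split()
pairs = skipgram_pairs(tokens, window=1)
print(pairs[:4])
# [('alice', 'was'), ('was', 'alice'), ('was', 'beginning'), ('beginning', 'was')]
```

Each pair is one training example: the first element goes to the input layer and the second is the word the output layer should predict.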
!pip install cython
!pip install pyfasttext
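from pyfasttext import FastText
# Read the corpus once to check that the file is present (training reads the file directly).
with open("/content/alice_in_wonderland.txt", 'r') as sample:
    alice_data = sample.read()
# Train a skip-gram model with subword information; output files use the prefix 'model'.
model = FastText()
model.skipgram(input='/content/alice_in_wonderland.txt', output='model', epoch=2, lr=0.7)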
print("The subwords for boy are:",model.get_all_subwords('boy'),'\n')
print("The subwords for girl are:",model.get_all_subwords('girl'),'\n')
The subwords for boy are: ['boy', '<bo', '<boy', '<boy>', 'boy', 'boy>', 'oy>']
The subwords for girl are: ['girl', '<gi', '<gir', '<girl', '<girl>', 'gir', 'girl', 'girl>', 'irl', 'irl>', 'rl>']
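The subword lists can also be worked out by hand. The sketch below assumes fastText's default n-gram lengths of 3 to 6 characters and wraps the word in the boundary markers "<" and ">" before slicing; `char_ngrams` is a hypothetical helper written for illustration, not part of pyfasttext. It leaves out the whole word itself, which get_all_subwords reports first.

```python
def char_ngrams(word, minn=3, maxn=6):
    """List the character n-grams of a word wrapped in '<' and '>'."""
    wrapped = f"<{word}>"            # boundary markers distinguish prefixes and suffixes
    grams = []
    for start in range(len(wrapped)):
        for n in range(minn, maxn + 1):
            end = start + n
            if end > len(wrapped):   # n-gram would run past the end of the word
                break
            grams.append(wrapped[start:end])
    return grams


print(char_ngrams("boy"))
# ['<bo', '<boy', '<boy>', 'boy', 'boy>', 'oy>']
print(char_ngrams("girl"))
# ['<gi', '<gir', '<girl', '<girl>', 'gir', 'girl', 'girl>', 'irl', 'irl>', 'rl>']
```

Because of the markers, the n-gram "boy" taken from inside "<boy>" is a different subword from the whole-word token "<boy>", and a word that never appeared in training can still be given a vector from the n-grams it shares with known words.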