How is an LSTM used for classification?
LSTMs are widely used for text classification, so we will walk through that example and build an LSTM text-classification model step by step.
First, we load the training data from Google Drive.
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Load the training data
pharma_train = pd.read_csv('/content/drive/My Drive/Python/pharma/train.csv')
pharma_train

MAX_WORDS = 10000     # keep only the 10,000 most frequent words
MAX_LENGTH = 150      # pad/truncate every sequence to 150 tokens
EMBEDDING_DIM = 100   # size of the word-embedding vectors (fixed)

# Build the vocabulary from the job-description text
tokenizer = Tokenizer(num_words=MAX_WORDS, filters='!"#$%&()*+,-./:;<=>?@', lower=True)
tokenizer.fit_on_texts(pharma_train['job_discription'].values)
word_index = tokenizer.word_index
print('%s unique tokens' % len(word_index))

# Convert each description to a fixed-length sequence of word indices,
# and one-hot encode the labels
X = tokenizer.texts_to_sequences(pharma_train['job_discription'].values)
X = pad_sequences(X, maxlen=MAX_LENGTH)
Y = pd.get_dummies(pharma_train['job_type']).values
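To make the preprocessing step concrete, here is a minimal, dependency-free sketch of what the tokenizer and padding conceptually do: map each word to an integer index by frequency, then left-pad every sequence to a fixed length. This is an illustration only, not the Keras implementation (the real utilities also handle filters, OOV tokens, and more); the example texts are made up.

```python
from collections import Counter

def build_word_index(texts, num_words):
    # Count word frequencies and keep the `num_words` most frequent words.
    # Index 0 is reserved for padding, so real indices start at 1.
    counts = Counter(w for t in texts for w in t.lower().split())
    return {w: i + 1 for i, (w, _) in enumerate(counts.most_common(num_words))}

def texts_to_padded(texts, word_index, max_length):
    # Replace each known word with its index, drop unknown words,
    # then left-pad with zeros (Keras pads on the left by default).
    seqs = [[word_index[w] for w in t.lower().split() if w in word_index]
            for t in texts]
    return [[0] * (max_length - len(s)) + s[-max_length:] for s in seqs]

texts = ["senior pharma analyst", "pharma sales role"]   # toy stand-in data
idx = build_word_index(texts, num_words=10)
padded = texts_to_padded(texts, idx, max_length=5)
print(padded)  # every row has length 5, zero-padded on the left
```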
Next, we split the data into training and test sets, holding out 10% for testing.
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.10, random_state=42)
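Under the hood, a random split like this just shuffles the row indices with a fixed seed and carves off a fraction for testing. A tiny sketch of the idea (sklearn's `train_test_split` is far more featureful, e.g. it splits multiple arrays in lockstep and supports stratification):

```python
import random

def simple_split(n, test_size=0.10, seed=42):
    # Shuffle row indices reproducibly, then carve off `test_size` for testing.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_size))
    return idx[n_test:], idx[:n_test]   # train indices, test indices

train_idx, test_idx = simple_split(100)
print(len(train_idx), len(test_idx))  # 90 10
```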
Finally, we build an LSTM model and train it on the dataset.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SpatialDropout1D, LSTM, Dense

model = Sequential()
model.add(Embedding(MAX_WORDS, EMBEDDING_DIM, input_length=X.shape[1]))  # learn word embeddings
model.add(SpatialDropout1D(0.2))   # drop whole embedding channels to reduce overfitting
model.add(LSTM(50))                # 50 LSTM units; outputs the final hidden state
model.add(Dense(Y.shape[1], activation='softmax'))  # one output per class
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, Y_train, epochs=10, batch_size=50, validation_split=0.1)
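After training, the model's softmax output is one probability per class, so classification means taking the argmax of each row and mapping it back to the columns produced by pd.get_dummies. A hedged sketch with a stand-in array in place of model.predict(X_test) (the class names here are made up for illustration):

```python
import numpy as np

class_names = ["analyst", "manager", "sales"]   # hypothetical get_dummies columns
probs = np.array([[0.1, 0.7, 0.2],              # stand-in for model.predict(X_test)
                  [0.8, 0.1, 0.1]])

pred_idx = probs.argmax(axis=1)                  # most probable class per row
pred_labels = [class_names[i] for i in pred_idx]
print(pred_labels)  # ['manager', 'analyst']
```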