Explain how LSTMs work and why they are preferred in NLP analysis?

This recipe explains how LSTMs work and why they are preferred in NLP analysis.


Recipe Objective

Explain how LSTMs work and why they are preferred in NLP analysis.

LSTM stands for Long Short-Term Memory, a type of artificial recurrent neural network (RNN) used in deep learning. LSTMs are a special kind of RNN, capable of learning long-term dependencies, and they are well suited to classifying, processing, and making predictions on time-series data. They were developed to deal with the vanishing gradient problem that can be encountered when training traditional RNNs.

An LSTM processes data sequentially, passing information forward as it propagates through time. What distinguishes it from a plain RNN are the operations inside the LSTM's cells: gated operations that let the network decide which information to keep and which to forget.
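The gating described above can be sketched in plain NumPy. This is a minimal, illustrative single time step, not the vectorised implementation Keras uses; the names `lstm_step`, `W`, `U`, and `b` are assumptions for the sketch, with the four gate weight blocks stacked into one matrix.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative sketch, not the Keras internals).

    W, U, b hold stacked weights for the four gates:
    forget (f), input (i), candidate (g), output (o).
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b          # stacked pre-activations, shape (4n,)
    f = sigmoid(z[0:n])                 # forget gate: what to drop from the cell state
    i = sigmoid(z[n:2*n])               # input gate: what new information to admit
    g = np.tanh(z[2*n:3*n])             # candidate cell contents
    o = sigmoid(z[3*n:4*n])             # output gate: what to expose as the hidden state
    c = f * c_prev + i * g              # new cell state (the long-term memory)
    h = o * np.tanh(c)                  # new hidden state (the short-term output)
    return h, c

# Toy sizes: 3-dimensional input, 2-dimensional hidden state
rng = np.random.default_rng(0)
x = rng.normal(size=3)
h, c = np.zeros(2), np.zeros(2)
W = rng.normal(size=(8, 3))
U = rng.normal(size=(8, 2))
b = np.zeros(8)
h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)  # (2,) (2,)
```

Because the cell state `c` is updated additively (gated by `f` and `i`) rather than repeatedly squashed through a nonlinearity, gradients can flow across many time steps, which is how LSTMs sidestep the vanishing gradient problem.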

Step 1 - Import the necessary libraries

import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils

Step 2 - Load the sample data

Sample_data = "/content/alice_in_wonderland.txt"
wonderland_text = open(Sample_data, 'r', encoding='utf-8').read()
wonderland_text = wonderland_text.lower()
print(wonderland_text)

Step 3 - Create mapping of unique characters and integers

My_characters = sorted(list(set(wonderland_text)))
character_to_integer = dict((c, i) for i, c in enumerate(My_characters))
character_to_integer
{'\n': 0,
 ' ': 1,
 '!': 2,
 '"': 3,
 "'": 4,
 '(': 5,
 ')': 6,
 '*': 7,
 ',': 8,
 '-': 9,
 '.': 10,
 '0': 11,
 '3': 12,
 ':': 13,
 ';': 14,
 '?': 15,
 '[': 16,
 ']': 17,
 '_': 18,
 '`': 19,
 'a': 20,
 'b': 21,
 'c': 22,
 'd': 23,
 'e': 24,
 'f': 25,
 'g': 26,
 'h': 27,
 'i': 28,
 'j': 29,
 'k': 30,
 'l': 31,
 'm': 32,
 'n': 33,
 'o': 34,
 'p': 35,
 'q': 36,
 'r': 37,
 's': 38,
 't': 39,
 'u': 40,
 'v': 41,
 'w': 42,
 'x': 43,
 'y': 44,
 'z': 45}

Since we cannot model character data directly, we need to convert the characters into integers; the step above does exactly that. First we take the set of all unique characters present in the data, then we create a map from each character to a unique integer.
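For later steps, such as turning predictions back into readable text, the inverse map is also useful. A small sketch on a toy string (the name `integer_to_character` is a hypothetical helper mirroring `character_to_integer` above):

```python
# Build both directions of the mapping on a toy string
My_characters = sorted(list(set("hello world")))
character_to_integer = dict((c, i) for i, c in enumerate(My_characters))
integer_to_character = dict((i, c) for i, c in enumerate(My_characters))

# Round-trip: encode to integers, then decode back to characters
encoded = [character_to_integer[ch] for ch in "hello"]
decoded = "".join(integer_to_character[i] for i in encoded)
print(decoded)  # hello
```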

Step 4 - Summarize the data

wonder_chars = len(wonderland_text)
wonder_vocab = len(My_characters)
print("Total Characters Present in the Sample data: ", wonder_chars)
print("Total Vocab in the data: ", wonder_vocab)
Total Characters Present in the Sample data:  148574
Total Vocab in the data:  46

Step 5 - Prepare the dataset

sequence_length = 100
x_data = []
y_data = []
for i in range(0, wonder_chars - sequence_length, 1):
    sequence_in = wonderland_text[i:i + sequence_length]
    sequence_out = wonderland_text[i + sequence_length]
    x_data.append([character_to_integer[char] for char in sequence_in])
    y_data.append(character_to_integer[sequence_out])
pattern_nn = len(x_data)
print("Result of total patterns:", pattern_nn)
Result of total patterns: 148474

Here we have prepared the dataset as input/output pairs encoded as integers: each input is a 100-character window of the text, and the output is the single character that follows that window.
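The sliding-window idea is easier to see on a short string. This self-contained sketch repeats Step 5 with a window of 5 characters instead of 100:

```python
text = "alice was beginning"
chars = sorted(set(text))
char_to_int = {c: i for i, c in enumerate(chars)}

sequence_length = 5
x_data, y_data = [], []
for i in range(0, len(text) - sequence_length):
    seq_in = text[i:i + sequence_length]       # a 5-character window
    seq_out = text[i + sequence_length]        # the character that follows it
    x_data.append([char_to_int[ch] for ch in seq_in])
    y_data.append(char_to_int[seq_out])

# One pattern per window position: len(text) - sequence_length of them
print(len(x_data))  # 14
```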

Step 6 - Reshape the data

X = numpy.reshape(x_data, (pattern_nn, sequence_length, 1))
X = X / float(wonder_vocab)
y = np_utils.to_categorical(y_data)
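This step puts the inputs into the [samples, timesteps, features] shape that Keras LSTM layers expect, scales the integers into [0, 1), and one-hot encodes the targets. The effect can be checked on toy data with NumPy alone (`np.eye` indexing stands in here for what `np_utils.to_categorical` produces):

```python
import numpy as np

x_data = [[0, 1, 2], [1, 2, 0]]   # two toy patterns of length 3
y_data = [3, 1]
vocab = 4

# [samples, timesteps, features], scaled so every value is in [0, 1)
X = np.reshape(x_data, (len(x_data), 3, 1)) / float(vocab)
# One-hot targets, equivalent to to_categorical on this toy data
y = np.eye(vocab)[y_data]
print(X.shape, y.shape)  # (2, 3, 1) (2, 4)
```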

Step 7 - Define the LSTM model

model = Sequential()
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2])))
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')

Step 8 - Define the checkpoint

filepath = "weights-improvement-{epoch:02d}-{loss:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]

Step 9 - Fit the model

model.fit(X, y, epochs=1, batch_size=128, callbacks=callbacks_list)
1160/1160 [==============================] - ETA: 0s - loss: 2.7172
Epoch 00001: loss improved from 2.95768 to 2.71722, saving model to weights-improvement-01-2.7172.hdf5
1160/1160 [==============================] - 735s 634ms/step - loss: 2.7172
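Once trained, the model can generate text one character at a time: feed it a seed window, pick a character from its output distribution, slide the window forward, and repeat. The sketch below shows that loop with a stub in place of `model.predict` so it runs standalone; `generate`, `predict`, and `int_to_char` are illustrative names, and in practice `predict` would wrap a call to the fitted Keras model.

```python
import numpy as np

def generate(seed, n_chars, predict, int_to_char, vocab, seq_len):
    """Greedy character-generation loop (a sketch; `predict` stands in
    for model.predict on a (1, seq_len, 1) normalised input)."""
    pattern = list(seed)
    out = []
    for _ in range(n_chars):
        x = np.reshape(pattern, (1, seq_len, 1)) / float(vocab)
        probs = predict(x)
        idx = int(np.argmax(probs))          # greedy: take the most likely character
        out.append(int_to_char[idx])
        pattern = pattern[1:] + [idx]        # slide the window forward by one
    return "".join(out)

# Stub predictor that always favours index 0, just to exercise the loop
vocab = 4
int_to_char = {0: 'a', 1: 'b', 2: 'c', 3: 'd'}
stub = lambda x: np.eye(vocab)[0]
print(generate([1, 2, 3], 5, stub, int_to_char, vocab, seq_len=3))  # aaaaa
```

Sampling from `probs` instead of taking the argmax gives more varied output; after only one epoch at a loss of 2.72, expect mostly gibberish until more epochs have run.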
