Explain how LSTMs work and why they are preferred in NLP analysis

This recipe explains how LSTMs work and why they are preferred in NLP analysis

Recipe Objective

Explain how LSTMs work and why they are preferred in NLP analysis.

LSTM stands for Long Short-Term Memory, an artificial recurrent neural network architecture used in deep learning. LSTMs are a special kind of RNN, capable of learning long-term dependencies. These networks work on time-series data and are well suited to classifying, processing, and making predictions on sequences. They were developed to deal with the vanishing gradient problem that can be encountered when training traditional RNNs.


An LSTM processes data sequentially, passing information forward as it propagates through the sequence. What distinguishes it from a plain RNN are the operations inside the LSTM's cells: gate operations that decide which information the LSTM keeps and which it forgets.
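A single LSTM step can be sketched in plain NumPy to make the gates concrete. This is an illustration only: the weight names, the stacked gate layout, and the toy dimensions are assumptions for the sketch, not Keras internals.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the stacked weights for the
    forget (f), input (i), candidate (g), and output (o) gates."""
    z = W @ x_t + U @ h_prev + b        # pre-activations, shape (4*hidden,)
    n = h_prev.shape[0]
    f = sigmoid(z[0*n:1*n])             # forget gate: what to drop from c_prev
    i = sigmoid(z[1*n:2*n])             # input gate: how much new info to store
    g = np.tanh(z[2*n:3*n])             # candidate cell values
    o = sigmoid(z[3*n:4*n])             # output gate: what to expose as h_t
    c_t = f * c_prev + i * g            # updated cell state (the "memory")
    h_t = o * np.tanh(c_t)              # updated hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 5
W = rng.normal(size=(4 * n_hidden, n_in))
U = rng.normal(size=(4 * n_hidden, n_hidden))
b = np.zeros(4 * n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for t in range(4):                      # run a toy 4-step sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.shape)  # (5,)
```

Because the forget gate multiplies the previous cell state by a value between 0 and 1, the cell can carry information across many time steps instead of overwriting it at every step, which is what lets LSTMs learn long-term dependencies.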

Step 1 - Import the necessary libraries

import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils

Step 2 - Load the sample data

Sample_data = "/content/alice_in_wonderland.txt"
wonderland_text = open(Sample_data, 'r', encoding='utf-8').read()
wonderland_text = wonderland_text.lower()
print(wonderland_text)

Step 3 - Create mapping of unique characters and integers

My_characters = sorted(list(set(wonderland_text)))
character_to_integer = dict((c, i) for i, c in enumerate(My_characters))
character_to_integer

{'\n': 0,
 ' ': 1,
 '!': 2,
 '"': 3,
 "'": 4,
 '(': 5,
 ')': 6,
 '*': 7,
 ',': 8,
 '-': 9,
 '.': 10,
 '0': 11,
 '3': 12,
 ':': 13,
 ';': 14,
 '?': 15,
 '[': 16,
 ']': 17,
 '_': 18,
 '`': 19,
 'a': 20,
 'b': 21,
 'c': 22,
 'd': 23,
 'e': 24,
 'f': 25,
 'g': 26,
 'h': 27,
 'i': 28,
 'j': 29,
 'k': 30,
 'l': 31,
 'm': 32,
 'n': 33,
 'o': 34,
 'p': 35,
 'q': 36,
 'r': 37,
 's': 38,
 't': 39,
 'u': 40,
 'v': 41,
 'w': 42,
 'x': 43,
 'y': 44,
 'z': 45}

Since we cannot model character data directly, we need to convert the characters into integers, which is what the step above does. First we take the set of all unique characters present in the data, then we create a map from each character to a unique integer.

Step 4 - Summarize the data

wonder_chars = len(wonderland_text)
wonder_vocab = len(My_characters)
print("Total Characters Present in the Sample data: ", wonder_chars)
print("Total Vocab in the data: ", wonder_vocab)

Total Characters Present in the Sample data:  148574
Total Vocab in the data:  46

Step 5 - Prepare the dataset

sequence_length = 100
x_data = []
y_data = []
for i in range(0, wonder_chars - sequence_length, 1):
    sequence_in = wonderland_text[i:i + sequence_length]
    sequence_out = wonderland_text[i + sequence_length]
    x_data.append([character_to_integer[char] for char in sequence_in])
    y_data.append(character_to_integer[sequence_out])
pattern_nn = len(x_data)
print("Result of total patterns:", pattern_nn)

Result of total patterns: 148474

Here we have prepared the dataset as input/output pairs encoded as integers: each input is a 100-character window of the text, and the output is the single character that follows it.

Step 6 - Reshaping the data

X = numpy.reshape(x_data, (pattern_nn, sequence_length, 1))
X = X / float(wonder_vocab)
y = np_utils.to_categorical(y_data)
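The reshape puts the inputs into the [samples, time steps, features] layout that Keras LSTM layers expect, and dividing by the vocabulary size scales the integer codes into the 0-1 range. A toy check of that transformation (the data here is made up; only the vocabulary size of 46 comes from Step 4):

```python
import numpy

toy_x = [[20, 1, 45], [1, 45, 0]]    # 2 patterns, sequence length 3
toy_vocab = 46                       # vocabulary size from Step 4

X = numpy.reshape(toy_x, (2, 3, 1))  # [samples, time steps, features]
X = X / float(toy_vocab)             # scale integer codes to the 0-1 range

print(X.shape)         # (2, 3, 1)
print(float(X.max()))  # largest value, now below 1.0
```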

Step 7 - Define the LSTM model

model = Sequential()
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2])))
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')

Step 8 - Define the checkpoint

filepath = "weights-improvement-{epoch:02d}-{loss:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]

Step 9 - Fit the model

model.fit(X, y, epochs=1, batch_size=128, callbacks=callbacks_list)

1160/1160 [==============================] - ETA: 0s - loss: 2.7172
Epoch 00001: loss improved from 2.95768 to 2.71722, saving model to weights-improvement-01-2.7172.hdf5
1160/1160 [==============================] - 735s 634ms/step - loss: 2.7172
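Once trained, a character model like this is typically used to generate text by repeatedly predicting the next character and sliding the input window forward. The sampling loop can be sketched as follows; note that `predict_next` is a hypothetical stand-in (random probability scores) for calling `model.predict` on the real network, since the actual model needs the saved hdf5 weights, and the three-character mapping is a toy version of the 46-entry one built in Step 3.

```python
import numpy

integer_to_character = {0: 'a', 1: 'b', 2: 'c'}  # toy mapping; the recipe's has 46 entries
vocab = len(integer_to_character)
rng = numpy.random.default_rng(0)

def predict_next(pattern):
    """Hypothetical stand-in for model.predict: a random probability vector."""
    scores = rng.random(vocab)
    return scores / scores.sum()

pattern = [0, 1, 2, 0]                     # seed sequence of character indices
generated = []
for _ in range(10):
    probabilities = predict_next(pattern)
    index = int(numpy.argmax(probabilities))  # greedy pick of the next character
    generated.append(integer_to_character[index])
    pattern = pattern[1:] + [index]           # slide the window forward one step
print(''.join(generated))
```

With the real model, each `predict_next` call would normalize the pattern the same way as in Step 6 (reshape and divide by the vocabulary size) before passing it to `model.predict`.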

