How To Run A Basic PyTorch RNN Model?

This PyTorch RNN model code example feeds a dataset into a basic RNN (recurrent neural network) model to generate image classification predictions.

Objective: How To Run A Basic PyTorch RNN Model?

This RNN model PyTorch code example shows you how to generate predictions using an RNN image classification model on the MNIST handwritten digits dataset. This RNN model PyTorch source code uses the PyTorch DataLoader utility, which lets you batch, shuffle, and load the data in parallel using multiprocessing workers.
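As a quick illustration of those DataLoader options, the sketch below builds a loader over a synthetic TensorDataset (fake 28x28 images and labels are used here only so the snippet runs without downloading MNIST; the sizes are illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in dataset: 100 fake 28x28 "images" with integer labels 0-9
images = torch.randn(100, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

# batch_size groups samples, shuffle randomizes order each epoch,
# and num_workers > 0 loads batches in parallel worker processes
# (0 keeps loading in the main process)
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=0)

for batch_x, batch_y in loader:
    print(batch_x.shape, batch_y.shape)  # torch.Size([32, 28, 28]) torch.Size([32])
    break
```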

How To Implement An RNN Language Model With Attention in PyTorch?

You can implement an RNN language model with attention in PyTorch using the following steps-

  • Define the model architecture- This involves defining the number of layers in the RNN, the type of RNN unit (e.g., LSTM or GRU), and the attention mechanism.

  • Initialize the model parameters- You can do this randomly or by loading pre-trained parameters.

  • Define the training loop- This involves feeding the model batches of text data and calculating the loss. You can then use the loss to update the model parameters using gradient descent.

  • Evaluate the model- You can assess your model’s performance by generating text from the model or predicting the next word in a sequence.
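The steps above can be sketched as a minimal PyTorch model. Every size here (vocabulary, embedding, and hidden dimensions) and the simple additive attention over GRU states are illustrative choices, not a prescribed architecture:

```python
import torch
from torch import nn

class AttnRNNLM(nn.Module):
    """Minimal sketch of an RNN language model with attention (illustrative sizes)."""
    def __init__(self, vocab_size=1000, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)  # the RNN unit (GRU)
        self.attn = nn.Linear(hidden_dim, 1)                        # scores each time step
        self.out = nn.Linear(hidden_dim * 2, vocab_size)            # next-word logits

    def forward(self, tokens):                        # tokens: (batch, seq_len) int64
        h, _ = self.rnn(self.embed(tokens))           # h: (batch, seq_len, hidden_dim)
        weights = torch.softmax(self.attn(h), dim=1)  # attention weights over time steps
        context = (weights * h).sum(dim=1)            # weighted sum of hidden states
        combined = torch.cat([context, h[:, -1, :]], dim=1)
        return self.out(combined)                     # logits for the next word

model = AttnRNNLM()                              # parameters initialized randomly
logits = model(torch.randint(0, 1000, (4, 12)))  # batch of 4 sequences of length 12
print(logits.shape)                              # torch.Size([4, 1000])

# One illustrative training-loop step: compute the loss and backpropagate
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 1000, (4,)))
loss.backward()
```

From here, evaluation would amount to sampling from the softmax over the logits or taking the argmax to predict the next word.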

Steps Showing How To Run A PyTorch RNN Model

The following steps will help you understand how to use a PyTorch RNN model for image classification with the help of an easy-to-understand RNN model example.

Step 1: Import PyTorch Modules

The first step is to import the required libraries and set some hyperparameters.

import torch
from torch import nn
from torch.autograd import Variable  # legacy wrapper; recent PyTorch versions work with tensors directly
import torchvision.datasets as dsets
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
%matplotlib inline

torch.manual_seed(1)  # Set a random seed for reproducibility

# Hyper Parameters
EPOCH = 1              # Train for one epoch (to save time)
BATCH_SIZE = 64
TIME_STEP = 28         # RNN time steps = image height
INPUT_SIZE = 28        # RNN input size = image width
LR = 0.01              # Learning rate
DOWNLOAD_MNIST = True  # Download the dataset on the first run

Step 2: Load The MNIST dataset

In this step, you must load the MNIST dataset and display an example image.

# MNIST digital dataset
train_data = dsets.MNIST(
    root='./mnist/',                  # Root directory where the dataset is stored
    train=True,                       # Load the training split
    transform=transforms.ToTensor(),  # Convert images to PyTorch tensors scaled to [0, 1]
    download=DOWNLOAD_MNIST,          # Download the dataset if it is not already present
)

# Plot one example (use .data/.targets; the older .train_data/.train_labels names are deprecated)
print(train_data.data.size())     # torch.Size([60000, 28, 28])
print(train_data.targets.size())  # torch.Size([60000])
plt.imshow(train_data.data[0].numpy(), cmap='gray')
plt.title('%i' % train_data.targets[0])
plt.show()


Step 3: Preparing Training And Testing Data

The next step is to create a data loader (train_loader) to load and iterate through the training data in mini-batches efficiently. This helps with stochastic gradient descent during training.

# Data Loader for easy mini-batch return in training

train_loader = torch.utils.data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)

Then, you must load the test dataset with the same ToTensor transform as the training data. test_x holds the first 2,000 test images as floating-point tensors normalized to [0, 1], and test_y holds the corresponding labels converted to a NumPy array.

test_data = dsets.MNIST(root='./mnist/', train=False, transform=transforms.ToTensor())
# The volatile=True flag was removed in PyTorch 0.4; plain tensors work here
test_x = test_data.data.type(torch.FloatTensor)[:2000] / 255.  # normalize pixel values to [0, 1]
test_y = test_data.targets.numpy()[:2000]                      # labels as a NumPy array

Step 4: Define The PyTorch RNN Model (LSTM)

In this step, you will define an LSTM-based RNN model using the nn.Module class. The model will consist of an LSTM layer with a specified input size, hidden size, number of layers, and batch-first format. You can use a linear layer (self.out) to map the LSTM output to 10 classes for digit recognition. You must also create an instance of the RNN model called rnn.

class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()   # note: __init__, not __init()
        self.rnn = nn.LSTM(
            input_size=INPUT_SIZE,    # 28 features per time step (image width)
            hidden_size=64,           # Number of hidden units
            num_layers=1,
            batch_first=True,         # Input/output tensors are (batch, time, input)
        )
        self.out = nn.Linear(64, 10)  # Map the last hidden state to 10 digit classes

    def forward(self, x):
        # r_out: (batch, time_step, hidden_size); h_n/h_c: final hidden/cell states
        r_out, (h_n, h_c) = self.rnn(x, None)  # None = zero initial hidden state
        out = self.out(r_out[:, -1, :])        # Use the output at the last time step
        return out

rnn = RNN()
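Before training, it can help to sanity-check the shapes flowing through this architecture. The standalone sketch below rebuilds the same LSTM-plus-linear stack and pushes a dummy batch through it (the batch size of 5 is an arbitrary choice for illustration):

```python
import torch
from torch import nn

# Re-create the same architecture as above for a standalone shape check
lstm = nn.LSTM(input_size=28, hidden_size=64, num_layers=1, batch_first=True)
out_layer = nn.Linear(64, 10)

dummy = torch.zeros(5, 28, 28)         # a fake batch of 5 "images": (batch, time, input)
r_out, (h_n, h_c) = lstm(dummy, None)  # r_out: (5, 28, 64), one hidden vector per time step
logits = out_layer(r_out[:, -1, :])    # keep only the last time step, map to 10 classes
print(logits.shape)                    # torch.Size([5, 10])
```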

Step 5: Model Optimizer And Loss Function

You must set up the Adam optimizer to optimize the parameters of the RNN model with a specified learning rate. Also, you must define the loss function as cross-entropy, suitable for classification tasks.

optimizer = torch.optim.Adam(rnn.parameters(), lr=LR)  # Optimize all RNN parameters
loss_func = nn.CrossEntropyLoss()                      # Targets are integer class labels, not one-hot
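To see what nn.CrossEntropyLoss computes, the small sketch below (with made-up logits and labels) checks that it matches log-softmax followed by the average negative log-likelihood of the true classes:

```python
import torch
from torch import nn

loss_func = nn.CrossEntropyLoss()

# Illustrative values: raw logits for 2 samples over 3 classes, and their true labels
logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.1, 3.0]])
targets = torch.tensor([0, 2])  # sample 0 is class 0, sample 1 is class 2

# CrossEntropyLoss applies log-softmax internally, so raw logits are passed in
loss = loss_func(logits, targets)
manual = -(torch.log_softmax(logits, dim=1)[torch.arange(2), targets]).mean()
print(torch.isclose(loss, manual))  # tensor(True)
```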

Step 6: Train The PyTorch LSTM RNN Model

Here, you will loop over the specified epochs (EPOCH). Inside each epoch, you will loop through the training data in mini-batches, where b_x and b_y are batches of input data and labels, respectively. You will perform the forward pass through the LSTM model and compute the loss.

Every 50 steps, you must evaluate the model's performance on the test data and print the training loss and test accuracy.

for epoch in range(EPOCH):
    for step, (x, y) in enumerate(train_loader):
        b_x = x.view(-1, 28, 28)       # Reshape to (batch, time_step, input_size)
        b_y = y
        output = rnn(b_x)              # Forward pass
        loss = loss_func(output, b_y)  # Cross-entropy loss
        optimizer.zero_grad()          # Clear gradients from the previous step
        loss.backward()                # Backpropagate
        optimizer.step()               # Update parameters

        if step % 50 == 0:
            test_output = rnn(test_x)
            pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze()
            accuracy = sum(pred_y == test_y) / float(test_y.size)
            # loss.item() replaces the deprecated loss.data[0]
            print('Epoch: ', epoch, '| train loss: %.4f' % loss.item(), '| test accuracy: %.2f' % accuracy)

The output of the above code is-

Epoch:  0 | train loss: 2.2883 | test accuracy: 0.10
Epoch:  0 | train loss: 0.8138 | test accuracy: 0.62
Epoch:  0 | train loss: 0.9010 | test accuracy: 0.78
Epoch:  0 | train loss: 0.6608 | test accuracy: 0.83
Epoch:  0 | train loss: 0.3150 | test accuracy: 0.85
Epoch:  0 | train loss: 0.2186 | test accuracy: 0.91
Epoch:  0 | train loss: 0.4511 | test accuracy: 0.90
Epoch:  0 | train loss: 0.4673 | test accuracy: 0.90
Epoch:  0 | train loss: 0.2014 | test accuracy: 0.93
Epoch:  0 | train loss: 0.2198 | test accuracy: 0.93
Epoch:  0 | train loss: 0.0439 | test accuracy: 0.93
Epoch:  0 | train loss: 0.1979 | test accuracy: 0.95
Epoch:  0 | train loss: 0.0518 | test accuracy: 0.95
Epoch:  0 | train loss: 0.1723 | test accuracy: 0.94
Epoch:  0 | train loss: 0.1908 | test accuracy: 0.94
Epoch:  0 | train loss: 0.0576 | test accuracy: 0.95
Epoch:  0 | train loss: 0.0414 | test accuracy: 0.96
Epoch:  0 | train loss: 0.3591 | test accuracy: 0.95
Epoch:  0 | train loss: 0.2465 | test accuracy: 0.95

Step 7: Generate Predictions With The PyTorch RNN Model

In this step, you will make predictions using the trained RNN model and compare these predictions to the real labels for the first 10 test samples. This will help you understand how well the model performs on a small subset of the test data.

# Print 10 predictions from test data
test_output = rnn(test_x[:10].view(-1, 28, 28))

# Get the predicted labels
pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze()

# Print the predicted labels and real labels
print(pred_y, 'prediction number')
print(test_y[:10], 'real number')

The output of the above code-

[7 2 1 0 4 1 4 9 5 9] prediction number

[7 2 1 0 4 1 4 9 5 9] real number

Build Your PyTorch RNN Model From Scratch With ProjectPro

This PyTorch RNN model tutorial covered the essential steps for creating and training a basic LSTM RNN model in PyTorch for digit recognition on the MNIST dataset. If you want a deeper understanding of neural network and PyTorch concepts and their practical applications, you will benefit from ProjectPro's end-to-end solved Neural Network projects and PyTorch projects. The ProjectPro platform offers hands-on training, real-world projects, and expert guidance to help you gain the skills needed to build advanced machine learning and data science solutions, including RNNs and other neural network models.

