How to make an RNN model using Chainer

This recipe helps you make an RNN model using Chainer.

Recipe Objective - How to make an RNN model using Chainer?

[Note] Save this code as a ".py" file and run it from the command line, because the "parse_args" function does not work inside a Jupyter notebook.

Step-by-step Implementation

1. Importing Necessary Packages:

from __future__ import division
import argparse
import sys

import numpy as np
import chainer
from chainer import backend
from chainer import backends
from chainer.backends import cuda
from chainer import Function, FunctionNode, gradient_check, report, training, utils, Variable
from chainer import datasets, initializers, iterators, optimizers, serializers
from chainer import Link, Chain, ChainList
import chainer.functions as F
import chainer.links as L
from chainer.training import extensions

2. Defining Training Settings:

parser = argparse.ArgumentParser()

parser.add_argument('--batchsize', '-b', type=int, default=20, help='Number of examples in each mini-batch')
parser.add_argument('--bproplen', '-l', type=int, default=35, help='Number of words in each mini-batch (= length of truncated BPTT)')
parser.add_argument('--epoch', '-e', type=int, default=39, help='Number of sweeps over the dataset to train')
parser.add_argument('--device', '-d', type=str, default='-1', help='Device specifier. Either ChainerX device specifier or an integer. If non-negative integer, CuPy arrays with the specified device id are used. If negative integer, NumPy arrays are used')
parser.add_argument('--gradclip', '-c', type=float, default=5, help='Gradient norm threshold to clip')
parser.add_argument('--out', '-o', default='result', help='Directory to output the result')
parser.add_argument('--resume', '-r', type=str, help='Resume the training from snapshot')
parser.add_argument('--test', action='store_true', help='Use tiny datasets for quick tests')
parser.set_defaults(test=False)
parser.add_argument('--unit', '-u', type=int, default=650, help='Number of LSTM units in each layer')
parser.add_argument('--model', '-m', default='model.npz', help='Model file name to serialize')
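
Once the flags are defined, the assembled script can be run from the command line, for example (the file name "train_ptb_rnn.py" is only a placeholder for whatever you saved the recipe as):

python train_ptb_rnn.py --device -1 --epoch 5 --test

The "--test" flag switches to tiny datasets so the whole pipeline can be checked quickly before launching a full training run.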

3. Defining Network Structure:

class RNNForLM(chainer.Chain):

    def __init__(self, n_vocab, n_units):
        super(RNNForLM, self).__init__()
        with self.init_scope():
            self.embed = L.EmbedID(n_vocab, n_units)
            self.l1 = L.LSTM(n_units, n_units)
            self.l2 = L.LSTM(n_units, n_units)
            self.l3 = L.Linear(n_units, n_vocab)

        # Initialize all the parameters uniformly in the range (-0.1, 0.1)
        for param in self.params():
            param.array[...] = np.random.uniform(-0.1, 0.1, param.shape)

    def reset_state(self):
        self.l1.reset_state()
        self.l2.reset_state()

    def forward(self, x):
        h0 = self.embed(x)
        h1 = self.l1(F.dropout(h0))
        h2 = self.l2(F.dropout(h1))
        y = self.l3(F.dropout(h2))
        return y
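
As a quick sanity check (not part of the training script), the network can be instantiated with a tiny vocabulary and fed a dummy mini-batch of word IDs; the sizes below are arbitrary:

# Minimal sketch: run 3 dummy word IDs through a tiny RNNForLM
toy_rnn = RNNForLM(n_vocab=10, n_units=16)
toy_rnn.reset_state()
x = np.array([1, 4, 7], dtype=np.int32)    # a mini-batch of 3 "current" words
with chainer.using_config('train', False): # disable dropout for the check
    y = toy_rnn(x)
print(y.shape)  # (3, 10): one score per vocabulary word for each input word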

4. Loading the Penn Tree Bank Long Word Sequence Dataset:

# Load the Penn Tree Bank long word sequence dataset
train, val, test = chainer.datasets.get_ptb_words()
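
get_ptb_words() returns three NumPy arrays of int32 word IDs (train, validation and test splits). To see what was downloaded, you can also fetch the word-to-ID mapping with get_ptb_words_vocabulary (a small inspection sketch, not needed for training):

# Sketch: inspect the loaded PTB arrays and the vocabulary
print(train.shape, val.shape, test.shape)            # 1-D arrays of word IDs
vocab = chainer.datasets.get_ptb_words_vocabulary()  # dict mapping word -> ID
print(len(vocab))                                    # vocabulary size (10000 for PTB)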

5. Defining Iterator for Making a Mini-batch from the Dataset:

class ParallelSequentialIterator(chainer.dataset.Iterator):

    def __init__(self, dataset, batch_size, repeat=True):
        super(ParallelSequentialIterator, self).__init__()
        self.dataset = dataset
        self.batch_size = batch_size # batch size
        self.repeat = repeat
        length = len(dataset)
        # Offsets maintain the position of each sequence in the mini-batch.
        self.offsets = [i * length // batch_size for i in range(batch_size)]
        self.reset()

    def reset(self):
        # Number of completed sweeps over the dataset. In this case, it is
        # incremented if every word is visited at least once after the last
        # increment.
        self.epoch = 0
        # True if the epoch is incremented at the last iteration.
        self.is_new_epoch = False
        # NOTE: this is not a count of parameter updates. It is just a count of
        # calls of ``__next__``.
        self.iteration = 0
        # use -1 instead of None internally
        self._previous_epoch_detail = -1.

    def __next__(self):
        # This iterator returns a list representing a mini-batch. Each item
        # indicates a different position in the original sequence. Each item is
        # represented by a pair of two word IDs. The first word is at the
        # "current" position, while the second word is at the next position.
        # At each iteration, the iteration count is incremented, which pushes
        # forward the "current" position.
        length = len(self.dataset)
        if not self.repeat and self.iteration * self.batch_size >= length:
            # If not self.repeat, this iterator stops at the end of the first
            # epoch (i.e., when all words are visited once).
            raise StopIteration
        cur_words = self.get_words()
        self._previous_epoch_detail = self.epoch_detail
        self.iteration += 1
        next_words = self.get_words()

        epoch = self.iteration * self.batch_size // length
        self.is_new_epoch = self.epoch < epoch
        if self.is_new_epoch:
            self.epoch = epoch

        return list(zip(cur_words, next_words))

    @property
    def epoch_detail(self):
        # Floating point version of epoch.
        return self.iteration * self.batch_size / len(self.dataset)

    @property
    def previous_epoch_detail(self):
        if self._previous_epoch_detail < 0:
            return None
        return self._previous_epoch_detail

    def get_words(self):
        # It returns a list of current words.
        return [self.dataset[(offset + self.iteration) % len(self.dataset)] for offset in self.offsets]

    def serialize(self, serializer):
        # It is important to serialize the state to be recovered on resume.
        self.iteration = serializer('iteration', self.iteration)
        self.epoch = serializer('epoch', self.epoch)
        try:
            self._previous_epoch_detail = serializer('previous_epoch_detail', self._previous_epoch_detail)
        except KeyError:
            # guess previous_epoch_detail for older version
            self._previous_epoch_detail = self.epoch + \
                (self.current_position - self.batch_size) / len(self.dataset)
            if self.epoch_detail > 0:
                self._previous_epoch_detail = max(self._previous_epoch_detail, 0.)
            else:
                self._previous_epoch_detail = -1.
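
To see what the iterator produces, you can run it on a toy sequence (the numbers are arbitrary and only for illustration):

# Sketch: iterate over a toy "corpus" of 12 word IDs with batch size 3
toy_data = np.arange(12, dtype=np.int32)
toy_iter = ParallelSequentialIterator(toy_data, batch_size=3)
print(next(toy_iter))  # e.g. [(0, 1), (4, 5), (8, 9)] -> (current word, next word) pairs
print(next(toy_iter))  # e.g. [(1, 2), (5, 6), (9, 10)] -> each sequence advances by one word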

6. Defining Updater:

class BPTTUpdater(training.updaters.StandardUpdater):

    def __init__(self, train_iter, optimizer, bprop_len, device):
        super(BPTTUpdater, self).__init__(train_iter, optimizer, device=device)
        self.bprop_len = bprop_len

    # The core part of the update routine can be customized by overriding.
    def update_core(self):
        loss = 0
        # When we pass one iterator and optimizer to StandardUpdater.__init__,
        # they are automatically named 'main'.
        train_iter = self.get_iterator('main')
        optimizer = self.get_optimizer('main')

        # Progress the dataset iterator for bprop_len words at each iteration.
        for i in range(self.bprop_len):
            # Get the next batch (a list of tuples of two word IDs)
            batch = train_iter.__next__()

            # Concatenate the word IDs to matrices and send them to the device.
            # self.converter does this job
            # (it is chainer.dataset.concat_examples by default).
            x, t = self.converter(batch, self.device)

            # Compute the loss at this time step and accumulate it
            loss += optimizer.target(x, t)

        optimizer.target.cleargrads() # Clear the parameter gradients
        loss.backward() # Backprop
        loss.unchain_backward() # Truncate the graph
        optimizer.update() # Update the parameters
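
The converter step is worth seeing in isolation: chainer.dataset.concat_examples (the default self.converter) stacks a list of (current, next) pairs into one input array and one target array, which is what the classifier receives (a small illustration, not part of the script):

# Sketch: what self.converter does with a mini-batch of word-ID pairs
batch = [(0, 1), (4, 5), (8, 9)]
x, t = chainer.dataset.concat_examples(batch)
print(x)  # [0 4 8] -> input word IDs
print(t)  # [1 5 9] -> target (next) word IDs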

7. Defining Evaluation Function (Perplexity):

def compute_perplexity(result):
    result['perplexity'] = np.exp(result['main/loss'])
    if 'validation/main/loss' in result:
        result['val_perplexity'] = np.exp(result['validation/main/loss'])
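
Perplexity is just the exponential of the average cross-entropy loss, so an average loss of about 4.6 nats per word corresponds to a perplexity of roughly 100 (a minimal check of the helper above):

# Sketch: perplexity = exp(average cross-entropy loss)
result = {'main/loss': 4.6}
compute_perplexity(result)
print(result['perplexity'])  # about 99.5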

8. Parsing the Arguments and Creating the Iterators:

args = parser.parse_args()
device = chainer.get_device(args.device)

train_iter = ParallelSequentialIterator(train, args.batchsize)
val_iter = ParallelSequentialIterator(val, 1, repeat=False)
test_iter = ParallelSequentialIterator(test, 1, repeat=False)

9. Create RNN and Classification Model:

n_vocab = max(train) + 1 # PTB word IDs are contiguous integers, so this gives the vocabulary size
rnn = RNNForLM(n_vocab, args.unit)
model = L.Classifier(rnn)
model.compute_accuracy = False # we only want the perplexity
model.to_device(device) # send the model to the selected device (no-op for CPU)

10. Setting up Optimizer:

optimizer = chainer.optimizers.SGD(lr=1.0)
optimizer.setup(model)
optimizer.add_hook(chainer.optimizer_hooks.GradientClipping(args.gradclip))

11. Setup and Run Trainer:

updater = BPTTUpdater(train_iter, optimizer, args.bproplen, device)
trainer = training.Trainer(updater, (args.epoch, 'epoch'), out=args.out)

eval_model = model.copy() # Model with shared params and distinct states
eval_rnn = eval_model.predictor
trainer.extend(extensions.Evaluator(
    val_iter, eval_model, device=device,
    # Reset the RNN state at the beginning of each evaluation
    eval_hook=lambda _: eval_rnn.reset_state()))

interval = 10 if args.test else 500
trainer.extend(extensions.LogReport(postprocess=compute_perplexity, trigger=(interval, 'iteration')))
trainer.extend(extensions.PrintReport(['epoch', 'iteration', 'perplexity', 'val_perplexity']), trigger=(interval, 'iteration'))
trainer.extend(extensions.ProgressBar(update_interval=1 if args.test else 10))
trainer.extend(extensions.snapshot())
trainer.extend(extensions.snapshot_object(model, 'model_iter_{.updater.iteration}'))
if args.resume is not None:
    chainer.serializers.load_npz(args.resume, trainer)
trainer.run()

12. Evaluating the Trained Model on the Test Dataset:

print('test')
eval_rnn.reset_state()
evaluator = extensions.Evaluator(test_iter, eval_model, device=device)
result = evaluator()
print('test perplexity: {}'.format(np.exp(float(result['main/loss']))))
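
The "--model" flag defined in step 2 is meant for saving the trained network. A final call mirroring the standard Chainer PTB example serializes the weights so they can be reloaded later with chainer.serializers.load_npz:

# Serialize the final model to the file given by --model (default: model.npz)
chainer.serializers.save_npz(args.model, model)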
