How to build a character-level LSTM using DyNet

This recipe helps you build a character-level LSTM using DyNet.

Recipe Objective - How to build a character-level LSTM using DyNet?

Now that we know the basics of the RNN and LSTM models, let's create a character-level LSTM model and train a simple RNN alongside it for comparison.

import random
import dynet as dy

layers = 3
input_dim = 60
hidden_dim = 60

characters = list("abcdefghijklmnopqrstuvwxyz ")
# the empty string serves as the start/end-of-sequence marker
characters.append("")

int2char = list(characters)
char2int = {c:i for i,c in enumerate(characters)}

vocab_size = len(characters)
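
To make the vocabulary mapping concrete, here is a minimal standalone sketch of how a sentence is turned into integer indices and back. The helpers `encode` and `decode` are hypothetical names introduced only for illustration; the recipe itself does this inline inside its loss function.

```python
# Same vocabulary as in the recipe: 26 letters, a space, and the
# empty string used as the start/end-of-sequence marker.
characters = list("abcdefghijklmnopqrstuvwxyz ")
characters.append("")

int2char = list(characters)
char2int = {c: i for i, c in enumerate(characters)}


def encode(sentence):
    """Hypothetical helper: wrap the sentence in markers and map to indices."""
    padded = [""] + list(sentence) + [""]
    return [char2int[c] for c in padded]


def decode(ids):
    """Hypothetical helper: map indices back to text.

    The marker is the empty string, so joining drops it automatically.
    """
    return "".join(int2char[i] for i in ids)


ids = encode("ab")
print(ids)
print(decode(ids))
```

Each consecutive pair of indices in the encoded list becomes one (input character, target character) training example.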

1. Compute the loss of the RNN.
2. Generate from the model.
3. Train and generate samples.

pc = dy.ParameterCollection()


SRNN_builder = dy.SimpleRNNBuilder(layers, input_dim, hidden_dim, pc)
LSTM_builder = dy.LSTMBuilder(layers, input_dim, hidden_dim, pc)

# add parameters for the hidden->output part for both lstm and srnn
params_lstm = {}
params_srnn = {}
for params in [params_lstm, params_srnn]:
    params["lookup"] = pc.add_lookup_parameters((vocab_size, input_dim))
    params["R"] = pc.add_parameters((vocab_size, hidden_dim))
    params["bias"] = pc.add_parameters((vocab_size))

# compute the loss of the RNN for one sentence
def do_one_sentence(rnn, params, sentence):
    # start a fresh computation graph and RNN state
    dy.renew_cg()
    s0 = rnn.initial_state()

    R = params["R"]
    bias = params["bias"]
    lookup = params["lookup"]
    # pad with the empty-string marker at both ends, then map to indices
    sentence = [""] + list(sentence) + [""]
    sentence = [char2int[c] for c in sentence]
    s = s0
    loss = []
    for char,next_char in zip(sentence,sentence[1:]):
        s = s.add_input(lookup[char])
        probs = dy.softmax(R*s.output() + bias)
        # negative log-probability of the correct next character
        loss.append( -dy.log(dy.pick(probs,next_char)) )
    loss = dy.esum(loss)
    return loss
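
The loss in do_one_sentence is intended to be the sum, over every position, of the negative log of the softmax probability given to the correct next character. The following pure-Python sketch reproduces only that arithmetic, without DyNet. The names `softmax` and `sequence_nll` and the score values are assumptions made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_nll(step_scores, targets):
    """Sum of -log p(target) over all time steps."""
    return sum(
        -math.log(softmax(scores)[target])
        for scores, target in zip(step_scores, targets)
    )


# Three time steps over a 4-symbol vocabulary with uniform scores:
# every target has probability 1/4, so the loss is 3 * log(4).
uniform_steps = [[0.0, 0.0, 0.0, 0.0]] * 3
print(sequence_nll(uniform_steps, [0, 1, 2]))
```

A model that has learned nothing gives a loss near (sentence length + 1) * log(vocab_size); training should push the value well below that.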

# generate from model:
def generate(rnn, params):
    def sample(probs):
        # draw an index with probability proportional to probs
        rnd = random.random()
        for i,p in enumerate(probs):
            rnd -= p
            if rnd <= 0: break
        return i

    # setup the sentence
    dy.renew_cg()
    s0 = rnn.initial_state()

    R = params["R"]
    bias = params["bias"]
    lookup = params["lookup"]

    s = s0.add_input(lookup[char2int[""]])
    out=[]
    while True:
        probs = dy.softmax(R*s.output() + bias)
        probs = probs.vec_value()
        next_char = sample(probs)
        out.append(int2char[next_char])
        if out[-1] == "": break
        s = s.add_input(lookup[next_char])
    return "".join(out[:-1]) # strip the trailing end-of-sequence marker
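
The sample helper inside generate draws an index by subtracting each probability from a uniform random number until the running value reaches zero. Below is a standalone sketch of the same idea that can be checked without DyNet. The `rng` parameter is an assumption added here so the draws can be seeded; the recipe's version calls `random.random()` directly.

```python
import random


def sample(probs, rng=random):
    """Draw an index i with probability probs[i].

    rng is a hypothetical parameter (not in the recipe) that lets the
    caller pass a seeded random.Random instance.
    """
    rnd = rng.random()
    for i, p in enumerate(probs):
        rnd -= p
        if rnd <= 0:
            break
    # If rounding leaves rnd slightly above zero, the last index is returned.
    return i


rng = random.Random(0)
draws = [sample([0.2, 0.8], rng) for _ in range(1000)]
print(draws.count(1) / len(draws))  # fraction of draws that picked index 1
```

Because generation stops only when the end marker is sampled, an untrained model can produce long random strings before it terminates.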

# train, and print the loss and a generated sample every 5 iterations
def train(rnn, params, sentence):
    trainer = dy.SimpleSGDTrainer(pc)
    for i in range(200):
        loss = do_one_sentence(rnn, params, sentence)
        loss_value = loss.value()
        loss.backward()
        trainer.update()
        if i % 5 == 0:
            print("%.10f" % loss_value, end="\t")
            print(generate(rnn, params))

sentence = "a monkey ate the food of crocodile"
train(SRNN_builder, params_srnn, sentence)

sentence = "a monkey ate the food of crocodile"
train(LSTM_builder, params_lstm, sentence)

train(SRNN_builder, params_srnn, "these chickens are making me hungry")
