How to save a model in MXNet

Recipe Objective: How to save a model in MXNet?

This recipe explains how to save a model in MXNet.

Step 1: Importing library

Let us first import the necessary libraries.

import math
import mxnet as mx
import numpy as np
from mxnet import nd, autograd, gluon
from mxnet.gluon.data.vision import transforms

Step 2: Data Set

We'll use the MNIST dataset to perform a set of operations, loading it with gluon.data.DataLoader().

train = gluon.data.DataLoader(gluon.data.vision.MNIST(train=True).transform_first(transforms.ToTensor()), 128, shuffle=True)

Step 3: Neural Network

Next, we build a neural network with two convolutional layers.

def network(net):
    with net.name_scope():
        net.add(gluon.nn.Conv2D(channels=10, kernel_size=1, activation='relu'))
        net.add(gluon.nn.MaxPool2D(pool_size=4, strides=4))
        net.add(gluon.nn.Conv2D(channels=20, kernel_size=1, activation='relu'))
        net.add(gluon.nn.MaxPool2D(pool_size=4, strides=4))
        net.add(gluon.nn.Flatten())
        net.add(gluon.nn.Dense(256, activation="relu"))
        net.add(gluon.nn.Dense(10))
    return net

net = network(gluon.nn.Sequential())

Step 4: Learning Rate Schedules

Setting the learning rate for SGD (Stochastic Gradient Descent) is essential for controlling both the final performance of the network and the speed of convergence during training. The most straightforward strategy is to keep the learning rate constant throughout training. A small constant learning rate lets the optimizer find reasonable solutions, but at the expense of slow initial convergence. Changing the learning rate over time resolves this trade-off.

def modeltrain(model):
    model.initialize()
    iterations = len(train)  # len() of a DataLoader is the number of batches per epoch
    steps = [s * iterations for s in [1, 2, 3]]
    softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
    learning_rate = mx.lr_scheduler.MultiFactorScheduler(step=steps, factor=0.1)
    cnt = mx.optimizer.SGD(learning_rate=0.03, lr_scheduler=learning_rate)
    trainer = mx.gluon.Trainer(params=model.collect_params(), optimizer=cnt)
    for epoch in range(1):
        for batch_num, (data, label) in enumerate(train):
            data = data.as_in_context(mx.cpu())
            label = label.as_in_context(mx.cpu())
            with autograd.record():
                output = model(data)
                loss = softmax_cross_entropy(output, label)
            loss.backward()
            trainer.step(data.shape[0])
            if batch_num % 50 == 0:
                curr_loss = nd.mean(loss).asscalar()
                print("Epoch: %d; Batch %d; Loss %f" % (epoch, batch_num, curr_loss))

modeltrain(net)

Step 5: Save a model

save_parameters() saves the parameters of any Gluon model, but it cannot save the model's architecture, so the architecture must be rebuilt in code before the parameters can be loaded back. This makes save_parameters() the right choice for dynamic models, whose architecture changes during execution and therefore cannot be serialized; for static (hybridized) models, export() can save both the parameters and the architecture.

file = "net.params"
net.save_parameters(file)
