How to save a model in MXNet

Recipe Objective: How to save a model in MXNet?

This recipe explains how to save the parameters of a trained Gluon model in MXNet.

Step 1: Importing libraries

Let us first import the necessary libraries.

import math
import mxnet as mx
import numpy as np
from mxnet import nd, autograd, gluon
from mxnet.gluon.data.vision import transforms

Step 2: Data Set

We'll use the MNIST data set to perform a set of operations. We'll load the data set using gluon.data.DataLoader().

train = gluon.data.DataLoader(gluon.data.vision.MNIST(train=True).transform_first(transforms.ToTensor()), 128, shuffle=True)

Step 3: Neural Network

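We build a neural network with two convolutional layers, each followed by a max-pooling layer, and then two dense layers. The last dense layer outputs one score for each of the 10 digit classes.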

def network(net):
    with net.name_scope():
        net.add(gluon.nn.Conv2D(channels=10, kernel_size=1, activation='relu'))
        net.add(gluon.nn.MaxPool2D(pool_size=4, strides=4))
        net.add(gluon.nn.Conv2D(channels=20, kernel_size=1, activation='relu'))
        net.add(gluon.nn.MaxPool2D(pool_size=4, strides=4))
        net.add(gluon.nn.Flatten())
        net.add(gluon.nn.Dense(256, activation="relu"))
        net.add(gluon.nn.Dense(10))
    return net

Step 4: Learning Rate Schedules

Setting the learning rate for SGD (Stochastic Gradient Descent) is an essential part of training a neural network, because it controls both the speed of convergence and the ultimate performance of the network. The most straightforward strategy is to keep the learning rate constant throughout training. A small learning rate lets the optimizer find reasonable solutions, but at the expense of limiting the initial speed of convergence. Changing the learning rate over time can resolve this. Here we use MultiFactorScheduler, which multiplies the learning rate by a fixed factor at each of the given update steps.

def modeltrain(model):
    model.initialize()
    # len(train) is already the number of batches per epoch for a DataLoader
    iterations = len(train)
    # Update counts after which the learning rate is reduced
    steps = [s * iterations for s in [1, 2, 3]]
    softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
    learning_rate = mx.lr_scheduler.MultiFactorScheduler(step=steps, factor=0.1)
    cnt = mx.optimizer.SGD(learning_rate=0.03, lr_scheduler=learning_rate)
    # Train the parameters of the model passed in, not of a global variable
    trainer = mx.gluon.Trainer(params=model.collect_params(), optimizer=cnt)
    for epoch in range(1):
        for batch_num, (data, label) in enumerate(train):
            data = data.as_in_context(mx.cpu())
            label = label.as_in_context(mx.cpu())
            with autograd.record():
                output = model(data)
                loss = softmax_cross_entropy(output, label)
            loss.backward()
            trainer.step(data.shape[0])
            if batch_num % 50 == 0:
                curr_loss = nd.mean(loss).asscalar()
                print("Epoch: %d; Batch %d; Loss %f" % (epoch, batch_num, curr_loss))
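To make the effect of the schedule concrete, here is a minimal pure-Python sketch of a stepwise decay of the kind MultiFactorScheduler applies. It is an illustration, not MXNet's implementation. The function name multi_factor_lr and the step values are hypothetical, and it assumes the rate drops once the update count exceeds a step boundary.

```python
def multi_factor_lr(base_lr, steps, factor, num_update):
    """Learning rate in effect after `num_update` updates."""
    # Count how many step boundaries the update counter has passed
    passed = sum(1 for s in steps if num_update > s)
    # Each passed boundary multiplies the rate by `factor` once
    return base_lr * factor ** passed

example_steps = [100, 200, 300]  # hypothetical update counts
for n in (1, 100, 101, 201, 301):
    print(n, multi_factor_lr(0.03, example_steps, 0.1, n))
```

With a base rate of 0.03 and a factor of 0.1, the rate stays at 0.03 up to the first boundary and shrinks tenfold after each boundary that follows.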

Step 5: Save a model

save_parameters saves the parameters of any Gluon model, but it does not save the model's architecture. It is the method to use for dynamic (non-hybrid) models: because the architecture of a dynamic model can change during execution, it cannot be stored in a file, so only the parameters are written.

net = network(gluon.nn.Sequential())
modeltrain(net)
file = "net.params"
net.save_parameters(file)
