How to load a model using MXNet

Recipe Objective: How to load a model using MXNet?

This recipe explains how to load a model in MXNet.

Step 1: Importing library

Let us first import the necessary libraries.

import math
import mxnet as mx
import numpy as np
from mxnet import nd, autograd, gluon
from mxnet.gluon.data.vision import transforms

Step 2: Data Set

We'll train on the MNIST data set of handwritten digits. We load it in shuffled batches of 128 images using gluon.data.DataLoader(), converting each image to a tensor with transforms.ToTensor().

train = gluon.data.DataLoader(gluon.data.vision.MNIST(train=True).transform_first(transforms.ToTensor()), 128, shuffle=True)

Step 3: Neural Network

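The network has two convolutional layers, each followed by max pooling, then a flatten layer and two dense layers. The final dense layer outputs one score for each of the 10 digit classes.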

def network(net):
    # Add the layers to the container that is passed in (e.g. gluon.nn.Sequential())
    with net.name_scope():
        net.add(gluon.nn.Conv2D(channels=10, kernel_size=1, activation='relu'))
        net.add(gluon.nn.MaxPool2D(pool_size=4, strides=4))
        net.add(gluon.nn.Conv2D(channels=20, kernel_size=1, activation='relu'))
        net.add(gluon.nn.MaxPool2D(pool_size=4, strides=4))
        net.add(gluon.nn.Flatten())
        net.add(gluon.nn.Dense(256, activation="relu"))
        net.add(gluon.nn.Dense(10))

    return net
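
To see what the Flatten layer receives, the spatial size can be traced through the layers by hand. The following is a minimal sketch in plain Python, assuming 28x28 MNIST inputs, no padding, and the usual floor-based output-size formula; the helper name out_size is hypothetical and not part of MXNet.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                   # MNIST images are 28x28
size = out_size(size, kernel=1)             # Conv2D, kernel_size=1    -> 28
size = out_size(size, kernel=4, stride=4)   # MaxPool2D, pool 4 / 4    -> 7
size = out_size(size, kernel=1)             # Conv2D, kernel_size=1    -> 7
size = out_size(size, kernel=4, stride=4)   # MaxPool2D, pool 4 / 4    -> 1

flat_features = 20 * size * size            # 20 channels from the second Conv2D
print(flat_features)                        # 20
```

Under these assumptions the first dense layer sees only 20 features per image, which is worth keeping in mind if the kernel or pooling sizes are changed.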

Step 4: Learning Rate Schedules

Setting the learning rate for SGD (stochastic gradient descent) is an essential part of training a neural network, because it controls both the speed of convergence and the final performance of the network. The most straightforward strategy is to keep the learning rate constant throughout training. With a small constant value the optimizer finds reasonable solutions, but at the expense of slow initial convergence. Changing the learning rate over time resolves this: training starts with a larger rate, which is then reduced as training progresses. Below, MultiFactorScheduler multiplies the learning rate by 0.1 at each of the listed steps.

def modeltrain(model):
    model.initialize()
    # len(train) is already the number of batches (iterations) per epoch
    iterations = len(train)
    steps = [s*iterations for s in [1,2,3]]
    softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
    # Multiply the learning rate by 0.1 each time a step is passed
    learning_rate = mx.lr_scheduler.MultiFactorScheduler(step=steps, factor=0.1)
    cnt = mx.optimizer.SGD(learning_rate=0.03, lr_scheduler=learning_rate)
    trainer = mx.gluon.Trainer(params=model.collect_params(), optimizer=cnt)
    for epoch in range(1):
        for batch_num, (data, label) in enumerate(train):
            data = data.as_in_context(mx.cpu())
            label = label.as_in_context(mx.cpu())
            with autograd.record():
                output = model(data)
                loss = softmax_cross_entropy(output, label)
            loss.backward()
            trainer.step(data.shape[0])
            if batch_num % 50 == 0:
                curr_loss = nd.mean(loss).asscalar()
                print("Epoch: %d; Batch %d; Loss %f" % (epoch, batch_num, curr_loss))
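
The effect of a multi-factor schedule can be illustrated without MXNet. The following is a simplified sketch, not the library's implementation: the function multifactor_lr is hypothetical, it ignores warmup, and it assumes the rate drops once the update count exceeds a step. The value of 469 batches per epoch is also an assumption (60,000 MNIST training images at batch size 128).

```python
def multifactor_lr(num_update, base_lr, steps, factor):
    """Learning rate in effect after `num_update` optimizer updates (simplified)."""
    # Each step that has been passed multiplies the rate by `factor` once.
    passed = sum(1 for s in steps if num_update > s)
    return base_lr * factor ** passed

iterations = 469  # assumed number of batches per epoch
steps = [s * iterations for s in [1, 2, 3]]

# Print the rate just after the start and just after each step.
for n in (1, 470, 939, 1408):
    print(n, multifactor_lr(n, base_lr=0.03, steps=steps, factor=0.1))
```

With these values the rate starts at 0.03 and shrinks by a factor of 10 after each of the first three epochs. Note that a run of a single epoch, as in the loop above, would finish before the first step is passed.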

Step 5: Load a model

load_parameters loads the parameters of any Gluon model from a file that was written with save_parameters. The file holds only the parameters, not the model's architecture, so the network must first be rebuilt in code and the parameters then loaded into it. This is the method to use for dynamic (non-hybridized) models: their architecture cannot be saved, because it can change during execution.

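# file: path to parameters previously written with save_parameters(file)
# ctx: device to load the parameters onto, for example mx.cpu()
net0 = network(gluon.nn.Sequential())
net0.load_parameters(file, ctx=ctx)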
