What is SoftmaxCrossEntropyLoss in MXNET

This recipe explains what is SoftmaxCrossEntropyLoss in MXNET

Recipe Objective: What is SoftmaxCrossEntropyLoss in MXNet?

This recipe explains what is SoftmaxCrossEntropyLoss in MXNet.

Step 1: Importing library

Let us first import the necessary libraries.

import math
import mxnet as mx
import numpy as np
from mxnet import nd, autograd, gluon
from mxnet.gluon.data.vision import transforms

Step 2: Data Set

We'll use the MNIST data set to perform a set of operations. We'll load the data set using gluon.data.DataLoader().

train = gluon.data.DataLoader(gluon.data.vision.MNIST(train=True).transform_first(transforms.ToTensor()), 128, shuffle=True)

Step 3: Neural Network

We have built a neural network with two convolutional layers.

def network(net):
    with net.name_scope():
       net.add(gluon.nn.Conv2D(channels=10, kernel_size=1, activation='relu'))
       net.add(gluon.nn.MaxPool2D(pool_size=4, strides=4))
       net.add(gluon.nn.Conv2D(channels=20, kernel_size=1, activation='relu'))
       net.add(gluon.nn.MaxPool2D(pool_size=4, strides=4))
       net.add(gluon.nn.Flatten())
       net.add(gluon.nn.Dense(256, activation="relu"))
       net.add(gluon.nn.Dense(10))

       return net

Step 4: Loss with Softmax

To control the ultimate performance of the network and speed of convergence while training a neural network, the essential part is setting the learning rate for SGD (Stochastic Gradient Descent). By keeping the learning rate constant throughout the training process is the most straightforward strategy. By keeping the learning rate value small, the optimizer finds reasonable solutions, but this comes at the expense of limiting the initial speed of convergence. Changing the learning rate over time can resolve this.
SoftmaxCrossEntropyLoss() computes the softmax cross entropy loss. To avoid numerical instabilities, the softmax_cross_entropy module provides a single operator with softmax and cross-entropy fused.

def modeltrain(model):
    model.initialize()     iterations = math.ceil(len(train) / 128)
    steps = [s*iterations for s in [1,2,3]]
    softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
    learning_rate = mx.lr_scheduler.MultiFactorScheduler(step=steps, factor=0.1)
    cnt = mx.optimizer.SGD(learning_rate=0.03, lr_scheduler=learning_rate)
    trainer = mx.gluon.Trainer(params=net.collect_params(), optimizer=cnt)
    for epoch in range(1):
       for batch_num, (data, label) in enumerate(train):
          data = data.as_in_context(mx.cpu())
          label = label.as_in_context(mx.cpu())
          with autograd.record():
             output = model(data)
             loss = softmax_cross_entropy(output, label)
          loss.backward()
          trainer.step(data.shape[0])
          if batch_num % 50 == 0:
             curr_loss = nd.mean(loss).asscalar()
             print("Epoch: %d; Batch %d; Loss %f" % (epoch, batch_num, curr_loss))

What Users are saying..

profile image

Anand Kumpatla

Sr Data Scientist @ Doubleslash Software Solutions Pvt Ltd
linkedin profile url

ProjectPro is a unique platform and helps many people in the industry to solve real-life problems with a step-by-step walkthrough of projects. A platform with some fantastic resources to gain... Read More

Relevant Projects

AWS MLOps Project to Deploy Multiple Linear Regression Model
Build and Deploy a Multiple Linear Regression Model in Python on AWS

Census Income Data Set Project-Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based oncensus data.

Customer Churn Prediction Analysis using Ensemble Techniques
In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

Hands-On Approach to Causal Inference in Machine Learning
In this Machine Learning Project, you will learn to implement various causal inference techniques in Python to determine, how effective the sprinkler is in making the grass wet.

Locality Sensitive Hashing Python Code for Look-Alike Modelling
In this deep learning project, you will find similar images (lookalikes) using deep learning and locality sensitive hashing to find customers who are most likely to click on an ad.

Build Real Estate Price Prediction Model with NLP and FastAPI
In this Real Estate Price Prediction Project, you will learn to build a real estate price prediction machine learning model and deploy it on Heroku using FastAPI Framework.

Learn to Build an End-to-End Machine Learning Pipeline - Part 2
In this Machine Learning Project, you will learn how to build an end-to-end machine learning pipeline for predicting truck delays, incorporating Hopsworks' feature store and Weights and Biases for model experimentation.

MLOps Project on GCP using Kubeflow for Model Deployment
MLOps using Kubeflow on GCP - Build and deploy a deep learning model on Google Cloud Platform using Kubeflow pipelines in Python

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.

OpenCV Project for Beginners to Learn Computer Vision Basics
In this OpenCV project, you will learn computer vision basics and the fundamentals of OpenCV library using Python.