How to clip gradient in Pytorch

This recipe helps you clip gradient in Pytorch

Recipe Objective

How to clip gradient in Pytorch?

This is achieved by using the torch.nn.utils.clip_grad_norm_(parameters, max_norm, norm_type=2.0) syntax available in PyTorch, in this it will clip gradient norm of iterable parameters, where the norm is computed overall gradients together as if they were been concatenated into vector. There are functions being used in this which have there separate meanings:
parameters - It will have gradients normalized this is iterable of tensors or single tensor.
max_norm - this is nothing but the maximum normalization of the gradients.
norm_type - This is the normalization type or norm type which used p-norm. Also this can be "inf" for the infinity norm.

PyTorch vs Tensorflow - Which One Should You Choose For Your Next Deep Learning Project ?

Step 1 - Import library

import torch

Step 2 - Define parameters

batch, dim_in, dim_h, dim_out = 128, 2000, 200, 20

Here we are defining various parameters which are as follows:
batch - batch size
dim_in - Input dimension.
dim_out - Output dimension.
dim_h - hidden dimension.

Step 3 - Create Random tensors

input_X = torch.randn(batch, dim_in)
output_Y = torch.randn(batch, dim_out)

Here we are creating random tensors for holding the input and output data.

Step 4 - Define model and loss function

Adam_model = torch.nn.Sequential( torch.nn.Linear(dim_in, dim_h), torch.nn.ReLU(), torch.nn.Linear(dim_h, dim_out), )
loss_fn = torch.nn.MSELoss(reduction='sum')

Step 5 - Define learning rate

rate_learning = 1e-4

Step 6 - Initialize optimizer

optim = torch.optim.Adam(SGD_model.parameters(), lr=rate_learning)

Here we are Initializing our optimizer by using the "optim" package which will update the weights of the model for us. We are using SGD optimizer here the "optim" package which consist of many optimization algorithms.

Step 7 - Forward pass

for values in range(500):
   pred_y = Adam_model(input_X)
   loss = loss_fn(pred_y, output_Y)
   if values % 100 == 99:
     print(values, loss.item())

99 698.3545532226562
199 698.3545532226562
299 698.3545532226562
399 698.3545532226562
499 698.3545532226562

Here we are computing the predicted y by passing input_X to the model, after that computing the loss and then printing it.

Step 8 - Zero all gradients

optim.zero_grad()

Here before the backward pass we must zero all the gradients for the variables it will update which are nothing but the learnable weights of the model.

Step 9 - Backward pass

loss.backward()

Here we are computing the gradients of the loss w.r.t the model parameters.

Step 10 - Call step function

optim.step()

Here we are calling the step function on an optimizer which will makes an update to its parameters.

Step 11 - Clip gradients

torch.nn.utils.clip_grad_norm(parameters=Adam_model.parameters(), max_norm=10, norm_type=2.0)

tensor(1462.1097)

What Users are saying..

profile image

Ed Godalle

Director Data Analytics at EY / EY Tech
linkedin profile url

I am the Director of Data Analytics with over 10+ years of IT experience. I have a background in SQL, Python, and Big Data working with Accenture, IBM, and Infosys. I am looking to enhance my skills... Read More

Relevant Projects

Ensemble Machine Learning Project - All State Insurance Claims Severity Prediction
In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in python using ensemble machine learning algorithms.

Deep Learning Project for Beginners with Source Code Part 1
Learn to implement deep neural networks in Python .

Build a CNN Model with PyTorch for Image Classification
In this deep learning project, you will learn how to build an Image Classification Model using PyTorch CNN

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Demand prediction of driver availability using multistep time series analysis
In this supervised learning machine learning project, you will predict the availability of a driver in a specific area by using multi step time series analysis.

LLM Project to Build and Fine Tune a Large Language Model
In this LLM project for beginners, you will learn to build a knowledge-grounded chatbot using LLM's and learn how to fine tune it.

Image Segmentation using Mask R-CNN with Tensorflow
In this Deep Learning Project on Image Segmentation Python, you will learn how to implement the Mask R-CNN model for early fire detection.

Build an Image Segmentation Model using Amazon SageMaker
In this Machine Learning Project, you will learn to implement the UNet Architecture and build an Image Segmentation Model using Amazon SageMaker

Loan Eligibility Prediction in Python using H2O.ai
In this loan prediction project you will build predictive models in Python using H2O.ai to predict if an applicant is able to repay the loan or not.

Llama2 Project for MetaData Generation using FAISS and RAGs
In this LLM Llama2 Project, you will automate metadata generation using Llama2, RAGs, and AWS to reduce manual efforts.