What is padding in NLP?

This recipe explains what padding is in NLP.

Recipe Objective

What is padding in NLP?

Padding: Neural networks generally expect the inputs in a batch to have the same shape and size. When we pre-process text and use it as input to a model, the sequences naturally differ in length: some are long and some are short. Padding resolves this mismatch by extending the shorter sequences with a filler value until every input has the same length.
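The idea above can be sketched for tokenized text as follows. This is a minimal illustration, not the recipe's own method: the helper name pad_sequences, the pad value 0, the "pre"/"post" option, and the token ids are all assumptions made for the example.

```python
from typing import List


def pad_sequences(sequences: List[List[int]], pad_value: int = 0,
                  side: str = "post") -> List[List[int]]:
    """Pad every sequence to the length of the longest one (hypothetical helper)."""
    if side not in ("pre", "post"):
        raise ValueError("side must be 'pre' or 'post'")
    max_len = max((len(seq) for seq in sequences), default=0)
    padded = []
    for seq in sequences:
        filler = [pad_value] * (max_len - len(seq))
        # "pre" puts the filler before the tokens, "post" puts it after them.
        padded.append(filler + list(seq) if side == "pre" else list(seq) + filler)
    return padded


# Made-up token ids for three sentences of different lengths.
token_ids = [[4, 12, 7], [9, 3], [5, 8, 2, 6, 1]]
post_padded = pad_sequences(token_ids)
pre_padded = pad_sequences(token_ids, side="pre")
print(post_padded)
print(pre_padded)
```

After padding, every row has the same length, so the batch can be stored as a rectangular array.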

Step 1 - Take Sample text

Detail1 = ['Jon', '26', 'Canada']
Detail2 = ['Heena', '24', 'India']
Detail3 = ['Shawn', '27', 'California']

Here the sample text holds the name, age and address of three different people.

Step 2 - Left-justify the text (pad on the right)

for Details in [Detail1, Detail2, Detail3]:
    for entry in Details:
        print(entry.ljust(25), end='')
    print()
Jon                      26                       Canada                   
Heena                    24                       India                    
Shawn                    27                       California               

Above, we left-justify each entry with .ljust(25), which adds spaces on the right until the entry is 25 characters wide.
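Two details of .ljust are worth a quick sketch: it accepts an optional fill character, and it never truncates a string that is already longer than the requested width. The strings and widths below are illustrative choices, not part of the recipe.

```python
name = "Jon"

# Pad on the right with '*' instead of spaces so the padding is visible.
padded = name.ljust(10, "*")
print(padded)

# A string longer than the requested width is returned unchanged.
long_name = "Alexandria"
unchanged = long_name.ljust(5)
print(unchanged)
```

The same optional fill character is accepted by .rjust and .center.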

Step 3 - Center Padding

Sample_text = ["Jon plays cricket", "His favourite player is MS Dhoni", "Sometimes he loves to play football"]
for text in Sample_text:
    print(text.center(50, ' '))
                Jon plays cricket                 
         His favourite player is MS Dhoni         
       Sometimes he loves to play football        

Step 4 - Right-justify the text (pad on the left)

for ele in [Detail1, Detail2, Detail3]:
    for entry in ele:
        print(entry.rjust(30), end='')
    print()
                           Jon                            26                        Canada
                         Heena                            24                         India
                         Shawn                            27                    California
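The three alignments used in the steps above can also be written with Python's format specification, where < left-aligns, ^ centers and > right-aligns. This is an alternative sketch, not the recipe's method; the word and width are arbitrary. For centering with an odd amount of padding, the extra space may land on a different side than with .center, so only the left and right cases are compared against the string methods here.

```python
word = "India"
width = 12

left = f"{word:<{width}}"      # text first, spaces after (like .ljust)
centered = f"{word:^{width}}"  # spaces on both sides
right = f"{word:>{width}}"     # spaces first, text after (like .rjust)

for label, value in [("left", left), ("center", centered), ("right", right)]:
    # Brackets make the added spaces visible in the printed result.
    print(f"{label:>6}: [{value}]")
```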
