When to use stemming and when to use lemmatization in nltk

This recipe explains when to use stemming and when to use lemmatization in NLTK, the Natural Language Toolkit for Python.

Recipe Objective

Stemming is a good fit for sentiment analysis, customer review analysis, and similar tasks where we only want to extract a signal, such as positive or negative sentiment, from the text. In those cases the processed words are fed to a model, so it does not matter whether each stem is a real word. Lemmatization is the better choice for chatbots and for displaying reviews of a site, service, or product, where the output must be understandable by a human.

Explanation

Stemming chops the endings off words by rule, reducing each word to a shorter stem. For example, eating and eat both become eat, and beating and beat both become beat. The stem is not always a real word, though: with the Porter stemmer, history becomes histori and historical becomes histor, which is not understandable by humans. This happens because stemming only strips suffixes to reach a common form and never checks the result against a dictionary.

Lemmatization addresses this. It looks each word up in a vocabulary (WordNet, in NLTK) and returns its base form, the lemma, which is a meaningful word that a human can read. For example, bottles becomes bottle, and eating, when treated as a verb, becomes eat.
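The behaviour described above can be checked directly. The following is a minimal sketch, assuming NLTK is installed; the word list is our own choice for illustration and is not part of the original recipe. It shows that the Porter stemmer maps related forms to a shared stem, and that the stem is not always a dictionary word.

```python
from nltk.stem import PorterStemmer

# Illustrative word list (an assumption, chosen to show both clean and odd stems).
words = ["eating", "beating", "history", "historical", "studies"]

stemmer = PorterStemmer()

# Map each word to its Porter stem.
stems = {word: stemmer.stem(word) for word in words}

for word, stem in stems.items():
    print(f"{word:>10} -> {stem}")
```

Stems such as histori or studi work well as features for a classifier, but they would look wrong if shown to a reader.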

Step 1 - Import nltk and PorterStemmer

import nltk
from nltk.stem import PorterStemmer

We have imported nltk, the Natural Language Toolkit, and from nltk.stem we have imported PorterStemmer, a widely used stemmer.

Step 2 - Create a Variable for stemmer

My_stemmer = PorterStemmer()

Here we create a PorterStemmer instance and store it in the variable My_stemmer for the following steps.

Step 3 - Input words into the stemmer

print("The output after Stemming the word is :", My_stemmer.stem('writing'), '\n')
print("The output after Stemming the word is :", My_stemmer.stem('eating'))

The output after Stemming the word is : write

The output after Stemming the word is : eat

From the above we can see how stemming works: the word writing has become write, and eating has become eat.
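In review analysis the stemmer is usually applied to every word of a text rather than to single words. The sketch below illustrates that under a few assumptions: the review sentence is made up for this example, and the text is split on whitespace only to keep the example free of extra downloads, where a real pipeline would typically use a proper tokenizer.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Hypothetical review text, made up for this example.
review = "The customers were eating and writing reviews"

# Split on whitespace; this is a simplification, not a full tokenizer.
tokens = review.split()

# Stem every token; the stemmer lowercases its input by default.
stemmed_tokens = [stemmer.stem(token) for token in tokens]

print(stemmed_tokens)
```

The resulting list can then be passed to whatever counts or vectorizes the words for the sentiment model.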

Step 4 - Import the lemmatizer from nltk library

from nltk.stem import WordNetLemmatizer

Next we repeat the process with a lemmatizer. For that we import WordNetLemmatizer from nltk.stem, which is a widely used lemmatizer. It relies on the WordNet corpus, which may need to be downloaded once with nltk.download('wordnet').

Step 5 - Create a variable for lemmatizer

My_lemmatizer = WordNetLemmatizer()

Here we create a WordNetLemmatizer instance and store it in the variable My_lemmatizer for the following steps.

Step 6 - Input words into lemmatizer

print("The word after lemmatization :", My_lemmatizer.lemmatize('eating'), '\n')
print("The word after lemmatization :", My_lemmatizer.lemmatize('bottles'))

The word after lemmatization : eating

The word after lemmatization : bottle

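From the above we can see how the lemmatizer works. The word eating is returned unchanged because lemmatize() treats every word as a noun unless told otherwise, and eating is itself a valid noun; passing pos='v' would return eat. The second word, bottles, has been reduced to its base form, bottle. In both cases the output is a real word that is understandable by humans.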
