When to use stemming and when to use lemmatization in nltk

This recipe explains when to use stemming and when to use lemmatization in nltk
Last Updated: 23 Feb 2023

Recipe Objective

Stemming is a good fit for tasks such as sentiment analysis or customer review analysis, where we only need to extract a signal from the text (for example, positive versus negative sentiment) and nobody reads the processed words. Lemmatization is preferred for chatbots and for displaying reviews of a site, service, or product, where the output must be understandable by a human.

Explanation

Stemming chops the end off a word to reduce it to a shorter base form. For example, eating and eat both become eat, and beating and beat both become beat. Because a stemmer applies suffix-stripping rules rather than looking words up in a dictionary, the result is not always a real word: with the Porter stemmer, studies becomes studi and history becomes histori, which is not understandable by humans.

Lemmatization addresses this by mapping a word to its dictionary form (its lemma), so the output is always a meaningful word. For example, bottles becomes bottle, and eating becomes eat when the lemmatizer is told that the word is a verb.
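To see this behaviour concretely, here is a minimal sketch that runs a handful of words through the Porter stemmer. The word list and the variable names are chosen for illustration and are not part of the recipe; only PorterStemmer and its stem() method come from nltk.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Illustrative word list (an assumption, not from the original recipe).
words = ["eating", "beating", "studies", "history", "historical"]

# Map each word to the stem returned by the Porter algorithm.
stems = {word: stemmer.stem(word) for word in words}

for word, stem in stems.items():
    print(f"{word} -> {stem}")
```

Stems such as studi, histori and histor work as internal features for a model, but they would look wrong if shown to a reader.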

Step 1 - Import the library - nltk and PorterStemmer from nltk

import nltk
from nltk.stem import PorterStemmer

We have imported nltk, the Natural Language Toolkit, and from nltk.stem we have imported PorterStemmer, a widely used stemmer.

Step 2 - Create a Variable for stemmer

My_stemmer = PorterStemmer()

Here we create a PorterStemmer instance and store it in the variable My_stemmer for further operations.

Step 3 - Input words into the stemmer

print("The output after Stemming the word is :", My_stemmer.stem('writing'), '\n')
print("The output after Stemming the word is :", My_stemmer.stem('eating'))

The output after Stemming the word is : write

The output after Stemming the word is : eat

The output above shows how stemming works: the word writing has become write, and eating has become eat.
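In a review-analysis setting, the usual reason to stem is that different inflections of a word end up counted as one feature. The following is a minimal sketch of that idea, assuming simple whitespace tokenization with str.split(). The sample reviews and the stem_counts helper are hypothetical and are not part of the recipe.

```python
from collections import Counter

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Hypothetical sample reviews, made up for this illustration.
reviews = ["I loved the product", "Loving the product so far", "I love it"]


def stem_counts(texts):
    """Count stems across all texts, using lowercase whitespace tokens."""
    counts = Counter()
    for text in texts:
        for token in text.lower().split():
            counts[stemmer.stem(token)] += 1
    return counts


counts = stem_counts(reviews)

# loved, loving and love all map to the stem 'love'.
print(counts["love"])
```

A real pipeline would normally use a proper tokenizer and strip punctuation first; whitespace splitting is used here only to keep the sketch free of extra downloads.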

Step 4 - Import the lemmatizer from nltk library

from nltk.stem import WordNetLemmatizer

Now we repeat the process with a lemmatizer. For that we import WordNetLemmatizer from nltk.stem, which is a widely used lemmatizer. It relies on the WordNet corpus, which must be downloaded once with nltk.download('wordnet') before the lemmatizer is used.

Step 5 - Create a variable for lemmatizer

My_lemmatizer = WordNetLemmatizer()

Here we create a WordNetLemmatizer instance and store it in the variable My_lemmatizer for further operations.

Step 6 - Input words into lemmatizer

print("The word after lemmatization :", My_lemmatizer.lemmatize('eating'), '\n')
print("The word after lemmatization :", My_lemmatizer.lemmatize('bottles'))

The word after lemmatization : eating

The word after lemmatization : bottle

The output above shows how the lemmatizer works. The word eating has remained the same because lemmatize() treats its input as a noun unless told otherwise, and eating is already a valid noun. The second word, bottles, has been converted to its singular dictionary form, bottle.
