When to use stemming and when to use lemmatization?

This recipe explains when to use stemming and when to use lemmatization

Recipe Objective

Stemming is a good fit for tasks such as sentiment analysis or customer review analysis, where the goal is to extract a signal from the text (for example, positive or negative sentiment) and the reduced word forms are consumed by a model rather than read by people. Lemmatization is the better choice for chatbots and for displaying reviews of a site, service, or product, where the output must be understandable by a human.

Explanation

Stemming chops off the ends of words to reduce related forms to a common stem. For example, "eating" and "eat" both become "eat", and "beating" and "beat" both become "beat". Because a stemmer only applies suffix-stripping rules and never checks the result against a dictionary, the stem is not always a real word: the Porter stemmer, for instance, reduces "studies" to "studi" and "history" to "histori", which a human reader would not recognise as words.

Lemmatization addresses this by mapping each word to its dictionary form (its lemma), so the output is a meaningful word that a human can understand. For example, "bottles" becomes "bottle".
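The contrast between the two approaches can be sketched with a toy example in plain Python. This is only an illustration under simplifying assumptions: toy_stem, toy_lemmatize, SUFFIXES and LEMMAS are hypothetical names invented here, and the code is neither the Porter algorithm nor WordNet.

```python
# Toy illustration only: NOT the Porter algorithm and NOT WordNet.
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical, deliberately tiny suffix list


def toy_stem(word):
    """Chop the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical mini-dictionary mapping inflected forms to their lemmas.
LEMMAS = {"studies": "study", "eating": "eat", "bottles": "bottle"}


def toy_lemmatize(word):
    """Look the word up; fall back to the word itself when it is unknown."""
    return LEMMAS.get(word, word)


for w in ("eating", "studies", "bottles"):
    print(w, "-> stem:", toy_stem(w), "| lemma:", toy_lemmatize(w))
```

The rule-based function needs no vocabulary but can produce fragments such as "studi", while the lookup-based function always returns a real word but only for entries it knows about.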

Step 1 - Import the library - nltk and PorterStemmer from nltk

import nltk
from nltk.stem import PorterStemmer

We have imported nltk, the Natural Language Toolkit, and from nltk.stem we have imported PorterStemmer, a widely used stemmer.

Step 2 - Create a Variable for stemmer

My_stemmer = PorterStemmer()

Here we create a PorterStemmer instance and store it in the variable My_stemmer for further operations.

Step 3 - Input words into the stemmer

print("The output after Stemming the word is :", My_stemmer.stem('writing'), '\n')
print("The output after Stemming the word is :", My_stemmer.stem('eating'))

The output after Stemming the word is : write

The output after Stemming the word is : eat

From the above we get an idea of how stemming works: the word writing has become write, and eating has become eat.
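In practice a stemmer is usually applied to a whole list of tokens rather than one word at a time. The following is a minimal sketch of that, assuming NLTK is installed; the helper stem_all is a hypothetical name introduced here, and the import is guarded so the snippet still loads when NLTK is missing.

```python
try:
    from nltk.stem import PorterStemmer
except ImportError:  # NLTK is not installed in this environment
    PorterStemmer = None


def stem_all(words):
    """Stem each word in the list with NLTK's Porter stemmer."""
    if PorterStemmer is None:
        raise RuntimeError("install nltk to run this example")
    stemmer = PorterStemmer()
    return [stemmer.stem(word) for word in words]


if PorterStemmer is not None:
    # Related forms collapse to one stem; "studies" yields a non-word stem.
    print(stem_all(["eating", "eats", "eat", "studies"]))
```

Collapsing "eating", "eats" and "eat" to a single stem is what makes stemming useful for tasks like sentiment analysis, where only the shared root matters.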

Step 4 - Import the lemmatizer from nltk library

from nltk.stem import WordNetLemmatizer

Next we repeat the process with a lemmatizer, as we did with the stemmer. For that we import WordNetLemmatizer from nltk.stem, a widely used lemmatizer.

Step 5 - Create a variable for lemmatizer

My_lemmatizer = WordNetLemmatizer()

Here we create a WordNetLemmatizer instance and store it in the variable My_lemmatizer for further operations.

Step 6 - Input words into lemmatizer

print("The word after lemmatization :", My_lemmatizer.lemmatize('eating'), '\n')
print("The word after lemmatization :", My_lemmatizer.lemmatize('bottles'))

The word after lemmatization : eating

The word after lemmatization : bottle

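From the above we can see how the lemmatizer works. The word eating is unchanged because lemmatize treats its input as a noun by default, and as a noun eating is already in its base form; passing pos='v' would return eat. The second word, bottles, has been reduced to its dictionary form, bottle. In both cases the output is a real word that a human can understand.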
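The effect of the part-of-speech argument of WordNetLemmatizer.lemmatize can be checked with a short sketch. It assumes NLTK is installed and that the WordNet corpus is available (it is typically fetched once with nltk.download('wordnet')); the helper lemmatize_with_pos is a hypothetical name introduced here, and it returns None when either requirement is missing.

```python
def lemmatize_with_pos(word, pos):
    """Return the WordNet lemma for `word` under the given part of speech.

    Returns None when NLTK or its WordNet data is not available.
    """
    try:
        from nltk.stem import WordNetLemmatizer
        return WordNetLemmatizer().lemmatize(word, pos=pos)
    except (ImportError, LookupError):
        return None


# "n" = noun (the default used by lemmatize), "v" = verb
print("eating as noun :", lemmatize_with_pos("eating", "n"))
print("eating as verb :", lemmatize_with_pos("eating", "v"))
print("bottles as noun:", lemmatize_with_pos("bottles", "n"))
```

When the word's part of speech is known, for example from a tagger, passing it explicitly generally gives a more useful lemma than relying on the noun default.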
