How to use Porter Stemmer?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to use Porter Stemmer?

How to use Porter Stemmer?

This recipe helps you use Porter Stemmer

0

Recipe Objective

As we have discussed before what is stemming, So it is nothing but reducing the words or chopping the words into their root forms for e.g eating becomes eat and so on. So in stemming there are different stemmers and we are going to discuss PortersStemmer the most popularly used one.

Porters Stemmer It is a type of stemmer which is mainly known for Data Mining and Information Retrieval. As its applications are limited to the English language only. It is based on the idea that the suffixes in the English language are made up of a combination of smaller and simpler suffixes, it is also majorly known for its simplicity and speed. The advantage is, it produces the best output from other stemmers and has less error rate.

Step 1 - Import the NLTK library and from NLTK import PorterStemmer

import nltk from nltk.stem import PorterStemmer

Step 2 - Creat a variable and store PorterStemmer into it

ps = PorterStemmer()

Step 3 - lets see how to use PorterStemmer

print(ps.stem('bat')) print(ps.stem('batting'))

bat

bat

from the above we can say that the word bat and batting has reduced to bat lets try with some more examples

print(ps.stem('code')) print(ps.stem('coding')) print(ps.stem('coder')) print(ps.stem('coded'))

code

code

coder

code

So, we have observed that it is working for the words like code, coding, coded but not working for coder because if the word has at least one vowel and consonant plus EED ending, change the ending to 'EE' for e.g agreed become agree.

Relevant Projects

Ensemble Machine Learning Project - All State Insurance Claims Severity Prediction
In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in python using ensemble machine learning algorithms.

Customer Market Basket Analysis using Apriori and Fpgrowth algorithms
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Mercari Price Suggestion Challenge Data Science Project
Data Science Project in Python- Build a machine learning algorithm that automatically suggests the right product prices.

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.