How to use Porter Stemmer?


This recipe helps you use Porter Stemmer


Recipe Objective

As discussed before, stemming reduces words to their root form by chopping off suffixes; for example, "eating" becomes "eat". There are several different stemmers, and this recipe covers PorterStemmer, the most popular one.

Porter Stemmer is a stemmer widely used in data mining and information retrieval. Its applications are limited to the English language. It is based on the idea that suffixes in English are made up of combinations of smaller, simpler suffixes, and it is known for its simplicity and speed. Its main advantage is that it tends to produce good output with a relatively low error rate compared with other stemmers.

Step 1 - Import the NLTK library and import PorterStemmer from nltk.stem

import nltk
from nltk.stem import PorterStemmer

Step 2 - Create a variable and store a PorterStemmer instance in it

ps = PorterStemmer()

Step 3 - Let's see how to use PorterStemmer

print(ps.stem('bat'))
print(ps.stem('batting'))

bat

bat

From the above, we can see that both "bat" and "batting" are reduced to "bat". Let's try some more examples.

print(ps.stem('code'))
print(ps.stem('coding'))
print(ps.stem('coder'))
print(ps.stem('coded'))

code

code

coder

code
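The individual calls above can also be written as a loop over a list of words. This is a minimal sketch that reuses only the words already shown in this recipe; the variable names words and stems are just illustrative.

```python
from nltk.stem import PorterStemmer

ps = PorterStemmer()

# The same words used in the steps above.
words = ["bat", "batting", "code", "coding", "coder", "coded"]

# Stem every word in one pass.
stems = [ps.stem(word) for word in words]

# Show each word next to its stem.
for word, stem in zip(words, stems):
    print(f"{word} -> {stem}")
```

Collecting the stems in a list makes it easy to compare several related words at once and spot the ones that are left unchanged.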

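So the stemmer reduces code, coding and coded to code, but leaves coder unchanged. This is because Porter's rules remove a suffix only when the remaining stem is long enough, as counted by its measure m, the number of vowel-consonant sequences in the stem. The "er" suffix is removed only when m is greater than 1, and the stem "cod" has m = 1, so "coder" stays as it is. Another rule of this kind: if a word ends in "eed" and the stem before it has m greater than 0, the ending is changed to "ee", e.g. agreed becomes agree.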
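To make the role of the measure more concrete, here is a small pure-Python sketch. It is not NLTK's implementation: the helper names porter_measure, strip_er and eed_to_ee are hypothetical, and each helper applies only one isolated rule rather than the full multi-step algorithm.

```python
def porter_measure(stem):
    """Return m, the number of vowel-consonant (VC) sequences in stem."""
    kinds = []
    for i, ch in enumerate(stem.lower()):
        if ch in "aeiou":
            kind = "V"
        elif ch == "y" and i > 0 and kinds[-1] == "C":
            kind = "V"  # 'y' preceded by a consonant is treated as a vowel
        else:
            kind = "C"
        kinds.append(kind)
    # Every vowel directly followed by a consonant starts one VC sequence.
    return sum(1 for a, b in zip(kinds, kinds[1:]) if a == "V" and b == "C")


def strip_er(word):
    """Single rule only: drop a trailing 'er' if the remaining stem has m > 1."""
    if word.endswith("er") and porter_measure(word[:-2]) > 1:
        return word[:-2]
    return word


def eed_to_ee(word):
    """Single rule only: change a trailing 'eed' to 'ee' if the stem has m > 0."""
    if word.endswith("eed") and porter_measure(word[:-3]) > 0:
        return word[:-1]
    return word


print(porter_measure("cod"))  # measure of the stem left after removing 'er' from 'coder'
print(strip_er("coder"))      # unchanged, because m is not greater than 1
print(eed_to_ee("agreed"))    # the ending 'eed' becomes 'ee'
print(eed_to_ee("feed"))      # unchanged, because the stem 'f' has m = 0
```

The full algorithm runs several such rules in sequence, so the result of one isolated rule here will not always match what ps.stem returns for the same word.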
