How to stem non-English words in NLP

This recipe helps you stem non-English words in NLP.

Recipe Objective

How to stem non-English words?

Stemming, as discussed earlier, means reducing words to their root form. We have seen stemming for English words, but stemmers are available for other languages as well. Let's understand this with a practical implementation.
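To see which languages are covered before picking a stemmer, NLTK's SnowballStemmer class exposes a languages attribute. The snippet below is a small sketch that prints it; the variable name supported_languages is only illustrative, and the exact list may vary with the installed NLTK version.

```python
from nltk.stem.snowball import SnowballStemmer

# Tuple of language names that NLTK's Snowball stemmers support
supported_languages = SnowballStemmer.languages

print("Supported languages:", supported_languages)
# German, the language used in this recipe, should be part of the list
print("German supported:", "german" in supported_languages)
```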


Step 1 - Import the German language Stemmer

from nltk.stem.snowball import GermanStemmer

Step 2 - Store the German stemmer in a variable

german_st = GermanStemmer()
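As an alternative to importing the language-specific class, the generic SnowballStemmer class can be created with a language name. The sketch below assumes both routes lead to the same German algorithm and checks that on one word; the names german_generic and same_result are illustrative only.

```python
from nltk.stem.snowball import GermanStemmer, SnowballStemmer

# Language-specific class, as used in this recipe
german_st = GermanStemmer()
# Generic class, selecting the language by name
german_generic = SnowballStemmer("german")

word = "geschrieben"
# Compare the stems produced by the two stemmer objects
same_result = german_st.stem(word) == german_generic.stem(word)
print("Both stemmers agree:", same_result)
```

The generic form can be convenient when the language is only known at run time, for example when it is read from a configuration value.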

Step 3 - Take sample words

token_sample = ["Schreiben","geschrieben"]

Here we have taken some sample words in German whose English translations are:

Schreiben - writing

geschrieben - written

Step 4 - Apply stemming and print the results

stem_words = [german_st.stem(word) for word in token_sample]
print("Print the output after stemming:", stem_words)

Print the output after stemming: ['schreib', 'geschrieb']

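Here we can see the output as 'schreib' and 'geschrieb'. These are stems rather than dictionary words, so they do not translate directly into English:

schreib - stem of "schreiben" (to write)

geschrieb - "geschrieben" (written) with the suffix "en" removed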

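So we can see the difference between our sample tokens and the results after applying stemming to those words. Note that the two forms are not reduced to the same stem: the stemmer strips suffixes, but it leaves the prefix "ge" and the changed vowel in place.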
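The four steps above can be combined into one small helper. This is a minimal sketch, not part of NLTK: the function name stem_tokens and its language parameter are hypothetical, and it assumes the language name is one of those listed in SnowballStemmer.languages.

```python
from nltk.stem.snowball import SnowballStemmer


def stem_tokens(tokens, language="german"):
    """Return the Snowball stem of each token for the given language.

    Hypothetical helper for illustration; not an NLTK function.
    """
    if language not in SnowballStemmer.languages:
        raise ValueError(f"Unsupported language: {language}")
    stemmer = SnowballStemmer(language)
    return [stemmer.stem(token) for token in tokens]


token_sample = ["Schreiben", "geschrieben"]
stem_words = stem_tokens(token_sample, language="german")
print("Print the output after stemming:", stem_words)
```

Passing a different supported language name, such as "french" or "spanish", would reuse the same helper for tokens in that language.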
