How to stem non-English words in NLP

This recipe helps you stem non-English words in NLP

Recipe Objective

How to stem non-English words?

As discussed earlier, stemming reduces words to their root form. We have seen stemming for English words, but stemmers are available for other languages as well. Let's understand this with a practical implementation using NLTK's Snowball stemmer for German.
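To see which languages are covered before picking one, the following minimal sketch prints the language names that NLTK's Snowball stemmers accept. It assumes NLTK is installed; the variable name supported_languages is only illustrative and is not part of the original recipe.

```python
from nltk.stem.snowball import SnowballStemmer

# SnowballStemmer.languages holds the lowercase names of the supported languages.
supported_languages = SnowballStemmer.languages

print(supported_languages)
# Check that German, the language used in this recipe, is among them.
print("german" in supported_languages)
```

Any name from this collection can be passed to SnowballStemmer to get a stemmer for that language.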


Step 1 - Import the German language Stemmer

from nltk.stem.snowball import GermanStemmer

Step 2 - Store the German stemmer in a variable

german_st = GermanStemmer()
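As an alternative to importing the language-specific class, the stemmer can also be created by name through the generic SnowballStemmer class. The sketch below is an illustration, not part of the original recipe; the names generic_st and same_stem are assumptions made for the example. It checks that both ways of building the stemmer give the same stem for one sample word.

```python
from nltk.stem.snowball import GermanStemmer, SnowballStemmer

# Language-specific class, as used in this recipe.
german_st = GermanStemmer()

# Generic class; the language is selected by its lowercase name.
generic_st = SnowballStemmer("german")

word = "Schreiben"
# Both stemmers are expected to agree on the stem of the same word.
same_stem = german_st.stem(word) == generic_st.stem(word)
print(same_stem)
```

Creating the stemmer by name is convenient when the language is only known at run time, for example when it comes from a configuration value.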

Step 3 - Take sample words

token_sample = ["Schreiben","geschrieben"]

Here we have taken two sample words in German, whose English translations are:

Schreiben - writing

geschrieben - written

Step 4 - Apply stemming and print the results

stem_words = [german_st.stem(word) for word in token_sample]
print("Print the output after stemming:", stem_words)

Print the output after stemming: ['schreib', 'geschrieb']

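The output contains the stems 'schreib' and 'geschrieb'. A stem is a truncated form of the word and does not have to be a dictionary word with its own translation:

schreib - stem of Schreiben (writing)

geschrieb - stem of geschrieben (written)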

So we can see the difference between our sample tokens and the results after applying stemming to them: each word is lowercased and its suffix "en" is removed.
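The four steps can be wrapped into one reusable function that works for any supported language. This is a minimal sketch under stated assumptions: the function name stem_tokens and the variable german_stems are hypothetical and do not come from the original recipe, and the explicit language check is an added convenience.

```python
from nltk.stem.snowball import SnowballStemmer


def stem_tokens(tokens, language):
    """Stem each token with the Snowball stemmer for the given language."""
    # Fail early with a clear message if the language is not supported.
    if language not in SnowballStemmer.languages:
        raise ValueError(f"No Snowball stemmer available for {language!r}")
    stemmer = SnowballStemmer(language)
    return [stemmer.stem(token) for token in tokens]


# Reuse the sample words from Step 3.
german_stems = stem_tokens(["Schreiben", "geschrieben"], "german")
print(german_stems)

# The two inflected forms are not merged into a single stem.
print(german_stems[0] == german_stems[1])
```

Note that the two sample words keep different stems. A stemmer only strips affixes by rule, so irregular forms such as a past participle are not mapped back to the base verb.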
