How to stem non english words?

How to stem non english words?

How to stem non english words?

This recipe helps you stem non english words


Recipe Objective

How to stem non english words?

Stemming as we have discussed already what is stemming which is nothing but reducing the words to their root size. We have seen stemming for English words, but what about non - english language words, there are stemmers available for non - english words as well. Lets understand this with practical implementation.

Step 1 - Import the German language Stemmer

from nltk.stem.snowball import GermanStemmer

Step 2 - Store the german stemmer in a variable

german_st = GermanStemmer()

Step 3 - Take sample words

token_sample = ["Schreiben","geschrieben"]

Here we have taken some sample words in german whose english translation is:

Schreiben - writing

geschrieben - written

Step 4 - Apply stemming and print the results

stem_words = [german_st.stem(words) for words in token_sample] print("Print the output after stemming:",stem_words)
Print the output after stemming: ['schreib', 'geschrieb']

Here we can see the output as, 'schreib', 'geschrieb' whose english translation is:

schreib - write

geschrieb - wrote

So we can see the difference between our sample token words and results after applying stremming on that words.

Relevant Projects

Music Recommendation System Project using Python and R
Machine Learning Project - Work with KKBOX's Music Recommendation System dataset to build the best music recommendation engine.

Deep Learning with Keras in R to Predict Customer Churn
In this deep learning project, we will predict customer churn using Artificial Neural Networks and learn how to model an ANN in R with the keras deep learning package.

Mercari Price Suggestion Challenge Data Science Project
Data Science Project in Python- Build a machine learning algorithm that automatically suggests the right product prices.

Machine Learning or Predictive Models in IoT - Energy Prediction Use Case
In this machine learning and IoT project, we are going to test out the experimental data using various predictive models and train the models and break the energy usage.

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

Customer Churn Prediction Analysis using Ensemble Techniques
In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Ensemble Machine Learning Project - All State Insurance Claims Severity Prediction
In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in python using ensemble machine learning algorithms.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Human Activity Recognition Using Smartphones Data Set
In this deep learning project, you will build a classification system where to precisely identify human fitness activities.