How to stem non-English words?

This recipe helps you stem non-English words.


Recipe Objective

How to stem non-English words?

As discussed earlier, stemming reduces words to their root form, or stem. We have seen stemming for English words, but what about words in other languages? Stemmers are available for non-English languages as well; NLTK's Snowball module, for example, includes a German stemmer. Let's work through a practical implementation.
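Before picking a stemmer, it can help to check which languages are covered. The following is a minimal sketch, assuming NLTK is installed; it reads the `languages` attribute of NLTK's `SnowballStemmer` class, and the variable names `supported_languages` and `german_supported` are only illustrative.

```python
from nltk.stem.snowball import SnowballStemmer

# SnowballStemmer.languages is a tuple of the language names
# that NLTK's Snowball module can stem.
supported_languages = sorted(SnowballStemmer.languages)
print(supported_languages)

# Check that German is available before relying on it.
german_supported = "german" in SnowballStemmer.languages
print("German supported:", german_supported)
```

The exact list depends on the installed NLTK version, so treat the printed names as the source of truth rather than any fixed list.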

Step 1 - Import the German language stemmer

from nltk.stem.snowball import GermanStemmer

Step 2 - Store the German stemmer in a variable

german_st = GermanStemmer()
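As an alternative to importing a language-specific class, the stemmer can also be chosen by name through `SnowballStemmer`, which is convenient when the language is only known at run time. This is a small sketch rather than a required step of the recipe; `generic_st`, `sample_word`, and `same_stem` are illustrative names.

```python
from nltk.stem.snowball import GermanStemmer, SnowballStemmer

# The language-specific class used in this recipe.
german_st = GermanStemmer()

# Alternative: select the language by name.
generic_st = SnowballStemmer("german")

# Both objects should apply the same German stemming rules.
sample_word = "geschrieben"
same_stem = german_st.stem(sample_word) == generic_st.stem(sample_word)
print("Same stem from both stemmers:", same_stem)
```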

Step 3 - Take sample words

token_sample = ["Schreiben","geschrieben"]

Here we have taken two sample words in German, whose English translations are:

Schreiben - writing

geschrieben - written

Step 4 - Apply stemming and print the results

stem_words = [german_st.stem(words) for words in token_sample]
print("Print the output after stemming:",stem_words)

Print the output after stemming: ['schreib', 'geschrieb']

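The output is 'schreib' and 'geschrieb'. These are stems rather than dictionary words, so they have no exact English translation; they correspond roughly to:

schreib - stem of schreiben (to write)

geschrieb - stem of geschrieben (written)

Comparing the sample tokens with the results shows what stemming did: each token was lowercased and its -en suffix was stripped. The ge- prefix was kept, so the two related words do not reduce to the same stem.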
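Steps 1 to 4 can be combined into one small reusable function. The sketch below assumes NLTK is installed; `stem_tokens` is a hypothetical helper introduced here for illustration, not part of NLTK, and it selects the stemmer by language name through `SnowballStemmer` instead of importing `GermanStemmer` directly.

```python
from nltk.stem.snowball import SnowballStemmer


def stem_tokens(tokens, language="german"):
    """Return the Snowball stem of each token (hypothetical helper)."""
    # Fail early with a clear message if the language is not supported.
    if language not in SnowballStemmer.languages:
        raise ValueError(f"Unsupported language: {language}")
    stemmer = SnowballStemmer(language)
    return [stemmer.stem(token) for token in tokens]


token_sample = ["Schreiben", "geschrieben"]
stem_words = stem_tokens(token_sample)
print("Print the output after stemming:", stem_words)
```

Passing a different supported language name, such as "french" or "spanish", reuses the same function for other non-English text.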
