How to use Levenshtein distance for text similarity in NLP

This recipe helps you use Levenshtein distance to measure text similarity in NLP.

Recipe Objective

How to use Levenshtein distance in text similarity?

The Levenshtein distance between two strings is the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string, String 1, into another string, String 2.

For example:

String A = helo

String B = hello

In the above example, we need to insert one missing character, l, into String A to transform it into String B. The Levenshtein distance is therefore 1, because only one edit is needed.

Similarly if:

String A = kelo

String B = hello

Here the Levenshtein distance is 2, because we must not only insert an l but also substitute the character k with h.
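To make the counting of edits concrete, here is a minimal sketch of the standard dynamic-programming approach to computing this distance in plain Python. The function name `levenshtein` and its parameter names are chosen for illustration only; this is not the implementation used by any particular library.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn source into target."""
    # previous_row[j] is the distance from the empty string to target[:j].
    previous_row = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        # Turning source[:i] into the empty string takes i deletions.
        current_row = [i]
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current_row.append(min(
                previous_row[j] + 1,                      # delete source_char
                current_row[j - 1] + 1,                   # insert target_char
                previous_row[j - 1] + substitution_cost,  # match or substitute
            ))
        previous_row = current_row
    return previous_row[-1]


print(levenshtein("helo", "hello"))  # 1: insert one l
print(levenshtein("kelo", "hello"))  # 2: substitute k with h, insert one l
```

Only two rows of the table are kept at a time, so memory use grows with the length of the target string rather than with the product of both lengths.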

Step 1 - Import the necessary libraries

import enchant  # provided by the pyenchant package

Step 2 - Define Sample strings

string_A = "helo"
string_B = "hello"

Step 3 - Print the Levenshtein distance

print("The Levenshtein Distance between String_A and String_B is: ",enchant.utils.levenshtein(string_A, string_B))

The Levenshtein Distance between String_A and String_B is:  1

This shows how Levenshtein distance works: in this example the distance is 1 because only one operation, inserting an l, is needed.

Step 4 - Some more examples

string_C = "Hello Jc"
string_D = "Hello Jack"
print(enchant.utils.levenshtein(string_C, string_D))

2

string_E = "My nam i S"
string_F = "My name is Sam"
print(enchant.utils.levenshtein(string_E, string_F))

4
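The raw distance depends on string length, so a distance of 2 means something different for a 10-character string than for a 100-character one. One common way to turn it into a similarity score between 0 and 1 is to divide by the length of the longer string. The sketch below assumes that normalization. The helper names `edit_distance` and `normalized_similarity` are hypothetical and are not part of enchant; the distance is computed in plain Python so the example runs without any extra packages.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via a two-row dynamic-programming table."""
    previous_row = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current_row = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current_row.append(min(
                previous_row[j] + 1,         # deletion
                current_row[j - 1] + 1,      # insertion
                previous_row[j - 1] + cost,  # match or substitution
            ))
        previous_row = current_row
    return previous_row[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Hypothetical helper: 1.0 for identical strings, 0.0 when every
    position differs. Assumes normalization by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest


pairs = [
    ("helo", "hello"),
    ("Hello Jc", "Hello Jack"),
    ("My nam i S", "My name is Sam"),
]
for left, right in pairs:
    print(f"{left!r} vs {right!r}: {normalized_similarity(left, right):.2f}")
```

Other normalizations exist, for example dividing by the sum of both lengths, so scores from different tools are not directly comparable unless they use the same convention.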
