What is a Vectorizer?
Vectorization is the process of converting words into numbers. It is a technique in NLP that maps words or phrases from a vocabulary to corresponding vectors of real numbers, which can then be used for tasks such as word prediction and measuring similarity.
Vectorization is used in use cases such as:
Finding similar words
Document clustering / grouping
Natural language processing (NLP)
Feature extraction for text classification
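To make the idea of mapping words to numbers concrete, here is a minimal from-scratch sketch of a word-count vectorizer in plain Python. The helper names build_vocabulary and count_vector and the two sample sentences are hypothetical, chosen only for illustration, and the tokenization (lowercasing and splitting on word characters) is a simplifying assumption rather than what any particular library does.

```python
import re
from collections import Counter


def build_vocabulary(documents):
    """Return a sorted list of the distinct lowercase words across all documents."""
    words = set()
    for doc in documents:
        words.update(re.findall(r"\w+", doc.lower()))
    return sorted(words)


def count_vector(document, vocabulary):
    """Return one count per vocabulary word for a single document."""
    counts = Counter(re.findall(r"\w+", document.lower()))
    return [counts[word] for word in vocabulary]


# Hypothetical sample documents, used only for this illustration
docs = ["the cat sat", "the dog sat on the mat"]
vocabulary = build_vocabulary(docs)
vectors = [count_vector(doc, vocabulary) for doc in docs]

print(vocabulary)
print(vectors)
```

Each document becomes a list of integers with one position per vocabulary word, so documents of different lengths end up as vectors of the same size.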
Let's see an example of a vectorizer using CountVectorizer:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer()
text1 = "jack wants to play football"
text2 = "Heena also loves to play football"
# Learn the vocabulary and build a sparse document-term count matrix
vectors = count_vect.fit_transform([text1, text2])
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
feature_names = count_vect.get_feature_names_out()
# Convert the sparse matrix to a dense array so pandas can display it
df = pd.DataFrame(vectors.toarray(), columns=feature_names)
print(df)
   also  football  heena  jack  loves  play  to  wants
0     0         1      0     1      0     1   1      1
1     1         1      1     0      1     1   1      0
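Once the sentences are count vectors, the similarity use case mentioned earlier can be sketched with cosine similarity. The snippet below is a minimal illustration in plain Python, assuming the two count rows from the output above are copied in by hand; the names row_text1, row_text2 and cosine_similarity are hypothetical and not part of scikit-learn's API.

```python
import math

# Count rows copied from the output above
# (column order: also, football, heena, jack, loves, play, to, wants)
row_text1 = [0, 1, 0, 1, 0, 1, 1, 1]
row_text2 = [1, 1, 1, 0, 1, 1, 1, 0]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # An all-zero vector has no direction; treat similarity as 0
        return 0.0
    return dot / (norm_a * norm_b)


similarity = cosine_similarity(row_text1, row_text2)
print(round(similarity, 3))
```

The two sentences share three words (football, play, to), which gives a similarity of roughly 0.55 on a scale where 1.0 means identical word counts and 0.0 means no words in common.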