One-Hot Encoding with Multiple Labels in Python

Recipe Objective

In many datasets, each sample carries multiple labels, and a machine learning model cannot be trained directly on those labels. We could assign a number to each label, but a model treats numbers as comparable magnitudes, so it would give different weight to different labels and end up biased toward some of them. Instead, we can create a separate column for each label and fill it with boolean (0/1) values.
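The difference between the two options can be sketched in plain Python. This is only an illustration with hypothetical toy labels and a hypothetical helper named to_indicator_row; it is not how scikit-learn implements the encoding.

```python
# Hypothetical toy labels, chosen only for illustration.
labels = ["Amy", "Penny", "Raj"]

# Option 1: integer codes. The numbers imply an ordering (Raj > Penny > Amy)
# that the labels themselves do not have.
integer_codes = {label: index for index, label in enumerate(labels)}


# Option 2: one column per label, holding 1 if the label is present, else 0.
def to_indicator_row(sample, all_labels):
    """Return a 0/1 list with one entry per label in all_labels."""
    return [1 if label in sample else 0 for label in all_labels]


sample = ("Raj", "Amy")  # one sample carrying two labels at once
indicator_row = to_indicator_row(sample, labels)

print(integer_codes)
print(indicator_row)
```

With indicator columns, no label is numerically larger than another, and a sample with several labels simply has several columns set to 1.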

This Python source code does the following:
1. Converts categorical labels into numerical types.
2. Loads the important libraries and modules.
3. Implements a multi-label binarizer.
4. Creates your own NumPy feature matrix.
5. Extracts and interprets the final result.

So this is the recipe on how we can use MultiLabelBinarizer to convert labels into bool values in Python.

Step 1 - Import the library

from sklearn.preprocessing import MultiLabelBinarizer

We have imported only MultiLabelBinarizer, which is all this recipe requires.

Step 2 - Setting up the Data

We have created a list of tuples, where each tuple holds the labels for one sample and some labels appear in more than one sample.

y = [('Raj', 'Penny'), ('Amy', 'Raj'), ('Sheldon', 'Penny'), ('Leonard', 'Amy'), ('Amy', 'Leonard')]
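As a quick sanity check before encoding, the distinct labels in the data can be collected with plain Python. This is a minimal sketch that is not part of the original recipe, and the name distinct_labels is hypothetical.

```python
y = [('Raj', 'Penny'), ('Amy', 'Raj'), ('Sheldon', 'Penny'),
     ('Leonard', 'Amy'), ('Amy', 'Leonard')]

# Collect every distinct label across all samples and sort them alphabetically.
distinct_labels = sorted({label for sample in y for label in sample})

print(distinct_labels)
print(len(distinct_labels))
```

Each distinct label becomes one column in the encoded matrix, so five distinct labels here mean five columns.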

Step 3 - Using MultiLabelBinarizer and Printing Output

We have created a MultiLabelBinarizer object and used fit_transform to fit it to our data and transform the data in one step. Finally, we have printed the classes that the binarizer learned; each class corresponds to one column of the output, in the same order.

one_hot = MultiLabelBinarizer()
print(one_hot.fit_transform(y))
print(one_hot.classes_)

So the output comes as:

[[0 0 1 1 0]
 [1 0 0 1 0]
 [0 0 1 0 1]
 [1 1 0 0 0]
 [1 1 0 0 0]]

['Amy' 'Leonard' 'Penny' 'Raj' 'Sheldon']
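To interpret the matrix, it can help to map the rows back to label names. The sketch below goes one step beyond the original recipe: it assumes scikit-learn is installed and uses the binarizer's inverse_transform and transform methods. The names encoded, decoded and new_row are hypothetical, as is the extra sample.

```python
from sklearn.preprocessing import MultiLabelBinarizer

y = [('Raj', 'Penny'), ('Amy', 'Raj'), ('Sheldon', 'Penny'),
     ('Leonard', 'Amy'), ('Amy', 'Leonard')]

one_hot = MultiLabelBinarizer()
encoded = one_hot.fit_transform(y)

# Map each 0/1 row back to a tuple of label names.
decoded = one_hot.inverse_transform(encoded)
print(decoded)

# Encode a new sample with the already-fitted binarizer.
# Its labels must be among the classes seen during fitting.
new_row = one_hot.transform([('Sheldon', 'Amy')])
print(new_row)
```

The decoded tuples list labels in the order of classes_ (alphabetical here), not in the order they were written in y, so compare them as sets rather than as ordered tuples.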
