Ordinal Encoding - What, How, and When?

Learn all about ordinal encoding in our easy-to-follow tutorial. This tutorial will explain what it is, how to use it, and when to use it. | ProjectPro

If you've ever wondered how to handle categorical data in machine learning, you're in the right place. Ordinal Encoding is a fundamental technique that transforms categorical variables into numerical values, allowing algorithms to process them effectively. Check out this tutorial that explains what Ordinal Encoding is, how it works, and, most importantly, when to use it. Understanding Ordinal Encoding can significantly enhance your data preprocessing skills and improve the performance of your machine-learning models. So, let's dive in to get started! 

What is Ordinal Encoding? 

Ordinal encoding, or label encoding, converts categorical variables into numerical format by assigning a unique value to each category. It preserves the inherent ordering of categories and is helpful for machine learning models like neural networks that require numeric input features. It ensures algorithms understand categorical data and retains ordinal information between categories lost in one-hot encoding.

When to Use Ordinal Encoding? 

Ordinal encoding is a valuable technique in data preprocessing, particularly suited for handling ordinal data. As the name suggests, ordinal encoders are designed to manage data with an inherent order or ranking. This makes ordinal encoding an ideal choice when confronted with variables that represent categories with a clear hierarchy or sequence but are not inherently numerical.

One prominent scenario where ordinal encoding shines is when dealing with categorical data that involves ranking or ordering, such as survey responses ranging from "strongly disagree" to "strongly agree" or educational levels like "high school," "college," and "graduate school." In such cases, the ordinal nature of the data needs to be preserved, and ordinal encoding offers a systematic approach to achieve this. Ordinal data exhibits a clear sequence or hierarchy, unlike nominal data, where categories lack any natural order. Thus, using techniques like one-hot encoding, which treats each category as independent, might not be appropriate as it fails to capture the ordinal relationships present in the data. Instead, ordinal encoding assigns a numerical value to each category based on its position in the sequence, thereby preserving the ordinal information.

How to do Ordinal Encoding in Python?

Here is a concise example of performing ordinal encoding using Pandas in just a few simple steps - 

Step 1 - Import the library

    import pandas as pd

We have imported pandas, which will be needed for the dataset.

Step 2 - Setting up the Data

We have created a dataframe with one feature, "score," and categorical variables, "Low," "Medium," and "High."   

    df = pd.DataFrame({"Score": ["Low", "Low", "Medium", "Medium", "High", "Low", "Medium","High", "Low"]})

    print(df)

Explore More Data Science and Machine Learning Projects for Practice. Fast-Track Your Career Transition with ProjectPro

Step 3 - Encoding variable

We have created an object scale_mapper in which we have passed the encoding parameter, i.e., putting numerical values instead of categorical variables. We have made a feature scale with numerical encoded values.   

    scale_mapper = {"Low":1, "Medium":2, "High":3}

    df["Scale"] = df["Score"].replace(scale_mapper)

 

    print(df)

So the output comes as:

   Score

0     Low

1     Low

2  Medium

3  Medium

4    High

5     Low

6  Medium

7    High

8     Low

    Score  Scale

0     Low      1

1     Low      1

2  Medium      2

3  Medium      2

4    High      3

5     Low      1

6  Medium      2

7    High      3

8     Low      1

 

You can also perform ordinal encoding using libraries like scikit-learn as mentioned below - 

  1. Import the necessary library. 

Import ordinal encoder in Python

  1. Initialize the OrdinalEncoder object and fit_transform your categorical data. 

How to encode ordinal features in Python?

These two lines of code will help you encode your categorical data into a numerical format using ordinal encoding in Python.

Master Ordinal Encoding Concepts with ProjectPro!

Ordinal encoding is an important and must-know concept for any data scientist or machine learning engineer. Understanding its principles, applications, and nuances can significantly enhance your ability to preprocess categorical data effectively, leading to more accurate models and insightful analyses. Yet, grasping its intricacies goes beyond theory—it demands real-world practice. ProjectPro offers an invaluable opportunity to gain hands-on experience through its extensive collection of over 270+ projects in data science and big data. Practicing these real-world projects will solidify your understanding of ordinal encoding and develop the skills necessary to excel in the field. ProjectPro can help you take the next step in your learning journey and become proficient in ordinal encoding, setting yourself up for success in machine learning. 

Download Materials

What Users are saying..

profile image

Ameeruddin Mohammed

ETL (Abintio) developer at IBM
linkedin profile url

I come from a background in Marketing and Analytics and when I developed an interest in Machine Learning algorithms, I did multiple in-class courses from reputed institutions though I got good... Read More

Relevant Projects

Build Real Estate Price Prediction Model with NLP and FastAPI
In this Real Estate Price Prediction Project, you will learn to build a real estate price prediction machine learning model and deploy it on Heroku using FastAPI Framework.

Abstractive Text Summarization using Transformers-BART Model
Deep Learning Project to implement an Abstractive Text Summarizer using Google's Transformers-BART Model to generate news article headlines.

Time Series Classification Project for Elevator Failure Prediction
In this Time Series Project, you will predict the failure of elevators using IoT sensor data as a time series classification machine learning problem.

Build a Hybrid Recommender System in Python using LightFM
In this Recommender System project, you will build a hybrid recommender system in Python using LightFM .

Build a Logistic Regression Model in Python from Scratch
Regression project to implement logistic regression in python from scratch on streaming app data.

Build a Customer Churn Prediction Model using Decision Trees
Develop a customer churn prediction model using decision tree machine learning algorithms and data science on streaming service data.

Digit Recognition using CNN for MNIST Dataset in Python
In this deep learning project, you will build a convolutional neural network using MNIST dataset for handwritten digit recognition.

BERT Text Classification using DistilBERT and ALBERT Models
This Project Explains how to perform Text Classification using ALBERT and DistilBERT

Build Customer Propensity to Purchase Model in Python
In this machine learning project, you will learn to build a machine learning model to estimate customer propensity to purchase.

Credit Card Fraud Detection as a Classification Problem
In this data science project, we will predict the credit card fraud in the transactional dataset using some of the predictive models.