Ordinal Encoding - What, How, and When?


If you've ever wondered how to handle categorical data in machine learning, you're in the right place. Ordinal encoding is a fundamental technique that transforms categorical variables into numerical values so that algorithms can process them. This tutorial explains what ordinal encoding is, how it works, and, most importantly, when to use it. Understanding it will strengthen your data preprocessing skills and can improve the performance of your machine-learning models. Let's get started!

What is Ordinal Encoding? 

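Ordinal encoding converts a categorical variable into numerical format by assigning each category a unique integer that reflects its rank. It is sometimes called label encoding, although that term more commonly refers to assigning integers without regard to order (in scikit-learn, LabelEncoder is intended for target labels rather than input features). Because the integers follow the inherent ordering of the categories, ordinal encoding is helpful for machine learning models, such as neural networks, that require numeric input features, and it retains the ordinal information between categories that one-hot encoding discards.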

When to Use Ordinal Encoding? 

Ordinal encoding is a valuable technique in data preprocessing, particularly suited for handling ordinal data. As the name suggests, ordinal encoders are designed to manage data with an inherent order or ranking. This makes ordinal encoding an ideal choice when confronted with variables that represent categories with a clear hierarchy or sequence but are not inherently numerical.

One prominent scenario where ordinal encoding shines is when dealing with categorical data that involves ranking or ordering, such as survey responses ranging from "strongly disagree" to "strongly agree" or educational levels like "high school," "college," and "graduate school." In such cases, the ordinal nature of the data needs to be preserved, and ordinal encoding offers a systematic approach to achieve this. Ordinal data exhibits a clear sequence or hierarchy, unlike nominal data, where categories lack any natural order. Thus, using techniques like one-hot encoding, which treats each category as independent, might not be appropriate as it fails to capture the ordinal relationships present in the data. Instead, ordinal encoding assigns a numerical value to each category based on its position in the sequence, thereby preserving the ordinal information.
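As a minimal sketch of the contrast described above, the snippet below encodes a survey column both ways. The column name "response", the five-level scale, and the sample answers are illustrative assumptions, not data from this tutorial.

```python
import pandas as pd

# Hypothetical five-level survey scale, listed from lowest to highest (illustrative only)
levels = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]

# Hypothetical survey answers
survey = pd.DataFrame(
    {"response": ["agree", "neutral", "strongly disagree", "strongly agree", "agree"]}
)

# Ordinal encoding: each category is replaced by its position in the ordered scale
rank = {level: position for position, level in enumerate(levels)}
survey["response_ordinal"] = survey["response"].map(rank)

# One-hot encoding: one independent indicator column per observed category,
# so the order between categories is not represented
one_hot = pd.get_dummies(survey["response"])

print(survey)
print(one_hot.columns.tolist())
```

In the ordinal column, "strongly agree" ends up with a larger number than "agree", so a model can use the ranking. The one-hot columns carry no such relationship.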

How to do Ordinal Encoding in Python?

Here is a concise example of ordinal encoding with pandas in a few simple steps:

Step 1 - Import the library

    import pandas as pd

We have imported pandas, which will be needed for the dataset.

Step 2 - Setting up the Data

We have created a DataFrame with one feature, "Score", whose categorical values are "Low", "Medium", and "High".

    df = pd.DataFrame({"Score": ["Low", "Low", "Medium", "Medium", "High", "Low", "Medium","High", "Low"]})

    print(df)


Step 3 - Encoding variable

We have created a dictionary, scale_mapper, that maps each category to a numerical value in rank order. We then use it to create a new feature, "Scale", that holds the encoded values.

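    # Map each category to its rank, from lowest to highest
    scale_mapper = {"Low": 1, "Medium": 2, "High": 3}

    # Series.map looks up each value of "Score" in the dictionary
    df["Scale"] = df["Score"].map(scale_mapper)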

 

    print(df)

The output is:

        Score
    0     Low
    1     Low
    2  Medium
    3  Medium
    4    High
    5     Low
    6  Medium
    7    High
    8     Low

        Score  Scale
    0     Low      1
    1     Low      1
    2  Medium      2
    3  Medium      2
    4    High      3
    5     Low      1
    6  Medium      2
    7    High      3
    8     Low      1
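The same result can also be reached with an ordered pandas categorical instead of a hand-written dictionary. This is an alternative sketch rather than the method used in the steps above. The name score_dtype is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Score": ["Low", "Low", "Medium", "Medium", "High", "Low", "Medium", "High", "Low"]}
)

# Declare the category order explicitly, from lowest to highest
score_dtype = pd.CategoricalDtype(categories=["Low", "Medium", "High"], ordered=True)
df["Score"] = df["Score"].astype(score_dtype)

# .cat.codes gives the zero-based position of each category;
# adding 1 reproduces the 1-3 scale used in the dictionary example
df["Scale"] = df["Score"].cat.codes + 1

print(df)
```

One reason to consider this approach is that the order lives in the column's dtype, so comparisons such as df["Score"] > "Low" also respect the ranking.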

 

You can also perform ordinal encoding using libraries like scikit-learn, as outlined below:

  1. Import the necessary library: the OrdinalEncoder class from sklearn.preprocessing.

  2. Initialize the OrdinalEncoder object and call fit_transform on your categorical data.

These two steps encode your categorical data into a numerical format using ordinal encoding in Python. Note that, by default, OrdinalEncoder orders categories by sorting their values, which is alphabetical for strings, so pass the intended order through its categories parameter when the ranking matters.
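The two scikit-learn steps can be sketched as follows, reusing the "Score" data from the pandas example. The explicit category order passed to the encoder is a choice made for this sketch, so that "Low" ranks below "Medium" and "High".

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame(
    {"Score": ["Low", "Low", "Medium", "Medium", "High", "Low", "Medium", "High", "Low"]}
)

# One list of categories per input column, given from lowest to highest.
# Without this argument the categories would be sorted as High, Low, Medium.
encoder = OrdinalEncoder(categories=[["Low", "Medium", "High"]])

# fit_transform expects 2-D input, hence the double brackets around "Score";
# ravel() flattens the single-column result so it can be stored as one column
df["Scale"] = encoder.fit_transform(df[["Score"]]).ravel()

print(df)
```

Unlike the dictionary example, OrdinalEncoder numbers categories from 0 and returns floating-point values, so "Low", "Medium", and "High" become 0.0, 1.0, and 2.0 here.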

Master Ordinal Encoding Concepts with ProjectPro!

Ordinal encoding is a must-know concept for any data scientist or machine learning engineer. Understanding its principles, applications, and nuances helps you preprocess categorical data effectively, which leads to more accurate models and more insightful analyses. Grasping its intricacies goes beyond theory, though: it demands real-world practice. ProjectPro offers hands-on experience through its collection of more than 270 projects in data science and big data. Practicing on these real-world projects will solidify your understanding of ordinal encoding and help you build the skills needed to succeed in machine learning.
