Ordinal Encoding - What, How, and When?


If you've ever wondered how to handle categorical data in machine learning, you're in the right place. Ordinal Encoding is a fundamental technique that transforms categorical variables into numerical values, allowing algorithms to process them effectively. Check out this tutorial that explains what Ordinal Encoding is, how it works, and, most importantly, when to use it. Understanding Ordinal Encoding can significantly enhance your data preprocessing skills and improve the performance of your machine-learning models. So, let's dive in to get started! 

What is Ordinal Encoding? 

Ordinal encoding converts a categorical variable into numerical format by assigning an integer to each category according to its rank. It is sometimes loosely called label encoding, although label encoding does not necessarily respect a meaningful order. Because it preserves the inherent ordering of categories, it is helpful for machine learning models, such as neural networks, that require numeric input features. Unlike one-hot encoding, which discards the order between categories, it lets an algorithm see that one category ranks above or below another.

When to Use Ordinal Encoding? 

Ordinal encoding is a valuable technique in data preprocessing, particularly suited for handling ordinal data. As the name suggests, ordinal encoders are designed to manage data with an inherent order or ranking. This makes ordinal encoding an ideal choice when confronted with variables that represent categories with a clear hierarchy or sequence but are not inherently numerical.

One prominent scenario where ordinal encoding shines is when dealing with categorical data that involves ranking or ordering, such as survey responses ranging from "strongly disagree" to "strongly agree" or educational levels like "high school," "college," and "graduate school." In such cases, the ordinal nature of the data needs to be preserved, and ordinal encoding offers a systematic approach to achieve this. Ordinal data exhibits a clear sequence or hierarchy, unlike nominal data, where categories lack any natural order. Thus, using techniques like one-hot encoding, which treats each category as independent, might not be appropriate as it fails to capture the ordinal relationships present in the data. Instead, ordinal encoding assigns a numerical value to each category based on its position in the sequence, thereby preserving the ordinal information.
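To make the contrast with one-hot encoding concrete, here is a minimal sketch using pandas. The survey responses and the names `responses`, `likert_order`, and `rank` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical survey responses (illustrative data only)
responses = pd.Series(
    ["agree", "strongly disagree", "neutral", "strongly agree", "disagree"],
    name="response",
)

# Ordinal encoding: one integer per category, assigned by position in the sequence
likert_order = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
rank = {label: position for position, label in enumerate(likert_order)}
ordinal = responses.map(rank)

# One-hot encoding: one indicator column per category, with no order implied
one_hot = pd.get_dummies(responses)

print(ordinal.tolist())  # [3, 0, 2, 4, 1]
print(one_hot.shape)     # (5, 5)
```

The ordinal column keeps the fact that "agree" ranks above "neutral" in a single feature, whereas the one-hot frame spreads the same information over five unrelated columns.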

How to do Ordinal Encoding in Python?

Here is a concise example of performing ordinal encoding using Pandas in just a few simple steps - 

Step 1 - Import the library

    import pandas as pd

We have imported pandas, which will be needed for the dataset.

Step 2 - Setting up the Data

We have created a dataframe with one feature, "Score," whose values are the categories "Low," "Medium," and "High."

    df = pd.DataFrame({"Score": ["Low", "Low", "Medium", "Medium", "High", "Low", "Medium","High", "Low"]})

    print(df)


Step 3 - Encoding variable

We have created a dictionary, scale_mapper, that maps each category to an integer reflecting its rank. We then use it to build a new feature, Scale, that holds the numerically encoded values.

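    # Map each category to an integer that reflects its rank
    scale_mapper = {"Low": 1, "Medium": 2, "High": 3}

    # map() looks up each value in the dictionary; labels missing from it become NaN
    df["Scale"] = df["Score"].map(scale_mapper)

    print(df)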

So the output comes as:

        Score
    0     Low
    1     Low
    2  Medium
    3  Medium
    4    High
    5     Low
    6  Medium
    7    High
    8     Low

        Score  Scale
    0     Low      1
    1     Low      1
    2  Medium      2
    3  Medium      2
    4    High      3
    5     Low      1
    6  Medium      2
    7    High      3
    8     Low      1
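As an alternative to a hand-written dictionary, pandas can also carry the order inside the column itself through an ordered categorical type. The following is a minimal sketch under that assumption; the name `score_order` is ours, and the `+ 1` is only there to reproduce the 1 to 3 scale used above.

```python
import pandas as pd

df = pd.DataFrame({"Score": ["Low", "Low", "Medium", "Medium", "High", "Low", "Medium", "High", "Low"]})

# Declare the rank order explicitly; otherwise pandas infers the
# categories and sorts them alphabetically (High, Low, Medium)
score_order = ["Low", "Medium", "High"]
df["Score"] = pd.Categorical(df["Score"], categories=score_order, ordered=True)

# .cat.codes is zero-based (Low=0, Medium=1, High=2); add 1 for a 1-3 scale
df["Scale"] = df["Score"].cat.codes + 1

print(df["Scale"].tolist())        # [1, 1, 2, 2, 3, 1, 2, 3, 1]

# Because the column is ordered, comparisons follow the declared rank
print((df["Score"] > "Low").sum())  # 5
```

One design note: a label that is not listed in `score_order` becomes a missing value with code -1, so it is worth checking for unexpected labels before encoding.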

 

You can also perform ordinal encoding using libraries like scikit-learn as mentioned below - 

  1. Import the OrdinalEncoder class from sklearn.preprocessing.

  2. Initialize the OrdinalEncoder object and call fit_transform on your categorical data.

These two steps encode your categorical data into a numerical format using ordinal encoding in Python.
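A minimal sketch of these two steps with scikit-learn follows. It assumes the same Score data as the pandas example; passing an explicit `categories` list is our assumption about how to keep the intended order, not something the steps above prescribe.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({"Score": ["Low", "Low", "Medium", "Medium", "High", "Low", "Medium", "High", "Low"]})

# Pass the rank order explicitly. With the default categories="auto",
# the labels are sorted alphabetically (High, Low, Medium), which would
# not match the intended Low < Medium < High ranking.
encoder = OrdinalEncoder(categories=[["Low", "Medium", "High"]])

# fit_transform expects 2-D input, hence the double brackets;
# it returns a float array of zero-based codes
encoded = encoder.fit_transform(df[["Score"]])
df["Scale"] = encoded.ravel().astype(int)

print(df)
```

Note that the codes start at 0 here (Low=0, Medium=1, High=2), whereas the dictionary in the pandas example starts at 1. The ordering is the same either way.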

Master Ordinal Encoding Concepts with ProjectPro!

Ordinal encoding is a must-know concept for any data scientist or machine learning engineer. Understanding its principles, applications, and nuances helps you preprocess categorical data effectively, which in turn supports more accurate models and more insightful analyses. Grasping its intricacies goes beyond theory, however; it demands real-world practice. ProjectPro offers hands-on experience through its collection of more than 270 projects in data science and big data. Practicing these real-world projects will solidify your understanding of ordinal encoding and help you develop the skills needed to apply it in machine learning.
