How to rank a Pandas DataFrame?
DATA MUNGING DATA CLEANING PYTHON MACHINE LEARNING RECIPES PANDAS CHEATSHEET     ALL TAGS

How to rank a Pandas DataFrame?

How to rank a Pandas DataFrame?

This recipe helps you rank a Pandas DataFrame

0

Recipe Objective

While working on a dataset we sometimes need to get ranks of the columns based on the values in other features, rank can be defined in many ways like based on ascending order or decending order of the values in the feature.

This python source code does the following :
1. Creates and converts data dictionary into pandas dataframe
2. Creates new columns in the dataframe
3. Ranks dataframe in ascending and descending order

So this is the recipe on how we rank a Pandas DataFrame.

Step 1 - Import the library

import pandas as pd

We have only imported pandas which is needed.

Step 2 - Setting up the Data

We have created a dictionary of data and passed it in pd.DataFrame to make a dataframe with columns 'first_name', 'last_name', 'age', 'Comedy_Score' and 'Rating_Score'. raw_data = {'first_name': ['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy'], 'last_name': ['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler'], 'age': [42, 38, 36, 41, 35], 'Comedy_Score': [9, 7, 8, 8, 5], 'Rating_Score': [25, 25, 49, 62, 70]} df = pd.DataFrame(raw_data, columns = ['first_name', 'last_name', 'age', 'Comedy_Score', 'Rating_Score']) print(df)

Step 3 - Ranking the dataframe

We want to rank the dataframe on the basis of column 'age', for better understanding we will rank on ascending as well as decending order of age. But before using rank function let us first look into its parameters.

  • axis : It is bool in which 0 signifies rows and 1 signifies column and by default it is 0.
  • method : In this we have to pass the method of ranking the dataframe, it can be 'average', 'min', 'max', 'first' and 'dense'. By default it is set to average.
  • na_option : This to decide if we want to rank NaN values as NaN or we have give the higest or lowest rank to it. By default it is set to keep.
  • ascending : This is a bool feature in which we have to especify that we want the ranking as ascending or decending.
df['Hierarchy_Rank'] = df['age'].rank(ascending=True) print(df) df['Hierarchy_Rank'] = df['age'].rank(ascending=False) print(df) So the output comes as


  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70

  first_name     last_name  age  Comedy_Score  Rating_Score  Hierarchy_Rank
0    Sheldon        Copper   42             9            25             5.0
1        Raj  Koothrappali   38             7            25             3.0
2    Leonard    Hofstadter   36             8            49             2.0
3     Howard      Wolowitz   41             8            62             4.0
4        Amy        Fowler   35             5            70             1.0

  first_name     last_name  age  Comedy_Score  Rating_Score  Hierarchy_Rank
0    Sheldon        Copper   42             9            25             1.0
1        Raj  Koothrappali   38             7            25             3.0
2    Leonard    Hofstadter   36             8            49             4.0
3     Howard      Wolowitz   41             8            62             2.0
4        Amy        Fowler   35             5            70             5.0

Relevant Projects

Data Science Project on Wine Quality Prediction in R
In this R data science project, we will explore wine dataset to assess red wine quality. The objective of this data science project is to explore which chemical properties will influence the quality of red wines.

Machine Learning project for Retail Price Optimization
In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Perform Time series modelling using Facebook Prophet
In this project, we are going to talk about Time Series Forecasting to predict the electricity requirement for a particular house using Prophet.

Mercari Price Suggestion Challenge Data Science Project
Data Science Project in Python- Build a machine learning algorithm that automatically suggests the right product prices.

Customer Market Basket Analysis using Apriori and Fpgrowth algorithms
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Sequence Classification with LSTM RNN in Python with Keras
In this project, we are going to work on Sequence to Sequence Prediction using IMDB Movie Review Dataset​ using Keras in Python.

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.