How to insert a new column based on condition in Python?

This recipe helps you insert a new column based on a condition in Python.

Recipe Objective- How to insert a new column based on condition in Python?

Adding a new column to a pandas DataFrame is easy. But what if the values in that column must depend on the values of another column? For a small dataset with only a few rows you could fill the values in by hand, but for a large dataset with hundreds of rows that quickly becomes impractical.

We can replace this tedious manual work with a few lines of code that apply the condition to every row.

This recipe shows how to insert a new pandas column whose values are based on a condition.

Python Pandas ‘Add New Column Based On Condition’

You can follow the steps below to create a new column in pandas based on a condition.

Step 1 - Import the library

import pandas as pd

import numpy as np

We have imported pandas and numpy. No other library is needed for this recipe.

Step 2 - Creating a Sample Dataset

We will create a DataFrame with the columns 'bond_name' and 'risk_score'. We will use a print statement to view our initial dataset.

raw_data = {'bond_name': ['govt_bond_1', 'govt_bond_2', 'govt_bond_3', 'pvt_bond_1', 'pvt_bond_2', 'pvt_bond_3', 'pvt_bond_4'], 'risk_score': [1.6, 0.9, 2.3, 3.0, 2.7, 1.8, 4.1]}

df = pd.DataFrame(raw_data, columns = ['bond_name', 'risk_score'])

print(df)

Step 3 - Creating a function to assign values in column

First, we will create an empty list named rating, to which we will append one value per row according to the condition.

rating = []

We will create a loop that iterates over every value in the 'risk_score' column and appends a rating to the list. We use an if-elif-else statement to express the condition. Here, we want to assign a rating based on risk_score. The branches are checked in order, so the first one that matches decides the rating:

  • If the value in risk_score is less than 1, 'AA' is assigned in the rating column.

  • If the value in risk_score is at least 1 and less than 2, 'A' is assigned in the rating column.

  • If the value in risk_score is at least 2 and less than 3, 'BB' is assigned in the rating column.

  • If the value in risk_score is at least 3 and less than 4, 'B' is assigned in the rating column.

  • If the value in risk_score is at least 4 and less than 5, 'C' is assigned in the rating column.

  • In every other case (the value is 5 or higher, or it is missing, i.e. NaN), 'Not_Rated' is assigned in the rating column.

rating = []

for row in df['risk_score']:
    if row < 1.0:
        rating.append('AA')
    elif row < 2.0:
        rating.append('A')
    elif row < 3.0:
        rating.append('BB')
    elif row < 4.0:
        rating.append('B')
    elif row < 5.0:
        rating.append('C')
    else:
        rating.append('Not_Rated')
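The same thresholds can also be wrapped in a reusable function, which keeps the rule in one place. The sketch below is only one way to do it; the name assign_rating is a hypothetical helper, not part of pandas, and the sample data is rebuilt so the block runs on its own.

```python
import pandas as pd


def assign_rating(score):
    """Map a risk score to a rating using the same thresholds as the loop above."""
    if score < 1.0:
        return 'AA'
    elif score < 2.0:
        return 'A'
    elif score < 3.0:
        return 'BB'
    elif score < 4.0:
        return 'B'
    elif score < 5.0:
        return 'C'
    # Reached for scores of 5.0 or more, and for NaN
    # (every comparison with NaN is False).
    return 'Not_Rated'


# Same sample data as in Step 2.
df = pd.DataFrame({
    'bond_name': ['govt_bond_1', 'govt_bond_2', 'govt_bond_3', 'pvt_bond_1',
                  'pvt_bond_2', 'pvt_bond_3', 'pvt_bond_4'],
    'risk_score': [1.6, 0.9, 2.3, 3.0, 2.7, 1.8, 4.1],
})

# Build the list of ratings, one entry per row.
rating = [assign_rating(score) for score in df['risk_score']]
print(rating)
```

The resulting list can be attached to the DataFrame in exactly the same way as the list built by the loop.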


Step 4 - Converting list into column of dataset and viewing the final dataset

Finally, we add that list to the dataset as a new column and print the final dataset to see the changes.

df['rating'] = rating

print(df)

We get the following output:

     bond_name  risk_score
0  govt_bond_1         1.6
1  govt_bond_2         0.9
2  govt_bond_3         2.3
3   pvt_bond_1         3.0
4   pvt_bond_2         2.7
5   pvt_bond_3         1.8
6   pvt_bond_4         4.1

     bond_name  risk_score rating
0  govt_bond_1         1.6      A
1  govt_bond_2         0.9     AA
2  govt_bond_3         2.3     BB
3   pvt_bond_1         3.0      B
4   pvt_bond_2         2.7     BB
5   pvt_bond_3         1.8      A
6   pvt_bond_4         4.1      C

Here we see that a new column has been added with the values according to the risk_score.
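For larger datasets, the same column can be built without an explicit Python loop. The following is a sketch of one alternative, not the method used in this recipe: it uses numpy's select() function, which evaluates a list of conditions in order and picks the choice that belongs to the first condition that is True. The variable names conditions and choices are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in Step 2.
df = pd.DataFrame({
    'bond_name': ['govt_bond_1', 'govt_bond_2', 'govt_bond_3', 'pvt_bond_1',
                  'pvt_bond_2', 'pvt_bond_3', 'pvt_bond_4'],
    'risk_score': [1.6, 0.9, 2.3, 3.0, 2.7, 1.8, 4.1],
})

score = df['risk_score']

# Conditions are checked in order; the first True one wins,
# which mirrors the if-elif chain.
conditions = [score < 1.0, score < 2.0, score < 3.0, score < 4.0, score < 5.0]
choices = ['AA', 'A', 'BB', 'B', 'C']

# default is used when no condition matches (scores >= 5.0 or NaN).
df['rating'] = np.select(conditions, choices, default='Not_Rated')
print(df)
```

This should produce the same rating column as the loop, since the order of the conditions matches the order of the if-elif branches.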

How to Create New Column in Pandas Dataframe Based on Condition?

The apply() method is another way to create a new column in a pandas DataFrame based on a condition. It takes a function as an argument. When called on a single column (a Series), it applies that function to each value in the column; when called on a DataFrame with axis=1, it applies the function to each row. The function should return a single value for each input. That value is often a Boolean when the new column is a True/False flag, but it can be any value, such as a label.

The following code shows how to create a new column called Is_Male in a DataFrame called df, based on the value of its Name column (this example assumes a DataFrame that has such a column):

df['Is_Male'] = df['Name'].apply(lambda name: name.split()[-1] == 'M')

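The apply() method is applied to the Name column in this code. The function passed to apply() splits each name on whitespace and checks the last piece. If that last piece is exactly 'M', the function returns True. Otherwise, it returns False. Note that this tests the last whitespace-separated token, not the last letter, so it only works when the name ends with a separate marker such as 'M'.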
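A minimal runnable sketch of that line is shown below. The people DataFrame and its names are made up for illustration, and it assumes every name ends with a separate 'M' or 'F' marker.

```python
import pandas as pd

# Hypothetical data: each name ends with a gender marker.
people = pd.DataFrame({'Name': ['John Smith M', 'Jane Doe F', 'Alex Roe M']})

# split() breaks the string on whitespace; [-1] takes the last piece.
people['Is_Male'] = people['Name'].apply(lambda name: name.split()[-1] == 'M')
print(people)
```

For simple string checks like this one, pandas string methods such as people['Name'].str.endswith(' M') are a possible alternative to apply().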

Python Pandas ‘Create New Column Based On Other Columns’

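In pandas, a new column based on another column can be created using the where() method. The where() method of a Series takes a condition and a replacement value as arguments. Where the condition is True, the original value is kept. Where it is False, the replacement value is used instead. If you want to choose between two new values rather than keep the original, numpy's where() function takes a condition and two values, returning the first where the condition is True and the second otherwise.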
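Both variants can be sketched on the bond data from this recipe. The column names capped_score and risk_level and the thresholds 3.0 and 2.0 are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Same sample data as in Step 2.
df = pd.DataFrame({
    'bond_name': ['govt_bond_1', 'govt_bond_2', 'govt_bond_3', 'pvt_bond_1',
                  'pvt_bond_2', 'pvt_bond_3', 'pvt_bond_4'],
    'risk_score': [1.6, 0.9, 2.3, 3.0, 2.7, 1.8, 4.1],
})

# Series.where: keep the score where it is <= 3.0, otherwise replace it with 3.0.
df['capped_score'] = df['risk_score'].where(df['risk_score'] <= 3.0, 3.0)

# numpy.where: pick 'low' where the condition is True and 'high' elsewhere.
df['risk_level'] = np.where(df['risk_score'] < 2.0, 'low', 'high')

print(df)
```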

Python Pandas ‘Add Column Based on Other Columns’

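You can add a column based on other columns, for example based on the values of two existing columns, using the assign() method. The assign() method takes keyword arguments, where each keyword is the name of a new column and each value is the expression used to fill it: either a Series or array, or a function that receives the DataFrame and returns the values. A dictionary of such pairs can be passed by unpacking it with **. The method returns a new DataFrame and leaves the original unchanged.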
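The sketch below illustrates assign() with two source columns. The coupon_rate column, its values, and the new column names rate_per_risk and is_high_yield are hypothetical and were added only so that there are two columns to combine.

```python
import pandas as pd

# Hypothetical data: coupon_rate is not part of the original recipe.
bonds = pd.DataFrame({
    'bond_name': ['govt_bond_1', 'pvt_bond_1', 'pvt_bond_4'],
    'risk_score': [1.6, 3.0, 4.1],
    'coupon_rate': [4.0, 7.5, 9.0],
})

# Each keyword becomes a new column; the lambdas receive the DataFrame.
result = bonds.assign(
    rate_per_risk=lambda d: d['coupon_rate'] / d['risk_score'],
    is_high_yield=lambda d: (d['coupon_rate'] > 6.0) & (d['risk_score'] >= 3.0),
)

print(result)
```

Because assign() returns a new DataFrame, bonds itself still has only its three original columns after this call.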

 
