How to create a new column based on a condition in Python?

This recipe helps you create a new column based on a condition in Python

Recipe Objective

Adding a new column in python is a easy task. But have you tried to add a column with values in it based on some condition. Like a column with values which depends on the values of another column. For a small data set with few numbers of rows it may be easy to do it manually but for a large dataset with hundreds of rows it may be quite difficult to do it manually.

We can do this hectic manual work with few lines of code. We can create a function which will do it for us for all the rows.

So this recipe is a short example of how can create a new column based on a condition in Python.

Master the Art of Data Cleaning in Machine Learning

Step 1 - Import the library

import pandas as pd import numpy as np

We have imported pandas and numpy. No other library is needed for the this function.

Step 2 - Creating a sample Dataset

Here we have created a Dataframe with columns. We have used a print statement to view our initial dataset. data = {"name": ["Jason", "Molly", "Tina", "Jake", "Amy"], "age": [42, 52, 63, 24, 73], "preTestScore": [4, 24, 31, 2, 3], "postTestScore": [25, 94, 57, 62, 70]} print(df) df = pd.DataFrame(data, columns = ["name", "age", "preTestScore", "postTestScore"]) print(); print(df)

Step 3 - Creating a new column

We are building condition for making new columns.

  • If the value of age is greater then 50 then print yes in column elderly@50
  • If the value of age is greater then 60 then print yes in column elderly@60
  • If the value of age is greater then 70 then print yes in column elderly@70

df["elderly@50"] = np.where(df["age"]>=50, "yes", "no") df["elderly@60"] = np.where(df["age"]>=60, "yes", "no") df["elderly@70"] = np.where(df["age"]>=70, "yes", "no") print(df) As an output we get:

    name  age  preTestScore  postTestScore
0  Jason   42             4             25
1  Molly   52            24             94
2   Tina   63            31             57
3   Jake   24             2             62
4    Amy   73             3             70

    name  age  preTestScore  postTestScore elderly@50 elderly@60 elderly@70
0  Jason   42             4             25         no         no         no
1  Molly   52            24             94        yes         no         no
2   Tina   63            31             57        yes        yes         no
3   Jake   24             2             62         no         no         no
4    Amy   73             3             70        yes        yes        yes

Download Materials

What Users are saying..

profile image

Abhinav Agarwal

Graduate Student at Northwestern University
linkedin profile url

I come from Northwestern University, which is ranked 9th in the US. Although the high-quality academics at school taught me all the basics I needed, obtaining practical experience was a challenge.... Read More

Relevant Projects

Build a Customer Churn Prediction Model using Decision Trees
Develop a customer churn prediction model using decision tree machine learning algorithms and data science on streaming service data.

Demand prediction of driver availability using multistep time series analysis
In this supervised learning machine learning project, you will predict the availability of a driver in a specific area by using multi step time series analysis.

A/B Testing Approach for Comparing Performance of ML Models
The objective of this project is to compare the performance of BERT and DistilBERT models for building an efficient Question and Answering system. Using A/B testing approach, we explore the effectiveness and efficiency of both models and determine which one is better suited for Q&A tasks.

Personalized Medicine: Redefining Cancer Treatment
In this Personalized Medicine Machine Learning Project you will learn to classify genetic mutations on the basis of medical literature into 9 classes.

Build Time Series Models for Gaussian Processes in Python
Time Series Project - A hands-on approach to Gaussian Processes for Time Series Modelling in Python

Build Piecewise and Spline Regression Models in Python
In this Regression Project, you will learn how to build a piecewise and spline regression model from scratch in Python to predict the points scored by a sports team.

End-to-End Snowflake Healthcare Analytics Project on AWS-2
In this AWS Snowflake project, you will build an end to end retraining pipeline by checking Data and Model Drift and learn how to redeploy the model if needed

Insurance Pricing Forecast Using XGBoost Regressor
In this project, we are going to talk about insurance forecast by using linear and xgboost regression techniques.

Learn Hyperparameter Tuning for Neural Networks with PyTorch
In this Deep Learning Project, you will learn how to optimally tune the hyperparameters (learning rate, epochs, dropout, early stopping) of a neural network model in PyTorch to improve model performance.

Build a Speech-Text Transcriptor with Nvidia Quartznet Model
In this Deep Learning Project, you will leverage transfer learning from Nvidia QuartzNet pre-trained models to develop a speech-to-text transcriptor.