How to create a new column based on a condition in Python?

This recipe helps you create a new column based on a condition in Python

Recipe Objective

Adding a new column in Python is an easy task. But have you tried to add a column whose values depend on a condition, for example on the values of another column? For a small dataset with only a few rows this is easy to do by hand, but for a large dataset with hundreds of rows it quickly becomes impractical.

We can replace this tedious manual work with a few lines of code. We can use a function that evaluates the condition for every row and fills in the new column for us.

So this recipe is a short example of how to create a new column based on a condition in Python.
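The idea of a function that handles every row can be sketched as follows. This is only an illustration: the helper name is_elderly and its threshold parameter are assumptions made for this example and are not part of the recipe, which uses np.where in the steps below.

```python
import pandas as pd

# Hypothetical helper (not from the recipe): label a single age value.
def is_elderly(age, threshold=50):
    """Return "yes" when age is at or above threshold, otherwise "no"."""
    return "yes" if age >= threshold else "no"

# A small illustrative frame; the recipe builds its own dataset in Step 2.
df = pd.DataFrame({"name": ["Jason", "Molly", "Tina"], "age": [42, 52, 63]})

# Series.apply calls the helper once per row; extra keyword arguments
# are forwarded to the helper.
df["elderly@50"] = df["age"].apply(is_elderly)
df["elderly@60"] = df["age"].apply(is_elderly, threshold=60)
print(df)
```

A row-by-row apply is easy to read, but on large frames a vectorized call such as np.where is usually faster, which is why the recipe uses it.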

Step 1 - Import the library

import pandas as pd
import numpy as np

We have imported pandas and numpy. No other library is needed for this recipe.

Step 2 - Creating a sample Dataset

Here we have created a DataFrame with four columns. We have used a print statement to view our initial dataset.

data = {"name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
        "age": [42, 52, 63, 24, 73],
        "preTestScore": [4, 24, 31, 2, 3],
        "postTestScore": [25, 94, 57, 62, 70]}
# Build the DataFrame first, then print it
df = pd.DataFrame(data, columns=["name", "age", "preTestScore", "postTestScore"])
print(df)

Step 3 - Creating a new column

We build one condition for each new column. np.where writes yes where the condition holds and no everywhere else.

  • If the value of age is greater than or equal to 50, set yes in the column elderly@50
  • If the value of age is greater than or equal to 60, set yes in the column elderly@60
  • If the value of age is greater than or equal to 70, set yes in the column elderly@70

df["elderly@50"] = np.where(df["age"]>=50, "yes", "no")
df["elderly@60"] = np.where(df["age"]>=60, "yes", "no")
df["elderly@70"] = np.where(df["age"]>=70, "yes", "no")
print(df)

As an output we get:

    name  age  preTestScore  postTestScore
0  Jason   42             4             25
1  Molly   52            24             94
2   Tina   63            31             57
3   Jake   24             2             62
4    Amy   73             3             70

    name  age  preTestScore  postTestScore elderly@50 elderly@60 elderly@70
0  Jason   42             4             25         no         no         no
1  Molly   52            24             94        yes         no         no
2   Tina   63            31             57        yes        yes         no
3   Jake   24             2             62         no         no         no
4    Amy   73             3             70        yes        yes        yes
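
When the thresholds are mutually exclusive bands rather than separate flags, the same idea can be expressed in one column. The sketch below uses np.select, which is a different technique from the np.where calls in the recipe. The column name age_band and the band labels are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
                   "age": [42, 52, 63, 24, 73]})

# Conditions are checked in order and the first match wins,
# so the strictest threshold has to come first.
conditions = [df["age"] >= 70, df["age"] >= 60, df["age"] >= 50]
labels = ["70+", "60-69", "50-59"]  # hypothetical labels

# Rows that match none of the conditions get the default label.
df["age_band"] = np.select(conditions, labels, default="under 50")
print(df)
```

With the ages from this recipe, Jason and Jake fall in "under 50", Molly in "50-59", Tina in "60-69" and Amy in "70+".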
