How to create a new column based on a condition in Python?

This recipe helps you create a new column based on a condition in Python

Recipe Objective

Adding a new column to a pandas DataFrame is an easy task. But have you tried adding a column whose values depend on a condition, such as the values of another column? For a small dataset with only a few rows you could fill it in by hand, but for a large dataset with hundreds of rows that quickly becomes impractical.

We can replace this tedious manual work with a few lines of code. We can create a function that does it for us across all the rows.
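
As a rough illustration of the function-based approach described above, the sketch below applies a small helper to every row. The helper name is_elderly, its threshold parameter, and the three-row df_demo frame are hypothetical names chosen for this example; the recipe itself builds its columns with np.where in Step 3.

```python
import pandas as pd


def is_elderly(age, threshold=50):
    """Hypothetical helper: "yes" when age is at or above threshold, else "no"."""
    return "yes" if age >= threshold else "no"


# Small illustrative frame (assumed data, not the recipe's dataset).
df_demo = pd.DataFrame({"age": [42, 50, 63]})

# Series.apply calls is_elderly once per value in the "age" column.
df_demo["elderly@50"] = df_demo["age"].apply(is_elderly)
print(df_demo)
```

Because the comparison is >=, an age of exactly 50 is labelled "yes". A row-by-row apply is easy to read but is generally slower than a vectorized call such as np.where on large frames.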

So this recipe is a short example of how to create a new column based on a condition in Python.

Step 1 - Import the library

import pandas as pd
import numpy as np

We have imported pandas and numpy. No other library is needed for this recipe.

Step 2 - Creating a sample Dataset

Here we have created a DataFrame with four columns. We have used a print statement to view our initial dataset.

data = {"name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
        "age": [42, 52, 63, 24, 73],
        "preTestScore": [4, 24, 31, 2, 3],
        "postTestScore": [25, 94, 57, 62, 70]}
df = pd.DataFrame(data, columns = ["name", "age", "preTestScore", "postTestScore"])
print(df)

Step 3 - Creating a new column

We are building the conditions for the new columns. Each new column holds yes when the condition is met and no otherwise.

  • If the value of age is greater than or equal to 50, then write yes in column elderly@50
  • If the value of age is greater than or equal to 60, then write yes in column elderly@60
  • If the value of age is greater than or equal to 70, then write yes in column elderly@70

df["elderly@50"] = np.where(df["age"]>=50, "yes", "no")
df["elderly@60"] = np.where(df["age"]>=60, "yes", "no")
df["elderly@70"] = np.where(df["age"]>=70, "yes", "no")
print(df)

As an output we get:

    name  age  preTestScore  postTestScore
0  Jason   42             4             25
1  Molly   52            24             94
2   Tina   63            31             57
3   Jake   24             2             62
4    Amy   73             3             70

    name  age  preTestScore  postTestScore elderly@50 elderly@60 elderly@70
0  Jason   42             4             25         no         no         no
1  Molly   52            24             94        yes         no         no
2   Tina   63            31             57        yes        yes         no
3   Jake   24             2             62         no         no         no
4    Amy   73             3             70        yes        yes        yes
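
When a single column should carry more than two outcomes, one possible variation (not part of the original recipe) is np.select, which takes a list of conditions and a matching list of labels and uses the first condition that holds for each row. The column name age_band, the labels, and the ages frame below are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Same names and ages as the recipe's sample data, kept in a separate frame.
ages = pd.DataFrame({"name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
                     "age": [42, 52, 63, 24, 73]})

# Conditions are evaluated in order; the first one that is True wins.
conditions = [ages["age"] >= 70, ages["age"] >= 60, ages["age"] >= 50]
labels = ["70+", "60-69", "50-59"]

# Rows that match no condition receive the default label.
ages["age_band"] = np.select(conditions, labels, default="under 50")
print(ages)
```

The strictest threshold is listed first so that, for example, an age of 73 is labelled "70+" rather than matching the looser >= 50 test.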
