How to filter in a Pandas DataFrame?

This recipe helps you filter in a Pandas DataFrame

Recipe Objective

We often need to filter a DataFrame based on some condition. How can we do that?

This recipe shows how to filter a Pandas DataFrame by column, by row position, and by condition.

Step 1 - Import the library

import pandas as pd

pandas is the only library we need to import.

Step 2 - Creating Dataframe

We have created a dictionary with features and passed it through pd.DataFrame to create a dataframe.

data = {"first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
        "last_name": ["Copper", "Koothrappali", "Hofstadter", "Wolowitz", "Fowler"],
        "age": [42, 38, 36, 41, 35],
        "Comedy_Score": [9, 7, 8, 8, 5],
        "Rating_Score": [25, 25, 49, 62, 70]}
df = pd.DataFrame(data, columns = ["first_name", "last_name", "age", "Comedy_Score", "Rating_Score"])
print(df)

Step 3 - Filtering the dataframe

First, we select a single column, in this case first_name.

print(df["first_name"])

Next, we select two columns, first_name and age, by passing a list of column names.

print(df[["first_name", "age"]])

Next, we keep only the first two rows.

print(df[:2])

Next, we keep only the rows where Rating_Score is greater than 50.

print(df[df["Rating_Score"] > 50])

Finally, we keep only the rows where Comedy_Score is greater than 5 and Rating_Score is less than 40. Each condition is wrapped in parentheses and the two are combined with the & operator.

print(df[(df["Comedy_Score"] > 5) & (df["Rating_Score"] < 40)])

The output of the print in Step 2, followed by the five prints above in order, is:

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70

0    Sheldon
1        Raj
2    Leonard
3     Howard
4        Amy
Name: first_name, dtype: object

  first_name  age
0    Sheldon   42
1        Raj   38
2    Leonard   36
3     Howard   41
4        Amy   35

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25

  first_name last_name  age  Comedy_Score  Rating_Score
3     Howard  Wolowitz   41             8            62
4        Amy    Fowler   35             5            70

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
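
The same kinds of filters can also be written with other pandas idioms. The sketch below is not part of the original recipe. It rebuilds the sample dataframe from Step 2 and shows three alternatives: .loc to filter rows and select columns in one step, .query() to write a condition as a string, and .isin() to match against a list of values. The variable names high_rated_names, funny_low_rated and selected are chosen here for illustration only.

```python
import pandas as pd

# Same sample data as in Step 2 of the recipe.
data = {
    "first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
    "last_name": ["Copper", "Koothrappali", "Hofstadter", "Wolowitz", "Fowler"],
    "age": [42, 38, 36, 41, 35],
    "Comedy_Score": [9, 7, 8, 8, 5],
    "Rating_Score": [25, 25, 49, 62, 70],
}
df = pd.DataFrame(data)

# .loc takes a row condition and a list of columns together,
# so rows and columns are filtered in a single step.
high_rated_names = df.loc[df["Rating_Score"] > 50, ["first_name", "Rating_Score"]]

# .query() expresses the same combined condition used in Step 3 as a string.
funny_low_rated = df.query("Comedy_Score > 5 and Rating_Score < 40")

# .isin() keeps the rows whose value appears in the given list.
selected = df[df["first_name"].isin(["Raj", "Amy"])]

print(high_rated_names)
print(funny_low_rated)
print(selected)
```

Which form to use is largely a matter of taste: boolean indexing as in Step 3 is the most explicit, while .query() can be easier to read when several conditions are combined.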
