How to filter in a Pandas DataFrame?

This recipe helps you filter in a Pandas DataFrame

Recipe Objective

We often need to filter a DataFrame so that only the rows or columns meeting some condition remain. How do we do that?

This recipe shows how to filter a Pandas DataFrame.

Step 1 - Import the library

import pandas as pd

pandas is the only library this recipe needs.

Step 2 - Creating Dataframe

We have created a dictionary of features and passed it to pd.DataFrame to create a dataframe.

data = {"first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
        "last_name": ["Copper", "Koothrappali", "Hofstadter", "Wolowitz", "Fowler"],
        "age": [42, 38, 36, 41, 35],
        "Comedy_Score": [9, 7, 8, 8, 5],
        "Rating_Score": [25, 25, 49, 62, 70]}
df = pd.DataFrame(data, columns=["first_name", "last_name", "age", "Comedy_Score", "Rating_Score"])
print(df)

Step 3 - Filtering the dataframe

First, we select a single column, in this case first_name. Indexing with one column name returns a Series.

print(df["first_name"])

Next, we select two columns, first_name and age. Passing a list of column names returns a DataFrame.

print(df[["first_name", "age"]])

Next, we keep only the first two rows by slicing.

print(df[:2])

Next, we keep only the rows whose Rating_Score is greater than 50.

print(df[df["Rating_Score"] > 50])

Finally, we keep only the rows whose Comedy_Score is greater than 5 and whose Rating_Score is less than 40. Each condition goes in its own parentheses, and the two are combined with &.

print(df[(df["Comedy_Score"] > 5) & (df["Rating_Score"] < 40)])

The output of the print statements from Step 2 and Step 3, in order, is:

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70

0    Sheldon
1        Raj
2    Leonard
3     Howard
4        Amy
Name: first_name, dtype: object

  first_name  age
0    Sheldon   42
1        Raj   38
2    Leonard   36
3     Howard   41
4        Amy   35

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25

  first_name last_name  age  Comedy_Score  Rating_Score
3     Howard  Wolowitz   41             8            62
4        Amy    Fowler   35             5            70

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
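
The row filters above all rely on a boolean mask: a comparison on a column produces one True/False value per row, and indexing with that mask keeps the True rows. The sketch below rebuilds the same dataframe and shows a few equivalent ways to write such filters. The variable names (mask, by_loc, by_query, either, not_high) are illustrative choices and do not come from the recipe; .loc, .query, | and ~ are standard pandas features.

```python
import pandas as pd

# Same data as in Step 2 of the recipe.
data = {
    "first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
    "last_name": ["Copper", "Koothrappali", "Hofstadter", "Wolowitz", "Fowler"],
    "age": [42, 38, 36, 41, 35],
    "Comedy_Score": [9, 7, 8, 8, 5],
    "Rating_Score": [25, 25, 49, 62, 70],
}
df = pd.DataFrame(data)

# A comparison on a column yields a boolean Series (the mask), one value per row.
mask = (df["Comedy_Score"] > 5) & (df["Rating_Score"] < 40)

# Indexing with the mask keeps only the rows where it is True.
by_mask = df[mask]

# .loc accepts the same mask and can pick columns in the same step.
by_loc = df.loc[mask, ["first_name", "age"]]

# .query expresses the same condition as a string.
by_query = df.query("Comedy_Score > 5 and Rating_Score < 40")

# | means "or" and ~ means "not"; each comparison keeps its own parentheses.
either = df[(df["age"] > 40) | (df["Rating_Score"] > 60)]
not_high = df[~(df["Rating_Score"] > 50)]

print(by_mask)
print(by_loc)
print(either)
print(not_high)
```

Storing the mask in a variable is a small design choice that keeps long conditions readable and lets the same condition be reused with df[...] or df.loc[...]. Python's and/or keywords do not work on Series, which is why the element-wise operators & and | are used outside of .query.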
