We often need to filter a DataFrame based on some condition. How do we do that?
This recipe shows how to filter a Pandas DataFrame by column, by row position, and by condition.
import pandas as pd
pandas is the only library we need to import.
We create a dictionary of features and pass it to pd.DataFrame to create a DataFrame.
data = {"first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
        "last_name": ["Copper", "Koothrappali", "Hofstadter", "Wolowitz", "Fowler"],
        "age": [42, 38, 36, 41, 35],
        "Comedy_Score": [9, 7, 8, 8, 5],
        "Rating_Score": [25, 25, 49, 62, 70]}
df = pd.DataFrame(data, columns=["first_name", "last_name", "age",
                                 "Comedy_Score", "Rating_Score"])
print(df)
First, we filter the dataset down to a single column, in this case first_name.
print(df["first_name"])
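Selecting one column with single brackets returns a pandas Series, while passing a list of labels returns a DataFrame. The sketch below illustrates the difference on a small stand-in frame; the names as_series and as_frame are only illustrative and are not part of the recipe.

```python
import pandas as pd

# Small illustrative frame using two of the recipe's columns.
df = pd.DataFrame({"first_name": ["Sheldon", "Raj", "Leonard"],
                   "age": [42, 38, 36]})

as_series = df["first_name"]    # single brackets: a pandas Series
as_frame = df[["first_name"]]   # list of labels: a one-column DataFrame

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (3, 1)
```

The distinction matters when later code expects DataFrame methods or a two-dimensional shape.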
Next, we filter the dataset down to two columns, in this case first_name and age.
print(df[["first_name", "age"]])
Next, we filter the dataset so that only the first two rows remain.
print(df[:2])
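Slicing with df[:2] selects rows by position. As a minimal sketch on a stand-in frame, the same rows can also be obtained with iloc or head, which some readers find more explicit; the variable names here are illustrative.

```python
import pandas as pd

# Stand-in frame with two of the recipe's columns.
df = pd.DataFrame({"first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
                   "age": [42, 38, 36, 41, 35]})

first_two = df[:2]

# Positional slicing with iloc and head(2) give the same two rows.
print(first_two.equals(df.iloc[:2]))  # True
print(first_two.equals(df.head(2)))   # True

# iloc can also take any positional range, e.g. positions 1 up to (not including) 4.
middle = df.iloc[1:4]
print(middle["first_name"].tolist())  # ['Raj', 'Leonard', 'Howard']
```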
Next, we filter the dataset so that only rows with a Rating_Score greater than 50 remain.
print(df[df["Rating_Score"] > 50])
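The comparison df["Rating_Score"] > 50 produces a boolean Series, and indexing with it keeps the rows where the value is True. The sketch below makes that intermediate mask visible on a stand-in frame and, as one possible variation, uses .loc to pick columns in the same step; mask and high are illustrative names.

```python
import pandas as pd

# Stand-in frame with two of the recipe's columns.
df = pd.DataFrame({"first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
                   "Rating_Score": [25, 25, 49, 62, 70]})

# The condition evaluates to one True/False value per row.
mask = df["Rating_Score"] > 50
print(mask.tolist())  # [False, False, False, True, True]

# .loc accepts the mask for rows and a list of labels for columns.
high = df.loc[mask, ["first_name", "Rating_Score"]]
print(high["first_name"].tolist())  # ['Howard', 'Amy']
```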
Finally, we filter the dataset so that only rows with a Comedy_Score greater than 5 and a Rating_Score less than 40 remain. Each condition is wrapped in parentheses and the two are combined with &.
print(df[(df["Comedy_Score"] > 5) & (df["Rating_Score"] < 40)])
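Conditions can be combined in other ways as well. The following sketch, on a stand-in frame, shows | for OR, ~ for NOT, isin for membership tests, and DataFrame.query as an alternative spelling of the AND filter. The parentheses around each comparison are needed because & and | bind more tightly than > and < in Python. All variable names are illustrative.

```python
import pandas as pd

# Stand-in frame with three of the recipe's columns.
df = pd.DataFrame({
    "first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
    "Comedy_Score": [9, 7, 8, 8, 5],
    "Rating_Score": [25, 25, 49, 62, 70],
})

# OR: rows that satisfy at least one of the conditions.
either = df[(df["Comedy_Score"] > 8) | (df["Rating_Score"] > 60)]
print(either["first_name"].tolist())  # ['Sheldon', 'Howard', 'Amy']

# NOT: invert a condition with ~.
not_low = df[~(df["Rating_Score"] < 40)]
print(not_low["first_name"].tolist())  # ['Leonard', 'Howard', 'Amy']

# Membership: keep rows whose value appears in a given list.
picked = df[df["first_name"].isin(["Raj", "Amy"])]
print(picked["first_name"].tolist())  # ['Raj', 'Amy']

# query expresses the AND filter as a string expression.
both = df.query("Comedy_Score > 5 and Rating_Score < 40")
print(both["first_name"].tolist())  # ['Sheldon', 'Raj']
```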
The six print statements in the recipe produce the following output:
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70

0    Sheldon
1        Raj
2    Leonard
3     Howard
4        Amy
Name: first_name, dtype: object

  first_name  age
0    Sheldon   42
1        Raj   38
2    Leonard   36
3     Howard   41
4        Amy   35

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25

  first_name last_name  age  Comedy_Score  Rating_Score
3     Howard  Wolowitz   41             8            62
4        Amy    Fowler   35             5            70

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25