What is the difference between different filtering functions in R and which of them is fastest?

This recipe explains how the dplyr functions select(), filter() and arrange() and the pipe operator %>% differ in R, and addresses the question of which of them is fastest.

Recipe Objective

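What is the difference between the different filtering functions in R, and which of them is fastest? The functions below come from the dplyr package, and each one does a different job:

select() - Keeps only the named columns of a dataframe.
filter() - Keeps only the rows that satisfy a condition.
arrange() - Sorts the rows in ascending or descending order of one or more columns.
%>% (pipe) - Not a filtering function itself. It passes the result of one call on as the first argument of the next, so several steps can be chained without saving unnecessary intermediate objects. This makes code more readable and less error-prone. The pipe changes how the code is written, not how fast the underlying operations run.

Because these functions do different things, they are not direct alternatives that can be ranked by speed. This recipe demonstrates an example of each of them in R.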
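The question of speed can be checked empirically rather than taken on trust. The following is a rough sketch, assuming dplyr is installed; the data frame `big` and the helper functions `nested` and `piped` are hypothetical names introduced only for this illustration. It writes the same operation once as nested calls and once as a pipeline, confirms that both return the same result, and times each form.

```r
library(dplyr)

# Hypothetical larger data frame so that the timings are measurable
set.seed(1)
n <- 1e5
big <- data.frame(a = runif(n),
                  b = runif(n),
                  classify = sample(c('A', 'B', 'C'), n, replace = TRUE))

# The same operation written as nested calls and as a pipeline
nested <- function(d) select(filter(d, classify == 'A'), a, b)
piped  <- function(d) d %>% filter(classify == 'A') %>% select(a, b)

# Both forms return the same rows and columns
same_result <- identical(nested(big), piped(big))
print(same_result)

# Elapsed times depend on the machine and the run, so none are shown here
print(system.time(nested(big)))
print(system.time(piped(big)))
```

A single call to system.time() is noisy, so repeated runs are needed before drawing any conclusion about which form is quicker.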

Step 1 - Import necessary library

install.packages("dplyr") # install the package (only needed once)
library(dplyr)            # load the package

Step 2 - Create a dataframe

df <- data.frame(a = c(10,23,15,37,9),
                 b = c(21,44,26,18,30),
                 classify = c('A','B','A','C','A'))
print(df)

Output of the code:
   a  b classify
1 10 21        A
2 23 44        B
3 15 26        A
4 37 18        C
5  9 30        A

Step 3 - Apply select()

x <- select(df,a,b)
print(x)

Output of the code:
   a  b
1 10 21
2 23 44
3 15 26
4 37 18
5  9 30
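
select() can also drop columns, take a range of columns, or reorder them. The sketch below assumes dplyr is installed and rebuilds the recipe's dataframe so that it runs on its own; the variable names `no_class`, `range_ab` and `reordered` are introduced here for illustration.

```r
library(dplyr)

df <- data.frame(a = c(10,23,15,37,9),
                 b = c(21,44,26,18,30),
                 classify = c('A','B','A','C','A'))

# Drop a column by negating its name; a and b remain
no_class <- select(df, -classify)
print(names(no_class))    # "a" "b"

# Keep a contiguous range of columns, from a through b
range_ab <- select(df, a:b)
print(names(range_ab))    # "a" "b"

# Reorder columns: classify first, then all remaining columns
reordered <- select(df, classify, everything())
print(names(reordered))   # "classify" "a" "b"
```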

Step 4 - Apply filter()

Filter the rows where the column classify equals 'A'

x <- filter(df,classify=='A')
print(x)

Output of the code:
   a  b classify
1 10 21        A
2 15 26        A
3  9 30        A
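
filter() accepts more than one condition. As a minimal sketch, assuming dplyr is installed and using the same dataframe as above (the names `x_and`, `x_or` and `x_in` are illustrative), conditions can be combined with AND, with OR, or matched against a set of values:

```r
library(dplyr)

df <- data.frame(a = c(10,23,15,37,9),
                 b = c(21,44,26,18,30),
                 classify = c('A','B','A','C','A'))

# Comma-separated conditions are combined with AND
x_and <- filter(df, classify == 'A', a >= 10)
print(x_and$a)   # 10 15

# The | operator combines conditions with OR
x_or <- filter(df, classify == 'B' | b < 20)
print(x_or$a)    # 23 37

# %in% keeps rows whose value matches any element of a vector
x_in <- filter(df, classify %in% c('B', 'C'))
print(x_in$a)    # 23 37
```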
 

Step 5 - Apply arrange()

x <- arrange(df,classify)
print(x)

Output of the code:
   a  b classify
1 10 21        A
2 15 26        A
3  9 30        A
4 23 44        B
5 37 18        C
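
arrange() sorts in ascending order by default. The following sketch, assuming dplyr is installed and reusing the recipe's dataframe, shows descending order with desc() and sorting by more than one column; `by_a_desc` and `by_class_then_b` are names chosen for this example.

```r
library(dplyr)

df <- data.frame(a = c(10,23,15,37,9),
                 b = c(21,44,26,18,30),
                 classify = c('A','B','A','C','A'))

# Sort by a in descending order
by_a_desc <- arrange(df, desc(a))
print(by_a_desc$a)         # 37 23 15 10 9

# Sort by classify first, then by b in descending order within each class
by_class_then_b <- arrange(df, classify, desc(b))
print(by_class_then_b$b)   # 30 26 21 44 18
```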

Step 6 - Pipeline : Apply %>%

The usual way of calling a function is function(argument). With the pipe, the same call is written as argument %>% function(): the value on the left becomes the first argument of the function on the right.

x <- df %>% select(a)
print(x)

Output of the code:
   a
1 10
2 23
3 15
4 37
5  9
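
The pipe is most useful when several steps are chained. The sketch below, assuming dplyr is installed and using the recipe's dataframe, combines filter(), select() and arrange() in one pipeline and compares it with the equivalent nested call; the names `result` and `nested` are illustrative.

```r
library(dplyr)

df <- data.frame(a = c(10,23,15,37,9),
                 b = c(21,44,26,18,30),
                 classify = c('A','B','A','C','A'))

# Read top to bottom: keep class A rows, keep columns a and b, sort by b descending
result <- df %>%
  filter(classify == 'A') %>%
  select(a, b) %>%
  arrange(desc(b))
print(result)
#    a  b
# 1  9 30
# 2 15 26
# 3 10 21

# The same steps as nested calls, which must be read from the inside out
nested <- arrange(select(filter(df, classify == 'A'), a, b), desc(b))
print(identical(result, nested))   # TRUE
```

Both forms give the same dataframe; the pipeline avoids intermediate variables and keeps the steps in the order they are applied. On R 4.1 or later, the native |> operator can be used in the same way for this example.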
 
