What is the difference between different filtering functions in R and which of them is fastest?

This recipe explains the difference between the filtering functions in R and which of them is fastest.

Recipe Objective

What is the difference between the filtering functions in R, and which of them is fastest? The dplyr package provides the following:

select(): Keeps only the relevant columns of the dataframe.
filter(): Keeps only the rows that satisfy some condition.
arrange(): Sorts the rows of the output in ascending or descending order.
%>% (pipe): The pipe is not a filtering function itself. It passes the result on its left as the first argument of the function on its right, so several steps can be chained without saving unnecessary intermediate objects. Its benefit is more readable code with fewer errors rather than a faster runtime; the speed of a pipeline is determined by the functions it calls.

This recipe demonstrates an example of each of these functions in R.

Step 1 - Import necessary library

install.packages("dplyr") # install the package (only needed once)
library(dplyr)            # load the package

Step 2 - Create a dataframe

df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))
print(df)

Output of the code:
   a  b classify
1 10 21        A
2 23 44        B
3 15 26        A
4 37 18        C
5  9 30        A

Step 3 - Apply select()

x <- select(df, a, b)
print(x)

Output of the code:
   a  b
1 10 21
2 23 44
3 15 26
4 37 18
5  9 30
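
As a small extension of the step above, select() can also drop a column with a minus sign, or rename and reorder columns while selecting them. The sketch below rebuilds the same dataframe so it runs on its own; the object names no_class and renamed and the new column name group are chosen here for illustration only.

```r
library(dplyr)

# Same dataframe as in Step 2
df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))

# Drop the classify column and keep everything else
no_class <- select(df, -classify)

# Select two columns, renaming classify to group and placing it first
renamed <- select(df, group = classify, a)

print(names(no_class))
print(names(renamed))
```

In every case select() changes which columns are present; the number of rows stays the same.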

Step 4 - Apply filter()

Filter the rows where the column classify equals 'A'.

x <- filter(df, classify == 'A')
print(x)

Output of the code:
   a  b classify
1 10 21        A
2 15 26        A
3  9 30        A
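
filter() also accepts more than one condition. The following is a minimal sketch on the same dataframe; the thresholds and the object names both and either are picked only for illustration.

```r
library(dplyr)

# Same dataframe as in Step 2
df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))

# Conditions separated by a comma are combined with AND:
# rows in class 'A' whose a is greater than 9
both <- filter(df, classify == 'A', a > 9)

# Use | for OR: rows in class 'B' or with b below 20
either <- filter(df, classify == 'B' | b < 20)

print(both)
print(either)
```

Unlike select(), filter() changes which rows are present and leaves the columns untouched.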
 

Step 5 - Apply arrange()

x <- arrange(df, classify)
print(x)

Output of the code:
   a  b classify
1 10 21        A
2 15 26        A
3  9 30        A
4 23 44        B
5 37 18        C
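
arrange() sorts in ascending order by default. For descending order, dplyr provides the desc() helper, as in this short sketch on the same dataframe (the object names by_a_desc and by_class_then_b are illustrative).

```r
library(dplyr)

# Same dataframe as in Step 2
df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))

# Sort by a, largest value first
by_a_desc <- arrange(df, desc(a))

# Sort by classify ascending, then by b descending within each class
by_class_then_b <- arrange(df, classify, desc(b))

print(by_a_desc)
print(by_class_then_b)
```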

Step 6 - Pipeline: Apply %>%

The usual way of calling a function is function(argument). With the pipe, the same call is written as argument %>% function().

x <- df %>% select(a)
print(x)

Output of the code:
   a
1 10
2 23
3 15
4 37
5  9
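
The pipe becomes more useful when several of the functions above are combined. The sketch below chains filter(), select() and arrange() on the same dataframe and compares the result with the equivalent nested call; the object names piped and nested are illustrative.

```r
library(dplyr)

# Same dataframe as in Step 2
df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))

# Piped version: read top to bottom, no intermediate objects
piped <- df %>%
  filter(classify == 'A') %>%
  select(a, b) %>%
  arrange(desc(a))

# Equivalent nested version: read from the inside out
nested <- arrange(select(filter(df, classify == 'A'), a, b), desc(a))

print(piped)
print(identical(piped, nested)) # TRUE: both forms give the same result
```

Both versions call the same three functions on the same data, so the pipe changes how the code reads, not what is computed.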
 
