What is the difference between different filtering functions in R and which of them is fastest?


This recipe explains how the dplyr functions select(), filter() and arrange() differ, how the pipe operator %>% fits in, and what can be said about their speed.


Recipe Objective

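What is the difference between the dplyr functions commonly used to filter and shape a dataframe in R, and which of them is fastest? Each one does a different job:

select() - Keeps only the named columns of the dataframe.
filter() - Keeps only the rows that satisfy a condition.
arrange() - Sorts the rows by one or more columns, ascending by default; wrap a column in desc() for descending order.
%>% (pipe) - Not a filtering function itself. It passes the result on its left as the first argument to the function on its right, so several steps can be chained without saving unnecessary intermediate objects, which makes the code more readable and less error-prone. The pipe does not make execution faster.

Because select(), filter() and arrange() perform different operations, they are not interchangeable, so there is no single "fastest" one to choose between them. This recipe demonstrates an example of different filtering functions in R.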

Step 1 - Import necessary library

install.packages("dplyr")  # install the package (only needed once)
library(dplyr)             # load the package

Step 2 - Create a dataframe

df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))
print(df)

Output of the code above:
   a  b classify
1 10 21        A
2 23 44        B
3 15 26        A
4 37 18        C
5  9 30        A

Step 3 - Apply select()

x <- select(df, a, b)  # keep only columns a and b
print(x)

Output of the code above:
   a  b
1 10 21
2 23 44
3 15 26
4 37 18
5  9 30
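
select() can also drop columns or pick them by name pattern. The following is a minimal sketch that rebuilds the same df as in Step 2 so it runs on its own; the variable names no_b and starts_a are only illustrative.

```r
library(dplyr)

df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))

# Drop a column by prefixing its name with a minus sign
no_b <- select(df, -b)
print(names(no_b))      # "a" "classify"

# Pick columns whose names start with a given prefix
starts_a <- select(df, starts_with("a"))
print(names(starts_a))  # "a"
```

In both cases only the columns change; the number of rows stays the same.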

Step 4 - Apply filter()

Keep only the rows where the column classify equals 'A'.

x <- filter(df, classify == 'A')
print(x)

Output of the code above:
   a  b classify
1 10 21        A
2 15 26        A
3  9 30        A
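
filter() accepts more than one condition. As a small sketch (again rebuilding the df from Step 2 so the block is self-contained, with illustrative variable names), conditions separated by commas are combined with AND, while | expresses OR:

```r
library(dplyr)

df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))

# Comma-separated conditions must all hold (logical AND)
both <- filter(df, classify == 'A', a >= 10)
print(both)
#    a  b classify
# 1 10 21        A
# 2 15 26        A

# Use | when either condition is enough (logical OR)
either <- filter(df, classify == 'B' | b < 20)
print(either$a)  # 23 37
```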
 

Step 5 - Apply arrange()

x <- arrange(df, classify)  # sort rows by classify, ascending
print(x)

Output of the code above:
   a  b classify
1 10 21        A
2 15 26        A
3  9 30        A
4 23 44        B
5 37 18        C
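
For descending order, arrange() is combined with desc(). The sketch below assumes the same df as in Step 2 and uses illustrative variable names; it also shows sorting by two columns at once.

```r
library(dplyr)

df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))

# Sort by a, largest value first
by_a_desc <- arrange(df, desc(a))
print(by_a_desc$a)     # 37 23 15 10 9

# Sort by classify, then by b descending within each class
by_class_b <- arrange(df, classify, desc(b))
print(by_class_b$a)    # 9 15 10 23 37
```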

Step 6 - Pipeline : Apply %>%

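The usual way of calling a function is function(argument). With the pipe, the same call is written argument %>% function(): the value on the left becomes the first argument of the function on the right.

x <- df %>% select(a)  # same as select(df, a)
print(x)

Output of the code above: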
   a
1 10
2 23
3 15
4 37
5  9
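
The pipe is most useful when several steps are chained. The sketch below (same df as in Step 2, illustrative variable names) writes one sequence of filter(), select() and arrange() first as nested calls and then as a pipeline, and checks that both give the same result. It illustrates readability only and is not a speed comparison.

```r
library(dplyr)

df <- data.frame(a = c(10, 23, 15, 37, 9),
                 b = c(21, 44, 26, 18, 30),
                 classify = c('A', 'B', 'A', 'C', 'A'))

# Nested calls: read from the inside out
nested <- arrange(select(filter(df, classify == 'A'), a, b), desc(b))

# Same steps with the pipe: read from top to bottom
piped <- df %>%
  filter(classify == 'A') %>%
  select(a, b) %>%
  arrange(desc(b))

print(piped)
print(identical(nested, piped))  # TRUE
```

Both versions run the same functions in the same order; the pipe only changes how the code is written.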
 
