How to drop null values in a dataframe in R

This recipe helps you drop null values in a dataframe in R

Recipe Objective

Missing value is one of the most common problem in any raw dataset. To create a precise and unbiased machine learning model, we need to deal with these Missing values after identifying them. There are different steps that we can take to do so:

  1. Identifying number of missing values in each column
  2. Based on the number, we decide whether we need to drop the column or replace it with it's mean, median or any other computed value.

In this recipe, we will demonstrate how to drop the rows with missing values (NA) in a dataframe. ​

Explore the BERT Variants - ALBERT vs DistilBERT

STEP 1: Creating a DataFrame

Creating a STUDENT dataframe with student_id, Name and marks as columns

STUDENT = data.frame(student_id = c(1,2,3,NA,5), Name = c("Ram","Shyam", "Jessica", NA, NA), Marks = c(NA, 60, NA, 80, NA))

STEP 2:Dropping rows with NA, NAN values

Method 1:

We will use built-in function na.omit(x) where x is a dataframe or a column.

STUDENT = na.omit(STUDENT) STUDENT

student_id	Name	Marks
1		Ram	55
2		Shyam	60

Method 2:

We will use built-in function complete.cases(x) to filter out the rows by removing the missing values (NA, NAN) where x is a dataframe or a column.

STUDENT[complete.cases(STUDENT),]

student_id	Name	Marks
1		Ram	55
2		Shyam	60
​

What Users are saying..

profile image

Ray han

Tech Leader | Stanford / Yale University
linkedin profile url

I think that they are fantastic. I attended Yale and Stanford and have worked at Honeywell,Oracle, and Arthur Andersen(Accenture) in the US. I have taken Big Data and Hadoop,NoSQL, Spark, Hadoop... Read More

Relevant Projects

Build a Autoregressive and Moving Average Time Series Model
In this time series project, you will learn to build Autoregressive and Moving Average Time Series Models to forecast future readings, optimize performance, and harness the power of predictive analytics for sensor data.

Topic modelling using Kmeans clustering to group customer reviews
In this Kmeans clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Recommender System Machine Learning Project for Beginners-3
Content Based Recommender System Project - Building a Content-Based Product Recommender App with Streamlit

AWS Project to Build and Deploy LSTM Model with Sagemaker
In this AWS Sagemaker Project, you will learn to build a LSTM model on Sagemaker for sales forecasting while analyzing the impact of weather conditions on Sales.

End-to-End ML Model Monitoring using Airflow and Docker
In this MLOps Project, you will learn to build an end to end pipeline to monitor any changes in the predictive power of model or degradation of data.

Build Regression (Linear,Ridge,Lasso) Models in NumPy Python
In this machine learning regression project, you will learn to build NumPy Regression Models (Linear Regression, Ridge Regression, Lasso Regression) from Scratch.

Avocado Machine Learning Project Python for Price Prediction
In this ML Project, you will use the Avocado dataset to build a machine learning model to predict the average price of avocado which is continuous in nature based on region and varieties of avocado.

Build an AI Chatbot from Scratch using Keras Sequential Model
In this NLP Project, you will learn how to build an AI Chatbot from Scratch using Keras Sequential Model.

Build a Credit Default Risk Prediction Model with LightGBM
In this Machine Learning Project, you will build a classification model for default prediction with LightGBM.

Time Series Classification Project for Elevator Failure Prediction
In this Time Series Project, you will predict the failure of elevators using IoT sensor data as a time series classification machine learning problem.