How to drop null values in a dataframe?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to drop null values in a dataframe?

How to drop null values in a dataframe?

This recipe helps you drop null values in a dataframe

0

Recipe Objective

Missing value is one of the most common problem in any raw dataset. To create a precise and unbiased machine learning model, we need to deal with these Missing values after identifying them. There are different steps that we can take to do so:

  1. Identifying number of missing values in each column
  2. Based on the number, we decide whether we need to drop the column or replace it with it's mean, median or any other computed value.

In this recipe, we will demonstrate how to drop the rows with missing values (NA) in a dataframe. ​

STEP 1: Creating a DataFrame

Creating a STUDENT dataframe with student_id, Name and marks as columns

STUDENT = data.frame(student_id = c(1,2,3,NA,5), Name = c("Ram","Shyam", "Jessica", NA, NA), Marks = c(NA, 60, NA, 80, NA))

STEP 2:Dropping rows with NA, NAN values

Method 1:

We will use built-in function na.omit(x) where x is a dataframe or a column.

STUDENT = na.omit(STUDENT) STUDENT
student_id	Name	Marks
1		Ram	55
2		Shyam	60

Method 2:

We will use built-in function complete.cases(x) to filter out the rows by removing the missing values (NA, NAN) where x is a dataframe or a column.

STUDENT[complete.cases(STUDENT),]
student_id	Name	Marks
1		Ram	55
2		Shyam	60
​

Relevant Projects

Walmart Sales Forecasting Data Science Project
Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

Ensemble Machine Learning Project - All State Insurance Claims Severity Prediction
In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in python using ensemble machine learning algorithms.

PySpark Tutorial - Learn to use Apache Spark with Python
PySpark Project-Get a handle on using Python with Spark through this hands-on data processing spark python tutorial.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.

Machine Learning or Predictive Models in IoT - Energy Prediction Use Case
In this machine learning and IoT project, we are going to test out the experimental data using various predictive models and train the models and break the energy usage.

Human Activity Recognition Using Multiclass Classification in Python
In this human activity recognition project, we use multiclass classification machine learning techniques to analyse fitness dataset from a smartphone tracker.

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

Census Income Data Set Project - Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based on census data.

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.