How to deal with missing values in a Timeseries in Python?

This recipe helps you deal with missing values in a time series in Python.

Recipe Objective

It is very common for a dataset to contain missing values, and such values generally cannot be passed to a model as they are. So how do we deal with them?

This recipe shows how we can deal with missing values in a time series in Python.

Step 1 - Import the library

import pandas as pd
import numpy as np

We have imported numpy and pandas which will be needed for the dataset.

Step 2 - Setting up the Data

We have created a DataFrame with a weekly time-series index and a single feature, "Sales". We can clearly see that 3 of the 6 values in the feature are missing.

time_index = pd.date_range("1/01/2021", periods=6, freq="W")
df = pd.DataFrame(index=time_index)
print(df)
df["Sales"] = [5.0, 4.0, np.nan, np.nan, 1.0, np.nan]
print(df)
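Before choosing a fill strategy, it can help to confirm how many values are missing and where. The following is a minimal sketch that rebuilds the same DataFrame; the names `missing_mask`, `n_missing` and `missing_dates` are illustrative and not part of the original recipe.

```python
import numpy as np
import pandas as pd

# Same weekly index and Sales values as in Step 2
time_index = pd.date_range("1/01/2021", periods=6, freq="W")
df = pd.DataFrame(
    {"Sales": [5.0, 4.0, np.nan, np.nan, 1.0, np.nan]},
    index=time_index,
)

# Boolean mask that is True wherever Sales is missing
missing_mask = df["Sales"].isna()

# Count of missing values in the column
n_missing = int(missing_mask.sum())
print(n_missing)

# Timestamps at which Sales is missing
missing_dates = df.index[missing_mask]
print(missing_dates)
```

Knowing whether the gaps sit in the middle or at the end of the series matters, because the fill methods in the next step treat leading and trailing gaps differently.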

Step 3 - Dealing with missing values

Here we will be using different methods to deal with missing values.

  • Interpolating missing values - estimating each missing value from the valid values around it (linear by default)
    df1 = df.interpolate()
    print(df1)
  • Forward-fill missing values - using the value of the previous row to fill the missing value
    df2 = df.ffill()
    print(df2)
  • Backfill missing values - using the value of the next row to fill the missing value
    df3 = df.bfill()
    print(df3)
  • Interpolating missing values, but filling at most one consecutive missing value
    df4 = df.interpolate(limit=1, limit_direction="forward")
    print(df4)
  • Interpolating missing values, but filling at most two consecutive missing values
    df5 = df.interpolate(limit=2, limit_direction="forward")
    print(df5)
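The output is as follows: first the empty DataFrame and the DataFrame with the Sales column from Step 2, then df1 to df5 in order: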

Empty DataFrame
Columns: []
Index: [2021-01-03 00:00:00, 2021-01-10 00:00:00, 2021-01-17 00:00:00, 2021-01-24 00:00:00, 2021-01-31 00:00:00, 2021-02-07 00:00:00]

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    NaN
2021-01-24    NaN
2021-01-31    1.0
2021-02-07    NaN

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    3.0
2021-01-24    2.0
2021-01-31    1.0
2021-02-07    1.0

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    4.0
2021-01-24    4.0
2021-01-31    1.0
2021-02-07    1.0

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    1.0
2021-01-24    1.0
2021-01-31    1.0
2021-02-07    NaN

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    3.0
2021-01-24    NaN
2021-01-31    1.0
2021-02-07    1.0

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    3.0
2021-01-24    2.0
2021-01-31    1.0
2021-02-07    1.0
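Note that in the outputs above, `interpolate()` also fills the trailing NaN on 2021-02-07 by carrying the last valid value forward, while `bfill()` leaves it missing. Two possible variations are sketched below, assuming the same DataFrame: `limit_area="inside"` restricts interpolation to gaps that lie between two valid values, and chaining `bfill()` with `ffill()` fills every gap. The names `df_inside` and `df_filled` are illustrative and not part of the original recipe.

```python
import numpy as np
import pandas as pd

# Same weekly index and Sales values as in Step 2
time_index = pd.date_range("1/01/2021", periods=6, freq="W")
df = pd.DataFrame(
    {"Sales": [5.0, 4.0, np.nan, np.nan, 1.0, np.nan]},
    index=time_index,
)

# Interpolate only NaNs surrounded by valid values;
# the trailing NaN on 2021-02-07 is left untouched
df_inside = df.interpolate(limit_area="inside")
print(df_inside)

# Backfill first, then forward-fill whatever is still missing at the end
df_filled = df.bfill().ffill()
print(df_filled)
```

Which variant is appropriate depends on the data: leaving the trailing value missing avoids inventing an observation beyond the last known point, whereas filling everything may be needed when a downstream model cannot accept any NaN.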