How to deal with missing values in a Timeseries in Python?

This recipe helps you deal with missing values in a time series in Python.

Recipe Objective

It is very common for a dataset to contain missing values, and many models cannot use data with missing values directly. So how do we deal with them?

This recipe shows how we can deal with missing values in a time series in Python.

Step 1 - Import the library

import pandas as pd
import numpy as np

We have imported numpy and pandas, which are needed to build the dataset.

Step 2 - Setting up the Data

We have created a dataframe with a weekly time series as its index and a feature "Sales". We can clearly see that there are 3 missing values in the feature.

time_index = pd.date_range("1/01/2021", periods=6, freq="W")
df = pd.DataFrame(index=time_index)
print(df)
df["Sales"] = [5.0, 4.0, np.nan, np.nan, 1.0, np.nan]
print(df)
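Rather than counting the missing values by eye, we can ask pandas for the count. The following is a minimal sketch that rebuilds the same data; the names missing_count and missing_dates are only illustrative and are not part of the original recipe.

```python
import numpy as np
import pandas as pd

# Same data as in Step 2: six weekly timestamps, three missing sales values.
time_index = pd.date_range("1/01/2021", periods=6, freq="W")
df = pd.DataFrame(
    {"Sales": [5.0, 4.0, np.nan, np.nan, 1.0, np.nan]}, index=time_index
)

# isna() marks each missing cell as True; sum() counts the True values.
missing_count = int(df["Sales"].isna().sum())
print(missing_count)

# Boolean indexing on the index lists the timestamps with a missing value.
missing_dates = df.index[df["Sales"].isna()]
print(missing_dates)
```

Knowing where the gaps are (two in the middle, one at the end) helps when choosing a fill method, since some methods cannot fill a gap at the start or the end of the series.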

Step 3 - Dealing with missing values

Here we will be using different methods to deal with missing values.

  • Interpolating missing values - Estimating each missing value linearly from the valid values around it
      df1 = df.interpolate()
      print(df1)
  • Forward-fill missing values - Using the value of the previous row to fill the missing value
      df2 = df.ffill()
      print(df2)
  • Backfill missing values - Using the value of the next row to fill the missing value
      df3 = df.bfill()
      print(df3)
  • Interpolating missing values, but filling at most one consecutive missing value
      df4 = df.interpolate(limit=1, limit_direction="forward")
      print(df4)
  • Interpolating missing values, but filling at most two consecutive missing values
      df5 = df.interpolate(limit=2, limit_direction="forward")
      print(df5)
The output, in the order of the print statements above, is:

Empty DataFrame
Columns: []
Index: [2021-01-03 00:00:00, 2021-01-10 00:00:00, 2021-01-17 00:00:00, 2021-01-24 00:00:00, 2021-01-31 00:00:00, 2021-02-07 00:00:00]

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    NaN
2021-01-24    NaN
2021-01-31    1.0
2021-02-07    NaN

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    3.0
2021-01-24    2.0
2021-01-31    1.0
2021-02-07    1.0

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    4.0
2021-01-24    4.0
2021-01-31    1.0
2021-02-07    1.0

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    1.0
2021-01-24    1.0
2021-01-31    1.0
2021-02-07    NaN

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    3.0
2021-01-24    NaN
2021-01-31    1.0
2021-02-07    1.0

            Sales
2021-01-03    5.0
2021-01-10    4.0
2021-01-17    3.0
2021-01-24    2.0
2021-01-31    1.0
2021-02-07    1.0
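To compare the methods more easily, the results can be placed side by side in one dataframe. This is only a sketch under the same data as above; the column names and the variables comparison and remaining are illustrative choices, not part of the original recipe.

```python
import numpy as np
import pandas as pd

# Same weekly sales data as in Step 2, held as a Series for brevity.
time_index = pd.date_range("1/01/2021", periods=6, freq="W")
sales = pd.Series(
    [5.0, 4.0, np.nan, np.nan, 1.0, np.nan], index=time_index, name="Sales"
)

# One column per method, so each row shows how a given date was filled.
comparison = pd.DataFrame(
    {
        "original": sales,
        "interpolate": sales.interpolate(),
        "ffill": sales.ffill(),
        "bfill": sales.bfill(),
        "interp_limit_1": sales.interpolate(limit=1, limit_direction="forward"),
    }
)
print(comparison)

# Count the missing values that each method leaves behind.
remaining = comparison.isna().sum()
print(remaining)
```

The count of remaining missing values shows the trade-off: backfill leaves the last row empty because no later value exists, and interpolation with limit=1 leaves the second value of the two-row gap empty.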
