What is differencing in timeseries and why do we do it?

What is differencing in timeseries and why do we do it?

What is differencing in timeseries and why do we do it?

This recipe explains what is differencing in timeseries and why do we do it


Recipe Objective

Differencing is a method of transforming a time series dataset. It can be used to remove the series dependence on time, so-called temporal dependence. This includes structures like trends and seasonality.

So this recipe is a short example on what is differencing in time series and why do we do it. Let's get started.

Step 1 - Import the library

import numpy as np import pandas as pd import matplotlib.pyplot as plt

Let's pause and look at these imports. Numpy and pandas are general ones. Here matplotlib.pyplot will help us in plotting.

Step 2 - Setup the Data

df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv', parse_dates=['date']).set_index('date')

Here, we have used one time series data from github. Also, we have set our index to date.

Now our dataset is ready.

Step 3 - Defining Difference function

def difference(dataset, interval=1): diff = list() for i in range(interval, len(dataset)): value = dataset[i] - dataset[i - interval] diff.append(value) return diff

We are just taking difference between each adjacement elements for removal of trends. Each differnence is taken and put in one list which is sent back when call on.

Step 4 - Visualizing trend

diff = difference(df.value) plt.plot(diff) plt.show()

Here we have simply calling our defined function. Finally, we are trying to understand the trend in dataset.

Step 5 - Lets look at our dataset now

Once we run the above code snippet, we will see:

Scroll down the ipython file to visualize the output.

Clearly, it can be seen that the trend has been removed from our dataset.

Relevant Projects

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

PySpark Tutorial - Learn to use Apache Spark with Python
PySpark Project-Get a handle on using Python with Spark through this hands-on data processing spark python tutorial.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Human Activity Recognition Using Smartphones Data Set
In this deep learning project, you will build a classification system where to precisely identify human fitness activities.

Perform Time series modelling using Facebook Prophet
In this project, we are going to talk about Time Series Forecasting to predict the electricity requirement for a particular house using Prophet.

Customer Churn Prediction Analysis using Ensemble Techniques
In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.