The Bayesian Information Criterion (BIC) is an index used in Bayesian statistics to choose between two or more alternative models. Comparing models with the Bayesian information criterion simply involves calculating the BIC for each model. The model with the lowest BIC is considered the best.
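To make the comparison concrete, here is a minimal sketch of the standard BIC formula, k * ln(n) - 2 * ln(L), where k is the number of estimated parameters, n the number of observations and L the maximized likelihood. The function name `bic` and the numbers passed to it are illustrative assumptions, not values from this recipe.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Return k * ln(n) - 2 * ln(L), given the maximized log-likelihood ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Two hypothetical models fitted to the same 200 observations (made-up numbers).
simple_model = bic(log_likelihood=-430.0, n_params=2, n_obs=200)
complex_model = bic(log_likelihood=-428.5, n_params=5, n_obs=200)

# The complex model fits slightly better, but its extra parameters are penalized.
print(simple_model, complex_model)
print("preferred:", "simple" if simple_model < complex_model else "complex")
```

In this made-up case the simpler model ends up with the lower BIC, because the small gain in log-likelihood does not offset the penalty for three extra parameters.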
This recipe is a short example of how to evaluate time series models using BIC. Let's get started.
import numpy as np
import pandas as pd
# The older statsmodels.tsa.arima_model.ARIMA was removed in statsmodels 0.13
from statsmodels.tsa.arima.model import ARIMA

Let's pause and look at these imports. NumPy and pandas are general-purpose libraries. statsmodels.tsa.arima.model provides the ARIMA class that we use for model building.
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv', parse_dates=['date'])
Here, we have loaded a time series dataset hosted on GitHub. The parse_dates argument converts the date column to datetime values; the index is left as the default integer range.
Now our dataset is ready.
for i in range(0, 2):
    for j in range(0, 2):
        for k in range(0, 2):
            # Fit ARIMA with order (i, j, k) and print its BIC
            model = ARIMA(df.value, order=(i, j, k)).fit()
            print(model.bic)
The best BIC can easily be found with library helpers. Here we loop over the orders by hand to understand what is actually happening inside. As the order values vary, the BIC varies with them.
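Once we run the above code snippet, we will see one BIC value per order, printed in loop order from (0, 0, 0) to (1, 1, 1). Exact values may vary with the statsmodels version:
1316.66388768731
1162.0554484438037
913.5172157072849
868.8258162104078
918.9268418564968
888.116114839384
889.526006058412
857.0907664191164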
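Order (1, 1, 1) has the lowest BIC (about 857.09) of the eight candidates, so it is the best-fitting choice here. The search can be extended to orders up to 2, for example by using range(0, 3) in each loop, to get a better understanding of the results.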
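To check which order each printed value belongs to, the BIC values can be paired with the orders in the same sequence the nested loops visit them. This is a small sketch that reuses the numbers printed above; the names `bic_values`, `bic_by_order` and `best_order` are introduced here for illustration only.

```python
from itertools import product

# BIC values copied from the output above, in the order the loops printed them.
bic_values = [
    1316.66388768731, 1162.0554484438037, 913.5172157072849, 868.8258162104078,
    918.9268418564968, 888.116114839384, 889.526006058412, 857.0907664191164,
]

# product() iterates like the nested loops: i is outermost, k changes fastest.
orders = list(product(range(0, 2), repeat=3))
bic_by_order = dict(zip(orders, bic_values))

# The preferred order is the one with the smallest BIC.
best_order = min(bic_by_order, key=bic_by_order.get)

for order, value in sorted(bic_by_order.items(), key=lambda item: item[1]):
    print(order, round(value, 2))
print("best order:", best_order)
```

Collecting the scores in a dictionary, rather than only printing them, keeps each BIC attached to its order. That becomes more useful once the ranges are widened to range(0, 3), which gives 27 candidates to fit and compare.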