How to check multicollinearity using python?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to check multicollinearity using python?

How to check multicollinearity using python?

This recipe helps you check multicollinearity using python

0

Recipe Objective

How to check multicollinearity using python?

Multicollinearity mostly occurs in a regression model when two or more independent variable are highly correlated to eachother.

The variance inflation factor (VIF) can be used to check the multicollinearity.

VIF starts at 1 and has no limits. VIF = 1, no correlation beetween idependent variables. VIF > 10 high multicollinearity between independent variables.

Step 1- Importing Libraries.

import pandas as pd from statsmodels.stats.outliers_influence import variance_inflation_factor

Step 2- Reading file

df= pd.read_csv('/content/sample_data/california_housing_train.csv') df.head()

Step 3- Defining function.

We will define a function which will check the correlation between the independent variables.

def calc_VIF(x): vif= pd.DataFrame() vif['variables']=x.columns vif["VIF"]=[variance_inflation_factor(x.values,i) for i in range(x.shape[1])] return(vif)

Step 4- Showing multicollinearity.

x=df.iloc[:,:-1] calc_VIF(x)

Relevant Projects

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Perform Time series modelling using Facebook Prophet
In this project, we are going to talk about Time Series Forecasting to predict the electricity requirement for a particular house using Prophet.

Loan Eligibility Prediction in Python using H2O.ai
In this loan prediction project you will build predictive models in Python using H2O.ai to predict if an applicant is able to repay the loan or not.

Deep Learning with Keras in R to Predict Customer Churn
In this deep learning project, we will predict customer churn using Artificial Neural Networks and learn how to model an ANN in R with the keras deep learning package.

Expedia Hotel Recommendations Data Science Project
In this data science project, you will contextualize customer data and predict the likelihood a customer will stay at 100 different hotel groups.

Census Income Data Set Project - Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based on census data.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.