What is an interaction variable, how is it calculated, and how is it used in model building?

This recipe explains what an interaction variable is, how to calculate it, and how it is used in model building.


Recipe Objective

An interaction effect occurs when the effect of one independent variable on the dependent variable depends on the value of another independent variable. Interactions between independent variables increase the complexity of the model. An interaction indicates that a third variable influences the relationship between an independent variable and the dependent variable. We cannot ignore these interaction effects while modelling.

Interaction effects are most common in regression analysis and designed experiments. An interaction variable is a constructed variable that represents some or all of the interaction effects present in a set of independent variables. Interaction variables add a level of analysis by letting the user explore the combined effects of the predictors in more depth.
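To make the idea concrete, here is a minimal sketch in R on simulated data. The variable names, coefficients, and seed are illustrative assumptions and are not part of the recipe's dataset. The response is generated so that the slope of x1 changes with x2, and a regression with a product term recovers that dependence.

```r
# Simulated illustration of an interaction effect (all values are made up)
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
# The slope of x1 is (1.5 + 3 * x2), so it depends on the value of x2
y  <- 2 + 1.5 * x1 + 0.5 * x2 + 3 * x1 * x2 + rnorm(n)

sim <- data.frame(y, x1, x2)
# x1:x2 adds only the product term to the main effects
fit <- lm(y ~ x1 + x2 + x1:x2, data = sim)
b <- coef(fit)

# Estimated slope of x1 at a low (x2 = -1) and a high (x2 = 1) value of x2
slope_low  <- unname(b["x1"] + b["x1:x2"] * (-1))
slope_high <- unname(b["x1"] + b["x1:x2"] * 1)
print(round(b, 2))
print(c(slope_low = slope_low, slope_high = slope_high))
```

If the two printed slopes differ clearly, the effect of x1 on y is not constant, which is what an interaction means.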

In this recipe, we will learn how to create an interaction variable and fit a model that includes it. We will demonstrate this using regression analysis with an interaction variable.

Step 1: Reading the dataset

Dataset Description: The company wants to predict the cost it should set for a new variant of its bags, based on the attributes listed below, using a multiple linear regression model with an interaction variable:

  1. Height – The height of the bag
  2. Width – The width of the bag
  3. Length – The length of the bag
  4. Weight – The weight the bag can carry
  5. Weight1 – Weight the bag can carry after expansion

data_1 = read.csv("R_250_Data_1.csv")
# attach data variable
attach(data_1)
head(data_1)
Cost	Weight	Weight1	Length	Height	Width
242	23.2	25.4	30.0	11.5200	4.0200
290	24.0	26.3	31.2	12.4800	4.3056
340	23.9	26.5	31.1	12.3778	4.6961
363	26.3	29.0	33.5	12.7300	4.4555
430	26.5	29.0	34.0	12.4440	5.1340
450	26.8	29.7	34.7	13.6024	4.9274

Step 2: Creating an interaction variable

Creating the interaction variable takes two steps:

  1. Center the input variables (subtract each variable's mean) to reduce multicollinearity between the interaction term and its component variables.
  2. Form the interaction variable by multiplying two or more predictors (independent variables).

We will use the interaction between Weight and Weight1.

# centering the input variables
Weightc <- Weight - mean(Weight)
Weight1c <- Weight1 - mean(Weight1)
# creating the interaction variable
Weighti_Weight1 <- Weightc * Weight1c
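The effect of centering can be checked on the six rows printed in Step 1. The sketch below retypes those rows by hand, an assumption made so the snippet runs without the CSV file; results on the full dataset will differ. It compares how strongly the product term correlates with Weight before and after centering.

```r
# Six rows copied from the head(data_1) output in Step 1 (not the full dataset)
Weight  <- c(23.2, 24.0, 23.9, 26.3, 26.5, 26.8)
Weight1 <- c(25.4, 26.3, 26.5, 29.0, 29.0, 29.7)

# Product term built from the raw variables
raw_product <- Weight * Weight1

# Product term built from the centered variables, as in the recipe
Weightc  <- Weight  - mean(Weight)
Weight1c <- Weight1 - mean(Weight1)
centered_product <- Weightc * Weight1c

# Correlation between the product term and one of its components
cor_raw      <- cor(Weight,  raw_product)
cor_centered <- cor(Weightc, centered_product)
print(c(raw = cor_raw, centered = cor_centered))
```

On these rows the raw product is almost perfectly correlated with Weight, while the centered product is much less strongly correlated with it, which is the reason for centering before multiplying.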

Step 3: Creating an Interaction model

We use the lm(formula, data) function to create an interaction model, where:

  1. formula = y ~ x1 + x2 + x3 + ... (y is the dependent variable; x1, x2, ... are the independent variables)
  2. data = the data frame containing the variables

interactionModel <- lm(Cost ~ Weight1 + Weight + Length + Height + Width + Weighti_Weight1, data = data_1)
# display summary information about the model
summary(interactionModel)

Call:
lm(formula = Cost ~ Weight1 + Weight + Length + Height + Width + 
    Weighti_Weight1, data = data_1)

Residuals:
     Min       1Q   Median       3Q      Max 
-180.928  -35.323   -3.022   32.165  201.418 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -516.25092   14.36967 -35.926  < 2e-16 ***
Weight1          -24.44967   20.27981  -1.206  0.22984    
Weight            51.30138   19.51801   2.628  0.00946 ** 
Length           -18.99909    8.43269  -2.253  0.02569 *  
Height            36.24918    4.25092   8.527  1.4e-14 ***
Width             98.44557   10.45567   9.416  < 2e-16 ***
Weighti_Weight1    0.90253    0.04045  22.310  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 59.79 on 152 degrees of freedom
Multiple R-squared:  0.9732,	Adjusted R-squared:  0.9721 
F-statistic: 918.7 on 6 and 152 DF,  p-value: < 2.2e-16
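
R's formula syntax can also build the product term inside lm() itself: a:b adds only the interaction, and a * b expands to a + b + a:b. The sketch below uses the built-in mtcars dataset as a stand-in (an assumption; it is unrelated to the bag data) and centers the two predictors first, as in Step 2.

```r
# Built-in dataset used only as a stand-in for illustration
cars <- mtcars
cars$wt_c <- cars$wt - mean(cars$wt)   # centered weight
cars$hp_c <- cars$hp - mean(cars$hp)   # centered horsepower

# Manual product column, mirroring the recipe's approach
cars$wt_hp <- cars$wt_c * cars$hp_c
manual_fit <- lm(mpg ~ wt_c + hp_c + wt_hp, data = cars)

# Same model written with the formula operator
formula_fit <- lm(mpg ~ wt_c * hp_c, data = cars)

print(coef(manual_fit))
print(coef(formula_fit))
```

Both calls fit the same model; only the name of the interaction coefficient differs (wt_hp versus wt_c:hp_c). Keeping the product inside the formula has the practical advantage that predict() rebuilds the term automatically for new data.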

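Note: This is the complete interaction model. The interaction term Weighti_Weight1 is highly significant (p < 2e-16), and the model has an adjusted R-squared of 0.9721. To judge whether the interaction term is worth keeping, compare this model with others, such as the same regression fitted without the interaction variable.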
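
One common way to make that comparison is a nested-model F-test with anova(), alongside adjusted R-squared. The sketch below is an illustration on the built-in mtcars dataset (an assumption, since the bag data file is not included here); with the recipe's data, the reduced model would be the same lm() call without Weighti_Weight1.

```r
# Stand-in data: built-in mtcars, with centered predictors
cars <- mtcars
cars$wt_c <- cars$wt - mean(cars$wt)
cars$hp_c <- cars$hp - mean(cars$hp)

# Reduced model: main effects only
base_fit <- lm(mpg ~ wt_c + hp_c, data = cars)
# Full model: main effects plus the interaction term
full_fit <- lm(mpg ~ wt_c + hp_c + wt_c:hp_c, data = cars)

# F-test for whether the interaction term improves the fit
cmp <- anova(base_fit, full_fit)
print(cmp)

# Adjusted R-squared penalizes the extra parameter
adj_r2 <- c(base = summary(base_fit)$adj.r.squared,
            full = summary(full_fit)$adj.r.squared)
print(adj_r2)
```

A small p-value in the second row of the anova() table suggests that the interaction term explains variation the main effects alone do not.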
