What is an interaction variable, how is it calculated, and how is it used in model building in R?

This recipe explains what an interaction variable is, how to calculate it, and how it is used in model building in R.
Last Updated: 22 Jun 2022


Recipe Objective

An interaction effect occurs when the effect of one independent variable on the dependent variable depends on the value of another independent variable. Interactions between independent variables increase the complexity of the model. An interaction indicates that a third variable affects the relationship between an independent variable and the dependent variable. We cannot ignore these interaction effects while modelling.

They are most common in regression analysis and designed experiments. An interaction variable is a constructed variable that represents some or all of the interaction effects present in a set of independent variables. Interaction variables add a further level of analysis by letting the user explore these effects in more depth.

In this recipe, we will learn how to create an interaction variable and an interaction model. We demonstrate this using regression analysis with an interaction variable.
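To make the idea concrete, here is a minimal sketch on simulated data (the variables x1, x2, y and the coefficients are hypothetical, not taken from the recipe's dataset). The data are generated so that the slope of x1 equals 2 + 3 * x2, and the fitted model recovers a slope for x1 that changes with x2.

```r
# Simulated data (hypothetical; not the recipe's dataset)
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)

# True model: the slope of x1 is 2 + 3 * x2, so it changes with x2
y <- 1 + 2 * x1 + 0.5 * x2 + 3 * x1 * x2 + rnorm(n, sd = 0.5)

# Fit a regression with both main effects and the x1:x2 interaction
fit <- lm(y ~ x1 + x2 + x1:x2)
b   <- coef(fit)

# Estimated slope of x1 at a given value of x2
slope_x1_at <- function(x2_value) unname(b["x1"] + b["x1:x2"] * x2_value)

slope_x1_at(-1)  # slope of x1 when x2 = -1
slope_x1_at(1)   # slope of x1 when x2 =  1
```

If there were no interaction, the two slopes would be roughly equal; here they differ because the effect of x1 depends on x2.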

Step 1: Reading the dataset

Dataset Description: The company wants to predict the cost it should set for a new variant of its bags, based on the attributes listed below, using a multiple linear regression model with an interaction variable:

Height – The height of the bag
Width – The width of the bag
Length – The length of the bag
Weight – The weight the bag can carry
Weight1 – Weight the bag can carry after expansion

data_1 <- read.csv("R_250_Data_1.csv")
# attach data variable
attach(data_1)
head(data_1)

Cost	Weight	Weight1	Length	Height	Width
242	23.2	25.4	30.0	11.5200	4.0200
290	24.0	26.3	31.2	12.4800	4.3056
340	23.9	26.5	31.1	12.3778	4.6961
363	26.3	29.0	33.5	12.7300	4.4555
430	26.5	29.0	34.0	12.4440	5.1340
450	26.8	29.7	34.7	13.6024	4.9274
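As a quick sanity check before modelling, the sketch below rebuilds only the six rows printed above as a data frame (the name bags is hypothetical, and the full CSV contains more rows). It also shows with() as one possible alternative to attach() for accessing columns.

```r
# First six rows as printed by head(data_1); the full file has more rows
bags <- data.frame(
  Cost    = c(242, 290, 340, 363, 430, 450),
  Weight  = c(23.2, 24.0, 23.9, 26.3, 26.5, 26.8),
  Weight1 = c(25.4, 26.3, 26.5, 29.0, 29.0, 29.7),
  Length  = c(30.0, 31.2, 31.1, 33.5, 34.0, 34.7),
  Height  = c(11.5200, 12.4800, 12.3778, 12.7300, 12.4440, 13.6024),
  Width   = c(4.0200, 4.3056, 4.6961, 4.4555, 5.1340, 4.9274)
)

str(bags)                    # column names and types
na_counts <- colSums(is.na(bags))  # missing values per column
na_counts

# Column access without attach(): evaluate the expression inside the data frame
mean_weight <- with(bags, mean(Weight))
mean_weight
```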

Step 2: Creating an interaction variable

Creating the interaction variable takes two steps:

Center the input variables (subtract each variable's mean) to reduce multicollinearity between the predictors and their product.
Form the interaction variable by multiplying two or more of the centered predictors (independent variables).

We will use the interaction between Weight and Weight1.

# centering the input variables
Weightc <- Weight - mean(Weight)
Weight1c <- Weight1 - mean(Weight1)
# creating the interaction variable
Weighti_Weight1 <- Weightc * Weight1c
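To see why centering helps, the following sketch compares how strongly Weight correlates with the product term before and after centering. It uses only the six Weight and Weight1 values shown in the head() output, so the numbers are illustrative and will differ on the full dataset.

```r
# Six values from the head() output; illustrative only
Weight  <- c(23.2, 24.0, 23.9, 26.3, 26.5, 26.8)
Weight1 <- c(25.4, 26.3, 26.5, 29.0, 29.0, 29.7)

# Product of the raw (uncentered) predictors
raw_product <- Weight * Weight1

# Product of the centered predictors, as in the recipe
Weightc  <- Weight  - mean(Weight)
Weight1c <- Weight1 - mean(Weight1)
centered_product <- Weightc * Weight1c

# Correlation between a main effect and each version of the product term
cor_raw      <- cor(Weight, raw_product)
cor_centered <- cor(Weight, centered_product)
c(raw = cor_raw, centered = cor_centered)

# scale(x, scale = FALSE) performs the same centering
Weightc_alt <- as.numeric(scale(Weight, scale = FALSE))
```

On these rows the raw product is almost perfectly correlated with Weight, while the centered product is only weakly correlated with it.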

Step 3: Creating an Interaction model

We use the lm(formula, data) function to create the interaction model, where:

formula = y ~ x1 + x2 + x3 + ... (y is the dependent variable; x1, x2, ... are the independent variables)
data = the data frame that contains the variables
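R's formula syntax can also build the product term itself. As a hedged sketch on simulated data (d, x1, x2 and y are hypothetical names), x1 * x2 in a formula expands to x1 + x2 + x1:x2, which gives the same fit as multiplying the columns by hand. Note that this shortcut does not center the variables; center them first if that is wanted, as this recipe does.

```r
# Simulated data (hypothetical; not the recipe's dataset)
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 - d$x2 + 0.5 * d$x1 * d$x2 + rnorm(100)

# x1 * x2 expands to x1 + x2 + x1:x2
fit_star <- lm(y ~ x1 * x2, data = d)

# Same model with the product written out explicitly via I()
fit_manual <- lm(y ~ x1 + x2 + I(x1 * x2), data = d)

names(coef(fit_star))
cbind(star = coef(fit_star), manual = coef(fit_manual))
```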

interactionModel <- lm(Cost ~ Weight1 + Weight + Length + Height + Width + Weighti_Weight1, data = data_1)
# display summary information about the model
summary(interactionModel)

Call:
lm(formula = Cost ~ Weight1 + Weight + Length + Height + Width + 
    Weighti_Weight1, data = data_1)

Residuals:
     Min       1Q   Median       3Q      Max 
-180.928  -35.323   -3.022   32.165  201.418 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -516.25092   14.36967 -35.926  < 2e-16 ***
Weight1          -24.44967   20.27981  -1.206  0.22984    
Weight            51.30138   19.51801   2.628  0.00946 ** 
Length           -18.99909    8.43269  -2.253  0.02569 *  
Height            36.24918    4.25092   8.527  1.4e-14 ***
Width             98.44557   10.45567   9.416  < 2e-16 ***
Weighti_Weight1    0.90253    0.04045  22.310  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 59.79 on 152 degrees of freedom
Multiple R-squared:  0.9732,	Adjusted R-squared:  0.9721 
F-statistic: 918.7 on 6 and 152 DF,  p-value: < 2.2e-16

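Note: This is the complete interaction model. The interaction term Weighti_Weight1 is highly significant (p < 2e-16). A natural next step is to compare this model with alternatives, for example a model without the interaction term.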
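One common way to make such a comparison is a nested-model F-test with anova(), alongside AIC(). The sketch below uses simulated data (d, x1, x2, y and the model names are hypothetical), since the recipe's CSV is not reproduced here; the same calls could be applied to interactionModel and a model fitted without Weighti_Weight1.

```r
# Simulated data (hypothetical; not the recipe's dataset)
set.seed(7)
n <- 150
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 10 + 2 * d$x1 + 1 * d$x2 + 1.5 * d$x1 * d$x2 + rnorm(n)

# Model without and with the interaction term
additive         <- lm(y ~ x1 + x2, data = d)
with_interaction <- lm(y ~ x1 * x2, data = d)

# F-test for nested models: does adding the interaction reduce residual error?
cmp <- anova(additive, with_interaction)
print(cmp)

# Lower AIC indicates a better trade-off between fit and complexity
aic_table <- AIC(additive, with_interaction)
aic_table
```

A small p-value in the second row of the anova table, together with a lower AIC for the interaction model, would support keeping the interaction term.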
