How to create and optimize a baseline Ridge Regression model in R?

This recipe helps you create and optimize a baseline Ridge Regression model in R

Recipe Objective

The subset selection methods use ordinary least squares to fit a linear model that contains a subset of the predictors. As an alternative, we can fit a model containing all p predictors using a technique that constrains or regularizes the coefficient estimates, or equivalently, that shrinks the coefficient estimates towards zero. This shrinkage is also known as regularisation.

It may not be immediately obvious why such a constraint should improve the fit, but it turns out that shrinking the coefficient estimates can significantly reduce their variance (and, for some methods, also perform variable selection). There are three main types of regularisation:

  1. Ridge regression: This uses L2 regularisation to penalise the size of the coefficients while they are being learned from a regression model. It involves minimising the sum of squared residuals plus the penalty term added by L2 regularisation. This penalty pushes the coefficients of variables with minor contribution close to zero (but never exactly zero). It is useful when all the variables need to be incorporated in the model.
  2. Lasso regression: This type of regularisation can make the coefficients of variables with minor contribution exactly zero by adding an L1 penalty term to the loss function. Only the most significant variables are left in the final model after applying this technique.
  2. Lasso regression: This type of regularization makes the coefficients of variables with minor contribution exactly to zero by adding a penalised term to the loss function. Only the most significant variables are left in the final model after applying this technique.
  3. Elastic-Net regression: It is a combination of both ridge and lasso regression.
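Before moving to caret, the ridge penalty itself can be illustrated in a few lines of base R. This is a minimal sketch on synthetic data (all names and values here are illustrative, not part of the recipe's dataset), using the closed-form ridge solution beta = (X'X + lambda * I)^-1 X'y:

```r
# Synthetic data: standardised predictors, intercept omitted for simplicity
set.seed(42)
X <- scale(matrix(rnorm(100 * 3), ncol = 3))   # 100 observations, 3 predictors
y <- X %*% c(2, -1, 0.5) + rnorm(100)          # known true coefficients

# Closed-form ridge estimate: (X'X + lambda * I)^-1 X'y
ridge_coefs <- function(X, y, lambda) {
  p <- ncol(X)
  solve(t(X) %*% X + lambda * diag(p), t(X) %*% y)
}

# The L2 norm of the coefficient vector shrinks as lambda grows,
# but no coefficient is ever set exactly to zero
sqrt(sum(ridge_coefs(X, y, 0)^2))     # OLS solution (no shrinkage)
sqrt(sum(ridge_coefs(X, y, 100)^2))   # heavily shrunk ridge solution
```

This is only a didactic sketch; in practice (and in the rest of this recipe) glmnet via caret handles standardisation, the intercept, and the lambda path for us.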

In this recipe, we will discuss how to create and optimise a ridge regression model.

STEP 1: Importing Necessary Libraries

library(caret)      # for model training and cross validation
library(tidyverse)  # for data manipulation

STEP 2: Read a csv file and explore the data

The dataset attached contains data on 159 different bags associated with ABC industries.

The bags have certain attributes which are described below:

  1. Height – The height of the bag
  2. Width – The width of the bag
  3. Length – The length of the bag
  4. Weight – The weight the bag can carry
  5. Weight1 – Weight the bag can carry after expansion

The company now wants to predict the cost it should set for a new variant of these bags.

data <- read.csv("R_340_Data_1.csv")
glimpse(data)
Rows: 159
Columns: 6
$ Cost     242, 290, 340, 363, 430, 450, 500, 390, 450, 500, 475, 500,...
$ Weight   23.2, 24.0, 23.9, 26.3, 26.5, 26.8, 26.8, 27.6, 27.6, 28.5,...
$ Weight1  25.4, 26.3, 26.5, 29.0, 29.0, 29.7, 29.7, 30.0, 30.0, 30.7,...
$ Length   30.0, 31.2, 31.1, 33.5, 34.0, 34.7, 34.5, 35.0, 35.1, 36.2,...
$ Height   11.5200, 12.4800, 12.3778, 12.7300, 12.4440, 13.6024, 14.17...
$ Width    4.0200, 4.3056, 4.6961, 4.4555, 5.1340, 4.9274, 5.2785, 4.6...
summary(data) # returns the statistical summary of the data columns
Cost            Weight         Weight1          Length     
 Min.   :   0.0   Min.   : 7.50   Min.   : 8.40   Min.   : 8.80  
 1st Qu.: 120.0   1st Qu.:19.05   1st Qu.:21.00   1st Qu.:23.15  
 Median : 273.0   Median :25.20   Median :27.30   Median :29.40  
 Mean   : 398.3   Mean   :26.25   Mean   :28.42   Mean   :31.23  
 3rd Qu.: 650.0   3rd Qu.:32.70   3rd Qu.:35.50   3rd Qu.:39.65  
 Max.   :1650.0   Max.   :59.00   Max.   :63.40   Max.   :68.00  
     Height           Width      
 Min.   : 1.728   Min.   :1.048  
 1st Qu.: 5.945   1st Qu.:3.386  
 Median : 7.786   Median :4.248  
 Mean   : 8.971   Mean   :4.417  
 3rd Qu.:12.366   3rd Qu.:5.585  
 Max.   :18.957   Max.   :8.142   
dim(data)
159 6

STEP 3: Train Test Split

# createDataPartition() from the caret package splits the original dataset
# into a training set (80%) and a testing set (20%)
parts = createDataPartition(data$Cost, p = .8, list = F)
train = data[parts, ]
test = data[-parts, ]

STEP 4: Building and optimising Ridge Regression

We will use the caret package to perform cross validation and hyperparameter tuning (alpha and lambda values) using the grid search technique. First, we will use the trainControl() function to define the cross-validation method and the search type, i.e. "grid" or "random". Then we train the model using the train() function with tuneGrid as one of the arguments.

Syntax: train(formula, data = , method = , trControl = , tuneGrid = )

where:

  1. formula = y~x1+x2+x3+..., where y is the dependent (response) variable and x1, x2, x3 are the independent (predictor) variables
  2. data = dataframe
  3. method = Type of the model to be built ("glmnet" for ridge, lasso or elastic-net regression)
  4. trControl = Takes the control parameters. We will use the trainControl() function here to specify the cross-validation technique.
  5. tuneGrid = Takes the tuning parameters and applies grid search CV on them
# specifying the CV technique to be passed into train() later;
# the number parameter is the "k" in k-fold cross validation
train_control = trainControl(method = "cv", number = 5, search = "grid")

# customising the tuning grid (ridge regression has alpha = 0)
ridgeGrid = expand.grid(alpha = 0,
                        lambda = c(seq(0.1, 1.5, by = 0.1), seq(2, 5, 1), seq(5, 20, 2)))

set.seed(50)

# training a ridge regression model while tuning parameters
model = train(Cost ~ ., data = train, method = "glmnet",
              trControl = train_control, tuneGrid = ridgeGrid)

# summarising the results
print(model)
129 samples
  5 predictor

No pre-processing
Resampling: Cross-Validated (5 fold) 
Summary of sample sizes: 103, 104, 103, 103, 103 
Resampling results across tuning parameters:

  lambda  RMSE      Rsquared   MAE    
   0.1    127.0819  0.8825179  98.1665
   0.2    127.0819  0.8825179  98.1665
   0.3    127.0819  0.8825179  98.1665
   0.4    127.0819  0.8825179  98.1665
   0.5    127.0819  0.8825179  98.1665
   0.6    127.0819  0.8825179  98.1665
   0.7    127.0819  0.8825179  98.1665
   0.8    127.0819  0.8825179  98.1665
   0.9    127.0819  0.8825179  98.1665
   1.0    127.0819  0.8825179  98.1665
   1.1    127.0819  0.8825179  98.1665
   1.2    127.0819  0.8825179  98.1665
   1.3    127.0819  0.8825179  98.1665
   1.4    127.0819  0.8825179  98.1665
   1.5    127.0819  0.8825179  98.1665
   2.0    127.0819  0.8825179  98.1665
   3.0    127.0819  0.8825179  98.1665
   4.0    127.0819  0.8825179  98.1665
   5.0    127.0819  0.8825179  98.1665
   7.0    127.0819  0.8825179  98.1665
   9.0    127.0819  0.8825179  98.1665
  11.0    127.0819  0.8825179  98.1665
  13.0    127.0819  0.8825179  98.1665
  15.0    127.0819  0.8825179  98.1665
  17.0    127.0819  0.8825179  98.1665
  19.0    127.0819  0.8825179  98.1665

Tuning parameter 'alpha' was held constant at a value of 0
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were alpha = 0 and lambda = 19.

Note: RMSE was used to select the optimal model (the smallest value wins), and the final model has a lambda value of 19.
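One caveat worth checking: the selected lambda (19) sits at the upper boundary of our grid, and every candidate produced identical resampling metrics, so the search could be widened to confirm nothing beyond 19 does better. A possible wider grid is sketched below (widerGrid is a hypothetical name; the commented train() call mirrors the one above):

```r
# A wider lambda grid to check whether shrinkage beyond 19 helps;
# expand.grid() is base R, so this line runs without any package
widerGrid <- expand.grid(
  alpha  = 0,  # alpha = 0 keeps the model ridge regression
  lambda = c(seq(0.1, 1.5, by = 0.1), seq(2, 20, by = 2), seq(25, 200, by = 25))
)
nrow(widerGrid)  # number of candidate lambda values for grid-search CV

# Re-fit exactly as before, swapping in the wider grid:
# model = train(Cost ~ ., data = train, method = "glmnet",
#               trControl = train_control, tuneGrid = widerGrid)
```

If the refit still selects the largest lambda, keep extending the grid until RMSE starts to rise again.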

STEP 5: Make predictions on the final ridge regression model

We use our final ridge regression model to make predictions on the testing data (unseen data), predict the 'Cost' value, and generate performance measures.

# use the model to make predictions on the test data
pred_y = predict(model, test)

# performance metrics on the test data
test_y = test[, 1]
mean((test_y - pred_y)^2)     # MSE  - Mean Squared Error
caret::RMSE(test_y, pred_y)   # RMSE - Root Mean Squared Error
13138.7222015375
114.624265326053
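The two metrics printed above are directly related: RMSE is simply the square root of MSE. A toy sketch with made-up vectors (the values are illustrative, not from the recipe's data):

```r
# Toy actual/predicted vectors to show the MSE/RMSE relationship
actual    <- c(300, 450, 500)
predicted <- c(320, 430, 540)

mse  <- mean((actual - predicted)^2)  # mean squared error: 800
rmse <- sqrt(mse)                     # caret::RMSE(predicted, actual) gives the same value
```

RMSE is usually preferred for reporting because it is in the same units as the target (here, Cost).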

Final Coefficients are mentioned below:

data.frame(ridge = as.data.frame.matrix(coef(model$finalModel, model$finalModel$lambdaOpt))) %>%
  rename(ridge = X1)
(Intercept)	-516.756313
Weight		8.391177
Weight1		7.564281
Length		5.726575
Height		9.096478
Width		49.89799

