How to implement Elastic Net regression in R

Elastic Net is a regularization technique that combines the L1 (lasso) and L2 (ridge) penalties. In this recipe, we shall learn how to implement Elastic Net regression in R.

Recipe Objective: How to implement Elastic Net regression in R?

Elastic Net is a regularization technique that combines the L1 and L2 penalties. It improves your model's predictions by combining feature elimination from lasso with coefficient shrinkage from ridge regression. Elastic Net is particularly effective when dealing with multicollinearity, where it often outperforms both lasso and ridge regression. The steps to implement Elastic Net Regression in R are as follows -
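To see the idea in isolation before the full recipe, note that glmnet's alpha parameter controls the mix of the two penalties: alpha = 0 is pure ridge, alpha = 1 is pure lasso, and anything in between is Elastic Net. A minimal sketch on simulated data (the data and coefficients here are made up purely for illustration):

```r
#minimal sketch: alpha blends the L1 (lasso) and L2 (ridge) penalties
library(glmnet)

set.seed(42)
x <- matrix(rnorm(100 * 5), ncol = 5)       #simulated predictors
y <- x %*% c(2, 0, 0, 1, 0) + rnorm(100)    #sparse true coefficients

fit_ridge   <- glmnet(x, y, alpha = 0)      #pure L2 penalty
fit_lasso   <- glmnet(x, y, alpha = 1)      #pure L1 penalty
fit_elastic <- glmnet(x, y, alpha = 0.5)    #equal mix of both
```

With alpha = 0.5, the lasso part can still zero out the irrelevant coefficients while the ridge part stabilizes the estimates of the correlated ones.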

Step 1: Load the required packages

#importing required libraries
library(caret)
library(glmnet)
library(MASS)

Step 2: Load the dataset

Boston is a built-in dataset (from the MASS package) which contains housing data for 506 census tracts of Boston from the 1970 census. Its columns are:
crim - per capita crime rate by town
zn - proportion of residential land zoned for lots over 25,000 sq. ft.
indus - proportion of non-retail business acres per town
chas - Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
nox - nitric oxides concentration (parts per 10 million)
rm - average number of rooms per dwelling
age - proportion of owner-occupied units built before 1940
dis - weighted distances to five Boston employment centers
rad - index of accessibility to radial highways
tax - full-value property-tax rate per USD 10,000
ptratio - pupil-teacher ratio by town
black - 1000(B - 0.63)^2 where B is the proportion of blacks by town
lstat - percentage of the lower status of the population
medv - median value of owner-occupied homes in USD 1000's

#loading the dataset
data <- Boston
head(data)

	     crim zn indus chas   nox    rm  age    dis rad tax ptratio
1 0.00632 18  2.31    0 0.538 6.575 65.2 4.0900   1 296    15.3
2 0.02731  0  7.07    0 0.469 6.421 78.9 4.9671   2 242    17.8
3 0.02729  0  7.07    0 0.469 7.185 61.1 4.9671   2 242    17.8
4 0.03237  0  2.18    0 0.458 6.998 45.8 6.0622   3 222    18.7
5 0.06905  0  2.18    0 0.458 7.147 54.2 6.0622   3 222    18.7
6 0.02985  0  2.18    0 0.458 6.430 58.7 6.0622   3 222    18.7
   black lstat medv
1 396.90  4.98 24.0
2 396.90  9.14 21.6
3 392.83  4.03 34.7
4 394.63  2.94 33.4
5 396.90  5.33 36.2
6 394.12  5.21 28.7

Step 3: Check the structure of the dataset

#structure
str(data)

	'data.frame':	506 obs. of  14 variables:
 $ crim   : num  0.00632 0.02731 0.02729 0.03237 0.06905 ...
 $ zn     : num  18 0 0 0 0 0 12.5 12.5 12.5 12.5 ...
 $ indus  : num  2.31 7.07 7.07 2.18 2.18 2.18 7.87 7.87 7.87 7.87 ...
 $ chas   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ nox    : num  0.538 0.469 0.469 0.458 0.458 0.458 0.524 0.524 0.524 0.524 ...
 $ rm     : num  6.58 6.42 7.18 7 7.15 ...
 $ age    : num  65.2 78.9 61.1 45.8 54.2 58.7 66.6 96.1 100 85.9 ...
 $ dis    : num  4.09 4.97 4.97 6.06 6.06 ...
 $ rad    : int  1 2 2 3 3 3 5 5 5 5 ...
 $ tax    : num  296 242 242 222 222 222 311 311 311 311 ...
 $ ptratio: num  15.3 17.8 17.8 18.7 18.7 18.7 15.2 15.2 15.2 15.2 ...
 $ black  : num  397 397 393 395 397 ...
 $ lstat  : num  4.98 9.14 4.03 2.94 5.33 ...
 $ medv   : num  24 21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 ...

All the columns are of int or numeric type.

Step 4: Train-Test split

#train-test split
set.seed(222)
ind <- sample(2, nrow(data), replace = TRUE, prob = c(0.7, 0.3))
train <- data[ind==1,]
test <- data[ind==2,]
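Note that the sampling approach above gives an approximate 70/30 split. A common caret alternative is createDataPartition(), which additionally stratifies the split on the outcome; a sketch, assuming the same 70/30 ratio and the Boston data loaded as above:

```r
#alternative: stratified 70/30 split using caret
library(caret)
library(MASS)

set.seed(222)
idx   <- createDataPartition(Boston$medv, p = 0.7, list = FALSE)
train <- Boston[idx, ]
test  <- Boston[-idx, ]
```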

Step 5: Create custom Control Parameters

#creating custom Control Parameters
custom <- trainControl(method = "repeatedcv",
                       number = 10,
                       repeats = 5,
                       verboseIter = TRUE)

Step 6: Model Fitting

#fitting Elastic Net Regression model
set.seed(1234)
en <- train(medv ~ ., train,
            method = 'glmnet',
            tuneGrid = expand.grid(alpha = seq(0, 1, length = 10),
                                   lambda = seq(0.0001, 0.2, length = 5)),
            trControl = custom)
en

Output:
glmnet 

353 samples
 13 predictor

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 5 times) 
Summary of sample sizes: 316, 318, 318, 319, 317, 318, ... 
Resampling results across tuning parameters:

  alpha      lambda    RMSE      Rsquared   MAE     
  0.0000000  0.000100  4.242204  0.7782278  3.008339
  0.0000000  0.050075  4.242204  0.7782278  3.008339
  0.0000000  0.100050  4.242204  0.7782278  3.008339
  0.0000000  0.150025  4.242204  0.7782278  3.008339
  0.0000000  0.200000  4.242204  0.7782278  3.008339
  0.1111111  0.000100  4.230292  0.7786226  3.025857
  0.1111111  0.050075  4.228437  0.7787777  3.019236
  0.1111111  0.100050  4.227739  0.7788251  3.010332
  0.1111111  0.150025  4.229814  0.7786315  3.005266
  0.1111111  0.200000  4.233949  0.7782676  3.003662
  0.2222222  0.000100  4.230694  0.7785669  3.026161
  0.2222222  0.050075  4.228863  0.7787036  3.017107
  0.2222222  0.100050  4.231209  0.7784424  3.008141
  0.2222222  0.150025  4.238397  0.7777559  3.006343
  0.2222222  0.200000  4.247863  0.7768865  3.010248
  0.3333333  0.000100  4.230795  0.7785677  3.026282
  0.3333333  0.050075  4.229507  0.7786164  3.014779
  0.3333333  0.100050  4.236481  0.7778956  3.007742
  0.3333333  0.150025  4.249344  0.7766620  3.011300
  0.3333333  0.200000  4.265850  0.7751107  3.019599
  0.4444444  0.000100  4.230574  0.7785789  3.025987
  0.4444444  0.050075  4.230733  0.7784693  3.012970
  0.4444444  0.100050  4.242983  0.7772328  3.009176
  0.4444444  0.150025  4.262872  0.7753149  3.017982
  0.4444444  0.200000  4.288416  0.7728709  3.031840
  0.5555556  0.000100  4.230656  0.7785681  3.026115
  0.5555556  0.050075  4.232424  0.7782802  3.011358
  0.5555556  0.100050  4.250651  0.7764597  3.012491
  0.5555556  0.150025  4.279147  0.7736912  3.026532
  0.5555556  0.200000  4.316077  0.7700938  3.047343
  0.6666667  0.000100  4.230688  0.7785626  3.026161
  0.6666667  0.050075  4.234804  0.7780228  3.010543
  0.6666667  0.100050  4.259738  0.7755436  3.016711
  0.6666667  0.150025  4.298632  0.7717308  3.037088
  0.6666667  0.200000  4.346119  0.7670904  3.065817
  0.7777778  0.000100  4.230768  0.7785606  3.026086
  0.7777778  0.050075  4.237651  0.7777250  3.010355
  0.7777778  0.100050  4.270246  0.7744861  3.021917
  0.7777778  0.150025  4.321489  0.7694148  3.050128
  0.7777778  0.200000  4.369850  0.7648646  3.081138
  0.8888889  0.000100  4.230862  0.7785562  3.026279
  0.8888889  0.050075  4.240866  0.7773909  3.010495
  0.8888889  0.100050  4.282287  0.7732696  3.028047
  0.8888889  0.150025  4.345008  0.7670525  3.064443
  0.8888889  0.200000  4.387467  0.7632810  3.091423
  1.0000000  0.000100  4.230700  0.7785841  3.025998
  1.0000000  0.050075  4.244334  0.7770330  3.011344
  1.0000000  0.100050  4.295773  0.7719043  3.035321
  1.0000000  0.150025  4.364484  0.7651821  3.076854
  1.0000000  0.200000  4.405206  0.7617009  3.102491

RMSE was used to select the optimal model using the
 smallest value.
The final values used for the model were alpha = 0.1111111
 and lambda = 0.10005.

Step 7: Check RMSE value

#mean validation score
mean(en$resample$RMSE)

	[1] 4.227739
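Beyond the cross-validated score, you can also check how the tuned model generalizes by scoring the held-out test set; a sketch, assuming the en model and test split from the steps above:

```r
#evaluating the tuned model on the held-out test set
pred      <- predict(en, newdata = test)
test_rmse <- sqrt(mean((test$medv - pred)^2))
test_rmse
```

A test RMSE close to the cross-validated RMSE suggests the model is not overfitting the training data.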

Step 8: Plots

#plotting the model
plot(en, main = "Elastic Net Regression")

#plotting important variables
plot(varImp(en,scale=TRUE))

nox, rm, and chas were the top three most important variables.
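If you want the actual coefficients behind the importance plot, they can be pulled from the final glmnet object at the selected lambda; a sketch, assuming the fitted en model from Step 6:

```r
#coefficients of the final model at the selected lambda
coef(en$finalModel, s = en$bestTune$lambda)
```

Coefficients shrunk exactly to zero correspond to predictors eliminated by the L1 part of the penalty.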
