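This recipe uses the ggplot2 package in R to visualize the output of a regression analysis. The visualization combines a fitted regression line with its confidence interval (the uncertainty about the mean response) and prediction intervals (the range in which individual new observations are expected to fall).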
What is Regression Analysis?
Regression analysis is a statistical technique used to estimate the relationship between two or more variables. In business, it is used to understand which factors affect a specific outcome. Regression lets you determine which factors matter most, which can be ignored, and how the factors interact with each other. To conduct a regression analysis, you define a dependent variable that you hypothesize is influenced by one or more independent variables.
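As a minimal sketch of that idea, the snippet below fits a simple linear regression on R's built-in cars dataset, treating dist (stopping distance) as the dependent variable and speed as the single independent variable. The object names fit, coefs, and r_squared are illustrative choices, not part of the recipe.

```r
# Minimal sketch: regress stopping distance on speed using the built-in cars data
data("cars", package = "datasets")

# dist is the dependent variable; speed is the single independent variable
fit <- lm(dist ~ speed, data = cars)

# Estimated intercept and slope of the fitted line
coefs <- coef(fit)
print(coefs)

# Share of the variance in dist that is explained by speed
r_squared <- summary(fit)$r.squared
print(r_squared)
```

The slope tells you how much the expected stopping distance changes for each additional unit of speed, and the R-squared value gives a rough sense of how much that one factor matters.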
What is R?
R is a programming language for statistical computing and data science. It has thousands of contributed packages for data analytics, including regression, classification, and visualisation.
# --------------------------------------------------------------
# Regression Analysis in R - How to visualise predict() function
# --------------------------------------------------------------
# Load libraries
library(gridExtra)  # provides grid.arrange() for placing plots side by side
# Visualise prediction with CI and PI
# 1. Build linear model
data("cars", package = "datasets")
model <- lm(dist ~ speed, data = cars)
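# 2. Add predictions
# predict() returns a matrix with columns fit, lwr and upr.
# Without newdata, R warns that the intervals refer to future responses; that is expected here.
pred.int <- predict(model, interval = "prediction")
mydata <- cbind(cars, pred.int)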
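# 3. Regression line + confidence intervals
library(ggplot2)
# stat_smooth() draws the fitted line and shades its 95% confidence band by default
p1 <- ggplot(mydata, aes(x = speed, y = dist)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x)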
# 4. Add prediction intervals (dashed red lines)
p2 <- p1 +
  geom_line(aes(y = lwr), color = "red", linetype = "dashed") +
  geom_line(aes(y = upr), color = "red", linetype = "dashed")
# Plot both versions side by side
grid.arrange(p1, p2, nrow = 1)
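The difference between the two kinds of interval in the plot can also be checked numerically. The sketch below is illustrative and not part of the recipe: it refits the same model and calls predict() at a few hypothetical speeds (new_speeds is an assumed example input), once with interval = "confidence" and once with interval = "prediction", then compares the interval widths.

```r
# Refit the same model so this sketch runs on its own
data("cars", package = "datasets")
model <- lm(dist ~ speed, data = cars)

# Hypothetical speeds at which to predict stopping distance (illustrative values)
new_speeds <- data.frame(speed = c(10, 15, 20))

# Confidence interval: uncertainty about the mean stopping distance at each speed
conf_int <- predict(model, newdata = new_speeds,
                    interval = "confidence", level = 0.95)

# Prediction interval: uncertainty about a single new observation at each speed
pred_int <- predict(model, newdata = new_speeds,
                    interval = "prediction", level = 0.95)

# Width of each interval (upper bound minus lower bound)
conf_width <- conf_int[, "upr"] - conf_int[, "lwr"]
pred_width <- pred_int[, "upr"] - pred_int[, "lwr"]

print(cbind(new_speeds, fit = conf_int[, "fit"], conf_width, pred_width))
```

Both calls return the same fitted values. The prediction intervals are wider because they include the residual scatter of individual observations on top of the uncertainty in the fitted line, which is why the dashed red lines sit outside the shaded band.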