How to build a correlation matrix in R?

This recipe helps you build a correlation matrix in R
Last Updated: 21 Jul 2022

Get access to Data Science projects View all Data Science projects

MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET ALL TAGS

Recipe Objective

A correlation matrix is a "square" table which consists of correlation coefficients for a set of variables. They are mainly used to determine relationships between the variables.

There are three main applications of correlation matrix:

To explore patterns in a large dataset by summarising it in a form of a table.
Used as an input for exploratory data analysis, structural equation models and confirmatory factor analysis.
Used as a diagnostic step for checking different analysis. For example, a high correlation coefficients indicates that linear regression is unreliable.

This recipe demonstrates how to build a correlation matrix.

STEP 1: Loading required library and dataset

Dataset description: It is the basic data about the customers going to the supermarket mall. The variable that we interested in is Annual.Income which is in 1000s , Spending Score and Age

# Data manipulation package library(dplyr) library(tidyverse) # reading a dataset customer_seg = read.csv('Mall_Customers.csv') # selecting the required variables using the select() function customer_seg_var = select(customer_seg, Age, Annual.Income..k..,Spending.Score..1.100.) # summary of the selected variables glimpse(customer_seg_var)

Observations: 200
Variables: 3
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, 35…
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, 19…
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99, 1…

STEP 2: Building a correlation matrix

We use cor() function to create a correlation matrix.

Syntax: corr(x, method = )

where:

x = dataframe as input
method = An arguement which provides us to input a method of calculation in the form of vector. The default is Pearson's

customer_seg_var.cor = cor(customer_seg_var) customer_seg_var.cor

	Age	Annual.Income..k..	Spending.Score..1.100.
Age	1.00000000 	-0.012398043	-0.327226846
Annual.Income..k..	-0.01239804 	1.000000000	0.009902848
Spending.Score..1.100.	-0.32722685 	0.009902848	1.000000000

Note: The Diagonal elements in the matrix is 1.0 as this is the correlation coefficient of the same variable and the coefficients ranges of -1 to 1

What Users are saying..

Ameeruddin Mohammed

ETL (Abintio) developer at IBM

I come from a background in Marketing and Analytics and when I developed an interest in Machine Learning algorithms, I did multiple in-class courses from reputed institutions though I got good... Read More

Relevant Projects

Machine Learning Projects

Data Science Projects

Python Projects for Data Science

Data Science Projects in R

Machine Learning Projects for Beginners

Deep Learning Projects

Neural Network Projects

Tensorflow Projects

NLP Projects

Kaggle Projects

IoT Projects

Big Data Projects

Hadoop Real-Time Projects Examples

Spark Projects

Data Analytics Projects for Students

Relevant Projects

Recommender System Machine Learning Project for Beginners-1

Recommender System Machine Learning Project for Beginners - Learn how to design, implement and train a rule-based recommender system in Python

View Project Details

Time Series Python Project using Greykite and Neural Prophet

In this time series project, you will forecast Walmart sales over time using the powerful, fast, and flexible time series forecasting library Greykite that helps automate time series problems.

View Project Details

Build Piecewise and Spline Regression Models in Python

In this Regression Project, you will learn how to build a piecewise and spline regression model from scratch in Python to predict the points scored by a sports team.

View Project Details

Build Portfolio Optimization Machine Learning Models in R

Machine Learning Project for Financial Risk Modelling and Portfolio Optimization with R- Build a machine learning model in R to develop a strategy for building a portfolio for maximized returns.

View Project Details

PyCaret Project to Build and Deploy an ML App using Streamlit

In this PyCaret Project, you will build a customer segmentation model with PyCaret and deploy the machine learning application using Streamlit.

View Project Details

A/B Testing Approach for Comparing Performance of ML Models

The objective of this project is to compare the performance of BERT and DistilBERT models for building an efficient Question and Answering system. Using A/B testing approach, we explore the effectiveness and efficiency of both models and determine which one is better suited for Q&A tasks.

View Project Details

How to build a correlation matrix in R?

Recipe Objective

STEP 1: Loading required library and dataset

STEP 2: Building a correlation matrix

Ameeruddin Mohammed

Relevant Projects

You might also like

Relevant Projects