How to create a heatmap in R?
This recipe helps you create a heatmap in R

Recipe Objective

A correlation matrix is a square table of correlation coefficients for a set of variables. It is mainly used to determine the relationships between those variables.

There are three main applications of a correlation matrix:

1. To explore patterns in a large dataset by summarising it in the form of a table.
2. As an input for exploratory data analysis, structural equation models and confirmatory factor analysis.
3. As a diagnostic step when checking other analyses. For example, high correlation coefficients between predictor variables indicate multicollinearity, which makes the estimates of a linear regression unreliable.
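
As a rough sketch of the diagnostic use in point 3, the snippet below flags pairs of variables whose absolute correlation exceeds a cutoff. It uses R's built-in mtcars dataset instead of the mall data so that it runs without any file, and the cutoff of 0.8 is a hypothetical value chosen only for illustration.

```r
# Illustrative only: built-in mtcars stands in for a real set of predictors
predictors = mtcars[, c("disp", "hp", "wt", "qsec")]
m = cor(predictors)

# Hypothetical cutoff; pick one that suits the analysis at hand
threshold = 0.8

# Look only above the diagonal so that each pair is reported once
high = which(abs(m) > threshold & upper.tri(m), arr.ind = TRUE)

# Collect the flagged pairs and their coefficients in a small table
flagged = data.frame(
  var1 = rownames(m)[high[, "row"]],
  var2 = colnames(m)[high[, "col"]],
  r    = m[high]
)
print(flagged)
```

Pairs listed in flagged are candidates for closer inspection before both are used as predictors in the same regression.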

The most commonly used visualisation of a correlation matrix is a heatmap, which shows the magnitude of each coefficient as a shade of color.

This recipe demonstrates how to build a heatmap of a correlation matrix.

STEP 1: Loading the required libraries and dataset

Dataset description: The dataset contains basic data about customers visiting a supermarket mall. The variables we are interested in are Annual Income (in 1000s), Spending Score and Age.

```
# Data manipulation packages
library(dplyr)
library(tidyverse)

# reading a dataset
customer_seg = read.csv('Mall_Customers.csv')

# selecting the required variables using the select() function
customer_seg_var = select(customer_seg, Age, Annual.Income..k.., Spending.Score..1.100.)

# summary of the selected variables
glimpse(customer_seg_var)
```
```
Observations: 200
Variables: 3
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, 35…
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, 19…
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99, 1…
```

STEP 2: Building a correlation matrix

We use the cor() function to create a correlation matrix.

Syntax: cor(x, method = )

where:

1. x = a numeric data frame (or matrix) as input
2. method = a character string naming the correlation coefficient to compute: "pearson" (the default), "kendall" or "spearman"
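
To illustrate the method argument, the sketch below computes both Pearson and Spearman coefficients for the same columns. It uses R's built-in mtcars dataset rather than the mall data, purely as an assumption so that it runs without the CSV file.

```r
# Illustrative only: built-in mtcars stands in for the mall dataset
vars = mtcars[, c("mpg", "hp", "wt")]

# Pearson (the default) measures linear association
pearson_cor = cor(vars, method = "pearson")

# Spearman is rank-based and measures monotonic association
spearman_cor = cor(vars, method = "spearman")

print(round(pearson_cor, 3))
print(round(spearman_cor, 3))
```

Omitting method gives the same result as method = "pearson".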
```
customer_seg_var.cor = cor(customer_seg_var)
customer_seg_var.cor
```
```
                               Age Annual.Income..k.. Spending.Score..1.100.
Age                     1.00000000       -0.012398043           -0.327226846
Annual.Income..k..     -0.01239804        1.000000000            0.009902848
Spending.Score..1.100. -0.32722685        0.009902848            1.000000000
```

Note: The diagonal elements of the matrix are 1.0 because each is the correlation of a variable with itself. All coefficients range from -1 to 1.
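
The properties in the note can be checked directly. The following is a minimal sketch that again assumes the built-in mtcars dataset as a stand-in for the mall data.

```r
# Illustrative only: built-in mtcars stands in for the mall dataset
m = cor(mtcars[, c("mpg", "hp", "wt")])

# Each variable correlates perfectly with itself
all_diag_one = isTRUE(all.equal(unname(diag(m)), rep(1, ncol(m))))

# cor(a, b) equals cor(b, a), so the matrix is symmetric
is_symmetric = isSymmetric(m)

# Every coefficient lies in [-1, 1] (small tolerance for floating point)
in_range = all(abs(m) <= 1 + 1e-12)

print(c(diag = all_diag_one, symmetric = is_symmetric, range = in_range))
```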

STEP 3: Building a heatmap of correlation matrix

We use the heatmap() function in R to carry out this task.

Syntax: heatmap(x, col = , symm = )

where:
1. x = numeric matrix of the values to be plotted
2. col = vector of colors used to represent the magnitude of the correlation coefficients
3. symm = logical; if TRUE, x is treated symmetrically (only valid when x is a square matrix)
```
# we have used the default colour scheme
heatmap(customer_seg_var.cor, symm = TRUE)
```
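
As a sketch of how the col argument might be used, the example below builds a diverging palette with colorRampPalette() and passes it to heatmap(). It assumes the built-in mtcars dataset as a stand-in for the mall data. The palette colors, the number of shades and the margins are illustrative choices, not part of the original recipe.

```r
# Illustrative only: built-in mtcars stands in for the mall dataset
m = cor(mtcars[, c("mpg", "hp", "wt")])

# Diverging palette: blue for negative, white in the middle, red for positive.
# An odd number of shades keeps a single middle color.
palette_fn = colorRampPalette(c("blue", "white", "red"))
heat_colors = palette_fn(21)

# Rowv/Colv = NA keep the original variable order and suppress the dendrograms.
# zlim is passed through to image() so the palette spans the full -1 to 1 range.
heatmap(m, col = heat_colors, symm = TRUE, Rowv = NA, Colv = NA,
        zlim = c(-1, 1), margins = c(8, 8))
```

Fixing the color range at -1 to 1 keeps the middle shade aligned with a correlation of zero, which makes heatmaps of different matrices comparable.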

