How to find correlations among feature variables in R?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to find correlations among feature variables in R?

How to find correlations among feature variables in R?

This recipe helps you correlate the fields in a dataset to determine the relationship between them and pick the right features for modelling.

0

This recipe uses the cor (), cov (), rcorr() packages in R to establish relationships between the features. It then outputs a correlation matrix.

What is a Feature variable ?
A feature variable refers to the fields in a dataset used for analytics or machine learning. Feature selection, also known as variable selection is the process of selecting a subset of relevant features (variables) from the dataset for in model construction.

What is R ?
R is a programming language used for statistics and data science computing. R has very powerful libraries (almost 12,000) for performing data analytics including regression, classification, visualisation etc.

In [ ]:
# -------------------------------------------------
# How to find correlations among feature variables in R
# -------------------------------------------------
# load library and data
library(mlbench)
library(Hmisc)

data(mtcars)
dim(mtcars)

par(mfrow=c(1,1))

# Correlations/covariances among numeric variables 
# Use listwise deletion of missing data. 
cor(mtcars, use="complete.obs", method="kendall") 
cov(mtcars, use="complete.obs")

# Correlations with significance levels
rcorr(as.matrix(mtcars), type="pearson")

# Correlation matrix from mtcars
# with mpg, cyl, and disp as rows 
# and hp, drat, and wt as columns 
x <- mtcars[1:3]
y <- mtcars[4:6]
cor(x, y)

Relevant Projects

Loan Eligibility Prediction using Gradient Boosting Classifier
This data science in python project predicts if a loan should be given to an applicant or not. We predict if the customer is eligible for loan based on several factors like credit score and past history.

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

Walmart Sales Forecasting Data Science Project
Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

Machine Learning project for Retail Price Optimization
In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.

Mercari Price Suggestion Challenge Data Science Project
Data Science Project in Python- Build a machine learning algorithm that automatically suggests the right product prices.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Human Activity Recognition Using Smartphones Data Set
In this deep learning project, you will build a classification system where to precisely identify human fitness activities.