How to determine Pearsons correlation in Python?
DATA MUNGING DATA CLEANING PYTHON MACHINE LEARNING RECIPES PANDAS CHEATSHEET     ALL TAGS

# How to determine Pearsons correlation in Python?

This recipe helps you determine Pearsons correlation in Python

## Recipe Objective

Pearson"s correlation is very important statical data that we need many times. We can calculate it manually but it takes time.

So this is the recipe on how we can determine Pearson"s correlation in Python

## Step 1 - Importing Library

``` import matplotlib.pyplot as plt import statistics as stats import pandas as pd import random import seaborn as sns ```

We have imported stats, seaborn and pandas which is needed.

## Step 2 - Creating a dataframe

We have created a empty dataframe and then added rows to it with random numbers. ``` df = pd.DataFrame() df["x"] = random.sample(range(1, 100), 75) df["y"] = random.sample(range(1, 100), 75) print(); print(df.head()) ```

## Step 3 - Calculating Pearsons correlation coefficient

We hawe defined a function with differnt steps that we will see.

• We have calculated mean and standard deviation of x and length of x
• ``` def pearson(x,y): n = len(x) standard_score_x = []; standard_score_y = []; mean_x = stats.mean(x) standard_deviation_x = stats.stdev(x) ```
• We atre calculating mean and standard deviation of y
• ``` mean_y = stats.mean(y) standard_deviation_y = stats.stdev(y) ```
• We are calculating standard score by dividing difference of observation and mean with standard deviation. We have done this for both X and Y
• ``` for observation in x: standard_score_x.append((observation - mean_x)/standard_deviation_x) for observation in y: standard_score_y.append((observation - mean_y)/standard_deviation_y) return (sum([i*j for i,j in zip(standard_score_x, standard_score_y)]))/(n-1) ```

## Printing the Results

``` result = pearson(df.x, df.y) print() print("Pearson"s correlation coefficient is: ", result) sns.lmplot("x", "y", data=df, fit_reg=True) plt.show() ```

```    x   y
0  96  62
1   1  81
2  27  73
3  55  26
4  83  93

Pearson"s correlation coefficient is:  -0.006387074440361877
```

#### Relevant Projects

##### Loan Eligibility Prediction in Python using H2O.ai
In this loan prediction project you will build predictive models in Python using H2O.ai to predict if an applicant is able to repay the loan or not.

##### Ecommerce product reviews - Pairwise ranking and sentiment analysis
This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.

##### Digit Recognition using CNN for MNIST Dataset in Python
In this deep learning project, you will build a convolutional neural network using MNIST dataset for handwritten digit recognition.

##### Avocado Machine Learning Project Python for Price Prediction
In this ML Project, you will use the Avocado dataset to build a machine learning model to predict the average price of avocado which is continuous in nature based on region and varieties of avocado.

##### Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

##### Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

##### German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

##### Locality Sensitive Hashing Python Code for Look-Alike Modelling
In this deep learning project, you will find similar images (lookalikes) using deep learning and locality sensitive hashing to find customers who are most likely to click on an ad.

##### Machine Learning or Predictive Models in IoT - Energy Prediction Use Case
In this machine learning and IoT project, we are going to test out the experimental data using various predictive models and train the models and break the energy usage.

##### Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.