# How to determine Pearson's correlation in Python?

This recipe helps you determine Pearson's correlation in Python.

Pearson's correlation coefficient is a statistic we need often: it measures the strength and direction of the linear relationship between two variables on a scale from -1 to 1. We can calculate it by hand, but that takes time.

So this is a recipe for determining Pearson's correlation in Python.

```
import matplotlib.pyplot as plt
import statistics as stats
import pandas as pd
import random
import seaborn as sns
```

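We have imported statistics (as stats), pandas, random, seaborn and matplotlib, all of which are needed.

We have created an empty dataframe and then added two columns, x and y, each holding 75 random integers drawn without repetition from 1 to 99.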
```
df = pd.DataFrame()
df["x"] = random.sample(range(1, 100), 75)
df["y"] = random.sample(range(1, 100), 75)
print(); print(df.head())
```
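
Because the numbers are random, every run produces a different dataframe and a different coefficient. One way to make runs repeatable is to seed the generator first. The following is a minimal sketch only: the helper name `make_columns` and the seed value 42 are assumptions made for illustration and are not part of the recipe.

```python
import random


def make_columns(seed):
    """Hypothetical helper: return two repeatable lists of 75 distinct integers."""
    random.seed(seed)  # assumption: any fixed integer gives a repeatable run
    x = random.sample(range(1, 100), 75)  # 75 distinct values from 1 to 99
    y = random.sample(range(1, 100), 75)
    return x, y


x, y = make_columns(42)
x_again, y_again = make_columns(42)
print(x == x_again and y == y_again)  # same seed, same columns
print(len(x), len(set(x)))            # no value repeats within a column
```

The two lists can then be assigned to `df["x"]` and `df["y"]` in the same way as in the recipe.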

We have defined a function that works through the following steps.

- We calculate the length of x, and the mean and standard deviation of x
- We calculate the mean and standard deviation of y
- We calculate the standard score of each observation by dividing the difference between the observation and the mean by the standard deviation. We do this for both x and y
- We sum the products of the paired standard scores and divide by n - 1
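
Written out, the steps above correspond to the formula below. The symbols are introduced here for illustration: $\bar{x}$ and $\bar{y}$ are the means, $s_x$ and $s_y$ are the sample standard deviations (the quantity `stats.stdev` returns, with an n - 1 denominator), and $r$ is the coefficient.

```latex
z_{x_i} = \frac{x_i - \bar{x}}{s_x}, \qquad
z_{y_i} = \frac{y_i - \bar{y}}{s_y}, \qquad
r = \frac{1}{n - 1} \sum_{i=1}^{n} z_{x_i} \, z_{y_i}
```

Dividing by n - 1 matches the n - 1 denominator used inside the sample standard deviations, which is what keeps $r$ between -1 and 1.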

```
def pearson(x, y):
    # Number of paired observations
    n = len(x)
    standard_score_x = []
    standard_score_y = []

    # Mean and sample standard deviation of x
    mean_x = stats.mean(x)
    standard_deviation_x = stats.stdev(x)

    # Mean and sample standard deviation of y
    mean_y = stats.mean(y)
    standard_deviation_y = stats.stdev(y)

    # Standard score of every observation in x and in y
    for observation in x:
        standard_score_x.append((observation - mean_x) / standard_deviation_x)
    for observation in y:
        standard_score_y.append((observation - mean_y) / standard_deviation_y)

    # Sum of the products of paired standard scores, divided by n - 1
    return sum(i * j for i, j in zip(standard_score_x, standard_score_y)) / (n - 1)
```
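
A hand-written function like this is easy to sanity-check against a library routine. The sketch below assumes NumPy is installed (it is not imported anywhere else in this recipe) and compares a compact restatement of the same calculation with `numpy.corrcoef`. The fixed seed is an assumption, used only so the check is repeatable.

```python
import math
import random
import statistics as stats

import numpy as np


def pearson(x, y):
    """Sum of products of paired standard scores, divided by n - 1."""
    n = len(x)
    mean_x, mean_y = stats.mean(x), stats.mean(y)
    sd_x, sd_y = stats.stdev(x), stats.stdev(y)
    z_x = [(v - mean_x) / sd_x for v in x]
    z_y = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)


random.seed(0)  # assumption: fixed seed, only so the check is repeatable
x = random.sample(range(1, 100), 75)
y = random.sample(range(1, 100), 75)

manual = pearson(x, y)
reference = float(np.corrcoef(x, y)[0, 1])  # off-diagonal entry is r(x, y)

print(manual, reference)
print(math.isclose(manual, reference, rel_tol=1e-9, abs_tol=1e-12))
```

With a dataframe, `df["x"].corr(df["y"])` is another quick cross-check, since pandas computes Pearson's coefficient by default.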

```
result = pearson(df.x, df.y)
print()
print("Pearson's correlation coefficient is: ", result)
# Scatter plot of x against y with a fitted regression line
sns.lmplot(x="x", y="y", data=df, fit_reg=True)
plt.show()
```

Output from one run (the data are random, so the values differ between runs):

        x   y
    0  96  62
    1   1  81
    2  27  73
    3  55  26
    4  83  93

    Pearson's correlation coefficient is:  -0.006387074440361877
