# How to determine Pearson's correlation in Python?

This recipe shows how to compute Pearson's correlation coefficient in Python from standard scores, using a small randomly generated dataset.
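For reference, the quantity computed by the snippet below can be written as the sum of the paired standard-score products divided by n - 1. The symbols are notation introduced for this note and are not names taken from the code: r is the coefficient, x̄ and ȳ are the means, and s_x and s_y are the sample standard deviations (the n - 1 form that `statistics.stdev` returns).

$$
r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)
$$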

In [2]:

```
import warnings
import random
import statistics as stats

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

warnings.filterwarnings("ignore")


def pearson(x, y):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    # Number of observations in the data
    n = len(x)
    # Mean and sample standard deviation of each variable
    mean_x, mean_y = stats.mean(x), stats.mean(y)
    standard_deviation_x = stats.stdev(x)
    standard_deviation_y = stats.stdev(y)
    # Standard score (z-score) of every observation
    standard_score_x = [(obs - mean_x) / standard_deviation_x for obs in x]
    standard_score_y = [(obs - mean_y) / standard_deviation_y for obs in y]
    # Multiply the paired standard scores, sum them, then divide by n - 1
    return sum(i * j for i, j in zip(standard_score_x, standard_score_y)) / (n - 1)


def Snippet_120():
    print()
    print(format("How to determine Pearson's correlation in Python", '*^82'))
    # Create a dataframe with two columns of 75 distinct random integers from 1 to 99
    df = pd.DataFrame()
    df['x'] = random.sample(range(1, 100), 75)
    df['y'] = random.sample(range(1, 100), 75)
    # View first few rows of data
    print(); print(df.head())
    # Pass plain Python lists so the statistics module works on built-in ints
    result = pearson(df['x'].tolist(), df['y'].tolist())
    print()
    print("Pearson's correlation coefficient is: ", result)
    # Scatter plot with a fitted regression line (x and y passed as keywords)
    sns.lmplot(x='x', y='y', data=df, fit_reg=True)
    plt.show()


Snippet_120()
```
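One way to sanity-check a hand-written coefficient is to compute it a second way and compare. The sketch below is not part of the original recipe. It uses only the standard library, and the function names `pearson_from_z_scores` and `pearson_from_covariance`, the fixed seed, and the `2x + 1` test series are all assumptions chosen for illustration. It compares the standard-score form against the equivalent covariance form, and checks that an exact linear relationship gives a coefficient of 1.

```python
import math
import random
import statistics as stats


def pearson_from_z_scores(x, y):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    n = len(x)
    mean_x, mean_y = stats.mean(x), stats.mean(y)
    sd_x, sd_y = stats.stdev(x), stats.stdev(y)
    z_products = (
        ((a - mean_x) / sd_x) * ((b - mean_y) / sd_y) for a, b in zip(x, y)
    )
    return sum(z_products) / (n - 1)


def pearson_from_covariance(x, y):
    """Pearson's r as the sample covariance over both sample standard deviations."""
    n = len(x)
    mean_x, mean_y = stats.mean(x), stats.mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return covariance / (stats.stdev(x) * stats.stdev(y))


rng = random.Random(0)  # fixed seed (an arbitrary choice) so the check is repeatable
x = rng.sample(range(1, 100), 75)
y = rng.sample(range(1, 100), 75)
perfect = [2 * value + 1 for value in x]  # exact linear function of x

r_xy = pearson_from_z_scores(x, y)
r_cov = pearson_from_covariance(x, y)
r_perfect = pearson_from_z_scores(x, perfect)

print("z-score form:    ", r_xy)
print("covariance form: ", r_cov)
print("forms agree:     ", math.isclose(r_xy, r_cov, rel_tol=1e-9, abs_tol=1e-12))
print("x vs 2x + 1:     ", r_perfect)
```

If the two forms disagree beyond floating-point noise, the usual suspects are mixing population and sample standard deviations or dividing by n in one place and n - 1 in another.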
