MACHINE LEARNING RECIPES
DATA CLEANING PYTHON
DATA MUNGING
PANDAS CHEATSHEET
ALL TAGS
# What is box cox transformation?

# What is box cox transformation?

This recipe explains what is box cox transformation

Transformation of any power-law or any non-linear distribution to normal distribution is generally carried on by Box-Cox Transformation. A Box cox transformation is defined as a way to transform non-normal dependent variables in our data to a normal shape.

So this recipe is a short example on what is box cox transformation. Let's get started.

```
import numpy as np
from scipy.stats import boxcox
import seaborn as sns
import matplotlib.pyplot as plt
```

Let's pause and look at these imports. Numpy is general one. boxcox will help in normalizing dataset. sns and plt are used for plotting of dataset.

```
original_data = np.random.exponential(size = 1000)
```

We have set here an exponential function for normalization.

Now our dataset is ready.

```
fitted_data, fitted_lambda = boxcox(original_data)
```

We have fitted our data usin boxcox into normal function and found the lamda used for the transformation.

```
fig, ax = plt.subplots(1, 2)
sns.distplot(original_data, hist = False, kde = True, kde_kws = {'shade': True, 'linewidth': 2}, label = "Non-Normal", color ="green", ax = ax[0])
sns.distplot(fitted_data, hist = False, kde = True, kde_kws = {'shade': True, 'linewidth': 2}, label = "Normal", color ="green", ax = ax[1])
plt.legend(loc = "upper right")
fig.set_figheight(5)
fig.set_figwidth(10)
```

We have simply used sns class to plot for original as well as fitted dataset.

Once we run the above code snippet, we will see:

Srcoll down the ipython file to visualize the results.

In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.

In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

In this project, we are going to talk about Time Series Forecasting to predict the electricity requirement for a particular house using Prophet.

In this project, we are going to work on Deep Learning using H2O to predict Census income.

Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.