Data Science Project in Python on BigMart Sales Prediction

The goal of this data science project is to build a predictive model that estimates the sales of each product at a given BigMart store.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with iPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What you will learn

Understanding the problem statement
Importing the dataset and performing basic EDA
Checking for null values and describing the variables
Imputing null values using pivot tables
Feature engineering: creating new features
Using seaborn to understand how categorical values contribute to the target variable
Using box plots to identify outliers
Encoding categorical variables with label and one-hot encoding
Applying linear and Bayesian regression models
Applying bagging ensemble models such as Random Forest and Bagging
Applying boosting models such as Gradient Boosting Tree and XGBoost
Applying the neural network model MLPRegressor
Writing a function to spot-check models and select the best ones for hyperparameter tuning
Defining a function for hyperparameter tuning
Standardization and its effect
Understanding Robust Scaler and normalization
Implementing Robust Scaler and normalization
Choosing the final model and predicting on the test dataset
Saving the model using Joblib
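
The spot-checking step in the list above can be sketched roughly as follows. This is a minimal illustration and not the project's own code: it assumes scikit-learn is installed, uses synthetic data from make_regression in place of the BigMart dataset, and the helper name spot_check is hypothetical. XGBoost and MLPRegressor are left out only to keep the example short.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import BayesianRidge, LinearRegression
from sklearn.model_selection import cross_val_score


def spot_check(models, X, y, cv=5):
    """Return {name: mean cross-validated RMSE}, ordered from best to worst."""
    results = {}
    for name, model in models.items():
        # scikit-learn reports errors as negative scores, so flip the sign.
        scores = cross_val_score(
            model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        results[name] = -scores.mean()
    return dict(sorted(results.items(), key=lambda kv: kv[1]))


# Synthetic stand-in for the prepared training data (an assumption).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

models = {
    "linear": LinearRegression(),
    "bayesian_ridge": BayesianRidge(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

ranking = spot_check(models, X, y)
for name, rmse in ranking.items():
    print(f"{name}: RMSE = {rmse:.2f}")
```

The models at the top of the ranking would then be the natural candidates for hyperparameter tuning.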

Project Description

The data scientists at BigMart have collected 2013 sales data for 1559 products across 10 stores in different cities. Certain attributes of each product and store are also defined. The aim of this data science project is to build a predictive model that estimates the sales of each product at a particular store.

Using this model, BigMart will try to understand which properties of products and stores play a key role in increasing sales.

The data has missing values because some stores do not report all their data due to technical glitches, so these values must be treated (for example, imputed) before modeling.
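
One way to treat such gaps is to impute them from a pivot table. The sketch below assumes pandas and NumPy are available; the toy data and the column names (item_id, store_id, item_weight) are made up for illustration and are not taken from the project files.

```python
import numpy as np
import pandas as pd

# Toy data; column names are assumptions, not the real dataset's.
df = pd.DataFrame({
    "item_id": ["A", "A", "B", "B", "C"],
    "store_id": ["S1", "S2", "S1", "S2", "S1"],
    "item_weight": [9.3, np.nan, 5.9, 5.9, np.nan],
})

# Pivot table: mean weight per item, from the rows that do report it.
weight_by_item = df.pivot_table(
    values="item_weight", index="item_id", aggfunc="mean"
)

# Look up each row's item in the pivot table and fill only the missing weights.
lookup = df["item_id"].map(weight_by_item["item_weight"])
df["item_weight"] = df["item_weight"].fillna(lookup)

# An item with no reported weight anywhere (item "C") is still missing,
# so fall back to the overall mean for those rows.
df["item_weight"] = df["item_weight"].fillna(df["item_weight"].mean())

print(df)
```

Filling from the same item's other rows keeps the imputed value specific to the product, and the overall mean is only a last resort.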

Similar Projects

Build a predictive model to correctly classify products between 9 product categories (fashion, electronics, etc.) using the Otto Group dataset.

In this project, we discuss insurance forecasting using regression techniques.

In this machine learning project, you will build a model to predict the amount a customer spends on various products, which will help the company create personalized offers for customers across different products.

Curriculum For This Mini Project

The Business Problem
Exploring The Dataset
Exploratory Data Analysis (EDA) - Outliers
Exploratory Data Analysis (EDA) - Graphs
Converting Categorical To Numerical
Separating Training And Test Data
Running The Models
Hyperparameter Tuning XGB And GBR
Standard Scaling
Robust Scaling
Final Predictions On The Test Dataset
Saving The Final Model
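
The last steps of the curriculum (scaling, final predictions, saving the model) might look roughly like the sketch below. It assumes scikit-learn and joblib are installed and uses synthetic data. The choice of RobustScaler with GradientBoostingRegressor and the file name bigmart_model.joblib are illustrative assumptions, not the project's confirmed final setup.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in for the prepared BigMart features and sales target.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Bundling the scaler with the model means the saved file applies the same
# scaling at prediction time as it did during training.
model = make_pipeline(RobustScaler(), GradientBoostingRegressor(random_state=0))
model.fit(X_train, y_train)

# Save the fitted pipeline to a temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "bigmart_model.joblib")
joblib.dump(model, path)

# Reload it and confirm the predictions on the held-out data are unchanged.
restored = joblib.load(path)
same = np.allclose(model.predict(X_test), restored.predict(X_test))
print("Predictions match after reload:", same)
```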