HANDS-ON-LAB

Walmart Sales Forecasting Data Science Project

Problem Statement

Build an LSTM forecasting model for Walmart sales data and deploy it as an endpoint using AWS SageMaker.

Dataset

Three datasets are provided:

    • Features.csv => This file contains additional data related to the store, department, and regional activity for the given dates

    • Stores.csv => This file contains anonymized information about the 45 stores, indicating the type and size of the store

    • Train.csv => This is the historical training data, which covers 2010-02-05 to 2012-11-01

 

Kindly download the data from here.
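As a rough sketch of how the three files fit together, the snippet below joins small synthetic stand-ins for them with pandas. The column names (Store, Dept, Date, Weekly_Sales, IsHoliday, Temperature, Type, Size) are assumptions and are not specified in this brief, and the helper name build_final_dataset is hypothetical; check the actual CSV headers before reusing it. The values are made up for illustration.

```python
import pandas as pd


def build_final_dataset(train, features, stores):
    """Join the three tables, keeping one row per training record."""
    # Assumption: IsHoliday appears in both train and features; keep the train copy.
    features = features.drop(columns=["IsHoliday"], errors="ignore")
    # Assumption: features has one row per (Store, Date) and stores one row per Store.
    merged = train.merge(features, on=["Store", "Date"], how="left",
                         validate="many_to_one")
    merged = merged.merge(stores, on="Store", how="left", validate="many_to_one")
    return merged.sort_values(["Store", "Dept", "Date"]).reset_index(drop=True)


# Synthetic stand-ins for Train.csv, Features.csv and Stores.csv (made-up values).
train = pd.DataFrame({
    "Store": [1, 1, 2],
    "Dept": [1, 1, 1],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12", "2010-02-05"]),
    "Weekly_Sales": [1000.0, 1500.0, 800.0],
    "IsHoliday": [False, True, False],
})
features = pd.DataFrame({
    "Store": [1, 1, 2],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12", "2010-02-05"]),
    "Temperature": [40.0, 38.0, 45.0],
    "IsHoliday": [False, True, False],
})
stores = pd.DataFrame({"Store": [1, 2], "Type": ["A", "B"], "Size": [150000, 90000]})

final_df = build_final_dataset(train, features, stores)
print(final_df)
```

With the real files, the three frames would come from pd.read_csv with the Date column parsed as a datetime. The validate argument makes the merge fail loudly if a supposedly unique key turns out to be duplicated.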

Tasks

  1. Create a final dataset by joining the three datasets on their shared key columns.

  2. Hypothesis-based EDA:

    • Do weekends and weekdays have different average sales across stores?

    • Plot a correlation heatmap to identify the features most strongly correlated with the target sales.

    • Do promotional markdowns affect sales?

  3. Preprocess the data and engineer features to create the final model-ready dataset.

  4. Build an LSTM model using AWS SageMaker and evaluate it on a validation dataset held out from the training data.

  5. Deploy the trained model as an endpoint by uploading the model artifacts to an S3 bucket.
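One common way to make a sales series model-ready for an LSTM is to cut it into fixed-length lookback windows and hold out the most recent windows for validation. The sketch below illustrates that idea with NumPy only; the helper names make_windows and chronological_split, the lookback of 3, and the 20% validation fraction are all illustrative assumptions, not requirements of this lab.

```python
import numpy as np


def make_windows(series, lookback):
    """Turn a 1-D series into (samples, lookback, 1) inputs and next-step targets."""
    series = np.asarray(series, dtype=np.float32)
    n_samples = len(series) - lookback
    if n_samples <= 0:
        raise ValueError("series must be longer than lookback")
    X = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = series[lookback:]
    # Keras LSTM layers expect input shaped (batch, timesteps, features).
    return X[..., np.newaxis], y


def chronological_split(X, y, val_fraction=0.2):
    """Hold out the latest windows for validation, without shuffling."""
    split = int(len(X) * (1 - val_fraction))
    return (X[:split], y[:split]), (X[split:], y[split:])


# Made-up series standing in for one store/department's sales history.
sales = np.arange(10, dtype=np.float32)
X, y = make_windows(sales, lookback=3)
(train_X, train_y), (val_X, val_y) = chronological_split(X, y, val_fraction=0.2)
print(X.shape, train_X.shape, val_X.shape)
```

Splitting by time rather than at random keeps future observations out of the training set, which is usually what a forecasting validation set is meant to test. In practice the series would also be scaled, with the scaler fitted on the training portion only.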

 

Leverage the power of AWS SageMaker to deploy and scale your LSTM model for accurate sales predictions.

 

FAQs

Q1. Do weekends and weekdays have different average sales across stores?

This can be checked by grouping sales by store and by whether the date falls on a weekend or a weekday, then comparing the group averages to see whether the difference is significant.
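A minimal sketch of that comparison follows, assuming a datetime Date column, a Weekly_Sales target and a Store identifier (all assumed names), with made-up values. If every record falls on the same day of the week, as can happen with weekly aggregated data, one of the two groups will be empty and the hypothesis cannot be tested this way.

```python
import pandas as pd


def weekend_vs_weekday_sales(df, date_col="Date", sales_col="Weekly_Sales",
                             store_col="Store"):
    """Mean sales per store, split into weekday and weekend records."""
    # dayofweek: Monday=0 ... Sunday=6, so 5 and 6 are the weekend.
    day_type = df[date_col].dt.dayofweek.ge(5).map({True: "weekend", False: "weekday"})
    return (
        df.assign(day_type=day_type)
          .groupby([store_col, "day_type"])[sales_col]
          .mean()
          .unstack("day_type")
    )


# Made-up records: 2010-02-05 is a Friday, 02-06 a Saturday, 02-07 a Sunday.
demo = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-06", "2010-02-07",
                            "2010-02-05", "2010-02-06", "2010-02-08"]),
    "Weekly_Sales": [100.0, 200.0, 300.0, 50.0, 70.0, 90.0],
})
summary = weekend_vs_weekday_sales(demo)
print(summary)
```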

 

Q2. What are the highly correlated features with the target sales?

A correlation heatmap of the numeric columns shows which features have a strong positive or negative correlation with the target sales variable.
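The sketch below shows one way to compute the correlation matrix, rank features by the strength of their correlation with the target, and draw the heatmap with matplotlib. The column names and values are assumptions for illustration, and the helper names are hypothetical. Only numeric columns are included, so categorical or boolean columns would need encoding first.

```python
import pandas as pd


def correlation_with_target(df, target="Weekly_Sales"):
    """Return the full Pearson correlation matrix and features ranked by |corr| with target."""
    corr = df.select_dtypes(include="number").corr()
    ranked = corr[target].drop(target).sort_values(key=abs, ascending=False)
    return corr, ranked


def plot_correlation_heatmap(corr):
    """Draw the correlation matrix as a heatmap (matplotlib is imported lazily)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(6, 5))
    image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=90)
    ax.set_yticks(range(len(corr.index)))
    ax.set_yticklabels(corr.index)
    fig.colorbar(image, ax=ax, label="Pearson correlation")
    fig.tight_layout()
    return fig


# Made-up values chosen so the correlations are easy to verify by hand.
demo = pd.DataFrame({
    "Weekly_Sales": [10.0, 20.0, 30.0, 40.0],
    "Size": [1.0, 2.0, 3.0, 4.0],
    "Temperature": [4.0, 3.0, 2.0, 1.0],
    "Fuel_Price": [1.0, 3.0, 2.0, 4.0],
})
corr, ranked = correlation_with_target(demo)
print(ranked)
```

Calling plot_correlation_heatmap(corr) returns a matplotlib figure that can be shown or saved. Ranking by absolute value keeps strongly negative relationships near the top alongside strongly positive ones.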

 

Q3. Do promotional markdowns affect sales?

Comparing sales for records with and without promotional markdowns shows whether the two are correlated. A correlation on its own does not establish that the markdowns caused the change in sales.
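As one possible starting point, the sketch below flags each record as having a markdown or not and compares average sales between the two groups. It assumes the markdown columns share a MarkDown prefix and that a missing or zero value means no markdown was active; both are assumptions about the data, and the values shown are made up.

```python
import pandas as pd


def markdown_effect(df, sales_col="Weekly_Sales", markdown_prefix="MarkDown"):
    """Mean sales and record count for rows with and without any active markdown."""
    markdown_cols = [c for c in df.columns if c.startswith(markdown_prefix)]
    # Assumption: NaN or 0 in every markdown column means no promotion was running.
    has_markdown = df[markdown_cols].fillna(0).gt(0).any(axis=1)
    label = has_markdown.map({True: "with_markdown", False: "no_markdown"})
    return df.assign(markdown=label).groupby("markdown")[sales_col].agg(["mean", "count"])


# Made-up records for illustration.
demo = pd.DataFrame({
    "Weekly_Sales": [100.0, 120.0, 200.0, 180.0],
    "MarkDown1": [None, 0.0, 500.0, None],
    "MarkDown2": [None, None, None, 50.0],
})
effect = markdown_effect(demo)
print(effect)
```

A gap between the two means is only suggestive, since markdowns may coincide with holidays or other seasonal effects that also move sales.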