Identifying Product Bundles from Sales Data Using R Language

In this data science project in R, we look at subjective segmentation, a clustering technique, and use it to find product bundles in sales data.

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

Customer Love

Camille St. Omer

Artificial Intelligence Researcher, Quora 'Most Viewed Writer' in 'Data Mining'

I came to the platform with no experience and now I am knowledgeable in Machine Learning with Python. No easy thing I must say, the sessions are challenging and go to the depths. I looked at graduate...

SUBHABRATA BISWAS

Lead Consultant, ITC Infotech

The project orientation is very much unique and it helps to understand the real time scenarios most of the industries are dealing with. And there is no limit, one can go through as many projects...

What will you learn

Understanding the problem statement
Importing the dataset directly from Amazon AWS
What is Clustering
Performing basic EDA and Data-preprocessing
Plotting boxplots to identify outliers
Fixing outliers using an appropriate technique
Scaling and normalizing the dataset
Partition-based, hierarchical, and model-based methods of clustering
K-means, k-median, k-mode, agglomerative, divisive, and expectation-maximization based methods
Identifying different libraries used for clustering
Steps involved in K-means clustering
Applying different methods available for clustering
Applying the silhouette score on the clustering result
Visualizing the result by plotting graphs
Picking the best model for making predictions
Calculating probability related to testing data points for different clusters
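
Several of the steps listed above (boxplot-based outlier detection, outlier fixing, scaling, K-means, and the silhouette score) can be sketched in a few lines of R. This is only a minimal illustration under stated assumptions: the data is simulated rather than the project dataset, the column names are hypothetical, capping at the boxplot fences is just one possible outlier fix, and k = 3 is picked because the toy data was generated with three groups.

```r
library(cluster)  # provides silhouette(); a recommended package bundled with most R installs

set.seed(42)

# Hypothetical stand-in data (NOT the project dataset): 60 products, 2 features,
# drawn around three group means.
sales <- rbind(
  matrix(rnorm(40, mean = 5,  sd = 1), ncol = 2),
  matrix(rnorm(40, mean = 15, sd = 1), ncol = 2),
  matrix(rnorm(40, mean = 30, sd = 1), ncol = 2)
)
colnames(sales) <- c("avg_weekly_qty", "peak_weekly_qty")
sales[60, "avg_weekly_qty"] <- 500  # inject one obvious outlier

# Boxplot rule: points more than 1.5 * IQR beyond the hinges are flagged.
outliers <- boxplot.stats(sales[, "avg_weekly_qty"])$out

# One possible fix (an assumption, not necessarily the course's choice):
# cap values at the 1.5 * IQR fences instead of dropping rows.
cap_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  pmin(pmax(x, q[1] - fence), q[2] + fence)
}
sales_capped <- apply(sales, 2, cap_outliers)

# Scale each feature to mean 0 and standard deviation 1.
sales_scaled <- scale(sales_capped)

# K-means with k = 3 and several random starts.
km <- kmeans(sales_scaled, centers = 3, nstart = 25)

# Average silhouette width: values near 1 suggest well-separated clusters.
sil <- silhouette(km$cluster, dist(sales_scaled))
avg_sil <- mean(sil[, "sil_width"])
print(avg_sil)
```

Comparing `avg_sil` across several values of `centers` is one common way to choose the number of clusters.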

Project Description

The weekly sales transaction dataset consists of weekly purchased quantities of 800 products over 52 weeks. Normalised values are provided too. The objective of this data science project in R is to find product bundles that can be put on sale together. Market Basket Analysis is the typical way to identify such bundles; here we instead assess how well time series clustering identifies them.
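
To make the idea of time series clustering concrete, here is a rough sketch in base R. Everything in it is an assumption made for illustration: the products-by-weeks matrix is simulated from three made-up demand shapes, the product IDs are hypothetical, per-product min-max normalisation stands in for the normalised values mentioned above, and Ward hierarchical clustering on Euclidean distance is only one of several methods that could be applied.

```r
set.seed(1)
weeks <- 1:52

# Three hypothetical weekly demand shapes (not taken from the real dataset).
patterns <- rbind(
  summer = 10 + 8 * sin(2 * pi * weeks / 52),
  winter = 10 - 8 * sin(2 * pi * weeks / 52),
  flat   = rep(10, 52)
)

# Simulate 30 products (10 per shape) as a products x weeks matrix of quantities.
shape <- rep(rownames(patterns), each = 10)
qty <- patterns[shape, ] + matrix(rnorm(30 * 52, sd = 1), nrow = 30)
qty <- pmax(round(qty), 0)
rownames(qty) <- sprintf("P%02d", 1:30)

# Min-max normalise each product's series so clustering compares the shape
# of demand over the year rather than its absolute volume.
normalise <- function(x) (x - min(x)) / (max(x) - min(x))
qty_norm <- t(apply(qty, 1, normalise))

# Agglomerative clustering on Euclidean distances between the weekly profiles.
hc <- hclust(dist(qty_norm), method = "ward.D2")
bundle <- cutree(hc, k = 3)

# Products in the same cluster sell in a similar weekly pattern and are
# candidates for a bundle.
print(split(rownames(qty_norm), bundle))
```

Unlike Market Basket Analysis, this groups products by how their demand moves over the year, not by whether they appear in the same transaction.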

Similar Projects

In this K-means clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Data science project in R: predict the sales for each department using historical markdown data from the Walmart dataset, which covers 45 Walmart stores.

In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in Python using ensemble machine learning algorithms.

Curriculum For This Mini Project

Introduction
03m
Installing Libraries
01m
Understand the Data Set
05m
Outliers
06m
Clustering Techniques
02m
Installing Library to implement KMeans
05m
Steps in KMeans Algorithm
11m
Implementing Kmeans
12m
Cluster Model Deployment
04m
Hierarchical Clustering - Agglomerative Method
10m
Hierarchical Clustering - Divisive Method
04m
Silhouette Score
04m
Cluster Goodness using Silhouette Score
05m
Model Based Clustering
06m
Self Organizing Maps
07m
factoextra library for Visualization
01m
Other Clustering Methods
10m
Hierarchical Clustering
03m
Hopkins Statistic
03m
Determine Optimal Number of Clusters
03m
Clustering Validation Statistics
05m
Advanced Clustering
04m
Conclusion
08m