
Perform Exploratory Data Analysis and Build a Music Recommendation Engine Using Python and R

Can you build the best music recommendation system?

What will you learn

  • Working with music data spanning several categories
  • Exploratory data analysis (EDA) using several visualization techniques
  • Building an automated recommendation engine
  • Solving this use case in both Python and R
  • Tuning parameters to improve the algorithm

What will you get

  • Access to a recording of the complete project
  • Access to all project materials, such as data files and solution files

Project Description


The 11th ACM International Conference on Web Search and Data Mining (WSDM 2018) is challenging you to build a better music recommendation system using a dataset donated by KKBOX. WSDM (pronounced "wisdom") is one of the premier conferences on web-inspired research involving search and data mining. It is committed to publishing original, high-quality papers and presentations, with an emphasis on practical but principled novel models.

The dataset comes from KKBOX, Asia’s leading music streaming service, which holds the world’s most comprehensive Asia-Pop music library with over 30 million tracks.

KKBOX currently uses a collaborative-filtering-based algorithm with matrix factorization and word embedding in its recommendation system, but believes new techniques could lead to better results.


In this task, you will be asked to predict the probability that a user listens to a song again after the first observable listening event within a time window. If one or more recurring listening events are triggered within a month after the user’s very first observable listening event, the target is marked 1 in the training set, and 0 otherwise. The same rule applies to the test set.
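The labeling rule can be illustrated with a minimal sketch. It is not KKBOX's actual procedure, and the competition data already ships with the target column. The function name `label_pairs`, the `(user_id, song_id, timestamp)` event layout, and the choice of 30 days to approximate "a month" are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumption: "within a month" is approximated as a 30-day window.
WINDOW = timedelta(days=30)

def label_pairs(events):
    """Label each user-song pair from raw listening events.

    events: iterable of (user_id, song_id, timestamp) tuples (hypothetical layout).
    Returns a dict mapping (user_id, song_id) to 1 or 0.
    """
    # Group timestamps by user-song pair.
    by_pair = {}
    for user_id, song_id, ts in events:
        by_pair.setdefault((user_id, song_id), []).append(ts)

    targets = {}
    for pair, stamps in by_pair.items():
        stamps.sort()
        first = stamps[0]
        # Target is 1 if any later event falls within the window after the first.
        repeated = any(ts - first <= WINDOW for ts in stamps[1:])
        targets[pair] = int(repeated)
    return targets

events = [
    ("u1", "s1", datetime(2017, 1, 1)),
    ("u1", "s1", datetime(2017, 1, 10)),  # repeat inside the window
    ("u1", "s2", datetime(2017, 1, 2)),   # never repeated
    ("u2", "s1", datetime(2017, 1, 3)),
    ("u2", "s1", datetime(2017, 3, 1)),   # repeated, but after the window
]
targets = label_pairs(events)
```

Here only the pair `("u1", "s1")` would receive target 1, since its second listen falls within 30 days of the first.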

KKBOX provides a training data set consisting of information about the first observable listening event for each unique user-song pair within a specific time period. Metadata for each unique user and song pair is also provided. Using public data to improve the accuracy of your predictions is encouraged.

The train and test data are selected from users' listening history in a given time period. Note that this time period is chosen to be before the WSDM-KKBox Churn Prediction time period. The train and test sets are split based on time, and the public/private split is based on unique user/song pairs.
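Because the official split is time-based, a local validation set built the same way tends to be more representative than a random one. The sketch below shows one way to do that. The organizers' actual cutoff is not stated here, so the function name `time_split`, the `(timestamp, record)` row layout, and the `train_fraction` parameter are hypothetical.

```python
def time_split(rows, train_fraction):
    """Split rows chronologically: earliest rows train, latest rows validate.

    rows: list of (timestamp, record) tuples; timestamps only need to be sortable.
    train_fraction: share of rows (by count) kept for training, an assumed knob.
    """
    ordered = sorted(rows, key=lambda row: row[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Toy rows with integer timestamps, deliberately out of order.
rows = [(4, "d"), (1, "a"), (3, "c"), (2, "b")]
train, valid = time_split(rows, train_fraction=0.75)
```

Every training row is then at least as old as every validation row, which mirrors how the competition's test set follows its training set in time.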



Data Scientist / Business Consultant at GE

Three years of experience in big data, business intelligence, and analytics with CMMI Level 5 organizations in the BFSI and manufacturing sectors. Excellent written and oral communication, with strong analytical and problem-solving capabilities. Constantly learning and experimenting with emerging open source tools and technologies...