
Process a Million Song Dataset to Predict Song Preferences

In this big data project, we will discover songs by artists associated with different cultures across the globe.

What will you learn

  • Analyzing large datasets easily and efficiently
  • Using the data flow programming language "Pig Latin" for analysis
  • Data compression using the LZO codec
  • Pig Latin UDF library "DataFu" (created by LinkedIn) for data localization
  • Working with Hierarchical Data Format (HDF5)

What will you get

  • Access to a recording of the complete project
  • Access to all project-related material, such as data files and solution files

Project Description

This big data Hadoop project aims to be the best possible offline evaluation of a music recommendation system. Any type of algorithm can be used: collaborative filtering, content-based methods, web crawling. Because it relies on the Million Song Dataset, the data for this project is completely open: almost everything is known and possibly available.

What is the task in a few words? You have: 

  1. the full listening history for 1M users, 
  2. half of the listening history for 110K users (10K validation set, 100K test set), 

and you must predict the missing half. How much easier can it get?
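To make "predict the missing half" concrete, here is a minimal Python sketch of how a set of predictions could be scored offline. The metric (mean average precision over a ranked list truncated at `k`) is an assumption chosen for illustration; the project description does not say which metric is used. The function names and the toy data shapes (a dict of ranked song lists and a dict of hidden song sets per user) are also hypothetical.

```python
def average_precision_at_k(recommended, hidden, k=10):
    """Average precision of one ranked list against a user's hidden songs.

    Assumes `recommended` contains no duplicate songs.
    """
    hidden = set(hidden)
    if not hidden:
        return 0.0
    hits = 0
    score = 0.0
    for rank, song in enumerate(recommended[:k], start=1):
        if song in hidden:
            hits += 1
            # Precision at this rank, counted only where a hit occurs.
            score += hits / rank
    # Normalise by the best achievable number of hits within k.
    return score / min(len(hidden), k)


def mean_average_precision(recommendations, hidden_halves, k=10):
    """Mean of the per-user average precision over all users with a hidden half."""
    users = list(hidden_halves)
    if not users:
        return 0.0
    total = sum(
        average_precision_at_k(recommendations.get(user, []), hidden_halves[user], k)
        for user in users
    )
    return total / len(users)
```

A user with no recommendations simply scores zero, so the mean reflects coverage as well as ranking quality.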

The most straightforward approach to this task is pure collaborative filtering, but remember that there is a wealth of information available to you through the Million Song Dataset. To download the Million Song Dataset, visit labrosa.ee.columbia.edu/millionsong/. Go ahead, explore!
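As a rough illustration of pure collaborative filtering, the Python sketch below recommends songs that co-occur with a user's visible listening history in other users' histories. It is one simple item-based variant, not necessarily the method used in the project, and the function names and the `histories` layout (user ID mapped to a list of song IDs) are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(histories):
    """Count, for each pair of songs, how many users listened to both."""
    cooc = defaultdict(lambda: defaultdict(int))
    for songs in histories.values():
        # De-duplicate so each user contributes at most once per pair.
        for a, b in combinations(sorted(set(songs)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend(visible_songs, cooc, k=10):
    """Rank unseen songs by total co-occurrence with the user's visible songs."""
    visible = set(visible_songs)
    scores = defaultdict(int)
    for song in visible:
        for other, count in cooc.get(song, {}).items():
            if other not in visible:
                scores[other] += count
    # Highest score first; ties broken by song ID for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [song for song, _ in ranked[:k]]
```

At the scale of the full dataset an in-memory dictionary like this would likely not fit; it is meant only to show the idea on a small sample.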

Instructors

 
Sakhuja

Senior Hadoop Engineer at Sirius Computer Solutions

Abhishek has 5 years of corporate experience in the fields of Hadoop R&D, Big Data technologies, Hadoop administration, IBM Netezza Database Administration, Data Warehousing, Data Mining (Netezza, Oracle PL/SQL and Microsoft SQL Server), Development, ETL and Advanced analytics. He has vast exposure to various pro...

Curriculum For This Mini Project

 
  13-Jul-2016  02:35:26
  14-Jul-2016  02:41:12