This big data Hadoop project aims to be the best possible offline evaluation of a music recommendation system. Any type of algorithm can be used: collaborative filtering, content-based methods, or web crawling. Because it relies on the Million Song Dataset, the data for this big data project is completely open: almost everything is known and potentially available.
What is the task in a few words? You have:
and you must predict the missing half. How much easier can it get?
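To make "predict the missing half" concrete, here is a minimal Python sketch of an offline evaluation on toy data. The song ids, the random 50/50 split, and the precision-at-k score are all assumptions chosen for illustration; the project's actual split and metric may differ.

```python
import random

def split_history(songs, seed=0):
    """Split one user's listened songs into a visible half and a hidden half."""
    shuffled = sorted(songs)  # sort first so the split is reproducible
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return set(shuffled[:mid]), set(shuffled[mid:])

def precision_at_k(recommended, hidden, k):
    """Fraction of the top-k recommendations that appear in the hidden half."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for song in top_k if song in hidden)
    return hits / len(top_k)

# Hypothetical listening history for a single user.
history = {"song_a", "song_b", "song_c", "song_d", "song_e", "song_f"}
visible, hidden = split_history(history)
```

A recommender would see only `visible`, produce a ranked list of songs, and be scored against `hidden` with a function such as `precision_at_k`.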
The most straightforward approach to this task is pure collaborative filtering, but remember that a wealth of additional information is available to you through the Million Song Dataset. To download the Million Song Dataset, visit labrosa.ee.columbia.edu/millionsong/. Go ahead, explore!
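As a rough illustration of pure collaborative filtering, the sketch below ranks unseen songs by how often they co-occur with the songs a user has already played. It is one simple item-based variant, not the method prescribed by the project. The user and song ids are hypothetical, and it assumes that complete listening histories from some users are available to build the co-occurrence counts.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(histories):
    """Count how often each pair of songs appears in the same user's history."""
    cooc = defaultdict(lambda: defaultdict(int))
    for songs in histories.values():
        for a, b in combinations(sorted(set(songs)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc

def recommend(visible, cooc, k=10):
    """Rank unseen songs by total co-occurrence with the user's visible songs."""
    scores = defaultdict(int)
    for song in visible:
        for other, count in cooc.get(song, {}).items():
            if other not in visible:
                scores[other] += count
    # Sort by score (descending), then by song id for a deterministic order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [song for song, _ in ranked[:k]]

# Toy data (hypothetical ids, assumed to be complete listening histories).
full_histories = {
    "user_1": ["song_a", "song_b", "song_c"],
    "user_2": ["song_a", "song_b", "song_d"],
    "user_3": ["song_b", "song_c", "song_d"],
}
cooc = build_cooccurrence(full_histories)
recommendations = recommend({"song_a"}, cooc, k=2)
```

Content-based signals from the Million Song Dataset, such as audio features or metadata, could be blended into the score; this sketch deliberately uses listening data only.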