HANDS-ON-LAB

Movie Recommendation using Association Rules Mining Project

Problem Statement

Recommend movies to users who have watched similar movies and rated them 4 or above.

Dataset

This project uses two main data files: movies.csv and ratings.csv.

The complete data dictionary can be found here.

Kindly download the two data files from here.

Tasks

  1. Hypothesis-based EDA:

    • Find the top 10 rated movies in the data and how many users have rated them. (Output df: two columns, MovieName and Rating)

    • Plot the distribution of ratings as a histogram

    • How many unique genres are in the movies data?

    • Which genres have the most movies rated below 3?

  2. Filter the data for genre = ‘Comedy|Romance’ & rating = 4 and create the USER_ID <> MOVIE_ID matrix.

  3. Build the association rules using the Apriori algorithm.

  4. Create a support function that takes genre and rating_cutoff as arguments to dynamically build association rules using the Apriori algorithm (similar to the function created in the project video).
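
The filtering, matrix, and rule-building tasks above can be sketched end to end as follows. This is a minimal illustration, not the project's reference solution. It assumes MovieLens-style column names (movieId, title, genres, userId, rating), uses tiny made-up frames in place of the real CSV files, and treats the rating filter as "at or above the cutoff", following the problem statement. To stay dependency-free it implements only the first two Apriori levels (single movies and pairs); a library implementation such as mlxtend's apriori and association_rules is a common alternative. The function names and the min_support and min_confidence defaults are hypothetical.

```python
from itertools import combinations

import pandas as pd

# Made-up sample data standing in for movies.csv and ratings.csv (assumed columns).
movies = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "title": ["A", "B", "C", "D"],
    "genres": ["Comedy|Romance", "Comedy|Romance", "Comedy|Romance", "Action"],
})
ratings = pd.DataFrame({
    "userId":  [1, 1, 1, 2, 2, 3, 3, 4, 4],
    "movieId": [1, 2, 4, 1, 2, 1, 3, 2, 3],
    "rating":  [4.0, 5.0, 5.0, 4.5, 4.0, 4.0, 2.0, 4.0, 4.5],
})


def user_movie_matrix(movies, ratings, genre, rating_cutoff):
    """Boolean user x movie matrix for one exact genre string, rating >= cutoff."""
    picked = movies.loc[movies["genres"] == genre, ["movieId", "title"]]
    liked = ratings[ratings["rating"] >= rating_cutoff].merge(picked, on="movieId")
    # Users with no qualifying rating in this genre do not appear as rows.
    return pd.crosstab(liked["userId"], liked["title"]).astype(bool)


def pair_rules(basket, min_support=0.5, min_confidence=0.6):
    """Apriori restricted to 1- and 2-itemsets, returning rules X -> Y."""
    # Level 1: keep only movies that are frequent on their own.
    frequent = sorted(c for c in basket.columns if basket[c].mean() >= min_support)
    rows = []
    # Level 2: candidate pairs are built only from frequent single movies,
    # which is the Apriori pruning step.
    for a, b in combinations(frequent, 2):
        support_ab = (basket[a] & basket[b]).mean()
        if support_ab < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = support_ab / basket[x].mean()
            if confidence >= min_confidence:
                rows.append({
                    "antecedent": x,
                    "consequent": y,
                    "support": support_ab,
                    "confidence": confidence,
                    "lift": confidence / basket[y].mean(),
                })
    cols = ["antecedent", "consequent", "support", "confidence", "lift"]
    return pd.DataFrame(rows, columns=cols)


def build_rules(movies, ratings, genre, rating_cutoff,
                min_support=0.5, min_confidence=0.6):
    """Support function: genre and rating_cutoff in, association rules out."""
    basket = user_movie_matrix(movies, ratings, genre, rating_cutoff)
    return pair_rules(basket, min_support, min_confidence)


basket = user_movie_matrix(movies, ratings, "Comedy|Romance", 4)
rules = build_rules(movies, ratings, "Comedy|Romance", 4)
print(basket)
print(rules)
```

On the real data the support threshold would likely need to be much lower than 0.5, since any single pair of movies is watched by only a small share of users.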

Explore movie ratings, genres, and association rules with our data analysis.

 

FAQs

Q1. What are the top 10 rated movies in the dataset and how many users have rated them?

The analysis provides a dataframe with columns MovieName and Rating, showcasing the top-rated movies and the number of user ratings.
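
One way to produce such a dataframe is sketched below. It assumes MovieLens-style columns (movieId, title, userId, rating) and uses a small made-up sample instead of the real files. The task asks for MovieName and Rating; the extra NumUsers column and the min_ratings parameter are additions made here for illustration, the latter to let movies with very few raters be excluded.

```python
import pandas as pd

# Made-up sample data; the column names are an assumption about the CSV files.
movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
})
ratings = pd.DataFrame({
    "userId":  [1, 2, 3, 1, 2, 3],
    "movieId": [1, 1, 1, 2, 2, 3],
    "rating":  [5.0, 4.0, 4.5, 3.0, 2.0, 5.0],
})


def top_rated(movies, ratings, n=10, min_ratings=1):
    """Top-n movies by mean rating, with the number of distinct raters."""
    stats = (
        ratings.groupby("movieId")
        .agg(Rating=("rating", "mean"), NumUsers=("userId", "nunique"))
        .reset_index()
    )
    stats = stats[stats["NumUsers"] >= min_ratings]
    out = (
        stats.merge(movies[["movieId", "title"]], on="movieId")
        .rename(columns={"title": "MovieName"})
        .sort_values(["Rating", "NumUsers"], ascending=[False, False])
        .head(n)
    )
    return out[["MovieName", "Rating", "NumUsers"]].reset_index(drop=True)


top10 = top_rated(movies, ratings)
print(top10)
```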

 

Q2. How many unique genres are present in the movie dataset?

The analysis reveals the count of unique genres available in the dataset, allowing for genre exploration.
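
A minimal sketch of that count follows, assuming the genres column stores pipe-separated labels (as in 'Comedy|Romance') and that movies without a genre carry a placeholder such as "(no genres listed)". Both the placeholder and the sample rows are assumptions.

```python
import pandas as pd

# Made-up sample rows; real data would come from movies.csv.
movies = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "genres": ["Comedy|Romance", "Comedy", "Action|Thriller", "(no genres listed)"],
})

# Split each pipe-separated string into a list, then flatten to one label per row.
all_labels = movies["genres"].str.split("|").explode()

# Drop the assumed placeholder so it is not counted as a genre.
unique_genres = sorted(set(all_labels) - {"(no genres listed)"})

print(len(unique_genres), unique_genres)
```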

 

Q3. Which genres have a higher number of movies with a rating less than 3?

By examining the data, it is possible to identify the genres that have a larger number of movies rated below 3.
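
The sketch below shows one possible reading of this question: a movie counts as "rated below 3" when its mean rating across users is under 3, and a movie with several genres counts once for each of them. Both choices, the column names, and the sample data are assumptions.

```python
import pandas as pd

# Made-up sample data with assumed MovieLens-style columns.
movies = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "genres": ["Comedy|Romance", "Comedy", "Action|Comedy", "Action"],
})
ratings = pd.DataFrame({
    "userId":  [1, 2, 1, 2, 1, 2, 1],
    "movieId": [1, 1, 2, 2, 3, 3, 4],
    "rating":  [4.0, 5.0, 2.0, 2.5, 1.0, 3.0, 4.0],
})

# Mean rating per movie, then the ids of movies averaging below 3.
mean_rating = ratings.groupby("movieId")["rating"].mean()
low_ids = mean_rating[mean_rating < 3].index

# One row per (movie, genre) pair for the low-rated movies, counted by genre.
low_movies = movies[movies["movieId"].isin(low_ids)]
low_by_genre = (
    low_movies.assign(genre=low_movies["genres"].str.split("|"))
    .explode("genre")["genre"]
    .value_counts()
)

print(low_by_genre)
```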