Neo4j Project using Yelp dataset to analyse ratings from users

In this Neo4j project, you will do network analysis using a graph database to find patterns in how a social network affects business reviews and ratings.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

Customer Love

James Peebles

Data Analytics Leader, IQVIA

This is one of the best of investments you can make with regards to career progression and growth in technological knowledge. I was pointed in this direction by a mentor in the IT world who I highly...

Camille St. Omer

Artificial Intelligence Researcher, Quora 'Most Viewed Writer' in 'Data Mining'

I came to the platform with no experience and now I am knowledgeable in Machine Learning with Python. No easy thing I must say, the sessions are challenging and go to the depths. I looked at graduate...

What will you learn

Downloading the Yelp Dataset
Setting up a virtual environment in the Cloudera VM on VMware
Introduction to key terminology in graph databases
Understanding Directed and Undirected Graphs
Integrating Hue and Impala
Writing Queries in Hue Impala
Exploring the Spark GraphX documentation and understanding its different functions
What are Out-degrees and In-degrees
Importing Edge, Graph, and storage levels in Scala
Printing the subsets of In-degrees after filtering them
Short introduction to Cypher
Neo4j connector to Apache Spark
Understanding and creating a data schema for storing the data
Different types of databases such as HBase, Cassandra, and graph databases
Selecting the appropriate Database for your project
RDBMS versus graph databases
Data analysis using GraphX and Neo4j
Writing Queries for fetching Data and visualizing the output
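
To make the in-degree and out-degree topics above concrete, here is a minimal sketch in plain Python. The edge list and user names are invented for illustration and are not taken from the Yelp dataset; in the project itself this kind of count would come from Spark GraphX (which exposes `inDegrees` and `outDegrees` on a graph) rather than from hand-written code.

```python
from collections import Counter

# Hypothetical directed edges (source, target); not real Yelp data.
edges = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "carol"),
    ("carol", "alice"),
]


def degrees(edge_list):
    """Return (in_degree, out_degree) Counters for a directed edge list."""
    in_deg = Counter()
    out_deg = Counter()
    for src, dst in edge_list:
        out_deg[src] += 1  # edge leaves src
        in_deg[dst] += 1   # edge arrives at dst
    return in_deg, out_deg


in_deg, out_deg = degrees(edges)

# Filter the in-degrees down to a subset: vertices with at least 2 incoming edges.
popular = {vertex: d for vertex, d in in_deg.items() if d >= 2}
print(popular)
```

In an undirected graph the two counts collapse into a single degree per vertex, which is why the distinction only matters for directed relationships.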

Project Description

Continuing the series on "Data engineering using Yelp dataset", we have built our data warehouse to an appreciable stage, and users can run any kind of query they want. Well done.

But not all queries are easy for users to read or write, and not all are easy for the query engine to execute. Some queries involve so many self-joins that they become either inefficient for the system or too confusing for the writer.
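
As a rough illustration of the self-join problem, the sketch below uses Python's built-in `sqlite3` module with a hypothetical `friend` table (the table name, columns, and user IDs are assumptions, not the project's warehouse schema). Finding friends of friends already takes one self-join, and every extra hop adds another.

```python
import sqlite3

# In-memory database with a hypothetical friendship table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friend (user_id TEXT, friend_id TEXT)")
conn.executemany(
    "INSERT INTO friend VALUES (?, ?)",
    [("u1", "u2"), ("u2", "u3"), ("u2", "u4"), ("u3", "u5")],
)

# Friends of friends of u1: the table is joined to itself once per hop.
rows = conn.execute(
    """
    SELECT DISTINCT f2.friend_id
    FROM friend AS f1
    JOIN friend AS f2 ON f2.user_id = f1.friend_id
    WHERE f1.user_id = ?
    ORDER BY f2.friend_id
    """,
    ("u1",),
).fetchall()
friends_of_friends = [row[0] for row in rows]
print(friends_of_friends)

# A comparable Cypher pattern, assuming User nodes linked by FRIEND
# relationships (labels and property names are hypothetical, not executed here).
CYPHER_TWO_HOPS = (
    "MATCH (:User {id: 'u1'})-[:FRIEND*2]->(fof:User) "
    "RETURN DISTINCT fof.id"
)
```

In the Cypher version, going from two hops to three changes one number in the pattern, whereas the SQL version needs a further join and alias.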

In this Neo4j big data project, we are going to do network analysis using a graph database. The purpose is to find patterns in how a social network affects business reviews and ratings. This on its own could be an outstanding data product from the Yelp dataset.

We will be using the open source graph database Neo4j and Spark to analyze the social network of users and whether it has any effect on how they rate or review businesses.
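
One simple way to frame that question is to compare each user's rating of a business with the average rating their friends gave the same business. The sketch below does this in plain Python on made-up data; the friendship map, the ratings, and the "gap" measure are all illustrative assumptions, not the project's actual method or results.

```python
from statistics import mean

# Hypothetical friendships and star ratings; not real Yelp data.
friends = {"u1": {"u2", "u3"}, "u2": {"u1"}, "u3": {"u1"}}
ratings = {
    ("u1", "b1"): 4,
    ("u2", "b1"): 5,
    ("u3", "b1"): 3,
    ("u1", "b2"): 2,
    ("u2", "b2"): 1,
}


def friend_rating_rows(friends, ratings):
    """List (user, business, own stars, mean stars from friends) per rating.

    Ratings where no friend reviewed the same business are skipped.
    """
    rows = []
    for (user, business), stars in sorted(ratings.items()):
        friend_stars = [
            ratings[(f, business)]
            for f in friends.get(user, ())
            if (f, business) in ratings
        ]
        if friend_stars:
            rows.append((user, business, stars, mean(friend_stars)))
    return rows


rows = friend_rating_rows(friends, ratings)

# Average absolute difference between a user's rating and their friends' mean.
mean_gap = mean(abs(own - friends_avg) for _, _, own, friends_avg in rows)
print(rows)
print(mean_gap)
```

A small gap on its own does not show influence; it would need to be compared against the gap between users who are not connected.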

Similar Projects

In this big data project using Neo4j, we will remodel the MovieLens dataset as a graph structure and use that structure to answer questions in different ways.

In this big data project, we will look at how to mine and make sense of connections in a simple way by building a Spark GraphX Algorithm and a Network Crawler.

The goal of this Spark project is to analyse the level and strength of interactions between different areas of a telecom provider's coverage in the city of Milan.
