Chicago Crime Data Analysis on Apache Spark

In this project, we will look at running various use cases in the analysis of crime data sets using Apache Spark.

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.


Code & Dataset

Get access to 50+ solved projects with iPython notebooks and datasets.


Project Experience

Add project experience to your LinkedIn/GitHub profiles.

Customer Love


Mohamed Yusef Ahmed

Software Developer at Taske

Recently I became interested in Hadoop as I think it's a great platform for storing and analyzing large structured and unstructured data sets. The experts did a great job not only explaining the...


Camille St. Omer

Artificial Intelligence Researcher, Quora 'Most Viewed Writer' in 'Data Mining'

I came to the platform with no experience and now I am knowledgeable in Machine Learning with Python. No easy thing I must say, the sessions are challenging and go to the depths. I looked at graduate...

What will you learn

Spark's DataFrame vs Dataset
Type-safe UDF in Spark
Rollup functions in Spark
Windowing functions in Spark
Running your Spark code in Apache Zeppelin
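To illustrate the first two topics, here is a minimal sketch of the DataFrame vs Dataset distinction. The `Crime` case class and its fields are hypothetical stand-ins for the Chicago crime schema, not the actual dataset columns; it assumes Spark running in local mode.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical schema for a crime record (illustrative field names).
case class Crime(id: Long, primaryType: String, arrest: Boolean)

object DataFrameVsDataset {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DataFrameVsDataset")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val ds = Seq(Crime(1L, "THEFT", false), Crime(2L, "HOMICIDE", true)).toDS()

    // Dataset: field access is checked at compile time; a typo in `arrest`
    // would fail to compile.
    val arrests = ds.filter(_.arrest)

    // DataFrame (an alias for Dataset[Row]): columns are addressed by string
    // name, so a typo only surfaces as a runtime AnalysisException.
    val df = ds.toDF()
    val arrestsDf = df.filter(df("arrest") === true)

    arrests.show()
    arrestsDf.show()
    spark.stop()
  }
}
```

The compile-time checking on the typed `filter` is what motivates preferring Datasets when the schema is known up front.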

Project Description

In this Hackerday, we will look at running various use cases in the analysis of crime datasets using Apache Spark.
This is a back-to-basics Hackerday session that will be expository for those who have never written a Spark application or are new to writing Spark applications in Scala. We will explore Spark SQL UDFs as well as roll-up and windowing functions.
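The UDF, roll-up, and windowing pieces above can be sketched together on a toy stand-in for the crime data. The column names (`primary_type`, `district`, `year`) and the `severityLabel` categories are illustrative assumptions, not the real dataset schema; the sketch assumes Spark 2.x in local mode.

```scala
import org.apache.spark.sql.{SparkSession, functions => F}
import org.apache.spark.sql.expressions.Window

object CrimeSketch {
  // Pure helper: classify a primary crime type (hypothetical categories),
  // usable both as a plain Scala function and as a Spark SQL UDF.
  def severityLabel(primaryType: String): String =
    if (Set("HOMICIDE", "ROBBERY", "ASSAULT").contains(primaryType)) "violent"
    else "non-violent"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ChicagoCrimeSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Toy stand-in for the Chicago crime dataset.
    val crimes = Seq(
      ("HOMICIDE", "025", 2016),
      ("THEFT",    "025", 2016),
      ("THEFT",    "012", 2017),
      ("ROBBERY",  "012", 2017)
    ).toDF("primary_type", "district", "year")

    // Register the helper as a UDF.
    val severityUdf = F.udf(severityLabel _)

    // Roll-up: counts per (district, year), plus per-district subtotals
    // and a grand total row.
    val rolledUp = crimes.rollup($"district", $"year").count()

    // Window: rank crime types by frequency within each district.
    val byDistrict = Window.partitionBy($"district").orderBy(F.desc("cnt"))
    val ranked = crimes
      .groupBy($"district", $"primary_type").agg(F.count("*").as("cnt"))
      .withColumn("severity", severityUdf($"primary_type"))
      .withColumn("rank", F.rank().over(byDistrict))

    rolledUp.show()
    ranked.show()
    spark.stop()
  }
}
```

Keeping `severityLabel` as an ordinary Scala function means it can be unit-tested without a SparkSession before being registered as a UDF.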

We will also deploy our final application on Apache Zeppelin so we can share it. We will try to run some of our code on both the 1.x and 2.x versions of Spark; however, we recommend moving completely to Spark 2.x.

Similar Projects

This Elasticsearch example deploys the AWS ELK stack to analyse streaming event data. Tools used include Nifi, PySpark, Elasticsearch, Logstash and Kibana for visualisation.

In this project, we will evaluate and demonstrate how to handle unstructured data using Spark.

In this Hackerday, we will go through the basics of statistics and see how Spark enables us to perform statistical operations like descriptive and inferential statistics over very large datasets.

Curriculum For This Mini Project