Spark Project - Airline Dataset Analysis using Spark MLlib

In this Hackerday, we will go through the basics of statistics and see how Spark enables us to perform statistical operations, such as descriptive and inferential statistics, over very large datasets.
Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 102+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What You Will Learn

Introduction to Spark MLlib
MLlib Data Structures
Descriptive statistics
Inferential statistics
Data Sampling
Introduction to Machine Learning algorithms with Spark MLlib

Project Description

According to Wikipedia, statistics is a branch of mathematics dealing with data collection, organization, analysis, interpretation, and presentation. It is about building, from collected data, a model that enables humans to describe, analyze, and draw inferences about the events happening around them. Statistics is itself a conduit to the fields of machine learning and AI.

No prior knowledge of statistics is assumed in this session. Every concept will be explained from the ground up and put into practice on the airline on-time performance dataset. We will conclude the session by introducing a number of the machine learning algorithms available in MLlib.
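To give a feel for what descriptive statistics over a very large dataset involves, here is a minimal sketch in plain Python (not the MLlib API). It illustrates one standard way to compute a mean and variance when the data is split into partitions: summarize each partition in a single pass, then merge the partial summaries. The class name `Summary`, the helper `summarize`, and the delay values are hypothetical and chosen only for illustration.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Summary:
    """Running count, mean, and sum of squared deviations (m2)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def add(self, x: float) -> "Summary":
        # Welford's online update for one new value.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return self

    def merge(self, other: "Summary") -> "Summary":
        # Combine two partial summaries without revisiting the raw data.
        if other.n == 0:
            return self
        if self.n == 0:
            return Summary(other.n, other.mean, other.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return Summary(n, mean, m2)

    @property
    def variance(self) -> float:
        # Sample variance (divides by n - 1).
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def summarize(values):
    """Build a Summary from one partition in a single pass."""
    s = Summary()
    for v in values:
        s.add(v)
    return s


# Hypothetical arrival delays in minutes, split into two "partitions".
partition_a = [5.0, -3.0, 12.0, 0.0]
partition_b = [45.0, 7.0, -1.0]

total = summarize(partition_a).merge(summarize(partition_b))

# Cross-check against the standard library on the combined data.
combined = partition_a + partition_b
print(total.n, round(total.mean, 3), round(total.variance, 3))
print(round(statistics.mean(combined), 3), round(statistics.variance(combined), 3))
```

Because `merge` needs only the three numbers held in each summary, the partitions can be processed independently and combined in any order, which is what makes this style of computation suitable for data too large for one machine.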
 
