Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with IPython notebooks and datasets.
Add project experience to your LinkedIn/GitHub profiles.
I have extensive experience in data management and data processing. Over the past few years I saw the data management technology transition into the Big Data ecosystem and I needed to follow suit. I...
I think that they are fantastic. I attended Yale and Stanford and have worked at Honeywell, Oracle, and Arthur Andersen (Accenture) in the US. I have taken Big Data and Hadoop, NoSQL, Spark, Hadoop...
In this big data project, we build a live workflow for a real project using Apache Airflow, a cutting-edge workflow management platform. We will go through the use cases for workflows, the different tools available to manage them, important features such as the CLI and UI, and how Airflow is different. We will then install Airflow and run some simple workflows.
In this big data Hadoop project, we will download the raw page counts data from the Wikipedia archive and process it via Hadoop. We will then map the processed data to raw SQL data to identify the most viewed pages of a given day, and visualize the processed data in Zeppelin Notebooks to identify the daily trends. We will use Qubole to power Hadoop and the Notebooks.
All steps, such as downloading the data, copying it to S3, creating tables, and processing them via Hadoop, will be tasks in Airflow, and we will learn how to craft a scheduled workflow in Airflow.
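To make the idea of steps-as-tasks concrete, here is a minimal sketch that uses only the Python standard library rather than Airflow itself. It models the four steps as a dependency graph and runs them in order, which is the core idea behind an Airflow workflow. The task names, the strictly linear ordering, and the helpers run_task and run_workflow are assumptions made for illustration; they are not taken from the project or from Airflow's API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names mirroring the steps in the project description.
# Each key maps to the set of tasks that must finish before it can start.
# The linear ordering is an assumption made for this sketch.
DEPENDENCIES = {
    "download_pagecounts": set(),
    "copy_to_s3": {"download_pagecounts"},
    "create_tables": {"copy_to_s3"},
    "process_with_hadoop": {"create_tables"},
}


def run_task(name, log):
    """Stand-in for real work: only records that the task ran."""
    log.append(name)


def run_workflow(dependencies):
    """Run every task once, after all of its prerequisites."""
    log = []
    for task in TopologicalSorter(dependencies).static_order():
        run_task(task, log)
    return log


if __name__ == "__main__":
    for step in run_workflow(DEPENDENCIES):
        print(step)
```

In a real Airflow workflow, each entry would be an operator inside a DAG, and the scheduler would handle ordering, retries, and the run schedule; the sketch only shows the dependency-ordering idea.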
In this big data project, we will design an OLAP cube using the AdventureWorks database. The deliverable for this session is to design a cube, build and implement it using Kylin, query the cube, and even connect familiar tools (like Excel) to our new cube.
In this NoSQL project, we will use two NoSQL databases (HBase and MongoDB) to store Yelp business attributes and learn how to retrieve this data for processing or querying.
In this Spark project, we will continue building the data warehouse from the previous project, Yelp Data Processing Using Spark And Hive Part 1, and will do further data processing to develop diverse data products.