Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 102+ solved projects with iPython notebooks and datasets.
Add project experience to your Linkedin/Github profiles.
This is one of the best of investments you can make with regards to career progression and growth in technological knowledge. I was pointed in this direction by a mentor in the IT world who I highly... Read More
I have worked for more than 15 years in Java and J2EE and have recently developed an interest in Big Data technologies and Machine learning due to a big need at my workspace. I was referred here by a... Read More
I have extensive experience in data management and data processing. Over the past few years I saw the data management technology transition into the Big Data ecosystem and I needed to follow suit. I... Read More
This was great. The use of Jupyter was great. Prior to learning Python I was a self taught SQL user with advanced skills. I hold a Bachelors in Finance and have 5 years of business experience.. I... Read More
Initially, I was unaware of how this would cater to my career needs. But when I stumbled through the reviews given on the website. I went through many of them and found them all positive. I would... Read More
I think that they are fantastic. I attended Yale and Stanford and have worked at Honeywell,Oracle, and Arthur Andersen(Accenture) in the US. I have taken Big Data and Hadoop,NoSQL, Spark, Hadoop... Read More
Still on the series on Data engineering using Yelp dataset, we have established several concepts - from data warehousing to graph analysis. Well done.
But in today's world, not all data are best stored on HDFS. Some special requirements and scenario could require a data storage with a very low latency that could also handle large dataset. Here comes the use of NoSQL databases.
In this NoSQL project, we will use two NoSQL databases(HBase and MongoDB) to store Yelp business attributes and also learn how to retrieve these data for processing or query. We will substantiate the value of these other ways to store data over using HDFS and how to join them with data stored in HDFS in real time.
Seeing that MongoDB is not available in Cloudera Quickstart VM, we are encouraged to install MongoDB on our host machine while setting up a host network interface between the host and the VM for this big data project.
Given big data at taxi service (ride-hailing) i.e. OLA, you will learn multi-step time series forecasting and clustering with Mini-Batch K-means Algorithm on geospatial data to predict future ride requests for a particular region at a given time.
In this time series project, you will forecast Walmart sales over time using the powerful, fast, and flexible time series forecasting library Greykite that helps automate time series problems.
In this deep learning project, you will learn how to build your custom OCR (optical character recognition) from scratch by using Google Tesseract and YOLO to read the text from any images.
Hive Project -Learn to write a Hive program to find the first unique URL, given 'n' number of URL's.
Use the Zillow dataset to follow a test-driven approach and build a regression machine learning model to predict the price of the house based on other variables.
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.
Learn to design Hadoop Architecture and understand how to store data using data acquisition tools in Hadoop.
In this PySpark project, you will simulate a complex real-world data pipeline based on messaging. This project is deployed using the following tech stack - NiFi, PySpark, Hive, HDFS, Kafka, Airflow, Tableau and AWS QuickSight.
In this spark project, you will use the real-world production logs from NASA Kennedy Space Center WWW server in Florida to perform scalable log analytics with Apache Spark, Python, and Kafka.
In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.