Each project comes with 2-5 hours of micro-videos explaining the solution.

Get access to 102+ solved projects with IPython notebooks and datasets.

Add project experience to your LinkedIn/GitHub profiles.

Understanding the problem statement

Importing the dataset directly from AWS

What is Clustering

Performing basic EDA and data preprocessing

Plotting boxplots to identify outliers

Treating outliers with an appropriate technique

Scaling and normalizing the dataset

Partition-based, hierarchical, and model-based methods of clustering

K-means, k-median, k-mode, agglomerative, divisive, and expectation-maximization based methods

Identifying different libraries used for Clustering

Steps involved in K-means clustering

Applying different methods available for clustering

Evaluating the clustering result with the silhouette score

Visualizing the result by plotting graphs

Picking the best model for making predictions

Calculating the probability that each test data point belongs to each cluster
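
The outlier, scaling, K-means, and silhouette steps listed above can be sketched in a few lines. This is a minimal pure-Python illustration, not the project's own code: the toy data, the function names (`cap_outliers`, `min_max_scale`, `kmeans`, `silhouette_score`), and the deterministic farthest-point initialization are assumptions made for the example. A library implementation would usually be preferred on real data.

```python
import math
from statistics import quantiles


def cap_outliers(values):
    """Clip values to the boxplot whisker range [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [min(max(v, low), high) for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a constant column
    return [(v - lo) / span for v in values]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Initialization is farthest-point (an assumption,
    chosen here only because it is deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # a singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (made up): two groups, plus one extreme value in the first feature.
x = [1.0, 1.2, 0.8, 1.1, 8.0, 8.2, 7.9, 8.1, 40.0]
y = [1.0, 0.9, 1.2, 1.1, 8.0, 8.1, 7.8, 8.2, 8.0]

x_capped = cap_outliers(x)
points = list(zip(min_max_scale(x_capped), min_max_scale(y)))
labels, centroids = kmeans(points, k=2)
score = silhouette_score(points, labels)
print(labels, round(score, 3))
```

A silhouette score close to 1 indicates compact, well-separated clusters, so comparing the score across several values of `k` is one common way to choose the number of clusters.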

Introduction

03m

Installing Libraries

01m

Understand the Data Set

05m

Outliers

06m

Clustering Techniques

02m

Installing Library to implement KMeans

05m

Steps in KMeans Algorithm

11m

Implementing KMeans

12m

Cluster Model Deployment

04m

Hierarchical Clustering - Agglomerative Method

10m

Hierarchical Clustering - Divisive Method

04m

Silhouette Score

04m

Cluster Goodness using Silhouette Score

05m

Model Based Clustering

06m

Self Organizing Maps

07m

factoextra Library for Visualization

01m

Other Clustering Methods

10m

Hierarchical Clustering

03m

Hopkins Statistic

03m

Determine Optimal Number of Clusters

03m

Clustering Validation Statistics

05m

Advanced Clustering

04m

Conclusion

08m
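
As a companion to the agglomerative method covered in the curriculum above, here is a minimal sketch of bottom-up hierarchical clustering. It is an illustration only, not the course's code: the function name `agglomerative`, the choice of average linkage, and the sample points are assumptions made for the example.

```python
import math


def agglomerative(points, n_clusters):
    """Bottom-up clustering with average linkage.

    Starts with one cluster per point and repeatedly merges the two
    closest clusters until n_clusters remain. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up sample: two tight groups of three points each.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = agglomerative(sample, n_clusters=2)
print(result)
```

The divisive method works in the opposite direction: it starts from a single cluster holding every point and splits it step by step.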