In this Databricks Azure tutorial project, you will use Spark SQL to analyse the MovieLens dataset and provide movie recommendations. As part of this, you will deploy Azure Data Factory and data pipelines, and visualise the analysis.
This project analyzes a dataset of e-commerce product reviews. The goal is to use machine learning models to perform sentiment analysis on the reviews and rank them by relevance. Reviews play a key role in product recommendation systems.
This project covers Apache Spark, with the main focus on one of its components, Spark SQL. We will look at how Spark and Spark SQL work, their internal functioning, their capabilities, and their advantages over other data processing tools. We will take up one business problem in the area of supply chain. Our tech stack for this project will be Databricks and the latest Spark 3.0. We will use Spark SQL to explore the business data and generate insights that should help us frame a solution to the business problem.
In this project, we will use time-series forecasting to predict the values of a sensor using multiple dependent variables. A variety of machine learning models are applied to this forecasting task, and we will compare LSTM, ARIMA, and regression models. Classical forecasting methods like ARIMA are still popular and powerful, but they lack the generalizability that memory-based models like LSTM offer. Every model has its own advantages and disadvantages, which will be discussed. The main objective of this article is to lead you through building a working LSTM model and its different variants, such as Vanilla, Stacked, and Bidirectional. There will be a special focus on customized data preparation for LSTM.
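To give a sense of what data preparation for an LSTM typically involves, here is a minimal sketch of the common sliding-window approach: a multivariate series is cut into overlapping windows shaped (samples, timesteps, features), each paired with the next value of the target column. The function name `make_windows`, its parameters, and the toy data are assumptions made for illustration; the project's own preparation steps may differ.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[Sequence[float]],
    n_steps: int,
    target_col: int = 0,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Split a multivariate series into (samples, timesteps, features) windows.

    Each sample holds `n_steps` consecutive rows; its label is the value of
    `target_col` in the row immediately after the window.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    X, y = [], []
    for start in range(len(series) - n_steps):
        end = start + n_steps
        X.append([list(row) for row in series[start:end]])
        y.append(series[end][target_col])
    return X, y


# Hypothetical toy data: 6 time steps, 2 features
# (column 0 is the sensor reading to predict, column 1 is another variable).
rows = [[10, 1], [20, 2], [30, 3], [40, 4], [50, 5], [60, 6]]
X, y = make_windows(rows, n_steps=3)
print(len(X), len(X[0]), len(X[0][0]))  # samples, timesteps, features
print(y)
```

With this toy input the sketch yields three windows of three time steps and two features each. In practice the nested lists would usually be converted to an array before being passed to an LSTM layer, and the window length would be tuned to the data.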