In this R data science project, we will explore a wine dataset to assess red wine quality. The objective is to identify which chemical properties influence the quality of red wines.
One of the most common uses of Snowflake is building a data warehouse platform or enhancing an existing data lake. It offers services for building an efficient data warehouse with ETL capability and support for various external data partners. Slowly Changing Dimensions (SCDs) are a common database modeling technique used to capture data in a table and show how it changes over time. The attributes of a slowly changing dimension rarely change; when they do, there should be a systematic approach to capturing that change. Customer and product information are typical examples of SCDs. This project explains how to build an SCD using Snowflake’s Stream functionality and how to automate the process using Snowflake’s Task functionality.
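To make "capturing that change" concrete, here is a minimal sketch of one common approach, a Type 2 SCD, in which the old row is closed out and a new row is appended so that history is preserved. It is plain Python, not Snowflake's Stream or Task API, and the names `DimRow`, `apply_changes`, `customer_id`, `city`, `valid_from`, and `valid_to` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DimRow:
    """One version of a customer record (hypothetical schema)."""
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None means this version is still current

    @property
    def is_current(self) -> bool:
        return self.valid_to is None


def apply_changes(
    dimension: List[DimRow],
    changes: List[Tuple[int, str]],
    as_of: date,
) -> List[DimRow]:
    """Return a new dimension with (customer_id, city) changes applied as of a date.

    A changed attribute closes the current row and appends a new one;
    an unseen key is appended; an unchanged value is ignored.
    """
    result = list(dimension)
    # Index of the current row for each key.
    current = {row.customer_id: i for i, row in enumerate(result) if row.is_current}

    for customer_id, city in changes:
        idx = current.get(customer_id)
        if idx is not None:
            if result[idx].city == city:
                continue  # nothing changed, keep the current row open
            # Close the old version instead of overwriting it.
            result[idx] = replace(result[idx], valid_to=as_of)
        result.append(DimRow(customer_id, city, valid_from=as_of))
        current[customer_id] = len(result) - 1
    return result


if __name__ == "__main__":
    dim = [DimRow(1, "Austin", date(2023, 1, 1))]
    dim = apply_changes(dim, [(1, "Denver"), (2, "Boston")], date(2024, 6, 1))
    for row in dim:
        print(row)
```

In a warehouse, the list of changes would come from a change-capture mechanism and the function would run on a schedule; this sketch only illustrates the row-versioning logic.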
The project will use Rasa NLU for the intent classifier, spaCy for entity tagging, and MongoDB as the database. It will incorporate slot filling and context management and will support the following intents and entities. Intents: product_info | ask_price | cancel_order. Entities: product_name | location | order id. The project will demonstrate how to generate data on the fly, how to annotate it using the framework, and how to process it for the different training components described above.
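As a rough illustration of slot filling and context management, the sketch below tracks which entities an intent still needs and carries already-filled slots across turns. It is plain Python and does not use the Rasa, spaCy, or MongoDB APIs. The mapping of intents to required slots, the `Conversation` class, the action strings, and the key `order_id` (standing in for the "order id" entity) are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Assumed mapping from each intent to the entities it needs before it can be
# fulfilled. The project description lists the intents and entities but not
# this mapping.
REQUIRED_SLOTS: Dict[str, List[str]] = {
    "product_info": ["product_name"],
    "ask_price": ["product_name", "location"],
    "cancel_order": ["order_id"],
}


class Conversation:
    """Keeps the active intent and the slots filled so far (the context)."""

    def __init__(self) -> None:
        self.intent: Optional[str] = None
        self.slots: Dict[str, str] = {}

    def update(self, intent: Optional[str], entities: Dict[str, str]) -> None:
        # A turn may carry only entities (e.g. an answer to a follow-up
        # question), in which case the previous intent stays active.
        if intent is not None:
            self.intent = intent
        # Slots persist across turns, so earlier answers are reused.
        self.slots.update(entities)

    def next_action(self) -> str:
        if self.intent is None:
            return "ask:intent"
        missing = [s for s in REQUIRED_SLOTS[self.intent] if s not in self.slots]
        if missing:
            return f"ask:{missing[0]}"
        return f"fulfil:{self.intent}"


if __name__ == "__main__":
    conv = Conversation()
    conv.update("ask_price", {"product_name": "laptop"})
    print(conv.next_action())  # location is still missing
    conv.update(None, {"location": "Berlin"})
    print(conv.next_action())  # all required slots are filled
```

In the real project the intent and entities passed to `update` would come from the classifier and the entity tagger, and the conversation state would typically be persisted rather than held in memory.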
In this PySpark project, you will simulate a complex real-world data pipeline based on messaging. The project is deployed using the following tech stack: NiFi, PySpark, Hive, HDFS, Kafka, Airflow, Tableau, and AWS QuickSight.