We are thinking of starting a Hadoop project at my office. My team members are not aware of how Hadoop would help with analysis. Below are my project requirements; could you please go through them and suggest a solution, so that I can explain it to them?
1. Currently, half of the source data is in a MySQL server (OLTP) and the other half is in MongoDB (OLTP).
2. We need to pull this data into a central data warehouse repository (OLAP).
3. How do we build the schemas (star schema and snowflake schema) and define the relations between tables?
4. The purpose of this project is to build an analytics server that data scientists will use for analysis and research with Python and R.
5. Currently, around 80 GB of data has to be transferred to the analytics server, and the data load frequency is weekly.
6. How can we do cleansing and create a staging area in this project?
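
To make points 3 and 6 more concrete, here is a minimal sketch of the kind of flow I have in mind: raw rows land in a staging table, get cleansed, and are then loaded into a small star schema (one dimension table, one fact table). Everything in it is an assumption for illustration only: the table names (`stg_orders`, `dim_customer`, `fact_orders`), the columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the real warehouse, which may well be Hive or something else.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from the OLTP sources:
# (order_id, customer, amount), all as untyped text.
SAMPLE_ROWS = [
    ("1", " Alice ", "10.5"),
    ("2", "alice", "4.5"),
    ("2", "alice", "4.5"),   # duplicate order
    ("3", "Bob", None),      # missing amount, dropped during cleansing
    ("4", "Bob", "20"),
]


def build_warehouse(raw_rows):
    """Load raw rows into staging, cleanse them, and fill a tiny star schema."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Staging area: data lands here unchanged, every column as text.
    cur.execute("CREATE TABLE stg_orders (order_id TEXT, customer TEXT, amount TEXT)")
    cur.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", raw_rows)

    # Star schema: one dimension table and one fact table that references it.
    cur.execute(
        "CREATE TABLE dim_customer ("
        "customer_key INTEGER PRIMARY KEY, customer_name TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE fact_orders ("
        "order_id INTEGER PRIMARY KEY, "
        "customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key), "
        "amount REAL NOT NULL)"
    )

    # Cleansing step 1: trim and lower-case names so ' Alice ' and 'alice' match.
    cur.execute(
        "INSERT OR IGNORE INTO dim_customer (customer_name) "
        "SELECT DISTINCT LOWER(TRIM(customer)) FROM stg_orders "
        "WHERE customer IS NOT NULL AND TRIM(customer) <> ''"
    )

    # Cleansing step 2: cast types, skip rows without an amount, and let the
    # primary key on order_id discard duplicate orders.
    cur.execute(
        "INSERT OR IGNORE INTO fact_orders (order_id, customer_key, amount) "
        "SELECT CAST(s.order_id AS INTEGER), d.customer_key, CAST(s.amount AS REAL) "
        "FROM stg_orders s "
        "JOIN dim_customer d ON d.customer_name = LOWER(TRIM(s.customer)) "
        "WHERE s.amount IS NOT NULL AND TRIM(s.amount) <> ''"
    )
    conn.commit()
    return conn


def totals_by_customer(conn):
    """Example analytical query: total order amount per customer."""
    return conn.execute(
        "SELECT d.customer_name, SUM(f.amount) "
        "FROM fact_orders f JOIN dim_customer d USING (customer_key) "
        "GROUP BY d.customer_name ORDER BY d.customer_name"
    ).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse(SAMPLE_ROWS)
    for name, total in totals_by_customer(warehouse):
        print(name, total)
```

A snowflake schema would follow the same pattern, with the dimension table split further into normalized sub-tables. What I am unsure about is how this staging and cleansing flow should be set up when the sources are MySQL and MongoDB and the target is Hadoop.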