Identifying likely churners before they stop using a product or service gives a business the chance to intervene. In this ML project, you will develop a churn prediction model for telecom that predicts which customers are most likely to churn.
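As a rough illustration of what such a model could look like, here is a minimal sketch that fits a logistic regression to predict churn and ranks customers by predicted risk. Everything about the data is an assumption: the feature names (tenure_months, monthly_charge, support_calls) are hypothetical, and the records are randomly generated stand-ins for a real telecom dataset. Logistic regression is only one reasonable baseline, not the method this project prescribes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000

# Synthetic, hypothetical customer features (placeholders for real telecom data).
tenure_months = rng.integers(1, 72, size=n)
monthly_charge = rng.uniform(20, 120, size=n)
support_calls = rng.poisson(1.5, size=n)

# Synthetic churn label: an assumed relationship, used only to have something to fit.
logit = -0.05 * tenure_months + 0.02 * monthly_charge + 0.6 * support_calls - 1.5
churned = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([tenure_months, monthly_charge, support_calls])
X_train, X_test, y_train, y_test = train_test_split(
    X, churned, test_size=0.25, stratify=churned, random_state=0
)

# Scale the features, then fit a logistic regression baseline.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Predicted churn probability for each held-out customer.
churn_probability = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, churn_probability)

# Indices of the ten held-out customers with the highest predicted risk.
top_risk = np.argsort(churn_probability)[::-1][:10]
print(f"ROC AUC on held-out data: {auc:.3f}")
```

Ranking by predicted probability rather than a hard yes/no label lets a retention team start with the highest-risk customers.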
This is a typical big data ETL and visualization project implemented in the AWS cloud using cloud-native tools: Glue, which runs Spark jobs without requiring you to maintain cluster infrastructure; Step Functions, which orchestrates jobs based on their dependencies; Redshift, the petabyte-scale data warehouse service in AWS; and QuickSight, the AWS-managed visualization tool for creating business reports.
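To make the dependency-based orchestration concrete, the sketch below builds a Step Functions state machine definition (Amazon States Language) as a Python dictionary, chaining two Glue jobs so the second starts only after the first finishes. The state names and Glue job names are hypothetical placeholders, and a real pipeline would likely add retries and error handling.

```python
import json


def glue_task(job_name, next_state=None):
    """Build a Task state that runs a Glue job and waits for it to complete."""
    state = {
        "Type": "Task",
        # The .sync suffix makes Step Functions wait until the Glue job run ends.
        "Resource": "arn:aws:states:::glue:startJobRun.sync",
        "Parameters": {"JobName": job_name},
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


# Hypothetical two-step pipeline: the load job depends on the cleaning job.
state_machine = {
    "Comment": "Sketch of a two-step ETL pipeline (job names are placeholders)",
    "StartAt": "ExtractAndClean",
    "States": {
        "ExtractAndClean": glue_task("extract-and-clean-job", next_state="LoadToRedshift"),
        "LoadToRedshift": glue_task("load-to-redshift-job"),
    },
}

definition_json = json.dumps(state_machine, indent=2)
print(definition_json)
```

The resulting JSON string is the kind of definition that would be supplied when creating the state machine.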
Use cluster analysis to identify groups of characteristically similar schools in the College Scorecard dataset.

Considerations:

Clustering Algorithm

Data Preparation: How will you deal with missing values? Categorical variables? Feature intercorrelations? Feature normalization or scaling? Dimensionality reduction?

Hyperparameters: How will you set the parameters (the algorithm's knobs and dials, so to speak) in order to achieve valid and useful output?

Interpretation: Is it possible to explain what each cluster represents? Did you retain or prepare a set of features that enables a meaningful interpretation of the clusters? Do the compositions of the clusters seem to make sense?

Validation: How will you measure the validity of your clustering process? Which metrics will you use and how will you apply them?
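One possible way to work through these considerations is sketched below. It is not a prescribed solution. It assumes k-means as the algorithm, median imputation for missing values, one-hot encoding for categorical variables, standard scaling, PCA to reduce correlated features, and the silhouette score for both choosing the number of clusters and validation. The column names and the randomly generated table are hypothetical stand-ins for the real College Scorecard data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.metrics import silhouette_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 300

# Synthetic placeholder table; real column names in the dataset will differ.
schools = pd.DataFrame({
    "admission_rate": rng.uniform(0.05, 1.0, n),
    "enrollment": rng.lognormal(8, 1, n),
    "tuition": rng.uniform(5000, 50000, n),
    "control": rng.choice(["public", "private_nonprofit", "for_profit"], n),
})
# Blank out roughly 10% of tuition values to mimic missing data.
schools.loc[rng.random(n) < 0.1, "tuition"] = np.nan

numeric = ["admission_rate", "enrollment", "tuition"]
categorical = ["control"]

# Data preparation: impute and scale numeric columns, one-hot encode categoricals.
prepare = ColumnTransformer(
    [
        ("num", make_pipeline(SimpleImputer(strategy="median"), StandardScaler()), numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ],
    sparse_threshold=0.0,  # always return a dense array so PCA can consume it
)

# Dimensionality reduction: keep enough components to explain 90% of the variance.
reduce = PCA(n_components=0.9, svd_solver="full")
features = reduce.fit_transform(prepare.fit_transform(schools))

# Hyperparameters and validation: sweep k and score each clustering.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    scores[k] = silhouette_score(features, labels)
best_k = max(scores, key=scores.get)

# Interpretation: summarize each cluster using the original, readable features.
schools["cluster"] = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(features)
profile = schools.groupby("cluster")[numeric].median()
print(scores)
print(profile)
```

Profiling clusters on the original columns rather than on the PCA components keeps the interpretation step readable. On real data, the silhouette score is worth comparing against other measures, such as cluster stability across random seeds.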