Classifying Handwritten Digits Using the MNIST Dataset

The goal of this data science project is to take an image of a single handwritten digit and determine which digit it is.


What You Will Learn

  • Unzipping folders and loading the dataset

  • Visualizing different images available in the dataset

  • Using the summary function for basic EDA

  • Understanding left-skew and right-skew of the dataset

  • Preprocessing the train dataset for initial predictions

  • Applying the Random Forest ensemble model for predictions

  • Using the importance function in R to extract the most important features

  • Plotting graphs of features versus MeanDecreaseGini

  • Tuning Random Forest hyperparameters and selecting the best values for this model

  • Plotting graphs of parameters against OOB (out-of-bag) errors

  • Importing the FNN library and using k-nearest neighbors as the training model

  • Importing XGBoost and converting the dataset into a DMatrix for performing predictions

  • Defining parameters and performing k-fold cross-validation using the XGBoost model

  • Predicting using XGBoost and saving the predictions as a CSV file

  • Installing the h2o package to use all available RAM and CPU cores

  • Initializing an h2o cluster

  • Initializing a deep learning neural network model

  • Defining and understanding parameters, and training neural networks for predictions

  • Plotting the confusion matrix and interpreting the result

  • Predicting the result and saving it as a CSV file

  • Shutting down the h2o cluster
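
Three of the steps listed above (k-nearest neighbors, the confusion matrix, and saving predictions as CSV) can be illustrated with a small sketch. The project itself works in R with the FNN package; the version below is a Python stand-in that uses only the standard library, so it is an analogue and not the course's code. The 3x3 toy "images" are made up for illustration, and the `ImageId`/`Label` column names are an assumption about the submission format.

```python
import csv
import io
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two flattened pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train_pixels, train_labels, query, k=3):
    """Label a query image by majority vote among its k nearest training images."""
    nearest = sorted(
        range(len(train_pixels)),
        key=lambda i: squared_distance(train_pixels[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def confusion_matrix(actual, predicted, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# Made-up 3x3 binary "images", flattened row by row.
# A ring stands in for the digit 0, a vertical bar for the digit 1.
train_pixels = [
    [1, 1, 1, 1, 0, 1, 1, 1, 1],  # 0
    [1, 1, 1, 1, 0, 1, 1, 1, 0],  # 0, one pixel missing
    [0, 1, 1, 1, 0, 1, 1, 1, 1],  # 0, one pixel missing
    [0, 1, 0, 0, 1, 0, 0, 1, 0],  # 1
    [0, 1, 0, 0, 1, 0, 0, 1, 1],  # 1, one extra pixel
    [1, 1, 0, 0, 1, 0, 0, 1, 0],  # 1, one extra pixel
]
train_labels = [0, 0, 0, 1, 1, 1]

test_pixels = [
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # a noisy 0
    [0, 1, 0, 0, 1, 0, 1, 1, 0],  # a noisy 1
]
test_labels = [0, 1]

predictions = [knn_predict(train_pixels, train_labels, q, k=3) for q in test_pixels]
matrix = confusion_matrix(test_labels, predictions, n_classes=2)

# Write predictions as CSV into an in-memory buffer.
# Column names are assumed, not taken from the project.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["ImageId", "Label"])
for image_id, label in enumerate(predictions, start=1):
    writer.writerow([image_id, label])
csv_text = buffer.getvalue()

print(predictions)
print(matrix)
print(csv_text)
```

Off-diagonal entries in the confusion matrix count misclassified images, which is what makes it useful for seeing which digits a model confuses with each other. On real data, `io.StringIO()` would be replaced with an opened file.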

Project Description

Data scientists looking for their first machine learning or data science project often begin with the handwritten digit recognition problem. The Digit Recognizer data science project makes use of the popular MNIST database of handwritten digits, collected from American Census Bureau employees and American high school students. The MNIST training set consists of 60,000 pre-processed and formatted images of handwritten digits, each 28x28 pixels. Using image recognition techniques and a chosen machine learning algorithm, a program can be built to read the handwritten digits with about 95% accuracy. The accuracy can be higher depending on the chosen machine learning algorithm.
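
To make the numbers in this description concrete, here is a minimal sketch in Python (standard library only; the project itself uses R) of how one 28x28 image becomes a row of 784 pixel features, and how an accuracy figure such as 95% is computed. It assumes pixel intensities are integers from 0 to 255. The image and the label lists are made-up placeholders, not MNIST data.

```python
IMAGE_SIDE = 28  # MNIST images are 28x28 pixels


def flatten_image(image):
    """Turn a 28x28 grid of 0-255 intensities into 784 floats in [0, 1]."""
    if len(image) != IMAGE_SIDE or any(len(row) != IMAGE_SIDE for row in image):
        raise ValueError("expected a 28x28 image")
    return [pixel / 255 for row in image for pixel in row]


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical image: a blank canvas with one vertical stroke, 20 pixels tall.
image = [[0] * IMAGE_SIDE for _ in range(IMAGE_SIDE)]
for row in range(4, 24):
    image[row][14] = 255

features = flatten_image(image)
print(len(features))  # 28 * 28 = 784 features per image

# Made-up labels: four of the five predictions are correct.
score = accuracy([7, 2, 1, 0, 4], [7, 2, 1, 0, 9])
print(score)
```

Each model in the project would see one such 784-value row per image, together with its digit label for the training rows.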


Curriculum For This Mini Project

04h 29m