Classifying Handwritten Digits using MNIST Dataset

The goal of this data science project is to take an image of a single handwritten digit and determine which digit it is.

What You Will Learn

Unzipping folders and loading the dataset
Visualizing different images available in the dataset
Using the summary function for basic EDA
Understanding left-skew and right-skew of the dataset
Preprocessing the train dataset for initial predictions
Applying the Random Forest ensemble model for predictions
Using the importance function in R to extract the most relevant features
Plotting features against MeanDecreaseGini
Tuning Random Forest hyper-parameters and selecting the best parameters for this model
Plotting OOB error against parameter values
Importing the FNN library and using K-nearest neighbors as the training model
Importing XGBoost and converting the dataset into a DMatrix for predictions
Defining parameters and performing k-fold cross-validation with the XGBoost model
Predicting with XGBoost and saving the predictions as a CSV file
Installing the h2o package to use all available RAM and CPU cores
Initializing an h2o cluster
Initializing a deep learning neural network model
Defining and understanding the parameters, and training the neural network for predictions
Plotting the confusion matrix and interpreting the results
Predicting the result and saving it as a CSV file
Shutting down the h2o cluster
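
The Random Forest steps in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: it assumes the randomForest package is installed, uses small synthetic data instead of the digit images, and the variable names and the candidate mtry values are chosen only for the example.

```r
library(randomForest)  # assumption: the randomForest package is installed

# Synthetic stand-in data (NOT the digit images): 200 rows, 10 numeric features.
set.seed(1)
n <- 200
x <- data.frame(matrix(runif(n * 10), nrow = n))
y <- factor(ifelse(x$X1 + x$X2 > 1, "a", "b"))

# Fit a forest and extract the per-feature importance table.
rf <- randomForest(x = x, y = y, ntree = 100, importance = TRUE)
imp <- importance(rf)

# Rank features by MeanDecreaseGini, largest first.
gini <- sort(imp[, "MeanDecreaseGini"], decreasing = TRUE)
barplot(gini, las = 2, ylab = "MeanDecreaseGini")

# Simple tuning loop: record the final OOB error for a few mtry values.
mtry_values <- c(2, 4, 6)  # illustrative candidates, not tuned recommendations
oob <- sapply(mtry_values, function(m) {
  fit <- randomForest(x = x, y = y, ntree = 100, mtry = m)
  fit$err.rate[fit$ntree, "OOB"]
})

# Plot OOB error against the parameter values.
plot(mtry_values, oob, type = "b", xlab = "mtry", ylab = "OOB error")
```

The mtry value with the lowest OOB error would be the natural candidate for the final model, although on real data the differences between nearby values can be small.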

Project Description

Many data scientists looking for a first machine learning or data science project start with the handwritten digit recognition problem. The Digit Recognizer data science project uses the popular MNIST database of handwritten digits, which were collected from US Census Bureau employees and high school students. The dataset consists of 60,000 training images and 10,000 test images, each a pre-processed and formatted 28x28 pixel handwritten digit. Using image recognition techniques and a chosen machine learning algorithm, a program can be built to read the handwritten digits with about 95% accuracy. The accuracy can be higher depending on the chosen machine learning algorithm.
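
To make the image format concrete, here is a minimal base R sketch. It assumes each image is stored as one row holding a label followed by 784 pixel intensities (0 to 255) in row-major order. That layout, the synthetic values, and the helper name row_to_image are assumptions for illustration and are not taken from the project files.

```r
# Synthetic stand-in for a few rows of digit data (random values, not real MNIST).
set.seed(42)
n_images <- 5
pixels <- matrix(sample(0:255, n_images * 784, replace = TRUE), nrow = n_images)
digits <- data.frame(label = sample(0:9, n_images, replace = TRUE), pixels)

# Hypothetical helper: turn one flattened row into a 28x28 matrix scaled to [0, 1].
row_to_image <- function(df, i) {
  values <- as.numeric(unlist(df[i, -1]))  # drop the label column
  matrix(values, nrow = 28, ncol = 28, byrow = TRUE) / 255
}

img <- row_to_image(digits, 1)
dim(img)  # 28 28

# Display the matrix as a grayscale picture (rows reversed so the top row is on top).
image(t(img[28:1, ]), col = gray(seq(0, 1, length.out = 256)), axes = FALSE)
```

Scaling the intensities to the range 0 to 1 is a common preprocessing choice for distance-based and neural network models. Tree-based models such as Random Forest are largely insensitive to it.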

Curriculum For This Mini Project

27-Feb-2016
04h 29m