How to visualise a tree model Multiclass Classification?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to visualise a tree model Multiclass Classification?

How to visualise a tree model Multiclass Classification?

This recipe helps you visualise a tree model Multiclass Classification

0

Recipe Objective

Visualising a model gives a better representation of how the model is working. Tree models are one of the easiest to visualise.

So this recipe is a short example of how we can visualise a tree model - Multiclass Classification.

Step 1 - Import the library

from sklearn import datasets from sklearn import metrics from sklearn.model_selection import train_test_split import matplotlib.pyplot as plt plt.style.use("ggplot") from sklearn import tree from sklearn.externals.six import StringIO import pydotplus

Here we have imported various modules like datasets, StringIO and test_train_split from differnt libraries. We will understand the use of these later while using it in the in the code snipet.
For now just have a look on these imports.

Step 2 - Setup the Data for classifier

Here we have used datasets to load the inbuilt wine dataset and we have created objects X and y to store the data and the target value respectively. dataset = datasets.load_wine() X = dataset.data; y = dataset.target X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

Step 3 - Model and its Score

Here, we are using DecisionTreeClassifier as a Machine Learning model to fit the data. model = tree.DecisionTreeClassifier() model.fit(X_train, y_train) print(model) Now we have predicted the output by passing X_test and also stored real target in expected_y. expected_y = y_test predicted_y = model.predict(X_test) Here we have printed classification report and confusion matrix for the classifier. print(metrics.classification_report(expected_y, predicted_y, target_names=dataset.target_names)) print(metrics.confusion_matrix(expected_y, predicted_y))

Step 4 - Visualizing the model

We have created a dot file for the tree and used it to create png and pdf. By using matplotlib we have created a image of the tree with the conditions used by the model. dotfile = open("tree.dot", "w") tree.export_graphviz(model, out_file = dotfile, feature_names = dataset.feature_names) dotfile.close() dot_data = StringIO() tree.export_graphviz(model, out_file=dot_data, filled=True, rounded=True, special_characters=True, feature_names = dataset.feature_names) graph = pydotplus.graph_from_dot_data(dot_data.getvalue()) graph.write_png("tree.png") graph.write_pdf("tree.pdf") import matplotlib.image as mpimg img = mpimg.imread("tree.png") plt.figure(figsize=(15,15)) plt.imshow(img) plt.show() As an output we get:


Relevant Projects

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

PySpark Tutorial - Learn to use Apache Spark with Python
PySpark Project-Get a handle on using Python with Spark through this hands-on data processing spark python tutorial.

Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Customer Churn Prediction Analysis using Ensemble Techniques
In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.