How to use Classification Metrics in Python?


This recipe helps you use Classification Metrics in Python


Recipe Objective

After fitting a classification model to a dataset, how do we evaluate it? There are many metrics we can use. Here we will use accuracy, logarithmic loss, and area under the ROC curve (AUC).

So this is the recipe on how we can use classification metrics in Python.

Step 1 - Import the library

from sklearn import datasets
from sklearn import tree, model_selection, metrics
from sklearn.model_selection import train_test_split

We have imported datasets, tree, model_selection and train_test_split, which we will need to load the data, build the model, and split the dataset.

Step 2 - Setting up the Data

We have loaded the inbuilt breast cancer dataset and stored the features in X and the target in y. We then split the data with train_test_split and set up 10-fold cross validation with model_selection.KFold (shuffle=True is required when passing a random_state).

seed = 42
dataset = datasets.load_breast_cancer()
X = dataset.data
y = dataset.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)

Step 3 - Training model and calculating Metrics

Here we will use DecisionTreeClassifier as the model.

model = tree.DecisionTreeClassifier()

Now we will calculate the different metrics. We will use cross_val_score to compute each metric, and print the mean and standard deviation of the scores.

  • Calculating Accuracy

    scoring = "accuracy"
    results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring)
    print("Accuracy: ", results.mean())
    print("Standard Deviation: ", results.std())

  • Calculating Logarithmic Loss (scikit-learn negates log loss so that higher is always better, which is why the value printed below is negative)

    scoring = "neg_log_loss"
    results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring)
    print("Logloss: ", results.mean())
    print("Standard Deviation: ", results.std())

  • Calculating Area under the ROC curve

    scoring = "roc_auc"
    results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring)
    print("AUC: ", results.mean())
    print("Standard Deviation: ", results.std())
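Cross validation aside, the same three metrics can also be computed directly on the held-out test set using the functions in sklearn.metrics. This is a minimal self-contained sketch (the random_state values are arbitrary choices for reproducibility, not part of the original recipe):

```python
from sklearn import datasets, tree, metrics
from sklearn.model_selection import train_test_split

# Load the breast cancer dataset and make a single train/test split
X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Fit a decision tree and score it on the held-out data
model = tree.DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)                 # hard class labels for accuracy
y_proba = model.predict_proba(X_test)[:, 1]    # probabilities for log loss / AUC

print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Logloss:", metrics.log_loss(y_test, y_proba))
print("AUC:", metrics.roc_auc_score(y_test, y_proba))
```

Note that metrics.log_loss here reports the ordinary (positive) log loss; the "neg_log_loss" scorer used with cross_val_score is the same quantity negated.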
So the output comes as:

Accuracy:  0.9248615725359912
Standard Deviation:  0.03454639234547574

Logloss:  -2.675538335423929
Standard Deviation:  1.2623224750420183

AUC:  0.9168731849436718
Standard Deviation:  0.027925303925433888
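The three separate cross-validation loops above can also be collapsed into a single pass with cross_validate, which accepts a list of scorer names. A minimal sketch under the same setup (the random_state is an arbitrary choice for reproducibility):

```python
from sklearn import datasets, tree
from sklearn.model_selection import KFold, cross_validate

X, y = datasets.load_breast_cancer(return_X_y=True)
model = tree.DecisionTreeClassifier(random_state=42)
kfold = KFold(n_splits=10, shuffle=True, random_state=42)

# One cross-validation run scoring all three metrics at once
results = cross_validate(model, X, y, cv=kfold,
                         scoring=["accuracy", "neg_log_loss", "roc_auc"])
for name in ("accuracy", "neg_log_loss", "roc_auc"):
    scores = results["test_" + name]
    print(name, scores.mean(), scores.std())
```

Because every scorer reuses the same folds and fitted models, this is cheaper than calling cross_val_score once per metric.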
