What is Classification Dataset in PyBrain

This recipe explains what a classification dataset is in PyBrain.

Recipe Objective - What is Classification Dataset in PyBrain?

ClassificationDataSet is used primarily to solve classification problems. Besides the input and target fields, it has an additional field called "class", which is an automatic backup of the given targets. For example, the output may be 1 or 0, or more generally one of several discrete values that indicate which class a given input belongs to.
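The role of the "class" field is easiest to see once the targets are converted to a one-of-many (one-hot) encoding, which the recipe does later with `_convertToOneOfMany()`. The following is a minimal plain-Python sketch of that idea, not PyBrain's implementation; the function name `to_one_of_many` is hypothetical and used only for illustration.

```python
# Illustrative sketch only: plain Python, not PyBrain's implementation.

def to_one_of_many(labels, nb_classes):
    """Return (one_hot_targets, class_backup) for integer class labels."""
    # The backup keeps the original integer labels, like the "class" field.
    class_backup = list(labels)
    one_hot = []
    for label in labels:
        row = [0] * nb_classes
        row[label] = 1  # mark the position of the true class
        one_hot.append(row)
    return one_hot, class_backup


labels = [0, 2, 1]
targets, classes = to_one_of_many(labels, nb_classes=3)
print(targets)  # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(classes)  # [0, 2, 1]
```

After the conversion, the targets have one column per class, which suits a network with one output unit per class, while the integer labels stay available for computing the error.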

Let's try to build a network on the iris dataset -

# Importing all the necessary libraries
from sklearn import datasets
from pybrain.datasets import ClassificationDataSet
from pybrain.utilities import percentError
from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure.modules import SoftmaxLayer
from numpy import ravel

# Loading iris dataset from sklearn datasets
iris = datasets.load_iris()

# Defining feature variables and target variable
X_data = iris.data
y_data = iris.target

# Creating a classification dataset: 4 input features, 1 target column, 3 classes
classification_dataset = ClassificationDataSet(4, 1, nb_classes=3)

# Adding each sample (feature vector and class label) to the classification dataset
for i in range(len(X_data)):
    classification_dataset.addSample(ravel(X_data[i]), y_data[i])

# Splitting data into testing and training data: 30% for testing, 70% for training
testing_data, training_data = classification_dataset.splitWithProportion(0.3)

# Copying the split samples into fresh ClassificationDataSet objects
# (in some PyBrain versions the split results may not be ClassificationDataSet instances)

# Classification dataset for test data
test_data = ClassificationDataSet(4, 1, nb_classes=3)

# Adding samples to the testing classification dataset
for n in range(testing_data.getLength()):
    test_data.addSample(testing_data.getSample(n)[0], testing_data.getSample(n)[1])

# Classification dataset for train data
train_data = ClassificationDataSet(4, 1, nb_classes=3)

# Adding samples to the training classification dataset
for n in range(training_data.getLength()):
    train_data.addSample(training_data.getSample(n)[0], training_data.getSample(n)[1])

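# Converting targets to one-of-many (one output per class);
# the original labels remain available in the 'class' field
test_data._convertToOneOfMany()
train_data._convertToOneOfMany()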

# Building a network with 4 hidden units and a SoftmaxLayer output, sized from the training data
build_network = buildNetwork(train_data.indim, 4, train_data.outdim, outclass=SoftmaxLayer)

# Building a BackpropTrainer on the training data
trainer = BackpropTrainer(build_network, dataset=train_data, learningrate=0.01, verbose=True)

# Training for 20 epochs on the training data
trainer.trainEpochs(20)

# Evaluating on the testing data: compare predicted classes with the true labels in 'class'
print('Error percentage on testing data=>', percentError(trainer.testOnClassData(dataset=test_data), test_data['class']))

Output -
Total error:  0.0892390931641
Total error:  0.0821479733597
Total error:  0.0759327938967
Total error:  0.0722385583142
Total error:  0.0690818068826
Total error:  0.0667645311923
Total error:  0.0647079622731
Total error:  0.0630345245312
Total error:  0.0608030839912
Total error:  0.0595356750412
Total error:  0.0586635639408
Total error:  0.0573043661487
Total error:  0.0559188704413
Total error:  0.0548155819544
Total error:  0.0535537679931
Total error:  0.0527051106108
Total error:  0.0515783629912
Total error:  0.0501025301423
Total error:  0.0499123823243
Total error:  0.0482250742606
Error percentage on testing data=> 20.0
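
To make the final figure easier to interpret, here is a minimal plain-Python sketch of how such an error percentage can be computed: take the class with the largest output for each sample, then report the share of samples whose predicted class differs from the true one. This is an illustration of the idea, not PyBrain's code; the helper names `predicted_classes` and `percent_error` are hypothetical, and the output values below are made up rather than taken from the run above.

```python
# Illustrative sketch only: plain Python, not PyBrain's implementation.

def predicted_classes(outputs):
    """Pick the index of the largest output for each sample (argmax)."""
    return [max(range(len(row)), key=row.__getitem__) for row in outputs]


def percent_error(predicted, actual):
    """Percentage of positions where predicted and actual labels differ."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return 100.0 * wrong / len(actual)


# Made-up softmax-style outputs for five samples
outputs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.3, 0.3, 0.4],
]
actual = [0, 1, 2, 0, 2]

predicted = predicted_classes(outputs)
print(predicted)                         # [0, 1, 2, 1, 2]
print(percent_error(predicted, actual))  # 20.0
```

Assuming the 30% test split holds 45 of the 150 iris samples, an error of 20.0 would correspond to 9 misclassified samples. Because the split and the initial weights are random, the exact figure will likely differ from run to run.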

In this way, we can use a classification dataset in PyBrain.
