Have you ever wanted to generate a dataset from within Python itself? Python can generate different types of data for different purposes.
This recipe is a short example of how to create simulated data for classification in Python.
from sklearn.datasets import make_classification
import pandas as pd
Here we have imported pandas and the make_classification function from scikit-learn. We will see how each is used in the code snippets that follow.
For now, just take note of these imports.
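Here we use make_classification to generate a classification dataset with 50 samples, 5 features (all informative, none redundant) and 3 classes. The weights argument sets the proportion of samples assigned to each class. The function returns the feature matrix and the target labels, which we store in features and output.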
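features, output = make_classification(n_samples=50,       # number of rows
                                       n_features=5,       # total number of columns
                                       n_informative=5,    # features that carry class signal
                                       n_redundant=0,      # no linear combinations of other features
                                       n_classes=3,        # number of target classes
                                       weights=[.2, .3, .5])  # class proportions; these should sum to 1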
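The call above draws new random data on every run. As a minimal sketch of how the same settings could be made repeatable, the example below passes a fixed random_state and sets flip_y to 0.0 so that no labels are randomly reassigned. Both of these values, the weights that sum to 1, and the names X_demo, y_demo, X_again and y_again are assumptions chosen for illustration, not part of the original recipe.

```python
import numpy as np
from sklearn.datasets import make_classification

# Illustrative settings: random_state=42 and flip_y=0.0 are assumptions
# added for this sketch; the other arguments mirror the recipe.
params = dict(
    n_samples=50,
    n_features=5,
    n_informative=5,
    n_redundant=0,
    n_classes=3,
    weights=[0.2, 0.3, 0.5],  # class proportions summing to 1
    flip_y=0.0,               # do not randomly reassign any labels
    random_state=42,          # fixed seed so repeated calls match
)

X_demo, y_demo = make_classification(**params)
X_again, y_again = make_classification(**params)

# With the same seed, both calls should return identical arrays
print(np.array_equal(X_demo, X_again), np.array_equal(y_demo, y_again))

# Shapes are (n_samples, n_features) and (n_samples,)
print(X_demo.shape, y_demo.shape)

# Number of samples generated for each of the 3 classes
print(np.bincount(y_demo, minlength=3))
```

Printing the per-class counts is a quick way to check that the class proportions roughly follow the weights that were requested.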
We view the first 5 observations of the features.
print("Feature Matrix: ")
print(pd.DataFrame(features, columns=["Feature 1", "Feature 2", "Feature 3", "Feature 4", "Feature 5"]).head())
We view the first 5 observations of the target.
print()
print("Target Class: ")
print(pd.DataFrame(output, columns=["TargetClass"]).head())
The output looks like this. Because no random_state is set, the exact values will differ from run to run:
Feature Matrix: 
   Feature 1  Feature 2  Feature 3  Feature 4  Feature 5
0   0.833135  -1.107635  -0.728420   0.101483   1.793259
1   1.120892  -1.856847  -2.490347   1.247622   1.594469
2  -0.980409  -3.042990  -0.482548   4.075172  -1.058840
3   0.827502   2.839329   2.943324  -2.449732   0.303014
4   1.173058  -0.519413   1.240518  -2.643039   2.406873

Target Class: 
   TargetClass
0            2
1            2
2            1
3            0
4            2
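
The features and the target are printed separately above. As a rough sketch of one way to work with them together, the example below puts both into a single DataFrame and counts the rows in each class. The names X_sim, y_sim, df and class_counts, the random_state value, and the weights that sum to 1 are assumptions made for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification

# random_state=0 is an assumption added so the sketch is repeatable
X_sim, y_sim = make_classification(
    n_samples=50,
    n_features=5,
    n_informative=5,
    n_redundant=0,
    n_classes=3,
    weights=[0.2, 0.3, 0.5],
    random_state=0,
)

# Build the same column names used in the recipe: "Feature 1" ... "Feature 5"
columns = [f"Feature {i}" for i in range(1, 6)]

# One table holding the features and the target side by side
df = pd.DataFrame(X_sim, columns=columns)
df["TargetClass"] = y_sim

print(df.head())

# Rows per class, ordered by class label
class_counts = df["TargetClass"].value_counts().sort_index()
print(class_counts)
```

Keeping everything in one DataFrame makes it easier to filter or group rows by class later on.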