How do you select which model to use for a dataset? One option is a voting ensemble, which trains several different models on the same data and combines their predictions into a single output class. With hard voting, the ensemble predicts the class chosen by the majority of the models; with soft voting, it predicts the class with the highest average predicted probability.
This recipe shows how to implement a voting ensemble in Python.
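To make the two voting rules concrete, here is a minimal sketch that applies them by hand. The label and probability arrays are made-up values from three hypothetical models, not output from the classifiers used in this recipe, and the helper names hard_vote and soft_vote are illustrative only.

```python
import numpy as np

# Hypothetical labels predicted by three models for four samples
# (rows = models, columns = samples). Values are made up.
label_votes = np.array([
    [0, 1, 2, 1],
    [0, 2, 2, 1],
    [1, 1, 2, 0],
])

def hard_vote(votes, n_classes):
    """Most frequent label per sample; ties go to the lowest label."""
    return np.array(
        [np.bincount(col, minlength=n_classes).argmax() for col in votes.T]
    )

# Hypothetical class probabilities for one sample,
# shape (n_models, n_samples, n_classes). Values are made up.
probas = np.array([
    [[0.90, 0.10, 0.00]],
    [[0.40, 0.60, 0.00]],
    [[0.45, 0.55, 0.00]],
])

def soft_vote(p):
    """Average probabilities across models, then pick the most likely class."""
    return p.mean(axis=0).argmax(axis=1)

print(hard_vote(label_votes, n_classes=3))
print(soft_vote(probas))
```

In the probability example, two of the three models would individually pick class 1, yet soft voting picks class 0 because the first model is far more confident. This is the practical difference between the two rules.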
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn import datasets
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
plt.style.use("ggplot")
We have imported three classifiers (LogisticRegression, DecisionTreeClassifier and SVC) along with VotingClassifier, which combines them.
We have loaded the Wine dataset and stored the data in X and the target in y. We have used train_test_split to hold out 30% of the data for testing. We have also used model_selection.KFold to define 10 folds for cross-validation.
seed = 42
dataset = datasets.load_wine()
X = dataset.data; y = dataset.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
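As a rough illustration of what KFold produces, the sketch below splits a toy array of 10 samples into 5 folds. The toy data and fold count are assumptions chosen to keep the printout small; they are not the Wine data. Note that recent scikit-learn versions require shuffle=True whenever random_state is set.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data (assumption): 10 samples, so each of the 5 folds
# holds out exactly 2 samples for validation.
X_toy = np.arange(10).reshape(-1, 1)

kf = KFold(n_splits=5, shuffle=True, random_state=42)

fold_sizes = []
held_out = []
for fold, (train_idx, val_idx) in enumerate(kf.split(X_toy)):
    fold_sizes.append((len(train_idx), len(val_idx)))
    held_out.extend(val_idx.tolist())
    print(f"fold {fold}: train={train_idx}, validate={val_idx}")
```

Every sample lands in the validation set exactly once across the folds, which is why the cross-validation score uses all of the training data for evaluation.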
We have made a list named estimators containing all the models we want to combine. We then pass this list to VotingClassifier as its estimators parameter; by default it uses hard (majority) voting. Finally we have calculated the cross-validation score of the ensemble.
estimators = []
model1 = LogisticRegression(); estimators.append(("logistic", model1))
model2 = DecisionTreeClassifier(); estimators.append(("cart", model2))
model3 = SVC(); estimators.append(("svm", model3))
ensemble = VotingClassifier(estimators)
results = model_selection.cross_val_score(ensemble, X_train, y_train, cv=kfold)
print(results.mean())
So the output comes as