How to tune Hyperparameters using Random Search in Python?

This recipe helps you tune hyperparameters using Random Search in Python.


Recipe Objective

Often, while working on a dataset with a machine learning model, we don't know which set of hyperparameters will give the best result. Passing every set of hyperparameters through the model manually and checking the results is tedious and may not even be feasible.

To get the best set of hyperparameters we can use Random Search. Random Search passes random combinations of hyperparameters into the model one by one and checks the results. Finally, it gives us the set of hyperparameters that produces the best result.
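To make the idea concrete, here is a minimal, framework-free sketch of random search; the search space, the evaluate function, and all names here are illustrative assumptions, not part of scikit-learn:

import random

# Hypothetical search space: each entry knows how to draw one random candidate value.
search_space = {
    "C": lambda: random.uniform(0, 4),
    "penalty": lambda: random.choice(["l1", "l2"]),
}

def evaluate(params):
    # Placeholder score; in practice this would be a cross-validated model score.
    return random.random()

best_score, best_params = float("-inf"), None
for _ in range(100):  # n_iter random draws
    candidate = {name: draw() for name, draw in search_space.items()}
    score = evaluate(candidate)
    if score > best_score:
        best_score, best_params = score, candidate

print("Best parameters found:", best_params)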

So this recipe is a short example of how we can tune hyperparameters using Random Search in Python.

Step 1 - Import the library - RandomizedSearchCV

from scipy.stats import uniform
from sklearn import linear_model, datasets
from sklearn.model_selection import RandomizedSearchCV

Here we have imported various modules like datasets, uniform, linear_model and RandomizedSearchCV from different libraries. We will understand the use of each of these later, while using them in the code snippets below.
For now, just have a look at these imports.

Step 2 - Setup the Data

Here we have used datasets to load the built-in iris dataset, and we have created objects X and y to store the data and the target values respectively.

iris = datasets.load_iris()
X = iris.data
y = iris.target
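As an optional sanity check (not part of the original recipe), we can confirm the shapes of the loaded arrays:

print(X.shape)  # (150, 4): 150 samples with 4 features each
print(y.shape)  # (150,): one class label per sample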

Step 3 - Using the Model

Here, we are using Logistic Regression as the machine learning model to tune with RandomizedSearchCV, so we create an object logistic.

# The "liblinear" solver supports both the "l1" and "l2" penalties searched below;
# the default solver in recent scikit-learn versions does not support "l1".
logistic = linear_model.LogisticRegression(solver="liblinear")

Step 4 - Parameters to be optimized

Logistic Regression has two hyperparameters, "C" and "penalty", that we want RandomizedSearchCV to optimize. So we define a distribution and a list of values from which RandomizedSearchCV will sample candidate settings.

C = uniform(loc=0, scale=4)
penalty = ["l1", "l2"]
hyperparameters = dict(C=C, penalty=penalty)
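Note that uniform(loc=0, scale=4) is a continuous distribution over the interval [0, 4]; RandomizedSearchCV draws each candidate value of C from it via its .rvs() method. A quick illustrative check (not part of the original recipe):

from scipy.stats import uniform  # already imported in Step 1

C = uniform(loc=0, scale=4)
print(C.rvs(size=5))  # five random draws from the interval [0, 4]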

Step 5 - Using RandomizedSearchCV and Printing the Results

Before using RandomizedSearchCV, let us have a look at its important parameters.

  • estimator: the model or pipeline whose hyperparameters we want RandomizedSearchCV to tune.
  • param_distributions: a dictionary mapping parameter names to distributions or lists of candidate values from which RandomizedSearchCV samples (note that, unlike GridSearchCV, this argument is named param_distributions, not param_grid).
  • scoring: the metric used to evaluate model performance and decide the best hyperparameters; if not specified, the estimator's default score method is used.
We make an object clf for RandomizedSearchCV and fit it on the dataset, i.e. X and y.

clf = RandomizedSearchCV(logistic, hyperparameters, random_state=1, n_iter=100, cv=5, verbose=0, n_jobs=-1)
best_model = clf.fit(X, y)

Now we use print statements to display the results. They give the best values of the hyperparameters.

print("Best Penalty:", best_model.best_estimator_.get_params()["penalty"])
print("Best C:", best_model.best_estimator_.get_params()["C"])

As output we get:

Best Penalty: l1
Best C: 1.668088018810296
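As an optional follow-up (not part of the original recipe), the search object is refitted on the full dataset with the best hyperparameters by default (refit=True), so it can be used directly:

print(best_model.best_score_)     # mean cross-validated score of the best hyperparameters
print(best_model.predict(X)[:5])  # predictions from the refitted best estimator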
