How to rescale features in Python?

This recipe helps you rescale features in Python

Recipe Objective

In a dataset there may be many outliers which effects the performance of the model. We can deal with it by scaling the data. Here we will be using min-max scaler for this.

So this is the recipe on how we can can rescale features in Python.

Explore the Must Know Python Libraries for Data Science and Machine Learning.

Step 1 - Importing Library

from sklearn import preprocessing import numpy as np

We have imported numpy and preprocessing which is needed.

Step 2 - Creating array

We have created a array with values on which we will perform operation. x = np.array([[-500.5], [-100.1], [0], [100.1], [900.9]])

Step 3 - Scaling the array

We have used min-max scaler to scale the data in the array in the range 0 to 1 which we have passed in the parameter. Then we have used fit transform to fit and transform the array according to the min max scaler. minmax_scale = preprocessing.MinMaxScaler(feature_range=(0, 1)) x_scale = minmax_scale.fit_transform(x) print(x) print(x_scale) So the output comes as

[[-500.5]
 [-100.1]
 [   0. ]
 [ 100.1]
 [ 900.9]]

[[0.        ]
 [0.28571429]
 [0.35714286]
 [0.42857143]
 [1.        ]]

Download Materials

What Users are saying..

profile image

Ameeruddin Mohammed

ETL (Abintio) developer at IBM
linkedin profile url

I come from a background in Marketing and Analytics and when I developed an interest in Machine Learning algorithms, I did multiple in-class courses from reputed institutions though I got good... Read More

Relevant Projects

Learn to Build an End-to-End Machine Learning Pipeline - Part 2
In this Machine Learning Project, you will learn how to build an end-to-end machine learning pipeline for predicting truck delays, incorporating Hopsworks' feature store and Weights and Biases for model experimentation.

Forecasting Business KPI's with Tensorflow and Python
In this machine learning project, you will use the video clip of an IPL match played between CSK and RCB to forecast key performance indicators like the number of appearances of a brand logo, the frames, and the shortest and longest area percentage in the video.

NLP Project on LDA Topic Modelling Python using RACE Dataset
Use the RACE dataset to extract a dominant topic from each document and perform LDA topic modeling in python.

Build a Face Recognition System in Python using FaceNet
In this deep learning project, you will build your own face recognition system in Python using OpenCV and FaceNet by extracting features from an image of a person's face.

Loan Eligibility Prediction Project using Machine learning on GCP
Loan Eligibility Prediction Project - Use SQL and Python to build a predictive model on GCP to determine whether an application requesting loan is eligible or not.

Build Multi Class Text Classification Models with RNN and LSTM
In this Deep Learning Project, you will use the customer complaints data about consumer financial products to build multi-class text classification models using RNN and LSTM.

End-to-End Snowflake Healthcare Analytics Project on AWS-1
In this Snowflake Healthcare Analytics Project, you will leverage Snowflake on AWS to predict patient length of stay (LOS) in hospitals. The prediction of LOS can help in efficient resource allocation, lower the risk of staff/visitor infections, and improve overall hospital functioning.

Topic modelling using Kmeans clustering to group customer reviews
In this Kmeans clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Hands-On Approach to Causal Inference in Machine Learning
In this Machine Learning Project, you will learn to implement various causal inference techniques in Python to determine, how effective the sprinkler is in making the grass wet.

Ensemble Machine Learning Project - All State Insurance Claims Severity Prediction
In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in python using ensemble machine learning algorithms.