How to standardise features in Python?

This recipe helps you standardise features in Python

Recipe Objective

It is very rare to find a raw dataset which perfectly follows certain specific distribution. Usually every dataset needs to be standarize by any means.

So this is the recipe on how we can standardise features in Python.

Master the Art of Data Cleaning in Machine Learning

Step 1 - Import the library

from sklearn import preprocessing import numpy as np

We have only imported numpy and preprocessing which is needed.

Step 2 - Setting up the Data

We have created an numpy array with different values. x = np.array([[-500.5], [-100.1], [0], [100.1], [900.9]])

Step 3 - Using StandardScaler

StandardScaler is used to remove the outliners and scale the data by making the mean of the data 0 and standard deviation as 1. So we are creating an object scaler to use standardScaler. We have fitted the fit data and transformed train and test data form standard scaler. Finally we have printed the dataset. scaler = preprocessing.StandardScaler() standardized_x = scaler.fit_transform(x) print(x) print(standardized_x) As an output we get

[[-500.5]
 [-100.1]
 [   0. ]
 [ 100.1]
 [ 900.9]]

[[-1.26687088]
 [-0.39316683]
 [-0.17474081]
 [ 0.0436852 ]
 [ 1.79109332]]

Download Materials

What Users are saying..

profile image

Gautam Vermani

Data Consultant at Confidential
linkedin profile url

Having worked in the field of Data Science, I wanted to explore how I can implement projects in other domains, So I thought of connecting with ProjectPro. A project that helped me absorb this topic... Read More

Relevant Projects

NLP and Deep Learning For Fake News Classification in Python
In this project you will use Python to implement various machine learning methods( RNN, LSTM, GRU) for fake news classification.

Abstractive Text Summarization using Transformers-BART Model
Deep Learning Project to implement an Abstractive Text Summarizer using Google's Transformers-BART Model to generate news article headlines.

Credit Card Fraud Detection as a Classification Problem
In this data science project, we will predict the credit card fraud in the transactional dataset using some of the predictive models.

Build CI/CD Pipeline for Machine Learning Projects using Jenkins
In this project, you will learn how to create a CI/CD pipeline for a search engine application using Jenkins.

Build Classification Algorithms for Digital Transformation[Banking]
Implement a machine learning approach using various classification techniques in Python to examine the digitalisation process of bank customers.

PyTorch Project to Build a LSTM Text Classification Model
In this PyTorch Project you will learn how to build an LSTM Text Classification model for Classifying the Reviews of an App .

Build a Customer Churn Prediction Model using Decision Trees
Develop a customer churn prediction model using decision tree machine learning algorithms and data science on streaming service data.

Build a Credit Default Risk Prediction Model with LightGBM
In this Machine Learning Project, you will build a classification model for default prediction with LightGBM.

Build a Multi Touch Attribution Machine Learning Model in Python
Identifying the ROI on marketing campaigns is an essential KPI for any business. In this ML project, you will learn to build a Multi Touch Attribution Model in Python to identify the ROI of various marketing efforts and their impact on conversions or sales..

Build a Similar Images Finder with Python, Keras, and Tensorflow
Build your own image similarity application using Python to search and find images of products that are similar to any given product. You will implement the K-Nearest Neighbor algorithm to find products with maximum similarity.