How to standardise features in Python?

This recipe helps you standardise features in Python


Recipe Objective

It is rare for a raw dataset to follow a specific distribution exactly, and its features often sit on very different scales. Most datasets therefore need to be standardised before they are used for modelling.

This recipe shows how to standardise features in Python using StandardScaler from scikit-learn.

Step 1 - Import the library

from sklearn import preprocessing
import numpy as np

We only need two imports: numpy and the preprocessing module from scikit-learn.

Step 2 - Setting up the Data

We create a NumPy array with a single feature column holding five different values.

x = np.array([[-500.5], [-100.1], [0], [100.1], [900.9]])
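The array in this step is written as a column (one inner list per value) because StandardScaler expects two-dimensional input of shape (n_samples, n_features). As a minimal sketch, assuming the data starts out as a flat one-dimensional array (the name `values` is hypothetical and not part of the recipe), it can be reshaped into a single feature column first:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical flat array holding the same five values as the recipe
values = np.array([-500.5, -100.1, 0.0, 100.1, 900.9])

# reshape(-1, 1) turns shape (5,) into (5, 1): five samples, one feature
column = values.reshape(-1, 1)
print(column.shape)

# The scaler accepts the 2-D column and returns an array of the same shape
scaled = StandardScaler().fit_transform(column)
print(scaled.shape)
```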

Step 3 - Using StandardScaler

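StandardScaler standardises each feature by subtracting its mean and dividing by its standard deviation, so the scaled feature has a mean of 0 and a standard deviation of 1. It does not remove outliers; extreme values remain in the data and still influence the mean and standard deviation used for scaling. We create an object named scaler from StandardScaler.
We then call fit_transform, which learns the mean and standard deviation of x and applies the scaling in one step. Finally, we print the original and the standardised arrays.

scaler = preprocessing.StandardScaler()
standardized_x = scaler.fit_transform(x)
print(x)
print(standardized_x)

As an output we get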

[[-500.5]
 [-100.1]
 [   0. ]
 [ 100.1]
 [ 900.9]]

[[-1.26687088]
 [-0.39316683]
 [-0.17474081]
 [ 0.0436852 ]
 [ 1.79109332]]
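To check where these numbers come from, the scaling can be reproduced by hand. The sketch below recomputes the z-scores with plain NumPy and compares them with the scaler's result; StandardScaler uses the population standard deviation, which matches NumPy's default `ddof=0`. It then reuses the fitted scaler on new data. The array `new_x` is a hypothetical example and is not part of the recipe.

```python
import numpy as np
from sklearn import preprocessing

x = np.array([[-500.5], [-100.1], [0], [100.1], [900.9]])

scaler = preprocessing.StandardScaler()
standardized_x = scaler.fit_transform(x)

# Manual z-score: subtract the column mean, divide by the population std (ddof=0)
mean = x.mean(axis=0)
std = x.std(axis=0)
manual = (x - mean) / std

# The fitted scaler stores the same statistics in mean_ and scale_
print(scaler.mean_, scaler.scale_)
print(np.allclose(manual, standardized_x))  # True

# Hypothetical new observations, scaled with the statistics learned from x
new_x = np.array([[50.0], [1000.0]])
new_scaled = scaler.transform(new_x)
print(new_scaled)

# inverse_transform maps scaled values back to the original units
restored = scaler.inverse_transform(new_scaled)
print(np.allclose(restored, new_x))  # True
```

Calling transform rather than fit_transform on new data keeps the mean and standard deviation learned from x, which is the usual way to scale a test set consistently with the training set.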
