How to convert string categorical variables into numerical variables in Python?
DATA MUNGING

How to convert string categorical variables into numerical variables in Python?

How to convert string categorical variables into numerical variables in Python?

This recipe helps you convert string categorical variables into numerical variables in Python

0
This python source code does the following: 1. Creates a data dictionary and converts it into pandas dataframe 2. Manually creates a encoding function 3. Applies the function on dataframe to encode the variable
In [1]:
## How to convert string categorical variables into numerical variables in Python
def Kickstarter_Example_77():
    print()
    print(format('How to convert strings into numerical variables in Python','*^82'))

    import warnings
    warnings.filterwarnings("ignore")

    # load libraries
    import pandas as pd

    # Create dataframe
    raw_data = {'patient': [1, 1, 1, 2, 2],
                'obs': [1, 2, 3, 1, 2],
                'treatment': [0, 1, 0, 1, 0],
                'score': ['strong', 'weak', 'normal', 'weak', 'strong']}

    df = pd.DataFrame(raw_data, columns = ['patient', 'obs', 'treatment', 'score'])
    print(); print(df)

    # Create a function that converts all values of df['score'] into numbers
    def score_to_numeric(x):
        if x=='strong': return 3
        if x=='normal': return 2
        if x=='weak':   return 1

    # Apply the function to the score variable
    df['score_num'] = df['score'].apply(score_to_numeric)
    print(); print(df)

Kickstarter_Example_77()
************How to convert strings into numerical variables in Python*************

   patient  obs  treatment   score
0        1    1          0  strong
1        1    2          1    weak
2        1    3          0  normal
3        2    1          1    weak
4        2    2          0  strong

   patient  obs  treatment   score  score_num
0        1    1          0  strong          3
1        1    2          1    weak          1
2        1    3          0  normal          2
3        2    1          1    weak          1
4        2    2          0  strong          3

Relevant Projects

Data Science Project-All State Insurance Claims Severity Prediction
Data science project in R to develop automated methods for predicting the cost and severity of insurance claims.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Learn to prepare data for your next machine learning project
Text data requires special preparation before you can start using it for any machine learning project.In this ML project, you will learn about applying Machine Learning models to create classifiers and learn how to make sense of textual data.

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Ecommerce product reviews - Pairwise ranking and sentiment analysis
This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.

Loan Eligibility Prediction using Gradient Boosting Classifier
This data science in python project predicts if a loan should be given to an applicant or not. We predict if the customer is eligible for loan based on several factors like credit score and past history.

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Predict Census Income using Deep Learning Models
In this project, we are going to work on Deep Learning using H2O to predict Census income.