# How to convert string categorical variables into numerical variables in Python?

This recipe shows how to convert string categorical variables into numerical variables in Python.

Most machine learning models cannot work directly with categorical variables stored as strings, so we need to convert them into numerical form. This can be done by creating a new feature that assigns a numeric value to each category.

This Python source code does the following:

1. Creates a data dictionary and converts it into a pandas dataframe

2. Manually creates an encoding function

3. Applies the function to the dataframe to encode the variable

So this is the recipe for converting string categorical variables into numerical variables in Python.

```
import pandas as pd
```

We have imported only pandas, which is required to build the dataset.

We have created a dictionary and passed it to pd.DataFrame to create a dataframe with the columns 'name', 'episodes' and 'gender'.
```
data = {'name': ['Sheldon', 'Penny', 'Amy', 'Penny', 'Raj', 'Sheldon'],
        'episodes': [42, 24, 31, 29, 37, 40],
        'gender': ['male', 'female', 'female', 'female', 'male', 'male']}
df = pd.DataFrame(data, columns=['name', 'episodes', 'gender'])
print(df)
```
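
Before choosing numeric codes, it can help to list the distinct values in the column rather than reading them off the printed dataframe. The following is a minimal sketch of that check using the same data; the variable names `categories` and `counts` are illustrative and not part of the original recipe.

```python
import pandas as pd

# Same data as in the recipe
data = {'name': ['Sheldon', 'Penny', 'Amy', 'Penny', 'Raj', 'Sheldon'],
        'episodes': [42, 24, 31, 29, 37, 40],
        'gender': ['male', 'female', 'female', 'female', 'male', 'male']}
df = pd.DataFrame(data, columns=['name', 'episodes', 'gender'])

# Distinct values in the column, in order of first appearance
categories = df['gender'].unique().tolist()
print(categories)

# Number of rows belonging to each category
counts = df['gender'].value_counts()
print(counts)
```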

We can see that the column 'gender' contains two categories, male and female, so we can assign a number to each category: 1 to male and 2 to female.
```
def gender_to_numeric(x):
    # Map each category to its numeric code
    if x == 'female':
        return 2
    if x == 'male':
        return 1

df['gender_num'] = df['gender'].apply(gender_to_numeric)
print(df)
```

Here we define a function that assigns a numeric value to each category and then apply it to the feature 'gender'.
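
The same encoding can also be written with a dictionary and `Series.map`, which keeps the category-to-number pairs in one place. This is a sketch of an alternative, not the method used in the recipe; the name `gender_codes` and the explicit check for unmapped values are assumptions added for illustration. With `map`, any value missing from the dictionary becomes NaN, so the sketch raises an error instead of letting that pass silently.

```python
import pandas as pd

df = pd.DataFrame({'gender': ['male', 'female', 'female',
                              'female', 'male', 'male']})

# Hypothetical mapping; the codes match the recipe (male -> 1, female -> 2)
gender_codes = {'male': 1, 'female': 2}

# Look up each value in the dictionary; values not found become NaN
df['gender_num'] = df['gender'].map(gender_codes)

# Fail loudly if any category was not covered by the mapping
unmapped = df.loc[df['gender_num'].isna(), 'gender'].unique()
if len(unmapped) > 0:
    raise ValueError(f"Unmapped categories: {list(unmapped)}")

# Every value was mapped, so the column can safely hold integers
df['gender_num'] = df['gender_num'].astype(int)
print(df)
```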

So the output comes as:

          name  episodes  gender
    0  Sheldon        42    male
    1    Penny        24  female
    2      Amy        31  female
    3    Penny        29  female
    4      Raj        37    male
    5  Sheldon        40    male

          name  episodes  gender  gender_num
    0  Sheldon        42    male           1
    1    Penny        24  female           2
    2      Amy        31  female           2
    3    Penny        29  female           2
    4      Raj        37    male           1
    5  Sheldon        40    male           1
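
Codes such as 1 and 2 put the categories on a numeric scale, which some models may read as an ordering. Where that is a concern, one common alternative is to create one new 0/1 feature per category with `pd.get_dummies`. The sketch below is not part of the original recipe; the prefix `gender` and the name `df_encoded` are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Sheldon', 'Penny', 'Amy', 'Penny', 'Raj', 'Sheldon'],
    'gender': ['male', 'female', 'female', 'female', 'male', 'male'],
})

# One 0/1 column per category: gender_female and gender_male
dummies = pd.get_dummies(df['gender'], prefix='gender', dtype=int)

# Attach the new indicator columns next to the original data
df_encoded = pd.concat([df, dummies], axis=1)
print(df_encoded)
```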
