How to process categorical features in Python?

This recipe shows how to convert a nominal categorical feature into one-hot (dummy) variables in Python, first with pandas and then with scikit-learn.
In [1]:
## How to process categorical features in Python 
def Kickstarter_Example_38():
    print()
    print(format('How to process categorical features in Python', '*^82'))

    import warnings
    warnings.filterwarnings("ignore")

    # load libraries
    from sklearn import preprocessing

    import pandas as pd

    # Create Data
    raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
                'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
                'age': [42, 52, 36, 24, 73],
                'city': ['San Francisco', 'Baltimore', 'Miami', 'Douglas', 'Boston']}
    df = pd.DataFrame(raw_data, columns = ['first_name', 'last_name', 'age', 'city'])
    print(); print(df)

    # Convert a nominal categorical feature into dummy variables using pandas
    # Create one dummy column for every unique category in df.city
    # (dtype=int gives the 0/1 output shown below; pandas 2.0+ defaults to booleans)
    print(); print(pd.get_dummies(df["city"], dtype=int))

    # Convert nominal categorical data into dummy (one-hot) features using scikit-learn
    # Convert the string category names to integer codes (assigned in sorted order)
    integerized_data = preprocessing.LabelEncoder().fit_transform(df["city"])

    # View data
    print(); print(integerized_data)

    # Convert integer categorical representations to OneHot encodings
    output = preprocessing.OneHotEncoder().fit_transform(integerized_data.reshape(-1,1)).toarray()
    print(); print(output)

Kickstarter_Example_38()
******************How to process categorical features in Python*******************

  first_name last_name  age           city
0      Jason    Miller   42  San Francisco
1      Molly  Jacobson   52      Baltimore
2       Tina       Ali   36          Miami
3       Jake    Milner   24        Douglas
4        Amy     Cooze   73         Boston

   Baltimore  Boston  Douglas  Miami  San Francisco
0          0       0        0      0              1
1          1       0        0      0              0
2          0       0        0      1              0
3          0       0        1      0              0
4          0       1        0      0              0

[4 0 3 2 1]

[[0. 0. 0. 0. 1.]
 [1. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 1. 0. 0.]
 [0. 1. 0. 0. 0.]]
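
The scikit-learn step above goes through integer codes first. As a minimal sketch, assuming a scikit-learn version in which OneHotEncoder accepts string columns directly (0.20 or later), the same matrix can be produced without LabelEncoder. The small DataFrame is rebuilt here so the snippet runs on its own, and the unseen city "Chicago" is a made-up value used only for illustration.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Same city column as in the recipe, rebuilt so this snippet is self-contained
df = pd.DataFrame({"city": ["San Francisco", "Baltimore", "Miami", "Douglas", "Boston"]})

# OneHotEncoder expects 2-D input, so select the column with a list: df[["city"]]
# handle_unknown="ignore" encodes categories not seen during fit as all zeros
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(df[["city"]]).toarray()

# One output column per category, in sorted order
categories = list(encoder.categories_[0])
print(categories)
print(encoded)

# "Chicago" is a hypothetical value that was not present when fitting
unseen = encoder.transform(pd.DataFrame({"city": ["Chicago"]})).toarray()
print(unseen)
```

Keeping the fitted encoder around matters when the same encoding has to be applied to new data later, which a one-off pd.get_dummies call does not give you.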

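The pandas call in the recipe prints the dummy columns on their own. If the goal is a single table that keeps the other features, one possible approach (a sketch, not part of the original recipe) is to pass the whole DataFrame to pd.get_dummies and name the column to encode. The "city" prefix is an illustrative choice.

```python
import pandas as pd

# Subset of the recipe's data, rebuilt so this snippet is self-contained
df = pd.DataFrame({
    "first_name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "age": [42, 52, 36, 24, 73],
    "city": ["San Francisco", "Baltimore", "Miami", "Douglas", "Boston"],
})

# columns= encodes only the listed columns and leaves the others untouched;
# prefix= labels the new columns; dtype=int yields 0/1 instead of booleans
encoded_df = pd.get_dummies(df, columns=["city"], prefix="city", dtype=int)
print(encoded_df)
```

The original city column is replaced by five city_* columns, appended after the columns that were left as they were.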
