How to load features from a dictionary in Python?

This recipe helps you load features from a dictionary in Python.

Recipe Objective

Have you tried to load features? We think yes. But have you tried to do it from a dictionary in Python?

So this is the recipe on how we can load features from a dictionary in Python using scikit-learn's DictVectorizer.

Step 1 - Import the library

from sklearn.feature_extraction import DictVectorizer

We import only DictVectorizer, the one class this recipe needs.

Step 2 - Creating a Dictionary

We create the data on which we will perform the operation. Note that employee is a list of dictionaries, one per record, rather than a single dictionary.

employee = [{"name": "Steve Miller", "age": 33., "dept": "Analytics"},
            {"name": "Lyndon Jones", "age": 42., "dept": "Finance"},
            {"name": "Baxter Morth", "age": 37., "dept": "Marketing"},
            {"name": "Mathew Scott", "age": 32., "dept": "Business"}]
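Before vectorizing, it can help to check which keys hold strings and which hold numbers, because DictVectorizer one-hot encodes string values and passes numeric values through unchanged. The snippet below is only an illustrative check, not part of the original recipe; the helper name value_types is hypothetical.

```python
# Hypothetical helper (not part of scikit-learn): list the value types seen per key.
employee = [
    {"name": "Steve Miller", "age": 33., "dept": "Analytics"},
    {"name": "Lyndon Jones", "age": 42., "dept": "Finance"},
    {"name": "Baxter Morth", "age": 37., "dept": "Marketing"},
    {"name": "Mathew Scott", "age": 32., "dept": "Business"},
]


def value_types(records):
    """Map each key to the sorted names of the value types found under it."""
    types = {}
    for record in records:
        for key, value in record.items():
            types.setdefault(key, set()).add(type(value).__name__)
    return {key: sorted(names) for key, names in types.items()}


# String-valued keys will be expanded into one column per distinct value;
# numeric keys will each stay a single column.
print(value_types(employee))
```

Here "name" and "dept" hold strings and "age" holds floats, so one numeric column plus one indicator column per distinct name and department should be expected.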

Step 3 - Extracting Features

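We create a DictVectorizer object and call fit_transform on employee to turn the records into a feature matrix. fit_transform returns a SciPy sparse matrix by default, so we call toarray() to get a dense array for printing. We then print the feature names with get_feature_names_out(), which replaces get_feature_names() (deprecated in scikit-learn 1.0 and removed in 1.2); tolist() converts the returned array to a plain Python list.

vec = DictVectorizer()
print("Feature Matrix: ")
print(vec.fit_transform(employee).toarray())
print("Feature Name: ")
print(vec.get_feature_names_out().tolist())

So the output comes as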

Feature Matrix: 
[[33.  1.  0.  0.  0.  0.  0.  0.  1.]
 [42.  0.  0.  1.  0.  0.  1.  0.  0.]
 [37.  0.  0.  0.  1.  1.  0.  0.  0.]
 [32.  0.  1.  0.  0.  0.  0.  1.  0.]]

Feature Name: 
['age', 'dept=Analytics', 'dept=Business', 'dept=Finance', 'dept=Marketing', 'name=Baxter Morth', 'name=Lyndon Jones', 'name=Mathew Scott', 'name=Steve Miller']
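
To see why three keys become nine columns, the encoding can be sketched in plain Python. This is a simplified illustration of the behaviour shown above, not scikit-learn's actual implementation: it assumes string values become "key=value" indicator columns, numeric values keep their key as the column name, and columns are sorted by name (DictVectorizer's default). The function names feature_name and vectorize are hypothetical.

```python
# Simplified sketch of the encoding; NOT scikit-learn's implementation.
employee = [
    {"name": "Steve Miller", "age": 33., "dept": "Analytics"},
    {"name": "Lyndon Jones", "age": 42., "dept": "Finance"},
    {"name": "Baxter Morth", "age": 37., "dept": "Marketing"},
    {"name": "Mathew Scott", "age": 32., "dept": "Business"},
]


def feature_name(key, value):
    """Strings become 'key=value' indicator features; numbers keep the key."""
    return f"{key}={value}" if isinstance(value, str) else key


def vectorize(records):
    """Return (sorted feature names, dense matrix as a list of rows)."""
    names = sorted({feature_name(k, v) for r in records for k, v in r.items()})
    column = {name: i for i, name in enumerate(names)}
    matrix = []
    for record in records:
        row = [0.0] * len(names)
        for key, value in record.items():
            # Indicator columns get 1.0; numeric columns get the value itself.
            row[column[feature_name(key, value)]] = (
                1.0 if isinstance(value, str) else float(value)
            )
        matrix.append(row)
    return names, matrix


names, matrix = vectorize(employee)
print(names)
for row in matrix:
    print(row)
```

Each record has exactly one non-zero "dept=" column and one non-zero "name=" column, which is the one-hot pattern visible in the feature matrix above.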
