How to convert a dictionary to a matrix or nArray in Python?
DATA MUNGING DATA CLEANING PYTHON MACHINE LEARNING RECIPES PANDAS CHEATSHEET     ALL TAGS

How to convert a dictionary to a matrix or nArray in Python?

How to convert a dictionary to a matrix or nArray in Python?

This recipe helps you convert a dictionary to a matrix or nArray in Python

0

Recipe Objective

Many a times we get data in form of dictionary and to use NLP or any model we need to preprocess the data. It becomes quite easy to work on matrix or data in vector form. So if somehow we change a dictionary dataset to a matrix then it will be quite good for us.

This python source code does the following:
1. Creates custom dictionary in python
2. Creates dictvectorizer object and converts dictionary into array
3. Extracts names of feature columns

So this is the recipe on how we can Convert a Dictionary into a Matrix or ndArray.

Step 1 - Import the library

from sklearn.feature_extraction import DictVectorizer

We have only imported DictVectorizer which is needed.

Step 2 - Setting up the Data

We have created a dictionary of data with three features named 'Pen', 'Pencil' and 'Eraser'. Each three features has values assigned to them. data_dict = [{'Pen': 2, 'Pencil': 4}, {'Pen': 4, 'Pencil': 3}, {'Pen': 1, 'Eraser': 2}, {'Pen': 2, 'Eraser': 2}] print(data_dict)

Step 3 - Converting Dictionary into Matrix

So here we want to convert a dictionary into a matrix. So we have used DictVectorizer to do so, it will create a matrix such that each column will signifies a feature and rows will be the samples of dictionary. Finally we have also printed the feature name using get_feature_names. dictvectorizer = DictVectorizer(sparse=False) features = dictvectorizer.fit_transform(data_dict) print(features) feature_name =dictvectorizer.get_feature_names() print(feature_name) So the output comes as

[{'Pen': 2, 'Pencil': 4}, {'Pen': 4, 'Pencil': 3}, {'Pen': 1, 'Eraser': 2}, {'Pen': 2, 'Eraser': 2}]

[[0. 2. 4.]
 [0. 4. 3.]
 [2. 1. 0.]
 [2. 2. 0.]]

['Eraser', 'Pen', 'Pencil']

Relevant Projects

Machine Learning or Predictive Models in IoT - Energy Prediction Use Case
In this machine learning and IoT project, we are going to test out the experimental data using various predictive models and train the models and break the energy usage.

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Predict Census Income using Deep Learning Models
In this project, we are going to work on Deep Learning using H2O to predict Census income.

Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Learn to prepare data for your next machine learning project
Text data requires special preparation before you can start using it for any machine learning project.In this ML project, you will learn about applying Machine Learning models to create classifiers and learn how to make sense of textual data.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Ensemble Machine Learning Project - All State Insurance Claims Severity Prediction
In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in python using ensemble machine learning algorithms.

Loan Eligibility Prediction using Gradient Boosting Classifier
This data science in python project predicts if a loan should be given to an applicant or not. We predict if the customer is eligible for loan based on several factors like credit score and past history.