How to convert a dictionary to a matrix or nArray in Python?
DATA MUNGING DATA CLEANING PYTHON MACHINE LEARNING RECIPES PANDAS CHEATSHEET     ALL TAGS

How to convert a dictionary to a matrix or nArray in Python?

How to convert a dictionary to a matrix or nArray in Python?

This recipe helps you convert a dictionary to a matrix or nArray in Python

0

Recipe Objective

Many a times we get data in form of dictionary and to use NLP or any model we need to preprocess the data. It becomes quite easy to work on matrix or data in vector form. So if somehow we change a dictionary dataset to a matrix then it will be quite good for us.

This python source code does the following:
1. Creates custom dictionary in python
2. Creates dictvectorizer object and converts dictionary into array
3. Extracts names of feature columns

So this is the recipe on how we can Convert a Dictionary into a Matrix or ndArray.

Step 1 - Import the library

from sklearn.feature_extraction import DictVectorizer

We have only imported DictVectorizer which is needed.

Step 2 - Setting up the Data

We have created a dictionary of data with three features named 'Pen', 'Pencil' and 'Eraser'. Each three features has values assigned to them. data_dict = [{'Pen': 2, 'Pencil': 4}, {'Pen': 4, 'Pencil': 3}, {'Pen': 1, 'Eraser': 2}, {'Pen': 2, 'Eraser': 2}] print(data_dict)

Step 3 - Converting Dictionary into Matrix

So here we want to convert a dictionary into a matrix. So we have used DictVectorizer to do so, it will create a matrix such that each column will signifies a feature and rows will be the samples of dictionary. Finally we have also printed the feature name using get_feature_names. dictvectorizer = DictVectorizer(sparse=False) features = dictvectorizer.fit_transform(data_dict) print(features) feature_name =dictvectorizer.get_feature_names() print(feature_name) So the output comes as

[{'Pen': 2, 'Pencil': 4}, {'Pen': 4, 'Pencil': 3}, {'Pen': 1, 'Eraser': 2}, {'Pen': 2, 'Eraser': 2}]

[[0. 2. 4.]
 [0. 4. 3.]
 [2. 1. 0.]
 [2. 2. 0.]]

['Eraser', 'Pen', 'Pencil']

Relevant Projects

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

Machine Learning project for Retail Price Optimization
In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Mercari Price Suggestion Challenge Data Science Project
Data Science Project in Python- Build a machine learning algorithm that automatically suggests the right product prices.

Data Science Project on Wine Quality Prediction in R
In this R data science project, we will explore wine dataset to assess red wine quality. The objective of this data science project is to explore which chemical properties will influence the quality of red wines.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

Forecast Inventory demand using historical sales data in R
In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.