How to convert a dictionary to a matrix or ndarray in Python?

This recipe helps you convert a list of dictionaries to a matrix or ndarray in Python.

Recipe Objective

Data often arrives as dictionaries, and it has to be preprocessed before it can be used in NLP or any other model. Most models expect numeric input in matrix or vector form, so converting a dictionary dataset to a matrix is a common first step.

This Python source code does the following:
1. Creates a custom list of dictionaries in Python
2. Creates a DictVectorizer object and converts the dictionaries into an array
3. Extracts the names of the feature columns

So this is the recipe for converting dictionaries into a matrix or ndarray.

Step 1 - Import the library

from sklearn.feature_extraction import DictVectorizer

DictVectorizer is the only import this recipe needs.

Step 2 - Setting up the Data

We have created a list of dictionaries with three features named 'Pen', 'Pencil' and 'Eraser'. Each dictionary is one sample and assigns values to only some of the features.

data_dict = [{'Pen': 2, 'Pencil': 4}, {'Pen': 4, 'Pencil': 3}, {'Pen': 1, 'Eraser': 2}, {'Pen': 2, 'Eraser': 2}]
print(data_dict)
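Before vectorizing, it can help to see how often each feature occurs across the samples. The following is a minimal sketch using only the standard library; the name feature_counts is illustrative and not part of the original recipe.

```python
from collections import Counter

data_dict = [
    {'Pen': 2, 'Pencil': 4},
    {'Pen': 4, 'Pencil': 3},
    {'Pen': 1, 'Eraser': 2},
    {'Pen': 2, 'Eraser': 2},
]

# Count in how many dictionaries each key (feature) appears.
feature_counts = Counter(key for record in data_dict for key in record)

# Sort by feature name for a stable, readable listing.
print(sorted(feature_counts.items()))
# [('Eraser', 2), ('Pen', 4), ('Pencil', 2)]
```

'Pen' appears in all four samples, while 'Pencil' and 'Eraser' each appear in only two, so the resulting matrix will need a placeholder value wherever a feature is absent.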

Step 3 - Converting Dictionary into Matrix

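Here we convert the list of dictionaries into a matrix using DictVectorizer. It creates a matrix in which each column represents a feature and each row represents one dictionary (sample). Passing sparse=False returns a dense NumPy array instead of a SciPy sparse matrix. Columns are sorted by feature name, and a feature that is missing from a dictionary is filled with 0. Finally, we print the feature names using get_feature_names. Note that get_feature_names was removed in scikit-learn 1.2; on newer versions, call get_feature_names_out() instead, which returns a NumPy array rather than a list.

dictvectorizer = DictVectorizer(sparse=False)
features = dictvectorizer.fit_transform(data_dict)
print(features)
feature_name = dictvectorizer.get_feature_names()
print(feature_name)

So the output comes as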

[{'Pen': 2, 'Pencil': 4}, {'Pen': 4, 'Pencil': 3}, {'Pen': 1, 'Eraser': 2}, {'Pen': 2, 'Eraser': 2}]

[[0. 2. 4.]
 [0. 4. 3.]
 [2. 1. 0.]
 [2. 2. 0.]]

['Eraser', 'Pen', 'Pencil']
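To make the layout of this output easier to follow, here is a pure-Python sketch that reproduces the same matrix for numeric values. It is not how scikit-learn implements DictVectorizer internally; dicts_to_matrix is a hypothetical helper written only for illustration, and it assumes every value is a number.

```python
data_dict = [
    {'Pen': 2, 'Pencil': 4},
    {'Pen': 4, 'Pencil': 3},
    {'Pen': 1, 'Eraser': 2},
    {'Pen': 2, 'Eraser': 2},
]


def dicts_to_matrix(records):
    """Hypothetical helper: build a dense matrix from dicts with numeric values."""
    # Collect every key seen in any record and sort the names,
    # so the column order is deterministic.
    feature_names = sorted({key for record in records for key in record})
    # One row per record; a feature absent from a record becomes 0.0.
    matrix = [
        [float(record.get(name, 0)) for name in feature_names]
        for record in records
    ]
    return matrix, feature_names


matrix, feature_names = dicts_to_matrix(data_dict)
print(feature_names)
for row in matrix:
    print(row)
```

The first column is 'Eraser', which is why the first two rows start with 0: those samples have no 'Eraser' entry.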
