How to convert a dictionary to a matrix or ndarray in Python?

This recipe helps you convert a dictionary to a matrix or ndarray in Python.

Recipe Objective

We often receive data in the form of dictionaries, and before using it in NLP or any other model we need to preprocess it. Most models work with data in matrix or vector form, so converting a dictionary-based dataset into a matrix is a useful first step.

This Python source code does the following:
1. Creates a custom list of dictionaries in Python
2. Creates a DictVectorizer object and converts the dictionaries into an array
3. Extracts the names of the feature columns

So this is the recipe on how we can convert a dictionary into a matrix or ndarray.

Step 1 - Import the library

from sklearn.feature_extraction import DictVectorizer

We only need to import DictVectorizer.

Step 2 - Setting up the Data

We have created a list of dictionaries with three features named 'Pen', 'Pencil' and 'Eraser'. Each dictionary is one sample and assigns values to only some of the features.

data_dict = [{'Pen': 2, 'Pencil': 4},
             {'Pen': 4, 'Pencil': 3},
             {'Pen': 1, 'Eraser': 2},
             {'Pen': 2, 'Eraser': 2}]
print(data_dict)
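Note that no single sample contains all three keys; the feature set is the union of the keys across all samples. A minimal plain-Python sketch of that idea follows (the variable name feature_names is ours for illustration and is not part of the recipe):

```python
data_dict = [
    {'Pen': 2, 'Pencil': 4},
    {'Pen': 4, 'Pencil': 3},
    {'Pen': 1, 'Eraser': 2},
    {'Pen': 2, 'Eraser': 2},
]

# Collect every key that appears in at least one sample, then sort
# alphabetically so the order is deterministic.
feature_names = sorted({key for record in data_dict for key in record})
print(feature_names)  # ['Eraser', 'Pen', 'Pencil']
```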

Step 3 - Converting Dictionary into Matrix

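Here we want to convert the list of dictionaries into a matrix, so we use DictVectorizer. It creates a matrix in which each column represents a feature and each row represents one sample (one dictionary); a feature that is missing from a sample is filled with 0. Setting sparse=False makes fit_transform return a dense NumPy array instead of a SciPy sparse matrix. Finally, we print the feature names using get_feature_names_out (available in scikit-learn 1.0 and later; older releases used get_feature_names, which was removed in 1.2).

dictvectorizer = DictVectorizer(sparse=False)
features = dictvectorizer.fit_transform(data_dict)
print(features)
feature_name = dictvectorizer.get_feature_names_out().tolist()
print(feature_name)

The output is: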

[{'Pen': 2, 'Pencil': 4}, {'Pen': 4, 'Pencil': 3}, {'Pen': 1, 'Eraser': 2}, {'Pen': 2, 'Eraser': 2}]

[[0. 2. 4.]
 [0. 4. 3.]
 [2. 1. 0.]
 [2. 2. 0.]]

['Eraser', 'Pen', 'Pencil']
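To make the layout of this output concrete, the plain-Python sketch below rebuilds the same matrix by hand. It only illustrates the result and is not how DictVectorizer is implemented; the helper name dicts_to_matrix is hypothetical, and it assumes every value is numeric.

```python
def dicts_to_matrix(records):
    """Return (matrix, feature_names) for a list of {feature: number} dicts."""
    # One column per distinct key, in sorted order.
    feature_names = sorted({key for record in records for key in record})
    # One row per record; a key missing from a record becomes 0.0.
    matrix = [
        [float(record.get(name, 0)) for name in feature_names]
        for record in records
    ]
    return matrix, feature_names


data_dict = [
    {'Pen': 2, 'Pencil': 4},
    {'Pen': 4, 'Pencil': 3},
    {'Pen': 1, 'Eraser': 2},
    {'Pen': 2, 'Eraser': 2},
]

matrix, feature_names = dicts_to_matrix(data_dict)
for row in matrix:
    print(row)
print(feature_names)
```

The columns follow the sorted feature names, which is why 'Eraser' comes first. A zero in the matrix can mean either an explicit 0 or an absent key; after conversion the two cases look the same.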
