How to utilise Pandas dataframe & series for data wrangling?

This recipe helps you utilise Pandas dataframe & series for data wrangling

Recipe Objective

Pandas offers many data wrangling methods. Have you tried applying any of them to a dataframe or a series?

This recipe shows how to use a Pandas dataframe and series for data wrangling.

Step 1 - Importing Library

import pandas as pd

We only need to import pandas, here under its conventional alias pd.

Step 2 - Creating a series

We create a series of numbers in the object floodingReports, first with the default integer index and then with a county name as the index label for each number.

floodingReports = pd.Series([5, 6, 2, 9, 12])
print(floodingReports)

floodingReports = pd.Series([5, 6, 2, 9, 12], index=["Cochise County", "Pima County", "Santa Cruz County", "Maricopa County", "Yuma County"])
print(floodingReports)

Step 3 - Data Wrangling on series

First we print the value stored under a given index label. Then we print only the entries whose value is greater than 6.

print(floodingReports["Cochise County"])
print(floodingReports[floodingReports > 6])
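The condition in this step can be broken into smaller pieces. The following is a minimal sketch, assuming the same floodingReports series as in Step 2; the names mask, filtered, and labels_over_six are introduced here only for illustration. It shows that the comparison builds a boolean series, and that the index labels alone can be read from the filtered result.

```python
import pandas as pd

floodingReports = pd.Series(
    [5, 6, 2, 9, 12],
    index=["Cochise County", "Pima County", "Santa Cruz County",
           "Maricopa County", "Yuma County"],
)

# Label-based lookup with .loc; same result as floodingReports["Cochise County"].
cochise_value = floodingReports.loc["Cochise County"]

# The comparison returns a boolean series that shares the original index.
mask = floodingReports > 6

# Indexing with the mask keeps the matching entries (labels and values).
filtered = floodingReports[mask]

# To get only the index labels, read .index from the filtered series.
labels_over_six = filtered.index.tolist()

print(cochise_value)
print(mask)
print(labels_over_six)
```

Keeping the mask in its own variable makes it easy to reuse or to combine with other conditions using & and |.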

Step 4 - Creating a series from dictionary

We create a series from a dictionary by passing the dictionary to pd.Series. The dictionary keys become the index labels.

fireReports_dict = {"Cochise County": 12, "Pima County": 342, "Santa Cruz County": 13, "Maricopa County": 42, "Yuma County": 52}
fireReports = pd.Series(fireReports_dict)
print(fireReports)
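When building a series from a dictionary, pd.Series also accepts an index argument that selects and orders the keys. The sketch below assumes the same dictionary as above; the list wanted and the label "Pinal County" are hypothetical and chosen only to show what happens when a label is missing from the dictionary.

```python
import pandas as pd

fireReports_dict = {"Cochise County": 12, "Pima County": 342,
                    "Santa Cruz County": 13, "Maricopa County": 42,
                    "Yuma County": 52}

# Hypothetical selection: "Pinal County" is not a key in the dictionary.
wanted = ["Yuma County", "Pima County", "Pinal County"]

# The series follows the order of `wanted`; missing keys become NaN.
subset = pd.Series(fireReports_dict, index=wanted)
print(subset)
```

Because NaN is a floating-point value, the resulting series holds floats rather than integers.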

Step 5 - Changing the index of series

We can change the index of a series by assigning a new list of labels to its index attribute. The new list must have the same length as the series.

fireReports.index = ["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"]
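Assigning to the index attribute modifies the series in place. As an alternative, the sketch below uses Series.rename with a function, which returns a new series and leaves the original untouched. It also shows the error raised when the new index has the wrong length. The names shortened and length_error are illustrative, and the label transformation is only an assumption about how one might shorten the county names.

```python
import pandas as pd

fireReports = pd.Series({"Cochise County": 12, "Pima County": 342,
                         "Santa Cruz County": 13, "Maricopa County": 42,
                         "Yuma County": 52})

# rename with a function is applied to every index label and
# returns a new series; fireReports itself keeps its labels.
shortened = fireReports.rename(lambda label: label.replace(" County", ""))
print(shortened)

# Assigning an index of the wrong length raises a ValueError.
try:
    fireReports.index = ["Cochise", "Pima"]
    length_error = None
except ValueError as exc:
    length_error = str(exc)

print(length_error)
```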

Step 6 - Creating a dataframe from dictionary

We create a dataframe from a dictionary by passing the dictionary to pd.DataFrame. Each key becomes a column name and each list becomes that column's values.

data = {"county": ["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"], "year": [2012, 2012, 2013, 2014, 2014], "reports": [4, 24, 31, 2, 3]}
df = pd.DataFrame(data)
print(df)
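pd.DataFrame also takes a columns argument that controls which dictionary keys are used and in what order. The following is a small sketch assuming the same data dictionary; the names ordered and with_extra and the column "damage" are hypothetical and not part of the recipe.

```python
import pandas as pd

data = {"county": ["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"],
        "year": [2012, 2012, 2013, 2014, 2014],
        "reports": [4, 24, 31, 2, 3]}

# columns= selects dictionary keys and fixes their order.
ordered = pd.DataFrame(data, columns=["year", "county", "reports"])
print(ordered)

# A name that is not a key in the dictionary ("damage", hypothetical)
# yields a column filled with missing values; omitted keys are dropped.
with_extra = pd.DataFrame(data, columns=["county", "reports", "damage"])
print(with_extra)
```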

Step 7 - Performing Wrangling on dataframe

We perform three wrangling operations on the dataframe df created in Step 6, for better understanding.

  • Adding a new column:
    df["newsCoverage"] = pd.Series([42.3, 92.1, 12.2, 39.3, 30.2])
    print(df)
  • Deleting a column:
    del df["newsCoverage"]
    print(df)
  • Transposing the dataframe:
    print(df.T)

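Two details of these operations are easy to miss. This sketch assumes a dataframe built like the one in Step 6; the shortened series and the names trimmed and transposed are illustrative. It shows that a series assigned as a column is aligned on index labels rather than position, and that drop is a non-mutating alternative to del.

```python
import pandas as pd

df = pd.DataFrame({"county": ["Cochice", "Pima", "Santa Cruz", "Maricopa", "Yuma"],
                   "year": [2012, 2012, 2013, 2014, 2014],
                   "reports": [4, 24, 31, 2, 3]})

# A series assigned as a column is matched on index labels.
# This hypothetical series only has labels 0, 1, 2, so rows 3 and 4 get NaN.
df["newsCoverage"] = pd.Series([42.3, 92.1, 12.2])
print(df)

# drop returns a new dataframe and leaves df unchanged, unlike del.
trimmed = df.drop(columns=["newsCoverage"])
print(trimmed)

# .T swaps rows and columns: 5 rows x 3 columns becomes 3 rows x 5 columns.
transposed = trimmed.T
print(transposed)
```

Using drop is convenient when the original dataframe is still needed later, since del removes the column from the object itself.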
Running the code from Steps 2 to 7 in order prints the following output:

0     5
1     6
2     2
3     9
4    12
dtype: int64

Cochise County        5
Pima County           6
Santa Cruz County     2
Maricopa County       9
Yuma County          12
dtype: int64

5

Maricopa County     9
Yuma County        12
dtype: int64

Cochise County        12
Pima County          342
Santa Cruz County     13
Maricopa County       42
Yuma County           52
dtype: int64

       county  year  reports
0     Cochice  2012        4
1        Pima  2012       24
2  Santa Cruz  2013       31
3    Maricopa  2014        2
4        Yuma  2014        3

       county  year  reports  newsCoverage
0     Cochice  2012        4          42.3
1        Pima  2012       24          92.1
2  Santa Cruz  2013       31          12.2
3    Maricopa  2014        2          39.3
4        Yuma  2014        3          30.2

       county  year  reports
0     Cochice  2012        4
1        Pima  2012       24
2  Santa Cruz  2013       31
3    Maricopa  2014        2
4        Yuma  2014        3

               0     1           2         3     4
county   Cochice  Pima  Santa Cruz  Maricopa  Yuma
year        2012  2012        2013      2014  2014
reports        4    24          31         2     3
