How to do string munging in Pandas?

This recipe helps you do string munging in Pandas
Last Updated: 06 Sep 2022


Recipe Objective

Have you ever tried string munging? That is, selecting part of a string, or building a new string from the strings available in a DataFrame.

This recipe shows how to do string munging in Pandas.
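As a quick illustration of both ideas, here is a minimal sketch. The two-row `people` frame and the variable names are hypothetical and used only for this example; the recipe's own DataFrame is built in Step 2.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
people = pd.DataFrame({"first_name": ["Jason", "Molly"],
                       "last_name": ["Miller", "Jacobson"]})

# Select part of a string: the first letter of each first name
initials = people["first_name"].str[0]

# Build a new string from existing ones: "Last, First"
display_name = people["last_name"] + ", " + people["first_name"]

print(initials.tolist())      # ['J', 'M']
print(display_name.tolist())  # ['Miller, Jason', 'Jacobson, Molly']
```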


Step 1 - Import the library

import pandas as pd
import numpy as np
import re

We import only pandas, numpy and re, which are all this recipe needs.

Step 2 - Creating DataFrame

We create a dictionary and pass it to pd.DataFrame to build the DataFrame. Note that one email is missing (np.nan).

# Raw data: one list per column
raw_data = {"first_name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
            "last_name": ["Miller", "Jacobson", "Ali", "Milner", "Cooze"],
            "email": ["jas203@gmail.com", "momomolly@gmail.com", np.nan, "battler@milner.com", "Ames1234@yahoo.com"]}

# Build the DataFrame with an explicit column order
df = pd.DataFrame(raw_data, columns=["first_name", "last_name", "email"])
print(); print(df)

Step 3 - Applying Different Munging Operation

First, suppose we want to check which values in the "email" column contain the string "gmail".

print(df["email"].str.contains("gmail"))

Next, suppose we want to separate each email into three parts: the characters before "@", the domain name between "@" and ".", and the ending after the ".".

# Three capture groups: user name, domain name, top-level domain
pattern = r"([A-Z0-9._%+-]+)@([A-Z0-9.-]+)\.([A-Z]{2,4})"
print(df["email"].str.findall(pattern, flags=re.IGNORECASE))

The output is:

  first_name last_name                email
0      Jason    Miller     jas203@gmail.com
1      Molly  Jacobson  momomolly@gmail.com
2       Tina       Ali                  NaN
3       Jake    Milner   battler@milner.com
4        Amy     Cooze   Ames1234@yahoo.com

0     True
1     True
2      NaN
3    False
4    False
Name: email, dtype: object

0       [(jas203, gmail, com)]
1    [(momomolly, gmail, com)]
2                          NaN
3     [(battler, milner, com)]
4     [(Ames1234, yahoo, com)]
Name: email, dtype: object

0    True
1    True
2     NaN
3    True
4    True
Name: email, dtype: object
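The last output above is not produced by any statement shown in Step 3. It is consistent with checking each email against the whole pattern, for example with `Series.str.match`, though that is an assumption. The sketch below shows such a check, and also uses `Series.str.extract` to place the three parts in separate columns. The group names `user`, `domain` and `tld` and the variable names are illustrative, not part of the recipe.

```python
import re

import numpy as np
import pandas as pd

# Same emails as in the recipe's DataFrame
emails = pd.Series(
    ["jas203@gmail.com", "momomolly@gmail.com", np.nan,
     "battler@milner.com", "Ames1234@yahoo.com"],
    name="email",
)

# Same pattern as in Step 3, with named groups (names are illustrative)
pattern = (r"(?P<user>[A-Z0-9._%+-]+)@"
           r"(?P<domain>[A-Z0-9.-]+)\."
           r"(?P<tld>[A-Z]{2,4})")

# True where the email matches the pattern from its first character.
# na=False makes the missing email count as non-matching instead of
# being reported as missing.
valid = emails.str.match(pattern, flags=re.IGNORECASE, na=False)
print(valid)

# One column per named group; rows without a match hold missing values
parts = emails.str.extract(pattern, flags=re.IGNORECASE)
print(parts)
```

Compared with `str.findall`, which returns a list of tuples per row, `str.extract` returns a DataFrame, so the parts can be joined back to `df` or filtered directly.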
