How to parse XML in python

In this recipe, we will learn how to parse XML in python using Minidom, ElementTree, and BeautifulSoup libraries of python.

How to parse XML in python

In this tutorial, you will learn how to parse XML in python using –
• Minidom
• ElementTree
• BeautifulSoup


XML is an abbreviation for eXtensible Markup Language. It was created to store and convey small to medium volumes of data and is commonly used for the exchange of structured data.

Python can parse and alter XML documents. To parse an XML document, you must have the whole XML document in memory. Let us see how can we parse XML in python through examples.

Deep Learning Project for Text Detection in Images using Python 

Data stored in the XML file used for examples –


  
     Item1
     Item2
     Item3
     Item4
     Item5
  

Parsing XML in python using Minidom

The World Wide Web Consortium's DOM (document object model) is a cross-language API for reading and changing XML content. Python allows you to parse XML files using xml.dom.minidom. It is less complex than the complete DOM API.

Code:
#importing minidom library
from xml.dom import minidom

#parsing XML file
xmldoc = minidom.parse('D:/ProjectPro/Tutorials/How to parse XML in python/items.xml')

#getting list of elements
itemlist = xmldoc.getElementsByTagName('item')

#printing number of elements
print("Number of elements: ",len(itemlist))

#printing name of first element
print("Name of First element:", itemlist[0].attributes['name'].value)

#printing name of all the elements
print("Name of all the elements:")
for s in itemlist:
print(s.attributes['name'].value)

Output:
Number of elements:  5
Name of First element: item1
Name of all the elements:
item1
item2
item3
item4
item5

Parsing XML in python using ElementTree

ElementTree is an XML manipulation API. ElementTree is a simple way to work with XML files.

Code:
#importing ElementTree library
import xml.etree.ElementTree as ET

#parsing XML file
tree = ET.parse('D:/ProjectPro/Tutorials/How to parse XML in python/items.xml')

#fetching the root element
root = tree.getroot()

#printing the data
for elem in root:
   for subelem in elem:
     print(subelem.text)

  
Output:
Item1
Item2
Item3
Item4
Item5

Parsing XML in python using BeautifulSoup

Beautiful Soup is a Python package that is used for extracting data from HTML and XML files. Let us see an example to parse XML in python using the Beautiful Soup library.

Code:
#importing beautifulsoup library
from bs4 import BeautifulSoup

#XML data
data="""





"""

#parsing XML data
b=BeautifulSoup(data)

#printing elements of the data
print(b.foo.bar.type["foobar"])
print(b.foo.bar.findAll("type"))
print(b.foo.bar.findAll("type")[0]["foobar"])
print(b.foo.bar.findAll("type")[2]["foobar"])

 
Output:
1
[, , ]
1
3

What Users are saying..

profile image

Ray han

Tech Leader | Stanford / Yale University
linkedin profile url

I think that they are fantastic. I attended Yale and Stanford and have worked at Honeywell,Oracle, and Arthur Andersen(Accenture) in the US. I have taken Big Data and Hadoop,NoSQL, Spark, Hadoop... Read More

Relevant Projects

GCP MLOps Project to Deploy ARIMA Model using uWSGI Flask
Build an end-to-end MLOps Pipeline to deploy a Time Series ARIMA Model on GCP using uWSGI and Flask

Build an optimal End-to-End MLOps Pipeline and Deploy on GCP
Learn how to build and deploy an end-to-end optimal MLOps Pipeline for Loan Eligibility Prediction Model in Python on GCP

Machine Learning Project to Forecast Rossmann Store Sales
In this machine learning project you will work on creating a robust prediction model of Rossmann's daily sales using store, promotion, and competitor data.

ML Model Deployment on AWS for Customer Churn Prediction
MLOps Project-Deploy Machine Learning Model to Production Python on AWS for Customer Churn Prediction

Learn to Build an End-to-End Machine Learning Pipeline - Part 2
In this Machine Learning Project, you will learn how to build an end-to-end machine learning pipeline for predicting truck delays, incorporating Hopsworks' feature store and Weights and Biases for model experimentation.

Machine Learning project for Retail Price Optimization
In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.

Build CNN Image Classification Models for Real Time Prediction
Image Classification Project to build a CNN model in Python that can classify images into social security cards, driving licenses, and other key identity information.

MLOps AWS Project on Topic Modeling using Gunicorn Flask
In this project we will see the end-to-end machine learning development process to design, build and manage reproducible, testable, and evolvable machine learning models by using AWS

Ola Bike Rides Request Demand Forecast
Given big data at taxi service (ride-hailing) i.e. OLA, you will learn multi-step time series forecasting and clustering with Mini-Batch K-means Algorithm on geospatial data to predict future ride requests for a particular region at a given time.

Build an Image Segmentation Model using Amazon SageMaker
In this Machine Learning Project, you will learn to implement the UNet Architecture and build an Image Segmentation Model using Amazon SageMaker