Python XML Parsing

In this tutorial, we shall learn to parse XML documents in the Python programming language. Several libraries are available for this, and we shall work through examples using ElementTree.

We shall look at examples that parse an XML file, read attributes, and read child elements.

We shall use the following XML file, saved as sample.xml, for the examples in this tutorial.

<?xml version="1.0" encoding="UTF-8" ?>

<holidays year="2017">
    <holiday type="other">
        <date>Jan 1</date>
        <name>New Year</name>
    </holiday>
    <holiday type="public">
        <date>Oct 2</date>
        <name>Gandhi Jayanti</name>
    </holiday>
</holidays>
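
If you would rather not create sample.xml on disk, the same document can be parsed from a string. The following is a minimal sketch; the variable name SAMPLE_XML is only illustrative and is not used elsewhere in this tutorial.

```python
import xml.etree.ElementTree as ET

# The sample document from above, held in a string instead of sample.xml.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8" ?>
<holidays year="2017">
    <holiday type="other">
        <date>Jan 1</date>
        <name>New Year</name>
    </holiday>
    <holiday type="public">
        <date>Oct 2</date>
        <name>Gandhi Jayanti</name>
    </holiday>
</holidays>"""

# fromstring() returns the root element directly, so no getroot() call is needed.
# Passing bytes lets the parser apply the encoding named in the XML declaration.
root = ET.fromstring(SAMPLE_XML.encode("utf-8"))

print(root.tag)    # holidays
print(len(root))   # 2 (number of direct children)
```

ET.parse('sample.xml').getroot() and ET.fromstring(...) both give the same kind of root element, so the later examples work with either.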

ElementTree – Python XML Parser

ElementTree ships with Python as the xml.etree.ElementTree module of the standard library, so there is nothing extra to install.

Get Root Tag Name

example.py – Python Program

# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
tag = root.tag
print(tag)

Output

$ python python_xml_parse_ElementTree.py
holidays
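
Parsing fails with xml.etree.ElementTree.ParseError when the input is not well-formed XML. The sketch below shows one way to guard against that; parse_root is a hypothetical helper written for this illustration, not part of ElementTree.

```python
import xml.etree.ElementTree as ET

def parse_root(xml_text):
    """Return the root element, or None if xml_text is not well-formed XML."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as err:
        print("could not parse XML:", err)
        return None

good = parse_root('<holidays year="2017"></holidays>')
bad = parse_root('<holidays year="2017">')   # closing tag is missing

print(good.tag)   # holidays
print(bad)        # None
```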

Get Attributes of Root

example.py – Python Program

# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()

# get all attributes
attributes = root.attrib
print(attributes)

# extract a particular attribute
year = attributes.get('year')
print('year : ',year)

Output

$ python python_xml_parse_ElementTree.py
{'year': '2017'}
year :  2017
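
Attribute values are always returned as strings, and an element also offers get() directly, with an optional default for attributes that are absent. A small sketch follows; the attribute name region is made up here to show the default and does not exist in the sample file.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<holidays year="2017"></holidays>')

# Attribute values are strings, so convert before doing arithmetic.
year = int(root.get('year'))

# 'region' is a hypothetical attribute that is not present, so the default is returned.
region = root.get('region', 'n/a')

print(year + 1)   # 2018
print(region)     # n/a
```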

Iterate Over Child Nodes of Root

example.py – Python Program

# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()

# iterate over the direct children of root whose tag is 'holiday'
for holiday in root.findall('holiday'):
    print(holiday)

Output

$ python python_xml_parse_ElementTree.py
<Element 'holiday' at 0x7fb5a107d3b8>
<Element 'holiday' at 0x7fb59fc2f868>
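
Note that findall() with a plain tag name matches only direct children of the element it is called on. To reach elements deeper in the tree, iter() walks the whole subtree. The sketch below uses a compact copy of the sample document to contrast the two.

```python
import xml.etree.ElementTree as ET

XML = """<holidays year="2017">
    <holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
    <holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>"""
root = ET.fromstring(XML)

# 'name' is a grandchild of root, so findall() on root finds nothing.
direct = [el.text for el in root.findall('name')]

# iter() searches root and all of its descendants.
nested = [el.text for el in root.iter('name')]

print(direct)   # []
print(nested)   # ['New Year', 'Gandhi Jayanti']
```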

Iterate Over Child Nodes of Root and Get Their Attributes

example.py – Python Program

# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()

# iterate over child nodes
for holiday in root.findall('holiday'):

    # get all attributes of a node
    attributes = holiday.attrib
    print(attributes)

    # get a particular attribute
    # (named holiday_type so that the built-in type() is not shadowed)
    holiday_type = attributes.get('type')
    print(holiday_type)

Output

$ python python_xml_parse_ElementTree.py
{'type': 'other'}
other
{'type': 'public'}
public
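
When only nodes with a particular attribute value are needed, the limited XPath syntax that ElementTree supports can do the filtering inside findall(). A minimal sketch, again using a compact copy of the sample document:

```python
import xml.etree.ElementTree as ET

XML = """<holidays year="2017">
    <holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
    <holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>"""
root = ET.fromstring(XML)

# The predicate [@type='public'] keeps only holidays whose type attribute matches.
public_holidays = root.findall("holiday[@type='public']")
public_names = [holiday.find('name').text for holiday in public_holidays]

print(public_names)   # ['Gandhi Jayanti']
```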

Access Elements of a Node

example.py – Python Program

# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()

# iterate over all holiday nodes
for holiday in root.findall('holiday'):

    # access element - name
    name = holiday.find('name').text
    print('name : ', name)

    # access element - date
    date = holiday.find('date').text
    print('date : ', date)

Output

$ python python_xml_parse_ElementTree.py
name :  New Year
date :  Jan 1
name :  Gandhi Jayanti
date :  Oct 2
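
find() returns None when the requested child is missing, so find('name').text would raise AttributeError for such a node. findtext() avoids this by accepting a default. In the sketch below, the second holiday deliberately has no name element; that change to the sample data is an assumption made only for this illustration.

```python
import xml.etree.ElementTree as ET

# Variant of the sample document: the second holiday has no <name> element.
XML = """<holidays year="2017">
    <holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
    <holiday type="public"><date>Oct 2</date></holiday>
</holidays>"""
root = ET.fromstring(XML)

# findtext() returns the child's text, or the default if the child is absent.
names = [
    holiday.findtext('name', default='(unnamed)')
    for holiday in root.findall('holiday')
]

print(names)   # ['New Year', '(unnamed)']
```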

Access Elements of a Node Without Knowing Their Tag Names

example.py – Python Program

# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()

for holiday in root.findall('holiday'):
    # access all elements in node
    for element in holiday:
        ele_name = element.tag
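        # read the text of the current child directly; find() would return
        # only the first child with this tag if tags were repeated
        ele_value = element.text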
        print(ele_name, ' : ', ele_value)

Output

$ python python_xml_parse_ElementTree.py
date  :  Jan 1
name  :  New Year
date  :  Oct 2
name  :  Gandhi Jayanti
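
The same loop over child elements can be used to collect each holiday into a dictionary, which is often easier to work with than printed lines. The following is one possible sketch; holiday_to_dict is a hypothetical helper name, and merging the attributes with the child elements is a design choice made for this example.

```python
import xml.etree.ElementTree as ET

XML = """<holidays year="2017">
    <holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
    <holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>"""
root = ET.fromstring(XML)

def holiday_to_dict(holiday):
    """Combine a node's attributes and its child elements into one dict."""
    record = dict(holiday.attrib)          # e.g. {'type': 'other'}
    for element in holiday:                # every direct child, whatever its tag
        record[element.tag] = element.text
    return record

holidays = [holiday_to_dict(h) for h in root.findall('holiday')]

for record in holidays:
    print(record)
```

If a child element shared its tag name with an attribute, the child's text would overwrite the attribute value in this sketch, so the two may need separate keys in real data.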