In this Python tutorial, you will learn how to parse XML using the built-in xml.etree.ElementTree module. The examples cover reading an XML file, getting the root tag, accessing attributes, looping through child nodes, reading element text, parsing XML from a string, finding nested elements, handling missing tags safely, and working with XML namespaces.
Python XML parsing with ElementTree
XML is a structured markup format used in configuration files, data exchange, feeds, reports, and many older APIs. In Python, the standard library provides xml.etree.ElementTree, commonly imported as ET, for parsing and navigating XML documents without installing an external package.
ElementTree represents an XML document as a tree. The top-level element is called the root element. Each XML tag becomes an element object, and you can read its tag name, attributes, text, and child elements using simple methods such as .tag, .attrib, .find(), .findall(), and .iter().
ElementTree methods commonly used for XML parsing in Python
| ElementTree method or property | Use in XML parsing |
|---|---|
| ET.parse('file.xml') | Reads and parses XML from a file. |
| ET.fromstring(xml_text) | Parses XML from a string. |
| getroot() | Returns the root element of the XML tree. |
| element.tag | Returns the name of an XML tag. |
| element.attrib | Returns attributes as a dictionary. |
| element.find('tag') | Returns the first matching child element. |
| element.findall('tag') | Returns all matching child elements. |
| element.iter('tag') | Iterates through matching elements at any depth. |
| element.text | Returns the text inside an element. |
Sample XML file used in the Python examples
The examples in the rest of this tutorial use the following XML file.
sample.xml
<?xml version="1.0" encoding="UTF-8" ?>
<holidays year="2017">
    <holiday type="other">
        <date>Jan 1</date>
        <name>New Year</name>
    </holiday>
    <holiday type="public">
        <date>Oct 2</date>
        <name>Gandhi Jayanti</name>
    </holiday>
</holidays>
The root element is <holidays>. It has one attribute, year. Inside the root, there are two <holiday> elements, and each holiday has a type attribute, a <date> element, and a <name> element.
Read an XML file in Python using ElementTree
To parse an XML file, import xml.etree.ElementTree, call ET.parse() with the XML file path, and then call getroot() to get the root element.
import xml.etree.ElementTree as ET
tree = ET.parse('sample.xml')
root = tree.getroot()
Once you have the root element, you can start reading tags, attributes, and child elements from the XML tree.
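If sample.xml is not on disk yet, the steps above can be tried with a throwaway copy. The following is a minimal sketch, assuming a temporary directory is an acceptable place for the file; the variable names sample, folder, and path are illustrative only.

```python
import tempfile
from pathlib import Path
import xml.etree.ElementTree as ET

# Same content as sample.xml in this tutorial.
sample = '''<?xml version="1.0" encoding="UTF-8" ?>
<holidays year="2017">
    <holiday type="other">
        <date>Jan 1</date>
        <name>New Year</name>
    </holiday>
    <holiday type="public">
        <date>Oct 2</date>
        <name>Gandhi Jayanti</name>
    </holiday>
</holidays>
'''

# Write the file into a temporary folder (an assumption for this sketch),
# then parse it exactly as in the snippet above.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / 'sample.xml'
    path.write_text(sample, encoding='utf-8')
    tree = ET.parse(path)
    root = tree.getroot()

root_tag = root.tag
child_count = len(root)  # number of direct children of the root

print(root_tag)     # holidays
print(child_count)  # 2
```

The parsed tree lives in memory, so root remains usable after the temporary folder is removed.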
1. Get the root tag name from XML in Python
In the following program, we get the tag name of the root node.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
tag = root.tag
print(tag)
Output
$ python python_xml_parse_ElementTree.py
holidays
The output is holidays because <holidays> is the document root in sample.xml.
2. Get XML attributes of the root element in Python
In the following program, we access the attributes of the root node.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
# get all attributes
attributes = root.attrib
print(attributes)
# extract a particular attribute
year = attributes.get('year')
print('year :', year)
Output
$ python python_xml_parse_ElementTree.py
{'year': '2017'}
year : 2017
The attrib property gives a dictionary of attributes. Using .get('year') is safer than direct dictionary access because it returns None if the attribute is not present.
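The difference between the two access styles can be sketched as follows. The root element is inlined as a string so the snippet runs on its own, and the region attribute is a hypothetical name chosen because it does not exist in the sample. Element.get() is a shortcut that reads an attribute without going through attrib.

```python
import xml.etree.ElementTree as ET

# Inline copy of the root element from sample.xml (children omitted).
root = ET.fromstring('<holidays year="2017"></holidays>')

# Element.get() reads an attribute directly.
year = root.get('year')

# A missing attribute gives None, or the fallback passed as second argument.
# 'region' is a hypothetical attribute that the sample does not define.
region = root.get('region')
region_or_default = root.get('region', 'unknown')

print(year)               # 2017
print(region)             # None
print(region_or_default)  # unknown

# Direct dictionary access raises KeyError when the attribute is absent.
try:
    root.attrib['region']
except KeyError as error:
    print('missing attribute:', error)
```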
3. Iterate over child XML nodes using findall()
In the following program, we iterate over the child nodes of the root node using a for loop.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
# iterate over all the nodes with tag name - holiday
for holiday in root.findall('holiday'):
    print(holiday)
Output
$ python python_xml_parse_ElementTree.py
<Element 'holiday' at 0x7fb5a107d3b8>
<Element 'holiday' at 0x7fb59fc2f868>
The printed values are Element objects. To display useful values from each node, read the element attributes or child text, as shown in the next examples.
4. Read attributes from each child XML node
The following program extends the previous one: while iterating over the child nodes, it also reads the attributes of each node.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
# iterate over child nodes
for holiday in root.findall('holiday'):
    # get all attributes of a node
    attributes = holiday.attrib
    print(attributes)
    # get a particular attribute
    type = attributes.get('type')
    print(type)
Output
$ python python_xml_parse_ElementTree.py
{'type': 'other'}
other
{'type': 'public'}
public
In production code, avoid using type as a variable name, because it shadows Python's built-in type within that scope. A clearer variable name would be holiday_type.
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
for holiday in root.findall('holiday'):
    holiday_type = holiday.attrib.get('type')
    print(holiday_type)
other
public
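Attribute values can also be used to select elements instead of only printing them. The sketch below relies on the attribute predicate that ElementTree's limited XPath support accepts in findall(). The sample document is inlined as a string so the snippet is self-contained, and the names public_holidays and public_names are illustrative.

```python
import xml.etree.ElementTree as ET

# Inline copy of sample.xml so the snippet does not depend on a file.
xml_data = '''<holidays year="2017">
    <holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
    <holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>'''
root = ET.fromstring(xml_data)

# [@type='public'] keeps only the holiday children whose type attribute
# equals 'public'.
public_holidays = root.findall("holiday[@type='public']")
public_names = [holiday.find('name').text for holiday in public_holidays]

print(public_names)  # ['Gandhi Jayanti']
```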
5. Access XML child element text using find()
In the following program, we read the text of the name and date child elements of each holiday node.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
# iterate over all nodes
for holiday in root.findall('holiday'):
    # access element - name
    name = holiday.find('name').text
    print('name :', name)
    # access element - date
    date = holiday.find('date').text
    print('date :', date)
Output
$ python python_xml_parse_ElementTree.py
name : New Year
date : Jan 1
name : Gandhi Jayanti
date : Oct 2
The find() method returns the first matching child element. The .text property returns the text stored inside that element.
6. Access XML child elements without knowing all tag names
In the following program, we use a nested for loop to visit the child elements of each holiday node without naming their tags in advance.
Python Program
# Python XML Parsing
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
for holiday in root.findall('holiday'):
    # access all elements in node
    for element in holiday:
        ele_name = element.tag
        # read the text from the element itself, so repeated tags are not confused
        ele_value = element.text
        print(ele_name, ':', ele_value)
Output
$ python python_xml_parse_ElementTree.py
date : Jan 1
name : New Year
date : Oct 2
name : Gandhi Jayanti
Since each holiday element is iterable, the loop visits its direct child elements. This approach is useful when you want to inspect the structure of an XML node or print all immediate child tags and values.
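The same idea extends to any depth with a small recursive helper. This is a sketch only: outline() is a hypothetical function name, not part of ElementTree, and the XML is a shortened inline copy of the sample with one holiday.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return one indented line per element, visiting children depth-first."""
    text = (element.text or '').strip()  # .text may be None or whitespace only
    line = '  ' * depth + element.tag
    if text:
        line += ': ' + text
    lines = [line]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines

# Shortened inline copy of sample.xml.
root = ET.fromstring(
    '<holidays year="2017"><holiday type="other">'
    '<date>Jan 1</date><name>New Year</name>'
    '</holiday></holidays>'
)

structure = outline(root)
print('\n'.join(structure))
```

Each nesting level is indented by two spaces, which makes it easy to see which tags are direct children of which element.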
Parse XML from a Python string using fromstring()
If the XML data is already available as a string, use ET.fromstring(). This is common when XML is received from an API response, a message queue, or a stored text field.
import xml.etree.ElementTree as ET
xml_data = '''<holiday type="public">
<date>Oct 2</date>
<name>Gandhi Jayanti</name>
</holiday>'''
holiday = ET.fromstring(xml_data)
print(holiday.tag)
print(holiday.attrib.get('type'))
print(holiday.find('name').text)
holiday
public
Gandhi Jayanti
Use ET.parse() for files and ET.fromstring() for XML strings. Both approaches give you Element objects that can be queried in the same way.
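The difference in return types can be checked directly. In this sketch, io.StringIO stands in for an open file, since ET.parse() accepts a file object as well as a path; the variable names are illustrative.

```python
import io
import xml.etree.ElementTree as ET

xml_text = '<holiday type="public"><name>Gandhi Jayanti</name></holiday>'

# ET.parse() returns an ElementTree; call getroot() to reach the root Element.
tree = ET.parse(io.StringIO(xml_text))
root_from_parse = tree.getroot()

# ET.fromstring() returns the root Element directly.
root_from_string = ET.fromstring(xml_text)

print(type(tree).__name__)              # ElementTree
print(type(root_from_string).__name__)  # Element
print(root_from_parse.tag == root_from_string.tag)  # True
```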
Find all matching XML elements at any depth in Python
The findall() examples above search direct child elements of the root. If you want to search through the full XML tree, use iter(). It visits matching elements even when they are nested deeper inside the document.
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
for name in root.iter('name'):
    print(name.text)
New Year
Gandhi Jayanti
Use root.findall('holiday') when you want direct child elements named holiday. Use root.iter('name') when you want all name elements in the tree.
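findall() can also search below the direct children when the path starts with .//, which is part of the XPath subset ElementTree supports. The comparison below is a sketch with the sample document inlined as a string.

```python
import xml.etree.ElementTree as ET

# Inline copy of sample.xml so the snippet runs without a file.
xml_data = '''<holidays year="2017">
    <holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
    <holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>'''
root = ET.fromstring(xml_data)

# 'name' matches direct children of root only, and root has none.
direct = root.findall('name')

# './/name' matches name elements at any depth below root.
nested = [element.text for element in root.findall('.//name')]

# iter('name') walks the whole tree and finds the same elements.
iterated = [element.text for element in root.iter('name')]

print(direct)    # []
print(nested)    # ['New Year', 'Gandhi Jayanti']
print(iterated)  # ['New Year', 'Gandhi Jayanti']
```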
Handle missing XML tags safely in Python
When parsing real XML files, a tag may be missing or empty. If you write holiday.find('name').text and the name element is missing, find() returns None and reading .text raises an AttributeError. Check the element before reading .text.
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
for holiday in root.findall('holiday'):
    name_element = holiday.find('name')
    date_element = holiday.find('date')
    name = name_element.text if name_element is not None else 'Unknown'
    date = date_element.text if date_element is not None else 'Unknown'
    print(name, '-', date)
New Year - Jan 1
Gandhi Jayanti - Oct 2
This pattern makes the XML parser code more reliable when the input document is not fully controlled by your program.
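A shorter variant uses findtext(), which returns a default when the element is missing and an empty string when the element exists but has no text. The sketch below uses a hypothetical document, not sample.xml: its second holiday has no name element and its third has an empty one.

```python
import xml.etree.ElementTree as ET

# Hypothetical input with a missing <name> and an empty <name/>.
xml_data = '''<holidays>
    <holiday><date>Jan 1</date><name>New Year</name></holiday>
    <holiday><date>Oct 2</date></holiday>
    <holiday><date>Dec 25</date><name/></holiday>
</holidays>'''
root = ET.fromstring(xml_data)

names = []
for holiday in root.findall('holiday'):
    # default covers the missing tag; "or 'Unknown'" covers the empty tag,
    # for which findtext() returns ''.
    name = holiday.findtext('name', default='Unknown') or 'Unknown'
    names.append(name)

print(names)  # ['New Year', 'Unknown', 'Unknown']
```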
Parse XML namespaces with ElementTree in Python
Some XML documents use namespaces. In that case, a tag name may look simple in the file but must be searched with a namespace-aware path in ElementTree. The clean approach is to define a namespace dictionary and use prefixes in find() or findall().
import xml.etree.ElementTree as ET
xml_data = '''<calendar xmlns:h="https://example.com/holidays">
<h:holiday type="public">
<h:name>Gandhi Jayanti</h:name>
</h:holiday>
</calendar>'''
root = ET.fromstring(xml_data)
namespace = {'h': 'https://example.com/holidays'}
holiday_name = root.find('h:holiday/h:name', namespace)
print(holiday_name.text)
Gandhi Jayanti
If find() or findall() returns no results even though the tag is visible in the XML file, check whether the document uses a namespace.
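A default namespace, declared with xmlns and no prefix, is the case that most often surprises, because the tags look unqualified in the file. The sketch below uses a hypothetical variant of the document above. ElementTree stores each tag as {uri}tag, so an unqualified search finds nothing.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant: the namespace is the default one, with no prefix.
xml_data = '''<calendar xmlns="https://example.com/holidays">
    <holiday type="public">
        <name>Gandhi Jayanti</name>
    </holiday>
</calendar>'''
root = ET.fromstring(xml_data)

# Every tag carries the namespace URI in braces.
print(root.tag)  # {https://example.com/holidays}calendar

# An unqualified search does not match the namespaced element.
unqualified = root.find('holiday')
print(unqualified)  # None

# Option 1: map a prefix of your choice to the URI.
namespace = {'h': 'https://example.com/holidays'}
by_prefix = root.find('h:holiday/h:name', namespace)

# Option 2: spell out the {uri}tag form directly.
uri = '{https://example.com/holidays}'
by_uri = root.find(uri + 'holiday/' + uri + 'name')

print(by_prefix.text)  # Gandhi Jayanti
print(by_uri.text)     # Gandhi Jayanti
```

The prefix in the dictionary does not have to match any prefix used in the document; only the URI has to match.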
Convert parsed XML data into a Python list of dictionaries
A common XML parsing task is converting selected XML values into Python data structures. The following example converts the holiday records into a list of dictionaries.
import xml.etree.ElementTree as ET
root = ET.parse('sample.xml').getroot()
holidays = []
for holiday in root.findall('holiday'):
    holidays.append({
        'type': holiday.attrib.get('type'),
        'date': holiday.find('date').text,
        'name': holiday.find('name').text
    })
print(holidays)
[{'type': 'other', 'date': 'Jan 1', 'name': 'New Year'}, {'type': 'public', 'date': 'Oct 2', 'name': 'Gandhi Jayanti'}]
After the XML is converted into dictionaries or lists, it becomes easier to filter, print, export, or process the data in the rest of your Python program.
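As one illustration of that later processing, the list can be filtered with a comprehension and exported with the standard json module. This is a sketch with the sample data inlined; choosing JSON as the export format is an assumption made for the example.

```python
import json
import xml.etree.ElementTree as ET

# Inline copy of sample.xml so the snippet runs without a file.
xml_data = '''<holidays year="2017">
    <holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
    <holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>'''
root = ET.fromstring(xml_data)

holidays = [
    {
        'type': holiday.get('type'),
        'date': holiday.findtext('date'),
        'name': holiday.findtext('name'),
    }
    for holiday in root.findall('holiday')
]

# Filter: keep only public holidays.
public = [record for record in holidays if record['type'] == 'public']

# Export: serialize the filtered records as JSON text.
public_json = json.dumps(public, indent=2)
print(public_json)
```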
ElementTree XML parsing errors to check in Python
ElementTree expects well-formed XML. If the XML has mismatched tags, invalid nesting, or unescaped special characters, parsing can fail with xml.etree.ElementTree.ParseError. Use exception handling when reading XML from external files or user-provided input.
import xml.etree.ElementTree as ET
try:
    root = ET.parse('sample.xml').getroot()
    print(root.tag)
except ET.ParseError as error:
    print('XML parsing failed:', error)
except FileNotFoundError:
    print('XML file was not found.')
This is useful when the XML file is downloaded, generated by another system, or edited manually.
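ParseError also carries a position attribute, a (line, column) tuple that helps locate the problem in the input. The sketch below triggers the error on purpose with a hypothetical malformed string in which the holiday tag is never closed.

```python
import xml.etree.ElementTree as ET

# Hypothetical malformed input: <holiday> is opened but not closed.
bad_xml = '<holidays><holiday></holidays>'

position = None
message = None
try:
    ET.fromstring(bad_xml)
except ET.ParseError as error:
    position = error.position  # (line, column) where parsing failed
    message = str(error)

if position is not None:
    line, column = position
    print('XML parsing failed at line', line, 'column', column)
    print('Parser message:', message)
```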
When ElementTree is enough for XML parsing in Python
ElementTree is suitable for many everyday XML parsing tasks: reading configuration files, extracting values from small or medium XML documents, processing structured exports, and converting XML records into Python objects. Since it is part of the Python standard library, it is often the simplest first choice.
For very large XML files, streaming with iterparse() may be a better fit because it can process elements progressively instead of keeping the entire tree in memory. For advanced XML features such as extensive XPath support, schema validation, or heavy transformation workflows, developers often consider additional libraries such as lxml.
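A minimal iterparse() sketch follows. The sample is tiny and is read from an in-memory io.BytesIO object, which is an assumption made to keep the example self-contained; with a real large file you would pass its path instead.

```python
import io
import xml.etree.ElementTree as ET

# Inline copy of sample.xml as bytes, standing in for a large file.
xml_bytes = b'''<holidays year="2017">
    <holiday type="other"><date>Jan 1</date><name>New Year</name></holiday>
    <holiday type="public"><date>Oct 2</date><name>Gandhi Jayanti</name></holiday>
</holidays>'''

names = []
# An 'end' event fires when an element's closing tag has been read,
# so its children and text are complete at that point.
for event, element in ET.iterparse(io.BytesIO(xml_bytes), events=('end',)):
    if element.tag == 'holiday':
        names.append(element.findtext('name'))
        # Drop the children and text of the processed record to free memory.
        element.clear()

print(names)  # ['New Year', 'Gandhi Jayanti']
```

Calling clear() only after a complete holiday record has been handled avoids discarding data that is still needed.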
Python XML parsing FAQ
Which Python module is used to parse XML without external libraries?
You can use xml.etree.ElementTree, which is included in the Python standard library. It can parse XML files and XML strings, read attributes, search elements, and access element text.
What is the difference between ET.parse() and ET.fromstring()?
ET.parse() reads XML from a file and returns an ElementTree object. ET.fromstring() parses XML from a string and returns the root element directly.
How do I get an XML attribute value in Python?
Use the attrib dictionary of an element. For example, holiday.attrib.get('type') returns the value of the type attribute if it exists.
Why does ElementTree find() return None for a tag that exists?
The tag may not be a direct child of the current element, or the XML may use namespaces. Use the correct path, use iter() for deeper searches, or pass a namespace dictionary when namespaces are present.
Can ElementTree parse large XML files?
ElementTree can parse XML files, but loading a very large file into a full tree can use significant memory. For large files, consider iterparse() so you can process elements as they are read.
Python XML parsing with ElementTree: summary
In this Python tutorial, we learned how to parse XML using the ElementTree library. You can use ET.parse() for XML files, ET.fromstring() for XML strings, find() and findall() for matching child elements, iter() for deeper searches, and attrib to read XML attributes.