Convert Python ElementTree to string
Problem Overview
Whenever I call ElementTree.tostring(e), I get the following error message:
AttributeError: 'Element' object has no attribute 'getroot'
Is there any other way to convert an ElementTree object into an XML string?
Traceback:
Traceback (most recent call last):
  File "Development/Python/REObjectSort/REObjectResolver.py", line 145, in <module>
    cm = integrateDataWithCsv(cm, csvm)
  File "Development/Python/REObjectSort/REObjectResolver.py", line 137, in integrateDataWithCsv
    xmlstr = ElementTree.tostring(et.getroot(),encoding='utf8',method='xml')
AttributeError: 'Element' object has no attribute 'getroot'
Python Solutions
Solution 1 - Python
Element objects have no .getroot() method. Drop that call, and the .tostring() call works:
xmlstr = ElementTree.tostring(et, encoding='utf8', method='xml')
You only need to use .getroot() if you have an ElementTree instance.
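A minimal sketch of that distinction, assuming a small made-up document (the root and child tags are illustrative only): ElementTree.parse() returns an ElementTree wrapper that has .getroot(), while ElementTree.fromstring() returns an Element directly.

```python
import io
from xml.etree import ElementTree

# Hypothetical sample document, for illustration only.
source = b"<root><child>text</child></root>"

# parse() returns an ElementTree instance, which has .getroot().
tree = ElementTree.parse(io.BytesIO(source))
root_from_tree = tree.getroot()

# fromstring() returns an Element directly; it has no .getroot().
element = ElementTree.fromstring(source)
has_getroot = hasattr(element, "getroot")

# tostring() takes the Element in both cases.
xml_from_tree = ElementTree.tostring(root_from_tree, encoding="unicode")
xml_from_element = ElementTree.tostring(element, encoding="unicode")
print(has_getroot, xml_from_tree == xml_from_element)
```

If the variable came from fromstring() or Element(), it is already the element that tostring() expects.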
Other notes:
- This produces a bytestring, which in Python 3 is the bytes type. If you must have a str object, you have two options:
  - Decode the resulting bytes value, from UTF-8: xmlstr.decode("utf8")
  - Use encoding='unicode'; this avoids an encode / decode cycle:
    xmlstr = ElementTree.tostring(et, encoding='unicode', method='xml')
- If you wanted the UTF-8 encoded bytestring value or are using Python 2, take into account that ElementTree doesn't properly detect utf8 as the standard XML encoding, so it'll add a <?xml version='1.0' encoding='utf8'?> declaration. Use utf-8 or UTF-8 (with a dash) if you want to prevent this. When using encoding="unicode" no declaration header is added.
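The effect of the encoding spelling can be checked with a short sketch like the following, assuming Python 3 and a throwaway Person element that is not part of the original question:

```python
from xml.etree import ElementTree

# Illustrative element; the tag and attribute are assumptions for the demo.
element = ElementTree.Element("Person", Name="John")

# 'utf8' is not recognised as the default XML encoding, so a declaration is added.
with_declaration = ElementTree.tostring(element, encoding="utf8")

# 'utf-8' (with a dash) is recognised, so no declaration is written.
without_declaration = ElementTree.tostring(element, encoding="utf-8")

# 'unicode' returns a str and also omits the declaration.
as_text = ElementTree.tostring(element, encoding="unicode")

print(with_declaration)
print(without_declaration)
print(as_text)
```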
Solution 2 - Python
How do I convert ElementTree.Element to a String?
For Python 3:
xml_str = ElementTree.tostring(xml, encoding='unicode')
For Python 2:
xml_str = ElementTree.tostring(xml, encoding='utf-8')
The following is compatible with both Python 2 & 3, but only works for Latin characters:
xml_str = ElementTree.tostring(xml).decode()
Example usage
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
xml_str = ElementTree.tostring(xml).decode()
print(xml_str)
Output:
<Person Name="John" />
Explanation
Despite what the name implies, ElementTree.tostring() returns a bytestring by default in Python 2 & 3. This is an issue in Python 3, which uses Unicode for strings.
> In Python 2 you could use the str type for both text and binary data.
> Unfortunately this confluence of two different concepts could lead to
> brittle code which sometimes worked for either kind of data, sometimes
> not. [...]
>
> To make the distinction between text and binary data clearer and more pronounced, [Python 3] made text and binary data distinct types that cannot blindly be mixed together.
Source: Porting Python 2 Code to Python 3
If we know what version of Python is being used, we can specify the encoding as unicode or utf-8. Otherwise, if we need compatibility with both Python 2 & 3, we can use decode() to convert into the correct type.
For reference, I've included a comparison of .tostring() results between Python 2 and Python 3.
ElementTree.tostring(xml)
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml, encoding='unicode')
# Python 3: <Person Name="John" />
# Python 2: LookupError: unknown encoding: unicode
ElementTree.tostring(xml, encoding='utf-8')
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml).decode()
# Python 3: <Person Name="John" />
# Python 2: <Person Name="John" />
Thanks to Martijn Peters for pointing out that the str datatype changed between Python 2 and 3.
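One way to wrap the version-specific choice is a small helper. This is only a sketch: element_to_text is a hypothetical name, not part of the ElementTree API, and the Python 2 branch rests on the behaviour described above.

```python
import sys
from xml.etree import ElementTree


def element_to_text(element):
    """Hypothetical helper: serialise an Element to a text (Unicode) string."""
    if sys.version_info[0] >= 3:
        # Python 3: ask for str output directly, no bytes round trip.
        return ElementTree.tostring(element, encoding="unicode")
    # Python 2: 'unicode' is not a valid encoding there, so encode to UTF-8
    # bytes and decode them back into a unicode object.
    return ElementTree.tostring(element, encoding="utf-8").decode("utf-8")


person = ElementTree.Element("Person", Name="John")
print(element_to_text(person))
```

Unlike a bare .decode(), this keeps non-ASCII characters readable, because it never goes through the default US-ASCII serialisation.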
Why not use str()?
In most scenarios, using str() would be the "canonical" way to convert an object to a string. Unfortunately, calling it on an Element returns the object's default representation (its tag and its memory address in hexadecimal), rather than a string representation of the object's data.
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
print(str(xml)) # <Element 'Person' at 0x00497A80>
Solution 3 - Python
Non-Latin Answer Extension
This extends @Stevoisiak's answer to deal with non-Latin characters. Only one approach will display the non-Latin characters to you, and that approach differs between Python 3 and Python 2.
Input
xml = ElementTree.fromstring('<Person Name="크리스" />')
xml = ElementTree.Element("Person", Name="크리스") # Read Note about Python 2
> NOTE: In Python 2, when calling the tostring(...) code, assigning xml with ElementTree.Element("Person", Name="크리스") will raise an error...
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 0: ordinal not in range(128)
Output
ElementTree.tostring(xml)
# Python 3 (크리스): b'<Person Name="&#53356;&#47532;&#49828;" />'
# Python 3 (John): b'<Person Name="John" />'
# Python 2 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 2 (John): <Person Name="John" />
ElementTree.tostring(xml, encoding='unicode')
# Python 3 (크리스): <Person Name="크리스" /> <-------- Python 3
# Python 3 (John): <Person Name="John" />
# Python 2 (크리스): LookupError: unknown encoding: unicode
# Python 2 (John): LookupError: unknown encoding: unicode
ElementTree.tostring(xml, encoding='utf-8')
# Python 3 (크리스): b'<Person Name="\xed\x81\xac\xeb\xa6\xac\xec\x8a\xa4" />'
# Python 3 (John): b'<Person Name="John" />'
# Python 2 (크리스): <Person Name="크리스" /> <-------- Python 2
# Python 2 (John): <Person Name="John" />
ElementTree.tostring(xml).decode()
# Python 3 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 3 (John): <Person Name="John" />
# Python 2 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 2 (John): <Person Name="John" />
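For Python 3, the difference can be reproduced with a short sketch along these lines. It assumes the same Person element as above; the variable names are illustrative. With no encoding argument, tostring() serialises as US-ASCII and writes non-ASCII characters as numeric character references.

```python
from xml.etree import ElementTree

xml = ElementTree.Element("Person", Name="크리스")

# Default encoding is US-ASCII: non-ASCII characters become character references.
default_bytes = ElementTree.tostring(xml)

# encoding='unicode' returns a str with the characters intact.
text = ElementTree.tostring(xml, encoding="unicode")

# encoding='utf-8' returns bytes that decode back to the same text.
utf8_bytes = ElementTree.tostring(xml, encoding="utf-8")

print(default_bytes)
print(text)
print(utf8_bytes.decode("utf-8") == text)
```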
Solution 4 - Python
I had the same problem in Python 3.8 and none of the previous answers solved it. The issue is that ElementTree is both the name of a module and of a class within it. Using an alias makes it clear:
from xml.etree.ElementTree import ElementTree
import xml.etree.ElementTree as XET
...
ElementTree.tostring(...) # AttributeError: the ElementTree class has no tostring
XET.tostring(...) # Works
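A runnable sketch of the same name clash, assuming Python 3 and an illustrative Person element; XET is just the alias chosen above:

```python
from xml.etree.ElementTree import ElementTree  # the class
import xml.etree.ElementTree as XET            # the module, under an alias

element = XET.Element("Person", Name="John")

# The class has no tostring attribute; the module-level function does.
class_has_tostring = hasattr(ElementTree, "tostring")
module_has_tostring = hasattr(XET, "tostring")

# With an ElementTree instance in hand, pass its root element to XET.tostring().
tree = ElementTree(element)
xml_str = XET.tostring(tree.getroot(), encoding="unicode")

print(class_has_tostring, module_has_tostring)
print(xml_str)
```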
Solution 5 - Python
If you just need this for debugging, to see what the XML looks like, then instead of print(xml.etree.ElementTree.tostring(e)) you can use dump like this:
xml.etree.ElementTree.dump(e)
This works with both Element and ElementTree objects as e, so there should be no need for getroot.
The documentation of dump says:
> xml.etree.ElementTree.dump(elem)
>
> Writes an element tree or element structure to sys.stdout. This function should be used for debugging only.
>
> The exact output format is implementation dependent. In this version, it’s written as an ordinary XML file.
>
> elem is an element tree or an individual element.
>
> Changed in version 3.8: The dump() function now preserves the attribute order specified by the user.
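As a quick check that dump accepts either kind of object, here is a sketch that captures its output for comparison. The Person element and the capture_dump helper are assumptions made for the demo, and capturing stdout is only needed to compare the two calls; in normal debugging you would just call dump directly.

```python
import contextlib
import io
import xml.etree.ElementTree as ET

element = ET.Element("Person", Name="John")
tree = ET.ElementTree(element)


def capture_dump(obj):
    """Hypothetical helper: return what ET.dump() writes to sys.stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        ET.dump(obj)
    return buffer.getvalue()


from_element = capture_dump(element)
from_tree = capture_dump(tree)

# Both calls produce the same text, without any .getroot() call.
print(from_element == from_tree)
```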