How to write XML declaration using xml.etree.ElementTree


Python Problem Overview


I am generating an XML document in Python using an ElementTree, but the tostring function doesn't include an XML declaration when converting to plaintext.

from xml.etree.ElementTree import Element, SubElement, tostring

document = Element('outer')
node = SubElement(document, 'inner')
print(tostring(document))  # Outputs b'<outer><inner /></outer>' on Python 3

I need my string to include the following XML declaration:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

However, there does not seem to be any documented way of doing this.

Is there a proper method for rendering the XML declaration in an ElementTree?

Python Solutions


Solution 1 - Python

I am surprised to find that there doesn't seem to be a way with ElementTree.tostring(). You can however use ElementTree.ElementTree.write() to write your XML document to a fake file:

from io import BytesIO
from xml.etree import ElementTree as ET

document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
et = ET.ElementTree(document)

f = BytesIO()
et.write(f, encoding='utf-8', xml_declaration=True) 
print(f.getvalue())  # your XML file, encoded as UTF-8

See this question. Even then, I don't think you can get your 'standalone' attribute without prepending it yourself.

Solution 2 - Python

I would use lxml (see http://lxml.de/api.html).

Then you can:

from lxml import etree
document = etree.Element('outer')
node = etree.SubElement(document, 'inner')
print(etree.tostring(document, xml_declaration=True))

Solution 3 - Python

If you include the encoding='utf8', you will get an XML header:

> xml.etree.ElementTree.tostring writes a XML encoding declaration with encoding='utf8'

Sample Python code (works with Python 2 and 3):

import xml.etree.ElementTree as ElementTree

tree = ElementTree.ElementTree(
	ElementTree.fromstring('<xml><test>123</test></xml>')
)
root = tree.getroot()

print('without:')
print(ElementTree.tostring(root, method='xml'))
print('')
print('with:')
print(ElementTree.tostring(root, encoding='utf8', method='xml'))

Python 2 output:

$ python2 example.py
without:
<xml><test>123</test></xml>

with:
<?xml version='1.0' encoding='utf8'?>
<xml><test>123</test></xml>

With Python 3 you will note the b prefix indicating byte literals are returned (just like with Python 2):

$ python3 example.py
without:
b'<xml><test>123</test></xml>'

with:
b"<?xml version='1.0' encoding='utf8'?>\n<xml><test>123</test></xml>"

Solution 4 - Python

xml_declaration Argument

> Is there a proper method for rendering the XML declaration in an ElementTree?

Yes, and there is no need to use the .tostring function. According to the ElementTree documentation, you can create the Element and SubElements, make the root element the root of an ElementTree object, and then pass the xml_declaration argument to the .write function so that the declaration line is included in the output file.

You can do it this way:

import xml.etree.ElementTree as ET

document = ET.Element("outer")
node1 = ET.SubElement(document, "inner")
node1.text = "text"

# Pass the root element to the constructor instead of calling the private _setroot()
tree = ET.ElementTree(document)
tree.write("./output.xml", encoding="UTF-8", xml_declaration=True)

And the output file is:

<?xml version='1.0' encoding='UTF-8'?>
<outer><inner>text</inner></outer>

Solution 5 - Python

I encountered this issue recently. After some digging through the code, I found that the following snippet is the definition of the function ElementTree.write:

def write(self, file, encoding="us-ascii"):
    assert self._root is not None
    if not hasattr(file, "write"):
        file = open(file, "wb")
    if not encoding:
        encoding = "us-ascii"
    elif encoding != "utf-8" and encoding != "us-ascii":
        file.write("<?xml version='1.0' encoding='%s'?>\n" % encoding)
    self._write(file, self._root, encoding, {})

So the answer is: if you need the XML header written to your file, set the encoding argument to any value other than the exact strings utf-8 or us-ascii. The comparison is case-sensitive, so the uppercase UTF-8 works.
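
To make the branch logic of the quoted write() easier to follow, here is a minimal sketch that mirrors only its decision about the header. The helper name old_write_emits_declaration is hypothetical and is not part of ElementTree. It models the quoted snippet only, not newer versions of the library.

```python
def old_write_emits_declaration(encoding="us-ascii"):
    """Mirror the header decision of the quoted write() (hypothetical helper)."""
    if not encoding:
        # The quoted code falls back to us-ascii and writes no header.
        return False
    # Exact, case-sensitive comparison, as in the quoted source.
    return encoding != "utf-8" and encoding != "us-ascii"


for name in ("utf-8", "us-ascii", "UTF-8", "utf8", "iso-8859-1"):
    print(name, old_write_emits_declaration(name))
```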

Solution 6 - Python

A minimal working example using the ElementTree package:

import xml.etree.ElementTree as ET

document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
node.text = '1'
res = ET.tostring(document, encoding='utf8', method='xml').decode()
print(res)

The output is:

<?xml version='1.0' encoding='utf8'?>
<outer><inner>1</inner></outer>

Solution 7 - Python

Easy

Sample for both Python 2 and 3 (encoding parameter must be utf8):

import xml.etree.ElementTree as ElementTree

tree = ElementTree.ElementTree(ElementTree.fromstring('<xml><test>123</test></xml>'))
root = tree.getroot()
print(ElementTree.tostring(root, encoding='utf8', method='xml'))

Since Python 3.8, tostring has an xml_declaration parameter for this:

> New in version 3.8: The xml_declaration and default_namespace parameters.

> xml.etree.ElementTree.tostring(element, encoding="us-ascii", method="xml", *, xml_declaration=None, default_namespace=None, short_empty_elements=True)
>
> Generates a string representation of an XML element, including all subelements. element is an Element instance. encoding is the output encoding (default is US-ASCII). Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated). method is either "xml", "html" or "text" (default is "xml"). xml_declaration, default_namespace and short_empty_elements has the same meaning as in ElementTree.write(). Returns an (optionally) encoded string containing the XML data.

Sample for Python 3.8 and higher:

import xml.etree.ElementTree as ElementTree

tree = ElementTree.ElementTree(ElementTree.fromstring('<xml><test>123</test></xml>'))
root = tree.getroot()
print(ElementTree.tostring(root, encoding='unicode', method='xml', xml_declaration=True))
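
With encoding='unicode' the result is a str, and the encoding named in the declaration is chosen by the library rather than by you. If the output is going to be written to disk as bytes, passing an explicit encoding is likely more predictable. A minimal sketch, assuming Python 3.8 or later:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<xml><test>123</test></xml>')

# Requires Python 3.8+ for the xml_declaration keyword.
# An explicit encoding returns bytes and is named verbatim in the header.
data = ET.tostring(root, encoding='UTF-8', xml_declaration=True)

header, body = data.split(b'\n', 1)
print(header)
print(body)
```</xml>'
</test>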

Solution 8 - Python

Another pretty simple option is to concatenate the desired header to the string of xml like this:

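import xml.etree.ElementTree as ET

root = ET.Element('invoice')  # placeholder root; substitute your own element
xml = b'<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(root)
xml = xml.decode('utf-8')
with open('invoice.xml', 'w', encoding='utf-8') as f:
    f.write(xml)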

Solution 9 - Python

I would use ET:

try:
    from lxml import etree
    print("running with lxml.etree")
except ImportError:
    try:
        # Python 2.5
        import xml.etree.cElementTree as etree
        print("running with cElementTree on Python 2.5+")
    except ImportError:
        try:
            # Python 2.5
            import xml.etree.ElementTree as etree
            print("running with ElementTree on Python 2.5+")
        except ImportError:
            try:
                # normal cElementTree install
                import cElementTree as etree
                print("running with cElementTree")
            except ImportError:
                try:
                    # normal ElementTree install
                    import elementtree.ElementTree as etree
                    print("running with ElementTree")
                except ImportError:
                    print("Failed to import ElementTree from any known place")

document = etree.Element('outer')
node = etree.SubElement(document, 'inner')
print(etree.tostring(document, encoding='UTF-8', xml_declaration=True))

Solution 10 - Python

This works if you just want to print. Getting an error when I try to send it to a file...

import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET
from xml.etree.ElementTree import Element, SubElement, Comment, tostring

def prettify(elem):
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="  ")
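
The error itself is not described, so the following is only a guess at a fix. One approach is to ask minidom for bytes by passing an encoding to toprettyxml(), then write the file in binary mode. The name prettify_bytes and the temporary output path are assumptions made for this sketch.

```python
import os
import tempfile
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET


def prettify_bytes(elem, encoding="utf-8"):
    """Return pretty-printed XML as bytes, declaration included."""
    rough = ET.tostring(elem, encoding)
    reparsed = minidom.parseString(rough)
    # With an encoding, toprettyxml() returns bytes and names it in the header.
    return reparsed.toprettyxml(indent="  ", encoding=encoding)


document = ET.Element("outer")
ET.SubElement(document, "inner").text = "1"
pretty = prettify_bytes(document)

# Write in binary mode because the data is already encoded.
path = os.path.join(tempfile.mkdtemp(), "output.xml")
with open(path, "wb") as f:
    f.write(pretty)

print(pretty.decode("utf-8"))
```

Writing bytes avoids relying on the platform's default text encoding when the file is opened.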

Solution 11 - Python

Including 'standalone' in the declaration

I didn't find any way to add the standalone argument in the documentation, so I adapted the ET.tostring function to take the declaration as an argument.

from xml.etree import ElementTree as ET

# Sample
document = ET.Element('outer')
node = ET.SubElement(document, 'inner')

# Function that you need
def tostring(element, declaration, method=None):
    class dummy:
        pass
    data = [declaration + "\n"]
    file = dummy()
    file.write = data.append
    # encoding="unicode" (Python 3) yields str chunks and no built-in declaration
    ET.ElementTree(element).write(file, encoding="unicode", method=method)
    return "".join(data)

# Working example
xdec = """<?xml version="1.0" encoding="UTF-8" standalone="no" ?>"""
xml = tostring(document, declaration=xdec)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Roman Alexander | View Question on Stackoverflow
Solution 1 - Python | wrgrs | View Answer on Stackoverflow
Solution 2 - Python | glormph | View Answer on Stackoverflow
Solution 3 - Python | Alexander O'Mara | View Answer on Stackoverflow
Solution 4 - Python | smrachi | View Answer on Stackoverflow
Solution 5 - Python | alijandro | View Answer on Stackoverflow
Solution 6 - Python | Andriy | View Answer on Stackoverflow
Solution 7 - Python | Kirill Malakhov | View Answer on Stackoverflow
Solution 8 - Python | Novak | View Answer on Stackoverflow
Solution 9 - Python | Alessandro | View Answer on Stackoverflow
Solution 10 - Python | Rebecca Fallon | View Answer on Stackoverflow
Solution 11 - Python | G M | View Answer on Stackoverflow