Saving XML files using ElementTree

Python, ElementTree

Python Problem Overview


I'm writing simple Python (3.2) code that reads XML files, makes some corrections, and writes them back. However, when the file is written, ElementTree adds an ns0: namespace prefix to every tag. For example:

<ns0:trk>
  <ns0:name>ACTIVE LOG</ns0:name>
<ns0:trkseg>
<ns0:trkpt lat="38.5" lon="-120.2">
  <ns0:ele>6.385864</ns0:ele>
  <ns0:time>2011-12-10T17:46:30Z</ns0:time>
</ns0:trkpt>
<ns0:trkpt lat="40.7" lon="-120.95">
  <ns0:ele>5.905273</ns0:ele>
  <ns0:time>2011-12-10T17:46:51Z</ns0:time>
</ns0:trkpt>
<ns0:trkpt lat="43.252" lon="-126.453">
  <ns0:ele>7.347168</ns0:ele>
  <ns0:time>2011-12-10T17:52:28Z</ns0:time>
</ns0:trkpt>
</ns0:trkseg>
</ns0:trk>

The code snippet is below:

def parse_gpx_data(gpxdata, tzname=None, npoints=None, filter_window=None,
                   output_file_name=None):
    ET = load_xml_library()

    def find_trksegs_or_route(etree, ns):
        trksegs=etree.findall('.//'+ns+'trkseg')
        if trksegs:
            return trksegs, "trkpt"
        else: # try to display route if track is missing
            rte=etree.findall('.//'+ns+'rte')
            return rte, "rtept"

    try:
        element = ET.XML(gpxdata)
    except ET.ParseError as v:
        row, column = v.position
        print("error on row %d, column %d: %s" % (row, column, v))
        return  # nothing to process if the document does not parse

    print("%s" % ET.tostring(element))
    # try GPX10 namespace first
    trksegs, pttag = find_trksegs_or_route(element, GPX10)
    NS=GPX10
    if not trksegs: # try GPX11 namespace otherwise
        trksegs,pttag=find_trksegs_or_route(element, GPX11)
        NS=GPX11
    if not trksegs: # try without any namespace
        trksegs,pttag=find_trksegs_or_route(element, "")
        NS=""

    # Store the results if requested
    if output_file_name:
        ET.register_namespace('', GPX11)
        ET.register_namespace('', GPX10)
        ET.ElementTree(element).write(output_file_name, xml_declaration=True)

    return

I have tried using register_namespace, but without success. Is there anything specific to this version of ElementTree (1.3) that I need to change?

Python Solutions


Solution 1 - Python

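To avoid the ns0 prefix, register the default namespace before reading the XML data. Note that the registry is global and keeps a single URI per prefix, so a later call with the same prefix replaces the earlier one; make sure the namespace your file actually uses is the one that ends up registered.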

ET.register_namespace('', "http://www.topografix.com/GPX/1/1")
ET.register_namespace('', "http://www.topografix.com/GPX/1/0")
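As a rough illustration of the idea, here is a minimal round trip that registers only the GPX 1.1 namespace as the default, edits one element, and serializes the result. The SAMPLE document and the names GPX11_URI, root, and output are made up for this sketch; the asker's real files and helper functions are not shown here.

```python
import xml.etree.ElementTree as ET

# Assumption: the input is a GPX 1.1 file, so this is the URI to register.
GPX11_URI = "http://www.topografix.com/GPX/1/1"

# Illustrative input: a tiny GPX 1.1 fragment with a default namespace.
SAMPLE = (
    '<gpx xmlns="http://www.topografix.com/GPX/1/1">'
    '<trk><name>ACTIVE LOG</name></trk>'
    '</gpx>'
)

# Map the empty prefix to the namespace this document uses.
ET.register_namespace('', GPX11_URI)

root = ET.fromstring(SAMPLE)

# Parsed tags are stored in Clark notation: '{uri}localname'.
name = root.find('{%s}trk/{%s}name' % (GPX11_URI, GPX11_URI))
name.text = 'CORRECTED LOG'

# Serialize to a str; the tags should come out without an ns0: prefix.
output = ET.tostring(root, encoding='unicode')
print(output)
```

If the same script has to handle both GPX 1.0 and GPX 1.1 files, it probably needs to pick the URI per file rather than register both under the empty prefix.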

Solution 2 - Python

You need to register all of your namespaces before you parse the XML file.

For example, suppose your input XML looks like this, with Capabilities as the root of your element tree:

<Capabilities xmlns="http://www.opengis.net/wmts/1.0"
	xmlns:ows="http://www.opengis.net/ows/1.1"
	xmlns:xlink="http://www.w3.org/1999/xlink"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xmlns:gml="http://www.opengis.net/gml"
	xsi:schemaLocation="http://www.opengis.net/wmts/1.0 http://schemas.opengis.net/wmts/1.0/wmtsGetCapabilities_response.xsd"
	version="1.0.0">

Then you have to register every namespace, i.e. every attribute that starts with xmlns, like this:

ET.register_namespace('', "http://www.opengis.net/wmts/1.0")
ET.register_namespace('ows', "http://www.opengis.net/ows/1.1")
ET.register_namespace('xlink', "http://www.w3.org/1999/xlink")
ET.register_namespace('xsi', "http://www.w3.org/2001/XMLSchema-instance")
ET.register_namespace('gml', "http://www.opengis.net/gml")
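If typing the list by hand is impractical, one possible approach is to let the parser report the declarations: ET.iterparse with the 'start-ns' event yields each (prefix, uri) pair as it is declared. The sketch below is only an illustration; the helper name register_declared_namespaces and the shortened SAMPLE document are assumptions, not part of the original answer.

```python
import io
import xml.etree.ElementTree as ET

# Illustrative input, shortened from the Capabilities example above.
SAMPLE = b'''<Capabilities xmlns="http://www.opengis.net/wmts/1.0"
    xmlns:ows="http://www.opengis.net/ows/1.1"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    version="1.0.0">
  <ows:Title xlink:href="http://example.invalid/">Demo</ows:Title>
</Capabilities>'''


def register_declared_namespaces(source):
    """Register every prefix/URI pair declared in the XML stream `source`.

    Hypothetical helper: returns the declarations as a dict for inspection.
    """
    declared = {}
    # With events=('start-ns',), iterparse yields (event, (prefix, uri)).
    for _event, (prefix, uri) in ET.iterparse(source, events=('start-ns',)):
        ET.register_namespace(prefix, uri)
        declared[prefix] = uri
    return declared


# First pass collects and registers the namespaces, second pass parses.
declared = register_declared_namespaces(io.BytesIO(SAMPLE))
root = ET.parse(io.BytesIO(SAMPLE)).getroot()

output = ET.tostring(root, encoding='unicode')
print(output)
```

This reads the input twice, which keeps the sketch simple; for large files a single pass that also requests the 'start' and 'end' events may be preferable.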

Solution 3 - Python

It seems that you have to declare your namespace, meaning that you need to change the first line of your XML from:

<ns0:trk>

to something like:

<ns0:trk xmlns:ns0="uri:">

Once you have done that, you will no longer get ParseError: unbound prefix: ..., and:

elem.tag = elem.tag[len('{uri:}'):]

will remove the namespace.
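Applied to a whole tree, the slicing trick might look like the sketch below. The helper name strip_namespace and the SAMPLE string are made up for illustration, and 'uri:' is just the placeholder namespace used above.

```python
import xml.etree.ElementTree as ET

# Illustrative input using the placeholder namespace 'uri:' from above.
SAMPLE = (
    '<ns0:trk xmlns:ns0="uri:">'
    '<ns0:name>ACTIVE LOG</ns0:name>'
    '</ns0:trk>'
)


def strip_namespace(root, uri):
    """Remove the '{uri}' qualifier from every tag under `root`, in place."""
    qualifier = '{%s}' % uri
    for elem in root.iter():
        # Skip non-string tags (comments, processing instructions).
        if isinstance(elem.tag, str) and elem.tag.startswith(qualifier):
            elem.tag = elem.tag[len(qualifier):]
    return root


root = strip_namespace(ET.fromstring(SAMPLE), 'uri:')
output = ET.tostring(root, encoding='unicode')
print(output)  # <trk><name>ACTIVE LOG</name></trk>
```

Keep in mind that this drops the namespace from the output entirely, so the saved file is no longer in the original namespace; whether that is acceptable depends on what reads the file afterwards.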

Solution 4 - Python

If you try to print the root, you will see something like this: <Element '{http://www.host.domain/path/to/your/xml/namespace}RootTag' at 0x0000000000558DB8>

The part in braces is the namespace URI. So, to avoid the ns0 prefix, you have to register it as the default namespace before parsing the XML data, as below:

ET.register_namespace('', "http://www.host.domain/path/to/your/xml/namespace")
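When the URI is not known in advance, it can be read from the root tag itself. The sketch below is one way to do that; namespace_of is a hypothetical helper and SAMPLE is a made-up document. It registers the URI after inspecting the root, on the assumption that the standard library consults the registry when the tree is serialized; verify this on your Python version if it matters.

```python
import xml.etree.ElementTree as ET

# Illustrative input with the placeholder namespace from the text above.
SAMPLE = (
    '<RootTag xmlns="http://www.host.domain/path/to/your/xml/namespace">'
    '<Child>value</Child>'
    '</RootTag>'
)


def namespace_of(element):
    """Return the namespace URI of `element`, or '' if it has none."""
    tag = element.tag
    if tag.startswith('{'):
        # Clark notation: '{uri}localname'
        return tag[1:].split('}', 1)[0]
    return ''


root = ET.fromstring(SAMPLE)
uri = namespace_of(root)

# Register the detected namespace as the default before writing anything.
if uri:
    ET.register_namespace('', uri)

output = ET.tostring(root, encoding='unicode')
print(output)
```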

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type          Original Author      Original Content on Stackoverflow
Question              ilya1725             View Question on Stackoverflow
Solution 1 - Python   ilya1725             View Answer on Stackoverflow
Solution 2 - Python   singingsingh         View Answer on Stackoverflow
Solution 3 - Python   Rik Poggi            View Answer on Stackoverflow
Solution 4 - Python   Naiim Khaskhoussi    View Answer on Stackoverflow