selecting attribute values from lxml

Python, Python 2.7, Attributes, Lxml

Python Problem Overview


I want to use an xpath expression to get the value of an attribute.

I expected the following to work:

from lxml import etree

for customer in etree.parse('file.xml').getroot().findall('BOB'):
    print customer.find('./@NAME')

but this gives an error:

Traceback (most recent call last):
  File "bob.py", line 22, in <module>
    print customer.find('./@ID')
  File "lxml.etree.pyx", line 1409, in lxml.etree._Element.find (src/lxml/lxml.etree.c:39972)
  File "/usr/local/lib/python2.7/dist-packages/lxml/_elementpath.py", line 272, in find
    it = iterfind(elem, path, namespaces)
  File "/usr/local/lib/python2.7/dist-packages/lxml/_elementpath.py", line 262, in iterfind
    selector = _build_path_iterator(path, namespaces)
  File "/usr/local/lib/python2.7/dist-packages/lxml/_elementpath.py", line 246, in _build_path_iterator
    selector.append(ops[token[0]](_next, token))
KeyError: '@'

Am I wrong to expect this to work?

Python Solutions


Solution 1 - Python

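find and findall only implement a subset of XPath. That subset can filter elements by attribute, but it cannot select an attribute node such as ./@NAME, which is why the call fails with KeyError: '@'. These methods exist to provide compatibility with other ElementTree implementations (like ElementTree and cElementTree).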
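
To make the subset concrete, here is a minimal sketch. The XML is invented for illustration (the question does not show file.xml), and the import fallback is only there to show that the same calls behave the same way in the standard library's ElementTree:

```python
# Sketch only: the XML below is invented for illustration.
try:
    from lxml import etree
except ImportError:  # the same subset works in the standard library
    import xml.etree.ElementTree as etree

root = etree.fromstring(
    "<ROOT>"
    "<BOB NAME='alice' ID='1'/>"
    "<BOB ID='2'/>"
    "</ROOT>"
)

# Supported: a predicate that keeps only elements carrying a NAME attribute.
named = root.findall("BOB[@NAME]")

# The attribute value is then read from the element, not from the path.
names = [bob.get("NAME") for bob in named]
print(names)
```

The path selects elements; the attribute value still has to be read from each element afterwards.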

The xpath method, in contrast, provides full access to XPath 1.0:

print customer.xpath('./@NAME')[0]

However, you could instead use get:

print customer.get('NAME')

or attrib:

print customer.attrib['NAME']
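
The three forms differ mainly in what happens when the attribute is absent. A minimal sketch, assuming an invented BOB element that has an ID but no NAME attribute:

```python
from lxml import etree

# Invented element for illustration: it has ID but no NAME attribute.
customer = etree.fromstring("<BOB ID='7'/>")

# xpath() returns a list for node-set results. It is empty when nothing
# matches, so indexing it with [0] would raise IndexError here.
xpath_hits = customer.xpath("./@NAME")

# get() returns None, or the supplied default, when the attribute is missing.
got = customer.get("NAME")
got_default = customer.get("NAME", "unknown")

# attrib behaves like a dict, so a missing key raises KeyError.
try:
    customer.attrib["NAME"]
    missing_raises = False
except KeyError:
    missing_raises = True

print(xpath_hits, got, got_default, missing_raises)
```

If the attribute may be missing, get with a default is usually the least surprising choice; the xpath form is more useful when the selection itself needs full XPath.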

Solution 2 - Python

As a possibly useful addition, here is how to get the value of an attribute when the elements have more than one attribute and differ from each other only in those attribute values. E.g., given the following file.xml:

<?xml version="1.0" encoding="UTF-8"?>
<level1>
  <level2 first_att='att1' second_att='foo'>8</level2>
  <level2 first_att='att2' second_att='bar'>8</level2>
</level1>

One can access the attribute value 'bar' with:

import lxml.etree as etree
tree = etree.parse("file.xml")
print tree.xpath("//level1/level2[@first_att='att2']/@second_att")[0]
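
The same lookup can be tried without a file on disk. This sketch inlines the sample document as a bytes literal (an assumption made only so the example is self-contained) and also shows an equivalent that uses find plus get:

```python
from lxml import etree

# The sample document, inlined as bytes so no file is needed.
xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<level1>
  <level2 first_att='att1' second_att='foo'>8</level2>
  <level2 first_att='att2' second_att='bar'>8</level2>
</level1>"""

root = etree.fromstring(xml)

# Full XPath: pick the element by one attribute, then select the other one.
via_xpath = root.xpath("//level1/level2[@first_att='att2']/@second_att")[0]

# find() locates the element; the attribute is then read from it with get().
via_find = root.find("level2[@first_att='att2']").get("second_att")

print(via_xpath, via_find)
```

Both expressions yield the same value; the second stays within what find supports, at the cost of an extra call to read the attribute.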

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | GHZ | View Question on Stackoverflow
Solution 1 - Python | unutbu | View Answer on Stackoverflow
Solution 2 - Python | Use Me | View Answer on Stackoverflow