Get an attribute value based on the name attribute with BeautifulSoup

PythonBeautifulsoup

Python Problem Overview


I want to print an attribute value based on its name, take for example

<META NAME="City" content="Austin">

I want to do something like this

soup = BeautifulSoup(f)  # f is some HTML containing the above meta tag
for meta_tag in soup("meta"):
    if meta_tag["name"] == "City":
        print(meta_tag["content"])

The above code give a KeyError: 'name', I believe this is because name is used by BeatifulSoup so it can't be used as a keyword argument.

Python Solutions


Solution 1 - Python

It's pretty simple, use the following -

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<META NAME="City" content="Austin">')
>>> soup.find("meta", {"name":"City"})
<meta name="City" content="Austin" />
>>> soup.find("meta", {"name":"City"})['content']
u'Austin'

Leave a comment if anything is not clear.

Solution 2 - Python

theharshest answered the question but here is another way to do the same thing. Also, In your example you have NAME in caps and in your code you have name in lowercase.

s = '<div class="question" id="get attrs" name="python" x="something">Hello World</div>'
soup = BeautifulSoup(s)

attributes_dictionary = soup.find('div').attrs
print attributes_dictionary
# prints: {'id': 'get attrs', 'x': 'something', 'class': ['question'], 'name': 'python'}

print attributes_dictionary['class'][0]
# prints: question

print soup.find('div').get_text()
# prints: Hello World

Solution 3 - Python

6 years late to the party but I've been searching for how to extract an html element's tag attribute value, so for:

<span property="addressLocality">Ayr</span>

I want "addressLocality". I kept being directed back here, but the answers didn't really solve my problem.

How I managed to do it eventually:

>>> from bs4 import BeautifulSoup as bs

>>> soup = bs('<span property="addressLocality">Ayr</span>', 'html.parser')
>>> my_attributes = soup.find().attrs
>>> my_attributes
{u'property': u'addressLocality'}

As it's a dict, you can then also use keys and 'values'

>>> my_attributes.keys()
[u'property']
>>> my_attributes.values()
[u'addressLocality']

Hopefully it helps someone else!

Solution 4 - Python

theharshest's answer is the best solution, but FYI the problem you were encountering has to do with the fact that a Tag object in Beautiful Soup acts like a Python dictionary. If you access tag['name'] on a tag that doesn't have a 'name' attribute, you'll get a KeyError.

Solution 5 - Python

The following works:

from bs4 import BeautifulSoup

soup = BeautifulSoup('<META NAME="City" content="Austin">', 'html.parser')

metas = soup.find_all("meta")

for meta in metas:
    print meta.attrs['content'], meta.attrs['name']

Solution 6 - Python

One can also try this solution :

To find the value, which is written in span of table

htmlContent


<table>
    <tr>
        <th>
            ID
        </th>
        <th>
            Name
        </th>
    </tr>


    <tr>
        <td>
            <span name="spanId" class="spanclass">ID123</span>
        </td>

        <td>
            <span>Bonny</span>
        </td>
    </tr>
</table>

Python code


soup = BeautifulSoup(htmlContent, "lxml")
soup.prettify()

tables = soup.find_all("table")

for table in tables:
   storeValueRows = table.find_all("tr")
   thValue = storeValueRows[0].find_all("th")[0].string

   if (thValue == "ID"): # with this condition I am verifying that this html is correct, that I wanted.
      value = storeValueRows[1].find_all("span")[0].string
      value = value.strip()

      # storeValueRows[1] will represent <tr> tag of table located at first index and find_all("span")[0] will give me <span> tag and '.string' will give me value

      # value.strip() - will remove space from start and end of the string.

     # find using attribute :

     value = storeValueRows[1].find("span", {"name":"spanId"})['class']
     print value
     # this will print spanclass

Solution 7 - Python

If tdd='<td class="abc"> 75</td>'
In Beautifulsoup 

if(tdd.has_attr('class')):
   print(tdd.attrs['class'][0])


Result:  abc

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRuthView Question on Stackoverflow
Solution 1 - PythontheharshestView Answer on Stackoverflow
Solution 2 - PythonDeliciousView Answer on Stackoverflow
Solution 3 - Pythonron_gView Answer on Stackoverflow
Solution 4 - PythonLeonard RichardsonView Answer on Stackoverflow
Solution 5 - PythonBrightMoonView Answer on Stackoverflow
Solution 6 - PythonUjjaval MoradiyaView Answer on Stackoverflow
Solution 7 - PythonPriyank SinghalView Answer on Stackoverflow