You can use `find_all` in the following way to find every `a` element that has an `href` attribute, and print each one:

    from bs4 import BeautifulSoup

    html = '''<a href="some_url">next</a>
    <span class="class"><a href="another_url">later</a></span>'''

    # Name the parser explicitly so the result does not depend on what is installed
    soup = BeautifulSoup(html, "html.parser")

    # href=True matches only the <a> tags that actually have an href attribute
    for a in soup.find_all('a', href=True):
        print("Found the URL:", a['href'])

The output would be:

<!-- language: lang-none -->

    Found the URL: some_url
    Found the URL: another_url

Note that if you're using an older version of BeautifulSoup (before version 4), this method is named `findAll`. In version 4, BeautifulSoup's method names [were changed to be PEP 8 compliant](http://www.crummy.com/software/BeautifulSoup/bs4/doc/#method-names), so you should use `find_all` instead.

--------

If you want _all_ tags with an `href`, you can omit the `name` parameter:

    href_tags = soup.find_all(href=True)
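
To see how the `name`-less form differs, here is a minimal sketch, assuming BeautifulSoup 4 (the `bs4` package) and Python 3. The sample markup and the `tags_with_href` helper are made up for illustration. `href=True` matches any tag that carries the attribute, whatever its value, so the `link` element comes back alongside the `a` elements, while the `a` without an `href` is skipped:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the question.
SAMPLE_HTML = """
<link href="style.css" rel="stylesheet">
<a href="some_url">next</a>
<a name="anchor-only">no link here</a>
<span class="class"><a href="another_url">later</a></span>
"""


def tags_with_href(html):
    """Return (tag name, href) pairs for every tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    # No name argument: any tag qualifies as long as the attribute is present.
    return [(tag.name, tag["href"]) for tag in soup.find_all(href=True)]


for name, href in tags_with_href(SAMPLE_HTML):
    print(name, href)
```

Because the filter guarantees the attribute exists, indexing with `tag["href"]` is safe here. On tags that were not filtered this way, `tag.get("href")` returns `None` instead of raising `KeyError`.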

**Update:**  

Again, thanks for the examples. They have been very helpful, and I don't mean to take anything away from them with the following.

As far as I understand the examples given so far, and state machines in general, aren't they only half of what we usually understand by a state machine?  
They do change state, but that is represented only by changing the value of a variable (and by allowing different value changes in different states). Usually, a state machine should also change its behavior, and not (only) in the sense of allowing different value changes for a variable depending on the state, but in the sense of allowing different methods to be executed in different states.

Or do I have a misconception of state machines and their common use?

---
**Original question:**  

I found this discussion about [state machines & iterator blocks in C#](https://stackoverflow.com/questions/1406986/does-c-include-finite-state-machines) and tools for creating state machines in C#. I found a lot of abstract material, but as a beginner I find all of it a little confusing.

So it would be great if someone could provide a C# source code example that implements a simple state machine with perhaps 3 or 4 states, just to get the gist of it.


Simple state machine example in C#?

I would like to undo my git pull because of unwanted commits on the remote origin, but I don't know which revision I have to reset to.

How can I go back to the state before I did the git pull from the remote origin?

How to undo a git pull?

I have the following `soup`:

    <a href="some_url">next</a>
    <span class="class">...</span>

From this I want to extract the href, `"some_url"`.

I can do it if I only have one tag, but here there are two tags. I can also get the text `'next'`, but that's not what I want.

Also, is there a good description of the API somewhere, with examples? I'm using [the standard documentation](http://www.crummy.com/software/BeautifulSoup/documentation.html), but I'm looking for something a little more organized.

BeautifulSoup getting href

I'd like to browse through the current folder and all its subfolders and get all the files with .htm or .html extensions. I have found out that it is possible to tell whether an object is a directory or a file like this:

    import os

    dirList = os.listdir("./")  # current directory
    for dir in dirList:
        if os.path.isdir(dir):
            # I don't know how to get into this dir and do the same thing here
            pass
        else:
            # I got a file and I can regexp whether it is .htm|html
            pass

In the end, I would like to have all the files and their paths in an array. Is something like that possible?

Browse files and subfolders in Python
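
One way to approach this is `os.walk` from the standard library, which visits a directory and every subdirectory beneath it, so no manual recursion is needed. The sketch below is only an illustration: the function name `find_html_files` and the case-insensitive extension check are assumptions of mine, not something stated in the question.

```python
import os


def find_html_files(root="."):
    """Collect the paths of all .htm/.html files under root, subfolders included."""
    matches = []
    # os.walk yields one (dirpath, dirnames, filenames) triple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            # str.endswith accepts a tuple of suffixes, so no regexp is needed.
            if filename.lower().endswith((".htm", ".html")):
                matches.append(os.path.join(dirpath, filename))
    return matches


if __name__ == "__main__":
    for path in find_html_files("."):
        print(path)
```

The returned list holds each file together with its path relative to `root`, which is the "array" the question asks for.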

I know that the `__call__` method of a class is triggered when an instance of the class is called. However, I have no idea when I would use this special method, because one can simply create a new method that performs the same operation as `__call__` and call that method instead of calling the instance.

I would really appreciate it if someone could give me a practical use of this special method.

Python __call__ special method practical example

I am looking for what type of code I would put in `__init__.py` files and what the related best practices are. Or is it bad practice in general?

Any reference to known documents that explain this is also very much appreciated.


Why would I put code in __init__.py files?

I am waiting for another developer to finish a piece of code that will return an np array of shape (100, 2000) with values of either -1, 0, or 1.

In the meantime, I want to randomly create an array with the same characteristics so I can get a head start on my development and testing. The thing is that I want this randomly created array to be the same each time, so that I'm not testing against an array that keeps changing its values each time I re-run my process.

I can create my array like this, but is there a way to create it so that it's the same each time? I could pickle the object and unpickle it, but I'm wondering if there's another way.

    r = np.random.randint(3, size=(100, 2000)) - 1


Consistently create same random numpy array

 1. I have a list of dictionaries containing unicode strings.
 2. `csv.DictWriter` can write a list of dictionaries into a CSV file.
 3. I want the CSV file to be encoded in UTF8.
 4. The `csv` module cannot handle converting unicode strings into UTF8.
 5. The `csv` module documentation has an example for converting everything to UTF8:

        def utf_8_encoder(unicode_csv_data):
            for line in unicode_csv_data:
                yield line.encode('utf-8')

 6. It also has a `UnicodeWriter` class.  

But... how do I make `DictWriter` work with these?  Wouldn't they have to inject themselves in the middle of it, to catch the disassembled dictionaries and encode them before it writes them to the file?  I don't get it.

Python DictWriter writing UTF-8 encoded CSV files
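
For readers on Python 3, where `str` is already Unicode, the encoding step moves to the file object rather than the `csv` module. This is a minimal sketch under that assumption (the question itself concerns Python 2); the `write_utf8_csv` helper and the field names in the usage note are hypothetical.

```python
import csv


def write_utf8_csv(path, rows, fieldnames):
    """Write a list of dicts to path as a UTF-8 encoded CSV file (Python 3)."""
    # encoding="utf-8" makes the file object do the encoding;
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

A call such as `write_utf8_csv("people.csv", [{"name": "Zoë", "city": "Kraków"}], ["name", "city"])` would then write the header row followed by one data row, with no separate encoder wrapped around `DictWriter`.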

When I try this:

    <option disabled = "disabled" <!-- Used to disable any particular option -->
            selected = "selected" <!-- Used to pre-select any particular option -->
            label = "string"      <!-- Used to provide a short version of the content in the option -->
            value = "value">      <!-- The actual value that will be sent to the server. If omitted, the content between the option opening and closing tags will be sent. -->

    Option 1
    </option>

I am trying to comment the attributes and values inside the opening tag of the element. However, this does not work, as browsers (tested on IE9, FF4.01, GG11, AF5 and Opera11) treat everything after `disabled="disabled"` as either comment or content.

Are HTML comments not allowed inside the opening tag of an element?

HTML Comments inside Opening Tag of the Element

How can I add an attribute into specific HTML tags in jQuery?

For example, like this simple HTML:

    <input id="someid" />

Then adding an attribute `disabled="true"` like this:

    <input id="someid" disabled="true" />

Adding attribute in jQuery

Is there a one-liner that shows me the dates when all git lightweight tags were created?

Something like `git show tags --format=date`?

git command to show all (lightweight) tags creation dates

I have this div element with a background image and I want to stop highlighting on the div element when double-clicking it.  Is there a CSS property for this?

How to stop highlighting of a div element when double-clicking

For example, in a blog post or article:

    <article>
    <h1>header</h1>
    <time>09-02-2011</time>
    <author>John</author>
    My article....
    </article>

The `author` tag doesn't exist, though. So what is the commonly used HTML5 tag for authors?
Thanks.

(If there isn't one, shouldn't there be?)

Which HTML5 tag should I use to mark up an author’s name?

    
I'm trying to get the content "My home address" using the following, but I get an AttributeError:

    address = soup.find(text="Address:")
    print address.nextSibling

This is my HTML:

    <td><b>Address:</b></td>
    <td>My home address</td>


What is a good way to navigate down to the `td` tag and pull out its content?

Beautifulsoup - nextSibling

I want to get all the `<a>` tags which are children of `<li>`:

<!-- language: lang-html -->

    <div>
    <li class="test">
        <a>link1</a>
        <ul>
           <li>
              <a>link2</a>
           </li>
        </ul>
    </li>
    </div>


I know how to find an element with a particular class like this:


    soup.find("li", { "class" : "test" })

But I don't know how to find all `<a>` which are children of `<li class=test>` but not any others.

Like I want to select:

<!-- language: lang-html -->

    <a>link1</a>



How to find children of nodes using BeautifulSoup

Let's say I have a page with a `div`. I can easily get that div with `soup.find()`.

Now that I have the result, I'd like to print the WHOLE `innerhtml` of that `div`: I mean, I need a string with ALL the HTML tags and text together, exactly like the string I'd get in JavaScript with `obj.innerHTML`.  Is this possible?


BeautifulSoup innerhtml?

How would I, using BeautifulSoup, search for tags containing ONLY the attributes I search for?

For example, I want to find all `<td valign="top">` tags.

The following code:
`raw_card_data = soup.fetch('td', {'valign':re.compile('top')})`

gets all of the data I want, but also grabs any `<td>` tag that has the attribute `valign:top` alongside other attributes.

I also tried:
`raw_card_data = soup.findAll(re.compile('<td valign="top">'))`
and this returns nothing (probably because of a bad regex).

I was wondering if there was a way in BeautifulSoup to say "Find `<td>` tags whose only attribute is `valign:top`".

**UPDATE**
For example, if an HTML document contained the following `<td>` tags:

    <td valign="top">.....</td><br />
    <td width="580" valign="top">.......</td><br />
    <td>.....</td><br />

I would want only the first `<td>` tag (`<td valign="top">`) to be returned.

How to find tags with only certain attributes - BeautifulSoup

I am using BeautifulSoup to look for user-entered strings on a specific page.
For example, I want to see if the string 'Python' is located on the page: http://python.org

When I used:
`find_string = soup.body.findAll(text='Python')`,
`find_string` returned `[]`

But when I used:
`find_string = soup.body.findAll(text=re.compile('Python'), limit=1)`,
`find_string` returned `[u'Python Jobs']` as expected

What is the difference between these two statements that makes the second statement work when there is more than one instance of the word to be searched?


| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | dkgirl | View Question on Stackoverflow |
| Solution 1 - Python | Mark Longair | View Answer on Stackoverflow |
