How to get rid of BeautifulSoup user warning?


Python Problem Overview


After installing BeautifulSoup, I get this warning whenever I run my Python script from cmd:

D:\Application\python\lib\site-packages\beautifulsoup4-4.4.1-py3.4.egg\bs4\__init__.py:166:
UserWarning: No parser was explicitly specified, so I'm using the best
available HTML parser for this system ("html.parser"). This usually isn't a
problem, but if you run this code on another system, or in a different
virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

I have no idea why it appears or how to fix it.

Python Solutions


Solution 1 - Python

The solution to your problem is stated in the warning message itself. Code like the following does not specify which parser (HTML, XML, etc.) to use:

BeautifulSoup( ... )

To get rid of the warning, specify which parser you'd like to use, like so:

BeautifulSoup( ..., "html.parser" )

You can also install a third-party parser, such as lxml or html5lib, if you'd like.
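As a minimal sketch of the fix described above: the snippet below passes the parser name explicitly. It assumes the bs4 package is installed, and the sample markup is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
markup = "<p>Hello, <b>world</b></p>"

# Naming the parser explicitly avoids the "No parser was explicitly
# specified" warning and keeps behaviour the same across systems.
soup = BeautifulSoup(markup, "html.parser")

text = soup.get_text()
print(text)
```

Because "html.parser" ships with Python, this form should work without installing anything beyond BeautifulSoup itself.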

Solution 2 - Python

The documentation recommends installing and using lxml for speed:

BeautifulSoup(html, "lxml")

> If you’re using a version of Python 2 earlier than 2.7.3, or a version of Python 3 earlier than 3.2.2, it’s essential that you install lxml or html5lib–Python’s built-in HTML parser is just not very good in older versions.

Installing the lxml parser

  • On Debian/Ubuntu (the package is python-lxml for Python 2)

     apt-get install python3-lxml
    
  • On Fedora (RHEL-based; the package is python-lxml for Python 2)

     dnf install python3-lxml
    
  • Using pip

     pip install lxml
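
If lxml may or may not be present on a given machine, one possible approach is to check for it and pick the parser name in code. This is only a sketch: pick_parser is a hypothetical helper name, not part of BeautifulSoup, and it uses only the standard library.

```python
from importlib.util import find_spec


def pick_parser():
    """Return "lxml" when the lxml package is importable, else "html.parser"."""
    # find_spec returns None when the module cannot be found.
    if find_spec("lxml") is not None:
        return "lxml"
    return "html.parser"


parser_name = pick_parser()
print(parser_name)
```

The result could then be passed as the second argument, e.g. BeautifulSoup(html, pick_parser()). Note that this silences the warning but still lets the parser differ between systems, which is exactly what the warning is cautioning about, so pinning a single parser is usually the safer choice.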
    

Solution 3 - Python

To use the html5lib parser, install it first:

pip install html5lib

then pass 'html5lib' as the parser argument to the BeautifulSoup constructor:

import bs4
# req1 is assumed to be an HTTP response object with a .text attribute
htmlDoc = bs4.BeautifulSoup(req1.text, 'html5lib')
print(htmlDoc)

Solution 4 - Python

In my opinion, the previous posts did not answer the question.

Yes, as everyone said, you can remove the warning by specifying the parser.
And, as the documentation points out, this is best practice for both performance and consistency.

But in some cases, you want to silence the warning... Hence this post.

  • since BeautifulSoup 4 rev 460, the warning message does not appear in interactive (REPL) mode
  • there are more general answers on controlling Python warnings at https://stackoverflow.com/questions/14463277 (TL;DR: PYTHONWARNINGS=ignore or -Wignore)
  • suppressing the warning explicitly (bs4 ≥ rev 569) by adding to your code:
    import warnings
    from bs4 import GuessedAtParserWarning
    warnings.filterwarnings('ignore', category=GuessedAtParserWarning)
    
  • cheating by letting bs4 think you provided the parser, i.e.:
    bs4.BeautifulSoup(
      your_markup,
      builder=bs4.builder_registry.lookup(*bs4.BeautifulSoup.DEFAULT_BUILDER_FEATURES)
    )
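
If you would rather not silence the warning for the whole program, the suppression can be limited to a single call. The sketch below assumes a bs4 version recent enough to export GuessedAtParserWarning; parse_quietly is a hypothetical helper name and the markup is made up.

```python
import warnings

from bs4 import BeautifulSoup, GuessedAtParserWarning


def parse_quietly(markup):
    """Parse markup without naming a parser, hiding only the parser-guess warning."""
    # catch_warnings restores the previous warning filters on exit,
    # so the "ignore" rule applies only inside this block.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=GuessedAtParserWarning)
        return BeautifulSoup(markup)


soup = parse_quietly("<p>hi</p>")
print(soup.p.get_text())
```

Other warnings, and the same warning raised elsewhere in the program, remain visible with this approach.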
    

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | jellyfishhuang | View Question on Stackoverflow
Solution 1 - Python | Ethan Bierlein | View Answer on Stackoverflow
Solution 2 - Python | Gayan Weerakutti | View Answer on Stackoverflow
Solution 3 - Python | Wilson Wu | View Answer on Stackoverflow
Solution 4 - Python | bufh | View Answer on Stackoverflow