Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org

Python, Web Scraping, BeautifulSoup, Scrapy, SSL Certificate

Python Problem Overview


I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem:

from urllib.request import urlopen 
from bs4 import BeautifulSoup 
import re

pages = set()
def getLinks(pageUrl):
	global pages
	html = urlopen("http://en.wikipedia.org"+pageUrl)
	bsObj = BeautifulSoup(html)
	for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
		if 'href' in link.attrs:
			if link.attrs['href'] not in pages:
				#We have encountered a new page
				newPage = link.attrs['href'] 
				print(newPage) 
				pages.add(newPage) 
				getLinks(newPage)
getLinks("")

The error is:

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1049)>

By the way, I was also practicing with Scrapy, but I kept getting this problem: command not found: scrapy (I tried all sorts of solutions online, but none of them worked... really frustrating).

Python Solutions


Solution 1 - Python

I ran into this issue once. If you're using macOS, go to Macintosh HD > Applications > Python3.6 folder (or whichever version of Python you're using) and double-click the "Install Certificates.command" file. :D

Solution 2 - Python

To use unverified SSL (that is, to skip certificate verification), you can add this to your code:

import ssl
ssl._create_default_https_context = ssl._create_unverified_context
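
The assignment above changes the default for every HTTPS request made through urllib in the process. As a narrower alternative, urlopen accepts a context argument, so verification can be skipped for a single call only. This is a minimal sketch rather than part of the original answer; make_unverified_context and fetch_unverified are hypothetical helper names.

```python
import ssl
from urllib.request import urlopen


def make_unverified_context():
    """Build an SSL context that skips certificate and hostname checks."""
    context = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_unverified(url):
    """Fetch one URL without verification; other requests are unaffected."""
    with urlopen(url, context=make_unverified_context()) as response:
        return response.read()
```

Calling fetch_unverified("https://en.wikipedia.org/wiki/Main_Page") would perform the request; every other urlopen call keeps the normal verifying behavior.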

Solution 3 - Python

This terminal command:

open /Applications/Python\ 3.7/Install\ Certificates.command

Found here: https://stackoverflow.com/a/57614113/6207266

It resolved the problem for me. With my configuration, running

pip install --upgrade certifi

had no impact.

Solution 4 - Python

To solve this:

All you need to do is install the Python certificates! This is a common issue on macOS.

Open these files:

Install Certificates.command
Update Shell Profile.command

Simply run these two scripts and you won't have this issue anymore.

Hope this helps!

Solution 5 - Python

For novice users: go to the Applications folder and open the Python 3.7 folder. First run (or double-click) Install Certificates.command, and then run Update Shell Profile.command.

Solution 6 - Python

If you are using Anaconda, install the certifi package. See more at:

https://anaconda.org/anaconda/certifi

To install, type this line in your terminal:

conda install -c anaconda certifi
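
Once certifi is installed, its CA bundle can also be handed to urllib explicitly instead of relying on the interpreter's default store. The following is a minimal sketch, assuming certifi is importable in the same interpreter that runs the scraper; make_certifi_context is a hypothetical helper name, and it falls back to the default context when certifi is missing.

```python
import ssl


def make_certifi_context():
    """Return an SSL context that trusts certifi's CA bundle when available."""
    try:
        import certifi  # third-party package, installed via conda or pip
    except ImportError:
        # Without certifi, fall back to the interpreter's default CA store.
        return ssl.create_default_context()
    return ssl.create_default_context(cafile=certifi.where())


# Example use (performs a network request):
#   from urllib.request import urlopen
#   html = urlopen("https://en.wikipedia.org/wiki/Main_Page",
#                  context=make_certifi_context()).read()
```

Verification stays enabled in both branches; only the source of the trusted root certificates changes.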

Solution 7 - Python

Try this command in the terminal:

open /Applications/Python\ 3.7/Install\ Certificates.command

Solution 8 - Python

Two steps worked for me:

  • Go to Macintosh HD > Applications > Python3.7 folder
  • Click on "Install Certificates.command"

Solution 9 - Python

I found this solution, and it works fine:

cd /Applications/Python\ 3.7/
./Install\ Certificates.command

Solution 10 - Python

I had the same error and solved the problem by running the program code below:

# install_certifi.py
#
# sample script to install or update a set of default Root Certificates
# for the ssl module.  Uses the certificates provided by the certifi package:
#       https://pypi.python.org/pypi/certifi

import os
import os.path
import ssl
import stat
import subprocess
import sys

STAT_0o775 = ( stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR
             | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP
             | stat.S_IROTH |                stat.S_IXOTH )


def main():
    openssl_dir, openssl_cafile = os.path.split(
        ssl.get_default_verify_paths().openssl_cafile)

    print(" -- pip install --upgrade certifi")
    subprocess.check_call([sys.executable,
        "-E", "-s", "-m", "pip", "install", "--upgrade", "certifi"])

    import certifi

    # change working directory to the default SSL directory
    os.chdir(openssl_dir)
    relpath_to_certifi_cafile = os.path.relpath(certifi.where())
    print(" -- removing any existing file or link")
    try:
        os.remove(openssl_cafile)
    except FileNotFoundError:
        pass
    print(" -- creating symlink to certifi certificate bundle")
    os.symlink(relpath_to_certifi_cafile, openssl_cafile)
    print(" -- setting permissions")
    os.chmod(openssl_cafile, STAT_0o775)
    print(" -- update complete")

if __name__ == '__main__':
    main()

Solution 11 - Python

Take a look at these posts. It seems that in later versions of Python the certificates are not preinstalled, which appears to cause this error. You should be able to run the following command to install the certifi package: /Applications/Python\ 3.6/Install\ Certificates.command

Post 1: https://stackoverflow.com/questions/27835619/urllib-and-ssl-certificate-verify-failed-error

Post 2: https://stackoverflow.com/questions/51774807/airbrake-error-urlopen-error-ssl-certificate-verify-failed-certificate-verif

Solution 12 - Python

I didn't solve the problem, sadly, but I managed to make my code work (almost all of my code has this problem, by the way). The local issuer certificate problem happens under Python 3.7, so I switched back to Python 2.7. The changes that were needed included using "from urllib2 import urlopen" instead of "from urllib.request import urlopen". So sad...

Solution 13 - Python

If you're running on a Mac, you could just search for Install Certificates.command in Spotlight and hit Enter.

Solution 14 - Python

I'm a relative novice compared to all the experts on Stack Overflow.

I have 2 versions of jupyter notebook running (one through a fresh Anaconda Navigator installation and one through ????). I think this is because Anaconda was installed as a local installation on my Mac (per Anaconda instructions).

I already had python 3.7 installed. After that, I used my terminal to open jupyter notebook and I think that it put another version globally onto my Mac.

However, I'm not sure because I'm just learning through trial and error!

I did the terminal command:

conda install -c anaconda certifi 

(as directed above, but it didn't work.)

My Python 3.7 is installed on macOS Catalina 10.15.3 in:

  • /Library/Python/3.7/site-packages AND
  • ~/Library/Python/3.7/lib/python/site-packages

The certificate is at:

  • ~/Library/Python/3.7/lib/python/site-packages/certifi-2019.11.28.dist-info

I tried to find Install Certificates.command, but couldn't find it by looking through the file structure: not in Applications, and not in the links above.

I finally found it through Spotlight (as someone suggested above). Double-clicking it ran it automatically and installed ANOTHER certificate in the same folder:

  • ~/Library/Python/3.7/lib/python/site-packages/

NONE of the above solved anything for me...I still got the same error.

So, I solved the problem by:

  1. closing my jupyter notebook.
  2. opening Anaconda Navigator.
  3. opening jupyter notebook through the Navigator GUI (instead of through Terminal).
  4. opening my notebook and running the code.

I can't tell you why this worked. But it solved the problem for me.

I just want to save someone the hassle next time. If someone can tell me why it worked, that would be terrific.

I didn't try the other terminal commands because of the 2 versions of jupyter notebook that I knew were a problem. I just don't know how to fix that.
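
One possible explanation, offered as a guess rather than a diagnosis, is that the two Jupyter installations run different Python interpreters, each with its own certificate store. A small report like the sketch below, run in a notebook cell from each installation, shows which interpreter is active and where it looks for CA certificates. The function name interpreter_report is hypothetical.

```python
import os
import ssl
import sys


def interpreter_report():
    """Collect which Python is running and where it looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        # cafile is None when the default CA file does not exist.
        "cafile": paths.cafile,
        "openssl_cafile": paths.openssl_cafile,
        "openssl_cafile_exists": os.path.exists(paths.openssl_cafile),
    }


for key, value in interpreter_report().items():
    print(f"{key}: {value}")
```

If the two notebooks print different executable paths, fixing the certificates for one interpreter would not be expected to affect the other.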

Solution 15 - Python

Use the requests library. Try this solution, or just use https:// at the start of the URL:

import requests
from bs4 import BeautifulSoup
import re

pages = set()
def getLinks(pageUrl):
    global pages
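    # verify=False skips certificate verification for this request
    html = requests.get("http://en.wikipedia.org"+pageUrl, verify=False).text
    # Name the parser explicitly so BeautifulSoup does not have to guess
    bsObj = BeautifulSoup(html, "html.parser")
    for link in bsObj.find_all("a", href=re.compile("^(/wiki/)")):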
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                #We have encountered a new page
                newPage = link.attrs['href']
                print(newPage)
                pages.add(newPage)
                getLinks(newPage)
getLinks("")

Check if this works for you

Solution 16 - Python

For me the problem was that I was setting REQUESTS_CA_BUNDLE in my .bash_profile

/Users/westonagreene/.bash_profile:
...
export REQUESTS_CA_BUNDLE=/usr/local/etc/openssl/cert.pem
...

Once I set REQUESTS_CA_BUNDLE to blank (i.e. removed it from .bash_profile), requests worked again.

export REQUESTS_CA_BUNDLE=""

The problem only appeared when running Python requests from a CLI (command-line interface). If I ran requests.get(URL, CERT) it resolved just fine.
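
To see whether a leftover override like this is in play, the variable can be inspected from the same shell session that runs the script. The following is a minimal sketch using only the standard library; check_ca_bundle is a hypothetical helper name, and it only reports on the variable without changing it.

```python
import os


def check_ca_bundle(environ=None, name="REQUESTS_CA_BUNDLE"):
    """Describe whether the CA bundle variable points at an existing path."""
    environ = os.environ if environ is None else environ
    path = environ.get(name)
    if not path:
        return f"{name} is unset or empty"
    if not os.path.exists(path):
        return f"{name} points to a missing path: {path}"
    return f"{name} overrides the CA bundle with: {path}"


print(check_ca_bundle())
```

A "missing path" result, or a path to an outdated bundle, would be consistent with the behavior described above.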

Environment: macOS Catalina (10.15.6), Python 3.6.11 via pyenv. The error message I was getting: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)

My answer elsewhere: https://stackoverflow.com/a/64151964/4420657

Solution 17 - Python

I am using Debian 10 (buster). When I try to download a file with youtube-dl by running sudo youtube-dl -k https://youtu.be/uscis0CnDjk, I get this error:

> [youtube] uscis0CnDjk: Downloading webpage ERROR: Unable to download webpage: (caused by URLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)')))

The certificates for python2 and python3.8 are installed correctly, but I persistently receive the same error. In the end, what worked for me (though it is not the best solution) was to disable the certificate check, which youtube-dl offers as an option, with this command: sudo youtube-dl -k --no-check-certificate https://youtu.be/uscis0CnDjk

Solution 18 - Python

I am seeing this issue on a Ubuntu 20.04 system and none of the "real fixes" (like this one) helped.

While Firefox was willing to open the site just fine, neither GNOME Web (i.e. Epiphany), Python 3, nor wget would accept the certificate. After some searching, I came across this answer on ServerFault, which lists two common reasons:

> * The certificate is really signed by an unknown CA (for instance an internal CA).
> * The certificate is signed with an intermediate CA certificate from one of the well-known CAs, and the remote server is misconfigured in that it doesn't include that intermediate CA certificate as a CA chain in its response.

You can use the Qualys SSL Labs website to check the site's certificates and if there are issues, contact the site's administrator to have it fixed.

If you really need to work around the issue right now, I'd recommend a temporary solution like Rambod's confined to the site(s) you're trying to access.
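
One way to keep such a workaround confined to specific sites is to choose the SSL context per URL, so that verification is skipped only for hosts on an explicit list. This is a minimal sketch of that idea, not the approach from any of the answers; INSECURE_HOSTS, its example hostname, and context_for are all hypothetical names.

```python
import ssl
from urllib.parse import urlparse

# Hypothetical allowlist: hosts whose certificate chains are known to be broken.
INSECURE_HOSTS = {"misconfigured.example.com"}


def context_for(url, insecure_hosts=INSECURE_HOSTS):
    """Return a verifying context unless the URL's host is on the allowlist."""
    context = ssl.create_default_context()
    if urlparse(url).hostname in insecure_hosts:
        # Only these hosts skip verification; every other site stays checked.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

The result can be passed to urllib as urlopen(url, context=context_for(url)), and the list can be emptied again once the site's administrator has fixed the certificate chain.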

Solution 19 - Python

BTW, if you are getting the same error using aiohttp, just pass the verify_ssl=False argument to your TCPConnector (newer aiohttp releases deprecate verify_ssl in favor of ssl=False):

import aiohttp
...

async with aiohttp.ClientSession(
    connector=aiohttp.TCPConnector(verify_ssl=False)
) as session:
    async with session.get(url) as response:
        body = await response.text()

Solution 20 - Python

This will work. Set the environment variable PYTHONHTTPSVERIFY to 0.

  • By typing this Linux command:
export PYTHONHTTPSVERIFY=0

OR

  • By setting it in Python code:
import os
os.environ["PYTHONHTTPSVERIFY"] = "0"

Solution 21 - Python

I am using Anaconda on Windows. I was getting the same error until I tried the following:

import urllib.request
link = 'http://docs.python.org'
with urllib.request.urlopen(link) as response:
    htmlSource = response.read()

which I got from the stackoverflow thread on using urlopen:

https://stackoverflow.com/questions/25863101/python-urllib-urlopen-not-working

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Catherine4j | View Question on Stackoverflow
Solution 1 - Python | Jey Miranda | View Answer on Stackoverflow
Solution 2 - Python | Rambod | View Answer on Stackoverflow
Solution 3 - Python | Hillsie | View Answer on Stackoverflow
Solution 4 - Python | Azim | View Answer on Stackoverflow
Solution 5 - Python | Hemant | View Answer on Stackoverflow
Solution 6 - Python | Amy Mou | View Answer on Stackoverflow
Solution 7 - Python | Muzammil-cyber | View Answer on Stackoverflow
Solution 8 - Python | Alexis Berson | View Answer on Stackoverflow
Solution 9 - Python | Alexandre Crivellaro | View Answer on Stackoverflow
Solution 10 - Python | Milovan Tomašević | View Answer on Stackoverflow
Solution 11 - Python | Patrick Suzuki | View Answer on Stackoverflow
Solution 12 - Python | Catherine4j | View Answer on Stackoverflow
Solution 13 - Python | VIC3KING | View Answer on Stackoverflow
Solution 14 - Python | user3303164 | View Answer on Stackoverflow
Solution 15 - Python | Nitin | View Answer on Stackoverflow
Solution 16 - Python | Weston Greene | View Answer on Stackoverflow
Solution 17 - Python | tedy58 | View Answer on Stackoverflow
Solution 18 - Python | FriendFX | View Answer on Stackoverflow
Solution 19 - Python | SimfikDuke | View Answer on Stackoverflow
Solution 20 - Python | Saurabh | View Answer on Stackoverflow
Solution 21 - Python | Wayne Wignes | View Answer on Stackoverflow