How to get JSON from webpage into Python script

Python, Json

Python Problem Overview


Got the following code in one of my scripts:

#
# url is defined above.
#
jsonurl = urlopen(url)

#
# While trying to debug, I put this in:
#
print jsonurl

#
# I was hoping text would contain the actual JSON from the URL, but it seems not...
#
text = json.loads(jsonurl)
print text

What I want to do is get the { ... } content that I see at the URL when I load it in Firefox into my script so I can parse a value out of it. I've Googled a ton, but I haven't found a good answer on how to actually get the { ... } content from a URL ending in .json into an object in a Python script.

Python Solutions


Solution 1 - Python

Get the data from the URL and then call json.loads, e.g.:

Python 3 example:

import urllib.request, json 
with urllib.request.urlopen("http://maps.googleapis.com/maps/api/geocode/json?address=google") as url:
    data = json.loads(url.read().decode())
    print(data)

Python 2 example:

import urllib, json
url = "http://maps.googleapis.com/maps/api/geocode/json?address=google"
response = urllib.urlopen(url)
data = json.loads(response.read())
print data

The output will look something like this:

{
"results" : [
    {
    "address_components" : [
        {
            "long_name" : "Charleston and Huff",
            "short_name" : "Charleston and Huff",
            "types" : [ "establishment", "point_of_interest" ]
        },
        {
            "long_name" : "Mountain View",
            "short_name" : "Mountain View",
            "types" : [ "locality", "political" ]
        },
        {
...
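Since the question is about pulling a value out of the parsed result, here is a minimal Python 3 sketch that drills into the nested structure of the sample output above; the specific keys come from that sample and will differ for other URLs:

import urllib.request, json

# Same geocode URL as in the Python 3 example above.
with urllib.request.urlopen("http://maps.googleapis.com/maps/api/geocode/json?address=google") as response:
    data = json.loads(response.read().decode())

results = data.get("results", [])
if results:
    # First address component of the first result, as in the sample output.
    component = results[0]["address_components"][0]
    print(component["long_name"])  # e.g. "Charleston and Huff"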

Solution 2 - Python

I'll take a guess that you actually want to get data from the URL:

jsonurl = urlopen(url)
text = json.loads(jsonurl.read()) # <-- read from it

Or, check out the JSON decoder in the requests library.

import requests
r = requests.get('someurl')
print r.json() # if the response body is JSON, this returns it parsed into a Python object
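On Python 3, a slightly more defensive sketch of the same requests approach (the URL here is a placeholder) checks the HTTP status and sets a timeout before parsing:

import requests

# Placeholder URL; substitute the JSON endpoint you actually want to read.
r = requests.get("https://example.com/data.json", timeout=10)
r.raise_for_status()  # raise an HTTPError for 4xx/5xx responses
data = r.json()       # parse the JSON body into a Python object
print(data)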

Solution 3 - Python

This fetches JSON from a webpage and parses it into a dictionary; it works with both Python 2.X and Python 3.X:

#!/usr/bin/env python

try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

import json


def get_jsonparsed_data(url):
    """
    Receive the content of ``url``, parse it as JSON and return the object.

    Parameters
    ----------
    url : str

    Returns
    -------
    dict
    """
    response = urlopen(url)
    data = response.read().decode("utf-8")
    return json.loads(data)


url = ("http://maps.googleapis.com/maps/api/geocode/json?"
       "address=googleplex&sensor=false")
print(get_jsonparsed_data(url))

See also: Read and write example for JSON

Solution 4 - Python

I have found this to be the easiest and most efficient way to get JSON from a webpage when using Python 3:

import json,urllib.request
data = urllib.request.urlopen("https://api.github.com/users?since=100").read()
output = json.loads(data)
print(output)

Solution 5 - Python

You need to import requests and use the json() method:

import requests

source = requests.get("url").json()
print(source)

Of course, this method also works:

import json,urllib.request
data = urllib.request.urlopen("url").read()
output = json.loads(data)
print(output)

json.loads will decode it into a Python object using the conversion table in the json module docs; for example, a JSON object becomes a Python dict.
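As a quick illustration of that mapping, using a literal JSON string so it runs standalone:

import json

raw = '{"name": "example", "count": 3, "tags": ["a", "b"], "active": true, "extra": null}'
parsed = json.loads(raw)
print(type(parsed))     # <class 'dict'>  (JSON object -> dict)
print(parsed["count"])  # 3               (JSON number -> int)
print(parsed["tags"])   # ['a', 'b']      (JSON array  -> list)
print(parsed["active"], parsed["extra"])  # True None   (true -> True, null -> None)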

Solution 6 - Python

All that the call to urlopen() does (according to the docs) is return a file-like object. Once you have that, you need to call its read() method to actually pull the JSON data across the network.

Something like:

jsonurl = urlopen(url)

text = json.loads(jsonurl.read())
print text

Solution 7 - Python

In Python 2, json.load() will work instead of json.loads()

import json
import urllib

url = 'https://api.github.com/users?since=100'
output = json.load(urllib.urlopen(url))
print(output)

Unfortunately, that doesn't work in Python 3. json.load is just a wrapper around json.loads that calls read() for a file-like object. json.loads requires a string object and the output of urllib.urlopen(url).read() is a bytes object. So one has to get the file encoding in order to make it work in Python 3.

In this example we query the headers for the encoding and fall back to utf-8 if we don't get one. The headers object differs between Python 2 and 3, so it has to be handled in different ways. Using requests would avoid all this, but sometimes you need to stick to the standard library.

import json
from six.moves.urllib.request import urlopen

DEFAULT_ENCODING = 'utf-8'
url = 'https://api.github.com/users?since=100'
urlResponse = urlopen(url)

if hasattr(urlResponse.headers, 'get_content_charset'):
    encoding = urlResponse.headers.get_content_charset(DEFAULT_ENCODING)
else:
    encoding = urlResponse.headers.getparam('charset') or DEFAULT_ENCODING

output = json.loads(urlResponse.read().decode(encoding))
print(output)

Solution 8 - Python

For Python >= 3.6 you can use:

import dload

j = dload.json(url)

Install dload with:

pip3 install dload

Solution 9 - Python

There's no need to use an extra library to parse the json...

json.loads() returns a dictionary (assuming the top-level JSON value is an object).

So in your case, just do text["someValueKey"]
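For example, a short sketch with made-up keys and a placeholder URL; substitute whatever your JSON actually contains:

import json
from urllib.request import urlopen

# Hypothetical endpoint, purely for illustration.
with urlopen("https://example.com/data.json") as resp:
    text = json.loads(resp.read().decode())

value = text["someValueKey"]                 # direct lookup; raises KeyError if missing
nested = text.get("outer", {}).get("inner")  # safer lookup for a nested key
print(value, nested)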

Solution 10 - Python

Not sure why all the earlier answers are using json.loads. All you need is:

import json
from urllib.request import urlopen

f = urlopen("https://www.openml.org/d/40996/json")
j = json.load(f)

This works because urlopen returns a file-like object, which works with json.load.

Solution 11 - Python

You can use json.dumps:

import json

# Here comes your received data

data = json.dumps(response)

print(data)

For loading JSON and writing it to a file, the following code is useful:

data = json.loads(json.dumps(response, sort_keys=False, indent=4))
with open('data.json', 'w') as outfile:
    json.dump(data, outfile, sort_keys=False, indent=4)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Chris B | View Question on Stackoverflow
Solution 1 - Python | Anurag Uniyal | View Answer on Stackoverflow
Solution 2 - Python | Jon Clements | View Answer on Stackoverflow
Solution 3 - Python | Martin Thoma | View Answer on Stackoverflow
Solution 4 - Python | Uxbridge | View Answer on Stackoverflow
Solution 5 - Python | mamal | View Answer on Stackoverflow
Solution 6 - Python | bgporter | View Answer on Stackoverflow
Solution 7 - Python | aviso | View Answer on Stackoverflow
Solution 8 - Python | Pedro Lobito | View Answer on Stackoverflow
Solution 9 - Python | posit labs | View Answer on Stackoverflow
Solution 10 - Python | Moe Kayali | View Answer on Stackoverflow
Solution 11 - Python | Keivan | View Answer on Stackoverflow