URL query parameters to dict python


Python Problem Overview


Is there a way to parse a URL (with some Python library) and return a Python dictionary with the keys and values of the query parameters part of the URL?

For example:

url = "http://www.example.org/default.html?ct=32&op=92&item=98"

expected return:

{'ct':32, 'op':92, 'item':98}

Python Solutions


Solution 1 - Python

Use the urllib.parse library:

>>> from urllib import parse
>>> url = "http://www.example.org/default.html?ct=32&op=92&item=98"
>>> parse.urlsplit(url)
SplitResult(scheme='http', netloc='www.example.org', path='/default.html', query='ct=32&op=92&item=98', fragment='')
>>> parse.parse_qs(parse.urlsplit(url).query)
{'item': ['98'], 'op': ['92'], 'ct': ['32']}
>>> dict(parse.parse_qsl(parse.urlsplit(url).query))
{'item': '98', 'op': '92', 'ct': '32'}

The urllib.parse.parse_qs() and urllib.parse.parse_qsl() functions parse query strings, taking into account that keys can occur more than once and that order may matter.
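To make the difference concrete, here is a small sketch with a repeated key. The extra item=99 parameter is added for illustration and is not part of the question's URL. parse_qs collects every value for a key into a list, parse_qsl keeps the pairs in their original order, and calling dict() on those pairs keeps only the last value for a repeated key.

```python
from urllib.parse import urlsplit, parse_qs, parse_qsl

# Illustrative URL: "item" appears twice (an assumption, not the question's URL).
url = "http://www.example.org/default.html?ct=32&op=92&item=98&item=99"
query = urlsplit(url).query

multi = parse_qs(query)    # every value for each key, collected in lists
pairs = parse_qsl(query)   # (key, value) tuples in order of appearance
last_wins = dict(pairs)    # a plain dict keeps only the last value per key

print(multi)      # {'ct': ['32'], 'op': ['92'], 'item': ['98', '99']}
print(pairs)      # [('ct', '32'), ('op', '92'), ('item', '98'), ('item', '99')]
print(last_wins)  # {'ct': '32', 'op': '92', 'item': '99'}
```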

If you are still on Python 2, urllib.parse was called urlparse.
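If the same code has to run on both versions, one common pattern is to try the Python 3 import first and fall back to the Python 2 module name. This is only a sketch; query_to_dict is a hypothetical helper name, not something from either library.

```python
try:
    # Python 3 location
    from urllib.parse import urlsplit, parse_qsl
except ImportError:
    # Python 2 location
    from urlparse import urlsplit, parse_qsl


def query_to_dict(url):
    """Hypothetical helper: return the query parameters of url as a dict."""
    return dict(parse_qsl(urlsplit(url).query))


params = query_to_dict("http://www.example.org/default.html?ct=32&op=92&item=98")
print(params)
```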

Solution 2 - Python

For Python 3, the values of the dict from parse_qs are in a list, because there might be multiple values. If you just want the first one:

>>> from urllib.parse import urlsplit, parse_qs
>>>
>>> url = "http://www.example.org/default.html?ct=32&op=92&item=98"
>>> query = urlsplit(url).query
>>> params = parse_qs(query)
>>> params
{'item': ['98'], 'op': ['92'], 'ct': ['32']}
>>> dict(params)
{'item': ['98'], 'op': ['92'], 'ct': ['32']}
>>> {k: v[0] for k, v in params.items()}
{'item': '98', 'op': '92', 'ct': '32'}
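Note that the question's expected output has integer values, while parse_qs and parse_qsl always return strings. A minimal sketch of one way to convert them, assuming that only values made up entirely of decimal digits should become int (so negative numbers and floats stay strings). query_to_int_dict is a hypothetical helper name.

```python
from urllib.parse import urlsplit, parse_qsl


def query_to_int_dict(url):
    """Hypothetical helper: digit-only values become int, the rest stay str."""
    result = {}
    for key, value in parse_qsl(urlsplit(url).query):
        # str.isdecimal() is True only for non-empty strings of decimal digits.
        result[key] = int(value) if value.isdecimal() else value
    return result


url = "http://www.example.org/default.html?ct=32&op=92&item=98"
print(query_to_int_dict(url))  # {'ct': 32, 'op': 92, 'item': 98}
```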

Solution 3 - Python

If you prefer not to use a parser:

url = "http://www.example.org/default.html?ct=32&op=92&item=98"
query = url.split("?")[1]
params = {x[0]: x[1] for x in [x.split("=") for x in query.split("&")]}

I'm leaving the code above in place, but it's definitely not what you should use.

I skimmed a few of the other answers and they looked a little complicated. In case you're like me, don't use my solution.

Use this:

from urllib import parse
params = dict(parse.parse_qsl(parse.urlsplit(url).query))

And for Python 2.x:

import urlparse as parse
params = dict(parse.parse_qsl(parse.urlsplit(url).query))

I know this is the same as the accepted answer, just as a one-liner that can be copied.

Solution 4 - Python

For Python 2.7:

In [14]: url = "http://www.example.org/default.html?ct=32&op=92&item=98"

In [15]: from urlparse import urlparse, parse_qsl

In [16]: parse_url = urlparse(url)

In [17]: query_dict = dict(parse_qsl(parse_url.query))

In [18]: query_dict
Out[18]: {'ct': '32', 'item': '98', 'op': '92'}

Solution 5 - Python

I agree about not reinventing the wheel, but sometimes (while you're learning) it helps to build a wheel in order to understand a wheel. :) So, from a purely academic perspective, I offer this with the caveat that using a dictionary assumes that the name-value pairs are unique (that the query string does not contain the same name more than once).

url = 'http:/mypage.html?one=1&two=2&three=3'

page, query = url.split('?')
    
names_values_dict = dict(pair.split('=') for pair in query.split('&'))

names_values_list = [pair.split('=') for pair in query.split('&')]

I'm using Python 3.6.5 in the IDLE IDE.
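Beyond repeated names, manual splitting also skips URL decoding. The sketch below compares it with parse_qsl on a made-up query string (an assumption for illustration, not from the question) that contains an encoded space, an encoded "=", a "+" and an empty value. A pair with no "=" at all, such as a bare "flag", would additionally make the dict() call in the manual version raise a ValueError.

```python
from urllib.parse import parse_qsl

# Illustrative query string, not taken from the question.
query = "q=hello%20world&expr=a%3Db&name=John+Smith&empty="

# Manual splitting leaves percent-encoding and "+" untouched.
manual = dict(pair.split("=") for pair in query.split("&"))

# parse_qsl decodes the values; keep_blank_values=True keeps the "empty" key,
# which would otherwise be dropped.
parsed = dict(parse_qsl(query, keep_blank_values=True))

print(manual)  # {'q': 'hello%20world', 'expr': 'a%3Db', 'name': 'John+Smith', 'empty': ''}
print(parsed)  # {'q': 'hello world', 'expr': 'a=b', 'name': 'John Smith', 'empty': ''}
```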

Solution 6 - Python

For Python 2.7, I am using the urlparse module to parse the URL query into a dict.

import urlparse

url = "http://www.example.org/default.html?ct=32&op=92&item=98"

print urlparse.parse_qs( urlparse.urlparse(url).query )
# result: {'item': ['98'], 'op': ['92'], 'ct': ['32']} 

Solution 7 - Python

from urllib.parse import splitquery, parse_qs, parse_qsl

url = "http://www.example.org/default.html?ct=32&op=92&item=98&item=99"

splitquery(url)
# ('http://www.example.org/default.html', 'ct=32&op=92&item=98&item=99')

parse_qs(splitquery(url)[1])
# {'ct': ['32'], 'op': ['92'], 'item': ['98', '99']}

dict(parse_qsl(splitquery(url)[1]))
# {'ct': '32', 'op': '92', 'item': '99'}

# also works with url w/o query
parse_qs(splitquery("http://example.org")[1])
# {}

dict(parse_qsl(splitquery("http://example.org")[1]))
# {}

Old question, but I thought I'd chip in after I came across this splitquery function. I'm not sure about Python 2, since I don't use it. splitquery does a bit more than re.split(r"\?", url, 1).
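One caveat: splitquery is an undocumented helper, and it has been deprecated since Python 3.8. As a sketch, the same results can be obtained with the documented urlsplit function, which also handles a URL without a query string.

```python
from urllib.parse import urlsplit, parse_qs, parse_qsl

with_query = "http://www.example.org/default.html?ct=32&op=92&item=98&item=99"
without_query = "http://example.org"

# The .query attribute is an empty string when the URL has no query part.
print(urlsplit(with_query).query)                       # ct=32&op=92&item=98&item=99
print(parse_qs(urlsplit(with_query).query))             # {'ct': ['32'], 'op': ['92'], 'item': ['98', '99']}
print(parse_qs(urlsplit(without_query).query))          # {}
print(dict(parse_qsl(urlsplit(without_query).query)))   # {}
```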

Solution 8 - Python

WSGI, Python 2.7

Code

    import sys
    import json
    import cgi
    import urlparse

    def application(environ, start_response):

      status = '200 OK'

      method = environ['REQUEST_METHOD']
      args = urlparse.parse_qs(environ['QUERY_STRING'])
      m = args['mesg']

      x = {
      "input": m[0],
      "result": m[0].capitalize()
      }

      # convert into JSON:
      y = json.dumps(x)

      output = y

      response_headers = [('Content-type', 'application/json'),
                    ('Content-Length', str(len(output)))]
      start_response(status, response_headers)
      print (sys.version_info)
      return [output]

URL

http:///echo.py?mesg=hola

Response

{"input": "hola", "result": "Hola"}
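For reference, a rough Python 3 equivalent might look like the sketch below. Two things are assumptions of this sketch rather than part of the answer above: a missing mesg parameter is treated as an empty message instead of raising KeyError, and fake_start_response is a hypothetical stub used only to call the app without a server. Under Python 3, WSGI expects the response body as bytes, so the JSON string is encoded before it is returned.

```python
import json
from urllib.parse import parse_qs


def application(environ, start_response):
    # QUERY_STRING may be absent or empty; fall back to an empty string.
    args = parse_qs(environ.get("QUERY_STRING", ""))
    # Assumption: a missing "mesg" parameter means an empty message.
    mesg = args.get("mesg", [""])[0]

    # WSGI on Python 3 requires the body to be bytes.
    body = json.dumps({"input": mesg, "result": mesg.capitalize()}).encode("utf-8")

    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Calling the app directly with a hypothetical stub instead of a real server.
captured = {}

def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)

response = application({"QUERY_STRING": "mesg=hola"}, fake_start_response)
print(response[0].decode("utf-8"))  # {"input": "hola", "result": "Hola"}
```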

Solution 9 - Python

You can easily parse a URL with a dedicated library.

Here is my simple code to parse it without any dedicated library.

(The input URL must contain a protocol, a domain name, and a path.)

def parseURL(url):

    seg2 = url.split('/')[2]    # Separating domain name
    seg1 = url.split(seg2)[-2]  # Deriving protocol
    print('Protocol:', seg1, '\n')
    print('Domain name:', seg2, '\n')
    seg3 = url.split(seg2)[1]   # Getting the path; if output is empty, then there is no path in URL
    print('Path:', seg3, '\n')

    if '#' in url:  # Extracting fragment id, else None
        seg4 = url.split('#')[1]
        print('Fragment ID:', seg4, '\n')
    else:
        seg4 = 'None'
    if '@' in url:              # Extracting user name, else None
        seg5 = url.split('/')[-1]
        print('Scheme with User Name:', seg5, '\n')
    else:
        seg5 = 'None'
    if '?' in url:              # Extracting query string, else None
        seg6 = url.split('?')[-1]
        print('Query string:', seg6, '\n')
    else:
        seg6 = 'None'

    print('**The dictionary is in the sequence: 0.URL 1.Protocol 2.Domain name 3.Path 4.Fragment id 5.User name 6.Query string** \n')

    # Building and printing the required dictionary
    dictionary = {'0.URL': url, '1.Protocol': seg1, '2.Domain name': seg2, '3.Path': seg3, '4.Fragment id': seg4,
                  '5.User name': seg5, '6.Query string': seg6}
    print(dictionary, '\n')

    print('The TLD in the given URL is following: ')
    if '.com' in url:           # Extracting most famous TLDs maintained by ICANN
        print('.com\n')
    elif '.de' in url:
        print('.de\n')
    elif '.uk' in url:
        print('.uk\n')
    elif 'gov' in url:
        print('gov\n')
    elif '.org' in url:
        print('.org\n')
    elif '.ru' in url:
        print('.ru\n')
    elif '.net' in url:
        print('.net\n')
    elif '.info' in url:
        print('.info\n')
    elif '.biz' in url:
        print('.biz\n')
    elif '.online' in url:
        print('.online\n')
    elif '.in' in url:
        print('.in\n')
    elif '.edu' in url:
        print('.edu\n')
    else:
        print('Other low level domain!\n')

    return dictionary

if __name__ == '__main__':
    url = input("Enter your URL: ")
    parseURL(url)

#Sample URLS to copy
# url='https://www.facebook.com/photo.php?fbid=2068026323275211&set=a.269104153167446&type=3&theater'   
# url='http://www.blog.google.uk:1000/path/to/myfile.html?key1=value1&key2=value2#InTheDocument'      
# url='https://www.overleaf.com/9565720ckjijuhzpbccsd#/347876331/' 
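For comparison, the standard library's urlsplit returns the same components as named attributes. The sketch below runs it on one of the sample URLs above. The keys of the components dict are illustrative names chosen for this example.

```python
from urllib.parse import urlsplit, parse_qsl

url = "http://www.blog.google.uk:1000/path/to/myfile.html?key1=value1&key2=value2#InTheDocument"
parts = urlsplit(url)

# The dict keys are illustrative; the values come from SplitResult attributes.
components = {
    "scheme": parts.scheme,        # 'http'
    "hostname": parts.hostname,    # 'www.blog.google.uk'
    "port": parts.port,            # 1000 (an int, or None if absent)
    "path": parts.path,            # '/path/to/myfile.html'
    "query": dict(parse_qsl(parts.query)),
    "fragment": parts.fragment,    # 'InTheDocument'
    "username": parts.username,    # None, since the URL has no user info
}
print(components)
```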

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Leonardo Andrade | View Question on Stackoverflow
Solution 1 - Python | Martijn Pieters | View Answer on Stackoverflow
Solution 2 - Python | reubano | View Answer on Stackoverflow
Solution 3 - Python | Tomos Williams | View Answer on Stackoverflow
Solution 4 - Python | Anurag Misra | View Answer on Stackoverflow
Solution 5 - Python | Clarius | View Answer on Stackoverflow
Solution 6 - Python | Tamim | View Answer on Stackoverflow
Solution 7 - Python | mikey | View Answer on Stackoverflow
Solution 8 - Python | Jose Manuel Gomez Alvarez | View Answer on Stackoverflow
Solution 9 - Python | Ashutosh Mahajan | View Answer on Stackoverflow