How to join absolute and relative urls?

PythonUrl

Python Problem Overview


I have two urls:

url1 = "http://127.0.0.1/test1/test2/test3/test5.xml"
url2 = "../../test4/test6.xml"

How can I get an absolute url for url2?

Python Solutions


Solution 1 - Python

You should use urlparse.urljoin :

>>> import urlparse
>>> urlparse.urljoin(url1, url2)
'http://127.0.0.1/test1/test4/test6.xml'

With Python 3 (where urlparse is renamed to urllib.parse) you could use it as follow:

>>> import urllib.parse
>>> urllib.parse.urljoin(url1, url2)
'http://127.0.0.1/test1/test4/test6.xml'

Solution 2 - Python

If your relative path consists of multiple parts, you have to join them separately, since urljoin would replace the relative path, not join it. The easiest way to do that is to use posixpath.

>>> import urllib.parse
>>> import posixpath
>>> url1 = "http://127.0.0.1"
>>> url2 = "test1"
>>> url3 = "test2"
>>> url4 = "test3"
>>> url5 = "test5.xml"
>>> url_path = posixpath.join(url2, url3, url4, url5)
>>> urllib.parse.urljoin(url1, url_path)
'http://127.0.0.1/test1/test2/test3/test5.xml'

See also: https://stackoverflow.com/questions/1793261/how-to-join-components-of-a-path-when-you-are-constructing-a-url-in-python

Solution 3 - Python

es = ['http://127.0.0.1', 'test1', 'test4', 'test6.xml']
base = ''
map(lambda e: urlparse.urljoin(base, e), es)

Solution 4 - Python

For python 3.0+ the correct way to join urls is:

from urllib.parse import urljoin
urljoin('https://10.66.0.200/', '/api/org')
# output : 'https://10.66.0.200/api/org'

Solution 5 - Python

You can use reduce to achieve Shikhar's method in a cleaner fashion.

>>> import urllib.parse
>>> from functools import reduce
>>> reduce(urllib.parse.urljoin, ["http://moc.com/", "path1/", "path2/", "path3/"])
'http://moc.com/path1/path2/path3/'

Note that with this method each fragment should have trailing forward-slash, with no leading forward-slash, to indicate it is a path fragment being joined.

This is more correct/informative, telling you that path1/ is a URI path fragment, and not the full path (e.g. /path1/) or an unknown (e.g. path1). An unknown could be either, but they are handled as a full path.

If you need to add / to a fragment lacking it, you could do:

uri = uri if uri.endswith("/") else f"{uri}/"

To learn more about URI resolution, Wikipedia has some nice examples.

Updates

  • Just noticed Peter Perron commented about reduce on Shikhar's answer, but I'll leave this here then to demonstrate how that's done.

  • Updated wikipedia URL

Solution 6 - Python

>>> from urlparse import urljoin
>>> url1 = "http://www.youtube.com/user/khanacademy"
>>> url2 = "/user/khanacademy"
>>> urljoin(url1, url2)
'http://www.youtube.com/user/khanacademy'

Simple.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionBdfyView Question on Stackoverflow
Solution 1 - PythonCédric JulienView Answer on Stackoverflow
Solution 2 - PythonpcvView Answer on Stackoverflow
Solution 3 - PythonShikhar MallView Answer on Stackoverflow
Solution 4 - Pythonsrth12View Answer on Stackoverflow
Solution 5 - PythonryanjdillonView Answer on Stackoverflow
Solution 6 - PythonTalha AshrafView Answer on Stackoverflow