How to get everything after last slash in a URL?

Python Problem Overview


How can I extract whatever follows the last slash in a URL in Python? For example, these URLs should return the following:

URL: http://www.test.com/TEST1
returns: TEST1

URL: http://www.test.com/page/TEST2
returns: TEST2

URL: http://www.test.com/page/page/12345
returns: 12345

I've tried urlparse, but that gives me the full path, such as page/page/12345.

Python Solutions


Solution 1 - Python

You don't need anything fancy. The built-in string methods are enough to split your URL into the 'filename' part and the rest:

url.rsplit('/', 1)

So you can get the part you're interested in simply with:

url.rsplit('/', 1)[-1]
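
As a quick check, the sketch below wraps the rsplit call in a small helper and runs it against the URLs from the question. The helper name last_segment is made up for illustration. It also shows two cases a plain string split does not handle specially: a trailing slash and a query string.

```python
def last_segment(url):
    """Return everything after the final '/' in url (hypothetical helper)."""
    # maxsplit=1 splits only once, starting from the right-hand side
    return url.rsplit('/', 1)[-1]


print(last_segment('http://www.test.com/TEST1'))            # TEST1
print(last_segment('http://www.test.com/page/TEST2'))       # TEST2
print(last_segment('http://www.test.com/page/page/12345'))  # 12345

# A trailing slash yields an empty string, and a query string is kept as-is
print(repr(last_segment('http://www.test.com/page/')))      # ''
print(last_segment('http://www.test.com/page?x=1'))         # page?x=1
```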

Solution 2 - Python

One more (idio(ma)tic) way:

URL.split("/")[-1]

Solution 3 - Python

rsplit should be up to the task:

In [1]: 'http://www.test.com/page/TEST2'.rsplit('/', 1)[1]
Out[1]: 'TEST2'

Solution 4 - Python

You can do it like this:

import os

head, tail = os.path.split(url)

Here tail will be your file name.
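
os.path is meant for filesystem paths and follows the conventions of the operating system it runs on. A possible variation, shown here only as a sketch, is to use posixpath from the standard library, which always treats / as the separator regardless of platform. The URL is the one from the question.

```python
import posixpath

url = 'http://www.test.com/page/TEST2'

# posixpath.split cuts at the last '/' and returns (head, tail)
head, tail = posixpath.split(url)

print(head)  # http://www.test.com/page
print(tail)  # TEST2
```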

Solution 5 - Python

urlparse is fine to use if you want to (say, to get rid of any query string parameters).

import urllib.parse

urls = [
    'http://www.test.com/TEST1',
    'http://www.test.com/page/TEST2',
    'http://www.test.com/page/page/12345',
    'http://www.test.com/page/page/12345?abc=123',
]

for i in urls:
    url_parts = urllib.parse.urlparse(i)
    path_parts = url_parts[2].rpartition('/')
    print('URL: {}\nreturns: {}\n'.format(i, path_parts[2]))

Output:

URL: http://www.test.com/TEST1
returns: TEST1

URL: http://www.test.com/page/TEST2
returns: TEST2

URL: http://www.test.com/page/page/12345
returns: 12345

URL: http://www.test.com/page/page/12345?abc=123
returns: 12345

Solution 6 - Python

import os

# normpath removes the trailing slash, so basename returns the last folder
os.path.basename(os.path.normpath('/folderA/folderB/folderC/folderD/'))
# returns 'folderD'

Solution 7 - Python

Here's a more general, regex-based way of doing this:

    import re

    re.sub(r'^.+/([^/]+)$', r'\1', url)
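
One thing to keep in mind with the re.sub approach is that it returns the input unchanged when the pattern does not match, for example when the URL ends with a slash. The sketch below shows that behavior and an alternative using re.search that makes the no-match case explicit. The helper name last_segment_re is hypothetical.

```python
import re


def last_segment_re(url):
    """Return the text after the last '/', or None if there is none."""
    # Capture one or more non-slash characters at the very end of the string
    match = re.search(r'/([^/]+)$', url)
    return match.group(1) if match else None


print(last_segment_re('http://www.test.com/page/TEST2'))  # TEST2
print(last_segment_re('http://www.test.com/page/'))       # None

# re.sub leaves the string untouched when the pattern does not match
unchanged = re.sub(r'^.+/([^/]+)$', r'\1', 'http://www.test.com/page/')
print(unchanged)  # http://www.test.com/page/
```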

Solution 8 - Python

First extract the path element from the URL:

from urllib.parse import urlparse
parsed = urlparse('https://www.dummy.example/this/is/PATH?q=/a/b&r=5#asx')

and then you can extract the last segment with string functions:

parsed.path.rpartition('/')[2]

(In this example the result is 'PATH'.)

Solution 9 - Python

Use urlparse to get just the path and then split the path you get from it on / characters:

from urllib.parse import urlparse

my_url = "http://example.com/some/path/last?somequery=param"
last_path_fragment = urlparse(my_url).path.split('/')[-1]  # returns 'last'

Note: if your URL ends with a / character, the above will return '' (i.e. the empty string). If you want to handle that case differently, strip any trailing / characters before you split the path:

my_url = "http://example.com/last/"
# handle URL ending in `/` by removing the trailing slashes.
last_path_fragment = urlparse(my_url).path.rstrip('/').split('/')[-1]  # returns 'last'
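
The trailing-slash handling can be wrapped in a small helper, sketched below. The function name last_path_segment is made up for illustration. Note that str.rstrip takes a single argument, the set of characters to strip, and removes every trailing occurrence rather than just one.

```python
from urllib.parse import urlparse


def last_path_segment(url):
    """Return the last non-empty path segment of url (hypothetical helper)."""
    # urlparse drops the query string and fragment from .path
    path = urlparse(url).path
    # rstrip('/') removes all trailing slashes before splitting
    return path.rstrip('/').split('/')[-1]


print(last_path_segment('http://example.com/some/path/last?somequery=param'))  # last
print(last_path_segment('http://example.com/last/'))                           # last
print(repr(last_path_segment('http://example.com/')))                          # ''
```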

Solution 10 - Python

extracted_url = url[url.rfind("/") + 1:]
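
A short sketch of how the rfind slice behaves, including the case where the string contains no slash at all: rfind then returns -1, so the slice starts at index 0 and the whole string comes back. The helper name after_last_slash is hypothetical.

```python
def after_last_slash(text):
    """Slice off everything up to and including the last '/'."""
    # rfind returns -1 when '/' is absent, so -1 + 1 == 0 keeps the full string
    return text[text.rfind('/') + 1:]


print(after_last_slash('http://www.test.com/page/TEST2'))  # TEST2
print(after_last_slash('no-slash-here'))                   # no-slash-here
```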

Solution 11 - Python

Split the URL and pop the last element:

url.split('/').pop()

Solution 12 - Python

Split the URL and pop the last element. list.pop() removes the last item from a list and returns it:

plants = ['broccoli', 'cauliflower', 'cabbage', 'kale', 'tomato']

print(plants.pop())
# expected output: tomato

print(plants)
# expected output: ['broccoli', 'cauliflower', 'cabbage', 'kale']

Solution 13 - Python

The following solution uses pathlib to parse the path obtained from urllib.parse, which makes it possible to get the last part even when a trailing slash is present:

import urllib.parse
from pathlib import Path

urls = [
    "http://www.test.invalid/demo",
    "http://www.test.invalid/parent/child",
    "http://www.test.invalid/terminal-slash/",
    "http://www.test.invalid/query-params?abc=123&works=yes",
    "http://www.test.invalid/fragment#70446893",
    "http://www.test.invalid/has/all/?abc=123&works=yes#70446893",
]

for url in urls:
    url_path = Path(urllib.parse.urlparse(url).path)
    last_part = url_path.name  # use .stem to cut file extensions
    print(f"{last_part=}")

yields:

last_part='demo'
last_part='child'
last_part='terminal-slash'
last_part='query-params'
last_part='fragment'
last_part='all'
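
A possible refinement, offered as a sketch rather than as part of the original answer, is to use PurePosixPath instead of Path. It never touches the filesystem and always uses / as the separator, whatever the platform. The URL below is invented for illustration and shows how .name, .stem and .suffix differ for a file name with two extensions.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical URL with a double file extension, a query string and a fragment
url = "http://www.test.invalid/has/all/report.tar.gz?abc=123#70446893"

url_path = PurePosixPath(urlparse(url).path)

print(url_path.name)    # report.tar.gz
print(url_path.stem)    # report.tar  (only the last extension is removed)
print(url_path.suffix)  # .gz
```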

Solution 14 - Python

url = 'http://www.test.com/page/TEST2'.split('/')[4]
print(url)

Output: TEST2.
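
Be aware that a fixed index such as 4 depends on how many segments the URL has. The sketch below uses the deeper URL from the question to show the difference between a fixed index and the negative index -1, which always refers to the last element.

```python
url = 'http://www.test.com/page/page/12345'
parts = url.split('/')

print(parts)      # ['http:', '', 'www.test.com', 'page', 'page', '12345']
print(parts[4])   # page   (not the last segment for this URL)
print(parts[-1])  # 12345  (always the last segment)
```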

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | mix | View Question on Stackoverflow
Solution 1 - Python | Luke404 | View Answer on Stackoverflow
Solution 2 - Python | Kimvais | View Answer on Stackoverflow
Solution 3 - Python | Benjamin Wohlwend | View Answer on Stackoverflow
Solution 4 - Python | neowinston | View Answer on Stackoverflow
Solution 5 - Python | Jacob Wan | View Answer on Stackoverflow
Solution 6 - Python | Rochan | View Answer on Stackoverflow
Solution 7 - Python | sandoronodi | View Answer on Stackoverflow
Solution 8 - Python | tzot | View Answer on Stackoverflow
Solution 9 - Python | Boris Verkhovskiy | View Answer on Stackoverflow
Solution 10 - Python | fardjad | View Answer on Stackoverflow
Solution 11 - Python | Atul Yadav | View Answer on Stackoverflow
Solution 12 - Python | Jaimin Patel | View Answer on Stackoverflow
Solution 13 - Python | lcnittl | View Answer on Stackoverflow
Solution 14 - Python | live_alone | View Answer on Stackoverflow