Extracting extension from filename in Python


Python Problem Overview


Is there a function to extract the extension from a filename?

Python Solutions


Solution 1 - Python

Yes. Use os.path.splitext (see the Python 2.X documentation or the Python 3.X documentation):

>>> import os
>>> filename, file_extension = os.path.splitext('/path/to/somefile.ext')
>>> filename
'/path/to/somefile'
>>> file_extension
'.ext'

Unlike most manual string-splitting attempts, os.path.splitext will correctly treat /a/b.c/d as having no extension instead of having extension .c/d, and it will treat .bashrc as having no extension instead of having extension .bashrc:

>>> os.path.splitext('/a/b.c/d')
('/a/b.c/d', '')
>>> os.path.splitext('.bashrc')
('.bashrc', '')

Solution 2 - Python

New in version 3.4.

import pathlib

print(pathlib.Path('yourPath.example').suffix) # '.example'
print(pathlib.Path("hello/foo.bar.tar.gz").suffixes) # ['.bar', '.tar', '.gz']

I'm surprised no one has mentioned pathlib yet, pathlib IS awesome!
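
As a rough illustration of how the pathlib attributes relate to one another, the sketch below uses PurePosixPath, which only manipulates strings and so behaves the same on every platform. The sample paths are made up for the example.

```python
from pathlib import PurePosixPath

p = PurePosixPath("hello/foo.bar.tar.gz")
last_suffix = p.suffix      # only the final suffix
all_suffixes = p.suffixes   # every dot-separated suffix of the name
stem = p.stem               # the name with the final suffix removed

# Like os.path.splitext, pathlib ignores a leading dot and dots in directories.
hidden_suffix = PurePosixPath(".bashrc").suffix
dir_dot_suffix = PurePosixPath("/a/b.c/d").suffix

print(last_suffix, all_suffixes, stem, repr(hidden_suffix), repr(dir_dot_suffix))
```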

Solution 3 - Python

import os.path
extension = os.path.splitext(filename)[1]

Solution 4 - Python

import os.path
extension = os.path.splitext(filename)[1][1:]

To get only the text of the extension, without the dot.

Solution 5 - Python

For simple use cases, one option is to split on the dot:

>>> filename = "example.jpeg"
>>> filename.split(".")[-1]
'jpeg'

No error when file doesn't have an extension:

>>> "filename".split(".")[-1]
'filename'

But you must be careful:

>>> "png".split(".")[-1]
'png'    # But file doesn't have an extension

Also will not work with hidden files in Unix systems:

>>> ".bashrc".split(".")[-1]
'bashrc'    # But this is not an extension

For general use, prefer os.path.splitext

Solution 6 - Python

It's worth adding a lower() in there so you don't find yourself wondering why the JPGs aren't showing up in your list.

os.path.splitext(filename)[1][1:].strip().lower()
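
A minimal sketch of how that expression might be used to filter a list case-insensitively. The helper name normalized_ext and the sample file names are assumptions for illustration only.

```python
import os.path

def normalized_ext(filename):
    """Return the extension lowercased, without the dot or surrounding whitespace."""
    return os.path.splitext(filename)[1][1:].strip().lower()

names = ["a.JPG", "b.jpg", "c.Png", "notes.txt"]

# Both 'a.JPG' and 'b.jpg' are kept because the comparison is on the lowercased form.
jpgs = [n for n in names if normalized_ext(n) == "jpg"]
print(jpgs)
```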

Solution 7 - Python

Any of the solutions above work, but on Linux I have found that there can be a newline at the end of the extension string, which will prevent matches from succeeding. Add the strip() method to the end. For example:

import os.path
extension = os.path.splitext(filename)[1][1:].strip() 
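
One plausible source of such a newline (an assumption; the answer does not say where its file names came from) is reading names line by line from a file or from command output. os.path.splitext does not add a newline itself, it only passes along whatever the input string ends with. The sketch below simulates such a listing with io.StringIO.

```python
import io
import os.path

# Stand-in for a text file that lists one file name per line.
listing = io.StringIO("a.txt\nb.jpg\n")

# Each line still carries its trailing newline, so the extension does too.
raw = [os.path.splitext(line)[1] for line in listing]

listing.seek(0)
# Stripping the line first gives clean extensions.
clean = [os.path.splitext(line.strip())[1] for line in listing]

print(raw, clean)
```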

Solution 8 - Python

With splitext there are problems with files that have a double extension (e.g. file.tar.gz, file.tar.bz2, etc.):

>>> fileName, fileExtension = os.path.splitext('/path/to/somefile.tar.gz')
>>> fileExtension 
'.gz'

but it should be .tar.gz

The possible solutions are here
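
One possible approach (a sketch, not something splitext itself offers) is to check a short list of known compound suffixes first and fall back to os.path.splitext otherwise. The tuple KNOWN_COMPOUND and the function name split_compound_ext are made up for illustration; extend the list to whatever your files actually use.

```python
import os.path

# Hypothetical list of compound suffixes to recognise; adjust as needed.
KNOWN_COMPOUND = (".tar.gz", ".tar.bz2", ".tar.xz")

def split_compound_ext(path):
    """Split like os.path.splitext, but keep known double extensions together."""
    lowered = path.lower()
    for suffix in KNOWN_COMPOUND:
        if lowered.endswith(suffix):
            return path[:-len(suffix)], path[-len(suffix):]
    return os.path.splitext(path)

print(split_compound_ext("/path/to/somefile.tar.gz"))
print(split_compound_ext("report.v1.2.txt"))
```

Using an explicit list avoids treating every dot in a name such as report.v1.2.txt as part of the extension.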

Solution 9 - Python

You can find some great stuff in the pathlib module (available in Python 3.4+).

import pathlib
# PureWindowsPath parses backslash-separated paths on any platform
x = pathlib.PureWindowsPath("C:\\Path\\To\\File\\myfile.txt").suffix
print(x)

# Output
.txt

Solution 10 - Python

Although this is an old topic, I wonder why no one has mentioned a very simple string method called rpartition for this case.

To get the extension from a file's absolute path, you can simply type:

filepath.rpartition('.')[-1]

example:

path = '/home/jersey/remote/data/test.csv'
print(path.rpartition('.')[-1])

will print: csv
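
A caveat worth knowing: when there is no dot, rpartition returns the whole string as its last element, and a dot inside a directory name is treated like any other dot. The sketch below guards against both. The helper name ext_via_rpartition is an assumption for illustration.

```python
import os.path

def ext_via_rpartition(path):
    """Return the extension without the dot, or '' when there is none."""
    name = os.path.basename(path)          # ignore dots in directory names
    stem, dot, ext = name.rpartition(".")
    # No dot at all, or only a leading dot (hidden file): no extension.
    if not dot or not stem:
        return ""
    return ext

print(ext_via_rpartition("/home/jersey/remote/data/test.csv"))
print(repr(ext_via_rpartition("README")))
```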

Solution 11 - Python

Just join all pathlib suffixes.

>>> x = 'file/path/archive.tar.gz'
>>> y = 'file/path/text.txt'
>>> ''.join(pathlib.Path(x).suffixes)
'.tar.gz'
>>> ''.join(pathlib.Path(y).suffixes)
'.txt'
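
Keep in mind that suffixes counts every dot in the file name, so joining them can return more than the real extension. A small sketch, with a made-up versioned file name and a hypothetical helper joined_suffixes:

```python
from pathlib import PurePosixPath

def joined_suffixes(path):
    """Join all pathlib suffixes of the final path component."""
    return "".join(PurePosixPath(path).suffixes)

archive = joined_suffixes("file/path/archive.tar.gz")
# Version numbers in the name are picked up as suffixes too.
versioned = joined_suffixes("file/path/report.v1.2.txt")

print(archive, versioned)
```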

Solution 12 - Python

Surprised this wasn't mentioned yet:

import os
fn = '/some/path/a.tar.gz'

basename = os.path.basename(fn)  # os independent
# basename -> 'a.tar.gz'

base = basename.split('.')[0]
# base -> 'a'

ext = '.'.join(basename.split('.')[1:])   # <-- main part

# if you want a leading '.', and if no result `None`:
ext = '.' + ext if ext else None
# ext -> '.tar.gz'

Benefits:

  • Works as expected for anything I can think of
  • No modules
  • No regex
  • Cross-platform
  • Easily extendible (e.g. no leading dots for extension, only last part of extension)

As function:

def get_extension(filename):
    basename = os.path.basename(filename)  # os independent
    ext = '.'.join(basename.split('.')[1:])
    return '.' + ext if ext else None

Solution 13 - Python

You can use a split on a filename:

f_extns = filename.split(".")
print ("The extension of the file is : " + repr(f_extns[-1]))

This does not require any additional library.

Solution 14 - Python

filename='ext.tar.gz'
extension = filename[filename.rfind('.'):]
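
Note that rfind returns -1 when there is no dot, in which case the slice above yields the last character of the name. A minimal sketch that handles this, assuming a bare file name without directories (the helper name ext_via_rfind is made up):

```python
def ext_via_rfind(filename):
    """Return the extension including the dot, or '' when there is none."""
    idx = filename.rfind(".")
    # idx == -1: no dot at all; idx == 0: only a leading dot (hidden file).
    return filename[idx:] if idx > 0 else ""

print(ext_via_rfind("ext.tar.gz"))
print(repr(ext_via_rfind("README")))
```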


Solution 15 - Python


Python os module splitext()

The splitext() function splits the file path into a tuple of two values: root and extension.

import os
# unpacking the tuple
file_name, file_extension = os.path.splitext("/Users/Username/abc.txt")
print(file_name)
print(file_extension)

Get File Extension using Pathlib Module

Use the pathlib module to get the file extension:

import pathlib
pathlib.Path("/Users/pankaj/abc.txt").suffix
#output:'.txt'

Solution 16 - Python

This is a direct string-manipulation technique. I see a lot of solutions mentioned, but I think most are looking at split. Split, however, splits at every occurrence of ".". What you would rather be looking for is rpartition.

string = "folder/to_path/filename.ext"
extension = string.rpartition(".")[-1]

Solution 17 - Python

Another solution with right split:

# to get extension only

s = 'test.ext'

if '.' in s: ext = s.rsplit('.', 1)[1]

# or, to get file name and extension

def split_filepath(s):
    """
    get filename and extension from filepath 
    filepath -> (filename, extension)
    """
    if not '.' in s: return (s, '')
    r = s.rsplit('.', 1)
    return (r[0], r[1])

Solution 18 - Python

Even though this question is already answered, I'd add a solution using regex.

>>> import re
>>> file_suffix = r".*(\..*)"
>>> result = re.search(file_suffix, "somefile.ext")
>>> result.group(1)
'.ext'

Solution 19 - Python

You can use the following code to split the file name and extension.

    import os.path
    filenamewithext = os.path.basename(filepath)
    filename, ext = os.path.splitext(filenamewithext)
    #print file name
    print(filename)
    #print file extension
    print(ext)

Solution 20 - Python

You can use endswith to identify the file extension in Python, as in the example below:

import os
import pandas as pd

frames = []  # one DataFrame per CSV file
for file in os.listdir():
    if file.endswith('.csv'):
        frames.append(pd.read_csv(file))
result = pd.concat(frames)  # combine once, after the loop
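
str.endswith also accepts a tuple of suffixes, which is handy for checking several extensions at once. The sketch below lowercases the name first so the check is case-insensitive. The tuple IMAGE_EXTS, the function is_image and the file names are assumptions for illustration.

```python
# Hypothetical set of extensions to accept.
IMAGE_EXTS = (".png", ".jpg", ".jpeg")

def is_image(name):
    """True when the name ends with one of IMAGE_EXTS, ignoring case."""
    return name.lower().endswith(IMAGE_EXTS)

names = ["a.PNG", "b.jpeg", "data.csv", "jpg"]
images = [n for n in names if is_image(n)]
print(images)
```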

Solution 21 - Python

A true one-liner, if you like regex. And it doesn't matter even if you have additional "." in the middle

import re

file_ext = re.search(r"\.([^.]+)$", filename).group(1)

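
Since re.search returns None when nothing matches, calling .group(1) directly raises AttributeError for names without a dot. A minimal sketch that returns an empty string instead; the pattern here also excludes path separators so a dot in a directory name is not matched, and the helper name ext_via_regex is made up.

```python
import re

# A dot followed by characters that are not dots or path separators, at the end.
_EXT_RE = re.compile(r"\.([^./\\]+)$")

def ext_via_regex(filename):
    """Return the extension without the dot, or '' when nothing matches."""
    match = _EXT_RE.search(filename)
    return match.group(1) if match else ""

print(ext_via_regex("archive.tar.gz"))
print(repr(ext_via_regex("README")))
# Caveat: a hidden file such as '.bashrc' still matches and yields 'bashrc'.
```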

Solution 22 - Python

try this:

files = ['file.jpeg','file.tar.gz','file.png','file.foo.bar','file.etc']
pen_ext = ['foo', 'tar', 'bar', 'etc']

for file in files: #1
    if (file.split(".")[-2] in pen_ext): #2
        ext =  file.split(".")[-2]+"."+file.split(".")[-1]#3
    else:
        ext = file.split(".")[-1] #4
    print (ext) #5

  1. Iterate over every file name in the list.
  2. Split the file name and check whether the penultimate part is in the pen_ext list.
  3. If it is, join it with the last part and use that as the file's extension.
  4. If not, use just the last part as the file's extension.
  5. Print the result.

Solution 23 - Python

For funsies... just collect the extensions in a dict, and track all of them in a folder. Then just pull the extensions you want.

import os

search = {}

for f in os.listdir(os.getcwd()):
    fn, fe = os.path.splitext(f)
    try:
        search[fe].append(f)
    except KeyError:
        search[fe]=[f,]

extensions = ('.png','.jpg')
for ex in extensions:
    found = search.get(ex,'')
    if found:
        print(found)
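
The same grouping idea can be sketched with collections.defaultdict, which removes the need for the try/except. To keep the example deterministic it takes a list of names rather than reading a directory; the function name group_by_extension and the sample names are assumptions.

```python
import os.path
from collections import defaultdict

def group_by_extension(names):
    """Map each lowercased extension (with dot) to the names that carry it."""
    groups = defaultdict(list)
    for name in names:
        ext = os.path.splitext(name)[1].lower()
        groups[ext].append(name)
    return dict(groups)

groups = group_by_extension(["a.png", "b.JPG", "c.png", "README"])
for ext in (".png", ".jpg"):
    print(ext, groups.get(ext, []))
```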

Solution 24 - Python

# try this, it works for anything, any length of extension
# e.g www.google.com/downloads/file1.gz.rs -> .gz.rs

import os.path

class LinkChecker:

    @staticmethod
    def get_link_extension(link: str)->str:
        if link is None or link == "":
            return ""
        else:
            paths = os.path.splitext(link)
            ext = paths[1]
            new_link = paths[0]
            if ext != "":
                return LinkChecker.get_link_extension(new_link) + ext
            else:
                return ""

Solution 25 - Python

import os

def NewFileName(fichier):
    # Append -(1), -(2), ... before the extension until the name is unused
    cpt = 0
    fic , *ext =  fichier.split('.')
    ext = '.'.join(ext)
    while os.path.isfile(fichier):
        cpt += 1
        fichier = '{0}-({1}).{2}'.format(fic, cpt, ext)
    return fichier

Solution 26 - Python

This is the simplest method to get both the filename and the extension in a single line.

fName, ext = 'C:/folder name/Flower.jpeg'.split('/')[-1].split('.')

>>> print(fName)
Flower
>>> print(ext)
jpeg

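Unlike other solutions, you don't need to import any package for this. Note that the unpacking only works when the file name contains exactly one dot; otherwise it raises ValueError.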

Solution 27 - Python

a = ".bashrc"
b = "text.txt"
extension_a = a.split(".")
extension_b = b.split(".")
print(extension_a[-1])  # bashrc
print(extension_b[-1])  # txt

Solution 28 - Python

name_only = filename[:filename.index(".")]

That will give you the file name up to the first ".", which would be the most common.
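
Because str.index raises ValueError when the name contains no dot, a variant based on str.partition may be more forgiving. This is only a sketch, and the helper name name_before_first_dot is made up.

```python
def name_before_first_dot(filename):
    """Return everything before the first '.', or the whole name if there is none."""
    return filename.partition(".")[0]

print(name_before_first_dot("archive.tar.gz"))
print(name_before_first_dot("README"))
```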

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Alex | View Question on Stackoverflow
Solution 1 - Python | nosklo | View Answer on Stackoverflow
Solution 2 - Python | jeromej | View Answer on Stackoverflow
Solution 3 - Python | Brian Neal | View Answer on Stackoverflow
Solution 4 - Python | wonzbak | View Answer on Stackoverflow
Solution 5 - Python | Murat Çorlu | View Answer on Stackoverflow
Solution 6 - Python | blented | View Answer on Stackoverflow
Solution 7 - Python | yamex5 | View Answer on Stackoverflow
Solution 8 - Python | XavierCLL | View Answer on Stackoverflow
Solution 9 - Python | r3t40 | View Answer on Stackoverflow
Solution 10 - Python | weiyixie | View Answer on Stackoverflow
Solution 11 - Python | Alex | View Answer on Stackoverflow
Solution 12 - Python | PascalVKooten | View Answer on Stackoverflow
Solution 13 - Python | soheshdoshi | View Answer on Stackoverflow
Solution 14 - Python | staytime | View Answer on Stackoverflow
Solution 15 - Python | DS_ShraShetty | View Answer on Stackoverflow
Solution 16 - Python | Kenstars | View Answer on Stackoverflow
Solution 17 - Python | Arnaldo P. Figueira Figueira | View Answer on Stackoverflow
Solution 18 - Python | Execuday | View Answer on Stackoverflow
Solution 19 - Python | Muhammad Salman | View Answer on Stackoverflow
Solution 20 - Python | cng.buff | View Answer on Stackoverflow
Solution 21 - Python | Victor Wang | View Answer on Stackoverflow
Solution 22 - Python | Ibnul Husainan | View Answer on Stackoverflow
Solution 23 - Python | eatmeimadanish | View Answer on Stackoverflow
Solution 24 - Python | DragonX | View Answer on Stackoverflow
Solution 25 - Python | user5535053 | View Answer on Stackoverflow
Solution 26 - Python | Ripon Kumar Saha | View Answer on Stackoverflow
Solution 27 - Python | lendoo | View Answer on Stackoverflow
Solution 28 - Python | wookie | View Answer on Stackoverflow