Allowed characters in filename


Special Characters Problem Overview


Where can I find a list of allowed characters in filenames, depending on the operating system? (e.g. on Linux, the character : is allowed in filenames, but not on Windows)

Special Characters Solutions


Solution 1 - Special Characters

You should start with the Wikipedia Filename page. It has a decent-sized table (Comparison of filename limitations), listing the reserved characters for quite a lot of file systems.

It also has a plethora of other information about each file system, including reserved file names such as CON under MS-DOS. I mention that only because I was bitten by that once when I shortened an include file from const.h to con.h and spent half an hour figuring out why the compiler hung.

Turns out DOS ignored extensions for devices so that con.h was exactly the same as con, the input console (meaning, of course, the compiler was waiting for me to type in the header file before it would continue).
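The extension-ignoring behaviour described above can be checked for before a file is created. Here is a minimal Python sketch, assuming the commonly documented DOS/Windows device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9). The function name is_reserved_device_name is made up for illustration and is not part of any library.

```python
# Device names reserved by DOS and Windows (assumed list, compared case-insensitively).
RESERVED_DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_reserved_device_name(filename):
    """Return True if the name, ignoring any extension, is a reserved device name."""
    # Keep only the part before the first dot: "con.h" -> "con".
    stem = filename.split(".", 1)[0]
    return stem.upper() in RESERVED_DEVICE_NAMES
```

With this check, con.h is flagged while const.h is not, which matches the story above.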

Solution 2 - Special Characters

OK, so looking at Comparison of file systems, if you only care about the main players' file systems (Windows, macOS, Linux), the names that are safe on all of them are:

any byte except NUL, \, /, :, *, ?, ", <, >, |. You also can't have files/folders called . or .., and no control characters (of course).
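The rule above translates fairly directly into a validity check. This is only a sketch: the name is_portable_filename is hypothetical, rejecting the empty string is an added assumption, and "control characters" is taken to mean code points below 32 (which also covers NUL).

```python
# Characters listed above as forbidden on at least one major file system.
FORBIDDEN_CHARS = set('\\/:*?"<>|')


def is_portable_filename(name):
    """Return True if name avoids the restrictions listed above."""
    # "." and ".." are reserved directory entries; "" is rejected by assumption.
    if name in ("", ".", ".."):
        return False
    for ch in name:
        # ord(ch) < 32 covers NUL and the other ASCII control characters.
        if ch in FORBIDDEN_CHARS or ord(ch) < 32:
            return False
    return True
```

It validates rather than repairs: a caller can reject the name or pass it on to a separate sanitizing step.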

Solution 3 - Special Characters

On Windows, create a file and put an invalid character such as \ in the filename. You will get a popup listing all the characters that are invalid in a filename: \ / : * ? " < > |


Solution 4 - Special Characters

To be more precise about Mac OS X (now called macOS): a / in the Finder is translated to a : in the Unix file system.

This was done for backward compatibility when Apple moved from Classic Mac OS.

It is legitimate to use a / in a file name in the Finder; if you look at the same file in the terminal, it will show up with a :.

And it works the other way around too: you can't use a / in a file name with the terminal, but a : is OK and will show up as a / in the Finder.

Some applications may be more restrictive and prohibit both characters to avoid confusion or because they kept logic from previous Classic Mac OS or for name compatibility between platforms.
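The Finder/terminal translation described above can be illustrated with two small helpers. This is a sketch of the mapping only, not of how macOS implements it. The function names are hypothetical, and rejecting : on the Finder side is an assumption based on the Finder not accepting that character in a name.

```python
def finder_to_unix(name):
    """Convert a name as shown in the Finder to the name seen in the terminal."""
    # Assumption: the Finder itself does not accept ":" in a name.
    if ":" in name:
        raise ValueError("':' is not expected in a Finder name")
    return name.replace("/", ":")


def unix_to_finder(name):
    """Convert a name as seen in the terminal to the name shown in the Finder."""
    # "/" is the path separator at the Unix layer, so it cannot appear in a name.
    if "/" in name:
        raise ValueError("'/' cannot appear in a Unix file name")
    return name.replace(":", "/")
```

For example, a file shown as "2024/01 notes.txt" in the Finder would be listed as "2024:01 notes.txt" in the terminal, and converting back gives the original name.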

Solution 5 - Special Characters

For "English locale" file names, this works nicely. I'm using this for sanitizing uploaded file names. The file name is not meant to be linked to anything on disk, it's for when the file is being downloaded hence there are no path checks.

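// preg_replace() already replaces every match, so no /g modifier; "\\\\" matches a literal backslash and "*" is included.
$file_name = preg_replace('/[^\x20-~]+|[\\\\\/:*?"<>|]+/', '_', $client_specified_file_name);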

Basically it replaces each run of non-printable characters, and of characters reserved on Windows and other OSs, with an underscore. You can easily extend the pattern to support other locales and functionalities.

Solution 6 - Special Characters

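Rather than trying to identify all the characters that are unwanted, you could just look for anything except the acceptable characters. Here's a regex that removes anything outside the POSIX portable filename character set (letters, digits, period, underscore, hyphen). Python's re module does not support POSIX classes such as [:alnum:], so the ranges are spelled out:

import re
cleaned_name = re.sub(r'[^A-Za-z0-9._-]', '', name)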
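One thing the allow-list approach needs to handle is a name that consists entirely of disallowed characters, since the result would be empty. A possible way to wrap the idea above is sketched here. The function name to_portable_name and the "unnamed" fallback are assumptions for illustration.

```python
import re

# Matches anything outside A-Z, a-z, 0-9, period, underscore and hyphen.
_NOT_PORTABLE = re.compile(r"[^A-Za-z0-9._-]")


def to_portable_name(name, fallback="unnamed"):
    """Drop every character outside the allow-list; use fallback if nothing useful is left."""
    cleaned = _NOT_PORTABLE.sub("", name)
    # A result that is empty or made only of dots (".", "..") is not a usable name.
    return cleaned if cleaned.strip(".") else fallback
```

Note that this drops non-ASCII letters as well, so it suits cases where a conservative ASCII-only name is acceptable.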

Solution 7 - Special Characters

Here is code to clean a file name in Python.

import re
import unicodedata

def clean_name(name, replace_space_with=None):
    """
    Remove invalid file name chars from the specified name

    :param name: the file name
    :param replace_space_with: if not none replace space with this string
    :return: a valid name for Win/Mac/Linux
    """

    # ref: https://en.wikipedia.org/wiki/Filename
    # ref: https://stackoverflow.com/questions/4814040/allowed-characters-in-filename
    # No control chars, no: /, \, ?, %, *, :, |, ", <, >

    # remove control chars
    name = ''.join(ch for ch in name if unicodedata.category(ch)[0] != 'C')

    cleaned_name = re.sub(r'[/\\?%*:|"<>]', '', name)
    if replace_space_with is not None:
        return cleaned_name.replace(' ', replace_space_with)
    return cleaned_name

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | python dude | View Question on Stackoverflow
Solution 1 - Special Characters | paxdiablo | View Answer on Stackoverflow
Solution 2 - Special Characters | CpILL | View Answer on Stackoverflow
Solution 3 - Special Characters | Devid | View Answer on Stackoverflow
Solution 4 - Special Characters | Jean Létourneau | View Answer on Stackoverflow
Solution 5 - Special Characters | TheRealChx101 | View Answer on Stackoverflow
Solution 6 - Special Characters | Dog Pilot | View Answer on Stackoverflow
Solution 7 - Special Characters | Du D. | View Answer on Stackoverflow