Validate a file name on Windows

Java, Regex, Windows, String, Filenames

Java Problem Overview


public static boolean isValidName(String text)
{
    Pattern pattern = Pattern.compile("^[^/./\\:*?\"<>|]+$");
    Matcher matcher = pattern.matcher(text);
    boolean isMatch = matcher.matches();
    return isMatch;
}

Does this method guarantee a valid filename on Windows?

Java Solutions


Solution 1 - Java

Given the requirements specified in the previously cited MSDN documentation, the following regex should do a pretty good job:

public static boolean isValidName(String text)
{
    Pattern pattern = Pattern.compile(
        "# Match a valid Windows filename (unspecified file system).          \n" +
        "^                                # Anchor to start of string.        \n" +
        "(?!                              # Assert filename is not: CON, PRN, \n" +
        "  (?:                            # AUX, NUL, COM1, COM2, COM3, COM4, \n" +
        "    CON|PRN|AUX|NUL|             # COM5, COM6, COM7, COM8, COM9,     \n" +
        "    COM[1-9]|LPT[1-9]            # LPT1, LPT2, LPT3, LPT4, LPT5,     \n" +
        "  )                              # LPT6, LPT7, LPT8, and LPT9...     \n" +
        "  (?:\\.[^.]*)?                  # followed by optional extension    \n" +
        "  $                              # and end of string                 \n" +
        ")                                # End negative lookahead assertion. \n" +
        "[^<>:\"/\\\\|?*\\x00-\\x1F]*     # Zero or more valid filename chars.\n" +
        "[^<>:\"/\\\\|?*\\x00-\\x1F\\ .]  # Last char is not a space or dot.  \n" +
        "$                                # Anchor to end of string.            ", 
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.COMMENTS);
    Matcher matcher = pattern.matcher(text);
    boolean isMatch = matcher.matches();
    return isMatch;
}

Note that this regex does not impose any limit on the length of the filename, but a real filename may be limited to 260 or 32767 chars depending on the platform.

Solution 2 - Java

Not enough. In Windows and DOS, some words are also reserved and cannot be used as filenames:

CON, PRN, AUX, CLOCK$, NUL
COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9
LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.

See:

http://en.wikipedia.org/wiki/Filename


Edit:

Windows usually limits file names to 260 characters. But the file name must actually be shorter than that, since the complete path (such as C:\Program Files\filename.txt) is included in this character count.

This is why you might occasionally encounter an error when copying a file with a very long file name to a location that has a longer path than its current location.
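The copy failure described above can be illustrated with a small sketch. It assumes the 260-character figure quoted in this answer counts the whole path, including separators and a terminating NUL; the class name PathLengthCheck, the constant MAX_PATH, and the method fitsLimit are hypothetical names introduced only for this illustration.

```java
final class PathLengthCheck {
    // Limit quoted in the answer; assumed here to include a terminating NUL.
    static final int MAX_PATH = 260;

    // Returns true if directory + separator + fileName stays under the assumed limit.
    static boolean fitsLimit(String directory, String fileName) {
        boolean needsSeparator = !directory.endsWith("\\");
        int length = directory.length() + (needsSeparator ? 1 : 0) + fileName.length();
        return length < MAX_PATH;
    }

    public static void main(String[] args) {
        String longName = new String(new char[250]).replace('\0', 'a');
        // The same name can fit in a short directory and not in a longer one.
        System.out.println(fitsLimit("C:\\", longName));
        System.out.println(fitsLimit("C:\\Program Files", longName));
    }
}
```

The point of the sketch is only that the same file name can pass in one directory and fail in another, so a check on the bare name cannot settle the length question.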

Solution 3 - Java

Well, I think the following method would guarantee a valid file name:

public static boolean isValidName(String text)
{
    try
    {
        File file = new File(text);
        file.createNewFile();
        if(file.exists()) file.delete();
        return true;
    }
    catch(Exception ex){}
    return false;
}

What do you think?

Solution 4 - Java

A method that guarantees, generally, that a Windows filename is valid -- that it would be legal to create a file of that name -- would be impossible to implement.

It is relatively straightforward to guarantee that a Windows filename is invalid. Some of the other regexes attempt to do this. However, the original question requests a stronger assertion: a method that guarantees the filename is valid on Windows.

The MSDN reference cited in other answers indicates that a Windows filename cannot contain "Any other character that the target file system does not allow". For instance, a file containing NUL would be invalid on some file systems, as would extended Unicode characters on some older file systems. Thus, a file called ☃.txt would be valid in some cases, but not others. So whether a hypothetical isValidName("☃") would return true is dependent on the underlying file system.

Suppose, however, such a function is conservative and requires the filename consist of printable ASCII characters. All modern versions of Windows natively support NTFS, FAT32, and FAT16 file formats, which accept Unicode filenames. But drivers for arbitrary filesystems can be installed, and one is free to create a filesystem that doesn't allow, for instance, the letter 'n'. Thus, not even a simple file like "snowman.txt" can be "guaranteed" to be valid.

But even with extreme cases aside, there are other complications. For instance, a file named "$LogFile" cannot exist in the root of an NTFS volume, but can exist elsewhere on the volume. Thus, without knowing the directory, we cannot know if "$LogFile" is a valid name. But even "C:\data\$LogFile" might be invalid if, say, "C:\data" is a symbolic link to another NTFS volume root. (Similarly, "D:\$LogFile" can be valid if D: is an alias to a subdirectory of an NTFS volume.)

There are even more complications. Alternate data streams on files, for instance, are legal on NTFS volumes, so "snowman.txt:☃" may be valid. All three major Windows file systems have path length restrictions, so the validity of the file name is also a function of the path. But the length of the physical path might not even be available to isValidName if the path is a virtual alias, mapped network drive, or symbolic link rather than a physical path on the volume.

Some others have suggested an alternative: create a file by the proposed name and then delete it, returning true if and only if the creation succeeds. This approach has several practical and theoretical problems. One, as indicated earlier, is that the validity is a function both of the filename and the path, so the validity of c:\test\☃.txt might differ from the validity of c:\test2\☃.txt. Also, the function could fail to create the file for any number of reasons not related to the validity of the name, such as not having write permission to the directory. A third flaw is that the validity of a filename is not required to be deterministic: a hypothetical file system might, for instance, not allow a deleted file to be replaced, or (in theory) could even randomly decide if a filename is valid.

As an alternative, it's fairly straightforward to create a method isInvalidFileName(String text) that returns true if the file is guaranteed to not be valid in Windows; filenames like "aux", "*", and "abc.txt." would return true. The file create operation would first check whether the filename is guaranteed to be invalid and, if it is, would stop. Otherwise, the method could attempt to create the file, while being prepared for the edge case where the file cannot be created because the filename is invalid.
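A minimal sketch of that two-step approach might look like the following. The method name isInvalidFileName comes from this answer; the class name ConservativeNameCheck, the helper tryCreate, and the exact set of checks are assumptions made for illustration. The check deliberately leaves ':' and control characters 1 through 31 undecided, since alternate data streams are mentioned above as a case where they can be legal.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class ConservativeNameCheck {
    private static final List<String> DEVICES = Arrays.asList("CON", "PRN", "AUX", "NUL");
    // Characters treated as always invalid in this sketch (':' is left out on purpose).
    private static final String FORBIDDEN = "<>\"/\\|?*\0";

    // True only when the name is considered certainly invalid; false means "unknown".
    static boolean isInvalidFileName(String name) {
        if (name.isEmpty()) return true;
        char last = name.charAt(name.length() - 1);
        if (last == '.' || last == ' ') return true;
        for (int i = 0; i < name.length(); i++) {
            if (FORBIDDEN.indexOf(name.charAt(i)) >= 0) return true;
        }
        // Compare the part before the first dot against the device names.
        int dot = name.indexOf('.');
        String base = (dot < 0 ? name : name.substring(0, dot)).toUpperCase(Locale.ROOT);
        return DEVICES.contains(base) || base.matches("(COM|LPT)[1-9]");
    }

    // Stops early for certainly invalid names, otherwise attempts the creation.
    // Also returns false when the file already exists or cannot be written.
    static boolean tryCreate(Path directory, String name) {
        if (isInvalidFileName(name)) return false;
        try {
            Files.createFile(directory.resolve(name));
            return true;
        } catch (IOException | InvalidPathException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String n : new String[] {"aux", "*", "abc.txt.", "snowman.txt"}) {
            System.out.println(n + " certainly invalid: " + isInvalidFileName(n));
        }
    }
}
```

A false result from tryCreate does not say why the creation failed, which matches the caveat above: the caller still has to treat creation failure as a normal outcome.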

Solution 5 - Java

Posting a new answer because I don't have the rep threshold to comment on Eng.Fouad's code:

public static boolean isValidName(String text)
{
    try
    {
        File file = new File(text);
        if(file.createNewFile()) file.delete();
        return true;
    }
    catch(Exception ex){}
    return false;
}

A small change to your answer that prevents deleting a pre-existing file. Files only get deleted if they were created during this method call, while the return value is the same.

Solution 6 - Java

Here you can find which file names are allowed.

The following characters are not allowed:

  • < (less than)

  • > (greater than)

  • : (colon)

  • " (double quote)

  • / (forward slash)

  • \ (backslash)

  • | (vertical bar or pipe)

  • ? (question mark)

  • * (asterisk)

  • Integer value zero, sometimes referred to as the ASCII NUL character.

  • Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams.

  • Any other character that the target file system does not allow.

Solution 7 - Java

Looks good. At least if we believe this resource: http://msdn.microsoft.com/en-us/library/aa365247%28v=vs.85%29.aspx

But I'd simplify the code. Finding a single one of these characters is enough to say that the name is invalid, so:

public static boolean isValidName(String text)
{
    // No leading '^' in the class: match any single forbidden character (backslash included).
    Pattern pattern = Pattern.compile("[/.\\\\:*?\"<>|]");
    return !pattern.matcher(text).find();
}

This regex is simpler and will work faster.

Solution 8 - Java

This solution will only check if a given filename is valid according to the OS rules without creating a file.

You still need to handle other failures when actually creating the file (e.g. insufficient permissions, lack of drive space, security restrictions).

import java.io.File;
import java.io.IOException;

public class FileUtils {
  public static boolean isFilenameValid(String file) {
    File f = new File(file);
    try {
       f.getCanonicalPath();
       return true;
    }
    catch (IOException e) {
       return false;
    }
  }

  public static void main(String args[]) throws Exception {
    // true
    System.out.println(FileUtils.isFilenameValid("well.txt"));
    System.out.println(FileUtils.isFilenameValid("well well.txt"));
    System.out.println(FileUtils.isFilenameValid(""));

    //false
    System.out.println(FileUtils.isFilenameValid("test.T*T"));
    System.out.println(FileUtils.isFilenameValid("test|.TXT"));
    System.out.println(FileUtils.isFilenameValid("te?st.TXT"));
    System.out.println(FileUtils.isFilenameValid("con.TXT")); // windows
    System.out.println(FileUtils.isFilenameValid("prn.TXT")); // windows
  }
}

Solution 9 - Java

I'm not sure how to implement it in Java (with either a regex or a custom method), but Windows has the following rules for creating a file or directory in the file system:

  1. The name must not consist only of dots.
  2. Windows device names like AUX, CON, NUL, PRN, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 cannot be used as a file name nor as the first segment of a file name (i.e. test1 in test1.txt).
  3. Device names are case insensitive (i.e. prn, PRN, Prn, etc. are identical).
  4. All characters greater than ASCII 31 may be used except " * / : < > ? \ |

So the program needs to stick to these rules. I hope this covers the validation rules for your question.
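One possible way to turn the four rules above into Java is sketched below. The class name NameRules, the method violations, and the rule labels are hypothetical. The sketch reads rule 4 as also excluding the backslash, and treats "first segment" as everything before the first dot; both are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class NameRules {
    // Rule 1: the whole name is one or more dots.
    private static final Pattern ONLY_DOTS = Pattern.compile("\\.+");
    // Rules 2 and 3: a device name, optionally followed by ".anything", ignoring case.
    private static final Pattern DEVICE = Pattern.compile(
        "(?:AUX|CON|NUL|PRN|COM[1-9]|LPT[1-9])(?:\\..*)?", Pattern.CASE_INSENSITIVE);
    // Rule 4: control characters 0-31 and the listed punctuation.
    private static final Pattern BAD_CHAR = Pattern.compile("[\\x00-\\x1F\"*/:<>?\\\\|]");

    // Returns a label for every rule the name breaks; an empty list means no rule was broken.
    static List<String> violations(String name) {
        List<String> found = new ArrayList<>();
        if (ONLY_DOTS.matcher(name).matches()) found.add("only dots");
        if (DEVICE.matcher(name).matches()) found.add("device name");
        if (BAD_CHAR.matcher(name).find()) found.add("forbidden character");
        return found;
    }

    public static void main(String[] args) {
        for (String n : new String[] {"...", "Prn.txt", "a*b", "test1.txt"}) {
            System.out.println(n + " -> " + violations(n));
        }
    }
}
```

Returning the list of broken rules instead of a boolean makes it easier to report why a name was rejected; an empty list only means none of these four rules was violated.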

Solution 10 - Java

You can check all the reserved names (AUX, CON, and the like) and then use this code:

bool invalidName = GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES && 
        GetLastError() == ERROR_INVALID_NAME;

to check for any additional restriction. But note that if you check for a name in a nonexistent directory you will get ERROR_PATH_NOT_FOUND whether the name is really valid or not.

Anyway, you should remember the old saying:

> It's easier to ask for forgiveness than it is to get permission.

Solution 11 - Java

How about letting the File class do your validation?

public static boolean isValidName(String text) {
    try {
        File file = new File(text);
        return file.getPath().equals(text);
    }
    catch(Exception ex){}
    return false;
}
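A related variation, not taken from this thread, is to let java.nio parse the string, since Paths.get throws InvalidPathException when the host platform's path parser rejects it. The sketch below assumes that behaviour is enough for a first filter; the class name PathParseCheck and the method isPlainName are hypothetical, and I would not rely on this to catch reserved device names.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathParseCheck {
    // True if the platform's parser accepts the text and it is a single relative name.
    static boolean isPlainName(String text) {
        try {
            Path path = Paths.get(text);
            return !path.isAbsolute() && path.getParent() == null;
        } catch (InvalidPathException e) {
            // The platform's path parser rejected the string outright.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPlainName("well.txt"));
        System.out.println(isPlainName("dir/well.txt"));
    }
}
```

Which characters are rejected depends on the operating system the JVM runs on, so the same call can give different answers on Windows and elsewhere.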

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Eng.Fouad | View Question on Stackoverflow
Solution 1 - Java | ridgerunner | View Answer on Stackoverflow
Solution 2 - Java | Monday | View Answer on Stackoverflow
Solution 3 - Java | Eng.Fouad | View Answer on Stackoverflow
Solution 4 - Java | drf | View Answer on Stackoverflow
Solution 5 - Java | Abdul Hfuda | View Answer on Stackoverflow
Solution 6 - Java | phimuemue | View Answer on Stackoverflow
Solution 7 - Java | AlexR | View Answer on Stackoverflow
Solution 8 - Java | RealHowTo | View Answer on Stackoverflow
Solution 9 - Java | Ganesan | View Answer on Stackoverflow
Solution 10 - Java | rodrigo | View Answer on Stackoverflow
Solution 11 - Java | nekno | View Answer on Stackoverflow