Replacing illegal character in fileName

Java Regex Replace

Java Problem Overview


In Java, I have a file name string. I want to replace every illegal character in it with '_', keeping only a-z, 0-9, '-', '.' and '_'.

I tried the following code, but it did not work:

myString = myString.replaceAll("[\\W][^\\.][^-][^_]", "_");

Java Solutions


Solution 1 - Java

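You need to replace everything but [a-zA-Z0-9.-]. A ^ as the first character within the brackets stands for "NOT". (An underscore is matched as well, but replacing it with '_' leaves it unchanged.)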

myString = myString.replaceAll("[^a-zA-Z0-9\\.\\-]", "_");
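As a minimal sketch of the difference, the example below runs both the pattern from the question and the negated class from this answer on the same sample name. The class name, method names and sample input are assumptions made for illustration. The pattern from the question consists of four consecutive character classes, so it only matches runs of four characters and replaces each whole run with a single underscore.

```java
final class NegatedClassDemo {
    // Pattern from the question: four classes in a row, so each match
    // consumes four characters and is replaced by one underscore.
    static String originalAttempt(String name) {
        return name.replaceAll("[\\W][^\\.][^-][^_]", "_");
    }

    // Pattern from this answer: one negated class, so each disallowed
    // character is replaced individually.
    static String sanitize(String name) {
        return name.replaceAll("[^a-zA-Z0-9\\.\\-]", "_");
    }

    public static void main(String[] args) {
        String sample = "my report (v2).txt";
        System.out.println(originalAttempt(sample)); // my_ort_)_
        System.out.println(sanitize(sample));        // my_report__v2_.txt
    }
}
```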

Solution 2 - Java

If you are targeting Windows, you can instead keep every character that is valid in a file name and replace only the reserved ones: \ / : * ? " < > |

fileName = fileName.replaceAll("[\\\\/:*?\"<>|]", "_");
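A short sketch of this blacklist approach in use, assuming a hypothetical helper class and sample input. Unlike the whitelist answers, spaces and other characters are left alone.

```java
final class WindowsNameDemo {
    // Replaces only the reserved characters \ / : * ? " < > | and keeps
    // everything else, including spaces.
    static String sanitize(String fileName) {
        return fileName.replaceAll("[\\\\/:*?\"<>|]", "_");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("report: Q1/Q2 <draft>?.txt"));
        // report_ Q1_Q2 _draft__.txt
    }
}
```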

Solution 3 - Java

Keep it simple.

myString = myString.replaceAll("[^a-zA-Z0-9.-]", "_");

http://ideone.com/TINsr4

Solution 4 - Java

Even simpler

myString = myString.replaceAll("[^\\w.-]", "_");

Predefined Character Classes:

  • \w A word character: [a-zA-Z_0-9]
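A brief sketch of what this means in practice, with a hypothetical class name and inputs. By default \w in Java covers only the ASCII characters listed above, so accented letters are replaced as well. That changes only if the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS, which String.replaceAll does not do unless the flag is embedded in the pattern.

```java
final class WordClassDemo {
    // \w already contains the underscore, so only '.' and '-' need listing.
    static String sanitize(String name) {
        return name.replaceAll("[^\\w.-]", "_");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("my_file-1.txt"));        // my_file-1.txt
        // \u00e9 is e with an acute accent; it is not in the default \w.
        System.out.println(sanitize("r\u00e9sum\u00e9.pdf")); // r_sum_.pdf
    }
}
```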

Solution 5 - Java

I know there have been some answers here already, but I would like to point out that I had to alter the given suggestions slightly.

filename.matches("^.*[^a-zA-Z0-9._-].*$")

This is what I had to use with .matches in Java to get the desired result. I am not sure it is 100% correct, but it worked for me: it returns true if the string contains any character other than a-z, A-Z, 0-9, (.), (_) and (-).

I would like to know if there are any flaws with my logic here.

In previous answers I've seen some discussion of what should or should not be escaped. In this example I got away without escaping anything, but to be safe you should escape the minus character (-), because it is read as a range operator unless it is the first or last character in the list. The dot character (.) does not need to be escaped inside square brackets ([]), but escaping it does no harm.

Please see Java Patterns for more details.
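On the question of flaws, here is a hedged sketch comparing the .matches check with Matcher.find. The class and method names are hypothetical. String.matches has to match the entire input, which is why the pattern needs ".*" on both sides (the ^ and $ are then redundant). One edge case follows from that: "." does not match line terminators by default, so an input with two or more line breaks is not flagged by the .matches version. Matcher.find searches anywhere in the input and avoids this.

```java
import java.util.regex.Pattern;

final class IllegalCharCheck {
    // The hyphen is escaped here, so its position in the class is irrelevant.
    private static final Pattern ILLEGAL = Pattern.compile("[^a-zA-Z0-9._\\-]");

    // Whole-string match: needs ".*" padding around the negated class.
    static boolean hasIllegalViaMatches(String filename) {
        return filename.matches(".*[^a-zA-Z0-9._-].*");
    }

    // Substring search: reports an illegal character anywhere in the input.
    static boolean hasIllegalViaFind(String filename) {
        return ILLEGAL.matcher(filename).find();
    }

    public static void main(String[] args) {
        System.out.println(hasIllegalViaMatches("bad name.txt")); // true
        System.out.println(hasIllegalViaFind("bad name.txt"));    // true
        System.out.println(hasIllegalViaMatches("a\nb\nc"));      // false
        System.out.println(hasIllegalViaFind("a\nb\nc"));         // true
    }
}
```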

Solution 6 - Java

If you want to allow more than [A-Za-z0-9], check the MS Naming Conventions, and don't forget to filter out "...Characters whose integer representations are in the range from 1 through 31,...".
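One possible way to fold that rule into a blacklist pattern is sketched below. The class name and sample input are assumptions, and the range also includes the NUL character (0) as a precaution. The sketch does not cover the other rules in the naming conventions, such as reserved device names like CON or NUL and names ending in a dot or space.

```java
final class WindowsStrictDemo {
    // \x00-\x1F covers NUL plus the control characters 1 through 31;
    // the remaining entries are the reserved characters \ / : * ? " < > |
    static String sanitize(String fileName) {
        return fileName.replaceAll("[\\x00-\\x1F\\\\/:*?\"<>|]", "_");
    }

    public static void main(String[] args) {
        // Tab (9) and the control character 1 are both replaced.
        String sample = "a\tb" + (char) 1 + "c:d.txt";
        System.out.println(sanitize(sample)); // a_b_c_d.txt
    }
}
```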

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type       Original Author   Original Content on Stackoverflow
Question           bbholzbb          View Question on Stackoverflow
Solution 1 - Java  poitroae          View Answer on Stackoverflow
Solution 2 - Java  Prakash           View Answer on Stackoverflow
Solution 3 - Java  Matt Ball         View Answer on Stackoverflow
Solution 4 - Java  IvanRF            View Answer on Stackoverflow
Solution 5 - Java  Atspulgs          View Answer on Stackoverflow
Solution 6 - Java  wandlang          View Answer on Stackoverflow