Remove all special characters with RegExp

JavascriptRegexSpecial Characters

Javascript Problem Overview


I would like a RegExp that will remove all special characters from a string. I am trying something like this but it doesn’t work in IE7, though it works in Firefox.

var specialChars = "!@#$^&%*()+=-[]\/{}|:<>?,.";

for (var i = 0; i < specialChars.length; i++) {
  stringToReplace = stringToReplace.replace(new RegExp("\\" + specialChars[i], "gi"), "");
}

A detailed description of the RegExp would be helpful as well.

Javascript Solutions


Solution 1 - Javascript

var desired = stringToReplace.replace(/[^\w\s]/gi, '')

As was mentioned in the comments it's easier to do this as a whitelist - replace the characters which aren't in your safelist.

The caret (^) character is the negation of the set [...], gi say global and case-insensitive (the latter is a bit redundant but I wanted to mention it) and the safelist in this example is digits, word characters, underscores (\w) and whitespace (\s).

Solution 2 - Javascript

Note that if you still want to exclude a set, including things like slashes and special characters you can do the following:

var outString = sourceString.replace(/[`~!@#$%^&*()_|+\-=?;:'",.<>\{\}\[\]\\\/]/gi, '');

take special note that in order to also include the "minus" character, you need to escape it with a backslash like the latter group. if you don't it will also select 0-9 which is probably undesired.

Solution 3 - Javascript

Plain Javascript regex does not handle Unicode letters.

Do not use [^\w\s], this will remove letters with accents (like àèéìòù), not to mention to Cyrillic or Chinese, letters coming from such languages will be completed removed.

You really don't want remove these letters together with all the special characters. You have two chances:

  • Add in your regex all the special characters you don't want remove,
    for example: [^èéòàùì\w\s].
  • Have a look at xregexp.com. XRegExp adds base support for Unicode matching via the \p{...} syntax.

var str = "Їжак::: résd,$%& adùf"
var search = XRegExp('([^?<first>\\pL ]+)');
var res = XRegExp.replace(str, search, '',"all");

console.log(res); // returns "Їжак::: resd,adf"
console.log(str.replace(/[^\w\s]/gi, '') ); // returns " rsd adf"
console.log(str.replace(/[^\wèéòàùì\s]/gi, '') ); // returns " résd adùf"

<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/3.1.1/xregexp-all.js"></script>

Solution 4 - Javascript

The first solution does not work for any UTF-8 alphabet. (It will cut text such as Їжак). I have managed to create a function which does not use RegExp and use good UTF-8 support in the JavaScript engine. The idea is simple if a symbol is equal in uppercase and lowercase it is a special character. The only exception is made for whitespace.

function removeSpecials(str) {
    var lower = str.toLowerCase();
    var upper = str.toUpperCase();
    
    var res = "";
    for(var i=0; i<lower.length; ++i) {
        if(lower[i] != upper[i] || lower[i].trim() === '')
            res += str[i];
    }
    return res;
}

Update: Please note, that this solution works only for languages where there are small and capital letters. In languages like Chinese, this won't work.

Update 2: I came to the original solution when I was working on a fuzzy search. If you also trying to remove special characters to implement search functionality, there is a better approach. Use any transliteration library which will produce you string only from Latin characters and then the simple Regexp will do all magic of removing special characters. (This will work for Chinese also and you also will receive side benefits by making Tromsø == Tromso).

Solution 5 - Javascript

using \W or [a-z0-9] regex won't work for non english languages like chinese etc.,

It's better to use all special characters in regex and exclude them from given string

str.replace(/[~`!@#$%^&*()+={}\[\];:\'\"<>.,\/\\\?-_]/g, '');

Solution 6 - Javascript

str.replace(/\s|[0-9_]|\W|[#$%^&*()]/g, "") I did sth like this. But there is some people who did it much easier like str.replace(/\W_/g,"");

Solution 7 - Javascript

I use RegexBuddy for debbuging my regexes it has almost all languages very usefull. Than copy/paste for the targeted language. Terrific tool and not very expensive.

So I copy/pasted your regex and your issue is that [,] are special characters in regex, so you need to escape them. So the regex should be : /!@#$^&%*()+=-[\x5B\x5D]/{}|:<>?,./im

Solution 8 - Javascript

@Seagull anwser (https://stackoverflow.com/a/26482552/4556619) looks good but you get undefined string in result when there are some special (turkish) characters. See example below.

let str="bənövşəyi 😟пурпурный İdÖĞ";

i slightly improve it and patch with undefined check.

function removeSpecials(str) {
    let lower = str.toLowerCase();
    let upper = str.toUpperCase();

    let res = "",i=0,n=lower.length,t;
    for(i; i<n; ++i) {
        if(lower[i] !== upper[i] || lower[i].trim() === ''){
            t=str[i];
            if(t!==undefined){
                res +=t;
            }
        }
    }
    return res;
}

Solution 9 - Javascript

why dont you do something like:

re = /^[a-z0-9 ]$/i;
var isValid = re.test(yourInput);

to check if your input contain any special char

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionTimothy RuhleView Question on Stackoverflow
Solution 1 - JavascriptannakataView Answer on Stackoverflow
Solution 2 - JavascriptnoinputView Answer on Stackoverflow
Solution 3 - JavascriptfreedevView Answer on Stackoverflow
Solution 4 - JavascriptSeagullView Answer on Stackoverflow
Solution 5 - JavascriptManikanta C.S.EView Answer on Stackoverflow
Solution 6 - JavascriptEldar MammadovView Answer on Stackoverflow
Solution 7 - JavascriptmillebiiView Answer on Stackoverflow
Solution 8 - JavascriptFuad AllView Answer on Stackoverflow
Solution 9 - JavascriptAnDView Answer on Stackoverflow