Escaping HTML strings with jQuery

Javascript, Jquery, String, Escaping

Javascript Problem Overview


Does anyone know of an easy way to escape HTML from strings in jQuery? I need to be able to pass an arbitrary string and have it properly escaped for display in an HTML page (preventing JavaScript/HTML injection attacks). I'm sure it's possible to extend jQuery to do this, but I don't know enough about the framework at the moment to accomplish this.

Javascript Solutions


Solution 1 - Javascript

There is also the solution from mustache.js

var entityMap = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
  '/': '&#x2F;',
  '`': '&#x60;',
  '=': '&#x3D;'
};

function escapeHtml (string) {
  return String(string).replace(/[&<>"'`=\/]/g, function (s) {
    return entityMap[s];
  });
}
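
One place a function like this tends to be handy is in template strings. The sketch below wraps a character map in the style of the snippet above in a tagged template, so that every interpolated value is escaped automatically. The `html` tag name is a hypothetical helper made up for illustration; it is not part of mustache.js or jQuery.

```javascript
// Character map in the style of the mustache.js snippet ('&' maps to '&amp;').
var entityMap = {
  '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;',
  "'": '&#39;', '/': '&#x2F;', '`': '&#x60;', '=': '&#x3D;'
};

function escapeHtml(string) {
  return String(string).replace(/[&<>"'`=\/]/g, function (s) {
    return entityMap[s];
  });
}

// Hypothetical tag function: literal parts of the template pass through
// untouched, interpolated values are escaped.
function html(strings) {
  var values = Array.prototype.slice.call(arguments, 1);
  return strings.reduce(function (out, part, i) {
    return out + part + (i < values.length ? escapeHtml(values[i]) : '');
  }, '');
}

var userInput = "<script>alert('hi!');</script>";
var markup = html`<p>${userInput}</p>`;
console.log(markup);
// <p>&lt;script&gt;alert(&#39;hi!&#39;);&lt;&#x2F;script&gt;</p>
```

Only the interpolated values are escaped, so the surrounding markup still has to come from trusted code.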

Solution 2 - Javascript

Since you're using jQuery, you can just set the element's text property:

// before:
// <div class="someClass">text</div>
var someHtmlString = "<script>alert('hi!');</script>";

// set a DIV's text:
$("div.someClass").text(someHtmlString);
// after: 
// <div class="someClass">&lt;script&gt;alert('hi!');&lt;/script&gt;</div>

// get the text in a string:
var escaped = $("<div>").text(someHtmlString).html();
// value: 
// &lt;script&gt;alert('hi!');&lt;/script&gt;

Solution 3 - Javascript

$('<div/>').text('This is fun & stuff').html(); // "This is fun &amp; stuff"

Source: http://debuggable.com/posts/encode-html-entities-with-jquery:480f4dd6-13cc-4ce9-8071-4710cbdd56cb

Solution 4 - Javascript

If you're escaping for HTML, there are only three that I can think of that would be really necessary:

html.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

Depending on your use case, you might also need to do things like " to &quot;. If the list got big enough, I'd just use an array:

var escaped = html;
var findReplace = [[/&/g, "&amp;"], [/</g, "&lt;"], [/>/g, "&gt;"], [/"/g, "&quot;"]]
for(var item in findReplace)
    escaped = escaped.replace(findReplace[item][0], findReplace[item][1]);

encodeURIComponent() will only escape it for URLs, not for HTML.
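
To make the difference concrete, the sketch below runs the same input through encodeURIComponent() and through the three-replacement chain from this answer. The `escapeHtmlBasic` name is made up for the example.

```javascript
function escapeHtmlBasic(html) {
    // '&' is replaced first so the entities produced afterwards are not re-escaped
    return html.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

var input = "<b>Tom & Jerry</b>";
var forUrl = encodeURIComponent(input);
var forHtml = escapeHtmlBasic(input);

console.log(forUrl);  // %3Cb%3ETom%20%26%20Jerry%3C%2Fb%3E
console.log(forHtml); // &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
```

The percent-encoded form is meant for a URL component; dropped into a page as text it would show up as the literal percent sequences.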

Solution 5 - Javascript

Easy enough to use underscore:

_.escape(string) 

Underscore is a utility library that provides a lot of features that native js doesn't provide. There's also lodash which is the same API as underscore but was rewritten to be more performant.

Solution 6 - Javascript

I wrote a tiny little function which does this. It only escapes ", &, < and > (but usually that's all you need anyway). It is slightly more elegant than the earlier proposed solutions in that it only uses one .replace() to do all the conversion. (EDIT 2: Reduced code complexity making the function even smaller and neater, if you're curious about the original code see end of this answer.)

function escapeHtml(text) {
    'use strict';
    return text.replace(/[\"&<>]/g, function (a) {
        return { '"': '&quot;', '&': '&amp;', '<': '&lt;', '>': '&gt;' }[a];
    });
}

This is plain Javascript, no jQuery used.

Escaping / and ' too

Edit in response to mklement's comment.

The above function can easily be expanded to include any character. To specify more characters to escape, simply insert them both in the character class in the regular expression (i.e. inside the /[...]/g) and as an entry in the lookup object (called chr in the original version). (EDIT 2: Shortened this function too, in the same way.)

function escapeHtml(text) {
    'use strict';
    return text.replace(/[\"&'\/<>]/g, function (a) {
        return {
            '"': '&quot;', '&': '&amp;', "'": '&#39;',
            '/': '&#47;',  '<': '&lt;',  '>': '&gt;'
        }[a];
    });
}

Note the above use of &#39; for apostrophe (the symbolic entity &apos; might have been used instead; it is defined in XML, but was originally not included in the HTML spec and might therefore not be supported by all browsers. See: Wikipedia article on HTML character encodings). I also recall reading somewhere that using decimal entities is more widely supported than using hexadecimal, but I can't seem to find the source for that now. (And there cannot be many browsers out there that do not support the hexadecimal entities.)

Note: Adding / and ' to the list of escaped characters isn't all that useful, since they do not have any special meaning in HTML and do not need to be escaped.
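
If you find yourself editing the character class and the lookup object in two places, one option is to derive the regular expression from the object. This is only a sketch: `makeEscaper` is a hypothetical helper, and it assumes every key in the map is a single character.

```javascript
// Build an escaper from any character-to-entity map (hypothetical helper).
function makeEscaper(map) {
    'use strict';
    // Backslash-escape the few characters that are special inside [...]
    var chars = Object.keys(map).map(function (c) {
        return c.replace(/[\\\]^-]/g, '\\$&');
    }).join('');
    var re = new RegExp('[' + chars + ']', 'g');
    return function (text) {
        return String(text).replace(re, function (a) { return map[a]; });
    };
}

var escapeBasic = makeEscaper({ '"': '&quot;', '&': '&amp;', '<': '&lt;', '>': '&gt;' });
var escapeMore = makeEscaper({
    '"': '&quot;', '&': '&amp;', "'": '&#39;',
    '/': '&#47;',  '<': '&lt;',  '>': '&gt;'
});

console.log(escapeBasic('a < b & "c"')); // a &lt; b &amp; &quot;c&quot;
console.log(escapeMore("</p> it's"));    // &lt;&#47;p&gt; it&#39;s
```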

Original escapeHtml Function

EDIT 2: The original function used a variable (chr) to store the object needed for the .replace() callback. This variable also needed an extra anonymous function to scope it, making the function (needlessly) a little bit bigger and more complex.

var escapeHtml = (function () {
    'use strict';
    var chr = { '"': '&quot;', '&': '&amp;', '<': '&lt;', '>': '&gt;' };
    return function (text) {
        return text.replace(/[\"&<>]/g, function (a) { return chr[a]; });
    };
}());

I haven't tested which of the two versions is faster. If you do, feel free to add info and links about it here.

Solution 7 - Javascript

I realize how late I am to this party, but I have a very easy solution that does not require jQuery.

escaped = new Option(unescaped).innerHTML;

Edit: This does not escape quotes. The only case where quotes would need to be escaped is if the content is going to be pasted inline to an attribute within an HTML string. It is hard for me to imagine a case where doing this would be good design.

Edit 3: For the fastest solution, check the answer from Saram (Solution 9 on this page). This one is the shortest.
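
To illustrate the remark about quotes: the sketch below builds an attribute value by string concatenation, once with only &, < and > escaped (roughly what an innerHTML-based approach gives you for text) and once with quotes escaped as well. Both helper names are made up for the example.

```javascript
// Escapes only &, < and >, which is enough for text between tags.
function escapeText(s) {
    return String(s).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Additionally escapes both quote characters for use inside attribute values.
function escapeAttr(s) {
    return escapeText(s).replace(/"/g, '&quot;').replace(/'/g, '&#39;');
}

var unescaped = 'x" onmouseover="alert(1)';

var unsafe = '<input value="' + escapeText(unescaped) + '">';
var safe   = '<input value="' + escapeAttr(unescaped) + '">';

console.log(unsafe); // <input value="x" onmouseover="alert(1)">
console.log(safe);   // <input value="x&quot; onmouseover=&quot;alert(1)">
```

In the first result the input has broken out of the value attribute and added an event handler; in the second it stays inside the attribute.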

Solution 8 - Javascript

Here is a clean, clear JavaScript function. It will escape text such as "a few < many" into "a few &lt; many".

function escapeHtmlEntities (str) {
  if (typeof jQuery !== 'undefined') {
    // Create an empty div to use as a container,
    // then put the raw text in and get the HTML
    // equivalent out.
    return jQuery('<div/>').text(str).html();
  }
  
  // No jQuery, so use string replace.
  return str
    .replace(/&/g, '&amp;')
    .replace(/>/g, '&gt;')
    .replace(/</g, '&lt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

Solution 9 - Javascript

After my latest tests I can recommend the fastest, fully cross-browser compatible native JavaScript (DOM) solution:

function HTMLescape(html){
    return document.createElement('div')
        .appendChild(document.createTextNode(html))
        .parentNode
        .innerHTML
}

If you call it many times, you can prepare the DOM nodes once and reuse them:

//prepare variables
var DOMtext = document.createTextNode("test");
var DOMnative = document.createElement("span");
DOMnative.appendChild(DOMtext);

//main work for each case
function HTMLescape(html){
  DOMtext.nodeValue = html;
  return DOMnative.innerHTML
}

Look at my final performance comparison (stack question).

Solution 10 - Javascript

Try Underscore.string lib, it works with jQuery.

_.str.escapeHTML('<div>Blah blah blah</div>')

output:

'&lt;div&gt;Blah blah blah&lt;/div&gt;'

Solution 11 - Javascript

escape() and unescape() are intended to encode / decode strings for URLs, not HTML.

Actually, I use the following snippet to do the trick that doesn't require any framework:

var escapedHtml = html.replace(/&/g, '&amp;')
                      .replace(/>/g, '&gt;')
                      .replace(/</g, '&lt;')
                      .replace(/"/g, '&quot;')
                      .replace(/'/g, '&apos;');

Solution 12 - Javascript

I've enhanced the mustache.js example adding the escapeHTML() method to the string object.

var __entityMap = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': '&quot;',
    "'": '&#39;',
    "/": '&#x2F;'
};

String.prototype.escapeHTML = function() {
    return String(this).replace(/[&<>"'\/]/g, function (s) {
        return __entityMap[s];
    });
}

That way it is quite easy to use "Some <text>, more Text&Text".escapeHTML()

Solution 13 - Javascript

If you have underscore.js, use _.escape (more efficient than the jQuery method posted above):

_.escape('Curly, Larry & Moe'); // returns: Curly, Larry &amp; Moe

Solution 14 - Javascript

If you're going the regex route, beware of an indexing error in the for...in loop from tghw's example above.

// WON'T WORK - item is an index (a string key), not an array element

var escaped = html;
var findReplace = [[/&/g, "&amp;"], [/</g, "&lt;"], [/>/g, "&gt;"], [/"/g, "&quot;"]]

for(var item in findReplace) {
     escaped = escaped.replace(item[0], item[1]);
}


// WORKS - findReplace[item] correctly references the [pattern, replacement] pair

var escaped = html;
var findReplace = [[/&/g, "&amp;"], [/</g, "&lt;"], [/>/g, "&gt;"], [/"/g, "&quot;"]]

for(var item in findReplace) {
     escaped = escaped.replace(findReplace[item][0], findReplace[item][1]);
}
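
The index-versus-element confusion goes away if the loop hands you the pairs directly. A minimal sketch, assuming an ES2015 environment where for...of is available (the sample input string is made up):

```javascript
var html = '<a href="x">Tom & Jerry</a>';
var findReplace = [[/&/g, "&amp;"], [/</g, "&lt;"], [/>/g, "&gt;"], [/"/g, "&quot;"]];

// for...in yields the keys "0", "1", ...; index back into the array to get each pair
var viaIn = html;
for (var key in findReplace) {
    viaIn = viaIn.replace(findReplace[key][0], findReplace[key][1]);
}

// for...of yields the [pattern, replacement] pairs themselves
var viaOf = html;
for (var pair of findReplace) {
    viaOf = viaOf.replace(pair[0], pair[1]);
}

console.log(viaOf); // &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```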

Solution 15 - Javascript

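This is a nice safe example: it encodes every ASCII character (codes 1 to 127) as a numeric entity and replaces anything else with "?".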

function escapeHtml(str) {
    if (typeof(str) == "string"){
        try{
            var newStr = "";
            var nextCode = 0;
            for (var i = 0;i < str.length;i++){
                nextCode = str.charCodeAt(i);
                if (nextCode > 0 && nextCode < 128){
                    newStr += "&#"+nextCode+";";
                }
                else{
                    newStr += "?";
                }
             }
             return newStr;
        }
        catch(err){
        }
    }
    else{
        return str;
    }
}
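
The function above throws away anything outside the ASCII range. If that loss matters, one possible variant (a sketch, not part of the original answer) encodes every character by code point instead, so accented letters and emoji survive as numeric references. The `encodeAllChars` name is hypothetical.

```javascript
// Encode every character as a decimal numeric character reference.
// Array.from iterates by code point, so surrogate pairs stay together.
function encodeAllChars(str) {
    return Array.from(String(str), function (ch) {
        return "&#" + ch.codePointAt(0) + ";";
    }).join("");
}

console.log(encodeAllChars("<\u00e9>"));      // &#60;&#233;&#62;
console.log(encodeAllChars("\ud83d\ude00"));  // &#128512;
```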

Solution 16 - Javascript

You can easily do it with vanilla js.

Simply add a text node to the document. It will be escaped by the browser.

var escaped = document.createTextNode("<HTML TO/ESCAPE/>")
document.getElementById("[PARENT_NODE]").appendChild(escaped)

Solution 17 - Javascript

2 simple methods that require NO JQUERY...

You can encode all characters in your string like this:

function encode(e){return e.replace(/[^]/g,function(e){return"&#"+e.charCodeAt(0)+";"})}

Or just target the main characters to worry about &, line breaks, <, >, " and ' like:

function encode(r){

return r.replace(/[\x26\x0A<>'"]/g,function(r){return"&#"+r.charCodeAt(0)+";"}) }

var myString='Encode HTML entities!\n"Safe" escape

Solution 18 - Javascript

(function(undefined){
    var charsToReplace = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;'
    };

    var replaceReg = new RegExp("[" + Object.keys(charsToReplace).join("") + "]", "g");
    var replaceFn = function(tag){ return charsToReplace[tag] || tag; };

    var replaceRegF = function(replaceMap) {
        return (new RegExp("[" + Object.keys(charsToReplace).concat(Object.keys(replaceMap)).join("") + "]", "gi"));
    };
    var replaceFnF = function(replaceMap) {
        return function(tag){ return replaceMap[tag] || charsToReplace[tag] || tag; };
    };

    String.prototype.htmlEscape = function(replaceMap) {
        if (replaceMap === undefined) return this.replace(replaceReg, replaceFn);
        return this.replace(replaceRegF(replaceMap), replaceFnF(replaceMap));
    };
})();

No global variables, some memory optimization. Usage:

"some<tag>and&symbol©".htmlEscape({'©': '&copy;'})

result is:

"some&lt;tag&gt;and&amp;symbol&copy;"

Solution 19 - Javascript

Plain JavaScript escaping example:

function escapeHtml(text) {
	var div = document.createElement('div');
	div.textContent = text; // textContent keeps line breaks as-is; innerText would turn them into <br>
	return div.innerHTML;
}

escapeHtml("<script>alert('hi!');</script>")
// "&lt;script&gt;alert('hi!');&lt;/script&gt;"

Solution 20 - Javascript

ES6 one liner for the solution from mustache.js

const escapeHTML = str => (str+'').replace(/[&<>"'`=\/]/g, s => ({'&': '&amp;','<': '&lt;','>': '&gt;','"': '&quot;',"'": '&#39;','/': '&#x2F;','`': '&#x60;','=': '&#x3D;'})[s]);

Solution 21 - Javascript

function htmlEscape(str) {
    var stringval="";
    $.each(str, function (i, element) {
        alert(element);
        stringval += element
            .replace(/&/g, '&amp;')
            .replace(/"/g, '&quot;')
            .replace(/'/g, '&#39;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;')
            .replace(' ', '-')
            .replace('?', '-')
            .replace(':', '-')
            .replace('|', '-')
            .replace('.', '-');
    });
    alert(stringval);
    return String(stringval);
}

Solution 22 - Javascript

function htmlDecode(t){
   if (t) return $('<div />').html(t).text();
}

works like a charm

Solution 23 - Javascript

A speed-optimized version:

function escapeHtml(s) {
   let out = "";
   let p2 = 0;
   for (let p = 0; p < s.length; p++) {
      let r;
      switch (s.charCodeAt(p)) {
         case 34: r = "&quot;"; break;  // "
         case 38: r = "&amp;" ; break;  // &
         case 39: r = "&#39;" ; break;  // '
         case 60: r = '&lt;'  ; break;  // <
         case 62: r = '&gt;'  ; break;  // >
         default: continue;
      }
      if (p2 < p) {
         out += s.substring(p2, p);
      }
      out += r;
      p2 = p + 1;
   }
   if (p2 == 0) {
      return s;
   }
   if (p2 < s.length) {
      out += s.substring(p2);
   }
   return out;
}

const s = "Hello <World>!";
document.write(escapeHtml(s));
console.log(escapeHtml(s));

Solution 24 - Javascript

To escape HTML special characters (UTF-8):

function htmlEscape(str) {
  return str
      .replace(/&/g, '&amp;')
      .replace(/"/g, '&quot;')
      .replace(/'/g, '&#39;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      .replace(/\//g, '&#x2F;')
      .replace(/=/g,  '&#x3D;')
      .replace(/`/g, '&#x60;');
}

To unescape HTML special characters (UTF-8):

function htmlUnescape(str) {
  // Decode '&amp;' last so that text like '&amp;lt;' is not unescaped twice
  return str
      .replace(/&quot;/g, '"')
      .replace(/&#39;/g, "'")
      .replace(/&lt;/g, '<')
      .replace(/&gt;/g, '>')
      .replace(/&#x2F;/g, '/')
      .replace(/&#x3D;/g, '=')
      .replace(/&#x60;/g, '`')
      .replace(/&amp;/g, '&');
}
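
Chained replacements are order-sensitive when unescaping: if '&amp;' is decoded before the other entities, text such as '&amp;lt;' ends up decoded twice. A single-pass variant avoids the ordering question altogether. This is a sketch; `unescapeMap` and `htmlUnescapeOnce` are names invented for the example.

```javascript
var unescapeMap = {
  '&amp;': '&', '&quot;': '"', '&#39;': "'", '&lt;': '<',
  '&gt;': '>', '&#x2F;': '/', '&#x3D;': '=', '&#x60;': '`'
};

// One pass over the string: each entity is decoded exactly once,
// so the '&' produced from '&amp;' is never scanned again.
function htmlUnescapeOnce(str) {
  return str.replace(/&(?:amp|quot|#39|lt|gt|#x2F|#x3D|#x60);/g, function (entity) {
    return unescapeMap[entity];
  });
}

console.log(htmlUnescapeOnce('&lt;b&gt;')); // <b>
console.log(htmlUnescapeOnce('&amp;lt;'));  // &lt;
```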

Solution 25 - Javascript

This answer provides the jQuery and normal JS methods, but this is shortest without using the DOM:

unescape(escape("It's > 20% less complicated this way."))

Escaped string: It%27s%20%3E%2020%25%20less%20complicated%20this%20way.

If the escaped spaces bother you, try:

unescape(escape("It's > 20% less complicated this way.").replace(/%20/g, " "))

Escaped string: It%27s %3E 20%25 less complicated this way.

Unfortunately, the escape() function was deprecated in JavaScript version 1.5. encodeURI() or encodeURIComponent() are alternatives, but they ignore ', so the last line of code would turn into this:

decodeURI(encodeURI("It's > 20% less complicated this way.").replace(/%20/g, " ").replace("'", '%27'))

All major browsers still support the short code, and given the number of old websites, I doubt that will change soon.

Solution 26 - Javascript

If you are saving this information in a database, it's wrong to escape HTML using a client-side script; this should be done on the server. Otherwise it's easy to bypass your XSS protection.

To make my point clear, here is an example using one of the answers:

Let's say you are using the function escapeHtml to escape the HTML from a comment in your blog and then posting it to your server.

var entityMap = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': '&quot;',
    "'": '&#39;',
    "/": '&#x2F;'
  };

  function escapeHtml(string) {
    return String(string).replace(/[&<>"'\/]/g, function (s) {
      return entityMap[s];
    });
  }

The user could:

  • Edit the POST request parameters and replace the comment with javascript code.
  • Overwrite the escapeHtml function using the browser console.

If the user pastes this snippet in the console, it would bypass the XSS validation:

function escapeHtml(string){
   return string
}
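
What doing it on the server might look like, as a minimal sketch: it assumes plain Node.js with no framework, and `renderComment` is a hypothetical render step. The point is only that the escaping runs in code the user cannot overwrite.

```javascript
// Server-side sketch (plain Node.js assumed, no framework).
function escapeHtml(string) {
  var entityMap = {
    "&": "&amp;", "<": "&lt;", ">": "&gt;",
    '"': '&quot;', "'": '&#39;', "/": '&#x2F;'
  };
  return String(string).replace(/[&<>"'\/]/g, function (s) {
    return entityMap[s];
  });
}

// Hypothetical render step: the stored comment is treated as untrusted,
// no matter what the browser did before sending it.
function renderComment(storedComment) {
  return '<p class="comment">' + escapeHtml(storedComment) + '</p>';
}

var fromDatabase = "<script>alert('xss')</script>";
console.log(renderComment(fromDatabase));
// <p class="comment">&lt;script&gt;alert(&#39;xss&#39;)&lt;&#x2F;script&gt;</p>
```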

Solution 27 - Javascript

All solutions are useless if you don't prevent re-escaping, e.g. most solutions would keep escaping & to &amp;.

escapeHtml = function (s) {
    return s ? s.replace(
        /[&<>'"]/g,
        function (c, offset, str) {
            if (c === "&") {
                var substr = str.substring(offset, offset + 6);
                if (/&(amp|lt|gt|apos|quot);/.test(substr)) {
                    // already escaped, do not re-escape
                    return c;
                }
            }
            return "&" + {
                "&": "amp",
                "<": "lt",
                ">": "gt",
                "'": "apos",
                '"': "quot"
            }[c] + ";";
        }
    ) : "";
};
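
The same check can also be folded into the regular expression with a negative lookahead. This is a sketch of that variant, with made-up names `entityFor` and `escapeHtmlOnce`. Keep in mind the trade-off of either version: if a user really typed the five characters &amp; as text, they are left alone and the page will show a plain &.

```javascript
var entityFor = { "&": "&amp;", "<": "&lt;", ">": "&gt;", "'": "&apos;", '"': "&quot;" };

// '&' is matched only when it does NOT already start one of the five entities
function escapeHtmlOnce(s) {
    return s ? s.replace(/&(?!(?:amp|lt|gt|apos|quot);)|[<>'"]/g, function (c) {
        return entityFor[c];
    }) : "";
}

console.log(escapeHtmlOnce("a & b &amp; <c>"));       // a &amp; b &amp; &lt;c&gt;
console.log(escapeHtmlOnce(escapeHtmlOnce("a & b"))); // a &amp; b
```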

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Page | View Question on Stackoverflow
Solution 1 - Javascript | Tom Gruner | View Answer on Stackoverflow
Solution 2 - Javascript | travis | View Answer on Stackoverflow
Solution 3 - Javascript | Henrik N | View Answer on Stackoverflow
Solution 4 - Javascript | tghw | View Answer on Stackoverflow
Solution 5 - Javascript | chovy | View Answer on Stackoverflow
Solution 6 - Javascript | zrajm | View Answer on Stackoverflow
Solution 7 - Javascript | Adam Leggett | View Answer on Stackoverflow
Solution 8 - Javascript | intrepidis | View Answer on Stackoverflow
Solution 9 - Javascript | Saram | View Answer on Stackoverflow
Solution 10 - Javascript | Nikita Koksharov | View Answer on Stackoverflow
Solution 11 - Javascript | NicolasBernier | View Answer on Stackoverflow
Solution 12 - Javascript | Jeena | View Answer on Stackoverflow
Solution 13 - Javascript | ronnbot | View Answer on Stackoverflow
Solution 14 - Javascript | Wayne | View Answer on Stackoverflow
Solution 15 - Javascript | amrp | View Answer on Stackoverflow
Solution 16 - Javascript | raam86 | View Answer on Stackoverflow
Solution 17 - Javascript | Dave Brown | View Answer on Stackoverflow
Solution 18 - Javascript | Gheljenor | View Answer on Stackoverflow
Solution 19 - Javascript | iamandrewluca | View Answer on Stackoverflow
Solution 20 - Javascript | chickens | View Answer on Stackoverflow
Solution 21 - Javascript | Katharapu Ramana | View Answer on Stackoverflow
Solution 22 - Javascript | d-_-b | View Answer on Stackoverflow
Solution 23 - Javascript | Christian d'Heureuse | View Answer on Stackoverflow
Solution 24 - Javascript | oscar castellon | View Answer on Stackoverflow
Solution 25 - Javascript | Cees Timmerman | View Answer on Stackoverflow
Solution 26 - Javascript | Kauê Gimenes | View Answer on Stackoverflow
Solution 27 - Javascript | C Nimmanant | View Answer on Stackoverflow