Can I escape HTML special chars in JavaScript?

Javascript, Html

Javascript Problem Overview


I want to display text in HTML using a JavaScript function. How can I escape HTML special characters in JavaScript? Is there an API?

Javascript Solutions


Solution 1 - Javascript

Here's a solution that will work in practically every web browser:

function escapeHtml(unsafe)
{
    return unsafe
         .replace(/&/g, "&amp;")
         .replace(/</g, "&lt;")
         .replace(/>/g, "&gt;")
         .replace(/"/g, "&quot;")
         .replace(/'/g, "&#039;");
}

If you only support modern web browsers (2020+), then you can use the new replaceAll function:

const escapeHtml = (unsafe) => {
    return unsafe.replaceAll('&', '&amp;').replaceAll('<', '&lt;').replaceAll('>', '&gt;').replaceAll('"', '&quot;').replaceAll("'", '&#039;');
}
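
The order of the replacements matters: the ampersand has to be handled first, otherwise the "&" introduced by the other entities gets escaped a second time. A minimal sketch that shows the difference; escapeHtmlWrongOrder is a hypothetical counter-example, not something to use:

```javascript
// Same five replacements as above; String() is an added guard for non-string input.
function escapeHtml(unsafe) {
  return String(unsafe)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

// Counter-example (hypothetical): "&" is replaced last,
// so the "&" that "&lt;" introduced is escaped again.
function escapeHtmlWrongOrder(unsafe) {
  return String(unsafe)
    .replace(/</g, "&lt;")
    .replace(/&/g, "&amp;");
}

console.log(escapeHtml('<a href="x">Tom & Jerry</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;

console.log(escapeHtmlWrongOrder("<b>"));
// &amp;lt;b>
```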

Solution 2 - Javascript

function escapeHtml(html){
  var text = document.createTextNode(html);
  var p = document.createElement('p');
  p.appendChild(text);
  return p.innerHTML;
}

// Escape while typing & print result
document.querySelector('input').addEventListener('input', e => {
  console.clear();
  console.log( escapeHtml(e.target.value) );
});

<input style='width:90%; padding:6px;' placeholder='&lt;b&gt;cool&lt;/b&gt;'>

Solution 3 - Javascript

You can use jQuery's .text() function.

For example:

http://jsfiddle.net/9H6Ch/

From the jQuery documentation regarding the .text() function:

> We need to be aware that this method escapes the string provided as necessary so that it will render correctly in HTML. To do so, it calls the DOM method .createTextNode(), does not interpret the string as HTML.

Previous versions of the jQuery documentation worded it this way (emphasis added):

> We need to be aware that this method escapes the string provided as necessary so that it will render correctly in HTML. To do so, it calls the DOM method .createTextNode(), which replaces special characters with their HTML entity equivalents (such as &lt; for <).

Solution 4 - Javascript

Using Lodash:

_.escape('fred, barney, & pebbles');
// => 'fred, barney, &amp; pebbles'


Solution 5 - Javascript

I think I found the proper way to do it...

// Create a DOM Text node:
var text_node = document.createTextNode(unescaped_text);

// Get the HTML element where you want to insert the text into:
var elem = document.getElementById('msg_span');

// Optional: clear its old contents
//elem.innerHTML = '';

// Append the text node into it:
elem.appendChild(text_node);

Solution 6 - Javascript

This is, by far, the fastest way I have seen it done. Plus, it does it all without adding, removing, or changing elements on the page.

function escapeHTML(unsafeText) {
	let div = document.createElement('div');
	div.innerText = unsafeText;
	return div.innerHTML;
}

Solution 7 - Javascript

It was interesting to find a better solution:

var escapeHTML = function(unsafe) {
  return unsafe.replace(/[&<"']/g, function(m) {
    switch (m) {
      case '&':
        return '&amp;';
      case '<':
        return '&lt;';
      case '"':
        return '&quot;';
      default:
        return '&#039;';
    }
  });
};

I do not escape > because leaving it as-is does not break XML/HTML code in the result.

Here are the benchmarks: http://jsperf.com/regexpairs Also, I created a universal escape function: http://jsperf.com/regexpairs2
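
A quick usage check of the single-pass function above, repeated here so that the snippet runs on its own. The input string is made up for illustration. Because every character is visited exactly once, there is no double-escaping regardless of the order of the cases:

```javascript
var escapeHTML = function(unsafe) {
  return unsafe.replace(/[&<"']/g, function(m) {
    switch (m) {
      case '&': return '&amp;';
      case '<': return '&lt;';
      case '"': return '&quot;';
      default:  return '&#039;'; // the only remaining match is '
    }
  });
};

console.log(escapeHTML('if (a < b && c > d) say("it\'s ok")'));
// if (a &lt; b &amp;&amp; c > d) say(&quot;it&#039;s ok&quot;)
```

Note that > is left as-is in the output, as described above.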

Solution 8 - Javascript

The most concise and performant way to display unencoded text is to use the textContent property.

It is faster than using innerHTML, and that's without taking into account the escaping overhead.

document.body.textContent = 'a <b> c </b>';

Solution 9 - Javascript

DOM elements support converting text to HTML by assigning to innerText. innerText is a property, not a function, but assigning to it works as if the text were escaped.

document.querySelectorAll('#id')[0].innerText = 'unsafe " String >><>';

Solution 10 - Javascript

You can encode every character in your string:

function encode(e){return e.replace(/[^]/g,function(e){return"&#"+e.charCodeAt(0)+";"})}
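
One caveat with charCodeAt: it returns UTF-16 code units, so a character outside the Basic Multilingual Plane (an emoji, for example) would be emitted as two surrogate values, which are not valid character references. A minimal sketch that iterates by code point instead; the name encodeAll is hypothetical:

```javascript
function encodeAll(text) {
  // Array.from iterates a string by code point, so a surrogate pair stays together.
  return Array.from(String(text), function (ch) {
    return "&#" + ch.codePointAt(0) + ";";
  }).join("");
}

console.log(encodeAll("a<"));        // &#97;&#60;
console.log(encodeAll("\u{1F600}")); // &#128512;
```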

Or just target the main characters to worry about (&, linebreaks, <, >, " and ') like:

function encode(r){
  return r.replace(/[\x26\x0A<>'"]/g,function(r){return"&#"+r.charCodeAt(0)+";"})
}

test.value=encode('How to encode\nonly html tags &<>\'" nice & fast!');

/*************
* \x26 is & (ampersand); its position in the character class does not matter,
* \x0A is newline,
*************/

<textarea id=test rows="9" cols="55">&#119;&#119;&#119;&#46;&#87;&#72;&#65;&#75;&#46;&#99;&#111;&#109;</textarea>

Solution 11 - Javascript

If you already use modules in your application, you can use escape-html module.

import escapeHtml from 'escape-html';
const unsafeString = '<script>alert("XSS");</script>';
const safeString = escapeHtml(unsafeString);

Solution 12 - Javascript

By the books

OWASP recommends that "[e]xcept for alphanumeric characters, [you should] escape all characters with ASCII values less than 256 with the &#xHH; format (or a named entity if available) to prevent switching out of [an] attribute."

So here's a function that does that, with a usage example:

function escapeHTML(unsafe) {
  return unsafe.replace(
    /[\u0000-\u002F\u003A-\u0040\u005B-\u0060\u007B-\u00FF]/g,
    c => '&#' + ('000' + c.charCodeAt(0)).slice(-4) + ';'
  )
}

document.querySelector('div').innerHTML =
  '<span class=' +
  escapeHTML('"fakeclass" onclick="alert("test")') +
  '>' +
  escapeHTML('<script>alert("inspect the attributes")\u003C/script>') +
  '</span>'

<div></div>

You should verify the entity ranges I have provided to validate the safety of the function yourself. You could also use this regular expression which has better readability and should cover the same character codes, but is about 10% less performant in my browser:

/(?![0-9A-Za-z])[\u0000-\u00FF]/g
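
The function above emits decimal references padded with zeros (for example &#0060; for <), which browsers accept. The OWASP quote describes the hexadecimal &#xHH; form, so here is a sketch of a variant that produces it, using the more readable regular expression. The name escapeHTMLHex is hypothetical, and the same advice about verifying the ranges yourself applies:

```javascript
function escapeHTMLHex(unsafe) {
  return String(unsafe).replace(
    // Every character in U+0000..U+00FF that is not an ASCII letter or digit.
    /(?![0-9A-Za-z])[\u0000-\u00FF]/g,
    // Two-digit uppercase hex, e.g. "<" (0x3C) becomes "&#x3C;".
    c => '&#x' + c.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0') + ';'
  );
}

console.log(escapeHTMLHex('a "b" <c>'));
// a&#x20;&#x22;b&#x22;&#x20;&#x3C;c&#x3E;
```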

Solution 13 - Javascript

I came across this issue when building a DOM structure. This question helped me solve it. I wanted to use a double chevron as a path separator, but appending a new text node directly resulted in the escaped character code showing, rather than the character itself:

var _div = document.createElement('div');
var _separator = document.createTextNode('&raquo;');
//_div.appendChild(_separator); /* This resulted in '&raquo;' being displayed */
_div.innerHTML = _separator.textContent; /* This was key */
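
An alternative that avoids innerHTML, sketched under the assumption that the separator is a fixed character: put the character itself (written here as the JavaScript escape \u00BB) into the text node instead of the HTML entity name. Text nodes display their data literally, so no entity decoding is needed. The variable names are hypothetical, and the DOM part is guarded so that the snippet also runs outside a browser:

```javascript
// "\u00BB" is the character that the HTML entity &raquo; stands for.
var separatorChar = '\u00BB';

if (typeof document !== 'undefined') {
  var pathDiv = document.createElement('div');
  // The text node holds the real character, so it is displayed as-is.
  pathDiv.appendChild(document.createTextNode(' ' + separatorChar + ' '));
}

console.log(separatorChar.charCodeAt(0)); // 187
```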

Solution 14 - Javascript

Use this to remove HTML tags from a string in JavaScript:

const strippedString = htmlString.replace(/(<([^>]+)>)/gi, "");

console.log(strippedString);
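
Note that stripping tags is a different operation from escaping: text that is not part of a complete tag, such as a lone & or <, passes through unchanged. A small sketch with a made-up input; stripTags is a hypothetical wrapper around the same regular expression:

```javascript
function stripTags(htmlString) {
  // Removes anything that looks like <...>; it does not escape what remains.
  return String(htmlString).replace(/(<([^>]+)>)/gi, "");
}

console.log(stripTags("<b>bold</b> & 1 < 2"));
// bold & 1 < 2
```

If the result is going to be inserted as HTML, the remaining & and < would still need escaping.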

Solution 15 - Javascript

Just write the code between <pre><code class="html-escape">....</code></pre>. Make sure you add the class name to the code tag. The script will escape any HTML snippet written inside
<pre><code class="html-escape">....</code></pre>.

const escape = {
    '"': '&quot;',
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
}
const codeWrappers = document.querySelectorAll('.html-escape')
if (codeWrappers.length > 0) {
    codeWrappers.forEach(code => {
        const htmlCode = code.innerHTML
        const escapeString = htmlCode.replace(/"|&|<|>/g, function (matched) {
            return escape[matched];
        });
        code.innerHTML = escapeString
    })
}

<pre>
    <code class="language-html html-escape">
        <div class="card">
            <div class="card-header-img" style="background-image: url('/assets/card-sample.png');"></div>
            <div class="card-body">
                <p class="card-title">Card Title</p>
                <p class="card-subtitle">Secondary text</p>
                <p class="card-text">Greyhound divisively hello coldly wonderfully marginally far upon
                    excluding.</p>
                <button class="btn">Go to </button>
                <button class="btn btn-outline">Go to </button>
            </div>
        </div>
    </code>
</pre>

Solution 16 - Javascript

Try this, using the prototype.js library:

string.escapeHTML();


Solution 17 - Javascript

I came up with this solution.

Let's assume that we want to add some HTML to the element with unsafe data from the user or database.

var unsafe = 'some unsafe data like <script>alert("oops");</script> here';

var html = '';
html += '<div>';
html += '<p>' + unsafe + '</p>';
html += '</div>';

element.html(html);

This is vulnerable to XSS attacks. Now add this: $(document.createElement('div')).html(unsafe).text();

So it is

var unsafe = 'some unsafe data like <script>alert("oops");</script> here';

var html = '';
html += '<div>';
html += '<p>' + $(document.createElement('div')).html(unsafe).text() + '</p>';
html += '</div>';

element.html(html);

To me this is much easier than using .replace(), and it will remove (rather than escape) all possible HTML tags (I hope).
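
For comparison, a sketch of the same markup built by escaping the unsafe value instead of stripping tags from it, so that the user's text is displayed exactly as typed. escapeHtml here is a hypothetical helper with the same replacements as Solution 1; the snippet only builds the string, and inserting it with element.html(html) works as before:

```javascript
// Hypothetical helper: the five replacements from Solution 1.
function escapeHtml(unsafe) {
  return String(unsafe)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

var unsafe = 'some unsafe data like <script>alert("oops");</script> here';

var html = '';
html += '<div>';
html += '<p>' + escapeHtml(unsafe) + '</p>';
html += '</div>';

console.log(html);
// <div><p>some unsafe data like &lt;script&gt;alert(&quot;oops&quot;);&lt;/script&gt; here</p></div>
```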

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | fernando123 | View Question on Stackoverflow
Solution 1 - Javascript | bjornd | View Answer on Stackoverflow
Solution 2 - Javascript | spiderlama | View Answer on Stackoverflow
Solution 3 - Javascript | jeremysawesome | View Answer on Stackoverflow
Solution 4 - Javascript | cs01 | View Answer on Stackoverflow
Solution 5 - Javascript | lvella | View Answer on Stackoverflow
Solution 6 - Javascript | arjunpat | View Answer on Stackoverflow
Solution 7 - Javascript | iegik | View Answer on Stackoverflow
Solution 8 - Javascript | user | View Answer on Stackoverflow
Solution 9 - Javascript | teknopaul | View Answer on Stackoverflow
Solution 10 - Javascript | Dave Brown | View Answer on Stackoverflow
Solution 11 - Javascript | Shimon S | View Answer on Stackoverflow
Solution 12 - Javascript | ADJenks | View Answer on Stackoverflow
Solution 13 - Javascript | Silas | View Answer on Stackoverflow
Solution 14 - Javascript | Muneeb Ahmed | View Answer on Stackoverflow
Solution 15 - Javascript | Soumen Khara | View Answer on Stackoverflow
Solution 16 - Javascript | Lucky | View Answer on Stackoverflow
Solution 17 - Javascript | Kostiantyn | View Answer on Stackoverflow