Should I use <![CDATA[...]]> in HTML5?

Cdata Problem Overview


I'm pretty sure <![CDATA[...]]> sections can be used in XHTML5, but what about HTML5?

Cdata Solutions


Solution 1 - Cdata

The CDATA structure isn't really for HTML at all; it's for XML.

People sometimes use it in XHTML inside script tags because it removes the need to escape <, > and & characters. It's unnecessary in HTML, though, since the contents of script tags in HTML are already treated as raw text, much like a CDATA section.

Edit: This is where we open that really mouldy old can of worms from 2002 over whether you're sending XHTML as text/html or as application/xhtml+xml like you’re “supposed” to :-)
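To illustrate the escaping point, here is a minimal sketch that feeds three variants of a script element to Python's standard-library XML parser, which stands in for an XHTML parser here (an assumption; browsers use their own parsers). The sample markup and the helper name script_text_as_xml are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: the same script body written three ways.
UNESCAPED = "<script>if (a < b && b > 0) { run(); }</script>"
WITH_CDATA = "<script><![CDATA[if (a < b && b > 0) { run(); }]]></script>"
WITH_ENTITIES = "<script>if (a &lt; b &amp;&amp; b &gt; 0) { run(); }</script>"


def script_text_as_xml(markup):
    """Return the script text an XML parser sees, or None if it is not well-formed."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


if __name__ == "__main__":
    for name, markup in [("unescaped", UNESCAPED),
                         ("cdata", WITH_CDATA),
                         ("entities", WITH_ENTITIES)]:
        print(name, "->", script_text_as_xml(markup))
```

The unescaped variant is rejected as XML, while the CDATA and entity-escaped variants both yield the same script text.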

Solution 2 - Cdata

From the same page @pst linked to:

> Element-specific parsing for script and style tags, Guidance for XHTML-HTML compatibility: "The following code with escaping can ensure script and style elements will work in both XHTML and HTML, including older browsers."

Maximum backwards compatibility:

<script type="text/javascript"><!--//--><![CDATA[//><!--
    ...
//--><!]]></script>

Simpler version, sort of incompatible with "much older browsers":

<script>//<![CDATA[
   ...
//]]></script>

So, CDATA can be used in HTML5, and it's recommended in the official Guidance for XHTML-HTML compatibility.

This is useful for polyglot HTML/XML/XHTML pages, which are served as strict application/xml XML during development, but served as text/html HTML5 in production mode for better cross-browser compatibility. Polyglot pages have their benefits; I've used this myself, as it's much easier to debug XML/XHTML5. Google Chrome, for example, will throw an error for invalid XML/XHTML5 (including, for example, character escaping), whereas the same page served as HTML5 will "just work", also known as "probably work".
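As a rough check of the polyglot claim, the sketch below runs the simpler //<![CDATA[ wrapper through Python's standard-library XML parser and HTML parser and compares the lines that are not JavaScript comments. The two Python parsers are only stand-ins for browser behaviour, and the sample script body and helper names (ScriptText, code_lines, as_xml, as_html) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical script using the simpler wrapper from the answer above.
POLYGLOT_SCRIPT = """<script>//<![CDATA[
if (a < b && b > 0) { run(); }
//]]></script>"""


class ScriptText(HTMLParser):
    """Collect the character data found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


def code_lines(text):
    """Drop blank lines and lines that JavaScript treats as // comments."""
    lines = [line.strip() for line in text.splitlines()]
    return [line for line in lines if line and not line.startswith("//")]


def as_xml(markup):
    """Script text as the XML parser reports it (CDATA markers consumed)."""
    return ET.fromstring(markup).text or ""


def as_html(markup):
    """Script text as the HTML parser reports it (CDATA markers kept as text)."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)
```

In the XML view the CDATA markers disappear and only the leading // of each wrapper line remains; in the HTML view the markers stay in the text but sit behind // comments. Either way, the remaining code lines are identical.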

Solution 3 - Cdata

The spec seems to clear up this issue. script and style are considered "raw text elements", so CDATA is neither needed nor allowed inside them. CDATA sections are only used in "foreign content", i.e. MathML and SVG. Note that there are some restrictions on what can go in a script tag: basically, you can't put something like var x = '</script>' in there, because it will close the tag; the string needs to be split, as pst noted in his answer. See http://www.w3.org/TR/html5/syntax.html#cdata-rcdata-restrictions
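The '</script>' restriction can be seen with a small sketch that uses Python's standard-library HTML parser as a stand-in for a browser's tokenizer (an assumption; real browsers implement the full HTML5 algorithm). The class ScriptBodies, the helper script_bodies, and the two sample strings are hypothetical names for this illustration.

```python
from html.parser import HTMLParser


class ScriptBodies(HTMLParser):
    """Record the text of each <script> element as the HTML parser delimits it."""

    def __init__(self):
        super().__init__()
        self.current = None  # list of text chunks while inside a script, else None
        self.bodies = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.current = []

    def handle_data(self, data):
        if self.current is not None:
            self.current.append(data)

    def handle_endtag(self, tag):
        # A stray </script> outside a script element is simply ignored here.
        if tag == "script" and self.current is not None:
            self.bodies.append("".join(self.current))
            self.current = None


def script_bodies(markup):
    parser = ScriptBodies()
    parser.feed(markup)
    parser.close()
    return parser.bodies


# The literal closing tag inside the string ends the element early.
BROKEN = "<script>var x = '</script>';</script>"
# Splitting the string keeps the whole statement inside the element.
SPLIT = "<script>var x = '<' + '/script>';</script>"
```

With BROKEN, the parser closes the element at the first '</script>' and the script body is cut off mid-statement; with SPLIT, the full statement survives.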

Solution 4 - Cdata

HTML5-supporting browsers (and most older browsers going all the way back to 2001) already read the content inside <style> and <script> tags as CDATA (character data). That means you generally do not need to add CDATA tags inside those elements for most HTML browsers built in the past 20 years, as they will correctly handle any special characters that might pop up in the CSS and JavaScript code between them.

Note: A CDATA section tells an XML parser not to treat the special characters inside it (such as < or >) as markup, which would otherwise break the document. In modern HTML parsers, only the <style> and <script> elements have this behavior built in. That simply means HTML browsers and parsers are designed not to read or parse those characters as part of the markup. If these elements did not have built-in CDATA behavior, your web page, styles, and scripts could break!

However, you do need to add a CDATA block inside <style> and <script> tags if you want your HTML5 page to be compatible with XHTML and XML parsers, which do need CDATA sections. For that reason, I do recommend you use CDATA in HTML5 <style> and <script> tags, but please read on. If you do not do this right, you will break your website!

XML and XHTML parsers read <style> and <script> content the same way they read every other element: as PCDATA (parsed character data), meaning the contents are parsed as markup and can break on special characters placed between those tags. You can add CDATA sections between those two tags to prevent this. Because XML and XHTML parsers read everything inside elements as potential markup, wrapping it in CDATA prevents characters from being interpreted as tags or character references.

The problem is, most HTML4/HTML5 browsers and parsers don't support adding additional CDATA sections between those tags, so CDATA blocks have to be commented out for those agents if you add them for XHTML/XML support.

Also, note that HTML comment markers (<!-- and -->) placed inside those tags are not treated as comments by HTML parsers, but XHTML parsers do honor them, which comments out your CSS and JavaScript in XHTML. Many people in the past added comment markers between those tags to hide styles and scripts from very old browsers that did not understand CSS or JavaScript (pre-1998 browsers). But that strategy fails in XHTML without additional code.
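The difference in comment handling can be sketched with Python's standard-library parsers, used here as stand-ins for an XHTML parser and an HTML parser (an assumption about equivalence, not a browser test). The sample markup and the names ScriptData, html_script_text and xml_script_text are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Old-style "hide the script from ancient browsers" pattern (sample markup).
OLD_STYLE = "<script><!--\nalert('hi');\n// --></script>"


class ScriptData(HTMLParser):
    """Accumulate the text the HTML parser reports inside <script>."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.data = ""

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.data += data


def html_script_text(markup):
    parser = ScriptData()
    parser.feed(markup)
    parser.close()
    return parser.data


def xml_script_text(markup):
    # The XML parser treats <!-- ... --> as a real comment and discards it.
    return ET.fromstring(markup).text or ""
```

The HTML parser hands the whole body, comment markers included, to the script engine, while the XML parser is left with an empty script element.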

So how do you combine all that inside <style> and <script> tags, and should you care?

I am a purist and like my HTML5 content to still be XML/XHTML-friendly, regardless of what markup recommendation I am using. I also like my pages to work in browsers that know CSS and older browsers that do not. So here are two solutions to support all those scenarios and still display your styles and scripts in modern browsers without error. They are totally safe to use in modern HTML5 browsers:

STYLE

<style type="text/css">
    <!--/*--><![CDATA[/*><!--*/

    /* put your styles here */

    /*]]>*/-->
</style>

SCRIPT

<script type="text/javascript">
    <!--//--><![CDATA[//><!--

    // put your scripts here

    //--><!]]>
</script>

  • These two code blocks will allow HTML5 browsers to work normally with CSS and JavaScript but hide them from older browsers that do not support those technologies.

  • XHTML browsers will now parse your CSS and JavaScript as before but not allow special characters like <, >, and & to be interpreted as markup or entities/escaped characters which would generate parsing errors. They are CDATA now.

  • XML parsers of your page will not understand your CSS and JavaScript, of course, but will accept any type of text you add in there and not try and parse them as markup. They are CDATA now.

  • HOW THE EXAMPLES WORK:

    Modern HTML5-supporting browsers: the content of script and style elements is treated like CDATA by default, so the comment markers <!-- and --> inside them are not parsed as HTML comments. The CSS and script comment syntax (/* ... */ and //) then comments out the rest of the top and bottom lines. This means the top and bottom lines are always safely hidden and ignored in newer HTML5 browsers.

    Older browsers that do not know scripts or CSS: they do not treat script and style elements as CDATA, nor do they understand CSS and script comments, but they do understand HTML comments, so they comment out all the CSS and scripts within each of the two elements. The HTML comment on the first line is applied first (<!--/*--> for styles, <!--//--> for scripts). Then <![CDATA[/*> (or <![CDATA[//>) is read, which to them is an empty unknown element and is ignored. The HTML comment that follows hides all the CSS and scripts from there to the end of the block. The final <!]]> in the script example is another empty element that they ignore.

    XHTML parsers: they do not read the content of these elements as CDATA, but they do understand HTML comments, so they discard the first comment block. <![CDATA[ then starts a CDATA section for them, which wraps all the styles and scripts inside the tags until ]]> is read. Everything inside the CDATA section is treated by the XHTML parser as character data rather than markup, so it reaches the CSS or script engine as normal CSS and scripts, just as in an HTML5 parser. The CSS and script comments still apply, and because XHTML user agents know CSS and scripting, they process the code correctly.

XML parsers handle this code the same way XHTML parsers do, except that they do not know CSS or script comments. They simply treat everything inside the CDATA section as plain character text within the element and show it as plain text when rendering the page.

Your HTML5 page is now cross-compatible with modern HTML5 and XHTML5 browsers, older HTML/XHTML browsers, very old 1990s browsers that do not support CSS or scripts, and various XML parsers, old and new! Enjoy!
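The XHTML/XML side of the walkthrough can be checked with a short sketch using Python's standard-library XML parser, which stands in for an XHTML parser here (an assumption). The CSS rule, the script statement, and the helper name xml_view are placeholders for the illustration; the wrapper lines are the ones from the STYLE and SCRIPT examples.

```python
import xml.etree.ElementTree as ET

# The two wrappers from the answer, with placeholder CSS and JavaScript inside.
STYLE = """<style type="text/css">
    <!--/*--><![CDATA[/*><!--*/

    p > a { color: red; }

    /*]]>*/-->
</style>"""

SCRIPT = """<script type="text/javascript">
    <!--//--><![CDATA[//><!--

    if (a < b && b > 0) { run(); }

    //--><!]]>
</script>"""


def xml_view(markup):
    """Return the non-blank lines of element text as an XML parser reports them."""
    text = ET.fromstring(markup).text or ""
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for label, markup in [("style", STYLE), ("script", SCRIPT)]:
        print(label)
        for line in xml_view(markup):
            print("   ", line)
```

After the XML parser drops the leading comment and unwraps the CDATA section, the first and last remaining lines are ones the CSS or JavaScript engine skips (a /* */ comment or a // comment, plus a trailing --> that CSS ignores), and the code in the middle is untouched even though it contains <, > and &.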

Solution 5 - Cdata

Perhaps see: http://wiki.whatwg.org/wiki/HTML_vs._XHTML

> <![CDATA[...]]> is a bogus comment.

In HTML, <script> is already protected -- this is why sometimes it must be written as a = "<" + "/script>", to avoid confusing the browser. Note that the code is valid outside a CDATA in HTML.
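When the script text is generated on the server, an alternative to splitting the string by hand is to escape the slash, since "<\/" means the same as "</" inside a JavaScript string literal. This technique is not part of the answer above; the sketch assumes the value is emitted as a JSON string literal, and the helper name js_string_for_inline_script is hypothetical.

```python
import json


def js_string_for_inline_script(value):
    """Serialize value as a JS string literal that never contains '</'.

    Assumption: the result is placed inside an inline <script> element.
    JSON permits the escape sequence backslash-slash, so the text still
    decodes to the original value.
    """
    return json.dumps(value).replace("</", "<\\/")


if __name__ == "__main__":
    literal = js_string_for_inline_script("</script>")
    print("var a = " + literal + ";")
```

Because the literal no longer contains the character sequence "</", the HTML parser cannot mistake it for the end of the script element.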

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Darryl Hein | View Question on Stackoverflow
Solution 1 - Cdata | hollsk | View Answer on Stackoverflow
Solution 2 - Cdata | Joel Purra | View Answer on Stackoverflow
Solution 3 - Cdata | rmarscher | View Answer on Stackoverflow
Solution 4 - Cdata | Stokely | View Answer on Stackoverflow
Solution 5 - Cdata | user166390 | View Answer on Stackoverflow