At the end of the day, why choose XHTML over HTML?


Html Problem Overview


I wonder why I should use XHTML instead of HTML.

XHTML is supposed to be "modularized", but I haven't seen any server side language take advantage of any of that.

XHTML is also more strict, and I don't see the advantage. What does XHTML offer that I need so bad? How does it make my code "better"?

EDIT: another question I found in the comments: Does XHTML parse faster than HTML?

EDIT2: after reading all your comments and the links, I indeed agree that another post deserves to be the correct answer, so I chose the one that directly links to the best source.

Also, it goes to show that people upvote the green comment without even reading it.

Html Solutions


Solution 1 - Html

You should read Beware of XHTML, which is an informative article that warns about some of the pitfalls of XHTML over HTML.

I was pretty gung-ho about XHTML until I read it, but it does make several valid points, including the following:

> XHTML 1.x is not “future-compatible”. XHTML 2, currently in the drafting stages, is not backwards-compatible with XHTML 1.x. XHTML 2 will have lots of major changes to the way documents are written and structured, and even if you already have your site written in XHTML 1.1, a complete site rewrite will usually be necessary in order to convert it to proper XHTML 2. A simple XSL transformation will not be sufficient in most cases, because some semantics won't translate properly.
>
> HTML 4.01 is actually more future-compatible. A valid HTML 4.01 document written to modern support levels will be valid HTML 5, and HTML 5 is where the majority of attention is from browser developers and the W3C.

Future compatibility can be huge when working on some projects. The article goes on to make several other good points, but I think that may have stood out the most for me.

Don't mistake the article for a rant against XHTML; the author does talk about the good points of XHTML, but it is good to be aware of the shortcomings before you dive in.

Solution 2 - Html

I was going to add this as a comment to one of the other posts, but it grew a little too large.

The fundamental point that most people seem to be missing is the purpose behind XHTML. One of the major reasons for developing the XHTML specification was to de-emphasise presentation-related tags in the markup, and to defer presentation to CSS. Whilst this separation can be achieved with plain HTML, this behaviour isn't promoted by the specification.

Separating meta-markup and presentation is a vital part of developing for the 'programmable web'. It will not only improve SEO and access for screen readers and text browsers, but will also make your website easier to analyse for those wishing to access it programmatically (in many simple cases this can negate the need for developing a specific API, or simply allow client-side scripts to do things like readily identify phone numbers). If your web page conforms to the XHTML specification, it can easily be traversed using XML-related tools such as XPath, which is fantastic news for those who want to extract particular information from your website.
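As a rough illustration of the XPath point, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The page content, the `tel` class name and the `extract_phone_numbers` helper are all assumptions made up for this example, not anything from a real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical page; the "tel" class name is an assumption made for this example.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Contacts</title></head>
  <body>
    <p>Sales: <span class="tel">+1 555 0100</span></p>
    <p>Support: <span class="tel">+1 555 0199</span></p>
    <p>Fax is <span class="note">no longer available</span>.</p>
  </body>
</html>"""

# XHTML elements live in the XHTML namespace, so queries must name it.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def extract_phone_numbers(xhtml_text):
    """Return the text of every <span class="tel"> in a well-formed XHTML page."""
    root = ET.fromstring(xhtml_text)
    return [span.text for span in root.findall(".//x:span[@class='tel']", NS)]


phone_numbers = extract_phone_numbers(PAGE)
print(phone_numbers)
```

This only works because the page is well-formed XML; the same query against tag-soup HTML would need an HTML-specific parser first.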

XHTML was not developed for use by itself, but for use with a variety of other technologies. It relies heavily on the use of CSS for presentation, and lays a foundation for things like Microformats (whether you love them, or hate them) to offer a standardised markup for common data presentation.

Don't be fooled by the crowd who think that XHTML is insignificant, and is just overly restrictive and pointless... it was created with a purpose that 95% of the world seems to ignore/not know about.

By all means use HTML, but use it for what it's good for, and take the same approach when looking at XHTML.


With regard to parsing speed, I imagine there would be very little difference in the parsing of the actual documents between XHTML and HTML. The trade-off will come purely in how you describe the document using the available markup. XHTML tags tend to be longer, due to required attributes, proper closing, etc. but will forego the need for any presentational markup in the document itself. With that being the case, I think you're talking about comparing one type of apple, with a very slightly different type of apple... they're different, but it's unlikely to be of any consequence (in terms of parsing and rendering) when all you want is a healthy, tasty apple.

Solution 3 - Html

For the visitor of a website it probably doesn't make any visible difference. Furthermore, XHTML is usually more of a pain to use as at least one widespread browser still doesn't know how to handle it and you need to serve it as text/html in that case (which yields invalid HTML).

If your HTML is going to be regularly processed by automated tools instead of being read by humans, then you might want to use XHTML: its structure is stricter and, being XML, it is easier to parse from an application standpoint (not that XML is inherently easy to parse, though).

Apart from that I don't see any compelling reasons to use it, though. XHTML was created as an attempt to make use of XML features in HTML, and basically it boils down to "HTML 4 with several annoying side-effects" (IMHO, at least).

Solution 4 - Html

Use HTML (HTML4 Strict or HTML5).

  • HTML can fully utilize CSS, can be validated and parsed unambiguously. Separation of structure and presentation has been done in HTML4 and XHTML merely continued that.

  • All browsers support HTML. Only some browsers support XHTML, and those that do often have more mature, better tested and better optimized support for HTML (because only a tiny fraction of pages use XML mode).

  • If you care about IE and Google, you have to use HTML, or the subset of XHTML and HTML defined in Appendix C of the XHTML spec. The latter is almost the worst of both worlds, because such XHTML cannot be generated with standard XML tools, cannot use the extension mechanisms new to XHTML, and has additional limitations over those in HTML alone.

  • XHTML1.0 is now over 10 years old; it was designed in "Web1.0" times, and as the head of W3C said, in retrospect it didn't work out and a better approach is needed. W3C HTML5 is being written as we speak, addresses the needs of the web applications used today, and has very good backwards compatibility.

  • HTML5 closes many gaps that existed between HTML4 and XHTML1 (e.g. it adds inline SVG, MathML and RDF) and cleans up the language beyond what was done in XHTML1.0 and XHTML1.1.

  • XHTML2 is not going to be supported by web browsers in the foreseeable future. It's likely that it will never be supported (all browser vendors heavily support [X]HTML5, and some have already declared that they won't implement XHTML2).


XHTML1.0 has exactly the same semantics and separation of presentation from structure as HTML4.01. Anybody who says otherwise hasn't read the specification. I encourage everybody to read the spec – it's surprisingly short and uninteresting.

  • Stylesheets were introduced in HTML4.01 and were not changed in XHTML1.0.
  • Presentational elements were deprecated in HTML4.01 and were not removed in XHTML1.0.

XHTML myths.


There are no intractable differences between HTML and XHTML that would make parsing of one much slower than the other. It depends on how the parser is implemented.

  • Both SGML and XML parsers need to load and parse the entire DTD in order to understand entities. This alone is usually more work than parsing the document itself. HTML parsers almost always "cheat" and use hardcoded entities and element information. XHTML parsers in browsers cheat too.
  • Parsing of HTML requires handling of implied start and end tags, and real-world HTML requires additional work to handle misplaced tags.
  • Proper parsing of XHTML requires tracking of XML namespaces.
  • Draconian XML rules require checking that every character is properly encoded. HTML parsers may get away without this check, but OTOH they need to look for <meta>.

The overall difference in the cost of parsing is tiny compared to the time it takes to download the document, build the DOM, run scripts, apply CSS and do all the other things browsers have to do.
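The entity and namespace points above can be seen with a generic XML parser. This is a small sketch using Python's `xml.etree.ElementTree`, which does not load the XHTML DTD; the `try_parse` helper is a name invented for this example, and browsers' own parsers behave differently because, as noted, they hardcode the entities.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def try_parse(markup):
    """Parse markup as XML; return the root element, or None if parsing fails."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


# &nbsp; is defined by the (X)HTML DTDs, not by XML itself, so a generic
# XML parser that has not loaded a DTD rejects it as an undefined entity.
with_named_entity = try_parse(f'<p xmlns="{XHTML_NS}">a&nbsp;b</p>')

# A numeric character reference needs no DTD at all.
with_numeric_ref = try_parse(f'<p xmlns="{XHTML_NS}">a&#160;b</p>')

print(with_named_entity)            # parsing failed, so this is None
print(with_numeric_ref.tag)         # the tag name carries the namespace URI
print(repr(with_numeric_ref.text))  # contains a real no-break space character
```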

Solution 5 - Html

I'm surprised that all the answers here recommend XHTML over HTML. I am firmly of the opposite opinion - you should not use XHTML, for the foreseeable future. Here's why:

  • No browser interprets XHTML as XHTML unless you serve it as mimetype application/xhtml+xml. If you just serve it with the default mimetype, all browsers will interpret it as HTML, e.g. accepting unclosed or improperly nested elements.

  • However, you should never actually do this, as Internet Explorer does not recognise application/xhtml+xml, and would fail to render the page completely.

  • There are significant differences in the DOM between XHTML and HTML. Since all so-called XHTML pages are being served as HTML at the moment, all javascript code is written using the HTML DOM. If support for the XHTML mimetype becomes significant enough to convince people to start using it, most of their javascript code will break, even if they think their pages validate as XHTML.

Solution 6 - Html

Instead of continuing to debate HTML 4.01 Strict vs XHTML Strict, I would suggest starting to use HTML 5 today. John Resig, the author of jQuery, made a similar suggestion last year on his blog.

The HTML 5 doctype, in its beautiful simplicity, will trigger standards mode in all browsers (including IE6).

<!DOCTYPE html>

That's it.

HTML 5 provides some exciting new features such as the <canvas> tag which potentially can push javascript application development to the next level. HTML 5 also has proper support for media (and media is a fairly important aspect of the web these days!) in the form of <video> and <audio> tags.

If you like the syntax of XHTML, i.e. closing "empty" tags such as <br />, that is fully supported in HTML 5. From Karl Dubost of the W3C's post Learn How To Write HTML 5:

> auto-closing tag is allowed and conformant in HTML 5.

XHTML2 has received relatively little attention compared to HTML 5. It's becoming increasingly clear that HTML 5 is the future of markup on the web. Microsoft's latest browser, IE8, still renders XHTML served as text/xml as text/html.

Microsoft have a co-chair on the W3C HTML working group, and there's implied support from them for HTML 5. All of the browser vendors have publicly announced their support for HTML 5.

At the end of the day, even if XHTML2 regains support from the industry, it won't be a significant issue having two competing standards as it has been in the past. Both languages support XML namespaces (in the case of HTML 5, serialization of HTML i.e. DOCTYPE switching).

Solution 7 - Html

Take a look at http://www.w3.org/MarkUp/2004/xhtml-faq#need. There are some good reasons apart from modularisation.

I favor XHTML because it's stricter and more clearly laid out. HTML is quirky and browsers have to accept things like <b><i>sadasd</b></i>. While this is a really simple example, it could also get more confusing and different browsers could lay out things differently.

Also I think that XHTML has to be "faster" since the browser doesn't have to do that kind of "repair".
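To make the mis-nested example above concrete, here is a small sketch comparing a strict XML parser with a lenient HTML tokenizer, both from Python's standard library. Note the assumption: `html.parser.HTMLParser` only reports tags as it sees them and does not implement the error recovery that browsers perform, so this shows acceptance versus rejection, not how any browser builds its tree. `TagLogger` and `is_well_formed_xml` are names invented for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

MISNESTED = "<p><b><i>sadasd</b></i></p>"


class TagLogger(HTMLParser):
    """Records tag events in the order the lenient HTML tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def is_well_formed_xml(markup):
    """True if a strict XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


logger = TagLogger()
logger.feed(MISNESTED)
logger.close()

print(is_well_formed_xml(MISNESTED))  # the XML parser refuses the document
print(logger.events)                  # the HTML tokenizer reports every tag
```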

Solution 8 - Html

As a programmer, you should be VERY concerned about your code. HTML is ugly and follows few rules.

XHTML, on the other hand, turns HTML into a proper language, following strict structural and syntactic rules.

XHTML is better for everyone, as it will help move the web to a point where everyone (all browsers) can agree on how to display a web page.

XHTML is an XML descendant, and as such is much easier on parsers built for the job of analysing syntactically sound XML documents.

If you can't see the benefit of XHTML, you might as well be using MS Word to create your HTML documents.

Solution 9 - Html

Some differences are:

  • XHTML tags must be properly nested
  • The documents must have one root element
  • XHTML tags are always in lowercase
  • Tags must always be closed (e.g. the <br> tag must be written as <br /> or <br></br> in XHTML)
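The rules in this list are XML well-formedness rules, so a generic XML parser can check them. A minimal sketch with Python's `xml.etree.ElementTree` follows; `parse_or_none` is a hypothetical helper, and the snippets are fragments rather than full XHTML documents.

```python
import xml.etree.ElementTree as ET


def parse_or_none(markup):
    """Return the parsed root element, or None if the markup is not well-formed XML."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


unclosed = parse_or_none("<p>one<br>two</p>")            # <br> is never closed
self_closed = parse_or_none("<p>one<br />two</p>")       # closed with <br />
explicit_close = parse_or_none("<p>one<br></br>two</p>")  # closed with </br>
two_roots = parse_or_none("<p>one</p><p>two</p>")        # more than one root element
mixed_case = parse_or_none("<P>one</p>")                 # XML names are case-sensitive

print(unclosed, two_roots, mixed_case)
# Both closed forms produce the same element tree.
print(ET.tostring(self_closed) == ET.tostring(explicit_close))
```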

Here are some links on it

wiki XHTML

wiki HTML vs XHTML

Solution 10 - Html

XHTML allows you to use all the tools designed for XML. Among them are XSLT, embedded SVG, etc.

Solution 11 - Html

Interesting development: XHTML 2 Working Group Expected to Stop Work End of 2009, W3C to Increase Resources on HTML 5 (http://www.w3.org/News/2009#item119)

> 2009-07-02: Today the Director announces that when the XHTML 2 Working Group charter expires as scheduled at the end of 2009, the charter will not be renewed. By doing so, and by increasing resources in the Working Group, W3C hopes to accelerate the progress of HTML 5 and clarify W3C's position regarding the future of HTML. A FAQ answers questions about the future of deliverables of the XHTML 2 Working Group, and the status of various discussions related to HTML. Learn more about the HTML Activity.

Well, I guess that makes the future of HTML pretty clear.

Solution 12 - Html

XHTML forces you to be neat.

For example, in HTML, you can write:

<img src="image.jpg">

This isn't very logical, because the img tag never gets closed. In XHTML, however, you're forced to close the tag neatly, like this:

<img src="image.jpg" />

I like using something that forces me to be neat.

Steve

Solution 13 - Html

The subtitle to the XHTML 1.0 recommendation:

> A Reformulation of HTML 4 in XML 1.0

Many tools exist today to process XML. By using XHTML, you are allowing a huge set of tools to operate on your pages and to extract information programmatically.

If you were to use HTML, this would be possible too. There are tools in existence to parse HTML DOM trees. However, these tools can often be more specialized than those for XML. You may not find your favorite XML data processing tools compatible with HTML. Furthermore, there are so many uses for XML nowadays that you may be using XML for some other part of an application; why not also use that same XML parser to parse your web pages? This is the motivation behind XHTML.

If you're already comfortable and familiar with HTML 4.01, you have an established project using HTML 4, and you don't have tons of spare time, just go with HTML 4.01. If you have spare time, learn XHTML 1.1 anyway, and start your new projects in XHTML 1.1 – there's no harm in doing so. If you're using something other than HTML 4.01 or are pretty unfamiliar with HTML 4 anyway, just learn XHTML 1.1.

Solution 14 - Html

Using XHTML with the correct DocType will force the browser to render the content in a more standards-compliant (strict) mode. This makes the different browsers behave better and, most importantly, more like each other. This makes your job as a web developer a lot easier since it reduces the number of browser-specific tweaks needed to make the content look the same in all browsers.

Quirksmode.org has a lot of good info on this subject.

Solution 15 - Html

In my opinion, the strictness is, at least in theory, a good thing. Because HTML doesn't require you to be strict (and because of the HTML5 junk), browsers have advanced error correction algorithms that make the best of broken HTML. The problem is that the algorithms are not exactly the same, which leads to really strange behaviour you can't predict. With XHTML, on the other hand, you typically have fine, valid XHTML, so the error correction algorithms are not needed and the entire browser behaviour is predictable. In addition, strict code makes it easier for your tools to work with the code. So you actually have nothing to lose by using XHTML, but there is some potential to gain. Things will get worse with plain HTML when HTML5 is finally out and the "be open in what you accept" principle leads to the strange behaviour described above. But at least then it's a standardized strange behaviour. Sigh.

On the other hand, if you use a good IDE like Visual Studio, it's almost impossible to produce broken HTML code anyway, so the result is the same.

Solution 16 - Html

Use XHTML

  • Fails fast. If there are any inconsistencies they will be found during validation.
  • It encourages better design by separating semantic markup from presentation etc.
  • It's structured which means that you can treat it as a data object and run all sorts of queries against it. For example you could find all addresses or citations within your website.
  • You can do build-time optimizations. Since it's well-formed XML, you can easily do find/replace operations during build time, or any other document management and manipulation.
  • You can write XSLT or other transformation scripts to programmatically transform your XHTML for other platforms. For example, you could have an XSLT for the iPhone that would transform all XHTML to make it compatible or more user-friendly for the iPhone.
  • You are future proofing yourself. Transforming XHTML to newer semantics is again, very easy using transformation.
  • Search engines will continue to evolve to gather more semantic information as part of the programmable web.
  • DOM operations are more reliable since it's structured.
  • From an algorithmic perspective, it yields easier and faster parsing.
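As one possible reading of the build-time point in the list above, here is a minimal sketch that rewrites image URLs in a well-formed XHTML page using Python's `xml.etree.ElementTree`. The CDN prefix, the sample page and the `rewrite_image_sources` helper are all hypothetical, chosen only to show the idea.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements with a default xmlns instead of an ns0: prefix.
ET.register_namespace("", XHTML_NS)

# Hypothetical CDN prefix, invented for this example.
CDN_PREFIX = "https://cdn.example.com"


def rewrite_image_sources(xhtml_text, prefix=CDN_PREFIX):
    """Point every root-relative <img src="/..."> at the given prefix."""
    root = ET.fromstring(xhtml_text)
    for img in root.iter(f"{{{XHTML_NS}}}img"):
        src = img.get("src", "")
        if src.startswith("/"):
            img.set("src", prefix + src)
    return ET.tostring(root, encoding="unicode")


PAGE = (
    f'<html xmlns="{XHTML_NS}"><body>'
    '<img src="/logo.png" alt="Logo" />'
    '<img src="http://example.org/a.png" alt="External" />'
    "</body></html>"
)

rewritten = rewrite_image_sources(PAGE)
print(rewritten)
```

The same kind of pass could be written as an XSLT stylesheet; plain tree manipulation is used here only because it needs nothing outside the standard library.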

Solution 17 - Html

XHTML is a good starting point: if you want valid code, you have to provide some help to the disabled community, because screen readers need the alt and title attributes of the image and link tags. It must be faster to parse to some extent because, unlike with HTML, the parser doesn't need to check whether a tag was closed properly, whether it was nested correctly, etc. It is also better to use because, yes, it is strict, but that helps you to think more logically (in my opinion) when it comes to learning programming languages.

Solution 18 - Html

I believe XHTML is (or should be) faster to parse. A valid XHTML document must be written to a stricter spec in that errors are fatal when parsing, whereas HTML is more lenient and allows for oddities mentioned before my comment like out of order closing tags and such. I found this helpful in uncovering the differences between HTML and XHTML parsing:

http://wiki.whatwg.org/wiki/HTML_vs._XHTML#Parsing

A reason you might use XHTML over HTML might be if you intend to have mobile users as part of your audience. If I recall, many phones use something more of an XML parser, rather than an HTML one to display the web. If you are writing for desktop browsers, HTML would probably be acceptable.

That said, if you are going to serve the data as text/html anyway, you should use HTML:

http://www.hixie.ch/advocacy/xhtml

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | KdgDev | View Question on Stackoverflow |
| Solution 1 - Html | James McMahon | View Answer on Stackoverflow |
| Solution 2 - Html | James B | View Answer on Stackoverflow |
| Solution 3 - Html | Joey | View Answer on Stackoverflow |
| Solution 4 - Html | Kornel | View Answer on Stackoverflow |
| Solution 5 - Html | Daniel Roseman | View Answer on Stackoverflow |
| Solution 6 - Html | Bayard Randel | View Answer on Stackoverflow |
| Solution 7 - Html | lx. | View Answer on Stackoverflow |
| Solution 8 - Html | Antony Carthy | View Answer on Stackoverflow |
| Solution 9 - Html | kevchadders | View Answer on Stackoverflow |
| Solution 10 - Html | Pierre | View Answer on Stackoverflow |
| Solution 11 - Html | Alec | View Answer on Stackoverflow |
| Solution 12 - Html | Steve Harrison | View Answer on Stackoverflow |
| Solution 13 - Html | Wesley | View Answer on Stackoverflow |
| Solution 14 - Html | Marnix van Valen | View Answer on Stackoverflow |
| Solution 15 - Html | OregonGhost | View Answer on Stackoverflow |
| Solution 16 - Html | aleemb | View Answer on Stackoverflow |
| Solution 17 - Html | Marc Towler | View Answer on Stackoverflow |
| Solution 18 - Html | cm2 | View Answer on Stackoverflow |