Why doesn't Node.js have a native DOM?

Javascript Problem Overview


When I discovered that Node.js was built using the V8 JavaScript engine, I thought:

> Great, web scraping will be easier as the page will be rendered like in the browser, with a "native" DOM supporting XPath and any AJAX calls on the page executed.

  1. Why doesn't it have a native DOM when it uses the same JavaScript engine as Chrome?
  2. Why doesn't it have a mode to run JavaScript in retrieved pages?
  3. What am I not understanding about JavaScript engines vs the engine in a web browser?

Many thanks!

Javascript Solutions


Solution 1 - Javascript

The DOM is the DOM, and the JavaScript implementation is simply a separate entity. The DOM represents a set of facilities that a web browser exposes to the JavaScript environment. There's no requirement however that any particular JavaScript runtime will have any facilities exposed via the global object.

Node.js is a stand-alone JavaScript environment, completely independent of a web browser. There's no intrinsic link between web browsers and JavaScript; the DOM is not part of the JavaScript language or its specification.

I use the old Rhino Java-based JavaScript implementation in my Java-based web server. That environment also has nothing at all to do with any DOM. It's my own application that's responsible for populating the global object with facilities to do what I need it to be able to do, and it's not a DOM.
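The same idea can be tried from inside Node.js with its built-in `vm` module, where your own script plays the role of the host. This is only a minimal sketch; the names `greeting` and `log` are made up for the example and are not part of any API.

```javascript
const vm = require('node:vm');

// The host (this script) decides what the guest code gets to see.
// `greeting` and `log` are hypothetical names chosen for this sketch.
const messages = [];
const sandbox = {
  greeting: 'hello from the host',
  log: (msg) => messages.push(msg),
};
vm.createContext(sandbox);

// Language built-ins such as JSON exist in every context,
// but `document` is only there if the host adds it.
const guestCode = `
  log(typeof JSON);
  log(typeof document);
  log(greeting);
`;
vm.runInContext(guestCode, sandbox);

console.log(messages); // [ 'object', 'undefined', 'hello from the host' ]
```

The guest code sees the language built-ins plus exactly what the host put into the sandbox. `document` comes back as `undefined` because nobody supplied one.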

Note that there are projects like jsdom if you want a DOM implementation in your Node project. Because Node is by nature a server-side platform, it can do without a DOM and still make perfect sense for a wide variety of server applications. That's not to say that a DOM might not be useful to some people, but it's just not in the same category of services as process control, I/O, networking, database interop, and so on.

There may be some "official" answer to the question "why?" out there, but it's basically just the business of those who maintain Node (originally the Node Foundation, now the OpenJS Foundation). If some intrepid developer out there decides that Node should ship by default with a set of modules to support a DOM, and puts in the sustained work to make that happen, then Node will have a DOM.

Solution 2 - Javascript

When reading this question I was also wondering whether V8 (which node.js is built on top of) had a DOM.

> Why when it uses the same JS engine as Chrome doesn't it have a native DOM?

I searched Google and found Google's V8 page, which states the following:

> JavaScript is most commonly used for client-side scripting in a browser, being used to manipulate Document Object Model (DOM) objects for example. The DOM is not, however, typically provided by the JavaScript engine but instead by a browser. The same is true of V8: Google Chrome provides the DOM. V8 does however provide all the data types, operators, objects and functions specified in the ECMA standard.

node.js uses V8 and not Google Chrome.
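A quick way to see this split is to ask a plain Node.js process which global names it knows. The sketch below is only illustrative; the two lists are my own small sample, not an exhaustive inventory of either the ECMAScript standard or the browser APIs.

```javascript
// Names specified by ECMAScript: V8 itself provides these.
const languageBuiltins = ['Array', 'Promise', 'JSON', 'Map'];
// Names a browser adds on top: expected to be missing in plain Node.js.
const browserGlobals = ['document', 'window', 'HTMLElement', 'NodeList'];

function isDefined(name) {
  return typeof globalThis[name] !== 'undefined';
}

const report = {
  language: languageBuiltins.filter(isDefined),
  browser: browserGlobals.filter(isDefined),
};

console.log(report);
// { language: [ 'Array', 'Promise', 'JSON', 'Map' ], browser: [] }
```

Run in a browser console instead, the second list would come back fully populated, because there the browser supplies those objects.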

> Likewise, why doesn't it have a mode to run JS in retrieved pages?

I also think we don't really need it that badly. Ryan Dahl created node.js single-handedly. Maybe he (or his team) will develop this at some point, but I was already amazed by the amount of code he produced. He wanted to make an easy, efficient non-blocking library, and I think he did a mighty good job of it.

But then again, another developer created a module which is pretty good and actively developed (today) at https://github.com/tmpvar/jsdom.

> What am I not understanding about Javascript engines vs the engine in a web browser? :)

Those are different things as is hopefully clear from the quote above.

Solution 3 - Javascript

The Document Object Model (DOM for short) is a programming interface for HTML and XML documents. It represents the page so that programs can change the document structure, style, and content.


The distinction between client-side (browser) and server-side (Node.js) matters here, because the two have different main goals:

  • Client-side: accessing and displaying information from the web
  • Server-side: providing stable and reliable ways to deliver web information

Why is there no DOM in Node.js by default?

By default, Node.js has no access to, and no knowledge of, the actual DOM in your browser. Node.js just delivers the data; your browser then processes it and renders the whole website, the DOM included. That is the intended division of labor.

Why wouldn't you want to access the DOM in Node.js?

Accessing your browser's actual DOM from Node.js would simply be outside the server's purpose. Your browser's role is to display the data coming from the server. However, it is certainly possible, and there are solutions of varying depth to pre-render, manipulate or change the DOM, for example with AJAX calls. We'll see what future trends will bring.

Why would you want to access the DOM in Node.js?

By default, you shouldn't access your own, actual DOM (or at least some of its data) from Node.js. Client-side and server-side are separated in terms of role, functionality, and responsibility, and that split rests on years of experience. There are, however, several situations where there are solid reasons to do so:

  • Gathering usage data (A/B testing, UI/UX efficiency and feedback)
  • Headless testing (Development, automation, web-scraping)

How can you access the DOM in Node.js?

  • jsdom: pure-JavaScript implementation, good for testing your own DOM/browser-related project
  • cheerio: great solution if you like/often use jQuery
  • puppeteer: Google's own way to provide headless testing using Google Chrome
  • own solution (your possible future project link here)

These solutions do not give you access to your browser's own, actual DOM by default, but you can create a project that sends some form of data about your DOM to the server, then use, render, or manipulate that data based on your needs.
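Of the libraries listed above, jsdom is probably the quickest to try for the scraping use case from the question. Below is a rough sketch rather than a recommended setup: jsdom is a third-party package (`npm install jsdom`), `extractLinks` is a hypothetical helper name, and the guarded `require` is there only so the snippet still runs where jsdom is not installed.

```javascript
// jsdom is not part of Node.js; it has to be installed separately.
let JSDOM = null;
try {
  ({ JSDOM } = require('jsdom'));
} catch (err) {
  JSDOM = null; // jsdom is not available in this environment
}

// Hypothetical helper: parse an HTML string and collect the href of every link.
function extractLinks(html) {
  if (!JSDOM) return null;
  const { document } = new JSDOM(html).window;
  return Array.from(document.querySelectorAll('a'), (a) => a.getAttribute('href'));
}

const html = '<ul><li><a href="/one">One</a></li><li><a href="/two">Two</a></li></ul>';

// With jsdom installed this should list '/one' and '/two'; without it, null.
console.log(extractLinks(html));
```

Note that jsdom does not execute the scripts found in a page unless you opt in through its `runScripts` option, so this parses the markup as delivered and nothing more.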

...and yes, the tools and utilities for web scraping and web development have become more sophisticated, and the work has certainly become easier in several areas.

Solution 4 - Javascript

node.js chose not to include it in their standard library. For any functionality, there is an inevitable tradeoff between comprehensiveness, scalability, and maintainability.

That doesn't mean it's not potentially useful. There is at least one JavaScript DOM implementation intended for NodeJS (among other CommonJS implementations).

Solution 5 - Javascript

You seem to have a flawed assumption that V8 and the DOM are inextricably related; that's not the case. The DOM is actually handled by the browser's rendering engine (WebKit when this was written; Chrome has since switched to Blink). V8 doesn't implement the DOM; it only executes the JavaScript calls into it. Don't let this discourage you. Node.js has carved out a significant niche in the realtime server market, but don't let anybody tell you it's just for servers. Node makes it possible to build almost anything with JavaScript.

It is possible to do what you're talking about. For example, there is the very good jsdom library if you really need access to the DOM, as well as node-htmlparser. There are also some really good scraping libraries that take advantage of these, like apricot.

Solution 6 - Javascript

2018 answer: mainly for historical reasons, but this may change in future.

Historically, very little DOM manipulation was done on the server. Additionally, as other answers allude to, the JS stdlib and the DOM are separate libraries: if you're using node for, say, Unix scripting, then HTMLElement, NodeList and so on aren't really relevant.

However, server-side DOM manipulation is now a very common part of delivering web apps. Web servers need to understand the structure of pages and, if asked to render a resource as HTML, deliver HTML content that reflects the initial state of a web application. This means web apps load much faster than if the server simply delivers a stub page and leaves the browser to fill in the real content. Currently this is done with jsdom and similar libraries, but in the same way node has Request and Response objects built in, having DOM functions maintained as part of the stdlib would help with this task.
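To make the stub-versus-rendered distinction concrete, here is a small sketch. It deliberately uses plain template strings instead of jsdom or any other DOM library, and `renderStub`, `renderWithState`, the `todos` list and `/app.js` are all made-up names for illustration.

```javascript
// Escape text so it is safe to place inside HTML element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// A stub page: the browser has to fetch the data and build the list itself.
function renderStub() {
  return '<ul id="todos"></ul><script src="/app.js"></script>';
}

// A server-rendered page: the initial state is already in the markup.
function renderWithState(todos) {
  const items = todos.map((todo) => `<li>${escapeHtml(todo)}</li>`).join('');
  return `<ul id="todos">${items}</ul><script src="/app.js"></script>`;
}

console.log(renderStub());
console.log(renderWithState(['Buy milk', 'Fix <b> bug']));
// <ul id="todos"><li>Buy milk</li><li>Fix &lt;b&gt; bug</li></ul><script src="/app.js"></script>
```

String templates like this get awkward once the server needs to query or modify an existing page structure, which is the point at which a real DOM implementation on the server starts to pay off.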

Solution 7 - Javascript

Javascript != browser. Javascript as a language is not tied to browsers; node.js is simply an implementation of Javascript that is intended for servers, not browsers. Hence no DOM.

Solution 8 - Javascript

If you read DOM as 'linked objects immediately accessible from my script', then the answer is 'it does, but they are very different from the set of objects available to a script in a web document'. The main reason is that node is 'evented I/O for V8', not 'HTML tree objects for V8'.
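For a feel of what those 'linked objects' are in Node, the following sketch inspects a few of the globals Node provides in place of a document tree. The three names are just a sample I picked, not a complete list.

```javascript
// A few objects Node.js puts on the global object instead of a document tree.
const nodeGlobals = ['process', 'Buffer', 'setImmediate'];

const kinds = Object.fromEntries(
  nodeGlobals.map((name) => [name, typeof globalThis[name]])
);

console.log(kinds);
// { process: 'object', Buffer: 'function', setImmediate: 'function' }
```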

Solution 9 - Javascript

Node is a runtime environment; it does not render a DOM like a browser does.

Solution 10 - Javascript

Because there isn't a DOM. DOM stands for Document Object Model. There is no document in Node, so there is no DOM to manipulate. That is definitely a browser thing.

You can use a library like cheerio though which gives you some simple DOM manipulation.

Node is server-level JavaScript. It's just the language applied to a basic system API, more like C++ or Java.

Solution 11 - Javascript

It seems people have answered 'why' but not 'how'. A quick answer to 'how' is that a web browser exposes a document object (hence DOM, Document Object Model). On Windows this object is called the document object, and its documentation lists the methods it exposes for handling HTML documents, such as createElement. I don't use node.js and haven't done COM programming in a while, but I'd imagine you could use the DOM in node.js by calling the COM object IHTMLDocument3. Of course, for other platforms like Mac OS X or Linux you would probably have to use something from their OS API. This should allow you to build a webpage server-side using the DOM, or to scrape incoming web pages.

Solution 12 - Javascript

Node.js is for serverside programming. There is no DOM to be rendered in the server.

Solution 13 - Javascript

  1. What does it mean for it to have a Document Object Model? There's no document to represent.

  2. Most of the time you're not retrieving pages. You can, but most Node apps probably won't be.

  3. Without a document and a browser, Javascript is just another programming language. You might as well ask why there isn't a DOM in C# or Java.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | PeterB | View Question on Stackoverflow |
| Solution 1 - Javascript | Pointy | View Answer on Stackoverflow |
| Solution 2 - Javascript | Alfred | View Answer on Stackoverflow |
| Solution 3 - Javascript | user7637745 | View Answer on Stackoverflow |
| Solution 4 - Javascript | Matthew Flaschen | View Answer on Stackoverflow |
| Solution 5 - Javascript | Omni5cience | View Answer on Stackoverflow |
| Solution 6 - Javascript | mikemaccana | View Answer on Stackoverflow |
| Solution 7 - Javascript | Paul Sonier | View Answer on Stackoverflow |
| Solution 8 - Javascript | Andrey Sidorov | View Answer on Stackoverflow |
| Solution 9 - Javascript | zanmato | View Answer on Stackoverflow |
| Solution 10 - Javascript | samanime | View Answer on Stackoverflow |
| Solution 11 - Javascript | user2074102 | View Answer on Stackoverflow |
| Solution 12 - Javascript | hugomg | View Answer on Stackoverflow |
| Solution 13 - Javascript | Davy8 | View Answer on Stackoverflow |