Java: How to locate an element via an XPath string on org.w3c.dom.Document

Java, DOM, XPath

Java Problem Overview


How do you quickly locate one or more elements via an XPath string on a given org.w3c.dom.Document? There seems to be no FindElementsByXpath() method. For example:

/html/body/p/div[3]/a

I found recursively iterating through all the child node levels to be quite slow when there are a lot of elements with the same name. Any suggestions?

I cannot use any parser or library; I must work with the W3C DOM document only.

Java Solutions


Solution 1 - Java

Try this:

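import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Obtain the Document somehow; it doesn't matter how
DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
org.w3c.dom.Document doc = b.parse(new FileInputStream("page.html"));

// Evaluate the XPath expression against the Document itself
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xPath.evaluate("/html/body/p/div[3]/a",
		doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); ++i) {
	Element e = (Element) nodes.item(i); // each match is an <a> element
}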

With the following page.html file (it must be well-formed XML for DocumentBuilder to parse it):

<html>
  <head>
  </head>
  <body>
  <p>
    <div></div>
    <div></div>
    <div><a>link</a></div>
  </p>
  </body>
</html>
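
For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same approach. It only uses the JDK's built-in javax.xml.parsers and javax.xml.xpath packages. The class name XPathLookup and the helpers parse and find are hypothetical names chosen for illustration, and the markup is passed in as a string instead of being read from page.html. The expression is compiled into an XPathExpression first, which may help if the same expression is evaluated many times, though that is an assumption and not something measured here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class XPathLookup {

    // Hypothetical helper: parses well-formed markup held in a string
    // into an org.w3c.dom.Document using the JDK's own parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Hypothetical helper: compiles the expression and returns every
    // node it matches when evaluated from the document root.
    static NodeList find(Document doc, String expression) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        XPathExpression compiled = xPath.compile(expression);
        return (NodeList) compiled.evaluate(doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        // Same structure as page.html above, inlined as a string.
        String page = "<html><head></head><body><p>"
                + "<div></div><div></div><div><a>link</a></div>"
                + "</p></body></html>";
        Document doc = parse(page);
        NodeList nodes = find(doc, "/html/body/p/div[3]/a");
        for (int i = 0; i < nodes.getLength(); i++) {
            // Prints the element name and its text content
            System.out.println(nodes.item(i).getNodeName() + ": " + nodes.item(i).getTextContent());
        }
    }
}
```

Run against the markup above, the expression matches the single a element inside the third div. An expression that matches nothing yields an empty NodeList, not null, so the loop is simply skipped.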

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | KJW | View Question on Stack Overflow
Solution 1 - Java | Tomasz Nurkiewicz | View Answer on Stack Overflow