I've built a method which extracts data from an HTML document using the XPath components of Saxon-HE. I'm using the W3C DOM object model for this.
I have already created a method which returns the text value of a node, similar to the text method from jsoup (jsoupElement.text()):
    protected String getNodeValue(Node node) {
        // Return the value of the first direct child that is a text node.
        NodeList childNodes = node.getChildNodes();
        for (int x = 0; x < childNodes.getLength(); x++) {
            Node data = childNodes.item(x);
            if (data.getNodeType() == Node.TEXT_NODE) {
                return data.getNodeValue();
            }
        }
        // No text child found.
        return "";
    }
This works fine, but now I need the underlying HTML of a selected node (with jsoup it would be jsoupElement.html()). Using the W3C DOM object model, I have an org.w3c.dom.Node. How can I get the HTML from an org.w3c.dom.Node as a String? I couldn't find anything regarding this in the documentation.
Just for clarification: I need the inner HTML (with or without the node's own element/tag) as a String, similar to http://api.jquery.com/html/ or http://jsoup.org/apidocs/org/jsoup/nodes/Element.html#html--
<xsl:output method="html"/> or <xsl:output method="xhtml"/>) so you could use a Transformer with a stylesheet setting the method as needed. Perhaps the API offers some way as well.
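A minimal sketch of the Transformer approach, assuming the standard JAXP API (javax.xml.transform) is acceptable: instead of a stylesheet containing xsl:output, an identity Transformer can be given the output method through setOutputProperty. The class name NodeHtml and the methods outerHtml and innerHtml are hypothetical names chosen for illustration. The exact markup produced may vary slightly depending on which TransformerFactory implementation (the JDK built-in one or Saxon) is picked up from the classpath.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of Saxon or the JDK.
final class NodeHtml {
    private NodeHtml() {}

    // Identity transformer configured like <xsl:output method="html"/>.
    private static Transformer htmlSerializer() throws TransformerException {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "html");
        t.setOutputProperty(OutputKeys.INDENT, "no");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        return t;
    }

    // Serializes the node including its own tag (roughly jsoup's outerHtml()).
    static String outerHtml(Node node) throws TransformerException {
        StringWriter out = new StringWriter();
        htmlSerializer().transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Serializes only the children of the node (roughly jsoup's html()).
    static String innerHtml(Node node) throws TransformerException {
        Transformer t = htmlSerializer();
        StringWriter out = new StringWriter();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            // Each child is appended to the same writer, in document order.
            t.transform(new DOMSource(children.item(i)), new StreamResult(out));
        }
        return out.toString();
    }
}
```

With a Node selected by the XPath evaluation, NodeHtml.innerHtml(node) would give the markup of its children and NodeHtml.outerHtml(node) the markup including the element itself. Switching the METHOD property to "xhtml" is only an option if the underlying transformer supports that method (Saxon does; the JDK built-in implementation may not).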