
I'm having trouble taking a DOM document (or a node in it) and serializing it as well-formed XML. I need this because the tools I will upload parts of the document to understand only XML, not HTML with its unclosed elements. As an example, one of the many sites I'm currently scraping is http://studentlund.se, which shows the problem: its img elements are not closed.

For example, if I execute the following in Chrome's console:

$('<div>').append($('body ul:first li:last')).html()

I'll receive:

<li><a href="http://studentlund.se/feed/"><img src="http://studentlund.se/wordpress/wp-content/themes/studentlund/pics/rss.png" alt="RSS"></a></li>

The img element is not closed, so my XML parser fails.

If I use the XMLSerializer:

n = $('body ul:first li:last').get(0)
new XMLSerializer().serializeToString(n)

I will get the same, incorrectly formatted XML:

<li><a href="http://studentlund.se/feed/"><img src="http://studentlund.se/wordpress/wp-content/themes/studentlund/pics/rss.png" alt="RSS"></a></li>

All I want is to dump the raw DOM of a node as a well-formed XML string so I can use it with my XML tools. Is this possible?

1 Answer


Try creating an XML document, appending the node to it, and then serializing that document to a string, something like this:

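// Grab the raw DOM node for the last <li> of the first <ul>.
var n = $('body ul:first li:last').get(0);
// Create an empty XML document (no namespace, no root element, no doctype).
var doc = document.implementation.createDocument('', '', null);
// Note: appendChild moves the node out of the page and into the XML document.
doc.appendChild(n);
var xml = new XMLSerializer().serializeToString(doc);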
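
If a browser still gives you HTML-style output from the serializer, one possible fallback is to walk the node yourself and write every empty element in self-closing form. The following is only a rough sketch, not a full serializer: `toXml` and `escapeXml` are hypothetical helper names, only element and text nodes are handled, and namespaces, comments and CDATA are ignored. It relies only on the standard `nodeType`, `localName`, `attributes`, `childNodes` and `nodeValue` properties, so the plain object below stands in for the `<li>` from the question and lets the sketch run outside a browser.

```javascript
// Escape text for use in XML character data or in a double-quoted attribute value.
function escapeXml(text, isAttribute) {
  var out = String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  return isAttribute ? out.replace(/"/g, '&quot;') : out;
}

// Hypothetical helper: recursively serialize a DOM node as well-formed XML.
// Only element (nodeType 1) and text (nodeType 3) nodes are handled;
// every other node type is skipped.
function toXml(node) {
  if (node.nodeType === 3) {
    return escapeXml(node.nodeValue, false);
  }
  if (node.nodeType !== 1) {
    return '';
  }
  var name = node.localName;
  var xml = '<' + name;
  for (var i = 0; i < node.attributes.length; i++) {
    var attr = node.attributes[i];
    xml += ' ' + attr.name + '="' + escapeXml(attr.value, true) + '"';
  }
  // Elements without children (img, br, ...) are written as <name .../>.
  if (node.childNodes.length === 0) {
    return xml + '/>';
  }
  xml += '>';
  for (var j = 0; j < node.childNodes.length; j++) {
    xml += toXml(node.childNodes[j]);
  }
  return xml + '</' + name + '>';
}

// Plain-object stand-in for the <li> element from the question.
var li = {
  nodeType: 1, localName: 'li', attributes: [], childNodes: [{
    nodeType: 1, localName: 'a',
    attributes: [{ name: 'href', value: 'http://studentlund.se/feed/' }],
    childNodes: [{
      nodeType: 1, localName: 'img',
      attributes: [
        { name: 'src', value: 'http://studentlund.se/wordpress/wp-content/themes/studentlund/pics/rss.png' },
        { name: 'alt', value: 'RSS' }
      ],
      childNodes: []
    }]
  }]
};

console.log(toXml(li));
// → <li><a href="http://studentlund.se/feed/"><img src="http://studentlund.se/wordpress/wp-content/themes/studentlund/pics/rss.png" alt="RSS"/></a></li>
```

In the browser you would pass the real node instead, e.g. `toXml($('body ul:first li:last').get(0))`. Unlike `appendChild`, this only reads the node and leaves it in the page.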