I have an HTML document that, greatly simplified, looks like this:
<html>
<body>
<a href="...">...</a>
<a href="...">...</a>
<a href="...">...</a>
...
</body>
</html>
I'd like to extract the URLs as line-delimited output. Enter xmllint:
$ xmllint --html --xpath //a/@href
href="..." href="..." href="..."
This returns each whole attribute, name included, and prints them space-delimited on a single line. How can I get just the values of the href attributes, one per line? I want output like this:
...
...
...
where ... is the URL found in the href attribute of each a element.
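A stopgap would be to post-process the output in the shell. This is only a minimal sketch, assuming every value is double-quoted as in the output above and that the local grep supports -o; href_values and page.html are names made up for illustration:

```shell
# Hypothetical helper: reads xmllint's space-delimited attribute output
# on stdin and prints one attribute value per line.
# Assumes values are double-quoted and contain no literal double quotes.
href_values() {
  # Put each href="..." match on its own line, then strip the wrapper.
  grep -o 'href="[^"]*"' | sed -e 's/^href="//' -e 's/"$//'
}

# Intended usage (page.html is a placeholder file name):
#   xmllint --html --xpath '//a/@href' page.html | href_values
```

This only reshapes the text, so entities such as &amp; in the URLs would likely stay escaped. Having xmllint or the XPath expression emit the bare values directly would be preferable.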
How can I format this output properly?