
I've used DOM before to parse websites in PHP.

I know I should never try to parse HTML using regex.

But... (I don't want to start a shitstorm, just an answer :P )

If I want to parse just one HTML element, e.g.

<a href="http://example.com/something?id=1212132131133&filter=true" rel="blebeleble" target="_blank">

And find the content of the href attribute, can I use the DOM to parse this string (which I probably should, if it's possible), or do I need a complete webpage to be able to parse it with the DOM?

    What does parsing using the DOM mean? Commented Apr 11, 2011 at 22:11
  • php.net/manual/en/book.dom.php Commented Apr 11, 2011 at 22:17

2 Answers

4

Yes, you can do this.

You have to:

  • pretend that the <a /> tag constitutes the whole document;
  • ensure that you close the tag;
  • ensure that the input string is well-formed XML (note that I've replaced your & with &amp;, the proper entity for a literal ampersand).

Code:

<?php
$str = '<a href="http://example.com/something?id=1212132131133&amp;filter=true" rel="blebeleble" target="_blank" />';

$dom = new DOMDocument();
$dom->loadXML($str);
var_dump($dom->childNodes->item(0)->attributes->getNamedItem('href')->value);

// Output: string(57) "http://example.com/something?id=1212132131133&filter=true"
?>

PS, if you want to include the link text, that's ok too:

$str = '<a href="http://example.com/something?id=1212132131133&amp;filter=true" rel="blebeleble" target="_blank">Click here!</a>';
// .. code .. //

// Output: string(57) "http://example.com/something?id=1212132131133&filter=true"
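If closing the tag and escaping the & by hand is inconvenient, one possible alternative (not part of the code above) is DOMDocument's HTML parser, which is generally more forgiving of fragments. The sketch below assumes the input contains a single <a> tag; the helper name extractHref is made up for illustration, and the libxml warnings the fragment may trigger are simply collected and discarded.

```php
<?php
// Hypothetical helper: extractHref() is an illustrative name, not an existing API.
// Assumption: $tag holds one isolated <a ...> tag, possibly unclosed.
function extractHref(string $tag): ?string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings
    // (a bare fragment with a raw "&" is likely to trigger some).
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($tag);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The HTML parser wraps the fragment in <html><body>, so search by tag name
    // rather than relying on the node's position in the tree.
    $anchors = $dom->getElementsByTagName('a');
    if ($anchors->length === 0) {
        return null;
    }

    $a = $anchors->item(0);
    return $a->hasAttribute('href') ? $a->getAttribute('href') : null;
}

// The original, unmodified string from the question: unclosed tag, raw "&".
$str = '<a href="http://example.com/something?id=1212132131133&filter=true" rel="blebeleble" target="_blank">';
var_dump(extractHref($str));
```

This should yield the same URL as the loadXML version, but the HTML parser's error recovery can vary between libxml versions, so it is worth checking on your own setup.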

1 Comment

Great answer, thanks a bunch for this. I have no problem with Beautiful Soup or Nokogiri, but I find DOMDocument difficult to use.
0

You can easily adapt a regex to parse just this tag, given that you've already isolated it. An example can be found here. It's written for Java, so remember to move the case-insensitive modifier to the end of the pattern!
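As a rough PHP sketch of that idea: the pattern below is my own illustration, not the one from the linked example, and it assumes the string contains exactly one already-isolated <a> tag with a quoted href value. The function name hrefFromIsolatedTag is hypothetical.

```php
<?php
// Hypothetical helper, for illustration only.
// Assumptions: $tag is a single isolated <a> tag and href is quoted.
function hrefFromIsolatedTag(string $tag): ?string
{
    // The trailing "i" is the case-insensitive modifier, placed after the
    // closing delimiter as PCRE expects; "s" lets "." match newlines.
    $pattern = '/<a(?:\s[^>]*?)?\shref\s*=\s*(["\'])(.*?)\1/is';

    if (preg_match($pattern, $tag, $m) !== 1) {
        return null;
    }

    // Decode entities such as &amp; so the result is the plain URL.
    return html_entity_decode($m[2], ENT_QUOTES | ENT_HTML5);
}

$str = '<a href="http://example.com/something?id=1212132131133&filter=true" rel="blebeleble" target="_blank">';
var_dump(hrefFromIsolatedTag($str));
```

This only holds up for a tag you control and have already cut out of the page; unquoted attribute values or a ">" inside another attribute would defeat it, which is where the DOM approach is the safer choice.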
