I am using DOMDocument and DOMXPath. I am trying to read an attribute whose value looks like JSON, but I don't get the exact value. Other attributes come back fine; only this one does not. The HTML looks like this:

<a href="URL" title="{tt4438848=Nicholas Stoller (dir.), Seth Rogen, Rose Byrne, tt2567026=James Bobin (dir.), Mia Wasikowska, Johnny Depp, tt3498820=Anthony Russo (dir.), Chris Evans, Robert Downey Jr., tt2948356=Byron Howard (dir.), Ginnifer Goodwin, Jason Bateman, tt3385516=Bryan Singer (dir.), James McAvoy, Michael Fassbender, tt1985949=Clay Kaytis (dir.), Jason Sudeikis, Josh Gad, tt3068194=Whit Stillman (dir.), Kate Beckinsale, Chloë Sevigny, tt3799694=Shane Black (dir.), Russell Crowe, Ryan Gosling, tt3040964=Jon Favreau (dir.), Neel Sethi, Bill Murray, tt2241351=Jodie Foster (dir.), George Clooney, Julia Roberts}">X-Men: Apocalypse</a>

If I use echo $dom->getAttribute("href"); the output is URL
If I use echo $dom->getAttribute("title"); the output is Bryan Singer (dir.), James McAvoy, Michael Fassbender

I cannot get the complete attribute value.

Edit: here is the code I am running: phpfiddle.org/main/code/dvj5-zf0q

Can anyone help? I am new to the PHP DOM extension. Thanks in advance.

  • Not an answer but the value of that attribute isn't JSON. Commented Jun 3, 2016 at 12:30
  • @alex So there is no way to get it?? Commented Jun 3, 2016 at 12:30

1 Answer

To get the title attribute:

<?php
$html = <<<EOF
<html>
<a href="URL" title="{tt4438848=Nicholas Stoller (dir.), Seth Rogen, Rose Byrne, tt2567026=James Bobin (dir.), Mia Wasikowska, Johnny Depp, tt3498820=Anthony Russo (dir.), Chris Evans, Robert Downey Jr., tt2948356=Byron Howard (dir.), Ginnifer Goodwin, Jason Bateman, tt3385516=Bryan Singer (dir.), James McAvoy, Michael Fassbender, tt1985949=Clay Kaytis (dir.), Jason Sudeikis, Josh Gad, tt3068194=Whit Stillman (dir.), Kate Beckinsale, Chloë Sevigny, tt3799694=Shane Black (dir.), Russell Crowe, Ryan Gosling, tt3040964=Jon Favreau (dir.), Neel Sethi, Bill Murray, tt2241351=Jodie Foster (dir.), George Clooney, Julia Roberts}">X-Men: Apocalypse</a>
</html>
EOF;

$dom = new DOMDocument();
$dom->loadHTML($html);
$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {
    $title = $link->getAttribute('title');
    echo $title;
}
?>

Be aware, though, that the title does not hold a JSON string but some custom key=value format.
See a demo on ideone.com.
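
Since the question mentions DOMXPath, the same attribute can also be read with an XPath query. The following is only a minimal sketch: the shortened HTML string and the variable names are made up for illustration, and it assumes the markup is available as a string.

```php
<?php
// Shortened, made-up markup in the same shape as the question's link.
$html = '<html><body><a href="URL" title="{tt1=Some Director (dir.), Actor One, tt2=Other Director (dir.), Actor Two}">Example</a></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select the title attribute node of every <a> element.
$titles = [];
foreach ($xpath->query('//a/@title') as $attr) {
    $titles[] = $attr->nodeValue; // the full, unmodified attribute value
}

print_r($titles);
```

Both approaches should return the whole attribute string; the XPath form is mainly useful when the links have to be narrowed down by some other condition.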


To actually extract the information, you could use a regular expression like this:

\w+=((?:(?!(?:, tt)).)+)

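Applied to your problem, this would be:

// Match "key=value", where the value runs until the next ", tt" or the end of the string.
$regex = '~\w+=((?:(?!(?:, tt)).)+)~';
foreach ($links as $link) {
    // $actors[0] holds the full key=value matches, $actors[1] the captured values.
    // Note: the last captured value still ends with the closing "}" of the attribute.
    preg_match_all($regex, $link->getAttribute('title'), $actors);
    print_r($actors);
}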

See a demo for this one on ideone.com as well.
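
If the goal is a lookup from each ID to its cast list, the matches can be collected into an associative array. This is a hedged sketch, not part of the original demo: parseTitleMap is a hypothetical helper name, and it assumes every key has the form tt followed by digits and that no value itself contains ", tt<digits>=".

```php
<?php
// Hypothetical helper: turn "{id=value, id=value}" into [id => value].
function parseTitleMap(string $title): array
{
    // Remove the surrounding braces of the custom format.
    $body = trim($title, '{}');

    // Capture each key and its value; a value ends right before the
    // next ", tt<digits>=" or at the end of the string.
    preg_match_all('~(tt\d+)=(.*?)(?=, tt\d+=|$)~', $body, $matches, PREG_SET_ORDER);

    $map = [];
    foreach ($matches as $match) {
        $map[$match[1]] = $match[2];
    }
    return $map;
}

// Two entries taken from the title attribute in the question.
$sample = '{tt3385516=Bryan Singer (dir.), James McAvoy, Michael Fassbender, tt2241351=Jodie Foster (dir.), George Clooney, Julia Roberts}';
$map = parseTitleMap($sample);
print_r($map);
```

Stripping the braces first keeps the closing "}" out of the last value, and keying by ID makes a single entry easy to look up, for example $map['tt3385516'].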


5 Comments

Did you echo the $title?
Thanks for your help, but in my case it's not working: phpfiddle.org/main/code/dvj5-zf0q
It works fine with your code. I tried it, but the output is an array with no values.
@hemnathmouli: Try it without the online emulator (run it locally on your hard disk).
Oops, so that was the problem. I should have tried it locally. Now I get the output. Thanks a lot. I'll +1 and mark this as the correct answer. Thanks :)
