I want to use PHP's cURL to visit a page on an external site and get the whole HTML content of the page.

When I visit the site, it redirects me to another page on the same site. I also have to set the user agent; I want one for Chrome on a Windows 7 PC and one for an iPhone 4S. This is what I have so far:

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$kl = curl_exec($ch);
curl_close($ch);
echo $kl;

Note: I will probably run into more errors.

  • Use simplehtmldom or phpQuery to parse the HTML. Commented Jul 22, 2013 at 1:47
  • @DevZer0 Those libraries were designed for PHP 4. In PHP 5 there is DOMXPath or SimpleXMLElement::xpath(). Commented Jul 22, 2013 at 1:49
  • I want to use cURL, but thanks for the comments. Commented Jul 22, 2013 at 1:50
  • @ahmadalbayati cURL will not let you manipulate the HTML. Commented Jul 22, 2013 at 1:51
  • You can get the HTML result, but if you want to manipulate it or extract items from it, you need more code to parse it; "PHP Simple HTML DOM Parser" is helpful for that. And if there is a redirect, use curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); to follow it. Commented Jul 22, 2013 at 2:32
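
The DOMXPath route mentioned in the comments could look roughly like the sketch below. It assumes the HTML has already been fetched into a string (for example by curl_exec with CURLOPT_RETURNTRANSFER enabled); the helper name extractLinks and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: collect the href value of every <a> element in an HTML string.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so buffer libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = array();
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Sample markup standing in for the string returned by curl_exec().
$html = '<html><body><a href="/first">One</a><p>Text</p><a href="/second">Two</a></body></html>';
print_r(extractLinks($html));
```

cURL only transfers the page; DOMDocument and DOMXPath ship with PHP's DOM extension, so no third-party parser is needed for simple extraction like this.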

3 Answers

You might also need to handle URLs that use HTTPS:

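// The cookie options expect a file path. tempnam() returns one;
// tmpfile() returns a file handle, which cURL cannot use here.
$cookie = tempnam(sys_get_temp_dir(), 'cookie');
// Windows NT 6.1 identifies Windows 7 (NT 6.2 would be Windows 8)
$userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31';

$ch = curl_init($url);

$options = array(
    CURLOPT_CONNECTTIMEOUT => 20,
    CURLOPT_USERAGENT => $userAgent,
    CURLOPT_AUTOREFERER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_COOKIEFILE => $cookie,
    CURLOPT_COOKIEJAR => $cookie,
    // The next two lines turn off certificate checks for HTTPS.
    // That is convenient for testing but unsafe in production.
    CURLOPT_SSL_VERIFYPEER => 0,
    CURLOPT_SSL_VERIFYHOST => 0
);

curl_setopt_array($ch, $options);
$kl = curl_exec($ch);
curl_close($ch);
echo $kl;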
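
Since the question expects more errors to turn up, it may help to check the result of curl_exec rather than echoing it directly. The following is only a sketch: the helper name fetchOrFail and the returned array keys are assumptions made for this example, and it leaves certificate verification at cURL's defaults.

```php
<?php
// Hypothetical helper: fetch a URL, follow redirects, and fail loudly on transfer errors.
function fetchOrFail(string $url): array
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Could not initialise cURL');
    }

    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,   // guard against redirect loops
        CURLOPT_CONNECTTIMEOUT => 20,
    ));

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on failure; curl_error() explains why.
        throw new RuntimeException('cURL error ' . curl_errno($ch) . ': ' . curl_error($ch));
    }

    // The handle is released automatically when $ch goes out of scope.
    return array(
        'body'     => $body,
        'status'   => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'finalUrl' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), // URL after redirects
    );
}

// A deliberately unsupported scheme, so the error path can be seen without network access.
try {
    $result = fetchOrFail('nosuchscheme://example.invalid/');
    echo $result['finalUrl'], "\n";
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

With a real URL, the finalUrl entry shows where the redirect ended up, which is useful when the site sends you to another page as described in the question.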
1 Comment

Thanks, that worked too, and +1 for putting the options into an array :)
+50

So:

  1. Look up the appropriate User-Agent strings online.
  2. Enable CURLOPT_FOLLOWLOCATION, as @TroyCheng indicated.
  3. Enable CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR.
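
The three steps above can be sketched as follows. The two User-Agent strings are illustrative examples of the usual format for Chrome on Windows 7 and Safari on an iPhone running iOS 6, not authoritative values, so look up current ones before relying on them. As far as I know, the iPhone string does not name the hardware model, so there is no 4S-specific value. The names $userAgents, buildOptions and fetchPage are hypothetical.

```php
<?php
// Step 1: illustrative User-Agent strings (assumed values; verify before use).
$userAgents = array(
    'chrome_win7' => 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36',
    'iphone_ios6' => 'Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25',
);

// Hypothetical helper: build the option array for one request.
function buildOptions(string $userAgent, string $cookieFile): array
{
    return array(
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_FOLLOWLOCATION => true,        // step 2: follow the redirect
        CURLOPT_COOKIEFILE     => $cookieFile, // step 3: read cookies from this file
        CURLOPT_COOKIEJAR      => $cookieFile, // step 3: write cookies back to it
        CURLOPT_RETURNTRANSFER => true,        // return the page instead of printing it
    );
}

// Hypothetical helper: returns the page as a string, or false on failure.
function fetchPage(string $url, string $userAgent, string $cookieFile)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, buildOptions($userAgent, $cookieFile));
    return curl_exec($ch);
}
```

A call might look like `echo fetchPage($url, $userAgents['iphone_ios6'], tempnam(sys_get_temp_dir(), 'cookie'));`, with $url being the page from the question. Keeping the options in a separate function makes it easy to switch between the desktop and mobile user agents.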

1 Comment

Thanks, that did it. Sadly, I can't give you the bounty until 23 hours have passed :/

Why don't you use a library like Buzz?

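// Assumes $userAgent is already defined
$cookieFile = tempnam(sys_get_temp_dir(), 'cookie');

$request = new Buzz\Message\Request('GET', '/', 'http://google.com');
$response = new Buzz\Message\Response();

$client = new Buzz\Client\Curl();
// do not verify the HTTPS certificate (unsafe outside of testing)
$client->setVerifyPeer(false);
// define your user agent; cURL options are passed as constants, not strings
$client->setOption(CURLOPT_USERAGENT, $userAgent);
// the cookie options expect a file path, not a boolean
$client->setOption(CURLOPT_COOKIEFILE, $cookieFile);
$client->setOption(CURLOPT_COOKIEJAR, $cookieFile);
$client->send($request, $response);

if ($response->isOk())
{
  echo $response->getContent();

  // or, if you want the DOM, serialize the DOMDocument before echoing it
  echo $response->toDomDocument()->saveHTML();
}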
