5

If I have the html file:

<!doctype html>
 <html>
  <head></head>
   <body>
    <!-- Begin -->
    Important Information
    <!-- End -->
   </body>
  </head>
 </html>

How can I use PHP to get the string "Important Information" from the file?

8
  • 2
    Have a look at stackoverflow.com/questions/3577641/best-methods-to-parse-html Commented Feb 16, 2011 at 4:00
  • It isn't (mostly) the parsing I'm worried about; it's more about getting the code in the first place. Commented Feb 16, 2011 at 4:05
  • How can I turn "<!doctype html> <html> <head></head> <body> <!-- Begin --> Important Information <!-- End --> </body> </head> </html>" into a PHP $variable? Commented Feb 16, 2011 at 4:07
  • Title is highly misleading, you're not "getting the HTML source code." You're just getting the text. Commented Feb 16, 2011 at 4:08
  • No, sorry, I am. How can I turn index.html into a PHP $variable? Commented Feb 16, 2011 at 4:08

3 Answers

5

If you already have the parsing sorted, just use file_get_contents(). Pass it the path of a local file and it returns the file's contents as a string, in this case the HTML. It also accepts a URL and returns the content found there, provided allow_url_fopen is enabled.
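A minimal sketch of reading the file into a variable, assuming the page is stored locally. The temporary file here is only a stand-in for index.html so the example runs on its own; in practice you would pass your real path.

```php
<?php
// Hypothetical stand-in for index.html so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents(
    $path,
    "<!doctype html>\n<html>\n <head></head>\n <body>\n"
    . "  <!-- Begin -->\n  Important Information\n  <!-- End -->\n"
    . " </body>\n</html>\n"
);

// file_get_contents() returns the whole file as a string, or false on failure.
$html = file_get_contents($path);
unlink($path); // remove the temporary stand-in file

if ($html === false) {
    exit("Could not read the file\n");
}

echo $html;
```

After this, $html holds the full source and can be handed to whatever parsing step you already have.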


2

In this simple example you can open the file and call fgets() until you find a line containing <!-- Begin -->, then save each line until you reach <!-- End -->.
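The line-by-line approach could look roughly like this. It is only a sketch and assumes each marker sits on its own line, as in the question; the temporary file is a hypothetical stand-in for the real HTML file.

```php
<?php
// Hypothetical stand-in for the HTML file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents(
    $path,
    "<!doctype html>\n<html>\n <head></head>\n <body>\n"
    . "  <!-- Begin -->\n  Important Information\n  <!-- End -->\n"
    . " </body>\n</html>\n"
);

$handle = fopen($path, 'r');
$inside = false; // true once the Begin marker has been seen
$lines  = [];

while (($line = fgets($handle)) !== false) {
    if (strpos($line, '<!-- End -->') !== false) {
        break; // stop collecting at the End marker
    }
    if ($inside) {
        $lines[] = trim($line); // keep the line without surrounding whitespace
    }
    if (strpos($line, '<!-- Begin -->') !== false) {
        $inside = true; // start collecting from the next line
    }
}

fclose($handle);
unlink($path); // remove the temporary stand-in file

$text = implode("\n", $lines);
echo $text, PHP_EOL;
```

Using strpos() on each line rather than an exact comparison means leading whitespace before the markers does not matter.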

If your HTML is in a variable you can just do:

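<?php
$startMarker = '<!-- Begin -->';
$endMarker   = '<!-- End -->';

// strpos() returns false when a marker is missing, so check before slicing
$begin = strpos($var, $startMarker);
$end   = strpos($var, $endMarker);

if ($begin !== false && $end !== false) {
    // Skip past the start marker (its length is 14, so this could be hardcoded)
    $begin += strlen($startMarker);
    // trim() drops the whitespace around the text between the markers
    $text = trim(substr($var, $begin, $end - $begin));
    echo $text;
}
?>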

You can see the output here.

3 Comments

How can I use fgets with a $var?
Don't forget the whitespace before <!-- Begin -->
Is this $var a string with the HTML content?
-1

You can fetch the HTML like this:

// file_get_html() comes from a third-party library, not from PHP itself
// Create DOM from URL or file
$html = file_get_html('http://www.example.com/');

For any further operations on the DOM, read the following docs: http://de.php.net/manual/en/book.dom.php
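Since the linked documentation covers PHP's native DOM extension rather than the third-party library, here is a rough sketch using only DOMDocument and DOMXPath. It assumes the DOM extension is available and that the wanted text is the first text node following the Begin comment; the sample markup is a stand-in for the real page.

```php
<?php
// Sample markup standing in for the real page (an assumption for this sketch).
$html = "<!doctype html>\n<html>\n <head></head>\n <body>\n"
      . "  <!-- Begin -->\n  Important Information\n  <!-- End -->\n"
      . " </body>\n</html>";

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the first text node that follows the comment whose content is "Begin".
$nodes = $xpath->query(
    '//comment()[normalize-space(.) = "Begin"]/following-sibling::text()[1]'
);

$text = $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;
echo $text, PHP_EOL;
```

To work from a file instead of a string, DOMDocument::loadHTMLFile() takes a path in place of loadHTML().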

4 Comments

There is no file_get_html in DOM or in PHP.
@Manish simplehtmldom is a third party library and not a native PHP extension. You linked to DOM in your answer.
@gordon: I didn't mention it. Sorry about that.
(tip) when you know your answer is wrong, correct it. People might remove the downvote then.
