1

Is there any difference between:

preg_replace( '@<(script|style)[^>]*?>.*?</\\1>@si', '', $string );

and

preg_replace( '@<(script|style)[^>]*>.*</\\1>@si', '', $string );

?

2 Answers

3

Yes...

Consider this example string...

<script>bla</script><script>hello</script>

  • The first one (with .*?) stops matching as soon as it is satisfied; this is known as a non-greedy (or lazy) match.

In the above example, it will only match the first script element.

  • The second one (with .*) will match everything from the first opening tag to the last matching closing tag, possibly swallowing other elements in between. This is known as a greedy match, as it consumes as much as it can.

It will match <script>bla</script><script>hello</script>.

The first non-greedy quantifier (the ? in [^>]*?) probably doesn't need to be there: [^>]* can only consume characters other than >, so it always stops at the tag's closing > whether it is greedy or not.
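To make the difference concrete, here is a minimal sketch that runs both patterns from the question. The $mixed string is a made-up input added for illustration; it is not from the question.

```php
<?php
// Example string from the answer above.
$string = '<script>bla</script><script>hello</script>';
// Hypothetical input with ordinary text between two script elements.
$mixed = '<script>a</script>keep<script>b</script>';

$lazy   = '@<(script|style)[^>]*?>.*?</\\1>@si';
$greedy = '@<(script|style)[^>]*>.*</\\1>@si';

// preg_match_all() returns the number of full pattern matches.
$lazyCount   = preg_match_all($lazy, $string, $lazyMatches);     // one match per element
$greedyCount = preg_match_all($greedy, $string, $greedyMatches); // one match spanning everything

// With text between the elements, the two replacements give different results.
$lazyResult   = preg_replace($lazy, '', $mixed);   // text between the elements survives
$greedyResult = preg_replace($greedy, '', $mixed); // text between the elements is removed too

var_dump($lazyCount, $greedyCount, $lazyResult, $greedyResult);
```

On the answer's example string both replacements happen to return an empty string, so the difference only becomes visible when there is content between the elements.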

I should also mention that using something like DOMDocument is a much better way of getting at script and style elements.

$dom = new DOMDocument;
$dom->loadHTML($string);
$scripts = $dom->getElementsByTagName('script');
$styles = $dom->getElementsByTagName('style');
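Since the question is about removing these elements, the DOMDocument approach could be extended roughly as follows. This is only a sketch: the input HTML is a made-up example, and suppressing libxml warnings is an assumption about how you want to handle imperfect markup.

```php
<?php
// Hypothetical input for illustration.
$string = '<html><head><style>p{color:red}</style></head>'
        . '<body><p>Hello</p><script>alert(1)</script></body></html>';

$dom = new DOMDocument;
// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom->loadHTML($string);
libxml_clear_errors();

foreach (['script', 'style'] as $tagName) {
    // getElementsByTagName() returns a live list, so copy it before removing nodes.
    $nodes = iterator_to_array($dom->getElementsByTagName($tagName));
    foreach ($nodes as $node) {
        $node->parentNode->removeChild($node);
    }
}

// Serialize the document without the removed elements.
$clean = $dom->saveHTML();
```

Note that saveHTML() returns a full document (including a doctype), so extracting just a fragment would need extra handling.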

2 Comments

Take a closer look, @alex: the non-greedy version (with the added ?) is listed first; the second one is greedy. But the analysis is spot-on.
@Alan Whoops, I must be living in reverse today. I'll edit :)
0

The extra ? inverts the greediness of the quantifier it follows (quantifiers are greedy by default in PHP):

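  • /a+b/ will match aaab in aaab
  • /a*b/ will match aaab in aaab
  • /a*?b/ will also match aaab in aaab: a match starts at the leftmost possible position, and the lazy a*? then expands until b can match
  • /a+?b/ will likewise match aaab in aaab
  • /a.*b/ will match abab in abab
  • /a.*?b/ will match ab in abab, because here the match can end early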
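These cases are easy to check with preg_match(). In the sketch below, firstMatch() is a hypothetical helper written for this example, not a built-in function. Keep in mind that a match always begins at the leftmost possible position, so laziness only affects where the match ends.

```php
<?php
// firstMatch() is a hypothetical helper: it returns the first full match, or null if none.
function firstMatch(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

$greedy  = firstMatch('/a.*b/', 'abab');  // "abab": .* takes as much as it can
$lazy    = firstMatch('/a.*?b/', 'abab'); // "ab": .*? stops at the first b
$lazyRun = firstMatch('/a+?b/', 'aaab');  // "aaab": the match starts at the first a and must reach b

var_dump($greedy, $lazy, $lazyRun);
```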

So, in your particular example, the non-greedy expression will match a single script element and its contents, while the greedy version will start at the first script tag and grab everything (including non-script content) up to the very last closing script tag.

Don't rely on either, though:

http://ha.ckers.org/xss.html
