Is there any difference between:
preg_replace( '@<(script|style)[^>]*?>.*?</\\1>@si', '', $string );
and
preg_replace( '@<(script|style)[^>]*>.*</\\1>@si', '', $string );
?
Yes...
Consider this example string...
<script>bla</script><script>hello</script>
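With the first (non-greedy) pattern, each match covers only one script element: the first match is <script>bla</script>, and preg_replace then matches the second element separately.
With the second (greedy) pattern, a single match covers <script>bla</script><script>hello</script>, and it would also swallow any markup that sat between the two elements.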
The first non-greedy quantifier ([^>]*?) probably doesn't need to be lazy: [^>] can never match a >, so it stops at the closing > of the opening tag either way.
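To make the difference visible, it helps to put some markup between the two script elements. The following is a minimal sketch using the two patterns from the question; the sample string and the variable names $lazy and $greedy are only illustrative.

```php
<?php
// Sample input (made up for illustration): a paragraph sits between two script elements.
$string = '<script>bla</script><p>keep me</p><script>hello</script>';

// Non-greedy: each match ends at the nearest closing tag.
$lazy = preg_replace('@<(script|style)[^>]*?>.*?</\\1>@si', '', $string);

// Greedy: one match runs from the first opening tag to the last closing tag.
$greedy = preg_replace('@<(script|style)[^>]*>.*</\\1>@si', '', $string);

var_dump($lazy);   // string(14) "<p>keep me</p>"
var_dump($greedy); // string(0) ""
```

With the greedy pattern the paragraph is removed along with both script elements, which is rarely what is wanted.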
I should also mention that something like DOMDocument is a much better way of getting script and style elements.
$dom = new DOMDocument;
$dom->loadHTML($string);
$scripts = $dom->getElementsByTagName('script');
$styles = $dom->getElementsByTagName('style');
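If the goal is to remove the elements rather than just find them, the same approach can be extended roughly as follows. This is a sketch, not a definitive implementation: the function name strip_script_and_style is hypothetical, and it assumes the input is a non-empty HTML string. Note that saveHTML() returns a full document, so a fragment comes back wrapped in a doctype, html and body.

```php
<?php
// Hypothetical helper (not from the original answer): removes every
// script and style element from an HTML string using DOMDocument.
function strip_script_and_style(string $html): string
{
    $dom = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach (['script', 'style'] as $tag) {
        // getElementsByTagName() returns a live list, so copy it before removing nodes.
        $nodes = iterator_to_array($dom->getElementsByTagName($tag));
        foreach ($nodes as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    return $dom->saveHTML();
}

$clean = strip_script_and_style(
    '<p>Hello</p><script>bla</script><style>p{}</style><p>World</p>'
);
echo $clean;
```

Copying the node list first matters: removing nodes while iterating the live list directly tends to skip elements.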
?) is listed first; the second one is greedy. But the analysis is spot-on.

The extra ? inverts the greediness of a quantifier (quantifiers are greedy by default in PHP):
/a+/ will match aaa in aaab
/a+?/ will match a in aaab
/a+b/ will match aaab in aaab
/a+?b/ will also match aaab in aaab: the match still starts at the leftmost a, and the lazy a+? is forced to expand until b can match
So, in your particular example, the non-greedy expression will catch one script tag and its contents, so to speak, while the greedy version will start matching at the first script tag and grab everything (including non-script areas) up to the very last closing script tag.
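The quantifier behaviour can be checked directly with preg_match. This is a small sketch; the variable names are only illustrative.

```php
<?php
$subject = 'aaab';

preg_match('/a+/', $subject, $greedy);     // greedy: takes as many a's as possible
preg_match('/a+?/', $subject, $lazy);      // lazy: stops after the first a
preg_match('/a+?b/', $subject, $anchored); // lazy, but b must follow, so it expands anyway

echo $greedy[0], "\n";   // aaa
echo $lazy[0], "\n";     // a
echo $anchored[0], "\n"; // aaab
```

Laziness only changes how far a quantifier extends once a match has started; it never moves the start of the match to the right.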
Don't rely on either, though: