Regex to replace a special character in PHP?

Question

I have a problem with the special character §. I want to replace multiple consecutive occurrences of § with a single §. The following regex works fine on Regex 101.

$file_data = file_get_contents($file_name);
$file_data = preg_replace('/\§+/g', '§',$file_data);

It changed

§§§§§§§§§This free 3D robot game could redefine how kids learn to codeDigital Trends It’s hard to get kids to code. Up until very recently, it was largely ....

to

§This free 3D robot game could redefine how kids learn to codeDigital Trends It’s hard to get kids to code. Up until very recently, it was largely ....

However, it does not work on the server after I upload it. Here is the var_dump($file_data) output from PHP:

Â§Â§Â§Â§Â§Â§Â§Â§ This free 3D robot game could redefine how kids learn to codeDigital Trends It’s hard to get kids to code. Up until very recently, it was largely ....

So there seems to be an additional character Â before every § in the var_dump output. The extra Â does not show up on the webpage when the string is echoed as HTML; it only shows up in the plain PHP var_dump. How can I replace multiple occurrences of § using regex in PHP?

I would start by removing the g modifier, since it doesn't exist in PHP regex. My first guess would be to try the u modifier: /§+/u. Have fun.
– HamZa, Commented Nov 15, 2015 at 11:13
Second guess: make sure to use UTF-8 in your HTML document, or send a header beforehand to define the type: header('Content-Type: text/html; charset=utf-8');
– HamZa, Commented Nov 15, 2015 at 11:16
@HamZa Thank you. It was working on regex101, so I thought it would work on the server too. I will see if it solves the problem.
– SanJeet Singh, Commented Nov 15, 2015 at 11:17
@HamZa It is working now. What you said about using u was correct. If you write it as an answer, I will accept it. Otherwise, I will accept the other answer, which does what your comment says.
– SanJeet Singh, Commented Nov 15, 2015 at 11:35
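
On the g modifier mentioned in the first comment: preg_replace() already replaces every match unless its optional fourth argument sets a limit, and an unknown modifier makes the call fail. A minimal sketch, using illustrative ASCII input that is not from the question:

```php
<?php
$input = 'aaa-bbb-aaa'; // illustrative input, not from the question

// preg_replace() replaces all matches by default (the limit defaults to -1).
$all = preg_replace('/a+/', 'a', $input);

// The fourth argument caps the number of replacements.
$firstOnly = preg_replace('/a+/', 'a', $input, 1);

// An unknown modifier such as g makes the call fail: PHP raises a warning
// (suppressed here with @) and preg_replace() returns null.
$invalid = @preg_replace('/a+/g', 'a', $input);

var_dump($all);       // string(7) "a-bbb-a"
var_dump($firstOnly); // string(9) "a-bbb-aaa"
var_dump($invalid);   // NULL
```

Because a failed call returns null, assigning the result straight back to $file_data can silently discard the file contents, so it may be worth checking for null first.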

Andreas Louv · Accepted Answer · 2015-11-15 11:32:32Z

2

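You will need to set the u (UTF-8) modifier. In PHP, this modifier makes the preg functions treat both the pattern and the subject string as UTF-8, so § is matched as a single character instead of as two separate bytes.

Perl has a similarly named modifier. From the perlre documentation:

/u means to use Unicode rules when pattern matching. On ASCII platforms, this means that the code points between 128 and 255 take on their Latin-1 (ISO-8859-1) meanings (which are the same as Unicode's)....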

$output = preg_replace('/§+/u', '§', $input);
                         // ^
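
To see why the u modifier matters here, the following sketch compares byte-wise and UTF-8 matching on a shortened version of the question's input. It assumes the data is UTF-8 encoded, and it writes § as the byte escape "\xC2\xA7" so the result does not depend on how the source file itself is saved:

```php
<?php
// "\xC2\xA7" is the UTF-8 encoding of § (U+00A7).
$section = "\xC2\xA7";
$input   = str_repeat($section, 9) . 'This free 3D robot game';

// Without /u the pattern is matched byte by byte: the + applies only to the
// last byte (0xA7), so each § is matched and replaced on its own.
$bytewise = preg_replace('/' . $section . '+/', $section, $input);

// With /u the pattern and subject are treated as UTF-8, so + applies to the
// whole character and the run collapses to a single §.
$unicode = preg_replace('/' . $section . '+/u', $section, $input);

var_dump($bytewise === $input);                              // bool(true): unchanged
var_dump($unicode === $section . 'This free 3D robot game'); // bool(true)

// The two bytes behind one §; read as ISO-8859-1 they are "Â" and "§".
var_dump(bin2hex($section));                                 // string(4) "c2a7"
```

The last line also accounts for the Â in the var_dump output: the string is valid UTF-8, but whatever displayed it interpreted the bytes as a single-byte encoding.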

Professor Abronsius · Answer · 2015-11-15 11:29:53Z

0

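// Sample input that starts with a run of § characters
$str = "§§§§§§§§§This free 3D robot game could redefine how kids learn to codeDigital Trends It's hard to get kids to code. Up until very recently, it was largely ....";
// @ is the delimiter; {2,} matches runs of two or more §; u treats pattern and subject as UTF-8.
// The m modifier has no effect here because the pattern contains no ^ or $ anchors.
$pttn = '@\§{2,}@um';
echo preg_replace($pttn, '§', $str);

/* will output */
/*
   §This free 3D robot game could redefine how kids learn to codeDigital Trends It's hard to get kids to code. Up until very recently, it was largely .... 
*/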
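
Since the question reads its input with file_get_contents(), the file may not always be valid UTF-8. The sketch below wraps the same {2,} approach in a hypothetical helper, collapse_section_signs() (a name invented for this example, not part of either answer), and checks for the null that preg_replace() returns when the u modifier meets invalid UTF-8:

```php
<?php
// Hypothetical helper, not part of either answer.
function collapse_section_signs(string $text): string
{
    $section = "\xC2\xA7"; // UTF-8 bytes of § (U+00A7)

    // {2,} only touches runs of two or more §, so a lone § is left alone;
    // the final string is the same as with +.
    $result = preg_replace('/' . $section . '{2,}/u', $section, $text);

    // With /u, preg_replace() returns null when the subject is not valid UTF-8.
    if ($result === null) {
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }

    return $result;
}

echo collapse_section_signs("\xC2\xA7\xC2\xA7\xC2\xA7This free 3D robot game"), "\n";
```

Passing a Latin-1 encoded string such as "\xA7\xA7" makes the helper throw instead of returning an empty result, which is easier to notice than a silently blank page.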
