I have a French site that I want to parse, but I am running into problems converting the (UTF-8) HTML to Latin-1.

The problem is shown in the following phpunit test case:

class Test extends PHPUnit_Framework_TestCase {

    private static function fromHTML($str){
        return html_entity_decode($str, ENT_QUOTES, 'UTF-8');
    }

    public function test1(){

        //REMOVE THE SPACE between the '&' and 'nbsp'. SO won't
        //let me write it without the space
        $strFrom    = 'Wanted& nbsp;: les Chasseurs de Tamriel';
        $strTo  = 'Wanted : les Chasseurs de Tamriel';
        $strFrom = self::fromHTML($strFrom);
        $this->assertEquals($strTo, $strFrom);
    }

    public function test2(){
        $strFrom    = 'Remplacement d’Almalexia';
        $strTo      = 'Remplacement d’Almalexia';
        $strFrom = self::fromHTML($strFrom);
        $this->assertEquals($strTo, $strFrom);
    }

}

test2 completes fine. test1 seems to fail because the space isn't correct, so when converted to ASCII it ends up as an unknown character (�).

How would I ensure both tests pass?

  • Why is there a space between the '&' and 'nbsp;'? Is that what you are trying to fix? Commented Aug 7, 2009 at 13:57
  • Because I can't write it without the space as SO converts it into a space if I don't Commented Aug 7, 2009 at 14:01

2 Answers

test1 is not revealing a bug: the decoded result is correct, and the two strings you compare are simply not the same. “&nbsp;” is not decoded to an ordinary space (0x20). It is a non-breaking space, U+00A0, which in UTF-8 is the byte sequence 0xC2 0xA0 (a single 0xA0 byte in Latin-1). Once you change $strTo to contain that character before the colon, assertEquals will pass. Of course you have to make sure that your file is saved as UTF-8, just as PERR0_HUNTER mentioned, but seeing that you use the “’” character you are probably already doing that. :)
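To make this concrete, here is a minimal sketch outside of PHPUnit that decodes the string from test1 and compares it against an expected value written with an explicit non-breaking space. The variable names ($decoded, $nbsp, $expected, $normalised) are illustrative and not from the original test, and the final str_replace step is only an assumption about what you might want if a plain space is required downstream.

```php
<?php
// Decode the entity the same way the fromHTML() helper in the question does.
$decoded = html_entity_decode('Wanted&nbsp;: les Chasseurs de Tamriel', ENT_QUOTES, 'UTF-8');

// U+00A0 (no-break space) is the two bytes 0xC2 0xA0 in UTF-8.
$nbsp = "\xC2\xA0";
$expected = "Wanted{$nbsp}: les Chasseurs de Tamriel";

// The decoded string matches once the expectation contains U+00A0.
var_dump($decoded === $expected);

// "Wanted" is 6 bytes long, so the next two bytes are the decoded entity.
var_dump(bin2hex(substr($decoded, 6, 2))); // string(4) "c2a0"

// Optional (assumption): normalise to an ordinary space if that is what you need.
$normalised = str_replace($nbsp, ' ', $decoded);
var_dump($normalised === 'Wanted : les Chasseurs de Tamriel');
```

In the PHPUnit test itself, the equivalent change would be to build $strTo with "\xC2\xA0" before the colon instead of typing a regular space.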

Just a small suggestion: make sure that your .php file is saved as UTF-8. You'd be surprised how many people miss that.
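One rough way to check this from inside the script is to look at the bytes of a non-ASCII literal. This is only a sketch: the $quote and $isUtf8 names are made up for illustration, and it assumes the literal was typed directly into the file rather than produced at runtime.

```php
<?php
// A non-ASCII literal typed directly into the source file.
$quote = '’'; // U+2019 RIGHT SINGLE QUOTATION MARK

// In a UTF-8 file this character is the three bytes E2 80 99.
// In a Windows-1252 file it would be the single byte 92 instead.
var_dump(bin2hex($quote));

// preg_match() with the /u modifier returns false when the subject
// is not valid UTF-8, so an empty pattern works as a validity check.
$isUtf8 = preg_match('//u', $quote) === 1;
var_dump($isUtf8);
```

If the hex dump shows anything other than e28099, the editor is saving the file in a different encoding than the 'UTF-8' passed to html_entity_decode.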
