
I'm writing a basic function in PHP that takes an input string and converts a list of "weird" characters to URL-friendly ones. Writing the function is not the issue; the problem is how PHP interprets strings with weird characters.

For example, right now I have this problem:

$string = "år";
echo $string[0]; // Output: �
echo $string[1]; // Output: �
echo $string[0] . $string[1]; // Output: å
echo $string[2]; // Output: r

So basically it interprets the letter "å" as two characters, which causes a problem for me, because I want to be able to look at each character of the string individually and replace it if needed.

I encode everything in UTF-8, and I know my issue has something to do with UTF-8 storing weird characters as two bytes, as we've seen above.

But how do I work around this? Basically I want to achieve this:

$string = "år";
echo $string[0]; // Output: å
echo $string[1]; // Output: r
  • Check out php.net/manual/en/function.mb-substr.php Commented Mar 21, 2012 at 17:39
  • @Pekka I misunderstood the question Commented Mar 21, 2012 at 17:40
  • is this string coming from db ? Commented Mar 21, 2012 at 17:40
  • @zod No it comes from a HTML form. Commented Mar 21, 2012 at 17:43
  • @zod I was just about to do that, but had to experiment with the different answers first. Commented Mar 21, 2012 at 18:29

2 Answers

$string = "år";

mb_internal_encoding('UTF-8');
echo mb_substr($string, 0, 1); // å
echo mb_substr($string, 1, 1); // r
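
If the goal is to walk over every character and swap some of them out, one possible approach is to split the string into characters first. The following is only a sketch: the function name `to_url_friendly` and the `$map` array are made up for illustration, and it assumes both the source file and the input are valid UTF-8. Note that it splits on code points, so an "å" typed as "a" plus a combining ring would still come out as two pieces.

```php
<?php
// Hypothetical replacement map; extend it to suit your own needs.
$map = ['å' => 'a', 'ä' => 'a', 'ö' => 'o', ' ' => '-'];

function to_url_friendly(string $input, array $map): string
{
    // The "u" modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern splits between code points instead of between bytes.
    $chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // preg_split() fails when the input is not valid UTF-8.
        return '';
    }

    $out = '';
    foreach ($chars as $char) {
        // Use the mapped replacement if there is one, else keep the character.
        $out .= $map[$char] ?? $char;
    }
    return $out;
}

echo to_url_friendly('år', $map); // ar
```

For a fixed list of replacements, `strtr($string, $map)` should give the same result without an explicit loop, since it matches whole byte sequences.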

2 Comments

Not sure why this was downvoted? Anyway, as far as I can see it is correct, so +1
@Pekka I agree. Both this and Artjom's answer solved it for me, but since Artjom wrote his answer first I chose his as accepted


UTF-8 is a variable-width encoding: ASCII characters take one byte, but non-ASCII letters such as "å" take two or more bytes of memory. Array-like access to a string variable returns a single byte, not a letter. So to actually get the letter, you should use the multibyte string functions:

echo mb_substr($string, 0,1);// Output: å
echo mb_substr($string, 1,1);// Output: r
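
To see the byte/letter difference for yourself, a quick check along these lines may help. It is just a sketch and assumes the file is saved as UTF-8 and the mbstring extension is loaded.

```php
<?php
$string = "år"; // assumes this file is saved as UTF-8

echo strlen($string), "\n";                  // 3 (bytes)
echo mb_strlen($string, 'UTF-8'), "\n";      // 2 (characters)
echo bin2hex($string[0] . $string[1]), "\n"; // c3a5, the two bytes of "å"
echo bin2hex($string[2]), "\n";              // 72, the single byte of "r"
```

Each of `$string[0]` and `$string[1]` is half of "å" and is not valid UTF-8 on its own, which is why the browser prints a replacement character for it.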
