Decode a UTF-8 character in JavaScript

Question

I have a badly configured third-party service that outputs strings like this:

"SK Uni=C4=8Dov vs Prostejov"

I want to decode these escaped sequences on the fly, so that my modules work with the correctly decoded string.

I found on this website (https://www.compart.com/en/unicode/U+010D) that the =C4=8D substring corresponds to the UTF-8 encoding of the character č:

č
...
UTF-8 Encoding:     0xC4 0x8D
UTF-16 Encoding:    0x010D
UTF-32 Encoding:    0x0000010D
...

but I cannot find a way to decode it automatically.

I've tried with:

>> String.fromCodePoint(0xc48d)
"쒍"


>> String.fromCodePoint("0xc4 0x8d")
RangeError    

>> String.fromCharCode(0xc48d)
"쒍"

etc...

If I use the UTF-16 value instead, String.fromCodePoint(0x010D) outputs the correct character.

How can I make it work with the UTF-8 bytes instead of the UTF-16 value?

Should I convert my string to UTF-16 to achieve what I want? If so, how can I convert it?

This reminds me of RFC 2047 and quoted-printable encoding. dogmamix.com/MimeHeadersDecoder
– Álvaro González, Commented Jul 7, 2021 at 9:23
Yep, you are right @Álvaro González, treating it as quoted-printable solves it. Thanks.
– Enuff, Commented Jul 7, 2021 at 10:41
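
For readers following the quoted-printable suggestion in the comments, here is a minimal sketch of that idea. It turns each =XX escape into a raw byte, then decodes the whole byte sequence as UTF-8 with TextDecoder. The function name decodeQuotedPrintableUtf8 is made up for illustration and is not from the original posts. It assumes the input contains only =XX escapes: it does not handle quoted-printable soft line breaks or the underscore-for-space rule of RFC 2047 headers.

```javascript
// Hypothetical helper (not from the original thread): decode "=XX" escapes
// as raw bytes and interpret the resulting byte sequence as UTF-8.
function decodeQuotedPrintableUtf8(input) {
  const encoder = new TextEncoder();
  const bytes = [];
  for (let i = 0; i < input.length; ) {
    const hex = input.slice(i + 1, i + 3);
    if (input[i] === "=" && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      // "=C4" -> byte 0xC4
      bytes.push(parseInt(hex, 16));
      i += 3;
    } else {
      // Any other character is kept: re-encode it as its own UTF-8 bytes.
      // Step by code point so surrogate pairs stay together.
      const ch = String.fromCodePoint(input.codePointAt(i));
      bytes.push(...encoder.encode(ch));
      i += ch.length;
    }
  }
  // Invalid byte sequences become U+FFFD instead of throwing.
  return new TextDecoder("utf-8").decode(new Uint8Array(bytes));
}

console.log(decodeQuotedPrintableUtf8("SK Uni=C4=8Dov vs Prostejov"));
```

This also shows why String.fromCodePoint(0xc48d) gives the wrong result: 0xC4 and 0x8D are two separate bytes of one UTF-8 sequence, not a single code point.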

nwellnhof · Accepted Answer · 2021-07-07 13:55:38Z

Since the encoding is almost identical to percent escapes used in URLs, you can simply use:

decodeURIComponent("SK Uni=C4=8Dov vs Prostejov".replace(/=/g, "%"))
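
One caveat with the one-liner: decodeURIComponent throws a URIError if the input contains a literal % that is not a valid escape, or if the resulting bytes are not valid UTF-8. A slightly more defensive sketch, assuming you would rather get the original string back than an exception, might look like this (decodeEqualsEscapes is a hypothetical name, not part of any library):

```javascript
// Hypothetical wrapper around the approach above: tolerate literal "%"
// and "=" characters that are not followed by two hex digits.
function decodeEqualsEscapes(input) {
  const percentEncoded = input
    .replace(/%/g, "%25")                   // protect literal percent signs
    .replace(/=(?=[0-9A-Fa-f]{2})/g, "%");  // rewrite only "=XX" escapes
  try {
    return decodeURIComponent(percentEncoded);
  } catch (e) {
    // Malformed UTF-8 (for example a lone "=C4"): fall back to the input.
    return input;
  }
}

console.log(decodeEqualsEscapes("SK Uni=C4=8Dov vs Prostejov"));
console.log(decodeEqualsEscapes("100% a=b"));
```

Note that a literal = that happens to be followed by two hex digits (for example "x=40") is still treated as an escape. That ambiguity comes from the encoding itself and cannot be resolved on the decoding side.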
