1 vote
2 answers
98 views

Parsing UTF-8 XML using DefaultHandler: when / how does it become UTF-16 in Java?

I have a Java program that was working perfectly in Corretto 17, but is now having character set encoding issues in Corretto 25. I am reading a UTF-8 encoded XML from an external API. The code is ...
Philip H's user avatar
  • 422
4 votes
1 answer
107 views

Are spelling variations of encoding identifiers for "setlocale" standardized or documented?

This question has to do with syntactic conventions for string encoding identifiers in locale names passed to setlocale in C, focusing on the particular example of UTF-8. My preliminary observation is ...
NikS's user avatar
  • 194
0 votes
0 answers
69 views

Avoid encoding emojis when using yq --prettyPrint

When I convert my JSON file to YAML, I want any Unicode characters to be part of the content and not escaped with backslashes and quotes. Example: This is my file.json What can I do in order to make yq not ...
Esben von Buchwald's user avatar
0 votes
0 answers
42 views

How to properly show a GBK-encoded header in the Mac version of the Unity editor?

I work on my project on both Mac and Windows. Some editor headers were written in Visual Studio on Windows in Chinese, and were encoded in GBK (the default, I guess). However, they don't show properly on Mac ...
ArtS's user avatar
  • 2,022
1 vote
0 answers
38 views

How can I know what encoding to use when reading & writing CFDataRef entries to macOS Keychain via the C-based SecItem API?

I'm writing FFI bindings to macOS' C-based Keychain Services API for use in a plugin. The Keychain Services API takes CFDataRef values, which allows storing arbitrary bytes to the Keychain, but the ...
habibalamin's user avatar
-1 votes
1 answer
36 views

How to pass special chars to nodemailer?

I want to pass a string to nodemailer so that it results in =C2=A0. I tried to pass in both 'Â ' (the literal chars), "\xC2\xA0", etc., but they always result in =C3=A9. A longer example in ...
gcb's user avatar
  • 14.5k
3 votes
1 answer
115 views

How to efficiently split a text file with an arbitrary Charset without damaging code points?

Given a valid text file file and its java.nio.charset.Charset how can I efficiently (preferably using RandomAccessFile.seek() or InputStream.skip(), without reading the whole file) split it into two ...
Basilevs's user avatar
  • 24.6k
-5 votes
1 answer
100 views

Swift: text/string vs. raw bytes? [closed]

I gave the following Swift code to ChatGPT: let data = text.data(using: .utf8) It answered: "This line takes your string and turns it into raw bytes (data) that can be stored, sent over the ...
mewi's user avatar
  • 769
0 votes
1 answer
127 views

Why does Wikipedia claim UTF-16 is obsolete when JavaScript uses it?

The Wikipedia page for UTF-16 claims that it is obsolete, saying: UTF-16 is the only encoding (still) allowed on the web that is incompatible with 8-bit ASCII. However it has never gained popularity ...
Isaac King's user avatar
1 vote
0 answers
32 views

Clipper (CP437) Character Display Issue on AlmaLinux 9.6 Minimal Server Terminal [duplicate]

I'm working with an AlmaLinux 9.6 Minimal server that hosts a Clipper-programmed system. This Clipper system uses the CP437 character set. I've successfully configured client computers to display the ...
Gabriel's user avatar
  • 11
0 votes
0 answers
81 views

How to print Greek letters in C++ [duplicate]

Recently, I have been working on a small program that requires me to print messages in Greek. Nothing fancy, just printing Greek sentences in the terminal. For example: #include <iostream> int ...
Thanos's user avatar
  • 33
0 votes
0 answers
46 views

Accents replaced with a strange character in Symfony 6.4 [duplicate]

For the past 3 days, characters with French accents have no longer appeared and are replaced by a �. Everything was fine before. In phpMyAdmin the display is correct. It happens only on the remote server; locally, all ...
Alain Jouve's user avatar
3 votes
1 answer
128 views

Python subprocess: Trouble receiving non-ASCII characters from output of yt-dlp

(Windows, Python 3.9.6, yt-dlp 2025.06.09) I have a YouTube playlist whose titles contain both Korean and Latin characters. I can print the titles of the videos in the playlist using the below command:...
S.M.'s user avatar
  • 33
0 votes
0 answers
76 views

How to tell if these two identical-looking but differently encoded strings are the same? [duplicate]

I'm trying to import some data downloaded from Google Sheets. The tab, when editing on the website, is called "Kodály". In Ruby, if I look at the individual characters, I see this: >> ...
Max Williams's user avatar
  • 33.1k
0 votes
1 answer
50 views

glib iconv() - force conversion to single bytes

I have my own personal movie database system, within which context I NEVER want to see "extended" characters (with accents, umlauts, etc.) in any text fields. MS Copilot tells me that I ...
FumbleFingers's user avatar
