15,318 questions
1
vote
2
answers
98
views
Parsing UTF-8 XML using DefaultHandler: when / how does it become UTF-16 in Java?
I have a Java program that was working perfectly in Corretto 17, but is now having character set encoding issues in Corretto 25.
I am reading a UTF-8 encoded XML from an external API. The code is ...
4
votes
1
answer
107
views
Are spelling variations of encoding identifiers for "setlocale" standardized or documented?
This question has to do with syntactic conventions for string encoding identifiers in locale names passed to setlocale in C, focusing on the particular example of UTF-8. My preliminary observation is ...
0
votes
0
answers
69
views
Avoid encoding emojis when using yq --prettyPrint
When I convert my JSON file to YAML, I want any Unicode characters to be part of the content and not escaped with backslashes and quotes. Example: This is my file.json
What can I do in order to make yq not ...
0
votes
0
answers
42
views
How to show GBK encoded header in Mac version Unity editor properly?
I work on my project on both Mac and Windows. Some editor headers were written in Visual Studio on Windows in Chinese, and were encoded in GBK (the default, I guess). However, they don't show properly on Mac ...
1
vote
0
answers
38
views
How can I know what encoding to use when reading & writing CFDataRef entries to macOS Keychain via the C-based SecItem API?
I'm writing FFI bindings to macOS' C-based Keychain Services API for use in a plugin.
The Keychain Services API takes CFDataRef values, which allows storing arbitrary bytes to the Keychain, but the ...
-1
votes
1
answer
36
views
How to pass special chars to nodemailer?
I want to pass a string to nodemailer so that it results in
=C2=A0
I tried passing both 'Â ' (the literal chars), "\xC2\xA0", etc.
But they always result in =C3=A9
A longer example in ...
3
votes
1
answer
115
views
How to efficiently split a text file with an arbitrary Charset without damaging code points?
Given a valid text file file and its java.nio.charset.Charset, how can I efficiently (preferably using RandomAccessFile.seek() or InputStream.skip(), without reading the whole file) split it into two ...
-5
votes
1
answer
100
views
Swift: text/string vs. raw bytes? [closed]
I gave the following Swift code to ChatGPT:
let data = text.data(using: .utf8)
It answered:
"This line takes your string and turns it into raw bytes (data) that can be stored, sent over the ...
0
votes
1
answer
127
views
Why does Wikipedia claim UTF-16 is obsolete when JavaScript uses it?
The Wikipedia page for UTF-16 claims that it is obsolete, saying:
UTF-16 is the only encoding (still) allowed on the web that is incompatible with 8-bit ASCII. However it has never gained popularity ...
1
vote
0
answers
32
views
Clipper (CP437) Character Display Issue on AlmaLinux 9.6 Minimal Server Terminal [duplicate]
I'm working with an AlmaLinux 9.6 Minimal server that hosts a Clipper-programmed system. This Clipper system uses the CP437 character set.
I've successfully configured client computers to display the ...
0
votes
0
answers
81
views
How to print Greek letters in C++ [duplicate]
Recently, I have been working on a small program that requires me to print messages in Greek. Nothing fancy, just printing Greek sentences in the terminal. For example:
#include <iostream>
int ...
0
votes
0
answers
46
views
Accents replaced with a strange character in Symfony 6.4 [duplicate]
For the past 3 days, characters with French accents have no longer appeared and are replaced by a �.
Everything was fine before.
In phpMyAdmin the display is correct.
It only happens on the remote server; locally, all ...
3
votes
1
answer
128
views
Python subprocess: Trouble receiving non-ASCII characters from output of yt-dlp
(Windows, Python 3.9.6, yt-dlp 2025.06.09)
I have a YouTube playlist whose titles contain both Korean and Latin characters. I can print the titles of the videos in the playlist using the command below:...
0
votes
0
answers
76
views
How to tell if these two identical-looking but differently encoded strings are the same? [duplicate]
I'm trying to import some data downloaded from Google Sheets.
The tab, when editing on the website, is called "Kodály". In Ruby, if I look at the individual characters, I see this:
>> ...
0
votes
1
answer
50
views
glib iconv() - force conversion to single bytes
I have my own personal movie database system, within which context I NEVER want to see "extended" characters (with accents, umlauts, etc.) in any text fields.
MS Copilot tells me that I ...