Timeline for Command to retrieve the list of characters in a given character class in the current locale
Current License: CC BY-SA 3.0
14 events
when | what | by | license | comment | |
---|---|---|---|---|---|
Mar 23, 2017 at 17:31 | vote | accept | Stéphane Chazelas | ||
Oct 30, 2016 at 8:21 | history | edited | Stéphane Chazelas | CC BY-SA 3.0 |
add context
|
May 13, 2014 at 4:08 | comment | added | mikeserv |
I seem to be able to basically extract my charset from LC_CTYPE with just od -A n -t c <LC_CTYPE | tsort. Probably you've tried it already, but I'd never heard of it before and I was reading through info and it reminded me of this, and it seems to work. There's also ptx, but I think it's less relevant. Anyway, if you haven't tried it and decide to do so, fair warning: it does require a little patience. lehman.cuny.edu/cgi-bin/man-cgi?tsort+1
|
|
May 9, 2014 at 20:11 | history | edited | Stéphane Chazelas | CC BY-SA 3.0 |
code points don't make sense in all locales.
|
May 9, 2014 at 14:34 | history | edited | Stéphane Chazelas | CC BY-SA 3.0 |
added 510 characters in body
|
May 7, 2014 at 16:21 | answer | added | Stéphane Chazelas | timeline score: 3 | |
May 7, 2014 at 16:17 | comment | added | mikeserv |
That's very good! That means you don't need perl at all, I think.
|
|
May 7, 2014 at 16:15 | history | edited | Stéphane Chazelas | CC BY-SA 3.0 |
edited body
|
May 7, 2014 at 2:35 | comment | added | mikeserv |
Yeah - you can get that info parsed - I just finally got around to wrapping up my edit. There are several commands you probably already have installed - at least I did, and I didn't even know about them. I hope it helps. Specifically, recode and uconv can give you what you say you're looking for. Possibly even just luit and od, I guess...
|
|
May 7, 2014 at 0:19 | history | tweeted | twitter.com/#!/StackUnix/status/463835520989016064 | ||
May 6, 2014 at 22:01 | answer | added | mikeserv | timeline score: 10 | |
May 6, 2014 at 22:01 | comment | added | Stéphane Chazelas |
@derobert, yes, while locale (at least the GNU one) retrieves much of the information stored in many of the categories, the things it doesn't retrieve are the most important ones in LC_CTYPE and LC_COLLATE. I wonder if there's a hidden API to retrieve that information or uncompile the locale information.
|
|
May 6, 2014 at 21:11 | comment | added | derobert |
For most locales it ultimately comes from the LC_CTYPE stuff in (with glibc) /usr/share/i18n/locales/i18n ... which of course comes largely from the Unicode Character Database. Of course, it would be nice to have a command
|
|
May 6, 2014 at 20:23 | history | asked | Stéphane Chazelas | CC BY-SA 3.0 |