5
  • Just a few thoughts/comments: 1. zsh is a good suggestion IMHO... I had no idea! 2. I generally use the "Unicode" alphabet in my filenames (e.g. é ç); does zsh's a-z include those characters? 3. However... from a strictly file system perspective, I wonder if it might be "better" to convert to ASCII? Oh - please don't feel compelled to modify your answer; this comment is for my own benefit as much as anything else :) Commented Oct 21 at 22:07
  • 1
    @Seamus, zsh's [a-z] only matches the 26 ASCII lowercase letters. Elsewhere, YMMV. Some will include ç or 🆕, some will include multi-character collating elements (which would throw off your count), some won't. Most won't include ZzŹźŻżŽžᶻᷦẐẑẒẓẔẕℤℨ⒵ⓏⓩＺｚ𝐙𝐳𝑍𝑧𝒁𝒛𝒵𝓏𝓩𝔃𝔷𝕫𝖅𝖟𝖹𝗓𝗭𝘇𝘡𝘻𝙕𝙯𝚉𝚣🄩🅉🅩🆉 as they sort after z. Commented 2 days ago
  • 1
    For alphanumeric characters (of any script, not just Latin), you can use [[:alnum:]]. For Latin alphabet letters only, including ź, æ and ç, best is probably to use perl and its \p{latin} match by character property. Not available in PCRE, so not in zsh's [[ -pcre-match ]] unfortunately. Commented 2 days ago
  • 1
    Using [[=a=][=b=]...[=z=]] in some regexp and glob engines (not zsh globs, but it will work with its [[ =~ ]] if rematchpcre is not enabled and the system's ERE supports it) could be another approach. Commented 2 days ago
  • 1
    That would miss some ligatures, though, such as ᴁ used in French or the German ß, and include symbols which may not be classified as alnum, such as degree Celsius symbols. Commented 2 days ago