On my new Arch installation, perl doesn't seem to play nice with Unicode. For example, given this input file:
ελα ρε
王小红
This command should give me the last two characters of each line:
$ perl -CIO -pe 's/.*(..)$/$1/' file
ε
º¢
However, as you can see above, I get gibberish. The correct output is:
ρε
小红
I know that my terminal (gnome-terminator) supports UTF-8 since these both work as expected:
$ cat file
ελα ρε
王小红
$ perl -pe '' file
ελα ρε
王小红
Unfortunately, without -CIO, perl doesn't deal with the files correctly either:
$ perl -pe 's/.*(..)$/$1/' file
ε
��
It also shouldn't be a locale issue:
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
I'm guessing I need to install some Perl packages, but I don't know which ones. Some relevant information:
$ perl --version | grep subversion
This is perl 5, version 22, subversion 0 (v5.22.0) built for x86_64-linux-thread-multi
$ pacman -Qs unicode
local/fribidi 0.19.7-1
    A Free Implementation of the Unicode Bidirectional Algorithm
local/icu 55.1-1
    International Components for Unicode library
local/libunistring 0.9.6-1
    Library for manipulating Unicode strings and C strings
local/perl 5.22.0-1 (base)
    A highly capable, feature-rich programming language
local/perl-unicode-stringprep 1.105-1
    Preparation of Internationalized Strings (RFC 3454)
local/perl-unicode-utf8simple 1.06-5
    Conversions to/from UTF8 from/to characterse
local/ttf-arphic-uming 0.2.20080216.1-5
    CJK Unicode font Ming style
How can I get my perl installation to play nice with Unicode?


perl -Mutf8::all -pe 's/.*(..)$/$1/' file or perl -Mutf8::all -CIO -pe 's/.*(..)$/$1/' file.
With the utf8::all module, the command you gave did indeed work as expected. I would still like to have the standard -CIO options working, though. I shouldn't need to call another module for this.