And then use things like the following (the latin1-style handling it relies on is described further down):

$ logger $'St\xe9phane'
$ journalctl --since today -o json | perl -MJSON -lne '
   BEGIN{$j = JSON->new}
   # incremental parsing: avoid assuming one and only one JSON object per line
   $j->incr_parse($_);
   while ($obj = $j->incr_parse) {
     $msg = $obj->{MESSAGE};
     # handle the array-of-byte-values representation of non-UTF-8 messages
     $msg = join "", map(chr, @$msg) if ref $msg eq "ARRAY";
     print $msg
   }' |
   sed -n '/phane/l'
St\351phane$
A possible (not fully satisfactory) approach, if one doesn't need to consider any of the strings in the JSON as text, is to pre-process the input to the JSON-processing tool (jq, mlr...) with iconv -f latin1 -t utf-8 and post-process its output with iconv -f utf-8 -t latin1, that is, convert all bytes >= 0x80 to the character with the corresponding Unicode code point, or in other words, treat the input as if it were encoded in latin1.
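
As a rough sketch of that iconv sandwich with jq (the select((.assoc | tostring) == "3") filter is just one illustrative way to pick the entry for fd 3; it assumes the same .lsfd[].name layout as the perl example below and should show the same \200\377 name):

$ exec 3> $'\x80\xff'
$ lsfd -Jp "$$" |
   iconv -f latin1 -t utf-8 |
   jq -r '.lsfd[] | select((.assoc | tostring) == "3").name' |
   iconv -f utf-8 -t latin1 |
   sed -n l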

The same kind of query can also be done with perl and its JSON module:

$ exec 3> $'\x80\xff'
$ lsfd -Jp "$$" | perl -MJSON -l -0777 -ne '
   # -0777: slurp the whole JSON document at once before decoding
   $_ = JSON->new->decode($_);
   # keep the entry for file descriptor 3 and print its name
   print $_->{name} for grep {$_->{assoc} == 3} @{$_->{lsfd}}' |
   sed -n l
/home/chazelas/tmp/\200\377$

Using latin1 also makes it relatively easy to deal with journalctl's representation of non-UTF-8 messages as arrays of byte values like [1, 2, 3], where we just need to convert those byte values to the character with the corresponding Unicode code point (and when encoded as latin1, you get the right byte back), as done in the journalctl example at the top.
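
A sketch of the same thing with jq and its implode builtin, wrapped in the iconv sandwich described above (with the same logger test entry as at the top, this should print the same St\351phane line):

$ journalctl --since today -o json |
   iconv -f latin1 -t utf-8 |
   jq -r '.MESSAGE | if type == "array" then implode else . end' |
   iconv -f utf-8 -t latin1 |
   sed -n '/phane/l'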
