need `function f { ...}` syntax in ksh93 to get local variable scope.
Source Link
Stéphane Chazelas
  • 584.9k
  • 96
  • 1.1k
  • 1.7k
    #! /bin/ksh93 -
    float input="$1" # get it as input from the user in his locale
    float output
    function arith { typeset LC_ALL=C; (($@)); }
    arith output=input/1.2 # use the dot here as it will be interpreted
                           # under LC_ALL=C
    echo "$output" # output in the user's locale
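The script above depends on ksh93's `function arith { ...; }` form to keep `typeset LC_ALL=C` local to the function. As a rough sketch for POSIX-style shells that lack that scoping, a function whose body is a subshell confines the change in a similar way. The name `in_c_locale` is hypothetical, and unlike the ksh93 version it cannot assign variables in the caller, so any result has to come back on stdout.

```shell
# in_c_locale: hypothetical helper, not part of the original answer.
# The ( ... ) body runs in a subshell, so the LC_ALL assignment
# cannot leak into the calling shell.
in_c_locale() (
  LC_ALL=C
  export LC_ALL
  "$@"
)

# Usage: format a number with a dot as the decimal separator,
# whatever the caller's locale is.
half=$(in_c_locale printf '%.2f\n' 0.5)
echo "$half"
```

The caller's `LC_ALL` (set or unset) should be the same before and after the call, which is the property the `function f { ...; }` syntax provides in ksh93.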
Make sentence clearer
  • When you need characters to be bytes. Nowadays, most locales are UTF-8 based, which means a character can take up from 1 to 6 bytes³. When you use text utilities on data that is meant to be treated as bytes, you'll want to set LC_ALL=C. It will also improve performance significantly, because parsing UTF-8 data has a cost.

  • A corollary of the previous point: when processing text where you don't know what character set the input is written in, but can assume it's compatible with ASCII (as virtually all charsets are). For instance, grep '<.*>' to look for lines containing a <, > pair will not work if you're in a UTF-8 locale and the input is encoded in a single-byte 8-bit character set like iso8859-15. That's because . only matches characters, and non-ASCII characters in iso8859-15 are likely not to form a valid character in UTF-8. On the other hand, LC_ALL=C grep '<.*>' will work because any byte value forms a valid character in the C locale.

  • Any time you process input or output data that does not come from, or is not intended for, a human. If you're talking to a user, you may want to use their conventions and language, but if, for instance, you generate numbers to feed to another application that expects English-style decimal points, or English month names, you'll want to set LC_ALL=C:

     $ printf '%g\n' 1e-2
     0,01
     $ LC_ALL=C printf '%g\n' 1e-2
     0.01
     $ date +%b
     août
     $ LC_ALL=C date +%b
     Aug
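The first two points can be checked with a small sketch like the one below. The function names are hypothetical, the non-ASCII bytes are written as octal escapes so the script itself stays ASCII, and the expected counts assume a `wc` and `grep` that treat every byte as a character in the C locale.

```shell
# Count "characters" in a UTF-8 encoded e-acute (bytes 0xC3 0xA9).
# Under LC_ALL=C each byte counts as one character, so this should report 2.
count_chars_c() {
  printf '\303\251' | LC_ALL=C wc -m | tr -d ' '
}

# Look for a <...> pair around byte 0xE9 (e-acute in iso8859-15),
# which on its own is not a valid UTF-8 sequence.
# Under LC_ALL=C, "." matches any byte, so one matching line should be counted.
match_latin1_c() {
  printf '<\351>\n' | LC_ALL=C grep -c '<.*>'
}

count_chars_c
match_latin1_c
```

Dropping the `LC_ALL=C` prefix and running the same pipelines in a UTF-8 locale is a quick way to see the difference described above.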
    