Stéphane Chazelas

Basically, it's a portability (and reliability) issue.

Initially, echo didn't accept any option and didn't expand anything. All it was doing was outputting its arguments separated by a space character and terminated by a newline character.

Now, someone thought it would be nice if we could do things like echo "\n\t" to output newline or tab characters, or have an option not to output the trailing newline character.

They then thought harder but instead of adding that functionality to the shell (like perl where inside double quotes, \t actually means a tab character), they added it to echo.

David Korn realized the mistake and introduced a new form of shell quotes: $'...' which was later copied by bash and zsh but it was far too late by that time.

Now when a standard UNIX echo receives an argument which contains the two characters \ and t, instead of outputting them, it outputs a tab character. And as soon as it sees \c in an argument, it stops outputting (so the trailing newline is not output either).

Other shells/Unix vendors/versions chose to do it differently: they added a -e option to expand escape sequences, and a -n option to not output the trailing newline. Some have a -E to disable escape sequences, some have -n but not -e, and the list of escape sequences supported by one echo implementation is not necessarily the same as that supported by another.

Sven Mascheck has a nice page that shows the extent of the problem.

On those echo implementations that support options, there's generally no support of a -- to mark the end of options (zsh and possibly others support - for that though), so for instance, it's difficult to output "-n" with echo in many shells.

On some shells like bash¹ or ksh93² or yash ($ECHO_STYLE variable), the behaviour even depends on how the shell was compiled or the environment (GNU echo's behaviour will also change if $POSIXLY_CORRECT is in the environment and with the version⁴, zsh's with its bsd_echo option, some pdksh-based with their posix option or whether they're called as sh or not). So two bash echos, even from the same version of bash, are not guaranteed to behave the same.
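One way to see which flavour of echo a given shell ends up with is to probe it at run time. The following is only a sketch: the function names are made up for illustration, and it only tells the main variants described above apart, not every quirk.

```sh
# Hypothetical probes; each returns 0 (yes) or 1 (no) for the echo
# that the current shell would run.

# Does echo expand backslash sequences without -e (UNIX/XPG style)?
echo_expands_escapes() {
  [ "$(echo '\0101')" = A ]
}

# Does echo treat a leading -n as an option (no output at all here)?
echo_honours_n() {
  [ -z "$(echo -n)" ]
}

# Does echo treat a leading -e as an option (only a newline is output,
# which command substitution strips)?
echo_honours_e() {
  [ -z "$(echo -e)" ]
}

for probe in echo_expands_escapes echo_honours_n echo_honours_e; do
  if "$probe"; then
    printf '%s: yes\n' "$probe"
  else
    printf '%s: no\n' "$probe"
  fi
done
```

The answers depend on the shell, its options and its environment, which is exactly the problem.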

POSIX says: if the first argument is -n or any argument contains backslashes, then the behaviour is unspecified. bash's echo is not POSIX-conformant in that regard: for instance, echo -e does not output -e<newline> as POSIX requires. The UNIX specification is stricter: it prohibits -n and requires expansion of some escape sequences, including the \c one to stop outputting.

Those specifications don't really come to the rescue here given that many implementations are not compliant. Even some certified systems like macOS⁵ are not compliant.

To really represent the current reality, POSIX should actually say: if the first argument matches the ^-([eEn]*|-help|-version)$ extended regexp or any argument contains backslashes (or characters whose encoding contains the encoding of the backslash character like α in locales using the BIG5 charset), then the behaviour is unspecified.
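That rule can be turned into a check. The sketch below uses a made-up function name, echo_args_are_safe, and plain case patterns instead of an extended regexp. It is an approximation: it does not address the BIG5-style case where the encoding of another character contains the backslash byte.

```sh
# Hypothetical helper: returns 0 if "echo" called with these arguments
# falls outside the "unspecified" cases described above, 1 otherwise.
echo_args_are_safe() {
  case $1 in
    (--help | --version) return 1;;
    (-*)
      case ${1#-} in
        (*[!eEn]*) ;;     # some other character: not taken as options
        (*) return 1;;    # "-" alone, or only e, E and n after the "-"
      esac;;
  esac
  for arg do
    case $arg in
      (*\\*) return 1;;   # a backslash anywhere in any argument
    esac
  done
  return 0
}

if echo_args_are_safe "$@"; then
  echo "$@"
else
  printf 'not safe to pass to echo\n' >&2
fi
```

In practice it is simpler to skip the check and always use printf.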

All in all, you don't know what echo "$var" will output unless you can make sure that $var doesn't contain backslash characters and doesn't start with -. The POSIX specification actually does tell us to use printf instead in that case.

So what that means is that you can't use echo to display uncontrolled data. In other words, if you're writing a script and it is taking external input (from the user as arguments, or file names from the file system...), you can't use echo to display it.

This is OK:

echo >&2 Invalid file.

This is not:

echo >&2 "Invalid file: $file"

(Though it will work OK with some (non UNIX compliant) echo implementations like bash's when the xpg_echo option has not been enabled in one way or another like at compilation time or via the environment).
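For comparison, here is the same message written with printf (covered further down). The function name report_invalid is only there for illustration:

```sh
# Hypothetical helper: the text is a fixed format string and the
# uncontrolled file name is passed as a separate argument.
report_invalid() {
  printf >&2 'Invalid file: %s\n' "$1"
}

file='-n a\tb'
report_invalid "$file"
```

The backslash and the leading dash in $file are output as they are.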

file=$(echo "$var" | tr ' ' _) is not OK in most implementations (exceptions being yash with ECHO_STYLE=raw (with the caveat that yash's variables can't hold arbitrary sequences of bytes so not arbitrary file names) and zsh's echo -E - "$var"⁶).
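A printf-based equivalent of that pipeline, as a minimal sketch with a sample value chosen for illustration:

```sh
var='-e a\tb c'
# printf passes the content of $var through unchanged, whatever it
# starts with or contains; tr then replaces the spaces.
file=$(printf '%s\n' "$var" | tr ' ' _)
printf '%s\n' "$file"
```

Note that command substitution still strips trailing newline characters, so a $var ending in newlines would lose them either way.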

printf, on the other hand is more reliable, at least when it's limited to the basic usage of echo.

printf '%s\n' "$var"

Will output the content of $var followed by a newline character regardless of what character it may contain.

printf '%s' "$var"

Will output it without the trailing newline character.

Now, there are also differences between printf implementations. There's a core of features that is specified by POSIX, but then there are a lot of extensions. For instance, some support a %q to quote the arguments but how it's done varies from shell to shell, and some support \uxxxx for unicode characters. The behaviour of printf '%10s\n' "$var" varies in multi-byte locales, and there are at least three different outcomes for printf %b '\123'.

But in the end, if you stick to the POSIX feature set of printf and don't try doing anything too fancy with it, you're out of trouble.

But remember the first argument is the format, so it shouldn't contain variable/uncontrolled data.
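To illustrate with a sample value (chosen here only for demonstration):

```sh
var='100%s \n done'

# Risky: $var would be used as the format, so the % and \ sequences
# it contains would be interpreted by printf.
#   printf "$var\n"

# Safer: a constant format, with the data passed as an argument.
printf '%s\n' "$var"

# Fixed text belongs in the format, variable data in the arguments.
printf 'Status: %s\n' "$var"
```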

A more reliable echo can be implemented using printf, like:

echo() ( # subshell for local scope for $IFS
  IFS=" " # needed for "$*"
  printf '%s\n' "$*"
)

echo_n() (
  IFS=" "
  printf %s "$*"
)

echo_e() (
  IFS=" "
  printf '%b\n' "$*"
)

The subshell (which implies spawning an extra process in most shell implementations) can be avoided using local IFS with many shells, or by writing it like:

echo() {
  if [ "$#" -gt 0 ]; then
     printf %s "$1"
     shift
  fi
  if [ "$#" -gt 0 ]; then
     printf ' %s' "$@"
  fi
  printf '\n'
}
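The local IFS variant mentioned above might look like the sketch below. local is not specified by POSIX, so this assumes a shell that supports it (bash, dash, zsh and others do; ksh93 does not in functions defined this way). The function is named say here, a made-up name, so that it doesn't shadow the builtin while experimenting:

```sh
# Hypothetical name; rename to echo to replace the builtin.
say() {
  local IFS=' '        # only affects this function; needed for "$*"
  printf '%s\n' "$*"
}

say -n 'a\tb' c
```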

Notes

1. How bash's echo behaviour can be altered.

With bash, at run time, there are two things that control the behaviour of echo (beside enable -n echo or redefining echo as a function or alias): the xpg_echo bash option and whether bash is in posix mode. posix mode can be enabled if bash is called as sh or if POSIXLY_CORRECT is in the environment or with the posix option:

Default behaviour on most systems:

$ bash -c 'echo -n "\0101"'
\0101% # the % here denotes the absence of newline character

xpg_echo expands sequences as UNIX requires:

$ BASHOPTS=xpg_echo bash -c 'echo "\0101"'
A

It still honours -n and -e (and -E):

$ BASHOPTS=xpg_echo bash -c 'echo -n "\0101"'
A%

With xpg_echo and POSIX mode:

$ env BASHOPTS=xpg_echo POSIXLY_CORRECT=1 bash -c 'echo -n "\0101"'
-n A
$ env BASHOPTS=xpg_echo sh -c 'echo -n "\0101"' # (where sh is a symlink to bash)
-n A
$ env BASHOPTS=xpg_echo SHELLOPTS=posix bash -c 'echo -n "\0101"'
-n A

This time, bash is both POSIX and UNIX conformant. Note that in POSIX mode alone (without xpg_echo), bash is still not POSIX conformant as it doesn't output -e in:

$ env SHELLOPTS=posix bash -c 'echo -e'

$

The default values for xpg_echo and posix can be defined at compilation time with the --enable-xpg-echo-default and --enable-strict-posix-default options to the configure script. That's typically what recent versions of OS/X do to build their /bin/sh. No Unix/Linux implementation/distribution in their right mind would typically do that for /bin/bash though. Actually, that's not true, the /bin/bash that Oracle ships with Solaris 11 (in an optional package) seems to be built with --enable-xpg-echo-default (that was not the case in Solaris 10).

2. How ksh93's echo behaviour can be altered.

In ksh93, whether echo expands escape sequences or not and recognises options depends on the content of the $PATH and/or $_AST_FEATURES environment variables.

If $PATH contains a component that contains /5bin or /xpg before the /bin or /usr/bin component then it behaves the SysV/UNIX way (expands sequences, doesn't accept options). If it finds /ucb or /bsd first or if $_AST_FEATURES⁷ contains UNIVERSE = ucb, then it behaves the BSD³ way (-e to enable expansion, recognises -n).

The default is system dependent, BSD on Debian (see the output of builtin getconf; getconf UNIVERSE in recent versions of ksh93):

$ ksh93 -c 'echo -n' # default -> BSD (on Debian)
$ PATH=/foo/xpgbar:$PATH ksh93 -c 'echo -n' # /xpg before /bin or /usr/bin -> XPG
-n
$ PATH=/5binary:$PATH ksh93 -c 'echo -n' # /5bin before /bin or /usr/bin -> XPG
-n
$ PATH=/5binary:$PATH _AST_FEATURES='UNIVERSE = ucb' ksh93 -c 'echo -n' # -> BSD
$ PATH=/ucb:/foo/xpgbar:$PATH ksh93 -c 'echo -n' # /ucb first -> BSD
$ PATH=/bin:/foo/xpgbar:$PATH ksh93 -c 'echo -n' # /bin before /xpg -> default -> BSD

3. BSD for echo -e?

The reference to BSD for the handling of the -e option is a bit misleading here. Most of those different and incompatible echo behaviours were all introduced at AT&T:

  • \n, \0ooo, \c in Programmer's Work Bench UNIX (based on Unix V6), and the rest (\b, \r...) in Unix System III (Ref).
  • -n in Unix V7 (by Dennis Ritchie, Ref)
  • -e in Unix V8 (by Dennis Ritchie, Ref)
  • -E itself possibly initially came from bash (CWRU/CWRU.chlog in version 1.13.5 mentions Brian Fox adding it on 1992-10-18, GNU echo copying it shortly after in sh-utils-1.8 released 10 days later)

While the echo builtin of the sh of BSDs has supported -e since the day they started using the Almquist shell for it in the early 90s, the standalone echo utility to this day doesn't support it there (FreeBSD echo still doesn't support -e, though it does support -n like Unix V7 (and also \c but only at the end of the last argument)).

The handling of -e was added to ksh93's echo when in the BSD universe in the ksh93r version released in 2006 and can be disabled at compilation time.

4. GNU echo change of behaviour in 8.31

Since coreutils 8.31 (and this commit), GNU echo now expands escape sequences by default when POSIXLY_CORRECT is in the environment, to match the behaviour of bash -o posix -O xpg_echo's echo builtin (see bug report).

5. macOS echo

Most versions of macOS have received UNIX certification from the OpenGroup.

Their sh builtin echo is compliant as it's bash (a very old version) built with xpg_echo enabled by default, but their stand-alone echo utility is not. env echo -n outputs nothing instead of -n<newline>, env echo '\n' outputs \n<newline> instead of <newline><newline>.

That /bin/echo is the one from FreeBSD which suppresses newline output if the first argument is -n or (since 1995) if the last argument ends in \c, but doesn't support any other backslash sequences required by UNIX, not even \\.

6. echo implementations that can output arbitrary data verbatim

Strictly speaking, you could also count that FreeBSD/macOS /bin/echo above (not their shell's echo builtin). With it, the equivalent of zsh's echo -E - "$var" or yash's ECHO_STYLE=raw echo "$var" (printf '%s\n' "$var") could be written:

/bin/echo "$var
\c"

And zsh's echo -nE - "$var" (printf %s "$var") could be written

/bin/echo "$var\c"

7. _AST_FEATURES and the AST UNIVERSE

The _AST_FEATURES variable is not meant to be manipulated directly; it is used to propagate AST configuration settings across command execution. The configuration is meant to be done via the (undocumented) astgetconf() API. Inside ksh93, the getconf builtin (enabled with builtin getconf or by invoking command /opt/ast/bin/getconf) is the interface to astgetconf().

For instance, you'd do builtin getconf; getconf UNIVERSE = att to change the UNIVERSE setting to att (causing echo to behave the SysV way among other things). After doing that, you'll notice the $_AST_FEATURES environment variable contains UNIVERSE = att.
