
It is known that a path could contain newlines in any of its components.

Should we then conclude that the environment variable $PATH could contain newlines?

If so, how can $PATH be split into its elements, along the lines of this (Bourne-like) loop:

    IFS=':' ; set -f
    for var in $PATH
    do
        echo "<$var>"
    done

But if it could be done without changing IFS, even better.

  • Note that in the Bourne shell (contrary to POSIX shells), /bin::/usr/bin would be split into /bin and /usr/bin instead of /bin, "" and /usr/bin. Commented Dec 31, 2018 at 11:12

2 Answers

4

In POSIX shells, $IFS is a field delimiter, not a separator, so a $PATH value like /bin:/usr/bin: would be split into /bin and /usr/bin instead of /bin, /usr/bin and the empty string (meaning the current directory). To keep that trailing empty element, you need:

    IFS=:; set -o noglob
    for var in $PATH""; do
      printf '<%s>\n' "$var"
    done

To avoid modifying global settings, you can use a shell with explicit splitting operators like zsh:

    for var in "${(s/:/@)PATH}"; do
      printf '<%s>\n' "$var"
    done

Though in that case, zsh already has the $path array tied to $PATH like in csh/tcsh, so:

    for var in "$path[@]"; do
      printf '<%s>\n' "$var"
    done
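In shells without such operators, the same goal (leaving IFS and the glob setting untouched) can be approached with plain POSIX parameter expansion. The following is only a minimal sketch and not part of the answer above: `split_path`, `rest` and `elem` are hypothetical names, and the function does overwrite those two ordinary variables.

```shell
# Hypothetical helper: print each element of a colon-separated list,
# without modifying IFS or the noglob option.
# Note: it overwrites the global variables "rest" and "elem".
split_path() {
  rest=$1:              # add a terminating colon so the last element is handled like the others
  while [ -n "$rest" ]; do
    elem=${rest%%:*}    # everything before the first colon (may be empty)
    rest=${rest#*:}     # everything after the first colon
    printf '<%s>\n' "$elem"
  done
}

split_path '/bin::/usr/bin:'
```

With the sample value this prints four elements: `</bin>`, `<>`, `</usr/bin>` and `<>`, so empty elements in the middle and at the end are preserved. Elements containing newlines or wildcard characters pass through unchanged because nothing is ever expanded unquoted.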

In any case, yes: in theory $PATH, like any variable, could contain newline characters. The newline character is not special in any way when it comes to file path resolution. I don't expect anyone sensible would put a directory with a newline (or wildcards) in its name in their $PATH, or name a command with a newline in its name. It's also hard to imagine a scenario where someone could exploit a script that assumes $PATH won't contain newline characters.
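The claim that a newline is not special for path resolution can be checked with a small experiment. This is a sketch under stated assumptions: `mktemp -d` is available (it is not specified by POSIX), the temporary directory is not mounted noexec, and `hello_nl` is a made-up command name.

```shell
# Assumption: mktemp -d exists and the temporary directory allows execution.
tmp=$(mktemp -d) || exit 1
dir="$tmp/new
line"                                   # directory name with an embedded newline
mkdir -- "$dir" || exit 1

# hello_nl is a hypothetical command that only exists in that directory.
printf '#!/bin/sh\necho found\n' > "$dir/hello_nl"
chmod +x "$dir/hello_nl"

# Look the command up through a PATH whose first element contains the newline.
# The command substitution is a subshell, so the caller's PATH is unchanged.
result=$(PATH="$dir:$PATH"; hello_nl)
printf '%s\n' "$result"

rm -rf -- "$tmp"
```

If the lookup succeeds, the script prints the output of the hypothetical command, which shows the shell treating the newline as just another character of the directory name.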

  • I'm saying that set -o noglob is more portable among the shells that can run that code if we want to consider zsh -o shwordsplit -o globsubst in that list. set -f is more portable among ancient Bourne-like shells, but those shells that don't support set -o noglob cannot run that code correctly anyway. When zsh was written in 1990, csh/tcsh were by far the most popular shells at the time. All of ksh/bash/zsh borrowed features from csh (...) Commented Jan 2, 2019 at 11:11
  • (...) csh had the -f option (for fast start) long before the Bourne shell added its -f to disable glob. So if you want to blame something for breaking compatibility, blame the Bourne (SysV) shell. There would be no reason why one would want to disable glob if it weren't for that bug of the Bourne shell whereby globbing is performed upon expansions. zsh fixed that bug, so set -o noglob is not needed there unless in sh emulation (where set -f works to disable it) or the globsubst option is enabled. Commented Jan 2, 2019 at 11:13
2

Yes, PATH can contain newlines (even on ancient Unix systems).

As to splitting any string in shell, the only way you can do it portably is with IFS. You can use IFS=:; set -f; set -- $PATH or pass it to a function instead of looping with for, though.
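As a minimal sketch of the "pass it to a function" idea, the splitting can be confined to a function whose body is a subshell, so that neither IFS nor the noglob setting leaks into the caller. The name `split_colon` is hypothetical, and the appended `""` is there to keep a trailing empty field.

```shell
# Hypothetical helper. The body is wrapped in ( ... ), so it runs in a
# subshell and the IFS and noglob changes vanish when it returns.
split_colon() (
  IFS=:; set -f          # split on ":" only, and disable globbing before expanding
  set -- $1""            # unquoted on purpose; "" keeps a trailing empty field
  printf '<%s>\n' "$@"
)

split_colon '/bin:/opt/*:'
```

For the sample value this prints `</bin>`, `</opt/*>` and `<>`: the wildcard is left alone because globbing was disabled before the expansion, and the trailing colon yields an empty last field.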

With bash you can also "read" a string into an array:

    xtra=$'some\nother\nplace\n\n'; PATH="$PATH:$xtra"
    mapfile -td: path < <(printf %s "$PATH")
    printf '<%s>\n' "${path[@]}"

But using arrays is usually not a good idea, because they can't be stored transparently in environment variables or passed as a single argument to external commands.

Notice that IFS terminates fields rather than separating them (much as a \n at the end of a file is not treated as an extra empty line by programs that read the file line by line). If that is not what you want, and you really need an extra empty field at the end when splitting a string that ends in a character from IFS, append an empty string to the variable that is subject to word splitting:

    (P=/bin:; IFS=:; printf '<%s>\n' $P"")
    </bin>
    <>

The word splitting algorithm also ignores whitespace characters at the beginning of the string if those whitespace characters are part of IFS. If you want an extra field for the leading whitespace, also prepend an empty string to the variable:

    (P='   foo : bar  '; IFS=': '; set -f; set -- $P; printf '<%s>\n' "$@")
    <foo>
    <bar>

    (P='   foo : bar  '; IFS=': '; set -f; set -- ""$P""; printf '<%s>\n' "$@")
    <>
    <foo>
    <bar>
    <>
  • Using arrays is often an excellent idea, as a number of answers here on unix.SE show. It's almost impossible to handle lists of strings with arbitrary data without using an array. You only need lists of paths with whitespace, or a list of command arguments to get the issue. Of course you can use the positional parameters instead of an array, but those aren't any better regarding the points you mention: they can't be sanely pushed through the environment, nor passed as a single argument to external commands. Commented Dec 31, 2018 at 10:32
  • No, you need the set -f to take effect before the $PATH expansion. So it should be set -o noglob; set -- $PATH"" Commented Dec 31, 2018 at 10:41
  • @ilkkachu fwiw, quoting the here-string variable is not needed: x='a b'; mapfile -td: <<< $x y; printf '<%s>\n' "$y"; but the added trailing newline is a problem, really. Commented Dec 31, 2018 at 12:14
  • @ilkkachu and that's documented in the bash manual, under "Here Strings": "Pathname expansion and word splitting are not performed" Commented Dec 31, 2018 at 12:25
  • @StéphaneChazelas thanks, I've changed it to use a process substitution instead. Commented Dec 31, 2018 at 12:36
