This question asks how to get a list of environment variable names in POSIX sh. The top answer suggests invoking awk via the shell, but gives this caveat:

The output is ambiguous if the name of an environment variable contains a newline, which is extremely unusual but technically possible.

My initial reaction to this sentence was that it's incorrect, in the sense that a POSIX shell executing awk could never export such a name. Environment variable names in general can contain anything except =, since POSIX defines the environment as an array of name=value strings. However, utilities (and thus sh) are specifically limited in the names they use. For example:

$ cat >main.c <<EOF
#include <unistd.h>
int main(int argc, char *argv[]) {
    char *env[] = {"A=1", "B\nC=2", (char *)0};
    execve(argv[1], argv + 1, env);
}
EOF
$ cc main.c
$ ./a.out /usr/bin/env
A=1
B
C=2
$ ./a.out /bin/sh -c env
A=1
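The same ambiguity can be reproduced without a compiler. The following is a minimal sketch, assuming the awk approach prints the keys of ENVIRON one per line (the linked answer's exact command is not reproduced here). `list_names_demo` is a hypothetical helper name, and the sketch relies on env accepting an argument whose name part contains a newline, which common implementations do but which is an assumption rather than a guarantee.

```shell
# A literal newline, built without relying on $'...' quoting.
nl='
'

# Hypothetical helper: run awk with exactly three environment variables,
# one of which has a newline in its name, and print every name in ENVIRON.
list_names_demo() {
    env -i PATH="$PATH" A=1 "B${nl}C=2" \
        awk 'BEGIN { for (name in ENVIRON) print name }'
}

list_names_demo
```

Three variables go in, but an awk that keeps the odd entry prints four lines (A, B, C and PATH, in unspecified order), so nothing in the output shows that B and C belong to a single name.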

The relevant portion of the standard seems to be this:

Environment variable names used by the utilities in the Shell and Utilities volume of POSIX.1-2024 consist solely of uppercase letters, digits, and the <underscore> ('_') from the characters defined in Portable Character Set and do not begin with a digit. Other characters, and byte sequences that do not form valid characters, may be permitted by an implementation; applications shall tolerate the presence of such names.

The last sentence seems to suggest I'm wrong. I don't think names outside these criteria can become exported shell variables, since the shell standard says:

Shell variables shall be initialized only from environment variables that have valid names.

But, could an implementation of POSIX sh keep invalid names in the environment that it executes utilities under (and just not ever make them shell variables)?
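For reference, the "valid names" rule quoted above can be approximated in POSIX sh. This is only a sketch, under the assumption that "valid" means the standard's definition of a name: letters, digits and underscores from the portable character set, not starting with a digit. `is_valid_name` is a hypothetical helper, not something any shell exposes.

```shell
# Hypothetical helper: succeed if $1 could be a shell variable name.
# The body runs in a subshell so that forcing the C locale (to keep the
# bracket ranges to ASCII) does not leak into the caller.
is_valid_name() (
    LC_ALL=C
    case $1 in
        # Reject the empty string, a leading digit, or any other character.
        '' | [0-9]* | *[!A-Za-z0-9_]*) exit 1 ;;
    esac
    exit 0
)

nl='
'
for candidate in A _x1 1A "B${nl}C" '%%foo'; do
    if is_valid_name "$candidate"; then
        echo "valid"
    else
        echo "invalid"
    fi
done
```

Under this check, the B<newline>C name from the C demo is rejected, which matches the observation that it never shows up as a shell variable. It says nothing about whether the shell keeps the string in the environment it hands to utilities.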

  • Rather than be overly judgemental and label things as abominations like a wild-eyed zealot, it's far better and far more useful to strive to put Postel's Law into practice with every piece of code you write. Postel's Law is also known as the Robustness Principle and can be summarised as: "be conservative in what you send, be liberal in what you accept". While it needs to be applied with caution, it is a good part of the reason why the internet exists and doesn't suck anywhere near as much as it could, and why interoperability is possible. Commented Sep 6 at 16:14
  • You certainly shouldn't ignore, or deliberately sabotage, parts of a spec just because you don't like them or think they're too much trouble. Every character except NUL is valid in a pathname. Every character except NUL and / is valid in a filename. The posix spec allows newlines and other "annoying" characters in environment variable names and requires that applications "tolerate the presence of such names". If you fail to do that, it's not the file or variable names that are at fault, it's you and your code. Commented Sep 6 at 16:18
  • Also, as @emron's C code showed, shells are not the only source of environment variables. Pretty much every language has some trivial method to export them to the environment of child processes. And shells can obviously be child processes that inherit annoyingly-named variables from parent processes. Commented Sep 6 at 16:19
  • @GyroGearloose, Linus might have a say about how things are done by the Linux kernel, but not all of Unix-land is Linux. He's not god or king and he can't make e.g. the BSDs or commercial Unixen do things his way. Also, even the Linux kernel quite happily accepts pretty much any binary string in filenames. You might consider putting down the book you found those commandments in... Commented Sep 6 at 20:45
  • I'm not saying how or why applications should filter their inputs, just that they should. It's up to the application to make sure its input data is handled in a safe way. Commented Sep 7 at 1:02

1 Answer

could an implementation of POSIX sh keep invalid names in the environment that it executes utilities under

Well, based on a simple test with env, Bash, ksh, yash and zsh seem to pass through env vars with funny names:

% env $'%%foo * bar\nsecond line=foo' ksh -c 'echo $KSH_VERSION; env |grep -A1 %%'
Version AJM 93u+ 2012-08-01
%%foo * bar
second line=foo

% env $'%%foo * bar\nsecond line=foo' ./bash -c 'echo $BASH_VERSION; env |grep -A1 %%'
5.3.0(1)-release
%%foo * bar
second line=foo

$ env $'%%foo * bar\nsecond line=foo' yash -c 'echo $YASH_VERSION; env |grep -A1 %%'
2.55
%%foo * bar
second line=foo

$ env $'%%foo * bar\nsecond line=foo' zsh -c 'echo $ZSH_VERSION; env |grep -A1 %%'
5.9
%%foo * bar
second line=foo

I don't know whether those four are breaking the spec, and you can decide how far you count them as "POSIX shells" anyway, but at least some POSIX-like implementations that pass such envvars through do exist. As far as I tested, BusyBox and dash dropped the odd variable, as did zsh on macOS for some reason.
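To repeat the test on other shells without relying on $'...' quoting, something along these lines should work. It is a rough sketch: `passes_through` and the PROBE/NAME variable are made-up names, the list of shells is only an example, and it assumes the local env accepts a name containing a newline.

```shell
# A literal newline, built without $'...' quoting.
nl='
'
odd_name="PROBE${nl}NAME"

# Hypothetical helper: succeed if the shell named by $1 hands the oddly
# named variable on to a utility that it runs. The second line of the
# entry, "NAME=marker", is what grep looks for in the output of env.
passes_through() {
    env "${odd_name}=marker" "$1" -c 'env' | grep -q '^NAME=marker$'
}

for candidate in sh dash bash ksh yash zsh; do
    # Skip shells that are not installed.
    command -v "$candidate" >/dev/null 2>&1 || continue
    if passes_through "$candidate"; then
        echo "$candidate: passed through"
    else
        echo "$candidate: dropped"
    fi
done
```

The result depends on the shell and, judging by the zsh case, possibly on the platform build, so the output is worth checking per machine rather than assuming.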

  • Interesting demo of env producing these environment strings when fed an arg that contains a newline in the right place. (Which is possible even with shells that don't support $'escape sequences'). Unless env could (or POSIXly should?) be doing any filtering itself? Otherwise should be possible with just a POSIX shell environment, no compiler or other programming languages needed. (It's still not directly from sh, which is what the question asked about; good data points on that, too.) Commented Sep 8 at 6:27
  • Didn't do what? AFAICT, zsh doesn't strip the env vars it can't import as shell variables either. Note that bash sets variables with %% in their name since shellshock for its exported functions, so it would be silly if it did this kind of thing. Commented Sep 8 at 6:51
  • @PeterCordes, mm, the post says "could an implementation of POSIX sh keep invalid names in the environment that it executes utilities under?" and I read that as asking if it's possible for a program launched from a POSIX-like shell to receive envvars with funny names, in some way, regardless of how they came to be. Commented Sep 8 at 10:04
  • @StéphaneChazelas, I don't know what zsh is supposed to do, but at least with that simple test, zsh didn't pass through the envvar with the funny name. Commented Sep 8 at 10:05
  • @StéphaneChazelas, ok, right, I only tried the zsh on my mac (zsh 5.9 (x86_64-apple-darwin23.0)) at first, I should have mentioned that. The behaviour looks to be different there vs. on Ubuntu, so I suppose it might be some Apple-specific modification(?) Commented Sep 8 at 12:29
