Context

$ bash --version
GNU bash, version 4.4.19(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ which read
/usr/bin/read

Can someone explain why Example 1 below works and Example 2 does not?

Example 1 - bare read works

This:

declare data
data="pig,cow,horse,rattlesnake,"
declare -a my_array
IFS=',' read -r -a my_array <<< "$data"
for item in "${my_array[@]}"; do echo "$item"; done

Produces:

pig
cow
horse
rattlesnake

Example 2 - /usr/bin/read fails

This produces no output:

declare data
data="pig,cow,horse,rattlesnake,"
declare -a my_array
IFS=',' /usr/bin/read -r -a my_array <<< "$data"
for item in "${my_array[@]}"; do echo "$item"; done

  • Instead of which, try type -a read. Commented Jan 5, 2021 at 22:43
  • What does /usr/bin/read do on your system? (Try /usr/bin/read --help, or run strings on it, or whatever.) My Ubuntu and Arch GNU/Linux systems don't have a read executable because that would be pointless: it is shadowed by a builtin. Commented Jan 6, 2021 at 23:38
  • Correct, which is a csh specific tool. For a POSIX shell always use the type command. Commented Jan 12, 2021 at 19:03

5 Answers

26

read is a shell builtin, i.e. a command that is provided by the shell itself rather than by an external program. For more information about shell builtins, see What is the difference between a builtin command and one that is not?

read needs to be a builtin because it modifies the state of the shell: it sets variables to the input it has read. It's impossible for an external command to set variables of the shell that calls it. See also Why is cd not a program?.
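
A minimal sketch of that limitation, not tied to read itself: the child bash below receives a copy of an exported variable, changes its own copy, and exits, while the parent's value stays as it was. The names animal and child_view are made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical variable names; any exported variable behaves the same way.
animal="pig"
export animal

# The child process changes its private copy and reports what it sees.
child_view=$(bash -c 'animal="cow"; printf "%s" "$animal"')

echo "child saw:  $child_view"
echo "parent has: $animal"
```

The same reasoning applies to any external program: whatever it assigns lives and dies inside its own process.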

Some systems also have an external command called read, for debatable compliance reasons. The external command can't do the whole job of the builtin: it can read a line of input, but it can't set shell variables to what it read, so it can only be used to discard a line of input, not to process it.
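
A rough sketch of the "discard a line" behaviour, assuming a child bash running its own read as a stand-in for an external read (not every system has /usr/bin/read). The child consumes the first line of the shared input and exits; cat then sees only what is left, and the parent's variable is never set. The names line and leftover are illustrative.

```shell
#!/usr/bin/env bash
line="untouched"

# Stand-in for an external read: a child bash that reads one line and exits.
# cat then prints whatever input the child left behind.
leftover=$( { bash -c 'read -r line'; cat; } <<< $'first\nsecond' )

echo "line in parent: $line"
echo "left on stdin:  $leftover"
```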

which read doesn't tell you that a builtin exists because that's not its job. which itself is an external command in bash and other Bourne-style shells (excluding zsh), so it only reports information about external commands. There's very rarely any good reason to call which. The command to find out what a command name stands for is type.

bash-5.0$ type read
read is a shell builtin
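
For use in a script, a small sketch along the same lines, assuming bash (type -t, type -a and type -P are bash extensions, not POSIX). The variable names read_kind and read_cmd are made up here.

```shell
#!/usr/bin/env bash
# type -t prints a single word describing what the name resolves to.
read_kind=$(type -t read)

# command -v prints what the shell would run; for a builtin, just its name.
read_cmd=$(command -v read)

echo "type -t read    -> $read_kind"
echo "command -v read -> $read_cmd"

# List every match, then only the external file, if the system has one.
type -a read
type -P read || echo "no external read found on PATH"
```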

  • Using type -a read is a more useful habit. Commented Jan 6, 2021 at 5:39
  • It's worth keeping in mind that -a is an extension; POSIX type doesn't have any options (and in particular, POSIX doesn't, correct me if I am mistaken, mandate a command hash that would necessitate -a). Commented Jan 6, 2021 at 14:01
  • Note that which is actually a builtin in some shells (zsh, for example) and behaves equivalently to type. Alternatively, if looking for command availability in a script, command -v is better than both, as you can execute its output directly. Commented Jan 6, 2021 at 16:36
  • @chepner Yes, type -a is an extension to POSIX's type. So? ... The question is about interactive use of type, not about script use. A script has no (hard) control over the shell that may execute it, but an interactive session is (usually) defined by the shell set by chsh in /etc/passwd. If not, calling bash will start a bash shell to work in, as the user has tagged the question. Raising POSIX issues for interactive shells seems like a waste of time. A ksh equivalent to type -a is whence -a; zsh is similar. Commented Jan 11, 2021 at 0:13
  • Change my initial comment to the more technically accurate: in bash Using type -a read is a more useful habit. Add, if you wish: In ksh use whence -a. Commented Jan 11, 2021 at 0:18
4

read is also a shell built-in, which which doesn't know about. Try running:

$ type read
read is a shell builtin

As for why /usr/bin/read doesn't work, I'm not sure what app that is as I don't have it installed on my system, but most likely the shell built-in is the one you want.

  • Most likely the OP's /usr/bin/read is a bash wrapper script around its read builtin (as it seems to support -a, while other shells have -A for the same feature) Commented Jan 6, 2021 at 6:52
  • In POSIX, it doesn't matter what app is /usr/bin/read. As long as it exists, the builtin gets executed. Commented Jan 11, 2021 at 0:15
  • @Isaac, but per POSIX, env read or find -exec read or execlp("read"...) in a POSIX environment are meant to run a read utility that conforms to the read utility specification, and except maybe for shells where env or find are builtin, in those cases, it is /usr/bin/read that will be run, so it should be a standard compliant read that is stored in there (in the OP's case, that sounds like a bash script wrapper to the read builtin). Commented Jul 21, 2021 at 15:02
4

read is a shell builtin that affects the current environment. /usr/bin/read is an external command that runs in a separate child process and so can't.

So why do we have /usr/bin/read at all, since it's practically useless? The answer is POSIX. It requires the standard utilities that shells provide as regular builtins to also be executable as external commands!

So, for example, there's also a /usr/bin/cd command. Let's walk that through: it starts a new shell process, runs the script (which is basically builtin cd "$@") and then exits, so the directory change is lost with it and nothing useful happens.

The rationale for this oddity is described here: https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu_chap01.html#tag_23_01_07
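
The cd case can be sketched without relying on /usr/bin/cd being installed, by letting a child bash stand in for it. The child really does change directory, but only for itself. The variable names are illustrative.

```shell
#!/usr/bin/env bash
before=$(pwd)

# The child shell changes its own working directory and reports it.
child_dir=$(bash -c 'cd / && pwd')

after=$(pwd)

echo "child ended up in:   $child_dir"
echo "parent before/after: $before / $after"
```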

  • This is very similar how some things have to be operators in Lisp, e.g. if or and. Yet, I did a POSIX-like thing in a Lisp dialect and made them functions also. The and function will not do short-circuiting like the operator, since all of its arguments are evaluated before it is called. But, you can apply it and indirect with it: it is "execable" in POSIX terms, and there are useful cases for that. Commented Jan 7, 2021 at 6:45
3

why Example 1 below works and Example 2 does not?

Because the read command being executed is not the same.
Thus each one acts differently.

We can set up an external executable to show the difference explicitly:

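sudo mv /usr/bin/read /usr/bin/read.back  # keep a backup; move it back when done
# single quotes keep the "!" of the shebang away from history expansion
printf '%s\n' '#!/bin/bash' 'echo "This is an external $0"' | sudo tee /usr/bin/read
sudo chmod a+x /usr/bin/read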

With the executable changed, this still works:

read a b c <<<'one two three'
echo "$a $b $c"

But this (obviously) doesn't:

/usr/bin/read a b c <<<'111 222 333'
echo "$a $b $c"

The reason is that bash (like most POSIX shells) looks commands up in a fixed order, and the first match found is the one that gets executed. In bash, type -a shows that order:

$ type -a read
read is a shell builtin
read is /usr/bin/read
read is /bin/read

This explains why the builtin is executed even when external executables with the same name exist.
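
The lookup order can be probed without touching anything under /usr/bin. The sketch below relies on bash's documented search order for names without a slash (shell functions first, then builtins, then PATH); functions are not part of the type -a listing above and are brought in here only to make the ordering visible. The function body and variable names are made up.

```shell
#!/usr/bin/env bash
# A function named read shadows the builtin, just as the builtin shadows PATH.
read() { echo "shell function named read"; }

via_plain_name=$(read)                                       # the function wins
via_builtin=$(builtin read -r word <<< "hello"; echo "$word")  # force the builtin

unset -f read
kind_after=$(type -t read)   # with the function gone, the builtin is first again

echo "plain name:   $via_plain_name"
echo "builtin read: $via_builtin"
echo "after unset:  $kind_after"
```

An explicit path such as /usr/bin/read skips this search entirely, which is why Example 2 runs the external program.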

The reason why Red Hat provides an external /usr/bin/read is a bit more complex and is related to the way POSIX works.

1

As mentioned by others, "read" is a shell built-in. On my system, there is no /usr/bin/read. However, man read informs me:

NAME
       read - read from a file descriptor

SYNOPSIS
       #include <unistd.h>

       ssize_t read(int fd, void *buf, size_t count);

DESCRIPTION
       read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.

So, that kind of read is a system call and programming tool.

HOWEVER! Do not use read at all. You have "data" and you want an array.

IFS=',';my_array=( ${data[@]} )
for item in "${my_array[@]}"; do echo "$item"; done
pig
cow
horse
rattlesnake
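
One caveat with the assignment form: IFS=','; changes IFS for the rest of the session, whereas IFS=',' prefixed to read applies to that one command only. A small sketch comparing the two, with the assignment form confined to a command substitution so the session's IFS survives. The names default_ifs, from_read and joined are illustrative.

```shell
#!/usr/bin/env bash
data="pig,cow,horse,rattlesnake,"
default_ifs=$IFS

# Prefix form: IFS is changed only for the duration of this one read.
IFS=',' read -r -a from_read <<< "$data"
ifs_after_read=$IFS

# Assignment form, run inside a command substitution (a subshell),
# so the changed IFS disappears when the subshell exits.
joined=$(IFS=','; items=( $data ); printf '%s\n' "${items[@]}")
ifs_after_subshell=$IFS

printf 'read gave %d items\n' "${#from_read[@]}"
printf '%s\n' "$joined"
```

Unquoted $data is also subject to pathname expansion, so input containing * or ? would need set -f or the read form instead.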

  • Why ${data[@]}? It's a flat scalar string input, not an array. Also, instead of a loop, you can demo by expanding the array as args to printf, which repeatedly consumes the format string for each arg (or group of args if multiple conversions). IFS=','; my_array=( $data ); printf "%s\n" "${my_array[@]}"; (I tested it inside a (subshell) to not mess up IFS in my terminal session.) Commented Jan 6, 2021 at 23:46
  • If you'd used data=( $data ) to make data become an array var, it could make some sense to handle the case where it might already be an array, and take all the elements. Commented Jan 6, 2021 at 23:49
  • Note that since your read is only present as shell builtin, it will not have its own manpage (and hence man read directs you to the page for the system call). Instead, it is documented in the bash manpage, in section "shell builtin commands". Commented Jan 7, 2021 at 9:57
  • @Peter Cordes Good points. I can only say that seeing a list of items like that causes me to more or less automatically use the ${var[@]} syntax. Commented Jan 7, 2021 at 13:23
  • You can edit your answer to improve it with suggestions from comments. I also agree with AdminBee; quoting the read(2) man page for the system call of the same name is not helpful. Since you don't have a /usr/bin/read on your system, there's no chance you'd have a useful read(1) man page. Since the useless /usr/bin/read only ever exists to satisfy the letter of some part of POSIX, hopefully the man page for it on RedHat systems will say so, so there might be some point in mentioning man pages to learn about the mysterious /usr/bin/read, but not as the first thing in your answer. Commented Jan 7, 2021 at 19:33
