Here is all you never thought you would ever not want to know about it:
Summary
To get the pathname of an executable in a Bourne-like shell script (there are a few caveats; see below):
ls_path=$(command -v ls)
To find out if a given command exists:
if command -v given-command > /dev/null; then
  echo given-command is available
else
  echo given-command is not available
fi
At the prompt of an interactive Bourne-like shell:
type ls
The which command is a broken heritage from the C-Shell and is better left alone in Bourne-like shells.
Use Cases
There's a distinction between looking for that information as part of a script or interactively at the shell prompt.
At the shell prompt, the typical use case is: this command behaves weirdly, am I using the right one? What exactly happened when I typed mycmd? Can I look further at what it is?
In that case, you want to know what your shell does when you invoke the command without actually invoking the command.
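As a small illustration of asking the shell what it would do without doing it, here is a sketch using type in a POSIX-like shell. The greet function and the no-such-command-xyz name are made up for the example, and the exact wording that type prints varies from shell to shell.

```shell
# Hypothetical function, defined only for this illustration.
greet() { printf 'hello\n'; }

# Ask the shell how it would resolve each name, without running anything.
# The wording of the output is unspecified and differs between shells.
type greet    # reported as a function
type cd       # reported as a builtin
type ls       # usually reported with the path found in $PATH

# A name that resolves to nothing gives a non-zero exit status
# in current POSIX shells.
if ! type no-such-command-xyz >/dev/null 2>&1; then
  echo 'no-such-command-xyz: not found'
fi
```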
In shell scripts, it tends to be quite different. In a shell script there's no reason why you'd want to know where or what a command is if all you want to do is run it. Generally, what you want to know is the path of the executable, so you can get more information out of it (like the path to another file relative to that, or read information from the content of the executable file at that path).
Interactively, you may want to know about all the my-cmd commands available on the system; in scripts, you rarely do.
Most of the available tools (as is often the case) have been designed to be used interactively.
History
A bit of history first.
The early Unix shells until the late 70s had no functions or aliases, only the traditional lookup of executables in $PATH. csh introduced aliases around 1978 (though csh was first released in 2BSD, in May 1979), and also the processing of a .cshrc for users to customize the shell (every csh instance reads .cshrc, even when not interactive, as in scripts).
While the Bourne shell was first released in Unix V7 earlier in 1979, function support was only added much later (1984 in SVR2), and anyway, it never had an rc file (the .profile is to configure your environment, not the shell per se).
csh got a lot more popular than the Bourne shell because, although its syntax was far worse than the Bourne shell's, it added many convenient features for interactive use.
In 3BSD (1980), a which csh script was added for csh users to help identify an executable, and a barely different script is still what you find as which on many commercial Unices nowadays (like Solaris, HP/UX, AIX or Tru64).
That script reads the user's ~/.cshrc (like all csh scripts do unless invoked with csh -f), and looks up the provided command name(s) in the list of aliases and in $path (the array that csh maintains based on $PATH).
Here you go: which came first for the most popular shell at the time (and csh was still popular until the mid-90s), which is the main reason why it got documented in books and is still widely used.
Note that, even for a csh user, that which csh script does not necessarily give you the right information. It gets the aliases defined in ~/.cshrc, not the ones you may have defined later at the prompt or, for instance, by sourcing another csh file, and (though that would not be a good idea) PATH might be redefined in ~/.cshrc.
Running that which command from a Bourne shell would still look up aliases defined in your ~/.cshrc, but if you don't have one because you don't use csh, that would still probably get you the right answer.
A similar functionality was not added to the Bourne shell until 1984 in SVR2 with the type builtin command. The fact that it is builtin (as opposed to an external script) means that it can give you the right information (to some extent) as it has access to the internals of the shell.
The initial type command suffered from a similar issue as the which script in that it didn't return a failure exit status if the command was not found. Also, for executables, contrary to which, it output something like ls is /bin/ls instead of just /bin/ls which made it less easy to use in scripts.
Unix Version 8's (not released in the wild) Bourne shell had its type builtin renamed to whatis and extended to also report about parameters and print function definitions. It also fixed the type issue of not returning failure when failing to find a name.
rc, the shell of Plan9 (the once-to-be successor of Unix) (and its derivatives like akanga and es) have whatis as well.
The Korn shell (a subset of which the POSIX sh definition is based on), developed in the mid-80s but not widely available before 1988, added many of the csh features (line editor, aliases...) on top of the Bourne shell. It added its own whence builtin (in addition to type) which took several options (-v to provide the type-like verbose output, and -p to look only for executables (not aliases/functions...)).
Coinciding with the turmoil over the copyright issues between AT&T and Berkeley, a few free software shell implementations came out in the late 80s and early 90s. The Almquist shell (ash, meant to be a replacement for the Bourne shell in BSDs), the public domain implementation of ksh (pdksh), bash (sponsored by the FSF) and zsh all came out between 1989 and 1991.
Ash, though meant to be a replacement for the Bourne shell, didn't have a type builtin until much later (in NetBSD 1.3 and FreeBSD 2.3), though it had hash -v. OSF/1 /bin/sh had a type builtin which always returned 0 up to OSF/1 v3.x. bash didn't add a whence but added a -p option to type to print the path (type -p would be like whence -p) and -a to report all the matching commands. tcsh made which builtin and added a where command acting like bash's type -a. zsh has them all.
The fish shell (2005) has a type command implemented as a function.
The which csh script meanwhile was removed from NetBSD (as it was builtin in tcsh and of not much use in other shells), and the functionality added to whereis (when invoked as which, whereis behaves like which except that it only looks up executables in $PATH). In OpenBSD and FreeBSD, which was also changed to one written in C that looks up commands in $PATH only.
Implementations
There are dozens of implementations of a which command on various Unices with different syntax and behaviour.
On Linux (beside the builtin ones in tcsh and zsh) we find several implementations. On recent Debian systems for instance, it's a simple POSIX shell script that looks for commands in $PATH.
busybox also has a which command.
There is a GNU which which is probably the most extravagant one. It tries to extend what the which csh script did to other shells: you can tell it what your aliases and functions are so that it can give you a better answer (and I believe some Linux distributions set some global aliases around that for bash to do that).
zsh has a couple of operators to expand to the path of executables: the = filename expansion operator and the :c history expansion modifier (here applied to parameter expansion):
$ print -r -- =ls
/bin/ls
$ cmd=ls; print -r -- $cmd:c
/bin/ls
zsh, in the zsh/parameters module, also exposes the command hash table as the commands associative array:
$ print -r -- $commands[ls]
/bin/ls
The whatis utility (except for the one in Unix V8 Bourne shell or Plan 9 rc/es) is not really related as it's for documentation only (greps the whatis database, that is the man page synopsis').
whereis was also added in 3BSD at the same time as which, though it was written in C, not csh. It is used to look up the executable, man page and source all at once, but not based on the current environment. So again, that answers a different need.
Now, on the standard front, POSIX specifies the command -v and -V commands (which used to be optional until POSIX.2008). UNIX specifies the type command (no option). That's all (where, which, whence are not specified in any standard).
Up to some version, type and command -v were optional in the Linux Standard Base specification which explains why for instance some old versions of posh (though based on pdksh which had both) didn't have either. command -v was also added to some Bourne shell implementations (like on Solaris).
Status Today
The status nowadays is that type and command -v are ubiquitous in all the Bourne-like shells (though, as noted by @jarno in the comments below, there is a caveat/bug in bash when not in POSIX mode and in some descendants of the Almquist shell). tcsh is the only shell where you would want to use which (as there's no type there and which is builtin).
In the shells other than tcsh and zsh, which may tell you the path of the given executable as long as there's no alias or function by that same name in any of your ~/.cshrc, ~/.bashrc or any shell startup file and you don't define $PATH in your ~/.cshrc. If you have an alias or function defined for it, it may or may not tell you about it, or tell you the wrong thing.
If you want to know about all the commands by a given name, there's nothing portable. You'd use where in tcsh or zsh, type -a in bash or zsh, and whence -a in ksh93. In other shells, you can use type in combination with which -a, which may work.
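For the external-executable part of that question, a $PATH walk can be written by hand in POSIX sh. The following is only a sketch: list_in_path is a hypothetical helper name, it uses the same "regular file with execute permission" heuristic as the lookup tools, and it knows nothing about aliases, functions or builtins.

```shell
# list_in_path NAME: print every executable regular file called NAME found
# in $PATH, in lookup order. Ignores aliases, functions and builtins.
# The body runs in a subshell so the IFS and set -f changes stay local.
list_in_path() (
  name=$1
  # Appending ":" keeps a trailing empty $PATH component from being lost
  # when the string is split on ":".
  path=${PATH-}:
  set -f    # no globbing of the unquoted expansion below
  IFS=:
  for dir in $path; do
    # An empty component means the current directory.
    [ -n "$dir" ] || dir=.
    file=$dir/$name
    if [ -f "$file" ] && [ -x "$file" ]; then
      printf '%s\n' "$file"
    fi
  done
)

list_in_path ls
```

The first line printed is the candidate a plain $PATH lookup would pick; any further lines are the ones it shadows.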
Recommendations
Getting the pathname to an executable
Now, to get the pathname of an executable in a script, there are a few caveats:
ls_path=$(command -v ls)
would be the standard way to do it.
There are a few issues though:
- It is not possible to know the path of the executable without executing it. type, which, command -v and the like all use heuristics to find out the path. They loop through the $PATH components and find the first non-directory file for which you have execute permission. However, depending on the shell, when it comes to executing the command, many of them (Bourne, AT&T ksh, zsh, ash...) will just try each candidate in $PATH order until the execve system call doesn't return with an error. For instance if $PATH contains /foo:/bar and you want to execute ls, they will first try to execute /foo/ls or, if that fails, /bar/ls. Now execution of /foo/ls may fail because you don't have execution permission but also for many other reasons, like it's not a valid executable. command -v ls would report /foo/ls if you have execution permission for /foo/ls, but running ls might actually run /bar/ls if /foo/ls is not a valid executable.
- If foo is a builtin or function or alias, command -v foo returns foo. With some shells like ash, pdksh or zsh, it may also return foo if $PATH includes the empty string and there's an executable foo file in the current directory. There are some circumstances where you may need to take that into account. Keep in mind for instance that the list of builtins varies with the shell implementation (for instance, mount is sometimes builtin for busybox sh), and for instance bash can get functions from the environment.
- If $PATH contains relative path components (typically . or the empty string which both refer to the current directory but could be anything), depending on the shell, command -v cmd might not output an absolute path. So the path you obtain at the time you run command -v will no longer be valid after you cd somewhere else.
- Anecdotal: with the ksh93 shell, if /opt/ast/bin (though that exact path can vary on different systems I believe) is in your $PATH, ksh93 will make available a few extra builtins (chmod, cmp, cat...), but command -v chmod will return /opt/ast/bin/chmod even if that path doesn't exist.
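Some of these caveats can be guarded against in a script. The sketch below wraps command -v in a hypothetical executable_path function (the name and the _p variable are made up for the example). It rejects bare names, anchors relative results to the current directory, and checks that the file really exists. It cannot do anything about the first caveat: the shell may still end up running a different file if execve fails on the one reported.

```shell
# executable_path NAME: print an absolute pathname for NAME, or fail.
executable_path() {
  _p=$(command -v "$1") || return 1
  case $_p in
    /*) ;;                  # already absolute
    */*) _p=$PWD/$_p ;;     # came from a relative $PATH component
    *) return 1 ;;          # bare name: builtin, function, keyword...
  esac
  # Guard against alias definitions that happen to contain a slash and
  # against paths reported for files that do not exist.
  [ -f "$_p" ] && [ -x "$_p" ] && printf '%s\n' "$_p"
}

if sh_path=$(executable_path sh); then
  printf 'sh is at %s\n' "$sh_path"
fi
```

Resolving the relative case at lookup time means the result stays valid after a later cd, at the cost of assuming $PWD is accurate.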
Determining whether a command exists
To find out if a given command exists standardly, you can do:
if command -v given-command > /dev/null 2>&1; then
  echo given-command is available
else
  echo given-command is not available
fi
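When a script depends on several commands, the same test can be wrapped in a small function. This is a sketch only: have and _cmd are hypothetical names, and awk and sed are just sample commands to check for.

```shell
# have NAME...: succeed only if every NAME resolves to something the
# shell can run (external command, builtin, function or alias).
have() {
  for _cmd in "$@"; do
    command -v "$_cmd" >/dev/null 2>&1 || return 1
  done
}

if have awk sed; then
  echo 'awk and sed are available'
else
  echo 'awk or sed is missing'
fi
```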
Where one might want to use which
(t)csh
In csh and tcsh, you don't have much choice. In tcsh, that's fine as which is builtin. In csh, that will be the system which command, which may not do what you want in a few cases.
Find commands only in some shells
A case where it might make sense to use which is when you want the path of a command, ignoring potential shell builtins or functions, in bash, csh (not tcsh), dash or Bourne shell scripts. Those are the shells that have none of whence -p (as in ksh or zsh), command -ev (as in yash), whatis -p (rc, akanga) or a builtin which (as in tcsh or zsh). It also requires a system where which is available and is not the csh script.
If those conditions are met, then:
echo_path=$(which echo)
would give you the path of the first echo in $PATH (except in corner cases), regardless of whether echo also happens to be a shell builtin/alias/function or not.
In other shells, you'd prefer:
- zsh:
echo_path==echo or echo_path=$commands[echo] or echo_path=${${:-echo}:c}
- ksh, zsh:
echo_path=$(whence -p echo)
- yash:
echo_path=$(command -ev echo)
- rc, akanga:
echo_path = `{whatis -p echo} (beware of paths containing whitespace)
- fish:
set echo_path (type -fp echo)
Note that if all you want to do is run that echo command, you don't have to get its path, you can just do:
env echo this is not echoed by the builtin echo
For instance, with tcsh, to prevent the builtin which from being used:
set echo_path = "`env which echo`"
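The same env trick can be seen at work in a Bourne-like shell. In this sketch, a function deliberately shadows the external uname utility (the function is made up for the illustration); env performs its own $PATH lookup and knows nothing about shell functions, aliases or builtins.

```shell
# Hypothetical function that shadows the external uname utility.
uname() { echo 'function uname'; }

uname        # runs the function defined above
env uname    # env does its own $PATH lookup and runs the external uname
```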
When you do need an external command
Another case where you may want to use which is when you actually need an external command. POSIX requires that all shell builtins (like command) also be available as external commands, but unfortunately, that's not the case for command on many systems. For instance, it's rare to find a command command on Linux-based operating systems while most of them have a which command (though different ones with different options and behaviours).
Cases where you may want an external command would be wherever you would execute a command without invoking a POSIX shell.
The system("some command line"), popen()... functions of C or various languages do invoke a shell to parse that command line, so system("command -v my-cmd") does work in them. An exception to that would be perl, which optimises out the shell if it doesn't see any shell special character (other than space). That also applies to its backtick operator:
$ perl -le 'print system "command -v emacs"'
-1
$ perl -le 'print system ":;command -v emacs"'
/usr/bin/emacs
0
$ perl -e 'print `command -v emacs`'
$ perl -e 'print `:;command -v emacs`'
/usr/bin/emacs
The addition of that :; above forces perl to invoke a shell there. By using which, you wouldn't have to use that trick.
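Another option, wherever a command is executed without a shell, is to start one explicitly and let it do the lookup. A minimal sketch, assuming sh on the system is a POSIX shell; ls is only a sample name:

```shell
# Ask an explicitly started shell for the lookup. The command name is
# passed as a positional parameter ($1) rather than spliced into the
# code string, so it needs no extra quoting. The second "sh" fills $0.
sh -c 'command -v "$1"' sh ls
```

From an exec-style API, the argument vector would be the five words above: sh, -c, the code string, sh and the name to look up.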
which are assuming an interactive shell context. This question is tagged /portability. So I interpret the question in this context as "what to use instead of which to find the first executable of a given name in the $PATH". Most answers and reasons against which deal with aliases, builtins and functions, which in most real-world portable shell scripts are just of academic interest. Locally defined aliases aren't inherited when running a shell script (unless you source it with .).

csh (and which is still a csh script on most commercial Unices) does read ~/.cshrc when non-interactive. That's why you'll notice csh scripts usually start with #! /bin/csh -f. which does not because it aims to give you the aliases, because it's meant as a tool for (interactive) users of csh. POSIX shells users have command -v.

stat $(which ls) is wrong for several reasons (missing --, missing quotes), not only the usage of which. You'd use stat -- "$(command -v ls)". That assumes ls is indeed a command found on the file system (not a builtin of your shell, or function or alias). which might give you the wrong path (not the path that your shell would execute if you entered ls) or give you an alias as defined in the configuration of some other shells...

ls is a function on my system is the best reason why to use which in this case. Instead of stat my real usage is often this one: rpm -q --whatprovides $(which ls). And yes, I don't use quotes interactively when I know that I don't need any. My PATH never contains whitespaces otherwise I consider it a bug in my setup and accept undefined behavior.

which implementations would not give you even the ls that would be found by a look-up of $PATH (regardless of what ls may invoke in your shell). sh -c 'command -v ls', or zsh -c 'rpm -q --whatprovides =ls' are more likely to give you the correct answer. The point here is that which is a broken heritage from csh.