11

I would like to be able to search all my $PATH for files matching a given pattern.

For example, if my PATH is /usr/local/bin:/usr/bin:/bin and there's a /usr/local/bin/gcc-4 and a /usr/bin/gcc-12, I would like to be able to search for gcc-* to find them both.

The trivial approach of course does not work:

find ${PATH} -name "gcc-*"

this naive approach does work:

find $(echo "${PATH}" | sed -e 's|:| |g') -name "gcc-*"

but of course this breaks if PATH holds any weird characters such as spaces.

So how can I achieve this in a safe way? My shell is sh.

3
  • This question is similar to: How to find files with find tool in system path ($PATH)? Or alternatively, How to specify starting-point directory for find as an expression?. If you believe it’s different, please edit the question, make it clear how it’s different and/or how the answers on that question are not helpful for your problem. Commented Jul 30 at 15:00
  • I believe this question is different enough, "as is", to merit keeping it open and non-duplicate. The OP specifically mentioned glob in this case. While the answers to the potential duplicate discuss glob in the answers, that other question doesn't mention globbing as part of the main goal. @umläute has made it an integral part of this question. Commented Jul 30 at 18:24
  • I added that you're using sh but since you're almost certainly not actually using the ancient bourne shell, it would be good if you could specify what /bin/sh is pointing to on your system. Commented Jul 31 at 10:23

5 Answers 5

18
find ${PATH} -name "gcc-*"

Is actually a good start, as by leaving $PATH unquoted in sh/bash and other POSIX-like shells (not zsh unless in sh/ksh emulation), you're asking it to split it.

However, the splitting is done based on the value of the $IFS special parameter which by default doesn't contain :.

You also need the -H option to find for find to follow symlinks passed as arguments.

So you'd need:

IFS=:; find -H $PATH -name 'gcc-*'

There's a second unwanted side effect to leaving $PATH unquoted: filename generation aka globbing. That can be disabled with set -o noglob, but it's unlikely to make any difference as it's unlikely that $PATH components will contain glob operators (*?[...] and more if extglob is enabled).

Note it will also look into subdirectories of $PATH components if any (files at depth 2 or deeper) and would return $PATH components themselves if they matched the pattern (such as /opt/gcc-latest) (depth 0) which it doesn't look like you want.
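
Without relying on GNU extensions, the depth can also be limited with -prune. The following is only a minimal sketch, assuming a POSIX sh and find; path_find is a hypothetical name, and matches are printed as DIR/./NAME because each start point is written as DIR/. to make the trick work:

```sh
# Hypothetical helper: path_find PATTERN
# Lists entries directly inside each existing $PATH directory that match PATTERN.
path_find() (
    pattern=$1
    set -f          # no globbing while $PATH is expanded unquoted
    IFS=:           # split on colons only
    for dir in $PATH; do
        # Empty or non-existent components are skipped by this test.
        if [ -d "$dir" ]; then
            # The start point "$dir/." has the base name ".", so it is never
            # pruned or matched; everything one level below is pruned (no
            # further descent) and then tested against the pattern.
            find -H "$dir/." ! -name . -prune -name "$pattern" -print
        fi
    done
)
```

Because the body is a ( ... ) subshell, the IFS and noglob changes do not affect the calling shell. Usage would be path_find 'gcc-*'.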

With GNU find or compatible, you could add -mindepth 1 -maxdepth 1, or you could use your shell globs.

With bash:

println() {
  [ "$#" -eq 0 ] || printf '%s\n' "$@"
}
IFS=:
set -- $PATH
shopt -s nullglob
shopt -u failglob
println ${@/%/\/gcc-*}
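
For a plain sh without nullglob or the ${@/%/...} expansion, a rough equivalent of the glob approach might look like the sketch below. path_glob is a hypothetical name; the sketch assumes POSIX field splitting and treats an empty $PATH component as the current directory:

```sh
# Hypothetical helper: path_glob PATTERN
# Prints files directly inside $PATH directories whose names match PATTERN.
path_glob() (
    pattern=$1
    set -f              # no globbing while $PATH is split
    IFS=:
    set -- $PATH        # positional parameters now hold the $PATH components
    set +f              # globbing back on for the pattern
    IFS=                # no field splitting when the pattern is expanded
    for dir in "$@"; do
        [ -n "$dir" ] || dir=.      # empty component: current directory
        for file in "$dir"/$pattern; do
            # sh has no nullglob: an unmatched pattern is left as-is,
            # so only print names that really exist.
            if [ -e "$file" ] || [ -L "$file" ]; then
                printf '%s\n' "$file"
            fi
        done
    done
)
```

Called as path_glob 'gcc-*', it prints one match per line, so it shares the println caveat about file names containing newlines.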

If switching to zsh is an option, then it becomes a lot simpler:

There $PATH is tied to the $path array à la csh, so you can just do:

find $path -name 'gcc-*'

Using globs:

print -rC1 -- $^path/gcc-*(N)

Where print -rC1 -- (print raw on 1 Column) does the equivalent of our println function from above, $^path enables rc-like (or fish-like) array expansion, and the N glob qualifier does the equivalent of nullglob.

Or you can just use type/whence/which... with the -m option to list commands (executable ones, not restricted to the ones in $path/$PATH, functions, builtins... will also be included) that match a pattern.

$ type -m 'gcc-*'
gcc-11 is /usr/bin/gcc-11
gcc-12 is /usr/bin/gcc-12
gcc-9 is /usr/bin/gcc-9
gcc-ar is /usr/bin/gcc-ar
gcc-ar-11 is /usr/bin/gcc-ar-11
gcc-ar-12 is /usr/bin/gcc-ar-12
gcc-ar-9 is /usr/bin/gcc-ar-9
gcc-nm is /usr/bin/gcc-nm
gcc-nm-11 is /usr/bin/gcc-nm-11
gcc-nm-12 is /usr/bin/gcc-nm-12
gcc-nm-9 is /usr/bin/gcc-nm-9
gcc-ranlib is /usr/bin/gcc-ranlib
gcc-ranlib-11 is /usr/bin/gcc-ranlib-11
gcc-ranlib-12 is /usr/bin/gcc-ranlib-12
gcc-ranlib-9 is /usr/bin/gcc-ranlib-9

type being short for whence -v. Without -v:

$ whence -m 'gcc-*'
/usr/bin/gcc-11
/usr/bin/gcc-12
/usr/bin/gcc-9
/usr/bin/gcc-ar
/usr/bin/gcc-ar-11
/usr/bin/gcc-ar-12
/usr/bin/gcc-ar-9
/usr/bin/gcc-nm
/usr/bin/gcc-nm-11
/usr/bin/gcc-nm-12
/usr/bin/gcc-nm-9
/usr/bin/gcc-ranlib
/usr/bin/gcc-ranlib-11
/usr/bin/gcc-ranlib-12
/usr/bin/gcc-ranlib-9

Some notes on corner cases:

  • empty $PATH components (like when $PATH is /bin:/usr/bin: which has a trailing empty component) means the current working directory (same as /bin:/usr/bin:.), but find $path (zsh) would skip them and find $PATH (sh/bash) would skip the last and find would complain about empty arguments. ${@/%/\/gcc-*} or $^path/gcc-* would try and expand /gcc-* instead of ./gcc-*. Having relative paths in $PATH is very bad practice though, so unlikely to happen in practice.
  • an unset $PATH means a default search path will be used, but which one can depend on what performs the command lookup. It's hard to get that default value reliably.
  • a set but empty $PATH means search the current working directory only (same as PATH=.). So that's similar to the first point above with the added caveat that the behaviour for find -H -name 'gcc-*' is unspecified and varying with the find implementation.
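
The empty-component cases above can be made explicit before anything is handed to find or to a glob. This is only a sketch, assuming POSIX field splitting (each : terminates a field); path_components is a hypothetical name, and the one-per-line output assumes no newlines in directory names:

```sh
# Hypothetical helper: prints one $PATH component per line,
# with "." standing in for every empty component.
path_components() (
    # An unset $PATH means an implementation default that cannot be listed here.
    [ -n "${PATH+set}" ] || exit 1
    set -f
    IFS=:
    # Each ":" terminates a field, so appending one more keeps a trailing
    # empty component (and an entirely empty $PATH) from being dropped.
    p=$PATH:
    for dir in $p; do
        printf '%s\n' "${dir:-.}"
    done
)
```

With PATH=/bin::/usr/bin: this lists /bin, ., /usr/bin and . again.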
6
  • i haven't mentioned this explicitly, but it seems that my shell is not zsh but rather sh (hence i have only tagged this as shell). what's more, it seems i cannot actually change the shell (that's all within a flatpak build script). but IFS=:; find $PATH looks great. Commented Jul 30 at 14:16
  • @umläute: For interactive use I often use a subshell if I want to set IFS. Like (IFS=$'\n'; mpv $(locate -iA foo bar ) ) to play video/audio found by locate, with filenames from locate which don't contain newlines, but do contain spaces and whatnot. (Usually I control-r recall on $(locate, not retyping all of that from scratch. The actual command I use is mpv $(find $(locate ...) -maxdepth 0 -type f ...) to filter out dirs, so is a lot to type; I sometimes adjust parts other than the locate options so I've never bothered to put it in a script.) Commented Jul 31 at 10:03
  • 1
    @PeterCordes, note GNU find now supports a -files0-from predicate, so you can do (much more reliably and efficiently) find -files0-from <(locate -0 ...) -prune -type f -exec mpv {} + (or use zsh and its 0 parameter expansion flag to split NUL-delimited records (or f for line feed delimited records) and can filter lists of files by type with its glob qualifiers). Commented Jul 31 at 10:37
  • 1
    Wouldn't it be better to use ( IFS=:; find -H $PATH -name 'gcc-*' ) so the IFS value isn't changed in the parent shell? Commented Jul 31 at 11:30
  • 1
    @CSM, I don't know of any find implementation that supports a --depth=n option. FreeBSD's find has -depth 1, but it's only a condition and doesn't stop find from descending deeper; you want the GNU-style -mindepth 1 -maxdepth 1 already mentioned. Commented Jul 31 at 14:54
11

With Bash you can use mapfile to split the path into array entries, safely, and then process that (with some special handling for newlines, see Mapfile not removing trailing newline):

readarray -t -d: p <<<"$PATH"
find "${p[@]%$'\n'}" -name "gcc-*"

or perhaps more usefully,

find "${p[@]%$'\n'}" -maxdepth 1 -name "gcc-*"

This separates the contents of $PATH using a colon as the delimiter, removing the colon from each entry, and storing the result in the array p. It then gives the individual entries in p as arguments to find, removing any trailing newline.

Since you’re using dash, you can’t use arrays, so I think the best solution is to rely on the shell splitting $PATH using IFS (see Stéphane Chazelas’ answer).

2
  • 2
    @Hans-MartinMosner that’s a fair point, however find does have some advantages here. Using ls as in your example will produce errors for all directories that don’t contain files matching gcc-*; find will only produce errors for directories that don’t exist or aren’t accessible. Using find also allows more criteria to be specified (-type x for example). Commented Jul 30 at 13:09
  • i haven't mentioned this explicitly, but it seems that my shell is not bash but rather sh (hence i have only tagged this as shell). what's more, it seems i cannot actually change the shell (that's all within a flatpak build script). in any case a very elegant solution. Commented Jul 30 at 14:14
7

I'll add a solution using the null byte (\0) as a delimiter. This strategy is often used when there are "any weird characters like space and the like." @umläute, you did mention -print0 in a comment (archived), so I'm guessing there has been some thought about null bytes. However, we're talking about input to find, so -print0 won't help us. Also, directly using IFS='\0' (or even IFS=$'\0') is problematic in bash due to its internal handling of C-style strings. This source (archived) gives a very good explanation of the problems.

A quick disclaimer: Now that I've seen it, I would use the ( IFS=:; find -H $PATH -name "gcc-*" ) solution as given in the answer (archived) by @Stéphane-Chazelas, along with the comment (archived) by @terdon-♦. My first instinct in situations like this, however, is to null-separate things. Because of this, I've posted my take on that approach.

Edit: I've learned a lot by answering this question and receiving great suggestions about how to fix problems and make things better. I've left some of my "learning" here in Notes after the answer, but I'm taking out some extraneous stuff. Hopefully everything remaining will be useful.


An example

Here's one example that shows the concept. I give the input/output from my system [1]. Note that an extended solution that takes care of some often-unwanted error reports is at the end of this section.

Edit: As @Raffa was so kind as to point out in the comments, there are a couple problems with my first version. I discuss these in Note 3.


$ printf $PATH | tr ':' '\0' | xargs -I'{}' -0 find "{}" -maxdepth 1 -type f -name "gcc-*"

Edited. Credit goes to @Stéphane-Chazelas for warning about using -maxdepth 1 without -mindepth 1 as well as for giving better usage of printf.

$ printf %s "$PATH" | tr ':' '\0' |
    xargs -0 sh -c 'find "$@" -mindepth 1 -maxdepth 1 -type f -name "gcc-*"' sh
/usr/bin/gcc-ar.exe
/usr/bin/gcc-nm.exe
/usr/bin/gcc-ranlib.exe

The command only gives output as it does (without warnings or errors), because my PATH doesn't contain any non-existent directories. This is a good time to highlight that I've used printf instead of echo to avoid errors such as

find: ‘/usr/local/bin\n’: No such file or directory

This is a legitimate report, as there is a very real possibility that a gcc-* file could be in /usr/local/bin. However, there can be multiple other (likely unwanted) No such errors, as noted in the solution of @Chris-Davies (archived),

[Y]ou have to discard stderr completely in order to avoid outputting errors for directories in $PATH that don't exist.

There's also a real possibility of getting Permission denied errors. If you want to get all output from stderr other than the No such and Permission denied errors, you can do some process substitution along with using file descriptors. I've used this pattern for a while, and it's due to an answer (archived) to a find question, the answer given by @mklement0.

Edit: The same issues that @Raffa so kindly pointed out (see Note 3) are taken care of with the unwanted-error-filtering code, below.

printf %s "$PATH" | tr ':' '\0' |
    xargs -0 sh -c '
      find "$@" -mindepth 1 -maxdepth 1 -type f -name "gcc-*" 2> \
        >( grep -v "Permission denied\|No such" >&2)
    ' sh

The use of sh -c after xargs allows me to use the in-my-opinion nicer structure from a comment by @gniourf-gniourf (archived).

printf %s "$PATH" | tr ':' '\0' |     
    xargs -0 sh -c '
     { find "$@" -mindepth 1 -maxdepth 1 -type f \
                 -name "gcc-*" 2>&1 >&3 | 
         grep -v "Permission denied\|No such" >&2; } 3>&1' sh

I like that plumbing approach better, as I think it more cleanly takes care of streams and allows a simple pipe to grep, but it is a bit more opaque—harder to read and to understand. In the same page as the code from @mklement0 and @gniourf-gniourf, you can see discussion of a similar pattern mentioned in an answer from @wjordan.
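
To see what the descriptor swap does in isolation, here is a small sketch with a stand-in for find. The names noisy and filtered and the messages are made up for illustration; grep -e is used twice instead of \| so that the sketch stays within POSIX grep:

```sh
# Stand-in for find: one result on stdout, two messages on stderr.
noisy() {
    echo "/usr/bin/gcc-12"
    echo "find: '/root': Permission denied" >&2
    echo "find: something else went wrong" >&2
}

filtered() {
    # 3>&1 saves the caller's stdout as fd 3 for the whole group.
    # Inside the group, 2>&1 sends noisy's stderr into the pipe, and >&3
    # then points its stdout back at the saved stdout, bypassing grep.
    # grep drops the two unwanted messages and re-emits the rest on stderr.
    { noisy 2>&1 >&3 |
        grep -v -e 'Permission denied' -e 'No such' >&2; } 3>&1
}
```

Running filtered 2>/dev/null shows only the result line, while filtered >/dev/null shows only the message that was not filtered out.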

Edited. Another shout-out to @Stéphane-Chazelas for sorting out some of my plumbing problems.


Some thoughts about weird characters

I got to thinking about how robust this is. There is this annoying little thing that my bash [1] just let me do from my $HOME directory (without any complaint).

$ cd  # make sure I'm $HOME
$ mkdir -p "abc:def"
$ cd abc\:def
$ cat > test_executable <<'EOF'
#!/usr/bin/env bash
echo "A colon in the file path?"
EOF
$ chmod a+x test_executable
$ ./test_executable
A colon in the file path?

There are problems with this idea of a colon in the path to an executable, specifically with using said path in the PATH environment variable as in PATH="/home/bballdave025/abc\:def:$PATH". In fact, there are enough problems that I've decided to take this part of my solution out, since the problem isn't really ... real. However, I leave a reference to Note 2, in case someone has no control over a colon being in the path of an executable.

The not-real-ness of the problem has to do with the way PATH is implemented and documented. Some explanations come from this answer (archived) on U&L as well as this other answer (archived) on SO, respectively, that

The POSIX standard explicitly mentions that it's impossible to use directories with : in their names in the PATH variable's value.

This [escaping a colon in PATH on UNIX] is impossible according to the POSIX standard. This is not a function of a specific shell, PATH handling is done within the execvp function in the C library. There is no provision for any kind of quoting.

Things are stated more clearly and concisely in a comment (archived) on this answer by @Stéphane-Chazelas:

PATH="/home/bballdave025/abc\:def" means a $PATH with two directories: /home/bballdave025/abc\ and def, there's no point trying to treat that \: specially.
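
That behaviour can be checked with a throwaway directory. This is only a sketch: colon_demo and colon_demo_cmd are made-up names, mktemp -d is assumed to be available, and the file system must allow a colon in a directory name:

```sh
# Hypothetical demo: a backslash does not protect a colon inside $PATH.
colon_demo() (
    tmp=$(mktemp -d) || exit
    mkdir "$tmp/abc:def"
    printf '#!/bin/sh\necho hello\n' > "$tmp/abc:def/colon_demo_cmd"
    chmod +x "$tmp/abc:def/colon_demo_cmd"
    # This is read as the two entries "$tmp/abc\" and "def",
    # followed by the original $PATH.
    PATH="$tmp/abc\:def:$PATH"
    if command -v colon_demo_cmd >/dev/null 2>&1; then
        echo "found"
    else
        echo "not found"
    fi
    rm -rf "$tmp"
)
```

Calling colon_demo prints "not found": the command lookup never sees the directory abc:def as a single entry.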

In the interest of history, I do have the previously-existing section as a gist (archived) on my GitHub.


Notes:

[1]

My system:

$ uname -a
CYGWIN_NT-10.0-19045 MY-MACHINE 3.6.3-1.x86_64 2025-06-05 11:45 UTC x86_64 Cygwin
$ bash --version | head -n 1
GNU bash, version 5.2.21(1)-release (x86_64-pc-cygwin)

[2]

If you need to work around this, you could soft-link the directory to a non-colon-containing name, e.g.

cd
ln -s abc\:def abc_def
export PATH="/home/bballdave025/abc_def:$PATH"

cf. this SO post (archived).

Note that I've also seen solutions where one mounts the colon-containing directory with a non-colon-containing name, but unless said directory is on another file system, I don't like doing this.


[3]

I always appreciate comments that help me learn more, and I'm indebted to @Raffa for the comments that allowed me to give an answer that works with desired characteristics and that works better. There were two issues pointed out, but before I go into them, I'll put the old code here for inspection/comparison.

The original versions of the code were as follows:

Initial

$ printf $PATH | tr ':' '\0' |
    xargs -I'{}' -0 find "{}" -maxdepth 1 -type f -name "gcc-*"

With suppression of unwanted errors/warnings

$ printf $PATH | tr ':' '\0' |
    xargs -I'{}' -0  find "{}" -maxdepth 1 -type f -name "gcc-*" 2> \
    >(grep -v 'Permission denied\|No such' >&2)

The first issue is a loss in efficiency due to the -I'{}' passed to xargs. From man xargs | cat | grep -A 4 and man xargs | cat | grep -E -A3 "^\s+[-]L", respectively,

       -I replace‐str
              Replace  occurrences of replace‐str in the initial‐arguments with
              names read from standard input.  Also,  unquoted  blanks  do  not
              terminate input items; instead the separator is the newline char‐
              acter.  Implies -x and -L 1.
       -L max‐lines
              Use at most max‐lines nonblank  input  lines  per  command  line.
              Trailing  blanks cause an input line to be logically continued on
              the next input line.  Implies -x.

(-x just specifies some exit conditions). Using at most 1 input line per command line has the possibility to cause significant slowdown.

The second issue is that not having $PATH in double quotes leads to an expansion that will break paths containing spaces and other "weird characters". Since handling such characters is one of the initial goals of the OP, the fix is very important. The new command has been tested with a PATH variable containing a space.
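
The difference between the two xargs styles is easy to observe with echo standing in for find. A small sketch (per_item and batched are hypothetical names; xargs -0 is assumed to be supported, as it is in the commands above):

```sh
# Three NUL-separated items, one of them containing a space.
per_item() {
    # -I{} implies one command invocation per input item.
    printf 'a\0b c\0d' | xargs -0 -I{} echo run {}
}

batched() {
    # All items are passed to a single sh as "$@"; the trailing "sh"
    # fills $0 so that the first item lands in $1.
    printf 'a\0b c\0d' | xargs -0 sh -c 'echo run "$@"' sh
}
```

per_item prints three lines (run a, run b c, run d), one per echo invocation, while batched prints the single line run a b c d.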

12
  • 1
    Re your first example ... -I {} implies -L 1 thus reducing efficiency and the unquoted expansion of $PATH will break paths with spaces in them ... So, probably consider something like printf "$PATH" | tr ':' '\0' | xargs -0 sh -c 'find "$@" -maxdepth 1 -type f -name "gcc-*"' sh ... Also, although a minor thing, the replace string {} when used doesn't need to be quoted, as it doesn't undergo shell expansion; rather, the arguments are written/placed on the command-line by xargs properly. Commented Jul 31 at 12:29
  • 1
    Thanks, @Raffa. I'll make the changes. I never knew what to use by default with xargs; I was always just taught -I '{}'. I should have tested those "weird character" conditions more thoroughly. Commented Jul 31 at 18:19
  • 1
    You're welcome ... The first position in the parameters after a command string in a shell i.e. sh -c '...' 1 2 3 is $0 and not $1 ... so you need to force xargs to start positioning arguments from $1 by preoccupying $0 with e.g. the name of the shell sh ... Compare the output of sh -c 'echo "$@"' 1 2 3 to sh -c 'echo "$@"' sh 1 2 3 Commented Aug 1 at 8:55
  • 1
    I also think it means that I need to add an sh after my sh -c '...'. @Raffa, does my code look correct, now? Commented Aug 1 at 20:01
  • 1
    Your redirections in { find "$@" -maxdepth 1 -type f -name "gcc-*" | grep -v "Permission denied\|No such" >&3; } 3>&2 2>&1 don't make much sense. Maybe you meant { find "$@" -maxdepth 1 -type f -name "gcc-*" 2>&1 >&3 | grep -v "Permission denied\|No such" >&2; } 3>&1 for grep to filter find's errors (assuming they're in English). Commented Aug 2 at 8:33
4

You could roll a little loop

(
    name='gcc-*'
    IFS=:; for p in $PATH    # Mind filename globbing
    do
        [ -d "$p" ] && find "$p" -maxdepth 1 -type f -name "$name"
    done
)

which can easily be converted to a function

pfind() (
    name="$1"
    set -f    # noglob
    IFS=:; for p in $PATH
    do
        [ -d "$p" ] && find "$p" -maxdepth 1 -type f -name "$name"
    done
)

pfind 'gcc-*'

Conveniently you don't need local in this function because the ( … ) already localises the context.
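
A quick way to convince yourself of that, sketched with two throwaway functions (leaky and scoped are hypothetical names) that differ only in their body delimiters:

```sh
# { ... } runs in the current shell: the settings persist after the call.
leaky()  { IFS=:; set -f; }

# ( ... ) runs in a subshell: the settings vanish when it exits.
scoped() ( IFS=:; set -f )
```

After calling scoped, $IFS and the noglob option in the calling shell are unchanged; after calling leaky, both stay modified until they are reset by hand.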

As a final offering, here's the version without a loop

pfind() ( set -f; IFS=:; find $PATH -mindepth 1 -maxdepth 1 -type f -name "$1" 2>/dev/null )

where you have to discard stderr completely in order to avoid outputting errors for directories in $PATH that don't exist.

3
  • what's the advantage of looping over an IFS=:-separated path over using find directly? (I'm mostly asking as it prevents things like -print0) Commented Jul 30 at 14:18
  • 1
    @umläute like pfind() ( set -f; IFS=:; find $PATH -mindepth 1 -maxdepth 1 -type f -name "$1" 2>/dev/null )? That would work but you have to discard stderr to avoid errors where the $PATH contains directories that don't exist Commented Jul 30 at 14:59
  • Good clarification that "you don't need local in this function because the ( ... ) already localises the context". I think that's an advantage. Commented Jul 31 at 20:01
0

You don't mention whether you're trying to do this programmatically for scripting purposes, or manually for human-readable purposes. Also, this answer presupposes that you are looking for executable files. Since you already have several programmatic solutions, I'll just mention the bash-ism:

$ gcc-<Tab><Tab>

gcc-ar13 gcc-nm13 gcc-ranlib13

This likely works in some other shells, too.

1
  • indeed i was looking for programmatic solutions (and the pattern i used as an example was somewhat simplistic for the sake of the question). but indeed, for an interactive session (and the simplistic pattern), tab-completion is of course the way to go. Commented Aug 7 at 19:33
