Description
Ksh's ** recursive globbing (enabled with set -o globstar / set -G)
behaves, on close inspection, significantly differently from other
implementations, and in quite surprising ways. The documentation is very
terse about it:
-G Causes the pattern ** by itself to match files and zero or more
directories and sub-directories when used for file name
generation. If followed by a / only directories and
sub-directories are matched.
It would be useful for its behaviour to be specified more clearly for
users to know what to expect.
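As a starting point for checking the manual's wording, here is a minimal sketch that prints what ** expands to with the option off and on. The directory layout and the names in it (dir, sub, file, link) are made up for illustration, and it assumes the ksh found in $PATH is a ksh93 variant (other ksh implementations may reject the globstar option).

```shell
#!/bin/sh
# Throwaway tree to glob against; every name below is hypothetical.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/dir/sub"
: > "$demo/file"
: > "$demo/dir/sub/file"
ln -s dir "$demo/link"            # a symlink to a directory

# Assumes "ksh" is a ksh93 variant; skipped entirely when it is missing.
if command -v ksh >/dev/null 2>&1; then
  (
    cd "$demo" || exit
    echo 'globstar off, pattern **:'
    ksh -c 'printf "  <%s>\n" **'
    echo 'globstar on, pattern **:'
    ksh -c 'set -o globstar; printf "  <%s>\n" **'
    echo 'globstar on (set -G), pattern **/:'
    ksh -c 'set -G; printf "  <%s>\n" **/'
  ) || echo 'ksh failed (possibly not ksh93)'
else
  echo 'ksh not found' >&2
fi
```

The tree is left in "$demo" so it can be inspected afterwards; remove it with rm -rf "$demo" when done.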
Recursive globbing was introduced in 1990 by zsh, and the **/ form
specifically in 2.2 in 1992.
ksh93 didn't introduce ** until over 10 years later, but its
implementation was apparently written independently of zsh's. Up until
2016, David Korn thought he had invented it (see
news://news.gmane.io/CAPVHe3R-qDpi+yG3OwTvhAPSnhBofboeYqonjckjB6EiOmhTDg@mail.gmail.com),
which suggests he wasn't aware of zsh's implementation when he wrote his.
That would explain why it's significantly different. A few other shells
(bash, yash, tcsh, fish) have later added support for **, with varying
behaviours, though often closer to zsh's than to ksh93's.
In zsh, the operator was **/, and ** alone was not special. **/ was
short for (*/)#, that is, it matches 0 or more subdirectories (excluding
symlinks to directories, see ***/ for the form that follows symlinks).
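The equivalence described above can be checked with a small sketch like the following. The tree and its names are hypothetical, zsh is started with -f so that no startup files interfere, and extendedglob is turned on because the # operator needs it. If the description holds, the first two lines of output should list the same files, and the third should additionally go through the symlink.

```shell
#!/bin/sh
# Throwaway tree; every name below is hypothetical.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/dir/sub"
: > "$demo/file"
: > "$demo/dir/sub/file"
ln -s dir "$demo/link"            # a symlink to a directory

# Skipped when zsh is not installed.
if command -v zsh >/dev/null 2>&1; then
  (
    cd "$demo" || exit
    # extendedglob enables #, the "zero or more of the preceding" operator.
    zsh -f -c '
      setopt extendedglob
      print -r -- "**/file   :" **/file
      print -r -- "(*/)#file :" (*/)#file
      print -r -- "***/file  :" ***/file
    '
  ) || echo 'zsh failed'
else
  echo 'zsh not found' >&2
fi
```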
In ksh93, **/, /** and ** are all special.
Here are a few things I've gathered from ksh93's ** through
experimentation:
** alone is like zsh's **/*, that is expands to files of any type in the
current directory or in subdirectories, but not following symlinks when
recursing.
**/ is like zsh's **/*/ in that it will also include symlinks to
directories (possibly contradicts the manual above), while zsh's **/
excludes symlinks to directories.
**/file however is like zsh's **/file and will not report "file"s inside
symlinks to directories.
dir/** will match files under "dir", but
di[r]/** will match files under "dir" as well as "dir" itself (unless
"dir" is a symlink to a directory, in which case it will match only
"dir"). That is, /** can match the empty string if it follows a
component that contains wildcards.
[f]ile/** will match on "file" even if "file" is not a directory. In
effect, **, */**, **/* all expand to the same thing.
When ** is present in a glob, it affects the way symlinks are processed
in all the glob path components.
For instance, link/dir*/** or **/link/* will not match anything (but
link/dir/**, link/dir/*, */link/* will). As long as ** is present, ksh
won't read the contents of symlinked dir components to match files in
them.
link/** won't match (but lin[k]/** will match on "link" and "link" only
as seen above).
"." and ".." seem to receive special treatment in ksh93u+m (not in
ksh2020 nor ksh93u) in that **/. or **/.. (or **/./file...) don't match
anything (possibly related to #58).
Note that I'm not saying ksh93's API is better or worse. It's got its
uses. For instance, not following symlinks could be seen as an
improvement even if it makes globbing inconsistent overall. The same goes
for /** possibly matching the empty string: /foo*/** matches the
/foo* directories and their contents, as opposed to just their contents
with zsh's /foo*/**/*, and /etc/profile*/** matches
/etc/profile as well as /etc/profile.d and its contents.
bash's ** has undergone several significant changes since it was introduced
in 4.0, and its documentation has been very terse all along as well (and
very similar to ksh93's, suggesting the original intention was to copy
the ksh93 behaviour rather than zsh's, as also suggested by the choice of
the "globstar" option name). bash's ** also exhibits some of the
behaviours of ksh93's (like di[r]/** including "dir", though note
that contrary to ksh93, fil[e]/** doesn't expand to "file", and
dir/** does include dir/).
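The bash side of the comparison can be probed the same way. This is only a sketch with a made-up tree, and since bash's behaviour has changed between releases, the output should be read together with the version it prints. It assumes bash 4.0 or later; older versions have no globstar option.

```shell
#!/bin/sh
# Throwaway tree; every name below is hypothetical.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/dir/sub"
: > "$demo/file"
: > "$demo/dir/sub/file"
ln -s dir "$demo/link"            # a symlink to a directory

if command -v bash >/dev/null 2>&1; then
  # Record the version, since ** behaviour differs between releases.
  bash -c 'echo "bash $BASH_VERSION"'
  for p in '**' '**/' 'dir/**' 'di[r]/**' 'fil[e]/**' 'lin[k]/**'; do
    printf '%-12s' "$p"
    (cd "$demo" &&
      bash -c "shopt -s globstar || exit; printf ' <%s>' $p; echo") ||
      echo ' (bash failed, globstar needs 4.0 or later)'
  done
else
  echo 'bash not found' >&2
fi
```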
See also
https://unix.stackexchange.com/questions/62660/the-result-of-ls-ls-and-ls