
I want to test if a relative symlink points within the subtree of a certain directory.

This example would yield false since it points outside the foo directory:
/foo>readlink bar
../fie.txt

While this example would yield true:
/foo>readlink bar
fum/fie.txt

Is there an existing utility I can leverage or will I have to code it from scratch? I'm using bash.

  • What if fie.txt or fum is itself a symlink outside foo? Commented Jan 8, 2014 at 12:12
  • Would you always run this from /foo or do you need to be able to pass arbitrary directories? I mean, is the question always with respect to ./ or not? Commented Jan 8, 2014 at 12:23
  • @StephaneChazelas Yeah, that's a problem that I choose to ignore, sort of at least. I'm thinking I'll expand the link using readlink -f and see if the prefixes match. But I will ignore crazy corner cases since they don't exist in our environment. Commented Jan 8, 2014 at 14:18
  • @terdon No, it should accept arbitrary directories. Commented Jan 8, 2014 at 14:21
  • Have a look at symlinks, it may help. Commented Jan 15, 2014 at 12:28

3 Answers


I don't think there's such a utility. With GNU readlink, you could do something like:

is_in() (
  needle=$(readlink -ve -- "$1" && echo .) || exit
  haystack=$(readlink -ve -- "$2" && echo .) || exit
  needle=${needle%??} haystack=${haystack%??}
  haystack=${haystack%/} needle=${needle%/}
  case $needle in
    ("$haystack" | "$haystack"/*) true;;
    (*) false;;
  esac
)

That resolves all symlinks to end up with a canonical absolute path for both needle and haystack.

Explanation

  • We get the canonical absolute path of both the needle and the haystack. We use -e instead of -f as we want to make sure the files exist. The -v option gives an error message if the files can't be accessed.

  • As always, -- should be used to mark the end of options, and the expansions should be quoted, as we don't want to invoke the split+glob operator here.

  • Command substitution in Bourne-like shells has a misfeature: it removes all the newline characters from the end of a command's output, not just the one that commands add to terminate the last line. That means that for a file like /foo<LF><LF>, $(readlink -ve -- "$1") would return /foo. The common work-around is to append a non-LF character (here .) and then strip both it and the LF character added by readlink with var=${var%??} (remove the last two characters).

  • The needle is regarded as being in the haystack if it is the haystack or if it is haystack/something. However, that wouldn't work if the haystack was / (/etc for instance is not //something). / often needs to be treated specially because while / and /xx have the same number of slashes, one is a level above the other.

    One way to address it is to replace / with the empty string which is done with var=${var%/} (the only path ending with / that readlink -e may output is /, so removing a trailing / is changing / to the empty string).
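The newline-stripping behaviour and the sentinel work-around can be seen without touching the file system. The following is only a minimal sketch: print_path is a hypothetical stand-in for readlink (it prints its argument followed by one LF, as readlink does), and the path used is made up for the demonstration.

```shell
# print_path is a hypothetical stand-in for readlink: it prints its argument
# followed by a single terminating LF.
print_path() { printf '%s\n' "$1"; }

# A made-up path whose last character is itself a newline.
name='/foo
'

# Plain command substitution strips every trailing LF, including the one
# that belongs to the path.
naive=$(print_path "$name")

# Appending a sentinel "." keeps the trailing LFs out of reach of the
# stripping; ${safe%??} then removes the terminating LF and the sentinel.
safe=$(print_path "$name" && echo .)
safe=${safe%??}

if [ "$naive" = "$name" ]; then echo "naive: path preserved"; else echo "naive: path altered"; fi
if [ "$safe" = "$name" ]; then echo "sentinel: path preserved"; else echo "sentinel: path altered"; fi
```

Only the sentinel variant should report the path as preserved.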

To canonicalize the file paths, you could use a helper function.

canonicalize_path() {
  # canonicalize paths stored in supplied variables. `/` is returned as 
  # the empty string.
  for _var do
    eval '
      '"$_var"'=$(readlink -ve -- "${'"$_var"'}" && echo .) &&
      '"$_var"'=${'"$_var"'%??} &&
      '"$_var"'=${'"$_var"'%/}' || return
  done
}

is_in() (
  needle=$1 haystack=$2
  canonicalize_path needle haystack || exit
  case $needle in
    ("$haystack" | "$haystack"/*) true;;
    (*) false;;
  esac
)
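As a rough usage sketch, the layout from the question can be recreated in a temporary directory. This assumes GNU readlink (for -e) and mktemp are available; the link names inner and outer are hypothetical, and is_in is repeated here only so that the example runs on its own.

```shell
# is_in is repeated from the answer above so that the sketch runs on its own.
# It assumes GNU readlink, which provides the -e option.
is_in() (
  needle=$(readlink -ve -- "$1" && echo .) || exit
  haystack=$(readlink -ve -- "$2" && echo .) || exit
  needle=${needle%??} haystack=${haystack%??}
  haystack=${haystack%/} needle=${needle%/}
  case $needle in
    ("$haystack" | "$haystack"/*) true;;
    (*) false;;
  esac
)

# Hypothetical layout mirroring the question: one link pointing inside foo,
# one pointing outside it.
tmp=$(mktemp -d) || exit
trap 'rm -rf -- "$tmp"' EXIT
mkdir -p -- "$tmp/foo/fum"
touch -- "$tmp/fie.txt" "$tmp/foo/fum/fie.txt"
ln -s fum/fie.txt "$tmp/foo/inner"   # relative target below foo
ln -s ../fie.txt "$tmp/foo/outer"    # relative target above foo

for link in inner outer; do
  if is_in "$tmp/foo/$link" "$tmp/foo"; then
    echo "$link: inside foo"
  else
    echo "$link: outside foo"
  fi
done
```

Because both arguments are canonicalized, the directory can be given as any path, not just the current directory.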
  • I found studying this post very instructive. May I ask you a couple of questions? Is there any significance in the fact that in needle=${needle%??} haystack=${haystack%??} the needle variable is dealt with first, whereas in the next line it is the other way around? Also, how come your return statements don't explicitly return a non-zero value (to indicate error)? Last one: would it make sense to factor out the entire transformation (the call to readlink, plus the two suffix truncations) to a separate _canonicalize_path helper function? Commented Feb 22, 2016 at 12:28
  • @kjo, 1) no significance 2) return returns by default with the status of the last command. With || return, that allows returning the status provided by the failing command. 3) sure, but the resulting function will likely not be a pleasant sight. I'll add an example. Commented Feb 22, 2016 at 12:48
  • Thanks! I see what you mean! Not a pleasant sight at all. Shell programming must be the hardest type of programming I know of... Commented Feb 22, 2016 at 13:11

I solved the problem like this:

echo $abs_link_target | grep -qe "^$containing_dir"

The $abs_link_target variable contains the absolute path to the symlink target (expanded through readlink -f). I then check whether the beginning of the target path matches $containing_dir.

  • It would say that /foobar is in /foo Commented Jan 9, 2014 at 8:28
  • It would say that /abc/d is in /a.b ($containing_dir taken as a regexp) Commented Jan 9, 2014 at 8:28
  • It would say that /abc is in /foo<LF>/abc (the lines of the pattern string are treated as different patterns to match) Commented Jan 9, 2014 at 8:30
  • It would say that /foo<LF>/abc is in /abc (grep matches on each line of the input, not the whole input so generally can't be used to match file names). Commented Jan 9, 2014 at 8:55
  • Depending on the echo implementation and/or the environment, you'll have issues with filenames containing backslashes. echo should really not be used to handle arbitrary data. Commented Jan 9, 2014 at 8:56
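
The pitfalls listed in these comments come from treating the directory as a regular expression and from feeding paths through line-oriented tools. A case pattern with a quoted prefix, as used in the first answer, avoids them. The following is a minimal sketch, not a drop-in replacement: path_has_prefix and check are hypothetical names, and both arguments are assumed to be canonical absolute paths already (for example from readlink -e).

```shell
# path_has_prefix is a hypothetical helper. It expects two absolute paths that
# are already canonical (for example as produced by readlink -e).
path_has_prefix() (
  path=$1
  dir=${2%/}                      # turns "/" into the empty string
  case $path in
    # Quoting "$dir" makes it match literally, so "." or "*" in the
    # directory name are not treated as wildcards.
    ("$dir" | "$dir"/*) exit 0;;  # the directory itself or something below it
    (*) exit 1;;
  esac
)

# check is a hypothetical wrapper that prints the verdict for one pair.
check() {
  if path_has_prefix "$1" "$2"; then
    echo "$1 is in $2"
  else
    echo "$1 is not in $2"
  fi
}

check /foo/fum/fie.txt /foo
check /foo /foo
check /foobar /foo
check /abc/d /a.b
```

Since no pipe or echo is involved, paths containing newlines or backslashes are compared as whole strings.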

grep -q "^/foo/bar/" <<< "$(readlink -f "anyfile.ext")"

  • Assuming the target of anyfile.ext exists and is reachable (otherwise, readlink -f as opposed to readlink -e might not give you the correct path) and that the resulting path doesn't contain newline characters (assumes zsh or bash4 or ksh93m+ or above). Note that if anyfile.ext points to /foo/bar itself, it will say it's not within. Commented Jan 10, 2014 at 11:56
