I'm looking for a command to return the absolute path of a file, without resolving symlinks. In general, realpath does this well.

$ mkdir /tmp/test; cd /tmp/test
$ mkdir foo
$ ln -s foo bar
$ realpath -se bar           # good, this does not resolve the symlink
/tmp/test/bar

It also works for files inside symlink directories.

$ touch foo/file
$ realpath -se bar/file      # good, this does not resolve the symlink
/tmp/test/bar/file

However, it fails when the current directory is a symlinked directory:

$ cd bar
$ pwd
/tmp/test/bar
$ realpath -se file          # this fails, returning the target
/tmp/test/foo/file
$ realpath -se .             # this also fails, returning the target
/tmp/test/foo
$ realpath -se /tmp/test/bar/file # yet this works
/tmp/test/bar/file
$ realpath -se /tmp/test/bar # and this works
/tmp/test/bar

Why would realpath behave like that? (Is it a bug?) Is there a way to get realpath to never resolve the symlink, or is there another method I should use?

  • It seems as if realpath, when the given pathname is a filename, does a pwd -P (equivalent) to get the current directory path. Prepending $PWD/ would be a solution, but an awkward one. Commented Apr 2, 2020 at 7:25
  • When you do cd bar, it's already too late: the shell is now in /tmp/test/foo and nothing run from it will be able to know about bar. The only thing that still knows about bar is the shell itself (the built-in pwd but not /bin/pwd, $PWD and its output in $PS1). To see this for yourself, try: ls -l /proc/$$/cwd. Commented Apr 2, 2020 at 9:20
  • @A.B Could you please explain in more detail? What do you mean by "the shell is now in"? Are you saying that apart from the exceptions you list, any other command is fed the target, and reads from /proc/$$/cwd? Commented Apr 2, 2020 at 9:58
  • I made an answer. I hope the explanation is OK for you. Commented Apr 2, 2020 at 10:19

1 Answer

The current working directory (CWD) of a process is inherited at the OS level from its parent process, or it can be changed for the current process using chdir(2). The OS (here I mean "the kernel") will always resolve any symlink to determine the end result, which must be a directory, not a symlink to a directory. For example, chdir(2) can return the error ELOOP when there are too many symbolic links to resolve. So from the OS point of view, no process can have a CWD that is not a directory: the OS always resolves it to the real path, without any symlink anywhere.

Once the shell has done cd /tmp/test/bar, the CWD path has been resolved by the OS into /tmp/test/foo. For example, on a Linux system, ls -l /proc/$$/cwd will show the link to the resolved path as seen by the kernel: /tmp/test/foo.
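The two views of the current directory can be compared side by side. This is only a sketch: it recreates the foo/bar layout under a fresh temporary directory (the variable names are made up for the illustration) and assumes a POSIX shell whose pwd supports -L and -P. The /proc lookup is Linux-specific and is skipped when /proc is not available.

```shell
# Recreate the layout from the question in a scratch directory.
tmp=$(mktemp -d)
tmp=$(cd "$tmp" && pwd -P)   # canonicalize, in case the temp dir itself sits behind a symlink
mkdir "$tmp/foo"
ln -s foo "$tmp/bar"
cd "$tmp/bar"

logical=$(pwd -L)    # what the shell remembers: the path typed in cd
physical=$(pwd -P)   # what the kernel reports: symlinks resolved
echo "logical:  $logical"
echo "physical: $physical"

# Linux only: the kernel's own record of this shell's CWD.
if [ -e "/proc/$$/cwd" ]; then
    kernel_view=$(readlink "/proc/$$/cwd")
    echo "kernel:   $kernel_view"
fi
```

The logical path ends in bar, while the physical path ends in foo.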

The shell still displays bar in its prompt because it remembers the cd command done before. The behaviour could depend on the shell; I'll assume bash here. Its built-in pwd (but not the external /bin/pwd command), the $PWD variable and their use in $PS1 will "lie" to the user about the current directory.

Any process run from the shell, such as realpath or /bin/pwd, will of course inherit the actual CWD, which is /tmp/test/foo. So that's not a bug in realpath: the CWD it inherits carries no information about bar.
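A small check of this, as a sketch only: it assumes GNU coreutils (realpath with -s and -e, and an external pwd at /bin/pwd that prints the physical directory by default) and uses a scratch directory with made-up variable names.

```shell
# Same layout as in the question, under a scratch directory.
tmp=$(mktemp -d)
tmp=$(cd "$tmp" && pwd -P)   # canonicalize the scratch path itself
mkdir "$tmp/foo"
ln -s foo "$tmp/bar"
cd "$tmp/bar"

builtin_view=$(pwd)             # shell built-in: logical path
external_view=$(/bin/pwd)       # child process: asks the kernel
realpath_view=$(realpath -se .) # child process: relative path resolved against the real CWD

echo "built-in pwd:   $builtin_view"
echo "/bin/pwd:       $external_view"
echo "realpath -se .: $realpath_view"
```

Only the built-in reports the bar path; both external commands report the foo path.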

One possible, if awkward, way, as suggested by Kusalananda, is to reuse the $PWD variable and prepend it to realpath's argument, but only if the argument isn't already absolute.

Here's an example. I'm not sure there are no ways to abuse it. For example, while the function below would cope, the $PWD variable itself doesn't behave well in bash 4.4.12 (Debian 9) if there's a linefeed character in the path, though it works fine in bash 5.0.3 (Debian 10). When there's a linefeed somewhere, a -z option should also be passed to realpath for the output to be useful, but I'm not going to reimplement the whole parsing of options in this simple example.

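myrealpathnofollowsym () {
    local p
    for p in "$@"; do
        # Anchor relative arguments to the shell's logical directory ($PWD);
        # absolute arguments are passed through unchanged.
        case $p in
            /*) realpath -se "$p" ;;
            *)  realpath -se "$PWD/$p" ;;
        esac
    done
}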
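
To try the idea against the layout from the question, something like the following could be used. It is a sketch assuming GNU realpath. The helper nofollow_abs is a hypothetical, single-path condensation of the function above, repeated here only so that the snippet runs on its own.

```shell
# Hypothetical one-path variant of the function above.
nofollow_abs () {
    case $1 in
        /*) realpath -se "$1" ;;        # already absolute: pass through
        *)  realpath -se "$PWD/$1" ;;   # relative: anchor to the logical directory
    esac
}

# Scratch copy of the question's layout.
tmp=$(mktemp -d)
tmp=$(cd "$tmp" && pwd -P)   # canonicalize the scratch path itself
mkdir "$tmp/foo"
ln -s foo "$tmp/bar"
touch "$tmp/foo/file"
cd "$tmp/bar"

from_relative=$(nofollow_abs file)
from_dot=$(nofollow_abs .)
from_absolute=$(nofollow_abs "$tmp/bar/file")
printf '%s\n' "$from_relative" "$from_dot" "$from_absolute"
```

All three results keep bar in the path, including the two relative cases that plain realpath -se resolved to foo in the question.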
  • This answer is interesting, but may appear a bit misleading. While the kernel makes getcwd() always return a resolved path, that is kind of irrelevant because the environment variables are also inherited. When running cd $( mktemp -d ); mkdir directory; ln -s directory symlink; cd symlink; /bin/sh -c pwd, the child process will show itself being inside a path with the symlink still in the output. How realpath acts here is a consequence of how it is implemented, not of which information is accessible to it. Commented Apr 2 at 7:58
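
The commenter's observation can be checked with a sketch like the one below. It assumes that /bin/sh is a POSIX shell that trusts an inherited $PWD when it points at the current directory (dash and bash do), and that env supports the -u option (GNU and BSD env do). The variable names are made up for the illustration.

```shell
# Layout taken from the comment: a directory and a symlink to it.
tmp=$(mktemp -d)
tmp=$(cd "$tmp" && pwd -P)   # canonicalize the scratch path itself
mkdir "$tmp/directory"
ln -s directory "$tmp/symlink"
cd "$tmp/symlink"
export PWD   # usually exported already; made explicit for the demonstration

with_pwd=$(/bin/sh -c pwd)                 # child inherits $PWD along with the CWD
without_pwd=$(env -u PWD /bin/sh -c pwd)   # child has only the kernel's CWD to go on

echo "with \$PWD:    $with_pwd"
echo "without \$PWD: $without_pwd"
```

With $PWD in its environment the child shell reports the symlink path; with $PWD removed it falls back to the resolved directory.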
