Revisions to Why can I not bind a mount namespace to a file

deleted 2327 characters in body

edited May 5, 2019 at 16:28

Given that it only affects the mount namespace, I strongly suspect that this is due to one of the loop prevention checks for mount namespaces. I do not think it is exactly the case the link describes, because unshare --mount defaults to setting mount propagation to private, i.e. disabling it.

However, to protect against certain race conditions, I think full correctness might require that you mount your mount namespaces inside a mount which has private mount propagation. I also think it might be cleanest (easiest to debug) if you use unbindable. (I think unbindable already includes all the effects of private.)

I.e. mount your mount namespaces inside a directory prepared using:

mount --bind /var/local/lib/myns/ /var/local/lib/myns/
mount --make-unbindable /var/local/lib/myns/

In general I think this is the safest approach, to avoid ever triggering such a problem.

My race condition is hypothetical. I would not expect you to be hitting it most of the time. So I do not know what your actual problem is.
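
One way to confirm that the directory ended up with the intended propagation is to read the optional fields of /proc/self/mountinfo, as documented in proc(5). The following is only a sketch: mount_propagation is a hypothetical helper name, the optional second argument exists purely so the parser can be tried on a saved copy of mountinfo, and mount points containing spaces (escaped as \040 in mountinfo) are not handled.

```shell
#!/bin/sh
# mount_propagation MOUNTPOINT [MOUNTINFO_FILE]
# Prints the optional propagation tags (e.g. "shared:1", "master:2",
# "unbindable") of the last mountinfo entry whose mount point is exactly
# MOUNTPOINT, or "private" if that entry has no tags.
# Returns non-zero if MOUNTPOINT is not listed as a mount point.
# (Hypothetical helper, not part of util-linux.)
mount_propagation() {
    awk -v mp="$1" '
        $5 == mp {
            # Field 5 is the mount point; optional fields start at
            # field 7 and run up to the "-" separator.
            tags = ""
            for (i = 7; i <= NF && $i != "-"; i++)
                tags = tags " " $i
            result = (tags == "") ? "private" : substr(tags, 2)
        }
        END {
            if (result == "")
                exit 1
            print result
        }
    ' "${2:-/proc/self/mountinfo}"
}
```

After the two mount commands above, `mount_propagation /var/local/lib/myns` should report unbindable. A result such as shared:N would mean the directory still takes part in mount propagation.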

Post Undeleted by sourcejedi

occurred May 5, 2019 at 16:21

deleted 2327 characters in body

edited May 5, 2019 at 16:21

sourcejedi


This answer is copied and reworded from the question post "What code prevents mount namespace loops? In a more complex case involving mount propagation".

The following commands return an error:

# touch /tmp/a
# mount --bind /proc/self/ns/mnt /tmp/a
mount: /tmp/a: wrong fs type, bad option, bad superblock on /proc/self/ns/mnt, missing codepage or helper program, or other error.

This is because the kernel code (see extracts below) prevents a simple mount namespace loop. The code comments explain why this is not allowed. The lifetime of a mount namespace is tracked by a simple reference count. If you have a loop where mount namespaces A and B each reference the other, then both A and B will always have at least one reference, so neither would ever be freed. The allocated memory would be lost until you rebooted the entire system.
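
The most direct instance of this is a process trying to bind its own mount namespace. As a rough sketch, the identity of a mount namespace can be read from the /proc/PID/ns/mnt symlink, which resolves to a string of the form mnt:[inode]. The helper names mnt_ns_id and same_mnt_ns below are hypothetical, and the check only detects the case where the target is the very same namespace; it says nothing about older namespaces, which the kernel also refuses.

```shell
#!/bin/sh
# mnt_ns_id PID: print the mount namespace identifier of PID
# (PID may also be the literal word "self").
mnt_ns_id() {
    readlink "/proc/$1/ns/mnt"
}

# same_mnt_ns PID: succeed if PID is in the same mount namespace as the
# caller. Fails if either identifier cannot be read.
same_mnt_ns() {
    mine=$(mnt_ns_id self) || return 1
    theirs=$(mnt_ns_id "$1") || return 1
    [ -n "$mine" ] && [ "$mine" = "$theirs" ]
}
```

For example, `same_mnt_ns 8456 && echo "same namespace, bind mount would be refused"` could be run before attempting the bind mount. Reading another user's /proc/PID/ns/mnt link may require privileges.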

For comparison, the kernel allows the following, which is not a loop:

# unshare -m
# echo $$
8456
# kill -STOP $$
[1]+  Stopped                 unshare -m

# touch /tmp/a
# mount --bind /proc/8456/ns/mnt /tmp/a
#
# umount /tmp/a  # cleanup
#

If I try to create a loop using mount propagation, it will also fail. I suspect this is what happens in your question. Shared mounts (i.e. pr

# mount --make-shared /tmp
# unshare -m --propagation shared
# echo $$
8456
# kill -STOP $$
[1]+  Stopped                 unshare -m

# mount --bind /proc/8456/ns/mnt /tmp/a
mount: /tmp/a: wrong fs type, bad option, bad superblock on /proc/9061/ns/mnt, missing codepage or helper program, or other error.

But if I remove the mount propagation, no loop is created, and it succeeds:

# unshare -m --propagation private
# echo $$
8456
# kill -STOP $$
[1]+  Stopped                 unshare -m

# mount --bind /proc/8456/ns/mnt /tmp/a
# 
# umount /tmp/a  # cleanup

Kernel code which handles the simpler case

https://elixir.bootlin.com/linux/v4.18/source/fs/namespace.c

static bool mnt_ns_loop(struct dentry *dentry)
{
    /* Could bind mounting the mount namespace inode cause a
     * mount namespace loop?
     */
    struct mnt_namespace *mnt_ns;
    if (!is_mnt_ns_file(dentry))
        return false;

    mnt_ns = to_mnt_ns(get_proc_ns(dentry->d_inode));
    return current->nsproxy->mnt_ns->seq >= mnt_ns->seq;
}

...

    err = -EINVAL;
    if (mnt_ns_loop(old_path.dentry))
        goto out;

...

/*
 * Assign a sequence number so we can detect when we attempt to bind
 * mount a reference to an older mount namespace into the current
 * mount namespace, preventing reference counting loops.  A 64bit
 * number incrementing at 10Ghz will take 12,427 years to wrap which
 * is effectively never, so we can ignore the possibility.
 */
static atomic64_t mnt_ns_seq = ATOMIC64_INIT(1);

static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns)
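
The comparison in mnt_ns_loop() can be restated as a toy model: each mount namespace receives an increasing sequence number when it is created, and a bind mount of a namespace file is refused unless the target namespace is strictly newer than the current one. The function name would_loop and the numbers used with it are made up for illustration; real sequence numbers are internal to the kernel and not visible from the shell.

```shell
#!/bin/sh
# would_loop CURRENT_SEQ TARGET_SEQ
# Toy restatement of "current->nsproxy->mnt_ns->seq >= mnt_ns->seq":
# succeeds (exit status 0) when the kernel check would report a
# possible loop, i.e. when the target namespace is the same as, or
# older than, the current one.
would_loop() {
    [ "$1" -ge "$2" ]
}

# describe CURRENT_SEQ TARGET_SEQ: print the outcome in words.
describe() {
    if would_loop "$1" "$2"; then
        echo "current=$1 target=$2: refused (EINVAL)"
    else
        echo "current=$1 target=$2: allowed"
    fi
}
```

With made-up numbers, `describe 5 5` models binding /proc/self/ns/mnt (refused), while `describe 5 9` models the original namespace binding a namespace created later by unshare -m (allowed).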

Given that it only affects the mount namespace, I am extremely suspicious that this is due to one of the loop prevention checks for mount namespaces. I do not think it is the exact same case as the link talks about, because unshare --mount defaults to setting mount propagation to private, i.e. disabling it.

However, to protect against certain race conditions, I think full correctness might indeed require that you mount your mount namespaces inside a directory which has private mount propagation. I also think it might be cleanest (easiest to debug) if you use unbindable. (I think unbindable already includes all the effects of private).

In general I think this is the safest approach, to avoid ever triggering such a problem.

My race condition is hypothetical. I would not expect you to be hitting it most of the time. So I do not know what your actual problem is.

Post Deleted by sourcejedi

occurred May 5, 2019 at 16:09

Post Undeleted by sourcejedi

occurred May 5, 2019 at 16:05

added 670 characters in body

edited May 5, 2019 at 16:05

sourcejedi


This answer is copied and stripped down from "What code prevents mount namespace loops? In a more complex case involving mount propagation". I.e. this answer describes the "simple" case :-) of mount namespace loops, which are forbidden.

# unshare -m
# echo $$
8456
# kill -STOP $$
[1]+  Stopped                 unshare -m

# touch /tmp/a
# mount --bind /proc/8456/ns/mnt /tmp/a
#
# umount /tmp/a  # cleanup
#

Kernel code

This answer is copied and reworded from the question post "What code prevents mount namespace loops? In a more complex case involving mount propagation".

# unshare -m
# echo $$
8456
# kill -STOP $$
[1]+  Stopped                 unshare -m

# touch /tmp/a
# mount --bind /proc/8456/ns/mnt /tmp/a
#
# umount /tmp/a  # cleanup
#

If I try to create a loop using mount propagation, it will also fail. I suspect this is what happens in your question. Shared mounts (i.e. pr

# mount --make-shared /tmp
# unshare -m --propagation shared
# echo $$
8456
# kill -STOP $$
[1]+  Stopped                 unshare -m

# mount --bind /proc/8456/ns/mnt /tmp/a
mount: /tmp/a: wrong fs type, bad option, bad superblock on /proc/9061/ns/mnt, missing codepage or helper program, or other error.

But if I remove the mount propagation, no loop is created, and it succeeds:

# unshare -m --propagation private
# echo $$
8456
# kill -STOP $$
[1]+  Stopped                 unshare -m

# mount --bind /proc/8456/ns/mnt /tmp/a
# 
# umount /tmp/a  # cleanup

Kernel code which handles the simpler case

Post Deleted by sourcejedi

occurred May 5, 2019 at 16:02

answered May 5, 2019 at 15:59

sourcejedi