The Wayback Machine - https://web.archive.org/web/20230731080318/https://github.com/google/gvisor/commits/master
Commits

Commits on Jul 27, 2023

  1. nginx config: Remove worker_processes and `events.worker_connections`.

    This is closer to the default nginx settings with respect to concurrency.
    
    Turn off access log, as it is heavy on I/O and would not be used in a
    production setup (whether with or without gVisor).
    
    Update nginx to version `1.25.1`.
    
    PiperOrigin-RevId: 551669286
    EtiennePerot authored and gvisor-bot committed Jul 27, 2023
    3924579
  2. site: note that flags passed to run should be replicated for restore

    PiperOrigin-RevId: 551588446
    kevinGC authored and gvisor-bot committed Jul 27, 2023
    39d89e4
  3. Merge pull request #8990 from sitano:ivan_ptrace_eperm_guide

    PiperOrigin-RevId: 551559276
    gvisor-bot committed Jul 27, 2023
    a3ae02e

Commits on Jul 24, 2023

  1. gVisor fio benchmarks: Use libaio where it makes sense.

    Enforce that `IODepth` must be `1` when using the `sync` IO engine, since
    it has no effect with that engine.
    
    Also add unit names to `tools.Fio` struct fields for easier readability.
    
    PiperOrigin-RevId: 550710273
    EtiennePerot authored and gvisor-bot committed Jul 24, 2023
    9926c0f
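
    As a rough illustration of the `IODepth` rule described in this commit, a check along the following lines would reject the invalid combination. The `fioConfig` type, its fields, and `validate` are hypothetical names used only for this sketch; they are not the real `tools.Fio` struct.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    )

    // fioConfig is a hypothetical stand-in for the benchmark's fio settings.
    type fioConfig struct {
    	IOEngine string // for example "sync" or "libaio"
    	IODepth  int    // number of I/O units kept in flight
    }

    // validate rejects an IODepth other than 1 for the "sync" engine, where a
    // deeper queue would have no effect.
    func (c fioConfig) validate() error {
    	if c.IODepth < 1 {
    		return errors.New("IODepth must be at least 1")
    	}
    	if c.IOEngine == "sync" && c.IODepth != 1 {
    		return fmt.Errorf("IODepth must be 1 with the sync engine, got %d", c.IODepth)
    	}
    	return nil
    }

    func main() {
    	fmt.Println(fioConfig{IOEngine: "sync", IODepth: 4}.validate())
    	fmt.Println(fioConfig{IOEngine: "libaio", IODepth: 4}.validate())
    }
    ```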
  2. Implement the setns syscall

    This change introduces the nsfs file system. Each new namespace allocates
    a new nsfs inode.
    
    Here are reasons why we need these inodes:
    * each namespace has to have a unique id.
    * proc/pid/ns/ contains one entry for each namespace. Bind mounting one of
      the files in this directory to somewhere else in the filesystem keeps the
      corresponding namespace alive even if all processes currently in
      the namespace terminate.
    * setns() allows the calling process to join an existing namespace specified
      by a file descriptor.
    
    PiperOrigin-RevId: 550694515
    avagin authored and gvisor-bot committed Jul 24, 2023
    4611550
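
    The reasons listed in this commit can be pictured with a toy model: every namespace gets its own inode number, and it stays alive for as long as anything (a member process, an open file descriptor, a bind mount) still holds it. All names below are hypothetical; this is not gVisor's nsfs code.

    ```go
    package main

    import "fmt"

    // namespace is a toy model of a namespace backed by an nsfs-style inode.
    type namespace struct {
    	inode uint64 // unique ID of the namespace
    	refs  int    // holders: member processes, open FDs, bind mounts
    }

    // registry hands out inode numbers and tracks which namespaces are alive.
    type registry struct {
    	nextInode uint64
    	live      map[uint64]*namespace
    }

    func newRegistry() *registry {
    	return &registry{nextInode: 1, live: make(map[uint64]*namespace)}
    }

    // create allocates a fresh inode for a new namespace with one holder.
    func (r *registry) create() *namespace {
    	ns := &namespace{inode: r.nextInode, refs: 1}
    	r.nextInode++
    	r.live[ns.inode] = ns
    	return ns
    }

    // hold adds a holder, for example a bind mount of the namespace file.
    func (r *registry) hold(ns *namespace) { ns.refs++ }

    // release drops a holder and destroys the namespace when none remain.
    func (r *registry) release(ns *namespace) {
    	ns.refs--
    	if ns.refs == 0 {
    		delete(r.live, ns.inode)
    	}
    }

    func (r *registry) alive(ns *namespace) bool {
    	_, ok := r.live[ns.inode]
    	return ok
    }

    func main() {
    	r := newRegistry()
    	ns := r.create() // first process enters the namespace
    	r.hold(ns)       // namespace file is bind mounted elsewhere
    	r.release(ns)    // last process exits
    	fmt.Println(ns.inode, r.alive(ns))
    	r.release(ns) // bind mount is removed
    	fmt.Println(r.alive(ns))
    }
    ```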
  3. Better memory reporting for multi-container

    Right now, the entire sandbox memory is reported per-container, confusing
    users and tools that aggregate per-container memory to compute sandbox/pod
    memory. So instead, split memory usage among all containers in the
    system, except for the root container which is ignored by K8s. This way
    pod memory usage is shown correctly in graphs.
    
    Updates #172
    
    PiperOrigin-RevId: 550670618
    fvoznika authored and gvisor-bot committed Jul 24, 2023
    a5fd501
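
    A minimal sketch of the splitting idea follows. It assumes an even split with the remainder handed to the first containers, which the commit message does not specify; `splitMemory` and its parameters are hypothetical names.

    ```go
    package main

    import "fmt"

    // splitMemory divides the sandbox-wide usage among all containers except
    // the root container, so that the per-container values add up to total.
    // The even split is an assumption made for this sketch.
    func splitMemory(total uint64, ids []string, rootID string) map[string]uint64 {
    	var shared []string
    	for _, id := range ids {
    		if id != rootID {
    			shared = append(shared, id)
    		}
    	}
    	out := make(map[string]uint64, len(shared))
    	if len(shared) == 0 {
    		return out
    	}
    	n := uint64(len(shared))
    	base, extra := total/n, total%n
    	for i, id := range shared {
    		out[id] = base
    		if uint64(i) < extra {
    			out[id]++ // spread the remainder over the first containers
    		}
    	}
    	return out
    }

    func main() {
    	usage := splitMemory(10, []string{"root", "a", "b", "c"}, "root")
    	fmt.Println(usage["a"], usage["b"], usage["c"])
    }
    ```

    With this shape, a tool that sums the per-container values recovers the sandbox total instead of a multiple of it.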
  4. Fix fio "regex"s in buildkite file.

    PiperOrigin-RevId: 550601194
    zkoopmans authored and gvisor-bot committed Jul 24, 2023
    0ef88bb
  5. 71bfa2b

Commits on Jul 21, 2023

  1. Add methods for generating PCI sysfs paths and registering accel devices.

    The TPU userspace driver needs access to specific PCI device information
    located in Linux sysfs. We mirror the sysfs paths the driver reads on the host
    in the Sentry sysfs. This way we can ensure we only expose the host device
    information that's strictly necessary for TPU to run.
    
    PiperOrigin-RevId: 550005271
    manninglucas authored and gvisor-bot committed Jul 21, 2023
    19e0421
  2. Remove last remaining !go1.22 build tag

    The last remaining !go1.22 build is protecting the definition of
    pkg/sync.maptype, which is a copy of runtime.maptype. We need to ensure these
    definitions match so we can safely access the hasher field.
    
    At its core, this CL achieves this check by ensuring that
    unsafe.Offsetof(maptype{}.Hasher) matches the offset in the runtime version of
    the type.
    
    Several things happen along the way to achieve this:
    
    * As of May 2023, runtime.maptype is actually a type alias for
    internal/abi.MapType. checkoffset was failing to record the offsets because it
    skipped type aliases for no good reason. Simply removing the type alias check
    is sufficient to make type aliases work. (This part of the CL is technically
    unnecessary because this CL ultimately references internal/abi.MapType
    directly in anticipation of removal of the type alias. But there is no reason
    not to allow type aliases).
    
    * The checkconst / checkoffset regexp unintentionally does not allow / in
    package paths, even though the rest of the package supports /. Fix this.
    
    * checkconst was comparing the literal AST expression string against the
    runtime value (i.e., "unsafe.Offsetof(maptype{}.Hasher)" vs "72"), which fails
    comparison. Switch to getting the resolved constant value from the type
    checker.
    
    * nogo/check.importer only loads package facts on direct import (stored in
    importer.cache). If a package is not directly imported ImportPackageFact will
    not find the facts. Typically packages need to ensure they directly depend on
    packages they want facts from (e.g., pkg/sync has a dummy import of runtime in
    runtime.go). This doesn't work for internal/abi because we cannot directly
    import an internal package. Work around this as a hack by unconditionally
    "importing" internal/abi when analyzing any package.
    
    With regard to the last point, note that the nogo/defs.bzl nogo integration only
    provides facts from the direct dependencies and the entire stdlib (since the
    stdlib is analyzed as one bundle). So this trick only works for a stdlib
    package. A bazel package indirect dependency would be missing facts altogether.
    
    PiperOrigin-RevId: 549999084
    prattmic authored and gvisor-bot committed Jul 21, 2023
    f3e4a1f
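
    The core idea of this commit, comparing `unsafe.Offsetof` for the same field in two definitions of a type, can be sketched as below. The commit performs the comparison in the nogo static analysis against the real runtime type; this sketch only compares two local, hypothetical structs (`upstream` and `mirror`) at run time.

    ```go
    package main

    import (
    	"fmt"
    	"unsafe"
    )

    // upstream stands in for a type defined elsewhere (hypothetical layout).
    type upstream struct {
    	Key    uintptr
    	Elem   uintptr
    	Hasher func(unsafe.Pointer, uintptr) uintptr
    }

    // mirror is a local copy that must keep the same field layout.
    type mirror struct {
    	Key    uintptr
    	Elem   uintptr
    	Hasher func(unsafe.Pointer, uintptr) uintptr
    }

    // hasherOffsets returns the byte offset of Hasher in both definitions.
    func hasherOffsets() (uintptr, uintptr) {
    	return unsafe.Offsetof(upstream{}.Hasher), unsafe.Offsetof(mirror{}.Hasher)
    }

    // layoutsMatch reports whether the copy can be used to reach Hasher safely.
    func layoutsMatch() bool {
    	a, b := hasherOffsets()
    	return a == b && unsafe.Sizeof(upstream{}) == unsafe.Sizeof(mirror{})
    }

    func main() {
    	a, b := hasherOffsets()
    	fmt.Println(a, b, layoutsMatch())
    }
    ```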
  3. Add nvproxy support for V100 Nvidia GPUs.

    Tested on 1 V100 GPU:
    ```
    $ docker run --runtime=runsc --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubi8
    [Vector addition of 50000 elements]
    Copy input data from the host memory to the CUDA device
    CUDA kernel launch with 196 blocks of 256 threads
    Copy output data from the CUDA device to the host memory
    Test PASSED
    Done
    ```
    
    PiperOrigin-RevId: 549837326
    ayushr2 authored and gvisor-bot committed Jul 21, 2023
    41ec0d4
  4. Remove unused parameters

    PiperOrigin-RevId: 549820338
    fvoznika authored and gvisor-bot committed Jul 21, 2023
    d40be91

Commits on Jul 20, 2023

  1. Merge pull request #9007 from andrew-d:andrew/tcp-forwarder-on-ignored

    PiperOrigin-RevId: 549727271
    gvisor-bot committed Jul 20, 2023
    1ba123f
  2. Add seccomp filters for TPU proxying and stub out accel fd methods.

    PiperOrigin-RevId: 549718797
    manninglucas authored and gvisor-bot committed Jul 20, 2023
    d4510e7
  3. Add config flags and sandbox chroot configuration for TPU proxying.

    PiperOrigin-RevId: 549662855
    manninglucas authored and gvisor-bot committed Jul 20, 2023
    5eb44a9
  4. Plumb memory cgroup id in memmap.IncRef.

    Update the memmap IncRef method to pass memory cgroup id and store it in the
    FrameRefSet which will be used for memory accounting. During DecRef, the
    memCgID from the FrameRefSet will be retrieved and passed to MemoryLocked.Dec
    to remove the memory from the cgroup.
    
    PiperOrigin-RevId: 549656411
    nybidari authored and gvisor-bot committed Jul 20, 2023
    aff5168
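
    The IncRef/DecRef flow described in this commit might look roughly like the following toy reference set, which remembers the memory cgroup ID given on the first reference and hands it back when a reference is dropped. The types and method signatures are hypothetical and much simpler than the real `FrameRefSet`.

    ```go
    package main

    import "fmt"

    // frameRef records how many references a frame has and which memory
    // cgroup it was charged to.
    type frameRef struct {
    	count   int
    	memCgID uint32
    }

    // frameRefSet is a toy stand-in for a per-frame reference set.
    type frameRefSet struct {
    	refs map[uint64]*frameRef
    }

    func newFrameRefSet() *frameRefSet {
    	return &frameRefSet{refs: make(map[uint64]*frameRef)}
    }

    // incRef takes a reference and stores memCgID on the first reference.
    func (s *frameRefSet) incRef(frame uint64, memCgID uint32) {
    	if r, ok := s.refs[frame]; ok {
    		r.count++
    		return
    	}
    	s.refs[frame] = &frameRef{count: 1, memCgID: memCgID}
    }

    // decRef drops a reference. It returns the stored cgroup ID and whether
    // the frame was released, so the caller can uncharge that cgroup.
    func (s *frameRefSet) decRef(frame uint64) (memCgID uint32, released bool) {
    	r, ok := s.refs[frame]
    	if !ok {
    		return 0, false
    	}
    	r.count--
    	if r.count > 0 {
    		return r.memCgID, false
    	}
    	delete(s.refs, frame)
    	return r.memCgID, true
    }

    func main() {
    	s := newFrameRefSet()
    	s.incRef(42, 7)
    	s.incRef(42, 7)
    	fmt.Println(s.decRef(42))
    	fmt.Println(s.decRef(42))
    }
    ```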
  5. Add O_DIRECT version of fio benchmarks to track direct I/O performance.

    PiperOrigin-RevId: 549470615
    EtiennePerot authored and gvisor-bot committed Jul 20, 2023
    0244c8c

Commits on Jul 19, 2023

  1. Run nvidia-container-cli configure in the Gofer mount namespace.

    This change adds a new synchronization FD to the Gofer startup sequence,
    passed as `sync-nvproxy-fd`. The `runsc create` process uses this to
    wait for the Gofer to start, then to run
    `nvidia-container-cli configure [...] --pid=$GOFER_PID`.
    
    This causes the mounts that `nvidia-container-cli configure` does to be
    performed in the mount namespace of the Gofer, rather than the `runsc create`
    process. This avoids polluting the main mount namespace with NVIDIA-specific
    mountpoints, and ties the lifetime of these mounts to the lifetime of the
    Gofer process, which means they are cleaned up automatically when the sandbox
    exits.
    
    Due to the added complexity in Gofer startup, this CL also introduces a
    `goferSyncFDs` struct that encodes some of the logic around these FDs, and
    better documents how they interact with the Gofer and the container startup
    sequence.
    
    One suggestion by Ayush was to move `nvidia-container-cli` to be done after
    `createGoferProcess` returns. Unfortunately this isn't possible without having
    to return the nvproxy FD in the `createGoferProcess` return signature, since
    FDs can only be donated before the Gofer process has started. This would make
    the signature uglier. So instead, this CL takes the approach of a single
    `nvproxyConfigureGofer` function called during Gofer initialization. It
    creates and donates the FD to the Gofer command, and returns a callback
    function called after the Gofer process is started, where it finally runs
    `nvidia-container-cli configure` and then notifies the Gofer through this
    same FD it created. This encapsulates all the logic within
    `nvproxyConfigureGofer` and is the cleanest I could think of.
    
    Tested manually using a fresh Debian machine, with:
    
    ```shell
    $ sudo mkdir -p /tmp/bundle-cuda/rootfs
    $ docker export $(docker create nvidia/cuda:11.6.2-base-ubuntu20.04) \
        | sudo tar -xf - -C /tmp/bundle-cuda/rootfs
    $ sudo runc spec --bundle=/tmp/bundle-cuda
    $ $EDITOR /tmp/bundle-cuda/config.json
    # Add NVIDIA_VISIBLE_DEVICES=0 and NVIDIA_DRIVER_CAPABILITIES=all to env
    $ sudo ./runsc -nvproxy -nvproxy-docker create --bundle=/tmp/bundle-cuda mycuda
    $ sudo ./runsc -nvproxy -nvproxy-docker start mycuda
    $ sudo ./runsc -nvproxy -nvproxy-docker exec mycuda nvidia-smi -L
    (Works)
    $ sudo ./runsc delete --force mycuda
    # And verified at each step that `grep bundle-cuda /proc/mounts` was empty.
    ```
    
    And also verified that regular use through Docker also works.
    
    Fixes #9142.
    
    PiperOrigin-RevId: 549427158
    EtiennePerot authored and gvisor-bot committed Jul 19, 2023
    a455fbd
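
    The "donate an FD before start, return a callback to run after start" pattern described in this commit can be sketched with a plain pipe. Everything here is an assumption made for illustration: `prepareSync` and `waitForSync` are hypothetical helpers, the child process is replaced by a goroutine, and closing the write end is used as the notification, which may differ from what the real `nvproxyConfigureGofer` does.

    ```go
    package main

    import (
    	"fmt"
    	"io"
    	"os"
    )

    // prepareSync creates the synchronization pipe. The returned file is the
    // end to donate to the child; afterStart must be called once the child is
    // running. It runs the configure step and then notifies the child.
    func prepareSync(configure func() error) (donate *os.File, afterStart func() error, err error) {
    	r, w, err := os.Pipe()
    	if err != nil {
    		return nil, nil, err
    	}
    	afterStart = func() error {
    		defer w.Close() // closing the write end wakes up the reader
    		return configure()
    	}
    	return r, afterStart, nil
    }

    // waitForSync blocks until the other side closes its end of the pipe.
    func waitForSync(f *os.File) error {
    	defer f.Close()
    	_, err := io.ReadAll(f) // returns a nil error at EOF
    	return err
    }

    func main() {
    	donate, afterStart, err := prepareSync(func() error {
    		fmt.Println("configure step runs here")
    		return nil
    	})
    	if err != nil {
    		panic(err)
    	}
    	done := make(chan error, 1)
    	go func() { done <- waitForSync(donate) }() // stands in for the child
    	if err := afterStart(); err != nil {
    		panic(err)
    	}
    	fmt.Println("child released:", <-done == nil)
    }
    ```

    Returning the callback keeps the FD creation, the configure step, and the notification inside one helper, which is the encapsulation argument the commit message makes.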
  2. Don't run benchmarks on ptrace on buildkite.

    PiperOrigin-RevId: 549384309
    zkoopmans authored and gvisor-bot committed Jul 19, 2023
    2212760
  3. Add accel and gasket ABI definitions.

    PiperOrigin-RevId: 549376196
    manninglucas authored and gvisor-bot committed Jul 19, 2023
    9f4df21
  4. Allow walking on FIFO and UDS in lisafs.

    The flags --host-uds and --host-fifo only control whether the application can
    open/connect or create/bind these special files. Stat-ing a host FIFO or UDS
    should not be blocked.
    
    PiperOrigin-RevId: 549199286
    ayushr2 authored and gvisor-bot committed Jul 19, 2023
    ea7cd71
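
    The policy described in this commit can be summarized as: stat-like operations are always allowed, and only open/connect or create/bind are gated by the flags. The sketch below collapses each flag into a single boolean, which is a simplification, and all type and function names are hypothetical.

    ```go
    package main

    import "fmt"

    type fileType int

    const (
    	regular fileType = iota
    	fifo
    	socket // Unix domain socket (UDS)
    )

    type operation int

    const (
    	opStat operation = iota // walking to or stat-ing the file
    	opOpen                  // open/connect or create/bind
    )

    // hostFlags is a simplified view of --host-fifo and --host-uds.
    type hostFlags struct {
    	FIFO bool
    	UDS  bool
    }

    // allowed reports whether op may be performed on a host file of type t.
    func allowed(op operation, t fileType, f hostFlags) bool {
    	if op == opStat {
    		return true // the flags never block walking or stat
    	}
    	switch t {
    	case fifo:
    		return f.FIFO
    	case socket:
    		return f.UDS
    	default:
    		return true
    	}
    }

    func main() {
    	none := hostFlags{}
    	fmt.Println(allowed(opStat, fifo, none), allowed(opOpen, fifo, none))
    }
    ```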

Commits on Jul 18, 2023

  1. Increment/decrement memory accounted per cgroup.

    - Adds a new field in the usageInfo to store the memory cgroup id.
    - Creates a map of cgroup ids and memory stats to track the memory per cgroup
    in MemoryLocked struct.
    - Introduces new methods to increment, decrement, move, copy and get the total
    memory usage per cgroup.
    
    PiperOrigin-RevId: 549148091
    nybidari authored and gvisor-bot committed Jul 18, 2023
    a87aa73
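
    A rough picture of the per-cgroup bookkeeping listed in this commit is a map from cgroup ID to a usage counter with increment, decrement, move, and total operations. The `memoryAccounting` type and its methods are hypothetical; the real `MemoryLocked` struct tracks more than a single number per cgroup.

    ```go
    package main

    import "fmt"

    // memoryAccounting tracks accounted bytes per memory cgroup ID.
    type memoryAccounting struct {
    	perCgroup map[uint32]uint64
    }

    func newMemoryAccounting() *memoryAccounting {
    	return &memoryAccounting{perCgroup: make(map[uint32]uint64)}
    }

    // inc charges n bytes to cgroup cg.
    func (m *memoryAccounting) inc(cg uint32, n uint64) {
    	m.perCgroup[cg] += n
    }

    // dec uncharges n bytes from cgroup cg and refuses to underflow.
    func (m *memoryAccounting) dec(cg uint32, n uint64) error {
    	if m.perCgroup[cg] < n {
    		return fmt.Errorf("cgroup %d: cannot uncharge %d of %d bytes", cg, n, m.perCgroup[cg])
    	}
    	m.perCgroup[cg] -= n
    	return nil
    }

    // move transfers n bytes of accounted usage from one cgroup to another.
    func (m *memoryAccounting) move(from, to uint32, n uint64) error {
    	if err := m.dec(from, n); err != nil {
    		return err
    	}
    	m.inc(to, n)
    	return nil
    }

    // total returns the bytes currently charged to cgroup cg.
    func (m *memoryAccounting) total(cg uint32) uint64 {
    	return m.perCgroup[cg]
    }

    func main() {
    	m := newMemoryAccounting()
    	m.inc(1, 4096)
    	if err := m.move(1, 2, 1024); err != nil {
    		panic(err)
    	}
    	fmt.Println(m.total(1), m.total(2))
    }
    ```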
  2. Pass NV2080_CTRL_CMD_MC_SERVICE_INTERRUPTS through nvproxy.

    Fixes #9176
    
    PiperOrigin-RevId: 549125072
    nixprime authored and gvisor-bot committed Jul 18, 2023
    ef410e6
  3. Remove panic in ConsumeCoverageData() when no coverage is observed.

    A call to ConsumeCoverageData() can observe zero incremental coverage
    immediately after a concurrent call to ConsumeCoverageData() unlocks coverageMu
    if sync.Mutex.Lock/Unlock are excluded from coverage instrumentation.
    
    PiperOrigin-RevId: 549119637
    nixprime authored and gvisor-bot committed Jul 18, 2023
    f43a5fc
  4. Update host redirect handling for gvisor.dev

    We now make sure that the requested domain is a valid domain based on the
    custom domain and project ID settings before redirecting.
    
    PiperOrigin-RevId: 548859553
    ianlewis authored and gvisor-bot committed Jul 18, 2023
    14df01f

Commits on Jul 17, 2023

  1. kernfs: Don't try to cache anonymous inodes.

    They have no parent, so are not reachable again.
    
    PiperOrigin-RevId: 548765107
    nlacasse authored and gvisor-bot committed Jul 17, 2023
    150831f
  2. gvisor.dev homepage: Minor fixes.

    This CL does the following:
    
    - Add `<strong>emphasis</strong>` on the important keywords for each panel
    - Mention "LLM-generated code" as code that can be sandboxed in gVisor
    - Change "GPU support" header to "GPU & CUDA support"
    - Add PNG transparency to images where it was missing
    - Remove link to raw image file on architecture diagram
    - Adjust panel icon size
    - Adjust `<h2>` margins in panels
    - Small CSS cleanups
    
    PiperOrigin-RevId: 548764864
    EtiennePerot authored and gvisor-bot committed Jul 17, 2023
    42df09a
  3. Do not hold metadataMu on gofer O_DIRECT read path.

    dentry.writeback() takes dataMu when it needs to. This lock seems to be
    unnecessary.
    
    PiperOrigin-RevId: 548763586
    ayushr2 authored and gvisor-bot committed Jul 17, 2023
    05f62e5
  4. pkg/tcpip/transport/tcp: add statistics for dropped connections

    When the TCP forwarder ignores a connection due to having too many
    in-flight connections, it's not easy to log a message or update a metric
    for later debugging. Add a metric that will be incremented in this case
    so that the user of the Forwarder can observe this.
    
    Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
    andrew-d committed Jul 17, 2023
    057e0b7
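
    The metric described in this commit amounts to a counter that is bumped whenever a connection is ignored because the in-flight limit has been reached. The sketch below shows that shape with hypothetical names; it is not the netstack `Forwarder` API.

    ```go
    package main

    import (
    	"fmt"
    	"sync"
    )

    // forwarder is a toy connection admitter with an in-flight limit.
    type forwarder struct {
    	mu          sync.Mutex
    	maxInFlight int
    	inFlight    int
    	dropped     uint64 // connections ignored because the limit was hit
    }

    // accept admits a connection, or counts it as dropped when the limit of
    // in-flight connections has already been reached.
    func (f *forwarder) accept() bool {
    	f.mu.Lock()
    	defer f.mu.Unlock()
    	if f.inFlight >= f.maxInFlight {
    		f.dropped++
    		return false
    	}
    	f.inFlight++
    	return true
    }

    // done marks one in-flight connection as finished.
    func (f *forwarder) done() {
    	f.mu.Lock()
    	defer f.mu.Unlock()
    	if f.inFlight > 0 {
    		f.inFlight--
    	}
    }

    // droppedCount exposes the counter so callers can observe drops.
    func (f *forwarder) droppedCount() uint64 {
    	f.mu.Lock()
    	defer f.mu.Unlock()
    	return f.dropped
    }

    func main() {
    	f := &forwarder{maxInFlight: 2}
    	for i := 0; i < 3; i++ {
    		fmt.Println(f.accept())
    	}
    	fmt.Println(f.droppedCount())
    }
    ```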

Commits on Jul 15, 2023

  1. Deflake tcp test TestMaxRTO

    The test checked the RTO value (500ms) for the first retransmit by rounding the
    value to seconds which resulted in 1s. In some cases, when the RTO calculated
    was slightly less than 500ms (~499 ms) the test failed. Fix this by checking
    the absolute difference when the calculated rto is less than expected rto.
    
    Before: http://sponge2/6a8d125a-ff90-4090-8565-76b9f8a91573
    After: http://sponge2/386f9716-bdc1-4079-848d-4ebc23b70167
    PiperOrigin-RevId: 548273122
    nybidari authored and gvisor-bot committed Jul 15, 2023
    c7a7e6b
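
    The flake is easy to reproduce with `time.Duration.Round`: 500ms rounds up to 1s, while 499ms rounds down to 0s, so a rounded comparison fails for a value that is only 1ms off. The sketch below contrasts that with an absolute-difference check. The function names and the 5ms tolerance are assumptions for illustration, not values taken from the test.

    ```go
    package main

    import (
    	"fmt"
    	"time"
    )

    // roundedEqual compares two durations after rounding both to seconds,
    // which is the fragile style of check.
    func roundedEqual(got, want time.Duration) bool {
    	return got.Round(time.Second) == want.Round(time.Second)
    }

    // withinTolerance compares two durations by their absolute difference.
    func withinTolerance(got, want, tol time.Duration) bool {
    	diff := got - want
    	if diff < 0 {
    		diff = -diff
    	}
    	return diff <= tol
    }

    func main() {
    	got, want := 499*time.Millisecond, 500*time.Millisecond
    	fmt.Println(roundedEqual(got, want))
    	fmt.Println(withinTolerance(got, want, 5*time.Millisecond))
    }
    ```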

Commits on Jul 14, 2023

  1. Impose default tmpfs size limits correctly.

    Syzkaller came up with workloads that fallocate(2) 1 TB in /tmp. The host
    mlock(2) or madvise(2) syscalls on memfd(2) files end up hanging for multiple
    minutes in such situations causing the watchdog to mark the calling goroutine
    as stuck. memfd(2) files have no size limits.
    
    Linux fails such fallocate(2) attempts in /tmp with ENOSPC.
    In Linux tmpfs (shmem), when size= mount option is not specified, the default
    size limit for the mount is set to 50% of physical RAM size. But in gVisor, it
    is set to MaxInt64. Which is why Linux fails with ENOSPC and gVisor doesn't.
    
    In runsc, the physical RAM size is already exposed to the containerized
    application via `/proc/meminfo` which uses `usage.*TotalMemoryBytes`. These
    fields are configured using the `MemTotal:` field from host `/proc/meminfo`.
    So use that information to set the default size limit correctly.
    
    Reported-by: syzbot+4aa3d6d42b063a11c850@syzkaller.appspotmail.com
    PiperOrigin-RevId: 548252095
    ayushr2 authored and gvisor-bot committed Jul 14, 2023
    e54e366
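
    To make the default concrete, the sketch below derives a limit of 50% of `MemTotal:` from `/proc/meminfo`-style text, mirroring the Linux behavior described in this commit. It is an illustration only: `defaultTmpfsLimit` is a hypothetical helper, and it parses the text directly rather than using the `usage.*TotalMemoryBytes` fields mentioned above.

    ```go
    package main

    import (
    	"bufio"
    	"errors"
    	"fmt"
    	"strconv"
    	"strings"
    )

    // defaultTmpfsLimit returns half of MemTotal, in bytes, from text in the
    // format of /proc/meminfo (where MemTotal is reported in kB).
    func defaultTmpfsLimit(meminfo string) (uint64, error) {
    	sc := bufio.NewScanner(strings.NewReader(meminfo))
    	for sc.Scan() {
    		fields := strings.Fields(sc.Text())
    		if len(fields) >= 2 && fields[0] == "MemTotal:" {
    			kb, err := strconv.ParseUint(fields[1], 10, 64)
    			if err != nil {
    				return 0, err
    			}
    			return kb * 1024 / 2, nil
    		}
    	}
    	return 0, errors.New("MemTotal not found")
    }

    func main() {
    	sample := "MemTotal:       16384 kB\nMemFree:         1024 kB\n"
    	limit, err := defaultTmpfsLimit(sample)
    	fmt.Println(limit, err)
    }
    ```

    With a finite limit like this, an oversized fallocate(2) in /tmp can be refused with ENOSPC instead of being attempted.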
  2. Implement PR_{S,G}ET_CHILD_SUBREAPER.

    Closes #2323
    
    PiperOrigin-RevId: 548205854
    nlacasse authored and gvisor-bot committed Jul 14, 2023
    e7bd1b4
  3. Enforce --host-fifo flag in directfs.

    The flag is only enforced in lisafs gofer as of now.
    This change plumbs a gofer client flag that disallows opening FIFOs from the
    host filesystem.
    
    PiperOrigin-RevId: 548193558
    ayushr2 authored and gvisor-bot committed Jul 14, 2023
    f5049f6
  4. netstack: remove finished TODO

    Fixes #6015.
    
    PiperOrigin-RevId: 548191467
    kevinGC authored and gvisor-bot committed Jul 14, 2023
    Copy the full SHA
    f7f5baf View commit details
    Browse the repository at this point in the history
  5. Use write(2) host syscall to perform writes on disk-backed MemoryFiles.

    Prepopulating pages for disk-backed MemoryFiles has proved to be futile.
    The mf.MapInternal()+safemem.CopySeq() approach used right now incurs a lot of
    page faults without page population. Page-by-page faults incur a lot of
    context switching.
    
    On the other hand, the write syscall makes one context switch to kernel, and
    faults all the pages that are touched during write. Note that safemem.CopySeq()
    avoids a syscall and hence can be faster sometimes when the underlying page
    is populated. But with disk writebacks, it is hard to predict/account what is
    populated. Writebacks can happen asynchronously based on system load.
    
    Benchmark results show that FIO write performance improves a lot on rootfs:
    ```
    goos: linux
    goarch: amd64
    cpu: Intel(R) Xeon(R) CPU @ 2.80GHz
                                                           │ benchout.runsc-before │       benchout.runsc-after       │
                                                           │        sec/op         │   sec/op    vs base              │
    BuildABSL/page_cache.clean/filesystem.bindfs-4                      90.20 ± 2%   89.44 ± 1%       ~ (p=0.382 n=8)
    BuildGRPC/page_cache.clean/filesystem.bindfs-4                      626.0 ± 1%   626.8 ± 0%       ~ (p=0.505 n=8)
    RubySpecTest/page_cache.clean/filesystem.bindfs-4                   52.11 ± 1%   52.37 ± 1%       ~ (p=0.105 n=8)
    Fio/operation.write/blockSize.4K/filesystem.rootfs-4                2.509m ± 0%   2.509m ± 0%        ~ (p=0.878 n=8)
    Fio/operation.write/blockSize.64K/filesystem.rootfs-4               2.009m ± 0%   1.507m ± 0%  -24.98% (p=0.000 n=8)
    Fio/operation.write/blockSize.1024K/filesystem.rootfs-4             2.008m ± 0%   1.508m ± 0%  -24.90% (p=0.000 n=8)
    
                                                            │   benchout.runsc-before    │               benchout.runsc-after                │
                                                            │ bandwidth.bytes_per_second │ bandwidth.bytes_per_second  vs base               │
    Fio/operation.write/blockSize.4K/filesystem.rootfs-4                     649.1M ± 2%                  705.2M ± 2%   +8.64% (p=0.000 n=8)
    Fio/operation.write/blockSize.64K/filesystem.rootfs-4                    991.1M ± 1%                 1499.1M ± 3%  +51.25% (p=0.000 n=8)
    Fio/operation.write/blockSize.1024K/filesystem.rootfs-4                  1.198G ± 2%                  1.945G ± 2%  +62.34% (p=0.000 n=8)
    
                                                            │ benchout.runsc-before │             benchout.runsc-after             │
                                                            │ io_ops.ops_per_second │ io_ops.ops_per_second  vs base               │
    Fio/operation.write/blockSize.4K/filesystem.rootfs-4                158.5k ± 2%             172.2k ± 2%   +8.64% (p=0.000 n=8)
    Fio/operation.write/blockSize.64K/filesystem.rootfs-4               15.12k ± 1%             22.87k ± 3%  +51.25% (p=0.000 n=8)
    Fio/operation.write/blockSize.1024K/filesystem.rootfs-4             1.143k ± 2%             1.855k ± 2%  +62.34% (p=0.000 n=8)
    
                                                      │ benchout.runsc-before │    benchout.runsc-after     │
                                                      │       load.sec        │  load.sec   vs base         │
    RubySpecTest/page_cache.clean/filesystem.bindfs-4              7.555 ± 1%   7.585 ± 1%  ~ (p=0.457 n=8)
    ```
    
    PiperOrigin-RevId: 548021076
    ayushr2 authored and gvisor-bot committed Jul 14, 2023
    0d52b50