Context
I have a GitHub Action running on a Linux machine.
It does the following (you may want to skip to the problem first, as most of this may be irrelevant):
1. pwd: /home/runner/work/net.twisterrob.cinema/net.twisterrob.cinema
2. Download a zip and extract to diff/prev
3. Download a zip and extract to diff/curr
4. All the rest runs inside the diff folder.
5. Runs diffs on pairs of files inside prev and curr like this:
    diff --unified=3 --new-file --text --minimal "prev/backend.lockfile" "curr/backend.lockfile" > "backend.lockfile.diff"
6. Runs another set of different diffs on other files:
    ./dependency-tree-diff.jar "prev/$1.dependencies" "curr/$1.dependencies" > "$1.dependencies.diff"
7. Converts the diff to patch via a shell script:
    ./dependency-tree-diff-to-patch.sh "$1.dependencies.diff" "${BASE}" "${HEAD}" > "$1.dependencies.patch"
8. At the end of all this these are the directory contents:
    ls -la
    total 272
    drwxr-xr-x  4 runner docker  4096 Jan 27 21:13 .
    drwxr-xr-x 12 runner docker  4096 Jan 27 21:12 ..
    -rw-r--r--  1 runner docker 32293 Jan 27 21:13 backend-database.dependencies.diff
    -rw-r--r--  1 runner docker 32784 Jan 27 21:13 backend-database.dependencies.patch
    -rw-r--r--  1 runner docker  4250 Jan 27 21:13 backend-database.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 backend-database.lockfile.diff.error
    -rw-r--r--  1 runner docker 14853 Jan 27 21:13 backend-endpoint.dependencies.diff
    -rw-r--r--  1 runner docker 15322 Jan 27 21:13 backend-endpoint.dependencies.patch
    -rw-r--r--  1 runner docker  4202 Jan 27 21:13 backend-endpoint.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 backend-endpoint.lockfile.diff.error
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 backend-feed.dependencies.diff
    -rw-r--r--  1 runner docker   426 Jan 27 21:13 backend-feed.dependencies.patch
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 backend-feed.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 backend-feed.lockfile.diff.error
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 backend-network.dependencies.diff
    -rw-r--r--  1 runner docker   438 Jan 27 21:13 backend-network.dependencies.patch
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 backend-network.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 backend-network.lockfile.diff.error
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 backend-quickbook.dependencies.diff
    -rw-r--r--  1 runner docker   446 Jan 27 21:13 backend-quickbook.dependencies.patch
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 backend-quickbook.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 backend-quickbook.lockfile.diff.error
    -rw-r--r--  1 runner docker 13691 Jan 27 21:13 backend-sync.dependencies.diff
    -rw-r--r--  1 runner docker 14141 Jan 27 21:13 backend-sync.dependencies.patch
    -rw-r--r--  1 runner docker  3869 Jan 27 21:13 backend-sync.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 backend-sync.lockfile.diff.error
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 backend.dependencies.diff
    -rw-r--r--  1 runner docker   406 Jan 27 21:13 backend.dependencies.patch
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 backend.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 backend.lockfile.diff.error
    drwxr-xr-x  2 runner docker  4096 Jan 27 21:12 curr
    -rwxr-xr-x  1 runner docker   441 Jan 27 21:13 dependency-tree-diff-to-patch.sh
    -rwxr-xr-x  1 runner docker 20979 Jun 17  2022 dependency-tree-diff.jar
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 plugins-settings.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 plugins-settings.lockfile.diff.error
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 plugins.dependencies.diff
    -rw-r--r--  1 runner docker   406 Jan 27 21:13 plugins.dependencies.patch
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 plugins.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 plugins.lockfile.diff.error
    drwxr-xr-x  2 runner docker  4096 Jan 27 21:12 prev
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 root-settings.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 root-settings.lockfile.diff.error
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 root.dependencies.diff
    -rw-r--r--  1 runner docker   394 Jan 27 21:13 root.dependencies.patch
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 root.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 root.lockfile.diff.error
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 test-helpers.dependencies.diff
    -rw-r--r--  1 runner docker   426 Jan 27 21:13 test-helpers.dependencies.patch
    -rw-r--r--  1 runner docker     0 Jan 27 21:13 test-helpers.lockfile.diff
    -rw-r--r--  1 runner docker     2 Jan 27 21:13 test-helpers.lockfile.diff.error
9. And for good measure the size of the folder:
    du -h -d 1
    2.3M ./prev
    2.3M ./curr
    4.8M .
10. And the disk usage:
    df
    Filesystem     1K-blocks     Used Available Use% Mounted on
    /dev/root       87204404 51329544  35858476  59% /
    tmpfs            3555312      172   3555140   1% /dev/shm
    tmpfs            1422128     1092   1421036   1% /run
    tmpfs               5120        0      5120   0% /run/lock
    /dev/sdb15        106858     5329    101529   5% /boot/efi
    /dev/sda1       14341128  4194336   9396508  31% /mnt
    tmpfs             711060       12    711048   1% /run/user/1001
11. Merge the diff files into one file:
    ls --format=single-column --time=ctime --reverse *.dependencies.diff | xargs tail --lines=+1 > all.dependencies.diff
Step 11 is meant to be the equivalent of cat *.dependencies.diff > all.dependencies.diff, with a bit more control (ordering) and fluff (headings). It is based on https://stackoverflow.com/a/2817024 and https://stackoverflow.com/a/7816490, with the flags expanded to their long forms using man ls and man tail.
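One thing that may be worth ruling out in Step 11: the output name all.dependencies.diff itself matches the *.dependencies.diff glob. The two sides of a pipeline start concurrently, so the redirect can create the output file before ls expands the glob, in which case tail could end up reading the file it is writing. The following is only a sketch of a merge that avoids that possibility, not a confirmed diagnosis. The scratch directory, the two sample files and the all.dependencies.merged.tmp name are made up for illustration, and the files are merged in name order rather than by ctime.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory standing in for the real diff folder.
workdir="$(mktemp -d)"
cd "$workdir"
printf 'first\n' > a.dependencies.diff
printf 'second\n' > b.dependencies.diff

# Expand the glob into the positional parameters before any output file exists.
set -- *.dependencies.diff

# Write to a name the glob cannot match (hypothetical), then rename when done.
merged_tmp="all.dependencies.merged.tmp"
# With several files, tail prints a "==> name <==" heading before each one.
tail -n +1 -- "$@" > "$merged_tmp"
mv "$merged_tmp" all.dependencies.diff
```

If the merged file still needs to end in .dependencies.diff, the rename at the end keeps that name while making sure it never exists while the inputs are being listed.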
Problem and question
This last step, Step 11, sometimes times out. The whole GitHub Action has a 10 minute timeout, and Step 11 often takes 9 minutes and gets cancelled. But not always: oftentimes it "just works". All the ls, du and df output is there to help diagnose what is happening. I'm posting this in the hope that someone sees something wrong with this setup/script/environment that could cause it to not finish every time. I'm trying to rule out that my script is wrong.
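To narrow this down without burning most of the 10 minute budget, the step could be run under a much shorter limit with the disk usage printed before and after. This is a minimal sketch, assuming GNU coreutils timeout is available on the runner. run_guarded is a hypothetical helper name, and the limits used here are arbitrary.

```shell
#!/bin/sh
# Run a command under a time limit (in seconds), printing disk usage before and after.
# run_guarded is a hypothetical helper; timeout is the GNU coreutils tool.
run_guarded() {
    limit="$1"
    shift
    echo "before: $(df -k . | tail -n 1)"
    status=0
    timeout "$limit" "$@" || status=$?
    echo "after:  $(df -k . | tail -n 1)"
    # GNU timeout exits with 124 when the command ran past the limit.
    if [ "$status" -eq 124 ]; then
        echo "command exceeded ${limit}s" >&2
    fi
    return "$status"
}

# Example: a command that finishes well within the limit.
run_guarded 5 true
```

A pipeline such as the one in Step 11 would need to be wrapped, for example as run_guarded 60 sh -c '...', because timeout runs a single command.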
Example
I have a good example which exhibits all the different scenarios on the exact same code (without anything changing):
Attempt 1: failure https://github.com/TWiStErRob/net.twisterrob.cinema/actions/runs/4028195985/jobs/6924812632
Attempt 2: failure https://github.com/TWiStErRob/net.twisterrob.cinema/actions/runs/4028195985/jobs/6925862191
Attempt 3: success https://github.com/TWiStErRob/net.twisterrob.cinema/actions/runs/4028195985/jobs/6930610577
The weirdest thing is that when it fails it consistently shows "No space left on device", and always at this step!
System.IO.IOException: No space left on device : '/home/runner/runners/2.301.1/_diag/Worker_20230127-210915-utc.log'
at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
at System.IO.Strategies.BufferedFileStreamStrategy.WriteSpan(ReadOnlySpan`1 source, ArraySegment`1 arraySegment)
at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
at System.Diagnostics.TextWriterTraceListener.Flush()
at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
System.IO.IOException: No space left on device : '/home/runner/runners/2.301.1/_diag/Worker_20230127-210915-utc.log'
at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
at System.IO.Strategies.BufferedFileStreamStrategy.WriteSpan(ReadOnlySpan`1 source, ArraySegment`1 arraySegment)
at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
at System.Diagnostics.TextWriterTraceListener.Flush()
at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
at GitHub.Runner.Common.Tracing.Error(Exception exception)
at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
Unhandled exception. System.IO.IOException: No space left on device : '/home/runner/runners/2.301.1/_diag/Worker_20230127-210915-utc.log'
at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
at System.Diagnostics.TextWriterTraceListener.Flush()
at System.Diagnostics.TraceSource.Flush()
at GitHub.Runner.Common.TraceManager.Dispose(Boolean disposing)
at GitHub.Runner.Common.TraceManager.Dispose()
at GitHub.Runner.Common.HostContext.Dispose(Boolean disposing)
at GitHub.Runner.Common.HostContext.Dispose()
at GitHub.Runner.Worker.Program.Main(String[] args)