[v3-2-test] Fix macOS SIGSEGV in task execution by using fork+exec (#64874) #66872
Merged
On macOS, the task supervisor's bare os.fork() copies the parent's Objective-C runtime state into the child process. When the child later triggers ObjC class initialization (e.g. socket.getaddrinfo -> system DNS resolver -> Security.framework -> +[NSNumber initialize]), the runtime detects the corrupted state and crashes with SIGABRT/SIGSEGV.

This is a well-documented macOS platform limitation -- Apple's ObjC runtime, CoreFoundation, and libdispatch are not fork-safe. CPython changed multiprocessing's default start method to "spawn" on macOS in 3.8 for this reason, but Airflow's TaskSDK supervisor uses os.fork() directly.

The fix: on macOS, immediately call os.execv() after os.fork() for task execution subprocesses. The exec replaces the child's address space, giving it clean ObjC state. The socketpair FDs survive across exec (marked inheritable) and the child reads their numbers from an environment variable.

Only task execution (target=_subprocess_main) uses fork+exec. DAG processor and triggerer pass different targets and keep bare fork -- they don't make network calls that trigger the macOS crash.

References:
- python/cpython#105912
- python/cpython#58037
- #24463

(cherry picked from commit a3383b7)
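The fork+exec pattern described above can be illustrated with a minimal, standalone sketch. This is not the code from the PR: the environment variable name SKETCH_CHILD_FD, the function fork_exec_roundtrip, and the inline child program are all hypothetical, chosen only to show how an inheritable socketpair FD can survive os.execv() and be rediscovered by the child through the environment.

```python
import os
import socket
import sys

# Hypothetical variable name, used only in this sketch (not Airflow's).
FD_ENV_VAR = "SKETCH_CHILD_FD"

# Program the child runs after exec: read the FD number from the
# environment, wrap it in a socket object, and send one message.
CHILD_CODE = (
    "import os, socket\n"
    f"fd = int(os.environ[{FD_ENV_VAR!r}])\n"
    "sock = socket.socket(fileno=fd)\n"
    "sock.sendall(b'hello from exec')\n"
    "sock.close()\n"
)


def fork_exec_roundtrip() -> bytes:
    """Fork, exec a fresh interpreter in the child, return what it sends."""
    parent_sock, child_sock = socket.socketpair()
    # Python creates FDs non-inheritable by default (PEP 446), so the
    # child's end must be opted in to stay open across exec.
    child_sock.set_inheritable(True)

    pid = os.fork()
    if pid == 0:
        # Child: exec immediately so the address space copied by fork
        # (including any framework state) is replaced by a fresh one.
        try:
            parent_sock.close()
            os.environ[FD_ENV_VAR] = str(child_sock.fileno())
            os.execv(sys.executable, [sys.executable, "-c", CHILD_CODE])
        finally:
            # Reached only if exec failed; never return into parent code.
            os._exit(127)

    # Parent: drop its copy of the child's end, then read until EOF.
    child_sock.close()
    chunks = []
    while True:
        data = parent_sock.recv(1024)
        if not data:
            break
        chunks.append(data)
    parent_sock.close()
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(fork_exec_roundtrip())
```

The sketch execs unconditionally. Per the description above, the actual change applies this only on macOS and only for task execution subprocesses, so a real implementation would presumably gate it on a platform check.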
Lee-W
approved these changes
May 15, 2026
vatsrahul1001
added a commit
that referenced
this pull request
May 20, 2026
…#66872)
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
vatsrahul1001
added a commit
that referenced
this pull request
May 21, 2026
…#66872)
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Cherry-pick of #64874
Conflict resolution
One conflict in task-sdk/tests/task_sdk/execution_time/test_supervisor.py. The PR side's after-context included two unrelated test functions, test_in_process_api_server_caches_instance and test_api_client_clears_dag_bag_override_when_dag_is_none, that come from a separate PR not in this backport set. They referenced in_process_api_server and InProcessTestSupervisor._api_client symbols/flows that v3-2-test does not have in this form, and they're not part of the #64874 diff. Dropped those two functions and kept only #64874's actual addition: the TestChildExecMain class with test_uses_fds_012_and_requests_log_channel.