[v3-2-test] fix(scheduler): ignore stale executor success after defer reschedule (#66431) #67089

Merged
vatsrahul1001 merged 1 commit into v3-2-test from backport-322-66431 on May 18, 2026

@vatsrahul1001
Contributor

Backport of #66431 to v3-2-test for the 3.2.2 release.

Regression fix for #66374. When a trigger fires and moves a deferred TI back to scheduled (resume after defer) before the scheduler has processed the executor's success event from the pre-defer worker exit, the scheduler treated that stale success as a state mismatch and would kill the task externally. The fix adds a ti_requeued branch for the resume-after-defer case: if the TI is scheduled, next_method is not None, and the executor reports success, the event is treated as a requeue rather than a mismatch.
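The condition described above can be sketched in isolation. This is a minimal illustration, not the code in scheduler_job_runner.py: `TaskInstanceSnapshot` and `is_resume_after_defer_requeue` are hypothetical names, states are modeled as plain strings, and the same-try_number check mentioned in the commit message is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskInstanceSnapshot:
    """Hypothetical stand-in for the TI fields the check inspects."""

    state: str
    next_method: Optional[str]


def is_resume_after_defer_requeue(ti: TaskInstanceSnapshot, executor_state: str) -> bool:
    """Return True when a stale executor success should be treated as a requeue.

    Assumed conditions, taken from the PR description:
      * the executor reports success (the pre-defer worker exit),
      * the TI has already been moved back to scheduled by the trigger,
      * next_method is set, i.e. the TI is resuming after a defer.
    """
    return (
        executor_state == "success"
        and ti.state == "scheduled"
        and ti.next_method is not None
    )


# A TI resumed after defer: the stale success is benign.
resumed = TaskInstanceSnapshot(state="scheduled", next_method="execute_complete")
# A scheduled TI that never deferred: the success is still a mismatch candidate.
never_deferred = TaskInstanceSnapshot(state="scheduled", next_method=None)
```

Under these assumptions, only the first snapshot would be routed to the requeue branch; the second would still go through the existing mismatch handling.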

Backport notes

The tests required a small adaptation for v3-2-test:

  • v3-2-test mocks stats via @mock.patch("airflow.jobs.scheduler_job_runner.Stats.incr") (a direct Stats.incr patch). main uses the newer @mock.patch("airflow._shared.observability.metrics.stats._get_backend") indirection; that helper does not exist on v3-2-test.
  • The new test test_process_executor_events_stale_success_when_scheduled_after_defer is otherwise included verbatim from #66431, but its decorator and assertions are rewritten to use the v3-2-test Stats.incr mock pattern: the MagicMock(spec=StatsLogger) + mock_get_backend.return_value = … indirection is dropped and mock_stats_incr is referenced directly.
  • The scheduler change in scheduler_job_runner.py auto-merged cleanly.
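The direct-patch pattern mentioned in the notes above can be illustrated without Airflow installed. In this sketch, `Stats`, `handle_event`, and the metric name `example.state_mismatch` are hypothetical stand-ins; only `unittest.mock.patch.object` is a real API. The actual test patches `airflow.jobs.scheduler_job_runner.Stats.incr` by dotted path instead.

```python
from unittest import mock


class Stats:
    """Hypothetical stand-in for a stats client exposing a class-level incr()."""

    @staticmethod
    def incr(name: str, count: int = 1) -> None:
        pass


def handle_event(stale: bool) -> None:
    """Hypothetical handler: emits a counter only for a genuine mismatch."""
    if not stale:
        Stats.incr("example.state_mismatch")


def run_check() -> bool:
    # Patch the method directly and assert on the mock itself,
    # with no backend object or return_value wiring in between.
    with mock.patch.object(Stats, "incr") as mock_stats_incr:
        handle_event(stale=True)
        mock_stats_incr.assert_not_called()

        handle_event(stale=False)
        mock_stats_incr.assert_called_once_with("example.state_mismatch")
    return True
```

Patching the method itself keeps the assertions on a single mock, which is why the backport can drop the backend indirection used on main.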

Was generative AI tooling used to co-author this PR?
  • Yes — Claude Opus 4.7 (1M context)

Generated-by: Claude Opus 4.7 (1M context) following the guidelines

…66431)

* fix(scheduler): ignore stale executor success after defer reschedule

When a trigger moves a deferred task back to scheduled before the
scheduler processes the executor success from the worker defer exit,
treat it as benign (same try_number, next_method set) instead of
state mismatch failure.

Closes #66374

Co-authored-by: Cursor <cursoragent@cursor.com>

* Remove newsfragment for bugfix (per review)

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
(cherry picked from commit ac39596)
@boring-cyborg boring-cyborg Bot added the area:Scheduler including HA (high availability) scheduler label May 18, 2026
@vatsrahul1001 vatsrahul1001 added this to the Airflow 3.2.2 milestone May 18, 2026
@vatsrahul1001 vatsrahul1001 added the type:bug-fix Changelog: Bug Fixes label May 18, 2026
@vatsrahul1001 vatsrahul1001 merged commit 06cdb37 into v3-2-test May 18, 2026
77 checks passed
@vatsrahul1001 vatsrahul1001 deleted the backport-322-66431 branch May 18, 2026 09:34
vatsrahul1001 added a commit that referenced this pull request May 20, 2026
…66431) (#67089)

* fix(scheduler): ignore stale executor success after defer reschedule

When a trigger moves a deferred task back to scheduled before the
scheduler processes the executor success from the worker defer exit,
treat it as benign (same try_number, next_method set) instead of
state mismatch failure.

Closes #66374



* Remove newsfragment for bugfix (per review)

---------


(cherry picked from commit ac39596)

Co-authored-by: /-\ - Pedro Henrique Klein <pedrohenriquekleinphg@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
vatsrahul1001 added a commit that referenced this pull request May 20, 2026
vatsrahul1001 added a commit that referenced this pull request May 21, 2026
Labels

area:Scheduler including HA (high availability) scheduler
type:bug-fix Changelog: Bug Fixes

2 participants