
[v3-2-test] Add cursor based pagination for get_task_instances endpoint (#64845) #65405

Merged
pierrejeambrun merged 1 commit into apache:v3-2-test from astronomer:backport-64845
Apr 20, 2026

Conversation

@pierrejeambrun
Member

  • Add cursor-based pagination to get_task_instances endpoint
      • Add cursor-based (keyset) pagination as an alternative to offset-based pagination on the get_task_instances endpoint. Offset pagination remains the default and is not deprecated globally.
      • Response uses a discriminated union: offset responses include total_entries, cursor responses include next_cursor and previous_cursor.
      • Refactor SortParam to lazily cache column resolution instead of mutating state in to_orm.
      • Move cursor helpers (encode/decode/apply) to dedicated common/db/cursors.py module.
      • Cleanly separate cursor vs offset code paths in the endpoint handler.
  • Simplify cursor token and support first page without sentinel
      • Remove order_by from cursor token (now just a list of values)
      • Support empty string cursor for first page (no fake sentinel needed)
      • Drop order_by consistency check between cursor and query param
  • Small adjustments

  • Adjustments

  • Narrow endpoint return types and encode cursor value types

Encode type information directly into cursor tokens as {"type": ..., "value": ...} objects, removing the fragile column-based type guessing during deserialization.
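
As a rough illustration of the typed-token idea, the sketch below wraps each sort value in a {"type": ..., "value": ...} object so that decoding never has to guess a type from the column. The function names (`encode_cursor`, `decode_cursor`), the tag names, and the base64-wrapped JSON envelope are assumptions made for this example, not the code from this PR.

```python
import base64
import json
import uuid
from datetime import datetime, timezone


def _pack(value):
    """Wrap a value with an explicit type tag (tag names are hypothetical)."""
    if value is None:
        return {"type": "none", "value": None}
    if isinstance(value, datetime):
        return {"type": "datetime", "value": value.isoformat()}
    if isinstance(value, uuid.UUID):
        return {"type": "uuid", "value": str(value)}
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "bool", "value": value}
    if isinstance(value, int):
        return {"type": "int", "value": value}
    if isinstance(value, str):
        return {"type": "str", "value": value}
    raise TypeError(f"unsupported cursor value: {value!r}")


def _unpack(item):
    """Rebuild the Python value using only the tag stored in the token."""
    kind, raw = item["type"], item["value"]
    if kind == "datetime":
        return datetime.fromisoformat(raw)
    if kind == "uuid":
        return uuid.UUID(raw)
    if kind in ("none", "bool", "int", "str"):
        return raw
    raise ValueError(f"unknown cursor value type: {kind!r}")


def encode_cursor(values):
    """Serialize a list of sort values into an opaque URL-safe token."""
    payload = json.dumps([_pack(v) for v in values]).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")


def decode_cursor(token):
    """Inverse of encode_cursor."""
    payload = base64.urlsafe_b64decode(token.encode("ascii"))
    return [_unpack(item) for item in json.loads(payload)]
```

A round trip such as `decode_cursor(encode_cursor([some_datetime, "task_a", 3]))` returns values of the original types, which is the property the type tags are meant to guarantee.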

Narrow return types for endpoints that only return offset pagination (patch, clear, batch, mapped) so the OpenAPI spec and generated UI client reflect the correct types. Only get_task_instances retains the discriminated union response.

Update UI components to use the narrowed types from the spec.

  • Use msgpack for cursor tokens and nested keyset predicate

Switch cursor encoding from typed JSON to msgpack for compactness. Replace flat OR-of-prefix-equalities with nested and/or keyset predicate for better composite index range scans. Always use ascending PK as the final tie-breaker for stable pagination.
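
To make the difference between the two predicate shapes concrete, here is a small sketch that renders both as SQL text with named bind parameters. It is only an illustration: the helper names, the default primary key name `id`, and the exact nesting are assumptions, and the PR itself builds the predicate with ORM expressions rather than strings.

```python
def with_pk_tiebreaker(cols, ascending, pk="id"):
    """Append the primary key (ascending) as the last sort column if absent."""
    if pk in cols:
        return list(cols), list(ascending)
    return [*cols, pk], [*ascending, True]


def flat_predicate(cols, ascending):
    """Flat form: (a > :a) OR (a = :a AND b > :b) OR ..."""
    terms = []
    for i, col in enumerate(cols):
        op = ">" if ascending[i] else "<"
        # Every earlier column is pinned by equality, then this one is strict.
        prefix = [f"{c} = :{c}" for c in cols[:i]]
        terms.append("(" + " AND ".join(prefix + [f"{col} {op} :{col}"]) + ")")
    return " OR ".join(terms)


def nested_predicate(cols, ascending):
    """Nested form: a > :a OR (a = :a AND (b > :b OR (b = :b AND (...))))."""
    col, rest = cols[0], cols[1:]
    op = ">" if ascending[0] else "<"
    strict = f"{col} {op} :{col}"
    if not rest:
        return strict
    inner = nested_predicate(rest, ascending[1:])
    return f"{strict} OR ({col} = :{col} AND ({inner}))"
```

Both forms select the same rows. The nested form states each equality once instead of repeating the whole prefix in every OR branch.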

  • Fetch limit+1 rows to accurately detect last page, returning next_cursor=null when no more results exist
  • Return previous_cursor=null on the first page (when no cursor was provided)
  • Use LimitFilter in apply_filters_to_select for the +1 limit instead of a manual .limit() call
  • Raise HTTP 400 on invalid UUID in cursor token instead of silently passing the invalid value
  • Update endpoint docs and add boundary-condition test
  • Fix backward cursor based pagination
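
The limit+1 technique in the list above can be sketched over an in-memory list of sorted integer ids. This is a simplified, forward-only illustration: `fetch_page` is a hypothetical name, and the cursors here are plain ids rather than the opaque tokens the endpoint returns.

```python
def fetch_page(ids, limit, cursor=None):
    """Return one forward page from a sorted list of integer ids."""
    # Stand-in for: SELECT ... WHERE id > :cursor ORDER BY id LIMIT :limit + 1
    candidates = [i for i in ids if cursor is None or i > cursor]
    fetched = candidates[: limit + 1]

    # The extra row is only a probe: if it exists, another page follows.
    has_more = len(fetched) > limit
    page = fetched[:limit]

    return {
        "items": page,
        # No further results: report null instead of a cursor that
        # would lead to an empty page.
        "next_cursor": page[-1] if has_more and page else None,
        # No cursor was provided, so this is the first page.
        "previous_cursor": page[0] if cursor is not None and page else None,
    }
```

With four ids and `limit=2`, the second page is exactly full, yet `next_cursor` is still `None` because the probe row is missing. That is the boundary condition a plain `LIMIT limit` query cannot detect.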

(cherry picked from commit e11c603)


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Member

@jason810496 jason810496 left a comment

Thanks. Technically approved; I didn't thoroughly check the change side by side with the original PR.

…4845)

* Add cursor-based pagination to get_task_instances endpoint

- Add cursor-based (keyset) pagination as an alternative to offset-based
  pagination on the get_task_instances endpoint. Offset pagination remains
  the default and is not deprecated globally.
- Response uses a discriminated union: offset responses include
  total_entries, cursor responses include next_cursor and previous_cursor.
- Refactor SortParam to lazily cache column resolution instead of
  mutating state in to_orm.
- Move cursor helpers (encode/decode/apply) to dedicated
  common/db/cursors.py module.
- Cleanly separate cursor vs offset code paths in the endpoint handler.
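
A minimal sketch of the lazy-caching idea from the SortParam bullet above, using `functools.cached_property`. The class `SortParamSketch`, its constructor arguments, and the string-based ORDER BY output are hypothetical simplifications, not the real SortParam. The `resolutions` counter exists only to show that resolution runs once.

```python
from functools import cached_property


class SortParamSketch:
    """Hypothetical, simplified stand-in for a sort parameter object."""

    def __init__(self, order_by, allowed):
        self.order_by = order_by  # e.g. ["-start_date", "id"]
        self.allowed = allowed    # maps API field name -> column name
        self.resolutions = 0      # demo counter, not part of the technique

    @cached_property
    def resolved(self):
        """Resolve each term to (column, ascending); computed on first access."""
        self.resolutions += 1
        result = []
        for term in self.order_by:
            descending = term.startswith("-")
            name = term[1:] if descending else term
            if name not in self.allowed:
                raise ValueError(f"cannot sort by {name!r}")
            result.append((self.allowed[name], not descending))
        return result

    def to_orm(self):
        """Build ORDER BY text from the cached resolution without changing it."""
        return ", ".join(
            f"{col} {'ASC' if asc else 'DESC'}" for col, asc in self.resolved
        )
```

Because `to_orm` only reads `resolved`, other consumers such as a cursor builder can use the same resolved columns without depending on `to_orm` having been called first.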

* Simplify cursor token and support first page without sentinel

- Remove order_by from cursor token (now just a list of values)
- Support empty string cursor for first page (no fake sentinel needed)
- Drop order_by consistency check between cursor and query param

* Small adjustments

* Adjustments

* Narrow endpoint return types and encode cursor value types

Encode type information directly into cursor tokens as
{"type": ..., "value": ...} objects, removing the fragile
column-based type guessing during deserialization.

Narrow return types for endpoints that only return offset
pagination (patch, clear, batch, mapped) so the OpenAPI spec
and generated UI client reflect the correct types. Only
get_task_instances retains the discriminated union response.

Update UI components to use the narrowed types from the spec.

* Use msgpack for cursor tokens and nested keyset predicate

Switch cursor encoding from typed JSON to msgpack for compactness.
Replace flat OR-of-prefix-equalities with nested and/or keyset
predicate for better composite index range scans. Always use
ascending PK as the final tie-breaker for stable pagination.

* Flatten TaskInstanceCollectionResponse to avoid oneOf codegen issues

  Replace the discriminated union (offset | cursor response types) with
  a single flat model using optional fields. OpenAPI oneOf + discriminator
  is not handled correctly by hey-api/openapi-ts (apache#1613, apache#3270): return
  types degrade to unknown in generated TypeScript code.
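
As an illustration of the flat shape described above, the sketch below uses a single dataclass whose pagination fields are all optional. The field names total_entries, next_cursor and previous_cursor come from this commit message. The class name, the `task_instances` field, the two helper functions, and the use of dataclasses are assumptions made for the example.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TaskInstanceCollectionSketch:
    """Hypothetical flat response: one model, pagination fields optional."""

    task_instances: list
    total_entries: Optional[int] = None    # populated for offset pagination
    next_cursor: Optional[str] = None      # populated for cursor pagination
    previous_cursor: Optional[str] = None  # populated for cursor pagination


def offset_response(items, total):
    """Build the response for an offset-paginated request."""
    return TaskInstanceCollectionSketch(task_instances=items, total_entries=total)


def cursor_response(items, next_cursor, previous_cursor):
    """Build the response for a cursor-paginated request."""
    return TaskInstanceCollectionSketch(
        task_instances=items,
        next_cursor=next_cursor,
        previous_cursor=previous_cursor,
    )
```

The trade-off is that the schema no longer says which fields appear together, so clients check for `None` instead of switching on a discriminator.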

* Fix UI

* Fix CI

* Fix cursor pagination boundary detection and error handling

- Fetch limit+1 rows to accurately detect last page, returning
  next_cursor=null when no more results exist
- Return previous_cursor=null on the first page (when no cursor
  was provided)
- Use LimitFilter in apply_filters_to_select for the +1 limit
  instead of a manual .limit() call
- Raise HTTP 400 on invalid UUID in cursor token instead of
  silently passing the invalid value
- Update endpoint docs and add boundary-condition test
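
The invalid-UUID handling from the list above can be sketched as follows. `BadRequest` is a hypothetical stand-in for whatever HTTP 400 exception the web framework provides, and `parse_cursor_uuid` is an illustrative name rather than a function from this change.

```python
import uuid


class BadRequest(Exception):
    """Hypothetical stand-in for the framework's HTTP 400 error."""

    status_code = 400


def parse_cursor_uuid(raw):
    """Convert a decoded cursor value to a UUID, rejecting malformed input."""
    try:
        return uuid.UUID(raw)
    except (ValueError, AttributeError, TypeError) as exc:
        # Fail loudly rather than passing the bad value on to the query.
        raise BadRequest(f"Invalid cursor token: {raw!r} is not a valid UUID") from exc
```

Rejecting the token at decode time keeps a malformed value from reaching the database layer, where it might otherwise surface as a server error or an unexpectedly empty page.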

* Fix backward cursor based pagination

(cherry picked from commit e11c603)
@pierrejeambrun
Member Author

Unrelated static check failure, merging.

@pierrejeambrun pierrejeambrun merged commit 0eae578 into apache:v3-2-test Apr 20, 2026
107 of 108 checks passed
@pierrejeambrun pierrejeambrun deleted the backport-64845 branch April 20, 2026 13:57
vatsrahul1001 pushed a commit that referenced this pull request Apr 23, 2026
vatsrahul1001 pushed a commit that referenced this pull request Apr 27, 2026
@vatsrahul1001 vatsrahul1001 added this to the Airflow 3.2.2 milestone May 19, 2026
@vatsrahul1001 vatsrahul1001 added the type:misc/internal Changelog: Misc changes that should appear in change log label May 19, 2026
vatsrahul1001 pushed a commit that referenced this pull request May 20, 2026

Labels

area:airflow-ctl
area:API (Airflow's REST/HTTP API)
area:UI (Related to UI/UX. For Frontend Developers.)
type:misc/internal (Changelog: Misc changes that should appear in change log)

3 participants