Skip to content

Rust SDK: add Client::create_cloud_session#1394

Open
tclem wants to merge 6 commits into
mainfrom
tclem/verbose-meme
Open

Rust SDK: add Client::create_cloud_session#1394
tclem wants to merge 6 commits into
mainfrom
tclem/verbose-meme

Conversation

@tclem
Copy link
Copy Markdown
Member

@tclem tclem commented May 23, 2026

Adds Client::create_cloud_session to the Rust SDK so callers can create Mission Control–backed (cloud) sessions through the SDK instead of working around the gap downstream.

Ports the SDK-side changes from github/github-app#5592 into the canonical SDK so the app can drop its vendored CloudBootstrapWire hack on the next refresh.

Why a separate entry point

Cloud sessions break three load-bearing assumptions of create_session:

  • The runtime owns the session id — caller-provided sessionId and provider are rejected by the runtime.
  • The runtime returns its own Mission Control id in the session.create response.
  • The runtime may emit session.event notifications and inbound JSON-RPC requests keyed by that id before the response arrives.

create_session pre-assigns a UUID, registers per-session routing under it, and asserts the response echoes it back — none of which is compatible with the cloud contract.

API

  • New Client::create_cloud_session(SessionConfig) -> Result<Session>. Build the config with SessionConfig::with_cloud(...); the SDK validates that session_id and provider are unset.
  • Client::create_session now returns Error::InvalidConfig when config.cloud.is_some(), pointing callers at create_cloud_session.
  • SessionCreateWire.session_id becomes Option<SessionId> (skip_serializing_if) so the cloud request omits it. New SessionConfig::into_cloud_wire() shares a private into_create_wire(Option<SessionId>) with the existing path.

Routing the pre-registration window

SessionRouter gains a refcounted PendingSessionRouting guard and a bounded (128) drop-oldest pending buffer keyed by session id. The guard is acquired before the session.create RPC and released after register_session installs channels for the runtime-assigned id, so notifications and inbound JSON-RPC requests that arrive in that window are replayed in arrival order. On the last guard drop the buffer is cleared.

On decode failure of the session.create response, the SDK extracts sessionId from the raw result and issues a targeted session.destroy; if no id can be recovered it logs a warning so the leak is visible.

Refactor

Factor the shared handler/SessionFs extraction out of create_session and resume_session into prepare_session_runtime, now reused by all three entry points.

Tests

  • 5 router unit tests: off-mode drop, on-mode buffer + ordered flush, drop-oldest at limit, last-guard-drop clears, unregister clears pending.
  • 6 integration tests against the existing fake JSON-RPC server: cloud-create wire shape, create_session cloud rejection, caller session_id/provider rejection, handler-driven request flags, early-notification buffering, early-request buffering.

Known follow-up

router.rs still drop-oldest's buffered JSON-RPC requests at the 128 limit. Requests need responses, so a drop leaves the runtime waiting; matches upstream PR #5592. Tracked inline for a future change (likely a larger limit plus an explicit JSON-RPC error response on overflow).

Follow-ups in other SDKs

TypeScript/Python/Go/.NET/Java SDKs need the same create_cloud_session API and the same pre-registration buffering; TS in particular is broken against the current copilot-agent-runtime contract (it sends a caller sessionId which the runtime rejects). Out of scope here — Rust ships first to unblock github/github-app#5592.

  Generated via Copilot (Claude Opus 4.7) on behalf of @tclem

Copilot AI review requested due to automatic review settings May 23, 2026 01:39
@tclem tclem requested a review from a team as a code owner May 23, 2026 01:39
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Rust SDK support for creating Mission Control–backed cloud sessions where the runtime assigns the session ID, including pre-registration buffering for early session traffic.

Changes:

  • Adds Client::create_cloud_session and rejects cloud configs through create_session.
  • Updates session create wire/config conversion to allow omitting sessionId.
  • Adds pending-routing buffers in the router plus Rust README and test coverage for cloud creation behavior.
Show a summary per file
File Description
rust/src/session.rs Adds cloud session creation flow and factors shared runtime preparation.
rust/src/router.rs Adds pending routing guard and buffers for early notifications/requests.
rust/src/types.rs Adds cloud-specific wire conversion path.
rust/src/wire.rs Makes sessionId optional for session.create serialization.
rust/src/lib.rs Exposes router start and pending-routing helpers internally.
rust/tests/session_test.rs Adds integration tests for cloud session creation and early routing.
rust/README.md Documents cloud session usage in the Rust SDK.

Copilot's findings

  • Files reviewed: 7/7 changed files
  • Comments generated: 1
Comment thread rust/src/router.rs Outdated
Cloud (Mission Control) sessions don't fit the create_session contract:
the runtime owns the session id, forbids caller-provided sessionId and
provider, and may emit session.event notifications or inbound JSON-RPC
requests before the session.create response.

Add a dedicated Client::create_cloud_session entry point that:

- Omits sessionId from the session.create wire (new
  SessionConfig::into_cloud_wire via shared into_create_wire(Option)).
- Rejects caller-provided session_id and provider with InvalidConfig.
- Buffers early session-keyed notifications and inbound requests through
  a refcounted PendingSessionRouting guard with a bounded (128)
  drop-oldest queue, replayed when register_session installs channels
  for the runtime-assigned id.
- Holds the guard across a best-effort session.destroy on decode
  failure, then drops it.

Also:

- Client::create_session now rejects config.cloud with InvalidConfig,
  pointing callers at create_cloud_session.
- Factor the shared handler/sessionFs extraction into
  prepare_session_runtime, reused by create / create_cloud / resume.
- Five router unit tests cover off-mode drop, on-mode buffer + ordered
  flush, drop-oldest at limit, last-guard-drop clears, and
  unregister-clears-pending.
- Six integration tests cover the cloud-create wire, both rejection
  paths, handler-driven request flags, and early-notification +
  early-request buffering, against the existing fake JSON-RPC server.

Ports github/github-app#5592 into the canonical SDK so the app can drop
its vendored CloudBootstrapWire hack on the next refresh.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@tclem tclem force-pushed the tclem/verbose-meme branch from 0989fce to bafec89 Compare May 23, 2026 01:53
tclem and others added 2 commits May 22, 2026 18:59
Tokio tasks are cheap; the lazy ensure_started + idempotency guard
existed only because cloud session creation needed the router running
before any session was registered. Starting both routing tasks once
during Client::from_streams collapses the lazy/eager split and lets
register_session and begin_pending_session_routing become plain
delegations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously the pending router buffered notifications and inbound JSON-RPC
requests in two separate VecDeques and on register flushed all
notifications, then all requests. That meant a request arriving between
two notifications would be delivered to the consumer after both
notifications instead of in arrival order, batching cross-type messages
in a way the docs claimed not to. Some downstream behaviors care: an
'approval requested' session event observed before its matching
'tool.call' request keeps the consumer's permission/elicitation flow
coherent.

Collapse the two queues into a single FIFO of PendingItem (a notif/req
enum), apply the drop-oldest limit globally, and flush in arrival order.
Adds a regression test that interleaves a request between two
notifications and verifies cross-type ordering survives the flush.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tclem and others added 2 commits May 22, 2026 19:10
…ests

When the pending session buffer overflows during cloud session creation,
we still have to evict the oldest entry to make room. For notifications
that's fine; for inbound JSON-RPC requests it left the runtime hanging
on a reply that would never come (until the runtime's own timeout).

Add a sync WriterHandle on JsonRpcClient that lets the router enqueue
fire-and-forget frames from inside its parking_lot::Mutex. On request
eviction, send a JSON-RPC error response (-32603 INTERNAL_ERROR,
message 'request dropped: pending session buffer overflow before
session.create response') for the evicted request's id before discarding
it. Notifications continue to drop with a warn-level log.

Test reads bytes back off a duplex stream and asserts the error
response is well-formed and carries the expected id and code.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI failed before tests because nightly rustfmt reordered imports and wrapped a router test call differently than the committed source. Running the pinned formatter produced the expected import grouping and line wrapping without changing behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…out register

GPT-5.5 self-review surfaced two issues in the pending-session router:

1. When the last PendingSessionRouting guard dropped without anyone
   calling register_session (e.g. session.create failed mid-RPC), any
   buffered inbound JSON-RPC requests were silently discarded. The
   runtime is waiting on those ids and would hang until its own timeout.
   This was a symmetric gap to the overflow path fixed in 491b442.

   Fix: Drop impl now drains pending entries and emits an INTERNAL_ERROR
   response ("pending session routing ended before session was
   registered") for every PendingItem::Request before clearing.
   Notifications still just warn-log on the way out. WriterHandle is
   now Clone so the snapshot can be taken under lock and the lock
   released before per-item work.

2. The 'cross-type arrival order' claim from c802943 overstated what
   the unified FIFO actually guarantees. After register, items still
   flow through two separate per-session mpsc channels consumed via
   select! in session.rs, so consumer-observed cross-type order is
   implementation-defined. Updated push_pending docs to describe the
   actual guarantee (FIFO eviction + interleaved flush) and renamed
   the test from pending_buffer_preserves_cross_type_arrival_order to
   pending_buffer_flush_interleaves_types_in_arrival_order. Unifying
   per-session channels into one stream is tracked as a follow-up.

New router test: last_guard_drop_without_register_responds_to_buffered_requests.
8 router tests + 101 session tests all pass; clippy clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tclem added a commit that referenced this pull request May 23, 2026
Carries forward the Rust SDK PR #1394 follow-up review feedback into the
TS port:

1. Cap the per-session inbound-request parked-waiter list at 128. When
   exceeded, reject the oldest waiter with 'pending session buffer
   overflow'. vscode-jsonrpc translates the rejection into a JSON-RPC
   error response (-32603) back to the runtime so the request id isn't
   left hanging — silently dropping it would leave the runtime waiting
   on the response until its own timeout. Mirrors Rust commit 491b442.

2. Use a distinct message ('pending session routing ended before
   session was registered') when the pending guard drops without
   registration. Lets debugging tell the overflow path from the
   create-failed path. Mirrors Rust commit e0ff254.

Adds two tests:
- overflow path resolves to overflow error for the oldest waiter,
  remaining 128 resolve normally after registration
- guard-drop path rejects parked waiters with the distinct message
tclem added a commit that referenced this pull request May 23, 2026
Carries the Rust SDK PR #1394 follow-up review fixes into the Go port:

1. Cap the per-session parked-waiter list at 128. When exceeded, reject
   the oldest waiter with errPendingSessionBufferOverflow ('pending
   session buffer overflow'). The handler returns a *jsonrpc2.Error with
   code -32603 via the new pendingRoutingRPCError helper, so the runtime
   receives a proper error response instead of hanging on the request id
   until its own timeout. Mirrors Rust commit 491b442 and TS commit
   c167bc3.

2. When the last pending-routing guard drops without RegisterSession
   (e.g. session.create failed mid-RPC), signal all parked waiters with
   errPendingSessionRoutingEnded ('pending session routing ended before
   session was registered'). Distinct phrasing from the overflow path so
   debugging can tell the two cases apart. Mirrors Rust commit e0ff254
   and TS commit c167bc3.

Adds pendingRoutingRPCError helper that routes sentinel errors to
-32603 while unknown-session errors keep -32602.

Adds two tests:
- TestPendingRouting_OverflowEmitsError: 129 parked waiters, oldest
  gets -32603 overflow error, remaining 128 resolve normally after
  registration.
- TestPendingRouting_GuardDropDistinctMessage: parks a request, drops
  the guard without registration, verifies exact routing-ended message
  and -32603 code.

Updates TestPendingRouting_RejectsWaitersOnDispose to assert the new
exact message and code instead of the old 'dropped' substring check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tclem added a commit that referenced this pull request May 23, 2026
Carries forward the Rust SDK PR #1394 follow-up review feedback into the
.NET port:

1. Cap the per-session inbound-request parked-waiter list at 128. When
   exceeded, reject the oldest waiter with 'pending session buffer
   overflow'. The custom JsonRpc dispatcher translates the thrown
   exception into a JSON-RPC error response (-32603) back to the runtime
   so the request id isn't left hanging — silently dropping it would
   leave the runtime waiting on the response until its own timeout.
   Mirrors Rust commit 491b442 and TS commit c167bc3.

2. Use a distinct message ('pending session routing ended before
   session was registered') when the pending guard drops without
   registration. Lets debugging tell the overflow path from the
   create-failed path. Also fixes the pre-existing bug where the
   waiter-fault loop ran inside _pendingLock despite the comment saying
   otherwise — moved it outside the lock so TCS continuations can't
   deadlock against the lock. Mirrors Rust commit e0ff254.

Adds two tests:
- overflow path: 129 early inbound requests → oldest gets error with
  'pending session buffer overflow', remaining 128 resolve after
  registration
- guard-drop path: session.create fails with parked request → error
  response with 'pending session routing ended before session was
  registered'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tclem added a commit that referenced this pull request May 23, 2026
Carries forward the Rust SDK PR #1394 review feedback into the Java port:

1. Cap the per-session inbound-request parked-waiter list at 128
   (BUFFER_LIMIT). When exceeded, evict the oldest waiter via
   completeExceptionally("pending session buffer overflow"). The
   RpcHandlerDispatcher thread blocked in waiter.get() wakes up, catches
   ExecutionException, and resolveSessionForRequest calls
   rpc.sendErrorResponse(-32603, ...) so the runtime isn't left waiting
   on a request id that will never get a reply. Mirrors Rust commit
   491b442 and TS commit c167bc3 on PR #1395.

2. decrementGuard now completes all stale waiters internally with a
   distinct message ("pending session routing ended before session was
   registered") instead of returning them for callers to complete with
   ad-hoc strings. A single canonical message lets debugging tell the
   overflow-eviction path from the create-failed path. Mirrors Rust
   commit e0ff254 and TS commit c167bc3.

3. Fix isDone() fast path in resolveSessionForRequest: the existing
   catch-all "fall through" swallowed ExecutionException from an
   overflow-evicted waiter, sending a generic -32602 "Unknown session"
   error instead of -32603. Now explicitly catches ExecutionException in
   the isDone() branch and sends the same -32603 error as the blocking
   path.

Adds two new unit tests in CloudSessionTest:
- parkedRequestWaiterBuffer_overflow_evictsOldestWithOverflowMessage:
  parks 129 waiters, verifies oldest completes with "pending session
  buffer overflow", remaining 128 resolve normally after registration.
- parkedRequestWaiter_guardDropMessage_isDistinctFromOverflowMessage:
  parks a request via the full handler path, drops the guard without
  registration, verifies the JSON-RPC error response contains "routing
  ended before session was registered" but not "buffer overflow".

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tclem added a commit that referenced this pull request May 23, 2026
Ports the cloud-session contract from the Rust SDK (PR #1394) to
the TypeScript SDK.

- createSession({ cloud }) now throws — the runtime rejects this
  shape, and the existing path silently sent cloud + sessionId
  which RemoteSessionManager.cloud() rejects with a generic error.
- New createCloudSession(config) omits sessionId from the wire
  payload, lets the runtime assign the Mission Control session id,
  and registers the resulting session under that id.
- Pending-routing buffer (bounded, drop-oldest) holds session.event
  notifications and inbound JSON-RPC request handlers
  (userInput / exitPlanMode / autoModeSwitch / hooks /
  systemMessage) that arrive while session.create is in flight,
  replaying them after the runtime-assigned id is registered.
- Rejection guards for caller-provided sessionId, caller-provided
  provider, and missing cloud config.
- Post-response setup is wrapped in try/catch so a setupSessionFs
  failure rolls back the partial registration and disposes the
  pending-routing guard.

Inbound sessionFs.* requests still use the generated synchronous
handler shim and are not pending-buffered. This is unlikely in
practice (runtime does not initiate sessionFs before the create
response) and is documented as a known limitation; the fix is a
codegen update to make registerClientSessionApiHandlers accept an
async session resolver.

Tests cover wire-shape parity with Rust, rejection guards, early
notification buffering, and early inbound-request parking.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tclem added a commit that referenced this pull request May 23, 2026
Carries forward the Rust SDK PR #1394 follow-up review feedback into the
TS port:

1. Cap the per-session inbound-request parked-waiter list at 128. When
   exceeded, reject the oldest waiter with 'pending session buffer
   overflow'. vscode-jsonrpc translates the rejection into a JSON-RPC
   error response (-32603) back to the runtime so the request id isn't
   left hanging — silently dropping it would leave the runtime waiting
   on the response until its own timeout. Mirrors Rust commit 491b442.

2. Use a distinct message ('pending session routing ended before
   session was registered') when the pending guard drops without
   registration. Lets debugging tell the overflow path from the
   create-failed path. Mirrors Rust commit e0ff254.

Adds two tests:
- overflow path resolves to overflow error for the oldest waiter,
  remaining 128 resolve normally after registration
- guard-drop path rejects parked waiters with the distinct message
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants