v2.5.1 shipping · 46 cli adapters · apache 2.0 · on-prem · 125,000+ installs

several coding agents.
one git tree. only what passes.

bernstein orchestrates claude code, codex, gemini cli, aider, and 42 more cli coding agents - in parallel, in isolated git worktrees, with lint, types, and tests gating every merge. python scheduler. no llm in the loop.

every routing decision signed. every file write logged. the audit chain is hash-linked and replayable end to end.

how it works

How does Bernstein work?

Bernstein is an open-source orchestrator for CLI coding agents. It decomposes a goal into tasks, spawns Claude Code, Codex, Gemini CLI and 43 other agents into isolated git worktrees, runs each task in parallel, then verifies the output through lint, type checks, tests, and an optional cross-model review before merging. The scheduler is plain Python - deterministic, replayable from an HMAC-chained audit log, no LLM tokens spent on coordination.
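the isolation step can be pictured with plain git. a minimal sketch, assuming one worktree and one branch per task; the `.worktrees/` directory, the `agent/<task>` branch naming, and both function names are hypothetical and not taken from bernstein's source. only `git worktree add` itself is a real git command.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, task_id: str) -> list[str]:
    """Build the git command that creates an isolated worktree for one task."""
    # hypothetical layout: one directory and one branch per task
    path = repo / ".worktrees" / task_id
    branch = f"agent/{task_id}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


def spawn_worktree(repo: Path, task_id: str) -> Path:
    """Create the worktree on disk; raises CalledProcessError if git refuses."""
    cmd = worktree_command(repo, task_id)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])
```

each agent then edits files only inside its own directory, so parallel tasks cannot overwrite each other's working copy before the merge step.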

one run, four stages.

decompose

manager → tasks · roles · signals

spawn

agents → isolated worktrees

verify

janitor → tests · types · lint

merge

only verified diffs land
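the verify → merge gate, sketched under assumptions: every gate is a command, and a diff is mergeable only if all of them exit 0. `Gate` and `run_gates` are hypothetical names, and the demo gates are stand-ins so the sketch runs without any tooling installed; a real setup would presumably invoke a linter, a type checker, and a test runner instead.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    name: str
    argv: list[str]


def run_gates(gates: list[Gate], cwd: str = ".") -> tuple[bool, list[str]]:
    """Run every gate; return (all_passed, names_of_failed_gates)."""
    failed = []
    for gate in gates:
        result = subprocess.run(gate.argv, cwd=cwd, capture_output=True)
        if result.returncode != 0:
            failed.append(gate.name)
    return (not failed, failed)


# stand-in gates: the first exits 0, the second exits 1
demo = [
    Gate("lint", [sys.executable, "-c", "raise SystemExit(0)"]),
    Gate("tests", [sys.executable, "-c", "raise SystemExit(1)"]),
]
ok, failed = run_gates(demo)
print(ok, failed)  # False ['tests']
```

running all gates rather than stopping at the first failure is a design choice in this sketch: the failed list gives a retry or reroute step the full picture in one pass.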

from the blog

three reads worth your time.

field notes from the orchestra pit. hand-picked: where bernstein sits in the multi-agent coding category, what it looks like in the cloud, and how it started.

May 20, 2026 · 18 min read

bernstein 2.x recap: lineage, ten trackers, A2A capability cards, and a CI that started fixing itself

Thirteen releases since the 1.10 recap consolidated into nine themes: a per-artefact transparency log with Ed25519 signatures, ten tracker adapters from Jira to Plane, A2A capability cards, MCP client and server hardening, a Playwright sandbox for UI agents, a secrets broker, supply-chain coverage with SBOM and OSSF Scorecard, calibrated cost guards, and a web UI plus PWA in the wheel.

multi-agent orchestration · release · lineage · audit log

Apr 14, 2026 · 3 min read

agents on cloudflare: workers, durable objects, r2, d1

bernstein 1.8.4 cloudflare backend for ai coding agents: workers run agents, durable workflows handle multi-step tasks, r2 + d1 hold state.

Cloudflare Workers · cloud AI agents · serverless orchestration · multi-agent orchestration

Mar 29, 2026 · 2 min read

bernstein 1.0: open-source orchestrator for ai coding agents

orchestrate cli coding agents (claude code, codex, gemini cli) in parallel. deterministic scheduler, file-based state, one git worktree per agent.

multi-agent orchestration · AI coding agents · Claude Code · open-source

evidence, not vibes

every step signed, in order, on disk.

bernstein writes an hmac-signed event chain to .bernstein/audit.log. each entry references the previous hash. tampering breaks verification. nothing leaves your machine.

this is the artifact security review actually wants. not a screenshot, not a SOC2 PDF — a hash chain you can replay.
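what "each entry references the previous hash" means in practice, as a minimal sketch. it assumes hmac-sha256 over a json payload and an all-zero genesis value; the entry layout, the function names, and the demo key are illustrative assumptions, not bernstein's on-disk format.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's "prev"


def _digest(key: bytes, prev: str, event: dict) -> str:
    """HMAC over the previous hash plus the event, so entries are order-bound."""
    payload = json.dumps({"prev": prev, "event": event}, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append(chain: list[dict], key: bytes, event: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "event": event, "hash": _digest(key, prev, event)})


def verify(chain: list[dict], key: bytes) -> bool:
    """Recompute every hash in order; any edit, reorder, or wrong key fails."""
    prev = GENESIS
    for entry in chain:
        expected = _digest(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["hash"], expected):
            return False
        prev = entry["hash"]
    return True


key = b"demo-key"  # placeholder key for the example only
log: list[dict] = []
append(log, key, {"kind": "manager.plan", "tasks": 13})
append(log, key, {"kind": "scheduler.spawn", "task": "t-001"})
print(verify(log, key))  # True

log[0]["event"]["tasks"] = 14  # tamper with an early entry
print(verify(log, key))  # False
```

because each digest covers the previous hash, changing one entry invalidates it and everything after it; replaying the chain from the first entry is the whole verification.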

~/proj $ bernstein audit verify

# reading .bernstein/audit.log · 412 entries · 18m04s span

17:42:01  manager.plan     cb84a1…  13 tasks · est $4.20
17:42:18  scheduler.spawn  3e09f7…  t-001 → backend (sonnet)
17:42:18  scheduler.spawn  81c2dd…  t-002 → backend (sonnet)
17:42:19  scheduler.spawn  b50af8…  t-003 → qa      (opus)
17:55:11  janitor.verify   d014a3…  t-005 docs · pytest 84/84 · ruff 0
17:55:14  manager.merge    ee7c12…  t-005 → main · 4 files · clean
18:02:14  janitor.fail     9aa10b…  t-003 qa · pytest collect err · 3/3 retries
18:02:30  scheduler.route  71b4cc…  t-003 opus → sonnet · awaiting op
18:09:14  scheduler.pause  4d2e09…  conflict · t-001 ↔ t-007 · src/auth/refresh.py

verify chain      ok ✓  412/412 entries · last hash 4d2e09cb…
signed by         ~/.bernstein/keys/audit.ed25519

how it compares

same shape. different column.

bernstein is not winning the star race in this category - claude-flow, vibe-kanban, and archon are. it owns the column the regulated buyer needs: hmac-chained audit, signed agent cards, per-artefact lineage, air-gap deploy. star counts captured 2026-05-21; capability snapshots from each project's own readme on the same date.

	bernstein	claude-flow	archon	vibe-kanban	claude-squad	composio ao
stars (2026-05-21)	433	54k	22k	26k	7.6k	7.2k
cli adapters	46	~5	~10	~6	~5	3
deterministic scheduler	yes	no (swarm)	partial	no	yes	no
hmac-chained audit log	yes	no	no	no	no	no
signed agent cards	yes	no	no	no	no	no
per-artefact lineage	yes	no	no	no	no	no
air-gap profile	yes	no	partial	no	no	no
mcp server mode	yes	yes	yes	no	no	no
python library shape	yes	no	no	no	no	no

frequently asked

what people actually ask first.

is the scheduler an llm?

the core loop is plain python. who runs, who's blocked, and what merges is deterministic and replayable. the agents are llms, model selection is llm-assisted (capability router + recommender), and best-of-n picks a winner via an llm judge. all of those are opt-in and pluggable, so if you'd rather have the planner decide via llm too, you wire your own through the routing layer. just don't put it in the scheduler tick.
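a deterministic tick, sketched as a pure function: given the same task graph and the same state, it always returns the same tasks to start. `Task`, `tick`, the dependency model, and the sort-by-id tie-break are assumptions made for illustration, not bernstein's actual scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    id: str
    deps: set[str] = field(default_factory=set)


def tick(tasks: list[Task], done: set[str], running: set[str], slots: int) -> list[str]:
    """Return the task ids to start this tick. No randomness, no model call."""
    ready = [
        t.id
        for t in tasks
        if t.id not in done and t.id not in running and t.deps <= done
    ]
    free = max(slots - len(running), 0)
    # sorting makes the choice independent of input order
    return sorted(ready)[:free]


tasks = [Task("t-003", {"t-001"}), Task("t-002"), Task("t-001")]
print(tick(tasks, done=set(), running=set(), slots=2))          # ['t-001', 't-002']
print(tick(tasks, done={"t-001"}, running={"t-002"}, slots=2))  # ['t-003']
```

since the tick depends only on its arguments, logging those arguments is enough to replay every scheduling decision later.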

does it phone home?

nothing leaves your machine without your config. opt-in telemetry is full and audit-grade when you turn it on: hmac-chained run trail, per-task tool calls, model usage, token cost, latency percentiles. ship it to your own otel collector, datadog, splunk, s3 bucket. defaults to local-only because the on-prem audience wants that, but the enterprise hooks are there.

where does it run?

wherever you point it. your laptop, on-prem behind a firewall, cloudflare workers as the cloud runtime, kubernetes as a multi-node cluster, or a hybrid of those. sandbox-execution mode is supported (cloudflare sandbox, local docker). your repo is the input, your tests are the gate; bernstein adapts to the host. nothing forces a saas hop.

how is this different from claude code?

claude code can spawn sub-agents on its own; bernstein does the same thing across 40+ different cli agents at once and verifies their output against your tests instead of trusting it. complementary, not competing. claude code is the most common primary backend inside bernstein.

one engineering post a month.

what we shipped, what broke, what we learned. no funnel, no auto-drip. unsubscribe with one click.
