🧠 A modular orchestration layer for LLM agents. Traceable. Forkable. Deterministic. Real.
I’ve been building software for over a decade: AI systems, DevOps stacks, immersive tech, you name it. I’ve co-founded companies, shipped products, led teams. But nothing in recent years has broken my developer soul quite like trying to build with today’s LLM tooling.
I don’t say this lightly: LangChain nearly made me quit.
It’s not that LangChain is bad. It’s that it solves the wrong abstraction, and in doing so it creates the illusion of structure where there is none. Prompt chains, memory hacks, magical “agents” with no trace logs or determinism. What looks like flexibility often becomes a black hole of fragility.
So I stepped away.
And built something else.
⛔ What Broke Me
I wanted to build a real AI reasoning layer — something that felt composable, testable, explainable.
Instead, I found:
- Chained prompts with no traceability
- Hidden logic that made debugging impossible
- Memory that was really just a vectorstore with fancy hats
- “Agents” that were barely more than `if` statements wrapped in optimism
Worst of all? Every single run felt like a gamble. No determinism. No accountability. No observability.
💥 What I Wanted
I wanted to design systems like I would in robotics or neuroscience:
- Signals pass through a network
- Decisions emerge from structure, not just syntax
- Memory decays, context fades, flows branch
- Every action is traceable, auditable, replayable
In short: I didn’t want a chatbot chain. I wanted cognition.
🚀 So I Built OrKa
OrKa = Orchestrator Kit for Agents
It’s an open cognitive execution engine built from scratch:
- Defined via YAML
- Backed by Redis or Kafka
- Uses agents as modular units of logic
- Fully traceable, forkable, and observable
🧬 The Core Philosophy:
Agents are not scripts. They’re nodes in a reasoning graph.
🔍 1. Traceability: If You Can't Rewind It, It's Not Real
Every OrKa run logs every agent execution with:
- Input/output
- Latency
- Timestamps
- Failure state
- Confidence distribution (if applicable)
Backends:
- Redis Streams (default)
- Kafka (production-grade option)
- Soon: Langfuse, Prometheus/Grafana, or custom exporters
This isn’t logging after the fact. This is execution-by-design — like flight data recorders for cognition.
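To make that concrete, here is a minimal sketch of what one per-agent trace record could look like. The field names (`agent_id`, `latency_ms`, `failed`, `confidence`) and the helper function are illustrative assumptions for this post, not OrKa's actual schema:

```python
import json
import time

def make_trace_record(agent_id, input_text, output_text, started_at, finished_at,
                      failed=False, confidence=None):
    """Build one per-agent trace entry. Field names are illustrative, not OrKa's schema."""
    return {
        "agent_id": agent_id,
        "input": input_text,
        "output": output_text,
        "timestamp": started_at,
        "latency_ms": round((finished_at - started_at) * 1000, 2),
        "failed": failed,
        # Routers can also attach a confidence distribution, e.g. {"math": 0.9, "code": 0.1}
        "confidence": confidence,
    }

t0 = time.time()
record = make_trace_record("classify_input", "2 + 2", "math", t0, t0 + 0.42)
print(json.dumps(record, indent=2))
```

The point is that every field is captured at execution time, so a run can be replayed and audited record by record instead of reconstructed from scattered logs.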
🔀 2. Fork–Join Execution: Branching Isn't Optional
LangChain treats branching as exotic. OrKa makes it native:
```yaml
orchestrator:
  id: example_flow
  strategy: fork_group
  queue: redis
  agents:
    - classify_input
    - fork_next

agents:
  - id: classify_input
    type: router
    prompt: |
      What type of input is this: "{{ input }}"?
      Choose one: [math, code, poetry]
  - id: fork_next
    type: fork_group
    targets:
      - math_agent
      - code_agent
      - poetry_agent
```
Each branch runs in parallel. You can join them later using a join_node that merges outputs.
You define cognition like infrastructure, not inline conditionals.
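For example, merging the three branches back together could look roughly like this. The exact keys (`group`, `merge_strategy`) are illustrative assumptions, not guaranteed to match OrKa's actual join_node schema:

```yaml
- id: merge_results
  type: join_node
  group: fork_next          # the fork_group whose branches to wait on
  merge_strategy: concat    # illustrative: how the branch outputs are combined
```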
🔁 3. Kafka + Redis Integration — Queue as the Substrate
OrKa isn’t tied to one runtime. You can run:
- Local CPU-only agents
- LLM calls via LiteLLM (OpenAI, Ollama, Claude, Mistral)
- Full streaming queues (Kafka topics, Redis shards)
The orchestrator reads from a queue, resolves the strategy (e.g. sequential, fork_group, confidence_weighted), and schedules agent execution deterministically.
It’s closer to Kubernetes for cognition than it is to LangChain.
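In pseudocode terms, the loop looks something like the sketch below. The names (`resolve_order`, `run_flow`, `run_agent`) are hypothetical, not OrKa's actual API; the point is that execution order is resolved deterministically from the declared strategy, never left to arbitrary callback timing:

```python
from collections import deque

def resolve_order(strategy, agents):
    """Resolve an execution plan deterministically from the declared strategy."""
    if strategy == "sequential":
        return list(agents)      # declared order, as-is
    if strategy == "fork_group":
        return sorted(agents)    # stable, reproducible order for parallel branches
    raise ValueError(f"unknown strategy: {strategy}")

def run_flow(strategy, agents, run_agent):
    """Drain the queue one agent at a time; each step can be logged and replayed."""
    queue = deque(resolve_order(strategy, agents))
    results = {}
    while queue:
        agent_id = queue.popleft()
        results[agent_id] = run_agent(agent_id)
    return results

out = run_flow("sequential", ["classify_input", "fork_next"],
               run_agent=lambda a: f"ran {a}")
print(out)
```

Because the plan is a pure function of the YAML, two runs with the same input produce the same schedule, which is what makes determinism possible at all.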
🧠 4. Memory Decay — Forgetting Is a Feature
In LangChain, memory is a blob of text stuffed into the next prompt.
In OrKa, memory is scoped and temporal:
- Episodic: per-run context
- Procedural: flow-level embedded traces
- Semantic: long-term embeddings (with freshness scores)
- Decay logic: you set how fast information fades
Inspired by how actual cognition works.
Ask yourself: Why should an agent remember a number from 2 hours ago?
OrKa gives you control.
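A simple way to picture decay is an exponential freshness score. The `half_life_seconds` parameter and the formula below are assumptions for illustration, not OrKa's actual implementation:

```python
def freshness(age_seconds, half_life_seconds):
    """Exponential decay: the score halves every half_life_seconds."""
    return 0.5 ** (age_seconds / half_life_seconds)

def recall(memories, now, half_life_seconds, threshold=0.1):
    """Keep only memories whose decayed score is still above the threshold."""
    return [m for m in memories
            if freshness(now - m["stored_at"], half_life_seconds) >= threshold]

memories = [
    {"text": "user's name is Ada", "stored_at": 0},
    {"text": "scratch number 42",  "stored_at": 0},
]
# After 2 hours with a 30-minute half-life, score = 0.5**4 = 0.0625 < 0.1:
print(recall(memories, now=7200, half_life_seconds=1800))  # → []
# With a 2-hour half-life, score = 0.5, so both memories survive:
print(len(recall(memories, now=7200, half_life_seconds=7200)))  # → 2
```

Tuning the half-life per memory scope is exactly the kind of control a blob-of-text memory can never give you.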
✅ What I Have Now
- A working system with 1,000+ test runs and zero agent drift
- 7.6 s average latency per agent on CPU-only Ollama (DeepSeek-R1)
- Visual YAML builder via OrKa UI → https://orka-ui.web.app
- PyPI package → https://pypi.org/project/orka-reasoning/
- Docker-ready frontend → https://hub.docker.com/r/marcosomma/orka-ui
🎯 Who Should Use OrKa?
- Engineers building multi-agent systems (beyond toy apps)
- Infra teams who want traceable LLM pipelines
- Researchers who care about reproducibility
- Builders frustrated with the “just hack the prompt” approach
👋 Up Next
Part 2 will dive into:
- Service Nodes (MemoryWriter, RAGNode, Embedder)
- Fork–Join flow construction in YAML
- Kafka orchestration for high-throughput agent networks
Want to see the benchmark logs? Play with the UI?
Explore → https://github.com/marcosomma/orka-reasoning
Thanks for reading. I built OrKa because I was tired of pretending brittle chains were cognition.
Let’s build real systems.