- Do you use a framework (LangChain, CrewAI) or roll your own?
- How do you handle agent-to-agent data passing?
- What does your observability look like for agent runs?
- Are you running agents on cron/webhooks or manual-only?
Interested in hearing what's working and what's painful.
The naive approach is stateless. Each reply gets processed independently. This breaks down fast when a prospect says "as I mentioned before" and the agent has no memory of what they mentioned before.
What has worked better: treating the entire conversation thread as the context window, not just the latest message. Every reply, every prior message, the research done on the prospect at the start, all of it gets passed through. The agent always knows where it is in the conversation and what has already been said.
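A minimal sketch of that idea: one state object that accumulates the prospect research and every message on the thread, and renders all of it into the prompt on each turn. Names and structure here are illustrative, not a specific library's API.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationState:
    """Full thread context passed to the agent on every turn (hypothetical shape)."""
    prospect_research: str                                # research gathered at the start
    messages: list[dict] = field(default_factory=list)    # every message, both sides

    def add(self, role: str, text: str) -> None:
        self.messages.append({"role": role, "text": text})

    def to_prompt(self) -> str:
        """Render the whole thread, not just the latest reply."""
        history = "\n".join(f"{m['role']}: {m['text']}" for m in self.messages)
        return (
            f"Prospect research:\n{self.prospect_research}\n\n"
            f"Thread so far:\n{history}"
        )


state = ConversationState(prospect_research="VP Sales at Acme, hiring SDRs")
state.add("agent", "Hi, noticed you're scaling your SDR team...")
state.add("prospect", "As I mentioned before, budget opens in Q3.")
print(state.to_prompt())
```

With this, "as I mentioned before" resolves naturally, because the earlier mention is always in the prompt.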
The second problem is confidence calibration. Multi-agent systems in production need to know when to act autonomously and when to surface something for human review. In sales specifically, the cost of an agent saying something wrong to a real prospect is high. We err toward flagging ambiguous situations rather than guessing.
The pattern that has held up: agents own clearly bounded tasks end to end (research, draft, send, parse reply), with a thin orchestration layer that routes based on reply classification. Classification is the hardest part to get right and the most important to get right.
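The orchestration layer can stay thin precisely because routing is driven entirely by the reply classification. A sketch with illustrative labels and a keyword stand-in where the real classifier (its own agent/LLM call) would go:

```python
def classify_reply(reply: str) -> str:
    """Stand-in classifier; in practice this is its own agent call."""
    text = reply.lower()
    if "unsubscribe" in text or "not interested" in text:
        return "negative"
    if "call" in text or "demo" in text:
        return "interested"
    return "ambiguous"


# Each label maps to a bounded next action owned by one agent.
HANDLERS = {
    "interested": lambda r: "schedule_call",
    "negative": lambda r: "close_out",
}


def orchestrate(reply: str) -> str:
    label = classify_reply(reply)
    # Anything unclassified falls through to human review rather than a guess.
    return HANDLERS.get(label, lambda r: "human_review")(reply)


print(orchestrate("Happy to do a quick call next week"))      # -> schedule_call
print(orchestrate("Can you remind me what this is about?"))   # -> human_review
```

All the difficulty lives in `classify_reply`; the dispatch around it is trivial, which is the point.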
Observability is the part most people underestimate. I log every agent run with input, output, token usage, and latency to a dedicated collection. Simple but it catches failures fast.
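The logging wrapper can be a few lines: one JSON record per run with input, output, token usage, and latency. Field names and the result shape are assumptions here; token accounting depends on what your model client returns.

```python
import json
import time
import uuid


def run_with_logging(agent_name, fn, payload, log_path="agent_runs.jsonl"):
    """Wrap a single agent run and append one structured record.

    Assumes fn returns a dict like {"output": ..., "tokens": int};
    adapt to your client's actual response shape.
    """
    start = time.monotonic()
    result = fn(payload)
    record = {
        "run_id": str(uuid.uuid4()),
        "agent": agent_name,
        "input": payload,
        "output": result.get("output"),
        "tokens": result.get("tokens"),
        "latency_ms": round((time.monotonic() - start) * 1000, 2),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record


# Dummy agent for demonstration.
rec = run_with_logging(
    "draft", lambda p: {"output": f"draft for {p}", "tokens": 42}, "Acme"
)
print(rec["agent"], rec["tokens"])
```

Appending to a JSONL file stands in for the "dedicated collection" above; swap the `open(...)` for whatever store you query.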
How do you handle agent-to-agent data passing? - We have a memory concept scoped to the pipeline we're in.
What does your observability look like for agent runs? - Locally, we use our own test abstraction and evals. For production, we use https://www.wayfound.ai
Are you running agents on cron/webhooks or manual-only? - Webhooks, plus cron when needed.