The coordination problem is the real problem. Not whether AI can write code — it can. Not whether it can parse documents, draft reports, or run analysis — it can do those too. The problem is: who decides what task goes to which worker, with what context, under what constraints, and how does the result come back in a form you can actually use?
That's what an AI work orchestrator solves. And it turns out we already built this pattern once before — for food delivery.
The DoorDash Parallel Is Structural, Not Metaphorical
Everyone reaches for the DoorDash comparison when talking about AI agents. Most of the time it's lazy. But the structural parallel here is exact enough to be worth unpacking.
DoorDash doesn't cook food. It doesn't employ restaurants. What it does is operate a coordination layer: it accepts orders, validates them, packages the job with enough context for a driver (location, delivery address, special instructions), dispatches to an available worker, tracks execution, and confirms delivery back to the customer. The loop is: intake → validate → dispatch → execute → deliver → confirm.
An AI work orchestrator runs the same loop for software tasks. A client submits work. The orchestrator validates whether the task is well-formed enough to execute. It packages the job — tools, constraints, acceptance criteria — into a self-contained unit. It dispatches to an AI worker. The worker runs. Output comes back. The orchestrator validates the output against the acceptance criteria and delivers it to the client.
Same loop. Different cargo.
The reason this parallel matters: DoorDash is not in the restaurant business and it's not in the driving business. It's in the coordination business. AI orchestrators are not in the AI business. They're in the coordination business too. The intelligence is someone else's problem.
Intake: The Orchestrator Decides Before the Worker Does
Every orchestrator needs an intake layer. This is where most implementations fail — they skip validation and send whatever the user typed directly to the model. That's how you get vague, ambiguous tasks that produce useless output.
A real intake layer does three things:
- Accepts tasks that are specific, scoped, and actionable
- Rejects tasks that are too vague, too long, or structurally malformed
- Requests clarification when a task is close but missing key fields
The intake form is the enforcement mechanism. Here's a minimal implementation — not production-ready, but it shows exactly what the validation logic needs to check:
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>Orchestrator Intake</title>
</head>
<body>
  <form id="intakeForm">
    <label>Task Type
      <select name="task_type">
        <option value="code_generation">Code Generation</option>
        <option value="data_analysis">Data Analysis</option>
        <option value="document_drafting">Document Drafting</option>
        <option value="web_research">Web Research</option>
      </select>
    </label>
    <label>Description
      <textarea name="description" rows="4" placeholder="Describe the task in detail..."></textarea>
    </label>
    <label>Deadline (seconds)
      <input type="number" name="deadline" min="1" placeholder="e.g. 120" />
    </label>
    <fieldset>
      <legend>Tools Needed</legend>
      <label><input type="checkbox" name="tools" value="python_exec" /> python_exec</label>
      <label><input type="checkbox" name="tools" value="web_search" /> web_search</label>
      <label><input type="checkbox" name="tools" value="file_write" /> file_write</label>
    </fieldset>
    <button type="submit">Submit Task</button>
  </form>
  <p id="result"></p>
  <script>
    document.getElementById('intakeForm').addEventListener('submit', function (e) {
      e.preventDefault();
      const desc = this.description.value.trim();
      const deadline = parseInt(this.deadline.value, 10);
      const result = document.getElementById('result');
      if (desc.length < 20) {
        result.textContent = '❌ Rejected: task description too vague';
      } else if (Number.isNaN(deadline) || deadline < 10) {
        // An empty deadline field parses to NaN, and NaN < 10 is false —
        // without this check a blank deadline would slip through as accepted.
        result.textContent = '❌ Rejected: deadline missing or too short';
      } else {
        result.textContent = '✅ Accepted — dispatching context package...';
      }
    });
  </script>
</body>
</html>
```
This validates two things: description length (a proxy for specificity) and deadline feasibility. A real implementation would add token estimation against the deadline, tool availability checks, and a queue depth check before confirming acceptance. But the shape is right — reject early, reject cheap, never send garbage to a worker.
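A server-side version of the same gate, with the additions mentioned above, can be sketched in Python. The tool registry, queue limit, throughput constant, and the 4-characters-per-token heuristic are illustrative assumptions, not a fixed spec:

```python
# Minimal server-side intake validator (illustrative sketch).
AVAILABLE_TOOLS = {"python_exec", "web_search", "file_write"}
MAX_QUEUE_DEPTH = 100    # assumed backpressure limit
TOKENS_PER_SECOND = 50   # assumed worker throughput

def validate_intake(task: dict, queue_depth: int = 0) -> tuple[bool, str]:
    """Return (accepted, reason). Reject early, reject cheap."""
    desc = task.get("description", "").strip()
    deadline = task.get("deadline_seconds", 0)

    if len(desc) < 20:
        return False, "rejected: description too vague"
    if deadline < 10:
        return False, "rejected: deadline too short"
    # Rough token estimate: ~4 characters per token, plus the output budget.
    est_tokens = len(desc) // 4 + task.get("max_tokens", 4096)
    if est_tokens > deadline * TOKENS_PER_SECOND:
        return False, "rejected: deadline infeasible for estimated token budget"
    # Never dispatch a task that names a tool the worker pool doesn't have.
    unknown = set(task.get("tools", [])) - AVAILABLE_TOOLS
    if unknown:
        return False, f"rejected: unknown tools {sorted(unknown)}"
    if queue_depth >= MAX_QUEUE_DEPTH:
        return False, "rejected: queue full, retry later"
    return True, "accepted"
```

Every check returns a reason string, so a rejection tells the client what to fix instead of silently dropping the task.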
The Context Package: Atomic Unit of AI Work
When the orchestrator accepts a task, it doesn't just forward the description. It builds a context package — a self-contained work unit that the AI worker can execute without asking a single follow-up question.
This is the key insight. The worker should be stateless relative to the client. Everything it needs to complete the task — tools, constraints, acceptance criteria, input data — must be in the package. If the worker needs to ask for clarification, the intake layer failed.
Here's what a real context package looks like:
```json
{
  "task_id": "tsk_7f3a92b1",
  "task_type": "code_generation",
  "description": "Write a Python function that parses ISO 8601 timestamps and returns them in UTC. Handle edge cases: timezone offsets, missing seconds, leap seconds.",
  "tools_available": ["python_exec", "web_search", "file_write"],
  "constraints": {
    "deadline_seconds": 120,
    "max_tokens": 4096,
    "output_format": "python_file",
    "max_retries": 2
  },
  "input_data": {
    "language": "python",
    "test_cases": ["2024-01-15T10:30:00+05:30", "2024-03-01T00:00:00Z"]
  },
  "acceptance_criteria": [
    "Function returns datetime object in UTC",
    "All provided test cases pass",
    "Includes docstring and type hints"
  ]
}
```
The most important fields here are acceptance_criteria and constraints. The criteria define what "done" looks like — without them, you can't validate output and the delivery loop never closes. The constraints define the envelope: how long, how many tokens, how many retries before failing. tools_available tells the worker what it can actually call; a worker that tries to use a tool it doesn't have should fail fast, not hallucinate an answer.
task_id matters more than it looks. Once you're running work across multiple workers asynchronously, you need to trace every package through the system. Correlation IDs are not optional at scale.
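A minimal builder for this package might look like the following sketch. The `ContextPackage` class and the ID scheme are assumptions that mirror the JSON above:

```python
import json
import uuid
from dataclasses import dataclass, field, asdict

@dataclass
class ContextPackage:
    """Self-contained work unit; the worker needs nothing else to execute."""
    task_type: str
    description: str
    tools_available: list
    constraints: dict
    input_data: dict = field(default_factory=dict)
    acceptance_criteria: list = field(default_factory=list)
    # Correlation ID minted at build time and traced through the whole pipeline.
    task_id: str = field(default_factory=lambda: f"tsk_{uuid.uuid4().hex[:8]}")

    def to_json(self) -> str:
        """Serialize for dispatch to a worker."""
        return json.dumps(asdict(self), indent=2)

pkg = ContextPackage(
    task_type="code_generation",
    description="Parse ISO 8601 timestamps and return them in UTC.",
    tools_available=["python_exec"],
    constraints={"deadline_seconds": 120, "max_tokens": 4096, "max_retries": 2},
    acceptance_criteria=["Function returns datetime object in UTC"],
)
```

Minting the `task_id` inside the builder, rather than accepting one from the client, guarantees every package in the system is uniquely traceable.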
The Flow, End to End
The pipeline runs: Client → Orchestrator → Context Package → AI Worker → output back to the Orchestrator for validation → Delivered Output, with a confirmation returning to the Client.
Each node has one job. Client submits work and eventually receives output. Orchestrator is the only node with decision authority — it accepts or rejects, builds the context package, and validates the output before it goes back. Context Package is not a node in the runtime sense; it's the artifact that moves between orchestrator and worker. AI Worker executes and returns output — it has no client contact, no intake logic, no delivery responsibility. Delivered Output is what the orchestrator hands back after validating against the acceptance criteria.
That final confirmation back to the client is the moment it knows the task is done. In synchronous flows, this is a response. In async flows, it's a webhook or a queue event.
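One way to sketch that sync/async split — the `TaskDone` event shape and the in-process queue are stand-in assumptions for a real webhook dispatcher or message bus:

```python
import queue
from dataclasses import dataclass

@dataclass
class TaskDone:
    """Confirmation event fired when validated output is delivered."""
    task_id: str
    status: str  # "completed" or "failed"

# Stand-in for a webhook dispatcher or message bus that a subscriber drains.
events: queue.Queue = queue.Queue()

def deliver(task_id: str, output: str, sync: bool):
    """Sync flow: the response itself carries the confirmation.
    Async flow: enqueue an event; the client hears about it later."""
    event = TaskDone(task_id=task_id, status="completed")
    if sync:
        return output, event
    events.put(event)
    return None, event
```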
Why You Can't Skip This Layer
You can call GPT-4 directly. You can pipe a description into an agent framework. For simple, one-shot tasks, that works fine. For anything multi-step, long-running, or production-facing, it falls apart fast.
Three specific problems the orchestrator solves:
The context problem. AI workers have token limits and no persistent memory. Every task needs to arrive with complete context. If you rely on the worker to ask follow-up questions or remember what happened in a previous call, you're building fragility into the critical path. The context package pattern forces you to front-load everything.
The retry problem. Workers fail. Models hallucinate. Outputs don't meet criteria. Without an orchestrator managing max_retries and validation, a failed task silently becomes a bad result. The orchestrator owns the retry loop — not the client, not the worker.
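That retry loop can be sketched as follows. The `run_worker` and `validate` callables are stand-ins for the real worker call and the acceptance-criteria check:

```python
def execute_with_retries(package: dict, run_worker, validate) -> dict:
    """Orchestrator-owned retry loop: the worker never retries itself,
    and the client never sees an unvalidated result."""
    max_retries = package.get("constraints", {}).get("max_retries", 2)
    failures = []
    for attempt in range(max_retries + 1):  # first try plus retries
        output = run_worker(package)
        ok, reason = validate(output, package.get("acceptance_criteria", []))
        if ok:
            return {"status": "completed", "output": output,
                    "attempts": attempt + 1}
        failures.append(reason)
    # An explicit failure beats a silently bad result.
    return {"status": "failed", "failures": failures}
```

Note that the failure result carries every rejection reason, so a task that exhausts its retries produces a diagnosable record instead of a bad answer.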
The coordination problem. At any scale above a single task, you have multiple workers, multiple task types, queue depth, priority ordering, and resource constraints. None of that belongs in the worker. None of it belongs in the client. It belongs in a dedicated coordination layer — the orchestrator.
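The queue-depth and priority-ordering concerns can be sketched with a heap. The numeric priority scheme and depth limit here are illustrative assumptions:

```python
import heapq
import itertools
from typing import Optional

class Dispatcher:
    """Priority queue of context packages: lower number = higher priority.
    A monotonic counter breaks ties FIFO among equal priorities."""

    def __init__(self, max_depth: int = 100):
        self._heap = []
        self._counter = itertools.count()
        self.max_depth = max_depth

    def submit(self, package: dict, priority: int = 5) -> bool:
        if len(self._heap) >= self.max_depth:
            # Backpressure: intake should reject, not buffer forever.
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), package))
        return True

    def next_task(self) -> Optional[dict]:
        """Pop the highest-priority package for the next free worker."""
        if not self._heap:
            return None
        _, _, package = heapq.heappop(self._heap)
        return package
```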
This is an infrastructure problem. The AI is a commodity. The orchestration is the hard part.
Who Builds This and What the Stack Looks Like
Right now, most teams building AI workflows are reinventing the orchestrator in their application code. A few framework layers (LangGraph, Temporal for AI workflows, some of the newer agent orchestration platforms) are starting to make this explicit. But the market is early.
The teams that get this right will treat the orchestrator as a first-class service: its own codebase, its own observability, its own SLAs. Not a utility function inside an existing app.
The stack looks like this:
| Layer | Responsibility |
|---|---|
| Client / UI | Submit tasks, receive results |
| Orchestrator API | Intake, validation, queue management |
| Context Builder | Assembles task + tools + constraints into packages |
| Worker Pool | AI agents or model API calls with tool access |
| Output Validator | Checks output against acceptance criteria |
| Delivery Layer | Returns results, fires confirmation events |
The interesting engineering is in the orchestrator API and the output validator. Those two layers determine whether the system is actually reliable or just impressive in demos.
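To make the output validator concrete, here is one possible shape. Mapping natural-language criteria to registered checks by keyword is an illustrative assumption; real systems might use structured criteria IDs or a judge model instead:

```python
# Illustrative output validator: maps acceptance criteria to concrete checks.
CHECKS = {
    "docstring": lambda out: '"""' in out,
    "type hints": lambda out: "->" in out,
    "utc": lambda out: "utc" in out.lower(),
}

def validate_output(output: str, criteria: list) -> tuple[bool, list]:
    """Return (passed, failed_criteria). Unmatched criteria fail closed:
    if no registered check covers a criterion, it counts as a failure."""
    failed = []
    for criterion in criteria:
        check = next((fn for key, fn in CHECKS.items()
                      if key in criterion.lower()), None)
        if check is None or not check(output):
            failed.append(criterion)
    return (not failed, failed)
```

Failing closed on unrecognized criteria is the deliberate choice here: an unverifiable "done" should block delivery, not pass by default.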
Build the intake form first. Enforce the acceptance criteria. The workers will sort themselves out.