Files
ai_ops/docs/orchestration-engine.md

49 lines
3.3 KiB
Markdown

# Schema-Driven Orchestration Engine
## Why this exists
The orchestration runtime introduces explicit schema validation and deterministic execution rules for multi-agent pipelines. The design favors predictable behavior over implicit conversational memory.
## Main components
- `AgentManifest` schema (`src/agents/manifest.ts`): validates personas, relationships, topology constraints, and a strict DAG pipeline.
- Persona registry (`src/agents/persona-registry.ts`): renders templated prompts with runtime context and routes behavioral events.
- Stateful storage for stateless execution (`src/agents/state-context.ts`): each node execution reads payload + state from storage to get fresh context.
- DAG pipeline runner (`src/agents/pipeline.ts`): executes topology blocks, emits typed domain events, evaluates route conditions, and enforces retry/depth/failure limits.
- Project context store (`src/agents/project-context.ts`): project-scoped global flags, artifact pointers, and task queue persisted across sessions.
- Orchestration facade (`src/agents/orchestration.ts`): wires manifest + registry + pipeline + state manager + project context with env-driven limits.
- Hierarchical resource suballocation (`src/agents/provisioning.ts`): builds child `git-worktree` and child `port-range` requests from parent allocation data.
- Recursive manager runtime (`src/agents/manager.ts`): utility invoked by the pipeline engine for fan-out/retry-unrolled execution.
## Constraint model
- Relationship constraints: per-edge limits (`maxDepth`, `maxChildren`) and process-level cap (`AGENT_RELATIONSHIP_MAX_CHILDREN`).
- Pipeline constraints: per-node retry limits, retry-unrolled topology, and process-level cap (`AGENT_TOPOLOGY_MAX_RETRIES`).
- Topology constraints: max depth and retries from manifest + env caps.
## Stateless handoffs
Node payloads are persisted under the state root. Nodes do not inherit in-memory conversational context from previous node runs. Fresh context is reconstructed from the handoff and persisted state each execution. Sessions load project context from `AGENT_PROJECT_CONTEXT_PATH` at initialization, and orchestration writes project updates on each node completion.
## Execution topology model
- Pipeline graph execution is DAG-based with ready-node frontiers.
- Nodes tagged with topology blocks `parallel`/`hierarchical` are dispatched concurrently (`Promise.all`) through `AgentManager`.
- Validation failures follow retry-unrolled behavior and are executed as new manager child sessions.
- Sequential hard failures (timeout/network/403-like) trigger fail-fast abort.
- `AbortSignal` is passed through actor execution input for immediate cancellation propagation.
## Domain events
- Domain event schema is strongly typed (`src/agents/domain-events.ts`).
- Standard event domains:
- planning: `requirements_defined`, `tasks_planned`
- execution: `code_committed`, `task_blocked`
- validation: `validation_passed`, `validation_failed`
- integration: `branch_merged`
- Pipeline edges can trigger on domain events (`edge.event`) in addition to legacy status triggers (`edge.on`).
## Security note
Tool clearance allowlists/banlists are currently data-model stubs. Enforcement must be implemented in the tool execution boundary before relying on these policies for hard guarantees.