ai_ops/docs/orchestration-engine.md
2026-02-23 12:06:13 -05:00

Schema-Driven Orchestration Engine

Why this exists

The orchestration runtime introduces explicit schema validation and deterministic execution rules for multi-agent pipelines. The design favors predictable behavior over implicit conversational memory.

Main components

  • AgentManifest schema (src/agents/manifest.ts): validates personas, relationships, topology constraints, and a strict DAG pipeline.
  • Persona registry (src/agents/persona-registry.ts): renders templated prompts with runtime context and routes behavioral events.
  • Stateful storage for stateless execution (src/agents/state-context.ts): each node execution reads its payload and state from storage, so it starts from fresh context rather than in-memory carry-over.
  • DAG pipeline runner (src/agents/pipeline.ts): executes actor nodes, evaluates state/history/repo conditions, and enforces retry/depth limits.
  • Orchestration facade (src/agents/orchestration.ts): wires manifest + registry + pipeline + state manager with env-driven limits.
  • Hierarchical resource suballocation (src/agents/provisioning.ts): builds child git-worktree and child port-range requests from parent allocation data.
  • Recursive manager runtime (src/agents/manager.ts): queue-aware fanout/fan-in execution with fail-fast child cancellation and session-level abort propagation.
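
The "strict DAG pipeline" check in the manifest schema can be illustrated with a topological sort. This is a minimal sketch, not the code in src/agents/manifest.ts: the `Pipeline` and `PipelineEdge` shapes and the `topologicalOrder` function are hypothetical names chosen for illustration, and the real schema likely carries more fields.

```typescript
// Hypothetical shapes; the real AgentManifest schema may differ.
interface PipelineEdge {
  from: string;
  to: string;
}

interface Pipeline {
  nodes: string[];
  edges: PipelineEdge[];
}

// Returns node ids in a valid execution order (Kahn's algorithm),
// or throws if the pipeline is not a well-formed DAG.
function topologicalOrder(pipeline: Pipeline): string[] {
  const indegree = new Map<string, number>();
  const successors = new Map<string, string[]>();

  for (const id of pipeline.nodes) {
    if (indegree.has(id)) throw new Error(`duplicate node id: ${id}`);
    indegree.set(id, 0);
    successors.set(id, []);
  }

  for (const { from, to } of pipeline.edges) {
    if (!indegree.has(from) || !indegree.has(to)) {
      throw new Error(`edge references unknown node: ${from} -> ${to}`);
    }
    successors.get(from)!.push(to);
    indegree.set(to, indegree.get(to)! + 1);
  }

  // Start from nodes with no incoming edges and peel the graph layer by layer.
  const ready = pipeline.nodes.filter((id) => indegree.get(id) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of successors.get(id)!) {
      const remaining = indegree.get(next)! - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }

  // Any node left unvisited sits on a cycle.
  if (order.length !== pipeline.nodes.length) {
    throw new Error("pipeline contains a cycle");
  }
  return order;
}
```

Rejecting cycles at validation time, rather than at run time, means the runner can assume every node is reachable in a finite number of steps.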

Constraint model

  • Relationship constraints: per-edge limits (maxDepth, maxChildren) and a process-level cap (AGENT_RELATIONSHIP_MAX_CHILDREN).
  • Pipeline constraints: per-node retry limits and a process-level cap (AGENT_TOPOLOGY_MAX_RETRIES).
  • Topology constraints: maximum depth and retry counts declared in the manifest, bounded by the env caps.
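
One plausible way to combine a manifest-declared limit with a process-level cap is to take the smaller of the two. The sketch below assumes that rule; the helper names `readCap` and `effectiveLimit`, the fallback value, and the choice to pass the environment in as a plain record are all illustrative and not taken from src/agents/orchestration.ts.

```typescript
// Env-like record, passed in explicitly so the logic is easy to test.
type Env = Record<string, string | undefined>;

// Hypothetical helper: read a non-negative integer cap, or use a fallback
// when the variable is unset or empty.
function readCap(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

// Assumption: a declared limit may tighten the cap but never exceed it.
function effectiveLimit(declared: number | undefined, cap: number): number {
  return declared === undefined ? cap : Math.min(declared, cap);
}

// Example: the manifest asks for 5 retries, the process allows at most 3.
const exampleEnv: Env = { AGENT_TOPOLOGY_MAX_RETRIES: "3" };
const retryCap = readCap(exampleEnv, "AGENT_TOPOLOGY_MAX_RETRIES", 2);
const nodeRetries = effectiveLimit(5, retryCap); // min(5, 3) = 3
```

Under this assumption, an operator can lower limits for a whole process without editing any manifest.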

Stateless handoffs

Node payloads are persisted under the state root. Nodes do not inherit in-memory conversational context from previous node runs. Instead, fresh context is reconstructed from the handoff and the persisted state on each execution.
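
The handoff pattern can be sketched as a store that only ever exchanges serialized data. This is an illustration under assumptions: `StateStore`, `Handoff`, `NodeContext`, and the key layout are hypothetical, and a `Map` of JSON strings stands in for files under the state root so the example stays self-contained.

```typescript
// Hypothetical shapes; the real state manager lives in src/agents/state-context.ts.
interface Handoff {
  nodeId: string;
  payload: unknown;
}

interface NodeContext {
  handoff: Handoff;
  state: Record<string, unknown>;
}

// Stand-in for file storage under the state root. Values are kept as JSON
// strings, so nothing is shared by reference between node runs.
class StateStore {
  private readonly files = new Map<string, string>();

  writeHandoff(sessionId: string, handoff: Handoff): void {
    const key = `${sessionId}/handoffs/${handoff.nodeId}.json`;
    this.files.set(key, JSON.stringify(handoff));
  }

  writeState(sessionId: string, state: Record<string, unknown>): void {
    this.files.set(`${sessionId}/state.json`, JSON.stringify(state));
  }

  // Rebuilds a node's context from storage alone; throws if the handoff
  // for that node was never persisted.
  buildFreshContext(sessionId: string, nodeId: string): NodeContext {
    const rawHandoff = this.files.get(`${sessionId}/handoffs/${nodeId}.json`);
    if (rawHandoff === undefined) {
      throw new Error(`no handoff persisted for node ${nodeId}`);
    }
    const rawState = this.files.get(`${sessionId}/state.json`) ?? "{}";
    return {
      handoff: JSON.parse(rawHandoff) as Handoff,
      state: JSON.parse(rawState) as Record<string, unknown>,
    };
  }
}
```

Because every read goes through deserialization, a node that mutates its context cannot leak those changes to a later node unless it writes them back explicitly.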

Recursive execution model

  • Recursive planning is schema-driven: a node returns child intents rather than imperatively spawning children.
  • Parent execution ends before child runs begin; the parent's token is released and is reacquired only for the aggregate phase.
  • Child sessions use deterministic hierarchical IDs (<parent>_child_<n>) and are cancellable through parent session closure.
  • Resource orchestration remains external to AgentManager via middleware hooks for child allocation/release.
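
The declarative planning step and the hierarchical ID scheme can be sketched as follows. The `<parent>_child_<n>` format comes from the text above; everything else is an assumption for illustration, including the `ChildIntent` and `NodeResult` shapes, the `planChildren` helper, and the choice of zero-based indices. Cancellation and the aggregate phase are not shown.

```typescript
// Hypothetical shapes; the real types in src/agents/manager.ts may differ.
interface ChildIntent {
  persona: string;
  task: string;
}

interface NodeResult {
  output: string;
  childIntents: ChildIntent[]; // the node describes children; it does not spawn them
}

interface ChildSession {
  sessionId: string;
  intent: ChildIntent;
}

// Builds an ID of the form <parent>_child_<n>. Zero-based n is an assumption.
function childSessionId(parentId: string, index: number): string {
  return `${parentId}_child_${index}`;
}

// Turns a finished parent's intents into planned child sessions,
// rejecting plans that exceed the allowed fanout.
function planChildren(
  parentId: string,
  result: NodeResult,
  maxChildren: number,
): ChildSession[] {
  if (result.childIntents.length > maxChildren) {
    throw new Error(
      `node requested ${result.childIntents.length} children, limit is ${maxChildren}`,
    );
  }
  return result.childIntents.map((intent, index) => ({
    sessionId: childSessionId(parentId, index),
    intent,
  }));
}
```

Since IDs are derived only from the parent ID and the child's position, a grandchild of `root` would be named along the lines of `root_child_0_child_2`, which makes the ancestry of any session readable from its ID.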

Security note

Tool clearance allowlists/banlists are currently data-model stubs and are not enforced. Enforcement must be implemented at the tool execution boundary before these policies can be relied on for hard guarantees.
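
As a rough idea of what boundary enforcement might look like, the sketch below wraps every tool call in a clearance check. None of this exists in the codebase as described: the `ToolClearance` shape, `isToolPermitted`, and `guardedInvoke` are hypothetical, and the policy choices (banlist wins over allowlist, a missing allowlist imposes no restriction) are assumptions that a real implementation would need to decide deliberately.

```typescript
// Hypothetical policy shape; the actual data model may differ.
interface ToolClearance {
  allowlist?: string[];
  banlist?: string[];
}

// Assumed precedence: a banned tool is always rejected; if an allowlist is
// present, only listed tools pass; with no allowlist, unbanned tools pass.
function isToolPermitted(clearance: ToolClearance, tool: string): boolean {
  if (clearance.banlist?.includes(tool)) return false;
  if (clearance.allowlist !== undefined) {
    return clearance.allowlist.includes(tool);
  }
  return true;
}

// Single choke point: the tool body runs only after the check passes.
function guardedInvoke<T>(
  clearance: ToolClearance,
  tool: string,
  run: () => T,
): T {
  if (!isToolPermitted(clearance, tool)) {
    throw new Error(`tool "${tool}" is not permitted by clearance policy`);
  }
  return run();
}
```

The check only gives a hard guarantee if no tool can be invoked without passing through a wrapper like this one.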