Schema-Driven Orchestration Engine

Why this exists

The orchestration runtime introduces explicit schema validation and deterministic execution rules for multi-agent pipelines. The design favors predictable behavior over implicit conversational memory.

Main components

  • AgentManifest schema (src/agents/manifest.ts): validates personas, relationships, topology constraints, and a strict DAG pipeline.
  • Persona registry (src/agents/persona-registry.ts): renders templated prompts with runtime context and routes behavioral events.
  • Stateful storage for stateless execution (src/agents/state-context.ts): each node execution reads payload + state from storage to get fresh context.
  • DAG pipeline runner (src/agents/pipeline.ts): executes topology blocks, emits typed domain events, evaluates route conditions, and enforces retry/depth/failure limits.
  • Project context store (src/agents/project-context.ts): project-scoped global flags, artifact pointers, and task queue persisted across sessions.
  • Orchestration facade (src/agents/orchestration.ts): wires manifest + registry + pipeline + state manager + project context with env-driven limits.
  • Hierarchical resource suballocation (src/agents/provisioning.ts): builds child git-worktree and child port-range requests from parent allocation data.
  • Recursive manager runtime (src/agents/manager.ts): utility invoked by the pipeline engine for fan-out/retry-unrolled execution.
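
The "strict DAG pipeline" check performed by the manifest schema could look roughly like the sketch below. It uses Kahn's algorithm to reject cycles and dangling edges. The `Pipeline`, `PipelineNode`, and `PipelineEdge` shapes and the `validateDag` name are assumptions made for illustration; the actual schema in src/agents/manifest.ts may be structured differently.

```typescript
// Hypothetical shapes: the real AgentManifest schema may differ.
interface PipelineNode { id: string }
interface PipelineEdge { from: string; to: string }
interface Pipeline { nodes: PipelineNode[]; edges: PipelineEdge[] }

// Returns a list of problems; an empty list means the graph is a valid DAG.
function validateDag(pipeline: Pipeline): string[] {
  const errors: string[] = [];
  const indegree = new Map<string, number>();
  const children = new Map<string, string[]>();

  for (const node of pipeline.nodes) {
    if (indegree.has(node.id)) errors.push(`duplicate node id: ${node.id}`);
    indegree.set(node.id, 0);
    children.set(node.id, []);
  }

  for (const edge of pipeline.edges) {
    if (!indegree.has(edge.from) || !indegree.has(edge.to)) {
      errors.push(`edge references unknown node: ${edge.from} -> ${edge.to}`);
      continue;
    }
    children.get(edge.from)!.push(edge.to);
    indegree.set(edge.to, indegree.get(edge.to)! + 1);
  }

  // Kahn's algorithm: repeatedly remove nodes that have no unmet dependencies.
  const queue = Array.from(indegree.keys()).filter((id) => indegree.get(id) === 0);
  let visited = 0;
  while (queue.length > 0) {
    const id = queue.shift()!;
    visited++;
    for (const child of children.get(id)!) {
      const remaining = indegree.get(child)! - 1;
      indegree.set(child, remaining);
      if (remaining === 0) queue.push(child);
    }
  }

  // Any node never reached still has an unmet dependency, so it sits on a cycle.
  if (visited < indegree.size) errors.push("pipeline contains a cycle");
  return errors;
}
```

Returning a list of problems rather than throwing on the first one lets a manifest author see every structural error in a single validation pass.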

Constraint model

  • Relationship constraints: per-edge limits (maxDepth, maxChildren) and process-level cap (AGENT_RELATIONSHIP_MAX_CHILDREN).
  • Pipeline constraints: per-node retry limits, retry-unrolled topology, and process-level cap (AGENT_TOPOLOGY_MAX_RETRIES).
  • Topology constraints: maximum depth and retry counts come from the manifest and are subject to the env caps.
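
One plausible way to combine a manifest limit with a process-level cap is to take the smaller of the two. The sketch below assumes exactly that; the document does not state how the two values are merged, and `parseCap` and `effectiveLimit` are hypothetical helper names. Only the environment variable names come from the list above.

```typescript
// Environment is passed in explicitly so the helpers stay easy to test.
type Env = Record<string, string | undefined>;

// Reads an optional non-negative integer cap from the environment.
function parseCap(env: Env, name: string): number | undefined {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return undefined;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

// Assumption: the process-level cap can only tighten a manifest limit, never loosen it.
function effectiveLimit(manifestLimit: number, cap: number | undefined): number {
  return cap === undefined ? manifestLimit : Math.min(manifestLimit, cap);
}
```

For example, a node that declares 5 retries in the manifest would be held to 3 when AGENT_TOPOLOGY_MAX_RETRIES is set to 3, and would keep its 5 retries when the variable is unset.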

Stateless handoffs

Node payloads are persisted under the state root. Nodes do not inherit in-memory conversational context from previous node runs; instead, fresh context is reconstructed from the handoff and the persisted state on each execution. Sessions load project context from AGENT_PROJECT_CONTEXT_PATH at initialization, and orchestration writes project updates on each node completion.
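
A minimal sketch of the handoff pattern follows. It is not the API of src/agents/state-context.ts, which is not shown here: the `StateStore` interface, the key layout, and the function names are all assumptions, and an in-memory map stands in for files under the state root.

```typescript
// Hypothetical types for illustration only.
interface Handoff { nodeId: string; payload: unknown }

interface StateStore {
  read(key: string): string | undefined;
  write(key: string, value: string): void;
}

// In-memory stand-in for storage under the state root.
class MemoryStore implements StateStore {
  private data = new Map<string, string>();
  read(key: string): string | undefined { return this.data.get(key); }
  write(key: string, value: string): void { this.data.set(key, value); }
}

// The upstream node serializes its payload instead of sharing live objects.
function persistHandoff(store: StateStore, sessionId: string, handoff: Handoff): void {
  store.write(`${sessionId}/handoffs/${handoff.nodeId}.json`, JSON.stringify(handoff.payload));
}

// The downstream node rebuilds its context purely from storage on every run.
function buildFreshContext(
  store: StateStore,
  sessionId: string,
  nodeId: string,
): { payload: unknown; state: Record<string, unknown> } {
  const rawPayload = store.read(`${sessionId}/handoffs/${nodeId}.json`);
  if (rawPayload === undefined) {
    throw new Error(`no handoff persisted for node "${nodeId}"`);
  }
  const rawState = store.read(`${sessionId}/state.json`);
  return {
    payload: JSON.parse(rawPayload),
    state: rawState === undefined ? {} : JSON.parse(rawState),
  };
}
```

Because the context passes through serialization, nothing a previous node held in memory can leak into the next run; a file-backed store could satisfy the same interface.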

Execution topology model

  • Pipeline graph execution is DAG-based with ready-node frontiers.
  • Nodes tagged with topology blocks parallel/hierarchical are dispatched concurrently (Promise.all) through AgentManager.
  • Validation failures follow retry-unrolled behavior and are executed as new manager child sessions.
  • Sequential hard failures (timeout/network/403-like) trigger fail-fast abort.
  • AbortSignal is passed through actor execution input for immediate cancellation propagation.
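
Ready-node frontier execution can be sketched as below. This is a simplification under stated assumptions: every node in a frontier is dispatched concurrently, whereas the engine described above does so only for nodes in parallel/hierarchical blocks, and AgentManager, retries, and AbortSignal propagation are left out. The `Graph` shape and `runByFrontiers` are hypothetical names, and the graph is assumed to be an already validated DAG.

```typescript
type NodeId = string;
interface Graph { nodes: NodeId[]; edges: Array<[NodeId, NodeId]> } // [from, to]
type RunNode = (id: NodeId) => Promise<void>;

// Executes the graph wave by wave and returns the frontiers in dispatch order.
async function runByFrontiers(graph: Graph, run: RunNode): Promise<NodeId[][]> {
  // Count unmet dependencies per node.
  const pending = new Map<NodeId, number>();
  for (const id of graph.nodes) pending.set(id, 0);
  for (const [, to] of graph.edges) pending.set(to, pending.get(to)! + 1);

  const waves: NodeId[][] = [];
  let frontier = graph.nodes.filter((id) => pending.get(id) === 0);

  while (frontier.length > 0) {
    waves.push(frontier);
    // Promise.all rejects on the first failure, so later frontiers never start.
    await Promise.all(frontier.map((id) => run(id)));

    // Release successors whose dependencies have all completed.
    const next: NodeId[] = [];
    for (const done of frontier) {
      for (const [from, to] of graph.edges) {
        if (from !== done) continue;
        const left = pending.get(to)! - 1;
        pending.set(to, left);
        if (left === 0) next.push(to);
      }
    }
    frontier = next;
  }
  return waves;
}
```

For a diamond-shaped graph (plan feeds code and test, which both feed merge), the frontiers are plan, then code and test together, then merge.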

Domain events

  • Domain event schema is strongly typed (src/agents/domain-events.ts).
  • Standard event domains:
    • planning: requirements_defined, tasks_planned
    • execution: code_committed, task_blocked
    • validation: validation_passed, validation_failed
    • integration: branch_merged
  • Pipeline edges can trigger on domain events (edge.event) in addition to legacy status triggers (edge.on).
  • history_has_event route conditions evaluate persisted domain event history entries (validation_failed, task_blocked, etc.).
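
The event domains listed above map naturally onto a discriminated union. The sketch below is illustrative only: the real schema in src/agents/domain-events.ts may carry more fields, and the `HistoryHasEvent` and `Edge` shapes, along with the rule that `edge.event` takes precedence over `edge.on`, are assumptions rather than documented behavior.

```typescript
// Event names are taken from the list above; the object shape is assumed.
type DomainEvent =
  | { domain: "planning"; type: "requirements_defined" | "tasks_planned" }
  | { domain: "execution"; type: "code_committed" | "task_blocked" }
  | { domain: "validation"; type: "validation_passed" | "validation_failed" }
  | { domain: "integration"; type: "branch_merged" };

type DomainEventType = DomainEvent["type"];

// Hypothetical route condition shape.
interface HistoryHasEvent { kind: "history_has_event"; event: DomainEventType }

// True when the persisted history contains at least one matching event.
function historyHasEvent(history: DomainEvent[], condition: HistoryHasEvent): boolean {
  return history.some((entry) => entry.type === condition.event);
}

// Hypothetical edge shape with an event trigger and a legacy status trigger.
interface Edge { from: string; to: string; event?: DomainEventType; on?: string }

// Assumption: an edge that declares `event` ignores the legacy `on` status.
function edgeFires(edge: Edge, emitted: DomainEvent, status: string): boolean {
  if (edge.event !== undefined) return edge.event === emitted.type;
  return edge.on === status;
}
```

With the union in place, the compiler rejects a misspelled event name such as "validation_fail" wherever a `DomainEventType` is expected.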

Security note

Security enforcement now lives in src/security:

  • bash-parser AST parsing for shell command tokenization (Command/Word nodes).
  • Zod-validated shell/tool policy schemas.
  • SecurityRulesEngine for binary allowlists, path traversal checks, worktree boundaries, and tool clearance checks.
  • SecureCommandExecutor for controlled child_process execution with timeout + explicit env policy.

PipelineExecutor handles SecurityViolationError according to a configurable policy:

  • hard_abort (default): immediate pipeline termination.
  • validation_fail: maps to retry-unrolled remediation.
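
The two policies could be dispatched along the lines of the sketch below. The policy names come from the list above; the `SecurityViolationError` class defined here is a local stand-in for the real one, and the `Outcome` type and `resolveSecurityViolation` function are hypothetical.

```typescript
type SecurityViolationPolicy = "hard_abort" | "validation_fail";

// Local stand-in for the engine's error class.
class SecurityViolationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "SecurityViolationError";
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, SecurityViolationError.prototype);
  }
}

type Outcome =
  | { action: "abort"; reason: string }   // terminate the pipeline immediately
  | { action: "retry"; reason: string };  // hand off to retry-unrolled remediation

// Maps a security violation to an outcome; any other error is rethrown untouched.
function resolveSecurityViolation(
  error: unknown,
  policy: SecurityViolationPolicy = "hard_abort",
): Outcome {
  if (!(error instanceof SecurityViolationError)) throw error;
  return policy === "validation_fail"
    ? { action: "retry", reason: error.message }
    : { action: "abort", reason: error.message };
}
```

Defaulting the parameter to hard_abort mirrors the documented default, so a caller that forgets to configure a policy fails closed.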