# Schema-Driven Orchestration Engine

## Why this exists

The orchestration runtime introduces explicit schema validation and deterministic execution rules for multi-agent pipelines. The design favors predictable behavior over implicit conversational memory.

## Main components

- `AgentManifest` schema (`src/agents/manifest.ts`): validates personas, relationships, topology constraints, and a strict DAG pipeline.
- Persona registry (`src/agents/persona-registry.ts`): renders templated prompts with runtime context and routes behavioral events.
- Stateful storage for stateless execution (`src/agents/state-context.ts`): each node execution reads payload + state from storage to get fresh context.
- DAG pipeline runner (`src/agents/pipeline.ts`): executes topology blocks, emits typed domain events, evaluates route conditions, and enforces retry/depth/failure limits.
- Project context store (`src/agents/project-context.ts`): project-scoped global flags, artifact pointers, and task queue persisted across sessions.
- Orchestration facade (`src/agents/orchestration.ts`): wires manifest + registry + pipeline + state manager + project context with env-driven limits.
- Hierarchical resource suballocation (`src/agents/provisioning.ts`): builds child `git-worktree` and child `port-range` requests from parent allocation data.
  - Optional `AGENT_WORKTREE_TARGET_PATH` enables sparse-checkout for a subdirectory and sets per-session working directory to that target path.
- Recursive manager runtime (`src/agents/manager.ts`): utility invoked by the pipeline engine for fan-out/retry-unrolled execution.

## Constraint model

- Relationship constraints: per-edge limits (`maxDepth`, `maxChildren`) and process-level cap (`AGENT_RELATIONSHIP_MAX_CHILDREN`).
- Pipeline constraints: per-node retry limits, retry-unrolled topology, and process-level cap (`AGENT_TOPOLOGY_MAX_RETRIES`).
- Topology constraints: max depth and retries from manifest + env caps.

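
The constraint model above pairs manifest-level limits with process-level caps but does not spell out how the two combine. One plausible reading is that the effective limit is the smaller of the two. The sketch below illustrates that assumption only; `parseCap` and `effectiveLimit` are hypothetical helpers, not names from the codebase, and only the environment variable names come from this document.

```typescript
// Hypothetical helpers: only the env variable names appear in the document.

/** Parse a process-level cap from an env-style map; undefined if missing or invalid. */
function parseCap(
  env: Record<string, string | undefined>,
  name: string
): number | undefined {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return undefined;
  const value = Number(raw);
  // Accept only non-negative integers; anything else is treated as "no cap".
  return Number.isInteger(value) && value >= 0 ? value : undefined;
}

/** Assumed rule: the manifest value applies unless the process-level cap is lower. */
function effectiveLimit(manifestLimit: number, cap: number | undefined): number {
  return cap === undefined ? manifestLimit : Math.min(manifestLimit, cap);
}

// Example env map standing in for process.env.
const env = {
  AGENT_TOPOLOGY_MAX_RETRIES: "3",
  AGENT_RELATIONSHIP_MAX_CHILDREN: "oops", // invalid, so the cap is ignored
};

// Manifest asks for 5 retries, the process cap allows 3.
const maxRetries = effectiveLimit(5, parseCap(env, "AGENT_TOPOLOGY_MAX_RETRIES"));
// Manifest asks for 4 children, the cap is unparsable, so the manifest value stands.
const maxChildren = effectiveLimit(4, parseCap(env, "AGENT_RELATIONSHIP_MAX_CHILDREN"));
```

Treating an invalid cap as "no cap" is a choice made for this sketch; a real runtime might prefer to reject the configuration at startup.
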
## Stateless handoffs

Node payloads are persisted under the state root. Nodes do not inherit in-memory conversational context from previous node runs. Fresh context is reconstructed from the handoff and persisted state each execution.

Sessions load project context from `AGENT_PROJECT_CONTEXT_PATH` at initialization, and orchestration writes project updates on each node completion.

## Resolved execution contract

Before each actor invocation, orchestration resolves an immutable `ResolvedExecutionContext` and injects it into the executor input:

- `phase`: current pipeline node id
- `modelConstraint`: persona-level model policy (or runtime fallback)
- `allowedTools`: flat resolved tool list for that node attempt
- `security`: hard runtime constraints (`dropUid`, `dropGid`, `worktreePath`, violation handling mode)

This keeps orchestration policy resolution separate from executor enforcement. Executors do not need to parse manifests or MCP registry internals.

Worktree ownership invariant:

- In UI session mode, orchestration/session lifecycle is the single owner of git worktree allocation.
- Provider adapters (Codex/Claude runtime wrappers) must execute inside `ResolvedExecutionContext.security.worktreePath` and must not provision independent worktrees.

## Execution topology model

- Pipeline graph execution is DAG-based with ready-node frontiers.
- Nodes tagged with topology blocks `parallel`/`hierarchical` are dispatched concurrently (`Promise.all`) through `AgentManager`.
- Validation failures follow retry-unrolled behavior and are executed as new manager child sessions.
- Sequential hard failures (timeout/network/403-like) trigger fail-fast abort.
- `AbortSignal` is passed through actor execution input for immediate cancellation propagation.

## Domain events

- Domain event schema is strongly typed (`src/agents/domain-events.ts`).
- Standard event domains:
  - planning: `requirements_defined`, `tasks_planned`
  - execution: `code_committed`, `task_blocked`
  - validation: `validation_passed`, `validation_failed`
  - integration: `branch_merged`, `merge_conflict_detected`, `merge_conflict_resolved`, `merge_conflict_unresolved`, `merge_retry_started`
- Pipeline edges can trigger on domain events (`edge.event`) in addition to legacy status triggers (`edge.on`).
- `history_has_event` route conditions evaluate persisted domain event history entries (`validation_failed`, `task_blocked`, etc.).

## Merge conflict orchestration

- Task merge/close merge operations return structured outcomes (`success`, `conflict`, `fatal_error`) instead of throwing for conflicts.
- Task state supports conflict workflows (`conflict`, `resolving_conflict`) and conflict metadata is persisted under `task.metadata.mergeConflict`.
- Conflict retries are bounded by `AGENT_MERGE_CONFLICT_MAX_ATTEMPTS`; exhaustion emits `merge_conflict_unresolved` and the session continues without crashing.

## Security note

Security enforcement now lives in `src/security`:

- `bash-parser` AST parsing for shell command tokenization (`Command`/`Word` nodes).
- Zod-validated shell/tool policy schemas.
- `SecurityRulesEngine` for binary allowlists, path traversal checks, worktree boundaries, and tool clearance checks.
- `SecureCommandExecutor` for controlled `child_process` execution with timeout + explicit env policy.
- `ResolvedExecutionContext.allowedTools` is used to filter provider-exposed tools before SDK invocation, including Claude-specific tool gating where shared `enabled_tools` is ignored.

`PipelineExecutor` treats `SecurityViolationError` via configurable policy:

- `hard_abort` (default): immediate pipeline termination.
- `validation_fail`: maps to retry-unrolled remediation.

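
The two violation policies above can be pictured as a small mapping from a caught error to a pipeline outcome. The following is a minimal sketch, not the actual `PipelineExecutor` code: the local `SecurityViolationError` class, the `NodeOutcome` type, and `handleViolation` are illustrative stand-ins, and only the policy names and the `hard_abort` default come from this document.

```typescript
// Illustrative stand-ins; the real classes in src/security may look different.
type ViolationPolicy = "hard_abort" | "validation_fail";

class SecurityViolationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "SecurityViolationError";
  }
}

// Hypothetical outcome shape for this sketch.
type NodeOutcome =
  | { kind: "abort"; reason: string }
  | { kind: "validation_failed"; reason: string };

/** Name-based check, which also works when classes are compiled to ES5. */
function isSecurityViolation(error: unknown): error is Error {
  return error instanceof Error && error.name === "SecurityViolationError";
}

/** Map a security violation to an outcome; other errors are rethrown untouched. */
function handleViolation(
  error: unknown,
  policy: ViolationPolicy = "hard_abort"
): NodeOutcome {
  if (!isSecurityViolation(error)) throw error;
  return policy === "hard_abort"
    ? { kind: "abort", reason: error.message } // terminate the pipeline
    : { kind: "validation_failed", reason: error.message }; // hand off to retry path
}

const violation = new SecurityViolationError("path escapes worktree boundary");
const byDefault = handleViolation(violation);
const remediated = handleViolation(violation, "validation_fail");
```

Returning a tagged outcome instead of rethrowing keeps the policy decision in one place, so the caller can route `validation_failed` into the same retry-unrolled path used for ordinary validation failures.
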