Enforce resolved execution context for deterministic actor policy

2026-02-23 17:51:09 -05:00
parent 94c79d9dd7
commit 4f5ff16b45
9 changed files with 371 additions and 93 deletions


@@ -73,7 +73,7 @@ npm run dev -- claude "List potential improvements."
 `AgentManifest` (schema `"1"`) validates:
 - supported topologies (`sequential`, `parallel`, `hierarchical`, `retry-unrolled`)
-- persona definitions and tool-clearance policy (validated by shared Zod schema)
+- persona definitions, optional `modelConstraint`, and tool-clearance policy (validated by shared Zod schema)
 - relationship DAG and unknown persona references
 - strict pipeline DAG
 - topology constraints (`maxDepth`, `maxRetries`)
@@ -191,9 +191,10 @@ jq -c 'select(.severity=="critical")' .ai_ops/events/runtime-events.ndjson
 - timeout enforcement
 - optional uid/gid drop
 - stdout/stderr streaming hooks for audit
+- Every actor execution input now includes a pre-resolved `executionContext` (`phase`, `modelConstraint`, `allowedTools`, and immutable security constraints) generated by orchestration per node attempt.
 - Every actor execution input now includes `security` helpers (`rulesEngine`, `createCommandExecutor(...)`) so executors can enforce shell/tool policy at the execution boundary.
-- Every actor execution input now includes `mcp` helpers (`registry`, `resolveConfig(...)`) so MCP server config resolution stays centrally policy-controlled per persona/tool-clearance.
-- For Claude-based executors, use `input.mcp.createClaudeCanUseTool()` as the SDK `canUseTool` callback to enforce persona allowlist/banlist before each tool invocation.
+- Every actor execution input now includes `mcp` helpers (`resolvedConfig`, `resolveConfig(...)`, `filterToolsForProvider(...)`, `createClaudeCanUseTool()`) so provider adapters are filtered against `executionContext.allowedTools` before SDK calls.
+- For Claude-based executors, pass `input.mcp.filterToolsForProvider(...)` and `input.mcp.createClaudeCanUseTool()` into the SDK call path so unauthorized tools are never exposed and runtime bypass attempts trigger security violations.
 - Pipeline behavior on `SecurityViolationError` is configurable:
   - `hard_abort` (default)
   - `validation_fail` (retry-unrolled remediation)
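
The executor-side wiring described in the Claude bullet above might look roughly like the sketch below. The type names, `makeMcpHelpers`, and `buildProviderCallOptions` are hypothetical stand-ins introduced for illustration; only `filterToolsForProvider` and `createClaudeCanUseTool` come from the diff, and the stub returns a deny result where the real handler may throw through the rules engine.

```typescript
// Hypothetical, trimmed-down mirrors of types touched by this commit.
type ToolPermissionResult =
  | { behavior: "allow" }
  | { behavior: "deny"; message: string; interrupt: true };

type ToolPermissionHandler = (toolName: string) => Promise<ToolPermissionResult>;

type McpHelpers = {
  filterToolsForProvider: (tools: string[]) => string[];
  createClaudeCanUseTool: () => ToolPermissionHandler;
};

// Stand-in for the helpers that orchestration injects as `input.mcp`.
function makeMcpHelpers(allowedTools: readonly string[]): McpHelpers {
  const allowset = new Set(allowedTools);
  return {
    // Keep only allowed tools, dropping duplicates while preserving order.
    filterToolsForProvider: (tools) =>
      tools.filter((tool, index) => tools.indexOf(tool) === index && allowset.has(tool)),
    // This stub returns a deny result; the real handler may throw instead.
    createClaudeCanUseTool: () => async (toolName) => {
      if (allowset.has(toolName)) {
        return { behavior: "allow" };
      }
      return { behavior: "deny", message: `Tool "${toolName}" is not allowed.`, interrupt: true };
    },
  };
}

// Hypothetical executor-side wiring: filter the exposed tools first,
// then gate every invocation with the permission callback.
function buildProviderCallOptions(mcp: McpHelpers, providerTools: string[]) {
  return {
    tools: mcp.filterToolsForProvider(providerTools),
    canUseTool: mcp.createClaudeCanUseTool(),
  };
}
```

Filtering before the call keeps unauthorized tools out of the provider's view, while the callback covers the case where a tool name is requested anyway.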


@@ -25,6 +25,17 @@ The orchestration runtime introduces explicit schema validation and deterministi
 Node payloads are persisted under the state root. Nodes do not inherit in-memory conversational context from previous node runs. Fresh context is reconstructed from the handoff and persisted state each execution. Sessions load project context from `AGENT_PROJECT_CONTEXT_PATH` at initialization, and orchestration writes project updates on each node completion.
+## Resolved execution contract
+Before each actor invocation, orchestration resolves an immutable `ResolvedExecutionContext` and injects it into the executor input:
+- `phase`: current pipeline node id
+- `modelConstraint`: persona-level model policy (or runtime fallback)
+- `allowedTools`: flat resolved tool list for that node attempt
+- `security`: hard runtime constraints (`dropUid`, `dropGid`, `worktreePath`, violation handling mode)
+This keeps orchestration policy resolution separate from executor enforcement. Executors do not need to parse manifests or MCP registry internals.
 ## Execution topology model
 - Pipeline graph execution is DAG-based with ready-node frontiers.
@@ -52,6 +63,7 @@ Security enforcement now lives in `src/security`:
 - Zod-validated shell/tool policy schemas.
 - `SecurityRulesEngine` for binary allowlists, path traversal checks, worktree boundaries, and tool clearance checks.
 - `SecureCommandExecutor` for controlled `child_process` execution with timeout + explicit env policy.
+- `ResolvedExecutionContext.allowedTools` is used to filter provider-exposed tools before SDK invocation, including Claude-specific tool gating where shared `enabled_tools` is ignored.
 `PipelineExecutor` treats `SecurityViolationError` via configurable policy:
 - `hard_abort` (default): immediate pipeline termination.
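
As a rough illustration of the resolved execution contract documented in this file, the sketch below builds a context and locks it down. The field names, the `"provider-default"` sentinel, and the fallback order follow the diff; `resolveModelConstraint`, `freezeExecutionContext`, and the use of `Object.freeze` are assumptions made for the example (the commit itself appears to rely on copying rather than freezing).

```typescript
type SecurityViolationHandling = "hard_abort" | "validation_fail";

type ResolvedExecutionContext = {
  readonly phase: string;
  readonly modelConstraint: string;
  readonly allowedTools: readonly string[];
  readonly security: {
    readonly dropUid: boolean;
    readonly dropGid: boolean;
    readonly worktreePath: string;
    readonly violationMode: SecurityViolationHandling;
  };
};

// Hypothetical helper: persona constraint first, then the runtime default,
// then the sentinel string used in the diff.
function resolveModelConstraint(personaConstraint?: string, runtimeDefault?: string): string {
  return personaConstraint ?? runtimeDefault ?? "provider-default";
}

// Hypothetical helper: copies and freezes nested values so an executor
// cannot widen the policy it was handed.
function freezeExecutionContext(context: ResolvedExecutionContext): ResolvedExecutionContext {
  return Object.freeze({
    ...context,
    allowedTools: Object.freeze([...context.allowedTools]),
    security: Object.freeze({ ...context.security }),
  });
}

const exampleContext = freezeExecutionContext({
  phase: "task-node",
  modelConstraint: resolveModelConstraint(undefined, "claude-3-haiku"),
  allowedTools: ["read_file", "write_file"],
  security: {
    dropUid: false,
    dropGid: false,
    worktreePath: "/tmp/worktree", // illustrative path only
    violationMode: "hard_abort",
  },
});
```

Because the context is resolved once per node attempt, an executor only needs to read these fields; it never has to consult the manifest or the MCP registry.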


@@ -15,6 +15,7 @@
 - Coordinates DAG traversal and retry behavior.
 - Computes aggregate run status from executed terminal nodes plus critical-path failures.
 - Applies dedicated `SecurityViolationError` handling policy (`hard_abort` or `validation_fail` mapping).
+- Resolves per-attempt `ResolvedExecutionContext` (phase/model/tool/security contract) and injects it into actor executors.
 ## Aggregate status semantics
@@ -29,3 +30,9 @@ Otherwise status is `failure`.
 State and project-context writes are now atomic via temp-file + rename.
 Project-context patch/write operations are serialized both in-process (promise queue) and cross-process (lock file).
+## Tool enforcement guarantees
+- Pipeline resolves a flat `allowedTools` list per node attempt.
+- MCP config exposed to executors is pre-filtered to `allowedTools`.
+- Claude tool callbacks are expected to use the provided policy handler so unsupported shared MCP tool filters cannot bypass enforcement.
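
The first two guarantees above can be sketched as a pair of pure functions. This is a simplified reading of the precedence visible in the diff (a non-empty allowlist wins, otherwise the discovered tool universe is used, and the banlist is always subtracted); the function names `resolveAllowedTools` and `filterEnabledTools` are hypothetical.

```typescript
type ToolClearancePolicy = { allowlist: string[]; banlist: string[] };

// Order-preserving de-duplication.
function dedupe(values: readonly string[]): string[] {
  return values.filter((value, index) => values.indexOf(value) === index);
}

// A non-empty allowlist takes precedence; otherwise fall back to the
// discovered tool universe. Banned tools are removed in both cases.
function resolveAllowedTools(
  policy: ToolClearancePolicy,
  toolUniverse: readonly string[],
): string[] {
  const banlist = new Set(policy.banlist);
  const base: readonly string[] = policy.allowlist.length > 0 ? policy.allowlist : toolUniverse;
  return dedupe(base.filter((tool) => !banlist.has(tool)));
}

// Intersect a server's enabled tools with the allowed set. A server with no
// explicit list starts from the allowed tools themselves.
function filterEnabledTools(
  enabledFromConfig: readonly string[] | undefined,
  allowedTools: readonly string[],
): string[] {
  const allowset = new Set(allowedTools);
  return (enabledFromConfig ?? allowedTools).filter((tool) => allowset.has(tool));
}

const fromAllowlist = resolveAllowedTools(
  { allowlist: ["read_file", "write_file", "rm"], banlist: ["rm"] },
  ["search"],
); // ["read_file", "write_file"]

const fromUniverse = resolveAllowedTools(
  { allowlist: [], banlist: ["rm"] },
  ["search", "rm", "search"],
); // ["search"]

const serverTools = filterEnabledTools(["read_file", "rm"], fromAllowlist); // ["read_file"]
```

Note that an empty allowed set makes `filterEnabledTools` return an empty list for every server, which matches the diff's behavior of writing `enabled_tools: []` in that case.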


@@ -11,6 +11,7 @@ export type ManifestPersona = {
   id: string;
   displayName: string;
   systemPromptTemplate: string;
+  modelConstraint?: string;
   toolClearance: ToolClearancePolicy;
 };
@@ -147,10 +148,21 @@ function parsePersona(value: unknown): ManifestPersona {
     throw new Error("Manifest persona entry must be an object.");
   }
+  const modelConstraintRaw = value.modelConstraint;
+  if (
+    modelConstraintRaw !== undefined &&
+    (typeof modelConstraintRaw !== "string" || modelConstraintRaw.trim().length === 0)
+  ) {
+    throw new Error('Manifest persona field "modelConstraint" must be a non-empty string when provided.');
+  }
   return {
     id: readString(value, "id"),
     displayName: readString(value, "displayName"),
     systemPromptTemplate: readString(value, "systemPromptTemplate"),
+    ...(typeof modelConstraintRaw === "string"
+      ? { modelConstraint: modelConstraintRaw.trim() }
+      : {}),
     toolClearance: parseToolClearance(value.toolClearance),
   };
 }
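
The validation added to `parsePersona` has three outcomes: the field is absent, it is a usable string that gets trimmed, or it is rejected. A standalone sketch of that rule follows; `readOptionalModelConstraint` is a hypothetical name, and only the error message is taken from the diff.

```typescript
// Hypothetical standalone version of the check added to parsePersona.
function readOptionalModelConstraint(persona: Record<string, unknown>): string | undefined {
  const raw = persona["modelConstraint"];
  if (raw === undefined) {
    // Absent field: the persona falls back to the runtime default later on.
    return undefined;
  }
  if (typeof raw !== "string" || raw.trim().length === 0) {
    throw new Error('Manifest persona field "modelConstraint" must be a non-empty string when provided.');
  }
  // Stored value is trimmed, matching the diff.
  return raw.trim();
}

const absent = readOptionalModelConstraint({ id: "coder" }); // undefined
const trimmed = readOptionalModelConstraint({ modelConstraint: "  claude-3-haiku " }); // "claude-3-haiku"
```

Whitespace-only strings and non-string values such as `42` both raise the same error, which is what the new "rejects empty persona modelConstraint" test exercises.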


@@ -364,12 +364,18 @@ export class SchemaDrivenExecutionEngine {
 {
   workspaceRoot: this.settings.workspaceRoot,
   runtimeContext: this.settings.runtimeContext,
+  defaultModelConstraint: this.config.provider.claudeModel,
+  resolvedExecutionSecurityConstraints: {
+    dropUid: this.config.security.dropUid !== undefined,
+    dropGid: this.config.security.dropGid !== undefined,
+    worktreePath: this.settings.workspaceRoot,
+    violationMode: this.settings.securityViolationHandling,
+  },
   maxDepth: Math.min(this.settings.maxDepth, this.manifest.topologyConstraints.maxDepth),
   maxRetries: Math.min(this.settings.maxRetries, this.manifest.topologyConstraints.maxRetries),
   manager: this.manager,
   managerSessionId,
   projectContextStore: this.projectContextStore,
-  mcpRegistry: this.mcpRegistry,
   resolveMcpConfig: ({ providerHint, prompt, toolClearance }) =>
     loadMcpConfigFromEnv(
       {


@@ -74,6 +74,11 @@ export class PersonaRegistry {
     };
   }
+  getModelConstraint(personaId: string): string | undefined {
+    const persona = this.getById(personaId);
+    return persona.modelConstraint;
+  }
   async emitBehaviorEvent(input: PersonaBehaviorContext & { personaId: string }): Promise<JsonObject> {
     const persona = this.getById(input.personaId);
     const handler = persona.behaviorHandlers?.[input.event];


@@ -16,8 +16,11 @@ import {
 } from "./lifecycle-observer.js";
 import type { AgentManifest, PipelineEdge, PipelineNode, RouteCondition } from "./manifest.js";
 import type { AgentManager, RecursiveChildIntent } from "./manager.js";
-import type { McpRegistry } from "../mcp/handlers.js";
-import type { LoadedMcpConfig, McpLoadContext } from "../mcp/types.js";
+import type {
+  CodexConfigObject,
+  LoadedMcpConfig,
+  McpLoadContext,
+} from "../mcp/types.js";
 import { PersonaRegistry } from "./persona-registry.js";
 import { type ProjectContextPatch, type FileSystemProjectContextStore } from "./project-context.js";
 import {
@@ -74,19 +77,35 @@ export type ActorToolPermissionHandler = (
 ) => Promise<ActorToolPermissionResult>;
 export type ActorExecutionMcpContext = {
-  registry: McpRegistry;
+  allowedTools: string[];
+  resolvedConfig: LoadedMcpConfig;
   resolveConfig: (context?: McpLoadContext) => LoadedMcpConfig;
+  filterToolsForProvider: (tools: string[]) => string[];
   createToolPermissionHandler: () => ActorToolPermissionHandler;
   createClaudeCanUseTool: () => ActorToolPermissionHandler;
 };
+export type ResolvedExecutionSecurityConstraints = {
+  dropUid: boolean;
+  dropGid: boolean;
+  worktreePath: string;
+  violationMode: SecurityViolationHandling;
+};
+export type ResolvedExecutionContext = {
+  phase: string;
+  modelConstraint: string;
+  allowedTools: string[];
+  security: ResolvedExecutionSecurityConstraints;
+};
 export type ActorExecutionInput = {
   sessionId: string;
   node: PipelineNode;
   prompt: string;
   context: NodeExecutionContext;
   signal: AbortSignal;
-  toolClearance: ToolClearancePolicy;
+  executionContext: ResolvedExecutionContext;
   mcp: ActorExecutionMcpContext;
   security?: ActorExecutionSecurityContext;
 };
@@ -114,12 +133,13 @@ export type PipelineAggregateStatus = "success" | "failure";
 export type PipelineExecutorOptions = {
   workspaceRoot: string;
   runtimeContext: Record<string, string | number | boolean>;
+  defaultModelConstraint?: string;
+  resolvedExecutionSecurityConstraints: ResolvedExecutionSecurityConstraints;
   maxDepth: number;
   maxRetries: number;
   manager: AgentManager;
   managerSessionId: string;
   projectContextStore: FileSystemProjectContextStore;
-  mcpRegistry: McpRegistry;
   failurePolicy?: FailurePolicy;
   lifecycleObserver?: PipelineLifecycleObserver;
   hardFailureThreshold?: number;
@@ -301,6 +321,99 @@ function dedupeStrings(values: readonly string[]): string[] {
   return deduped;
 }
+function cloneMcpConfig(config: LoadedMcpConfig): LoadedMcpConfig {
+  return typeof structuredClone === "function"
+    ? structuredClone(config)
+    : (JSON.parse(JSON.stringify(config)) as LoadedMcpConfig);
+}
+function readStringArray(value: unknown): string[] | undefined {
+  if (!Array.isArray(value)) {
+    return undefined;
+  }
+  const output: string[] = [];
+  for (const item of value) {
+    if (typeof item !== "string") {
+      continue;
+    }
+    const normalized = item.trim();
+    if (!normalized) {
+      continue;
+    }
+    output.push(normalized);
+  }
+  return output;
+}
+function toAllowedToolPolicy(allowedTools: readonly string[]): ToolClearancePolicy {
+  return {
+    allowlist: [...allowedTools],
+    banlist: [],
+  };
+}
+function applyAllowedToolsToLoadedMcpConfig(
+  input: LoadedMcpConfig,
+  allowedTools: readonly string[],
+): LoadedMcpConfig {
+  if (allowedTools.length === 0) {
+    const codexServers = input.codexConfig?.mcp_servers;
+    if (!codexServers) {
+      return cloneMcpConfig(input);
+    }
+    const sanitizedServers: Record<string, CodexConfigObject> = {};
+    for (const [serverName, rawServer] of Object.entries(codexServers)) {
+      if (typeof rawServer !== "object" || rawServer === null || Array.isArray(rawServer)) {
+        continue;
+      }
+      sanitizedServers[serverName] = {
+        ...rawServer,
+        enabled_tools: [],
+      };
+    }
+    return {
+      ...cloneMcpConfig(input),
+      codexConfig: {
+        ...(input.codexConfig ?? {}),
+        mcp_servers: sanitizedServers,
+      },
+    };
+  }
+  const allowset = new Set(allowedTools);
+  const codexServers = input.codexConfig?.mcp_servers;
+  if (!codexServers) {
+    return cloneMcpConfig(input);
+  }
+  const sanitizedServers: Record<string, CodexConfigObject> = {};
+  for (const [serverName, rawServer] of Object.entries(codexServers)) {
+    if (typeof rawServer !== "object" || rawServer === null || Array.isArray(rawServer)) {
+      continue;
+    }
+    const enabledFromConfig = readStringArray((rawServer as Record<string, unknown>).enabled_tools);
+    const enabledTools = (enabledFromConfig ?? allowedTools).filter((tool) => allowset.has(tool));
+    sanitizedServers[serverName] = {
+      ...rawServer,
+      enabled_tools: enabledTools,
+    };
+  }
+  return {
+    ...cloneMcpConfig(input),
+    codexConfig: {
+      ...(input.codexConfig ?? {}),
+      mcp_servers: sanitizedServers,
+    },
+  };
+}
 function toToolNameCandidates(toolName: string): string[] {
   const trimmed = toolName.trim();
   if (!trimmed) {
@@ -857,14 +970,20 @@ export class PipelineExecutor {
     try {
       throwIfAborted(input.signal);
       const toolClearance = this.personaRegistry.getToolClearance(input.node.personaId);
+      const executionContext = this.resolveExecutionContext({
+        node: input.node,
+        toolClearance,
+        prompt: input.prompt,
+      });
       return await input.executor({
         sessionId: input.sessionId,
         node: input.node,
         prompt: input.prompt,
         context: input.context,
         signal: input.signal,
-        toolClearance,
-        mcp: this.buildActorMcpContext(toolClearance),
+        executionContext,
+        mcp: this.buildActorMcpContext(executionContext, input.prompt),
         security: this.securityContext,
       });
     } catch (error) {
@@ -901,34 +1020,142 @@ export class PipelineExecutor {
     }
   }
-  private buildActorMcpContext(toolClearance: ToolClearancePolicy): ActorExecutionMcpContext {
+  private resolveExecutionContext(input: {
+    node: PipelineNode;
+    toolClearance: ToolClearancePolicy;
+    prompt: string;
+  }): ResolvedExecutionContext {
+    const normalizedToolClearance = parseToolClearancePolicy(input.toolClearance);
+    const toolUniverse = this.resolveAvailableToolsForAttempt(normalizedToolClearance, input.prompt);
+    const allowedTools = this.resolveAllowedToolsForAttempt({
+      toolClearance: normalizedToolClearance,
+      toolUniverse,
+    });
+    const modelConstraint =
+      this.personaRegistry.getModelConstraint(input.node.personaId) ??
+      this.options.defaultModelConstraint ??
+      "provider-default";
+    return {
+      phase: input.node.id,
+      modelConstraint,
+      allowedTools,
+      security: {
+        ...this.options.resolvedExecutionSecurityConstraints,
+      },
+    };
+  }
+  private resolveAllowedToolsForAttempt(input: {
+    toolClearance: ToolClearancePolicy;
+    toolUniverse: string[];
+  }): string[] {
+    const normalized = parseToolClearancePolicy(input.toolClearance);
+    const banlist = new Set(normalized.banlist);
+    if (normalized.allowlist.length > 0) {
+      return dedupeStrings(normalized.allowlist.filter((tool) => !banlist.has(tool)));
+    }
+    if (input.toolUniverse.length > 0) {
+      return dedupeStrings(input.toolUniverse.filter((tool) => !banlist.has(tool)));
+    }
+    return [];
+  }
+  private resolveAvailableToolsForAttempt(toolClearance: ToolClearancePolicy, prompt: string): string[] {
+    if (!this.options.resolveMcpConfig) {
+      return [];
+    }
+    const resolved = this.options.resolveMcpConfig({
+      providerHint: "codex",
+      prompt,
+      toolClearance,
+    });
+    const rawServers = resolved.codexConfig?.mcp_servers;
+    if (!rawServers) {
+      return [];
+    }
+    const tools: string[] = [];
+    for (const rawServer of Object.values(rawServers)) {
+      if (typeof rawServer !== "object" || rawServer === null || Array.isArray(rawServer)) {
+        continue;
+      }
+      const enabled = readStringArray((rawServer as Record<string, unknown>).enabled_tools) ?? [];
+      tools.push(...enabled);
+    }
+    return dedupeStrings(tools);
+  }
+  private buildActorMcpContext(
+    executionContext: ResolvedExecutionContext,
+    prompt: string,
+  ): ActorExecutionMcpContext {
+    const toolPolicy = toAllowedToolPolicy(executionContext.allowedTools);
+    const filterToolsForProvider = (tools: string[]): string[] => {
+      const deduped = dedupeStrings(tools);
+      const allowset = new Set(executionContext.allowedTools);
+      return deduped.filter((tool) => allowset.has(tool));
+    };
+    const baseResolvedConfig = this.options.resolveMcpConfig
+      ? this.options.resolveMcpConfig({
+          providerHint: "both",
+          prompt,
+          toolClearance: toolPolicy,
+        })
+      : {};
+    const resolvedConfig = applyAllowedToolsToLoadedMcpConfig(
+      baseResolvedConfig,
+      executionContext.allowedTools,
+    );
     const resolveConfig = (context: McpLoadContext = {}): LoadedMcpConfig => {
-      if (!this.options.resolveMcpConfig) {
-        return {};
+      if (context.providerHint === "codex") {
+        return {
+          ...(resolvedConfig.codexConfig ? { codexConfig: cloneMcpConfig(resolvedConfig).codexConfig } : {}),
+          ...(resolvedConfig.sourcePath ? { sourcePath: resolvedConfig.sourcePath } : {}),
+          ...(resolvedConfig.resolvedHandlers
+            ? { resolvedHandlers: { ...resolvedConfig.resolvedHandlers } }
+            : {}),
+        };
       }
-      return this.options.resolveMcpConfig({
-        ...context,
-        toolClearance,
-      });
+      if (context.providerHint === "claude") {
+        return {
+          ...(resolvedConfig.claudeMcpServers
+            ? { claudeMcpServers: cloneMcpConfig(resolvedConfig).claudeMcpServers }
+            : {}),
+          ...(resolvedConfig.sourcePath ? { sourcePath: resolvedConfig.sourcePath } : {}),
+          ...(resolvedConfig.resolvedHandlers
+            ? { resolvedHandlers: { ...resolvedConfig.resolvedHandlers } }
+            : {}),
+        };
+      }
+      return cloneMcpConfig(resolvedConfig);
     };
     const createToolPermissionHandler = (): ActorToolPermissionHandler =>
-      this.createToolPermissionHandler(toolClearance);
+      this.createToolPermissionHandler(executionContext.allowedTools);
     return {
-      registry: this.options.mcpRegistry,
+      allowedTools: [...executionContext.allowedTools],
+      resolvedConfig: cloneMcpConfig(resolvedConfig),
       resolveConfig,
+      filterToolsForProvider,
       createToolPermissionHandler,
       createClaudeCanUseTool: createToolPermissionHandler,
     };
   }
-  private createToolPermissionHandler(toolClearance: ToolClearancePolicy): ActorToolPermissionHandler {
-    const normalizedToolClearance = parseToolClearancePolicy(toolClearance);
-    const allowlist = new Set(normalizedToolClearance.allowlist);
-    const banlist = new Set(normalizedToolClearance.banlist);
+  private createToolPermissionHandler(allowedTools: readonly string[]): ActorToolPermissionHandler {
+    const allowset = new Set(allowedTools);
     const rulesEngine = this.securityContext?.rulesEngine;
+    const toolPolicy = toAllowedToolPolicy(allowedTools);
     return async (toolName, _input, options) => {
       const toolUseID = options.toolUseID;
@@ -942,63 +1169,23 @@ export class PipelineExecutor {
       }
       const candidates = toToolNameCandidates(toolName);
-      const banMatch = candidates.find((candidate) => banlist.has(candidate));
-      if (banMatch) {
-        if (rulesEngine) {
-          try {
-            rulesEngine.assertToolInvocationAllowed({
-              tool: banMatch,
-              toolClearance: normalizedToolClearance,
-            });
-          } catch {
-            // Security audit event already emitted by rules engine.
-          }
-        }
+      const allowMatch = candidates.find((candidate) => allowset.has(candidate));
+      if (!allowMatch) {
+        rulesEngine?.assertToolInvocationAllowed({
+          tool: candidates[0] ?? toolName,
+          toolClearance: toolPolicy,
+        });
         return {
           behavior: "deny",
-          message: `Tool "${toolName}" is blocked by actor tool policy.`,
+          message: `Tool "${toolName}" is not in the resolved execution allowlist.`,
           interrupt: true,
           ...(toolUseID ? { toolUseID } : {}),
         };
       }
-      if (allowlist.size > 0) {
-        const allowMatch = candidates.find((candidate) => allowlist.has(candidate));
-        if (!allowMatch) {
-          if (rulesEngine) {
-            try {
-              rulesEngine.assertToolInvocationAllowed({
-                tool: toolName,
-                toolClearance: normalizedToolClearance,
-              });
-            } catch {
-              // Security audit event already emitted by rules engine.
-            }
-          }
-          return {
-            behavior: "deny",
-            message: `Tool "${toolName}" is not in the actor tool allowlist.`,
-            interrupt: true,
-            ...(toolUseID ? { toolUseID } : {}),
-          };
-        }
-        rulesEngine?.assertToolInvocationAllowed({
-          tool: allowMatch,
-          toolClearance: normalizedToolClearance,
-        });
-        return {
-          behavior: "allow",
-          ...(toolUseID ? { toolUseID } : {}),
-        };
-      }
       rulesEngine?.assertToolInvocationAllowed({
-        tool: candidates[0] ?? toolName,
-        toolClearance: normalizedToolClearance,
+        tool: allowMatch,
+        toolClearance: toolPolicy,
       });
       return {
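
The reworked permission handler reduces to an allowlist-only gate, since banned tools were already removed when `allowedTools` was resolved. The sketch below illustrates that shape under stated assumptions: `toCandidates` is a hypothetical stand-in for `toToolNameCandidates` (whose body is not shown in this diff and may differ), and the rules-engine call, which can throw a security violation in the real handler, is left out.

```typescript
type PermissionResult =
  | { behavior: "allow" }
  | { behavior: "deny"; message: string; interrupt: true };

// Hypothetical candidate expansion: the full name plus the segment after the
// last "__", so an MCP-qualified name can match a bare allowlist entry.
function toCandidates(toolName: string): string[] {
  const trimmed = toolName.trim();
  if (!trimmed) {
    return [];
  }
  const separator = trimmed.lastIndexOf("__");
  const bare = separator >= 0 ? trimmed.slice(separator + 2) : trimmed;
  return bare && bare !== trimmed ? [trimmed, bare] : [trimmed];
}

// Allowlist-only gate: any candidate present in the allowed set is permitted,
// everything else is denied with the message used in the diff.
function createGate(allowedTools: readonly string[]): (toolName: string) => PermissionResult {
  const allowset = new Set(allowedTools);
  return (toolName) => {
    const allowMatch = toCandidates(toolName).find((candidate) => allowset.has(candidate));
    if (allowMatch === undefined) {
      return {
        behavior: "deny",
        message: `Tool "${toolName}" is not in the resolved execution allowlist.`,
        interrupt: true,
      };
    }
    return { behavior: "allow" };
  };
}

const gate = createGate(["read_file", "write_file"]);
const allowed = gate("mcp__claude-task-master__read_file"); // behavior: "allow"
const denied = gate("mcp__claude-task-master__rm"); // behavior: "deny"
```

Collapsing the old banlist and allowlist branches into one lookup means there is a single place where a tool can be admitted, which is easier to audit than two partially overlapping checks.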


@@ -76,6 +76,21 @@ test("parses a valid AgentManifest", () => {
   assert.equal(manifest.relationships.length, 1);
 });
+test("parses optional persona modelConstraint", () => {
+  const manifest = validManifest() as {
+    personas: Array<Record<string, unknown>>;
+  };
+  manifest.personas[1] = {
+    ...manifest.personas[1],
+    modelConstraint: "claude-3-haiku",
+  };
+  const parsed = parseAgentManifest(manifest);
+  const coder = parsed.personas.find((persona) => persona.id === "coder");
+  assert.ok(coder);
+  assert.equal(coder.modelConstraint, "claude-3-haiku");
+});
 test("rejects pipeline cycles", () => {
   const manifest = validManifest() as {
     pipeline: {
@@ -136,3 +151,18 @@ test("rejects legacy edge trigger aliases", () => {
     /unsupported event "onValidationFail"/,
   );
 });
+test("rejects empty persona modelConstraint", () => {
+  const manifest = validManifest() as {
+    personas: Array<Record<string, unknown>>;
+  };
+  manifest.personas[0] = {
+    ...manifest.personas[0],
+    modelConstraint: " ",
+  };
+  assert.throws(
+    () => parseAgentManifest(manifest),
+    /modelConstraint/,
+  );
+});


@@ -193,7 +193,10 @@ test("runs DAG pipeline with state-dependent routing and retry behavior", async
       },
       coder: async (input): Promise<ActorExecutionResult> => {
         assert.match(input.prompt, /AIOPS-123/);
-        assert.deepEqual(input.toolClearance.allowlist, ["read_file", "write_file"]);
+        assert.deepEqual(input.executionContext.allowedTools, ["read_file", "write_file"]);
+        assert.equal(input.executionContext.phase, "coder-1");
+        assert.equal(typeof input.executionContext.modelConstraint, "string");
+        assert.ok(input.executionContext.modelConstraint.length > 0);
         assert.ok(input.security);
         coderAttempts += 1;
         if (coderAttempts === 1) {
@@ -254,7 +257,7 @@ test("runs DAG pipeline with state-dependent routing and retry behavior", async
   assert.deepEqual(engine.planChildPersonas({ parentPersonaId: "task", depth: 1 }), ["coder"]);
 });
-test("injects mcp registry/config helpers and enforces Claude tool gate in actor executor", async () => {
+test("injects resolved mcp/helpers and enforces Claude tool gate in actor executor", async () => {
   const workspaceRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-workspace-"));
   const stateRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-session-state-"));
   const projectContextPath = resolve(stateRoot, "project-context.json");
@@ -302,6 +305,7 @@ test("injects mcp registry/config helpers and enforces Claude tool gate in actor
         id: "task",
         displayName: "Task",
         systemPromptTemplate: "Task executor",
+        modelConstraint: "claude-3-haiku",
         toolClearance: {
           allowlist: ["read_file", "write_file"],
           banlist: ["rm"],
@@ -340,7 +344,11 @@ test("injects mcp registry/config helpers and enforces Claude tool gate in actor
     },
     actorExecutors: {
       task_actor: async (input) => {
-        assert.equal(input.mcp.registry, customRegistry);
+        assert.deepEqual(input.executionContext.allowedTools, ["read_file", "write_file"]);
+        assert.equal(input.executionContext.phase, "task-node");
+        assert.equal(input.executionContext.modelConstraint, "claude-3-haiku");
+        assert.equal(input.executionContext.security.worktreePath, workspaceRoot);
+        assert.equal(input.executionContext.security.violationMode, "hard_abort");
         const codexConfig = input.mcp.resolveConfig({
           providerHint: "codex",
@@ -350,7 +358,11 @@ test("injects mcp registry/config helpers and enforces Claude tool gate in actor
         ];
         assert.ok(codexServer);
         assert.deepEqual(codexServer.enabled_tools, ["read_file", "write_file"]);
-        assert.deepEqual(codexServer.disabled_tools, ["rm"]);
+        assert.deepEqual(input.mcp.allowedTools, ["read_file", "write_file"]);
+        assert.deepEqual(
+          input.mcp.filterToolsForProvider(["read_file", "search", "write_file"]),
+          ["read_file", "write_file"],
+        );
         const claudeConfig = input.mcp.resolveConfig({
           providerHint: "claude",
@@ -371,25 +383,31 @@ test("injects mcp registry/config helpers and enforces Claude tool gate in actor
           toolUseID: "allow-1",
         });
-        const denyBlocked = await canUseTool(
-          "mcp__claude-task-master__rm",
-          {},
-          {
-            signal: new AbortController().signal,
-            toolUseID: "deny-1",
-          },
+        await assert.rejects(
+          () =>
+            canUseTool(
+              "mcp__claude-task-master__rm",
+              {},
+              {
+                signal: new AbortController().signal,
+                toolUseID: "deny-1",
+              },
+            ),
+          /Tool .* is not present in allowlist/,
         );
-        assert.equal(denyBlocked.behavior, "deny");
-        const denyMissingAllowlist = await canUseTool(
-          "mcp__claude-task-master__search",
-          {},
-          {
-            signal: new AbortController().signal,
-            toolUseID: "deny-2",
-          },
+        await assert.rejects(
+          () =>
+            canUseTool(
+              "mcp__claude-task-master__search",
+              {},
+              {
+                signal: new AbortController().signal,
+                toolUseID: "deny-2",
+              },
+            ),
+          /Tool .* is not present in allowlist/,
         );
-        assert.equal(denyMissingAllowlist.behavior, "deny");
         return {
           status: "success",