Compare commits
2 Commits
45374a033b...90725eaae8

| Author | SHA1 | Date |
|---|---|---|
| | 90725eaae8 | |
| | 7727612ce9 | |
@@ -16,6 +16,7 @@ CLAUDE_CODE_OAUTH_TOKEN=
ANTHROPIC_API_KEY=
CLAUDE_MODEL=
CLAUDE_CODE_PATH=
CLAUDE_MAX_TURNS=2
# Claude binary observability: off | stdout | file | both
CLAUDE_OBSERVABILITY_MODE=off
# CLAUDE_OBSERVABILITY_VERBOSITY: summary | full
@@ -52,7 +53,7 @@ AGENT_PORT_LOCK_DIR=.ai_ops/locks/ports
AGENT_DISCOVERY_FILE_RELATIVE_PATH=.agent-context/resources.json

# Security middleware
-# AGENT_SECURITY_VIOLATION_MODE: hard_abort | validation_fail
+# AGENT_SECURITY_VIOLATION_MODE: hard_abort | validation_fail | dangerous_warn_only
AGENT_SECURITY_VIOLATION_MODE=hard_abort
AGENT_SECURITY_ALLOWED_BINARIES=git,npm,node,cat,ls,pwd,echo,bash,sh
AGENT_SECURITY_COMMAND_TIMEOUT_MS=120000

@@ -109,7 +109,9 @@ Provider mode notes:
- `provider=codex` uses existing OpenAI/Codex auth settings (`OPENAI_AUTH_MODE`, `CODEX_API_KEY`, `OPENAI_API_KEY`).
- `provider=claude` uses Claude auth resolution (`CLAUDE_CODE_OAUTH_TOKEN` preferred, otherwise `ANTHROPIC_API_KEY`, or existing Claude Code login state).
- `CLAUDE_MODEL` should be a Claude model id/alias recognized by Claude Code (for example `claude-sonnet-4-6`); `anthropic/...` prefixes are normalized automatically.
- `CLAUDE_MAX_TURNS` controls the per-query Claude turn budget (default `2`).
- Claude provider runs can emit Claude SDK/CLI internals to stdout and/or NDJSON with `CLAUDE_OBSERVABILITY_*` settings.
- UI session-mode provider runs execute directly in orchestration-assigned task/base worktrees; provider adapters do not allocate additional nested worktrees.

## Manifest Semantics

@@ -271,6 +273,7 @@ jq -c 'select(.severity=="critical")' .ai_ops/events/runtime-events.ndjson
- Pipeline behavior on `SecurityViolationError` is configurable:
  - `hard_abort` (default)
  - `validation_fail` (retry-unrolled remediation)
  - `dangerous_warn_only` (logs violations and continues execution; high risk)

## Environment Variables

@@ -285,6 +288,7 @@ jq -c 'select(.severity=="critical")' .ai_ops/events/runtime-events.ndjson
- `ANTHROPIC_API_KEY` (used when `CLAUDE_CODE_OAUTH_TOKEN` is unset)
- `CLAUDE_MODEL`
- `CLAUDE_CODE_PATH`
- `CLAUDE_MAX_TURNS` (integer >= 1, defaults to `2`)
- `CLAUDE_OBSERVABILITY_MODE` (`off`, `stdout`, `file`, or `both`)
- `CLAUDE_OBSERVABILITY_VERBOSITY` (`summary` or `full`)
- `CLAUDE_OBSERVABILITY_LOG_PATH`
@@ -322,7 +326,7 @@ jq -c 'select(.severity=="critical")' .ai_ops/events/runtime-events.ndjson

### Security Middleware

-- `AGENT_SECURITY_VIOLATION_MODE` (`hard_abort` or `validation_fail`)
+- `AGENT_SECURITY_VIOLATION_MODE` (`hard_abort`, `validation_fail`, or `dangerous_warn_only`)
- `AGENT_SECURITY_ALLOWED_BINARIES`
- `AGENT_SECURITY_COMMAND_TIMEOUT_MS`
- `AGENT_SECURITY_AUDIT_LOG_PATH`

@@ -37,6 +37,11 @@ Before each actor invocation, orchestration resolves an immutable `ResolvedExecu

This keeps orchestration policy resolution separate from executor enforcement. Executors do not need to parse manifests or MCP registry internals.

Worktree ownership invariant:

- In UI session mode, orchestration/session lifecycle is the single owner of git worktree allocation.
- Provider adapters (Codex/Claude runtime wrappers) must execute inside `ResolvedExecutionContext.security.worktreePath` and must not provision independent worktrees.

## Execution topology model

- Pipeline graph execution is DAG-based with ready-node frontiers.

@@ -30,6 +30,7 @@ This middleware provides a first-pass hardening layer for agent-executed shell c

- `hard_abort` (default): fail fast and stop the pipeline.
- `validation_fail`: map violation to retry-unrolled behavior so the actor can attempt a compliant alternative.
- `dangerous_warn_only`: emit security audit/runtime events but continue execution. This is intentionally unsafe and should only be used for temporary unblock/debug workflows.

## MCP integration

human_only_TODO
@@ -10,21 +10,137 @@

# in progress

There is a major UI issue: app/provider logic is wrapped up in the UI, which I didn't know about or understand, and it has gotten out of hand. We need to rip it out and clean it up. Additionally, the worktrees are still not working as intended after roughly five attempts to fix them, so that code is officially spaghetti at this point.

Takeaway from the UI app-logic issue:

- Keep orchestration core in src/agents.
- Move backend run/session/provider code out of src/ui into src/control-plane (or src/backend).
- Keep src/ui as static/frontend + API client only.
- Treat provider prompt shaping as an adapter concern (src/providers), not a UI concern.

test results

- The session itself has a dir in worktrees that is a worktree.
- Then there is a base dir and a tasks dir; base is also a worktree.
- Inside of base, there is ANOTHER worktree.
- Inside of tasks is a product-intake??? directory.
- Code is being written in both product-intake and the worktree in the base/worktrees/d3e411... directory.

I don't think that the product guy is writing any files. FWIW, the dev agents are definitely making the app.

Log activity of the claude code binary:

- WHY IS IT STILL NOT LOGGING WHAT IS ACTUALLY HAPPENING?
- It will not explain it; it just keeps adding different logs.

Test run: they are writing files!

# problem 1 - logging

Logging is still in truly terrible shape.

# problem 2 - worktree

The worktree situation is insanity. The agents are getting confused because they can see some of the orchestration infrastructure. Going forward they need to be in a clean room and know nothing about the world outside of their project.

# problem 3 - task management/product context being passed in its entirety

The dev agents, for some reason, have the entire task list in their context.

# Scheduled

So yes, the UI growing into "its own project" increases risk because orchestration logic leaks into UI-layer services.

Best refactor target:

1. Make UI a thin transport layer (HTTP in/out, no resource ownership decisions).
2. Move run/session orchestration into one app-service module with a strict interface.
3. Enforce single-owner invariants in code (worktree owner = session lifecycle only).
4. Add contract tests around ownership boundaries (like the regression we just added).

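Steps 1-2 above boil down to a single narrow surface the UI is allowed to call. A minimal sketch of that boundary, with an in-memory fake for the contract tests mentioned in step 4 — the interface and class names here are made up for illustration, not existing modules:

```typescript
// The only surface src/ui may touch; orchestration details stay behind it.
interface RunSessionService {
  createSession(projectPath: string): { sessionId: string };
  closeSession(sessionId: string): void;
  isOpen(sessionId: string): boolean;
}

// Minimal in-memory fake: useful for contract tests on the ownership boundary.
class InMemorySessionService implements RunSessionService {
  private open = new Set<string>();
  private next = 0;

  createSession(_projectPath: string): { sessionId: string } {
    const sessionId = `session-${this.next++}`;
    this.open.add(sessionId);
    return { sessionId };
  }

  closeSession(sessionId: string): void {
    if (!this.open.delete(sessionId)) {
      throw new Error(`Unknown session: ${sessionId}`);
    }
  }

  isOpen(sessionId: string): boolean {
    return this.open.has(sessionId);
  }
}
```

Contract tests then exercise the interface only, so swapping the fake for the real control-plane implementation cannot silently widen the UI's reach.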
What even is src/examples????

Clean up events/locks/ports (may not be needed with the new session work?).

The UI is gargantuan - it needs a full rewrite in a different dir or something.

- The AIs aren't actually writing to the directory.
- The UI is bad, and kind of slow.
- I think the banned-command thing is kind of restrictive; I don't know if they will really be able to do anything.
- Codex isn't working?
- I don't even know if this runs on Linux at all.
- What is even happening in the backend? I don't see any logs for it anywhere.

# identify potential design conflict: dag vs things that happen not on the dag?
## linked to confusion around max env vars and what they do

- "This runtime enforces a DAG, so merge-conflict retry loops are multi-run (rerun the same managed session) rather than in-run graph cycles."
- What are the implications of this? Is this a stopgap until we do stricter control at the ends (task manager + git worktree cleanup)? Or do we have a design inconsistency that we need to address?

For structured-dev-workflow specifically:

- Happy path is depth 4 (product -> task -> dev -> qa -> merge).
- The QA-fail/rework path reaches depth ~6.
- If merge-conflict resolver nodes are used, the path can reach ~7.

# confusion around max env vars and what they do
## linked to identify potential design conflict dag vs things that happen not on the dag?

PROBLEM

.env has AGENT_MAX_RECURSIVE_DEPTH=3, but structured-dev-workflow requires depth 4 (product -> task -> dev -> qa -> merge).

- Does this mean that if QA sends a task back to dev the depth is 5? Or does it literally mean the number of agents in a branch of the DAG?
- If it blocks, that's a problem, because it will just fail if we have to pass back.
- If it doesn't, that's a problem, because then it does nothing - we are either allowing them to loop infinitely creating themselves, OR we are relying on enforcing that elsewhere and we need to consolidate.

RESPONSE

Short answer: your concern is valid, and today the two depth limits are conflated more than they should be.

1. Does QA -> Dev make depth 5?

- In this system, depth is not "number of unique agents." It is path depth from the entry node (edge distance), plus recursive child depth when retries/fanout happen.
- Back-edges like qa -> dev to the same upstream node are not allowed because the pipeline must be a strict DAG (src/agents/manifest.ts).
- So "send back" is modeled as downstream rework nodes (qa -> dev-rework -> qa-rework -> merge), which does increase depth.

2. If it blocks, that's bad; if not, does the limit do nothing?

- It does block today, and it also still serves a purpose.
- PipelineExecutor increments node depth by edge traversal (src/agents/pipeline.ts:748, src/agents/pipeline.ts:815).
- But that same depth is passed into AgentManager.runRecursive (src/agents/pipeline.ts:875), and AgentManager enforces maxRecursiveDepth on that depth (src/agents/manager.ts:343).
- So AGENT_MAX_RECURSIVE_DEPTH is currently acting as a second pipeline-depth cap, not just a recursion cap.

For structured-dev-workflow specifically:

- Happy path is depth 4 (product -> task -> dev -> qa -> merge).
- The QA-fail/rework path reaches depth ~6.
- If merge-conflict resolver nodes are used, the path can reach ~7.
- So with .env values like AGENT_MAX_RECURSIVE_DEPTH=3 and AGENT_TOPOLOGY_MAX_DEPTH=4, fallback branches will fail.

SOLUTION

2. Structural fix (better):

- Decouple meanings:
  - AGENT_TOPOLOGY_MAX_DEPTH should gate DAG traversal depth only.
  - AGENT_MAX_RECURSIVE_DEPTH should gate recursive fanout/retry depth only.
- In practice: stop passing pipeline queue depth into manager recursive depth; start recursive runs at a local depth baseline per node.

3. Safety/clarity guard:

- Add a preflight check that computes the maximum possible DAG depth and warns/errors if env depth limits are below it.

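The preflight guard in item 3 can be sketched as a pure graph pass over the manifest: compute the longest path (edge distance, the same measure the depth discussion above uses) and compare it to the configured limit. The node/edge shapes below are simplified assumptions, not the real manifest schema:

```typescript
type Edge = { from: string; to: string };

// Longest path (in edges) from any node to a leaf. product -> task -> dev ->
// qa -> merge yields 4, matching the "depth 4" happy path described above.
function maxDagDepth(nodes: string[], edges: Edge[]): number {
  const memo = new Map<string, number>();
  const out = new Map<string, string[]>();
  for (const n of nodes) out.set(n, []);
  for (const e of edges) out.get(e.from)?.push(e.to);

  const depthFrom = (node: string): number => {
    const cached = memo.get(node);
    if (cached !== undefined) return cached;
    memo.set(node, 0); // placeholder; manifest validation should reject cycles anyway
    const children = out.get(node) ?? [];
    const d = children.length === 0 ? 0 : 1 + Math.max(...children.map(depthFrom));
    memo.set(node, d);
    return d;
  };

  return Math.max(0, ...nodes.map(depthFrom));
}

// Preflight: fail fast (before any agent runs) when env limits are too low.
function assertDepthLimits(nodes: string[], edges: Edge[], topologyMaxDepth: number): void {
  const required = maxDagDepth(nodes, edges);
  if (required > topologyMaxDepth) {
    throw new Error(
      `Pipeline requires depth ${required} but AGENT_TOPOLOGY_MAX_DEPTH=${topologyMaxDepth}`,
    );
  }
}
```

Running this at session start would surface the AGENT_MAX_RECURSIVE_DEPTH=3 vs depth-4 mismatch as a clear config error instead of a mid-run failure on the rework branch.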
# other scheduled

- persona definitions

@@ -556,3 +672,149 @@ Manifest Builder: A UI to visually build or edit the AgentManifest (Schema "1"),

Security Policy Management: An interface mapped to src/security/schemas.ts. This allows admins to define AGENT_SECURITY_ALLOWED_BINARIES, toggle AGENT_SECURITY_VIOLATION_MODE (hard_abort vs validation_fail), and manage MCP tool allowlists/banlists.

Environment & Resource Limits: Simple forms to configure agent manager limits (AGENT_MAX_CONCURRENT) and port block sizing without manually editing the .env file.

# Architecture Requirements: Session Isolation & Task-Scoped Worktrees

## Objective

Disentangle the `ai_ops` control plane from the target project data plane. Replace the implicit `process.cwd()` execution anchor with a formal Session lifecycle and dynamic, task-scoped Git worktrees. This ensures concurrent agents operate in isolated environments and prevents the runtime from mutating its own repository.

## 1. Domain Definitions

- **Target Project:** The absolute local path to the repository being operated on (e.g., `/home/user/target_repo`).

- **Session (The Clean Room):** A persistent orchestration context strictly bound to one Target Project. It maintains a "Base Workspace" (a localized Git checkout/branch) that represents the integrated, approved state of the current work period.

- **Task Worktree:** An ephemeral Git worktree branched from the Session's Base Workspace. It is scoped strictly to a `taskId`, enabling multi-agent handoffs (e.g., Coder -> QA) within the same isolated environment before merging back to the Base Workspace.

## 2. Core Data Model Updates

Introduce explicit types to track project binding and resource ownership.

- **API Payloads:**

  ```ts
  interface CreateSessionRequest {
    projectPath: string; // Absolute local path to target repo
  }
  ```

- **Session State (`AGENT_STATE_ROOT`):**

  ```ts
  interface SessionMetadata {
    sessionId: string;
    projectPath: string;
    sessionStatus: 'active' | 'suspended' | 'closed';
    baseWorkspacePath: string; // e.g., ${AGENT_WORKTREE_ROOT}/${sessionId}/base
    createdAt: string;
    updatedAt: string;
  }
  ```

- **Project Context (`src/agents/project-context.ts`):**

  Update the `taskQueue` schema to act as the persistent ledger for worktree ownership.

  ```ts
  interface TaskRecord {
    taskId: string;
    status: 'pending' | 'in_progress' | 'review' | 'merged' | 'failed';
    worktreePath?: string; // e.g., ${AGENT_WORKTREE_ROOT}/${sessionId}/tasks/${taskId}
  }
  ```

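The `worktreePath` convention in the comment above can be made explicit with a small resolver. This is a sketch; `sanitizeSegment` is a simplified stand-in for whatever validation the real lifecycle module performs:

```typescript
import { resolve } from "node:path";

// Reject separators and traversal so session/task ids cannot escape the root.
function sanitizeSegment(value: string, label: string): string {
  const trimmed = value.trim();
  if (!/^[A-Za-z0-9._-]+$/.test(trimmed) || trimmed === "." || trimmed === "..") {
    throw new Error(`Invalid ${label} segment: "${value}"`);
  }
  return trimmed;
}

// ${worktreeRoot}/${sessionId}/tasks/${taskId}, per the TaskRecord comment.
function resolveTaskWorktreePath(worktreeRoot: string, sessionId: string, taskId: string): string {
  return resolve(
    worktreeRoot,
    sanitizeSegment(sessionId, "session"),
    "tasks",
    sanitizeSegment(taskId, "task"),
  );
}
```

Centralizing the convention in one function is what lets the project-context ledger and the provisioning layer agree on paths without re-deriving them.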
## 3. API & Control Plane (`src/ui/server.ts`)

Replace implicit session generation with an explicit lifecycle API.

- `POST /api/sessions`: Accepts `CreateSessionRequest`. Initializes the SessionMetadata and provisions the Base Workspace.
- `GET /api/sessions`: Returns existing sessions for resuming work across restarts.
- `POST /api/sessions/:id/run`: Triggers `SchemaDrivenExecutionEngine.runSession(...)`, passing the resolved `SessionMetadata`.
- `POST /api/sessions/:id/close`: Prunes all task worktrees, optionally merges the Base Workspace back to the original `projectPath`, and marks the session closed.

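The lifecycle implied by `sessionStatus` is worth enforcing centrally so no endpoint can, say, resurrect a closed session. A sketch of a transition guard — the transition table is an assumption inferred from the three statuses above, not an existing module:

```typescript
type SessionStatus = "active" | "suspended" | "closed";

// Allowed transitions: closed is terminal; suspend/resume toggle the others.
const TRANSITIONS: Record<SessionStatus, SessionStatus[]> = {
  active: ["suspended", "closed"],
  suspended: ["active", "closed"],
  closed: [],
};

function canTransition(from: SessionStatus, to: SessionStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// Endpoints call this before persisting a status change.
function transition(from: SessionStatus, to: SessionStatus): SessionStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal session transition: ${from} -> ${to}`);
  }
  return to;
}
```

With this in place, `POST /api/sessions/:id/close` becomes `transition(current, "closed")` plus the worktree pruning, and any illegal sequence fails loudly at the boundary.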
## 4. Provisioning Layer (`src/agents/provisioning.ts`)

Remove all fallback logic relying on `process.cwd()`.

- **Session Initialization:** Clone or create a primary worktree of `projectPath` into `baseWorkspacePath`.
- **Task Provisioning:** When a task begins execution, check out a new branch from the Base Workspace and provision it at `worktreePath`.
- **Security & MCP Isolation:** `SecureCommandExecutor` and MCP handler configurations must dynamically anchor their working directories to the specific `worktreePath` injected into the execution context, preventing traversal outside the task scope.

## 5. Orchestration & Routing (`src/agents/pipeline.ts`)

Implement the hybrid routing model: Domain Events for control flow, Project Context for resource lookup.

1. **The Signal (Domain Events):** When a Coder agent finishes, it emits a standard domain event (e.g., `task_ready_for_review` with the `taskId`). The pipeline routes this event to trigger the QA agent.
2. **The Map (Project Context):** Before initializing the QA agent's sandbox, the lifecycle observer/engine reads `project-context.ts` to look up the `worktreePath` associated with that `taskId`.
3. **The Execution:** The QA agent boots inside the exact same Task Worktree the Coder agent just vacated, preserving all uncommitted files and local state.
4. **The Merge:** Upon successful QA (e.g., `validation_passed`), the orchestration layer commits the Task Worktree, merges it into the Base Workspace, and deletes the Task Worktree.

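The signal/map split in steps 1-2 can be sketched as a plain routing table plus a context lookup. Only `task_ready_for_review` and `validation_passed` come from the text above; the persona names and record shape are illustrative assumptions:

```typescript
type DomainEvent = { type: string; taskId: string };
type TaskRecord = { taskId: string; worktreePath?: string };

// The Signal: which persona handles which control-flow event.
const EVENT_ROUTES: Record<string, string> = {
  task_ready_for_review: "qa",
  validation_passed: "merge",
};

// The Map: the worktree location comes from project context, never from the
// event payload, so ownership stays with the session lifecycle.
function resolveNextStep(
  event: DomainEvent,
  taskQueue: TaskRecord[],
): { persona: string; worktreePath: string } {
  const persona = EVENT_ROUTES[event.type];
  if (!persona) throw new Error(`Unrouted event: ${event.type}`);
  const record = taskQueue.find((t) => t.taskId === event.taskId);
  if (!record || !record.worktreePath) {
    throw new Error(`No worktree recorded for task ${event.taskId}`);
  }
  return { persona, worktreePath: record.worktreePath };
}
```

Keeping the route table data-driven also gives the conflict events in the next section an obvious place to plug in.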
# turning merge conflicts into first-class orchestration events instead of hard exceptions

1. Add new domain events:
   - merge_conflict_detected
   - merge_conflict_resolved
   - merge_conflict_unresolved (after max attempts)
   - optionally merge_retry_started

2. Extend the task state model with conflict-aware statuses:
   - add conflict (and maybe resolving_conflict)

3. Change the merge code path to return structured outcomes instead of throwing on conflict:
   - success
   - conflict (with conflictFiles, mergeBase, taskId, worktreePath)
   - fatal_error
   - only throw for truly fatal cases (repo corruption, missing worktree, etc.)

4. On conflict, patch project context and emit an event:
   - set the task to conflict
   - store conflict metadata in task.metadata
   - emit merge_conflict_detected

5. Route conflict events to dedicated resolver personas in the pipeline:
   - a Coder/QA conflict-resolver agent opens the same worktreePath
   - resolves conflict markers, runs checks
   - emits merge_conflict_resolved

6. Retry the merge after the resolution event:
   - the integration node attempts the merge again
   - if successful, emit branch_merged, mark merged, prune the task worktree
   - if still conflicting, loop with bounded retries

7. Add retry guardrails:
   - max conflict-resolution attempts per task
   - on exhaustion, emit merge_conflict_unresolved and stop cleanly (do not crash the whole session)

8. Apply the same pattern to session close (base -> project) so close can become a conflict workflow or a "closed_with_conflicts" state, rather than a hard failure.

This keeps the app stable and lets agents handle conflicts as part of normal orchestration.

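Items 3-4 amount to a discriminated union for merge results plus a mapping to events and status patches. A sketch — the outcome field names follow item 3, while the handler shape is an assumption about how the dispatcher might consume it:

```typescript
// Item 3: structured outcomes instead of throwing on conflict.
type MergeOutcome =
  | { kind: "success" }
  | {
      kind: "conflict";
      taskId: string;
      worktreePath: string;
      mergeBase: string;
      conflictFiles: string[];
    }
  | { kind: "fatal_error"; reason: string };

// Item 4: map each outcome to the event to emit and the task-status patch.
function handleMergeOutcome(outcome: MergeOutcome): { event: string; taskStatus: string } {
  switch (outcome.kind) {
    case "success":
      return { event: "branch_merged", taskStatus: "merged" };
    case "conflict":
      return { event: "merge_conflict_detected", taskStatus: "conflict" };
    case "fatal_error":
      // Only truly fatal cases escape as exceptions (item 3's last bullet).
      throw new Error(outcome.reason);
  }
}
```

Because the conflict variant carries `worktreePath`, the resolver persona in item 5 can be routed straight into the same worktree without another lookup.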
@@ -26,6 +26,7 @@ import type { JsonObject } from "./types.js";
import { SessionWorktreeManager, type SessionMetadata } from "./session-lifecycle.js";
import {
  SecureCommandExecutor,
  type SecurityViolationHandling,
  type SecurityAuditEvent,
  type SecurityAuditSink,
  SecurityRulesEngine,

@@ -46,7 +47,7 @@ export type OrchestrationSettings = {
  maxRetries: number;
  maxChildren: number;
  mergeConflictMaxAttempts: number;
-  securityViolationHandling: "hard_abort" | "validation_fail";
+  securityViolationHandling: SecurityViolationHandling;
  runtimeContext: Record<string, string | number | boolean>;
};

@@ -211,6 +212,9 @@ function createActorSecurityContext(input: {
      blockedEnvAssignments: ["AGENT_STATE_ROOT", "AGENT_PROJECT_CONTEXT_PATH"],
    },
    auditSink,
    {
      violationHandling: input.settings.securityViolationHandling,
    },
  );

  return {
@@ -342,6 +346,7 @@ export class SchemaDrivenExecutionEngine {
    this.sessionWorktreeManager = new SessionWorktreeManager({
      worktreeRoot: resolve(this.settings.workspaceRoot, this.config.provisioning.gitWorktree.rootDirectory),
      baseRef: this.config.provisioning.gitWorktree.baseRef,
      targetPath: this.config.provisioning.gitWorktree.targetPath,
    });

    this.actorExecutors = toExecutorMap(input.actorExecutors);
@@ -426,7 +431,11 @@ export class SchemaDrivenExecutionEngine {
  }): Promise<PipelineRunSummary> {
    const managerSessionId = `${input.sessionId}__pipeline`;
    const managerSession = this.manager.createSession(managerSessionId);
-   const workspaceRoot = input.sessionMetadata?.baseWorkspacePath ?? this.settings.workspaceRoot;
+   const workspaceRoot = input.sessionMetadata
+     ? this.sessionWorktreeManager.resolveWorkingDirectoryForWorktree(
+         input.sessionMetadata.baseWorkspacePath,
+       )
+     : this.settings.workspaceRoot;
    const projectContextStore = input.sessionMetadata
      ? new FileSystemProjectContextStore({
          filePath: resolveSessionProjectContextPath(this.settings.stateRoot, input.sessionId),
@@ -531,6 +540,7 @@ export class SchemaDrivenExecutionEngine {

    return {
      taskId,
      workingDirectory: ensured.taskWorkingDirectory,
      worktreePath: ensured.taskWorktreePath,
      statusAtStart,
      ...(existing?.metadata ? { metadata: existing.metadata } : {}),
@@ -63,6 +63,7 @@ export type ActorExecutionResult = {
export type ActorToolPermissionResult =
  | {
      behavior: "allow";
      updatedInput?: Record<string, unknown>;
      toolUseID?: string;
    }
  | {
@@ -171,6 +172,7 @@ export type ActorExecutionSecurityContext = {

export type TaskExecutionResolution = {
  taskId: string;
  workingDirectory: string;
  worktreePath: string;
  statusAtStart: string;
  metadata?: JsonObject;
@@ -941,7 +943,7 @@ export class PipelineExecutor {
      node,
      toolClearance,
      prompt,
-     worktreePathOverride: taskResolution?.worktreePath,
+     worktreePathOverride: taskResolution?.workingDirectory,
    });

    const result = await this.invokeActorExecutor({
@@ -970,6 +972,7 @@ export class PipelineExecutor {
      ...(taskResolution
        ? {
            taskId: taskResolution.taskId,
            workingDirectory: taskResolution.workingDirectory,
            worktreePath: taskResolution.worktreePath,
          }
        : {}),
@@ -1309,6 +1312,7 @@ export class PipelineExecutor {
    const createToolPermissionHandler = (): ActorToolPermissionHandler =>
      this.createToolPermissionHandler({
        allowedTools: executionContext.allowedTools,
        violationMode: executionContext.security.violationMode,
        sessionId: input.sessionId,
        nodeId: input.nodeId,
        attempt: input.attempt,
@@ -1326,6 +1330,7 @@ export class PipelineExecutor {

  private createToolPermissionHandler(input: {
    allowedTools: readonly string[];
    violationMode: SecurityViolationHandling;
    sessionId: string;
    nodeId: string;
    attempt: number;
@@ -1340,7 +1345,7 @@ export class PipelineExecutor {
      attempt: input.attempt,
    };

-   return async (toolName, _input, options) => {
+   return async (toolName, toolInput, options) => {
      const toolUseID = options.toolUseID;
      if (options.signal.aborted) {
        return {
@@ -1358,11 +1363,28 @@ export class PipelineExecutor {
        caseInsensitiveLookup: caseInsensitiveAllowLookup,
      });
      if (!allowMatch) {
-       rulesEngine?.assertToolInvocationAllowed({
-         tool: candidates[0] ?? toolName,
-         toolClearance: toolPolicy,
-         context: toolAuditContext,
-       });
+       if (rulesEngine) {
+         try {
+           rulesEngine.assertToolInvocationAllowed({
+             tool: candidates[0] ?? toolName,
+             toolClearance: toolPolicy,
+             context: toolAuditContext,
+           });
+         } catch (error) {
+           if (
+             !(input.violationMode === "dangerous_warn_only" && error instanceof SecurityViolationError)
+           ) {
+             throw error;
+           }
+         }
+       }
+       if (input.violationMode === "dangerous_warn_only") {
+         return {
+           behavior: "allow",
+           updatedInput: toolInput,
+           ...(toolUseID ? { toolUseID } : {}),
+         };
+       }
        return {
          behavior: "deny",
          message: `Tool "${toolName}" is not in the resolved execution allowlist.`,

@@ -1379,6 +1401,7 @@ export class PipelineExecutor {

      return {
        behavior: "allow",
        updatedInput: toolInput,
        ...(toolUseID ? { toolUseID } : {}),
      };
    };

@@ -358,13 +358,16 @@ export class FileSystemSessionMetadataStore {
export class SessionWorktreeManager {
  private readonly worktreeRoot: string;
  private readonly baseRef: string;
  private readonly targetPath?: string;

  constructor(input: {
    worktreeRoot: string;
    baseRef: string;
    targetPath?: string;
  }) {
    this.worktreeRoot = assertAbsolutePath(input.worktreeRoot, "worktreeRoot");
    this.baseRef = assertNonEmptyString(input.baseRef, "baseRef");
    this.targetPath = normalizeWorktreeTargetPath(input.targetPath, "targetPath");
  }

  resolveBaseWorkspacePath(sessionId: string): string {
@@ -378,6 +381,11 @@ export class SessionWorktreeManager {
    return resolve(this.worktreeRoot, scopedSession, "tasks", scopedTask);
  }

  resolveWorkingDirectoryForWorktree(worktreePath: string): string {
    const normalizedWorktreePath = assertAbsolutePath(worktreePath, "worktreePath");
    return this.targetPath ? resolve(normalizedWorktreePath, this.targetPath) : normalizedWorktreePath;
  }

  private resolveBaseBranchName(sessionId: string): string {
    const scoped = sanitizeSegment(sessionId, "session");
    return `ai-ops/${scoped}/base`;
@@ -399,14 +407,13 @@ export class SessionWorktreeManager {

    await mkdir(dirname(baseWorkspacePath), { recursive: true });

-   const alreadyExists = await pathExists(baseWorkspacePath);
-   if (alreadyExists) {
-     return;
+   if (!(await pathExists(baseWorkspacePath))) {
+     const repoRoot = await runGit(["-C", projectPath, "rev-parse", "--show-toplevel"]);
+     const branchName = this.resolveBaseBranchName(input.sessionId);
+     await runGit(["-C", repoRoot, "worktree", "add", "-B", branchName, baseWorkspacePath, this.baseRef]);
    }

-   const repoRoot = await runGit(["-C", projectPath, "rev-parse", "--show-toplevel"]);
-   const branchName = this.resolveBaseBranchName(input.sessionId);
-   await runGit(["-C", repoRoot, "worktree", "add", "-B", branchName, baseWorkspacePath, this.baseRef]);
+   await this.ensureWorktreeTargetPath(baseWorkspacePath);
  }

  async ensureTaskWorktree(input: {
@@ -416,6 +423,7 @@ export class SessionWorktreeManager {
    existingWorktreePath?: string;
  }): Promise<{
    taskWorktreePath: string;
    taskWorkingDirectory: string;
  }> {
    const baseWorkspacePath = assertAbsolutePath(input.baseWorkspacePath, "baseWorkspacePath");
    const maybeExisting = input.existingWorktreePath?.trim();
@@ -451,8 +459,10 @@ export class SessionWorktreeManager {
    if (addResult.exitCode !== 0) {
      const attachedAfterFailure = await this.findWorktreePathForBranch(baseWorkspacePath, branchName);
      if (attachedAfterFailure === worktreePath && (await pathExists(worktreePath))) {
        const taskWorkingDirectory = await this.ensureWorktreeTargetPath(worktreePath);
        return {
          taskWorktreePath: worktreePath,
          taskWorkingDirectory,
        };
      }
      throw new Error(
@@ -462,8 +472,10 @@ export class SessionWorktreeManager {
      }
    }

    const taskWorkingDirectory = await this.ensureWorktreeTargetPath(worktreePath);
    return {
      taskWorktreePath: worktreePath,
      taskWorkingDirectory,
    };
  }

@@ -780,4 +792,69 @@ export class SessionWorktreeManager {
    }
    return parseGitWorktreeRecords(result.stdout);
  }

  private async ensureWorktreeTargetPath(worktreePath: string): Promise<string> {
    if (this.targetPath) {
      await runGit(["-C", worktreePath, "sparse-checkout", "init", "--cone"]);
      await runGit(["-C", worktreePath, "sparse-checkout", "set", this.targetPath]);
    }

    const workingDirectory = this.resolveWorkingDirectoryForWorktree(worktreePath);
    let workingDirectoryStats;
    try {
      workingDirectoryStats = await stat(workingDirectory);
    } catch (error) {
      if ((error as NodeJS.ErrnoException).code === "ENOENT") {
        if (this.targetPath) {
          throw new Error(
            `Configured worktree target path "${this.targetPath}" is not a directory in ref "${this.baseRef}".`,
          );
        }
        throw new Error(`Worktree path "${workingDirectory}" does not exist.`);
      }
      throw error;
    }

    if (!workingDirectoryStats.isDirectory()) {
      if (this.targetPath) {
        throw new Error(
          `Configured worktree target path "${this.targetPath}" is not a directory in ref "${this.baseRef}".`,
        );
      }
      throw new Error(`Worktree path "${workingDirectory}" is not a directory.`);
    }

    return workingDirectory;
  }
}

function normalizeWorktreeTargetPath(value: string | undefined, key: string): string | undefined {
  if (value === undefined) {
    return undefined;
  }

  const trimmed = value.trim();
  if (trimmed.length === 0) {
    return undefined;
  }

  const slashNormalized = trimmed.replace(/\\/g, "/");
  if (isAbsolute(slashNormalized) || /^[a-zA-Z]:\//.test(slashNormalized)) {
    throw new Error(`${key} must be a relative path within the repository worktree.`);
  }

  const normalizedSegments = slashNormalized
    .split("/")
    .map((segment) => segment.trim())
    .filter((segment) => segment.length > 0 && segment !== ".");

  if (normalizedSegments.some((segment) => segment === "..")) {
    throw new Error(`${key} must not contain ".." path segments.`);
  }

  if (normalizedSegments.length === 0) {
    return undefined;
  }

  return normalizedSegments.join("/");
}

@@ -16,6 +16,7 @@ export type ProviderRuntimeConfig = {
  anthropicApiKey?: string;
  claudeModel?: string;
  claudeCodePath?: string;
  claudeMaxTurns: number;
  claudeObservability: ClaudeObservabilityRuntimeConfig;
};

@@ -136,6 +137,8 @@ const DEFAULT_CLAUDE_OBSERVABILITY: ClaudeObservabilityRuntimeConfig = {
  debugLogPath: undefined,
};

const DEFAULT_CLAUDE_MAX_TURNS = 2;

function readOptionalString(
  env: NodeJS.ProcessEnv,
  key: string,
@@ -401,6 +404,12 @@ export function loadConfig(env: NodeJS.ProcessEnv = process.env): Readonly<AppCo
      anthropicApiKey,
      claudeModel: normalizeClaudeModel(readOptionalString(env, "CLAUDE_MODEL")),
      claudeCodePath: readOptionalString(env, "CLAUDE_CODE_PATH"),
      claudeMaxTurns: readIntegerWithBounds(
        env,
        "CLAUDE_MAX_TURNS",
        DEFAULT_CLAUDE_MAX_TURNS,
        { min: 1 },
      ),
      claudeObservability: {
        mode: parseClaudeObservabilityMode(
          readStringWithFallback(

@@ -19,7 +19,7 @@ function requiredPrompt(argv: string[]): string {

function buildOptions(config = getConfig()): Options {
  return {
-   maxTurns: 1,
+   maxTurns: config.provider.claudeMaxTurns,
    ...(config.provider.claudeModel ? { model: config.provider.claudeModel } : {}),
    ...(config.provider.claudeCodePath
      ? { pathToClaudeCodeExecutable: config.provider.claudeCodePath }

@@ -8,6 +8,7 @@ import {
import {
  parseShellValidationPolicy,
  parseToolClearancePolicy,
  type SecurityViolationHandling,
  type ShellValidationPolicy,
  type ToolClearancePolicy,
} from "./schemas.js";
@@ -62,6 +63,10 @@ function normalizeToken(value: string): string {
  return value.trim();
}

function normalizeLookupToken(value: string): string {
  return normalizeToken(value).toLowerCase();
}

function hasPathTraversalSegment(token: string): boolean {
  const normalized = token.replaceAll("\\", "/");
  if (normalized === ".." || normalized.startsWith("../") || normalized.endsWith("/..")) {
@@ -100,6 +105,18 @@ function toToolSet(values: readonly string[]): Set<string> {
  return out;
}

function toCaseInsensitiveLookup(values: readonly string[]): Map<string, string> {
  const out = new Map<string, string>();
  for (const value of values) {
    const normalized = normalizeLookupToken(value);
    if (!normalized || out.has(normalized)) {
      continue;
    }
    out.set(normalized, value);
  }
  return out;
}

function toNow(): string {
  return new Date().toISOString();
}
@@ -133,10 +150,14 @@ export class SecurityRulesEngine {
  private readonly blockedEnvAssignments: Set<string>;
  private readonly worktreeRoot: string;
  private readonly protectedPaths: string[];
  private readonly violationHandling: SecurityViolationHandling;

  constructor(
    policy: ShellValidationPolicy,
    private readonly auditSink?: SecurityAuditSink,
    options?: {
      violationHandling?: SecurityViolationHandling;
    },
  ) {
    this.policy = parseShellValidationPolicy(policy);
    this.allowedBinaries = toToolSet(this.policy.allowedBinaries);
@@ -144,6 +165,7 @@ export class SecurityRulesEngine {
    this.blockedEnvAssignments = toToolSet(this.policy.blockedEnvAssignments);
    this.worktreeRoot = resolve(this.policy.worktreeRoot);
    this.protectedPaths = this.policy.protectedPaths.map((path) => resolve(path));
    this.violationHandling = options?.violationHandling ?? "hard_abort";
  }

  getPolicy(): ShellValidationPolicy {
@@ -212,6 +234,15 @@ export class SecurityRulesEngine {
        code: error.code,
        details: error.details,
      });
      if (this.violationHandling === "dangerous_warn_only") {
        return {
          cwd: resolvedCwd,
          parsed: {
            commandCount: 0,
            commands: [],
          },
        };
      }
      throw error;
    }

@@ -232,8 +263,11 @@ export class SecurityRulesEngine {
    };
  }): void {
    const policy = parseToolClearancePolicy(input.toolClearance);
    const normalizedTool = normalizeLookupToken(input.tool);
    const banlistLookup = toCaseInsensitiveLookup(policy.banlist);
    const allowlistLookup = toCaseInsensitiveLookup(policy.allowlist);

    if (policy.banlist.includes(input.tool)) {
    if (banlistLookup.has(normalizedTool)) {
      this.emit({
        ...toAuditContext(input.context),
        type: "tool.invocation_blocked",
@@ -252,7 +286,7 @@ export class SecurityRulesEngine {
      );
    }

    if (policy.allowlist.length > 0 && !policy.allowlist.includes(input.tool)) {
    if (policy.allowlist.length > 0 && !allowlistLookup.has(normalizedTool)) {
      this.emit({
        ...toAuditContext(input.context),
        type: "tool.invocation_blocked",
@@ -280,13 +314,15 @@

  filterAllowedTools(tools: string[], toolClearance: ToolClearancePolicy): string[] {
    const policy = parseToolClearancePolicy(toolClearance);
    const allowlistLookup = toCaseInsensitiveLookup(policy.allowlist);
    const banlistLookup = toCaseInsensitiveLookup(policy.banlist);

    const allowedByAllowlist =
      policy.allowlist.length === 0
        ? tools
        : tools.filter((tool) => policy.allowlist.includes(tool));
        : tools.filter((tool) => allowlistLookup.has(normalizeLookupToken(tool)));

    return allowedByAllowlist.filter((tool) => !policy.banlist.includes(tool));
    return allowedByAllowlist.filter((tool) => !banlistLookup.has(normalizeLookupToken(tool)));
  }

  private assertCwdBoundary(cwd: string): void {

@@ -157,11 +157,15 @@ export function parseParsedShellScript(input: unknown): ParsedShellScript {
  };
}

export type SecurityViolationHandling = "hard_abort" | "validation_fail";
export type SecurityViolationHandling =
  | "hard_abort"
  | "validation_fail"
  | "dangerous_warn_only";

export const securityViolationHandlingSchema = z.union([
  z.literal("hard_abort"),
  z.literal("validation_fail"),
  z.literal("dangerous_warn_only"),
]);

export function parseSecurityViolationHandling(input: unknown): SecurityViolationHandling {

@@ -1,5 +1,6 @@
import { resolve } from "node:path";
import { loadConfig, type AppConfig } from "../config.js";
import type { SecurityViolationHandling } from "../security/index.js";
import { parseEnvFile, writeEnvFileUpdates } from "./env-store.js";

export type RuntimeNotificationSettings = {
@@ -9,7 +10,7 @@ export type RuntimeNotificationSettings = {
};

export type SecurityPolicySettings = {
  violationMode: "hard_abort" | "validation_fail";
  violationMode: SecurityViolationHandling;
  allowedBinaries: string[];
  commandTimeoutMs: number;
  inheritedEnv: string[];

@@ -9,7 +9,6 @@ import {
import { isDomainEventType, type DomainEventEmission } from "../agents/domain-events.js";
import type { ActorExecutionInput, ActorExecutionResult, ActorExecutor } from "../agents/pipeline.js";
import { isRecord, type JsonObject, type JsonValue } from "../agents/types.js";
import { createSessionContext, type SessionContext } from "../examples/session-context.js";
import { ClaudeObservabilityLogger } from "./claude-observability.js";

export type RunProvider = "codex" | "claude";
@@ -17,7 +16,7 @@ export type RunProvider = "codex" | "claude";
export type ProviderRunRuntime = {
  provider: RunProvider;
  config: Readonly<AppConfig>;
  sessionContext: SessionContext;
  sharedEnv: Record<string, string>;
  claudeObservability: ClaudeObservabilityLogger;
  close: () => Promise<void>;
};
@@ -30,6 +29,16 @@ type ProviderUsage = {
  costUsd?: number;
};

function sanitizeEnv(input: Record<string, string | undefined>): Record<string, string> {
  const output: Record<string, string> = {};
  for (const [key, value] of Object.entries(input)) {
    if (typeof value === "string") {
      output[key] = value;
    }
  }
  return output;
}

const ACTOR_RESPONSE_SCHEMA = {
  type: "object",
  additionalProperties: true,
@@ -74,8 +83,6 @@ const CLAUDE_OUTPUT_FORMAT = {
  schema: ACTOR_RESPONSE_SCHEMA,
} as const;

const CLAUDE_PROVIDER_MAX_TURNS = 2;

function toErrorMessage(error: unknown): string {
  if (error instanceof Error) {
    return error.message;
@@ -83,6 +90,23 @@ function toErrorMessage(error: unknown): string {
  return String(error);
}

export function resolveProviderWorkingDirectory(actorInput: ActorExecutionInput): string {
  return actorInput.executionContext.security.worktreePath;
}

export function buildProviderRuntimeEnv(input: {
  runtime: ProviderRunRuntime;
  actorInput: ActorExecutionInput;
  includeClaudeAuth?: boolean;
}): Record<string, string> {
  const workingDirectory = resolveProviderWorkingDirectory(input.actorInput);
  return sanitizeEnv({
    ...input.runtime.sharedEnv,
    ...(input.includeClaudeAuth ? buildClaudeAuthEnv(input.runtime.config.provider) : {}),
    AGENT_WORKTREE_PATH: workingDirectory,
  });
}

function toJsonValue(value: unknown): JsonValue {
  return JSON.parse(JSON.stringify(value)) as JsonValue;
}
@@ -367,6 +391,7 @@ async function runCodexActor(input: {
  const prompt = buildActorPrompt(actorInput);
  const startedAt = Date.now();
  const apiKey = resolveOpenAiApiKey(runtime.config.provider);
  const workingDirectory = resolveProviderWorkingDirectory(actorInput);

  const codex = new Codex({
    ...(apiKey ? { apiKey } : {}),
@@ -376,20 +401,21 @@
    ...(actorInput.mcp.resolvedConfig.codexConfig
      ? { config: actorInput.mcp.resolvedConfig.codexConfig }
      : {}),
    env: runtime.sessionContext.runtimeInjection.env,
    env: buildProviderRuntimeEnv({
      runtime,
      actorInput,
    }),
  });

  const thread = codex.startThread({
    workingDirectory: runtime.sessionContext.runtimeInjection.workingDirectory,
    workingDirectory,
    skipGitRepoCheck: runtime.config.provider.codexSkipGitCheck,
  });

  const turn = await runtime.sessionContext.runInSession(() =>
    thread.run(prompt, {
      signal: actorInput.signal,
      outputSchema: ACTOR_RESPONSE_SCHEMA,
    }),
  );
  const turn = await thread.run(prompt, {
    signal: actorInput.signal,
    outputSchema: ACTOR_RESPONSE_SCHEMA,
  });

  const usage: ProviderUsage = {
    ...(turn.usage
@@ -457,6 +483,7 @@ function buildClaudeOptions(input: {
  actorInput: ActorExecutionInput;
}): Options {
  const { runtime, actorInput } = input;
  const workingDirectory = resolveProviderWorkingDirectory(actorInput);

  const authOptionOverrides = runtime.config.provider.anthropicOauthToken
    ? { authToken: runtime.config.provider.anthropicOauthToken }
@@ -465,14 +492,15 @@
    return token ? { apiKey: token } : {};
  })();

  const runtimeEnv = {
    ...runtime.sessionContext.runtimeInjection.env,
    ...buildClaudeAuthEnv(runtime.config.provider),
  };
  const runtimeEnv = buildProviderRuntimeEnv({
    runtime,
    actorInput,
    includeClaudeAuth: true,
  });
  const traceContext = toClaudeTraceContext(actorInput);

  return {
    maxTurns: CLAUDE_PROVIDER_MAX_TURNS,
    maxTurns: runtime.config.provider.claudeMaxTurns,
    ...(runtime.config.provider.claudeModel
      ? { model: runtime.config.provider.claudeModel }
      : {}),
@@ -484,7 +512,7 @@ function buildClaudeOptions(input: {
      ? { mcpServers: actorInput.mcp.resolvedConfig.claudeMcpServers as Options["mcpServers"] }
      : {}),
    canUseTool: actorInput.mcp.createClaudeCanUseTool(),
    cwd: runtime.sessionContext.runtimeInjection.workingDirectory,
    cwd: workingDirectory,
    env: runtimeEnv,
    ...runtime.claudeObservability.toOptionOverrides({
      context: traceContext,
@@ -507,8 +535,8 @@ async function runClaudeTurn(input: {
    context: traceContext,
    data: {
      ...(options.model ? { model: options.model } : {}),
      maxTurns: options.maxTurns ?? CLAUDE_PROVIDER_MAX_TURNS,
      cwd: input.runtime.sessionContext.runtimeInjection.workingDirectory,
      maxTurns: options.maxTurns ?? input.runtime.config.provider.claudeMaxTurns,
      ...(typeof options.cwd === "string" ? { cwd: options.cwd } : {}),
    },
  });

@@ -605,13 +633,11 @@ async function runClaudeActor(input: {
  actorInput: ActorExecutionInput;
}): Promise<ActorExecutionResult> {
  const prompt = buildActorPrompt(input.actorInput);
  const turn = await input.runtime.sessionContext.runInSession(() =>
    runClaudeTurn({
      runtime: input.runtime,
      actorInput: input.actorInput,
      prompt,
    }),
  );
  const turn = await runClaudeTurn({
    runtime: input.runtime,
    actorInput: input.actorInput,
    prompt,
  });

  const parsed = parseActorExecutionResultFromModelOutput({
    rawText: turn.text,
@@ -626,33 +652,21 @@ async function runClaudeActor(input: {

export async function createProviderRunRuntime(input: {
  provider: RunProvider;
  initialPrompt: string;
  config: Readonly<AppConfig>;
  projectPath: string;
  observabilityRootPath?: string;
  baseEnv?: Record<string, string | undefined>;
}): Promise<ProviderRunRuntime> {
  const sessionContext = await createSessionContext(input.provider, {
    prompt: input.initialPrompt,
    config: input.config,
    workspaceRoot: input.projectPath,
  });
  const claudeObservability = new ClaudeObservabilityLogger({
    workspaceRoot: input.observabilityRootPath ?? input.projectPath,
    workspaceRoot: input.observabilityRootPath ?? process.cwd(),
    config: input.config.provider.claudeObservability,
  });

  return {
    provider: input.provider,
    config: input.config,
    sessionContext,
    sharedEnv: sanitizeEnv(input.baseEnv ?? process.env),
    claudeObservability,
    close: async () => {
      try {
        await sessionContext.close();
      } finally {
        await claudeObservability.close();
      }
    },
    close: async () => claudeObservability.close(),
  };
}

@@ -202,6 +202,7 @@
        <select id="cfg-security-mode">
          <option value="hard_abort">hard_abort</option>
          <option value="validation_fail">validation_fail</option>
          <option value="dangerous_warn_only">dangerous_warn_only</option>
        </select>
      </label>
      <label>

@@ -359,6 +359,7 @@ export class UiRunService {
      worktreeManager: new SessionWorktreeManager({
        worktreeRoot: paths.worktreeRoot,
        baseRef: config.provisioning.gitWorktree.baseRef,
        targetPath: config.provisioning.gitWorktree.targetPath,
      }),
    };
  }
@@ -485,10 +486,9 @@ export class UiRunService {
    if (executionMode === "provider") {
      providerRuntime = await createProviderRunRuntime({
        provider,
        initialPrompt: input.prompt,
        config,
        projectPath: session?.baseWorkspacePath ?? this.workspaceRoot,
        observabilityRootPath: this.workspaceRoot,
        baseEnv: process.env,
      });
    }

@@ -25,6 +25,7 @@ test("loads defaults and freezes config", () => {
    "session.failed",
  ]);
  assert.equal(config.provider.openAiAuthMode, "auto");
  assert.equal(config.provider.claudeMaxTurns, 2);
  assert.equal(config.provider.claudeObservability.mode, "off");
  assert.equal(config.provider.claudeObservability.verbosity, "summary");
  assert.equal(config.provider.claudeObservability.logPath, ".ai_ops/events/claude-trace.ndjson");
@@ -55,6 +56,11 @@ test("validates security violation mode", () => {
  );
});

test("loads dangerous_warn_only security violation mode", () => {
  const config = loadConfig({ AGENT_SECURITY_VIOLATION_MODE: "dangerous_warn_only" });
  assert.equal(config.security.violationHandling, "dangerous_warn_only");
});

test("validates runtime discord severity mode", () => {
  assert.throws(
    () => loadConfig({ AGENT_RUNTIME_DISCORD_MIN_SEVERITY: "verbose" }),
@@ -69,6 +75,13 @@ test("validates claude observability mode", () => {
  );
});

test("validates CLAUDE_MAX_TURNS bounds", () => {
  assert.throws(
    () => loadConfig({ CLAUDE_MAX_TURNS: "0" }),
    /CLAUDE_MAX_TURNS must be an integer >= 1/,
  );
});

test("validates claude observability verbosity", () => {
  assert.throws(
    () => loadConfig({ CLAUDE_OBSERVABILITY_VERBOSITY: "verbose" }),

@@ -380,6 +380,7 @@ test("injects resolved mcp/helpers and enforces Claude tool gate in actor execut
  );
  assert.deepEqual(allow, {
    behavior: "allow",
    updatedInput: {},
    toolUseID: "allow-1",
  });

@@ -997,6 +998,7 @@ test("createClaudeCanUseTool accepts tool casing differences from providers", as
  });
  assert.deepEqual(allow, {
    behavior: "allow",
    updatedInput: {},
    toolUseID: "allow-bash",
  });

@@ -1020,6 +1022,88 @@ test("createClaudeCanUseTool accepts tool casing differences from providers", as
  assert.equal(result.status, "success");
});

test("dangerous_warn_only allows tool use outside persona allowlist", async () => {
  const workspaceRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-workspace-"));
  const stateRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-session-state-"));
  const projectContextPath = resolve(stateRoot, "project-context.json");

  const manifest = {
    schemaVersion: "1",
    topologies: ["sequential"],
    personas: [
      {
        id: "reader",
        displayName: "Reader",
        systemPromptTemplate: "Reader",
        toolClearance: {
          allowlist: ["read_file"],
          banlist: [],
        },
      },
    ],
    relationships: [],
    topologyConstraints: {
      maxDepth: 2,
      maxRetries: 0,
    },
    pipeline: {
      entryNodeId: "warn-node",
      nodes: [
        {
          id: "warn-node",
          actorId: "warn_actor",
          personaId: "reader",
        },
      ],
      edges: [],
    },
  } as const;

  const engine = new SchemaDrivenExecutionEngine({
    manifest,
    settings: {
      workspaceRoot,
      stateRoot,
      projectContextPath,
      maxChildren: 1,
      maxDepth: 2,
      maxRetries: 0,
      securityViolationHandling: "dangerous_warn_only",
      runtimeContext: {},
    },
    actorExecutors: {
      warn_actor: async (input) => {
        const canUseTool = input.mcp.createClaudeCanUseTool();
        const allow = await canUseTool("Bash", {}, {
          signal: new AbortController().signal,
          toolUseID: "allow-bash-warn",
        });
        assert.deepEqual(allow, {
          behavior: "allow",
          updatedInput: {},
          toolUseID: "allow-bash-warn",
        });

        return {
          status: "success",
          payload: {
            ok: true,
          },
        };
      },
    },
  });

  const result = await engine.runSession({
    sessionId: "session-dangerous-warn-only",
    initialPayload: {
      task: "verify warn-only bypass",
    },
  });

  assert.equal(result.status, "success");
});

test("hard-aborts pipeline on security violations by default", async () => {
  const workspaceRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-workspace-"));
  const stateRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-session-state-"));

@@ -160,6 +160,7 @@ test("runClaudePrompt wires auth env, stream parsing, and output", async () => {
    ANTHROPIC_API_KEY: "legacy-api-key",
    CLAUDE_MODEL: "claude-sonnet-4-6",
    CLAUDE_CODE_PATH: "/usr/local/bin/claude",
    CLAUDE_MAX_TURNS: "5",
  });

  let closed = false;
@@ -229,6 +230,7 @@ test("runClaudePrompt wires auth env, stream parsing, and output", async () => {
  assert.equal(queryInput?.prompt, "augmented prompt");
  assert.equal(queryInput?.options?.model, "claude-sonnet-4-6");
  assert.equal(queryInput?.options?.pathToClaudeCodeExecutable, "/usr/local/bin/claude");
  assert.equal(queryInput?.options?.maxTurns, 5);
  assert.equal(queryInput?.options?.cwd, "/tmp/claude-worktree");
  assert.equal(queryInput?.options?.authToken, "oauth-token");
  assert.deepEqual(queryInput?.options?.mcpServers, sessionContext.mcp.claudeMcpServers);

@@ -1,6 +1,17 @@
import test from "node:test";
import assert from "node:assert/strict";
import { parseActorExecutionResultFromModelOutput } from "../src/ui/provider-executor.js";
import { mkdtemp } from "node:fs/promises";
import { tmpdir } from "node:os";
import { resolve } from "node:path";
import { loadConfig } from "../src/config.js";
import type { ActorExecutionInput } from "../src/agents/pipeline.js";
import {
  buildProviderRuntimeEnv,
  createProviderRunRuntime,
  parseActorExecutionResultFromModelOutput,
  resolveProviderWorkingDirectory,
  type ProviderRunRuntime,
} from "../src/ui/provider-executor.js";

test("parseActorExecutionResultFromModelOutput parses strict JSON payload", () => {
  const parsed = parseActorExecutionResultFromModelOutput({
@@ -64,3 +75,71 @@ test("parseActorExecutionResultFromModelOutput falls back when response is not J
  assert.equal(parsed.status, "success");
  assert.equal(parsed.payload?.assistantResponse, "Implemented update successfully.");
});

test("resolveProviderWorkingDirectory reads cwd from actor execution context", () => {
  const actorInput = {
    executionContext: {
      security: {
        worktreePath: "/tmp/session/tasks/product-intake",
      },
    },
  } as unknown as ActorExecutionInput;

  assert.equal(
    resolveProviderWorkingDirectory(actorInput),
    "/tmp/session/tasks/product-intake",
  );
});

test("buildProviderRuntimeEnv scopes AGENT_WORKTREE_PATH to actor worktree and filters undefined auth", () => {
  const config = loadConfig({
    CLAUDE_CODE_OAUTH_TOKEN: "oauth-token",
  });
  const runtime = {
    provider: "claude",
    config,
    sharedEnv: {
      PATH: "/usr/bin",
      KEEP_ME: "1",
    },
    claudeObservability: {} as ProviderRunRuntime["claudeObservability"],
    close: async () => {},
  } satisfies ProviderRunRuntime;
  const actorInput = {
    executionContext: {
      security: {
        worktreePath: "/tmp/session/tasks/product-intake",
      },
    },
  } as unknown as ActorExecutionInput;

  const env = buildProviderRuntimeEnv({
    runtime,
    actorInput,
    includeClaudeAuth: true,
  });

  assert.equal(env.AGENT_WORKTREE_PATH, "/tmp/session/tasks/product-intake");
  assert.equal(env.CLAUDE_CODE_OAUTH_TOKEN, "oauth-token");
  assert.equal("ANTHROPIC_API_KEY" in env, false);
  assert.equal(env.KEEP_ME, "1");
});

test("createProviderRunRuntime does not require session context provisioning", async () => {
  const observabilityRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-provider-runtime-"));
  const runtime = await createProviderRunRuntime({
    provider: "claude",
    config: loadConfig({}),
    observabilityRootPath: observabilityRoot,
    baseEnv: {
      PATH: "/usr/bin",
    },
  });

  try {
    assert.equal(runtime.provider, "claude");
    assert.equal(runtime.sharedEnv.PATH, "/usr/bin");
  } finally {
    await runtime.close();
  }
});

@@ -111,6 +111,42 @@ test("rules engine enforces binary allowlist, tool policy, and path boundaries",
  );
});

test("rules engine dangerous_warn_only logs but does not block violating shell commands", async () => {
  const worktreeRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-security-warn-worktree-"));
  const stateRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-security-warn-state-"));
  const projectContextPath = resolve(stateRoot, "project-context.json");

  const rules = new SecurityRulesEngine(
    {
      allowedBinaries: ["git"],
      worktreeRoot,
      protectedPaths: [stateRoot, projectContextPath],
      requireCwdWithinWorktree: true,
      rejectRelativePathTraversal: true,
      enforcePathBoundaryOnArguments: true,
      allowedEnvAssignments: [],
      blockedEnvAssignments: [],
    },
    undefined,
    {
      violationHandling: "dangerous_warn_only",
    },
  );

  const validated = await rules.validateShellCommand({
    command: "unauthorized_bin --version",
    cwd: worktreeRoot,
    toolClearance: {
      allowlist: ["git"],
      banlist: [],
    },
  });

  assert.equal(validated.cwd, worktreeRoot);
  assert.equal(validated.parsed.commandCount, 0);
  assert.deepEqual(validated.parsed.commands, []);
});

test("secure executor runs with explicit env policy", async () => {
  const worktreeRoot = await mkdtemp(resolve(tmpdir(), "ai-ops-security-exec-"));

@@ -193,3 +229,47 @@ test("rules engine carries session context in tool audit events", () => {
  assert.equal(allowedEvent.nodeId, "node-ctx");
  assert.equal(allowedEvent.attempt, 2);
});

test("rules engine applies tool clearance matching case-insensitively", () => {
  const rules = new SecurityRulesEngine({
    allowedBinaries: ["git"],
    worktreeRoot: "/tmp",
    protectedPaths: [],
    requireCwdWithinWorktree: true,
    rejectRelativePathTraversal: true,
    enforcePathBoundaryOnArguments: true,
    allowedEnvAssignments: [],
    blockedEnvAssignments: [],
  });

  assert.doesNotThrow(() =>
    rules.assertToolInvocationAllowed({
      tool: "Bash",
      toolClearance: {
        allowlist: ["bash", "glob"],
        banlist: [],
      },
    }),
  );

  assert.throws(
    () =>
      rules.assertToolInvocationAllowed({
        tool: "Glob",
        toolClearance: {
          allowlist: ["bash", "glob"],
          banlist: ["GLOB"],
        },
      }),
    (error: unknown) =>
      error instanceof SecurityViolationError && error.code === "TOOL_BANNED",
  );

  assert.deepEqual(
    rules.filterAllowedTools(["Bash", "Glob", "Read"], {
      allowlist: ["bash", "glob"],
      banlist: ["gLoB"],
    }),
    ["Bash"],
  );
});

@@ -228,3 +228,60 @@ test("session worktree manager recreates a task worktree after stale metadata pr
  const stats = await stat(recreatedTaskWorktreePath);
  assert.equal(stats.isDirectory(), true);
});

test("session worktree manager applies target path sparse checkout and task working directory", async () => {
  const root = await mkdtemp(resolve(tmpdir(), "ai-ops-session-worktree-target-"));
  const projectPath = resolve(root, "project");
  const worktreeRoot = resolve(root, "worktrees");

  await mkdir(resolve(projectPath, "app", "src"), { recursive: true });
  await mkdir(resolve(projectPath, "infra"), { recursive: true });
  await git(["init", projectPath]);
  await git(["-C", projectPath, "config", "user.name", "AI Ops"]);
  await git(["-C", projectPath, "config", "user.email", "ai-ops@example.local"]);
  await writeFile(resolve(projectPath, "app", "src", "index.ts"), "export const app = true;\n", "utf8");
  await writeFile(resolve(projectPath, "infra", "notes.txt"), "infra\n", "utf8");
  await git(["-C", projectPath, "add", "."]);
  await git(["-C", projectPath, "commit", "-m", "initial commit"]);

  const manager = new SessionWorktreeManager({
    worktreeRoot,
    baseRef: "HEAD",
    targetPath: "app",
  });

  const sessionId = "session-target-1";
  const baseWorkspacePath = manager.resolveBaseWorkspacePath(sessionId);
  await manager.initializeSessionBaseWorkspace({
    sessionId,
    projectPath,
    baseWorkspacePath,
  });

  const baseWorkingDirectory = manager.resolveWorkingDirectoryForWorktree(baseWorkspacePath);
  assert.equal(baseWorkingDirectory, resolve(baseWorkspacePath, "app"));
  const baseWorkingStats = await stat(baseWorkingDirectory);
  assert.equal(baseWorkingStats.isDirectory(), true);
  await assert.rejects(() => stat(resolve(baseWorkspacePath, "infra")), {
    code: "ENOENT",
  });

  const ensured = await manager.ensureTaskWorktree({
    sessionId,
    taskId: "task-target-1",
    baseWorkspacePath,
  });
  assert.equal(ensured.taskWorkingDirectory, resolve(ensured.taskWorktreePath, "app"));

  await writeFile(resolve(ensured.taskWorkingDirectory, "src", "feature.ts"), "export const feature = true;\n", "utf8");

  const mergeOutcome = await manager.mergeTaskIntoBase({
    taskId: "task-target-1",
    baseWorkspacePath,
    taskWorktreePath: ensured.taskWorktreePath,
  });
  assert.equal(mergeOutcome.kind, "success");

  const merged = await readFile(resolve(baseWorkingDirectory, "src", "feature.ts"), "utf8");
  assert.equal(merged, "export const feature = true;\n");
});