zman27/ai_ops

zman 83bbf1a9ce Add configurable worktree target path and session run diagnostics

2026-02-23 20:38:05 -05:00

5.2 KiB

Session Walkthrough (Concrete Example)

This document walks through one successful provider run end-to-end using:

session id: ui-session-mlzw94bv-cb753677
run id: 9287775f-a507-492a-9afa-347ed3f3a6b3
execution mode: provider
provider: claude
manifest: .ai_ops/manifests/test.json

Use this as a mental model and as a debugging template for future sessions.

1) What happened in this run

The manifest defines two sequential nodes:

write-node (persona: writer)
copy-node (persona: copy-editor)

Edge routing is write-node -> copy-node on success.

In this run:

write-node succeeded on attempt 1 and emitted validation_passed and tasks_planned.
copy-node succeeded on attempt 1 and emitted validation_passed.
Session aggregate status was success.

2) Timeline from runtime events

From .ai_ops/events/runtime-events.ndjson:

2026-02-24T00:55:28.632Z session.started
2026-02-24T00:55:48.705Z node.attempt.completed for write-node with status=success
2026-02-24T00:55:48.706Z domain.validation_passed for write-node
2026-02-24T00:55:48.706Z domain.tasks_planned for write-node
2026-02-24T00:56:14.237Z node.attempt.completed for copy-node with status=success
2026-02-24T00:56:14.238Z domain.validation_passed for copy-node
2026-02-24T00:56:14.242Z session.completed with status=success
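The gaps between these timestamps show where the run spent its time. The following is a minimal TypeScript sketch that computes them, assuming only that timestamps are ISO 8601 strings as printed above. The `TimelineEntry` shape and the `elapsedMs` helper are hypothetical and not part of ai_ops.

```typescript
// Hypothetical shape: one entry per timeline line above.
interface TimelineEntry {
  timestamp: string; // ISO 8601, e.g. "2026-02-24T00:55:28.632Z"
  label: string;
}

// Milliseconds elapsed between each entry and the one before it (0 for the first).
function elapsedMs(entries: TimelineEntry[]): number[] {
  const gaps: number[] = [];
  let previous: number | undefined;
  for (const entry of entries) {
    const current = Date.parse(entry.timestamp);
    gaps.push(previous === undefined ? 0 : current - previous);
    previous = current;
  }
  return gaps;
}

// Timestamps copied from the timeline above (domain.* events omitted for brevity).
const timeline: TimelineEntry[] = [
  { timestamp: "2026-02-24T00:55:28.632Z", label: "session.started" },
  { timestamp: "2026-02-24T00:55:48.705Z", label: "node.attempt.completed write-node" },
  { timestamp: "2026-02-24T00:56:14.237Z", label: "node.attempt.completed copy-node" },
  { timestamp: "2026-02-24T00:56:14.242Z", label: "session.completed" },
];

const gaps = elapsedMs(timeline);
```

For this run the gaps are 20073 ms up to the write-node completion, a further 25532 ms up to the copy-node completion, and 5 ms more until session.completed.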

3) How artifacts map to runtime behavior

Run metadata (UI-level)

state/<session>/ui-run-meta.json stores run summary fields:

run/provider/mode
status (running, success, failure, cancelled)
start/end timestamps

For this run:

{
  "sessionId": "ui-session-mlzw94bv-cb753677",
  "status": "success",
  "executionMode": "provider",
  "provider": "claude"
}
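When tooling reads this file, it helps to validate the status against the four values listed above before acting on it. The sketch below is illustrative only: it covers just the fields shown in the excerpt, and `UiRunMetaExcerpt`, `parseRunMeta`, and `isTerminal` are hypothetical names, not the types used in src/ui/run-service.ts.

```typescript
// Status values listed above for ui-run-meta.json.
const RUN_STATUSES = ["running", "success", "failure", "cancelled"] as const;
type RunStatus = (typeof RUN_STATUSES)[number];

// Only the fields shown in the excerpt; the real file also holds timestamps and more.
interface UiRunMetaExcerpt {
  sessionId: string;
  status: RunStatus;
  executionMode: string;
  provider: string;
}

// Hypothetical helper: check the parsed JSON before trusting it.
function parseRunMeta(json: string): UiRunMetaExcerpt {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("run meta must be a JSON object");
  }
  const { sessionId, status, executionMode, provider } = raw as Record<string, unknown>;
  if (
    typeof sessionId !== "string" ||
    typeof executionMode !== "string" ||
    typeof provider !== "string"
  ) {
    throw new Error("run meta is missing a string field");
  }
  const known = RUN_STATUSES.find((s) => s === status);
  if (known === undefined) {
    throw new Error(`unknown status: ${String(status)}`);
  }
  return { sessionId, status: known, executionMode, provider };
}

// A run is finished once it has left the "running" state.
function isTerminal(meta: UiRunMetaExcerpt): boolean {
  return meta.status !== "running";
}

const meta = parseRunMeta(
  '{"sessionId":"ui-session-mlzw94bv-cb753677","status":"success",' +
    '"executionMode":"provider","provider":"claude"}'
);
```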

Handoffs (node input payloads)

state/<session>/handoffs/*.json stores payload handoffs per node.

write-node.json:

{
  "nodeId": "write-node",
  "payload": { "prompt": "be yourself" }
}

copy-node.json includes fromNodeId: "write-node" and carries the story generated by the writer node.

Important: a handoff file is the payload transferred along a pipeline edge. If a downstream node's output looks strange, inspect that node's handoff file first.
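To see where a node's input came from, you can follow fromNodeId links back to the entry node. This is a rough sketch using the three fields named above (nodeId, fromNodeId, payload). It assumes the entry handoff has no fromNodeId, and `traceUpstream` is a hypothetical helper rather than an ai_ops function.

```typescript
interface Handoff {
  nodeId: string;
  fromNodeId?: string; // assumed absent on the entry node's handoff
  payload: Record<string, unknown>;
}

// Walk fromNodeId links from a node back to the entry handoff.
function traceUpstream(handoffs: Handoff[], nodeId: string): string[] {
  const byNode = new Map(handoffs.map((h) => [h.nodeId, h] as const));
  const chain: string[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let current: string | undefined = nodeId;
  while (current !== undefined && !seen.has(current)) {
    seen.add(current);
    chain.push(current);
    current = byNode.get(current)?.fromNodeId;
  }
  return chain;
}

const handoffs: Handoff[] = [
  { nodeId: "write-node", payload: { prompt: "be yourself" } },
  // Payload left empty here; the real file carries the writer's story.
  { nodeId: "copy-node", fromNodeId: "write-node", payload: {} },
];

const chain = traceUpstream(handoffs, "copy-node");
```

Here `chain` lists copy-node first and write-node second, which is also the order to read the handoff files in when working backwards from a bad output.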

Session state (flags + metadata + history)

state/<session>/state.json is cumulative session state:

flags: merged boolean flags from node results
metadata: merged metadata from node results/behavior patches
history: domain-event history entries

For this run, state includes:

flags: story_written=true, copy_edited=true
history events:
- write-node: validation_passed
- write-node: tasks_planned
- copy-node: validation_passed
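One way to picture how state.json accumulates is as a merge applied after each node result. The sketch below makes several assumptions: the `NodeResult` and `HistoryEntry` shapes are hypothetical, later values win on key collisions, and each flag is attributed to the node its name suggests. The actual merge logic lives in the ai_ops source and may differ.

```typescript
// Hypothetical shapes; this document only names flags, metadata, and history.
interface HistoryEntry {
  nodeId: string;
  event: string;
}

interface SessionState {
  flags: Record<string, boolean>;
  metadata: Record<string, unknown>;
  history: HistoryEntry[];
}

interface NodeResult {
  nodeId: string;
  flags?: Record<string, boolean>;
  metadata?: Record<string, unknown>;
  events?: string[];
}

// Returns a new state; assumes later results overwrite earlier keys.
function mergeResult(state: SessionState, result: NodeResult): SessionState {
  const newEntries = (result.events ?? []).map((event) => ({
    nodeId: result.nodeId,
    event,
  }));
  return {
    flags: { ...state.flags, ...result.flags },
    metadata: { ...state.metadata, ...result.metadata },
    history: [...state.history, ...newEntries],
  };
}

const emptyState: SessionState = { flags: {}, metadata: {}, history: [] };
const afterWrite = mergeResult(emptyState, {
  nodeId: "write-node",
  flags: { story_written: true }, // assumed to come from write-node
  events: ["validation_passed", "tasks_planned"],
});
const finalState = mergeResult(afterWrite, {
  nodeId: "copy-node",
  flags: { copy_edited: true }, // assumed to come from copy-node
  events: ["validation_passed"],
});
```

After both merges, `finalState` holds the two flags and the three history entries listed above.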

Project context pointer

.ai_ops/project-context.json tracks cross-session pointers like:

sessions/<session>/last_completed_node
sessions/<session>/last_attempt
sessions/<session>/final_state

This lets operators and tooling locate the final state file for any completed session.

4) Code path (from button click to persisted state)

1. UI starts run via UiRunService.startRun(...).
2. Service loads config, parses manifest, creates engine, writes initial run meta.
3. Engine runSession(...) initializes state and writes entry handoff.
4. Pipeline executes ready nodes:
   - builds fresh node context (handoff + state)
   - renders persona system prompt
   - invokes provider executor
   - receives actor result
5. Lifecycle observer persists:
   - state flags/metadata/history
   - runtime events (node.attempt.completed, domain.*)
   - project context pointers (last_completed_node, last_attempt)
6. Pipeline evaluates edges and writes downstream handoffs.
7. Pipeline computes aggregate status and emits session.completed.
8. UI run service writes final ui-run-meta.json status from pipeline summary.

Primary entrypoints:

src/ui/run-service.ts
src/agents/orchestration.ts
src/agents/pipeline.ts
src/agents/lifecycle-observer.ts
src/agents/state-context.ts
src/ui/provider-executor.ts

5) Mental model that keeps this manageable

Think of one session as five stores and one loop:

Manifest (static plan): node graph + routing rules.
Handoffs (per-node input payload snapshots).
State (session memory): flags + metadata + domain history.
Runtime events (timeline/audit side channel).
Project context (cross-session pointers and shared context).
Loop: dequeue ready node -> execute -> persist result/events -> enqueue next nodes.

If you track those six things, a session's behavior becomes traceable and explainable.
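The loop can be sketched in a few lines. This is a simplified, synchronous, in-memory illustration and not the code in src/agents/pipeline.ts: the `Manifest`, `Edge`, and `Executor` types are hypothetical, retries are omitted, the graph is assumed acyclic, and "any failure makes the aggregate a failure" is an assumed rule.

```typescript
type NodeStatus = "success" | "failure";

// Hypothetical, simplified manifest: an entry node plus conditional edges.
interface Edge {
  from: string;
  to: string;
  on: NodeStatus;
}
interface Manifest {
  entry: string;
  edges: Edge[];
}

// Stand-in for the provider executor; the real one is presumably asynchronous.
type Executor = (nodeId: string, payload: unknown) => { status: NodeStatus; output: unknown };

interface RunSummary {
  status: NodeStatus;
  executed: string[];
  handoffs: Record<string, unknown>;
}

function runLoop(manifest: Manifest, entryPayload: unknown, execute: Executor): RunSummary {
  const handoffs: Record<string, unknown> = { [manifest.entry]: entryPayload };
  const queue: string[] = [manifest.entry];
  const executed: string[] = [];
  let aggregate: NodeStatus = "success";

  while (queue.length > 0) {
    const nodeId = queue.shift(); // dequeue ready node
    if (nodeId === undefined) break;
    const result = execute(nodeId, handoffs[nodeId]); // execute
    executed.push(nodeId); // persist result (in memory here)
    if (result.status === "failure") aggregate = "failure";
    for (const edge of manifest.edges) {
      // enqueue next nodes whose edge condition passed
      if (edge.from === nodeId && edge.on === result.status) {
        handoffs[edge.to] = result.output;
        queue.push(edge.to);
      }
    }
  }
  return { status: aggregate, executed, handoffs };
}

const manifest: Manifest = {
  entry: "write-node",
  edges: [{ from: "write-node", to: "copy-node", on: "success" }],
};
const alwaysOk: Executor = (nodeId) => ({ status: "success", output: `${nodeId} output` });
const summary = runLoop(manifest, { prompt: "be yourself" }, alwaysOk);
```

If the executor reports a failure for write-node instead, no handoff is written for copy-node, which matches the "missing downstream handoff file" case in the debug checklist.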

6) Debug checklist for any future session id

Given <sid>, inspect in this order:

state/<sid>/ui-run-meta.json
.ai_ops/events/runtime-events.ndjson filtered by <sid>
state/<sid>/handoffs/*.json
state/<sid>/state.json
.ai_ops/project-context.json pointer entries for <sid>

Interpretation:

No session.started: run failed before pipeline began.
node.attempt.completed with failureCode=provider_*: provider/runtime issue.
Missing downstream handoff file: edge condition did not pass.
history contains validation_failed: a retry (unrolled) path or remediation branch was likely triggered.
ui-run-meta disagrees with runtime events: check the run-service status mapping, and restart the server so it picks up new code.
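The first two interpretation rules can be automated against the event log. The sketch below works on the NDJSON text directly and makes assumptions about the event schema: the `sessionId` and `type` field names are guesses (only failureCode is named in this document), and `provider_timeout` is a made-up example code.

```typescript
// Assumed event shape; check real lines in runtime-events.ndjson before relying on it.
interface RuntimeEvent {
  sessionId: string; // assumed field name
  type: string; // assumed field name, e.g. "session.started"
  failureCode?: string;
}

// Parse NDJSON text and keep only events for one session.
function eventsForSession(ndjson: string, sid: string): RuntimeEvent[] {
  return ndjson
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as RuntimeEvent)
    .filter((event) => event.sessionId === sid);
}

// Apply the first two interpretation rules from the checklist.
function triage(events: RuntimeEvent[]): string[] {
  const findings: string[] = [];
  if (!events.some((e) => e.type === "session.started")) {
    findings.push("no session.started: run failed before pipeline began");
  }
  const providerFailure = events.some(
    (e) => e.type === "node.attempt.completed" && (e.failureCode ?? "").startsWith("provider_")
  );
  if (providerFailure) {
    findings.push("failureCode=provider_*: provider/runtime issue");
  }
  return findings;
}

// Synthetic log for illustration only; not taken from a real run.
const sampleLog = [
  { sessionId: "sid-a", type: "session.started" },
  { sessionId: "sid-a", type: "node.attempt.completed", failureCode: "provider_timeout" },
  { sessionId: "sid-b", type: "node.attempt.completed" },
]
  .map((entry) => JSON.stringify(entry))
  .join("\n");

const findingsA = triage(eventsForSession(sampleLog, "sid-a"));
const findingsB = triage(eventsForSession(sampleLog, "sid-b"));
```

To use this on a real session, read .ai_ops/events/runtime-events.ndjson into a string and pass it in with the session id. The remaining rules need the handoff and state files and are left out of this sketch.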
