Add Claude observability tracing and diagnostics UI

2026-02-24 12:50:31 -05:00
parent 6863c1da0b
commit 691591d279
22 changed files with 1898 additions and 32 deletions
--- a/README.md
+++ b/README.md
@@ -95,6 +95,7 @@ The UI provides:
 - graph visualizer with topology/retry rendering, edge trigger labels, node economics (duration/cost/tokens), and critical-path highlighting
 - node inspector with attempt metadata and injected `ResolvedExecutionContext` sandbox payload
 - live runtime event feed from `AGENT_RUNTIME_EVENT_LOG_PATH` with severity coloring (including security mirror events)
+- Claude trace feed from `CLAUDE_OBSERVABILITY_LOG_PATH` (query lifecycle, SDK message types/subtypes, and errors)
 - run trigger + kill switch backed by `SchemaDrivenExecutionEngine.runSession(...)`
  - run mode selector: `provider` (real Codex/Claude execution) or `mock` (deterministic dry-run executor)
  - provider selector: `codex` or `claude`
@@ -108,6 +109,7 @@ Provider mode notes:
 - `provider=codex` uses existing OpenAI/Codex auth settings (`OPENAI_AUTH_MODE`, `CODEX_API_KEY`, `OPENAI_API_KEY`).
 - `provider=claude` uses Claude auth resolution (`CLAUDE_CODE_OAUTH_TOKEN` preferred, otherwise `ANTHROPIC_API_KEY`, or existing Claude Code login state).
 - `CLAUDE_MODEL` should be a Claude model id/alias recognized by Claude Code (for example `claude-sonnet-4-6`); `anthropic/...` prefixes are normalized automatically.
+- Claude provider runs can emit Claude SDK/CLI internals to stdout and/or NDJSON with `CLAUDE_OBSERVABILITY_*` settings.

 ## Manifest Semantics

@@ -202,6 +204,30 @@ Notes:
  - `security.tool.invocation_allowed`
  - `security.tool.invocation_blocked`

+## Claude Observability
+
+- `CLAUDE_OBSERVABILITY_MODE=stdout` prints structured Claude query internals (tool progress, system events, stderr, result lifecycle) to stdout as JSON lines prefixed with `[claude-trace]`.
+- `CLAUDE_OBSERVABILITY_MODE=file` appends the same records to `CLAUDE_OBSERVABILITY_LOG_PATH`.
+- `CLAUDE_OBSERVABILITY_MODE=both` enables both outputs.
+- High-frequency `tool_progress` events are sampled to avoid log flooding; suppression counters are retained so dropped events stay countable.
+- `assistant` and `user` message records are retained so turn flow is inspectable end-to-end.
+- `CLAUDE_OBSERVABILITY_VERBOSITY=summary` stores compact metadata; `full` stores redacted full SDK message payloads.
+- `CLAUDE_OBSERVABILITY_INCLUDE_PARTIAL=true` enables partial assistant stream events in the SDK and emits a sample of them.
+- `CLAUDE_OBSERVABILITY_DEBUG=true` enables Claude SDK debug mode.
+- `CLAUDE_OBSERVABILITY_DEBUG_LOG_PATH` writes Claude SDK debug output to a file (also enables debug mode).
+- In UI/provider mode, `CLAUDE_OBSERVABILITY_LOG_PATH` resolves relative to the repo workspace root.
+- UI API: `GET /api/claude-trace?limit=<n>&sessionId=<id>` reads filtered Claude trace records.
+
+Example:
+
+```bash
+CLAUDE_OBSERVABILITY_MODE=both
+CLAUDE_OBSERVABILITY_VERBOSITY=summary
+CLAUDE_OBSERVABILITY_LOG_PATH=.ai_ops/events/claude-trace.ndjson
+CLAUDE_OBSERVABILITY_INCLUDE_PARTIAL=false
+CLAUDE_OBSERVABILITY_DEBUG=false
+```
+
 ### Analytics Quick Start

 Inspect latest events:
@@ -258,6 +284,12 @@ jq -c 'select(.severity=="critical")' .ai_ops/events/runtime-events.ndjson
 - `ANTHROPIC_API_KEY` (used when `CLAUDE_CODE_OAUTH_TOKEN` is unset)
 - `CLAUDE_MODEL`
 - `CLAUDE_CODE_PATH`
+- `CLAUDE_OBSERVABILITY_MODE` (`off`, `stdout`, `file`, or `both`)
+- `CLAUDE_OBSERVABILITY_VERBOSITY` (`summary` or `full`)
+- `CLAUDE_OBSERVABILITY_LOG_PATH`
+- `CLAUDE_OBSERVABILITY_INCLUDE_PARTIAL` (`true` or `false`)
+- `CLAUDE_OBSERVABILITY_DEBUG` (`true` or `false`)
+- `CLAUDE_OBSERVABILITY_DEBUG_LOG_PATH`
 - `MCP_CONFIG_PATH`

 ### Agent Manager Limits
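
The README text added in this diff says that `CLAUDE_OBSERVABILITY_MODE=stdout` prints trace records as JSON lines prefixed with `[claude-trace]`. A minimal sketch of separating those records from other stdout output follows. It assumes the prefix is followed by optional whitespace and then the JSON payload (the diff does not specify the separator), and `extract_claude_trace` is a hypothetical helper name, not something this commit adds.

```bash
# Hypothetical helper: not part of this commit.
# Reads mixed process output on stdin and prints only the JSON payload of
# lines that start with the [claude-trace] prefix; all other lines are dropped.
# Assumption: the prefix is followed by optional whitespace, then the JSON.
extract_claude_trace() {
  sed -n 's/^\[claude-trace\][[:space:]]*//p'
}
```

The filtered lines can then be piped to `jq -c .`, in the same way the README already inspects `runtime-events.ndjson`.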