Add runtime event telemetry and auth-mode config hardening

2026-02-23 17:30:53 -05:00
parent 3ca9bd3db8
commit 94c79d9dd7
10 changed files with 853 additions and 3 deletions

docs/runtime-events.md Normal file

@@ -0,0 +1,45 @@
# Runtime Events
## Purpose
Runtime events provide a best-effort telemetry side-channel for:
- long-term analytics (tool usage, token usage, retries, failure rates)
- high-visibility operational notifications (session starts/stops, critical failures)
This channel is intentionally non-blocking and does not participate in orchestration routing logic.
## Event model
Events include:
- identity: `id`, `timestamp`, `type`, `severity`
- routing context: `sessionId`, `nodeId`, `attempt`
- narrative context: `message`
- analytics context: optional `usage` (`tokenInput`, `tokenOutput`, `tokenTotal`, `toolCalls`, `durationMs`, `costUsd`)
- structured `metadata`
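
The field groups above could be modelled roughly as the TypeScript types below. This is a sketch only: the field names come from the list, but the field types, which fields are optional, the ISO 8601 timestamp format, and the severity levels (`info`, `warning`, `error`, `critical`) are all assumptions, since this document does not specify them.

```typescript
// Hypothetical severity levels; this document does not enumerate them.
type Severity = "info" | "warning" | "error" | "critical";

// Analytics context. Field names are from the doc; numeric types are assumed.
interface RuntimeEventUsage {
  tokenInput?: number;
  tokenOutput?: number;
  tokenTotal?: number;
  toolCalls?: number;
  durationMs?: number;
  costUsd?: number;
}

interface RuntimeEvent {
  // Identity
  id: string;
  timestamp: string; // assumed to be an ISO 8601 string
  type: string;
  severity: Severity;
  // Routing context (assumed optional, e.g. session-level events have no node)
  sessionId?: string;
  nodeId?: string;
  attempt?: number;
  // Narrative context
  message: string;
  // Analytics context (optional per the doc)
  usage?: RuntimeEventUsage;
  // Structured metadata (assumed to be a free-form JSON object)
  metadata: Record<string, unknown>;
}

// Illustrative values only; none of these are taken from real output.
const exampleEvent: RuntimeEvent = {
  id: "evt-0001",
  timestamp: "2026-01-01T00:00:00.000Z",
  type: "node.attempt.completed",
  severity: "info",
  sessionId: "session-1",
  nodeId: "node-a",
  attempt: 1,
  message: "Node attempt completed",
  usage: { tokenInput: 120, tokenOutput: 30, tokenTotal: 150, toolCalls: 2, durationMs: 840 },
  metadata: {},
};
```

Keeping `usage` optional mirrors the list above, where only the analytics context is marked optional.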
Core emitted event types:
- `session.started`
- `node.attempt.completed`
- `domain.<domain_event_type>`
- `session.completed`
- `session.failed`
- `security.<security_audit_event_type>` (mirrored from the security audit engine)
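
One way to express this naming scheme in code is with template literal types for the two prefixed families. The sketch below is illustrative: `RuntimeEventType`, `EventFamily`, and `eventFamily` are hypothetical names, not identifiers from this codebase.

```typescript
// Fixed event types listed above.
type CoreEventType =
  | "session.started"
  | "node.attempt.completed"
  | "session.completed"
  | "session.failed";

// Prefixed families: the suffix is whatever the domain layer or the
// security audit engine emits, so it is left as an open string.
type RuntimeEventType = CoreEventType | `domain.${string}` | `security.${string}`;

type EventFamily = "session" | "node" | "domain" | "security" | "unknown";

// Hypothetical helper: classify an event type string by its dotted prefix.
function eventFamily(type: string): EventFamily {
  if (type.startsWith("session.")) return "session";
  if (type.startsWith("node.")) return "node";
  if (type.startsWith("domain.")) return "domain";
  if (type.startsWith("security.")) return "security";
  return "unknown";
}

// The compiler accepts any suffix after a known prefix.
// "domain.example_event" is a made-up name for illustration.
const sampleType: RuntimeEventType = "domain.example_event";
```

A consumer such as an analytics job could group events by `eventFamily(event.type)` without knowing every domain or security suffix in advance.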
## Sinks
- File sink (`AGENT_RUNTIME_EVENT_LOG_PATH`)
  - NDJSON append-only log suitable for offline analytics ingestion.
- Discord webhook sink (`AGENT_RUNTIME_DISCORD_WEBHOOK_URL`)
  - Sends events at or above `AGENT_RUNTIME_DISCORD_MIN_SEVERITY`.
  - Always-notify event types are configurable via `AGENT_RUNTIME_DISCORD_ALWAYS_NOTIFY_TYPES`.
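
The Discord filtering rules above can be sketched as a small predicate. Several details are assumed rather than taken from this document: the severity levels and their ordering, the comma-separated format of `AGENT_RUNTIME_DISCORD_ALWAYS_NOTIFY_TYPES`, and the rule that an always-notify type bypasses the severity threshold. The function and type names are hypothetical.

```typescript
// Assumed severity levels, ordered from least to most severe.
type Severity = "info" | "warning" | "error" | "critical";

const SEVERITY_RANK: Record<Severity, number> = {
  info: 0,
  warning: 1,
  error: 2,
  critical: 3,
};

interface DiscordFilterConfig {
  minSeverity: Severity; // e.g. from AGENT_RUNTIME_DISCORD_MIN_SEVERITY
  alwaysNotifyTypes: string[]; // e.g. from AGENT_RUNTIME_DISCORD_ALWAYS_NOTIFY_TYPES
}

// Assumes a comma-separated list; the real format is not specified here.
function parseAlwaysNotifyTypes(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Returns true when the event should be sent to the webhook.
function shouldNotifyDiscord(
  event: { type: string; severity: Severity },
  config: DiscordFilterConfig,
): boolean {
  // Assumption: always-notify types are sent regardless of severity.
  if (config.alwaysNotifyTypes.includes(event.type)) {
    return true;
  }
  return SEVERITY_RANK[event.severity] >= SEVERITY_RANK[config.minSeverity];
}
```

With `minSeverity` set to `error` and `session.started` in the always-notify list, a session start at `info` severity would still be sent, while an `info`-level `node.attempt.completed` would be dropped.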
All sinks are best-effort. Sink failures are swallowed to avoid impacting agent execution.
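
A minimal sketch of this best-effort behaviour follows, assuming sinks are plain functions that may be synchronous or asynchronous. `Sink`, `emitBestEffort`, and `toNdjsonLine` are hypothetical names, and the in-memory sink stands in for a real file sink that would append each line to the path in `AGENT_RUNTIME_EVENT_LOG_PATH`.

```typescript
// Reduced event shape for this sketch; field types are assumed.
interface RuntimeEvent {
  id: string;
  timestamp: string;
  type: string;
  severity: string;
  message: string;
  metadata: Record<string, unknown>;
}

// A sink may complete synchronously or return a promise.
type Sink = (event: RuntimeEvent) => void | Promise<void>;

// NDJSON: one JSON object per line, newline-terminated.
function toNdjsonLine(event: RuntimeEvent): string {
  return JSON.stringify(event) + "\n";
}

// Fan an event out to every sink without awaiting and without letting
// any sink failure reach the caller.
function emitBestEffort(event: RuntimeEvent, sinks: Sink[]): void {
  for (const sink of sinks) {
    try {
      // Promise.resolve normalises sync and async sinks;
      // .catch swallows asynchronous rejections.
      Promise.resolve(sink(event)).catch(() => undefined);
    } catch {
      // Synchronous sink failure: swallowed on purpose.
    }
  }
}

// Demonstration sinks (illustrative only).
const lines: string[] = [];
const memorySink: Sink = (event) => {
  lines.push(toNdjsonLine(event));
};
const brokenSink: Sink = () => {
  throw new Error("sink unavailable");
};

emitBestEffort(
  {
    id: "evt-0001",
    timestamp: "2026-01-01T00:00:00.000Z",
    type: "session.started",
    severity: "info",
    message: "Session started",
    metadata: {},
  },
  [brokenSink, memorySink],
);
```

Because `emitBestEffort` returns `void` and never awaits, the broken sink neither delays nor fails the caller, and the remaining sinks still receive the event.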
## Non-goals
- Runtime events are not used to drive DAG edge conditions.
- Runtime events are not required for pipeline correctness.
- Runtime events do not replace session state persistence (`AGENT_STATE_ROOT`) or project context state (`AGENT_PROJECT_CONTEXT_PATH`).