# Runtime Events
## Purpose
Runtime events provide a best-effort telemetry side-channel for:
- long-term analytics (tool usage, token usage, retries, failure rates)
- high-visibility operational notifications (session starts/stops, critical failures)

This channel is intentionally non-blocking and does not participate in orchestration routing logic.
## Event model
Events include:
- identity: `id`, `timestamp`, `type`, `severity`
- routing context: `sessionId`, `nodeId`, `attempt`
- narrative context: `message`
- analytics context: optional `usage` (`tokenInput`, `tokenOutput`, `tokenTotal`, `toolCalls`, `durationMs`, `costUsd`)
- structured `metadata`
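
The field list above could be modelled roughly as follows. This is a minimal sketch: the field names come from this document, but the concrete types, the severity levels, which fields are optional, and the `createRuntimeEvent` helper are assumptions made for illustration.

```typescript
// Assumed severity levels; the document does not enumerate them.
type Severity = "info" | "warning" | "critical";

// Analytics context. Treating every counter as optional is an assumption.
interface RuntimeEventUsage {
  tokenInput?: number;
  tokenOutput?: number;
  tokenTotal?: number;
  toolCalls?: number;
  durationMs?: number;
  costUsd?: number;
}

interface RuntimeEvent {
  // identity
  id: string;
  timestamp: string; // assumed to be an ISO 8601 string
  type: string;
  severity: Severity;
  // routing context (assumed optional, e.g. for session-level events)
  sessionId?: string;
  nodeId?: string;
  attempt?: number;
  // narrative context
  message: string;
  // analytics context
  usage?: RuntimeEventUsage;
  // structured metadata
  metadata?: Record<string, unknown>;
}

type NewRuntimeEvent = Omit<RuntimeEvent, "id" | "timestamp"> &
  Partial<Pick<RuntimeEvent, "id" | "timestamp">>;

// Hypothetical helper: fills in identity fields the caller leaves out.
let nextEventSeq = 0;
function createRuntimeEvent(fields: NewRuntimeEvent): RuntimeEvent {
  nextEventSeq += 1;
  return {
    ...fields,
    id: fields.id ?? `evt-${nextEventSeq}`,
    timestamp: fields.timestamp ?? new Date().toISOString(),
  };
}
```

A caller would then supply only the fields it knows, for example `createRuntimeEvent({ type: "session.started", severity: "info", message: "session started", sessionId: "s-1" })`.
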

Core emitted event types:
- `session.started`
- `node.attempt.completed`
- `domain.<domain_event_type>`
- `session.completed`
- `session.failed`
- `security.<security_audit_event_type>` (mirrored from the security audit engine)
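
A consumer reading the event log might want to tell the fixed type names apart from the prefixed `domain.` and `security.` families. The sketch below shows one way to do that; `classifyEventType` and the `EventTypeKind` labels are hypothetical names, not part of the runtime.

```typescript
// Fixed type names listed in this document.
const FIXED_EVENT_TYPES: readonly string[] = [
  "session.started",
  "node.attempt.completed",
  "session.completed",
  "session.failed",
];

type EventTypeKind = "fixed" | "domain" | "security" | "unknown";

// True when `type` is `<prefix>` followed by a non-empty suffix.
function hasPrefixWithSuffix(type: string, prefix: string): boolean {
  return type.startsWith(prefix) && type.length > prefix.length;
}

// Hypothetical helper: classify a type string by its documented form.
function classifyEventType(type: string): EventTypeKind {
  if (FIXED_EVENT_TYPES.indexOf(type) !== -1) return "fixed";
  if (hasPrefixWithSuffix(type, "domain.")) return "domain";
  if (hasPrefixWithSuffix(type, "security.")) return "security";
  return "unknown";
}
```
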

`node.attempt.completed` metadata includes orchestration-debug fields used by the operator UI:
- `status`, optional `failureKind`, optional `failureCode`
- `executionContext` (`phase`, `modelConstraint`, `allowedTools`, security constraints)
- `topologyKind`, `retrySpawned`, optional `fromNodeId`
- optional `subtasks`, `securityViolation`
## Sinks
- File sink (`AGENT_RUNTIME_EVENT_LOG_PATH`)
  - NDJSON append-only log suitable for offline analytics ingestion.
- Discord webhook sink (`AGENT_RUNTIME_DISCORD_WEBHOOK_URL`)
  - Sends events at or above `AGENT_RUNTIME_DISCORD_MIN_SEVERITY`.
  - Always-notify event types configurable via `AGENT_RUNTIME_DISCORD_ALWAYS_NOTIFY_TYPES`.
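
The two sink behaviours above can be sketched as pure functions: one that serialises an event as a single NDJSON line, and one that decides whether the Discord sink should send it. The severity ordering, the comma-separated format of `AGENT_RUNTIME_DISCORD_ALWAYS_NOTIFY_TYPES`, the `"critical"` fallback, and all function names are assumptions for illustration; the environment is passed in as a plain record so the sketch stays self-contained.

```typescript
// Assumed severity ordering; the document does not list the levels.
const SEVERITY_RANK = { info: 0, warning: 1, critical: 2 } as const;
type Severity = keyof typeof SEVERITY_RANK;

interface SinkEvent {
  type: string;
  severity: Severity;
  message: string;
}

// File sink format: one JSON object per line (NDJSON).
function toNdjsonLine(event: SinkEvent): string {
  return JSON.stringify(event) + "\n";
}

interface DiscordFilter {
  minSeverity: Severity;
  alwaysNotifyTypes: ReadonlySet<string>;
}

function isSeverity(value: string): value is Severity {
  return Object.prototype.hasOwnProperty.call(SEVERITY_RANK, value);
}

// Hypothetical parser. Assumes a comma-separated type list and falls back
// to "critical" when the severity variable is unset or unrecognised.
function parseDiscordFilter(env: Record<string, string | undefined>): DiscordFilter {
  const rawSeverity = env["AGENT_RUNTIME_DISCORD_MIN_SEVERITY"] ?? "";
  const rawTypes = env["AGENT_RUNTIME_DISCORD_ALWAYS_NOTIFY_TYPES"] ?? "";
  return {
    minSeverity: isSeverity(rawSeverity) ? rawSeverity : "critical",
    alwaysNotifyTypes: new Set(
      rawTypes.split(",").map((t) => t.trim()).filter((t) => t.length > 0),
    ),
  };
}

// Send when the type is always-notify, or the severity meets the threshold.
function shouldNotifyDiscord(event: SinkEvent, filter: DiscordFilter): boolean {
  if (filter.alwaysNotifyTypes.has(event.type)) return true;
  return SEVERITY_RANK[event.severity] >= SEVERITY_RANK[filter.minSeverity];
}
```

Keeping the decision separate from the HTTP call makes the threshold logic easy to test without a webhook.
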

All sinks are best-effort. Sink failures are swallowed to avoid impacting agent execution.
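
One way to get this best-effort behaviour is a fan-out that catches both synchronous throws and asynchronous rejections from each sink, and never awaits them. The `Sink` type and `emitBestEffort` function below are hypothetical names for a sketch of the idea, not the runtime's actual dispatcher.

```typescript
interface RuntimeEventLike {
  type: string;
  message: string;
}

// A sink may complete synchronously or return a promise.
type Sink = (event: RuntimeEventLike) => void | Promise<void>;

// Deliver the event to every sink. A failing sink never affects the
// caller or the remaining sinks.
function emitBestEffort(event: RuntimeEventLike, sinks: readonly Sink[]): void {
  for (const sink of sinks) {
    try {
      const result = sink(event);
      if (result instanceof Promise) {
        // Swallow async failures without awaiting, so emission stays non-blocking.
        result.catch(() => undefined);
      }
    } catch {
      // Swallow synchronous failures.
    }
  }
}
```

Because the function returns `void` and never awaits, callers on the orchestration path cannot accidentally block on a slow webhook.
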
## Non-goals
- Runtime events are not used to drive DAG edge conditions.
- Runtime events are not required for pipeline correctness.
- Runtime events do not replace session state persistence (`AGENT_STATE_ROOT`) or project context state (`AGENT_PROJECT_CONTEXT_PATH`).