zman27/ai_ops

Josh Rzemien cf386e1aaa feat(ui): add operator UI server, stores, and insights

2026-02-23 18:49:53 -05:00

Runtime Events

Purpose

Runtime events provide a best-effort telemetry side-channel for:

- long-term analytics (tool usage, token usage, retries, failure rates)
- high-visibility operational notifications (session starts/stops, critical failures)

This channel is intentionally non-blocking and does not participate in orchestration routing logic.

Event model

Events include:

- identity: id, timestamp, type, severity
- routing context: sessionId, nodeId, attempt
- narrative context: message
- analytics context: optional usage (tokenInput, tokenOutput, tokenTotal, toolCalls, durationMs, costUsd)
- structured metadata
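
The field list above can be sketched as a TypeScript shape. This is only an illustration: the type names (RuntimeEvent, RuntimeEventUsage, Severity), the severity levels, the ISO 8601 timestamp format, and which fields are optional are all assumptions, not taken from the codebase.

```typescript
// Hypothetical severity levels; the source does not list the actual values.
type Severity = "info" | "warning" | "critical";

// Analytics context. Every field is treated as optional here (an assumption).
interface RuntimeEventUsage {
  tokenInput?: number;
  tokenOutput?: number;
  tokenTotal?: number;
  toolCalls?: number;
  durationMs?: number;
  costUsd?: number;
}

// Hypothetical event shape built from the documented field names.
interface RuntimeEvent {
  // identity
  id: string;
  timestamp: string; // assumed to be an ISO 8601 string
  type: string;
  severity: Severity;
  // routing context (assumed optional, since not every event belongs to a node)
  sessionId?: string;
  nodeId?: string;
  attempt?: number;
  // narrative context
  message: string;
  // analytics context
  usage?: RuntimeEventUsage;
  // structured metadata
  metadata?: Record<string, unknown>;
}

// Illustrative values only; none of these come from a real session.
const exampleEvent: RuntimeEvent = {
  id: "evt-0001",
  timestamp: "2026-01-01T00:00:00.000Z",
  type: "node.attempt.completed",
  severity: "info",
  sessionId: "session-1",
  nodeId: "node-a",
  attempt: 1,
  message: "Attempt 1 of node-a completed",
  usage: { tokenInput: 1200, tokenOutput: 300, tokenTotal: 1500, toolCalls: 2, durationMs: 8400 },
  metadata: { status: "success" },
};
```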

Core emitted event types:

- session.started
- node.attempt.completed
- domain.<domain_event_type>
- session.completed
- session.failed
- security.<security_audit_event_type> (mirrored from security audit engine)
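
Because the event types above share dotted prefixes, an analytics consumer could group them by family. The helper below is a minimal sketch; eventFamily and EventFamily are hypothetical names and are not part of the documented API.

```typescript
// Prefixes taken from the core event types listed above.
const FAMILIES = ["session", "node", "domain", "security"] as const;

type EventFamily = (typeof FAMILIES)[number] | "unknown";

// Returns the family for a dotted event type such as "domain.<domain_event_type>".
function eventFamily(type: string): EventFamily {
  for (const family of FAMILIES) {
    // Require the dot so that a type like "sessionx.started" is not misclassified.
    if (type.indexOf(family + ".") === 0) {
      return family;
    }
  }
  return "unknown";
}
```

For example, eventFamily("session.failed") yields "session", and any type without a known prefix falls back to "unknown".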

node.attempt.completed metadata includes orchestration-debug fields used by the operator UI:

- status, optional failureKind, optional failureCode
- executionContext (phase, modelConstraint, allowedTools, security constraints)
- topologyKind, retrySpawned, optional fromNodeId
- optional subtasks, securityViolation
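
As a rough illustration of how these fields could feed retry and failure-rate analytics, the sketch below tallies node.attempt.completed events. It assumes status is a string and retrySpawned is a boolean; the status values "success" and "failure" in the usage example are placeholders, and summarizeAttempts is a hypothetical helper.

```typescript
// Only the fields this sketch reads; types are assumptions.
interface AttemptEvent {
  type: string;
  metadata?: {
    status?: string;
    failureKind?: string;
    retrySpawned?: boolean;
  };
}

interface AttemptSummary {
  total: number;
  byStatus: Record<string, number>;
  retriesSpawned: number;
}

// Counts attempts per status and how many attempts spawned a retry.
function summarizeAttempts(events: AttemptEvent[]): AttemptSummary {
  const summary: AttemptSummary = { total: 0, byStatus: {}, retriesSpawned: 0 };
  for (const event of events) {
    if (event.type !== "node.attempt.completed") {
      continue; // ignore session, domain, and security events
    }
    summary.total += 1;
    const status = event.metadata?.status ?? "unknown";
    summary.byStatus[status] = (summary.byStatus[status] ?? 0) + 1;
    if (event.metadata?.retrySpawned === true) {
      summary.retriesSpawned += 1;
    }
  }
  return summary;
}

// Usage with placeholder status values.
const attemptSummary = summarizeAttempts([
  { type: "session.started" },
  { type: "node.attempt.completed", metadata: { status: "failure", retrySpawned: true } },
  { type: "node.attempt.completed", metadata: { status: "success", retrySpawned: false } },
]);
```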

Sinks

- File sink (AGENT_RUNTIME_EVENT_LOG_PATH)
  - NDJSON append-only log suitable for offline analytics ingestion.
- Discord webhook sink (AGENT_RUNTIME_DISCORD_WEBHOOK_URL)
  - Sends events at or above AGENT_RUNTIME_DISCORD_MIN_SEVERITY.
  - Always-notify event types are configurable via AGENT_RUNTIME_DISCORD_ALWAYS_NOTIFY_TYPES.
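
The two sink behaviours can be sketched as pure functions: one line of JSON per event for the file sink, and a severity threshold with an always-notify override for the Discord sink. The severity levels and their ordering, the comma-separated format of the always-notify list, and all function names here are assumptions; in practice the inputs would come from the environment variables named above.

```typescript
// Assumed severity levels, ordered from least to most severe.
type Severity = "info" | "warning" | "critical";
const SEVERITY_ORDER: Severity[] = ["info", "warning", "critical"];

interface SinkEvent {
  type: string;
  severity: Severity;
  message: string;
}

// NDJSON: one JSON document per line. JSON.stringify escapes embedded
// newlines, so each event occupies exactly one line of the log.
function toNdjsonLine(event: SinkEvent): string {
  return JSON.stringify(event) + "\n";
}

// Assumes the always-notify setting is a comma-separated list of event types.
function parseAlwaysNotifyTypes(raw: string | undefined): string[] {
  if (!raw) {
    return [];
  }
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// True when the event type is on the always-notify list, or when its
// severity is at or above the configured minimum.
function shouldNotifyDiscord(
  event: SinkEvent,
  minSeverity: Severity,
  alwaysNotifyTypes: string[],
): boolean {
  if (alwaysNotifyTypes.indexOf(event.type) !== -1) {
    return true;
  }
  return SEVERITY_ORDER.indexOf(event.severity) >= SEVERITY_ORDER.indexOf(minSeverity);
}
```

With a minimum severity of "warning" and "session.started" on the always-notify list, an info-level session.started event would still be sent, while an info-level node.attempt.completed event would not.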

All sinks are best-effort. Sink failures are swallowed to avoid impacting agent execution.
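
One way to get this best-effort behaviour is a fan-out that isolates each sink, so a throwing or rejecting sink never reaches the caller. The sketch below shows the pattern only; emit, Sink, and isThenable are hypothetical names, and the real implementation may differ.

```typescript
interface EmittedEvent {
  type: string;
  message: string;
}

// A sink may finish synchronously or return a promise-like value.
type Sink = (event: EmittedEvent) => void | PromiseLike<void>;

function isThenable(value: unknown): value is PromiseLike<unknown> {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { then?: unknown }).then === "function"
  );
}

// Delivers the event to every sink without awaiting any of them.
// Synchronous throws and asynchronous rejections are both swallowed.
function emit(event: EmittedEvent, sinks: Sink[]): void {
  for (const sink of sinks) {
    try {
      const result = sink(event);
      if (isThenable(result)) {
        // Attach a rejection handler so a failed delivery is ignored.
        result.then(undefined, () => {});
      }
    } catch {
      // Swallow synchronous sink failures; agent execution continues.
    }
  }
}

// Usage: the broken sink does not prevent delivery to the working one.
const delivered: string[] = [];
const brokenSink: Sink = () => {
  throw new Error("sink unavailable");
};
const memorySink: Sink = (event) => {
  delivered.push(event.type);
};
emit({ type: "session.started", message: "session started" }, [brokenSink, memorySink]);
```

Because emit returns without awaiting its sinks, a slow webhook call would not delay the caller. The trade-off is that delivery failures are invisible unless a sink records them itself.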

Non-goals

- Runtime events are not used to drive DAG edge conditions.
- Runtime events are not required for pipeline correctness.
- Runtime events do not replace session state persistence (AGENT_STATE_ROOT) or project context state (AGENT_PROJECT_CONTEXT_PATH).