ai_ops/human_only_TODO

# what is it
- a thing that gives me finer control over what agents are doing and what their contexts are
- has concurrency
- manages git workdirs and merges
- can use any openai/anthropic models
- can use multiple sets of creds

# in progress

# epic
implementation of AgentManager.runRecursiveAgent

# primitives / assumptions
  - runRecursiveAgent is currently a stub (src/agents/manager.ts:100), and README confirms it’s intentionally
    unimplemented (README.md:16, README.md:304).
  - Core primitives exist:
      - queue + depth limits (src/agents/manager.ts:85, src/agents/manager.ts:117)
      - child resource suballocation (src/agents/provisioning.ts:262)
      - child persona planning (src/agents/orchestration.ts:146)
  - Baseline tests are green (npm test passes).

# concurrency and deadlocks

- deadlock policy
    - release the parent's token before any child runs, via a strict fan-out/fan-in policy
        - a parent agent should never actively "wait" or suspend while holding a concurrency token
            - the parent's execution should simply terminate by returning a "Fanout Plan" (an array of child intents/payloads).
            - Once returned, the Orchestrator reclaims the parent's token and queues the children.
            - Once all the child nodes in that fanout complete their work, the Orchestrator schedules a completely new "Aggregator" node (or Phase 2 of the parent) and passes the children's outputs into it.
        - This rules out token-starvation deadlocks, even with maxConcurrentAgents=1, because a parent never holds a token while its children are waiting for one.
- api contracts for recursion
    - Do not give the agent a spawnChild() callback that it runs imperatively mid-stream.
    - Instead, when runRecursiveAgent finishes its thought process, it should return an array of intent objects: [{ persona: "Coder", task: "build X", context: {...} }, ...].
    - The DAG Orchestrator reads this array and schedules the children.
    - This keeps the execution engine completely schema-driven and observable.
- failure/cancellation semantics
    - Cancel active work via AbortController and Fail-Fast
        - Every agent invocation must accept an AbortSignal.
        - If a parent session is closed or fails catastrophically, the Orchestrator fires the AbortController.
        - The SDKs (@openai/codex-sdk, @anthropic-ai/claude-agent-sdk) are expected to accept a standard AbortSignal or AbortController to cancel in-flight API requests; confirm the exact option name against the installed SDK versions.
        - The pipeline owns the retry logic, not the agent.
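
A minimal sketch of the fan-out/fan-in policy above, assuming a counting semaphore stands in for the concurrency tokens. `ChildIntent`, `NodeResult`, `RunNode`, `TokenPool`, and `runTree` are hypothetical names for illustration, not the existing AgentManager API, and cancellation and retries are left out.

```typescript
// Hypothetical shapes; none of these names exist in the codebase.
interface ChildIntent {
  persona: string;
  task: string;
  context?: Record<string, unknown>;
}

interface NodeResult {
  output: string;
  fanout?: ChildIntent[]; // the "Fanout Plan": children the node wants scheduled
}

// Runs one node with a fresh context. childOutputs is empty for a first
// phase and holds the children's outputs for an aggregator phase.
type RunNode = (intent: ChildIntent, childOutputs: string[]) => Promise<NodeResult>;

// Counting semaphore standing in for maxConcurrentAgents.
class TokenPool {
  private inUse = 0;
  private waiters: Array<() => void> = [];
  private max: number;
  peakInUse = 0;

  constructor(max: number) {
    this.max = max;
  }

  async acquire(): Promise<void> {
    if (this.inUse < this.max) {
      this.inUse++;
      this.peakInUse = Math.max(this.peakInUse, this.inUse);
      return;
    }
    // Wait until release() hands a token over directly.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) next(); // the token passes straight to the next waiter
    else this.inUse--;
  }
}

async function runWithToken(
  pool: TokenPool, run: RunNode, intent: ChildIntent, childOutputs: string[],
): Promise<NodeResult> {
  await pool.acquire();
  try {
    return await run(intent, childOutputs);
  } finally {
    pool.release(); // the token is reclaimed as soon as the node terminates
  }
}

async function runTree(pool: TokenPool, run: RunNode, intent: ChildIntent): Promise<string> {
  // Phase 1: the node holds a token only while it runs, then returns its plan.
  const first = await runWithToken(pool, run, intent, []);
  if (!first.fanout || first.fanout.length === 0) return first.output;

  // Fan out: no token is held here; children queue for capacity like any node.
  const childOutputs = await Promise.all(
    first.fanout.map((child) => runTree(pool, run, child)),
  );

  // Fan in: a new aggregator node receives the children's outputs.
  // A further fanout returned by the aggregator is ignored in this sketch.
  const aggregated = await runWithToken(pool, run, intent, childOutputs);
  return aggregated.output;
}
```

With a pool of size 1, the parent, each child, and the aggregator take the single token one after another, which is the situation the Deadlock Test is meant to exercise.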

# topology and state management

- session topology model
    - child sessions with hierarchical ids
        - Children must run in entirely isolated child sessions. If the parent is session_xyz, children are session_xyz_child_1, session_xyz_child_2, and so on, which creates a deterministic tree. If session_xyz is cancelled, the Orchestrator can filter on the prefix session_xyz_child_ (including the delimiter, so an unrelated session_xyz2 does not match) and cascade the cancellation to every descendant.
- state merge semantics
    - States stay isolated; merging is an explicit DAG node action
        - given our stateless handoffs, children do not magically merge their state back into the parent. Child A writes to its own isolated sub-worktree and outputs a final JSON payload.
            - merge behavior can be dictated by either:
                - an explicitly defined merge agent
                - a deterministic git merge script in the orchestrator
- relationship graph semantics for recursion
    - Enforce acyclic relationships at the manifest level, rely on depth caps for dynamic fan-outs
        - AgentManifest validator should throw an error on startup if it detects a hardcoded cycle (A -> B -> A).
        - For dynamic recursion (where an agent decides at runtime to spawn 3 sub-agents), rely strictly on AGENT_MAX_RECURSIVE_DEPTH.
        - If an agent tries to spawn a child at depth limit + 1, the Orchestrator rejects the intent and returns a hard error payload to the parent.
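
The id scheme, cascade filter, depth cap, and manifest cycle check above could be sketched as follows. This assumes ids follow the `<parent>_child_<n>` pattern from the note and that a root id never contains `_child_` itself. All function names and the `ManifestEdges` shape are hypothetical.

```typescript
const CHILD_SEP = "_child_";

function childSessionId(parentId: string, index: number): string {
  return `${parentId}${CHILD_SEP}${index}`;
}

// Root sessions are depth 0; each separator adds one level.
function depthOf(sessionId: string): number {
  return sessionId.split(CHILD_SEP).length - 1;
}

// Matches the id itself or anything below it. Including the separator in the
// prefix keeps "session_xyz" from also matching "session_xyz2".
function sessionsToCancel(allIds: string[], rootId: string): string[] {
  return allIds.filter((id) => id === rootId || id.startsWith(rootId + CHILD_SEP));
}

type SpawnCheck = { ok: true; childId: string } | { ok: false; error: string };

// Rejects an intent whose child would sit deeper than maxDepth.
function checkSpawn(parentId: string, index: number, maxDepth: number): SpawnCheck {
  const childDepth = depthOf(parentId) + 1;
  if (childDepth > maxDepth) {
    return { ok: false, error: `depth ${childDepth} exceeds limit ${maxDepth}` };
  }
  return { ok: true, childId: childSessionId(parentId, index) };
}

// Static manifest edges: persona -> personas it hands off to (hypothetical shape).
type ManifestEdges = Record<string, string[]>;

// Depth-first search; returns the first cycle found, or null if the graph is acyclic.
function findCycle(edges: ManifestEdges): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const path: string[] = [];

  const visit = (node: string): string[] | null => {
    if (state.get(node) === "done") return null;
    if (state.get(node) === "visiting") {
      return path.slice(path.indexOf(node)).concat(node);
    }
    state.set(node, "visiting");
    path.push(node);
    for (const next of edges[node] ?? []) {
      const cycle = visit(next);
      if (cycle) return cycle;
    }
    path.pop();
    state.set(node, "done");
    return null;
  };

  for (const node of Object.keys(edges)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}
```

A startup validator could throw when `findCycle` returns a non-null path, while `checkSpawn` would run per intent at scheduling time.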

# resources and boundaries

- resource inheritance boundary
    - AgentManager stays concurrency-only; orchestrate via a Middleware layer
    - DAG Orchestrator / AgentManager and Resource Provisioner should remain decoupled
    - When the AgentManager pops a child task off the queue to run it, it should emit an event or call a middleware: provisioner.allocateFor(childSessionId, parentSessionId).
    - The provisioner reads the parent's resources, slices them (e.g., sub-allocating a chunk of ports or creating a nested git worktree), and injects them into the child's environment.
    - The AgentManager never touches Git or files directly
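
One way to keep the boundary described above is to let the manager depend only on a small provisioner interface. The sketch below is an assumption about shape, not the existing code: `ChildResources`, `PathOnlyProvisioner`, and `QueueingManager` are hypothetical, only the `allocateFor(childSessionId, parentSessionId)` call comes from the note, and the stand-in provisioner only derives paths without touching disk or git.

```typescript
// Hypothetical resource bundle; a real one would also carry ports and similar.
interface ChildResources {
  worktreePath: string;
}

// The only surface the manager sees.
interface Provisioner {
  allocateFor(childSessionId: string, parentSessionId: string): ChildResources;
}

// Stand-in provisioner: derives a nested worktree path from the parent's path.
class PathOnlyProvisioner implements Provisioner {
  private bySession = new Map<string, ChildResources>();

  constructor(rootSessionId: string, rootWorktree: string) {
    this.bySession.set(rootSessionId, { worktreePath: rootWorktree });
  }

  allocateFor(childSessionId: string, parentSessionId: string): ChildResources {
    const parent = this.bySession.get(parentSessionId);
    if (!parent) throw new Error(`no resources recorded for ${parentSessionId}`);
    const resources: ChildResources = {
      worktreePath: `${parent.worktreePath}/${childSessionId}`,
    };
    this.bySession.set(childSessionId, resources);
    return resources;
  }
}

interface QueuedChild {
  childSessionId: string;
  parentSessionId: string;
}

// Concurrency-only manager: it owns the queue and asks the middleware for
// resources when a task is dequeued. No git or filesystem calls live here.
class QueueingManager {
  private queue: QueuedChild[] = [];
  private provisioner: Provisioner;

  constructor(provisioner: Provisioner) {
    this.provisioner = provisioner;
  }

  enqueue(task: QueuedChild): void {
    this.queue.push(task);
  }

  startNext(): { sessionId: string; resources: ChildResources } | undefined {
    const task = this.queue.shift();
    if (!task) return undefined;
    const resources = this.provisioner.allocateFor(task.childSessionId, task.parentSessionId);
    return { sessionId: task.childSessionId, resources };
  }
}
```

Because the manager only knows the interface, a test can pass in a fake provisioner and assert on the allocations without a repository on disk.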

# test acceptance criteria

- runRecursiveAgent test suites
    - The Deadlock Test: Set maxConcurrentAgents=1. Run a parent that spawns a child. Assert that the system resolves successfully (proving the parent yielded its token).
    - The Depth Test: Set AGENT_MAX_RECURSIVE_DEPTH=2. Have an agent spawn a child, which spawns a grandchild, which tries to spawn a great-grandchild. Assert the 3rd spawn is rejected and the error propagates up gracefully.
    - The Abort Test: Start a parent with a 5-second sleep task, cancel the session at 1 second. Assert that the underlying LLM SDK handles were aborted and resources were released.
    - The Isolation Test: Spawn two children concurrently. Assert they are assigned non-overlapping port ranges and isolated worktree paths.

# Scheduled
- security implementation
- persona definitions
    - product
    - task
    - coder
    - tester
    - git
        - handle basic git validation/maintenance
        - edit + merge when conflict is low
        - pass to dev when conflict is big

- need to untangle
    - what goes where in terms of DAG definition vs app logic vs agent behavior
    - events
        - what events do we have
        - what personas care about what events
        - how should a persona respond to an event
            - where is this defined
    - success/failure/retry policy definitions
        - where does this go?
        - what are they?


- task management flow
    - init
    - planning
    - prioritization
    - dependency graph
    - subtasks
    - task/subtask status updates (pending, in progress, done, failed)

# Considering
- model selection per task/session/agent
- agent "notebook"
- agent run log
- agent persona support
- ping pong support, e.g. product agent > dev agent; if the dev agent needs clarification, it ping pongs back to product. same with tester > dev.
    - resume session aspect of this
    - max ping pong length, e.g. tester can only pass back once, otherwise mark as failed
    - max ping pong length per relationship, e.g. dev:git can ping pong 4 times, dev:product only once, etc
- git orchestration
    - merging
    - symlinks
- security
    - whatever existing thing has
    - banned commands (look up a git repo for this)
- front end
- list available models
- specific workflows
    - ui
    - ci/cd
    - review
    - testing
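
For the ping pong limits in the list above, a per-relationship budget might look like the sketch below. It assumes one tracker per task and a `from:to` key format borrowed from the `dev:git` notation; `PingPongTracker` and its method names are hypothetical.

```typescript
// Hypothetical budget table: "from:to" -> max number of hand-backs allowed.
type PingPongLimits = { [relationship: string]: number | undefined };

class PingPongTracker {
  private counts = new Map<string, number>();
  private limits: PingPongLimits;
  private defaultLimit: number;

  constructor(limits: PingPongLimits, defaultLimit: number) {
    this.limits = limits;
    this.defaultLimit = defaultLimit;
  }

  // Records one hand-back and reports whether it was within budget.
  // A false result means the caller should mark the task as failed.
  tryHandBack(from: string, to: string): boolean {
    const key = `${from}:${to}`;
    const configured = this.limits[key];
    const limit = configured === undefined ? this.defaultLimit : configured;
    const used = this.counts.get(key) ?? 0;
    if (used >= limit) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}
```

Relationships without an explicit entry fall back to the default limit, which covers the "tester can only pass back once" case with a default of 1.
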
# Defer
# Won't Do


# Completed
- boilerplate TypeScript project for Claude
- mcp server support
- generic mcp handlers
- specific mcp handlers for
    - context7
    - claude task manager
- concurrency, configurable max agent and max depth
- Extensible Resource Provisioning
    - hard constraints
    - soft constraints
- basic hygiene run
# epic
- agent orchestration system improvements
# module 1
- schema driven execution engine
    - specific definitions handled in AgentManifest schema
- persona registry
    - templated system prompts injected with runtime context
    - tool clearances (stub this for now, add TODO for security implementation)
        - allowlist
        - banlist
    - behavioral event handlers
        - define how personas react to specific events, e.g. onTaskComplete, onValidationFail
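
A rough sketch of what a persona registry entry with these three concerns could look like. The `Persona` shape, the `{{placeholder}}` template syntax, and the handler return value (a label for the next action) are all assumptions for illustration, not the actual AgentManifest schema.

```typescript
// Hypothetical persona shape covering prompts, tool clearances, and event handlers.
interface Persona {
  name: string;
  promptTemplate: string; // uses {{placeholder}} markers
  // TODO(security): clearances are a stub until the security implementation lands.
  tools: { allowlist: string[]; banlist: string[] };
  handlers: { [event: string]: ((payload: unknown) => string) | undefined };
}

// Replaces {{key}} with context[key]; a missing key throws so a bad template fails loudly.
function renderPrompt(template: string, context: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(context, key)) {
      throw new Error(`missing prompt context: ${key}`);
    }
    return context[key];
  });
}

// The banlist wins over the allowlist; anything unlisted is denied.
function isToolAllowed(persona: Persona, tool: string): boolean {
  if (persona.tools.banlist.includes(tool)) return false;
  return persona.tools.allowlist.includes(tool);
}

class PersonaRegistry {
  private personas = new Map<string, Persona>();

  register(persona: Persona): void {
    this.personas.set(persona.name, persona);
  }

  get(name: string): Persona {
    const persona = this.personas.get(name);
    if (!persona) throw new Error(`unknown persona: ${name}`);
    return persona;
  }

  // Dispatches an event to the persona's handler, if it defines one.
  dispatch(name: string, event: string, payload: unknown): string | undefined {
    const handler = this.get(name).handlers[event];
    return handler ? handler(payload) : undefined;
  }
}
```

Events a persona does not handle return `undefined`, leaving the default behavior to the pipeline.
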
# module 2
- actor oriented pipeline constrained by a strict directed acyclic graph
- relationship + pipeline graphs
    - multi level topology
        - hierarchical, e.g. parent spawns 3 coder children
        - unrolled retry pipelines, e.g. coder1 > QA1 > coder2 > QA2
        - sequential, e.g. product > task > coder > QA > git
    - support for constraint definition for each concept (relationship, pipeline, topology)
        - e.g. max depth, max retries
    - state dependent routings
        - support branching logic based on project history or repository state, e.g. project init requires the product agent to generate a PRD, then the task agent creates a roadmap; once those exist, future sessions skip those agents and go straight to coder agents
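
The state-dependent routing example above can be illustrated with a small routing function. `ProjectState` and `planPipeline` are hypothetical, and the sketch assumes that a missing PRD also invalidates any existing roadmap.

```typescript
// Hypothetical snapshot of what already exists in the repository.
interface ProjectState {
  hasPrd: boolean;
  hasRoadmap: boolean;
}

// Returns the personas to run, in order, skipping planning stages whose
// artifacts already exist.
function planPipeline(state: ProjectState): string[] {
  const pipeline: string[] = [];
  if (!state.hasPrd) {
    // Assumption: the roadmap is derived from the PRD, so both are regenerated.
    pipeline.push("product", "task");
  } else if (!state.hasRoadmap) {
    pipeline.push("task");
  }
  pipeline.push("coder", "QA", "git");
  return pipeline;
}
```

A fresh project gets the full sequence, while a session that finds both artifacts starts at the coder stage.
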
# module 3
- state/context manager
- stateless handoffs
    - state and context are passed forwards through payloads via worktree/storage, not conversational memory
    - fresh context per node execution
# module 4
- resource provisioning
- hierarchical resource suballocation
    - when a parent agent spawns children, handle local resource management
        - branch/sub-worktree provisioning
        - deterministic port-range suballocation
        - extensibility to support future resource types
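
As an illustration of deterministic port-range suballocation, the sketch below splits a parent's range into equal contiguous slices by child index. It is an assumption about one possible scheme, not the logic in src/agents/provisioning.ts; `PortRange` and `slicePortRange` are hypothetical names.

```typescript
// Inclusive port bounds.
interface PortRange {
  start: number;
  end: number;
}

// Splits the parent's range into `children` equal contiguous slices.
// Slice i always covers the same ports for the same inputs; any leftover
// ports at the top of the range stay with the parent.
function slicePortRange(parent: PortRange, children: number): PortRange[] {
  if (!Number.isInteger(children) || children < 1) {
    throw new Error("children must be a positive integer");
  }
  const total = parent.end - parent.start + 1;
  const size = Math.floor(total / children);
  if (size < 1) {
    throw new Error(`a range of ${total} ports cannot cover ${children} children`);
  }
  const slices: PortRange[] = [];
  for (let i = 0; i < children; i++) {
    const start = parent.start + i * size;
    slices.push({ start, end: start + size - 1 });
  }
  return slices;
}
```

Because each slice depends only on the parent range and the child index, a child can apply the same function to its own slice when it spawns grandchildren.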