App Factory
Autonomous multi-agent orchestration framework. Give it a natural-language prompt and get back a fully developed, QA-verified, and merged codebase.
How It Works
User prompt
→ PM Agent (expands into structured PRD)
→ Task Agent (generates prioritized dependency graph via claude-task-master)
→ Dev Agents (concurrent, isolated Docker containers with Claude Code)
→ QA Agent (code review, tests, rebase, merge to main)
→ Done
If any agent gets blocked, the flow reverses through a clarification loop — Dev asks Task, Task asks PM, PM asks the human — while other agents keep working.
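The escalation order described above can be pictured as a walk up a fixed chain. The following is only a minimal sketch under stated assumptions, not the project's implementation: `ESCALATION_CHAIN`, `escalate`, and the responder callables are hypothetical names introduced for illustration, and a responder returning `None` is assumed to mean "I cannot answer, ask the next level".

```python
from typing import Callable, Optional

# Escalation order from the text: Dev asks Task, Task asks PM, PM asks the human.
ESCALATION_CHAIN = ["dev", "task", "pm", "human"]

# Hypothetical responder type: returns an answer, or None if it cannot help.
Responder = Callable[[str], Optional[str]]


def escalate(question: str, asked_by: str, responders: dict[str, Responder]) -> tuple[str, str]:
    """Walk up the chain from `asked_by` until some level returns an answer.

    Returns (answering_level, answer). Raises RuntimeError if no level answers.
    """
    start = ESCALATION_CHAIN.index(asked_by) + 1
    for level in ESCALATION_CHAIN[start:]:
        answer = responders[level](question)
        if answer is not None:
            return level, answer
    raise RuntimeError(f"No level could answer: {question!r}")
```

In this sketch a Dev Agent question that the Task Agent cannot resolve falls through to the PM Agent, and only reaches the human if the PM Agent also returns `None`.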
Quick Start
Prerequisites
- Python 3.11+
- Docker Desktop (running)
- Git
- Claude Code with OAuth or an Anthropic API key
Setup
# Clone and enter project
git clone <repo-url> && cd ai_ops2
# Create venv and install dependencies
uv venv
uv pip install -r requirements.txt
# Configure environment (optional — not needed with Claude Code OAuth)
cp .env.example .env
# Edit .env to add API keys if not using OAuth
# Optionally add LANGSMITH_API_KEY for tracing
Run
# Build a project from a prompt
python main.py --prompt "Build a video transcription service with Whisper and summarization"
# Limit concurrent dev agents
python main.py --prompt "Build a REST API" --max-concurrent-tasks 3
# Target a specific repo
python main.py --prompt "Add user authentication" --repo-path /path/to/project
# Validate config without executing
python main.py --dry-run --prompt "test"
# Verbose logging
python main.py --prompt "Build a CLI tool" --debug
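For readers who want to script around the CLI, the flags used in the commands above could be modeled with `argparse` roughly as follows. This is a sketch, not the actual parser in `main.py`: the flag names come from the examples, while `build_parser`, the defaults, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the Run examples."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--prompt", required=True,
                        help="Natural-language description of what to build")
    # Default of None is an assumption; the real default is not documented here.
    parser.add_argument("--max-concurrent-tasks", type=int, default=None,
                        help="Upper bound on concurrent dev agents")
    parser.add_argument("--repo-path", default=None,
                        help="Existing repository to work in")
    parser.add_argument("--dry-run", action="store_true",
                        help="Validate config without executing")
    parser.add_argument("--debug", action="store_true",
                        help="Verbose logging")
    return parser
```

`argparse` converts dashes in flag names to underscores, so `--max-concurrent-tasks 3` is read back as `args.max_concurrent_tasks`.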
Architecture
Agents
| Agent | File | Role |
|---|---|---|
| PMAgent | agents/pm_agent.py | Expands prompts into PRDs, handles clarification requests |
| TaskMasterAgent | agents/task_agent.py | Bridges to claude-task-master for task graph management |
| DevAgentManager | agents/dev_agent.py | Spawns Claude Code in Docker containers via pexpect |
| QAAgent | agents/qa_agent.py | Code review, linting, testing, rebase, and merge |
Core
| Component | File | Role |
|---|---|---|
| AppFactoryOrchestrator | core/graph.py | LangGraph state machine with conditional routing |
| WorkspaceManager | core/workspace.py | Git worktree + Docker container lifecycle |
| ObservabilityManager | core/observability.py | LangSmith tracing + structured logging |
| ArchitectureTracker | core/architecture_tracker.py | Prevents context starvation across dev agents |
Project Structure
app_factory/
├── agents/
│ ├── pm_agent.py # PRD generation + clarification
│ ├── task_agent.py # claude-task-master interface
│ ├── dev_agent.py # Claude Code + Docker orchestration
│ └── qa_agent.py # Review, test, merge pipeline
├── core/
│ ├── graph.py # LangGraph state machine
│ ├── workspace.py # Git worktree + Docker isolation
│ ├── observability.py # LangSmith tracing + logging
│ └── architecture_tracker.py # Global architecture summary
├── prompts/ # Agent prompt templates
│ ├── pm_prd_expansion.txt
│ ├── pm_clarification.txt
│ ├── dev_task_execution.txt
│ └── qa_review.txt
└── data/ # Runtime state + architecture tracking
Execution Phases
- Linear Planning: User → PM Agent → Task Agent. Produces a prioritized DAG of tasks.
- Dynamic Concurrency: Orchestrator spins up a WorkspaceManager + DevAgent for every unblocked task concurrently via asyncio.gather().
- Clarification Loop: Blocked agents route requests backward up the chain. Other agents continue uninterrupted.
- QA & Merge: QA Agent rebases, lints, tests, reviews, and merges each completed task. Task Agent then unlocks downstream dependencies.
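The interplay between the task DAG and `asyncio.gather()` described in these phases can be sketched as a wave-based scheduler. This is an illustration under assumptions rather than the orchestrator's real code: `run_dev_task`, `unblocked`, and `run_graph` are hypothetical names, the dev work is a stub, and running in strict waves is a simplification (a real scheduler might start a task as soon as its own dependencies finish).

```python
import asyncio


async def run_dev_task(task_id: str) -> str:
    """Stand-in for a Dev Agent run; real work would happen in a container."""
    await asyncio.sleep(0)  # yield control to simulate asynchronous work
    return task_id


def unblocked(deps: dict[str, set[str]], done: set[str]) -> list[str]:
    """Tasks that are not done yet and whose dependencies are all done."""
    return sorted(t for t, d in deps.items() if t not in done and d <= done)


async def run_graph(deps: dict[str, set[str]], max_concurrent: int = 3) -> list[list[str]]:
    """Run the DAG in waves; returns the task ids completed in each wave."""
    sem = asyncio.Semaphore(max_concurrent)  # caps simultaneously running tasks

    async def limited(task_id: str) -> str:
        async with sem:
            return await run_dev_task(task_id)

    done: set[str] = set()
    waves: list[list[str]] = []
    while len(done) < len(deps):
        ready = unblocked(deps, done)
        if not ready:
            raise ValueError("Dependency cycle or unknown dependency in task graph")
        # gather() returns results in the same order as the awaitables passed in.
        finished = await asyncio.gather(*(limited(t) for t in ready))
        done.update(finished)  # completing a wave unlocks downstream tasks
        waves.append(list(finished))
    return waves
```

With `deps = {"schema": set(), "docs": set(), "api": {"schema"}, "ui": {"api"}}`, `asyncio.run(run_graph(deps))` runs `docs` and `schema` together, then `api`, then `ui`.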
Design Decisions
- Context Starvation Prevention: A read-only ArchitectureTracker summary is injected into every Dev Agent prompt so they know what other agents have built.
- Merge Conflict Handling: QA Agent rebases onto main before testing. Complex conflicts are kicked back to the Dev Agent automatically.
- Infinite Loop Protection: Max retry counter (3) per task at the LangGraph node level. Exceeded retries escalate to PM → human.
- Claude Code Automation: Dev agents interact with Claude Code via pexpect subprocess in headless mode inside Docker containers.
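The per-task retry limit from the Infinite Loop Protection item can be illustrated with a small counter. This is a hedged sketch, not the LangGraph node logic itself: `RetryTracker`, `next_step`, and the route names `retry_dev` and `escalate_to_pm` are hypothetical, and it assumes three retries are allowed before the next failure escalates.

```python
from dataclasses import dataclass, field

MAX_RETRIES = 3  # per-task limit stated in the design decisions


@dataclass
class RetryTracker:
    """Counts failed attempts per task and picks the next route."""

    attempts: dict[str, int] = field(default_factory=dict)

    def next_step(self, task_id: str) -> str:
        """Record one failure for `task_id` and return the route to take."""
        self.attempts[task_id] = self.attempts.get(task_id, 0) + 1
        if self.attempts[task_id] > MAX_RETRIES:
            return "escalate_to_pm"  # assumed route name: PM, then human
        return "retry_dev"  # assumed route name: send back to the Dev Agent
```

A function like `next_step` is the kind of callable that could back a conditional route in a state machine, since it maps state to the name of the next node. Counters are kept per task, so one failing task does not consume another task's retries.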
Testing
# Run full test suite
python -m pytest tests/ -v
# Run specific test file
python -m pytest tests/test_graph.py -v
# Run with coverage
python -m pytest tests/ --cov=app_factory --cov-report=term-missing
229 tests across 9 test files covering all agents, core components, and integration.
Configuration
Authentication
App Factory supports two auth modes:
- Claude Code OAuth (default): If you use Claude Code with OAuth, no API key is needed. The Claude Agent SDK (claude-agent-sdk) picks up your auth automatically.
- API key: Set ANTHROPIC_API_KEY in .env for direct API access.
Environment Variables
| Variable | Required | Description |
|---|---|---|
| ANTHROPIC_API_KEY | No* | Claude API key. Not needed with Claude Code OAuth. |
| OPENAI_API_KEY | No | Codex fallback for algorithmic generation |
| LANGSMITH_API_KEY | No | LangSmith tracing and observability |
| LANGSMITH_PROJECT | No | LangSmith project name (default: app-factory) |
* Required only if not using Claude Code OAuth.
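One way to read the table above in code is a pair of small helpers that inspect the environment. This is a sketch under assumptions, not how App Factory actually resolves its settings: `resolve_auth_mode` and `langsmith_project` are hypothetical names, and treating a non-empty `ANTHROPIC_API_KEY` as taking precedence over OAuth is an assumption the source does not confirm.

```python
import os
from typing import Mapping


def resolve_auth_mode(env: Mapping[str, str] = os.environ) -> str:
    """Return "api_key" if ANTHROPIC_API_KEY is set and non-empty, else "oauth".

    Assumption: an explicit key wins over Claude Code OAuth when both exist.
    """
    return "api_key" if env.get("ANTHROPIC_API_KEY", "").strip() else "oauth"


def langsmith_project(env: Mapping[str, str] = os.environ) -> str:
    """Return the LangSmith project name, using the documented default."""
    return env.get("LANGSMITH_PROJECT", "app-factory")
```

Passing a plain dict instead of `os.environ` makes the helpers easy to exercise without touching the real environment.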
License
MIT