# App Factory

Autonomous multi-agent orchestration framework. Give it a natural language prompt, get back a fully developed, QA-verified, and merged codebase.

## How It Works

```
User prompt
  → PM Agent (expands into structured PRD)
  → Task Agent (generates prioritized dependency graph via claude-task-master)
  → Dev Agents (concurrent, isolated Docker containers with Claude Code)
  → QA Agent (code review, tests, rebase, merge to main)
  → Done
```

If any agent gets blocked, the flow reverses through a **clarification loop** (Dev asks Task, Task asks PM, PM asks the human) while other agents keep working.
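
The escalation order described above can be sketched as a walk up a fixed chain. This is only an illustration, not the project's code: `ESCALATION_CHAIN`, `resolve_clarification`, and the responder callables are hypothetical names, and the real agents presumably exchange richer messages than plain strings.

```python
from typing import Callable, Optional

# Hypothetical escalation order, taken from the description:
# Dev asks Task, Task asks PM, PM asks the human.
ESCALATION_CHAIN = ["dev", "task", "pm", "human"]

# A responder returns an answer, or None if it cannot answer either.
Responder = Callable[[str], Optional[str]]


def resolve_clarification(
    question: str, asked_by: str, responders: dict[str, Responder]
) -> tuple[str, str]:
    """Walk up the chain from the asker until some level returns an answer."""
    start = ESCALATION_CHAIN.index(asked_by) + 1
    for level in ESCALATION_CHAIN[start:]:
        answer = responders[level](question)
        if answer is not None:
            return level, answer
    raise RuntimeError(f"no level could answer: {question!r}")


# Example: the Task level cannot answer, so the question reaches the PM level.
responders = {
    "task": lambda q: None,
    "pm": lambda q: "Use PostgreSQL",
    "human": lambda q: "Ask me later",
}
answered_by, answer = resolve_clarification("Which database?", "dev", responders)
```

In this sketch only the blocked task waits on the answer; nothing here stops other tasks from running, which matches the behaviour described above.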

## Quick Start

### Prerequisites

- Python 3.11+
- Docker Desktop (running)
- Git
- Claude Code with OAuth **or** an [Anthropic API key](https://console.anthropic.com/)

### Setup

```bash
# Clone and enter project
git clone <repo-url> && cd ai_ops2

# Create venv and install dependencies
uv venv
uv pip install -r requirements.txt

# Configure environment (optional; not needed with Claude Code OAuth)
cp .env.example .env
# Edit .env to add API keys if not using OAuth
# Optionally add LANGSMITH_API_KEY for tracing
```

### Run

```bash
# Build a project from a prompt
python main.py --prompt "Build a video transcription service with Whisper and summarization"

# Limit concurrent dev agents
python main.py --prompt "Build a REST API" --max-concurrent-tasks 3

# Target a specific repo
python main.py --prompt "Add user authentication" --repo-path /path/to/project

# Validate config without executing
python main.py --dry-run --prompt "test"

# Verbose logging
python main.py --prompt "Build a CLI tool" --debug
```
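
The flags used above could be wired up with `argparse` roughly as follows. This is a sketch, not the contents of the real `main.py`: the flag names come from the commands above, while `build_parser`, the help strings, the defaults, and treating `--prompt` as required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the Run examples."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--prompt", required=True,
                        help="natural language description of what to build")
    parser.add_argument("--max-concurrent-tasks", type=int, default=None,
                        help="upper bound on simultaneously running dev agents")
    parser.add_argument("--repo-path", default=None,
                        help="existing repository to work in")
    parser.add_argument("--dry-run", action="store_true",
                        help="validate configuration without executing")
    parser.add_argument("--debug", action="store_true",
                        help="enable verbose logging")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(
    ["--prompt", "Build a REST API", "--max-concurrent-tasks", "3"]
)
```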

## Architecture

### Agents

| Agent | File | Role |
|-------|------|------|
| **PMAgent** | `agents/pm_agent.py` | Expands prompts into PRDs, handles clarification requests |
| **TaskMasterAgent** | `agents/task_agent.py` | Bridges to claude-task-master for task graph management |
| **DevAgentManager** | `agents/dev_agent.py` | Spawns Claude Code in Docker containers via pexpect |
| **QAAgent** | `agents/qa_agent.py` | Code review, linting, testing, rebase, and merge |

### Core

| Component | File | Role |
|-----------|------|------|
| **AppFactoryOrchestrator** | `core/graph.py` | LangGraph state machine with conditional routing |
| **WorkspaceManager** | `core/workspace.py` | Git worktree + Docker container lifecycle |
| **ObservabilityManager** | `core/observability.py` | LangSmith tracing + structured logging |
| **ArchitectureTracker** | `core/architecture_tracker.py` | Prevents context starvation across dev agents |

### Project Structure

```
app_factory/
├── agents/
│   ├── pm_agent.py              # PRD generation + clarification
│   ├── task_agent.py            # claude-task-master interface
│   ├── dev_agent.py             # Claude Code + Docker orchestration
│   └── qa_agent.py              # Review, test, merge pipeline
├── core/
│   ├── graph.py                 # LangGraph state machine
│   ├── workspace.py             # Git worktree + Docker isolation
│   ├── observability.py         # LangSmith tracing + logging
│   └── architecture_tracker.py  # Global architecture summary
├── prompts/                     # Agent prompt templates
│   ├── pm_prd_expansion.txt
│   ├── pm_clarification.txt
│   ├── dev_task_execution.txt
│   └── qa_review.txt
└── data/                        # Runtime state + architecture tracking
```

## Execution Phases

1. **Linear Planning**: User → PM Agent → Task Agent. Produces a prioritized DAG of tasks.
2. **Dynamic Concurrency**: The orchestrator starts a WorkspaceManager and a Dev Agent for every unblocked task and runs them concurrently via `asyncio.gather()`.
3. **Clarification Loop**: Blocked agents route requests backward up the chain. Other agents continue uninterrupted.
4. **QA & Merge**: The QA Agent rebases, lints, tests, reviews, and merges each completed task. The Task Agent then unlocks downstream dependencies.
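
Phases 2 and 4 can be illustrated with a small scheduler that repeatedly runs every unblocked task through `asyncio.gather()` and then unlocks its dependents. This is a simplified sketch under stated assumptions: `run_dev_agent` and `run_task_graph` are hypothetical names, the dev/QA work is replaced by a no-op, and tasks are processed in whole waves, whereas the real orchestrator may unlock dependents as soon as each individual task merges.

```python
import asyncio


async def run_dev_agent(task_id: str) -> str:
    """Stand-in for a Dev Agent run followed by QA and merge (hypothetical)."""
    await asyncio.sleep(0)  # placeholder for the real container work
    return task_id


async def run_task_graph(
    dependencies: dict[str, set[str]], max_concurrent: int = 3
) -> list[list[str]]:
    """Run tasks wave by wave; a task is unblocked once all its deps are done."""
    done: set[str] = set()
    waves: list[list[str]] = []
    limit = asyncio.Semaphore(max_concurrent)  # caps simultaneous dev agents

    async def guarded(task_id: str) -> str:
        async with limit:
            return await run_dev_agent(task_id)

    while len(done) < len(dependencies):
        ready = sorted(
            t for t, deps in dependencies.items()
            if t not in done and deps <= done
        )
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        # gather() returns results in the order the awaitables were passed.
        finished = await asyncio.gather(*(guarded(t) for t in ready))
        done.update(finished)
        waves.append(list(finished))
    return waves


# Example graph: "api" and "ui" both wait for "schema"; "e2e" waits for both.
graph = {"schema": set(), "api": {"schema"}, "ui": {"schema"}, "e2e": {"api", "ui"}}
waves = asyncio.run(run_task_graph(graph))
```

The semaphore plays the role that `--max-concurrent-tasks` appears to play on the command line: every ready task is scheduled, but only a bounded number run at once.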

## Design Decisions

- **Context Starvation Prevention**: A read-only `ArchitectureTracker` summary is injected into every Dev Agent prompt so that each agent knows what the others have built.
- **Merge Conflict Handling**: The QA Agent rebases onto main before testing. Complex conflicts are sent back to the Dev Agent automatically.
- **Infinite Loop Protection**: Each task has a maximum of 3 retries, enforced at the LangGraph node level. Tasks that exceed the limit escalate to the PM Agent and then to the human.
- **Claude Code Automation**: Dev agents drive Claude Code in headless mode through a `pexpect` subprocess inside Docker containers.
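
The retry limit can be pictured as a routing function that decides where a task goes after QA. The sketch below is an interpretation, not the project's code: `TaskState`, `route_after_qa`, and the node names are hypothetical, and it assumes "3 retries" means three re-runs after the first failed attempt. A function of this shape could plausibly serve as the routing callable for a LangGraph conditional edge, but no LangGraph API is used here.

```python
from dataclasses import dataclass

MAX_RETRIES = 3  # limit stated in the design decisions above


@dataclass
class TaskState:
    """Minimal per-task state (hypothetical; the real state is likely richer)."""
    task_id: str
    retries: int = 0


def route_after_qa(state: TaskState, qa_passed: bool) -> str:
    """Return the name of the next node: 'merge', 'dev', or 'escalate_to_pm'."""
    if qa_passed:
        return "merge"
    if state.retries >= MAX_RETRIES:
        # Retry budget exhausted: hand the task to the PM Agent,
        # which may in turn ask the human.
        return "escalate_to_pm"
    state.retries += 1
    return "dev"  # send the task back to the Dev Agent for another attempt


# Example: four consecutive QA failures for the same task.
state = TaskState("task-1")
routes = [route_after_qa(state, qa_passed=False) for _ in range(4)]
```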

## Testing

```bash
# Run full test suite
python -m pytest tests/ -v

# Run specific test file
python -m pytest tests/test_graph.py -v

# Run with coverage
python -m pytest tests/ --cov=app_factory --cov-report=term-missing
```

**229 tests** across 9 test files covering all agents, core components, and integration.

## Configuration

### Authentication

App Factory supports two auth modes:

- **Claude Code OAuth** (default): If you use Claude Code with OAuth, no API key is needed. The Claude Agent SDK (`claude-agent-sdk`) picks up your auth automatically.
- **API key**: Set `ANTHROPIC_API_KEY` in `.env` for direct API access.

### Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `ANTHROPIC_API_KEY` | No* | Claude API key. Not needed with Claude Code OAuth. |
| `OPENAI_API_KEY` | No | Codex fallback for algorithmic generation |
| `LANGSMITH_API_KEY` | No | LangSmith tracing and observability |
| `LANGSMITH_PROJECT` | No | LangSmith project name (default: `app-factory`) |

*Required only if not using Claude Code OAuth.*
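
As an illustration of how the variables in the table might be consumed, the sketch below reads them from an environment mapping and derives the auth mode. `Settings`, `load_settings`, and the property names are hypothetical, and loading the `.env` file into the environment is deliberately left out; the only details taken from the table are the variable names and the `app-factory` default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the table."""
    anthropic_api_key: Optional[str]
    openai_api_key: Optional[str]
    langsmith_api_key: Optional[str]
    langsmith_project: str

    @property
    def auth_mode(self) -> str:
        # Assumption: an explicit API key selects API-key mode,
        # otherwise Claude Code OAuth is expected to be available.
        return "api_key" if self.anthropic_api_key else "oauth"

    @property
    def tracing_enabled(self) -> bool:
        return self.langsmith_api_key is not None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    def optional(name: str) -> Optional[str]:
        value = env.get(name, "").strip()
        return value or None  # treat empty strings as unset

    return Settings(
        anthropic_api_key=optional("ANTHROPIC_API_KEY"),
        openai_api_key=optional("OPENAI_API_KEY"),
        langsmith_api_key=optional("LANGSMITH_API_KEY"),
        langsmith_project=optional("LANGSMITH_PROJECT") or "app-factory",
    )


# Example with explicit mappings instead of the real process environment.
defaults = load_settings({})
custom = load_settings({"ANTHROPIC_API_KEY": "sk-test", "LANGSMITH_PROJECT": "demo"})
```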

## License

MIT