first commit

2026-02-25 23:49:54 -05:00
commit 4d097161cb
1775 changed files with 452827 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,155 @@
+# App Factory
+
+Autonomous multi-agent orchestration framework. Give it a natural-language prompt and get back a fully developed, QA-verified, and merged codebase.
+
+## How It Works
+
+```
+User prompt
+  → PM Agent (expands into structured PRD)
+    → Task Agent (generates prioritized dependency graph via claude-task-master)
+      → Dev Agents (concurrent, isolated Docker containers with Claude Code)
+        → QA Agent (code review, tests, rebase, merge to main)
+          → Done
+```
+
+If any agent gets blocked, the flow reverses through a **clarification loop** — Dev asks Task, Task asks PM, PM asks the human — while other agents keep working.
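+
+The escalation order of the clarification loop can be sketched as a simple chain lookup. This is only an illustration: `ESCALATION_CHAIN` and `escalate` are hypothetical names and are not taken from the App Factory codebase.
+
+```python
+# Hypothetical sketch of the clarification chain; names are illustrative only.
+ESCALATION_CHAIN = ["dev", "task", "pm", "human"]
+
+
+def escalate(blocked: str) -> str:
+    """Return who a blocked party asks next: Dev -> Task -> PM -> human."""
+    index = ESCALATION_CHAIN.index(blocked)
+    if index == len(ESCALATION_CHAIN) - 1:
+        raise ValueError("the human is the end of the chain")
+    return ESCALATION_CHAIN[index + 1]
+
+
+print(escalate("dev"))  # task
+print(escalate("pm"))   # human
+```
+
+Only the blocked agent waits on the answer; the sketch does not model the other agents, which keep working.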
+
+## Quick Start
+
+### Prerequisites
+
+- Python 3.11+
+- Docker Desktop (running)
+- Git
+- Claude Code with OAuth **or** an [Anthropic API key](https://console.anthropic.com/)
+
+### Setup
+
+```bash
+# Clone and enter project
+git clone <repo-url> && cd ai_ops2
+
+# Create venv and install dependencies
+uv venv
+uv pip install -r requirements.txt
+
+# Configure environment (optional — not needed with Claude Code OAuth)
+cp .env.example .env
+# Edit .env to add API keys if not using OAuth
+# Optionally add LANGSMITH_API_KEY for tracing
+```
+
+### Run
+
+```bash
+# Build a project from a prompt
+python main.py --prompt "Build a video transcription service with Whisper and summarization"
+
+# Limit concurrent dev agents
+python main.py --prompt "Build a REST API" --max-concurrent-tasks 3
+
+# Target a specific repo
+python main.py --prompt "Add user authentication" --repo-path /path/to/project
+
+# Validate config without executing
+python main.py --dry-run --prompt "test"
+
+# Verbose logging
+python main.py --prompt "Build a CLI tool" --debug
+```
+
+## Architecture
+
+### Agents
+
+| Agent | File | Role |
+|-------|------|------|
+| **PMAgent** | `agents/pm_agent.py` | Expands prompts into PRDs, handles clarification requests |
+| **TaskMasterAgent** | `agents/task_agent.py` | Bridges to claude-task-master for task graph management |
+| **DevAgentManager** | `agents/dev_agent.py` | Spawns Claude Code in Docker containers via pexpect |
+| **QAAgent** | `agents/qa_agent.py` | Code review, linting, testing, rebase, and merge |
+
+### Core
+
+| Component | File | Role |
+|-----------|------|------|
+| **AppFactoryOrchestrator** | `core/graph.py` | LangGraph state machine with conditional routing |
+| **WorkspaceManager** | `core/workspace.py` | Git worktree + Docker container lifecycle |
+| **ObservabilityManager** | `core/observability.py` | LangSmith tracing + structured logging |
+| **ArchitectureTracker** | `core/architecture_tracker.py` | Prevents context starvation across dev agents |
+
+### Project Structure
+
+```
+app_factory/
+├── agents/
+│   ├── pm_agent.py          # PRD generation + clarification
+│   ├── task_agent.py        # claude-task-master interface
+│   ├── dev_agent.py         # Claude Code + Docker orchestration
+│   └── qa_agent.py          # Review, test, merge pipeline
+├── core/
+│   ├── graph.py             # LangGraph state machine
+│   ├── workspace.py         # Git worktree + Docker isolation
+│   ├── observability.py     # LangSmith tracing + logging
+│   └── architecture_tracker.py  # Global architecture summary
+├── prompts/                 # Agent prompt templates
+│   ├── pm_prd_expansion.txt
+│   ├── pm_clarification.txt
+│   ├── dev_task_execution.txt
+│   └── qa_review.txt
+└── data/                    # Runtime state + architecture tracking
+```
+
+## Execution Phases
+
+1. **Linear Planning** — User → PM Agent → Task Agent. Produces a prioritized DAG of tasks.
+2. **Dynamic Concurrency** — Orchestrator spins up a WorkspaceManager + DevAgent for every unblocked task concurrently via `asyncio.gather()`.
+3. **Clarification Loop** — Blocked agents route requests backward up the chain. Other agents continue uninterrupted.
+4. **QA & Merge** — QA Agent rebases, lints, tests, reviews, and merges each completed task. Task Agent then unlocks downstream dependencies.
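+
+The interplay of phases 2 and 4 can be sketched as a loop that runs every unblocked task with `asyncio.gather()` and then unlocks its dependents. This is a minimal sketch under stated assumptions: the task graph `TASKS` and the functions `run_dev_agent` and `run_all` are hypothetical stand-ins, and the real orchestrator may schedule tasks differently (for example, by honoring `--max-concurrent-tasks`).
+
+```python
+import asyncio
+
+# Hypothetical task graph: task id -> ids of the tasks it depends on.
+TASKS = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}}
+
+
+async def run_dev_agent(task_id: str) -> str:
+    """Stand-in for a Dev Agent working in its own container."""
+    await asyncio.sleep(0)  # placeholder for real work
+    return task_id
+
+
+async def run_all(tasks: dict[str, set[str]]) -> list[list[str]]:
+    """Run tasks in waves; each wave holds every currently unblocked task."""
+    done: set[str] = set()
+    waves: list[list[str]] = []
+    while len(done) < len(tasks):
+        ready = sorted(
+            t for t, deps in tasks.items() if t not in done and deps <= done
+        )
+        if not ready:
+            raise RuntimeError("dependency cycle detected")
+        finished = await asyncio.gather(*(run_dev_agent(t) for t in ready))
+        done.update(finished)  # completing a wave unlocks downstream tasks
+        waves.append(list(finished))
+    return waves
+
+
+if __name__ == "__main__":
+    print(asyncio.run(run_all(TASKS)))  # [['a', 'b'], ['c'], ['d']]
+```
+
+Tasks `a` and `b` have no dependencies, so they run together; `c` starts only after both finish, and `d` after `c`.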
+
+## Design Decisions
+
+- **Context Starvation Prevention**: A read-only `ArchitectureTracker` summary is injected into every Dev Agent prompt so each agent knows what the others have built.
+- **Merge Conflict Handling**: QA Agent rebases onto main before testing. Complex conflicts are kicked back to the Dev Agent automatically.
+- **Infinite Loop Protection**: Each task has a maximum retry count of 3, enforced at the LangGraph node level. Tasks that exceed it escalate to the PM Agent and then to the human.
+- **Claude Code Automation**: Dev agents drive Claude Code through a `pexpect` subprocess running in headless mode inside Docker containers.
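+
+The retry cap can be sketched as a small routing function. This is an assumption-laden illustration: `MAX_RETRIES`, `next_step`, and the route labels `retry_dev` and `escalate_to_pm` are hypothetical, and the sketch assumes that three retries are allowed before the fourth failure escalates. How the real graph wires this into LangGraph is not shown here.
+
+```python
+MAX_RETRIES = 3  # assumed to match the per-task limit described above
+
+
+def next_step(retry_counts: dict[str, int], task_id: str) -> str:
+    """Record one failed attempt and decide where the task goes next."""
+    retry_counts[task_id] = retry_counts.get(task_id, 0) + 1
+    if retry_counts[task_id] > MAX_RETRIES:
+        return "escalate_to_pm"  # PM Agent first, then the human
+    return "retry_dev"
+
+
+counts: dict[str, int] = {}
+print([next_step(counts, "task-1") for _ in range(4)])
+# ['retry_dev', 'retry_dev', 'retry_dev', 'escalate_to_pm']
+```
+
+Keeping the counter per task means one stuck task escalates without affecting the retry budget of the others.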
+
+## Testing
+
+```bash
+# Run full test suite
+python -m pytest tests/ -v
+
+# Run specific test file
+python -m pytest tests/test_graph.py -v
+
+# Run with coverage
+python -m pytest tests/ --cov=app_factory --cov-report=term-missing
+```
+
+The suite contains **229 tests** across 9 test files, covering all agents, core components, and integration.
+
+## Configuration
+
+### Authentication
+
+App Factory supports two auth modes:
+
+- **Claude Code OAuth** (default) — If you use Claude Code with OAuth, no API key is needed. The Claude Agent SDK (`claude-agent-sdk`) picks up your auth automatically.
+- **API key** — Set `ANTHROPIC_API_KEY` in `.env` for direct API access.
+
+### Environment Variables
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `ANTHROPIC_API_KEY` | No* | Claude API key. Not needed with Claude Code OAuth. |
+| `OPENAI_API_KEY` | No | OpenAI API key for the Codex fallback (algorithmic generation) |
+| `LANGSMITH_API_KEY` | No | LangSmith API key for tracing and observability |
+| `LANGSMITH_PROJECT` | No | LangSmith project name (default: `app-factory`) |
+
+\* Required only if not using Claude Code OAuth.
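+
+The table above can be read as the following lookup logic. This is a hedged sketch, not the project's actual configuration loader: `load_settings` and the keys of the returned dictionary are hypothetical, and only the variable names and the `app-factory` default come from the table.
+
+```python
+import os
+
+
+def load_settings(env: dict[str, str] | None = None) -> dict[str, str | None]:
+    """Read the documented variables; only the project name has a default."""
+    env = dict(os.environ) if env is None else env
+    api_key = env.get("ANTHROPIC_API_KEY")
+    return {
+        # Without an API key, fall back to Claude Code OAuth.
+        "auth_mode": "api_key" if api_key else "oauth",
+        "anthropic_api_key": api_key,
+        "openai_api_key": env.get("OPENAI_API_KEY"),
+        "langsmith_api_key": env.get("LANGSMITH_API_KEY"),
+        "langsmith_project": env.get("LANGSMITH_PROJECT", "app-factory"),
+    }
+
+
+print(load_settings({})["auth_mode"])  # oauth
+```
+
+Passing a plain dictionary instead of reading `os.environ` directly keeps the function easy to test.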
+
+## License
+
+MIT