first commit

2026-02-25 23:49:54 -05:00
commit 4d097161cb
1775 changed files with 452827 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,155 @@
# App Factory
Autonomous multi-agent orchestration framework. Give it a natural language prompt, get back a fully developed, QA-verified, and merged codebase.
## How It Works
```
User prompt
→ PM Agent (expands into structured PRD)
→ Task Agent (generates prioritized dependency graph via claude-task-master)
→ Dev Agents (concurrent, isolated Docker containers with Claude Code)
→ QA Agent (code review, tests, rebase, merge to main)
→ Done
```
If any agent gets blocked, the flow reverses through a **clarification loop** — Dev asks Task, Task asks PM, PM asks the human — while other agents keep working.
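The escalation order can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: `ESCALATION_CHAIN`, `escalate`, and the `answerers` callbacks are hypothetical names, and only the Dev, Task, PM, human ordering comes from the description above. It covers routing only, not the concurrent work of other agents.

```python
from typing import Callable, Optional

# Escalation order from the description above: Dev asks Task, Task asks PM,
# PM asks the human. These names are illustrative, not the project's API.
ESCALATION_CHAIN = ["dev", "task", "pm", "human"]

# An answerer returns a string if it can resolve the question, else None.
Answerer = Callable[[str], Optional[str]]


def escalate(question: str, asked_by: str, answerers: dict[str, Answerer]) -> tuple[str, str]:
    """Walk up the chain from `asked_by` until someone answers.

    Returns (responder, answer); raises if nobody in the chain can answer.
    """
    start = ESCALATION_CHAIN.index(asked_by) + 1
    for role in ESCALATION_CHAIN[start:]:
        answer = answerers[role](question)
        if answer is not None:
            return role, answer
    raise RuntimeError(f"No one in the chain could answer: {question!r}")


if __name__ == "__main__":
    answerers = {
        "task": lambda q: None,  # Task Agent lacks the context
        "pm": lambda q: "Use PostgreSQL" if "database" in q else None,
        "human": lambda q: "Ask me later",
    }
    print(escalate("Which database should I use?", "dev", answerers))
```

In this example the Task Agent cannot answer, so the question stops at the PM Agent and never reaches the human.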
## Quick Start
### Prerequisites
- Python 3.11+
- Docker Desktop (running)
- Git
- uv (used in Setup to create the virtual environment and install dependencies)
- Claude Code with OAuth **or** an [Anthropic API key](https://console.anthropic.com/)
### Setup
```bash
# Clone and enter project
git clone <repo-url> && cd ai_ops2
# Create venv, activate it, and install dependencies
uv venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
# Configure environment (optional — not needed with Claude Code OAuth)
cp .env.example .env
# Edit .env to add API keys if not using OAuth
# Optionally add LANGSMITH_API_KEY for tracing
```
### Run
```bash
# Build a project from a prompt
python main.py --prompt "Build a video transcription service with Whisper and summarization"
# Limit concurrent dev agents
python main.py --prompt "Build a REST API" --max-concurrent-tasks 3
# Target a specific repo
python main.py --prompt "Add user authentication" --repo-path /path/to/project
# Validate config without executing
python main.py --dry-run --prompt "test"
# Verbose logging
python main.py --prompt "Build a CLI tool" --debug
```
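For readers who want to see how these flags might be parsed, here is a minimal `argparse` sketch. The flag names are taken from the commands above; everything else (that `--prompt` is required, the types, the defaults, the help text, and the use of `argparse` at all) is an assumption and may not match the actual `main.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the Run examples.

    Types, defaults, and help strings are assumptions for illustration.
    """
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--prompt", required=True,
                        help="Natural language description of what to build")
    parser.add_argument("--max-concurrent-tasks", type=int, default=None,
                        help="Upper bound on concurrently running dev agents")
    parser.add_argument("--repo-path", default=None,
                        help="Existing repository to work in")
    parser.add_argument("--dry-run", action="store_true",
                        help="Validate configuration without executing")
    parser.add_argument("--debug", action="store_true",
                        help="Enable verbose logging")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--prompt", "Build a REST API", "--max-concurrent-tasks", "3"]
    )
    print(args.prompt, args.max_concurrent_tasks, args.dry_run)
```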
## Architecture
### Agents
| Agent | File | Role |
|-------|------|------|
| **PMAgent** | `agents/pm_agent.py` | Expands prompts into PRDs, handles clarification requests |
| **TaskMasterAgent** | `agents/task_agent.py` | Bridges to claude-task-master for task graph management |
| **DevAgentManager** | `agents/dev_agent.py` | Spawns Claude Code in Docker containers via pexpect |
| **QAAgent** | `agents/qa_agent.py` | Code review, linting, testing, rebase, and merge |
### Core
| Component | File | Role |
|-----------|------|------|
| **AppFactoryOrchestrator** | `core/graph.py` | LangGraph state machine with conditional routing |
| **WorkspaceManager** | `core/workspace.py` | Git worktree + Docker container lifecycle |
| **ObservabilityManager** | `core/observability.py` | LangSmith tracing + structured logging |
| **ArchitectureTracker** | `core/architecture_tracker.py` | Prevents context starvation across dev agents |
### Project Structure
```
app_factory/
├── agents/
│   ├── pm_agent.py               # PRD generation + clarification
│   ├── task_agent.py             # claude-task-master interface
│   ├── dev_agent.py              # Claude Code + Docker orchestration
│   └── qa_agent.py               # Review, test, merge pipeline
├── core/
│   ├── graph.py                  # LangGraph state machine
│   ├── workspace.py              # Git worktree + Docker isolation
│   ├── observability.py          # LangSmith tracing + logging
│   └── architecture_tracker.py   # Global architecture summary
├── prompts/                      # Agent prompt templates
│   ├── pm_prd_expansion.txt
│   ├── pm_clarification.txt
│   ├── dev_task_execution.txt
│   └── qa_review.txt
└── data/                         # Runtime state + architecture tracking
```
## Execution Phases
1. **Linear Planning** — User → PM Agent → Task Agent. Produces a prioritized DAG of tasks.
2. **Dynamic Concurrency** — The orchestrator starts a WorkspaceManager and a Dev Agent for each unblocked task and runs them concurrently via `asyncio.gather()`.
3. **Clarification Loop** — Blocked agents route requests backward up the chain. Other agents continue uninterrupted.
4. **QA & Merge** — QA Agent rebases, lints, tests, reviews, and merges each completed task. Task Agent then unlocks downstream dependencies.
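Phases 2 and 4 can be pictured as a loop over the task graph: run everything that is unblocked, then let the finished tasks unlock their dependents. The sketch below is an assumption-heavy simplification, not the orchestrator's code. The `TASKS` graph, `run_dev_and_qa`, and `run_graph` are hypothetical, the semaphore is one plausible way to honor a limit like `--max-concurrent-tasks`, and the sketch waits for a whole wave to finish before starting the next one, which the real orchestrator may not do.

```python
import asyncio

# Hypothetical task graph: task id -> set of task ids it depends on.
TASKS: dict[str, set[str]] = {
    "schema": set(),
    "api": {"schema"},
    "cli": {"schema"},
    "docs": {"api", "cli"},
}


async def run_dev_and_qa(task_id: str, limit: asyncio.Semaphore) -> str:
    """Stand-in for a Dev Agent run followed by QA and merge."""
    async with limit:  # caps how many tasks run at once
        await asyncio.sleep(0)  # placeholder for real work
        return task_id


async def run_graph(tasks: dict[str, set[str]], max_concurrent: int = 3) -> list[list[str]]:
    """Run tasks in waves; a wave is every task whose dependencies are done."""
    limit = asyncio.Semaphore(max_concurrent)
    done: set[str] = set()
    waves: list[list[str]] = []
    while len(done) < len(tasks):
        ready = sorted(t for t, deps in tasks.items()
                       if t not in done and deps <= done)
        if not ready:
            raise ValueError("Dependency cycle or unknown dependency in graph")
        finished = await asyncio.gather(*(run_dev_and_qa(t, limit) for t in ready))
        done.update(finished)  # completed tasks unlock downstream work
        waves.append(ready)
    return waves


if __name__ == "__main__":
    print(asyncio.run(run_graph(TASKS)))
```

With the sample graph, `schema` runs alone, `api` and `cli` run together once it is done, and `docs` runs last, so the script prints `[['schema'], ['api', 'cli'], ['docs']]`.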
## Design Decisions
- **Context Starvation Prevention**: A read-only `ArchitectureTracker` summary is injected into every Dev Agent prompt so each agent knows what the others have built.
- **Merge Conflict Handling**: QA Agent rebases onto main before testing. Complex conflicts are kicked back to the Dev Agent automatically.
- **Infinite Loop Protection**: Each task has a maximum retry count of 3, enforced at the LangGraph node level. Tasks that exceed it escalate to the PM Agent and then to the human.
- **Claude Code Automation**: Dev agents drive Claude Code in headless mode through a `pexpect` subprocess inside Docker containers.
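The retry limit is the easiest of these decisions to illustrate. The following is a plain-Python sketch rather than LangGraph code: `run_with_retry_limit`, `attempt`, and `escalate` are hypothetical names, and it assumes the limit of 3 means three attempts in total, which the text above does not spell out.

```python
from typing import Callable

MAX_RETRIES = 3  # limit stated in the design decision above


def run_with_retry_limit(
    task_id: str,
    attempt: Callable[[str], bool],
    escalate: Callable[[str], None],
) -> bool:
    """Try a task up to MAX_RETRIES times, then escalate instead of looping.

    `attempt` returns True on success. `escalate` stands in for handing the
    task to the PM Agent (and, from there, to the human).
    """
    for _ in range(MAX_RETRIES):
        if attempt(task_id):
            return True
    escalate(task_id)
    return False


if __name__ == "__main__":
    escalated: list[str] = []
    ok = run_with_retry_limit("task-7", lambda t: False, escalated.append)
    print(ok, escalated)
```

Bounding the loop this way guarantees that a task which keeps failing review ends up in front of a person rather than cycling between Dev and QA.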
## Testing
```bash
# Run full test suite
python -m pytest tests/ -v
# Run specific test file
python -m pytest tests/test_graph.py -v
# Run with coverage
python -m pytest tests/ --cov=app_factory --cov-report=term-missing
```
**229 tests** across 9 test files covering all agents, core components, and integration.
## Configuration
### Authentication
App Factory supports two auth modes:
- **Claude Code OAuth** (default) — If you use Claude Code with OAuth, no API key is needed. The Claude Agent SDK (`claude-agent-sdk`) picks up your auth automatically.
- **API key** — Set `ANTHROPIC_API_KEY` in `.env` for direct API access.
### Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `ANTHROPIC_API_KEY` | No* | Claude API key. Not needed with Claude Code OAuth. |
| `OPENAI_API_KEY` | No | Codex fallback for algorithmic generation |
| `LANGSMITH_API_KEY` | No | LangSmith tracing and observability |
| `LANGSMITH_PROJECT` | No | LangSmith project name (default: `app-factory`) |
*\* Required only if not using Claude Code OAuth.*
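As a rough sketch of how these variables could be read at startup, the snippet below resolves them into a small settings object. The `Settings` class and `load_settings` function are hypothetical, and the rule that a set `ANTHROPIC_API_KEY` selects API-key mode is an assumption; only the variable names and the `app-factory` default come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    auth_mode: str                    # "api_key" or "oauth"
    anthropic_api_key: Optional[str]
    langsmith_enabled: bool
    langsmith_project: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve settings from environment variables (illustrative only)."""
    api_key = env.get("ANTHROPIC_API_KEY") or None
    return Settings(
        # Assumption: a configured key means direct API access; otherwise
        # fall back to Claude Code OAuth, the documented default.
        auth_mode="api_key" if api_key else "oauth",
        anthropic_api_key=api_key,
        langsmith_enabled=bool(env.get("LANGSMITH_API_KEY")),
        langsmith_project=env.get("LANGSMITH_PROJECT", "app-factory"),
    )


if __name__ == "__main__":
    print(load_settings({}))
```

Passing the environment in as a mapping keeps the function easy to exercise in tests without touching the real process environment.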
## License
MIT