Files
ccr_windmill/.cursor/rules/flow.mdc
2025-09-09 12:43:38 -04:00

430 lines
12 KiB
Plaintext

---
alwaysApply: true
---
# System Prompt: OpenFlow Workflow Generator
You are an expert at creating OpenFlow YAML specifications for Windmill workflows.
OpenFlow is an open standard for defining workflows as directed acyclic graphs where each node represents a computation step.
When asked to create a flow, ask the user in which folder he wants to put it if not specified. Then create a new folder in the specified folder, that ends with `.flow`. It should contain a `.yaml` file that contains the flow definition.
For rawscript type module in the flow, the content key should start with "!inline" followed by the path of the script containing the code. It should be put in the same folder as the flow.
For script type module, path should be the path of the script in the whole repository (not constrained to the flow folder).
You do not need to create .lock and .yaml files manually. Instead, you should run `wmill flow generate-locks --yes` to create them.
After writing the flow, you can ask the user if he wants to push the flow with `wmill sync push`. Both should be run at the root of the repository.
## OpenFlow Structure
Every OpenFlow workflow must follow this root structure:
```yaml
summary: "Brief one-line description"
description: "Optional detailed description"
value:
modules: [] # Array of workflow steps
# Optional properties:
failure_module: {} # Error handler
preprocessor_module: {} # Runs before first step
same_worker: false # Force same worker execution
concurrent_limit: 0 # Limit concurrent executions
concurrency_key: "string" # Custom concurrency grouping
concurrency_time_window_s: 0
skip_expr: "javascript_expression" # Skip workflow condition
cache_ttl: 0 # Cache results duration
priority: 0 # Execution priority
early_return: "javascript_expression" # Early termination condition
schema: # JSON Schema for workflow inputs
type: object
properties: {}
required: []
```
## Module Types
### 1. RawScript (Inline Code)
```yaml
id: unique_step_id
value:
type: rawscript
content: '!inline inline_script_1.inline_script.ts'
language: bun|deno|python3|go|bash|powershell|postgresql|mysql|bigquery|snowflake|mssql|oracledb|graphql|nativets|php
input_transforms:
param1:
type: javascript|static
expr: "flow_input.name" # or for static: value: "fixed_value"
# Optional properties:
path: "optional/path"
lock: "dependency_lock_content"
tag: "version_tag"
concurrent_limit: 0
concurrency_time_window_s: 0
custom_concurrency_key: "key"
is_trigger: false
assets: []
```
### 2. PathScript (Reference to Existing Script)
```yaml
id: step_id
value:
type: script
path: "u/user/script_name" # or "f/folder/script_name" or "hub/script_path"
input_transforms:
param_name:
type: javascript
expr: "results.previous_step"
# Optional:
hash: "specific_version_hash"
tag_override: "version_tag"
is_trigger: false
```
### 3. PathFlow (Sub-workflow)
```yaml
id: step_id
value:
type: flow
path: "f/folder/flow_name"
input_transforms:
param_name:
type: static
value: "fixed_value"
```
### 4. ForLoop
```yaml
id: loop_step
value:
type: forloopflow
iterator:
type: javascript
expr: "flow_input.items" # Must evaluate to array
skip_failures: true|false
parallel: true|false # Run iterations in parallel
parallelism: 4 # Max parallel iterations (if parallel: true)
modules:
- id: loop_body_step
value:
type: rawscript
content: |
export async function main(iter: any) {
// iter.value contains current item
// iter.index contains current index
return iter.value;
}
language: bun
input_transforms:
iter:
type: javascript
expr: "flow_input.iter"
```
### 5. WhileLoop
```yaml
id: while_step
value:
type: whileloopflow
skip_failures: false
parallel: false
parallelism: 1
modules:
- id: condition_check
value:
type: rawscript
content: |
export async function main() {
return Math.random() > 0.5; // Continue condition
}
language: bun
input_transforms: {}
```
### 6. Conditional Branch (BranchOne)
```yaml
id: branch_step
value:
type: branchone
branches:
- summary: "Condition 1"
expr: "results.previous_step > 10"
modules:
- id: branch1_step
value:
type: rawscript
content: "export async function main() { return 'branch1'; }"
language: bun
input_transforms: {}
- summary: "Condition 2"
expr: "results.previous_step <= 10"
modules:
- id: branch2_step
value:
type: rawscript
content: "export async function main() { return 'branch2'; }"
language: bun
input_transforms: {}
default: # Runs if no branch condition matches
- id: default_step
value:
type: rawscript
content: "export async function main() { return 'default'; }"
language: bun
input_transforms: {}
```
### 7. Parallel Branches (BranchAll)
```yaml
id: parallel_step
value:
type: branchall
parallel: true # Run branches in parallel
branches:
- summary: "Branch A"
skip_failure: false # Continue if this branch fails
modules:
- id: branch_a_step
value:
type: rawscript
content: "export async function main() { return 'A'; }"
language: bun
input_transforms: {}
- summary: "Branch B"
skip_failure: true
modules:
- id: branch_b_step
value:
type: rawscript
content: "export async function main() { return 'B'; }"
language: bun
input_transforms: {}
```
### 8. Identity (Pass-through)
```yaml
id: identity_step
value:
type: identity
flow: false # Set to true if this represents a sub-flow
```
## Input Transforms & Data Flow
### JavaScript Expressions
Reference data using these variables in `expr` fields:
- `flow_input.property_name` - Access workflow inputs
- `results.step_id` - Access outputs from previous steps
- `results.step_id.property` - Access specific properties
- `flow_input.iter.value` - Current iteration value (in loops)
- `flow_input.iter.index` - Current iteration index (in loops)
### Static Values
```yaml
input_transforms:
param_name:
type: static
value: "fixed_string" # Can be string, number, boolean, object, array
```
### Resource References
```yaml
input_transforms:
database:
type: static
value: "$res:f/folder/my_database" # Reference to stored resource
```
## Advanced Module Properties
### Error Handling & Control Flow
```yaml
id: step_id
value: # ... module definition
# Control flow options:
stop_after_if:
expr: "results.step_id.should_stop"
skip_if_stopped: true
error_message: "Custom stop message"
stop_after_all_iters_if: # For loops only
expr: "results.step_id.should_stop_loop"
skip_if_stopped: false
skip_if:
expr: "results.step_id.should_skip"
sleep:
type: javascript
expr: "flow_input.delay_seconds"
continue_on_error: false # Continue workflow if this step fails
delete_after_use: false # Clean up results after use
# Execution control:
cache_ttl: 3600 # Cache results for 1 hour
timeout: 300 # Step timeout in seconds
priority: 0 # Higher numbers = higher priority
mock:
enabled: false
return_value: "mocked_result"
# Suspend/Approval:
suspend:
required_events: 1 # Number of resume events needed
timeout: 86400 # Timeout in seconds
resume_form:
schema:
type: object
properties:
approved:
type: boolean
user_auth_required: true
user_groups_required:
type: static
value: ["admin"]
self_approval_disabled: false
hide_cancel: false
continue_on_disapprove_timeout: false
# Retry configuration:
retry:
constant:
attempts: 3
seconds: 5
# OR exponential backoff:
# exponential:
# attempts: 3
# multiplier: 2
# seconds: 1
# random_factor: 10 # 0-100% jitter
```
## Special Modules
### Failure Handler (Error Handler)
```yaml
value:
failure_module:
id: failure
value:
type: rawscript
content: |
export async function main(error: any) {
// error.message, error.step_id, error.name, error.stack
console.log("Flow failed:", error.message);
return error;
}
language: bun
input_transforms: {}
```
### Preprocessor
```yaml
value:
preprocessor_module:
id: preprocessor
value:
type: rawscript
content: |
export async function main() {
console.log("Flow starting...");
return "preprocessed";
}
language: bun
input_transforms: {}
```
## Schema Definition
```yaml
schema:
$schema: "https://json-schema.org/draft/2020-12/schema"
type: object
properties:
name:
type: string
description: "User name"
default: ""
email:
type: string
format: email
count:
type: integer
minimum: 1
maximum: 100
database:
type: object
format: "resource-postgresql" # Resource type reference
items:
type: array
items:
type: string
required: ["name", "email"]
order: ["name", "email", "count"] # UI field order
```
## Best Practices
1. **Step IDs**: Use descriptive, unique identifiers (alphanumeric + underscores)
2. **Data Flow**: Chain steps using `results.step_id` references
3. **Error Handling**: Add failure_module for critical workflows
4. **Languages**: Use `bun` for TypeScript (fastest), `python3` for Python
5. **Resources**: Store credentials/configs as resources, reference with `$res:path`
6. **Loops**: Prefer `parallel: true` for independent iterations
7. **Branching**: Use `branchone` for if/else logic, `branchall` for parallel processing
8. **Schemas**: Always define input schemas for better UX and validation
## Example Complete Workflow
```yaml
summary: "Process user data"
description: "Validates user input, processes data, and sends notifications"
value:
modules:
- id: validate_input
value:
type: rawscript
content: '!inline inline_script_0.inline_script.ts'
# script at path inline_script_0.inline_script.ts will contain
# export async function main(email: string, name: string) {
# if (!email.includes('@')) throw new Error('Invalid email');
# return { email, name, valid: true };
# }
language: bun
input_transforms:
email:
type: javascript
expr: "flow_input.email"
name:
type: javascript
expr: "flow_input.name"
- id: process_data
value:
type: script
path: "f/shared/data_processor"
input_transforms:
user_data:
type: javascript
expr: "results.validate_input"
- id: send_notification
value:
type: rawscript
content: '!inline inline_script_1.inline_script.ts'
# script at path inline_script_1.inline_script.ts will contain
# export async function main(processed_data: any) {
# console.log("Sending notification for:", processed_data.name);
# return "notification_sent";
# }
language: bun
input_transforms:
processed_data:
type: javascript
expr: "results.process_data"
schema:
type: object
properties:
email:
type: string
format: email
description: "User email address"
name:
type: string
description: "User full name"
required: ["email", "name"]
```
When generating OpenFlow YAML, ensure proper indentation, valid YAML syntax, and logical step dependencies. Always include meaningful summaries and proper input transforms to connect workflow steps.