Budgets (always set these)
Prompt caching
auto lets the harness place provider cache breakpoints (Anthropic ephemeral
cache blocks, Gemini cached content, OpenAI prompt caching) — you choose
scope and TTL, not provider mechanics. Cache scoping is workspace-isolated on
the vendor side.
Batching
Batch APIs are implemented for Anthropic (anthropic.batches.create) and the
OpenAI batch surface — right for evals, backfills, and bulk classification
where minutes of latency are fine for ~half price. Drive them through a
workflow stage that submits and a later stage (or trigger) that collects.
Headroom (context compression)
headroom.compression / headroom.skipped)
— you can audit exactly what it did.
When to use what
| Symptom | Dial |
|---|---|
| Same big system context every turn | cache_policy |
| Long-running agent blowing context windows | Headroom |
| Thousands of independent items | Batch |
| Runaway spend risk | Tighter budget + retry.max_attempts |
See also
- Workflow Schema — field shapes
- Provider pages: Anthropic batch/cache rows