Engine routing
Alfred is the scheduler and guardrail layer. The actual LLM work is done by the engine: a local CLI you have already authenticated. The framework owns the per-codename decision of which engine that is. Default posture is local subscription auth; Alfred does not need Anthropic or OpenAI API keys for the normal Claude Code or Codex CLI flow.
This page covers the three modes, the precedence chain, the fallback behavior, the default routing matrix for the shipped fleet, and where the multi-engine roadmap is going. Full doc at docs/ENGINE_ROUTING.md.
Three modes
Section titled “Three modes”| Mode | Behavior |
|---|---|
claude | Use Claude Code only. No fallback. |
codex | Use Codex only. No fallback. |
hybrid | Use Claude Code first. Fall back to Codex on error_budget, error_rate_limit, or error_authentication. Default for most codenames. |
hybrid is the default for builder agents because it gives you graceful degradation when Claude quota is exhausted, without committing every firing to Codex. Reviewer agents that are happy with either engine often run pure codex so they preserve Claude quota for builders.
Per-agent overrides
Section titled “Per-agent overrides”The framework reads the engine for each firing from a precedence chain. The first source that returns a normalized mode wins.
ALFRED_<CODENAME>_ENGINE(e.g.ALFRED_LUCIUS_ENGINE=claude,ALFRED_RASALGHUL_ENGINE=codex).- An optional legacy env var for migrated fleets (the codename’s runner can name one).
ALFRED_ENGINEfor fleet-wide testing (useful inalfred-dry-run).$ALFRED_HOME/state/engines/<codename>, written byalfred engine set.- An optional legacy state file.
- The codename’s compiled-in default, usually
hybrid.
Alfred CLI:
alfred engine status # one line per codename, resolved modealfred engine status lucius # one codename, plus where the value came fromalfred engine set lucius hybrid # persist to $ALFRED_HOME/state/engines/luciusalfred engine set rasalghul codexalfred codex status # check the Codex CLI is reachablealfred codex probe # run one tiny non-interactive requestalfred auth status # auth-surface check across both enginesSet the env-var form in ~/.alfredrc when you want the override to follow your shell. Set the state-file form when you want the override to follow the host scheduler (it survives a deploy.sh re-render).
Hybrid fallback behavior
Section titled “Hybrid fallback behavior”Hybrid mode tries Claude first. The runner inspects the AgentResult and falls back to Codex only for a narrow set of subtypes:
error_budget: the Claude account has run out of subscription budget.error_rate_limit: the Claude account hit a rate limit.error_authentication: the Claude CLI auth is missing or stale.
Any other failure stays a Claude failure. A normal Claude tool error is a bug in the runner or prompt, not a reason to switch engines; hiding it behind a fallback would mask real problems.
When a Claude-backed firing returns error_rate_limit or error_budget, the runner also calls set_global_block(hours=1, reason=...). That writes $ALFRED_HOME/state/global-blocked-until.json, which every other Claude-backed firing reads at the top of main(). They print [<AGENT>-GLOBAL-BLOCKED] and exit 0 for the next hour. The block stops the stampede; without it, the whole fleet would spend the hour firing into the same rate-limit wall.
Hybrid agents are not silenced by the global block: if they fall back to Codex successfully, they keep working through the Claude outage. That is the point.
Default routing matrix
Section titled “Default routing matrix”The shipped fleet has the following defaults. Override per codename when your account economics or quality posture call for it.
| Codename | Default mode | Why |
|---|---|---|
| batman | hybrid | Architect for cross-repo execution. Long-context planning prefers Claude; Codex fallback keeps the architect lane alive during Claude outages. |
| lucius | hybrid | Builder. Wants Claude for first-class code generation, but cannot afford to be idle during a Claude rate-limit hour. |
| drake | claude | Planner. Cross-repo grep plus issue-filing benefits from Claude’s longer effective context and tool integration. |
| bane | hybrid | Test-coverage builder. Same posture as Lucius; tests are valuable enough to fall back rather than skip. |
| rasalghul | codex | Reviewer. An independent reviewer on a different model surfaces blind spots the builder model shares. Also preserves Claude quota for builders. |
| nightwing | hybrid | Review-fix builder. Needs Claude for the same reasons as Lucius. |
| robin | hybrid | Bug triage. Light-touch; either engine works. |
| huntress | claude | Post-deploy smoke. Lower volume; Claude is fine. |
| gordon | claude | Deploy-health. Read-only; quiet on healthy days. |
| automerge | n/a | No engine call. |
| agent-cleanup | n/a | No engine call. |
These are starting points, not laws. If you have a Claude Max plan and abundant quota, push more codenames to pure claude. If you have OpenAI credits to burn and want a second opinion on every PR, push more reviewers to pure codex. The override surface is per-codename for exactly this reason.
Subscription economics
Section titled “Subscription economics”Alfred’s default posture is to use the local CLI subscription auth you have already paid for. It does not need API keys for normal operation.
- Claude Code with a Pro or Max plan: keep
ANTHROPIC_API_KEYunset. Claude Code gives env-var API keys priority over subscription auth, which silently moves a firing onto API billing. - Codex with a ChatGPT plan: sign in through the Codex CLI with your ChatGPT account. Keep
OPENAI_API_KEYunset unless you intentionally want API-key billing. - AWS: only used when an agent needs Secrets Manager, and only with per-agent IAM (see AWS setup).
The shipped fleet is designed to run on subscriptions you already have. No double billing. If you want to add API-key fallback for redundancy, set the env vars deliberately and document what you did in ~/.alfredrc.
Multi-engine roadmap
Section titled “Multi-engine roadmap”The current engine surface is two: Claude Code and Codex. The runtime contract is engine-agnostic. AgentResult carries success, subtype, num_turns, cost_usd, session_id, and result_text regardless of which engine produced it. Adding a third engine means writing a new <engine>_invoke() that returns the same shape.
On the roadmap:
- Gemini CLI: when Google ships a stable non-interactive
gemini -pequivalent with a structured result. Useful as a third independent reviewer or as a hedge against Anthropic and OpenAI both being down at once. - Ollama and other local engines: for teams that want every firing on-host with no provider call at all. Trade-off is model quality; reasonable for utility roles.
- Anthropic native agents: when the upstream Agent Teams or Memory Tool primitives stabilize, Alfred will lean on them rather than re-implementing them.
Each new engine needs three things to land: a CLI binary on PATH, a deterministic non-interactive prompt mode that returns structured results, and a subtype-mapping table so hybrid fallback knows which failures to swallow.
See also
Section titled “See also”- Architecture: why the engine is a fresh subprocess per firing.
- How it works: the firing trace including the engine call.
- Claude Code and Codex: install, auth, Pro vs Max sizing, account swap.
- State and memory: the
engines/<codename>state file. - Install: first-run install flow.