agent_runner API reference
The framework substrate. Every codename agent imports from this module. Source: lib/agent_runner.py.
Sections are grouped by what each operator-facing primitive does. For deep semantics, read the docstrings in the source; they are the authoritative reference.
Path resolution + module constants
    HOME: Path                        # operator's home directory
    HERMES_HOME: Path                 # runtime root, default ~/.hermes
    WORKSPACE_ROOT: Path              # parent of per-repo checkouts, default ~/code
    WORKSPACE: Path                   # WORKSPACE_ROOT / "product" (back-compat alias)
    GH_ORG: str                       # GitHub org slug; required for gh helpers

    STATE_ROOT: Path                  # HERMES_HOME / "state"
    WORKTREE_ROOT: Path               # HERMES_HOME / "worktrees"
    LIB_DIR: Path                     # HERMES_HOME / "lib"
    BIN_DIR: Path                     # HERMES_HOME / "bin"

    CLAUDE_BIN: str                   # path to the claude CLI; default "claude"
    CODEX_BIN: str                    # path to the codex CLI; default "codex"
    CODEX_TRANSCRIPTS_ROOT: Path      # HERMES_HOME / "state" / "codex"

    GH_REPO_TO_LOCAL: dict[str, str]  # consumer-extended slug → local-dir map
    STANDARD_LABELS: list[tuple]      # consumer-extended label set for ensure_labels
    LIFECYCLE_LABELS: list[tuple]     # framework-provided state-machine labels

    SLACK_SEVERITY_INFO: str          # "info"
    SLACK_SEVERITY_WARN: str          # "warn"
    SLACK_SEVERITY_ALERT: str         # "alert"

Preflight + doctor mode
    @dataclass
    class PreflightSpec:
        agent: str
        bins: list[str] = []                     # CLIs that must be on PATH
        require_gh_auth: bool = False
        aws_profile: str = ""                    # if set, sts get-caller-identity must succeed under this profile
        require_workspace_repos: list[str] = []  # local checkout dirs that must exist
        env_vars: list[str] = []                 # required env vars

    def preflight(spec: PreflightSpec) -> None
    def doctor_mode() -> bool                    # reads HERMES_DOCTOR env

preflight raises PreflightFailed (a RuntimeError) on any gap. The runner’s main pattern:

    try:
        preflight(PREFLIGHT)
    except PreflightFailed:
        return 0
    if doctor_mode():
        print(f"[{AGENT.upper()}-DOCTOR-OK]")
        return 0

Lock + spend + global block
    def with_lock(name: str)   # mkdir-atomic per-agent mutex
    class AgentLock            # the underlying class

    class SpendState:
        def __init__(self, agent: str)
        state: dict            # firings_today, turns_today, cost_usd_today, ...
        def increment(self, **kwargs) -> None
        def set(self, **kwargs) -> None
        def is_blocked(self) -> str | None   # returns reason if rate-blocked, else None

    def is_globally_blocked() -> str | None
    def set_global_block(hours: int, reason: str) -> str   # returns until-iso

Subprocess + shell
Section titled “Subprocess + shell”def run(cmd: list[str], *, cwd: str | None = None, timeout: int = 60) -> subprocess.CompletedProcess
def gh_json(cmd: list[str], default: Any = None) -> Any # gh + json parsedef slack_post(text: str, *, severity: str = "info") -> bool
# Severities: "info" (default, posted as-is), "warn" (⚠️ prefix),# "alert" (🚨 prefix + appends <!here>).Webhook URL resolution: SLACK_WEBHOOK_URL env -> 30-day disk cache at $HERMES_HOME/state/slack-webhook.cache -> AWS Secrets Manager (SLACK_WEBHOOK_SECRET_ID, default alfred/slack-webhook).
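The resolution order above can be read as a chain of fallbacks. The following is a minimal sketch of that chain, not the module's actual code: `resolve_webhook_url` and the injected `fetch_secret` callable (standing in for the AWS Secrets Manager lookup) are hypothetical names, and the cache format (plain text, freshness judged by file mtime, refreshed after a successful secret fetch) is an assumption.

```python
import os
import time
from pathlib import Path
from typing import Callable, Mapping, Optional

CACHE_TTL_SECONDS = 30 * 24 * 3600  # the 30-day cache lifetime described above


def resolve_webhook_url(
    cache_path: Path,
    fetch_secret: Callable[[str], Optional[str]],  # hypothetical stand-in for the secret store
    env: Optional[Mapping[str, str]] = None,
    now: Optional[float] = None,
) -> Optional[str]:
    """Hypothetical sketch: env var, then fresh disk cache, then secret store."""
    env = os.environ if env is None else env
    now = time.time() if now is None else now

    # 1. An explicit environment variable wins.
    url = env.get("SLACK_WEBHOOK_URL", "").strip()
    if url:
        return url

    # 2. A cache file younger than 30 days (assumed: plain text, judged by mtime).
    try:
        if now - cache_path.stat().st_mtime < CACHE_TTL_SECONDS:
            cached = cache_path.read_text().strip()
            if cached:
                return cached
    except OSError:
        pass  # missing or unreadable cache: fall through to the secret store

    # 3. The secret store, keyed by SLACK_WEBHOOK_SECRET_ID.
    secret_id = env.get("SLACK_WEBHOOK_SECRET_ID", "alfred/slack-webhook")
    url = fetch_secret(secret_id)
    if url:
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(url)  # assumed: refresh the cache for later calls
    return url
```

Injecting `fetch_secret` keeps the sketch runnable without AWS credentials; a real caller would pass a function that queries Secrets Manager.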
GitHub helpers
    def ensure_labels(repo_slug: str,
                      labels: list[tuple[str, str, str]] | None = None) -> None
    def gh_issue_edit(repo_slug: str, num: int, *,
                      add_labels: list[str] = None,
                      remove_labels: list[str] = None) -> bool
    def gh_issue_comment(repo_slug: str, num: int, body: str) -> bool
    def gh_pr_create(repo_slug: str, *, title: str, body_file: Path,
                     head: str | None = None,
                     labels: list[str] | None = None,
                     base: str = "main") -> str | None   # returns PR URL
    def gh_pr_comment(repo_slug: str, num: int, body: str) -> bool

Issue claim state machine
See State machine for design.

    def claim_issue(repo_slug: str, num: int, *,
                    codename: str, firing_id: str) -> bool

    def release_issue(repo_slug: str, num: int, *,
                      codename: str, firing_id: str,
                      outcome: str = "success",
                      transition_to: str | None = None,
                      pr_url: str | None = None) -> bool

    def find_stale_claims(repo_slug: str, *, max_age_hours: int = 4) -> list[dict]

    def force_release_stale_claim(repo_slug: str, num: int, *,
                                  sweep_id: str,
                                  released_codename: str | None = None,
                                  released_firing_id: str | None = None) -> bool

    def issue_dedup_check(repo_slug: str, num: int) -> dict

    # Operator overrides
    def is_repo_paused(repo_slug: str) -> bool
    def list_paused_repos() -> list[str]
    def set_repo_paused(repo_slug: str, paused: bool) -> list[str]

    # Constants
    PAUSED_REPOS_FILE: Path       # state-file location
    CLAIM_COMMENT_PREFIX: str     # HTML comment marker for claims
    RELEASE_COMMENT_PREFIX: str   # HTML comment marker for releases

Worktree management
    def make_worktree(local_repo: str, agent: str, target: str,
                      base: str = "origin/main") -> tuple[Path, str]   # (path, branch)
    def make_worktree_from_branch(local_repo: str, agent: str,
                                  head_ref: str, target: str) -> Path
    def remove_worktree(local_repo: str, wt: Path) -> None

Claude invocation

    @dataclass
    class ClaudeResult:
        success: bool
        subtype: str               # "success" | "error_max_turns" | "error_budget" | "error_rate_limit" | ...
        num_turns: int
        cost_usd: float
        session_id: str | None
        result_text: str
        raw: dict
        stop_reason: str | None    # opt-in field; falls back to subtype
        error_message: str | None

    def claude_invoke(prompt: str, *, workdir: Path, allowed_tools: str,
                      max_turns: int | None = None,
                      timeout: int = 1200) -> ClaudeResult

    def claude_invoke_streaming(prompt: str, *, workdir: Path, allowed_tools: str,
                                agent: str, firing_id: str,
                                max_turns: int | None = None,
                                timeout: int = 1200) -> ClaudeResult

    def codex_invoke(prompt: str, *, workdir: Path, agent: str = "codex",
                     firing_id: str | None = None, timeout: int = 1200,
                     model: str | None = None, sandbox: str | None = None,
                     approval_policy: str | None = None,
                     add_dirs: list[Path] | None = None) -> ClaudeResult

The OSS streaming variant currently delegates to claude_invoke() while preserving the future call shape. codex_invoke() shells out to codex exec, rejects unsupported Claude-only controls (allowed_tools, max_turns, resume_session), defaults to a read-only sandbox and approval_policy=never, and writes final-message, stdout, and stderr artifacts to $HERMES_HOME/state/codex/<agent>/<YYYY-MM>/.
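Since all three invokers return a ClaudeResult, a caller typically branches on `success` and `subtype`. The sketch below is illustrative only: the dataclass is a local stand-in mirroring the documented fields so the example runs on its own, `effective_stop_reason` is one reading of the "falls back to subtype" comment, and `next_step` with its return labels is a hypothetical caller policy, not framework behaviour.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClaudeResult:
    """Local stand-in mirroring the documented fields (defaults added for brevity)."""
    success: bool
    subtype: str
    num_turns: int = 0
    cost_usd: float = 0.0
    session_id: Optional[str] = None
    result_text: str = ""
    raw: dict = field(default_factory=dict)
    stop_reason: Optional[str] = None
    error_message: Optional[str] = None


def effective_stop_reason(result: ClaudeResult) -> str:
    # Assumption: when the opt-in stop_reason is absent, subtype is the fallback.
    return result.stop_reason or result.subtype


def next_step(result: ClaudeResult) -> str:
    """Hypothetical caller policy; the returned labels are illustrative only."""
    if result.success:
        return "continue"
    reason = effective_stop_reason(result)
    if reason == "error_rate_limit":
        return "back-off"       # a caller might consider set_global_block() here
    if reason in ("error_max_turns", "error_budget"):
        return "release-issue"  # a caller might release its claim with a failure outcome
    return "alert"              # unknown failure: surface it to the operator
```

Keeping the policy in one small function makes it easy to see which failure subtypes are retried, released, or escalated.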
Event log + commit trailer + handoff table
    class EventLog:
        def __init__(self, agent: str, firing_id: str | None = None,
                     path: Path | None = None)
        firing_id: str
        path: Path
        def emit(self, event_type: str, **payload) -> None

    def commit_trailer(agent: str, firing_id: str, *,
                       extra: dict[str, str] | None = None) -> str
    class HandoffTable   # producer/consumer table for cross-codename validation

Prompt loading

    def load_prompt(path: Path | str, *,
                    extra_vars: dict[str, str] | None = None) -> str

Substitutes ${ENV_VAR} from the environment (and any extra_vars). Unset variables stay as literals, so a missing variable that ends up interpolated into a gh command fails loudly.
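The substitution rule can be illustrated with a small stand-alone function. This is a sketch under assumptions, not the body of load_prompt: `substitute` is a hypothetical name, the accepted variable-name pattern is a guess, and giving extra_vars precedence over the environment is an assumption the source does not state.

```python
import os
import re
from typing import Mapping, Optional

# Assumed variable syntax: ${NAME} with a conventional identifier inside.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(
    text: str,
    extra_vars: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Replace ${NAME} with its value; leave unknown names untouched."""
    values = dict(os.environ if env is None else env)
    values.update(extra_vars or {})  # assumption: extra_vars override the environment
    # An unset name falls back to the matched text itself, so it stays a literal.
    return _VAR.sub(lambda m: values.get(m.group(1), m.group(0)), text)


if __name__ == "__main__":
    template = "gh issue list --repo ${GH_ORG}/${REPO}"
    print(substitute(template, {"REPO": "product"}, env={"GH_ORG": "acme"}))
    print(substitute(template, env={}))  # both placeholders survive as literals
```

Leaving the literal `${GH_ORG}` in place means a downstream gh call receives an obviously invalid repo slug and errors out, which matches the fail-loud behaviour described above.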
Conventions
- Every primitive that does network I/O has an explicit timeout and returns a status (bool / dict / dataclass) rather than raising on operational errors. Programming bugs do raise.
- Every primitive that writes operator-visible state (Slack, gh, files) is idempotent or near-idempotent.
- Every primitive that depends on the host shell uses subprocess.run (via run()), never shell=True.
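The conventions above (explicit timeout, status instead of exception, argument list instead of shell=True) can be combined in a few lines. The helper below is a hypothetical illustration named `run_ok`; it is not the module's run(), which returns a subprocess.CompletedProcess.

```python
import subprocess
import sys


def run_ok(cmd: list[str], *, timeout: int = 60) -> bool:
    """Hypothetical helper: True on exit code 0, False on any operational failure."""
    try:
        # The command is an argument list, so no shell parses or expands it.
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # A timeout or a missing binary is an operational error: report, don't raise.
        return False
    return proc.returncode == 0


if __name__ == "__main__":
    print(run_ok([sys.executable, "-c", "pass"]))
    print(run_ok(["definitely-not-a-real-binary-xyz"]))
```

Passing a non-list or otherwise malformed `cmd` still raises, which is consistent with the rule that programming bugs do raise.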
For implementation details, the source file is exhaustively commented. The module-level docstring at the top documents the env-var contract that every consumer agent inherits.