agent-task
Run provider-neutral task plans through Homeboy’s durable agent-task lifecycle.
Homeboy owns durable orchestration and provider-neutral outcomes. Runtime
providers own backend-specific execution. For the provider fanout ownership seam,
see docs/architecture/provider-fanout-boundary.md.
Dispatch
agent-task dispatch builds a durable task plan from common repo-cooking inputs
without requiring hand-authored provider JSON:
homeboy agent-task dispatch \
  --repo data-machine \
  --cwd /path/to/worktree \
  --provider-config @provider-config.json \
  --client-context @client-context.json \
  --prompt @task.txt

Homeboy core treats --client-context as an optional opaque JSON object. Client
adapters may include whatever correlation data they need to reconcile their own
notifications or UI state, but Homeboy does not interpret transport-specific
identifiers in core lifecycle state. Provider-specific execution settings belong
in --provider-config; durable lifecycle commands remain headless and can be
claimed later with agent-task run or agent-task run-next.
Headless Fleet-Cooking Review
The authoritative non-chat workflow is the durable agent-task lifecycle. Chat
clients, Discord threads, GitHub Actions, cron, and terminal operators can all
submit the same run id, inspect it later, and promote selected artifacts without
depending on transport-local state.
run_id="homeboy-3357-$(date +%s)"
homeboy agent-task dispatch \
  --repo homeboy \
  --cwd /path/to/homeboy@fix-issue \
  --task-url https://github.com/Extra-Chill/homeboy/issues/3357 \
  --concurrency 4 \
  --attempts 2 \
  --run-id "$run_id" \
  --queue-only \
  --prompt @task.txt
# A daemon or later terminal process can claim work without chat history.
homeboy agent-task run-next
# One review envelope contains lifecycle state, logs, artifacts, aggregate
# reconciliation, promotion candidates, and next actions.
homeboy agent-task review "$run_id" \
  --to-worktree homeboy@fix-issue-3357-agent-task-non-chat-flow

agent-task review returns homeboy/agent-task-review/v1 with:
- record: the durable run record from status.
- logs: scheduler events from queued or completed lifecycle state.
- artifacts: artifacts and evidence refs from the completed aggregate.
- aggregate_review: apply/retry/issue-report/review candidate reconciliation.
- promotion_candidates: generated homeboy agent-task promote <run-id> command arrays for apply candidates, completed with --to-worktree when supplied.
- transport.chat_state_required: false, making Homeboy the source of truth.
This is the terminal/daemon-owned review surface for fleet cooking. Kimaki or any other chat UI should submit, poll, render, and call these commands rather than owning scheduling, state, artifacts, reconciliation, or promotion.
Deterministic Smoke Gate
Issue #3392 is covered by a no-secret fixture plan at
tests/fixtures/agent_task_smoke_plan.json. It exercises the operator path
without provider credentials, chat state, or long-running external services.
Run it from a disposable Homeboy worktree:
run_id="agent-task-smoke-$(date +%s)"
target_worktree="homeboy@fix-3392-agent-task-smoke"
homeboy agent-task submit \
  --plan @tests/fixtures/agent_task_smoke_plan.json \
  --run-id "$run_id"
homeboy agent-task status "$run_id"
homeboy agent-task logs "$run_id"
homeboy agent-task run "$run_id"
# Or let a generic worker claim the oldest queued durable run:
# homeboy agent-task run-next
homeboy agent-task status "$run_id"
homeboy agent-task artifacts "$run_id"
homeboy agent-task review "$run_id" \
  --to-worktree "$target_worktree"
homeboy agent-task promote "$run_id" \
  --to-worktree "$target_worktree" \
  --dry-run

The gate passes when:
- submit returns a durable run_id immediately with state: "queued".
- Pre-run status and logs show the queued fixture cell.
- run exits successfully and writes the aggregate lifecycle record.
- Post-run status shows state: "succeeded".
- artifacts lists a patch artifact, an agent result artifact, and a transcript evidence ref.
- review returns a homeboy/agent-task-review/v1 envelope with transport.chat_state_required: false, aggregate reconciliation, and promotion candidates.
- promote <run-id> --dry-run resolves the aggregate from the durable run id and reports the selected non-empty patch plus changed files without requiring the operator to look up aggregate_path manually.
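The pass criteria above can be checked mechanically once the command outputs are parsed as JSON. The following is a minimal sketch, not Homeboy's own checker: it assumes the status payload exposes a top-level state field and that the review envelope carries its schema name under a schema key, neither of which is confirmed here. The function name smoke_gate_failures is hypothetical.

```python
REVIEW_SCHEMA = "homeboy/agent-task-review/v1"


def smoke_gate_failures(pre_status: dict, post_status: dict, review: dict) -> list[str]:
    """Return human-readable reasons the smoke gate would fail (empty list = pass).

    Assumed shapes (hypothetical): status payloads carry "state"; the review
    envelope carries "schema", "transport", and "promotion_candidates".
    """
    failures = []
    if pre_status.get("state") != "queued":
        failures.append("pre-run status is not queued")
    if post_status.get("state") != "succeeded":
        failures.append("post-run status is not succeeded")
    if review.get("schema") != REVIEW_SCHEMA:
        failures.append("review envelope has an unexpected schema")
    # The flag must be explicitly false, not merely absent.
    if review.get("transport", {}).get("chat_state_required") is not False:
        failures.append("review does not declare chat_state_required: false")
    if not review.get("promotion_candidates"):
        failures.append("review lists no promotion candidates")
    return failures
```

An empty return value corresponds to a passing gate; the artifact and dry-run promotion checks are left out because their payload shapes are not described in this document.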
When promotion runs without --dry-run, each --verify <command> is treated as
a visible deterministic gate in the promoted worktree. Promotion reports gate
results as deterministic_gates[] using
homeboy/agent-task-gate-report/v1. Failed visible gates set promotion
status: "gate_failed", exit nonzero, and include
failure_evidence.agent_feedback plus stdout/stderr tails so the next cook-loop
agent task can receive exact failure context instead of a generic shell error.
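To make the idea of a visible deterministic gate concrete, here is a rough sketch of running one verify command and keeping only the tail of its output. It is an illustration under assumptions, not the format of homeboy/agent-task-gate-report/v1: the report keys (command, exit_code, passed, stdout_tail, stderr_tail) and the helper names are hypothetical.

```python
import shlex
import subprocess


def tail(text: str, max_lines: int = 20) -> str:
    """Keep only the last max_lines lines of captured output."""
    return "\n".join(text.splitlines()[-max_lines:])


def run_visible_gate(command: str, cwd=None) -> dict:
    """Run one verify command and summarize it (hypothetical report shape)."""
    proc = subprocess.run(
        shlex.split(command), cwd=cwd, capture_output=True, text=True
    )
    return {
        "command": command,
        "exit_code": proc.returncode,
        "passed": proc.returncode == 0,
        "stdout_tail": tail(proc.stdout),
        "stderr_tail": tail(proc.stderr),
    }


def any_gate_failed(reports: list[dict]) -> bool:
    """True when at least one gate report did not pass."""
    return any(not report["passed"] for report in reports)
```

Keeping the exit status and output tails next to the command is what lets a follow-up task see the exact failure rather than a generic shell error.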
Use --private-verify <command> for orchestrator-only completion gates that
should decide completion without exposing hidden evaluator details to the next
agent attempt. Private gate reports still appear in the promotion report for
human/orchestrator evidence, but agent-task gate-feedback applies
--private-gate-reveal <policy> before building the follow-up request. Supported
policies are summary-only (default), redacted, no-detail, and
full-evidence. Visible gate failures continue to provide full deterministic
evidence to the agent.
agent-task gate-feedback converts a promotion report and the original
AgentTaskRequest into a provider-neutral cook-loop decision:
homeboy agent-task gate-feedback \
  --promotion @promotion.json \
  --source-task @source-task.json \
  --source-run-id "$run_id" \
  --attempt 1 \
  --max-attempts 3 \
  --current-diff @current.diff

The command returns homeboy/agent-task-cook-loop-report/v1. Red gates with
remaining budget produce status: "retry_requested" and a complete
follow_up_request containing the failed command, exit status, log tails,
changed files, patch artifact ref, current diff context, and source run/task
refs. Red gates with exhausted budget return status: "retries_exhausted".
Green promotion returns status: "green_completed" and no follow-up task.
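The three outcomes can be summarized as a small decision function. This is a sketch under stated assumptions rather than the actual implementation: it treats "remaining budget" as attempt < max_attempts, detects red gates through the promotion status "gate_failed", and uses a hypothetical passed key on each gate entry. The follow-up request here carries only a subset of the fields listed above.

```python
def cook_loop_decision(promotion: dict, attempt: int, max_attempts: int) -> dict:
    """Map a promotion report to a cook-loop status.

    Assumptions (not confirmed by the docs): red gates are signalled by
    promotion["status"] == "gate_failed", and budget remains while
    attempt < max_attempts.
    """
    if promotion.get("status") != "gate_failed":
        return {"status": "green_completed", "follow_up_request": None}

    if attempt < max_attempts:
        failed = [
            gate
            for gate in promotion.get("deterministic_gates", [])
            if not gate.get("passed", False)  # "passed" is a hypothetical key
        ]
        return {
            "status": "retry_requested",
            "follow_up_request": {"attempt": attempt + 1, "failed_gates": failed},
        }

    return {"status": "retries_exhausted", "follow_up_request": None}
```

With --attempt 1 and --max-attempts 3, a red promotion under these assumptions yields retry_requested; the same report at attempt 3 yields retries_exhausted.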
Queued runs that should not execute can be cancelled without chat/session state:
homeboy agent-task cancel "$run_id" --reason "not selected by controller"

cancel marks queued runs and stale-running records as cancelled in the
durable lifecycle store. It refuses to claim live provider cancellation for an
active runner process until a provider-owned cancellation channel is available.
Dispatch Workspaces
agent-task dispatch accepts generic Homeboy workspace inputs and does not
resolve product-specific workspace handles itself.
Use --cwd <PATH> when the caller already knows the checkout or worktree path:
homeboy agent-task dispatch \
  --repo homeboy \
  --cwd /path/to/homeboy@fix-issue \
  --prompt @task.txt

Use --workspace <ID_OR_PATH> for a Homeboy-managed task worktree ID or an
existing workspace path:
homeboy worktree create homeboy --branch fix/issue-123
homeboy agent-task dispatch \
  --workspace homeboy@fix-issue-123 \
  --prompt @task.txt

External workspace managers should resolve their own handles to local paths and
call dispatch with --cwd <resolved-path>.
Durable Loop Controllers
agent-task controller stores domain-agnostic controller state for multi-day
multi-agent loops. The controller record lives outside any single agent-task run
and can reference runs, artifacts, gates, reviews, waits, and human-ready work by
stable ids instead of copying every payload inline.
Create and inspect a controller:
homeboy agent-task controller init transformer-loop \
  --phase generate \
  --config-version transformer-v1
homeboy agent-task controller status transformer-loop
homeboy agent-task controller list

Apply external events, such as CI completion, PR review, human merge, scheduled wakeups, or artifact availability:
homeboy agent-task controller apply-event transformer-loop \
  --event-type github.pr.merged \
  --event-key Extra-Chill/homeboy#123 \
  --entity-id pr:123 \
  --payload @event.json

The payload may include a policy object using
homeboy/agent-task-loop-controller/v1 action names such as spawn_task,
fan_out, join, retry, request_changes, run_gates, wait_for_event,
mark_human_ready, complete, abandon, and escalate. Actions with
deterministic dedupe_key values are recorded once, so replaying a resumed
controller does not duplicate already-open tasks or PR work.
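The record-once behavior of dedupe_key can be pictured as an idempotent append. The sketch below is illustrative only: the ActionLog class and its in-memory storage are hypothetical, and how Homeboy actually persists controller actions is not described here.

```python
class ActionLog:
    """In-memory stand-in for a controller's recorded actions (hypothetical)."""

    def __init__(self) -> None:
        self.actions: list[dict] = []
        self._seen_keys: set[str] = set()

    def record(self, action: dict) -> bool:
        """Append the action unless its dedupe_key was already recorded.

        Returns True when the action was newly recorded, False on a replay.
        Actions without a dedupe_key are always appended.
        """
        key = action.get("dedupe_key")
        if key is not None:
            if key in self._seen_keys:
                return False
            self._seen_keys.add(key)
        self.actions.append(action)
        return True
```

Replaying the same spawn_task action after a resume then leaves the log unchanged, which is the property that prevents duplicate tasks or PR work.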
Mark work as explicitly ready for a human handoff:
homeboy agent-task controller mark-human-ready transformer-loop \
  --entity-id pr:123 \
  --reason "gates passed and review approved"

Gate bundles are represented as structured checks and results. Existing
--verify command gates are compatible as the simplest command check type;
long-running loops can reuse named bundles across repos and persist normalized
passed, failed, or warn results against a loop, entity, PR, finding, or
run.
Fixture Backend
The built-in fixture backend is intentionally narrow. It exists for smoke
proofs and unit tests, not production task execution. A successful fixture cell
writes:
- changes.patch as a non-empty unified diff.
- agent-result.json as a structured homeboy/agent-task-outcome/v1 artifact.
- transcript.log as transcript evidence.
Useful fixture executor.config fields:
- artifact_root: directory where fixture artifacts are written.
- changed_file: diff path recorded in the generated patch.
- metadata: optional JSON object copied into the fixture outcome metadata.
- mode: omit or set to success; set to empty_patch or empty_runtime_bundle for classification checks.
Output-Driven DAG Phases
agent-task run-plan supports backend-neutral output dependencies with a
plan-level output_dependencies map keyed by downstream task id. A task with
bindings waits until every declared upstream task has a terminal outcome, selects
values from prior homeboy/agent-task-outcome/v1 payloads with JSON Pointer,
renders {{outputs.<name>}} placeholders into the downstream request, then
dispatches the generated task.
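The waiting rule can be sketched as a readiness check over the output_dependencies map. This is an assumption-laden illustration, not the scheduler itself: the set of terminal states is a guess, the states mapping is a hypothetical stand-in for durable lifecycle state, and depends_on entries (ordering-only edges) are treated the same as binding sources.

```python
# Assumed terminal states; the real lifecycle may name these differently.
TERMINAL_STATES = {"succeeded", "failed", "cancelled", "skipped"}


def upstream_ids(dependency: dict) -> set[str]:
    """Collect every upstream task id a downstream entry declares."""
    ids = {binding["task_id"] for binding in dependency.get("bindings", {}).values()}
    ids.update(dependency.get("depends_on", []))
    return ids


def ready_tasks(task_ids: list[str], output_dependencies: dict, states: dict) -> list[str]:
    """Return tasks not yet started whose upstream tasks are all terminal."""
    ready = []
    for task_id in task_ids:
        if task_id in states:
            continue  # already queued, running, or finished
        upstream = upstream_ids(output_dependencies.get(task_id, {}))
        if all(states.get(up) in TERMINAL_STATES for up in upstream):
            ready.append(task_id)
    return ready
```

A task with no entry in the map has no upstream tasks and is ready immediately.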
Example:
{
  "schema": "homeboy/agent-task-plan/v1",
  "plan_id": "site-generator-static-fanout",
  "tasks": [
    {
      "schema": "homeboy/agent-task-request/v1",
      "task_id": "idea",
      "executor": { "backend": "provider" },
      "instructions": "Create the GitHub issue for this site idea."
    },
    {
      "schema": "homeboy/agent-task-request/v1",
      "task_id": "design",
      "executor": {
        "backend": "provider",
        "config": {
          "github_issue": "{{outputs.issue_number}}"
        }
      },
      "instructions": "Build the design for GitHub issue #{{outputs.issue_number}}."
    }
  ],
  "output_dependencies": {
    "design": {
      "bindings": {
        "issue_number": {
          "task_id": "idea",
          "path": "/metadata/github/issue_number",
          "required": true
        }
      }
    }
  }
}

Supported rendering targets:
- instructions
- inputs
- executor.config
- workspace.materialization
- metadata
- expected_artifacts
If a field is exactly {{outputs.<name>}}, Homeboy preserves the selected JSON
value type. Inline placeholders render as strings. If a required binding is
missing, the downstream task is not sent to the provider; the aggregate records a
skipped scheduler event, increments totals.skipped, and writes a no-op
outcome with diagnostic class output_dependency_missing.
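The binding and rendering rules can be sketched in a few functions. This is a minimal illustration, not Homeboy's renderer: it resolves RFC 6901 JSON Pointers by hand, preserves the JSON type when a field is exactly one placeholder, and renders inline placeholders as strings. The function names, the placeholder regex, and the choice to leave unresolved optional placeholders untouched are assumptions.

```python
import json
import re
from typing import Any

PLACEHOLDER = re.compile(r"\{\{outputs\.([A-Za-z0-9_]+)\}\}")
MISSING = object()  # sentinel: the pointer did not resolve


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer, returning MISSING on any failed step."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        return MISSING
    current = document
    for raw in pointer[1:].split("/"):
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, dict) and token in current:
            current = current[token]
        elif isinstance(current, list) and token.isdigit() and int(token) < len(current):
            current = current[int(token)]
        else:
            return MISSING
    return current


def bind_outputs(bindings: dict, outcomes: dict) -> tuple[dict, list[str]]:
    """Select bound values; also return names of required bindings that are missing."""
    outputs, missing_required = {}, []
    for name, spec in bindings.items():
        value = resolve_pointer(outcomes.get(spec["task_id"]), spec["path"])
        if value is MISSING:
            if spec.get("required", False):
                missing_required.append(name)
            continue
        outputs[name] = value
    return outputs, missing_required


def render(value: Any, outputs: dict) -> Any:
    """Recursively substitute {{outputs.<name>}} placeholders."""
    if isinstance(value, dict):
        return {key: render(item, outputs) for key, item in value.items()}
    if isinstance(value, list):
        return [render(item, outputs) for item in value]
    if not isinstance(value, str):
        return value
    exact = PLACEHOLDER.fullmatch(value)
    if exact and exact.group(1) in outputs:
        return outputs[exact.group(1)]  # exact match keeps the JSON type

    def inline(match: re.Match) -> str:
        name = match.group(1)
        if name not in outputs:
            return match.group(0)  # assumption: leave unresolved placeholders as-is
        bound = outputs[name]
        return bound if isinstance(bound, str) else json.dumps(bound)

    return PLACEHOLDER.sub(inline, value)
```

In this sketch a caller would skip dispatch whenever bind_outputs reports a missing required binding, mirroring the output_dependency_missing case described above.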
Use depends_on for ordering-only edges that do not bind values:
{
  "output_dependencies": {
    "static-build": {
      "depends_on": ["design"],
      "bindings": {
        "issue_number": {
          "task_id": "idea",
          "path": "/metadata/github/issue_number"
        }
      }
    }
  }
}

Failure Classifications
The deterministic smoke and existing provider path expose these failure classes:
| Case | Diagnostic/classification |
|---|---|
| no-op or empty patch | agent_task.fixture_empty_patch, plus promotion rejecting with "promotion refuses an empty patch artifact" |
| provider timeout | agent_task.provider_timeout, failure_classification: "timeout" |
| missing secrets/preflight | agent_task.secret_env_missing, failure_classification: "invalid_input" |
| empty runtime bundle | agent_task.fixture_empty_runtime_bundle |
| stale/non-terminal status | status annotates running records with metadata.stale_running and metadata.stale_running_reason |