0003 - Taskbase Agent Module

Status

Accepted

Date

2026-04-02

Context

ADR 0002 established the taskbase system: a Kubernetes-hosted task management app with a REST API that lets an agent pick up work, report progress, and track token consumption. That ADR defined the server side. This ADR defines the agent side — the module that runs on the agent machine and drives the interaction with the taskbase API autonomously.

The requirements for this module are:

  • Autonomous task pickup — the agent should be able to start a new session, call the taskbase API, and receive the next task without a human typing anything
  • Token consumption tracking — every Claude API call made while working a task emits a usage object; the module must accumulate these and report them back to taskbase after each interaction step and on task completion
  • Task lifecycle signalling — the agent must transition tasks through pending → in_progress → done / paused by calling the taskbase API at each phase boundary
  • Context budget awareness — when remaining token budget approaches a configurable threshold, the agent should checkpoint the current task (log a summary, mark it paused), and pull the next task into a fresh context
  • Runs locally on the agent machine — the module lives on the Mac mini alongside Cowork, not in the Kubernetes cluster; it is the agent's interface to the taskbase system
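
The token consumption requirement above amounts to folding the usage object from every Claude API response into a running tally. A minimal sketch (the `input_tokens`/`output_tokens` field names come from the Claude API usage object; the `TokenTally` helper itself is hypothetical):

```python
from dataclasses import dataclass

@dataclass
class TokenTally:
    """Accumulates Claude API usage objects across interaction steps."""
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, usage: dict) -> None:
        # Each Claude API response carries a usage object; fold it in.
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)

    @property
    def total(self) -> int:
        return self.input_tokens + self.output_tokens

tally = TokenTally()
tally.add({"input_tokens": 1200, "output_tokens": 350})
tally.add({"input_tokens": 900, "output_tokens": 410})
print(tally.total)  # 2860
```

The tally is what gets reported back to taskbase after each interaction step and on task completion.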

Decision Drivers

  • Low coupling — the module should wrap the taskbase REST API without tightly coupling to its internal implementation; if endpoints change, only the module needs updating
  • Native tool interface — Claude works best when capabilities are presented as structured tools, not prose instructions; the module should expose named tools that Claude can call directly
  • Zero manual steps — once a Cowork session starts, no human input should be required to pick up and begin executing the next queued task
  • Auditability — every tool call the agent makes against taskbase (task pickup, log entries, token reports, status transitions) must be traceable in the taskbase management UI
  • Token accuracy — token counts must come from the authoritative source: the usage field on Claude API responses, not estimates or scraping

Considered Options

  • Option A — Cowork skill (markdown prompt instructions)
  • Option B — Local MCP server exposing structured taskbase tools (Selected)
  • Option C — Standalone automation script calling Claude API + taskbase API directly

Decision Outcome

Chosen option: Option B — Local MCP server, because:

  • It exposes the taskbase API as discrete, named tools (get_next_task, log_progress, report_tokens, complete_task, pause_task) that Claude can call with type-safe arguments — much more reliable than asking Claude to construct raw HTTP requests from prose instructions
  • It runs as a persistent process on the agent machine and is available to any Cowork session without per-session setup
  • Token usage can be forwarded to taskbase directly from inside the MCP server by intercepting the Claude API usage object before returning results to the session
  • It keeps the taskbase REST API internal to the server; Claude never constructs API URLs or manages auth tokens directly

Option A — Cowork Skill (Markdown prompt instructions)

Architecture:

  • A skill file (e.g. agent-skills/taskbase-runner/SKILL.md) that instructs Claude to call the taskbase REST API using fetch or curl via the Bash or JavaScript tools
  • Token counting done by instructing Claude to read the usage field from each response and accumulate it manually

Pros:

  • Zero new infrastructure — a markdown file is all that is needed
  • Works in any Cowork session immediately after the skill is loaded

Cons:

  • Claude constructing raw HTTP requests from prose instructions is fragile; URL paths, headers, and JSON payloads are error-prone when authored at inference time
  • Token accumulation relies on Claude not losing count across many tool calls in a long session — unreliable
  • Auth tokens must be embedded in the skill file or passed in plaintext through the conversation
  • No persistent state between tool calls; if Claude misses a step, there is no guard-rail


Option B — Local MCP Server (Selected)

Architecture:

  • A small Go or Node.js process running on the agent machine, registered with Cowork as a plugin
  • Exposes the following tools over the MCP protocol:

  • get_next_task — Returns the highest-priority pending task from taskbase, transitions it to in_progress, and returns its id, title, description, and token_budget
  • log_progress — Appends a progress note to the active task's activity log in taskbase
  • report_tokens — Sends an incremental token usage report (input_tokens, output_tokens) for the active task
  • complete_task — Marks the active task done, stores a completion summary, and sends the final token tally
  • pause_task — Marks the active task paused with a checkpoint summary when the context budget is running low
  • get_token_budget — Returns the remaining token budget for the active task so Claude can decide whether to continue or checkpoint
  • The MCP server holds the taskbase API base URL and auth token in its own config; Claude never sees credentials
  • On each Claude API response, the MCP server reads the usage object from the response metadata and calls report_tokens automatically, so Claude does not need to handle this manually
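
To illustrate the encapsulation, a get_next_task handler might wrap two underlying REST calls like this. The `TaskbaseClient` class and the endpoint paths (`/tasks/next`, `/tasks/{id}/status`) are illustrative assumptions, not the actual taskbase API:

```python
import json
import urllib.request

class TaskbaseClient:
    """Hypothetical tool layer: Claude sees named tools, never raw HTTP."""

    def __init__(self, base_url: str, token: str, http=None):
        self.base_url = base_url.rstrip("/")
        self.token = token
        # http is injectable so the client can be tested without a live server
        self.http = http or self._request

    def _request(self, method, path, body=None):
        req = urllib.request.Request(
            self.base_url + path,
            data=json.dumps(body).encode() if body is not None else None,
            method=method,
            headers={
                "Authorization": f"Bearer {self.token}",  # Claude never sees this
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    def get_next_task(self) -> dict:
        # Fetch the highest-priority pending task, then mark it in_progress
        task = self.http("GET", "/tasks/next")
        self.http("PATCH", f"/tasks/{task['id']}/status", {"status": "in_progress"})
        return task
```

Because the HTTP function is injectable, the handler can be exercised with any MCP client or a plain test double, which supports the "testable independently of Claude" point below.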

Agent session flow:

Session starts
  └─ Claude calls get_next_task
       └─ Task returned (id, title, description, token_budget)
            └─ Claude works the task
                 ├─ Periodically calls log_progress with updates
                 ├─ MCP server auto-reports tokens after each step
                 └─ Claude checks get_token_budget before each major step
                      ├─ Budget OK → continue
                      └─ Budget low → calls pause_task with checkpoint summary
                           └─ Fresh context → calls get_next_task again
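
The "Budget OK / Budget low" branch in the flow above reduces to one comparison against the configurable threshold. A minimal sketch, assuming the threshold is expressed as a fraction of the task's token_budget (the `should_checkpoint` helper is a hypothetical name):

```python
def should_checkpoint(tokens_used: int, token_budget: int,
                      warning_threshold: float = 0.15) -> bool:
    # Checkpoint when the remaining budget falls to the warning
    # threshold or below (default: 15% of the task's budget).
    remaining = token_budget - tokens_used
    return remaining <= token_budget * warning_threshold

print(should_checkpoint(40_000, 50_000))  # False: 20% of budget remains
print(should_checkpoint(44_000, 50_000))  # True: only 12% remains
```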

Pros:

  • Structured, typed tool interface — Claude cannot construct a malformed API call
  • Auth and URL management are encapsulated in the server config
  • Token reporting is automatic and accurate — sourced from Claude API usage metadata
  • Persistent process — available across all Cowork sessions without reloading
  • Testable independently of Claude: the MCP server can be exercised with any MCP client

Cons:

  • Requires building and running a new local process on the agent machine
  • Adds a dependency: the MCP server must be running for the agent to interact with taskbase


Option C — Standalone Automation Script

Architecture:

  • A script (Python or Go) that runs on a cron schedule on the Mac mini
  • Calls Claude API directly with a system prompt instructing it to work the next taskbase task
  • Reads usage from Claude API responses and posts them back to taskbase
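
A sketch of that control loop, with the Claude and taskbase clients passed in as stand-ins; `run_once` and all four callables are hypothetical names, not real SDK calls:

```python
def run_once(fetch_next_task, claude_call, post_usage, post_status):
    """One cron tick: pick up a task, work it via Claude, report back."""
    task = fetch_next_task()
    if task is None:
        return None  # queue empty; cron retries on the next tick
    post_status(task["id"], "in_progress")
    result = claude_call(f"Work this task: {task['title']}\n{task['description']}")
    # usage comes straight from the Claude API response, per the
    # token-accuracy decision driver
    post_usage(task["id"], result["usage"])
    post_status(task["id"], "done")
    return task["id"]
```

Note that the script, not Claude, owns this loop, which is exactly why the context budget logic would have to be re-implemented here.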

Pros:

  • Fully autonomous — no Cowork session required; runs on a timer
  • Token tracking is clean since the script controls the API call loop

Cons:

  • Bypasses Cowork entirely — the agent cannot use Cowork skills, computer use, or other MCP tools while working a task; severely limits what the agent can actually do
  • The Claude context is managed by the script, not by Claude itself; context budget logic must be re-implemented in the script
  • Harder to observe: progress is only visible in taskbase logs, not in a Cowork session the operator can watch


Implementation Notes

  • The MCP server config file should live at ~/.config/taskbase-agent/config.yaml with fields: api_base_url, api_token, default_token_budget, budget_warning_threshold
  • The server should be registered in the Cowork plugins directory so it starts automatically when Cowork launches
  • Token budget is enforced on the agent side (via get_token_budget) and recorded on the server side (via report_tokens); both are necessary — the server-side record is the audit trail, the agent-side check is the guard-rail
  • v1 does not handle parallel tasks; the server enforces that only one task can be in_progress per agent session at a time
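
For reference, the config file described in the first note might look like the following; the field names come from this ADR, while the values are illustrative placeholders only:

```yaml
# ~/.config/taskbase-agent/config.yaml (illustrative values)
api_base_url: https://taskbase.example.internal/api/v1
api_token: "<service-account-token>"
default_token_budget: 100000      # tokens per task unless the task specifies one
budget_warning_threshold: 0.15    # checkpoint when 15% or less of the budget remains
```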