0003 - Taskbase Agent Module
Status
Accepted
Date
2026-04-02
Context
ADR 0002 established the taskbase system: a Kubernetes-hosted task management app with a REST API that lets an agent pick up work, report progress, and track token consumption. That ADR defined the server side. This ADR defines the agent side — the module that runs on the agent machine and drives the interaction with the taskbase API autonomously.
The requirements for this module are:
- Autonomous task pickup: the agent should be able to start a new session, call the taskbase API, and receive the next task without a human typing anything
- Token consumption tracking: every Claude API call made while working a task emits a `usage` object; the module must accumulate these and report them back to taskbase after each interaction step and on task completion
- Task lifecycle signalling: the agent must transition tasks through `pending → in_progress → done / paused` by calling the taskbase API at each phase boundary
- Context budget awareness: when the remaining token budget approaches a configurable threshold, the agent should checkpoint the current task (log a summary, mark it `paused`) and pull the next task into a fresh context
- Runs locally on the agent machine: the module lives on the Mac mini alongside Cowork, not in the Kubernetes cluster; it is the agent's interface to the taskbase system
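The lifecycle and token-accumulation requirements can be illustrated with a small sketch. It is written in Python purely for illustration; the names `TaskState`, `transition`, and `TokenTally` are hypothetical, and the transition table covers only the moves named above (whether a paused task can be resumed is not specified, so it is left out).

```python
from dataclasses import dataclass
from enum import Enum


class TaskState(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    DONE = "done"
    PAUSED = "paused"


# Only the transitions named in the requirements:
# pending -> in_progress -> done / paused.
ALLOWED = {
    TaskState.PENDING: {TaskState.IN_PROGRESS},
    TaskState.IN_PROGRESS: {TaskState.DONE, TaskState.PAUSED},
    TaskState.DONE: set(),
    TaskState.PAUSED: set(),
}


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


@dataclass
class TokenTally:
    """Running total built from the usage object of each Claude API response."""
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, usage: dict) -> None:
        # Missing keys count as zero so a partial usage object does not crash.
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)

    @property
    def total(self) -> int:
        return self.input_tokens + self.output_tokens
```

A tally like this would be updated once per interaction step and flushed to taskbase at the same points where the lifecycle transitions are signalled.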
Decision Drivers
- Low coupling: the module should wrap the taskbase REST API without tightly coupling to its internal implementation; if endpoints change, only the module needs updating
- Native tool interface: Claude works best when capabilities are presented as structured tools, not prose instructions; the module should expose named tools that Claude can call directly
- Zero manual steps: once a Cowork session starts, no human input should be required to pick up and begin executing the next queued task
- Auditability: every tool call the agent makes against taskbase (task pickup, log entries, token reports, status transitions) must be traceable in the taskbase management UI
- Token accuracy: token counts must come from the authoritative source, the `usage` field on Claude API responses, not from estimates or scraping
Considered Options
- Option A: Cowork skill (markdown prompt instructions)
- Option B: Local MCP server exposing structured taskbase tools (Selected)
- Option C: Standalone automation script calling Claude API + taskbase API directly
Decision Outcome
Chosen option: Option B (Local MCP server), because:
- It exposes the taskbase API as discrete, named tools (`get_next_task`, `log_progress`, `report_tokens`, `complete_task`, `pause_task`, `get_token_budget`) that Claude can call with type-safe arguments, which is much more reliable than asking Claude to construct raw HTTP requests from prose instructions
- It runs as a persistent process on the agent machine and is available to any Cowork session without per-session setup
- Token usage can be forwarded to taskbase directly from inside the MCP server by intercepting the Claude API `usage` object before returning results to the session
- It keeps the taskbase REST API internal to the server; Claude never constructs API URLs or manages auth tokens directly
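To make "type-safe arguments" concrete, here is a rough sketch of how the tools might declare their inputs. MCP tools describe their arguments with a JSON Schema, so the sketch uses that shape; the argument names `note` and `summary` are assumptions (the ADR does not name them), and the hand-written validator only checks required keys and two primitive types, not full JSON Schema.

```python
# Hypothetical input schemas, one per tool. Argument names other than
# input_tokens / output_tokens are illustrative guesses.
TOOL_SCHEMAS = {
    "get_next_task": {"type": "object", "properties": {}, "required": []},
    "get_token_budget": {"type": "object", "properties": {}, "required": []},
    "log_progress": {
        "type": "object",
        "properties": {"note": {"type": "string"}},
        "required": ["note"],
    },
    "report_tokens": {
        "type": "object",
        "properties": {
            "input_tokens": {"type": "integer"},
            "output_tokens": {"type": "integer"},
        },
        "required": ["input_tokens", "output_tokens"],
    },
    "complete_task": {
        "type": "object",
        "properties": {"summary": {"type": "string"}},
        "required": ["summary"],
    },
    "pause_task": {
        "type": "object",
        "properties": {"summary": {"type": "string"}},
        "required": ["summary"],
    },
}

_PY_TYPES = {"string": str, "integer": int}


def validate_args(tool: str, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call is well-formed."""
    schema = TOOL_SCHEMAS.get(tool)
    if schema is None:
        return [f"unknown tool: {tool}"]
    problems = []
    for name in schema["required"]:
        if name not in args:
            problems.append(f"missing argument: {name}")
    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif not isinstance(value, _PY_TYPES[spec["type"]]) or isinstance(value, bool):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems
```

With schemas like these, a malformed call is rejected before any HTTP request is built, which is the reliability gain this option is chosen for.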
Option A: Cowork Skill (Markdown prompt instructions)
Architecture:
- A skill file (e.g. agent-skills/taskbase-runner/SKILL.md) that instructs Claude to call the taskbase REST API using fetch or curl via the Bash or JavaScript tools
- Token counting done by instructing Claude to read the usage field from each response and accumulate it manually
Pros:
- Zero new infrastructure: a markdown file is all that is needed
- Works in any Cowork session immediately after the skill is loaded
Cons:
- Claude constructing raw HTTP requests from prose instructions is fragile; URL paths, headers, and JSON payloads are error-prone when authored at inference time
- Token accumulation relies on Claude not losing count across many tool calls in a long session, which is unreliable
- Auth tokens must be embedded in the skill file or passed in plaintext through the conversation
- No persistent state between tool calls; if Claude misses a step, there is no guard-rail
Option B: Local MCP Server (Selected)
Architecture:
- A small Go or Node.js process running on the agent machine, registered with Cowork as a plugin
- Exposes the following tools over the MCP protocol:
| Tool | Description |
|---|---|
| `get_next_task` | Returns the highest-priority pending task from taskbase, transitions it to `in_progress`, and returns its `id`, `title`, `description`, and `token_budget` |
| `log_progress` | Appends a progress note to the active task's activity log in taskbase |
| `report_tokens` | Sends an incremental token usage report (`input_tokens`, `output_tokens`) for the active task |
| `complete_task` | Marks the active task done, stores a completion summary, and sends the final token tally |
| `pause_task` | Marks the active task paused with a checkpoint summary when the context budget is running low |
| `get_token_budget` | Returns the remaining token budget for the active task so Claude can decide whether to continue or checkpoint |
- The MCP server holds the taskbase API base URL and auth token in its own config; Claude never sees credentials
- On each Claude API response, the MCP server reads the `usage` object from the response metadata and calls `report_tokens` automatically, so Claude does not need to handle this manually
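The encapsulation described here might look roughly like the following sketch of the tool handlers. It is illustrative Python rather than the Go or Node.js the ADR proposes, and the endpoint paths (`/tasks/next`, `/tasks/{id}/log`, `/tasks/{id}/tokens`), the Bearer auth header, and the injected `transport` callable are all assumptions, since the ADR does not specify the taskbase REST routes.

```python
from typing import Callable, Optional

# A transport takes (method, url, headers, json_body) and returns decoded JSON.
# Injecting it keeps the sketch runnable without a live taskbase instance.
Transport = Callable[[str, str, dict, Optional[dict]], dict]


class TaskbaseTools:
    """Tool handlers that hide the REST details and credentials from Claude."""

    def __init__(self, api_base_url: str, api_token: str, transport: Transport):
        self._base = api_base_url.rstrip("/")
        # Bearer auth is an assumption; the ADR only says an auth token exists.
        self._headers = {"Authorization": f"Bearer {api_token}"}
        self._send = transport
        self._active_id: Optional[str] = None

    def _task_url(self, suffix: str) -> str:
        if self._active_id is None:
            raise RuntimeError("no active task; call get_next_task first")
        return f"{self._base}/tasks/{self._active_id}/{suffix}"

    def get_next_task(self) -> dict:
        # Hypothetical endpoint that claims the next pending task.
        task = self._send("POST", f"{self._base}/tasks/next", self._headers, None)
        self._active_id = task["id"]
        # Return only the documented fields; URL and token stay in this process.
        return {k: task[k] for k in ("id", "title", "description", "token_budget")}

    def log_progress(self, note: str) -> dict:
        return self._send("POST", self._task_url("log"), self._headers, {"note": note})

    def report_tokens(self, input_tokens: int, output_tokens: int) -> dict:
        body = {"input_tokens": input_tokens, "output_tokens": output_tokens}
        return self._send("POST", self._task_url("tokens"), self._headers, body)
```

Because the handlers track the active task id themselves, Claude passes only the payload (a note, a token count) and never a task id, URL, or credential.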
Agent session flow:
Session starts
  └─ Claude calls get_next_task
      └─ Task returned (id, title, description, token_budget)
          └─ Claude works the task
              ├─ Periodically calls log_progress with updates
              ├─ MCP server auto-reports tokens after each step
              └─ Claude checks get_token_budget before each major step
                  ├─ Budget OK → continue
                  └─ Budget low → calls pause_task with checkpoint summary
                      └─ Fresh context → calls get_next_task again
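The flow above can be simulated in a few lines to show where the budget check sits relative to each step. This is a sketch under assumptions: `FakeTools` is an in-memory stand-in for the MCP tools, the fixed per-step token cost is invented for the simulation, and treating "budget low" as `remaining <= warning_threshold` is one possible reading of the threshold.

```python
class FakeTools:
    """In-memory stand-in for the MCP tools, for illustration only."""

    def __init__(self, token_budget: int, cost_per_step: int):
        self.remaining = token_budget
        self.cost = cost_per_step
        self.events = []

    def get_next_task(self) -> dict:
        self.events.append("get_next_task")
        return {"id": "t1", "title": "demo", "description": "",
                "token_budget": self.remaining}

    def get_token_budget(self) -> int:
        return self.remaining

    def log_progress(self, note: str) -> None:
        self.events.append(f"log_progress: {note}")
        # Stands in for the usage the server would auto-report after the step.
        self.remaining -= self.cost

    def pause_task(self, summary: str) -> None:
        self.events.append(f"pause_task: {summary}")

    def complete_task(self, summary: str) -> None:
        self.events.append(f"complete_task: {summary}")


def work_task(tools, steps, warning_threshold: int) -> str:
    """Drive one task through the flow; return 'done' or 'paused'."""
    task = tools.get_next_task()
    for step in steps:
        # Check the budget before each major step, as in the flow diagram.
        if tools.get_token_budget() <= warning_threshold:
            tools.pause_task(f"checkpoint before: {step}")
            return "paused"
        tools.log_progress(f"starting: {step}")
    tools.complete_task(f"finished {task['title']}")
    return "done"
```

A `paused` result is the signal to start a fresh context and call `get_next_task` again; a `done` result means the task finished within budget.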
Pros:
- Structured, typed tool interface — Claude cannot construct a malformed API call
- Auth and URL management are encapsulated in the server config
- Token reporting is automatic and accurate — sourced from Claude API usage metadata
- Persistent process — available across all Cowork sessions without reloading
- Testable independently of Claude: the MCP server can be exercised with any MCP client
Cons:
- Requires building and running a new local process on the agent machine
- Adds a dependency: the MCP server must be running for the agent to interact with taskbase
Option C: Standalone Automation Script
Architecture:
- A script (Python or Go) that runs on a cron schedule on the Mac mini
- Calls Claude API directly with a system prompt instructing it to work the next taskbase task
- Reads usage from Claude API responses and posts them back to taskbase
Pros:
- Fully autonomous: no Cowork session required; runs on a timer
- Token tracking is clean since the script controls the API call loop
Cons:
- Bypasses Cowork entirely: the agent cannot use Cowork skills, computer use, or other MCP tools while working a task, which severely limits what the agent can actually do
- The Claude context is managed by the script, not by Claude itself; context budget logic must be re-implemented in the script
- Harder to observe: progress is only visible in taskbase logs, not in a Cowork session the operator can watch
Implementation Notes
- The MCP server config file should live at `~/.config/taskbase-agent/config.yaml` with fields: `api_base_url`, `api_token`, `default_token_budget`, `budget_warning_threshold`
- The server should be registered in the Cowork plugins directory so it starts automatically when Cowork launches
- Token budget is enforced on the agent side (via `get_token_budget`) and recorded on the server side (via `report_tokens`); both are necessary: the server-side record is the audit trail, the agent-side check is the guard-rail
- v1 does not handle parallel tasks; the server enforces that only one task can be `in_progress` per agent session at a time
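As an illustration of the config fields listed above, the following sketch validates them after the YAML file has been parsed into a plain mapping (the parsing step itself is left out). The field types are assumptions: in particular, `budget_warning_threshold` is treated here as an absolute token count, though it could equally be a fraction of the budget. The class name `AgentConfig` and the example values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AgentConfig:
    api_base_url: str
    api_token: str
    default_token_budget: int
    budget_warning_threshold: int  # assumed to be an absolute token count

    @classmethod
    def from_mapping(cls, raw: dict) -> "AgentConfig":
        """Build a config from an already-parsed YAML mapping."""
        names = [f.name for f in fields(cls)]
        missing = [n for n in names if n not in raw]
        if missing:
            raise ValueError(f"missing config fields: {', '.join(missing)}")
        cfg = cls(**{n: raw[n] for n in names})
        if cfg.budget_warning_threshold >= cfg.default_token_budget:
            raise ValueError(
                "budget_warning_threshold must be below default_token_budget"
            )
        return cfg

    def __repr__(self) -> str:
        # Keep the token out of logs and tracebacks.
        return (
            f"AgentConfig(api_base_url={self.api_base_url!r}, api_token='***', "
            f"default_token_budget={self.default_token_budget}, "
            f"budget_warning_threshold={self.budget_warning_threshold})"
        )


# Example values only; none of these come from the ADR.
example = AgentConfig.from_mapping({
    "api_base_url": "https://taskbase.example.internal/api",
    "api_token": "s3cret",
    "default_token_budget": 200_000,
    "budget_warning_threshold": 20_000,
})
```

Failing fast on a missing or inconsistent field at startup fits the zero-manual-steps driver: a misconfigured server refuses to start rather than failing partway through a task.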