
Skill contract

Every skill implements the Skill interface in src/types.ts. The contract is the unit of trust: it declares exactly what the skill does, what it touches, and how it is verified.

Fields

| Field | Required | Purpose |
| --- | --- | --- |
| name | | Unique skill identifier |
| execution_class | | deterministic · probabilistic · tool |
| does / does_not | | Plain-language responsibility and explicit exclusions |
| state_read | | STATE fields the skill may read |
| state_write | | STATE fields the skill may write (enforced by base-io) |
| assertions | | Semantic checks run by base-assert after the skill |
| confidence_threshold | | Judge pass threshold (probabilistic) |
| retries | | Retry budget (probabilistic; default 2). Deterministic skills get 0 |
| golden_anchors | | Worked input/output examples fed to the judge |
| input_schema / output_schema | | Registry schema pins (name@version) |
| capabilities | | Side-effect capabilities the skill needs (absent == pure) |
| run(state, retry?) | | Executes the skill; retry carries negative context |
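The field list can be read as a TypeScript interface. The sketch below is a hypothetical reconstruction, not a copy of src/types.ts: the helper types (State, Assertion, RetryContext, SkillResult) and their field names are assumptions, and the optional markers simply follow the two examples on this page, which omit those fields. The retryBudget helper is also hypothetical and only illustrates the retry defaults stated in the field list.

```typescript
// Hypothetical reconstruction from the field list; src/types.ts is authoritative.
type ExecutionClass = "deterministic" | "probabilistic" | "tool";
type State = Record<string, unknown>; // placeholder for the project's STATE type

interface AssertionResult { statement: string; ok: boolean; detail?: string }
interface Assertion { statement: string; check: (state: State) => AssertionResult }

// Negative context for a re-invocation; both field names are assumptions.
interface RetryContext { previous_summary: string; failure_reason: string }

interface SkillResult {
  writes: Partial<State>;
  summary: string;
  cost: number;
  confidence?: number; // returned by probabilistic skills
  judge_blocks?: { id: string; type: string; text: string }[];
}

interface Skill {
  name: string;
  execution_class: ExecutionClass;
  does: string;
  does_not: string;
  state_read: string[];
  state_write: string[];
  assertions: Assertion[];
  confidence_threshold?: number;
  retries?: number;
  golden_anchors?: { input: unknown; output: unknown }[];
  input_schema?: string;  // name@version
  output_schema?: string; // name@version
  capabilities?: string[]; // absent means pure
  run(state: State, retry?: RetryContext): Promise<SkillResult>;
}

// Retry budget as the field list describes it: probabilistic skills default
// to 2, deterministic skills get 0. Treating "tool" like deterministic is an
// assumption made for this sketch.
function retryBudget(skill: Pick<Skill, "execution_class" | "retries">): number {
  if (skill.execution_class !== "probabilistic") return 0;
  return skill.retries ?? 2;
}
```

Under these assumptions, a probabilistic skill that omits retries gets a budget of 2, and a deterministic skill gets 0 even if it declares a value.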

A deterministic skill

ts
export const validateCoverage: Skill = {
  name: "validate-coverage",
  execution_class: "deterministic",
  does: "checks whether content blocks are sufficient context for the task",
  does_not: "extract content, score groundedness, or call an LLM",
  state_read: ["content_blocks"],
  state_write: ["coverage"],
  assertions: [
    {
      statement: "coverage_score >= 0.70",
      check: (s) => ({
        statement: "coverage_score >= 0.70",
        ok: s.coverage?.sufficient === true,
        detail: `score ${s.coverage?.score}`,
      }),
    },
  ],
  async run(state) {
    const coverage = assess(state);
    return { writes: { coverage }, summary: `coverage ${coverage.score}`, cost: 0 };
  },
};

Deterministic skills carry zero reliability overhead — no judge, no confidence routing, no retry.
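To make the role of assertions concrete, here is a minimal sketch of how a checker in the spirit of base-assert could evaluate them after a skill has run. The failedAssertions helper and the DemoState type are hypothetical; the real base-assert may merge writes, report results, or halt differently.

```typescript
interface AssertionResult { statement: string; ok: boolean; detail?: string }
interface Assertion<S> { statement: string; check: (state: S) => AssertionResult }

// Hypothetical stand-in for base-assert: run every check against the
// post-skill state and return only the results that failed.
function failedAssertions<S>(state: S, assertions: Assertion<S>[]): AssertionResult[] {
  return assertions.map((a) => a.check(state)).filter((result) => !result.ok);
}

// Assumed minimal state shape, just enough for the coverage assertion.
interface DemoState { coverage?: { score: number; sufficient: boolean } }

// Same shape as the assertion declared by validate-coverage.
const coverageOk: Assertion<DemoState> = {
  statement: "coverage_score >= 0.70",
  check: (s) => ({
    statement: "coverage_score >= 0.70",
    ok: s.coverage?.sufficient === true,
    detail: `score ${s.coverage?.score}`,
  }),
};
```

Calling failedAssertions with a state whose coverage.sufficient is true yields an empty array; any other state yields one failed result carrying the detail string.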

A probabilistic skill

A probabilistic skill additionally declares confidence_threshold, retries, and golden_anchors, returns a confidence (which drives routing), and may return judge_blocks — the output the orchestrator auto-judges for groundedness:

ts
export const extractHighlights: Skill = {
  name: "extract-highlights",
  execution_class: "probabilistic",
  does: "selects the most important content blocks as highlights with confidence",
  does_not: "parse input, score groundedness, or persist memory",
  state_read: ["content_blocks"],
  state_write: ["highlights"],
  confidence_threshold: 0.8,
  retries: 2,
  golden_anchors: [{ input: { /* … */ }, output: { /* … */ } }],
  assertions: [/* … */],
  async run(state, retry) {
    // `retry` is undefined on the first attempt; on a re-invocation it carries
    // the previous summary + the failure reason (negative context).
    const highlights = select(state, retry);
    return {
      writes: { highlights },
      summary: `selected ${highlights.length} highlights`,
      cost: 0,
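      // With no highlights, Math.min() would return Infinity; report 0 instead.
      confidence: highlights.length > 0 ? Math.min(...highlights.map((h) => h.confidence)) : 0,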
      judge_blocks: highlights.map((h) => ({ id: h.block_id, type: "paragraph", text: h.text })),
    };
  },
};
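The interaction between confidence, confidence_threshold, and retries can be sketched as a small decision function. This is an illustration under stated assumptions, not the orchestrator's actual logic: it compares the skill's returned confidence directly against the threshold, leaves the auto-judge over judge_blocks out entirely, and the names routeAttempt, Decision, previous_summary, and failure_reason are all hypothetical.

```typescript
// Hypothetical names throughout; only confidence, confidence_threshold,
// retries, and summary come from the contract itself.
type Decision =
  | { kind: "accept" }
  | { kind: "retry"; context: { previous_summary: string; failure_reason: string } }
  | { kind: "halt"; reason: string };

interface Attempt {
  summary: string;
  confidence: number;
}

function routeAttempt(
  attempt: Attempt,
  threshold: number,   // the skill's confidence_threshold
  retriesUsed: number, // 0 on the first run
  retries: number,     // the skill's retry budget
): Decision {
  if (attempt.confidence >= threshold) return { kind: "accept" };

  const reason = `confidence ${attempt.confidence} below threshold ${threshold}`;
  if (retriesUsed < retries) {
    // Negative context: the previous summary plus why it was rejected.
    return {
      kind: "retry",
      context: { previous_summary: attempt.summary, failure_reason: reason },
    };
  }
  return { kind: "halt", reason }; // budget exhausted
}
```

In this sketch the context object of a retry decision is what would be passed as the second argument to run on the next attempt.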

Rules

  1. Single responsibility — one job per skill; everything else goes in does_not.
  2. Declare your scope — only read/write the STATE fields you list. base-io enforces it.
  3. Declare your capabilities — list every side effect (fs:read · fs:write · net · env:read) in capabilities; an undeclared or ungranted effect is halted by the security model before the skill runs. Pure skills declare [].
  4. Neutral language — skill instructions must run on any LLM provider; no model-specific syntax.
  5. Classify honestly — mark a skill probabilistic only if it makes non-deterministic decisions; deterministic skills must stay overhead-free.
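Rules 2 and 3 both reduce to comparing what a skill declares against what it actually does or is granted. The two helpers below are a hypothetical sketch of that comparison; base-io and the security model are the real enforcers, and their internals are not shown in this document.

```typescript
// Hypothetical helper for rule 2: which written fields fall outside the
// skill's declared state_write list?
function undeclaredWrites(stateWrite: string[], writes: Record<string, unknown>): string[] {
  const allowed = new Set(stateWrite);
  return Object.keys(writes).filter((field) => !allowed.has(field));
}

// Hypothetical helper for rule 3: which declared capabilities have not been
// granted? An absent capabilities list is treated as pure (nothing requested).
function ungrantedCapabilities(capabilities: string[] | undefined, granted: string[]): string[] {
  const ok = new Set(granted);
  return (capabilities ?? []).filter((cap) => !ok.has(cap));
}
```

A non-empty result from either helper is the condition under which, in this sketch, the write would be rejected or the skill halted before it runs.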

Worked example

todo-flagger is a complete, registered deterministic skill you can copy. It flags content blocks containing a TODO, FIXME, or XXX marker, extends State with a new flags field, pins a todo-flag@1.0 schema, and grades 9/9 → verified. For the probabilistic side, see summarize: it returns a confidence and judge_blocks, so the orchestrator's reliability layer (confidence routing · auto-judge · retry) applies. Run them with npx tsx examples/todo-flagger.ts and npx tsx examples/summarize.ts; the Examples guide has more.
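For a feel of the deterministic core such a skill might have, here is a minimal sketch of marker flagging. It is not the code in examples/todo-flagger.ts: the ContentBlock and Flag shapes, the flagBlocks name, and the case-sensitive whole-word match are all assumptions.

```typescript
// Assumed shapes; the real content block and flag types may differ.
interface ContentBlock { id: string; text: string }
interface Flag { block_id: string; marker: string }

// Whole-word, case-sensitive match on the three markers (an assumption).
const MARKER = /\b(?:TODO|FIXME|XXX)\b/;

// Returns one flag per block that contains a marker, recording the first
// marker found in that block.
function flagBlocks(blocks: ContentBlock[]): Flag[] {
  const flags: Flag[] = [];
  for (const block of blocks) {
    const match = MARKER.exec(block.text);
    if (match) flags.push({ block_id: block.id, marker: match[0] });
  }
  return flags;
}
```

Because the function is pure and takes no confidence or retry input, it fits the deterministic class described above: the same blocks always produce the same flags.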

MIT License