Validation gates

Validation in Pulse is not advisory. The four governance gates are enforced server-side, and 10 of Pulse’s 198 MCP tools exist specifically to record the evidence those gates require. Skip the evidence and the transition fails — move_spec and move_card reject the request with a CONFLICT error. This page documents the four gates and the tools that drive them:

Gate 1 — Spec evaluation (validated → in_progress). Qualitative score with narrative; default threshold 80. 4 tools (server.py:10089–10974).
Gate 2 — Spec validation (approved → validated). Three-axis threshold check (completeness / assertiveness / ambiguity) that locks spec content. 2 tools (server.py:12323–12426).
Gate 3 — Task validation (validation → done). Independent reviewer evidence; auto-fail on threshold violations. 3 tools (server.py:12165–12322).
Gate 4 — Test theater prevention (update_test_scenario_status). Test status updates require structured evidence; sprint close re-validates every passed scenario. 1 tool (server.py:6267).

Source-of-truth citations: okto-pulse-feature-inventory.md:353–370 (spec evaluation), :371–377 (spec validation), :485–492 (task validation), :333–352 (test scenario status), :982–1021 (governance gates), :877–887 (Validator preset). For lifecycle context — how each gate slots into the pipeline — see The ADLC pipeline. For permission-flag definitions, see Permissions on the MCP overview.

Why four gates and not one

Pulse splits validation into four because they answer different questions:

Gate	The question it answers	Failure mode it prevents
Spec evaluation	Is this spec good enough to start implementing?	Specs that read fine to one agent but cannot be executed without re-deriving requirements.
Spec validation	Is the spec content stable enough to be a contract?	Silent edits to a spec while cards are in-flight.
Task validation	Did the implementer actually deliver what the spec asked for?	“Done” cards with hand-waved conclusions and no independent review.
Test theater prevention	Are passed scenarios backed by real test runs?	Scenarios marked passed without structured evidence; regressions hidden at sprint close.

Each gate has its own permission flag, its own data shape, and its own retry path. An agent cannot satisfy one gate by impersonating another.

Gate 1 — Spec evaluation

The spec evaluation records a qualitative judgement of spec quality: a score from 0 to 100 plus a written narrative that justifies it. The gate compares the score against the board threshold. Default threshold: score ≥ 80. Configurable per board.

Tool	Line	Description
`okto_pulse_submit_spec_evaluation`	`server.py:10089`	Submit a qualitative evaluation (score 0–100 + narrative) — gate for `validated → in_progress`
`okto_pulse_list_spec_evaluations`	`server.py:10194`	List evaluations recorded on a spec
`okto_pulse_get_spec_evaluation`	`server.py:10256`	Get one evaluation by ID
`okto_pulse_delete_spec_evaluation`	`server.py:10293`	Delete an evaluation

spec_id: "spc_b921..."
score:   85
narrative: "Acceptance criteria are concrete and measurable; AC1-AC3 each map to a test scenario with structured evidence requirements. Decision dec_e7ca (Redis token-bucket) carries 2 rejected alternatives with real reasons. One small gap: error budget for the fail-open path is not quantified — flagged as a follow-up under sprint Q&A spr_a01b."

Submitting an evaluation does not advance the spec; it only records the evidence. The transition itself happens via move_spec, which re-checks the latest evaluation against the threshold. An evaluation can be deleted and re-submitted while the spec is still validated.
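The re-check that move_spec performs can be pictured with a minimal sketch. This is not the code in server.py: the `SpecEvaluation` class, the `GateConflict` exception, and the `check_spec_evaluation_gate` function are hypothetical names used for illustration, and only the "latest evaluation vs. threshold" rule and the default of 80 come from this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

DEFAULT_THRESHOLD = 80  # default stated on this page; configurable per board


@dataclass
class SpecEvaluation:
    """Hypothetical shape of one recorded evaluation."""
    score: int            # 0-100
    narrative: str
    created_at: datetime


class GateConflict(Exception):
    """Stands in for the CONFLICT error returned by move_spec."""


def check_spec_evaluation_gate(evaluations, threshold=DEFAULT_THRESHOLD):
    """Return the latest evaluation if it clears the threshold, else raise."""
    if not evaluations:
        raise GateConflict("no spec evaluation recorded")
    # Only the most recent evaluation counts; older ones are history.
    latest = max(evaluations, key=lambda e: e.created_at)
    if latest.score < threshold:
        raise GateConflict(
            f"latest evaluation score {latest.score} is below threshold {threshold}"
        )
    return latest
```

Under these assumptions, a low-scoring evaluation followed by a later one at 85 lets the move through, while a later low score blocks it even if an earlier evaluation passed.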

Gate 2 — Spec validation

Spec validation is the content-lock gate. It runs three threshold checks — completeness, assertiveness, ambiguity — each with written evidence. Once it passes, the spec is locked: edits require an explicit validated_to_draft move and a justification.

Tool	Line	Description
`okto_pulse_submit_spec_validation`	`server.py:12323`	Submit a validation record (3 axes + evidence) — gate for `approved → validated`
`okto_pulse_list_spec_validations`	`server.py:12427`	List validation records on a spec

spec_id:                  "spc_b921..."
completeness_score:       85
completeness_evidence:    "All 3 ACs covered by test scenarios; FRs map to BRs br_3da7 and br_5e1c; API contract apc_5b18 covers 200 + 429 paths."
assertiveness_score:      90
assertiveness_evidence:   "Decision dec_e7ca (Redis token-bucket) carries 2 alternatives with rejection reasons. BR br_3da7 (fail-open) carries explicit rationale (pager noise vs minor abuse window)."
ambiguity_score:          88
ambiguity_evidence:       "All choice questions on the parent ideation answered. Sole open question (per-IP fallback for anonymous traffic) resolved via QA item ide_5fa9.qa_3."

The spec content lock is what makes a spec a contract rather than a wiki page. Cards in-flight against a locked spec see the same content the implementer started with. Bypassing the lock to “just fix a typo” is rejected — make the edit explicit by moving the spec to validated_to_draft, recording the change, and re-running both gates.
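A rough sketch of the three-axis check and the content lock follows. Several things here are assumptions rather than documented behaviour: the per-axis thresholds (80 is a placeholder, since this page does not state them), the reading that a higher score is better on every axis including ambiguity, and the `Spec`, `failing_axes`, `validate_and_lock`, and `edit_content` names.

```python
from dataclasses import dataclass

AXES = ("completeness", "assertiveness", "ambiguity")

# Placeholder thresholds: this page does not state the real per-axis values.
ASSUMED_THRESHOLDS = {axis: 80 for axis in AXES}


@dataclass
class Spec:
    """Hypothetical, heavily simplified spec record."""
    content: str
    status: str = "approved"
    locked: bool = False


def failing_axes(record, thresholds=ASSUMED_THRESHOLDS):
    """List every axis that lacks written evidence or scores below its threshold."""
    failures = []
    for axis in AXES:
        score = record.get(f"{axis}_score", 0)
        evidence = record.get(f"{axis}_evidence", "").strip()
        if not evidence or score < thresholds[axis]:
            failures.append(axis)
    return failures


def validate_and_lock(spec, record, thresholds=ASSUMED_THRESHOLDS):
    """Lock the spec only when all three axes pass; return the failing axes."""
    failures = failing_axes(record, thresholds)
    if not failures:
        spec.status = "validated"
        spec.locked = True
    return failures


def edit_content(spec, new_content):
    """Reject silent edits while the spec is locked."""
    if spec.locked:
        raise PermissionError("spec is locked; move it back to draft first")
    spec.content = new_content
```

The point of the sketch is the ordering: evidence and scores are checked per axis, and the lock is a consequence of passing, so an edit after that point has to go through an explicit unlock rather than a quiet write.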

Gate 3 — Task validation

Task validation is the independent-reviewer gate. The closing agent provides a conclusion, completeness, and drift via update_card. A separate reviewer agent then submits a validation record — auto-fail rules apply even if the reviewer’s recommendation is approve.

Tool	Line	Description
`okto_pulse_submit_task_validation`	`server.py:12165`	Submit independent-reviewer evidence — gate for `validation → done`
`okto_pulse_list_task_validations`	`server.py:12246`	List validation records for a card
`okto_pulse_get_task_validation`	`server.py:12283`	Get one validation record by ID

card_id:                "card_771a..."
recommendation:         "approve"
completeness_score:     95
drift_score:            10
review_evidence: [
  "Re-ran tests/test_rate_limit.py — all 7 cases pass (run id ci_4382, latest commit b3a17ec).",
  "Inspected lua/RT_LIMITER_INCR.lua: matches decision dec_e7ca; atomic INCRBYFLOAT + EXPIRE confirmed.",
  "Verified fail-open path via Redis-down chaos test (test_rate_limit_fail_open).",
  "Drift check: INCRBYFLOAT vs spec'd INCRBY is acknowledged in conclusion + decision dec_f201."
]
narrative: "Implementation matches AC1-AC3 with reasonable drift (recorded). Recommend approve."

Auto-fail rules

The Task Validation Gate runs auto-fail checks independent of the reviewer’s recommendation:

Trigger	Outcome
`completeness_score < threshold` (default 70)	`auto_fail: true`; transition rejected
`drift_score > threshold` (default 30) without a linked Decision	`auto_fail: true`; transition rejected
`review_evidence` empty or shorter than the configured minimum	`VALIDATION_ERROR` on submit
Recommendation is `approve` but a linked test scenario lacks structured evidence	`auto_fail: true`; transition rejected (test theater interlock)

Auto-fails are reported via the auto_fail and auto_fail_reason fields on the validation record. They cannot be overridden by the closing agent; the failing condition has to be fixed and the validation re-submitted.
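The rules in the table can be sketched as a single function. The thresholds (70 and 30) and the reason string `drift_above_threshold_no_decision_linked` appear on this page; everything else is an assumption made for illustration, including the `TaskValidation` class, the `linked_decision_ids` and `scenarios_missing_evidence` fields, the other two reason strings, and the minimum of one evidence item.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaskValidation:
    """Hypothetical shape of a submitted task validation."""
    recommendation: str
    completeness_score: int
    drift_score: int
    review_evidence: List[str]
    linked_decision_ids: List[str] = field(default_factory=list)         # assumed field
    scenarios_missing_evidence: List[str] = field(default_factory=list)  # assumed field


def check_submit(v, min_evidence_items=1):
    """Submit-time check; ValueError stands in for VALIDATION_ERROR."""
    if len(v.review_evidence) < min_evidence_items:
        raise ValueError("review_evidence is empty or below the configured minimum")


def auto_fail_reason(v, completeness_threshold=70, drift_threshold=30) -> Optional[str]:
    """Return a reason string if any auto-fail rule fires, else None.

    The reviewer's recommendation never overrides a rule; it only matters
    for the test theater interlock, which applies to 'approve'.
    """
    if v.completeness_score < completeness_threshold:
        return "completeness_below_threshold"  # assumed reason string
    if v.drift_score > drift_threshold and not v.linked_decision_ids:
        return "drift_above_threshold_no_decision_linked"
    if v.recommendation == "approve" and v.scenarios_missing_evidence:
        return "linked_scenario_missing_evidence"  # assumed reason string
    return None
```

With these assumptions, the example record above (completeness 95, drift 10) produces no auto-fail, while a drift of 42 with no linked Decision fails regardless of an `approve` recommendation.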

The Task Validation Gate is a board policy, not a hard requirement. Boards in solo-developer mode can disable it (the validation → done transition then runs only the conclusion + completeness + drift fields check). Multi-agent boards almost always leave it on — it is the cheapest defense against a closing agent rubber-stamping its own work.

Gate 4 — Test theater prevention

The gate on a spec’s move to done requires every linked test scenario to be passed. Without an evidence requirement on update_test_scenario_status, an agent could simply set status = passed and walk away. Pulse rejects this.

Tool	Line	Description
`okto_pulse_update_test_scenario_status`	`server.py:6267`	Update a test scenario’s status — anti-theater gate enforces structured evidence on `passed`

The evidence schema for status passed:

Field	Required	Notes
`test_file_path`	yes	Path to the test source, repo-relative.
`test_function`	yes	Function or test-case name.
`last_run_at`	yes	ISO-8601 timestamp of the latest passing run.
`output_snippet` or `test_run_id`	one of the two	Output snippet for local runs; CI run ID for hosted runs.

Setting status = passed without these fields returns VALIDATION_ERROR. Sprint close re-validates every scenario marked passed — a scenario whose evidence has been deleted between status update and sprint close regresses to pending and the sprint cannot close.

scenario_id:    "tsc_8a1c..."
status:         "passed"
test_file_path: "tests/test_rate_limit.py"
test_function:  "test_burst_429"
last_run_at:    "2026-05-07T17:01:11Z"
test_run_id:    "ci_4382"

The same update without the evidence fields is rejected with VALIDATION_ERROR:

scenario_id: "tsc_8a1c..."
status:      "passed"
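The evidence schema and the sprint-close re-check can be sketched as two small functions. The field names come from the schema table above; the function names, the plain-dict representation of a scenario, and the in-place regression to `pending` are assumptions for illustration, not the server's implementation.

```python
REQUIRED_FIELDS = ("test_file_path", "test_function", "last_run_at")
ONE_OF_FIELDS = ("output_snippet", "test_run_id")


def missing_evidence(update):
    """Return the evidence fields missing from a status update.

    Only status == "passed" carries the evidence gate; other statuses
    always return an empty list.
    """
    if update.get("status") != "passed":
        return []
    missing = [name for name in REQUIRED_FIELDS if not update.get(name)]
    if not any(update.get(name) for name in ONE_OF_FIELDS):
        missing.append(" or ".join(ONE_OF_FIELDS))
    return missing


def revalidate_for_sprint_close(scenarios):
    """Regress passed scenarios that lost their evidence; return their IDs.

    A non-empty result means the sprint cannot close.
    """
    regressed = []
    for scenario in scenarios:
        if missing_evidence(scenario):
            scenario["status"] = "pending"
            regressed.append(scenario["scenario_id"])
    return regressed
```

Applied to the two updates above, the first returns no missing fields and the second reports all three required fields plus the snippet-or-run-ID pair.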

A scenario can also be moved to failing, skipped, or pending. Only passed carries the evidence gate — but failing is what unlocks the bug-card test-first gate (a bug card cannot move to in_progress without a failing linked scenario). See the Sprints & cards page for the full bug-card flow.

The Validator preset

Pulse ships 5 built-in agent presets (okto-pulse-feature-inventory.md:877–887). Apart from Owner, which has full access, Validator is the only preset that carries Permissions.TASK_VALIDATION_SUBMIT by default. The closing agent and the validating agent are intentionally not the same role.

Preset	Notable validation flags
Owner	All validation flags (full access).
Maintainer	Spec evaluation + spec validation; not task validation.
Contributor	None of the validation submit flags by default.
Validator	Task validation submit + spec evaluation submit; cannot author specs.
Read-only	None.

Boards configure agents to a preset on creation. To submit a task validation as a non-Validator preset agent, ask a board admin to assign you the Validator preset or to create a custom preset with Permissions.TASK_VALIDATION_SUBMIT.

On a single-developer board the same human owns every agent role, so the preset matters less. On multi-agent boards (one agent per role) the Validator preset is the protection: the implementer agent cannot rubber-stamp its own work because Permissions.TASK_VALIDATION_SUBMIT is not on its preset.
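The preset table can be modelled as sets of flags to show why the separation holds. This is only a sketch: the three submit flag names are taken from this page, but the `Flag` enum, its bit values, the `PRESETS` mapping, and the `can_submit_task_validation` helper are assumptions, and the read and delete flags are left out.

```python
from enum import Flag


class Permissions(Flag):
    """Illustrative subset of validation flags; bit values are arbitrary."""
    SPEC_EVALUATION_SUBMIT = 1
    SPEC_VALIDATION_SUBMIT = 2
    TASK_VALIDATION_SUBMIT = 4


NO_FLAGS = Permissions(0)
ALL_FLAGS = (
    Permissions.SPEC_EVALUATION_SUBMIT
    | Permissions.SPEC_VALIDATION_SUBMIT
    | Permissions.TASK_VALIDATION_SUBMIT
)

# Mirrors the "Notable validation flags" column of the preset table.
PRESETS = {
    "Owner": ALL_FLAGS,
    "Maintainer": Permissions.SPEC_EVALUATION_SUBMIT | Permissions.SPEC_VALIDATION_SUBMIT,
    "Contributor": NO_FLAGS,
    "Validator": Permissions.TASK_VALIDATION_SUBMIT | Permissions.SPEC_EVALUATION_SUBMIT,
    "Read-only": NO_FLAGS,
}


def can_submit_task_validation(preset_name):
    """True when the preset carries TASK_VALIDATION_SUBMIT."""
    return Permissions.TASK_VALIDATION_SUBMIT in PRESETS[preset_name]
```

An implementer agent on the Contributor preset fails this check, which is what would surface as FORBIDDEN on submit; a custom preset would simply be another entry in the mapping.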

Permissions

Action	Permission flag
Submit / list / get / delete spec evaluation	`Permissions.SPEC_EVALUATION_SUBMIT`, `_READ`, `_DELETE`
Submit / list spec validation	`Permissions.SPEC_VALIDATION_SUBMIT`, `_READ`
Submit / list / get task validation	`Permissions.TASK_VALIDATION_SUBMIT`, `_READ`
Update test scenario status (with evidence)	`Permissions.TEST_SCENARIO_UPDATE`
Move spec across `validated → in_progress`	gate-checks the latest submitted spec evaluation against the threshold
Move spec across `approved → validated`	gate-checks the submitted spec validation record against the thresholds
Move card across `validation → done`	gate-checks the submitted task validation record + auto-fail rules
Sprint close (`review → closed`)	re-validates passed scenarios via test-theater gate

A blocked submit returns FORBIDDEN. Adjust the agent’s preset in Board → Agents → Edit.

Errors

{
  "error": "CONFLICT",
  "message": "Cannot move card to 'done' — task validation auto-failed (drift_score=42 exceeds threshold 30 with no linked Decision).",
  "detail": {"card_id": "card_771a...", "auto_fail_reason": "drift_above_threshold_no_decision_linked"}
}

Code	Most common cause
`VALIDATION_ERROR`	Missing evidence fields on `submit_*_evaluation` / `submit_*_validation` / `update_test_scenario_status`.
`NOT_FOUND`	Spec / card / scenario ID belongs to a different board.
`FORBIDDEN`	Agent’s preset does not include the required validation submit flag. The closing agent cannot submit its own task validation.
`CONFLICT`	Submitted evaluation / validation does not pass the threshold; auto-fail rule triggered; sprint close blocked by a `passed` scenario whose evidence has been removed.
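A calling agent can branch on the error code to decide what to do next. The sketch below assumes the JSON envelope shown above (`error`, `message`, `detail`); the `next_action` helper and the wording of each suggested action are illustrative and simply restate the table.

```python
import json

# Suggested follow-up per error code, paraphrasing the table above.
ACTIONS = {
    "VALIDATION_ERROR": "add the missing evidence fields and resubmit",
    "NOT_FOUND": "check that the spec, card, or scenario ID belongs to this board",
    "FORBIDDEN": "ask a board admin to adjust the agent's preset",
    "CONFLICT": "fix the failing gate condition, then resubmit the evidence and retry the move",
}


def next_action(response_text):
    """Map an error response body to a suggested follow-up action."""
    payload = json.loads(response_text)
    code = payload.get("error")
    action = ACTIONS.get(code, "unrecognized error code; inspect the message")
    # auto_fail_reason is present in the CONFLICT example above; treat it as optional.
    reason = payload.get("detail", {}).get("auto_fail_reason")
    if reason:
        action += f" (auto_fail_reason: {reason})"
    return action
```

For the CONFLICT example above, this returns the CONFLICT action with the `drift_above_threshold_no_decision_linked` reason appended, which tells the agent that linking a Decision or reducing drift is the fix, not retrying the move.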

Next steps

Specs

Author the spec content the evaluation and validation gates score.

Sprints & cards

Card lifecycle, conclusion fields, and the bug-card test-first gate.

Comments & questions

Resolve gate-blocking ambiguity via Q&A on the parent entity.

ADLC pipeline

Where each gate sits in the end-to-end flow.
