Validation gates

Validation in Pulse is not advisory. The four governance gates are enforced server-side, and 10 of Pulse’s 198 MCP tools exist specifically to record the evidence those gates require. Skip the evidence and the transition fails — move_spec and move_card reject the request with a CONFLICT error. This page documents the four gates and the tools that drive them:
  • Gate 1 — Spec evaluation (validated → in_progress). Qualitative score with narrative; default threshold 80. 4 tools (server.py:10089–10974).
  • Gate 2 — Spec validation (approved → validated). Three-axis threshold check (completeness / assertiveness / ambiguity) that locks spec content. 2 tools (server.py:12323–12426).
  • Gate 3 — Task validation (validation → done). Independent reviewer evidence; auto-fail on threshold violations. 3 tools (server.py:12165–12322).
  • Gate 4 — Test theater prevention (update_test_scenario_status). Test status updates require structured evidence; sprint close re-validates every passed scenario. 1 tool (server.py:6267).
Source-of-truth citations: okto-pulse-feature-inventory.md:353–370 (spec evaluation), :371–377 (spec validation), :485–492 (task validation), :333–352 (test scenario status), :982–1021 (governance gates), :877–887 (Validator preset). For lifecycle context — how each gate slots into the pipeline — see The ADLC pipeline. For permission-flag definitions, see Permissions on the MCP overview.

Why four gates and not one

Pulse splits validation into four because they answer different questions:
| Gate | The question it answers | Failure mode it prevents |
| --- | --- | --- |
| Spec evaluation | Is this spec good enough to start implementing? | Specs that read fine to one agent but cannot be executed without re-deriving requirements. |
| Spec validation | Is the spec content stable enough to be a contract? | Silent edits to a spec while cards are in-flight. |
| Task validation | Did the implementer actually deliver what the spec asked for? | “Done” cards with hand-waved conclusions and no independent review. |
| Test theater prevention | Are passed scenarios backed by real test runs? | Marking scenarios passed without structured evidence; sprint close hides regression. |
Each gate has its own permission flag, its own data shape, and its own retry path. An agent cannot satisfy one gate by impersonating another.

Gate 1 — Spec evaluation

The spec evaluation captures a qualitative judgement of spec quality: a 0–100 score, which the gate threshold is measured against, plus a written narrative that justifies the score. Default threshold: score ≥ 80. Configurable per board.
| Tool | Line | Description |
| --- | --- | --- |
| okto_pulse_submit_spec_evaluation | server.py:10089 | Submit a qualitative evaluation (score 0–100 + narrative); gate for validated → in_progress |
| okto_pulse_list_spec_evaluations | server.py:10194 | List evaluations recorded on a spec |
| okto_pulse_get_spec_evaluation | server.py:10256 | Get one evaluation by ID |
| okto_pulse_delete_spec_evaluation | server.py:10293 | Delete an evaluation |
spec_id: "spc_b921..."
score:   85
narrative: "Acceptance criteria are concrete and measurable; AC1-AC3 each map to a test scenario with structured evidence requirements. Decision dec_e7ca (Redis token-bucket) carries 2 rejected alternatives with real reasons. One small gap: error budget for the fail-open path is not quantified — flagged as a follow-up under sprint Q&A spr_a01b."
Submitting an evaluation does not advance the spec; it only records the evidence. The transition itself happens via move_spec, which re-checks the latest evaluation against the threshold. An evaluation can be deleted and re-submitted while the spec is still validated.
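The split between recording evidence and moving the spec can be sketched as follows. This is a minimal illustration under stated assumptions, not Pulse's server code: `SpecEvaluation`, `latest_evaluation`, and `passes_evaluation_gate` are hypothetical names, and the `submitted_at` field is assumed here only so that "latest" has a concrete meaning.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence

DEFAULT_THRESHOLD = 80  # documented default; configurable per board


@dataclass(frozen=True)
class SpecEvaluation:
    """Hypothetical record shape; only score and narrative come from the docs."""
    score: int               # 0-100
    narrative: str
    submitted_at: datetime   # assumed field, used to pick the latest record


def latest_evaluation(evals: Sequence[SpecEvaluation]) -> Optional[SpecEvaluation]:
    """Return the most recently submitted evaluation, or None if none exist."""
    return max(evals, key=lambda e: e.submitted_at, default=None)


def passes_evaluation_gate(
    evals: Sequence[SpecEvaluation], threshold: int = DEFAULT_THRESHOLD
) -> bool:
    """Mimic the move-time re-check: the latest evaluation must meet the threshold."""
    latest = latest_evaluation(evals)
    return latest is not None and latest.score >= threshold
```

Under this reading, an older passing evaluation does not rescue a spec whose most recent evaluation falls below the threshold; the newer record has to be deleted or superseded first.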

Gate 2 — Spec validation

Spec validation is the content-lock gate. It runs three threshold checks — completeness, assertiveness, ambiguity — each with written evidence. Once it passes, the spec is locked: edits require an explicit validated_to_draft move and a justification.
| Tool | Line | Description |
| --- | --- | --- |
| okto_pulse_submit_spec_validation | server.py:12323 | Submit a validation record (3 axes + evidence); gate for approved → validated |
| okto_pulse_list_spec_validations | server.py:12427 | List validation records on a spec |
spec_id:                  "spc_b921..."
completeness_score:       85
completeness_evidence:    "All 3 ACs covered by test scenarios; FRs map to BRs br_3da7 and br_5e1c; API contract apc_5b18 covers 200 + 429 paths."
assertiveness_score:      90
assertiveness_evidence:   "Decision dec_e7ca (Redis token-bucket) carries 2 alternatives with rejection reasons. BR br_3da7 (fail-open) carries explicit rationale (pager noise vs minor abuse window)."
ambiguity_score:          88
ambiguity_evidence:       "All choice questions on the parent ideation answered. Sole open question (per-IP fallback for anonymous traffic) resolved via QA item ide_5fa9.qa_3."
The spec content lock is what makes a spec a contract rather than a wiki page. Cards in-flight against a locked spec see the same content the implementer started with. Bypassing the lock to “just fix a typo” is rejected. Make the edit explicit instead: move the spec to validated_to_draft, record the change, and re-run both spec gates (spec validation and spec evaluation).
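The three-axis check described above can be sketched as a small function. This is an illustration under assumptions rather than the gate's actual logic: `AxisResult` and `failing_axes` are hypothetical names, the per-axis thresholds are passed in by the caller because this page does not state their defaults, and treating every axis as "higher is better" is an assumption drawn from the example payload.

```python
from dataclasses import dataclass

AXES = ("completeness", "assertiveness", "ambiguity")


@dataclass(frozen=True)
class AxisResult:
    score: int
    evidence: str


def failing_axes(record: dict[str, AxisResult], thresholds: dict[str, int]) -> list[str]:
    """Return the axes that would block the approved -> validated move.

    Assumptions: every axis is scored so that higher is better, and an axis
    with blank evidence fails regardless of its score.
    """
    failures = []
    for axis in AXES:
        result = record[axis]
        if not result.evidence.strip() or result.score < thresholds[axis]:
            failures.append(axis)
    return failures
```

Returning the list of failing axes, rather than a single boolean, keeps the retry path obvious: the submitter knows which evidence to strengthen before re-submitting.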

Gate 3 — Task validation

Task validation is the independent-reviewer gate. The closing agent provides a conclusion, completeness, and drift via update_card. A separate reviewer agent then submits a validation record — auto-fail rules apply even if the reviewer’s recommendation is approve.
| Tool | Line | Description |
| --- | --- | --- |
| okto_pulse_submit_task_validation | server.py:12165 | Submit independent-reviewer evidence; gate for validation → done |
| okto_pulse_list_task_validations | server.py:12246 | List validation records for a card |
| okto_pulse_get_task_validation | server.py:12283 | Get one validation record by ID |
card_id:                "card_771a..."
recommendation:         "approve"
completeness_score:     95
drift_score:            10
review_evidence: [
  "Re-ran tests/test_rate_limit.py — all 7 cases pass (run id ci_4382, latest commit b3a17ec).",
  "Inspected lua/RT_LIMITER_INCR.lua: matches decision dec_e7ca; atomic INCRBYFLOAT + EXPIRE confirmed.",
  "Verified fail-open path via Redis-down chaos test (test_rate_limit_fail_open).",
  "Drift check: INCRBYFLOAT vs spec'd INCRBY is acknowledged in conclusion + decision dec_f201."
]
narrative: "Implementation matches AC1-AC3 with reasonable drift (recorded). Recommend approve."

Auto-fail rules

The Task Validation Gate runs auto-fail checks independent of the reviewer’s recommendation:
| Trigger | Outcome |
| --- | --- |
| completeness_score < threshold (default 70) | auto_fail: true; transition rejected |
| drift_score > threshold (default 30) without a linked Decision | auto_fail: true; transition rejected |
| review_evidence empty or shorter than the configured minimum | VALIDATION_ERROR on submit |
| Recommendation is approve but a linked test scenario lacks structured evidence | auto_fail: true; transition rejected (test theater interlock) |
Auto-fails are reported via the auto_fail and auto_fail_reason fields on the validation record. They cannot be overridden by the closing agent; the failing condition has to be fixed and the validation re-submitted.
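The auto-fail rules can be read as a pure function of the validation record. The sketch below is hedged accordingly: `TaskValidation`, its `has_linked_decision` and `scenarios_missing_evidence` fields, and the order in which the rules are checked are all assumptions. Of the reason strings, only `drift_above_threshold_no_decision_linked` appears in this page's error example; the other two are invented for illustration. The `review_evidence` rule is left out because it is rejected at submit time with VALIDATION_ERROR rather than recorded as an auto-fail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskValidation:
    """Hypothetical shape; the last two fields are assumed for the sketch."""
    recommendation: str
    completeness_score: int
    drift_score: int
    has_linked_decision: bool        # assumed: a Decision is linked to the drift
    scenarios_missing_evidence: int  # assumed: linked scenarios lacking evidence


def auto_fail_reason(
    v: TaskValidation, completeness_min: int = 70, drift_max: int = 30
) -> Optional[str]:
    """Return a reason string if an auto-fail rule triggers, else None."""
    if v.completeness_score < completeness_min:
        return "completeness_below_threshold"  # hypothetical reason string
    if v.drift_score > drift_max and not v.has_linked_decision:
        return "drift_above_threshold_no_decision_linked"  # from the error example
    if v.recommendation == "approve" and v.scenarios_missing_evidence > 0:
        return "test_scenario_missing_evidence"  # hypothetical reason string
    return None
```

The function never consults the recommendation for the first two rules, which mirrors the point that an approve from the reviewer does not override a threshold violation.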
The Task Validation Gate is a board policy, not a hard requirement. Boards in solo-developer mode can disable it (the validation → done transition then runs only the conclusion + completeness + drift fields check). Multi-agent boards almost always leave it on — it is the cheapest defense against a closing agent rubber-stamping its own work.

Gate 4 — Test theater prevention

A spec’s transition to done requires every linked test scenario to be passed. Without an evidence requirement on update_test_scenario_status, agents could simply set status = passed and walk away. Pulse rejects this.
| Tool | Line | Description |
| --- | --- | --- |
| okto_pulse_update_test_scenario_status | server.py:6267 | Update a test scenario’s status; the anti-theater gate enforces structured evidence on passed |
The evidence schema for status passed:
| Field | Required | Notes |
| --- | --- | --- |
| test_file_path | yes | Path to the test source, repo-relative. |
| test_function | yes | Function or test-case name. |
| last_run_at | yes | ISO-8601 timestamp of the latest passing run. |
| output_snippet or test_run_id | one of the two | Output snippet for local runs; CI run ID for hosted runs. |
Setting status = passed without these fields returns VALIDATION_ERROR. Sprint close re-validates every scenario marked passed — a scenario whose evidence has been deleted between status update and sprint close regresses to pending and the sprint cannot close.
With structured evidence (accepted):
scenario_id:    "tsc_8a1c..."
status:         "passed"
test_file_path: "tests/test_rate_limit.py"
test_function:  "test_burst_429"
last_run_at:    "2026-05-07T17:01:11Z"
test_run_id:    "ci_4382"
Without evidence fields (rejected with VALIDATION_ERROR):
scenario_id: "tsc_8a1c..."
status:      "passed"
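The evidence schema for passed can be expressed as a short check. This is a sketch under assumptions, not the server's validator: `missing_evidence` is a hypothetical helper, the payload is assumed to be a flat dict of strings shaped like the examples above, and the ISO-8601 check relies on Python's `datetime.fromisoformat`, which accepts only a subset of ISO-8601 on older Python versions.

```python
from datetime import datetime

REQUIRED_FIELDS = ("test_file_path", "test_function", "last_run_at")
ONE_OF_FIELDS = ("output_snippet", "test_run_id")


def missing_evidence(payload: dict) -> list[str]:
    """List evidence problems for a status update; an empty list means acceptable."""
    if payload.get("status") != "passed":
        return []  # only "passed" carries the evidence gate

    # Required fields that are absent or empty.
    problems = [f for f in REQUIRED_FIELDS if not payload.get(f)]

    # At least one of output_snippet / test_run_id must be present.
    if not any(payload.get(f) for f in ONE_OF_FIELDS):
        problems.append("output_snippet or test_run_id")

    # If a timestamp was supplied, it should parse as ISO-8601.
    ts = payload.get("last_run_at")
    if ts:
        try:
            # Normalise a trailing "Z" so older Python versions can parse it.
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except ValueError:
            problems.append("last_run_at (not ISO-8601)")
    return problems
```

Run against the two payloads above, the first yields an empty list and the second reports all three required fields plus the missing one-of pair.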
A scenario can also be moved to failing, skipped, or pending. Only passed carries the evidence gate — but failing is what unlocks the bug-card test-first gate (a bug card cannot move to in_progress without a failing linked scenario). See the Sprints & cards page for the full bug-card flow.

The Validator preset

Pulse ships 5 built-in agent presets (okto-pulse-feature-inventory.md:877–887). Apart from Owner, which has full access, Validator is the only preset that carries Permissions.TASK_VALIDATION_SUBMIT by default. The closing agent and the validating agent are intentionally not the same role.
| Preset | Notable validation flags |
| --- | --- |
| Owner | All validation flags (full access). |
| Maintainer | Spec evaluation + spec validation; not task validation. |
| Contributor | None of the validation submit flags by default. |
| Validator | Task validation submit + spec evaluation submit; cannot author specs. |
| Read-only | None. |
Boards assign each agent a preset on creation. To submit a task validation from an agent on a different preset, ask a board admin to assign it the Validator preset or to create a custom preset that includes Permissions.TASK_VALIDATION_SUBMIT.
On a single-developer board the same human owns every agent role, so the preset matters less. On multi-agent boards (one agent per role) the Validator preset is the protection: the implementer agent cannot rubber-stamp its own work because the Permissions.TASK_VALIDATION_SUBMIT flag is not on its preset.
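The preset table can be modelled as data to make the separation of roles concrete. This is a hedged sketch: `PRESET_SUBMIT_FLAGS` and `can_submit` are hypothetical, only the three submit flags are modelled, and reading the Maintainer row as "both spec submit flags" is an assumption. Custom presets are not covered.

```python
# Submit flags per built-in preset, as read from the preset table above.
PRESET_SUBMIT_FLAGS = {
    "Owner": frozenset(
        {"SPEC_EVALUATION_SUBMIT", "SPEC_VALIDATION_SUBMIT", "TASK_VALIDATION_SUBMIT"}
    ),
    "Maintainer": frozenset({"SPEC_EVALUATION_SUBMIT", "SPEC_VALIDATION_SUBMIT"}),
    "Contributor": frozenset(),
    "Validator": frozenset({"TASK_VALIDATION_SUBMIT", "SPEC_EVALUATION_SUBMIT"}),
    "Read-only": frozenset(),
}


def can_submit(preset: str, flag: str) -> bool:
    """Return True if the preset carries the given submit flag by default."""
    return flag in PRESET_SUBMIT_FLAGS.get(preset, frozenset())
```

For example, `can_submit("Contributor", "TASK_VALIDATION_SUBMIT")` is false, which is the case in which a submit would come back as FORBIDDEN.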

Permissions

| Action | Permission flag |
| --- | --- |
| Submit / list / get / delete spec evaluation | Permissions.SPEC_EVALUATION_SUBMIT, _READ, _DELETE |
| Submit / list spec validation | Permissions.SPEC_VALIDATION_SUBMIT, _READ |
| Submit / list / get task validation | Permissions.TASK_VALIDATION_SUBMIT, _READ |
| Update test scenario status (with evidence) | Permissions.TEST_SCENARIO_UPDATE |
| Move spec across validated → in_progress | Gate-checks the evaluation submitted under SPEC_EVALUATION_SUBMIT against the threshold |
| Move spec across approved → validated | Gate-checks the record submitted under SPEC_VALIDATION_SUBMIT against the threshold |
| Move card across validation → done | Gate-checks the record submitted under TASK_VALIDATION_SUBMIT, plus the auto-fail rules |
| Sprint close (review → closed) | Re-validates passed scenarios via the test-theater gate |
A blocked submit returns FORBIDDEN. Adjust the agent’s preset in Board → Agents → Edit.

Errors

{
  "error": "CONFLICT",
  "message": "Cannot move card to 'done' — task validation auto-failed (drift_score=42 exceeds threshold 30 with no linked Decision).",
  "detail": {"card_id": "card_771a...", "auto_fail_reason": "drift_above_threshold_no_decision_linked"}
}
| Code | Most common cause |
| --- | --- |
| VALIDATION_ERROR | Missing evidence fields on submit_*_evaluation / submit_*_validation / update_test_scenario_status. |
| NOT_FOUND | Spec / card / scenario ID belongs to a different board. |
| FORBIDDEN | Agent’s preset does not include the required validation submit flag. The closing agent cannot submit its own task validation. |
| CONFLICT | Submitted evaluation / validation does not pass the threshold; auto-fail rule triggered; sprint close blocked by a passed scenario whose evidence has been removed. |
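A client can turn these codes into a next action. The sketch below assumes the response body has the `error` / `message` / `detail` shape of the JSON example above; `triage` and the hint strings in `NEXT_STEP` are hypothetical and only paraphrase the causes listed in the table.

```python
import json

# Hypothetical hints paraphrasing the causes in the error table.
NEXT_STEP = {
    "VALIDATION_ERROR": "Add the missing evidence fields and resubmit.",
    "NOT_FOUND": "Check that the ID belongs to the current board.",
    "FORBIDDEN": "Ask a board admin to adjust the agent's preset.",
    "CONFLICT": "Fix the failing gate condition, resubmit the evidence, then retry the move.",
}


def triage(raw: str) -> str:
    """Summarise an error response as '<code>: <hint>', plus any auto-fail reason."""
    body = json.loads(raw)
    code = body.get("error", "")
    hint = NEXT_STEP.get(code, "Unrecognised error code; inspect the message.")
    # "detail" may be absent or null, so fall back to an empty dict.
    reason = (body.get("detail") or {}).get("auto_fail_reason")
    suffix = f" (auto_fail_reason={reason})" if reason else ""
    return f"{code}: {hint}{suffix}"
```

Surfacing `auto_fail_reason` alongside the code matters for CONFLICT in particular, since that one code covers threshold failures, auto-fails, and blocked sprint closes.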

Next steps

  • Specs: Author the spec content the evaluation and validation gates score.
  • Sprints & cards: Card lifecycle, conclusion fields, and the bug-card test-first gate.
  • Comments & questions: Resolve gate-blocking ambiguity via Q&A on the parent entity.
  • ADLC pipeline: Where each gate sits in the end-to-end flow.

Last modified on May 8, 2026