- Persona annotations are advisory only and MUST NOT override the derived primary outcome.
- The function MUST return: `primary_outcome`, `refusal` (bool), `blocking_reasons` (list), and `confidence_level` (one of `high|medium|low`).
- No component may infer or mutate the `primary_outcome` outside of this contract; any attempt is recorded as a diagnostic event.

Rationale
---------
Centralizing outcome derivation improves auditability and prevents duplicated heuristics from appearing in multiple modules. The canonical function is deterministic, and tests enforce idempotence and precedence.
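
A minimal sketch of what a single deterministic derivation function could look like. The event shape (`outcome`, `reason`, `source`), the precedence order, and the confidence rule are assumptions made for illustration; this excerpt of the contract does not state them.

```python
from typing import Any, Dict, List

# Assumed precedence, highest first. The contract excerpt does not spell out
# the order, so treat this tuple as a placeholder.
PRECEDENCE = ("REFUSAL", "BLOCKED", "DIAGNOSTIC_ONLY")


def derive_primary_outcome(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Derive one primary outcome from recorded events (illustrative only)."""
    # Persona annotations are advisory, so they never take part in derivation.
    # The "source" key is a hypothetical way of marking them.
    authoritative = [e for e in events if e.get("source") != "persona"]
    seen = {e.get("outcome") for e in authoritative}
    primary = next((o for o in PRECEDENCE if o in seen), "SUCCESS")
    blocking = sorted(
        e.get("reason", "unspecified")
        for e in authoritative
        if e.get("outcome") in ("REFUSAL", "BLOCKED")
    )
    return {
        "primary_outcome": primary,
        "refusal": primary == "REFUSAL",
        "blocking_reasons": blocking,
        # Hypothetical rule: any diagnostic lowers confidence one step.
        "confidence_level": "medium" if "DIAGNOSTIC_ONLY" in seen else "high",
    }
```

Because the function reads only its input and sorts what it returns, calling it twice on the same events yields equal results, which is the kind of idempotence check the tests are said to enforce.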
This contract defines the authority ceilings for the compiler and enforces guard-only behavior: the compiler must never escalate authority beyond what the spec explicitly grants unless that escalation is exposed as a BLOCKER + DIAGNOSTIC and recorded in the run events.

Principles
- No new inference, synthesis, or heuristic behavior is permitted by this phase. Phase 12 is purely guard-and-enforcement.
- Tier A authority must not be silently resolved by the compiler. Any Tier A synthesis must be accompanied by an explicit BLOCKER event and a DIAGNOSTIC event for auditability.
- REFUSAL outcomes require explicit authority metadata (`evidence.refusal.authority`), and the compiler asserts its presence at finalization. REFUSAL must not be used to mask missing authority.

Enforcement
- Assertions are centralized in `finalize_checklist` to fail fast when authority ceilings are violated.
- Unit tests must verify that missing authority causes assertion failures rather than silent behavior.
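
The Tier A rule above can be sketched as a fail-fast guard. The function name and the event shape (`level`, `ptr`) are hypothetical; only `meta.tier`, `meta.synthesized_default`, and the BLOCKER/DIAGNOSTIC pairing come from the governance text.

```python
from typing import Any, Dict, List


def assert_tier_a_synthesis_is_visible(
    items: List[Dict[str, Any]], events: List[Dict[str, Any]]
) -> None:
    """Fail fast if a Tier A synthesis lacks its BLOCKER + DIAGNOSTIC pair."""
    flagged = {(e.get("level"), e.get("ptr")) for e in events}
    for item in items:
        meta = item.get("meta", {})
        if meta.get("tier") == "A" and meta.get("synthesized_default"):
            ptr = item.get("ptr")
            for level in ("BLOCKER", "DIAGNOSTIC"):
                assert (level, ptr) in flagged, (
                    f"Tier A synthesis at {ptr!r} lacks a {level} event"
                )
```

A unit test for this guard would pass a synthesized Tier A item with only one of the two events and expect an `AssertionError`, so that a missing event cannot go unnoticed.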
- `meta.justification`: short machine-readable string explaining why the value was created (e.g., `safe_default`, `missing_spec_pointer`, `heuristic_prose_keyword_match`). For BLOCKER/REFUSAL-related inferences, the justification MUST reference the affected pointer(s) via `meta.justification_ptr` or by embedding the pointer in the justification code.
- The compiler attaches explainability metadata at each inference site; unit tests and CI guards assert its presence.
- Tier A inferences without a corresponding checklist item or without explainability metadata are considered violations and are detected by compiler assertions and failing tests.
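
One way to attach this metadata at an inference site is a small helper that every synthesis path calls. The helper name `with_explainability` and the example values for `source` and `inference_type` are assumptions; the `meta.*` field names follow the contract.

```python
from typing import Any, Dict, Optional


def with_explainability(
    item: Dict[str, Any],
    *,
    source: str,
    justification: str,
    inference_type: str,
    justification_ptr: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of `item` carrying explainability metadata."""
    meta = dict(item.get("meta", {}))
    meta.update(
        source=source,
        justification=justification,
        inference_type=inference_type,
    )
    if justification_ptr is not None:
        # Needed for BLOCKER/REFUSAL-related inferences per the contract.
        meta["justification_ptr"] = justification_ptr
    return {**item, "meta": meta}
```

Making the three core fields keyword-only and required means a call site cannot create an inferred value without stating where it came from and why.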
docs/governance/INVARIANTS.md
+49 lines changed: 49 additions & 0 deletions
@@ -162,6 +162,16 @@ This file declares the governance anchor; enforcement logic will be implemented
- `finalize_checklist(...)` is the sole emission boundary.
- This invariant is enforced by code-level assertions and tests.

## Refusal Authority Invariant (Phase 11C)

- Statement: Every event with outcome `REFUSAL` **must** include structured refusal metadata under `evidence.refusal` containing keys: `authority` (non-empty string), `trigger`, `scope`, and `justification`.
- Enforcement point: `src/shieldcraft/engine.finalize_checklist` asserts `authority` presence and exposes `checklist.refusal_authority`.
- Failure classification: Missing or invalid `authority` is treated as a compiler assertion/implementation error and must fail the finalization boundary to avoid ambiguous REFUSALs.
- Testable requirements:
  - Unit tests must cover (a) REFUSALs recorded via `record_refusal_event` propagate authority into the finalized checklist, and (b) REFUSALs recorded without `authority` cause finalization to raise an `AssertionError` and emit a paired diagnostic entry.
- Evidence: Contract `docs/governance/REFUSAL_AUTHORITY_CONTRACT.md`, tests (`tests/test_refusal_authority.py`), and CI guard (`tests/ci/test_refusal_authority_persistence.py`).
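
The check described in this invariant might look roughly like the sketch below. It is not the code in `finalize_checklist`; the function name, the diagnostic entry shape, and the `refusal_authority_missing` code are invented for illustration.

```python
from typing import Any, Dict, List, Optional

REQUIRED_REFUSAL_KEYS = ("authority", "trigger", "scope", "justification")


def finalize_refusal_authority(
    events: List[Dict[str, Any]], diagnostics: List[Dict[str, Any]]
) -> Optional[str]:
    """Return the refusal authority, or None if no REFUSAL was recorded."""
    found = None
    for event in events:
        if event.get("outcome") != "REFUSAL":
            continue
        refusal = event.get("evidence", {}).get("refusal", {})
        missing = [key for key in REQUIRED_REFUSAL_KEYS if key not in refusal]
        authority = refusal.get("authority")
        if "authority" not in missing and not (
            isinstance(authority, str) and authority.strip()
        ):
            missing.append("authority")  # present but empty or not a string
        if missing:
            # Emit the paired diagnostic entry before failing the boundary.
            diagnostics.append(
                {
                    "level": "DIAGNOSTIC",
                    "code": "refusal_authority_missing",  # hypothetical code
                    "missing": missing,
                }
            )
            raise AssertionError(f"REFUSAL lacks refusal metadata: {missing}")
        found = found or authority
    return found
```

The returned value corresponds to what the invariant calls `checklist.refusal_authority`; appending the diagnostic before raising keeps the failure visible even when the caller catches the assertion.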

## Semantic Outcome Invariants (Phase 5)

- Statement: Every emitted checklist MUST contain a single canonical `primary_outcome` with value one of: `SUCCESS`, `REFUSAL`, `BLOCKED`, `DIAGNOSTIC_ONLY`.
@@ -203,3 +213,42 @@ This file declares the governance anchor; enforcement logic will be implemented
- Lock: This invariant is locked by Phase 8 and may not be changed except via a governance phase update.
- Statement: All inferred, synthesized, coerced, or derived values that affect checklist emission or gating MUST include machine-readable explainability metadata attached to the affected object (item/meta/evidence/header). No silent inference is permitted: every non-explicit value must carry provenance and a justification code.
- Required fields: `meta.source`, `meta.justification`, `meta.inference_type`, and `meta.tier` (when applicable for Tier A/B/C). BLOCKER/REFUSAL-related inferences MUST reference affected pointer(s) either via `meta.justification_ptr` or via an explicit pointer embedded in `meta.justification`.
- Additional required provenance for specific cases:
  - Coercions MUST preserve `meta.original_value` for auditability.
  - Derived tasks MUST include `meta.derived_from = <parent_id>` and `meta.justification` referencing the derivation rule.
  - Confidence assignments MUST include `confidence_meta` with `source` and `justification` fields.
  - Unknown invariant expressions that are defaulted MUST attach `explainability` metadata and emit a DIAGNOSTIC checklist item.
- Enforcement: Compiler unit tests and CI guards detect missing explainability metadata; Tier A omissions are BLOCKERs and Tier B omissions are DIAGNOSTICs. Missing required provenance (e.g., `original_value` for coercions or `derived_from` for derived tasks) fails CI and must be remediated.
- Rationale: Prevent silent intent invention by requiring that every automatic decision be auditable, machine-filterable, and provenance-bound.
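
A CI guard for the required fields could be as small as the following sketch. The function names are hypothetical, and treating anything other than Tier A as DIAGNOSTIC is an assumption (the text only classifies Tier A and Tier B). The `coerced` and `derived` source values are the ones the Phase 11B requirements name.

```python
from typing import Any, Dict, List, Optional

BASE_FIELDS = ("source", "justification", "inference_type")


def missing_provenance(item: Dict[str, Any]) -> List[str]:
    """List the explainability fields an inferred item fails to carry."""
    meta = item.get("meta", {})
    missing = [f"meta.{name}" for name in BASE_FIELDS if not meta.get(name)]
    if meta.get("source") == "coerced" and "original_value" not in meta:
        missing.append("meta.original_value")
    if meta.get("source") == "derived" and not meta.get("derived_from"):
        missing.append("meta.derived_from")
    return missing


def omission_severity(item: Dict[str, Any]) -> Optional[str]:
    """Map an omission to BLOCKER (Tier A) or DIAGNOSTIC (assumed otherwise)."""
    if not missing_provenance(item):
        return None
    return "BLOCKER" if item.get("meta", {}).get("tier") == "A" else "DIAGNOSTIC"
```

Returning field names rather than a boolean lets a failing test report exactly which piece of provenance is absent.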
---

## Compiler Hardening Invariants (Phase 11A)

---

## Inference Explainability Invariant (Phase 11B)

- Statement: All synthesized, inferred, coerced, or derived data emitted by the compiler MUST include explainability metadata according to `docs/governance/INFERENCE_EXPLAINABILITY_CONTRACT.md`.
- Testable requirements:
  - Any checklist item with `meta.synthesized_default == True` MUST have `meta.source`, `meta.justification`, and `meta.inference_type` defined.
  - Any item whose fields are coerced or normalized by `ChecklistModel.normalize_item` MUST include `meta.source = "coerced"` and `meta.justification`.
  - Any derived task emitted by `infer_tasks` MUST include `meta.source = "derived"` and `meta.justification`.
  - Invariant evaluations that return safe defaults for unknown expressions MUST attach `explainability` metadata to the `invariant_results` entries.
- Enforcement: Verified by unit tests and CI guards; violations are test failures and must be remediated promptly.

- Statement: Compiler hardening measures introduced in Phase 11A (Tier enforcement, default synthesis, insufficiency diagnostics, and checklist quality scoring) are authoritative and must be enforced by the compiler pipeline.

- Testable invariants:
  - Tier enforcement is implemented in `src/shieldcraft/services/checklist/tier_enforcement.py::enforce_tiers` and must emit checklist items for missing Tier A/B sections (BLOCKER/DIAGNOSTIC respectively).
  - Default synthesis is implemented in `src/shieldcraft/services/spec/defaults.py::synthesize_missing_spec_fields` and must be invoked exactly once during compiler entry (currently in `ChecklistGenerator.build`).
  - Spec sufficiency diagnostics are implemented via `src/shieldcraft/services/spec/analysis.py::check_spec_sufficiency` and must produce DIAGNOSTIC checklist items without aborting compilation.
  - Checklist quality scoring is implemented in `src/shieldcraft/services/checklist/quality.py::compute_checklist_quality` and its result MUST be attached to `checklist.meta.checklist_quality` by `finalize_checklist`.

- Enforcement: Unit tests and CI guards verify these invariants; any regression must be patched and re-locked via governance decision.
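
To illustrate the tier-enforcement invariant, here is a toy version of the rule "missing Tier A section yields a BLOCKER, missing Tier B section yields a DIAGNOSTIC". It is not the real `enforce_tiers`; the section names, the item shape, and the `inference_type` value are placeholders.

```python
from typing import Any, Dict, List

# Placeholder section names; the actual Tier A/B lists live in the spec.
TIER_A_SECTIONS = ("metadata", "invariants")
TIER_B_SECTIONS = ("determinism",)


def enforce_tiers_sketch(spec: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Emit one checklist item per missing Tier A/B section."""
    items = []
    for tier, severity, sections in (
        ("A", "BLOCKER", TIER_A_SECTIONS),
        ("B", "DIAGNOSTIC", TIER_B_SECTIONS),
    ):
        for name in sections:
            if name not in spec:
                items.append(
                    {
                        "ptr": f"/{name}",
                        "severity": severity,
                        "meta": {
                            "tier": tier,
                            "source": "tier_enforcement",
                            "justification": "missing_spec_pointer",
                            "inference_type": "tier_check",  # placeholder
                        },
                    }
                )
    return items
```

Iterating over fixed tuples keeps the emitted order stable across runs, which matters for the determinism requirements stated elsewhere in these contracts.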
Phase 13 guarantees that compiler behavior is stable and non-drifting under over-complete and redundant specifications. This phase is guard-only: it adds tests and validations to detect conflicts and ensure invariance; it does not alter inference, synthesis, or authority ceilings.

Principles
- Redundant or repeated spec elements must not amplify inference or authority.
- Extra non-conflicting detail must not change primary outcomes or escalate severities.
- Conflicting explicit instructions must be surfaced (DIAGNOSTIC/BLOCKER) and not auto-resolved by the compiler.
- Deterministic behavior (ordering, ids, hashing) must hold at scale.

Enforcement
- Deterministic unit tests verify redundancy tolerance, over-spec stability, explicit conflict visibility, and scale invariance.
- Any violation that suggests silent authority escalation or resolution fails a test assertion and must be investigated.
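
The redundancy and determinism principles can be illustrated with content-addressed ids. This is a sketch under assumptions: the element shape (`ptr`, `text`) and all three function names are hypothetical, and the compiler's actual id scheme is not shown in this diff.

```python
import hashlib
import json
from typing import Any, Dict, List


def stable_id(element: Dict[str, Any]) -> str:
    """Hash a canonical JSON form so key order never affects the id."""
    canonical = json.dumps(element, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def compile_ids(spec_elements: List[Dict[str, Any]]) -> List[str]:
    """Deduplicate by content and return ids in a fixed order."""
    return sorted({stable_id(e) for e in spec_elements})


def find_conflicts(spec_elements: List[Dict[str, Any]]) -> List[str]:
    """Return pointers that carry more than one distinct instruction."""
    by_ptr: Dict[str, set] = {}
    for element in spec_elements:
        by_ptr.setdefault(element["ptr"], set()).add(stable_id(element))
    return sorted(ptr for ptr, ids in by_ptr.items() if len(ids) > 1)
```

Repeating or reordering elements leaves `compile_ids` unchanged, while two different instructions at the same pointer show up in `find_conflicts` so they can be surfaced rather than resolved silently.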
Decision: AUTHORITATIVE — Persona Protocol Boundary Locked (Phase 15)

Summary
- Personas are scoped specialists that may provide annotations, diagnostics, or advisory constraints but MUST NOT act as implicit authorities that alter checklist semantics or outcomes.

Rules (authoritative)
- Personas may only emit audit events (annotations, persona events) and record vetoes for observability; vetoes MUST be treated as advisory (DIAGNOSTIC) and MUST NOT cause REFUSAL or BLOCKER behavior.
- Personas MUST NOT be permitted to directly mutate semantic fields that affect checklist primary outcome or refusal authority, including but not limited to: `id`, `ptr`, `generated`, `artifact`, `severity`, `refusal`, `outcome`.
- Persona constraints that attempt disallowed mutations MUST be recorded in `item.meta.persona_constraints_disallowed` and a `G15_PERSONA_CONSTRAINT_DISALLOWED` DIAGNOSTIC event MUST be emitted for visibility.
- Persona routing and evaluation order MUST be deterministic and recorded only as metadata or persona events; they MUST NOT influence primary checklist derivation.

Rationale
- Personas provide useful domain-specific advice and annotations but must not supplant governance or authority ceilings. Ensuring personas are advisory preserves auditability, reduces accidental refusal, and prevents stealth authority escalation.

Enforcement
- Implementation-level enforcement is via deterministic tests and lightweight runtime guards (recording disallowed attempts and converting persona vetoes that previously raised exceptions into advisory DIAGNOSTIC events).
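
A lightweight runtime guard of the kind described might be sketched as follows. The function name, its signature, and the shape of the recorded entries are assumptions; the disallowed field list, the `persona_constraints_disallowed` key, and the `G15_PERSONA_CONSTRAINT_DISALLOWED` code are taken from the rules above.

```python
from typing import Any, Dict, List

DISALLOWED_FIELDS = frozenset(
    {"id", "ptr", "generated", "artifact", "severity", "refusal", "outcome"}
)


def apply_persona_constraint(
    item: Dict[str, Any],
    persona: str,
    changes: Dict[str, Any],
    events: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Apply advisory changes; record (never apply) disallowed mutations."""
    meta = dict(item.get("meta", {}))
    disallowed = list(meta.get("persona_constraints_disallowed", []))
    updated = dict(item)
    for field in sorted(changes):  # sorted for deterministic event order
        if field in DISALLOWED_FIELDS:
            disallowed.append({"persona": persona, "field": field})
            events.append(
                {
                    "code": "G15_PERSONA_CONSTRAINT_DISALLOWED",
                    "level": "DIAGNOSTIC",
                    "persona": persona,
                    "field": field,
                }
            )
        elif field != "meta":  # assumption: meta stays under compiler control
            updated[field] = changes[field]
    if disallowed:
        meta["persona_constraints_disallowed"] = disallowed
    updated["meta"] = meta
    return updated
```

The guard never raises, so a persona attempting a forbidden mutation produces an audit trail and a DIAGNOSTIC event while the semantic field keeps its original value.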
This contract ensures templates are pluggable, versioned, and non-authoritative. Templates are presentation artifacts and must not be used to infer or assert authority over checklist outcomes.

Principles
- Templates provide rendering and placeholder defaults only; they must never inject BLOCKER or REFUSAL outcomes or otherwise alter authoritative decisioning.
- Template metadata (name/version) is recorded for provenance only and must not change primary outcomes or authority ceilings.
- Missing templates must fall back deterministically and must not escalate authority.

Enforcement
- Tests assert checklist invariance across template versions and that templates cannot generate authority outcomes.
- Any evidence of template-induced authority must fail deterministic tests and be addressed immediately.
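
An invariance test along these lines needs a renderer that only adds presentation data. The sketch below assumes a registry keyed by `(name, version)`, a built-in fallback template, and a `rendered` output field; none of these names come from the contract.

```python
import copy
from typing import Any, Dict, Tuple

# Hypothetical built-in fallback used when a template is missing.
FALLBACK = {"name": "builtin", "version": "0", "title": "Checklist"}


def render(
    checklist: Dict[str, Any],
    templates: Dict[Tuple[str, str], Dict[str, str]],
    name: str,
    version: str,
) -> Dict[str, Any]:
    """Render a copy of the checklist; outcome fields are never touched."""
    template = templates.get((name, version), FALLBACK)
    out = copy.deepcopy(checklist)
    # Provenance only: which template produced the rendering.
    out.setdefault("meta", {})["template"] = {
        "name": template["name"],
        "version": template["version"],
    }
    item_count = len(out.get("items", []))
    out["rendered"] = f"{template['title']}: {item_count} item(s)"
    return out
```

A test can then render the same checklist with several template versions, including a missing one, and assert that `primary_outcome` is identical in every result.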
- Details: imported module `test_execution_plan` has a `__file__` attribute pointing to `/tests/checklist/test_execution_plan.py` while the test file being collected is `/tests/plan/test_execution_plan.py`.
- Root cause: duplicate test basenames across different directories, causing pytest module-name collisions on import (aggravated by stale `.pyc` files).
- Classification: name-collision (stale `__pycache__` can exacerbate this)

- Details: imported module `test_completeness` points to `/tests/ast/test_completeness.py` while the collected file is `/tests/requirements/test_completeness.py`.
- Root cause: duplicate basename `test_completeness.py` in separate test packages.

- Details: imported module `test_sufficiency` points to `/tests/requirements/test_sufficiency.py` while the collected file is `/tests/sufficiency/test_sufficiency.py`.

- These are all name-collision issues (multiple test files with identical module basenames). In its default import mode, pytest derives module names from file basenames for test directories that lack `__init__.py`, so identically named test files in different subdirectories collide; stale `.pyc`/`__pycache__` files exacerbate the issue.
- No production code changes are required; fixes are test-only (renames + pytest config + guard tests + cleanup helpers).
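
A guard test for this class of failure can simply scan the test tree for repeated basenames. The sketch below is illustrative rather than the guard added in this PR; the function name and the `test_*.py` pattern are assumptions.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def find_duplicate_test_basenames(root: str) -> Dict[str, List[str]]:
    """Map each repeated test file basename to the paths that share it."""
    by_name: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(Path(root).rglob("test_*.py")):
        if "__pycache__" in path.parts:
            continue  # ignore anything stored under bytecode caches
        by_name[path.name].append(path.relative_to(root).as_posix())
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

A CI guard would assert that the returned mapping is empty for the `tests/` directory, and print the mapping on failure so the colliding files are easy to rename.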
# Test Collection Stability Contract (AUTHORITATIVE)
Decision: AUTHORITATIVE — Phase 16: Test & CI Stability Locked

Summary
- Pytest collection must be deterministic and robust to local bytecode artifacts and duplicate basenames. CI must run the full test suite with zero collection/import errors.

Rules
- Duplicate test basenames across directories are disallowed. CI will fail the run if duplicates are detected.
- Tests must not rely on relative imports; absolute imports via `shieldcraft.*` are preferred.
- Bytecode cache pollution (`__pycache__`, `.pyc`) must be removed or ignored during CI runs. CI will include a cleanup step (`scripts/ci/clean_pycache.py`) to enforce this.
- Pytest configuration is authoritative and stored in `pytest.ini` with explicit `testpaths` and `norecursedirs` to prevent spurious discovery.
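
The contents of `scripts/ci/clean_pycache.py` are not shown in this diff. As a rough idea of what such a cleanup step could do, here is a hypothetical helper (the name `clean_bytecode` and its return value are assumptions) that removes cache directories and stray bytecode files below a root directory.

```python
import shutil
from pathlib import Path


def clean_bytecode(root: str) -> int:
    """Delete __pycache__ directories and stray .pyc files; return the count."""
    removed = 0
    # sorted() materializes the listing first, so deleting while looping is safe.
    for cache_dir in sorted(Path(root).rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed += 1
    for pyc in sorted(Path(root).rglob("*.pyc")):
        pyc.unlink()
        removed += 1
    return removed
```

Running a step like this before collection means stale bytecode from a renamed or moved test file cannot shadow the current source tree.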