feat(completeness): Regulatory Completeness Engine — auditable coverage, not confidence

Phase A½. The move from feature to product development: for every assessment, answer "how sure are we that this answer is COMPLETE?", which is a different question from confidence. The product never claims full coverage; it makes its own knowledge state transparent and auditable, showing what we do NOT know and why.

- compliance/completeness/: assess_completeness(identified, corpus_status, uncertain, assumptions, assessed_obligations) -> CompletenessReport. Separates IDENTIFIED from ASSESSED (validated corpus AND determined applicability) and justifies every gap. Two kinds of open: corpus gap (future_corpus) and applicability uncertainty (query_required + deciding question, e.g. Data Act / generates_usage_data).
- The metric is COUNTS, never a single percentage: "Identifiziert N · bewertet M · offen K · Unsicherheiten U · Begründung ja" (identified N, assessed M, open K, uncertainties U, justification yes) + an honest audit statement.
- ADR-007: auditable honesty; phase order A factory -> A½ Completeness -> B new domains; the transparency selling point. Deterministic, no LLM; corpus status + obligation count injected.
- reference suite: "Regulatory Completeness" section runs an industrial-dishwasher assessment (assessed CRA/MaschinenVO; open EMV/Environmental=future_corpus, Data Act=query_required) and notes that Environmental flips open->validated automatically once the corpus lands.

11 completeness tests (54 with adjacent modules), mypy --strict clean (15 files), check-loc 0. Product code with no app caller + ADR/reference = non-runtime -> no deploy (ADR-001). Freeze-safe.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-27 14:16:12 +02:00
parent 0b0d262462
commit aa99111a87
8 changed files with 389 additions and 3 deletions
@@ -0,0 +1,62 @@
+"""Schemas for the Regulatory Completeness Engine — auditable knowledge-coverage, not confidence.
+
+For an assessment it answers "how sure are we that this answer is COMPLETE?" by separating IDENTIFIED
+regulations from ASSESSED ones (validated corpus AND determined applicability) and listing every open
+or excluded domain WITH a reason. The metric is counts, never a single "87%". This is an internal
+quality machine: the product never claims full coverage; it makes its own knowledge state transparent.
+Deterministic, computed-not-stored, no new meta-model class (freeze v1.0). Python 3.9 compatible.
+"""
+
+from __future__ import annotations
+
+from enum import Enum
+from typing import List
+
+from pydantic import BaseModel, Field
+
+
+class CorpusStatus(str, Enum):
+    """The maturity of our knowledge corpus for a regulation/domain."""
+
+    VALIDATED = "validated"      # we can fully assess this
+    DRAFT = "draft"              # partial / under review
+    UNSUPPORTED = "unsupported"  # triggered but no corpus yet
+    UNKNOWN = "unknown"          # not in our registry at all
+
+
+class DomainCoverage(BaseModel):
+    regulation: str
+    status: CorpusStatus = CorpusStatus.UNKNOWN
+    note: str = ""
+
+
+class Exclusion(BaseModel):
+    """A domain/regulation DELIBERATELY not assessed — always with a reason (the heart of the engine)."""
+
+    subject: str
+    reason: str
+    deciding_question: str = ""                 # what would resolve it (if a query)
+    resolution: str = "future_corpus"           # query_required | future_corpus | not_applicable
+
+
+class Assumption(BaseModel):
+    key: str
+    value: str = ""
+    note: str = ""
+
+
+class CompletenessReport(BaseModel):
+    """The auditable coverage report for one assessment — counts + justification, NO single percentage."""
+
+    identified_regulations: List[str] = Field(default_factory=list)
+    assessed_regulations: List[str] = Field(default_factory=list)      # validated corpus + applicability determined
+    open_regulations: List[str] = Field(default_factory=list)          # identified but not assessed
+    open_corpora: List[str] = Field(default_factory=list)              # missing domains worth building
+    coverage: List[DomainCoverage] = Field(default_factory=list)
+    assumptions: List[Assumption] = Field(default_factory=list)
+    exclusions: List[Exclusion] = Field(default_factory=list)
+    uncertainties_count: int = 0
+    assessed_obligations: int = 0                                      # injected (Execution-owned)
+    justification_present: bool = False
+    completeness_summary: str = ""                                    # "Identifiziert N · bewertet M · offen K · ..."
+    audit_statement: str = ""                                         # the honest narrative sentence
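
The body of assess_completeness is not part of this hunk, so the following is only a minimal sketch of how the split into assessed and open regulations and the counts line could be derived. The names sketch_completeness, OpenItem and Report are hypothetical stand-ins (plain dataclasses instead of the pydantic models, so the sketch runs without dependencies), the classification rule and the "nein" branch are assumptions, and the corpus statuses for EMV and Environmental are illustrative values for the dishwasher example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OpenItem:
    """Stand-in for Exclusion: an open regulation together with its reason."""
    subject: str
    reason: str
    resolution: str  # "query_required" or "future_corpus"
    deciding_question: str = ""


@dataclass
class Report:
    """Stand-in for CompletenessReport, reduced to the fields used here."""
    identified: List[str]
    assessed: List[str] = field(default_factory=list)
    open_items: List[OpenItem] = field(default_factory=list)
    summary: str = ""


def sketch_completeness(
    identified: List[str],
    corpus_status: Dict[str, str],
    uncertain: Dict[str, str],
) -> Report:
    """Classify each identified regulation as assessed or open (assumed rule).

    `uncertain` maps a regulation to the question that would decide applicability.
    """
    report = Report(identified=list(identified))
    for reg in identified:
        if reg in uncertain:
            # Applicability uncertainty: open until the deciding question is answered.
            report.open_items.append(
                OpenItem(reg, "applicability not determined", "query_required", uncertain[reg])
            )
        elif corpus_status.get(reg, "unknown") != "validated":
            # Corpus gap: open until a validated corpus exists.
            report.open_items.append(OpenItem(reg, "no validated corpus", "future_corpus"))
        else:
            report.assessed.append(reg)

    queries = sum(1 for item in report.open_items if item.resolution == "query_required")
    justified = all(item.reason for item in report.open_items)
    # Counts only, never a percentage. "nein" is an assumed wording for the negative case.
    report.summary = (
        f"Identifiziert {len(report.identified)} · bewertet {len(report.assessed)} · "
        f"offen {len(report.open_items)} · Unsicherheiten {queries} · "
        f"Begründung {'ja' if justified else 'nein'}"
    )
    return report


report = sketch_completeness(
    identified=["CRA", "MaschinenVO", "EMV", "Environmental", "Data Act"],
    corpus_status={
        "CRA": "validated",
        "MaschinenVO": "validated",
        "EMV": "unsupported",
        "Environmental": "unsupported",
    },
    uncertain={"Data Act": "generates_usage_data"},
)
print(report.summary)
# Identifiziert 5 · bewertet 2 · offen 3 · Unsicherheiten 1 · Begründung ja
```

Because the report is computed rather than stored, re-running the same call with "Environmental" set to "validated" in corpus_status moves it from open to assessed without any other change, which is the flip the reference suite describes.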