refactor(reasoning): enforce ClaimCoverage (Welt 1) vs ComplianceStatus (Welt 2) boundary [F1]

Architecture-validation finding: the implementation mode produced
compliance-flavored output ("teilweise erfüllt" [partially fulfilled],
"covered") from a mere customer claim, blurring the boundary with the Execution
layer. This is a design decision, not a text fix: the reasoning layer judges
only the customer's STATEMENT, never conformity.

- CoverageStatus -> ClaimCoverage; values are claim-relative + carry "potential":
  potentially_addresses / partially_addresses / does_not_address /
  insufficient_information.
- ImplementationAssessment -> ClaimObligationMapping (coverage_status ->
  claim_coverage); ImplementationResponse -> ImplementationReasoningResponse
  (assessments -> mappings, + explicit `disclaimer`); request renamed; engine
  entry assess_implementation -> reason_implementation_claim.
- Endpoint /reasoning/implementation-assessment -> /reasoning/implementation-reasoning.
- Summary/explanations reworded (German output, translated here): "probably
  addresses N obligations … evidence is required to assess the actual
  implementation (no statement of conformity)". No "erfüllt"/"abgedeckt"
  ("fulfilled"/"covered") leaks.
- New guard test asserts no compliance verdict leaks (no "erfüllt"; disclaimer
  separates ClaimCoverage from ComplianceStatus). 23 tests green, mypy clean.
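
A guard test of the kind described in the last bullet could be sketched as
follows. This is a minimal illustration under stated assumptions, not the test
from this commit: the field names `summary`, `obligation_id`, and
`explanation`, the helper `find_verdict_leaks`, the `FORBIDDEN_TERMS` tuple,
and all sample texts are hypothetical. Only the type names, `claim_coverage`,
`mappings`, `disclaimer`, and the enum values are taken from the commit.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClaimCoverage(str, Enum):
    """Claim-relative values as listed in the commit message."""

    POTENTIALLY_ADDRESSES = "potentially_addresses"
    PARTIALLY_ADDRESSES = "partially_addresses"
    DOES_NOT_ADDRESS = "does_not_address"
    INSUFFICIENT_INFORMATION = "insufficient_information"


@dataclass
class ClaimObligationMapping:
    obligation_id: str  # hypothetical field name
    claim_coverage: ClaimCoverage
    explanation: str  # hypothetical field name


@dataclass
class ImplementationReasoningResponse:
    summary: str  # hypothetical field name
    mappings: list[ClaimObligationMapping] = field(default_factory=list)
    disclaimer: str = ""


# Hypothetical list of words that would read as a conformity verdict.
FORBIDDEN_TERMS = ("erfüllt", "abgedeckt", "covered")


def find_verdict_leaks(response: ImplementationReasoningResponse) -> list[str]:
    """Return every forbidden term found in the summary or any explanation."""
    texts = [response.summary, *(m.explanation for m in response.mappings)]
    lowered = [text.lower() for text in texts]
    return [term for term in FORBIDDEN_TERMS if any(term in t for t in lowered)]


def test_no_compliance_verdict_leaks() -> None:
    response = ImplementationReasoningResponse(
        summary=(
            "Probably addresses 1 obligation; evidence is required to "
            "assess the actual implementation."
        ),
        mappings=[
            ClaimObligationMapping(
                obligation_id="OBL-1",
                claim_coverage=ClaimCoverage.POTENTIALLY_ADDRESSES,
                explanation="The claim mentions access logging.",
            )
        ],
        disclaimer="ClaimCoverage is not a ComplianceStatus.",
    )
    # No verdict wording may appear in reasoning output.
    assert find_verdict_leaks(response) == []
    # The disclaimer must name both concepts to keep them apart.
    assert "ClaimCoverage" in response.disclaimer
    assert "ComplianceStatus" in response.disclaimer
```

Checking free-text fields against a deny-list is a coarse guard; it catches
wording regressions but does not replace the type-level separation that the
rename itself provides.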

Discovery (scope/obligations) was already structurally claim-free and unaffected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Benjamin Admin
2026-06-26 00:37:57 +02:00
parent 1607c89459
commit 5e5002c883
6 changed files with 135 additions and 84 deletions
@@ -68,12 +68,19 @@ class OverlapType(str, Enum):
     DIFFERENT_SCOPE = "different_scope"
 
 
-class CoverageStatus(str, Enum):
-    COVERED = "covered"
-    PARTIALLY_COVERED = "partially_covered"
-    NOT_COVERED = "not_covered"
-    UNCLEAR = "unclear"
-    OUT_OF_SCOPE = "out_of_scope"
+class ClaimCoverage(str, Enum):
+    """How a customer's *claim* relates to an obligation: Welt 1 (reasoning).
+
+    This is NOT a conformity verdict. It judges only the customer's statement,
+    never whether the obligation is actually met. The real compliance verdict
+    (erfüllt/offen/unklar from verified evidence) is `ComplianceStatus`, owned by
+    the Compliance Execution Graph; the two must never be conflated.
+    """
+
+    POTENTIALLY_ADDRESSES = "potentially_addresses"
+    PARTIALLY_ADDRESSES = "partially_addresses"
+    DOES_NOT_ADDRESS = "does_not_address"
+    INSUFFICIENT_INFORMATION = "insufficient_information"
 
 
 class InterpretationVerdict(str, Enum):