refactor(reasoning): enforce ClaimCoverage (Welt 1) vs ComplianceStatus (Welt 2) boundary [F1]

Architecture-validation finding: the implementation mode produced compliance-
flavored output ("teilweise erfüllt", i.e. "partially fulfilled", and "covered")
from a mere customer claim, blurring the boundary with the Execution layer. This
is a design decision, not a text fix: the reasoning layer judges only the
customer's STATEMENT, never conformity.

- CoverageStatus -> ClaimCoverage; values are claim-relative + carry "potential":
  potentially_addresses / partially_addresses / does_not_address /
  insufficient_information.
- ImplementationAssessment -> ClaimObligationMapping (coverage_status ->
  claim_coverage); ImplementationResponse -> ImplementationReasoningResponse
  (assessments -> mappings, + explicit `disclaimer`); request renamed; engine
  entry assess_implementation -> reason_implementation_claim.
- Endpoint /reasoning/implementation-assessment -> /reasoning/implementation-reasoning.
- Summary/explanations reworded: "adressiert wahrscheinlich N Pflichten … für
  eine Bewertung der tatsächlichen Umsetzung sind Nachweise erforderlich (keine
  Konformitätsaussage)", i.e. "probably addresses N obligations … evidence is
  required to assess the actual implementation (not a conformity statement)".
  The words "erfüllt" (fulfilled) and "abgedeckt" (covered) no longer leak.
- New guard test asserts that no compliance verdict leaks (no "erfüllt"; the
  disclaimer separates ClaimCoverage from ComplianceStatus). 23 tests green,
  mypy clean.

Discovery (scope/obligations) was already structurally claim-free and unaffected.
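
The guard test mentioned above can be pictured roughly as follows. This is a minimal sketch, not the project's actual test: `FORBIDDEN_VERDICT_WORDS` and `find_verdict_leaks` are hypothetical names, and the two sample strings are modeled on the old and new summary wording in this commit.

```python
from typing import Iterable, List

# Hypothetical list: verdict words that belong to ComplianceStatus (Welt 2)
# and must not appear in reasoning-layer output (Welt 1).
FORBIDDEN_VERDICT_WORDS = ("erfüllt", "abgedeckt")


def find_verdict_leaks(texts: Iterable[str]) -> List[str]:
    """Return every forbidden verdict word found in the given output texts."""
    leaks: List[str] = []
    for text in texts:
        lowered = text.lower()
        leaks.extend(word for word in FORBIDDEN_VERDICT_WORDS if word in lowered)
    return leaks


# Old-style summary: reads like a conformity verdict.
old_summary = "Teilweise erfüllt: 0 abgedeckt, 1 teilweise, 1 offen."

# New-style summary: talks about the claim and points to missing evidence.
new_summary = (
    "Die beschriebene Maßnahme adressiert wahrscheinlich 0 Pflicht(en) vollständig und 1 "
    "teilweise; 1 werden nicht berührt. Für eine Bewertung der tatsächlichen Umsetzung sind "
    "Nachweise erforderlich (keine Konformitätsaussage)."
)

if __name__ == "__main__":
    print(find_verdict_leaks([old_summary]))
    print(find_verdict_leaks([new_summary]))
```

A real guard test would presumably run the same check over the `summary` and every `explanation` of an actual response, and additionally assert that the `disclaimer` field is present.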

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Benjamin Admin
2026-06-26 00:37:57 +02:00
parent 1607c89459
commit 5e5002c883
6 changed files with 135 additions and 84 deletions
@@ -1,9 +1,15 @@
-"""Implementation reasoning engine (spec Modus 3).
+"""Implementation reasoning (spec Modus 3) — Welt 1 only.
-Given a free-text claim ("Wir haben SBOMs und machen Updates, wenn Kunden Fehler
-melden.") it maps the claimed capabilities onto the product's applicable
-obligations and reports, per obligation, whether it is covered, partially
-covered or not covered — plus the evidence that would close the gap.
+Maps a free-text claim ("Wir haben SBOMs und machen Updates, wenn Kunden Fehler
+melden.") onto the product's applicable obligations and reports, per obligation,
+whether the *claim* potentially/partially/does-not address it — plus the
+evidence that WOULD be needed to prove real implementation.
+This is NOT a conformity verdict. It judges the customer's statement, never
+whether the obligation is met. The real verdict (ComplianceStatus: erfüllt/
+offen/unklar from verified evidence) lives in the Compliance Execution Graph.
+The four reasoning layers: claim -> interpretation (capabilities/topics on the
+claim) -> potential obligation coverage (`claim_coverage`) -> evidence required.
 """
 from __future__ import annotations
@@ -11,16 +17,22 @@ from __future__ import annotations
 from typing import Dict, List
 from .claim_normalizer import normalize_claim
-from .enums import Confidence, CoverageStatus
+from .enums import ClaimCoverage, Confidence
 from .obligation_engine import derive_obligations
 from .schemas import (
+    ClaimObligationMapping,
     CustomerImplementationClaim,
-    ImplementationAssessment,
-    ImplementationResponse,
+    ImplementationReasoningResponse,
     ProductProfile,
 )
 from .taxonomy_claims import topics_for
+DISCLAIMER = (
+    "Diese Auswertung interpretiert ausschließlich die Kundenaussage (ClaimCoverage, Welt 1). "
+    "Sie ist KEINE Konformitätsaussage — der tatsächliche Compliance-Status (ComplianceStatus, "
+    "Welt 2) ergibt sich erst aus geprüften Nachweisen im Compliance Execution Graph."
+)
 # Typical sub-elements a capability still misses when only partially claimed.
 STANDARD_GAPS: Dict[str, List[str]] = {
     "software_bill_of_materials": [
@@ -57,27 +69,31 @@ def _missing_for(capabilities: List[str]) -> List[str]:
     return out
-def _coverage(required: List[str], claimed: List[str], qualifiers: List[str]) -> CoverageStatus:
+def _coverage(required: List[str], claimed: List[str], qualifiers: List[str]) -> ClaimCoverage:
+    if not required:
+        return ClaimCoverage.INSUFFICIENT_INFORMATION
     req, have = set(required), set(claimed)
     hit = req & have
     if not hit:
-        return CoverageStatus.NOT_COVERED
+        return ClaimCoverage.DOES_NOT_ADDRESS
     if "absent" in qualifiers or "planned" in qualifiers:
-        return CoverageStatus.NOT_COVERED
+        return ClaimCoverage.DOES_NOT_ADDRESS
     if "reactive" in qualifiers and hit & {"secure_updates", "vulnerability_management"}:
-        return CoverageStatus.PARTIALLY_COVERED
+        return ClaimCoverage.PARTIALLY_ADDRESSES
     if req <= have:
-        return CoverageStatus.COVERED
-    return CoverageStatus.PARTIALLY_COVERED
+        return ClaimCoverage.POTENTIALLY_ADDRESSES
+    return ClaimCoverage.PARTIALLY_ADDRESSES
-def assess_implementation(profile: ProductProfile, customer_claim: str) -> ImplementationResponse:
+def reason_implementation_claim(
+    profile: ProductProfile, customer_claim: str
+) -> ImplementationReasoningResponse:
     claim = normalize_claim(customer_claim)
     obligations = derive_obligations(profile).applicable_obligations
     claimed = claim.claimed_capability
     claim_topics = set(claim.related_topics) | set(claimed)
-    assessments: List[ImplementationAssessment] = []
+    mappings: List[ClaimObligationMapping] = []
     missing_evidence: List[str] = []
     for ob in obligations:
@@ -89,54 +105,54 @@ def assess_implementation(profile: ProductProfile, customer_claim: str) -> Imple
         directly_claimed = bool(set(required_caps) & set(claimed))
         related = bool(ob_topics & claim_topics)
         if not directly_claimed and not related:
-            continue  # unrelated to the claim -> don't assess
+            continue  # unrelated to the claim -> don't reason about it
-        status = _coverage(required_caps, claimed, claim.qualifiers)
-        missing = [] if status == CoverageStatus.COVERED else _missing_for(required_caps)
-        explanation = _explain(status, ob.title, claim.qualifiers)
-        if status != CoverageStatus.COVERED:
+        coverage = _coverage(required_caps, claimed, claim.qualifiers)
+        missing = [] if coverage == ClaimCoverage.POTENTIALLY_ADDRESSES else _missing_for(required_caps)
+        if coverage != ClaimCoverage.POTENTIALLY_ADDRESSES:
             for ev in ob.required_evidence:
                 if ev not in missing_evidence:
                     missing_evidence.append(ev)
-        assessments.append(
-            ImplementationAssessment(
+        mappings.append(
+            ClaimObligationMapping(
                 claim_id=claim.claim_id,
                 obligation_id=ob.obligation_id,
-                coverage_status=status,
+                claim_coverage=coverage,
                 missing_elements=missing,
                 required_evidence=ob.required_evidence,
-                explanation=explanation,
+                explanation=_explain(coverage, ob.title, claim.qualifiers),
                 confidence=Confidence.MEDIUM,
             )
         )
-    return ImplementationResponse(
+    return ImplementationReasoningResponse(
         claim=claim,
-        assessments=assessments,
+        mappings=mappings,
         missing_evidence=missing_evidence,
-        summary=_summary(claim, assessments),
+        summary=_summary(claim, mappings),
+        disclaimer=DISCLAIMER,
     )
-def _explain(status: CoverageStatus, title: str, qualifiers: List[str]) -> str:
-    if status == CoverageStatus.COVERED:
-        return "Die Pflicht '%s' wird durch die beschriebene Umsetzung plausibel abgedeckt." % title
-    if status == CoverageStatus.PARTIALLY_COVERED:
-        extra = " Der Prozess wirkt reaktiv." if "reactive" in qualifiers else ""
-        return "Die Pflicht '%s' ist nur teilweise abgedeckt.%s" % (title, extra)
-    return "Die Pflicht '%s' wird durch die Aussage nicht abgedeckt." % title
+def _explain(coverage: ClaimCoverage, title: str, qualifiers: List[str]) -> str:
+    if coverage == ClaimCoverage.POTENTIALLY_ADDRESSES:
+        return "Die Aussage adressiert die Pflicht '%s' wahrscheinlich vollständig — Nachweise erforderlich." % title
+    if coverage == ClaimCoverage.PARTIALLY_ADDRESSES:
+        extra = " Der beschriebene Prozess wirkt reaktiv." if "reactive" in qualifiers else ""
+        return "Die Aussage adressiert die Pflicht '%s' nur teilweise.%s" % (title, extra)
+    if coverage == ClaimCoverage.DOES_NOT_ADDRESS:
+        return "Die Aussage adressiert die Pflicht '%s' nicht." % title
+    return "Zur Pflicht '%s' liegen zu wenige Angaben für eine Einordnung vor." % title
-def _summary(claim: CustomerImplementationClaim, assessments: List[ImplementationAssessment]) -> str:
+def _summary(claim: CustomerImplementationClaim, mappings: List[ClaimObligationMapping]) -> str:
     if not claim.claimed_capability:
         return "Die Aussage ist zu unspezifisch — bitte konkretisieren, was umgesetzt wurde."
-    covered = sum(1 for a in assessments if a.coverage_status == CoverageStatus.COVERED)
-    partial = sum(1 for a in assessments if a.coverage_status == CoverageStatus.PARTIALLY_COVERED)
-    notc = sum(1 for a in assessments if a.coverage_status == CoverageStatus.NOT_COVERED)
-    if notc or partial:
-        head = "Teilweise erfüllt"
-    elif covered:
-        head = "Plausibel abgedeckt"
-    else:
-        head = "Nicht beurteilbar"
-    return "%s: %d abgedeckt, %d teilweise, %d offen." % (head, covered, partial, notc)
+    full = sum(1 for m in mappings if m.claim_coverage == ClaimCoverage.POTENTIALLY_ADDRESSES)
+    partial = sum(1 for m in mappings if m.claim_coverage == ClaimCoverage.PARTIALLY_ADDRESSES)
+    none = sum(1 for m in mappings if m.claim_coverage == ClaimCoverage.DOES_NOT_ADDRESS)
+    return (
+        "Die beschriebene Maßnahme adressiert wahrscheinlich %d Pflicht(en) vollständig und %d "
+        "teilweise; %d werden nicht berührt. Für eine Bewertung der tatsächlichen Umsetzung sind "
+        "Nachweise erforderlich (keine Konformitätsaussage)." % (full, partial, none)
+    )
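
The order of checks in the new `_coverage` matters: a qualifier such as "planned" or "absent" overrides an otherwise complete capability match, and "reactive" downgrades update and vulnerability handling to a partial match. The standalone sketch below mirrors that order so it can be run without the package. The local `ClaimCoverage` enum is a stand-in (its `str` base and string values are assumed from the commit message), and `REACTIVE_SENSITIVE` and the sample capability lists are illustrative names, not part of the change.

```python
from enum import Enum
from typing import List


class ClaimCoverage(str, Enum):
    """Local stand-in for the package enum; the string values are assumed."""

    POTENTIALLY_ADDRESSES = "potentially_addresses"
    PARTIALLY_ADDRESSES = "partially_addresses"
    DOES_NOT_ADDRESS = "does_not_address"
    INSUFFICIENT_INFORMATION = "insufficient_information"


# Capabilities for which a purely reactive process counts only as a partial match.
REACTIVE_SENSITIVE = {"secure_updates", "vulnerability_management"}


def coverage(required: List[str], claimed: List[str], qualifiers: List[str]) -> ClaimCoverage:
    """Classify how far a claim addresses an obligation (never whether it is met)."""
    if not required:
        # Nothing to compare against, so no classification is possible.
        return ClaimCoverage.INSUFFICIENT_INFORMATION
    req, have = set(required), set(claimed)
    hit = req & have
    if not hit:
        return ClaimCoverage.DOES_NOT_ADDRESS
    if "absent" in qualifiers or "planned" in qualifiers:
        # A capability that is only planned or explicitly absent addresses nothing yet.
        return ClaimCoverage.DOES_NOT_ADDRESS
    if "reactive" in qualifiers and hit & REACTIVE_SENSITIVE:
        return ClaimCoverage.PARTIALLY_ADDRESSES
    if req <= have:
        return ClaimCoverage.POTENTIALLY_ADDRESSES
    return ClaimCoverage.PARTIALLY_ADDRESSES


if __name__ == "__main__":
    sbom = ["software_bill_of_materials"]
    updates = ["secure_updates"]
    for required, claimed, qualifiers in [
        (sbom, sbom, []),
        (updates, updates, ["reactive"]),
        (updates, updates, ["planned"]),
        ([], updates, []),
    ]:
        print(required, qualifiers, "->", coverage(required, claimed, qualifiers).value)
```

Even the strongest outcome is only `potentially_addresses`: the function compares what was claimed with what is required and never inspects evidence, which is why the response carries a separate disclaimer.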