fix(impressum): P9 — 7 False-Positive-Fixes in Pflichtangaben-Checks
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped

#1 Name des Anbieters: \b Word-Boundary verhindert "ag" in "samstag",
   plus "aktiengesellschaft" als Volltreffer.
#2 Vertretungsberechtigte: Klammer-Liste-Pattern erkennt jetzt BMW-
   Format "Vorstand (Milan Nedeljkovic, Jochen Goller, ...)" plus
   "Vorsitzender des Aufsichtsrats: Name".
#3 V.i.S.d.P.: war schon INFO, OK.
#4 OS-Plattform/VSBG: bei no_direct_sales=True (OEM-Pattern) jetzt als
   "Nicht anwendbar" skipped statt 0/1 fail. Profile fliesst neu durch
   check_document_completeness -> runner.
#5 Zustaendige Kammer: IHK + Handwerkskammer + Tieraerztekammer in
   Pattern aufgenommen + severity LOW -> INFO (konditional).
#6 Stammkapital: war schon INFO, OK.
#7 Link-Disclaimer: neue Check-Eigenschaft "invert"=True. Anti-Pattern
   ist passed wenn NICHT gefunden, fail wenn gefunden. Vorher feuerte
   das Finding immer, jetzt nur wenn ein illegaler Disclaimer im Text
   ist.

Plus: L2-INFO-Checks (z.B. profession_chamber) zaehlen nicht mehr in
correctness-pct und erzeugen keine DSI-DETAIL-Findings. Konsistent
mit P8-Modell: INFO = "selbst pruefen", nicht "fail".

Verifiziert mit BMW-Impressum-Text — alle 7 Faelle korrekt klassifiziert:
  name=passed, representative_person=passed, profession_chamber=INFO,
  illegal_disclaimer=passed (kein Disclaimer im Text),
  dispute_resolution=skipped (no_direct_sales),
  editorial_visdp=INFO, share_capital=INFO.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Benjamin Admin
2026-05-19 00:52:03 +02:00
parent 575644c9c5
commit 0d37822b7c
3 changed files with 62 additions and 13 deletions
@@ -317,6 +317,7 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
text, doc_type, label, url,
entry["word_count"], use_agent_flag,
business_scope=business_scope,
business_profile={"no_direct_sales": getattr(profile, "no_direct_sales", False)},
)
# Apply profile context filter
@@ -1001,6 +1002,7 @@ async def _check_single(
text: str, doc_type: str, label: str, url: str,
word_count: int, use_agent: bool,
business_scope: set[str] | None = None,
business_profile: dict | None = None,
):
"""Run regex + MC checks on a single document."""
from compliance.services.doc_checks.runner import check_document_completeness
@@ -1008,7 +1010,8 @@ async def _check_single(
from .agent_doc_check_routes import CheckItem, DocCheckResult
# Regex checklist
findings = check_document_completeness(text, doc_type, label, url)
findings = check_document_completeness(text, doc_type, label, url,
business_profile=business_profile)
all_checks: list[CheckItem] = []
completeness = 0