breakpilot-compliance/backend-compliance/compliance/api/agent_check/_b4_wiring.py
Benjamin Admin d0e3621192 feat(audit): V2 mail render + 5 new findings (B4/B5/B6/B7/B8) + LLM-Plausibility-Phase
Mail Render V2 (compliance/services/mail_render_v2/): an 11-module subpackage
that produces a unified audit mail output with:
  - Header + KPI tiles (Score / Findings / Docs / Vendors)
  - TOC + jump links
  - 3-bucket split: Critical findings / Manual review / Internal reminders
  - Cookie inventory (Name·Vendor·Category·Storage duration·Deletion deadline·Country of establishment·Source·Status)
  - Immediate-actions aggregator ("Add country of establishment for 11 cookies")
  - 24 legacy wrappers: all old build_*_html functions wrapped as V2 sections
  - Scope filter: FIN/GOV/MED/INS/EDU/LEG dropped from reports when not relevant
  - Hint/action dedup: no more duplicate sentences per card
Enabled via env MAIL_RENDER_V2=true (default: legacy renderer).
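
A minimal sketch of such a toggle, assuming the flag is read from the process environment and that only the value `true` (case-insensitive) enables V2. The helper name `mail_render_v2_enabled` is hypothetical; how the real code parses the variable is not shown in this commit.

```python
import os
from typing import Mapping, Optional


def mail_render_v2_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only if MAIL_RENDER_V2 is set to 'true' (assumed parsing)."""
    env = os.environ if environ is None else environ
    # Anything other than "true" falls back to the legacy renderer.
    return env.get("MAIL_RENDER_V2", "").strip().lower() == "true"
```

Passing the environment in as a mapping keeps the helper testable without touching `os.environ`.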

5 new deterministic findings as Phase D-2b/B4/B5/B6/B7/B8:

  B4 vendor_consistency_check: cross-document provider contradiction
     (Elli: the privacy policy (DSE) names Vertex AI for the chatbot,
     /de/cookies names Iadvize → HIGH).
     6 service types: chatbot/analytics/tag_manager/pixel/cdn/cmp.

  B5 ai_act_transparency_check: AI Act Art. 50 transparency obligation
     (Elli: Vertex AI present without pre-chat disclosure → HIGH).
     Plus B5 extension: legal basis Art. 6(1)(f) GDPR for AI → MED
     (recommend consent).

  B6 cross_doc_dpo_check: DPO named in the DSE but not in the Impressum
     (legal notice) (LOW).

  B7 doc_staleness_check: date extraction from the DSE, the AGB (general terms
     and conditions) and the Nutzungsbedingungen (NB, terms of use).
     Cap: AGB/NB 3y, DSE 2y. Older → MEDIUM (Elli NB dated 2018 → HIGH).

  B8 cmp_fingerprint_check: banner detected, but the CMP provider is generic
     (no Usercentrics/OneTrust/Cookiebot/etc. → MED).

  B3 extension detect_intra_doc_contradictions: contradictory storage
     durations within the SAME doc (Elli: logfile 7d vs 30d → HIGH).
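
The B7 age caps can be sketched as follows. This is an illustration only: the function name, the `doc_type` keys and the returned dict are hypothetical, the date is assumed to be already extracted, and only the caps (AGB/NB 3y, DSE 2y) and the MEDIUM default come from the text above. The rule that escalates to HIGH is not stated, so it is left out.

```python
from datetime import date
from typing import Optional

# Caps in years as stated above: AGB/NB 3y, DSE 2y.
STALENESS_CAP_YEARS = {"agb": 3, "nb": 3, "dse": 2}


def staleness_finding(doc_type: str, doc_date: date, today: date) -> Optional[dict]:
    """Return a MEDIUM finding if the document is older than its cap, else None."""
    cap = STALENESS_CAP_YEARS.get(doc_type)
    if cap is None:
        return None  # unknown document type: no cap defined in this sketch
    age_years = (today - doc_date).days / 365.25
    if age_years <= cap:
        return None
    return {
        "doc_type": doc_type,
        "age_years": round(age_years, 1),
        "cap_years": cap,
        "severity": "MEDIUM",  # escalation to HIGH is not specified here
    }
```

Passing `today` explicitly keeps the check deterministic, in line with the "deterministic findings" framing.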

LLM-Plausibility-Phase (Phase D-2b, finding_plausibility_check.py):
  - Runs AFTER the MC pipeline, BEFORE the D3 render
  - Prompt with example IDs + 3-phase mapping: exact-ID / position-fallback /
    fuzzy-tail-match
  - Stamps llm_title / llm_severity / llm_recommendation / llm_drop onto
    every FAIL CheckItem
  - V2 render shows a "🤖 LLM-Plausibility:" box per finding when stamped
  - KNOWN ISSUE: qwen3:30b-a3b often returns empty content for format='json' +
    8000-char-excerpt prompts. The pipeline continues with stamped=0. Task #16.
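
One way the 3-phase mapping could work is sketched below. The actual logic in finding_plausibility_check.py is not shown in this commit, so everything here is an assumption: the function name, the `id` key on LLM entries, trusting positions only when both lists have the same length, and accepting a fuzzy tail match only when it is unique.

```python
from typing import Dict, List


def map_llm_entries(item_ids: List[str], llm_entries: List[dict]) -> Dict[str, dict]:
    """Map LLM response entries back to FAIL item IDs (hypothetical sketch)."""
    mapped: Dict[str, dict] = {}
    same_length = len(llm_entries) == len(item_ids)
    for pos, entry in enumerate(llm_entries):
        raw_id = str(entry.get("id") or "").strip()
        target = None
        if raw_id in item_ids:
            # Phase 1: the model echoed an ID exactly.
            target = raw_id
        elif same_length:
            # Phase 2: fall back to list position when the counts line up.
            target = item_ids[pos]
        elif raw_id:
            # Phase 3: accept a truncated ID if exactly one item ID ends with it.
            tail = raw_id.lower()
            candidates = [i for i in item_ids if i.lower().endswith(tail)]
            if len(candidates) == 1:
                target = candidates[0]
        if target is not None:
            mapped.setdefault(target, entry)  # first match wins
    return mapped
```

Entries that match nothing are simply dropped, which is consistent with the pipeline continuing when nothing is stamped.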

Coverage against the Elli ground truth (zeroclaw/docs/ground-truth/elli_eco_2026-06-06.json,
13 expected findings via WebFetch agent crawl):
  - 4/4 HIGH findings ✓ (COOKIE-CONSENT-UX-001 + WIDERRUFSBELEHRUNG-001 +
    VENDOR-CONSISTENCY-001 + AI-ACT-TRANSPARENCY-001)
  - 4/6 MEDIUM ✓
  - 2/3 LOW ✓
  - Total: 10/13 = 77% (up from 4/13 = 31%)

Remaining 3 gaps as Task #17: IMPRESSUM-001 (multi-entity VAT ID),
TRANSFER-001 (vendor transfer mechanism DPF/SCC), TH-RETENTION-002 (AI retention
per data category).

V2 mail preview in Mailpit: 'v2all@local.test', subject '[V2 ALL] ELLI'.
Backend healthy, B1+B3+B4+B5+B6+B7+B8 all live in the orchestrator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-06 21:19:49 +02:00

79 lines
3.1 KiB
Python
"""B4 wiring: cross-doc vendor-consistency check + HTML block.

Activated after B1+B3 in the orchestrator. The check itself is
deterministic (no LLM); it scans DSE + cookie texts for known
service providers per service type and flags every mismatch.

The mail renderer reads `state["vendor_consistency_findings"]` and
`state["vendor_consistency_html"]` directly, so no further wiring is needed.
"""
from __future__ import annotations

import html
import logging

from compliance.services.vendor_consistency_check import (
    check_vendor_consistency,
)

logger = logging.getLogger(__name__)


def run_b4(state: dict) -> None:
    """Run the B4 check and store findings (and HTML, if any) on `state`."""
    findings = check_vendor_consistency(state)
    state["vendor_consistency_findings"] = findings
    if not findings:
        # No mismatches: leave "vendor_consistency_html" unset.
        return
    state["vendor_consistency_html"] = _render(findings)
    logger.info(
        "B4 Vendor-Consistency: %d findings (HIGH=%d, MEDIUM=%d)",
        len(findings),
        sum(1 for f in findings if (f.get("severity") or "") == "HIGH"),
        sum(1 for f in findings if (f.get("severity") or "") == "MEDIUM"),
    )


def _render(findings: list[dict]) -> str:
    """Render the findings as an inline-styled HTML block for the audit mail."""
    rows = []
    for f in findings:
        sev = (f.get("severity") or "").upper()
        # Red for HIGH, amber for everything else.
        color = "#dc2626" if sev == "HIGH" else "#f59e0b"
        # Provider names originate from scanned page text, so escape them
        # before embedding them in HTML.
        dse = html.escape(", ".join(f.get("dse_providers") or []))
        cookie = html.escape(", ".join(f.get("cookie_providers") or []))
        rows.append(
            "<tr>"
            f"<td style='padding:6px 10px;border-bottom:1px solid #e5e7eb;'>"
            f"{html.escape((f.get('service_type') or '').replace('_', ' ').title())}"
            "</td>"
            f"<td style='padding:6px 10px;border-bottom:1px solid #e5e7eb;'>"
            f"{dse or '<em></em>'}</td>"
            f"<td style='padding:6px 10px;border-bottom:1px solid #e5e7eb;'>"
            f"{cookie or '<em></em>'}</td>"
            f"<td style='padding:6px 10px;border-bottom:1px solid #e5e7eb;"
            f"color:{color};font-weight:600;'>"
            f"{html.escape(sev)} {html.escape(f.get('severity_reason') or '')}</td>"
            "</tr>"
        )
    return (
        "<div style='margin:24px 0;padding:16px;border-left:4px solid #dc2626;"
        "background:#fff1f2;border-radius:4px;'>"
        "<h2 style='margin:0 0 8px;color:#991b1b;font-size:16px;'>"
        "VENDOR-CONSISTENCY-001 — Vendor-Konsistenz DSE ↔ Cookies</h2>"
        "<p style='margin:0 0 8px;font-size:14px;color:#3f3f46;'>"
        f"<strong>{len(findings)}</strong> Provider-Widersprüche zwischen "
        "Datenschutzerklärung und Cookie-Seite. Beispiel Elli: "
        "DSE = Vertex AI für Chatbot, Cookies-Seite = Iadvize.</p>"
        "<table style='width:100%;border-collapse:collapse;font-size:13px;"
        "margin-top:8px;background:#fff;'>"
        "<thead><tr style='background:#f1f5f9;'>"
        "<th style='text-align:left;padding:6px 10px;'>Service-Typ</th>"
        "<th style='text-align:left;padding:6px 10px;'>In DSE</th>"
        "<th style='text-align:left;padding:6px 10px;'>Auf Cookies-Seite</th>"
        "<th style='text-align:left;padding:6px 10px;'>Severity</th>"
        "</tr></thead>"
        f"<tbody>{''.join(rows)}</tbody>"
        "</table>"
        "</div>"
    )
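
The check itself lives in `compliance.services.vendor_consistency_check` and is not part of this file. As a rough illustration of what a deterministic cross-document comparison might look like, here is a minimal sketch. The provider catalogue, the function names and the rule "flag only when both documents name a provider and the sets differ" are assumptions; only the finding keys (`service_type`, `dse_providers`, `cookie_providers`, `severity`, `severity_reason`) mirror what `_render` reads.

```python
from typing import Dict, List

# Hypothetical catalogue; the real provider list is not shown in this file.
KNOWN_PROVIDERS: Dict[str, List[str]] = {
    "chatbot": ["Vertex AI", "Iadvize"],
    "cmp": ["Usercentrics", "OneTrust", "Cookiebot"],
}


def find_providers(text: str, names: List[str]) -> List[str]:
    """Return the known provider names mentioned in `text` (case-insensitive)."""
    lowered = text.lower()
    return sorted(n for n in names if n.lower() in lowered)


def check_vendor_consistency_sketch(dse_text: str, cookie_text: str) -> List[dict]:
    """Flag service types where the two documents name different providers."""
    findings = []
    for service_type, names in KNOWN_PROVIDERS.items():
        dse = find_providers(dse_text, names)
        cookie = find_providers(cookie_text, names)
        # Assumption: a one-sided mention is not treated as a contradiction.
        if dse and cookie and set(dse) != set(cookie):
            findings.append({
                "service_type": service_type,
                "dse_providers": dse,
                "cookie_providers": cookie,
                "severity": "HIGH",
                "severity_reason": "documents name different providers",
            })
    return findings


example = check_vendor_consistency_sketch(
    "Our chatbot is powered by Vertex AI.",
    "Chat widget provided by iAdvize. Consent is managed via Usercentrics.",
)
```

With the Elli-style input above, the only finding is for `chatbot`, since the CMP provider appears in just one of the two texts.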