feat(consent+report): P56-P67 Mercedes-Audit-Cycle (Anti-Audit, Phase G Vendors, Cookie-Behavior-Validator + 5 Mail-Polish-Items) [migration-approved]
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / nodejs-build (push) Successful in 2m19s
CI / test-go (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 37s

P56  Anti-Auditing-Detection als constructive Compliance-Finding (Audit-API-
     Empfehlung statt Anklage, weil Mercedes berechtigt Bots blockiert)
P57  Phase G vendor_details Union mit cmp_vendors -> 42 Anbieter sichtbar
P58  Anti-Audit-Detection robuster (Script-Domain-Check + Settings-spezifisch)
P59  Cookie-Behavior-Validator (4 Layer, 3-Tier-Severity: MEDIUM=Kategorie-
     Mismatch / HIGH=Zweck-Mismatch / CRITICAL=beide=Vorsatz-Indiz)
     + Open Cookie Database (CC0) als Library-Seed (2264 Cookies)
P59b Cookie-Behavior in Banner-Check verdrahtet + Mail-Block (BUGFIX:
     SessionLocal selbst oeffnen, db war im Background-Task nicht im Scope)

Mail-Polish nach Mercedes-Review:
P63  Banner-Footer-Links auch im wb7-link/role=link erkennen (Shadow-DOM-
     Walker label-based statt nur <a href>)
P64  Re-Access-Severity: MEDIUM statt HIGH, wenn Footer "Einstellungen" oder
     Mercedes-typisch existiert; OEM-Footer-Detection (wb7-footer)
P65  Text-Truncation: Word-Boundary statt Zeichen-Cut (kein "einfa"-Bruch
     mehr in Sofortmassnahmen)
P66  GF-Aktionen: Service-Zweck vs Cookie-Zweck explizit erklaert
     (haeufige Verwechslung Marketing/GF: "Akamai-Beschreibung" != Cookie-
     Zweck pro DSK-OH 2024)
P67  Stirring-Finding mit "Verlust-Framing"-Erklaerung + Alt-vs-Neutral-
     Beispiel, statt nur EDPB-Fachbegriff

Compliance-Advisor FAQ (admin agent-core/soul):
  + CNIL/EDPB Top-Bussgelder (Google 100M, Meta 60M, Amazon 35M)
  + Deutsche Praezedenz (LG Muenchen Google Fonts, EuGH Planet49, BGH I ZR 7/16)
  + 4 Risiko-Pfade (Bussgeld/Abmahnung/Sammelklage/NOYB) + Berechnungs-Methodik

Document-Generator Templates: AGB-DE (142), Impressum (140), Widerrufs-
formular-Anlage (143), DSR-Process-Dedup (139), Cookie-Library (144).

Architektur: doc_action_mappings.py + banner_dom_walkers.py +
cookie_behavior_validator.py + vendor_detail_extractor.py rausgezogen,
um die 500-LOC-Caps in agent_doc_check_report.py und
banner_text_checker.py einzuhalten.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Benjamin Admin
2026-05-21 06:28:25 +02:00
parent badb356740
commit 57c0f940a2
38 changed files with 3656 additions and 116 deletions
@@ -396,6 +396,17 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
f"mit-geprueft.",
))
continue
# P24: DSB-Kontakt ist Pflichtangabe in der DSE (Art. 13(1)(b)
# DSGVO) — wenn kein separates DSB-Dokument vorliegt, ist das
# KEIN Fehler. DSB-Pruefung passiert ohnehin in der DSE.
if doc_type == "dsb" and not (entry.get("url") or "").strip():
results.append(DocCheckResult(
label=label, url="", doc_type=doc_type,
error="Nicht separat vorhanden — DSB-Kontaktdaten "
"werden in der Datenschutzerklaerung als "
"Pflichtangabe nach Art. 13(1)(b) DSGVO geprueft.",
))
continue
# Empty entry — either from auto-discovery padding (no URL
# to fetch) or from a fetch that returned nothing. If there
# was a URL we keep the error so the user knows the fetch
@@ -442,7 +453,7 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
if banner_url:
_update(check_id, "Cookie-Banner wird geprueft...", 82)
try:
async with httpx.AsyncClient(timeout=120.0) as client:
async with httpx.AsyncClient(timeout=900.0) as client: # P50: +10min for vendor-detail-phase
resp = await client.post(
f"{CONSENT_TESTER_URL}/scan",
json={"url": banner_url, "timeout_per_phase": 10},
@@ -450,7 +461,9 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
if resp.status_code == 200:
banner_result = resp.json()
except Exception as e:
logger.warning("Banner check failed: %s", e)
logger.warning(
"Banner check failed: %s (%s)", e or "<empty>", type(e).__name__
)
# Step 3c: Cross-check Banner vs Cookie-Richtlinie (88-90%)
if banner_result and "cookie" in doc_texts:
@@ -530,12 +543,35 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
)
cookie_payloads = []
cookie_text = ""
# P30: aggregate cmp_payloads from ALL doc_entries — sites
# like Mercedes load Usercentrics only on the homepage, so
# the JSON gets captured during DSE/Impressum discovery, not
# in the cookies.html fetch. Dedup by URL since the same
# payload is captured on every page load.
seen_cmp_urls: set[str] = set()
for e in doc_entries:
if e.get("doc_type") == "cookie":
if e.get("cmp_payloads"):
cookie_payloads.extend(e["cmp_payloads"])
if e.get("text"):
cookie_text = e["text"]
for p in (e.get("cmp_payloads") or []):
p_url = p.get("url") or ""
if p_url and p_url in seen_cmp_urls:
continue
seen_cmp_urls.add(p_url)
cookie_payloads.append(p)
if e.get("doc_type") == "cookie" and e.get("text"):
cookie_text = e["text"]
# P48: also pull cmp_payloads from the Banner-Scan (homepage
# 3-phase consent test). Mercedes' Usercentrics-JSON is
# captured there even when not in DSI-Discovery of static
# legal pages.
if banner_result:
for p in (banner_result.get("cmp_payloads") or []):
p_url = p.get("url") or ""
if p_url and p_url in seen_cmp_urls:
continue
seen_cmp_urls.add(p_url)
cookie_payloads.append(p)
if cookie_payloads:
logger.info("P48: %d CMP-payloads available for vendor-extract (after Banner-Scan merge)",
len(cookie_payloads))
# P17-D: Fallback wenn cookie via P15 deduped wurde — nutze DSE-Text
# sofern Cookie-Begriffe drin sind, damit LLM-Vendor-Extract trotzdem
# greifen kann.
@@ -570,6 +606,160 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
category=v.get("category", ""),
owner_name=owner_name,
)
# P57: Phase G vendor_details als zusätzliche Vendor-Quelle.
# Wenn extract_vendors_from_payloads weniger findet als
# Phase G's Info-Click-Through (z.B. Mercedes-Settings nicht
# erkannt als usercentrics-kind), die Phase-G-Namen als
# eigenständige Vendors hinzufügen.
if banner_result:
vd_list = banner_result.get("vendor_details") or []
vd_list = [v for v in vd_list if v.get("name") != "__TDM_OPTOUT__"]
existing_names = {(v.get("name") or "").strip().lower()
for v in cmp_vendors}
added = 0
for d in vd_list:
n = (d.get("name") or "").strip()
if not n or n.lower() in existing_names:
continue
# Skip generic category-labels (Mercedes-Kategorien)
if n.lower() in ("technisch erforderlich", "analyse und statistik",
"marketing", "alles auswählen",
"alles auswaehlen"):
continue
from compliance.services.vendor_classifier import classify
cmp_vendors.append({
"name": n,
"country": "",
"purpose": d.get("description", "")[:500],
"category": "",
"opt_out_url": d.get("opt_out_url", ""),
"privacy_policy_url": d.get("privacy_url", ""),
"persistence": d.get("retention", ""),
"cookies": d.get("cookies", []),
"processing_company": d.get("processing_company", ""),
"address": d.get("address", ""),
"purposes": d.get("purposes", []),
"technologies": d.get("technologies", []),
"recipient_type": classify(
vendor_name=n, category="", owner_name=owner_name,
),
})
existing_names.add(n.lower())
added += 1
if added:
logger.info("P57: added %d new vendors from Phase G (total: %d)",
added, len(cmp_vendors))
# P50: enrich vendors with per-vendor detail-modal-extracts
# (description, opt-out URL, privacy URL, cookies). Detail
# comes from Phase G Info-button-click-through in /scan.
tdm_opt_out_notice = ""
if cmp_vendors and banner_result:
vendor_details = banner_result.get("vendor_details") or []
# P50f: filter out TDM-opt-out sentinel
tdm_sentinel = next((v for v in vendor_details
if v.get("name") == "__TDM_OPTOUT__"), None)
if tdm_sentinel:
tdm_opt_out_notice = tdm_sentinel.get("description", "")
logger.info("P50f: TDM opt-out — skipped detail-enrichment for vendors")
vendor_details = [v for v in vendor_details
if v.get("name") != "__TDM_OPTOUT__"]
if vendor_details:
details_by_name = {}
for d in vendor_details:
n = (d.get("name") or "").strip().lower()
if n:
details_by_name[n] = d
enriched = 0
for v in cmp_vendors:
key = (v.get("name") or "").strip().lower()
# Substring fallback for fuzzy matches (e.g.
# "Google Analytics" detail-name may differ slightly)
d = details_by_name.get(key)
if not d:
for dn, dv in details_by_name.items():
if key in dn or dn in key:
d = dv
break
if not d:
continue
if not v.get("country") and (d.get("processing_company") or d.get("address")):
# Heuristic country extract from address (DE/EU keywords)
addr = d.get("address", "")
if re.search(r"\b(deutschland|germany|berlin|m(?:ue|ü)nchen|hamburg|stuttgart)\b", addr, re.I):
v["country"] = "DE"
elif re.search(r"\bireland|irland|dublin\b", addr, re.I):
v["country"] = "IE"
elif re.search(r"\busa|united states|california|new york|delaware\b", addr, re.I):
v["country"] = "US"
if not v.get("purpose"):
v["purpose"] = d.get("description", "")[:500]
if not v.get("opt_out_url"):
v["opt_out_url"] = d.get("opt_out_url", "")
if not v.get("privacy_policy_url"):
v["privacy_policy_url"] = d.get("privacy_url", "")
if not v.get("cookies"):
v["cookies"] = d.get("cookies", [])
v["purposes"] = d.get("purposes", [])
v["technologies"] = d.get("technologies", [])
if not v.get("persistence"):
v["persistence"] = d.get("retention", "")
v["processing_company"] = d.get("processing_company", "")
v["address"] = d.get("address", "")
enriched += 1
logger.info("P50: enriched %d/%d vendors with detail-modal data",
enriched, len(cmp_vendors))
# P59b: Cookie-Behavior-Validator — pruefe alle gesetzten Cookies
# gegen unsere Library, generiere 3-Tier-Severity-Findings.
# Background-Task hat keinen DB-Dependency-Inject -> SessionLocal
# selber oeffnen + sauber schliessen.
cookie_behavior_findings: list[dict] = []
if banner_result:
cookies_detailed = banner_result.get("cookies_detailed") or []
if cookies_detailed:
cb_session = None
try:
from database import SessionLocal
from compliance.services.cookie_behavior_validator import (
validate_cookie_behavior,
)
from urllib.parse import urlparse
fp_domain = ""
if banner_url:
fp_domain = urlparse(banner_url).netloc.replace("www.", "")
cb_session = SessionLocal()
cookie_behavior_findings = validate_cookie_behavior(
cb_session, cookies_detailed,
network_requests=[], # TODO Layer B in P59d
first_party_domain=fp_domain,
)
if cookie_behavior_findings:
sevs = {f["severity"] for f in cookie_behavior_findings}
logger.info(
"P59b: Cookie-Behavior-Check %d findings "
"(severities: %s) ueber %d Cookies",
len(cookie_behavior_findings),
sorted(sevs),
len(cookies_detailed),
)
banner_result["cookie_behavior_findings"] = (
cookie_behavior_findings
)
else:
logger.info(
"P59b: Cookie-Behavior-Check 0 findings "
"ueber %d Cookies (library miss / clean)",
len(cookies_detailed),
)
except Exception as cb_err:
logger.warning("P59b Cookie-Behavior-Check failed: %s", cb_err)
finally:
if cb_session is not None:
try:
cb_session.close()
except Exception:
pass
if cmp_vendors:
logger.info("VVT: %d vendors extracted, validating links",
len(cmp_vendors))
@@ -1149,10 +1339,15 @@ _DISCOVERY_RULES: list[tuple[str, tuple[str, ...]]] = [
"right-of-withdrawal", "ruecktritts", "rücktritts")),
("social_media", ("social-media", "soziale-medien", "social_media",
"social-media-policy")),
# P23: 'terms-and-conditions' kann Allgemeine Geschaeftsbedingungen ODER
# Nutzungsbedingungen meinen. Discovery-Funktion klassifiziert spaeter
# praeziser per Titel + Inhalt. Hier nur Url-Hint:
("agb", ("/agb", "geschaeftsbedingungen", "geschäftsbedingungen",
"terms-and-conditions", "general-terms")),
("nutzungsbedingungen", ("nutzungsbedingung", "terms-of-use",
"nutzungsordnung", "terms-of-service")),
"general-terms")),
("nutzungsbedingungen", ("nutzungsbedingung", "nutzungsbedingungen",
"terms-of-use", "terms-and-conditions",
"nutzungsordnung", "terms-of-service",
"allgemeine-nutzungsbedingungen")),
("dsb", ("datenschutzbeauftragt", "data-protection-officer",
"dpo-contact", "/dsb")),
("impressum", ("impressum", "imprint", "legal-notice", "site-notice",
@@ -202,5 +202,34 @@ def build_banner_deep_html(banner_result: dict | None) -> str:
)
parts.append('</ul>')
# 5) P59b: Cookie-Behavior-Findings (deklariert vs. tatsaechlich)
cb_findings = banner_result.get("cookie_behavior_findings") or []
if cb_findings:
parts.append(
'<div style="margin:14px 0 4px;padding:8px 12px;'
'background:#fef9e7;border-left:3px solid #d97706;border-radius:4px">'
'<div style="font-size:12px;color:#92400e;font-weight:600;'
'margin-bottom:6px">Cookie-Verhaltens-Check '
'(P59 — deklarierter Zweck vs. tatsaechliches Verhalten)</div>'
'<ul style="margin:0 0 0 18px;padding:0;font-size:11px;color:#1e293b">'
)
for f in cb_findings[:20]:
sev = (f.get("severity") or "MEDIUM").upper()
sev_c = ("#dc2626" if sev in ("CRITICAL", "HIGH") else
"#d97706" if sev == "MEDIUM" else "#94a3b8")
cname = f.get("cookie_name", "?")
parts.append(
f'<li style="margin-bottom:6px">'
f'<span style="display:inline-block;background:{sev_c};color:#fff;'
f'font-size:9px;padding:1px 5px;border-radius:3px;margin-right:6px">'
f'{sev}</span><code style="font-size:10px;background:#f1f5f9;'
f'padding:1px 4px;border-radius:2px">{cname}</code>: '
f'{f.get("text", "")[:280]}'
f'<div style="font-size:10px;color:#94a3b8;margin-top:2px;'
f'font-style:italic">Quelle: {f.get("legal_ref", "")} · '
f'Layer {f.get("layer", "?")}</div></li>'
)
parts.append('</ul></div>')
parts.append('</div>')
return "".join(parts)
@@ -13,6 +13,17 @@ Bei sauberen Sites bleibt er weg.
from __future__ import annotations
def _truncate_words(text: str, max_chars: int) -> str:
"""P65: Truncate at word boundary, never mid-word."""
if not text or len(text) <= max_chars:
return text
cut = text[:max_chars]
last_space = cut.rfind(" ")
if last_space > max_chars // 2:
cut = cut[:last_space]
return cut.rstrip(",;:.") + ""
# Bekannte Buessgeld-Praezedenzfaelle als Quellen-Hint
_BUSSGELD_REFS = {
"no_provider_per_category": "CNIL France 2023 — TikTok 5 Mio EUR (fehlende Vendor-Transparenz)",
@@ -40,7 +51,7 @@ def _detect_critical_issues(
if sev in ("CRITICAL", "HIGH"):
issues.append({
"key": "banner_violation",
"title": v.get("text", "")[:120],
"title": _truncate_words(v.get("text", ""), 260),
"severity": sev,
"action": _action_for_banner_violation(v),
"source": v.get("legal_ref", ""),
@@ -283,6 +283,50 @@ def build_vvt_table_html(vendors: list[dict]) -> str:
summary_parts.append("&mdash; alle ueber 50%")
summary = " ".join(summary_parts)
# P60: Wenn viele Vendors die GLEICHEN Flag-Sets haben, einmal
# global hinweisen statt 42x pro Vendor wiederholen.
from collections import Counter
flag_sets = Counter()
for v in vendors:
flags = v.get("compliance_flags") or []
if flags:
flag_sets[tuple(sorted(flags))] += 1
pattern_notice = ""
if flag_sets:
most_common, n_match = flag_sets.most_common(1)[0]
share = n_match / max(1, len(vendors))
if n_match >= 8 and share >= 0.5:
from compliance.services.finding_action_recipes import recipe_for
labels = [_flag_short(f) for f in most_common]
shared_actions = []
for f in most_common:
rec = recipe_for(f)
if rec:
shared_actions.append(
f'<li><strong>{_flag_short(f)}:</strong> '
f'{rec.get("fix_text", "").splitlines()[0][:180]}</li>'
)
pattern_notice = (
f'<div style="margin:8px 0 12px;padding:10px 14px;'
f'background:#fef3c7;border-left:3px solid #d97706;'
f'border-radius:4px;font-size:11px;color:#92400e">'
f'<strong>Wiederkehrendes Muster ({n_match} von {len(vendors)} '
f'Anbietern, {int(share*100)}%):</strong> '
f'Bei diesen Anbietern fehlen jeweils: '
f'<em>{", ".join(labels)}</em>. '
f'Vermutlich systembedingt (z.B. Settings-Export liefert '
f'nur Namen, oder Banner-API blockiert Detail-Extraktion). '
f'Die globalen Empfehlungen unten gelten fuer all diese Eintraege; '
f'in der Tabelle werden sie nicht pro Zeile wiederholt.'
+ (f'<ul style="margin:8px 0 0 0;padding-left:20px">{"".join(shared_actions)}</ul>'
if shared_actions else '')
+ '</div>'
)
# Mark vendors so _render_vendor_row can suppress redundant actions
for v in vendors:
if tuple(sorted(v.get("compliance_flags") or [])) == most_common:
v["_actions_in_global_notice"] = True
out: list[str] = [
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:12px 16px;'
@@ -296,6 +340,7 @@ def build_vvt_table_html(vendors: list[dict]) -> str:
'Verarbeitungen (INTERNAL/GROUP) werden Opt-Out und Privacy-Link '
'NICHT als Pflicht gewertet &mdash; der Widerruf erfolgt ueber das '
'Cookie-Banner, Privacy ist in der Haupt-DSI dokumentiert.</p>',
pattern_notice,
]
for rtype, section_label in RECIPIENT_TYPE_SECTIONS:
@@ -389,7 +434,9 @@ def _render_vendor_row_full(v: dict) -> str:
# Inline-Aktions-Anweisungen pro Flag
actions_html = ""
if flags:
# P60: skip per-row actions when already covered by global pattern notice
skip_actions = bool(v.get("_actions_in_global_notice"))
if flags and not skip_actions:
from compliance.services.finding_action_recipes import recipe_for
action_items = []
for f in flags:
@@ -202,52 +202,13 @@ def build_management_summary(results: list[DocCheckResult]) -> str:
def _check_to_action(doc_label: str, check_label: str, hint: str) -> str:
"""Convert a failed check into a plain-language action item."""
# Map technical check labels to business-language actions
label_lower = check_label.lower()
"""Convert a failed check into a plain-language action item.
if "datenschutzbeauftragter" in label_lower or "dsb" in label_lower:
return (f"<strong>{doc_label}:</strong> Ihren Datenschutzbeauftragten "
f"mit Kontaktdaten erwaehnen. Pflicht ab 20 Mitarbeitern.")
if "beschwerderecht" in label_lower or "art. 77" in label_lower:
return (f"<strong>{doc_label}:</strong> Hinweis auf das Beschwerderecht "
f"bei der Aufsichtsbehoerde ergaenzen (Name + Kontakt der Behoerde).")
if "betroffenenrechte" in label_lower:
return (f"<strong>{doc_label}:</strong> Alle Betroffenenrechte "
f"(Auskunft, Berichtigung, Loeschung, etc.) einzeln auffuehren.")
if "verantwortlicher" in label_lower:
return (f"<strong>{doc_label}:</strong> Vollstaendige Firmenbezeichnung "
f"mit Rechtsform, Adresse, E-Mail und Telefon eintragen.")
if "interessenabwaegung" in label_lower:
return (f"<strong>{doc_label}:</strong> Bei 'berechtigtem Interesse' "
f"die Abwaegung dokumentieren. Aufgabe fuer den DSB/Rechtsanwalt.")
if "widerrufsbelehrung" in label_lower or "widerruf" in label_lower:
return (f"<strong>{doc_label}:</strong> Gesetzliche Widerrufsbelehrung "
f"mit 14-Tage-Frist und Musterformular bereitstellen.")
if "loeschkonzept" in label_lower:
return (f"<strong>{doc_label}:</strong> Loeschfristen und -prozess "
f"dokumentieren. Aufgabe fuer den DSB.")
if "profiling" in label_lower or "art. 22" in label_lower:
return (f"<strong>{doc_label}:</strong> Hinweis ergaenzen ob "
f"automatisierte Entscheidungen stattfinden oder nicht.")
if "nicht im eingereichten text" in label_lower:
return (f"<strong>{doc_label}:</strong> Das eingereichte Dokument "
f"enthaelt nicht den erwarteten Inhalt. Bitte korrekte URL pruefen.")
if any(w in label_lower for w in ("rechtswidrig", "illegal", "haftungsausschluss", "disclaimer")):
return f"<strong>{doc_label}:</strong> '{check_label}' muss entfernt werden (Anti-Pattern, rechtlich wirkungslos)."
# Generic fallback
if hint and len(hint) < 150:
return f"<strong>{doc_label}:</strong> {hint[:120]}"
return f"<strong>{doc_label}:</strong> '{check_label}' muss ergaenzt werden."
Implementation lives in doc_action_mappings.check_to_action kept here
as a thin wrapper so the report module stays under the 500-LOC cap.
"""
from compliance.api.doc_action_mappings import check_to_action
return check_to_action(doc_label, check_label, hint)
def build_html_report(
@@ -0,0 +1,102 @@
"""
GF-freundliche Action-Texte fuer fehlende Pflichtangaben.
Ausgelagert aus agent_doc_check_report.py (LOC-Cap). Wandelt einen
fehlgeschlagenen DocCheck in eine kurze Handlungsanweisung um, die ein
Geschaeftsfuehrer ohne juristisches Vorwissen versteht.
P66: Cookie-spezifische Findings unterscheiden zwischen Service-Zweck
(Anbieter-Beschreibung wie "Akamai = Bot-Schutz") und Cookie-Zweck
(welches Cookie wozu) eine haeufige Verwechslung bei Marketing-Managern.
"""
from __future__ import annotations
def _cookie_finding_action(doc_label: str, check_label: str) -> str | None:
"""P66 — Cookie-spezifische Mappings."""
label_lower = check_label.lower()
if "zwecke der cookies" in label_lower or label_lower == "zwecke":
return (f"<strong>{doc_label}:</strong> Zwecke pro Cookie ergaenzen "
f"— nicht pro Anbieter. Service-Beschreibungen ('Akamai = "
f"Bot-Schutz') beantworten nicht, was das einzelne Cookie "
f"tut. Pflicht: pro Cookie (z.B. <code>_abck</code>) den "
f"konkreten Zweck angeben ('Bot-Detection-Token, gueltig "
f"24h'). DSK-OH Telemedien 2024 §3.2.")
if "speicherdauer" in label_lower:
return (f"<strong>{doc_label}:</strong> Speicherdauer pro Cookie "
f"angeben — nicht pauschal 'siehe Anbieter'. Pflicht: "
f"konkreter Wert (z.B. '_ga: 2 Jahre', '_gid: 24h', "
f"'PHPSESSID: Session'). Werte aus DevTools &gt; "
f"Application &gt; Cookies pruefen, Anbieter-Doku ist "
f"oft veraltet. Art. 13 Abs. 2 lit. a DSGVO.")
if "anbieter" in label_lower or "providers_named" in label_lower:
return (f"<strong>{doc_label}:</strong> Konkrete Firmen mit Sitz "
f"benennen — nicht 'Drittanbieter' oder 'Marketing-Partner'. "
f"Pflicht: voller Firmenname + Rechtsform + Land (z.B. "
f"'Google Ireland Limited, Dublin'). Art. 13 Abs. 1 lit. e "
f"DSGVO (Empfaenger-Pflicht).")
if "cookie-tabelle" in label_lower or "cookie_list" in label_lower:
return (f"<strong>{doc_label}:</strong> Tabellarische Cookie-Liste "
f"mit Name, Anbieter, Zweck und Speicherdauer ergaenzen. "
f"Reine Anbieter-Beschreibung ohne Cookie-Namen reicht "
f"nicht — Nutzer muss nachvollziehen, welches einzelne "
f"Cookie was tut. DSK-OH 2024.")
if "drittland" in label_lower or "schrems" in label_lower:
return (f"<strong>{doc_label}:</strong> Pro US-Anbieter (Google, "
f"Meta, AWS, Akamai) klaeren: SCC (Art. 46 DSGVO) oder "
f"DPF-Zertifizierung — und in der Cookie-Richtlinie "
f"explizit nennen. Pauschales 'Anbieter ausserhalb EU' "
f"reicht nicht. EuGH Schrems II.")
return None
def check_to_action(doc_label: str, check_label: str, hint: str) -> str:
"""Convert a failed check into a plain-language action item."""
label_lower = check_label.lower()
if "datenschutzbeauftragter" in label_lower or "dsb" in label_lower:
return (f"<strong>{doc_label}:</strong> Ihren Datenschutzbeauftragten "
f"mit Kontaktdaten erwaehnen. Pflicht ab 20 Mitarbeitern.")
if "beschwerderecht" in label_lower or "art. 77" in label_lower:
return (f"<strong>{doc_label}:</strong> Hinweis auf das Beschwerderecht "
f"bei der Aufsichtsbehoerde ergaenzen (Name + Kontakt der Behoerde).")
if "betroffenenrechte" in label_lower:
return (f"<strong>{doc_label}:</strong> Alle Betroffenenrechte "
f"(Auskunft, Berichtigung, Loeschung, etc.) einzeln auffuehren.")
if "verantwortlicher" in label_lower:
return (f"<strong>{doc_label}:</strong> Vollstaendige Firmenbezeichnung "
f"mit Rechtsform, Adresse, E-Mail und Telefon eintragen.")
if "interessenabwaegung" in label_lower:
return (f"<strong>{doc_label}:</strong> Bei 'berechtigtem Interesse' "
f"die Abwaegung dokumentieren. Aufgabe fuer den DSB/Rechtsanwalt.")
if "widerrufsbelehrung" in label_lower or "widerruf" in label_lower:
return (f"<strong>{doc_label}:</strong> Gesetzliche Widerrufsbelehrung "
f"mit 14-Tage-Frist und Musterformular bereitstellen.")
if "loeschkonzept" in label_lower:
return (f"<strong>{doc_label}:</strong> Loeschfristen und -prozess "
f"dokumentieren. Aufgabe fuer den DSB.")
if "profiling" in label_lower or "art. 22" in label_lower:
return (f"<strong>{doc_label}:</strong> Hinweis ergaenzen ob "
f"automatisierte Entscheidungen stattfinden oder nicht.")
if "nicht im eingereichten text" in label_lower:
return (f"<strong>{doc_label}:</strong> Das eingereichte Dokument "
f"enthaelt nicht den erwarteten Inhalt. Bitte korrekte URL pruefen.")
if any(w in label_lower for w in ("rechtswidrig", "illegal",
"haftungsausschluss", "disclaimer")):
return (f"<strong>{doc_label}:</strong> '{check_label}' muss entfernt "
f"werden (Anti-Pattern, rechtlich wirkungslos).")
mapped = _cookie_finding_action(doc_label, check_label)
if mapped:
return mapped
if hint and len(hint) < 300:
return f"<strong>{doc_label}:</strong> {hint[:280]}"
return f"<strong>{doc_label}:</strong> '{check_label}' muss ergaenzt werden."