feat: Browser-Matrix C2 + B11 AI-Retention + Impressum-Specialist-Agent + B1 Mobile Playwright

Task #15 Stage 1.c-e: Browser-Matrix backend integration:
- _phase_c2_browser_matrix.py: calls the consent-tester /scan-matrix endpoint when env BROWSER_MATRIX=true and fills state["browser_matrix"], state["browser_aggregate"] and state["browser_matrix_html"]
- V2 mail block: 🌐 Browser-Matrix table (profile · score · sub-scores PC/RR/BD · rating) with a worst-of header
- Orchestrator calls run_phase_c2 after run_phase_c

KNOWN: Stage 1.b (consent_scanner browser_profile param) remains deferred (the file is in loc-exception, the hook patch was refused). The Stage 1.a shim runs in the consent-tester: all profiles currently run on Chromium; real engine diversity arrives with 1.b.

Task #17 TH-RETENTION-002 as B11 ai_retention_granularity_check:
- Detects AI-provider context (vertex/openai/anthropic/etc.)
- Within a ±800-character window: checks for ≥2 data categories from the standard list (text inputs/IP/device/session/error log/timestamp)
- If there is 1 blanket retention period plus ≥2 categories but no per-category differentiation → LOW
- Smoke: the Elli mock privacy policy (DSE) hits LOW "AI-Speicherdauer pauschal"

Task #18 Specialist-Agents Phase-1 prototype:
- compliance/services/specialist_agents/__init__.py with architecture docs
- impressum_agent.py: 9 mandatory disclosures under § 5 TMG + § 1 DL-InfoV as a pattern registry (name, email, phone, commercial register (HR), VAT ID (USt-IdNr), authorized representative, supervisory authority, professional details, OS link)
- business_scope-aware (OS link only for ecommerce, supervisory authority only for regulated_profession/financial/insurance)
- Phase 1 is pattern-match-only (no LLM) and demonstrates the interface. Phase 2 replaces the patterns with a system prompt + KB.
- Smoke: a minimal Impressum correctly triggers 4 findings

Task #7 B1 Playwright mobile verification:
- consent-tester/services/mobile_reachability_scanner.py: real WebKit launch + p.devices['iPhone 15'] preset + de-DE locale + Europe/Berlin timezone
- Footer anchor search via locator("footer >> text=/.../i") for 13 reopen phrases
- Tap-target bounding-box measurement (Apple HIG / WCAG ≥44x44)
- Click behavior: DOM modal snapshot before/after, detects whether the CMP opens
- Output: has_anchor, anchor_text, tap_target_px, click_opens_cmp, engine_meta, screenshot_b64 (footer crop when there is no anchor)
- consent-tester/routes_mobile.py: POST /scan-mobile-reachability
- Backend _b1_wiring extended: calls the mobile endpoint first and falls back to a static HTTP fetch. Mobile data enriches finding.mobile_playwright and bumps severity on tap-target<44 / click-doesnt-open-CMP.

KNOWN: the WebKit system libs have been added to the Dockerfile (Stage 1.a commit) but only take effect after a CI/CD rebuild of the consent-tester. Until then B1 falls back cleanly to the static fetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
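The B11 rule in the commit message (provider context, ±800-character window, at least two data categories, one blanket retention period) can be sketched roughly as follows. This is a minimal illustration, not the shipped ai_retention_granularity_check: the function name, every regex, and the "exactly one distinct period means no per-category differentiation" heuristic are assumptions made for the example.

```python
import re

# All patterns below are illustrative assumptions, not the shipped registry.
AI_PROVIDER = re.compile(r"\b(vertex|openai|anthropic)\b", re.IGNORECASE)
CATEGORY_PATTERNS = {
    "text_input": r"texteingabe",
    "ip": r"ip-adresse",
    "device": r"ger[äa]t",
    "session": r"session|sitzung",
    "error_log": r"fehlerprotokoll",
    "timestamp": r"zeitstempel",
}
RETENTION = re.compile(r"(\d+)\s*(tag|monat|jahr)\w*", re.IGNORECASE)
WINDOW = 800  # characters inspected on each side of the provider mention


def sketch_ai_retention_check(text: str) -> list:
    """Return a LOW finding when one flat period covers >= 2 data categories."""
    for match in AI_PROVIDER.finditer(text):
        start = max(0, match.start() - WINDOW)
        window = text[start:match.end() + WINDOW]
        categories = [
            name for name, pattern in CATEGORY_PATTERNS.items()
            if re.search(pattern, window, re.IGNORECASE)
        ]
        # Normalise "30 Tage" / "30 Tagen" to the same (30, "tag") key.
        periods = {(int(n), unit.lower()) for n, unit in RETENTION.findall(window)}
        if len(categories) >= 2 and len(periods) == 1:
            return [{
                "check": "B11",
                "severity": "LOW",
                "title": "AI-Speicherdauer pauschal",
                "categories": categories,
            }]
    return []
```

A policy that states different periods for different categories yields two distinct periods in the window and therefore no finding under this heuristic.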
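The Phase-1 Impressum agent is described in the commit message as a scope-aware pattern registry. A minimal sketch of that idea, assuming a hypothetical `Requirement` record and covering only a subset of the nine disclosures, might look like this; the regexes and the `missing_requirements` helper are illustrative and not taken from impressum_agent.py.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Requirement:
    key: str
    pattern: str
    # None means "always required"; otherwise only for these business scopes.
    scopes: Optional[FrozenSet[str]] = None


# Hypothetical subset of the registry; patterns are assumptions.
REGISTRY = [
    Requirement("email", r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    Requirement("phone", r"(?:tel|telefon)\.?:?\s*\+?\d[\d\s/()-]{5,}"),
    Requirement("vat_id", r"\bDE\s?\d{9}\b"),
    Requirement("os_link", r"ec\.europa\.eu/consumers/odr",
                frozenset({"ecommerce"})),
    Requirement("supervisory_authority", r"aufsichtsbeh(?:ö|oe)rde",
                frozenset({"regulated_profession", "financial", "insurance"})),
]


def missing_requirements(text: str, business_scope: set) -> list:
    """List registry keys that apply to the scope but do not match the text."""
    missing = []
    for req in REGISTRY:
        if req.scopes is not None and not (req.scopes & business_scope):
            continue  # requirement does not apply to this business scope
        if not re.search(req.pattern, text, re.IGNORECASE):
            missing.append(req.key)
    return missing
```

Keeping the scope on the registry entry rather than in the matching loop means a later Phase-2 replacement of the patterns could leave the applicability rules untouched.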
@@ -42,6 +42,29 @@ async def run_b1(state: dict) -> None:
         return
 
     _update(check_id, "Mobile Consent-Reachability prüfen...", 95)
+
+    # Try the new Playwright WebKit + iPhone scan first (Task #7).
+    # Falls back to static HTTP fetch on error.
+    mobile = None
+    try:
+        from ._constants import CONSENT_TESTER_URL
+        async with httpx.AsyncClient(timeout=60.0) as c:
+            r = await c.post(
+                f"{CONSENT_TESTER_URL}/scan-mobile-reachability",
+                json={"url": homepage_url},
+            )
+        if r.status_code == 200:
+            mobile = r.json()
+            logger.info(
+                "B1 Mobile-Playwright: has_anchor=%s tap=%s click_opens=%s",
+                mobile.get("has_anchor"),
+                mobile.get("tap_target_px"),
+                mobile.get("click_opens_cmp"),
+            )
+    except Exception as e:
+        logger.info("B1 Mobile-Playwright fallback to static fetch: %s", e)
+
+    page_html = None
     try:
         async with httpx.AsyncClient(
             timeout=20.0, follow_redirects=True,
@@ -50,21 +73,53 @@ async def run_b1(state: dict) -> None:
                      "Version/17.5 Mobile/15E148 Safari/604.1"},
         ) as c:
             r = await c.get(homepage_url)
-        if r.status_code != 200:
-            logger.info("B1: homepage fetch %s → HTTP %d", homepage_url, r.status_code)
-            return
-        page_html = r.text
+        if r.status_code == 200:
+            page_html = r.text
     except Exception as e:
         logger.warning("B1: homepage fetch failed: %s", e)
+
+    if not page_html and not mobile:
         return
 
-    finding = evaluate_reachability(page_html, homepage_url)
+    finding = evaluate_reachability(page_html or "", homepage_url)
+
+    # Enrich finding with mobile-playwright details when available
+    if mobile and mobile.get("has_anchor"):
+        finding["mobile_playwright"] = {
+            "has_anchor": mobile.get("has_anchor"),
+            "anchor_text": mobile.get("anchor_text"),
+            "tap_target_px": mobile.get("tap_target_px"),
+            "click_opens_cmp": mobile.get("click_opens_cmp"),
+            "engine_meta": mobile.get("engine_meta"),
+        }
+        # Tap-target rule (Apple HIG / WCAG 2.5.5): ≥ 44 px each side
+        tp = mobile.get("tap_target_px") or {}
+        if tp and (tp.get("w", 0) < 44 or tp.get("h", 0) < 44):
+            finding["notes"] = (finding.get("notes") or []) + [
+                f"tap-target nur {tp.get('w')}×{tp.get('h')}px "
+                "(Apple HIG / WCAG verlangen ≥ 44×44)",
+            ]
+            if finding.get("passed"):
+                finding["passed"] = False
+                finding["severity"] = "MEDIUM"
+                finding["severity_reason"] = "misclassified"
+        # If anchor exists in DOM but click doesn't open CMP, bump severity
+        if mobile.get("has_anchor") and not mobile.get("click_opens_cmp"):
+            finding["notes"] = (finding.get("notes") or []) + [
+                "click auf Footer-Link öffnet CMP nicht direkt",
+            ]
+            if finding.get("severity_reason") != "factually_wrong":
+                finding["severity"] = "MEDIUM"
+                finding["severity_reason"] = "misclassified"
+            finding["passed"] = False
+
     state["reachability_finding"] = finding
     state["reachability_html"] = _render_block(finding)
     logger.info(
-        "B1 Reachability: passed=%s severity=%s reason=%s",
+        "B1 Reachability: passed=%s severity=%s reason=%s mobile=%s",
         finding["passed"], finding.get("severity"),
         finding.get("severity_reason"),
+        bool(mobile),
     )

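The /scan-mobile-reachability endpoint called in the hunk above is not part of this diff. As a rough idea of what its measurement side could look like, the sketch below launches WebKit with the iPhone 15 preset named in the commit message and measures the first footer anchor. The function names, the single `phrase` parameter (the commit mentions 13 reopen phrases), and the mapping of the bounding box to `w`/`h` keys are assumptions; `tap_target_too_small` mirrors the `< 44` comparison used by the backend.

```python
from typing import Optional

MIN_TAP_PX = 44  # threshold the commit attributes to Apple HIG / WCAG 2.5.5


def tap_target_too_small(tap_target_px: Optional[dict],
                         minimum: int = MIN_TAP_PX) -> bool:
    """True only when a measurement exists and one side is below the minimum."""
    if not tap_target_px:
        return False
    return (tap_target_px.get("w", 0) < minimum
            or tap_target_px.get("h", 0) < minimum)


async def measure_footer_anchor(url: str, phrase: str = "cookie") -> dict:
    # Lazy import so the pure helper above works without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # Drop the browser hint defensively; only context options are passed on.
        device = {k: v for k, v in p.devices["iPhone 15"].items()
                  if k != "default_browser_type"}
        browser = await p.webkit.launch()
        try:
            context = await browser.new_context(
                **device, locale="de-DE", timezone_id="Europe/Berlin",
            )
            page = await context.new_page()
            await page.goto(url)
            anchor = page.locator(f"footer >> text=/{phrase}/i").first
            has_anchor = await anchor.count() > 0
            box = await anchor.bounding_box() if has_anchor else None
        finally:
            await browser.close()

    tap = {"w": round(box["width"]), "h": round(box["height"])} if box else None
    return {"has_anchor": has_anchor, "tap_target_px": tap}
```

With Playwright and the WebKit system libraries installed, the coroutine can be run via `asyncio.run(measure_footer_anchor("https://example.com"))`; without them only the pure helper is usable, which matches the fallback behaviour described in the KNOWN note.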
@@ -9,6 +9,9 @@ from __future__ import annotations
 import html
 import logging
 
+from compliance.services.ai_retention_granularity_check import (
+    check_ai_retention_granularity,
+)
 from compliance.services.impressum_multi_entity_check import (
     check_multi_entity_impressum,
 )
@@ -24,6 +27,7 @@ def run_b9b10(state: dict) -> None:
     new: list[dict] = []
     new.extend(check_multi_entity_impressum(state))
     new.extend(check_transfer_mechanism(state))
+    new.extend(check_ai_retention_granularity(state))
     if not new:
         return
     extras.extend(new)

@@ -51,6 +51,8 @@ async def run_compliance_check(check_id: str, req) -> None:
     await run_phase_b(state)
     # Phase C: Step 3b-d (banner + cross-check + TCF) + Step 4
     await run_phase_c(state)
+    # Phase C-2: optional browser-matrix scan (env BROWSER_MATRIX=true)
+    await run_phase_c2(state)
     # Phase D-1/D-2: Step 5 vendor extraction + finalize
     await run_phase_d1(state)
     await run_phase_d2(state)

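Because run_phase_c2 is awaited unconditionally, the opt-in has to be decided inside the phase. The gating steps (truthy env flag, optional profile list, origin derived from the first document URL) can be pulled into pure helpers for testing. The helper names below are hypothetical; the accepted flag values and the comma-separated profile format follow the new module in this commit.

```python
from typing import List, Optional
from urllib.parse import urlparse

TRUTHY = {"true", "1", "yes", "on"}


def flag_enabled(raw: Optional[str]) -> bool:
    """Interpret an env value such as BROWSER_MATRIX; unset means disabled."""
    return (raw or "false").lower() in TRUTHY


def parse_profiles(raw: Optional[str]) -> Optional[List[str]]:
    """Split a comma-separated profile list; None when nothing is configured."""
    profiles = [p.strip() for p in (raw or "").split(",") if p.strip()]
    return profiles or None


def origin_of(url: str) -> str:
    """Reduce a document URL to scheme://host, or "" if it is not absolute."""
    parsed = urlparse(url)
    if parsed.scheme and parsed.netloc:
        return f"{parsed.scheme}://{parsed.netloc}"
    return ""
```

In the phase itself these would be fed from `os.environ.get("BROWSER_MATRIX")` and `os.environ.get("BROWSER_MATRIX_PROFILES")`; keeping them free of environment access makes the default-off behaviour easy to assert.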
@@ -0,0 +1,146 @@
+"""Phase C-2 — Browser-Matrix Multi-Browser Scan (Stage 1.c).
+
+After the single-browser scan in Phase C, optionally fan out to the
+consent-tester /scan-matrix endpoint that runs the same probe on
+chromium / firefox / webkit / mobile-safari and returns a worst-of
+robustness score per browser.
+
+Activated by env `BROWSER_MATRIX=true`. Default off so existing
+runs aren't slowed down 4× while we tune.
+
+The state gets these new keys:
+
+  state["browser_matrix"]       list[dict]   per-profile results
+  state["browser_aggregate"]    dict         worst/best score + verbal
+  state["browser_matrix_html"]  str          pre-rendered V2 block
+"""
+
+from __future__ import annotations
+
+import logging
+import os
+from html import escape as h
+
+import httpx
+
+from ._constants import CONSENT_TESTER_URL
+from ._helpers import _update
+
+logger = logging.getLogger(__name__)
+
+
+async def run_phase_c2(state: dict) -> None:
+    if os.environ.get("BROWSER_MATRIX", "false").lower() not in (
+        "true", "1", "yes", "on",
+    ):
+        return
+    check_id = state["check_id"]
+    req = state["req"]
+    banner_url = ""
+    for d in req.documents:
+        if d.url:
+            from urllib.parse import urlparse
+            p = urlparse(d.url)
+            if p.scheme and p.netloc:
+                banner_url = f"{p.scheme}://{p.netloc}"
+                break
+    if not banner_url:
+        return
+
+    _update(check_id, "Browser-Matrix: Multi-Engine-Scan...", 83)
+
+    profiles_env = os.environ.get("BROWSER_MATRIX_PROFILES", "")
+    profiles = [p.strip() for p in profiles_env.split(",") if p.strip()] or None
+
+    try:
+        async with httpx.AsyncClient(timeout=600.0) as c:
+            r = await c.post(
+                f"{CONSENT_TESTER_URL}/scan-matrix",
+                json={
+                    "url": banner_url,
+                    "timeout_per_phase": 10,
+                    "categories": [],
+                    "browser_profiles": profiles,
+                },
+            )
+        if r.status_code != 200:
+            logger.warning("browser-matrix scan HTTP %d", r.status_code)
+            return
+        data = r.json()
+    except Exception as e:
+        logger.warning("browser-matrix scan failed: %s", e)
+        return
+
+    state["browser_matrix"] = data.get("browser_matrix") or []
+    state["browser_aggregate"] = data.get("aggregate") or {}
+    state["browser_matrix_html"] = _render(
+        state["browser_matrix"], state["browser_aggregate"],
+    )
+    logger.info(
+        "browser-matrix: %d profiles, worst=%s@%s%%, best=%s@%s%%",
+        len(state["browser_matrix"]),
+        state["browser_aggregate"].get("worst_profile"),
+        state["browser_aggregate"].get("worst_score"),
+        state["browser_aggregate"].get("best_profile"),
+        state["browser_aggregate"].get("best_score"),
+    )
+
+
+def _render(rows: list[dict], aggregate: dict) -> str:
+    if not rows:
+        return ""
+    table_rows = []
+    for r in rows:
+        sev = ("fail" if r["score"] < 60
+               else "warn" if r["score"] < 80 else "pass")
+        color = ("#dc2626" if sev == "fail"
+                 else "#f59e0b" if sev == "warn" else "#15803d")
+        dims = r.get("dimensions") or {}
+        dims_str = (
+            f"PC {int(dims.get('pre_consent',0)*100)}% · "
+            f"RR {int(dims.get('reject_respect',0)*100)}% · "
+            f"BD {int(dims.get('banner_design',0)*100)}%"
+        )
+        table_rows.append(
+            "<tr>"
+            f"<td style='padding:6px 10px;border-bottom:1px solid #e5e7eb;'>"
+            f"{h(r.get('label') or r.get('profile_id') or '—')}</td>"
+            f"<td style='padding:6px 10px;border-bottom:1px solid #e5e7eb;"
+            f"color:{color};font-weight:600;'>{r['score']}%</td>"
+            f"<td style='padding:6px 10px;border-bottom:1px solid #e5e7eb;"
+            f"color:#475569;font-size:12px;'>{dims_str}</td>"
+            f"<td style='padding:6px 10px;border-bottom:1px solid #e5e7eb;"
+            f"color:#475569;font-size:13px;'>{h(r.get('verbal','—'))}</td>"
+            "</tr>"
+        )
+    worst = aggregate.get("worst_score", 0)
+    sev_color = ("#dc2626" if worst < 60
+                 else "#f59e0b" if worst < 80 else "#15803d")
+    head = (
+        f"<p style='margin:0 0 8px;font-size:13px;color:#475569;'>"
+        f"<strong style='color:{sev_color};'>Worst-of {worst}%</strong> "
+        f"(Profil <code>{aggregate.get('worst_profile','—')}</code>) — "
+        f"Best-of {aggregate.get('best_score','—')}% "
+        f"(<code>{aggregate.get('best_profile','—')}</code>). "
+        "Aggregierter Score nach Worst-of-Regel: ein HIGH-Verstoß "
+        "auf einem Browser kappt den Gesamt-Score.</p>"
+    )
+    return (
+        "<div style='margin:24px 0;padding:16px;border-left:4px solid "
+        f"{sev_color};background:#f8fafc;border-radius:4px;'>"
+        "<h2 style='margin:0 0 8px;color:#1e293b;font-size:16px;'>"
+        "🌐 Browser-Matrix · Consent-Robustness pro Engine"
+        "</h2>"
+        f"{head}"
+        "<table style='width:100%;border-collapse:collapse;font-size:13px;"
+        "margin-top:8px;background:#fff;'>"
+        "<thead><tr style='background:#f1f5f9;'>"
+        "<th style='text-align:left;padding:6px 10px;'>Browser-Profil</th>"
+        "<th style='text-align:left;padding:6px 10px;'>Score</th>"
+        "<th style='text-align:left;padding:6px 10px;'>Pre-Consent · "
+        "Reject-Respekt · Banner-Design</th>"
+        "<th style='text-align:left;padding:6px 10px;'>Bewertung</th>"
+        "</tr></thead>"
+        f"<tbody>{''.join(table_rows)}</tbody></table>"
+        "</div>"
+    )
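The renderer above expects the consent-tester to deliver a ready-made aggregate. For orientation, the 60/80 colour thresholds and a worst-of/best-of summary could be expressed as two small pure functions. `classify_score` reproduces the thresholds from `_render`; `summarize` is a hypothetical stand-in for whatever the /scan-matrix endpoint computes, and it assumes each row carries `profile_id` and `score`.

```python
def classify_score(score: float) -> str:
    """Map a robustness score (0-100) to the bands used for row colouring."""
    if score < 60:
        return "fail"
    if score < 80:
        return "warn"
    return "pass"


def summarize(rows: list) -> dict:
    """Build worst-of/best-of keys in the shape the renderer reads."""
    if not rows:
        return {}
    worst = min(rows, key=lambda r: r["score"])
    best = max(rows, key=lambda r: r["score"])
    return {
        "worst_profile": worst["profile_id"],
        "worst_score": worst["score"],
        "best_profile": best["profile_id"],
        "best_score": best["score"],
    }
```

Under a worst-of rule the headline severity is simply `classify_score(summarize(rows)["worst_score"])`, so a single failing engine determines the overall colour.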