breakpilot-compliance/backend-compliance/compliance/api/agent_check/_b1_wiring.py
Benjamin Admin 37093ff9e3 feat: Browser-Matrix C2 + B11 AI-Retention + Impressum-Specialist-Agent + B1 Mobile Playwright
Task #15, Stage 1.c-e (Browser-Matrix backend integration):
  - _phase_c2_browser_matrix.py: calls consent-tester /scan-matrix when
    env BROWSER_MATRIX=true, populates state["browser_matrix"],
    state["browser_aggregate"] and state["browser_matrix_html"]
  - V2 mail block: 🌐 Browser-Matrix table (Profile · Score ·
    Sub-scores PC/RR/BD · Rating) with a worst-of header
  - Orchestrator calls run_phase_c2 after run_phase_c
  KNOWN: Stage 1.b (consent_scanner browser_profile param) remains
    deferred (file is in loc-exception, hook patch refused).
    The Stage 1.a shim runs in consent-tester; all profiles currently
    run on Chromium, real engine diversity arrives with 1.b.
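
The gating and worst-of logic described for Task #15 could look roughly like the sketch below. It is not the code from _phase_c2_browser_matrix.py: the helper names, the `profile`/`score` row fields, the case-insensitive comparison and the rule that a lower score is worse are all assumptions made for illustration.

```python
from __future__ import annotations

import os
from typing import Mapping


def browser_matrix_enabled(env: Mapping[str, str] | None = None) -> bool:
    """Return True when BROWSER_MATRIX is set to "true".

    Case-insensitive matching is an assumption; the commit only
    mentions BROWSER_MATRIX=true.
    """
    env = os.environ if env is None else env
    return env.get("BROWSER_MATRIX", "").strip().lower() == "true"


def worst_of(rows: list[dict]) -> dict | None:
    """Pick the row with the lowest numeric score (hypothetical schema).

    Rows without a numeric "score" are ignored; returns None when no
    row can be ranked.
    """
    scored = [r for r in rows if isinstance(r.get("score"), (int, float))]
    if not scored:
        return None
    return min(scored, key=lambda r: r["score"])
```

A caller would skip the /scan-matrix request entirely when `browser_matrix_enabled()` is false, and use `worst_of(...)` to fill the header row of the mail table.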

Task #17 TH-RETENTION-002 as B11 ai_retention_granularity_check:
  - Detects AI-provider context (vertex/openai/anthropic/etc.)
  - Within a ±800-char window: checks for ≥2 data categories from the
    standard list (text inputs/IP/device/session/error log/timestamp)
  - If there is 1 blanket retention period + ≥2 categories but no
    per-category differentiation → LOW
  - Smoke: the Elli mock privacy policy (DSE) hits LOW "AI-Speicherdauer pauschal"
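
The window check above can be sketched as follows. This is a simplified illustration, not the B11 implementation: the regexes, the category keys and the function name are hypothetical, and "no per-category differentiation" is approximated as "exactly one distinct duration appears in the window".

```python
from __future__ import annotations

import re

WINDOW = 800  # characters inspected on each side of the provider mention

# Hypothetical patterns; the real provider and category lists are longer.
AI_PROVIDER_RE = re.compile(r"\b(vertex|openai|anthropic)\b", re.I)
CATEGORY_PATTERNS = {
    "text_input": re.compile(r"texteingabe", re.I),
    "ip": re.compile(r"\bIP\b", re.I),
    "device": re.compile(r"ger(?:ä|ae)t", re.I),
    "session": re.compile(r"session|sitzung", re.I),
    "error_log": re.compile(r"fehlerprotokoll", re.I),
    "timestamp": re.compile(r"zeitstempel", re.I),
}
DURATION_RE = re.compile(r"\b\d+\s*(?:tage?n?|wochen?|monate?n?|jahre?n?)\b", re.I)


def check_ai_retention_granularity(text: str) -> str | None:
    """Return "LOW" when one blanket retention period covers >= 2 categories."""
    provider = AI_PROVIDER_RE.search(text)
    if provider is None:
        return None  # no AI-provider context, rule does not apply
    start = max(0, provider.start() - WINDOW)
    window = text[start:provider.end() + WINDOW]
    categories = {k for k, rx in CATEGORY_PATTERNS.items() if rx.search(window)}
    durations = {m.group(0).lower() for m in DURATION_RE.finditer(window)}
    if len(categories) >= 2 and len(durations) == 1:
        return "LOW"
    return None
```

A policy that states "30 Tage" once for text inputs, IP address and timestamps would be flagged, while one that gives text inputs 30 days and IP addresses 7 days would not.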

Task #18 Specialist-Agents Phase-1 prototype:
  - compliance/services/specialist_agents/__init__.py with architecture docs
  - impressum_agent.py: 9 mandatory disclosures under § 5 TMG + § 1 DL-InfoV
    as a pattern registry (name, email, phone, commercial register (HR),
    VAT ID (USt-IdNr), authorized representative, supervisory authority,
    professional details, OS link)
  - business_scope-aware (OS link only for ecommerce, supervisory authority
    only for regulated_profession/financial/insurance)
  - Phase 1 is pattern-match only (no LLM) and demonstrates the
    interface. Phase 2 replaces the patterns with system prompt + KB.
  - Smoke: a minimal Impressum correctly triggers 4 findings
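
A scope-aware pattern registry of the kind described for Task #18 might be structured like this. It is a sketch under assumptions: the class and function names, the regexes and the finding keys are invented for illustration, and only five of the nine disclosures are shown.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    key: str
    pattern: re.Pattern
    # None means the requirement applies to every business_scope.
    scopes: frozenset[str] | None = None


# Hypothetical excerpt of the registry; patterns are deliberately simple.
REGISTRY = [
    Requirement("email", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    Requirement("phone", re.compile(r"(?:tel\.?|telefon)\s*:?\s*\+?[\d\s/()-]{6,}", re.I)),
    Requirement("vat_id", re.compile(r"\bDE\s?\d{9}\b")),
    Requirement("os_link", re.compile(r"ec\.europa\.eu/consumers/odr", re.I),
                frozenset({"ecommerce"})),
    Requirement("supervisory_authority", re.compile(r"aufsichtsbeh(?:ö|oe)rde", re.I),
                frozenset({"regulated_profession", "financial", "insurance"})),
]


def check_impressum(text: str, business_scope: str) -> list[str]:
    """Return the keys of all applicable requirements missing from text."""
    missing = []
    for req in REGISTRY:
        if req.scopes is not None and business_scope not in req.scopes:
            continue  # requirement does not apply to this scope
        if not req.pattern.search(text):
            missing.append(req.key)
    return missing
```

Keeping the scope filter in the registry entry rather than in the loop body means a Phase-2 replacement of the matcher would not need to touch the applicability rules.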

Task #7 B1 Playwright mobile verification:
  - consent-tester/services/mobile_reachability_scanner.py: real
    WebKit launch + p.devices['iPhone 15'] preset + de-DE locale +
    Europe/Berlin timezone
  - Footer anchor search via locator("footer >> text=/.../i") for
    13 reopen phrases
  - Tap-target bounding-box measurement (Apple HIG / WCAG ≥44x44)
  - Click behavior: DOM modal snapshot before/after, detects CMP open
  - Output: has_anchor, anchor_text, tap_target_px, click_opens_cmp,
    engine_meta, screenshot_b64 (footer crop when there is no anchor)
  - consent-tester/routes_mobile.py POST /scan-mobile-reachability
  - Backend _b1_wiring extended: calls the mobile endpoint first,
    falls back to a static HTTP fetch. Mobile data enriches
    finding.mobile_playwright + severity bump on
    tap-target<44 / click-doesnt-open-CMP.
  KNOWN: WebKit system libs were added to the Dockerfile (Stage 1.a
    commit) but only take effect after a CI/CD rebuild of consent-tester.
    Until then B1 falls back cleanly to the static fetch.
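
The severity bump described in the last bullet can be restated as a pure function, which makes the two rules easy to test without a network call. This is a sketch, not the code in _b1_wiring.py: the function name `apply_mobile_rules`, the `MIN_TAP_PX` constant and the English note texts are hypothetical, while the dictionary keys follow the scanner output listed above.

```python
from __future__ import annotations

MIN_TAP_PX = 44  # minimum tap-target edge length (Apple HIG / WCAG)


def apply_mobile_rules(finding: dict, mobile: dict | None) -> dict:
    """Return a copy of finding with the mobile-scan rules applied."""
    result = dict(finding)
    result["notes"] = list(finding.get("notes") or [])
    if not mobile or not mobile.get("has_anchor"):
        return result  # no anchor found by the mobile scan, nothing to add

    # Rule 1: an undersized tap target turns a pass into a MEDIUM failure.
    tp = mobile.get("tap_target_px") or {}
    if tp and (tp.get("w", 0) < MIN_TAP_PX or tp.get("h", 0) < MIN_TAP_PX):
        result["notes"].append(f"tap target only {tp.get('w')}x{tp.get('h')}px")
        if result.get("passed"):
            result.update(passed=False, severity="MEDIUM",
                          severity_reason="misclassified")

    # Rule 2: the anchor exists but tapping it does not open the CMP.
    if not mobile.get("click_opens_cmp"):
        result["notes"].append("click on footer link does not open the CMP")
        if result.get("severity_reason") != "factually_wrong":
            result.update(severity="MEDIUM", severity_reason="misclassified")
        result["passed"] = False
    return result
```

Neither rule overwrites a finding whose severity_reason is already "factually_wrong", so a more severe static result is never downgraded by the mobile data.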

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-06 22:20:25 +02:00

"""B1 wiring — Mobile Consent-Reachability check + HTML block.

Fetches the homepage of the first submitted URL, runs the static
`evaluate_reachability` analysis on the footer, and renders the
result as an HTML block for the audit mail. When the consent-tester
mobile scan (POST /scan-mobile-reachability) returns data, its
tap-target and click results enrich the finding and can turn a pass
into a MEDIUM failure.

Only renders a block when the check FAILS; a passing site doesn't
need a block. The block is severity-colored and lists the specific
notes that triggered the finding (missing reopen anchor, new-tab
break, browser-deflection language).
"""
from __future__ import annotations

import html
import logging

import httpx

from compliance.services.consent_reachability_check import (
    evaluate_reachability,
)

from ._helpers import _update

logger = logging.getLogger(__name__)


async def run_b1(state: dict) -> None:
    """Run the reachability check + render HTML. Mutates state in place."""
    req = state["req"]
    check_id = state["check_id"]

    # Derive the homepage (scheme + host) from the first usable document URL.
    homepage_url = ""
    for d in req.documents:
        if d.url:
            from urllib.parse import urlparse
            p = urlparse(d.url)
            if p.scheme and p.netloc:
                homepage_url = f"{p.scheme}://{p.netloc}/"
                break
    if not homepage_url:
        return

    _update(check_id, "Mobile Consent-Reachability prüfen...", 95)

    # Try the new Playwright WebKit + iPhone scan first (Task #7).
    # On error only the static HTTP fetch below feeds the finding.
    mobile = None
    try:
        from ._constants import CONSENT_TESTER_URL
        async with httpx.AsyncClient(timeout=60.0) as c:
            r = await c.post(
                f"{CONSENT_TESTER_URL}/scan-mobile-reachability",
                json={"url": homepage_url},
            )
            if r.status_code == 200:
                mobile = r.json()
                logger.info(
                    "B1 Mobile-Playwright: has_anchor=%s tap=%s click_opens=%s",
                    mobile.get("has_anchor"),
                    mobile.get("tap_target_px"),
                    mobile.get("click_opens_cmp"),
                )
    except Exception as e:
        logger.info("B1 Mobile-Playwright fallback to static fetch: %s", e)

    # Static fetch with a mobile Safari User-Agent.
    page_html = None
    try:
        async with httpx.AsyncClient(
            timeout=20.0, follow_redirects=True,
            headers={"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 "
                     "like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) "
                     "Version/17.5 Mobile/15E148 Safari/604.1"},
        ) as c:
            r = await c.get(homepage_url)
            if r.status_code == 200:
                page_html = r.text
    except Exception as e:
        logger.warning("B1: homepage fetch failed: %s", e)

    if not page_html and not mobile:
        return

    finding = evaluate_reachability(page_html or "", homepage_url)

    # Enrich finding with mobile-playwright details when available
    if mobile and mobile.get("has_anchor"):
        finding["mobile_playwright"] = {
            "has_anchor": mobile.get("has_anchor"),
            "anchor_text": mobile.get("anchor_text"),
            "tap_target_px": mobile.get("tap_target_px"),
            "click_opens_cmp": mobile.get("click_opens_cmp"),
            "engine_meta": mobile.get("engine_meta"),
        }
        # Tap-target rule (Apple HIG / WCAG 2.5.5): ≥ 44 px each side
        tp = mobile.get("tap_target_px") or {}
        if tp and (tp.get("w", 0) < 44 or tp.get("h", 0) < 44):
            finding["notes"] = (finding.get("notes") or []) + [
                f"tap-target nur {tp.get('w')}×{tp.get('h')}px "
                "(Apple HIG / WCAG verlangen ≥ 44×44)",
            ]
            if finding.get("passed"):
                finding["passed"] = False
                finding["severity"] = "MEDIUM"
                finding["severity_reason"] = "misclassified"
        # If anchor exists in DOM but click doesn't open CMP, bump severity
        if mobile.get("has_anchor") and not mobile.get("click_opens_cmp"):
            finding["notes"] = (finding.get("notes") or []) + [
                "click auf Footer-Link öffnet CMP nicht direkt",
            ]
            if finding.get("severity_reason") != "factually_wrong":
                finding["severity"] = "MEDIUM"
                finding["severity_reason"] = "misclassified"
            finding["passed"] = False

    state["reachability_finding"] = finding
    state["reachability_html"] = _render_block(finding)
    logger.info(
        "B1 Reachability: passed=%s severity=%s reason=%s mobile=%s",
        finding["passed"], finding.get("severity"),
        finding.get("severity_reason"),
        bool(mobile),
    )


def _render_block(finding: dict) -> str:
    """Render the reachability finding as an audit-mail HTML block."""
    if finding["passed"]:
        return ""
    sev = (finding.get("severity") or "").upper()
    color = "#dc2626" if sev == "HIGH" else "#f59e0b"
    notes_html = "".join(
        f"<li>{html.escape(n)}</li>" for n in finding.get("notes") or []
    )
    anchor = finding.get("reopen_anchor") or {}
    anchor_html = ""
    if anchor:
        anchor_html = (
            "<p style='margin:8px 0 0;font-size:13px;color:#475569;'>"
            "Gefundener Footer-Link: "
            f"<code>{html.escape((anchor.get('text') or '')[:80])}</code> "
            f"→ <code>{html.escape((anchor.get('href') or '')[:120])}</code> "
            f"(target_class: {html.escape(anchor.get('target_class') or '')})"
            "</p>"
        )
    return (
        f"<div style='margin:24px 0;padding:16px;border-left:4px solid {color};"
        "background:#fef2f2;border-radius:4px;'>"
        f"<h2 style='margin:0 0 8px;color:{color};font-size:16px;'>"
        "COOKIE-CONSENT-UX-001 — Mobile Consent-Reachability</h2>"
        f"<p style='margin:0 0 8px;font-size:14px;'><strong>Severity:</strong> "
        f"{sev} ({html.escape(finding.get('severity_reason') or '')})</p>"
        "<p style='margin:0 0 4px;font-size:14px;'>"
        "Art. 7 Abs. 3 DSGVO: Widerruf muss so einfach wie Erteilung sein. "
        "Auf Mobile-Safari konnten wir folgendes Problem feststellen:</p>"
        f"<ul style='margin:8px 0 0 20px;font-size:14px;color:#7f1d1d;'>"
        f"{notes_html}</ul>"
        f"{anchor_html}"
        "</div>"
    )