breakpilot-compliance/consent-tester/services/mobile_reachability_scanner.py
Benjamin Admin 37093ff9e3 feat: Browser-Matrix C2 + B11 AI-Retention + Impressum-Specialist-Agent + B1 Mobile Playwright
Task #15 Stage 1.c-e - Browser-Matrix backend integration:
  - _phase_c2_browser_matrix.py: calls consent-tester /scan-matrix when
    env BROWSER_MATRIX=true and fills state["browser_matrix"],
    state["browser_aggregate"], and state["browser_matrix_html"]
  - V2 mail block: 🌐 Browser-Matrix table (Profile · Score ·
    Sub-scores PC/RR/BD · Rating) with worst-of header
  - Orchestrator calls run_phase_c2 after run_phase_c
  KNOWN: Stage 1.b (consent_scanner browser_profile param) remains
    deferred (file is in loc-exception, hook patch refused).
    The Stage 1.a shim runs in the consent-tester; all profiles currently
    run on Chromium, and real engine diversity arrives with 1.b.
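
The env gate and the worst-of header can be sketched roughly as follows. This is a minimal illustration, not the code in _phase_c2_browser_matrix.py: the helper names, the row fields ("profile", "score"), and the assumption that a higher score is better are all hypothetical.

```python
from __future__ import annotations

import os
from typing import Any, Mapping


def browser_matrix_enabled(env: Mapping[str, str] | None = None) -> bool:
    """Gate the phase on env BROWSER_MATRIX=true.

    Case-insensitive matching is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    return env.get("BROWSER_MATRIX", "").strip().lower() == "true"


def worst_of(matrix: list[dict[str, Any]]) -> dict[str, Any] | None:
    """Return the profile row with the lowest score, or None if empty.

    Assumes each row carries a numeric "score" where higher is better.
    """
    if not matrix:
        return None
    return min(matrix, key=lambda row: row["score"])
```

Keeping the gate as a pure function over a mapping makes it easy to test without touching the process environment.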

Task #17 TH-RETENTION-002 as B11 ai_retention_granularity_check:
  - Detects AI-provider context (vertex/openai/anthropic/etc.)
  - Within a ±800-character window: checks for ≥2 data categories from the
    standard list (text inputs/IP/device/session/error log/timestamp)
  - If there is 1 blanket retention period and ≥2 categories but no
    per-category differentiation → LOW
  - Smoke: Elli mock privacy policy (DSE) hits LOW "AI-Speicherdauer pauschal"
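
The rule above can be approximated in a few lines. This is a sketch under stated assumptions, not the shipped B11 check: every regex below is illustrative, and "no per-category differentiation" is approximated as "exactly one distinct retention period appears in the window".

```python
from __future__ import annotations

import re

# Illustrative patterns only; the real provider and category lists are not shown here.
_AI_PROVIDER_RE = re.compile(r"\b(vertex|openai|anthropic)\b", re.I)
_CATEGORY_PATTERNS = {
    "text_input": re.compile(r"texteingabe|text input", re.I),
    "ip": re.compile(r"\bip[- ]?adress", re.I),
    "device": re.compile(r"ger(?:ä|ae)t|device", re.I),
    "session": re.compile(r"session|sitzung", re.I),
    "error_log": re.compile(r"fehlerprotokoll|error log", re.I),
    "timestamp": re.compile(r"zeitstempel|timestamp", re.I),
}
_DURATION_RE = re.compile(r"\b(\d+)\s*(tag|monat|jahr|day|month|year)\w*", re.I)
WINDOW = 800  # characters on each side of the provider mention


def ai_retention_granularity_check(text: str) -> dict | None:
    """Return a LOW finding when one blanket period covers >=2 categories."""
    for m in _AI_PROVIDER_RE.finditer(text):
        window = text[max(0, m.start() - WINDOW): m.end() + WINDOW]
        categories = sorted(
            name for name, rx in _CATEGORY_PATTERNS.items() if rx.search(window)
        )
        # Normalize each period to (number, unit stem) so "30 Tage" == "30 Tagen".
        durations = {
            (int(d.group(1)), d.group(2).lower())
            for d in _DURATION_RE.finditer(window)
        }
        if len(categories) >= 2 and len(durations) == 1:
            return {
                "severity": "LOW",
                "title": "AI-Speicherdauer pauschal",
                "categories": categories,
            }
    return None
```

A policy that states separate periods per category yields two distinct durations in the window and is therefore not flagged by this approximation.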

Task #18 Specialist-Agents Phase-1 prototype:
  - compliance/services/specialist_agents/__init__.py with architecture docs
  - impressum_agent.py: 9 mandatory disclosures under § 5 TMG + § 1 DL-InfoV
    as a pattern registry (name, email, phone, commercial register (HR),
    VAT ID (USt-IdNr), authorized representative, supervisory authority,
    professional details, OS link)
  - business_scope-aware (OS link only for ecommerce, supervisory authority
    only for regulated_profession/financial/insurance)
  - Phase 1 is pattern-match only (no LLM) and demonstrates the
    interface. Phase 2 replaces the patterns with system prompt + KB.
  - Smoke: a minimal Impressum correctly triggers 4 findings
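
A scope-aware pattern registry of this kind might look like the sketch below. It is not the contents of impressum_agent.py: the Requirement class, the check_impressum helper, the regexes, and the reduced set of five entries are all assumptions made for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    key: str
    pattern: re.Pattern[str]
    scopes: frozenset[str] | None = None  # None = applies to every business scope


# Hypothetical subset of the registry; patterns are deliberately simple.
REGISTRY = (
    Requirement("email", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    Requirement("phone", re.compile(r"(?:tel|telefon|phone)\.?\s*:?\s*\+?\d", re.I)),
    Requirement("vat_id", re.compile(r"\bDE\s?\d{9}\b")),
    Requirement(
        "os_link",
        re.compile(r"ec\.europa\.eu/consumers/odr", re.I),
        frozenset({"ecommerce"}),
    ),
    Requirement(
        "supervisory_authority",
        re.compile(r"aufsichtsbeh(?:ö|oe)rde", re.I),
        frozenset({"regulated_profession", "financial", "insurance"}),
    ),
)


def check_impressum(text: str, business_scope: str) -> list[str]:
    """Return the keys of applicable requirements that have no match in text."""
    return [
        req.key
        for req in REGISTRY
        if (req.scopes is None or business_scope in req.scopes)
        and not req.pattern.search(text)
    ]
```

Because each entry is plain data, a Phase-2 agent could swap the pattern lookup for another matcher while keeping the same return shape.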

Task #7 B1 Playwright mobile verification:
  - consent-tester/services/mobile_reachability_scanner.py: real
    WebKit launch + p.devices['iPhone 15'] preset + de-DE locale +
    Europe/Berlin timezone
  - Footer anchor search via locator("footer >> text=/.../i") for
    17 reopen phrases
  - Tap-target bounding-box measurement (Apple HIG / WCAG ≥44x44)
  - Click behavior: DOM modal snapshot before/after, detects CMP open
  - Output: has_anchor, anchor_text, tap_target_px, click_opens_cmp,
    engine_meta, screenshot_b64 (footer crop when no anchor is found)
  - consent-tester/routes_mobile.py POST /scan-mobile-reachability
  - Backend _b1_wiring extended: calls the mobile endpoint first and
    falls back to the static HTTP fetch. Mobile data enriches
    finding.mobile_playwright, with a severity bump on
    tap-target<44 / click-doesnt-open-CMP.
  KNOWN: WebKit system libs were added to the Dockerfile (Stage 1.a
    commit) but only take effect after a CI/CD rebuild of the
    consent-tester. Until then, B1 falls back cleanly to the static fetch.
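
The enrichment and severity bump described for _b1_wiring could be sketched as below. This is not the backend code: the enrich_with_mobile name, the LOW/MEDIUM/HIGH ladder, and the choice to leave the finding untouched when the scanner reports an error or finds no anchor are assumptions of this example.

```python
from __future__ import annotations

from typing import Any

_SEVERITIES = ("LOW", "MEDIUM", "HIGH")  # assumed ordering, lowest first
MIN_TAP_PX = 44  # Apple HIG / WCAG target size


def enrich_with_mobile(finding: dict[str, Any], mobile: dict[str, Any]) -> dict[str, Any]:
    """Attach the mobile scan result and bump severity by one step if needed."""
    if "error" in mobile:
        # Scanner unavailable: keep the finding from the static fetch as-is.
        return dict(finding)
    enriched = {**finding, "mobile_playwright": mobile}
    if not mobile.get("has_anchor"):
        return enriched
    box = mobile.get("tap_target_px")
    too_small = box is not None and min(box["w"], box["h"]) < MIN_TAP_PX
    no_cmp = not mobile.get("click_opens_cmp", False)
    if too_small or no_cmp:
        idx = _SEVERITIES.index(enriched.get("severity", "LOW"))
        enriched["severity"] = _SEVERITIES[min(idx + 1, len(_SEVERITIES) - 1)]
    return enriched
```

The function returns a new dict rather than mutating its input, so the static-fetch finding stays available for comparison.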

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-06 22:20:25 +02:00

164 lines
5.8 KiB
Python

"""B1 Mobile Reachability: real Playwright scan on iPhone emulation.

Replaces the static HTTP fetch in the backend B1 wiring with a real
WebKit browser session using the `devices['iPhone 15']` preset. Measures:

- does the footer have a reopen anchor (text/aria-label/onclick)?
- tap-target size (boundingBox in px); Apple HIG 44pt = ≥44 px
- click behavior: does the CMP open directly? (DOM mutation +
  modal detection after 2s)

Output schema (interchangeable with the static logic in backend B1):
{
    "url": str,
    "has_anchor": bool,
    "anchor_text": str,
    "tap_target_px": {"w": int, "h": int} | None,
    "click_opens_cmp": bool,
    "modal_selector": str | None,
    "screenshot_b64": str (initial footer crop; set only when no anchor is found),
    "engine_meta": {"engine": "webkit", "device": "iPhone 15",
                    "user_agent": str, "viewport": str},
}
"""
from __future__ import annotations

import base64
import logging
from typing import Any

logger = logging.getLogger(__name__)

# Phrases for the footer anchor search (mirrors the backend service)
_REOPEN_PHRASES = (
    "cookie-einstellungen", "cookie einstellungen",
    "cookie-präferenzen", "cookie-praeferenzen",
    "cookie-einwilligung",
    "einwilligung verwalten",
    "consent manager", "consent settings", "consent-einstellungen",
    "datenschutz-einstellungen", "datenschutzeinstellungen",
    "cookies verwalten", "manage cookies", "manage preferences",
    "privacy settings", "privacy preferences",
    "tracking-einstellungen",
)
# JS snippet that counts elements which look like an open consent modal.
_MODAL_COUNT_JS = (
    "() => Array.from(document.querySelectorAll("
    "'[role=dialog],[aria-modal=true],.cmp-modal,"
    ".ot-sdk-container,#usercentrics-cmp')).length"
)


async def scan_mobile_reachability(url: str) -> dict[str, Any]:
    """Run Mobile-Safari emulation + footer reachability check."""
    try:
        from playwright.async_api import async_playwright
    except Exception as e:
        logger.warning("playwright not available: %s", e)
        return {"url": url, "error": "playwright missing"}

    async with async_playwright() as p:
        device_preset = p.devices.get("iPhone 15") or {}
        browser = await p.webkit.launch(headless=True)
        try:
            context = await browser.new_context(
                **device_preset,
                locale="de-DE",
                timezone_id="Europe/Berlin",
            )
            page = await context.new_page()
            try:
                await page.goto(url, wait_until="domcontentloaded",
                                timeout=30000)
            except Exception as e:
                return {"url": url, "error": f"goto failed: {e}"[:200]}
            # Give late-loading footers and CMP scripts a moment to render.
            try:
                await page.wait_for_timeout(1500)
            except Exception:
                pass

            ua = await page.evaluate("() => navigator.userAgent")
            viewport = page.viewport_size or {}
            engine_meta = {
                "engine": "webkit",
                "device": "iPhone 15",
                "user_agent": ua,
                "viewport": f"{viewport.get('width','?')}x{viewport.get('height','?')}",
            }

            # Find footer reopen anchor by text matching
            anchor_loc = None
            anchor_text = ""
            for phrase in _REOPEN_PHRASES:
                try:
                    candidate = page.locator(
                        f"footer >> text=/{phrase}/i"
                    ).first
                    if await candidate.count() > 0:
                        anchor_loc = candidate
                        anchor_text = phrase
                        break
                except Exception:
                    continue

            result: dict[str, Any] = {
                "url": url,
                "has_anchor": False,
                "anchor_text": "",
                "tap_target_px": None,
                "click_opens_cmp": False,
                "modal_selector": None,
                "engine_meta": engine_meta,
            }

            if anchor_loc is None:
                # Capture footer crop
                try:
                    footer = page.locator("footer").first
                    if await footer.count() > 0:
                        png = await footer.screenshot()
                        result["screenshot_b64"] = base64.b64encode(
                            png,
                        ).decode("ascii")[:120000]
                except Exception:
                    pass
                return result

            result["has_anchor"] = True
            result["anchor_text"] = anchor_text
            try:
                box = await anchor_loc.bounding_box()
                if box:
                    result["tap_target_px"] = {
                        "w": int(box["width"]), "h": int(box["height"]),
                    }
            except Exception:
                pass

            # DOM modal snapshot before the click
            try:
                before_modals = await page.evaluate(_MODAL_COUNT_JS)
            except Exception:
                before_modals = 0

            # Click, wait, then count modals again
            try:
                await anchor_loc.click(timeout=5000)
                await page.wait_for_timeout(2000)
                after_modals = await page.evaluate(_MODAL_COUNT_JS)
                if after_modals > before_modals:
                    result["click_opens_cmp"] = True
                    result["modal_selector"] = (
                        "[role=dialog] | [aria-modal=true] | cmp-modal"
                    )
            except Exception as e:
                logger.info("anchor click skipped: %s", e)
            return result
        finally:
            await browser.close()