Files
breakpilot-compliance/backend-compliance/compliance/services/mail_render_v2/_compose.py
T
Benjamin Admin 5c5d676f01
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / loc-budget (push) Failing after 11s
CI / python-lint (push) Has been skipped
CI / test-python-backend (push) Successful in 28s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 10s
CI / go-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
feat: Plan B + A + C — DSE-Versions-MCs + Legacy-URL + Multi-Version
Drei verwandte Mechanismen für DSE-Beweisbarkeit + URL-Hygiene.

Plan B + PDF — Versions-Beweisbarkeit-MCs (dse_checks.py):
  - mc-dse_version_date (HIGH) — sichtbares Stand/Versionsdatum
    Pflicht. 12 Regex-Pattern: "Stand: April 2024", ISO-Datum,
    "Letzte Aktualisierung", "Version 3.2", englische
    Varianten ("Last updated", "Effective date as of …").
    Norm: Art. 7 Abs. 1 DSGVO (Nachweisbarkeit Einwilligung).
  - mc-dse_version_proof (MED) — PDF-Download oder
    versionierte Archiv-URL. Reine HTML-DSE ohne Snapshot ist
    juristisch fragil. 8 Pattern: .pdf, Download-Hinweis,
    web.archive.org, /dse-vNNN.html.
    Norm: DSK-Orientierungshilfe 2024.

Plan A — Legacy-URL-Discovery (legacy_url_discovery.py + B20):
  Vier komplementäre Quellen:
    A.1 /sitemap.xml + Sub-Sitemaps parsen, auf compliance-
        relevante Slugs filtern
    A.2 archive.org/wayback/available pro Slug — wenn Wayback
        zeigt ≥18 Monate alten Snapshot UND Seite heute noch
        200 liefert UND nicht im Footer → Legacy-Verdacht
    A.3 Slug-Permutations: 6 doc_types × 6 Slug-Varianten ×
        5 Lang-Prefixe × 4 Brand-Parameter
    A.4 Banner-Modal-Links (über consent-tester Stufe 4 Tour)
  Mail-Block "🗂️ Legacy-URL-Inventar" mit Tabelle: URL · HTTP ·
  Wayback-Alter · Footer · Empfehlung (301/Offline/Behalten).
  Engine entscheidet NICHT was Legacy ist — präsentiert das
  Inventar, Kunde wählt.

  Real-World-Smoke Elli:
    /en/cookies → HTTP 200, Wayback 69 Mo alt, nicht im Footer
                  → "Legacy-Verdacht, 301 setzen"
    /en/impressum → HTTP 302, redirected → "behalten"

Plan C — Multi-Version-DSE-Analyse (multi_version_dse.py):
  Wenn ≥2 DSE-URLs reachable: pro Variante DSB-Name + Datum +
  Wortzahl + SHA-256 extrahieren, Inkonsistenzen flaggen
  (date_divergent, dsb_divergent, no_date_count).
  Mail-Block "📑 Mehrere DSE-Versionen erkannt" mit
  Vergleichstabelle + rotem Hinweis "Nur eine Version kann
  gültig sein". Beispiel Elli: /de/datenschutz (Mollstr-DSB,
  2022) vs /de/datenschutzerklaerung?brand=elli (Proliance,
  ohne Datum).

API-Response erweitert um legacy_url_inventory +
html_blocks.legacy_urls + multi_version_dse_html im V2-Layout.

ENV-Override: LEGACY_URL_DISABLED=1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-08 10:04:14 +02:00

95 lines
3.3 KiB
Python

"""Mail-V2 compose — single entrypoint that returns the full HTML.
Call `compose_v2(state)` from the email-dispatch phase when
`MAIL_RENDER_V2=true`. Default remains the legacy compose so we can
A/B in Mailpit.
"""
from __future__ import annotations
import os
from ._blocks import (
render_attachments,
render_caveats,
render_header,
render_per_doc,
render_per_theme,
render_sofortmassnahmen,
render_toc,
)
from ._blocks_findings import (
render_critical,
render_internal_reminders,
render_manual_review,
)
from ._vendor_cards import (
render_info_box_rechtsrahmen,
render_vendor_cards,
)
from ._legacy_wrappers import render_all_legacy
from ._style import page_close, page_open
def compose_v2(state: dict) -> str:
"""Build the full audit-mail HTML in the V2 layout."""
site = state.get("site_name") or ""
parts = [
page_open(site),
render_header(state),
render_info_box_rechtsrahmen(),
render_toc(state),
render_vendor_cards(
state.get("cmp_vendors") or [],
state.get("cookie_coherence_findings") or [],
),
render_critical(state),
render_manual_review(state),
render_internal_reminders(state),
render_sofortmassnahmen(state),
render_per_doc(state),
render_per_theme(state),
# B4 — Cross-Doc Vendor-Consistency (Elli Vertex↔Iadvize pattern)
state.get("vendor_consistency_html", ""),
# B5 — AI-Act Art. 50 Transparenzpflicht
state.get("ai_act_html", ""),
# B6/B7/B8/B9/B10 — DPO + Staleness + CMP + MultiEntity + Transfer
state.get("extra_findings_html", ""),
# B12 Chatbot-Cookie-Klassifikation
state.get("chatbot_cookie_html", ""),
# B13 Widerrufsbelehrung-Reachability (B2C-Pflicht)
state.get("widerruf_reach_html", ""),
# B14 Widersprüchliche Speicherdauer im selben Doc
state.get("retention_conflict_html", ""),
# B15 AI-Act Rechtsgrundlage (LLM-Vendor auf lit. f)
state.get("ai_legal_basis_html", ""),
# B16 Footer-Label-vs-URL-Slug-Drift (SEO / Bookmarks)
state.get("url_slug_drift_html", ""),
# B17 Audit-Walk-Video (Beweis-Aufzeichnung)
state.get("audit_walk_html", ""),
# B18 Impressum-Specialist-Agent (Pattern + LLM)
state.get("impressum_agent_html", ""),
# B19 Cookie-Coherence-Check (Salesforce-as-essential etc.)
state.get("cookie_coherence_html", ""),
# B20 Legacy-URL-Discovery + Multi-Version-DSE-Vergleich
state.get("multi_version_dse_html", ""),
state.get("legacy_url_html", ""),
# Browser-Matrix (Stage 1.c)
state.get("browser_matrix_html", ""),
# All legacy build_*_html() wrapped in V2 sections — preserves
# every information block from the old renderer (Exec Summary,
# Banner-Screenshot, VVT, Redundancy, Solutions, Diff, etc.)
render_all_legacy(state),
render_caveats(state),
render_attachments(state),
page_close(state.get("check_id", ""),
os.environ.get("BUILD_SHA", "unknown")),
]
return "".join(p for p in parts if p)
def is_v2_enabled() -> bool:
return os.environ.get("MAIL_RENDER_V2", "false").lower() in (
"true", "1", "yes", "on",
)