breakpilot-compliance/backend-compliance/compliance/services/mail_render_v2/_compose.py
Benjamin Admin e8ff75cbfe feat: Backlog 1-5 — soft-hints, chatbot-discovery, API-payload, LLM-Agent
Five backlog items from the multi-site briefing, delivered in one sprint:

1. B13 B2C soft hints: insurance, tariff, and booking markers
   _B2C_WEAK is extended with "Reiseversicherung", "Tarifrechner",
   "Online-Antrag", "Flug buchen", "Stromtarif", etc.
   Catches the Allianz travel chatbot (previously a false negative).

2. Chatbot policy discovery (chatbot_policy_discovery.py)
   Probes 14 standard slugs (privacypolicychatbot, chatbot-datenschutz,
   ai-policy, ki-datenschutz, ...) × 5 language prefixes on every
   submitted origin. Successful findings of more than 300 words are
   merged into doc_texts['dse']. The audit trail is kept in
   doc_entries[dse].chatbot_policy_sources.
   Closes the Westfield-iAdvize gap.

3. API response payload extended
   phase_f_persist.response now also carries extra_findings,
   audit_walk, and html_blocks. The B-wiring output (B1, B3-B18) is
   no longer hidden in the mail HTML only; external callers see every
   finding. The schema change is additive, so legacy clients ignore
   the new fields.

4. Plausibility LLM empty-response fix
   Resilience strategy A→B→C→D:
   A) format='json' (strict, default)
   B) format='' (loose, _try_extract_json with support for ```json
      fences and prose-wrapped JSON)
   C) Split-batch recursion (already present)
   D) Give up and return an empty dict (callers treat it as skipped)
   In addition, _post_llm() is an isolated LLM-call helper that
   catches network errors.

5. Specialist agents phase 2 LLM (MVP): Impressum agent
   impressum_agent_llm.py: qwen3:30b-a3b with a § 5 TMG system prompt
   and business_scope hints from profile_dict. The output uses the
   same schema as the pattern agent, so the merge needs no API break.
   _b18_wiring.py orchestrates both agents, deduplicates by
   field_id, and renders a purple V2 block with KB/LLM tags per
   finding. Dedup is pattern-first (deterministic and stable).
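
The pattern-first dedup from item 5 could look roughly like the sketch below. It is an illustration only, not the code in _b18_wiring.py: the function name `merge_findings`, the `source` key, and the mapping of the KB tag to the pattern agent are assumptions; only `field_id` and the KB/LLM tags come from the commit message.

```python
def merge_findings(pattern_findings, llm_findings):
    """Merge two finding lists, keeping the first finding seen per field_id.

    Hypothetical helper: findings are assumed to be dicts with a "field_id" key.
    """
    merged = {}
    # Pattern findings are iterated first, so they win on a field_id collision.
    # Assumption: "KB" tags the pattern agent, "LLM" tags the LLM agent.
    for source, findings in (("KB", pattern_findings), ("LLM", llm_findings)):
        for finding in findings:
            field_id = finding["field_id"]
            if field_id not in merged:
                merged[field_id] = {**finding, "source": source}
    # Dicts preserve insertion order, so the result order is stable across runs.
    return list(merged.values())
```

Because the deterministic agent is consulted first, a non-deterministic LLM answer can add findings for new fields but cannot displace a pattern result for the same field.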

Tests: 107/107 green (7 test suites + chatbot-discovery + b18).
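
The slug × language-prefix probing from item 2 can be pictured with the sketch below. It only builds candidate URLs and applies the word-count threshold; no HTTP request is made. The slug list is the subset named in the commit message, while the prefix values, the function names `candidate_urls` and `is_substantial`, and the whitespace-based word count are assumptions rather than the contents of chatbot_policy_discovery.py.

```python
from itertools import product
from urllib.parse import urljoin

# Subset of the 14 slugs; only these four are named in the commit message.
SLUGS = ["privacypolicychatbot", "chatbot-datenschutz", "ai-policy", "ki-datenschutz"]
# Assumed prefixes; the commit only states that there are five of them.
LANG_PREFIXES = ["", "de", "en"]


def candidate_urls(origin):
    """Return every prefix/slug combination as an absolute URL on `origin`."""
    base = origin.rstrip("/") + "/"
    urls = []
    for prefix, slug in product(LANG_PREFIXES, SLUGS):
        path = f"{prefix}/{slug}" if prefix else slug
        urls.append(urljoin(base, path))
    return urls


def is_substantial(text, min_words=300):
    """Accept a page only if it has more than `min_words` whitespace-separated words."""
    return len(text.split()) > min_words
```

With the full lists from the commit, this would produce 14 × 5 = 70 candidates per submitted origin.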

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-07 18:41:54 +02:00
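
Steps A, B, and D of the resilience strategy in item 4 of the commit message can be sketched as follows. This is a hedged illustration, not the actual `_try_extract_json` or `_post_llm`: the names `extract_json_loose` and `ask_with_fallback`, the `post(prompt, fmt)` callable, and its convention of returning an empty string on failure are all assumptions. Step C (split-batch recursion) is left out.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this listing readable
_FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json_loose(raw):
    """Return a dict parsed from raw LLM text, or {} if nothing parses."""
    candidates = [raw]
    # Contents of any fenced block, with or without a "json" language tag.
    candidates += _FENCE_RE.findall(raw)
    # Prose-wrapped JSON: the span from the first "{" to the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if 0 <= start < end:
        candidates.append(raw[start:end + 1])
    for candidate in candidates:
        try:
            parsed = json.loads(candidate.strip())
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return {}


def ask_with_fallback(post, prompt):
    """Try strict mode (A), then loose mode (B); give up with {} (D)."""
    for fmt in ("json", ""):
        result = extract_json_loose(post(prompt, fmt))
        if result:
            return result
    return {}  # callers treat an empty dict as "skipped"
```

Returning an empty dict instead of raising keeps the caller's control flow simple, at the cost of not distinguishing "model said nothing" from "model returned an empty object".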


"""Mail-V2 compose: single entrypoint that returns the full HTML.

Call `compose_v2(state)` from the email-dispatch phase when
`MAIL_RENDER_V2=true`. Default remains the legacy compose so we can
A/B in Mailpit.
"""
from __future__ import annotations

import os

from ._blocks import (
    render_attachments,
    render_caveats,
    render_header,
    render_per_doc,
    render_per_theme,
    render_sofortmassnahmen,
    render_toc,
)
from ._blocks_findings import (
    render_critical,
    render_internal_reminders,
    render_manual_review,
)
from ._legacy_wrappers import render_all_legacy
from ._style import page_close, page_open


def compose_v2(state: dict) -> str:
    """Build the full audit-mail HTML in the V2 layout."""
    site = state.get("site_name") or ""
    parts = [
        page_open(site),
        render_header(state),
        render_toc(state),
        render_critical(state),
        render_manual_review(state),
        render_internal_reminders(state),
        render_sofortmassnahmen(state),
        render_per_doc(state),
        render_per_theme(state),
        # B4: cross-doc vendor consistency (Elli Vertex↔Iadvize pattern)
        state.get("vendor_consistency_html", ""),
        # B5: AI Act Art. 50 transparency obligation
        state.get("ai_act_html", ""),
        # B6/B7/B8/B9/B10: DPO + staleness + CMP + multi-entity + transfer
        state.get("extra_findings_html", ""),
        # B12: chatbot cookie classification
        state.get("chatbot_cookie_html", ""),
        # B13: reachability of the withdrawal notice (Widerrufsbelehrung),
        # which is mandatory for B2C
        state.get("widerruf_reach_html", ""),
        # B14: conflicting retention periods within the same document
        state.get("retention_conflict_html", ""),
        # B15: AI Act legal basis (LLM vendor relying on lit. f)
        state.get("ai_legal_basis_html", ""),
        # B16: footer label vs. URL slug drift (SEO / bookmarks)
        state.get("url_slug_drift_html", ""),
        # B17: audit-walk video (evidence recording)
        state.get("audit_walk_html", ""),
        # B18: Impressum specialist agent (pattern + LLM)
        state.get("impressum_agent_html", ""),
        # Browser matrix (stage 1.c)
        state.get("browser_matrix_html", ""),
        # All legacy build_*_html() wrapped in V2 sections. This preserves
        # every information block from the old renderer (Exec Summary,
        # Banner-Screenshot, VVT, Redundancy, Solutions, Diff, etc.)
        render_all_legacy(state),
        render_caveats(state),
        render_attachments(state),
        page_close(
            state.get("check_id", ""),
            os.environ.get("BUILD_SHA", "unknown"),
        ),
    ]
    # Skip empty blocks so optional sections leave no trace in the output.
    return "".join(p for p in parts if p)


def is_v2_enabled() -> bool:
    """Return True when MAIL_RENDER_V2 is set to a truthy value."""
    return os.environ.get("MAIL_RENDER_V2", "false").lower() in (
        "true", "1", "yes", "on",
    )