6ed30dae5b
Now that all 1874 MCs run per check (Task #30 cap removal), the report was about to drown in noise. This commit adds the full aggregation / persistence / drill-down stack so each MC is actionable, not just counted.

A1 mc_scorecard.py (new):
    build_scorecard(checks) -> per-regulation PASS/FAIL/SKIP + severity
    top_fails(checks, n) -> N most severe failed MCs
    full_audit_records(...) -> flat rows ready for sidecar SQLite

A2 Email rendering: agent_doc_check_scorecard.py (new) builds an HTML scorecard table (regulation × passed/failed/HIGH/MEDIUM/score) shown at the top of the email. agent_doc_check_report._render_document now collapses the 500-MC L2 forest into an 'X/Y bestanden (Z Fail)' summary plus a top-10 fails block per doc; the old verbose render is gone.

A3 compliance_audit_log.py (new): sidecar SQLite at /data/compliance_audits.db (separate from the compliance Postgres schema to comply with the no-new-migrations rule in CLAUDE.md):
    check_runs(check_id, ts, tenant_id, site_name, base_domain, doc_count, scorecard json, vvt_summary json)
    mc_results(check_id, doc_type, mc_id, label, passed, skipped, severity, regulation, matched_text, hint)
The route persists every run after the email is sent. docker-compose.yml adds the compliance-audit volume + env.

A4 backfill_mc_regulation_llm.py (new): Qwen-tagged backfill for the 1636 MCs the regex pass couldn't classify. Batches of 25, format=json, output constrained to the canonical regulation list. Run manually:
    docker exec bp-compliance-backend python3 \
        /app/scripts/backfill_mc_regulation_llm.py [--dry-run]

A5 Admin audit tab: GET /api/compliance/agent/audit/<check_id>, proxied via /api/sdk/v1/agent/audit/<id>. New page /sdk/agent/audit/[checkId] renders the scorecard + a filterable MC table (status / doc_type / regulation, expandable rows with matched_text + hint). ComplianceCheckTab now shows a 'Voll-Audit oeffnen' ("open full audit") link.

A6 Trend per tenant: GET /api/compliance/agent/audit/tenant/<id> returns recent runs. The email scorecard shows per-regulation delta badges ('(+12%)', '(-3%)') compared with the previous run for the same tenant + base_domain. The lookup is one SQLite query.

Plumbing:
    rag_document_checker.py: SELECT now includes 'article'; MC results carry 'regulation' + 'article' through to CheckItem.
    agent_doc_check_routes.CheckItem schema gains regulation + article fields (defaults '') so old clients still parse.
    agent_compliance_check_routes: response gains 'check_id' so the frontend can build the audit link.
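A minimal sketch of how the A3 sidecar tables and the A6 "previous run" lookup could fit together. It is not taken from compliance_audit_log.py. Only the table and column names come from the commit message. The column types, the primary key, the index, the helper names previous_scorecard and delta_badge, the {regulation: score} shape of the scorecard JSON, and the sample rows are all assumptions made for illustration.

```python
import json
import sqlite3

# Column names follow the commit message; types, keys and the index are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS check_runs (
    check_id    TEXT PRIMARY KEY,
    ts          TEXT NOT NULL,
    tenant_id   TEXT NOT NULL,
    site_name   TEXT,
    base_domain TEXT,
    doc_count   INTEGER,
    scorecard   TEXT,
    vvt_summary TEXT
);
CREATE TABLE IF NOT EXISTS mc_results (
    check_id TEXT NOT NULL REFERENCES check_runs(check_id),
    doc_type TEXT, mc_id TEXT, label TEXT,
    passed INTEGER, skipped INTEGER,
    severity TEXT, regulation TEXT, matched_text TEXT, hint TEXT
);
CREATE INDEX IF NOT EXISTS idx_runs_tenant
    ON check_runs (tenant_id, base_domain, ts);
"""


def previous_scorecard(conn, tenant_id, base_domain, before_ts):
    """Hypothetical helper: scorecard of the latest earlier run, or None."""
    row = conn.execute(
        "SELECT scorecard FROM check_runs "
        "WHERE tenant_id = ? AND base_domain = ? AND ts < ? "
        "ORDER BY ts DESC LIMIT 1",
        (tenant_id, base_domain, before_ts),
    ).fetchone()
    return json.loads(row[0]) if row else None


def delta_badge(current_pct, previous_pct):
    """Hypothetical helper: '(+12%)' style badge; '' without baseline or change."""
    if previous_pct is None or current_pct == previous_pct:
        return ""
    return f"({current_pct - previous_pct:+d}%)"


# Made-up sample data: two runs for the same tenant + base_domain.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for check_id, ts, score in [
    ("run-1", "2025-01-01T10:00:00", 60),
    ("run-2", "2025-01-08T10:00:00", 72),
]:
    conn.execute(
        "INSERT INTO check_runs (check_id, ts, tenant_id, base_domain, scorecard) "
        "VALUES (?, ?, ?, ?, ?)",
        (check_id, ts, "tenant-a", "example.com", json.dumps({"DSGVO": score})),
    )

prev = previous_scorecard(conn, "tenant-a", "example.com", "2025-01-08T10:00:00")
badge = delta_badge(72, prev["DSGVO"])
print(badge)
```

Storing ts as an ISO-8601 string keeps the "latest earlier run" lookup a single ordered query, which matches the one-query claim in A6; the integer percentages are also an assumption, since the `+d` format would reject floats.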
389 lines
16 KiB
Python
"""
HTML email report builder for document checks.

Generates a styled HTML report similar to the frontend ChecklistView,
including L1/L2 check hierarchy, progress bars, and actionable hints.
"""

from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from .agent_doc_check_routes import CheckItem, DocCheckResult


def _bar(pct: int, color: str) -> str:
    bg = {"green": "#22c55e", "yellow": "#eab308", "red": "#ef4444", "blue": "#60a5fa"}
    c = bg.get(color, "#60a5fa")
    return (
        f'<div style="display:inline-block;width:120px;height:8px;background:#e5e7eb;'
        f'border-radius:4px;overflow:hidden;vertical-align:middle;margin-right:8px">'
        f'<div style="width:{pct}%;height:100%;background:{c};border-radius:4px"></div>'
        f'</div><span style="font-size:13px;font-weight:600;color:{c}">{pct}%</span>'
    )


def _icon(passed: bool, skipped: bool = False) -> str:
    if skipped:
        return '<span style="color:#d1d5db">—</span>'
    if passed:
        return '<span style="color:#22c55e;font-weight:bold">✓</span>'
    return '<span style="color:#ef4444;font-weight:bold">✗</span>'


def _hint_box(hint: str) -> str:
    return (
        f'<div style="font-size:11px;color:#dc2626;margin:2px 0 4px 20px;'
        f'padding:4px 8px;background:#fef2f2;border-radius:4px;'
        f'border-left:3px solid #fca5a5">{hint}</div>'
    )


def build_management_summary(results: list[DocCheckResult]) -> str:
    """Build a plain-language management summary for the CEO/GF.

    No legal jargon: concrete actions that can be delegated to staff,
    lawyers, or the DPO.
    """
    ok = [r for r in results if r.completeness_pct == 100 and not r.error]
    fixable = [r for r in results if 0 < r.completeness_pct < 100 and not r.error]
    critical = [r for r in results if r.completeness_pct == 0 and not r.error]
    errors = [r for r in results if r.error]

    html = [
        '<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
        'max-width:700px;margin:0 auto 20px;padding:16px 20px;'
        'background:#f8fafc;border:1px solid #e2e8f0;border-radius:12px">',
        '<h2 style="margin:0 0 12px;font-size:18px;color:#1e293b">'
        'Zusammenfassung fuer die Geschaeftsfuehrung</h2>',
    ]

    # Overall status
    total = len(results) - len(errors)
    if total == 0:
        html.append('<p>Keine Dokumente geprueft.</p></div>')
        return "\n".join(html)

    if len(ok) == total:
        html.append(
            '<p style="color:#16a34a;font-weight:600;font-size:15px">'
            'Alle Dokumente sind vollstaendig. Keine dringenden Massnahmen noetig.</p>'
        )
    else:
        html.append(
            f'<p style="font-size:14px;color:#475569">'
            f'{len(ok)} von {total} Dokumenten sind vollstaendig. '
            f'{len(fixable)} brauchen Korrekturen'
            f'{f", {len(critical)} fehlen oder sind unbrauchbar" if critical else ""}.</p>'
        )

    # Concrete actions
    actions: list[str] = []
    for r in results:
        if r.error or r.completeness_pct == 100:
            continue
        failed_checks = [
            c for c in r.checks
            if c.level == 1 and not c.passed and not c.skipped
            and c.severity != "INFO"
        ]
        for c in failed_checks[:3]:  # Max 3 per document
            action = _check_to_action(r.label, c.label, c.hint)
            if action:
                actions.append(action)

    if actions:
        html.append(
            '<h3 style="font-size:14px;color:#334155;margin:16px 0 8px">'
            'Konkrete Aufgaben:</h3>'
            '<ol style="font-size:13px;color:#475569;padding-left:20px;margin:0">'
        )
        for a in actions[:10]:  # Max 10 actions
            html.append(f'<li style="margin-bottom:6px">{a}</li>')
        html.append('</ol>')

    html.append('</div>')
    return "\n".join(html)


def _check_to_action(doc_label: str, check_label: str, hint: str) -> str:
    """Convert a failed check into a plain-language action item."""
    # Map technical check labels to business-language actions
    label_lower = check_label.lower()

    if "datenschutzbeauftragter" in label_lower or "dsb" in label_lower:
        return (f"<strong>{doc_label}:</strong> Ihren Datenschutzbeauftragten "
                f"mit Kontaktdaten erwaehnen. Pflicht ab 20 Mitarbeitern.")

    if "beschwerderecht" in label_lower or "art. 77" in label_lower:
        return (f"<strong>{doc_label}:</strong> Hinweis auf das Beschwerderecht "
                f"bei der Aufsichtsbehoerde ergaenzen (Name + Kontakt der Behoerde).")

    if "betroffenenrechte" in label_lower:
        return (f"<strong>{doc_label}:</strong> Alle Betroffenenrechte "
                f"(Auskunft, Berichtigung, Loeschung, etc.) einzeln auffuehren.")

    if "verantwortlicher" in label_lower:
        return (f"<strong>{doc_label}:</strong> Vollstaendige Firmenbezeichnung "
                f"mit Rechtsform, Adresse, E-Mail und Telefon eintragen.")

    if "interessenabwaegung" in label_lower:
        return (f"<strong>{doc_label}:</strong> Bei 'berechtigtem Interesse' "
                f"die Abwaegung dokumentieren. Aufgabe fuer den DSB/Rechtsanwalt.")

    if "widerrufsbelehrung" in label_lower or "widerruf" in label_lower:
        return (f"<strong>{doc_label}:</strong> Gesetzliche Widerrufsbelehrung "
                f"mit 14-Tage-Frist und Musterformular bereitstellen.")

    if "loeschkonzept" in label_lower:
        return (f"<strong>{doc_label}:</strong> Loeschfristen und -prozess "
                f"dokumentieren. Aufgabe fuer den DSB.")

    if "profiling" in label_lower or "art. 22" in label_lower:
        return (f"<strong>{doc_label}:</strong> Hinweis ergaenzen ob "
                f"automatisierte Entscheidungen stattfinden oder nicht.")

    if "nicht im eingereichten text" in label_lower:
        return (f"<strong>{doc_label}:</strong> Das eingereichte Dokument "
                f"enthaelt nicht den erwarteten Inhalt. Bitte korrekte URL pruefen.")

    # Generic fallback
    if hint and len(hint) < 150:
        return f"<strong>{doc_label}:</strong> {hint[:120]}"

    return f"<strong>{doc_label}:</strong> '{check_label}' muss ergaenzt werden."


def build_html_report(
    results: list[DocCheckResult],
    cookie_result: dict | None,
) -> str:
    """Build HTML email report styled like the frontend."""
    ok_count = sum(1 for r in results if r.completeness_pct == 100)
    html = [
        '<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
        'max-width:700px;margin:0 auto">',
        '<h2 style="margin-bottom:4px">Dokumenten-Pruefung</h2>',
        f'<p style="color:#6b7280;margin-top:0">'
        f'{len(results)} Dokumente, {ok_count} vollstaendig</p>',
    ]

    for r in results:
        _render_document(html, r)

    if cookie_result:
        _render_cookie_banner(html, cookie_result)

    html.append('</div>')
    return "\n".join(html)


def _render_document(html: list[str], r: DocCheckResult) -> None:
    pct = r.completeness_pct
    cpct = r.correctness_pct
    bar_color = "green" if pct >= 80 else "yellow" if pct >= 50 else "red"
    status_label = "OK" if pct == 100 else "LUECKENHAFT" if pct >= 50 else "MANGELHAFT"
    is_missing = bool(r.error) and (
        r.error.startswith("Nicht eingereicht")
        or r.error.startswith("Auf der Website nicht gefunden")
    )
    if is_missing:
        status_label = ("NICHT GEFUNDEN"
                        if r.error.startswith("Auf der Website")
                        else "NICHT EINGEREICHT")
    elif r.error:
        status_label = "FEHLER"

    l1_checks = [c for c in r.checks if c.level == 1]
    l2_by_parent: dict[str, list[CheckItem]] = {}
    for c in r.checks:
        if c.level == 2 and c.parent:
            l2_by_parent.setdefault(c.parent, []).append(c)

    l1_passed = sum(1 for c in l1_checks if c.passed)
    l2_active = [c for c in r.checks if c.level == 2 and not c.skipped]
    l2_passed = sum(1 for c in l2_active if c.passed)

    # Header
    html.append(
        f'<div style="border:1px solid #e5e7eb;border-radius:8px;margin-bottom:12px;overflow:hidden">'
        f'<div style="padding:12px 16px;background:#f9fafb">'
        f'<div style="display:flex;justify-content:space-between;align-items:center"><div>'
        f'<span style="font-size:11px;background:#f3f4f6;padding:2px 8px;border-radius:4px;'
        f'color:#4b5563;font-weight:500;margin-right:8px">{status_label}</span>'
        f'<strong style="font-size:14px">{r.label}</strong>'
        f'<div style="font-size:12px;color:#6b7280;margin-top:2px">'
        f'{l1_passed}/{len(l1_checks)} Pflichtangaben'
    )
    if l2_active:
        html.append(f', {l2_passed}/{len(l2_active)} Detailpruefungen')
    html.append(f'</div></div><div style="text-align:right">{_bar(pct, bar_color)}')
    if cpct and l2_active:
        html.append(f'<br>{_bar(cpct, "blue")}')
    html.append('</div></div></div>')

    # Body
    if is_missing:
        body_msg = (
            "Wir haben die Hauptseite durchsucht, aber kein Dokument fuer "
            "diese Pflichtangabe gefunden. Pruefen Sie, ob es auf der "
            "Website existiert und tragen Sie die URL manuell nach."
            if r.error.startswith("Auf der Website")
            else "Keine URL oder Text fuer dieses Dokument angegeben. "
            "Tragen Sie die Quelle im Compliance-Check Formular nach, "
            "um diese Pflichtangabe zu pruefen."
        )
        html.append(
            '<div style="padding:12px 16px;color:#6b7280;font-size:12px;'
            'background:#fafafa;border-top:1px solid #f3f4f6">'
            + body_msg + '</div>'
        )
    elif r.error:
        html.append(f'<div style="padding:12px 16px;color:#991b1b">{r.error}</div>')
    else:
        html.append('<div style="padding:8px 16px 12px">')
        for c in l1_checks:
            _render_l1_check(html, c, l2_by_parent.get(c.id, []))

        # Master-Control aggregation: with 1874 MCs evaluated per run,
        # rendering every L2 check inline produces ~600 rows per doc and
        # makes the email unreadable. Show only top-N severe fails plus a
        # one-line summary. Full results live in /sdk/agent/audit/<id>.
        from compliance.api.agent_doc_check_scorecard import build_top_fails_html
        from compliance.services.mc_scorecard import top_fails

        mc_results = [
            {"id": c.id, "label": c.label, "passed": c.passed,
             "severity": c.severity, "skipped": c.skipped, "hint": c.hint,
             "regulation": c.regulation}
            for c in r.checks
            if c.id.startswith("mc-")
        ]
        if mc_results:
            n_total = len(mc_results)
            n_passed = sum(1 for x in mc_results if x["passed"])
            n_skipped = sum(1 for x in mc_results if x["skipped"])
            n_failed = n_total - n_passed - n_skipped
            html.append(
                f'<div style="margin-top:12px;padding-top:8px;'
                f'border-top:1px solid #e5e7eb;font-size:11px;color:#475569">'
                f'<strong>Master-Controls:</strong> {n_passed}/'
                f'{n_total - n_skipped} bestanden '
                f'<span style="color:#dc2626">({n_failed} Fail)</span>'
                f'{f" + {n_skipped} nicht anwendbar" if n_skipped else ""}.'
                f'</div>'
            )
            top = top_fails(mc_results, n=10)
            html.append(build_top_fails_html(top, r.label))

        if r.word_count:
            html.append(
                f'<div style="font-size:11px;color:#9ca3af;margin-top:8px;'
                f'padding-top:8px;border-top:1px solid #e5e7eb">'
                f'{r.word_count} Woerter analysiert</div>'
            )
        html.append('</div>')
    html.append('</div>')


def _render_l1_check(
    html: list[str], c: CheckItem, children: list[CheckItem],
) -> None:
    l2_sub = [ch for ch in children if not ch.skipped]
    l2_passed = sum(1 for ch in l2_sub if ch.passed)

    style = "color:#991b1b;font-weight:600" if not c.passed else "color:#374151"
    html.append(
        f'<div style="padding:3px 0">{_icon(c.passed)} '
        f'<span style="font-size:13px;{style}">{c.label}</span>'
    )
    if l2_sub:
        html.append(f' <span style="color:#9ca3af;font-size:11px">({l2_passed}/{len(l2_sub)})</span>')
    if not c.passed and c.hint:
        html.append(_hint_box(c.hint))
    html.append('</div>')

    for ch in children:
        if ch.skipped:
            continue
        _render_l2_check(html, ch)


def _render_l2_check(html: list[str], ch: CheckItem) -> None:
    style = "color:#dc2626;font-weight:500" if not ch.passed else "color:#6b7280"
    html.append(
        f'<div style="padding:2px 0 2px 24px;border-left:2px solid #e5e7eb;margin-left:8px">'
        f'{_icon(ch.passed)} '
        f'<span style="font-size:12px;{style}">{ch.label}</span>'
    )
    if ch.passed and ch.matched_text:
        html.append(
            f'<div style="font-size:10px;color:#9ca3af;font-family:monospace;'
            f'margin-left:20px;overflow:hidden;text-overflow:ellipsis;'
            f'white-space:nowrap">"...{ch.matched_text[:80]}..."</div>'
        )
    if not ch.passed and ch.hint:
        html.append(_hint_box(ch.hint))
    html.append('</div>')


def _render_cookie_banner(html: list[str], cookie_result: dict) -> None:
    html.append(
        '<div style="border:1px solid #e5e7eb;border-radius:8px;'
        'padding:12px 16px;margin-bottom:12px">'
        '<strong>Cookie-Banner Pruefung</strong><br>'
        f'Banner erkannt: {cookie_result.get("banner_detected", False)}<br>'
        f'Anbieter: {cookie_result.get("banner_provider", "unbekannt")}'
    )
    violations = cookie_result.get("banner_checks", {}).get("violations", [])
    if violations:
        for v in violations[:10]:
            html.append(f'<br>{_icon(False)} {v.get("text", "")[:80]}')
    else:
        html.append('<br><span style="color:#22c55e">Keine Verstoesse erkannt.</span>')
    html.append('</div>')


# Re-export the helpers extracted to agent_doc_check_extras.py so existing
# callers that did `from .agent_doc_check_report import build_scanned_urls_html`
# keep working.
from .agent_doc_check_extras import (  # noqa: E402,F401
    build_provider_list_html,
    build_scanned_urls_html,
)


def build_profile_html(profile) -> str:
    """Build a small HTML block summarizing the detected business profile."""
    service_tags = ", ".join(profile.detected_services[:10]) or "keine erkannt"
    flags = []
    if profile.has_online_shop:
        flags.append("Online-Shop")
    if profile.has_editorial_content:
        flags.append("Redaktionelle Inhalte")
    if profile.is_regulated_profession:
        flags.append(f"Regulierter Beruf ({profile.regulated_profession_type})")
    if profile.needs_odr:
        flags.append("ODR-pflichtig")
    flags_str = ", ".join(flags) or "keine"

    return (
        '<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
        'max-width:700px;margin:0 auto 16px;padding:12px 16px;'
        'background:#f0f9ff;border:1px solid #bae6fd;border-radius:8px">'
        '<h3 style="margin:0 0 8px;font-size:14px;color:#0369a1">'
        'Erkanntes Geschaeftsmodell</h3>'
        '<table style="font-size:13px;color:#374151">'
        f'<tr><td style="padding:2px 12px 2px 0;color:#6b7280">Typ:</td>'
        f'<td><strong>{profile.business_type.upper()}</strong>'
        f' ({profile.industry})</td></tr>'
        f'<tr><td style="padding:2px 12px 2px 0;color:#6b7280">Merkmale:</td>'
        f'<td>{flags_str}</td></tr>'
        f'<tr><td style="padding:2px 12px 2px 0;color:#6b7280">Dienste:</td>'
        f'<td>{service_tags}</td></tr>'
        f'<tr><td style="padding:2px 12px 2px 0;color:#6b7280">Konfidenz:</td>'
        f'<td>{int(profile.confidence * 100)}%</td></tr>'
        '</table></div>'
    )