feat(compliance-check): exec-summary + voll-audit + TDM-respect + cookie-KB-extended + saving-scan-funnel
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m43s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P1: Exec summary at the top of the email report (4 KPIs + 2 CTAs, dark gradient)
P3: no_direct_sales flag for OEM configurator sites; AGB/Widerruf/Nutzungsbedingungen
are shown as "NICHT ANWENDBAR" (not applicable, grey) instead of "NICHT GEFUNDEN" (not found, red)
P5: Full-audit unification: all findings (MC + mandatory disclosures + vendor +
redundancy) go into /data/compliance_audits.db.unified_findings; new
/api/compliance/agent/findings/<id> endpoint + FindingsTab in the audit UI
with filters + CSV export
P7: Crawl hardening: TDM reservation check (robots.txt / ai.txt / header /
meta) before every run with a 24h cache; HeadlessChrome UA (company not yet
founded; switch via the BREAKPILOT_BRANDED_UA env variable); per-domain
rate limit of 1 req/s + max 2 concurrent
P2: Cookie knowledge DB extended additively (35 -> 74 cookies): Adobe, Meta,
Microsoft, LinkedIn, TikTok, HubSpot, Marketo, Salesforce, Hotjar,
FullStory, Mouseflow, Intercom, Drift, Zendesk, Cloudflare, Stripe,
OneTrust/Cookiebot/Usercentrics, Matomo, Pinterest, Snapchat, X/Twitter,
YouTube, Vimeo, Klaviyo, Mailchimp, Mixpanel, Segment, Amplitude,
Optimizely, Datadog; wiring into cookie_function_classifier yields a
compliance_risk label (kritisch/hoch/mittel/gering) per vendor
A: k-anonymity helper (benchmark_k_anonymity) in preparation for P6
B: Cross-tenant domain assertion in the /findings endpoint (expected_domain
query param -> 403 on mismatch)
C: Saving-scan funnel: /api/compliance/agent/saving-scan/start with
validation + 24h rate limit per domain + lead persistence in
saving_scan_leads + auto-discovery via _run_compliance_check; 6 tests
D: Risk badge in the email vendor row
Legal guardrails (memory feedback_oem_data_legal.md): only our own
brief assessments + source pointers, no 1:1 copies of third-party CMP texts.
TDM opt-out is respected per § 44b UrhG. NO schema changes; everything lives
in sidecar SQLite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -166,6 +166,33 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
     except Exception:
         pass
 
+    # P7: TDM reservation check for the base domain (§ 44b UrhG).
+    # If reserved/denied: end the run immediately, no crawl.
+    try:
+        from compliance.services.tdm_reservation_check import (
+            check_tdm_reservation, is_crawl_allowed,
+        )
+        first_url = next(
+            (d.url for d in req.documents if d.url), "",
+        )
+        if first_url:
+            tdm = await check_tdm_reservation(first_url)
+            _compliance_check_jobs[check_id]["tdm"] = tdm
+            if not is_crawl_allowed(tdm):
+                _compliance_check_jobs[check_id]["status"] = "skipped_tdm"
+                _compliance_check_jobs[check_id]["error"] = (
+                    f"TDM-Vorbehalt fuer {tdm.get('domain')} erkannt "
+                    f"(status={tdm.get('status')}) — Crawl nach § 44b "
+                    f"UrhG nicht zulaessig. Signals: "
+                    f"{[s.get('src') for s in tdm.get('signals', [])]}"
+                )
+                _compliance_check_jobs[check_id]["progress_pct"] = 100
+                logger.info("TDM-skip check_id=%s domain=%s status=%s",
+                            check_id, tdm.get("domain"), tdm.get("status"))
+                return
+    except Exception as e:
+        logger.warning("TDM-check failed (proceeding): %s", e)
+
     # Step 1: Resolve texts (fetch from URL if needed) — 0-30%
     _update(check_id, "Texte werden geladen...", 1)
     doc_texts: dict[str, str] = {}
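The gate in this hunk can be illustrated in isolation. The following is a minimal sketch, not the real `tdm_reservation_check` module: `is_crawl_allowed` here is a hypothetical stand-in that treats the statuses named in the hunk's comment (`reserved`, `denied`) as blocking, and `apply_tdm_gate` is an invented helper that mirrors the job-state updates shown above.

```python
from typing import Any

# Assumption: the blocking statuses are taken from the hunk's comment
# ("reserved/denied"), not from the real tdm_reservation_check module.
BLOCKING_STATUSES = {"reserved", "denied"}


def is_crawl_allowed(tdm: dict[str, Any]) -> bool:
    """Hypothetical stand-in: allow the crawl unless the status is blocking."""
    return tdm.get("status") not in BLOCKING_STATUSES


def apply_tdm_gate(job: dict[str, Any], tdm: dict[str, Any]) -> bool:
    """Record the TDM result on the job; return True if the crawl may proceed."""
    job["tdm"] = tdm
    if is_crawl_allowed(tdm):
        return True
    # Same fields the hunk sets before returning early.
    job["status"] = "skipped_tdm"
    job["progress_pct"] = 100
    job["error"] = (
        f"TDM reservation for {tdm.get('domain')} "
        f"(status={tdm.get('status')})"
    )
    return False


job = {"status": "running", "progress_pct": 0, "error": ""}
blocked = {"domain": "example.com", "status": "reserved",
           "signals": [{"src": "robots.txt"}]}
print(apply_tdm_gate(job, blocked), job["status"])
```

Note that the hunk itself fails open: if the check raises, the run proceeds and only a warning is logged.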
@@ -526,15 +553,37 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
     report_html = build_html_report(results, None, doc_texts)
     profile_html = _build_profile_html(profile)
 
-    # O4: Vendor redundancy / EU alternatives + cost-savings block,
-    # placed between VVT and doc report so that management sees
-    # the savings before going into the detailed review.
+    # O4: Vendor redundancy / EU alternatives + cost-savings block
     from .agent_doc_check_redundancy import build_redundancy_html
     redundancy_html = build_redundancy_html(redundancy_report)
 
+    # P1: Executive summary at the VERY top: CFO/management sees 4 KPIs + 2 CTAs.
+    from .agent_doc_check_exec_summary import build_exec_summary_html
+    # Determine the site name for the header (same logic as the email subject)
+    url_company_for_exec = _company_name_from_url(doc_entries)
+    domain_for_exec = _extract_domain(doc_entries)
+    site_name_for_exec = url_company_for_exec or domain_for_exec or ""
+    exec_summary_html = build_exec_summary_html(
+        scorecard=scorecard,
+        previous_scorecard=prev_scorecard,
+        cmp_vendors=cmp_vendors,
+        redundancy_report=redundancy_report,
+        site_name=site_name_for_exec,
+    )
+
+    # Order (sales-optimized):
+    #   1) Exec summary (KPIs + saving + CTAs)
+    #   2) summary_html (concrete tasks for management)
+    #   3) scanned_urls (source transparency)
+    #   4) profile_html (detected business model)
+    #   5) scorecard_html (MC scorecard)
+    #   6) redundancy_html (optimization potential, directly after the compliance score)
+    #   7) providers_html + vvt_html (vendor list)
+    #   8) report_html (doc review details)
     full_html = (
-        summary_html + scanned_html + profile_html + scorecard_html
-        + providers_html + vvt_html + redundancy_html + report_html
+        exec_summary_html + summary_html + scanned_html + profile_html
+        + scorecard_html + redundancy_html
+        + providers_html + vvt_html + report_html
     )
 
     # Step 6: Send email — derive site name primarily from entered URL.
@@ -619,6 +668,21 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
             vendors=cmp_vendors,
             profile=extracted_profile,
         )
+        # Unified findings (P5): bundle MC + Pflichtangaben + Vendor +
+        # Redundanz in one searchable table behind /agent/findings/<id>.
+        try:
+            from compliance.services.unified_findings_collector import collect
+            from compliance.services.unified_findings_store import record_findings
+            unified = collect(
+                check_id=check_id,
+                results=results,
+                cmp_vendors=cmp_vendors,
+                redundancy_report=redundancy_report,
+                doc_texts=doc_texts,
+            )
+            record_findings(check_id, unified)
+        except Exception as e:
+            logger.warning("Unified findings collect failed: %s", e)
     except Exception as e:
         logger.warning("Audit persistence skipped: %s", e)
 
@@ -696,11 +760,19 @@ async def _fetch_text(url: str, doc_type: str = "") -> tuple[str, list[dict]]:
     except Exception as e:
         logger.warning("Consent-tester fetch failed for %s: %s", url, e)
 
-    # 2. Fallback: direct HTTP fetch (works for SSR pages like BMW)
+    # 2. Fallback: direct HTTP fetch (works for SSR pages like BMW).
+    # P7: identifiable UA + per-domain rate limit.
     try:
         import re as _re
-        async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
-            resp = await client.get(url)
+        from compliance.services.compliance_user_agent import (
+            default_request_headers, DomainRateLimiter,
+        )
+        async with httpx.AsyncClient(
+            timeout=30.0, follow_redirects=True,
+            headers=default_request_headers(),
+        ) as client:
+            async with DomainRateLimiter(url):
+                resp = await client.get(url)
             if resp.status_code == 200 and "text/html" in resp.headers.get("content-type", ""):
                 html = resp.text
                 # Strip HTML tags, decode entities
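The diff uses `DomainRateLimiter` as an async context manager but does not show its implementation. As a rough sketch of how the limits from the commit message (1 req/s, max 2 concurrent per domain) could be enforced, assuming nothing about the real `compliance_user_agent` module: the class below, its constructor parameters, and its internal state layout are all hypothetical.

```python
import asyncio
import time
from urllib.parse import urlparse


class DomainRateLimiter:
    """Hypothetical sketch: per-domain request spacing plus a concurrency cap."""

    _states: dict = {}  # domain -> {"sem", "lock", "next_slot"}

    def __init__(self, url: str, min_interval: float = 1.0, max_concurrent: int = 2):
        self.domain = urlparse(url).netloc.lower()
        self.min_interval = min_interval
        self.max_concurrent = max_concurrent

    def _state(self) -> dict:
        # Created lazily so the primitives are first used inside a running loop.
        st = self._states.get(self.domain)
        if st is None:
            st = self._states[self.domain] = {
                "sem": asyncio.Semaphore(self.max_concurrent),
                "lock": asyncio.Lock(),
                "next_slot": 0.0,
            }
        return st

    async def __aenter__(self) -> "DomainRateLimiter":
        st = self._state()
        await st["sem"].acquire()      # at most max_concurrent requests in flight
        async with st["lock"]:         # serialize the slot assignment
            now = time.monotonic()
            start = max(now, st["next_slot"])
            st["next_slot"] = start + self.min_interval
        if start > now:
            await asyncio.sleep(start - now)  # wait for the reserved slot
        return self

    async def __aexit__(self, *exc) -> None:
        self._state()["sem"].release()


async def demo() -> float:
    async def fetch(i: int) -> int:
        # Short interval so the demo finishes quickly.
        async with DomainRateLimiter("https://example.com/p", min_interval=0.05):
            return i

    t0 = time.monotonic()
    await asyncio.gather(*(fetch(i) for i in range(3)))
    return time.monotonic() - t0


elapsed = asyncio.run(demo())
print(f"3 requests took {elapsed:.2f}s")
```

The state is kept per process, so a sketch like this would not coordinate across multiple workers; whether the real limiter does is not visible in this commit.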
@@ -1135,8 +1207,25 @@ def _company_name_from_url(doc_entries: list[dict]) -> str | None:
 
 
 def _get_skip_types(profile) -> dict[str, str]:
-    """Doc_types to skip entirely. Currently empty — we check everything
-    and flag irrelevant items as INFO instead of skipping."""
+    """Doc_types to skip entirely with a per-type reason message.
+
+    Today primarily for the OEM configurator pattern (BMW/Audi/Mercedes):
+    if the site does no direct sales, AGB/Widerruf/
+    Nutzungsbedingungen are not mandatory on the website; they are
+    handed over by the authorized dealer.
+    """
+    if getattr(profile, "no_direct_sales", False):
+        msg = (
+            "Nicht anwendbar — die Webseite schliesst keinen Direkt-"
+            "Kaufvertrag (OEM-Konfigurator-Pattern, Vertrag laeuft "
+            "ueber Vertragshaendler). AGB/Widerruf werden beim "
+            "Haendler ausgehaendigt."
+        )
+        return {
+            "agb": msg,
+            "widerruf": msg,
+            "nutzungsbedingungen": msg,
+        }
     return {}
 
 
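The skip reason returned here and the "NICHT ANWENDBAR" label in the report hunks appear to be coupled only through the string prefix "Nicht anwendbar". A small sketch of that flow, under the assumption (not shown in this commit) that the skip message ends up in the result's `error` field: `get_skip_types` is a shortened copy of the function above, and `status_label` is a hypothetical condensation of the branching in `_render_document`.

```python
from types import SimpleNamespace


def get_skip_types(profile) -> dict:
    # Shortened copy of _get_skip_types; the message text is abbreviated.
    if getattr(profile, "no_direct_sales", False):
        msg = "Nicht anwendbar: OEM-Konfigurator-Pattern."
        return {"agb": msg, "widerruf": msg, "nutzungsbedingungen": msg}
    return {}


def status_label(error: str) -> str:
    # Hypothetical condensation of the prefix checks in _render_document.
    if error.startswith("Nicht eingereicht"):
        return "NICHT EINGEREICHT"
    if error.startswith("Auf der Website nicht gefunden"):
        return "NICHT GEFUNDEN"
    if error.startswith("Nicht anwendbar"):
        return "NICHT ANWENDBAR"
    return "FEHLER" if error else ""


oem = SimpleNamespace(no_direct_sales=True)
shop = SimpleNamespace()  # attribute missing: getattr falls back to False

skip = get_skip_types(oem)
print(sorted(skip))                # ['agb', 'nutzungsbedingungen', 'widerruf']
print(status_label(skip["agb"]))   # NICHT ANWENDBAR
print(get_skip_types(shop))        # {}
```

Because the classification hangs on a message prefix, rewording the skip message would silently turn these entries into "FEHLER"; a dedicated status field would avoid that, at the cost of touching the result type.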
@@ -0,0 +1,135 @@
+"""
+Executive summary block: the topmost email section.
+
+Shows the CFO / management the overall value of the compliance check in 4 numbers:
+  1) Compliance score (trend vs. previous run)
+  2) Number of vendors analyzed
+  3) Estimated annual savings potential (range)
+  4) Consolidation potential (vendors that can be reduced)
+
+Plus two big CTA buttons:
+  - "Compliance-Maengel im Detail" → jumps to the doc review block
+  - "Konsolidierungs-Plan ansehen" → jumps to the redundancy block
+
+Goal: the board sees the ROI within 5 seconds. If curious, they scroll
+on into the detail blocks (which sit BELOW this summary).
+"""
+
+from __future__ import annotations
+
+
+def _fmt_eur_range(low: int, high: int) -> str:
+    if not low and not high:
+        return "—"
+    if low == high:
+        return f"~{low:,} €".replace(",", ".")
+    return f"{low:,}–{high:,} €".replace(",", ".")
+
+
+def build_exec_summary_html(
+    scorecard: dict | None,
+    previous_scorecard: dict | None,
+    cmp_vendors: list[dict] | None,
+    redundancy_report: dict | None,
+    site_name: str = "",
+) -> str:
+    """Build the top-of-email Executive Summary with 4 KPIs + 2 CTAs."""
+    # 1) Compliance score
+    pct = 0
+    delta_str = ""
+    score_color = "#94a3b8"
+    if scorecard:
+        totals = scorecard.get("totals") or {}
+        pct = int(totals.get("pct", 0))
+        score_color = ("#16a34a" if pct >= 80 else
+                       "#d97706" if pct >= 50 else "#dc2626")
+        if previous_scorecard:
+            prev_pct = int((previous_scorecard.get("totals") or {}).get("pct", 0))
+            d = pct - prev_pct
+            if d:
+                trend_color = "#16a34a" if d > 0 else "#dc2626"
+                delta_str = (
+                    f'<span style="font-size:14px;color:{trend_color};margin-left:6px">'
+                    f'{"+" if d > 0 else ""}{d} pp</span>'
+                )
+
+    # 2) Vendor count
+    n_vendors = len(cmp_vendors or [])
+
+    # 3+4) Saving + consolidation
+    s = (redundancy_report or {}).get("summary") or {}
+    sav_low, sav_high = s.get("estimated_saving_year_eur", [0, 0])
+    n_consolidation = s.get("consolidation_potential", 0)
+    sav_pct = s.get("estimated_saving_pct", "—")
+
+    parts = [
+        '<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
+        'max-width:700px;margin:0 auto 18px;padding:18px 22px;'
+        'background:linear-gradient(135deg,#1e293b 0%,#0f172a 100%);'
+        'border-radius:10px;color:white">',
+
+        f'<div style="font-size:11px;color:#94a3b8;text-transform:uppercase;'
+        f'letter-spacing:1.5px;margin-bottom:6px">Executive Summary</div>',
+        f'<h2 style="margin:0 0 16px;font-size:18px;color:white">'
+        f'Compliance-Check {site_name}</h2>',
+
+        # 2x2 KPI grid
+        '<table style="width:100%;border-collapse:separate;border-spacing:8px">',
+
+        # Row 1: Compliance + Vendor count
+        '<tr>',
+        f'<td style="width:50%;padding:12px 14px;background:rgba(255,255,255,0.05);'
+        f'border-radius:6px;border:1px solid rgba(255,255,255,0.08)">'
+        f'<div style="font-size:10px;color:#94a3b8;text-transform:uppercase;'
+        f'letter-spacing:1px;margin-bottom:4px">DSGVO / TDDDG / TMG Score</div>'
+        f'<div style="font-size:28px;font-weight:700;color:{score_color}">'
+        f'{pct}%{delta_str}</div>'
+        f'<div style="font-size:11px;color:#cbd5e1;margin-top:2px">'
+        f'aus {int((scorecard or {}).get("totals", {}).get("total", 0))} Pflicht-Pruefungen</div>'
+        f'</td>',
+
+        f'<td style="width:50%;padding:12px 14px;background:rgba(255,255,255,0.05);'
+        f'border-radius:6px;border:1px solid rgba(255,255,255,0.08)">'
+        f'<div style="font-size:10px;color:#94a3b8;text-transform:uppercase;'
+        f'letter-spacing:1px;margin-bottom:4px">Identifizierte Anbieter</div>'
+        f'<div style="font-size:28px;font-weight:700;color:white">{n_vendors}</div>'
+        f'<div style="font-size:11px;color:#cbd5e1;margin-top:2px">'
+        f'davon {n_consolidation} konsolidierbar</div>'
+        f'</td>',
+        '</tr>',
+
+        # Row 2: Saving + CTA hint
+        '<tr>',
+        f'<td colspan="2" style="padding:14px 16px;background:linear-gradient(90deg,'
+        f'rgba(16,185,129,0.15) 0%,rgba(16,185,129,0.05) 100%);'
+        f'border-radius:6px;border:1px solid rgba(16,185,129,0.3)">'
+        f'<div style="font-size:10px;color:#86efac;text-transform:uppercase;'
+        f'letter-spacing:1px;margin-bottom:4px">'
+        f'Geschaetztes Sparpotenzial pro Jahr (Tool-Lizenzen, ohne Media-Spend)</div>'
+        f'<div style="font-size:24px;font-weight:700;color:#34d399">'
+        f'{_fmt_eur_range(sav_low, sav_high)}'
+        f'<span style="font-size:14px;color:#86efac;margin-left:8px">({sav_pct})</span></div>'
+        f'<div style="font-size:11px;color:#cbd5e1;margin-top:4px">'
+        f'durch Konsolidierung redundanter Anbieter auf je 1 EU-Tool pro '
+        f'Funktions-Kategorie. <em>Schaetzbereich, mit dem Einkauf zu verifizieren.</em>'
+        f'</div></td>',
+        '</tr>',
+
+        '</table>',
+
+        # CTAs
+        '<div style="margin-top:14px;padding-top:12px;border-top:1px solid '
+        'rgba(255,255,255,0.1);text-align:center">',
+        '<a href="#mc-scorecard" style="display:inline-block;padding:8px 16px;'
+        'background:#7c3aed;color:white;text-decoration:none;border-radius:6px;'
+        'font-size:12px;font-weight:600;margin-right:8px">'
+        'Compliance-Maengel im Detail →</a>',
+        '<a href="#optimierungspotenzial" style="display:inline-block;padding:8px 16px;'
+        'background:#10b981;color:white;text-decoration:none;border-radius:6px;'
+        'font-size:12px;font-weight:600">'
+        'Konsolidierungs-Plan →</a>',
+        '</div>',
+
+        '</div>',
+    ]
+    return "".join(parts)
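The savings KPI relies on `_fmt_eur_range` producing German-style thousands separators by swapping the commas from Python's `:,` format for dots. A short worked example, using a verbatim copy of the helper renamed to `fmt_eur_range`; the input values are made up for illustration.

```python
def fmt_eur_range(low: int, high: int) -> str:
    # Copy of _fmt_eur_range from the new file above.
    if not low and not high:
        return "—"
    if low == high:
        return f"~{low:,} €".replace(",", ".")
    return f"{low:,}–{high:,} €".replace(",", ".")


print(fmt_eur_range(0, 0))          # —
print(fmt_eur_range(5000, 5000))    # ~5.000 €
print(fmt_eur_range(12000, 18500))  # 12.000–18.500 €
print(fmt_eur_range(0, 3000))       # 0–3.000 €
```

The helper assumes integer inputs: with floats, `:,` would emit a decimal point that the final `replace` leaves untouched next to the dotted thousands separators.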
@@ -421,10 +421,18 @@ def _render_vendor_row_full(v: dict) -> str:
             f'{", ".join(flags[:4])}</div>'
             f'{actions_html}'
         )
+    risk = v.get("compliance_risk") or {}
+    risk_label = risk.get("label") or ""
+    risk_badge = ""
+    if risk_label and risk_label != "unklar":
+        rc = {"kritisch": ("#dc2626", "#fff"), "hoch": ("#fecaca", "#991b1b"),
+              "mittel": ("#fde68a", "#92400e"), "gering": ("#d1fae5", "#065f46")}.get(risk_label, ("#e5e7eb", "#475569"))
+        risk_badge = (f'<span style="margin-left:6px;padding:1px 5px;border-radius:3px;font-size:9px;'
+                      f'background:{rc[0]};color:{rc[1]}">Risk: {risk_label}</span>')
     return (
         f'<tr style="border-top:1px solid #e2e8f0">'
         f'<td style="padding:6px 8px;color:#1e293b;font-size:11px">'
-        f'{name}{flag_str}</td>'
+        f'{name}{risk_badge}{flag_str}</td>'
         f'<td style="padding:6px 8px;color:#475569;font-size:11px">{category}</td>'
         f'<td style="padding:6px 8px;color:#475569;font-size:11px">{country}</td>'
         f'<td style="padding:6px 8px;text-align:center;color:#475569;font-size:11px">'
@@ -28,9 +28,10 @@ def build_redundancy_html(report: dict | None) -> str:
     pct = s.get("estimated_saving_pct") or "n/a"
 
     parts = [
-        '<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
-        'max-width:700px;margin:0 auto 16px;padding:14px 18px;'
-        'background:#fef3c7;border:1px solid #fcd34d;border-radius:8px">',
+        '<div id="optimierungspotenzial" style="font-family:-apple-system,'
+        'BlinkMacSystemFont,sans-serif;max-width:700px;margin:0 auto 16px;'
+        'padding:14px 18px;background:#fef3c7;border:1px solid #fcd34d;'
+        'border-radius:8px">',
         '<h3 style="margin:0 0 6px;font-size:14px;color:#92400e">'
         'Optimierungspotenzial: Redundanzen + EU-Alternativen</h3>',
         f'<p style="margin:0 0 10px;font-size:11px;color:#78350f">'
@@ -134,7 +134,9 @@ def build_management_summary(results: list[DocCheckResult]) -> str:
     ok = [r for r in results if r.completeness_pct == 100 and not r.error]
     fixable = [r for r in results if 0 < r.completeness_pct < 100 and not r.error]
     critical = [r for r in results if r.completeness_pct == 0 and not r.error]
-    errors = [r for r in results if r.error]
+    not_applicable = [r for r in results if r.error
+                      and r.error.startswith("Nicht anwendbar")]
+    errors = [r for r in results if r.error and r not in not_applicable]
 
     html = [
         '<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
@@ -150,17 +152,24 @@ def build_management_summary(results: list[DocCheckResult]) -> str:
         html.append('<p>Keine Dokumente geprueft.</p></div>')
         return "\n".join(html)
 
+    na_note = (
+        f' Zusaetzlich {len(not_applicable)} Dokument{"" if len(not_applicable) == 1 else "e"} '
+        f'als NICHT ANWENDBAR markiert (kein Direkt-Vertrieb — '
+        f'OEM-Konfigurator-Pattern).' if not_applicable else ""
+    )
     if len(ok) == total:
         html.append(
-            '<p style="color:#16a34a;font-weight:600;font-size:15px">'
-            'Alle Dokumente sind vollstaendig. Keine dringenden Massnahmen noetig.</p>'
+            f'<p style="color:#16a34a;font-weight:600;font-size:15px">'
+            f'Alle Dokumente sind vollstaendig. Keine dringenden Massnahmen noetig.'
+            f'{na_note}</p>'
         )
     else:
         html.append(
             f'<p style="font-size:14px;color:#475569">'
             f'{len(ok)} von {total} Dokumenten sind vollstaendig. '
             f'{len(fixable)} brauchen Korrekturen'
-            f'{f", {len(critical)} fehlen oder sind unbrauchbar" if critical else ""}.</p>'
+            f'{f", {len(critical)} fehlen oder sind unbrauchbar" if critical else ""}.'
+            f'{na_note}</p>'
         )
 
     # Concrete actions
@@ -279,10 +288,13 @@ def _render_document(html: list[str], r: DocCheckResult, doc_text: str = "") ->
         r.error.startswith("Nicht eingereicht")
         or r.error.startswith("Auf der Website nicht gefunden")
     )
+    is_not_applicable = bool(r.error) and r.error.startswith("Nicht anwendbar")
     if is_missing:
         status_label = ("NICHT GEFUNDEN"
                         if r.error.startswith("Auf der Website")
                         else "NICHT EINGEREICHT")
+    elif is_not_applicable:
+        status_label = "NICHT ANWENDBAR"
     elif r.error:
         status_label = "FEHLER"
 
@@ -330,6 +342,13 @@ def _render_document(html: list[str], r: DocCheckResult, doc_text: str = "") ->
             'background:#fafafa;border-top:1px solid #f3f4f6">'
             + body_msg + '</div>'
         )
+    elif is_not_applicable:
+        html.append(
+            '<div style="padding:12px 16px;color:#475569;font-size:12px;'
+            'background:#f1f5f9;border-top:1px solid #cbd5e1;border-left:'
+            '3px solid #94a3b8">'
+            + r.error + '</div>'
+        )
     elif r.error:
         html.append(f'<div style="padding:12px 16px;color:#991b1b">{r.error}</div>')
     else:
@@ -44,7 +44,7 @@ def build_scorecard_html(
     trend_str = _delta_badge(overall_pct, prev_total_pct) if prev_total_pct is not None else ""
 
     head = (
-        '<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
+        '<div id="mc-scorecard" style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
         'max-width:700px;margin:0 auto 16px;padding:12px 16px;'
         'background:#f0f9ff;border:1px solid #bae6fd;border-radius:8px">'
         '<h3 style="margin:0 0 6px;font-size:14px;color:#0369a1">'
@@ -0,0 +1,104 @@
+"""
+Full-audit findings router: unified view across all 4 finding sources.
+
+Endpoint:
+  GET /api/compliance/agent/findings/{check_id}
+      ?source=mc|pflichtangabe|vendor|redundanz|all
+      &severity=CRITICAL|HIGH|MEDIUM|LOW|INFO|all
+      &doc_type=impressum|dse|cookie|...|all
+      &status=failed|passed|skipped|na|info|all
+      &q=<free text>
+      &limit=<int>
+
+Returns summary + filtered findings list. The frontend renders the
+full-audit tab under /sdk/agent/audit/<check_id> from it.
+"""
+
+from __future__ import annotations
+
+import logging
+from urllib.parse import urlparse
+from fastapi import APIRouter, HTTPException, Query
+
+from compliance.services.unified_findings_store import (
+    findings_summary,
+    list_findings,
+)
+from compliance.services.compliance_audit_log import get_check_run
+
+logger = logging.getLogger(__name__)
+
+router = APIRouter(prefix="/compliance/agent", tags=["agent"])
+
+
+def _normalize_domain(d: str) -> str:
+    if not d:
+        return ""
+    if "://" not in d:
+        d = "https://" + d
+    host = urlparse(d).netloc.lower()
+    return host[4:] if host.startswith("www.") else host
+
+
+@router.get("/findings/{check_id}")
+def get_findings(
+    check_id: str,
+    source: str | None = Query(None, description="mc|pflichtangabe|vendor|redundanz|all"),
+    severity: str | None = Query(None, description="CRITICAL|HIGH|MEDIUM|LOW|INFO|all"),
+    doc_type: str | None = Query(None),
+    status: str | None = Query(None, description="failed|passed|skipped|na|info|all"),
+    q: str | None = Query(None, description="freitext-suche label/vendor"),
+    limit: int = Query(1000, ge=1, le=5000),
+    expected_domain: str | None = Query(
+        None, description="Hard-Assertion: Run muss zu dieser Domain gehoeren (Cross-Tenant-Schutz)",
+    ),
+) -> dict:
+    """Return aggregated findings + summary counters for a check run."""
+    # P7 leftover: optional domain assertion. Prevents a frontend from
+    # requesting a check_id that belongs to another tenant's domain.
+    if expected_domain:
+        run = get_check_run(check_id)
+        actual = _normalize_domain((run or {}).get("base_domain") or "")
+        if not run or actual != _normalize_domain(expected_domain):
+            raise HTTPException(
+                status_code=403,
+                detail=f"Cross-tenant access blocked: check_id {check_id} "
+                       f"gehoert zu Domain '{actual or '?'}', angefragt: "
+                       f"'{_normalize_domain(expected_domain)}'",
+            )
+    try:
+        summary = findings_summary(check_id)
+        findings = list_findings(
+            check_id=check_id,
+            source_type=source,
+            severity=severity,
+            doc_type=doc_type,
+            status=status,
+            q=q,
+            limit=limit,
+        )
+        return {
+            "found": summary.get("total", 0) > 0,
+            "check_id": check_id,
+            "summary": summary,
+            "filter": {
+                "source": source or "all",
+                "severity": severity or "all",
+                "doc_type": doc_type or "all",
+                "status": status or "all",
+                "q": q or "",
+                "limit": limit,
+            },
+            "count": len(findings),
+            "findings": findings,
+        }
+    except Exception as e:
+        logger.exception("get_findings failed for %s", check_id)
+        return {
+            "found": False,
+            "check_id": check_id,
+            "error": str(e)[:200],
+            "summary": {},
+            "count": 0,
+            "findings": [],
+        }
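The cross-tenant check compares two values after passing both through `_normalize_domain`. A short worked example of what that normalization does and does not fold together, using a verbatim copy of the helper renamed to `normalize_domain`; the sample hostnames are illustrative.

```python
from urllib.parse import urlparse


def normalize_domain(d: str) -> str:
    # Copy of _normalize_domain from the router above.
    if not d:
        return ""
    if "://" not in d:
        d = "https://" + d
    host = urlparse(d).netloc.lower()
    return host[4:] if host.startswith("www.") else host


print(normalize_domain("https://www.Example.com/impressum"))  # example.com
print(normalize_domain("example.com"))                        # example.com
print(normalize_domain("shop.example.com"))                   # shop.example.com
print(normalize_domain("example.com:8443"))                   # example.com:8443
print(repr(normalize_domain("")))                             # ''
```

Scheme, path, case, and a leading `www.` are ignored, while other subdomains and explicit ports are kept, so `expected_domain=shop.example.com` against a run stored as `example.com` would produce the 403. Note also that the assertion only runs when `expected_domain` is supplied; a request without it is not checked.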
@@ -0,0 +1,196 @@
+"""
+Saving-scan funnel endpoint: marketing lead → compliance check.
+
+The external form (https://breakpilot.ai/savings-scan) posts here:
+  POST /api/compliance/agent/saving-scan/start
+  Body: {"url": "...", "email": "..."}
+
+Server-side:
+  1. Validate URL + email (email regex, URL scheme).
+  2. Rate limit: max 1 full scan / domain / 24h
+     (saving_scan_allowed from compliance_user_agent).
+  3. Persist the lead (saving_scan_leads in sidecar SQLite) for
+     later report delivery + sales follow-up.
+  4. Start the compliance check with auto-discovery (DocumentInput empty
+     except for the homepage). The existing worker runs the TDM check, then
+     discovery, then the review.
+  5. Return check_id; the frontend polls /compliance-check/<check_id>.
+"""
+
+from __future__ import annotations
+
+import logging
+import os
+import re
+import sqlite3
+import uuid as _uuid
+from datetime import datetime, timezone
+from pathlib import Path
+
+import asyncio
+from fastapi import APIRouter, HTTPException
+from pydantic import BaseModel, Field
+
+from compliance.services.compliance_user_agent import (
+    base_domain_of, saving_scan_allowed,
+)
+
+logger = logging.getLogger(__name__)
+
+router = APIRouter(prefix="/compliance/agent", tags=["agent"])
+
+DB_PATH = os.getenv("COMPLIANCE_AUDIT_DB", "/data/compliance_audits.db")
+
+_EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
+_URL_RE = re.compile(r"^https?://[A-Za-z0-9.-]+(/.*)?$")
+
+
+class SavingScanRequest(BaseModel):
+    url: str = Field(..., min_length=4, max_length=400)
+    email: str = Field(..., min_length=5, max_length=200)
+    consent: bool = Field(
+        True, description="Marketing-Consent fuer Sales-Follow-Up — "
+                          "muss True sein laut Form-Checkbox.",
+    )
+
+
+class SavingScanResponse(BaseModel):
+    check_id: str
+    status: str
+    message: str = ""
+
+
+def _ensure_leads_table() -> None:
+    Path(DB_PATH).parent.mkdir(parents=True, exist_ok=True)
+    with sqlite3.connect(DB_PATH) as conn:
+        conn.executescript("""
+            CREATE TABLE IF NOT EXISTS saving_scan_leads (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                ts TEXT NOT NULL,
+                email TEXT NOT NULL,
+                url TEXT NOT NULL,
+                base_domain TEXT NOT NULL,
+                check_id TEXT,
+                consent INTEGER NOT NULL,
+                source TEXT
+            );
+            CREATE INDEX IF NOT EXISTS idx_leads_domain ON saving_scan_leads(base_domain, ts);
+            CREATE INDEX IF NOT EXISTS idx_leads_email ON saving_scan_leads(email, ts);
+        """)
+
+
+def _persist_lead(email: str, url: str, check_id: str, consent: bool) -> None:
+    try:
+        _ensure_leads_table()
+        with sqlite3.connect(DB_PATH) as conn:
+            conn.execute(
+                "INSERT INTO saving_scan_leads "
+                "(ts, email, url, base_domain, check_id, consent, source) "
+                "VALUES (?, ?, ?, ?, ?, ?, ?)",
+                (
+                    datetime.now(timezone.utc).isoformat(),
+                    email.lower().strip(),
+                    url,
+                    base_domain_of(url),
+                    check_id,
+                    1 if consent else 0,
+                    "saving_scan_form",
+                ),
+            )
+            conn.commit()
+    except Exception as e:
+        logger.warning("persist lead failed: %s", e)
+
+
+def _normalize_url(url: str) -> str:
+    """Strip the path and keep only the homepage; discovery finds the rest."""
+    if "://" not in url:
+        url = "https://" + url
+    from urllib.parse import urlparse
+    p = urlparse(url)
+    return f"{p.scheme}://{p.netloc}/"
+
+
+@router.post("/saving-scan/start", response_model=SavingScanResponse)
+async def start_saving_scan(req: SavingScanRequest) -> SavingScanResponse:
+    """Trigger compliance check from the marketing-funnel form."""
+    if not _EMAIL_RE.match(req.email):
+        raise HTTPException(400, "Ungueltige E-Mail-Adresse.")
+    if not _URL_RE.match(req.url):
+        raise HTTPException(400, "URL muss mit http:// oder https:// beginnen.")
+    if not req.consent:
+        raise HTTPException(400, "Marketing-Consent erforderlich.")
+
+    domain = base_domain_of(req.url)
+    if not domain:
+        raise HTTPException(400, "Konnte Domain nicht ermitteln.")
+
+    allowed, wait_s = saving_scan_allowed(req.url)
+    if not allowed:
+        raise HTTPException(
+            429,
+            f"Fuer '{domain}' wurde in den letzten 24h bereits ein Scan "
+            f"durchgefuehrt. Bitte in {wait_s // 3600}h {wait_s % 3600 // 60}min "
+            f"erneut versuchen.",
+        )
+
+    # Lazy import to avoid circular dependency at module load.
+    from compliance.api.agent_compliance_check_routes import (
+        DocumentInput,
+        ComplianceCheckRequest,
+        _run_compliance_check,
+        _compliance_check_jobs,
+    )
+
+    homepage = _normalize_url(req.url)
+    check_id = str(_uuid.uuid4())[:8]
+    _compliance_check_jobs[check_id] = {
+        "status": "running",
+        "progress": "Saving-Scan gestartet — Auto-Discovery laeuft...",
+        "progress_pct": 0,
+        "result": None,
+        "error": "",
+    }
+
+    # Single "other" entry forces auto-discovery to fill in the rest.
+    docs = [DocumentInput(doc_type="other", url=homepage)]
+    check_req = ComplianceCheckRequest(
+        documents=docs, recipient=req.email.lower().strip(),
+    )
+
+    _persist_lead(req.email, req.url, check_id, req.consent)
+    asyncio.create_task(_run_compliance_check(check_id, check_req))
+
+    logger.info("saving-scan start: check_id=%s domain=%s email=%s",
+                check_id, domain, req.email[:3] + "***")
+    return SavingScanResponse(
+        check_id=check_id,
+        status="running",
+        message=f"Scan gestartet fuer {domain}. Bericht in ~3-5 Minuten.",
+    )
+
+
+@router.get("/saving-scan/lead-count")
+def saving_scan_lead_count() -> dict:
+    """Diagnostics for the sales dashboard."""
+    try:
+        _ensure_leads_table()
+        with sqlite3.connect(DB_PATH) as conn:
+            total = conn.execute(
+                "SELECT COUNT(*) FROM saving_scan_leads",
+            ).fetchone()[0]
+            last_24h = conn.execute(
+                "SELECT COUNT(*) FROM saving_scan_leads "
+                "WHERE ts > datetime('now', '-1 day')",
+            ).fetchone()[0]
+            top_domains = conn.execute(
+                "SELECT base_domain, COUNT(*) AS n FROM saving_scan_leads "
+                "GROUP BY base_domain ORDER BY n DESC LIMIT 10",
+            ).fetchall()
+        return {
+            "total_leads": total,
+            "last_24h": last_24h,
+            "top_domains": [{"domain": d, "scans": n} for d, n in top_domains],
+        }
+    except Exception as e:
+        return {"error": str(e)[:200]}
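One detail in the `last_24h` query may be worth a second look. `_persist_lead` stores `ts` in the `datetime.isoformat()` shape (with a `T` separator and a UTC offset), while SQLite's `datetime('now', '-1 day')` yields the `YYYY-MM-DD HH:MM:SS` shape, and the two are compared as text. The sketch below uses fixed sample timestamps rather than `'now'` to show how the text comparison and a normalized comparison can disagree; wrapping the column in `datetime()` is one possible adjustment, not something this commit does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample values (illustrative): a lead stored early on the cutoff day,
# and a cutoff at noon of the same day.
ts = "2024-01-01T00:00:01+00:00"   # shape written by datetime.isoformat()
cutoff = "2024-01-01 12:00:00"     # shape returned by SQLite datetime()

# Plain text comparison: 'T' sorts after ' ', so ts counts as "newer".
text_cmp = conn.execute("SELECT ? > ?", (ts, cutoff)).fetchone()[0]

# Normalizing ts through datetime() first compares like with like.
norm_cmp = conn.execute("SELECT datetime(?) > ?", (ts, cutoff)).fetchone()[0]

print(text_cmp, norm_cmp)  # 1 0
```

If this reading is right, `last_24h` can include every lead from the calendar day of the cutoff, which would somewhat over-count the dashboard figure; it would not affect the rate limit, which goes through `saving_scan_allowed`.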