feat(audit-pipeline): P72-v2 heuristic refined + P80 mini-replay endpoint
CI / detect-changes (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 13s
CI / loc-budget (push) Failing after 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 36s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / nodejs-build (push) Has been skipped
P72-v2 MC scope classifier, heuristic v2: v1 put 79% of entries into the
'other' bucket (patterns too strict). v2 covers considerably more:
- DSE (privacy policy): Art. 13/14 + data subject rights (Art. 15-22) +
  DPO (DSB) + supervisory authority + retention period + special categories
- TOM: Art. 32 + encryption/backup/pseudonymization +
  access control + ISO 27001 + BSI-Grundschutz + audit log
- cookie_richtlinie: tracking pixels + web storage + GA/Matomo/
  Hotjar/Pixel/GTM
- process: VVT (records of processing, Art. 30) + DSFA (DPIA, Art. 35) +
  data breaches (Art. 33/34) + HinSchG + trainings + deletion concept
The script `backfill_mc_scope_v2.py` reclassifies ONLY the
'other' bucket (specific v1 buckets remain untouched).
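The classifier and backfill described above could look roughly like the following sketch. The commit does not show the actual v2 patterns or the script body, so the pattern table, the `classify_scope` and `backfill` names, and the row layout (`scope`, `text`) are all assumptions made for illustration.

```python
import re
from typing import Dict, List

# Hypothetical keyword patterns per bucket; the real v2 patterns are not
# part of this commit excerpt.
SCOPE_PATTERNS: Dict[str, List[str]] = {
    "DSE": [r"\bart\.?\s*1[34]\b", r"betroffenenrechte", r"speicherdauer"],
    "TOM": [r"\bart\.?\s*32\b", r"verschl(ue|ü)sselung", r"pseudonymisierung"],
    "cookie_richtlinie": [r"tracking-?pixel", r"matomo", r"hotjar"],
    "process": [r"\bart\.?\s*3[0345]\b", r"\bdsfa\b", r"hinschg"],
}


def classify_scope(text: str) -> str:
    """Return the bucket with the most pattern hits, or 'other' if none match."""
    lowered = text.lower()
    best, best_hits = "other", 0
    for bucket, patterns in SCOPE_PATTERNS.items():
        hits = sum(1 for p in patterns if re.search(p, lowered))
        if hits > best_hits:
            best, best_hits = bucket, hits
    return best


def backfill(rows: List[dict]) -> int:
    """Reclassify only rows currently in 'other'; return how many changed."""
    changed = 0
    for row in rows:
        if row["scope"] != "other":
            continue  # specific v1 buckets stay untouched
        new_scope = classify_scope(row["text"])
        if new_scope != "other":
            row["scope"] = new_scope
            changed += 1
    return changed
```

Restricting the backfill to the 'other' bucket keeps it safe to re-run: rows that v1 already classified specifically are never overwritten.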
P80 mini-replay endpoint (v1):
POST /compliance-check/snapshots/{id}/replay
?recipient=foo@bar.com&dry_run=false
Loads the snapshot and renders the mail with the CURRENT render code
(P63-P67, P59b/P61/P62). Sends a [REPLAY]-prefixed mail or only returns
HTML stats (dry_run).
Effect: 7 min re-scan -> 2-5 s for mail layout iterations.
v2 (later): MC scorecard with the current scope_doc_type filter
over the snapshot; requires refactoring _run_compliance_check.
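As a small illustration of the endpoint shape above, the request URL can be assembled with the standard library. This is only a sketch: the `build_replay_url` helper and the base URL are hypothetical, and any router prefix the real service mounts in front of `/compliance-check` is not known from this commit.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_replay_url(
    base_url: str,
    snapshot_id: str,
    recipient: Optional[str] = None,
    dry_run: bool = False,
) -> str:
    """Build the URL for POST /compliance-check/snapshots/{id}/replay."""
    params = {"dry_run": "true" if dry_run else "false"}
    if recipient:
        # urlencode percent-encodes the address (e.g. '@' becomes '%40').
        params["recipient"] = recipient
    path = f"/compliance-check/snapshots/{quote(snapshot_id, safe='')}/replay"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


# Dry run: render only, no mail is sent.
print(build_replay_url("http://localhost:8000", "abc123", dry_run=True))
```

For layout iterations, a dry run is usually enough; passing a recipient with `dry_run=false` is what triggers the [REPLAY]-prefixed mail.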
Plus bugfix: GET /snapshots/{id} now raises HTTPException instead of
returning a tuple (FastAPI returned the tuple as a JSON array).
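The tuple bug can be reproduced without FastAPI: a `(body, status)` return value is just a Python tuple, and JSON serialization turns a tuple into an array. The sketch below uses the standard `json` module as a stand-in for the framework's response encoding; the handler name and payload are hypothetical.

```python
import json


def get_snapshot_tuple_style(found: bool):
    """Flask-style idiom: return (body, status). Hypothetical handler."""
    if not found:
        return {"error": "snapshot not found"}, 404
    return {"id": "abc123"}


# Serializing the return value as-is yields a two-element JSON array,
# and the HTTP status of the response would remain the default.
body = json.dumps(get_snapshot_tuple_style(found=False))
print(body)  # [{"error": "snapshot not found"}, 404]
```

In FastAPI, the status code has to be signalled differently, for example by raising `HTTPException(status_code=404, detail=...)`, which is the approach the bugfix describes.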
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,147 @@
"""
P80: Replay pipeline (mini version v1).

Loads a persisted snapshot and re-renders the audit mail with the
CURRENT mail render code. Useful for:
  * testing mail layout changes (P63-P67, P82 1-pager, P84 diff mode)
  * adjusting action recipes
  * iterating on the disclaimer text
  * tuning the pattern-notice logic

NOT included (coming in v2):
  * MC scorecard re-run with the current scope_doc_type filter (P72);
    requires refactoring the MC pipeline out of _run_compliance_check
  * Vendor redundancy analysis re-run

Effect v1: 7 min re-scan -> 2-5 s for mail layout iterations.
Effect v2 (later): also for MC filter tests.
"""

from __future__ import annotations

import logging
from typing import Any

from sqlalchemy.orm import Session

from compliance.services.check_snapshot import load_snapshot

logger = logging.getLogger(__name__)


def replay_from_snapshot(
    db: Session,
    snapshot_id: str,
    recipient: str | None = None,
    dry_run: bool = False,
) -> dict:
    """Replay audit mail render from snapshot.

    Args:
        db: SQLAlchemy session
        snapshot_id: UUID of snapshot to replay
        recipient: Override email recipient. None = skip send.
        dry_run: If True, render HTML but do not send mail.

    Returns:
        {"snapshot_id", "html_size", "sections", "mail_sent", "preview"}
    """
    snap = load_snapshot(db, snapshot_id)
    if not snap:
        return {"error": "snapshot not found", "snapshot_id": snapshot_id}

    doc_entries = snap.get("doc_entries") or []
    banner_result = snap.get("banner_result") or {}
    profile_dict = snap.get("profile") or {}
    cmp_vendors = snap.get("cmp_vendors") or []
    site_label = snap.get("site_label") or snap.get("site_domain")

    # Reconstruct doc_texts mapping (was the input to mail-render)
    doc_texts: dict[str, str] = {}
    for e in doc_entries:
        dt = e.get("doc_type", "")
        txt = (e.get("full_text") or e.get("text_preview") or "").strip()
        if dt and txt:
            doc_texts[dt] = txt

    # Build results list mock (just enough for mail-render)
    from compliance.services.doc_checks.runner import DocCheckResult

    def _dict_to_result(d: dict) -> Any:
        """Best-effort reconstruction. Snapshot didn't persist DocCheckResult
        so we fake minimal fields. For real MC-replay (v2) we'd re-run the
        check_document_completeness function against the snapshot text."""
        return type("R", (), {
            "doc_type": d.get("doc_type", "other"),
            "label": d.get("doc_type", "Dokument"),
            "completeness_pct": d.get("completeness_pct", 0),
            "correctness_pct": d.get("correctness_pct"),
            "checks": [],
            "error": d.get("error", ""),
        })()

    results = [_dict_to_result(e) for e in doc_entries]

    # Render mail sections
    section_sizes: dict[str, int] = {}
    parts: list[str] = []

    try:
        from compliance.api.agent_doc_check_critical import build_critical_findings_html
        critical_html = build_critical_findings_html(banner_result, None, results) or ""
        parts.append(critical_html)
        section_sizes["critical"] = len(critical_html)
    except Exception as e:
        logger.warning("Replay: critical-block failed: %s", e)

    try:
        from compliance.api.scope_disclaimer import build_scope_disclaimer_html
        disclaimer = build_scope_disclaimer_html()
        parts.append(disclaimer)
        section_sizes["disclaimer"] = len(disclaimer)
    except Exception as e:
        logger.warning("Replay: disclaimer failed: %s", e)

    try:
        from compliance.api.agent_doc_check_banner import build_banner_deep_html
        banner_html = build_banner_deep_html(banner_result) or ""
        parts.append(banner_html)
        section_sizes["banner"] = len(banner_html)
    except Exception as e:
        logger.warning("Replay: banner-block failed: %s", e)

    try:
        from compliance.api.vvt_table_renderer import build_vvt_table_html
        vvt_html = build_vvt_table_html(cmp_vendors) or ""
        parts.append(vvt_html)
        section_sizes["vvt"] = len(vvt_html)
    except Exception as e:
        logger.warning("Replay: vvt failed: %s", e)

    full_html = "".join(parts)

    result = {
        "snapshot_id": snapshot_id,
        "check_id": snap.get("check_id"),
        "site_domain": snap.get("site_domain"),
        "html_size": len(full_html),
        "sections": section_sizes,
        "mail_sent": False,
        "preview": full_html[:500] + "..." if len(full_html) > 500 else full_html,
    }

    if recipient and not dry_run:
        try:
            from compliance.services.email_sender import send_email
            email_res = send_email(
                recipient=recipient,
                subject=f"[REPLAY] {site_label} (Snapshot {snapshot_id[:8]})",
                body_html=full_html,
            )
            result["mail_sent"] = (email_res.get("status") == "sent")
            result["mail_status"] = email_res.get("status")
        except Exception as e:
            logger.warning("Replay: mail send failed: %s", e)
            result["mail_send_error"] = str(e)[:200]

    return result
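The four render blocks in the function above share one pattern: build a section, append it, record its size, and log instead of failing if the builder raises. A possible way to express that pattern once is sketched below. This is not part of the commit; `render_sections` and the zero-argument builder callables are hypothetical names introduced for illustration.

```python
import logging
from typing import Callable, Dict, List, Optional, Tuple

logger = logging.getLogger(__name__)


def render_sections(
    builders: List[Tuple[str, Callable[[], Optional[str]]]],
) -> Tuple[str, Dict[str, int]]:
    """Render each named section; a failing builder is logged and skipped."""
    parts: List[str] = []
    sizes: Dict[str, int] = {}
    for name, build in builders:
        try:
            html = build() or ""  # treat None as an empty section
        except Exception as exc:
            logger.warning("Replay: %s failed: %s", name, exc)
            continue
        parts.append(html)
        sizes[name] = len(html)
    return "".join(parts), sizes


def _broken() -> str:
    raise RuntimeError("renderer exploded")


# One section renders, one fails, one returns None.
html, sizes = render_sections([
    ("critical", lambda: "<p>hi</p>"),
    ("banner", _broken),
    ("vvt", lambda: None),
])
```

The trade-off is that each builder must be wrapped as a zero-argument callable (for example with `functools.partial` or a lambda), which also means the lazy imports in the original would have to move into those callables to keep their failure isolated per section.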