feat(audit): P72 MC-Scope-Filter + P73 MC-Solution-Generator
CI / detect-changes (push) Successful in 12s
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped

P72: rag_document_checker LEFT JOINs canonical_controls.scope_doc_type.
_filter_by_canonical_scope drops MCs whose scope explicitly points to
an incompatible doc type (mapping in _SCOPE_COMPATIBLE).
Conservative: 'other'/NULL/'process' are kept, because heuristic v1 is
not yet strong enough for hard filtering.
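
The conservative filter described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the contents of the compatibility table and the function signature are assumptions, since the real _SCOPE_COMPATIBLE mapping is not shown in the diff.

```python
from typing import Optional

# Hypothetical compatibility table: doc type -> accepted scope_doc_type values.
# The real _SCOPE_COMPATIBLE contents are not part of the visible diff.
SCOPE_COMPATIBLE: dict[str, set[str]] = {
    "dse": {"dse"},
    "tom": {"tom"},
    "vvt": {"vvt"},
}

# Scopes that are too vague to filter on; MCs carrying them are always kept.
UNFILTERED_SCOPES: set[Optional[str]] = {None, "", "other", "process"}


def filter_by_canonical_scope(mcs: list[dict], doc_type: str) -> list[dict]:
    """Drop only MCs whose scope explicitly names an incompatible doc type."""
    allowed = SCOPE_COMPATIBLE.get(doc_type)
    if allowed is None:
        # Unknown doc type: stay conservative and filter nothing.
        return list(mcs)
    kept = []
    for mc in mcs:
        scope = mc.get("scope_doc_type")
        if scope in UNFILTERED_SCOPES or scope in allowed:
            kept.append(mc)
    return kept
```

With this shape, an MC scoped to 'tom' would be dropped for a 'dse' document, while an MC with a NULL or 'process' scope would pass through unchanged.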

Expected effect: ~10-15% fewer irrelevant MCs per doc, because e.g.
a TOM MC no longer shows up as a DSE finding.

P73: mc_solution_generator.py: a Qwen->OVH cascade generates, for each
HIGH/CRITICAL fail, a concrete insertion recommendation with an anchor
(where + what) and an effort estimate. JSON schema {solution_text,
anchor_hint, effort_min}. In-process LRU cache (500 entries) keyed by
(mc_id, doc_md5).
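
An in-process LRU cache keyed by (mc_id, doc_md5) could look roughly like the sketch below. The class name SolutionCache and its methods are hypothetical; the commit only states the cache size and the key, not how the cache is implemented.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


class SolutionCache:
    """Small LRU cache; names and layout are illustrative assumptions."""

    def __init__(self, max_entries: int = 500) -> None:
        self._max = max_entries
        self._data: "OrderedDict[tuple[str, str], dict]" = OrderedDict()

    @staticmethod
    def key(mc_id: str, doc_text: str) -> tuple[str, str]:
        # MD5 is used here only as a content fingerprint, not for security.
        digest = hashlib.md5(doc_text.encode("utf-8")).hexdigest()
        return mc_id, digest

    def get(self, key: tuple[str, str]) -> Optional[dict]:
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: tuple[str, str], value: dict) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict least recently used
```

Keying on the document hash rather than the document text keeps the keys small, and a changed document naturally misses the cache because its hash changes.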

Max 3 solutions per doc type, global cap 8, which keeps latency < 60s.
The blocks are hooked into the mail render at the VVT block (directly
before vvt_html) as 'Loesungs-Vorschlaege (KI-generiert)' (AI-generated
solution proposals). Disclaimer: not legal advice, review with the DSB
(data protection officer).
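
The two caps can be illustrated with a small selection helper. This is a simplified sketch under assumptions: select_fails is a hypothetical name, and unlike the diff (which extends the list per doc type and slices to 8 at the end) it stops as soon as the global cap is reached.

```python
def select_fails(
    fails_by_doc: dict[str, list[dict]],
    per_doc: int = 3,
    global_cap: int = 8,
) -> list[tuple[str, dict]]:
    """Pick at most per_doc fails per doc type and global_cap overall."""
    selected: list[tuple[str, dict]] = []
    for doc_type, fails in fails_by_doc.items():
        for fail in fails[:per_doc]:
            if len(selected) >= global_cap:
                return selected
            selected.append((doc_type, fail))
    return selected
```

With four doc types that each have five HIGH/CRITICAL fails, this yields 3 + 3 + 2 = 8 entries, so the number of LLM calls stays bounded regardless of how many checks fail.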

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-05-21 17:21:19 +02:00
parent 4183379dc5
commit 309c10c203
3 changed files with 366 additions and 8 deletions
@@ -973,14 +973,22 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
     from compliance.services.mc_scorecard import build_scorecard
     from .agent_doc_check_scorecard import build_scorecard_html
     all_mc_checks: list[dict] = []
+    # P73: collect fails per doc so the solution generator can be called
+    # per doc type with the matching doc_text.
+    fails_by_doc: dict[str, list[dict]] = {}
     for r in results:
         for c in r.checks:
             if c.id.startswith("mc-"):
-                all_mc_checks.append({
+                rec = {
                     "id": c.id, "label": c.label, "passed": c.passed,
                     "severity": c.severity, "skipped": c.skipped,
                     "regulation": c.regulation,
-                })
+                    "hint": getattr(c, "hint", "") or "",
+                }
+                all_mc_checks.append(rec)
+                if (not c.passed and not c.skipped
+                        and (c.severity or "").upper() in ("CRITICAL", "HIGH")):
+                    fails_by_doc.setdefault(r.doc_type, []).append(rec)
     scorecard = build_scorecard(all_mc_checks) if all_mc_checks else {}
     # Trend: load previous scorecard for the same tenant + domain so the
     # email can show delta indicators (A6).
@@ -1168,6 +1176,32 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
     except Exception as e:
         logger.warning("P92/P94 consistency-check failed: %s", e)
+    # P73: MC-Solution-Generator, LLM proposals per HIGH/CRITICAL fail.
+    # Max 3 solutions per doc type (global cap 8) to keep latency < 60s.
+    solutions_html = ""
+    try:
+        from compliance.services.mc_solution_generator import (
+            generate_solutions_for_fails, build_solutions_block_html,
+        )
+        all_solutions: list[dict] = []
+        for dt, fails in fails_by_doc.items():
+            if not fails:
+                continue
+            doc_txt = doc_texts.get(dt) or doc_texts.get("dse") or ""
+            if not doc_txt or len(doc_txt) < 500:
+                continue
+            sols = await generate_solutions_for_fails(
+                fails, doc_txt, dt, limit=3,
+            )
+            all_solutions.extend(sols)
+            if len(all_solutions) >= 8:
+                break  # global cap
+        if all_solutions:
+            solutions_html = build_solutions_block_html(all_solutions[:8])
+            logger.info("P73: %d MC-Solutions generated", len(all_solutions))
+    except Exception as e:
+        logger.warning("P73 MC-Solution-Generator skipped: %s", e)
     # P82: management one-pager (GF-1-Pager) at the very top of the mail:
     # a 5-bullet summary so management does not have to read 124k chars.
     gf_one_pager_html = ""
@@ -1232,7 +1266,7 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
         + cookie_arch_html + summary_html + scanned_html + profile_html
         + scorecard_html + redundancy_html
         + providers_html + banner_deep_html + library_mismatch_html
-        + consistency_html + signals_html
+        + consistency_html + signals_html + solutions_html
         + vvt_html + report_html
     )