After a compliance-check run finishes, the user can now apply the
extracted vendor inventory directly to their own:
- CookieBanner config (admin /sdk/einwilligungen)
- Cookie-Policy / VVT-Register / Privacy-Policy templates
(admin /sdk/document-generator)
Backend:
- migration_to_banner.py: vendor list -> CookieBannerConfig with
ESSENTIAL/PERFORMANCE/PERSONALIZATION/EXTERNAL_MEDIA buckets +
review flags (broken opt-out URLs, missing expiry, no cookies listed)
- migration_to_document.py: vendor list -> pre-fills for 3 doc
templates, recipient-type aware (INTERNAL/GROUP/PROCESSOR/CONTROLLER)
- agent_migration_routes.py: GET /banner-preview, /document-preview,
/summary keyed on check_id
- compliance_audit_log: new check_payloads table persists cmp_vendors +
extracted_profile so the preview survives an app restart
- tests: 9 mapper units + 4 endpoint integration tests
Frontend:
- MigrationPanel.tsx: modal showing banner-config diff + document
pre-fills, plus links into the existing editors
- ComplianceCheckTab.tsx: replaces standalone audit link with the
panel; net -3 lines, stays at the 500-cap
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Now that all 1874 MCs run per check (Task #30 cap removal), the report
was about to drown in noise. This commit adds the full aggregation /
persistence / drill-down stack so each MC is actionable, not just
counted.
A1 mc_scorecard.py (new):
build_scorecard(checks) -> per-regulation PASS/FAIL/SKIP + severity
top_fails(checks, n) -> N most severe failed MCs
full_audit_records(...) -> flat rows ready for sidecar SQLite
A2 Email rendering:
agent_doc_check_scorecard.py (new) builds an HTML scorecard table
(regulation × passed/failed/HIGH/MEDIUM/score) shown at the top of
the email. agent_doc_check_report._render_document now collapses
the 500-MC L2 forest into 'X/Y bestanden (Z Fail)' summary plus
a top-10 fails block per doc — old verbose render is gone.
A3 compliance_audit_log.py (new) — sidecar SQLite at
/data/compliance_audits.db (separate from compliance Postgres
schema to comply with the no-new-migrations rule in CLAUDE.md):
check_runs(check_id, ts, tenant_id, site_name, base_domain,
doc_count, scorecard json, vvt_summary json)
mc_results(check_id, doc_type, mc_id, label, passed, skipped,
severity, regulation, matched_text, hint)
Route persists every run after the email is sent.
docker-compose.yml adds compliance-audit volume + env.
A4 backfill_mc_regulation_llm.py (new) — Qwen-tagged backfill for
the 1636 MCs the regex pass couldn't classify. Batches of 25,
format=json, output constrained to the canonical regulation list.
Run manually: docker exec bp-compliance-backend python3 \
/app/scripts/backfill_mc_regulation_llm.py [--dry-run]
A5 Admin audit tab — GET /api/compliance/agent/audit/<check_id>
proxied via /api/sdk/v1/agent/audit/<id>. New page
/sdk/agent/audit/[checkId] renders scorecard + filterable MC table
(status / doc_type / regulation, expandable rows with matched_text
+ hint). ComplianceCheckTab now shows 'Voll-Audit oeffnen' link.
A6 Trend per tenant — GET /api/compliance/agent/audit/tenant/<id>
returns recent runs. Email scorecard shows per-regulation delta
badges ('(+12%)', '(-3%)') compared with the previous run for the
same tenant + base_domain. Lookup is one SQLite query.
Plumbing:
rag_document_checker.py — SELECT now includes 'article'; MC results
carry 'regulation' + 'article' through to CheckItem.
agent_doc_check_routes.CheckItem schema gains regulation + article
fields (defaults '') so old clients still parse.
agent_compliance_check_routes — response gains 'check_id' so the
frontend can build the audit link.
9 files had conflict markers from the branch merge. All resolved keeping
the feature branch version. Also split agent_scan_routes.py (534→367 LOC)
by extracting Pydantic models to agent_scan_models.py.
[guardrail-change]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New "Banner-Check" tab with:
- URL input → Playwright 3-phase test (before/reject/accept)
- Shield icon + provider detection
- Progress bar with pass/fail percentage
- 3-phase summary (cookies + scripts per phase)
- Violations (red) and passes (green) in structured list
Backend: new POST /api/compliance/agent/banner-check endpoint
that proxies to consent-tester:8094/scan.
Next step: Upgrade banner checks to L1/L2 format with expert
hints (same quality as document checks).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New "Dokumenten-Pruefung" tab in Compliance Agent:
- User adds multiple URLs with document type (DSI, AGB, Impressum, Cookie, Widerruf)
- Each document loaded via Playwright, accordions expanded, text extracted
- Checked against type-specific legal checklist
- Optional: Cookie banner check via checkbox
Checklisten-UX (solves "100% looks like nothing was checked"):
- All checks shown per document: green checkmark + matched text excerpt
- Red X for missing fields with legal reference
- Builds user trust: "9 Punkte geprueft, alle bestanden"
- Expandable per document with completeness bar
New checklists:
- Impressum: §5 TMG (6 fields: name, address, contact, register, VAT, representative)
- Cookie-Richtlinie: §25 TDDDG (5 fields: types, purposes, retention, third-party, opt-out)
Backend:
- POST /agent/doc-check — async with polling (same pattern as /scan)
- DocCheckResult includes checks[] with passed/failed + matched_text
- dsi_document_checker returns all_checks in SCORE finding
- Email report shows per-document checklist
Files: agent_doc_check_routes.py (280 LOC), DocCheckTab.tsx (248 LOC),
ChecklistView.tsx (130 LOC), dsi_document_checker.py (+70 LOC)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fundamental fix: scans now run asynchronously with progress polling.
Backend:
- POST /scan starts background task, returns scan_id immediately
- GET /scan/{scan_id} returns status + progress + result when done
- 7 progress steps shown: Website scan, DSI discovery, DSE analysis,
SOLL/IST comparison, corrections, report, email
- In-memory job store (dict with scan_id → status/result)
- No timeout limits on scan duration
Frontend:
- POST starts scan, receives scan_id
- Polls GET every 5 seconds (max 120 attempts = 10 min)
- Shows live progress message during scan
- Displays result when completed, error when failed
Proxy:
- POST timeout reduced to 30s (just starts the job)
- GET timeout 10s (just status check)
- No more 504/connection-dropped errors
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Third tab "Cookie-Test" in Compliance Agent:
- Phase A: Before consent (tracking without permission)
- Phase B: After rejection (CRITICAL if tracking persists)
- Phase C: After acceptance (undocumented services)
- CMP badge (Didomi, OneTrust, etc.)
- Violation cards with severity badges and legal references
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Chat-Verlauf wird als strukturiertes Beratungsprotokoll per Email
an den DSB gesendet. Button erscheint im Header sobald Nachrichten
vorhanden sind. Zeigt Checkmark nach erfolgreichem Versand.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>