4087bb5f185d31bc448fc75ef18db59f9fdff1aa
4 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
6f16507c5f |
feat(banner): P19 + P20 — Per-Category-Click-Test + Frontend-Drilldown
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m54s
CI / test-go (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P19 (consent-tester): - dp-cookieconsent (TYPO3, Safetykon-Pattern) als CMP-Profil hinzu — Selektoren #dp--cookie-statistics/marketing + a.cc-allow Save-Button - Neues Signal provider_details_visible: nach Kategorie-Toggle prueft Playwright ob im Banner sichtbare Provider-/Cookie-Detail-Elemente erscheinen. Bei dp-cookieconsent (Banner ohne Listing) immer False -> HIGH-Violation "Kategorie zeigt keine Provider-/Cookie-Details — Nutzer kann nicht informiert einwilligen (Art. 7 Abs. 1 DSGVO)" - main.py serialisiert provider_details_visible + cookies_set pro Kategorie P20 (Frontend-Drilldown): - Backend: check_payloads-Tabelle um Spalte 'banner' (JSON) — voller banner_result persistiert (vorher nur in-memory). ALTER TABLE Migration idempotent. - Neuer Endpoint GET /api/compliance/agent/banner/<check_id> — liefert Quality-Score, Phases, Category-Tests, Banner-Checks, alle 46 structured_checks. - Frontend: BannerTab im /sdk/agent/audit/<id> mit Quality-Cards, 3-Phasen-Cookie-Tabelle, Per-Category-Listing (mit P19-Signal rot/gruen), Banner-Verstoesse + Rechtsgrundlagen, 46-Check-Drilldown filterbar nach Severity. - Tab-Switcher in page.tsx um "Cookie-Banner-Analyse" erweitert. - Bonus: 2 alte route.ts auf Next.js 15 Promise-params umgestellt (Build-Fix). Plus: Critical-Findings-Block nutzt provider_details_visible als primaeres Signal statt nur tracking_services-Anzahl. Smoke-Test Safetykon: 4 Critical Findings im Mail, banner-Endpoint liefert 46 checks + 3 phases + 2 categories mit provider_details_visible=False. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
575644c9c5 |
feat(audit): P8 — MC-Severity raus, Email nur harte Findings, MC-Audit als Checkliste
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m48s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Email-Hardening (mc_scorecard.top_fails):
Neue _is_hard_finding-Heuristik filtert konditionale MCs ohne
Negativ-Beleg aus den Top-Auffaelligkeiten. matched_text leer + Label
enthaelt "falls/sofern/wenn/soweit/ggf." -> raus, landet nur noch im
MC-Audit als "selbst pruefen". DATA-2066-A05 (kostenfreie Abschaltung
Standortdaten) ist das prototypische Beispiel.
MC-Audit-Frontend (audit/[checkId]/page.tsx):
Severity-Spalte (CRITICAL/HIGH/MEDIUM/LOW) entfernt — der MC-Audit
ist eine Checkliste, keine Severity-Drohung. Stattdessen:
- Spalte "Prioritaet" mit 3-Tier aus regulation-Mapping:
Gesetz (DSGVO/ePrivacy/TDDDG/...) / Behoerden-Leitlinie
(EDPB/DSK/EuGH/...) / Best-Practice (ISO/NIST/BSI)
- 3-Status: erfuellt (✓) / nicht erfuellt (✗) / selbst pruefen (?)
/ nicht anwendbar (—). rowReviewStatus() leitet "selbst pruefen"
aus matched_text-leer + konditionalem Label ab.
- Filter umgebaut auf 5 Stati statt 4
- Default-Filter "Nicht erfuellt" (vorher "Nur Fail")
Bonus: f.payload.risk_label TS-Cast im FindingsTab clean gemacht
(unknown -> string).
Effekt:
- Email an die GF zeigt nur noch echte Belege ("DSB fehlt",
"Gebuehr fuer Widerruf")
- MC-Audit ist eine sachliche Pruefliste fuer den Compliance-Officer
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6c223c7c9b |
feat(compliance-check): exec-summary + voll-audit + TDM-respect + cookie-KB-extended + saving-scan-funnel
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m43s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P1 — Exec-Summary oben im Email-Report (4 KPIs + 2 CTAs, dunkler Gradient)
P3 — no_direct_sales-Flag fuer OEM-Konfigurator-Sites; AGB/Widerruf/AGB als
"NICHT ANWENDBAR" (grau) statt "NICHT GEFUNDEN" (rot)
P5 — Voll-Audit Unification: alle Findings (MC + Pflichtangaben + Vendor +
Redundanz) in /data/compliance_audits.db.unified_findings; neuer
/api/compliance/agent/findings/<id> Endpoint + FindingsTab im Audit-UI
mit Filter + CSV-Export
P7 — Crawl-Hardening: TDM-Reservation-Check (robots.txt / ai.txt / Header /
Meta) vor jedem Run mit 24h-Cache; HeadlessChrome-UA (Firma noch nicht
gegruendet — Switch via BREAKPILOT_BRANDED_UA env); per-Domain
Rate-Limit 1 req/s + max 2 concurrent
P2 — Cookie-Knowledge-DB additiv erweitert (35 -> 74 Cookies): Adobe, Meta,
Microsoft, LinkedIn, TikTok, HubSpot, Marketo, Salesforce, Hotjar,
FullStory, Mouseflow, Intercom, Drift, Zendesk, Cloudflare, Stripe,
OneTrust/Cookiebot/Usercentrics, Matomo, Pinterest, Snapchat, X/Twitter,
YouTube, Vimeo, Klaviyo, Mailchimp, Mixpanel, Segment, Amplitude,
Optimizely, Datadog; Wire-in in cookie_function_classifier liefert
compliance_risk-Label (kritisch/hoch/mittel/gering) pro Vendor
A — k-Anonymitaets-Helper (benchmark_k_anonymity) fuer P6-Vorbereitung
B — Cross-Tenant-Domain-Assertion im /findings-Endpoint (expected_domain
Query-Param -> 403 bei Mismatch)
C — Saving-Scan-Funnel: /api/compliance/agent/saving-scan/start mit
Validierung + 24h-Rate-Limit pro Domain + Lead-Persistenz in
saving_scan_leads + Auto-Discovery via _run_compliance_check; 6 Tests
D — Risk-Badge im Email-Vendor-Row
Rechtliche Leitplanken (Memory feedback_oem_data_legal.md): nur eigene
Knapp-Bewertungen + Source-Pointer, keine 1:1-Kopien fremder CMP-Texte.
TDM-Opt-Out-Respect nach § 44b UrhG. KEINE Schema-Aenderungen — alles in
Sidecar-SQLite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6ed30dae5b |
feat(agent): MC scorecard + audit drill-down + tenant trend (A1-A6)
Now that all 1874 MCs run per check (Task #30 cap removal), the report was about to drown in noise. This commit adds the full aggregation / persistence / drill-down stack so each MC is actionable, not just counted. A1 mc_scorecard.py (new): build_scorecard(checks) -> per-regulation PASS/FAIL/SKIP + severity top_fails(checks, n) -> N most severe failed MCs full_audit_records(...) -> flat rows ready for sidecar SQLite A2 Email rendering: agent_doc_check_scorecard.py (new) builds an HTML scorecard table (regulation × passed/failed/HIGH/MEDIUM/score) shown at the top of the email. agent_doc_check_report._render_document now collapses the 500-MC L2 forest into 'X/Y bestanden (Z Fail)' summary plus a top-10 fails block per doc — old verbose render is gone. A3 compliance_audit_log.py (new) — sidecar SQLite at /data/compliance_audits.db (separate from compliance Postgres schema to comply with the no-new-migrations rule in CLAUDE.md): check_runs(check_id, ts, tenant_id, site_name, base_domain, doc_count, scorecard json, vvt_summary json) mc_results(check_id, doc_type, mc_id, label, passed, skipped, severity, regulation, matched_text, hint) Route persists every run after the email is sent. docker-compose.yml adds compliance-audit volume + env. A4 backfill_mc_regulation_llm.py (new) — Qwen-tagged backfill for the 1636 MCs the regex pass couldn't classify. Batches of 25, format=json, output constrained to the canonical regulation list. Run manually: docker exec bp-compliance-backend python3 \ /app/scripts/backfill_mc_regulation_llm.py [--dry-run] A5 Admin audit tab — GET /api/compliance/agent/audit/<check_id> proxied via /api/sdk/v1/agent/audit/<id>. New page /sdk/agent/audit/[checkId] renders scorecard + filterable MC table (status / doc_type / regulation, expandable rows with matched_text + hint). ComplianceCheckTab now shows 'Voll-Audit oeffnen' link. A6 Trend per tenant — GET /api/compliance/agent/audit/tenant/<id> returns recent runs. Email scorecard shows per-regulation delta badges ('(+12%)', '(-3%)') compared with the previous run for the same tenant + base_domain. Lookup is one SQLite query. Plumbing: rag_document_checker.py — SELECT now includes 'article'; MC results carry 'regulation' + 'article' through to CheckItem. agent_doc_check_routes.CheckItem schema gains regulation + article fields (defaults '') so old clients still parse. agent_compliance_check_routes — response gains 'check_id' so the frontend can build the audit link. |