breakpilot-compliance

Author	SHA1	Message	Date
Benjamin Admin	8cbb513e2c	feat(audit): Phase 1 Quick-Wins (P81 + P85 + P70 + P83) + TCF DELETE/INSERT-Fix P81: tests/fixtures/golden_truth/vw_de.json: ground-truth fixture with must_find_cookies (47 VW cookies) + expected_vendors (Google, Adobe, Trade Desk, ...). Basis for future regression tests. P85: banner_screenshot_block.py + consent_scanner.py + main.py: on banner detection, consent-tester takes a base64 PNG screenshot (< 1.5 MB). The backend renders it as <img src="data:..."> directly after the GF one-pager. Visual evidence of 'this is what the banner looked like' for disputes with marketing or the DPO. P70: rag_provenance.py: classify_finding_provenance() classifies a finding as 'rag' (norm + source), 'mixed' (norm without source) or 'heuristic' (own interpretation). provenance_badge_html() renders small badges (✓ RAG / NORM / ⚠ HEURISTIK). The module is generic and can be hooked into any finding renderer. P83: scripts/check-rebuild-needed.sh: checks whether the BUILD_SHA deployed in the container matches the local HEAD. On a mismatch it exits with status 1 and a 'REBUILD REQUIRED' hint. This prevents the 'old code in the container' problem that has bitten us several times (frontend tabs visible, backend without the new service). TCF fix: tcf_vendor_authority.py: cookie_library has no UNIQUE index on cookie_name, so ON CONFLICT was impossible. Solution: before the insert, DELETE WHERE source_name='iab_tcf_v2'. Idempotent. Plus a per-vendor commit so that one failure does not block the following vendors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 08:24:46 +02:00
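The delete-then-insert pattern from the TCF fix above can be sketched as follows. This is a minimal illustration using an in-memory SQLite table, not the project's actual code: the function name `ingest_vendors`, the `vendor` column and the input dict shape are assumptions; only `cookie_library`, `cookie_name` and `source_name` come from the commit message.

```python
import sqlite3


def ingest_vendors(conn, vendors, source_name="iab_tcf_v2"):
    """Replace all rows of one source; commit per vendor (hypothetical helper)."""
    # Without a UNIQUE index there is no conflict target for ON CONFLICT,
    # so the previous import of this source is wiped first. Re-running the
    # ingest therefore yields the same table contents (idempotent).
    conn.execute("DELETE FROM cookie_library WHERE source_name = ?", (source_name,))
    conn.commit()

    inserted = 0
    for vendor in vendors:
        try:
            conn.execute(
                "INSERT INTO cookie_library (cookie_name, vendor, source_name) "
                "VALUES (?, ?, ?)",
                (vendor["cookie_name"], vendor["vendor"], source_name),
            )
            conn.commit()  # per-vendor commit: one bad record cannot undo the others
            inserted += 1
        except (sqlite3.Error, KeyError):
            conn.rollback()  # skip the broken record and continue
    return inserted
```

Committing per vendor trades some throughput for isolation: a malformed entry costs one row instead of the whole batch.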
Benjamin Admin	2e87b74749	feat(audit): P103+P104+P105 defeat-device heuristic for cookies Three related stages of 'cookie behaviour differs from what is declared', analogous to the VW diesel scandal pattern (test bench vs. real-world operation). P103 (stage 3): cookie_value_entropy.py: classifies cookie values as flag/short_id/long_token/uuid/hash/json_blob via Shannon entropy + regex patterns. If a cookie declared as 'essential' has a 64-char Base64 value → MEDIUM finding 'Defeat-Device-Heuristik'. P104 (stage 4): cookie_network_tracer.py: compares the cookie domain with the site's main domain + known tracker vendors (50 domains mapped: doubleclick.net, facebook.com, demdex.net, omtrdc.net, adsrvr.org, hotjar.com, ...). If a cookie declared as 'essential' is set from an external tracker domain → HIGH. Third-country cookies are flagged as 'DRITTLAND US/CN/...' (a consequence of Schrems II). P105 (stage 5): tcf_vendor_authority.py: the ingest endpoint POST /api/compliance/agent/admin/tcf-ingest fetches the IAB TCF v2 Global Vendor List (vendor-list.consensu.org/v3) and upserts it into cookie_library with source='iab_tcf_v2'. cross_reference_with_tcf fuzzy-matches cmp_vendors against the TCF list: if a vendor is listed as marketing in TCF but the site says 'Funktional' → HIGH (an external authority contradicts the declaration). All three render their own mail blocks in the cookies section (after cookie_audit_html, before library_mismatch_html). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 00:24:07 +02:00
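A rough sketch of the P103 idea (Shannon entropy plus regex patterns) might look like this. The category names are taken from the commit message; the regexes, the entropy threshold of 4.0 bits, the length cut-off of 32 and the helper `is_suspicious` are assumptions for illustration only.

```python
import math
import re
from collections import Counter

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)
HEX_HASH_RE = re.compile(r"^[0-9a-f]{32,64}$", re.I)


def shannon_entropy(value: str) -> float:
    """Entropy in bits per character of the value's character distribution."""
    if not value:
        return 0.0
    n = len(value)
    return -sum((c / n) * math.log2(c / n) for c in Counter(value).values())


def classify_cookie_value(value: str) -> str:
    if value.lower() in {"0", "1", "true", "false", "yes", "no"}:
        return "flag"
    if value.startswith("{") and value.endswith("}"):
        return "json_blob"
    if UUID_RE.match(value):
        return "uuid"
    if HEX_HASH_RE.match(value):
        return "hash"
    # Assumed thresholds: long and high-entropy values look like identifiers.
    if len(value) >= 32 and shannon_entropy(value) >= 4.0:
        return "long_token"
    return "short_id"


def is_suspicious(declared_category: str, value: str) -> bool:
    """Hypothetical check: 'essential' cookie that carries an identifier-like value."""
    return declared_category == "essential" and classify_cookie_value(value) in {
        "long_token",
        "hash",
        "uuid",
    }
```

A plain on/off flag has near-zero entropy, while a 64-character random Base64 string approaches 6 bits per character, which is why the combination of length and entropy separates the two reasonably well.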
Benjamin Admin	6263462ba3	feat(frontend): tab layout for audit results + cookie_audit in API ResultsTabsView.tsx: new component with 7 tabs: 1. Overview (KPIs: docs, findings, vendors, score) 2. Cookies & VVT (3-source compliance comparison + undocumented/compliant/not-loaded + deduplicated vendor table) 3. Privacy policy (DSE findings via ChecklistView) 4. Legal notice (Impressum) 5. Terms and conditions / withdrawal (AGB / Widerruf; two sections in one tab) 6. Cookie banner (violations + phase KPIs) 7. Mail preview (PDF download link). Sticky tab header at the top, content scrolls below it. The long scrolling mail is gone as a result. DocCheckTab uses ResultsTabsView instead of the old inline ChecklistView. The backend now returns a cookie_audit dict in the response (in addition to cmp_vendors + banner_result) so that the cookie tab can render the 3 lists (undocumented / compliant / not loaded). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 23:44:36 +02:00
Benjamin Admin	081e4f057a	feat(audit): cookie compliance audit (3-source comparison) + vendor dedup + block parser CENTRAL USP: cookie_compliance_audit.py compares 3 sources * DECLARED in the cookie policy (parse_cookie_table + parse_flat) * ACTUALLY loaded in the browser (banner_result.phases.after_accept) * LIBRARY metadata (cookie_library lookup). It yields 3 lists with a compliance verdict: * compliant (declared AND loaded): green block * undeclared_in_browser (loaded, NOT declared): RED HIGH block → violation of Art. 13(1)(c) GDPR + § 25 TDDDG * declared_not_loaded (declared, NOT loaded): yellow notice → table possibly outdated. parse_cookie_table extended with a block format (5 lines per cookie, as in the user copy from VW). Finds 35+ cookies from copy-paste instead of 0. vendor_normalizer.py: 50+ aliases (Google family, Adobe family, Trade Desk, AdForm, ...) + garbage filter (URLs, empty strings, 'click to select', 'Mehrere OEMs'). Merges the cookies lists during dedup. _guess_vendor extended: Adobe family (s_ecid/AMCV/demdex/mbox/...), Trade Desk (TDID/TDCPM/TTDOptOut), AdForm (uid/cid/otsid), Salesforce LiveAgent, etracker, Akamai, EDAA. audit_quality_checks: the vendor-thin threshold is now dynamic, based on the cookie doc's word count (3k→10 / 6k→20 / 10k→30 / 15k+→40). VW test fixture: tests/fixtures/cookie_gt/vw_cookie_richtlinie.txt (36-cookie sample for regression tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 23:36:45 +02:00
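The three-source comparison described above reduces to set operations on cookie names. The sketch below assumes plain name lists and a dict-shaped library; the function name `audit_cookies` and the output record shape are hypothetical, while the three list names follow the commit message.

```python
def audit_cookies(declared, loaded, library):
    """Compare declared vs. loaded cookie names and attach library metadata.

    declared: names parsed from the cookie policy
    loaded:   names seen in the browser after accepting the banner
    library:  mapping name -> metadata (assumed shape), may lack entries
    """
    declared_set, loaded_set = set(declared), set(loaded)

    def enrich(names):
        # Sorted for stable output; unknown cookies get library=None.
        return [{"name": n, "library": library.get(n)} for n in sorted(names)]

    return {
        "compliant": enrich(declared_set & loaded_set),
        "undeclared_in_browser": enrich(loaded_set - declared_set),
        "declared_not_loaded": enrich(declared_set - loaded_set),
    }
```

For example, `audit_cookies(["IDE", "_ga"], ["IDE", "TDID"], {})` puts `IDE` in `compliant`, `TDID` in `undeclared_in_browser` and `_ga` in `declared_not_loaded`.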
Benjamin Admin	1451873194	fix(audit): parse_flat_cookie_text for VW-style flat tables The VW cookie doc delivers the table as FLAT text without column separators: 'IDE Tracking Cookies (Marketing) Beschreibung 13 Monate Permanent TAID Tracking Cookies (Marketing) ...' parse_flat_cookie_text matches with a regex: NAME [Tracking|Session|Funktional|...] Cookies ... [13 Monate|Session|Permanent]. The backend falls back to parse_flat when parse_cookie_table=[]. This lets us extract ~30-50 cookies + vendors deterministically from the 65k VW cookie doc, even when the HTML table DOM extract is empty (which happens when the table is loaded via several append code paths). Bonus: _extract_dom_tables helper in dsi_discovery.py prepared for later hooking into all 7 DiscoveredDSI.append sites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 21:24:14 +02:00
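A regex in the spirit of the pattern quoted above could be sketched like this. The actual expression in parse_flat_cookie_text is not shown in the commit, so the character class for names, the duration units (Tage/Monate/Jahre) and the output dict shape are assumptions.

```python
import re

# NAME <kind> Cookies ... <duration>, scanned over flat text without separators.
FLAT_ROW_RE = re.compile(
    r"\b(?P<name>[A-Za-z0-9_.\-]+)\s+"
    r"(?P<kind>Tracking|Session|Funktional)\s+Cookies\b"
    r".*?"  # description text, matched lazily up to the first duration token
    r"(?P<duration>\d+\s+(?:Tage|Monate|Jahre)|Session|Permanent)"
)


def parse_flat_cookie_text(text: str) -> list:
    """Return one record per 'NAME <kind> Cookies ... <duration>' occurrence."""
    return [
        {"name": m["name"], "kind": m["kind"], "duration": m["duration"]}
        for m in FLAT_ROW_RE.finditer(text)
    ]
```

Because the description is matched lazily, each row ends at the first duration token, and a trailing word such as 'Permanent' is skipped before the next cookie name starts.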
Benjamin Admin	dfac940272	feat(licenses): attribution renderer — Stufe 1 (overview) + Stufe 3 (SourceBadge) Backend - backend-compliance/compliance/api/licenses_routes.py: three endpoints built on the now-complete license_rule classification - GET /api/compliance/licenses/overview global aggregation by rule + per-source breakdown (Stufe 1) - POST /api/compliance/licenses/aggregate per-control-set aggregation for PDF footer (Stufe 2) and tech-file appendix (Stufe 4) — consumed later - GET /api/compliance/licenses/source-info/{control_uuid} single-control lookup for the inline source badge (Stufe 3) - registered in api/__init__.py via the existing safe-import loader Frontend - app/sdk/licenses/page.tsx (Stufe 1): the /sdk/licenses overview page. Renders rule legend cards + per-rule source tables. Drives the /licenses footer link and gives auditors a one-page view of what licence classes the platform is operating under. - components/sdk/SourceBadge.tsx (Stufe 3): reusable React component. Small R1/R2/R3 pill with click-expand tooltip showing source regulation + attribution string + render-full-text policy. Will be embedded into IACE hazards/mitigations, VVT items, DSFA controls in follow-up commits. Two stages of the four-stage renderer are now ready. Stufe 2 (PDF auto-footer) + Stufe 4 (tech-file appendix) follow once the existing PDF generators are extended to call /licenses/aggregate.	2026-05-21 21:00:10 +02:00
Benjamin Admin	cb5dad1a2f	feat(audit): A audit transparency + B table parsing + D HTML tables from the DOM Three related fixes for the VW finding (6 vendors instead of 100+): A: audit_quality_checks.py: three systemic caveats that are ALWAYS shown prominently: * banner_detected=False despite a cookie doc → HIGH 'CMP-Tool ungeladen' * cookie_doc >= 30k chars but cmp_vendors < 15 → HIGH/MEDIUM 'Vendor-Liste auffaellig kurz fuer Doc-Groesse' * submitted URL but 0/mini text → MEDIUM 'URL nicht ladbar'. Red audit-caveat box above the GF one-pager. The GF summary says 'Audit unvollstaendig' instead of wrongly saying 'Keine kritischen Themen'. gf_one_pager includes audit_quality_findings in top_findings (BEFORE other findings). B: cookies_table_parser now also runs on crawled cookie-doc text (not only on user paste). If the dsi-discovery response delivers tab/pipe-separated table rows, we parse them deterministically. D: consent-tester/dsi-discovery now extracts, in addition to the text, the <table> elements from the DOM as list[str] (tab-separated per row, at least 2 cells, at least 3 rows, max 10 tables per doc). The backend feeds these in as an 'html_table' cmp_payload and runs them through cookies_table_parser first → 100% deterministic vendor extraction without an LLM. VW expectation: from the 65k cookie table, 30-50 vendors are now parsed deterministically instead of 6 from the LLM cascade. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 20:21:28 +02:00
Benjamin Admin	e411c4f0d3	feat(audit): text-paste mode per row, optionally bypassing the crawler Background: via the URL crawler, VW yields only 6 vendors instead of the 100+ listed in the real cookie table. If the user can copy the table directly from the site (which is possible on most OEM sites), we bypass the crawler entirely and parse the text deterministically. Backend: * doc_type_classifier.py: 7 pattern groups (§5 TMG, Art. 13 GDPR, terms-and-conditions clauses, withdrawal period, cookie-table headers, etc.). If the user pastes text into the wrong doc-type field (Impressum->DSE), detect_mismatch returns detected + action ('reclassify' at very high confidence, 'warn' at medium). * cookies_table_parser.py: tab/pipe/comma/semicolon separator auto-detection, column mapping by header keyword. Aggregates cookie entries into vendor records (with _guess_vendor fallback). Fully deterministic, no LLM. * doc_input_warnings.py: mail block above the audit that makes mismatches + auto-reclassifications transparent to the user. * Pipeline: text wins over url (already noted in the schema), new fields declared_doc_type / input_source / reclassify_hint in doc_entries. Vendors from pasted tables take precedence over library fallback + LLM cascade (they are 100% accurate). Frontend (DocCheckTab): * Per-row mode toggle 'URL' / 'Text einfuegen' (purple when active). * Textarea (h-32, monospace) in text mode with a context-specific placeholder (cookie hint vs. other doc types) and a live character/word counter. * Submit button accepts entries with a URL OR text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 18:58:32 +02:00
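Separator auto-detection with header-keyword column mapping, as described for cookies_table_parser.py, can be approximated as below. The detection rule (same non-zero separator count on every non-empty line), the keyword lists and the function names are assumptions, not the module's real logic.

```python
SEPARATORS = ["\t", "|", ";", ","]  # checked in this priority order

# Assumed header keywords (German and English) per target field.
HEADER_KEYWORDS = {
    "name": ("name", "cookie"),
    "vendor": ("anbieter", "vendor", "provider"),
    "duration": ("dauer", "duration", "laufzeit"),
}


def detect_separator(lines):
    """Pick the first separator that occurs equally often (> 0) on every row."""
    rows = [line for line in lines if line.strip()]
    for sep in SEPARATORS:
        counts = {row.count(sep) for row in rows}
        if len(counts) == 1 and counts != {0}:
            return sep
    return None


def parse_cookie_rows(text):
    lines = text.splitlines()
    sep = detect_separator(lines)
    if sep is None:
        return []
    rows = [[cell.strip() for cell in line.split(sep)] for line in lines if line.strip()]
    header, body = rows[0], rows[1:]

    # Map each header cell to at most one field via keyword lookup.
    columns = {}
    for idx, cell in enumerate(header):
        for field, keywords in HEADER_KEYWORDS.items():
            if field not in columns and any(k in cell.lower() for k in keywords):
                columns[field] = idx
                break
    return [{field: row[idx] for field, idx in columns.items()} for row in body]
```

Requiring an identical separator count per line is a deliberately strict rule: it rejects prose that merely contains commas, at the cost of failing on tables with ragged rows.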
Benjamin Admin	7335f64f4f	feat(founding-wizard): per-person IP assignment + prefill + E2E tests The wizard now supports 2-4 shareholders, each with an individual IP area: - One IP assignment contract per founder (e.g. Benjamin: Compliance+RAG; Sharang: Security+Infrastruktur). One separate service contract per managing director (GF). - Step 1: prefill button from the company profile + fields for register court and HRB number. - Step 2: role dropdown (CEO/CTO/CFO/COO/CPO/GF/Sonstige) instead of free text input, IP-areas textarea per person. Backend: - generate_documents() iterates per person for PER_PERSON_DOCS. - _build_person_context() injects ASSIGNOR_, GF_, IP_LIST_DETAILS from person.ip_areas. - base_context() propagates basics.register_court and basics.hrb_number. Tests: - 30/30 pytest green (6 new: per-person context, slug helper, register-court propagation). - 4 new Playwright E2E specs (hermetic via route.fulfill, with console/page-error traps): complete 8-step flow, prefill error path, step navigation/reset, role dropdown + IP areas. - The spec sets 'bp-sdk-cookie-consent' in addInitScript so that the CookieBannerOverlay does not cover the wizard buttons. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 18:49:10 +02:00
Benjamin Admin	138d9068c4	fix(audit): VW cookie table, library fallback + pattern extract strengthened VW lesson: cmp_vendors=6 (all coarse LLM results) was rated as sufficient although the real cookie table has 30+ entries. 3 fixes: 1. fallback_vendors_for_run skip threshold: existing_vendor_count >= 3 was too low. It now skips only if there are < 5 cookies AND >= 5 vendors already present. 2. The library fallback is now invoked at < 20 cmp_vendors (instead of < 3). VW-typical setups (6 coarse LLM vendors + 30 from the library) thus get a complete vendor list. 3. _extract_cookie_names_from_doc: regex pattern extract from the cookie-doc text itself; it searches for 'NAME Tracking Cookies (Marketing)' etc. Finds cookie names that do NOT land in the browser jar (e.g. because they are only loaded after consent). These are additionally matched against the library. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 18:32:07 +02:00
Benjamin Admin	c281464071	feat(audit): P71 JC-vs-AVV decision tree jc_avv_decision.py: detect_ambiguous_jc_avv checks whether the privacy-policy (DSE) text contains both JC (joint controllership) signals (gemeinsame Auswertung, Schwesterunternehmen, Konzern...) and AVV (data processing agreement) signals (Auftragsverarbeiter, weisungsgebunden...). On a hit, build_jc_avv_decision_html renders a block with 4 EDPB-based guiding questions, each with a recommendation. Sources: EDPB Guidelines 7/2020, CJEU C-25/17, C-40/17. Hooked into the mail render between the solutions block and the VVT. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 17:31:37 +02:00
Benjamin Admin	6dc427a754	fix(audit): VW-404-Recovery + P52 LLM-Merge + P51 Banner-UX-Checks VW 404 fix: submitted_types now counts only doc types with >= 200 characters of real text. An entered URL that returns 404/mini text (VW cookie-richtlinie.html) is treated as 'missing', so that auto-discovery tries alternative URLs on the homepage. In-place update instead of a duplicate entry; rejected_url is retained for audit transparency. P52 LLM cascade merge: vendor_llm_extractor now runs at < 5 vendors (not only at 0), and the results are MERGED with the existing cmp_vendors instead of overwriting them. VW-typical setups (generic CMP + 0 cmp_payloads) thereby gain the text-based vendor layer. P51: banner_consistency_checks extended: * check_banner_copyability: scans banner_html for user-select:none / oncopy=return false / onselectstart. MEDIUM finding if the banner text cannot be copied (Art. 7(2) GDPR). * check_consent_history: checks for 'Meine Einwilligungen' / consent history / 'Datenschutz-Cockpit'. MEDIUM if no history is visible (Art. 7(3): withdrawal must be as easy as giving consent). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 17:27:55 +02:00
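The copyability scan from P51 can be sketched as a small pattern check over the banner HTML. The three signals are those named in the commit message; the exact regexes and the shape of the returned finding are assumptions.

```python
import re

# One pattern per copy-blocking signal named in the commit message.
COPY_BLOCK_PATTERNS = {
    "user-select:none": re.compile(r"user-select\s*:\s*none", re.I),
    "oncopy=return false": re.compile(r"oncopy\s*=\s*[\"']?\s*return\s+false", re.I),
    "onselectstart": re.compile(r"onselectstart\s*=", re.I),
}


def check_banner_copyability(banner_html: str):
    """Return a MEDIUM finding if the banner markup blocks text selection/copy."""
    signals = [
        label
        for label, pattern in COPY_BLOCK_PATTERNS.items()
        if pattern.search(banner_html)
    ]
    if not signals:
        return None
    # Finding shape is hypothetical; the norm reference follows the commit text.
    return {"severity": "MEDIUM", "signals": signals, "norm": "Art. 7(2) GDPR"}
```

A static scan like this only sees inline styles and attributes; copy blocking applied via external stylesheets or scripts would need a rendered-DOM check.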
Benjamin Admin	309c10c203	feat(audit): P72 MC-Scope-Filter + P73 MC-Solution-Generator P72: rag_document_checker LEFT JOINs canonical_controls.scope_doc_type. _filter_by_canonical_scope drops MCs whose scope explicitly points to an incompatible doc type (mapping in _SCOPE_COMPATIBLE). Conservative: 'other'/NULL/'process' stay in, because heuristic v1 is not yet strong enough for hard filtering. Expected effect: ~10-15% fewer irrelevant MCs per doc, because e.g. a TOM MC no longer shows up as a DSE finding. P73: mc_solution_generator.py: a Qwen->OVH cascade generates, per HIGH/CRITICAL fail, a concrete insertion recommendation with an anchor (where + what) and an effort estimate. JSON schema {solution_text, anchor_hint, effort_min}. In-process LRU cache (500 entries) per (mc_id, doc_md5). Max 3 solutions per doc type, global cap 8, which keeps latency < 60s. The blocks are hooked into the mail render below the VVT as 'Loesungs-Vorschlaege (KI-generiert)'. Disclaimer: not legal advice, review with the DPO. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 17:21:19 +02:00
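An in-process LRU cache keyed by (mc_id, doc_md5), as mentioned for P73, could be built on `collections.OrderedDict`. The class name `SolutionCache` and its methods are hypothetical; only the key shape and the 500-entry limit come from the commit message.

```python
import hashlib
from collections import OrderedDict


class SolutionCache:
    """Small LRU cache; the least recently used entry is evicted first."""

    def __init__(self, max_entries: int = 500):
        self._data = OrderedDict()
        self._max = max_entries

    @staticmethod
    def key(mc_id: str, doc_text: str):
        # MD5 serves only as a content fingerprint here, not for security.
        return (mc_id, hashlib.md5(doc_text.encode("utf-8")).hexdigest())

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._max:
            self._data.popitem(last=False)  # drop the oldest entry
```

Keying on the document hash means an edited document misses the cache automatically, so stale recommendations are not reused after the text changes.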
Benjamin Admin	4183379dc5	feat(audit): P33 3-column vendor consistency (DSE/cookie doc/banner) check_three_source_vendor_consistency: scans the DSE, cookie-doc and banner vendor lists for 15 typical vendor signatures (Google Analytics, Meta Pixel, Hotjar, HubSpot, LinkedIn Insight, ...). Lists vendors that appear in at least one source but not in all sources_with_data. Yields a MEDIUM finding with a concrete 'fehlt in: DSE, Banner-Liste' list per vendor. Recommendation: maintain a central vendor list + propagate it into all three document types. (Art. 13(1)(c)+(e) GDPR) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 17:11:47 +02:00
Benjamin Admin	c93c88577c	feat(audit): P88 PDF export via WeasyPrint GET /api/compliance/agent/snapshots/{id}/pdf returns application/pdf with the full audit-mail content in an A4 print layout (header with site/timestamp/snapshot ID, page numbers at the bottom right). check_replay.py now additionally returns 'full_html' (not just the 500-char preview) so that the PDF renderer has the complete HTML. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 17:06:48 +02:00
Benjamin Admin	9f06911ff9	feat(audit): cookie-library fallback for the VW pattern (no known CMP) If, after standard extract + Phase G + LLM cascade, there are still < 3 cmp_vendors but >= 5 cookies in after_accept (typical: a custom CMP such as VW's 'cookiemgmt'), the fallback matches the cookie names against compliance.cookie_library and reconstructs vendor records from the library entries. Background: VW run de2a029e shows 4 vendors despite 28 after_accept cookies. cmp_payloads is 0 (no known IAB tool detected) and the stored cookie URL returns 404. The DSE is substantial at 34k but lists no vendor table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 17:00:49 +02:00
Benjamin Admin	338e03d3b0	feat(audit): P34 exec-summary score classification, 'wo Sie stehen sollten' _score_band_explanation: four bands (Sehr gut/Akzeptabel/Handlungsbedarf/Erhoehtes Risiko) provide a label + the expected action. Rendered as a new row below the KPIs in the exec summary (with a score-coloured Linkmark). Matter-of-fact tone: no 'Vorstand muss sofort handeln', but a realistic recommendation (e.g. '70-84: Branchen-Median, einmaliges Aufraeumen + Halbjahres-Check'). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 16:51:34 +02:00
Benjamin Admin	30e43afba6	feat(audit): P86 industry benchmark + P35/P77/P78 text signals
P86: industry_benchmark.py pulls all snapshots with the same scan_context.industry, computes median + percentile, and renders 'You 42%, automotive median 58% (sample: 12)'. Minimum sample size 3.
P35: banner_text with 'Speichern' (save) but no 'Ablehnen' (reject) = MEDIUM. Ambiguous label per the EDPB 03/2022 deceptive-design guidelines.
P77: a DSE (privacy policy) with a prominent cookie section (vendor hints: storage duration, provider, data category) replaces the requirement for a separate cookie policy. Positive signal instead of a false positive.
P78: Art. 26 clause detected in the DSE text → positive signal 'JC construct documented'. Avoids false positives for cooperation between group sister companies.
All hooked into the mail: industry block after the GF one-pager, signals block after the consistency check.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 16:43:15 +02:00
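As an illustration of the P86 benchmark, a sketch under assumptions: `benchmark_line` is a hypothetical helper, the percentile is computed here as the share of peers scoring strictly below the own score, and the output wording only approximates the rendered string. Only the median, the percentile and the minimum sample size of 3 come from the commit message.

```python
from statistics import median
from typing import List, Optional


def benchmark_line(
    own_score: float,
    peer_scores: List[float],
    industry: str,
    min_sample: int = 3,
) -> Optional[str]:
    """Return a one-line industry comparison, or None if the sample is too small."""
    if len(peer_scores) < min_sample:
        return None  # too few snapshots for a meaningful median
    med = median(peer_scores)
    # Assumed percentile definition: share of peers with a strictly lower score.
    below = sum(1 for s in peer_scores if s < own_score)
    percentile = round(100 * below / len(peer_scores))
    return (
        f"You {own_score:.0f}% vs. {industry} median {med:.0f}% "
        f"(sample: {len(peer_scores)}, percentile: {percentile})"
    )
```

Returning None below the minimum sample lets the mail renderer skip the block entirely instead of showing a misleading median.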
Benjamin Admin	df8832c521	feat(audit): P75 banner-vs-CMP + P84 diff mode + P74/P96/P97 doc types
P75: check_banner_vs_cmp_partner_count: if the banner text states 'N Partner' and N < cmp_vendors * 0.6, HIGH finding (Art. 13(1)(e) GDPR). Detects understatement of the actual vendor count.
P84: run_diff.py compares the current run with the last snapshot of the same site (set diff on normalized finding labels). Block above the GF one-pager: 'Since the last run: X findings gone, Y new'. USP: none of the major vendors offers this.
P74/P96/P97: labels for legal_notice (legal notices / IP / forward-looking), dsa (Art. 12+17 Digital Services Act), lizenzhinweise (OSS compliance) registered in _DOC_TYPE_LABELS. The actual mandatory-disclosure checks follow separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 16:38:25 +02:00
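The P75 rule can be illustrated with a short sketch. The regex, the function name `check_partner_count` and the shape of the returned finding are assumptions; only the 'N Partner' pattern, the 0.6 factor and the HIGH severity are taken from the commit message.

```python
import re
from typing import Optional

# Assumed pattern: a number directly followed by the word "Partner" in the banner text.
PARTNER_RE = re.compile(r"(\d+)\s+Partner", re.IGNORECASE)


def check_partner_count(
    banner_text: str, cmp_vendor_count: int, ratio: float = 0.6
) -> Optional[dict]:
    """Flag banners that state far fewer partners than the CMP actually lists."""
    match = PARTNER_RE.search(banner_text)
    if not match:
        return None  # banner states no partner count, nothing to compare
    stated = int(match.group(1))
    if stated < cmp_vendor_count * ratio:
        return {"severity": "HIGH", "stated": stated, "actual": cmp_vendor_count}
    return None
```

For example, a banner announcing 20 partners while the CMP lists 90 vendors falls below the threshold of 54 and would be flagged.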
Benjamin Admin	7842c95532	feat(audit): P92 CMP tool availability + P94 banner-vs-cookie-doc consistency
P92: if the user clicks 'Anpassen'/'Einstellungen' (customize/settings) and the CMP settings area does not load without errors (error, timeout, < 80 characters without categories, no toggles), that is a HIGH finding. Granular choice exists formally but does not work in practice (Art. 7(3) GDPR + EDPB 03/2022).
P94: cookie list in the banner settings vs. the cookie policy. A heuristic extracts cookie names from the cookie-doc text (regex on typical camelCase/_underscored patterns + vendor prefixes _ga/_gid/ot_/uc_). If |only_in_doc| >= 5 OR |only_in_banner| >= 3 → MEDIUM finding. |only_in_doc| >= 15 AND |only_in_banner| >= 5 → HIGH.
Both findings land in the new mail block 'Banner consistency check' (amber-yellow) between the mismatch block and the VVT. Also hooked into check_replay.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 16:31:19 +02:00
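The P94 thresholds translate into a small set comparison. In this sketch the function name and the string return values are hypothetical; the thresholds are the ones stated in the commit message, with the stricter HIGH rule checked first.

```python
from typing import Optional, Set


def banner_doc_consistency(
    banner_cookies: Set[str], doc_cookies: Set[str]
) -> Optional[str]:
    """Compare cookie names listed in the banner with those in the cookie policy."""
    only_in_doc = doc_cookies - banner_cookies
    only_in_banner = banner_cookies - doc_cookies
    # HIGH must be tested before MEDIUM, because every HIGH case also meets MEDIUM.
    if len(only_in_doc) >= 15 and len(only_in_banner) >= 5:
        return "HIGH"
    if len(only_in_doc) >= 5 or len(only_in_banner) >= 3:
        return "MEDIUM"
    return None
```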
Benjamin Admin	08671adfdf	feat(audit): P82 GF one-pager + P87 confidence score per finding
P82: gf_one_pager.py: compact 5-bullet summary at the very top of the mail. Score (large + color), delta to the previous run, top findings sorted by HIGH/MEDIUM with the responsible role (DSB / Marketing / IT / Legal / Web team) and classification bits from the wizard. Factual tone: no 4% threat, '4-8 weeks' as a realistic timeframe. Inserted before the critical-findings block in mail composition and the replay pipeline.
P87: finding_confidence.py: 13 regex rules return (confidence_pct, reason) per finding label. Directly observable in the DOM = 95-98%, library mismatch = 82%, text-pattern match on mandatory disclosures = 75-88%. Rendered in the one-pager as a small '(NN% confidence)' tag with a reason tooltip after each finding.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 16:20:19 +02:00
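A rule table of the kind P87 describes might look like the sketch below. The three patterns, the function name and the 50% default are invented for illustration; only the (confidence_pct, reason) result shape and the percentage ranges come from the commit message.

```python
import re
from typing import List, Pattern, Tuple

# Hypothetical rules, checked in order; the real module is said to have 13.
RULES: List[Tuple[Pattern[str], int, str]] = [
    (re.compile(r"\b(banner|button|dom)\b", re.I), 95, "directly observable in the DOM"),
    (re.compile(r"\b(library|mismatch)\b", re.I), 82, "cookie-library mismatch"),
    (re.compile(r"\b(pflichtangabe|impressum)\b", re.I), 75, "text-pattern match"),
]


def finding_confidence(
    label: str, default: Tuple[int, str] = (50, "no rule matched")
) -> Tuple[int, str]:
    """Return (confidence_pct, reason) for the first rule matching the label."""
    for pattern, pct, reason in RULES:
        if pattern.search(label):
            return pct, reason
    return default  # assumed fallback for labels no rule covers
```

Because the first match wins, the order of the rules decides which confidence a label with several matching keywords receives.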
Benjamin Admin	50fc0ecc59	feat(audit): P79 pre-scan wizard (8 required fields) + P99 extended + P102 replay fix
P79: PreScanWizard.tsx with 8 required fields (industry, B2B/B2C, direct sales, legal form, group structure, headcount, special categories of data, third country). Scan button disabled until all 8 are filled in. Values land in scan_context and, via the backend, in compliance_check_snapshots.
P99: DOC_TYPES extended by dsa + legal_notice + lizenzhinweise + nutzungsbedingungen. The add-URL button already existed.
P102 (replay bug): check_replay.py now reads e.get('text') instead of only full_text; the snapshot schema uses 'text'. The library-mismatch block is therefore also shown in replay.
Backend: ComplianceCheckRequest.scan_context is optional; save_snapshot persists it in compliance_check_snapshots.scan_context.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 15:59:01 +02:00
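The P102 replay fix comes down to reading the document text from whichever key the entry carries. A minimal sketch, assuming a hypothetical helper `entry_text` and assuming that 'text' takes precedence over 'full_text':

```python
def entry_text(entry: dict) -> str:
    """Read the document text from a snapshot entry or a live doc entry."""
    # Snapshot entries use 'text'; 'full_text' is kept as a fallback (assumed order).
    return entry.get("text") or entry.get("full_text") or ""
```

Falling back to an empty string keeps downstream checks such as the library-mismatch block from failing on entries that have neither key.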
Benjamin Admin	94057b1536	feat(audit): VW cookie bug fix + P101/P102 cookie-library mismatch findings
VW bug B1: extract_vendors_via_llm had max_text_chars=12000 -> for the VW cookie doc (60k chars, 100 cookies in a table) 80% was cut off and the LLM extracted only 1 vendor. Fix: max_text_chars=50000, num_predict 6000->16000 for more vendor output, Ollama timeout 120s->420s.
P101: aggregator script (backend-compliance/scripts/cookie_library_enrich.py) walks all compliance_check_snapshots and extracts (cookie_name, declared_category, observed_sites). First evaluation over 8 snapshots: 101 unique cookies, 47 in the library, 54 unknown, 18 mismatches.
P102: cookie classification check as a mail block. Compares the site-declared category against the library + vendor documentation. HIGH if the library says 'marketing' but the site declares 'essential'/'statistics' (actual third-country/advertising processing hidden). MEDIUM otherwise. Built into the agent_compliance_check_routes mail composition + replay pipeline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 15:47:11 +02:00
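The P102 severity rule from this commit can be written as a small function. The function name and the None result for matching categories are assumptions; the HIGH and MEDIUM conditions follow the commit message.

```python
from typing import Optional


def classification_severity(declared: str, library: str) -> Optional[str]:
    """Rate a mismatch between the site-declared and the library category."""
    if declared == library:
        return None  # no mismatch, no finding
    # Marketing cookies declared as harmless categories hide advertising use.
    if library == "marketing" and declared in {"essential", "statistics"}:
        return "HIGH"
    return "MEDIUM"
```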
Benjamin Admin	e5b4672f2a	fix(audit): P90 - auto-discovery timeout 180s -> 300s for the BMW homepage
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 12:05:41 +02:00
Benjamin Admin	0d5c76ea98	fix(audit): P90-B1 - DSI discovery timeout 120s -> 240s for the BMW Impressum
BMW-fafcb090 showed exception 'ReadTimeout' on the consent-tester call for anbieterkennzeichnung.html. The discovery run follows 3 sub-documents (insurance intermediary, supervisory authority, professional regulations) plus ePaaS captures and regularly needs > 120s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 11:52:59 +02:00
Benjamin Admin	54f5a06c2f	fix(audit): P90 diagnosis - verbose exception for fetch + auto-discovery
BMW run 760de886 has 0 cmp_payloads although the consent-tester captured ePaaS 4x. The backend log shows 'Consent-tester fetch failed for ...anbieterkennzeichnung.html: ' with an EMPTY exception string. 'auto-discovery failed for https://www.bmw.de/: ' is empty as well.
Quick fix: str(e) + type(e).__name__ in both except blocks so that the next BMW run reveals the real error.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 11:45:28 +02:00
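Why the log line was empty: some exceptions, timeouts in particular, carry no message, so formatting only the exception yields an empty string. A minimal sketch of the quick fix, with `describe_exception` as a hypothetical helper name:

```python
def describe_exception(exc: BaseException) -> str:
    """Format an exception so the log line is never empty."""
    name = type(exc).__name__
    message = str(exc)
    # Exceptions raised without arguments have an empty str(); keep the type name.
    return f"{name}: {message}" if message else name


try:
    raise TimeoutError()  # stands in for the ReadTimeout seen in the BMW run
except Exception as e:
    log_line = f"auto-discovery failed for https://www.bmw.de/: {describe_exception(e)}"
```

Including the type name is what makes the difference here, since the message alone was blank in both except blocks.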
Benjamin Admin	86b4a263d2	fix(audit): P90-B1 - do not discard cmp_payloads when the DSE text is short
BMW run 9811eba1 had 0 cmp_vendors although the consent-tester captured ePaaS 4x (~393KB). Root cause in _fetch_text, line 1254: if merged and len(merged.split()) > 100: return merged, cmp_payloads
If the DSE/cookie URL returns only short SPA-shell text (BMW: 10 words), the threshold is not met and the code falls through to the HTTP fallback, which returns text, []. The previously captured CMP payloads (ePaaS JSON with all vendor data) are discarded entirely.
Fix: before the HTTP fallback, check whether cmp_payloads exist. If so, return them with the (short) text or the reconstructed cmp_cookie_text, even without the 100-word threshold.
Effect: the BMW VVT table is populated (~90 vendors from the ePaaS JSON). Mercedes/other OEMs unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 11:29:41 +02:00
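The control flow of this fix, reduced to a standalone sketch. `choose_fetch_result` and its parameters are hypothetical; the real _fetch_text is not shown in the commit beyond the quoted condition, so the HTTP fallback is passed in as a callable here.

```python
from typing import Callable, List, Tuple


def choose_fetch_result(
    merged: str,
    cmp_payloads: List[dict],
    cmp_cookie_text: str,
    http_fallback: Callable[[], str],
    min_words: int = 100,
) -> Tuple[str, List[dict]]:
    """Decide what to return after the consent-tester fetch."""
    # Original happy path: enough text, return it together with the payloads.
    if merged and len(merged.split()) > min_words:
        return merged, cmp_payloads
    # The fix: captured CMP payloads survive even when the page text is short.
    if cmp_payloads:
        return (merged or cmp_cookie_text), cmp_payloads
    # Only without any payloads fall through to the plain HTTP fetch.
    return http_fallback(), []
```

Without the middle branch, a 10-word SPA shell would reach the last line and the payload list would be replaced by an empty one, which is the behavior the commit describes for the BMW run.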
Benjamin Admin	7938e377b6	feat(audit-tonality): P89/P76/P91 - co-pilot instead of robot lawyer
User feedback in one session: "We only create panic. Whatever it says, it takes weeks. We are a tool at the side of the CMO/GF/CIO, not an adversary." Memory: feedback_breakpilot_tonalitaet.md (applies to ALL modules + marketing).
P89: critical-findings block REMOVED/REBUILT, no more panic-red box.
- Instead of "🚨 IMMEDIATE ACTION REQUIRED" -> "Summary for management", a subtle blue block
- Instead of "VIOLATIONS" -> "Topics to discuss with DSB, marketing and development"
- Instead of leading with "fine range of 4% of worldwide turnover" -> realistic classification (0.1-1%) in a subtle closing note with a confidence hint
- "Immediate action" -> "Recommendation"
- "Topics 1, 2, 3..." instead of "HIGH" badges (preparation for P87)
- Explicit time estimate "4-8 weeks (DSB -> agency -> dev -> approval)"
P76: detect Mercedes secondary buttons (privacy policy + Impressum, small, below the 3 main buttons). The walker now scans ALL clickable elements in the shadow DOM by label (wb7-link, wb7-link-secondary, wb7-button-text, span[onclick], small a, [role=button], etc.). Avoids the Mercedes Impressum false positive from phase 1.
P91: VVT table renderer in the new co-pilot tonality. Instead of a "violation list with fine potential" -> a likelihood statement: "With vendor reduction + a switch to European alternatives, a smaller tracking footprint + license savings are likely. A well-founded assessment requires coordination with the DSB."
BMW bugs B1-B4 (P90) deliberately not in this commit: the BMW run captured ePaaS 4x in the consent-tester, but the backend receives 0 cmp_payloads. Wiring bug between consent-tester /dsi-discovery and backend _fetch_text; it needs its own diagnosis session (see task P90).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 11:24:57 +02:00
Benjamin Admin	4946571863	feat(audit-pipeline): P72-v2 heuristic sharpened + P80 mini replay endpoint
P72-v2: MC scope classifier heuristic v2. v1 had a 79% 'other' bucket (patterns too strict). v2 covers considerably more:
- DSE: Art. 13/14 + data-subject rights (Art. 15-22) + DSB + supervisory authority + storage duration + special categories
- TOM: Art. 32 + encryption/backup/pseudonymization + access control + ISO 27001 + BSI-Grundschutz + audit log
- cookie_richtlinie: tracking pixel + web storage + GA/Matomo/Hotjar/Pixel/GTM
- process: VVT (Art. 30) + DSFA (Art. 35) + data breaches (Art. 33/34) + HinSchG + trainings + deletion concept
The script `backfill_mc_scope_v2.py` re-classifies ONLY the 'other' bucket (specific v1 buckets remain untouched).
P80: mini replay endpoint (v1): POST /compliance-check/snapshots/{id}/replay?recipient=foo@bar.com&dry_run=false
Loads the snapshot and renders the mail with the CURRENT render code (P63-P67, P59b/P61/P62). Sends a [REPLAY]-prefixed mail or returns only HTML stats (dry_run). Effect: 7 min re-scan -> 2-5 sec for mail-layout iterations.
v2 (later): MC scorecard with the current scope_doc_type filter over the snapshot; requires refactoring _run_compliance_check.
Plus bugfix: GET /snapshots/{id} now raises HTTPException instead of returning a tuple (FastAPI returned the tuple as a JSON array).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 10:21:56 +02:00
Benjamin Admin	cde670617e	feat(audit-pipeline): P72 MC scope classifier + P80 snapshot/replay foundation [migration-approved]
P72: MC scope classifier: determine the REAL target document for each MC (cookie_richtlinie/dse/banner_implementation/cmp_audit/tom/avv/jc/impressum/agb/widerruf/process/accounting/other).
- Migration 145: scope_doc_type column + index on canonical_controls
- Backfill script with regex heuristic (12 rules, sorted by priority)
- First 11k-sample distribution: 76% other (heuristic v1 too strict; v2 must add looser patterns for DSE/TOM)
- Goal: before the MC scorecard filters, every MC knows which document it addresses. Previously eHealth/HGB MCs ended up in the cookie audit.
P80: snapshot + replay foundation: persist raw data so the audit pipeline can be rebuilt without re-crawling.
- Migration 146: compliance_check_snapshots table (JSONB for each of doc_entries/banner_result/profile/cmp_vendors/scan_context)
- services.check_snapshot.save_snapshot/load_snapshot/list
- Endpoints GET /snapshots, GET /snapshots/{id}
- Hook in _run_compliance_check: after mail send, automatic snapshot save via a separate SessionLocal (background-task safe)
- Replay endpoint follows in the next PR (needs _run_compliance_check refactored into crawl_phase + interpret_phase)
- Effect: test cycle 7 min -> 5 sec for pure logic changes (P73/P79/P81+ benefit directly). Snapshots also serve as a regression-test corpus (P81 golden-truth library).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 08:53:31 +02:00
Benjamin Admin	603381a67f	feat(audit-mail): P58/P59c/P60b/P61/P62 - Mercedes cycle phase 1 completed
P58: anti-audit detection more robust (script-domain + settings-specific; was already in the code, now cleanly documented as completed).
P59c: DACH custom cookies in compliance.cookie_library: Borlabs, etracker, Matomo/Piwik, Userlike, Cookiebot/Cookieyes/Usercentrics, Akamai/Cloudflare/Datadome bot manager + HubSpot. 21 new entries (3 of 24 already present via the Open Cookie Database). Script: backend-compliance/scripts/seed_dach_cookies.py.
P60b: vendor-pattern dedupe with fuzzy match (Jaccard >= 0.7) instead of exact tuple equality. Vendors with partially filled fields (e.g. country of establishment entered) no longer drop out of the global notice. Bug: Amazon/Psyma/Qualtrics previously had repeated per-row actions.
P61: detection of "piggybacked cookies": if a declared vendor (e.g. Google Tag Manager) automatically brings others along (GA + GCL_AU + DoubleClick), these are shown as a separate mail block (yellow) with COOKIE/VENDOR badges + source documentation. New service: compliance.services.vendor_package_cookies (8 primary vendors with 2-4 implicit cookies/vendors each).
P62: marketing-manager disclaimer "What we see / don't see" as a blue box block directly below the critical-findings block. Explains the limits of our audit (server-side tracking, vendor-internal data sharing, cross-page banners) and the risk of misplaced trust in a 100% score. New renderer: compliance.api.scope_disclaimer.
Architecture: VVT table renderer moved out of agent_doc_check_extras.py (552 LOC -> 242 LOC) into compliance.api.vvt_table_renderer to stay within the 500-LOC hard cap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 08:01:27 +02:00
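The P60b fuzzy match can be illustrated with Jaccard similarity over a vendor's non-empty fields. How the project actually tokenizes a vendor record is not stated, so the "field=value" tokens and the helper names below are assumptions; only the 0.7 threshold comes from the commit message.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def vendor_tokens(vendor: dict) -> Set[str]:
    """Turn the non-empty fields of a vendor record into comparable tokens."""
    return {f"{k}={str(v).strip().lower()}" for k, v in vendor.items() if v}


def is_duplicate(a: dict, b: dict, threshold: float = 0.7) -> bool:
    """Treat two vendor records as the same pattern if they overlap enough."""
    return jaccard(vendor_tokens(a), vendor_tokens(b)) >= threshold
```

With this tokenization, two records that share three fields and differ only in one additionally filled field score 3/4 = 0.75 and still count as duplicates, whereas exact tuple equality would have treated them as different.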
Benjamin Admin	57c0f940a2	feat(consent+report): P56-P67 Mercedes audit cycle (anti-audit, Phase G vendors, cookie-behavior validator + 5 mail-polish items) [migration-approved]
P56: anti-auditing detection as a constructive compliance finding (audit-API recommendation instead of accusation, because Mercedes legitimately blocks bots)
P57: Phase G vendor_details union with cmp_vendors -> 42 vendors visible
P58: anti-audit detection more robust (script-domain check + settings-specific)
P59: cookie-behavior validator (4 layers, 3-tier severity: MEDIUM = category mismatch / HIGH = purpose mismatch / CRITICAL = both = indication of intent) + Open Cookie Database (CC0) as library seed (2264 cookies)
P59b: cookie behavior wired into the banner check + mail block (BUGFIX: open SessionLocal itself; db was not in scope in the background task)
Mail polish after the Mercedes review:
P63: detect banner footer links in wb7-link/role=link as well (shadow-DOM walker label-based instead of only <a href>)
P64: re-access severity: MEDIUM instead of HIGH if a footer "Einstellungen" (settings) entry or a Mercedes-typical equivalent exists; OEM footer detection (wb7-footer)
P65: text truncation: word boundary instead of character cut (no more "einfa" break in the immediate actions)
P66: GF actions: service purpose vs. cookie purpose explained explicitly (common marketing/GF confusion: "Akamai description" != cookie purpose per DSK-OH 2024)
P67: stirring finding with a "loss framing" explanation + old-vs-neutral example instead of only the EDPB technical term
Compliance-Advisor FAQ (admin agent-core/soul):
+ CNIL/EDPB top fines (Google 100M, Meta 60M, Amazon 35M)
+ German precedent (LG Muenchen Google Fonts, EuGH Planet49, BGH I ZR 7/16)
+ 4 risk paths (fine/warning letter/class action/NOYB)
+ calculation methodology
Document-generator templates: AGB-DE (142), Impressum (140), Widerrufsformular-Anlage (143), DSR-Process-Dedup (139), Cookie-Library (144).
Architecture: doc_action_mappings.py + banner_dom_walkers.py + cookie_behavior_validator.py + vendor_detail_extractor.py extracted to keep to the 500-LOC caps in agent_doc_check_report.py and banner_text_checker.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 06:28:25 +02:00
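The P65 change from a character cut to a word-boundary cut can be sketched as below. The function name, the "..." suffix and the handling of a single overlong word are assumptions of this example, not taken from the project.

```python
def truncate_at_word(text: str, limit: int, ellipsis: str = "...") -> str:
    """Shorten text to at most `limit` characters without splitting a word."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # If the character after the cut is not whitespace, the cut is mid-word.
    if not text[limit].isspace():
        head, sep, _ = cut.rpartition(" ")
        if sep:  # drop the partial last word; keep a hard cut if there is no space
            cut = head
    return cut.rstrip() + ellipsis
```

A plain `text[:limit]` on "eine einfache Loesung" with a limit of 9 would end in "einf"; the word-boundary version backs up to the previous space instead.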
Benjamin Admin	4478b7f479	fix(founding-wizard): mypy/ruff cleanup for CI
- markdown_to_docx.py: type annotations + unused import
- founding_wizard_routes.py: drop unused get_db import	2026-05-20 09:58:38 +02:00
Benjamin Admin	7a5f1e48dd	feat(founding-wizard): founding wizard for a 2-person GmbH + 14 notary templates [migration-approved]
Templates (migrations 123-136):
- 123 GO-GF (rules of procedure for the management)
- 124 SHA (Shareholders' Agreement, 56 placeholders)
- 125 Satzung (articles of association with UG variant)
- 126 GF-Dienstvertrag (managing-director service agreement; separation principle organ/employment)
- 127 Arbeitsvertrag (employment contract; AGG-neutral, NachwG, eAU)
- 128 Gesellschafterliste (list of shareholders, § 40 GmbHG)
- 129 GF-Bestellungsbeschluss (resolution appointing the managing director, with § 6 (2) assurance)
- 130 HRB-Anmeldung (commercial-register application; §§ 7, 8, 39 GmbHG, § 12 HGB)
- 131 IP Assignment Agreement (founder→GmbH)
- 132 Term Sheet (pre-seed/seed VC standard)
- 133 Wandeldarlehensvertrag (convertible loan)
- 134 Beteiligungsvertrag (subscription agreement)
- 135 ESOP/VSOP plan (3 variants)
- 136 Cap Table
Categorization (migrations 137-138):
- ALTER TABLE compliance_legal_templates ADD lifecycle_stage TEXT[], functional_category TEXT (with CHECK constraints + GIN index)
- Backfill of all 105 templates: lifecycle_stage (pre_founding|founding|startup|kmu|konzern) + functional_category (founding_legal|employment|investor_funding|...)
Backend founding-wizard service:
- template_renderer.py: Handlebars-light ({{VAR}}, {{#IF FLAG}}...{{/IF}})
- wizard_to_context.py: mapping wizard state → SCREAMING_SNAKE_CASE vars
- markdown_to_docx.py: Markdown → DOCX via python-docx
- founding_wizard_routes.py: POST /v1/founding-wizard/generate → returns base64 DOCX files for the selected templates
Frontend founding wizard (/sdk/founding-wizard):
- 8-step wizard (basics, shareholders, managing directors, capital, notary, SHA, managing-director contracts, generate)
- useFoundingWizardForm hook with localStorage persistence
- TypeScript code registry (template-categories.ts) as a backup to the DB
- Word download via data: URLs (base64)
Tests:
- 20 unit tests green (renderer, context mapping, DOCX conversion)
- Playwright E2E test with 2-person GmbH (Benjamin + Sharang) test data	2026-05-20 09:30:51 +02:00
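A "Handlebars-light" renderer with the two constructs named above ({{VAR}} and {{#IF FLAG}}...{{/IF}}) fits in a few lines of regex. This is a sketch, not the project's template_renderer.py: the function name `render`, the lack of nested IF support and the choice to leave unknown variables untouched are assumptions.

```python
import re
from typing import Any, Dict

# {{#IF FLAG}} ... {{/IF}} blocks (non-greedy, may span lines, no nesting).
IF_RE = re.compile(r"\{\{#IF (\w+)\}\}(.*?)\{\{/IF\}\}", re.DOTALL)
# {{VAR}} placeholders.
VAR_RE = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, context: Dict[str, Any]) -> str:
    """Resolve conditional blocks first, then substitute variables."""

    def if_block(match: "re.Match[str]") -> str:
        # Keep the block body only if the flag is truthy in the context.
        return match.group(2) if context.get(match.group(1)) else ""

    def variable(match: "re.Match[str]") -> str:
        # Unknown variables stay visible so missing data is easy to spot.
        return str(context.get(match.group(1), match.group(0)))

    return VAR_RE.sub(variable, IF_RE.sub(if_block, template))
```

Resolving the IF blocks before the variables means placeholders inside a dropped block are never looked up, so optional sections do not require their variables to be present.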
Benjamin Admin	98ec6d4284	fix(report): anti-pattern task - "muss entfernt werden" (must be removed) instead of "ergaenzt werden" (must be added). Bug: for inverted checks (P9 #7 illegal_disclaimer) the management (GF) task list said "muss ergaenzt werden", which is semantically wrong because the disclaimer is already present and should be removed. Fix: _check_to_action() now recognizes anti-pattern labels (rechtswidrig/illegal/haftungsausschluss/disclaimer) and returns "muss entfernt werden (Anti-Pattern, rechtlich wirkungslos)" (must be removed; anti-pattern, legally ineffective). Smoke test BMW d2f7bcc0: before 'Rechtswidriger Haftungsausschluss muss ergaenzt werden' -> now 'muss entfernt werden'. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 16:40:24 +02:00
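The fix described in the commit above can be pictured with the following sketch. The four marker words and the two German output strings are taken from the commit message; the function name check_to_action (without the leading underscore), its signature and the way the label is prefixed to the message are assumptions.

```python
# Marker words as listed in the commit message.
ANTI_PATTERN_MARKERS = ("rechtswidrig", "illegal", "haftungsausschluss", "disclaimer")


def check_to_action(label: str) -> str:
    """Turn a failed check label into a task sentence (hypothetical helper)."""
    lowered = label.lower()
    if any(marker in lowered for marker in ANTI_PATTERN_MARKERS):
        # Inverted check: the offending text exists and has to go.
        return f"{label} muss entfernt werden (Anti-Pattern, rechtlich wirkungslos)"
    # Regular check: the required information is missing.
    return f"{label} muss ergaenzt werden"
```

Matching on lower-cased substrings means "Rechtswidriger Haftungsausschluss" is caught by two markers at once; a label without any marker still produces the "muss ergaenzt werden" wording.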
Benjamin Admin	6f16507c5f	feat(banner): P19 + P20 - per-category click test + frontend drill-down. P19 (consent-tester): - Added dp-cookieconsent (TYPO3, Safetykon pattern) as a CMP profile: selectors #dp--cookie-statistics/marketing + a.cc-allow save button - New signal provider_details_visible: after a category toggle, Playwright checks whether visible provider/cookie detail elements appear in the banner. For dp-cookieconsent (banner without a listing) this is always False -> HIGH violation "category shows no provider/cookie details; the user cannot give informed consent (Art. 7(1) GDPR)" - main.py serializes provider_details_visible + cookies_set per category. P20 (frontend drill-down): - Backend: check_payloads table gains a 'banner' column (JSON), so the full banner_result is persisted (previously in-memory only). The ALTER TABLE migration is idempotent. - New endpoint GET /api/compliance/agent/banner/<check_id>: returns quality score, phases, category tests, banner checks and all 46 structured_checks. 
- Frontend: BannerTab in /sdk/agent/audit/<id> with quality cards, 3-phase cookie table, per-category listing (with the P19 signal in red/green), banner violations + legal bases, and a 46-check drill-down filterable by severity. - Tab switcher in page.tsx extended with "Cookie-Banner-Analyse" (cookie banner analysis). - Bonus: 2 old route.ts files migrated to Next.js 15 Promise params (build fix). Plus: the critical-findings block uses provider_details_visible as its primary signal instead of only the tracking_services count. Smoke test Safetykon: 4 critical findings in the mail; the banner endpoint returns 46 checks + 3 phases + 2 categories with provider_details_visible=False. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 14:31:13 +02:00
Benjamin Admin	d4d9b60007	feat(email): P18 - critical-findings box + banner deep block. The backend was discarding 90% of the consent-tester data: only 4 fields of a full banner scan reached the email. Phases (before_consent / after_reject / after_accept), banner_checks.violations with legal bases, category_tests, 46 structured_checks and the completeness/correctness scores were all invisible. Backend: agent_compliance_check_routes now passes the full banner_result through (15 fields instead of 4). Renderer (2 new modules): 1) agent_doc_check_critical.build_critical_findings_html - RED immediate-action block at the VERY TOP of the email - Detects: banner violations (HIGH/CRITICAL), empty per-category lists, DSE (privacy policy) score <30%, missing cookie policy, US trackers without SCC/DPF - Per issue: concrete immediate action + legal basis + fine precedent (CNIL TikTok 5 million, LfDI BW 30k, EuGH Schrems II, ...) 
- Rendered only when real issues exist 2) agent_doc_check_banner.build_banner_deep_html - Banner quality score cards (completeness / correctness / violations) - 3-phase cookie table: before consent / after rejection / after acceptance, with cookie count, tracker count and anomalies - Per-category tracker listing (statistics/marketing): shows explicitly when a category lists no providers (Safetykon pattern) - Violations list with severity badge + source hint (LG Rostock, EDPB). Smoke test Safetykon: all 6 new blocks render, no regression. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 13:34:17 +02:00
Benjamin Admin	e536247c20	feat(quaidal): backend API + frontend tab for BSI QUAIDAL data-quality controls Wire the 195 Clean-Room QUAIDAL controls (from breakpilot-core migration 011) into the compliance SaaS UI. Backend: - GET /api/v1/quaidal/stats - counts by kind + source provenance - GET /api/v1/quaidal/controls - list, optional kind= filter - GET /api/v1/quaidal/controls/{id} - single derived control - GET /api/v1/quaidal/criteria - 10 QKB criteria - GET /api/v1/quaidal/criteria/{id} - QKB with QB/MA/QM tree Frontend: - /sdk/quality: new "Trainingsdaten-Qualität (BSI QUAIDAL)" tab with 10 QKB cards and a drill-down modal showing the full QB→MA→QM tree plus original BSI source link and license note. - /sdk/ai-act: Art. 10 tile on each high-risk/unacceptable result, linking to /sdk/quality?category=data_quality. Pattern matches existing IACE module DIN-reference handling: own wording, source section + URL preserved for due diligence. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 13:03:54 +02:00
Benjamin Admin	313982c6f1	feat(profile+report): P17 - 4 polish items. A) Cookie-policy architecture block: fallback to the DSE text when cookie was deduped via P15. Now also detects single-doc sites (Safetykon pattern). B) Concrete task list: per-doc cap (3) removed + global cap 10→20. Safetykon now shows 7 tasks instead of 4. C) business_type classifier: B2B service cluster from P14 used as a boost. With 2+ service indicators (CE certification/compliance/auditing) b2b_score is raised. Safetykon: "B2C consulting" → "B2B (consulting)". D) Vendor extraction: fallback to the DSE text when cookie is deduped and there are no CMP payloads. The LLM then extracts vendors from the DSE text. Safetykon: 0 → 1 vendor (Google Analytics detected in the DSE text). Smoke test Safetykon: all 4 polish items take effect, no regression. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 12:22:05 +02:00
Benjamin Admin	479ce2225b	feat(profile): P14+P15+P16 - B2B heuristic + doc-URL dedup + homepage profile. P14: _detect_no_direct_sales extended by 3 clusters: A) OEM configurator (BMW/Audi/Mercedes/VW/Porsche brand names + authorized-dealer pattern) B) B2B service provider (CE certification, compliance consulting, training, auditing, TISAX, ISO standards, occupational safety, ...) C) NGO/association/public (donation account, register of associations, non-profit status, ...). Threshold: pos >= 2 per cluster AND pos > neg. Previously: OEM only. P15: doc-URL dedup in the worker: when several doc types reference THE SAME document (Safetykon pattern: the user enters /datenschutz for dse, cookie AND widerruf), only the primary doc type (priority: dse > impressum > cookie > widerruf > agb > nutzungsbedingungen) receives the text. The others are reported as "not available separately; checked as part of document 'X'". This eliminates the 8+8 systematic widerruf/cookie false positives. P16: profile detection also uses the homepage text: the homepage HTML is fetched with a short timeout (8s), stripped and merged into profile_input. Previously P14 only worked when the B2B indicators appeared in the mandatory DSE/Impressum text; for Safetykon they appear only in the homepage menu. Bonus: the TDM override submit button is disabled when the reason is < 10 characters, which keeps users from clicking into the bug seen today. Smoke test Safetykon (B2B compliance service provider): dse checked (no error); impressum checked (no error); cookie "not available separately; checked as part of the DSE"; agb "not applicable; no direct sales contract"; widerruf "not applicable; no direct sales contract"; nutzungsbedingungen "not applicable; no direct sales contract". Before: 16 false positives. Now: 0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 11:46:58 +02:00
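The P15 doc-URL dedup from the commit above can be sketched as follows. The priority order comes from the commit message; the function name dedup_doc_urls, the return shape and the trailing-slash normalization are assumptions made for this illustration.

```python
from typing import Dict, Optional

# Priority order as stated in the commit message.
DOC_PRIORITY = ["dse", "impressum", "cookie", "widerruf", "agb", "nutzungsbedingungen"]


def _rank(doc_type: str) -> int:
    # Unknown doc types sort last (assumption).
    return DOC_PRIORITY.index(doc_type) if doc_type in DOC_PRIORITY else len(DOC_PRIORITY)


def dedup_doc_urls(doc_urls: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each doc type to None (it is primary) or to the doc type that covers it."""
    primary_by_url: Dict[str, str] = {}
    covered_by: Dict[str, Optional[str]] = {}
    for doc_type in sorted(doc_urls, key=_rank):
        # Assumption: URLs are compared after removing a trailing slash only.
        url = doc_urls[doc_type].rstrip("/")
        if url in primary_by_url:
            covered_by[doc_type] = primary_by_url[url]
        else:
            primary_by_url[url] = doc_type
            covered_by[doc_type] = None
    return covered_by
```

For the Safetykon pattern (the same /datenschutz URL entered for dse, cookie and widerruf), this yields dse as the primary and maps cookie and widerruf to "dse", so only one document text gets checked.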
Benjamin Admin	78b27d4684	feat(compliance-check): P12 - TDM override with documented customer permission. Backend: ComplianceCheckRequest extended with tdm_override + tdm_override_reason. Worker in the _run_compliance_check path: with tdm_override=True AND a reason of >= 10 characters, the TDM reservation is only documented (job.tdm_override.{reason, original_status}) and NOT treated as a reason to abort. Without a reason the override is ignored. Audit trail via logger.warning(reason). Frontend: ComplianceCheckTab gains a checkbox + mandatory reason field ("Schriftliche Crawl-Erlaubnis vorhanden", i.e. written crawl permission on file) directly above the submit button. Required: reason >= 10 characters. Submit sends the flags to the backend. Use case: Safetykon pattern, where robots.txt + ai.txt set a reservation but the customer has agreed in writing (commissioned audit). [guardrail-change] ComplianceCheckTab.tsx (511 LOC) added to loc-exceptions; the split into _components/TDMOverride + CompliancePolling is P11 tech debt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 08:56:50 +02:00
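The override rule in the commit above (override flag plus a reason of at least 10 characters, otherwise the reservation still aborts the run) might look like this sketch. The function name resolve_tdm, the returned dict and the value "reserved" for original_status are hypothetical; whether the real worker strips whitespace before counting characters is also an assumption.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

MIN_REASON_LENGTH = 10  # threshold stated in the commit message


def resolve_tdm(tdm_reserved: bool, tdm_override: bool, reason: Optional[str]) -> dict:
    """Decide whether a TDM reservation aborts the run (hypothetical helper)."""
    if not tdm_reserved:
        return {"abort": False, "tdm_override": None}
    cleaned = (reason or "").strip()  # assumption: surrounding whitespace does not count
    if tdm_override and len(cleaned) >= MIN_REASON_LENGTH:
        # Audit trail: the reason is logged at warning level, as in the commit.
        logger.warning("TDM reservation overridden: %s", cleaned)
        return {
            "abort": False,
            "tdm_override": {"reason": cleaned, "original_status": "reserved"},
        }
    # Missing or too-short reason: the override is ignored.
    return {"abort": True, "tdm_override": None}
```

Keeping the decision in one pure function makes the "override without reason is ignored" case easy to cover with a unit test.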
Benjamin Admin	28a078ccb4	feat(compliance-check): P10 - cookie-policy architecture detection. New service cookie_policy_architecture.detect_architecture(...) checks four diagnostic points of a website's cookie policy: 1. Layer separation: single (BMW pattern: banner + information in ONE URL) | separate (best practice: separate layers) 2. Versioning: "Stand vom DD.MM.JJJJ" / "Version X.Y" / ... 3. Dynamic content: CMP capture on the doc URL or marker texts 4. Vendor count in the text: indicator of whether the list is statically included. Risk traffic light: - green: separate + versioned + static - yellow: single+unversioned (BMW) OR separate+unversioned - red: neither (mandatory information missing). Wire-in in the compliance-check worker: after the exec-summary block, the architecture block is rendered (build_architecture_html) with a concrete recommendation. For the BMW pattern: "snapshot of the dynamic vendor table as a versioned PDF in the archive."
Background: BMW has an HTML page that is SIMULTANEOUSLY the banner re-trigger and the cookie policy. The minimum requirement under §25 TDDDG + Art. 13 GDPR is met, but in a supervisory-authority review it cannot be shown which vendor list was active on a given date. This is not a violation, but it is a best-practice gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 01:01:48 +02:00
Benjamin Admin	0d37822b7c	fix(impressum): P9 - 7 false-positive fixes in the mandatory-information checks. #1 Provider name: the \b word boundary prevents "ag" from matching inside "samstag"; "aktiengesellschaft" added as a full match. #2 Authorized representatives: the bracketed-list pattern now recognizes the BMW format "Vorstand (Milan Nedeljkovic, Jochen Goller, ...)" plus "Vorsitzender des Aufsichtsrats: Name". #3 V.i.S.d.P.: was already INFO, OK. #4 ODR platform/VSBG: with no_direct_sales=True (OEM pattern) now skipped as "Nicht anwendbar" (not applicable) instead of a 0/1 fail. The profile is now passed through check_document_completeness -> runner. #5 Competent chamber: IHK + Handwerkskammer + Tieraerztekammer added to the pattern + severity LOW -> INFO (conditional). #6 Share capital: was already INFO, OK. #7 Link disclaimer: new check property "invert"=True. An anti-pattern check passes when the text is NOT found and fails when it is found. Previously the finding always fired; now it fires only when an illegal disclaimer is in the text. Plus: L2 INFO checks (e.g. profession_chamber) no longer count toward correctness-pct and no longer produce DSI-DETAIL findings. 
Consistent with the P8 model: INFO = "check yourself", not "fail". Verified with the BMW Impressum text; all 7 cases classified correctly: name=passed, representative_person=passed, profession_chamber=INFO, illegal_disclaimer=passed (no disclaimer in the text), dispute_resolution=skipped (no_direct_sales), editorial_visdp=INFO, share_capital=INFO. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 00:52:03 +02:00
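Two of the fixes in the commit above, the \b word boundary from #1 and the "invert" property from #7, can be illustrated with a small sketch. The regexes LEGAL_FORM and DISCLAIMER and the function evaluate_check are assumptions made for this example, not the project's real patterns.

```python
import re

# Hypothetical pattern: legal-form tokens must stand alone as words.
LEGAL_FORM = re.compile(r"\b(ag|gmbh|aktiengesellschaft)\b", re.IGNORECASE)

# Hypothetical anti-pattern: a blanket liability disclaimer.
DISCLAIMER = re.compile(r"keine haftung", re.IGNORECASE)


def evaluate_check(pattern: re.Pattern, text: str, invert: bool = False) -> bool:
    """Return True when the check passes.

    Normal check: passes when the pattern is found.
    Inverted check (anti-pattern): passes when the pattern is NOT found.
    """
    found = pattern.search(text) is not None
    return (not found) if invert else found
```

Without the boundaries, the "ag" alternative would match the tail of "Samstag"; with them, only a standalone token such as the "AG" in "BMW AG" counts.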
Benjamin Admin	6c223c7c9b	feat(compliance-check): exec-summary + voll-audit + TDM-respect + cookie-KB-extended + saving-scan-funnel. P1: exec summary at the top of the email report (4 KPIs + 2 CTAs, dark gradient). P3: no_direct_sales flag for OEM configurator sites; AGB/Widerruf shown as "NICHT ANWENDBAR" (not applicable, grey) instead of "NICHT GEFUNDEN" (not found, red). P5: full-audit unification: all findings (MC + mandatory information + vendor + redundancy) in /data/compliance_audits.db.unified_findings; new /api/compliance/agent/findings/<id> endpoint + FindingsTab in the audit UI with filter + CSV export. P7: crawl hardening: TDM reservation check (robots.txt / ai.txt / header / meta) before every run with a 24h cache; HeadlessChrome UA (company not yet founded; switch via the BREAKPILOT_BRANDED_UA env); per-domain rate limit of 1 req/s + max 2 concurrent. P2: cookie knowledge DB extended additively (35 -> 74 cookies): Adobe, Meta, Microsoft, LinkedIn, TikTok, HubSpot, Marketo, Salesforce, Hotjar, FullStory, Mouseflow, Intercom, Drift, Zendesk, Cloudflare, Stripe, OneTrust/Cookiebot/Usercentrics, Matomo, Pinterest, Snapchat, 
X/Twitter, YouTube, Vimeo, Klaviyo, Mailchimp, Mixpanel, Segment, Amplitude, Optimizely, Datadog; the wire-in in cookie_function_classifier provides a compliance_risk label (kritisch/hoch/mittel/gering, i.e. critical/high/medium/low) per vendor. A: k-anonymity helper (benchmark_k_anonymity) in preparation for P6. B: cross-tenant domain assertion in the /findings endpoint (expected_domain query param -> 403 on mismatch). C: saving-scan funnel: /api/compliance/agent/saving-scan/start with validation + 24h rate limit per domain + lead persistence in saving_scan_leads + auto-discovery via _run_compliance_check; 6 tests. D: risk badge in the email vendor row. Legal guardrails (memory feedback_oem_data_legal.md): only our own brief assessments + source pointers, no 1:1 copies of third-party CMP texts. TDM opt-out is respected per § 44b UrhG. NO schema changes; everything lives in the sidecar SQLite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 23:48:34 +02:00
Benjamin Admin	27384aea09	feat(cra): Phase 5 - Technical Doc + DoC Generator (Annex V + VII). Migration 122: compliance_cra_documents with versioning + approval workflow - doc_type whitelist: doc_eu_conformity, doc_technical, doc_cvd_policy, doc_update_policy, doc_sbom_report - Status state machine: draft → reviewed → approved (+ superseded) - Snapshot generation_context for audit trail. New module cra_doc_templates.py, pure-function generators (no DB access): - doc_eu_conformity: EU DoC structured per CRA Annex VII (all 7 mandatory fields) - doc_technical: technical documentation per CRA Annex V - doc_cvd_policy: ISO/IEC 29147-compliant CVD policy with SLA table - doc_update_policy: Patch/Update policy with Lifecycle + CSAF reference - doc_sbom_report: Latest SBOM summary with top-10 components. Returns (title, markdown_content, requirements_coverage); coverage tracks how many mandatory fields are filled vs placeholders. 
Backend endpoints: - POST /documents/generate — generates doc, supersedes previous version, increments version number atomically - GET /documents — lists all 5 doc types (also "not_generated" stubs) - GET /documents/{id} — full content_md - POST /documents/{id}/approve — set status + signed_by + signed_at Frontend: - /documents page: 5 doc-type cards with Generate/Re-Generate buttons, inline Markdown preview with .md download, 2-step approval flow (reviewed → approved with signature) - Optional params form: manufacturer, notified_body, security_contact - Dashboard: +1 button (Dokumente, 7 buttons total) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 22:10:23 +02:00
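The document status state machine (draft → reviewed → approved, plus superseded) and the version bump on re-generation could be sketched as below. Which states may move to superseded is an assumption, as are the names ALLOWED_TRANSITIONS, transition and next_version; the atomic increment mentioned in the commit would need a database transaction, which this in-memory sketch does not show.

```python
from typing import Dict, Iterable, Set

# Assumption: any non-final state can be superseded by a newer version.
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "draft": {"reviewed", "superseded"},
    "reviewed": {"approved", "superseded"},
    "approved": {"superseded"},
    "superseded": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new status or raise if the step is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


def next_version(existing_versions: Iterable[int]) -> int:
    """Version number for a freshly generated document of the same doc_type."""
    return max(existing_versions, default=0) + 1
```

Under these assumptions a draft cannot jump straight to approved, which matches the 2-step approval flow described for the frontend.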
Benjamin Admin	cc80e59e5e	feat(cra): Phase 4 — Vulnerability Disclosure + Post-Market Monitoring Migration 121: compliance_cra_vulnerabilities table with full lifecycle tracking - Status state machine: reported → triaged → patched → disclosed (+ withdrawn) - CRA Art. 14(2) deadlines tracked: reported_to_enisa_at (24h), detailed_report_at (72h) - CVE-ID, severity, CVSS, affected_components (JSONB), embargo_until Backend endpoints in cra_routes.py: - POST /vulnerabilities — create with validation (severity, CVSS range) - GET /vulnerabilities — list with deadline-breach summary (24h/72h counters) - PATCH /vulnerabilities/{id} — update fields + auto-set lifecycle timestamps - DELETE /vulnerabilities/{id} — soft-delete (withdrawn) - GET /monitoring — combined view: CRA deadlines + vuln summary + post-market checklist Frontend: - /vuln page: intake form, vuln cards with 24h/72h-countdown buttons, status-transition flow with auto-timestamps - /monitoring page: CRA deadlines (11.06.26 / 11.09.26 / 11.12.27), breach banner if 24h/72h obligations missed, post-market checklist with deep-links - Dashboard: +2 buttons (Vulns, Monitoring) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 22:08:49 +02:00
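The 24h/72h deadline-breach counters from the commit above can be pictured with the following sketch. The field names reported_to_enisa_at and detailed_report_at come from the commit message; the reference timestamp aware_at, the function name deadline_breaches and the rule that an open obligation counts as breached once the deadline has passed are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

EARLY_WARNING = timedelta(hours=24)    # reported_to_enisa_at deadline
DETAILED_REPORT = timedelta(hours=72)  # detailed_report_at deadline


def deadline_breaches(
    aware_at: datetime,
    reported_to_enisa_at: Optional[datetime],
    detailed_report_at: Optional[datetime],
    now: datetime,
) -> Dict[str, bool]:
    """Flag missed 24h/72h obligations for one vulnerability (hypothetical helper)."""

    def breached(done_at: Optional[datetime], limit: timedelta) -> bool:
        # An obligation that is still open is measured against the current time.
        return (done_at or now) > aware_at + limit

    return {
        "24h": breached(reported_to_enisa_at, EARLY_WARNING),
        "72h": breached(detailed_report_at, DETAILED_REPORT),
    }
```

Summing the flags over all vulnerabilities of a project would give counters of the kind the list endpoint is said to return.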
Benjamin Admin	662327e8b4	feat(compliance-check): MC classification + embedding + vendor redundancy + action recipes + Borlabs features. Major update based on the BMW test iterations (v1→v9): Core compliance check - Sonnet check_type classification: text/process/review for all 1874 MCs in compliance.doc_check_controls (script + sidecar /data/mc_classification.db). rag_document_checker filters on check_type='text' for doc_check. Plus fits_doc_type audit (v2) + ui_only audit for DSA/e-commerce MCs filed under the wrong doc_type. - scope_requires filter: biometric/ai_decision/child_targeting MCs are filtered per business_profile (FRT skipped for BMW etc.). - Embedding match (BGE-M3) as phase 3 after the regex match: per-doc_type threshold override (impressum 0.50, dse/cookie 0.60), short-field rescue (15-word chunks) for mandatory fields in the Impressum. Title + check_question as embedding input for more context. - Cookie text routing: consent-tester returns cmp_cookie_text from the CMP reconstruct; the backend prefers it over DOM extraction when it is richer (BMW 1824 vs 600 words). 
Vendor redundancy + EU alternatives + cost saving - vendor_redundancy.analyze(): functional categorization of the CMP vendors, detection of multiple providers per category, EU alternative lookup (Matomo, IONOS, HERE, Friendly Captcha, Smart AdServer, ...). - vendor_cost_estimator: tier inference from the cookie footprint (cookie count + premium-feature cookies + third-party share → starter/professional/enterprise/premier). - Self-service advertising (Google/Meta/Pinterest/...) = 0 license cost (media spend only, handled separately). DSP platforms keep a narrow range. - Tier-aware saving range: for enterprise/premier we use the upper 40-100% band of the list prices, not starter→premier. - Multi-function tools (Matomo Pro, SAP CX, IONOS Cloud, Userlike, Smart AdServer, HERE Maps, Vimeo Pro, LamaPoll): one tool replaces several categories at once. Cookie knowledge DB + functional classification - cookie_knowledge_db: 50 curated top cookies (Google/Meta/Adobe/MS/...) with vendor, exact_purpose, data_collected, IAB-TCF-IDs, reid_risk, schrems_ii_status, EuGH (CJEU) rulings, EU alternative. - cookie_function_classifier: functional role per cookie (tracking_id, ad_pixel, session_id, ab_test, csrf, ...) + blocking_impact. Country inference from legal form - cookie_link_validator: the country field is derived from the vendor name (A/S=DK, GmbH=DE, Inc=US, B.V.=NL, ...) plus a vendor lookup table. Reduces false-positive no_country flags for clearly EU vendors (Adform DK, Pinterest IE). Action recipes + doc anchor locator - finding_action_recipes: for each finding type (no_cookies_listed, no_country, broken_opt_out, "Auftragsverarbeiter erwaehnen", "Art. 22 Profiling", ...) a structured instruction with what/why/fix_text/where/example, ready to paste 1:1 into customer documents. - doc_anchor_locator: embedding-based (BGE-M3 cosine); finds the matching paragraph in the existing customer document for each finding. Per-run thread-local cache. Fallback: keyword match. 
- Email rendering integrates recipe + anchor per failed doc check + vendor flag list with a collapsible action list. - Score explanation per vendor row (3/5 subtitle + tooltip). Migration pipeline (compliance check -> customer banner/documents) - migration_to_banner.py: vendor list -> CookieBannerConfig with 4 categories + review flags. - migration_to_document.py: vendor list -> cookie policy + VVT register + privacy policy pre-fills. - agent_migration_routes: 3 preview endpoints (banner-preview, document-preview, summary). Persistence of cmp_vendors in the check_payloads table of /data/compliance_audits.db. Borlabs-parity cookie banner features - Consent history in the banner: window.bpShowConsentHistory() + localStorage. - Content blocker: cookie-banner-content-blocker.ts, with YouTube/Maps/video placeholders until consent is given. - Google Consent Mode v2 extended: wait_for_update + region=EEA/CH/GB. - Consent log export (CSV/JSON) via einwilligungen_export_routes. Bug fixes - canonical_control_routes: _jsonish helper for string-typed jsonb, similar-controls endpoint with _has_embedding_col() cache (no more 500). - Control library frontend: defensive .map coercer in 2 detail views. - Embedding service batching (batches of 32 instead of 165 in one call). - KeyError 'control_id' in the MC result aggregation (defensive .get). - Master controls click-through from /sdk/master-controls to /sdk/control-library?control=<id> with URL-param auto-open. - Dockerfile: /data pre-chowned to appuser (audit DB write permission). - Cookie text routing bug (cmp_reconstructed > DOM-extraction). - doc_type-aware MC filter (instead of all text MCs). - Master contract dedup (60 BMW-internal entries = 1 Adobe contract). - The A3-v2 audit reclassified 24 UI-language MCs as 'process'. 
Tests - test_migration_mappers.py (9 tests) - test_migration_endpoints.py (4 tests) Scripts (one-shot) - classify_mc_check_type.py (v1) + _v2 (PK=control_id,doc_type) - audit_mc_doctype_fit.py (v1 fits) + _v2 (ui_only + scope_requires) BMW run results v1 (broken) -> v9 (all fixes): DSE 7.5% -> 81-83% Impressum 4% -> 100% (all 6 real MCs fulfilled) Cookie 0% -> 79-83% (CMP text routing + embedding) Plus: 10 consolidation categories, estimated saving 200k-3M / year Plus: action recipes + doc anchors for every fail Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 18:30:08 +02:00
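The country inference from the legal form mentioned in this commit (A/S=DK, GmbH=DE, Inc=US, B.V.=NL) can be sketched as a suffix lookup. Only those four mappings come from the commit message; the regexes, the function name infer_country and the first-match-wins order are assumptions, and the additional vendor lookup table (for cases such as Pinterest IE) is not shown.

```python
import re
from typing import List, Optional, Tuple

# Legal-form tokens and countries as listed in the commit message.
LEGAL_FORM_COUNTRY: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"\bA/S(?!\w)"), "DK"),
    (re.compile(r"\bGmbH\b"), "DE"),
    (re.compile(r"\bInc\b"), "US"),
    (re.compile(r"\bB\.V\.(?!\w)"), "NL"),
]


def infer_country(vendor_name: str) -> Optional[str]:
    """Guess the vendor's country from the legal form in its name, if any."""
    for pattern, country in LEGAL_FORM_COUNTRY:
        if pattern.search(vendor_name):
            return country  # first matching legal form wins (assumption)
    return None
```

A vendor name without a recognizable legal form returns None here, which is where a separate name-based lookup table would have to take over.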
Benjamin Admin	1cf5de1d45	feat(cra): CRA Compliance module Phase 1+2+3 (intake, scope, path, requirements, backlog, sbom, checks) Phase 1 — Intake + Scope + Path: - Migration 119: compliance_cra_projects table (intake + classification + path + status state machine) - Backend service cra_routes.py: CRUD + scope-check + path-select - Deterministic Annex III/IV classifier (verbatim mapping from migration 059 wiki) - Path validation per classification (CRITICAL → notified_body mandatory) - Frontend: project list, dashboard, 3-step wizard (intake/scope/path) - Sidebar entry under "CRA Compliance" (red) Phase 2 — Annex I Requirements + Priorisierungs-Backlog: - cra_annex_i_data.py: 40 Annex-I requirements (8 categories), 9 measures (M540-M548), 3 CRA deadlines - Endpoints: /requirements (40 items), /backlog (priority-sorted with deadline pressure) - Frontend: requirements table with filters + expandable details, backlog with deadline banner + score-ranked table - Dashboard KPI cards (Critical count, days to CE deadline, etc.) + top-10 backlog snippet Phase 3 — SBOM Upload + Automated Checks: - Migration 120: compliance_cra_sboms (versioned uploads, CycloneDX + SPDX) - SBOM endpoints: POST /sbom/upload (format detection, summary extraction), GET /sboms - Checks reuse compliance_evidence_checks: init creates 6 default CRA checks, run executes - Real implementations: cra_security_txt (HTTP + Contact: line) and cra_tls_cert_check (TLS handshake) - Frontend: SBOM file upload + version list, Checks page with per-check URL input + Run button Backend-Reuse: gap_projects (intake pre-population), compliance_evidence_checks/_check_results. Tenant scoping via existing X-Tenant-ID header pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 17:56:52 +02:00
Benjamin Admin	df7d83134b	feat(agent): migrate compliance-check results to banner + documents (M1-M5)

After a compliance-check run finishes, the user can now apply the extracted vendor inventory directly to their own:
- CookieBanner config (admin /sdk/einwilligungen)
- Cookie-Policy / VVT-Register / Privacy-Policy templates (admin /sdk/document-generator)

Backend:
- migration_to_banner.py: vendor list -> CookieBannerConfig with ESSENTIAL/PERFORMANCE/PERSONALIZATION/EXTERNAL_MEDIA buckets + review flags (broken opt-out URLs, missing expiry, no cookies listed)
- migration_to_document.py: vendor list -> pre-fills for 3 doc templates, recipient-type aware (INTERNAL/GROUP/PROCESSOR/CONTROLLER)
- agent_migration_routes.py: GET /banner-preview, /document-preview, /summary keyed on check_id
- compliance_audit_log: new check_payloads table persists cmp_vendors + extracted_profile so the preview survives an app restart
- tests: 9 mapper units + 4 endpoint integration tests

Frontend:
- MigrationPanel.tsx: modal showing banner-config diff + document pre-fills, plus links into the existing editors
- ComplianceCheckTab.tsx: replaces standalone audit link with the panel; net -3 lines, stays at the 500-cap

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 14:06:28 +02:00
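The vendor-list-to-banner mapping described for migration_to_banner.py can be sketched as follows. Only the four bucket names and the three review conditions come from the commit message. The `Vendor` dataclass, its field names, the flag strings, and the fallback bucket for unrecognized categories are hypothetical. "Broken" is approximated here as "missing or not an http(s) URL", since a real check would presumably need a network request.

```python
from dataclasses import dataclass, field

# Bucket names as listed in the commit message.
BUCKETS = ("ESSENTIAL", "PERFORMANCE", "PERSONALIZATION", "EXTERNAL_MEDIA")


@dataclass
class Vendor:
    # Hypothetical shape of one extracted vendor entry.
    name: str
    category: str
    opt_out_url: str = ""
    cookies: list[dict] = field(default_factory=list)


def review_flags(v: Vendor) -> list[str]:
    """Collect the conditions a human should review before publishing."""
    flags = []
    if not v.opt_out_url.startswith(("http://", "https://")):
        flags.append("broken_opt_out_url")
    if not v.cookies:
        flags.append("no_cookies_listed")
    elif any(not c.get("expiry") for c in v.cookies):
        flags.append("missing_expiry")
    return flags


def to_banner_config(vendors: list[Vendor]) -> dict:
    """Group vendors into consent buckets and attach review flags."""
    config = {b: [] for b in BUCKETS}
    for v in vendors:
        category = v.category.upper()
        # Assumption: unknown categories land in a consent-gated bucket.
        bucket = category if category in BUCKETS else "PERSONALIZATION"
        config[bucket].append({"vendor": v.name, "review": review_flags(v)})
    return config
```

Routing unknown categories to a non-essential bucket is a conservative choice in this sketch: a misclassified vendor then requires consent rather than loading unconditionally.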
Benjamin Admin	6ed30dae5b	feat(agent): MC scorecard + audit drill-down + tenant trend (A1-A6)

Now that all 1874 MCs run per check (Task #30 cap removal), the report was about to drown in noise. This commit adds the full aggregation / persistence / drill-down stack so each MC is actionable, not just counted.

A1 mc_scorecard.py (new):
  build_scorecard(checks) -> per-regulation PASS/FAIL/SKIP + severity
  top_fails(checks, n) -> N most severe failed MCs
  full_audit_records(...) -> flat rows ready for sidecar SQLite

A2 Email rendering: agent_doc_check_scorecard.py (new) builds an HTML scorecard table (regulation × passed/failed/HIGH/MEDIUM/score) shown at the top of the email. agent_doc_check_report._render_document now collapses the 500-MC L2 forest into an 'X/Y bestanden (Z Fail)' ("X/Y passed (Z fail)") summary plus a top-10 fails block per doc; the old verbose render is gone.

A3 compliance_audit_log.py (new): sidecar SQLite at /data/compliance_audits.db (separate from the compliance Postgres schema to comply with the no-new-migrations rule in CLAUDE.md):
  check_runs(check_id, ts, tenant_id, site_name, base_domain, doc_count, scorecard json, vvt_summary json)
  mc_results(check_id, doc_type, mc_id, label, passed, skipped, severity, regulation, matched_text, hint)
Route persists every run after the email is sent. docker-compose.yml adds compliance-audit volume + env.

A4 backfill_mc_regulation_llm.py (new): Qwen-tagged backfill for the 1636 MCs the regex pass couldn't classify. Batches of 25, format=json, output constrained to the canonical regulation list. Run manually:
    docker exec bp-compliance-backend python3 \
      /app/scripts/backfill_mc_regulation_llm.py [--dry-run]

A5 Admin audit tab: GET /api/compliance/agent/audit/<check_id>, proxied via /api/sdk/v1/agent/audit/<id>. New page /sdk/agent/audit/[checkId] renders scorecard + filterable MC table (status / doc_type / regulation, expandable rows with matched_text + hint). ComplianceCheckTab now shows a 'Voll-Audit oeffnen' ("open full audit") link.

A6 Trend per tenant: GET /api/compliance/agent/audit/tenant/<id> returns recent runs. Email scorecard shows per-regulation delta badges ('(+12%)', '(-3%)') compared with the previous run for the same tenant + base_domain. Lookup is one SQLite query.

Plumbing:
  rag_document_checker.py: SELECT now includes 'article'; MC results carry 'regulation' + 'article' through to CheckItem.
  agent_doc_check_routes.CheckItem schema gains regulation + article fields (defaults '') so old clients still parse.
  agent_compliance_check_routes: response gains 'check_id' so the frontend can build the audit link.	2026-05-17 13:45:58 +02:00
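The aggregation in A1 and the trend badge in A6 can be illustrated with a minimal sketch. The function names `build_scorecard` and `top_fails` and the MC fields (`regulation`, `passed`, `skipped`, `severity`) are taken from the commit message. Everything else is an assumption: the dict-based row shape, the severity ordering including a LOW level, the "unclassified" fallback key, the integer percentage score, and the helper `delta_badge`. The actual mc_scorecard.py may differ.

```python
from collections import defaultdict

# Assumed ordering: lower rank means more severe.
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


def build_scorecard(checks):
    """Count passed/failed/skipped MCs per regulation and derive a score."""
    card = defaultdict(lambda: {"passed": 0, "failed": 0, "skipped": 0})
    for c in checks:
        row = card[c["regulation"] or "unclassified"]
        if c["skipped"]:
            row["skipped"] += 1
        elif c["passed"]:
            row["passed"] += 1
        else:
            row["failed"] += 1
    for row in card.values():
        evaluated = row["passed"] + row["failed"]
        # Skipped MCs do not count toward the score; no score if nothing ran.
        row["score"] = round(100 * row["passed"] / evaluated) if evaluated else None
    return dict(card)


def top_fails(checks, n):
    """Return the n most severe failed (not skipped) MCs."""
    fails = [c for c in checks if not c["passed"] and not c["skipped"]]
    return sorted(fails, key=lambda c: SEVERITY_RANK.get(c["severity"], 99))[:n]


def delta_badge(current, previous):
    """Format the score change against the previous run, e.g. '(+12%)'."""
    if current is None or previous is None:
        return ""
    return f"({current - previous:+d}%)"
```

In this sketch `delta_badge` returns an empty string when either run has no score, so a first run for a tenant + base_domain simply renders without a badge.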


281 Commits