breakpilot-compliance

Author	SHA1	Message	Date
Benjamin Admin	2f0f76e365	fix: Add missing 'import re' to agent_scan_routes.py NameError: name 're' is not defined at line 146 — the import was accidentally removed when extracting helper functions to agent_scan_helpers.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-04 22:59:53 +02:00
Benjamin Admin	53f6f30cf0	feat: DSI document discovery + completeness check in agent scan workflow Agent scan now automatically: 1. Discovers all legal documents via consent-tester /dsi-discovery endpoint 2. Classifies each as DSE/AGB/Widerruf/Cookie/Impressum 3. Checks completeness against type-specific checklists: - DSE: 9 Art. 13 DSGVO mandatory fields (controller, DPO, purposes, legal basis, recipients, third-country, retention, rights, complaint) - AGB: §305ff BGB (scope, contract formation, liability, jurisdiction) - Widerruf: §355 BGB (right info, 14-day deadline, form, consequences) 4. Adds findings per document to scan results 5. Shows discovered documents with completeness % in email summary 6. Returns discovered_documents list in API response New files: - dsi_document_checker.py (229 LOC) — checklists + classifier - agent_scan_helpers.py (109 LOC) — extracted summary builder + corrections Refactor: agent_scan_routes.py 537→448 LOC (under 500 budget) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-04 22:09:45 +02:00
Benjamin Admin	d3c8811fdb	feat: IAB TCF 2.2 — TC String encoder + purpose mapping + UI - TCFEncoderService: generates base64url-encoded TC Strings per IAB spec with 12 purposes, vendor consent bitfield, CMP metadata - Category-to-purpose mapping (necessary→none, statistics→1,7,8,9,10, marketing→1,2,3,4,5,6,7,12, functional→1,11) - tcf_routes: 5 endpoints (purposes, features, mapping, encode, encode-categories) - banner_consent_service: auto-generates TC String when tcf_enabled=true - TCFSettings.tsx: enable/disable toggle, purpose grid with category mapping, TC String test generator, CMP registration info - New "TCF/IAB" tab in cookie-banner page (7 tabs total) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-04 07:01:37 +02:00
Benjamin Admin	c89a68e59e	feat: Whistleblower backend + Scanner banner-check (last 2 gaps) Whistleblower (HinSchG): - Migration 118: 3 tables (reports, messages, measures) with HinSchG deadlines (7d acknowledgment, 3mo feedback) - whistleblower_routes.py: 14 endpoints (CRUD, acknowledge, close, messages, measures, public submit, anonymous status check) - Frontend api-operations.ts rewired from Go SDK to compliance proxy - Access key format XXXX-XXXX-XXXX for anonymous reporters Scanner banner-check (TTDSG § 25): - CMP Dashboard: green "Kein Cookie-Banner erforderlich" when no trackers detected + no banner configured - Red warning "Cookie-Banner fehlt!" when trackers found but no banner - Mandatory note: Impressum (DDG § 5) + DSE (DSGVO Art. 13) still required [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-04 00:22:18 +02:00
Benjamin Admin	eb4ea8bc42	feat: EmailDeliveryService + professional DSR email templates - EmailDeliveryService: load template → find published version → render {{variables}} → send via SMTP → audit log. Fallback to inline HTML when no published template exists. - Migration 117: Professional HTML/text content for all 5 DSR templates (receipt, completion, rejection, identity, extension) with branded styling and proper Art. references - DSRArt11Service now uses EmailDeliveryService with dsr_rejection template instead of hardcoded HTML [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 23:38:32 +02:00
Benjamin Admin	060f351da7	feat: Art. 11 DSGVO — reject DSR when data subject not identifiable - New DSRArt11Service: handles rejection with proper legal basis, automated email notification to requester explaining Art. 11 - POST /dsr/{id}/reject-art11 endpoint - ActionButtons.tsx: "Nicht identifizierbar (Art. 11)" button shown when identity is not yet verified - Also fixes: DSR export type-cast rollback handling Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 23:30:18 +02:00
Benjamin Admin	c55d0ab12a	fix: DSR export type-cast bug + session rollback on partial failures - tenant_id kept as string (PostgreSQL handles UUID cast) - Einwilligungen query uses CAST(:tid AS VARCHAR) for compatibility - Each data source query wrapped with rollback on failure to prevent cascading "transaction aborted" errors Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 23:15:25 +02:00
Benjamin Admin	02468c94c0	feat: DSR User Data Export — Art. 15 PDF + Art. 20 JSON/CSV - DSRExportService: aggregates all CMP data about a user from Banner Consents, Einwilligungen, Audit Trail, DSR History - GET /dsr/{id}/export-user-data?format=json|csv|pdf endpoint - PDF: A4 reportlab with 4 sections (Consents, Einwilligungen, Audit-Trail, DSR-Anfragen) + cover page - CSV: BOM-encoded for Excel with flattened data rows - JSON: structured export with all data categories - ActionButtons.tsx: PDF/JSON/CSV export buttons now functional Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 22:42:03 +02:00
Benjamin Admin	630fffc0cc	feat: Academy integration — training gap detection after document approval (F7) - Migration 115: compliance_role_training_mapping table (org roles → training codes) - TrainingLinkService: queries training_modules/matrix/assignments to find gaps per person and role. Gracefully degrades when Go training tables don't exist yet. - document_review_routes: 2 new endpoints (training-requirements, training-gaps) - _notify_approval() now checks training gaps and sends emails to persons with outstanding modules, linking to /sdk/training/learner [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 22:03:25 +02:00
Benjamin Admin	965af3a34c	feat: A/B Testing + Compliance Report PDF (F5 + F8) F5: A/B Testing for Consent Rate - Migration 116: banner_variants table + variant tracking in audit log - BannerABService: deterministic sticky bucketing via device hash, chi-squared significance testing, variant CRUD - banner_ab_routes: 6 endpoints (CRUD + stats + assign) - ABTestPanel.tsx: variant creation, traffic sliders, opt-in comparison chart with winner/significance badges - New "A/B-Test" tab in cookie-banner page F8: Compliance Report PDF - CompliancePDFGenerator: reportlab-based A4 PDF covering all modules (Company Profile, TOM, VVT, DSFA, Risks, Vendors, Incidents, Reviews, Consents, Roles) - compliance_report_routes: GET /compliance/report/pdf - "Compliance-Report herunterladen" button on SDK dashboard [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 21:42:50 +02:00
Benjamin Admin	c3fcfe88ee	feat: Vendor-level consent + Consent analytics (F4 + F6) F4: Granular Vendor-Level Consent - Migration 113: vendor_consents JSONB on banner_consents + audit_log - ConsentCreate schema + BannerConsentDB model extended - banner_consent_service stores vendor_consents alongside categories - Audit trail includes vendor-level decisions + user_agent F6: Consent Rate Analytics - Migration 114: user_agent on audit_log + time-series index - BannerAnalyticsService: time series, category breakdown, device stats - banner_analytics_routes: 4 endpoints (overview, time-series, categories, devices) - AnalyticsDashboard.tsx: KPIs, bar chart, category bars, device breakdown - New "Analytik" tab in cookie-banner page [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 20:58:06 +02:00
Benjamin Admin	9b4be663f7	feat: Rollenkonzept backend + SOP template (Phase 1-3) - Migration 111: 3 new tables (org_roles, document_reviews, document_role_mapping) with seed data mapping all 71 doc types to 7 compliance roles - org_role_routes.py: CRUD for roles, seed defaults, test email, mapping API - document_review_routes.py: Review lifecycle (create→send→approve/reject) with approval notification to all affected roles - Migration 112: SOP template (ISO 9001 structure, 21 placeholders) - Added standard_operating_procedure to TemplateType, doc-labels, presets [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 13:03:38 +02:00
Benjamin Admin	64700b355e	feat: Review all 12 remaining policy templates + categorize Migration 110: Updated descriptions and version for 12 previously unreviewed templates (asset_management, backup, change_management, cloud_security, devsecops, incident_response, logging, patch_management, secrets_management, vulnerability_management, informationspflichten, verpflichtungserklaerung). All templates assessed as "Very Good" quality — only incremental updates needed (AI Act, CRA, NIS2UmsuCG references in descriptions). informationspflichten: Kept as separate compact checklist (distinct from the full privacy_policy DSI template). verpflichtungserklaerung: Kept as standalone HR document (employee signs at onboarding). Added to HR & Mitarbeiter category. Result: 88 templates, 44 at v1.1+, 0 unreviewed remaining. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 07:19:41 +02:00
Benjamin Admin	4b9cf34243	feat: Full template cleanup + categories by use case Cleanup (109): - Removed DPA duplicates (v1 DE + v1 EN, kept v2 DE) - Removed cookie_banner duplicate (kept larger with IF-blocks) - Removed impressum duplicate (kept larger with IF-blocks) - Removed TOM duplicate (kept newest) - Removed DSFA v1 (kept v2) - Kept all 8 VVT templates (1 main + 7 industry templates) - DB: 98 → 88 templates, 0 duplicates remaining Categories restructured by use case: - Website/App: DSI, Impressum, Cookie, Social Media - Online-Shop: AGB, Widerruf, DSI, Cookie - SaaS/Cloud: AGB, AVV, SLA, Cloud Agreement - App/Plattform: Nutzungsbedingungen, Community Guidelines, AUP - Vertraege (B2B): AVV, NDA, SLA, Cloud - DSGVO-Pflichten: TOM, VVT, Loeschkonzept, DSFA - Sicherheitskonzepte + Richtlinien (separate categories) - HR & Mitarbeiter, Daten-Governance, Vendor, BCM Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 07:09:16 +02:00
Benjamin Admin	5298467275	feat: Privacy notice cleanup + English v2 - 108: Remove DSI duplicate (023 + 093 both wrote privacy_policy DE), remove outdated EN v1, create English Privacy Notice v2 with all modular sections (data categories table, retention periods, processor vs. controller guidance, Art. 21 right to object highlighted) DB now has exactly 2 privacy_policy templates: DE + EN, both v2.0.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 07:03:06 +02:00
Benjamin Admin	91b4034fee	feat: AGB cleanup + English Terms v2 - 106: Remove AGB duplicates and obsolete templates (terms_of_service DE/EN v1.0, liability clause) — replaced by agb v2.0 - 107: English Terms and Conditions v2 (EU-compliant, same structure as DE version with all IF-blocks) DB now has exactly 2 AGB templates: DE + EN, both v2.0.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 06:59:28 +02:00
Benjamin Admin	fe6764df9a	fix: ensure JSONB array fields are always arrays in control API Backend: _ensure_list() converts null/string/malformed JSONB to [] for requirements, test_procedure, evidence, open_anchors, tags. Frontend: defensive Array.isArray() check on ControlDetail.tsx. Fixes: TypeError: A.requirements.map is not a function Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 21:18:10 +02:00
Benjamin Admin	db697924ed	feat: Cookie banner vendors per category + {{COOKIE_TABLE}} generator - CookieBannerOverlay: shows vendors per category with expandable tables (Verarbeiter, Cookies, Dauer, Land) for full transparency - Demo vendors: 4 necessary, 3 statistics, 3 marketing, 3 functional - cookie_table_generator.py: renders {{COOKIE_TABLE}} Markdown tables from vendor configs (DB) or service registry (fallback) - SERVICE_COOKIES: 16 known vendor-to-cookie mappings with provider + country Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 20:06:44 +02:00
Benjamin Admin	17c67b4f25	feat: Cookie-Banner ↔ Backend Integration (DSR, Retention, Consent Proof) Phase 1: Vendor sync from service registry (82+ services → banner vendors) Phase 2: Category-based retention (marketing=90d, statistics=790d, not hardcoded 365d) Phase 3: DSR ↔ Banner email linking (link-email, by-email, Art.17 erasure, Art.15/20 export) Phase 4: Consent sync (Banner → Einwilligungen bridge) Phase 6: Consent proof (SHA256 config hash + config_version in audit log, Art. 7(1) DSGVO) New files: - banner_dsr_service.py — email linking + DSR integration - vendor_banner_sync.py — service registry → vendor configs - migration 106 — linked_email, banner_config_hash, consent_version columns Tests: 20+ new backend tests + 2 Playwright E2E test suites (API + UI) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 19:41:22 +02:00
Benjamin Admin	c5b22e0c99	fix: derive intake flags from DETECTED SERVICES, not from text content Fundamental architecture fix: data processing happens through APIs/scripts/ cookies — NOT through visible page text. A news site about healthcare does NOT process health data. Before: Qwen reads website text → guesses "health_data: true" (WRONG) After: Google Analytics detected → tracking: true (CORRECT, deterministic) New flow: detect services from HTML → map service categories to flags → feed flags into UCCA assessment. No LLM needed for flag extraction. SERVICE_TO_FLAGS maps categories: tracking→tracking, marketing→marketing+ third_party_sharing, payment→payment_data, heatmap→profiling, etc. SPECIFIC_SERVICE_FLAGS for Klarna (Art.22), Stripe (US transfer), etc. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 08:37:51 +02:00
Benjamin Admin	0f3ec9061e	fix: false positive findings + restore docs-src + §312k ecommerce filter 1. Intake prompt: "BETREIBER verarbeitet" (operator processes) instead of "Text erwaehnt" (text mentions). IHK reports on health data → false. Previously: true. 2. §312k check: only for e-commerce/subscription websites (shopping cart, shop, PayPal etc.). IHK has no contracts → no cancellation button needed. 3. docs-src/ restored from commit `9824304` Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 08:26:59 +02:00
Benjamin Admin	e318215cc5	refactor: split agent_analyze_routes (420→309 LOC) + agent docs + migration - Extracted website compliance checks + helpers to website_compliance_checks.py - Created agent documentation (zeroclaw/docs/compliance-agent.md) - DB migration 086 executed (compliance_agent_scans table) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 08:22:52 +02:00
Benjamin Admin	d942b21354	feat: SCC + TIA templates for third-country transfers New templates for the Vendor Compliance module: - 105: Transfer Impact Assessment (TIA) — Schrems II risk assessment with country evaluation, government access assessment, supplementary measures, risk matrix, and go/conditional/deny decision - 105: SCC Companion Document — annexes to EU Decision 2021/914 (module selection C2C/C2P/P2P/P2C, party details, data description, TOMs, sub-processor list) Template recommendations: SCC+TIA triggered by tech_third_country answer Generator: New "Drittlandtransfer" category Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 10:19:56 +02:00
Benjamin Admin	3984f39329	feat: Phase 5 — Special templates (AI policy, BYOD, ISMS, consent, video DSI) Phase 5 of the Document Templates Masterplan: - 104: 5 new special templates: - ai_usage_policy: AI usage policy (AI Act Art. 4 training obligation, forbidden inputs, quality check, labeling, TDM opt-out) - byod_policy: Bring Your Own Device (container solution, remote wipe, DSFA, cost sharing options) - consent_texts: Double-Opt-In texts, newsletter, marketing, tracking, profiling consent, unsubscribe confirmation - video_conference_dsi: Video conference privacy notice (Zoom/Teams/Meet, recording consent, third-country transfer) - isms_manual: ISMS handbook (ISO 27001, document structure map to all other templates, PDCA cycle, management review) Generator: 6 new categories (AI governance, ISMS, consent, special DSI, internal policies) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 09:25:32 +02:00
Benjamin Admin	4417938558	feat: Phase 3 — Security + HR/Vendor/BCM policies Phase 3 of the Document Templates Masterplan: - 103: 4 new security policies (information_security_policy, password_policy, encryption_policy, access_control_policy) + updates for CRA (056) and all 15 HR/Vendor/BCM policies (072) New templates: - Information Security Policy: ISMS-Leitlinie (ISO 27001, BSI, NIS2) - Password Policy: BSI/NIST compliant (12+ chars, MFA, no forced rotation) - Encryption Policy: BSI TR-02102, algorithms, key management, TLS config - Access Control Policy: RBAC, Least Privilege, Zero Trust, recertification Updates: AI Act + NIS2UmsuCG references for CRA and all 15 HR/Vendor/BCM Generator: 6 new categories (security, HR, data, vendor, BCM policies) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 09:05:03 +02:00
Benjamin Admin	90c7f02b40	feat: Phase 2 — Security Concepts + DSFA + DSR updates Phase 2 of the Document Templates Masterplan: - 101: Security Concepts v2 (7 templates) — NIS2UmsuCG references, BSI Grundschutz++ modernization, AI Act cross-references, Zero Trust principle, ransomware-protected backups, NIS2 logging - 102: DSFA + Pflichtenregister + DSR v2 — AI Act Art. 9 for DSFA, NIS2UmsuCG for Pflichtenregister, tenant_id fix for DSR processes All 16 templates reviewed — already at good product level, only incremental updates needed (standards references, cross-doc links). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 08:45:04 +02:00
Benjamin Admin	f591871277	feat: Phase 1 — Whistleblower + Cookie/Impressum + HR-DSI templates Phase 1 of the Document Templates Masterplan: - 098: Whistleblower-Richtlinie (HinSchG) — 10 sections, anonymous reporting, 7-day confirmation, 3-month feedback, reprisal protection - 099: Cookie-Banner + Impressum updates — OS-Plattform discontinued note (July 2025), description updates - 100: Applicant DSI + Employee DSI — two new HR privacy notices with § 26 BDSG, 6-month retention (applicants), modular blocks for video interviews, talent pool, IT monitoring, company vehicles, works council Generator: 25 new fields (whistleblower, applicant, employee categories) Categories: whistleblower, hr_dsi added to document generator Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 08:29:52 +02:00
Benjamin Admin	bae59e2ce0	feat: Document Templates v2 — 11 migrations + scope-based generator Complete overhaul of document generator templates based on paragraph-by-paragraph legal review of attorney-drafted templates (TOM, AVV, AGB, DSI, Community Guidelines, Nutzungsbedingungen, Widerrufsbelehrung, Cookie-Richtlinie). Templates (11 migrations 087-097): - 087: TOM-Dokumentation v2 (11 categories incl. Trennungskontrolle) - 088: AVV Art. 28 DSGVO (complete, §§ 1-11, 3 annexes) - 089: Cross-document updates (Löschkonzept DIN 66399, VVT recipients) - 090: AGB SaaS/Shop v2 (18 §§, B2B/B2C, IoT, physical goods, IP protection) - 091: Community Guidelines v2 (3 tones, 11 modular categories, DSA-compliant) - 092: Media & Content modules (MStV, AI Act Art. 50, UWG, Pressekodex) - 093: DSI/Privacy Policy v2 (Art. 13 complete, shop+corporate modules) - 094: Nutzungsbedingungen (Terms of Use, UGC, tipping, wallet, CC licenses) - 095: Widerrufsbelehrung (SaaS + physical + IoT bundle + combo) - 096: Social Media DSI (Facebook, YouTube, LinkedIn, TikTok, Meta Pixel) - 097: Cookie-Richtlinie v2 (TDDDG § 25, consent banner, browser links) Frontend (generator): - scopeDefaults.ts: L1-L4 scope-based defaults from Compliance Scope Engine - contextBridge.ts: TOMCtx + DPACtx interfaces (70+ new fields) - contextBridge-helpers.ts: 35+ placeholder mappings for TOM/DPA/AGB - _constants.ts: 120+ new generator fields (TOM, DPA, AGB, community, media, social, nutzungsbedingungen, widerruf, cookie, shop, IoT) - page.tsx: Auto-prefill TOM/DPA from scope engine decision Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 01:18:33 +02:00
Benjamin Admin	58957a4aaa	fix: Playwright user permission + etracker DSE matching + CMP skip 1. Dockerfile: install Playwright AS appuser (not root) so chromium binary is accessible at runtime. Was causing 500 error. 2. DSE service matching: text-search fallback when LLM extraction fails. If "etracker" appears in DSE text, mark as documented even without LLM parsing the service list. 3. CMP skip: consent managers in category "cmp" skipped (not just "other" with id "cmp"). NOT DEPLOYED — RAG pipeline is running. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 19:36:46 +02:00
Benjamin Admin	cedc5de15d	feat: Phase 10 — Playwright website scanner replaces httpx New /website-scan endpoint in consent-tester service: - Real browser renders JavaScript (finds dynamic content) - Clicks navigation menus (discovers hidden sub-pages like IHK DSB page) - Follows links within DSE to find regional privacy policies - Collects rendered HTML for each page (after JS execution) Backend integration: - agent_scan_routes tries Playwright first, falls back to httpx - DSE text and HTML extracted from Playwright-rendered pages - Service detection runs on rendered HTML (catches JS-loaded scripts) Also fixes: - GA regex: G-[A-Z0-9]{8,12} prevents CSS class false positives - etracker added to service registry - External page scanning blocked (same-domain only) - CSS/JS/image files excluded from page list Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 19:16:50 +02:00
Benjamin Admin	5eeef3a9c3	fix: 4 bugs from IHK scan — false positives + missing etracker 1. GA regex: G-\w{5,} matched CSS classes (g-7031048). Now requires G-[A-Z0-9]{8,12} (uppercase after G-, 8-12 chars = real GA4 ID) 2. External page scanning: DSE-internal links now SAME DOMAIN only. Previously followed links to etracker.com, google.de/policies etc. and detected services on THOSE sites as IHK services. 3. Added etracker to service registry (DE, ePrivacy-certified) 4. CSS/JS/image files excluded from page scanning 5. Navigation-pattern links for deeper DSE sub-pages Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 19:08:07 +02:00
Benjamin Admin	fff47cc52e	fix: 4 bugs from IHK Konstanz scan validation 1. DSE-Matcher: Google/YouTube false match — now requires 2+ word match for provider-name fallback, not just "Google" matching YouTube section 2. AGB/Widerrufsbelehrung: only_ecommerce flag — skips for non-shop websites (detected via payment providers, cart keywords) 3. DSE-internal link following — scanner now discovers links WITHIN the privacy policy and scans those too (finds regional DSE sub-pages) 4. Expanded keyword synonyms for DSE mandatory checks: - "Zweck und Rechtsgrundlage" now matches "zwecke" - "behoerdlichen datenschutzbeauftragt" matches DSB - "aufsichtsbehörde" with umlaut matches Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 17:57:19 +02:00
Benjamin Admin	0f3ba9c207	test: Lit-Mapping validation — Dict vs Control Library comparison 8 test cases with deliberately wrong legal basis assignments: - Cookie tracking on lit. f (should be lit. a) - Analytics on lit. b (should be lit. a) - Newsletter on lit. f (should be lit. a) - Klarna without Art. 22 - Session recording on lit. f - 2 correct cases (should NOT trigger findings) Runs both hardcoded dict AND Control Library query, compares results. If Control Library passes all → dict can be removed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 16:56:38 +02:00
Benjamin Admin	2c9cea74e3	docs: instruction for hardcoded knowledge → Control Library migration 6 files with hardcoded legal knowledge identified. Review deadline 2026-07-01. legal_basis_validator.py marked with warning log on every use. Instruction file for other session to execute migration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 16:33:48 +02:00
Benjamin Admin	4bf92f42b8	feat: Phase 9 — Authenticated Testing + Legal Basis Validator (lit. mapping) Phase 9: Playwright login + 5 post-login checks: - §312k BGB: cancellation button (2 clicks) - Art. 17 DSGVO: delete account - Art. 20 DSGVO: export data - Art. 7(3): withdraw consents - Art. 15: view profile data Auto-detects login form selectors. Credentials destroyed after test. Legal Basis Validator: Checks 7 common lit-mapping mistakes: - Cookie tracking on lit. f instead of lit. a (Planet49) - Analytics on lit. b (contract overextension) - Klarna without Art. 22 reference - Session recording without consent Integrated into website scan pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 16:08:41 +02:00
Benjamin Admin	8336c01c5c	feat: Phase 6-8 — PDF export, recurring scans, multi-website compare Phase 6: PDF export via WeasyPrint — POST /agent/scans/pdf generates printable compliance report with findings table, service comparison, risk badge, and legal disclaimer. Phase 7: Recurring scans — POST /agent/monitored-urls to add URLs, POST /agent/run-scheduled triggers all enabled scans (cron/ZeroClaw). In-memory storage with DB upgrade path. Phase 8: Multi-website compare — POST /agent/compare with 2-5 URLs, parallel scanning, comparison table (risk, findings, services, compliance features per site). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 15:27:51 +02:00
Benjamin Admin	e35db90232	feat: Phase 5 — DB persistence for scan results + Phase 10 in plan - Migration 086: compliance_agent_scans table (findings, services, corrections) - agent_history_routes.py: POST /scans (save), GET /scans (list), GET /scans/{id} - Scan results survive page reloads and can be reviewed later - Phase 10 (Playwright website scanner) added to product roadmap Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 15:17:51 +02:00
Benjamin Admin	53774886e7	perf: Phase 4 — parallel page fetching (asyncio.gather) Scan pages in parallel instead of sequential. Reduces scan time from ~10s (5 pages × 2s) to ~3s (all pages at once). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 15:09:03 +02:00
Benjamin Admin	5c5054f740	feat: Phase 3 — registry 82 services, mandatory checker, SDK flow step - website_scanner.py: imports from master service_registry.py (82 services) - agent_scan_routes.py: mandatory content checks (documents + DSE sections) - steps-betrieb.ts: Compliance Agent step added to SDK Flow (seq 5000) - PLAN: Phase 9 (Authenticated Testing) added to product roadmap Mandatory checks know what MUST be there: - Documents: Impressum, DSE, AGB, Widerrufsbelehrung - DSE content: 9 Art. 13 DSGVO fields (DSB, Speicherdauer, etc.) - Impressum content: 5 §5 TMG fields (GF, HRB, USt-ID, etc.) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 15:04:44 +02:00
Benjamin Admin	642382cbe8	feat: Mandatory Content Checker — knows what MUST be there Three check levels: 1. Documents: Impressum, DSE, AGB, Widerrufsbelehrung must exist as pages 2. DSE content: 9 Art. 13 DSGVO mandatory sections (Verantwortlicher, DSB-Kontakt, Zwecke, Rechtsgrundlagen, Speicherdauer, Betroffenenrechte, Beschwerderecht, Drittlandtransfer, Profiling) 3. Impressum content: 5 §5 TMG mandatory fields (GF, Handelsregister, USt-ID, Anschrift, Kontakt) Detects both missing documents AND missing content within documents. Also catches HTTP errors (page exists but returns 404/500). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 14:23:22 +02:00
Benjamin Admin	f219b9c244	feat: Master Service Registry — 82 third-party services across 15 categories Tracking (12), Marketing/Ads (9), Newsletter (8), CDN/Fonts (7), Chatbots/Support (7), Payment (5), Heatmaps (4), A/B Testing (3), Tag Managers (3), Push (3), Video (4), Social (3), Error Tracking (4), CRM (3), Maps (3), Captcha (3), Accessibility (2), CMP (1). Each entry: regex, provider, country, EU adequacy, consent requirement, legal reference. Pure data, no logic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 14:21:32 +02:00
Benjamin Admin	0ba76d041a	feat: DSE parser + matcher — textblock references in scan findings - dse_parser.py: HTML → structured sections (heading, number, content, parent) Uses heading hierarchy (h1-h4) with regex fallback - dse_matcher.py: matches detected services against DSE sections Exact name → provider → category matching with insertion point suggestion - agent_scan_routes: TextReference model in findings (original text, section, paragraph, correction type, insert_after) Enables showing: "Google Analytics not found in DSE, insert after Section 2.4 Cookies und Tracking" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 11:55:26 +02:00
Benjamin Admin	4298ae17ab	feat: Phase 0+1 — LLM intake extraction + control relevance filter Phase 0: Qwen extracts 14 structured intake flags (personal_data, marketing, profiling, ai_usage, etc.) instead of keyword matching. Fallback to keywords if LLM unavailable. Flags feed into UCCA for accurate scoring. Phase 1: Control relevance filter removes false positives. C_TRANSPARENCY only recommended if AI/ML keywords found in text. 7 control rules with keyword lists + intake flag fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 11:36:24 +02:00
Benjamin Admin	6a77cf6a89	feat: HTML email format, tab info hints, scan history - Summary now renders as styled HTML (table layout, colored risk badge, warning banners) instead of plaintext in <div> - Tab info text explains scope: "Analysiert nur die eingegebene URL" vs "Scannt automatisch 5-10 Unterseiten" - Scan history with findings count badge and page count Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 11:04:29 +02:00
Benjamin Admin	b39c1d5dce	feat: DSR process descriptions Art. 15-21 with swim-lane diagrams 7 complete process descriptions for the Document Generator: - Art. 15: Right of access (30 days, 6 steps, information catalogue) - Art. 16: Right to rectification (14 days, incl. Art. 19 notification) - Art. 17: Right to erasure (14 days, Art. 17(3) exceptions checklist) - Art. 18: Right to restriction (14 days, permitted processing) - Art. 19: Notification obligation (automatic for Art. 16/17/18) - Art. 20: Data portability (30 days, JSON/CSV/XML export) - Art. 21: Right to object (30 days, special case direct marketing) Each description contains: - Mermaid swim-lane diagram (data subject/case handling/specialist department/DPO) - Detailed step table with responsibilities and deadlines - Legal basis references - Company placeholders (FIRMENNAME, VERSION, DATUM, DSB_NAME) Integration: - 7 new types in VALID_DOCUMENT_TYPES (legal_template_routes.py) - New category "DSR-Prozesse" in the Document Generator frontend - DSR types-core.ts: templateType field links DSR → Document Generator - Migration 085 seeds the templates into the legal_templates table [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 17:53:44 +02:00
Benjamin Admin	b06a33a5fe	fix: syntax error — missing closing paren in scan summary builder	2026-04-28 17:41:11 +02:00
Benjamin Admin	6c0e76f96d	feat: show scanned pages in email summary + frontend (expandable list) Email now lists all scanned URLs with checkmark/cross status. Frontend shows collapsible "X Seiten gescannt — Details anzeigen". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 17:26:03 +02:00
Benjamin Admin	0106f3b5b6	fix: use Ollama directly for correction generation (bypass SDK think-mode) SDK LLM chat returns empty content due to Qwen think-mode. Direct Ollama /api/generate call with stream:false gets the full response including think tags which we strip. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 16:30:51 +02:00
Benjamin Admin	b175ad2594	fix: increase LLM timeouts for scan corrections (90s) and DSE extraction (120s) Qwen 3.5:35b needs ~30-60s per call. Multi-call scan was timing out. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 16:05:35 +02:00
Benjamin Admin	711b9b3146	feat: website scanner with SOLL/IST service comparison + corrections - website_scanner.py: multi-page crawl, 20+ service patterns (tracking, CDN, chatbots, payment, fonts, captcha, video), AI text detection - dse_service_extractor.py: LLM extracts services from privacy policy text - agent_scan_routes.py: POST /agent/scan — combines scan + DSE comparison, generates findings (undocumented, outdated, third-country transfer), auto-corrections via Qwen in pre-launch mode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 15:35:31 +02:00
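
The GA regex change described in commits 5eeef3a9c3 and cedc5de15d can be illustrated with a minimal sketch. Both patterns are copied from the commit messages. The sample HTML, the helper name `find_ga4_ids`, and the use of `re.IGNORECASE` on the old pattern (which would explain why the lowercase CSS class `g-7031048` matched) are assumptions for illustration, not the repository's actual code.

```python
import re

# Old pattern from the commit message. IGNORECASE is an assumption that
# would explain the reported false positive on the CSS class "g-7031048".
OLD_GA = re.compile(r"G-\w{5,}", re.IGNORECASE)

# New pattern from the commit message: uppercase "G-" followed by
# 8 to 12 uppercase letters or digits.
NEW_GA = re.compile(r"G-[A-Z0-9]{8,12}")

# Hypothetical page fragment: one CSS class and one made-up GA4 ID.
SAMPLE_HTML = (
    '<div class="g-7031048"></div>'
    '<script>gtag("config", "G-ABCDE12345");</script>'
)


def find_ga4_ids(html: str, pattern: re.Pattern[str]) -> list[str]:
    """Return every substring of html that the given pattern matches."""
    return pattern.findall(html)


old_hits = find_ga4_ids(SAMPLE_HTML, OLD_GA)  # includes the CSS class
new_hits = find_ga4_ids(SAMPLE_HTML, NEW_GA)  # only the GA4-style ID
```

Under these assumptions the old pattern reports both the CSS class and the measurement ID, while the stricter pattern reports only the ID.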


239 Commits
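
The category-to-purpose mapping listed in commit d3c8811fdb can be sketched as a lookup table plus a union over the accepted categories. The purpose numbers are copied from the commit message. The names `CATEGORY_TO_PURPOSES` and `purposes_for`, and the choice to ignore unknown categories, are hypothetical and may differ from what TCFEncoderService does.

```python
# Mapping copied from the commit message (necessary maps to no purposes).
CATEGORY_TO_PURPOSES: dict[str, tuple[int, ...]] = {
    "necessary": (),
    "statistics": (1, 7, 8, 9, 10),
    "marketing": (1, 2, 3, 4, 5, 6, 7, 12),
    "functional": (1, 11),
}


def purposes_for(categories: list[str]) -> list[int]:
    """Return the sorted union of purpose IDs for the accepted categories.

    Unknown category names contribute no purposes (an assumption made
    for this sketch).
    """
    purposes: set[int] = set()
    for category in categories:
        purposes.update(CATEGORY_TO_PURPOSES.get(category, ()))
    return sorted(purposes)


# Example: a visitor who accepts statistics and functional cookies.
accepted = purposes_for(["necessary", "statistics", "functional"])
```

The resulting purpose list is what an encoder would then pack into the purpose-consent bitfield of the TC String; that encoding step is not shown here.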