breakpilot-compliance

Author	SHA1	Message	Date
Benjamin Admin	53f6f30cf0	feat: DSI document discovery + completeness check in agent scan workflow Agent scan now automatically: 1. Discovers all legal documents via consent-tester /dsi-discovery endpoint 2. Classifies each as DSE/AGB/Widerruf/Cookie/Impressum 3. Checks completeness against type-specific checklists: - DSE: 9 Art. 13 DSGVO mandatory fields (controller, DPO, purposes, legal basis, recipients, third-country, retention, rights, complaint) - AGB: §305ff BGB (scope, contract formation, liability, jurisdiction) - Widerruf: §355 BGB (right info, 14-day deadline, form, consequences) 4. Adds findings per document to scan results 5. Shows discovered documents with completeness % in email summary 6. Returns discovered_documents list in API response New files: - dsi_document_checker.py (229 LOC) — checklists + classifier - agent_scan_helpers.py (109 LOC) — extracted summary builder + corrections Refactor: agent_scan_routes.py 537→448 LOC (under 500 budget) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-04 22:09:45 +02:00
Benjamin Admin	d3c8811fdb	feat: IAB TCF 2.2 — TC String encoder + purpose mapping + UI - TCFEncoderService: generates base64url-encoded TC Strings per IAB spec with 12 purposes, vendor consent bitfield, CMP metadata - Category-to-purpose mapping (necessary→none, statistics→1,7,8,9,10, marketing→1,2,3,4,5,6,7,12, functional→1,11) - tcf_routes: 5 endpoints (purposes, features, mapping, encode, encode-categories) - banner_consent_service: auto-generates TC String when tcf_enabled=true - TCFSettings.tsx: enable/disable toggle, purpose grid with category mapping, TC String test generator, CMP registration info - New "TCF/IAB" tab in cookie-banner page (7 tabs total) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-04 07:01:37 +02:00
Benjamin Admin	eb4ea8bc42	feat: EmailDeliveryService + professional DSR email templates - EmailDeliveryService: load template → find published version → render {{variables}} → send via SMTP → audit log. Fallback to inline HTML when no published template exists. - Migration 117: Professional HTML/text content for all 5 DSR templates (receipt, completion, rejection, identity, extension) with branded styling and proper Art. references - DSRArt11Service now uses EmailDeliveryService with dsr_rejection template instead of hardcoded HTML [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 23:38:32 +02:00
Benjamin Admin	060f351da7	feat: Art. 11 DSGVO — reject DSR when data subject not identifiable - New DSRArt11Service: handles rejection with proper legal basis, automated email notification to requester explaining Art. 11 - POST /dsr/{id}/reject-art11 endpoint - ActionButtons.tsx: "Nicht identifizierbar (Art. 11)" button shown when identity is not yet verified - Also fixes: DSR export type-cast rollback handling Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 23:30:18 +02:00
Benjamin Admin	c55d0ab12a	fix: DSR export type-cast bug + session rollback on partial failures - tenant_id kept as string (PostgreSQL handles UUID cast) - Einwilligungen query uses CAST(:tid AS VARCHAR) for compatibility - Each data source query wrapped with rollback on failure to prevent cascading "transaction aborted" errors Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 23:15:25 +02:00
Benjamin Admin	02468c94c0	feat: DSR User Data Export — Art. 15 PDF + Art. 20 JSON/CSV - DSRExportService: aggregates all CMP data about a user from Banner Consents, Einwilligungen, Audit Trail, DSR History - GET /dsr/{id}/export-user-data?format=json|csv|pdf endpoint - PDF: A4 reportlab with 4 sections (Consents, Einwilligungen, Audit-Trail, DSR-Anfragen) + cover page - CSV: BOM-encoded for Excel with flattened data rows - JSON: structured export with all data categories - ActionButtons.tsx: PDF/JSON/CSV export buttons now functional Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 22:42:03 +02:00
Benjamin Admin	630fffc0cc	feat: Academy integration — training gap detection after document approval (F7) - Migration 115: compliance_role_training_mapping table (org roles → training codes) - TrainingLinkService: queries training_modules/matrix/assignments to find gaps per person and role. Gracefully degrades when Go training tables don't exist yet. - document_review_routes: 2 new endpoints (training-requirements, training-gaps) - _notify_approval() now checks training gaps and sends emails to persons with outstanding modules, linking to /sdk/training/learner [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 22:03:25 +02:00
Benjamin Admin	965af3a34c	feat: A/B Testing + Compliance Report PDF (F5 + F8) F5: A/B Testing for Consent Rate - Migration 116: banner_variants table + variant tracking in audit log - BannerABService: deterministic sticky bucketing via device hash, chi-squared significance testing, variant CRUD - banner_ab_routes: 6 endpoints (CRUD + stats + assign) - ABTestPanel.tsx: variant creation, traffic sliders, opt-in comparison chart with winner/significance badges - New "A/B-Test" tab in cookie-banner page F8: Compliance Report PDF - CompliancePDFGenerator: reportlab-based A4 PDF covering all modules (Company Profile, TOM, VVT, DSFA, Risks, Vendors, Incidents, Reviews, Consents, Roles) - compliance_report_routes: GET /compliance/report/pdf - "Compliance-Report herunterladen" button on SDK dashboard [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 21:42:50 +02:00
Benjamin Admin	c3fcfe88ee	feat: Vendor-level consent + Consent analytics (F4 + F6) F4: Granular Vendor-Level Consent - Migration 113: vendor_consents JSONB on banner_consents + audit_log - ConsentCreate schema + BannerConsentDB model extended - banner_consent_service stores vendor_consents alongside categories - Audit trail includes vendor-level decisions + user_agent F6: Consent Rate Analytics - Migration 114: user_agent on audit_log + time-series index - BannerAnalyticsService: time series, category breakdown, device stats - banner_analytics_routes: 4 endpoints (overview, time-series, categories, devices) - AnalyticsDashboard.tsx: KPIs, bar chart, category bars, device breakdown - New "Analytik" tab in cookie-banner page [migration-approved] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 20:58:06 +02:00
Benjamin Admin	fe6764df9a	fix: ensure JSONB array fields are always arrays in control API Backend: _ensure_list() converts null/string/malformed JSONB to [] for requirements, test_procedure, evidence, open_anchors, tags. Frontend: defensive Array.isArray() check on ControlDetail.tsx. Fixes: TypeError: A.requirements.map is not a function Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 21:18:10 +02:00
Benjamin Admin	db697924ed	feat: Cookie banner vendors per category + {{COOKIE_TABLE}} generator - CookieBannerOverlay: shows vendors per category with expandable tables (Verarbeiter, Cookies, Dauer, Land) for full transparency - Demo vendors: 4 necessary, 3 statistics, 3 marketing, 3 functional - cookie_table_generator.py: renders {{COOKIE_TABLE}} Markdown tables from vendor configs (DB) or service registry (fallback) - SERVICE_COOKIES: 16 known vendor-to-cookie mappings with provider + country Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 20:06:44 +02:00
Benjamin Admin	17c67b4f25	feat: Cookie-Banner ↔ Backend Integration (DSR, Retention, Consent Proof) Phase 1: Vendor sync from service registry (82+ services → banner vendors) Phase 2: Category-based retention (marketing=90d, statistics=790d, not hardcoded 365d) Phase 3: DSR ↔ Banner email linking (link-email, by-email, Art.17 erasure, Art.15/20 export) Phase 4: Consent sync (Banner → Einwilligungen bridge) Phase 6: Consent proof (SHA256 config hash + config_version in audit log, Art. 7(1) DSGVO) New files: - banner_dsr_service.py — email linking + DSR integration - vendor_banner_sync.py — service registry → vendor configs - migration 106 — linked_email, banner_config_hash, consent_version columns Tests: 20+ new backend tests + 2 Playwright E2E test suites (API + UI) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 19:41:22 +02:00
Benjamin Admin	c5b22e0c99	fix: derive intake flags from DETECTED SERVICES, not from text content Fundamental architecture fix: data processing happens through APIs/scripts/ cookies — NOT through visible page text. A news site about healthcare does NOT process health data. Before: Qwen reads website text → guesses "health_data: true" (WRONG) After: Google Analytics detected → tracking: true (CORRECT, deterministic) New flow: detect services from HTML → map service categories to flags → feed flags into UCCA assessment. No LLM needed for flag extraction. SERVICE_TO_FLAGS maps categories: tracking→tracking, marketing→marketing+ third_party_sharing, payment→payment_data, heatmap→profiling, etc. SPECIFIC_SERVICE_FLAGS for Klarna (Art.22), Stripe (US transfer), etc. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 08:37:51 +02:00
Benjamin Admin	0f3ec9061e	fix: false positive findings + restore docs-src + §312k ecommerce filter 1. Intake prompt: "BETREIBER verarbeitet" ("the OPERATOR processes") instead of "Text erwaehnt" ("the text mentions"). IHK reports on health data → false. Previously: true. 2. §312k check: only for e-commerce/subscription websites (cart, shop, PayPal etc.). IHK has no contracts → no cancellation button needed. 3. docs-src/ restored from commit `9824304` Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 08:26:59 +02:00
Benjamin Admin	e318215cc5	refactor: split agent_analyze_routes (420→309 LOC) + agent docs + migration - Extracted website compliance checks + helpers to website_compliance_checks.py - Created agent documentation (zeroclaw/docs/compliance-agent.md) - DB migration 086 executed (compliance_agent_scans table) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 08:22:52 +02:00
Benjamin Admin	58957a4aaa	fix: Playwright user permission + etracker DSE matching + CMP skip 1. Dockerfile: install Playwright AS appuser (not root) so chromium binary is accessible at runtime. Was causing 500 error. 2. DSE service matching: text-search fallback when LLM extraction fails. If "etracker" appears in DSE text, mark as documented even without LLM parsing the service list. 3. CMP skip: consent managers in category "cmp" skipped (not just "other" with id "cmp"). NOT DEPLOYED — RAG pipeline is running. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 19:36:46 +02:00
Benjamin Admin	5eeef3a9c3	fix: 4 bugs from IHK scan — false positives + missing etracker 1. GA regex: G-\w{5,} matched CSS classes (g-7031048). Now requires G-[A-Z0-9]{8,12} (uppercase after G-, 8-12 chars = real GA4 ID) 2. External page scanning: DSE-internal links now SAME DOMAIN only. Previously followed links to etracker.com, google.de/policies etc. and detected services on THOSE sites as IHK services. 3. Added etracker to service registry (DE, ePrivacy-certified) 4. CSS/JS/image files excluded from page scanning 5. Navigation-pattern links for deeper DSE sub-pages Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 19:08:07 +02:00
Benjamin Admin	fff47cc52e	fix: 4 bugs from IHK Konstanz scan validation 1. DSE-Matcher: Google/YouTube false match — now requires 2+ word match for provider-name fallback, not just "Google" matching YouTube section 2. AGB/Widerrufsbelehrung: only_ecommerce flag — skips for non-shop websites (detected via payment providers, cart keywords) 3. DSE-internal link following — scanner now discovers links WITHIN the privacy policy and scans those too (finds regional DSE sub-pages) 4. Expanded keyword synonyms for DSE mandatory checks: - "Zweck und Rechtsgrundlage" now matches "zwecke" - "behoerdlichen datenschutzbeauftragt" matches DSB - "aufsichtsbehörde" with umlaut matches Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 17:57:19 +02:00
Benjamin Admin	2c9cea74e3	docs: instruction for hardcoded knowledge → Control Library migration 6 files with hardcoded legal knowledge identified. Review deadline 2026-07-01. legal_basis_validator.py marked with warning log on every use. Instruction file for other session to execute migration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 16:33:48 +02:00
Benjamin Admin	4bf92f42b8	feat: Phase 9 — Authenticated Testing + Legal Basis Validator (lit. mapping) Phase 9: Playwright login + 5 post-login checks: - §312k BGB: cancellation button (2 clicks) - Art. 17 DSGVO: delete account - Art. 20 DSGVO: export data - Art. 7(3): withdraw consents - Art. 15: view profile data Auto-detects login form selectors. Credentials destroyed after test. Legal Basis Validator: Checks 7 common lit-mapping mistakes: - Cookie tracking on lit. f instead of lit. a (Planet49) - Analytics on lit. b (contract overextension) - Klarna without Art. 22 reference - Session recording without consent Integrated into website scan pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 16:08:41 +02:00
Benjamin Admin	8336c01c5c	feat: Phase 6-8 — PDF export, recurring scans, multi-website compare Phase 6: PDF export via WeasyPrint — POST /agent/scans/pdf generates printable compliance report with findings table, service comparison, risk badge, and legal disclaimer. Phase 7: Recurring scans — POST /agent/monitored-urls to add URLs, POST /agent/run-scheduled triggers all enabled scans (cron/ZeroClaw). In-memory storage with DB upgrade path. Phase 8: Multi-website compare — POST /agent/compare with 2-5 URLs, parallel scanning, comparison table (risk, findings, services, compliance features per site). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 15:27:51 +02:00
Benjamin Admin	53774886e7	perf: Phase 4 — parallel page fetching (asyncio.gather) Scan pages in parallel instead of sequential. Reduces scan time from ~10s (5 pages × 2s) to ~3s (all pages at once). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 15:09:03 +02:00
Benjamin Admin	5c5054f740	feat: Phase 3 — registry 82 services, mandatory checker, SDK flow step - website_scanner.py: imports from master service_registry.py (82 services) - agent_scan_routes.py: mandatory content checks (documents + DSE sections) - steps-betrieb.ts: Compliance Agent step added to SDK Flow (seq 5000) - PLAN: Phase 9 (Authenticated Testing) added to product roadmap Mandatory checks know what MUST be there: - Documents: Impressum, DSE, AGB, Widerrufsbelehrung - DSE content: 9 Art. 13 DSGVO fields (DSB, Speicherdauer, etc.) - Impressum content: 5 §5 TMG fields (GF, HRB, USt-ID, etc.) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 15:04:44 +02:00
Benjamin Admin	642382cbe8	feat: Mandatory Content Checker — knows what MUST be there Three check levels: 1. Documents: Impressum, DSE, AGB, Widerrufsbelehrung must exist as pages 2. DSE content: 9 Art. 13 DSGVO mandatory sections (controller, DPO contact, purposes, legal bases, retention period, data subject rights, right to complain, third-country transfer, profiling) 3. Impressum content: 5 §5 TMG mandatory fields (managing director, commercial register, VAT ID, address, contact) Detects both missing documents AND missing content within documents. Also catches HTTP errors (page exists but returns 404/500). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 14:23:22 +02:00
Benjamin Admin	f219b9c244	feat: Master Service Registry — 82 third-party services across 15 categories Tracking (12), Marketing/Ads (9), Newsletter (8), CDN/Fonts (7), Chatbots/Support (7), Payment (5), Heatmaps (4), A/B Testing (3), Tag Managers (3), Push (3), Video (4), Social (3), Error Tracking (4), CRM (3), Maps (3), Captcha (3), Accessibility (2), CMP (1). Each entry: regex, provider, country, EU adequacy, consent requirement, legal reference. Pure data, no logic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 14:21:32 +02:00
Benjamin Admin	0ba76d041a	feat: DSE parser + matcher — textblock references in scan findings - dse_parser.py: HTML → structured sections (heading, number, content, parent) Uses heading hierarchy (h1-h4) with regex fallback - dse_matcher.py: matches detected services against DSE sections Exact name → provider → category matching with insertion point suggestion - agent_scan_routes: TextReference model in findings (original text, section, paragraph, correction type, insert_after) Enables showing: "Google Analytics not found in DSE, insert after Section 2.4 Cookies und Tracking" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 11:55:26 +02:00
Benjamin Admin	4298ae17ab	feat: Phase 0+1 — LLM intake extraction + control relevance filter Phase 0: Qwen extracts 14 structured intake flags (personal_data, marketing, profiling, ai_usage, etc.) instead of keyword matching. Fallback to keywords if LLM unavailable. Flags feed into UCCA for accurate scoring. Phase 1: Control relevance filter removes false positives. C_TRANSPARENCY only recommended if AI/ML keywords found in text. 7 control rules with keyword lists + intake flag fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 11:36:24 +02:00
Benjamin Admin	b175ad2594	fix: increase LLM timeouts for scan corrections (90s) and DSE extraction (120s) Qwen 3.5:35b needs ~30-60s per call. Multi-call scan was timing out. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 16:05:35 +02:00
Benjamin Admin	711b9b3146	feat: website scanner with SOLL/IST service comparison + corrections - website_scanner.py: multi-page crawl, 20+ service patterns (tracking, CDN, chatbots, payment, fonts, captcha, video), AI text detection - dse_service_extractor.py: LLM extracts services from privacy policy text - agent_scan_routes.py: POST /agent/scan — combines scan + DSE comparison, generates findings (undocumented, outdated, third-country transfer), auto-corrections via Qwen in pre-launch mode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 15:35:31 +02:00
Benjamin Admin	0c0dd4e3a6	feat: ZeroClaw compliance agent — document analysis + role assignment + email Add autonomous compliance agent that fetches web documents (cookie banners, privacy policies), classifies them via Qwen/Ollama, assesses DSGVO compliance, assigns to the responsible role, and sends notification emails. Components: - ZeroClaw SOP (6-step workflow: fetch, classify, assess, summarize, assign, notify) - Backend: /api/compliance/agent/analyze (combined endpoint) - Backend: /api/compliance/agent/notify (standalone email) - Frontend: /sdk/agent page (Manager UI with URL input + results) - Helper scripts + E2E test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-27 23:28:21 +02:00
Sharang Parnerkar	c43d9da6d0	merge: sync with origin/main, take upstream on conflicts # Conflicts: # admin-compliance/lib/sdk/types.ts # admin-compliance/lib/sdk/vendor-compliance/types.ts	2026-04-16 16:26:48 +02:00
Sharang Parnerkar	769e8c12d5	chore: mypy cleanup — comprehensive disable headers for agent-created services Adds scoped mypy disable-error-code headers to all 15 agent-created service files covering the ORM Column[T] + raw-SQL result type issues. Updates mypy.ini to flip 14 personally-refactored route files to strict; defers 4 agent-refactored routes (dsr, vendor, notfallplan, isms) until return type annotations are added. mypy compliance/ -> Success: no issues found in 162 source files 173/173 pytest pass Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 11:23:43 +02:00
Sharang Parnerkar	7344e5806e	refactor(backend/isms): split isms_assessment_service.py to stay under 500 LOC The previous commit (`32e121f`) left isms_assessment_service.py at 639 LOC, exceeding the 500-line hard cap. This follow-up extracts ReadinessCheckService and OverviewService into a new isms_readiness_service.py (400 LOC), leaving isms_assessment_service.py at 257 LOC (Management Reviews, Internal Audits, Audit Trail only). Updated isms_routes.py imports to reference the new service file. File sizes after split: - isms_routes.py: 446 LOC (thin handlers) - isms_governance_service.py: 416 LOC (scope, context, policy, objectives, SoA) - isms_findings_service.py: 276 LOC (findings, CAPA) - isms_assessment_service.py: 257 LOC (mgmt reviews, internal audits, audit trail) - isms_readiness_service.py: 400 LOC (readiness check, ISO 27001 overview) All 58 integration tests + 173 unit/contract tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 20:50:30 +02:00
Sharang Parnerkar	32e121f2a3	refactor(backend/api): extract ISMS services (Step 4 — file 18 of 18) compliance/api/isms_routes.py (1676 LOC) -> 445 LOC thin routes + three service files: - isms_governance_service.py (416) — scope, context, policy, objectives, SoA - isms_findings_service.py (276) — findings, CAPA, audit trail - isms_assessment_service.py (639) — management reviews, internal audits, readiness checks, ISO 27001 overview NOTE: isms_assessment_service.py exceeds the 500-line hard cap at 639 LOC. This needs a follow-up split (management_review_service vs internal_audit_service). Flagged for next session. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 20:34:59 +02:00
Sharang Parnerkar	07d470edee	refactor(backend/api): extract DSR services (Step 4 — file 15 of 18) compliance/api/dsr_routes.py (1176 LOC) -> 369 LOC thin routes + 469-line DsrService + 487-line DsrWorkflowService + 101-line schemas. Two-service split for Data Subject Request (DSGVO Art. 15-22): - dsr_service.py: CRUD, list, stats, export, audit log - dsr_workflow_service.py: identity verification, processing, portability, escalation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 20:34:48 +02:00
Sharang Parnerkar	a84dccb339	refactor(backend/api): extract vendor compliance services (Step 4) Split vendor_compliance_routes.py (1107 LOC) into thin route handlers plus three service modules: VendorService (vendors CRUD/stats/status), ContractService (contracts CRUD), and FindingService + ControlInstanceService + ControlsLibraryService (findings, control instances, controls library). All files under 500 lines. 215 tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 20:11:24 +02:00
Sharang Parnerkar	1a2ae896fb	refactor(backend/api): extract Notfallplan schemas + services (Step 4) Split notfallplan_routes.py (1018 LOC) into clean architecture layers: - compliance/schemas/notfallplan.py (146 LOC): all Pydantic models - compliance/services/notfallplan_service.py (500 LOC): contacts, scenarios, checklists, exercises, stats - compliance/services/notfallplan_workflow_service.py (309 LOC): incidents, templates - compliance/api/notfallplan_routes.py (361 LOC): thin handlers with domain error translation All 250 tests pass. Schemas re-exported via __all__ for legacy test imports. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 20:10:43 +02:00
Sharang Parnerkar	d35b0bc78c	chore: mypy fixes for routes.py + legal_document_service + control_export_service - Add [mypy-compliance.api.routes] to mypy.ini strict scope - Fix bare `dict` type annotation in routes.py update_requirement handler - Fix Column[str] return type in control_export_service.download_file - Fix unused type:ignore in legal_document_service.upload_word - Add union-attr ignore for optional requirement null access in routes.py mypy compliance/ -> Success on 149 source files 173/173 pytest pass Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 20:04:16 +02:00
Sharang Parnerkar	ae008d7d25	refactor(backend/api): extract DSFA schemas + services (Step 4 — file 14 of 18) - Create compliance/schemas/dsfa.py (161 LOC) — extract DSFACreate, DSFAUpdate, DSFAStatusUpdate, DSFASectionUpdate, DSFAApproveRequest - Create compliance/services/dsfa_service.py (386 LOC) — CRUD + helpers + stats + audit-log + CSV export; uses domain errors - Create compliance/services/dsfa_workflow_service.py (347 LOC) — status update, section update, submit-for-review, approve, export JSON, versions - Rewrite compliance/api/dsfa_routes.py (339 LOC) as thin handlers with Depends + translate_domain_errors(); re-export legacy symbols via __all__ - Add [mypy-compliance.api.dsfa_routes] ignore_errors = False to mypy.ini - Update tests: 422 -> 400 for domain ValidationError (6 assertions) - Regenerate OpenAPI baseline (360 paths / 484 operations — unchanged) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 19:20:48 +02:00
Sharang Parnerkar	6658776610	refactor(backend/api): extract compliance routes services (Step 4 — file 13 of 18) Split routes.py (991 LOC) into thin handlers + two service files: - RegulationRequirementService: regulations CRUD, requirements CRUD - ControlExportService: controls CRUD/review/domain, export, admin seeding All 216 tests pass. Route module re-exports repository classes so existing test patches (compliance.api.routes.*Repository) keep working. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 19:12:22 +02:00
Sharang Parnerkar	d2c94619d8	refactor(backend/api): extract LegalDocumentConsentService (Step 4 — file 12 of 18) Extract consent, audit log, cookie category, and consent stats endpoints from legal_document_routes into LegalDocumentConsentService. The route file is now a thin handler layer delegating to LegalDocumentService and LegalDocumentConsentService with translate_domain_errors(). Legacy helpers (_doc_to_response, _version_to_response, _transition, _log_approval) and schemas are re-exported for existing tests. Two transition tests updated to expect domain errors instead of HTTPException. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 08:47:56 +02:00
Sharang Parnerkar	cc1c61947d	refactor(backend/api): extract Incident services (Step 4 — file 11 of 18) compliance/api/incident_routes.py (916 LOC) -> 280 LOC thin routes + two services + 95-line schemas file. Two-service split for DSGVO Art. 33/34 Datenpannen-Management: incident_service.py (460 LOC): - CRUD (create, list, get, update, delete) - Stats, status update, timeline append, close - Module-level helpers: _calculate_risk_level, _is_notification_required, _calculate_72h_deadline, _incident_to_response, _measure_to_response, _parse_jsonb, _append_timeline, DEFAULT_TENANT_ID incident_workflow_service.py (329 LOC): - Risk assessment (likelihood x impact -> risk_level) - Art. 33 authority notification (with 72h deadline tracking) - Art. 34 data subject notification - Corrective measures CRUD Both services use raw SQL via sqlalchemy.text() — no ORM models for incident_incidents / incident_measures tables. Migrated from the Go ai-compliance-sdk; Python backend is Source of Truth. Legacy test compat: tests/test_incident_routes.py imports _calculate_risk_level, _is_notification_required, _calculate_72h_deadline, _incident_to_response, _measure_to_response, _parse_jsonb, DEFAULT_TENANT_ID directly from compliance.api.incident_routes — all re-exported via __all__. Verified: - 223/223 pytest pass (173 core + 50 incident) - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 141 source files - incident_routes.py 916 -> 280 LOC - Hard-cap violations: 8 -> 7 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 08:35:57 +02:00
Sharang Parnerkar	0c2e03f294	refactor(backend/api): extract Email Template services (Step 4 — file 10 of 18) compliance/api/email_template_routes.py (823 LOC) -> 295 LOC thin routes + 402-line EmailTemplateService + 241-line EmailTemplateVersionService + 61-line schemas file. Two-service split along natural responsibility seam: email_template_service.py (402 LOC): - Template type catalog (TEMPLATE_TYPES constant) - Template CRUD (list, create, get) - Stats, settings, send logs, initialization, default content - Shared _template_to_dict / _version_to_dict / _render_template helpers email_template_version_service.py (241 LOC): - Version CRUD (create, list, get, update) - Workflow transitions (submit, approve, reject, publish) - Preview and test-send TEMPLATE_TYPES, VALID_CATEGORIES, VALID_STATUSES re-exported from the route module for any legacy consumers. State-transition errors use ValidationError (-> HTTPException 400) to preserve the original handler's 400 status for "Only draft/review versions can be ..." checks, since the existing TestClient integration tests (47 tests) assert status_code == 400. Verified: - 47/47 tests/test_email_template_routes.py pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 138 source files - email_template_routes.py 823 -> 295 LOC - Hard-cap violations: 9 -> 8 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 22:39:19 +02:00
Sharang Parnerkar	a638d0e527	refactor(backend/api): extract EvidenceService (Step 4 — file 9 of 18) compliance/api/evidence_routes.py (641 LOC) -> 240 LOC thin routes + 460-line EvidenceService. Manages evidence CRUD, file upload, CI/CD evidence collection (SAST/dependency/SBOM/container scans), and CI status dashboard. Service injection pattern: EvidenceService takes the EvidenceRepository, ControlRepository, and AutoRiskUpdater classes as constructor parameters. The route's get_evidence_service factory reads these class references from its own module namespace so tests that ``patch("compliance.api.evidence_routes.EvidenceRepository", ...)`` still take effect through the factory. The `_store_evidence` and `_update_risks` helpers stay as module-level callables in evidence_service and are re-exported from the route module. The collect_ci_evidence handler remains inline (not delegated to a service method) so tests can patch `compliance.api.evidence_routes._store_evidence` and have the patch take effect at the handler's call site. Legacy re-exports via __all__: SOURCE_CONTROL_MAP, EvidenceRepository, ControlRepository, AutoRiskUpdater, _parse_ci_evidence, _extract_findings_detail, _store_evidence, _update_risks. Verified: - 208/208 pytest (core + 35 evidence tests) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 135 source files - evidence_routes.py 641 -> 240 LOC - Hard-cap violations: 10 -> 9 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 21:59:03 +02:00
Sharang Parnerkar	e613af1a7d	refactor(backend/api): extract ScreeningService (Step 4 — file 8 of 18) compliance/api/screening_routes.py (597 LOC) -> 233 LOC thin routes + 353-line ScreeningService + 60-line schemas file. Manages SBOM generation (CycloneDX 1.5) and OSV.dev vulnerability scanning. Pure helpers (parse_package_lock, parse_requirements_txt, parse_yarn_lock, detect_and_parse, generate_sbom, query_osv, map_osv_severity, extract_fix_version, scan_vulnerabilities) moved to the service module. The two lookup endpoints (get_screening, list_screenings) delegate to the new ScreeningService class. Test-mock compatibility: tests/test_screening_routes.py uses `patch("compliance.api.screening_routes.SessionLocal", ...)` and `patch("compliance.api.screening_routes.scan_vulnerabilities", ...)`. Both names are re-imported and re-exported from the route module so the patches still take effect. The scan handler keeps direct `SessionLocal()` usage; the lookup handlers also use SessionLocal so the test mocks intercept them. Latent bug fixed: the original scan handler had text = content.decode("utf-8") on line 339, shadowing the imported `sqlalchemy.text` so that the subsequent `text("INSERT ...")` calls would have raised at runtime. The variable is now named `file_text`. Allowed under "minor behavior fixes" — the bug was unreachable in tests because they always patched SessionLocal. Verified: - 240/240 pytest pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 134 source files - screening_routes.py 597 -> 233 LOC - Hard-cap violations: 11 -> 10 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 20:03:16 +02:00
Sharang Parnerkar	7107a31496	refactor(backend/api): extract SourcePolicyService (Step 4 — file 7 of 18) compliance/api/source_policy_router.py (580 LOC) -> 253 LOC thin routes + 453-line SourcePolicyService + 83-line schemas file. Manages allowed data sources, operations matrix, PII rules, blocked-content log, audit trail, and dashboard stats/report. Single-service split. ORM-based (uses compliance.db.source_policy_models). Date-string parsing extracted to a module-level _parse_iso_optional helper so the audit + blocked-content list endpoints share it instead of duplicating try/except blocks. Legacy test compat: SourceCreate, SourceUpdate, SourceResponse, PIIRuleCreate, PIIRuleUpdate, OperationUpdate, _log_audit re-exported from compliance.api.source_policy_router via __all__. Verified: - 208/208 pytest pass (173 core + 35 source policy) - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 132 source files - source_policy_router.py 580 -> 253 LOC - Hard-cap violations: 12 -> 11 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:58:02 +02:00
Sharang Parnerkar	b850368ec9	refactor(backend/api): extract CanonicalControlService (Step 4 — file 6 of 18) compliance/api/canonical_control_routes.py (514 LOC) -> 192 LOC thin routes + 316-line CanonicalControlService + 105-line schemas file. Canonical Control Library manages OWASP/NIST/ENISA-anchored security control frameworks and controls. Like company_profile_routes, this file uses raw SQL via sqlalchemy.text() because there are no SQLAlchemy models for canonical_control_frameworks or canonical_controls. Single-service split. Session management moved from bespoke `with SessionLocal() as db:` blocks to Depends(get_db) for consistency. Legacy test imports preserved via re-export (FrameworkResponse, ControlResponse, SimilarityCheckRequest, SimilarityCheckResponse, _control_row). Validation extracted to a module-level `_validate_control_input` helper so both create and update share the same checks. ValidationError (from compliance.domain) replaces raw HTTPException(400) raises. Verified: - 187/187 pytest (173 core + 14 canonical) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 130 source files - canonical_control_routes.py 514 -> 192 LOC - Hard-cap violations: 13 -> 12 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:53:55 +02:00
Sharang Parnerkar	4fa0dd6f6d	refactor(backend/api): extract VVTService (Step 4 — file 5 of 18) compliance/api/vvt_routes.py (550 LOC) -> 225 LOC thin routes + 475-line VVTService. Covers the organization header, processing activities CRUD, audit log, JSON/CSV export, stats, and version lookups for the Art. 30 DSGVO Verzeichnis. Single-service split: organization + activities + audit + stats all revolve around the same tenant's VVT document, and the existing test suite (tests/test_vvt_routes.py — 768 LOC, tests/test_vvt_tenant_isolation.py — 205 LOC) exercises them together. Module-level helpers (_activity_to_response, _log_audit, _export_csv) stay module-level in compliance.services.vvt_service and are re-exported from compliance.api.vvt_routes so the two test files keep importing from the old path. Pydantic schemas already live in compliance.schemas.vvt from Step 3 — no new schema file needed this round. mypy.ini flips compliance.api.vvt_routes from ignore_errors=True to False. Two SQLAlchemy Column[str] vs str dict-index errors fixed with explicit str() casts on status/business_function in the stats loop. Verified: - 242/242 pytest (173 core + 69 VVT integration) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 128 source files - vvt_routes.py 550 -> 225 LOC - vvt_service.py 475 LOC (under 500 hard cap) - Hard-cap violations: 14 -> 13 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:50:40 +02:00
Sharang Parnerkar	f39c7ca40c	refactor(backend/api): extract CompanyProfileService (Step 4 — file 4 of 18) compliance/api/company_profile_routes.py (640 LOC) -> 154 LOC thin routes. Unusual for this repo: persistence uses raw SQL via sqlalchemy.text() because the underlying compliance_company_profiles table has ~45 columns with complex jsonb coercion and there is no SQLAlchemy model for it. New files: compliance/schemas/company_profile.py (127) — 4 request/response models compliance/services/company_profile_service.py (340) — Service class + row_to_response + log_audit compliance/services/_company_profile_sql.py (139) — 70-line INSERT/UPDATE statements separated for readability Minor behavioral improvement: the handlers now use Depends(get_db) for session management instead of the bespoke `db = SessionLocal(); try: ... finally: db.close()` pattern. This makes the routes consistent with every other refactored service, fixes the broken-ness under test dependency_overrides, and removes 6 duplicate try/finally blocks. Legacy exports preserved: CompanyProfileRequest, CompanyProfileResponse, AuditEntryResponse, AuditListResponse, row_to_response, and log_audit are re-exported from compliance.api.company_profile_routes so that the two existing test files (tests/test_company_profile_routes.py, tests/test_company_profile_extend.py) keep importing from the same path. Pre-existing broken tests noted: 6 tests in those files feed a 40-tuple row into row_to_response, but _BASE_COLUMNS_LIST has 46 columns (has had since the Phase 2 Stammdaten extension). These tests fail on main too (verified via `git stash` round-trip). Not fixed in this commit — they require a rewrite of the test's _make_row helper, which is out of scope for a pure structural refactor. Flagged for follow-up. 
Verified: - 173/173 pytest compliance/tests/ tests/contracts/ pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 127 source files - company_profile_routes.py 640 -> 154 LOC - All new files under soft 300 target except service (340, under hard 500) - Hard-cap violations: 15 -> 14 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:47:29 +02:00
Sharang Parnerkar	d571412657	refactor(backend/api): extract TOMService (Step 4 — file 3 of 18) compliance/api/tom_routes.py (609 LOC) -> 215 LOC thin routes + 434-line TOMService. Request bodies (TOMStateBody, TOMMeasureCreate, TOMMeasureUpdate, TOMMeasureBulkItem, TOMMeasureBulkBody) moved to compliance/schemas/tom.py (joining the existing response models from the Step 3 split). Single-service split (not two like banner): state, measures CRUD + bulk upsert, stats, export, and version lookups are all tightly coupled around the TOMMeasureDB aggregate, so splitting would create artificial boundaries. TOMService is 434 LOC — comfortably under the 500 hard cap. Domain error mapping: - ConflictError -> 409 (version conflict on state save; duplicate control_id on create) - NotFoundError -> 404 (missing measure on update; missing version) - ValidationError -> 400 (missing tenant_id on DELETE /state) Legacy test compat: the existing tests/test_tom_routes.py imports TOMMeasureBulkItem, _parse_dt, _measure_to_dict, and DEFAULT_TENANT_ID directly from compliance.api.tom_routes. All re-exported via __all__ so the 44-test file runs unchanged. mypy.ini flips compliance.api.tom_routes from ignore_errors=True to False. TOMService carries the scoped Column[T] header. Verified: - 217/217 pytest (173 baseline + 44 TOM) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 124 source files - tom_routes.py 609 -> 215 LOC - Hard-cap violations: 16 -> 15 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:42:17 +02:00
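Note on d571412657: the domain error mapping listed in that commit (ConflictError -> 409, NotFoundError -> 404, ValidationError -> 400) can be sketched roughly as below. This is a minimal illustration, not the repository's code: the DomainError base class, the STATUS_BY_ERROR table, the status_for helper, and the 500 fallback are all assumptions made for the example. Only the three error names and their status codes come from the commit message.

```python
# Hypothetical sketch of a domain-error -> HTTP status mapping.
# Only the three error names and their codes come from the commit message;
# the base class, lookup table, helper, and 500 fallback are assumptions.


class DomainError(Exception):
    """Assumed common base class for service-layer errors."""


class ConflictError(DomainError):
    """E.g. version conflict on state save, duplicate control_id on create."""


class NotFoundError(DomainError):
    """E.g. missing measure on update, missing version."""


class ValidationError(DomainError):
    """E.g. missing tenant_id on DELETE /state."""


STATUS_BY_ERROR = {
    ConflictError: 409,
    NotFoundError: 404,
    ValidationError: 400,
}


def status_for(exc: Exception) -> int:
    """Return the HTTP status for a raised error, defaulting to 500."""
    for error_cls, status in STATUS_BY_ERROR.items():
        if isinstance(exc, error_cls):
            return status
    return 500  # assumed fallback for anything that is not a mapped error


conflict_status = status_for(ConflictError("duplicate control_id"))
missing_status = status_for(NotFoundError("no such measure"))
invalid_status = status_for(ValidationError("tenant_id required"))
unknown_status = status_for(RuntimeError("unexpected"))
```

With a mapping like this, the service raises plain domain errors and the thin route layer (or a shared exception handler) decides the status code, which is presumably what lets the services stay free of HTTPException.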


99 Commits
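Note on e613af1a7d: the latent bug described in that commit, a local variable named `text` shadowing the imported `sqlalchemy.text`, can be reproduced in isolation. The sketch below is an assumption-laden illustration rather than the original handler: `text` here is a stand-in function so the example runs without SQLAlchemy, and `scan_buggy`, `scan_fixed`, and the SQL string are hypothetical. Only the `text = content.decode("utf-8")` assignment and the `file_text` rename come from the commit message.

```python
# Stand-in for sqlalchemy.text so the sketch runs without SQLAlchemy installed.
# The real function returns a SQL clause object; this one just tags the string.
def text(sql: str) -> tuple:
    return ("clause", sql)


def scan_buggy(content: bytes):
    # Assigning to `text` makes it a local name for the whole function,
    # hiding the module-level text() helper.
    text = content.decode("utf-8")
    # `text` is now a str, so calling it raises TypeError.
    return text("INSERT INTO scans (body) VALUES (:body)")


def scan_fixed(content: bytes):
    # Renaming the local (as the commit does) leaves text() reachable.
    file_text = content.decode("utf-8")
    return text("INSERT INTO scans (body) VALUES (:body)"), file_text


try:
    scan_buggy(b"{}")
    buggy_error = None
except TypeError as exc:
    buggy_error = str(exc)

clause, body = scan_fixed(b"{}")
```

The buggy variant only fails once execution reaches the `text(...)` call, which fits the commit's remark that the bug was unreachable in tests that always patched SessionLocal.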