feat/dsms-stufe3-version-chains
101 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
8cbb513e2c |
feat(audit): Phase 1 Quick-Wins (P81 + P85 + P70 + P83) + TCF DELETE/INSERT-Fix
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 38s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / test-go (push) Has been skipped
P81 — tests/fixtures/golden_truth/vw_de.json: GT-Fixture mit must_find_cookies (47 VW-Cookies) + expected_vendors (Google, Adobe, Trade Desk, ...). Basis fuer kuenftige Regression-Tests. P85 — banner_screenshot_block.py + consent_scanner.py + main.py: consent-tester macht beim Banner-Detect einen base64-PNG-Screenshot (< 1.5MB). Backend rendert ihn als <img src="data:..."> direkt nach dem GF-1-Pager. Visueller Beweis 'so sah das Banner aus' fuer Dispute mit Marketing/DSB. P70 — rag_provenance.py: classify_finding_provenance() klassifiziert ein Finding als 'rag' (Norm + Quelle), 'mixed' (Norm ohne Quelle) oder 'heuristic' (eigene Interpretation). provenance_badge_html() rendert kleine Badges (✓ RAG / NORM / ⚠ HEURISTIK). Modul ist generisch, kann bei jedem Finding-Renderer einklinkt werden. P83 — scripts/check-rebuild-needed.sh: Prueft ob die im Container deployten BUILD_SHA mit local HEAD uebereinstimmen. Bei Mismatch exit 1 mit 'REBUILD REQUIRED'-Hinweis. Verhindert das 'alter Code im Container'-Problem das uns mehrfach erwischt hat (Frontend-Tabs sichtbar, Backend ohne neuen Service). TCF-Fix — tcf_vendor_authority.py: cookie_library hat keinen UNIQUE-Index auf cookie_name → ON CONFLICT war unmoeglich. Loesung: vor Insert DELETE WHERE source_name='iab_tcf_v2'. Idempotent. + per-Vendor-Commit damit ein Fail die naechsten nicht blockt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
081e4f057a |
feat(audit): Cookie-Compliance-Audit (3-Quellen-Vergleich) + Vendor-Dedup + Block-Parser
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 55s
CI / iace-gt-coverage (push) Successful in 25s
CI / test-python-backend (push) Successful in 44s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m43s
ZENTRALER USP: cookie_compliance_audit.py vergleicht 3 Quellen * DEKLARIERT in Cookie-Richtlinie (parse_cookie_table + parse_flat) * TATSAECHLICH im Browser geladen (banner_result.phases.after_accept) * LIBRARY-Metadaten (cookie_library lookup) Liefert 3 Listen mit Compliance-Verdict: * compliant (deklariert UND geladen) — gruener Block * undeclared_in_browser (geladen NICHT deklariert) — ROTER HIGH-Block → Art. 13(1)(c) DSGVO + § 25 TDDDG Verstoss * declared_not_loaded (deklariert NICHT geladen) — gelber Hinweis → Tabelle moeglicherweise veraltet parse_cookie_table erweitert um Block-Format (5 Zeilen pro Cookie wie beim User-Copy aus VW). Findet 35+ Cookies aus Copy-Paste statt 0. vendor_normalizer.py: 50+ Aliases (Google-Familie, Adobe-Familie, Trade Desk, AdForm, ...) + Garbage-Filter (URLs, leere Strings, 'click to select', 'Mehrere OEMs'). Mergt cookies-Listen beim Dedup. _guess_vendor erweitert: Adobe-Familie (s_ecid/AMCV/demdex/mbox/...), Trade Desk (TDID/TDCPM/TTDOptOut), AdForm (uid/cid/otsid), Salesforce LiveAgent, etracker, Akamai, EDAA. audit_quality_checks: vendor-thin-Threshold jetzt dynamisch nach Cookie-Doc-Wörter (3k→10 / 6k→20 / 10k→30 / 15k+→40). VW-Test-Fixture: tests/fixtures/cookie_gt/vw_cookie_richtlinie.txt (36-Cookie-Sample fuer Regression-Tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7335f64f4f |
feat(founding-wizard): Per-Person IP-Assignment + Prefill + E2E-Tests
CI / loc-budget (push) Failing after 20s
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 19s
CI / nodejs-build (push) Successful in 3m17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Wizard unterstuetzt jetzt 2-4 Gesellschafter mit individuellem IP-Bereich: - Pro Gruender ein IP-Assignment-Vertrag (z.B. Benjamin: Compliance+RAG; Sharang: Security+Infrastruktur). Pro GF ein eigener Dienstvertrag. - Step 1: Prefill-Button aus Unternehmensprofil + Felder Registergericht und HRB-Nr. - Step 2: Rollen-Dropdown (CEO/CTO/CFO/COO/CPO/GF/Sonstige) statt freie Texteingabe, IP-Bereiche-Textarea pro Person. Backend: - generate_documents() iteriert pro Person fuer PER_PERSON_DOCS. - _build_person_context() injiziert ASSIGNOR_*, GF_*, IP_LIST_DETAILS aus person.ip_areas. - base_context() propagiert basics.register_court und basics.hrb_number. Tests: - 30/30 Pytest gruen (6 neue: Per-Person-Context, Slug-Helper, Registergericht-Propagation). - 4 neue Playwright-E2E-Specs (hermetisch via route.fulfill, mit Console-/Page-Error-Traps): kompletter 8-Step-Flow, Prefill-Fehlerpfad, Step-Navigation/Reset, Rollen-Dropdown + IP-Areas. - Spec setzt 'bp-sdk-cookie-consent' im addInitScript damit der CookieBannerOverlay nicht die Wizard-Buttons ueberlagert. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
badb356740 |
fix(founding-wizard): nested IF-Bloecke korrekt aufloesen (innermost-first)
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / detect-changes (push) Successful in 10s
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 13s
CI / loc-budget (push) Successful in 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
|
||
|
|
7a5f1e48dd |
feat(founding-wizard): Gründungs-Wizard für 2-Mann GmbH + 14 Notar-Templates
[migration-approved]
Templates (Migrations 123-136):
- 123 GO-GF (Geschäftsordnung Geschäftsführung)
- 124 SHA (Shareholders' Agreement, 56 Platzhalter)
- 125 Satzung (Articles of Association mit UG-Variante)
- 126 GF-Dienstvertrag (Trennungsprinzip Organ/Anstellung)
- 127 Arbeitsvertrag (AGG-neutral, NachwG, eAU)
- 128 Gesellschafterliste (§ 40 GmbHG)
- 129 GF-Bestellungsbeschluss (mit § 6 Abs. 2 Versicherung)
- 130 HRB-Anmeldung (§§ 7, 8, 39 GmbHG, § 12 HGB)
- 131 IP-Assignment Agreement (Gründer→GmbH)
- 132 Term Sheet (Pre-Seed/Seed VC-Standard)
- 133 Wandeldarlehensvertrag (Convertible Loan)
- 134 Beteiligungsvertrag (Subscription Agreement)
- 135 ESOP/VSOP-Plan (3 Varianten)
- 136 Cap Table
Kategorisierung (Migrations 137-138):
- ALTER TABLE compliance_legal_templates ADD lifecycle_stage TEXT[],
functional_category TEXT (mit CHECK Constraints + GIN-Index)
- Backfill aller 105 Templates: lifecycle_stage (pre_founding|founding|
startup|kmu|konzern) + functional_category (founding_legal|employment|
investor_funding|...)
Backend Founding-Wizard Service:
- template_renderer.py: Handlebars-light ({{VAR}}, {{#IF FLAG}}...{{/IF}})
- wizard_to_context.py: Mapping Wizard-State → SCREAMING_SNAKE_CASE Vars
- markdown_to_docx.py: Markdown → DOCX via python-docx
- founding_wizard_routes.py: POST /v1/founding-wizard/generate
→ liefert base64-DOCX-Files für ausgewählte Templates
Frontend Founding-Wizard (/sdk/founding-wizard):
- 8-Step Wizard (Basics, Gesellschafter, GF, Kapital, Notar, SHA, GF-Verträge, Generate)
- useFoundingWizardForm Hook mit localStorage-Persistenz
- TypeScript Code-Registry (template-categories.ts) als Backup zur DB
- Word-Download via data:URLs (base64)
Tests:
- 20 Unit-Tests grün (Renderer, Context-Mapping, DOCX-Conversion)
- Playwright E2E-Test mit 2-Mann GmbH (Benjamin + Sharang) Test-Daten
|
||
|
|
6c223c7c9b |
feat(compliance-check): exec-summary + voll-audit + TDM-respect + cookie-KB-extended + saving-scan-funnel
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m43s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P1 — Exec-Summary oben im Email-Report (4 KPIs + 2 CTAs, dunkler Gradient)
P3 — no_direct_sales-Flag fuer OEM-Konfigurator-Sites; AGB/Widerruf/AGB als
"NICHT ANWENDBAR" (grau) statt "NICHT GEFUNDEN" (rot)
P5 — Voll-Audit Unification: alle Findings (MC + Pflichtangaben + Vendor +
Redundanz) in /data/compliance_audits.db.unified_findings; neuer
/api/compliance/agent/findings/<id> Endpoint + FindingsTab im Audit-UI
mit Filter + CSV-Export
P7 — Crawl-Hardening: TDM-Reservation-Check (robots.txt / ai.txt / Header /
Meta) vor jedem Run mit 24h-Cache; HeadlessChrome-UA (Firma noch nicht
gegruendet — Switch via BREAKPILOT_BRANDED_UA env); per-Domain
Rate-Limit 1 req/s + max 2 concurrent
P2 — Cookie-Knowledge-DB additiv erweitert (35 -> 74 Cookies): Adobe, Meta,
Microsoft, LinkedIn, TikTok, HubSpot, Marketo, Salesforce, Hotjar,
FullStory, Mouseflow, Intercom, Drift, Zendesk, Cloudflare, Stripe,
OneTrust/Cookiebot/Usercentrics, Matomo, Pinterest, Snapchat, X/Twitter,
YouTube, Vimeo, Klaviyo, Mailchimp, Mixpanel, Segment, Amplitude,
Optimizely, Datadog; Wire-in in cookie_function_classifier liefert
compliance_risk-Label (kritisch/hoch/mittel/gering) pro Vendor
A — k-Anonymitaets-Helper (benchmark_k_anonymity) fuer P6-Vorbereitung
B — Cross-Tenant-Domain-Assertion im /findings-Endpoint (expected_domain
Query-Param -> 403 bei Mismatch)
C — Saving-Scan-Funnel: /api/compliance/agent/saving-scan/start mit
Validierung + 24h-Rate-Limit pro Domain + Lead-Persistenz in
saving_scan_leads + Auto-Discovery via _run_compliance_check; 6 Tests
D — Risk-Badge im Email-Vendor-Row
Rechtliche Leitplanken (Memory feedback_oem_data_legal.md): nur eigene
Knapp-Bewertungen + Source-Pointer, keine 1:1-Kopien fremder CMP-Texte.
TDM-Opt-Out-Respect nach § 44b UrhG. KEINE Schema-Aenderungen — alles in
Sidecar-SQLite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
df7d83134b |
feat(agent): migrate compliance-check results to banner + documents (M1-M5)
After a compliance-check run finishes, the user can now apply the
extracted vendor inventory directly to their own:
- CookieBanner config (admin /sdk/einwilligungen)
- Cookie-Policy / VVT-Register / Privacy-Policy templates
(admin /sdk/document-generator)
Backend:
- migration_to_banner.py: vendor list -> CookieBannerConfig with
ESSENTIAL/PERFORMANCE/PERSONALIZATION/EXTERNAL_MEDIA buckets +
review flags (broken opt-out URLs, missing expiry, no cookies listed)
- migration_to_document.py: vendor list -> pre-fills for 3 doc
templates, recipient-type aware (INTERNAL/GROUP/PROCESSOR/CONTROLLER)
- agent_migration_routes.py: GET /banner-preview, /document-preview,
/summary keyed on check_id
- compliance_audit_log: new check_payloads table persists cmp_vendors +
extracted_profile so the preview survives an app restart
- tests: 9 mapper units + 4 endpoint integration tests
Frontend:
- MigrationPanel.tsx: modal showing banner-config diff + document
pre-fills, plus links into the existing editors
- ComplianceCheckTab.tsx: replaces standalone audit link with the
panel; net -3 lines, stays at the 500-cap
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
17c67b4f25 |
feat: Cookie-Banner ↔ Backend Integration (DSR, Retention, Consent Proof)
Phase 1: Vendor sync from service registry (82+ services → banner vendors) Phase 2: Category-based retention (marketing=90d, statistics=790d, not hardcoded 365d) Phase 3: DSR ↔ Banner email linking (link-email, by-email, Art.17 erasure, Art.15/20 export) Phase 4: Consent sync (Banner → Einwilligungen bridge) Phase 6: Consent proof (SHA256 config hash + config_version in audit log, Art. 7(1) DSGVO) New files: - banner_dsr_service.py — email linking + DSR integration - vendor_banner_sync.py — service registry → vendor configs - migration 106 — linked_email, banner_config_hash, consent_version columns Tests: 20+ new backend tests + 2 Playwright E2E test suites (API + UI) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
0f3ba9c207 |
test: Lit-Mapping validation — Dict vs Control Library comparison
8 test cases with deliberately wrong legal basis assignments: - Cookie tracking on lit. f (should be lit. a) - Analytics on lit. b (should be lit. a) - Newsletter on lit. f (should be lit. a) - Klarna without Art. 22 - Session recording on lit. f - 2 correct cases (should NOT trigger findings) Runs both hardcoded dict AND Control Library query, compares results. If Control Library passes all → dict can be removed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
0c0dd4e3a6 |
feat: ZeroClaw compliance agent — document analysis + role assignment + email
Add autonomous compliance agent that fetches web documents (cookie banners, privacy policies), classifies them via Qwen/Ollama, assesses DSGVO compliance, assigns to the responsible role, and sends notification emails. Components: - ZeroClaw SOP (6-step workflow: fetch, classify, assess, summarize, assign, notify) - Backend: /api/compliance/agent/analyze (combined endpoint) - Backend: /api/compliance/agent/notify (standalone email) - Frontend: /sdk/agent page (Manager UI with URL input + results) - Helper scripts + E2E test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
c43d9da6d0 |
merge: sync with origin/main, take upstream on conflicts
# Conflicts: # admin-compliance/lib/sdk/types.ts # admin-compliance/lib/sdk/vendor-compliance/types.ts |
||
|
|
7344e5806e |
refactor(backend/isms): split isms_assessment_service.py to stay under 500 LOC
The previous commit (
|
||
|
|
ae008d7d25 |
refactor(backend/api): extract DSFA schemas + services (Step 4 — file 14 of 18)
- Create compliance/schemas/dsfa.py (161 LOC) — extract DSFACreate, DSFAUpdate, DSFAStatusUpdate, DSFASectionUpdate, DSFAApproveRequest - Create compliance/services/dsfa_service.py (386 LOC) — CRUD + helpers + stats + audit-log + CSV export; uses domain errors - Create compliance/services/dsfa_workflow_service.py (347 LOC) — status update, section update, submit-for-review, approve, export JSON, versions - Rewrite compliance/api/dsfa_routes.py (339 LOC) as thin handlers with Depends + translate_domain_errors(); re-export legacy symbols via __all__ - Add [mypy-compliance.api.dsfa_routes] ignore_errors = False to mypy.ini - Update tests: 422 -> 400 for domain ValidationError (6 assertions) - Regenerate OpenAPI baseline (360 paths / 484 operations — unchanged) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
d2c94619d8 |
refactor(backend/api): extract LegalDocumentConsentService (Step 4 — file 12 of 18)
Extract consent, audit log, cookie category, and consent stats endpoints from legal_document_routes into LegalDocumentConsentService. The route file is now a thin handler layer delegating to LegalDocumentService and LegalDocumentConsentService with translate_domain_errors(). Legacy helpers (_doc_to_response, _version_to_response, _transition, _log_approval) and schemas are re-exported for existing tests. Two transition tests updated to expect domain errors instead of HTTPException. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
cc1c61947d |
refactor(backend/api): extract Incident services (Step 4 — file 11 of 18)
compliance/api/incident_routes.py (916 LOC) -> 280 LOC thin routes +
two services + 95-line schemas file.
Two-service split for DSGVO Art. 33/34 Datenpannen-Management:
incident_service.py (460 LOC):
- CRUD (create, list, get, update, delete)
- Stats, status update, timeline append, close
- Module-level helpers: _calculate_risk_level, _is_notification_required,
_calculate_72h_deadline, _incident_to_response, _measure_to_response,
_parse_jsonb, _append_timeline, DEFAULT_TENANT_ID
incident_workflow_service.py (329 LOC):
- Risk assessment (likelihood x impact -> risk_level)
- Art. 33 authority notification (with 72h deadline tracking)
- Art. 34 data subject notification
- Corrective measures CRUD
Both services use raw SQL via sqlalchemy.text() — no ORM models for
incident_incidents / incident_measures tables. Migrated from the Go
ai-compliance-sdk; Python backend is Source of Truth.
Legacy test compat: tests/test_incident_routes.py imports
_calculate_risk_level, _is_notification_required, _calculate_72h_deadline,
_incident_to_response, _measure_to_response, _parse_jsonb,
DEFAULT_TENANT_ID directly from compliance.api.incident_routes — all
re-exported via __all__.
Verified:
- 223/223 pytest pass (173 core + 50 incident)
- OpenAPI 360/484 unchanged
- mypy compliance/ -> Success on 141 source files
- incident_routes.py 916 -> 280 LOC
- Hard-cap violations: 8 -> 7
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
0c2e03f294 |
refactor(backend/api): extract Email Template services (Step 4 — file 10 of 18)
compliance/api/email_template_routes.py (823 LOC) -> 295 LOC thin routes
+ 402-line EmailTemplateService + 241-line EmailTemplateVersionService +
61-line schemas file.
Two-service split along natural responsibility seam:
email_template_service.py (402 LOC):
- Template type catalog (TEMPLATE_TYPES constant)
- Template CRUD (list, create, get)
- Stats, settings, send logs, initialization, default content
- Shared _template_to_dict / _version_to_dict / _render_template helpers
email_template_version_service.py (241 LOC):
- Version CRUD (create, list, get, update)
- Workflow transitions (submit, approve, reject, publish)
- Preview and test-send
TEMPLATE_TYPES, VALID_CATEGORIES, VALID_STATUSES re-exported from the
route module for any legacy consumers.
State-transition errors use ValidationError (-> HTTPException 400) to
preserve the original handler's 400 status for "Only draft/review
versions can be ..." checks, since the existing TestClient integration
tests (47 tests) assert status_code == 400.
Verified:
- 47/47 tests/test_email_template_routes.py pass
- OpenAPI 360/484 unchanged
- mypy compliance/ -> Success on 138 source files
- email_template_routes.py 823 -> 295 LOC
- Hard-cap violations: 9 -> 8
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
a638d0e527 |
refactor(backend/api): extract EvidenceService (Step 4 — file 9 of 18)
compliance/api/evidence_routes.py (641 LOC) -> 240 LOC thin routes + 460-line
EvidenceService. Manages evidence CRUD, file upload, CI/CD evidence
collection (SAST/dependency/SBOM/container scans), and CI status dashboard.
Service injection pattern: EvidenceService takes the EvidenceRepository,
ControlRepository, and AutoRiskUpdater classes as constructor parameters.
The route's get_evidence_service factory reads these class references from
its own module namespace so tests that
``patch("compliance.api.evidence_routes.EvidenceRepository", ...)`` still
take effect through the factory.
The `_store_evidence` and `_update_risks` helpers stay as module-level
callables in evidence_service and are re-exported from the route module.
The collect_ci_evidence handler remains inline (not delegated to a service
method) so tests can patch
`compliance.api.evidence_routes._store_evidence` and have the patch take
effect at the handler's call site.
Legacy re-exports via __all__: SOURCE_CONTROL_MAP, EvidenceRepository,
ControlRepository, AutoRiskUpdater, _parse_ci_evidence,
_extract_findings_detail, _store_evidence, _update_risks.
Verified:
- 208/208 pytest (core + 35 evidence tests) pass
- OpenAPI 360/484 unchanged
- mypy compliance/ -> Success on 135 source files
- evidence_routes.py 641 -> 240 LOC
- Hard-cap violations: 10 -> 9
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
7107a31496 |
refactor(backend/api): extract SourcePolicyService (Step 4 — file 7 of 18)
compliance/api/source_policy_router.py (580 LOC) -> 253 LOC thin routes + 453-line SourcePolicyService + 83-line schemas file. Manages allowed data sources, operations matrix, PII rules, blocked-content log, audit trail, and dashboard stats/report. Single-service split. ORM-based (uses compliance.db.source_policy_models). Date-string parsing extracted to a module-level _parse_iso_optional helper so the audit + blocked-content list endpoints share it instead of duplicating try/except blocks. Legacy test compat: SourceCreate, SourceUpdate, SourceResponse, PIIRuleCreate, PIIRuleUpdate, OperationUpdate, _log_audit re-exported from compliance.api.source_policy_router via __all__. Verified: - 208/208 pytest pass (173 core + 35 source policy) - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 132 source files - source_policy_router.py 580 -> 253 LOC - Hard-cap violations: 12 -> 11 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
b850368ec9 |
refactor(backend/api): extract CanonicalControlService (Step 4 — file 6 of 18)
compliance/api/canonical_control_routes.py (514 LOC) -> 192 LOC thin routes + 316-line CanonicalControlService + 105-line schemas file. Canonical Control Library manages OWASP/NIST/ENISA-anchored security control frameworks and controls. Like company_profile_routes, this file uses raw SQL via sqlalchemy.text() because there are no SQLAlchemy models for canonical_control_frameworks or canonical_controls. Single-service split. Session management moved from bespoke `with SessionLocal() as db:` blocks to Depends(get_db) for consistency. Legacy test imports preserved via re-export (FrameworkResponse, ControlResponse, SimilarityCheckRequest, SimilarityCheckResponse, _control_row). Validation extracted to a module-level `_validate_control_input` helper so both create and update share the same checks. ValidationError (from compliance.domain) replaces raw HTTPException(400) raises. Verified: - 187/187 pytest (173 core + 14 canonical) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 130 source files - canonical_control_routes.py 514 -> 192 LOC - Hard-cap violations: 13 -> 12 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
4fa0dd6f6d |
refactor(backend/api): extract VVTService (Step 4 — file 5 of 18)
compliance/api/vvt_routes.py (550 LOC) -> 225 LOC thin routes + 475-line VVTService. Covers the organization header, processing activities CRUD, audit log, JSON/CSV export, stats, and version lookups for the Art. 30 DSGVO Verzeichnis. Single-service split: organization + activities + audit + stats all revolve around the same tenant's VVT document, and the existing test suite (tests/test_vvt_routes.py — 768 LOC, tests/test_vvt_tenant_isolation.py — 205 LOC) exercises them together. Module-level helpers (_activity_to_response, _log_audit, _export_csv) stay module-level in compliance.services.vvt_service and are re-exported from compliance.api.vvt_routes so the two test files keep importing from the old path. Pydantic schemas already live in compliance.schemas.vvt from Step 3 — no new schema file needed this round. mypy.ini flips compliance.api.vvt_routes from ignore_errors=True to False. Two SQLAlchemy Column[str] vs str dict-index errors fixed with explicit str() casts on status/business_function in the stats loop. Verified: - 242/242 pytest (173 core + 69 VVT integration) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 128 source files - vvt_routes.py 550 -> 225 LOC - vvt_service.py 475 LOC (under 500 hard cap) - Hard-cap violations: 14 -> 13 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
f39c7ca40c |
refactor(backend/api): extract CompanyProfileService (Step 4 — file 4 of 18)
compliance/api/company_profile_routes.py (640 LOC) -> 154 LOC thin routes.
Unusual for this repo: persistence uses raw SQL via sqlalchemy.text()
because the underlying compliance_company_profiles table has ~45 columns
with complex jsonb coercion and there is no SQLAlchemy model for it.
New files:
compliance/schemas/company_profile.py (127) — 4 request/response models
compliance/services/company_profile_service.py (340) — Service class + row_to_response + log_audit
compliance/services/_company_profile_sql.py (139) — 70-line INSERT/UPDATE statements
separated for readability
Minor behavioral improvement: the handlers now use Depends(get_db) for
session management instead of the bespoke `db = SessionLocal(); try: ...
finally: db.close()` pattern. This makes the routes consistent with
every other refactored service, fixes the broken-ness under test
dependency_overrides, and removes 6 duplicate try/finally blocks.
Legacy exports preserved: CompanyProfileRequest, CompanyProfileResponse,
AuditEntryResponse, AuditListResponse, row_to_response, and log_audit are
re-exported from compliance.api.company_profile_routes so that the two
existing test files
(tests/test_company_profile_routes.py, tests/test_company_profile_extend.py)
keep importing from the same path.
Pre-existing broken tests noted: 6 tests in those files feed a 40-tuple
row into row_to_response, but _BASE_COLUMNS_LIST has 46 columns (has had
since the Phase 2 Stammdaten extension). These tests fail on main too
(verified via `git stash` round-trip). Not fixed in this commit — they
require a rewrite of the test's _make_row helper, which is out of scope
for a pure structural refactor. Flagged for follow-up.
Verified:
- 173/173 pytest compliance/tests/ tests/contracts/ pass
- OpenAPI 360/484 unchanged
- mypy compliance/ -> Success on 127 source files
- company_profile_routes.py 640 -> 154 LOC
- All new files under soft 300 target except service (340, under hard 500)
- Hard-cap violations: 15 -> 14
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
d571412657 |
refactor(backend/api): extract TOMService (Step 4 — file 3 of 18)
compliance/api/tom_routes.py (609 LOC) -> 215 LOC thin routes + 434-line TOMService. Request bodies (TOMStateBody, TOMMeasureCreate, TOMMeasureUpdate, TOMMeasureBulkItem, TOMMeasureBulkBody) moved to compliance/schemas/tom.py (joining the existing response models from the Step 3 split). Single-service split (not two like banner): state, measures CRUD + bulk upsert, stats, export, and version lookups are all tightly coupled around the TOMMeasureDB aggregate, so splitting would create artificial boundaries. TOMService is 434 LOC — comfortably under the 500 hard cap. Domain error mapping: - ConflictError -> 409 (version conflict on state save; duplicate control_id on create) - NotFoundError -> 404 (missing measure on update; missing version) - ValidationError -> 400 (missing tenant_id on DELETE /state) Legacy test compat: the existing tests/test_tom_routes.py imports TOMMeasureBulkItem, _parse_dt, _measure_to_dict, and DEFAULT_TENANT_ID directly from compliance.api.tom_routes. All re-exported via __all__ so the 44-test file runs unchanged. mypy.ini flips compliance.api.tom_routes from ignore_errors=True to False. TOMService carries the scoped Column[T] header. Verified: - 217/217 pytest (173 baseline + 44 TOM) pass - OpenAPI 360/484 unchanged - mypy compliance/ -> Success on 124 source files - tom_routes.py 609 -> 215 LOC - Hard-cap violations: 16 -> 15 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
10073f3ef0 |
refactor(backend/api): extract BannerConsent + BannerAdmin services (Step 4)
Phase 1 Step 4, file 2 of 18. Same cookbook as audit_routes ( |
||
|
|
883ef702ac |
tech-debt: mypy --strict config + integration tests for audit routes
Phase 1 Step 4 follow-up addressing the debt flagged in the worked-example
commit (
|
||
|
|
4a91814bfc |
refactor(backend/api): extract AuditSession service layer (Step 4 worked example)
Phase 1 Step 4 of PHASE1_RUNBOOK.md, first worked example. Demonstrates
the router -> service delegation pattern for all 18 oversized route
files still above the 500 LOC hard cap.
compliance/api/audit_routes.py (637 LOC) is decomposed into:
compliance/api/audit_routes.py (198) — thin handlers
compliance/services/audit_session_service.py (259) — session lifecycle
compliance/services/audit_signoff_service.py (319) — checklist + sign-off
compliance/api/_http_errors.py ( 43) — reusable error translator
Handlers shrink to 3-6 lines each:
@router.post("/sessions", response_model=AuditSessionResponse)
async def create_audit_session(
request: CreateAuditSessionRequest,
service: AuditSessionService = Depends(get_audit_session_service),
):
with translate_domain_errors():
return service.create(request)
Services are HTTP-agnostic: they raise NotFoundError / ConflictError /
ValidationError from compliance.domain, and the route layer translates
those to HTTPException(404/409/400) via the translate_domain_errors()
context manager in compliance.api._http_errors. The error translator is
reusable by every future Step 4 refactor.
Services take a sqlalchemy Session in the constructor and are wired via
Depends factories (get_audit_session_service / get_audit_signoff_service).
No globals, no module-level state.
Behavior is byte-identical at the HTTP boundary:
- Same paths, methods, status codes, response models
- Same error messages (domain error __str__ preserved)
- Same auto-start-on-first-signoff, same statistics calculation,
same signature hash format, same PDF streaming response
Verified:
- 173/173 pytest compliance/tests/ tests/contracts/ pass
- OpenAPI 360 paths / 484 operations unchanged
- audit_routes.py under soft 300 target
- Both new service files under soft 300 / hard 500
Note: compliance/tests/test_audit_routes.py contains placeholder tests
that do not actually import or call the handler functions — they only
assert on request-data shape. Real behavioral coverage relies on the
contract test. A follow-up commit should add TestClient-based
integration tests for the audit endpoints. Flagged in PHASE1_RUNBOOK.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
3320ef94fc |
refactor: phase 0 guardrails + phase 1 step 2 (models.py split)
Squash of branch refactor/phase0-guardrails-and-models-split — 4 commits,
81 files, 173/173 pytest green, OpenAPI contract preserved (360 paths /
484 operations).
## Phase 0 — Architecture guardrails
Three defense-in-depth layers to keep the architecture rules enforced
regardless of who opens Claude Code in this repo:
1. .claude/settings.json PreToolUse hook on Write/Edit blocks any file
that would exceed the 500-line hard cap. Auto-loads in every Claude
session in this repo.
2. scripts/githooks/pre-commit (install via scripts/install-hooks.sh)
enforces the LOC cap locally, freezes migrations/ without
[migration-approved], and protects guardrail files without
[guardrail-change].
3. .gitea/workflows/ci.yaml gains loc-budget + guardrail-integrity +
sbom-scan (syft+grype) jobs, adds mypy --strict for the new Python
packages (compliance/{services,repositories,domain,schemas}), and
tsc --noEmit for admin-compliance + developer-portal.
Per-language conventions documented in AGENTS.python.md, AGENTS.go.md,
AGENTS.typescript.md at the repo root — layering, tooling, and explicit
"what you may NOT do" lists. Root CLAUDE.md is prepended with the six
non-negotiable rules. Each of the 10 services gets a README.md.
scripts/check-loc.sh enforces soft 300 / hard 500 and surfaces the
current baseline of 205 hard + 161 soft violations so Phases 1-4 can
drain it incrementally. CI gates only CHANGED files in PRs so the
legacy baseline does not block unrelated work.
## Deprecation sweep
47 files. Pydantic V1 regex= -> pattern= (2 sites), class Config ->
ConfigDict in source_policy_router.py (schemas.py intentionally skipped;
it is the Phase 1 Step 3 split target). datetime.utcnow() ->
datetime.now(timezone.utc) everywhere including SQLAlchemy default=
callables. All DB columns already declare timezone=True, so this is a
latent-bug fix at the Python side, not a schema change.
DeprecationWarning count dropped from 158 to 35.
## Phase 1 Step 1 — Contract test harness
tests/contracts/test_openapi_baseline.py diffs the live FastAPI /openapi.json
against tests/contracts/openapi.baseline.json on every test run. Fails on
removed paths, removed status codes, or new required request body fields.
Regenerate only via tests/contracts/regenerate_baseline.py after a
consumer-updated contract change. This is the safety harness for all
subsequent refactor commits.
## Phase 1 Step 2 — models.py split (1466 -> 85 LOC shim)
compliance/db/models.py is decomposed into seven sibling aggregate modules
following the existing repo pattern (dsr_models.py, vvt_models.py, ...):
regulation_models.py (134) — Regulation, Requirement
control_models.py (279) — Control, Mapping, Evidence, Risk
ai_system_models.py (141) — AISystem, AuditExport
service_module_models.py (176) — ServiceModule, ModuleRegulation, ModuleRisk
audit_session_models.py (177) — AuditSession, AuditSignOff
isms_governance_models.py (323) — ISMSScope, Context, Policy, Objective, SoA
isms_audit_models.py (468) — Finding, CAPA, MgmtReview, InternalAudit,
AuditTrail, Readiness
models.py becomes an 85-line re-export shim in dependency order so
existing imports continue to work unchanged. Schema is byte-identical:
__tablename__, column definitions, relationship strings, back_populates,
cascade directives all preserved.
All new sibling files are under the 500-line hard cap; largest is
isms_audit_models.py at 468. No file in compliance/db/ now exceeds
the hard cap.
## Phase 1 Step 3 — infrastructure only
backend-compliance/compliance/{schemas,domain,repositories}/ packages
are created as landing zones with docstrings. compliance/domain/
exports DomainError / NotFoundError / ConflictError / ValidationError /
PermissionError — the base classes services will use to raise
domain-level errors instead of HTTPException.
PHASE1_RUNBOOK.md at backend-compliance/PHASE1_RUNBOOK.md documents
the nine-step execution plan for Phase 1: snapshot baseline,
characterization tests, split models.py (this commit), split schemas.py
(next), extract services, extract repositories, mypy --strict, coverage.
## Verification
backend-compliance/.venv-phase1: uv python install 3.12 + pip -r requirements.txt
PYTHONPATH=. pytest compliance/tests/ tests/contracts/
-> 173 passed, 0 failed, 35 warnings, OpenAPI 360/484 unchanged
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
712fa8cb74 |
feat: Pass 0b quality — negative actions, container detection, session object classes
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 33s
CI/CD / test-python-backend-compliance (push) Successful in 30s
CI/CD / test-python-document-crawler (push) Successful in 21s
CI/CD / test-python-dsms-gateway (push) Successful in 16s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Successful in 2s
4 error class fixes from AUTH-1052 quality review: 1. Prohibitive action types (prevent/exclude/forbid) for "dürfen keine", "verboten" etc. 2. Container object detection (Sitzungsverwaltung, Token-Schutz → _requires_decomposition) 3. Session-specific object classes (session, cookie, jwt, federated_assertion) 4. Session lifecycle actions (invalidate, issue, rotate, enforce) with templates + severity caps 76 new tests (303 total), all passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
f8d9919b97 |
Improve object normalization: shorter keys, synonym expansion, qualifier stripping
- Truncate object keys to 40 chars (was 80) at underscore boundary - Strip German qualifying prepositional phrases (bei/für/gemäß/von/zur/...) - Add 65 new synonym mappings for near-duplicate patterns found in analysis - Strip trailing noise tokens (articles/prepositions) - Add _truncate_at_boundary() helper and _QUALIFYING_PHRASE_RE regex - 11 new tests for normalization improvements (227 total pass) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
fb2cf29b34 |
fix: Pass 0b — Duplicate Guard, Severity-Kalibrierung, Title-Truncation
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 55s
CI/CD / test-python-backend-compliance (push) Successful in 36s
CI/CD / test-python-document-crawler (push) Successful in 23s
CI/CD / test-python-dsms-gateway (push) Successful in 20s
CI/CD / validate-canonical-controls (push) Successful in 11s
CI/CD / Deploy (push) Successful in 4s
1. Duplicate Guard: merge_hint-Lookup vor INSERT in _write_atomic_control() verhindert semantisch identische Controls unter demselben Parent. 2. Severity-Kalibrierung: action_type-basiert statt blind vom Parent. define/review/test → max medium, implement/monitor → max high. 3. Title-Truncation: Schnitt am Wortende statt mitten im Wort. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
f39e5a71af |
feat: Obligation-Deduplizierung — 34.617 Duplikate als 'duplicate' markiert
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 33s
CI/CD / test-python-backend-compliance (push) Successful in 35s
CI/CD / test-python-document-crawler (push) Successful in 30s
CI/CD / test-python-dsms-gateway (push) Successful in 20s
CI/CD / validate-canonical-controls (push) Successful in 13s
CI/CD / Deploy (push) Successful in 3s
Neue Endpunkte POST /obligations/dedup und GET /obligations/dedup-stats. Pro candidate_id wird der aelteste Eintrag behalten, alle weiteren erhalten release_state='duplicate' mit merged_into_id + quality_flags fuer Traceability. Detail-View filtert Duplikate aus. MKDocs aktualisiert. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
52e463a7c8 |
feat: Faceted Search — Dropdown-Counts passen sich aktiven Filtern an
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 36s
CI/CD / test-python-backend-compliance (push) Successful in 42s
CI/CD / test-python-document-crawler (push) Successful in 30s
CI/CD / test-python-dsms-gateway (push) Successful in 21s
CI/CD / validate-canonical-controls (push) Successful in 13s
CI/CD / Deploy (push) Successful in 2s
Backend: controls-meta akzeptiert alle Filter-Parameter und berechnet Faceted Counts (jede Dimension zaehlt mit allen ANDEREN Filtern). Neue Facets: severity, verification_method, category, evidence_type, release_state — zusaetzlich zu domains, sources, type_counts. Frontend: loadMeta laedt bei jeder Filteraenderung neu, alle Dropdowns zeigen kontextsensitive Zahlen. Proxy leitet Filter an controls-meta weiter. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
81c9ce5de3 |
fix: V1 Enrichment — Qdrant Collection + Parent-Resolution fuer regulatorische Matches
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 33s
CI/CD / test-python-backend-compliance (push) Successful in 30s
CI/CD / test-python-document-crawler (push) Successful in 21s
CI/CD / test-python-dsms-gateway (push) Successful in 16s
CI/CD / validate-canonical-controls (push) Successful in 9s
CI/CD / Deploy (push) Successful in 1s
Die atomic_controls_dedup Collection (51k Punkte) enthaelt nur atomare Controls ohne source_citation. Jetzt wird der Parent-Control aufgeloest, der die Rechtsgrundlage traegt. Deduplizierung nach Parent-UUID verhindert mehrfache Eintraege fuer die gleiche Regulation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
db7c207464 |
feat: V1 Control Enrichment — Eigenentwicklung-Label, regulatorisches Matching & Vergleichsansicht
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 39s
CI/CD / test-python-backend-compliance (push) Successful in 32s
CI/CD / test-python-document-crawler (push) Successful in 20s
CI/CD / test-python-dsms-gateway (push) Successful in 16s
CI/CD / validate-canonical-controls (push) Successful in 9s
CI/CD / Deploy (push) Successful in 4s
863 v1-Controls (manuell geschrieben, ohne Rechtsgrundlage) werden als "Eigenentwicklung" gekennzeichnet und automatisch mit regulatorischen Controls (DSGVO, NIS2, OWASP etc.) per Embedding-Similarity abgeglichen. Backend: - Migration 080: v1_control_matches Tabelle (Cross-Reference) - v1_enrichment.py: Batch-Matching via BGE-M3 + Qdrant (Threshold 0.75) - 3 neue API-Endpoints: enrich-v1-matches, v1-matches, v1-enrichment-stats - 6 Tests (dry-run, execution, matches, pagination, detection) Frontend: - Orange "Eigenentwicklung"-Badge statt grauem "v1" (wenn kein Source) - "Regulatorische Abdeckung"-Sektion im ControlDetail mit Match-Karten - Side-by-Side V1CompareView (Eigenentwicklung vs. regulatorisch gedeckt) - Prev/Next Navigation durch alle Matches Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
23dd5116b3 |
feat: LLM-basierter Rationale-Backfill fuer atomare Controls
POST /controls/backfill-rationale — ersetzt Placeholder "Aus Obligation abgeleitet." durch LLM-generierte Begruendungen (Ollama/qwen3.5). Optimierung: gruppiert ~86k Controls nach ~7k Parents, ein LLM-Call pro Parent. Paginierung via batch_size/offset fuer kontrollierte Ausfuehrung. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
5e9cab6ab5 |
feat: evidence_type Feld (code/process/hybrid) fuer Controls
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 38s
CI/CD / test-python-backend-compliance (push) Successful in 31s
CI/CD / test-python-document-crawler (push) Successful in 19s
CI/CD / test-python-dsms-gateway (push) Successful in 17s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Successful in 4s
Neues Feld auf canonical_controls klassifiziert, ob ein Control technisch im Source Code (code), organisatorisch via Dokumente (process) oder beides (hybrid) nachgewiesen wird. Inklusive Backfill-Endpoint, Frontend-Badge/Filter und MkDocs-Dokumentation. - Migration 079: evidence_type VARCHAR(20) + Index - Backend: Filter, Backfill-Endpoint mit Domain-Heuristik, CRUD - Frontend: EvidenceTypeBadge (sky/amber/violet), Nachweisart-Dropdown - Proxy: evidence_type Passthrough fuer controls + controls-count - Tests: 22 Tests fuer Klassifikations-Heuristik - Docs: Eigenes MkDocs-Kapitel mit Mermaid-Diagramm Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
a29bfdd588 |
fix: normative_strength 'may' statt 'can' (DB-Constraint)
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 34s
CI/CD / test-python-backend-compliance (push) Successful in 30s
CI/CD / test-python-document-crawler (push) Successful in 19s
CI/CD / test-python-dsms-gateway (push) Successful in 17s
CI/CD / validate-canonical-controls (push) Successful in 11s
CI/CD / Deploy (push) Has been skipped
DB-Constraint erlaubt nur must/should/may. 'can' gibt es nicht. Alle Referenzen auf 'can' durch 'may' ersetzt. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
230fbeb490 |
feat: Dreistufenmodell normative Verbindlichkeit + Duplikat-Filter + Auto-Deploy
- Source-Type-Klassifikation (58 Regulierungen: law/guideline/framework) - Backfill-Endpoint POST /controls/backfill-normative-strength - exclude_duplicates Filter fuer Control-Library (Backend + Proxy + UI-Toggle) - MkDocs-Kapitel: Normative Verbindlichkeit mit Mermaid-Diagrammen - scripts/deploy.sh: Auto-Push + Mac Mini rebuild + Coolify health monitoring - 26 Unit Tests fuer Klassifikations-Logik Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
6d3bdf8e74 |
feat: Control-Detail Provenance + Atomare Controls Seite
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 41s
CI/CD / test-python-backend-compliance (push) Successful in 40s
CI/CD / test-python-document-crawler (push) Successful in 23s
CI/CD / test-python-dsms-gateway (push) Successful in 18s
CI/CD / validate-canonical-controls (push) Successful in 11s
CI/CD / Deploy (push) Successful in 4s
Backend: provenance endpoint (obligations, doc refs, merged duplicates, regulations summary) + atomic-stats aggregation endpoint. Frontend: ControlDetail mit Provenance-Sektionen, klickbare Navigation, neue /sdk/atomic-controls Seite mit Stats-Bar und gefilterer Liste. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
770f0b5ab0 |
fix: adapt batch dedup to NULL pattern_id — group by merge_group_hint
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 31s
CI/CD / test-python-backend-compliance (push) Successful in 31s
CI/CD / test-python-document-crawler (push) Successful in 21s
CI/CD / test-python-dsms-gateway (push) Successful in 19s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Successful in 2s
All Pass 0b controls have pattern_id=NULL. Rewritten to: - Phase 1: Group by merge_group_hint (action:object:trigger), 52k groups - Phase 2: Cross-group embedding search for semantically similar masters - Qdrant search uses unfiltered cross-regulation endpoint - API param changed: pattern_id → hint_filter Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
35784c35eb |
feat: Batch Dedup Runner — 85k→~18-25k Master Controls
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 32s
CI/CD / test-python-backend-compliance (push) Successful in 30s
CI/CD / test-python-document-crawler (push) Successful in 20s
CI/CD / test-python-dsms-gateway (push) Successful in 16s
CI/CD / validate-canonical-controls (push) Successful in 9s
CI/CD / Deploy (push) Successful in 1s
Adds batch orchestration for deduplicating ~85k Pass 0b atomic controls into ~18-25k unique masters with M:N parent linking. New files: - migrations/078_batch_dedup.sql: merged_into_uuid column, perf indexes, link_type CHECK extended for cross_regulation - batch_dedup_runner.py: BatchDedupRunner with quality scoring, merge-hint grouping, title-identical short-circuit, parent-link transfer, and cross-regulation pass - tests/test_batch_dedup_runner.py: 21 tests (all passing) Modified: - control_dedup.py: optional collection param on Qdrant functions - crosswalk_routes.py: POST/GET batch-dedup endpoints Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
cce2707c03 |
fix: update 61 outdated test mocks to match current schemas
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 41s
CI/CD / test-python-backend-compliance (push) Successful in 31s
CI/CD / test-python-document-crawler (push) Successful in 21s
CI/CD / test-python-dsms-gateway (push) Successful in 16s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Successful in 4s
Tests were failing due to stale mock objects after schema extensions: - DSFA: add _mapping property to _DictRow, use proper mock instead of MagicMock - Company Profile: add 6 missing fields (project_id, offering_urls, etc.) - Legal Templates/Policy: update document type count 52→58 - VVT: add 13 missing attributes to activity mock - Legal Documents: align consent test assertions with production behavior Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
e6201d5239 |
feat: Anti-Fake-Evidence System (Phase 1-4b)
Implement full evidence integrity pipeline to prevent compliance theater: - Confidence levels (E0-E4), truth status tracking, assertion engine - Four-Eyes approval workflow, audit trail, reject endpoint - Evidence distribution dashboard, LLM audit routes - Traceability matrix (backend endpoint + Compliance Hub UI tab) - Anti-fake badges, control status machine, normative patterns - 2 migrations, 4 test suites, MkDocs documentation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
48ca0a6bef |
feat: Framework Decomposition Engine + Composite Detection for Pass 0b
Adds a routing layer between Pass 0a and Pass 0b that classifies obligations into atomic/compound/framework_container. Framework-container obligations (e.g. "CCM-Praktiken fuer AIS") are decomposed into concrete sub-obligations via an internal framework registry before Pass 0b composition. - New: framework_decomposition.py with routing, matching, decomposition - New: Framework registry (NIST SP 800-53, OWASP ASVS, CSA CCM) as JSON - New: Composite detection flags on atomic controls (is_composite, atomicity) - New: gen_meta fields: framework_ref, framework_domain, decomposition_source - Integration: _route_and_compose() in run_pass0b() deterministic path - 248 tests (198 decomposition + 50 framework), all passing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
1a63f5857b |
feat: Deterministic Control Composition Engine v2 for Pass 0b
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 38s
CI/CD / test-python-backend-compliance (push) Successful in 39s
CI/CD / test-python-document-crawler (push) Successful in 26s
CI/CD / test-python-dsms-gateway (push) Successful in 21s
CI/CD / validate-canonical-controls (push) Successful in 13s
CI/CD / Deploy (push) Successful in 3s
Replace LLM-based Pass 0b with rule-based deterministic engine that
composes atomic controls from obligation data without any LLM call.
Engine features:
- 24 action types (implement, configure, define, document, maintain,
review, monitor, assess, audit, test, verify, validate, report,
notify, train, restrict_access, encrypt, delete, retain, ensure,
approve, remediate, perform, obtain)
- 19 object classes (policy, procedure, register, record, report,
technical_control, access_control, cryptographic_control,
configuration, account, system, data, interface, role, training,
incident, risk_artifact, process, consent)
- Compound action splitting with no-split phrases
- Title pattern: "{Object} {state_suffix}" (24 state suffixes)
- Statement field: "{condition} {object} ist {trigger} {action}"
- Pattern candidates for downstream categorization (26 specific
combos + 24 action fallbacks)
- Structured timing: deadline_hours + frequency extraction
- Confidence scoring (0.3 base + action/object/trigger/template)
- Merge group hints for dedup: "{action}:{norm_obj}:{trigger}"
- Synonym-based object normalization (50+ German synonyms)
- 16 specific (action_type, object_class) template overrides
- Output validator with Pflichtfelder + Negativregeln + Warnregeln
- All new fields serialized into gen_meta JSONB (no migration needed)
Tests: 185 passed (33 new tests covering all engine components)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
||
|
|
ac6134ce6d |
feat: control_parent_links population + traceability API + frontend
- _write_atomic_control() now uses RETURNING id and inserts into
control_parent_links (M:N) with source_regulation, source_article,
and obligation_candidate_id parsed from parent's source_citation
- New _parse_citation() helper for JSONB source_citation extraction
- New GET /controls/{id}/traceability endpoint returning full chain:
parent links with obligations, child controls, source_count
- Backend: control_type filter (atomic/rich) for controls + count
- Frontend: Rechtsgrundlagen section in ControlDetail showing all
parent links per source regulation with obligation text + strength
- Frontend: Atomic/Rich filter dropdown in Control Library list
- Frontend: GenerationStrategyBadge recognizes 'pass0b' strategy
- Tests: 3 new tests for parent_link creation + citation parsing,
existing batch test mock updated for RETURNING clause
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
||
|
|
a14e2f3a00 |
feat(decomposition): add merge pass, enrichment, and Pass 0b refinements
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 51s
CI/CD / test-python-backend-compliance (push) Successful in 34s
CI/CD / test-python-document-crawler (push) Successful in 23s
CI/CD / test-python-dsms-gateway (push) Successful in 20s
CI/CD / validate-canonical-controls (push) Successful in 12s
CI/CD / Deploy (push) Has been skipped
Add obligation refinement pipeline between Pass 0a and 0b: - Merge pass: rule-based dedup of implementation-level duplicate obligations within the same parent control (Jaccard similarity on action+object) - Enrich pass: classify trigger_type (event/periodic/continuous) and detect is_implementation_specific from obligation text (regex-based, no LLM) - Pass 0b: skip merged obligations, cap severity for impl-specific, override category to 'testing' for test obligations - Migration 075: merged_into_id, trigger_type, is_implementation_specific - Two new API endpoints: merge-obligations, enrich-obligations - 30+ new tests (122 total, all passing) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
643b26618f |
feat: Control Library UI, dedup migration, QA tooling, docs
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 31s
CI/CD / test-python-backend-compliance (push) Successful in 1m35s
CI/CD / test-python-document-crawler (push) Successful in 20s
CI/CD / test-python-dsms-gateway (push) Successful in 17s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Has been skipped
- Control Library: parent control display, ObligationTypeBadge, GenerationStrategyBadge variants, evidence string fallback - API: expose parent_control_uuid/id/title in canonical controls - Fix: DSFA SQLAlchemy 2.0 Row._mapping compatibility - Migration 074: control_parent_links + control_dedup_reviews tables - QA scripts: benchmark, gap analysis, OSCAL import, OWASP cleanup, phase5 normalize, phase74 gap fill, sync_db, run_job - Docs: dedup engine, RAG benchmark, lessons learned, pipeline docs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
c52dbdb8f1 |
feat(rag): optimize RAG pipeline — JSON-Mode, CoT, Hybrid Search, Re-Ranking, Cross-Reg Dedup, chunk 1024
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 42s
CI/CD / test-python-backend-compliance (push) Successful in 1m38s
CI/CD / test-python-document-crawler (push) Successful in 20s
CI/CD / test-python-dsms-gateway (push) Successful in 17s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Has been skipped
Phase 1 (LLM Quality): - Add format=json to all Ollama payloads (obligation_extractor, control_generator, citation_backfill) - Add Chain-of-Thought analysis steps to Pass 0a/0b system prompts Phase 2 (Retrieval Quality): - Hybrid search via Qdrant Query API with RRF fusion + automatic text index (legal_rag.go) - Fallback to dense-only search if Query API unavailable - Cross-encoder re-ranking with BGE Reranker v2 (RERANK_ENABLED=false by default) - CPU-only PyTorch dependency to keep Docker image small Phase 3 (Data Layer): - Cross-regulation dedup pass (threshold 0.95) links controls across regulations - DedupResult.link_type field distinguishes dedup_merge vs cross_regulation - Chunk size defaults updated 512/50 → 1024/128 for new ingestions only - Existing collections and controls are NOT affected Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
c3afa628ed |
feat(sdk): vendor-compliance cross-module integration — VVT, obligations, TOM, loeschfristen
Integrate the vendor-compliance module with four DSGVO modules to eliminate data silos and resolve the VVT processor tab's ephemeral state problem. - Reposition vendor-compliance sidebar from seq 4200 to 2500 (after VVT) - VVT: replace ephemeral ProcessorRecord state with Vendor-API fetch (read-only) - Obligations: add linked_vendor_ids (JSONB) + compliance check #12 MISSING_VENDOR_LINK - TOM: add vendor TOM-controls cross-reference table in overview tab - Loeschfristen: add linked_vendor_ids (JSONB) + vendor picker + document section - Migrations: 069_obligations_vendor_link.sql, 070_loeschfristen_vendor_link.sql - Tests: 12 new backend tests (125 total pass) - Docs: update obligations.md + vendors.md with cross-module integration Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
4b1eede45b |
feat(tom): audit document, compliance checks, 25 controls, canonical control mapping
Phase A: TOM document HTML generator (12 sections, inline CSS, A4 print) Phase B: TOMDocumentTab component (org-header form, revisions, print/download) Phase C: 11 compliance checks with severity-weighted scoring Phase D: MkDocs documentation for TOM module Phase E: 25 new controls (63 → 88) in 13 categories Canonical Control Mapping (three-layer architecture): - Migration 068: tom_control_mappings + tom_control_sync_state tables - 6 API endpoints: sync, list, by-tom, stats, manual add, delete - Category mapping: 13 TOM categories → 17 canonical categories - Frontend: sync button + coverage card (Overview), drill-down (Editor), belegende Controls count (Document) - 20 tests (unit + API with mocked DB) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |