Squash of branch refactor/phase0-guardrails-and-models-split — 4 commits,
81 files, 173/173 pytest green, OpenAPI contract preserved (360 paths /
484 operations).
## Phase 0 — Architecture guardrails
Three defense-in-depth layers to keep the architecture rules enforced
regardless of who opens Claude Code in this repo:
1. .claude/settings.json PreToolUse hook on Write/Edit blocks any file
that would exceed the 500-line hard cap. Auto-loads in every Claude
session in this repo.
2. scripts/githooks/pre-commit (install via scripts/install-hooks.sh)
enforces the LOC cap locally, freezes migrations/ without
[migration-approved], and protects guardrail files without
[guardrail-change].
3. .gitea/workflows/ci.yaml gains loc-budget + guardrail-integrity +
sbom-scan (syft+grype) jobs, adds mypy --strict for the new Python
packages (compliance/{services,repositories,domain,schemas}), and
tsc --noEmit for admin-compliance + developer-portal.
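The commit-marker rules of layer 2 can be illustrated with a minimal Python sketch. The real hook is a shell script, and the source does not say how it reads the marker, so this version takes the commit message as an argument. The function name `check_commit` and the contents of `GUARDRAIL_FILES` are assumptions made for illustration only.

```python
from typing import Iterable, List

# Hypothetical set of protected guardrail files; the real list lives in the hook.
GUARDRAIL_FILES = {".claude/settings.json", "scripts/check-loc.sh"}


def check_commit(changed_files: Iterable[str], message: str) -> List[str]:
    """Return human-readable violations for a proposed commit (empty list = OK)."""
    violations = []
    for path in changed_files:
        in_migrations = path.startswith("migrations/") or "/migrations/" in path
        if in_migrations and "[migration-approved]" not in message:
            violations.append(f"{path}: migrations/ is frozen without [migration-approved]")
        if path in GUARDRAIL_FILES and "[guardrail-change]" not in message:
            violations.append(f"{path}: guardrail file changed without [guardrail-change]")
    return violations


if __name__ == "__main__":
    # Example: a migration touched without the approval marker is reported.
    for line in check_commit(["migrations/040_example.sql"], "refactor: tidy up"):
        print(line)
```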
Per-language conventions documented in AGENTS.python.md, AGENTS.go.md,
AGENTS.typescript.md at the repo root — layering, tooling, and explicit
"what you may NOT do" lists. Root CLAUDE.md is prepended with the six
non-negotiable rules. Each of the 10 services gets a README.md.
scripts/check-loc.sh enforces soft 300 / hard 500 and surfaces the
current baseline of 205 hard + 161 soft violations so Phases 1-4 can
drain it incrementally. CI gates only CHANGED files in PRs so the
legacy baseline does not block unrelated work.
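As a rough illustration of the soft/hard budget and the changed-files-only gate, here is a minimal Python sketch. scripts/check-loc.sh itself is a shell script whose internals are not shown here; the names `classify` and `gate_changed_files` are hypothetical, and treating exactly 500 lines as within budget is an assumption based on the word "exceed".

```python
from typing import Dict, List

SOFT_LIMIT = 300  # soft target from the commit message
HARD_LIMIT = 500  # hard cap from the commit message


def classify(line_count: int) -> str:
    """Classify a file as 'ok', 'soft' (over 300) or 'hard' (over 500)."""
    if line_count > HARD_LIMIT:
        return "hard"
    if line_count > SOFT_LIMIT:
        return "soft"
    return "ok"


def gate_changed_files(line_counts: Dict[str, int], changed: List[str]) -> List[str]:
    """Return only the CHANGED files that break the hard cap.

    Legacy offenders that are not part of the change set are ignored,
    so the existing baseline does not block unrelated work.
    """
    return sorted(p for p in changed if classify(line_counts.get(p, 0)) == "hard")


if __name__ == "__main__":
    counts = {"models.py": 1466, "shim.py": 85, "legacy.py": 900}
    # legacy.py is over the cap but untouched, so only models.py is reported.
    print(gate_changed_files(counts, ["models.py", "shim.py"]))
```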
## Deprecation sweep
47 files. Pydantic V1 regex= -> pattern= (2 sites), class Config ->
ConfigDict in source_policy_router.py (schemas.py intentionally skipped;
it is the Phase 1 Step 3 split target). datetime.utcnow() ->
datetime.now(timezone.utc) everywhere including SQLAlchemy default=
callables. All DB columns already declare timezone=True, so this is a
latent-bug fix at the Python side, not a schema change.
DeprecationWarning count dropped from 158 to 35.
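A minimal sketch of why the datetime change matters, using only the standard library. The helper name `utc_now` is hypothetical; the SQLAlchemy usage is shown only as a comment.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


# For a SQLAlchemy column default, pass the function itself (not its result),
# e.g. Column(DateTime(timezone=True), default=utc_now), so that it is
# evaluated per insert rather than once at import time.

aware = utc_now()
# The same clock reading without tzinfo, which is what utcnow() used to return.
naive = aware.replace(tzinfo=None)

try:
    aware < naive  # mixing naive and aware values is an error in Python
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True

if __name__ == "__main__":
    print(aware.tzinfo, naive.tzinfo, mixed_comparison_failed)
```

With timezone=True columns, the database side already returns aware values, so naive defaults on the Python side are where such mixed comparisons would originate.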
## Phase 1 Step 1 — Contract test harness
tests/contracts/test_openapi_baseline.py diffs the live FastAPI /openapi.json
against tests/contracts/openapi.baseline.json on every test run. Fails on
removed paths, removed status codes, or new required request body fields.
Regenerate only via tests/contracts/regenerate_baseline.py after a
consumer-updated contract change. This is the safety harness for all
subsequent refactor commits.
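The three failure conditions can be sketched as a pure function over two OpenAPI documents. This is an illustrative approximation, not the code in test_openapi_baseline.py: the function names are hypothetical, only application/json bodies and local `$ref` pointers are handled, and reporting a removed operation on a surviving path is an added assumption.

```python
from typing import Any, Dict, List, Set

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def _resolve(spec: Dict[str, Any], schema: Dict[str, Any]) -> Dict[str, Any]:
    """Follow a local '#/...' $ref one level; other schemas are returned as-is."""
    ref = schema.get("$ref")
    if ref and ref.startswith("#/"):
        node: Any = spec
        for part in ref[2:].split("/"):
            node = node[part]
        return node
    return schema


def _required_body_fields(spec: Dict[str, Any], operation: Dict[str, Any]) -> Set[str]:
    """Required top-level fields of the JSON request body, if any."""
    schema = (
        operation.get("requestBody", {})
        .get("content", {})
        .get("application/json", {})
        .get("schema", {})
    )
    return set(_resolve(spec, schema).get("required", []))


def breaking_changes(baseline: Dict[str, Any], live: Dict[str, Any]) -> List[str]:
    """List contract breaks in `live` relative to `baseline`."""
    problems: List[str] = []
    for path, base_item in baseline.get("paths", {}).items():
        live_item = live.get("paths", {}).get(path)
        if live_item is None:
            problems.append(f"removed path: {path}")
            continue
        for method, base_op in base_item.items():
            if method not in HTTP_METHODS:
                continue
            label = f"{method.upper()} {path}"
            live_op = live_item.get(method)
            if live_op is None:
                # Assumption: a dropped method counts as a break as well.
                problems.append(f"removed operation: {label}")
                continue
            for status in base_op.get("responses", {}):
                if status not in live_op.get("responses", {}):
                    problems.append(f"removed status {status}: {label}")
            added = _required_body_fields(live, live_op) - _required_body_fields(baseline, base_op)
            for field in sorted(added):
                problems.append(f"new required field '{field}': {label}")
    return problems
```

Additive changes (new paths, new optional fields, new status codes) produce no entries, which matches the intent of letting the API grow without breaking consumers.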
## Phase 1 Step 2 — models.py split (1466 -> 85 LOC shim)
compliance/db/models.py is decomposed into seven sibling aggregate modules
following the existing repo pattern (dsr_models.py, vvt_models.py, ...):
regulation_models.py (134) — Regulation, Requirement
control_models.py (279) — Control, Mapping, Evidence, Risk
ai_system_models.py (141) — AISystem, AuditExport
service_module_models.py (176) — ServiceModule, ModuleRegulation, ModuleRisk
audit_session_models.py (177) — AuditSession, AuditSignOff
isms_governance_models.py (323) — ISMSScope, Context, Policy, Objective, SoA
isms_audit_models.py (468) — Finding, CAPA, MgmtReview, InternalAudit,
AuditTrail, Readiness
models.py becomes an 85-line re-export shim in dependency order so
existing imports continue to work unchanged. Schema is byte-identical:
__tablename__, column definitions, relationship strings, back_populates,
cascade directives all preserved.
All new sibling files are under the 500-line hard cap; largest is
isms_audit_models.py at 468. No file in compliance/db/ now exceeds
the hard cap.
## Phase 1 Step 3 — infrastructure only
backend-compliance/compliance/{schemas,domain,repositories}/ packages
are created as landing zones with docstrings. compliance/domain/
exports DomainError / NotFoundError / ConflictError / ValidationError /
PermissionError — the base classes services will use to raise
domain-level errors instead of HTTPException.
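A minimal sketch of how such a hierarchy might look and how a router boundary could translate it. Only the class names come from this commit; the inheritance from DomainError, the status-code mapping, and the helper `http_status_for` are assumptions for illustration.

```python
class DomainError(Exception):
    """Base class for errors raised by the service layer."""


class NotFoundError(DomainError):
    pass


class ConflictError(DomainError):
    pass


class ValidationError(DomainError):
    pass


class PermissionError(DomainError):  # note: shadows the builtin inside this module
    pass


# Hypothetical mapping applied once at the HTTP boundary (e.g. in an
# exception handler), so services never need to import HTTPException.
STATUS_BY_ERROR = {
    NotFoundError: 404,
    ConflictError: 409,
    ValidationError: 422,
    PermissionError: 403,
}


def http_status_for(exc: DomainError) -> int:
    """Pick the HTTP status for a domain error; unknown subclasses map to 400."""
    for error_type, status in STATUS_BY_ERROR.items():
        if isinstance(exc, error_type):
            return status
    return 400


if __name__ == "__main__":
    try:
        raise NotFoundError("control 42 does not exist")
    except DomainError as exc:
        print(http_status_for(exc), exc)
```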
backend-compliance/PHASE1_RUNBOOK.md documents the nine-step execution
plan for Phase 1, including: snapshot baseline, characterization tests,
split models.py (this commit), split schemas.py (next), extract services,
extract repositories, mypy --strict, coverage.
## Verification
backend-compliance/.venv-phase1: uv python install 3.12 + pip install -r requirements.txt
PYTHONPATH=. pytest compliance/tests/ tests/contracts/
-> 173 passed, 0 failed, 35 warnings, OpenAPI 360/484 unchanged
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BreakPilot Compliance - DSGVO/AI-Act SDK Platform
NON-NEGOTIABLE STRUCTURE RULES (enforced by the .claude/settings.json hook, git pre-commit, and CI):
- File-size budget: soft target 300 lines, hard cap 500 lines for any non-test, non-generated source file. Anything larger → split it. Exceptions are listed in .claude/rules/loc-exceptions.txt and require a written rationale.
- Clean architecture per service. Routers/handlers stay thin (≤30 lines per handler) and delegate to services; services use repositories; repositories own DB I/O. See AGENTS.python.md / AGENTS.go.md / AGENTS.typescript.md.
- Do not touch the database schema. No new Alembic migrations, no ALTER TABLE, no model field renames without an explicit migration plan reviewed by the DB owner. SQLAlchemy __tablename__ and column names are frozen.
- Public endpoints are a contract. Any change to a path, method, status code, request schema, or response schema in backend-compliance/, ai-compliance-sdk/, dsms-gateway/, document-crawler/, or compliance-tts-service/ must be accompanied by a matching update in every consumer (admin-compliance/, developer-portal/, breakpilot-compliance-sdk/, consent-sdk/). Use the OpenAPI snapshot tests in tests/contracts/ as the gate.
- Tests are not optional. New code without tests fails CI. Refactors must preserve coverage and add a characterization test before splitting an oversized file.
- Do not bypass the guardrails. Do not edit .claude/settings.json, scripts/check-loc.sh, or the loc-exceptions list to silence violations. If a rule is wrong, raise it in a PR description.
These rules apply to every Claude Code session opened inside this repository, regardless of who launched it. They are loaded automatically via this CLAUDE.md.
Development environment (IMPORTANT - ALWAYS READ FIRST)
Two-machine setup + Hetzner
| Device | Role | Tasks |
|---|---|---|
| MacBook | Development | Claude terminal, code development, browser (frontend tests) |
| Mac Mini | Local server | Docker for local dev/tests (NO LONGER for production!) |
| Hetzner | Production | CI/CD build + deploy via Gitea Actions |
IMPORTANT: Code is edited on the MacBook. Production deployment runs automatically on Hetzner via CI/CD.
Development workflow (CI/CD, since 2026-03-11)
# 1. Edit code on the MacBook (this directory)
# 2. Commit and push to BOTH remotes:
git push origin main && git push gitea main
# 3. DONE! Gitea Actions on Hetzner takes over automatically:
# Push to main → Lint → Tests → Build → Deploy
# Pipeline: .gitea/workflows/ci.yaml
# Duration: approx. 3 minutes
# Check status: https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance/actions
NO LONGER NEEDED: a manual ssh macmini "docker compose build". The CI/CD pipeline does this now!
CI/CD Pipeline (Gitea Actions → Hetzner)
Push to main → go-lint/python-lint/nodejs-lint (PRs only)
             → test-go-ai-compliance
             → test-python-backend-compliance
             → test-python-document-crawler
             → test-python-dsms-gateway
             → deploy-hetzner (only if ALL tests are green)
Files:
- .gitea/workflows/ci.yaml: pipeline definition
- docker-compose.hetzner.yml: override arm64→amd64 for Hetzner (x86_64)
- Deploy path on Hetzner: /opt/breakpilot-compliance/
deploy-hetzner sequence:
1. git pull in the deploy dir
2. docker compose -f docker-compose.yml -f docker-compose.hetzner.yml build --parallel
3. docker compose up -d --remove-orphans
4. Health checks
Local development (Mac Mini, optional)
# For local tests only, NOT for production:
ssh macmini "git -C /Users/benjaminadmin/Projekte/breakpilot-compliance pull --no-rebase origin main"
ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-compliance/docker-compose.yml build --no-cache <service>"
ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-compliance/docker-compose.yml up -d <service>"
# For fast iteration without a commit (rsync):
rsync -avz --exclude node_modules --exclude .next --exclude .git \
  admin-compliance/ macmini:~/Projekte/breakpilot-compliance/admin-compliance/
IMPORTANT: The Docker path on the Mac Mini is /usr/local/bin/docker (not in the default SSH PATH).
IMPORTANT: cd does NOT work in single SSH commands; always use -f <path>/docker-compose.yml!
Prerequisite
breakpilot-core MUST be running! This project uses Core services:
- Valkey (session cache)
- Vault (secrets)
- RAG service (vector search for compliance documents)
- Nginx (reverse proxy)
External services (Hetzner/meghshakka), since 2026-03-06:
- PostgreSQL 17 @ 46.225.100.82:54321 (sslmode=require), schemas: compliance (51), public (compliance_* + training_* + ucca_* + academy_*)
- Qdrant @ qdrant-dev.breakpilot.ai (HTTPS, API key)
- Object Storage @ nbg1.your-objectstorage.com (S3-compatible, TLS)
Config via .env on the Mac Mini (not in the repo): COMPLIANCE_DATABASE_URL, QDRANT_URL, QDRANT_API_KEY
Check: curl -sf http://macmini:8099/health
Main URLs
Production (Hetzner, primary)
| URL | Service | Description |
|---|---|---|
| https://admin-dev.breakpilot.ai/ | Admin Compliance | SDK dashboard, all compliance modules |
| https://developers-dev.breakpilot.ai/ | Developer Portal | API documentation for customers |
| https://api-dev.breakpilot.ai/ | Backend Compliance | Compliance APIs (DSGVO, DSR, GDPR) |
| https://sdk-dev.breakpilot.ai/ | AI Compliance SDK | AI-conformant compliance analysis |
Local (Mac Mini, dev/tests only)
| URL | Service | Description |
|---|---|---|
| https://macmini:3007/ | Admin Compliance | Local development |
| https://macmini:3006/ | Developer Portal | Local development |
| https://macmini:8002/ | Backend Compliance | Local development |
| https://macmini:8093/ | AI Compliance SDK | Local development |
Admin Compliance modules (https://macmini:3007/)
| Path | Module | Description |
|---|---|---|
| /dashboard | Dashboard | Overview + Catalog Manager |
| /sdk/tom | TOM | Technical and organizational measures |
| /sdk/dsfa | DSFA | Data protection impact assessment |
| /sdk/vvt | VVT | Record of processing activities |
| /sdk/loeschfristen | Retention periods | Retention period management |
| /sdk/ai-act | AI Act | AI regulation compliance |
| /sdk/consent | Consent | Consent management |
| /sdk/dsr | DSR | Data subject rights |
| /sdk/vendor-compliance | Vendor | Contract data processing |
| /sdk/risk-assessment | Risk | Risk assessment |
| /sdk/incident-response | Incidents | Data protection incidents |
| /sdk/training | Training | Employee training |
| /sdk/audit | Audit | Audit management |
| /sdk/policy-generator | Policies | Policy generator |
| /sdk/data-mapping | Data Map | Data flow mapping |
| /developers | Developer Portal | SDK API docs |
Services (10 Container)
| Service | Tech | Port | Container |
|---|---|---|---|
| admin-compliance | Next.js 15 | 3007 (via nginx) | bp-compliance-admin |
| backend-compliance | Python/FastAPI | 8002 | bp-compliance-backend |
| ai-compliance-sdk | Go/Gin | 8090→8093 | bp-compliance-ai-sdk |
| developer-portal | Next.js 15 | 3006 (via nginx) | bp-compliance-developer-portal |
| compliance-tts-service | Python/Piper TTS | 8095 | bp-compliance-tts |
| document-crawler | Python/FastAPI | 8098 | bp-compliance-document-crawler |
| dsms-node | IPFS Kubo | 4001/5001/8085 | bp-compliance-dsms-node |
| dsms-gateway | Node.js | 8082 | bp-compliance-dsms-gateway |
| docs | MkDocs/nginx | 8011 | bp-compliance-docs |
| core-wait | curl health-check | - | bp-compliance-core-wait |
compliance-tts-service
- Piper TTS + FFmpeg for training videos
- Stores audio/video in Hetzner Object Storage (nbg1.your-objectstorage.com)
- TTS model: de_DE-thorsten-high.onnx
- Files: main.py, tts_engine.py, video_generator.py, storage.py
document-crawler
- Document analysis: PDF, DOCX, XLSX, PPTX
- Gap analysis between existing documents and compliance requirements
- IPFS archiving via dsms-gateway
- Communicates with ai-compliance-sdk (LLM gateway)
Docker network
Uses the external Core network:
networks:
  breakpilot-network:
    external: true
    name: breakpilot-network
Container naming: bp-compliance-*
DB search_path: compliance,core,public
Directory structure
breakpilot-compliance/
├── .claude/
│   ├── CLAUDE.md # This file
│   └── rules/ # Automatic rules
├── admin-compliance/ # Next.js Compliance Dashboard
│   ├── app/(sdk)/ # 37 SDK route dirs
│   ├── app/(admin)/ # Dashboard + Catalog Manager
│   ├── components/sdk/ # SDKSidebar, CommandBar, ComplianceAdvisor
│   ├── components/catalog-manager/ # Shared Catalog UI
│   └── lib/sdk/ # SDK Context, Types, API-Client
├── backend-compliance/ # Python/FastAPI Backend
│   ├── compliance/ # Main package (40 files)
│   │   ├── api/ # API Router
│   │   ├── db/ # DB Models
│   │   ├── services/ # Business Logic
│   │   └── data/ # Master data
│   ├── consent_admin_api.py
│   ├── dsr_api.py
│   └── gdpr_api.py
├── ai-compliance-sdk/ # AI compliance analysis service
├── developer-portal/ # API documentation (Next.js)
├── breakpilot-compliance-sdk/ # SDK Package
├── consent-sdk/ # Consent SDK Package
├── pca-platform/ # Privacy Compliance Automation
├── dsms-node/ # IPFS Node
├── dsms-gateway/ # IPFS Gateway
├── scripts/ # Helper Scripts
├── docker-compose.yml # Compliance Compose (~10 Services, platform: arm64)
├── docker-compose.hetzner.yml # Override: arm64→amd64 for Hetzner
└── .gitea/workflows/ci.yaml # CI/CD Pipeline (Lint → Tests → Deploy)
Common commands
Deployment (CI/CD, the standard path)
# Commit and push → CI/CD deploys to Hetzner automatically:
git push origin main && git push gitea main
# Check CI status (in the browser):
# https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance/actions
# Health checks:
curl -sf https://api-dev.breakpilot.ai/health
curl -sf https://sdk-dev.breakpilot.ai/health
Git
# Push to BOTH remotes (MANDATORY, from the MacBook):
git push origin main && git push gitea main
# Remotes:
# origin: local Gitea (macmini:3003)
# gitea: gitea.meghsakha.com:22222
Local Docker commands (Mac Mini, dev/tests only)
# Logs
ssh macmini "/usr/local/bin/docker logs -f bp-compliance-<service>"
# Status
ssh macmini "/usr/local/bin/docker ps --filter name=bp-compliance"
# Local rebuild (only if needed):
ssh macmini "git -C /Users/benjaminadmin/Projekte/breakpilot-compliance pull --no-rebase origin main"
ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-compliance/docker-compose.yml build --no-cache <service> && /usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-compliance/docker-compose.yml up -d <service>"
Core principles
1. Open Source Policy
- ONLY open source with a license that permits commercial use
- Allowed: MIT, Apache-2.0, BSD, ISC, MPL-2.0, LGPL
- FORBIDDEN: GPL (except LGPL), AGPL, proprietary
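The license policy above can be expressed as a default-deny allowlist. The following is a minimal sketch under assumptions: identifiers are taken to be SPDX-style strings, matching is by prefix, and the function name `is_license_allowed` is hypothetical; the repository's actual license tooling is not described here.

```python
# Allowed license families from the policy, upper-cased for comparison.
ALLOWED_PREFIXES = ("MIT", "APACHE-2.0", "BSD", "ISC", "MPL-2.0", "LGPL")


def is_license_allowed(license_id: str) -> bool:
    """Return True only for license identifiers in an allowed family.

    Everything else (GPL, AGPL, proprietary, unknown) is rejected by default,
    so a forbidden list does not need to be maintained separately.
    """
    normalized = license_id.strip().upper()
    return normalized.startswith(ALLOWED_PREFIXES)


if __name__ == "__main__":
    for candidate in ("MIT", "BSD-3-Clause", "LGPL-2.1-or-later", "GPL-3.0-only", "AGPL-3.0-only"):
        print(candidate, is_license_allowed(candidate))
```

Prefix matching is deliberately coarse; a real check would more likely compare against an exact list of vetted identifiers.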
2. GDPR (DSGVO) compliance
- This project implements GDPR tools, so it must itself be GDPR-compliant
- Audit logging for all compliance actions
- Consent management via the Core consent-service
3. AI Act Compliance
- AI risk assessment for all AI features
- Ensure human oversight
- Transparency obligation whenever AI is used
4. Testing & documentation
- Tests are mandatory for every change
- Go through the compliance checklist for new features
5. Sensitive files
NEVER modify or commit:
- .env, .env.local, Vault tokens, SSL certificates
- *.pdf, *.docx, compiled binaries, large media
Tech stack
| Language | Services |
|---|---|
| Python/FastAPI | backend-compliance, pca-platform |
| Go/Gin | ai-compliance-sdk |
| TypeScript/Next.js | admin-compliance, developer-portal |
| Node.js | dsms-node, dsms-gateway, consent-sdk |
SDK modules in detail
Catalog system (shared with Lehrer)
- components/catalog-manager/: CatalogManagerContent, CatalogTable, CatalogModuleTabs, CatalogEntryForm
- lib/sdk/catalog-manager/: catalog-registry.ts, types.ts
- 17 GDPR/AI Act catalogs (dsfa, vvt-baseline, vendor-compliance, etc.)
Multi-project architecture (since 2026-03-09)
Each tenant can create multiple compliance projects. CompanyProfile is per project (not tenant-wide).
URL scheme: /sdk?project={uuid}. All SDK pages carry the ?project= query parameter.
/sdk without ?project= shows the project list (ProjectSelector).
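The URL rule can be illustrated with a small Python sketch (the real logic lives in the Next.js frontend). The function name `project_id_from_url` is hypothetical, and treating a malformed UUID the same as a missing parameter is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit
from uuid import UUID


def project_id_from_url(url: str) -> Optional[UUID]:
    """Extract the ?project= UUID from an SDK URL.

    Returns None when the parameter is absent, which corresponds to the
    case where the project list (ProjectSelector) is shown.
    """
    values = parse_qs(urlsplit(url).query).get("project")
    if not values:
        return None
    try:
        return UUID(values[0])
    except ValueError:
        # Assumption: a malformed id is handled like a missing one.
        return None


if __name__ == "__main__":
    print(project_id_from_url("https://macmini:3007/sdk/vvt?project=123e4567-e89b-12d3-a456-426614174000"))
    print(project_id_from_url("https://macmini:3007/sdk"))
```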
Database:
- compliance_projects: project metadata (name, type, status, version)
- sdk_states: UNIQUE on (tenant_id, project_id) instead of only tenant_id
- Migration: 039_compliance_projects.sql
Backend API (FastAPI):
GET /api/v1/projects → All projects of the tenant
POST /api/v1/projects → Create a new project (with copy_from_project_id)
GET /api/v1/projects/{project_id} → Load a single project
PATCH /api/v1/projects/{project_id} → Update a project
DELETE /api/v1/projects/{project_id} → Archive a project (soft delete)
Frontend:
- components/sdk/ProjectSelector/ProjectSelector.tsx: project list + create dialog
- lib/sdk/types.ts: ProjectInfo interface, SDKState.projectId
- lib/sdk/context.tsx: projectId prop, createProject(), listProjects(), switchProject()
- lib/sdk/sync.ts: BroadcastChannel + localStorage per project
- lib/sdk/api-client.ts: projectId in the state API + project CRUD methods
- app/sdk/layout.tsx: reads ?project= from searchParams
- app/api/sdk/v1/projects/: Next.js proxy to the backend
Multi-tab: Tab A (project X) and tab B (project Y) do not interfere with each other because they use separate BroadcastChannel + localStorage keys.
Master data copy: A new project with copy_from_project_id → the backend copies companyProfile from the source state. Afterwards it can be edited independently.
Backend-Compliance APIs
POST/GET /api/v1/compliance/risks
POST/GET /api/v1/compliance/controls
POST/GET /api/v1/compliance/requirements
POST/GET /api/v1/compliance/evidence
POST/GET /api/v1/dsr/requests
POST/GET /api/v1/gdpr/exports
POST/GET /api/v1/consent/admin
# Master data, versioning & change requests (Phase 1-6, 2026-03-07)
GET/POST/DELETE /api/compliance/company-profile
GET /api/compliance/company-profile/template-context
GET /api/compliance/change-requests
GET /api/compliance/change-requests/stats
POST /api/compliance/change-requests/{id}/accept
POST /api/compliance/change-requests/{id}/reject
POST /api/compliance/change-requests/{id}/edit
GET /api/compliance/generation/preview/{doc_type}
POST /api/compliance/generation/apply/{doc_type}
GET /api/compliance/{doc}/{id}/versions
Multi-Tenancy
- Shared dependency: compliance/api/tenant_utils.py (get_tenant_id())
- UUID format; "default" is no longer accepted
- Precedence: header X-Tenant-ID > query tenant_id > ENV fallback
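The precedence rule (header, then query parameter, then environment) can be sketched as a plain function. This is not the code of get_tenant_id() in tenant_utils.py: the function name `resolve_tenant_id`, the environment variable name, and raising ValueError on invalid input are all assumptions for illustration.

```python
import os
from typing import Mapping, Optional
from uuid import UUID

# Hypothetical variable name; the real ENV fallback is not named in this document.
ENV_VAR = "COMPLIANCE_DEFAULT_TENANT_ID"


def resolve_tenant_id(
    headers: Mapping[str, str],
    query: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve the tenant id: X-Tenant-ID header > tenant_id query > ENV fallback.

    Plain dicts are used here, so the header lookup is case-sensitive,
    unlike real HTTP header handling.
    """
    env = os.environ if env is None else env
    candidate = headers.get("X-Tenant-ID") or query.get("tenant_id") or env.get(ENV_VAR)
    if not candidate:
        raise ValueError("no tenant id supplied")
    try:
        # Legacy values such as "default" are not valid UUIDs and are rejected.
        return str(UUID(candidate))
    except ValueError:
        raise ValueError(f"tenant id must be a UUID, got {candidate!r}") from None


if __name__ == "__main__":
    header_id = "11111111-1111-1111-1111-111111111111"
    query_id = "22222222-2222-2222-2222-222222222222"
    # The header wins over the query parameter.
    print(resolve_tenant_id({"X-Tenant-ID": header_id}, {"tenant_id": query_id}, {}))
```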
Migrations (035-038)
| No. | File | Description |
|---|---|---|
| 035 | migrations/035_vvt_tenant_isolation.sql | VVT tenant_id + DSFA/Vendor default→UUID |
| 036 | migrations/036_company_profile_extend.sql | Master data JSONB + regulation flags |
| 037 | migrations/037_document_versions.sql | 5 version tables + current_version |
| 038 | migrations/038_change_requests.sql | Change requests + audit log |
New backend modules
| File | Description |
|---|---|
| compliance/api/tenant_utils.py | Shared tenant ID dependency |
| compliance/api/versioning_utils.py | Shared versioning helper |
| compliance/api/change_request_routes.py | CR CRUD + accept/reject/edit |
| compliance/api/change_request_engine.py | Rule-based CR generation |
| compliance/api/generation_routes.py | Document generation from master data |
| compliance/api/document_templates/ | 5 template generators (DSFA, VVT, TOM, etc.) |
Important files (reference)
| File | Description |
|---|---|
| admin-compliance/app/(sdk)/ | All 37+ SDK routes |
| admin-compliance/app/(sdk)/sdk/change-requests/page.tsx | Change request inbox |
| admin-compliance/components/sdk/Sidebar/SDKSidebar.tsx | SDK navigation (with CR badge) |
| admin-compliance/components/sdk/VersionHistory.tsx | Version timeline component |
| admin-compliance/components/sdk/CommandBar.tsx | Command palette |
| admin-compliance/lib/sdk/context.tsx | SDK state (provider) |
| backend-compliance/compliance/ | Main package (50+ files) |
| ai-compliance-sdk/ | AI compliance analysis |
| developer-portal/ | API documentation |