breakpilot-compliance/.claude/CLAUDE.md
Sharang Parnerkar 3320ef94fc refactor: phase 0 guardrails + phase 1 step 2 (models.py split)
Squash of branch refactor/phase0-guardrails-and-models-split — 4 commits,
81 files, 173/173 pytest green, OpenAPI contract preserved (360 paths /
484 operations).

## Phase 0 — Architecture guardrails

Three defense-in-depth layers to keep the architecture rules enforced
regardless of who opens Claude Code in this repo:

  1. .claude/settings.json PreToolUse hook on Write/Edit blocks any file
     that would exceed the 500-line hard cap. Auto-loads in every Claude
     session in this repo.
  2. scripts/githooks/pre-commit (install via scripts/install-hooks.sh)
     enforces the LOC cap locally, freezes migrations/ without
     [migration-approved], and protects guardrail files without
     [guardrail-change].
  3. .gitea/workflows/ci.yaml gains loc-budget + guardrail-integrity +
     sbom-scan (syft+grype) jobs, adds mypy --strict for the new Python
     packages (compliance/{services,repositories,domain,schemas}), and
     tsc --noEmit for admin-compliance + developer-portal.

Per-language conventions documented in AGENTS.python.md, AGENTS.go.md,
AGENTS.typescript.md at the repo root — layering, tooling, and explicit
"what you may NOT do" lists. Root CLAUDE.md is prepended with the six
non-negotiable rules. Each of the 10 services gets a README.md.

scripts/check-loc.sh enforces soft 300 / hard 500 and surfaces the
current baseline of 205 hard + 161 soft violations so Phases 1-4 can
drain it incrementally. CI gates only CHANGED files in PRs so the
legacy baseline does not block unrelated work.
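
As a rough illustration of the soft/hard budget, here is a minimal Python sketch. It is not the content of scripts/check-loc.sh, which is not shown here; the function names, the exact boundary handling (a file counts as a violation only when it exceeds the limit), and the `example/` path are assumptions.

```python
SOFT_LIMIT = 300  # soft target from the rules
HARD_LIMIT = 500  # hard cap from the rules


def classify_loc(line_count: int) -> str:
    """Return 'ok', 'soft', or 'hard' for a file with the given line count."""
    if line_count > HARD_LIMIT:
        return "hard"
    if line_count > SOFT_LIMIT:
        return "soft"
    return "ok"


def blocking_files(changed: dict[str, int]) -> list[str]:
    """Return the changed files that exceed the hard cap (hypothetical CI gate).

    Only files passed in are inspected, mirroring the idea that CI gates
    changed files and leaves the legacy baseline alone.
    """
    return sorted(path for path, n in changed.items() if classify_loc(n) == "hard")


if __name__ == "__main__":
    changed = {
        "compliance/db/isms_audit_models.py": 468,  # size from this commit message
        "compliance/db/models.py": 85,              # size of the new shim
        "example/oversized_module.py": 1466,        # hypothetical path
    }
    for path, n in changed.items():
        print(f"{path}: {n} lines -> {classify_loc(n)}")
    print("blocking:", blocking_files(changed))
```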

## Deprecation sweep

47 files. Pydantic V1 regex= -> pattern= (2 sites), class Config ->
ConfigDict in source_policy_router.py (schemas.py intentionally skipped;
it is the Phase 1 Step 3 split target). datetime.utcnow() ->
datetime.now(timezone.utc) everywhere including SQLAlchemy default=
callables. All DB columns already declare timezone=True, so this is a
latent-bug fix at the Python side, not a schema change.
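
The datetime change can be sketched as follows. The helper name `utc_now` is hypothetical; the commit message only states that `datetime.utcnow()` was replaced by `datetime.now(timezone.utc)`.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


if __name__ == "__main__":
    aware = utc_now()
    print(aware.tzinfo)       # UTC
    print(aware.utcoffset())  # 0:00:00
    # For a SQLAlchemy default, the callable itself is passed (not its result),
    # so the timestamp is computed per row rather than once at import time, e.g.
    #   Column(DateTime(timezone=True), default=utc_now)
```

Unlike `datetime.utcnow()`, the returned value carries `tzinfo`, so it matches columns declared with `timezone=True`.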

DeprecationWarning count dropped from 158 to 35.

## Phase 1 Step 1 — Contract test harness

tests/contracts/test_openapi_baseline.py diffs the live FastAPI /openapi.json
against tests/contracts/openapi.baseline.json on every test run. Fails on
removed paths, removed status codes, or new required request body fields.
Regenerate only via tests/contracts/regenerate_baseline.py after a
consumer-updated contract change. This is the safety harness for all
subsequent refactor commits.
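
A simplified sketch of the three checks named above, written against plain dictionaries. It is not the code of tests/contracts/test_openapi_baseline.py; the function names are hypothetical, `$ref` schemas are not resolved, and reporting a removed operation is an addition of this sketch.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def _required_body_fields(op: dict) -> list[str]:
    """Required JSON body fields, assuming an inline (non-$ref) schema."""
    schema = (
        op.get("requestBody", {})
        .get("content", {})
        .get("application/json", {})
        .get("schema", {})
    )
    return list(schema.get("required", []))


def breaking_changes(baseline: dict, live: dict) -> list[str]:
    """List contract breaks between a baseline and a live OpenAPI document."""
    problems: list[str] = []
    for path, base_ops in baseline.get("paths", {}).items():
        live_ops = live.get("paths", {}).get(path)
        if live_ops is None:
            problems.append(f"removed path: {path}")
            continue
        for method, base_op in base_ops.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            label = f"{method.upper()} {path}"
            live_op = live_ops.get(method)
            if live_op is None:
                problems.append(f"removed operation: {label}")
                continue
            for status in base_op.get("responses", {}):
                if status not in live_op.get("responses", {}):
                    problems.append(f"removed status {status}: {label}")
            known = set(_required_body_fields(base_op))
            for field in _required_body_fields(live_op):
                if field not in known:
                    problems.append(f"new required field '{field}': {label}")
    return problems


if __name__ == "__main__":
    def body(required: list[str]) -> dict:
        return {"content": {"application/json": {"schema": {"required": required}}}}

    # Status codes and field names below are illustrative only.
    baseline = {"paths": {"/api/v1/projects": {
        "post": {"responses": {"201": {}, "422": {}}, "requestBody": body(["name"])},
    }}}
    live = {"paths": {"/api/v1/projects": {
        "post": {"responses": {"201": {}}, "requestBody": body(["name", "owner"])},
    }}}
    for problem in breaking_changes(baseline, live):
        print(problem)
```

Additions such as new paths or new optional fields pass unreported, which is the intended asymmetry of this kind of check.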

## Phase 1 Step 2 — models.py split (1466 -> 85 LOC shim)

compliance/db/models.py is decomposed into seven sibling aggregate modules
following the existing repo pattern (dsr_models.py, vvt_models.py, ...):

  regulation_models.py       (134) — Regulation, Requirement
  control_models.py          (279) — Control, Mapping, Evidence, Risk
  ai_system_models.py        (141) — AISystem, AuditExport
  service_module_models.py   (176) — ServiceModule, ModuleRegulation, ModuleRisk
  audit_session_models.py    (177) — AuditSession, AuditSignOff
  isms_governance_models.py  (323) — ISMSScope, Context, Policy, Objective, SoA
  isms_audit_models.py       (468) — Finding, CAPA, MgmtReview, InternalAudit,
                                     AuditTrail, Readiness

models.py becomes an 85-line re-export shim in dependency order so
existing imports continue to work unchanged. Schema is byte-identical:
__tablename__, column definitions, relationship strings, back_populates,
cascade directives all preserved.
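
The re-export idea can be demonstrated in isolation. The sketch below builds two throwaway in-memory modules (`regulation_models_demo` and `models_demo`, both hypothetical stand-ins) rather than importing the real packages, and shows that the old import path and the new one yield the same class object.

```python
import sys
import types

# Stand-in for a sibling module such as regulation_models.py.
sibling = types.ModuleType("regulation_models_demo")
exec(
    "class Regulation:\n"
    "    __tablename__ = 'regulations_demo'  # hypothetical table name\n",
    sibling.__dict__,
)
sys.modules[sibling.__name__] = sibling

# Stand-in for the shim: it defines nothing itself and only re-exports.
shim = types.ModuleType("models_demo")
exec(
    "from regulation_models_demo import Regulation\n"
    "__all__ = ['Regulation']\n",
    shim.__dict__,
)
sys.modules[shim.__name__] = shim

# Existing code keeps importing from the shim; new code may import the sibling.
from models_demo import Regulation as via_shim
from regulation_models_demo import Regulation as via_sibling

same_object = via_shim is via_sibling
print(same_object)
```

Because the shim re-exports rather than redefines, there is only one class per model, no matter which import path a caller uses.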

All new sibling files are under the 500-line hard cap; largest is
isms_audit_models.py at 468. No file in compliance/db/ now exceeds
the hard cap.

## Phase 1 Step 3 — infrastructure only

backend-compliance/compliance/{schemas,domain,repositories}/ packages
are created as landing zones with docstrings. compliance/domain/
exports DomainError / NotFoundError / ConflictError / ValidationError /
PermissionError — the base classes services will use to raise
domain-level errors instead of HTTPException.
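
A minimal sketch of how such a hierarchy might be used. The class names come from the list above; the status-code mapping, `to_status`, and `get_control` are assumptions for illustration and are not taken from compliance/domain/.

```python
class DomainError(Exception):
    """Base class for errors raised by the service layer."""


class NotFoundError(DomainError):
    """The requested entity does not exist."""


class ConflictError(DomainError):
    """The operation conflicts with existing state."""


class ValidationError(DomainError):
    """The input violates a domain rule."""


class PermissionError(DomainError):  # shadows the builtin inside this module
    """The caller is not allowed to perform the operation."""


# Assumed mapping; the actual status codes are not stated in this document.
STATUS_BY_ERROR: dict[type[DomainError], int] = {
    NotFoundError: 404,
    ConflictError: 409,
    ValidationError: 422,
    PermissionError: 403,
}


def to_status(exc: DomainError) -> int:
    """Translate a domain error to an HTTP status at the router boundary."""
    for error_type, status in STATUS_BY_ERROR.items():
        if isinstance(exc, error_type):
            return status
    return 400  # assumed fallback for a bare DomainError


def get_control(controls: dict[str, str], control_id: str) -> str:
    """Hypothetical service function: raises a domain error, not HTTPException."""
    try:
        return controls[control_id]
    except KeyError:
        raise NotFoundError(f"control {control_id} not found") from None


if __name__ == "__main__":
    try:
        get_control({}, "c-1")
    except DomainError as exc:
        print(type(exc).__name__, to_status(exc))
```

Keeping the translation to HTTP in one place lets services stay free of framework imports.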

PHASE1_RUNBOOK.md at backend-compliance/PHASE1_RUNBOOK.md documents
the nine-step execution plan for Phase 1: snapshot baseline,
characterization tests, split models.py (this commit), split schemas.py
(next), extract services, extract repositories, mypy --strict, coverage.

## Verification

  backend-compliance/.venv-phase1: uv python install 3.12 + pip install -r requirements.txt
  PYTHONPATH=. pytest compliance/tests/ tests/contracts/
  -> 173 passed, 0 failed, 35 warnings, OpenAPI 360/484 unchanged

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:18:29 +02:00


BreakPilot Compliance - DSGVO/AI-Act SDK Platform

NON-NEGOTIABLE STRUCTURE RULES (enforced by .claude/settings.json hook, git pre-commit, and CI):

  1. File-size budget: soft target 300 lines, hard cap 500 lines for any non-test, non-generated source file. Anything larger → split it. Exceptions are listed in .claude/rules/loc-exceptions.txt and require a written rationale.
  2. Clean architecture per service. Routers/handlers stay thin (≤30 lines per handler) and delegate to services; services use repositories; repositories own DB I/O. See AGENTS.python.md / AGENTS.go.md / AGENTS.typescript.md.
  3. Do not touch the database schema. No new Alembic migrations, no ALTER TABLE, no model field renames without an explicit migration plan reviewed by the DB owner. SQLAlchemy __tablename__ and column names are frozen.
  4. Public endpoints are a contract. Any change to a path, method, status code, request schema, or response schema in backend-compliance/, ai-compliance-sdk/, dsms-gateway/, document-crawler/, or compliance-tts-service/ must be accompanied by a matching update in every consumer (admin-compliance/, developer-portal/, breakpilot-compliance-sdk/, consent-sdk/). Use the OpenAPI snapshot tests in tests/contracts/ as the gate.
  5. Tests are not optional. New code without tests fails CI. Refactors must preserve coverage and add a characterization test before splitting an oversized file.
  6. Do not bypass the guardrails. Do not edit .claude/settings.json, scripts/check-loc.sh, or the loc-exceptions list to silence violations. If a rule is wrong, raise it in a PR description.

These rules apply to every Claude Code session opened inside this repository, regardless of who launched it. They are loaded automatically via this CLAUDE.md.
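
Rule 2 can be pictured with a small, framework-free sketch. All names (`Risk`, `RiskRepository`, `RiskService`, `create_risk_handler`) are hypothetical, and an in-memory dict stands in for the database; the authoritative conventions are in AGENTS.python.md.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    id: int
    title: str


class RiskRepository:
    """Owns storage access; an in-memory dict stands in for the database."""

    def __init__(self) -> None:
        self._rows: dict[int, Risk] = {}

    def add(self, title: str) -> Risk:
        risk = Risk(id=len(self._rows) + 1, title=title)
        self._rows[risk.id] = risk
        return risk


class RiskService:
    """Business rules live here; no HTTP or SQL details."""

    def __init__(self, repo: RiskRepository) -> None:
        self._repo = repo

    def create_risk(self, title: str) -> Risk:
        cleaned = title.strip()
        if not cleaned:
            raise ValueError("title must not be empty")
        return self._repo.add(cleaned)


def create_risk_handler(payload: dict, service: RiskService) -> dict:
    """Thin handler: read input, delegate to the service, shape the response."""
    risk = service.create_risk(payload.get("title", ""))
    return {"id": risk.id, "title": risk.title}


if __name__ == "__main__":
    service = RiskService(RiskRepository())
    print(create_risk_handler({"title": "  Data loss  "}, service))
```

The handler stays a few lines long because validation sits in the service and persistence sits in the repository.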

Development environment (IMPORTANT - ALWAYS READ FIRST)

Two-machine setup + Hetzner

| Machine | Role | Tasks |
|---------|------|-------|
| MacBook | Development | Claude terminal, code development, browser (frontend tests) |
| Mac Mini | Local server | Docker for local dev/tests (NO longer for production!) |
| Hetzner | Production | CI/CD build + deploy via Gitea Actions |

IMPORTANT: Code is edited on the MacBook. Production deployment runs automatically on Hetzner via CI/CD.

Development workflow (CI/CD, since 2026-03-11)

# 1. Edit code on the MacBook (this directory)
# 2. Commit and push to BOTH remotes:
git push origin main && git push gitea main

# 3. DONE! Gitea Actions on Hetzner takes over automatically:
#    Push to main → Lint → Tests → Build → Deploy
#    Pipeline: .gitea/workflows/ci.yaml
#    Duration: approx. 3 minutes
#    Check status: https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance/actions

NO LONGER NEEDED: running ssh macmini "docker compose build" by hand. The CI/CD pipeline does that now!

CI/CD Pipeline (Gitea Actions → Hetzner)

Push to main → go-lint/python-lint/nodejs-lint (PRs only)
             → test-go-ai-compliance
             → test-python-backend-compliance
             → test-python-document-crawler
             → test-python-dsms-gateway
             → deploy-hetzner (only if ALL tests are green)

Files:

  • .gitea/workflows/ci.yaml: pipeline definition
  • docker-compose.hetzner.yml: override arm64→amd64 for Hetzner (x86_64)
  • Deploy path on Hetzner: /opt/breakpilot-compliance/

deploy-hetzner sequence:

  1. git pull in the deploy directory
  2. docker compose -f docker-compose.yml -f docker-compose.hetzner.yml build --parallel
  3. docker compose up -d --remove-orphans
  4. Health checks

Local development (Mac Mini, optional)

# For local tests only, NOT for production:
ssh macmini "git -C /Users/benjaminadmin/Projekte/breakpilot-compliance pull --no-rebase origin main"
ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-compliance/docker-compose.yml build --no-cache <service>"
ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-compliance/docker-compose.yml up -d <service>"

# For quick iteration without a commit (rsync):
rsync -avz --exclude node_modules --exclude .next --exclude .git \
  admin-compliance/ macmini:~/Projekte/breakpilot-compliance/admin-compliance/

IMPORTANT: The Docker path on the Mac Mini is /usr/local/bin/docker (not in the default SSH PATH).

IMPORTANT: cd does NOT work in single SSH commands; always use -f <path>/docker-compose.yml!


Prerequisite

breakpilot-core MUST be running! This project uses core services:

  • Valkey (session cache)
  • Vault (secrets)
  • RAG service (vector search for compliance documents)
  • Nginx (reverse proxy)

External services (Hetzner/meghsakha), since 2026-03-06:

  • PostgreSQL 17 @ 46.225.100.82:54321 (sslmode=require); schemas: compliance (51), public (compliance_* + training_* + ucca_* + academy_*)
  • Qdrant @ qdrant-dev.breakpilot.ai (HTTPS, API key)
  • Object Storage @ nbg1.your-objectstorage.com (S3-compatible, TLS)

Config via .env on the Mac Mini (not in the repo): COMPLIANCE_DATABASE_URL, QDRANT_URL, QDRANT_API_KEY

Check: curl -sf http://macmini:8099/health


Main URLs

Production (Hetzner, primary)

| URL | Service | Description |
|-----|---------|-------------|
| https://admin-dev.breakpilot.ai/ | Admin Compliance | SDK dashboard, all compliance modules |
| https://developers-dev.breakpilot.ai/ | Developer Portal | API documentation for customers |
| https://api-dev.breakpilot.ai/ | Backend Compliance | Compliance APIs (DSGVO, DSR, GDPR) |
| https://sdk-dev.breakpilot.ai/ | AI Compliance SDK | AI-compliant compliance analysis |

Local (Mac Mini, dev/tests only)

| URL | Service | Description |
|-----|---------|-------------|
| https://macmini:3007/ | Admin Compliance | Local development |
| https://macmini:3006/ | Developer Portal | Local development |
| https://macmini:8002/ | Backend Compliance | Local development |
| https://macmini:8093/ | AI Compliance SDK | Local development |

Admin Compliance modules (https://macmini:3007/)

| Path | Module | Description |
|------|--------|-------------|
| /dashboard | Dashboard | Overview + Catalog-Manager |
| /sdk/tom | TOM | Technical and organizational measures |
| /sdk/dsfa | DSFA | Data protection impact assessment |
| /sdk/vvt | VVT | Records of processing activities |
| /sdk/loeschfristen | Loeschfristen | Retention period management |
| /sdk/ai-act | AI Act | AI Act compliance |
| /sdk/consent | Consent | Consent management |
| /sdk/dsr | DSR | Data subject rights |
| /sdk/vendor-compliance | Vendor | Commissioned data processing (processors) |
| /sdk/risk-assessment | Risiko | Risk assessment |
| /sdk/incident-response | Vorfaelle | Data protection incidents |
| /sdk/training | Schulung | Employee training |
| /sdk/audit | Audit | Audit management |
| /sdk/policy-generator | Policies | Policy generator |
| /sdk/data-mapping | Data Map | Data flow mapping |
| /developers | Developer Portal | SDK API docs |

Services (10 containers)

| Service | Tech | Port | Container |
|---------|------|------|-----------|
| admin-compliance | Next.js 15 | 3007 (via nginx) | bp-compliance-admin |
| backend-compliance | Python/FastAPI | 8002 | bp-compliance-backend |
| ai-compliance-sdk | Go/Gin | 8090→8093 | bp-compliance-ai-sdk |
| developer-portal | Next.js 15 | 3006 (via nginx) | bp-compliance-developer-portal |
| compliance-tts-service | Python/Piper TTS | 8095 | bp-compliance-tts |
| document-crawler | Python/FastAPI | 8098 | bp-compliance-document-crawler |
| dsms-node | IPFS Kubo | 4001/5001/8085 | bp-compliance-dsms-node |
| dsms-gateway | Node.js | 8082 | bp-compliance-dsms-gateway |
| docs | MkDocs/nginx | 8011 | bp-compliance-docs |
| core-wait | curl health-check | - | bp-compliance-core-wait |

compliance-tts-service

  • Piper TTS + FFmpeg for training videos
  • Stores audio/video in Hetzner Object Storage (nbg1.your-objectstorage.com)
  • TTS model: de_DE-thorsten-high.onnx
  • Files: main.py, tts_engine.py, video_generator.py, storage.py

document-crawler

  • Document analysis: PDF, DOCX, XLSX, PPTX
  • Gap analysis between existing documents and compliance requirements
  • IPFS archiving via dsms-gateway
  • Communicates with ai-compliance-sdk (LLM gateway)

Docker network

Uses the external core network:

networks:
  breakpilot-network:
    external: true
    name: breakpilot-network

Container naming: bp-compliance-*

DB search_path: compliance,core,public


Directory structure

breakpilot-compliance/
├── .claude/
│   ├── CLAUDE.md             # This file
│   └── rules/                # Automatic rules
├── admin-compliance/         # Next.js compliance dashboard
│   ├── app/(sdk)/            # 37 SDK route dirs
│   ├── app/(admin)/          # Dashboard + Catalog-Manager
│   ├── components/sdk/       # SDKSidebar, CommandBar, ComplianceAdvisor
│   ├── components/catalog-manager/  # Shared catalog UI
│   └── lib/sdk/              # SDK context, types, API client
├── backend-compliance/       # Python/FastAPI backend
│   ├── compliance/           # Main package (40 files)
│   │   ├── api/              # API routers
│   │   ├── db/               # DB models
│   │   ├── services/         # Business logic
│   │   └── data/             # Master data
│   ├── consent_admin_api.py
│   ├── dsr_api.py
│   └── gdpr_api.py
├── ai-compliance-sdk/        # AI compliance analysis service
├── developer-portal/         # API documentation (Next.js)
├── breakpilot-compliance-sdk/ # SDK package
├── consent-sdk/              # Consent SDK package
├── pca-platform/             # Privacy Compliance Automation
├── dsms-node/                # IPFS node
├── dsms-gateway/             # IPFS gateway
├── scripts/                  # Helper scripts
├── docker-compose.yml        # Compliance Compose (~10 services, platform: arm64)
├── docker-compose.hetzner.yml # Override: arm64→amd64 for Hetzner
└── .gitea/workflows/ci.yaml  # CI/CD pipeline (Lint → Tests → Deploy)

Common commands

Deployment (CI/CD, the standard path)

# Commit and push → CI/CD deploys to Hetzner automatically:
git push origin main && git push gitea main

# Check CI status (in the browser):
# https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance/actions

# Health checks:
curl -sf https://api-dev.breakpilot.ai/health
curl -sf https://sdk-dev.breakpilot.ai/health

Git

# Push to BOTH remotes (MANDATORY, from the MacBook):
git push origin main && git push gitea main

# Remotes:
# origin: local Gitea (macmini:3003)
# gitea:  gitea.meghsakha.com:22222

Local Docker commands (Mac Mini, dev/tests only)

# Logs
ssh macmini "/usr/local/bin/docker logs -f bp-compliance-<service>"

# Status
ssh macmini "/usr/local/bin/docker ps --filter name=bp-compliance"

# Local rebuild (only if needed):
ssh macmini "git -C /Users/benjaminadmin/Projekte/breakpilot-compliance pull --no-rebase origin main"
ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-compliance/docker-compose.yml build --no-cache <service> && /usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-compliance/docker-compose.yml up -d <service>"

Core principles

1. Open Source Policy

  • ONLY open source with a license that permits commercial use
  • Allowed: MIT, Apache-2.0, BSD, ISC, MPL-2.0, LGPL
  • PROHIBITED: GPL (except LGPL), AGPL, proprietary

2. DSGVO compliance

  • This project implements DSGVO (GDPR) tools, so it must itself be DSGVO-compliant
  • Audit logging for all compliance actions
  • Consent management via the core consent-service

3. AI Act compliance

  • AI risk assessment for all AI features
  • Ensure human oversight
  • Transparency obligation whenever AI is used

4. Testing & documentation

  • Tests are mandatory for every change
  • Work through the compliance checklist for new features

5. Sensitive files

NEVER modify or commit:

  • .env, .env.local, Vault tokens, SSL certificates
  • *.pdf, *.docx, compiled binaries, large media

Tech stack

| Language | Services |
|----------|----------|
| Python/FastAPI | backend-compliance, pca-platform |
| Go/Gin | ai-compliance-sdk |
| TypeScript/Next.js | admin-compliance, developer-portal |
| Node.js | dsms-node, dsms-gateway, consent-sdk |

SDK modules in detail

Catalog system (shared with Lehrer)

  • components/catalog-manager/: CatalogManagerContent, CatalogTable, CatalogModuleTabs, CatalogEntryForm
  • lib/sdk/catalog-manager/: catalog-registry.ts, types.ts
  • 17 DSGVO/AI Act catalogs (dsfa, vvt-baseline, vendor-compliance, etc.)

Multi-project architecture (since 2026-03-09)

Each tenant can create multiple compliance projects. CompanyProfile is per project (not tenant-wide).

URL scheme: /sdk?project={uuid}. All SDK pages carry the ?project= query param. /sdk without ?project= shows the project list (ProjectSelector).

Database:

  • compliance_projects: project metadata (name, type, status, version)
  • sdk_states: UNIQUE on (tenant_id, project_id) instead of tenant_id alone
  • Migration: 039_compliance_projects.sql
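
The effect of the composite UNIQUE constraint can be tried out with an in-memory SQLite table. This is only a sketch: the real database is PostgreSQL, the column set here is reduced to what the constraint needs, and the `state` column and helper names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with the composite uniqueness rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sdk_states (
            tenant_id  TEXT NOT NULL,
            project_id TEXT NOT NULL,
            state      TEXT NOT NULL,
            UNIQUE (tenant_id, project_id)
        )
        """
    )
    return conn


def insert_state(conn: sqlite3.Connection, tenant_id: str, project_id: str, state: str) -> bool:
    """Insert a row; return False if (tenant_id, project_id) already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sdk_states (tenant_id, project_id, state) VALUES (?, ?, ?)",
                (tenant_id, project_id, state),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_demo_db()
    print(insert_state(db, "tenant-1", "project-x", "{}"))  # True
    print(insert_state(db, "tenant-1", "project-y", "{}"))  # True: same tenant, other project
    print(insert_state(db, "tenant-1", "project-x", "{}"))  # False: duplicate pair
```

With uniqueness on tenant_id alone, the second insert would already fail, which is why the constraint had to be widened for multiple projects per tenant.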

Backend API (FastAPI):

GET    /api/v1/projects              → All projects of the tenant
POST   /api/v1/projects              → Create a new project (with copy_from_project_id)
GET    /api/v1/projects/{project_id} → Load a single project
PATCH  /api/v1/projects/{project_id} → Update a project
DELETE /api/v1/projects/{project_id} → Archive a project (soft delete)

Frontend:

  • components/sdk/ProjectSelector/ProjectSelector.tsx: project list + create dialog
  • lib/sdk/types.ts: ProjectInfo interface, SDKState.projectId
  • lib/sdk/context.tsx: projectId prop, createProject(), listProjects(), switchProject()
  • lib/sdk/sync.ts: BroadcastChannel + localStorage per project
  • lib/sdk/api-client.ts: projectId in the state API + project CRUD methods
  • app/sdk/layout.tsx: reads ?project= from searchParams
  • app/api/sdk/v1/projects/: Next.js proxy to the backend

Multi-tab: Tab A (project X) and Tab B (project Y) do not interfere with each other because each project uses separate BroadcastChannel and localStorage keys.

Master data copy: New project with copy_from_project_id → the backend copies companyProfile from the source state. After that, the copy can be edited independently.

Backend-Compliance APIs

POST/GET /api/v1/compliance/risks
POST/GET /api/v1/compliance/controls
POST/GET /api/v1/compliance/requirements
POST/GET /api/v1/compliance/evidence
POST/GET /api/v1/dsr/requests
POST/GET /api/v1/gdpr/exports
POST/GET /api/v1/consent/admin

# Master data, versioning & change requests (Phase 1-6, 2026-03-07)
GET/POST/DELETE /api/compliance/company-profile
GET /api/compliance/company-profile/template-context
GET /api/compliance/change-requests
GET /api/compliance/change-requests/stats
POST /api/compliance/change-requests/{id}/accept
POST /api/compliance/change-requests/{id}/reject
POST /api/compliance/change-requests/{id}/edit
GET /api/compliance/generation/preview/{doc_type}
POST /api/compliance/generation/apply/{doc_type}
GET /api/compliance/{doc}/{id}/versions

Multi-Tenancy

  • Shared dependency: compliance/api/tenant_utils.py (get_tenant_id())
  • UUID format, no more "default"
  • Precedence: header X-Tenant-ID > query tenant_id > ENV fallback

Migrations (035-038)

| No. | File | Description |
|-----|------|-------------|
| 035 | migrations/035_vvt_tenant_isolation.sql | VVT tenant_id + DSFA/Vendor default→UUID |
| 036 | migrations/036_company_profile_extend.sql | Master data JSONB + regulation flags |
| 037 | migrations/037_document_versions.sql | 5 version tables + current_version |
| 038 | migrations/038_change_requests.sql | Change requests + audit log |

New backend modules

| File | Description |
|------|-------------|
| compliance/api/tenant_utils.py | Shared tenant ID dependency |
| compliance/api/versioning_utils.py | Shared versioning helper |
| compliance/api/change_request_routes.py | CR CRUD + accept/reject/edit |
| compliance/api/change_request_engine.py | Rule-based CR generation |
| compliance/api/generation_routes.py | Document generation from master data |
| compliance/api/document_templates/ | 5 template generators (DSFA, VVT, TOM, etc.) |

Important files (reference)

| File | Description |
|------|-------------|
| admin-compliance/app/(sdk)/ | All 37+ SDK routes |
| admin-compliance/app/(sdk)/sdk/change-requests/page.tsx | Change request inbox |
| admin-compliance/components/sdk/Sidebar/SDKSidebar.tsx | SDK navigation (with CR badge) |
| admin-compliance/components/sdk/VersionHistory.tsx | Version timeline component |
| admin-compliance/components/sdk/CommandBar.tsx | Command palette |
| admin-compliance/lib/sdk/context.tsx | SDK state (provider) |
| backend-compliance/compliance/ | Main package (50+ files) |
| ai-compliance-sdk/ | AI compliance analysis |
| developer-portal/ | API documentation |