The monolithic compliance/db/models.py is decomposed into seven sibling
aggregate modules following the existing repo pattern (dsr_models.py,
vvt_models.py, tom_models.py, etc.):
regulation_models.py (134 LOC) — RegulationDB, RequirementDB
control_models.py (279 LOC) — ControlDB, ControlMappingDB, EvidenceDB, RiskDB
ai_system_models.py (141 LOC) — AISystemDB, AuditExportDB
service_module_models.py (176 LOC) — ServiceModuleDB, ModuleRegulationMappingDB, ModuleRiskDB
audit_session_models.py (177 LOC) — AuditSessionDB, AuditSignOffDB
isms_governance_models.py (323 LOC) — ISMSScope, Context, Policy, Objective, SoA
isms_audit_models.py (468 LOC) — AuditFinding, CAPA, ManagementReview, InternalAudit,
AuditTrail, ReadinessCheck
models.py becomes an 85-line re-export shim — every public symbol is
re-exported in dependency order so existing imports work unchanged:
from compliance.db.models import RegulationDB, ControlDB, AuditFindingDB # still works
New code SHOULD import from the aggregate module directly; the shim is
for backwards compatibility during the migration.
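One way to check the "every public symbol is re-exported" claim is an identity comparison between the shim and each aggregate module. The sketch below is illustrative only: `missing_reexports` is a hypothetical helper, not part of this commit, and it uses stand-in modules so it runs without the real package.

```python
import types

_MISSING = object()


def missing_reexports(shim, aggregates):
    """List public names in `aggregates` that `shim` does not expose as the same object."""
    missing = []
    for module in aggregates:
        for name, obj in vars(module).items():
            if name.startswith("_"):
                continue  # skip private names and module dunders
            if getattr(shim, name, _MISSING) is not obj:
                missing.append(f"{module.__name__}.{name}")
    return sorted(missing)


# Stand-in modules (assumption: the real ones live under compliance.db).
regulation_models = types.ModuleType("regulation_models")
regulation_models.RegulationDB = type("RegulationDB", (), {})
regulation_models.RequirementDB = type("RequirementDB", (), {})

shim = types.ModuleType("models")
shim.RegulationDB = regulation_models.RegulationDB  # re-exported
# RequirementDB deliberately left out to show a detected gap.

print(missing_reexports(shim, [regulation_models]))
# ['regulation_models.RequirementDB']
```

Against real modules the loop would also see names the aggregate merely imports (for example SQLAlchemy's Column), so a practical version would probably filter on `obj.__module__`.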
Schema freeze preserved:
- __tablename__ byte-identical
- Column names, types, indexes, constraints byte-identical
- relationship() string references and back_populates unchanged
- cascade directives unchanged
Verified:
- 173/173 pytest compliance/tests/ pass
- tests/contracts/test_openapi_baseline.py passes (360 paths,
484 operations — identical to baseline)
- All new sibling files under the 500-line hard cap
(largest: isms_audit_models.py at 468 LOC)
- No file in compliance/db/ now exceeds the hard cap
This is Phase 1 Step 2 from PHASE1_RUNBOOK.md. Phase 1 Step 3 (split
compliance/api/schemas.py, 1899 LOC) is the next target.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds tests/contracts/test_openapi_baseline.py which loads the live
FastAPI app and diffs its OpenAPI schema against a checked-in baseline.
Fails on:
- Any removed path or operation
- Any removed response status code on an existing operation
- Any new required request body field (would break existing clients)
Passes silently on additive changes. The baseline is regenerated by
running tests/contracts/regenerate_baseline.py — only when a contract
change has been reviewed and every consumer (admin-compliance,
developer-portal, SDKs) has been updated in the same change set.
This is the safety harness for the Phase 1 backend-compliance refactor:
every subsequent refactor commit must keep this test green.
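The three failure conditions above can be sketched as a pure function over two OpenAPI dictionaries. This is a minimal illustration, not the code in test_openapi_baseline.py: the function names are hypothetical, and it only inspects inline application/json request schemas, whereas FastAPI usually emits $ref pointers that a real check would have to resolve.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def _required_body_fields(operation):
    """Required top-level fields of an inline JSON request body schema."""
    schema = (
        operation.get("requestBody", {})
        .get("content", {})
        .get("application/json", {})
        .get("schema", {})
    )
    return set(schema.get("required", []))


def breaking_changes(baseline, current):
    """Return human-readable descriptions of contract-breaking differences."""
    problems = []
    for path, base_item in baseline.get("paths", {}).items():
        cur_item = current.get("paths", {}).get(path)
        if cur_item is None:
            problems.append(f"removed path: {path}")
            continue
        for method, base_op in base_item.items():
            if method not in HTTP_METHODS:
                continue  # e.g. path-level "parameters"
            label = f"{method.upper()} {path}"
            cur_op = cur_item.get(method)
            if cur_op is None:
                problems.append(f"removed operation: {label}")
                continue
            for status in base_op.get("responses", {}):
                if status not in cur_op.get("responses", {}):
                    problems.append(f"removed response {status}: {label}")
            added = _required_body_fields(cur_op) - _required_body_fields(base_op)
            for field in sorted(added):
                problems.append(f"new required field {field!r}: {label}")
    return problems
```

Paths, operations, and optional fields that exist only in `current` are never visited, which is what makes additive changes pass silently.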
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two low-risk Pydantic V1 idioms that will be hard errors in V3:
- Query(regex=...) -> Query(pattern=...) (audit_routes, control_generator_routes)
- class Config: from_attributes=True -> model_config = ConfigDict(...)
in source_policy_router.py (schemas.py is intentionally skipped — it is
the Phase 1 schema-split target and the ConfigDict conversion is most
efficient to do during that split).
Naive -> aware datetime sweep across 47 files:
- datetime.utcnow() -> datetime.now(timezone.utc)
- default=datetime.utcnow -> default=lambda: datetime.now(timezone.utc)
- onupdate=datetime.utcnow -> onupdate=lambda: datetime.now(timezone.utc)
All SQLAlchemy DateTime columns in the project already declare
timezone=True, so the DB schema expects aware datetimes. Before this
commit, the in-Python side was generating naive values and the driver
was silently coercing them. This is a latent-bug fix, not a behavior
change at the DB boundary.
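The difference between the old and new idiom is visible with the standard library alone. A small sketch (the helper name `utc_now` is hypothetical; the commit itself uses inline lambdas):

```python
from datetime import datetime, timezone


def utc_now():
    """Timezone-aware replacement for datetime.utcnow()."""
    return datetime.now(timezone.utc)


aware = utc_now()
naive = aware.replace(tzinfo=None)  # same wall-clock value utcnow() would return

print(aware.tzinfo)  # UTC
print(naive.tzinfo)  # None

# Ordering comparisons between the two kinds raise, so mixed usage is fragile.
try:
    aware < naive
except TypeError as exc:
    print(f"TypeError: {exc}")
```

A callable such as `utc_now` (or the lambda used in this commit) can be passed as a column `default=` / `onupdate=` so the timestamp is computed per row rather than once at import time.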
Verified:
- 173/173 pytest compliance/tests/ pass (same as baseline)
- tests/contracts/test_openapi_baseline.py passes (360 paths,
484 operations unchanged)
- DeprecationWarning count dropped from 158 to 35
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Non-negotiable structural rules that apply to every Claude Code session in
this repo and to every commit, enforced via three defense-in-depth layers:
1. PreToolUse hook in .claude/settings.json blocks any Write/Edit that
would push a file past the 500-line hard cap. Auto-loads for any
Claude session in this repo regardless of who launched it.
2. scripts/githooks/pre-commit (installed via scripts/install-hooks.sh)
enforces the LOC cap, freezes migrations/ unless [migration-approved],
and protects guardrail files unless [guardrail-change] is present.
3. .gitea/workflows/ci.yaml gets loc-budget + guardrail-integrity jobs,
plus mypy --strict on new Python packages, tsc --noEmit on Node
services, and a syft+grype SBOM scan.
Per-language conventions are documented in AGENTS.python.md / AGENTS.go.md /
AGENTS.typescript.md at the repo root — layering (router->service->repo for
Python, hexagonal for Go, colocation for Next.js), tooling baseline, and
explicit "what you may NOT do" lists.
Adds scripts/check-loc.sh (soft 300 / hard 500, reports 205 hard and 161
soft violations in the current codebase) plus .claude/rules/loc-exceptions.txt
(initially empty — the list is designed to shrink over time).
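The soft/hard budget can be expressed as a small classifier. This Python sketch only illustrates the thresholds; scripts/check-loc.sh is a shell script, and the sketch assumes that "past the cap" means strictly greater than the limit and that paths listed in loc-exceptions.txt are exempt.

```python
SOFT_CAP = 300
HARD_CAP = 500


def classify(line_count, soft=SOFT_CAP, hard=HARD_CAP):
    """Return 'hard', 'soft', or 'ok' for a file's line count."""
    if line_count > hard:
        return "hard"
    if line_count > soft:
        return "soft"
    return "ok"


def report(line_counts, exceptions=frozenset()):
    """Group violating paths by severity; `line_counts` maps path -> LOC."""
    result = {"hard": [], "soft": []}
    for path, count in sorted(line_counts.items()):
        if path in exceptions:
            continue  # assumed meaning of the exceptions list
        level = classify(count)
        if level != "ok":
            result[level].append(path)
    return result


print(report({
    "compliance/api/schemas.py": 1899,
    "compliance/db/isms_audit_models.py": 468,
    "compliance/db/models.py": 85,
}))
# {'hard': ['compliance/api/schemas.py'], 'soft': ['compliance/db/isms_audit_models.py']}
```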
Per-service READMEs for all 10 services + PHASE1_RUNBOOK.md for the
backend-compliance refactor. Skeleton packages (compliance/{domain,
repositories,schemas}) are the landing zone for the clean-arch rewrite that
begins in Phase 1.
CLAUDE.md is prepended with the six non-negotiable rules.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Containers are on multiple networks (breakpilot-network, coolify,
gokocgws...). Without traefik.docker.network, Traefik randomly picks
a network and may choose breakpilot-network where it has no access.
This label forces Traefik to always use the coolify network.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Traefik routes traffic via the 'coolify' bridge network, so services
that need public domain access must be on both breakpilot-network
(for inter-service communication) and coolify (for Traefik routing).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SQLAlchemy 2.x requires raw SQL strings to be explicitly wrapped
in text(). Fixed 16 instances across 5 route files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Switch to ${COMPLIANCE_DATABASE_URL} for admin-compliance, backend, SDK, crawler
- Add DATABASE_URL to admin-compliance environment
- Switch ai-compliance-sdk from QDRANT_HOST/PORT to QDRANT_URL + QDRANT_API_KEY
- Add MINIO_SECURE to compliance-tts-service
- Update .env.coolify.example with new variable patterns
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace bp-core-postgres with POSTGRES_HOST env var
- Replace bp-core-qdrant with QDRANT_HOST env var
- Replace bp-core-minio with S3_ENDPOINT/S3_ACCESS_KEY/S3_SECRET_KEY
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add docker-compose.coolify.yml (8 services), .env.coolify.example,
and Gitea Action workflow for Coolify API deployment. Removes
core-health-check and docs. Adds Traefik labels for
*.breakpilot.ai domain routing with Let's Encrypt SSL.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaced bare imports with safe_import_router pattern — if one sub-router
fails to import (e.g. missing dependency), other routers still load.
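A minimal sketch of what such a helper might look like, assuming each sub-router module exposes its router under a known attribute name. The signature and logging are guesses, not the code from this commit, and the demo uses a stdlib module as a stand-in for a router module.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def safe_import_router(module_name, attr="router"):
    """Import `module_name` and return its `attr`, or None if anything fails."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except Exception as exc:  # ImportError, AttributeError, import-time errors
        logger.warning("Skipping router %s: %s", module_name, exc)
        return None


# Stand-ins: "json" imports fine, the second name does not exist.
candidates = [("json", "loads"), ("module_that_does_not_exist_xyz", "router")]
loaded = [
    obj
    for obj in (safe_import_router(name, attr) for name, attr in candidates)
    if obj is not None
]
print(len(loaded))  # 1
```

The caller would then register only the non-None results, so one broken module costs its own endpoints rather than the whole application.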
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
from __future__ import annotations breaks Pydantic BaseModel runtime type
evaluation. Replaced str | None → Optional[str], list[str] → List[str] etc.
in control_generator.py, anchor_finder.py, control_generator_routes.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds migration_runner.py that executes pending migrations from
migrations/ directory when backend-compliance starts. Tracks applied
migrations in _migration_history table.
Handles existing databases: detects if tables from migrations 001-045
already exist and seeds the history table accordingly, so only new
migrations (046+) are applied.
Skippable via SKIP_MIGRATIONS=true env var.
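The core loop of such a runner can be sketched with the standard library. This is an illustration under stated assumptions, not migration_runner.py itself: it uses sqlite3 in place of the project's database, assumes migrations are *.sql files applied in filename order, invents the history table's columns, and omits the seeding step for existing databases.

```python
import os
import sqlite3
from pathlib import Path


def run_migrations(conn, migrations_dir):
    """Apply *.sql files not yet recorded in _migration_history; return their names."""
    if os.environ.get("SKIP_MIGRATIONS", "").lower() == "true":
        return []
    conn.execute(
        "CREATE TABLE IF NOT EXISTS _migration_history ("
        "name TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM _migration_history")}
    newly_applied = []
    for path in sorted(Path(migrations_dir).glob("*.sql")):
        if path.name in applied:
            continue
        conn.executescript(path.read_text())  # run the migration's statements
        conn.execute("INSERT INTO _migration_history (name) VALUES (?)", (path.name,))
        conn.commit()
        newly_applied.append(path.name)
    return newly_applied
```

Calling `run_migrations(sqlite3.connect(":memory:"), "migrations")` twice in a row should apply everything on the first call and nothing on the second, since the history table makes the operation idempotent.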
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Migration 045: Seed 10 controls (AUTH, NET, SUP, LOG, WEB, DATA, CRYP, REL)
with 39 open-source anchors into the database
- Backend: POST/PUT/DELETE endpoints for canonical controls CRUD
- Frontend proxy: PUT and DELETE methods added to canonical route
- Frontend: Control Library with create/edit/delete UI, full form with
open anchor management, scope, requirements, evidence, test procedures
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The backend mounts the compliance router at /api/compliance, so canonical
control endpoints are at /api/compliance/v1/canonical/*, not /api/v1/canonical/*.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add phase_security() with 15 documents across 3 sub-phases:
- J1: 7 NIST standards (SP 800-53, 800-218, 800-63, 800-207, 8259A/B, AI RMF)
- J2: 6 OWASP projects (Top 10, API Security, ASVS, MASVS, SAMM, Mobile Top 10)
- J3: 2 ENISA guides (Procurement Hospitals, Cloud Security SMEs)
All documents are licensed for commercial use (Public Domain / CC BY / CC BY-SA).
Wire up 'security' phase in dispatcher and workflow yaml.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
download_pdf() and extract_gesetz_html() now return 0 on failure and clean up
partial files. This prevents set -euo pipefail from aborting the entire script
when a single download fails (e.g. EUR-Lex timeout, BSI redirect).
Root cause of H2 EU loop only processing 1 document in Run #724: first failed
download_pdf returned 1, triggering set -e script abort.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
docker cp fails when target dir doesn't exist in a created container.
Copy scripts to /workspace_scripts, then cp them at container start.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The runner container can't access host paths directly, so the
deploy dir scripts were always stale. Now uses docker create +
docker cp + docker start to copy the freshly checked-out scripts
into the ingestion container before starting it.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The RAG workflow mounts scripts from /opt/breakpilot-compliance/scripts
(deploy dir) but this may not have the latest fixes if CI hasn't
deployed yet. Add explicit git pull before running ingestion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- collection_count() returns 0 (not ?) on failure — fixes arithmetic error
- Pass QDRANT_API_KEY to ingestion container for dedup checks
- Include api-key header in collection_count() and dedup scroll queries
- Lower large-file threshold to 256KB (EGBGB 310KB was timing out)
- More targeted EGBGB XML extraction (Art. 246a + Anlage only)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Critical bug fix: mandatoryDocuments in Hard-Trigger-Rules used UPPERCASE
names (VVT, TOM, DSE) that never matched lowercase ScopeDocumentType keys
(vvt, tom, dsi). This meant no trigger documents were ever recognized as
mandatory in buildDocumentScope().
- Add normalizeDocType() mapping function with alias support
(DSE→dsi, LOESCHKONZEPT→lf, DSR_PROZESS→betroffenenrechte, etc.)
- Fix buildDocumentScope() to use normalized doc types
- Fix estimateEffort() to use lowercase keys matching ScopeDocumentType
- Add 2 tests for UPPERCASE normalization and alias resolution
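The normalization idea, lowercasing first and then resolving aliases, can be sketched as follows. This is a Python transliteration for illustration only; the actual normalizeDocType() lives in the scope engine, and the alias table here contains only the three aliases named above.

```python
# Aliases named in this commit message; the real table has more entries ("etc.").
DOC_TYPE_ALIASES = {
    "dse": "dsi",
    "loeschkonzept": "lf",
    "dsr_prozess": "betroffenenrechte",
}


def normalize_doc_type(name):
    """Map a trigger-rule document name to its lowercase ScopeDocumentType key."""
    key = name.strip().lower()
    return DOC_TYPE_ALIASES.get(key, key)


print([normalize_doc_type(n) for n in ("VVT", "TOM", "DSE", "LOESCHKONZEPT")])
# ['vvt', 'tom', 'dsi', 'lf']
```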
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extended timeout (15 min) for files > 500KB (BGB is 1.5MB)
- upload_file returns 0 even on failure so set -e doesn't kill script
- Failed uploads are still counted and reported in summary
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The gesetze phase failed because it expects text files created by the
download phase. Now the workflow automatically runs download first for
any phase that depends on it. Also adds git and python3 to the alpine
container for repo cloning and text extraction.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- admin-compliance/Dockerfile: mkdir -p public before build
- developer-portal/Dockerfile: mkdir -p public before build
(fixes "failed to calculate checksum /app/public: not found")
- docker-compose.hetzner.yml: Override core-health-check to exit
immediately (Core doesn't run on Hetzner)
- Network override: external:false (auto-create breakpilot-network)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The runner container has Docker socket but no host filesystem access.
docker compose needs to read YAML files, so run build+deploy inside
a helper container that has both Docker socket and the deploy dir mounted.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Problems fixed:
1. Deploy step couldn't access /opt/breakpilot-compliance (host path not
mounted in runner container). Now uses alpine/git helper container with
host bind-mount for git ops, then docker compose with host paths.
2. breakpilot-network was external:true but Core doesn't run on Hetzner.
Override in hetzner.yml creates the network automatically.
3. core-health-check blocks startup waiting for Core. Override in
hetzner.yml makes it exit immediately.
4. RAG ingestion script now respects RAG_URL/QDRANT_URL env vars.
5. RAG workflow discovers network dynamically from running containers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Temporary commit to discover Docker container names and networks
on Hetzner, since breakpilot-network doesn't exist there.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of trying to connect the runner to breakpilot-network,
spawn a new alpine container directly on it via docker run.
Added debug output for network/container visibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Join breakpilot-network so bp-core-rag-service is reachable
- Make RAG_URL/QDRANT_URL in script respect env vars (${VAR:-default})
- Remove complex fallback logic — fail fast if network not available
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The runner container doesn't always have /opt/breakpilot-compliance mounted.
Use the git-cloned workspace (current dir) and add multi-fallback for RAG API
URL (container network → localhost → host.docker.internal).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase H now includes:
- 16 German laws (PAngV, VSBG, ProdHaftG, BDSG, HGB, AO, DDG, TKG, etc.)
- 15 EUR-Lex EU laws (DSGVO, Consumer Rights Dir, Sale of Goods Dir,
E-Commerce Dir, Unfair Terms Dir, DMA, NIS2, Product Liability Dir, etc.)
- 2 NIST frameworks (CSF 2.0, Privacy Framework 1.0)
- 1 HLEG Ethics Guidelines
Updated rag-sources.md with complete inventory of already-ingested vs
new documents, plus Layer 2-5 TODO roadmap.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Runner needs access to /opt/breakpilot-compliance and Docker network
for RAG service (bp-core-rag-service:8097). Falls back to
host.docker.internal if container network unavailable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Split HT-H01 into HT-H01a (B2C/hybrid with consumer-protection obligations) and
HT-H01b (pure B2B with baseline obligations). B2B webshops no longer receive the
withdrawal notice (Widerrufsbelehrung), price indication (Preisangaben), or
distance-selling (Fernabsatz) documents.
- Add excludeWhen/requireWhen to HardTriggerRule for conditional trigger logic
- Register 6 new ScopeDocumentType values: widerrufsbelehrung, preisangaben,
fernabsatz_info, streitbeilegung, produktsicherheit, ai_act_doku
- Full DOCUMENT_SCOPE_MATRIX L1-L4 for all new types
- Align HardTriggerRule interface with actual engine field names
- Add Phase H (consumer protection) to RAG ingestion script:
10 German laws + 4 EU regulations + HLEG Ethics Guidelines
- Add scripts/rag-sources.md with license documentation
- 9 new tests for B2B/B2C trigger split, all 326 tests pass
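How requireWhen/excludeWhen could gate a rule is sketched below in Python. Everything about the condition shape is an assumption: the commit message only names the two fields, so the `questionId`/`values` structure, the `business_model` answer key, and the two example rule bodies are hypothetical and may not match the engine.

```python
def _matches(condition, answers):
    """True if the answer to condition['questionId'] is one of condition['values']."""
    return answers.get(condition["questionId"]) in condition["values"]


def rule_applies(rule, answers):
    """A rule fires unless requireWhen is unmet or excludeWhen is met."""
    require = rule.get("requireWhen")
    exclude = rule.get("excludeWhen")
    if require is not None and not _matches(require, answers):
        return False
    if exclude is not None and _matches(exclude, answers):
        return False
    return True


# Hypothetical rule bodies illustrating the B2C/B2B split.
CONSUMER = {"questionId": "business_model", "values": ["B2C", "hybrid"]}
HT_H01A = {"id": "HT-H01a", "requireWhen": CONSUMER}
HT_H01B = {"id": "HT-H01b", "excludeWhen": CONSUMER}

answers = {"business_model": "B2B"}
print([r["id"] for r in (HT_H01A, HT_H01B) if rule_applies(r, answers)])
# ['HT-H01b']
```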
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>