The learning point is not the hypothesis, it is the QUESTION — and confirmed/refuted is too coarse.
"partial, only critical suppliers" or "certified but not lived" are not "wrong", they are valuable
knowledge. So the chain is Hypothesis -> Question -> Observation -> (Review) -> Hypothesis, and the
observation model must be defined cleanly before any store/API (else thousands of too-coarse
observations get migrated later).
compliance/onboarding/observations.py:
- ObservationType: confirmed / partial / refuted / not_applicable / unknown (richer than binary).
- Observation: {hypothesis_id, capability, question, answer (free text), observation_type,
scope_note ("only critical suppliers"), evidence_uploaded, reviewed, reviewed_by}.
- empirical_distribution() -> a DISTRIBUTION (confirmed 61 / partial 31 / refuted 8), not one %.
- empirical_confidence() -> (confirmed + 0.5*partial) / (confirmed+partial+refuted); n.a./unknown
excluded; None until calibrated.
- REVIEW GATE: only reviewed observations calibrate — a raw answer never changes a hypothesis (no
learning from outliers).
Refactor: the hypothesis is now PURE curated knowledge — the binary observations counter and any
confidence are removed from CapabilityHypothesis and the YAML; confidence is COMPUTED from the separate
reviewed observation stream. Pure, mypy --strict clean. Persistence/aggregation/calibration are 59b/c/d.
Non-runtime -> no deploy. 12 tests pass, check-loc 0.
breakpilot-compliance
DSGVO/AI-Act compliance platform — 10 services, Go · Python · TypeScript
Overview
breakpilot-compliance is a multi-tenant DSGVO/EU AI Act compliance platform that provides an SDK for consent management, data subject requests (DSR), audit logging, iACE impact assessments, and document archival. It ships as 10 containerised services covering an admin dashboard, a developer portal, a Python/FastAPI backend, a Go AI compliance engine, TTS, and a decentralised document store on IPFS. Every service is deployed automatically via Gitea Actions → Orca on every push to main.
Architecture
| Service | Tech | Port | Container |
|---|---|---|---|
| admin-compliance | Next.js 15 | 3007 | bp-compliance-admin |
| backend-compliance | Python / FastAPI 0.123 | 8002 | bp-compliance-backend |
| ai-compliance-sdk | Go 1.24 / Gin | 8093 | bp-compliance-ai-sdk |
| developer-portal | Next.js 15 | 3006 | bp-compliance-developer-portal |
| breakpilot-compliance-sdk | TypeScript SDK (React/Vue/Angular/vanilla) | — | — |
| consent-sdk | JS/TS Consent SDK | — | — |
| compliance-tts-service | Python / Piper TTS | 8095 | bp-compliance-tts |
| document-crawler | Python / FastAPI | 8098 | bp-compliance-document-crawler |
| dsms-gateway | Python / FastAPI / IPFS | 8082 | bp-compliance-dsms-gateway |
| dsms-node | IPFS Kubo v0.24.0 | — | bp-compliance-dsms-node |
All containers share the external breakpilot-network Docker network and depend on breakpilot-core (Valkey, Vault, RAG service, Nginx reverse proxy).
Quick Start
Prerequisites: Docker, Go 1.24+, Python 3.12+, Node.js 20+, Infisical CLI
git clone ssh://git@gitea.meghsakha.com:22222/Benjamin_Boenisch/breakpilot-compliance.git
cd breakpilot-compliance
# One-time per machine: log in to the self-hosted Infisical instance
infisical login --domain https://secrets.meghsakha.com
# Start the full stack with secrets injected from Infisical (env=dev)
make dev
Secrets are pulled from Infisical (secrets.meghsakha.com) at runtime; .env files are not used. See INFISICAL_SETUP.md for full onboarding, and make help for the rest of the targets (dev-build, dev-down, secrets, secrets-set).
For the Orca/Hetzner production target (x86_64), use the override:
make dev ENV=prod # or:
infisical run --env=prod -- docker compose -f docker-compose.yml -f docker-compose.hetzner.yml up -d
Development Workflow
Use feature branches off main. Supported prefixes: feat/, feature/, hotfix/.
git checkout main && git pull origin main
git checkout -b feat/my-change
# ... make changes ...
git push origin feat/my-change
# Open a PR → squash merge to main
Push to main triggers:
- Gitea Actions — lint → test → validate (see CI Pipeline below)
- Orca — automatic build + deploy (~3 min total)
Monitor status: https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance/actions
CI Pipeline
Defined in .gitea/workflows/ci.yaml.
| Job | What it checks |
|---|---|
loc-budget |
All source files ≤ 500 LOC; soft target 300 |
guardrail-integrity |
Commits touching guardrail files carry [guardrail-change] |
go-lint |
golangci-lint on ai-compliance-sdk/ |
python-lint |
ruff + mypy on Python services |
nodejs-lint |
tsc --noEmit + ESLint on Next.js services |
test-go-ai-compliance |
go test ./... in ai-compliance-sdk/ |
test-python-backend-compliance |
pytest in backend-compliance/ |
test-python-document-crawler |
pytest in document-crawler/ |
test-python-dsms-gateway |
pytest test_main.py in dsms-gateway/ |
sbom-scan |
License + vulnerability scan via syft + grype |
validate-canonical-controls |
OpenAPI contract baseline diff |
File Budget
| Limit | Value | How to check |
|---|---|---|
| Soft target | 300 LOC | bash scripts/check-loc.sh |
| Hard cap | 500 LOC | Same; also enforced by PreToolUse hook + git pre-commit + CI |
| Exceptions | .claude/rules/loc-exceptions.txt |
Require written rationale + [guardrail-change] commit marker |
The .claude/settings.json PreToolUse hook blocks Claude Code from writing or editing files that would exceed the hard cap. The git pre-commit hook re-checks. CI is the final gate.
Links
| URL | |
|---|---|
| Admin dashboard | https://admin-dev.breakpilot.ai |
| Developer portal | https://developers-dev.breakpilot.ai |
| Backend API | https://api-dev.breakpilot.ai |
| AI SDK API | https://sdk-dev.breakpilot.ai |
| Gitea repo | https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance |
| Gitea Actions | https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance/actions |