T

Benjamin Admin e1dadc8027 feat: Browser-Matrix Stufe 1.a + 2 weitere GT-Findings + Plausibility-LLM-Härtung

Stage 1.a Browser-Matrix (Task #15) — Multi-Engine Scaffolding:
  - consent-tester/Dockerfile: firefox + webkit + Xvfb deps
  - playwright install chromium firefox webkit
  - services/browser_profiles.py: Registry mit DEFAULT_PROFILES
    (Chromium-Headed/Firefox-Headed/WebKit-Headed/Mobile-Safari) +
    EXTRA_PROFILES (Chrome-Channel, Edge, Brave)
  - services/multi_browser_scanner.py: run_matrix() orchestriert N
    parallele Scans + worst-of-Aggregation + 3 Sub-Scores
    (Pre-Consent 50%, Reject-Respekt 30%, Banner-Design 20%) +
    Hard-Fail-Cap auf <60% bei Pre-Consent/Reject-Verstoß
  - routes_matrix.py: POST /scan-matrix Endpoint (eigenes Modul,
    damit main.py unter 500 LOC bleibt)
  KNOWN: Stage 1.a-Shim ruft alle Profile auf demselben Chromium,
    echte Engine-Diversität in Stage 1.b (consent_scanner.py Param)

Coverage-Gap 3 (Task #17): 2/3 verbleibende GT-Lücken geschlossen:
  - B9 impressum_multi_entity_check (IMPRESSUM-001): erkennt
    USt-IdNr/HR/GF-Fehlen pro Entity bei multi-entity Impressen
    (Elli: USt-IdNr nur bei Elli Mobility, fehlt bei VW Group Charging)
  - B10 transfer_mechanism_check (TRANSFER-001): pro Non-EU-Vendor
    in cmp_vendors prüft DSE auf DPF/SCCs/BCRs/Einwilligung im
    ±400-char-Window. Findet Vendors ohne benannten Mechanismus.
  - TH-RETENTION-002 (AI-Datenkategorie-Differenzierung) bleibt
    semantisch-tief, vorgesehen für Specialist-Agents Task #18.

Plausibility-LLM Empty-Response-Härtung (Task #16):
  - BATCH_SIZE 8 → 4, EXCERPT 4000 → 1500 chars, TIMEOUT 60 → 45s
  - Single-retry mit halbierter Batch wenn LLM empty content
    zurückgibt — qwen3:30b-a3b rejektiert manchmal ≥6-Item-Prompts
    unter format='json'. Falls auch Half-Batch empty: log + skip.
  - Pipeline läuft jetzt nicht mehr 10min in Timeouts.

GT-Coverage Sprung: 10/13 → 11/13 (85%). 4/4 HIGH ✓, 5/6 MEDIUM ✓,
2/3 LOW ✓.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-06-06 21:42:27 +02:00

.claude

refactor(agent-check): split routes file (2692→347 LOC) + wire B1/B3/A1 [guardrail-change]

2026-06-06 14:47:25 +02:00

.gitea/workflows

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

.woodpecker

fix(ci): update Go to 1.24 for ai-compliance-sdk

2026-02-15 17:43:27 +01:00

admin-compliance

feat(agent): PreScanWizard im ComplianceCheckTab (P79 sichtbar)

2026-05-23 07:21:11 +02:00

ai-compliance-sdk

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

backend-compliance

feat: Browser-Matrix Stufe 1.a + 2 weitere GT-Findings + Plausibility-LLM-Härtung

2026-06-06 21:42:27 +02:00

breakpilot-compliance-sdk

docs: update service READMEs for refactor progress and stale phase references

2026-04-19 16:07:23 +02:00

compliance-tts-service

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

consent-sdk

refactor(consent-sdk,dsms-gateway): split ConsentManager, types, and main.py

2026-04-18 08:42:32 +02:00

consent-tester

feat: Browser-Matrix Stufe 1.a + 2 weitere GT-Findings + Plausibility-LLM-Härtung

2026-06-06 21:42:27 +02:00

developer-portal

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

docs-site

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

docs-src

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

document-crawler

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

dsms-gateway

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

dsms-node

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

scripts

feat(audit): Phase 1 Quick-Wins (P81 + P85 + P70 + P83) + TCF DELETE/INSERT-Fix

2026-05-22 08:24:46 +02:00

zeroclaw

feat(audit): V2 mail render + 5 new findings (B4/B5/B6/B7/B8) + LLM-Plausibility-Phase

2026-06-06 21:19:49 +02:00

.env.example

feat(infra): Qdrant + MinIO auf externe Hetzner-Services migrieren

2026-03-06 14:33:04 +01:00

.env.orca.example

chore: replace all Coolify references with Orca

2026-04-19 16:33:56 +02:00

.gitignore

docs: add root README, CONTRIBUTING, onboarding section, gitignore fixes

2026-04-19 16:09:28 +02:00

AGENTS.go.md

fix: resolve CI failures in Python tests and admin-compliance build

2026-04-19 16:41:39 +02:00

AGENTS.python.md

fix: resolve CI failures in Python tests and admin-compliance build

2026-04-19 16:41:39 +02:00

AGENTS.typescript.md

docs(agents): require build + lint + test locally before pushing [guardrail-change]

2026-04-19 16:38:21 +02:00

CONTRIBUTING.md

chore: replace all Coolify references with Orca

2026-04-19 16:33:56 +02:00

docker-compose.hetzner.yml

docs: replace all Coolify references with Orca across compliance repo

2026-04-17 10:39:45 +02:00

docker-compose.orca.yml

chore: replace all Coolify references with Orca

2026-04-19 16:33:56 +02:00

docker-compose.yml

feat(p83): wire BUILD_SHA through all Dockerfiles + compose + CI check

2026-05-22 18:29:03 +02:00

mkdocs.yml

docs: add Pass 0b cost benchmark — v3 vs v4 vs backfill vs Mac Mini

2026-04-27 16:00:11 +02:00

README.md

chore: replace all Coolify references with Orca

2026-04-19 16:33:56 +02:00

REFACTOR_PLAYBOOK.md

docs: add root README, CONTRIBUTING, onboarding section, gitignore fixes

2026-04-19 16:09:28 +02:00

README.md

breakpilot-compliance

DSGVO/AI-Act compliance platform — 10 services, Go · Python · TypeScript

Overview

breakpilot-compliance is a multi-tenant DSGVO/EU AI Act compliance platform that provides an SDK for consent management, data subject requests (DSR), audit logging, iACE impact assessments, and document archival. It ships as 10 containerised services covering an admin dashboard, a developer portal, a Python/FastAPI backend, a Go AI compliance engine, TTS, and a decentralised document store on IPFS. Every service is deployed automatically via Gitea Actions → Orca on every push to main.

Architecture

Service	Tech	Port	Container
admin-compliance	Next.js 15	3007	bp-compliance-admin
backend-compliance	Python / FastAPI 0.123	8002	bp-compliance-backend
ai-compliance-sdk	Go 1.24 / Gin	8093	bp-compliance-ai-sdk
developer-portal	Next.js 15	3006	bp-compliance-developer-portal
breakpilot-compliance-sdk	TypeScript SDK (React/Vue/Angular/vanilla)	—	—
consent-sdk	JS/TS Consent SDK	—	—
compliance-tts-service	Python / Piper TTS	8095	bp-compliance-tts
document-crawler	Python / FastAPI	8098	bp-compliance-document-crawler
dsms-gateway	Python / FastAPI / IPFS	8082	bp-compliance-dsms-gateway
dsms-node	IPFS Kubo v0.24.0	—	bp-compliance-dsms-node

All containers share the external breakpilot-network Docker network and depend on breakpilot-core (Valkey, Vault, RAG service, Nginx reverse proxy).

Quick Start

Prerequisites: Docker, Go 1.24+, Python 3.12+, Node.js 20+

git clone ssh://git@gitea.meghsakha.com:22222/Benjamin_Boenisch/breakpilot-compliance.git
cd breakpilot-compliance

# Copy and populate secrets (never commit .env)
cp .env.example .env

# Start all services
docker compose up -d

For the Orca/Hetzner production target (x86_64), use the override:

docker compose -f docker-compose.yml -f docker-compose.hetzner.yml up -d

Development Workflow

Use feature branches off main. Supported prefixes: feat/, feature/, hotfix/.

git checkout main && git pull origin main
git checkout -b feat/my-change
# ... make changes ...
git push origin feat/my-change
# Open a PR → squash merge to main

Push to main triggers:

Gitea Actions — lint → test → validate (see CI Pipeline below)
Orca — automatic build + deploy (~3 min total)

Monitor status: https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance/actions

CI Pipeline

Defined in .gitea/workflows/ci.yaml.

Job	What it checks
`loc-budget`	All source files ≤ 500 LOC; soft target 300
`guardrail-integrity`	Commits touching guardrail files carry `[guardrail-change]`
`go-lint`	`golangci-lint` on `ai-compliance-sdk/`
`python-lint`	`ruff` + `mypy` on Python services
`nodejs-lint`	`tsc --noEmit` + ESLint on Next.js services
`test-go-ai-compliance`	`go test ./...` in `ai-compliance-sdk/`
`test-python-backend-compliance`	`pytest` in `backend-compliance/`
`test-python-document-crawler`	`pytest` in `document-crawler/`
`test-python-dsms-gateway`	`pytest test_main.py` in `dsms-gateway/`
`sbom-scan`	License + vulnerability scan via `syft` + `grype`
`validate-canonical-controls`	OpenAPI contract baseline diff

File Budget

Limit	Value	How to check
Soft target	300 LOC	`bash scripts/check-loc.sh`
Hard cap	500 LOC	Same; also enforced by `PreToolUse` hook + git pre-commit + CI
Exceptions	`.claude/rules/loc-exceptions.txt`	Require written rationale + `[guardrail-change]` commit marker

The .claude/settings.json PreToolUse hook blocks Claude Code from writing or editing files that would exceed the hard cap. The git pre-commit hook re-checks. CI is the final gate.

Links

	URL
Admin dashboard	https://admin-dev.breakpilot.ai
Developer portal	https://developers-dev.breakpilot.ai
Backend API	https://api-dev.breakpilot.ai
AI SDK API	https://sdk-dev.breakpilot.ai
Gitea repo	https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance
Gitea Actions	https://gitea.meghsakha.com/Benjamin_Boenisch/breakpilot-compliance/actions

Languages

TypeScript 41.7%

Python 33.1%

Go 22.7%

Shell 1.2%

PLpgSQL 0.8%

Other 0.2%