Sprint 1.12 Phase 1 (user directive 2026-06-09):
Instead of its own 12 hard-coded patterns, the Impressum agent now uses
the 75 real master controls (MCs) from compliance.doc_check_controls. Pipeline:
Layer 0: regex boost (my 12 patterns from mcs.py / regex_boost.py)
→ if a pattern hits, the MC is overridden to PASS
Layer 1: keyword match against pass_criteria of the 75 DB MCs
(rag_document_checker.check_document_with_controls)
Layer 2: BGE-M3 embedding match (integrated in rag_document_checker)
Layer 3: semantic validator (LLM) for the remaining HIGH/MEDIUM findings
+ auto-learning pattern library
The output layer stays unchanged: disclaimer linter + rollup dedup +
methodology-first UI.
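The Layer 0 override and the Layer 3 hand-off described above could be sketched roughly as follows. This is an illustration only, not the code in v3_engine.py: `ControlResult`, `apply_regex_boost`, `needs_semantic_validation`, the `MC-00x` IDs, and the sample pattern are all hypothetical, and it assumes the override is applied to results that the DB-control check has already produced.

```python
import re
from dataclasses import dataclass


@dataclass
class ControlResult:
    """Hypothetical per-control result; field names are assumptions."""
    control_id: str
    status: str            # "PASS" or "FAIL"
    severity: str          # "HIGH", "MEDIUM" or "LOW"
    overridden: bool = False


def apply_regex_boost(text, results, boost_map):
    """Layer 0: force mapped controls to PASS when their pattern hits.

    boost_map maps a regex pattern to the control ID it boosts.
    Returns the number of results that were actually overridden.
    """
    overrides = 0
    for pattern, control_id in boost_map.items():
        if not re.search(pattern, text, flags=re.IGNORECASE):
            continue
        for result in results:
            if result.control_id == control_id and result.status != "PASS":
                result.status = "PASS"
                result.overridden = True
                overrides += 1
    return overrides


def needs_semantic_validation(results):
    """Layer 3 input: findings still open with HIGH or MEDIUM severity."""
    return [r for r in results
            if r.status != "PASS" and r.severity in {"HIGH", "MEDIUM"}]


results = [
    ControlResult("MC-001", "FAIL", "HIGH"),
    ControlResult("MC-002", "FAIL", "MEDIUM"),
    ControlResult("MC-003", "FAIL", "LOW"),
]
boost_map = {r"\bUSt-?IdNr\b": "MC-001"}  # sample VAT ID pattern
overrides = apply_regex_boost("Impressum USt-IdNr: DE123456789", results, boost_map)
remaining = needs_semantic_validation(results)
```

In this sketch only MC-002 would be passed on to the LLM validator: MC-001 is overridden by the pattern hit and MC-003 is below the severity threshold.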
New files:
- impressum/v3_engine.py: pipeline orchestrator
- impressum/regex_boost.py: my 12 patterns + boost mapping
Refactored:
- impressum/agent.py: completely rewritten, agent_version=3.0,
255 LOC (under the 500-LOC cap)
Tests: test_impressum_v3.py with 10 new tests, all green. Mocks
run_v3_pipeline so the suite runs offline. Confirms:
- Layer 0 recognizes Tesla-typical fields
- Boost matches a DB MC only with ≥2 keyword hits in pass_criteria
- 12 pattern-boost slots + N DB MCs in coverage
- Notes contain telemetry (v3-pipeline, boost overrides)
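The ≥2-keyword rule for mapping a boost onto a DB MC could look roughly like this. The function names, the control IDs, and the assumption that pass_criteria is a plain string are hypothetical; the actual mapping in regex_boost.py may differ.

```python
def count_keyword_hits(keywords, pass_criteria):
    """Count how many keywords occur (case-insensitively) in the criteria text."""
    criteria = pass_criteria.lower()
    return sum(1 for keyword in keywords if keyword.lower() in criteria)


def match_boost_to_controls(keywords, controls, min_hits=2):
    """Return IDs of controls whose pass_criteria contain >= min_hits keywords."""
    return [control_id
            for control_id, criteria in controls.items()
            if count_keyword_hits(keywords, criteria) >= min_hits]


# Hypothetical controls: ID -> pass_criteria text
controls = {
    "MC-010": "Legal notice states the VAT ID and the commercial register number",
    "MC-011": "Legal notice names a contact email address",
}
matched = match_boost_to_controls(["vat id", "register number", "email"], controls)
```

Requiring two hits rather than one keeps a single generic keyword (here "email") from attaching a boost to an unrelated control.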
Telemetry is written to AgentOutput.notes so the frontend can
see: 75 DB MCs checked · 5 pattern boosts · 3 boost overrides.
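A telemetry note of that shape could be assembled as in the sketch below. The `AgentOutput` class here is a hypothetical stand-in (the real one may store notes differently, for example as a single string), and the exact wording of the note is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class AgentOutput:
    """Hypothetical stand-in; assumes notes is a list of strings."""
    notes: list[str] = field(default_factory=list)


def telemetry_note(db_mcs_checked, pattern_boosts, boost_overrides):
    """Format the three pipeline counters as one human-readable note."""
    return (f"v3-pipeline: {db_mcs_checked} DB MCs checked · "
            f"{pattern_boosts} pattern boosts · "
            f"{boost_overrides} boost overrides")


output = AgentOutput()
output.notes.append(telemetry_note(75, 5, 3))
```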
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
backend-compliance
Python/FastAPI service implementing the DSGVO compliance API: DSR, DSFA, consent, controls, risks, evidence, audit, vendor management, ISMS, change requests, document generation.
Port: 8002 (container: bp-compliance-backend)
Stack: Python 3.12, FastAPI, SQLAlchemy 2.x, Alembic, Keycloak auth.
Architecture
```
compliance/
├── api/           # Routers (thin, ≤30 LOC per handler)
├── services/      # Business logic
├── repositories/  # DB access
├── domain/        # Value objects, domain errors
├── schemas/       # Pydantic models, split per domain
└── db/models/     # SQLAlchemy ORM, one module per aggregate
```
The service follows this layered target structure but not all files are fully refactored yet. Phase 1 backlog is tracked in .claude/rules/loc-exceptions.txt (27 backend-compliance files currently excepted).
See ../AGENTS.python.md for the full convention and ../.claude/rules/architecture.md for the non-negotiable rules.
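As a rough illustration of how the layers are meant to relate, the sketch below uses plain Python with no framework. All class and function names are hypothetical; in the service itself the handler would be a FastAPI route and the repository would go through SQLAlchemy.

```python
from dataclasses import dataclass


class ControlNotFound(Exception):
    """Domain error (domain/ layer)."""


@dataclass(frozen=True)
class Control:
    """Value object (domain/ layer)."""
    control_id: str
    title: str


class InMemoryControlRepository:
    """Repository layer: the only place that touches storage."""

    def __init__(self, rows):
        self._rows = dict(rows)

    def get(self, control_id):
        title = self._rows.get(control_id)
        return None if title is None else Control(control_id, title)


class ControlService:
    """Service layer: business rules, no HTTP and no SQL."""

    def __init__(self, repo):
        self._repo = repo

    def get_control(self, control_id):
        control = self._repo.get(control_id)
        if control is None:
            raise ControlNotFound(control_id)
        return control


def get_control_handler(control_id, service):
    """Router layer: thin, only maps domain results and errors to HTTP."""
    try:
        control = service.get_control(control_id)
    except ControlNotFound:
        return 404, {"detail": f"control {control_id} not found"}
    return 200, {"id": control.control_id, "title": control.title}


service = ControlService(InMemoryControlRepository({"MC-001": "Legal notice present"}))
found = get_control_handler("MC-001", service)
missing = get_control_handler("MC-999", service)
```

Keeping the handler this small is what makes a per-handler LOC limit realistic: everything that can grow lives in the service or repository.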
Run locally
```bash
cd backend-compliance
pip install -r requirements.txt
export COMPLIANCE_DATABASE_URL=...  # Postgres (Hetzner or local)
uvicorn main:app --reload --port 8002
```
Tests
```bash
pytest compliance/tests/ -v
pytest --cov=compliance --cov-report=term-missing
```
Layout: tests/unit/, tests/integration/, tests/contracts/. Contract tests diff /openapi.json against tests/contracts/openapi.baseline.json.
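One way to think about the contract check is as a set difference over (path, method) pairs. The sketch below is an assumption about the general idea, not the repository's actual test: `removed_operations` is a hypothetical helper, and it only detects removed operations, not changed schemas.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operations(spec):
    """Collect (path, method) pairs from an OpenAPI document given as a dict."""
    return {(path, method)
            for path, item in spec.get("paths", {}).items()
            for method in item
            if method in HTTP_METHODS}


def removed_operations(baseline, current):
    """Return operations present in the baseline but missing from the current spec."""
    return sorted(operations(baseline) - operations(current))


# Hypothetical paths for illustration only
baseline = {"paths": {"/api/v1/dsr": {"get": {}, "post": {}},
                      "/api/v1/consent": {"get": {}}}}
current = {"paths": {"/api/v1/dsr": {"get": {}},
                     "/api/v1/consent": {"get": {}},
                     "/api/v1/isms": {"get": {}}}}
removed = removed_operations(baseline, current)
```

Adding `/api/v1/isms` is not reported here, while dropping `POST /api/v1/dsr` is; a stricter test would also compare request and response schemas.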
Public API surface
404+ endpoints across /api/v1/*. Grouped by domain: ai, audit, consent, dsfa, dsr, gdpr, vendor, evidence, change-requests, generation, projects, company-profile, isms. Every path is a contract — see the "Public endpoints" rule in the root CLAUDE.md.
Environment
| Var | Purpose |
|---|---|
| `COMPLIANCE_DATABASE_URL` | Postgres DSN, `sslmode=require` |
| `KEYCLOAK_*` | Auth verification |
| `QDRANT_URL`, `QDRANT_API_KEY` | Vector search |
| `CORE_VALKEY_URL` | Session cache |
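A startup check for these variables could look like the following sketch. `check_environment` is a hypothetical helper, treating all four named variables as required is an assumption, and the `KEYCLOAK_*` variables are left out because their exact names are not listed here.

```python
import os
from urllib.parse import parse_qs, urlparse

# Assumption: all four are mandatory; the service may treat some as optional.
REQUIRED_VARS = (
    "COMPLIANCE_DATABASE_URL",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "CORE_VALKEY_URL",
)


def check_environment(env):
    """Return a list of human-readable problems; empty means the env looks fine."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    dsn = env.get("COMPLIANCE_DATABASE_URL", "")
    if dsn:
        query = parse_qs(urlparse(dsn).query)
        if query.get("sslmode") != ["require"]:
            problems.append("COMPLIANCE_DATABASE_URL should set sslmode=require")
    return problems


# Example values are placeholders, not real hosts or credentials.
good_env = {
    "COMPLIANCE_DATABASE_URL": "postgresql://user:pw@db.example:5432/compliance?sslmode=require",
    "QDRANT_URL": "http://qdrant.example:6333",
    "QDRANT_API_KEY": "placeholder",
    "CORE_VALKEY_URL": "redis://valkey.example:6379/0",
}
bad_env = {**good_env,
           "COMPLIANCE_DATABASE_URL": "postgresql://user:pw@db.example:5432/compliance",
           "QDRANT_API_KEY": ""}
good_problems = check_environment(good_env)
bad_problems = check_environment(bad_env)
```

Against the real process environment the call would be `check_environment(os.environ)`.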
Don't touch
Database schema, __tablename__, column names, existing migrations under migrations/. See root CLAUDE.md rule 3.