Compare commits

22 Commits

Author SHA1 Message Date
Benjamin Admin 79ad95e244 feat(ai-sdk): keep cyber/AI hazards out of the traditional CE hazard log
CI / detect-changes (push) Successful in 5s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / validate-canonical-controls (push) Successful in 2s
CI / loc-budget (push) Successful in 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 57s
CI / iace-gt-coverage (push) Successful in 18s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
InitializeProject created hazards for every matched pattern, so native
cybersecurity/AI topics (unauthorized access, firmware manipulation, missing
SBOM, ...) mixed into the ISO 12100 hazard log. Route the security categories
(frontend groups I. Cyber/Netzwerk + J. KI) to the CRA module instead —
generically for EVERY project, enforced centrally in InitializeProject.

The split is by the nature of the hazard, not the component: functional-safety
control faults stay in CE (software faults, lost safety functions, config
errors, bus failures, botched updates) — they are random/systematic faults,
not attacks, and feed the CRA safety-function bridge. This holds whether the
controller is a bought-in CE-marked PLC or the manufacturer's own control.
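
The routing rule above can be sketched as follows. This is a minimal illustration in Python, not the actual `InitializeProject` code (which is not shown in this commit list). The group codes `"I"` and `"J"`, the `Hazard` type, the pattern IDs and `route_hazards` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Assumption: frontend groups I (Cyber/Netzwerk) and J (KI) are encoded as "I" and "J".
SECURITY_GROUPS = {"I", "J"}


@dataclass(frozen=True)
class Hazard:
    pattern_id: str  # hypothetical IDs, for illustration only
    group: str       # frontend category group of the matched pattern
    title: str


def route_hazards(matched):
    """Split matched hazards by the nature of the hazard, not by component.

    Security categories go to the CRA module; everything else, including
    functional-safety control faults, stays in the CE hazard log.
    """
    ce_log, cra_module = [], []
    for hazard in matched:
        target = cra_module if hazard.group in SECURITY_GROUPS else ce_log
        target.append(hazard)
    return ce_log, cra_module


matched = [
    Hazard("EX-1", "E", "Lost safety function after a software fault"),
    Hazard("EX-2", "I", "Unauthorized remote access"),
    Hazard("EX-3", "J", "Manipulated AI model input"),
]
ce_log, cra_module = route_hazards(matched)
```

Doing the split in one central function is what would make it apply to every project regardless of which patterns matched.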

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-24 20:20:15 +02:00
Benjamin Admin a6f1020b2c feat(ai-sdk): IACE warewashing hazard patterns + cross-domain gating
Add commercial-dishwasher hazard patterns (HP2200-HP2206): hot-water/steam
scald on door opening, hot surfaces, hot ware, corrosive detergent/rinse-aid
burn, respiratory irritation, door pinch and wet-floor slip — each gated by
dom_warewashing so they never leak into other machine classes. Add the
matching warewashing protective measures (M2200-M2208).

Tighten capability-domain gating: emit dom_flame/dom_glue and add welding
surface-form gate terms (schweissarbeitsplatz, schweissfunke, lichtbogenzone,
...) so the welding/flame/glue burn patterns stop leaking into thermal-capable
machines such as a dishwasher.
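
A minimal sketch of capability-domain gating, assuming each pattern names one required domain tag and that tags are emitted from gate terms found in the machine description. Only `dom_warewashing`, `HP2200` and the three welding terms come from the commit message; `dom_welding`, the warewashing gate terms, `EX-WELD-BURN` and all function names are hypothetical.

```python
from dataclasses import dataclass

# Gate terms per domain tag. The welding terms are quoted from the commit
# message; the tag name "dom_welding" and the warewashing terms are assumed.
GATE_TERMS = {
    "dom_warewashing": ("spuelmaschine", "dishwasher"),
    "dom_welding": ("schweissarbeitsplatz", "schweissfunke", "lichtbogenzone"),
}


@dataclass(frozen=True)
class HazardPattern:
    pattern_id: str
    required_domain: str  # the pattern may only fire if this tag was emitted


def emit_domains(description):
    """Return the domain tags whose gate terms occur in the description."""
    text = description.lower()
    return {dom for dom, terms in GATE_TERMS.items() if any(t in text for t in terms)}


def applicable_patterns(patterns, domains):
    """Keep only patterns whose required domain was emitted."""
    return [p.pattern_id for p in patterns if p.required_domain in domains]


PATTERNS = [
    HazardPattern("HP2200", "dom_warewashing"),
    HazardPattern("EX-WELD-BURN", "dom_welding"),  # hypothetical welding pattern
]
domains = emit_domains("Gewerbliche Spuelmaschine mit Boiler und Dosierpumpe")
hits = applicable_patterns(PATTERNS, domains)
```

Because the welding pattern requires a tag that a dishwasher description never emits, it cannot fire there even though the machine is thermal-capable.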

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-24 20:20:15 +02:00
Benjamin_Boenisch e50892a2aa feat(ai-sdk): searchControls — recall control sources on implementation questions (#39)
CI / detect-changes (push) Successful in 5s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 6s
CI / validate-canonical-controls (push) Successful in 3s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 58s
CI / iace-gt-coverage (push) Successful in 15s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-24 12:08:29 +00:00
Benjamin_Boenisch 9cfe6f83b1 feat(ai-sdk): source_role control-pool (controls != only technical_standard) (#38)
CI / detect-changes (push) Successful in 4s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 6s
CI / validate-canonical-controls (push) Successful in 3s
CI / loc-budget (push) Successful in 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 57s
CI / iace-gt-coverage (push) Successful in 15s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-24 11:12:22 +00:00
Benjamin_Boenisch df7966656a feat(ai-sdk): classify NIST/OWASP/Grundschutz as technical_standard (#37)
CI / detect-changes (push) Successful in 4s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 6s
CI / validate-canonical-controls (push) Successful in 3s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 1m0s
CI / iace-gt-coverage (push) Successful in 14s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-24 10:15:17 +00:00
Benjamin_Boenisch 05d75e8039 feat(ai-sdk): control-intent — technical_standard may win implementation questions (#36)
CI / detect-changes (push) Successful in 5s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 4s
CI / validate-canonical-controls (push) Successful in 4s
CI / loc-budget (push) Successful in 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 54s
CI / iace-gt-coverage (push) Successful in 14s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-24 09:58:35 +00:00
Benjamin_Boenisch e24a551ee4 fix(ai-sdk): make interpretation-intent override reliably win (#35)
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 4s
CI / validate-canonical-controls (push) Successful in 2s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 57s
CI / iace-gt-coverage (push) Successful in 15s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-24 09:31:58 +00:00
Benjamin_Boenisch f11b2e035f feat(ai-sdk): controlled interpretation-intent guidance override (#34)
CI / detect-changes (push) Successful in 5s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / validate-canonical-controls (push) Successful in 3s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 57s
CI / iace-gt-coverage (push) Successful in 15s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-24 09:01:25 +00:00
Benjamin_Boenisch 230dc05287 feat(ai-sdk): legal-corpus coverage + Phase-2 citation-graph assessment (#33)
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / build-sha-integrity (push) Successful in 6s
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 6s
CI / go-lint (push) Has been skipped
CI / loc-budget (push) Successful in 19s
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m1s
CI / test-go (push) Successful in 59s
CI / iace-gt-coverage (push) Successful in 22s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-24 06:37:22 +00:00
Benjamin_Boenisch b83c3e6e00 ci(go-lint): golangci-lint v1.64.8 (go1.24) + new-from-merge-base (#32)
CI / detect-changes (push) Successful in 16s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / build-sha-integrity (push) Successful in 11s
CI / validate-canonical-controls (push) Successful in 5s
CI / loc-budget (push) Successful in 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 57s
CI / iace-gt-coverage (push) Successful in 16s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-23 10:58:48 +00:00
Benjamin_Boenisch a1f425d43a feat(ai-sdk): authority-aware re-ranking for legal RAG (Phase 1) (#31)
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / validate-canonical-controls (push) Successful in 4s
CI / loc-budget (push) Successful in 28s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 58s
CI / iace-gt-coverage (push) Successful in 16s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-23 09:30:52 +00:00
sharang 23c6ac6f32 Merge pull request 'feat: wire breakpilot-compliance to Infisical for local dev' (#30) from feat/infisical-secrets into main
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 7s
CI / validate-canonical-controls (push) Successful in 6s
CI / loc-budget (push) Successful in 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / test-python-backend (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-22 19:12:54 +00:00
Sharang Parnerkar d82f86fc95 feat: wire breakpilot-compliance to Infisical for local dev
CI / detect-changes (pull_request) Successful in 9s
CI / branch-name (pull_request) Successful in 1s
CI / guardrail-integrity (pull_request) Successful in 7s
CI / secret-scan (pull_request) Successful in 11s
CI / dep-audit (pull_request) Failing after 58s
CI / sbom-scan (pull_request) Failing after 1m4s
CI / build-sha-integrity (pull_request) Successful in 6s
CI / validate-canonical-controls (pull_request) Successful in 4s
CI / loc-budget (pull_request) Successful in 25s
CI / go-lint (pull_request) Failing after 22s
CI / python-lint (pull_request) Failing after 13s
CI / nodejs-lint (pull_request) Failing after 1m15s
CI / nodejs-build (pull_request) Successful in 3m12s
CI / test-go (pull_request) Successful in 57s
CI / iace-gt-coverage (pull_request) Successful in 16s
CI / test-python-backend (pull_request) Successful in 25s
CI / test-python-document-crawler (pull_request) Successful in 14s
CI / test-python-dsms-gateway (pull_request) Successful in 10s
- Add .infisical.json linking the repo to the breakpilot-compliance
  project on the self-hosted secrets.meghsakha.com instance.
- Add Makefile with infisical-aware targets (make dev, dev-build,
  dev-down, secrets, secrets-set). `make dev` runs `infisical run
  --env=dev -- docker compose up`, so secrets are injected at run
  time and .env files no longer touch disk.
- Add INFISICAL_SETUP.md with per-developer onboarding (CLI install,
  login, verify project link, run targets, Claude Code usage patterns,
  troubleshooting).
- Update README Quick Start to drop the cp .env.example .env step and
  point at make dev + INFISICAL_SETUP.md.
- Remove HashiCorp Vault references from CLAUDE.md (core-services list
  + sensitive-files list) and compliance-checklist.md TOM section;
  replace with Infisical.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-22 21:00:58 +02:00
Benjamin Admin a4d1105b3c Merge branch 'feat/advisor-corpus-authority' into HEAD
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 7s
CI / validate-canonical-controls (push) Successful in 7s
CI / loc-budget (push) Successful in 21s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m8s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-22 18:40:15 +02:00
Benjamin Admin 067118b12d fix(cascade): give OVH/gpt-oss reasoning headroom so Tier-2 isn't silently dead
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 6s
CI / validate-canonical-controls (push) Successful in 5s
CI / loc-budget (push) Successful in 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 25s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
gpt-oss-120b is a reasoning model: it spends output tokens on chain-of-thought
before the answer. deep_check called _call_ovh with max_tokens=400, which
length-capped it mid-reasoning -> content=null -> the OVH tier returned nothing
and the cascade always skipped Tier-2. Floor the OVH budget to >=2000, fall back
to reasoning_content when content is null, and raise the client timeout to 90s
for the slower reasoning path.
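
The two fixes can be sketched as below. This is an illustration under assumptions, not the actual `_call_ovh` code: it assumes the response message is a dict with `content` and `reasoning_content` keys, and the helper names `ovh_budget` and `extract_answer` are hypothetical.

```python
OVH_MIN_TOKENS = 2000  # budget floor named in the commit message
OVH_TIMEOUT_S = 90     # client timeout named in the commit message


def ovh_budget(requested_max_tokens):
    """Never give the reasoning model less than the floor."""
    return max(requested_max_tokens, OVH_MIN_TOKENS)


def extract_answer(message):
    """Prefer content; fall back to reasoning_content when content is null or empty."""
    content = message.get("content")
    if content:
        return content
    return message.get("reasoning_content") or None


budget = ovh_budget(400)  # the old deep_check budget is lifted to the floor
answer = extract_answer({"content": None, "reasoning_content": "partial reasoning"})
```

With the floor in place, a caller that still asks for 400 tokens no longer truncates the model mid-reasoning, and the fallback keeps the tier from returning nothing when `content` is still null.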

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-22 17:37:48 +02:00
Benjamin Admin b9c00574b1 docs(catalog): freeze criterion meta-model (compliance_tier axis)
Freezes the criterion meta-model: atomic typed criteria with three axes
(verification_method, decision_method, compliance_tier), 3-state gating on
LEGAL_MINIMUM only (ERFÜLLT/TEILWEISE/FEHLT, i.e. fulfilled/partial/missing),
3-level reporting, and green/blue/red semantics. The control UUID stays stable
(no physical split); storage is in generation_metadata jsonb (no schema change).
Validated on the pilot (6/6 disagreements corrected, TEILWEISE empirically confirmed).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-22 17:37:48 +02:00
Benjamin Admin 5ff08a240b feat(dse): tiered 3-state evaluator + Layer-3 wiring (compliance_tier)
Tiered evaluation with compliance_tier gating (only LEGAL_MINIMUM determines
ERFÜLLT/TEILWEISE/FEHLT; BEST_PRACTICE/OPTIONAL → recommendations). Deterministic-
first: EMBEDDING presence plus a cached Haiku call for sufficiency only → reproducible
(resolves the measured judge variance). Layer-3 in v3_engine is gated on tiered_criteria
and fail-safe (UNBESTIMMT → legacy). Open calibration item: presence threshold (step 2).
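
A minimal sketch of the 3-state gating, under stated assumptions. The commit message does not spell out how individual criteria are aggregated, so the rule used here (all LEGAL_MINIMUM criteria met → ERFÜLLT, none met → FEHLT, otherwise TEILWEISE, any undecided → UNBESTIMMT) is an assumption, as are the dict layout and the name `evaluate_tiered`.

```python
from enum import Enum


class Status(Enum):
    ERFUELLT = "ERFÜLLT"
    TEILWEISE = "TEILWEISE"
    FEHLT = "FEHLT"
    UNBESTIMMT = "UNBESTIMMT"  # caller falls back to the legacy path


def evaluate_tiered(criteria):
    """criteria: dicts with 'id', 'compliance_tier' and 'met' (True/False/None).

    Only LEGAL_MINIMUM criteria gate the status; unmet criteria of any other
    tier are returned as recommendations.
    """
    gating = [c for c in criteria if c["compliance_tier"] == "LEGAL_MINIMUM"]
    recommendations = [
        c["id"] for c in criteria
        if c["compliance_tier"] != "LEGAL_MINIMUM" and c["met"] is False
    ]
    # Fail-safe: no gating criteria or an undecided one yields UNBESTIMMT.
    if not gating or any(c["met"] is None for c in gating):
        return Status.UNBESTIMMT, recommendations
    met = sum(1 for c in gating if c["met"])
    if met == len(gating):
        status = Status.ERFUELLT
    elif met == 0:
        status = Status.FEHLT
    else:
        status = Status.TEILWEISE
    return status, recommendations


criteria = [
    {"id": "c1", "compliance_tier": "LEGAL_MINIMUM", "met": True},
    {"id": "c2", "compliance_tier": "LEGAL_MINIMUM", "met": False},
    {"id": "c3", "compliance_tier": "BEST_PRACTICE", "met": False},
]
status, recommendations = evaluate_tiered(criteria)
```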

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-22 17:37:48 +02:00
Benjamin Admin 3e3644f83d feat(checkers): platform router + Haiku sufficiency tier; cookie is first consumer
Generalise "Embedding finds, Claude decides" into the shared Pruefer-Library:
- router.route_and_check dispatches control -> sensor_classification -> Checker.
- build_spec reads sensor_classification (CONTENT/LLM -> judge=haiku, the
  validated sufficiency tier; the Qwen-first cascade is disproven for sufficiency).
- LLMChecker gains a Haiku-direct tier (reuses the validated deep_check prompt).
- Cookie Layer-3 now routes through route_and_check instead of bespoke code, so
  cookie is the first real router consumer -- proves the architecture end-to-end.

Reproduces the validated result via the shared path: FN 159->14, recall
0.13->0.92, precision 0.89 (vs bespoke 12/0.93/0.90 -- within Haiku noise).
Tests: 10/10 (router dispatch + build_spec + haiku tier + cookie rewire).
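
The dispatch chain can be sketched as below. `route_and_check`, `build_spec` and `sensor_classification` are names from the commit message, but the signatures, the dict shapes, the checker registry and the `UNROUTED` status are assumptions made for this example. Reading "CONTENT/LLM" as two separate classification values is also an assumption.

```python
def build_spec(control):
    """Derive a checker spec from the control's sensor_classification."""
    classification = control.get("sensor_classification")
    spec = {"control_id": control["id"], "sensor_classification": classification}
    if classification in ("CONTENT", "LLM"):
        spec["judge"] = "haiku"  # the validated sufficiency tier
    return spec


def route_and_check(control, text, checkers):
    """Dispatch control -> sensor_classification -> checker."""
    spec = build_spec(control)
    checker = checkers.get(spec["sensor_classification"])
    if checker is None:
        return {"control_id": control["id"], "status": "UNROUTED"}
    return checker(spec, text)


def stub_llm_checker(spec, text):
    """Stand-in for a real checker; it only echoes which judge would run."""
    return {"control_id": spec["control_id"], "status": "CHECKED", "judge": spec.get("judge")}


checkers = {"LLM": stub_llm_checker, "CONTENT": stub_llm_checker}
result = route_and_check(
    {"id": "ctl-1", "sensor_classification": "LLM"}, "policy text", checkers
)
```

Keeping the judge choice inside `build_spec` means a consumer such as the cookie Layer-3 only calls the router and never hard-codes which model decides.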

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-22 17:37:48 +02:00
Benjamin Admin e809d0bc1c feat(cookie): Layer-3 sufficiency-judge — Haiku re-judges embedding/boost rescues
The embedding/boost auto-rescue is intentionally optimistic (finds the topic, not
fulfilment) -> 159 FN over-rescues vs Opus-GT (recall 0.13). Layer-3 re-judges
exactly the rescued passes with the validated Haiku judge (cohort
cookie_sufficiency_v1 P0.89/R0.91) -- NOT the Qwen-first cascade (local is
disproven as a sufficiency judge) -- and un-passes them when the obligation is
not concretely met. Gated to the full check (not skip_llm).

Measured (5-firm Opus-GT, engine+L3): FN 159->12, recall 0.13->0.93,
precision 0.96->0.90 (276 rescues corrected). "Embedding finds, Claude decides."
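
A minimal sketch of the Layer-3 step: only passes that were rescued by the embedding or boost stage are re-judged, and they are un-passed when the judge finds the obligation is not concretely met. The result dict layout, the `rescued_by` field and the `judge` callable are hypothetical; the real judge is the Haiku call, replaced here by a stub.

```python
def layer3_rejudge(results, judge, skip_llm=False):
    """Re-judge rescued passes; leave everything else untouched."""
    if skip_llm:  # Layer-3 is gated to the full check
        return list(results)
    out = []
    for r in results:
        rescued = r["passed"] and r.get("rescued_by") in ("embedding", "boost")
        if rescued and not judge(r):
            r = {**r, "passed": False, "unpassed_by": "layer3"}
        out.append(r)
    return out


def stub_judge(result):
    """Stand-in for the sufficiency judge: requires a concrete statement."""
    return result.get("concrete", False)


results = [
    {"control_id": "a", "passed": True, "rescued_by": "embedding", "concrete": False},
    {"control_id": "b", "passed": True, "rescued_by": "boost", "concrete": True},
    {"control_id": "c", "passed": True, "rescued_by": None},
    {"control_id": "d", "passed": False, "rescued_by": None},
]
judged = layer3_rejudge(results, stub_judge)
```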

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-22 17:37:48 +02:00
Benjamin Admin 869e7aeb1e fix(cookie): gate non-COOKIE_POLICY controls out of the cookie-policy scan
The cookie agent loaded 100 controls, 11 of which have no COOKIE_POLICY in
applicable_artifacts -- Security/TOM/Audit (PROCESS) or Banner-behaviour
(BEHAVIOR) controls that produce nonsense findings against a cookie policy
(e.g. "TOMs not documented"). Add a cookie classification gate (analogous to the
DSE gate, keyed on COOKIE_POLICY, without the needs_review carve-out since the
artifact signal is decisive and the set is inventory-verified). Controls are
routed out, not deleted. Effect vs Opus-GT: FP 16->11, FN 179->159; the
remaining FN=159 over-rescue is a separate (judge/criteria) question, not routing.
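
The gate can be sketched as a simple partition on `applicable_artifacts`. The field name and the `COOKIE_POLICY` key come from the commit message; the dict layout, the control IDs and the function name `cookie_gate` are hypothetical.

```python
def cookie_gate(controls):
    """Partition controls into (kept, routed_out) by the COOKIE_POLICY artifact.

    Controls without COOKIE_POLICY are routed out of the cookie-policy scan,
    not deleted, so another agent can still pick them up.
    """
    kept, routed_out = [], []
    for control in controls:
        artifacts = control.get("applicable_artifacts") or []
        (kept if "COOKIE_POLICY" in artifacts else routed_out).append(control)
    return kept, routed_out


controls = [
    {"id": "ex-1", "applicable_artifacts": ["COOKIE_POLICY", "BANNER"]},
    {"id": "ex-2", "applicable_artifacts": ["TOM"]},  # PROCESS-type control
    {"id": "ex-3", "applicable_artifacts": None},
]
kept, routed_out = cookie_gate(controls)
```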

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-22 17:37:48 +02:00
Benjamin Admin 33085c61b4 feat(advisor): corpus authority: facts only from context, conflict transparency
Authority/freshness layer items 1/2/5 in the advisor answer path (prompt level, no
schema). New soul section "Korpus-Autoritaet & Aktualitaet" (corpus authority and
currency): legal FACTS (thresholds/deadlines/figures/obligations) come only from the
provided RAG/controls context, training knowledge is never a legal source; conflict ->
context wins, transparently; co-pilot tone instead of robotic refusal. Extends
Quellentreue (source fidelity for citations) to the fact level -> resolves the
"DSB ab 10 statt 20" case (DPO required from 10 instead of 20). route.ts: RAG framing
tightened to "deine EINZIGEN Rechtsquellen" ("your ONLY legal sources").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-21 23:18:05 +02:00
Benjamin_Boenisch 38a347a82a feat(platform): live-wire AGB v2 + DSE v3 + Architektur-Tab (#29)
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 9s
CI / validate-canonical-controls (push) Successful in 12s
CI / loc-budget (push) Successful in 24s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m11s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 24s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
AGB v2 (decision_method routing, 71%FP->~0) + DSE v3 (4-layer, recovered from container) + Architektur-Tab into /sdk/agent live path. Incl CI robustness (detect-changes.sh + PR-head checkout) + security (hardcoded Qdrant key removed, gitleaks allowlist).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-21 12:58:26 +00:00
58 changed files with 3738 additions and 29 deletions
+3 -2
@@ -130,10 +130,11 @@ rsync -avz --exclude node_modules --exclude .next --exclude .git \
**breakpilot-core MUSS laufen!** Dieses Projekt nutzt Core-Services:
- Valkey (Session-Cache)
-- Vault (Secrets)
- RAG-Service (Vektorsuche fuer Compliance-Dokumente)
- Nginx (Reverse Proxy)
+Secrets liegen in Infisical (`secrets.meghsakha.com`); die Projektverknuepfung steht in `.infisical.json`. Lokal mit `infisical run --env=dev -- docker compose up` (oder `make dev`) starten — `.env`/`.env.local` werden nicht mehr verwendet.
**Externe Services (Production):**
- PostgreSQL 17 (sslmode=require) — Schemas: `compliance`, `public`
- Qdrant @ `qdrant-dev.breakpilot.ai` (HTTPS, API-Key)
@@ -316,7 +317,7 @@ ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/brea
### 5. Sensitive Dateien
**NIEMALS aendern oder committen:**
-- `.env`, `.env.local`, Vault-Tokens, SSL-Zertifikate
+- `.env`, `.env.local`, Infisical-Tokens, SSL-Zertifikate
- `*.pdf`, `*.docx`, kompilierte Binaries, grosse Medien
---
+1 -1
@@ -92,7 +92,7 @@ Wenn Hochrisiko:
- [ ] **Transit:** TLS 1.3 für alle Verbindungen
- [ ] **Rest:** Datenbank-Verschlüsselung
-- [ ] **Secrets:** Vault für Credentials
+- [ ] **Secrets:** Infisical (`secrets.meghsakha.com`) für Credentials
### Zugriffskontrollen
+4 -2
@@ -136,12 +136,14 @@ jobs:
runs-on: docker
needs: detect-changes
if: github.event_name == 'pull_request' && needs.detect-changes.outputs.sdk == 'true'
-container: golangci/golangci-lint:v1.62-alpine
+container: golangci/golangci-lint:v1.64.8-alpine
steps:
- name: Checkout
run: |
apk add --no-cache git
-git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
+# Full clone so `main` is a local ref — new-from-merge-base needs the merge base.
+git clone ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
+git checkout ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}}
- name: Lint ai-compliance-sdk
run: |
[ -d "ai-compliance-sdk" ] || exit 0
+5
@@ -0,0 +1,5 @@
{
"workspaceId": "996bda36-9e01-4071-ae8d-69a9f9ff5a23",
"defaultEnvironment": "",
"gitBranchToEnvironmentMapping": null
}
+157
@@ -0,0 +1,157 @@
# Infisical Setup for Local Development
This is the per-developer onboarding for accessing the `breakpilot-compliance` secrets while developing locally. Once this is done, **everything you launch through `make dev` (or `infisical run …`) gets the dev secrets injected as environment variables** — including any Claude Code session that spawns those commands.
Secrets live in the self-hosted Infisical instance at **`secrets.meghsakha.com`**. The project link is committed in `.infisical.json`, so you don't need to know the project ID.
---
## 1. Install the Infisical CLI
**macOS (recommended):**
```bash
brew install infisical/get-cli/infisical
```
**Other platforms / manual install:**
See <https://infisical.com/docs/cli/overview>. Verify with:
```bash
infisical --version
# infisical version 0.43.x (or newer)
```
---
## 2. Log in to the self-hosted instance
```bash
infisical login --domain https://secrets.meghsakha.com
```
This opens a browser for SSO. The login is persisted to your OS keychain — you only do this once per machine.
Sanity check:
```bash
cd ~/projects/breakpilot-compliance # wherever you cloned the repo
infisical --domain https://secrets.meghsakha.com secrets --env=dev
```
You should see a table of secret names + values. If you get an auth error, re-run `infisical login`.
---
## 3. Verify the project link
The repo already contains `.infisical.json` pointing at the `breakpilot-compliance` project:
```bash
cat .infisical.json
# { "workspaceId": "996bda36-9e01-4071-ae8d-69a9f9ff5a23", ... }
```
If the file is missing (rare — only if you reset the repo), recreate it:
```bash
infisical init --domain https://secrets.meghsakha.com
```
Pick the `breakpilot-compliance` project from the picker.
---
## 4. Launch the stack
```bash
make dev
```
This runs `infisical run --env=dev -- docker compose up`. Every service in the compose stack sees its secrets as normal env vars — no `.env` file ever touches disk.
Other targets:
| Target | What it does |
|--------|--------------|
| `make dev-build` | Same as `make dev` but rebuilds images first |
| `make dev-down` | Stop the stack (no secrets needed) |
| `make dev-logs` | Tail logs |
| `make dev-ps` | List running containers |
| `make secrets` | Print all secrets in `dev` (read-only) |
| `make secrets-set KEY=FOO VALUE=bar` | Add or update a secret in `dev` |
To target a different environment:
```bash
make dev ENV=staging
make secrets ENV=prod
```
---
## 5. Using secrets from Claude Code
When Claude Code runs commands in this repo via its Bash tool, the commands inherit your shell's environment. Two patterns:
**Pattern A — let Claude launch the stack normally**
Claude just runs `make dev`. The Infisical CLI inside that command resolves secrets at run time and passes them to docker compose. Claude doesn't see plaintext secrets in its context, but the running services do.
**Pattern B — let Claude run a one-off script with secrets**
If Claude needs to execute a Python/Go script that requires secrets, wrap the command:
```bash
infisical run --env=dev -- python scripts/some_one_off.py
```
This works for any subprocess: pytest, alembic, go run, npm scripts. If Claude proposes a command that reads env vars and runs raw, ask it to wrap it in `infisical run --env=dev --` first.
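When the wrapping has to happen from a script rather than from the shell, the same pattern can be expressed as a small helper. This is a sketch, not part of the repository: `wrap_with_infisical` is a hypothetical name, and the flag order mirrors the `infisical --domain … run --env=… --` form used in the Makefile.

```python
def wrap_with_infisical(cmd, env="dev", domain=None):
    """Prefix a command so it runs with secrets injected by `infisical run`.

    Returns an argv list; nothing is executed and no secret is read here.
    """
    argv = ["infisical"]
    if domain:
        argv += ["--domain", domain]
    argv += ["run", f"--env={env}", "--"]
    return argv + list(cmd)


argv = wrap_with_infisical(["python", "scripts/some_one_off.py"])
# To actually run it: subprocess.run(argv, check=True)
```

Passing the result to `subprocess.run` as a list avoids shell quoting issues, and the secrets only ever exist in the child process environment.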
**What Claude should not do:**
- `infisical export --env=dev > .env` — defeats the whole point and the `.gitignore` will still try to keep the file out.
- `infisical secrets get KEY --env=dev --raw` and pasting the value into a code edit — secrets must stay out of the repo.
If you want Claude to never accidentally dump secrets, add this to your `.claude/settings.json` permissions (project-level or user-level):
```json
{
"permissions": {
"deny": [
"Bash(infisical export*)",
"Bash(infisical secrets get*)"
]
}
}
```
---
## Troubleshooting
| Symptom | Fix |
|---------|-----|
| `please either run infisical init or pass --projectId` | `.infisical.json` is missing or unreadable — re-run `infisical init` |
| `unauthorized` / `please log in` | Re-run `infisical login --domain https://secrets.meghsakha.com` |
| `make dev` says secret is empty | Check the name in `make secrets` matches what docker-compose expects, then update the service config or rename the secret in Infisical |
| Browser SSO doesn't open | Use `infisical login --domain https://secrets.meghsakha.com --method=user` and paste the URL manually |
---
## What the dev env contains
Run `make secrets` to see the live list. As of this writing the dev env includes (at minimum):
- `BREAKPILOT_DB_PASSWORD`
- `BREAKPILOT_QDRANT_API_KEY`
- `LITELLM_API_KEY`
Every other variable in `.env.example` either has a sane default in `docker-compose.yml` or needs to be added to Infisical. To add one:
```bash
make secrets-set KEY=ANTHROPIC_API_KEY VALUE=sk-ant-xxxx
```
Or via the web UI: <https://secrets.meghsakha.com>.
+57
@@ -0,0 +1,57 @@
# breakpilot-compliance — developer workflow
#
# Secrets are managed in Infisical (secrets.meghsakha.com). The project
# link lives in .infisical.json. To get started:
# 1) infisical login --domain https://secrets.meghsakha.com (once per machine)
# 2) make dev
#
# .env / .env.local are NOT used in this repo anymore. Anything that needs
# secrets MUST be launched through `infisical run` so the values come from
# the secrets store instead of disk.
INFISICAL ?= infisical
INFISICAL_DOMAIN ?= https://secrets.meghsakha.com
ENV ?= dev
INFISICAL_RUN := $(INFISICAL) --domain $(INFISICAL_DOMAIN) run --env=$(ENV) --
INFISICAL_SECRETS := $(INFISICAL) --domain $(INFISICAL_DOMAIN) secrets --env=$(ENV)
.PHONY: help dev dev-build dev-down dev-logs dev-ps secrets secrets-set check-loc
help:
@echo "Targets:"
@echo " dev Start the full compose stack with secrets injected from Infisical"
@echo " dev-build Same as dev, but force a rebuild first"
@echo " dev-down Stop the compose stack (no secrets needed)"
@echo " dev-logs Tail logs from all services"
@echo " dev-ps Show running containers"
@echo " secrets List all secrets in the current env ($(ENV))"
@echo " secrets-set Set a secret (KEY=... VALUE=...)"
@echo " check-loc Run the 500-line LOC guard"
dev:
$(INFISICAL_RUN) docker compose up
dev-build:
$(INFISICAL_RUN) docker compose up --build
dev-down:
docker compose down
dev-logs:
docker compose logs -f
dev-ps:
docker compose ps
secrets:
$(INFISICAL_SECRETS)
secrets-set:
@if [ -z "$(KEY)" ] || [ -z "$(VALUE)" ]; then \
echo "Usage: make secrets-set KEY=MY_KEY VALUE=my_value"; exit 1; \
fi
$(INFISICAL) --domain $(INFISICAL_DOMAIN) secrets set $(KEY)=$(VALUE) --env=$(ENV)
check-loc:
bash scripts/check-loc.sh
+9 -6
@@ -42,23 +42,26 @@ All containers share the external `breakpilot-network` Docker network and depend
## Quick Start
-**Prerequisites:** Docker, Go 1.24+, Python 3.12+, Node.js 20+
+**Prerequisites:** Docker, Go 1.24+, Python 3.12+, Node.js 20+, [Infisical CLI](https://infisical.com/docs/cli/overview)
```bash
git clone ssh://git@gitea.meghsakha.com:22222/Benjamin_Boenisch/breakpilot-compliance.git
cd breakpilot-compliance
-# Copy and populate secrets (never commit .env)
-cp .env.example .env
+# One-time per machine: log in to the self-hosted Infisical instance
+infisical login --domain https://secrets.meghsakha.com
-# Start all services
-docker compose up -d
+# Start the full stack with secrets injected from Infisical (env=dev)
+make dev
```
+Secrets are pulled from Infisical (`secrets.meghsakha.com`) at runtime; `.env` files are not used. See [INFISICAL_SETUP.md](./INFISICAL_SETUP.md) for full onboarding, and `make help` for the rest of the targets (`dev-build`, `dev-down`, `secrets`, `secrets-set`).
For the Orca/Hetzner production target (x86_64), use the override:
```bash
-docker compose -f docker-compose.yml -f docker-compose.hetzner.yml up -d
+make dev ENV=prod # or:
+infisical run --env=prod -- docker compose -f docker-compose.yml -f docker-compose.hetzner.yml up -d
```
---
@@ -35,6 +35,25 @@ Dies ist ein **Legal RAG**. Eine falsch zitierte Fundstelle ist schlimmer als ga
- **Internal IDs** (control IDs such as SEC-xxxx, MC-/M-numbers) do NOT belong in the user-facing answer
as the main statement: state the obligation in plain language, with an ID at most appended in parentheses.
## Corpus authority & currency: the context beats your memory (CRITICAL)
Laws change after your training cutoff. The provided RAG/controls context reflects
the CURRENT state of the law; your training knowledge may be outdated. This rule applies to
FACTS, not only to citations (it supplements **Quellentreue**, i.e. source fidelity).
- Legal **facts** (thresholds, deadlines, figures, whether or from when an obligation applies,
competent authorities) are taken EXCLUSIVELY from the provided context. Your training knowledge
serves only for language, structure and reasoning, **never as a legal source**.
- If a requested fact is NOT in the context: do NOT output a number, deadline or threshold
recalled from memory, not even in passing in running text without a citation. Say openly that
you cannot substantiate it from your verified sources, describe the obligation or topic in general
terms, and offer the next step (look it up specifically / verify with the DPO or a lawyer).
- **Conflict transparency**: if the context deviates from what seems "familiar" to you, the
context ALWAYS wins. Feel free to make this transparent, e.g. "The current source states 20; a
possibly older, commonly held assumption (10) does not apply here."
- **Co-pilot tone, no robotic refusal**: phrase it as "I cannot substantiate X from my verified
sources; I can look it up specifically, or you can clarify it with your DPO/lawyer"
instead of a hard "no". You remain a helpful companion, but you do not hand the user an
unverified legal statement as fact.
## Area of competence
- GDPR (DSGVO) Art. 1-99 + recitals
- BDSG (Bundesdatenschutzgesetz, the German Federal Data Protection Act)
@@ -80,7 +80,7 @@ export async function POST(request: NextRequest) {
let systemContent = soulPrompt || FALLBACK_SYSTEM_PROMPT
if (validCountry) systemContent += countryBlock(validCountry)
if (ragContext) {
-systemContent += `\n\n## Relevanter Kontext aus dem RAG-System\n\nNutze die folgenden Quellen fuer deine Antwort. Verweise in deiner Antwort auf die jeweilige Quelle:\n\n${ragContext}`
+systemContent += `\n\n## Relevanter Kontext aus dem RAG-System (deine EINZIGEN Rechtsquellen)\n\nDies sind deine einzigen zulaessigen Rechtsquellen. Triff keine konkrete Rechtsaussage (Zahl, Frist, Schwelle, Pflicht, Fundstelle), die nicht hier oder im Controls-Block belegt ist — sonst sage offen, dass du sie aus deinen Quellen nicht belegen kannst. Verweise in deiner Antwort auf die jeweilige Quelle:\n\n${ragContext}`
}
if (controlsContext) systemContent += `\n\n${controlsContext}`
systemContent += `\n\n## Aktueller SDK-Schritt\nDer Nutzer befindet sich im SDK-Schritt: ${currentStep}`
@@ -46,6 +46,28 @@ export interface CorpusOverview {
totals: { documents: number; catalog_sources: number }
}
// --- Ingested legal-corpus structure (from the vector store, via the Go SDK).
// Shows WHAT each eur-lex act consists of (articles/annexes/recitals), so the
// ingested corpus is not a black box for developers. ---
export interface LegalActStructure {
regulation_short: string
regulation_name: string
articles: number
annexes: number
recitals: number
chunks: number
}
export interface LegalCorpus {
regulations: LegalActStructure[]
totals: {
regulations: number
articles: number
annexes: number
recitals: number
}
}
// --- Corpus documents: group by type (law/guideline/standard/court ruling)
// + publisher family (DSK, EDPB, OWASP, NIST …). Deterministic, pure. ---
interface DocCat {
+83 -3
@@ -3,6 +3,7 @@ import Link from 'next/link'
import {
type UseCaseRow,
type CorpusOverview,
type LegalCorpus,
licenseTierBadgeClass,
commercialBadgeClass,
groupUseCases,
@@ -11,28 +12,46 @@ import {
const BACKEND_URL =
process.env.COMPLIANCE_BACKEND_URL || 'http://backend-compliance:8002'
// The legal-corpus structure comes from the Go SDK (it owns the vector store).
const SDK_URL = process.env.SDK_URL || 'http://ai-compliance-sdk:8090'
export const dynamic = 'force-dynamic'
// Fetched from the SDK and isolated in its own try/catch so a vector-store
// hiccup degrades to "no structure shown" instead of blanking the whole page.
async function fetchLegalCorpus(): Promise<LegalCorpus | null> {
try {
const res = await fetch(`${SDK_URL}/sdk/v1/rag/legal-corpus`, {
cache: 'no-store',
})
return res.ok ? await res.json() : null
} catch {
return null
}
}
async function getData(): Promise<{
useCases: UseCaseRow[]
corpus: CorpusOverview | null
legalCorpus: LegalCorpus | null
}> {
try {
-const [ucRes, corpusRes] = await Promise.all([
+const [ucRes, corpusRes, legalCorpus] = await Promise.all([
fetch(`${BACKEND_URL}/api/compliance/v1/controls/use-cases`, {
cache: 'no-store',
}),
fetch(`${BACKEND_URL}/api/compliance/v1/controls/corpus`, {
cache: 'no-store',
}),
fetchLegalCorpus(),
])
return {
useCases: ucRes.ok ? await ucRes.json() : [],
corpus: corpusRes.ok ? await corpusRes.json() : null,
legalCorpus,
}
} catch {
-return { useCases: [], corpus: null }
+return { useCases: [], corpus: null, legalCorpus: null }
}
}
@@ -46,7 +65,7 @@ function Stat({ label, value }: { label: string; value: string | number }) {
}
export default async function CoveragePage() {
-const { useCases, corpus } = await getData()
+const { useCases, corpus, legalCorpus } = await getData()
const groups = groupUseCases(useCases)
const totalRelevant = useCases.reduce((s, u) => s + u.atom_relevant, 0)
const totalAtoms = useCases.reduce((s, u) => s + u.atom_total, 0)
@@ -221,6 +240,67 @@ export default async function CoveragePage() {
</div>
</section>
{legalCorpus?.regulations?.length ? (
<section className="space-y-2">
<h2 className="text-lg font-semibold text-gray-900">
Ingestierter Rechtskorpus Struktur ({legalCorpus.totals.regulations}{' '}
Rechtsakte)
</h2>
<p className="text-xs text-gray-500">
Woraus jeder ingestierte eur-lex-Rechtsakt tatsächlich besteht:
Artikel (§), Anhänge, Erwägungsgründe und retrievbare Chunks direkt
aus dem Vektorspeicher, damit kein Black-Box-Korpus entsteht.
</p>
<div className="overflow-auto rounded-lg border border-gray-200">
<table className="min-w-full divide-y divide-gray-200 text-sm">
<thead className="bg-gray-50 text-left text-xs uppercase text-gray-500">
<tr>
<th className="px-4 py-2">Rechtsakt</th>
<th className="px-4 py-2 text-right">Artikel (§)</th>
<th className="px-4 py-2 text-right">Anhänge</th>
<th className="px-4 py-2 text-right">Erwägungsgründe</th>
<th className="px-4 py-2 text-right">Chunks</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-100 bg-white">
{legalCorpus.regulations.map((r) => (
<tr key={r.regulation_short}>
<td className="px-4 py-2 text-gray-900">
<span className="font-medium">{r.regulation_short}</span>
{r.regulation_name !== r.regulation_short ? (
<span className="ml-2 text-xs text-gray-500">
{r.regulation_name}
</span>
) : null}
</td>
<td className="px-4 py-2 text-right font-semibold">
{r.articles.toLocaleString('de-DE')}
</td>
<td className="px-4 py-2 text-right">
{r.annexes > 0 ? (
r.annexes.toLocaleString('de-DE')
) : (
<span className="text-gray-300"></span>
)}
</td>
<td className="px-4 py-2 text-right text-gray-500">
{r.recitals > 0 ? (
r.recitals.toLocaleString('de-DE')
) : (
<span className="text-gray-300"></span>
)}
</td>
<td className="px-4 py-2 text-right text-gray-500">
{r.chunks.toLocaleString('de-DE')}
</td>
</tr>
))}
</tbody>
</table>
</div>
</section>
) : null}
{corpus?.license_catalog?.length ? (
<section className="space-y-2">
<h2 className="text-lg font-semibold text-gray-900">
+4 -5
@@ -55,8 +55,7 @@ linters-settings:
rules:
- name: exported
arguments:
- - checkPrivateReceivers: false
- - disableStutteringCheck: true
+ - disableStutteringCheck
- name: error-return
- name: increment-decrement
- name: var-declaration
@@ -83,6 +82,6 @@ issues:
max-issues-per-linter: 50
max-same-issues: 5
-# New code only: don't fail on pre-existing issues in files we haven't touched.
-# Remove this once a clean baseline is established.
-new: false
+# New code only: lint lines changed vs main, so pre-existing debt doesn't fail CI.
+# Needs the go-lint job to clone with a local `main` ref (see .gitea/workflows/ci.yaml).
+new-from-merge-base: main
@@ -211,6 +211,13 @@ func (h *IACEHandler) InitializeProject(c *gin.Context) {
}
for _, cat := range mp.HazardCats {
// Native cyber/AI categories (frontend groups I+J) belong to the
// CRA module, not the traditional CE (ISO 12100) hazard log.
// Enforced centrally here so it holds for EVERY project.
if isCyberSecurityCategory(cat) {
fmt.Printf("CYBER-SKIP: cat=%s pattern=%s — routed to CRA module\n", cat, mp.PatternID)
continue
}
maxForCat := categoryHazardCap(cat, len(comps))
if catCount[cat] >= maxForCat {
continue
@@ -0,0 +1,45 @@
package handlers
// Safety/Security separation for the IACE hazard log.
//
// The traditional CE risk assessment (Maschinenrichtlinie / EN ISO 12100) and
// the cybersecurity assessment (Cyber Resilience Act) are two distinct steps.
// IACE owns the traditional, physical + functional-safety hazards; the CRA
// module (/sdk/iace/{id}/cra) owns the native cyber/AI topics and re-examines
// which safety functions a cyber attack can re-open (see iace-safety-bridge).
//
// The split is by the NATURE of the hazard, not by the component: a control
// fault, bus failure or botched update is FUNCTIONAL safety (random/systematic
// fault) and stays in CE — independent of whether the controller is a bought-in
// CE-marked PLC or the manufacturer's own embedded control. Only the security
// PROPERTIES against malicious actors (access control, firmware/update
// integrity, SBOM, vulnerability handling, default passwords) are CRA.
//
// Functional-safety control categories (software_control, software_fault,
// safety_function_failure, configuration_error, communication_failure,
// update_failure, sensor_fault, …) therefore intentionally STAY in IACE — they
// are the safety functions whose loss the CRA bridge re-examines.
//
// Enforced centrally in InitializeProject so it holds for EVERY project.
var nativeCyberSecurityCategories = map[string]bool{
	// I. Cyber / Netzwerk: security against malicious actors
	"unauthorized_access":   true,
	"firmware_corruption":   true,
	"cyber_resilience":      true,
	"logging_audit_failure": true,
	"cyber_network":         true,
	"sensor_spoofing":       true,
	// J. KI-spezifisch
	"ai_specific":          true,
	"ai_misclassification": true,
	"false_classification": true,
	"model_drift":          true,
	"data_poisoning":       true,
	"unintended_bias":      true,
}

// isCyberSecurityCategory reports whether a hazard category is a native cyber/AI
// topic that belongs to the CRA module rather than the traditional CE hazard log.
func isCyberSecurityCategory(category string) bool {
	return nativeCyberSecurityCategories[category]
}
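The routing rule in this file can be sketched as a simple partition over matched hazard categories. This is a minimal, self-contained illustration rather than the handler's actual code: `partitionCategories` is a hypothetical helper, and the category set is a trimmed copy of the map above. Only `isCyberSecurityCategory` and the category names come from the diff.

```go
package main

import "fmt"

// Trimmed copy of the category set from the diff; the real map has more entries.
var nativeCyberSecurityCategories = map[string]bool{
	"unauthorized_access": true,
	"firmware_corruption": true,
	"data_poisoning":      true,
}

// isCyberSecurityCategory mirrors the lookup shown in the diff.
func isCyberSecurityCategory(category string) bool {
	return nativeCyberSecurityCategories[category]
}

// partitionCategories is a hypothetical helper (not in the diff). It splits
// matched categories into those kept in the CE hazard log and those handed
// to the CRA module, preserving input order.
func partitionCategories(cats []string) (ce, cra []string) {
	for _, c := range cats {
		if isCyberSecurityCategory(c) {
			cra = append(cra, c)
			continue
		}
		ce = append(ce, c)
	}
	return ce, cra
}

func main() {
	ce, cra := partitionCategories([]string{
		"thermal_hazard", "unauthorized_access", "update_failure", "data_poisoning",
	})
	fmt.Println("CE hazard log:", ce)
	fmt.Println("CRA module:   ", cra)
}
```

Because a missing map key reads as `false` in Go, any category that is not listed stays in the CE log by default, so a new security category has to be added to the map explicitly before it is routed to the CRA module.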
@@ -0,0 +1,37 @@
package handlers
import "testing"
func TestIsCyberSecurityCategory_RoutedToCRA(t *testing.T) {
cyber := []string{
"unauthorized_access", "firmware_corruption", "cyber_resilience",
"logging_audit_failure", "cyber_network", "sensor_spoofing",
"ai_specific", "ai_misclassification", "false_classification",
"model_drift", "data_poisoning", "unintended_bias",
}
for _, c := range cyber {
if !isCyberSecurityCategory(c) {
t.Errorf("category %q must be routed to the CRA module, not the traditional IACE log", c)
}
}
}
func TestIsCyberSecurityCategory_StaysInIACE(t *testing.T) {
// Physical + functional-safety categories must remain in the traditional CE
// hazard log. communication_failure (bus failure -> loss of control) and
// update_failure (botched update -> lost safety function) are FUNCTIONAL
// faults, not attacks, so they stay too.
keep := []string{
"mechanical_hazard", "electrical_hazard", "thermal_hazard",
"pneumatic_hydraulic", "noise_vibration", "ergonomic_hazard",
"material_environmental", "chemical_risk", "fire_explosion",
"software_control", "software_fault", "safety_function_failure",
"configuration_error", "sensor_fault", "hmi_error",
"communication_failure", "update_failure",
}
for _, c := range keep {
if isCyberSecurityCategory(c) {
t.Errorf("category %q must stay in the traditional IACE log, not be routed to CRA", c)
}
}
}
@@ -75,9 +75,10 @@ func (h *RAGHandlers) Search(c *gin.Context) {
}
c.JSON(http.StatusOK, gin.H{
-"query": req.Query,
-"results": results,
-"count": len(results),
+"query": req.Query,
+"results": results,
+"count": len(results),
+"assessment": ucca.Assess(results),
})
}
@@ -206,3 +207,32 @@ func (h *RAGHandlers) HandleScrollChunks(c *gin.Context) {
"total": len(chunks),
})
}
// LegalCorpusStructure returns the composition (distinct articles, annexes,
// recitals + chunk count) of every ingested eur-lex legal act, so the coverage
// page can show WHAT was ingested instead of just the act name.
// GET /sdk/v1/rag/legal-corpus
func (h *RAGHandlers) LegalCorpusStructure(c *gin.Context) {
	acts, err := h.ragClient.CorpusStructure(c.Request.Context())
	if err != nil {
		c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to aggregate legal corpus: " + err.Error()})
		return
	}

	arts, anns, recs := 0, 0, 0
	for _, a := range acts {
		arts += a.Articles
		anns += a.Annexes
		recs += a.Recitals
	}

	c.JSON(http.StatusOK, gin.H{
		"regulations": acts,
		"totals": gin.H{
			"regulations": len(acts),
			"articles":    arts,
			"annexes":     anns,
			"recitals":    recs,
		},
	})
}
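The totals computed by this handler can be sketched without gin or the vector store. The snippet below is an illustration under assumptions: `actStructure`, `corpusTotals` and `sumTotals` are hypothetical names, the JSON tags are inferred from the TypeScript `LegalActStructure` and `LegalCorpus` interfaces earlier in this diff, and the sample acts and counts are made-up values. The real return type of `ragClient.CorpusStructure` is not shown in the diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// actStructure is a hypothetical Go counterpart of the TypeScript
// LegalActStructure interface; field tags are inferred, not confirmed.
type actStructure struct {
	RegulationShort string `json:"regulation_short"`
	RegulationName  string `json:"regulation_name"`
	Articles        int    `json:"articles"`
	Annexes         int    `json:"annexes"`
	Recitals        int    `json:"recitals"`
	Chunks          int    `json:"chunks"`
}

// corpusTotals matches the "totals" object the handler returns.
type corpusTotals struct {
	Regulations int `json:"regulations"`
	Articles    int `json:"articles"`
	Annexes     int `json:"annexes"`
	Recitals    int `json:"recitals"`
}

// sumTotals adds up the per-act counts, as the handler's loop does.
func sumTotals(acts []actStructure) corpusTotals {
	t := corpusTotals{Regulations: len(acts)}
	for _, a := range acts {
		t.Articles += a.Articles
		t.Annexes += a.Annexes
		t.Recitals += a.Recitals
	}
	return t
}

func main() {
	// Illustrative values only; not real corpus counts.
	acts := []actStructure{
		{RegulationShort: "ACT-A", RegulationName: "Example Act A", Articles: 10, Annexes: 2, Recitals: 5, Chunks: 40},
		{RegulationShort: "ACT-B", RegulationName: "Example Act B", Articles: 7, Annexes: 0, Recitals: 3, Chunks: 21},
	}
	body, err := json.Marshal(map[string]interface{}{
		"regulations": acts,
		"totals":      sumTotals(acts),
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

As in the handler, the totals carry no chunk sum, which matches the `totals` shape of the `LegalCorpus` interface.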
+1
@@ -161,6 +161,7 @@ func registerRAGRoutes(v1 *gin.RouterGroup, h *handlers.RAGHandlers) {
ragRoutes.GET("/corpus-status", h.CorpusStatus)
ragRoutes.GET("/corpus-versions/:collection", h.CorpusVersionHistory)
ragRoutes.GET("/scroll", h.HandleScrollChunks)
ragRoutes.GET("/legal-corpus", h.LegalCorpusStructure)
}
}
@@ -0,0 +1,132 @@
package iace
// GetWarewashingPatterns returns hazard patterns for commercial warewashing
// machines (gewerbliche Geschirrspuelmaschinen / Untertisch-, Hauben-, Korb-
// und Bandspuelmaschinen). These capture the machine-specific hazards a
// Fachmann immediately expects but that the generic library did not cover:
// hot-water/steam scalding on door opening, hot surfaces, hot ware, corrosive
// detergent/rinse-aid contact, door pinch and wet-floor slipping.
//
// Every pattern is gated by the capability tag "dom_warewashing" (emitted only
// by warewashing narrative keywords in keyword_dictionary.go), so none of these
// leak into unrelated machine classes.
//
// HP range: HP2200-HP2206. ISO 12100 Annex B section identifiers only (facts);
// product standard EN 60335-2-58 (commercial dishwashing machines).
func GetWarewashingPatterns() []HazardPattern {
return []HazardPattern{
{
ID: "HP2200", NameDE: "Verbruehung durch Heisswasser/Dampf beim Oeffnen der Tuer", NameEN: "Scalding by hot water/steam when opening the door",
RequiredComponentTags: []string{"dom_warewashing", "steam_emission"},
GeneratedHazardCats: []string{"thermal_hazard"},
SuggestedMeasureIDs: []string{"M2200", "M2201", "M2202", "M2208"},
Priority: 94,
ApplicableLifecycles: []string{"normal_operation", "cleaning"},
ScenarioDE: "Beim Oeffnen der Tuer waehrend oder unmittelbar nach dem Spuelgang tritt ein Schwall aus heissem Wasser und Wrasen (Dampf) aus der Spuelkammer aus und trifft Gesicht, Haende und Arme des Bedieners.",
TriggerDE: "Tuer wird vor Programmende oder bei noch vorhandenem Restdampf geoeffnet; Tuerverriegelung fehlt oder ist ueberbrueckt; Nachspueltemperatur ca. 85 Grad C.",
HarmDE: "Verbruehung 1.-2. Grades an Gesicht, Haenden und Unterarmen; Augenreizung durch heissen Dampf.",
AffectedDE: "Bedienpersonal (Spuelkraft)",
ZoneDE: "Tuer- und Beschickungsoeffnung der Spuelkammer",
ISO12100Section: "6.2.4",
DefaultSeverity: 3, DefaultExposure: 4,
},
{
ID: "HP2201", NameDE: "Verbrennung an heissen Oberflaechen (Boiler/Tank/Spuelkammer)", NameEN: "Burn on hot surfaces (boiler/tank/wash chamber)",
RequiredComponentTags: []string{"dom_warewashing", "high_temperature"},
GeneratedHazardCats: []string{"thermal_hazard"},
SuggestedMeasureIDs: []string{"M2202", "M055", "M2208"},
Priority: 90,
ApplicableLifecycles: []string{"cleaning", "maintenance"},
ScenarioDE: "Beruehrung heisser Oberflaechen von Boiler, Tankheizkoerper oder Spuelkammerwaenden bei Reinigung, Entkalkung oder Wartung fuehrt zu Kontaktverbrennungen.",
TriggerDE: "Reinigung/Entkalkung ohne Abkuehlzeit; Eingriff in die Spuelkammer bei betriebswarmem Geraet.",
HarmDE: "Kontaktverbrennung an Haenden und Unterarmen.",
AffectedDE: "Reinigungspersonal, Wartungspersonal",
ZoneDE: "Boiler, Tankheizkoerper, Spuelkammerwaende",
ISO12100Section: "6.2.4",
DefaultSeverity: 2, DefaultExposure: 3,
},
{
ID: "HP2202", NameDE: "Verbrennung an heissem Spuelgut beim Entladen", NameEN: "Burn on hot ware when unloading",
RequiredComponentTags: []string{"dom_warewashing", "hot_water"},
GeneratedHazardCats: []string{"thermal_hazard"},
SuggestedMeasureIDs: []string{"M2202", "M055", "M2208"},
Priority: 86,
ApplicableLifecycles: []string{"normal_operation"},
ScenarioDE: "Geschirr, Glaeser und Bestecke sind nach dem Spuelgang durch die Heisswasser-Nachspuelung sehr heiss; beim Entladen kommt es zu Verbrennungen.",
TriggerDE: "Sofortiges Entnehmen des Spuelguts nach Programmende ohne Abkuehl-/Trocknungszeit.",
HarmDE: "Verbrennung an Haenden/Fingern beim Greifen heisser Teile.",
AffectedDE: "Bedienpersonal (Spuelkraft)",
ZoneDE: "Spuelkammer, Entnahmebereich/Korb",
ISO12100Section: "6.2.4",
DefaultSeverity: 2, DefaultExposure: 3,
},
{
ID: "HP2203", NameDE: "Chemische Veraetzung (Haut/Augen) durch Reiniger-/Klarspueler-Konzentrat", NameEN: "Chemical burn (skin/eyes) from detergent/rinse-aid concentrate",
RequiredComponentTags: []string{"dom_warewashing", "corrosive_chemical"},
GeneratedHazardCats: []string{"chemical_risk"},
SuggestedMeasureIDs: []string{"M2203", "M2204", "M2208"},
Priority: 92,
ApplicableLifecycles: []string{"normal_operation", "maintenance"},
ScenarioDE: "Direkter Kontakt mit dem aetzenden (alkalischen) Reiniger- bzw. Klarspueler-Konzentrat beim Nachfuellen, Sauglanzenwechsel oder bei Leckage fuehrt zu Veraetzungen von Haut und Augen.",
TriggerDE: "Gebinde-/Sauglanzenwechsel ohne Schutzausruestung; Umfuellen von Konzentrat; undichte Dosierleitung.",
HarmDE: "Veraetzung von Haut und Augen (alkalische Verletzung), bleibende Augenschaeden moeglich.",
AffectedDE: "Bedienpersonal, Reinigungspersonal beim Chemikalien-Handling",
ZoneDE: "Dosiergeraet, Reiniger-/Klarspueler-Gebinde, Sauglanzen",
ISO12100Section: "6.2.4",
DefaultSeverity: 3, DefaultExposure: 3,
ClarificationQuestionsDE: []string{
"Liegt fuer alle eingesetzten Reiniger/Klarspueler/Entkalker ein aktuelles Sicherheitsdatenblatt (SDB) am Geraet vor?",
"Ist ein geschlossenes Dosiersystem mit Sauglanzen vorhanden, sodass kein Umfuellen noetig ist?",
},
},
{
ID: "HP2204", NameDE: "Reizung/Veraetzung der Atemwege durch Reinigungs-Aerosole/Daempfe", NameEN: "Respiratory irritation from cleaning aerosols/vapours",
RequiredComponentTags: []string{"dom_warewashing", "corrosive_chemical"},
GeneratedHazardCats: []string{"chemical_risk"},
SuggestedMeasureIDs: []string{"M2205", "M2203", "M2204"},
Priority: 82,
ApplicableLifecycles: []string{"normal_operation", "maintenance"},
ScenarioDE: "Aerosole und Daempfe der Reinigungschemie (insbesondere beim Oeffnen kurz nach dem Spuelgang oder bei der Entkalkung mit Saeure) gelangen in die Atemzone und reizen Atemwege und Schleimhaeute.",
TriggerDE: "Oeffnen bei laufender/heisser Chemie; Entkalkung mit Saeure; unzureichende Lueftung des Aufstellbereichs.",
HarmDE: "Reizung von Atemwegen, Augen und Schleimhaeuten; bei Saeure-/Laugen-Vermischung gefaehrliche Gase.",
AffectedDE: "Bedienpersonal, Reinigungspersonal",
ZoneDE: "Atemzone vor der Spuelkammer, Aufstellbereich",
ISO12100Section: "6.2.4",
DefaultSeverity: 2, DefaultExposure: 2,
ClarificationQuestionsDE: []string{
"Ist der Aufstellbereich ausreichend be-/entlueftet (Kuechenlueftung)?",
"Wird in der BA vor dem Vermischen von Reiniger und Entkalker/Saeure gewarnt?",
},
},
{
ID: "HP2205", NameDE: "Quetschen der Finger an der Tuer/Haube", NameEN: "Finger crushing at the door/hood",
RequiredComponentTags: []string{"dom_warewashing", "access_door"},
GeneratedHazardCats: []string{"mechanical_hazard"},
SuggestedMeasureIDs: []string{"M2206", "M003", "M2208"},
Priority: 78,
ApplicableLifecycles: []string{"normal_operation"},
ScenarioDE: "Beim Schliessen der Tuer bzw. Absenken der Haube werden Finger zwischen Tuer/Haube und Gehaeuse gequetscht.",
TriggerDE: "Greifen in den Schliessbereich beim Schliessen; hohe Schliesskraft der Haube; scharfe Kanten.",
HarmDE: "Quetschung und Prellung der Finger.",
AffectedDE: "Bedienpersonal (Spuelkraft)",
ZoneDE: "Tuer-/Haubenkante, Schliessbereich",
ISO12100Section: "6.2.3",
DefaultSeverity: 1, DefaultExposure: 3,
},
{
ID: "HP2206", NameDE: "Ausrutschen auf nassem Boden (Wasseraustritt/Leckage)", NameEN: "Slipping on wet floor (water leakage)",
RequiredComponentTags: []string{"dom_warewashing"},
GeneratedHazardCats: []string{"mechanical_hazard"},
SuggestedMeasureIDs: []string{"M2207", "M538", "M2208"},
Priority: 76,
ApplicableLifecycles: []string{"normal_operation", "cleaning", "maintenance"},
ScenarioDE: "Aus der Spuelmaschine austretendes Wasser (Beschickung, Tuer oeffnen, Leckage, Tankwasserwechsel) macht den Boden im Aufstellbereich rutschig; der Bediener rutscht aus.",
TriggerDE: "Wasseraustritt beim Oeffnen/Beschicken; undichter Ablauf; fehlender Bodenablauf.",
HarmDE: "Sturz mit Prellungen, Knochenbruechen oder Kopfaufprall.",
AffectedDE: "Bedienpersonal, Reinigungspersonal",
ZoneDE: "Aufstell- und Bedienbereich der Spuelmaschine",
ISO12100Section: "6.3.5.6",
DefaultSeverity: 2, DefaultExposure: 3,
},
}
}
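The capability-tag gating described in the file comment can be illustrated with a small matcher. This sketch assumes a pattern fires only when every entry in `RequiredComponentTags` is present (AND semantics). The tests in this diff are consistent with that reading, but the real `PatternEngine.Match` is not shown and may apply further conditions. The `pattern` type and the `fires` and `tagSet` functions are hypothetical.

```go
package main

import "fmt"

// pattern is a hypothetical, trimmed stand-in for HazardPattern.
type pattern struct {
	ID                    string
	RequiredComponentTags []string
}

// fires reports whether every required tag is present in the tag set.
// AND semantics are an assumption of this sketch.
func fires(p pattern, tags map[string]bool) bool {
	for _, req := range p.RequiredComponentTags {
		if !tags[req] {
			return false
		}
	}
	return true
}

// tagSet converts a tag list into a set for constant-time lookups.
func tagSet(tags ...string) map[string]bool {
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return set
}

func main() {
	scald := pattern{ID: "HP2200", RequiredComponentTags: []string{"dom_warewashing", "steam_emission"}}
	slip := pattern{ID: "HP2206", RequiredComponentTags: []string{"dom_warewashing"}}

	dishwasher := tagSet("dom_warewashing", "steam_emission", "hot_water")
	other := tagSet("steam_emission", "high_temperature") // no domain tag

	fmt.Println("dishwasher:", fires(scald, dishwasher), fires(slip, dishwasher))
	fmt.Println("other:     ", fires(scald, other), fires(slip, other))
}
```

In this sketch `HP2206` requires only `dom_warewashing`, so the domain tag alone keeps the wet-floor pattern out of other machine classes.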
@@ -0,0 +1,112 @@
package iace
import "testing"
// firedSet runs the engine for the given custom tags and returns the set of
// fired pattern IDs.
func firedSet(customTags []string) map[string]bool {
engine := NewPatternEngine()
out := engine.Match(MatchInput{CustomTags: customTags})
fired := make(map[string]bool, len(out.MatchedPatterns))
for _, m := range out.MatchedPatterns {
fired[m.PatternID] = true
}
return fired
}
// A warewashing narrative emits these capability + functional tags.
var warewashingTags = []string{
"dom_warewashing", "steam_emission", "hot_water", "high_temperature",
"corrosive_chemical", "access_door", "rotating_part",
}
func TestWarewashing_PatternsFireForDishwasher(t *testing.T) {
fired := firedSet(warewashingTags)
want := []string{"HP2200", "HP2201", "HP2202", "HP2203", "HP2204", "HP2205", "HP2206"}
for _, id := range want {
if !fired[id] {
t.Errorf("expected warewashing pattern %s to fire for a dishwasher, but it did not", id)
}
}
}
func TestWarewashing_PatternsDoNotLeakIntoOtherMachines(t *testing.T) {
// A machine with thermal + electrical + chemical capability but NOT a
// dishwasher must never produce warewashing hazards (dom_warewashing gate).
fired := firedSet([]string{"high_temperature", "electrical_part", "chemical_risk", "rotating_part", "moving_part"})
for _, id := range []string{"HP2200", "HP2201", "HP2202", "HP2203", "HP2204", "HP2205", "HP2206"} {
if fired[id] {
t.Errorf("warewashing pattern %s leaked into a non-dishwasher machine", id)
}
}
}
func TestWarewashing_WeldingAndGlueDoNotLeakIntoDishwasher(t *testing.T) {
// The gate-term additions must stop the welding/flame/glue burn patterns
// from firing for a dishwasher (they previously leaked via high_temperature
// / electrical_part). dom_welding/dom_flame/dom_glue are absent here.
fired := firedSet(warewashingTags)
leak := map[string]string{
"HP530": "Lichtbogen-Verbrennung (Schweissen)",
"HP532": "Schweissrauch",
"HP533": "Brand durch Schweissfunken (Schweissen)",
}
for id, name := range leak {
if fired[id] {
t.Errorf("cross-domain pattern %s (%s) leaked into a dishwasher", id, name)
}
}
}
func TestWarewashing_MeasureIDsExist(t *testing.T) {
lib := GetProtectiveMeasureLibrary()
have := make(map[string]bool, len(lib))
for _, m := range lib {
have[m.ID] = true
}
for _, p := range GetWarewashingPatterns() {
for _, mid := range p.SuggestedMeasureIDs {
if !have[mid] {
t.Errorf("pattern %s references measure %s which is not in the library", p.ID, mid)
}
}
}
}
func TestWarewashing_NarrativeEmitsTags(t *testing.T) {
// Closes the loop: a realistic dishwasher description must emit the tags
// the warewashing patterns gate on (otherwise the patterns are dead).
narrative := "Gewerbliche Untertisch-Geschirrspuelmaschine mit Heisswasser-Boiler " +
"und Nachspuelung ca. 85 Grad C, Spuelpumpe mit rotierenden Spuelfeldern, " +
"Dampf-/Wrasenabgabe beim Oeffnen, Reiniger und Klarspueler ueber Dosiergeraet, " +
"Tuer mit Sicherheitsschalter, Eingreifen in die Spuelkammer."
res := ParseNarrative(narrative, "Gewerbliche Geschirrspuelmaschine")
got := make(map[string]bool, len(res.CustomTags))
for _, tag := range res.CustomTags {
got[tag] = true
}
for _, want := range []string{"dom_warewashing", "steam_emission", "hot_water", "corrosive_chemical", "access_door", "rotating_part"} {
if !got[want] {
t.Errorf("narrative did not emit expected tag %q (got %v)", want, res.CustomTags)
}
}
// And it must NOT emit any welding/flame/glue domain that would re-open leaks.
for _, bad := range []string{"dom_welding", "dom_flame", "dom_glue"} {
if got[bad] {
t.Errorf("dishwasher narrative unexpectedly emitted cross-domain tag %q", bad)
}
}
}
func TestWarewashing_NewMeasuresPresent(t *testing.T) {
lib := GetProtectiveMeasureLibrary()
have := make(map[string]bool, len(lib))
for _, m := range lib {
have[m.ID] = true
}
for _, mid := range []string{"M2200", "M2201", "M2202", "M2203", "M2204", "M2205", "M2206", "M2207", "M2208"} {
if !have[mid] {
t.Errorf("expected warewashing measure %s to be registered in the library", mid)
}
}
}
@@ -88,6 +88,21 @@ func GetKeywordDictionary() []KeywordEntry {
{Keywords: []string{"folienwickler", "wickelmaschine", "konfektioniermaschine", "folienverpackung", "wellpappe"}, ExtraTags: []string{"dom_converting"}},
{Keywords: []string{"bergbau", "untertage", "tunnelbau", "off-grid"}, ExtraTags: []string{"dom_remote"}},
{Keywords: []string{"asbest", "asbestsanierung", "asbestexposition"}, ExtraTags: []string{"dom_asbestos"}},
{Keywords: []string{"gasbrenner", "brennerbetrieb", "offene flamme", "flammhaert", "abflammen", "flammrichten"}, ExtraTags: []string{"dom_flame"}},
{Keywords: []string{"heissleim", "heissleimanlage", "schmelzkleber", "schmelzklebstoff", "klebstoffschmelzer", "leimwerk"}, ExtraTags: []string{"dom_glue"}},
// ── Gewerbliche Spuelmaschine / Warewashing ──────────────────────
// dom_warewashing gates the warewashing-specific patterns
// (hazard_patterns_warewashing.go) so they never leak into other
// machine classes. The functional tags (hot_water, steam_emission,
// corrosive_chemical, access_door) are the within-domain triggers.
{Keywords: []string{"spuelmaschine", "geschirrspuelmaschine", "geschirrspueler", "haubenspuelmaschine", "untertischspuelmaschine", "korbspuelmaschine", "bandspuelmaschine", "glaeserspuelmaschine", "bistrospuelmaschine", "warewashing", "dishwasher"}, ExtraTags: []string{"dom_warewashing"}},
{Keywords: []string{"heisswasser", "nachspuelung", "nachspueltemperatur", "spuelgang", "spuelzyklus", "thermostopp", "thermostop"}, ExtraTags: []string{"hot_water", "high_temperature"}},
{Keywords: []string{"dampf", "wrasen", "schwaden", "brueden"}, ExtraTags: []string{"steam_emission", "high_temperature"}},
{Keywords: []string{"boiler", "spuelboiler", "nachspuelboiler", "tankheiz", "boilerheiz"}, ComponentIDs: []string{"C094"}, ExtraTags: []string{"heating_element", "high_temperature"}},
{Keywords: []string{"reiniger", "klarspueler", "spuelmittel", "reinigungsmittel", "reinigerkonzentrat", "spuelchemie", "dosiergeraet", "dosierpumpe", "sauglanze", "entkalker"}, ExtraTags: []string{"corrosive_chemical"}},
{Keywords: []string{"spuelarm", "spuelfeld", "wascharm", "spruehfeld"}, ComponentIDs: []string{"C004"}, ExtraTags: []string{"rotating_part"}},
{Keywords: []string{"spuelkammer", "spueltuer", "geraetetuer", "haubentuer", "klapptuer"}, ExtraTags: []string{"access_door"}},
// Ghost closure (emit side): makes the 34 dead required tags
// emittable, each ONLY via domain-specific keywords -> the 120
// ghost patterns fire again, but only for their actual machine (no
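How a narrative might turn into tags through the dictionary entries added in this hunk can be sketched as follows. This is a simplified illustration that assumes case-insensitive substring matching; the actual `ParseNarrative` implementation is not part of this diff and may normalize or tokenize differently. `keywordEntry`, `sampleDict` and `emitTags` are hypothetical names, and the two dictionary entries are shortened copies of entries from the hunk.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// keywordEntry is a hypothetical, trimmed stand-in for KeywordEntry.
type keywordEntry struct {
	Keywords  []string
	ExtraTags []string
}

// sampleDict holds shortened copies of two entries from the diff.
var sampleDict = []keywordEntry{
	{Keywords: []string{"geschirrspuelmaschine", "dishwasher"}, ExtraTags: []string{"dom_warewashing"}},
	{Keywords: []string{"dampf", "wrasen"}, ExtraTags: []string{"steam_emission", "high_temperature"}},
}

// emitTags returns the sorted, de-duplicated tags of every entry that has at
// least one keyword occurring in the narrative. Substring matching on the
// lowercased text is an assumption of this sketch.
func emitTags(narrative string, dict []keywordEntry) []string {
	text := strings.ToLower(narrative)
	seen := map[string]bool{}
	for _, entry := range dict {
		for _, kw := range entry.Keywords {
			if strings.Contains(text, kw) {
				for _, tag := range entry.ExtraTags {
					seen[tag] = true
				}
				break // one matching keyword is enough for this entry
			}
		}
	}
	tags := make([]string, 0, len(seen))
	for tag := range seen {
		tags = append(tags, tag)
	}
	sort.Strings(tags)
	return tags
}

func main() {
	fmt.Println(emitTags("Gewerbliche Geschirrspuelmaschine mit Wrasenabgabe", sampleDict))
}
```

Under this assumption a broad keyword such as `dampf` would emit `steam_emission` for any machine, which is why the warewashing patterns additionally require `dom_warewashing`.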
@@ -22,6 +22,7 @@ func GetProtectiveMeasureLibrary() []ProtectiveMeasureEntry {
all = append(all, getGTBremseMeasures()...) // GT-Bremse-Coverage-Gaps (M483-M522)
all = append(all, GetCRAMeasures()...) // CRA / DIN EN 40000-1-2 cyber-resilience (M540-M548)
all = append(all, getLiftEndstopMeasures()...) // Lift/hoist endstop (M600-M604) — bridges OSHA MD library
all = append(all, getWarewashingMeasures()...) // Commercial dishwasher (M2200-M2208) — scald/chemical/door/slip
return all
}
@@ -0,0 +1,69 @@
package iace
// getWarewashingMeasures returns protective measures for commercial warewashing
// machines (gewerbliche Geschirrspuelmaschinen): hot-water/steam scalding,
// hot surfaces, corrosive cleaning chemicals, door pinch and wet-floor slip.
// They complement the generic thermal/mechanical/material measures with the
// machine-specific controls a Fachmann expects for this product class.
//
// M-ID range: M2200-M2208. Norm identifiers only (facts) — no norm text is
// reproduced (DIN/Beuth license). Lead standard: EN 60335-2-58 (safety of
// commercial electric dishwashing machines).
func getWarewashingMeasures() []ProtectiveMeasureEntry {
return []ProtectiveMeasureEntry{
{ID: "M2200", ReductionType: "design", SubType: "interlock",
Name: "Tuer-/Haubenverriegelung beendet Spuelgang vor dem Oeffnen",
Description: "Die Tuer bzw. Haube ist so mit der Steuerung verriegelt, dass beim Oeffnen Spuelpumpe und Nachspuelung sofort abschalten und ein Oeffnen erst nach Programmende (bzw. nach Abbau des Restdampfs) freigegeben wird. Verhindert den Schwall aus Heisswasser/Wrasen und den Kontakt mit noch rotierenden Spuelfeldern.",
HazardCategory: "thermal",
Examples: []string{"Tuerkontaktschalter schaltet Pumpe + Heizung beim Oeffnen ab", "Rastposition mit Restdampf-Verzoegerung vor Freigabe"},
NormReferences: []string{"EN 60335-2-58", "EN ISO 12100 — Inhaerent sichere Konstruktion"}},
{ID: "M2201", ReductionType: "design", SubType: "thermal",
Name: "Wrasen-/Dampfreduzierung (Kondensations- / Waermerueckgewinnungssystem)",
Description: "Der beim Oeffnen austretende Wrasen wird durch ein Kondensations- bzw. Waermerueckgewinnungssystem reduziert, sodass beim Entnehmen kein gefaehrlicher Dampfschwall entsteht. Senkt zugleich die Restwaerme- und Feuchtebelastung am Arbeitsplatz.",
HazardCategory: "thermal",
Examples: []string{"Umluft-Waermerueckgewinnung reduziert austretenden Wrasen", "Kondensationshaube ueber der Spuelkammer"},
NormReferences: []string{"EN 60335-2-58"}},
{ID: "M2202", ReductionType: "protection", SubType: "monitoring",
Name: "Thermostop / Temperaturueberwachung von Boiler und Tank",
Description: "Boiler- und Tanktemperatur werden ueberwacht; ein Thermostop gibt den naechsten Schritt erst frei, wenn die Solltemperatur erreicht ist, und begrenzt die maximale Nachspueltemperatur. Schuetzt vor Verbruehung durch unkontrolliert heisses Nachspuelwasser.",
HazardCategory: "thermal",
Examples: []string{"Temperatursensor in Boiler und Tank mit Abschaltgrenze", "Thermostop-Funktion im Spuelprogramm"},
NormReferences: []string{"EN 60335-2-58", "EN ISO 13732-1"}},
{ID: "M2203", ReductionType: "design", SubType: "containment",
Name: "Geschlossenes Dosiersystem mit Sauglanzen und Niveauueberwachung",
Description: "Reiniger und Klarspueler werden ausschliesslich ueber ein geschlossenes Dosiersystem mit Sauglanzen aus dem Originalgebinde gefoerdert (Niveau-Ueberwachung statt Umfuellen). Direkter Haut-/Augenkontakt mit dem aetzenden Konzentrat beim Nachfuellen wird konstruktiv vermieden.",
HazardCategory: "material_environmental",
Examples: []string{"Sauglanze mit Leermeldung im Reiniger-Kanister", "Kein Umfuellen — Gebindewechsel ohne offenen Chemiekontakt"},
NormReferences: []string{"EN 60335-2-58", "Verordnung (EG) Nr. 1272/2008 (CLP/GHS)"}},
{ID: "M2204", ReductionType: "information", SubType: "ppe",
Name: "PSA (Augen-/Hautschutz) + GHS-Kennzeichnung und Sicherheitsdatenblatt",
Description: "Fuer Handhabung, Gebindewechsel und Entkalkung werden Augen- und Handschutz vorgeschrieben; Reiniger/Klarspueler/Entkalker sind GHS-gekennzeichnet und das Sicherheitsdatenblatt liegt am Geraet vor. Stellt die sichere Handhabung der aetzenden Konzentrate sicher.",
HazardCategory: "material_environmental",
Examples: []string{"Schutzbrille + chemikalienbestaendige Handschuhe bei Gebindewechsel", "GHS-Etikett und SDB im Chemikalienschrank am Geraet"},
NormReferences: []string{"Verordnung (EG) Nr. 1272/2008 (CLP/GHS)", "TRGS 500"}},
{ID: "M2205", ReductionType: "protection", SubType: "ventilation",
Name: "Be-/Entlueftung bzw. geschlossene Haube gegen Chemie-Aerosole und Wrasen",
Description: "Der Aufstellbereich ist ausreichend be- und entlueftet bzw. die Spuelkammer bleibt waehrend des Programms geschlossen, sodass Reinigungs-Aerosole und heisser Wrasen nicht in die Atemzone des Bedieners gelangen.",
HazardCategory: "material_environmental",
Examples: []string{"Kuechenlueftung ueber dem Spuelbereich", "Programmstart nur bei geschlossener Haube"},
NormReferences: []string{"EN 60335-2-58", "TRGS 500"}},
{ID: "M2206", ReductionType: "design", SubType: "geometry",
Name: "Tuerkanten mit geringer Schliesskraft / Einklemmschutz",
Description: "Die Tuer-/Haubenmechanik ist so gestaltet (gefuehrte Bewegung, begrenzte Schliesskraft, abgerundete Kanten), dass beim Schliessen keine Finger gequetscht werden.",
HazardCategory: "mechanical",
Examples: []string{"Gefuehrte Haube mit gedaempfter Schliessbewegung", "Abgerundete Tuerkanten ohne Quetschspalt"},
NormReferences: []string{"EN 60335-2-58", "EN ISO 12100 — Geometrie und Anordnung"}},
{ID: "M2207", ReductionType: "design", SubType: "environment",
Name: "Rutschhemmender Bodenbelag + Ablauf/Leckagewanne im Aufstellbereich",
Description: "Im Aufstell- und Bedienbereich der Spuelmaschine sorgen rutschhemmender Bodenbelag und ein definierter Ablauf bzw. eine Leckagewanne dafuer, dass austretendes Wasser nicht zur Sturzgefahr wird.",
HazardCategory: "mechanical",
Examples: []string{"Rutschhemmender Industrieboden (Bewertungsgruppe R11/R12)", "Bodenablauf bzw. Leckagewanne unter dem Geraet"},
NormReferences: []string{"ASR A1.5/1,2", "DGUV Regel 108-003"}},
{ID: "M2208", ReductionType: "information", SubType: "signage",
Name: "Warnhinweis heisser Dampf/Heisswasser — Tuer erst nach Programmende oeffnen",
Description: "Am Geraet und in der Betriebsanleitung wird vor heissem Dampf und Heisswasser gewarnt und das Oeffnen der Tuer erst nach Programmende mit vorsichtigem Anheben vorgeschrieben. Sprachneutrale Piktogramme ergaenzen den Hinweis.",
HazardCategory: "general",
Examples: []string{"Warnpiktogramm 'Heisser Dampf' an der Tuer", "BA-Hinweis 'Tuer nach Programmende langsam oeffnen'"},
NormReferences: []string{"ISO 7010", "EN 60335-2-58"}},
}
}
@@ -46,6 +46,20 @@ var domainGateTerms = map[string]string{
"widerstandsschweiss": "dom_welding", "lichtbogenschweiss": "dom_welding",
"schutzgasschweiss": "dom_welding", "punktschweiss": "dom_welding",
"schweisselektrod": "dom_welding", "elektrodenspalt": "dom_welding",
// Welding: surface forms that previously leaked ungated (e.g. into the
// thermal hazards of a dishwasher via high_temperature/electrical_part)
"schweissarbeitsplatz": "dom_welding", "schweissfunke": "dom_welding",
"schweisshelm": "dom_welding", "schweisserschutz": "dom_welding",
"lichtbogenzone": "dom_welding", "lichtbogen-verbrennung": "dom_welding",
"schweissrauch": "dom_welding", "schweissgeraet": "dom_welding",
"schweisszone": "dom_welding", "schweissbrenner": "dom_welding",
"schweissspritzer": "dom_welding", "schweissstrom": "dom_welding",
// Open flame / burner (gas burner, flame hardening, flame treatment)
"offene flamme": "dom_flame", "brennerbereich": "dom_flame",
"flammenzone": "dom_flame", "gasbrenner": "dom_flame",
// Hot glue / hot-melt adhesive
"heissleimanlage": "dom_glue", "klebstoffschmelzer": "dom_glue",
"heisskleber": "dom_glue", "schmelzkleber": "dom_glue",
// Solar / PV
"pv-modul": "dom_solar", "photovoltaik": "dom_solar", "pv-anlage": "dom_solar",
"dc-steckverbindung": "dom_solar", "solarmodul": "dom_solar",
@@ -44,6 +44,7 @@ func collectAllPatterns() []HazardPattern {
patterns = append(patterns, GetCRAPatterns()...) // HP1910-HP1918 CRA / DIN EN 40000-1-2 cyber-resilience spur
patterns = append(patterns, GetSecondaryHarmDemoPatterns()...) // HP2000-HP2001 secondary harm chain demos (Cola splitter, Pharma)
patterns = append(patterns, GetLiftEndstopPatterns()...) // HP2100-HP2102 lift body-part crush at endstops
patterns = append(patterns, GetWarewashingPatterns()...) // HP2200-HP2206 commercial dishwasher (scald/chemical/door/slip)
patterns = applyMachineTypeOverrides(patterns) // Fill MachineTypes on legacy patterns to prevent drift
patterns = applyDomainGates(patterns) // Capability-domain gate: stop domain-specific patterns leaking cross-machine
return patterns
@@ -0,0 +1,230 @@
package ucca
import (
"regexp"
"strconv"
"strings"
)
// authorityInfo is the normative classification of a search result, used internally
// for re-ranking only (Phase 1 changes ordering, not the response contract).
type authorityInfo struct {
weight int // 100 binding, 80 technical_standard, 70 guidance, 0 foreign, 50 unknown
sourceClass string // binding_law | technical_standard | supervisory_guidance | foreign_law | unknown
jurisdiction string // DE | EU | CH
}
var (
guidanceMarkers = []string{
"DSK", "EDPB", "BfDI", "BFDI", "BayLfD", "Baylfb", "ENISA", "BSI", "EUCC",
"Standards Mapping", "Kpnr", "Orientierungshilfe", "Handreichung", "Beschluss",
"Leitlinie", "Guidance", "Empfehlung", "OECD", "CISA", "Blue Guide",
}
// Technical standards / control frameworks (best-practice controls). Checked BEFORE
// guidanceMarkers so a "BSI Grundschutz" chunk classifies as a standard, not BSI guidance.
standardMarkers = []string{
"NIST", "OWASP", "Grundschutz", "ISO 27001", "ISO/IEC 27001",
"CSA CCM", "Cloud Controls Matrix", "CIS Benchmark", "CIS Control",
}
foreignMarkers = []string{"RevDSG", "fedlex", "(CH)"}
deMarkers = []string{"BDSG", "DSK", "BfDI", "BFDI", "BayLfD", "Baylfb", "BSI"}
normPattern = regexp.MustCompile(`(§|Art\.?)\s*\d`)
bdsgParagraph = regexp.MustCompile(`§\s*(\d+)`)
)
// classifyAuthority derives weight/source-class/jurisdiction. Explicitly tagged payload
// values win; otherwise it falls back to the curated category + name markers, so the
// not-yet-re-ingested (untagged) corpus is still classified deterministically.
func classifyAuthority(r LegalSearchResult) authorityInfo {
jur := r.Jurisdiction
if jur == "" {
jur = inferJurisdiction(r)
}
if r.SourceClass != "" {
w := r.AuthorityWeight
if w == 0 && r.SourceClass == "binding_law" {
w = 100
}
return authorityInfo{weight: w, sourceClass: r.SourceClass, jurisdiction: jur}
}
if r.AuthorityWeight > 0 {
return authorityInfo{weight: r.AuthorityWeight, sourceClass: sourceClassFromWeight(r.AuthorityWeight), jurisdiction: jur}
}
hay := r.ArticleLabel + " " + r.RegulationShort + " " + r.RegulationName + " " + r.RegulationCode
switch {
case containsAny(hay, foreignMarkers):
return authorityInfo{weight: 0, sourceClass: "foreign_law", jurisdiction: "CH"}
case r.Category == "standard" || containsAny(hay, standardMarkers):
return authorityInfo{weight: 80, sourceClass: "technical_standard", jurisdiction: jur}
case r.Category == "guidance" || containsAny(hay, guidanceMarkers):
return authorityInfo{weight: 70, sourceClass: "supervisory_guidance", jurisdiction: jur}
case r.Category == "regulation" || r.Category == "eu_recht" || normPattern.MatchString(r.ArticleLabel):
return authorityInfo{weight: 100, sourceClass: "binding_law", jurisdiction: jur}
default:
return authorityInfo{weight: 50, sourceClass: "unknown", jurisdiction: jur}
}
}
func sourceClassFromWeight(w int) string {
switch {
case w >= 100:
return "binding_law"
case w >= 80:
return "technical_standard"
case w >= 70:
return "supervisory_guidance"
case w <= 0:
return "foreign_law"
default:
return "unknown"
}
}
func inferJurisdiction(r LegalSearchResult) string {
hay := r.ArticleLabel + " " + r.RegulationShort + " " + r.RegulationName
switch {
case containsAny(hay, foreignMarkers):
return "CH"
case strings.Contains(hay, "§") || containsAny(hay, deMarkers):
return "DE"
default:
return "EU"
}
}
// --- Domain routing: separates same-authority but topically foreign norms ---
type domainDef struct {
name string
regs []string // regulation markers found in a chunk
keywords []string // query keywords that signal this domain
}
// Deterministic order (slice, not map) — important for stable classification + tests.
var domains = []domainDef{
{"data_protection",
[]string{"DSGVO", "GDPR", "BDSG", "EDPB", "DSK", "BfDI", "BayLfD", "DPF"},
[]string{"personenbezogen", "betroffene", "datenschutz", "datenschutzbeauftrag", "dsb",
"datenpanne", "auskunft", "loesch", "lösch", "einwilligung", "besondere kategorien", "auftragsverarbeiter"}},
{"cyber",
[]string{"CRA", "NIS2", "NIS-2", "ENISA", "DORA", "EUCC"},
[]string{"security update", "sicherheitsupdate", "sicherheitsaktualisierung", "schwachstelle", "sbom",
"cybersicherheit", "konformit", "hersteller", "importeur", "haendler", "händler", "ikt-",
"resilienz", "sicherheitsvorfall", "digitalen elementen"}},
{"ai",
[]string{"AI Act", "KI-VO", "KI-Verordnung"},
[]string{"ki-system", "ki-modell", "hochrisiko", "kuenstliche intelligenz", "künstliche intelligenz"}},
{"product_safety",
[]string{"Maschinenverordnung", "MaschinenVO", "GPSR", "RED", "MDR"},
nil},
}
func queryDomain(query string) string {
ql := strings.ToLower(query)
for _, d := range domains {
for _, kw := range d.keywords {
if strings.Contains(ql, kw) {
return d.name
}
}
}
return ""
}
func chunkDomain(r LegalSearchResult) string {
hay := r.ArticleLabel + " " + r.RegulationShort + " " + r.RegulationCode + " " + r.RegulationName
for _, d := range domains {
if containsAny(hay, d.regs) {
return d.name
}
}
return ""
}
// scopeClass flags special sub-regimes that must not win general questions —
// BDSG Teil 3 (§§ 45-84) implements the JI directive (law enforcement), not the general regime.
func scopeClass(r LegalSearchResult) string {
hay := r.ArticleLabel + " " + r.RegulationShort
if strings.Contains(hay, "BDSG") {
if m := bdsgParagraph.FindStringSubmatch(hay); m != nil {
if n, err := strconv.Atoi(m[1]); err == nil && n >= 45 && n <= 84 {
return "law_enforcement"
}
}
}
return "general"
}
// --- Topic ontology: amplifier only (boost), never an override ---
type topicDef struct {
keywords []string
norms []string // preferred canonical citation fragments
}
var topics = []topicDef{
{[]string{"datenschutzbeauftrag", "dsb", "benennung"}, []string{"Art. 37", "§ 38 BDSG"}},
{[]string{"stellung des"}, []string{"Art. 38"}},
{[]string{"aufgaben des"}, []string{"Art. 39"}},
{[]string{"folgenabsch", "dsfa"}, []string{"Art. 35"}},
{[]string{"besondere kategorien"}, []string{"Art. 9", "§ 22 BDSG"}},
{[]string{"auskunft"}, []string{"Art. 15", "§ 34 BDSG"}},
{[]string{"loesch", "lösch"}, []string{"Art. 17", "§ 35 BDSG"}},
{[]string{"bussgeld", "geldbusse"}, []string{"Art. 83"}},
{[]string{"security update", "sicherheitsupdate", "schwachstelle", "sbom", "cybersicherheitsanforderung"}, []string{"CRA Anhang I"}},
{[]string{"meldepflicht", "sicherheitsvorfall"}, []string{"Art. 14 CRA"}},
}
// resultMatchesTopic reports whether the result is a preferred norm of a topic the query hits.
func resultMatchesTopic(query string, r LegalSearchResult) bool {
ql := strings.ToLower(query)
hay := r.ArticleLabel + " " + r.RegulationShort
for _, t := range topics {
if !containsAnyLower(ql, t.keywords) {
continue
}
for _, n := range t.norms {
if normMatches(hay, n) {
return true
}
}
}
return false
}
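// normMatches checks that norm appears in hay with a non-digit boundary, so "Art. 9"
// matches "Art. 9 DSGVO" but not "Art. 90". Every occurrence is checked, so a label
// such as "Art. 90, Art. 9 DSGVO" still matches on the later occurrence.
func normMatches(hay, norm string) bool {
for off := 0; off < len(hay); {
idx := strings.Index(hay[off:], norm)
if idx < 0 {
return false
}
end := off + idx + len(norm)
if end >= len(hay) || hay[end] < '0' || hay[end] > '9' {
return true
}
off += idx + 1 // this occurrence is followed by a digit; keep searching
}
return false
}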
func queryIsForeign(query string) bool {
// Pad with spaces so the " ch " token also matches at the start or end of the query.
return containsAnyLower(" "+strings.ToLower(query)+" ",
[]string{"schweiz", "revdsg", "fedlex", " ch ", "oesterreich", "österreich"})
}
func containsAny(hay string, markers []string) bool {
for _, m := range markers {
if strings.Contains(hay, m) {
return true
}
}
return false
}
func containsAnyLower(haylower string, markers []string) bool {
for _, m := range markers {
if strings.Contains(haylower, strings.ToLower(m)) {
return true
}
}
return false
}
@@ -0,0 +1,171 @@
package ucca
import (
"sort"
"strings"
)
// Re-ranking coefficients (validated in the offline golden harness; Phase A — conservative).
const (
authorityCoef = 0.40 // * weight/100
jurisdictionGain = 0.05 // binding/guidance from DE or EU
foreignPenalty = 0.60 // foreign law on a DE/EU question (demoted, not removed)
unknownPenalty = 0.08
domainMatchGain = 0.15
offDomainPenalty = 0.10 // off-domain binding (demoted, not removed)
scopePenalty = 0.25 // BDSG Teil 3 (law enforcement) on a general DP question
topicGain = 0.18 // amplifier only
supersededPenalty = 0.50 // superseded legacy source (pre-eu-v1): demoted, not hidden
intentLiftGain = 0.10 // epsilon by which a qualifying interpretative source is lifted ABOVE the best binding
intentLiftMargin = 0.05 // ...only if that source is semantically competitive with binding
)
// guidanceIntentSignals mark a query that EXPLICITLY asks for an interpretation /
// recommendation by a guidance body, rather than for the binding obligation. Only
// then may a (semantically competitive) guideline outrank the binding norm.
var guidanceIntentSignals = []string{
"edpb", "europäischer datenschutzausschuss", "europaeischer datenschutzausschuss",
"dsk", "enisa", "bsi", "leitlinie", "guideline", "orientierungshilfe",
"auslegung", "empfiehlt", "empfehlung", "sagt", "laut",
}
// controlIntentSignals mark a query that asks HOW to implement / which controls or
// measures fit — rather than WHAT the binding obligation is. Only then may a
// (semantically competitive) technical_standard outrank the binding norm.
var controlIntentSignals = []string{
"control", "controls", "maßnahme", "massnahme", "schutzmaßnahme",
"best practice", "best-practice", "umsetzen", "implementier", "absicher",
"härt", "haert", "hardening", "nist", "owasp", "grundschutz",
"ccm", "iso 27001", "isms",
}
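// queryMatchesAny reports whether the lowercased query contains any signal as a plain
// substring. Short signals can therefore also fire inside longer words (e.g. "nist"
// inside "administrator", "bsi" inside "website").
func queryMatchesAny(query string, signals []string) bool {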
q := strings.ToLower(query)
for _, sig := range signals {
if strings.Contains(q, sig) {
return true
}
}
return false
}
// queryWantsGuidance reports whether the query explicitly asks for guidance/interpretation.
func queryWantsGuidance(query string) bool { return queryMatchesAny(query, guidanceIntentSignals) }
// queryWantsControls reports whether the query asks for implementation controls/measures.
func queryWantsControls(query string) bool { return queryMatchesAny(query, controlIntentSignals) }
// bestBindingSemantic returns the highest RAW semantic score among binding-law
// results (0 if none / no intent). Used as the guard threshold so an off-topic
// interpretative source cannot ride the intent boost.
func bestBindingSemantic(results []LegalSearchResult, wantsIntent bool) float64 {
if !wantsIntent {
return 0
}
best := 0.0
for _, r := range results {
if classifyAuthority(r).sourceClass == "binding_law" && r.Score > best {
best = r.Score
}
}
return best
}
// authorityScore computes the normative relevance of a result for a query. It augments the
// semantic score with authority/jurisdiction/domain/scope/topic signals. Exposed for tests.
func authorityScore(query string, r LegalSearchResult, qDomain string, qForeign bool) float64 {
info := classifyAuthority(r)
score := r.Score + authorityCoef*float64(info.weight)/100.0
if r.Superseded {
// Legacy source (pre-eu-v1): default questions should see the eu-v1 norm. Demoted,
// not removed, so it stays findable for history/transition questions.
score -= supersededPenalty
}
if info.jurisdiction == "CH" && !qForeign {
score -= foreignPenalty // foreign law on a DE/EU question: demoted, not deleted
} else {
score += jurisdictionGain
}
if info.sourceClass == "unknown" {
score -= unknownPenalty
}
if qDomain != "" {
switch cd := chunkDomain(r); {
case cd == qDomain:
score += domainMatchGain
case cd != "":
score -= offDomainPenalty // off-domain binding: demoted, not deleted
}
}
if qDomain == "data_protection" && scopeClass(r) == "law_enforcement" {
score -= scopePenalty
}
if resultMatchesTopic(query, r) {
score += topicGain // amplifier, not an override
}
return score
}
// rerankByAuthority re-orders results so binding law from the matching jurisdiction/domain
// ranks above guidance, foreign and off-domain law — WITHOUT dropping anything (guidance is
// kept as interpretation context). The computed score is written back to Score so downstream
// merges (e.g. the multi-collection advisor) preserve this order. Pure + deterministic.
func rerankByAuthority(query string, results []LegalSearchResult) []LegalSearchResult {
if len(results) < 2 {
return results
}
qDomain := queryDomain(query)
qForeign := queryIsForeign(query)
wantsGuidance := queryWantsGuidance(query)
wantsControls := queryWantsControls(query)
bestBindingSem := bestBindingSemantic(results, wantsGuidance)
out := make([]LegalSearchResult, len(results))
copy(out, results)
for i := range out {
out[i].Score = authorityScore(query, out[i], qDomain, qForeign)
}
// Explicit interpretation intent → a competitive guideline may outrank binding (lift
// above the best binding FINAL). Explicit implementation intent → boost the CONTROL-POOL
// (operational/procedural requirement, control standard, implementation guidance) over
// the abstract obligation, soft-ordered by role. Norm questions (neither) stay untouched.
if wantsGuidance {
liftAboveBinding(out, results, bestBindingSem, "supervisory_guidance")
}
if wantsControls {
applyControlRoles(out)
}
sort.SliceStable(out, func(a, b int) bool {
return out[a].Score > out[b].Score
})
return out
}
// liftAboveBinding lifts a semantically competitive interpretative source of the given
// sourceClass just ABOVE the best binding hit, ordered by semantic score, so an EXPLICIT
// guidance question can return that source Top-1. rerankByAuthority currently calls it
// only with "supervisory_guidance"; implementation questions go through applyControlRoles.
// A pure norm question (no intent, so not called) keeps binding on top. Sources below the
// semantic margin are left untouched, so an off-topic source can never ride the override.
// The lift starts from the binding FINAL score, so authority/topic/domain bonuses cannot
// edge it out.
func liftAboveBinding(out, raw []LegalSearchResult, bestBindingSem float64, sourceClass string) {
bestBindingFinal := 0.0
for i := range out {
if classifyAuthority(out[i]).sourceClass == "binding_law" && out[i].Score > bestBindingFinal {
bestBindingFinal = out[i].Score
}
}
for i := range out {
// Classify (not raw payload) so the untagged legacy corpus — e.g. NIST ingested
// before source_class tagging — is still recognized as its interpretative class.
if classifyAuthority(out[i]).sourceClass != sourceClass || raw[i].Score < bestBindingSem-intentLiftMargin {
continue
}
lifted := bestBindingFinal + intentLiftGain + (raw[i].Score - bestBindingSem)
if lifted > out[i].Score {
out[i].Score = lifted
}
}
}
@@ -0,0 +1,96 @@
package ucca
import "testing"
func bindingRes(label, reg, jur string, score float64) LegalSearchResult {
return LegalSearchResult{ArticleLabel: label, RegulationShort: reg, SourceClass: "binding_law", AuthorityWeight: 100, Jurisdiction: jur, Score: score}
}
func guidanceRes(label, reg string, score float64) LegalSearchResult {
return LegalSearchResult{ArticleLabel: label, RegulationShort: reg, SourceClass: "supervisory_guidance", AuthorityWeight: 70, Jurisdiction: "EU", Score: score}
}
func foreignRes(label string, score float64) LegalSearchResult {
return LegalSearchResult{ArticleLabel: label, RegulationShort: "RevDSG", SourceClass: "foreign_law", AuthorityWeight: 0, Jurisdiction: "CH", Score: score}
}
// Acceptance criteria (Phase 1) expressed as ordering tests.
func TestRerankByAuthority_Acceptance(t *testing.T) {
t.Run("guidance does not overtake semantically competitive binding", func(t *testing.T) {
out := rerankByAuthority("Was gilt hier?", []LegalSearchResult{
guidanceRes("ENISA Mapping", "ENISA", 0.72),
bindingRes("CRA Anhang I", "CRA", "EU", 0.66),
})
if out[0].RegulationShort != "CRA" {
t.Fatalf("binding must rank first over competitive guidance, got %q", out[0].RegulationShort)
}
})
t.Run("foreign law demoted on DE/EU question but kept", func(t *testing.T) {
in := []LegalSearchResult{foreignRes("RevDSG Art 1", 0.85), bindingRes("Art. 9 DSGVO", "DSGVO", "EU", 0.62)}
out := rerankByAuthority("Welche Daten sind besonders geschuetzt?", in)
if out[0].RegulationShort != "DSGVO" {
t.Fatalf("binding EU must beat foreign on a DE/EU query, got %q", out[0].RegulationShort)
}
if len(out) != 2 {
t.Fatalf("foreign law must be kept, got len=%d", len(out))
}
})
t.Run("off-domain binding demoted but not removed", func(t *testing.T) {
in := []LegalSearchResult{
bindingRes("Art. 13 EU MDR", "MDR", "EU", 0.70),
bindingRes("Art. 13 CRA", "CRA", "EU", 0.60),
}
out := rerankByAuthority("Welche Pflichten hat der Hersteller von Produkten mit digitalen Elementen?", in)
if out[0].RegulationShort != "CRA" {
t.Fatalf("on-domain CRA must beat off-domain MDR, got %q", out[0].RegulationShort)
}
if len(out) != 2 {
t.Fatalf("off-domain MDR must be kept, got len=%d", len(out))
}
})
t.Run("same-regime binding wins over guidance", func(t *testing.T) {
out := rerankByAuthority("Was gilt hier?", []LegalSearchResult{
bindingRes("Art. 13 CRA", "CRA", "EU", 0.70),
guidanceRes("ENISA Mapping", "ENISA", 0.60),
})
if out[0].RegulationShort != "CRA" {
t.Fatalf("binding must win, got %q", out[0].RegulationShort)
}
})
t.Run("BDSG Teil 3 demoted below DSGVO on general DP question", func(t *testing.T) {
in := []LegalSearchResult{
bindingRes("§ 48 BDSG", "BDSG", "DE", 0.70), // Teil 3 (law enforcement)
bindingRes("Art. 9 DSGVO", "DSGVO", "EU", 0.62),
}
out := rerankByAuthority("Was sind besondere Kategorien personenbezogener Daten?", in)
if out[0].RegulationShort != "DSGVO" {
t.Fatalf("DSGVO must beat BDSG Teil 3 on a general DP question, got %q", out[0].RegulationShort)
}
})
t.Run("nothing is dropped and topic amplifies", func(t *testing.T) {
in := []LegalSearchResult{
guidanceRes("ENISA", "ENISA", 0.72),
bindingRes("CRA Anhang I", "CRA", "EU", 0.66),
foreignRes("RevDSG", 0.5),
}
out := rerankByAuthority("Anforderungen an Security Updates?", in)
if len(out) != len(in) {
t.Fatalf("rerank must preserve all results, got %d want %d", len(out), len(in))
}
if out[0].ArticleLabel != "CRA Anhang I" {
t.Fatalf("topic+authority must lift CRA Anhang I to top, got %q", out[0].ArticleLabel)
}
})
t.Run("single result returned unchanged", func(t *testing.T) {
in := []LegalSearchResult{bindingRes("Art. 1 CRA", "CRA", "EU", 0.5)}
if out := rerankByAuthority("x", in); len(out) != 1 {
t.Fatalf("len=%d", len(out))
}
})
}
@@ -0,0 +1,129 @@
package ucca
import "testing"
func TestClassifyAuthority(t *testing.T) {
tests := []struct {
name string
result LegalSearchResult
wantW int
wantSC string
wantJur string
}{
{"tagged binding EU", LegalSearchResult{AuthorityWeight: 100, SourceClass: "binding_law", Jurisdiction: "EU"}, 100, "binding_law", "EU"},
{"tagged guidance DE", LegalSearchResult{AuthorityWeight: 70, SourceClass: "supervisory_guidance", Jurisdiction: "DE"}, 70, "supervisory_guidance", "DE"},
{"tagged foreign CH", LegalSearchResult{AuthorityWeight: 0, SourceClass: "foreign_law", Jurisdiction: "CH"}, 0, "foreign_law", "CH"},
{"untagged ENISA guidance", LegalSearchResult{RegulationShort: "ENISA", ArticleLabel: "ENISA CRA Standards Mapping"}, 70, "supervisory_guidance", "EU"},
{"untagged NIST standard", LegalSearchResult{RegulationShort: "NIST SP 800-82r3", ArticleLabel: "AU-8"}, 80, "technical_standard", "EU"},
{"BSI Grundschutz standard beats BSI guidance", LegalSearchResult{RegulationShort: "BSI Grundschutz", ArticleLabel: "BSI Grundschutz Baustein"}, 80, "technical_standard", "DE"},
{"weight-only 85 TRGS standard", LegalSearchResult{AuthorityWeight: 85, RegulationShort: "TRGS 529"}, 85, "technical_standard", "EU"},
{"tagged technical_standard", LegalSearchResult{AuthorityWeight: 80, SourceClass: "technical_standard", Jurisdiction: "EU"}, 80, "technical_standard", "EU"},
{"untagged CRA binding", LegalSearchResult{RegulationShort: "CRA", ArticleLabel: "Art. 13 CRA", Category: "regulation"}, 100, "binding_law", "EU"},
{"untagged BDSG binding DE", LegalSearchResult{RegulationShort: "BDSG", ArticleLabel: "§ 38 BDSG"}, 100, "binding_law", "DE"},
{"untagged RevDSG foreign", LegalSearchResult{RegulationShort: "RevDSG", ArticleLabel: "RevDSG (CH)"}, 0, "foreign_law", "CH"},
{"untagged unknown", LegalSearchResult{RegulationShort: "", ArticleLabel: ""}, 50, "unknown", "EU"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := classifyAuthority(tt.result)
if got.weight != tt.wantW || got.sourceClass != tt.wantSC || got.jurisdiction != tt.wantJur {
t.Errorf("classifyAuthority() = {%d %s %s}, want {%d %s %s}",
got.weight, got.sourceClass, got.jurisdiction, tt.wantW, tt.wantSC, tt.wantJur)
}
})
}
}
func TestQueryDomain(t *testing.T) {
tests := []struct{ q, want string }{
{"Welche Anforderungen an Security Updates?", "cyber"},
{"Wer braucht einen Datenschutzbeauftragten?", "data_protection"},
{"Was sind besondere Kategorien personenbezogener Daten?", "data_protection"},
{"Welche Pflichten beim Hochrisiko-KI-System?", "ai"},
{"Wie spaet ist es?", ""},
}
for _, tt := range tests {
if got := queryDomain(tt.q); got != tt.want {
t.Errorf("queryDomain(%q) = %q, want %q", tt.q, got, tt.want)
}
}
}
func TestChunkDomain(t *testing.T) {
tests := []struct {
name string
r LegalSearchResult
want string
}{
{"CRA cyber", LegalSearchResult{RegulationShort: "CRA", ArticleLabel: "Art. 13 CRA"}, "cyber"},
{"DSGVO dp", LegalSearchResult{RegulationShort: "DSGVO", ArticleLabel: "Art. 9 DSGVO"}, "data_protection"},
{"AI Act ai", LegalSearchResult{RegulationShort: "AI Act", ArticleLabel: "Art. 10 AI Act"}, "ai"},
{"MDR product", LegalSearchResult{RegulationShort: "MDR", ArticleLabel: "Art. 13 EU MDR"}, "product_safety"},
{"unknown", LegalSearchResult{RegulationShort: "XYZ"}, ""},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := chunkDomain(tt.r); got != tt.want {
t.Errorf("chunkDomain() = %q, want %q", got, tt.want)
}
})
}
}
func TestScopeClass(t *testing.T) {
tests := []struct {
name string
r LegalSearchResult
want string
}{
{"BDSG Teil 3 law enforcement", LegalSearchResult{RegulationShort: "BDSG", ArticleLabel: "§ 48 BDSG"}, "law_enforcement"},
{"BDSG general part", LegalSearchResult{RegulationShort: "BDSG", ArticleLabel: "§ 38 BDSG"}, "general"},
{"DSGVO general", LegalSearchResult{RegulationShort: "DSGVO", ArticleLabel: "Art. 9 DSGVO"}, "general"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := scopeClass(tt.r); got != tt.want {
t.Errorf("scopeClass() = %q, want %q", got, tt.want)
}
})
}
}
func TestResultMatchesTopic(t *testing.T) {
tests := []struct {
name string
query string
r LegalSearchResult
want bool
}{
{"besondere Kategorien -> Art 9 match", "Was sind besondere Kategorien?", LegalSearchResult{ArticleLabel: "Art. 9 DSGVO"}, true},
{"besondere Kategorien -> Art 90 no match", "Was sind besondere Kategorien?", LegalSearchResult{ArticleLabel: "Art. 90 DSGVO"}, false},
{"security updates -> CRA Anhang I", "Anforderungen an Security Updates?", LegalSearchResult{ArticleLabel: "CRA Anhang I"}, true},
{"no topic keyword", "Wie spaet ist es?", LegalSearchResult{ArticleLabel: "Art. 9 DSGVO"}, false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := resultMatchesTopic(tt.query, tt.r); got != tt.want {
t.Errorf("resultMatchesTopic() = %v, want %v", got, tt.want)
}
})
}
}
func TestNormMatches(t *testing.T) {
tests := []struct {
hay, norm string
want bool
}{
{"Art. 9 DSGVO", "Art. 9", true},
{"Art. 90 DSGVO", "Art. 9", false},
{"§ 38 BDSG", "§ 38 BDSG", true},
{"§ 380 BDSG", "§ 38", false},
{"Art. 14 CRA", "Art. 14 CRA", true},
}
for _, tt := range tests {
if got := normMatches(tt.hay, tt.norm); got != tt.want {
t.Errorf("normMatches(%q,%q) = %v, want %v", tt.hay, tt.norm, got, tt.want)
}
}
}
@@ -0,0 +1,123 @@
package ucca
import "strings"
// source_role is the FUNCTIONAL role of a chunk — WHAT must be done (obligation),
// HOW to implement it (operational/procedural requirement, control standard,
// implementation guidance), or how to READ the norm (interpretation/definition).
// It is ORTHOGONAL to source_class (legal authority): source_class decides RANK,
// source_role decides CONTROL-POOL membership for implementation questions.
// Derived deterministically from markers, so the untagged corpus needs no re-tag.
const (
roleObligation = "obligation" // the abstract duty (the WHAT)
roleOperationalReq = "operational_requirement" // concrete binding requirement (CRA Annex I)
roleProceduralReq = "procedural_requirement" // a process: notification/registration/DPIA/incident report
roleControlStandard = "control_standard" // best-practice control catalog (NIST/OWASP/ISO/CIS)
roleImplGuidance = "implementation_guidance" // advisory how-to (ENISA good practices, BSI)
roleInterpretation = "interpretation" // interprets the norm's MEANING (EDPB guideline)
roleDefinition = "definition" // definitions / scope / recitals
)
var (
proceduralMarkers = []string{
"Meldung", "Meldepflicht", "Notification", "Notifizierung", "Registrierung",
"Registration", "Konformitätserklärung", "Declaration of Conformity", "Incident",
"Berichterstattung", "Reporting", "Folgenabschätzung", "DSFA", "DPIA", "Anzeigepflicht",
}
annexMarkers = []string{"Anhang", "Annex", "Appendix", "Anlage"}
operationalMarkers = []string{"Anforderung", "Requirement", "essential", "wesentliche"}
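// Matched case-insensitively as substrings (see containsAnyLower), so short markers
// such as "ICS" or "TIG" also hit inside longer words (e.g. "tig" inside "wichtig").
implMarkers = []string{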
"Good Practice", "Best Practice", "Standards Mapping", "Umsetzung", "Implementation",
"Handreichung", "Maßnahmenkatalog", "ICS", "SCADA", "Technical Guideline", "TIG",
}
definitionMarkers = []string{"Begriffsbestimmung", "Definition"}
)
// classifyRole derives the functional source_role from chunk metadata + the authority
// class. technical_standard is always a control_standard; guidance splits into
// implementation_guidance (how-to) vs interpretation (meaning); binding splits into
// procedural / operational requirement / definition / plain obligation.
func classifyRole(r LegalSearchResult) string {
cls := classifyAuthority(r).sourceClass
hay := strings.ToLower(r.ArticleLabel + " " + r.RegulationShort + " " + r.RegulationName + " " + r.Article)
switch {
case r.IsRecital:
return roleDefinition
case cls == "technical_standard":
return roleControlStandard
case cls == "supervisory_guidance":
if containsAnyLower(hay, implMarkers) {
return roleImplGuidance
}
return roleInterpretation
case cls == "binding_law":
switch {
case containsAnyLower(hay, definitionMarkers):
return roleDefinition
case containsAnyLower(hay, proceduralMarkers):
return roleProceduralReq
case containsAnyLower(hay, annexMarkers) || containsAnyLower(hay, operationalMarkers):
return roleOperationalReq
default:
return roleObligation
}
default:
return roleObligation
}
}
// controlRoleBonus is the soft intra-pool preference (User 2026-06-24):
// operational_requirement > procedural_requirement > control_standard > implementation_guidance.
var controlRoleBonus = map[string]float64{
roleOperationalReq: 0.100,
roleProceduralReq: 0.075,
roleControlStandard: 0.050,
roleImplGuidance: 0.000,
}
// controlPoolGain lifts EVERY control-pool role over the non-control roles (obligation/
// interpretation/definition) on an implementation question, so the binding abstract
// obligation does not dominate by authority alone. The obligation is not removed — it
// stays visible as "Rechtsgrundlage" context below the recommended measures.
const controlPoolGain = 0.15
// applyControlRoles boosts the control-pool (the four implementation roles) for an
// EXPLICIT implementation question, soft-ordered op_req > procedural > standard > guidance.
// Replaces the earlier "lift technical_standard above binding" — controls are not only
// technical_standard, and the binding operational_requirement (e.g. CRA Annex I) should win.
func applyControlRoles(out []LegalSearchResult) {
for i := range out {
if bonus, ok := controlRoleBonus[classifyRole(out[i])]; ok {
out[i].Score += controlPoolGain + bonus
}
}
}
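The boost arithmetic above can be checked with a small standalone sketch. The role string values below are assumptions taken from the comment wording (the real constants are not shown in this diff), and `boosted` is a hypothetical helper that mirrors the body of `applyControlRoles` for a single score.

```go
package main

import "fmt"

// Assumed role names; the actual constant values are defined elsewhere.
const (
	roleOperationalReq  = "operational_requirement"
	roleProceduralReq   = "procedural_requirement"
	roleControlStandard = "control_standard"
	roleImplGuidance    = "implementation_guidance"
	roleObligation      = "obligation"
)

const controlPoolGain = 0.15

var controlRoleBonus = map[string]float64{
	roleOperationalReq:  0.100,
	roleProceduralReq:   0.075,
	roleControlStandard: 0.050,
	roleImplGuidance:    0.000,
}

// boosted returns the score after the control-pool boost for one role.
// Roles outside the control pool are returned unchanged.
func boosted(role string, score float64) float64 {
	if bonus, ok := controlRoleBonus[role]; ok {
		return score + controlPoolGain + bonus
	}
	return score
}

func main() {
	// The abstract obligation starts ahead on raw score but gets no boost.
	fmt.Printf("obligation:  %.3f\n", boosted(roleObligation, 0.90))
	fmt.Printf("guidance:    %.3f\n", boosted(roleImplGuidance, 0.80))
	fmt.Printf("operational: %.3f\n", boosted(roleOperationalReq, 0.80))
}
```

With these inputs even the lowest control-pool role (0.80 + 0.15 = 0.95) overtakes the unboosted obligation at 0.90, which is the purpose of `controlPoolGain`; the per-role bonus only orders hits inside the pool.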
// isControlPoolRole reports whether a role belongs to the control-pool surfaced on
// implementation questions (the four "how to implement" roles).
func isControlPoolRole(role string) bool {
switch role {
case roleOperationalReq, roleProceduralReq, roleControlStandard, roleImplGuidance:
return true
}
return false
}
// controlRoleOf classifies a raw Qdrant payload into a source_role, so searchControls can
// filter its deep dense pull to the control-pool BEFORE hits are mapped to LegalSearchResult.
func controlRoleOf(payload map[string]interface{}) string {
article := getString(payload, "article")
if article == "" {
article = getString(payload, "section")
}
return classifyRole(LegalSearchResult{
RegulationShort: getString(payload, "regulation_short"),
RegulationName: getString(payload, "regulation_name_de"),
ArticleLabel: getString(payload, "article_label"),
Article: article,
Category: getString(payload, "category"),
SourceClass: getString(payload, "source_class"),
AuthorityWeight: getInt(payload, "authority_weight"),
IsRecital: getBool(payload, "is_recital"),
})
}
@@ -0,0 +1,79 @@
package ucca
import "testing"
func TestClassifyRole(t *testing.T) {
tests := []struct {
name string
r LegalSearchResult
want string
}{
{"NIST -> control_standard", LegalSearchResult{RegulationShort: "NIST SP 800-82r3", ArticleLabel: "AU-8"}, roleControlStandard},
{"OWASP -> control_standard", LegalSearchResult{RegulationShort: "OWASP ASVS"}, roleControlStandard},
{"CRA Anhang -> operational_requirement", LegalSearchResult{RegulationShort: "CRA", ArticleLabel: "CRA Anhang I", Category: "regulation"}, roleOperationalReq},
{"CRA Meldepflicht -> procedural_requirement", LegalSearchResult{RegulationShort: "CRA", ArticleLabel: "Art. 14 CRA Meldepflicht", Category: "regulation"}, roleProceduralReq},
{"ENISA Good Practices -> implementation_guidance", LegalSearchResult{RegulationShort: "ENISA Supply Chain Good Practices"}, roleImplGuidance},
{"EDPB Leitlinie -> interpretation", LegalSearchResult{RegulationShort: "EDPB DPO", ArticleLabel: "WP243 Leitlinien Datenschutzbeauftragte"}, roleInterpretation},
{"DORA article -> obligation", LegalSearchResult{RegulationShort: "DORA", ArticleLabel: "Art. 5 DORA", Category: "regulation"}, roleObligation},
{"DSGVO Begriffsbestimmungen -> definition", LegalSearchResult{RegulationShort: "DSGVO", ArticleLabel: "Art. 4 DSGVO Begriffsbestimmungen", Category: "regulation"}, roleDefinition},
{"recital -> definition", LegalSearchResult{RegulationShort: "CRA", IsRecital: true}, roleDefinition},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := classifyRole(tt.r); got != tt.want {
t.Errorf("classifyRole() = %q, want %q", got, tt.want)
}
})
}
}
func TestApplyControlRoles_PoolPreference(t *testing.T) {
// op_req > procedural > control_standard > impl_guidance; non-control roles get no boost.
roles := []struct {
r LegalSearchResult
wantGain float64
}{
{LegalSearchResult{ArticleLabel: "CRA Anhang I", Category: "regulation"}, controlPoolGain + 0.100},
{LegalSearchResult{ArticleLabel: "Art. 14 CRA Meldepflicht", Category: "regulation"}, controlPoolGain + 0.075},
{LegalSearchResult{RegulationShort: "NIST SP 800-53"}, controlPoolGain + 0.050},
{LegalSearchResult{RegulationShort: "ENISA Good Practices"}, controlPoolGain + 0.000},
{LegalSearchResult{ArticleLabel: "Art. 5 DORA", Category: "regulation"}, 0.0}, // obligation: no boost
}
for _, rc := range roles {
out := []LegalSearchResult{rc.r}
out[0].Score = 1.0
applyControlRoles(out)
if got := out[0].Score - 1.0; got < rc.wantGain-1e-9 || got > rc.wantGain+1e-9 {
t.Errorf("role %q: gain %.3f, want %.3f", classifyRole(rc.r), got, rc.wantGain)
}
}
}
func TestIsControlPoolRole(t *testing.T) {
for _, r := range []string{roleOperationalReq, roleProceduralReq, roleControlStandard, roleImplGuidance} {
if !isControlPoolRole(r) {
t.Errorf("%q should be in the control-pool", r)
}
}
for _, r := range []string{roleObligation, roleInterpretation, roleDefinition} {
if isControlPoolRole(r) {
t.Errorf("%q should NOT be in the control-pool", r)
}
}
}
func TestControlRoleOf_Payload(t *testing.T) {
// searchControls filters its deep dense pull by classifying the raw Qdrant payload.
nist := map[string]interface{}{"regulation_short": "NIST SP 800-82r3", "article": "AU-8"}
if got := controlRoleOf(nist); got != roleControlStandard {
t.Errorf("untagged NIST payload role = %q, want control_standard", got)
}
craAnnex := map[string]interface{}{"regulation_short": "CRA", "article": "Anhang-I", "category": "regulation"}
if got := controlRoleOf(craAnnex); got != roleOperationalReq {
t.Errorf("CRA Anhang payload role = %q, want operational_requirement", got)
}
dora := map[string]interface{}{"regulation_short": "DORA", "article_label": "Art. 5 DORA", "category": "regulation"}
if got := controlRoleOf(dora); isControlPoolRole(got) {
t.Errorf("DORA abstract article role = %q must be excluded from the control-pool", got)
}
}
@@ -0,0 +1,167 @@
package ucca
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"sort"
)
// LegalActStructure is the composition of one ingested eur-lex legal act — how
// many distinct articles, annexes and recitals it consists of (plus the raw
// chunk count). Backs the coverage page so the ingested corpus is not a black
// box: a developer SEES what each act actually contains, not only its name.
type LegalActStructure struct {
RegulationShort string `json:"regulation_short"`
RegulationName string `json:"regulation_name"`
Articles int `json:"articles"`
Annexes int `json:"annexes"`
Recitals int `json:"recitals"`
Chunks int `json:"chunks"`
}
const eurlexSource = "eur-lex.europa.eu"
// legalStructureCollections hold the clean eur-lex legal corpus (chunks tagged
// with chunk_scope = section | annex | recital).
var legalStructureCollections = []string{"bp_compliance_ce", "bp_compliance_datenschutz"}
// chunkScopeBucket maps a Qdrant chunk_scope to the structure field it feeds.
var chunkScopeBucket = map[string]string{"section": "articles", "annex": "annexes", "recital": "recitals"}
// CorpusStructure scrolls the eur-lex legal corpus across the legal collections
// and aggregates the per-act composition. The source filter keeps it to a few
// hundred points regardless of total corpus size. Read-only; a collection that
// fails to scroll is skipped rather than failing the whole call.
func (c *LegalRAGClient) CorpusStructure(ctx context.Context) ([]LegalActStructure, error) {
var all []qdrantScrollPoint
for _, coll := range legalStructureCollections {
pts, err := c.scrollLegalCorpus(ctx, coll)
if err != nil {
continue
}
all = append(all, pts...)
}
return aggregateStructure(all), nil
}
// aggregateStructure counts distinct values of the payload field "article" per
// (regulation, scope). Pure → unit-testable without a vector store.
func aggregateStructure(points []qdrantScrollPoint) []LegalActStructure {
distinct := map[string]map[string]map[string]struct{}{}
names := map[string]string{}
chunks := map[string]int{}
order := []string{}
for _, pt := range points {
reg := getString(pt.Payload, "regulation_short")
if reg == "" {
continue
}
if _, seen := names[reg]; !seen {
name := getString(pt.Payload, "regulation_name_de")
if name == "" {
name = reg
}
names[reg] = name
distinct[reg] = map[string]map[string]struct{}{}
order = append(order, reg)
}
chunks[reg]++
bucket, ok := chunkScopeBucket[getString(pt.Payload, "chunk_scope")]
article := getString(pt.Payload, "article")
if !ok || article == "" {
continue
}
if distinct[reg][bucket] == nil {
distinct[reg][bucket] = map[string]struct{}{}
}
distinct[reg][bucket][article] = struct{}{}
}
out := make([]LegalActStructure, 0, len(order))
for _, reg := range order {
out = append(out, LegalActStructure{
RegulationShort: reg,
RegulationName: names[reg],
Articles: len(distinct[reg]["articles"]),
Annexes: len(distinct[reg]["annexes"]),
Recitals: len(distinct[reg]["recitals"]),
Chunks: chunks[reg],
})
}
sort.SliceStable(out, func(i, j int) bool {
if out[i].Articles != out[j].Articles {
return out[i].Articles > out[j].Articles
}
return out[i].RegulationShort < out[j].RegulationShort
})
return out
}
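The distinct counting in `aggregateStructure` relies on a nested set keyed by regulation, scope and article. A reduced sketch of that pattern, using a hypothetical `chunk` struct in place of the Qdrant payload and omitting the name fallback and sorting, might look like this:

```go
package main

import "fmt"

// chunk is a hypothetical, simplified stand-in for one scrolled point.
type chunk struct{ reg, scope, article string }

// countDistinct returns, per regulation and scope, how many distinct
// articles were seen. Chunks without regulation or article are skipped.
func countDistinct(chunks []chunk) map[string]map[string]int {
	seen := map[string]map[string]map[string]struct{}{}
	for _, c := range chunks {
		if c.reg == "" || c.article == "" {
			continue
		}
		if seen[c.reg] == nil {
			seen[c.reg] = map[string]map[string]struct{}{}
		}
		if seen[c.reg][c.scope] == nil {
			seen[c.reg][c.scope] = map[string]struct{}{}
		}
		// Inserting the same article twice leaves the set size unchanged.
		seen[c.reg][c.scope][c.article] = struct{}{}
	}
	out := map[string]map[string]int{}
	for reg, scopes := range seen {
		out[reg] = map[string]int{}
		for scope, arts := range scopes {
			out[reg][scope] = len(arts)
		}
	}
	return out
}

func main() {
	counts := countDistinct([]chunk{
		{"CRA", "section", "13"},
		{"CRA", "section", "13"}, // second chunk of the same article
		{"CRA", "section", "14"},
		{"CRA", "annex", "Anhang-I"},
	})
	fmt.Println(counts["CRA"]["section"], counts["CRA"]["annex"])
}
```

Using `struct{}` as the set value keeps the set allocation-free per entry; the chunk count in the real function is tracked separately because it must include duplicates.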
// scrollLegalCorpus pages through one collection, filtered to the eur-lex legal
// corpus, returning minimal-payload points (no text/vectors).
func (c *LegalRAGClient) scrollLegalCorpus(ctx context.Context, collection string) ([]qdrantScrollPoint, error) {
var all []qdrantScrollPoint
var offset interface{}
for {
points, next, err := c.scrollLegalPage(ctx, collection, offset)
if err != nil {
return nil, err
}
all = append(all, points...)
if next == nil {
break
}
offset = next
}
return all, nil
}
// scrollLegalPage fetches one page of the filtered scroll and returns the
// points plus the next-page offset (nil when exhausted).
func (c *LegalRAGClient) scrollLegalPage(ctx context.Context, collection string, offset interface{}) ([]qdrantScrollPoint, interface{}, error) {
reqBody := map[string]interface{}{
"limit": 500,
"with_payload": map[string]interface{}{"include": []string{"regulation_short", "regulation_name_de", "chunk_scope", "article"}},
"with_vectors": false,
"filter": map[string]interface{}{
"must": []map[string]interface{}{
{"key": "source", "match": map[string]interface{}{"value": eurlexSource}},
},
},
}
if offset != nil {
reqBody["offset"] = offset
}
jsonBody, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
url := fmt.Sprintf("%s/collections/%s/points/scroll", c.qdrantURL, collection)
req, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(jsonBody))
if err != nil {
return nil, nil, err
}
req.Header.Set("Content-Type", "application/json")
if c.qdrantAPIKey != "" {
req.Header.Set("api-key", c.qdrantAPIKey)
}
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, nil, err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
return nil, nil, fmt.Errorf("qdrant returned %d: %s", resp.StatusCode, string(body))
}
var scrollResp qdrantScrollResponse
if err := json.NewDecoder(resp.Body).Decode(&scrollResp); err != nil {
return nil, nil, err
}
return scrollResp.Result.Points, scrollResp.Result.NextPageOffset, nil
}
@@ -0,0 +1,50 @@
package ucca
import "testing"
func structPoint(reg, name, scope, article string) qdrantScrollPoint {
return qdrantScrollPoint{Payload: map[string]interface{}{
"regulation_short": reg,
"regulation_name_de": name,
"chunk_scope": scope,
"article": article,
}}
}
func TestAggregateStructure_CountsDistinctPerScope(t *testing.T) {
points := []qdrantScrollPoint{
structPoint("CRA", "Cyber Resilience Act", "section", "13"),
structPoint("CRA", "Cyber Resilience Act", "section", "13"), // duplicate article → still 1
structPoint("CRA", "Cyber Resilience Act", "section", "14"),
structPoint("CRA", "Cyber Resilience Act", "annex", "Anhang-I"),
structPoint("CRA", "Cyber Resilience Act", "annex", "Anhang-VII"),
structPoint("DORA", "", "section", "6"), // first sighting has no name →
structPoint("DORA", "", "section", "19"), // regulation_name falls back to short
structPoint("DORA", "", "recital", ""), // empty article → ignored for distinct
structPoint("", "x", "section", "1"), // missing regulation → skipped entirely
}
got := aggregateStructure(points)
if len(got) != 2 {
t.Fatalf("want 2 acts, got %d (%+v)", len(got), got)
}
// CRA and DORA both have 2 articles → tie broken by RegulationShort, so CRA sorts first.
cra := got[0]
if cra.RegulationShort != "CRA" || cra.Articles != 2 || cra.Annexes != 2 || cra.Recitals != 0 || cra.Chunks != 5 {
t.Errorf("CRA wrong: %+v", cra)
}
dora := got[1]
if dora.RegulationShort != "DORA" || dora.Articles != 2 || dora.Chunks != 3 {
t.Errorf("DORA wrong: %+v", dora)
}
if dora.RegulationName != "DORA" {
t.Errorf("DORA name fallback failed: %q", dora.RegulationName)
}
}
func TestAggregateStructure_Empty(t *testing.T) {
if got := aggregateStructure(nil); len(got) != 0 {
t.Errorf("want empty, got %+v", got)
}
}
@@ -0,0 +1,134 @@
package ucca
import (
"fmt"
"strings"
)
const (
assessConnectedCap = 12 // cap connected norms surfaced in the assessment
assessCrossRegimeTopN = 5 // window over which "cross regime" is judged
assessReviewMargin = 0.05 // a tighter winner gap → recommend human review
)
// Assess builds the auditable explanation layer over a ranked result set:
// primary norm, the norms it connects to (citation graph), cross-regime, a
// human-review flag, the winner margin and a short reasoning string. Pure →
// unit-testable. It EXPLAINS the ranking, it does not change it. Returns nil for
// an empty result set.
func Assess(results []LegalSearchResult) *LegalAssessment {
if len(results) == 0 {
return nil
}
// Norm-level view: collapse multiple chunks of the same article/annex so the
// margin and cross-regime are judged between DISTINCT norms, not near-identical
// chunks of one norm (which would make every winner margin ~0).
norms := distinctNorms(results)
p := norms[0]
primary := primaryLabel(p)
connected := dedupStrings(p.ReferencesOut, p.ReferencesIn, p.CitationUnit)
if len(connected) > assessConnectedCap {
connected = connected[:assessConnectedCap]
}
window := norms
if len(window) > assessCrossRegimeTopN {
window = window[:assessCrossRegimeTopN]
}
regimes := make(map[string]bool)
for _, r := range window {
if r.RegulationShort != "" {
regimes[r.RegulationShort] = true
}
}
crossRegime := len(regimes) > 1
margin := 0.0
if len(norms) > 1 {
margin = norms[0].Score - norms[1].Score
}
primaryBinding := p.SourceClass == "binding_law"
humanReview := margin < assessReviewMargin || crossRegime || !primaryBinding
return &LegalAssessment{
PrimaryNorm: primary,
PrimaryRegulation: p.RegulationShort,
ConnectedNorms: connected,
CrossRegime: crossRegime,
HumanReviewFlag: humanReview,
WinnerMargin: margin,
ScoreReasoning: assessReasoning(p, margin, crossRegime, primaryBinding),
}
}
func primaryLabel(p LegalSearchResult) string {
if p.CitationUnit != "" {
return p.CitationUnit
}
if p.ArticleLabel != "" {
return p.ArticleLabel
}
return strings.TrimSpace(p.RegulationShort + " " + p.Article)
}
// assessReasoning renders a short, human-readable justification (German).
func assessReasoning(p LegalSearchResult, margin float64, crossRegime, primaryBinding bool) string {
label := primaryLabel(p)
parts := make([]string, 0, 4)
if primaryBinding {
parts = append(parts, fmt.Sprintf("Primärtreffer %s: bindendes Recht (Autorität %d).", label, p.AuthorityWeight))
} else {
parts = append(parts, fmt.Sprintf("Primärtreffer %s ist keine bindende Norm (Leitlinie/Standard) — Quelle prüfen.", label))
}
if margin > 0 {
parts = append(parts, fmt.Sprintf("Vorsprung %.2f vor #2.", margin))
}
if margin < assessReviewMargin {
parts = append(parts, "Knapper Vorsprung — Alternativtreffer prüfen.")
}
if crossRegime {
parts = append(parts, "Mehrere Regime betroffen — Querbezug prüfen.")
}
return strings.Join(parts, " ")
}
// distinctNorms collapses results that share a citation (multiple chunks of the
// same article/annex) to the first — i.e. highest-ranked — occurrence. Results
// without any citation identity are each kept, since they cannot be matched.
func distinctNorms(results []LegalSearchResult) []LegalSearchResult {
seen := make(map[string]bool, len(results))
out := make([]LegalSearchResult, 0, len(results))
for _, r := range results {
key := r.CitationUnit
if key == "" {
key = r.ArticleLabel
}
if key != "" {
if seen[key] {
continue
}
seen[key] = true
}
out = append(out, r)
}
return out
}
// dedupStrings concatenates out+in, drops empties and the excluded value, and
// returns a stable de-duplicated slice (insertion order preserved).
func dedupStrings(out, in []string, exclude string) []string {
seen := map[string]bool{exclude: true}
res := make([]string, 0, len(out)+len(in))
for _, list := range [][]string{out, in} {
for _, s := range list {
if s == "" || seen[s] {
continue
}
seen[s] = true
res = append(res, s)
}
}
return res
}
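The human-review condition in `Assess` combines three independent triggers. The sketch below isolates that rule; `needsReview` and `reviewMargin` are hypothetical names introduced for illustration, and the inputs are the score pairs used in the accompanying tests.

```go
package main

import "fmt"

const reviewMargin = 0.05 // mirrors assessReviewMargin

// needsReview restates the flag condition from Assess in isolation:
// a tight winner margin, several regimes in the window, or a non-binding
// primary norm each trigger a review on their own.
func needsReview(top, runnerUp float64, crossRegime, primaryBinding bool) bool {
	margin := top - runnerUp
	return margin < reviewMargin || crossRegime || !primaryBinding
}

func main() {
	fmt.Println(needsReview(1.05, 0.80, false, true))  // healthy margin, one regime, binding
	fmt.Println(needsReview(1.00, 0.98, false, true))  // tight margin
	fmt.Println(needsReview(1.05, 0.70, true, true))   // cross-regime
	fmt.Println(needsReview(0.90, 0.40, false, false)) // non-binding primary
}
```

Only the first call returns false. Note that in `Assess` the margin stays at 0.0 when the result set collapses to a single distinct norm, so that case also falls under the tight-margin trigger.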
@@ -0,0 +1,112 @@
package ucca
import "testing"
func ares(reg, cu, sc string, score float64, weight int, out, in []string) LegalSearchResult {
return LegalSearchResult{
RegulationShort: reg, CitationUnit: cu, SourceClass: sc, Score: score,
AuthorityWeight: weight, ReferencesOut: out, ReferencesIn: in,
}
}
func TestAssess_Empty(t *testing.T) {
if Assess(nil) != nil {
t.Error("empty results → nil assessment")
}
}
func TestAssess_BindingPrimary_NoReview(t *testing.T) {
results := []LegalSearchResult{
ares("CRA", "Art. 13 CRA", "binding_law", 1.05, 100,
[]string{"CRA Anhang I", "Art. 14 CRA"}, []string{"Art. 12 CRA"}),
ares("CRA", "Art. 14 CRA", "binding_law", 0.80, 100, nil, nil),
}
a := Assess(results)
if a == nil {
t.Fatal("nil assessment")
}
if a.PrimaryNorm != "Art. 13 CRA" || a.PrimaryRegulation != "CRA" {
t.Errorf("primary wrong: %+v", a)
}
if len(a.ConnectedNorms) != 3 { // out(2) + in(1), self excluded, deduped
t.Errorf("connected norms: %v", a.ConnectedNorms)
}
if a.CrossRegime {
t.Error("single regime must not be cross-regime")
}
if a.WinnerMargin < 0.24 || a.WinnerMargin > 0.26 {
t.Errorf("margin = %v, want ~0.25", a.WinnerMargin)
}
if a.HumanReviewFlag {
t.Error("clean binding + healthy margin + single regime → no review")
}
}
func TestAssess_CrossRegimeFlagsReview(t *testing.T) {
a := Assess([]LegalSearchResult{
ares("CRA", "Art. 13 CRA", "binding_law", 1.05, 100, nil, nil),
ares("DORA", "Art. 6 DORA", "binding_law", 0.70, 100, nil, nil),
})
if !a.CrossRegime || !a.HumanReviewFlag {
t.Errorf("cross-regime must flag review: %+v", a)
}
}
func TestAssess_NonBindingFlagsReview(t *testing.T) {
a := Assess([]LegalSearchResult{
ares("ENISA", "ENISA SBOM", "supervisory_guidance", 0.90, 70, nil, nil),
ares("ENISA", "ENISA X", "supervisory_guidance", 0.40, 70, nil, nil),
})
if !a.HumanReviewFlag {
t.Error("non-binding primary → review")
}
}
func TestAssess_TightMarginFlagsReview(t *testing.T) {
a := Assess([]LegalSearchResult{
ares("CRA", "Art. 13 CRA", "binding_law", 1.00, 100, nil, nil),
ares("CRA", "Art. 14 CRA", "binding_law", 0.98, 100, nil, nil),
})
if a.WinnerMargin >= 0.05 || !a.HumanReviewFlag {
t.Errorf("tight margin → review: %+v", a)
}
}
func TestAssess_MarginIsNormLevelNotChunkLevel(t *testing.T) {
// Two near-identical chunks of the SAME norm at the top, then a distinct norm.
results := []LegalSearchResult{
ares("CRA", "Art. 13 CRA", "binding_law", 1.050, 100, []string{"CRA Anhang I"}, nil),
ares("CRA", "Art. 13 CRA", "binding_law", 1.049, 100, nil, nil), // same norm
ares("CRA", "Art. 14 CRA", "binding_law", 0.800, 100, nil, nil),
}
a := Assess(results)
if a.WinnerMargin < 0.24 || a.WinnerMargin > 0.26 { // Art.13 vs Art.14, not chunk vs chunk
t.Errorf("margin must be norm-level (~0.25), got %v", a.WinnerMargin)
}
if a.HumanReviewFlag {
t.Error("healthy norm-level margin → no review")
}
}
func TestDistinctNorms(t *testing.T) {
got := distinctNorms([]LegalSearchResult{
{CitationUnit: "Art. 13 CRA"},
{CitationUnit: "Art. 13 CRA"}, // duplicate norm → collapsed
{CitationUnit: "Art. 14 CRA"},
{CitationUnit: ""}, // no identity → kept
{CitationUnit: ""}, // no identity → kept
})
if len(got) != 4 {
t.Errorf("want 4 (2 distinct + 2 unidentified), got %d", len(got))
}
}
func TestDedupStrings(t *testing.T) {
got := dedupStrings([]string{"a", "b", "", "a"}, []string{"b", "c"}, "self")
if len(got) != 3 || got[0] != "a" || got[1] != "b" || got[2] != "c" {
t.Errorf("dedup: %v", got)
}
if len(dedupStrings([]string{"self"}, nil, "self")) != 0 {
t.Error("excluded value must be dropped")
}
}
@@ -20,6 +20,7 @@ type LegalRAGClient struct {
httpClient *http.Client
textIndexEnsured map[string]bool
hybridEnabled bool
graphEnabled bool
}
// NewLegalRAGClient creates a new Legal RAG client using Ollama bge-m3 embeddings.
@@ -38,6 +39,11 @@ func NewLegalRAGClient() *LegalRAGClient {
}
hybridEnabled := os.Getenv("RAG_HYBRID_SEARCH") != "false"
// Graph expansion is OPT-IN: no measured ranking benefit over binding augmentation,
// +1 Qdrant call per search, and a risk of pool flooding via reverse edges. It stays
// available as a recall safety net for later gaps (RAG_GRAPH_EXPANSION=true). By default
// the graph edges are used in the response for reasoning/completeness, not for pool expansion.
graphEnabled := os.Getenv("RAG_GRAPH_EXPANSION") == "true"
return &LegalRAGClient{
qdrantURL: qdrantURL,
@@ -47,6 +53,7 @@ func NewLegalRAGClient() *LegalRAGClient {
collection: "bp_compliance_ce",
textIndexEnsured: make(map[string]bool),
hybridEnabled: hybridEnabled,
graphEnabled: graphEnabled,
httpClient: &http.Client{
Timeout: 60 * time.Second,
},
@@ -93,6 +100,29 @@ func (c *LegalRAGClient) searchInternal(ctx context.Context, collection string,
hits = denseHits
}
// Stratified: ADD the binding_law pool (do not replace it), so the obligation source is
// always a candidate; guidance is kept as interpretation context. Best-effort: an
// error in the binding query silently degrades to the semantic pool.
if bindingHits, bErr := c.searchBinding(ctx, collection, embedding, topK); bErr == nil {
hits = mergeDedupHits(hits, bindingHits)
}
// Control augmentation: on an explicit implementation question, pull a deep dense pool and
// keep only the control-pool roles, so that NIST/CRA Anhang (dense rank ~8-9, below the
// small top-K) become candidates. Re-rank/applyControlRoles order them afterwards.
if queryWantsControls(query) {
if controlHits, cErr := c.searchControls(ctx, collection, embedding); cErr == nil {
hits = mergeDedupHits(hits, controlHits)
}
}
// Graph augmentation: pull the norms that the top hits cite (references_out) into the pool
// via the precise citation edge, e.g. Art. 13 CRA pulls in Anhang I (the actual obligation
// source). Pool augmentation only; re-rank + topK remain.
if c.graphEnabled {
hits = c.expandViaGraph(ctx, collection, hits)
}
results := make([]LegalSearchResult, len(hits))
for i, hit := range hits {
// Legal metadata per rag_reingest_spec.md §2: prefers the normalized fields
@@ -121,12 +151,45 @@ func (c *LegalRAGClient) searchInternal(ctx context.Context, collection string,
Pages: getIntSlice(hit.Payload, "pages"),
SourceURL: getString(hit.Payload, "source"),
Score: hit.Score,
AuthorityWeight: getInt(hit.Payload, "authority_weight"),
SourceClass: getString(hit.Payload, "source_class"),
Jurisdiction: getString(hit.Payload, "jurisdiction"),
CitationUnit: getString(hit.Payload, "citation_unit"),
ReferencesOut: getStringSlice(hit.Payload, "references_out"),
ReferencesIn: getStringSlice(hit.Payload, "references_in"),
Superseded: getString(hit.Payload, "status") == "superseded",
}
}
// Authority-aware re-ranking: binding law of the matching jurisdiction/domain moves up,
// guidance/foreign law/off-domain moves down (nothing is deleted). Ordering only,
// response schema unchanged. Score carries the authority score so that downstream
// multi-collection merges (Advisor) preserve the order.
results = rerankByAuthority(query, results)
if topK > 0 && len(results) > topK {
results = results[:topK]
}
return results, nil
}
// mergeDedupHits concatenates two hit lists, keeping the first occurrence of each point ID.
func mergeDedupHits(primary, extra []qdrantSearchHit) []qdrantSearchHit {
seen := make(map[string]bool, len(primary)+len(extra))
out := make([]qdrantSearchHit, 0, len(primary)+len(extra))
for _, list := range [][]qdrantSearchHit{primary, extra} {
for _, h := range list {
id := fmt.Sprint(h.ID)
if seen[id] {
continue
}
seen[id] = true
out = append(out, h)
}
}
return out
}
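A short sketch of the first-occurrence-wins merge, assuming the hit ID is an `interface{}` that may hold either a number or a UUID string (the real `qdrantSearchHit` definition is not part of this diff; `hit` and `mergeDedup` are simplified stand-ins):

```go
package main

import "fmt"

// hit is a hypothetical, reduced version of qdrantSearchHit.
type hit struct {
	ID    interface{}
	Score float64
}

// mergeDedup concatenates both lists and keeps the first occurrence of each
// ID. fmt.Sprint gives numeric and string IDs one comparable key type.
func mergeDedup(primary, extra []hit) []hit {
	seen := make(map[string]bool, len(primary)+len(extra))
	out := make([]hit, 0, len(primary)+len(extra))
	for _, list := range [][]hit{primary, extra} {
		for _, h := range list {
			id := fmt.Sprint(h.ID)
			if seen[id] {
				continue
			}
			seen[id] = true
			out = append(out, h)
		}
	}
	return out
}

func main() {
	primary := []hit{{ID: 1, Score: 0.90}, {ID: "a-uuid", Score: 0.80}}
	extra := []hit{{ID: 1, Score: 0.50}, {ID: 2, Score: 0.40}}
	for _, h := range mergeDedup(primary, extra) {
		fmt.Printf("%v %.2f\n", h.ID, h.Score)
	}
}
```

Because the primary list is walked first, a point that appears in both pools keeps the score from the primary search; the augmentation pools can only add candidates, never overwrite an existing one.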
// FormatLegalContextForPrompt formats the legal context for inclusion in an LLM prompt.
func (c *LegalRAGClient) FormatLegalContextForPrompt(lc *LegalContext) string {
if lc == nil || len(lc.Results) == 0 {
@@ -0,0 +1,162 @@
package ucca
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"sort"
)
// Graph-augmented retrieval: when a top hit cites an annex/article (references_out),
// pull that connected norm into the candidate pool via the PRECISE citation graph
// instead of hoping semantic search surfaces it. E.g. a hit on CRA Art. 13 pulls in
// CRA Anhang I (the actual requirement). Reverse edges (references_in) are not
// expanded; see expandViaGraph. Pool-augmentation only: authority re-rank + topK
// slice still apply, so the response schema is unchanged.
const (
graphSeedCount = 5 // only the top hits seed the expansion
graphMaxExpand = 15 // cap connected norms pulled in (avoid pool explosion)
graphHopPenalty = 0.05 // a one-hop neighbour ranks just below its seed
)
// expandViaGraph augments hits with the norms that the top seeds cite (forward
// references_out edges only). Best-effort: on any error (or nothing to expand)
// the original hits are returned unchanged.
func (c *LegalRAGClient) expandViaGraph(ctx context.Context, collection string, hits []qdrantSearchHit) []qdrantSearchHit {
if len(hits) == 0 {
return hits
}
present := make(map[string]bool, len(hits))
for _, h := range hits {
if cu := getString(h.Payload, "citation_unit"); cu != "" {
present[cu] = true
}
}
seeds := hits
if len(seeds) > graphSeedCount {
seeds = seeds[:graphSeedCount]
}
// Forward edges only (references_out = the detail a hit explicitly points to,
// e.g. Art. 13 → Anhang I). Reverse (references_in) has high fan-out for popular
// annexes (Anhang I is cited by 23 articles) → pool flooding; it is surfaced as
// connected-norm metadata in the Phase 2 response instead of expanding the pool.
want := make(map[string]float64) // connected citation_unit -> best seeding score
for _, h := range seeds {
for _, cu := range getStringSlice(h.Payload, "references_out") {
if cu == "" || present[cu] {
continue
}
if s, ok := want[cu]; !ok || h.Score > s {
want[cu] = h.Score
}
}
}
if len(want) == 0 {
return hits
}
units := topByScore(want, graphMaxExpand)
fetched, err := c.fetchByCitationUnits(ctx, collection, units)
if err != nil || len(fetched) == 0 {
return hits
}
neighbours := make([]qdrantSearchHit, 0, len(fetched))
for cu, pt := range fetched {
neighbours = append(neighbours, qdrantSearchHit{ID: pt.ID, Score: want[cu] - graphHopPenalty, Payload: pt.Payload})
}
return mergeDedupHits(hits, neighbours)
}
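The neighbour scoring in `expandViaGraph` can be traced with a reduced sketch. The `seed` type and `neighbourScores` function are hypothetical; they keep only the parts that decide which citation units are wanted and what score each one inherits.

```go
package main

import "fmt"

const hopPenalty = 0.05 // mirrors graphHopPenalty

// seed is a hypothetical, reduced view of a top hit: its score and the
// citation units it points to.
type seed struct {
	score   float64
	refsOut []string
}

// neighbourScores returns, for every cited unit not already in the pool,
// the best seeding score minus the one-hop penalty.
func neighbourScores(seeds []seed, present map[string]bool) map[string]float64 {
	best := map[string]float64{}
	for _, s := range seeds {
		for _, cu := range s.refsOut {
			if cu == "" || present[cu] {
				continue
			}
			if cur, ok := best[cu]; !ok || s.score > cur {
				best[cu] = s.score
			}
		}
	}
	out := make(map[string]float64, len(best))
	for cu, sc := range best {
		out[cu] = sc - hopPenalty
	}
	return out
}

func main() {
	seeds := []seed{
		{0.70, []string{"CRA Anhang I", "Art. 14 CRA"}},
		{0.60, []string{"CRA Anhang I"}},
	}
	present := map[string]bool{"Art. 14 CRA": true} // already a candidate
	scores := neighbourScores(seeds, present)
	fmt.Printf("%d %.2f\n", len(scores), scores["CRA Anhang I"])
}
```

The annex is cited by both seeds but inherits the higher score (0.70), so it lands at 0.65, just below its strongest seed; the article that is already in the pool is not fetched again.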
// topByScore returns up to n keys with the highest values. Deterministic: ties
// broken by the key string so the cap is stable across runs.
func topByScore(m map[string]float64, n int) []string {
keys := make([]string, 0, len(m))
for k := range m {
keys = append(keys, k)
}
sort.Slice(keys, func(i, j int) bool {
if m[keys[i]] != m[keys[j]] {
return m[keys[i]] > m[keys[j]]
}
return keys[i] < keys[j]
})
if len(keys) > n {
keys = keys[:n]
}
return keys
}
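// fetchByCitationUnits loads one representative point per citation_unit from the
// given collection: the first point the scroll returns for that unit. The scroll
// fetches a single page of len(units)*4 points, so a unit can be missing from the
// result when chunks of other units fill the page.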
func (c *LegalRAGClient) fetchByCitationUnits(ctx context.Context, collection string, units []string) (map[string]qdrantScrollPoint, error) {
should := make([]map[string]interface{}, 0, len(units))
for _, cu := range units {
should = append(should, map[string]interface{}{"key": "citation_unit", "match": map[string]interface{}{"value": cu}})
}
reqBody := map[string]interface{}{
"limit": len(units) * 4,
"with_payload": true,
"with_vectors": false,
"filter": map[string]interface{}{"should": should},
}
jsonBody, err := json.Marshal(reqBody)
if err != nil {
return nil, err
}
url := fmt.Sprintf("%s/collections/%s/points/scroll", c.qdrantURL, collection)
req, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(jsonBody))
if err != nil {
return nil, err
}
req.Header.Set("Content-Type", "application/json")
if c.qdrantAPIKey != "" {
req.Header.Set("api-key", c.qdrantAPIKey)
}
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
return nil, fmt.Errorf("qdrant scroll returned %d: %s", resp.StatusCode, string(body))
}
var scrollResp qdrantScrollResponse
if err := json.NewDecoder(resp.Body).Decode(&scrollResp); err != nil {
return nil, err
}
out := make(map[string]qdrantScrollPoint, len(units))
for _, pt := range scrollResp.Result.Points {
cu := getString(pt.Payload, "citation_unit")
if cu != "" {
if _, seen := out[cu]; !seen {
out[cu] = pt
}
}
}
return out, nil
}
// getStringSlice extracts a []string from a Qdrant payload list field
// (references_out / references_in are stored as JSON arrays of strings).
func getStringSlice(m map[string]interface{}, key string) []string {
v, ok := m[key]
if !ok {
return nil
}
arr, ok := v.([]interface{})
if !ok {
return nil
}
out := make([]string, 0, len(arr))
for _, item := range arr {
if s, ok := item.(string); ok {
out = append(out, s)
}
}
return out
}
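The two-step type assertion in `getStringSlice` follows from how `encoding/json` decodes into `interface{}`: a JSON array becomes `[]interface{}`, never `[]string`. A minimal sketch, with `decode` and `stringSlice` as hypothetical stand-ins and an invented sample payload:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decode unmarshals a JSON object into the generic payload shape.
func decode(raw string) map[string]interface{} {
	var payload map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &payload); err != nil {
		panic(err)
	}
	return payload
}

// stringSlice extracts the string elements of a list field and returns nil
// when the key is missing or the value is not a list.
func stringSlice(m map[string]interface{}, key string) []string {
	arr, ok := m[key].([]interface{})
	if !ok {
		return nil
	}
	out := make([]string, 0, len(arr))
	for _, item := range arr {
		if s, ok := item.(string); ok {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	payload := decode(`{"references_out": ["CRA Anhang I", 7, "Art. 14 CRA"], "status": "active"}`)
	fmt.Println(stringSlice(payload, "references_out"))
	fmt.Println(stringSlice(payload, "status") == nil)
}
```

A direct assertion to `[]string` would fail on decoded JSON, which is why each element is checked individually; non-string elements such as the number 7 are dropped rather than causing an error.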
@@ -0,0 +1,89 @@
package ucca
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
)
func TestGetStringSlice(t *testing.T) {
m := map[string]interface{}{
"refs": []interface{}{"a", "b", 3, "c"}, // non-strings are skipped
"str": "not-a-list",
}
got := getStringSlice(m, "refs")
if len(got) != 3 || got[0] != "a" || got[2] != "c" {
t.Errorf("refs: %v", got)
}
if getStringSlice(m, "missing") != nil {
t.Error("missing key should be nil")
}
if getStringSlice(m, "str") != nil {
t.Error("non-list should be nil")
}
}
func TestTopByScore_DeterministicCap(t *testing.T) {
m := map[string]float64{"x": 0.5, "y": 0.9, "z": 0.5, "w": 0.7}
got := topByScore(m, 2)
if len(got) != 2 || got[0] != "y" || got[1] != "w" {
t.Errorf("want [y w], got %v", got)
}
all := topByScore(m, 10)
if all[2] != "x" || all[3] != "z" { // tie 0.5 broken by key string
t.Errorf("tie-break not deterministic: %v", all)
}
}
func TestExpandViaGraph_NoSeedsOrRefs(t *testing.T) {
c := &LegalRAGClient{} // nil httpClient → must not be called on these paths
if out := c.expandViaGraph(context.Background(), "x", nil); out != nil {
t.Error("empty hits should return nil")
}
hits := []qdrantSearchHit{{ID: 1, Score: 0.8, Payload: map[string]interface{}{"citation_unit": "Art. 1 CRA"}}}
if out := c.expandViaGraph(context.Background(), "x", hits); len(out) != 1 {
t.Errorf("no references → unchanged, got %d", len(out))
}
}
func TestExpandViaGraph_PullsConnectedNorm(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
_ = json.NewEncoder(w).Encode(map[string]interface{}{
"result": map[string]interface{}{
"points": []map[string]interface{}{
{"id": 99, "payload": map[string]interface{}{
"citation_unit": "CRA Anhang I", "chunk_text": "Sicherheitsanforderungen",
"source_class": "binding_law", "authority_weight": 100, "regulation_short": "CRA",
}},
},
"next_page_offset": nil,
},
})
}))
defer srv.Close()
c := &LegalRAGClient{qdrantURL: srv.URL, httpClient: srv.Client()}
hits := []qdrantSearchHit{
{ID: 1, Score: 0.70, Payload: map[string]interface{}{
"citation_unit": "Art. 13 CRA", "references_out": []interface{}{"CRA Anhang I"},
}},
}
out := c.expandViaGraph(context.Background(), "bp_compliance_ce", hits)
if len(out) != 2 {
t.Fatalf("want 2 hits (seed + connected annex), got %d", len(out))
}
var found *qdrantSearchHit
for i := range out {
if getString(out[i].Payload, "citation_unit") == "CRA Anhang I" {
found = &out[i]
}
}
if found == nil {
t.Fatal("connected norm CRA Anhang I was not pulled into the pool")
}
if found.Score < 0.64 || found.Score > 0.66 { // 0.70 seed - 0.05 hop penalty
t.Errorf("connected score = %v, want ~0.65", found.Score)
}
}
@@ -185,6 +185,55 @@ func (c *LegalRAGClient) searchDense(ctx context.Context, collection string, emb
searchReq.Filter = &qdrantFilter{Should: conditions}
}
return c.doPointsSearch(ctx, collection, searchReq)
}
// searchBinding fetches the top binding_law hits (authority-stratified pool) so the
// obligation source is always a candidate even when guidance dominates semantically.
// It AUGMENTS the semantic pool — guidance is preserved as interpretation context.
func (c *LegalRAGClient) searchBinding(ctx context.Context, collection string, embedding []float64, topK int) ([]qdrantSearchHit, error) {
searchReq := qdrantSearchRequest{
Vector: embedding,
Limit: topK,
WithPayload: true,
Filter: &qdrantFilter{Must: []qdrantCondition{
{Key: "source_class", Match: qdrantMatch{Value: "binding_law"}},
}},
}
return c.doPointsSearch(ctx, collection, searchReq)
}
// controlPoolDepth is how deep the dense control pull reaches. Measured: for an EU-cyber
// control query the relevant control sources sit at dense rank ~8-9 (NIST, CRA Annex), far
// below the client's small top-K — so a fixed dense depth of 60 reliably surfaces them.
const controlPoolDepth = 60
// searchControls fetches a DEEP dense pool and keeps only the control-pool roles, so control
// sources that the small top-K (hybrid) search misses become candidates on an implementation
// question. Role is derived in code (no source_role tag needed). AUGMENTS the pool — the
// caller gates it on control-intent.
func (c *LegalRAGClient) searchControls(ctx context.Context, collection string, embedding []float64) ([]qdrantSearchHit, error) {
searchReq := qdrantSearchRequest{
Vector: embedding,
Limit: controlPoolDepth,
WithPayload: true,
}
hits, err := c.doPointsSearch(ctx, collection, searchReq)
if err != nil {
return nil, err
}
kept := make([]qdrantSearchHit, 0, len(hits))
for _, h := range hits {
if isControlPoolRole(controlRoleOf(h.Payload)) {
kept = append(kept, h)
}
}
return kept, nil
}
// doPointsSearch issues a POST /points/search and decodes the hits.
func (c *LegalRAGClient) doPointsSearch(ctx context.Context, collection string, searchReq qdrantSearchRequest) ([]qdrantSearchHit, error) {
jsonBody, err := json.Marshal(searchReq)
if err != nil {
return nil, fmt.Errorf("failed to marshal search request: %w", err)
@@ -0,0 +1,135 @@
package ucca
import "testing"
func intentRes(reg, sourceClass string, sem float64, weight int) LegalSearchResult {
return LegalSearchResult{
RegulationShort: reg, SourceClass: sourceClass, Score: sem,
AuthorityWeight: weight, Jurisdiction: "EU",
}
}
func TestQueryWantsGuidance(t *testing.T) {
wants := []string{
"Was empfiehlt der EDPB zum DSB?",
"Was sagt die ENISA zu Security Updates?",
"laut DSK ...",
"Orientierungshilfe zur DSFA",
"Welche BSI-Empfehlung gilt?",
"Auslegung der Aufsichtsbehörde",
}
plain := []string{
"Ab wann braucht man einen Datenschutzbeauftragten?",
"Welche Anforderungen bestehen an Security Updates?",
}
for _, q := range wants {
if !queryWantsGuidance(q) {
t.Errorf("should detect interpretation intent: %q", q)
}
}
for _, q := range plain {
if queryWantsGuidance(q) {
t.Errorf("should NOT detect intent (norm question): %q", q)
}
}
}
func TestRerank_NormQuestion_BindingStaysTop(t *testing.T) {
// No intent signal → binding wins even though guidance is semantically higher.
results := []LegalSearchResult{
intentRes("EDPB DPO", "supervisory_guidance", 0.64, 70),
intentRes("DSGVO", "binding_law", 0.58, 100),
}
out := rerankByAuthority("Ab wann braucht man einen Datenschutzbeauftragten?", results)
if out[0].SourceClass != "binding_law" {
t.Errorf("norm question: binding must stay Top-1, got %s", out[0].SourceClass)
}
}
func TestRerank_InterpretationQuestion_GuidanceMayWin(t *testing.T) {
// Explicit intent + guidance semantically competitive → guidance wins.
results := []LegalSearchResult{
intentRes("EDPB DPO", "supervisory_guidance", 0.64, 70),
intentRes("DSGVO", "binding_law", 0.58, 100),
}
out := rerankByAuthority("Was empfiehlt der EDPB zum Datenschutzbeauftragten?", results)
if out[0].SourceClass != "supervisory_guidance" {
t.Errorf("interpretation question: guidance should win Top-1, got %s", out[0].SourceClass)
}
}
func TestRerank_OffTopicGuidance_BlockedByGuard(t *testing.T) {
// Intent present, but guidance semantic is far below the best binding hit →
// the margin guard keeps binding on top (no off-topic guideline override).
results := []LegalSearchResult{
intentRes("EDPB DPO", "supervisory_guidance", 0.40, 70),
intentRes("DSGVO", "binding_law", 0.58, 100),
}
out := rerankByAuthority("Was empfiehlt der EDPB zum Datenschutzbeauftragten?", results)
if out[0].SourceClass != "binding_law" {
t.Errorf("off-topic guidance must not win even with intent, got %s", out[0].SourceClass)
}
}
func TestQueryWantsControls(t *testing.T) {
wants := []string{
"Welche Controls passen zu Security Updates?",
"Welche Maßnahmen sollten wir umsetzen?",
"Wie härten wir den Server ab?",
"Gibt es NIST-Controls dafür?",
"OWASP Best Practice für Logging?",
"BSI Grundschutz Bausteine",
}
plain := []string{
"Welche Anforderungen bestehen an Security Updates?",
"Ab wann braucht man einen Datenschutzbeauftragten?",
}
for _, q := range wants {
if !queryWantsControls(q) {
t.Errorf("should detect control/implementation intent: %q", q)
}
}
for _, q := range plain {
if queryWantsControls(q) {
t.Errorf("should NOT detect control intent (norm question): %q", q)
}
}
}
func TestRerank_ControlQuestion_OperationalReqTop(t *testing.T) {
// User priority for implementation questions: operational_requirement (binding concrete,
// CRA Anhang I) > control_standard (NIST). Both are in the control-pool; op_req wins.
results := []LegalSearchResult{
{RegulationShort: "NIST SP 800-82r3", ArticleLabel: "AU-8", SourceClass: "technical_standard", AuthorityWeight: 80, Jurisdiction: "EU", Score: 0.60},
{RegulationShort: "CRA", ArticleLabel: "CRA Anhang I", Category: "regulation", Score: 0.58},
}
out := rerankByAuthority("Welche Controls und Massnahmen passen zu Security Updates?", results)
if out[0].RegulationShort != "CRA" {
t.Errorf("operational_requirement (CRA Anhang I) should be Top-1 over control_standard, got %q", out[0].RegulationShort)
}
}
func TestRerank_NormQuestion_BindingOverStandard(t *testing.T) {
// "Anforderungen" → no control intent → binding obligation stays Top-1 over the standard.
results := []LegalSearchResult{
intentRes("NIST SP 800-82", "technical_standard", 0.62, 80),
intentRes("CRA", "binding_law", 0.58, 100),
}
out := rerankByAuthority("Welche Anforderungen bestehen an Security Updates?", results)
if out[0].SourceClass != "binding_law" {
t.Errorf("norm question: binding must stay Top-1 over standard, got %s", out[0].SourceClass)
}
}
func TestRerank_ControlQuestion_PoolBeatsBareObligation(t *testing.T) {
// A control-pool source (NIST control_standard) outranks an abstract obligation with no
// domain/topic advantage, because the implementation intent boosts the control-pool.
results := []LegalSearchResult{
{RegulationShort: "NIST SP 800-82r3", ArticleLabel: "AU-8", SourceClass: "technical_standard", AuthorityWeight: 80, Jurisdiction: "EU", Score: 0.55},
{RegulationShort: "XYZ", ArticleLabel: "Art. 5 XYZ", Category: "regulation", Score: 0.58},
}
out := rerankByAuthority("Welche Controls und Massnahmen passen zu Security Updates?", results)
if out[0].RegulationShort != "NIST SP 800-82r3" {
t.Errorf("control_standard should beat a bare abstract obligation on a control question, got %q", out[0].RegulationShort)
}
}
@@ -225,6 +225,18 @@ func getIntSlice(m map[string]interface{}, key string) []int {
return result
}
func getInt(m map[string]interface{}, key string) int {
if v, ok := m[key]; ok {
switch n := v.(type) {
case float64:
return int(n)
case int:
return n
}
}
return 0
}
func contains(slice []string, item string) bool {
for _, s := range slice {
if s == item {
@@ -0,0 +1,30 @@
package ucca
import "testing"
// A superseded legacy source must rank below the same result when it is NOT
// superseded (the eu-v1 norm), but it is only demoted: the penalty is finite, so it
// stays in the pool and remains findable for history/transition questions.
func TestAuthorityScore_SupersededIsDemotedNotRemoved(t *testing.T) {
fresh := LegalSearchResult{
Score: 0.65, SourceClass: "binding_law", AuthorityWeight: 100,
Jurisdiction: "EU", RegulationShort: "CRA", Article: "13",
}
old := fresh
old.Superseded = true
sFresh := authorityScore("CRA Sicherheitsupdates Hersteller", fresh, "", false)
sOld := authorityScore("CRA Sicherheitsupdates Hersteller", old, "", false)
if sOld >= sFresh {
t.Errorf("superseded must score lower: fresh=%.3f superseded=%.3f", sFresh, sOld)
}
gap := sFresh - sOld
if gap < supersededPenalty-0.001 || gap > supersededPenalty+0.001 {
t.Errorf("demotion should equal supersededPenalty (%.2f), got %.3f", supersededPenalty, gap)
}
// The score must not collapse (it stays above -1) → present in the pool, not hidden.
if sOld <= -1 {
t.Errorf("superseded score collapsed (%.3f) — must remain findable", sOld)
}
}
@@ -399,8 +399,9 @@ func TestHybridSearch_UsesQueryAPI(t *testing.T) {
return
}
-// Fallback: should not reach dense search
-t.Error("Unexpected dense search call when hybrid succeeded")
+// /points/search is now the stratified binding-law augmentation query (it AUGMENTS
+// the hybrid pool, it is not a dense fallback). Return empty so the hybrid hit
+// remains the sole result for this test.
json.NewEncoder(w).Encode(qdrantSearchResponse{Result: []qdrantSearchHit{}})
}))
defer qdrantMock.Close()
@@ -446,6 +447,59 @@ func TestHybridSearch_UsesQueryAPI(t *testing.T) {
}
}
// TestSearch_StratifiedBindingRerank verifies that the binding-law pool augments the
// semantic pool and that authority re-ranking lifts binding law above higher-semantic guidance.
func TestSearch_StratifiedBindingRerank(t *testing.T) {
ollamaMock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
json.NewEncoder(w).Encode(ollamaEmbeddingResponse{Embedding: make([]float64, 1024)})
}))
defer ollamaMock.Close()
qdrantMock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if strings.Contains(r.URL.Path, "/index") {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"result":{"status":"completed"}}`))
return
}
if strings.Contains(r.URL.Path, "/points/query") {
json.NewEncoder(w).Encode(qdrantQueryResponse{Result: []qdrantSearchHit{
{ID: "g1", Score: 0.72, Payload: map[string]interface{}{
"chunk_text": "ENISA guidance", "regulation_short": "ENISA",
"article_label": "ENISA CRA Mapping", "source_class": "supervisory_guidance",
"authority_weight": float64(70), "jurisdiction": "EU",
}},
}})
return
}
// /points/search = stratified binding-law pool (source_class=binding_law)
json.NewEncoder(w).Encode(qdrantSearchResponse{Result: []qdrantSearchHit{
{ID: "b1", Score: 0.66, Payload: map[string]interface{}{
"chunk_text": "CRA Anhang I requirement", "regulation_short": "CRA",
"article_label": "CRA Anhang I", "source_class": "binding_law",
"authority_weight": float64(100), "jurisdiction": "EU",
}},
}})
}))
defer qdrantMock.Close()
client := &LegalRAGClient{
qdrantURL: qdrantMock.URL, ollamaURL: ollamaMock.URL, embeddingModel: "bge-m3",
collection: "bp_compliance_ce", textIndexEnsured: make(map[string]bool),
hybridEnabled: true, httpClient: http.DefaultClient,
}
results, err := client.Search(context.Background(), "Was gilt hier?", nil, 5)
if err != nil {
t.Fatalf("search failed: %v", err)
}
if len(results) != 2 {
t.Fatalf("expected 2 merged results (guidance + binding), got %d", len(results))
}
if results[0].RegulationShort != "CRA" {
t.Errorf("binding CRA must rank first over higher-semantic guidance, got %q", results[0].RegulationShort)
}
}
func TestHybridSearch_FallbackToDense(t *testing.T) {
var requestedPaths []string
@@ -20,6 +20,38 @@ type LegalSearchResult struct {
Pages []int `json:"pages,omitempty"`
SourceURL string `json:"source_url"`
Score float64 `json:"score"`
// Internal fields for authority re-ranking (phase 1). NOT serialized
// (json:"-"), so no contract change. Populated from the Qdrant payload and used
// only for sorting in rerankByAuthority.
AuthorityWeight int `json:"-"`
SourceClass string `json:"-"`
Jurisdiction string `json:"-"`
// Citation graph (phase 2): internal, feeds only the assessment computation
// (connected norms, reasoning). The per-result schema stays frozen.
CitationUnit string `json:"-"`
ReferencesOut []string `json:"-"`
ReferencesIn []string `json:"-"`
// Supersede status (status="superseded", use_for_primary=false): a legacy source
// that is demoted for default questions (not hidden; still findable for history).
Superseded bool `json:"-"`
}
// LegalAssessment is the auditable explanation layer over a ranked result set:
// which norm is primary, which norms connect to it via the citation graph,
// whether the answer crosses regulatory regimes, and whether a human should
// review. Computed from the already-ranked results — it EXPLAINS retrieval, it
// does not change it (graph edges for reasoning/completeness, not pool-expansion).
type LegalAssessment struct {
PrimaryNorm string `json:"primary_norm"`
PrimaryRegulation string `json:"primary_regulation"`
ConnectedNorms []string `json:"connected_norms"`
CrossRegime bool `json:"cross_regime"`
HumanReviewFlag bool `json:"human_review_flag"`
WinnerMargin float64 `json:"winner_margin"`
ScoreReasoning string `json:"score_reasoning"`
}
// LegalContext represents aggregated legal context for an assessment.
@@ -45,6 +45,11 @@ class LLMChecker:
text = doc.text or ""
if len(text) < 50:
return CheckResult(present=None, source="llm")
# decision_method=LLM with judge='haiku': sufficiency path (validated at
# P0.89/R0.91). The Qwen-first cascade has been refuted as a sufficiency
# judge, so Haiku is called directly here, with criteria-guided subsumption.
if (ctrl.extra or {}).get("judge") == "haiku":
return await self._haiku(ctrl, text)
secs = _sections(text)
if ctrl.topic_regex:
rel = [s for s in secs if re.search(ctrl.topic_regex, s, re.I)][:6] or secs[:6]
@@ -71,3 +76,31 @@ class LLMChecker:
except Exception as e:
logger.info("llm checker fail %s: %s", ctrl.control_id, str(e)[:80])
return CheckResult(present=None, source="error")
async def _haiku(self, ctrl: ControlSpec, text: str) -> CheckResult:
"""Sufficiency via Haiku directly (the validated judge). Criteria-guided:
the legal elements are in ctrl.paraphrases; reuses the validated
deep_check sufficiency prompt."""
try:
from compliance.services.llm_cascade import _call_anthropic
from compliance.services.specialist_agents.dse.deep_check import (
_JUDGE_SYS, _build_user, _parse as _parse_judge,
)
crit = ctrl.paraphrases or [ctrl.label or ctrl.control_id]
user = _build_user(text, ctrl.label or ctrl.control_id, crit)
obj = None
for _ in range(2):
obj = _parse_judge(await _call_anthropic(_JUDGE_SYS, user, max_tokens=400))
if obj:
break
if not obj:
return CheckResult(present=None, source="haiku")
return CheckResult(
present=bool(obj.get("erfuellt")),
evidence=(obj.get("begruendung") or "")[:120],
confidence=float(obj.get("confidence") or 0.0),
source="haiku",
)
except Exception as e:
logger.info("llm haiku checker fail %s: %s", ctrl.control_id, str(e)[:80])
return CheckResult(present=None, source="error")
@@ -0,0 +1,68 @@
"""Checker router: method-agnostic dispatch.
control → sensor_classification (verification_method + decision_method) → checker.
A new module only supplies ControlSpecs; the router picks the checker. This turns
the "embedding finds, Claude decides" path into ONE shared CONTENT/LLM checker
instead of cookie-specific logic. Checkers that are not built yet
(PLAYWRIGHT/AUDIT/SCANNER/REGEX-FIELD) → present=None (fail-safe: the caller keeps its deterministic result).
"""
from __future__ import annotations
from typing import Any, Optional
from .base import CheckResult, ControlSpec, DecisionMethod, DocContext
from .embedding_checker import EmbeddingChecker
from .llm_checker import LLMChecker
from .reference_checker import ReferenceChecker
_LLM = LLMChecker()
_EMB = EmbeddingChecker()
_REF = ReferenceChecker()
# decision_method → checker. Missing mechanisms deliberately resolve to None (not built yet).
_BY_DECISION: dict[str, Any] = {
DecisionMethod.LLM: _LLM,
DecisionMethod.EMBEDDING: _EMB,
DecisionMethod.LINK_RESOLVER: _REF,
}
async def route_and_check(ctrl: ControlSpec, doc: DocContext) -> CheckResult:
checker = _BY_DECISION.get((ctrl.decision_method or "").upper())
if checker is None:
return CheckResult(present=None,
source=f"no_checker:{ctrl.decision_method}")
return await checker.check(ctrl, doc)
def build_spec(
control_id: str,
sensor_classification: Optional[dict[str, Any]],
*,
label: str = "",
criteria: Optional[list] = None,
question: str = "",
patterns: Optional[list[str]] = None,
embed_threshold: Optional[float] = None,
) -> ControlSpec:
"""Builds a ControlSpec from the STORED sensor_classification
(canonical_controls.generation_metadata.sensor_classification) plus the
control criteria. CONTENT/LLM → judge='haiku' (validated sufficiency
judge; the default for sufficiency per the decision of 2026-06-22)."""
sc = sensor_classification or {}
vm = (sc.get("verification_method") or "").upper()
dm = (sc.get("decision_method") or "").upper()
extra: dict[str, Any] = {}
if vm == "CONTENT" and dm == "LLM":
extra["judge"] = "haiku"
return ControlSpec(
control_id=control_id,
verification_method=vm,
decision_method=dm,
label=label,
paraphrases=[str(c) for c in (criteria or []) if c],
question=question,
patterns=patterns or [],
embed_threshold=embed_threshold,
extra=extra,
)
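The dispatch pattern in this file can be tried in isolation. The following is a minimal sketch, not the module itself: `Result`, `EchoChecker`, `BY_DECISION`, and `route` are hypothetical stand-ins for `CheckResult`, the real checkers, `_BY_DECISION`, and `route_and_check`, and the stand-in checker simply reports success.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    # Hypothetical stand-in for CheckResult: None means "not decidable".
    present: Optional[bool]
    source: str


class EchoChecker:
    # Hypothetical checker that always reports the criterion as present.
    async def check(self, decision_method: str) -> Result:
        return Result(present=True, source=decision_method.lower())


# Only built mechanisms are registered; every other method resolves to None.
BY_DECISION = {"LLM": EchoChecker(), "EMBEDDING": EchoChecker()}


async def route(decision_method: Optional[str]) -> Result:
    checker = BY_DECISION.get((decision_method or "").upper())
    if checker is None:
        # Fail-safe: the caller keeps its own deterministic result.
        return Result(present=None, source=f"no_checker:{decision_method}")
    return await checker.check(decision_method)


known = asyncio.run(route("llm"))
unknown = asyncio.run(route("PLAYWRIGHT"))
```

Returning `present=None` for an unregistered method, rather than raising, is what lets a caller treat "no checker yet" the same way as "checker could not decide".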
@@ -142,19 +142,26 @@ async def _call_ovh(system: str, user: str, max_tokens: int = 6000) -> str:
headers = {"Content-Type": "application/json"}
if key:
headers["Authorization"] = f"Bearer {key}"
+# gpt-oss-120b is a REASONING model: it spends output tokens on
+# chain-of-thought before emitting the answer. A low cap (e.g. deep_check's
+# max_tokens=400) makes it hit the length limit mid-reasoning and return
+# content=null; the whole tier then silently yields nothing. Floor the
+# budget so the reasoning AND the JSON answer fit.
payload = {
-"model": model, "temperature": 0.05, "max_tokens": max_tokens,
+"model": model, "temperature": 0.05, "max_tokens": max(max_tokens, 2000),
"messages": [{"role": "system", "content": system},
{"role": "user", "content": user}],
"response_format": {"type": "json_object"},
}
try:
-async with httpx.AsyncClient(timeout=45.0) as c:
+async with httpx.AsyncClient(timeout=90.0) as c:
r = await c.post(f"{base.rstrip('/')}/v1/chat/completions",
json=payload, headers=headers)
r.raise_for_status()
-choice = (r.json().get("choices") or [{}])[0]
-return (choice.get("message") or {}).get("content", "") or ""
+msg = (r.json().get("choices") or [{}])[0].get("message") or {}
+# Answer is normally in content; if the model was length-capped the
+# JSON can land in reasoning_content instead, so fall back to it.
+return (msg.get("content") or "") or (msg.get("reasoning_content") or "")
except Exception as e:
logger.warning("ovh cascade tier 2 failed: %s", e)
return ""
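The two changes in this hunk (the token floor and the `reasoning_content` fallback) can be checked without a network call. This is a sketch under stated assumptions: `floor_max_tokens` and `extract_answer` are hypothetical helper names, and the sample response dicts are invented to mimic the OpenAI-style shape that the hunk reads.

```python
from typing import Any


def floor_max_tokens(requested: int, floor: int = 2000) -> int:
    # Never send less than the floor, so reasoning plus the JSON answer fit.
    return max(requested, floor)


def extract_answer(response: dict[str, Any]) -> str:
    # Take the first choice's message; tolerate missing or null fields.
    msg = (response.get("choices") or [{}])[0].get("message") or {}
    # Prefer content; fall back to reasoning_content when content is empty.
    return (msg.get("content") or "") or (msg.get("reasoning_content") or "")


# Invented sample payloads (not captured from a real endpoint).
normal = {"choices": [{"message": {"content": '{"ok": true}'}}]}
capped = {"choices": [{"message": {"content": None,
                                   "reasoning_content": '{"ok": false}'}}]}
empty = {"choices": []}
```

With these helpers, a caller passing `max_tokens=400` still sends 2000, and a length-capped response with `content=null` yields the text from `reasoning_content` instead of an empty string.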
@@ -0,0 +1,78 @@
"""Applicability gate for the cookie-policy scan.
Excludes controls from the cookie findings scan that, according to
`compliance.control_classification`, do NOT run against a cookie policy
('COOKIE_POLICY' not in applicable_artifacts). They belong to a different
artifact/checker (banner: BEHAVIOR/Playwright; security/TOM/audit: PROCESS)
and would otherwise produce nonsense findings (e.g. 'TOMs not documented'
against a cookie policy). They are NOT deleted but returned as a
routing list.
Unlike the DSE gate, there is NO needs_review exception: the artifact signal is
decisive here and backed by the inventory (2026-06-21); the 11 mis-scoped
controls have been reviewed. Fail-safe: table missing / DB unreachable -> empty
dict -> NOTHING is filtered (no silent recall loss).
"""
"""
from __future__ import annotations
import logging
import os
from typing import Any
logger = logging.getLogger(__name__)
async def load_cookie_gate(db_url: str = "") -> dict[str, dict[str, Any]]:
"""Returns {control_id: meta} for controls that are to be excluded from the
cookie findings scan (no COOKIE_POLICY artifact). Empty dict =
no filter."""
dsn = (db_url or os.getenv("DATABASE_URL")
or os.getenv("COMPLIANCE_DATABASE_URL") or "")
if not dsn:
return {}
try:
import asyncpg
conn = await asyncpg.connect(dsn)
try:
rows = await conn.fetch(
"""SELECT control_id, obligation_type, check_intent,
applicable_artifacts
FROM compliance.control_classification
WHERE is_active
AND NOT ('COOKIE_POLICY' = ANY(applicable_artifacts))""")
finally:
await conn.close()
except Exception as e: # table missing / DB down -> no filter
logger.info("cookie classification gate inaktiv: %s", str(e)[:90])
return {}
return {
r["control_id"]: {
"obligation_type": r["obligation_type"],
"check_intent": r["check_intent"],
"applicable_artifacts": list(r["applicable_artifacts"] or []),
}
for r in rows if r["control_id"]
}
def apply_gate(
controls: list[dict[str, Any]],
gate: dict[str, dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
"""Splits the loaded controls into (kept, routed_out).
kept: run through the cookie scan as usual.
routed_out: removed from the scan (control_id + title + classification
metadata for routing to banner/security/audit).
"""
kept: list[dict[str, Any]] = []
routed_out: list[dict[str, Any]] = []
for c in controls:
cid = c.get("control_id")
meta = gate.get(cid) if cid else None
if meta:
routed_out.append({"control_id": cid, "title": c.get("title"), **meta})
else:
kept.append(c)
return kept, routed_out
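A small usage sketch may help show what the gate returns. `apply_gate` is restated here so the block runs on its own; the control IDs, titles, and gate metadata are invented sample data, not rows from `compliance.control_classification`.

```python
from typing import Any


def apply_gate(
    controls: list[dict[str, Any]],
    gate: dict[str, dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    # Gated controls are routed out with their metadata, never deleted.
    kept: list[dict[str, Any]] = []
    routed_out: list[dict[str, Any]] = []
    for c in controls:
        cid = c.get("control_id")
        meta = gate.get(cid) if cid else None
        if meta:
            routed_out.append({"control_id": cid, "title": c.get("title"), **meta})
        else:
            kept.append(c)
    return kept, routed_out


# Hypothetical sample data for illustration only.
controls = [
    {"control_id": "COOKIE-1", "title": "Cookie lifetime stated"},
    {"control_id": "SEC-9", "title": "TOMs documented"},
]
gate = {"SEC-9": {"obligation_type": "PROCESS", "check_intent": "AUDIT",
                  "applicable_artifacts": ["TOM"]}}

kept, routed_out = apply_gate(controls, gate)
# Fail-safe case: an empty gate (table missing, DB down) filters nothing.
kept_all, routed_none = apply_gate(controls, {})
```

The routed-out entry carries the classification metadata alongside `control_id` and `title`, so a downstream banner or audit checker has what it needs to pick the control up.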
@@ -0,0 +1,63 @@
"""Layer-3 sufficiency judge for cookie policies.
The embedding/boost auto-rescue (layer 0/2) is DELIBERATELY optimistic: it finds
the topic but does not prove fulfilment. Measurement (2026-06-22): 159 FN
(over-rescue) against the Opus GT, because 'topic mentioned' was waved through as
'fulfilled'. This layer checks EXACTLY the rescued controls with the validated
Haiku judge (cohort cookie_sufficiency_v1: P0.89/R0.91), NOT the Qwen-first
cascade (the local model has been refuted as a sufficiency judge), and revokes
'passed' when the concrete obligation is not fulfilled. 'Embedding finds, Claude decides.'
Only for the NON-skip_llm path (full check); the fast/interactive path
keeps the deterministic rescue.
"""
from __future__ import annotations
import logging
from typing import Any
logger = logging.getLogger(__name__)
_RESCUE_MARKERS = ("+embedding", "+regex_boost")
def _is_rescued(r: dict[str, Any]) -> bool:
src = r.get("source") or ""
return bool(r.get("passed")) and any(m in src for m in _RESCUE_MARKERS)
async def judge_rescued(text: str, results: list[dict[str, Any]]) -> int:
"""Checks all rescued (embedding/boost) passed controls with Haiku.
Revokes passed when the judge finds the obligation NOT fulfilled.
Returns the number of revoked (corrected) rescues.
"""
# Via the shared checker router (no cookie special case any more):
# CONTENT/LLM → build_spec sets judge='haiku' → LLMChecker (validated
# sufficiency judge). This makes cookie the first real router consumer.
from compliance.services.checkers.base import DocContext
from compliance.services.checkers.router import build_spec, route_and_check
candidates = [r for r in results if _is_rescued(r)]
if not candidates:
return 0
doc = DocContext(text=text)
sc = {"verification_method": "CONTENT", "decision_method": "LLM"}
corrected = 0
for r in candidates:
crit = r.get("_pass_criteria") or [r.get("label") or r.get("hint") or ""]
if not isinstance(crit, list):
crit = [str(crit)]
label = r.get("label") or r.get("hint") or r.get("control_id") or ""
spec = build_spec(r.get("control_id") or "", sc, label=label, criteria=crit)
res = await route_and_check(spec, doc)
if res.present is False:
r["passed"] = False
r["source"] = (r.get("source") or "") + "+llm_failed"
r["matched_text"] = "[layer-3 sufficiency-judge: nicht erfuellt]"
r["_judge_reason"] = (res.evidence or "")[:200]
corrected += 1
if corrected:
logger.info("cookie layer-3 sufficiency-judge: %d/%d rescues zurueckgenommen",
corrected, len(candidates))
return corrected
@@ -96,6 +96,22 @@ class CookiePolicyAgent(BaseSpecialistAgent):
"Branchen-MCs entfernt"
)
# Layer 3: sufficiency judge (Haiku) on the embedding/boost-rescued
# controls. Embedding finds the topic, Claude decides whether the obligation
# is concretely fulfilled. Only in the full check (not skip_llm).
skip_llm = bool((agent_input.context or {}).get("skip_llm"))
if not skip_llm:
try:
from ._sufficiency_judge import judge_rescued
corrected = await judge_rescued(text, results)
if corrected:
notes_parts.append(
f"layer-3 sufficiency-judge: {corrected} Rescues "
"zurückgenommen"
)
except Exception as e:
logger.warning("cookie layer-3 judge skipped: %s", e)
seen: set[str] = set()
for r in results:
mc_id = r.get("control_id") or ""
@@ -45,6 +45,15 @@ async def run_v3_pipeline(
controls = []
_normalize_criteria(controls)
controls, sector_dropped = _filter_sector(controls, business_scope)
# Artifact gate: drop controls without a COOKIE_POLICY artifact (security/TOM/audit,
# banner). They belong to a different checker/artifact and would otherwise
# produce nonsense findings. See _classification_gate.
routed_out: list[dict[str, Any]] = []
try:
from ._classification_gate import apply_gate, load_cookie_gate
controls, routed_out = apply_gate(controls, await load_cookie_gate(db_url))
except Exception as e:
logger.warning("cookie classification gate skipped: %s", e)
results: list[dict[str, Any]] = []
if controls:
try:
@@ -111,6 +120,7 @@ async def run_v3_pipeline(
"layer_0_boost_overrides": boost_overrides,
"total_mcs": len(results),
"sector_dropped": sector_dropped,
"artifact_gated": len(routed_out),
}
return results, telemetry
@@ -0,0 +1,183 @@
"""Tiered 3-status evaluation for DSE controls with `tiered_criteria`.
Each criterion is evaluated according to its `decision_method`:
- EMBEDDING (presence): deterministic (fixed model), doc embedded ONCE per
scan → reproducible, no LLM. Carries the BULK of the work.
- LLM (sufficiency): Haiku judge, CACHED per (doc_hash, control_id#idx,
PROMPT_VERSION, criterion) → same scan = same result. Addresses the
empirically measured judge variance (a live call is NOT reproducible).
Status is derived ONLY from LEGAL_MINIMUM:
ERFÜLLT (all LM met OR no LM) · FEHLT (no LM met) ·
TEILWEISE (some LM met) · UNBESTIMMT (LM not assessable, e.g.
embedding service down → the caller keeps its legacy result).
BEST_PRACTICE/OPTIONAL NEVER feed into the status, only into `recommendations`.
See docs-src/development/criterion_meta_model.md.
"""
from __future__ import annotations
import asyncio
import hashlib
import logging
import os
import sqlite3
from typing import Any, Optional
logger = logging.getLogger(__name__)
PROMPT_VERSION = "dse-tier-v1"
_CACHE_DB = os.getenv("TIERED_JUDGE_CACHE", "/data/tiered_judge_cache.db")
_EMBED_THR = float(os.getenv("DSE_CRITERION_EMBED_THRESHOLD", "0.62"))
LM = "LEGAL_MINIMUM"
def _doc_hash(text: str) -> str:
return hashlib.sha256(text.encode("utf-8", "ignore")).hexdigest()[:20]
def _ckey(dh: str, cid: str, idx: int, crit: str) -> str:
ch = hashlib.sha256(crit.encode("utf-8", "ignore")).hexdigest()[:12]
return f"{dh}|{cid}#{idx}|{PROMPT_VERSION}|{ch}"
def _cache_get(key: str) -> Optional[bool]:
try:
with sqlite3.connect(_CACHE_DB) as c:
c.execute("create table if not exists judge(k text primary key, met int)")
row = c.execute("select met from judge where k=?", (key,)).fetchone()
return None if row is None else bool(row[0])
except Exception:
return None
def _cache_put(key: str, met: bool) -> None:
try:
with sqlite3.connect(_CACHE_DB) as c:
c.execute("create table if not exists judge(k text primary key, met int)")
c.execute("insert or replace into judge values(?,?)", (key, int(met)))
except Exception as e:
logger.warning("tiered judge cache put: %s", e)
async def prepare_doc(text: str) -> dict[str, Any]:
"""Embed the doc ONCE per scan. Returns {hash, chunk_vecs}. On embedding
failure: chunk_vecs=None → EMBEDDING criteria become UNBESTIMMT (fallback)."""
ctx: dict[str, Any] = {"hash": _doc_hash(text or ""), "chunk_vecs": None}
if not text or len(text) < 100:
return ctx
try:
from compliance.services.mc_embedding_matcher import DIM, _chunk_text, _embed_texts
vecs = await asyncio.wait_for(_embed_texts(_chunk_text(text)), timeout=90.0)
ctx["chunk_vecs"] = [v for v in vecs if v and len(v) == DIM]
except Exception as e: # also covers asyncio.TimeoutError from wait_for
logger.warning("tiered prepare_doc embedding inaktiv: %s", e)
return ctx
async def _embed_present(crits: list[str], ctx: dict, thr: float) -> dict[str, Optional[bool]]:
cvecs = ctx.get("chunk_vecs")
if not cvecs:
return {c: None for c in crits}
try:
from compliance.services.mc_embedding_matcher import DIM, _cosine, _embed_texts
pv = await _embed_texts(crits)
out: dict[str, Optional[bool]] = {}
for crit, v in zip(crits, pv):
if not v or len(v) != DIM:
out[crit] = None
else:
out[crit] = max((_cosine(v, cv) for cv in cvecs), default=0.0) >= thr
return out
except Exception as e:
logger.warning("tiered embed present: %s", e)
return {c: None for c in crits}
async def _llm_met(cid: str, idx: int, crit: str, doc, dh: str) -> Optional[bool]:
key = _ckey(dh, cid, idx, crit)
cached = _cache_get(key)
if cached is not None:
return cached
from compliance.services.checkers.router import build_spec, route_and_check
spec = build_spec(cid, {"verification_method": "CONTENT", "decision_method": "LLM"},
label=crit, criteria=[crit])
res = await route_and_check(spec, doc)
if res.present is None:
return None
_cache_put(key, bool(res.present))
return bool(res.present)
def _status(lm_vals: list[Optional[bool]]) -> str:
if not lm_vals:
return "ERFÜLLT" # no legal minimum → never red
if any(m is None for m in lm_vals):
return "UNBESTIMMT" # caller keeps its legacy result
n = sum(1 for m in lm_vals if m)
if n == len(lm_vals):
return "ERFÜLLT"
return "FEHLT" if n == 0 else "TEILWEISE"
async def evaluate_tiered(control_id: str, tiered_criteria: list[dict],
ctx: dict, doc) -> dict[str, Any]:
dh = ctx.get("hash") or _doc_hash(getattr(doc, "text", "") or "")
emb_texts = [c["criterion"] for c in (tiered_criteria or [])
if c.get("criterion")
and (c.get("decision_method") or "EMBEDDING").upper() != "LLM"]
emb_res = await _embed_present(emb_texts, ctx, _EMBED_THR) if emb_texts else {}
lm_vals: list[Optional[bool]] = []
recs: list[dict] = []
detail: list[dict] = []
for idx, c in enumerate(tiered_criteria or []):
crit = c.get("criterion") or ""
if not crit:
continue
tier = (c.get("compliance_tier") or "").upper()
if (c.get("decision_method") or "EMBEDDING").upper() == "LLM":
met = await _llm_met(control_id, idx, crit, doc, dh)
src = "haiku-cache"
else:
met = emb_res.get(crit)
src = "embedding"
detail.append({"criterion": crit, "tier": tier, "met": met, "source": src})
if tier == LM:
lm_vals.append(met)
elif met is False:
recs.append({"criterion": crit, "tier": tier or "OPTIONAL",
"legal_basis": c.get("legal_basis")})
return {"status": _status(lm_vals), "lm_met": sum(1 for m in lm_vals if m),
"lm_total": len(lm_vals), "recommendations": recs, "detail": detail}
async def fetch_tiered_criteria(cids: list[str], db_url: str = "") -> dict[str, list]:
"""Load the tiered_criteria of the given controls from canonical_controls.
Empty dict on error or when no DB is configured (fallback: no tiering, the legacy result stands)."""
cids = [c for c in cids if c]
if not cids:
return {}
import json
dsn = db_url or os.getenv("DATABASE_URL") or os.getenv("COMPLIANCE_DATABASE_URL")
if not dsn:
return {}
try:
import asyncpg
conn = await asyncpg.connect(dsn)
try:
rows = await conn.fetch(
"select control_id, generation_metadata->'tiered_criteria' tc "
"from compliance.canonical_controls "
"where control_id = any($1::text[]) "
"and generation_metadata ? 'tiered_criteria'", cids)
finally:
# Close the connection even when the query fails.
await conn.close()
except Exception as e:
logger.warning("fetch_tiered_criteria failed: %s", e)
return {}
out: dict[str, list] = {}
for r in rows:
tc = r["tc"]
tc = json.loads(tc) if isinstance(tc, str) else tc
if tc:
out[r["control_id"]] = tc
return out
@@ -129,11 +129,41 @@ async def run_v3_pipeline(
r["source"] = (r.get("source") or "") + "+embedding"
embedding_passes += 1
    # Layer 3: tiered 3-status evaluation (only controls with tiered_criteria).
    # Reproducible: EMBEDDING presence (deterministic) + CACHED Haiku judge
    # for sufficiency only. UNBESTIMMT → legacy pass stays. Gated + fail-safe.
    tiered_evaluated = 0
    try:
        from compliance.services.checkers.base import DocContext
        from ._tiered_eval import (
            evaluate_tiered, fetch_tiered_criteria, prepare_doc,
        )
        result_cids = [r.get("control_id") for r in results if r.get("control_id")]
        tiered_map = await fetch_tiered_criteria(result_cids, db_url)
        if tiered_map:
            ctx = await prepare_doc(text)
            doc_ctx = DocContext(text=text)
            for r in results:
                tc = tiered_map.get(r.get("control_id"))
                if not tc:
                    continue
                ev = await evaluate_tiered(r["control_id"], tc, ctx, doc_ctx)
                if ev["status"] == "UNBESTIMMT":
                    continue
                r["compliance_status"] = ev["status"]
                r["recommendations"] = ev["recommendations"]
                r["tier_lm"] = f"{ev['lm_met']}/{ev['lm_total']}"
                r["passed"] = ev["status"] == "ERFÜLLT"
                tiered_evaluated += 1
    except Exception as e:
        logger.warning("dse tiered eval skipped: %s", e)
    telemetry = {
        "layer_0_field_hits": len(boost_field_ids),
        "layer_0_field_ids": boost_field_ids,
        "layer_1_pass": layer_1_pass,
        "embedding_passes": embedding_passes,
        "tiered_evaluated": tiered_evaluated,
        "total_mcs": len(results),
        "sector_dropped": drop_stats.get("sector_dropped", 0),
        "offtopic_dropped": drop_stats.get("offtopic_dropped", 0),
@@ -0,0 +1,51 @@
"""Checker router: build_spec from sensor_classification + method-agnostic
dispatch. CONTENT/LLM -> Haiku sufficiency tier (validated), unknown
decision_methods -> fail-safe present=None."""
import pytest
from unittest.mock import AsyncMock, patch

from compliance.services.checkers.base import DocContext
from compliance.services.checkers.router import build_spec, route_and_check

_ANTHROPIC = "compliance.services.llm_cascade._call_anthropic"


def test_build_spec_content_llm_uses_haiku():
    s = build_spec("X", {"verification_method": "CONTENT", "decision_method": "LLM"},
                   label="L", criteria=["a", "b"])
    assert s.verification_method == "CONTENT" and s.decision_method == "LLM"
    assert s.extra.get("judge") == "haiku"
    assert s.paraphrases == ["a", "b"]


def test_build_spec_embedding_no_haiku():
    s = build_spec("X", {"verification_method": "CONTENT", "decision_method": "EMBEDDING"})
    assert s.extra.get("judge") is None


@pytest.mark.asyncio
async def test_route_unknown_decision_is_failsafe():
    s = build_spec("X", {"verification_method": "BEHAVIOR", "decision_method": "PLAYWRIGHT"})
    r = await route_and_check(s, DocContext(text="x" * 200))
    assert r.present is None and "no_checker" in r.source


@pytest.mark.asyncio
async def test_route_content_llm_haiku_fehlt():
    s = build_spec("X", {"verification_method": "CONTENT", "decision_method": "LLM"},
                   label="Speicherdauer", criteria=["Höchstdauer pro Kategorie"])
    fake = AsyncMock(return_value='{"erfuellt": false, "confidence": 0.9, "begruendung": "fehlt"}')
    with patch(_ANTHROPIC, new=fake):
        r = await route_and_check(s, DocContext(text="Wir nutzen Cookies. " * 30))
    assert r.present is False and r.source == "haiku"
    assert fake.call_count >= 1


@pytest.mark.asyncio
async def test_route_content_llm_haiku_erfuellt():
    s = build_spec("X", {"verification_method": "CONTENT", "decision_method": "LLM"},
                   label="L", criteria=["x"])
    fake = AsyncMock(return_value='{"erfuellt": true, "confidence": 0.8}')
    with patch(_ANTHROPIC, new=fake):
        r = await route_and_check(s, DocContext(text="text " * 40))
    assert r.present is True
@@ -0,0 +1,42 @@
"""Tests for the cookie-policy applicability gate: controls without a
COOKIE_POLICY artifact are routed out of the findings scan (not deleted),
and the gate is fail-safe (no DSN -> no filter)."""
import pytest

from compliance.services.specialist_agents.cookie_policy._classification_gate import (
    apply_gate, load_cookie_gate,
)


def test_apply_gate_splits_kept_and_routed():
    controls = [
        {"control_id": "COOK-1", "title": "Kategorien"},
        {"control_id": "TOM-1", "title": "Verschlüsselung"},
        {"control_id": "BAN-1", "title": "Consent vor Setzen"},
    ]
    gate = {
        "TOM-1": {"obligation_type": "TECHNICAL", "check_intent": "DIRECT_TECHNICAL",
                  "applicable_artifacts": ["TOM", "AUDIT"]},
        "BAN-1": {"obligation_type": "TECHNICAL", "check_intent": "DIRECT_TECHNICAL",
                  "applicable_artifacts": ["COOKIE_BANNER", "SYSTEMSCAN"]},
    }
    kept, routed = apply_gate(controls, gate)
    assert [c["control_id"] for c in kept] == ["COOK-1"]
    assert {c["control_id"] for c in routed} == {"TOM-1", "BAN-1"}
    # routed entries carry title + classification metadata for downstream routing
    tom = next(c for c in routed if c["control_id"] == "TOM-1")
    assert tom["title"] == "Verschlüsselung"
    assert tom["applicable_artifacts"] == ["TOM", "AUDIT"]


def test_apply_gate_empty_gate_keeps_all():
    controls = [{"control_id": "A"}, {"control_id": "B"}]
    kept, routed = apply_gate(controls, {})
    assert len(kept) == 2 and routed == []


@pytest.mark.asyncio
async def test_load_cookie_gate_no_dsn_is_failsafe(monkeypatch):
    monkeypatch.delenv("DATABASE_URL", raising=False)
    monkeypatch.delenv("COMPLIANCE_DATABASE_URL", raising=False)
    assert await load_cookie_gate("") == {}
@@ -0,0 +1,68 @@
"""Layer-3 cookie sufficiency-judge: only embedding/boost-RESCUED passes are
re-judged by Haiku; keyword passes are untouched; a FEHLT verdict un-passes."""
import pytest
from unittest.mock import AsyncMock, patch

from compliance.services.specialist_agents.cookie_policy._sufficiency_judge import (
    judge_rescued,
)

_ANTHROPIC = "compliance.services.llm_cascade._call_anthropic"
_DOC = "Volltext der Cookie-Richtlinie mit ausreichend Inhalt. " * 4


def _r(cid, source, passed=True):
    return {"control_id": cid, "source": source, "passed": passed,
            "label": cid, "_pass_criteria": ["konkrete Angabe nötig"]}


@pytest.mark.asyncio
async def test_rescued_unpassed_when_judge_fehlt():
    results = [_r("A", "keyword+embedding")]
    fake = AsyncMock(return_value='{"erfuellt": false, "confidence": 0.9, "begruendung": "fehlt"}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 1
    assert results[0]["passed"] is False
    assert "+llm_failed" in results[0]["source"]


@pytest.mark.asyncio
async def test_rescued_kept_when_judge_erfuellt():
    results = [_r("A", "keyword+embedding")]
    fake = AsyncMock(return_value='{"erfuellt": true, "confidence": 0.9}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 0
    assert results[0]["passed"] is True


@pytest.mark.asyncio
async def test_keyword_pass_not_judged():
    """Controls that passed deterministically (keyword) are NOT re-judged."""
    results = [_r("A", "keyword")]
    fake = AsyncMock(return_value='{"erfuellt": false}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 0
    assert results[0]["passed"] is True
    assert fake.call_count == 0


@pytest.mark.asyncio
async def test_boost_rescue_is_judged():
    results = [_r("A", "keyword+regex_boost")]
    fake = AsyncMock(return_value='{"erfuellt": false}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 1 and results[0]["passed"] is False


@pytest.mark.asyncio
async def test_failed_controls_ignored():
    """Failed controls are not this layer's concern."""
    results = [_r("A", "keyword+embedding", passed=False)]
    fake = AsyncMock(return_value='{"erfuellt": false}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 0 and fake.call_count == 0
@@ -0,0 +1,77 @@
"""Regression tests for the OVH (gpt-oss-120b) tier of the LLM cascade.

gpt-oss-120b is a reasoning model: it spends output tokens on chain-of-thought
before the answer. Two bugs this pins:

1. A small max_tokens (deep_check passed 400) length-caps it mid-reasoning →
   content=null → the tier silently returns nothing. _call_ovh must floor the
   budget so reasoning + the JSON answer fit.
2. When length-capped, the JSON can land in reasoning_content, not content →
   _call_ovh must fall back to reasoning_content.
"""
import pytest
from unittest.mock import AsyncMock, MagicMock, patch

from compliance.services import llm_cascade


def _resp(data):
    r = MagicMock()
    r.raise_for_status = MagicMock()
    r.json = MagicMock(return_value=data)
    return r


def _client(resp):
    inst = AsyncMock()
    inst.post.return_value = resp
    inst.__aenter__ = AsyncMock(return_value=inst)
    inst.__aexit__ = AsyncMock(return_value=False)
    return inst


class TestCallOvhReasoning:
    @pytest.mark.asyncio
    async def test_reasoning_content_used_when_content_null(self, monkeypatch):
        monkeypatch.setenv("OVH_LLM_URL", "https://llm.example.com")
        monkeypatch.setenv("OVH_LLM_MODEL", "gpt-oss-120b")
        monkeypatch.setenv("OVH_LLM_KEY", "k")
        resp = _resp({"choices": [{"message": {
            "content": None,
            "reasoning_content": '{"erfuellt": true, "confidence": 0.9}'}}]})
        with patch("httpx.AsyncClient", return_value=_client(resp)):
            out = await llm_cascade._call_ovh("sys", "user", max_tokens=400)
        assert '"erfuellt": true' in out

    @pytest.mark.asyncio
    async def test_small_budget_is_floored(self, monkeypatch):
        monkeypatch.setenv("OVH_LLM_URL", "https://llm.example.com")
        monkeypatch.setenv("OVH_LLM_MODEL", "gpt-oss-120b")
        inst = _client(_resp({"choices": [{"message": {"content": "{}"}}]}))
        with patch("httpx.AsyncClient", return_value=inst):
            await llm_cascade._call_ovh("sys", "user", max_tokens=400)
        assert inst.post.call_args.kwargs["json"]["max_tokens"] >= 2000

    @pytest.mark.asyncio
    async def test_large_budget_is_preserved(self, monkeypatch):
        monkeypatch.setenv("OVH_LLM_URL", "https://llm.example.com")
        monkeypatch.setenv("OVH_LLM_MODEL", "gpt-oss-120b")
        inst = _client(_resp({"choices": [{"message": {"content": "{}"}}]}))
        with patch("httpx.AsyncClient", return_value=inst):
            await llm_cascade._call_ovh("sys", "user", max_tokens=6000)
        assert inst.post.call_args.kwargs["json"]["max_tokens"] == 6000

    @pytest.mark.asyncio
    async def test_content_preferred_when_present(self, monkeypatch):
        monkeypatch.setenv("OVH_LLM_URL", "https://llm.example.com")
        monkeypatch.setenv("OVH_LLM_MODEL", "gpt-oss-120b")
        resp = _resp({"choices": [{"message": {
            "content": '{"erfuellt": false}', "reasoning_content": "noise"}}]})
        with patch("httpx.AsyncClient", return_value=_client(resp)):
            out = await llm_cascade._call_ovh("sys", "user")
        assert out == '{"erfuellt": false}'

    @pytest.mark.asyncio
    async def test_unconfigured_returns_empty(self, monkeypatch):
        monkeypatch.delenv("OVH_LLM_URL", raising=False)
        monkeypatch.delenv("OVH_LLM_MODEL", raising=False)
        assert await llm_cascade._call_ovh("sys", "user") == ""
@@ -0,0 +1,102 @@
"""Unit tests for the tiered 3-status evaluation (_tiered_eval).

Covers: status logic (incl. no LM → ERFÜLLT, UNBESTIMMT when not evaluable),
recommendation collection, EMBEDDING/LLM routing (mocked), and the
reproducibility cache. Embedding/LLM are mocked; no network."""
import asyncio

from compliance.services.specialist_agents.dse import _tiered_eval as te


# ---- pure status logic --------------------------------------------------
def test_status_no_lm_is_erfuellt():
    assert te._status([]) == "ERFÜLLT"


def test_status_all_met_erfuellt():
    assert te._status([True, True]) == "ERFÜLLT"


def test_status_none_met_fehlt():
    assert te._status([False, False]) == "FEHLT"


def test_status_partial_teilweise():
    assert te._status([True, False]) == "TEILWEISE"


def test_status_any_none_unbestimmt():
    assert te._status([True, None]) == "UNBESTIMMT"


# ---- evaluate_tiered (embedding/LLM mocked) -----------------------------
def _crit(text, tier, dm="EMBEDDING"):
    return {"criterion": text, "compliance_tier": tier,
            "decision_method": dm, "legal_basis": "x"}


class _Doc:
    def __init__(self, text):
        self.text = text


def test_evaluate_partial_with_recommendation(monkeypatch):
    crits = [_crit("Zwecke genannt", "LEGAL_MINIMUM"),
             _crit("Speicherdauer genannt", "LEGAL_MINIMUM"),
             _crit("tabellarisch ausgewiesen", "BEST_PRACTICE")]

    async def fake_embed(texts, ctx, thr):
        return {"Zwecke genannt": True, "Speicherdauer genannt": False,
                "tabellarisch ausgewiesen": False}

    monkeypatch.setattr(te, "_embed_present", fake_embed)
    out = asyncio.run(te.evaluate_tiered("C1", crits, {"hash": "h"}, _Doc("x" * 200)))
    assert out["status"] == "TEILWEISE"
    assert out["lm_met"] == 1 and out["lm_total"] == 2
    assert len(out["recommendations"]) == 1
    assert out["recommendations"][0]["tier"] == "BEST_PRACTICE"


def test_evaluate_no_lm_is_erfuellt_with_recs(monkeypatch):
    crits = [_crit("Bildsymbole", "OPTIONAL"), _crit("Legende", "OPTIONAL")]

    async def fake_embed(texts, ctx, thr):
        return {t: False for t in texts}

    monkeypatch.setattr(te, "_embed_present", fake_embed)
    out = asyncio.run(te.evaluate_tiered("C2", crits, {"hash": "h"}, _Doc("x" * 200)))
    assert out["status"] == "ERFÜLLT"
    assert out["lm_total"] == 0
    assert len(out["recommendations"]) == 2


def test_evaluate_llm_criterion_routed(monkeypatch):
    crits = [_crit("Speicherdauer hinreichend nachvollziehbar", "LEGAL_MINIMUM", dm="LLM")]

    async def fake_llm(cid, idx, crit, doc, dh):
        return True

    monkeypatch.setattr(te, "_llm_met", fake_llm)
    out = asyncio.run(te.evaluate_tiered("C3", crits, {"hash": "h"}, _Doc("x" * 200)))
    assert out["status"] == "ERFÜLLT" and out["lm_total"] == 1


def test_evaluate_unbestimmt_when_embed_unavailable(monkeypatch):
    crits = [_crit("Zwecke genannt", "LEGAL_MINIMUM")]

    async def fake_embed(texts, ctx, thr):
        return {t: None for t in texts}  # embedding service down

    monkeypatch.setattr(te, "_embed_present", fake_embed)
    out = asyncio.run(te.evaluate_tiered("C4", crits, {"hash": "h"}, _Doc("x" * 200)))
    assert out["status"] == "UNBESTIMMT"


# ---- reproducibility cache ----------------------------------------------
def test_cache_roundtrip(monkeypatch, tmp_path):
    monkeypatch.setattr(te, "_CACHE_DB", str(tmp_path / "cache.db"))
    assert te._cache_get("k1") is None
    te._cache_put("k1", True)
    te._cache_put("k2", False)
    assert te._cache_get("k1") is True
    assert te._cache_get("k2") is False
@@ -0,0 +1,155 @@
# Criteria Meta-Model & Compliance-Tier Architecture

> **Status: FROZEN 2026-06-22.** Changes to this model are architecture
> decisions and require deliberate approval (DB owner / product owner).
> Related: [`platform_checker_matrix.md`](platform_checker_matrix.md),
> [`verification_method.md`](verification_method.md), [`platform_validation_v1.md`](platform_validation_v1.md).

## 1. Motivation

Calibrating the four website-compliance modules revealed four **different**
dominant error causes:

| Module | Dominant lever |
|--------|----------------|
| Cookie policy | Sufficiency (judge) |
| Impressum (legal notice) | Scope / routing |
| AGB (terms and conditions) | Decision method / routing |
| DSE (privacy policy) | **Overloaded controls + mixing of "legal minimum vs. best practice"** |

The DSE investigation (adjudication of 13 judge↔GT disagreements) found that
**85 % of the residual errors are catalog defects and 15 % are checker errors.**
The largest single defect: a control bundles several requirements of **differing
bindingness** and is rated ERFÜLLT (met) only if *all* of them are met. As a
result, legally compliant documents are reported as "FEHLT" (missing) because a
best-practice recommendation is absent.

This model fixes that **in the catalog**, without changing the checker and
without physically splitting controls.

## 2. Data model

A control stays **stable** (UUID, citations, GT history, calibration,
statistics). Its `pass_criteria` change from a list of strings into **atomic,
typed criterion objects**:

```
Control  (stable control_uuid; do NOT split)
 └─ criteria: Criterion[]

Criterion
 ├─ criterion            (text of the individual requirement)
 ├─ legal_basis          (e.g. "Art. 13(1)(c) DSGVO")
 ├─ verification_method  (axis 1: WHAT is checked)
 ├─ decision_method      (axis 2: HOW the decision is made)
 ├─ compliance_tier      (axis 3: HOW BINDING)
 └─ weight               (reserved for maturity, see §6; NOT gating today)
```

**Storage location:** `canonical_controls.generation_metadata->'tiered_criteria'` (jsonb).
**No schema change.** No physical control split (variant A was rejected: new
UUIDs → loss of benchmarks/calibration/citations/GT, i.e. a migration project).
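
A minimal sketch of how one `tiered_criteria` entry might be represented and validated in Python. The field names and the allowed axis values are taken from this document; the `Criterion` dataclass, its `from_dict` helper, the `CONTENT` default, and the example values are hypothetical and not part of the codebase.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values as listed in section 3 of this document.
VERIFICATION_METHODS = {"CONTENT", "FIELD", "REFERENCE", "BEHAVIOR",
                        "PRESENTATION", "PROCESS", "TECHNICAL", "CONTRACTUAL"}
DECISION_METHODS = {"REGEX", "EMBEDDING", "LLM", "LINK_RESOLVER",
                    "PLAYWRIGHT", "AUDIT", "SCANNER"}
COMPLIANCE_TIERS = {"LEGAL_MINIMUM", "BEST_PRACTICE", "OPTIONAL"}


@dataclass(frozen=True)
class Criterion:
    """Hypothetical typed view of one entry in tiered_criteria."""
    criterion: str
    compliance_tier: str
    verification_method: str = "CONTENT"   # assumed default
    decision_method: str = "EMBEDDING"     # same fallback evaluate_tiered uses
    legal_basis: Optional[str] = None
    weight: Optional[float] = None         # stored, not used for gating

    @classmethod
    def from_dict(cls, raw: dict) -> "Criterion":
        c = cls(
            criterion=(raw.get("criterion") or "").strip(),
            compliance_tier=(raw.get("compliance_tier") or "").upper(),
            verification_method=(raw.get("verification_method") or "CONTENT").upper(),
            decision_method=(raw.get("decision_method") or "EMBEDDING").upper(),
            legal_basis=raw.get("legal_basis"),
            weight=raw.get("weight"),
        )
        if not c.criterion:
            raise ValueError("criterion text is required")
        if c.compliance_tier not in COMPLIANCE_TIERS:
            raise ValueError(f"unknown compliance_tier: {c.compliance_tier!r}")
        if c.verification_method not in VERIFICATION_METHODS:
            raise ValueError(f"unknown verification_method: {c.verification_method!r}")
        if c.decision_method not in DECISION_METHODS:
            raise ValueError(f"unknown decision_method: {c.decision_method!r}")
        return c


# Illustrative entry; the values are made up for this example.
example = Criterion.from_dict({
    "criterion": "Zwecke genannt",
    "legal_basis": "Art. 13(1)(c) DSGVO",
    "verification_method": "CONTENT",
    "decision_method": "EMBEDDING",
    "compliance_tier": "LEGAL_MINIMUM",
    "weight": 1.0,
})
```

Validating at load time would surface typos such as an unknown tier before they silently drop a criterion out of the status calculation.
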
## 3. The three axes

Each criterion carries three **independent** classifications:

1. **`verification_method`**: artifact-dependent. CONTENT · FIELD · REFERENCE ·
   BEHAVIOR · PRESENTATION · PROCESS · TECHNICAL · CONTRACTUAL. See
   [`verification_method.md`](verification_method.md).
2. **`decision_method`**: which checker. REGEX · EMBEDDING · LLM · LINK_RESOLVER ·
   PLAYWRIGHT · AUDIT · SCANNER. See [`platform_checker_matrix.md`](platform_checker_matrix.md).
3. **`compliance_tier`** *(new, this document)*: bindingness.
   - **`LEGAL_MINIMUM`**: legally required. Affects the compliance status.
   - **`BEST_PRACTICE`**: advisable, not legally required. Shown as a
     recommendation. **Never** affects the status.
   - **`OPTIONAL`**: convenience / level of detail. Recommendation. **Never**
     affects the status.

Axes 1 and 2 are primarily set **per criterion** (atomic); a control may mix
criteria with different methods.

## 4. Status computation (3 states): gating ONLY on LEGAL_MINIMUM

Let `LM` be the set of `LEGAL_MINIMUM` criteria of a control and `met(LM)` the
number of those that are met:

```
ERFÜLLT   := |LM| > 0 and met(LM) == |LM|   (all mandatory criteria met)
TEILWEISE := 0 < met(LM) < |LM|             (at least one met, at least one missing)
FEHLT     := |LM| > 0 and met(LM) == 0      (no mandatory criterion met)
```

The `_status` helper covers two cases this definition leaves open: a control
without any `LEGAL_MINIMUM` criteria is reported as ERFÜLLT, and UNBESTIMMT
(undetermined) is returned when a mandatory criterion could not be evaluated,
in which case the caller keeps the legacy result.

`BEST_PRACTICE`/`OPTIONAL` criteria do **not** enter this computation. They are
reported separately as recommendations (§5, level 2).

> **Invariant:** A met legal minimum must NEVER be pulled down to FEHLT/red by
> missing best-practice or optional criteria.
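
The rules above can be sketched as a small pure function. It mirrors the behaviour that the unit tests in this change pin for `_status` (no `LEGAL_MINIMUM` criteria gives ERFÜLLT, any unevaluated mandatory criterion gives UNBESTIMMT); the name `tier_status` and the input shape are assumptions made for illustration.

```python
def tier_status(criteria: list[dict]) -> str:
    """Derive the status from LEGAL_MINIMUM criteria only.

    Each item is assumed to look like {"tier": str, "met": True/False/None}.
    """
    lm = [c["met"] for c in criteria if c["tier"] == "LEGAL_MINIMUM"]
    if not lm:
        return "ERFÜLLT"      # nothing legally required, so never red
    if any(m is None for m in lm):
        return "UNBESTIMMT"   # a mandatory criterion could not be evaluated
    met = sum(1 for m in lm if m)
    if met == len(lm):
        return "ERFÜLLT"
    return "FEHLT" if met == 0 else "TEILWEISE"


doc = [
    {"tier": "LEGAL_MINIMUM", "met": True},
    {"tier": "LEGAL_MINIMUM", "met": True},
    {"tier": "BEST_PRACTICE", "met": False},  # missing, but must not gate
]
print(tier_status(doc))  # ERFÜLLT
```

The unmet `BEST_PRACTICE` entry is simply filtered out before counting, which is what keeps the invariant true by construction.
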
## 5. Reporting: three levels

| Level | Content | Source |
|-------|---------|--------|
| **1: Compliance status (legal)** | ERFÜLLT / TEILWEISE / FEHLT | ONLY `LEGAL_MINIMUM` |
| **2: Optimization potential** | "Recommendations: N · best-practice coverage X %" | `BEST_PRACTICE` + `OPTIONAL` |
| **3: Risk maturity** *(optional, later)* | "Maturity Y %" for CRA/NIS2/ISO 27001/TOM | weighted, see §6 |

**Anti-pattern (forbidden):** no "compliance score = 72 %" when all legal
requirements are met. That prompts "which 28 % are missing?", the answer is
"nothing that is actually mandatory", and the score becomes worthless.
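
As an illustration of levels 1 and 2, the sketch below builds a report entry from a status and a per-criterion `detail` list shaped like the one `evaluate_tiered` returns (`criterion`, `tier`, `met`). The function name `build_report` and the output keys are hypothetical, and defining best-practice coverage as met divided by evaluated non-mandatory criteria is an assumption. No overall compliance percentage is produced.

```python
from typing import Optional


def build_report(status: str, detail: list[dict]) -> dict:
    """Level 1: legal status. Level 2: recommendations and coverage."""
    # Non-mandatory criteria that were actually evaluated (met is not None).
    extras = [d for d in detail
              if d["tier"] != "LEGAL_MINIMUM" and d["met"] is not None]
    missing = [d["criterion"] for d in extras if d["met"] is False]
    covered = sum(1 for d in extras if d["met"])
    coverage: Optional[int] = round(100 * covered / len(extras)) if extras else None
    return {
        "compliance_status": status,             # level 1, LEGAL_MINIMUM only
        "recommendations": missing,              # level 2
        "best_practice_coverage_pct": coverage,  # level 2, None if no extras
    }


report = build_report("ERFÜLLT", [
    {"criterion": "Zwecke genannt", "tier": "LEGAL_MINIMUM", "met": True},
    {"criterion": "tabellarisch ausgewiesen", "tier": "BEST_PRACTICE", "met": False},
    {"criterion": "Legende", "tier": "OPTIONAL", "met": True},
])
print(report)
```

Keeping the status and the coverage figure in separate keys makes it hard to accidentally blend them into a single score.
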
### Color semantics (meaning, not judgment)

- **Green** = legal requirements met (obligation fulfilled)
- **Blue** = recommended improvements available (optimization possible)
- **Red** = legal requirements missing (breach of obligation)

`TEILWEISE` is visually a state of its own (e.g. yellow/amber): the obligation is
partially met. This fits the BreakPilot tone (no panic red) and the 3-tier
obligation model (mandatory / recommended / optional).

## 6. `weight`

Today this is **stored but not used for gating** (a deliberate decision: weights
immediately trigger "why 0.3 and not 0.4?" discussions). It is the reserve for
**level 3 (maturity)**: later, a weighted best-practice/maturity percentage can
be computed from it. Guideline values: LEGAL_MINIMUM 1.0 · BEST_PRACTICE ~0.3 ·
OPTIONAL ~0.1.
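
One way the reserved `weight` could later feed a level 3 maturity value is a weighted share of met criteria. This is purely a sketch: the formula, the function name `maturity_pct`, and the fallback to the guideline weights are assumptions, not a decided design.

```python
from typing import Optional

# Guideline weights from this section, used only when a criterion has no weight.
DEFAULT_WEIGHTS = {"LEGAL_MINIMUM": 1.0, "BEST_PRACTICE": 0.3, "OPTIONAL": 0.1}


def maturity_pct(criteria: list[dict]) -> Optional[float]:
    """Weighted share of met criteria in percent, or None if nothing is evaluable."""
    total = 0.0
    achieved = 0.0
    for c in criteria:
        if c.get("met") is None:
            continue  # skip criteria that could not be evaluated
        w = c.get("weight")
        if w is None:
            w = DEFAULT_WEIGHTS.get(c.get("tier", ""), 0.0)
        total += w
        if c["met"]:
            achieved += w
    return round(100 * achieved / total, 1) if total else None


print(maturity_pct([
    {"tier": "LEGAL_MINIMUM", "met": True},
    {"tier": "BEST_PRACTICE", "met": False},
    {"tier": "OPTIONAL", "met": True},
]))
```

Such a value would be reported alongside the compliance status, never in place of it, so the gating rule on `LEGAL_MINIMUM` stays untouched.
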
## 7. compliance_tier is a PLATFORM axis

This is not just a DSE fix. The same pattern appears everywhere: DSE (minimum
vs. best practice), cookie policy (disclosure vs. transparency), Impressum
(mandatory vs. convenience fields), AGB (required vs. advisable) and, in the
future, CRA/NIS2/Machinery Regulation. A single criterion carries
`compliance_tier` everywhere; the platform evaluates **compliance /
recommendations / maturity** independently of the regulation.

## 8. Validation evidence (pilot, 2026-06-22)

Written on macmini (`generation_metadata.tiered_criteria`, prod-guarded) and
measured against the Opus GT (ikea/ob/teamviewer):

- **5 pilot controls** (SEC-7285-A03, SEC-3257-A01, portability cluster
  DATA-1613/DATA-2552/COMP-2087): all **6 disagreement cases** (previously
  false FEHLT) move to **ERFÜLLT + recommendations**; real gaps correctly stay
  FEHLT, without any checker change.
- **TEILWEISE validation** (DATA-1445-A02, SEC-4752-A02): the third status occurs
  in practice (1 ERFÜLLT / 5 TEILWEISE); the splitting criterion is consistently
  "retention period per purpose" (Art. 13(2)(a)).
- Lesson: even pilot criteria can mix minimum and best practice ("retention
  period *per purpose*"). Where the LM/BP line runs is a **product-policy
  decision (human)**, not an NLP problem. The model is correct; sharpening the
  criteria is curation work.

## 9. Invariants (do not violate)

1. The control UUID stays stable: **no** physical split.
2. The status (green/yellow/red) depends **exclusively** on `LEGAL_MINIMUM`.
3. `BEST_PRACTICE`/`OPTIONAL` produce recommendations, **never** a FEHLT status.
4. No percentage compliance score when all legal requirements are met.
5. Storage in `generation_metadata` (jsonb); no schema migration.

## 10. Rollout (after this freeze)

1. Tier **10 to 15** of the worst overloaded DSE controls (not all 49 at once).
2. Wire the 3-status logic into the live DSE engine (today it runs only in the
   measurement harness).
3. Re-run the benchmark: FP / FN / precision / recall + status distribution.
4. Only once the effect is stable: roll out to all 49 overloaded controls.