Files
breakpilot-compliance/docs-src/architecture/03-source-class.md
T
Benjamin Admin a3053c3c86
CI / detect-changes (push) Successful in 14s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 9s
CI / validate-canonical-controls (push) Successful in 19s
CI / loc-budget (push) Successful in 23s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
docs(architecture): RAG retrieval engine architecture set (01-09)
9 docs + index in docs-src/architecture/ documenting the deterministic
retrieval engine: retrieval pipeline, authority rerank, source_class,
source_role, control-intent + diversity, assessment, confidence,
explainability + supersede, framework_* layer. Each doc carries the exact
constants, the rationale behind them, code refs, and the failure class
it addresses. Audit/onboarding reference.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-25 09:25:22 +02:00

50 lines
3.1 KiB
Markdown

# 03 — `source_class` (Rechtsnatur / Autorität)
**Zweck:** Die Autoritäts-Achse, die den **Rang** bestimmt (siehe [02](02-authority.md)). Deterministisch abgeleitet — der noch nicht re-ingestierte (ungetaggte) Korpus wird trotzdem klassifiziert, ohne Re-Tagging des Bestands.
## Mechanik
`classifyAuthority()` (`authority.go`) entscheidet in dieser Reihenfolge:
1. **Standard-NAME-Override** — erkannter Standard-Name (NIST/OWASP/ISO 27001/CIS/CSA CCM/Grundschutz) erzwingt `technical_standard` (Gewicht 80), **auch wenn die Payload `supervisory_guidance` sagt**. Grund: der Korpus taggt viele Standards mit generischem guidance-`source_class`; der Name ist autoritativer. `binding_law` bleibt unangetastet.
2. **Explizite Payload-Werte** — gesetztes `source_class` / `authority_weight` gewinnen.
3. **Marker-Fallback** — foreign → standard → guidance → regulation → unknown.
`inferJurisdiction()`: Fremd-Marker → `CH`; enthält `§` oder DE-Marker → `DE`; sonst → `EU`.
## Konstanten + Warum
**Gewichte je Klasse** (`sourceClassFromWeight()`):
| `source_class` | Gewicht | Schwelle | Bedeutung |
|----------------|---------|----------|-----------|
| `binding_law` | `100` | w ≥ 100 | bindendes Recht (Gesetz/VO) |
| `technical_standard` | `80` | 80 ≤ w < 100 | Best-Practice-Control-Katalog (NIST/OWASP/ISO) |
| `supervisory_guidance` | `70` | 70 ≤ w < 80 | Aufsichts-/Auslegungs-Guidance (ENISA/BSI/EDPB) |
| `unknown` | `50` | default | unklassifiziert |
| `foreign_law` | `0` | w ≤ 0 | Fremdrecht (CH) |
**Marker-Listen** (Substring-Match):
| Liste | Einträge (Auszug) | Wirkung |
|-------|-------------------|---------|
| `standardMarkers` *(vor guidance geprüft)* | NIST, OWASP, Grundschutz, ISO 27001, ISO/IEC 27001, CSA CCM, Cloud Controls Matrix, CIS Benchmark, CIS Control | → `technical_standard` (80) |
| `guidanceMarkers` | DSK, EDPB, BfDI, ENISA, BSI, EUCC, Standards Mapping, Orientierungshilfe, Handreichung, Leitlinie, Empfehlung, OECD, CISA, Blue Guide, … | → `supervisory_guidance` (70) |
| `foreignMarkers` | RevDSG, fedlex, (CH) | → `foreign_law` (0) |
| `deMarkers` | BDSG, DSK, BfDI, BayLfD, BSI | Signal **DE**-Jurisdiktion |
## Der Standard-Name-Override (Fix 2026-06-25)
**Problem:** Der CE-Korpus taggt z.B. `NIST SP 800-82r3` als `source_class=supervisory_guidance` (Gewicht 70), **nicht** technical_standard. `classifyAuthority` vertraute dem Payload-Tag → NIST landete als guidance, **kein `control_standard`** im Pool → die Diversity-Regel ([05](05-control-intent.md)) konnte nichts injizieren.
**Fix:** Erkannter Standard-Name überschreibt ein fehl-getaggtes guidance/unknown-`source_class``technical_standard`. Code-Fix, **kein Re-Ingest** nötig. Bindendes Recht bleibt unangetastet (Sanity geprüft: Rechtsfrage liefert weiterhin binding Top-1).
## Code
- `authority.go``classifyAuthority()`, `sourceClassFromWeight()`, `inferJurisdiction()`
## Adressierte Fehlerklassen
- **„Standard als guidance mistagged → kein control_standard"** → Standard-Name-Override.
- **„Fremdrecht falsch eingeordnet"** → `foreignMarkers` + `foreign_law`-Gewicht 0.