Canonical Control Library (CP-CLIB)
Independently formulated security controls based on open knowledge (OWASP, NIST, ENISA). The taxonomy is independent and makes no reference to proprietary frameworks.
Prefix: CP-CLIB · Frontend: https://macmini:3007/sdk/control-library
Provenance Wiki: https://macmini:3007/sdk/control-provenance
Proxy: /api/sdk/v1/canonical → backend-compliance:8002/api/v1/canonical/...
Motivation
We need a system that extracts independent, legally defensible controls from a range of security guidelines without using proprietary text in the product.
Core Principles
- Independent taxonomy: own domain IDs (AUTH, NET, SUP, etc.) and own ID format (`DOMAIN-NNN`)
- Open-source anchoring: every control has at least one open anchor (OWASP/NIST/ENISA)
- Strict source separation: protected sources are used internally for analysis only, never in the product
- Automated checks: Too-Close Detector and No-Leak Scanner in CI/CD
Legal Basis
| Law | Relevance |
|---|---|
| UrhG §44b | Text and data mining; copies must be deleted |
| UrhG §23 | Sufficient distance from the original work |
| BSI terms of use | Commercial use only with consent |
Domains (Independent Taxonomy)
| Domain | Name | Description |
|---|---|---|
| AUTH | Identity & Access Management | Authentication, MFA, token management |
| NET | Network & Transport Security | TLS, certificates, network hardening |
| SUP | Software Supply Chain | Signing, SBOM, dependency scanning |
| LOG | Security Operations & Logging | Privacy-aware logging, SIEM |
| WEB | Web Application Security | Admin flows, account recovery |
| DATA | Data Governance & Classification | Data classification, protective measures |
| CRYP | Cryptographic Operations | Key management, rotation, HSM |
| REL | Release & Change Governance | Change impact assessment, security review |
!!! warning "No BSI nomenclature"
    The domains deliberately do NOT use BSI identifiers (O.Auth_, O.Netz_).
    The ID format DOMAIN-NNN is a common, non-proprietary convention.
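The ID convention can be checked mechanically. The following is a minimal sketch, not the project's validator: it assumes that `NNN` means exactly three digits (as in `AUTH-001`) and that only the eight domains listed in the table are valid. The function name `is_valid_control_id` is hypothetical.

```python
import re

# Domain prefixes taken from the domain table.
DOMAINS = {"AUTH", "NET", "SUP", "LOG", "WEB", "DATA", "CRYP", "REL"}

# Assumption: "NNN" means exactly three digits, as in AUTH-001.
CONTROL_ID_RE = re.compile(r"([A-Z]+)-(\d{3})")


def is_valid_control_id(control_id: str) -> bool:
    """Return True if control_id has the form DOMAIN-NNN with a known domain."""
    match = CONTROL_ID_RE.fullmatch(control_id)
    return match is not None and match.group(1) in DOMAINS
```

With these assumptions, `AUTH-001` is accepted, while a BSI-style identifier such as `O.Auth_1` or an unknown prefix such as `XYZ-001` is rejected.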
Data Model (Migration 044)
erDiagram
canonical_control_licenses ||--o{ canonical_control_sources : "has"
canonical_control_frameworks ||--o{ canonical_controls : "contains"
canonical_controls ||--o{ canonical_control_mappings : "has"
canonical_control_sources ||--o{ canonical_control_mappings : "references"
canonical_control_licenses {
varchar license_id PK
varchar name
varchar commercial_use
boolean deletion_required
}
canonical_control_sources {
uuid id PK
varchar source_id UK
varchar title
boolean allowed_ship_in_product
}
canonical_control_frameworks {
uuid id PK
varchar framework_id UK
varchar name
varchar version
}
canonical_controls {
uuid id PK
uuid framework_id FK
varchar control_id
varchar severity
jsonb open_anchors
}
canonical_control_mappings {
uuid id PK
uuid control_id FK
uuid source_id FK
varchar mapping_type
varchar attribution_class
}
Tables
| Table | Purpose | Shippable in product? |
|---|---|---|
| `canonical_control_licenses` | License metadata | Yes (read-only) |
| `canonical_control_sources` | Source register | No (internal only) |
| `canonical_control_frameworks` | Framework registry | Yes |
| `canonical_controls` | The controls themselves | Yes |
| `canonical_control_mappings` | Provenance trail | No (audit only) |
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | `/v1/canonical/frameworks` | All frameworks |
| GET | `/v1/canonical/frameworks/{id}` | Framework details |
| GET | `/v1/canonical/frameworks/{id}/controls` | Controls of a framework |
| GET | `/v1/canonical/controls` | All controls (filters: severity, domain, release_state) |
| GET | `/v1/canonical/controls/{control_id}` | Single control (e.g. AUTH-001) |
| GET | `/v1/canonical/sources` | Source register with permissions |
| GET | `/v1/canonical/licenses` | License matrix |
| POST | `/v1/canonical/controls/{id}/similarity-check` | Too-close check |
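As a rough illustration of the filters on the controls endpoint, the sketch below builds a request URL. It assumes the three filters are passed as query-string parameters, which this page does not state explicitly. The helper name `controls_url` and the example value `high` are hypothetical.

```python
from urllib.parse import urlencode

# Host, port and base path as used in the curl examples on this page.
BASE_URL = "https://macmini:8002/api/v1/canonical"


def controls_url(severity=None, domain=None, release_state=None) -> str:
    """Build the list-controls URL, skipping filters that are not set.

    Assumption: filters are sent as query-string parameters.
    """
    filters = {"severity": severity, "domain": domain, "release_state": release_state}
    query = urlencode({key: value for key, value in filters.items() if value is not None})
    return f"{BASE_URL}/controls" + (f"?{query}" if query else "")


if __name__ == "__main__":
    print(controls_url(domain="AUTH", severity="high"))
```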
Example: Fetch a control
curl -s https://macmini:8002/api/v1/canonical/controls/AUTH-001 | jq
Example: Similarity check
curl -X POST https://macmini:8002/api/v1/canonical/controls/AUTH-001/similarity-check \
-H 'Content-Type: application/json' \
-d '{
"source_text": "Die Anwendung muss MFA implementieren.",
"candidate_text": "Privileged accounts require multi-factor authentication."
}' | jq
Response:
{
"max_exact_run": 0,
"token_overlap": 0.0714,
"ngram_jaccard": 0.0323,
"embedding_cosine": 0.0,
"lcs_ratio": 0.0714,
"status": "PASS",
"details": {
"max_exact_run": "PASS",
"token_overlap": "PASS",
"ngram_jaccard": "PASS",
"embedding_cosine": "PASS",
"lcs_ratio": "PASS"
}
}
Too-Close Detector
Five metrics with thresholds:
| Metric | Warn | Fail | Description |
|---|---|---|---|
| Exact Phrase | ≥8 tokens | ≥12 tokens | Longest identical token sequence |
| Token Overlap | ≥0.20 | ≥0.30 | Jaccard index of the token sets |
| 3-Gram Jaccard | ≥0.10 | ≥0.18 | String similarity |
| Embedding Cosine | ≥0.86 | ≥0.92 | Semantic similarity (bge-m3) |
| LCS Ratio | ≥0.35 | ≥0.50 | Longest common subsequence |
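To make the four text-based metrics concrete, here is a minimal sketch of how they could be computed. It is not the code in `similarity_detector.py`. The tokenizer, the use of character 3-grams, and the normalization of the LCS ratio by the longer token list are all assumptions, so the numbers will not necessarily match the API response shown on this page. The embedding metric is left out because it needs the bge-m3 model.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-word characters (an assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def jaccard(a: set, b: set) -> float:
    """Jaccard index |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def max_exact_run(a: list[str], b: list[str]) -> int:
    """Length of the longest contiguous token sequence shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0] * (len(b) + 1)
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common (not necessarily contiguous) subsequence."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0] * (len(b) + 1)
        for j, tok_b in enumerate(b, start=1):
            curr[j] = prev[j - 1] + 1 if tok_a == tok_b else max(prev[j], curr[j - 1])
        prev = curr
    return prev[-1]


def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Set of character n-grams of the lowercased text (assumed granularity)."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def similarity_metrics(source: str, candidate: str) -> dict:
    """Compute the four text-based metrics for a source/candidate pair."""
    src, cand = tokenize(source), tokenize(candidate)
    longest = max(len(src), len(cand))
    return {
        "max_exact_run": max_exact_run(src, cand),
        "token_overlap": jaccard(set(src), set(cand)),
        "ngram_jaccard": jaccard(char_ngrams(source), char_ngrams(candidate)),
        # Assumption: the ratio is normalized by the longer token list.
        "lcs_ratio": lcs_length(src, cand) / longest if longest else 0.0,
    }
```

Both dynamic-programming helpers keep only two rows of the table, so memory stays proportional to the length of the candidate text.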
Decision logic:
- PASS: no Fail and at most 1 Warn
- WARN: no Fail and up to 2 Warn; requires human review
- FAIL: any Fail; blocked, rewording required
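The thresholds and the decision rules can be combined into a small grading function. The sketch below is one possible reading, not the shipped logic: the rules as written do not say what happens with more than two warnings, so it assumes that any count of two or more warnings (without a Fail) yields WARN. The function names are hypothetical.

```python
# Thresholds from the metrics table: metric -> (warn, fail).
THRESHOLDS = {
    "max_exact_run": (8, 12),
    "token_overlap": (0.20, 0.30),
    "ngram_jaccard": (0.10, 0.18),
    "embedding_cosine": (0.86, 0.92),
    "lcs_ratio": (0.35, 0.50),
}


def grade(metric: str, value: float) -> str:
    """Grade a single metric value against its warn/fail thresholds."""
    warn, fail = THRESHOLDS[metric]
    if value >= fail:
        return "FAIL"
    if value >= warn:
        return "WARN"
    return "PASS"


def overall_status(details: dict) -> str:
    """Aggregate per-metric grades into one status."""
    grades = list(details.values())
    if "FAIL" in grades:
        return "FAIL"
    # Assumption: two or more warnings escalate to WARN (human review).
    if grades.count("WARN") >= 2:
        return "WARN"
    return "PASS"


def evaluate(metrics: dict) -> dict:
    """Return the metrics plus status and details, shaped like the API response."""
    details = {name: grade(name, value) for name, value in metrics.items()}
    return {**metrics, "status": overall_status(details), "details": details}
```

Fed with the values from the example response (`max_exact_run` 0, `token_overlap` 0.0714, and so on), `evaluate` grades every metric as PASS and returns the overall status PASS.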
License Gate
Each source has defined permissions:
| Usage type | Column | Example: OWASP | Example: BSI |
|---|---|---|---|
| Analysis | `allowed_analysis` | ✅ | ✅ |
| Store excerpt | `allowed_store_excerpt` | ✅ | ❌ |
| Ship embeddings | `allowed_ship_embeddings` | ✅ | ❌ |
| Ship in product | `allowed_ship_in_product` | ✅ | ❌ |
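The permission matrix maps naturally onto a lookup. The following sketch shows the idea under stated assumptions: the class `SourcePermissions`, the function `check_usage`, the short usage keys, and the `source_id` values are hypothetical. Only the four `allowed_*` column names and the OWASP/BSI example values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePermissions:
    """Permission flags for one source, mirroring the allowed_* columns."""
    source_id: str
    allowed_analysis: bool
    allowed_store_excerpt: bool
    allowed_ship_embeddings: bool
    allowed_ship_in_product: bool


# Hypothetical short usage keys mapped to the column names from the table.
USAGE_TO_COLUMN = {
    "analysis": "allowed_analysis",
    "excerpt": "allowed_store_excerpt",
    "embeddings": "allowed_ship_embeddings",
    "product": "allowed_ship_in_product",
}


def check_usage(source: SourcePermissions, usage: str) -> bool:
    """Return True if the source may be used in the requested way."""
    column = USAGE_TO_COLUMN.get(usage)
    if column is None:
        raise ValueError(f"unknown usage type: {usage}")
    return getattr(source, column)


# Example rows matching the OWASP and BSI columns of the table.
OWASP = SourcePermissions("OWASP", True, True, True, True)
BSI = SourcePermissions("BSI", True, False, False, False)
```

Raising on an unknown usage key, rather than returning `False`, keeps a typo in the caller from being mistaken for a legitimate denial.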
CI/CD Validation
The validator (scripts/validate-controls.py) checks the following on every commit:
- Schema validation: all required fields, ID format, severity
- No-leak scanner: regex against BSI patterns (`O.Auth_*`, `TR-03161`, etc.)
- Open anchor check: every control has ≥1 open anchor
- Taxonomy check: no BSI-style ID prefixes
- Evidence structure: all evidence items have `type` + `description`
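Three of these checks can be sketched in a few lines. This is an illustration only, not the contents of `scripts/validate-controls.py`: the two regular expressions are guesses derived from the examples `O.Auth_*` and `TR-03161`, and the `evidence` field name is an assumption (`open_anchors`, `type`, and `description` appear on this page).

```python
import json
import re

# Assumed leak patterns, derived from the two examples named in the list.
LEAK_PATTERNS = [
    re.compile(r"\bO\.[A-Za-z]+_\w*"),
    re.compile(r"\bTR-03161\b"),
]


def find_leaks(text: str) -> list[str]:
    """Return every substring that matches a protected-source pattern."""
    return [m.group(0) for pattern in LEAK_PATTERNS for m in pattern.finditer(text)]


def validate_control(control: dict) -> list[str]:
    """Return a list of problems for one control; an empty list means OK."""
    errors = []

    # No-leak scan over the serialized control, so nested fields are covered.
    for leak in find_leaks(json.dumps(control)):
        errors.append(f"protected identifier found: {leak}")

    # Open anchor check: at least one entry is required.
    if not control.get("open_anchors"):
        errors.append("control has no open anchor")

    # Evidence structure: "evidence" is an assumed field name.
    for index, item in enumerate(control.get("evidence", [])):
        missing = [key for key in ("type", "description") if not item.get(key)]
        if missing:
            errors.append(f"evidence[{index}] is missing: {', '.join(missing)}")

    return errors
```

A CI job could call `validate_control` for each entry in the controls file and exit with a non-zero status when any list of problems is non-empty.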
Frontend
Control Library Browser (/sdk/control-library)
- Framework info with version and description
- Filterable control table (domain, severity, free text)
- Detail view with objective, rationale, requirements, test procedures, evidence
- Open-source references displayed prominently (green box)
- Tags and scope information
Control Provenance Wiki (/sdk/control-provenance)
- Documentation of the methodology
- Explanation of the independent taxonomy
- List of open reference sources
- Protected sources and the separation principle
- Live data: license matrix and source register from the database
Files
| File | Type | Description |
|---|---|---|
| `backend-compliance/migrations/044_canonical_control_library.sql` | SQL | 5 tables + seed data |
| `backend-compliance/compliance/api/canonical_control_routes.py` | Python | REST API (8 endpoints) |
| `backend-compliance/compliance/services/license_gate.py` | Python | License gate logic |
| `backend-compliance/compliance/services/similarity_detector.py` | Python | Too-Close Detector (5 metrics) |
| `ai-compliance-sdk/policies/canonical_controls_v1.json` | JSON | 10 seed controls, 39 open anchors |
| `ai-compliance-sdk/internal/ucca/canonical_control_loader.go` | Go | Control loader with multi-index |
| `admin-compliance/app/sdk/control-library/page.tsx` | TSX | Control Library Browser |
| `admin-compliance/app/sdk/control-provenance/page.tsx` | TSX | Provenance Wiki |
| `admin-compliance/app/api/sdk/v1/canonical/route.ts` | TS | Next.js API proxy |
| `scripts/validate-controls.py` | Python | CI/CD validator |
Tests
| File | Language | Tests |
|---|---|---|
| `ai-compliance-sdk/internal/ucca/canonical_control_loader_test.go` | Go | 8 tests |
| `backend-compliance/compliance/tests/test_similarity_detector.py` | Python | 19 tests |
| `backend-compliance/tests/test_canonical_control_routes.py` | Python | 14 tests |
| `backend-compliance/tests/test_license_gate.py` | Python | 12 tests |
| `backend-compliance/tests/test_validate_controls.py` | Python | 14 tests |
| Total | | 67 tests |