docs: update control library taxonomy, add provenance wiki page
Some checks failed
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 36s
CI/CD / test-python-backend-compliance (push) Successful in 33s
CI/CD / test-python-document-crawler (push) Successful in 24s
CI/CD / test-python-dsms-gateway (push) Successful in 16s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Has been skipped
- canonical-control-library.md: add 7 release_states (was 4), target_audience (17 values), generation_strategy, missing API endpoints (controls-customer, backfill-citations, backfill-domain), updated test counts (81+)
- NEW control-provenance.md: extracted from frontend page.tsx - methodology, legal basis, filters, badges, taxonomy, open/restricted sources, verification methods, 23 categories, license matrix, source registry
- mkdocs.yml: add Control Provenance Wiki to navigation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -91,6 +91,13 @@ erDiagram
        uuid framework_id FK
        varchar control_id
        varchar severity
        varchar release_state
        varchar category
        varchar verification_method
        varchar target_audience
        varchar generation_strategy
        smallint pipeline_version
        integer license_rule
        jsonb open_anchors
    }
    canonical_control_mappings {
@@ -121,18 +128,26 @@ erDiagram
| `GET` | `/v1/canonical/frameworks` | All frameworks |
| `GET` | `/v1/canonical/frameworks/{id}` | Framework details |
| `GET` | `/v1/canonical/frameworks/{id}/controls` | Controls of a framework |
| `GET` | `/v1/canonical/controls` | All controls (filters: `severity`, `domain`, `release_state`) |
| `GET` | `/v1/canonical/controls` | All controls (filters: `severity`, `domain`, `release_state`, `category`) |
| `GET` | `/v1/canonical/controls/{control_id}` | Single control (e.g. AUTH-001) |
| `POST` | `/v1/canonical/controls` | Create a new control |
| `PUT` | `/v1/canonical/controls/{control_id}` | Update a control |
| `DELETE` | `/v1/canonical/controls/{control_id}` | Delete a control (soft delete) |
| `GET` | `/v1/canonical/controls-customer` | Customer view: hides `generation_metadata` and Rule 3 sources |
| `GET` | `/v1/canonical/sources` | Source registry with permissions |
| `GET` | `/v1/canonical/licenses` | License matrix |
| `GET` | `/v1/canonical/categories` | All 23 categories |
| `POST` | `/v1/canonical/controls/{id}/similarity-check` | Too-close check |
| `POST` | `/v1/canonical/generate` | Start a generator job |
| `GET` | `/v1/canonical/generate/jobs` | All generator jobs |
| `GET` | `/v1/canonical/generate/status/{job_id}` | Query the status of a single job |
| `GET` | `/v1/canonical/generate/processed-stats` | Processing statistics per collection |
| `GET` | `/v1/canonical/generate/review-queue` | Controls awaiting review |
| `POST` | `/v1/canonical/generate/review/{control_id}` | Complete a review |
| `GET` | `/v1/canonical/generate/review-queue` | Controls awaiting review (needs_review, too_close, duplicate) |
| `POST` | `/v1/canonical/generate/review/{control_id}` | Complete a review (approve/reject) |
| `POST` | `/v1/canonical/generate/bulk-review` | Bulk review (approve/reject by state) |
| `POST` | `/v1/canonical/generate/qa-reclassify` | QA reclassification of existing controls |
| `POST` | `/v1/canonical/generate/backfill-citations` | Backfill article/paragraph references |
| `POST` | `/v1/canonical/generate/backfill-domain` | Backfill domain, category and target audience (Anthropic) |
| `GET` | `/v1/canonical/blocked-sources` | Blocked sources (Rule 3) |
| `POST` | `/v1/canonical/blocked-sources/cleanup` | Start the cleanup workflow |

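As a rough illustration of the filters listed for `GET /v1/canonical/controls`, the following Python sketch builds a request URL. The base URL is the one used in the curl example of this document. That the filters are passed as plain query-string parameters is an assumption, and `build_controls_url` is a hypothetical helper, not part of the API.

```python
from urllib.parse import urlencode

# Base URL as it appears in the curl example of this document.
BASE_URL = "https://api-dev.breakpilot.ai/api/compliance"

# Filter names listed in the endpoint table.
ALLOWED_FILTERS = ("severity", "domain", "release_state", "category")


def build_controls_url(**filters: str) -> str:
    """Build a GET URL for /v1/canonical/controls with optional filters.

    Assumption: filters are sent as ordinary query-string parameters.
    """
    unknown = set(filters) - set(ALLOWED_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    # Keep a stable parameter order and skip missing or empty values.
    params = [(name, filters[name]) for name in ALLOWED_FILTERS if filters.get(name)]
    url = f"{BASE_URL}/v1/canonical/controls"
    return f"{url}?{urlencode(params)}" if params else url


print(build_controls_url(severity="high", release_state="approved"))
```

Rejecting unknown filter names early keeps typos from silently returning an unfiltered list.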
@@ -207,6 +222,65 @@ Each source has defined permissions:

---

## Release States (7 values)

| State | Frontend label | Color | Description |
|-------|----------------|-------|-------------|
| `draft` | Draft | Gray | Draft, not yet released |
| `review` | Review | Blue | Awaiting manual review |
| `approved` | Approved | Green | Released to customers |
| `needs_review` | Review noetig | Yellow | Created by the generator, QA review required |
| `too_close` | Zu aehnlich | Red | The too-close detector raised a warning |
| `duplicate` | Duplikat | Orange | Detected as a duplicate of an existing control |
| `deprecated` | Deprecated | Red | Obsolete or deleted (soft delete) |

!!! note "Pipeline-generated states"
    `needs_review`, `too_close` and `duplicate` are assigned automatically by the generator.
    `draft`, `review` and `approved` are manual workflow states.

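The split between pipeline-generated and manual states could be modeled roughly as follows. This is a minimal sketch: the enum values come from the release state table, while the names `ReleaseState`, `PIPELINE_STATES`, `MANUAL_STATES` and `is_pipeline_state` are hypothetical and not taken from the codebase.

```python
from enum import Enum


class ReleaseState(str, Enum):
    """The seven release_state values from the table."""

    DRAFT = "draft"
    REVIEW = "review"
    APPROVED = "approved"
    NEEDS_REVIEW = "needs_review"
    TOO_CLOSE = "too_close"
    DUPLICATE = "duplicate"
    DEPRECATED = "deprecated"


# Assigned automatically by the generator, according to the note.
PIPELINE_STATES = frozenset(
    {ReleaseState.NEEDS_REVIEW, ReleaseState.TOO_CLOSE, ReleaseState.DUPLICATE}
)

# Manual workflow states, according to the note.
MANUAL_STATES = frozenset(
    {ReleaseState.DRAFT, ReleaseState.REVIEW, ReleaseState.APPROVED}
)


def is_pipeline_state(value: str) -> bool:
    """Return True if the state is one the generator assigns itself.

    Raises ValueError for a value that is not a known release state.
    """
    return ReleaseState(value) in PIPELINE_STATES


print(is_pipeline_state("too_close"))
print(is_pipeline_state("approved"))
```

`deprecated` is deliberately left out of both sets, because the note does not classify it as either pipeline-generated or manual.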
---

## Target Audience (Zielgruppe)

Each control can have one or more target audiences. The target audience determines which type of organization the control is relevant to.

| Key | Label | Color |
|-----|-------|-------|
| `enterprise` / `unternehmen` | Unternehmen | Cyan |
| `authority` / `behoerden` | Behoerden | Rose |
| `provider` | Anbieter | Violet |
| `all` | Alle | Gray |
| `entwickler` | Entwickler | Sky |
| `datenschutzbeauftragte` | DSB | Purple |
| `geschaeftsfuehrung` | GF | Amber |
| `it-abteilung` | IT | Blue |
| `rechtsabteilung` | Recht | Fuchsia |
| `compliance-officer` | Compliance | Indigo |
| `personalwesen` | Personal | Pink |
| `einkauf` | Einkauf | Lime |
| `produktion` | Produktion | Orange |
| `vertrieb` | Vertrieb | Teal |
| `gesundheitswesen` | Gesundheit | Red |
| `finanzwesen` | Finanzen | Emerald |
| `oeffentlicher_dienst` | Oeffentl. Dienst | Rose |

**DB field:** `target_audience` (VARCHAR; can hold an array as JSONB)
**Migration:** 049_target_audience.sql

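Because `target_audience` may arrive either as a single string or as a JSON array, a reader of the field has to handle both shapes. The sketch below shows one possible way to do so. The helper `parse_target_audience` is hypothetical, and treating the German key of each two-key table row as an alias of the English key is an assumption; the table only shows that both keys share one label.

```python
import json

# Assumption: the two keys listed in one table row are aliases of each other.
ALIASES = {"unternehmen": "enterprise", "behoerden": "authority"}


def parse_target_audience(raw):
    """Return a de-duplicated list of audience keys from a raw field value.

    Accepts None, a plain string, a JSON array string, or a list.
    """
    if raw is None or raw == "":
        return []
    if isinstance(raw, (list, tuple)):
        values = list(raw)
    else:
        text = str(raw).strip()
        # A leading bracket indicates a JSON array stored as text.
        values = json.loads(text) if text.startswith("[") else [text]
    result = []
    for value in values:
        key = ALIASES.get(value, value)
        if key not in result:  # keep first occurrence, preserve order
            result.append(key)
    return result


print(parse_target_audience('["unternehmen", "entwickler"]'))
print(parse_target_audience("provider"))
```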
---

## Generation Strategy

| Strategy | Badge | Color | Meaning |
|----------|-------|-------|---------|
| `ungrouped` (default/null) | v1 | Gray | Individual processing (original approach) |
| `document_grouped` | v2 | Emerald | Processing by document group (v2 pipeline) |

**DB field:** `generation_strategy` (TEXT, default: `'ungrouped'`)
**Migration:** 058_generation_strategy.sql

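The mapping from strategy to badge, including the null default, can be sketched in a few lines of Python. The function name `strategy_badge` and the lowercase color strings are hypothetical; the strategy values and badge texts are the ones from the table.

```python
# Badge text and color per strategy, as listed in the table.
BADGES = {
    "ungrouped": ("v1", "gray"),
    "document_grouped": ("v2", "emerald"),
}


def strategy_badge(generation_strategy):
    """Return (badge, color) for a generation_strategy value.

    None or an empty string falls back to the documented default 'ungrouped'.
    """
    strategy = generation_strategy or "ungrouped"
    if strategy not in BADGES:
        raise ValueError(f"unknown generation_strategy: {strategy!r}")
    return BADGES[strategy]


print(strategy_badge(None))
print(strategy_badge("document_grouped"))
```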
---

## CI/CD Validation

The validator (`scripts/validate-controls.py`) checks on every commit:
@@ -223,19 +297,77 @@ The validator (`scripts/validate-controls.py`) checks on every commit:

### Control Library Browser (`/sdk/control-library`)

**List view:**

- Framework info with version and description
- Filterable control table (domain, severity, free text)
- Detail view with: objective, rationale, requirements, test procedure, evidence
- **Open-source references** displayed prominently (green box)
- Tags and scope information
- Filterable control table with **7 filter dropdowns:**
    1. Severity (critical, high, medium, low)
    2. Domain (from metadata, all existing domains)
    3. Status (draft, approved, needs_review, too_close, duplicate, deprecated)
    4. Evidence (code_review, document, tool, hybrid)
    5. Category (23 thematic categories)
    6. Target audience (17 audience values)
    7. Document origin (by source regulation)
- Sorting: ID, source, newest/oldest
- Pagination: 50 controls per page
- Free-text search (ID, title, objective)

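The free-text search and the 50-per-page pagination of the list view could work roughly like the sketch below. The page size comes from the list above; the function `search_and_paginate` and the field names `title` and `objective` are hypothetical, and whether filtering happens in the frontend or in the API is not stated here.

```python
import math

PAGE_SIZE = 50  # controls per page, as stated for the list view

# Fields searched by the free-text box (ID, title, objective); names assumed.
SEARCH_FIELDS = ("control_id", "title", "objective")


def search_and_paginate(controls, query="", page=1):
    """Filter controls by a case-insensitive substring, then return one page.

    Returns a tuple (items_on_page, total_pages).
    """
    needle = query.strip().lower()
    matches = [
        control
        for control in controls
        if not needle
        or any(needle in str(control.get(field, "")).lower() for field in SEARCH_FIELDS)
    ]
    total_pages = max(1, math.ceil(len(matches) / PAGE_SIZE))
    page = min(max(page, 1), total_pages)  # clamp to the valid range
    start = (page - 1) * PAGE_SIZE
    return matches[start:start + PAGE_SIZE], total_pages


sample = [{"control_id": f"AUTH-{i:03d}", "title": "t", "objective": "o"} for i in range(120)]
items, pages = search_and_paginate(sample, page=3)
print(len(items), pages)
```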
**Detail view:**

- Objective, rationale, scope
- Requirements, test procedure, evidence
- **Legal basis** (blue box): `source_citation` with article, paragraph, license, link
- **Open-source references** (green box): linked open anchors
- Generation details: `processing_path`, `similarity_status`
- Tags, risk score, implementation effort
- **Badges:** Severity, State, LicenseRule, VerificationMethod, Category, TargetAudience, GenerationStrategy

**Review mode:**

Review mode is activated when `needs_review` controls exist.
It splits the review queue into two tabs:

| Tab | Content | View |
|-----|---------|------|
| **Suspected duplicate** | Controls with `similar_controls` in `generation_metadata` | Side-by-side comparison (ReviewCompare) |
| **Rule 3 without anchor** | Controls without open anchors | Single detail view |

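The assignment of a control to one of the two review tabs can be pictured with the following sketch. The field names `generation_metadata`, `similar_controls` and `open_anchors` appear in this document; the function `review_tab`, the returned tab identifiers, and the rule that the duplicate tab takes precedence when both conditions hold are assumptions made for illustration.

```python
from typing import Optional


def review_tab(control: dict) -> Optional[str]:
    """Decide which review tab a control belongs to.

    Assumption: a control with similar_controls goes to the duplicate tab
    even if it also lacks open anchors.
    """
    metadata = control.get("generation_metadata") or {}
    if metadata.get("similar_controls"):
        return "duplicate_suspect"
    if not control.get("open_anchors"):
        return "rule3_without_anchor"
    return None  # belongs to neither tab


queue = [
    {"control_id": "AUTH-001", "generation_metadata": {"similar_controls": ["AUTH-007"]}},
    {"control_id": "NET-002", "generation_metadata": {}, "open_anchors": []},
]
for control in queue:
    print(control["control_id"], review_tab(control))
```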
**Duplicate comparison (ReviewCompare):**

- Left side: control under review (highlighted in yellow)
- Right side: suspected duplicate (from `generation_metadata.similar_controls[0]`)
- Similarity percentage in the header
- Actions: keep (approve), duplicate (reject), edit

**Further actions:**

- Generator modal: start a job (domain, collections, dry run, max_controls)
- Bulk review: approve or reject all filtered controls
- Statistics dialog: processing statistics per collection

### Control Provenance Wiki (`/sdk/control-provenance`)

- Documentation of the methodology
- Independent taxonomy explained
- Open reference sources listed
- Protected sources and separation principle
- **Live data:** license matrix and source registry from the database
Wiki-style documentation of the legal and methodological basis for control creation:

**Static sections (10):**

1. **Methodology of control creation**: three-stage process, legal basis (UrhG §44b, §23)
2. **Filters in the Control Library**: explanation of all 7 filters plus sorting
3. **Badges & license rules**: Rule 1/2/3, processing paths, references
4. **Independent taxonomy**: top 10 domains, ID format, specialized domains
5. **Open reference sources**: OWASP, NIST, ENISA, SLSA, CIS
6. **Protected sources**: BSI, ISO, ETSI plus separation principle
7. **Verification methods**: 4 methods and what they mean for customers
8. **Thematic categories**: 17 categories with description
9. **Master library strategy**: RAG-first, dedup, wave approach
10. **Automated validation**: CI/CD checks, too-close detector

**Live data (API-fed):**

- **License matrix:** table of all `canonical_control_licenses` with permission badges
- **Source registry:** cards for each `canonical_control_sources` with 4 permission levels

See also: [Control Provenance documentation](control-provenance.md)

---

@@ -684,21 +816,27 @@ curl -X POST https://api-dev.breakpilot.ai/api/compliance/v1/canonical/controls
| `backend-compliance/tests/test_canonical_control_routes.py` | Python | 14 tests | REST API endpoints |
| `backend-compliance/tests/test_license_gate.py` | Python | 12 tests | License classification |
| `backend-compliance/tests/test_validate_controls.py` | Python | 14 tests | CI/CD validator |
| `backend-compliance/tests/test_control_generator.py` | Python | 15 tests | Pipeline, batch, license rules |
| **Total** | | **82 tests** |
| `backend-compliance/tests/test_control_generator.py` | Python | 81 tests | Pipeline, batch, license rules, QA, recital |
| **Total** | | **149+ tests** |

### Control Generator Tests (test_control_generator.py)

The generator tests cover the following areas:

- **`TestLicenseMapping`** (12 tests): correct mapping of `regulation_code` to license rules (Rule 1/2/3),
  case insensitivity, Rule 3 must not expose source names
- **`TestDomainDetection`** (5 tests): detection of the AUTH, CRYPT, NET, DATA domains from chunk text
- **`TestJsonParsing`** (4 tests): robust parsing of LLM responses (plain JSON, Markdown-fenced, with preamble)
- **`TestGeneratedControlRules`** (3 tests): Rule 1 has the original text, Rule 2 has a citation, Rule 3 has **nothing**
- **`TestAnchorFinder`** (2 tests): RAG search filters out Rule 3 sources, web search detects frameworks
- **`TestPipelineMocked`** (5 tests): end-to-end with mocks: license classification, Rule 3 blocking,
  hash deduplication, config defaults (`batch_size: 5`), Rule 1 citation generation
| Class | Tests | Checks |
|-------|-------|--------|
| `TestLicenseMapping` | 12 | License classification (Rule 1/2/3), case insensitivity |
| `TestDomainDetection` | 5 | Keyword-based domain detection (AUTH, CRYP, NET, DATA) |
| `TestJsonParsing` | 4 | JSON parser for LLM responses (Markdown fencing, preamble) |
| `TestGeneratedControlRules` | 3 | Rule-specific fields (original_text, citation, source_info) |
| `TestAnchorFinder` | 2 | RAG search and web framework detection |
| `TestPipelineMocked` | 5 | End-to-end pipeline with mocks (license, hash dedup, config) |
| `TestParseJsonArray` | 15 | JSON array parser (wrapper objects, bracket extraction, fallbacks) |
| `TestBatchSizeConfig` | 5 | Batch size configuration and defaults |
| `TestBatchProcessingLoop` | 10 | Batch processing (rule split, mixed rules, too-close, null handling) |
| `TestRegulationFilter` | 5 | `regulation_filter` prefix matching, empty `regulation_codes` |
| `TestPipelineVersion` | 5 | `pipeline_version=2` in DB writes, null handling in Structure/Reform |
| `TestRecitalDetection` | 10 | Recital detection in source texts (regex, phrases, combined) |

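As a quick consistency check, the per-class counts in the class table can be summed and compared with the 81 tests reported for `test_control_generator.py`. The dictionary below simply restates the table; the variable names are illustrative.

```python
# Test counts per class, copied from the class table.
TESTS_PER_CLASS = {
    "TestLicenseMapping": 12,
    "TestDomainDetection": 5,
    "TestJsonParsing": 4,
    "TestGeneratedControlRules": 3,
    "TestAnchorFinder": 2,
    "TestPipelineMocked": 5,
    "TestParseJsonArray": 15,
    "TestBatchSizeConfig": 5,
    "TestBatchProcessingLoop": 10,
    "TestRegulationFilter": 5,
    "TestPipelineVersion": 5,
    "TestRecitalDetection": 10,
}

generator_total = sum(TESTS_PER_CLASS.values())
print(generator_total)  # 81, matching the count given for test_control_generator.py
```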
---