# Session Instructions: Block F - Hardcoded Knowledge Migration

**Date:** 2026-05-03
**For:** Next Claude session
**Repo:** breakpilot-core (~/Projekte/breakpilot-core)

---

## NEXT STEP: Block F1 - Regulation Registry

### What needs to be done

1. **DB table**: create `compliance.regulation_registry` (migration script)
2. **Migrate data** from `control_generator.py` (135 entries) + `source_type_classification.py` (58)
3. **Auto-create** in the RAG service on document upload (status='needs_review')
4. **Backend API** in the breakpilot-compliance backend (GET/POST/PUT /v1/regulations)
5. **Frontend** in the breakpilot-compliance admin under `/sdk/regulation-registry` (between roadmap and isms)
6. **Sync check** script (weekly: Qdrant regulation_ids vs. DB)
7. **Switch code** in control_generator.py (dict → DB query with cache)

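
Step 7 replaces the hardcoded dict in `control_generator.py` with a DB query behind a cache. Below is a minimal sketch of how such a cache could look, assuming the whole registry is small enough to load in one query. `RegistryCache`, its `loader` callable, and the 300-second TTL are hypothetical names and values, not taken from the codebase.

```python
import time
from typing import Callable, Dict, Optional


class RegistryCache:
    """Keeps the full regulation registry in memory for ttl_seconds.

    Hypothetical helper: `loader` stands in for whatever function runs the
    actual SELECT against compliance.regulation_registry and returns a
    mapping of regulation_id -> row dict.
    """

    def __init__(
        self,
        loader: Callable[[], Dict[str, dict]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._rows: Dict[str, dict] = {}
        self._loaded_at: Optional[float] = None

    def get(self, regulation_id: str) -> Optional[dict]:
        """Return the cached row, reloading everything once the TTL has expired."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._rows = self._loader()
            self._loaded_at = now
        return self._rows.get(regulation_id)
```

Injecting the loader keeps the cache testable without a database, and reloading the whole table at once avoids one query per lookup.
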
### Frontend Requirements (breakpilot-compliance admin, port 3007)

- NAV position: between `/sdk/roadmap` and `/sdk/isms`
- Table of all regulations (sortable, filterable)
- Status badge: "Needs Review" (yellow), "Active" (green), "Deprecated" (gray)
- Counter in the NAV for unreviewed entries
- Inline edit: license_rule, jurisdiction, source_type, names
- "Approve" button → status='active'
- Discrepancy display: regulation_ids that exist in Qdrant but not in the DB

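
The discrepancy display and the weekly sync check both reduce to a set comparison between the regulation_ids found in Qdrant and those in the DB. Below is a minimal sketch, assuming both ID lists have already been fetched. `find_discrepancies` is a hypothetical helper, and the reverse direction (`missing_in_qdrant`) is an extra that the list above does not require.

```python
from typing import Dict, Iterable, List


def find_discrepancies(
    qdrant_ids: Iterable[str], db_ids: Iterable[str]
) -> Dict[str, List[str]]:
    """Compare regulation_ids from Qdrant with those in the registry table."""
    qdrant_set = set(qdrant_ids)  # duplicates across chunks collapse here
    db_set = set(db_ids)
    return {
        # IDs used in Qdrant payloads that have no registry row yet
        "missing_in_db": sorted(qdrant_set - db_set),
        # Registry rows without any chunk in Qdrant (optional extra check)
        "missing_in_qdrant": sorted(db_set - qdrant_set),
    }
```

Sorting the results keeps the output stable between runs, which makes weekly reports easier to diff.
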
### Critical Files

| Repo | File | Action |
|------|------|--------|
| core | `control-pipeline/services/control_generator.py` lines 75-236 | EDIT: dict → DB |
| core | `control-pipeline/data/source_type_classification.py` | DELETE (after migration) |
| core | `rag-service/api/documents.py` | EDIT: auto-create on upload |
| compliance | `backend-compliance/compliance/api/regulations.py` | NEW: API endpoints |
| compliance | `admin-compliance/app/sdk/regulation-registry/` | NEW: frontend page |

### DB Schema

```sql
CREATE TABLE compliance.regulation_registry (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    regulation_id VARCHAR(100) UNIQUE NOT NULL,
    regulation_name_de TEXT,
    regulation_name_en TEXT,
    regulation_short VARCHAR(50),
    license_rule INTEGER NOT NULL DEFAULT 1 CHECK (license_rule IN (1, 2, 3)),
    license_type VARCHAR(50),
    source_type VARCHAR(20) NOT NULL DEFAULT 'law',
    jurisdiction VARCHAR(10),
    category VARCHAR(50),
    celex VARCHAR(20),
    url TEXT,
    status VARCHAR(20) NOT NULL DEFAULT 'needs_review',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_reg_registry_status ON compliance.regulation_registry(status);
CREATE INDEX idx_reg_registry_jurisdiction ON compliance.regulation_registry(jurisdiction);
```

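
The auto-create step in the RAG service could rely on the `UNIQUE` constraint on `regulation_id`, so that repeated uploads neither create duplicates nor reset an entry that has already been reviewed. The sketch below illustrates that idea with an in-memory SQLite table as a stand-in for PostgreSQL (reduced column set, no `compliance` schema). `create_demo_table` and `auto_create` are hypothetical names, and the `ON CONFLICT ... DO NOTHING` clause needs SQLite 3.24 or newer.

```python
import sqlite3


def create_demo_table(conn: sqlite3.Connection) -> None:
    """Create a reduced SQLite stand-in for compliance.regulation_registry."""
    conn.execute(
        """
        CREATE TABLE regulation_registry (
            regulation_id TEXT UNIQUE NOT NULL,
            license_rule INTEGER NOT NULL DEFAULT 1
                CHECK (license_rule IN (1, 2, 3)),
            source_type TEXT NOT NULL DEFAULT 'law',
            status TEXT NOT NULL DEFAULT 'needs_review'
        )
        """
    )


def auto_create(conn: sqlite3.Connection, regulation_id: str) -> bool:
    """Insert a placeholder row for an unknown regulation_id.

    Returns True if a new row was created, False if the ID already existed.
    Existing rows are left untouched, so an approved status survives re-uploads.
    """
    cur = conn.execute(
        "INSERT INTO regulation_registry (regulation_id) VALUES (?) "
        "ON CONFLICT (regulation_id) DO NOTHING",
        (regulation_id,),
    )
    return cur.rowcount == 1
```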
---

## OVERALL PLAN Block F (4 days)

| Phase | What | Effort | Status |
|-------|------|--------|--------|
| F1 | Regulation Registry (DB + API + Frontend + Auto-Create) | 1 day | 🔥 NEXT |
| F2 | Action types + synonyms → DB | 1 day | Pending |
| F3 | Object synonyms → DB | 0.5 day | Pending |
| F4 | LLM synonym enrichment | 1 day | Pending |
| F5 | Validation + cleanup | 0.5 day | Pending |

---

## COMPLETED IN SESSION 2026-05-02 to 2026-05-03

- Block D5+: NIST/ENISA PDF quality (0% → 45%)
- Block D6: Citation backfill (3,651 controls)
- Block E2: 8 German laws (1,629 chunks)
- Block E3: 5 EU regulations (1,057 chunks)
- Block E4: GoBD, BAIT, VAIT (144 chunks)
- Block E6: 3 CH + 4 AT laws (3,881 chunks)
- Block E7: 9 court rulings as full text (709 chunks total)
  - Schrems II: 154, BVerfG Datenanalyse: 161, DSK OH Telemedien: 119
  - Meta: 101, BAG Zeiterfassung: 48, Planet49: 42, SCHUFA: 41
  - Schadenersatz: 29, Google Fonts: 14
- Infra: Qdrant snapshot, upload-before-delete, 99 tests

**Total new chunks this session: ~25,000+**

---

## TESTS

```bash
# Embedding service (99 tests)
cd embedding-service && python3 -m pytest test_chunking.py test_d4_bgb.py test_nist_normalization.py -v

# Control pipeline (387 tests)
PYTHONPATH=control-pipeline python3 -m pytest control-pipeline/tests/ -v

# Qdrant snapshot
ssh macmini "cd ~/Projekte/breakpilot-core && bash scripts/qdrant-snapshot.sh"
```

---

## PLAN FILE

Block F detailed plan: `/Users/benjaminadmin/.claude/plans/humming-nibbling-sonnet.md`