docs: add session handover instructions for next session
Covers: completed blocks A-D1, remaining D2-G, critical files, DB state, memory files, test commands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,99 @@
# Session Handover: Overall Pipeline Plan

**Date:** 2026-05-01
**Handed over by:** Pipeline session (26 Apr to 1 May 2026, ~6 days)

## What was completed

| Block | What | Status |
|-------|------|--------|
| **Pass 0b** | 173,471 atomic controls generated | ✅ $750 API |
| **Dedup** | 173k → 151,675 unique (21,796 duplicates) | ✅ ~30h Mac Mini |
| **Block A** | v1 tag, dependencies (15,291), healthcheck, text correction | ✅ |
| **Block B** | Review-Verify (67k pairs, 43,527 DUPLIKAT) | ✅ $17 Haiku |
| **Block C** | Adversarial tests (30 cases), regression harness (371 tests) | ✅ |
| **Block D1** | Structural chunking endpoint (metadata extraction) | ✅ deployed |

## What to do next

### Block D2-D6: Finish structural chunking

D1 (embedding-service metadata) is deployed. Next steps:

1. **D2: Extend RAG upload** (`rag-service/api/documents.py`)
   - Store the new payload fields in Qdrant: section, section_title, paragraph, page
   - Use `chunks_with_metadata` from the embedding service

2. **D3: Adapt control generator** (`control-pipeline/services/control_generator.py`)
   - Prefer structural metadata from the Qdrant payload
   - Extend source_citation with the page number

3. **D4: Test with BGB § 312**
   - Upload 1 document with the new chunking
   - Check whether § 312k gets its own chunk

4. **D5-D6: Re-ingest all 297 sources** (large effort)

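The D2 to D4 steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the shape of a `chunks_with_metadata` entry (a dict with a `text` key plus the four metadata fields), the helper names `build_payload`, `format_source_citation` and `has_own_chunk`, the citation format, and all sample values are hypothetical and not taken from the actual services.

```python
from typing import Any

# Field names as listed for D2; everything else in this sketch is assumed.
METADATA_FIELDS = ("section", "section_title", "paragraph", "page")


def build_payload(chunk: dict[str, Any]) -> dict[str, Any]:
    """Build a payload dict from one (assumed) chunks_with_metadata entry."""
    payload: dict[str, Any] = {"text": chunk["text"]}
    for field in METADATA_FIELDS:
        # Missing metadata is stored as None so every payload has the same keys.
        payload[field] = chunk.get(field)
    return payload


def format_source_citation(source: str, payload: dict[str, Any]) -> str:
    """Compose a citation from structural metadata and append the page (D3)."""
    parts = [source]
    if payload.get("paragraph"):
        parts.append(str(payload["paragraph"]))
    if payload.get("page") is not None:
        parts.append(f"p. {payload['page']}")
    return ", ".join(parts)


def has_own_chunk(chunks: list[dict[str, Any]], paragraph: str) -> bool:
    """D4-style check: exactly one chunk is tagged with the given paragraph."""
    return sum(1 for c in chunks if c.get("paragraph") == paragraph) == 1


# Made-up sample entries; texts, titles and page numbers are placeholders.
chunks_with_metadata = [
    {"text": "sample text A", "section": "sample section",
     "section_title": "sample title", "paragraph": "§ 312", "page": 1},
    {"text": "sample text B", "section": "sample section",
     "section_title": "sample title", "paragraph": "§ 312k", "page": 2},
]

payloads = [build_payload(c) for c in chunks_with_metadata]
print(format_source_citation("BGB", payloads[1]))
print(has_own_chunk(chunks_with_metadata, "§ 312k"))
```

Keeping all four keys in every payload, even when a value is unknown, is one possible design choice: it lets the control generator fall back cleanly when a chunk was ingested before the new chunking.
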
### Block E: Update and ingest laws

See plan: update the BGB, add missing laws, build court-ruling control packs.
**IMPORTANT:** 16 rulings must be downloaded MANUALLY (WebFetch does not work).
Download list: `legal-sources/urteile/DOWNLOAD_LIST.md`

### Block F: Hardcoded Knowledge Migration

6 files in the compliance backend contain hardcoded legal knowledge.
Instruction file: `breakpilot-compliance/zeroclaw/INSTRUCTION-hardcoded-knowledge-migration.md`

### Block G-pre: Master Control Consolidation

Consolidate the 151k controls into 15-25k master groups via clustering.

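The plan does not say which clustering method or features Block G-pre will use. Purely as an illustration of the idea, the sketch below applies greedy leader clustering to toy 2-D vectors with a cosine-similarity threshold; the function names, the threshold of 0.95, and the vectors are all assumptions, and a real run over 151k controls would need embeddings and a more scalable method.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def leader_cluster(vectors: Sequence[Sequence[float]],
                   threshold: float) -> list[list[int]]:
    """Greedy leader clustering.

    Each vector joins the first group whose leader (its first member) is at
    least `threshold` similar; otherwise it starts a new group.
    """
    leaders: list[Sequence[float]] = []
    groups: list[list[int]] = []
    for idx, vec in enumerate(vectors):
        for group_idx, leader in enumerate(leaders):
            if cosine(vec, leader) >= threshold:
                groups[group_idx].append(idx)
                break
        else:
            # No leader was similar enough: this vector becomes a new leader.
            leaders.append(vec)
            groups.append([idx])
    return groups


# Toy vectors standing in for control embeddings (hypothetical values).
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (1.0, 0.05)]
groups = leader_cluster(vectors, threshold=0.95)
print(groups)  # [[0, 1, 4], [2, 3]]
```

With this kind of approach the threshold controls the number of groups, so reaching the 15-25k target would be a matter of tuning it on real data.
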
## Critical files

| File | Repo | Change |
|------|------|--------|
| `embedding-service/main.py` | core | D1 DONE: metadata extraction |
| `rag-service/api/documents.py` | core | D2: payload fields in Qdrant |
| `control-pipeline/services/control_generator.py` | core | D3: use metadata |
| `control-pipeline/services/batch_dedup_runner.py` | core | Checkpoint logic (done) |
| `control-pipeline/services/dependency_engine.py` | core | Dependency engine (done) |
| `control-pipeline/services/dependency_generator.py` | core | Auto-generation (done) |

## DB state

| Table | Local (Mac Mini) | Production (Hetzner) |
|-------|------------------|----------------------|
| canonical_controls | 291,402 | 291,402 (but without Block B duplicates) |
| obligation_candidates | 234,538 | 234,538 |
| control_dependencies | 15,294 | 15,294 |
| Pass 0b drafts | 151,675 | ~162,387 (Block B missing) |

**Production sync required** once all blocks are complete.

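To make the gap between the two environments explicit, the counts from the table above can be compared in a few lines. This is a minimal sketch: the dictionary keys (in particular `pass_0b_drafts`, which is a label here and not a confirmed table name) and the helper `count_drift` are hypothetical, and the production draft count is approximate in the source.

```python
# Row counts copied from the DB state table; the production draft count is
# given as "~162,387" there, so treat the resulting delta as approximate.
LOCAL = {
    "canonical_controls": 291_402,
    "obligation_candidates": 234_538,
    "control_dependencies": 15_294,
    "pass_0b_drafts": 151_675,  # hypothetical key for the "Pass 0b drafts" row
}
PRODUCTION = {
    "canonical_controls": 291_402,
    "obligation_candidates": 234_538,
    "control_dependencies": 15_294,
    "pass_0b_drafts": 162_387,
}


def count_drift(local: dict[str, int], production: dict[str, int]) -> dict[str, int]:
    """Return {table: production - local} for every table whose counts differ."""
    drift: dict[str, int] = {}
    for table in sorted(set(local) | set(production)):
        delta = production.get(table, 0) - local.get(table, 0)
        if delta != 0:
            drift[table] = delta
    return drift


drift = count_drift(LOCAL, PRODUCTION)
print(drift)  # {'pass_0b_drafts': 10712}
```

Equal row counts do not prove the data is in sync: `canonical_controls` matches in size, yet production lacks the Block B duplicate results, so a count comparison is only a first check before the actual sync.
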
## Memory files (IMPORTANT: read them!)

All under `/Users/benjaminadmin/.claude/projects/-Users-benjaminadmin-Projekte-breakpilot-core/memory/`:

- `MEMORY.md`: index of all memories
- `feedback_batch_api_safety.md`: NEVER use curl retry for batch submits
- `feedback_no_hardcoded_knowledge.md`: no hardcoded legal knowledge
- `feedback_legal_source_licensing.md`: Rule 1/2/3 license check
- `project_structural_chunking.md`: architecture decision
- `project_missing_legal_sources.md`: missing laws + 20 rulings
- `project_rag_version_audit.md`: 297 sources, BGB outdated
- `project_delta_pipeline.md`: diff strategy for law updates
- `project_test_strategy.md`: 4-level test strategy
- `project_compliance_execution_layer.md`: moat strategy

## Plan file

`/Users/benjaminadmin/.claude/plans/jazzy-snacking-creek.md`: complete plan with blocks A-G.

## Tests

```bash
cd /Users/benjaminadmin/Projekte/breakpilot-core
PYTHONPATH=control-pipeline python3 -m pytest control-pipeline/tests/ -v
# Result: 371 passed, 33 skipped
```