dbd44ecc2091c5c051d26c0bbd0ef3dd7047088f
Two idempotent scripts that complete Task #22 (300k atomic_controls reclassification) across both Postgres DBs and all Qdrant collections on Mac Mini + Production. backfill_license_rule.py - iterative parent_control_uuid inheritance with cycle cap - dry-run + apply modes, per-iteration row counts - residual-orphan cluster report for manual review backfill_qdrant_license_payload.py - joins canonical_controls.id (or regulation_id) → license_rule - scrolls + grouped set_payload per rule (3 batches per collection) - supports both lookup tables (canonical_controls / regulation_registry) - supports managed Qdrant via --qdrant-api-key (Production) Backfill bilance: - Mac Mini canonical_controls: 0 NULL (was 279,384) across 314,811 rows - Mac Mini Qdrant atomic_controls_dedup: 44,987 points patched - Mac Mini bp_compliance_gesetze: 37,634 points patched - Mac Mini bp_compliance_datenschutz: 11,338 points patched - Production canonical_controls: 0 NULL (was 259,914) across 294,027 rows - Production Qdrant bp_compliance_gesetze: 55,836 patched - Production Qdrant bp_compliance_datenschutz: 18,980 patched - Production Qdrant bp_compliance_ce: 23,239 patched Schema migration 002_regulation_registry.sql + 252 registry rows were replicated to Production (was missing — only existed on Mac Mini). 20 BSI/DE-Gesetz entries added to registry to close Qdrant lookup gap. 100% deterministic classification achieved on both DBs via: - parent_control_uuid inheritance (94% coverage) - control_parent_links.source_regulation → regulation_registry - source_citation->>'source' → regulation_registry - canonical_processed_chunks ground truth (chunk-validated) - ungrouped LLM-aggregate Vorfahren → own works (Rule 3) [migration-approved]
…
…
Description
No description provided
Languages
Python
38.3%
TypeScript
37.8%
Go
18.9%
HTML
3.2%
Shell
0.7%
Other
1.1%