feat(controls): citability: embedding re-link + atom inheritance

Switched citation_backfill Tier-1 from the dead sha256 hash lookup to a semantic
search against the re-ingested chunks that carry article_label (the citation is
taken from article_label); rag_client passes article_label through (additive,
defaulted field).
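
A minimal sketch of the re-link idea, not the actual citation_backfill code: a hash lookup only matches byte-identical text, so it stops matching once chunks are re-ingested, whereas a nearest-neighbour search over embeddings can still find the chunk and read its article_label. The function names, the toy 2-d vectors, and the 0.8 threshold are all assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relink(control_vec, chunks, min_score=0.8):
    """Return the article_label of the best-matching chunk, or None.

    `chunks` is a list of dicts with hypothetical keys "vector" and
    "article_label"; `min_score` is an assumed cut-off, not a value
    taken from the commit.
    """
    best, best_score = None, -1.0
    for chunk in chunks:
        score = cosine(control_vec, chunk["vector"])
        if score > best_score:
            best, best_score = chunk, score
    if best is None or best_score < min_score:
        return None
    # Chunks without a label cannot supply a citation.
    return best.get("article_label") or None
```

In a real pipeline the vectors would come from the embedding model and the search from the vector store; the loop here only stands in for that query.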
NEW: scripts/atom_citation_inheritance.py propagates source_citation from parent
to atom (license_rule != 3), iteratively. Apply on macmini verified: citability
6.9% -> 61.3% (+171,765 atoms); spot check correct (atom citation == parent citation).
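
The iterative parent-to-atom inheritance could look roughly like the sketch below. It is not the content of scripts/atom_citation_inheritance.py: the in-memory dict layout, the key names other than source_citation and license_rule, and the reading that license_rule is checked on the atom itself are assumptions.

```python
def inherit_citations(atoms):
    """Copy source_citation from parent to child until nothing changes.

    `atoms` maps an id to a dict with the keys "parent_id",
    "source_citation" and "license_rule" (hypothetical layout).
    Atoms with license_rule == 3 are skipped. Mutates `atoms` in place
    and returns the number of atoms that received a citation.
    """
    filled = 0
    changed = True
    while changed:  # repeat so multi-level chains are resolved
        changed = False
        for atom in atoms.values():
            if atom.get("source_citation") or atom.get("license_rule") == 3:
                continue
            parent = atoms.get(atom.get("parent_id"))
            if parent and parent.get("source_citation"):
                atom["source_citation"] = parent["source_citation"]
                filled += 1
                changed = True
    return filled
```

The loop repeats because a grandchild can only inherit after its direct parent has been filled in an earlier pass.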

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Benjamin Admin
2026-06-21 14:17:57 +02:00
parent ff4a743558
commit de542633e2
3 changed files with 238 additions and 111 deletions
+3
@@ -33,6 +33,7 @@ class RAGSearchResult:
     paragraph: str
     source_url: str
     score: float
+    article_label: str = ""
     collection: str = ""
     page: Optional[int] = None
@@ -90,6 +91,7 @@ class ComplianceRAGClient:
                     regulation_short=r.get("regulation_short", ""),
                     category=r.get("category", ""),
                     article=r.get("article", ""),
+                    article_label=r.get("article_label", ""),
                     paragraph=r.get("paragraph", ""),
                     source_url=r.get("source_url", ""),
                     score=r.get("score", 0.0),
@@ -171,6 +173,7 @@ class ComplianceRAGClient:
                     regulation_short=r.get("regulation_short", ""),
                     category=r.get("category", ""),
                     article=r.get("article", ""),
+                    article_label=r.get("article_label", ""),
                     paragraph=r.get("paragraph", ""),
                     source_url=r.get("source_url", ""),
                     score=0.0,
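
Why the new field is additive can be illustrated with a reduced stand-in for the result class. This is a sketch under assumptions: `SearchResult` and `from_payload` are hypothetical names, and only a subset of the real fields is shown. Because article_label has a default and is read with `.get(..., "")`, payloads that predate the field still parse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    # Reduced, hypothetical stand-in for RAGSearchResult.
    article: str
    paragraph: str
    source_url: str
    score: float
    article_label: str = ""  # new field; the default keeps old callers working
    collection: str = ""
    page: Optional[int] = None


def from_payload(r: dict) -> SearchResult:
    """Build a result from a raw payload dict, tolerating missing keys."""
    return SearchResult(
        article=r.get("article", ""),
        article_label=r.get("article_label", ""),
        paragraph=r.get("paragraph", ""),
        source_url=r.get("source_url", ""),
        score=r.get("score", 0.0),
    )
```

Existing code that constructs the class by keyword without article_label keeps working, since the field falls back to the empty string.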