Compare commits

6 Commits

Author SHA1 Message Date
Benjamin Admin a606000a20 feat(ai-sdk): EvidenceType layer, surfacing authoritative footnotes/tables/figures
Router layer chain Intent→KnowledgeSpace→EvidenceType→Collection→Merge→Authority (user
decision A, generalized). New EvidenceType{TEXT,FOOTNOTE,TABLE,FIGURE} plus
classifyEvidence (derived from the is_footnote/is_table/is_figure payload). RetrieveEvidence()
pulls the authoritative typed evidence specifically from the KB slice (top-20, in-scope) instead
of losing it in the broad-base text merge; /retrieve returns footnotes[]/tables[]/
figures[]. No blind perColl increase. The same infrastructure carries C8 (FIGURE) without a router rebuild.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 09:30:42 +02:00
Benjamin Admin 6f0c1cf30d feat(ai-sdk): /retrieve returns footnotes[] (C-FN evidence) for the Advisor workspace
Footnote hits are mapped from the Qdrant payload (is_footnote/footnote_label/
footnote_verbatim/ref_citation_unit) into internal LegalSearchResult fields (json:"-",
no per-result contract change) and extracted in the /retrieve handler as top-level
footnotes[] (frontend RawFootnote shape); the hits stay in results[]
(LLM context). figures[] is an empty C8 placeholder. Feeds the Evidence Workspace
(evidence-adapter.ts) of the frontend session.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 08:37:59 +02:00
Benjamin_Boenisch f0120b237e fix(ucca): Cross-Reg 0070, both domains in the router top-K (#47)
CI / detect-changes (push) Successful in 6s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 9s
CI / validate-canonical-controls (push) Successful in 5s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 59s
CI / iace-gt-coverage (push) Successful in 18s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-30 13:42:28 +00:00
Benjamin Admin 1d65d99d5f style(ucca): gocritic equalFold in balanceByRegulation (go-lint green)
CI / detect-changes (pull_request) Successful in 14s
CI / branch-name (pull_request) Successful in 2s
CI / guardrail-integrity (pull_request) Successful in 6s
CI / secret-scan (pull_request) Successful in 6s
CI / dep-audit (pull_request) Failing after 54s
CI / sbom-scan (pull_request) Failing after 58s
CI / build-sha-integrity (pull_request) Successful in 5s
CI / validate-canonical-controls (pull_request) Successful in 4s
CI / loc-budget (pull_request) Successful in 20s
CI / go-lint (pull_request) Successful in 43s
CI / python-lint (pull_request) Failing after 18s
CI / nodejs-lint (pull_request) Failing after 1m10s
CI / nodejs-build (pull_request) Successful in 3m1s
CI / test-go (pull_request) Successful in 1m4s
CI / iace-gt-coverage (pull_request) Successful in 16s
CI / test-python-backend (pull_request) Successful in 27s
CI / test-python-document-crawler (pull_request) Successful in 12s
CI / test-python-dsms-gateway (pull_request) Successful in 13s
strings.EqualFold(code, cv) instead of code==strings.ToUpper(cv): fixes the only
gocritic finding on the new line (CI go-lint, new-from-merge-base). Behavior
unchanged (case-insensitive exact regulation_code match); unit tests and the 0070 e2e stay green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 15:30:58 +02:00
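The matching rule named in this commit (case-insensitive, but exact rather than substring) can be illustrated with a small sketch. `matchesCode` is a hypothetical helper written for this example, not a function from the repository; only `strings.EqualFold` and the MaschVO code variants come from the source.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesCode (hypothetical helper) reports whether a payload regulation_code
// equals one of the catalog code values, ignoring case and surrounding
// whitespace but still requiring a whole-string match.
func matchesCode(code string, codeValues []string) bool {
	code = strings.TrimSpace(code)
	for _, cv := range codeValues {
		if strings.EqualFold(code, cv) {
			return true
		}
	}
	return false
}

func main() {
	maschVO := []string{"MASCHVO", "MaschVO", "MVO", "MASCHINENVO", "MACHINERY"}
	fmt.Println(matchesCode("mvo", maschVO))         // true: case is ignored
	fmt.Println(matchesCode(" Machinery ", maschVO)) // true: whitespace is trimmed
	fmt.Println(matchesCode("CRA", maschVO))         // false: different code
	fmt.Println(matchesCode("MVO-2", maschVO))       // false: no substring matching
}
```

`strings.EqualFold` compares under Unicode case folding without allocating an upper-cased copy, which is what the gocritic check favors.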
Benjamin Admin f2d445b891 fix(ucca): Cross-Reg 0070, both regulation domains in the router top-K (Known Defects 0)
CI / detect-changes (pull_request) Successful in 13s
CI / branch-name (pull_request) Successful in 1s
CI / guardrail-integrity (pull_request) Successful in 9s
CI / secret-scan (pull_request) Successful in 10s
CI / dep-audit (pull_request) Failing after 56s
CI / sbom-scan (pull_request) Failing after 59s
CI / build-sha-integrity (pull_request) Successful in 5s
CI / validate-canonical-controls (pull_request) Successful in 3s
CI / test-python-document-crawler (pull_request) Successful in 15s
CI / test-python-dsms-gateway (pull_request) Successful in 13s
CI / loc-budget (pull_request) Successful in 23s
CI / go-lint (pull_request) Failing after 51s
CI / python-lint (pull_request) Failing after 18s
CI / nodejs-lint (pull_request) Failing after 1m8s
CI / nodejs-build (pull_request) Successful in 3m6s
CI / test-go (pull_request) Successful in 1m3s
CI / iace-gt-coverage (pull_request) Successful in 18s
CI / test-python-backend (pull_request) Successful in 28s
The only open retrieval hard case: a query naming >=2 regulations
("CRA und Maschinenverordnung") returned only the keyword-dominant domain (CRA);
MaschVO dropped out. Three interacting causes, all fixed:

1. CodeValues mismatch: MaschVO is named differently per collection (slice MASCHVO ·
   gesetze MVO · ce MACHINERY/MASCHINENVO), but the catalog only had ["MASCHVO","MaschVO"],
   so the filter found MaschVO only in the slice. All variants are now CodeValues.
2. Per-collection truncation: the router passed perColl=3, so searchMultiRegulation fetched
   3+3=6 and cut to 3, which could lose a domain per collection. Multi-reg queries
   now get perColl = 3*len(regs).
3. The router score merge starved the non-dominant domain. The new balanceByRegulation()
   groups the merged pool by regulation (exact regulation_code match) and takes
   round-robin across the named domains, so every domain with hits is in the top-K.
   Generic over any named set; the single-domain path is unchanged.

Validation: Go unit test (balanceByRegulation: the dominant CRA no longer crowds out MaschVO);
0070 e2e against dev (Retrieve() → [CRA MVO CRA MVO CRA MVO CRA MASCHINENVO] = both
domains, previously only CRA); CB-100 sample REGR 0 (gain profile unchanged).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 15:08:18 +02:00
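As a worked trace of cause 3, the following sketch runs a round-robin balance over the same pool the unit test uses. It is a simplified stand-in under stated assumptions: the `result` and `regulation` types, `balance`, and `codes` are hypothetical names for this example, and the sketch omits the default of 8 that the real function applies when topK <= 0.

```go
package main

import (
	"fmt"
	"strings"
)

type result struct {
	RegulationCode string
	Score          float64
}

type regulation struct {
	Canonical  string
	CodeValues []string
}

// balance groups a score-ordered pool by named regulation, then takes one hit
// per regulation per round; leftover slots are filled from unmatched hits.
func balance(pool []result, regs []regulation, topK int) []result {
	byReg := make([][]result, len(regs))
	matched := make([]bool, len(pool))
	for ri, r := range regs {
		for pi, p := range pool {
			if matched[pi] {
				continue
			}
			for _, cv := range r.CodeValues {
				if strings.EqualFold(strings.TrimSpace(p.RegulationCode), cv) {
					byReg[ri] = append(byReg[ri], p)
					matched[pi] = true
					break
				}
			}
		}
	}
	out := make([]result, 0, topK)
	for round := 0; len(out) < topK; round++ {
		progressed := false
		for ri := range regs {
			if round < len(byReg[ri]) && len(out) < topK {
				out = append(out, byReg[ri][round])
				progressed = true
			}
		}
		if !progressed {
			break // every named regulation is exhausted
		}
	}
	for pi, p := range pool {
		if len(out) >= topK {
			break
		}
		if !matched[pi] {
			out = append(out, p)
		}
	}
	return out
}

// codes extracts the regulation codes for compact printing.
func codes(rs []result) []string {
	cs := make([]string, 0, len(rs))
	for _, r := range rs {
		cs = append(cs, r.RegulationCode)
	}
	return cs
}

func main() {
	regs := []regulation{
		{Canonical: "CRA", CodeValues: []string{"CRA"}},
		{Canonical: "MaschVO", CodeValues: []string{"MASCHVO", "MVO", "MACHINERY"}},
	}
	pool := []result{
		{"CRA", 0.99}, {"CRA", 0.98}, {"CRA", 0.97},
		{"NIST", 0.96}, {"MACHINERY", 0.70}, {"MVO", 0.65},
	}
	fmt.Println(codes(balance(pool, regs, 4))) // [CRA MACHINERY CRA MVO]
	fmt.Println(codes(balance(pool, regs, 6))) // [CRA MACHINERY CRA MVO CRA NIST]
}
```

A plain score cut at 4 would have returned three CRA hits plus NIST. The trade-off is that a lower-scored MaschVO hit displaces a higher-scored CRA or NIST hit, which is the intended behavior for queries that name both regulations.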
Benjamin_Boenisch 08086ee75f feat: Authority Router, Advisor collection-agnostic, KB-2026.1 live (#46)
CI / detect-changes (push) Successful in 6s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / validate-canonical-controls (push) Successful in 3s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m58s
CI / test-go (push) Successful in 1m0s
CI / iace-gt-coverage (push) Successful in 15s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-30 12:26:53 +00:00
7 changed files with 285 additions and 3 deletions
@@ -111,14 +111,75 @@ func (h *RAGHandlers) Retrieve(c *gin.Context) {
return
}
// Evidence-type layer: surface the authoritative typed evidence (footnotes/tables/figures) from
// the KB knowledge space SEPARATELY instead of losing it in the broad-base text merge.
// results[] remains the text context for the LLM plus the sources list.
ev := h.ragClient.RetrieveEvidence(c.Request.Context(), req.Query)
c.JSON(http.StatusOK, gin.H{
"query": req.Query,
"results": results,
"count": len(results),
"assessment": ucca.Assess(results),
"footnotes": footnotesFromEvidence(ev[ucca.EvidenceFootnote]),
"tables": tablesFromEvidence(ev[ucca.EvidenceTable]),
"figures": figuresFromEvidence(ev[ucca.EvidenceFigure]),
})
}
// footnotesFromEvidence maps FOOTNOTE evidence to the Evidence-Workspace RawFootnote shape.
func footnotesFromEvidence(rs []ucca.LegalSearchResult) []gin.H {
out := make([]gin.H, 0, len(rs))
for _, r := range rs {
out = append(out, gin.H{
"id": r.CitationUnit,
"ref": r.CitationUnit,
"number": r.FootnoteLabel,
"regulation_code": r.RegulationCode,
"regulation_short": r.RegulationShort,
"regulation_name": r.RegulationName,
"section": r.RefCitationUnit,
"text": r.FootnoteVerbatim,
})
}
return out
}
// tablesFromEvidence maps TABLE evidence (C6/C9). Key is present so the same Evidence-Type path
// carries tables the moment the UI adds a table section.
func tablesFromEvidence(rs []ucca.LegalSearchResult) []gin.H {
out := make([]gin.H, 0, len(rs))
for _, r := range rs {
out = append(out, gin.H{
"id": r.CitationUnit,
"caption": r.ArticleLabel,
"regulation_code": r.RegulationCode,
"regulation_short": r.RegulationShort,
"regulation_name": r.RegulationName,
"section": r.RefCitationUnit,
"text": r.Text,
})
}
return out
}
// figuresFromEvidence maps FIGURE evidence (C8). Empty until C8 populates figure units; image_url/
// caption/vision_summary get added here when C8 lands — same path, no router change.
func figuresFromEvidence(rs []ucca.LegalSearchResult) []gin.H {
out := make([]gin.H, 0, len(rs))
for _, r := range rs {
out = append(out, gin.H{
"figure_id": r.CitationUnit,
"caption": r.ArticleLabel,
"regulation_code": r.RegulationCode,
"regulation_short": r.RegulationShort,
"regulation_name": r.RegulationName,
"section": r.RefCitationUnit,
})
}
return out
}
// ListRegulations returns the list of available regulations in the corpus.
// GET /sdk/v1/rag/regulations
func (h *RAGHandlers) ListRegulations(c *gin.Context) {
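For a consumer of the extended /retrieve response, a decoding sketch might look like the following. The JSON keys are taken from the handler above; the Go type names, `decodeRetrieve`, the assumption that every footnote field is a string, and the sample body (placeholder values, not real output) are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Footnote mirrors the keys emitted by footnotesFromEvidence.
// Assumption: all fields serialize as JSON strings.
type Footnote struct {
	ID              string `json:"id"`
	Ref             string `json:"ref"`
	Number          string `json:"number"`
	RegulationCode  string `json:"regulation_code"`
	RegulationShort string `json:"regulation_short"`
	RegulationName  string `json:"regulation_name"`
	Section         string `json:"section"`
	Text            string `json:"text"`
}

// RetrieveResponse decodes only the evidence-related keys; results and
// assessment are ignored, tables and figures are kept as raw JSON.
type RetrieveResponse struct {
	Query     string            `json:"query"`
	Count     int               `json:"count"`
	Footnotes []Footnote        `json:"footnotes"`
	Tables    []json.RawMessage `json:"tables"`
	Figures   []json.RawMessage `json:"figures"`
}

// decodeRetrieve parses a /retrieve response body.
func decodeRetrieve(body []byte) (RetrieveResponse, error) {
	var r RetrieveResponse
	err := json.Unmarshal(body, &r)
	return r, err
}

func main() {
	// Made-up sample body with placeholder values.
	body := []byte(`{"query":"q","count":0,"results":[],
		"footnotes":[{"id":"u1","ref":"u1","number":"3","section":"u0","text":"sample"}],
		"tables":[],"figures":[]}`)
	resp, err := decodeRetrieve(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(resp.Footnotes), resp.Footnotes[0].Number, len(resp.Tables), len(resp.Figures))
	// prints: 1 3 0 0
}
```

Because the three mapper functions build their output with `make([]gin.H, 0, len(rs))`, an out-of-scope query serializes each key as `[]` rather than `null`, so a client can iterate without a nil check.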
@@ -55,6 +55,15 @@ func (c *LegalRAGClient) Retrieve(ctx context.Context, query string, topK int) (
collections = append(collections, c.kbSliceCollection)
}
// Cross-regulation queries (>=2 explicitly named regulations) get a larger per-collection budget
// so each collection's multi-regulation search isn't truncated down to the keyword-dominant
// domain; the final per-regulation balancing then guarantees every named domain in the top-K.
regs := detectRegulations(query)
perColl := routerPerCollectionTopK
if len(regs) >= 2 {
perColl = routerPerCollectionTopK * len(regs)
}
// Warm the full-text indexes sequentially first so the concurrent fan-out below only READS the
// shared textIndexEnsured map (the writes happen here, serialized) — closes the cold-start map
// race deterministically. Best-effort: a missing collection just stays un-indexed (hybrid then
@@ -71,19 +80,25 @@ func (c *LegalRAGClient) Retrieve(ctx context.Context, query string, topK int) (
wg.Add(1)
go func(i int, coll string) {
defer wg.Done()
- if res, err := c.searchInternal(ctx, coll, query, nil, routerPerCollectionTopK); err == nil {
+ if res, err := c.searchInternal(ctx, coll, query, nil, perColl); err == nil {
out[i] = res
}
}(i, coll)
}
wg.Wait()
- merged := make([]LegalSearchResult, 0, len(collections)*routerPerCollectionTopK)
+ merged := make([]LegalSearchResult, 0, len(collections)*perColl)
for _, r := range out {
merged = append(merged, r...)
}
merged = dedupResults(merged)
sort.SliceStable(merged, func(a, b int) bool { return merged[a].Score > merged[b].Score })
// Cross-regulation: guarantee every named domain is represented (0070-class fix) instead of
// letting a global score-sort starve the non-dominant domain.
if len(regs) >= 2 {
return balanceByRegulation(merged, regs, topK), nil
}
if len(merged) > topK {
merged = merged[:topK]
}
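Why the larger per-collection budget matters can be shown with a small sketch, assuming (as the commit message describes) that a multi-regulation search merges per-regulation hit lists and then cuts to the budget. The `hit` type, `mergeAndTruncate`, `domains`, the scores, and the score sort before the cut are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

type hit struct {
	Code  string
	Score float64
}

// mergeAndTruncate merges per-regulation hit lists, orders them by score and
// cuts the result to limit, which is where a weaker domain can disappear.
func mergeAndTruncate(perReg [][]hit, limit int) []hit {
	var merged []hit
	for _, hs := range perReg {
		merged = append(merged, hs...)
	}
	sort.SliceStable(merged, func(a, b int) bool { return merged[a].Score > merged[b].Score })
	if len(merged) > limit {
		merged = merged[:limit]
	}
	return merged
}

// domains returns the set of regulation codes present in hs.
func domains(hs []hit) map[string]bool {
	seen := map[string]bool{}
	for _, h := range hs {
		seen[h.Code] = true
	}
	return seen
}

func main() {
	const base = 3 // stands in for routerPerCollectionTopK (perColl=3 per the commit message)
	perReg := [][]hit{
		{{"CRA", 0.95}, {"CRA", 0.93}, {"CRA", 0.91}},
		{{"MVO", 0.72}, {"MVO", 0.70}, {"MVO", 0.68}},
	}
	fmt.Println(len(domains(mergeAndTruncate(perReg, base))))             // 1: only CRA survives
	fmt.Println(len(domains(mergeAndTruncate(perReg, base*len(perReg))))) // 2: both domains kept
}
```

Scaling the budget with the number of named regulations keeps single-regulation queries at the original cost.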
@@ -60,6 +60,36 @@ func hitDoc(results []LegalSearchResult, toks []string) bool {
return false
}
// TestMultiReg0070E2E (RUN_E2E=1) is the 0070 regression: a cross-regulation query (CRA + MaschVO)
// must return BOTH domains through the real Retrieve(), not just the keyword-dominant CRA.
func TestMultiReg0070E2E(t *testing.T) {
if os.Getenv("RUN_E2E") != "1" {
t.Skip("set RUN_E2E=1 + QDRANT_URL/OLLAMA_URL/QDRANT_API_KEY")
}
c := NewLegalRAGClient()
q := "Wie greifen CRA und Maschinenverordnung bei einer vernetzten Maschine ineinander?"
res, err := c.Retrieve(context.Background(), q, 8)
if err != nil {
t.Fatalf("retrieve: %v", err)
}
var hasCRA, hasMasch bool
var codes []string
for _, r := range res {
u := strings.ToUpper(r.RegulationCode)
codes = append(codes, u)
if strings.Contains(u, "CRA") {
hasCRA = true
}
if strings.Contains(u, "MASCH") || strings.Contains(u, "MACHIN") || u == "MVO" {
hasMasch = true
}
}
t.Logf("0070 top-8 codes: %v", codes)
if !hasCRA || !hasMasch {
t.Errorf("0070 must return BOTH domains via Retrieve(): CRA=%v MaschVO=%v", hasCRA, hasMasch)
}
}
// TestAuthorityRouterCB100 (RUN_E2E=1) drives the REAL Retrieve() over the ComplianceBench-100 against
// the live collections: NEW (scope routing on → slice added for in-scope queries) vs OLD (routing off
// → broad base only). It is the regression gate that the router actually delivers the proven slice
@@ -49,6 +49,38 @@ func TestRouterSliceSelection(t *testing.T) {
}
}
func TestBalanceByRegulation(t *testing.T) {
regs := []detectedRegulation{
{Canonical: "CRA", CodeValues: []string{"CRA"}},
{Canonical: "MaschVO", CodeValues: []string{"MASCHVO", "MVO", "MACHINERY"}},
}
// CRA dominates by score; without balancing the top-4 would be all CRA + NIST.
pool := []LegalSearchResult{
{RegulationCode: "CRA", Score: 0.99},
{RegulationCode: "CRA", Score: 0.98},
{RegulationCode: "CRA", Score: 0.97},
{RegulationCode: "NIST", Score: 0.96},
{RegulationCode: "MACHINERY", Score: 0.70},
{RegulationCode: "MVO", Score: 0.65},
}
out := balanceByRegulation(pool, regs, 4)
var hasCRA, hasMasch bool
for _, r := range out {
switch r.RegulationCode {
case "CRA":
hasCRA = true
case "MACHINERY", "MVO":
hasMasch = true
}
}
if !hasCRA || !hasMasch {
t.Errorf("both named domains must be represented: CRA=%v MaschVO=%v out=%v", hasCRA, hasMasch, out)
}
if out[0].RegulationCode != "CRA" || !(out[1].RegulationCode == "MACHINERY" || out[1].RegulationCode == "MVO") {
t.Errorf("round-robin should alternate domains, got %s then %s", out[0].RegulationCode, out[1].RegulationCode)
}
}
func TestDedupResults(t *testing.T) {
in := []LegalSearchResult{
{RegulationCode: "EDPB WP248", ArticleLabel: "III.B", Text: "lorem", Score: 0.7},
@@ -0,0 +1,68 @@
package ucca
import "context"
// EvidenceType classifies a retrieved unit by WHAT KIND of evidence it is, independent of its
// collection. Footnotes/tables/figures are Evidence Types, not collections. The Authority Router
// surfaces non-text evidence from the authoritative knowledge space (the KB slice) SEPARATELY from
// the merged text top-K, so fine-grained evidence isn't outranked by broad-base text.
//
// The layer this introduces: Intent -> Knowledge Space -> EvidenceType -> Collection -> Merge ->
// Authority. Today FOOTNOTE is populated; FIGURE arrives with C8 and TABLE is already present from
// C6/C9 — no router rebuild needed, the same path carries every new evidence type.
type EvidenceType string
const (
EvidenceText EvidenceType = "text"
EvidenceFootnote EvidenceType = "footnote"
EvidenceTable EvidenceType = "table"
EvidenceFigure EvidenceType = "figure"
)
// classifyEvidence derives the EvidenceType from a result's payload markers. Precedence
// footnote > figure > table > text (a unit carries at most one is_* marker in practice).
func classifyEvidence(r LegalSearchResult) EvidenceType {
switch {
case r.IsFootnote:
return EvidenceFootnote
case r.IsFigure:
return EvidenceFigure
case r.IsTable:
return EvidenceTable
default:
return EvidenceText
}
}
// evidenceRetrievalTopK is the budget for the authoritative-KB evidence pass. Deliberately targeted
// (the authoritative slice within the recognized knowledge space), NOT a blanket top-K increase of
// the merged result set — the successes came from BETTER-targeted evidence, not MORE evidence.
const evidenceRetrievalTopK = 20
// maxEvidencePerType caps each surfaced evidence type.
const maxEvidencePerType = 6
// RetrieveEvidence returns the authoritative typed evidence (footnotes/tables/figures) for an
// in-scope query, pulled from the KB slice and grouped by EvidenceType. This is the "Evidence Type"
// router layer (Option A): when the query is in the KB knowledge space, the authoritative evidence
// within that space is surfaced separately so it isn't lost in the broad-base text merge. Returns an
// empty map when out of scope or KB routing is disabled. Text evidence is NOT returned here — it
// flows through the normal Retrieve() merge (the LLM context + the sources list).
func (c *LegalRAGClient) RetrieveEvidence(ctx context.Context, query string) map[EvidenceType][]LegalSearchResult {
ev := map[EvidenceType][]LegalSearchResult{}
if !c.kbScopeRoutingEnabled || c.kbSliceCollection == "" || !inKBScope(query) {
return ev
}
hits, err := c.searchInternal(ctx, c.kbSliceCollection, query, nil, evidenceRetrievalTopK)
if err != nil {
return ev
}
for _, h := range hits {
t := classifyEvidence(h)
if t == EvidenceText || len(ev[t]) >= maxEvidencePerType {
continue
}
ev[t] = append(ev[t], h)
}
return ev
}
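The precedence and per-type cap in this file can be traced on a toy input. This is a simplified sketch: `unit`, `kind`, and `group` are hypothetical stand-ins for LegalSearchResult, classifyEvidence, and the loop in RetrieveEvidence, and the cap of 2 is chosen only to keep the example short (the source uses 6).

```go
package main

import "fmt"

type unit struct {
	ID         string
	IsFootnote bool
	IsTable    bool
	IsFigure   bool
}

// kind applies the same precedence as the source: footnote > figure > table > text.
func kind(u unit) string {
	switch {
	case u.IsFootnote:
		return "footnote"
	case u.IsFigure:
		return "figure"
	case u.IsTable:
		return "table"
	default:
		return "text"
	}
}

// group buckets non-text hits by kind and keeps at most maxPerType per bucket,
// preserving the incoming (score) order.
func group(hits []unit, maxPerType int) map[string][]unit {
	out := map[string][]unit{}
	for _, h := range hits {
		k := kind(h)
		if k == "text" || len(out[k]) >= maxPerType {
			continue
		}
		out[k] = append(out[k], h)
	}
	return out
}

func main() {
	hits := []unit{
		{ID: "a", IsFootnote: true},
		{ID: "b"}, // plain text: skipped
		{ID: "c", IsTable: true},
		{ID: "d", IsFootnote: true},
		{ID: "e", IsFootnote: true}, // third footnote: dropped by the cap of 2
	}
	g := group(hits, 2)
	fmt.Println(len(g["footnote"]), len(g["table"]), len(g["figure"]), len(g["text"]))
	// prints: 2 1 0 0
}
```

Because hits are visited in the order the search returned them, the cap keeps the highest-ranked units of each type.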
@@ -37,6 +37,17 @@ type LegalSearchResult struct {
// Supersede status (status="superseded", use_for_primary=false): legacy source that is
// demoted for default questions (not hidden; still findable for history).
Superseded bool `json:"-"`
// Evidence-type markers: internal (json:"-", no per-result contract change), populated from the
// Qdrant payload. classifyEvidence() derives the EvidenceType from them; the router surfaces
// non-text evidence (footnote/table/figure) separately from the text merge so that
// fine-grained evidence is not outranked by broad-base text.
IsFootnote bool `json:"-"`
FootnoteLabel string `json:"-"`
FootnoteVerbatim string `json:"-"`
RefCitationUnit string `json:"-"`
IsTable bool `json:"-"` // C6/C9: is_table (ruled + borderless)
IsFigure bool `json:"-"` // C8: is_figure (not populated until C8)
}
// LegalAssessment is the auditable explanation layer over a ranked result set:
@@ -20,7 +20,9 @@ var regulationCatalog = []struct {
CodeValues []string
}{
{"CRA", []string{"cra", "cyber resilience"}, []string{"CRA"}},
- {"MaschVO", []string{"maschinenverordnung", "maschvo", "machinery regulation"}, []string{"MASCHVO", "MaschVO"}},
+ // MaschVO is named differently per collection: slice MASCHVO · gesetze MVO · ce MACHINERY/MASCHINENVO.
+ // All variants are listed as CodeValues; otherwise the per-reg filter finds MaschVO only in the slice (0070).
+ {"MaschVO", []string{"maschinenverordnung", "maschvo", "machinery regulation"}, []string{"MASCHVO", "MaschVO", "MVO", "MASCHINENVO", "MACHINERY"}},
{"NIS2", []string{"nis2", "nis-2", "nis 2"}, []string{"NIS2"}},
{"DORA", []string{"dora"}, []string{"DORA"}},
{"Data Act", []string{"data act", "datengesetz"}, []string{"DATA ACT", "DataAct"}},
@@ -53,6 +55,62 @@ func detectRegulations(query string) []detectedRegulation {
func hitID(h qdrantSearchHit) string { return fmt.Sprintf("%v", h.ID) }
// balanceByRegulation builds the final top-K so EVERY explicitly-named regulation with hits is
// represented, instead of letting the keyword-dominant domain (e.g. CRA) crowd out the other
// (e.g. MaschVO) in a cross-regulation query. The input pool must already be score-ordered;
// results are grouped by exact regulation_code match against each regulation's CodeValues, then
// taken round-robin across the named domains (highest-scored first within each), with any
// remaining slots filled by the leftover pool in score order. Generic; no doc-specific logic.
func balanceByRegulation(pool []LegalSearchResult, regs []detectedRegulation, topK int) []LegalSearchResult {
if topK <= 0 {
topK = 8
}
byReg := make([][]LegalSearchResult, len(regs))
matched := make([]bool, len(pool))
for ri, r := range regs {
for pi := range pool {
if matched[pi] {
continue
}
code := strings.TrimSpace(pool[pi].RegulationCode)
for _, cv := range r.CodeValues {
if strings.EqualFold(code, cv) {
byReg[ri] = append(byReg[ri], pool[pi])
matched[pi] = true
break
}
}
}
}
out := make([]LegalSearchResult, 0, topK)
idx := make([]int, len(regs))
for len(out) < topK {
progressed := false
for ri := range regs {
if idx[ri] < len(byReg[ri]) {
out = append(out, byReg[ri][idx[ri]])
idx[ri]++
progressed = true
if len(out) >= topK {
break
}
}
}
if !progressed {
break
}
}
for pi := range pool {
if len(out) >= topK {
break
}
if !matched[pi] {
out = append(out, pool[pi])
}
}
return out
}
// searchMultiRegulation retrieves each explicitly-named regulation SEPARATELY (per-regulation
// filter) and merges, so a cross-regulation query ("Wie greifen CRA und MaschVO ineinander?")
// returns BOTH domains in the prompt instead of only the keyword-dominant one. Generic over any
@@ -137,6 +195,13 @@ func hitsToResults(hits []qdrantSearchHit) []LegalSearchResult {
ReferencesOut: getStringSlice(hit.Payload, "references_out"),
ReferencesIn: getStringSlice(hit.Payload, "references_in"),
Superseded: getString(hit.Payload, "status") == "superseded",
IsFootnote: getBool(hit.Payload, "is_footnote"),
FootnoteLabel: getString(hit.Payload, "footnote_label"),
FootnoteVerbatim: getString(hit.Payload, "footnote_verbatim"),
RefCitationUnit: getString(hit.Payload, "ref_citation_unit"),
IsTable: getBool(hit.Payload, "is_table"),
IsFigure: getBool(hit.Payload, "is_figure"),
}
}
return results
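The `getBool` helper used in this hunk is not part of the diff. A minimal sketch of what such a helper could look like, assuming the payload is a `map[string]interface{}` decoded from JSON; the actual signature and behavior in the repository may differ.

```go
package main

import "fmt"

// getBool (assumed shape) reads a boolean payload field and returns false when
// the key is missing or holds a non-bool value such as the string "true".
func getBool(payload map[string]interface{}, key string) bool {
	v, ok := payload[key].(bool)
	return ok && v
}

func main() {
	payload := map[string]interface{}{
		"is_footnote": true,
		"is_table":    "true", // wrong type: treated as false
	}
	fmt.Println(getBool(payload, "is_footnote")) // true
	fmt.Println(getBool(payload, "is_table"))    // false
	fmt.Println(getBool(payload, "is_figure"))   // false: key absent
}
```

Defaulting to false means units ingested before the is_* markers existed classify as text evidence and keep flowing through the normal Retrieve() merge.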