feat(ai-sdk): legal-corpus structure endpoint + coverage page

Expose GET /sdk/v1/rag/legal-corpus, which scrolls the eur-lex legal
corpus (filtered to a few hundred points regardless of total size) and
aggregates each ingested act's composition: the number of distinct
articles, annexes and recitals, plus the chunk count. Surface it as a new
section on /sdk/coverage so the ingested corpus is no longer a black box:
a developer sees what each act actually contains, not only its name.
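
The aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the Go SDK's actual implementation: only `regulation_short`, `regulation_name` and the `LegalActStructure` shape come from this commit, while the `ChunkPayload` type, its `article`/`annex`/`recital` fields and the `aggregateActs` helper are hypothetical names chosen for the example.

```typescript
// Hypothetical payload of one scrolled vector-store point (one chunk).
// Only regulation_short and regulation_name appear in this commit; the
// article/annex/recital field names are assumptions for illustration.
interface ChunkPayload {
  regulation_short: string
  regulation_name: string
  article?: string
  annex?: string
  recital?: string
}

// Same shape as the LegalActStructure interface added in this commit.
interface LegalActStructure {
  regulation_short: string
  regulation_name: string
  articles: number
  annexes: number
  recitals: number
  chunks: number
}

// Per-act accumulator: sets deduplicate identifiers, chunks counts points.
interface ActAccumulator {
  name: string
  articles: Set<string>
  annexes: Set<string>
  recitals: Set<string>
  chunks: number
}

// Group points by act, then count distinct articles, annexes and
// recitals plus the total number of chunks per act.
function aggregateActs(points: ChunkPayload[]): LegalActStructure[] {
  const acts = new Map<string, ActAccumulator>()
  for (const p of points) {
    let act = acts.get(p.regulation_short)
    if (!act) {
      act = {
        name: p.regulation_name,
        articles: new Set<string>(),
        annexes: new Set<string>(),
        recitals: new Set<string>(),
        chunks: 0,
      }
      acts.set(p.regulation_short, act)
    }
    act.chunks += 1
    if (p.article) act.articles.add(p.article)
    if (p.annex) act.annexes.add(p.annex)
    if (p.recital) act.recitals.add(p.recital)
  }
  // A Map iterates in insertion order, so acts appear in first-seen order.
  return Array.from(acts, ([short, a]) => ({
    regulation_short: short,
    regulation_name: a.name,
    articles: a.articles.size,
    annexes: a.annexes.size,
    recitals: a.recitals.size,
    chunks: a.chunks,
  }))
}
```

Using sets keeps the counts distinct even when one article is split across several chunks, while `chunks` still reflects every scrolled point.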

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Benjamin Admin
2026-06-23 19:47:17 +02:00
parent b83c3e6e00
commit 4c99773fa1
6 changed files with 352 additions and 3 deletions
@@ -46,6 +46,28 @@ export interface CorpusOverview {
  totals: { documents: number; catalog_sources: number }
}
// --- Ingested legal-corpus structure (from the vector store, via the Go SDK).
// Shows WHAT each eur-lex act consists of (articles/annexes/recitals), so the
// ingested corpus is not a black box for developers. ---
export interface LegalActStructure {
  regulation_short: string
  regulation_name: string
  articles: number
  annexes: number
  recitals: number
  chunks: number
}
export interface LegalCorpus {
  regulations: LegalActStructure[]
  totals: {
    regulations: number
    articles: number
    annexes: number
    recitals: number
  }
}
// --- Corpus documents: group by kind (law/guideline/standard/ruling)
// + publisher family (DSK, EDPB, OWASP, NIST …). Deterministic, pure. ---
interface DocCat {