Compare commits

...

47 Commits

Author SHA1 Message Date
Claude 47af5b833f merge(admin): FE advisor widget — topic threads, per-question delete/copy, fullscreen
2026-07-01 19:07:48 +02:00
Claude 0903e3a8d1 perf(ai-sdk): embed query once across router fan-out + fold umlauts in intent/concept matching
Authority Router re-embedded the query per collection (6x); on dev the embed
endpoint (OVH) is remote so that was 6 round-trips = 7-12s per /retrieve. Embed
once, reuse via ctx across the concurrent per-collection searches.
DetectIntent + ConceptNorms now fold ae/oe/ue/ss so ASCII (Pruefe) and umlaut
(Prüfe) inputs both match.
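One way such folding could look, as a sketch only: it assumes the fold direction is umlaut to ASCII digraph (ä→ae, ö→oe, ü→ue, ß→ss) after lower-casing. The helper name `foldUmlauts` is hypothetical and not taken from the repository:

```go
package main

import (
	"fmt"
	"strings"
)

// umlautFolder maps German umlauts and sharp s to their ASCII digraphs.
var umlautFolder = strings.NewReplacer("ä", "ae", "ö", "oe", "ü", "ue", "ß", "ss")

// foldUmlauts lower-cases s and folds umlauts so both spellings compare equal.
func foldUmlauts(s string) string {
	return umlautFolder.Replace(strings.ToLower(s))
}

func main() {
	// Both spellings normalize to "pruefe".
	fmt.Println(foldUmlauts("Prüfe") == foldUmlauts("Pruefe")) // prints "true"
}
```

Folding both the query and the match terms through the same function keeps the comparison symmetric.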
2026-07-01 19:03:11 +02:00
Benjamin Admin e8ea179228 feat(advisor): topic threads, per-question delete/copy, fullscreen
Adds case management to the Compliance Advisor widget.

- topic threads: cases group into threads; the left menu shows each
  thread's first question as the Thema with expandable follow-ups.
  Send = follow-up to the active thread (carries the thread's prior Q&A
  as history for contextual answers); "+" starts a new topic.
- delete: a trash action per question (menu + stacked view).
- copy: single Q&A (question + answer + evidence + footnotes) or a whole
  thread, as Markdown to the clipboard (pure formatters in copy.ts).
- fullscreen: compact -> panel -> fullscreen view.
- route.ts consumes an optional bounded `history` so follow-ups are
  contextual for both the widget and the workspace consumer.

Tests: copy formatter unit tests + Playwright specs (threads/new-topic,
delete, fullscreen, copy affordance). No deploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 18:51:17 +02:00
Claude cf2cea437e merge(admin): FE Evidence-Workspace → main (evidence-framed header + bindingness)
2026-07-01 17:01:42 +02:00
Claude 75501d106a merge(ai-sdk): Advisor Reasoning Stack → main (Clarity+G1+Concept+Scope+Term+Intent) 2026-07-01 17:01:42 +02:00
Claude e901447096 feat(ai-sdk): Advisor Reasoning Stack — Clarity+G1+Concept-Injector+Context-Scope+Term-Resolution+E4-Curation+Intent-Signal 2026-07-01 15:27:23 +02:00
Benjamin Admin a9b04e5286 feat(advisor): evidence-framed header + bindingness contract seam
Rework the Compliance Advisor header ("Diese Antwort stuetzt sich auf")
to describe the EVIDENCE rather than the documents: binding
Rechtsgrundlagen split from Leitlinien (soft-law guidance), a
per-regulation breakdown, plus Abbildungen, Fussnoten and Evidence Units.
No fabricated trust score — objective counts only.

- bindingness is a canonical Legal-KG fact (APEX rule): added an optional
  EvidenceUnit.bindingness contract seam; the FE renders the split from it
  and degrades to a neutral per-regulation breakdown when it is absent
  (SDK/RAG asked via board to populate it in /retrieve).
- evidence-grouping.ts: pure, tested grouping/counting model.
- route.ts: optional `audience` field (tonality) kept out of the retrieval
  question; answers lead with a "Kurz gesagt" summary, structured by theme.
- E2E + unit tests updated for the evidence framing.

Not deployed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 15:17:21 +02:00
Benjamin Admin 62da36872f docs(citability): generic legal-reference envelope as a Phase B design requirement
User addition: norm_ids are only the first kind of legal reference. The runtime contract (Phase B)
should carry an extensible envelope (legal_reference: {norm_ids, citation_units, recital_ids,
guidance_ids, case_law_ids, interpretation_ids}) so that new reference kinds land additively as
optional keys without another contract/Go-struct rework. Naming note: NOT
"legal_basis" (collides with the obligation array). The binding ⊥ guidance separation is preserved.
Phase B builds ONLY norm_ids; the rest is reserved space, not a build order. Spec only, no code.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 14:35:56 +02:00
Benjamin Admin 2a8abb6235 docs(citability): document Weg-1 data readiness (Obligation->Norm), UI deferred
User request: confirm and document data readiness for Weg 1 (path 1); do NOT build any UI.
New spec section "Weg 1 — Obligation->Norm-Zitierfähigkeit" (obligation->norm citability):
- Two citation levels kept separate: RAG evidence citation (Advisor today, /retrieve) ⊥
  obligation->norm citation (norm_id join, this registry); both are needed in the long run.
- Data readiness: obligation norm_ids READY (62/64 joinable) + KB-v2 targets confirmed;
  GAP = the runtime contract obligation_join_keys.json does NOT carry norm_ids (only citation_units),
  Go ObligationKey has no NormIDs, AssessObligationStatus hard-codes CitationSpans:"pending".
- Deferred sequence (not built): 1 data prep norm_ids->join_keys (Domain 2) ->
  2 Go build (CitationSpans from norm_ids, coordinated) -> 3 UI ("Diese Pflicht beruht auf ...").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 14:31:32 +02:00
Benjamin Admin f37081b60b test(advisor): E2E for the Clarity-Gate chain (Playwright, stubbed endpoint)
Per the "always write E2E" rule: drives the floating advisor widget end-to-end against a
stubbed /api/sdk/compliance-advisor/chat with contract fixtures — clarify (L1 + context
chips), answer ([n] citation + evidence pane), and clarify->pick-context->scoped-answer.
No backend needed (route interception). Runs on CI/macmini (Next app on :3002); validated
here via tsc + `playwright --list` (3 tests discovered). check-loc 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 14:06:02 +02:00
Benjamin Admin 4027470855 feat(pipeline): scope audit as a fixed review step (flag-only, non-blocking)
#3 (user): add the scope audit to the cut process: non-invasive, flag only;
reclassification stays human/coordinated (affects join_keys + Compliance Execution).

- validate_registry.py now surfaces, per cut, authority-/institution-addressed obligations
  WITHOUT a scope classification as a non-fatal warning (does NOT change the exit code, does NOT mutate;
  imports NON_MANUFACTURER_DOMAINS from scope_audit.py, DRY). Verified: the clean case
  (cra_machinery, 3 classified) stays silent; the negative test (sanctions without scope) fires the warning.
- obligation_registry_v1.md: new section "Scope-Audit (Review-Step, PFLICHT je Cut)" with the
  scope axis (in_scope/out_of_scope/derived_obligation), the gate rule verbatim, and the
  tool separation FLAG (scope_audit.py) ⊥ MUTATE (apply_scope_classification.py, only on review go).

Gate: every new cut runs through the scope audit; findings are documented; automatic
reclassification is forbidden without an explicit review go.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 13:54:44 +02:00
Benjamin Admin 3f372bcb39 feat(advisor): Phase 1 — endpoint backward-compat (keep breakpilot-workspace working)
The advisor endpoint now serves two shapes off one orchestration:
- new FE ({question}) -> v3 JSON contract (clarity/answer/evidence/citations/...).
- legacy consumer ({message}, e.g. breakpilot-workspace which reads a text stream and
  persists raw bytes) -> plain-text stream of the L2 answer (clean prose, no [n] markup,
  no clarify gate). isLegacyRequest() discriminates; answerSystem() gains withCitations.

Prevents the v3 contract from breaking breakpilot-workspace's chat (CLAUDE.md rule #4,
keep every consumer working). No deploy. tsc clean, 13 vitest (incl. isLegacyRequest),
check-loc 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 13:53:17 +02:00
Benjamin Admin 6523286af6 feat(registry-quality): scope axis, 2 out_of_scope + derived_obligation (user option 2)
User decision 2026-07-01 on the scope audit: the addressee of a norm != the manufacturer's
duty to act. New `scope` attribute axis (enum, NO new object class -> freeze v1.0
untouched): in_scope (default) / out_of_scope / derived_obligation.

- sanctions + market_surveillance_safeguard -> out_of_scope (pure state/enforcement
  provisions; precedent: CSIRT/ENISA in the CRA vuln cut). Filtered out of join_keys.
- notified_body_requirements -> derived_obligation (the norm primarily addresses the notified
  body but creates indirect manufacturer duties: involve the NB + documentation +
  conformity assessment) + scope_split_candidate (later split into norm addressee <->
  derived manufacturer duty). STAYS in the set (principle: do not discard knowledge too early).
- export_join_keys.py filters scope==out_of_scope + carries scope per entry -> join_keys
  126->124 (MaschVO 31->29; 123 in_scope + 1 derived_obligation).
- scope_audit.py is now aware of the 3-way classification (0 unclassified leftovers) +
  apply_scope_classification.py (deterministic). To be run along with every future cut.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 13:22:38 +02:00
Benjamin Admin e54d07f2d9 feat(registry-quality): scope audit tool, addressee check (manufacturer vs authority)
Generalizes the citability finding (2 chapter obligations are authority-facing) into
a reusable review-stage tool: it flags obligations whose applicability
is addressed to an authority/notified body/member state = out_of_scope candidates
(the registry models manufacturer duties; precedent: CSIRT/ENISA in the CRA vuln cut).

- scope_audit.py: deterministic, keys on applicability (addressee), NOT on
  an authority being mentioned in the name → report-TO-authority duties correctly stay IN-SCOPE
  (false-positive guard, verified against exploited_vuln_reporting_authorities).
- 126 obligations scanned → 3 candidates (all MaschVO LEGAL_MINIMUM):
  notified_body_requirements (domain:notified_body) · market_surveillance_safeguard
  (domain:authority) · sanctions (domain:authority).
- scope_audit_findings.json = findings artifact. The audit only FLAGS;
  reclassification = user/owner decision (changes join_keys, cross-session).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 13:12:40 +02:00
Benjamin Admin 5a513181cc feat(advisor): Clarity-Gate orchestration in route.ts (consumes /retrieve)
Completes the advisor stack (FE + orchestration; /retrieve is SDK/RAG-owned). The route
now returns the FE contract instead of a text stream:
- retrieveFull() calls /retrieve with {query, context}; consumes clarity/evidence/
  visual_evidence/footnotes (exact shape per board 2026-07-01 12:25).
- mode-routing (resolveMode): clarify unless a context was chosen and /retrieve's
  clarity.mode says so. clarify -> L1 general answer (completeAdvisorAnswer, ungrounded,
  no sources). answer -> L2 answer over numbered evidence with [n] markers.
- citations generated here ([n] -> nth evidence unit); footnotes remapped; evidence /
  visual_evidence passed through.
- advisor-llm: non-streaming completeAdvisorAnswer(). Pure mappings in retrieve-mapping.ts
  (+ tests). Removed the dead v2 evidence.ts/evidence-adapter (RegulationRef moved to
  regulation-display). controls-augmentation kept (tested; re-integrable later).

NOT deployed: joint deploy with the SDK /retrieve endpoint (deploy-coupling). tsc clean,
25 vitest (mapping/clarify/answer/markdown/registry/rag), check-loc 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 12:39:47 +02:00
Benjamin Admin a4cb104258 feat(citability): incorporate KB-v2 verification of the norm_ids (16/19 confirmed)
Feedback from the Compliance/KB session (2026-07-01): the 19 verify_pending norm_ids
were checked against KB-v2 (bp_compliance_kb_2026_1_build): 16/19 confirmed (all
CRA+MaschVO articles exist). The 3 missing ones are chapter-level
(EU-MaschVO-KapitelIV/V/VI): the KB compiler mints articles + annexes, NO chapters.

- Article norm_ids verify_pending -> article_confirmed (16 distinct).
- Chapter norm_ids -> chapter_no_kb_unit (dangling join key) + norm_id_note
  (re-anchoring onto the constituent articles = enhancement, KB-v2 has them; NOT guessed).
- 2 chapter obligations (notified bodies · market surveillance/safeguard clause,
  both purely procedural, obligation_id=None) citation_status norm_id_linked ->
  chapter_reanchor_pending. 62 obligations remain joinable.
- Overall status: 53 annex_confirmed + 10 article_confirmed + 2 chapter_no_kb_unit.
- norm_id_manifest.json + contract block extended with kb_v2_verification.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 12:31:35 +02:00
Benjamin Admin 808a7faea3 Merge remote-tracking branch 'origin/main' into feat/obligation-aggregation 2026-07-01 12:24:51 +02:00
Benjamin Admin ffbedfa0dc feat(citability): logical norm_id join on legal_basis (KB-v2 citation contract)
Wake-up #2 (Domain 2): citability without char-level spans via a logical
norm_id join onto KB-v2 units (bp_compliance_kb_2026_1_build). Convention (board
Compliance/KB-v2 2026-07-01): EU-<ACT>-Anhang<ROM> (annex level, confirmed) /
EU-<ACT>-Art<N> + EU-<ACT>-Kapitel<ROM> (verify_pending). Name variant
EU-MaschVO-* (NOT MaschinenVO). NO new class: norm_ids is an attribute
on legal_basis (freeze-safe).

- 65/65 legal_basis joined (CRA 40 + MaschVO 25), 0 unparsed; 64 obligations
  citation_status -> norm_id_linked (BP/guidance-anchored ones stay without norm_id).
- 53 annex_confirmed, 12 verify_pending; distinct: 5 annex IDs + 19 article/chapter IDs.
- norm_id_manifest.json = KB-v2 handoff (check the verify_pending article/chapter IDs).
- Granularity is annex-coarse (part/point = KB enhancement TBD); article norm_ids still
  to be verified in KB-v2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 12:14:55 +02:00
Benjamin Admin f9b7ba2424 feat(advisor): v3 Clarity Gate — Case model + clarify/answer contract, [n] citations
Builds the FE against the SDK<->FE Clarity-Gate contract (board 2026-07-01 /
advisor-clarity-gate-contract). The advisor is now a CASE, not a chat:
- Request {question, context?}; response {mode: clarify|answer, clarity, general_answer,
  answer, evidence, citations, visual_evidence, footnotes}.
- clarify mode: short L1 general answer (marked "allgemeine Definition, ohne Rechtsquelle")
  + domain context chips; picking a chip re-runs the case scoped (-> answer).
- answer mode: markdown answer with clickable [n] citation markers coupled to evidence
  cards (highlight + scroll), evidence grouped by document family, visual_evidence
  (visual_type), footnotes, honest summary counts (no trust score).
- FE never parses the answer for structure — only the deliberate [n] markers, mapped via
  citations[]. New: contract.ts, useAdvisorCase, useCitationHighlight, ClarifyView,
  EvidenceUnitCard, VisualEvidencePane, CaseView. Removed the v2 stream/chat components.

NOT deployed: FE shape-switch (JSON modes) must deploy TOGETHER with the SDK endpoint
delivering the contract (board deploy-coupling). Proxy/route.ts unchanged (SDK-owned).
tsc clean, 16 vitest (incl. clarify+answer fixtures), check-loc 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 11:31:28 +02:00
Benjamin Admin 591cae5ebc feat(advisor): Case Workspace v2 — Evidence grouping, human names, 3-column, summary
Reworks the advisor toward a Compliance Case Workspace (review feedback):
- Rename user-facing "Quellen" -> "Evidence".
- Evidence grouped by document/regulation family (count + expandable) — no more
  unsorted DSK/DSK/DPF/... jumble.
- Human-readable regulation names via a display registry (DSK Sdm B51 -> "DSK
  Standard-Datenschutzmodell (SDM)" / Kapitel B51); generic, bridges G2.
- Evidence summary "Antwort basiert auf" with meaningful counts; Regelwerke = distinct
  FAMILIES (fixes the inflated count). NO fabricated trust score (needs a defined basis).
- Expanded mode = 3-column workspace (question+summary | answer | evidence, independent
  scroll) + history switcher; narrow mode stays stacked.
- Prompt: push aggressive markdown structure (## per aspect, numbered phases).

Deferred/coordinated on board: C8 diagrams (RAG contract), answer<->evidence coupling
[1] (needs LLM citation anchors — phase 2), G1 retrieval relevance + G2 metadata (RAG).
tsc clean, 17 vitest, check-loc 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 10:38:06 +02:00
Benjamin Admin 3884038b06 fix(advisor): generic — drop trailing source list in answer + de-duplicate source card
Two structural fixes (not query-specific):
- Proxy prompt: forbid ANY trailing "Quellen:"/"Quellen im RAG-System" list and make it
  the LAST instruction so it overrides the soul file's answer-structure + example that
  teach a closing sources section. Applies to every answer.
- KnowledgeUnitCard: render the label only when it differs from regulation.short, so a
  source whose label == short name no longer prints twice. Applies to every source.

Answer text is still never parsed in the FE (sources live in the pane). + card test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 10:13:58 +02:00
Benjamin Admin a606000a20 feat(ai-sdk): EvidenceType layer, surfacing authoritative footnotes/tables/figures
Router layering Intent→KnowledgeSpace→EvidenceType→Collection→Merge→Authority (user
decision A, generalized). New EvidenceType{TEXT,FOOTNOTE,TABLE,FIGURE} +
classifyEvidence (from the is_footnote/is_table/is_figure payload). RetrieveEvidence() pulls
the authoritative typed evidence SPECIFICALLY from the KB slice (top-20, in-scope) instead of
losing it in the broad-base text merge; /retrieve returns footnotes[]/tables[]/
figures[]. No blind perColl increase. The same infrastructure carries C8 (FIGURE) without a router rework.
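A rough sketch of how a payload-driven classifier of this kind could work. The type and constant names mirror the commit text, but the payload shape (`map[string]any` with boolean flags), the helper `payloadFlag`, and the precedence order footnote > table > figure are assumptions made for illustration:

```go
package main

import "fmt"

// EvidenceType mirrors the four kinds named in the commit message.
type EvidenceType string

const (
	EvidenceText     EvidenceType = "TEXT"
	EvidenceFootnote EvidenceType = "FOOTNOTE"
	EvidenceTable    EvidenceType = "TABLE"
	EvidenceFigure   EvidenceType = "FIGURE"
)

// payloadFlag reports whether key is present and set to boolean true.
func payloadFlag(payload map[string]any, key string) bool {
	v, ok := payload[key].(bool)
	return ok && v
}

// classifyEvidence derives the evidence type from the payload flags.
// Assumed precedence: footnote, then table, then figure; anything else is text.
func classifyEvidence(payload map[string]any) EvidenceType {
	switch {
	case payloadFlag(payload, "is_footnote"):
		return EvidenceFootnote
	case payloadFlag(payload, "is_table"):
		return EvidenceTable
	case payloadFlag(payload, "is_figure"):
		return EvidenceFigure
	default:
		return EvidenceText
	}
}

func main() {
	fmt.Println(classifyEvidence(map[string]any{"is_footnote": true}))
	fmt.Println(classifyEvidence(map[string]any{"text": "Art. 13 CRA"}))
}
```

Treating a missing or non-boolean flag as false keeps plain text hits on the default path.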

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 09:30:42 +02:00
Benjamin Admin 6f0c1cf30d feat(ai-sdk): /retrieve returns footnotes[] (C-FN evidence) for the Advisor workspace
Footnote hits are mapped from the Qdrant payload (is_footnote/footnote_label/
footnote_verbatim/ref_citation_unit) into internal LegalSearchResult fields (json:"-",
no per-result contract change) and extracted in the /retrieve handler as top-level
footnotes[] (frontend RawFootnote shape); the hits stay in results[]
(LLM context). figures[] is an empty C8 placeholder. Feeds the Evidence Workspace
(evidence-adapter.ts) of the frontend session.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 08:37:59 +02:00
Benjamin Admin 49171e841f feat(advisor): Evidence Workspace — structured panes, markdown, sources as knowledge units
Rebuilds the Compliance Advisor floating widget from a plain chat into an Evidence
Workspace: pinned last question, markdown-rendered answer (clean prose), and separate
panes for Sources (hierarchical Knowledge Units), Figures (C8, conditional) and
Footnotes (C-FN), plus a stats bar (Quellen/Regelwerke/Diagramme/Fußnoten). Scrollable
turn history; stays a floating icon on every SDK page.

Architecture (user direction): the frontend renders ONLY structured evidence and NEVER
parses the answer text. The proxy now returns a JSON AdvisorEvidenceMeta line followed
by the streamed markdown answer; advisor-rag exposes structured results; an adapter maps
RAG/compiler output to the frontend envelope. Figures/footnotes wire in once the
RAG-ingestion contract lands (requested on the board) — figures pane is conditional.

- lib/sdk/advisor/{evidence,evidence-adapter}.ts (+ adapter test, 7 cases)
- components/sdk/advisor/* panes + in-house safe Markdown (no new dep, no dangerouslySetInnerHTML) + test
- useAdvisorStream (meta-line parse + streamed answer) + useAdvisorEmail (escaped)
- proxy: evidence-meta-v1 envelope + clean-prose prompt (no inline citations)
- tsc clean, 11 vitest pass, check-loc 0. ESLint not installed in this node_modules -> CI lints on push.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-01 07:46:37 +02:00
Benjamin_Boenisch f0120b237e fix(ucca): Cross-Reg 0070, both domains in the router top-K (#47)
2026-06-30 13:42:28 +00:00
Benjamin Admin 1d65d99d5f style(ucca): gocritic equalFold in balanceByRegulation (go-lint green)
strings.EqualFold(code, cv) instead of code==strings.ToUpper(cv): fixes the only
gocritic finding on the new line (CI go-lint, new-from-merge-base). Behavior
unchanged (case-insensitive exact regulation_code match); unit + 0070 e2e stay green.
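For illustration, a small sketch of a case-insensitive exact code match built on `strings.EqualFold`. The helper name `matchesRegulation` and the sample code values are hypothetical; only `strings.EqualFold` itself comes from the commit:

```go
package main

import (
	"fmt"
	"strings"
)

// matchesRegulation reports whether code equals any configured code value,
// ignoring case but requiring the full string to match.
func matchesRegulation(code string, codeValues []string) bool {
	for _, cv := range codeValues {
		if strings.EqualFold(code, cv) {
			return true
		}
	}
	return false
}

func main() {
	codeValues := []string{"MASCHVO", "MaschVO", "MVO"}
	fmt.Println(matchesRegulation("maschvo", codeValues)) // prints "true"
	fmt.Println(matchesRegulation("CRA", codeValues))     // prints "false"
}
```

`strings.EqualFold` compares without allocating an upper-cased copy, which is what the gocritic check points at.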

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 15:30:58 +02:00
Benjamin Admin f2d445b891 fix(ucca): Cross-Reg 0070, both regulation domains in the router top-K (Known Defects 0)
The only open retrieval hard case: a query naming >=2 regulations
("CRA und Maschinenverordnung") returned only the keyword-dominant domain (CRA);
MaschVO dropped out. Three interacting causes, all fixed:

1. CodeValues mismatch: MaschVO is named differently per collection (slice MASCHVO ·
   gesetze MVO · ce MACHINERY/MASCHINENVO), but the catalog only had ["MASCHVO","MaschVO"]
   → the filter found MaschVO only in the slice. Now all variants are CodeValues.
2. Per-collection truncation: the router passed perColl=3 → searchMultiRegulation fetched
   3+3=6, cut to 3 → could lose a domain per collection. Multi-reg queries
   now get perColl = 3*len(regs).
3. The router score merge starved the non-dominant domain. The new balanceByRegulation()
   groups the merged pool by regulation (exact regulation_code match) and picks
   round-robin across the named domains → every domain with hits is in the top-K.
   Generic over any named set; the single-domain path is unchanged.
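The round-robin balancing in point 3 can be sketched as follows. This is an illustration under assumptions, not the repository's implementation: the `Hit` and `Regulation` types are hypothetical, the pool is assumed to be sorted by score already, and hits that match no named regulation are simply dropped here:

```go
package main

import (
	"fmt"
	"strings"
)

// Hit is a hypothetical search result carrying its regulation code.
type Hit struct {
	Code  string
	Score float64
}

// Regulation is a hypothetical named domain with its per-collection code variants.
type Regulation struct {
	Name       string
	CodeValues []string
}

// regulationIndex returns the index of the regulation owning code, or -1.
func regulationIndex(code string, regs []Regulation) int {
	for i, r := range regs {
		for _, cv := range r.CodeValues {
			if strings.EqualFold(code, cv) {
				return i
			}
		}
	}
	return -1
}

// balanceByRegulation groups the pool per regulation and takes hits round-robin,
// so every named regulation with hits appears in the top-K.
func balanceByRegulation(pool []Hit, regs []Regulation, topK int) []Hit {
	if topK <= 0 {
		return nil
	}
	groups := make([][]Hit, len(regs))
	for _, h := range pool {
		if i := regulationIndex(h.Code, regs); i >= 0 {
			groups[i] = append(groups[i], h)
		}
	}
	out := make([]Hit, 0, topK)
	for round := 0; len(out) < topK; round++ {
		took := false
		for _, g := range groups {
			if round < len(g) && len(out) < topK {
				out = append(out, g[round])
				took = true
			}
		}
		if !took {
			break // every group is exhausted
		}
	}
	return out
}

func main() {
	regs := []Regulation{
		{Name: "CRA", CodeValues: []string{"CRA"}},
		{Name: "MaschVO", CodeValues: []string{"MASCHVO", "MVO", "MACHINERY", "MASCHINENVO"}},
	}
	pool := []Hit{{"CRA", 0.9}, {"CRA", 0.8}, {"CRA", 0.7}, {"MVO", 0.4}, {"MASCHINENVO", 0.3}}
	for _, h := range balanceByRegulation(pool, regs, 4) {
		fmt.Print(h.Code, " ")
	}
	fmt.Println() // prints "CRA MVO CRA MASCHINENVO"
}
```

A pure score sort over the same pool would return three CRA hits before the first MaschVO hit, which is the starvation the commit describes.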

Validation: Go unit (balanceByRegulation: the dominant CRA NO longer crowds out MaschVO);
0070 e2e against dev (Retrieve() → [CRA MVO CRA MVO CRA MVO CRA MASCHINENVO] = both
domains, previously only CRA); CB-100 sample REGR 0 (gain profile unchanged).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 15:08:18 +02:00
Benjamin_Boenisch 08086ee75f feat: Authority Router, Advisor collection-agnostic, KB-2026.1 live (#46)
2026-06-30 12:26:53 +00:00
Benjamin Admin 1e5aaf7103 feat(advisor): Authority Router, Advisor collection-agnostic, KB-2026.1 gain in the product path
Until now the Advisor fanned out on its own over a fixed list of explicit collections
(advisor-rag.ts), bypassing the #61 scope routing (which only routes the default
path) → the measured +28 retrieval gain (CB-100: 53→81, 0 regr) never reached the
answer LLM. This router moves the fan-out into the retriever layer:

- SDK: LegalRAGClient.Retrieve() + POST /sdk/v1/rag/retrieve {query, top_k}:
  fans out server-side over the broad authority base + the KB-2026.1 slice when
  inKBScope, merge+dedup, sorted by authority score (rerankByAuthority per
  collection), top-K. Index warm-up before the concurrent fan-out (free of map races).
  Per-env via RAG_ROUTER_COLLECTIONS.
- admin: advisor-rag.ts calls /retrieve ONCE instead of querying 6 explicit collections.
  The Advisor is collection-agnostic (contract Compiler→Collections→Retriever→Advisor);
  COMPLIANCE_COLLECTIONS/searchCollection removed.

Validation: Go unit (router selection, dedup); e2e against dev Qdrant (real
Retrieve(), CB-100 sample stride 5): OLD-hit 11/20 → NEW-hit 15/20, GAIN 4
(all DS guidance), REGR 0, which reproduces the +28/0-regr through the production code.
TS tests adapted to the single /retrieve call.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 14:13:09 +02:00
Benjamin_Boenisch af11d21f6e feat(ucca): Blue-Green KB-2026.1 Scope-Routing (authoritative slice) (#45)
2026-06-30 10:01:59 +00:00
Benjamin Admin e2c74fd243 feat(ucca): Blue-Green „authoritative slice promotion" — KB-2026.1 Scope-Routing
Additive (NOT a CE replacement): if a query falls within the KB-2026.1 scope (DP/CRA/MaschVO/
NIS2/DataAct/DORA/AIAct + EDPB/DSK guidance), the high-quality slice collection
`kb_2026_1_build` is queried; otherwise the broad default `bp_compliance_ce` stays in use.
This puts the guidance-intent and multi-reg fixes (PR #42/#43) LIVE for the slice, while the
broad corpus (OWASP/NIST/ENISA/IFRS/ISO) stays untouched -> 0 regressions by construction.

- resolveCollection(query, requested): an explicitly requested collection is left unchanged;
  a default request -> slice if inKBScope, otherwise CE. Env RAG_KB_SCOPE_ROUTING=false = rollback
  without redeploy; RAG_KB_SLICE_COLLECTION overrides the slice name.
- inKBScope: detectRegulations (in-slice regulations) + DP guidance markers (edpb/dsk/wp/gl) +
  DP/compliance topics. Deliberately NOT the generic verbs from guidanceIntentSignals
  (sagt/laut) and NOT enisa/bsi/nist/owasp (those live in CE) -> conservative, in-scope->slice.
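
The routing rule described in the two bullets above can be sketched roughly as follows. This is a minimal TypeScript illustration, not the project's implementation: the token list in `IN_SCOPE_TOKENS` is hypothetical, the `env` parameter stands in for real environment access, and a "default request" is assumed to mean that no collection was requested.

```typescript
// Sketch only: function names mirror the commit message, the token list is illustrative.
type Env = Record<string, string | undefined>

const CE_COLLECTION = 'bp_compliance_ce'
const SLICE_COLLECTION = 'kb_2026_1_build'

// Hypothetical in-scope tokens (in-slice regulations + DP guidance markers).
const IN_SCOPE_TOKENS = new Set(['cra', 'maschvo', 'nis2', 'dora', 'dsgvo', 'edpb', 'dsk', 'wp248'])

// Conservative check: whole-token matches only, so "nist" or "owasp" never match.
function inKBScope(query: string): boolean {
  return query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .some(token => IN_SCOPE_TOKENS.has(token))
}

function resolveCollection(query: string, requested?: string, env: Env = {}): string {
  // An explicitly requested collection is returned unchanged.
  if (requested) return requested
  // Kill switch: with routing off, every default request stays on CE.
  if (env['RAG_KB_SCOPE_ROUTING'] === 'false') return CE_COLLECTION
  // The slice name can be overridden without a code change.
  const slice = env['RAG_KB_SLICE_COLLECTION'] || SLICE_COLLECTION
  return inKBScope(query) ? slice : CE_COLLECTION
}
```

Because only default requests are rerouted and the fallback is always the CE collection, switching the flag off restores the previous behaviour for every caller.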

Validation: unit (scoping + resolveCollection); dev e2e (RUN_E2E, routed Search() against
dev): WP248/MaschVO/CRA+MaschVO -> slice (hits present, missing in dev-ce); NIST -> CE (NIST hits).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 11:49:34 +02:00
Benjamin_Boenisch 8ed99c255d Merge pull request 'fix(api): F821 regression (half-finished extract-service refactor), 7 route files' (#44) from fix/api-f821-extract-service-regression into main
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 9s
CI / validate-canonical-controls (push) Successful in 7s
CI / loc-budget (push) Successful in 22s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 27s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-30 09:06:08 +00:00
Benjamin Admin 3389fa3e7a fix(api): fix F821 regression in 6 more route files
CI / detect-changes (pull_request) Successful in 5s
CI / branch-name (pull_request) Successful in 1s
CI / guardrail-integrity (pull_request) Successful in 5s
CI / secret-scan (pull_request) Successful in 8s
CI / dep-audit (pull_request) Failing after 57s
CI / sbom-scan (pull_request) Failing after 56s
CI / build-sha-integrity (pull_request) Successful in 6s
CI / validate-canonical-controls (pull_request) Successful in 5s
CI / loc-budget (pull_request) Successful in 22s
CI / go-lint (pull_request) Successful in 46s
CI / python-lint (pull_request) Failing after 17s
CI / nodejs-lint (pull_request) Failing after 1m8s
CI / nodejs-build (pull_request) Successful in 3m1s
CI / test-go (pull_request) Successful in 1m2s
CI / iace-gt-coverage (pull_request) Successful in 18s
CI / test-python-backend (pull_request) Successful in 25s
CI / test-python-document-crawler (pull_request) Successful in 14s
CI / test-python-dsms-gateway (pull_request) Successful in 10s
Same root cause as evidence_routes (extract-service refactor a638d0e5 ff.):
signatures/imports were only half migrated → undefined names → NameError when called.

- routes.py: db param in get_control/update_control/review_control + EvidenceDB import
- dsfa_routes.py: db param in create_dsfa + HTTPException/text import
- dashboard_routes.py: timezone import
- canonical_control_routes.py: logger definition
- ai_routes.py: timezone in the local datetime imports
- vvt_routes.py: HTTPException import

Verified: ruff F821 reports 0 across all of compliance/api/, all 6 files pass py_compile,
294 tests green on the affected modules (the 2 dsfa invalid-status/risk failures
are pre-existing = 400 vs 422, unrelated to this fix).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 10:51:00 +02:00
Benjamin Admin 79abf23ea8 fix(api): fix evidence_routes F821 regression (half-finished extract-service refactor)
a638d0e5 ("extract EvidenceService") switched the signatures to service=Depends
but left the bodies + imports in their old state → 43 F821 (NameError at runtime).

- restored the deleted stdlib imports (os/json/hashlib/uuid/datetime/timedelta)
- restored db: Session = Depends(get_db) on the affected endpoints
- imported translate_domain_errors + _update_risks_impl (=evidence_service._update_risks)
- removed an unreachable dead block (old get_ci_evidence_status impl after the return)
- dsms_cid=None no-op in create/review/reject (DSMS commit copy-paste)

Verified: ruff F821 0, py_compile, test_evidence_routes.py 35 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 10:19:28 +02:00
Benjamin Admin d5925e57af feat(ai-sdk): pin accepted proposer decisions into the GT gate (P3)
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 9s
CI / validate-canonical-controls (push) Successful in 8s
CI / loc-budget (push) Successful in 21s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 59s
CI / iace-gt-coverage (push) Successful in 19s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
When a human accepts a proposer proposal, an AcceptedPin records a machine-scoped
invariant — a pattern MUST fire (coverage/vocab→tag) or must NOT fire
(dedup/framing) — that a test re-checks on every run. This makes the library's
growth COMPOUND into the gate instead of eroding it: a change that re-introduces
a dropped duplicate, un-gates a foreign pattern, or removes a coverage hazard
breaks a pin and fails CI. One boolean covers all four proposal types.

Seeded testdata/accepted_pins_warewashing.json with the accepted P1 supersessions
(HP016/HP018/HP013 must NOT fire; their clean equivalents HP2201/HP144 must fire).
TestWarewashing_AcceptedPins re-checks 5/5 against the live engine output;
GenerateDedupPin turns an accepted dedup verdict into its pin.
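
As a rough illustration of the "one boolean" idea, a pin check could look like the TypeScript sketch below. The field names `patternId` and `mustFire` and the helper `violatedPins` are hypothetical; only the pattern ids and their must-fire / must-not-fire status come from the message above.

```typescript
// Sketch only: field and function names are assumptions, not the engine's types.
interface AcceptedPin {
  patternId: string
  // true  = the pattern MUST fire (coverage / vocab->tag)
  // false = the pattern must NOT fire (dedup / framing)
  mustFire: boolean
}

// A pin is violated when the observed firing state differs from the pinned one.
function violatedPins(pins: readonly AcceptedPin[], fired: ReadonlySet<string>): AcceptedPin[] {
  return pins.filter(pin => fired.has(pin.patternId) !== pin.mustFire)
}

// The five pins named in the commit message.
const warewashingPins: AcceptedPin[] = [
  { patternId: 'HP016', mustFire: false },
  { patternId: 'HP018', mustFire: false },
  { patternId: 'HP013', mustFire: false },
  { patternId: 'HP2201', mustFire: true },
  { patternId: 'HP144', mustFire: true },
]
```

A test in this style would fail as soon as `violatedPins` returns a non-empty list, which covers both a re-introduced duplicate and a lost coverage pattern with the same comparison.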

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 09:42:31 +02:00
Benjamin Admin 1877829b1d Merge remote-tracking branch 'gitea/main' into reconcile-dev
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 8s
CI / validate-canonical-controls (push) Successful in 5s
CI / loc-budget (push) Successful in 22s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m3s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 26s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-30 09:04:58 +02:00
Benjamin_Boenisch 866889b453 Merge pull request 'feat(ucca): multi-regulation retrieval (cross-regulation questions)' (#43) from fix/multi-regulation-retrieval into main
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 7s
CI / validate-canonical-controls (push) Successful in 6s
CI / loc-budget (push) Successful in 21s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 1m0s
CI / iace-gt-coverage (push) Successful in 20s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-30 06:46:21 +00:00
Benjamin Admin 9760dca443 feat(ucca): multi-regulation retrieval for cross-regulation questions
CI / detect-changes (pull_request) Successful in 10s
CI / branch-name (pull_request) Successful in 1s
CI / guardrail-integrity (pull_request) Successful in 8s
CI / secret-scan (pull_request) Successful in 9s
CI / dep-audit (pull_request) Failing after 56s
CI / sbom-scan (pull_request) Failing after 58s
CI / build-sha-integrity (pull_request) Successful in 9s
CI / validate-canonical-controls (pull_request) Successful in 7s
CI / loc-budget (pull_request) Successful in 24s
CI / go-lint (pull_request) Successful in 54s
CI / python-lint (pull_request) Failing after 16s
CI / nodejs-lint (pull_request) Failing after 1m9s
CI / nodejs-build (pull_request) Successful in 3m6s
CI / test-go (pull_request) Successful in 1m3s
CI / iace-gt-coverage (pull_request) Successful in 19s
CI / test-python-backend (pull_request) Successful in 26s
CI / test-python-document-crawler (pull_request) Successful in 15s
CI / test-python-dsms-gateway (pull_request) Successful in 12s
If a query EXPLICITLY names >=2 regulations ("Wie greifen CRA und Maschinenverordnung
ineinander?", i.e. "How do the CRA and the Machinery Regulation interact?"), searchInternal
retrieves separately per regulation (regulation_code/regulation_id filter) and merges, so
that BOTH domains land in the prompt instead of only the keyword-dominant one. Generic
(query -> regulations, NO doc-specific logic), gated on >=2 detected regulations; otherwise
the single-domain path is unchanged.
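
The gating and merge described above might be sketched like this in TypeScript. Everything here is an assumption made for illustration: the `KNOWN` and `ALIASES` vocabularies, the synchronous `SearchFn` signature, and in particular the round-robin merge, since the message does not say how the per-regulation results are combined.

```typescript
// Sketch only: vocabularies, types and the merge strategy are illustrative assumptions.
interface Hit { id: string; regulation: string; score: number }
type SearchFn = (query: string, regulation?: string) => Hit[]

const KNOWN = ['cra', 'maschvo', 'nis2', 'dora', 'dsgvo']
const ALIASES: Record<string, string> = { maschinenverordnung: 'maschvo' }

// Returns the distinct regulation codes that the query names explicitly.
function detectRegulations(query: string): string[] {
  const found = new Set<string>()
  for (const token of query.toLowerCase().split(/[^a-z0-9]+/)) {
    const code = ALIASES[token] ?? token
    if (KNOWN.includes(code)) found.add(code)
  }
  return Array.from(found)
}

function searchMulti(query: string, search: SearchFn, topK = 8): Hit[] {
  const regs = detectRegulations(query)
  // Gate: fewer than two regulations keeps the unchanged single-domain path.
  if (regs.length < 2) return search(query).slice(0, topK)
  // One filtered retrieval per regulation.
  const lists = regs.map(reg => search(query, reg))
  const merged: Hit[] = []
  const seen = new Set<string>()
  const longest = Math.max(...lists.map(list => list.length))
  // Round-robin interleave so every regulation is represented in the top results.
  for (let i = 0; i < longest && merged.length < topK; i++) {
    for (const list of lists) {
      const hit = list[i]
      if (hit && !seen.has(hit.id) && merged.length < topK) {
        seen.add(hit.id)
        merged.push(hit)
      }
    }
  }
  return merged
}
```

Interleaving is one simple way to keep a keyword-dominant regulation from filling every slot; a score-based merge with a per-regulation quota would serve the same purpose.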

Fixes GQ-0070: previously CRA x8 / zero MaschVO -> the model hallucinated
MaschVO=2019/2144 + a wrong "CRA exempt" conclusion. Now CRA + MaschVO are both
in the prompt -> correctly "both apply simultaneously" + Art. 20(9)
presumption of conformity, grounded.

Validation (build collection, real SearchCollection):
- Unit: detectRegulations scoping (>=2 -> multi, 1/0 -> single)
- 5 cross-reg cases (0070 + DSGVO+TDDDG/CRA+NIS2/DORA+NIS2/AI Act+DSGVO):
  both regulations in the top 8
- CB-100 freeze regression: ONLY GQ-0070 + GQ-0095 changed (both genuine
  cross-reg, both improved), 98/100 byte-identical
- 10 hard cases: 9 single-domain unchanged, 0070 keeps CRA at rank 1

Filter extended to regulation_id AND regulation_code (backward compatible,
activates the re-ingested build collection).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-30 08:18:06 +02:00
Benjamin_Boenisch e5e7b825af Merge pull request 'fix(ucca): guidance intent for directly named WP/GL documents' (#42) from fix/legal-rag-guidance-intent into main
CI / branch-name (push) Has been skipped
CI / detect-changes (push) Successful in 7s
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 6s
CI / validate-canonical-controls (push) Successful in 5s
CI / loc-budget (push) Successful in 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 1m0s
CI / iace-gt-coverage (push) Successful in 17s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-29 18:42:27 +00:00
pilotadmin f0da86ca19 Merge pull request 'feat(onboarding): advisor responsiveness — moving headline + auto-recompute' (#54) from feat/advisor-ux-responsiveness into main 2026-06-28 19:31:20 +02:00
Benjamin Admin 867f8c3854 feat(onboarding): make the advisor visibly responsive — headline leads with the moving number + auto-recompute
Testing surfaced that toggling certifications appeared to "do nothing": the headline led with the TOTAL
requirement count (constant per target, e.g. 17 for CRA), and the page only recomputed on an explicit
button click. Both fixed:
  - engine.py headline now leads with the number that actually moves: "11 von 17 Anforderungen offen ·
    6 wahrscheinlich (Zertifikate) · 5 zu klären" (was "17 Anforderungen erkannt · …"). Keeps the
    "automatisch erkannt (Intake)" substring.
  - frontend auto-recomputes on certifications / target / scanner-signal change (no button needed).

Now ISO27001 alone -> "13 von 17 offen · 4 wahrscheinlich"; + ISO9001+TISAX+IEC62443 -> "11 von 17 offen ·
6 wahrscheinlich". (Domain truth stays visible: CRA's product-cyber gaps barely move with management-system
certs.) 28 onboarding+transition tests pass, check-loc 0.
2026-06-28 19:31:15 +02:00
pilotadmin 26a8518107 Merge pull request 'feat(onboarding): surface curated expert text + human labels' (#53) from feat/advisor-human-text into main 2026-06-28 18:47:07 +02:00
Benjamin Admin 807a7002b2 feat(onboarding): surface curated expert text + human capability labels (advisor was showing snake_case)
The advisor was structurally correct but unusable: every question showed a snake_case capability id plus a
single generic fallback reason ("Keine Anhaltspunkte im Unternehmensprofil — klären"). The expert text
already EXISTED in the transition patterns (why_asked / reviewable_claim) — the pipeline just dropped it.

  - transition_reasoning: TargetRequirement gains `rationale`; assess_transition uses it as the request
    reason when present, else the generic fallback (additive, backward-compatible for all consumers).
  - onboarding_service._target carries the pattern's why_asked (delta) and reviewable_claim (likely_covered)
    into the requirement rationale -> the question's `why`.
  - knowledge/onboarding/capability_labels.yaml: curated DE labels (id -> human), reusable across targets;
    labels_for() + response.capability_labels expose them; the frontend renders label || prettified id.
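
The two fallbacks in the list above (curated rationale else generic reason, curated label else prettified id) could be sketched as follows. The TypeScript names `TargetRequirement`, `requestReason` and `displayLabel` are used here for illustration only and are not the backend's actual signatures.

```typescript
// Sketch only: shapes and helper names are assumptions for illustration.
interface TargetRequirement {
  capability_id: string
  rationale?: string // curated expert text, when the pattern provides one
}

// Prefer the curated rationale; fall back to the generic reason otherwise.
function requestReason(req: TargetRequirement, genericFallback: string): string {
  const rationale = req.rationale?.trim()
  return rationale ? rationale : genericFallback
}

// Prefer the curated label; otherwise turn the snake_case id into readable words.
function displayLabel(id: string, labels: Record<string, string>): string {
  return labels[id] || id.replace(/_/g, ' ')
}
```

Both helpers are additive: a consumer that supplies neither a rationale nor a label map sees the same output as before.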

Now ISO27001->TISAX reads "Auftragsverarbeitung (Art. 28 DSGVO) — If a TISAX data label is in scope, you
must show Art. 28 GDPR processing-on-behalf controls; ISO 27001 does not establish these." instead of
"data_protection_processing_on_behalf — klären". why_asked text is still EN (existing knowledge; translation
is curation). 34 onboarding+transition tests pass, mypy --strict clean (13 modules), check-loc 0.
2026-06-28 18:46:56 +02:00
pilotadmin 5beb5a319a Merge pull request 'feat(admin): ETO / Onboarding-Advisor test page' (#52) from feat/onboarding-advisor-frontend into main 2026-06-28 17:12:44 +02:00
Benjamin Admin 239702fdca feat(admin): ETO / Onboarding-Advisor test page (thin operator surface over the advisor endpoint)
A focused client page at /sdk/onboarding-advisor that exercises POST /api/compliance/onboarding/
advisor-start through the existing compliance proxy: pick certifications + target + scanner findings
(observation / partial / requirement) and render the result — headline, silent-intake summary,
auto-detected (green), indications (amber), next-best questions with WHY, inferred (Welt-1) vs rejected
assumptions, capability delta, evidence requests, completeness. NOT the regulation gap engine
(/sdk/gap-analysis is a different flow). No new backend; calls only the existing endpoint. 195 lines.
2026-06-28 17:12:40 +02:00
pilotadmin d1a5fc7205 Merge pull request 'feat(onboarding): Observation Log — append-only JSONL calibration store (59b/c)' (#51) from feat/observation-log into main 2026-06-28 16:29:58 +02:00
Benjamin Admin 7df15010ff feat(onboarding): Observation Log — append-only JSONL calibration store (Task 59b/c v1)
Per the user's decision (2026-06-28): observations are CALIBRATION data for the knowledge base, NOT
business data and NOT product-DB data. So they live with the other versioned knowledge artifacts as an
append-only JSONL log under knowledge/observations/ — NO migration, NO DB. (A real persistence layer is
only warranted once thousands of onboardings exist; not before.)

  - ObservationRecord = Observation + log metadata (observation_id, timestamp [caller-stamped, no hidden
    clock], customer_archetype [anonymised — NEVER a real name], evidence, provenance, knowledge_version).
  - append_observation() writes one JSON line; append-only, lines are never rewritten. A later review is a
    NEW line with the same observation_id; load_observations(reconcile=True) keeps the latest per id.
  - load_observations() reads a single .jsonl or a directory of monthly .jsonl files.
  - aggregate_by_hypothesis() (59c) -> per-hypothesis distribution + confidence, COMPUTED from the log
    (computed-not-stored); the review gate (reviewed-only) is enforced in empirical_distribution/confidence.
  - review_queue() -> the unreviewed worklist. Observation -> Review -> Accepted -> recompute, never
    Observation -> confidence++. Nothing is ever written back to a hypothesis.
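
A compact way to picture the append-only and computed-not-stored behaviour is the TypeScript sketch below. It is an in-memory stand-in (a `string[]` plays the role of the JSONL file), and the fields `hypothesis_id`, `outcome` and `reviewed` are hypothetical simplifications of the record described above.

```typescript
// Sketch only: an in-memory array of JSON lines stands in for the .jsonl file.
interface ObservationRecord {
  observation_id: string
  timestamp: string // caller-stamped, no hidden clock
  hypothesis_id: string // hypothetical field
  outcome: string // hypothetical field
  reviewed: boolean // hypothetical field
}

// Append-only: a new line is added, existing lines are never rewritten.
function appendObservation(log: string[], record: ObservationRecord): void {
  log.push(JSON.stringify(record))
}

// With reconcile=true the latest line per observation_id wins.
function loadObservations(log: readonly string[], reconcile = true): ObservationRecord[] {
  const records = log
    .filter(line => line.trim() !== '')
    .map(line => JSON.parse(line) as ObservationRecord)
  if (!reconcile) return records
  const latest = new Map<string, ObservationRecord>()
  for (const record of records) latest.set(record.observation_id, record)
  return Array.from(latest.values())
}

// The unreviewed worklist.
function reviewQueue(records: readonly ObservationRecord[]): ObservationRecord[] {
  return records.filter(record => !record.reviewed)
}

// Review gate: only reviewed observations count towards the distribution.
function aggregateByHypothesis(
  records: readonly ObservationRecord[],
): Record<string, Record<string, number>> {
  const result: Record<string, Record<string, number>> = {}
  for (const record of records) {
    if (!record.reviewed) continue
    const distribution = (result[record.hypothesis_id] ??= {})
    distribution[record.outcome] = (distribution[record.outcome] ?? 0) + 1
  }
  return result
}
```

Since the aggregate is derived on every call, deleting or replaying the log changes nothing except the inputs to the computation.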

You can `rm` the log and recompute, `git diff` it over months, or rebuild confidence under a new policy —
fully consistent with computed-not-stored and the product/knowledge data separation.

Non-runtime (module + tests only, no endpoint) -> origin/main, NO dev deploy. 5 new tests (append-only,
review supersession, review-gate statistics, queue, monthly-file load); 27 onboarding tests pass, mypy
--strict clean (9 modules), check-loc 0. 59d (surface computed confidence at runtime) stays a later step.
2026-06-28 16:29:54 +02:00
103 changed files with 6901 additions and 1171 deletions
@@ -1,34 +1,29 @@
/** /**
* Compliance Advisor Chat API * Compliance Advisor Chat API — Clarity-Gate orchestration.
* *
* Verbindet das ComplianceAdvisorWidget mit: * Consumes the SDK/RAG /retrieve (evidence/visual_evidence/footnotes/clarity) and returns the
* 1. Multi-Collection-RAG ueber die ai-compliance-sdk (bge-m3) — siehe advisor-rag * FE-facing contract (advisor-clarity-gate-contract):
* 2. Strukturierten Controls zum erkannten Thema — buildControlsContext * - clarify mode -> short L1 general answer (no RAG) + domain context chips
* 3. LLM-Kaskade OVH (prod) -> Ollama (Dev) — siehe advisor-llm * - answer mode -> L2 answer over the scoped evidence with [n] citation markers
* * Citations are generated here ([n] -> nth evidence unit). The FE renders ONLY this structured data.
* Laenderspezifische Filterung (DE, AT, CH, EU). Streamt die Antwort als Text.
*/ */
import { NextRequest, NextResponse } from 'next/server' import { NextRequest, NextResponse } from 'next/server'
import { readSoulFile } from '@/lib/sdk/agents/soul-reader' import { readSoulFile } from '@/lib/sdk/agents/soul-reader'
import { buildControlsContext } from '@/lib/sdk/agents/controls-augmentation' import { retrieveFull } from '@/lib/sdk/agents/advisor-rag'
import { queryAdvisorRAG } from '@/lib/sdk/agents/advisor-rag' import { completeAdvisorAnswer, streamAdvisorAnswer, type ChatMessage } from '@/lib/sdk/agents/advisor-llm'
import { streamAdvisorAnswer, type ChatMessage } from '@/lib/sdk/agents/advisor-llm' import {
buildCitations,
isLegacyRequest,
mapClarity,
mapFootnotes,
numberedEvidenceForPrompt,
resolveMode,
} from '@/lib/sdk/advisor/retrieve-mapping'
import type { AdvisorResponse } from '@/lib/sdk/advisor/contract'
type Country = 'DE' | 'AT' | 'CH' | 'EU' type Country = 'DE' | 'AT' | 'CH' | 'EU'
const FALLBACK_SYSTEM_PROMPT = `# Compliance Advisor Agent
## Identitaet
Du bist der BreakPilot Compliance-Berater. Du hilfst Nutzern des AI Compliance SDK,
Datenschutz- und Compliance-Fragen in verstaendlicher Sprache zu beantworten.
## Kernprinzipien
- Quellenbasiert: Verweise auf DSGVO-Artikel, BDSG-Paragraphen
- Verstaendlich: Einfache, praxisnahe Sprache
- Ehrlich: Bei Unsicherheit empfehle Rechtsberatung
- Deutsch als Hauptsprache`
const COUNTRY_LABELS: Record<Country, string> = { const COUNTRY_LABELS: Record<Country, string> = {
DE: 'Deutschland', DE: 'Deutschland',
AT: 'Oesterreich', AT: 'Oesterreich',
@@ -38,84 +33,153 @@ const COUNTRY_LABELS: Record<Country, string> = {
function countryBlock(c: Country): string { function countryBlock(c: Country): string {
const label = COUNTRY_LABELS[c] const label = COUNTRY_LABELS[c]
const nationalLaws =
c === 'DE'
? 'BDSG, TDDDG, TKG, UWG'
: c === 'AT'
? 'AT DSG, ECG, TKG, KSchG, MedienG'
: 'CH DSG, DSV, OR, UWG, FMG'
const guidance =
c === 'EU'
? 'EU-weiten Fragen: Beziehe dich auf EU-Verordnungen und -Richtlinien'
: `${label}: Beziehe nationale Gesetze (${nationalLaws}) mit ein`
return `\n\n## Laenderspezifische Auskunft return `\n\n## Laenderspezifische Auskunft
Der Nutzer hat "${label} (${c})" gewaehlt. Der Nutzer hat "${label} (${c})" gewaehlt. Beziehe dich auf ${c}-Recht + anwendbares EU-Recht und nenne das Land.`
- Beziehe dich AUSSCHLIESSLICH auf ${c}-Recht + anwendbares EU-Recht }
- Nenne IMMER explizit das Land in deiner Antwort
- Verwende NIEMALS Gesetze eines anderen Landes // L1: general knowledge, deliberately NOT grounded (the clarify step precedes the legal retrieval).
- Bei ${guidance}` const L1_SYSTEM = `Du bist der BreakPilot Compliance-Berater. Gib eine KURZE, allgemeine Definition/Erklaerung
des gefragten Begriffs aus Allgemeinwissen — 2 bis 4 Saetze, Markdown, neutral. NENNE KEINE Rechtsquellen,
Paragraphen, Artikel oder Fundstellen; der Nutzer waehlt anschliessend einen konkreten Kontext, erst dann
folgen belegte Quellen. Wenn der Begriff in mehreren Bereichen vorkommt, erwaehne das in einem Halbsatz.`
const FALLBACK_SYSTEM = `Du bist der BreakPilot Compliance-Berater. Antworte quellenbasiert, verstaendlich und ehrlich auf Deutsch.`
// Optional audience/tonality guidance (e.g. the workspace's role hint). Kept out of the retrieval
// `question` on purpose — it only shapes the answer's tone, so it belongs in the system prompt.
function audienceBlock(audience: string): string {
return audience ? `\n\n## Ansprache / Zielgruppe\n${audience}` : ''
}
function answerSystem(
soul: string | null,
country: Country | undefined,
evidenceBlock: string,
withCitations = true,
audience = '',
): string {
let s = soul || FALLBACK_SYSTEM
if (country) s += countryBlock(country)
s += audienceBlock(audience)
s += `\n\n## Belegte Evidence (nummeriert — DEINE EINZIGEN Quellen)\n${evidenceBlock || '(keine Evidence gefunden)'}`
s += `\n\n## Antwortformat (WICHTIG)
- Beginne mit einer **Kurzzusammenfassung** (1-2 Saetze, "Kurz gesagt: …"), die den Kern direkt beantwortet.
- Danach gut gegliedertes Markdown: kurze ## Ueberschriften je THEMA/Aspekt (nicht je Rechtsquelle), Aufzaehlungen, **Fettung** fuer Kernbegriffe.`
if (withCitations) {
s += `\n- Belege Kernaussagen mit [n], wobei n die NUMMER der Evidence-Quelle oben ist (z. B. [1], [2]).
- Nenne KEINE Quellen-/Fundstellen-Liste im Fliesstext — die Quellen werden dem Nutzer separat angezeigt.`
} else {
s += `\n- Nenne Fundstellen nur, wo sie der Antwort dienen (natuerlich im Text, KEIN [n]-Markup).`
}
s += `\n- Triff KEINE Aussage, die nicht durch die nummerierte Evidence belegt ist; fehlt der Beleg, sage das offen.`
return s
}
// Prior thread turns for contextual follow-ups. Validated + bounded (last 8 turns ~ 4 Q&A).
function parseHistory(raw: unknown): ChatMessage[] {
if (!Array.isArray(raw)) return []
const turns: ChatMessage[] = []
for (const t of raw) {
if (!t || typeof t !== 'object') continue
const role = (t as { role?: unknown }).role
const content = (t as { content?: unknown }).content
if ((role === 'user' || role === 'assistant') && typeof content === 'string' && content.trim()) {
turns.push({ role, content })
}
}
return turns.slice(-8)
} }
export async function POST(request: NextRequest) { export async function POST(request: NextRequest) {
try { try {
const body = await request.json() const body = await request.json()
const { message, history = [], currentStep = 'default', country } = body const question = String(body.question ?? body.message ?? '').trim()
const context: string | null = body.context ?? null
if (!message || typeof message !== 'string') { const audience = typeof body.audience === 'string' ? body.audience.trim() : ''
return NextResponse.json({ error: 'Message is required' }, { status: 400 }) const history = parseHistory(body.history)
} const country = (['DE', 'AT', 'CH', 'EU'] as const).includes(body.country)
? (body.country as Country)
const validCountry = (['DE', 'AT', 'CH', 'EU'] as const).includes(country)
? (country as Country)
: undefined : undefined
// 1. RAG (ai-sdk, bge-m3) + strukturierte Controls zum Thema — beide parallel if (!question) {
const [ragContext, controlsContext] = await Promise.all([ return NextResponse.json({ error: 'Question is required' }, { status: 400 })
queryAdvisorRAG(message),
buildControlsContext(message),
])
// 2. System-Prompt zusammenbauen
const soulPrompt = await readSoulFile('compliance-advisor')
let systemContent = soulPrompt || FALLBACK_SYSTEM_PROMPT
if (validCountry) systemContent += countryBlock(validCountry)
if (ragContext) {
systemContent += `\n\n## Relevanter Kontext aus dem RAG-System (deine EINZIGEN Rechtsquellen)\n\nDies sind deine einzigen zulaessigen Rechtsquellen. Triff keine konkrete Rechtsaussage (Zahl, Frist, Schwelle, Pflicht, Fundstelle), die nicht hier oder im Controls-Block belegt ist — sonst sage offen, dass du sie aus deinen Quellen nicht belegen kannst. Verweise in deiner Antwort auf die jeweilige Quelle:\n\n${ragContext}`
} }
if (controlsContext) systemContent += `\n\n${controlsContext}`
systemContent += `\n\n## Aktueller SDK-Schritt\nDer Nutzer befindet sich im SDK-Schritt: ${currentStep}`
// 3. Nachrichten (History auf die letzten 6 begrenzen) const retrieved = await retrieveFull(question, context)
// Backward-compat: legacy consumers (breakpilot-workspace) send {message} and read a plain-text
// stream. Serve the L2 answer streamed (clean prose, no [n]); no clarify gate, no JSON.
if (isLegacyRequest(body)) {
const legacyEvidence = retrieved.evidence ?? []
const legacySoul = await readSoulFile('compliance-advisor')
const legacyStream = await streamAdvisorAnswer([
{ role: 'system', content: answerSystem(legacySoul, country, numberedEvidenceForPrompt(legacyEvidence), false, audience) },
...history,
{ role: 'user', content: question },
])
if (!legacyStream) {
return NextResponse.json({ error: 'LLM nicht erreichbar.' }, { status: 502 })
}
return new NextResponse(legacyStream, {
headers: {
'Content-Type': 'text/plain; charset=utf-8',
'Cache-Control': 'no-cache',
'X-Advisor-Format': 'legacy-stream',
},
})
}
const mode = resolveMode(retrieved.clarity?.mode, !!context)
if (mode === 'clarify') {
const general = await completeAdvisorAnswer([
{ role: 'system', content: L1_SYSTEM + audienceBlock(audience) },
{ role: 'user', content: question },
])
if (general === null) {
return NextResponse.json({ error: 'LLM nicht erreichbar.' }, { status: 502 })
}
const resp: AdvisorResponse = {
mode: 'clarify',
question,
clarity: mapClarity(retrieved.clarity, 'clarify'),
general_answer: general,
answer: null,
scoped_query: null,
evidence: [],
citations: [],
visual_evidence: [],
footnotes: [],
}
return NextResponse.json(resp)
}
const evidence = retrieved.evidence ?? []
const soul = await readSoulFile('compliance-advisor')
const messages: ChatMessage[] = [ const messages: ChatMessage[] = [
{ role: 'system', content: systemContent }, { role: 'system', content: answerSystem(soul, country, numberedEvidenceForPrompt(evidence), true, audience) },
...history.slice(-6).map((h: { role: string; content: string }) => ({ ...history,
role: h.role === 'user' ? 'user' : 'assistant', { role: 'user', content: question },
content: h.content,
})),
{ role: 'user', content: message },
] ]
const answer = await completeAdvisorAnswer(messages)
// 4. LLM-Kaskade -> Plain-Text-Stream if (answer === null) {
const stream = await streamAdvisorAnswer(messages) return NextResponse.json({ error: 'LLM nicht erreichbar.' }, { status: 502 })
if (!stream) {
return NextResponse.json(
{ error: 'LLM nicht erreichbar. Weder OVH/LiteLLM noch Ollama haben geantwortet.' },
{ status: 502 },
)
} }
const resp: AdvisorResponse = {
return new NextResponse(stream, { mode: 'answer',
headers: { question,
'Content-Type': 'text/plain; charset=utf-8', clarity: mapClarity(retrieved.clarity, 'answer'),
'Cache-Control': 'no-cache', general_answer: null,
Connection: 'keep-alive', answer,
}, scoped_query: context,
}) evidence,
citations: buildCitations(evidence),
visual_evidence: retrieved.visual_evidence ?? [],
footnotes: mapFootnotes(retrieved.footnotes),
}
return NextResponse.json(resp)
} catch (error) { } catch (error) {
console.error('Compliance advisor chat error:', error) console.error('Compliance advisor chat error:', error)
return NextResponse.json( return NextResponse.json({ error: 'Verbindung zum Advisor fehlgeschlagen.' }, { status: 503 })
{ error: 'Verbindung zum LLM fehlgeschlagen.' },
{ status: 503 },
)
} }
} }
@@ -0,0 +1,200 @@
'use client'
// ETO / Onboarding-Advisor — thin operator surface over POST /api/compliance/onboarding/advisor-start.
// Certifications + target + scanner findings -> Silent Pass -> Advisor. NOT the regulation gap engine
// (/sdk/gap-analysis is a different flow: product -> applicable regulations). This tests the cert->delta
// case: "TISAX/ISO27001 -> CRA, what is auto-detected, what stays an open question?". No new backend.
import React, { useEffect, useState } from 'react'
const CERTS = ['ISO27001', 'TISAX', 'ISO9001', 'IEC62443', 'ISO13485', 'ISO14001', 'ASPICE', 'IATF16949']
// label -> {signal_id, source_type} — demonstrates all three signal KINDS (observation / partial / requirement)
const FINDINGS: Array<{ label: string; signal_id: string; source_type: string; kind: string }> = [
{ label: 'SBOM im Repo (CycloneDX/SPDX)', signal_id: 'cyclonedx_found', source_type: 'repository', kind: 'observation' },
{ label: 'security.txt / CVD-Policy veröffentlicht', signal_id: 'security_txt', source_type: 'website', kind: 'observation' },
{ label: 'Signierte Releases', signal_id: 'signed_releases', source_type: 'repository', kind: 'observation' },
{ label: 'Produkt-Risikobewertung (Dokument)', signal_id: 'risk_assessment_pdf', source_type: 'document', kind: 'observation' },
{ label: 'CI-Pipeline vorhanden (nur Indikation)', signal_id: 'github_actions_ci', source_type: 'repository', kind: 'partial' },
{ label: 'Cloud-/vernetztes Produkt', signal_id: 'cloud_hosted', source_type: 'product', kind: 'observation' },
{ label: 'Ausschreibung FORDERT SBOM (Requirement)', signal_id: 'requires_sbom', source_type: 'tender', kind: 'requirement' },
{ label: 'OEM FORDERT PSIRT (Requirement)', signal_id: 'supplier_requires_psirt', source_type: 'oem', kind: 'requirement' },
]
interface Question { capability_id: string; question_intent: string; why: string; information_value: number; priority: string }
interface Inferred { certification: string; capabilities: string[]; statement: string }
interface Rejected { certification?: string; statement: string; reason: string }
interface Measure { capability_id: string; leverage: number; closes: string[] }
interface AdvisorResponse {
silent_intake_summary: string; headline: string; auto_detected: string[]; indications: string[]
inferred_assumptions: Inferred[]; rejected_assumptions: Rejected[]; top_5_questions: Question[]
capability_delta: string[]; top_measures: Measure[]; evidence_requests: string[]
unsupported_domains: string[]; completeness_summary: string; capability_labels: Record<string, string>
}
const PROXY = '/api/sdk/v1/compliance/onboarding'
function Chips({ items, tone }: { items: string[]; tone: string }) {
if (!items.length) return <span className="text-gray-400 text-sm"></span>
return (
<div className="flex flex-wrap gap-2">
{items.map(c => <span key={c} className={`px-2.5 py-1 rounded-full text-xs font-medium ${tone}`}>{c}</span>)}
</div>
)
}
function Section({ title, hint, children }: { title: string; hint?: string; children: React.ReactNode }) {
return (
<div className="bg-white rounded-xl border border-gray-200 p-5">
<h3 className="font-semibold text-gray-900">{title}</h3>
{hint && <p className="text-xs text-gray-500 mt-0.5 mb-2">{hint}</p>}
<div className="mt-2">{children}</div>
</div>
)
}
export default function OnboardingAdvisorPage() {
const [targets, setTargets] = useState<string[]>([])
const [company, setCompany] = useState('Beispiel Maschinenbau')
const [industry, setIndustry] = useState('machine_builder')
const [certs, setCerts] = useState<string[]>(['ISO27001', 'ISO9001'])
const [target, setTarget] = useState('CRA')
const [findings, setFindings] = useState<string[]>(['cyclonedx_found', 'github_actions_ci', 'requires_sbom'])
const [knownEvidence, setKnownEvidence] = useState('CE-Prozess')
const [result, setResult] = useState<AdvisorResponse | null>(null)
const [loading, setLoading] = useState(false)
const [error, setError] = useState('')
useEffect(() => {
fetch(`${PROXY}/targets`).then(r => r.json()).then(d => {
if (Array.isArray(d.targets)) { setTargets(d.targets); if (!d.targets.includes('CRA') && d.targets[0]) setTarget(d.targets[0]) }
}).catch(() => {})
}, [])
const toggle = (list: string[], set: (v: string[]) => void, v: string) =>
set(list.includes(v) ? list.filter(x => x !== v) : [...list, v])
const lbl = (id: string) => result?.capability_labels?.[id] || id.replace(/_/g, ' ')
const run = async () => {
setLoading(true); setError(''); setResult(null)
try {
const scanner_findings = FINDINGS.filter(f => findings.includes(f.signal_id))
.map(f => ({ signal_id: f.signal_id, source_type: f.source_type }))
const res = await fetch(`${PROXY}/advisor-start`, {
method: 'POST', headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
company, industry, products: [], markets: ['EU'], certifications: certs,
known_evidence: knownEvidence ? knownEvidence.split(',').map(s => s.trim()).filter(Boolean) : [],
target, scanner_findings,
}),
})
if (!res.ok) throw new Error(await res.text())
setResult(await res.json())
} catch (e) {
setError(e instanceof Error ? e.message : 'Advisor fehlgeschlagen')
} finally { setLoading(false) }
}
// auto-recompute when certifications / target / scanner signals change (no button click needed)
useEffect(() => { if (certs.length) run() }, [certs, target, findings]) // eslint-disable-line react-hooks/exhaustive-deps
return (
<div className="min-h-screen bg-gray-50 py-8">
<div className="max-w-5xl mx-auto px-4">
<h1 className="text-3xl font-bold text-gray-900">ETO / Onboarding-Advisor</h1>
<p className="text-gray-600 mt-2 mb-6">
Zertifikate + Ziel + Scanner-Signale → Silent Pass → Capability-Delta + nächste beste Fragen.
Welt-1: ein Zertifikat <em>legt nahe</em>, beweist nichts (Verifikation erforderlich).
</p>
<div className="grid md:grid-cols-2 gap-4 mb-6">
<Section title="Unternehmen & Ziel">
<label className="block text-sm text-gray-600">Unternehmen
<input value={company} onChange={e => setCompany(e.target.value)} className="mt-1 w-full border rounded-lg px-3 py-2" /></label>
<label className="block text-sm text-gray-600 mt-3">Branche
<input value={industry} onChange={e => setIndustry(e.target.value)} className="mt-1 w-full border rounded-lg px-3 py-2" /></label>
<label className="block text-sm text-gray-600 mt-3">Ziel
<select value={target} onChange={e => setTarget(e.target.value)} className="mt-1 w-full border rounded-lg px-3 py-2">
{(targets.length ? targets : ['CRA']).map(t => <option key={t} value={t}>{t}</option>)}
</select></label>
<label className="block text-sm text-gray-600 mt-3">Vorhandene Nachweise (kommagetrennt)
<input value={knownEvidence} onChange={e => setKnownEvidence(e.target.value)} className="mt-1 w-full border rounded-lg px-3 py-2" /></label>
</Section>
<Section title="Zertifizierungen">
<div className="flex flex-wrap gap-2">
{CERTS.map(c => (
<button key={c} onClick={() => toggle(certs, setCerts, c)}
className={`px-3 py-1.5 rounded-lg text-sm border ${certs.includes(c) ? 'bg-blue-600 text-white border-blue-600' : 'bg-white text-gray-700 border-gray-300'}`}>{c}</button>
))}
</div>
</Section>
</div>
<Section title="Scanner-Signale (Silent Pass)" hint="observation = gesehen · partial = Indikation · requirement = gefordert (≠ vorhanden)">
<div className="grid sm:grid-cols-2 gap-2">
{FINDINGS.map(f => (
<label key={f.signal_id} className="flex items-center gap-2 text-sm text-gray-700">
<input type="checkbox" checked={findings.includes(f.signal_id)} onChange={() => toggle(findings, setFindings, f.signal_id)} />
<span>{f.label}</span>
<span className={`ml-auto text-[10px] px-1.5 py-0.5 rounded ${f.kind === 'requirement' ? 'bg-purple-100 text-purple-700' : f.kind === 'partial' ? 'bg-amber-100 text-amber-700' : 'bg-emerald-100 text-emerald-700'}`}>{f.kind}</span>
</label>
))}
</div>
</Section>
<button onClick={run} disabled={loading || !certs.length}
className="mt-6 w-full py-3 bg-blue-600 text-white rounded-xl font-medium hover:bg-blue-700 disabled:opacity-50">
{loading ? 'Analysiere…' : 'Advisor starten'}
</button>
{error && <div className="mt-6 bg-red-50 border border-red-200 rounded-lg p-4 text-red-700 text-sm whitespace-pre-wrap">{error}</div>}
{result && (
<div className="mt-8 space-y-4">
<div className="bg-blue-600 text-white rounded-xl p-5">
<div className="text-lg font-semibold">{result.headline}</div>
<div className="text-blue-100 text-sm mt-1">{result.silent_intake_summary}</div>
</div>
<div className="grid md:grid-cols-2 gap-4">
<Section title="Automatisch erkannt" hint="konkrete Artefakte nicht mehr gefragt"><Chips items={result.auto_detected.map(lbl)} tone="bg-emerald-100 text-emerald-800" /></Section>
<Section title="Indikationen" hint="erhöht Annahmestärke trotzdem gefragt"><Chips items={result.indications.map(lbl)} tone="bg-amber-100 text-amber-800" /></Section>
</div>
<Section title="Nächste beste Fragen" hint="max 5, jede erklärt sich selbst">
{result.top_5_questions.length ? (
<ol className="space-y-3">
{result.top_5_questions.map((q, i) => (
<li key={q.capability_id} className="border-l-2 border-blue-300 pl-3">
<div className="font-medium text-gray-900">{i + 1}. {lbl(q.capability_id)}</div>
<div className="text-sm text-gray-600">{q.why}</div>
</li>
))}
</ol>
) : <span className="text-gray-400 text-sm"></span>}
</Section>
<div className="grid md:grid-cols-2 gap-4">
<Section title="Wahrscheinlich abgedeckt (Welt-1)" hint="Zertifikat legt nahe Verifikation erforderlich">
{result.inferred_assumptions.length ? result.inferred_assumptions.map(a => (
<div key={a.certification} className="mb-2"><span className="font-medium">{a.certification}</span>: {a.capabilities.map(lbl).join(', ')}</div>
)) : <span className="text-gray-400 text-sm"></span>}
</Section>
<Section title="Nicht relevant" hint="relevance(evidence, target) = 0">
{result.rejected_assumptions.length ? result.rejected_assumptions.map((a, i) => (
<div key={i} className="mb-1 text-sm text-gray-700">{a.statement}</div>
)) : <span className="text-gray-400 text-sm"></span>}
</Section>
</div>
<div className="grid md:grid-cols-2 gap-4">
<Section title="Offene Lücken (Delta)"><Chips items={result.capability_delta.map(lbl)} tone="bg-gray-100 text-gray-700" /></Section>
<Section title="Geforderte Nachweise"><Chips items={result.evidence_requests} tone="bg-gray-100 text-gray-700" /></Section>
</div>
<Section title="Vollständigkeit" hint={result.unsupported_domains.length ? `nicht abgedeckt: ${result.unsupported_domains.join(', ')}` : undefined}>
<span className="text-sm text-gray-700">{result.completeness_summary || '—'}</span>
</Section>
</div>
)}
</div>
</div>
)
}
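The request body that `run` assembles can be pulled out into a pure function, which makes the comma-splitting of known evidence and the filtering of scanner signals easy to check without rendering the page. The following is a minimal sketch only: `buildAdvisorPayload`, `parseKnownEvidence`, and the `Finding` and `AdvisorPayload` interfaces are hypothetical names that do not appear in the commit; the field names and fixed values (`products: []`, `markets: ['EU']`) are copied from the code above.

```typescript
// Hypothetical helpers, not part of the commit: they mirror the body built inside run().
interface Finding { label: string; signal_id: string; source_type: string; kind: string }

interface AdvisorPayload {
  company: string
  industry: string
  products: string[]
  markets: string[]
  certifications: string[]
  known_evidence: string[]
  target: string
  scanner_findings: { signal_id: string; source_type: string }[]
}

// Split a comma-separated input into trimmed, non-empty entries.
function parseKnownEvidence(raw: string): string[] {
  return raw.split(',').map(s => s.trim()).filter(Boolean)
}

function buildAdvisorPayload(
  form: { company: string; industry: string; certs: string[]; target: string; knownEvidence: string },
  catalog: Finding[],
  selected: string[],
): AdvisorPayload {
  return {
    company: form.company,
    industry: form.industry,
    products: [],
    markets: ['EU'],
    certifications: form.certs,
    known_evidence: parseKnownEvidence(form.knownEvidence),
    target: form.target,
    // Only signal_id and source_type are sent; label and kind stay client-side.
    scanner_findings: catalog
      .filter(f => selected.includes(f.signal_id))
      .map(f => ({ signal_id: f.signal_id, source_type: f.source_type })),
  }
}
```

Keeping this logic outside the component would also let the auto-recompute effect and the button share one code path without duplicating the payload shape.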
@@ -1,131 +0,0 @@
'use client'
// =============================================================================
// ComplianceAdvisorWidget — shared constants and sub-components
// =============================================================================
// =============================================================================
// EXAMPLE QUESTIONS
// =============================================================================
export const EXAMPLE_QUESTIONS: Record<string, string[]> = {
vvt: [
'Was ist ein Verarbeitungsverzeichnis?',
'Welche Informationen muss ich erfassen?',
'Wie dokumentiere ich die Rechtsgrundlage?',
],
'compliance-scope': [
'Was bedeutet L3?',
'Wann brauche ich eine DSFA?',
'Was ist der Unterschied zwischen L2 und L3?',
],
tom: [
'Was sind TOM?',
'Welche Massnahmen sind erforderlich?',
'Wie dokumentiere ich Verschluesselung?',
],
dsfa: [
'Was ist eine DSFA?',
'Wann ist eine DSFA verpflichtend?',
'Wie bewerte ich Risiken?',
],
loeschfristen: [
'Wie definiere ich Loeschfristen?',
'Was ist der Unterschied zwischen Loeschpflicht und Aufbewahrungspflicht?',
'Wann muss ich Daten loeschen?',
],
default: [
'Wie starte ich mit dem SDK?',
'Was ist der erste Schritt?',
'Welche Compliance-Anforderungen gelten fuer KI-Systeme?',
],
}
// =============================================================================
// TYPES
// =============================================================================
export interface Message {
id: string
role: 'user' | 'agent'
content: string
timestamp: Date
}
// =============================================================================
// EmptyState — shown when no messages yet
// =============================================================================
interface EmptyStateProps {
exampleQuestions: string[]
onExampleClick: (question: string) => void
}
export function AdvisorEmptyState({ exampleQuestions, onExampleClick }: EmptyStateProps) {
return (
<div className="text-center py-8">
<div className="w-16 h-16 bg-purple-100 rounded-full flex items-center justify-center mx-auto mb-4">
<svg className="w-8 h-8 text-purple-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M8 10h.01M12 10h.01M16 10h.01M9 16H5a2 2 0 01-2-2V6a2 2 0 012-2h14a2 2 0 012 2v8a2 2 0 01-2 2h-5l-5 5v-5z" />
</svg>
</div>
<h3 className="text-sm font-medium text-gray-900 mb-2">Willkommen beim Compliance Advisor</h3>
<p className="text-xs text-gray-500 mb-4">Stellen Sie Fragen zu DSGVO, KI-Verordnung und mehr.</p>
<div className="text-left space-y-2">
<p className="text-xs font-medium text-gray-700 mb-2">Beispielfragen:</p>
{exampleQuestions.map((question, idx) => (
<button
key={idx}
onClick={() => onExampleClick(question)}
className="w-full text-left px-3 py-2 text-xs bg-white hover:bg-purple-50 border border-gray-200 rounded-lg transition-colors text-gray-700"
>
{question}
</button>
))}
</div>
</div>
)
}
// =============================================================================
// MessageList — renders messages + typing indicator
// =============================================================================
interface MessageListProps {
messages: Message[]
isTyping: boolean
messagesEndRef: React.RefObject<HTMLDivElement | null>
}
export function AdvisorMessageList({ messages, isTyping, messagesEndRef }: MessageListProps) {
return (
<>
{messages.map((message) => (
<div key={message.id} className={`flex ${message.role === 'user' ? 'justify-end' : 'justify-start'}`}>
<div className={`max-w-[80%] rounded-lg px-3 py-2 ${message.role === 'user' ? 'bg-indigo-600 text-white' : 'bg-white border border-gray-200 text-gray-800'}`}>
<p className={`text-sm ${message.role === 'agent' ? 'whitespace-pre-wrap' : ''}`}>
{message.content || (message.role === 'agent' && isTyping ? '' : message.content)}
</p>
<p className={`text-xs mt-1 ${message.role === 'user' ? 'text-indigo-200' : 'text-gray-400'}`}>
{message.timestamp.toLocaleTimeString('de-DE', { hour: '2-digit', minute: '2-digit' })}
</p>
</div>
</div>
))}
{isTyping && (
<div className="flex justify-start">
<div className="bg-white border border-gray-200 rounded-lg px-3 py-2">
<div className="flex space-x-1">
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" style={{ animationDelay: '0.1s' }} />
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" style={{ animationDelay: '0.2s' }} />
</div>
</div>
</div>
)}
<div ref={messagesEndRef as React.RefObject<HTMLDivElement>} />
</>
)
}
@@ -1,198 +1,78 @@
'use client' 'use client'
import { useState, useEffect, useRef, useCallback } from 'react' import { useCallback, useState } from 'react'
import { EXAMPLE_QUESTIONS, AdvisorEmptyState, AdvisorMessageList, type Message } from './ComplianceAdvisorParts' import {
Check,
// ============================================================================= Expand,
// TYPES Loader2,
// ============================================================================= Mail,
Maximize2,
MessagesSquare,
Minimize2,
Plus,
Send,
Shrink,
Square,
X,
} from 'lucide-react'
import { EXAMPLE_QUESTIONS } from './advisor/EmptyState'
import { EvidenceWorkspace } from './advisor/EvidenceWorkspace'
import { useAdvisorCase } from './advisor/useAdvisorCase'
import { useAdvisorEmail } from './advisor/useAdvisorEmail'
interface ComplianceAdvisorWidgetProps { interface ComplianceAdvisorWidgetProps {
currentStep?: string currentStep?: string
} }
type Country = 'DE' | 'AT' | 'CH' | 'EU' type Country = 'DE' | 'AT' | 'CH' | 'EU'
const COUNTRIES: Country[] = ['DE', 'AT', 'CH', 'EU']
const COUNTRIES: { code: Country; label: string }[] = [ type View = 'compact' | 'panel' | 'fullscreen'
{ code: 'DE', label: 'DE' }, const SIZE: Record<View, string> = {
{ code: 'AT', label: 'AT' }, compact: 'bottom-6 right-6 h-[560px] w-[420px] rounded-2xl',
{ code: 'CH', label: 'CH' }, panel: 'bottom-6 right-6 h-[85vh] w-[960px] rounded-2xl',
{ code: 'EU', label: 'EU' }, fullscreen: 'inset-0 h-full w-full',
] }
// =============================================================================
// COMPONENT
// =============================================================================
/**
* Compliance Advisor — a floating Case Workspace on every SDK page (compact / panel / fullscreen).
* Renders ONLY structured SDK data (clarify/answer contract); it never parses the answer text.
* See memory: advisor-evidence-workspace-no-parse, advisor-clarity-gate-contract.
*/
export function ComplianceAdvisorWidget({ currentStep = 'default' }: ComplianceAdvisorWidgetProps) { export function ComplianceAdvisorWidget({ currentStep = 'default' }: ComplianceAdvisorWidgetProps) {
const [isOpen, setIsOpen] = useState(false) const [isOpen, setIsOpen] = useState(false)
const [isExpanded, setIsExpanded] = useState(false) const [view, setView] = useState<View>('compact')
const [messages, setMessages] = useState<Message[]>([])
const [inputValue, setInputValue] = useState('') const [inputValue, setInputValue] = useState('')
const [isTyping, setIsTyping] = useState(false) const [country, setCountry] = useState<Country>('DE')
const [selectedCountry, setSelectedCountry] = useState<Country>('DE')
const messagesEndRef = useRef<HTMLDivElement>(null)
const abortControllerRef = useRef<AbortController | null>(null)
const { cases, threads, busy, activeCaseId, ask, newTopic, selectContext, selectCase, remove, stop } =
useAdvisorCase({ currentStep, country })
const email = useAdvisorEmail(cases, country, currentStep)
const exampleQuestions = EXAMPLE_QUESTIONS[currentStep] || EXAMPLE_QUESTIONS.default const exampleQuestions = EXAMPLE_QUESTIONS[currentStep] || EXAMPLE_QUESTIONS.default
const expanded = view !== 'compact'
useEffect(() => { const submit = useCallback(
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' }) (q: string) => {
}, [messages]) if (!q.trim() || busy) return
useEffect(() => {
return () => {
abortControllerRef.current?.abort()
}
}, [])
const handleSendMessage = useCallback(
async (content: string) => {
if (!content.trim() || isTyping) return
const userMessage: Message = {
id: `msg-${Date.now()}`,
role: 'user',
content: content.trim(),
timestamp: new Date(),
}
setMessages((prev) => [...prev, userMessage])
setInputValue('') setInputValue('')
setIsTyping(true) ask(q)
const agentMessageId = `msg-${Date.now()}-agent`
abortControllerRef.current = new AbortController()
try {
const history = messages.map((m) => ({
role: m.role === 'user' ? 'user' : 'assistant',
content: m.content,
}))
const response = await fetch('/api/sdk/compliance-advisor/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: content.trim(),
history,
currentStep,
country: selectedCountry,
}),
signal: abortControllerRef.current.signal,
})
if (!response.ok) {
const errorData = await response.json().catch(() => ({ error: 'Unbekannter Fehler' }))
throw new Error(errorData.error || `Server-Fehler (${response.status})`)
}
setMessages((prev) => [
...prev,
{ id: agentMessageId, role: 'agent', content: '', timestamp: new Date() },
])
const reader = response.body!.getReader()
const decoder = new TextDecoder()
let accumulated = ''
while (true) {
const { done, value } = await reader.read()
if (done) break
accumulated += decoder.decode(value, { stream: true })
const currentText = accumulated
setMessages((prev) =>
prev.map((m) => (m.id === agentMessageId ? { ...m, content: currentText } : m))
)
requestAnimationFrame(() => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' })
})
}
setIsTyping(false)
} catch (error) {
if ((error as Error).name === 'AbortError') {
setIsTyping(false)
return
}
const errorMessage = error instanceof Error ? error.message : 'Verbindung fehlgeschlagen'
setMessages((prev) => {
const hasAgent = prev.some((m) => m.id === agentMessageId)
if (hasAgent) {
return prev.map((m) =>
m.id === agentMessageId ? { ...m, content: `Fehler: ${errorMessage}` } : m
)
}
return [
...prev,
{ id: agentMessageId, role: 'agent' as const, content: `Fehler: ${errorMessage}`, timestamp: new Date() },
]
})
setIsTyping(false)
}
}, },
[isTyping, messages, currentStep, selectedCountry] [busy, ask],
) )
const handleStopGeneration = useCallback(() => { const submitNewTopic = useCallback(
abortControllerRef.current?.abort() (q: string) => {
setIsTyping(false) if (!q.trim() || busy) return
}, []) setInputValue('')
newTopic(q)
},
[busy, newTopic],
)
const [emailSending, setEmailSending] = useState(false) const onKeyDown = (e: React.KeyboardEvent) => {
const [emailSent, setEmailSent] = useState(false)
const handleSendAsEmail = useCallback(async () => {
if (messages.length === 0 || emailSending) return
setEmailSending(true)
try {
// Build HTML from chat messages
const qaPairs = messages.reduce<{ q: string; a: string }[]>((acc, m, i) => {
if (m.role === 'user') {
const next = messages[i + 1]
acc.push({ q: m.content, a: next?.role === 'agent' ? next.content : '(keine Antwort)' })
}
return acc
}, [])
const qaHtml = qaPairs.map(({ q, a }) =>
`<div style="margin-bottom:16px;"><p style="font-weight:600;color:#1e293b;">Frage: ${q}</p><p style="color:#475569;white-space:pre-wrap;">${a}</p></div>`
).join('')
const bodyHtml = `
<h2 style="color:#1e293b;">Compliance Advisor — Beratungsprotokoll</h2>
<p style="color:#64748b;font-size:13px;">Datum: ${new Date().toLocaleString('de-DE')} | Land: ${selectedCountry} | Kontext: ${currentStep}</p>
<hr style="border-color:#e2e8f0;margin:16px 0;">
${qaHtml}
<hr style="border-color:#e2e8f0;margin:16px 0;">
<p style="color:#94a3b8;font-size:11px;">Automatisch erstellt vom BreakPilot Compliance Advisor (Qwen)</p>
`
await fetch('/api/sdk/v1/agent/notify', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
recipient: 'dsb@breakpilot.local',
subject: `Compliance Advisor — ${qaPairs.length} Fragen (${currentStep})`,
body_html: bodyHtml,
role: 'Datenschutzbeauftragter',
}),
})
setEmailSent(true)
setTimeout(() => setEmailSent(false), 3000)
} catch (e) {
console.error('Email send failed:', e)
} finally {
setEmailSending(false)
}
}, [messages, emailSending, selectedCountry, currentStep])
const handleKeyDown = (e: React.KeyboardEvent) => {
if (e.key === 'Enter' && !e.shiftKey) { if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault() e.preventDefault()
handleSendMessage(inputValue) submit(inputValue)
} }
} }
@@ -200,136 +80,132 @@ export function ComplianceAdvisorWidget({ currentStep = 'default' }: ComplianceA
return ( return (
<button <button
onClick={() => setIsOpen(true)} onClick={() => setIsOpen(true)}
className="fixed bottom-6 right-[5.5rem] w-14 h-14 bg-indigo-600 hover:bg-indigo-700 text-white rounded-full shadow-lg flex items-center justify-center transition-all duration-200 hover:scale-110 z-50" className="fixed bottom-6 right-[5.5rem] z-50 flex h-14 w-14 items-center justify-center rounded-full bg-indigo-600 text-white shadow-lg transition-all duration-200 hover:scale-110 hover:bg-indigo-700"
aria-label="Compliance Advisor oeffnen" aria-label="Compliance Advisor oeffnen"
> >
<svg className="w-6 h-6" fill="none" stroke="currentColor" viewBox="0 0 24 24"> <MessagesSquare className="h-6 w-6" />
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M8 10h.01M12 10h.01M16 10h.01M9 16H5a2 2 0 01-2-2V6a2 2 0 012-2h14a2 2 0 012 2v8a2 2 0 01-2 2h-5l-5 5v-5z" />
</svg>
</button> </button>
) )
} }
const headRound = view === 'fullscreen' ? '' : 'rounded-t-2xl'
const footRound = view === 'fullscreen' ? '' : 'rounded-b-2xl'
return ( return (
<div className={`fixed bottom-6 right-6 ${isExpanded ? 'w-[700px] h-[80vh]' : 'w-[400px] h-[500px]'} max-h-screen bg-white rounded-2xl shadow-2xl flex flex-col z-50 border border-gray-200 transition-all duration-200`}> <div
{/* Header */} className={`fixed z-50 flex max-h-screen flex-col border border-gray-200 bg-white shadow-2xl transition-all duration-200 ${SIZE[view]}`}
<div className="bg-gradient-to-r from-purple-600 to-indigo-600 text-white px-4 py-3 rounded-t-2xl flex items-center justify-between"> >
<div className={`flex items-center justify-between bg-gradient-to-r from-purple-600 to-indigo-600 px-4 py-3 text-white ${headRound}`}>
<div className="flex items-center gap-2"> <div className="flex items-center gap-2">
<div className="w-8 h-8 bg-white/20 rounded-full flex items-center justify-center"> <div className="flex h-8 w-8 items-center justify-center rounded-full bg-white/20">
<svg className="w-5 h-5" fill="none" stroke="currentColor" viewBox="0 0 24 24"> <MessagesSquare className="h-5 w-5" />
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9.663 17h4.673M12 3v1m6.364 1.636l-.707.707M21 12h-1M4 12H3m3.343-5.657l-.707-.707m2.828 9.9a5 5 0 117.072 0l-.548.547A3.374 3.374 0 0014 18.469V19a2 2 0 11-4 0v-.531c0-.895-.356-1.754-.988-2.386l-.548-.547z" />
</svg>
</div> </div>
<div> <div>
<div className="font-semibold text-sm">Compliance Advisor</div> <div className="text-sm font-semibold">Compliance Advisor</div>
<div className="flex items-center gap-1 mt-0.5"> <div className="mt-0.5 flex items-center gap-1">
{COUNTRIES.map(({ code, label }) => ( {COUNTRIES.map((c) => (
<button <button
key={code} key={c}
onClick={() => setSelectedCountry(code)} onClick={() => setCountry(c)}
className={`px-1.5 py-0.5 text-[10px] font-medium rounded transition-colors ${selectedCountry === code ? 'bg-white text-indigo-700' : 'bg-white/15 text-white/80 hover:bg-white/25'}`} className={`rounded px-1.5 py-0.5 text-[10px] font-medium transition-colors ${
country === c ? 'bg-white text-indigo-700' : 'bg-white/15 text-white/80 hover:bg-white/25'
}`}
> >
{label} {c}
</button> </button>
))} ))}
</div> </div>
</div> </div>
</div> </div>
<div className="flex items-center gap-1"> <div className="flex items-center gap-1">
{/* Send as Email */} {cases.length > 0 && (
{messages.length > 0 && (
<button <button
onClick={handleSendAsEmail} onClick={email.send}
disabled={emailSending} disabled={email.sending}
className={`text-white/80 hover:text-white transition-colors ${emailSent ? 'text-green-300' : ''}`} className={`text-white/80 transition-colors hover:text-white ${email.sent ? 'text-green-300' : ''}`}
title={email.sent ? 'Email gesendet!' : 'Beratungsprotokoll als Email senden'}
aria-label="Als Email an DSB senden" aria-label="Als Email an DSB senden"
title={emailSent ? 'Email gesendet!' : 'Beratungsprotokoll als Email senden'}
> >
{emailSent ? ( {email.sent ? <Check className="h-5 w-5" /> : email.sending ? <Loader2 className="h-5 w-5 animate-spin" /> : <Mail className="h-5 w-5" />}
<svg className="w-5 h-5" fill="none" stroke="currentColor" viewBox="0 0 24 24"> </button>
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M5 13l4 4L19 7" /> )}
</svg> {view !== 'fullscreen' && (
) : emailSending ? ( <button
<svg className="w-5 h-5 animate-spin" fill="none" viewBox="0 0 24 24"> onClick={() => setView((v) => (v === 'compact' ? 'panel' : 'compact'))}
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" /> className="text-white/80 transition-colors hover:text-white"
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" /> aria-label={view === 'compact' ? 'Vergroessern' : 'Verkleinern'}
</svg> >
) : ( {view === 'compact' ? <Maximize2 className="h-5 w-5" /> : <Minimize2 className="h-5 w-5" />}
<svg className="w-5 h-5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M3 8l7.89 5.26a2 2 0 002.22 0L21 8M5 19h14a2 2 0 002-2V7a2 2 0 00-2-2H5a2 2 0 00-2 2v10a2 2 0 002 2z" />
</svg>
)}
</button> </button>
)} )}
<button <button
onClick={() => setIsExpanded(!isExpanded)} onClick={() => setView((v) => (v === 'fullscreen' ? 'panel' : 'fullscreen'))}
className="text-white/80 hover:text-white transition-colors" className="text-white/80 transition-colors hover:text-white"
aria-label={isExpanded ? 'Verkleinern' : 'Vergroessern'} aria-label={view === 'fullscreen' ? 'Vollbild verlassen' : 'Vollbild'}
title={view === 'fullscreen' ? 'Vollbild verlassen' : 'Vollbild'}
> >
<svg className="w-5 h-5" fill="none" stroke="currentColor" viewBox="0 0 24 24"> {view === 'fullscreen' ? <Shrink className="h-5 w-5" /> : <Expand className="h-5 w-5" />}
{isExpanded ? (
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 9L4 4m0 0v4m0-4h4m6 6l5 5m0 0v-4m0 4h-4" />
) : (
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M4 8V4m0 0h4M4 4l5 5m11-1V4m0 0h-4m4 0l-5 5M4 16v4m0 0h4m-4 0l5-5m11 5v-4m0 4h-4m4 0l-5-5" />
)}
</svg>
</button> </button>
<button onClick={() => setIsOpen(false)} className="text-white/80 hover:text-white transition-colors" aria-label="Schliessen"> <button
<svg className="w-5 h-5" fill="none" stroke="currentColor" viewBox="0 0 24 24"> onClick={() => setIsOpen(false)}
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M6 18L18 6M6 6l12 12" /> className="text-white/80 transition-colors hover:text-white"
</svg> aria-label="Schliessen"
>
<X className="h-5 w-5" />
</button> </button>
</div> </div>
</div> </div>
{/* Messages Area */} <EvidenceWorkspace
<div className="flex-1 overflow-y-auto p-4 space-y-4 bg-gray-50"> cases={cases}
{messages.length === 0 ? ( threads={threads}
<AdvisorEmptyState expanded={expanded}
exampleQuestions={exampleQuestions} busy={busy}
onExampleClick={(q) => handleSendMessage(q)} activeCaseId={activeCaseId}
/> exampleQuestions={exampleQuestions}
) : ( onExample={submit}
<AdvisorMessageList onSelectContext={selectContext}
messages={messages} onSelectCase={selectCase}
isTyping={isTyping} onRemove={remove}
messagesEndRef={messagesEndRef} />
/>
)}
</div>
{/* Input Area */} <div className={`border-t border-gray-200 bg-white p-3 ${footRound}`}>
<div className="border-t border-gray-200 p-3 bg-white rounded-b-2xl">
<div className="flex gap-2"> <div className="flex gap-2">
<input <input
type="text" type="text"
value={inputValue} value={inputValue}
onChange={(e) => setInputValue(e.target.value)} onChange={(e) => setInputValue(e.target.value)}
onKeyDown={handleKeyDown} onKeyDown={onKeyDown}
placeholder="Frage eingeben..." placeholder={cases.length > 0 ? 'Folgefrage eingeben...' : 'Frage eingeben...'}
disabled={isTyping} disabled={busy}
className="flex-1 px-3 py-2 text-sm border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-purple-500 focus:border-transparent disabled:opacity-50" className="flex-1 rounded-lg border border-gray-300 px-3 py-2 text-sm focus:border-transparent focus:outline-none focus:ring-2 focus:ring-purple-500 disabled:opacity-50"
/> />
{isTyping ? ( {busy ? (
<button <button onClick={stop} className="rounded-lg bg-red-500 px-4 py-2 text-white transition-colors hover:bg-red-600" title="Abbrechen">
onClick={handleStopGeneration} <Square className="h-5 w-5" />
className="px-4 py-2 bg-red-500 text-white rounded-lg hover:bg-red-600 transition-colors"
title="Generierung stoppen"
>
<svg className="w-5 h-5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M6 6h12v12H6z" />
</svg>
</button> </button>
) : ( ) : (
<button <>
onClick={() => handleSendMessage(inputValue)} {cases.length > 0 && (
disabled={!inputValue.trim()} <button
className="px-4 py-2 bg-indigo-600 text-white rounded-lg hover:bg-indigo-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors" onClick={() => submitNewTopic(inputValue)}
> disabled={!inputValue.trim()}
<svg className="w-5 h-5" fill="none" stroke="currentColor" viewBox="0 0 24 24"> className="rounded-lg border border-gray-300 px-3 py-2 text-gray-600 transition-colors hover:bg-gray-50 disabled:cursor-not-allowed disabled:opacity-50"
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M12 19l9 2-9-18-9 18 9-2zm0 0v-8" /> title="Als neues Thema stellen"
</svg> aria-label="Neues Thema"
</button> >
<Plus className="h-5 w-5" />
</button>
)}
<button
onClick={() => submit(inputValue)}
disabled={!inputValue.trim()}
className="rounded-lg bg-indigo-600 px-4 py-2 text-white transition-colors hover:bg-indigo-700 disabled:cursor-not-allowed disabled:opacity-50"
title={cases.length > 0 ? 'Folgefrage senden' : 'Frage senden'}
>
<Send className="h-5 w-5" />
</button>
</>
)} )}
</div> </div>
</div> </div>
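The two header buttons in the new widget drive a small three-state view machine. The sketch below restates the inline `setView` updaters as named pure functions so the transitions can be read at a glance. The names `toggleSize` and `toggleFullscreen` are illustrative assumptions and are not defined in the commit; the `View` type is copied from it.

```typescript
type View = 'compact' | 'panel' | 'fullscreen'

// Size button: rendered only outside fullscreen; flips compact <-> panel.
function toggleSize(v: View): View {
  return v === 'compact' ? 'panel' : 'compact'
}

// Fullscreen button: entering works from any view, leaving always lands on panel.
function toggleFullscreen(v: View): View {
  return v === 'fullscreen' ? 'panel' : 'fullscreen'
}

// Mirrors `const expanded = view !== 'compact'` in the widget.
function isExpanded(v: View): boolean {
  return v !== 'compact'
}
```

One consequence of these transitions is that a user who goes compact, then fullscreen, then leaves fullscreen ends up in the panel view, not back in compact.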
@@ -0,0 +1,68 @@
import { describe, it, expect, vi } from 'vitest'
import { render, fireEvent } from '@testing-library/react'
import { CaseView } from './CaseView'
import type { AdvisorCase } from './useAdvisorCase'
import type { AdvisorResponse } from '@/lib/sdk/advisor/contract'
const clarify: AdvisorResponse = {
mode: 'clarify',
question: 'Was ist PDCA?',
clarity: {
is_underspecified: true,
concentration: 0.38,
suggested_contexts: [
{ id: 'datenschutz', label: 'Datenschutz' },
{ id: 'qm', label: 'Qualitätsmanagement' },
],
},
general_answer: 'PDCA steht für **Plan-Do-Check-Act**.',
answer: null,
evidence: [],
citations: [],
visual_evidence: [],
footnotes: [],
}
const answer: AdvisorResponse = {
mode: 'answer',
question: 'PDCA im Datenschutz?',
clarity: { is_underspecified: false, dominant_context: 'datenschutz', concentration: 0.88 },
answer: 'Der DSM-Zyklus [1] beschreibt den Ablauf.',
evidence: [
{ evidence_id: 'e1', document: 'DSK Sdm B41', section: 'Art. 5', paragraph: 'Abs. 2', snippet: 'x' },
],
citations: [
{ citation_id: 'c1', evidence_id: 'e1', document: 'DSK Sdm B41', section: 'Art. 5', paragraph: 'Abs. 2' },
],
visual_evidence: [
{ visual_id: 'v1', visual_type: 'flowchart', caption: 'PDCA-Zyklus', document: 'DSK SDM', vision_summary: 's' },
],
footnotes: [],
}
function mk(response: AdvisorResponse): AdvisorCase {
return { id: 'case1', threadId: 'thread1', question: response.question, response, selectedContext: null, status: 'done' }
}
describe('CaseView — clarify mode', () => {
it('renders the L1 general answer + context chips and fires onSelectContext', () => {
const onSel = vi.fn()
const { container, getByText } = render(
<CaseView c={mk(clarify)} busy={false} onSelectContext={onSel} />,
)
expect(container.textContent).toContain('Plan-Do-Check-Act')
expect(container.textContent).toContain('Allgemeine Definition')
fireEvent.click(getByText('Datenschutz'))
expect(onSel).toHaveBeenCalledWith('datenschutz')
})
})
describe('CaseView — answer mode', () => {
it('renders answer with a clickable [n] citation, grouped evidence (friendly name), and visual', () => {
const { container } = render(<CaseView c={mk(answer)} busy={false} onSelectContext={() => {}} />)
expect(container.textContent).toContain('DSM-Zyklus')
expect(container.querySelector('button[title="Beleg 1 anzeigen"]')).not.toBeNull()
expect(container.textContent).toContain('DSK Standard-Datenschutzmodell')
expect(container.textContent).toContain('PDCA-Zyklus')
})
})
@@ -0,0 +1,107 @@
'use client'
import { Check, Copy, Trash2 } from 'lucide-react'
import type { AdvisorResponse } from '@/lib/sdk/advisor/contract'
import { formatCaseForCopy } from '@/lib/sdk/advisor/copy'
import type { AdvisorCase } from './useAdvisorCase'
import { ClarifyView } from './ClarifyView'
import { EvidenceSummary } from './EvidenceSummary'
import { EvidencePane } from './EvidencePane'
import { VisualEvidencePane } from './VisualEvidencePane'
import { FootnotesPane } from './FootnotesPane'
import { Markdown } from './Markdown'
import { useCitationHighlight } from './useCitationHighlight'
import { useClipboard } from './useClipboard'
export function LoadingDots() {
return (
<div className="flex space-x-1 px-1 py-2" role="status" aria-label="Antwort wird erstellt">
<span className="h-2 w-2 animate-bounce rounded-full bg-gray-400" />
<span className="h-2 w-2 animate-bounce rounded-full bg-gray-400" style={{ animationDelay: '0.1s' }} />
<span className="h-2 w-2 animate-bounce rounded-full bg-gray-400" style={{ animationDelay: '0.2s' }} />
</div>
)
}
export function ErrorBox({ msg }: { msg?: string }) {
return (
<div className="rounded-lg border border-red-200 bg-red-50 px-3 py-2 text-sm text-red-700">
{msg || 'Verbindung fehlgeschlagen'}
</div>
)
}
/** Answer mode body (stacked): summary + answer (with [n] coupling) + evidence/visual/footnotes. */
export function AnswerBody({ response }: { response: AdvisorResponse }) {
const { highlightedId, cite } = useCitationHighlight(response.citations)
return (
<div className="space-y-3">
<EvidenceSummary response={response} />
<div className="rounded-lg border border-gray-200 bg-white px-3 py-2">
<Markdown content={response.answer || ''} citations={cite} />
</div>
<EvidencePane evidence={response.evidence} highlightedId={highlightedId} />
<VisualEvidencePane items={response.visual_evidence} />
<FootnotesPane footnotes={response.footnotes} />
</div>
)
}
/** One case rendered stacked (narrow mode). Clarify -> L1 + chips; answer -> full evidence body. */
export function CaseView({
c,
onSelectContext,
busy,
showQuestion,
onRemove,
}: {
c: AdvisorCase
onSelectContext: (ctx: string) => void
busy: boolean
showQuestion?: boolean
onRemove?: () => void
}) {
const r = c.response
const { copiedKey, copy } = useClipboard()
return (
<div className="group space-y-2 border-b border-gray-100 pb-4 last:border-0">
<div className="flex items-start justify-between gap-2">
{showQuestion ? (
<div className="min-w-0 flex-1 text-xs text-gray-500">
<span className="font-medium text-gray-400">Frage:</span> {c.question}
</div>
) : (
<span className="flex-1" />
)}
<div className="flex shrink-0 items-center gap-0.5 text-gray-400 opacity-0 transition-opacity group-hover:opacity-100">
<button
type="button"
title="Frage & Antwort kopieren"
aria-label="Frage & Antwort kopieren"
onClick={() => copy(c.id, formatCaseForCopy(c))}
className="rounded p-0.5 hover:bg-gray-100 hover:text-gray-700"
>
{copiedKey === c.id ? <Check className="h-3.5 w-3.5 text-green-600" /> : <Copy className="h-3.5 w-3.5" />}
</button>
{onRemove && (
<button
type="button"
title="Frage löschen"
aria-label="Frage löschen"
onClick={onRemove}
className="rounded p-0.5 hover:bg-gray-100 hover:text-gray-700"
>
<Trash2 className="h-3.5 w-3.5" />
</button>
)}
</div>
</div>
{c.status === 'loading' && <LoadingDots />}
{c.status === 'error' && <ErrorBox msg={c.error} />}
{r && r.mode === 'clarify' && (
<ClarifyView response={r} onSelectContext={onSelectContext} busy={busy} />
)}
{r && r.mode === 'answer' && <AnswerBody response={r} />}
</div>
)
}
@@ -0,0 +1,52 @@
'use client'
import { Info } from 'lucide-react'
import type { AdvisorResponse } from '@/lib/sdk/advisor/contract'
import { Markdown } from './Markdown'
/**
* Clarify mode: a short general (L1) definition — explicitly marked as general, no legal source —
* plus domain context chips. Picking a chip re-runs the case scoped to that domain (-> L2).
*/
export function ClarifyView({
response,
onSelectContext,
busy,
}: {
response: AdvisorResponse
onSelectContext: (id: string) => void
busy: boolean
}) {
const chips = response.clarity.suggested_contexts ?? []
return (
<div className="space-y-3">
<div className="rounded-lg border border-amber-200 bg-amber-50 px-3 py-2">
<div className="mb-1 flex items-center gap-1 text-[11px] font-semibold text-amber-700">
<Info className="h-3.5 w-3.5" />
Allgemeine Definition (ohne Rechtsquelle)
</div>
<Markdown content={response.general_answer || ''} />
</div>
{chips.length > 0 && (
<div>
<div className="mb-1.5 text-xs font-medium text-gray-700">
Meintest du einen bestimmten Kontext?
</div>
<div className="flex flex-wrap gap-1.5">
{chips.map((c) => (
<button
key={c.id}
type="button"
disabled={busy}
onClick={() => onSelectContext(c.id)}
className="rounded-full border border-indigo-200 bg-white px-3 py-1 text-xs font-medium text-indigo-700 transition-colors hover:bg-indigo-50 disabled:opacity-50"
>
{c.label}
</button>
))}
</div>
</div>
)}
</div>
)
}
@@ -0,0 +1,64 @@
'use client'
import { ShieldCheck } from 'lucide-react'
export const EXAMPLE_QUESTIONS: Record<string, string[]> = {
vvt: [
'Was ist ein Verarbeitungsverzeichnis?',
'Welche Informationen muss ich erfassen?',
'Wie dokumentiere ich die Rechtsgrundlage?',
],
'compliance-scope': [
'Was bedeutet L3?',
'Wann brauche ich eine DSFA?',
'Was ist der Unterschied zwischen L2 und L3?',
],
tom: [
'Was sind TOM?',
'Welche Massnahmen sind erforderlich?',
'Wie dokumentiere ich Verschluesselung?',
],
dsfa: ['Was ist eine DSFA?', 'Wann ist eine DSFA verpflichtend?', 'Wie bewerte ich Risiken?'],
loeschfristen: [
'Wie definiere ich Loeschfristen?',
'Unterschied Loeschpflicht und Aufbewahrungspflicht?',
'Wann muss ich Daten loeschen?',
],
default: [
'Wie starte ich mit dem SDK?',
'Was ist der erste Schritt?',
'Welche Compliance-Anforderungen gelten fuer KI-Systeme?',
],
}
export function AdvisorEmptyState({
exampleQuestions,
onExampleClick,
}: {
exampleQuestions: string[]
onExampleClick: (question: string) => void
}) {
return (
<div className="px-4 py-8 text-center">
<div className="mx-auto mb-3 flex h-14 w-14 items-center justify-center rounded-full bg-indigo-100">
<ShieldCheck className="h-7 w-7 text-indigo-600" />
</div>
<h3 className="text-sm font-semibold text-gray-900">Compliance Advisor</h3>
<p className="mx-auto mt-1 max-w-xs text-xs text-gray-500">
Antworten mit nachvollziehbaren Quellen, Fundstellen und wo vorhanden Original-Abbildungen.
</p>
<div className="mt-4 space-y-2 text-left">
<p className="text-xs font-medium text-gray-700">Beispielfragen</p>
{exampleQuestions.map((q, i) => (
<button
key={i}
type="button"
onClick={() => onExampleClick(q)}
className="w-full rounded-lg border border-gray-200 bg-white px-3 py-2 text-left text-xs text-gray-700 transition-colors hover:bg-indigo-50"
>
{q}
</button>
))}
</div>
</div>
)
}
@@ -0,0 +1,76 @@
'use client'
import { useState } from 'react'
import { ChevronDown, ChevronRight, Library } from 'lucide-react'
import type { EvidenceUnit } from '@/lib/sdk/advisor/contract'
import { resolveRegulation } from '@/lib/sdk/advisor/regulation-display'
import { EvidenceUnitCard } from './EvidenceUnitCard'
import { PaneHeader } from './PaneHeader'
interface Group {
key: string
label: string
units: EvidenceUnit[]
}
/** Buckets evidence units by regulation family (insertion-ordered Map), largest group first. */
function groupByFamily(units: EvidenceUnit[]): Group[] {
const map = new Map<string, Group>()
for (const u of units) {
const d = resolveRegulation({ code: u.document, short: u.document })
const g = map.get(d.familyKey) ?? { key: d.familyKey, label: d.familyLabel, units: [] }
g.units.push(u)
map.set(d.familyKey, g)
}
return [...map.values()].sort((a, b) => b.units.length - a.units.length)
}
function EvidenceGroup({ group, highlightedId }: { group: Group; highlightedId?: string }) {
const [open, setOpen] = useState(group.units.length <= 3)
return (
<div className="rounded-lg border border-gray-200 bg-white">
<button
type="button"
onClick={() => setOpen((v) => !v)}
className="flex w-full items-center justify-between gap-2 px-2.5 py-2 text-left"
>
<span className="min-w-0 truncate text-xs font-semibold text-gray-900">{group.label}</span>
<span className="flex flex-shrink-0 items-center gap-1 text-[11px] text-gray-500">
{group.units.length} Treffer
{open ? <ChevronDown className="h-3.5 w-3.5" /> : <ChevronRight className="h-3.5 w-3.5" />}
</span>
</button>
{open && (
<div className="space-y-1 border-t border-gray-100 px-2 py-2">
{group.units.map((u) => (
<EvidenceUnitCard key={u.evidence_id} unit={u} compact highlighted={u.evidence_id === highlightedId} />
))}
</div>
)}
</div>
)
}
/** Evidence pane — units grouped by document/regulation family, count + expandable. */
export function EvidencePane({
evidence,
highlightedId,
}: {
evidence: EvidenceUnit[]
highlightedId?: string
}) {
const groups = groupByFamily(evidence)
return (
<section>
<PaneHeader icon={<Library className="h-3.5 w-3.5 text-gray-500" />} title="Evidence" count={evidence.length} />
{groups.length === 0 ? (
<p className="px-1 text-[11px] text-gray-400">Keine strukturierte Evidence zu dieser Antwort.</p>
) : (
<div className="space-y-1.5">
{groups.map((g) => (
<EvidenceGroup key={g.key} group={g} highlightedId={highlightedId} />
))}
</div>
)}
</section>
)
}
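The grouping step in `groupByFamily` can be tried in isolation. The following is a minimal sketch, not the project's implementation: `Unit` is a trimmed-down stand-in for `EvidenceUnit`, and `resolveFamily` is a hypothetical replacement for `resolveRegulation` that simply treats the case-folded document name as the family key.

```typescript
// Hypothetical, reduced shapes; the real EvidenceUnit comes from the advisor contract.
interface Unit {
  evidence_id: string
  document: string
}
interface Group {
  key: string
  label: string
  units: Unit[]
}

// Assumption: stand-in for resolveRegulation. The family key is the lower-cased document name.
function resolveFamily(document: string): { familyKey: string; familyLabel: string } {
  const label = document.trim()
  return { familyKey: label.toLowerCase(), familyLabel: label }
}

// Same shape as the diff's groupByFamily: bucket into a Map, then sort by group size (descending).
function groupByFamily(units: Unit[]): Group[] {
  const map = new Map<string, Group>()
  for (const u of units) {
    const d = resolveFamily(u.document)
    const g = map.get(d.familyKey) ?? { key: d.familyKey, label: d.familyLabel, units: [] }
    g.units.push(u)
    map.set(d.familyKey, g)
  }
  return Array.from(map.values()).sort((a, b) => b.units.length - a.units.length)
}

const groups = groupByFamily([
  { evidence_id: 'e1', document: 'DSGVO' },
  { evidence_id: 'e2', document: 'BDSG' },
  { evidence_id: 'e3', document: 'dsgvo' },
])
console.log(groups.map((g) => `${g.label}: ${g.units.length}`))
```

The label of a group is taken from the first unit that lands in it, so the display name depends on input order when the resolver folds several spellings into one key.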
@@ -0,0 +1,135 @@
'use client'
import { BookMarked, FileText, Hash, Image as ImageIcon, Library, Scale } from 'lucide-react'
import type { AdvisorResponse } from '@/lib/sdk/advisor/contract'
import {
provisionSummary,
summarizeEvidence,
type FamilyGroup,
} from '@/lib/sdk/advisor/evidence-grouping'
const plural = (n: number, one: string, many: string) => (n === 1 ? one : many)
function Count({
icon,
value,
label,
dim,
}: {
icon: React.ReactNode
value: number
label: string
dim?: boolean
}) {
return (
<div
className={`flex items-center gap-2 rounded-lg border px-2.5 py-1.5 ${
dim ? 'border-gray-100 bg-gray-50' : 'border-gray-200 bg-white'
}`}
>
<span className={dim ? 'text-gray-300' : 'text-indigo-500'}>{icon}</span>
<span>
<span className={`text-sm font-bold ${dim ? 'text-gray-400' : 'text-gray-900'}`}>{value}</span>{' '}
<span className="text-[11px] text-gray-500">{label}</span>
</span>
</div>
)
}
function GroupRow({ group, icon }: { group: FamilyGroup; icon: React.ReactNode }) {
// A single-unit guidance doc needs no "1 Fundstelle" noise; norms always show their provisions.
const detail = group.sections.length === 0 && group.units <= 1 ? '' : provisionSummary(group)
return (
<div className="flex items-start gap-2 py-0.5">
<span className="mt-0.5 shrink-0 text-gray-400">{icon}</span>
<span className="min-w-0 flex-1 text-[12px] leading-snug text-gray-700">{group.label}</span>
{detail && <span className="shrink-0 text-[11px] font-medium text-gray-500">{detail}</span>}
</div>
)
}
function Section({
title,
groups,
icon,
}: {
title: string
groups: FamilyGroup[]
icon: React.ReactNode
}) {
if (groups.length === 0) return null
return (
<div className="mt-2 first:mt-0">
<div className="mb-0.5 text-[10px] font-semibold uppercase tracking-wide text-gray-400">{title}</div>
{groups.map((g) => (
<GroupRow key={g.key} group={g} icon={icon} />
))}
</div>
)
}
/**
* "Diese Antwort stützt sich auf" — describes the EVIDENCE (not the documents), objective counts
* only (no fabricated trust score). When the Legal-KG ships `bindingness`, binding Rechtsgrundlagen
* are split from Leitlinien (soft-law guidance); until then it shows a neutral evidence breakdown.
*/
export function EvidenceSummary({ response }: { response: AdvisorResponse }) {
const m = summarizeEvidence(response.evidence)
const figures = response.visual_evidence.length
const notes = response.footnotes.length
const cls = 'h-4 w-4'
const smallIcon = 'h-3.5 w-3.5'
return (
<div>
<div className="mb-1.5 text-[10px] font-semibold uppercase tracking-wide text-gray-400">
Diese Antwort stützt sich auf
</div>
<div className="grid grid-cols-2 gap-1.5">
{m.hasBindingness && (
<>
<Count
icon={<Scale className={cls} />}
value={m.normProvisions}
label={plural(m.normProvisions, 'Rechtsgrundlage', 'Rechtsgrundlagen')}
/>
<Count
icon={<BookMarked className={cls} />}
value={m.guidanceCount}
label={plural(m.guidanceCount, 'Leitlinie', 'Leitlinien')}
dim={m.guidanceCount === 0}
/>
</>
)}
<Count
icon={<ImageIcon className={cls} />}
value={figures}
label={plural(figures, 'Abbildung', 'Abbildungen')}
dim={figures === 0}
/>
<Count
icon={<Hash className={cls} />}
value={notes}
label={plural(notes, 'Fußnote', 'Fußnoten')}
dim={notes === 0}
/>
<Count icon={<FileText className={cls} />} value={m.unitCount} label="Evidence Units" />
</div>
{m.groups.length > 0 && (
<div className="mt-2.5 rounded-lg border border-gray-100 bg-gray-50/60 px-2.5 py-1.5">
{m.hasBindingness ? (
<>
<Section title="Rechtsgrundlagen" groups={m.norms} icon={<Scale className={smallIcon} />} />
<Section title="Leitlinien" groups={m.guidance} icon={<BookMarked className={smallIcon} />} />
<Section title="Weitere" groups={m.other} icon={<FileText className={smallIcon} />} />
</>
) : (
m.groups.map((g) => <GroupRow key={g.key} group={g} icon={<Library className={smallIcon} />} />)
)}
</div>
)}
</div>
)
}
@@ -0,0 +1,80 @@
'use client'
import { useState } from 'react'
import { ChevronDown, ChevronRight, ExternalLink } from 'lucide-react'
import type { EvidenceUnit } from '@/lib/sdk/advisor/contract'
import { resolveRegulation } from '@/lib/sdk/advisor/regulation-display'
/** One evidence unit (contract shape). Compact inside a document group: chapter/section only. */
export function EvidenceUnitCard({
unit,
compact,
highlighted,
}: {
unit: EvidenceUnit
compact?: boolean
highlighted?: boolean
}) {
const [open, setOpen] = useState(false)
const d = resolveRegulation({ code: unit.document, short: unit.document })
const crumbs = [unit.section, unit.paragraph].filter((x): x is string => Boolean(x))
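// Only absolute http(s) URLs are rendered as links; anything else is treated as not openable.
const canOpen = !!unit.url && /^https?:\/\//i.test(unit.url)
// Compact cards sit inside a family group, so the family label is redundant there:
// prefer the chapter, then the first crumb, and fall back to the family label.
const header = compact ? (d.chapter ? `Kapitel ${d.chapter}` : crumbs[0] || d.familyLabel) : d.familyLabel
// When the first crumb was promoted to the header, only the remaining crumbs go in the sub-line.
const sub = compact && !d.chapter && crumbs.length ? crumbs.slice(1) : crumbs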
return (
<div
id={`ev-${unit.evidence_id}`}
className={`${
compact
? 'rounded-md border border-gray-100 bg-gray-50 p-2'
: 'rounded-lg border border-gray-200 bg-white p-2.5'
} ${highlighted ? 'ring-2 ring-indigo-400' : ''}`}
>
<div className="flex items-start justify-between gap-2">
<div className="min-w-0">
<div className="truncate text-xs font-semibold text-gray-900">{header}</div>
{sub.length > 0 && (
<div className="mt-0.5 flex flex-wrap items-center gap-x-1 text-[11px] text-gray-500">
{sub.map((c, i) => (
<span key={i} className="flex items-center gap-1">
{i > 0 && <span className="text-gray-300"></span>}
{c}
</span>
))}
</div>
)}
</div>
{canOpen && (
<a
href={unit.url}
target="_blank"
rel="noopener noreferrer"
className="flex flex-shrink-0 items-center gap-0.5 rounded px-1.5 py-0.5 text-[11px] font-medium text-indigo-600 hover:bg-indigo-50"
>
<ExternalLink className="h-3 w-3" />
öffnen
</a>
)}
</div>
{unit.snippet && (
<div className="mt-1.5">
<button
type="button"
onClick={() => setOpen((v) => !v)}
className="flex items-center gap-0.5 text-[11px] text-gray-400 hover:text-gray-600"
>
{open ? <ChevronDown className="h-3 w-3" /> : <ChevronRight className="h-3 w-3" />}
Textauszug
</button>
{open && (
<p className="mt-1 border-l-2 border-gray-200 pl-2 text-[11px] italic text-gray-500">
{unit.snippet}
</p>
)}
</div>
)}
</div>
)
}
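The header/sub-line rule in `EvidenceUnitCard` is easier to check as a pure function. This is a sketch under assumptions: `cardHeading` is a hypothetical helper that does not exist in the diff, and `chapter` is assumed to be an optional string as returned by `resolveRegulation`.

```typescript
// Hypothetical subset of EvidenceUnit: only the fields the heading logic reads.
interface Crumbed {
  section?: string
  paragraph?: string
}

// Mirrors the two expressions from the component, without any rendering.
function cardHeading(
  unit: Crumbed,
  familyLabel: string,
  chapter: string | undefined,
  compact: boolean,
): { header: string; sub: string[] } {
  const crumbs = [unit.section, unit.paragraph].filter((x): x is string => Boolean(x))
  const header = compact ? (chapter ? `Kapitel ${chapter}` : crumbs[0] || familyLabel) : familyLabel
  const sub = compact && !chapter && crumbs.length ? crumbs.slice(1) : crumbs
  return { header, sub }
}

const unit = { section: 'Art. 5', paragraph: 'Abs. 2' }
const compactNoChapter = cardHeading(unit, 'DSGVO', undefined, true)
const compactWithChapter = cardHeading(unit, 'DSK SDM', 'B41', true)
const standalone = cardHeading(unit, 'DSGVO', undefined, false)
console.log(compactNoChapter, compactWithChapter, standalone)
```

Pulling the rule out like this would also make it testable without rendering the card, though whether that is worth a separate helper is a design call for the authors.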
@@ -0,0 +1,134 @@
'use client'
import { useEffect, useRef } from 'react'
import type { AdvisorCase, AdvisorThread } from './useAdvisorCase'
import { StickyQuestion } from './StickyQuestion'
import { AdvisorEmptyState } from './EmptyState'
import { CaseView, LoadingDots, ErrorBox } from './CaseView'
import { ClarifyView } from './ClarifyView'
import { EvidenceSummary } from './EvidenceSummary'
import { EvidencePane } from './EvidencePane'
import { VisualEvidencePane } from './VisualEvidencePane'
import { FootnotesPane } from './FootnotesPane'
import { Markdown } from './Markdown'
import { ThreadMenu } from './ThreadMenu'
import { useCitationHighlight } from './useCitationHighlight'
/**
* Advisor body as topic THREADS of cases.
* - Narrow: stacked cases with a pinned last question; per-case copy + delete.
* - Wide/fullscreen: 3-column Case Workspace — topic tree (left) | answer/clarify (center) | evidence (right).
*/
export function EvidenceWorkspace({
cases,
threads,
expanded,
busy,
activeCaseId,
exampleQuestions,
onExample,
onSelectContext,
onSelectCase,
onRemove,
}: {
cases: AdvisorCase[]
threads: AdvisorThread[]
expanded: boolean
busy: boolean
activeCaseId: string | null
exampleQuestions: string[]
onExample: (q: string) => void
onSelectContext: (caseId: string, ctx: string) => void
onSelectCase: (id: string) => void
onRemove: (id: string) => void
}) {
const endRef = useRef<HTMLDivElement>(null)
const latest = cases[cases.length - 1]
const active = cases.find((c) => c.id === activeCaseId) ?? latest
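// Narrow (stacked) mode only: scroll to the end marker whenever the number of cases changes.
useEffect(() => {
if (!expanded) endRef.current?.scrollIntoView({ behavior: 'smooth' })
}, [cases.length, expanded])
// Hooks must run before the early returns below, so the citation handler is resolved unconditionally.
const answer = active?.response?.mode === 'answer' ? active.response : null
const { highlightedId, cite } = useCitationHighlight(answer?.citations ?? [])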
if (cases.length === 0) {
return (
<div className="flex-1 overflow-y-auto bg-gray-50">
<AdvisorEmptyState exampleQuestions={exampleQuestions} onExampleClick={onExample} />
</div>
)
}
if (!expanded) {
return (
<div className="min-h-0 flex-1 overflow-y-auto bg-gray-50">
{latest && <StickyQuestion question={latest.question} />}
<div className="space-y-4 p-4">
{cases.map((c, i) => (
<CaseView
key={c.id}
c={c}
busy={busy}
showQuestion={i !== cases.length - 1}
onSelectContext={(ctx) => onSelectContext(c.id, ctx)}
onRemove={() => onRemove(c.id)}
/>
))}
<div ref={endRef} />
</div>
</div>
)
}
const r = active?.response
return (
<div className="grid min-h-0 flex-1 grid-cols-[260px_1fr_320px] divide-x divide-gray-200 overflow-hidden">
<aside className="min-h-0 overflow-y-auto bg-indigo-50/40 p-3">
<ThreadMenu
threads={threads}
activeCaseId={active?.id ?? null}
onSelectCase={onSelectCase}
onRemove={onRemove}
/>
{active && (
<div className="mt-3 border-t border-indigo-100 pt-3">
<div className="text-[10px] font-semibold uppercase tracking-wide text-indigo-400">Aktive Frage</div>
<div className="mb-3 text-sm font-medium text-gray-800">{active.question}</div>
{answer && <EvidenceSummary response={answer} />}
</div>
)}
</aside>
<main className="min-h-0 overflow-y-auto bg-gray-50 p-4">
{active?.status === 'loading' && <LoadingDots />}
{active?.status === 'error' && <ErrorBox msg={active.error} />}
{r?.mode === 'clarify' && (
<ClarifyView
response={r}
busy={busy}
onSelectContext={(ctx) => active && onSelectContext(active.id, ctx)}
/>
)}
{r?.mode === 'answer' && (
<div className="rounded-lg border border-gray-200 bg-white px-3 py-2">
<Markdown content={r.answer || ''} citations={cite} />
</div>
)}
</main>
<aside className="min-h-0 space-y-3 overflow-y-auto bg-gray-50 p-3">
{answer ? (
<>
<EvidencePane evidence={answer.evidence} highlightedId={highlightedId} />
<VisualEvidencePane items={answer.visual_evidence} />
<FootnotesPane footnotes={answer.footnotes} />
</>
) : (
<p className="px-1 text-[11px] text-gray-400">Evidence erscheint nach Auswahl eines Kontexts.</p>
)}
</aside>
</div>
)
}
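The way `EvidenceWorkspace` picks the case shown in wide mode (explicit selection, else the latest case) and decides whether the evidence column has content can be sketched without React. `Case`, `pickActive` and `answerOf` below are hypothetical names for illustration only.

```typescript
// Hypothetical, reduced case shape for illustration.
interface Case {
  id: string
  question: string
  response?: { mode: 'clarify' | 'answer' } | null
}

// Explicitly selected case wins; a stale or null id falls back to the most recent case.
function pickActive(cases: Case[], activeCaseId: string | null): Case | undefined {
  const latest = cases[cases.length - 1]
  return cases.find((c) => c.id === activeCaseId) ?? latest
}

// Only answer-mode responses feed the evidence column; clarify responses yield null.
function answerOf(active: Case | undefined) {
  return active?.response?.mode === 'answer' ? active.response : null
}

const cases: Case[] = [
  { id: 'a', question: 'Was ist PDCA?', response: { mode: 'clarify' } },
  { id: 'b', question: 'PDCA im Datenschutz?', response: { mode: 'answer' } },
]
console.log(pickActive(cases, 'a')?.id, pickActive(cases, 'gone')?.id, answerOf(pickActive(cases, 'a')))
```

The fallback means that deleting the currently selected case does not leave the workspace empty as long as at least one case remains.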
@@ -0,0 +1,30 @@
'use client'
import { Hash } from 'lucide-react'
import type { Footnote } from '@/lib/sdk/advisor/contract'
import { PaneHeader } from './PaneHeader'
/** Footnotes pane (C-FN) — rendered only when present. */
export function FootnotesPane({ footnotes }: { footnotes: Footnote[] }) {
if (footnotes.length === 0) return null
return (
<section>
<PaneHeader icon={<Hash className="h-3.5 w-3.5 text-gray-500" />} title="Fußnoten" count={footnotes.length} />
<div className="space-y-1">
{footnotes.map((fn, i) => (
<div key={fn.footnote_id || i} className="rounded-md border border-gray-200 bg-white p-2 text-[11px]">
<span className="font-semibold text-gray-900">{fn.ref || `Fußnote ${i + 1}`}</span>
{(fn.document || fn.section) && (
<span className="text-gray-400">
{' · '}
{fn.document}
{fn.section ? ` / ${fn.section}` : ''}
</span>
)}
{fn.text && <p className="mt-0.5 text-gray-600">{fn.text}</p>}
</div>
))}
</div>
</section>
)
}
@@ -0,0 +1,40 @@
import { describe, it, expect } from 'vitest'
import { render } from '@testing-library/react'
import { Markdown } from './Markdown'
describe('Markdown', () => {
it('renders headings, bold and bullet lists (not raw markdown markers)', () => {
const { container } = render(
<Markdown
content={'## Pflichten\n\nDer **Verantwortliche** muss:\n\n- ein Verzeichnis fuehren\n- Risiken bewerten'}
/>,
)
expect(container.querySelector('h4')?.textContent).toBe('Pflichten')
expect(container.querySelector('strong')?.textContent).toBe('Verantwortliche')
expect(container.querySelectorAll('li')).toHaveLength(2)
expect(container.textContent).not.toContain('##')
expect(container.textContent).not.toContain('**')
})
it('renders ordered lists and inline code', () => {
const { container } = render(<Markdown content={'1. Erst `init`\n2. Dann `build`'} />)
expect(container.querySelector('ol')).not.toBeNull()
expect(container.querySelectorAll('li')).toHaveLength(2)
expect(container.querySelectorAll('code')).toHaveLength(2)
})
it('renders fenced code blocks', () => {
const { container } = render(<Markdown content={'```\nconst x = 1\n```'} />)
expect(container.querySelector('pre')).not.toBeNull()
expect(container.textContent).toContain('const x = 1')
})
it('only allows http(s) links', () => {
const { container } = render(
<Markdown content={'[ok](https://example.test) and [bad](javascript:alert(1))'} />,
)
const links = container.querySelectorAll('a')
expect(links).toHaveLength(1)
expect(links[0].getAttribute('href')).toBe('https://example.test')
})
})
@@ -0,0 +1,176 @@
'use client'
// Minimal, SAFE markdown -> React renderer. No dangerouslySetInnerHTML, no dependency.
// Covers the subset LLMs emit: headings, bold, italic, inline code, fenced code, ul/ol, links.
// Plus deliberate [n] citation markers (mapped via `citations`, NOT parsed for structure).
export interface CiteHandler {
count: number
onSelect: (n: number) => void
}
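// Inline tokens, tried in this order: `code`, **bold**, *italic*, _italic_, [label](url), [n] citation.
// The regex is global and shared, so renderInline resets lastIndex before each scan.
const INLINE_RE =
/(`[^`]+`|\*\*[^*]+\*\*|\*[^*\s][^*]*\*|_[^_]+_|\[[^\]]+\]\([^)]+\)|\[\d+\])/g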
function renderInline(text: string, kp: string, cite?: CiteHandler): React.ReactNode[] {
const nodes: React.ReactNode[] = []
let last = 0
let idx = 0
INLINE_RE.lastIndex = 0
let m: RegExpExecArray | null
while ((m = INLINE_RE.exec(text)) !== null) {
if (m.index > last) nodes.push(text.slice(last, m.index))
const tok = m[0]
const key = `${kp}-${idx++}`
if (tok.startsWith('`')) {
nodes.push(
<code key={key} className="rounded bg-gray-100 px-1 py-0.5 font-mono text-[0.85em]">
{tok.slice(1, -1)}
</code>,
)
} else if (tok.startsWith('**')) {
nodes.push(
<strong key={key} className="font-semibold text-gray-900">
{tok.slice(2, -2)}
</strong>,
)
} else if (tok.startsWith('*') || tok.startsWith('_')) {
nodes.push(<em key={key}>{tok.slice(1, -1)}</em>)
} else if (/^\[\d+\]$/.test(tok)) {
const n = parseInt(tok.slice(1, -1), 10)
if (cite && n >= 1 && n <= cite.count) {
nodes.push(
<button
key={key}
type="button"
onClick={() => cite.onSelect(n)}
className="mx-0.5 align-super text-[10px] font-semibold text-indigo-600 hover:underline"
title={`Beleg ${n} anzeigen`}
>
[{n}]
</button>,
)
} else {
nodes.push(tok)
}
} else {
const mm = /^\[([^\]]+)\]\(([^)]+)\)$/.exec(tok)
if (mm && /^https?:\/\//i.test(mm[2])) {
nodes.push(
<a
key={key}
href={mm[2]}
target="_blank"
rel="noopener noreferrer"
className="text-indigo-600 underline hover:text-indigo-800"
>
{mm[1]}
</a>,
)
} else {
nodes.push(mm ? mm[1] : tok)
}
}
last = m.index + tok.length
}
if (last < text.length) nodes.push(text.slice(last))
return nodes
}
function Heading({ level, kp, text, cite }: { level: number; kp: string; text: string; cite?: CiteHandler }) {
const children = renderInline(text, kp, cite)
if (level <= 1) return <h3 className="mb-1 mt-3 text-base font-bold text-gray-900">{children}</h3>
if (level === 2) return <h4 className="mb-1 mt-3 text-sm font-bold text-gray-900">{children}</h4>
return <h5 className="mb-1 mt-2 text-sm font-semibold text-gray-800">{children}</h5>
}
const UL_RE = /^\s*[-*]\s+/
const OL_RE = /^\s*\d+\.\s+/
const H_RE = /^(#{1,6})\s+(.*)$/
export function Markdown({ content, citations }: { content: string; citations?: CiteHandler }) {
const lines = (content || '').replace(/\r\n/g, '\n').split('\n')
const blocks: React.ReactNode[] = []
let i = 0
while (i < lines.length) {
const line = lines[i]
const key = `b${blocks.length}`
if (line.trim().startsWith('```')) {
const buf: string[] = []
i++
while (i < lines.length && !lines[i].trim().startsWith('```')) {
buf.push(lines[i])
i++
}
i++ // skip the closing fence; an unterminated fence simply consumes the rest of the input
blocks.push(
<pre
key={key}
className="my-2 overflow-x-auto rounded bg-gray-900 p-3 font-mono text-xs text-gray-100"
>
<code>{buf.join('\n')}</code>
</pre>,
)
continue
}
if (line.trim() === '') {
i++
continue
}
const h = H_RE.exec(line)
if (h) {
blocks.push(<Heading key={key} kp={key} level={h[1].length} text={h[2]} cite={citations} />)
i++
continue
}
if (UL_RE.test(line)) {
const items: string[] = []
while (i < lines.length && UL_RE.test(lines[i])) {
items.push(lines[i].replace(UL_RE, ''))
i++
}
blocks.push(
<ul key={key} className="my-1.5 ml-4 list-disc space-y-1 text-gray-700">
{items.map((it, k) => (
<li key={k}>{renderInline(it, `${key}-${k}`, citations)}</li>
))}
</ul>,
)
continue
}
if (OL_RE.test(line)) {
const items: string[] = []
while (i < lines.length && OL_RE.test(lines[i])) {
items.push(lines[i].replace(OL_RE, ''))
i++
}
blocks.push(
<ol key={key} className="my-1.5 ml-5 list-decimal space-y-1 text-gray-700">
{items.map((it, k) => (
<li key={k}>{renderInline(it, `${key}-${k}`, citations)}</li>
))}
</ol>,
)
continue
}
const para: string[] = []
while (
i < lines.length &&
lines[i].trim() !== '' &&
!H_RE.test(lines[i]) &&
!UL_RE.test(lines[i]) &&
!OL_RE.test(lines[i]) &&
!lines[i].trim().startsWith('```')
) {
para.push(lines[i])
i++
}
blocks.push(
<p key={key} className="my-1.5 leading-relaxed text-gray-700">
{renderInline(para.join(' '), key, citations)}
</p>,
)
}
return <div className="advisor-markdown text-sm">{blocks}</div>
}
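The inline pass of this renderer can be exercised without React by emitting plain token objects instead of elements. The sketch below reuses the same regular expression and branch order as `renderInline`; `Token` and `tokenizeInline` are hypothetical names introduced only for this illustration.

```typescript
type Token =
  | { kind: 'text' | 'code' | 'strong' | 'em'; value: string }
  | { kind: 'cite'; n: number }
  | { kind: 'link'; label: string; href: string }

const INLINE_RE =
  /(`[^`]+`|\*\*[^*]+\*\*|\*[^*\s][^*]*\*|_[^_]+_|\[[^\]]+\]\([^)]+\)|\[\d+\])/g

// citeCount plays the role of CiteHandler.count: [n] is a citation only if 1 <= n <= citeCount.
function tokenizeInline(text: string, citeCount = 0): Token[] {
  const out: Token[] = []
  let last = 0
  INLINE_RE.lastIndex = 0 // the regex is global, so reset its cursor before every scan
  let m: RegExpExecArray | null
  while ((m = INLINE_RE.exec(text)) !== null) {
    if (m.index > last) out.push({ kind: 'text', value: text.slice(last, m.index) })
    const tok = m[0]
    if (tok.startsWith('`')) {
      out.push({ kind: 'code', value: tok.slice(1, -1) })
    } else if (tok.startsWith('**')) {
      out.push({ kind: 'strong', value: tok.slice(2, -2) })
    } else if (tok.startsWith('*') || tok.startsWith('_')) {
      out.push({ kind: 'em', value: tok.slice(1, -1) })
    } else if (/^\[\d+\]$/.test(tok)) {
      const n = parseInt(tok.slice(1, -1), 10)
      // Out-of-range markers stay as literal text instead of becoming dead buttons.
      out.push(n >= 1 && n <= citeCount ? { kind: 'cite', n } : { kind: 'text', value: tok })
    } else {
      const mm = /^\[([^\]]+)\]\(([^)]+)\)$/.exec(tok)
      // Allow-list: only http(s) targets become links; other schemes degrade to their label.
      if (mm && /^https?:\/\//i.test(mm[2])) out.push({ kind: 'link', label: mm[1], href: mm[2] })
      else out.push({ kind: 'text', value: mm ? mm[1] : tok })
    }
    last = m.index + tok.length
  }
  if (last < text.length) out.push({ kind: 'text', value: text.slice(last) })
  return out
}

const toks = tokenizeInline('Siehe [1] und [9], [ok](https://example.test), [bad](javascript:alert)', 1)
console.log(toks)
```

Because the `_[^_]+_` alternative has no word-boundary check, an identifier such as `some_var_name` outside backticks would also be emphasised; that behaviour is inherited from the pattern in the diff and is left unchanged here.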
@@ -0,0 +1,24 @@
'use client'
/** Shared section header for evidence panes (icon + title + count badge). */
export function PaneHeader({
icon,
title,
count,
}: {
icon: React.ReactNode
title: string
count?: number
}) {
return (
<div className="mb-1.5 flex items-center gap-1.5 text-xs font-semibold text-gray-700">
{icon}
<span>{title}</span>
{count != null && (
<span className="rounded-full bg-gray-100 px-1.5 text-[10px] font-medium text-gray-500">
{count}
</span>
)}
</div>
)
}
@@ -0,0 +1,21 @@
'use client'
import { HelpCircle } from 'lucide-react'
/** The last question, pinned so it never scrolls out of view while the answer grows. */
export function StickyQuestion({ question }: { question: string }) {
if (!question) return null
return (
<div className="sticky top-0 z-10 border-b border-indigo-100 bg-indigo-50/95 px-4 py-2 backdrop-blur">
<div className="flex items-start gap-2">
<HelpCircle className="mt-0.5 h-4 w-4 flex-shrink-0 text-indigo-500" />
<div className="min-w-0">
<div className="text-[10px] font-semibold uppercase tracking-wide text-indigo-400">
Letzte Frage
</div>
<div className="text-sm font-medium text-gray-800">{question}</div>
</div>
</div>
</div>
)
}
@@ -0,0 +1,139 @@
'use client'
import { useState } from 'react'
import { Check, ChevronDown, ChevronRight, Copy, Files, Trash2 } from 'lucide-react'
import type { AdvisorThread } from './useAdvisorCase'
import { formatCaseForCopy, formatThreadForCopy } from '@/lib/sdk/advisor/copy'
import { useClipboard } from './useClipboard'
function IconBtn({
title,
onClick,
children,
}: {
title: string
onClick: () => void
children: React.ReactNode
}) {
return (
<button
type="button"
title={title}
aria-label={title}
onClick={(e) => {
e.stopPropagation()
onClick()
}}
className="rounded p-0.5 text-gray-400 transition-colors hover:bg-gray-200 hover:text-gray-700"
>
{children}
</button>
)
}
/**
* Left-menu topic tree: each thread's first question is the Thema; follow-ups nest underneath and
* expand/collapse. Per row: copy (single Q&A) + delete; per topic: copy the whole thread.
*/
export function ThreadMenu({
threads,
activeCaseId,
onSelectCase,
onRemove,
}: {
threads: AdvisorThread[]
activeCaseId: string | null
onSelectCase: (id: string) => void
onRemove: (id: string) => void
}) {
const { copiedKey, copy } = useClipboard()
const [collapsed, setCollapsed] = useState<Record<string, boolean>>({})
const ic = 'h-3.5 w-3.5'
return (
<div>
<div className="mb-1 text-[10px] font-semibold uppercase tracking-wide text-gray-400">Themen</div>
<div className="space-y-0.5">
{threads.map((t) => {
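const first = t.cases[0]
// Defensive: skip a thread whose cases have all been removed instead of dereferencing undefined.
if (!first) return null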
const followups = t.cases.slice(1)
const open = !collapsed[t.id]
const activeInThread = t.cases.some((c) => c.id === activeCaseId)
return (
<div key={t.id}>
<div
className={`group flex items-center gap-1 rounded px-1.5 py-1 ${
first.id === activeCaseId ? 'bg-indigo-100' : activeInThread ? 'bg-indigo-50/60' : 'hover:bg-gray-100'
}`}
>
{followups.length > 0 ? (
<button
type="button"
onClick={() => setCollapsed((s) => ({ ...s, [t.id]: !s[t.id] }))}
className="text-gray-400 hover:text-gray-700"
aria-label={open ? 'Thema einklappen' : 'Thema aufklappen'}
>
{open ? <ChevronDown className={ic} /> : <ChevronRight className={ic} />}
</button>
) : (
<span className="w-3.5 shrink-0" />
)}
<button
type="button"
onClick={() => onSelectCase(first.id)}
title={first.question}
className={`min-w-0 flex-1 truncate text-left text-[12px] font-medium ${
first.id === activeCaseId ? 'text-indigo-800' : 'text-gray-700'
}`}
>
{first.question}
</button>
<div className="flex shrink-0 items-center gap-0.5 opacity-0 transition-opacity group-hover:opacity-100">
<IconBtn title="Ganzes Thema kopieren" onClick={() => copy(`thread:${t.id}`, formatThreadForCopy(t.title, t.cases))}>
{copiedKey === `thread:${t.id}` ? <Check className={`${ic} text-green-600`} /> : <Files className={ic} />}
</IconBtn>
<IconBtn title="Diese Frage kopieren" onClick={() => copy(`case:${first.id}`, formatCaseForCopy(first))}>
{copiedKey === `case:${first.id}` ? <Check className={`${ic} text-green-600`} /> : <Copy className={ic} />}
</IconBtn>
<IconBtn title="Frage löschen" onClick={() => onRemove(first.id)}>
<Trash2 className={ic} />
</IconBtn>
</div>
</div>
{open &&
followups.map((c) => (
<div
key={c.id}
className={`group ml-4 flex items-center gap-1 rounded px-1.5 py-1 ${
c.id === activeCaseId ? 'bg-indigo-100' : 'hover:bg-gray-100'
}`}
>
<span className="shrink-0 text-gray-300"></span>
<button
type="button"
onClick={() => onSelectCase(c.id)}
title={c.question}
className={`min-w-0 flex-1 truncate text-left text-[11px] ${
c.id === activeCaseId ? 'text-indigo-800' : 'text-gray-600'
}`}
>
{c.question}
</button>
<div className="flex shrink-0 items-center gap-0.5 opacity-0 transition-opacity group-hover:opacity-100">
<IconBtn title="Diese Frage kopieren" onClick={() => copy(`case:${c.id}`, formatCaseForCopy(c))}>
{copiedKey === `case:${c.id}` ? <Check className={`${ic} text-green-600`} /> : <Copy className={ic} />}
</IconBtn>
<IconBtn title="Frage löschen" onClick={() => onRemove(c.id)}>
<Trash2 className={ic} />
</IconBtn>
</div>
</div>
))}
</div>
)
})}
</div>
</div>
)
}
@@ -0,0 +1,70 @@
'use client'
import { ExternalLink, Image as ImageIcon } from 'lucide-react'
import type { VisualEvidence } from '@/lib/sdk/advisor/contract'
import { PaneHeader } from './PaneHeader'
function VisualCard({ v }: { v: VisualEvidence }) {
const canOpen = !!v.image_ref && /^https?:\/\//i.test(v.image_ref)
return (
<div className="rounded-lg border border-gray-200 bg-white p-2.5">
<div className="flex items-start justify-between gap-2">
<div className="min-w-0">
<div className="text-xs font-semibold text-gray-900">{v.caption || v.visual_type}</div>
<div className="mt-0.5 flex flex-wrap items-center gap-1 text-[11px] text-gray-500">
<span className="rounded bg-gray-100 px-1 text-[10px] uppercase tracking-wide text-gray-500">
{v.visual_type}
</span>
<span>Quelle: {v.document}</span>
</div>
</div>
{canOpen && (
<a
href={v.image_ref}
target="_blank"
rel="noopener noreferrer"
className="flex flex-shrink-0 items-center gap-0.5 rounded px-1.5 py-0.5 text-[11px] font-medium text-indigo-600 hover:bg-indigo-50"
>
<ExternalLink className="h-3 w-3" />
Original anzeigen
</a>
)}
</div>
{canOpen ? (
<a href={v.image_ref} target="_blank" rel="noopener noreferrer" className="mt-1.5 block">
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={v.image_ref}
alt={v.caption || v.visual_type}
loading="lazy"
className="max-h-44 w-full rounded border border-gray-100 object-contain"
/>
</a>
) : (
<div className="mt-1.5 flex items-center justify-center rounded border border-dashed border-gray-200 bg-gray-50 px-3 py-5 text-[11px] text-gray-400">
Original-Darstellung folgt
</div>
)}
{v.vision_summary && <p className="mt-1.5 text-[11px] italic text-gray-500">{v.vision_summary}</p>}
</div>
)
}
/** Visual evidence (C8) — diagrams/figures, rendered only when present. */
export function VisualEvidencePane({ items }: { items: VisualEvidence[] }) {
if (items.length === 0) return null
return (
<section>
<PaneHeader
icon={<ImageIcon className="h-3.5 w-3.5 text-gray-500" />}
title="Diagramme & Abbildungen"
count={items.length}
/>
<div className="space-y-1.5">
{items.map((v) => (
<VisualCard key={v.visual_id} v={v} />
))}
</div>
</section>
)
}
@@ -0,0 +1,188 @@
'use client'
import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
import type { AdvisorResponse } from '@/lib/sdk/advisor/contract'
export interface AdvisorCase {
id: string
threadId: string
question: string
response: AdvisorResponse | null
selectedContext: string | null
status: 'loading' | 'done' | 'error'
error?: string
}
/** A topic: the first case's question is the title; follow-ups are the rest, in order. */
export interface AdvisorThread {
id: string
title: string
cases: AdvisorCase[]
}
interface HistoryTurn {
role: 'user' | 'assistant'
content: string
}
interface UseAdvisorCaseArgs {
currentStep: string
country: string
}
let counter = 0
const uid = (p: string) => `${p}-${Date.now()}-${counter++}`
/**
* Drives the Advisor as topic THREADS of CASES. Each ask posts {question, context?, history} and
* receives a structured AdvisorResponse (clarify | answer) — no streaming, no answer-text parsing.
* A follow-up appends to the active thread (and carries the thread's prior Q&A as history);
* newTopic() starts a fresh thread. selectContext() re-runs a case scoped to a chosen domain.
*/
export function useAdvisorCase({ currentStep, country }: UseAdvisorCaseArgs) {
const [cases, setCases] = useState<AdvisorCase[]>([])
const [busy, setBusy] = useState(false)
const [activeCaseId, setActiveCaseId] = useState<string | null>(null)
const [activeThreadId, setActiveThreadId] = useState<string | null>(null)
const abortRef = useRef<AbortController | null>(null)
const patch = useCallback((id: string, p: Partial<AdvisorCase>) => {
setCases((prev) => prev.map((c) => (c.id === id ? { ...c, ...p } : c)))
}, [])
// Prior answered turns of a thread, up to (but excluding) `beforeId`, for contextual follow-ups.
const buildHistory = useCallback(
(threadId: string, beforeId?: string): HistoryTurn[] => {
const turns: HistoryTurn[] = []
for (const c of cases) {
if (c.threadId !== threadId) continue
if (beforeId && c.id === beforeId) break
const a = c.response?.answer ?? c.response?.general_answer
if (!a) continue
turns.push({ role: 'user', content: c.question }, { role: 'assistant', content: a })
}
return turns
},
[cases],
)
const run = useCallback(
async (id: string, question: string, context: string | null, history: HistoryTurn[]) => {
setBusy(true)
abortRef.current = new AbortController()
try {
const res = await fetch('/api/sdk/compliance-advisor/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ question, context, history, currentStep, country }),
signal: abortRef.current.signal,
})
if (!res.ok) {
const e = await res.json().catch(() => ({ error: 'Unbekannter Fehler' }))
throw new Error(e.error || `Server-Fehler (${res.status})`)
}
const data = (await res.json()) as AdvisorResponse
patch(id, { response: data, status: 'done', selectedContext: context })
} catch (err) {
if ((err as Error).name === 'AbortError') {
patch(id, { status: 'done' })
return
}
patch(id, {
status: 'error',
error: err instanceof Error ? err.message : 'Verbindung fehlgeschlagen',
})
} finally {
setBusy(false)
}
},
[currentStep, country, patch],
)
const ask = useCallback(
(question: string, opts?: { newThread?: boolean }) => {
const q = question.trim()
if (!q || busy) return
const startNew = opts?.newThread || !activeThreadId || cases.length === 0
const threadId = startNew ? uid('thread') : activeThreadId!
const id = uid('case')
const history = startNew ? [] : buildHistory(threadId)
setCases((prev) => [
...prev,
{ id, threadId, question: q, response: null, selectedContext: null, status: 'loading' },
])
setActiveThreadId(threadId)
setActiveCaseId(id)
void run(id, q, null, history)
},
[busy, activeThreadId, cases.length, buildHistory, run],
)
const newTopic = useCallback((question: string) => ask(question, { newThread: true }), [ask])
const selectContext = useCallback(
(id: string, context: string) => {
const c = cases.find((x) => x.id === id)
if (!c || busy) return
patch(id, { status: 'loading', selectedContext: context })
void run(id, c.question, context, buildHistory(c.threadId, id))
},
[cases, busy, run, patch, buildHistory],
)
const remove = useCallback((id: string) => {
setCases((prev) => prev.filter((c) => c.id !== id))
}, [])
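const selectCase = useCallback(
(id: string) => {
// Look the case up directly instead of calling a state setter inside a setCases updater
// (updater functions should stay free of side effects).
const c = cases.find((x) => x.id === id)
setActiveCaseId(id)
if (c) setActiveThreadId(c.threadId)
},
[cases],
)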
const stop = useCallback(() => {
abortRef.current?.abort()
setBusy(false)
}, [])
// Keep the active selection valid after deletions.
useEffect(() => {
if (activeCaseId && !cases.some((c) => c.id === activeCaseId)) {
const last = cases[cases.length - 1] ?? null
setActiveCaseId(last?.id ?? null)
setActiveThreadId(last?.threadId ?? null)
}
}, [cases, activeCaseId])
const threads = useMemo<AdvisorThread[]>(() => {
const order: string[] = []
const byId = new Map<string, AdvisorThread>()
for (const c of cases) {
let t = byId.get(c.threadId)
if (!t) {
t = { id: c.threadId, title: c.question, cases: [] }
byId.set(c.threadId, t)
order.push(c.threadId)
}
t.cases.push(c)
}
return order.map((id) => byId.get(id)!)
}, [cases])
return {
cases,
threads,
busy,
activeCaseId,
activeThreadId,
ask,
newTopic,
selectContext,
selectCase,
remove,
stop,
}
}
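The thread grouping and history building in `useAdvisorCase` can be exercised without React. The following is a minimal sketch, not the hook's actual code: `MiniCase`, `MiniThread`, `Turn`, `groupIntoThreads` and `historyBefore` are hypothetical names introduced here for illustration, and `answer` stands in for `response.answer ?? response.general_answer`.

```typescript
// Hypothetical, trimmed-down case shape: only the fields the two helpers read.
interface MiniCase {
  id: string
  threadId: string
  question: string
  answer: string | null // stands in for response.answer ?? response.general_answer
}

interface MiniThread {
  id: string
  title: string
  cases: MiniCase[]
}

interface Turn {
  role: 'user' | 'assistant'
  content: string
}

// Group cases by threadId in first-seen order; the first case's question becomes the title.
function groupIntoThreads(cases: MiniCase[]): MiniThread[] {
  const byId = new Map<string, MiniThread>()
  for (const c of cases) {
    let t = byId.get(c.threadId)
    if (!t) {
      t = { id: c.threadId, title: c.question, cases: [] }
      byId.set(c.threadId, t)
    }
    t.cases.push(c)
  }
  return Array.from(byId.values()) // Map iteration preserves insertion order
}

// Answered turns of one thread, stopping before `beforeId` when given; unanswered cases are skipped.
function historyBefore(cases: MiniCase[], threadId: string, beforeId?: string): Turn[] {
  const turns: Turn[] = []
  for (const c of cases) {
    if (c.threadId !== threadId) continue
    if (beforeId && c.id === beforeId) break
    if (!c.answer) continue
    turns.push({ role: 'user', content: c.question }, { role: 'assistant', content: c.answer })
  }
  return turns
}
```

Keeping both helpers pure would let them be unit-tested with vitest alongside the other advisor tests, while the hook only wires them to state.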
@@ -0,0 +1,72 @@
'use client'
import { useCallback, useState } from 'react'
import type { AdvisorCase } from './useAdvisorCase'
function esc(s: string): string {
return s
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
}
function evidenceHtml(c: AdvisorCase): string {
const ev = c.response?.evidence ?? []
if (ev.length === 0) return ''
const items = ev
.map(
(e) =>
`<li>${esc(e.document)}${e.section ? ` ${esc(e.section)}` : ''}${e.paragraph ? ` ${esc(e.paragraph)}` : ''}</li>`,
)
.join('')
return `<p style="color:#64748b;font-size:12px;margin:4px 0 0;">Evidence:</p><ul style="color:#64748b;font-size:12px;margin:2px 0;">${items}</ul>`
}
/** Sends the consultation cases (question + answer + evidence) as an email to the DSB. */
export function useAdvisorEmail(cases: AdvisorCase[], country: string, currentStep: string) {
const [sending, setSending] = useState(false)
const [sent, setSent] = useState(false)
const send = useCallback(async () => {
if (cases.length === 0 || sending) return
setSending(true)
try {
const qaHtml = cases
.map((c) => {
const a = c.response?.answer || c.response?.general_answer || '(keine Antwort)'
return `<div style="margin-bottom:16px;"><p style="font-weight:600;color:#1e293b;">Frage: ${esc(
c.question,
)}</p><p style="color:#475569;white-space:pre-wrap;">${esc(a)}</p>${evidenceHtml(c)}</div>`
})
.join('')
const bodyHtml = `
<h2 style="color:#1e293b;">Compliance Advisor — Beratungsprotokoll</h2>
<p style="color:#64748b;font-size:13px;">Datum: ${esc(new Date().toLocaleString('de-DE'))} | Land: ${esc(country)} | Kontext: ${esc(currentStep)}</p>
<hr style="border-color:#e2e8f0;margin:16px 0;">
${qaHtml}
<hr style="border-color:#e2e8f0;margin:16px 0;">
<p style="color:#94a3b8;font-size:11px;">Automatisch erstellt vom BreakPilot Compliance Advisor</p>`
const res = await fetch('/api/sdk/v1/agent/notify', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
recipient: 'dsb@breakpilot.local',
subject: `Compliance Advisor — ${cases.length} Fragen (${currentStep})`,
body_html: bodyHtml,
role: 'Datenschutzbeauftragter',
}),
})
// fetch() also resolves on HTTP error statuses; only flag success on a 2xx response.
if (!res.ok) throw new Error(`Server-Fehler (${res.status})`)
setSent(true)
setTimeout(() => setSent(false), 3000)
} catch (e) {
console.error('Email send failed:', e)
} finally {
setSending(false)
}
}, [cases, sending, country, currentStep])
return { send, sending, sent }
}
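The order of replacements in `esc` matters: the ampersand has to be escaped first, otherwise the entities produced by the later replacements would be escaped a second time. A minimal sketch of that idea follows. `escapeHtml` is a hypothetical name, and the extra single-quote replacement is an assumption added for illustration; it is not part of the file above.

```typescript
// Hypothetical variant of esc(): same replacements, plus the single quote (an added assumption).
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
}
```

The single quote only matters if escaped text ever ends up inside a single-quoted attribute; the email template above interpolates into element content, where the original four replacements are enough.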
@@ -0,0 +1,34 @@
'use client'
import { useState } from 'react'
import type { Citation } from '@/lib/sdk/advisor/contract'
import type { CiteHandler } from './Markdown'
/**
* Couples answer [n] markers to evidence cards: clicking [n] highlights + scrolls to the referenced
* evidence unit. Works across layout columns via the card's DOM id (ev-<evidence_id>).
*/
export function useCitationHighlight(citations: Citation[]): {
highlightedId?: string
cite?: CiteHandler
} {
const [highlightedId, setHighlightedId] = useState<string | undefined>()
if (citations.length === 0) return { highlightedId }
return {
highlightedId,
cite: {
count: citations.length,
onSelect: (n: number) => {
const c = citations[n - 1]
if (!c) return
setHighlightedId(c.evidence_id)
if (typeof document !== 'undefined') {
document.getElementById(`ev-${c.evidence_id}`)?.scrollIntoView({
behavior: 'smooth',
block: 'center',
})
}
},
},
}
}
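The marker lookup in `useCitationHighlight` is positional: marker `[n]` maps to `citations[n - 1]`, and the card is found through the DOM id `ev-<evidence_id>`. A minimal sketch of that mapping as a pure function follows, assuming a hypothetical `MiniCitation` subset of the contract type and a hypothetical helper name `evidenceDomId`.

```typescript
// Hypothetical minimal citation shape (subset of the contract's Citation).
interface MiniCitation {
  number?: number
  evidence_id: string
}

// Resolve a 1-based [n] marker to the DOM id of its evidence card, or undefined when out of range.
function evidenceDomId(citations: MiniCitation[], n: number): string | undefined {
  const c = citations[n - 1] // positional lookup, as in the hook above
  return c ? `ev-${c.evidence_id}` : undefined
}
```

Because the lookup is by array position, it relies on the SDK emitting `citations` in marker order; matching on the optional `number` field instead would be an alternative if that ordering is not guaranteed.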
@@ -0,0 +1,19 @@
'use client'
import { useCallback, useState } from 'react'
/** Writes text to the clipboard and flags the copied key briefly (for a check-mark affordance). */
export function useClipboard(resetMs = 1500) {
const [copiedKey, setCopiedKey] = useState<string | null>(null)
const copy = useCallback(
(key: string, text: string) => {
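// writeText() rejects when clipboard permission is denied; swallow it to avoid an unhandled rejection.
void navigator.clipboard?.writeText(text).catch(() => {})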
setCopiedKey(key)
window.setTimeout(() => setCopiedKey((k) => (k === key ? null : k)), resetMs)
},
[resetMs],
)
return { copiedKey, copy }
}
@@ -0,0 +1,70 @@
/**
* E2E: Compliance Advisor widget UX — topic threads, new-topic vs follow-up, delete, copy, fullscreen.
* Stubs the chat endpoint with an answer fixture so every ask yields a finished case.
*/
import { test, expect } from '../fixtures/sdk-fixtures'
const CHAT_ROUTE = '**/api/sdk/compliance-advisor/chat'
const openAdvisor = 'Compliance Advisor oeffnen'
const ANSWER = {
mode: 'answer',
question: '',
clarity: { is_underspecified: false, dominant_context: 'cyber', concentration: 0.9 },
general_answer: null,
answer: 'Musterantwort [1].',
scoped_query: null,
evidence: [{ evidence_id: 'e1', document: 'DSGVO', section: 'Art. 5', bindingness: 'binding' }],
citations: [{ citation_id: 'c1', number: 1, evidence_id: 'e1', document: 'DSGVO', section: 'Art. 5' }],
visual_evidence: [],
footnotes: [],
}
async function stub(page: import('@playwright/test').Page) {
await page.route(CHAT_ROUTE, (r) =>
r.fulfill({ status: 200, contentType: 'application/json', body: JSON.stringify(ANSWER) }),
)
}
test('new topic creates a second thread; copy control + fullscreen available', async ({ sdkPage }) => {
await stub(sdkPage)
await sdkPage.getByRole('button', { name: openAdvisor }).click()
const input = sdkPage.getByPlaceholder('Frage eingeben...')
await input.fill('Erste Frage')
await input.press('Enter')
await expect(sdkPage.getByText(/Musterantwort/)).toBeVisible()
// expand -> the topic tree ("Themen") appears in the left menu
await sdkPage.getByRole('button', { name: 'Vergroessern' }).click()
await expect(sdkPage.getByText('Themen')).toBeVisible()
await expect(sdkPage.getByText('Erste Frage').first()).toBeVisible()
// a second, separate topic
await sdkPage.getByPlaceholder('Folgefrage eingeben...').fill('Zweites Thema')
await sdkPage.getByRole('button', { name: 'Neues Thema' }).click()
await expect(sdkPage.getByText('Zweites Thema').first()).toBeVisible()
await expect(sdkPage.getByText('Erste Frage').first()).toBeVisible()
// copy affordance + fullscreen toggle
await expect(sdkPage.getByRole('button', { name: 'Diese Frage kopieren' }).first()).toBeVisible()
await sdkPage.getByRole('button', { name: 'Vollbild' }).click()
await expect(sdkPage.getByRole('button', { name: 'Vollbild verlassen' })).toBeVisible()
})
test('delete removes a question from the thread', async ({ sdkPage }) => {
await stub(sdkPage)
await sdkPage.getByRole('button', { name: openAdvisor }).click()
const input = sdkPage.getByPlaceholder('Frage eingeben...')
await input.fill('Zu löschen')
await input.press('Enter')
await expect(sdkPage.getByText(/Musterantwort/)).toBeVisible()
await sdkPage.getByRole('button', { name: 'Vergroessern' }).click()
await expect(sdkPage.getByText('Zu löschen').first()).toBeVisible()
await sdkPage.getByRole('button', { name: 'Frage löschen' }).first().click()
await expect(sdkPage.getByText('Zu löschen')).toHaveCount(0)
})
@@ -0,0 +1,102 @@
/**
* E2E: Compliance Advisor — Clarity Gate (v3 contract)
*
* Drives the floating advisor widget end-to-end against a stubbed /api/sdk/compliance-advisor/chat
* (contract fixtures), so the whole FE chain is exercised without the RAG/LLM backend:
* - underspecified question -> clarify mode (L1 general answer + domain context chips)
* - specific question -> answer mode (markdown + [n] citation coupling + evidence pane)
* - clarify -> pick a context -> scoped answer
* Runs on CI / macmini (needs the Next app on :3002).
*/
import { test, expect } from '../fixtures/sdk-fixtures'
const CHAT_ROUTE = '**/api/sdk/compliance-advisor/chat'
const openAdvisor = 'Compliance Advisor oeffnen'
const inputPlaceholder = 'Frage eingeben...'
const CLARIFY = {
mode: 'clarify',
question: 'Was ist PDCA?',
clarity: {
is_underspecified: true,
concentration: 0.3,
suggested_contexts: [
{ id: 'datenschutz', label: 'Datenschutz' },
{ id: 'cyber', label: 'Cybersecurity' },
],
},
general_answer: 'PDCA steht für **Plan-Do-Check-Act**.',
answer: null,
evidence: [],
citations: [],
visual_evidence: [],
footnotes: [],
}
const ANSWER = {
mode: 'answer',
question: 'CRA Meldefrist',
clarity: { is_underspecified: false, dominant_context: 'cyber', concentration: 0.88 },
answer: 'Die Meldung erfolgt unverzüglich [1].',
evidence: [
{ evidence_id: 'e1', document: 'CRA', section: 'Art. 14', paragraph: 'Abs. 1', snippet: 'unverzüglich melden', bindingness: 'binding' },
],
citations: [
{ citation_id: 'c1', number: 1, evidence_id: 'e1', document: 'CRA', section: 'Art. 14', paragraph: 'Abs. 1' },
],
visual_evidence: [],
footnotes: [],
}
async function ask(page: import('@playwright/test').Page, question: string) {
await page.getByRole('button', { name: openAdvisor }).click()
const input = page.getByPlaceholder(inputPlaceholder)
await input.fill(question)
await input.press('Enter')
}
test.describe('Compliance Advisor — Clarity Gate', () => {
test('underspecified question -> clarify (L1 definition + context chips, no evidence)', async ({ sdkPage }) => {
await sdkPage.route(CHAT_ROUTE, (r) =>
r.fulfill({ status: 200, contentType: 'application/json', body: JSON.stringify(CLARIFY) }),
)
await ask(sdkPage, 'Was ist PDCA?')
await expect(sdkPage.getByText('Allgemeine Definition')).toBeVisible()
await expect(sdkPage.getByText('Plan-Do-Check-Act')).toBeVisible()
await expect(sdkPage.getByRole('button', { name: 'Datenschutz' })).toBeVisible()
await expect(sdkPage.getByRole('button', { name: 'Cybersecurity' })).toBeVisible()
})
test('specific question -> answer with [n] citation + evidence pane', async ({ sdkPage }) => {
await sdkPage.route(CHAT_ROUTE, (r) =>
r.fulfill({ status: 200, contentType: 'application/json', body: JSON.stringify(ANSWER) }),
)
await ask(sdkPage, 'CRA Meldefrist')
await expect(sdkPage.getByText(/unverzüglich/)).toBeVisible()
await expect(sdkPage.getByTitle('Beleg 1 anzeigen')).toBeVisible()
// bindingness present -> header splits into Rechtsgrundlagen vs Leitlinien (evidence framing)
await expect(sdkPage.getByText('Rechtsgrundlagen').first()).toBeVisible()
// family name resolved for the user (shown both in the summary breakdown and the evidence card)
await expect(sdkPage.getByText('Cyber Resilience Act (CRA)').first()).toBeVisible()
})
test('clarify -> pick a context -> scoped answer', async ({ sdkPage }) => {
let calls = 0
await sdkPage.route(CHAT_ROUTE, (r) => {
calls += 1
r.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify(calls === 1 ? CLARIFY : ANSWER),
})
})
await ask(sdkPage, 'Was ist PDCA?')
await sdkPage.getByRole('button', { name: 'Datenschutz' }).click()
await expect(sdkPage.getByText(/unverzüglich/)).toBeVisible()
await expect(sdkPage.getByTitle('Beleg 1 anzeigen')).toBeVisible()
})
})
@@ -0,0 +1,65 @@
import { describe, it, expect } from 'vitest'
import { formatCaseForCopy, formatThreadForCopy } from '../advisor/copy'
import type { AdvisorResponse } from '../advisor/contract'
const answer: AdvisorResponse = {
mode: 'answer',
question: 'CRA Meldefrist',
clarity: { is_underspecified: false, concentration: 0.9 },
general_answer: null,
answer: 'Unverzüglich melden [1].',
scoped_query: 'cyber',
evidence: [{ evidence_id: 'e1', document: 'CRA', section: 'Art. 14', paragraph: 'Abs. 1', url: 'https://x' }],
citations: [],
visual_evidence: [],
footnotes: [{ footnote_id: 'f1', ref: 'Fußnote 3', document: 'EDPB', section: 'Kap III', text: 'Detail' }],
}
const clarify: AdvisorResponse = {
mode: 'clarify',
question: 'Was ist PDCA?',
clarity: { is_underspecified: true, concentration: 0.3 },
general_answer: 'PDCA = Plan-Do-Check-Act.',
answer: null,
scoped_query: null,
evidence: [],
citations: [],
visual_evidence: [],
footnotes: [],
}
describe('formatCaseForCopy', () => {
it('includes question, answer, resolved evidence and footnotes', () => {
const s = formatCaseForCopy({ question: 'CRA Meldefrist', response: answer })
expect(s).toContain('### Frage\nCRA Meldefrist')
expect(s).toContain('### Antwort\nUnverzüglich melden [1].')
expect(s).toContain('### Belege')
expect(s).toContain('Cyber Resilience Act (CRA), Art. 14 Abs. 1 (https://x)')
expect(s).toContain('### Fußnoten')
expect(s).toContain('Fußnote 3 — EDPB / Kap III: Detail')
})
it('falls back to the general definition for a clarify case', () => {
const s = formatCaseForCopy({ question: 'Was ist PDCA?', response: clarify })
expect(s).toContain('### Antwort\nPDCA = Plan-Do-Check-Act.')
expect(s).not.toContain('### Belege')
})
it('handles a case without a response', () => {
const s = formatCaseForCopy({ question: 'offen', response: null })
expect(s).toContain('### Antwort\n(keine Antwort)')
})
})
describe('formatThreadForCopy', () => {
it('renders a title heading + every case separated by a rule', () => {
const s = formatThreadForCopy('CRA Meldefrist', [
{ question: 'CRA Meldefrist', response: answer },
{ question: 'Und für KMU?', response: clarify },
])
expect(s.startsWith('# CRA Meldefrist')).toBe(true)
expect(s).toContain('\n---\n')
expect(s).toContain('CRA Meldefrist')
expect(s).toContain('Und für KMU?')
})
})
@@ -0,0 +1,85 @@
import { describe, it, expect } from 'vitest'
import {
groupByFamily,
provisionSummary,
summarizeEvidence,
type FamilyGroup,
} from '../advisor/evidence-grouping'
import type { EvidenceUnit } from '../advisor/contract'
function u(p: Partial<EvidenceUnit> & { document: string }): EvidenceUnit {
return { evidence_id: Math.random().toString(36).slice(2), ...p }
}
// The Datenschutzerklärung scenario the user reviewed: 6 Kernnormen (5 DSGVO Artikel + § 25 TDDDG)
// + 2 Leitlinien (DSK, EDPB) across 8 evidence units.
const DSE: EvidenceUnit[] = [
u({ document: 'DSGVO', section: 'Art. 6', bindingness: 'binding' }),
u({ document: 'DSGVO', section: 'Art. 7', bindingness: 'binding' }),
u({ document: 'DSGVO', section: 'Art. 12', bindingness: 'binding' }),
u({ document: 'DSGVO', section: 'Art. 13', bindingness: 'binding' }),
u({ document: 'DSGVO', section: 'Art. 14', bindingness: 'binding' }),
u({ document: 'TDDDG', section: '§ 25', bindingness: 'binding' }),
u({ document: 'DSK', bindingness: 'guidance' }),
u({ document: 'EDPB WP 259', bindingness: 'guidance' }),
]
describe('groupByFamily', () => {
it('groups a family and collects distinct provisions in order', () => {
const groups = groupByFamily(DSE)
const dsgvo = groups.find((g) => g.key === 'dsgvo')!
expect(dsgvo.units).toBe(5)
expect(dsgvo.sections).toEqual(['Art. 6', 'Art. 7', 'Art. 12', 'Art. 13', 'Art. 14'])
expect(dsgvo.bindingness).toBe('binding')
})
it('does not duplicate a repeated section', () => {
const groups = groupByFamily([
u({ document: 'DSGVO', section: 'Art. 13', bindingness: 'binding' }),
u({ document: 'DSGVO', section: 'Art. 13', bindingness: 'binding' }),
])
expect(groups[0].sections).toEqual(['Art. 13'])
expect(groups[0].units).toBe(2)
})
})
describe('summarizeEvidence', () => {
it('splits binding norms from guidance with correct counts', () => {
const m = summarizeEvidence(DSE)
expect(m.hasBindingness).toBe(true)
expect(m.normProvisions).toBe(6) // 5 DSGVO Artikel + § 25 TDDDG
expect(m.guidanceCount).toBe(2) // DSK + EDPB
expect(m.unitCount).toBe(8)
expect(m.norms.map((g) => g.key).sort()).toEqual(['dsgvo', 'tddg'])
})
it('degrades to a neutral breakdown when bindingness is absent', () => {
const m = summarizeEvidence([
u({ document: 'DSGVO', section: 'Art. 30' }),
u({ document: 'CRA', section: 'Art. 14' }),
])
expect(m.hasBindingness).toBe(false)
expect(m.groups).toHaveLength(2)
expect(m.normProvisions).toBe(0)
expect(m.guidanceCount).toBe(0)
})
})
describe('provisionSummary', () => {
const g = (sections: string[], units = sections.length): FamilyGroup => ({
key: 'k',
label: 'L',
sections,
units,
bindingness: 'binding',
})
it('names Artikel, §§, single provisions and bare units', () => {
expect(provisionSummary(g(['Art. 6', 'Art. 7', 'Art. 13']))).toBe('3 Artikel')
expect(provisionSummary(g(['§ 25']))).toBe('§ 25')
expect(provisionSummary(g(['§ 25', '§ 26']))).toBe('2 §§')
expect(provisionSummary(g(['Art. 13', '§ 25', 'Anhang I']))).toBe('3 Fundstellen')
expect(provisionSummary(g([], 3))).toBe('3 Fundstellen')
expect(provisionSummary(g([], 1))).toBe('1 Fundstelle')
})
})
@@ -0,0 +1,31 @@
import { describe, it, expect } from 'vitest'
import { resolveRegulation } from '../advisor/regulation-display'
describe('resolveRegulation', () => {
it('groups DSK SDM building blocks under one family + extracts the chapter', () => {
const b51 = resolveRegulation({ code: 'dsk_sdm_b51', short: 'DSK Sdm B51' })
const b41 = resolveRegulation({ code: 'dsk_sdm_b41', short: 'DSK Sdm B41' })
const v31 = resolveRegulation({ code: 'dsk_sdm_v31', short: 'DSK Sdm V31' })
expect(b51.familyKey).toBe('dsk_sdm')
expect(b41.familyKey).toBe('dsk_sdm')
expect(v31.familyKey).toBe('dsk_sdm')
expect(b51.familyLabel).toContain('Standard-Datenschutzmodell')
expect(b51.chapter).toBe('B51')
expect(v31.chapter).toBe('V31')
})
it('maps known regulations to friendly family keys', () => {
expect(resolveRegulation({ code: 'cra', short: 'CRA' }).familyKey).toBe('cra')
expect(resolveRegulation({ code: 'nis2', short: 'NIS2' }).familyKey).toBe('nis2')
expect(resolveRegulation({ code: 'dpf', short: 'DPF' }).familyKey).toBe('dpf')
expect(resolveRegulation({ code: 'dsgvo', short: 'DS-GVO' }).familyKey).toBe('dsgvo')
expect(resolveRegulation({ code: 'bdsg', short: 'BDSG' }).familyKey).toBe('bdsg')
})
it('falls back to code as family + short as label for unknown regulations', () => {
const r = resolveRegulation({ code: 'xyz_reg', short: 'XYZ' })
expect(r.familyKey).toBe('xyz_reg')
expect(r.familyLabel).toBe('XYZ')
expect(r.chapter).toBeUndefined()
})
})
@@ -0,0 +1,81 @@
import { describe, it, expect } from 'vitest'
import {
resolveMode,
mapClarity,
mapFootnotes,
buildCitations,
numberedEvidenceForPrompt,
isLegacyRequest,
} from '../advisor/retrieve-mapping'
import type { EvidenceUnit } from '../advisor/contract'
describe('resolveMode', () => {
it('a chosen context always forces answer', () => expect(resolveMode('clarify', true)).toBe('answer'))
it('clarify + no context -> clarify', () => expect(resolveMode('clarify', false)).toBe('clarify'))
it('answer -> answer', () => expect(resolveMode('answer', false)).toBe('answer'))
it('unknown/undefined -> answer', () => expect(resolveMode(undefined, false)).toBe('answer'))
})
describe('mapClarity', () => {
it('clarify maps candidate_contexts -> suggested_contexts', () => {
const c = mapClarity(
{ mode: 'clarify', concentration: 0.3, candidate_contexts: [{ id: 'ds', label: 'Datenschutz', hits: 5 }] },
'clarify',
)
expect(c.is_underspecified).toBe(true)
expect(c.suggested_contexts).toEqual([{ id: 'ds', label: 'Datenschutz' }])
})
it('answer keeps dominant_context, drops suggestions', () => {
const c = mapClarity({ mode: 'answer', concentration: 0.88, dominant_context: 'ds' }, 'answer')
expect(c.is_underspecified).toBe(false)
expect(c.dominant_context).toBe('ds')
expect(c.suggested_contexts).toBeUndefined()
})
})
const ev: EvidenceUnit[] = [
{ evidence_id: 'e1', document: 'DSGVO', section: 'Art. 30', paragraph: 'Abs. 1', snippet: 'x' },
{ evidence_id: 'e2', document: 'BDSG', section: '§ 38' },
]
describe('buildCitations', () => {
it('numbers citations 1..n mapped to evidence', () => {
const cs = buildCitations(ev)
expect(cs).toHaveLength(2)
expect(cs[0]).toMatchObject({ citation_id: 'c1', number: 1, evidence_id: 'e1' })
expect(cs[1].number).toBe(2)
})
})
describe('numberedEvidenceForPrompt', () => {
it('prefixes each unit with [n] + its location', () => {
const s = numberedEvidenceForPrompt(ev)
expect(s).toContain('[1] DSGVO Art. 30 Abs. 1')
expect(s).toContain('[2] BDSG § 38')
})
})
describe('mapFootnotes', () => {
it('remaps a /retrieve footnote to the contract footnote', () => {
const fns = mapFootnotes([
{ id: 'f1', number: 17, regulation_short: 'EDPB WP248', section: 'Kap III', text: 't' },
])
expect(fns[0]).toMatchObject({
footnote_id: 'f1',
ref: 'Fußnote 17',
document: 'EDPB WP248',
section: 'Kap III',
text: 't',
})
})
})
describe('isLegacyRequest', () => {
it('message-only (workspace) -> legacy stream', () => {
expect(isLegacyRequest({ message: 'Ist meine DSE ausreichend?' })).toBe(true)
})
it('question present -> contract (JSON)', () => {
expect(isLegacyRequest({ question: 'x', message: 'y' })).toBe(false)
expect(isLegacyRequest({ question: 'x' })).toBe(false)
})
})
@@ -0,0 +1,83 @@
// FE-facing contract for the Compliance Advisor "Case" (Clarity Gate).
// Matches the SDK<->FE contract (board 2026-07-01 / memory advisor-clarity-gate-contract).
// The FE renders ONLY these structured fields; it never extracts structure from the answer text.
// The only exception is rendering the deliberate [n] citation markers, mapped via `citations`.
export interface SuggestedContext {
id: string // e.g. "datenschutz"
label: string // e.g. "Datenschutz"
}
export interface ClarityInfo {
is_underspecified: boolean
concentration: number
suggested_contexts?: SuggestedContext[] // clarify mode
dominant_context?: string // answer mode
}
/** A retrieved evidence unit. (`evidence[]` item shape: to be confirmed with the SDK; see the open query on the board.) */
export interface EvidenceUnit {
evidence_id: string
document: string
section?: string
paragraph?: string
snippet?: string
url?: string
regulation_code?: string // preferred key for family grouping (from /retrieve)
context?: string // knowledge space / domain
// Canonical Legal-KG fact (APEX rule): binding norm vs. soft-law guidance. Owned by the
// Legal-KG/RAG, not derived in the FE. Absent until /retrieve populates it (board request 2026-07-01);
// the FE degrades to a neutral per-regulation breakdown when it is missing.
bindingness?: 'binding' | 'guidance'
}
/** Numbered [n] <-> evidence coupling, produced by the SDK (not parsed from the answer). */
export interface Citation {
citation_id: string
number?: number // 1-based marker number ([n])
evidence_id: string
document: string
section?: string | null
paragraph?: string | null
footnote?: string | null
figure?: string | null
}
/** C8 / visual evidence — `visual_type` generalizes beyond figures (flowchart/bpmn/state_machine/...). */
export interface VisualEvidence {
visual_id: string
visual_type: string
caption?: string
document: string
context?: string
image_ref?: string
vision_summary?: string
}
export interface Footnote {
footnote_id?: string
ref?: string
document?: string
section?: string
text?: string
}
export type AdvisorMode = 'clarify' | 'answer'
export interface AdvisorResponse {
mode: AdvisorMode
question: string
clarity: ClarityInfo
general_answer?: string | null // L1 (clarify mode)
answer?: string | null // L2 (answer mode)
scoped_query?: string | null
evidence: EvidenceUnit[]
citations: Citation[]
visual_evidence: VisualEvidence[]
footnotes: Footnote[]
}
export interface AdvisorRequest {
question: string
context?: string | null
}
@@ -0,0 +1,48 @@
// Pure formatters for "copy to clipboard": a single case (Q + answer + evidence) or a whole thread.
// No DOM/clipboard here (that is the caller's side effect) so this stays testable.
import type { AdvisorResponse, EvidenceUnit } from './contract'
import { resolveRegulation } from './regulation-display'
export interface CopyableCase {
question: string
response: AdvisorResponse | null
}
function evidenceLine(e: EvidenceUnit): string {
const { familyLabel } = resolveRegulation({ code: e.regulation_code || e.document, short: e.document })
const loc = [e.section, e.paragraph].filter(Boolean).join(' ')
const ref = [familyLabel, loc].filter(Boolean).join(', ')
return e.url ? `- ${ref} (${e.url})` : `- ${ref}`
}
/** One case as portable Markdown: question, answer (or general definition), and its evidence. */
export function formatCaseForCopy(c: CopyableCase): string {
const parts: string[] = [`### Frage`, c.question.trim()]
const r = c.response
const answer = (r?.answer || r?.general_answer || '').trim()
parts.push('', '### Antwort', answer || '(keine Antwort)')
if (r && r.evidence.length > 0) {
parts.push('', '### Belege', ...r.evidence.map(evidenceLine))
}
if (r && r.footnotes.length > 0) {
parts.push(
'',
'### Fußnoten',
...r.footnotes.map((f, i) => {
const head = f.ref || `Fußnote ${i + 1}`
const src = [f.document, f.section].filter(Boolean).join(' / ')
return `- ${[head, src].filter(Boolean).join(' — ')}${f.text ? `: ${f.text}` : ''}`
}),
)
}
return parts.join('\n')
}
/** A whole topic thread: a title heading followed by every case, separated by rules. */
export function formatThreadForCopy(title: string, cases: CopyableCase[]): string {
const header = `# ${title.trim() || 'Compliance-Advisor-Verlauf'}`
const body = cases.map(formatCaseForCopy).join('\n\n---\n\n')
return `${header}\n\n${body}`
}
@@ -0,0 +1,80 @@
// Pure grouping/counting for the "Diese Antwort stützt sich auf" evidence header. No React, testable.
// Splits evidence into binding norms (Kernnormen) vs. soft-law guidance (Leitlinien) using the
// Legal-KG-owned `bindingness` fact (APEX rule) — the FE never derives bindingness itself. When the
// fact is absent it degrades to a neutral per-regulation breakdown (no norm/guidance labels, no
// fabricated legal classification).
import type { EvidenceUnit } from './contract'
import { resolveRegulation } from './regulation-display'
export type Bindingness = 'binding' | 'guidance' | 'unknown'
export interface FamilyGroup {
key: string // stable family key (grouping)
label: string // human-readable regulation name
sections: string[] // distinct provisions in first-seen order (e.g. "Art. 13", "§ 25")
units: number // raw evidence units in this family
bindingness: Bindingness
}
export interface EvidenceSummaryModel {
groups: FamilyGroup[]
norms: FamilyGroup[] // bindingness === 'binding'
guidance: FamilyGroup[] // bindingness === 'guidance'
other: FamilyGroup[] // bindingness unknown
hasBindingness: boolean // at least one unit carries the Legal-KG fact
normProvisions: number // distinct binding provisions (Kernnormen)
guidanceCount: number // distinct guidance documents (Leitlinien)
unitCount: number // total evidence units
}
export function groupByFamily(evidence: EvidenceUnit[]): FamilyGroup[] {
const byKey = new Map<string, FamilyGroup>()
for (const e of evidence) {
const { familyKey, familyLabel } = resolveRegulation({
code: e.regulation_code || e.document,
short: e.document,
})
let g = byKey.get(familyKey)
if (!g) {
g = { key: familyKey, label: familyLabel, sections: [], units: 0, bindingness: 'unknown' }
byKey.set(familyKey, g)
}
g.units += 1
if (e.section && !g.sections.includes(e.section)) g.sections.push(e.section)
if (e.bindingness && g.bindingness === 'unknown') g.bindingness = e.bindingness
}
return [...byKey.values()]
}
/** Distinct provisions for a family; falls back to the raw unit count when no section is known. */
export function provisionCount(g: FamilyGroup): number {
return g.sections.length || g.units
}
/** "5 Artikel" / "§ 25" / "3 Fundstellen" — the noun follows the family's own citation style. */
export function provisionSummary(g: FamilyGroup): string {
const n = g.sections.length
if (n === 0) return `${g.units} ${g.units === 1 ? 'Fundstelle' : 'Fundstellen'}`
if (n === 1) return g.sections[0]
if (g.sections.every((s) => /^\s*art/i.test(s))) return `${n} Artikel`
if (g.sections.every((s) => s.trim().startsWith('§'))) return `${n} §§`
return `${n} Fundstellen`
}
export function summarizeEvidence(evidence: EvidenceUnit[]): EvidenceSummaryModel {
const groups = groupByFamily(evidence)
const norms = groups.filter((g) => g.bindingness === 'binding')
const guidance = groups.filter((g) => g.bindingness === 'guidance')
const other = groups.filter((g) => g.bindingness === 'unknown')
return {
groups,
norms,
guidance,
other,
hasBindingness: norms.length > 0 || guidance.length > 0,
normProvisions: norms.reduce((n, g) => n + provisionCount(g), 0),
guidanceCount: guidance.length,
unitCount: evidence.length,
}
}
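The noun selection in `provisionSummary` is easiest to follow with sample inputs. The sketch below repeats that logic on a reduced type; `SketchGroup` and `sketchSummary` are illustrative names, and the section strings are made-up examples.

```typescript
// Minimal stand-in for FamilyGroup (assumption: only the fields the summary reads).
interface SketchGroup {
  sections: string[]
  units: number
}

function sketchSummary(g: SketchGroup): string {
  const n = g.sections.length
  // No known section: count raw evidence units instead.
  if (n === 0) return `${g.units} ${g.units === 1 ? 'Fundstelle' : 'Fundstellen'}`
  // Exactly one provision: show it verbatim.
  if (n === 1) return String(g.sections[0])
  // Several provisions: the noun follows the citation style shared by all of them.
  if (g.sections.every((s) => /^\s*art/i.test(s))) return `${n} Artikel`
  if (g.sections.every((s) => s.trim().startsWith('§'))) return `${n} §§`
  return `${n} Fundstellen`
}

const summaries = [
  sketchSummary({ sections: ['Art. 13', 'Art. 14'], units: 3 }),
  sketchSummary({ sections: ['§ 25', '§ 26'], units: 2 }),
  sketchSummary({ sections: ['Art. 5', '§ 3'], units: 2 }),
  sketchSummary({ sections: ['§ 25'], units: 4 }),
  sketchSummary({ sections: [], units: 1 }),
]
console.log(summaries.join(' | '))
```

Mixed citation styles deliberately fall through to the neutral "Fundstellen", so the header does not claim an article count for a family cited partly by paragraph sign.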
@@ -0,0 +1,57 @@
// Human-readable display for regulations. Maps messy codes/short-names to a stable FAMILY key +
// friendly label (+ chapter for multi-part works like the DSK SDM). Presentation layer only:
// it bridges the gap until G2 (clean RAG metadata) lands and keeps working once codes are clean. Extend the table freely.
export interface RegulationRef {
code?: string
name?: string
short?: string
}
export interface RegulationDisplay {
familyKey: string // stable key used to GROUP evidence
familyLabel: string // human-readable regulation name
chapter?: string // e.g. "B51" for a DSK SDM building block
}
interface Rule {
test: RegExp
key: string
label: string
chapter?: RegExp
}
// Order matters: more specific patterns first.
const RULES: Rule[] = [
{
test: /dsk.?sdm|standard.?datenschutzmodell|(^|[^a-z])sdm([^a-z]|$)/i,
key: 'dsk_sdm',
label: 'DSK Standard-Datenschutzmodell (SDM)',
chapter: /\b([A-Z]\d{1,3})\b/,
},
{ test: /cyber.?resilience|(^|[^a-z])cra([^a-z]|$)/i, key: 'cra', label: 'Cyber Resilience Act (CRA)' },
{ test: /(^|[^a-z])nis.?2([^a-z]|$)/i, key: 'nis2', label: 'NIS2-Richtlinie' },
{ test: /data.?privacy.?framework|(^|[^a-z])dpf([^a-z]|$)/i, key: 'dpf', label: 'EU-US Data Privacy Framework' },
{ test: /maschinen|2023.?1230/i, key: 'maschinenvo', label: 'Maschinenverordnung (EU) 2023/1230' },
{ test: /ds.?gvo|gdpr/i, key: 'dsgvo', label: 'DSGVO Datenschutz-Grundverordnung' },
{ test: /(^|[^a-z])bdsg([^a-z]|$)/i, key: 'bdsg', label: 'BDSG Bundesdatenschutzgesetz' },
{ test: /tdddg|ttdsg/i, key: 'tddg', label: 'TDDDG (Digitale-Dienste-Datenschutz)' },
{ test: /edpb|edsa|(^|[^a-z])wp\s?\d+/i, key: 'edpb', label: 'EDPB / DSK Leitlinien' },
{ test: /(^|[^a-z])bsi([^a-z]|$)/i, key: 'bsi', label: 'BSI' },
]
export function resolveRegulation(reg: RegulationRef): RegulationDisplay {
const hay = `${reg.code || ''} ${reg.short || ''} ${reg.name || ''}`
for (const r of RULES) {
if (r.test.test(hay)) {
const chapter = r.chapter
? r.chapter.exec(reg.short || reg.code || '')?.[1] || undefined
: undefined
return { familyKey: r.key, familyLabel: r.label, chapter }
}
}
return {
familyKey: reg.code || reg.short || 'unknown',
familyLabel: reg.short || reg.name || reg.code || 'Regelwerk',
}
}
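Because the rule table is scanned top to bottom and the first match wins, the order of entries changes the result for inputs that match several patterns. The sketch below shows this with three patterns copied from the table; `SketchRule`, `resolveKey` and the sample strings are illustrative assumptions.

```typescript
interface SketchRule {
  test: RegExp
  key: string
  chapter?: RegExp
}

// First matching rule wins; the optional chapter pattern pulls a building-block id.
function resolveKey(rules: SketchRule[], hay: string): { key: string; chapter?: string } {
  for (const r of rules) {
    if (r.test.test(hay)) {
      const chapter = r.chapter ? r.chapter.exec(hay)?.[1] : undefined
      return { key: r.key, chapter }
    }
  }
  return { key: hay || 'unknown' }
}

const rules: SketchRule[] = [
  { test: /dsk.?sdm/i, key: 'dsk_sdm', chapter: /\b([A-Z]\d{1,3})\b/ },
  { test: /(^|[^a-z])nis.?2([^a-z]|$)/i, key: 'nis2' },
  { test: /(^|[^a-z])bsi([^a-z]|$)/i, key: 'bsi' },
]

const sdm = resolveKey(rules, 'DSK-SDM B51')
// This string matches both the NIS2 and the BSI pattern.
const ordered = resolveKey(rules, 'BSI NIS2 Umsetzung')
const reversed = resolveKey([...rules].reverse(), 'BSI NIS2 Umsetzung')
console.log(sdm, ordered.key, reversed.key)
```

None of the patterns carries the `g` flag, so `RegExp.prototype.test` stays stateless and the shared rule objects can be reused across calls.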
@@ -0,0 +1,91 @@
// Pure mappings from the Go /retrieve response (SDK/RAG-owned; board 2026-07-01 12:25)
// to the FE-facing advisor contract. Kept pure + testable; the orchestration (route.ts) wires them.
import type { Citation, ClarityInfo, EvidenceUnit, Footnote, VisualEvidence } from './contract'
export interface RetrieveClarity {
mode?: string // 'clarify' | 'answer'
reason?: string // e.g. 'middle_band_llm_needed'
concentration?: number
domain_count?: number
dominant_context?: string
candidate_contexts?: { id: string; label: string; hits?: number }[]
}
export interface RetrieveFootnote {
id?: string
ref?: string
number?: number
regulation_code?: string
regulation_short?: string
regulation_name?: string
section?: string
text?: string
}
export interface RetrieveResponse {
evidence?: EvidenceUnit[]
visual_evidence?: VisualEvidence[]
footnotes?: RetrieveFootnote[]
clarity?: RetrieveClarity
results?: unknown[]
tables?: unknown[] // C6 — not in the FE contract yet (future TablesPane)
}
/** A chosen context always yields 'answer'; for un-scoped queries /retrieve's clarity.mode decides (default 'answer'). */
export function resolveMode(clarityMode: string | undefined, hasContext: boolean): 'clarify' | 'answer' {
if (hasContext) return 'answer'
return clarityMode === 'clarify' ? 'clarify' : 'answer'
}
export function mapClarity(c: RetrieveClarity | undefined, mode: 'clarify' | 'answer'): ClarityInfo {
return {
is_underspecified: mode === 'clarify',
concentration: c?.concentration ?? 0,
dominant_context: c?.dominant_context,
suggested_contexts:
mode === 'clarify' ? (c?.candidate_contexts ?? []).map((cc) => ({ id: cc.id, label: cc.label })) : undefined,
}
}
export function mapFootnotes(fns: RetrieveFootnote[] | undefined): Footnote[] {
return (fns ?? []).map((f) => ({
footnote_id: f.id,
ref: f.ref ?? (f.number != null ? `Fußnote ${f.number}` : undefined),
document: f.regulation_short || f.regulation_name || f.regulation_code,
section: f.section,
text: f.text,
}))
}
/** Citations are generated by the orchestration (not by /retrieve): [n] -> nth evidence unit. */
export function buildCitations(evidence: EvidenceUnit[]): Citation[] {
return evidence.map((e, i) => ({
citation_id: `c${i + 1}`,
number: i + 1,
evidence_id: e.evidence_id,
document: e.document,
section: e.section ?? null,
paragraph: e.paragraph ?? null,
footnote: null,
figure: null,
}))
}
/** Numbered evidence list injected into the L2 prompt so the LLM can cite [n]. */
export function numberedEvidenceForPrompt(evidence: EvidenceUnit[]): string {
return evidence
.map((e, i) => {
const loc = [e.document, e.section, e.paragraph].filter(Boolean).join(' ')
return `[${i + 1}] ${loc}\n${e.snippet ?? ''}`.trim()
})
.join('\n\n')
}
/**
* Backward-compat discriminator: legacy consumers (e.g. breakpilot-workspace) send `{message}`
* and read a plain-text stream; the new FE sends `{question}` and expects the JSON contract.
*/
export function isLegacyRequest(body: { question?: unknown; message?: unknown }): boolean {
return body.question == null && typeof body.message === 'string'
}
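The coupling between the numbered prompt list and the citations is purely positional: the nth evidence unit is printed as `[n]` and receives citation number n. A small sketch under simplified types (`SketchEvidence` and the sample data are assumptions; the functions repeat the logic above in reduced form):

```typescript
type Mode = 'clarify' | 'answer'

interface SketchEvidence {
  evidence_id: string
  document: string
  section?: string
  snippet?: string
}

// A chosen context forces 'answer'; otherwise only an explicit 'clarify' triggers the gate.
function sketchMode(clarityMode: string | undefined, hasContext: boolean): Mode {
  if (hasContext) return 'answer'
  return clarityMode === 'clarify' ? 'clarify' : 'answer'
}

// "[n] location\nsnippet" blocks, separated by blank lines.
function sketchPrompt(evidence: SketchEvidence[]): string {
  return evidence
    .map((e, i) => {
      const loc = [e.document, e.section].filter(Boolean).join(' ')
      return `[${i + 1}] ${loc}\n${e.snippet ?? ''}`.trim()
    })
    .join('\n\n')
}

// Citation n points at the same array position that was printed as [n].
function sketchCitations(evidence: SketchEvidence[]): { number: number; evidence_id: string }[] {
  return evidence.map((e, i) => ({ number: i + 1, evidence_id: e.evidence_id }))
}

const evidence: SketchEvidence[] = [
  { evidence_id: 'e1', document: 'DSGVO', section: 'Art. 13', snippet: 'Informationspflicht bei Erhebung.' },
  { evidence_id: 'e2', document: 'BDSG', section: '§ 38' },
]
const prompt = sketchPrompt(evidence)
const citations = sketchCitations(evidence)
console.log(prompt)
console.log(citations, sketchMode('clarify', true), sketchMode('clarify', false))
```

Since both lists derive from the same array order, any reordering of the evidence has to happen before the prompt is built, otherwise `[n]` markers in the answer would point at the wrong unit.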
@@ -51,8 +51,8 @@ describe('advisor-rag', () => {
})
})
-describe('queryAdvisorRAG', () => {
-it('fragt alle 6 Collections ab und formatiert die Treffer', async () => {
describe('queryAdvisorRAG (Authority Router)', () => {
it('ruft den Router EINMAL auf und formatiert die Treffer', async () => {
mockFetch.mockResolvedValue({
ok: true,
json: async () => ({ results: [{ text: 'Inhalt A', regulation_short: 'DSGVO', score: 0.9 }] }),
@@ -60,19 +60,19 @@ describe('advisor-rag', () => {
const result = await mod.queryAdvisorRAG('Was ist eine DSFA?')
expect(result).toContain('[Quelle 1: DSGVO]')
expect(result).toContain('Inhalt A')
-expect(mockFetch).toHaveBeenCalledTimes(mod.COMPLIANCE_COLLECTIONS.length)
expect(mockFetch).toHaveBeenCalledTimes(1)
})
-it('ruft die ai-sdk /sdk/v1/rag/search mit collection + top_k auf', async () => {
it('ruft /sdk/v1/rag/retrieve mit query + top_k (ohne collection) auf', async () => {
mockFetch.mockResolvedValue({ ok: true, json: async () => ({ results: [] }) })
await mod.queryAdvisorRAG('test')
expect(mockFetch).toHaveBeenCalledWith(
-expect.stringContaining('/sdk/v1/rag/search'),
expect.stringContaining('/sdk/v1/rag/retrieve'),
expect.objectContaining({ method: 'POST' }),
)
const body = JSON.parse(mockFetch.mock.calls[0][1].body)
-expect(body).toMatchObject({ query: 'test', top_k: 3 })
-expect(mod.COMPLIANCE_COLLECTIONS).toContain(body.collection)
expect(body).toMatchObject({ query: 'test', top_k: 8 })
expect(body.collection).toBeUndefined()
})
it('liefert leeren String wenn das RAG-Backend nicht erreichbar ist (graceful)', async () => {
@@ -80,10 +80,5 @@ describe('advisor-rag', () => {
const result = await mod.queryAdvisorRAG('test')
expect(result).toBe('')
})
-it('umfasst genau die 6 Compliance-Collections', () => {
-expect(mod.COMPLIANCE_COLLECTIONS).toHaveLength(6)
-expect(mod.COMPLIANCE_COLLECTIONS).toContain('bp_compliance_recht')
-})
})
})
@@ -138,3 +138,26 @@ export async function streamAdvisorAnswer(
if (ollama) return textStream(ollama, parseOllamaLine)
return null
}
/**
* Non-streaming variant: collects the complete LLM answer as a string (for the
* JSON-contract response of the Advisor orchestration). null = no LLM reachable.
*/
export async function completeAdvisorAnswer(messages: ChatMessage[]): Promise<string | null> {
const stream = await streamAdvisorAnswer(messages)
if (!stream) return null
const reader = stream.getReader()
const decoder = new TextDecoder()
let out = ''
try {
for (;;) {
const { done, value } = await reader.read()
if (done) break
if (value) out += decoder.decode(value, { stream: true })
}
out += decoder.decode()
} finally {
reader.releaseLock()
}
return out
}
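The `{ stream: true }` option and the final argument-less `decode()` matter whenever a multi-byte UTF-8 character is split across two chunks. The sketch below isolates just that decoding step on an in-memory chunk list; `decodeChunks` and `decodeNaively` are illustrative helpers, not functions from the diff.

```typescript
// Decode a sequence of UTF-8 chunks the way a stream reader loop would.
function decodeChunks(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder()
  let out = ''
  for (const chunk of chunks) {
    // stream: true keeps an incomplete multi-byte sequence buffered for the next chunk.
    out += decoder.decode(chunk, { stream: true })
  }
  // A final call without input flushes whatever is still buffered.
  out += decoder.decode()
  return out
}

// Naive variant for comparison: every chunk is decoded in isolation.
function decodeNaively(chunks: Uint8Array[]): string {
  return chunks.map((c) => new TextDecoder().decode(c)).join('')
}

const bytes = new TextEncoder().encode('Fußnote')
// "ß" occupies two bytes in UTF-8 (byte offsets 2 and 3); cut between them.
const split = [bytes.slice(0, 3), bytes.slice(3)]
const streamed = decodeChunks(split)
const naive = decodeNaively(split)
console.log(streamed, naive === 'Fußnote')
```

The naive variant emits replacement characters around the cut, which is why a collected LLM answer with umlauts should go through a single streaming decoder.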
@@ -1,15 +1,19 @@
/**
* Compliance Advisor RAG search.
*
- * Queries the ai-compliance-sdk (`/sdk/v1/rag/search`) instead of the former
- * `rag-service:8097` (not reachable on prod). The ai-sdk embeds the query
- * with bge-m3 (prod: ollama-embed) and searches the Qdrant compliance collections,
- * so the Advisor benefits from the richer embedding.
* Queries the Authority Router of the ai-compliance-sdk (`/sdk/v1/rag/retrieve`) with ONLY the
* query. The router selects the collections itself (broad authority base + KB-2026.1 slice
* when in scope), embeds with bge-m3 (prod: ollama-embed), then merges and ranks by authority. The
* Advisor thus stays collection-agnostic (contract: Compiler -> Collections -> Retriever
* -> Advisor); the former multi-collection logic now lives in the retriever.
*
- * Errors per collection are swallowed (graceful: answer without that hit).
- * Citations via article_label are live as of the prod re-ingest 2026-06.
* `retrieveAdvisorEvidence` returns the STRUCTURED hits (for the Evidence Workspace
* frontend, which renders only structured data and never parses the answer text) AND the
* preformatted context block for the LLM prompt. Errors are swallowed (graceful).
*/
import type { RetrieveResponse } from '@/lib/sdk/advisor/retrieve-mapping'
const SDK_URL =
process.env.SDK_API_URL || process.env.SDK_URL || 'http://ai-compliance-sdk:8090'
@@ -17,17 +21,7 @@ const DEFAULT_USER = '00000000-0000-0000-0000-000000000001'
const DEFAULT_TENANT =
process.env.DEFAULT_TENANT_ID || '9282a473-5c95-4b3a-bf78-0ecc0ec71d3e'
-// Compliance-relevant collections (ai-sdk whitelist `AllowedCollections`).
-export const COMPLIANCE_COLLECTIONS = [
-'bp_compliance_gesetze',
-'bp_compliance_ce',
-'bp_compliance_datenschutz',
-'bp_dsfa_corpus',
-'bp_compliance_recht',
-'bp_legal_templates',
-] as const
-interface SdkRagResult {
export interface SdkRagResult {
text?: string
regulation_code?: string
regulation_name?: string
@@ -43,20 +37,27 @@ interface SdkRagResult {
score?: number
}
/** Raw RAG response. `figures`/`footnotes` (C8 / C-FN) are passed through untyped until the
* RAG-ingestion contract is finalized (board), then mapped in the evidence-adapter. */
interface SdkRagResponse {
results?: SdkRagResult[]
figures?: unknown[]
footnotes?: unknown[]
}
interface ScoredPassage {
content: string
source: string
score: number
}
-/** Normalizes an ai-sdk RAG response to {content, source, score}. */
/** Normalizes an ai-sdk RAG response to {content, source, score} (for the prompt context). */
export function mapSdkResults(results: SdkRagResult[] | undefined): ScoredPassage[] {
return (results || [])
.map((r) => ({
content: r.text || '',
// Citation: article_label is the ready-formatted, printable source from
-// ingestion ("BDSG § 38 Abs. 1"); the fallback builds it from the structured fields
-// (or for legacy-ingested chunks without legal metadata). See rag_reingest_spec.md §2/§7.
// ingestion ("BDSG § 38 Abs. 1"); the fallback builds it from the structured fields.
source:
(r.article_label && r.article_label.trim()) ||
[r.regulation_short || r.regulation_name || r.regulation_code, r.article, r.paragraph, r.sub]
@@ -68,39 +69,82 @@ export function mapSdkResults(results: SdkRagResult[] | undefined): ScoredPassag
.filter((p) => p.content)
}
-async function searchCollection(collection: string, query: string): Promise<ScoredPassage[]> {
/** Formats the top passages as a context block for the system prompt. */
function formatContext(passages: ScoredPassage[]): string {
if (passages.length === 0) return ''
return passages
.map((r, i) => `[Quelle ${i + 1}: ${r.source}]\n${r.content}`)
.join('\n\n---\n\n')
}
/** ONE collection-agnostic call to the ai-sdk. Errors -> empty result (graceful). */
async function fetchRag(query: string): Promise<SdkRagResponse> {
try {
-const res = await fetch(`${SDK_URL}/sdk/v1/rag/search`, {
const res = await fetch(`${SDK_URL}/sdk/v1/rag/retrieve`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'X-User-ID': DEFAULT_USER,
'X-Tenant-ID': DEFAULT_TENANT,
},
-body: JSON.stringify({ query, collection, top_k: 3 }),
-signal: AbortSignal.timeout(10000),
body: JSON.stringify({ query, top_k: 8 }),
signal: AbortSignal.timeout(15000),
})
-if (!res.ok) return []
-const data = await res.json()
-return mapSdkResults(data.results)
if (res.ok) return ((await res.json()) as SdkRagResponse) || {}
} catch {
-return []
// graceful: no connection -> answer without RAG context
}
return {}
}
export interface AdvisorEvidenceRaw {
contextText: string
results: SdkRagResult[]
figures?: unknown[]
footnotes?: unknown[]
}
/**
- * Queries all compliance collections in parallel and returns the top-8 passages
- * as a formatted context block (or '' if nothing is reachable/found).
* Structured evidence + prompt context from ONE retrieval. The frontend receives the
* `results` (and later `figures`/`footnotes`) as data; the `contextText` goes into the
* LLM prompt. The order of the authority-ranked top-K is preserved.
*/
-export async function queryAdvisorRAG(query: string): Promise<string> {
-const settled = await Promise.all(
-COMPLIANCE_COLLECTIONS.map((c) => searchCollection(c, query)),
-)
-const all = settled.flat()
-if (all.length === 0) return ''
-all.sort((a, b) => b.score - a.score)
-return all
-.slice(0, 8)
-.map((r, i) => `[Quelle ${i + 1}: ${r.source}]\n${r.content}`)
-.join('\n\n---\n\n')
export async function retrieveAdvisorEvidence(query: string): Promise<AdvisorEvidenceRaw> {
const data = await fetchRag(query)
const results = data.results || []
return {
contextText: formatContext(mapSdkResults(results)),
results,
figures: Array.isArray(data.figures) ? data.figures : undefined,
footnotes: Array.isArray(data.footnotes) ? data.footnotes : undefined,
}
}
/** Backward compatible: only the prompt context as a string. */
export async function queryAdvisorRAG(query: string): Promise<string> {
return (await retrieveAdvisorEvidence(query)).contextText
}
/**
* Full `/retrieve` call for the clarity-gate orchestration: returns the structured
* SDK/RAG response (evidence/visual_evidence/footnotes/tables/clarity/results). `context` scopes
* the second call to the chosen domain. Errors -> empty result (graceful).
*/
export async function retrieveFull(query: string, context?: string | null): Promise<RetrieveResponse> {
try {
const res = await fetch(`${SDK_URL}/sdk/v1/rag/retrieve`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'X-User-ID': DEFAULT_USER,
'X-Tenant-ID': DEFAULT_TENANT,
},
body: JSON.stringify({ query, top_k: 8, ...(context ? { context } : {}) }),
signal: AbortSignal.timeout(15000),
})
if (res.ok) return ((await res.json()) as RetrieveResponse) || {}
} catch {
// graceful: no connection -> empty result
}
return {}
}
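The conditional spread in the request body means the `context` key is omitted entirely for un-scoped calls rather than sent as `null` or an empty string. A minimal sketch of just that body construction; `buildRetrieveBody` is a hypothetical helper introduced for illustration, and the context id `datenschutz` is a made-up example value.

```typescript
// Hypothetical helper (not in the diff): builds the JSON body for POST /sdk/v1/rag/retrieve.
function buildRetrieveBody(query: string, context?: string | null): string {
  // Falsy contexts (undefined, null, '') add no key at all.
  return JSON.stringify({ query, top_k: 8, ...(context ? { context } : {}) })
}

const unscoped = buildRetrieveBody('Was ist eine DSFA?')
const scoped = buildRetrieveBody('Was ist eine DSFA?', 'datenschutz')
const cleared = buildRetrieveBody('Was ist eine DSFA?', null)
console.log(unscoped)
console.log(scoped)
```

Omitting the key keeps the first (un-scoped) request byte-identical to what a context-unaware caller would send.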
@@ -1,6 +1,9 @@
package handlers
import (
"encoding/json"
"fmt"
"log"
"net/http"
"strconv"
@@ -82,6 +85,195 @@ func (h *RAGHandlers) Search(c *gin.Context) {
})
}
// RetrieveRequest is the Authority Router request: a query only, no collection — the router decides
// which collections to query (broad authority base + the in-scope KB-2026.1 slice).
type RetrieveRequest struct {
Query string `json:"query" binding:"required"`
TopK int `json:"top_k,omitempty"`
Context string `json:"context,omitempty"`
}
// Retrieve is the Authority Router endpoint. The Advisor calls this with ONLY a query and stays
// collection-agnostic; the router fans out over the authority base + the in-scope slice, merges by
// authority score, and returns the unified top-K. The response keeps Search's keys (query/results/
// count/assessment) and adds further ones, so existing consumers parse it unchanged.
// POST /sdk/v1/rag/retrieve
func (h *RAGHandlers) Retrieve(c *gin.Context) {
var req RetrieveRequest
if err := c.ShouldBindJSON(&req); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
return
}
if req.TopK <= 0 || req.TopK > 20 {
req.TopK = 8
}
// E2 Term Resolution: expand unambiguous abbreviations (TOM/VVT/AVV/DSB/DSFA) into the
// query so retrieval finds them; ambiguous ones (DSE/DPA) are surfaced to the FE — NOT
// auto-mapped (chat context E1 wins, else the FE asks).
intent := ucca.DetectIntent(req.Query)
termRes := ucca.ResolveAbbreviations(req.Query)
req.Query = termRes.Expanded
results, err := h.ragClient.Retrieve(c.Request.Context(), req.Query, req.TopK)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "RAG retrieve failed: " + err.Error()})
return
}
// Evidence type layer: surface the authoritative typed evidence (footnotes/tables/figures) from
// the KB knowledge space SEPARATELY instead of losing it in the broad-base text merge.
// results[] remains the text context for the LLM plus the source list.
// Context scoping (E5): the user explicitly chose a knowledge space (chip), so scope
// the evidence HARD to it (wider re-retrieve + domain filter): no off-domain regulations
// (MDR/UStG/eIDAS) after a context decision.
if req.Context != "" {
if wide, werr := h.ragClient.Retrieve(c.Request.Context(), req.Query, 30); werr == nil && len(wide) > 0 {
results = ucca.FilterByKnowledgeSpace(wide, req.Context, req.TopK)
} else {
results = ucca.FilterByKnowledgeSpace(results, req.Context, req.TopK)
}
}
// G1 scope-gating: a named regulation scopes the evidence to its knowledge space.
// Re-retrieve wider and lead with the named regulation's domain so the L2 answer +
// [n] citations are built on scoped evidence, not the embedding-majority domain.
if scope := ucca.QueryKnowledgeSpace(req.Query); scope != "" {
if wide, werr := h.ragClient.Retrieve(c.Request.Context(), req.Query, 30); werr == nil && len(wide) > 0 {
results = ucca.ScopeResults(wide, scope, req.TopK)
} else {
results = ucca.ScopeResults(results, scope, req.TopK)
}
}
ev := h.ragClient.RetrieveEvidence(c.Request.Context(), req.Query)
// Concept->Norm recall injector: if the query names a legal concept, fetch its
// load-bearing norms (Datenschutzerklärung -> Art. 12/13/14 DSGVO, ...) and inject
// them into the evidence set so they surface (embedding similarity misses them).
if norms := ucca.ConceptNorms(req.Query); len(norms) > 0 {
top := 0.9
if len(results) > 0 {
top = results[0].Score
}
injected := h.ragClient.FetchByNormIDs(c.Request.Context(), norms, top-0.001)
results = ucca.InjectConceptNorms(results, injected, req.TopK)
}
clarity := ucca.ClassifyClarity(req.Query, results)
traceClarity(req.Query, clarity, results)
c.JSON(http.StatusOK, gin.H{
"query": req.Query,
"results": results,
"count": len(results),
"assessment": ucca.Assess(results),
"footnotes": footnotesFromEvidence(ev[ucca.EvidenceFootnote]),
"tables": tablesFromEvidence(ev[ucca.EvidenceTable]),
"evidence": evidenceFromResults(results),
"visual_evidence": visualEvidenceFromEvidence(ev[ucca.EvidenceFigure]),
"clarity": clarity,
"term_resolution": termRes.Ambiguous,
"interaction_intent": intent,
})
}
// footnotesFromEvidence maps FOOTNOTE evidence to the Evidence-Workspace RawFootnote shape.
func footnotesFromEvidence(rs []ucca.LegalSearchResult) []gin.H {
out := make([]gin.H, 0, len(rs))
for _, r := range rs {
out = append(out, gin.H{
"id": r.CitationUnit,
"ref": r.CitationUnit,
"number": r.FootnoteLabel,
"regulation_code": r.RegulationCode,
"regulation_short": r.RegulationShort,
"regulation_name": r.RegulationName,
"section": r.RefCitationUnit,
"text": r.FootnoteVerbatim,
})
}
return out
}
// tablesFromEvidence maps TABLE evidence (C6/C9). Key is present so the same Evidence-Type path
// carries tables the moment the UI adds a table section.
func tablesFromEvidence(rs []ucca.LegalSearchResult) []gin.H {
out := make([]gin.H, 0, len(rs))
for _, r := range rs {
out = append(out, gin.H{
"id": r.CitationUnit,
"caption": r.ArticleLabel,
"regulation_code": r.RegulationCode,
"regulation_short": r.RegulationShort,
"regulation_name": r.RegulationName,
"section": r.RefCitationUnit,
"text": r.Text,
})
}
return out
}
// visualEvidenceFromEvidence maps FIGURE evidence to the Visual Evidence contract shape
// (C8). visual_type/image_ref/vision_summary populate once C8 lands; the shape is stable now.
func visualEvidenceFromEvidence(rs []ucca.LegalSearchResult) []gin.H {
out := make([]gin.H, 0, len(rs))
for _, r := range rs {
out = append(out, gin.H{
"visual_id": r.CitationUnit,
"visual_type": "figure",
"caption": r.ArticleLabel,
"document": evidenceDocName(r),
"context": ucca.KnowledgeSpaceOf(r.RegulationCode),
"regulation_code": r.RegulationCode,
"section": r.RefCitationUnit,
"image_ref": "",
"vision_summary": "",
})
}
return out
}
// evidenceFromResults maps retrieval hits to the Evidence contract shape the Advisor
// Evidence Workspace renders (citations[] reference evidence_id). Populated at retrieve
// time; citations[] (the [n]<->evidence coupling) come from the answer-generation step.
func evidenceFromResults(rs []ucca.LegalSearchResult) []gin.H {
out := make([]gin.H, 0, len(rs))
for _, r := range rs {
id := r.CitationUnit
if id == "" {
id = r.ArticleLabel
}
out = append(out, gin.H{
"evidence_id": id,
"document": evidenceDocName(r),
"section": r.ArticleLabel,
"paragraph": r.Paragraph,
"snippet": evidenceSnippet(r.Text, 280),
"url": r.SourceURL,
"regulation_code": r.RegulationCode,
"context": ucca.KnowledgeSpaceOf(r.RegulationCode),
})
}
return out
}
// evidenceDocName is the human-facing source name (short code, else full name).
func evidenceDocName(r ucca.LegalSearchResult) string {
if r.RegulationShort != "" {
return r.RegulationShort
}
return r.RegulationName
}
// evidenceSnippet returns s unchanged if it has at most n runes; otherwise the first n runes plus an ellipsis.
func evidenceSnippet(s string, n int) string {
rs := []rune(s)
if len(rs) <= n {
return s
}
return string(rs[:n]) + "…"
}
// ListRegulations returns the list of available regulations in the corpus.
// GET /sdk/v1/rag/regulations
func (h *RAGHandlers) ListRegulations(c *gin.Context) {
@@ -236,3 +428,29 @@ func (h *RAGHandlers) LegalCorpusStructure(c *gin.Context) {
},
})
}
// traceClarity emits a structured CLARITY_TRACE log line per retrieve for the macmini
// test session, so qualitative user ratings can be correlated with the gate decision.
func traceClarity(query string, cl ucca.Clarity, results []ucca.LegalSearchResult) {
top := make([]string, 0, 3)
for i, r := range results {
if i >= 3 {
break
}
top = append(top, r.RegulationShort)
}
chips := make([]string, 0, len(cl.CandidateContexts))
for _, c := range cl.CandidateContexts {
chips = append(chips, fmt.Sprintf("%s:%d", c.ID, c.Hits))
}
b, _ := json.Marshal(map[string]interface{}{
"query": query,
"mode": cl.Mode,
"reason": cl.Reason,
"concentration": cl.Concentration,
"dominant": cl.DominantContext,
"chips": chips,
"top_evidence": top,
})
log.Printf("CLARITY_TRACE %s", string(b))
}
@@ -159,6 +159,7 @@ func registerRAGRoutes(v1 *gin.RouterGroup, h *handlers.RAGHandlers) {
ragRoutes := v1.Group("/rag")
{
ragRoutes.POST("/search", h.Search)
ragRoutes.POST("/retrieve", h.Retrieve)
ragRoutes.GET("/regulations", h.ListRegulations)
ragRoutes.GET("/corpus-status", h.CorpusStatus)
ragRoutes.GET("/corpus-versions/:collection", h.CorpusVersionHistory)
@@ -358,7 +359,6 @@ func registerWhistleblowerRoutes(v1 *gin.RouterGroup, h *handlers.WhistleblowerH
}
}
func registerMaximizerRoutes(v1 *gin.RouterGroup, h *handlers.MaximizerHandlers) {
m := v1.Group("/maximizer")
{
@@ -0,0 +1,73 @@
package iace
// P3: pin accepted proposer decisions into the GT gate.
//
// When a human accepts a proposal from the offline proposer (a dedup
// supersession, a foreign-framing gate, a vocab→tag mapping, a coverage hazard),
// they record an AcceptedPin. A pin is a tiny, machine-scoped invariant — "this
// pattern MUST (or must NOT) fire for this machine" — that a test re-checks on
// every run. This is what makes the library's growth COMPOUND into the gate
// instead of silently eroding it: a future change that re-introduces a dropped
// duplicate, un-gates a foreign pattern, or removes a coverage hazard breaks the
// pin and fails CI.
//
// A single boolean covers all four proposal types:
// - dedup supersession accepted → DropPattern MustFire=false
// - foreign-framing gate accepted → foreign pattern MustFire=false
// - vocab→tag / coverage hazard accepted → the enabled pattern MustFire=true
// AcceptedPin is one regression invariant for an accepted proposal.
type AcceptedPin struct {
Pattern string `json:"pattern"`
MustFire bool `json:"must_fire"`
Reason string `json:"reason"`
FromProposal string `json:"from_proposal,omitempty"`
}
// PinSet is the accepted-pin registry for one machine (testdata/accepted_pins_*.json).
type PinSet struct {
Machine string `json:"machine"`
Pins []AcceptedPin `json:"pins"`
}
// PinResult is the verdict for one pin against an engine run.
type PinResult struct {
Pin AcceptedPin
OK bool
Detail string
}
// VerifyPins checks every pin against the set of pattern IDs the engine actually
// fired for the machine. A pin holds iff the pattern's presence equals MustFire.
func VerifyPins(pins []AcceptedPin, firedPatternIDs []string) []PinResult {
fired := make(map[string]bool, len(firedPatternIDs))
for _, id := range firedPatternIDs {
fired[id] = true
}
out := make([]PinResult, 0, len(pins))
for _, p := range pins {
got := fired[p.Pattern]
ok := got == p.MustFire
detail := "ok"
if !ok {
if p.MustFire {
detail = "expected to fire but did NOT — coverage/mapping regressed"
} else {
detail = "expected to be suppressed but FIRED — gate/supersession regressed"
}
}
out = append(out, PinResult{Pin: p, OK: ok, Detail: detail})
}
return out
}
// GenerateDedupPin turns an accepted (verdict=duplicate) dedup candidate into the
// pin that protects the supersession: the dropped pattern must no longer fire.
func GenerateDedupPin(c DedupCandidate) AcceptedPin {
return AcceptedPin{
Pattern: c.DropPattern,
MustFire: false,
Reason: "accepted duplicate of " + c.KeepPattern + " (" + c.Category + ")",
FromProposal: "dedup " + c.DropPattern + " -> " + c.KeepPattern,
}
}
@@ -0,0 +1,63 @@
package iace
import (
"encoding/json"
"os"
"path/filepath"
"testing"
)
func TestVerifyPins(t *testing.T) {
pins := []AcceptedPin{
{Pattern: "HPa", MustFire: true},
{Pattern: "HPb", MustFire: false},
}
res := VerifyPins(pins, []string{"HPa", "HPb"})
if !res[0].OK {
t.Errorf("HPa must_fire=true and it fired -> should be OK")
}
if res[1].OK {
t.Errorf("HPb must_fire=false but it fired -> should be VIOLATED")
}
res2 := VerifyPins(pins, []string{})
if res2[0].OK || !res2[1].OK {
t.Errorf("expected HPa violated + HPb ok, got %+v", res2)
}
}
func TestGenerateDedupPin(t *testing.T) {
pin := GenerateDedupPin(DedupCandidate{KeepPattern: "HP144", DropPattern: "HP013", Category: "electrical_hazard"})
if pin.Pattern != "HP013" || pin.MustFire {
t.Fatalf("want pin {HP013, must_fire=false}, got %+v", pin)
}
}
// TestWarewashing_AcceptedPins re-checks every accepted P1 supersession against the
// live warewashing engine output. A future change that un-suppresses HP013/016/018
// or drops HP2201/HP144 breaks a pin here, so accepted decisions accumulate instead of silently eroding.
func TestWarewashing_AcceptedPins(t *testing.T) {
raw, err := os.ReadFile(filepath.Join("testdata", "accepted_pins_warewashing.json"))
if err != nil {
t.Fatalf("read pins: %v", err)
}
var ps PinSet
if err := json.Unmarshal(raw, &ps); err != nil {
t.Fatalf("parse pins: %v", err)
}
_, _, kept := warewashingEngineOutput()
firedIDs := make([]string, 0, len(kept))
for _, pm := range kept {
firedIDs = append(firedIDs, pm.PatternID)
}
ok := 0
for _, r := range VerifyPins(ps.Pins, firedIDs) {
if r.OK {
ok++
continue
}
t.Errorf("PIN VIOLATED: %s (must_fire=%v) — %s [%s]", r.Pin.Pattern, r.Pin.MustFire, r.Detail, r.Pin.Reason)
}
t.Logf("accepted pins for %q: %d/%d hold", ps.Machine, ok, len(ps.Pins))
}
@@ -0,0 +1,10 @@
{
"machine": "Gewerbliche Untertisch-Geschirrspuelmaschine (vernetzt)",
"pins": [
{"pattern": "HP016", "must_fire": false, "reason": "generic hot-surface (Formwerkzeuge/Auspuffleitung framing) superseded by HP2201", "from_proposal": "P1 thermal supersession"},
{"pattern": "HP018", "must_fire": false, "reason": "actuator-burn superseded by HP2201", "from_proposal": "P1 thermal supersession"},
{"pattern": "HP013", "must_fire": false, "reason": "stored-energy Batterie/USV framing superseded by HP144", "from_proposal": "P1 stored-energy supersession"},
{"pattern": "HP2201", "must_fire": true, "reason": "warewashing hot-surface (Boiler/Tank/Spuelkammer) must remain — it is the clean equivalent that replaces HP016/HP018", "from_proposal": "P1 thermal supersession"},
{"pattern": "HP144", "must_fire": true, "reason": "residual-voltage (Frequenzumrichter/Zwischenkreis) must remain — clean equivalent that replaces HP013", "from_proposal": "P1 stored-energy supersession"}
]
}
@@ -0,0 +1,33 @@
package ucca
import (
"strings"
"testing"
)
func TestResolveAbbreviations(t *testing.T) {
// unambiguous -> expanded, not flagged
tr := ResolveAbbreviations("Was ist eine TOM?")
if !strings.Contains(tr.Expanded, "technische und organisatorische") {
t.Errorf("TOM must be expanded, got %q", tr.Expanded)
}
if len(tr.Ambiguous) != 0 {
t.Errorf("TOM must not be ambiguous, got %v", tr.Ambiguous)
}
// ambiguous DSE -> flagged, NOT auto-expanded (chat context must win, else FE asks)
tr2 := ResolveAbbreviations("welche Infos in eine DSE?")
if tr2.Expanded != "welche Infos in eine DSE?" {
t.Errorf("DSE must NOT be auto-mapped, got %q", tr2.Expanded)
}
if len(tr2.Ambiguous) != 1 || tr2.Ambiguous[0].Abbreviation != "DSE" || len(tr2.Ambiguous[0].Candidates) != 2 {
t.Errorf("DSE must be flagged ambiguous with 2 candidates, got %v", tr2.Ambiguous)
}
// no abbreviation -> unchanged
if ResolveAbbreviations("Wie ist das Wetter?").Expanded != "Wie ist das Wetter?" {
t.Errorf("query without abbreviation must be unchanged")
}
// substring must NOT match ("atom" contains "tom" but is not the word TOM)
if strings.Contains(ResolveAbbreviations("Was ist ein Atom?").Expanded, "organisatorische") {
t.Errorf("substring 'tom' in 'Atom' must not trigger expansion")
}
}
@@ -0,0 +1,65 @@
package ucca
import (
"strings"
"unicode"
)
// TermResolution is the E2 (Term Resolution) signal in the Advisor Reasoning Stack.
// Expanded drives retrieval internally (unambiguous abbreviations are spelled out so
// the embedding/concept layer finds them). Ambiguous is surfaced to the FE, which
// resolves it via chat context (E1) or asks the user ("Meinst du X oder Y?"). The
// lexicon NEVER auto-maps an ambiguous abbreviation (e.g. DSE): it is flagged and left unexpanded.
type TermResolution struct {
Expanded string `json:"-"`
Ambiguous []TermAmbiguity `json:"ambiguous,omitempty"`
}
// TermAmbiguity flags one abbreviation the SDK could not resolve deterministically.
type TermAmbiguity struct {
Abbreviation string `json:"abbreviation"`
Candidates []string `json:"candidates"`
}
// abbreviationLexicon maps a (lowercased) abbreviation to its canonical term(s).
// >1 candidate = ambiguous → flagged, not expanded. The lexicon is deliberately small to start with (per user spec).
var abbreviationLexicon = map[string][]string{
"dse": {"Datenschutzerklärung", "Datenschutz-Folgenabschätzung"}, // ambiguous — context wins, else ask
"dsfa": {"Datenschutz-Folgenabschätzung"},
"tom": {"technische und organisatorische Maßnahmen"},
"vvt": {"Verzeichnis von Verarbeitungstätigkeiten"},
"avv": {"Auftragsverarbeitungsvertrag"},
"dsb": {"Datenschutzbeauftragter"},
"dpa": {"Data Processing Agreement", "Datenschutzaufsichtsbehörde"}, // ambiguous
}
// ResolveAbbreviations expands unambiguous abbreviations into the query and flags
// ambiguous ones. Deterministic: iterates query tokens in order (no map-order
// dependence). Whole-word match (case-insensitive) so "TOM" hits but "atom" does not.
func ResolveAbbreviations(query string) TermResolution {
tr := TermResolution{Expanded: query}
words := strings.FieldsFunc(query, func(r rune) bool {
return !unicode.IsLetter(r) && !unicode.IsNumber(r)
})
seen := map[string]bool{}
var expansions []string
for _, w := range words {
lw := strings.ToLower(w)
cands, ok := abbreviationLexicon[lw]
if !ok || seen[lw] {
continue
}
seen[lw] = true
if len(cands) == 1 {
expansions = append(expansions, cands[0])
} else {
tr.Ambiguous = append(tr.Ambiguous, TermAmbiguity{
Abbreviation: strings.ToUpper(lw), Candidates: cands,
})
}
}
if len(expansions) > 0 {
tr.Expanded = query + " " + strings.Join(expansions, " ")
}
return tr
}
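The whole-word behaviour of `ResolveAbbreviations` follows from how the query is tokenized. The sketch below isolates that step; `tokens` and `hasWord` are hypothetical helper names used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokens splits a query the same way ResolveAbbreviations does: every rune
// that is neither a letter nor a number acts as a separator. Tokens are
// lowercased so they can be compared against lowercased lexicon keys.
func tokens(query string) []string {
	fields := strings.FieldsFunc(query, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
	for i, f := range fields {
		fields[i] = strings.ToLower(f)
	}
	return fields
}

// hasWord reports whether abbr occurs as a whole token of the query.
func hasWord(query, abbr string) bool {
	for _, t := range tokens(query) {
		if t == abbr {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(tokens("Was ist eine TOM?"))
	fmt.Println(hasWord("Was ist eine TOM?", "tom"))
	fmt.Println(hasWord("Was ist ein Atom?", "tom"))
}
```

Because the comparison is token equality rather than a substring search, "Atom" never reaches the lexicon entry for "tom".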
@@ -0,0 +1,133 @@
package ucca
import (
"context"
"os"
"sort"
"strings"
"sync"
)
// routerBaseCollections is the broad authority base the Authority Router fans out over. It mirrors
// the Advisor's historical multi-collection set; the KB-2026.1 slice is added separately when the
// query is in scope. Override via RAG_ROUTER_COLLECTIONS (comma-separated) per environment.
func (c *LegalRAGClient) routerBaseCollections() []string {
if v := strings.TrimSpace(os.Getenv("RAG_ROUTER_COLLECTIONS")); v != "" {
var out []string
for _, p := range strings.Split(v, ",") {
if s := strings.TrimSpace(p); s != "" {
out = append(out, s)
}
}
if len(out) > 0 {
return out
}
}
return []string{
"bp_compliance_gesetze",
"bp_compliance_ce",
"bp_compliance_datenschutz",
"bp_dsfa_corpus",
"bp_compliance_recht",
"bp_legal_templates",
}
}
const routerPerCollectionTopK = 3
// Retrieve is the Authority Router entry point: callers (the Advisor) pass ONLY a query and stay
// collection-agnostic. The router fans out over the broad authority base and ADDS the KB-2026.1
// slice when the query is in scope (inKBScope), then merges all hits, deduplicates, and returns the
// top-K by authority score. This moves the former Advisor-side collection fan-out into the retrieval
// layer (the "Retriever" tier of the quality pyramid), so the proven KB-2026.1 slice gain reaches
// the product path without the Advisor knowing about individual collections.
//
// The merged set is ordered by the per-collection authority score that rerankByAuthority already
// produced inside searchInternal — i.e. binding-vs-guidance ordering is preserved across the merge.
// Per-collection failures (e.g. a collection absent on an environment) degrade gracefully.
func (c *LegalRAGClient) Retrieve(ctx context.Context, query string, topK int) ([]LegalSearchResult, error) {
if topK <= 0 {
topK = 8
}
collections := c.routerBaseCollections()
if c.kbScopeRoutingEnabled && c.kbSliceCollection != "" && inKBScope(query) {
collections = append(collections, c.kbSliceCollection)
}
// Cross-regulation queries (>=2 explicitly named regulations) get a larger per-collection budget
// so each collection's multi-regulation search isn't truncated down to the keyword-dominant
// domain; the final per-regulation balancing then guarantees every named domain in the top-K.
regs := detectRegulations(query)
perColl := routerPerCollectionTopK
if len(regs) >= 2 {
perColl = routerPerCollectionTopK * len(regs)
}
// Warm the full-text indexes sequentially first so the concurrent fan-out below only READS the
// shared textIndexEnsured map (the writes happen here, serialized) — closes the cold-start map
// race deterministically. Best-effort: a missing collection just stays un-indexed (hybrid then
// falls back to dense, or the per-collection search degrades to nothing).
if c.hybridEnabled {
for _, coll := range collections {
_ = c.ensureTextIndex(ctx, coll)
}
}
// Embed the query ONCE and stash it in ctx so the concurrent per-collection searches
// below reuse it instead of each re-embedding (was N remote round-trips on dev/OVH).
ctx = c.withQueryEmbedding(ctx, query)
out := make([][]LegalSearchResult, len(collections))
var wg sync.WaitGroup
for i, coll := range collections {
wg.Add(1)
go func(i int, coll string) {
defer wg.Done()
if res, err := c.searchInternal(ctx, coll, query, nil, perColl); err == nil {
out[i] = res
}
}(i, coll)
}
wg.Wait()
merged := make([]LegalSearchResult, 0, len(collections)*perColl)
for _, r := range out {
merged = append(merged, r...)
}
merged = dedupResults(merged)
sort.SliceStable(merged, func(a, b int) bool { return merged[a].Score > merged[b].Score })
// Cross-regulation: guarantee every named domain is represented (0070-class fix) instead of
// letting a global score-sort starve the non-dominant domain.
if len(regs) >= 2 {
return balanceByRegulation(merged, regs, topK), nil
}
if len(merged) > topK {
merged = merged[:topK]
}
return merged, nil
}
// dedupResults removes duplicate passages that can appear when collections overlap, keeping the
// highest-scoring occurrence. Identity = regulation_code + article_label + a text prefix.
func dedupResults(in []LegalSearchResult) []LegalSearchResult {
pos := make(map[string]int, len(in))
out := make([]LegalSearchResult, 0, len(in))
for _, r := range in {
text := r.Text
if len(text) > 80 {
text = text[:80]
}
key := r.RegulationCode + "|" + r.ArticleLabel + "|" + text
if idx, ok := pos[key]; ok {
if r.Score > out[idx].Score {
out[idx] = r
}
continue
}
pos[key] = len(out)
out = append(out, r)
}
return out
}
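The fan-out in `Retrieve` relies on each goroutine writing only to its own slice slot. The following is a reduced sketch of that pattern under simplified assumptions: `hit`, `fanOut`, and `demoSearch` are hypothetical names, and the per-collection search is replaced by a stub.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// hit is a hypothetical, reduced stand-in for LegalSearchResult.
type hit struct {
	Source string
	Score  float64
}

// fanOut runs search once per source concurrently. Each goroutine writes only
// to its own slot out[i], so no mutex is needed; a failed source leaves its
// slot nil and contributes nothing to the merge.
func fanOut(sources []string, search func(string) ([]hit, error)) []hit {
	out := make([][]hit, len(sources))
	var wg sync.WaitGroup
	for i, s := range sources {
		wg.Add(1)
		go func(i int, s string) {
			defer wg.Done()
			if res, err := search(s); err == nil {
				out[i] = res
			}
		}(i, s)
	}
	wg.Wait()

	var merged []hit
	for _, r := range out {
		merged = append(merged, r...)
	}
	sort.SliceStable(merged, func(a, b int) bool { return merged[a].Score > merged[b].Score })
	return merged
}

// demoSearch is a stub backend: the source named "missing" fails, every other
// source returns one hit whose score is derived from the name length.
func demoSearch(s string) ([]hit, error) {
	if s == "missing" {
		return nil, fmt.Errorf("collection %q not found", s)
	}
	return []hit{{Source: s, Score: float64(len(s)) / 10}}, nil
}

func main() {
	fmt.Println(fanOut([]string{"gesetze", "missing", "ce"}, demoSearch))
}
```

The failing source is skipped without aborting the request, which mirrors the graceful degradation described in the `Retrieve` comment.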
@@ -0,0 +1,164 @@
package ucca
import (
"context"
"encoding/json"
"os"
"strconv"
"strings"
"testing"
)
type benchQ struct {
ID string `json:"id"`
Document string `json:"document"`
Question string `json:"question"`
}
// docTokens maps a bench question's expected document to acceptable regulation_code/label substrings.
func docTokens(document string) []string {
d := strings.ToUpper(document)
var t []string
for _, wp := range []string{"WP243", "WP248", "WP260"} {
if strings.Contains(d, wp) {
t = append(t, wp)
}
}
dns := strings.ReplaceAll(d, " ", "")
for _, gl := range []struct{ key, tok string }{{"07/2020", "GL07"}, {"05/2020", "GL05"}, {"09/2022", "GL09"}} {
if strings.Contains(dns, gl.key) {
t = append(t, gl.tok)
}
}
if strings.Contains(d, "TDDDG") {
t = append(t, "TDDDG")
}
if strings.Contains(d, "DSGVO") || strings.Contains(d, "ART. 13") || strings.Contains(d, "ART. 14") {
t = append(t, "DSGVO")
}
if strings.Contains(d, "BDSG") {
t = append(t, "BDSG")
}
if strings.Contains(d, "CRA") {
t = append(t, "CRA")
}
if strings.Contains(d, "MASCH") {
t = append(t, "MASCH", "MACHINERY", "MVO")
}
return t
}
func hitDoc(results []LegalSearchResult, toks []string) bool {
for _, r := range results {
s := strings.ReplaceAll(strings.ToUpper(r.RegulationCode+" "+r.ArticleLabel), " ", "")
for _, tk := range toks {
if strings.Contains(s, strings.ReplaceAll(tk, " ", "")) {
return true
}
}
}
return false
}
// TestMultiReg0070E2E (RUN_E2E=1) is the 0070 regression: a cross-regulation query (CRA + MaschVO)
// must return BOTH domains through the real Retrieve(), not just the keyword-dominant CRA.
func TestMultiReg0070E2E(t *testing.T) {
if os.Getenv("RUN_E2E") != "1" {
t.Skip("set RUN_E2E=1 + QDRANT_URL/OLLAMA_URL/QDRANT_API_KEY")
}
c := NewLegalRAGClient()
q := "Wie greifen CRA und Maschinenverordnung bei einer vernetzten Maschine ineinander?"
res, err := c.Retrieve(context.Background(), q, 8)
if err != nil {
t.Fatalf("retrieve: %v", err)
}
var hasCRA, hasMasch bool
var codes []string
for _, r := range res {
u := strings.ToUpper(r.RegulationCode)
codes = append(codes, u)
if strings.Contains(u, "CRA") {
hasCRA = true
}
if strings.Contains(u, "MASCH") || strings.Contains(u, "MACHIN") || u == "MVO" {
hasMasch = true
}
}
t.Logf("0070 top-8 codes: %v", codes)
if !hasCRA || !hasMasch {
t.Errorf("0070 must return BOTH domains via Retrieve(): CRA=%v MaschVO=%v", hasCRA, hasMasch)
}
}
// TestAuthorityRouterCB100 (RUN_E2E=1) drives the REAL Retrieve() over the ComplianceBench-100 against
// the live collections: NEW (scope routing on → slice added for in-scope queries) vs OLD (routing off
// → broad base only). It is the regression gate that the router actually delivers the proven slice
// gain (+28/0-regr in the offline simulation) through the production Go code path.
func TestAuthorityRouterCB100(t *testing.T) {
if os.Getenv("RUN_E2E") != "1" {
t.Skip("set RUN_E2E=1 + QDRANT_URL/OLLAMA_URL/QDRANT_API_KEY + BENCH_PATH")
}
path := os.Getenv("BENCH_PATH")
if path == "" {
path = "/tmp/compliance_bench.json"
}
raw, err := os.ReadFile(path)
if err != nil {
t.Fatalf("bench read: %v", err)
}
var doc struct {
Questions []benchQ `json:"questions"`
}
if err := json.Unmarshal(raw, &doc); err != nil {
t.Fatalf("bench parse: %v", err)
}
// BENCH_STRIDE samples every Kth question (stratified across DS/CRA/MaschVO) so the gate stays
// tractable against the remote dev Qdrant; default 1 = full CB-100.
stride := 1
if s := os.Getenv("BENCH_STRIDE"); s != "" {
if n, err := strconv.Atoi(s); err == nil && n > 0 {
stride = n
}
}
c := NewLegalRAGClient()
ctx := context.Background()
var n, oldHit, newHit, gain, regr int
for i, q := range doc.Questions {
if i%stride != 0 {
continue
}
n++
toks := docTokens(q.Document)
c.kbScopeRoutingEnabled = false
oldRes, _ := c.Retrieve(ctx, q.Question, 8)
c.kbScopeRoutingEnabled = true
newRes, _ := c.Retrieve(ctx, q.Question, 8)
oh, nh := hitDoc(oldRes, toks), hitDoc(newRes, toks)
if oh {
oldHit++
}
if nh {
newHit++
}
flip := "="
switch {
case !oh && nh:
gain++
flip = "GAIN"
case oh && !nh:
regr++
flip = "REGR"
}
t.Logf("%-9s [%-14s] OLD=%-5v NEW=%-5v %s", q.ID, q.Document, oh, nh, flip)
}
t.Logf("CB-100 sample (stride=%d) via Retrieve(): N=%d | OLD-hit %d | NEW-hit %d | GAIN %d | REGR %d",
stride, n, oldHit, newHit, gain, regr)
if newHit <= oldHit || gain < 3 {
t.Errorf("router must add slice gains: NEW(%d) must exceed OLD(%d), gain=%d", newHit, oldHit, gain)
}
if regr > 2 {
t.Errorf("too many regressions through the router: %d", regr)
}
}
@@ -0,0 +1,99 @@
package ucca
import "testing"
func TestRouterBaseCollections(t *testing.T) {
c := &LegalRAGClient{}
// An empty value is treated like an unset variable, so this selects the defaults.
// t.Setenv restores the previous value when the test ends.
t.Setenv("RAG_ROUTER_COLLECTIONS", "")
def := c.routerBaseCollections()
if len(def) != 6 || def[1] != "bp_compliance_ce" {
t.Fatalf("default base collections unexpected: %v", def)
}
t.Setenv("RAG_ROUTER_COLLECTIONS", " bp_compliance_ce , kb_2026_1_build ,, ")
got := c.routerBaseCollections()
if len(got) != 2 || got[0] != "bp_compliance_ce" || got[1] != "kb_2026_1_build" {
t.Fatalf("env override parse failed (trim/empty): %v", got)
}
}
func TestRouterSliceSelection(t *testing.T) {
// The router appends the slice exactly when the query is in scope (inKBScope) and routing is on.
// Mirror the selection logic so a regression in either is caught without a live Qdrant.
c := &LegalRAGClient{kbSliceCollection: "kb_2026_1_build", kbScopeRoutingEnabled: true}
sel := func(q string) bool {
colls := c.routerBaseCollections()
if c.kbScopeRoutingEnabled && c.kbSliceCollection != "" && inKBScope(q) {
colls = append(colls, c.kbSliceCollection)
}
for _, x := range colls {
if x == c.kbSliceCollection {
return true
}
}
return false
}
if !sel("Welche neun Kriterien nennt WP248 fuer ein hohes Risiko?") {
t.Error("in-scope guidance query must include the slice")
}
if sel("Was sagt NIST SP 800-53 zu Access Control?") {
t.Error("out-of-scope query must NOT include the slice")
}
c.kbScopeRoutingEnabled = false
if sel("Welche Kriterien nennt WP248?") {
t.Error("routing disabled => slice never included")
}
}
func TestBalanceByRegulation(t *testing.T) {
regs := []detectedRegulation{
{Canonical: "CRA", CodeValues: []string{"CRA"}},
{Canonical: "MaschVO", CodeValues: []string{"MASCHVO", "MVO", "MACHINERY"}},
}
// CRA dominates by score; without balancing the top-4 would be all CRA + NIST.
pool := []LegalSearchResult{
{RegulationCode: "CRA", Score: 0.99},
{RegulationCode: "CRA", Score: 0.98},
{RegulationCode: "CRA", Score: 0.97},
{RegulationCode: "NIST", Score: 0.96},
{RegulationCode: "MACHINERY", Score: 0.70},
{RegulationCode: "MVO", Score: 0.65},
}
out := balanceByRegulation(pool, regs, 4)
var hasCRA, hasMasch bool
for _, r := range out {
switch r.RegulationCode {
case "CRA":
hasCRA = true
case "MACHINERY", "MVO":
hasMasch = true
}
}
if !hasCRA || !hasMasch {
t.Errorf("both named domains must be represented: CRA=%v MaschVO=%v out=%v", hasCRA, hasMasch, out)
}
if out[0].RegulationCode != "CRA" || !(out[1].RegulationCode == "MACHINERY" || out[1].RegulationCode == "MVO") {
t.Errorf("round-robin should alternate domains, got %s then %s", out[0].RegulationCode, out[1].RegulationCode)
}
}
func TestDedupResults(t *testing.T) {
in := []LegalSearchResult{
{RegulationCode: "EDPB WP248", ArticleLabel: "III.B", Text: "lorem", Score: 0.7},
{RegulationCode: "EDPB WP248", ArticleLabel: "III.B", Text: "lorem", Score: 0.9}, // dup, higher score
{RegulationCode: "DSGVO", ArticleLabel: "Art. 35", Text: "ipsum", Score: 0.8},
}
out := dedupResults(in)
if len(out) != 2 {
t.Fatalf("expected 2 deduped, got %d", len(out))
}
for _, r := range out {
if r.RegulationCode == "EDPB WP248" && r.Score != 0.9 {
t.Errorf("dedup must keep highest score, got %v", r.Score)
}
}
}
@@ -0,0 +1,135 @@
package ucca
import (
"sort"
"strings"
)
// Clarity is the READ-ONLY, INSTRUMENTED clarity-gate signal emitted alongside a
// retrieve response. It does NOT change retrieval or advisor behaviour yet — the
// advisor still answers normally. Once ~30-50 real questions are collected the
// thresholds get finalised and the gate is activated in the advisor flow.
//
// Ambiguity has two independent sources (empirically measured, 12-question set):
// - retrieval scatter: hits spread across many knowledge spaces (low
// concentration / high domain_count) — the retriever itself can't localise.
// - conceptual generality: a general term the corpus OVER-localises (e.g. "PDCA"
// concentrates on datenschutz but is cross-domain) — only an LLM knows this.
// The middle band is where the LLM-intent classifier must decide.
//
// G1 (explicit scope): when the query NAMES a regulation ("... nach TRGS", "CRA
// ...", "MaschinenVO ..."), that explicit context beats the embedding scatter —
// the gate scopes to the named regulation's knowledge space regardless of
// concentration. This is regulation detection, NOT a broad-term list.
type Clarity struct {
Mode string `json:"mode"` // "answer" | "clarify"
Reason string `json:"reason"` // low_concentration | many_domains | high_confidence_scope | middle_band_llm_needed | explicit_scope | no_domain_signal
Concentration float64 `json:"concentration"` // fraction of tagged hits in the dominant knowledge space
DomainCount int `json:"domain_count"` // distinct knowledge spaces in the hits
DominantContext string `json:"dominant_context"` // knowledge-space id (explicit scope wins if the query names a regulation)
CandidateContexts []ClarityContext `json:"candidate_contexts"` // corpus-grounded chips (spaces actually present)
}
// ClarityContext is one corpus-grounded context chip.
type ClarityContext struct {
ID string `json:"id"`
Label string `json:"label"`
Hits int `json:"hits"`
}
// Tiered thresholds — INSTRUMENTED DEFAULTS, calibrate on 30-50 real questions.
const (
clarityMaxConcentration = 0.45 // <= this => clarify (retrieval scatter)
clarityMinDomains = 4 // >= this => clarify (broad spread)
clarityAnswerConc = 0.75 // >= this => answer (confident scope)
)
// QueryKnowledgeSpace detects an EXPLICIT regulation mention in the query and maps
// it to a knowledge space. Regulation detection (authority), not a broad-term list:
// only fires when the user names a concrete regulation. Returns "" if none is named.
func QueryKnowledgeSpace(query string) string {
q := " " + strings.ToLower(query) + " "
has := func(subs ...string) bool {
for _, s := range subs {
if strings.Contains(q, s) {
return true
}
}
return false
}
switch {
case has("trgs", "trbs", " asr ", "gefahrstoff", "arbeitsplatzgrenzwert", "arbeitsschutz"):
return "arbeitsschutz"
case has("dsgvo", "gdpr", "bdsg", "tdddg", "ttdsg", " dsk ", "edpb", "datenschutz", " dsfa "):
return "datenschutz"
case has(" cra ", "cyber resilience", "nis2", "nis-2", " dora ", "enisa", "bsig", "kritis"):
return "cyber"
case has("ai act", "ki-vo", "ki-verordnung", "ki-system"):
return "ki"
case has("maschinenverordnung", "maschinenvo", "maschvo", "maschinenrichtlinie", " gpsr ", "produktsicherheit"):
return "produktsicherheit"
case has(" mdr ", "medizinprodukt", "medical device"):
return "produktsicherheit"
default:
return ""
}
}
// ClassifyClarity computes the read-only clarity signal. Deterministic tiers on the
// knowledge-space concentration, PLUS the G1 explicit-scope override: if the query
// names a regulation, that scope wins over the embedding scatter.
func ClassifyClarity(query string, results []LegalSearchResult) Clarity {
counts := map[string]int{}
total := 0
for _, r := range results {
if s := KnowledgeSpaceOf(r.RegulationCode); s != "" {
counts[s]++
total++
}
}
cl := Clarity{Mode: "answer", Reason: "high_confidence_scope", CandidateContexts: []ClarityContext{}}
if total == 0 {
cl.Mode, cl.Reason = "clarify", "no_domain_signal"
if ks := QueryKnowledgeSpace(query); ks != "" {
cl.Mode, cl.Reason, cl.DominantContext = "answer", "explicit_scope", ks
}
return cl
}
type kc struct {
id string
n int
}
ks := make([]kc, 0, len(counts))
for id, n := range counts {
ks = append(ks, kc{id, n})
}
sort.Slice(ks, func(i, j int) bool {
if ks[i].n != ks[j].n {
return ks[i].n > ks[j].n
}
return ks[i].id < ks[j].id
})
cl.DominantContext = ks[0].id
cl.Concentration = float64(ks[0].n) / float64(total)
cl.DomainCount = len(counts)
for _, k := range ks {
cl.CandidateContexts = append(cl.CandidateContexts, ClarityContext{
ID: k.id, Label: KnowledgeSpaceLabel[k.id], Hits: k.n,
})
}
switch {
case cl.Concentration <= clarityMaxConcentration:
cl.Mode, cl.Reason = "clarify", "low_concentration"
case cl.DomainCount >= clarityMinDomains:
cl.Mode, cl.Reason = "clarify", "many_domains"
case cl.Concentration >= clarityAnswerConc:
cl.Mode, cl.Reason = "answer", "high_confidence_scope"
default:
cl.Mode, cl.Reason = "answer", "middle_band_llm_needed"
}
// G1: an explicitly named regulation beats the embedding scatter.
if q := QueryKnowledgeSpace(query); q != "" {
cl.Mode, cl.Reason, cl.DominantContext = "answer", "explicit_scope", q
}
return cl
}
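The tier decision in `ClassifyClarity` can be checked by hand. The sketch below isolates the threshold switch; `tier` is a hypothetical helper, the constants are copied from the instrumented defaults above, and the G1 explicit-scope override is deliberately left out.

```go
package main

import "fmt"

// Thresholds copied from the instrumented defaults in the clarity gate.
const (
	maxConcentration = 0.45 // <= this => clarify
	minDomains       = 4    // >= this => clarify
	answerConc       = 0.75 // >= this => answer
)

// tier is a hypothetical helper that applies the same switch as
// ClassifyClarity to a map of knowledge space -> hit count.
// The explicit-scope override (G1) is not modelled here.
func tier(counts map[string]int) (mode, reason string, concentration float64) {
	total, top := 0, 0
	for _, n := range counts {
		total += n
		if n > top {
			top = n
		}
	}
	if total == 0 {
		return "clarify", "no_domain_signal", 0
	}
	concentration = float64(top) / float64(total)
	switch {
	case concentration <= maxConcentration:
		return "clarify", "low_concentration", concentration
	case len(counts) >= minDomains:
		return "clarify", "many_domains", concentration
	case concentration >= answerConc:
		return "answer", "high_confidence_scope", concentration
	default:
		return "answer", "middle_band_llm_needed", concentration
	}
}

func main() {
	// Five hits in one space, two in another: 5/7 is about 0.71 with two spaces.
	mode, reason, conc := tier(map[string]int{"datenschutz": 5, "arbeitsschutz": 2})
	fmt.Printf("%s %s %.2f\n", mode, reason, conc)
}
```

With five of seven hits in one space the concentration lies between 0.45 and 0.75 and only two spaces are present, so the result falls into the middle band where the LLM-intent classifier is expected to decide.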
@@ -0,0 +1,64 @@
package ucca
import "testing"
func TestKnowledgeSpaceOf(t *testing.T) {
cases := map[string]string{
"DSGVO": "datenschutz",
"BDSG": "datenschutz",
"DSK SDM B51 ZUGRIFFE": "datenschutz",
"EDPS DIGITAL ETHICS": "datenschutz",
"TRGS 900": "arbeitsschutz",
"OSHA 1910 SUBPART O": "arbeitsschutz",
"HGB": "wirtschaftsrecht",
"BGB": "wirtschaftsrecht",
"MASCHINENVO": "produktsicherheit",
"MVO": "produktsicherheit",
"CRA": "cyber",
"NIST SP800 53R5": "cyber",
"AI ACT": "ki",
"KI-VO": "ki",
"DORA": "finanz",
"ARG": "arbeitsrecht",
"": "",
}
for code, want := range cases {
if got := KnowledgeSpaceOf(code); got != want {
t.Errorf("KnowledgeSpaceOf(%q)=%q want %q", code, got, want)
}
}
}
func TestClassifyClarity(t *testing.T) {
scattered := []LegalSearchResult{
{RegulationCode: "CRA"}, {RegulationCode: "MASCHINENVO"}, {RegulationCode: "EU MDR"},
{RegulationCode: "KI-VO"}, {RegulationCode: "TRBS 1111"}, {RegulationCode: "OWASP TOP10"},
}
if c := ClassifyClarity("Welche Risiken gibt es?", scattered); c.Mode != "clarify" {
t.Errorf("scattered: mode=%q reason=%q want clarify", c.Mode, c.Reason)
}
concentrated := []LegalSearchResult{
{RegulationCode: "DSGVO"}, {RegulationCode: "BDSG"}, {RegulationCode: "DSK SDM"},
{RegulationCode: "EDPB WP243"}, {RegulationCode: "TDDDG"},
}
c := ClassifyClarity("Was ist eine DSFA?", concentrated)
if c.Mode != "answer" || c.DominantContext != "datenschutz" {
t.Errorf("concentrated: mode=%q dominant=%q want answer/datenschutz", c.Mode, c.DominantContext)
}
}
func TestClassifyClarity_ExplicitScope(t *testing.T) {
// G1: query names TRGS -> arbeitsschutz wins even though retrieval scatters to datenschutz.
scattered := []LegalSearchResult{
{RegulationCode: "DSK SDM METHODE"}, {RegulationCode: "DSK SDM V31"}, {RegulationCode: "DSK SDM B41 PLANEN"},
{RegulationCode: "DSGVO"}, {RegulationCode: "DSK SDM"}, {RegulationCode: "TRGS 900"}, {RegulationCode: "TRGS 554"},
}
c := ClassifyClarity("Schwellwertanalyse nach TRGS", scattered)
if c.Mode != "answer" || c.Reason != "explicit_scope" || c.DominantContext != "arbeitsschutz" {
t.Errorf("explicit TRGS: mode=%q reason=%q dominant=%q want answer/explicit_scope/arbeitsschutz", c.Mode, c.Reason, c.DominantContext)
}
// no regulation named -> falls through to tiered logic
if c := ClassifyClarity("Welche Risiken gibt es?", scattered); c.Reason == "explicit_scope" {
t.Errorf("no reg named should not be explicit_scope, got %q", c.Reason)
}
}
@@ -0,0 +1,88 @@
package ucca
import (
"bytes"
"context"
"encoding/json"
"fmt"
"net/http"
)
// FetchByNormIDs loads one representative unit per norm_id from the KB slice
// collection — the fetch side of the Concept->Norm recall injector. Returns
// LegalSearchResult with the caller-provided concept-relevance score (there is no
// similarity query; the injector places them by that score). Returns nil on any
// error or when no KB slice is configured (graceful degradation).
func (c *LegalRAGClient) FetchByNormIDs(ctx context.Context, normIDs []string, score float64) []LegalSearchResult {
if c.kbSliceCollection == "" || len(normIDs) == 0 {
return nil
}
should := make([]map[string]interface{}, 0, len(normIDs))
for _, nid := range normIDs {
should = append(should, map[string]interface{}{"key": "norm_id", "match": map[string]interface{}{"value": nid}})
}
reqBody := map[string]interface{}{
"limit": len(normIDs) * 3,
"with_payload": true,
"with_vectors": false,
"filter": map[string]interface{}{"should": should},
}
jsonBody, err := json.Marshal(reqBody)
if err != nil {
return nil
}
url := fmt.Sprintf("%s/collections/%s/points/scroll", c.qdrantURL, c.kbSliceCollection)
req, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(jsonBody))
if err != nil {
return nil
}
req.Header.Set("Content-Type", "application/json")
if c.qdrantAPIKey != "" {
req.Header.Set("api-key", c.qdrantAPIKey)
}
resp, err := c.httpClient.Do(req)
if err != nil {
return nil
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
return nil
}
var scrollResp qdrantScrollResponse
if err := json.NewDecoder(resp.Body).Decode(&scrollResp); err != nil {
return nil
}
seen := map[string]bool{}
out := make([]LegalSearchResult, 0, len(normIDs))
for _, pt := range scrollResp.Result.Points {
nid := getString(pt.Payload, "norm_id")
if nid == "" || seen[nid] {
continue
}
seen[nid] = true
out = append(out, scrollPointToResult(pt.Payload, score))
}
return out
}
// scrollPointToResult maps a scroll-point payload to a LegalSearchResult. Mirrors
// hitsToResults' payload keys; the score is assigned by the caller (concept rank).
func scrollPointToResult(payload map[string]interface{}, score float64) LegalSearchResult {
regCode := getString(payload, "regulation_code")
if regCode == "" {
regCode = getString(payload, "regulation_id")
}
return LegalSearchResult{
Text: getString(payload, "chunk_text"),
RegulationCode: regCode,
RegulationName: getString(payload, "regulation_name_de"),
RegulationShort: getString(payload, "regulation_short"),
Category: getString(payload, "category"),
Article: getString(payload, "article"),
ArticleLabel: getString(payload, "article_label"),
Paragraph: getString(payload, "paragraph"),
SourceURL: getString(payload, "source_url"),
CitationUnit: getString(payload, "citation_unit"),
Score: score,
}
}
@@ -0,0 +1,97 @@
package ucca
import (
"sort"
"strings"
)
// Legal Concept Ontology: the curated domain-knowledge (IP) bridge for the Concept->Norm recall
// injector. The words users type ("Datenschutzerklärung", "Cookie Banner") are
// rarely identical to the article titles that actually govern them (Art. 12/13/14
// DSGVO, § 25 TDDDG). Embedding similarity misses this leap, so these bridges are
// curated: concept keyword -> load-bearing norm_ids. This is NOT a fallback to
// hardcoding — it is domain knowledge that surfaces the normatively load-bearing
// units within the (already correctly retrieved) documents.
type conceptNorm struct {
keywords []string
normIDs []string
}
var legalConceptOntology = []conceptNorm{
{[]string{"datenschutzerklärung", "datenschutzerklaerung", "privacy policy", "datenschutzhinweise", "datenschutzinformation"},
[]string{"EU-DSGVO-Art12", "EU-DSGVO-Art13", "EU-DSGVO-Art14"}},
{[]string{"cookie banner", "cookie-banner", "cookies", "cookie", "tracking"},
[]string{"DE-TDDDG-§25", "EU-DSGVO-Art6", "EU-DSGVO-Art7"}},
{[]string{"dsfa", "folgenabschätzung", "folgenabschaetzung", "datenschutz-folgenabschätzung"},
[]string{"EU-DSGVO-Art35", "EU-DSGVO-Art36"}},
{[]string{"auskunft", "auskunftsrecht", "auskunftsersuchen"},
[]string{"EU-DSGVO-Art15"}},
{[]string{"löschung", "loeschung", "vergessenwerden", "recht auf vergessen"},
[]string{"EU-DSGVO-Art17"}},
{[]string{"datenübertragbarkeit", "datenuebertragbarkeit", "portabilität", "portabilitaet"},
[]string{"EU-DSGVO-Art20"}},
{[]string{"widerspruch", "widerspruchsrecht"},
[]string{"EU-DSGVO-Art21"}},
{[]string{"datenpanne", "datenschutzverletzung", "data breach", "verletzung des schutzes"},
[]string{"EU-DSGVO-Art33", "EU-DSGVO-Art34"}},
// E4-Quick-Curation (2026-07-01): resolved abbreviations (E2) pull their core norms.
{[]string{"technische und organisatorische maßnahmen", "technische und organisatorische massnahmen"},
[]string{"EU-DSGVO-Art32", "EU-DSGVO-Art25", "EU-DSGVO-Art5"}},
{[]string{"verzeichnis von verarbeitungstätigkeiten", "verzeichnis von verarbeitungstaetigkeiten", "verarbeitungsverzeichnis"},
[]string{"EU-DSGVO-Art30"}},
{[]string{"auftragsverarbeitungsvertrag", "auftragsverarbeitung", "auftragsverarbeiter"},
[]string{"EU-DSGVO-Art28"}},
{[]string{"datenschutzbeauftragt"},
[]string{"EU-DSGVO-Art37", "EU-DSGVO-Art38", "EU-DSGVO-Art39"}},
}
// ConceptNorms returns the load-bearing norm_ids for the concepts named in the
// query (dedup, order-preserving). Empty if no concept is named.
func ConceptNorms(query string) []string {
q := normalizeGerman(query)
seen := map[string]bool{}
out := []string{}
for _, cn := range legalConceptOntology {
for _, kw := range cn.keywords {
if strings.Contains(q, normalizeGerman(kw)) {
for _, nid := range cn.normIDs {
if !seen[nid] {
seen[nid] = true
out = append(out, nid)
}
}
break
}
}
}
return out
}
// InjectConceptNorms merges concept-injected norm units into the results so the
// load-bearing norms are VISIBLE in the evidence set. Dedups by citation_unit
// (skips norms already retrieved), then re-sorts by score — the injected units
// carry a just-below-top score so they surface high WITHOUT displacing the top
// document hit (inject, don't blindly dominate). Caps at topK.
func InjectConceptNorms(results, injected []LegalSearchResult, topK int) []LegalSearchResult {
if len(injected) == 0 {
return results
}
present := map[string]bool{}
for _, r := range results {
if r.CitationUnit != "" {
present[r.CitationUnit] = true
}
}
merged := append([]LegalSearchResult{}, results...)
for _, in := range injected {
if in.CitationUnit != "" && !present[in.CitationUnit] {
merged = append(merged, in)
present[in.CitationUnit] = true
}
}
sort.SliceStable(merged, func(i, j int) bool { return merged[i].Score > merged[j].Score })
if topK > 0 && len(merged) > topK {
merged = merged[:topK]
}
return merged
}
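The keyword matching in ConceptNorms depends on normalizeGerman, which is not part of this diff. A minimal sketch of the idea, assuming the helper lowercases and folds umlauts and ß to ASCII digraphs (foldGerman, concept, and conceptNorms are hypothetical names for illustration, not the package's own):

```go
package main

import (
	"fmt"
	"strings"
)

// germanFolder is an ASSUMED folding table; the real normalizeGerman may differ.
var germanFolder = strings.NewReplacer("ä", "ae", "ö", "oe", "ü", "ue", "ß", "ss")

// foldGerman lowercases first, then folds umlauts and ß to ASCII digraphs.
func foldGerman(s string) string {
	return germanFolder.Replace(strings.ToLower(s))
}

// concept is a hypothetical stand-in for one ontology row.
type concept struct {
	keywords []string
	normIDs  []string
}

// conceptNorms follows the same loop shape as ConceptNorms: the first matching
// keyword per concept wins, and norm IDs are deduplicated in first-seen order.
func conceptNorms(ontology []concept, query string) []string {
	q := foldGerman(query)
	seen := map[string]bool{}
	out := []string{}
	for _, c := range ontology {
		for _, kw := range c.keywords {
			if strings.Contains(q, foldGerman(kw)) {
				for _, id := range c.normIDs {
					if !seen[id] {
						seen[id] = true
						out = append(out, id)
					}
				}
				break
			}
		}
	}
	return out
}

func main() {
	ontology := []concept{
		{[]string{"technische und organisatorische maßnahmen"}, []string{"EU-DSGVO-Art32", "EU-DSGVO-Art25"}},
		{[]string{"datenschutzbeauftragt"}, []string{"EU-DSGVO-Art37"}},
	}
	// The query spells "Massnahmen" without ß; folding both sides makes it match.
	fmt.Println(conceptNorms(ontology, "Was sind technische und organisatorische Massnahmen?"))
}
```

With folding applied to both the query and the keyword, a single ontology entry covers the ß/ss and umlaut/digraph spellings.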
@@ -0,0 +1,48 @@
package ucca
import "testing"
func TestConceptNorms(t *testing.T) {
q := "Was muss ich beachten wenn ich meine Datenschutzerklärung schreibe für meine Website mit Cookie Banner?"
got := ConceptNorms(q)
want := map[string]bool{
"EU-DSGVO-Art12": true, "EU-DSGVO-Art13": true, "EU-DSGVO-Art14": true,
"DE-TDDDG-§25": true, "EU-DSGVO-Art6": true, "EU-DSGVO-Art7": true,
}
for _, nid := range got {
delete(want, nid)
}
if len(want) > 0 {
t.Errorf("ConceptNorms missing %v; got %v", want, got)
}
if len(ConceptNorms("Wie ist das Wetter heute?")) != 0 {
t.Errorf("no concept named should yield no norms")
}
}
func TestInjectConceptNorms(t *testing.T) {
results := []LegalSearchResult{
{CitationUnit: "DSK OH Telemedien", Score: 0.98},
{CitationUnit: "Art. 25 DSGVO", Score: 0.95},
}
injected := []LegalSearchResult{
{CitationUnit: "Art. 13 DSGVO", Score: 0.979},
{CitationUnit: "Art. 25 DSGVO", Score: 0.979}, // already present -> must not double
}
out := InjectConceptNorms(results, injected, 10)
if out[0].CitationUnit != "DSK OH Telemedien" {
t.Errorf("top document hit must stay #1 (not dominated), got %s", out[0].CitationUnit)
}
if len(out) != 3 {
t.Errorf("expected 3 (Art.25 not duplicated), got %d", len(out))
}
found := false
for _, r := range out {
if r.CitationUnit == "Art. 13 DSGVO" {
found = true
}
}
if !found {
t.Errorf("Art. 13 DSGVO must be injected + visible")
}
}
@@ -0,0 +1,34 @@
package ucca
import "context"
type embCacheKeyT struct{}
var embCacheKey embCacheKeyT
type embCacheEntry struct {
query string
vec []float64
}
// embedForQuery returns the query embedding, reusing a value precomputed for the SAME
// query and stashed in ctx by withQueryEmbedding. This collapses the Authority Router's
// per-collection fan-out from N embeddings to ONE — decisive when the embedding endpoint
// is remote (dev/OVH), where N round-trips dominated /retrieve latency. Falls back to a
// fresh embedding when nothing is cached (direct Search / SearchCollection callers).
func (c *LegalRAGClient) embedForQuery(ctx context.Context, query string) ([]float64, error) {
if v, ok := ctx.Value(embCacheKey).(*embCacheEntry); ok && v.query == query && len(v.vec) > 0 {
return v.vec, nil
}
return c.generateEmbedding(ctx, query)
}
// withQueryEmbedding precomputes the query embedding once and stashes it in ctx so the
// concurrent per-collection searches reuse it instead of each re-embedding. Best-effort:
// on embed error the ctx is returned unchanged and callers fall back to per-call embedding.
func (c *LegalRAGClient) withQueryEmbedding(ctx context.Context, query string) context.Context {
if vec, err := c.generateEmbedding(ctx, query); err == nil && len(vec) > 0 {
return context.WithValue(ctx, embCacheKey, &embCacheEntry{query: query, vec: vec})
}
return ctx
}
@@ -0,0 +1,68 @@
package ucca
import "context"
// EvidenceType classifies a retrieved unit by WHAT KIND of evidence it is, independent of its
// collection. Footnotes/tables/figures are Evidence Types, not collections. The Authority Router
// surfaces non-text evidence from the authoritative knowledge space (the KB slice) SEPARATELY from
// the merged text top-K, so fine-grained evidence isn't outranked by broad-base text.
//
// The layer this introduces: Intent -> Knowledge Space -> EvidenceType -> Collection -> Merge ->
// Authority. Today FOOTNOTE is populated; FIGURE arrives with C8 and TABLE is already present from
// C6/C9 — no router rebuild needed, the same path carries every new evidence type.
type EvidenceType string
const (
EvidenceText EvidenceType = "text"
EvidenceFootnote EvidenceType = "footnote"
EvidenceTable EvidenceType = "table"
EvidenceFigure EvidenceType = "figure"
)
// classifyEvidence derives the EvidenceType from a result's payload markers. Precedence:
// footnote > figure > table > text (a unit carries at most one is_* marker in practice).
func classifyEvidence(r LegalSearchResult) EvidenceType {
switch {
case r.IsFootnote:
return EvidenceFootnote
case r.IsFigure:
return EvidenceFigure
case r.IsTable:
return EvidenceTable
default:
return EvidenceText
}
}
// evidenceRetrievalTopK is the budget for the authoritative-KB evidence pass. Deliberately targeted
// (the authoritative slice within the recognized knowledge space), NOT a blanket top-K increase of
// the merged result set — the successes came from BETTER-targeted evidence, not MORE evidence.
const evidenceRetrievalTopK = 20
// maxEvidencePerType caps each surfaced evidence type.
const maxEvidencePerType = 6
// RetrieveEvidence returns the authoritative typed evidence (footnotes/tables/figures) for an
// in-scope query, pulled from the KB slice and grouped by EvidenceType. This is the "Evidence Type"
// router layer (Option A): when the query is in the KB knowledge space, the authoritative evidence
// within that space is surfaced separately so it isn't lost in the broad-base text merge. Returns an
// empty map when out of scope or KB routing is disabled. Text evidence is NOT returned here — it
// flows through the normal Retrieve() merge (the LLM context + the sources list).
func (c *LegalRAGClient) RetrieveEvidence(ctx context.Context, query string) map[EvidenceType][]LegalSearchResult {
ev := map[EvidenceType][]LegalSearchResult{}
if !c.kbScopeRoutingEnabled || c.kbSliceCollection == "" || !inKBScope(query) {
return ev
}
hits, err := c.searchInternal(ctx, c.kbSliceCollection, query, nil, evidenceRetrievalTopK)
if err != nil {
return ev
}
for _, h := range hits {
t := classifyEvidence(h)
if t == EvidenceText || len(ev[t]) >= maxEvidencePerType {
continue
}
ev[t] = append(ev[t], h)
}
return ev
}
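The grouping step in RetrieveEvidence (skip text, cap each type) can be exercised in isolation. The sketch below uses a hypothetical, trimmed-down hit type in place of LegalSearchResult and plain strings in place of EvidenceType:

```go
package main

import "fmt"

// hit is a hypothetical stand-in carrying only the payload markers.
type hit struct {
	ID         string
	IsFootnote bool
	IsFigure   bool
	IsTable    bool
}

// kindOf applies the same precedence as classifyEvidence:
// footnote > figure > table > text.
func kindOf(h hit) string {
	switch {
	case h.IsFootnote:
		return "footnote"
	case h.IsFigure:
		return "figure"
	case h.IsTable:
		return "table"
	default:
		return "text"
	}
}

// groupEvidence keeps non-text hits in rank order, at most perType per kind.
func groupEvidence(hits []hit, perType int) map[string][]hit {
	out := map[string][]hit{}
	for _, h := range hits {
		k := kindOf(h)
		if k == "text" || len(out[k]) >= perType {
			continue
		}
		out[k] = append(out[k], h)
	}
	return out
}

func main() {
	hits := []hit{
		{ID: "t1"},
		{ID: "f1", IsFootnote: true},
		{ID: "f2", IsFootnote: true},
		{ID: "f3", IsFootnote: true},
		{ID: "tab1", IsTable: true},
	}
	ev := groupEvidence(hits, 2)
	fmt.Println(len(ev["footnote"]), len(ev["table"]), len(ev["text"]))
}
```

Because the input is already ranked, the cap keeps the highest-ranked hits of each type and drops the tail.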
@@ -0,0 +1,26 @@
package ucca
import "testing"
func TestFilterByKnowledgeSpace(t *testing.T) {
results := []LegalSearchResult{
{CitationUnit: "Art. 13 DSGVO", RegulationCode: "DSGVO"},
{CitationUnit: "EU Mdr", RegulationCode: "EU MDR"},
{CitationUnit: "UStG § 14", RegulationCode: "USTG"},
{CitationUnit: "DSK OH Telemedien", RegulationCode: "DSK OH TELEMEDIEN"},
{CitationUnit: "eIDAS", RegulationCode: "EIDAS"},
}
out := FilterByKnowledgeSpace(results, "datenschutz", 10)
for _, r := range out {
if KnowledgeSpaceOf(r.RegulationCode) != "datenschutz" {
t.Errorf("off-domain leaked into scoped result: %s (%s)", r.CitationUnit, r.RegulationCode)
}
}
if len(out) != 2 { // Art. 13 DSGVO + DSK OH Telemedien
t.Errorf("expected 2 datenschutz hits, got %d", len(out))
}
// domain with no hits -> fall back to input (never strand the answer)
if len(FilterByKnowledgeSpace(results, "maschinen", 10)) != len(results) {
t.Errorf("no-hit domain should fall back to full input")
}
}
@@ -0,0 +1,46 @@
package ucca
import "strings"
// DetectIntent classifies the INTERACTION INTENT of a query (Advisor Reasoning
// Stack E3). The same norms answer very differently depending on the TASK the user
// wants: "Was ist X?" (definition) vs "Wie schreibe ich X?" (anleitung) vs "Prüfe X"
// (review). The SDK detects the intent deterministically and emits it; the FE picks
// the answer FORM, so the LLM gets a precise assignment ("write an Anleitung over
// this evidence") instead of guessing the format. Returns "" (neutral) when no
// clear task is signalled. First tier of ~20-30 intent types.
func DetectIntent(query string) string {
q := " " + normalizeGerman(query) + " "
has := func(subs ...string) bool {
for _, s := range subs {
if strings.Contains(q, normalizeGerman(s)) {
return true
}
}
return false
}
switch {
case has("prüfe", "prüf mein", "überprüfe", "überprüf", "review", "checke mein",
"ist mein", "ist meine", "ist unser", "ist unsere", "konform", "stimmt mein",
"bewerte mein", "analysiere mein"):
return "review"
case has("checkliste", "was muss ich alles", "was gehört alles", "was gehört in",
"welche punkte muss", "was brauche ich alles"):
return "checkliste"
case has("vergleich", "unterschied", "worin unterscheid", " vs ", " versus ",
"gegenüber", "im gegensatz"):
return "vergleich"
case has("wie schreibe", "wie erstelle", "wie erstell", "wie mache", "wie baue",
"wie setze ich", "wie gehe ich vor", "wie formuliere", "wie richte ich",
"anleitung", "schritt für schritt", "schritt-für-schritt", "erstelle mir",
"erstell mir", "generiere", "was muss ich beachten", "worauf muss ich achten"):
return "anleitung"
case has("welche risiken", "welche gefahren", "risikoanalyse", "welche bedrohungen"):
return "risikoanalyse"
case has("was ist", "was bedeutet", "was versteht man", "was sind", "definition",
"erkläre mir", "erklär mir", "was heißt", "was genau ist"):
return "definition"
default:
return ""
}
}
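DetectIntent pads the normalized query with a leading and trailing space, which is what lets space-delimited markers such as " vs " match at the edges of the query. A minimal sketch of that padding trick on its own (containsWord is a hypothetical helper, not part of the package):

```go
package main

import (
	"fmt"
	"strings"
)

// containsWord reports whether word occurs in query as a whole space-delimited
// token. Padding both sides with a space lets one Contains check also match a
// token at the very start or end of the query.
func containsWord(query, word string) bool {
	padded := " " + strings.ToLower(query) + " "
	return strings.Contains(padded, " "+strings.ToLower(word)+" ")
}

func main() {
	fmt.Println(containsWord("DSGVO vs BDSG", "vs"))    // token in the middle
	fmt.Println(containsWord("vs", "vs"))               // token is the whole query
	fmt.Println(containsWord("Canvas Rendering", "vs")) // only inside another word
}
```

This stays a plain substring check, so a token with attached punctuation (for example "vs.") would not match unless punctuation is stripped first.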
@@ -0,0 +1,22 @@
package ucca
import "testing"
func TestDetectIntent(t *testing.T) {
cases := map[string]string{
"Was ist eine Datenschutzerklärung?": "definition",
"Wie schreibe ich eine Datenschutzerklärung?": "anleitung",
"Was muss ich beachten wenn ich eine DSE schreibe?": "anleitung",
"Prüfe meine Datenschutzerklärung.": "review",
"Ist meine Datenschutzerklärung konform?": "review",
"Vergleiche DSGVO und BDSG.": "vergleich",
"Welche Risiken gibt es?": "risikoanalyse",
"Erstelle mir eine Checkliste für die DSFA.": "checkliste",
"Wie ist das Wetter?": "",
}
for q, want := range cases {
if got := DetectIntent(q); got != want {
t.Errorf("DetectIntent(%q)=%q want %q", q, got, want)
}
}
}
@@ -0,0 +1,52 @@
package ucca
import "strings"
// kbScopeTopics are high-precision data-protection / compliance topic markers that place a query in
// the KB-2026.1 authoritative slice even when it does NOT name a regulation. Conservative by design:
// an unmatched query falls back to the broad CE default (no regression) — the slice is only used when
// the query is confidently in-scope.
var kbScopeTopics = []string{
// DP guidance markers that live IN the slice (EDPB/DSK/WP/GL). Deliberately NOT the generic
// verbs from guidanceIntentSignals (sagt/laut/empfiehlt/auslegung) and NOT enisa/bsi/nist/owasp
// (those live in the broad CE pool, not in the slice).
"edpb", "dsk", "datenschutzausschuss", "orientierungshilfe",
"wp2", "wp 2", "wp29", "working paper", "gl 0",
"datenschutz", "dsgvo", "gdpr", "dsfa", "folgenabschätzung", "folgenabschaetzung",
"einwilligung", "auftragsverarbeit", "betroffenenrecht", "auskunftsrecht",
"verarbeitungsverzeichnis", "datenschutzbeauftragt", "verzeichnis von verarbeitung",
"cookie", "tracking", "transparenzpflicht", "datenpanne", "meldepflicht",
"technische und organisatorische maßnahmen",
"cyber resilience", "schwachstelle", "vulnerability", "sicherheitsupdate",
"maschinensicherheit", "wesentliche veränderung", "wesentliche veraenderung",
"konformitätsbewertung", "konformitaetsbewertung", "ce-kennzeichnung",
}
// inKBScope reports whether the query belongs to the KB-2026.1 authoritative slice. True when it
// names an in-slice regulation (detectRegulations), asks for guidance (EDPB/DSK/WP/GL), or hits a
// data-protection / compliance topic marker.
func inKBScope(query string) bool {
if len(detectRegulations(query)) > 0 {
return true
}
q := strings.ToLower(query)
for _, t := range kbScopeTopics {
if strings.Contains(q, t) {
return true
}
}
return false
}
// resolveCollection applies the Blue-Green "authoritative slice promotion" routing. An explicitly
// requested collection is honoured unchanged; the DEFAULT (empty) request is routed to the KB-2026.1
// slice when the query is in-scope, else to the broad CE default. Disable via RAG_KB_SCOPE_ROUTING=false.
func (c *LegalRAGClient) resolveCollection(query, requested string) string {
if requested != "" {
return requested
}
if c.kbScopeRoutingEnabled && c.kbSliceCollection != "" && inKBScope(query) {
return c.kbSliceCollection
}
return c.collection
}
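The routing decision and its rollback switch can be sketched without the real client. The environment variable names and default collection names below are taken from this diff; router, newRouter, and the reduced inScope predicate are hypothetical, and getenv is injected so the sketch does not touch the process environment:

```go
package main

import (
	"fmt"
	"strings"
)

type router struct {
	defaultColl string
	sliceColl   string
	enabled     bool
}

// newRouter reads the two variables through an injected getenv function.
func newRouter(getenv func(string) string) router {
	slice := getenv("RAG_KB_SLICE_COLLECTION")
	if slice == "" {
		slice = "kb_2026_1_build"
	}
	return router{
		defaultColl: "bp_compliance_ce",
		sliceColl:   slice,
		enabled:     getenv("RAG_KB_SCOPE_ROUTING") != "false",
	}
}

// inScope is a much smaller, hypothetical stand-in for inKBScope.
func inScope(query string) bool {
	q := strings.ToLower(query)
	for _, t := range []string{"dsgvo", "datenschutz", "wp2"} {
		if strings.Contains(q, t) {
			return true
		}
	}
	return false
}

// resolve honours an explicit collection, then the slice, then the default.
func (r router) resolve(query, requested string) string {
	if requested != "" {
		return requested
	}
	if r.enabled && r.sliceColl != "" && inScope(query) {
		return r.sliceColl
	}
	return r.defaultColl
}

func main() {
	env := map[string]string{}
	getenv := func(k string) string { return env[k] }

	fmt.Println(newRouter(getenv).resolve("Was verlangt die DSGVO?", ""))
	env["RAG_KB_SCOPE_ROUTING"] = "false" // rollback without a redeploy
	fmt.Println(newRouter(getenv).resolve("Was verlangt die DSGVO?", ""))
}
```

Only the literal value "false" disables routing in this sketch, matching the `!= "false"` check in the constructor; any other value, including an unset variable, leaves it on.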
@@ -0,0 +1,101 @@
package ucca
import (
"context"
"fmt"
"os"
"strings"
"testing"
)
func TestInKBScope(t *testing.T) {
inScope := []string{
"Welche neun Kriterien nennt WP248 fuer ein hohes Risiko?",
"Wie greifen CRA und Maschinenverordnung bei einer vernetzten Maschine ineinander?",
"Wann ist eine Datenschutz-Folgenabschaetzung erforderlich?",
"Welche Anforderungen stellt die DSGVO an die Einwilligung?",
"Brauche ich einen Datenschutzbeauftragten?",
"Wann muss eine aktiv ausgenutzte Schwachstelle gemeldet werden?",
}
outScope := []string{
"Welche OWASP-Kontrollen gibt es fuer Authentifizierung?",
"Was sagt NIST SP 800-53 zu Access Control?",
"Wie funktioniert ISO 27001 Zertifizierung?",
"Welche IFRS-Standards gelten fuer Leasing?",
}
for _, q := range inScope {
if !inKBScope(q) {
t.Errorf("inKBScope(%q) = false, want true", q)
}
}
for _, q := range outScope {
if inKBScope(q) {
t.Errorf("inKBScope(%q) = true, want false", q)
}
}
}
func TestResolveCollection(t *testing.T) {
c := &LegalRAGClient{collection: "bp_compliance_ce", kbSliceCollection: "kb_2026_1_build", kbScopeRoutingEnabled: true}
if got := c.resolveCollection("Welche Kriterien nennt WP248?", ""); got != "kb_2026_1_build" {
t.Errorf("in-scope default -> %s, want kb_2026_1_build", got)
}
if got := c.resolveCollection("Was sagt NIST SP 800-53?", ""); got != "bp_compliance_ce" {
t.Errorf("out-of-scope default -> %s, want bp_compliance_ce", got)
}
if got := c.resolveCollection("Welche Kriterien nennt WP248?", "explicit_coll"); got != "explicit_coll" {
t.Errorf("explicit request must be honoured -> %s", got)
}
c.kbScopeRoutingEnabled = false
if got := c.resolveCollection("Welche Kriterien nennt WP248?", ""); got != "bp_compliance_ce" {
t.Errorf("disabled routing -> %s, want bp_compliance_ce", got)
}
}
// TestKBScopeRoutingE2E (RUN_E2E=1) verifies the routing against the REAL collections: a default
// Search() of an in-scope query must hit the KB-2026.1 slice (WP248/MaschVO live there but NOT in
// the broad CE pool = clean discriminator); an out-of-scope query stays on CE.
func TestKBScopeRoutingE2E(t *testing.T) {
if os.Getenv("RUN_E2E") != "1" {
t.Skip("set RUN_E2E=1 + QDRANT_URL/OLLAMA_URL/QDRANT_API_KEY")
}
c := NewLegalRAGClient()
cases := []struct {
q string
wantToken string // expected in top-8 when routed to the slice
wantInKB bool
}{
{"Welche neun Kriterien nennt WP248 fuer ein voraussichtlich hohes Risiko?", "WP248", true},
{"Welche grundlegenden Sicherheits- und Gesundheitsschutzanforderungen enthaelt Anhang III der Maschinenverordnung?", "MASCH", true},
{"Wie greifen CRA und Maschinenverordnung bei einer vernetzten Maschine ineinander?", "MASCH", true},
{"Was sagt NIST SP 800-53 zu Access Control?", "", false},
}
for _, tc := range cases {
routed := c.resolveCollection(tc.q, "")
res, err := c.Search(context.Background(), tc.q, nil, 8)
if err != nil {
t.Fatalf("%q: %v", tc.q, err)
}
codes := map[string]bool{}
for _, r := range res {
codes[strings.ToUpper(r.RegulationCode)] = true
}
hit := false
if tc.wantToken != "" {
for cd := range codes {
if strings.Contains(cd, tc.wantToken) {
hit = true
break
}
}
}
col := make([]string, 0, len(codes))
for cd := range codes {
col = append(col, cd)
}
fmt.Printf("inKB=%-5v routed=%-16s wantTok=%-6s found=%-5v | %v\n", tc.wantInKB, routed, tc.wantToken, hit, col)
if tc.wantInKB && tc.wantToken != "" && !hit {
t.Errorf("%q routed to %s but %s not in top-8 (slice not active?)", tc.q, routed, tc.wantToken)
}
}
}
@@ -0,0 +1,148 @@
package ucca
import "strings"
// KnowledgeSpace is the CHIP-level knowledge domain used by the clarity gate's
// concentration signal + the user-facing context chips. It is deliberately RICHER
// than the 4 authority domains in authority.go (data_protection/cyber/ai/
// product_safety), which drive the EU-primary/subsidiarity rerank. The clarity
// gate must reflect the FULL corpus breadth (arbeitsschutz, arbeitsrecht,
// wirtschaftsrecht, finanz, ...) so a broad query surfaces as broad. Kept separate
// + additive so the tuned authority rerank stays untouched. Corpus-grounded from
// the 463 real regulation codes (0.3% fall through to "sonstiges").
// knowledgeSpaceExact matches short/ambiguous codes by EXACT string (substring
// would misfire on 2-3 char codes like "OR"/"AO"/"BGB").
var knowledgeSpaceExact = map[string]string{
"HGB": "wirtschaftsrecht", "BGB": "wirtschaftsrecht", "AO": "wirtschaftsrecht", "OR": "wirtschaftsrecht",
"ABGB": "wirtschaftsrecht", "UGB": "wirtschaftsrecht", "IFRS": "wirtschaftsrecht", "BAO": "wirtschaftsrecht",
"GMBHG": "wirtschaftsrecht", "AKTG": "wirtschaftsrecht", "INSO": "wirtschaftsrecht", "USTG": "wirtschaftsrecht",
"GOBD": "wirtschaftsrecht", "EGBGB": "wirtschaftsrecht", "GEWO": "wirtschaftsrecht", "URHG": "wirtschaftsrecht",
"DPF": "datenschutz", "TKG": "datenschutz", "TMG": "datenschutz", "DDG": "datenschutz", "DSG": "datenschutz",
"DSV": "datenschutz", "DSM": "datenschutz", "SCC": "datenschutz", "EPRIVACY": "datenschutz",
"SCHREMS II": "datenschutz", "CH_REVDSG": "datenschutz", "PLANET49": "datenschutz", "GOOGLE FONTS": "datenschutz",
"DSA": "digitale_dienste", "DMA": "digitale_dienste", "DGA": "digitale_dienste", "EHDS": "digitale_dienste",
"EIDAS": "digitale_dienste", "EIDAS 2.0": "digitale_dienste", "DATA ACT": "digitale_dienste",
"DATAACT": "digitale_dienste", "DIGITAL CONTENT": "digitale_dienste",
"MVO": "produktsicherheit", "MACHINERY": "produktsicherheit", "MASCHVO": "produktsicherheit",
"MASCHINENVO": "produktsicherheit", "GPSR": "produktsicherheit", "PID": "produktsicherheit",
"EAA": "produktsicherheit", "BFSG": "produktsicherheit", "ELEKTROG": "produktsicherheit",
"VERPACKG": "produktsicherheit", "BATTVO": "produktsicherheit", "BATTDG": "produktsicherheit", "EU MDR": "produktsicherheit",
"DORA": "finanz", "PSD2": "finanz", "MICA": "finanz", "AMLR": "finanz", "VAIT": "finanz", "BAIT": "finanz", "GWG": "finanz",
"UWG": "verbraucherschutz", "UCPD": "verbraucherschutz", "VSBG": "verbraucherschutz", "PANGV": "verbraucherschutz",
"DL-INFOV": "verbraucherschutz", "OMNIBUS": "verbraucherschutz", "UWG AT": "verbraucherschutz",
"PRODHAFTG": "verbraucherschutz", "PRODUKTHAFTUNGS-RL": "verbraucherschutz",
"ARG": "arbeitsrecht",
}
// KnowledgeSpaceLabel maps a knowledge-space id to a user-facing chip label.
var KnowledgeSpaceLabel = map[string]string{
"datenschutz": "Datenschutz", "cyber": "Cybersecurity", "ki": "KI",
"produktsicherheit": "Produktsicherheit", "arbeitsschutz": "Arbeitsschutz",
"arbeitsrecht": "Arbeitsrecht", "wirtschaftsrecht": "Wirtschaftsrecht",
"finanz": "Finanzregulierung", "digitale_dienste": "Digitale Dienste",
"verbraucherschutz": "Verbraucherschutz", "lieferkette": "Lieferkette/Nachhaltigkeit",
"hinweisgeber": "Hinweisgeberschutz", "sonstiges": "Sonstiges",
}
// KnowledgeSpaceOf maps a regulation_code to a knowledge space. Robust to code
// variants (MVO/MASCHVO/MASCHINENVO -> produktsicherheit; DSK SDM / SDM B51 ->
// datenschutz). Returns "" for empty/untagged codes (not a knowledge space).
func KnowledgeSpaceOf(code string) string {
c := strings.ToUpper(strings.TrimSpace(code))
if c == "" || c == "NONE" {
return ""
}
if d, ok := knowledgeSpaceExact[c]; ok {
return d
}
has := func(subs ...string) bool {
for _, s := range subs {
if strings.Contains(c, s) {
return true
}
}
return false
}
pre := func(subs ...string) bool {
for _, s := range subs {
if strings.HasPrefix(c, s) {
return true
}
}
return false
}
switch {
case pre("TRGS", "TRBS", "ASR", "OSHA") || has("ARBSCHG", "GEFAHRSTOFF"):
return "arbeitsschutz"
case has("AI ACT", "KI-VO", "KI VERORDNUNG", "GPAI", "AI RMF", "HLEG AI", "GENAI", "OECD AI", "AI PRINCIPLES", "OH KI", "KI BEHOERDEN", "KI SICHERHEIT", "POS KI"):
return "ki"
case pre("DSGVO", "BDSG", "TDDDG", "DSK", "EDPB", "WP24", "WP25", "WP26", "DSFA", "BFDI", "BAYLDA", "BAYLFB", "EDPS") || has("DATENSCHUTZ", "LOESCHKONZEPT", "LOESCHUNG", "VVT", "TELEMEDIEN", "EU US DPF", "BESCHAEFTIGTENDATEN"):
return "datenschutz"
case has("CRA", "NIS2", "NISG", "BSIG", "BSI-TR", "BSI_KRITIS", "KRITIS", "ENISA", "NIST", "OWASP", "EUCSA", "EUCC", "CISA", "CYCLONEDX", "SPDX", "SLSA", "OPENTELEMETRY", "CVSS", "SECURE BY DESIGN"):
return "cyber"
case has("MACHINERY", "MASCH", "BLUE GUIDE", "FDA HFE"):
return "produktsicherheit"
case has("LKSG", "CSDDD", "CSRD", "TAXONOMY"):
return "lieferkette"
case has("HINSCHG", "GESCHGEHG"):
return "hinweisgeber"
case pre("BAG ", "BAG_") || has("ARBVG", "AZG", "ARBZG", "BETRVG", "KSCHG", "MUSCHG", "AGG", "MILOG", "TZBFG", "NACHWG", "BURLG", "611A", "PAY TRANSPARENCY", "ANGG", "MUTTERSCHUTZ"):
return "arbeitsrecht"
case has("ECOMMERCE", "ECG", "MEDIENG", "VERBRAUCHERRECHTE", "DIGITAL CONTENT"):
return "verbraucherschutz"
case pre("EUGH", "BVERFG", "BVGE", "BGH", "OGH") || has("EU TAXONOMY"):
return "wirtschaftsrecht"
default:
return "sonstiges"
}
}
// ScopeResults implements G1 scope-gating: when the query names a regulation, its
// knowledge space's hits LEAD the result set (the L2 answer + [n] citations are
// built on this order, so scoped answers cite the named regulation instead of the
// embedding-majority domain). Non-scoped hits backfill to keep topK. Stable within
// each partition. Returns results unchanged when scope is "".
func ScopeResults(results []LegalSearchResult, scope string, topK int) []LegalSearchResult {
if scope == "" {
return results
}
scoped := make([]LegalSearchResult, 0, len(results))
rest := make([]LegalSearchResult, 0, len(results))
for _, r := range results {
if KnowledgeSpaceOf(r.RegulationCode) == scope {
scoped = append(scoped, r)
} else {
rest = append(rest, r)
}
}
out := append(scoped, rest...)
if topK > 0 && len(out) > topK {
out = out[:topK]
}
return out
}
// FilterByKnowledgeSpace returns ONLY the results in the given knowledge space —
// a HARD scope with no off-domain backfill. Used by E5 context scoping: when the
// user explicitly chose a domain chip, off-domain regelwerke (MDR/UStG/eIDAS) must
// not reappear in the evidence. Falls back to the input when the domain has no hits
// (never strand the answer). Caps topK.
func FilterByKnowledgeSpace(results []LegalSearchResult, scope string, topK int) []LegalSearchResult {
if scope == "" {
return results
}
out := make([]LegalSearchResult, 0, len(results))
for _, r := range results {
if KnowledgeSpaceOf(r.RegulationCode) == scope {
out = append(out, r)
}
}
if len(out) == 0 {
return results
}
if topK > 0 && len(out) > topK {
out = out[:topK]
}
return out
}
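The difference between the soft scope (ScopeResults) and the hard scope (FilterByKnowledgeSpace) is easiest to see side by side. A sketch under assumptions: result carries a precomputed Space field instead of calling KnowledgeSpaceOf, and softScope/hardScope are hypothetical re-statements of the two functions.

```go
package main

import "fmt"

// result is a hypothetical stand-in; Space replaces KnowledgeSpaceOf(RegulationCode).
type result struct {
	Unit  string
	Space string
}

func capTo(rs []result, topK int) []result {
	if topK > 0 && len(rs) > topK {
		return rs[:topK]
	}
	return rs
}

// softScope moves in-scope hits to the front and backfills with the rest.
func softScope(rs []result, scope string, topK int) []result {
	if scope == "" {
		return rs
	}
	scoped := make([]result, 0, len(rs))
	rest := make([]result, 0, len(rs))
	for _, r := range rs {
		if r.Space == scope {
			scoped = append(scoped, r)
		} else {
			rest = append(rest, r)
		}
	}
	return capTo(append(scoped, rest...), topK)
}

// hardScope keeps only in-scope hits, falling back to the input when none match.
func hardScope(rs []result, scope string, topK int) []result {
	if scope == "" {
		return rs
	}
	out := make([]result, 0, len(rs))
	for _, r := range rs {
		if r.Space == scope {
			out = append(out, r)
		}
	}
	if len(out) == 0 {
		return rs
	}
	return capTo(out, topK)
}

func main() {
	in := []result{
		{"EU MDR", "produktsicherheit"},
		{"Art. 13 DSGVO", "datenschutz"},
		{"UStG § 14", "wirtschaftsrecht"},
		{"DSK OH Telemedien", "datenschutz"},
	}
	fmt.Println(softScope(in, "datenschutz", 3))
	fmt.Println(hardScope(in, "datenschutz", 10))
}
```

The soft variant still returns topK results by backfilling off-domain hits; the hard variant may return fewer than topK but never mixes domains unless the chosen domain has no hits at all.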
@@ -21,6 +21,12 @@ type LegalRAGClient struct {
textIndexEnsured map[string]bool textIndexEnsured map[string]bool
hybridEnabled bool hybridEnabled bool
graphEnabled bool graphEnabled bool
// Blue-Green "authoritative slice promotion" (additive, NOT a CE replacement): if a query falls
// within the KB-2026.1 scope (DP/CRA/MaschVO/NIS2/DataAct/DORA/AIAct + EDPB/DSK guidance), the
// high-quality slice collection is queried; otherwise the broad default (bp_compliance_ce) stays.
kbSliceCollection string
kbScopeRoutingEnabled bool
} }
// NewLegalRAGClient creates a new Legal RAG client using Ollama bge-m3 embeddings. // NewLegalRAGClient creates a new Legal RAG client using Ollama bge-m3 embeddings.
@@ -45,15 +51,25 @@ func NewLegalRAGClient() *LegalRAGClient {
// used for justification/completeness, not for pool expansion (default). // used for justification/completeness, not for pool expansion (default).
graphEnabled := os.Getenv("RAG_GRAPH_EXPANSION") == "true" graphEnabled := os.Getenv("RAG_GRAPH_EXPANSION") == "true"
// KB-2026.1 authoritative slice (Blue-Green, additive). Routing is ON by default; roll back without
// a redeploy via RAG_KB_SCOPE_ROUTING=false (everything then falls back to the CE default).
kbSlice := os.Getenv("RAG_KB_SLICE_COLLECTION")
if kbSlice == "" {
kbSlice = "kb_2026_1_build"
}
kbScopeRouting := os.Getenv("RAG_KB_SCOPE_ROUTING") != "false"
return &LegalRAGClient{ return &LegalRAGClient{
qdrantURL: qdrantURL, qdrantURL: qdrantURL,
qdrantAPIKey: qdrantAPIKey, qdrantAPIKey: qdrantAPIKey,
ollamaURL: ollamaURL, ollamaURL: ollamaURL,
embeddingModel: "bge-m3", embeddingModel: "bge-m3",
collection: "bp_compliance_ce", collection: "bp_compliance_ce",
textIndexEnsured: make(map[string]bool), textIndexEnsured: make(map[string]bool),
hybridEnabled: hybridEnabled, hybridEnabled: hybridEnabled,
graphEnabled: graphEnabled, graphEnabled: graphEnabled,
kbSliceCollection: kbSlice,
kbScopeRoutingEnabled: kbScopeRouting,
httpClient: &http.Client{ httpClient: &http.Client{
Timeout: 60 * time.Second, Timeout: 60 * time.Second,
}, },
@@ -63,22 +79,33 @@ func NewLegalRAGClient() *LegalRAGClient {
// SearchCollection queries a specific Qdrant collection for relevant passages. // SearchCollection queries a specific Qdrant collection for relevant passages.
// If collection is empty, it falls back to the default collection (bp_compliance_ce). // If collection is empty, it falls back to the default collection (bp_compliance_ce).
func (c *LegalRAGClient) SearchCollection(ctx context.Context, collection string, query string, regulationIDs []string, topK int) ([]LegalSearchResult, error) { func (c *LegalRAGClient) SearchCollection(ctx context.Context, collection string, query string, regulationIDs []string, topK int) ([]LegalSearchResult, error) {
if collection == "" { return c.searchInternal(ctx, c.resolveCollection(query, collection), query, regulationIDs, topK)
collection = c.collection
}
return c.searchInternal(ctx, collection, query, regulationIDs, topK)
} }
// Search queries the compliance CE corpus for relevant passages. // Search queries the compliance corpus for relevant passages. The target collection is resolved by
// the Blue-Green slice routing: the KB-2026.1 slice for in-scope queries, else the broad CE default.
func (c *LegalRAGClient) Search(ctx context.Context, query string, regulationIDs []string, topK int) ([]LegalSearchResult, error) { func (c *LegalRAGClient) Search(ctx context.Context, query string, regulationIDs []string, topK int) ([]LegalSearchResult, error) {
return c.searchInternal(ctx, c.collection, query, regulationIDs, topK) return c.searchInternal(ctx, c.resolveCollection(query, ""), query, regulationIDs, topK)
} }
// searchInternal performs the actual search against a given collection. // searchInternal performs the actual search against a given collection.
// If hybrid search is enabled, it uses the Qdrant Query API with RRF fusion // If hybrid search is enabled, it uses the Qdrant Query API with RRF fusion
// (dense + full-text). Falls back to dense-only /points/search on failure. // (dense + full-text). Falls back to dense-only /points/search on failure.
func (c *LegalRAGClient) searchInternal(ctx context.Context, collection string, query string, regulationIDs []string, topK int) ([]LegalSearchResult, error) { func (c *LegalRAGClient) searchInternal(ctx context.Context, collection string, query string, regulationIDs []string, topK int) ([]LegalSearchResult, error) {
embedding, err := c.generateEmbedding(ctx, query) // Multi-regulation retrieval: if the query EXPLICITLY names >=2 regulations (e.g. "CRA und
// Maschinenverordnung"), each regulation is retrieved separately and the results are merged, so
// BOTH domains land in the prompt instead of only the keyword-dominant one. Generic (query ->
// regulations, no doc-specific logic); applies only when the caller is not already filtering by
// regulation. Best-effort: an empty or failed multi-regulation result falls back to the standard search.
if len(regulationIDs) == 0 {
if regs := detectRegulations(query); len(regs) >= 2 {
if mr, mErr := c.searchMultiRegulation(ctx, collection, query, regs, topK); mErr == nil && len(mr) > 0 {
return mr, nil
}
}
}
embedding, err := c.embedForQuery(ctx, query)
if err != nil { if err != nil {
return nil, fmt.Errorf("failed to generate embedding: %w", err) return nil, fmt.Errorf("failed to generate embedding: %w", err)
} }
@@ -123,43 +150,7 @@ func (c *LegalRAGClient) searchInternal(ctx context.Context, collection string,
hits = c.expandViaGraph(ctx, collection, hits) hits = c.expandViaGraph(ctx, collection, hits)
} }
results := make([]LegalSearchResult, len(hits)) results := hitsToResults(hits)
for i, hit := range hits {
// Legal metadata per rag_reingest_spec.md §2: prefer the normalized fields
// (article_label/regulation_code/article/...); fall back to the old field names as long as the
// corpus has not been re-ingested yet (regulation_id, section="§ 38").
regCode := getString(hit.Payload, "regulation_code")
if regCode == "" {
regCode = getString(hit.Payload, "regulation_id")
}
article := getString(hit.Payload, "article")
if article == "" {
article = getString(hit.Payload, "section")
}
results[i] = LegalSearchResult{
Text: getString(hit.Payload, "chunk_text"),
RegulationCode: regCode,
RegulationName: getString(hit.Payload, "regulation_name_de"),
RegulationShort: getString(hit.Payload, "regulation_short"),
Category: getString(hit.Payload, "category"),
ArticleLabel: getString(hit.Payload, "article_label"),
Article: article,
Paragraph: getString(hit.Payload, "paragraph"),
Sub: getString(hit.Payload, "sub"),
IsRecital: getBool(hit.Payload, "is_recital"),
CitationStyle: getString(hit.Payload, "citation_style"),
Pages: getIntSlice(hit.Payload, "pages"),
SourceURL: getString(hit.Payload, "source"),
Score: hit.Score,
AuthorityWeight: getInt(hit.Payload, "authority_weight"),
SourceClass: getString(hit.Payload, "source_class"),
Jurisdiction: getString(hit.Payload, "jurisdiction"),
CitationUnit: getString(hit.Payload, "citation_unit"),
ReferencesOut: getStringSlice(hit.Payload, "references_out"),
ReferencesIn: getStringSlice(hit.Payload, "references_in"),
Superseded: getString(hit.Payload, "status") == "superseded",
}
}
// Authority-aware re-ranking: binding law of the matching jurisdiction/domain moves // Authority-aware re-ranking: binding law of the matching jurisdiction/domain moves
// up, guidance/foreign law/off-domain moves down (nothing is deleted). Order only, // up, guidance/foreign law/off-domain moves down (nothing is deleted). Order only,
@@ -122,12 +122,14 @@ func (c *LegalRAGClient) searchHybrid(ctx context.Context, collection string, em
} }
if len(regulationIDs) > 0 { if len(regulationIDs) > 0 {
conditions := make([]qdrantCondition, len(regulationIDs)) // Match BOTH the legacy field (regulation_id) and the normalized field
for i, regID := range regulationIDs { // (regulation_code) so per-regulation filtering works on the re-ingested corpus too.
conditions[i] = qdrantCondition{ conditions := make([]qdrantCondition, 0, len(regulationIDs)*2)
Key: "regulation_id", for _, regID := range regulationIDs {
Match: qdrantMatch{Value: regID}, conditions = append(conditions,
} qdrantCondition{Key: "regulation_id", Match: qdrantMatch{Value: regID}},
qdrantCondition{Key: "regulation_code", Match: qdrantMatch{Value: regID}},
)
} }
queryReq.Filter = &qdrantFilter{Should: conditions} queryReq.Filter = &qdrantFilter{Should: conditions}
} }
@@ -175,12 +177,14 @@ func (c *LegalRAGClient) searchDense(ctx context.Context, collection string, emb
} }
if len(regulationIDs) > 0 { if len(regulationIDs) > 0 {
conditions := make([]qdrantCondition, len(regulationIDs)) // Match BOTH the legacy field (regulation_id) and the normalized field
for i, regID := range regulationIDs { // (regulation_code) so per-regulation filtering works on the re-ingested corpus too.
conditions[i] = qdrantCondition{ conditions := make([]qdrantCondition, 0, len(regulationIDs)*2)
Key: "regulation_id", for _, regID := range regulationIDs {
Match: qdrantMatch{Value: regID}, conditions = append(conditions,
} qdrantCondition{Key: "regulation_id", Match: qdrantMatch{Value: regID}},
qdrantCondition{Key: "regulation_code", Match: qdrantMatch{Value: regID}},
)
} }
searchReq.Filter = &qdrantFilter{Should: conditions} searchReq.Filter = &qdrantFilter{Should: conditions}
} }
@@ -37,6 +37,17 @@ type LegalSearchResult struct {
// Supersede status (status="superseded", use_for_primary=false): an old source // Supersede status (status="superseded", use_for_primary=false): an old source
// that is demoted for default questions (not hidden; still findable for history). // that is demoted for default questions (not hidden; still findable for history).
Superseded bool `json:"-"` Superseded bool `json:"-"`
// Evidence-type markers: internal (json:"-", no per-result contract change), populated from
// the Qdrant payload. classifyEvidence() derives the EvidenceType from them; the
// router surfaces non-text evidence (footnote/table/figure) separately from the text merge,
// so that fine-grained evidence is not outranked by broad base text.
IsFootnote bool `json:"-"`
FootnoteLabel string `json:"-"`
FootnoteVerbatim string `json:"-"`
RefCitationUnit string `json:"-"`
IsTable bool `json:"-"` // C6/C9: is_table (ruled + borderless)
IsFigure bool `json:"-"` // C8: is_figure (not populated until C8)
} }
// LegalAssessment is the auditable explanation layer over a ranked result set: // LegalAssessment is the auditable explanation layer over a ranked result set:
@@ -0,0 +1,208 @@
package ucca
import (
"context"
"fmt"
"strings"
)
// multiRegMinPerRegulation is the minimum number of hits fetched per named regulation, so
// each domain is fairly represented even when topK/len(regs) would be tiny.
const multiRegMinPerRegulation = 3
// regulationCatalog maps a regulation to (a) the aliases that signal it is EXPLICITLY named
// in a query and (b) the regulation_code/regulation_id values used to filter the corpus.
// Deterministic + generic: a query naming >=2 regulations triggers per-regulation retrieval
// so a cross-regulation question returns every named domain — NOT a doc-specific rule.
var regulationCatalog = []struct {
Canonical string
Aliases []string
CodeValues []string
}{
{"CRA", []string{"cra", "cyber resilience"}, []string{"CRA"}},
// MaschVO is coded differently per collection: slice MASCHVO · gesetze MVO · ce MACHINERY/MASCHINENVO.
// List all variants as CodeValues; otherwise the per-regulation filter finds MaschVO only in the slice (0070).
{"MaschVO", []string{"maschinenverordnung", "maschvo", "machinery regulation"}, []string{"MASCHVO", "MaschVO", "MVO", "MASCHINENVO", "MACHINERY"}},
{"NIS2", []string{"nis2", "nis-2", "nis 2"}, []string{"NIS2"}},
{"DORA", []string{"dora"}, []string{"DORA"}},
{"Data Act", []string{"data act", "datengesetz"}, []string{"DATA ACT", "DataAct"}},
{"AI Act", []string{"ai act", "ki-vo", "ki-verordnung", "ai-verordnung"}, []string{"AI ACT", "AIAct"}},
{"DSGVO", []string{"dsgvo", "gdpr"}, []string{"DSGVO"}},
{"TDDDG", []string{"tdddg"}, []string{"TDDDG"}},
{"BDSG", []string{"bdsg"}, []string{"BDSG"}},
}
type detectedRegulation struct {
Canonical string
CodeValues []string
}
// detectRegulations returns the DISTINCT regulations explicitly named in the query. >=2 of
// them is the trigger for multi-regulation retrieval. Pure + deterministic, no LLM.
func detectRegulations(query string) []detectedRegulation {
q := strings.ToLower(query)
var out []detectedRegulation
for _, r := range regulationCatalog {
for _, a := range r.Aliases {
if strings.Contains(q, a) {
out = append(out, detectedRegulation{Canonical: r.Canonical, CodeValues: r.CodeValues})
break
}
}
}
return out
}
func hitID(h qdrantSearchHit) string { return fmt.Sprintf("%v", h.ID) }
// balanceByRegulation builds the final top-K so EVERY explicitly-named regulation with hits is
// represented, instead of letting the keyword-dominant domain (e.g. CRA) crowd out the other
// (e.g. MaschVO) in a cross-regulation query. The input pool must already be score-ordered;
// results are grouped by case-insensitive regulation_code match against each regulation's CodeValues, then
// taken round-robin across the named domains (highest-scored first within each), with any
// remaining slots filled by the leftover pool in score order. Generic; no doc-specific logic.
func balanceByRegulation(pool []LegalSearchResult, regs []detectedRegulation, topK int) []LegalSearchResult {
if topK <= 0 {
topK = 8
}
byReg := make([][]LegalSearchResult, len(regs))
matched := make([]bool, len(pool))
for ri, r := range regs {
for pi := range pool {
if matched[pi] {
continue
}
code := strings.TrimSpace(pool[pi].RegulationCode)
for _, cv := range r.CodeValues {
if strings.EqualFold(code, cv) {
byReg[ri] = append(byReg[ri], pool[pi])
matched[pi] = true
break
}
}
}
}
out := make([]LegalSearchResult, 0, topK)
idx := make([]int, len(regs))
for len(out) < topK {
progressed := false
for ri := range regs {
if idx[ri] < len(byReg[ri]) {
out = append(out, byReg[ri][idx[ri]])
idx[ri]++
progressed = true
if len(out) >= topK {
break
}
}
}
if !progressed {
break
}
}
for pi := range pool {
if len(out) >= topK {
break
}
if !matched[pi] {
out = append(out, pool[pi])
}
}
return out
}
// searchMultiRegulation retrieves each explicitly-named regulation SEPARATELY (per-regulation
// filter) and merges, so a cross-regulation query ("Wie greifen CRA und MaschVO ineinander?")
// returns BOTH domains in the prompt instead of only the keyword-dominant one. Generic over any
// named pair (DSGVO+TDDDG, CRA+NIS2, DORA+NIS2, AI Act+DSGVO, ...). The merged pool is
// authority-reranked once. Pure pool-construction; topK contract preserved.
func (c *LegalRAGClient) searchMultiRegulation(ctx context.Context, collection, query string, regs []detectedRegulation, topK int) ([]LegalSearchResult, error) {
embedding, err := c.generateEmbedding(ctx, query)
if err != nil {
return nil, fmt.Errorf("failed to generate embedding: %w", err)
}
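// Guard against a division by zero if a caller passes no regulations.
if len(regs) == 0 {
return nil, fmt.Errorf("multi-regulation search needs at least one regulation")
}
perReg := topK / len(regs)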
if perReg < multiRegMinPerRegulation {
perReg = multiRegMinPerRegulation
}
var merged []qdrantSearchHit
seen := make(map[string]bool)
for _, r := range regs {
var hits []qdrantSearchHit
if c.hybridEnabled {
if h, hErr := c.searchHybrid(ctx, collection, embedding, r.CodeValues, perReg); hErr == nil {
hits = h
}
}
if hits == nil {
if h, dErr := c.searchDense(ctx, collection, embedding, r.CodeValues, perReg); dErr == nil {
hits = h
}
}
for _, h := range hits {
id := hitID(h)
if seen[id] {
continue
}
seen[id] = true
merged = append(merged, h)
}
}
if len(merged) == 0 {
return nil, fmt.Errorf("multi-regulation search returned no hits")
}
results := hitsToResults(merged)
results = rerankByAuthority(query, results)
if topK > 0 && len(results) > topK {
results = results[:topK]
}
return results, nil
}
// hitsToResults maps raw Qdrant hits to LegalSearchResult, preferring the normalized payload
// fields (regulation_code/article_label/...) with fallback to the legacy names (regulation_id,
// section) while the corpus is mid-re-ingestion. Shared by searchInternal + searchMultiRegulation.
func hitsToResults(hits []qdrantSearchHit) []LegalSearchResult {
results := make([]LegalSearchResult, len(hits))
for i, hit := range hits {
regCode := getString(hit.Payload, "regulation_code")
if regCode == "" {
regCode = getString(hit.Payload, "regulation_id")
}
article := getString(hit.Payload, "article")
if article == "" {
article = getString(hit.Payload, "section")
}
results[i] = LegalSearchResult{
Text: getString(hit.Payload, "chunk_text"),
RegulationCode: regCode,
RegulationName: getString(hit.Payload, "regulation_name_de"),
RegulationShort: getString(hit.Payload, "regulation_short"),
Category: getString(hit.Payload, "category"),
ArticleLabel: getString(hit.Payload, "article_label"),
Article: article,
Paragraph: getString(hit.Payload, "paragraph"),
Sub: getString(hit.Payload, "sub"),
IsRecital: getBool(hit.Payload, "is_recital"),
CitationStyle: getString(hit.Payload, "citation_style"),
Pages: getIntSlice(hit.Payload, "pages"),
SourceURL: getString(hit.Payload, "source"),
Score: hit.Score,
AuthorityWeight: getInt(hit.Payload, "authority_weight"),
SourceClass: getString(hit.Payload, "source_class"),
Jurisdiction: getString(hit.Payload, "jurisdiction"),
CitationUnit: getString(hit.Payload, "citation_unit"),
ReferencesOut: getStringSlice(hit.Payload, "references_out"),
ReferencesIn: getStringSlice(hit.Payload, "references_in"),
Superseded: getString(hit.Payload, "status") == "superseded",
IsFootnote: getBool(hit.Payload, "is_footnote"),
FootnoteLabel: getString(hit.Payload, "footnote_label"),
FootnoteVerbatim: getString(hit.Payload, "footnote_verbatim"),
RefCitationUnit: getString(hit.Payload, "ref_citation_unit"),
IsTable: getBool(hit.Payload, "is_table"),
IsFigure: getBool(hit.Payload, "is_figure"),
}
}
return results
}
@@ -0,0 +1,92 @@
package ucca
import (
"context"
"fmt"
"os"
"strings"
"testing"
)
// TestDetectRegulations is a pure unit test of the multi-regulation TRIGGER (no Qdrant):
// only an explicit naming of >=2 regulations enables multi-regulation retrieval. A single
// named regulation, or a topical question that doesn't name one, stays single-domain.
func TestDetectRegulations(t *testing.T) {
cases := []struct {
q string
want int
}{
{"Welche neun Kriterien nennt WP248 fuer ein voraussichtlich hohes Risiko?", 0},
{"Welche Anforderungen gelten fuer wesentliche Veraenderungen einer Maschine?", 0}, // "Maschine" != MaschVO
{"Benoetigt eine SPS ohne Netzwerkanschluss eine CRA-Bewertung?", 1}, // 1 -> single
{"Wie greifen CRA und Maschinenverordnung bei einer vernetzten Maschine ineinander?", 2},
{"Wie greifen DSGVO und TDDDG bei der Nutzung von Cookies ineinander?", 2},
{"Wie verhalten sich DORA und NIS2 fuer ein Finanzunternehmen?", 2},
{"Wie greifen AI Act und DSGVO bei einem KI-System ineinander?", 2},
}
for _, c := range cases {
if got := len(detectRegulations(c.q)); got != c.want {
t.Errorf("detectRegulations(%q) = %d, want %d", c.q, got, c.want)
}
}
}
// TestMultiRegE2E (RUN_E2E=1) verifies against the build collection that an explicit
// cross-regulation query returns BOTH named domains in the top-K — the core acceptance
// gate for multi-regulation retrieval.
func TestMultiRegE2E(t *testing.T) {
if os.Getenv("RUN_E2E") != "1" {
t.Skip("set RUN_E2E=1 + QDRANT_URL/OLLAMA_URL")
}
c := NewLegalRAGClient()
coll := os.Getenv("E2E_COLLECTION")
if coll == "" {
coll = "bp_compliance_kb_2026_1_build"
}
cases := []struct {
id string
q string
want []string
}{
{"GQ-0070 CRA+MaschVO", "Wie greifen CRA und Maschinenverordnung bei einer vernetzten Maschine ineinander?", []string{"CRA", "MASCH"}},
{"DSGVO+TDDDG", "Wie greifen DSGVO und TDDDG bei der Nutzung von Cookies und Tracking-Technologien ineinander?", []string{"DSGVO", "TDDDG"}},
{"CRA+NIS2", "Wie verhalten sich CRA und NIS2 bei einem vernetzten Produkt eines wichtigen Unternehmens zueinander?", []string{"CRA", "NIS2"}},
{"DORA+NIS2", "Wie greifen DORA und NIS2 bei einem Finanzunternehmen ineinander?", []string{"DORA", "NIS2"}},
{"AI Act+DSGVO", "Wie greifen AI Act und DSGVO bei einem KI-System ineinander, das personenbezogene Daten verarbeitet?", []string{"AI ACT", "DSGVO"}},
}
for _, tc := range cases {
res, err := c.SearchCollection(context.Background(), coll, tc.q, nil, 8)
if err != nil {
t.Fatalf("%s: %v", tc.id, err)
}
present := map[string]bool{}
for _, r := range res {
present[strings.ToUpper(r.RegulationCode)] = true
}
ok := true
for _, w := range tc.want {
found := false
for cd := range present {
if strings.Contains(cd, w) {
found = true
break
}
}
if !found {
ok = false
}
}
codes := make([]string, 0, len(present))
for cd := range present {
codes = append(codes, cd)
}
status := "OK"
if !ok {
status = "FAIL"
}
fmt.Printf("%-22s want=%v present=%v %s\n", tc.id, tc.want, codes, status)
if !ok {
t.Errorf("%s: not all named regulations in top-8 (want %v, got %v)", tc.id, tc.want, codes)
}
}
}
@@ -0,0 +1,15 @@
package ucca
import "strings"
// normalizeGerman lowercases and folds German umlauts / ß to their ASCII digraphs
// (ä→ae, ö→oe, ü→ue, ß→ss) so keyword matching is insensitive to whether the user
// typed "Prüfe" or "Pruefe", "Datenschutzerklärung" or "Datenschutzerklaerung".
// Applied to BOTH the query and the keyword lists in the German-text matchers.
func normalizeGerman(s string) string {
return umlautFolder.Replace(strings.ToLower(s))
}
var umlautFolder = strings.NewReplacer(
"ä", "ae", "ö", "oe", "ü", "ue", "ß", "ss",
)
@@ -0,0 +1,29 @@
package ucca
import "testing"
func TestDetectIntentUmlautFold(t *testing.T) {
cases := map[string]string{
"Pruefe meine Datenschutzerklaerung.": "review", // ASCII digraph
"Prüfe meine Datenschutzerklärung.": "review", // umlaut
"Ueberpruefe das Impressum": "review", // ASCII "überprüfe"
"Was ist eine TOM?": "definition", // unchanged
}
for q, want := range cases {
if got := DetectIntent(q); got != want {
t.Errorf("DetectIntent(%q)=%q want %q", q, got, want)
}
}
}
func TestConceptNormsUmlautFold(t *testing.T) {
// ASCII "datenschutzerklaerung" must resolve to the same core norms as the umlaut form.
ascii := ConceptNorms("Was gehoert in eine Datenschutzerklaerung?")
umlaut := ConceptNorms("Was gehört in eine Datenschutzerklärung?")
if len(ascii) == 0 {
t.Errorf("ConceptNorms(ASCII datenschutzerklaerung) returned none")
}
if len(ascii) != len(umlaut) {
t.Errorf("ASCII vs umlaut concept norms differ: %v vs %v", ascii, umlaut)
}
}
@@ -162,7 +162,7 @@ async def update_ai_system(
db: Session = Depends(get_db), db: Session = Depends(get_db),
): ):
"""Update an AI system.""" """Update an AI system."""
-from datetime import datetime
+from datetime import datetime, timezone
system = db.query(AISystemDB).filter(AISystemDB.id == system_id).first() system = db.query(AISystemDB).filter(AISystemDB.id == system_id).first()
if not system: if not system:
@@ -226,7 +226,7 @@ async def assess_ai_system(
db: Session = Depends(get_db), db: Session = Depends(get_db),
): ):
"""Run AI Act risk assessment for an AI system.""" """Run AI Act risk assessment for an AI system."""
-from datetime import datetime
+from datetime import datetime, timezone
system = db.query(AISystemDB).filter(AISystemDB.id == system_id).first() system = db.query(AISystemDB).filter(AISystemDB.id == system_id).first()
if not system: if not system:
@@ -47,6 +47,8 @@ from compliance.services.canonical_control_service import (
_control_row, # re-exported for legacy test imports _control_row, # re-exported for legacy test imports
) )
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/v1/canonical", tags=["canonical-controls"]) router = APIRouter(prefix="/v1/canonical", tags=["canonical-controls"])
@@ -14,7 +14,7 @@ Endpoints:
""" """
import logging import logging
-from datetime import datetime, date, timedelta
+from datetime import datetime, date, timedelta, timezone
from calendar import month_abbr from calendar import month_abbr
from typing import Optional, Dict, Any, List from typing import Optional, Dict, Any, List
from decimal import Decimal from decimal import Decimal
@@ -26,10 +26,11 @@ versions). Module-level helpers re-exported for legacy tests.
import logging import logging
from typing import Any, List, Optional from typing import Any, List, Optional
-from fastapi import APIRouter, Depends, Query
+from fastapi import APIRouter, Depends, HTTPException, Query
from pydantic import BaseModel from pydantic import BaseModel
from fastapi.responses import Response from fastapi.responses import Response
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
from sqlalchemy import text
from classroom_engine.database import get_db from classroom_engine.database import get_db
from compliance.api._http_errors import translate_domain_errors from compliance.api._http_errors import translate_domain_errors
@@ -484,6 +485,7 @@ async def list_dsfas(
async def create_dsfa( async def create_dsfa(
request: DSFACreate, request: DSFACreate,
tenant_id: Optional[str] = Query(None), tenant_id: Optional[str] = Query(None),
db: Session = Depends(get_db),
service: DSFAService = Depends(get_dsfa_service), service: DSFAService = Depends(get_dsfa_service),
) -> dict[str, Any]: ) -> dict[str, Any]:
"""Neue DSFA erstellen.""" """Neue DSFA erstellen."""
@@ -16,6 +16,11 @@ from the legacy path.
""" """
import logging import logging
import os
import json
import hashlib
import uuid as uuid_module
from datetime import datetime, timedelta
from typing import Any, Optional from typing import Any, Optional
from fastapi import APIRouter, Depends, File, HTTPException, Query, UploadFile from fastapi import APIRouter, Depends, File, HTTPException, Query, UploadFile
@@ -30,14 +35,15 @@ from ..db import (
EvidenceConfidenceEnum, EvidenceConfidenceEnum,
EvidenceTruthStatusEnum, EvidenceTruthStatusEnum,
) )
-from ..db.models import EvidenceDB, ControlDB, AuditTrailDB
+from ..db.models import EvidenceDB, AuditTrailDB
from ..services.auto_risk_updater import AutoRiskUpdater from ..services.auto_risk_updater import AutoRiskUpdater
-from ..services.evidence_service import EvidenceService
+from ..services.evidence_service import EvidenceService, _update_risks as _update_risks_impl
from .schemas import ( from .schemas import (
EvidenceCreate, EvidenceResponse, EvidenceListResponse, EvidenceCreate, EvidenceResponse, EvidenceListResponse,
EvidenceRejectRequest, EvidenceRejectRequest,
) )
from .audit_trail_utils import log_audit_trail from .audit_trail_utils import log_audit_trail
from ._http_errors import translate_domain_errors
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
router = APIRouter(tags=["compliance-evidence"]) router = APIRouter(tags=["compliance-evidence"])
@@ -146,6 +152,7 @@ async def list_evidence(
status: Optional[str] = None, status: Optional[str] = None,
page: Optional[int] = Query(None, ge=1, description="Page number (1-based)"), page: Optional[int] = Query(None, ge=1, description="Page number (1-based)"),
limit: Optional[int] = Query(None, ge=1, le=500, description="Items per page"), limit: Optional[int] = Query(None, ge=1, le=500, description="Items per page"),
db: Session = Depends(get_db),
service: EvidenceService = Depends(get_evidence_service), service: EvidenceService = Depends(get_evidence_service),
) -> EvidenceListResponse: ) -> EvidenceListResponse:
"""List evidence with optional filters and pagination.""" """List evidence with optional filters and pagination."""
@@ -186,9 +193,11 @@ async def list_evidence(
@router.post("/evidence", response_model=EvidenceResponse) @router.post("/evidence", response_model=EvidenceResponse)
async def create_evidence( async def create_evidence(
evidence_data: EvidenceCreate, evidence_data: EvidenceCreate,
db: Session = Depends(get_db),
service: EvidenceService = Depends(get_evidence_service), service: EvidenceService = Depends(get_evidence_service),
) -> EvidenceResponse: ) -> EvidenceResponse:
"""Create new evidence record.""" """Create new evidence record."""
dsms_cid = None
repo = EvidenceRepository(db) repo = EvidenceRepository(db)
# Get control UUID # Get control UUID
@@ -257,6 +266,7 @@ async def create_evidence(
@router.delete("/evidence/{evidence_id}") @router.delete("/evidence/{evidence_id}")
async def delete_evidence( async def delete_evidence(
evidence_id: str, evidence_id: str,
db: Session = Depends(get_db),
service: EvidenceService = Depends(get_evidence_service), service: EvidenceService = Depends(get_evidence_service),
) -> dict[str, Any]: ) -> dict[str, Any]:
"""Delete an evidence record.""" """Delete an evidence record."""
@@ -275,6 +285,7 @@ async def upload_evidence(
title: str = Query(...), title: str = Query(...),
file: UploadFile = File(...), file: UploadFile = File(...),
description: Optional[str] = Query(None), description: Optional[str] = Query(None),
db: Session = Depends(get_db),
service: EvidenceService = Depends(get_evidence_service), service: EvidenceService = Depends(get_evidence_service),
) -> EvidenceResponse: ) -> EvidenceResponse:
"""Upload evidence file.""" """Upload evidence file."""
@@ -674,6 +685,7 @@ async def collect_ci_evidence(
async def get_ci_evidence_status( async def get_ci_evidence_status(
control_id: Optional[str] = Query(None, description="Filter by control ID"), control_id: Optional[str] = Query(None, description="Filter by control ID"),
days: int = Query(30, description="Look back N days"), days: int = Query(30, description="Look back N days"),
db: Session = Depends(get_db),
service: EvidenceService = Depends(get_evidence_service), service: EvidenceService = Depends(get_evidence_service),
) -> dict[str, Any]: ) -> dict[str, Any]:
"""Get CI/CD evidence collection status overview.""" """Get CI/CD evidence collection status overview."""
@@ -681,70 +693,8 @@ async def get_ci_evidence_status(
return service.ci_status(control_id, days) return service.ci_status(control_id, days)
-# ----------------------------------------------------------------------------
-# Legacy re-exports for tests that import helpers directly.
-# ----------------------------------------------------------------------------
+# (Old CI status implementation removed: unreachable code after `return
+# service.ci_status(...)`; replaced by the service, `query` was never initialized.)
-if control_id:
-ctrl_repo = ControlRepository(db)
-control = ctrl_repo.get_by_control_id(control_id)
-if control:
-query = query.filter(EvidenceDB.control_id == control.id)
-evidence_list = query.order_by(EvidenceDB.collected_at.desc()).limit(100).all()
-# Group by control and calculate stats
-control_stats = defaultdict(lambda: {
-"total": 0,
-"valid": 0,
-"failed": 0,
-"last_collected": None,
-"evidence": [],
-})
-for e in evidence_list:
-# Get control_id string
-control = db.query(ControlDB).filter(ControlDB.id == e.control_id).first()
-ctrl_id = control.control_id if control else "unknown"
-stats = control_stats[ctrl_id]
-stats["total"] += 1
-if e.status:
-if e.status.value == "valid":
-stats["valid"] += 1
-elif e.status.value == "failed":
-stats["failed"] += 1
-if not stats["last_collected"] or e.collected_at > stats["last_collected"]:
-stats["last_collected"] = e.collected_at
-# Add evidence summary
-stats["evidence"].append({
-"id": e.id,
-"type": e.evidence_type,
-"status": e.status.value if e.status else None,
-"collected_at": e.collected_at.isoformat() if e.collected_at else None,
-"ci_job_id": e.ci_job_id,
-})
-# Convert to list and sort
-result = []
-for ctrl_id, stats in control_stats.items():
-result.append({
-"control_id": ctrl_id,
-"total_evidence": stats["total"],
-"valid_count": stats["valid"],
-"failed_count": stats["failed"],
-"last_collected": stats["last_collected"].isoformat() if stats["last_collected"] else None,
-"recent_evidence": stats["evidence"][:5],
-})
-result.sort(key=lambda x: x["last_collected"] or "", reverse=True)
-return {
-"period_days": days,
-"total_evidence": len(evidence_list),
-"controls": result,
-}
# ============================================================================ # ============================================================================
@@ -772,6 +722,7 @@ async def review_evidence(
approval_status='first_approved'. A second (different) reviewer then approval_status='first_approved'. A second (different) reviewer then
sets second_reviewer and approval_status='approved'. sets second_reviewer and approval_status='approved'.
""" """
dsms_cid = None
evidence = db.query(EvidenceDB).filter(EvidenceDB.id == evidence_id).first() evidence = db.query(EvidenceDB).filter(EvidenceDB.id == evidence_id).first()
if not evidence: if not evidence:
raise HTTPException(status_code=404, detail=f"Evidence {evidence_id} not found") raise HTTPException(status_code=404, detail=f"Evidence {evidence_id} not found")
@@ -851,6 +802,7 @@ async def reject_evidence(
db: Session = Depends(get_db), db: Session = Depends(get_db),
): ):
"""Reject evidence (sets approval_status='rejected').""" """Reject evidence (sets approval_status='rejected')."""
dsms_cid = None
evidence = db.query(EvidenceDB).filter(EvidenceDB.id == evidence_id).first() evidence = db.query(EvidenceDB).filter(EvidenceDB.id == evidence_id).first()
if not evidence: if not evidence:
raise HTTPException(status_code=404, detail=f"Evidence {evidence_id} not found") raise HTTPException(status_code=404, detail=f"Evidence {evidence_id} not found")
@@ -8,7 +8,7 @@ This adds NO new reasoning logic. It exposes the already-built, tested orchestra
""" """
import logging import logging
-from typing import List, Optional
+from typing import Dict, List, Optional
from fastapi import APIRouter, HTTPException from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field from pydantic import BaseModel, Field
@@ -20,7 +20,7 @@ from compliance.onboarding import (
ProducedSignal, ProducedSignal,
RejectedAssumption, RejectedAssumption,
) )
-from compliance.services.onboarding_service import run_advisor, supported_targets
+from compliance.services.onboarding_service import labels_for, run_advisor, supported_targets
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
router = APIRouter(prefix="/onboarding", tags=["onboarding"]) router = APIRouter(prefix="/onboarding", tags=["onboarding"])
@@ -50,6 +50,7 @@ class AdvisorResponse(BaseModel):
evidence_requests: List[str] = Field(default_factory=list) evidence_requests: List[str] = Field(default_factory=list)
unsupported_domains: List[str] = Field(default_factory=list) unsupported_domains: List[str] = Field(default_factory=list)
completeness_summary: str = "" completeness_summary: str = ""
capability_labels: Dict[str, str] = Field(default_factory=dict) # capability_id -> human label (DE)
@router.get("/targets") @router.get("/targets")
@@ -65,10 +66,17 @@ def advisor_start_endpoint(req: OnboardingAdvisorRequest) -> AdvisorResponse:
company=req.company, certifications=req.certifications, target=req.target, company=req.company, certifications=req.certifications, target=req.target,
signals=req.scanner_findings, known_evidence=req.known_evidence, signals=req.scanner_findings, known_evidence=req.known_evidence,
products=req.products, markets=req.markets, industry=req.industry or "") products=req.products, markets=req.markets, industry=req.industry or "")
surfaced = [
*result.auto_detected, *result.indications, *result.capability_delta,
*(q.capability_id for q in result.next_best_questions),
*(c for a in result.inferred_assumptions for c in a.capabilities),
*(m.capability_id for m in result.top_measures),
]
return AdvisorResponse( return AdvisorResponse(
silent_intake_summary=si_summary, headline=result.headline, auto_detected=result.auto_detected, silent_intake_summary=si_summary, headline=result.headline, auto_detected=result.auto_detected,
indications=result.indications, indications=result.indications,
inferred_assumptions=result.inferred_assumptions, rejected_assumptions=result.rejected_assumptions, inferred_assumptions=result.inferred_assumptions, rejected_assumptions=result.rejected_assumptions,
top_5_questions=result.next_best_questions, capability_delta=result.capability_delta, top_5_questions=result.next_best_questions, capability_delta=result.capability_delta,
top_measures=result.top_measures, evidence_requests=result.evidence_requests, top_measures=result.top_measures, evidence_requests=result.evidence_requests,
-unsupported_domains=result.unsupported_domains, completeness_summary=result.completeness_summary)
+unsupported_domains=result.unsupported_domains, completeness_summary=result.completeness_summary,
+capability_labels=labels_for(surfaced))
@@ -24,6 +24,7 @@ from fastapi.responses import FileResponse
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
from classroom_engine.database import get_db from classroom_engine.database import get_db
from ..db.models import EvidenceDB
from .audit_trail_utils import log_audit_trail from .audit_trail_utils import log_audit_trail
from ..db import ( from ..db import (
@@ -310,6 +311,7 @@ async def list_controls_paginated(
) )
async def get_control( async def get_control(
control_id: str, control_id: str,
db: Session = Depends(get_db),
svc: ControlExportService = Depends(get_ctrl_export_service), svc: ControlExportService = Depends(get_ctrl_export_service),
) -> ControlResponse: ) -> ControlResponse:
"""Get a specific control by control_id.""" """Get a specific control by control_id."""
@@ -354,6 +356,7 @@ async def get_control(
async def update_control( async def update_control(
control_id: str, control_id: str,
update: ControlUpdate, update: ControlUpdate,
db: Session = Depends(get_db),
svc: ControlExportService = Depends(get_ctrl_export_service), svc: ControlExportService = Depends(get_ctrl_export_service),
) -> ControlResponse: ) -> ControlResponse:
"""Update a control.""" """Update a control."""
@@ -443,6 +446,7 @@ async def update_control(
async def review_control( async def review_control(
control_id: str, control_id: str,
review: ControlReviewRequest, review: ControlReviewRequest,
db: Session = Depends(get_db),
svc: ControlExportService = Depends(get_ctrl_export_service), svc: ControlExportService = Depends(get_ctrl_export_service),
) -> ControlResponse: ) -> ControlResponse:
"""Mark a control as reviewed with new status.""" """Mark a control as reviewed with new status."""
@@ -21,7 +21,7 @@ Phase 1 Step 4 refactor: handlers delegate to VVTService.
import logging import logging
from typing import Any, List, Optional from typing import Any, List, Optional
-from fastapi import APIRouter, Depends, Query, Request
+from fastapi import APIRouter, Depends, HTTPException, Query, Request
from fastapi.responses import StreamingResponse from fastapi.responses import StreamingResponse
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
@@ -21,6 +21,14 @@ from .observations import (
empirical_distribution, empirical_distribution,
reviewed, reviewed,
) )
from .observation_log import (
HypothesisStats,
ObservationRecord,
aggregate_by_hypothesis,
append_observation,
load_observations,
review_queue,
)
from .signals import ( from .signals import (
ProducedSignal, ProducedSignal,
SignalVocabularyEntry, SignalVocabularyEntry,
@@ -69,4 +77,10 @@ __all__ = [
"ProducedSignal", "ProducedSignal",
"SignalVocabularyEntry", "SignalVocabularyEntry",
"normalize_signals", "normalize_signals",
"ObservationRecord",
"HypothesisStats",
"append_observation",
"load_observations",
"aggregate_by_hypothesis",
"review_queue",
] ]
@@ -143,8 +143,8 @@ def advisor_start(
next_best_questions=next_q, capability_delta=delta, top_measures=measures, next_best_questions=next_q, capability_delta=delta, top_measures=measures,
evidence_requests=evidence, unsupported_domains=unsupported, evidence_requests=evidence, unsupported_domains=unsupported,
completeness_summary=rep.completeness_summary, completeness_summary=rep.completeness_summary,
-headline="%d Anforderungen erkannt · %d automatisch erkannt (Intake) · %d wahrscheinlich (Zertifikate) · %d zu klären"
-% (len(assess.coverage), len(auto_detected), len(probably), len(next_q)))
+headline="%d von %d Anforderungen offen · %d automatisch erkannt (Intake) · %d wahrscheinlich (Zertifikate) · %d zu klären"
+% (len(delta), len(assess.coverage), len(auto_detected), len(probably), len(next_q)))
def apply_answer(known_capabilities: Sequence[str], capability_id: str, answer: str) -> List[str]: def apply_answer(known_capabilities: Sequence[str], capability_id: str, answer: str) -> List[str]:
@@ -0,0 +1,108 @@
"""Observation Log — append-only JSONL store for empirical calibration events (Task 59b v1).
Observations are NOT business data and NOT product-DB data; they are CALIBRATION events for the
knowledge base ("ISO27001 -> SDL confirmed", "TISAX -> supplier security refuted"). So they live with the
other versioned knowledge artifacts (hypotheses, transition patterns, vocabulary), NOT in the product
database: an append-only JSONL log under `knowledge/observations/`. NO migration, NO DB. The empirical
DISTRIBUTION and CONFIDENCE are COMPUTED from this log on demand (computed-not-stored); a hypothesis is
NEVER auto-updated; only REVIEWED observations calibrate (the review gate, enforced in observations.py).
Append-only: each line is one ObservationRecord and lines are NEVER modified in place. A later review is
a NEW line with the same observation_id and reviewed=true; load_observations() reconciles to the latest
per id. You can `rm` the log and recompute, `git diff` it over months, or rebuild confidence under a new
policy. Anonymisation is MANDATORY: customer_archetype is a sector/cert archetype, NEVER a real company
name (this file is committed to git). Time is stamped by the CALLER (no hidden clock) for determinism.
I/O only at the append/load boundary; statistics are pure. Python 3.9 compatible.
"""
from __future__ import annotations
import json
import os
from typing import Dict, List, Optional, Sequence
from pydantic import BaseModel, Field
from .observations import Observation, empirical_confidence, empirical_distribution
_DEFAULT_LOG = os.path.join(
    os.path.dirname(__file__), "..", "..", "knowledge", "observations", "observations.jsonl")
class ObservationRecord(Observation):
    """A persisted observation line: an Observation (with its review gate + observation_type) plus log
    metadata. `observation_id` is stable: a review re-appends the SAME id with reviewed=true."""
    observation_id: str  # stable id; a review re-appends the same id
    timestamp: str = ""  # ISO 8601, stamped by the CALLER (no hidden clock)
    customer_archetype: str = ""  # sector/cert archetype; NEVER a real company name
    evidence: str = ""  # what backs the answer (reference, not the artifact)
    provenance: str = ""  # where the answer came from (audit trail)
    knowledge_version: str = ""  # hypotheses/vocabulary version observed under
class HypothesisStats(BaseModel):
    """Per-hypothesis empirical rollup: all COMPUTED from the log, nothing stored on the hypothesis."""
    hypothesis_id: str
    distribution: Dict[str, int] = Field(default_factory=dict)  # reviewed counts per observation_type
    confidence: Optional[float] = None  # None until a for/against obs is reviewed
    reviewed_count: int = 0
    total_count: int = 0
def append_observation(record: ObservationRecord, path: str = _DEFAULT_LOG) -> None:
    """Append ONE record as a JSON line. Append-only: existing lines are never rewritten."""
    # dirname() is "" for a bare filename; fall back to "." so makedirs does not raise
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    line = json.dumps(record.model_dump(mode="json"), ensure_ascii=False, sort_keys=True)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
def load_observations(path: str = _DEFAULT_LOG, reconcile: bool = True) -> List[ObservationRecord]:
    """Read all records: a single `.jsonl` file or a directory of monthly `.jsonl` files. With
    reconcile, the LATEST record per observation_id wins (a later reviewed=true supersedes the original).
    Returns deterministic order (by observation_id when reconciled, else append order)."""
    files: List[str] = []
    if os.path.isdir(path):
        files = sorted(os.path.join(path, f) for f in os.listdir(path) if f.endswith(".jsonl"))
    elif os.path.exists(path):
        files = [path]
    records: List[ObservationRecord] = []
    for fpath in files:
        with open(fpath, encoding="utf-8") as fh:
            for raw in fh:
                raw = raw.strip()
                if raw:
                    records.append(ObservationRecord(**json.loads(raw)))
    if not reconcile:
        return records
    latest: Dict[str, ObservationRecord] = {}
    for r in records:  # file/append order -> later lines win
        latest[r.observation_id] = r
    return [latest[k] for k in sorted(latest)]
def aggregate_by_hypothesis(records: Sequence[ObservationRecord]) -> List[HypothesisStats]:
    """Per-hypothesis distribution + confidence. The review gate applies inside empirical_distribution/
    empirical_confidence (reviewed-only), so unreviewed observations are counted in total but never
    calibrate. Deterministic order (by hypothesis id)."""
    by_hyp: Dict[str, List[ObservationRecord]] = {}
    for r in records:
        by_hyp.setdefault(r.hypothesis_id, []).append(r)
    out: List[HypothesisStats] = []
    for hyp in sorted(by_hyp):
        obs = by_hyp[hyp]
        out.append(HypothesisStats(
            hypothesis_id=hyp,
            distribution=empirical_distribution(obs),  # reviewed-only (the gate)
            confidence=empirical_confidence(obs),  # None until reviewed for/against
            reviewed_count=sum(1 for o in obs if o.reviewed),
            total_count=len(obs)))
    return out
def review_queue(records: Sequence[ObservationRecord]) -> List[ObservationRecord]:
    """The reviewer's worklist: observations not yet reviewed. Calibration ignores these until a reviewer
    accepts them (Observation -> Review -> Accepted -> Knowledge recomputed), never Observation -> conf++."""
    return [r for r in records if not r.reviewed]
@@ -9,7 +9,7 @@ It adds NO new reasoning logic — it only exposes what exists. No DB, no persis
from __future__ import annotations from __future__ import annotations
import os import os
from typing import Any, Dict, List, Sequence, Tuple from typing import Any, Dict, Iterable, List, Sequence, Tuple
import yaml import yaml
@@ -37,6 +37,13 @@ def _load(*parts: str) -> Any:
_HYP_LIB = [CapabilityHypothesis(**h) for h in _load("certification_hypotheses", "hypotheses.yaml")["hypotheses"]] _HYP_LIB = [CapabilityHypothesis(**h) for h in _load("certification_hypotheses", "hypotheses.yaml")["hypotheses"]]
_VOCAB = [SignalVocabularyEntry(**v) for v in _load("onboarding", "signal_vocabulary.yaml")["signals"]] _VOCAB = [SignalVocabularyEntry(**v) for v in _load("onboarding", "signal_vocabulary.yaml")["signals"]]
_SIGNAL_MAP = [SignalMapping(**m) for m in _load("onboarding", "intake_signal_map.yaml")["mappings"]] _SIGNAL_MAP = [SignalMapping(**m) for m in _load("onboarding", "intake_signal_map.yaml")["mappings"]]
_LABELS: Dict[str, str] = _load("onboarding", "capability_labels.yaml")["labels"]
def labels_for(capability_ids: Iterable[str]) -> Dict[str, str]:
    """Human labels (DE) for the given capability ids; presentation only. Ids without a curated label
    are omitted (the frontend falls back to a prettified id). Deduped, deterministic."""
    return {c: _LABELS[c] for c in dict.fromkeys(capability_ids) if c in _LABELS}
# target id -> transition pattern that defines its required capabilities (curated registry) # target id -> transition pattern that defines its required capabilities (curated registry)
_TARGET_PATTERNS = { _TARGET_PATTERNS = {
@@ -53,9 +60,10 @@ def supported_targets() -> List[str]:
def _target(target_id: str) -> Tuple[List[TargetRequirement], Dict[str, List[str]]]: def _target(target_id: str) -> Tuple[List[TargetRequirement], Dict[str, List[str]]]:
pat = _load("transition_patterns", _TARGET_PATTERNS[target_id]) pat = _load("transition_patterns", _TARGET_PATTERNS[target_id])
reqs = [TargetRequirement(capability_id=a["capability"]) for a in pat["likely_covered"]] reqs = [TargetRequirement(capability_id=a["capability"], rationale=a.get("reviewable_claim", "")) for a in pat["likely_covered"]]
reqs += [TargetRequirement(capability_id=d["capability"], question_intent=d.get("needed_information", "verify_existence"), reqs += [TargetRequirement(capability_id=d["capability"], question_intent=d.get("needed_information", "verify_existence"),
expected_evidence=d.get("expected_evidence", [])) for d in pat["delta_requirements"]] rationale=d.get("why_asked", ""), expected_evidence=d.get("expected_evidence", []))
for d in pat["delta_requirements"]]
covers = {d["capability"]: d.get("covers_targets", []) for d in pat["delta_requirements"]} covers = {d["capability"]: d.get("covers_targets", []) for d in pat["delta_requirements"]}
return reqs, covers return reqs, covers
@@ -104,7 +104,8 @@ def assess_transition(
) )
buckets[status].append(req.capability_id) buckets[status].append(req.capability_id)
if status in _REQUESTABLE: if status in _REQUESTABLE:
reason, prio = _REQUESTABLE[status] default_reason, prio = _REQUESTABLE[status]
reason = req.rationale or default_reason # curated human text wins over the generic fallback
requests.append( requests.append(
TransitionQuestionRequest( TransitionQuestionRequest(
capability_id=req.capability_id, capability_id=req.capability_id,
@@ -70,6 +70,7 @@ class TargetRequirement(BaseModel):
capability_id: str # MCAP-... capability_id: str # MCAP-...
question_intent: str = "verify_existence" # passed through to the request, not rendered question_intent: str = "verify_existence" # passed through to the request, not rendered
rationale: str = "" # curated human text (e.g. why_asked / reviewable_claim) — surfaced as the request reason
expected_evidence: List[str] = Field(default_factory=list) expected_evidence: List[str] = Field(default_factory=list)
source_control_id: Optional[str] = None source_control_id: Optional[str] = None
supports_obligations: List[str] = Field(default_factory=list) supports_obligations: List[str] = Field(default_factory=list)
@@ -0,0 +1,2 @@
# Append-only observation log (Task 59b). Real lines (observations.jsonl / YYYY-MM.jsonl) are written at
# runtime via compliance/onboarding/observation_log.py. Anonymised archetypes only — NEVER real company names.
@@ -0,0 +1,45 @@
# Human-readable capability labels (DE) — presentation only, reusable across all targets.
# A capability id is the stable machine identity; this maps it to an expert-facing label for the UI.
# Curated knowledge (draft — to be corrected by the domain expert). Missing ids fall back to a
# prettified id in the frontend. NO real company names. Keep labels short + concrete.
labels:
# ── ISMS / ISO 27001 core ───────────────────────────────────────────────
information_security_management: "Informationssicherheits-Managementsystem (ISMS)"
access_control_and_authentication: "Zugriffskontrolle & Authentifizierung"
asset_and_configuration_management: "Asset- & Konfigurationsverwaltung"
cryptography: "Kryptographie / Verschlüsselung"
incident_management: "Security-Incident-Management"
security_awareness_training: "Security-Awareness-Schulungen"
supplier_security: "Lieferanten-Sicherheit"
security_logging_and_monitoring: "Security-Logging & Monitoring"
technical_vulnerability_management: "Technisches Schwachstellen-Management"
# ── TISAX / VDA-spezifisch ──────────────────────────────────────────────
prototype_protection: "Prototypenschutz (physisch & logisch)"
tisax_label_scope_selection: "TISAX-Label-/Scope-Festlegung"
tisax_assessment_via_enx: "TISAX-Assessment über die ENX-Plattform"
vda_isa_self_assessment: "VDA-ISA-Selbstauskunft"
data_protection_processing_on_behalf: "Auftragsverarbeitung (Art. 28 DSGVO)"
physical_security: "Physische Sicherheit / Zutrittskontrolle"
# ── QM / ISO 9001 ───────────────────────────────────────────────────────
document_and_change_control: "Dokumenten- & Änderungslenkung"
supplier_evaluation: "Lieferantenbewertung"
release_and_approval_process: "Freigabe- & Genehmigungsprozess"
ce_conformity_assessment_and_technical_documentation: "CE-Konformitätsbewertung & technische Dokumentation"
# ── CRA / Produkt-Cybersecurity ─────────────────────────────────────────
sbom_creation: "SBOM-Erstellung (Software-Stückliste)"
coordinated_vulnerability_disclosure: "Coordinated Vulnerability Disclosure (CVD)"
secure_development_lifecycle: "Sicherer Entwicklungslebenszyklus (SDLC)"
secure_signed_update_distribution: "Sichere, signierte Update-Verteilung"
security_update_support_period: "Sicherheits-Update-Supportzeitraum"
product_cyber_risk_assessment: "Produkt-Cyber-Risikobewertung"
exploited_vuln_and_incident_reporting: "Meldung ausgenutzter Schwachstellen & Vorfälle"
public_security_advisories: "Öffentliche Security Advisories"
cybersecurity_management_system: "Cybersecurity-Managementsystem (CSMS)"
# ── MaschinenVO / Safety ────────────────────────────────────────────────
machine_safety_risk_assessment: "Maschinen-Risikobeurteilung"
mechanical_safety_and_guards: "Mechanische Sicherheit & Schutzeinrichtungen"
operating_instructions_and_safety_information: "Betriebsanleitung & Sicherheitshinweise"
protection_against_corruption_of_safety_functions: "Schutz der Sicherheitsfunktionen vor Manipulation"
# ── Umwelt ──────────────────────────────────────────────────────────────
environmental_management_documentation: "Umweltmanagement-Dokumentation"
@@ -0,0 +1,73 @@
"""Observation Log — append-only JSONL store + computed statistics (Task 59b/c v1).
Pins the user's decision (2026-06-28): observations are CALIBRATION data, not product data -> an
append-only JSONL log under knowledge/observations/, NO DB, NO migration. Distribution and confidence are
COMPUTED from the log; only REVIEWED observations calibrate (review gate); a later review is a new line
that supersedes by observation_id. Nothing is ever written back to a hypothesis.
"""
from __future__ import annotations
from compliance.onboarding import (
    ObservationRecord,
    ObservationType,
    aggregate_by_hypothesis,
    append_observation,
    load_observations,
    review_queue,
)
def _rec(oid, hyp, otype, reviewed=False, **kw):
    return ObservationRecord(
        observation_id=oid, hypothesis_id=hyp, observation_type=otype, reviewed=reviewed,
        timestamp="2026-07-01T00:00:00Z", customer_archetype="machine_builder+ISO27001", **kw)
def test_append_only_round_trip(tmp_path):
    p = str(tmp_path / "obs.jsonl")
    append_observation(_rec("o1", "HYP-secure_dev", ObservationType.CONFIRMED, reviewed=True), p)
    append_observation(_rec("o2", "HYP-secure_dev", ObservationType.REFUTED, reviewed=True), p)
    recs = load_observations(p)
    assert {r.observation_id for r in recs} == {"o1", "o2"}
    assert all(r.customer_archetype == "machine_builder+ISO27001" for r in recs)  # anonymised archetype, not a name
def test_review_supersedes_by_id_append_only(tmp_path):
    p = str(tmp_path / "obs.jsonl")
    append_observation(_rec("o1", "HYP-x", ObservationType.CONFIRMED, reviewed=False), p)  # raw answer
    append_observation(_rec("o1", "HYP-x", ObservationType.CONFIRMED, reviewed=True,
                            reviewed_by="anna"), p)  # later review event
    assert len(load_observations(p, reconcile=False)) == 2  # both lines kept (append-only)
    recs = load_observations(p)  # reconciled
    assert len(recs) == 1 and recs[0].reviewed and recs[0].reviewed_by == "anna"
def test_statistics_apply_the_review_gate(tmp_path):
    p = str(tmp_path / "obs.jsonl")
    append_observation(_rec("a", "HYP-sdl", ObservationType.CONFIRMED, reviewed=True), p)
    append_observation(_rec("b", "HYP-sdl", ObservationType.CONFIRMED, reviewed=True), p)
    append_observation(_rec("c", "HYP-sdl", ObservationType.REFUTED, reviewed=True), p)
    append_observation(_rec("d", "HYP-sdl", ObservationType.CONFIRMED, reviewed=False), p)  # unreviewed -> ignored
    stats = {s.hypothesis_id: s for s in aggregate_by_hypothesis(load_observations(p))}
    s = stats["HYP-sdl"]
    assert s.total_count == 4 and s.reviewed_count == 3
    assert s.distribution["confirmed"] == 2 and s.distribution["refuted"] == 1  # unreviewed one excluded
    assert s.confidence == round(2 / 3, 2)  # (2 + 0.5*0) / 3
def test_review_queue_lists_unreviewed(tmp_path):
    p = str(tmp_path / "obs.jsonl")
    append_observation(_rec("a", "HYP-y", ObservationType.CONFIRMED, reviewed=True), p)
    append_observation(_rec("b", "HYP-y", ObservationType.PARTIAL, reviewed=False), p)
    q = review_queue(load_observations(p))
    assert [r.observation_id for r in q] == ["b"]
def test_load_directory_of_monthly_files(tmp_path):
    d = tmp_path / "observations"
    d.mkdir()
    append_observation(_rec("a", "HYP-z", ObservationType.CONFIRMED, reviewed=True), str(d / "2026-06.jsonl"))
    append_observation(_rec("b", "HYP-z", ObservationType.REFUTED, reviewed=True), str(d / "2026-07.jsonl"))
    recs = load_observations(str(d))
    assert {r.observation_id for r in recs} == {"a", "b"}
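Review note: `observations.py` is not part of this hunk, so the confidence rule that `test_statistics_apply_the_review_gate` pins is only visible through its comment `(2 + 0.5*0) / 3`. The sketch below is an assumption reconstructed from that comment: confirmed counts 1, partial counts 0.5, refuted counts 0, only reviewed observations take part, and the result is rounded to two places. `Obs` and `sketch_confidence` are hypothetical names for illustration, not the module's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Obs:
    """Hypothetical stand-in for an observation (not the real pydantic model)."""
    observation_type: str  # "confirmed" | "partial" | "refuted"
    reviewed: bool = False


# Assumed weights, inferred from the test comment "(2 + 0.5*0) / 3".
_WEIGHTS = {"confirmed": 1.0, "partial": 0.5, "refuted": 0.0}


def sketch_confidence(observations: Sequence[Obs]) -> Optional[float]:
    """Weighted share of supporting evidence among REVIEWED observations only."""
    scored = [o for o in observations if o.reviewed and o.observation_type in _WEIGHTS]
    if not scored:
        return None  # nothing reviewed yet, so nothing calibrates
    total = sum(_WEIGHTS[o.observation_type] for o in scored)
    return round(total / len(scored), 2)
```

Under these assumptions the unreviewed observation in the test leaves the numerator and the denominator untouched, which is the behaviour the review gate is meant to guarantee.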
@@ -73,6 +73,17 @@ def test_partial_signal_surfaces_as_indication_and_is_still_asked():
assert "secure_development_lifecycle" in asked or "secure_development_lifecycle" in d["capability_delta"] assert "secure_development_lifecycle" in asked or "secure_development_lifecycle" in d["capability_delta"]
def test_questions_carry_curated_text_and_human_labels():
    # the curated why_asked from the transition pattern must reach the question (not the generic
    # fallback "Keine Anhaltspunkte ... klären"), and surfaced capabilities get human labels.
    body = dict(_BODY, certifications=["ISO27001"], target="TISAX", scanner_findings=[])
    r = _client.post("/onboarding/advisor-start", json=body)
    assert r.status_code == 200, r.text
    d = r.json()
    assert any("Keine Anhaltspunkte" not in q["why"] for q in d["top_5_questions"])  # real expert text surfaced
    assert d["capability_labels"].get("vda_isa_self_assessment") == "VDA-ISA-Selbstauskunft"
def test_unknown_target_is_404(): def test_unknown_target_is_404():
body = dict(_BODY, target="NOPE") body = dict(_BODY, target="NOPE")
r = _client.post("/onboarding/advisor-start", json=body) r = _client.post("/onboarding/advisor-start", json=body)
@@ -128,3 +128,74 @@ SBOM ✓ → Vuln ✓ → Registry v1 (DIESE Spec) → Ontologie/Beziehung
Rationale: the schema is cheap to change now; at 300–1000 obligations every schema change becomes
expensive. Progress is measured by whether each new obligation improves the registry,
not by new controls.
## Scope audit (review step, MANDATORY per cut)
The registry models **manufacturer obligations**. Provisions addressed to authorities / notified
bodies / member states (penalties, market surveillance, requirements for conformity assessment
bodies) are enforcement/institutional law. **Principle: addressee of the norm
⊥ manufacturer's duty to act.** `scope` attribute axis (enum, NOT a new object class):
- `in_scope`: the norm addresses the manufacturer directly (default).
- `out_of_scope`: pure state/enforcement/institutional law (addressee ≠ manufacturer, NO
indirect manufacturer obligation). Filtered out of `obligation_join_keys.json`. Precedent: CSIRT/ENISA.
- `derived_obligation`: the norm primarily addresses another role but INDIRECTLY creates a
manufacturer duty to act → **stays in the set** (`scope_split_candidate` marks a later
split into norm addressee ↔ derived obligation; do not pin it down prematurely).
**Gate rule:**
```
Every new obligation cut must pass the scope audit.
Findings with authority-/institution-addressed obligations are documented.
Automatic reclassification is forbidden until there is an explicit review go.
```
**Tool separation (FLAG ⊥ MUTATE):**
- `scope_audit.py`: **flags only** (scans all registries → `scope_audit_findings.json`; never mutates).
- `validate_registry.py`: surfaces unclassified authority-/institution-addressed obligations per cut as a
**non-fatal warning** (does not block, does not mutate).
- `apply_scope_classification.py`: **mutates** (sets `scope`); runs ONLY after an explicit review go
(changes `join_keys` + compliance-execution sync → human/coordinated).
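The FLAG ⊥ MUTATE split can be sketched in a few lines. This is an illustration under stated assumptions, not the code of `scope_audit.py` or `apply_scope_classification.py`: the record keys `id` and `addressee`, the addressee values, and the function names are all hypothetical. Only the three `scope` values and the rule that mutation needs an explicit review go come from this spec.

```python
from enum import Enum
from typing import Any, Dict, List


class Scope(str, Enum):
    IN_SCOPE = "in_scope"                      # default: norm addresses the manufacturer
    OUT_OF_SCOPE = "out_of_scope"              # filtered out of the join keys
    DERIVED_OBLIGATION = "derived_obligation"  # stays in the set


# Hypothetical addressee markers, for illustration only.
_AUTHORITY_ADDRESSEES = {"market_surveillance_authority", "notified_body", "member_state"}


def flag_findings(obligations: List[Dict[str, Any]]) -> List[str]:
    """FLAG: list ids of unclassified authority-addressed obligations. Never mutates."""
    return [o["id"] for o in obligations
            if "scope" not in o and o.get("addressee") in _AUTHORITY_ADDRESSEES]


def apply_classification(obligations: List[Dict[str, Any]],
                         decisions: Dict[str, Scope],
                         review_go: bool) -> List[Dict[str, Any]]:
    """MUTATE: set `scope` on copies, and only after an explicit review go."""
    if not review_go:
        raise PermissionError("scope reclassification requires an explicit review go")
    return [dict(o, scope=decisions[o["id"]].value) if o["id"] in decisions else dict(o)
            for o in obligations]


def join_key_ids(obligations: List[Dict[str, Any]]) -> List[str]:
    """Join-key filter: a missing scope means in_scope; only out_of_scope is dropped."""
    return [o["id"] for o in obligations
            if o.get("scope", Scope.IN_SCOPE.value) != Scope.OUT_OF_SCOPE.value]
```

Keeping the flagging function free of side effects means it can run on every cut, while the mutating step stays behind a deliberate human decision.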
## Path 1 (Weg 1): obligation→norm citability (data readiness; UI deferred)
**Two citation levels, NOT to be confused (both are needed in the long run):**
- **RAG evidence citation** (visible in the advisor today): `question/answer → evidence chunk`
(KB-v2 `article_label`/`source_url`, `[n]` citations). Source = compliance/KB session `/retrieve`.
- **Obligation→norm citation** (the `norm_id` join): `obligation → concrete legal basis` (KB-v2 unit).
Source = this registry. **Path 1** = making this citation visible.
**Data readiness (as of 2026-07-01):**
| Building block | Status |
|---|---|
| `obligations/*.json` → `legal_basis.norm_ids` | ✅ 62/64 joinable (53 annex + 16 article, KB-v2-verified) |
| KB-v2 join targets (CRA Art1-71 · MaschVO Art1-54 · annexes) | ✅ confirmed |
| `obligation-status` endpoint + traversal (`obligation_id→citation_unit→Controls→Evidence`) | ✅ present; exposes `LegalBasis = citation_units` |
| Runtime contract `obligation_join_keys.json` carries `norm_ids` | ❌ only `citation_units` (anchor strings), NO `norm_ids` |
| Go `ObligationKey` struct / `CitationSpans` | ❌ no `NormIDs` field; `AssessObligationStatus` hard-codes `CitationSpans:"pending"` (compliance_status.go) |
**Deferred sequence (UI last), deliberately NOT built:**
1. **Data prep** (non-UI, domain 2): export `norm_ids` in `export_join_keys.py` → `obligation_join_keys.json`.
2. **Build** (ai-sdk, coordinated): `NormIDs` in `ObligationKey`; `AssessObligationStatus` fills `CitationSpans` from `norm_ids` instead of `"pending"`.
3. **UI** (later): "This obligation is based on **CRA Annex I / Art. 13**".
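Step 1 of the sequence above could look roughly like the following. It is a sketch, not the contents of `export_join_keys.py`: the obligation key `obligation_id` and both function names are assumptions; only `legal_basis` and `norm_ids` appear in the registry JSON of this change.

```python
from typing import Any, Dict, List


def collect_norm_ids(obligation: Dict[str, Any]) -> List[str]:
    """Union of norm_ids over all legal_basis entries, deduplicated and sorted."""
    ids = set()
    for basis in obligation.get("legal_basis", []):
        ids.update(basis.get("norm_ids", []))  # entries without norm_ids contribute nothing
    return sorted(ids)


def export_join_keys(obligations: List[Dict[str, Any]]) -> Dict[str, Dict[str, List[str]]]:
    """Map each (assumed) obligation_id to the norm_ids its legal basis points at."""
    return {o["obligation_id"]: {"norm_ids": collect_norm_ids(o)} for o in obligations}
```

Sorting the ids keeps the exported file stable across runs, so a `git diff` of the runtime contract shows only real changes.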
**Design requirement for phase B (runtime contract): a generic legal-reference envelope instead of
flat `norm_ids`.** `norm_ids` are only the FIRST kind of legal reference. So that the runtime contract
(`obligation_join_keys.json` + Go `ObligationKey`) does NOT have to change again for new reference
kinds, every entry carries an **extensible envelope** (optional keys, additive):
```json
"legal_reference": { // envelope, NOT "legal_basis" (collides with the obligation array)
"norm_ids": [...], // binding primary law (article/annex) ← phase B builds ONLY this
"citation_units": [...], // human-readable anchor strings (interim bridge)
"recital_ids": [...], // recitals (later, additive)
"guidance_ids": [...], // guidance EDPB/DSK (later, additive)
"case_law_ids": [...], // court rulings (later, additive)
"interpretation_ids": [...] // interpretation aids (later, additive)
}
```
Principle: a new reference **kind** = a new optional key in the envelope → the contract and the Go
struct stay stable (like the `scope` axis: extend via attributes, not via restructuring). **The
binding ⊥ guidance separation is preserved** (`norm_ids`/`case_law` = binding · `guidance_ids` = soft law),
consistent with `legal_basis` vs. `guidance_basis` and the Authority Router. In phase B ONLY
`norm_ids` is populated; the remaining keys are reserved space, not a build order.
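A consumer of the envelope might read it as sketched below. This is an illustration in Python rather than the Go `ObligationKey` struct, and `LegalReference` and `binding_references` are hypothetical names. The key names come from the envelope above; treating every key as optional is what lets producers add reference kinds later without breaking readers.

```python
from typing import Any, Dict, List, TypedDict


class LegalReference(TypedDict, total=False):
    """All keys optional (total=False): readers must tolerate any subset."""
    norm_ids: List[str]
    citation_units: List[str]
    recital_ids: List[str]
    guidance_ids: List[str]
    case_law_ids: List[str]
    interpretation_ids: List[str]


# Binding sources per the spec; guidance stays soft law and is excluded here.
_BINDING_KEYS = ("norm_ids", "case_law_ids")


def binding_references(entry: Dict[str, Any]) -> List[str]:
    """Collect binding reference ids; missing envelope or keys yield an empty list."""
    ref: LegalReference = entry.get("legal_reference", {})
    return [rid for key in _BINDING_KEYS for rid in ref.get(key, [])]
```

Because the reader only looks up keys it knows, an entry that later gains `recital_ids` or `interpretation_ids` passes through unchanged.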
+98 -29
@@ -23,7 +23,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I Part II (1)", "anchor": "Annex I Part II (1)",
"citation": "SBOM in gängigem maschinenlesbarem Format, mind. Top-Level-Abhängigkeiten" "citation": "SBOM in gängigem maschinenlesbarem Format, mind. Top-Level-Abhängigkeiten",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -128,7 +132,7 @@
"member_count": 85, "member_count": 85,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -149,7 +153,12 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Art. 3(36) i.V.m. Annex I Part II (1)", "anchor": "Art. 3(36) i.V.m. Annex I Part II (1)",
"citation": "SBOM-Definition: formale Aufzeichnung enthaltener Komponenten und Abhängigkeiten" "citation": "SBOM-Definition: formale Aufzeichnung enthaltener Komponenten und Abhängigkeiten",
"norm_ids": [
"EU-CRA-AnhangI",
"EU-CRA-Art3"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -188,7 +197,7 @@
"member_count": 24, "member_count": 24,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -209,7 +218,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I Part II (1)", "anchor": "Annex I Part II (1)",
"citation": "gängiges, maschinenlesbares Format" "citation": "gängiges, maschinenlesbares Format",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -248,7 +261,7 @@
"member_count": 19, "member_count": 19,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -269,7 +282,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I Part II (1)", "anchor": "Annex I Part II (1)",
"citation": "SBOM während Support-Zeitraum führen" "citation": "SBOM während Support-Zeitraum führen",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -315,7 +332,7 @@
"member_count": 31, "member_count": 31,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -476,7 +493,12 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Art. 31 / Annex I Part II (1)", "anchor": "Art. 31 / Annex I Part II (1)",
"citation": "Vorlage der SBOM auf begründetes Verlangen der Marktüberwachungsbehörde" "citation": "Vorlage der SBOM auf begründetes Verlangen der Marktüberwachungsbehörde",
"norm_ids": [
"EU-CRA-AnhangI",
"EU-CRA-Art31"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -493,7 +515,7 @@
"member_count": 8, "member_count": 8,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -514,7 +536,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Art. 31(4)", "anchor": "Art. 31(4)",
"citation": "Marktüberwachungsbehörden wahren Vertraulichkeit der erhaltenen Informationen" "citation": "Marktüberwachungsbehörden wahren Vertraulichkeit der erhaltenen Informationen",
"norm_ids": [
"EU-CRA-Art31"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -539,7 +565,7 @@
"member_count": 10, "member_count": 10,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -600,7 +626,12 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Art. 31 i.V.m. Annex VII", "anchor": "Art. 31 i.V.m. Annex VII",
"citation": "technische Dokumentation muss SBOM-relevante Nachweise enthalten" "citation": "technische Dokumentation muss SBOM-relevante Nachweise enthalten",
"norm_ids": [
"EU-CRA-AnhangVII",
"EU-CRA-Art31"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -628,7 +659,7 @@
"member_count": 13, "member_count": 13,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -649,7 +680,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I Part II (1)", "anchor": "Annex I Part II (1)",
"citation": "Komponenten identifizieren und dokumentieren, einschl. SBOM" "citation": "Komponenten identifizieren und dokumentieren, einschl. SBOM",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -717,7 +752,7 @@
"member_count": 48, "member_count": 48,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -738,7 +773,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I Part II (1)", "anchor": "Annex I Part II (1)",
"citation": "Schwachstellen behandeln und beheben" "citation": "Schwachstellen behandeln und beheben",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -819,7 +858,7 @@
"member_count": 61, "member_count": 61,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -840,7 +879,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I Part II (2) & (8)", "anchor": "Annex I Part II (2) & (8)",
"citation": "Schwachstellen unverzüglich beheben, kostenlose Sicherheitsupdates" "citation": "Schwachstellen unverzüglich beheben, kostenlose Sicherheitsupdates",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -965,7 +1008,7 @@
"member_count": 110, "member_count": 110,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"merged_into": "provide_security_updates", "merged_into": "provide_security_updates",
"status": "deprecated_alias", "status": "deprecated_alias",
@@ -989,7 +1032,12 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Article 13(8) & Annex VII", "anchor": "Article 13(8) & Annex VII",
"citation": "Schwachstellenbehandlungsprozesse einrichten und in technischer Doku belegen" "citation": "Schwachstellenbehandlungsprozesse einrichten und in technischer Doku belegen",
"norm_ids": [
"EU-CRA-AnhangVII",
"EU-CRA-Art13"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1114,7 +1162,7 @@
"member_count": 105, "member_count": 105,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -1135,7 +1183,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I Part II (5)", "anchor": "Annex I Part II (5)",
"citation": "Coordinated Vulnerability Disclosure Policy einrichten" "citation": "Coordinated Vulnerability Disclosure Policy einrichten",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1233,7 +1285,7 @@
"member_count": 78, "member_count": 78,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -1254,7 +1306,12 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Article 14 & Article 16", "anchor": "Article 14 & Article 16",
"citation": "Meldepflicht aktiv ausgenutzter Schwachstellen über Single Reporting Platform" "citation": "Meldepflicht aktiv ausgenutzter Schwachstellen über Single Reporting Platform",
"norm_ids": [
"EU-CRA-Art14",
"EU-CRA-Art16"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -1294,7 +1351,7 @@
"member_count": 31, "member_count": 31,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
}, },
{ {
@@ -1315,7 +1372,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I Part II (4) & (6)", "anchor": "Annex I Part II (4) & (6)",
"citation": "Informationen über behobene Schwachstellen teilen und offenlegen" "citation": "Informationen über behobene Schwachstellen teilen und offenlegen",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1335,7 +1396,7 @@
"member_count": 5, "member_count": 5,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft" "review_status": "draft"
} }
], ],
@@ -1581,5 +1642,13 @@
"produces_evidence_for", "produces_evidence_for",
"implements", "implements",
"derived_from" "derived_from"
] ],
"norm_id_contract": {
"convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
"article_status": "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed",
"source": "Board Compliance/KB-v2 2026-07-01",
"kb_v2_verification": "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); 3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
}
} }
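The `norm_id_contract` block above states the ID convention only as a string (`EU-<ACT>-Anhang<ROM>` at annex level, `EU-<ACT>-Art<N>` for articles, chapters without a KB-v2 unit). A minimal sketch of how such a mapping from an `anchor` string to `norm_ids` and `norm_id_status` could look, assuming regex-based parsing; the function name `anchor_to_norm_ids` and the patterns are hypothetical and are not taken from the repository:

```python
import re

# Hypothetical patterns; the real compiler's parsing rules are not shown in this diff.
ANNEX_RE = re.compile(r"\b(?:Annex|Anhang)\s+([IVXLC]+)\b")
ARTICLE_RE = re.compile(r"\b(?:Article|Art\.?)\s*(\d+)")
CHAPTER_RE = re.compile(r"\bKapitel\s+([IVXLC]+(?:/[IVXLC]+)*)")


def anchor_to_norm_ids(act, anchor):
    """Derive (norm_ids, norm_id_status) from a free-text anchor.

    Annex granularity is coarse by contract: 'Annex I Part II (1)'
    collapses to 'EU-<ACT>-AnhangI'; Part/point levels are dropped.
    """
    ids = []

    # Chapters are not minted as KB units, so they get their own status.
    chapters = []
    for group in CHAPTER_RE.findall(anchor):
        chapters.extend(group.split("/"))
    if chapters:
        ids = [f"EU-{act}-Kapitel{rom}" for rom in chapters]
        return ids, "chapter_no_kb_unit"

    annexes = ANNEX_RE.findall(anchor)
    articles = ARTICLE_RE.findall(anchor)
    ids.extend(f"EU-{act}-Anhang{rom}" for rom in annexes)
    ids.extend(f"EU-{act}-Art{num}" for num in articles)

    # Drop duplicates while keeping first-seen order.
    ids = list(dict.fromkeys(ids))

    if annexes:
        return ids, "annex_confirmed"
    if articles:
        return ids, "article_confirmed"
    # Assumption: an anchor matching nothing is left unresolved, not guessed.
    return [], None


if __name__ == "__main__":
    print(anchor_to_norm_ids("CRA", "Article 14 & Article 16"))
    print(anchor_to_norm_ids("MaschVO", "Kapitel V/VI (Marktüberwachung, Schutzklauselverfahren)"))
```

The sketch gives a mixed annex-plus-article anchor the status `annex_confirmed`, mirroring the one such entry in this diff; whether that is the intended rule is an assumption.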
+44 -12
@@ -26,7 +26,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I (2)(d)", "anchor": "Annex I (2)(d)",
"citation": "protect... by ensuring protection from unauthorised access, including by reporting... appropriate control mechanisms incl. authentication, identity or access management" "citation": "protect... by ensuring protection from unauthorised access, including by reporting... appropriate control mechanisms incl. authentication, identity or access management",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1391,7 +1395,7 @@
"member_count": 1339, "member_count": 1339,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.95, "discovery_confidence": 0.95,
@@ -4682,7 +4686,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I (2)(e)", "anchor": "Annex I (2)(e)",
"citation": "protect the confidentiality... through state-of-the-art mechanisms incl. encryption" "citation": "protect the confidentiality... through state-of-the-art mechanisms incl. encryption",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -5277,7 +5285,7 @@
"member_count": 533, "member_count": 533,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -5650,7 +5658,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I (2)(e)", "anchor": "Annex I (2)(e)",
"citation": "protect the confidentiality of stored, transmitted or otherwise processed data" "citation": "protect the confidentiality of stored, transmitted or otherwise processed data",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -5994,7 +6006,7 @@
"member_count": 315, "member_count": 315,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -6326,7 +6338,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I (2)(a)", "anchor": "Annex I (2)(a)",
"citation": "be made available with a secure by default configuration" "citation": "be made available with a secure by default configuration",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -6347,7 +6363,7 @@
"member_count": 9, "member_count": 9,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -8408,7 +8424,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I (2)(e)", "anchor": "Annex I (2)(e)",
"citation": "protect the confidentiality of... transmitted... data... incl. encryption in transit" "citation": "protect the confidentiality of... transmitted... data... incl. encryption in transit",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -8485,7 +8505,7 @@
"member_count": 57, "member_count": 57,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -10214,7 +10234,11 @@
"source": "CRA", "source": "CRA",
"regulation_code": "eu_2024_2847", "regulation_code": "eu_2024_2847",
"anchor": "Annex I (2)(c)", "anchor": "Annex I (2)(c)",
"citation": "ensure that vulnerabilities can be addressed through security updates... ensuring integrity" "citation": "ensure that vulnerabilities can be addressed through security updates... ensuring integrity",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -10273,7 +10297,7 @@
"member_count": 37, "member_count": 37,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.78, "discovery_confidence": 0.78,
@@ -10458,5 +10482,13 @@
], ],
"from_obligations": 54, "from_obligations": 54,
"to_obligations": 29 "to_obligations": 29
},
"norm_id_contract": {
"convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
"article_status": "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed",
"source": "Board Compliance/KB-v2 2026-07-01",
"kb_v2_verification": "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); 3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
} }
} }
+21 -5
@@ -23,7 +23,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I Part I (2)(j)", "anchor": "Annex I Part I (2)(j)",
"citation": "limit attack surfaces, including external interfaces" "citation": "limit attack surfaces, including external interfaces",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -38,7 +42,7 @@
"component_remote_interface_security" "component_remote_interface_security"
], ],
"primary_implementation": "NIST CM-7", "primary_implementation": "NIST CM-7",
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "core_from_5b" "review_status": "core_from_5b"
}, },
{ {
@@ -56,7 +60,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I Part I (2)(f)", "anchor": "Annex I Part I (2)(f)",
"citation": "protect the integrity of stored, transmitted or processed data, software and configuration" "citation": "protect the integrity of stored, transmitted or processed data, software and configuration",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -74,9 +82,17 @@
"code_signing" "code_signing"
], ],
"primary_implementation": "NIST SI-7", "primary_implementation": "NIST SI-7",
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "core_from_5b" "review_status": "core_from_5b"
} }
], ],
"relationships": [] "relationships": [],
"norm_id_contract": {
"convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
"article_status": "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed",
"source": "Board Compliance/KB-v2 2026-07-01",
"kb_v2_verification": "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); 3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
}
} }
+45 -13
@@ -44,7 +44,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I Part I (2)(k)", "anchor": "Annex I Part I (2)(k)",
"citation": "monitor relevant internal activity, including the access to or modification of data, services or functions, where applicable, through recording and monitoring" "citation": "monitor relevant internal activity, including the access to or modification of data, services or functions, where applicable, through recording and monitoring",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1038,7 +1042,7 @@
"member_count": 961, "member_count": 961,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.95, "discovery_confidence": 0.95,
@@ -1066,7 +1070,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I Part I (2)(k)", "anchor": "Annex I Part I (2)(k)",
"citation": "recording and monitoring access to or modification of data, services or functions" "citation": "recording and monitoring access to or modification of data, services or functions",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1601,7 +1609,7 @@
"member_count": 505, "member_count": 505,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.92, "discovery_confidence": 0.92,
@@ -1629,7 +1637,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I Part I (2)(k)", "anchor": "Annex I Part I (2)(k)",
"citation": "monitor relevant internal activity including access to or modification of functions" "citation": "monitor relevant internal activity including access to or modification of functions",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1878,7 +1890,7 @@
"member_count": 226, "member_count": 226,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -1906,7 +1918,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I Part I (2)(k)", "anchor": "Annex I Part I (2)(k)",
"citation": "recording and monitoring ... in a secure manner" "citation": "recording and monitoring ... in a secure manner",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -2442,7 +2458,7 @@
"member_count": 505, "member_count": 505,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.93, "discovery_confidence": 0.93,
@@ -2470,7 +2486,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I Part I (2)(k)", "anchor": "Annex I Part I (2)(k)",
"citation": "in a secure manner" "citation": "in a secure manner",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -2550,7 +2570,7 @@
"member_count": 59, "member_count": 59,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.88, "discovery_confidence": 0.88,
@@ -2942,7 +2962,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I Part I (2)(k)", "anchor": "Annex I Part I (2)(k)",
"citation": "monitor relevant internal activity" "citation": "monitor relevant internal activity",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -3251,7 +3275,7 @@
"member_count": 283, "member_count": 283,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -4737,5 +4761,13 @@
], ],
"note": "M8/M5/M81 KI-/FRT- bzw. domaenenspezifische Trainings-/PIN-/Biometrie-Protokollierung (AI Act/sektorale Regulierung); M58/M59/M71/M56/M63 reine DSGVO-/datenschutzrechtliche bzw. nationale Verwaltungs-Protokollierungspflichten, nicht CRA Annex I (2)(k)" "note": "M8/M5/M81 KI-/FRT- bzw. domaenenspezifische Trainings-/PIN-/Biometrie-Protokollierung (AI Act/sektorale Regulierung); M58/M59/M71/M56/M63 reine DSGVO-/datenschutzrechtliche bzw. nationale Verwaltungs-Protokollierungspflichten, nicht CRA Annex I (2)(k)"
} }
] ],
"norm_id_contract": {
"convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
"article_status": "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed",
"source": "Board Compliance/KB-v2 2026-07-01",
"kb_v2_verification": "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); 3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
}
} }
+179 -54
@@ -48,7 +48,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1 (Allgemeine Grundsätze)", "anchor": "Anhang III Nr. 1 (Allgemeine Grundsätze)",
"citation": "Der Hersteller einer Maschine hat eine Risikobeurteilung durchzuführen, um die für die Maschine geltenden Sicherheits- und Gesundheitsschutzanforderungen zu ermitteln." "citation": "Der Hersteller einer Maschine hat eine Risikobeurteilung durchzuführen, um die für die Maschine geltenden Sicherheits- und Gesundheitsschutzanforderungen zu ermitteln.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -560,7 +564,7 @@
"member_count": 480, "member_count": 480,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.95, "discovery_confidence": 0.95,
@@ -588,7 +592,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang IV (Technische Unterlagen)", "anchor": "Anhang IV (Technische Unterlagen)",
"citation": "Die technischen Unterlagen müssen die Risikobeurteilung mit den Ergebnissen enthalten." "citation": "Die technischen Unterlagen müssen die Risikobeurteilung mit den Ergebnissen enthalten.",
"norm_ids": [
"EU-MaschVO-AnhangIV"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -882,7 +890,7 @@
"member_count": 278, "member_count": 278,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -1268,7 +1276,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.1.2 (Grundsätze für die Integration der Sicherheit)", "anchor": "Anhang III Nr. 1.1.2 (Grundsätze für die Integration der Sicherheit)",
"citation": "Verbleibende Restrisiken sind in der Betriebsanleitung anzugeben." "citation": "Verbleibende Restrisiken sind in der Betriebsanleitung anzugeben.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -1439,7 +1451,7 @@
"member_count": 158, "member_count": 158,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -1467,7 +1479,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.4 (Anforderungen an Schutzeinrichtungen)", "anchor": "Anhang III Nr. 1.4 (Anforderungen an Schutzeinrichtungen)",
"citation": "Bewegliche Teile der Maschine sind so zu gestalten und zu bauen, dass jegliches Unfallrisiko durch Kontakt verhütet wird; trennende oder nichttrennende Schutzeinrichtungen sind vorzusehen." "citation": "Bewegliche Teile der Maschine sind so zu gestalten und zu bauen, dass jegliches Unfallrisiko durch Kontakt verhütet wird; trennende oder nichttrennende Schutzeinrichtungen sind vorzusehen.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -2024,7 +2040,7 @@
"member_count": 530, "member_count": 530,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -2052,7 +2068,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.2.4 (Stillsetzen, Not-Halt)", "anchor": "Anhang III Nr. 1.2.4 (Stillsetzen, Not-Halt)",
"citation": "Jede Maschine muss mit einer oder mehreren Notvorrichtungen ausgerüstet sein, mit denen sich drohende oder eintretende Gefahren abwenden lassen." "citation": "Jede Maschine muss mit einer oder mehreren Notvorrichtungen ausgerüstet sein, mit denen sich drohende oder eintretende Gefahren abwenden lassen.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -2101,7 +2121,7 @@
"member_count": 32, "member_count": 32,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.88, "discovery_confidence": 0.88,
@@ -2129,7 +2149,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.2.1 (Sicherheit und Zuverlässigkeit von Steuerungen)", "anchor": "Anhang III Nr. 1.2.1 (Sicherheit und Zuverlässigkeit von Steuerungen)",
"citation": "Steuerungen sind so zu gestalten, dass sie sicher und zuverlässig sind und Gefährdungssituationen verhindern." "citation": "Steuerungen sind so zu gestalten, dass sie sicher und zuverlässig sind und Gefährdungssituationen verhindern.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -2367,7 +2391,7 @@
"member_count": 214, "member_count": 214,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -2395,7 +2419,12 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang I (Liste der Sicherheitsbauteile), Art. 5", "anchor": "Anhang I (Liste der Sicherheitsbauteile), Art. 5",
"citation": "Sicherheitsbauteile gemäß Anhang I unterliegen den Anforderungen der Verordnung und der Konformitätsbewertung." "citation": "Sicherheitsbauteile gemäß Anhang I unterliegen den Anforderungen der Verordnung und der Konformitätsbewertung.",
"norm_ids": [
"EU-MaschVO-AnhangI",
"EU-MaschVO-Art5"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -2454,7 +2483,7 @@
"member_count": 43, "member_count": 43,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -2482,7 +2511,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.7.4 (Betriebsanleitung)", "anchor": "Anhang III Nr. 1.7.4 (Betriebsanleitung)",
"citation": "Jeder Maschine muss eine Betriebsanleitung in der/den Amtssprache(n) des Mitgliedstaats beiliegen, in dem die Maschine in Verkehr gebracht wird." "citation": "Jeder Maschine muss eine Betriebsanleitung in der/den Amtssprache(n) des Mitgliedstaats beiliegen, in dem die Maschine in Verkehr gebracht wird.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -2828,7 +2861,7 @@
"member_count": 325, "member_count": 325,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.92, "discovery_confidence": 0.92,
@@ -2856,7 +2889,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.3.7/1.7.4", "anchor": "Anhang III Nr. 1.3.7/1.7.4",
"citation": "Gefährdungen durch Blockierung beweglicher Teile sind zu berücksichtigen und Maßnahmen zur sicheren Beseitigung in der Betriebsanleitung anzugeben." "citation": "Gefährdungen durch Blockierung beweglicher Teile sind zu berücksichtigen und Maßnahmen zur sicheren Beseitigung in der Betriebsanleitung anzugeben.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -3204,7 +3241,7 @@
"member_count": 334, "member_count": 334,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -3232,7 +3269,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Art. 25 (Konformitätsbewertungsverfahren)", "anchor": "Art. 25 (Konformitätsbewertungsverfahren)",
"citation": "Vor dem Inverkehrbringen führt der Hersteller das anwendbare Konformitätsbewertungsverfahren durch." "citation": "Vor dem Inverkehrbringen führt der Hersteller das anwendbare Konformitätsbewertungsverfahren durch.",
"norm_ids": [
"EU-MaschVO-Art25"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -3422,7 +3463,7 @@
"member_count": 172, "member_count": 172,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.88, "discovery_confidence": 0.88,
@@ -3450,7 +3491,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang IV (Technische Unterlagen)", "anchor": "Anhang IV (Technische Unterlagen)",
"citation": "Die technischen Unterlagen müssen die Konstruktions-, Herstellungs- und Funktionsbeschreibung sowie die Risikobeurteilung enthalten." "citation": "Die technischen Unterlagen müssen die Konstruktions-, Herstellungs- und Funktionsbeschreibung sowie die Risikobeurteilung enthalten.",
"norm_ids": [
"EU-MaschVO-AnhangIV"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -3523,7 +3568,7 @@
"member_count": 57, "member_count": 57,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -3551,7 +3596,12 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Art. 21, Art. 22 (EU-Konformitätserklärung, CE-Kennzeichnung)", "anchor": "Art. 21, Art. 22 (EU-Konformitätserklärung, CE-Kennzeichnung)",
"citation": "Der Hersteller stellt eine EU-Konformitätserklärung aus und bringt die CE-Kennzeichnung an." "citation": "Der Hersteller stellt eine EU-Konformitätserklärung aus und bringt die CE-Kennzeichnung an.",
"norm_ids": [
"EU-MaschVO-Art21",
"EU-MaschVO-Art22"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -3624,7 +3674,7 @@
"member_count": 57, "member_count": 57,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -3652,7 +3702,12 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Art. 10, Art. 11 (Pflichten der Hersteller)", "anchor": "Art. 10, Art. 11 (Pflichten der Hersteller)",
"citation": "Die Hersteller gewährleisten, dass ihre Maschinen gemäß den grundlegenden Sicherheits- und Gesundheitsschutzanforderungen konstruiert und hergestellt wurden." "citation": "Die Hersteller gewährleisten, dass ihre Maschinen gemäß den grundlegenden Sicherheits- und Gesundheitsschutzanforderungen konstruiert und hergestellt wurden.",
"norm_ids": [
"EU-MaschVO-Art10",
"EU-MaschVO-Art11"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -3723,7 +3778,7 @@
"member_count": 55, "member_count": 55,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -3751,7 +3806,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III (Grundlegende Sicherheits- und Gesundheitsschutzanforderungen)", "anchor": "Anhang III (Grundlegende Sicherheits- und Gesundheitsschutzanforderungen)",
"citation": "Maschinen müssen die in Anhang III aufgeführten grundlegenden Sicherheits- und Gesundheitsschutzanforderungen erfüllen." "citation": "Maschinen müssen die in Anhang III aufgeführten grundlegenden Sicherheits- und Gesundheitsschutzanforderungen erfüllen.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -3943,7 +4002,7 @@
"member_count": 171, "member_count": 171,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -4029,7 +4088,12 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Kapitel IV (Notifizierung von Konformitätsbewertungsstellen)", "anchor": "Kapitel IV (Notifizierung von Konformitätsbewertungsstellen)",
"citation": "Notifizierte Stellen müssen die Anforderungen an Unabhängigkeit, Kompetenz und Unparteilichkeit erfüllen." "citation": "Notifizierte Stellen müssen die Anforderungen an Unabhängigkeit, Kompetenz und Unparteilichkeit erfüllen.",
"norm_ids": [
"EU-MaschVO-KapitelIV"
],
"norm_id_status": "chapter_no_kb_unit",
"norm_id_note": "Kapitel-Ebene nicht als KB-v2-Unit gemintet (Compiler = Artikel+Annex). Re-Anchor auf Konstituenten-Artikel = Enhancement (KB-v2 hat die Artikel); NICHT geraten."
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -4052,7 +4116,7 @@
"member_count": 11, "member_count": 11,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "chapter_reanchor_pending",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -4061,7 +4125,11 @@
"llm_model": "claude-opus-4-8", "llm_model": "claude-opus-4-8",
"synthesis_version": "v1" "synthesis_version": "v1"
}, },
"family": "machinery" "family": "machinery",
"scope": "derived_obligation",
"scope_reason": "Norm adressiert primär die notifizierte Stelle (Unabhängigkeit/Kompetenz/Unparteilichkeit), erzeugt aber mittelbare Hersteller-Pflichten: notifizierte Stelle einbeziehen, erforderliche Unterlagen bereitstellen, Konformitätsbewertung korrekt durchführen.",
"scope_split_candidate": true,
"scope_split_note": "Kandidat für spätere Aufspaltung: 'Normadressat' (Anforderungen AN die notifizierte Stelle = institutional/out_of_scope) ↔ 'abgeleitete Herstellerpflicht' (NB einbeziehen + Unterlagen + Konformitätsbewertung = in_scope). NICHT vorzeitig festziehen."
}, },
{ {
"id": "market_surveillance_safeguard", "id": "market_surveillance_safeguard",
@@ -4080,7 +4148,13 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Kapitel V/VI (Marktüberwachung, Schutzklauselverfahren)", "anchor": "Kapitel V/VI (Marktüberwachung, Schutzklauselverfahren)",
"citation": "Mitgliedstaaten ergreifen geeignete Maßnahmen gegen Maschinen, die ein Risiko darstellen; die Kommission koordiniert Schutzmaßnahmen." "citation": "Mitgliedstaaten ergreifen geeignete Maßnahmen gegen Maschinen, die ein Risiko darstellen; die Kommission koordiniert Schutzmaßnahmen.",
"norm_ids": [
"EU-MaschVO-KapitelV",
"EU-MaschVO-KapitelVI"
],
"norm_id_status": "chapter_no_kb_unit",
"norm_id_note": "Kapitel-Ebene nicht als KB-v2-Unit gemintet (Compiler = Artikel+Annex). Re-Anchor auf Konstituenten-Artikel = Enhancement (KB-v2 hat die Artikel); NICHT geraten."
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -4125,7 +4199,7 @@
"member_count": 30, "member_count": 30,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "chapter_reanchor_pending",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -4134,7 +4208,9 @@
"llm_model": "claude-opus-4-8", "llm_model": "claude-opus-4-8",
"synthesis_version": "v1" "synthesis_version": "v1"
}, },
"family": "machinery" "family": "machinery",
"scope": "out_of_scope",
"scope_reason": "Adressat = Marktüberwachungsbehörden/Kommission (Schutzmaßnahmen, Schutzklauselverfahren); keine Hersteller-Handlungspflicht. Präzedenz CSIRT/ENISA."
}, },
{ {
"id": "sanctions", "id": "sanctions",
@@ -4153,7 +4229,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Art. 50 (Sanktionen)", "anchor": "Art. 50 (Sanktionen)",
"citation": "Die Mitgliedstaaten legen Vorschriften über Sanktionen für Verstöße gegen diese Verordnung fest." "citation": "Die Mitgliedstaaten legen Vorschriften über Sanktionen für Verstöße gegen diese Verordnung fest.",
"norm_ids": [
"EU-MaschVO-Art50"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -4185,7 +4265,7 @@
"member_count": 19, "member_count": 19,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -4194,7 +4274,9 @@
"llm_model": "claude-opus-4-8", "llm_model": "claude-opus-4-8",
"synthesis_version": "v1" "synthesis_version": "v1"
}, },
"family": "machinery" "family": "machinery",
"scope": "out_of_scope",
"scope_reason": "Adressat = Mitgliedstaaten (legen Sanktionen fest); keine Hersteller-Handlungspflicht. Präzedenz CSIRT/ENISA (CRA-Vuln-Cut)."
}, },
{ {
"id": "scope_transition_application", "id": "scope_transition_application",
@@ -4213,7 +4295,13 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Art. 1, Art. 53, Art. 54 (Anwendungsbereich, Übergangsbestimmungen, Geltungsbeginn)", "anchor": "Art. 1, Art. 53, Art. 54 (Anwendungsbereich, Übergangsbestimmungen, Geltungsbeginn)",
"citation": "Diese Verordnung gilt ab dem festgelegten Datum unmittelbar in allen Mitgliedstaaten; Übergangsbestimmungen regeln die Anwendbarkeit." "citation": "Diese Verordnung gilt ab dem festgelegten Datum unmittelbar in allen Mitgliedstaaten; Übergangsbestimmungen regeln die Anwendbarkeit.",
"norm_ids": [
"EU-MaschVO-Art1",
"EU-MaschVO-Art53",
"EU-MaschVO-Art54"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -4316,7 +4404,7 @@
"member_count": 85, "member_count": 85,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -4344,7 +4432,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Art. 18 (wesentliche Veränderung)", "anchor": "Art. 18 (wesentliche Veränderung)",
"citation": "Wer eine wesentliche Veränderung an einer Maschine vornimmt, gilt als Hersteller und muss die Anforderungen erfüllen." "citation": "Wer eine wesentliche Veränderung an einer Maschine vornimmt, gilt als Hersteller und muss die Anforderungen erfüllen.",
"norm_ids": [
"EU-MaschVO-Art18"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -4417,7 +4509,7 @@
"member_count": 60, "member_count": 60,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -4445,7 +4537,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.1.9 (Schutz gegen Korrumpierung)", "anchor": "Anhang III Nr. 1.1.9 (Schutz gegen Korrumpierung)",
"citation": "Die Maschine ist so zu konstruieren, dass die Verbindung mit anderen Geräten nicht zu einer gefährlichen Situation führt; Hard- und Software, die für sicherheitsrelevante Funktionen kritisch sind, sind gegen unbeabsichtigte oder vorsätzliche Korrumpierung zu schützen." "citation": "Die Maschine ist so zu konstruieren, dass die Verbindung mit anderen Geräten nicht zu einer gefährlichen Situation führt; Hard- und Software, die für sicherheitsrelevante Funktionen kritisch sind, sind gegen unbeabsichtigte oder vorsätzliche Korrumpierung zu schützen.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -4562,7 +4658,7 @@
"member_count": 86, "member_count": 86,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -4590,7 +4686,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.1.9, Nr. 1.2.1", "anchor": "Anhang III Nr. 1.1.9, Nr. 1.2.1",
"citation": "Sicherheitsrelevante Hard- und Software ist gegen unbeabsichtigte oder vorsätzliche Korrumpierung zu schützen; eine Korrumpierung darf nicht zu gefährlichen Situationen führen." "citation": "Sicherheitsrelevante Hard- und Software ist gegen unbeabsichtigte oder vorsätzliche Korrumpierung zu schützen; eine Korrumpierung darf nicht zu gefährlichen Situationen führen.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -4659,7 +4759,7 @@
"member_count": 46, "member_count": 46,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -4687,7 +4787,12 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang I Teil A, Anhang III Nr. 1.2.1", "anchor": "Anhang I Teil A, Anhang III Nr. 1.2.1",
"citation": "Maschinen mit sich vollständig oder teilweise selbst entwickelndem Verhalten durch maschinelles Lernen gelten als Hochrisikomaschinen und unterliegen besonderen Anforderungen." "citation": "Maschinen mit sich vollständig oder teilweise selbst entwickelndem Verhalten durch maschinelles Lernen gelten als Hochrisikomaschinen und unterliegen besonderen Anforderungen.",
"norm_ids": [
"EU-MaschVO-AnhangI",
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -4798,7 +4903,7 @@
"member_count": 96, "member_count": 96,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -4826,7 +4931,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 3 (Mobile Maschinen) / Nr. 6", "anchor": "Anhang III Nr. 3 (Mobile Maschinen) / Nr. 6",
"citation": "Mobile Maschinen sind so zu konstruieren, dass Risiken im Gefahrenbereich und bei Fernsteuerung beherrscht werden." "citation": "Mobile Maschinen sind so zu konstruieren, dass Risiken im Gefahrenbereich und bei Fernsteuerung beherrscht werden.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -4866,7 +4975,7 @@
"member_count": 23, "member_count": 23,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.78, "discovery_confidence": 0.78,
@@ -4894,7 +5003,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 2-6 (besondere Maschinenkategorien)", "anchor": "Anhang III Nr. 2-6 (besondere Maschinenkategorien)",
"citation": "Für bestimmte Maschinenkategorien gelten zusätzliche grundlegende Sicherheitsanforderungen." "citation": "Für bestimmte Maschinenkategorien gelten zusätzliche grundlegende Sicherheitsanforderungen.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -5026,7 +5139,7 @@
"member_count": 111, "member_count": 111,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.75, "discovery_confidence": 0.75,
@@ -5054,7 +5167,11 @@
{ {
"source": "MaschVO", "source": "MaschVO",
"anchor": "Anhang III Nr. 1.5.8/1.5.9, Nr. 1.7.4.2", "anchor": "Anhang III Nr. 1.5.8/1.5.9, Nr. 1.7.4.2",
"citation": "Die Betriebsanleitung muss Angaben zu Luftschallemissionen und Vibrationen enthalten." "citation": "Die Betriebsanleitung muss Angaben zu Luftschallemissionen und Vibrationen enthalten.",
"norm_ids": [
"EU-MaschVO-AnhangIII"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -5089,7 +5206,7 @@
"member_count": 22, "member_count": 22,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -5552,5 +5669,13 @@
], ],
"note": "Common-Criteria-/TOE-/SFR-Evaluierung, BCM, Banking, Smart-Meter-Gateway, DNS und allgemeine ISMS-/OT-Cybersecurity-Themen ohne direkten MaschVO-Bezug; nur teilweise IN-Scope-Anteile bereits in access_control_safety_functions abgebildet" "note": "Common-Criteria-/TOE-/SFR-Evaluierung, BCM, Banking, Smart-Meter-Gateway, DNS und allgemeine ISMS-/OT-Cybersecurity-Themen ohne direkten MaschVO-Bezug; nur teilweise IN-Scope-Anteile bereits in access_control_safety_functions abgebildet"
} }
] ],
"norm_id_contract": {
"convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
"article_status": "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed",
"source": "Board Compliance/KB-v2 2026-07-01",
"kb_v2_verification": "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); 3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
}
} }
+125 -35
@@ -11,7 +11,12 @@
"name": "SBOM-Erstellungsprozess", "name": "SBOM-Erstellungsprozess",
"description": "Erzeugen einer vollstaendigen, maschinenlesbaren Software Bill of Materials fuer ein Produkt mit digitalen Elementen.", "description": "Erzeugen einer vollstaendigen, maschinenlesbaren Software Bill of Materials fuer ein Produkt mit digitalen Elementen.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["sbom_creation", "sbom_dependency_coverage", "sbom_format_standard", "sbom_tooling_automation"], "fulfills_obligations": [
"sbom_creation",
"sbom_dependency_coverage",
"sbom_format_standard",
"sbom_tooling_automation"
],
"steps": [ "steps": [
"Komponenten und (direkte + transitive) Abhaengigkeiten inventarisieren", "Komponenten und (direkte + transitive) Abhaengigkeiten inventarisieren",
"SBOM automatisiert in der Build-/Toolchain generieren", "SBOM automatisiert in der Build-/Toolchain generieren",
@@ -24,15 +29,22 @@
"Format ist maschinenlesbar und standardkonform (CycloneDX/SPDX)", "Format ist maschinenlesbar und standardkonform (CycloneDX/SPDX)",
"direkte und transitive Abhaengigkeiten enthalten" "direkte und transitive Abhaengigkeiten enthalten"
], ],
"evidence": ["sbom.cyclonedx.json", "Format-Validierungs-Log", "Build-/Toolchain-Konfiguration"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "sbom.cyclonedx.json",
"Format-Validierungs-Log",
"Build-/Toolchain-Konfiguration"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "sbom_update_process", "procedure_id": "sbom_update_process",
"name": "SBOM-Aktualisierungsprozess", "name": "SBOM-Aktualisierungsprozess",
"description": "Halten der SBOM aktuell ueber den Produktlebenszyklus bei Komponenten-, Versions- und Patch-Aenderungen.", "description": "Halten der SBOM aktuell ueber den Produktlebenszyklus bei Komponenten-, Versions- und Patch-Aenderungen.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["sbom_maintenance_update"], "fulfills_obligations": [
"sbom_maintenance_update"
],
"steps": [ "steps": [
"Komponentenaenderung erkennen (Dependency-/Patch-/Versionsaenderung)", "Komponentenaenderung erkennen (Dependency-/Patch-/Versionsaenderung)",
"SBOM neu generieren", "SBOM neu generieren",
@@ -45,15 +57,24 @@
"SBOM-Version passt zum Release", "SBOM-Version passt zum Release",
"Supplier-Komponenten enthalten" "Supplier-Komponenten enthalten"
], ],
"evidence": ["sbom.json", "CI-Log", "Release-Artefakt", "Supplier-SBOM"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "sbom.json",
"CI-Log",
"Release-Artefakt",
"Supplier-SBOM"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "sbom_supplier_integration_process", "procedure_id": "sbom_supplier_integration_process",
"name": "Lieferanten-SBOM-Integration", "name": "Lieferanten-SBOM-Integration",
"description": "Beschaffen und Einarbeiten von Lieferanten-/Drittkomponenten-SBOMs in die Produkt-SBOM.", "description": "Beschaffen und Einarbeiten von Lieferanten-/Drittkomponenten-SBOMs in die Produkt-SBOM.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["sbom_supply_chain_contracts", "sbom_dependency_coverage"], "fulfills_obligations": [
"sbom_supply_chain_contracts",
"sbom_dependency_coverage"
],
"steps": [ "steps": [
"SBOM-Anforderung in Lieferantenvertraege aufnehmen", "SBOM-Anforderung in Lieferantenvertraege aufnehmen",
"Lieferanten-SBOMs einsammeln", "Lieferanten-SBOMs einsammeln",
@@ -65,15 +86,24 @@
"Lieferanten-SBOMs eingegangen", "Lieferanten-SBOMs eingegangen",
"Drittkomponenten in der SBOM gelistet" "Drittkomponenten in der SBOM gelistet"
], ],
"evidence": ["Lieferantenvertrag-Klausel", "eingegangene Supplier-SBOMs", "gemergte SBOM"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "Lieferantenvertrag-Klausel",
"eingegangene Supplier-SBOMs",
"gemergte SBOM"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "sbom_provision_process", "procedure_id": "sbom_provision_process",
"name": "SBOM-Bereitstellungsprozess", "name": "SBOM-Bereitstellungsprozess",
"description": "Zugaenglichmachen der SBOM fuer berechtigte Parteien (Nutzer, Behoerde) unter Wahrung der Vertraulichkeit.", "description": "Zugaenglichmachen der SBOM fuer berechtigte Parteien (Nutzer, Behoerde) unter Wahrung der Vertraulichkeit.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["sbom_access_provision", "sbom_authority_provision", "sbom_confidentiality"], "fulfills_obligations": [
"sbom_access_provision",
"sbom_authority_provision",
"sbom_confidentiality"
],
"steps": [ "steps": [
"Zugangskanal definieren (Portal/API/dokumentierter Pfad)", "Zugangskanal definieren (Portal/API/dokumentierter Pfad)",
"Nutzer ueber den Zugangsweg informieren", "Nutzer ueber den Zugangsweg informieren",
@@ -85,15 +115,23 @@
"Zugriffskontrolle/Vertraulichkeit umgesetzt", "Zugriffskontrolle/Vertraulichkeit umgesetzt",
"Behoerden-Bereitstellungsprozess definiert" "Behoerden-Bereitstellungsprozess definiert"
], ],
"evidence": ["Zugangskanal-Dokumentation", "Behoerden-Anfrage-Log", "Zugriffskontroll-Konfiguration"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "Zugangskanal-Dokumentation",
"Behoerden-Anfrage-Log",
"Zugriffskontroll-Konfiguration"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "sbom_conformity_documentation_process", "procedure_id": "sbom_conformity_documentation_process",
"name": "SBOM in technischer Dokumentation/Konformitaet", "name": "SBOM in technischer Dokumentation/Konformitaet",
"description": "Aufnehmen der SBOM in die technische Dokumentation und Verifizieren der Vollstaendigkeit fuer die Konformitaetsbewertung.", "description": "Aufnehmen der SBOM in die technische Dokumentation und Verifizieren der Vollstaendigkeit fuer die Konformitaetsbewertung.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["sbom_technical_documentation", "sbom_completeness_verification"], "fulfills_obligations": [
"sbom_technical_documentation",
"sbom_completeness_verification"
],
"steps": [ "steps": [
"SBOM in die technische Dokumentation aufnehmen", "SBOM in die technische Dokumentation aufnehmen",
"Vollstaendigkeit gegen die real eingesetzte Softwarekomposition pruefen", "Vollstaendigkeit gegen die real eingesetzte Softwarekomposition pruefen",
@@ -104,16 +142,22 @@
"Vollstaendigkeit verifiziert", "Vollstaendigkeit verifiziert",
"Konformitaetsnachweis vorhanden" "Konformitaetsnachweis vorhanden"
], ],
"evidence": ["technische Dokumentation", "Vollstaendigkeits-Pruefbericht", "Konformitaetsnachweis"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "technische Dokumentation",
"Vollstaendigkeits-Pruefbericht",
"Konformitaetsnachweis"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "vuln_handling_process_setup", "procedure_id": "vuln_handling_process_setup",
"name": "Schwachstellenbehandlungsprozess einrichten", "name": "Schwachstellenbehandlungsprozess einrichten",
"description": "Dokumentierten Prozess und Meldekanal (CVD) fuer die Schwachstellenbehandlung etablieren.", "description": "Dokumentierten Prozess und Meldekanal (CVD) fuer die Schwachstellenbehandlung etablieren.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["vuln_handling_process"], "fulfills_obligations": [
"vuln_handling_process"
],
"steps": [ "steps": [
"dokumentierten Schwachstellenbehandlungsprozess definieren", "dokumentierten Schwachstellenbehandlungsprozess definieren",
"Coordinated-Vulnerability-Disclosure-Richtlinie und Meldekanal veroeffentlichen", "Coordinated-Vulnerability-Disclosure-Richtlinie und Meldekanal veroeffentlichen",
@@ -124,15 +168,22 @@
"Meldekanal/Kontaktstelle auffindbar (z.B. security.txt)", "Meldekanal/Kontaktstelle auffindbar (z.B. security.txt)",
"Triage-Verfahren vorhanden" "Triage-Verfahren vorhanden"
], ],
"evidence": ["Prozessdokument", "security.txt / Kontaktstelle", "Triage-Log"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "Prozessdokument",
"security.txt / Kontaktstelle",
"Triage-Log"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "vuln_identification_process", "procedure_id": "vuln_identification_process",
"name": "Schwachstellen-Identifikation", "name": "Schwachstellen-Identifikation",
"description": "Bekannte Schwachstellen in eingesetzten Komponenten erkennen und inventarisieren.", "description": "Bekannte Schwachstellen in eingesetzten Komponenten erkennen und inventarisieren.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["vuln_identification_inventory"], "fulfills_obligations": [
"vuln_identification_inventory"
],
"steps": [ "steps": [
"Advisories/CVE-Feeds beobachten", "Advisories/CVE-Feeds beobachten",
"gegen die SBOM-Komponenten abgleichen", "gegen die SBOM-Komponenten abgleichen",
@@ -143,15 +194,21 @@
"SBOM-zu-CVE-Abgleich durchgefuehrt", "SBOM-zu-CVE-Abgleich durchgefuehrt",
"Schwachstellen-Inventar gepflegt" "Schwachstellen-Inventar gepflegt"
], ],
"evidence": ["CVE-Abgleich-Report", "Schwachstellen-Register"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "CVE-Abgleich-Report",
"Schwachstellen-Register"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "vuln_assessment_process", "procedure_id": "vuln_assessment_process",
"name": "Schwachstellen-Bewertung/Priorisierung", "name": "Schwachstellen-Bewertung/Priorisierung",
"description": "Identifizierte Schwachstellen nach Schweregrad, Ausnutzbarkeit und Exposition bewerten und priorisieren.", "description": "Identifizierte Schwachstellen nach Schweregrad, Ausnutzbarkeit und Exposition bewerten und priorisieren.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["vuln_assessment_prioritization"], "fulfills_obligations": [
"vuln_assessment_prioritization"
],
"steps": [ "steps": [
"Schweregrad bewerten (z.B. CVSS)", "Schweregrad bewerten (z.B. CVSS)",
"Ausnutzbarkeit/Exposition einschaetzen", "Ausnutzbarkeit/Exposition einschaetzen",
@@ -161,15 +218,21 @@
"Schweregrad standardisiert bewertet", "Schweregrad standardisiert bewertet",
"risikobasierte Priorisierung vorhanden" "risikobasierte Priorisierung vorhanden"
], ],
"evidence": ["Bewertungsdatensatz (CVSS)", "Prioritaetenliste"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "Bewertungsdatensatz (CVSS)",
"Prioritaetenliste"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "vuln_remediation_process", "procedure_id": "vuln_remediation_process",
"name": "Schwachstellen-Behebung", "name": "Schwachstellen-Behebung",
"description": "Bekannte Schwachstellen fristgerecht durch Patches/Gegenmassnahmen beheben und Sicherheitsupdates bereitstellen.", "description": "Bekannte Schwachstellen fristgerecht durch Patches/Gegenmassnahmen beheben und Sicherheitsupdates bereitstellen.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["vuln_remediation_patching"], "fulfills_obligations": [
"vuln_remediation_patching"
],
"steps": [ "steps": [
"Fix/Gegenmassnahme entwickeln", "Fix/Gegenmassnahme entwickeln",
"testen", "testen",
@@ -181,15 +244,23 @@
"Sicherheitsupdate bereitgestellt", "Sicherheitsupdate bereitgestellt",
"Follow-up bis Closure" "Follow-up bis Closure"
], ],
"evidence": ["Patch/Release", "Behebungs-Zeitleiste", "Follow-up-Log"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "Patch/Release",
"Behebungs-Zeitleiste",
"Follow-up-Log"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "vuln_disclosure_process", "procedure_id": "vuln_disclosure_process",
"name": "Offenlegung + Nutzerinformation", "name": "Offenlegung + Nutzerinformation",
"description": "Koordinierte Offenlegung behobener Schwachstellen und Information der Nutzer ueber Schutzmassnahmen.", "description": "Koordinierte Offenlegung behobener Schwachstellen und Information der Nutzer ueber Schutzmassnahmen.",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["coordinated_vulnerability_disclosure", "vuln_info_dissemination_users"], "fulfills_obligations": [
"coordinated_vulnerability_disclosure",
"vuln_info_dissemination_users"
],
"steps": [ "steps": [
"Offenlegungszeitpunkt koordinieren", "Offenlegungszeitpunkt koordinieren",
"Security Advisory / CVE-Eintrag veroeffentlichen", "Security Advisory / CVE-Eintrag veroeffentlichen",
@@ -199,15 +270,22 @@
"Advisory veroeffentlicht", "Advisory veroeffentlicht",
"Nutzer informiert" "Nutzer informiert"
], ],
"evidence": ["Security Advisory", "CVE-Eintrag", "Nutzer-Benachrichtigung"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "Security Advisory",
"CVE-Eintrag",
"Nutzer-Benachrichtigung"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
}, },
{ {
"procedure_id": "vuln_authority_reporting_process", "procedure_id": "vuln_authority_reporting_process",
"name": "Behoerdenmeldung aktiv ausgenutzter Schwachstellen", "name": "Behoerdenmeldung aktiv ausgenutzter Schwachstellen",
"description": "Aktiv ausgenutzte Schwachstellen fristgerecht an CSIRT/ENISA melden (CRA Art. 14-Kaskade).", "description": "Aktiv ausgenutzte Schwachstellen fristgerecht an CSIRT/ENISA melden (CRA Art. 14-Kaskade).",
"source_role": "procedural_requirement", "source_role": "procedural_requirement",
"fulfills_obligations": ["exploited_vuln_reporting_authorities"], "fulfills_obligations": [
"exploited_vuln_reporting_authorities"
],
"applicability_note": "bedingt: nur bei aktiv ausgenutzter Schwachstelle", "applicability_note": "bedingt: nur bei aktiv ausgenutzter Schwachstelle",
"steps": [ "steps": [
"aktive Ausnutzung erkennen", "aktive Ausnutzung erkennen",
@@ -220,8 +298,20 @@
"72h-Meldung erfolgt", "72h-Meldung erfolgt",
"14d-Abschlussbericht erfolgt" "14d-Abschlussbericht erfolgt"
], ],
"evidence": ["CSIRT/ENISA-Meldungsbelege", "Zeitstempel der Kaskade"], "evidence": [
"citation_spans": [], "citation_status": "pending_span_anchor" "CSIRT/ENISA-Meldungsbelege",
"Zeitstempel der Kaskade"
],
"citation_spans": [],
"citation_status": "pending_span_anchor"
} }
] ],
"norm_id_contract": {
"convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
"article_status": "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed",
"source": "Board Compliance/KB-v2 2026-07-01",
"kb_v2_verification": "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); 3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
}
} }
+39 -11
@@ -46,7 +46,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (1)(2)(d)", "anchor": "Annex I (1)(2)(d)",
"citation": "Schutz vor unbefugtem Zugriff durch geeignete Kontrollmechanismen (Authentifizierung, Identitaets- und Zugriffsmanagement)" "citation": "Schutz vor unbefugtem Zugriff durch geeignete Kontrollmechanismen (Authentifizierung, Identitaets- und Zugriffsmanagement)",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -342,7 +346,7 @@
"member_count": 277, "member_count": 277,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.92, "discovery_confidence": 0.92,
@@ -370,7 +374,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (1)(2)(b)(c)", "anchor": "Annex I (1)(2)(b)(c)",
"citation": "Schutz der Vertraulichkeit und Integritaet von Daten und Befehlen" "citation": "Schutz der Vertraulichkeit und Integritaet von Daten und Befehlen",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -656,7 +664,7 @@
"member_count": 274, "member_count": 274,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -917,7 +925,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (1)(2)(g)", "anchor": "Annex I (1)(2)(g)",
"citation": "Aufzeichnung und Ueberwachung relevanter interner Aktivitaeten (Logging)" "citation": "Aufzeichnung und Ueberwachung relevanter interner Aktivitaeten (Logging)",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -959,7 +971,7 @@
"member_count": 22, "member_count": 22,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -1145,7 +1157,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (1)(2)(a)", "anchor": "Annex I (1)(2)(a)",
"citation": "Bereitstellung ohne bekannte ausnutzbare Schwachstellen / minimierte Angriffsflaeche" "citation": "Bereitstellung ohne bekannte ausnutzbare Schwachstellen / minimierte Angriffsflaeche",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -1178,7 +1194,7 @@
"member_count": 19, "member_count": 19,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.83, "discovery_confidence": 0.83,
@@ -1210,7 +1226,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (2)(1)", "anchor": "Annex I (2)(1)",
"citation": "Behandlung und Behebung von Schwachstellen, Sicherheitsupdates" "citation": "Behandlung und Behebung von Schwachstellen, Sicherheitsupdates",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1247,7 +1267,7 @@
"member_count": 17, "member_count": 17,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.82, "discovery_confidence": 0.82,
@@ -1662,5 +1682,13 @@
], ],
"note": "Physische Maschinen-Fernsteuerung/Ergonomie/Gefahrenzonen-Sicherheit (MaschinenVO 2023/1230), keine Cybersecurity-Fernwartung" "note": "Physische Maschinen-Fernsteuerung/Ergonomie/Gefahrenzonen-Sicherheit (MaschinenVO 2023/1230), keine Cybersecurity-Fernwartung"
} }
] ],
"norm_id_contract": {
"convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
"article_status": "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed",
"source": "Board Compliance/KB-v2 2026-07-01",
"kb_v2_verification": "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); 3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
}
} }
+50 -14
@@ -52,12 +52,20 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (2)(c)", "anchor": "Annex I (2)(c)",
"citation": "Schwachstellen durch Sicherheitsupdates ohne Verzug behandeln, einschliesslich automatischer Updates und Benachrichtigung." "citation": "Schwachstellen durch Sicherheitsupdates ohne Verzug behandeln, einschliesslich automatischer Updates und Benachrichtigung.",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
}, },
{ {
"source": "CRA", "source": "CRA",
"anchor": "Art. 13", "anchor": "Art. 13",
"citation": "Pflicht zur Bereitstellung von Sicherheitsupdates waehrend des Support-Zeitraums." "citation": "Pflicht zur Bereitstellung von Sicherheitsupdates waehrend des Support-Zeitraums.",
"norm_ids": [
"EU-CRA-Art13"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -661,7 +669,7 @@
"member_count": 578, "member_count": 578,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.95, "discovery_confidence": 0.95,
@@ -689,7 +697,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Art. 13(8)", "anchor": "Art. 13(8)",
"citation": "Bestimmung des Support-Zeitraums entsprechend der erwarteten Nutzungsdauer." "citation": "Bestimmung des Support-Zeitraums entsprechend der erwarteten Nutzungsdauer.",
"norm_ids": [
"EU-CRA-Art13"
],
"norm_id_status": "article_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -1275,7 +1287,7 @@
"member_count": 574, "member_count": 574,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -1303,7 +1315,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (1)(3)(f)", "anchor": "Annex I (1)(3)(f)",
"citation": "Schutz der Integritaet von Daten, Befehlen und Konfigurationen vor Manipulation." "citation": "Schutz der Integritaet von Daten, Befehlen und Konfigurationen vor Manipulation.",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1382,7 +1398,7 @@
"member_count": 58, "member_count": 58,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -1415,7 +1431,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (1)(3)(d)", "anchor": "Annex I (1)(3)(d)",
"citation": "Schutz vor unbefugtem Zugriff durch geeignete Kontrollmechanismen." "citation": "Schutz vor unbefugtem Zugriff durch geeignete Kontrollmechanismen.",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [ "guidance_basis": [
@@ -1476,7 +1496,7 @@
"member_count": 42, "member_count": 42,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.85, "discovery_confidence": 0.85,
@@ -1642,7 +1662,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (2)(c)", "anchor": "Annex I (2)(c)",
"citation": "Sicherheitsupdates werden, soweit moeglich, automatisch installiert mit Opt-out-Moeglichkeit des Nutzers." "citation": "Sicherheitsupdates werden, soweit moeglich, automatisch installiert mit Opt-out-Moeglichkeit des Nutzers.",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -1661,7 +1685,7 @@
"member_count": 6, "member_count": 6,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.9, "discovery_confidence": 0.9,
@@ -1690,7 +1714,11 @@
{ {
"source": "CRA", "source": "CRA",
"anchor": "Annex I (1)(2)", "anchor": "Annex I (1)(2)",
"citation": "Cybersicherheits-Risikobeurteilung als Grundlage fuer Schwachstellenbehandlung." "citation": "Cybersicherheits-Risikobeurteilung als Grundlage fuer Schwachstellenbehandlung.",
"norm_ids": [
"EU-CRA-AnhangI"
],
"norm_id_status": "annex_confirmed"
} }
], ],
"guidance_basis": [], "guidance_basis": [],
@@ -1704,7 +1732,7 @@
"member_count": 2, "member_count": 2,
"relationships": [], "relationships": [],
"citation_anchor_ids": [], "citation_anchor_ids": [],
"citation_status": "pending_span_anchor", "citation_status": "norm_id_linked",
"review_status": "draft", "review_status": "draft",
"provenance": { "provenance": {
"discovery_confidence": 0.8, "discovery_confidence": 0.8,
@@ -1816,5 +1844,13 @@
], ],
"note": "M4 (digitale Veraenderungen allgemein) und M7 (TLS-Proxy-Kanalverwaltung) betreffen Konfigurations-/Netzwerkmanagement, nicht die Update-/Patch-Pflicht im engeren Sinne." "note": "M4 (digitale Veraenderungen allgemein) und M7 (TLS-Proxy-Kanalverwaltung) betreffen Konfigurations-/Netzwerkmanagement, nicht die Update-/Patch-Pflicht im engeren Sinne."
} }
] ],
"norm_id_contract": {
"convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
"article_status": "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed",
"source": "Board Compliance/KB-v2 2026-07-01",
"kb_v2_verification": "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); 3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
}
} }
+43
@@ -0,0 +1,43 @@
{
"contract": "norm_id logical citation join (Domäne 2 legal_basis -> KB-v2 units)",
"convention": "EU-<ACT>-Anhang<ROM> (annex) / EU-<ACT>-Art<N> (article) — beide in KB-v2 confirmed; EU-<ACT>-Kapitel<ROM> NICHT gemintet",
"act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
"granularity": "annex-grob",
"source": "KB-v2 bp_compliance_kb_2026_1_build",
"kb_v2_verification": {
"date": "2026-07-01",
"by": "Compliance/KB (RAG-Ingestion)-Session",
"result": "16/19 verify_pending bestätigt (alle Artikel existieren); 3 Kapitel-IDs fehlen (Compiler mintet Artikel+Annex, keine Kapitel)",
"resolution": "Kapitel re-anchoren auf Konstituenten-Artikel (KB-v2 hat alle) ODER Kapitel-Units als KB-Enhancement; hier als chapter_no_kb_unit markiert, NICHT geraten"
},
"annex_confirmed": {
"EU-CRA-AnhangI": 34,
"EU-CRA-AnhangVII": 2,
"EU-MaschVO-AnhangI": 2,
"EU-MaschVO-AnhangIII": 14,
"EU-MaschVO-AnhangIV": 2
},
"article_confirmed": {
"EU-CRA-Art13": 3,
"EU-CRA-Art14": 1,
"EU-CRA-Art16": 1,
"EU-CRA-Art3": 1,
"EU-CRA-Art31": 3,
"EU-MaschVO-Art1": 1,
"EU-MaschVO-Art10": 1,
"EU-MaschVO-Art11": 1,
"EU-MaschVO-Art18": 1,
"EU-MaschVO-Art21": 1,
"EU-MaschVO-Art22": 1,
"EU-MaschVO-Art25": 1,
"EU-MaschVO-Art5": 1,
"EU-MaschVO-Art50": 1,
"EU-MaschVO-Art53": 1,
"EU-MaschVO-Art54": 1
},
"chapter_no_kb_unit": {
"EU-MaschVO-KapitelIV": 1,
"EU-MaschVO-KapitelV": 1,
"EU-MaschVO-KapitelVI": 1
}
}
File diff suppressed because it is too large
+38
@@ -0,0 +1,38 @@
{
"audit": "obligation scope audit (Adressat: Hersteller vs Behörde/notified_body)",
"principle": "Adressat der Norm != Handlungspflicht des Herstellers; scope-Achse in_scope/out_of_scope/derived_obligation",
"false_positive_guard": "Melde-AN-Behörde-Pflichten (applicability=domain:products…) bleiben IN-SCOPE",
"obligations_scanned": 126,
"classified": [
{
"file": "cra_machinery.json",
"id": "notified_body_requirements",
"name": "Anforderungen an notifizierte Stellen",
"tier": "LEGAL_MINIMUM",
"applicability": "domain:notified_body",
"scope": "derived_obligation",
"scope_reason": "Norm adressiert primär die notifizierte Stelle (Unabhängigkeit/Kompetenz/Unparteilichkeit), erzeugt aber mittelbare Hersteller-Pflichten: notifizierte Stelle einbeziehen, erforderliche Unterlagen bereitstellen, Konformitätsbewertung korrekt durchführen.",
"scope_split_note": "Kandidat für spätere Aufspaltung: 'Normadressat' (Anforderungen AN die notifizierte Stelle = institutional/out_of_scope) ↔ 'abgeleitete Herstellerpflicht' (NB einbeziehen + Unterlagen + Konformitätsbewertung = in_scope). NICHT vorzeitig festziehen."
},
{
"file": "cra_machinery.json",
"id": "market_surveillance_safeguard",
"name": "Marktüberwachung, nationale Schutzmaßnahmen und Korrekturmaßnahmen",
"tier": "LEGAL_MINIMUM",
"applicability": "domain:authority",
"scope": "out_of_scope",
"scope_reason": "Adressat = Marktüberwachungsbehörden/Kommission (Schutzmaßnahmen, Schutzklauselverfahren); keine Hersteller-Handlungspflicht. Präzedenz CSIRT/ENISA."
},
{
"file": "cra_machinery.json",
"id": "sanctions",
"name": "Sanktionen für Verstöße gegen die Maschinenverordnung",
"tier": "LEGAL_MINIMUM",
"applicability": "domain:authority",
"scope": "out_of_scope",
"scope_reason": "Adressat = Mitgliedstaaten (legen Sanktionen fest); keine Hersteller-Handlungspflicht. Präzedenz CSIRT/ENISA (CRA-Vuln-Cut)."
}
],
"unclassified_candidates": [],
"decision_owner": "User/Registry-Owner — Audit FLAGGT nur; für jeden künftigen Cut mitlaufen lassen"
}
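The audit report above only flags; the code that produced it is not part of this diff. As a minimal sketch of how such a flag could be computed, assuming (hypothetically) that an obligation counts as a candidate when its `applicability` names a non-manufacturer addressee and it carries no explicit `scope`, the check might look like this. The function name, the domain set, and the two `hypothetical_*` sample entries are illustrative assumptions, not taken from the repository.

```python
# Assumption: these applicability values mark a non-manufacturer addressee.
# Both values appear in the audit report; treating them as the trigger is a guess.
NON_MANUFACTURER_DOMAINS = {"domain:authority", "domain:notified_body"}


def unclassified_candidates(obligations: list[dict]) -> list[str]:
    """Return ids of obligations addressed to another role that have no explicit scope."""
    return [
        o["id"]
        for o in obligations
        if o.get("applicability") in NON_MANUFACTURER_DOMAINS and "scope" not in o
    ]


sample = [
    {"id": "sanctions", "applicability": "domain:authority", "scope": "out_of_scope"},
    {"id": "notified_body_requirements", "applicability": "domain:notified_body",
     "scope": "derived_obligation"},
    # Hypothetical entries, for illustration only.
    {"id": "hypothetical_new_authority_rule", "applicability": "domain:authority"},
    {"id": "hypothetical_reporting_duty", "applicability": "domain:products"},
]

print(unclassified_candidates(sample))
```

The reporting duty is not flagged because its applicability is product-related, which matches the `false_positive_guard` note: obligations to report to an authority remain in scope.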
@@ -0,0 +1,75 @@
"""Citability join (wake-up #2): logical norm_id join on legal_basis.

KB-v2 convention (Board 2026-07-01, Compliance/KB-v2): `EU-<ACT>-Anhang<ROM>` (annex level, coarse) ·
`EU-<ACT>-Art<N>` (article, still to be verified in KB-v2) · chapters = convention TBD.
Naming variant: `EU-MaschVO-*` (NOT MaschinenVO). No char span needed: a logical join on norm_id.
Adds `norm_ids` (list) and `norm_id_status` to each legal_basis entry; sets obligation.citation_status
to `norm_id_linked` (annex-coarse). NO new class (attribute only). Freeze-safe.
"""
from __future__ import annotations

import glob
import json
import re

# Maps the `source` label of a legal_basis entry to the act token used in norm_ids.
ACT = {"CRA": "CRA", "MaschVO": "MaschVO", "MaschinenVO": "MaschVO"}
FILES = sorted(glob.glob("obligations/cra*.json"))


def derive(source: str, anchor: str) -> tuple[list[str], str]:
    act = ACT.get(source, source)
    ids: list[str] = []
    for rom in re.findall(r"An(?:hang|nex)\s+([IVX]+)", anchor, re.I):
        ids.append(f"EU-{act}-Anhang{rom.upper()}")
    articles = re.findall(r"\bArt(?:icle|\.)?\s*(\d+)", anchor)
    chapters = re.findall(r"Kapitel\s+([IVX/]+)", anchor, re.I)
    verify: list[str] = []
    for n in articles:
        verify.append(f"EU-{act}-Art{n}")
    for grp in chapters:
        # A chapter group such as "IV/V" yields one ID per Roman numeral.
        for rom in grp.split("/"):
            rom = rom.strip()
            if rom:
                verify.append(f"EU-{act}-Kapitel{rom.upper()}")
    # Deduplicate in order: annexes first (confirmed), then verify-pending IDs.
    seen: set[str] = set()
    ordered = [x for x in ids + verify if not (x in seen or seen.add(x))]
    status = "annex_confirmed" if ids else ("verify_pending" if verify else "unparsed")
    return ordered, status


def main() -> None:
    total_lb = linked = unparsed = 0
    obl_linked = 0
    for f in FILES:
        d = json.load(open(f, encoding="utf-8"))
        d.setdefault("norm_id_contract", {
            "convention": "EU-<ACT>-Anhang<ROM> (Annex-Ebene) / EU-<ACT>-Art<N> (verify) — KB-v2 bp_compliance_kb_2026_1_build",
            "act_naming": "EU-MaschVO-* (NICHT MaschinenVO)",
            "granularity": "annex-grob — 'Annex I Part II (1)' -> EU-CRA-AnhangI; Part/Punkt = KB-Enhancement TBD",
            "article_status": "EU-<ACT>-Art<N> in KB-v2 noch zu verifizieren; Annex-IDs confirmed",
            "source": "Board Compliance/KB-v2 2026-07-01",
        })
        for o in d.get("obligations", []):
            got = False
            for b in o.get("legal_basis", []):
                total_lb += 1
                nids, st = derive(b.get("source", ""), b.get("anchor", ""))
                b["norm_ids"] = nids
                b["norm_id_status"] = st
                if nids:
                    linked += 1
                    got = True
                if st == "unparsed":
                    unparsed += 1
                    print(f" UNPARSED: {b.get('source')} \"{b.get('anchor')}\"")
            # An obligation counts as linked once any of its legal_basis entries has norm_ids.
            if got:
                o["citation_status"] = "norm_id_linked"
                obl_linked += 1
        json.dump(d, open(f, "w", encoding="utf-8"), ensure_ascii=False, indent=1)
    print(f"legal_basis gesamt {total_lb} | mit norm_ids {linked} | unparsed {unparsed}")
    print(f"Obligations citation_status -> norm_id_linked: {obl_linked}")


if __name__ == "__main__":
    main()
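To make the anchor-to-norm_id mapping concrete, here is a condensed restatement of `derive()` applied to a few anchors. The regular expressions are copied from the script above; the function name `derive_norm_ids`, the use of `dict.fromkeys` for order-preserving deduplication, and the sample anchors `"Annex I, Art. 13"`, `"Kapitel IV/V"` and `"Artikel 5"` are illustrative assumptions, not part of the commit.

```python
import re

# Same alias table as in the script above.
ACT = {"CRA": "CRA", "MaschVO": "MaschVO", "MaschinenVO": "MaschVO"}


def derive_norm_ids(source: str, anchor: str) -> tuple[list[str], str]:
    """Condensed sketch of derive(): annex IDs first, then article and chapter IDs."""
    act = ACT.get(source, source)
    annexes = [
        f"EU-{act}-Anhang{rom.upper()}"
        for rom in re.findall(r"An(?:hang|nex)\s+([IVX]+)", anchor, re.I)
    ]
    pending = [
        f"EU-{act}-Art{n}"
        for n in re.findall(r"\bArt(?:icle|\.)?\s*(\d+)", anchor)
    ]
    for grp in re.findall(r"Kapitel\s+([IVX/]+)", anchor, re.I):
        pending += [f"EU-{act}-Kapitel{r.strip().upper()}" for r in grp.split("/") if r.strip()]
    ordered = list(dict.fromkeys(annexes + pending))  # dedup, keeps first occurrence
    status = "annex_confirmed" if annexes else ("verify_pending" if pending else "unparsed")
    return ordered, status


print(derive_norm_ids("CRA", "Annex I (2)(c)"))        # annex only
print(derive_norm_ids("CRA", "Art. 13(8)"))            # article only, paragraph dropped
print(derive_norm_ids("CRA", "Annex I, Art. 13"))      # mixed: annex decides the status
print(derive_norm_ids("MaschinenVO", "Kapitel IV/V"))  # alias resolved, one ID per chapter
print(derive_norm_ids("CRA", "Artikel 5"))             # spelled-out German form is not matched
```

Two properties follow from the patterns as written: sub-anchors such as `(2)(c)` or `(8)` are discarded, which is the annex-coarse granularity the contract block describes, and an anchor that spells out `Artikel` falls through to `unparsed`, because the article pattern only accepts `Art`, `Art.` or `Article` directly before the number.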
@@ -0,0 +1,70 @@
"""Apply the KB-v2 verification of the norm_ids (2026-07-01, feedback from the Compliance/KB session).

KB-v2 checked the 19 verify_pending IDs: 16/19 confirmed (all CRA and MaschVO articles exist);
the 3 missing ones are chapter-level (`EU-MaschVO-KapitelIV/V/VI`), because the KB compiler mints
articles and annexes but NO chapters. Consequence: article norm_ids go `verify_pending` ->
`article_confirmed`; chapter norm_ids become `chapter_no_kb_unit` (dangling join key) plus a
re-anchor note (KB-v2 has the constituent articles; re-anchoring is an enhancement, NOT guessed).
Statuses are re-derived deterministically from the norm_id contents.
"""
from __future__ import annotations

import glob
import json

REANCHOR_NOTE = (
    "Kapitel-Ebene nicht als KB-v2-Unit gemintet (Compiler = Artikel+Annex). "
    "Re-Anchor auf Konstituenten-Artikel = Enhancement (KB-v2 hat die Artikel); NICHT geraten."
)


def status_for(norm_ids: list[str]) -> str:
    # Precedence: annex > article > chapter; an empty list is "unparsed".
    has_annex = any("-Anhang" in n for n in norm_ids)
    has_article = any("-Art" in n and "-Anhang" not in n for n in norm_ids)
    has_chapter = any("-Kapitel" in n for n in norm_ids)
    if has_annex:
        return "annex_confirmed"
    if has_article:
        return "article_confirmed"  # KB-v2 verified 16/16
    if has_chapter:
        return "chapter_no_kb_unit"
    return "unparsed"


def main() -> None:
    counts = {"annex_confirmed": 0, "article_confirmed": 0, "chapter_no_kb_unit": 0}
    obl_linked = obl_chapter = 0
    for f in sorted(glob.glob("obligations/cra*.json")):
        d = json.load(open(f, encoding="utf-8"))
        for o in d.get("obligations", []):
            joinable = chapter_only = False
            for b in o.get("legal_basis", []):
                nids = b.get("norm_ids", [])
                st = status_for(nids)
                b["norm_id_status"] = st
                counts[st] = counts.get(st, 0) + 1
                if st == "chapter_no_kb_unit":
                    b["norm_id_note"] = REANCHOR_NOTE
                    chapter_only = True
                elif st in ("annex_confirmed", "article_confirmed"):
                    joinable = True
            # One joinable legal_basis entry is enough to link the obligation.
            if joinable:
                o["citation_status"] = "norm_id_linked"
                obl_linked += 1
            elif chapter_only:
                o["citation_status"] = "chapter_reanchor_pending"
                obl_chapter += 1
        # Add the verification state to the contract block.
        c = d.get("norm_id_contract")
        if isinstance(c, dict):
            c["kb_v2_verification"] = (
                "2026-07-01: 16/19 verify_pending IDs in KB-v2 bestätigt (alle Artikel); "
                "3 Kapitel-IDs = chapter_no_kb_unit (Compiler mintet keine Kapitel)."
            )
            c["article_status"] = "EU-<ACT>-Art<N> in KB-v2 BESTÄTIGT (16/16); Annex-IDs confirmed"
        json.dump(d, open(f, "w", encoding="utf-8"), ensure_ascii=False, indent=1)
    print("legal_basis status:", counts)
    print(f"citation_status: norm_id_linked {obl_linked} | chapter_reanchor_pending {obl_chapter}")


if __name__ == "__main__":
    main()
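Because `status_for()` returns on the first match, a legal_basis entry that mixes an article ID with a chapter ID is reported as `article_confirmed`, and the chapter ID stays in `norm_ids` without a note. Whether such mixed entries occur in the obligation files is not visible in this diff. As a minimal sketch, assuming one wanted to surface them, a helper like the following could list the chapter IDs regardless of the entry's status. `dangling_chapter_ids` and the sample list are hypothetical; `status_for` is restated from the script above so the block runs on its own.

```python
def status_for(norm_ids: list[str]) -> str:
    """Restated from the script above: annex > article > chapter precedence."""
    if any("-Anhang" in n for n in norm_ids):
        return "annex_confirmed"
    if any("-Art" in n and "-Anhang" not in n for n in norm_ids):
        return "article_confirmed"
    if any("-Kapitel" in n for n in norm_ids):
        return "chapter_no_kb_unit"
    return "unparsed"


def dangling_chapter_ids(norm_ids: list[str]) -> list[str]:
    """Hypothetical helper: chapter-level IDs that have no KB-v2 unit to join against."""
    return [n for n in norm_ids if "-Kapitel" in n]


# Illustrative mixed entry; not taken from the obligation files.
mixed = ["EU-MaschVO-Art21", "EU-MaschVO-KapitelIV"]

print(status_for(mixed))            # the article wins the status
print(dangling_chapter_ids(mixed))  # the chapter ID is still a dangling join key
```

A downstream join on `norm_ids` would then resolve the article and miss the chapter, so a consumer that needs every key to resolve may want to check the list itself rather than rely on `norm_id_status` alone.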
@@ -0,0 +1,53 @@
"""Apply the scope classification (user decision 2026-07-01, option 2 + derived_obligation).

New `scope` attribute axis (NOT a new object class, just an enum value; freeze-safe):
  in_scope (default/implicit) · out_of_scope · derived_obligation

Principle (user): the addressee of a norm != the manufacturer's duty to act. Provisions that
purely concern the state, enforcement, or institutions = out_of_scope. A norm that primarily
addresses another role BUT indirectly creates a manufacturer duty to act = derived_obligation
(stays in the manufacturer set and is NOT discarded: 'when in doubt, do not discard knowledge too early').
"""
from __future__ import annotations

import glob
import json

# Keyed by obligation id; each value is merged into the matching obligation.
SCOPE = {
    "sanctions": {
        "scope": "out_of_scope",
        "scope_reason": "Adressat = Mitgliedstaaten (legen Sanktionen fest); keine Hersteller-Handlungspflicht. Präzedenz CSIRT/ENISA (CRA-Vuln-Cut).",
    },
    "market_surveillance_safeguard": {
        "scope": "out_of_scope",
        "scope_reason": "Adressat = Marktüberwachungsbehörden/Kommission (Schutzmaßnahmen, Schutzklauselverfahren); keine Hersteller-Handlungspflicht. Präzedenz CSIRT/ENISA.",
    },
    "notified_body_requirements": {
        "scope": "derived_obligation",
        "scope_reason": "Norm adressiert primär die notifizierte Stelle (Unabhängigkeit/Kompetenz/Unparteilichkeit), erzeugt aber mittelbare Hersteller-Pflichten: notifizierte Stelle einbeziehen, erforderliche Unterlagen bereitstellen, Konformitätsbewertung korrekt durchführen.",
        "scope_split_candidate": True,
        "scope_split_note": "Kandidat für spätere Aufspaltung: 'Normadressat' (Anforderungen AN die notifizierte Stelle = institutional/out_of_scope) ↔ 'abgeleitete Herstellerpflicht' (NB einbeziehen + Unterlagen + Konformitätsbewertung = in_scope). NICHT vorzeitig festziehen.",
    },
}


def main() -> None:
    applied = []
    for f in sorted(glob.glob("obligations/cra*.json")):
        d = json.load(open(f, encoding="utf-8"))
        changed = False
        for o in d.get("obligations", []):
            spec = SCOPE.get(o.get("id"))
            if spec:
                o.update(spec)
                applied.append((o["id"], spec["scope"]))
                changed = True
        # Rewrite a file only if at least one obligation in it was classified.
        if changed:
            json.dump(d, open(f, "w", encoding="utf-8"), ensure_ascii=False, indent=1)
    for oid, sc in applied:
        print(f" {oid:32} scope={sc}")
    print(f"angewendet: {len(applied)} (erwartet 3)")


if __name__ == "__main__":
    main()
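The docstring states that `in_scope` is the implicit default and that `derived_obligation` entries stay in the manufacturer set. A consumer of the obligation files is not shown in this diff; as a minimal sketch under that reading, a filter could treat a missing `scope` as `in_scope` and drop only `out_of_scope`. The function name `manufacturer_set` and the `hypothetical_security_updates` entry are assumptions for illustration.

```python
def manufacturer_set(obligations: list[dict]) -> list[dict]:
    """Keep everything except obligations explicitly marked out_of_scope.

    A missing `scope` key is read as the implicit default `in_scope`.
    """
    return [o for o in obligations if o.get("scope", "in_scope") != "out_of_scope"]


sample = [
    {"id": "sanctions", "scope": "out_of_scope"},
    {"id": "market_surveillance_safeguard", "scope": "out_of_scope"},
    {"id": "notified_body_requirements", "scope": "derived_obligation"},
    {"id": "hypothetical_security_updates"},  # no scope key: implicit in_scope
]

print([o["id"] for o in manufacturer_set(sample)])
```

Filtering on "not out_of_scope" rather than on "equals in_scope" is what keeps both the unlabelled obligations and the derived ones, in line with the principle of not discarding knowledge too early.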

Some files were not shown because too many files have changed in this diff