The floating Compliance Advisor was broken on prod (502): RAG went through
rag-service:8097 (not available on prod) and chat went through
OLLAMA_URL=ollama-embed (embedding-only, no qwen2.5vl).
- RAG now goes through the ai-compliance-sdk endpoint /sdk/v1/rag/search
(bge-m3, reachable on prod) instead of rag-service -> benefits from the
richer embedding. (lib/sdk/agents/advisor-rag.ts)
- LLM cascade: OVH/LiteLLM (gpt-oss-120b) first, Ollama as dev fallback.
(lib/sdk/agents/advisor-llm.ts; OVH env via orca-infra admin block)
- ai-sdk: added bp_compliance_recht to AllowedCollections (the whitelist was
inconsistent: the error message already listed it as allowed).
- Route switched over to the modules (now thin); controls augmentation
unchanged.
- Tests: advisor-rag + advisor-llm.
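
The LLM cascade described above can be pictured roughly as follows. This is
only a sketch under assumptions: the names ChatProvider, NamedProvider and
chatWithCascade are hypothetical and not taken from advisor-llm.ts, and the
real module may handle errors or timeouts differently.

```typescript
// Hypothetical provider shape; the real advisor-llm.ts may look different.
type ChatProvider = (prompt: string) => Promise<string>;

interface NamedProvider {
  name: string;
  call: ChatProvider;
}

// Try each provider in order and return the first successful answer.
// If every provider fails, throw one error that lists all failures.
async function chatWithCascade(
  providers: NamedProvider[],
  prompt: string,
): Promise<{ provider: string; answer: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const answer = await p.call(prompt);
      return { provider: p.name, answer };
    } catch (err) {
      const msg = err instanceof Error ? err.message : String(err);
      errors.push(`${p.name}: ${msg}`);
    }
  }
  throw new Error(`all LLM providers failed: ${errors.join("; ")}`);
}
```

With the order [OVH/LiteLLM, Ollama], the Ollama entry is only called when
the first provider rejects.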
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Prompt-augments the RAG-only advisor with the shared use-case->controls API:
deterministic topic detection -> local controls API -> context block, so the
agent can answer from real Control-IDs. 100% local at runtime (no Anthropic).
NOT pushed/deployed: the shared API currently returns MASTER-grain controls,
whose composition is broken (gpre2 object-only clustering -> mega-clusters).
Pending the atom-grain rework of the API. tsc + vitest green.
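
A minimal sketch of the "topic detection -> context block" step, assuming a
plain keyword table. The topic names, keywords, the Control shape and both
function names are illustrative assumptions, not the shared API's actual
schema.

```typescript
// Hypothetical topic table; topics and keywords are illustrative only.
const TOPIC_KEYWORDS: Record<string, string[]> = {
  "access-control": ["password", "login", "mfa"],
  "data-retention": ["retention", "deletion", "erasure"],
};

// Assumed minimal shape of a control returned by the controls API.
interface Control {
  id: string;
  title: string;
}

// Deterministic detection: a topic matches when any of its keywords
// occurs in the lower-cased question. Result is sorted for stability.
function detectTopics(question: string): string[] {
  const q = question.toLowerCase();
  return Object.entries(TOPIC_KEYWORDS)
    .filter(([, keywords]) => keywords.some((kw) => q.includes(kw)))
    .map(([topic]) => topic)
    .sort();
}

// Render controls as a plain-text block that can be appended to the prompt.
// Returns an empty string when there is nothing to add.
function buildControlsContext(controls: Control[]): string {
  if (controls.length === 0) return "";
  const lines = controls.map((c) => `- ${c.id}: ${c.title}`);
  return ["Relevant controls:", ...lines].join("\n");
}
```

Because detection is a pure string match, the same question always maps to
the same topics, with no model call involved.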
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Ollama unloads the 35b model after 5 min of idle time → every question
after that starts it cold (model load) and runs into the frontend timeout
("Load failed"). keep_alive='30m' in the chat request keeps it warm.
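
For illustration, a request body for Ollama's /api/chat with keep_alive set
could be built like this. The helper name buildChatRequest and the
ChatMessage type are assumptions; the model name is passed in rather than
guessed.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the JSON body for a non-streaming Ollama /api/chat call.
// keep_alive tells Ollama how long to keep the model loaded after the
// request finishes; "30m" overrides the 5 minute default.
function buildChatRequest(model: string, messages: ChatMessage[]) {
  return {
    model,
    messages,
    stream: false,
    keep_alive: "30m",
  };
}
```

The body would then be sent as JSON via POST to `${OLLAMA_URL}/api/chat`.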
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Compliance Advisor, Drafting Agent and Validator did not respond because
qwen3.5 runs in thinking mode by default (internal chain-of-thought
exceeds the 2 min timeout). None of the agents needs thinking mode: all
tasks are chat/text generation/JSON validation without deep reasoning.
think:false yields direct, fast responses.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Increase num_predict from 2048 to 8192 to prevent mid-sentence cutoff
- Add "Quellenschutz" (source protection) rules to system prompt: agent
refuses to list all available sources/collections and only reveals the
sources used in answers
- Remove internal collection names from RAG context sent to LLM
- Agent confirms knowledge on specific topics but refuses meta-queries
like "what sources do you have?"
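
Stripping collection names from the RAG context might look like the sketch
below. The RagHit shape and the function name formatRagContext are
assumptions for illustration; the actual hit fields may differ.

```typescript
// Assumed shape of a RAG hit; field names are illustrative.
interface RagHit {
  collection: string; // internal collection name, must not reach the LLM
  source: string;     // human-readable source title
  text: string;
}

// Format hits for the prompt. Only the numbered source title and the text
// are emitted; the internal collection name is deliberately left out.
function formatRagContext(hits: RagHit[]): string {
  return hits
    .map((h, i) => `[${i + 1}] ${h.source}\n${h.text}`)
    .join("\n\n");
}
```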
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Ingestion script: Add 3 new PDFs (IFRS DE/EN, EFRAG Endorsement Status)
to ingest-industry-compliance.sh (7 → 10 documents total)
- System prompt: Add EU-IFRS and EFRAG to competence area, add mandatory
IFRS endorsement warning section for all IFRS/IAS queries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace single DSFA corpus query with parallel search across 6 collections
via RAG service (port 8097)
- Add country parameter with metadata filter for bp_compliance_gesetze
- Add country-specific system prompt section
- Add DE/AT/CH/EU toggle buttons in ComplianceAdvisorWidget header
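
A rough sketch of the parallel search with the country filter, assuming an
injected search function. SearchFn, Hit and searchAll are hypothetical
names, the "country" filter key is an assumed metadata field, and
swallowing per-collection errors is a design choice of this sketch, not
necessarily what the route does.

```typescript
type Country = "DE" | "AT" | "CH" | "EU";

interface Hit {
  collection: string;
  text: string;
  score: number;
}

// Hypothetical search signature; the real RAG client may differ.
type SearchFn = (
  collection: string,
  query: string,
  filter?: Record<string, string>,
) => Promise<Hit[]>;

// Query all collections in parallel. The country metadata filter is only
// applied to bp_compliance_gesetze. A failing collection contributes no
// hits instead of failing the whole search.
async function searchAll(
  search: SearchFn,
  collections: string[],
  query: string,
  country?: Country,
): Promise<Hit[]> {
  const perCollection = await Promise.all(
    collections.map((c) => {
      const filter =
        country && c === "bp_compliance_gesetze" ? { country } : undefined;
      return search(c, query, filter).catch(() => [] as Hit[]);
    }),
  );
  const merged = ([] as Hit[]).concat(...perCollection);
  // Highest score first.
  return merged.sort((a, b) => b.score - a.score);
}
```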
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Reduce chat history from 10 to 6 messages to fit context window
- Lower num_predict from 8192 to 2048 for faster responses
- Add Training module link to SDK sidebar navigation
- Add snake_case to camelCase key transformation for reporting API
(Go backend returns snake_case, TypeScript expects camelCase)
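
The key transformation can be sketched as a small recursive helper. The
function names snakeToCamel and camelizeKeys are assumptions, and the
sketch only targets plain JSON payloads (objects, arrays, primitives).

```typescript
// Convert one snake_case key to camelCase, e.g. "created_at" -> "createdAt".
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_m, ch: string) => ch.toUpperCase());
}

// Recursively rename the keys of a parsed JSON value.
// Arrays are mapped element by element; primitives are returned unchanged.
function camelizeKeys(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(camelizeKeys);
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[snakeToCamel(k)] = camelizeKeys(v);
    }
    return out;
  }
  return value;
}
```

Applied once to the parsed response, this keeps the conversion out of the
individual components.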
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>