fix(drafting): repair drafting engine on prod: RAG via ai-sdk + OVH LLM cascade

The drafting engine (document drafting, v2 pipeline, validation, drafting chat, vendor contract review) was broken on prod in two ways:
- RAG went through bp-core-rag-service:8097, which does not exist on prod.
- LLM calls went through OLLAMA_URL/api/chat with qwen2.5vl, but prod runs ollama-embed, which has no chat model.

Fix (analogous to the Compliance Advisor):
- rag-query.ts -> ai-compliance-sdk /sdk/v1/rag/search (bge-m3, reachable on prod).
- New lib/sdk/drafting-engine/llm-cascade.ts: OVH/LiteLLM (gpt-oss-120b) first, Ollama as dev fallback; cascadeComplete (JSON) + cascadeStream. The backend already uses OVH+JSON successfully on prod (extract-datasheet).
- 5 call sites (draft-helpers, draft-helpers-v2, validate, chat, vendor-review) switched to the cascade; no more direct Ollama calls.
- Tests: llm-cascade + rag-query updated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
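
For readers unfamiliar with the pattern, the "primary provider first, fallback second" idea behind the cascade can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of llm-cascade.ts: the `Provider` type, the `cascadeCompleteSketch` function, and the error format are all hypothetical names chosen for this example, and the real module may handle streaming, JSON mode, and timeouts differently.

```typescript
// Hypothetical sketch of a provider cascade. None of these names come from
// the actual llm-cascade.ts; they are assumptions for illustration only.
type Provider = {
  name: string
  // Resolves with the completion text, or rejects if the provider is unavailable.
  complete: (prompt: string) => Promise<string>
}

// Tries each provider in order and returns the first successful completion.
// Rejects only when every provider in the list has failed.
async function cascadeCompleteSketch(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = []
  for (const p of providers) {
    try {
      const text = await p.complete(prompt)
      return { provider: p.name, text }
    } catch (err) {
      // Record the failure and fall through to the next provider.
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`)
    }
  }
  throw new Error(`all providers failed: ${errors.join('; ')}`)
}
```

In a setup like the one described above, the list would presumably hold the OVH/LiteLLM provider first and the Ollama provider second, so that a dev machine without OVH access still gets a completion while prod never reaches the fallback.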
2026-06-19 10:02:06 +02:00
parent cd3e0b15ad
commit 90a70c8404
9 changed files with 398 additions and 203 deletions
@@ -1,5 +1,5 @@
 /**
- * Tests for the shared queryRAG utility.
+ * Tests for the shared queryRAG utility (ai-sdk /sdk/v1/rag/search, bge-m3).
 */

 import { describe, it, expect, beforeEach, vi } from 'vitest'
@@ -19,13 +19,13 @@ describe('queryRAG', () => {
    queryRAG = mod.queryRAG
  })

-  it('should return formatted results on success', async () => {
+  it('should return formatted results on success (ai-sdk shape)', async () => {
    mockFetch.mockResolvedValueOnce({
      ok: true,
      json: async () => ({
        results: [
-          { source_name: 'DSGVO', content: 'Art. 35 regelt die DSFA...' },
-          { source_code: 'EU_2016_679', content: 'Risikobewertung erforderlich' },
+          { text: 'Art. 35 regelt die DSFA...', regulation_short: 'DSGVO' },
+          { text: 'Risikobewertung erforderlich', regulation_code: 'EU_2016_679' },
        ],
      }),
    })
@@ -38,7 +38,7 @@ describe('queryRAG', () => {
    expect(mockFetch).toHaveBeenCalledTimes(1)
  })

-  it('should send POST request to RAG_SERVICE_URL', async () => {
+  it('should POST to the ai-sdk /sdk/v1/rag/search endpoint', async () => {
    mockFetch.mockResolvedValueOnce({
      ok: true,
      json: async () => ({ results: [] }),
@@ -47,10 +47,10 @@ describe('queryRAG', () => {
    await queryRAG('test query')

    expect(mockFetch).toHaveBeenCalledWith(
-      expect.stringContaining('/api/v1/search'),
+      expect.stringContaining('/sdk/v1/rag/search'),
      expect.objectContaining({
        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
+        headers: expect.objectContaining({ 'Content-Type': 'application/json' }),
      })
    )
  })
@@ -99,43 +99,24 @@ describe('queryRAG', () => {
  })

  it('should return empty string on HTTP error', async () => {
-    mockFetch.mockResolvedValueOnce({
-      ok: false,
-      status: 500,
-    })
-
-    const result = await queryRAG('test query')
-
-    expect(result).toBe('')
+    mockFetch.mockResolvedValueOnce({ ok: false, status: 500 })
+    expect(await queryRAG('test query')).toBe('')
  })

  it('should return empty string on network error', async () => {
    mockFetch.mockRejectedValueOnce(new Error('Connection refused'))
-
-    const result = await queryRAG('test query')
-
-    expect(result).toBe('')
+    expect(await queryRAG('test query')).toBe('')
  })

  it('should return empty string when no results', async () => {
-    mockFetch.mockResolvedValueOnce({
-      ok: true,
-      json: async () => ({ results: [] }),
-    })
-
-    const result = await queryRAG('test query')
-
-    expect(result).toBe('')
+    mockFetch.mockResolvedValueOnce({ ok: true, json: async () => ({ results: [] }) })
+    expect(await queryRAG('test query')).toBe('')
  })

-  it('should handle results with missing fields gracefully', async () => {
+  it('should handle results with missing source fields gracefully', async () => {
    mockFetch.mockResolvedValueOnce({
      ok: true,
-      json: async () => ({
-        results: [
-          { content: 'Some content without source' },
-        ],
-      }),
+      json: async () => ({ results: [{ text: 'Some content without source' }] }),
    })

    const result = await queryRAG('test')