fix(ocr-pipeline): increase LLM timeout to 300s and disable qwen3 thinking

- Add /no_think tag to prompt (qwen3 thinking mode causes massive slowdown)
- Increase httpx timeout from 120s to 300s for large vocab tables
- Improve error logging with traceback and exception type
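The diff below does not show the logging change itself. A minimal sketch of what "traceback and exception type" logging typically looks like, assuming a module-level logger named `ocr_pipeline` and a hypothetical `log_llm_error` helper (neither name appears in the diff):

```python
import logging
import traceback

logger = logging.getLogger("ocr_pipeline")

def log_llm_error(exc: Exception) -> str:
    """Build an enriched error message: exception type, message, and
    the full traceback, rather than just str(exc)."""
    msg = (
        f"LLM correction failed ({type(exc).__name__}): {exc}\n"
        f"{traceback.format_exc()}"
    )
    logger.error(msg)
    return msg

# Simulated failure, standing in for a timed-out Ollama request:
try:
    raise TimeoutError("read timeout after 300s")
except TimeoutError as exc:
    message = log_llm_error(exc)
```

Logging `type(exc).__name__` makes it easy to distinguish an httpx timeout from, say, a JSON parse error in the model's response.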

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Benjamin Admin
2026-03-02 11:31:03 +01:00
parent 938d1d69cf
commit e171a736e7
2 changed files with 6 additions and 3 deletions


@@ -4354,11 +4354,13 @@ Antworte NUR mit dem korrigierten JSON-Array. Kein erklaerener Text.
 Fuer jeden Eintrag den du aenderst, setze "corrected": true.
 Fuer unveraenderte Eintraege setze "corrected": false.
+/no_think
 Eingabe:
 {_json.dumps(table_lines, ensure_ascii=False, indent=2)}"""
 t0 = time.time()
-async with httpx.AsyncClient(timeout=120.0) as client:
+async with httpx.AsyncClient(timeout=300.0) as client:
 resp = await client.post(
 f"{_OLLAMA_URL}/api/chat",
 json={
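The `/no_think` tag rides along inside the prompt text itself. A minimal sketch of how the request payload is assembled for Ollama's `/api/chat` endpoint; the model name `qwen3`, the default endpoint URL, and the `stream` flag are assumptions, not taken from the diff:

```python
_OLLAMA_URL = "http://localhost:11434"  # assumed default Ollama endpoint

def build_chat_payload(prompt: str, model: str = "qwen3") -> dict:
    """Payload for Ollama's /api/chat. Appending '/no_think' to the prompt
    disables qwen3's thinking mode, which the commit found caused a
    massive slowdown on large vocab tables."""
    return {
        "model": model,  # assumed model name
        "messages": [{"role": "user", "content": f"{prompt}\n/no_think"}],
        "stream": False,  # assumed: wait for the full response
    }

payload = build_chat_payload("Korrigiere das folgende JSON-Array ...")
```

With thinking disabled, the 300s client timeout mainly has to cover raw generation time for the corrected JSON array rather than a long hidden reasoning phase.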