Switch Vision-LLM Fusion to llama3.2-vision:11b
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m35s
CI / test-python-agent-core (push) Successful in 19s
CI / test-nodejs-website (push) Successful in 28s

qwen2.5vl:32b needs ~100GB RAM and crashes Ollama.
llama3.2-vision:11b is already installed and fits in memory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-04-24 00:44:59 +02:00
parent 5fbf0f4ee2
commit 7fc5464df7

@@ -22,7 +22,7 @@ import numpy as np
 logger = logging.getLogger(__name__)
 OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://host.docker.internal:11434")
-OLLAMA_HTR_MODEL = os.getenv("OLLAMA_HTR_MODEL", "qwen2.5vl:32b")
+VISION_FUSION_MODEL = os.getenv("VISION_FUSION_MODEL", "llama3.2-vision:11b")
 # Document category → prompt context
 CATEGORY_PROMPTS: Dict[str, Dict[str, str]] = {
@@ -225,7 +225,7 @@ async def vision_fuse_ocr(
     resp = await client.post(
         f"{OLLAMA_BASE_URL}/api/generate",
         json={
-            "model": OLLAMA_HTR_MODEL,
+            "model": VISION_FUSION_MODEL,
             "prompt": prompt,
             "images": [img_b64],
             "stream": False,
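
The renamed env var and request payload from the diff can be sketched in isolation like this. It mirrors the `VISION_FUSION_MODEL` fallback and the `/api/generate` JSON body shown above; `build_generate_request` is a hypothetical helper for illustration, not a function from the actual module, and no request is sent here.

```python
import base64
import os

# Same env-var defaults as the diff: llama3.2-vision:11b fits in memory,
# unlike qwen2.5vl:32b which needed ~100GB RAM.
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://host.docker.internal:11434")
VISION_FUSION_MODEL = os.getenv("VISION_FUSION_MODEL", "llama3.2-vision:11b")


def build_generate_request(prompt: str, image_bytes: bytes) -> tuple[str, dict]:
    """Build the URL and JSON payload for a non-streaming Ollama vision call.

    Hypothetical helper: Ollama's /api/generate expects images as a list of
    base64-encoded strings, and "stream": False returns one JSON response.
    """
    img_b64 = base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": VISION_FUSION_MODEL,
        "prompt": prompt,
        "images": [img_b64],
        "stream": False,
    }
    return f"{OLLAMA_BASE_URL}/api/generate", payload
```

Keeping model selection behind an env var means the larger model can still be re-enabled per deployment (e.g. `VISION_FUSION_MODEL=qwen2.5vl:32b`) on hosts with enough RAM, without another code change.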