fix(llm): qwen3.5 think:false + num_ctx 8192 in all chat/draft routes
All checks were successful
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-ai-compliance (push) Successful in 35s
CI / test-python-backend-compliance (push) Successful in 31s
CI / test-python-document-crawler (push) Successful in 22s
CI / test-python-dsms-gateway (push) Successful in 18s

Compliance Advisor, Drafting Agent, and Validator stopped responding
because qwen3.5 runs in thinking mode by default (the internal chain-of-
thought exceeds the 2-minute timeout). None of the agents needs thinking
mode: all tasks are chat, text generation, or JSON validation without
deep reasoning. think:false yields direct, fast responses.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Benjamin Admin
2026-03-06 08:35:53 +01:00
parent adc95267bd
commit 960b8e757c
4 changed files with 10 additions and 3 deletions


@@ -88,9 +88,11 @@ export async function POST(request: NextRequest) {
       model: LLM_MODEL,
       messages,
       stream: true,
+      think: false,
       options: {
         temperature: mode === 'draft' ? 0.2 : 0.3,
         num_predict: mode === 'draft' ? 16384 : 8192,
+        num_ctx: 8192,
       },
     }),
     signal: AbortSignal.timeout(120000),
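The hunk above adds two fields to the Ollama /api/chat request body: `think: false` to suppress qwen3.5's default thinking mode, and `options.num_ctx: 8192` to pin the context window. A minimal sketch of the resulting payload builder (the `buildOllamaRequest` helper, `Mode` type, and `LLM_MODEL` value are illustrative assumptions, not code taken verbatim from the repository):

```typescript
// Illustrative model name; the real value comes from the route's LLM_MODEL constant.
const LLM_MODEL = 'qwen3.5';

type Mode = 'draft' | 'chat';

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Builds the JSON body sent to Ollama's /api/chat endpoint.
function buildOllamaRequest(mode: Mode, messages: ChatMessage[]) {
  return {
    model: LLM_MODEL,
    messages,
    stream: true,
    // Disable the model's default thinking mode so the response starts
    // immediately instead of spending minutes on internal chain-of-thought.
    think: false,
    options: {
      temperature: mode === 'draft' ? 0.2 : 0.3,
      num_predict: mode === 'draft' ? 16384 : 8192,
      num_ctx: 8192, // fixed context window across all chat/draft routes
    },
  };
}
```

The 120-second `AbortSignal.timeout(120000)` on the fetch call stays in place as a safety net; with thinking disabled, responses should complete well inside that window.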