feat: Backlog 1-5 — soft-hints, chatbot-discovery, API-payload, LLM-Agent

Five backlog items from the multi-site briefing, delivered in one sprint:

1. B13 B2C soft hints: insurance/tariff/booking markers
   _B2C_WEAK extended with "Reiseversicherung", "Tarifrechner",
   "Online-Antrag", "Flug buchen", "Stromtarif", etc.
   Catches the Allianz travel chatbot (previously a false negative).

2. Chatbot policy discovery (chatbot_policy_discovery.py)
   Probes 14 standard slugs (privacypolicychatbot, chatbot-datenschutz,
   ai-policy, ki-datenschutz, ...) × 5 language prefixes on every
   submitted origin. Successful findings of more than 300 words are
   merged into doc_texts['dse']. Audit trail via
   doc_entries[dse].chatbot_policy_sources.
   Closes the Westfield iAdvize gap.

3. API response payload extended
   phase_f_persist.response now also carries extra_findings,
   audit_walk, and html_blocks. B-wiring output (B1, B3-B18) is no
   longer hidden only in the mail HTML; external callers see every
   finding. The schema change is additive, so legacy clients ignore
   the new fields.

4. Plausibility LLM empty-response fix
   Resilience strategy A→B→C→D:
   A) format='json' (strict, default)
   B) format='' (loose, _try_extract_json with support for ```json
      fences and prose-wrapped output)
   C) Split-batch recursion (already present)
   D) Give up, return an empty dict (callers treat it as skipped)
   Also adds _post_llm() as an isolated LLM call helper that catches
   network errors.

5. Specialist agents phase 2 LLM (MVP): Impressum agent
   impressum_agent_llm.py: qwen3:30b-a3b with a § 5 TMG system prompt
   and business_scope hints from profile_dict. The output uses the
   same schema as the pattern agent, so results merge without an API
   break.
   _b18_wiring.py orchestrates both agents, deduplicates by
   field_id, and renders a purple V2 block with KB/LLM tags per finding.
   Dedup is pattern-first (deterministic and stable).
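
Illustration for item 5: a minimal sketch of what a pattern-first dedup
by field_id could look like. This is not the code in _b18_wiring.py; the
function name, the "source" key, and the sample findings are hypothetical,
and it assumes that the "KB" tag marks pattern-agent findings.

```python
def dedup_pattern_first(pattern_findings: list[dict],
                        llm_findings: list[dict]) -> list[dict]:
    """Merge two finding lists, keeping the first finding per field_id.

    Pattern findings are visited first, so they win on conflicts.
    Python dicts preserve insertion order, so the result order is stable.
    """
    merged: dict[str, dict] = {}
    # Hypothetical tags: "KB" for the pattern agent, "LLM" for the LLM agent.
    for source, findings in (("KB", pattern_findings), ("LLM", llm_findings)):
        for finding in findings:
            field_id = finding["field_id"]
            if field_id not in merged:
                merged[field_id] = {**finding, "source": source}
    return list(merged.values())


# Hypothetical sample data: both agents report on "address".
pattern = [{"field_id": "address", "status": "missing"}]
llm = [
    {"field_id": "address", "status": "ok"},
    {"field_id": "vat_id", "status": "missing"},
]
merged = dedup_pattern_first(pattern, llm)
```

Because the pattern list is iterated first, a field reported by both
agents keeps the pattern result, and the LLM only contributes fields the
pattern agent did not cover.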

Tests: 107/107 green (7 test suites + chatbot-discovery + b18).
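
Illustration for item 2: a rough sketch of how the slug × language-prefix
candidates and the 300-word threshold could be expressed. This is not the
code in chatbot_policy_discovery.py. Only the four slugs named above come
from this commit; the language prefixes (including the empty prefix), the
helper names, and example.com are assumptions.

```python
from urllib.parse import urljoin

# Four of the 14 slugs are named in the commit message; the rest are omitted.
SLUGS = ["privacypolicychatbot", "chatbot-datenschutz",
         "ai-policy", "ki-datenschutz"]
# Hypothetical prefixes; the commit only states that there are five.
LANG_PREFIXES = ["", "de", "en", "fr", "it"]


def candidate_urls(origin: str,
                   slugs: list[str] = SLUGS,
                   prefixes: list[str] = LANG_PREFIXES) -> list[str]:
    """Build every prefix/slug combination below the given origin."""
    base = origin.rstrip("/") + "/"
    urls = []
    for prefix in prefixes:
        for slug in slugs:
            path = f"{prefix}/{slug}" if prefix else slug
            urls.append(urljoin(base, path))
    return urls


def is_substantial(text: str, min_words: int = 300) -> bool:
    """True if the text has more than min_words whitespace-separated words."""
    return len(text.split()) > min_words


urls = candidate_urls("https://example.com")
```

With the full slug list this would yield 14 × 5 = 70 candidates per
origin; only pages that pass the word-count check would be merged into
doc_texts['dse'].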

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-06-07 18:41:54 +02:00
parent a2cae94526
commit e8ff75cbfe
11 changed files with 832 additions and 34 deletions
@@ -132,54 +132,102 @@ def _build_user_prompt(items: list[dict], doc_title: str,
     )


+async def _post_llm(body: dict) -> str:
+    """One LLM call. Returns content string or empty on failure.
+    Catches network errors so the caller can decide fallback strategy."""
+    try:
+        async with httpx.AsyncClient(timeout=TIMEOUT) as c:
+            r = await c.post(f"{OLLAMA_URL}/api/chat", json=body)
+            r.raise_for_status()
+            return (r.json().get("message") or {}).get("content", "") or ""
+    except Exception as e:
+        logger.warning("plausibility LLM call failed: %s", e)
+        return ""
+
+
+def _try_extract_json(content: str) -> dict | None:
+    """Extract a JSON object from free-form LLM output. Handles
+    markdown-fenced and prose-wrapped responses."""
+    if not content:
+        return None
+    s = content.strip()
+    # Strip ```json … ``` fences
+    if s.startswith("```"):
+        s = s.strip("`")
+        if s.lower().startswith("json"):
+            s = s[4:]
+        s = s.strip()
+    # Heuristic: cut from first { to last }
+    first = s.find("{")
+    last = s.rfind("}")
+    if first >= 0 and last > first:
+        s = s[first:last + 1]
+    try:
+        return json.loads(s)
+    except Exception:
+        return None
+
+
 async def _ask_llm_batch(items: list[dict], doc_title: str,
                          doc_excerpt: str) -> dict[str, dict]:
-    """Send a batch of up to BATCH_SIZE findings to the LLM."""
-    body = {
+    """Send a batch of up to BATCH_SIZE findings to the LLM.
+
+    Resilience strategy (P125 fix for empty-response bug):
+      A. format='json' (strict) — current default
+      B. If A returns empty: format='' (loose), extract JSON manually
+      C. If B also empty AND batch >2: split batch + recurse
+      D. Else: give up, return {} (callers stamp llm_skipped=true)
+    """
+    user_prompt = _build_user_prompt(items, doc_title, doc_excerpt)
+    base_body = {
         "model": MODEL,
         "messages": [
             {"role": "system", "content": _SYSTEM_PROMPT},
-            {"role": "user", "content": _build_user_prompt(
-                items, doc_title, doc_excerpt,
-            )},
+            {"role": "user", "content": user_prompt},
         ],
-        "format": "json",
         "stream": False,
         "options": {"temperature": 0.0, "seed": 42, "num_predict": 1500},
     }
     out: dict[str, dict] = {}
     input_ids = [it["id"] for it in items]
     try:
-        async with httpx.AsyncClient(timeout=TIMEOUT) as c:
-            r = await c.post(f"{OLLAMA_URL}/api/chat", json=body)
-            r.raise_for_status()
-            content = (r.json().get("message") or {}).get("content", "")
-            if not content:
-                # Single retry with smaller batch — qwen3 sometimes
-                # rejects ≥6-item prompts under format='json'.
-                if len(items) > 2:
-                    half = len(items) // 2
-                    logger.info(
-                        "plausibility empty → retry split %d%dx2",
-                        len(items), half,
-                    )
-                    first = await _ask_llm_batch(
-                        items[:half], doc_title, doc_excerpt,
-                    )
-                    second = await _ask_llm_batch(
-                        items[half:], doc_title, doc_excerpt,
-                    )
-                    out.update(first)
-                    out.update(second)
-                    return out
-                logger.warning("plausibility LLM returned empty content")
+        # Strategy A: format='json'
+        content = await _post_llm({**base_body, "format": "json"})
+        if not content:
+            # Strategy B: format-free, parse-on-our-side
+            logger.info(
+                "plausibility A→empty, trying B (format-free) batch=%d",
+                len(items),
+            )
+            content = await _post_llm(base_body)
+        if not content:
+            # Strategy C: split + recurse
+            if len(items) > 2:
+                half = len(items) // 2
+                logger.info(
+                    "plausibility A+B empty → split %d%dx2",
+                    len(items), half,
+                )
+                first = await _ask_llm_batch(
+                    items[:half], doc_title, doc_excerpt,
+                )
+                second = await _ask_llm_batch(
+                    items[half:], doc_title, doc_excerpt,
+                )
+                out.update(first)
+                out.update(second)
                 return out
-        try:
-            data = json.loads(content)
-        except json.JSONDecodeError as je:
+            # Strategy D: give up
+            logger.warning(
+                "plausibility gave up after A+B for batch=%d", len(items),
+            )
+            return out
+        data = _try_extract_json(content)
+        if data is None:
             logger.warning(
-                "plausibility LLM JSON parse failed: %s; raw=%s",
-                je, content[:300],
+                "plausibility LLM JSON parse failed (after fallback); "
+                "raw=%s", content[:300],
             )
             return out
         llm_findings = data.get("findings") or []