fix: Correct Ollama model name + strict blank-line heading detection
Build + Deploy / build-admin-compliance (push) Failing after 48s
Build + Deploy / build-backend-compliance (push) Successful in 9s
Build + Deploy / build-ai-sdk (push) Successful in 8s
CI / loc-budget (push) Failing after 17s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Failing after 2m3s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
Build + Deploy / build-developer-portal (push) Successful in 9s
Build + Deploy / build-tts (push) Successful in 7s
Build + Deploy / build-document-crawler (push) Successful in 8s
Build + Deploy / build-dsms-gateway (push) Successful in 7s
Build + Deploy / build-dsms-node (push) Successful in 7s
CI / branch-name (push) Has been skipped
Build + Deploy / trigger-orca (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / test-go (push) Failing after 45s
CI / test-python-document-crawler (push) Successful in 34s
CI / test-python-dsms-gateway (push) Successful in 27s
CI / validate-canonical-controls (push) Successful in 15s

1. LLM model: qwen3:32b → qwen3.5:35b-a3b (actual model on Mac Mini)
2. Section splitter: headings MUST be preceded by a blank line.
   This prevents cookie table entries ("Funktionale Cookies",
   "Session Cookies") from splitting the cookie section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Benjamin Admin
2026-05-07 15:53:53 +02:00
parent 1cc0c3d34a
commit f51671737a
2 changed files with 4 additions and 5 deletions
@@ -346,10 +346,9 @@ def _split_into_sections(text: str, parent_label: str, url: str) -> list[dict]:
and not stripped.endswith(".")
and not stripped.endswith(",")
and stripped[0].isupper()
# Require preceding blank line OR line > 15 chars to avoid
# table column headers ("Funktion", "Speicherdauer") being
# treated as section headings
and (prev_blank or len(stripped) > 15)
# Require preceding blank line to distinguish real headings
# from table content ("Funktionale Cookies", "Session Cookies")
and prev_blank
)
is_skip = is_heading and stripped.lower().strip() in SKIP_HEADINGS
@@ -15,7 +15,7 @@ import httpx
logger = logging.getLogger(__name__)
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://host.docker.internal:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_VERIFY_MODEL", "qwen3:32b")
OLLAMA_MODEL = os.getenv("OLLAMA_VERIFY_MODEL", "qwen3.5:35b-a3b")
TIMEOUT = 30.0