breakpilot-compliance/consent-tester/requirements.txt at 0a84c747f2bee2d20e79041d12249abbf4b426f5 - breakpilot-compliance - Gitea: Git with a cup of tea

Benjamin_Boenisch/breakpilot-compliance

Files

T

Benjamin Admin 2400aa6a9e feat(consent-tester): Phase C+D — LLM cascade fallback (Qwen → OVH)

New module consent-tester/services/cmp_llm_fallback.py:
- LLMCookieExtractor: single-endpoint adapter (Ollama OR OpenAI-compat)
- LLMCascade: tries Qwen (local Mac Mini Ollama) first; falls through to
  OVH (managed 120B) when Qwen returns no usable strategy
- LLMCascade.from_env(): reads OLLAMA_URL/CMP_LLM_MODEL + OVH_LLM_URL/
  OVH_LLM_KEY/OVH_LLM_MODEL from environment
- LLM returns JSON {strategy: url|selector|text, value: ...}
- Valkey-backed cache per netloc (cmp:hint:<netloc>, 7-day TTL) — next run
  against the same domain skips the LLM entirely

dsi_discovery.py:
- Wired network_log collector (URL/status/content-type/size of every JSON
  response on the page) — passed to LLM prompt as observation
- After Named CMP (Phase B) + Heuristic (Phase A) both fail AND DOM
  < 300 words: invoke LLMCascade.analyze(...)
- _apply_llm_hint executes the LLM's strategy: refetch URL via Playwright
  request context, query DOM selector, or use text directly
- Cache HIT path: apply cached hint, only fall back to LLM if cache is stale

docker-compose.yml:
- consent-tester gets env vars + cmp-data volume (for Phase E)
- All LLM endpoints configurable via env, sensible defaults

consent-tester/requirements.txt:
- redis>=5.0 (asyncio client, Valkey-compatible)
- httpx>=0.27

2026-05-16 23:06:05 +02:00

8 lines

116 B

Plaintext

Raw Blame History

 fastapi==0.115.12
 uvicorn==0.34.2
 playwright==1.52.0
 playwright-stealth==1.0.6
 pydantic>=2.0
 httpx>=0.27
 redis>=5.0