Files
breakpilot-compliance/consent-tester
Benjamin Admin 9814b56f2f fix(cookie-extract): max_documents=1 + faster networkidle bail (Phase 0 fix)
Root cause of the recurring 603-word BMW result:
- DSI discovery for cookie-policy URL was hitting 4x networkidle timeouts
  (60s each = ~240s total).
- Backend httpx timeout (180s after the previous fix) gave up before the
  consent-tester finished, falling through to the raw HTTP fetch which
  returned BMWs SSR navigation chrome (603 words) as the 'cookie policy'.

Two orthogonal fixes:
1. _fetch_text now passes max_documents=1 for user-specified URLs. We only
   want self-extraction of THAT page; link-following is unnecessary noise.
2. networkidle wait_until window dropped 60s -> 15s. SPAs like BMW/Daimler
   never reach networkidle anyway; the 60s wait was pure latency. Falls
   through to domcontentloaded+5s render-wait, same as before.
2026-05-16 22:53:23 +02:00
..