Files
breakpilot-compliance/consent-tester/services
Benjamin Admin 17a93bc694 fix(consent-tester): prefer CMP-JSON over thin DOM extraction
Previous threshold (DOM < 300 words) missed the BMW case where Playwright
extracted 346 words of pure site navigation. The CMP JSON had 1673 words
of real policy content but was discarded.

New heuristic: prefer CMP when ANY of:
  - DOM < 300 words (existing)
  - CMP text >= 1000 words (authoritative at scale)
  - CMP text >1.5x longer than DOM
2026-05-16 20:56:11 +02:00
..