17a93bc694
Previous threshold (DOM < 300 words) missed the BMW case where Playwright extracted 346 words of pure site navigation. The CMP JSON had 1673 words of real policy content but was discarded. New heuristic: prefer CMP when ANY of: - DOM < 300 words (existing) - CMP text >= 1000 words (authoritative at scale) - CMP text >1.5x longer than DOM