Files
breakpilot-compliance/consent-tester
Benjamin Admin 1792c6f896 fix(consent-tester): capture CMP JSON to extract dynamically-loaded cookie policies
BMW (and other big enterprise sites) do NOT render cookie policies as
static HTML. Their widget loads structured data from a JSON endpoint
(BMW: ePaaS at /epaas/prod/policypage/.../<locale>.epaas.json) and
renders it client-side after consent. Our DOM extraction therefore only
captured site navigation (603 words of header/footer chrome), not the
actual policy.

New module consent-tester/services/cmp_extractor.py:
- CMPCapture: response listener that catches policy JSON during navigation
- Reconstructors for ePaaS (BMW) + OneTrust placeholder
- Returns Cookie-Richtlinie text built from policyPageMetadata +
  categories + providers (BMW: 1673 words reconstructed vs. 603 noise)

dsi_discovery.py:
- Attach CMPCapture before page.goto
- After self-extraction: if rendered DOM < 300 words AND CMP captured a
  payload, prefer the CMP-reconstructed text. This bypasses the empty
  '.cookie-policy' div problem entirely.
2026-05-16 20:50:15 +02:00
..