fix(vvt): correct ePaaS schema mapping + category-aware scoring

The first BMW VVT table rendered all 24 providers at a 20% score because
the ePaaS extractor was reading the wrong field names. The actual schema is
nested: providers[].processings[].persistences[], not providers[] alone.

Correct ePaaS schema (verified against bmw.com/epaas/.../de_DE.epaas.json):

  Provider:    {id, name, description, processings[]}
  Processing:  {id, name, description, categoryId, optOutLink,
                privacyPolicyLink, persistences[]}
  Persistence: {id, name, domain, type, expiry, description}

Two structural changes:

1. One row per processing (not per provider). BMW has 26 providers but ~91
   processings spread across them (Adobe alone has ACMProcessing,
   AdobeAnalytics, AdobeCampaign, AdobeTargetAnalytics, AdobeTargetPers.).
   The cookie widget displays each processing separately; VVT now
   mirrors that. Display name format: 'Provider Name — Processing Name'.
2. Read optOutLink/privacyPolicyLink from the PROCESSING (where they live),
   not from the provider. Persistences flatten to cookies[] with name +
   expiry + description.
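
The two structural changes above can be sketched as follows. This is a minimal illustration, not the extractor from this commit: the function name `flatten_epaas`, the sample document, and its URLs and cookie values are hypothetical. The output keys (`opt_out_url`, `privacy_policy_url`, `cookies`, `category`) are chosen to match what `score_vendors` reads in the diff.

```python
from typing import Any


def flatten_epaas(doc: dict) -> list[dict]:
    """Return one row per processing from a nested ePaaS document."""
    rows: list[dict[str, Any]] = []
    for provider in doc.get("providers") or []:
        for proc in provider.get("processings") or []:
            # Persistences flatten to cookies[] with name + expiry + description.
            cookies = [
                {
                    "name": p.get("name"),
                    "expiry": p.get("expiry"),
                    "description": p.get("description"),
                }
                for p in proc.get("persistences") or []
            ]
            rows.append({
                # Display name: provider name, em dash (U+2014), processing name.
                "name": f"{provider.get('name')} \u2014 {proc.get('name')}",
                "category": proc.get("categoryId"),
                # Both links live on the processing, not on the provider.
                "opt_out_url": proc.get("optOutLink"),
                "privacy_policy_url": proc.get("privacyPolicyLink"),
                "cookies": cookies,
            })
    return rows


# Hypothetical sample data shaped like the schema above.
sample = {
    "providers": [{
        "id": "adobe",
        "name": "Adobe",
        "processings": [
            {
                "name": "AdobeAnalytics",
                "categoryId": "statistics",
                "optOutLink": "https://example.com/opt-out",
                "privacyPolicyLink": "https://example.com/privacy",
                "persistences": [{
                    "name": "example_cookie",
                    "expiry": "1 year",
                    "description": "hypothetical",
                }],
            },
            {"name": "AdobeCampaign", "categoryId": "advertising"},
        ],
    }],
}
rows = flatten_epaas(sample)
```

One provider with two processings yields two rows, which is why the row count follows the number of processings rather than the number of providers.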

Plus category mapping:

  advertising       -> marketing
  strictlyNecessary -> necessary
  statistics        -> statistics
  functional        -> functional
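
As a sketch, the mapping above could be held in a lookup table. The names `CATEGORY_MAP` and `map_category` are hypothetical, and passing unknown ids through unchanged is an assumption, not behaviour confirmed by this commit.

```python
from typing import Optional

# ePaaS categoryId -> VVT category, as listed in the commit message.
CATEGORY_MAP = {
    "advertising": "marketing",
    "strictlyNecessary": "necessary",
    "statistics": "statistics",
    "functional": "functional",
}


def map_category(category_id: Optional[str]) -> str:
    """Translate an ePaaS categoryId into a VVT category name."""
    if not category_id:
        return ""
    # Assumption: ids missing from the table are passed through as-is.
    return CATEGORY_MAP.get(category_id, category_id)
```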

Category-aware scoring (cookie_link_validator.score_vendors):

- 'necessary' (technically required cookies, §25 Abs. 2 TDDDG): no opt-out
  required, no country required. Score weight shifts to purpose +
  cookie disclosure (essential cookies must list names + expiry).
- All other categories: opt-out URL still mandatory; a missing opt-out
  flags 'no_opt_out_url' and zeros that block of points.
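
The weight shift can be tabulated from the diff in this commit. The sketch below covers only the blocks visible in the changed hunks; the weight of the name block lies outside those hunks and is deliberately left out, so the sums are not the full `max_score`. The helper name `visible_weights` is hypothetical.

```python
def visible_weights(category: str) -> dict[str, int]:
    """Maximum points per block, for the blocks shown in this diff only."""
    is_necessary = (category or "").lower() in ("necessary", "strictlynecessary")
    weights = {"purpose": 20}
    if not is_necessary:
        # Country and opt-out only count for consent-based categories.
        weights["country"] = 10
        weights["opt_out_url"] = 25
    weights["privacy_policy_url"] = 10 if is_necessary else 15
    weights["cookies"] = 50 if is_necessary else 15
    return weights


necessary_total = sum(visible_weights("necessary").values())  # 20 + 10 + 50
marketing_total = sum(visible_weights("marketing").values())  # 20 + 10 + 25 + 15 + 15
```

For a necessary row, cookie disclosure accounts for 50 of the 80 visible points, whereas for a marketing row the opt-out URL is the largest single block at 25 of 85.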

Expected BMW result after this fix:

- ~91 rows (Adobe Analytics, Adform Retargeting, Akamai Infrastructure,
  AWS, ..., plus ~60 strictlyNecessary processings)
- Marketing rows with present opt-out → ~75-90%
- Necessary rows with cookie+expiry → ~85-95%
- Rows missing fields → still flagged
@@ -173,8 +173,16 @@ async def validate_vendor_urls(vendors: list[dict]) -> list[dict]:
 
 
 def score_vendors(vendors: list[dict]) -> list[dict]:
-    """Compute per-vendor compliance score (0-100) and flags. Mutates."""
+    """Compute per-vendor compliance score (0-100) and flags. Mutates.
+
+    Category-aware: 'necessary' (technisch erforderliche Cookies) do NOT
+    require an opt-out — §25 Abs. 2 TDDDG. Penalising them for that would
+    be wrong; instead we require precise purpose + cookie disclosure.
+    """
     for v in vendors:
+        is_necessary = (v.get("category") or "").lower() in (
+            "necessary", "strictlynecessary",
+        )
         score = 0
         max_score = 0
         flags: list[str] = []
@@ -186,50 +194,56 @@ def score_vendors(vendors: list[dict]) -> list[dict]:
         else:
             flags.append("no_name")
 
-        # Purpose — 15
-        max_score += 15
+        # Purpose — 20
+        max_score += 20
         if v.get("purpose"):
-            score += 15
+            score += 20
         else:
             flags.append("no_purpose")
 
-        # Country (3rd-country transfer relevance) — 10
-        max_score += 10
-        if v.get("country"):
-            score += 10
-        else:
-            flags.append("no_country")
+        # Country (3rd-country transfer relevance) — only relevant for
+        # consent-based categories (otherwise irrelevant flag noise)
+        if not is_necessary:
+            max_score += 10
+            if v.get("country"):
+                score += 10
+            else:
+                flags.append("no_country")
 
-        # Opt-Out URL present + reachable — 25
-        max_score += 25
-        if not v.get("opt_out_url"):
-            flags.append("no_opt_out_url")
-        elif v.get("opt_out_ok") is False:
-            flags.append("broken_opt_out")
-            score += 5  # at least they tried
-        else:
-            score += 25
+        # Opt-Out URL — only for consent-based categories (§25 TDDDG)
+        if not is_necessary:
+            max_score += 25
+            if not v.get("opt_out_url"):
+                flags.append("no_opt_out_url")
+            elif v.get("opt_out_ok") is False:
+                flags.append("broken_opt_out")
+                score += 5
+            else:
+                score += 25
 
-        # Privacy policy URL present + reachable — 15
-        max_score += 15
+        # Privacy policy URL — relevant for all, but weight lower for necessary
+        weight = 10 if is_necessary else 15
+        max_score += weight
         if not v.get("privacy_policy_url"):
             flags.append("no_privacy_url")
         elif v.get("privacy_ok") is False:
             flags.append("broken_privacy_url")
-            score += 5
+            score += weight // 3
         else:
-            score += 15
+            score += weight
 
-        # Cookies disclosed (names + expiry) — 15
-        max_score += 15
+        # Cookies disclosed (names + expiry) — higher weight for necessary
+        # (since that's mostly what they offer in lieu of opt-out)
+        weight = 50 if is_necessary else 15
+        max_score += weight
         cookies = v.get("cookies") or []
         if cookies:
             named = sum(1 for c in cookies if c.get("name"))
             with_expiry = sum(1 for c in cookies if c.get("expiry"))
             if named >= 1 and with_expiry >= 1:
-                score += 15
+                score += weight
             elif named >= 1:
-                score += 8
+                score += weight // 2
                 flags.append("cookies_no_expiry")
             else:
                 flags.append("cookies_no_names")