feat(tcf-vendors): GVL cache + vendor extraction + VVT mapping
Build + Deploy / build-admin-compliance (push) Successful in 14s
Build + Deploy / build-backend-compliance (push) Successful in 16s
Build + Deploy / build-ai-sdk (push) Successful in 20s
Build + Deploy / build-developer-portal (push) Successful in 12s
Build + Deploy / build-tts (push) Successful in 15s
Build + Deploy / build-document-crawler (push) Successful in 13s
Build + Deploy / build-dsms-gateway (push) Successful in 13s
Build + Deploy / build-dsms-node (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m49s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 45s
CI / test-python-backend (push) Successful in 38s
CI / test-python-document-crawler (push) Successful in 26s
CI / test-python-dsms-gateway (push) Successful in 23s
CI / validate-canonical-controls (push) Successful in 15s
Build + Deploy / trigger-orca (push) Successful in 2m23s
Phase 1-2 of the closed quality loop:

- GVL cache (consent-tester/services/gvl_cache.py): downloads and caches the IAB Global Vendor List with a 24h TTL; resolves vendor IDs to names, purposes, policy URLs, retention, and country
- Vendor extraction (consent_interceptor.py): extract_tcf_vendors() reads __tcfapi after the accept phase and resolves the vendor IDs via the GVL
- Scan response: tcf_vendors field added to the /scan endpoint
- VVT mapper (vendor_vvt_mapper.py): maps TCF vendors to VVT format with purpose labels, legal basis (Rechtsgrundlage), and third-country (Drittland) detection
- Vendor cross-check (banner_cookie_cross_check.py): checks all TCF vendors against the DSI text and reports missing vendors and undocumented transfers
- Compliance check integrates Step 3d: TCF vendors vs DSI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
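The 24h TTL caching described for the GVL cache can be sketched roughly as follows. This is a minimal illustration, not the code in gvl_cache.py: the class name `TTLVendorCache`, the injectable `loader` and `clock` parameters, and the `{"vendors": {"<id>": {...}}}` payload shape are all assumptions made for the example.

```python
import time
from typing import Callable, Optional

TTL_SECONDS = 24 * 60 * 60  # 24h, as stated in the commit message


class TTLVendorCache:
    """Hypothetical stand-in for a GVL cache (not the real GVLCache)."""

    def __init__(
        self,
        loader: Callable[[], dict],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # loader is assumed to return {"vendors": {"<id>": {...}}}
        self._loader = loader
        # clock is injectable so expiry can be tested without sleeping
        self._clock = clock
        self._data: Optional[dict] = None
        self._loaded_at = 0.0

    def _vendors(self) -> dict:
        # Reload on first use or once the cached copy is at least TTL old.
        expired = (
            self._data is None
            or self._clock() - self._loaded_at >= TTL_SECONDS
        )
        if expired:
            self._data = self._loader()
            self._loaded_at = self._clock()
        return self._data.get("vendors", {})

    def resolve(self, vendor_ids: list[int]) -> list[dict]:
        # Map numeric IDs to vendor records; unknown IDs are skipped.
        vendors = self._vendors()
        resolved = []
        for vid in vendor_ids:
            entry = vendors.get(str(vid))
            if entry is None:
                continue
            resolved.append({
                "vendor_id": vid,
                "name": entry.get("name", ""),
                "purposes": entry.get("purposes", []),
            })
        return resolved
```

Injecting the loader keeps the download step (HTTP client, on-disk copy, and so on) out of the expiry logic, so the TTL behaviour can be checked with a fake clock.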
@@ -110,6 +110,39 @@ async def get_consent_state(page) -> dict:
     return {"gcm_state": {}, "tcf_data": None}


+async def extract_tcf_vendors(page) -> list[dict]:
+    """Extract full TCF vendor list from page via __tcfapi + GVL resolution.
+
+    Returns list of resolved vendors with names, purposes, countries, etc.
+    Returns empty list if no TCF API is available on the page.
+    """
+    state = await get_consent_state(page)
+    tcf = state.get("tcf_data")
+    if not tcf:
+        return []
+
+    vendor_map = tcf.get("vendor", {})
+    consents = vendor_map.get("consents", {})
+    if not consents:
+        return []
+
+    vendor_ids = [int(k) for k, v in consents.items() if v]
+    if not vendor_ids:
+        return []
+
+    try:
+        from .gvl_cache import GVLCache
+        gvl = GVLCache()
+        resolved = await gvl.resolve_vendors(vendor_ids)
+        logger.info("TCF: %d/%d vendors resolved via GVL", len(resolved), len(vendor_ids))
+        return resolved
+    except Exception as e:
+        logger.warning("TCF vendor resolution failed: %s", e)
+        # Fallback: return unresolved IDs
+        return [{"vendor_id": vid, "name": f"Vendor #{vid}", "purposes": []}
+                for vid in vendor_ids[:50]]
+
+
 # -- Internal helpers --------------------------------------------------------

 def _is_tracking_event(event_data: dict) -> bool:
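The consent-filtering step of extract_tcf_vendors() can be exercised without a browser. The sketch below is an illustration only: `granted_vendor_ids` is a hypothetical helper, and the sample dict merely mimics the `vendor.consents` shape the hunk reads from `tcf_data`. Unlike the list comprehension in the hunk, which calls `int(k)` outside the `try` block, this variant skips keys that are not numeric instead of raising.

```python
from typing import Optional


def granted_vendor_ids(tcf: Optional[dict]) -> list[int]:
    """Return the sorted IDs of vendors whose consent flag is truthy."""
    if not tcf:
        return []
    # Tolerate a missing or null "vendor" / "consents" entry.
    consents = (tcf.get("vendor") or {}).get("consents") or {}
    ids = []
    for key, granted in consents.items():
        if not granted:
            continue
        try:
            ids.append(int(key))
        except (TypeError, ValueError):
            # Assumption: non-numeric keys are noise and can be dropped.
            continue
    return sorted(ids)


# Hypothetical sample data, not captured from a real page.
sample = {"vendor": {"consents": {"755": True, "10": True, "42": False, "n/a": True}}}
print(granted_vendor_ids(sample))  # prints [10, 755]
```

Whether dropping malformed keys silently is the right policy depends on how strict the scan should be; logging them would be an equally reasonable choice.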