65b4857be5fda17a59e5bc103810926e78622154
5 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
d2dc0c9fe4 |
feat: Deep consent verification — DataLayer, Storage, GCM, TCF
5 verification layers added to the 3-phase banner test:
1. DataLayer/GTM Interception: Proxy on window.dataLayer captures
all push() events. Distinguishes safe lifecycle events (gtm.js,
gtm.dom) from tracking events (page_view, conversion, purchase).
Flags tracking events before consent as violations.
2. localStorage/sessionStorage Monitoring: Intercepts setItem() to
detect tracking keys (_ga, _fbp, amplitude, mixpanel, etc.)
written before consent.
3. Google Consent Mode v2 Runtime Verification: Reads actual GCM
state (analytics_storage, ad_storage) per phase. Verifies
default=denied before consent, stays denied after reject,
switches to granted after accept.
4. TCF v2.2 State: Reads __tcfapi('getTCData') if available.
Verifies consent purpose states match user choice.
5. Cookie Attribute Analysis: Domain (1st vs 3rd party), expires
(>13 months), secure flag for tracking cookies.
10 new L2 checks with expert hints (EDPB, CNIL, §25 TDDDG).
All interceptor calls wrapped in try/except for graceful fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
80ae196853 |
fix: Banner checks no longer default to PASS when untested
20 checks were defaulting to PASS when no violation was found, even if the scanner couldn't actually test them. Now: - Phase-based checks (tracking/cookies): absence = PASS (correct) - UI checks: only PASS if banner_checks actually ran - If banner not detected: everything except banner_detected = FAIL This prevents false 100% scores when violations exist but the text→code mapping doesn't cover them. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
561150b5a8 |
fix: Banner runner maps violations by text when code field is missing
The consent-tester produces violations without a 'code' field — only text, severity, service. The runner now infers check_keys from the violation text content (36 text→code mappings). This fixes the 100% false-pass for safetykon.de which had 3 real violations (impressum, re-access, color contrast dark pattern) that were silently ignored. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
686834cea0 |
feat: 4 remaining tasks — EU institutions, banner integration, JS-sites, Caritas fixes
Build + Deploy / build-admin-compliance (push) Successful in 8s
Build + Deploy / build-backend-compliance (push) Successful in 8s
Build + Deploy / build-ai-sdk (push) Failing after 36s
Build + Deploy / build-developer-portal (push) Successful in 8s
Build + Deploy / build-tts (push) Successful in 7s
Build + Deploy / build-document-crawler (push) Successful in 7s
Build + Deploy / build-dsms-gateway (push) Successful in 8s
Build + Deploy / build-dsms-node (push) Successful in 8s
CI / branch-name (push) Has been skipped
Build + Deploy / trigger-orca (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m14s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 46s
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Successful in 29s
CI / test-python-dsms-gateway (push) Successful in 30s
CI / validate-canonical-controls (push) Successful in 16s
1. EU Institution Checks (Verordnung 2018/1725): - New doc_type "eu_institution" with 9 L1 + 15 L2 checks - Both German + English patterns (EU institutions are multilingual) - Auto-detection via "2018/1725", "EDSB", "EDPS" keywords - Correct article references (Art. 15 instead of 13, Art. 5 instead of 6) 2. Banner Check Integration: - banner_runner.py maps scan results to 36 L1/L2 structured checks - BannerCheckTab shows hierarchical ChecklistView with hints - 3-phase summary (cookies/scripts before/after consent) - /scan endpoint now includes structured_checks in response 3. JS-heavy Website Fixes (dm, Zalando, HWK): - dsi_helpers.py: goto_resilient (networkidle→domcontentloaded fallback) - try_dismiss_consent_banner before text extraction - PDF redirect detection (dm.de redirects to GCS PDF) 4. Caritas False Positive Fixes: - Phone regex allows parentheses: +49 (0)761 → now matches - "Recht auf Widerspruch" (3 words) + §23 KDG → matches Art. 21 - Church authorities: "Katholisches Datenschutzzentrum" recognized Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
608fb7faf5 |
fix: DSI self-extraction + banner L1/L2 check definitions
Build + Deploy / build-admin-compliance (push) Successful in 2m22s
Build + Deploy / build-backend-compliance (push) Successful in 3m20s
Build + Deploy / build-ai-sdk (push) Successful in 54s
Build + Deploy / build-developer-portal (push) Successful in 1m26s
Build + Deploy / build-tts (push) Successful in 1m38s
Build + Deploy / build-document-crawler (push) Successful in 37s
Build + Deploy / build-dsms-gateway (push) Successful in 26s
Build + Deploy / build-dsms-node (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 18s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m7s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 46s
CI / test-python-backend (push) Successful in 45s
CI / test-python-document-crawler (push) Successful in 30s
CI / test-python-dsms-gateway (push) Successful in 27s
CI / validate-canonical-controls (push) Successful in 17s
Build + Deploy / trigger-orca (push) Successful in 3m37s
1. DSI Discovery fix for direct-URL use case (e.g. example.com/datenschutz):
- Self-extraction: if the URL itself is a DSE page, extract its text
directly from the page body (main/article/content element)
- Remove "datenschutz" from NOISE_TITLES — it's a legitimate doc title
- Fixes safetykon.de/datenschutz returning 0 documents
2. Banner check definitions (36 checks: 6 L1 + 30 L2):
- consent-tester/checks/banner_checks.py with expert-level hints
- EDPB 3/2022, CNIL rulings, EuGH C-673/17, §25 TDDDG references
- check_key maps to existing consent_scanner check codes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|