57c0f940a2
P56 Anti-auditing detection reported as a constructive compliance finding
(recommend an audit API rather than accuse, because Mercedes is entitled to
block bots)
P57 Phase G: vendor_details unioned with cmp_vendors -> 42 vendors visible
P58 Anti-audit detection made more robust (script-domain check + settings-specific)
P59 Cookie behavior validator (4 layers, 3-tier severity: MEDIUM = category
mismatch / HIGH = purpose mismatch / CRITICAL = both = indication of intent)
+ Open Cookie Database (CC0) as library seed (2264 cookies)
P59b Cookie behavior wired into the banner check + mail block (BUGFIX: open
SessionLocal directly; db was not in scope inside the background task)
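The P59 severity tiers can be sketched as follows. This is a minimal illustration, not the code in cookie_behavior_validator.py: the function name `mismatch_severity` and the comparison of purposes as plain normalized strings are assumptions, and the four validator layers are not modelled here.

```python
from typing import Optional


def mismatch_severity(
    declared_category: str,
    actual_category: str,
    declared_purpose: str,
    actual_purpose: str,
) -> Optional[str]:
    """Map category/purpose mismatches to the three tiers named in P59."""

    def norm(value: str) -> str:
        # Assumption: labels are compared case-insensitively after trimming.
        return value.strip().lower()

    category_mismatch = norm(declared_category) != norm(actual_category)
    purpose_mismatch = norm(declared_purpose) != norm(actual_purpose)

    if category_mismatch and purpose_mismatch:
        return "CRITICAL"  # both differ: treated as an indication of intent
    if purpose_mismatch:
        return "HIGH"
    if category_mismatch:
        return "MEDIUM"
    return None  # declaration and observed behavior agree
```

A cookie declared as "essential" whose library entry says "marketing", with matching purpose text, would come out as MEDIUM under this sketch.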
Mail polish after Mercedes review:
P63 Also detect banner footer links in wb7-link/role=link (Shadow DOM walker
is label-based instead of matching only <a href>)
P64 Re-access severity: MEDIUM instead of HIGH when a footer "Einstellungen"
(settings) entry or a Mercedes-typical equivalent exists; OEM footer
detection (wb7-footer)
P65 Text truncation: cut at a word boundary instead of a character count (no
more "einfa" break in the immediate-action items)
P66 Management (GF) actions: service purpose vs. cookie purpose explained
explicitly (common confusion between marketing and management: "Akamai
description" != cookie purpose per DSK-OH 2024)
P67 Stirring finding with an explanation of "loss framing" + old-vs-neutral
example, instead of only the EDPB technical term
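The word-boundary truncation from P65 might look roughly like this. It is a sketch under assumptions: the helper name `truncate_at_word`, the "..." suffix, and splitting on plain spaces only are all hypothetical and not taken from the mail code.

```python
def truncate_at_word(text: str, limit: int, ellipsis: str = "...") -> str:
    """Shorten text to at most `limit` characters without splitting a word."""
    if len(text) <= limit:
        return text

    budget = limit - len(ellipsis)
    if budget <= 0:
        # Not even room for the suffix: return as much of it as fits.
        return ellipsis[:limit]

    cut = text[:budget]
    # If the next character is whitespace, the cut already sits on a boundary.
    if not text[budget].isspace():
        head, sep, _ = cut.rpartition(" ")
        if sep:
            cut = head  # drop the partial last word
        # With no space at all, fall back to the hard character cut.
    return cut.rstrip() + ellipsis
```

With a limit of 16, "Das ist einfach zu beheben" is cut before "einfach" rather than in the middle of it.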
Compliance Advisor FAQ (admin agent-core/soul):
+ CNIL/EDPB top fines (Google 100M, Meta 60M, Amazon 35M)
+ German precedents (LG Muenchen Google Fonts, CJEU Planet49, BGH I ZR 7/16)
+ 4 risk paths (fine / cease-and-desist warning / class action / NOYB) +
calculation methodology
Document generator templates: AGB-DE (142), Impressum (140),
Widerrufsformular-Anlage (143), DSR-Process-Dedup (139), Cookie-Library (144).
Architecture: doc_action_mappings.py + banner_dom_walkers.py +
cookie_behavior_validator.py + vendor_detail_extractor.py extracted to keep
agent_doc_check_report.py and banner_text_checker.py within the 500-LOC caps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
83 lines
3.6 KiB
PL/PgSQL
-- Migration 144: cookie library for P59 (behavior validator)
--
-- In-house cookie knowledge base: name + domain → actual category,
-- purpose, typical value patterns, data recipients. Basis for findings of
-- the form "cookie declared as X, actually Y" under Art. 5(1)(b) GDPR.
--
-- Sources:
-- - Open Cookie Database (CC0, github.com/jkwakman/Open-Cookie-Database)
-- - Cookiepedia (commercial, reference only, not ingested)
-- - Manual BreakPilot research (OEM cookies)

BEGIN;

CREATE TABLE IF NOT EXISTS compliance.cookie_library (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    cookie_name TEXT NOT NULL,
    -- Domain pattern: exact ".example.com" or wildcard "*.googletagmanager.com"
    domain_pattern TEXT NOT NULL,
    -- Vendor / processing company
    vendor_name TEXT NOT NULL,
    vendor_country TEXT, -- ISO 3166-1 alpha-2 (DE/IE/US)
    vendor_privacy_url TEXT,
    vendor_opt_out_url TEXT,
    -- Behavioural classification (truth, not declaration)
    actual_category TEXT NOT NULL CHECK (actual_category IN
        ('essential', 'functional', 'statistics', 'marketing',
         'social_media', 'unknown')),
    purpose_de TEXT, -- e.g. "Cross-Site-Tracking ueber 80% der dt. Sites"
    purpose_en TEXT,
    -- Typical value pattern (regex), used for value-mismatch findings
    value_pattern TEXT, -- e.g. ^[a-f0-9]{32}$ (hash ID)
    typical_max_age_seconds BIGINT, -- typical lifetime
    -- Receiver domains (XHR/img to which the cookie value flows)
    data_receivers TEXT[], -- ["google-analytics.com", "doubleclick.net"]
    -- Cross-site usage signal (~ how widespread)
    cross_site_count INTEGER, -- approx. number of sites that use it
    is_pii BOOLEAN DEFAULT FALSE, -- directly contains personal data
    -- Provenance + trust
    source_name TEXT NOT NULL, -- "Open Cookie Database" / "BreakPilot Research"
    source_url TEXT,
    source_license TEXT, -- "CC0", "MIT": what we are allowed to use
    confidence NUMERIC(3,2) DEFAULT 0.80, -- 0..1
    last_verified TIMESTAMPTZ DEFAULT now(),
    notes TEXT,
    created_at TIMESTAMPTZ DEFAULT now(),
    updated_at TIMESTAMPTZ DEFAULT now()
);

-- Indexes for fast lookup by name and by domain
CREATE INDEX IF NOT EXISTS idx_cookie_lib_name
    ON compliance.cookie_library (cookie_name);
CREATE INDEX IF NOT EXISTS idx_cookie_lib_domain
    ON compliance.cookie_library (domain_pattern);

-- Cookie behavior audit log: what we observed on which site
CREATE TABLE IF NOT EXISTS compliance.cookie_behavior_audits (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    check_id TEXT, -- compliance-check ID
    site_url TEXT NOT NULL,
    cookie_name TEXT NOT NULL,
    cookie_domain TEXT,
    -- Observed
    observed_value_sample TEXT, -- truncated to 200 chars
    observed_max_age_seconds BIGINT,
    declared_category TEXT, -- what the site claims
    -- Library match
    library_id UUID REFERENCES compliance.cookie_library(id),
    matched_actual_category TEXT,
    mismatch_severity TEXT, -- "HIGH" / "MEDIUM" / "LOW" / NULL
    mismatch_reason TEXT,
    -- Network observations
    observed_receivers TEXT[],
    third_party_transfer BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_cba_check
    ON compliance.cookie_behavior_audits (check_id);
CREATE INDEX IF NOT EXISTS idx_cba_site
    ON compliance.cookie_behavior_audits (site_url);

COMMIT;
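
How a validator might match an observed cookie against cookie_library rows can be sketched in Python. Everything below is illustrative: `LibraryEntry`, `domain_matches`, `find_entry`, and `value_fits` are hypothetical names, the rows are assumed to be already loaded into memory, and the wildcard is assumed to cover both the base domain and its subdomains (the migration only documents the two pattern forms, not their exact semantics).

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LibraryEntry:
    """Subset of the cookie_library columns needed for matching."""
    cookie_name: str
    domain_pattern: str
    actual_category: str
    value_pattern: Optional[str] = None


def domain_matches(pattern: str, cookie_domain: str) -> bool:
    """Match an exact ".example.com" pattern or a "*.example.com" wildcard."""
    host = cookie_domain.lower().lstrip(".")
    pat = pattern.lower()
    if pat.startswith("*."):
        base = pat[2:]
        # Assumption: the wildcard covers the base domain and any subdomain.
        return host == base or host.endswith("." + base)
    return host == pat.lstrip(".")


def find_entry(
    entries: Sequence[LibraryEntry], cookie_name: str, cookie_domain: str
) -> Optional[LibraryEntry]:
    """Return the first entry whose name and domain pattern both match."""
    for entry in entries:
        if entry.cookie_name == cookie_name and domain_matches(
            entry.domain_pattern, cookie_domain
        ):
            return entry
    return None


def value_fits(entry: LibraryEntry, value: str) -> Optional[bool]:
    """Check a cookie value against value_pattern; None if no pattern is set."""
    if entry.value_pattern is None:
        return None
    return re.fullmatch(entry.value_pattern, value) is not None
```

The result of `find_entry` would supply library_id and matched_actual_category for a cookie_behavior_audits row, while `value_fits` corresponds to the value-mismatch findings mentioned in the value_pattern comment.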