feat(audit): P75 Banner-vs-CMP + P84 Diff-Mode + P74/P96/P97 Doc-Types
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped

P75: check_banner_vs_cmp_partner_count: if the banner text names
'N partners' and N < 0.6 * len(cmp_vendors), emit a HIGH finding
(Art. 13(1)(e) GDPR). Only evaluated when the CMP payload lists at
least 20 vendors. Detects banners that understate the actual vendor
count.

P84: run_diff.py compares the current run with the most recent snapshot
of the same site (set difference over normalized finding labels). Adds
a block above the GF-1-Pager: 'Since last run: X findings gone, Y new'.
USP: none of the major providers offers this.
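
A minimal sketch of the set difference described for P84, not the actual run_diff.py. The helper names, the snapshot shape (a list of dicts with a "label" key), and the normalization rule (lowercase, digits masked, whitespace collapsed) are assumptions for illustration only.

```python
import re


def normalize_label(label: str) -> str:
    """Hypothetical normalization: lowercase, mask digits, collapse whitespace."""
    masked = re.sub(r"\d+", "#", label.lower())
    return re.sub(r"\s+", " ", masked).strip()


def diff_findings(previous: list[dict], current: list[dict]) -> dict:
    """Set difference over normalized labels of two runs of the same site."""
    prev = {normalize_label(f["label"]) for f in previous}
    curr = {normalize_label(f["label"]) for f in current}
    return {"resolved": sorted(prev - curr), "new": sorted(curr - prev)}


def summary_line(diff: dict) -> str:
    """Text for the block shown above the report (wording assumed)."""
    return (f"Since last run: {len(diff['resolved'])} findings gone, "
            f"{len(diff['new'])} new")
```

Masking digits is one possible way to keep a finding whose embedded count changed between runs (for example 120 vs. 131 vendors) from showing up as both resolved and new. Whether the real normalization does this is not stated in the commit.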

P74/P96/P97: registered labels in _DOC_TYPE_LABELS for legal_notice
(legal notices / IP / forward-looking statements), dsa (Art. 12+17
Digital Services Act), and lizenzhinweise (OSS compliance). The actual
mandatory-disclosure checks will follow separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-05-21 16:38:25 +02:00
parent 7842c95532
commit df8832c521
4 changed files with 267 additions and 4 deletions
@@ -178,7 +178,54 @@ def check_init_banner_vs_cookie_doc(
     }
 
 
-def run_all(banner_result: dict, cookie_doc_text: str | None = None) -> list[dict]:
+def check_banner_vs_cmp_partner_count(
+    banner_result: dict,
+    cmp_vendors: list | None,
+) -> dict | None:
+    """P75: banner names N partners, CMP payload lists far more.
+
+    If the banner text claims "5 Partner" or "Wir und unsere Partner"
+    while the CMP payload contains 100+ vendors, the user is being
+    misled.
+    """
+    cmp_count = len(cmp_vendors or [])
+    if cmp_count < 20:
+        return None
+    initial_ph = (_phases(banner_result).get("initial")
+                  or _phases(banner_result).get("before_accept") or {})
+    banner_text = (initial_ph.get("banner_text") or "")[:5000]
+    if not banner_text:
+        return None
+    m = re.search(r"\b(\d{1,4})\s*(?:partner|drittanbieter|vendor|"
+                  r"anbieter|dienstleister)", banner_text, re.I)
+    if not m:
+        return None
+    claimed = int(m.group(1))
+    if claimed >= cmp_count * 0.6:
+        return None  # The number in the banner is plausible.
+    return {
+        "severity": "HIGH",
+        "code": "banner_understates_vendor_count",
+        "label": (
+            f"Banner-Text nennt {claimed} Partner, CMP-Payload listet "
+            f"{cmp_count} Vendors"
+        ),
+        "detail": (
+            f"Die im Banner-Text genannte Zahl ({claimed}) unterschaetzt die "
+            f"tatsaechliche Anzahl der Empfaenger ({cmp_count}) deutlich. "
+            "Empfehlung: Banner-Text auf die echte Vendor-Zahl heben oder "
+            "die Vendor-Liste reduzieren."
+        ),
+        "legal_basis": (
+            "Art. 13(1)(e) DSGVO + EDPB 5/2020 — die Empfaenger / "
+            "Empfaengerkategorien muessen vollstaendig und nicht "
+            "verharmlosend angegeben sein."
+        ),
+    }
+
+
+def run_all(banner_result: dict, cookie_doc_text: str | None = None,
+            cmp_vendors: list | None = None) -> list[dict]:
     findings: list[dict] = []
     try:
         f1 = check_cmp_tool_availability(banner_result)
@@ -192,6 +239,12 @@ def run_all(banner_result: dict, cookie_doc_text: str | None = None) -> list[dic
             findings.append(f2)
     except Exception as e:
         logger.warning("P94 init_vs_cookie_doc failed: %s", e)
+    try:
+        f3 = check_banner_vs_cmp_partner_count(banner_result, cmp_vendors)
+        if f3:
+            findings.append(f3)
+    except Exception as e:
+        logger.warning("P75 banner_vs_cmp_count failed: %s", e)
     return findings
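
To make the P75 thresholds concrete, here is a small standalone sketch that reuses the regex, the 20-vendor minimum, and the 0.6 factor from the diff above. The function name understates_vendor_count and its plain-string signature are hypothetical simplifications. The real check reads the text from banner_result via _phases and returns a finding dict.

```python
import re

# Same alternatives as in the diff above, joined into one pattern.
_COUNT_RE = re.compile(
    r"\b(\d{1,4})\s*(?:partner|drittanbieter|vendor|anbieter|dienstleister)",
    re.I,
)


def understates_vendor_count(banner_text: str, cmp_count: int) -> bool:
    """Hypothetical helper: True where the P75 check would emit a finding."""
    if cmp_count < 20:
        return False  # small vendor lists are not evaluated
    m = _COUNT_RE.search(banner_text[:5000])
    if not m:
        return False  # no explicit number in the banner text
    return int(m.group(1)) < cmp_count * 0.6


print(understates_vendor_count("Wir und unsere 5 Partner nutzen Cookies", 120))
print(understates_vendor_count("Wir und unsere 80 Partner nutzen Cookies", 120))
print(understates_vendor_count("Wir und unsere Partner nutzen Cookies", 120))
```

With 120 vendors the cut-off is 72, so a banner claiming 5 partners is flagged and one claiming 80 is not. A banner that mentions partners without a number never matches the pattern and therefore produces no finding.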