d208a2bde2
User feedback on BMW v5: "740 cookies shrank to 31, overview
lost". Three changes:
Mail restructuring (_executive_summary.py + _compose.py):
- render_executive_summary(): top-of-mail TL;DR with the
  compliance score (large + colored), top 3 findings by
  severity, cookie statistics (declared/browser/third country),
  and severity distribution chips.
- collapsible(): wraps each block in <details>/<summary>.
  Mailpit + all modern mail clients render this natively.
- _compose.py: all 18+ B-blocks + per_doc + per_theme +
  legacy_html are now in accordions. ONLY critical findings and
  immediate actions are always open; the reviewer sees ~15 lines
  of overview and expands blocks selectively.
- The cookie inventory (742) now has its own section at the very
  top (accordion "🍪 Cookie-Inventar"), with vendor cards alongside.
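
A minimal sketch of the accordion wrapping described in the list above. The signature, the `open_by_default` parameter, and the sample block titles are assumptions for illustration; the actual `collapsible()` in `_executive_summary.py` may differ.

```python
from html import escape


def collapsible(title: str, body_html: str, open_by_default: bool = False) -> str:
    """Wrap an already-rendered HTML block in a <details>/<summary> accordion.

    Hypothetical signature: only the title is escaped, because the body is
    assumed to be trusted HTML produced by the other render helpers.
    """
    open_attr = " open" if open_by_default else ""
    return (
        f"<details{open_attr}>"
        f"<summary>{escape(title)}</summary>"
        f"{body_html}"
        "</details>"
    )


# Critical findings stay expanded; every other block starts collapsed.
blocks = [
    ("Critical findings", "<p>...</p>", True),
    ("Cookie inventory", "<table>...</table>", False),
]
mail_body = "\n".join(
    collapsible(title, html, is_open) for title, html, is_open in blocks
)
```

Driving the `open` attribute from a single flag would keep the "always open" rule for critical findings in one place instead of spreading it across the composing code.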
B22 Cross-Domain-Legal-Doc-Detector (cross_domain_doc_check.py):
Real example from user feedback: Elli's terms and conditions (AGB)
are hosted on docs.logpay.de instead of elli.eco. The detector flags
the mismatch in second-level domain (SLD):
- HIGH for agb / widerruf (contract-relevant)
- MEDIUM for dse / nutzungsbedingungen
- INFO for cookie / impressum (best practice)
Legal basis: GDPR Art. 28 (data processing agreement required for
hosting) + Art. 13(1)(e) (recipients) + § 312i BGB (Cool-URLs).
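
The detection logic described above can be sketched roughly as follows. This is not the code in `cross_domain_doc_check.py`: the function names, the small hard-coded list of compound suffixes, the `INFO` default for unknown doc types, and the decision to still report auto-discovered entries are all assumptions. Only the severity mapping and the input shape are taken from the commit message and its tests.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumption: a tiny hand-picked list. A production detector would more
# likely consult the Public Suffix List.
COMPOUND_SUFFIXES = {"co.uk", "org.uk", "com.au", "co.jp"}

# Severity per doc type, as listed in the commit message.
SEVERITY_BY_DOC_TYPE = {
    "agb": "HIGH", "widerruf": "HIGH",
    "dse": "MEDIUM", "nutzungsbedingungen": "MEDIUM",
    "cookie": "INFO", "impressum": "INFO",
}


def sld_of(host: str) -> str:
    """Return the second-level label, e.g. 'bmw' for 'www.bmw.de'."""
    labels = host.lower().strip(".").split(".")
    suffix_len = 2 if ".".join(labels[-2:]) in COMPOUND_SUFFIXES else 1
    if len(labels) <= suffix_len:
        return labels[0]
    return labels[-(suffix_len + 1)]


def _host_sld(entry: dict) -> str:
    """SLD of an entry's URL, or '' when the URL is missing or has no host."""
    return sld_of(urlparse(entry.get("url") or "").hostname or "")


def primary_sld(state: dict):
    """Most frequent SLD among entries that were not auto-discovered."""
    counts = Counter(
        _host_sld(e)
        for e in state.get("doc_entries", [])
        if not e.get("auto_discovered")
    )
    counts.pop("", None)  # ignore entries without a usable host
    return counts.most_common(1)[0][0] if counts else None


def find_cross_domain_docs(state: dict) -> list:
    """Report every document hosted under an SLD other than the primary one."""
    primary = primary_sld(state)
    if primary is None:
        return []
    findings = []
    for entry in state.get("doc_entries", []):
        host = _host_sld(entry)
        if not host or host == primary:
            continue
        findings.append({
            "doc_type": entry.get("doc_type"),
            # Assumption: unknown doc types fall back to INFO.
            "severity": SEVERITY_BY_DOC_TYPE.get(entry.get("doc_type"), "INFO"),
            "site_sld": primary,
            "host_sld": host,
        })
    return findings
```

Comparing SLDs rather than full hostnames is what keeps `docs.bmw.de` and `www.bmw.de` from being reported against each other, while `docs.logpay.de` is still flagged for a site whose primary SLD is `elli`.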
9/9 tests green, incl. the Elli/LogPay pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
89 lines
3.1 KiB
Python
"""Tests for B22 Cross-Domain-Legal-Doc-Detector."""

from compliance.services.cross_domain_doc_check import (
    _site_origin_sld,
    _sld,
    check_cross_domain_docs,
)


class TestSld:
    def test_simple(self):
        assert _sld("www.bmw.de") == "bmw"

    def test_compound_tld(self):
        assert _sld("docs.example.co.uk") == "example"

    def test_no_www(self):
        assert _sld("elli.eco") == "elli"


class TestPrimaryDetection:
    def test_majority_wins(self):
        state = {"doc_entries": [
            {"url": "https://elli.eco/de/impressum"},
            {"url": "https://elli.eco/de/datenschutz"},
            {"url": "https://docs.logpay.de/_docs/agb.pdf"},
        ]}
        assert _site_origin_sld(state) == "elli"

    def test_auto_discovered_excluded(self):
        # discovery results don't influence primary detection
        state = {"doc_entries": [
            {"url": "https://elli.eco/de/impressum", "auto_discovered": False},
            {"url": "https://discovered.tld/foo", "auto_discovered": True},
        ]}
        assert _site_origin_sld(state) == "elli"


class TestCheck:
    def test_elli_logpay_pattern(self):
        state = {"doc_entries": [
            {"doc_type": "dse", "url": "https://www.elli.eco/de/datenschutz"},
            {"doc_type": "impressum",
             "url": "https://www.elli.eco/de/impressum"},
            {"doc_type": "agb",
             "url": "https://docs.logpay.de/_docs/de/"
                    "allgemeine_geschaeftsbedingungen_de_EM.pdf"},
        ]}
        findings = check_cross_domain_docs(state)
        assert len(findings) == 1
        f = findings[0]
        assert f["check_id"] == "CROSS-DOMAIN-DOC-001"
        assert f["severity"] == "HIGH"  # AGB is HIGH
        assert f["doc_type"] == "agb"
        assert f["site_sld"] == "elli"
        assert f["host_sld"] == "logpay"

    def test_same_subdomain_no_finding(self):
        # docs.bmw.de is same SLD as www.bmw.de, so no finding
        state = {"doc_entries": [
            {"doc_type": "dse",
             "url": "https://www.bmw.de/de/datenschutz.html"},
            {"doc_type": "agb",
             "url": "https://docs.bmw.de/agb.pdf"},
        ]}
        findings = check_cross_domain_docs(state)
        assert findings == []

    def test_no_primary_no_finding(self):
        # No URLs at all
        state = {"doc_entries": []}
        assert check_cross_domain_docs(state) == []

    def test_severity_per_doc_type(self):
        state = {"doc_entries": [
            {"doc_type": "agb", "url": "https://acme.de/x"},
            {"doc_type": "dse",
             "url": "https://docs.thirdparty.com/agb"},
            {"doc_type": "impressum",
             "url": "https://www.other.com/impressum"},
        ]}
        findings = check_cross_domain_docs(state)
        sev_by_doc = {f["doc_type"]: f["severity"] for f in findings}
        # agb is on primary (acme.de), so no finding
        # dse on thirdparty.com → MEDIUM
        # impressum on other.com → INFO
        assert sev_by_doc.get("dse") == "MEDIUM"
        assert sev_by_doc.get("impressum") == "INFO"