Benjamin Admin 6c223c7c9b
feat(compliance-check): exec-summary + voll-audit + TDM-respect + cookie-KB-extended + saving-scan-funnel
P1 - Exec summary at the top of the email report (4 KPIs + 2 CTAs, dark gradient)
P3 - no_direct_sales flag for OEM configurator sites; AGB/Widerruf/AGB shown as
     "NICHT ANWENDBAR" (not applicable, gray) instead of "NICHT GEFUNDEN"
     (not found, red)
P5 - Full-audit unification: all findings (MC + mandatory disclosures + vendor +
     redundancy) in /data/compliance_audits.db.unified_findings; new
     /api/compliance/agent/findings/<id> endpoint + FindingsTab in the audit UI
     with filter + CSV export
P7 - Crawl hardening: TDM reservation check (robots.txt / ai.txt / header /
     meta) before every run with a 24h cache; HeadlessChrome UA (company not
     yet founded; switch via BREAKPILOT_BRANDED_UA env); per-domain
     rate limit of 1 req/s + max 2 concurrent
P2 - Cookie knowledge DB extended additively (35 -> 74 cookies): Adobe, Meta,
     Microsoft, LinkedIn, TikTok, HubSpot, Marketo, Salesforce, Hotjar,
     FullStory, Mouseflow, Intercom, Drift, Zendesk, Cloudflare, Stripe,
     OneTrust/Cookiebot/Usercentrics, Matomo, Pinterest, Snapchat, X/Twitter,
     YouTube, Vimeo, Klaviyo, Mailchimp, Mixpanel, Segment, Amplitude,
     Optimizely, Datadog; the wire-in in cookie_function_classifier yields a
     compliance_risk label (kritisch/hoch/mittel/gering) per vendor
A  - k-anonymity helper (benchmark_k_anonymity) in preparation for P6
B  - Cross-tenant domain assertion in the /findings endpoint (expected_domain
     query param -> 403 on mismatch)
C  - Saving-scan funnel: /api/compliance/agent/saving-scan/start with
     validation + 24h rate limit per domain + lead persistence in
     saving_scan_leads + auto-discovery via _run_compliance_check; 6 tests
D  - Risk badge in the email vendor row
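
The per-domain limit from P7 (1 req/s, at most 2 concurrent) could be sketched roughly as follows. This is not the code from this commit: `DomainThrottle`, `run` and the `fetch` callback are hypothetical names, and the sketch assumes an asyncio-based crawler.

```python
import asyncio
import time
from collections import defaultdict
from urllib.parse import urlsplit


class DomainThrottle:
    """Hypothetical helper: per domain, at most `max_concurrent` requests
    in flight and one request start per `min_interval` seconds."""

    def __init__(self, min_interval=1.0, max_concurrent=2):
        self._min_interval = min_interval
        self._max_concurrent = max_concurrent
        self._semaphores = {}
        self._locks = {}
        self._next_start = defaultdict(float)  # earliest allowed start per domain

    async def run(self, url, fetch):
        domain = (urlsplit(url).hostname or "").lower()
        sem = self._semaphores.setdefault(
            domain, asyncio.Semaphore(self._max_concurrent))
        lock = self._locks.setdefault(domain, asyncio.Lock())
        async with sem:  # caps concurrent requests for this domain
            async with lock:  # serializes request starts for this domain
                wait = self._next_start[domain] - time.monotonic()
                if wait > 0:
                    await asyncio.sleep(wait)
                self._next_start[domain] = time.monotonic() + self._min_interval
            return await fetch(url)
```

A caller would wrap each request as `await throttle.run(url, fetch)`, where `fetch` is whatever coroutine function performs the actual HTTP request.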

Legal guardrails (Memory feedback_oem_data_legal.md): only our own brief
assessments + source pointers, no 1:1 copies of third-party CMP texts.
TDM opt-out respected per § 44b UrhG. NO schema changes; everything lives
in the sidecar SQLite.
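
To illustrate the header and meta parts of the TDM opt-out check, here is a minimal sketch. It assumes the `tdm-reservation` field name from the W3C TDM Reservation Protocol (TDMRep). The function name `tdm_reserved` is hypothetical, and the robots.txt / ai.txt checks and the 24h cache mentioned above are left out.

```python
from html.parser import HTMLParser


class _MetaCollector(HTMLParser):
    """Collects <meta name=... content=...> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        name = attr.get("name", "").lower()
        if name:
            self.meta[name] = attr.get("content", "").strip()


def tdm_reserved(headers, html):
    """Return True if the response signals a TDM reservation (opt-out).

    Assumption: a value of "1" in the `tdm-reservation` HTTP header or
    in a <meta> tag of the same name means rights are reserved.
    """
    normalized = {k.lower(): str(v).strip() for k, v in headers.items()}
    if normalized.get("tdm-reservation") == "1":
        return True
    parser = _MetaCollector()
    parser.feed(html)
    return parser.meta.get("tdm-reservation") == "1"
```

A crawler following this approach would skip the run whenever `tdm_reserved(response_headers, body)` returns True.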

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:48:34 +02:00

117 lines
3.9 KiB
Python

"""
Tests for the saving-scan funnel endpoint.

Focus: input validation + lead persistence + rate-limit error path.
The actual compliance check is mocked; we only verify the route layer.
"""
import os
import sys
from unittest.mock import patch

import pytest
from fastapi import FastAPI
from fastapi.testclient import TestClient

sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))

# Use a temp SQLite for the sidecar
os.environ["COMPLIANCE_AUDIT_DB"] = "/tmp/test_saving_scan.db"
if os.path.exists("/tmp/test_saving_scan.db"):
    os.remove("/tmp/test_saving_scan.db")

from compliance.api.saving_scan_routes import router  # noqa: E402

app = FastAPI()
app.include_router(router, prefix="/api")
client = TestClient(app)


class TestStartSavingScanValidation:
    def test_missing_email_returns_422(self):
        resp = client.post("/api/compliance/agent/saving-scan/start",
                           json={"url": "https://example.de"})
        assert resp.status_code == 422

    def test_invalid_email_returns_400(self):
        with patch("compliance.api.saving_scan_routes.asyncio.create_task"):
            resp = client.post(
                "/api/compliance/agent/saving-scan/start",
                json={"url": "https://example.de", "email": "kein-email",
                      "consent": True},
            )
        assert resp.status_code == 400
        assert "E-Mail" in resp.json()["detail"]

    def test_invalid_url_returns_400(self):
        with patch("compliance.api.saving_scan_routes.asyncio.create_task"):
            resp = client.post(
                "/api/compliance/agent/saving-scan/start",
                json={"url": "ftp://wrong.de", "email": "u@x.de",
                      "consent": True},
            )
        assert resp.status_code == 400

    def test_consent_required(self):
        with patch("compliance.api.saving_scan_routes.asyncio.create_task"):
            resp = client.post(
                "/api/compliance/agent/saving-scan/start",
                json={"url": "https://example.de", "email": "u@x.de",
                      "consent": False},
            )
        assert resp.status_code == 400
        assert "Consent" in resp.json()["detail"]


def _patch_check_runner():
    """Stub the lazy-imported worker; avoids loading smtp_sender (Py3.10+)."""
    import types

    fake = types.ModuleType("compliance.api.agent_compliance_check_routes")

    class _DocInput:
        def __init__(self, doc_type="other", url=""):
            self.doc_type = doc_type
            self.url = url

    class _Req:
        def __init__(self, **kw):
            self.__dict__.update(kw)

    async def _runner(*_a, **_kw):
        pass

    fake.DocumentInput = _DocInput
    fake.ComplianceCheckRequest = _Req
    fake._run_compliance_check = _runner
    fake._compliance_check_jobs = {}
    sys.modules["compliance.api.agent_compliance_check_routes"] = fake


class TestStartSavingScanSuccess:
    def test_valid_request_starts_check(self):
        _patch_check_runner()
        resp = client.post(
            "/api/compliance/agent/saving-scan/start",
            json={"url": "https://example-newdomain.de",
                  "email": "user@example.de", "consent": True},
        )
        assert resp.status_code == 200, resp.text
        data = resp.json()
        assert "check_id" in data
        assert data["status"] == "running"
        assert "example-newdomain.de" in data["message"]


class TestLeadCount:
    def test_lead_count_after_submit(self):
        _patch_check_runner()
        client.post(
            "/api/compliance/agent/saving-scan/start",
            json={"url": "https://abc-leadtest.de",
                  "email": "lead@x.de", "consent": True},
        )
        resp = client.get("/api/compliance/agent/saving-scan/lead-count")
        assert resp.status_code == 200
        data = resp.json()
        assert data["total_leads"] >= 1
        assert "abc-leadtest.de" in str(data["top_domains"])
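
The module docstring names the rate-limit error path, but the six tests above cover only validation, the success path and the lead count. As a rough illustration of how a 24h per-domain limit backed by the sidecar SQLite might behave, here is a self-contained sketch. Only the table name `saving_scan_leads` comes from the commit message. The column layout and the `init_db` / `try_register_lead` helpers are assumptions, not the route's actual implementation.

```python
import sqlite3
import time

WINDOW_SECONDS = 24 * 60 * 60  # 24h window per domain


def init_db(conn):
    # Hypothetical schema; the real saving_scan_leads columns are not shown here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS saving_scan_leads ("
        "domain TEXT NOT NULL, email TEXT NOT NULL, created_at REAL NOT NULL)"
    )


def try_register_lead(conn, domain, email, now=None):
    """Insert the lead unless the domain was already scanned within the window.

    Returns True if the lead was stored, False if the domain is rate-limited.
    """
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT MAX(created_at) FROM saving_scan_leads WHERE domain = ?",
        (domain,),
    ).fetchone()
    if row[0] is not None and now - row[0] < WINDOW_SECONDS:
        return False
    conn.execute(
        "INSERT INTO saving_scan_leads (domain, email, created_at) VALUES (?, ?, ?)",
        (domain, email, now),
    )
    conn.commit()
    return True
```

Passing `now` explicitly keeps the window logic testable without sleeping or patching the clock. A route built on this would translate a False return into its rate-limit error response.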