6c223c7c9b
P1: Executive summary at the top of the email report (4 KPIs + 2 CTAs, dark gradient)
P3: no_direct_sales flag for OEM configurator sites; AGB (terms and conditions) and
Widerruf (withdrawal notice) are shown as "NICHT ANWENDBAR" (not applicable, gray)
instead of "NICHT GEFUNDEN" (not found, red)
P5: Full-audit unification: all findings (MC + Pflichtangaben (mandatory disclosures) +
vendor + redundancy) in /data/compliance_audits.db.unified_findings; new
/api/compliance/agent/findings/<id> endpoint + FindingsTab in the audit UI
with filters + CSV export
P7: Crawl hardening: TDM reservation check (robots.txt / ai.txt / header /
meta) before every run with a 24h cache; HeadlessChrome UA (company not yet
founded; switch via the BREAKPILOT_BRANDED_UA env variable); per-domain
rate limit of 1 req/s + max 2 concurrent
P2: Cookie knowledge DB extended additively (35 -> 74 cookies): Adobe, Meta,
Microsoft, LinkedIn, TikTok, HubSpot, Marketo, Salesforce, Hotjar,
FullStory, Mouseflow, Intercom, Drift, Zendesk, Cloudflare, Stripe,
OneTrust/Cookiebot/Usercentrics, Matomo, Pinterest, Snapchat, X/Twitter,
YouTube, Vimeo, Klaviyo, Mailchimp, Mixpanel, Segment, Amplitude,
Optimizely, Datadog; wiring it into cookie_function_classifier yields a
compliance_risk label (kritisch/hoch/mittel/gering, i.e. critical/high/medium/low) per vendor
A: k-anonymity helper (benchmark_k_anonymity) in preparation for P6
B: Cross-tenant domain assertion in the /findings endpoint (expected_domain
query param -> 403 on mismatch)
C: Saving-scan funnel: /api/compliance/agent/saving-scan/start with
validation + 24h rate limit per domain + lead persistence in
saving_scan_leads + auto-discovery via _run_compliance_check; 6 tests
D: Risk badge in the email vendor row
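The per-domain limits named under P7 (1 req/s, at most 2 concurrent) could be enforced roughly as follows. This is a minimal sketch and not the code from this commit; `DomainLimiter` and its methods are hypothetical names chosen for illustration, and a thread-based crawler is assumed.

```python
from __future__ import annotations

import threading
import time
from collections import defaultdict


class DomainLimiter:
    """Hypothetical helper: per domain, allow at most `max_concurrent`
    in-flight requests and at least `min_interval` seconds between starts."""

    def __init__(self, min_interval: float = 1.0, max_concurrent: int = 2) -> None:
        self._min_interval = min_interval
        self._max_concurrent = max_concurrent
        self._lock = threading.Lock()
        self._next_start: dict[str, float] = defaultdict(float)
        self._slots: dict[str, threading.BoundedSemaphore] = {}

    def _semaphore(self, domain: str) -> threading.BoundedSemaphore:
        # Create the per-domain semaphore lazily, guarded by the lock.
        with self._lock:
            if domain not in self._slots:
                self._slots[domain] = threading.BoundedSemaphore(self._max_concurrent)
            return self._slots[domain]

    def acquire(self, domain: str) -> None:
        # 1) Wait for a free concurrency slot for this domain.
        self._semaphore(domain).acquire()
        # 2) Reserve the next allowed start time, then sleep outside the lock.
        with self._lock:
            now = time.monotonic()
            start_at = max(now, self._next_start[domain])
            self._next_start[domain] = start_at + self._min_interval
        delay = start_at - now
        if delay > 0:
            time.sleep(delay)

    def release(self, domain: str) -> None:
        # Free the concurrency slot once the request has finished.
        self._semaphore(domain).release()
```

A caller would wrap each fetch in `acquire(domain)` / `release(domain)`, ideally in a `try`/`finally` so a failed request still frees its slot.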
Legal guardrails (memory feedback_oem_data_legal.md): only our own brief
assessments + source pointers, no 1:1 copies of third-party CMP texts.
TDM opt-out is respected per § 44b UrhG. NO schema changes: everything lives in
sidecar SQLite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
105 lines
3.5 KiB
Python
"""
Full-audit findings router: unified view across all 4 finding sources.

Endpoint:
    GET /api/compliance/agent/findings/{check_id}
        ?source=mc|pflichtangabe|vendor|redundanz|all
        &severity=CRITICAL|HIGH|MEDIUM|LOW|INFO|all
        &doc_type=impressum|dse|cookie|...|all
        &status=failed|passed|skipped|na|info|all
        &q=<free text>
        &limit=<int>
        &expected_domain=<domain>

Returns a summary plus the filtered findings list. The frontend uses it to
render the full-audit tab at /sdk/agent/audit/<check_id>.
"""

from __future__ import annotations

import logging
from urllib.parse import urlparse
from fastapi import APIRouter, HTTPException, Query

from compliance.services.unified_findings_store import (
    findings_summary,
    list_findings,
)
from compliance.services.compliance_audit_log import get_check_run

logger = logging.getLogger(__name__)

router = APIRouter(prefix="/compliance/agent", tags=["agent"])


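def _normalize_domain(d: str) -> str:
    """Reduce a URL or bare host to its lowercase host without a leading "www."."""
    if not d:
        return ""
    # urlparse only fills netloc when a scheme is present, so add one if missing.
    if "://" not in d:
        d = "https://" + d
    # Note: netloc keeps an explicit port (and userinfo) if the input has one.
    host = urlparse(d).netloc.lower()
    return host[4:] if host.startswith("www.") else host

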
@router.get("/findings/{check_id}")
def get_findings(
    check_id: str,
    source: str | None = Query(None, description="mc|pflichtangabe|vendor|redundanz|all"),
    severity: str | None = Query(None, description="CRITICAL|HIGH|MEDIUM|LOW|INFO|all"),
    doc_type: str | None = Query(None),
    status: str | None = Query(None, description="failed|passed|skipped|na|info|all"),
    q: str | None = Query(None, description="free-text search over label/vendor"),
    limit: int = Query(1000, ge=1, le=5000),
    expected_domain: str | None = Query(
        None, description="Hard assertion: the run must belong to this domain (cross-tenant protection)",
    ),
) -> dict:
    """Return aggregated findings + summary counters for a check run."""
    # P7 follow-up: optional domain assertion. Prevents a frontend from
    # requesting a check_id that belongs to another tenant's domain.
    if expected_domain:
        run = get_check_run(check_id)
        actual = _normalize_domain((run or {}).get("base_domain") or "")
        if not run or actual != _normalize_domain(expected_domain):
            raise HTTPException(
                status_code=403,
                detail=f"Cross-tenant access blocked: check_id {check_id} "
                       f"belongs to domain '{actual or '?'}', requested: "
                       f"'{_normalize_domain(expected_domain)}'",
            )
    try:
        summary = findings_summary(check_id)
        findings = list_findings(
            check_id=check_id,
            source_type=source,
            severity=severity,
            doc_type=doc_type,
            status=status,
            q=q,
            limit=limit,
        )
        return {
            "found": summary.get("total", 0) > 0,
            "check_id": check_id,
            "summary": summary,
            # Echo the effective filter so the UI can display what was applied.
            "filter": {
                "source": source or "all",
                "severity": severity or "all",
                "doc_type": doc_type or "all",
                "status": status or "all",
                "q": q or "",
                "limit": limit,
            },
            "count": len(findings),
            "findings": findings,
        }
    except Exception as e:
        # Store errors are logged and reported as an empty result (HTTP 200)
        # rather than raised, so the audit UI can still render.
        logger.exception("get_findings failed for %s", check_id)
        return {
            "found": False,
            "check_id": check_id,
            "error": str(e)[:200],
            "summary": {},
            "count": 0,
            "findings": [],
        }