# Compliance Agent Documentation

## Overview

The Compliance Agent automatically analyzes websites and documents for GDPR (DSGVO) compliance.
It combines website scanning, LLM analysis, the Control Library, and Playwright browser tests
into a comprehensive compliance audit.

## 5 Analysis Modes

### 1. Quick Analysis
Classifies and assesses a single URL.
- Qwen classifies the document type (privacy policy, cookie banner, terms and conditions, legal notice)
- The LLM extracts intake flags (14 categories)
- The UCCA assessment rates the risk
- The relevance filter removes false-positive controls
- An email notification is sent to the responsible role

### 2. Website Scan
Multi-page crawl with service-provider matching.
- A Playwright browser scans 5-15 pages (JS rendering, menu clicks)
- 82+ services detected (tracking, CDN, chatbots, payment, marketing)
- Target/actual comparison: privacy policy text vs. services actually embedded
- Mandatory content check: Art. 13 GDPR (9 fields) + §5 TMG (5 fields)
- Text block referencing: original text, position, suggested correction
- Lit. mapping: checks whether the correct legal basis (Art. 6(1) lit. a-f GDPR) is used

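The target/actual comparison can be illustrated with plain set operations. This is a minimal sketch, not the agent's actual implementation: the function name `compare_declared_vs_detected` and the service names are hypothetical, and it assumes that the services named in the privacy policy and the services detected on the pages have already been normalized to the same names.

```python
def compare_declared_vs_detected(declared, detected):
    """Compare services named in the privacy policy with services found on the site.

    Hypothetical helper for illustration; both inputs are iterables of
    already-normalized service names.
    """
    declared_set = set(declared)
    detected_set = set(detected)
    return {
        # Embedded on the site but missing from the privacy policy
        "undeclared": sorted(detected_set - declared_set),
        # Mentioned in the privacy policy but not found on the site
        "unused": sorted(declared_set - detected_set),
        # Declared and actually embedded
        "matched": sorted(declared_set & detected_set),
    }


result = compare_declared_vs_detected(
    declared=["Google Analytics", "Stripe"],
    detected=["Google Analytics", "Hotjar"],
)
print(result["undeclared"])  # ['Hotjar']
```

Under this sketch, the `undeclared` list would be the part most likely to produce findings, since it names services that load without being mentioned in the policy text.
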
### 3. Cookie Test
3-phase consent test with a real Chromium browser.
- Phase A: What loads BEFORE consent? (violation of §25 TDDDG)
- Phase B: What loads AFTER rejection? (CRITICAL if tracking continues)
- Phase C: What loads AFTER acceptance? (checked against the cookie policy)
- Phases D-F: Test individual categories (statistics, marketing, functional)
- 10 CMP-specific selectors (Cookiebot, OneTrust, Didomi, etc.)

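How the observations from phases A-C might be turned into findings can be sketched as follows. Everything here is an assumption made for illustration: the tracker host list, the function names, and the severity labels `HIGH` and `INFO` (only `CRITICAL` for phase B comes from the list above). The real `consent_scanner.py` is not shown in this document and may work differently.

```python
from urllib.parse import urlparse

# Hypothetical tracker list; the real service registry is not shown here.
TRACKING_HOSTS = {"www.google-analytics.com", "connect.facebook.net"}


def tracking_hosts(request_urls):
    """Return the sorted tracking hosts that appear in a list of request URLs."""
    hosts = {urlparse(url).hostname for url in request_urls}
    return sorted(hosts & TRACKING_HOSTS)


def evaluate_phases(before_consent, after_reject, after_accept):
    """Turn the requests observed in phases A-C into a list of findings."""
    findings = []
    for host in tracking_hosts(before_consent):
        # Phase A: tracking before any consent decision (assumed severity)
        findings.append({"phase": "A", "host": host, "severity": "HIGH"})
    for host in tracking_hosts(after_reject):
        # Phase B: tracking continues although the user rejected
        findings.append({"phase": "B", "host": host, "severity": "CRITICAL"})
    for host in tracking_hosts(after_accept):
        # Phase C: permitted, but to be cross-checked against the cookie policy
        findings.append({"phase": "C", "host": host, "severity": "INFO"})
    return findings


findings = evaluate_phases(
    before_consent=["https://example.com/app.js"],
    after_reject=["https://www.google-analytics.com/collect?v=1"],
    after_accept=[
        "https://www.google-analytics.com/collect?v=1",
        "https://connect.facebook.net/en_US/fbevents.js",
    ],
)
```

In this example the site loads no tracker before consent, but Google Analytics keeps loading after rejection, so the first finding is a phase B entry marked `CRITICAL`.
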
### 4. Comparison
Scans 2-5 websites in parallel and compares their compliance.
- Comparison table: risk, findings, services, legal notice, cookie banner

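Scanning several sites in parallel could look roughly like the sketch below, which uses `asyncio.gather` from the standard library. The function names are hypothetical and `scan_site` is only a placeholder that returns an empty result. How the agent actually schedules its scans is not described in this document.

```python
import asyncio


async def scan_site(url):
    """Placeholder for a real scan; returns an empty result for the given URL."""
    await asyncio.sleep(0)  # stands in for network and browser work
    return {"url": url, "findings": []}


async def compare_sites(urls):
    """Scan 2-5 sites concurrently and return the results in input order."""
    if not 2 <= len(urls) <= 5:
        raise ValueError("comparison expects between 2 and 5 URLs")
    return await asyncio.gather(*(scan_site(url) for url in urls))


results = asyncio.run(compare_sites(["https://a.example", "https://b.example"]))
```

`asyncio.gather` returns results in the order of its arguments, which keeps the rows of a comparison table aligned with the URLs that were passed in.
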
### 5. Login Test
Checks the customer area after login.
- §312k BGB: cancellation button (2 clicks)
- Art. 17 GDPR: delete account
- Art. 20 GDPR: export data
- Art. 7(3) GDPR: withdraw consent
- Art. 15 GDPR: view profile data

## API Endpoints

### Backend (Port 8002)

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/compliance/agent/analyze` | Quick analysis |
| POST | `/api/compliance/agent/scan` | Website scan |
| POST | `/api/compliance/agent/notify` | Send email |
| POST | `/api/compliance/agent/scans` | Save scan |
| GET | `/api/compliance/agent/scans` | Scan history |
| POST | `/api/compliance/agent/scans/pdf` | PDF export |
| POST | `/api/compliance/agent/compare` | Multi-website comparison |
| POST | `/api/compliance/agent/monitored-urls` | Add URL to monitoring |
| POST | `/api/compliance/agent/run-scheduled` | Trigger scheduled scans |

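As an illustration of calling the backend, the sketch below builds a POST request for the `/api/compliance/agent/analyze` endpoint with the standard library. The request schema is not documented here, so the `{"url": ...}` payload, the `build_analyze_request` helper, and the `localhost` base URL are assumptions. The request is only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8002"  # assumption: backend reachable locally on port 8002


def build_analyze_request(url_to_check):
    """Build (but do not send) a POST request for the quick-analysis endpoint."""
    payload = {"url": url_to_check}  # assumption: the real request schema may differ
    return urllib.request.Request(
        f"{BASE_URL}/api/compliance/agent/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_analyze_request("https://example.com/datenschutz")
print(request.get_method(), request.full_url)
# To actually send it against a running backend: urllib.request.urlopen(request)
```
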
### Consent-Tester (Port 8094)

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/scan` | 3-phase cookie test |
| POST | `/website-scan` | Playwright website scan |
| POST | `/authenticated-scan` | Login test |
| GET | `/health` | Health check |

## Service Registry

82+ services in 15 categories:
Tracking, Marketing, Newsletter, CDN, Chatbots, Payment, Heatmaps,
A/B Testing, Tag Manager, Push, Video, Social, Error Tracking, CRM, Accessibility.

File: `backend-compliance/compliance/services/service_registry.py`

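A registry of this kind can be pictured as a mapping from script domains to a service name and category. The sketch below is a hypothetical excerpt with an assumed lookup function. The actual structure and entries of `service_registry.py` are not shown in this document.

```python
from urllib.parse import urlparse

# Hypothetical excerpt; the real entries in service_registry.py are not shown here.
SERVICE_REGISTRY = {
    "googletagmanager.com": {"name": "Google Tag Manager", "category": "Tag Manager"},
    "hotjar.com": {"name": "Hotjar", "category": "Heatmaps"},
    "js.stripe.com": {"name": "Stripe", "category": "Payment"},
}


def identify_service(script_url):
    """Return the registry entry whose domain matches the script's host, or None."""
    host = urlparse(script_url).hostname or ""
    for domain, entry in SERVICE_REGISTRY.items():
        # Match the domain itself or any subdomain of it
        if host == domain or host.endswith("." + domain):
            return entry
    return None


print(identify_service("https://static.hotjar.com/c/hotjar-1.js"))
```

Matching on the host name rather than on a substring of the full URL avoids false hits when a tracker domain merely appears in a query parameter.
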
## Pre-Launch vs. Post-Launch

| Mode | Tone | Recommendation |
|------|------|----------------|
| Pre-Launch | "Correct before publication" | Ready-to-insert privacy policy text blocks |
| Post-Launch | "WARNING: Publicly visible!" | Immediate remediation |

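One way to model the two modes is a small lookup keyed by an enum. This is a sketch under assumptions: the `LaunchMode` enum, the `GUIDANCE` mapping, and `format_finding` are hypothetical names. Only the tone and recommendation wording, rendered in English, comes from the table above.

```python
from enum import Enum


class LaunchMode(Enum):
    PRE_LAUNCH = "pre-launch"
    POST_LAUNCH = "post-launch"


# English rendering of the table's wording; the structure itself is an assumption.
GUIDANCE = {
    LaunchMode.PRE_LAUNCH: (
        "Correct before publication",
        "Ready-to-insert privacy policy text blocks",
    ),
    LaunchMode.POST_LAUNCH: (
        "WARNING: Publicly visible!",
        "Immediate remediation",
    ),
}


def format_finding(mode, message):
    """Prefix a finding with the mode's tone and append its recommendation."""
    tone, recommendation = GUIDANCE[mode]
    return f"[{tone}] {message} -> {recommendation}"


print(format_finding(LaunchMode.POST_LAUNCH, "Google Analytics not declared"))
```
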
## Architecture

```
Browser (Frontend)
    |
    ├── /sdk/agent (Next.js, 5 tabs)
    |
    ├── Next.js API Proxies (/api/sdk/v1/agent/*)
    |       |
    |       ├── Backend (FastAPI, Port 8002)
    |       |       ├── agent_analyze_routes.py
    |       |       ├── agent_scan_routes.py (+ Playwright integration)
    |       |       ├── agent_history_routes.py
    |       |       ├── agent_recurring_routes.py
    |       |       └── agent_compare_routes.py
    |       |
    |       └── Consent-Tester (FastAPI + Playwright, Port 8094)
    |               ├── consent_scanner.py (3 phases + categories)
    |               ├── playwright_scanner.py (website scan)
    |               ├── authenticated_scanner.py (login test)
    |               ├── banner_detector.py (10 CMPs)
    |               ├── category_tester.py (category toggles)
    |               └── script_analyzer.py (service detection)
    |
    ├── Qwen 3.5:35b-a3b (Ollama, Port 11434)
    └── Mailpit (SMTP 1025, Web 8025)
```