feat: ZeroClaw compliance agent — document analysis + role assignment + email

Add autonomous compliance agent that fetches web documents (cookie banners,
privacy policies), classifies them via Qwen/Ollama, assesses DSGVO compliance,
assigns to the responsible role, and sends notification emails.

Components:
- ZeroClaw SOP (6-step workflow: fetch, classify, assess, summarize, assign, notify)
- Backend: /api/compliance/agent/analyze (combined endpoint)
- Backend: /api/compliance/agent/notify (standalone email)
- Frontend: /sdk/agent page (Manager UI with URL input + results)
- Helper scripts + E2E test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-04-27 23:27:25 +02:00
parent f528b8e7a9
commit 0c0dd4e3a6
16 changed files with 1095 additions and 0 deletions

zeroclaw/README.md

@@ -0,0 +1,56 @@
# ZeroClaw Compliance Agent Demo
Autonomous compliance agent that analyzes web documents (cookie banners, privacy policies) and forwards the results to the responsible role.
## Architecture
```
ZeroClaw Agent (Rust, Mac Mini)
├── LLM: Qwen 3.5:35b-a3b (Ollama, localhost:11434)
├── Compliance SDK (Go/Gin, localhost:8093)
│   ├── /sdk/v1/llm/chat         → Document classification
│   ├── /sdk/v1/ucca/assess      → Risk assessment
│   └── /sdk/v1/ucca/escalations → Escalation + role assignment
├── Backend (Python/FastAPI, localhost:8002)
│   └── /api/compliance/agent/notify → Email notification
└── Mailpit (SMTP localhost:1025, Web localhost:8025)
    └── Simulated email delivery
```
## Prerequisites
- ZeroClaw v0.7.3+ (`brew install zeroclaw`)
- Ollama with the `qwen3.5:35b-a3b` model
- All compliance services running (SDK, Backend, Mailpit)
## Running the demo
```bash
# 1. Connect ZeroClaw to Ollama (one-time setup)
zeroclaw onboard --quick --provider ollama --model qwen3.5:35b-a3b
# 2. Run the SOP
zeroclaw agent -m "Analysiere die Datenschutzerklaerung von https://www.google.com/intl/de/policies/privacy/"
# 3. Check the result
open http://localhost:8025 # Mailpit Web-UI
```
## E2E Test
```bash
bash zeroclaw/tests/test_sop_workflow.sh
```
## SOP workflow (6 steps)
1. **Fetch**: fetch the URL and strip the HTML
2. **Classify**: determine the document type (privacy_policy, cookie_banner, etc.)
3. **Assess**: DSGVO (GDPR) risk assessment via UCCA
4. **Summarize**: manager report in German
5. **Assign**: determine the responsible role (E0-E3 mapping)
6. **Notify**: send an email to the DSB / team lead

@@ -0,0 +1,34 @@
#!/usr/bin/env bash
#
# fetch-and-analyze.sh — Fetch a URL and extract clean text for compliance analysis.
#
# Usage: bash fetch-and-analyze.sh <url> [max_chars]
#
# Outputs clean text to stdout, truncated to max_chars (default: 4000).
set -euo pipefail
URL="${1:?Usage: fetch-and-analyze.sh <url> [max_chars]}"
MAX_CHARS="${2:-4000}"
# Fetch the page with a timeout and user agent; -f turns HTTP errors into failures
HTML=$(curl -fsL --max-time 30 \
  -H "User-Agent: Mozilla/5.0 (compatible; BreakPilot-Compliance-Agent/1.0)" \
  "$URL" 2>/dev/null || echo "")
if [ -z "$HTML" ]; then
  echo "ERROR: Could not fetch $URL" >&2
  exit 1
fi
# Strip HTML. perl slurps the whole input (-0777) so multi-line <style>/<script>
# blocks are removed; line-based sed would miss them. Tags become spaces so
# adjacent words stay separated. &amp; is decoded last to avoid double-decoding.
CLEAN=$(printf '%s' "$HTML" \
  | perl -0777 -pe 's/<(script|style)\b[^>]*>.*?<\/\1>//gis; s/<[^>]*>/ /g' \
  | sed 's/&nbsp;/ /g; s/&lt;/</g; s/&gt;/>/g; s/&quot;/"/g; s/&amp;/\&/g' \
  | tr -s '[:space:]' ' ' \
  | sed 's/^ //; s/ $//')
# Truncate to max chars. Substring expansion avoids piping into head, which can
# kill the writer with SIGPIPE and make the script fail under pipefail.
printf '%s\n' "${CLEAN:0:$MAX_CHARS}"

@@ -0,0 +1,35 @@
#!/usr/bin/env bash
#
# send-notification.sh — Send a notification email via Mailpit SMTP.
#
# Usage: bash send-notification.sh <recipient> <subject> <body_text>
#
# Sends through Mailpit's SMTP (default localhost:1025) using Python's smtplib.
# The body is attached as HTML.
set -euo pipefail
RECIPIENT="${1:?Usage: send-notification.sh <recipient> <subject> <body_text>}"
SUBJECT="${2:?Missing subject}"
BODY="${3:?Missing body text}"
SMTP_HOST="${SMTP_HOST:-localhost}"
SMTP_PORT="${SMTP_PORT:-1025}"
FROM_ADDR="${SMTP_FROM_ADDR:-compliance-agent@breakpilot.local}"
FROM_NAME="${SMTP_FROM_NAME:-BreakPilot Compliance Agent}"
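# Pass all values as arguments instead of interpolating them into the Python
# source, so quotes in the subject or body cannot break (or inject into) the code.
python3 - "$SMTP_HOST" "$SMTP_PORT" "$FROM_NAME" "$FROM_ADDR" "$RECIPIENT" "$SUBJECT" "$BODY" <<'PY'
import smtplib
import sys
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr

host, port, from_name, from_addr, recipient, subject, body = sys.argv[1:8]

msg = MIMEMultipart('alternative')
msg['From'] = formataddr((from_name, from_addr))
msg['To'] = recipient
msg['Subject'] = subject
msg.attach(MIMEText(body, 'html', 'utf-8'))

with smtplib.SMTP(host, int(port)) as server:
    server.sendmail(from_addr, [recipient], msg.as_string())
print(f'Email sent to {recipient}')
PY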

@@ -0,0 +1,98 @@
## Context
You are a compliance analyst agent. You analyze web documents (cookie banners, privacy policies) for DSGVO (GDPR) conformity using the BreakPilot Compliance SDK.
### Endpoints
- **Compliance SDK:** http://localhost:8093
- **Backend:** http://localhost:8002
- **Mailpit SMTP:** localhost:1025
- **Mailpit Web:** http://localhost:8025
### Authentication
All SDK requests need these headers:
- `X-Tenant-ID: 9282a473-5c95-4b3a-bf78-0ecc0ec71d3e`
- `X-User-ID: 00000000-0000-0000-0000-000000000001`
## Steps
### 1. Fetch Document
Fetch the target URL and extract the text:
```bash
# perl reads the whole page at once so multi-line <style>/<script> blocks are removed
curl -fsL --max-time 30 "$URL" \
  | perl -0777 -pe 's/<(script|style)\b[^>]*>.*?<\/\1>//gis; s/<[^>]*>/ /g; s/&nbsp;/ /g; s/&amp;/&/g' \
  | tr -s '[:space:]' ' ' | head -c 4000
```
Store the result as `$DOCUMENT_TEXT`.
### 2. Classify Document
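Send the text to the SDK for classification. Encode it as a JSON string first, because raw document text usually contains quotes and newlines that would break the request body:
```bash
# json.dumps adds the surrounding quotes and escapes quotes, backslashes and newlines
DOC_JSON=$(printf '%s' "$DOCUMENT_TEXT" | python3 -c 'import json,sys; print(json.dumps(sys.stdin.read()))')
curl -s -X POST http://localhost:8093/sdk/v1/llm/chat \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: 9282a473-5c95-4b3a-bf78-0ecc0ec71d3e" \
  -H "X-User-ID: 00000000-0000-0000-0000-000000000001" \
  -d '{
    "messages": [
      {"role": "system", "content": "Klassifiziere das folgende Dokument in GENAU EINE Kategorie: privacy_policy, cookie_banner, terms_of_service, imprint, dpa, other. Antworte NUR mit dem Kategorienamen."},
      {"role": "user", "content": '"$DOC_JSON"'}
    ]
  }'
```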
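Step 3 expects the category in `$CLASSIFICATION`. The following is a minimal sketch of how the reply could be reduced to one of the six allowed names. It assumes the answer sits in a top-level `response` field, which is what the E2E test reads; the function name `extract_category`, the sample reply, and the fallback to `other` are illustrative assumptions, not part of the SDK.

```shell
# Made-up sample reply; the "response" field name is taken from the E2E test.
CLASSIFY_RESULT='{"response": " Privacy_Policy\n"}'

# extract_category (hypothetical helper): read the chat reply on stdin and
# print a normalized category. Falls back to "other" when the answer is not
# one of the six allowed names. Assumes python3 is on PATH.
extract_category() {
  python3 -c '
import json, sys
allowed = {"privacy_policy", "cookie_banner", "terms_of_service", "imprint", "dpa", "other"}
answer = str(json.load(sys.stdin).get("response", "")).strip().lower()
print(answer if answer in allowed else "other")
'
}

CLASSIFICATION=$(printf '%s' "$CLASSIFY_RESULT" | extract_category)
echo "$CLASSIFICATION"
```

Normalizing case and whitespace here keeps later steps from having to deal with stray formatting in the model's answer.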
### 3. Analyze Compliance
Run a UCCA assessment. The document text is JSON-encoded before it is embedded in the request body:
```bash
DOC_JSON=$(printf '%s' "$DOCUMENT_TEXT" | python3 -c 'import json,sys; print(json.dumps(sys.stdin.read()))')
curl -s -X POST http://localhost:8093/sdk/v1/ucca/assess \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: 9282a473-5c95-4b3a-bf78-0ecc0ec71d3e" \
  -H "X-User-ID: 00000000-0000-0000-0000-000000000001" \
  -d '{
    "use_case_text": '"$DOC_JSON"',
    "domain": "'"$CLASSIFICATION"'",
    "data_categories": ["personal_data", "tracking", "cookies", "third_party_sharing"]
  }'
```
Note down: `risk_score`, `risk_level`, `escalation_level`, `triggered_rules`, `required_controls`.
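One way to pull these five values out of the assessment reply is sketched below. The sample reply is made up, and the sketch assumes the fields are top-level keys of the JSON object; the real `/sdk/v1/ucca/assess` schema may nest them differently. `json_field` is a hypothetical helper name.

```shell
# Made-up sample reply; values and nesting are assumptions for illustration.
ASSESS_RESULT='{"risk_score": 62, "risk_level": "HIGH", "escalation_level": "E2", "triggered_rules": ["R-001"], "required_controls": ["C-014"]}'

# json_field (hypothetical helper): print one top-level field of the JSON
# object on stdin. Strings print raw, other values print as JSON, and a
# missing field prints an empty line. Assumes python3 is on PATH.
json_field() {
  python3 -c '
import json, sys
value = json.load(sys.stdin).get(sys.argv[1], "")
print(value if isinstance(value, str) else json.dumps(value))
' "$1"
}

RISK_SCORE=$(printf '%s' "$ASSESS_RESULT" | json_field risk_score)
RISK_LEVEL=$(printf '%s' "$ASSESS_RESULT" | json_field risk_level)
ESCALATION_LEVEL=$(printf '%s' "$ASSESS_RESULT" | json_field escalation_level)
TRIGGERED_RULES=$(printf '%s' "$ASSESS_RESULT" | json_field triggered_rules)
REQUIRED_CONTROLS=$(printf '%s' "$ASSESS_RESULT" | json_field required_controls)

echo "risk=$RISK_LEVEL ($RISK_SCORE), escalation=$ESCALATION_LEVEL"
```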
### 4. Prepare Summary
Create a manager report in German and store its HTML version in `$MANAGER_SUMMARY_HTML` for step 6. The report contains:
- **Document type:** (from step 2)
- **Source:** (URL)
- **Risk assessment:** (`risk_level` + `risk_score` from step 3)
- **Key findings:** (summary of `triggered_rules`)
- **Required measures:** (summary of `required_controls`)
- **Recommendation:** (recommended action based on `escalation_level`)
### 5. Determine Responsible Role
Based on the `escalation_level` from step 3, set `$RESPONSIBLE_ROLE`:
- **E0** → No action needed, automatically compliant
- **E1** → Data protection team lead (Teamleitung Datenschutz)
- **E2** → Data protection officer (Datenschutzbeauftragter, DSB)
- **E3** → DSB + legal department (Rechtsabteilung), joint decision
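The mapping above could be captured in a small shell function. This is only a sketch: `role_for_level` is a hypothetical name, the role strings reuse the German role names from this SOP, and `none` as the value for E0 is an assumed placeholder.

```shell
# role_for_level (hypothetical helper): map an escalation level (E0-E3) to
# the responsible role. Unknown levels are reported on stderr and return 1.
role_for_level() {
  case "$1" in
    E0) echo "none" ;;
    E1) echo "Teamleitung Datenschutz" ;;
    E2) echo "Datenschutzbeauftragter (DSB)" ;;
    E3) echo "DSB + Rechtsabteilung" ;;
    *)  echo "unknown escalation level: $1" >&2; return 1 ;;
  esac
}

RESPONSIBLE_ROLE=$(role_for_level "E2")
echo "$RESPONSIBLE_ROLE"
```

Failing on an unknown level, rather than silently defaulting, makes a changed or misspelled `escalation_level` visible before any email is sent.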
### 6. Send Notification Email
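Send a notification to the responsible role. In this demo the recipient is fixed to `dsb@breakpilot.local` and the role is passed in the `role` field. The HTML summary is JSON-encoded first so that quotes in the markup do not break the request body:
```bash
SUMMARY_JSON=$(printf '%s' "$MANAGER_SUMMARY_HTML" | python3 -c 'import json,sys; print(json.dumps(sys.stdin.read()))')
curl -s -X POST http://localhost:8002/api/compliance/agent/notify \
  -H "Content-Type: application/json" \
  -d '{
    "recipient": "dsb@breakpilot.local",
    "subject": "Compliance-Finding: '"$CLASSIFICATION"' — '"$URL"'",
    "body_html": '"$SUMMARY_JSON"',
    "role": "'"$RESPONSIBLE_ROLE"'"
  }'
```
Check the result in Mailpit: http://localhost:8025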

@@ -0,0 +1,15 @@
[sop]
name = "compliance-analyst"
description = "Fetch a web document (cookie banner, privacy policy), analyze for DSGVO compliance via BreakPilot SDK, assign to responsible role, notify via email"
version = "1.0.0"
priority = "normal"
execution_mode = "supervised"
max_concurrent = 1
cooldown_secs = 60
[[triggers]]
type = "manual"
[[triggers]]
type = "webhook"
path = "/sop/compliance-analyst"

@@ -0,0 +1,96 @@
#!/usr/bin/env bash
#
# test_sop_workflow.sh — End-to-end test for the compliance-analyst SOP.
#
# Prerequisites:
# - Compliance SDK running on localhost:8093
# - Backend running on localhost:8002
# - Ollama running on localhost:11434 with qwen model
# - Mailpit running (SMTP on 1025, Web on 8025)
# - ZeroClaw installed
set -euo pipefail
SDK="http://localhost:8093"
BACKEND="http://localhost:8002"
OLLAMA="http://localhost:11434"
MAILPIT="http://localhost:8025"
TENANT="9282a473-5c95-4b3a-bf78-0ecc0ec71d3e"
USER_ID="00000000-0000-0000-0000-000000000001"
red() { printf '\033[31m✗ %s\033[0m\n' "$*"; }
green() { printf '\033[32m✓ %s\033[0m\n' "$*"; }
echo "═══ Compliance Agent SOP — E2E Test ═══"
echo ""
# Step 1: Health checks
echo "── Step 1: Service Health ──"
curl -sf "$SDK/health" >/dev/null && green "SDK healthy" || red "SDK unreachable"
curl -sf "$BACKEND/health" >/dev/null && green "Backend healthy" || red "Backend unreachable"
curl -sf "$OLLAMA/api/tags" >/dev/null && green "Ollama running" || red "Ollama unreachable"
# Step 2: Test document fetch
echo ""
echo "── Step 2: Document Fetch ──"
TEXT=$(bash "$(dirname "$0")/../scripts/fetch-and-analyze.sh" "https://www.google.com/intl/de/policies/privacy/" 2000)
CHARS=${#TEXT}
if [ "$CHARS" -gt 100 ]; then
green "Fetched $CHARS chars from Google Privacy Policy"
else
red "Fetch returned too little text ($CHARS chars)"
exit 1
fi
# Step 3: Test LLM classification
echo ""
echo "── Step 3: LLM Classification ──"
CLASSIFY_RESULT=$(curl -sf -X POST "$SDK/sdk/v1/llm/chat" \
-H "Content-Type: application/json" \
-H "X-Tenant-ID: $TENANT" \
-H "X-User-ID: $USER_ID" \
-d "{
\"messages\": [
{\"role\": \"system\", \"content\": \"Klassifiziere: privacy_policy, cookie_banner, terms_of_service, imprint, dpa, other. Antworte NUR mit dem Kategorienamen.\"},
{\"role\": \"user\", \"content\": $(echo "$TEXT" | head -c 1000 | python3 -c 'import json,sys; print(json.dumps(sys.stdin.read()))')}
]
}" 2>&1) || true
if echo "$CLASSIFY_RESULT" | grep -Eqi "privacy_policy|cookie|terms|imprint|dpa"; then
green "Classification: $(echo "$CLASSIFY_RESULT" | python3 -c 'import json,sys; d=json.load(sys.stdin); print(d.get("response","").strip()[:50])' 2>/dev/null || echo "$CLASSIFY_RESULT" | head -c 50)"
else
echo " Classification result: $(echo "$CLASSIFY_RESULT" | head -c 100)"
red "Classification did not return expected category (may still be valid)"
fi
# Step 4: Test notification endpoint
echo ""
echo "── Step 4: Agent Notification ──"
NOTIFY_RESULT=$(curl -sf -X POST "$BACKEND/api/compliance/agent/notify" \
-H "Content-Type: application/json" \
-d '{
"recipient": "dsb@breakpilot.local",
"subject": "E2E Test: Compliance-Finding",
"body_html": "<h2>Test-Benachrichtigung</h2><p>Automatischer E2E-Test des Compliance-Agent SOP.</p>",
"role": "Datenschutzbeauftragter"
}' 2>&1) || true
if echo "$NOTIFY_RESULT" | grep -Eqi "sent|success|ok"; then
green "Notification sent"
else
echo " Notify result: $(echo "$NOTIFY_RESULT" | head -c 100)"
red "Notification endpoint returned unexpected result"
fi
# Step 5: Check Mailpit
echo ""
echo "── Step 5: Mailpit Check ──"
MAIL_COUNT=$(curl -sf "$MAILPIT/api/v1/messages" 2>/dev/null | python3 -c 'import json,sys; d=json.load(sys.stdin); print(d.get("total",0))' 2>/dev/null || echo "0")
if [ "$MAIL_COUNT" -gt 0 ]; then
green "Mailpit has $MAIL_COUNT message(s)"
else
red "No messages in Mailpit (check SMTP connectivity)"
fi
echo ""
echo "═══ E2E Test Complete ═══"