feat: ZeroClaw compliance agent — document analysis + role assignment + email

Add an autonomous compliance agent that fetches web documents (cookie banners,
privacy policies), classifies them via Qwen/Ollama, assesses their GDPR (DSGVO)
compliance, assigns each document to the responsible role, and sends
notification emails.

Components:
- ZeroClaw SOP (6-step workflow: fetch, classify, assess, summarize, assign, notify)
- Backend: /api/compliance/agent/analyze (combined endpoint)
- Backend: /api/compliance/agent/notify (standalone email)
- Frontend: /sdk/agent page (Manager UI with URL input + results)
- Helper scripts + E2E test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-04-27 23:27:25 +02:00
parent f528b8e7a9
commit 0c0dd4e3a6
16 changed files with 1095 additions and 0 deletions


@@ -0,0 +1,34 @@
#!/usr/bin/env bash
#
# fetch-and-analyze.sh — Fetch a URL and extract clean text for compliance analysis.
#
# Usage: bash fetch-and-analyze.sh <url> [max_chars]
#
# Outputs clean text to stdout, truncated to max_chars (default: 4000).
set -euo pipefail
URL="${1:?Usage: fetch-and-analyze.sh <url> [max_chars]}"
MAX_CHARS="${2:-4000}"
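# max_chars is used in a substring expansion below, so accept digits only
case "$MAX_CHARS" in
  ''|*[!0-9]*) echo "ERROR: max_chars must be a non-negative integer" >&2; exit 2 ;;
esac

# Fetch page with reasonable timeout and user agent; --fail treats HTTP errors
# (4xx/5xx) as failures instead of passing the error page on for analysis
HTML=$(curl -fsL --max-time 30 \
  -H "User-Agent: Mozilla/5.0 (compatible; BreakPilot-Compliance-Agent/1.0)" \
  "$URL" 2>/dev/null || echo "")

if [ -z "$HTML" ]; then
  echo "ERROR: Could not fetch $URL" >&2
  exit 1
fi

# Strip HTML: remove style/script blocks, then all tags, normalize whitespace.
# perl slurps the whole page (-0777) so blocks that span several lines or
# contain "<" are removed; line-by-line sed patterns would miss them.
# &amp; is decoded last so that "&amp;lt;" is not decoded twice.
CLEAN=$(printf '%s' "$HTML" \
  | perl -0777 -pe 's{<(script|style)\b[^>]*>.*?</\1\s*>}{ }gis; s{<[^>]*>}{ }g' \
  | sed 's/&nbsp;/ /g; s/&lt;/</g; s/&gt;/>/g; s/&quot;/"/g; s/&amp;/\&/g' \
  | tr -s '[:space:]' ' ' \
  | sed 's/^ //; s/ $//')

# Truncate to max chars (10# forces base 10). Substring expansion avoids the
# SIGPIPE that `head` can trigger on large pages, which fails under pipefail.
printf '%s\n' "${CLEAN:0:$((10#$MAX_CHARS))}"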
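
The HTML-to-text step can also be exercised on its own, without network
access. Script and style blocks often span several lines or contain a literal
"<", which line-oriented patterns handle poorly. The sketch below is one way to
cover those cases. It assumes a hypothetical helper function, strip_html, and an
inline sample page; neither is part of this commit, and awk is swapped in for
the tag stripping purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): read HTML on stdin, print plain text.
set -euo pipefail

strip_html() {
  awk '
    { buf = buf $0 " " }                  # join all lines so blocks may span lines
    END {
      out = ""
      low = tolower(buf)                  # lowercase copy for case-insensitive search
      while (match(low, /<(script|style)[^>]*>/)) {
        tag = (substr(low, RSTART + 1, 6) == "script") ? "script" : "style"
        out = out substr(buf, 1, RSTART - 1) " "
        buf = substr(buf, RSTART + RLENGTH)
        low = substr(low, RSTART + RLENGTH)
        i = index(low, "</" tag)          # find the matching closing tag
        if (i == 0) { buf = ""; low = ""; break }   # unclosed block: drop the rest
        buf = substr(buf, i)
        low = substr(low, i)
      }
      out = out buf
      gsub(/<[^>]*>/, " ", out)           # remaining tags become spaces
      gsub(/&nbsp;/, " ", out)
      gsub(/&lt;/, "<", out)
      gsub(/&gt;/, ">", out)
      gsub(/&quot;/, "\"", out)
      gsub(/&amp;/, "\\&", out)           # decoded last to avoid double decoding
      gsub(/[ \t\r]+/, " ", out)          # collapse runs of whitespace
      sub(/^ /, "", out); sub(/ $/, "", out)
      print out
    }'
}

# Illustrative sample: a multi-line style block and a script containing "<".
sample='<html><head><style>
  body { color: red; }
</style><SCRIPT>if (a < b) { alert("x"); }</SCRIPT></head>
<body><h1>Cookies &amp; Privacy</h1><p>We use&nbsp;cookies.</p></body></html>'

printf '%s\n' "$sample" | strip_html
```

For the sample above this should print only the visible text of the heading and
paragraph. Tags are replaced with spaces rather than removed so that adjacent
elements such as a heading and a paragraph do not run together.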