feat: website scanner with target/actual (SOLL/IST) service comparison + corrections

- website_scanner.py: multi-page crawl, 20+ service patterns (tracking,
  CDN, chatbots, payment, fonts, captcha, video), AI text detection
- dse_service_extractor.py: LLM extracts services from privacy policy text
- agent_scan_routes.py: POST /agent/scan (mounted under /api) combines the
  scan with the DSE (privacy policy) comparison, generates findings
  (undocumented, outdated, third-country transfer), and applies
  auto-corrections via Qwen in pre-launch mode
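
The target/actual comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the pattern table, the function names `detect_services` and `compare_services`, and the reading of "outdated" as "documented in the policy but not found on the site" are all assumptions. The third-country check and the LLM-based extraction are left out.

```python
import re

# Hypothetical pattern table. The real website_scanner.py is described as
# having 20+ patterns; these three entries are illustrative only.
SERVICE_PATTERNS = {
    "Google Analytics": re.compile(r"googletagmanager\.com|google-analytics\.com"),
    "Google Fonts": re.compile(r"fonts\.googleapis\.com"),
    "Stripe": re.compile(r"js\.stripe\.com"),
}


def detect_services(html: str) -> set[str]:
    """Return the names of all services whose pattern occurs in the page HTML."""
    return {name for name, pattern in SERVICE_PATTERNS.items() if pattern.search(html)}


def compare_services(documented: set[str], detected: set[str]) -> dict[str, list[str]]:
    """Compare documented (SOLL) against detected (IST) services.

    Assumed semantics: 'undocumented' = found on the site but missing from
    the privacy policy; 'outdated' = listed in the policy but not found.
    """
    return {
        "undocumented": sorted(detected - documented),
        "outdated": sorted(documented - detected),
    }


if __name__ == "__main__":
    page = (
        '<script src="https://js.stripe.com/v3/"></script>'
        '<link href="https://fonts.googleapis.com/css2?family=Inter">'
    )
    policy_services = {"Google Analytics", "Stripe"}  # assumed extractor output
    print(compare_services(policy_services, detect_services(page)))
```

Plain set differences keep the comparison independent of how either side was produced, so the scanner and the policy extractor can change without touching the findings logic.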

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-04-28 15:35:31 +02:00
parent d0dc284cd5
commit 711b9b3146
4 changed files with 679 additions and 0 deletions


@@ -44,6 +44,7 @@ from compliance.api.company_profile_routes import router as company_profile_rout
# Agent (ZeroClaw compliance agent)
from compliance.api.agent_notification_routes import router as agent_notify_router
from compliance.api.agent_analyze_routes import router as agent_analyze_router
+from compliance.api.agent_scan_routes import router as agent_scan_router
# Middleware
from middleware import (
@@ -142,6 +143,7 @@ app.include_router(company_profile_router, prefix="/api")
# Agent (ZeroClaw compliance agent → analyze + email via SMTP)
app.include_router(agent_notify_router, prefix="/api")
app.include_router(agent_analyze_router, prefix="/api")
+app.include_router(agent_scan_router, prefix="/api")
if __name__ == "__main__":