Backend: _ensure_list() converts null/string/malformed JSONB to []
for requirements, test_procedure, evidence, open_anchors, tags.
Frontend: defensive Array.isArray() check on ControlDetail.tsx.
Fixes: TypeError: A.requirements.map is not a function
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CookieBannerOverlay: shows vendors per category with expandable tables
(Verarbeiter, Cookies, Dauer, Land) for full transparency
- Demo vendors: 4 necessary, 3 statistics, 3 marketing, 3 functional
- cookie_table_generator.py: renders {{COOKIE_TABLE}} Markdown tables
from vendor configs (DB) or service registry (fallback)
- SERVICE_COOKIES: 16 known vendor-to-cookie mappings with provider + country
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fundamental architecture fix: data processing happens through APIs/scripts/
cookies — NOT through visible page text. A news site about healthcare does
NOT process health data.
Before: Qwen reads website text → guesses "health_data: true" (WRONG)
After: Google Analytics detected → tracking: true (CORRECT, deterministic)
New flow: detect services from HTML → map service categories to flags →
feed flags into UCCA assessment. No LLM needed for flag extraction.
SERVICE_TO_FLAGS maps categories: tracking→tracking, marketing→marketing+
third_party_sharing, payment→payment_data, heatmap→profiling, etc.
SPECIFIC_SERVICE_FLAGS for Klarna (Art.22), Stripe (US transfer), etc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New templates for the Vendor Compliance module:
- 105: Transfer Impact Assessment (TIA) — Schrems II risk assessment
with country evaluation, government access assessment, supplementary
measures, risk matrix, and go/conditional/deny decision
- 105: SCC Companion Document — annexes to EU Decision 2021/914
(module selection C2C/C2P/P2P/P2C, party details, data description,
TOMs, sub-processor list)
Template recommendations: SCC+TIA triggered by tech_third_country answer
Generator: New "Drittlandtransfer" category
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 3 of the Document Templates Masterplan:
- 103: 4 new security policies (information_security_policy, password_policy,
encryption_policy, access_control_policy) + updates for CRA (056) and
all 15 HR/Vendor/BCM policies (072)
New templates:
- Information Security Policy: ISMS-Leitlinie (ISO 27001, BSI, NIS2)
- Password Policy: BSI/NIST compliant (12+ chars, MFA, no forced rotation)
- Encryption Policy: BSI TR-02102, algorithms, key management, TLS config
- Access Control Policy: RBAC, Least Privilege, Zero Trust, rezertification
Updates: AI Act + NIS2UmsuCG references for CRA and all 15 HR/Vendor/BCM
Generator: 6 new categories (security, HR, data, vendor, BCM policies)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Dockerfile: install Playwright AS appuser (not root) so chromium
binary is accessible at runtime. Was causing 500 error.
2. DSE service matching: text-search fallback when LLM extraction fails.
If "etracker" appears in DSE text, mark as documented even without
LLM parsing the service list.
3. CMP skip: consent managers in category "cmp" skipped (not just "other"
with id "cmp").
NOT DEPLOYED — RAG pipeline is running.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New /website-scan endpoint in consent-tester service:
- Real browser renders JavaScript (finds dynamic content)
- Clicks navigation menus (discovers hidden sub-pages like IHK DSB page)
- Follows links within DSE to find regional privacy policies
- Collects rendered HTML for each page (after JS execution)
Backend integration:
- agent_scan_routes tries Playwright first, falls back to httpx
- DSE text and HTML extracted from Playwright-rendered pages
- Service detection runs on rendered HTML (catches JS-loaded scripts)
Also fixes:
- GA regex: G-[A-Z0-9]{8,12} prevents CSS class false positives
- etracker added to service registry
- External page scanning blocked (same-domain only)
- CSS/JS/image files excluded from page list
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. GA regex: G-\w{5,} matched CSS classes (g-7031048). Now requires
G-[A-Z0-9]{8,12} (uppercase after G-, 8-12 chars = real GA4 ID)
2. External page scanning: DSE-internal links now SAME DOMAIN only.
Previously followed links to etracker.com, google.de/policies etc.
and detected services on THOSE sites as IHK services.
3. Added etracker to service registry (DE, ePrivacy-certified)
4. CSS/JS/image files excluded from page scanning
5. Navigation-pattern links for deeper DSE sub-pages
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. DSE-Matcher: Google/YouTube false match — now requires 2+ word match
for provider-name fallback, not just "Google" matching YouTube section
2. AGB/Widerrufsbelehrung: only_ecommerce flag — skips for non-shop
websites (detected via payment providers, cart keywords)
3. DSE-internal link following — scanner now discovers links WITHIN the
privacy policy and scans those too (finds regional DSE sub-pages)
4. Expanded keyword synonyms for DSE mandatory checks:
- "Zweck und Rechtsgrundlage" now matches "zwecke"
- "behoerdlichen datenschutzbeauftragt" matches DSB
- "aufsichtsbehörde" with umlaut matches
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8 test cases with deliberately wrong legal basis assignments:
- Cookie tracking on lit. f (should be lit. a)
- Analytics on lit. b (should be lit. a)
- Newsletter on lit. f (should be lit. a)
- Klarna without Art. 22
- Session recording on lit. f
- 2 correct cases (should NOT trigger findings)
Runs both hardcoded dict AND Control Library query, compares results.
If Control Library passes all → dict can be removed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6 files with hardcoded legal knowledge identified. Review deadline 2026-07-01.
legal_basis_validator.py marked with warning log on every use.
Instruction file for other session to execute migration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 6: PDF export via WeasyPrint — POST /agent/scans/pdf generates
printable compliance report with findings table, service comparison,
risk badge, and legal disclaimer.
Phase 7: Recurring scans — POST /agent/monitored-urls to add URLs,
POST /agent/run-scheduled triggers all enabled scans (cron/ZeroClaw).
In-memory storage with DB upgrade path.
Phase 8: Multi-website compare — POST /agent/compare with 2-5 URLs,
parallel scanning, comparison table (risk, findings, services, compliance
features per site).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Migration 086: compliance_agent_scans table (findings, services, corrections)
- agent_history_routes.py: POST /scans (save), GET /scans (list), GET /scans/{id}
- Scan results survive page reloads and can be reviewed later
- Phase 10 (Playwright website scanner) added to product roadmap
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scan pages in parallel instead of sequential. Reduces scan time
from ~10s (5 pages × 2s) to ~3s (all pages at once).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- dse_parser.py: HTML → structured sections (heading, number, content, parent)
Uses heading hierarchy (h1-h4) with regex fallback
- dse_matcher.py: matches detected services against DSE sections
Exact name → provider → category matching with insertion point suggestion
- agent_scan_routes: TextReference model in findings (original text,
section, paragraph, correction type, insert_after)
Enables showing: "Google Analytics not found in DSE, insert after
Section 2.4 Cookies und Tracking"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 0: Qwen extracts 14 structured intake flags (personal_data,
marketing, profiling, ai_usage, etc.) instead of keyword matching.
Fallback to keywords if LLM unavailable. Flags feed into UCCA for
accurate scoring.
Phase 1: Control relevance filter removes false positives.
C_TRANSPARENCY only recommended if AI/ML keywords found in text.
7 control rules with keyword lists + intake flag fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Summary now renders as styled HTML (table layout, colored risk badge,
warning banners) instead of plaintext in <div>
- Tab info text explains scope: "Analysiert nur die eingegebene URL" vs
"Scannt automatisch 5-10 Unterseiten"
- Scan history with findings count badge and page count
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Email now lists all scanned URLs with checkmark/cross status.
Frontend shows collapsible "X Seiten gescannt — Details anzeigen".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SDK LLM chat returns empty content due to Qwen think-mode. Direct Ollama
/api/generate call with stream:false gets the full response including
think tags which we strip.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Backend: mode field in request, adapts summary tone and email subject
- Pre-launch: "Implementieren Sie X vor Veroeffentlichung"
- Post-launch: "ACHTUNG: Maengel sind oeffentlich sichtbar, sofortige Nachbesserung"
- Frontend: Mode toggle (internes Dokument vs. Live-Website)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Scan public website for cancellation button, imprint, privacy link, cookie consent
- Generate follow-up questions when checks can't be verified without login
- User answers "no" → finding with legal basis is added to results
- Frontend: FollowUpQuestions component with Ja/Nein buttons
- Sidebar: "Compliance Agent" entry added under KI-Compliance
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SessionLocal: 5x verwendet fuer DB-Sessions ausserhalb Depends()
HTTPException: verwendet in Framework-Validation
text: 55x verwendet fuer raw SQL queries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merged feature/fisa-702-drittland-risiko in den refakturierten main-Branch.
Konflikte in 8 Dateien aufgelöst — neue Features in die aufgesplittete
Modulstruktur integriert.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>