DSI Dedup (consent-tester):
- Only H1/H2 headings count as documents (not H3/H4 sub-sections)
- Sub-sections (Cookies, Betroffenenrechte, Social Media) are part of
parent document's full text, not separate documents
- Reduces IHK result from 30 to ~11 real documents
Backend (agent_scan_routes):
- ScanFinding gets doc_title field linking each finding to its document
- doc_title set when creating DSI findings for document attribution
Frontend (ScanResult.tsx):
- 3 sections: Services table, Document cards, General findings
- Documents: expandable cards with completeness bar (green/yellow/red)
- Findings grouped under their parent document
- Each card shows: title, word count, findings count, % completeness
- Findings without doc_title go to "Allgemeine Findings" section
Email Summary (agent_scan_helpers):
- Findings listed under their parent document
- General findings in separate section
- No more flat mixed list
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- scanProgress state tracks live progress (not mixed into scanData)
- ScanResult only renders when scanData.services exists (prevents crash)
- Purple progress bar with spinner shows current step during scan
- Fixes: TypeError 's.services.filter' when progress data set as scanData
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CE-Risikobeurteilung Datenerfassung mit 3 wählbaren Eingabe-Modi:
1. Interview-Modus (Chat-artig): Fragen werden nacheinander gestellt
wie im Kundengespräch. Antwort-Historie sichtbar.
2. Wizard-Modus: Schritt-für-Schritt durch 8 Sektionen.
3. Formular-Modus: Alle Sektionen als Accordion auf einer Seite.
20 strukturierte Fragen in 8 Abschnitten:
- Maschinenbeschreibung (Name, Typ, Baugruppen)
- Lebensphasen (Betrieb, Einrichten, Wartung)
- Bestimmungsgemäße Verwendung
- Vorhersehbare Fehlanwendung
- Qualifikation der Benutzer
- Räumliche/Zeitliche Grenzen
- Technische Daten (Kräfte, Spannungen, Temperaturen, Drehzahlen)
- Umgebungsbedingungen
answersToNarrativeText() konvertiert alle Antworten in den Freitext
der an POST /parse-narrative gesendet wird.
Ergebnis-Panel zeigt: Komponenten, Gefahren, Patterns, Energiequellen.
URL: /sdk/iace/[projectId]/interview
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fundamental fix: scans now run asynchronously with progress polling.
Backend:
- POST /scan starts background task, returns scan_id immediately
- GET /scan/{scan_id} returns status + progress + result when done
- 7 progress steps shown: Website scan, DSI discovery, DSE analysis,
SOLL/IST comparison, corrections, report, email
- In-memory job store (dict with scan_id → status/result)
- No timeout limits on scan duration
Frontend:
- POST starts scan, receives scan_id
- Polls GET every 5 seconds (max 120 attempts = 10 min)
- Shows live progress message during scan
- Displays result when completed, error when failed
Proxy:
- POST timeout reduced to 30s (just starts the job)
- GET timeout 10s (just status check)
- No more 504/connection-dropped errors
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
banner_detector.py, script_analyzer.py, category_tester.py, authenticated_scanner.py
were only on the feature branch — needed for consent-tester to start.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug 1: max_pages was hardcoded to 15 in backend call — raised to 50
Bug 2: DSI documents checked against text_preview (500 chars) — now uses
full_text (10,000 chars) for Art. 13 mandatory field checks
Bug 3: DSE text not found when Playwright misses DSE page — now falls
back to DSI Discovery full_text as second source
Bug 4: Backend timeout 120s too short for 50 pages — raised to 300s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These files existed on the feature branch but were never cherry-picked
to main, causing ModuleNotFoundError on import:
- dse_parser.py — parses DSE HTML into structured sections
- dse_matcher.py — matches detected services against DSE sections
- mandatory_content_checker.py — checks Art. 13 DSGVO mandatory fields
- legal_basis_validator.py — validates legal basis (lit. a-f)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NameError: name 're' is not defined at line 146 — the import was
accidentally removed when extracting helper functions to agent_scan_helpers.py.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both scanners now search until done, not until a counter runs out:
playwright_scanner.py:
- Default max_pages raised from 15 to 50
- Added 3-minute timeout as safety net
- Recursive link discovery on EVERY visited page (not just DSE pages)
- Stops when: all links visited OR max_pages OR timeout
dsi_discovery.py:
- Default max_documents raised from 30 to 100
- Added 5-minute timeout as safety net
- Recursive: on each visited page, searches for MORE DSI links
- Processes ALL discovered links exhaustively
- Stops when: no more pending links OR max_documents OR timeout
The scanners now behave like a real user: they follow every relevant
link they find, and on each new page they look for more links.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New checks (from EUIPO reference case):
- Check 9: Third-party DSE link — detects when consent dialog links to
external domain's privacy policy instead of own DSE (Art. 13 DSGVO)
- Check 10: Dark-pattern language — detects "muessen/erforderlich" for
non-essential cookies suggesting false technical necessity (EDPB Rn. 70)
- Check 11: Non-modal dismiss = consent — detects when clicking outside
dialog closes it (possibly treating as consent, Planet49 violation)
Refactor: extracted _check_banner_text (375 LOC) from consent_scanner.py
into services/banner_text_checker.py to keep both files under 500 LOC.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- X button to close banner (SDK admin context only)
- Overlay leaves sidebar area accessible (ml-16/ml-64)
- Click overlay backdrop to dismiss
- Preview page: close banner on API error (don't trap user)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Browser blocks direct calls to backend-compliance:8093 due to
self-signed SSL certificate. All banner API calls now go through
Next.js API proxy at /api/sdk/v1/banner/* which runs server-side.
- New catch-all proxy: /api/sdk/v1/banner/[[...path]]/route.ts
Maps to backend-compliance:8002/api/compliance/banner/*
- Preview page: uses /api/sdk/v1/banner/ instead of https://macmini:8093
- CMP Dashboard: uses proxy for banner stats + compliance proxy for DSR/einwilligungen
- Fixes: banner not closeable due to API errors, consent not saving
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New route /sdk/cmp with full CMP dashboard
- 4 KPI cards: total consents, active consents, open DSR requests, configured sites
- Cookie category acceptance bars (necessary/statistics/marketing/functional)
- DSR breakdown: by status, by type (Art. 15-21), avg processing time, overdue count
- 9-point compliance checklist (banner, DSE, impressum, Art.7 proof, DSR, loeschfristen,
vendor AVV, email templates, EWR-only mode) — each links to relevant module
- 8 module cards with icons linking to all CMP sub-modules
- Real API integration: /banner/admin/stats, /einwilligungen/consents/stats, /dsr/stats
- Dashboard link added as first entry in CMP sidebar section
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CMP Section in Sidebar:
- New "CMP" group with purple accent, above other module sections
- Links: Cookie-Banner, Live-Vorschau, Consent-Records, Consent-Verwaltung,
Vendor-Compliance, DSR Portal, Loeschfristen, E-Mail-Templates
Live Preview (/sdk/cookie-banner/preview):
- Simulated "MusterShop GmbH" website with full cookie banner
- Real API calls to POST /banner/consent (saves to DB)
- EWR-Only toggle functional in preview
- API Debug panel shows fingerprint, consent status, blocked vendors
- Response JSON viewer for API debugging
- Links to verify in Consent-Verwaltung, Consent-Records, DSR Portal
- "Consent zuruecksetzen" button to re-test
- Footer "Cookie-Einstellungen" link to reopen banner
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Deleted 6 unused components from /sdk/einwilligungen/cookie-banner/_components/
- Replaced page.tsx with Next.js redirect() to /sdk/cookie-banner
- Updated EinwilligungenNavTabs link to /sdk/cookie-banner
- Updated catalog page link to /sdk/cookie-banner
- Single source of truth: /sdk/cookie-banner (Step in "Rechtliche Texte")
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- All 4 categories with toggles visible on first layer (no "Einstellungen" step)
- Removed showSettings state — single-view banner
- EWR toggle + info button in header, always visible
- Two equal-weight buttons: "Alle akzeptieren" + "Auswahl speichern"
- "Nur notwendige" as text link below (not hidden, but less prominent)
- Vendor tables expandable per category via chevron
- DSK OH Telemedien 2022 + CNIL 2020 compliant layout
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Game-changing CMP feature: Users accept a category (e.g. Marketing) but
can restrict data processing to EU/EWR-only vendors. Non-EWR vendors are
blocked even when the category is accepted.
- Toggle "Nur EU/EWR-Anbieter" with globe icon in blue gradient bar
- Blocked vendors shown as red pills with strikethrough icon
- Per-vendor status icons: green checkmark (active), red slash (blocked),
gray dash (category disabled)
- Country column: green circle+check for EWR, amber warning for non-EWR
- EWR = EU27 + IS/LI/NO + CH (Angemessenheitsbeschluss)
- Vendor data extracted to cookie-banner-vendors.ts (under 500 LOC)
- Consent state includes ewrOnly flag + blockedVendors list
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CookieBannerOverlay: shows vendors per category with expandable tables
(Verarbeiter, Cookies, Dauer, Land) for full transparency
- Demo vendors: 4 necessary, 3 statistics, 3 marketing, 3 functional
- cookie_table_generator.py: renders {{COOKIE_TABLE}} Markdown tables
from vendor configs (DB) or service registry (fallback)
- SERVICE_COOKIES: 16 known vendor-to-cookie mappings with provider + country
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Email now lists all scanned URLs with checkmark/cross status.
Frontend shows collapsible "X Seiten gescannt — Details anzeigen".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SDK LLM chat returns empty content due to Qwen think-mode. Direct Ollama
/api/generate call with stream:false gets the full response including
think tags which we strip.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend gibt variables manchmal als {} (Objekt) statt [] (Array)
zurueck. (template.variables || []).map() greift nicht weil {}
truthy ist. Fix: Array.isArray() Check in TemplateCard + EditorTab.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Automated comparison: services mentioned in privacy policy vs. actually
embedded on website. Three categories: undocumented (Art. 13 violation),
outdated (cleanup), correctly documented (check third country only).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Multi-page crawl: scan 5-10 strategic pages (start, footer links) for
chatbot widgets, AI text mentions, and tracking services. Feed results
into relevance filter to reduce false positives.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses false-positive controls like C_TRANSPARENCY being recommended
when no AI usage is evident. Plan for separate implementation session.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Backend: mode field in request, adapts summary tone and email subject
- Pre-launch: "Implementieren Sie X vor Veroeffentlichung"
- Post-launch: "ACHTUNG: Maengel sind oeffentlich sichtbar, sofortige Nachbesserung"
- Frontend: Mode toggle (internes Dokument vs. Live-Website)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>