Commit Graph

613 Commits

Author SHA1 Message Date
Benjamin Admin f591871277 feat: Phase 1 — Whistleblower + Cookie/Impressum + HR-DSI templates
Phase 1 of the Document Templates Masterplan:

- 098: Whistleblower-Richtlinie (HinSchG) — 10 sections, anonymous
  reporting, 7-day confirmation, 3-month feedback, reprisal protection
- 099: Cookie-Banner + Impressum updates — OS-Plattform discontinued
  note (July 2025), description updates
- 100: Applicant DSI + Employee DSI — two new HR privacy notices with
  § 26 BDSG, 6-month retention (applicants), modular blocks for video
  interviews, talent pool, IT monitoring, company vehicles, works council

Generator: 25 new fields (whistleblower, applicant, employee categories)
Categories: whistleblower, hr_dsi added to document generator

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 08:29:52 +02:00
Benjamin Admin bae59e2ce0 feat: Document Templates v2 — 11 migrations + scope-based generator
Complete overhaul of document generator templates based on paragraph-by-paragraph
legal review of attorney-drafted templates (TOM, AVV, AGB, DSI, Community
Guidelines, Nutzungsbedingungen, Widerrufsbelehrung, Cookie-Richtlinie).

Templates (11 migrations 087-097):
- 087: TOM-Dokumentation v2 (11 categories incl. Trennungskontrolle)
- 088: AVV Art. 28 DSGVO (complete, §§ 1-11, 3 annexes)
- 089: Cross-document updates (Löschkonzept DIN 66399, VVT recipients)
- 090: AGB SaaS/Shop v2 (18 §§, B2B/B2C, IoT, physical goods, IP protection)
- 091: Community Guidelines v2 (3 tones, 11 modular categories, DSA-compliant)
- 092: Media & Content modules (MStV, AI Act Art. 50, UWG, Pressekodex)
- 093: DSI/Privacy Policy v2 (Art. 13 complete, shop+corporate modules)
- 094: Nutzungsbedingungen (Terms of Use, UGC, tipping, wallet, CC licenses)
- 095: Widerrufsbelehrung (SaaS + physical + IoT bundle + combo)
- 096: Social Media DSI (Facebook, YouTube, LinkedIn, TikTok, Meta Pixel)
- 097: Cookie-Richtlinie v2 (TDDDG § 25, consent banner, browser links)

Frontend (generator):
- scopeDefaults.ts: L1-L4 scope-based defaults from Compliance Scope Engine
- contextBridge.ts: TOMCtx + DPACtx interfaces (70+ new fields)
- contextBridge-helpers.ts: 35+ placeholder mappings for TOM/DPA/AGB
- _constants.ts: 120+ new generator fields (TOM, DPA, AGB, community,
  media, social, nutzungsbedingungen, widerruf, cookie, shop, IoT)
- page.tsx: Auto-prefill TOM/DPA from scope engine decision

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 01:18:33 +02:00
Benjamin Admin 58957a4aaa fix: Playwright user permission + etracker DSE matching + CMP skip
1. Dockerfile: install Playwright AS appuser (not root) so chromium
   binary is accessible at runtime. Was causing 500 error.
2. DSE service matching: text-search fallback when LLM extraction fails.
   If "etracker" appears in DSE text, mark as documented even without
   LLM parsing the service list.
3. CMP skip: consent managers in category "cmp" skipped (not just "other"
   with id "cmp").

NOT DEPLOYED — RAG pipeline is running.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 19:36:46 +02:00
Benjamin Admin cedc5de15d feat: Phase 10 — Playwright website scanner replaces httpx
New /website-scan endpoint in consent-tester service:
- Real browser renders JavaScript (finds dynamic content)
- Clicks navigation menus (discovers hidden sub-pages like IHK DSB page)
- Follows links within DSE to find regional privacy policies
- Collects rendered HTML for each page (after JS execution)

Backend integration:
- agent_scan_routes tries Playwright first, falls back to httpx
- DSE text and HTML extracted from Playwright-rendered pages
- Service detection runs on rendered HTML (catches JS-loaded scripts)

Also fixes:
- GA regex: G-[A-Z0-9]{8,12} prevents CSS class false positives
- etracker added to service registry
- External page scanning blocked (same-domain only)
- CSS/JS/image files excluded from page list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 19:16:50 +02:00
Benjamin Admin 5eeef3a9c3 fix: 4 bugs from IHK scan — false positives + missing etracker
1. GA regex: G-\w{5,} matched CSS classes (g-7031048). Now requires
   G-[A-Z0-9]{8,12} (uppercase after G-, 8-12 chars = real GA4 ID)
2. External page scanning: DSE-internal links now SAME DOMAIN only.
   Previously followed links to etracker.com, google.de/policies etc.
   and detected services on THOSE sites as IHK services.
3. Added etracker to service registry (DE, ePrivacy-certified)
4. CSS/JS/image files excluded from page scanning
5. Navigation-pattern links for deeper DSE sub-pages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 19:08:07 +02:00
Benjamin Admin 891fc5bea0 docs: add keyword-based checker problem to migration instruction
mandatory_content_checker.py keywords break with alternative formulations.
Solution: LLM-based check per mandatory field (9 calls, parallelizable).
For other session to implement alongside Dict→Control migration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 18:18:45 +02:00
Benjamin Admin fff47cc52e fix: 4 bugs from IHK Konstanz scan validation
1. DSE-Matcher: Google/YouTube false match — now requires 2+ word match
   for provider-name fallback, not just "Google" matching YouTube section
2. AGB/Widerrufsbelehrung: only_ecommerce flag — skips for non-shop
   websites (detected via payment providers, cart keywords)
3. DSE-internal link following — scanner now discovers links WITHIN the
   privacy policy and scans those too (finds regional DSE sub-pages)
4. Expanded keyword synonyms for DSE mandatory checks:
   - "Zweck und Rechtsgrundlage" now matches "zwecke"
   - "behoerdlichen datenschutzbeauftragt" matches DSB
   - "aufsichtsbehörde" with umlaut matches

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 17:57:19 +02:00
Benjamin Admin 0f3ba9c207 test: Lit-Mapping validation — Dict vs Control Library comparison
8 test cases with deliberately wrong legal basis assignments:
- Cookie tracking on lit. f (should be lit. a)
- Analytics on lit. b (should be lit. a)
- Newsletter on lit. f (should be lit. a)
- Klarna without Art. 22
- Session recording on lit. f
- 2 correct cases (should NOT trigger findings)

Runs both hardcoded dict AND Control Library query, compares results.
If Control Library passes all → dict can be removed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 16:56:38 +02:00
Benjamin Admin b53b36fdc5 feat: 5-tab agent UI — PDF export, compare, auth test, all proxies
- 5 tabs: Schnellanalyse, Website-Scan, Cookie-Test, Vergleich, Login-Test
- PDF download button in ScanResult
- CompareResult: side-by-side compliance comparison table
- AuthTestResult: 5 post-login checks with legal refs
- API proxies: /scans/pdf, /compare, /authenticated-scan
- Compare: textarea for 2-5 URLs, parallel scanning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 16:43:08 +02:00
Benjamin Admin 2c9cea74e3 docs: instruction for hardcoded knowledge → Control Library migration
6 files with hardcoded legal knowledge identified. Review deadline 2026-07-01.
legal_basis_validator.py marked with warning log on every use.
Instruction file for other session to execute migration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 16:33:48 +02:00
Benjamin Admin 85c4cbbf37 fix: increase scan proxy timeout from 3 to 5 minutes 2026-04-29 16:24:22 +02:00
Benjamin Admin 4bf92f42b8 feat: Phase 9 — Authenticated Testing + Legal Basis Validator (lit. mapping)
Phase 9: Playwright login + 5 post-login checks:
- §312k BGB: Kündigungsbutton (2 Klicks)
- Art. 17 DSGVO: Konto löschen
- Art. 20 DSGVO: Daten exportieren
- Art. 7(3): Einwilligungen widerrufen
- Art. 15: Profildaten einsehen
Auto-detects login form selectors. Credentials destroyed after test.

Legal Basis Validator: Checks 7 common lit-mapping mistakes:
- Cookie tracking on lit. f instead of lit. a (Planet49)
- Analytics on lit. b (contract overextension)
- Klarna without Art. 22 reference
- Session recording without consent
Integrated into website scan pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 16:08:41 +02:00
Benjamin Admin 8336c01c5c feat: Phase 6-8 — PDF export, recurring scans, multi-website compare
Phase 6: PDF export via WeasyPrint — POST /agent/scans/pdf generates
printable compliance report with findings table, service comparison,
risk badge, and legal disclaimer.

Phase 7: Recurring scans — POST /agent/monitored-urls to add URLs,
POST /agent/run-scheduled triggers all enabled scans (cron/ZeroClaw).
In-memory storage with DB upgrade path.

Phase 8: Multi-website compare — POST /agent/compare with 2-5 URLs,
parallel scanning, comparison table (risk, findings, services, compliance
features per site).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 15:27:51 +02:00
Benjamin Admin e35db90232 feat: Phase 5 — DB persistence for scan results + Phase 10 in plan
- Migration 086: compliance_agent_scans table (findings, services, corrections)
- agent_history_routes.py: POST /scans (save), GET /scans (list), GET /scans/{id}
- Scan results survive page reloads and can be reviewed later
- Phase 10 (Playwright website scanner) added to product roadmap

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 15:17:51 +02:00
Benjamin Admin 53774886e7 perf: Phase 4 — parallel page fetching (asyncio.gather)
Scan pages in parallel instead of sequential. Reduces scan time
from ~10s (5 pages × 2s) to ~3s (all pages at once).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 15:09:03 +02:00
Benjamin Admin 5c5054f740 feat: Phase 3 — registry 82 services, mandatory checker, SDK flow step
- website_scanner.py: imports from master service_registry.py (82 services)
- agent_scan_routes.py: mandatory content checks (documents + DSE sections)
- steps-betrieb.ts: Compliance Agent step added to SDK Flow (seq 5000)
- PLAN: Phase 9 (Authenticated Testing) added to product roadmap

Mandatory checks know what MUST be there:
- Documents: Impressum, DSE, AGB, Widerrufsbelehrung
- DSE content: 9 Art. 13 DSGVO fields (DSB, Speicherdauer, etc.)
- Impressum content: 5 §5 TMG fields (GF, HRB, USt-ID, etc.)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 15:04:44 +02:00
Benjamin Admin 642382cbe8 feat: Mandatory Content Checker — knows what MUST be there
Three check levels:
1. Documents: Impressum, DSE, AGB, Widerrufsbelehrung must exist as pages
2. DSE content: 9 Art. 13 DSGVO mandatory sections (Verantwortlicher,
   DSB-Kontakt, Zwecke, Rechtsgrundlagen, Speicherdauer, Betroffenenrechte,
   Beschwerderecht, Drittlandtransfer, Profiling)
3. Impressum content: 5 §5 TMG mandatory fields (GF, Handelsregister,
   USt-ID, Anschrift, Kontakt)

Detects both missing documents AND missing content within documents.
Also catches HTTP errors (page exists but returns 404/500).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 14:23:22 +02:00
Benjamin Admin f219b9c244 feat: Master Service Registry — 82 third-party services across 15 categories
Tracking (12), Marketing/Ads (9), Newsletter (8), CDN/Fonts (7),
Chatbots/Support (7), Payment (5), Heatmaps (4), A/B Testing (3),
Tag Managers (3), Push (3), Video (4), Social (3), Error Tracking (4),
CRM (3), Maps (3), Captcha (3), Accessibility (2), CMP (1).

Each entry: regex, provider, country, EU adequacy, consent requirement,
legal reference. Pure data, no logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 14:21:32 +02:00
Benjamin Admin 16c40ddae4 feat: consent-test email with phase-structured findings
Email shows 3 phases (Before/After Reject/After Accept) with:
- Violation cards per phase (CRITICAL/HIGH badges)
- Undocumented services in Phase C
- Summary table (critical/high/undocumented counts)
- Dark Pattern warning if tracking persists after rejection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 13:14:01 +02:00
Benjamin Admin b7f9099ad9 feat: Cookie-Test tab — 3-phase consent test UI + API proxy
Third tab "Cookie-Test" in Compliance Agent:
- Phase A: Before consent (tracking without permission)
- Phase B: After rejection (CRITICAL if tracking persists)
- Phase C: After acceptance (undocumented services)
- CMP badge (Didomi, OneTrust, etc.)
- Violation cards with severity badges and legal references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 12:38:15 +02:00
Benjamin Admin f3c0481631 feat: add consent-tester service to docker-compose (port 8094, 2GB mem limit)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 12:33:20 +02:00
Benjamin Admin d105842bf2 feat: consent-tester microservice — Playwright 3-phase cookie test
New independent service (port 8094) with headless Chromium:
- Phase A: What loads BEFORE any consent interaction
- Phase B: What loads AFTER rejecting consent (CRITICAL if tracking persists)
- Phase C: What loads AFTER accepting (check against cookie policy)
- 10 CMP-specific selectors (Didomi, OneTrust, Cookiebot, Usercentrics, etc.)
- Generic fallback via button text matching
- 18 tracking service patterns for script classification

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 12:14:41 +02:00
Benjamin Admin 15d1e118ed feat: TextReference component — original text, position, correction in findings
Shows for each finding:
- Original text block from DSE (or "missing" indicator)
- Position: section heading, number, parent section, paragraph index
- Correction: insert/append/replace with copy button
Falls back to plain correction view if no text reference available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 11:59:55 +02:00
Benjamin Admin 0ba76d041a feat: DSE parser + matcher — textblock references in scan findings
- dse_parser.py: HTML → structured sections (heading, number, content, parent)
  Uses heading hierarchy (h1-h4) with regex fallback
- dse_matcher.py: matches detected services against DSE sections
  Exact name → provider → category matching with insertion point suggestion
- agent_scan_routes: TextReference model in findings (original text,
  section, paragraph, correction type, insert_after)

Enables showing: "Google Analytics not found in DSE, insert after
Section 2.4 Cookies und Tracking"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 11:55:26 +02:00
Benjamin Admin 4298ae17ab feat: Phase 0+1 — LLM intake extraction + control relevance filter
Phase 0: Qwen extracts 14 structured intake flags (personal_data,
marketing, profiling, ai_usage, etc.) instead of keyword matching.
Fallback to keywords if LLM unavailable. Flags feed into UCCA for
accurate scoring.

Phase 1: Control relevance filter removes false positives.
C_TRANSPARENCY only recommended if AI/ML keywords found in text.
7 control rules with keyword lists + intake flag fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 11:36:24 +02:00
Benjamin Admin 0266dfd011 docs: Compliance Agent product roadmap — 8 phases, PoC to production
P0: UCCA score calibration + control relevance filter
P1: Headless browser consent test (before/after cookie banner) + 80+ services
P2: Scan acceleration, DB persistence, PDF export
P3: Recurring scans, multi-website comparison

Investor demo scenario included.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 11:32:27 +02:00
Benjamin Admin 6a77cf6a89 feat: HTML email format, tab info hints, scan history
- Summary now renders as styled HTML (table layout, colored risk badge,
  warning banners) instead of plaintext in <div>
- Tab info text explains scope: "Analysiert nur die eingegebene URL" vs
  "Scannt automatisch 5-10 Unterseiten"
- Scan history with findings count badge and page count

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 11:04:29 +02:00
Benjamin Admin 10e4e8472b feat: add SDK product knowledge to Compliance Advisor soul
Advisor now knows about: project setup (3 steps), all SDK modules
(DSGVO, AI Act, CE, independent modules), recommended workflow order,
navigation (sidebar, CommandBar, SDK-Flow). No business secrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 09:17:54 +02:00
Benjamin Admin 2134383b5a fix: guard placeholders with Array.isArray to prevent e.filter crash
Same pattern as the email templates variables fix. Backend may return
placeholders as object instead of array.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 23:36:09 +02:00
Benjamin Admin ac8eb1bf99 feat: "Als Email senden" Button im Compliance Advisor
Chat-Verlauf wird als strukturiertes Beratungsprotokoll per Email
an den DSB gesendet. Button erscheint im Header sobald Nachrichten
vorhanden sind. Zeigt Checkmark nach erfolgreichem Versand.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 23:17:13 +02:00
Benjamin Admin 3c9ac03ccc fix: show ComplianceAdvisor + PipelineSidebar without project selection
Widgets were hidden behind projectId guard. Removed condition so new
users can ask questions (e.g. "Wie lege ich ein Projekt an?") before
creating a project.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 23:06:41 +02:00
Benjamin Admin b39c1d5dce feat: DSR Prozessbeschreibungen Art. 15-21 mit Swim-Lane-Diagrammen
Build + Deploy / build-admin-compliance (push) Successful in 1m56s
Build + Deploy / build-backend-compliance (push) Successful in 3m5s
Build + Deploy / build-ai-sdk (push) Successful in 47s
Build + Deploy / build-developer-portal (push) Successful in 1m5s
Build + Deploy / build-tts (push) Successful in 1m23s
Build + Deploy / build-document-crawler (push) Successful in 33s
Build + Deploy / build-dsms-gateway (push) Successful in 23s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m40s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 42s
CI / test-python-backend (push) Successful in 47s
CI / test-python-document-crawler (push) Successful in 33s
CI / test-python-dsms-gateway (push) Successful in 22s
CI / validate-canonical-controls (push) Successful in 18s
Build + Deploy / trigger-orca (push) Successful in 2m53s
7 vollstaendige Prozessbeschreibungen fuer den Document Generator:
- Art. 15: Auskunftsrecht (30 Tage, 6 Schritte, Informationskatalog)
- Art. 16: Berichtigungsrecht (14 Tage, inkl. Art. 19 Mitteilung)
- Art. 17: Loeschungsrecht (14 Tage, Art. 17(3) Ausnahmen-Checkliste)
- Art. 18: Einschraenkungsrecht (14 Tage, erlaubte Verarbeitung)
- Art. 19: Mitteilungspflicht (automatisch bei Art. 16/17/18)
- Art. 20: Datenuebertragbarkeit (30 Tage, JSON/CSV/XML Export)
- Art. 21: Widerspruchsrecht (30 Tage, Sonderfall Direktwerbung)

Jede Beschreibung enthaelt:
- Mermaid Swim-Lane-Diagramm (Betroffener/Sachbearbeitung/Fachabteilung/DSB)
- Detaillierte Schritt-Tabelle mit Verantwortlichkeiten und Fristen
- Rechtsgrundlagen-Verweise
- Firmen-Platzhalter (FIRMENNAME, VERSION, DATUM, DSB_NAME)

Integration:
- 7 neue Typen in VALID_DOCUMENT_TYPES (legal_template_routes.py)
- Neue Kategorie "DSR-Prozesse" im Document Generator Frontend
- DSR types-core.ts: templateType Feld verknuepft DSR → Document Generator
- Migration 085 seeded die Templates in die legal_templates Tabelle

[migration-approved]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 17:53:44 +02:00
Benjamin Admin b06a33a5fe fix: syntax error — missing closing paren in scan summary builder 2026-04-28 17:41:11 +02:00
Benjamin Admin 6c0e76f96d feat: show scanned pages in email summary + frontend (expandable list)
Email now lists all scanned URLs with checkmark/cross status.
Frontend shows collapsible "X Seiten gescannt — Details anzeigen".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 17:26:03 +02:00
Benjamin Admin 0106f3b5b6 fix: use Ollama directly for correction generation (bypass SDK think-mode)
SDK LLM chat returns empty content due to Qwen think-mode. Direct Ollama
/api/generate call with stream:false gets the full response including
think tags which we strip.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 16:30:51 +02:00
Benjamin Admin b175ad2594 fix: increase LLM timeouts for scan corrections (90s) and DSE extraction (120s)
Qwen 3.5:35b needs ~30-60s per call. Multi-call scan was timing out.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 16:05:35 +02:00
Benjamin Admin 4c43253a53 fix: variables als Objekt statt Array crasht Email Templates
Build + Deploy / build-admin-compliance (push) Successful in 2m9s
Build + Deploy / build-backend-compliance (push) Failing after 3m24s
Build + Deploy / build-ai-sdk (push) Successful in 52s
Build + Deploy / build-developer-portal (push) Successful in 1m15s
Build + Deploy / build-tts (push) Successful in 1m23s
Build + Deploy / build-document-crawler (push) Successful in 38s
Build + Deploy / build-dsms-gateway (push) Successful in 27s
CI / branch-name (push) Has been skipped
Build + Deploy / trigger-orca (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 18s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m42s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 41s
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Successful in 26s
CI / test-python-dsms-gateway (push) Successful in 22s
CI / validate-canonical-controls (push) Successful in 16s
Backend gibt variables manchmal als {} (Objekt) statt [] (Array)
zurueck. (template.variables || []).map() greift nicht weil {}
truthy ist. Fix: Array.isArray() Check in TemplateCard + EditorTab.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 16:00:07 +02:00
Benjamin Admin 0f1fae61a6 feat: Website-Scan tab in agent UI — service table, SOLL/IST, corrections
- Tab system: Schnellanalyse (single page) + Website-Scan (multi-page)
- ScanResult component: service comparison table, severity-colored findings
- Expandable correction suggestions with copy button (pre-launch mode)
- API proxy route for /agent/scan endpoint

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 15:52:40 +02:00
Benjamin Admin 711b9b3146 feat: website scanner with SOLL/IST service comparison + corrections
- website_scanner.py: multi-page crawl, 20+ service patterns (tracking,
  CDN, chatbots, payment, fonts, captcha, video), AI text detection
- dse_service_extractor.py: LLM extracts services from privacy policy text
- agent_scan_routes.py: POST /agent/scan — combines scan + DSE comparison,
  generates findings (undocumented, outdated, third-country transfer),
  auto-corrections via Qwen in pre-launch mode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 15:35:31 +02:00
Benjamin Admin d0dc284cd5 docs: add Phase 5 (Payment/Marketing checks) + Phase 6 (auto-corrections)
- Payment: Stripe, PayPal, Klarna (Art. 22 Bonitaetspruefung!), Adyen, Mollie
- Marketing: GA, Meta Pixel, TikTok, Hotjar, Clarity, Newsletter-Anbieter
- Each service: DSE mention check, consent check, third-country check
- Pre-launch mode: agent generates ready-to-insert DSE text blocks via Qwen
- Correction types: missing service, wrong legal basis, outdated entry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 15:26:29 +02:00
Benjamin Admin 24fb1e14e0 docs: add Phase 4b — SOLL/IST Dienstleister-Abgleich (DSE vs. Website)
Automated comparison: services mentioned in privacy policy vs. actually
embedded on website. Three categories: undocumented (Art. 13 violation),
outdated (cleanup), correctly documented (check third country only).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 15:20:12 +02:00
Benjamin Admin 6aa753146f docs: extend plan with third-party service detection + Drittland registry
80+ services: CDN (Cloudflare, Akamai), Fonts (Google Fonts LG München),
Tracking (GA, Meta Pixel, Matomo), Captcha, Maps, Video, Payment.
Static registry with country, EU adequacy, consent requirement, legal ref.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 15:18:43 +02:00
Benjamin Admin acd2d5f944 docs: add Phase 4 (Website-Scan) to Control Relevance Filter plan
Multi-page crawl: scan 5-10 strategic pages (start, footer links) for
chatbot widgets, AI text mentions, and tracking services. Feed results
into relevance filter to reduce false positives.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 15:11:19 +02:00
Benjamin Admin 2a6f526c88 docs: plan for Control Relevance Filter (3-stage: rules, LLM, follow-up)
Addresses false-positive controls like C_TRANSPARENCY being recommended
when no AI usage is evident. Plan for separate implementation session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 14:32:25 +02:00
Benjamin Admin 1988274420 feat: pre-launch vs post-launch analysis modes
- Backend: mode field in request, adapts summary tone and email subject
- Pre-launch: "Implementieren Sie X vor Veroeffentlichung"
- Post-launch: "ACHTUNG: Maengel sind oeffentlich sichtbar, sofortige Nachbesserung"
- Frontend: Mode toggle (internes Dokument vs. Live-Website)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 14:07:32 +02:00
Benjamin Admin cb5aa2949b feat: hybrid website compliance checks (§312k BGB, §5 TMG, Art. 13 DSGVO)
- Scan public website for cancellation button, imprint, privacy link, cookie consent
- Generate follow-up questions when checks can't be verified without login
- User answers "no" → finding with legal basis is added to results
- Frontend: FollowUpQuestions component with Ja/Nein buttons
- Sidebar: "Compliance Agent" entry added under KI-Compliance

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 13:25:44 +02:00
Benjamin Admin 41fd7e36d1 fix: use string-converted findings in summary builder
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 08:53:32 +02:00
Benjamin Admin f7483f5724 fix: convert UCCA findings/controls dicts to strings for response model
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 08:01:36 +02:00
Benjamin Admin cfc130a544 fix: UCCA assessment — send boolean intake flags, flatten nested response, map risk→escalation
Build + Deploy / build-admin-compliance (push) Successful in 1m56s
Build + Deploy / build-backend-compliance (push) Successful in 3m6s
Build + Deploy / build-ai-sdk (push) Successful in 45s
Build + Deploy / build-developer-portal (push) Successful in 1m2s
Build + Deploy / build-tts (push) Successful in 1m19s
Build + Deploy / build-document-crawler (push) Successful in 34s
Build + Deploy / build-dsms-gateway (push) Successful in 21s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m35s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 48s
CI / test-python-backend (push) Successful in 1m35s
CI / test-python-document-crawler (push) Successful in 26s
CI / test-python-dsms-gateway (push) Successful in 25s
CI / validate-canonical-controls (push) Successful in 20s
Build + Deploy / trigger-orca (push) Successful in 3m15s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 07:29:28 +02:00
Benjamin Admin 0ccc6c4047 fix: handle Qwen think mode in classification, add German term matching
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 00:51:06 +02:00