Commit Graph

40 Commits

Author SHA1 Message Date
Benjamin Admin 08fcb5f239 feat(compliance-check): scenario badges + extracted profile display
Build + Deploy / build-dsms-node (push) Successful in 13s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 15s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
Build + Deploy / build-admin-compliance (push) Successful in 1m58s
Build + Deploy / build-backend-compliance (push) Successful in 13s
Build + Deploy / build-ai-sdk (push) Successful in 49s
Build + Deploy / build-developer-portal (push) Successful in 14s
Build + Deploy / build-tts (push) Successful in 15s
Build + Deploy / build-document-crawler (push) Successful in 12s
Build + Deploy / build-dsms-gateway (push) Successful in 11s
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m40s
CI / test-python-dsms-gateway (push) Successful in 24s
CI / validate-canonical-controls (push) Successful in 14s
CI / test-go (push) Successful in 43s
CI / test-python-backend (push) Successful in 39s
CI / test-python-document-crawler (push) Successful in 27s
Build + Deploy / trigger-orca (push) Successful in 2m34s
- Show extracted profile fields (company name, legal form, address,
  DPO, USt-IdNr) with "In Company Profile uebernehmen" button
- Show Compliance Scope hints extracted from documents
- Scenario badges per document: Neugenerierung (red), Korrekturen
  (amber), Konform (green)
- Summary line shows scenario counts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-12 17:49:45 +02:00
Benjamin Admin 4a7e09bbb0 fix(impressum): regex [A-Z] never matches on lowercased text
Build + Deploy / build-admin-compliance (push) Successful in 12s
Build + Deploy / build-backend-compliance (push) Successful in 14s
Build + Deploy / build-ai-sdk (push) Successful in 20s
Build + Deploy / build-developer-portal (push) Successful in 13s
Build + Deploy / build-tts (push) Successful in 12s
Build + Deploy / build-document-crawler (push) Successful in 14s
Build + Deploy / build-dsms-gateway (push) Successful in 13s
Build + Deploy / build-dsms-node (push) Successful in 18s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 15s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m39s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 46s
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Successful in 27s
CI / test-python-dsms-gateway (push) Successful in 22s
CI / validate-canonical-controls (push) Successful in 15s
Build + Deploy / trigger-orca (push) Successful in 2m28s
All patterns matched against text_lower but used [A-Z] character class.
Changed to [a-zA-Z] so patterns like "geschäftsführung: dr. oliver"
are found. Also added "Pflicht"/"Detail" labels to the two progress
bars to clarify what 100% vs 8% means.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-12 14:02:25 +02:00
Benjamin Admin 128967fa3d fix(checklist-ui): show INFO-severity checks as gray info icon
Build + Deploy / build-admin-compliance (push) Successful in 2m7s
Build + Deploy / build-backend-compliance (push) Successful in 3m20s
Build + Deploy / build-ai-sdk (push) Successful in 1m2s
Build + Deploy / build-developer-portal (push) Successful in 1m14s
Build + Deploy / build-tts (push) Successful in 1m45s
Build + Deploy / build-document-crawler (push) Successful in 48s
Build + Deploy / build-dsms-gateway (push) Successful in 37s
Build + Deploy / build-dsms-node (push) Successful in 23s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m44s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 49s
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Successful in 27s
CI / test-python-dsms-gateway (push) Successful in 23s
CI / validate-canonical-controls (push) Successful in 14s
Build + Deploy / trigger-orca (push) Failing after 32s
INFO checks (V.i.S.d.P., Streitbeilegung, Berufsrecht, Stammkapital,
etc.) that fail are now shown with a gray info icon instead of red X,
with gray hint text. They are excluded from the Pflichtangaben count
since they are context-dependent and likely not applicable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-12 12:28:00 +02:00
Benjamin Admin ce77cde309 fix(compliance-check): batch LLM verification + increase poll timeout
Build + Deploy / build-admin-compliance (push) Successful in 1m52s
Build + Deploy / build-backend-compliance (push) Successful in 18s
Build + Deploy / build-ai-sdk (push) Successful in 11s
Build + Deploy / build-developer-portal (push) Successful in 11s
Build + Deploy / build-tts (push) Successful in 12s
Build + Deploy / build-document-crawler (push) Successful in 14s
Build + Deploy / build-dsms-gateway (push) Successful in 10s
Build + Deploy / build-dsms-node (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 15s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m35s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 42s
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Successful in 25s
CI / test-python-dsms-gateway (push) Successful in 21s
CI / validate-canonical-controls (push) Successful in 16s
Build + Deploy / trigger-orca (push) Successful in 2m24s
- LLM verify now sends ALL failed checks in one batched call instead of
  one Ollama call per check (80+ calls → 1 per document)
- Increase frontend poll timeout from 6 min to 15 min

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-12 11:49:30 +02:00
Benjamin Admin a127dd971b fix(compliance-check): resume polling after navigation away
Build + Deploy / build-admin-compliance (push) Successful in 2m16s
Build + Deploy / build-backend-compliance (push) Successful in 12s
Build + Deploy / build-ai-sdk (push) Successful in 12s
Build + Deploy / build-developer-portal (push) Successful in 12s
Build + Deploy / build-tts (push) Successful in 15s
Build + Deploy / build-document-crawler (push) Successful in 13s
Build + Deploy / build-dsms-gateway (push) Successful in 13s
Build + Deploy / build-dsms-node (push) Successful in 16s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 18s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m38s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 42s
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Successful in 27s
CI / test-python-dsms-gateway (push) Successful in 21s
CI / validate-canonical-controls (push) Successful in 13s
Build + Deploy / trigger-orca (push) Successful in 2m32s
Save active check_id to localStorage so polling resumes when the user
navigates away via sidebar and comes back. Same pattern as scan tab.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-12 11:37:06 +02:00
Benjamin Admin ed3ebbc246 fix(compliance-check): send 'documents' instead of 'entries' to backend
Build + Deploy / build-admin-compliance (push) Successful in 11s
Build + Deploy / build-backend-compliance (push) Successful in 13s
Build + Deploy / build-ai-sdk (push) Successful in 13s
Build + Deploy / build-developer-portal (push) Successful in 10s
Build + Deploy / build-tts (push) Successful in 11s
Build + Deploy / build-document-crawler (push) Successful in 11s
Build + Deploy / build-dsms-gateway (push) Successful in 12s
Build + Deploy / build-dsms-node (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 15s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m33s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 39s
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Successful in 26s
CI / test-python-dsms-gateway (push) Successful in 21s
CI / validate-canonical-controls (push) Successful in 14s
Build + Deploy / trigger-orca (push) Successful in 2m30s
Frontend was sending field name 'entries' but backend Pydantic model
expects 'documents', causing 422 validation error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-12 09:25:36 +02:00
Benjamin Admin f3751a4efa feat(compliance-check): show business profile + banner check result in UI
Build + Deploy / build-admin-compliance (push) Successful in 1m55s
Build + Deploy / build-backend-compliance (push) Successful in 3m17s
Build + Deploy / build-ai-sdk (push) Successful in 49s
Build + Deploy / build-developer-portal (push) Successful in 1m17s
Build + Deploy / build-tts (push) Successful in 1m33s
Build + Deploy / build-document-crawler (push) Successful in 41s
Build + Deploy / build-dsms-gateway (push) Successful in 28s
Build + Deploy / build-dsms-node (push) Successful in 17s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m35s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 47s
CI / test-python-backend (push) Successful in 38s
CI / test-python-document-crawler (push) Successful in 25s
CI / test-python-dsms-gateway (push) Successful in 24s
CI / validate-canonical-controls (push) Successful in 13s
Build + Deploy / trigger-orca (push) Successful in 2m58s
Add two info boxes above the checklist results:
- Business profile (B2B/B2C, industry, regulated profession)
- Banner check status (CMP detected, violations count, cross-check hint)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-12 00:19:51 +02:00
Benjamin Admin 0d0e705117 feat: Unified Compliance-Check — 8 document types in one form
New 3-tab structure: Website-Scan, Compliance-Check, Banner-Check.

Compliance-Check Tab (replaces Dokumenten-Pruefung + Impressum-Check):
- 8 document rows: DSI, Impressum, Social Media, Cookie, AGB,
  Nutzungsbedingungen, Widerruf, DSB-Kontakt
- Each row: URL input + "Text laden" + file upload + manual text
- "Text laden" extracts via consent-tester, shows in editable textarea
- User verifies/corrects text before checking
- Empty fields = "not present" → own finding

Business Profiler (business_profiler.py):
- Detects B2B/B2C/B2G from all documents together
- Recognizes regulated professions, online shops, editorial content
- Context-aware: INFO checks become PASS/FAIL based on profile

Backend: /compliance-check + /extract-text endpoints
Frontend: ComplianceCheckTab.tsx + DocumentRow.tsx
API proxies: compliance-check/route.ts + extract-text/route.ts

Also: Impressum regex fixes (Telefon, AG, Geschaeftsfuehrung)
and INFO severity for context-dependent checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 20:56:10 +02:00
Benjamin Admin 02ff96f74e fix: resolve all merge conflict markers from feat/zeroclaw-compliance-agent
Build + Deploy / build-admin-compliance (push) Successful in 2m7s
Build + Deploy / build-backend-compliance (push) Failing after 5m21s
Build + Deploy / build-ai-sdk (push) Successful in 53s
Build + Deploy / build-developer-portal (push) Successful in 1m18s
Build + Deploy / build-tts (push) Successful in 1m42s
Build + Deploy / build-document-crawler (push) Successful in 45s
Build + Deploy / build-dsms-gateway (push) Successful in 27s
Build + Deploy / build-dsms-node (push) Successful in 19s
CI / branch-name (push) Has been skipped
Build + Deploy / trigger-orca (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 19s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m6s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Successful in 55s
CI / test-python-backend (push) Successful in 44s
CI / test-python-document-crawler (push) Successful in 30s
CI / test-python-dsms-gateway (push) Successful in 26s
CI / validate-canonical-controls (push) Successful in 18s
9 files had conflict markers from the branch merge. All resolved keeping
the feature branch version. Also split agent_scan_routes.py (534→367 LOC)
by extracting Pydantic models to agent_scan_models.py.

[guardrail-change]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 12:15:07 +02:00
Benjamin Admin 36c6101b91 Merge feat/zeroclaw-compliance-agent into main
Brings all compliance doc-check features:
- 162 regex checks + 1874 Master Controls
- LLM-agnostic agent with tool calling
- Banner check (46 checks, 30 CMPs, stealth, Shadow DOM)
- Impressum check (24 checks)
- Deep consent verification (DataLayer, GCM, TCF)
- CMP E2E tests (39 tests)
- HTML email reports, FAQ, persistent history

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 11:44:20 +02:00
Benjamin Admin cc919eb608 feat: KI-Agent toggle in all 3 check tabs
- Impressum-Check: Toggle activates 75 Impressum MCs via agent
- Banner-Check: Toggle runs additional cookie doc-check (381 MCs)
  after the Playwright banner test completes
- Both use the same use_agent flag through doc-check endpoint

Green pill button consistent across all tabs:
'KI-Agent aus' / 'KI-Agent aktiv (X MCs)'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 08:00:36 +02:00
Benjamin Admin 91d6d8b1a7 feat: KI-Agent toggle button in Dokumenten-Pruefung
Build + Deploy / build-admin-compliance (push) Successful in 3m15s
Build + Deploy / build-backend-compliance (push) Successful in 3m43s
Build + Deploy / build-ai-sdk (push) Failing after 49s
Build + Deploy / build-developer-portal (push) Successful in 1m26s
Build + Deploy / build-tts (push) Successful in 1m49s
Build + Deploy / build-document-crawler (push) Successful in 46s
Build + Deploy / build-dsms-gateway (push) Successful in 33s
Build + Deploy / build-dsms-node (push) Successful in 22s
CI / branch-name (push) Has been skipped
Build + Deploy / trigger-orca (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 22s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m1s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 58s
CI / test-python-backend (push) Successful in 47s
CI / test-python-document-crawler (push) Successful in 28s
CI / test-python-dsms-gateway (push) Successful in 28s
CI / validate-canonical-controls (push) Successful in 16s
Green pill button: 'KI-Agent aus' / 'KI-Agent aktiv (1.874 MCs)'
Toggles use_agent flag which is passed through the full chain:
Frontend → DocCheckRequest → _run_doc_check → _check_single_document
→ check_document_with_controls(use_agent=True)
→ ComplianceAgent with tool calling

Default: OFF (deterministic regex). User can enable per scan.
Also works via env var COMPLIANCE_USE_AGENT=true for always-on.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-10 23:26:21 +02:00
Benjamin Admin 05d98ea95f feat: New tab structure — Discovery Scan, Doc-Check, Banner, Impressum
Removed Schnellanalyse tab. New 4-tab structure:

1. Website-Scan (Discovery): Finds legal documents + services,
   shows "Jetzt pruefen" buttons that navigate to specialized tabs
   with pre-filled URLs.

2. Dokumenten-Pruefung: DSI, AGB, Cookie, Widerruf checks (existing)

3. Banner-Check: Cookie banner 46-check deep verification (existing)

4. Impressum-Check (NEW): §5 TMG / §18 MStV with 16 checks,
   own tab with URL input, history, email report.
   Uses existing impressum_checks.py via doc-check endpoint.

Tab cross-navigation: Scan → "Jetzt pruefen" → opens target tab
with URL pre-filled via localStorage handoff.

Removed: Mode selector (pre/post launch), Schnellanalyse,
useAgentAnalysis hook import, AnalysisResult/FollowUpQuestions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-10 09:09:27 +02:00
Benjamin Admin f201c01a06 fix: Replace unicode escapes with actual emoji characters 2026-05-10 08:20:00 +02:00
Benjamin Admin 33f0a64ff6 feat: Persistent result history — click to reload old scan results
Both DocCheckTab and BannerCheckTab now:
- Store full scan results per history entry in localStorage
- History entries are clickable — loads the saved result immediately
- No need to re-scan to see old results
- Fallback to last result if specific entry not found
- Banner-Check sends HTML email report to mailpit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-10 07:59:02 +02:00
Benjamin Admin 1b8e9881bb feat: Banner-Check — Historie, persistentes Ergebnis, E-Mail-Report
1. localStorage Persistenz: URL, letztes Ergebnis, Historie (30 Eintraege)
2. Historie: Zeigt URL, Datum, Provider, Violations, Prozent
3. Letztes Ergebnis bleibt nach Tab-Wechsel/Reload sichtbar
4. E-Mail-Report: HTML-formatiert mit Violations + Hints an mailpit
5. Email-Status Anzeige im Frontend

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-10 07:55:12 +02:00
Benjamin Admin 2143840ee7 docs(agent): add FAQ about harmonised standards copyright + EuGH C-588/21 P
Explains why companies must buy norms their own employees wrote,
and the 2024 EuGH ruling that harmonised standards are EU law
and must be freely accessible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-09 09:50:44 +02:00
Benjamin Admin 4bfb438c92 feat: 4 banner check upgrades — 30 CMPs, stealth, Shadow DOM, categories
Build + Deploy / build-admin-compliance (push) Successful in 2m17s
Build + Deploy / build-backend-compliance (push) Successful in 3m17s
Build + Deploy / build-ai-sdk (push) Successful in 56s
Build + Deploy / build-developer-portal (push) Successful in 1m37s
Build + Deploy / build-tts (push) Successful in 1m33s
Build + Deploy / build-document-crawler (push) Successful in 42s
Build + Deploy / build-dsms-gateway (push) Successful in 33s
Build + Deploy / build-dsms-node (push) Successful in 16s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 25s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m33s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 1m18s
CI / test-python-backend (push) Successful in 53s
CI / test-python-document-crawler (push) Successful in 36s
CI / test-python-dsms-gateway (push) Successful in 33s
CI / validate-canonical-controls (push) Successful in 24s
Build + Deploy / trigger-orca (push) Successful in 3m19s
1. 30 CMP selectors (was 10): Added Sourcepoint, Iubenda, Complianz,
   CookieFirst, HubSpot, Osano, Piwik PRO, Cookie Consent (Insites),
   Axeptio, Termly, CookieScript, Civic UK, GDPR Cookie Compliance,
   CookieHub, Ketch, Admiral, Sibbo, Evidon, LiveRamp, Adsimple.
   Plus improved generic fallback: role=dialog, aria-label, data-* attrs.

2. Playwright stealth mode: playwright-stealth against bot detection.
   Removes WebDriver flag, simulates plugins, realistic viewport/locale.
   Launch args: --disable-blink-features=AutomationControlled.

3. Shadow DOM: Recursive JS-based search through shadowRoot elements
   for consent banners. Fallback click via page.evaluate() when
   normal Playwright selectors can't penetrate Shadow DOM.

4. Category selection UI: User can choose which cookie categories to
   test (Notwendig, Statistik, Marketing, Funktional, Praeferenzen).
   Pill-style checkboxes in BannerCheckTab, forwarded through API chain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-09 08:42:30 +02:00
Benjamin Admin 751f4a5ee7 fix: Remove dead polling code from BannerCheckTab
Build + Deploy / build-admin-compliance (push) Successful in 2m32s
Build + Deploy / build-backend-compliance (push) Successful in 3m20s
Build + Deploy / build-ai-sdk (push) Successful in 53s
Build + Deploy / build-developer-portal (push) Successful in 1m19s
Build + Deploy / build-tts (push) Successful in 1m28s
Build + Deploy / build-document-crawler (push) Successful in 35s
Build + Deploy / build-dsms-gateway (push) Successful in 24s
Build + Deploy / build-dsms-node (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 19s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m9s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 1m0s
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Successful in 32s
CI / test-python-dsms-gateway (push) Successful in 24s
CI / validate-canonical-controls (push) Successful in 19s
Build + Deploy / trigger-orca (push) Successful in 3m11s
The /banner-check endpoint is synchronous (Playwright completes in
<30s and returns result directly). Removed unused async polling loop
that would never match since no scan_id is returned.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-09 08:22:36 +02:00
Benjamin Admin 0fcb3ee488 docs(agent): add Machinery Regulation harmonised standards FAQ
Explains current status: no harmonised standards published under
(EU) 2023/1230 yet, ~800 from old directive still valid. Timeline
from June 2023 to January 2027 full application.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-09 07:17:32 +02:00
Benjamin Admin 686834cea0 feat: 4 remaining tasks — EU institutions, banner integration, JS-sites, Caritas fixes
Build + Deploy / build-admin-compliance (push) Successful in 8s
Build + Deploy / build-backend-compliance (push) Successful in 8s
Build + Deploy / build-ai-sdk (push) Failing after 36s
Build + Deploy / build-developer-portal (push) Successful in 8s
Build + Deploy / build-tts (push) Successful in 7s
Build + Deploy / build-document-crawler (push) Successful in 7s
Build + Deploy / build-dsms-gateway (push) Successful in 8s
Build + Deploy / build-dsms-node (push) Successful in 8s
CI / branch-name (push) Has been skipped
Build + Deploy / trigger-orca (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m14s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 46s
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Successful in 29s
CI / test-python-dsms-gateway (push) Successful in 30s
CI / validate-canonical-controls (push) Successful in 16s
1. EU Institution Checks (Verordnung 2018/1725):
   - New doc_type "eu_institution" with 9 L1 + 15 L2 checks
   - Both German + English patterns (EU institutions are multilingual)
   - Auto-detection via "2018/1725", "EDSB", "EDPS" keywords
   - Correct article references (Art. 15 instead of 13, Art. 5 instead of 6)

2. Banner Check Integration:
   - banner_runner.py maps scan results to 36 L1/L2 structured checks
   - BannerCheckTab shows hierarchical ChecklistView with hints
   - 3-phase summary (cookies/scripts before/after consent)
   - /scan endpoint now includes structured_checks in response

3. JS-heavy Website Fixes (dm, Zalando, HWK):
   - dsi_helpers.py: goto_resilient (networkidle→domcontentloaded fallback)
   - try_dismiss_consent_banner before text extraction
   - PDF redirect detection (dm.de redirects to GCS PDF)

4. Caritas False Positive Fixes:
   - Phone regex allows parentheses: +49 (0)761 → now matches
   - "Recht auf Widerspruch" (3 words) + §23 KDG → matches Art. 21
   - Church authorities: "Katholisches Datenschutzzentrum" recognized

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-08 01:10:10 +02:00
Benjamin Admin 63bd6a7c6d feat: Compliance FAQ section in Agent page
Build + Deploy / build-admin-compliance (push) Successful in 2m9s
Build + Deploy / build-backend-compliance (push) Successful in 3m17s
Build + Deploy / build-ai-sdk (push) Successful in 50s
Build + Deploy / build-developer-portal (push) Successful in 1m14s
Build + Deploy / build-tts (push) Successful in 1m27s
Build + Deploy / build-document-crawler (push) Successful in 42s
Build + Deploy / build-dsms-gateway (push) Successful in 24s
Build + Deploy / build-dsms-node (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 22s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m10s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 46s
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Successful in 29s
CI / test-python-dsms-gateway (push) Successful in 24s
CI / validate-canonical-controls (push) Successful in 18s
Build + Deploy / trigger-orca (push) Successful in 2m15s
5 FAQ items covering:
- What happens when companies are sued (4 enforcement paths)
- How document checks work (3-step process)
- Which document types are checked (7 types, 138 checks)
- How reliable results are (0 false positives, LLM verification)
- What GDPR violations cost in practice (fine tiers + examples)

Includes EuGH rulings (C-300/21, C-319/20), CNIL fine examples,
and practical cost ranges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-08 00:32:07 +02:00
Benjamin Admin 7c17321089 feat: Cookie Banner Check as standalone tab in Compliance Agent
Build + Deploy / build-admin-compliance (push) Successful in 2m7s
Build + Deploy / build-backend-compliance (push) Successful in 10s
Build + Deploy / build-ai-sdk (push) Successful in 8s
Build + Deploy / build-developer-portal (push) Successful in 7s
Build + Deploy / build-tts (push) Successful in 7s
Build + Deploy / build-document-crawler (push) Successful in 9s
Build + Deploy / build-dsms-gateway (push) Successful in 8s
Build + Deploy / build-dsms-node (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m21s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 47s
CI / test-python-backend (push) Successful in 47s
CI / test-python-document-crawler (push) Successful in 31s
CI / test-python-dsms-gateway (push) Successful in 26s
CI / validate-canonical-controls (push) Successful in 16s
Build + Deploy / trigger-orca (push) Successful in 2m23s
New "Banner-Check" tab with:
- URL input → Playwright 3-phase test (before/reject/accept)
- Shield icon + provider detection
- Progress bar with pass/fail percentage
- 3-phase summary (cookies + scripts per phase)
- Violations (red) and passes (green) in structured list

Backend: new POST /api/compliance/agent/banner-check endpoint
that proxies to consent-tester:8094/scan.

Next step: Upgrade banner checks to L1/L2 format with expert
hints (same quality as document checks).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-07 17:39:44 +02:00
Benjamin Admin 293c58d0dd feat: Add actionable hints to all 138 compliance checks
Build + Deploy / build-admin-compliance (push) Successful in 1m40s
Build + Deploy / build-backend-compliance (push) Successful in 7s
Build + Deploy / build-ai-sdk (push) Successful in 35s
Build + Deploy / build-developer-portal (push) Successful in 8s
Build + Deploy / build-tts (push) Successful in 7s
Build + Deploy / build-document-crawler (push) Successful in 8s
Build + Deploy / build-dsms-gateway (push) Successful in 7s
Build + Deploy / build-dsms-node (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m50s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 40s
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Successful in 25s
CI / test-python-dsms-gateway (push) Successful in 23s
CI / validate-canonical-controls (push) Successful in 15s
Build + Deploy / trigger-orca (push) Successful in 2m28s
Each check now has a "hint" field explaining what is missing and
what the customer should do to fix it. Hints are shown in the
frontend below failed checks in red text.

Examples:
- "Bei Verarbeitung auf Basis von Art. 6(1)(f) muss dokumentiert
  werden, warum Ihr berechtigtes Interesse die Rechte der
  Betroffenen ueberwiegt."
- "Die ladungsfaehige Anschrift fehlt. Erforderlich: Strasse,
  Hausnummer, PLZ und Ort."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-07 14:05:01 +02:00
Benjamin Admin 8849c396b5 fix: Show L2 detail checks always visible (no extra click needed)
Build + Deploy / build-admin-compliance (push) Successful in 2m44s
Build + Deploy / build-backend-compliance (push) Successful in 3m25s
Build + Deploy / build-ai-sdk (push) Successful in 56s
Build + Deploy / build-developer-portal (push) Successful in 1m22s
Build + Deploy / build-tts (push) Successful in 1m30s
Build + Deploy / build-document-crawler (push) Successful in 8s
Build + Deploy / build-dsms-gateway (push) Successful in 8s
Build + Deploy / build-dsms-node (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 20s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m5s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 44s
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Successful in 27s
CI / test-python-dsms-gateway (push) Successful in 22s
CI / validate-canonical-controls (push) Successful in 18s
Build + Deploy / trigger-orca (push) Successful in 3m22s
L2 checks were hidden behind a second click on L1 items.
Now they render inline below their L1 parent, always visible
when the document card is expanded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-07 13:16:04 +02:00
Benjamin Admin b363c28539 feat: Add 76 Level-2 regex checks for document correctness verification
Split dsi_document_checker.py (466 LOC) into doc_checks/ package (9 files).
Two-pass L1→L2 logic: L1 checks "Is it mentioned?", L2 checks "Is it correct?"
(e.g. controller has full address, specific Art. 6 lit., concrete time periods).

138 total checks (62 L1 + 76 L2) across 7 doc types:
- DSE Art. 13: 31, Impressum §5 TMG: 16, Cookie §25 TDDDG: 15
- Widerruf §355: 15, AGB §305ff: 21, Social Media Art. 26: 20, DSFA Art. 35: 18

Frontend: hierarchical L1→L2 display with dual progress bars
(green=completeness, blue=correctness).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-07 12:37:03 +02:00
Benjamin Admin 3853a0838a feat: Art. 26 Joint Controller + DSFA checklists for Social Media sections
New checklists:
- JOINT_CONTROLLER_CHECKLIST (Art. 26 DSGVO, 7 checks):
  Joint parties, arrangement, contact point, processing split,
  data categories, third-country transfer (USA), rights
- DSFA_CHECKLIST (Art. 35 DSGVO, 5 checks):
  Description, necessity, risk assessment, measures, DSB involvement

Section detection: 'Datenschutzerklaerung fuer Social Media' → social_media,
'Datenschutzfolgeabschaetzung/Risikoanalyse' → dsfa

classify_document_type: DSFA and social_media detected before generic DSE

Frontend: DOC_TYPES dropdown + ChecklistView labels updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-07 10:49:32 +02:00
Benjamin Admin 45446aef16 fix: 8 quality + UX improvements
1. Cookie 'Zwecke' false positive: added 'um...zu', 'dienen', 'helfen',
   'ermöglichen' patterns — catches purpose descriptions without 'Zweck'
2. Kurzhinweis: added empty all_checks for short documents (<200 words)
3. Bezeichnungsfeld: placeholder shows 'Version / Stand' for typed docs,
   'Dokumentname' for 'Sonstiges'
4. DocCheckTab state persistence: entries + results survive navigation
5. DocCheck history: saves each check with date, doc count, findings
6. History display: 'Letzte Pruefungen' section at bottom of tab
7. ChecklistView: shows 'X von Y Pruefpunkten bestanden' per document
8. Results persist in localStorage across page navigation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-07 09:37:47 +02:00
Benjamin Admin 0416bb5d04 fix: Checklist expand — use index instead of URL (prevents all opening at once)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 10:56:44 +02:00
Benjamin Admin 4c68caac4e feat: Multi-URL Document Check with full checklist visibility
New "Dokumenten-Pruefung" tab in Compliance Agent:
- User adds multiple URLs with document type (DSI, AGB, Impressum, Cookie, Widerruf)
- Each document loaded via Playwright, accordions expanded, text extracted
- Checked against type-specific legal checklist
- Optional: Cookie banner check via checkbox

Checklisten-UX (solves "100% looks like nothing was checked"):
- All checks shown per document: green checkmark + matched text excerpt
- Red X for missing fields with legal reference
- Builds user trust: "9 Punkte geprueft, alle bestanden"
- Expandable per document with completeness bar

New checklists:
- Impressum: §5 TMG (6 fields: name, address, contact, register, VAT, representative)
- Cookie-Richtlinie: §25 TDDDG (5 fields: types, purposes, retention, third-party, opt-out)

Backend:
- POST /agent/doc-check — async with polling (same pattern as /scan)
- DocCheckResult includes checks[] with passed/failed + matched_text
- dsi_document_checker returns all_checks in SCORE finding
- Email report shows per-document checklist

Files: agent_doc_check_routes.py (280 LOC), DocCheckTab.tsx (248 LOC),
ChecklistView.tsx (130 LOC), dsi_document_checker.py (+70 LOC)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 10:08:40 +02:00
Benjamin Admin 7c7513525e feat: Document-centric scan results + DSI deduplication
DSI Dedup (consent-tester):
- Only H1/H2 headings count as documents (not H3/H4 sub-sections)
- Sub-sections (Cookies, Betroffenenrechte, Social Media) are part of
  parent document's full text, not separate documents
- Reduces IHK result from 30 to ~11 real documents

Backend (agent_scan_routes):
- ScanFinding gets doc_title field linking each finding to its document
- doc_title set when creating DSI findings for document attribution

Frontend (ScanResult.tsx):
- 3 sections: Services table, Document cards, General findings
- Documents: expandable cards with completeness bar (green/yellow/red)
- Findings grouped under their parent document
- Each card shows: title, word count, findings count, % completeness
- Findings without doc_title go to "Allgemeine Findings" section

Email Summary (agent_scan_helpers):
- Findings listed under their parent document
- General findings in separate section
- No more flat mixed list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 09:56:29 +02:00
Benjamin Admin ca21feedc8 feat: display 8 banner text checks in consent test UI
Shows: Impressum link ✓/✗, DSE link ✓/✗, plus violation cards for
wrong DSE consent wording, pre-ticked checkboxes, dark patterns,
missing reject button, no settings re-access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-02 15:38:07 +02:00
Benjamin Admin 6864849115 feat: Phase 11 — granular cookie category testing
Tests each consent category in isolation:
- Phase D: Only "Statistics" enabled → checks if only analytics loads
- Phase E: Only "Marketing" enabled → checks if only ads load
- Phase F: Only "Functional" enabled → checks no tracking loads

CMP-specific category selectors for Cookiebot, OneTrust, Usercentrics,
Didomi. Generic fallback via toggle/checkbox keyword detection.

SERVICE_CATEGORY_MAP maps 35+ services to expected categories.
Violations: "Facebook Pixel loads with only Statistics enabled" = miscategorization.

Frontend: category test results shown below Phase A-C with
per-category violation cards.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 21:15:23 +02:00
Benjamin Admin b53b36fdc5 feat: 5-tab agent UI — PDF export, compare, auth test, all proxies
- 5 tabs: Schnellanalyse, Website-Scan, Cookie-Test, Vergleich, Login-Test
- PDF download button in ScanResult
- CompareResult: side-by-side compliance comparison table
- AuthTestResult: 5 post-login checks with legal refs
- API proxies: /scans/pdf, /compare, /authenticated-scan
- Compare: textarea for 2-5 URLs, parallel scanning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 16:43:08 +02:00
Benjamin Admin b7f9099ad9 feat: Cookie-Test tab — 3-phase consent test UI + API proxy
Third tab "Cookie-Test" in Compliance Agent:
- Phase A: Before consent (tracking without permission)
- Phase B: After rejection (CRITICAL if tracking persists)
- Phase C: After acceptance (undocumented services)
- CMP badge (Didomi, OneTrust, etc.)
- Violation cards with severity badges and legal references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 12:38:15 +02:00
Benjamin Admin 15d1e118ed feat: TextReference component — original text, position, correction in findings
Shows for each finding:
- Original text block from DSE (or "missing" indicator)
- Position: section heading, number, parent section, paragraph index
- Correction: insert/append/replace with copy button
Falls back to plain correction view if no text reference available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 11:59:55 +02:00
Benjamin Admin 6c0e76f96d feat: show scanned pages in email summary + frontend (expandable list)
Email now lists all scanned URLs with checkmark/cross status.
Frontend shows collapsible "X Seiten gescannt — Details anzeigen".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 17:26:03 +02:00
Benjamin Admin 0f1fae61a6 feat: Website-Scan tab in agent UI — service table, SOLL/IST, corrections
- Tab system: Schnellanalyse (single page) + Website-Scan (multi-page)
- ScanResult component: service comparison table, severity-colored findings
- Expandable correction suggestions with copy button (pre-launch mode)
- API proxy route for /agent/scan endpoint

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 15:52:40 +02:00
Benjamin Admin cb5aa2949b feat: hybrid website compliance checks (§312k BGB, §5 TMG, Art. 13 DSGVO)
- Scan public website for cancellation button, imprint, privacy link, cookie consent
- Generate follow-up questions when checks can't be verified without login
- User answers "no" → finding with legal basis is added to results
- Frontend: FollowUpQuestions component with Ja/Nein buttons
- Sidebar: "Compliance Agent" entry added under KI-Compliance

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 13:25:44 +02:00
Benjamin Admin 0c0dd4e3a6 feat: ZeroClaw compliance agent — document analysis + role assignment + email
Add autonomous compliance agent that fetches web documents (cookie banners,
privacy policies), classifies them via Qwen/Ollama, assesses DSGVO compliance,
assigns to the responsible role, and sends notification emails.

Components:
- ZeroClaw SOP (6-step workflow: fetch, classify, assess, summarize, assign, notify)
- Backend: /api/compliance/agent/analyze (combined endpoint)
- Backend: /api/compliance/agent/notify (standalone email)
- Frontend: /sdk/agent page (Manager UI with URL input + results)
- Helper scripts + E2E test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-27 23:28:21 +02:00