breakpilot-compliance

Author	SHA1	Message	Date
Benjamin Admin	479ce2225b	feat(profile): P14+P15+P16: B2B heuristic + doc-URL dedup + homepage profile. P14: _detect_no_direct_sales now covers 3 clusters: A) OEM configurator (BMW/Audi/Mercedes/VW/Porsche brand names + authorized-dealer pattern), B) B2B service providers (CE certification, compliance consulting, training, auditing, TISAX, ISO standards, occupational safety, ...), C) NGO/association/public bodies (donation account, register of associations, non-profit status, ...). Threshold: pos >= 2 per cluster AND pos > neg. Previously: OEM only. P15: doc-URL dedup in the worker: when several doc types reference THE SAME document (Safetykon pattern: the user submits /datenschutz for dse, cookie AND widerruf), only the primary doc type (priority: dse > impressum > cookie > widerruf > agb > nutzungsbedingungen) receives the text. The others are reported as "Nicht separat vorhanden — wird im Dokument 'X' mit-geprueft." This eliminates the 8+8 systematic widerruf/cookie false positives. P16: profile detection also uses the homepage text: the homepage HTML is fetched with a short timeout (8s), stripped, and merged into profile_input. Previously P14 only took effect when B2B indicators appeared in the mandatory DSE/Impressum text; for Safetykon they appear only in the homepage menu. Bonus: the TDM-override submit button is disabled when the reason is shorter than 10 characters, which keeps users from clicking into the bug seen today. Smoke test Safetykon (B2B compliance service provider): dse checked (no err); impressum checked (no err); cookie "Nicht separat vorhanden — wird in DSE mit-geprueft"; agb "Nicht anwendbar — kein Direkt-Kaufvertrag"; widerruf "Nicht anwendbar — kein Direkt-Kaufvertrag"; nutzungsbedingungen "Nicht anwendbar — kein Direkt-Kaufvertrag". Before: 16 false positives. Now: 0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 11:46:58 +02:00
Benjamin Admin	78b27d4684	feat(compliance-check): P12: TDM override with documented customer permission. Backend: ComplianceCheckRequest extended with tdm_override + tdm_override_reason. Worker in the _run_compliance_check path: with tdm_override=True AND a reason of >= 10 characters, the TDM reservation is only documented (job.tdm_override.{reason, original_status}) and NOT treated as a reason to abort. Without a reason, the override is ignored. Audit trail via logger.warning(reason). Frontend: ComplianceCheckTab gains a checkbox + mandatory reason field ("Schriftliche Crawl-Erlaubnis vorhanden") directly before the submit button. Required: reason >= 10 characters. Submit sends the flags to the backend. Use case: Safetykon pattern, where robots.txt + ai.txt set a reservation but the customer has consented in writing (commissioned audit). [guardrail-change] ComplianceCheckTab.tsx (511 LOC) added to loc-exceptions; the split into _components/TDMOverride + CompliancePolling is P11 tech debt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 08:56:50 +02:00
Benjamin Admin	6c223c7c9b	feat(compliance-check): exec-summary + voll-audit + TDM-respect + cookie-KB-extended + saving-scan-funnel. P1: exec summary at the top of the email report (4 KPIs + 2 CTAs, dark gradient). P3: no_direct_sales flag for OEM configurator sites; AGB/Widerruf shown as "NICHT ANWENDBAR" (grey) instead of "NICHT GEFUNDEN" (red). P5: full-audit unification: all findings (MC + Pflichtangaben + vendor + redundancy) in /data/compliance_audits.db.unified_findings; new /api/compliance/agent/findings/<id> endpoint + FindingsTab in the audit UI with filter + CSV export. P7: crawl hardening: TDM reservation check (robots.txt / ai.txt / header / meta) before every run with a 24h cache; HeadlessChrome UA (company not yet founded; switch via BREAKPILOT_BRANDED_UA env); per-domain rate limit of 1 req/s + max 2 concurrent. P2: cookie knowledge DB extended additively (35 -> 74 cookies): Adobe, Meta, Microsoft, LinkedIn, TikTok, HubSpot, Marketo, Salesforce, Hotjar, FullStory, Mouseflow, Intercom, Drift, Zendesk, Cloudflare, Stripe, OneTrust/Cookiebot/Usercentrics, Matomo, Pinterest, Snapchat, X/Twitter, YouTube, Vimeo, Klaviyo, Mailchimp, Mixpanel, Segment, Amplitude, Optimizely, Datadog; the wire-in in cookie_function_classifier yields a compliance_risk label (kritisch/hoch/mittel/gering) per vendor. A: k-anonymity helper (benchmark_k_anonymity) in preparation for P6. B: cross-tenant domain assertion in the /findings endpoint (expected_domain query param -> 403 on mismatch). C: saving-scan funnel: /api/compliance/agent/saving-scan/start with validation + 24h rate limit per domain + lead persistence in saving_scan_leads + auto-discovery via _run_compliance_check; 6 tests. D: risk badge in the email vendor row. Legal guardrails (memory feedback_oem_data_legal.md): only our own brief assessments + source pointers, no 1:1 copies of third-party CMP texts. TDM opt-out respected per § 44b UrhG. NO schema changes; everything lives in sidecar SQLite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 23:48:34 +02:00
Benjamin Admin	df7d83134b	feat(agent): migrate compliance-check results to banner + documents (M1-M5) After a compliance-check run finishes, the user can now apply the extracted vendor inventory directly to their own: - CookieBanner config (admin /sdk/einwilligungen) - Cookie-Policy / VVT-Register / Privacy-Policy templates (admin /sdk/document-generator) Backend: - migration_to_banner.py: vendor list -> CookieBannerConfig with ESSENTIAL/PERFORMANCE/PERSONALIZATION/EXTERNAL_MEDIA buckets + review flags (broken opt-out URLs, missing expiry, no cookies listed) - migration_to_document.py: vendor list -> pre-fills for 3 doc templates, recipient-type aware (INTERNAL/GROUP/PROCESSOR/CONTROLLER) - agent_migration_routes.py: GET /banner-preview, /document-preview, /summary keyed on check_id - compliance_audit_log: new check_payloads table persists cmp_vendors + extracted_profile so the preview survives an app restart - tests: 9 mapper units + 4 endpoint integration tests Frontend: - MigrationPanel.tsx: modal showing banner-config diff + document pre-fills, plus links into the existing editors - ComplianceCheckTab.tsx: replaces standalone audit link with the panel; net -3 lines, stays at the 500-cap Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 14:06:28 +02:00
Benjamin Admin	6ed30dae5b	feat(agent): MC scorecard + audit drill-down + tenant trend (A1-A6) Now that all 1874 MCs run per check (Task #30 cap removal), the report was about to drown in noise. This commit adds the full aggregation / persistence / drill-down stack so each MC is actionable, not just counted. A1 mc_scorecard.py (new): build_scorecard(checks) -> per-regulation PASS/FAIL/SKIP + severity top_fails(checks, n) -> N most severe failed MCs full_audit_records(...) -> flat rows ready for sidecar SQLite A2 Email rendering: agent_doc_check_scorecard.py (new) builds an HTML scorecard table (regulation × passed/failed/HIGH/MEDIUM/score) shown at the top of the email. agent_doc_check_report._render_document now collapses the 500-MC L2 forest into 'X/Y bestanden (Z Fail)' summary plus a top-10 fails block per doc — old verbose render is gone. A3 compliance_audit_log.py (new) — sidecar SQLite at /data/compliance_audits.db (separate from compliance Postgres schema to comply with the no-new-migrations rule in CLAUDE.md): check_runs(check_id, ts, tenant_id, site_name, base_domain, doc_count, scorecard json, vvt_summary json) mc_results(check_id, doc_type, mc_id, label, passed, skipped, severity, regulation, matched_text, hint) Route persists every run after the email is sent. docker-compose.yml adds compliance-audit volume + env. A4 backfill_mc_regulation_llm.py (new) — Qwen-tagged backfill for the 1636 MCs the regex pass couldn't classify. Batches of 25, format=json, output constrained to the canonical regulation list. Run manually: docker exec bp-compliance-backend python3 \ /app/scripts/backfill_mc_regulation_llm.py [--dry-run] A5 Admin audit tab — GET /api/compliance/agent/audit/<check_id> proxied via /api/sdk/v1/agent/audit/<id>. New page /sdk/agent/audit/[checkId] renders scorecard + filterable MC table (status / doc_type / regulation, expandable rows with matched_text + hint). ComplianceCheckTab now shows 'Voll-Audit oeffnen' link. 
A6 Trend per tenant — GET /api/compliance/agent/audit/tenant/<id> returns recent runs. Email scorecard shows per-regulation delta badges ('(+12%)', '(-3%)') compared with the previous run for the same tenant + base_domain. Lookup is one SQLite query. Plumbing: rag_document_checker.py — SELECT now includes 'article'; MC results carry 'regulation' + 'article' through to CheckItem. agent_doc_check_routes.CheckItem schema gains regulation + article fields (defaults '') so old clients still parse. agent_compliance_check_routes — response gains 'check_id' so the frontend can build the audit link.	2026-05-17 13:45:58 +02:00
Benjamin Admin	525038359a	feat(compliance-check): auto-discover missing doc types from homepage When the user leaves some doc-type rows empty, the tool now actively searches the website for them — only marks 'not found' as last resort. Flow: 1. User submits N URLs (e.g. just DSI) 2. For each canonical doc_type with no submitted URL/text, the route identifies the most-common base (scheme://netloc) from submitted URLs 3. Calls consent-tester /dsi-discovery on the homepage with max_documents=15 (180s timeout) 4. Classifies every discovered doc into a canonical doc_type via title/URL keyword rules (_DISCOVERY_RULES — covers cookie/widerruf/ social_media/agb/nutzungsbedingungen/dsb/impressum/dse) 5. Fills matching empty entries with the discovered text, marks auto_discovered=True and discovery_attempted=True Padding now differentiates: - 'Auf der Website nicht gefunden' — discovery was attempted, no doc matched. Amber badge, friendly hint to add URL manually. - 'Nicht eingereicht — Quelle nicht angegeben' — user gave NO URLs at all, nothing to crawl from. Grey badge. Email + frontend: - Status labels: NICHT GEFUNDEN (amber) vs NICHT EINGEREICHT (grey) - 'Gepruefte Quellen' table tags auto-discovered URLs with a small blue 'auto-entdeckt' badge so GF sees what tool found vs user submitted. Implementation only runs when ≥1 URL was submitted (no base to crawl from otherwise). Adds 30-90s for unsubmitted types but avoids the 'just say nicht gefunden' anti-pattern.	2026-05-17 01:14:05 +02:00
Benjamin Admin	bc21480a2a	fix(compliance-check): always render 8 doc types + 4 BMW GT-gap fixes Always-show-8 (user-requested): - agent_compliance_check_routes.py: _pad_results_with_missing pads the results list to always include all 8 canonical doc_types in canonical order. Missing types get a placeholder DocCheckResult with error= 'Nicht eingereicht' + scenario='missing'. - agent_doc_check_report.py: NICHT EINGEREICHT status label (neutral), friendly grey body block instead of red error. - ChecklistView.tsx: 'Nicht eingereicht' chip (neutral grey, not red 'Fehler'); SCENARIO_LABELS adds missing entry + header chip counter. Impressum-Regression fix (#18): - _fetch_text(url, doc_type): cookie/dse/social_media -> max_documents=1 (CMP capture authoritative, sub-pages dilute). Other types -> =3 (Impressum needs Versicherungsvermittler, Aufsicht, Berufsrecht sub- pages). 15s networkidle bail keeps timing safe. ODR/Verbraucherstreitbeilegung filter (#19): - _apply_profile_filter: when profile.needs_odr=True (B2C), override the check's default B2B-oriented hint with action-oriented B2C guidance pointing at Art. 14 EU-VO 524/2013 + §36 VSBG. Previously the check contradicted itself: 'profile says B2C' + hint 'only relevant for B2C online vendors'. Registergericht regex (#20): - impressum_checks.py: accept colon/dot/dash between keyword and city (BMW writes 'registergericht: münchen hrb 42243'). Add 'sitz und registergericht: X' as separate pattern. Industry detection (#21): - business_profiler.py: 'automotive' keywords broadened (antriebs, motor, leasing, werkstatt, probefahrt, plus brand names BMW/Mercedes/ Audi/VW/Porsche/Opel). 'it_services' keywords narrowed — software/ cloud/hosting are mentioned in every privacy policy and were biasing the result toward IT for any tech-aware company.	2026-05-17 01:03:58 +02:00
Benjamin Admin	e61e9d9e2a	feat(agent): progress_pct + 6 improvements from the BMW run. Backend (agent_compliance_check_routes.py): - progress_pct (0-100%) in the job state, spread across all phases (load 0-30, profile 35-40, check 40-80, banner 80-92, report 95-100) - Status texts unified ("Texte laden X/N", "Pruefen X/N") - Company name for the email subject is now derived from the URL (bmw.de -> "BMW", mercedes-benz.de -> "Mercedes-Benz") instead of the unreliable extracted_profile.companyName (which often matched juris.de) - Email report now includes the banner + TCF vendor list (build_provider_list_html) Backend (agent_doc_check_extras.py, new): - build_scanned_urls_html: checked URLs as a table at the top of the report (shows the GF which sources were actually fetched) - Cross-domain notice for >1 netloc (BMW: bmw.de / bmwgroup.com / bmwgroup.jobs; discoverability under Art. 12 GDPR) - build_provider_list_html: banner box + TCF vendor table with columns Name | Kategorie | Zweck | Drittland | Rechtsgrundlage Backend (business_profiler.py): - §34d GewO insurance-intermediary notices no longer count toward the "finance" industry (this caused BMW to be misclassified as B2B/finance) - New industry "automotive" (Fahrzeug/KFZ/Konfigurator/Modellpalette) - B2B keywords: generic terms such as "unternehmen", "beratung", "consulting" removed (they matched in every corporate text) - B2C fallback: with consumer signals ("widerruf", "kunde", editorial content) the result leans toward b2c instead of b2b Frontend (ComplianceCheckTab.tsx): - Progress bar with width-% and an XX% readout on the right - reads data.progress_pct from the polling response Consent-Tester (dsi_discovery.py): - Critical fix for cookie-policy extraction: wait_for_function until body.innerText > 500 chars (BMW's SPA rendering needed more time) - _extract_text_robust: 3-strategy extraction (selectors -> body cleanup -> P/LI/TD tags) - _extract_text_from_iframes: reads OneTrust/Sourcepoint/Usercentrics iframe content (some cookie policies live there) Addresses all findings from the BMW ground-truth comparison.	2026-05-16 17:53:14 +02:00
Benjamin Admin	d45e08e25f	fix: reduce Playwright timeout 180s→60s, increase poll limit 15→25min	2026-05-16 00:47:28 +02:00
Benjamin Admin	08fcb5f239	feat(compliance-check): scenario badges + extracted profile display - Show extracted profile fields (company name, legal form, address, DPO, USt-IdNr) with "In Company Profile uebernehmen" button - Show Compliance Scope hints extracted from documents - Scenario badges per document: Neugenerierung (red), Korrekturen (amber), Konform (green) - Summary line shows scenario counts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-12 17:49:45 +02:00
Benjamin Admin	4a7e09bbb0	fix(impressum): regex [A-Z] never matches on lowercased text All patterns matched against text_lower but used [A-Z] character class. Changed to [a-zA-Z] so patterns like "geschäftsführung: dr. oliver" are found. Also added "Pflicht"/"Detail" labels to the two progress bars to clarify what 100% vs 8% means. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-12 14:02:25 +02:00
Benjamin Admin	128967fa3d	fix(checklist-ui): show INFO-severity checks as gray info icon INFO checks (V.i.S.d.P., Streitbeilegung, Berufsrecht, Stammkapital, etc.) that fail are now shown with a gray info icon instead of red X, with gray hint text. They are excluded from the Pflichtangaben count since they are context-dependent and likely not applicable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-12 12:28:00 +02:00
Benjamin Admin	ce77cde309	fix(compliance-check): batch LLM verification + increase poll timeout - LLM verify now sends ALL failed checks in one batched call instead of one Ollama call per check (80+ calls → 1 per document) - Increase frontend poll timeout from 6 min to 15 min Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-12 11:49:30 +02:00
Benjamin Admin	a127dd971b	fix(compliance-check): resume polling after navigation away Save active check_id to localStorage so polling resumes when the user navigates away via sidebar and comes back. Same pattern as scan tab. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-12 11:37:06 +02:00
Benjamin Admin	ed3ebbc246	fix(compliance-check): send 'documents' instead of 'entries' to backend Frontend was sending field name 'entries' but backend Pydantic model expects 'documents', causing 422 validation error. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-12 09:25:36 +02:00
Benjamin Admin	f3751a4efa	feat(compliance-check): show business profile + banner check result in UI Add two info boxes above the checklist results: - Business profile (B2B/B2C, industry, regulated profession) - Banner check status (CMP detected, violations count, cross-check hint) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-12 00:19:51 +02:00
Benjamin Admin	0d0e705117	feat: Unified Compliance-Check — 8 document types in one form New 3-tab structure: Website-Scan, Compliance-Check, Banner-Check. Compliance-Check Tab (replaces Dokumenten-Pruefung + Impressum-Check): - 8 document rows: DSI, Impressum, Social Media, Cookie, AGB, Nutzungsbedingungen, Widerruf, DSB-Kontakt - Each row: URL input + "Text laden" + file upload + manual text - "Text laden" extracts via consent-tester, shows in editable textarea - User verifies/corrects text before checking - Empty fields = "not present" → own finding Business Profiler (business_profiler.py): - Detects B2B/B2C/B2G from all documents together - Recognizes regulated professions, online shops, editorial content - Context-aware: INFO checks become PASS/FAIL based on profile Backend: /compliance-check + /extract-text endpoints Frontend: ComplianceCheckTab.tsx + DocumentRow.tsx API proxies: compliance-check/route.ts + extract-text/route.ts Also: Impressum regex fixes (Telefon, AG, Geschaeftsfuehrung) and INFO severity for context-dependent checks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-11 20:56:10 +02:00
Benjamin Admin	02ff96f74e	fix: resolve all merge conflict markers from feat/zeroclaw-compliance-agent 9 files had conflict markers from the branch merge. All resolved keeping the feature branch version. Also split agent_scan_routes.py (534→367 LOC) by extracting Pydantic models to agent_scan_models.py. [guardrail-change] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-11 12:15:07 +02:00
Benjamin Admin	36c6101b91	Merge feat/zeroclaw-compliance-agent into main Brings all compliance doc-check features: - 162 regex checks + 1874 Master Controls - LLM-agnostic agent with tool calling - Banner check (46 checks, 30 CMPs, stealth, Shadow DOM) - Impressum check (24 checks) - Deep consent verification (DataLayer, GCM, TCF) - CMP E2E tests (39 tests) - HTML email reports, FAQ, persistent history Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-11 11:44:20 +02:00
Benjamin Admin	cc919eb608	feat: KI-Agent toggle in all 3 check tabs - Impressum-Check: Toggle activates 75 Impressum MCs via agent - Banner-Check: Toggle runs additional cookie doc-check (381 MCs) after the Playwright banner test completes - Both use the same use_agent flag through doc-check endpoint Green pill button consistent across all tabs: 'KI-Agent aus' / 'KI-Agent aktiv (X MCs)' Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-11 08:00:36 +02:00
Benjamin Admin	91d6d8b1a7	feat: KI-Agent toggle button in Dokumenten-Pruefung Green pill button: 'KI-Agent aus' / 'KI-Agent aktiv (1.874 MCs)' Toggles use_agent flag which is passed through the full chain: Frontend → DocCheckRequest → _run_doc_check → _check_single_document → check_document_with_controls(use_agent=True) → ComplianceAgent with tool calling Default: OFF (deterministic regex). User can enable per scan. Also works via env var COMPLIANCE_USE_AGENT=true for always-on. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-10 23:26:21 +02:00
Benjamin Admin	05d98ea95f	feat: New tab structure — Discovery Scan, Doc-Check, Banner, Impressum Removed Schnellanalyse tab. New 4-tab structure: 1. Website-Scan (Discovery): Finds legal documents + services, shows "Jetzt pruefen" buttons that navigate to specialized tabs with pre-filled URLs. 2. Dokumenten-Pruefung: DSI, AGB, Cookie, Widerruf checks (existing) 3. Banner-Check: Cookie banner 46-check deep verification (existing) 4. Impressum-Check (NEW): §5 TMG / §18 MStV with 16 checks, own tab with URL input, history, email report. Uses existing impressum_checks.py via doc-check endpoint. Tab cross-navigation: Scan → "Jetzt pruefen" → opens target tab with URL pre-filled via localStorage handoff. Removed: Mode selector (pre/post launch), Schnellanalyse, useAgentAnalysis hook import, AnalysisResult/FollowUpQuestions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-10 09:09:27 +02:00
Benjamin Admin	f201c01a06	fix: Replace unicode escapes with actual emoji characters	2026-05-10 08:20:00 +02:00
Benjamin Admin	33f0a64ff6	feat: Persistent result history — click to reload old scan results Both DocCheckTab and BannerCheckTab now: - Store full scan results per history entry in localStorage - History entries are clickable — loads the saved result immediately - No need to re-scan to see old results - Fallback to last result if specific entry not found - Banner-Check sends HTML email report to mailpit Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-10 07:59:02 +02:00
Benjamin Admin	1b8e9881bb	feat: Banner-Check — history, persistent result, email report 1. localStorage persistence: URL, last result, history (30 entries) 2. History: shows URL, date, provider, violations, percentage 3. Last result stays visible after tab switch/reload 4. Email report: HTML-formatted with violations + hints, sent to mailpit 5. Email status display in the frontend Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-10 07:55:12 +02:00
Benjamin Admin	2143840ee7	docs(agent): add FAQ about harmonised standards copyright + EuGH C-588/21 P Explains why companies must buy norms their own employees wrote, and the 2024 EuGH ruling that harmonised standards are EU law and must be freely accessible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-09 09:50:44 +02:00
Benjamin Admin	4bfb438c92	feat: 4 banner check upgrades — 30 CMPs, stealth, Shadow DOM, categories 1. 30 CMP selectors (was 10): Added Sourcepoint, Iubenda, Complianz, CookieFirst, HubSpot, Osano, Piwik PRO, Cookie Consent (Insites), Axeptio, Termly, CookieScript, Civic UK, GDPR Cookie Compliance, CookieHub, Ketch, Admiral, Sibbo, Evidon, LiveRamp, Adsimple. Plus improved generic fallback: role=dialog, aria-label, data-* attrs. 2. Playwright stealth mode: playwright-stealth against bot detection. Removes WebDriver flag, simulates plugins, realistic viewport/locale. Launch args: --disable-blink-features=AutomationControlled. 3. Shadow DOM: Recursive JS-based search through shadowRoot elements for consent banners. Fallback click via page.evaluate() when normal Playwright selectors can't penetrate Shadow DOM. 4. Category selection UI: User can choose which cookie categories to test (Notwendig, Statistik, Marketing, Funktional, Praeferenzen). Pill-style checkboxes in BannerCheckTab, forwarded through API chain. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-09 08:42:30 +02:00
Benjamin Admin	751f4a5ee7	fix: Remove dead polling code from BannerCheckTab The /banner-check endpoint is synchronous (Playwright completes in <30s and returns result directly). Removed unused async polling loop that would never match since no scan_id is returned. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-09 08:22:36 +02:00
Benjamin Admin	0fcb3ee488	docs(agent): add Machinery Regulation harmonised standards FAQ Explains current status: no harmonised standards published under (EU) 2023/1230 yet, ~800 from old directive still valid. Timeline from June 2023 to January 2027 full application. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-09 07:17:32 +02:00
Benjamin Admin	686834cea0	feat: 4 remaining tasks — EU institutions, banner integration, JS-sites, Caritas fixes 1. EU Institution Checks (Regulation (EU) 2018/1725): - New doc_type "eu_institution" with 9 L1 + 15 L2 checks - Both German + English patterns (EU institutions are multilingual) - Auto-detection via "2018/1725", "EDSB", "EDPS" keywords - Correct article references (Art. 15 instead of 13, Art. 5 instead of 6) 2. Banner Check Integration: - banner_runner.py maps scan results to 36 L1/L2 structured checks - BannerCheckTab shows hierarchical ChecklistView with hints - 3-phase summary (cookies/scripts before/after consent) - /scan endpoint now includes structured_checks in response 3. JS-heavy Website Fixes (dm, Zalando, HWK): - dsi_helpers.py: goto_resilient (networkidle→domcontentloaded fallback) - try_dismiss_consent_banner before text extraction - PDF redirect detection (dm.de redirects to GCS PDF) 4. Caritas False Positive Fixes: - Phone regex allows parentheses: +49 (0)761 → now matches - "Recht auf Widerspruch" (3 words) + §23 KDG → matches Art. 21 - Church authorities: "Katholisches Datenschutzzentrum" recognized Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-08 01:10:10 +02:00
Benjamin Admin	63bd6a7c6d	feat: Compliance FAQ section in Agent page 5 FAQ items covering: - What happens when companies are sued (4 enforcement paths) - How document checks work (3-step process) - Which document types are checked (7 types, 138 checks) - How reliable results are (0 false positives, LLM verification) - What GDPR violations cost in practice (fine tiers + examples) Includes EuGH rulings (C-300/21, C-319/20), CNIL fine examples, and practical cost ranges. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-08 00:32:07 +02:00
Benjamin Admin	7c17321089	feat: Cookie Banner Check as standalone tab in Compliance Agent New "Banner-Check" tab with: - URL input → Playwright 3-phase test (before/reject/accept) - Shield icon + provider detection - Progress bar with pass/fail percentage - 3-phase summary (cookies + scripts per phase) - Violations (red) and passes (green) in structured list Backend: new POST /api/compliance/agent/banner-check endpoint that proxies to consent-tester:8094/scan. Next step: Upgrade banner checks to L1/L2 format with expert hints (same quality as document checks). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-07 17:39:44 +02:00
Benjamin Admin	293c58d0dd	feat: Add actionable hints to all 138 compliance checks Each check now has a "hint" field explaining what is missing and what the customer should do to fix it. Hints are shown in the frontend below failed checks in red text. Examples: - "Bei Verarbeitung auf Basis von Art. 6(1)(f) muss dokumentiert werden, warum Ihr berechtigtes Interesse die Rechte der Betroffenen ueberwiegt." - "Die ladungsfaehige Anschrift fehlt. Erforderlich: Strasse, Hausnummer, PLZ und Ort." Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-07 14:05:01 +02:00
Benjamin Admin	8849c396b5	fix: Show L2 detail checks always visible (no extra click needed) L2 checks were hidden behind a second click on L1 items. Now they render inline below their L1 parent, always visible when the document card is expanded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-07 13:16:04 +02:00
Benjamin Admin	b363c28539	feat: Add 76 Level-2 regex checks for document correctness verification Split dsi_document_checker.py (466 LOC) into doc_checks/ package (9 files). Two-pass L1→L2 logic: L1 checks "Is it mentioned?", L2 checks "Is it correct?" (e.g. controller has full address, specific Art. 6 lit., concrete time periods). 138 total checks (62 L1 + 76 L2) across 7 doc types: - DSE Art. 13: 31, Impressum §5 TMG: 16, Cookie §25 TDDDG: 15 - Widerruf §355: 15, AGB §305ff: 21, Social Media Art. 26: 20, DSFA Art. 35: 18 Frontend: hierarchical L1→L2 display with dual progress bars (green=completeness, blue=correctness). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-07 12:37:03 +02:00
Benjamin Admin	3853a0838a	feat: Art. 26 Joint Controller + DSFA checklists for Social Media sections New checklists: - JOINT_CONTROLLER_CHECKLIST (Art. 26 DSGVO, 7 checks): Joint parties, arrangement, contact point, processing split, data categories, third-country transfer (USA), rights - DSFA_CHECKLIST (Art. 35 DSGVO, 5 checks): Description, necessity, risk assessment, measures, DSB involvement Section detection: 'Datenschutzerklaerung fuer Social Media' → social_media, 'Datenschutzfolgeabschaetzung/Risikoanalyse' → dsfa classify_document_type: DSFA and social_media detected before generic DSE Frontend: DOC_TYPES dropdown + ChecklistView labels updated Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-07 10:49:32 +02:00
Benjamin Admin	45446aef16	fix: 8 quality + UX improvements 1. Cookie 'Zwecke' false positive: added 'um...zu', 'dienen', 'helfen', 'ermöglichen' patterns — catches purpose descriptions without 'Zweck' 2. Kurzhinweis: added empty all_checks for short documents (<200 words) 3. Bezeichnungsfeld: placeholder shows 'Version / Stand' for typed docs, 'Dokumentname' for 'Sonstiges' 4. DocCheckTab state persistence: entries + results survive navigation 5. DocCheck history: saves each check with date, doc count, findings 6. History display: 'Letzte Pruefungen' section at bottom of tab 7. ChecklistView: shows 'X von Y Pruefpunkten bestanden' per document 8. Results persist in localStorage across page navigation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-07 09:37:47 +02:00
Benjamin Admin	0416bb5d04	fix: Checklist expand — use index instead of URL (prevents all opening at once) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-06 10:56:44 +02:00
Benjamin Admin	4c68caac4e	feat: Multi-URL Document Check with full checklist visibility New "Dokumenten-Pruefung" tab in Compliance Agent: - User adds multiple URLs with document type (DSI, AGB, Impressum, Cookie, Widerruf) - Each document loaded via Playwright, accordions expanded, text extracted - Checked against type-specific legal checklist - Optional: Cookie banner check via checkbox Checklist UX (solves "100% looks like nothing was checked"): - All checks shown per document: green checkmark + matched text excerpt - Red X for missing fields with legal reference - Builds user trust: "9 Punkte geprueft, alle bestanden" - Expandable per document with completeness bar New checklists: - Impressum: §5 TMG (6 fields: name, address, contact, register, VAT, representative) - Cookie-Richtlinie: §25 TDDDG (5 fields: types, purposes, retention, third-party, opt-out) Backend: - POST /agent/doc-check — async with polling (same pattern as /scan) - DocCheckResult includes checks[] with passed/failed + matched_text - dsi_document_checker returns all_checks in SCORE finding - Email report shows per-document checklist Files: agent_doc_check_routes.py (280 LOC), DocCheckTab.tsx (248 LOC), ChecklistView.tsx (130 LOC), dsi_document_checker.py (+70 LOC) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-06 10:08:40 +02:00
Benjamin Admin	7c7513525e	feat: Document-centric scan results + DSI deduplication DSI Dedup (consent-tester): - Only H1/H2 headings count as documents (not H3/H4 sub-sections) - Sub-sections (Cookies, Betroffenenrechte, Social Media) are part of parent document's full text, not separate documents - Reduces IHK result from 30 to ~11 real documents Backend (agent_scan_routes): - ScanFinding gets doc_title field linking each finding to its document - doc_title set when creating DSI findings for document attribution Frontend (ScanResult.tsx): - 3 sections: Services table, Document cards, General findings - Documents: expandable cards with completeness bar (green/yellow/red) - Findings grouped under their parent document - Each card shows: title, word count, findings count, % completeness - Findings without doc_title go to "Allgemeine Findings" section Email Summary (agent_scan_helpers): - Findings listed under their parent document - General findings in separate section - No more flat mixed list Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-05 09:56:29 +02:00
Benjamin Admin	ca21feedc8	feat: display 8 banner text checks in consent test UI Shows: Impressum link ✓/✗, DSE link ✓/✗, plus violation cards for wrong DSE consent wording, pre-ticked checkboxes, dark patterns, missing reject button, no settings re-access. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 15:38:07 +02:00
Benjamin Admin	6864849115	feat: Phase 11 — granular cookie category testing Tests each consent category in isolation: - Phase D: Only "Statistics" enabled → checks if only analytics loads - Phase E: Only "Marketing" enabled → checks if only ads load - Phase F: Only "Functional" enabled → checks no tracking loads CMP-specific category selectors for Cookiebot, OneTrust, Usercentrics, Didomi. Generic fallback via toggle/checkbox keyword detection. SERVICE_CATEGORY_MAP maps 35+ services to expected categories. Violations: "Facebook Pixel loads with only Statistics enabled" = miscategorization. Frontend: category test results shown below Phase A-C with per-category violation cards. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 21:15:23 +02:00
Benjamin Admin	b53b36fdc5	feat: 5-tab agent UI — PDF export, compare, auth test, all proxies - 5 tabs: Schnellanalyse, Website-Scan, Cookie-Test, Vergleich, Login-Test - PDF download button in ScanResult - CompareResult: side-by-side compliance comparison table - AuthTestResult: 5 post-login checks with legal refs - API proxies: /scans/pdf, /compare, /authenticated-scan - Compare: textarea for 2-5 URLs, parallel scanning Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 16:43:08 +02:00
Benjamin Admin	b7f9099ad9	feat: Cookie-Test tab — 3-phase consent test UI + API proxy Third tab "Cookie-Test" in Compliance Agent: - Phase A: Before consent (tracking without permission) - Phase B: After rejection (CRITICAL if tracking persists) - Phase C: After acceptance (undocumented services) - CMP badge (Didomi, OneTrust, etc.) - Violation cards with severity badges and legal references Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 12:38:15 +02:00
Benjamin Admin	15d1e118ed	feat: TextReference component — original text, position, correction in findings Shows for each finding: - Original text block from DSE (or "missing" indicator) - Position: section heading, number, parent section, paragraph index - Correction: insert/append/replace with copy button Falls back to plain correction view if no text reference available. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-29 11:59:55 +02:00
Benjamin Admin	6c0e76f96d	feat: show scanned pages in email summary + frontend (expandable list) Email now lists all scanned URLs with checkmark/cross status. Frontend shows collapsible "X Seiten gescannt — Details anzeigen". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 17:26:03 +02:00
Benjamin Admin	0f1fae61a6	feat: Website-Scan tab in agent UI — service table, SOLL/IST, corrections - Tab system: Schnellanalyse (single page) + Website-Scan (multi-page) - ScanResult component: service comparison table, severity-colored findings - Expandable correction suggestions with copy button (pre-launch mode) - API proxy route for /agent/scan endpoint Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 15:52:40 +02:00
Benjamin Admin	cb5aa2949b	feat: hybrid website compliance checks (§312k BGB, §5 TMG, Art. 13 DSGVO) - Scan public website for cancellation button, imprint, privacy link, cookie consent - Generate follow-up questions when checks can't be verified without login - User answers "no" → finding with legal basis is added to results - Frontend: FollowUpQuestions component with Ja/Nein buttons - Sidebar: "Compliance Agent" entry added under KI-Compliance Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 13:25:44 +02:00
Benjamin Admin	0c0dd4e3a6	feat: ZeroClaw compliance agent — document analysis + role assignment + email Add autonomous compliance agent that fetches web documents (cookie banners, privacy policies), classifies them via Qwen/Ollama, assesses DSGVO compliance, assigns to the responsible role, and sends notification emails. Components: - ZeroClaw SOP (6-step workflow: fetch, classify, assess, summarize, assign, notify) - Backend: /api/compliance/agent/analyze (combined endpoint) - Backend: /api/compliance/agent/notify (standalone email) - Frontend: /sdk/agent page (Manager UI with URL input + results) - Helper scripts + E2E test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-27 23:28:21 +02:00

49 Commits