Commit Graph

103 Commits

Author SHA1 Message Date
Benjamin Admin 3f90e40807 fix(browser-matrix): Tracking-Signal statt Cookie-Rohzahl + Matrix-Schnellpfad
Korrektheit (§ 25 TDDDG): "Cookies vor Consent" ist KEIN Verstoss per se —
technisch notwendige Cookies inkl. des Consent-Cookies (speichert die
Ablehnung) sind nach Abs. 2 erlaubt. Verstoss ist nur nicht-essentielles
TRACKING vor Consent.
- browser_cross_finding: Befund haengt jetzt an violations.before_consent
  (Tracking), nicht an der Cookie-Rohzahl; § 25 Abs. 2-Hinweis im Detail.
  Regressionstest: Cookies-ohne-Tracking → KEIN Befund.
- multi_browser_scanner._extract_dimensions: Score nutzt Tracking-Violations
  + reject_respected-Verdikt statt Rohzahl (Fallback erhalten).
- BrowserBehaviorView: "Cookies vor Consent" nur rot/⚠ bei Tracking,
  "nach Ablehnen" neutral (Verdikt = reject-Spalte); erklaerende Zeile.

Speed: run_consent_test ueberspringt im Matrix-Modus (browser_profile gesetzt)
die teuren Phasen C/D-F/G — nur A+B noetig. Verhindert das 504 beim
Multi-Engine-Scan (BMW 4 Engines lief sonst in den 338s-Gateway-Timeout).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-13 00:10:41 +02:00
Benjamin Admin 85a8a1d545 feat(browser-matrix): Cross-Browser-Befunde + Browser-Default-Einordnung (Phase 4)
- browser_cross_finding: deterministische Sicht ueber die Matrix (keine 2.
  Engine, kein LLM). Findet Inkonsistenzen ZWISCHEN Browsern (Cookies vor
  Consent / Ablehnen nicht universell respektiert / Banner-Links fehlend) und
  ordnet ein: Safari-ITP / Brave-Shields / Firefox-ETP maskieren Verstoesse
  clientseitig → strenge Engine "sauber" ist KEIN Compliance-Beleg, massgeblich
  sind die nachgiebigen (Chrome/Edge). Coverage-Hinweis fuer nicht verfuegbare
  Browser. Je Befund Titel/Detail/Severity/affected/Massnahme.
- snapshot_check_routes: cross_findings frisch in run + GET (nicht persistiert).
- BrowserBehaviorView: "Cross-Browser-Befunde"-Block ueber der Tabelle.
- Tests: test_browser_cross_finding (6).

Offen (Folge-Task): Borlabs-Consent-Historie-Live-Erkennung (braucht
consent-tester-Storage-Scan).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-12 23:22:57 +02:00
Benjamin Admin 9587726936 feat(admin): Tab "Browser-Verhalten" — Per-Browser-Matrix + Screenshots (Phase 3)
- BrowserBehaviorView: laedt gespeicherte Matrix (GET), sonst "Browser-Test
  starten" (POST run, Live-Lauf). Per-Browser-Tabelle (Cookies vor Consent /
  nach Ablehnen / Ablehnen respektiert / Oberflaeche / Score), Engine-Detail
  mit Banner-Screenshot + Oberflaechen-Befunden, Mobil-Badge, "nicht
  verfuegbar"-Zeilen fuer fehlende Browser (arm64-Dev).
- Proxys browser-behavior (GET) + browser-behavior/run (POST, langer Timeout).
- page.tsx: Tab "Browser-Verhalten" (sichtbar sobald scanbare URL im Snapshot).
- consent-tester scan_matrix_summary: banner_findings je Engine im summary
  (Text/Severity/Norm) → Oberflaechen-Befunde im Tab.
- tsc strict clean; Vitest BrowserBehaviorView (2).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-12 23:15:06 +02:00
Benjamin Admin 45df68537e feat(agent): Voll-Audit-Link in die Snapshot-Detail-Seite
Der "Voll-Audit öffnen (alle MCs)"-Link hing in ComplianceResultTabs (aus
der Agent-Seite entfernt). Jetzt im Detail-Header der Snapshot-Ansicht via
snap.check_id → /sdk/agent/audit/{check_id} (Audit-Daten verifiziert
vorhanden). Plus Site-Titel-Header.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-12 10:05:09 +02:00
Benjamin Admin 1b2b030367 feat(agent): /sdk/agent auf Compliance-Check + Snapshot-Historie reduzieren
- Tabs Website-Scan (nie funktioniert), Banner-Check, Agent-Test entfernt;
  Tab-Leiste weg, da nur noch Compliance-Check übrig.
- Unter dem Compliance-Check jetzt die Snapshot-Historie (neuer
  SnapshotHistoryList): neuester oben + farblich markiert, Klick → Detail-
  Seite mit den Ergebnissen. Macht /sdk/agent/snapshots erreichbar.
- ComplianceCheckTab zeigt nach dem Lauf keine Inline-Ergebnisse mehr,
  sondern einen Hinweis auf die Historie (onComplete refresht sie).
- Tote Komponenten gelöscht (ScanResult/BannerCheckTab/AgentTestTab).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-12 09:31:35 +02:00
Benjamin Admin 403e3c66d2 feat(cookie): Deklaration-vs-Bibliothek-Diff-Sicht + Funnel-KPI
Für die Library-getroffene Teilmesse (~32%) pro Cookie die Feld-
Abweichungen deklariert→Library (Kategorie/Laufzeit/Zweck) als Diff-Karte,
plus ehrlicher Funnel (gesamt → geprüft → abweichend) — nicht-getroffene
Cookies sind nicht prüfbar (kein Pass/Fail), passend zur Tonalität.

- analyze_cookies: 'expected'-Soll-Wert an tracker_as_necessary/
  excessive_lifetime/missing_purpose (+ _CAT_LABEL_DE).
- neues cookie_declaration_diff.build_declaration_diff: reine Regroup-
  Aggregation der Findings pro Cookie (single source = analyze_cookies),
  Hinweis-Typen (third_country/eu_alternative) bewusst ausgeschlossen.
- cookie-check exponiert out['declaration_diff'].
- CookieDeclarationDiff.tsx oben im Cookie-Tab (vor Panel/ResultView).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 21:00:50 +02:00
Benjamin Admin 97e39579d5 feat(cookie+routing): Storage-Typ-Filter + legal_notice capture-only
#3 Storage-Filter: cookie-check exponiert per-Cookie-Speichertyp
(storage_inventory.per_cookie); CookieResultView bekommt Filter-Chips
(Cookie/Local Storage/Framework …) + eine Speicher-Spalte, Anbieter ohne
passenden Treffer werden ausgeblendet, KPI zeigt gefilterte Zahl.

A-Routing: legal_notice ist jetzt ein kanonischer Doc-Type. Eigene
Discovery-Regel (legal-disclaimer/rechtlicher-hinweis) VOR impressum →
die Disclaimer-Seite wird nicht mehr als Impressum substituiert (Ursache,
dass die Cross-Doc-Reconciliation nie zündete). capture-only: als
doc_entry für B persistiert, aber nicht einzeln gescort (keine 0%-Noise,
da ohne eigene Checkliste). Im Scan-Form als Option auswählbar.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 20:45:18 +02:00
Benjamin Admin 0f6cdc93fd fix(snapshot): Cookie-Dedup + schneller Impressum-Tab + Tabellen-Zahl
- Cookies werden je Vendor nach Name dedupliziert (Consent-Phasen-Dubletten;
  BMW 2196 → ~772) — in cookie-check + get_snapshot, behebt aufgeblähte
  Kachel-/Finding-Zahlen.
- Impressum-Snapshot-Check überspringt den ~40s-LLM-Schritt (context skip_llm)
  → Tab lädt sofort statt leer zu bleiben.
- Vendor-Tabelle zeigt nur die Cookie-Zahl (kein 'Cookies'-Wort je Zeile).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 19:54:15 +02:00
Benjamin Admin ba7d98be36 feat(reconcile): B — Cross-Doc-Reconciliation (Pflicht in anderem Doc erfüllt)
Ein 'X fehlt'/'zu prüfen'-Finding wird unterdrückt, wenn die Pflicht in einem
ANDEREN Snapshot-Dokument erfüllt ist (z.B. § 36 VSBG / OS-Link stehen bei BMW
in AGB/'Rechtlicher Hinweis', nicht im Impressum → war False Positive).
Konservative Allowlist (impressum: verbraucher_streitbeilegung, odr_link) gegen
False-Reconciliation. Verdrahtet in _run_doc_agent (alle Doc-Checks). Frontend:
'In anderem Dokument abgedeckt'-Sektion. Greift voll nach Scan + Legal-Capture.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 15:42:16 +02:00
Benjamin Admin 7258744107 refactor+feat: Snapshot-Router-Split + generischer ChecklistAgent + AGB-Modul
- Item 2: Snapshot-Doc-Checks (cookie/impressum/dse/agb) in snapshot_check_routes.py
  (agent_compliance_check_routes.py 464→365 Z.); gleiche Pfade, in main.py registriert.
- ChecklistAgent-Basis: DSE-Logik generalisiert (L1/L2, kurze Titel, _severity_
  override-Hook). DSEAgent + AGBAgent sind jetzt Thin-Subclasses → künftige
  Doc-Agenten (widerruf/avv/…) trivial.
- Item 4: AGBAgent (§§ 305 ff. BGB, AGB_CHECKLIST) + agb-check + AGB-Tab via
  AgentModuleTab. Kein Library-Firehose.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 14:23:29 +02:00
Benjamin Admin 76be96556d feat(dse): kuratierter DSEAgent + Snapshot-Tab (Art. 13/14, kein Firehose)
DSEAgent wrappt die existierende ART13_CHECKLIST (33 kuratierte Pflichtangaben
L1 + Detailchecks L2) → strukturierter AgentOutput, NICHT der 90k-Library-
Firehose (eCall/Gesundheit/Telekom-Lärm). GET /snapshots/{id}/dse-check spiegelt
impressum-check; doc_input_from_snapshot generalisiert. Frontend: generischer
AgentModuleTab (lazy → AgentResultTab) für Impressum + DSE; DSE-Tab in der
Snapshot-Seite. Plus HRB-Pattern \d→\d+ (volle Registernummer als Beleg).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 12:46:46 +02:00
Benjamin Admin 877d540ce1 fix(cookie+impressum): Drittland-FP, Impressum-Beleg, neuer Opt-Out-Finding
- Drittland: unbekannte Herkunft ('N/A') + Self-Hosting feuern nicht mehr —
  First-Party-Session-Cookies (PHPSESSID/JSESSIONID) waren False Positives.
- Impressum _line_of: enges Fenster um den Treffer bei Texten ohne Umbrüche
  (BMW = ein Block) → jede Pflichtangabe zeigt IHREN Beleg statt denselben Satz.
- Neuer Finding-Typ missing_opt_out: einwilligungspflichtiger Anbieter mit
  Cookies ohne Opt-Out-/Widerspruchs-Link (Art. 7 Abs. 3 + Art. 21).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 11:59:48 +02:00
Benjamin Admin 5b36b3f367 feat(impressum): Snapshot-Modul-Tab — ImpressumAgent auf gespeichertem Text
Snapshot-Detailseite wird zu Modul-Tabs (Cookies & Tracking | Impressum).
Backend GET /snapshots/{id}/impressum-check laeuft den v3 ImpressumAgent auf
dem gespeicherten Impressum-Text (kein Re-Crawl); Input-Erzeugung in
impressum_input_from_snapshot() ausgelagert (pure + getestet: Text/Scope/
company_name-Fallback/None-Pfad). Frontend laedt lazy beim Tab-Wechsel und
rendert mit dem bestehenden AgentResultTab (keine zweite Engine).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 11:24:44 +02:00
Benjamin Admin 39cb6afc23 feat(cookie): Findings bearbeitbar — gruppiert nach Typ + Matrix + Hinweise-Split
CookieFindings: Umschalter [Nach Fehlertyp] (je Typ: Maßnahme + betroffene
Cookies + Ticket-Text) ↔ [Matrix] (Cookie×Typ, ✗ Handlung / ⚠ Hinweis).
Trennung FINDINGS (zu beheben) vs HINWEISE (neutral, gegen DSE zu prüfen).
Backend: kind-Klassifikation (third_country/eu_alternative=hinweis); Drittland-
Remediation neutral formuliert (pro Verarbeiter prüfen, keine 'in DSE benennen'-
Befehle, da DSE-Abdeckung wie BMWs 'in der Regel SCC' oft unzureichend).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 11:02:34 +02:00
Benjamin Admin b0115cb10b feat(cookie): 2. Sicht Banner-Kategorie + Fehl-Einsortierung
CookieResultView bekommt einen Umschalter [Rechtliche Rolle] ↔
[Banner-Kategorie] (Notwendig/Funktional/Statistik/Marketing). In beiden
Sichten zeigt jede Cookie-Zeile '→ sollte: Marketing', wenn die tatsächliche
Kategorie laut Library von der deklarierten abweicht (rot bei Tracker als
notwendig, § 25 TDDDG). Neue KPI 'Falsch einsortiert'. Backend liefert dazu
cookie_categories (name→actual_category) aus big_lib im cookie-check-Output;
Seite lädt cookie-check einmal und reicht es an beide Komponenten.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 10:33:33 +02:00
Benjamin Admin 7fa9968ce1 feat(cookie): missing_retention — Vendor ohne Speicherdauer/Löschfrist
Vendor-Ebenen-Finding: greift, wenn ein Vendor eine Verarbeitung deklariert
(Kategorie/Zweck), aber KEINE Cookies gelistet sind UND keine persistence
angegeben ist (z.B. Nayoki GmbH — 'necessary' Auftragsverarbeiter ohne
Löschfrist). Die Pro-Cookie-Schleife sah solche Vendors nie (0 Cookies →
0 Findings). Remediation = Ticket-Text 'bitte Löschfrist festlegen'.
Art. 5 Abs. 1 lit. e + Art. 13 Abs. 2 lit. a → Control AUTH-2051-A03.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 10:02:59 +02:00
Benjamin Admin 05a1795ea8 feat(cookie): ② Documentation Drift — Richtlinie vs. Browser-Realität
Cookie-Check-Endpoint liefert jetzt out["drift"] (audit_cookie_compliance):
deklariert (Cookie-Richtlinie-Text) vs. tatsaechlich geladen (Browser).
Frontend zeigt den Reality-Check-Strip oben im Panel: X dokumentiert ·
Y geladen · Z undokumentiert. Pinnt den Vertrag mit test_cookie_drift.py
(undokumentiert-geladen + beide Drift-Richtungen) + Vitest Drift-Strip.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 09:33:41 +02:00
Benjamin Admin 289988d23e feat(cookie): ① Storage Inventory + storage_transparency-Finding
Trennt echte Cookies von anderem Endgeraete-Speicher (Local/Session Storage,
IndexedDB, Salesforce-Framework-Artefakte) — § 25 TDDDG ist technologieneutral.
- cookie_storage_inventory: detect_storage_type (Name-Muster ComponentDefStorage/
  __MUTEX/LSKey + Laufzeit-Text) + build_storage_inventory + storage_transparency-
  Summenbefund ('X als Cookie gelistet -> Y echte + Z andere').
- Endpoint cookie-check liefert storage_inventory; Frontend zeigt den Breakdown.

Tests: 4 + Frontend-Vitest gruen. Differenzierungsmerkmal: '740 -> 132 + 608'.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 09:05:29 +02:00
Benjamin Admin 901de1ca97 feat(cookie): A — Findings auditfest an Controls verdrahten
Jeder Cookie-Befund traegt jetzt ein strukturiertes control-Feld
(control_id aus doc_check_controls + regulation + article) statt nur
hardcodeter Strings: vague_duration->AUTH-2051-A03 (Art.5(1)e+13),
tracker_as_necessary->DATA-2851-A05 (§25 TDDDG), third_country->
DATA-1624-A04 (Art.44). Kette Regulation->Article->Control->Finding.
Frontend zeigt die Rechtsgrundlage je Befund. (Controls tragen
regulation/article noch NULL -> hier mitgeliefert bis gepflegt.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 08:44:19 +02:00
Benjamin Admin 4c45f11e43 feat(cookie): Finding 'vague_duration' — unkonkrete Speicherdauer
Flaggt Laufzeit-Angaben ohne konkrete Dauer/Kriterium ('dauerhaft', 'bis zur
Loeschung', 'bis Nutzer deaktiviert', 'unbegrenzt' …) — Art. 5(1)(e) + Art. 13
DSGVO. Library-unabhaengig, gilt fuer ALLE Cookies (Coverage auf BMWs 780).
'13 Monate'/'Session'/'bis Widerruf, max. X' bleiben ok.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 08:33:06 +02:00
Benjamin Admin d18ef79f18 feat(cookie): Pro-Cookie-Library-Abgleich (2287er OCD + 35er rich) + Panel
- analyze_cookies gleicht Cookies gegen BEIDE Libraries ab: compliance.cookie_library
  (2287, OCD/CC0 — Kategorie/Retention) + 35er rich-DB (technical_necessity/reid/
  schrems/eu_alternative). 5 Befund-Typen: tracker_as_necessary, missing_purpose,
  excessive_lifetime (Art.5), third_country (Art.44), eu_alternative (kommerziell).
- Endpoint GET /snapshots/{id}/cookie-check (load_big_library batch + analyze).
- Frontend CookieLibraryPanel im Snapshot-Detail.
- Fix CookieResultView: Zweck nicht mehr auf 60 Zeichen gekuerzt; Rolle 'unknown'
  als Strich statt 'Unbekannt'.

Tests: 7 backend + frontend vitest gruen.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 08:18:25 +02:00
Benjamin Admin cefadf9e4c test(agent): CookieResultView KPI-Assertion entschaerfen (mehrdeutige '3')
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 01:04:37 +02:00
Benjamin Admin 410a814230 fix(agent): Cookie-View CONTROLLER -> Joint-Controller-Gruppe
recipient_type=CONTROLLER (Meta/LinkedIn/Criteo) gehoert zu Art. 26
(eigenverantwortliche Dritte / Joint Controller), nicht zu den eigenen
Verarbeitungen. BMW: 58 eigene / 16 AVV / 7 joint / 2 sonstige (= Mail-VVT).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 01:03:20 +02:00
Benjamin Admin 3332eb0bf9 feat(agent): Cookie-Result-View + Check-Historie aus Snapshots
Snapshot-getriebene Result-Views, entkoppelt vom Live-Check:
- CookieResultView: laedt cmp_vendors aus einem Snapshot (kein Re-Crawl),
  KPIs (Anbieter/Cookies/Marketing/Drittland) + Empfaenger-Gruppen
  (Eigene/AVV/Joint-Controller) + aufklappbare Vendor->Cookie-Tabelle.
- Historie (/sdk/agent/snapshots): alle gespeicherten Checks, jederzeit
  oeffnbar (DSB/Mitarbeiter) + Detail-Seite je Snapshot.
- Next.js-Proxys fuer GET /snapshots (Liste) + /snapshots/{id} (einzeln).

BMW-Snapshot 4603d15b: 83 Vendors / 780 Cookies. Library-Abgleich
(cookie_knowledge_db.lookup_cookie) folgt als Phase B.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 00:51:25 +02:00
Benjamin Admin a28db8f8f0 fix(admin): resolve all 266 TypeScript errors, enable strict build
Eliminate the pre-existing TS errors that were masked by
next.config.js `typescript.ignoreBuildErrors: true`, then turn the flag
OFF so the compiler is a real safety net for future changes. `next build`
and `tsc --noEmit` now pass with 0 errors.

The errors were not cosmetic — several exposed real latent bugs hidden by
the flag, e.g. the drafting-engine ConstraintEnforcer read non-existent
fields (`t.rule.dsfaRequired`, `d.required`, `r.title`), so its DSFA hard
gate and risk-flag checks were silently no-ops; scopeDefaults read
snake_case CompanyProfile fields that never matched the camelCase type
(generator defaults never populated). Both fixed by aligning code to the
current types.

Highlights:
- Vitest globals: add vitest-globals.d.ts (config already had globals:true)
  so the test files type-check; exclude Playwright specs from vitest.
- Add a minimal ambient `pg` module declaration (no @types/pg installed).
- Fix Next 15 route handlers to await Promise params.
- Reconcile drifted types across loeschfristen, compliance-scope, document-
  generator, drafting-engine, vendor-compliance, agent and more.

Pre-existing (NOT caused here, proven by stashing the diff): 3 vitest
logic tests still fail — getNextStep (2) and buildDocumentScope priority (1).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 00:42:44 +02:00
Benjamin Admin bb9aacc3d3 feat(agent): Abstellmaßnahmen + Ticket-Formulierung (Schritt 3)
RemediationPlan: aus den offenen Punkten (result.results, Haupt-Engine) je
Finding eine Massnahme + fertigen Ticket-Text ableiten, nach Prioritaet
sortiert, mit Kopieren + JSON-Export als Uebergabe. SCOPE: BreakPilot
formuliert nur — Ticketsystem/Jira/Feedback-Loop baut ein anderes Team.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 00:12:35 +02:00
Benjamin Admin 5da20af4fd feat(agent): Audit-Kopf + 4 KPI-Kacheln ueber den Ergebnis-Tabs
ResultSummary: Titel (Firma aus extracted_profile) + check_id + 4 Kacheln
(Dokumente, Konform, Offene Pflichtangaben, Zu pruefen), gerechnet aus
result.results. Co-Pilot-Ton: gruen/gelb/rot nur bei echten Werten.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-10 23:53:12 +02:00
Benjamin Admin 3f23a64d5f feat(agent): Impressum-Tab auf Haupt-Engine + Profil/§36-Fixes
Ergebnis-Tab rendert jetzt result.results (Haupt-Doc-Check) statt des
abweichenden v3-Agenten — BMW korrekt statt False Positives:
- DocResultView: ein Dokument als Pflichtangaben-Tabelle (Label + gefundener
  Text + 3-Tier-Status), KEINE MC-IDs. ComplianceResultTabs speist Tabs aus
  result.results; ChecklistView-Bausteine exportiert + wiederverwendet.
- profile_extractor: Firmenname/Rechtsform = fruehester Treffer + ausge-
  schriebene Formen (Aktiengesellschaft) -> BMW AG statt "juris GmbH".
- 36 VSBG (MC-010): reines b2c -> POSSIBLY_APPLICABLE (Pruef-Hinweis) statt
  MEDIUM-FAIL; hart nur bei ecommerce. possibly_hint pro MC.
- McCoverage traegt label + found (Snippet); mc_possibly-Aggregat.
- AgentFindingCard/Methodik: interne check_id/mc_id nicht mehr angezeigt.

Tests: test_four_status (16) + Frontend-Vitest gruen; CI-Suite 206, v3/GT
unveraendert. Nur eigene Dateien (geteilter Tree).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-10 23:44:01 +02:00
Benjamin Admin 97575cc9c0 feat(agent): 4-Status-Modell (NOT_APPLICABLE/INSUFFICIENT_EVIDENCE/POSSIBLY_APPLICABLE) für Impressum
Kanonisches Compliance-Datenmodell, Impressum-Agent als Referenz:
- CheckStatus-Enum + Finding.status GETRENNT von severity (Verdikt ≠ Risiko)
- Unbestimmte Rechtsform (weder Text noch Wizard) → INSUFFICIENT_EVIDENCE (INFO)
  statt hartem HIGH-FAIL; legal_form_dependent-Gate + detect_legal_form_present
- §18-MStV-Graubereich (Corporate-Blog via has_editorial_content) →
  POSSIBLY_APPLICABLE (LOW Prüf-Hinweis); 3-stufig via scope_disposition
- Recommendations nur aus echten FAILs; mc_insufficient/mc_possibly-Aggregate
- Frontend: Verdikt-Pill + Coverage-Vokabular
- 19 neue Tests (test_four_status.py, AgentFindingCard); CI-Suite 204 grün,
  v3 25 / GT 13 unverändert

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-10 22:38:11 +02:00
Benjamin Admin 65de90114a feat(agent): SSE — progressive Themen-Tabs (Phase 2)
Der Compliance-Check streamt jetzt progressive Events; der Impressum-Tab
erscheint, sobald das Thema fertig ist, statt am Ende alles auf einmal.
Additiv — das Polling fürs finale Ergebnis bleibt.

- backend: _sse.py (Queue/emit/event_generator) + Endpoint
  /compliance-check/{id}/stream; _update emittiert progress,
  run_agent_outputs emittiert topic (laeuft jetzt frueh nach Phase B),
  Orchestrator emittiert complete/error.
- frontend: SSE-Proxy-Route + EventSource in ComplianceCheckTab merged
  topic-Events in agent_outputs -> Tab erscheint progressiv.
- Tests: backend 5 passed (SSE + agent_outputs); tsc 0 neue Fehler,
  vitest 2 passed, check-loc 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-10 19:07:26 +02:00
Benjamin Admin e21984e0ad feat(agent): strukturierte Ergebnis-Tabs — Impressum (Phase 1)
Der Compliance-Check legt zusätzlich einen strukturierten v3-AgentOutput
pro Thema in result.agent_outputs ab (additiv; B18-HTML + Firehose-Mail
bleiben unangetastet). Frontend: standardisiertes Ergebnis-Tab statt
Firehose — Impressum-Tab (AgentResultTab) + "Alle Checks (roh)" (ChecklistView).

- backend: _agent_outputs.py ruft den registrierten v3-ImpressumAgent,
  gewired in _orchestrator nach B18, surfaced via _phase_f_persist.
- frontend: AgentResultView (aus AgentSlotCard extrahiert, DRY),
  AgentResultTab, ComplianceResultTabs; ComplianceCheckTab 490->391 Zeilen.
- Tests: backend 2 passed, frontend 2 passed; tsc 0 neue Fehler; check-loc 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-10 18:32:06 +02:00
Benjamin Admin 3ec6393919 docs(agents): korrigierte Zahlen — 13.588 Master-Controls (dedup) statt 314k
CI / nodejs-build (push) Successful in 2m20s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 11s
CI / loc-budget (push) Successful in 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
User-Klarstellung 2026-06-09:
  - 314.811 Atomic-Controls (compliance.canonical_controls)
  - 13.588 Master-Controls nach RAG-Dedup (compliance.master_controls)
  - ~1.778 Master-Controls fuer dieses Compliance-Tool selektiert
    (vermutlich phases_covered = ['implementation', 'testing'])
  - Frontend: https://macmini:3007/sdk/master-controls und
    https://macmini:3007/sdk/control-library

Methodik-Box im Agent-Test-Tab aktualisiert mit korrekten Zahlen
+ Roadmap-Hinweis: Sprint 1.12 wird interne Pattern-IDs formal
mit Master-Controls verknuepfen.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-09 08:34:23 +02:00
Benjamin Admin 18e4f98201 fix(agents): klarere Naming + korrektes LLM-Default-Modell
CI / detect-changes (push) Successful in 6s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / nodejs-build (push) Successful in 2m20s
CI / test-go (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 11s
CI / loc-budget (push) Successful in 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 30s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
User-Korrektur 2026-06-09:

(1) Begriff 'MC' steht im Projekt fuer Master-Control aus
canonical_controls (314k Eintraege, ~1.800 fuer dieses Tool). Mein
neuer Agent-Code hatte 'MC' als Abkuerzung fuer 'Machine-Check'
verwendet — Naming-Konflikt. Frontend-Methodik-Box jetzt:
  - 'Pattern-Check' statt 'Machine-Check'
  - Explizit: 'Diese Pattern-IDs (IMP-MC-001) sind interne Test-IDs,
    NICHT die Master-Control-IDs aus der canonical_controls-DB'
  - Roadmap-Hinweis: formale Verknuepfung Pattern→Master-Control folgt

Backend-Variablen mc_id bleiben technisch unveraendert (Refactor
waere gross), aber UI darf sie nicht als 'Master-Control' bezeichnen.

(2) LLM-Modell-Default war 'qwen2.5:7b' — Projekt nutzt aber das
groessere 'qwen3.5:35b-a3b' auf macmini (ENV SELF_HOSTED_LLM_MODEL).
_escalation.py default jetzt: SELF_HOSTED_LLM_MODEL als Fallback,
und Methodik-Erklaerung nennt das richtige Modell.

(3) Methodik-Erklaerung erweitert um Sprint-1.10 Semantic-Validator
und Sprint-1.11 Auto-Learning-Pattern-Library + Cross-Placement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-09 08:29:00 +02:00
Benjamin Admin 3ef8c9b247 feat(agents): Frontend Methodik-First Layout
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 11s
CI / loc-budget (push) Successful in 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m24s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
User-Vorgabe: pro Slot transparent zeigen WAS wir tun:
  1. Was wurde geprueft (MC-Coverage, collapsible)
  2. Speedometer mit Severity-Verteilung
  3. LLM-Eskalation-Log (wenn benutzt)
  4. Findings sortiert HIGH->LOW, je Card:
     - Methodik-Badge (MC / Regex / KB / LLM / Cross)
     - Gesetzliche Basis (Norm-Block, violett)
     - Befund (Zitat-Block, amber)
     - Empfehlung -> 'Pflicht-Massnahme' bei HIGH,
       'Best-Practice' bei MEDIUM/LOW, 'LLM-Vorschlag'
       bei LLM-Quelle
  5. Maszahmen-Plan (gerollupte Recommendations mit
     related_finding_ids + Aufwand)

Refactor: ein File AgentTestTab.tsx (519 LOC) -> 7 Files:
  _agentTypes.ts (Types + Methodik-Konstanten)
  AgentSpeedometer.tsx
  AgentMcCoverage.tsx
  AgentFindingCard.tsx
  AgentRecommendationCard.tsx
  AgentSlotCard.tsx
  AgentTestTab.tsx (Top-Level, schlank)

Plus Methodik-Info-Erklaerung am Tab-Anfang + Disclaimer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-09 07:53:24 +02:00
Benjamin Admin 702e7a6333 fix(impressum): Pattern fasst Geschäftsführung/Vorstand/Inhaber
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 11s
CI / loc-budget (push) Successful in 13s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m21s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 29s
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Safetykon-Bug: 'Geschäftsführung:' (Sammelbegriff für GF einer GmbH)
matched das alte Pattern 'Geschäftsführer' nicht — False-Positive
IMPRESSUM-AGENT-VERTRETUNGSBERECHTIGTE_LABEL_KORREKT.
Pattern erweitert: Geschäftsführer|Geschäftsführung|Geschäftsführerin
+ Vorstand|Vorstandsvorsitzender + Inhaber|persönlich haftend.
Test test_safetykon_geschaeftsfuehrung_passes ergänzt (11/11 grün).

frontend: SlotCard zeigt jetzt Badge bei 0/0/0-Slots
('Dokument konnte nicht geladen werden') statt silent-fail, +
bei 0 Findings ein 'alle MCs OK'-Badge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-08 18:24:01 +02:00
Benjamin Admin 3ae4e60c9d feat(agents): SSE-Endpoint + Agent-Test-Tab (5-URL parallel)
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 12s
CI / loc-budget (push) Successful in 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m24s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 29s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Backend:
- specialist_agent_routes.py: GET /agents, POST /test/start (run_id),
  GET /test/stream/{run_id} (SSE), GET /run/{run_id}/result,
  GET /run/{run_id}/artifacts, GET /run/{run_id}/artifact/{path},
  DELETE /run/{run_id}, GET /runs.
- Per-URL async orchestrator: text fetch via consent-tester
  dsi-discovery → agent.evaluate() → vault.put_json + stream events.
- Tests: 7/7 grün.

Frontend:
- /api/sdk/v1/specialist-agent proxy mit SSE-passthrough.
- AgentTestTab.tsx: Agent-Wähler + 5 URL-Slots + Live-Events +
  Speedometer (OK/N-A/HIGH/MEDIUM/LOW) + Findings + Recommendations +
  Eskalations-Log + Artefakt-Link pro Slot.
- Neuer Tab "Agent-Test" in /sdk/agent.

User-Wunsch 2026-06-08: pro Agent isoliert testen, 5 URLs gleichzeitig,
Live-Updates statt Polling-Wartespiel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-08 17:47:05 +02:00
Benjamin Admin ec03317170 feat(frontend): Firmenname + Domain Input + useCompanyOrigin hook
CI / nodejs-build (push) Successful in 2m20s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 10s
CI / loc-budget (push) Successful in 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
ComplianceCheckTab.tsx bekommt zwei neue UI-Felder oberhalb des
PreScanWizard:
  - Firma  → z.B. 'Tesla Germany GmbH'
  - Domain (Site-Origin) → z.B. 'https://www.tesla.com/de_de'

Beide werden:
  - in localStorage persistiert (Hook _useCompanyOrigin.ts)
  - im POST-Body als company_name + origin_domain mitgeschickt
  - haben Vorrang vor LLM-extracted_profile (Backend nutzt
    eingegebene Werte falls vorhanden, fallback auf Inferenz)

Datei jetzt 489 LOC (war vorher 461 + 28 für die Inputs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-08 13:01:44 +02:00
Benjamin Admin 5aaf7ac613 refactor(complianceCheckTab): split — DOCUMENT_TYPES + Storage + Polling out
CI / detect-changes (push) Successful in 7s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 10s
CI / loc-budget (push) Successful in 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m21s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
ComplianceCheckTab.tsx war 519 LOC und blockte jeden weiteren Edit
(500-LOC-Hard-Cap). Drei Concerns ausgelagert:

  - _document_types.ts: DOCUMENT_TYPES + DocTypeId (inkl. news doc_type)
  - _compliance_storage.ts: STORAGE_KEY_*, DocState/HistoryEntry types,
    emptyDocState/initState helpers, countWords
  - _useCompliancePolling.ts: Resume-Polling-Hook (importierbar,
    Inline-Polling bleibt für Stabilität)

ComplianceCheckTab.tsx ist jetzt 461 LOC (-58).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-08 12:18:30 +02:00
Benjamin Admin b4ce3528e5 feat(impressum-agent): Tesla-Pattern + KBA-Hint + News-Doc-Type
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 11s
CI / loc-budget (push) Successful in 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m20s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 30s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / detect-changes (push) Successful in 6s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
User-Feedback Tesla-Impressum: 10 FAIL bei 46 Worten — viele False-
Positives. Nach Tuning: 5 juristisch saubere Findings.

Impressum-Agent Patterns:
  - name_anbieter zusätzlich label-frei matchen (Firma+Rechtsform+
    Anschrift, Tesla schreibt ohne "Anbieter:" Label).
  - vertretungsberechtigte akzeptiert jetzt "Management" / "Director"
    als alternative (US-Konzern-Habit), aber emittiert separates
    Sub-Finding "Label sollte Geschäftsführer für § 5 TMG sein".
  - aufsichtsbehoerde-Pattern um KBA / Bundesnetzagentur erweitert.
  - NEU: verantwortlicher_redaktion (§ 18 MStV bei Blog/News).
  - NEU: verbraucher_streitbeilegung (§ 36 VSBG bei B2C).
  - Auto-Detection von Automotive-Branche: explizite Begriffe ODER
    bekannte Hersteller-Namen (Tesla/BMW/Mercedes/Audi/VW/Porsche…).
    Triggert KBA-Hint im aufsichtsbehoerde-Finding-Action.

Frontend (_document_types.ts):
  - Extrahiert aus ComplianceCheckTab.tsx (vorher inline).
  - NEU: doc_type "news" für Blog/Newsroom-URL → § 18 MStV-Pflicht-
    angaben prüfen. User-Hinweis: tesla.com/de_de/blog ist
    relevanter Audit-Input neben DSE/Impressum.

Smoke gegen Tesla-Impressum (46 Worte):
  Vorher 10 Findings (5 davon FP).
  Jetzt 5 Findings — alle juristisch korrekt:
    [MED] Management statt Geschäftsführer
    [LOW] KBA als Aufsichtsbehörde fehlt
    [MED] § 18 MStV-Verantwortlicher fehlt (Tesla Blog!)
    [MED] § 36 VSBG-Hinweis fehlt
    [MED] ODR-Plattform-Link fehlt

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-08 12:07:08 +02:00
Benjamin Admin dfadff5b02 feat(agent): PreScanWizard im ComplianceCheckTab (P79 sichtbar)
Wizard war bisher nur im DocCheckTab eingebaut, der aber nirgends im UI
gemountet ist. Daher: alle Compliance-Checks schickten scan_context=null,
P72 Branchen-Filter wirkte nie.

Fix: PreScanWizard ins ComplianceCheckTab über die Document-Rows
gestellt. Submit-Button disabled bis alle 8 Felder (Branche, B2B/B2C,
Direkt-Vertrieb, Rechtsform, Konzern, MA, Besondere Daten, Drittland)
gesetzt sind. scan_context wird im POST body mitgesendet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:21:11 +02:00
Benjamin Admin 6263462ba3 feat(frontend): Tab-Layout für Audit-Ergebnisse + cookie_audit in API
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / iace-gt-coverage (push) Successful in 28s
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m40s
CI / test-go (push) Failing after 45s
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
ResultsTabsView.tsx — neue Komponente mit 7 Tabs:
  1. Übersicht (KPIs: Docs, Findings, Vendors, Score)
  2. Cookies & VVT (3-Quellen-Compliance-Vergleich +
     undokumentiert/compliant/nicht-geladen + deduplizierte Vendor-Tabelle)
  3. Datenschutzerklärung (DSE-Findings via ChecklistView)
  4. Impressum
  5. AGB / Widerruf (zwei Sections in einem Tab)
  6. Cookie-Banner (Verstoesse + Phasen-KPIs)
  7. Mail-Vorschau (PDF-Download-Link)

Sticky Tab-Header oben, Content scrollt darunter. Lange Scroll-Mail
ist damit verschwunden.

DocCheckTab nutzt ResultsTabsView statt der alten Inline-ChecklistView.

Backend liefert jetzt cookie_audit-dict in der Response (zusaetzlich
zu cmp_vendors + banner_result) damit das Cookie-Tab die 3 Listen
(undokumentiert / compliant / nicht-geladen) rendern kann.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 23:44:36 +02:00
Benjamin Admin e411c4f0d3 feat(audit): Text-Paste-Mode pro Row — Crawler optional umgehen
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Failing after 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m27s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 47s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Hintergrund: VW liefert ueber URL-Crawler nur 6 Vendors statt der 100+
die in der echten Cookie-Tabelle stehen. Wenn der User die Tabelle aber
direkt von der Site kopieren kann (was bei den meisten OEM-Sites moeglich
ist), umgehen wir den Crawler komplett und parsen den Text deterministisch.

Backend:
* doc_type_classifier.py — 7 Pattern-Gruppen (§5 TMG, Art.13 DSGVO,
  AGB-Klauseln, Widerrufs-Frist, Cookie-Tabellen-Header, etc). Wenn der
  User Text ins falsche Doc-Type-Feld kopiert (Impressum->DSE),
  detect_mismatch liefert detected + action ('reclassify' bei sehr hoher
  Konfidenz, 'warn' bei medium).
* cookies_table_parser.py — Tab/Pipe/Komma/Semicolon-Separator-Auto-
  Detection, Spalten-Mapping per Header-Keyword. Aggregiert Cookie-
  Eintraege zu Vendor-Records (mit _guess_vendor-Fallback). Voll
  deterministisch, kein LLM.
* doc_input_warnings.py — Mail-Block ueber dem Audit, der Mismatches +
  Auto-Reclassifies dem User transparent macht.
* Pipeline: text gewinnt ueber url (war schon im Schema vermerkt), neue
  Felder declared_doc_type / input_source / reclassify_hint in doc_entries.
  Pasted-Tabellen-Vendors haben Vorrang vor Library-Fallback + LLM-Cascade
  (sind 100% genau).

Frontend (DocCheckTab):
* Pro Row Mode-Toggle 'URL' / 'Text einfuegen' (lila wenn aktiv).
* Textarea (h-32, monospace) im text-mode mit kontext-spezifischem
  Placeholder (Cookie-Hinweis ggue. anderen Doc-Types) und Live-
  Zeichen-/Wort-Counter.
* Submit-Button accepted entries mit URL ODER text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 18:58:32 +02:00
Benjamin Admin c491af5d02 feat(audit): P47 localStorage-Quota — safeSetItem mit Auto-Prune
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 13s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m47s
storageHelpers.ts: safeSetItem faengt QuotaExceededError, prunet
alte doc-check-result-*-Eintraege (oldest first, MAX_KEEP=10) und
retried. Bei zweitem Fail aggressiver pruefen.

DocCheckTab.tsx nutzt safeSetItem statt setItem fuer doc-check-results,
result-Keys und history. Verhindert silent-data-loss + Crash wenn
~5MB localStorage-Limit erreicht.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:47:42 +02:00
Benjamin Admin 50fc0ecc59 feat(audit): P79 Pre-Scan-Wizard (8 Pflichtfelder) + P99 erweitert + P102 Replay-Fix
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / nodejs-lint (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m56s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P79: PreScanWizard.tsx mit 8 Pflichtfeldern (Branche, B2B/B2C,
Direkt-Vertrieb, Rechtsform, Konzern-Struktur, MA-Zahl, Besondere
Daten, Drittland). Scan-Button disabled bis alle 8 ausgefuellt. Werte
landen in scan_context und ueber Backend in compliance_check_snapshots.

P99: DOC_TYPES um dsa + legal_notice + lizenzhinweise + nutzungsbedingungen
erweitert. URL-hinzufuegen-Button war schon da.

P102 (Replay-Bug): check_replay.py liest jetzt e.get('text') statt
nur full_text — Snapshot-Schema verwendet 'text'. Library-Mismatch-
Block wird damit auch im Replay angezeigt.

Backend: ComplianceCheckRequest.scan_context optional; save_snapshot
persistiert ihn in compliance_check_snapshots.scan_context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 15:59:01 +02:00
Benjamin Admin 6f16507c5f feat(banner): P19 + P20 — Per-Category-Click-Test + Frontend-Drilldown
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m54s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P19 (consent-tester):
- dp-cookieconsent (TYPO3, Safetykon-Pattern) als CMP-Profil hinzu —
  Selektoren #dp--cookie-statistics/marketing + a.cc-allow Save-Button
- Neues Signal provider_details_visible: nach Kategorie-Toggle prueft
  Playwright ob im Banner sichtbare Provider-/Cookie-Detail-Elemente
  erscheinen. Bei dp-cookieconsent (Banner ohne Listing) immer False
  -> HIGH-Violation "Kategorie zeigt keine Provider-/Cookie-Details —
  Nutzer kann nicht informiert einwilligen (Art. 7 Abs. 1 DSGVO)"
- main.py serialisiert provider_details_visible + cookies_set pro Kategorie

P20 (Frontend-Drilldown):
- Backend: check_payloads-Tabelle um Spalte 'banner' (JSON) — voller
  banner_result persistiert (vorher nur in-memory). ALTER TABLE
  Migration idempotent.
- Neuer Endpoint GET /api/compliance/agent/banner/<check_id> — liefert
  Quality-Score, Phases, Category-Tests, Banner-Checks, alle 46
  structured_checks.
- Frontend: BannerTab im /sdk/agent/audit/<id> mit Quality-Cards,
  3-Phasen-Cookie-Tabelle, Per-Category-Listing (mit P19-Signal
  rot/gruen), Banner-Verstoesse + Rechtsgrundlagen, 46-Check-Drilldown
  filterbar nach Severity.
- Tab-Switcher in page.tsx um "Cookie-Banner-Analyse" erweitert.
- Bonus: 2 alte route.ts auf Next.js 15 Promise-params umgestellt
  (Build-Fix).

Plus: Critical-Findings-Block nutzt provider_details_visible als
primaeres Signal statt nur tracking_services-Anzahl.

Smoke-Test Safetykon: 4 Critical Findings im Mail, banner-Endpoint
liefert 46 checks + 3 phases + 2 categories mit provider_details_visible=False.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 14:31:13 +02:00
Benjamin Admin 479ce2225b feat(profile): P14+P15+P16 — B2B-Heuristik + Doc-URL-Dedup + Homepage-Profile
P14 — _detect_no_direct_sales erweitert um 3 Cluster:
  A) OEM-Konfigurator (BMW/Audi/Mercedes/VW/Porsche-Markennamen + Vertragshaendler-Pattern)
  B) B2B-Dienstleister (CE-Zertifizierung, Compliance-Beratung, Schulungen, Auditierung, TISAX, ISO-Normen, Arbeitssicherheit, ...)
  C) NGO/Verein/Public (Spendenkonto, Vereinsregister, gemeinnuetzig, ...)
Schwelle: pos >= 2 pro Cluster UND pos > neg. Bisher: nur OEM.

P15 — Doc-URL-Dedup im Worker: wenn mehrere Doc-Types DASSELBE Dokument
referenzieren (Safetykon-Pattern: User gibt /datenschutz fuer dse, cookie
UND widerruf), wird nur dem primaeren Doc-Type (Priority: dse > impressum
> cookie > widerruf > agb > nutzungsbedingungen) der Text gegeben. Andere
landen als "Nicht separat vorhanden — wird im Dokument 'X' mit-geprueft."
Eliminiert die 8+8 systematischen widerruf/cookie False Positives.

P16 — Profile-Detection auch Homepage-Text: Homepage-HTML wird mit kurzem
Fetch (8s timeout) gezogen, getrippt und zum profile_input gemerged. Vor-
her wirkte P14 nur wenn B2B-Indikatoren im DSE/Impressum-Pflichttext
standen — bei Safetykon stehen sie nur im Homepage-Menue.

Plus Bonus: TDM-Override-Submit-Button wird deaktiviert wenn Reason < 10
Zeichen — verhindert dass User wie heute in den Bug rein klickt.

Smoke-Test Safetykon (B2B Compliance-Dienstleister):
  dse                  geprueft (kein err)
  impressum            geprueft (kein err)
  cookie               "Nicht separat vorhanden — wird in DSE mit-geprueft"
  agb                  "Nicht anwendbar — kein Direkt-Kaufvertrag"
  widerruf             "Nicht anwendbar — kein Direkt-Kaufvertrag"
  nutzungsbedingungen  "Nicht anwendbar — kein Direkt-Kaufvertrag"
Vorher: 16 False Positives. Jetzt: 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 11:46:58 +02:00
Benjamin Admin 78b27d4684 feat(compliance-check): P12 — TDM-Override mit dokumentierter Kunden-Erlaubnis
CI / guardrail-integrity (push) Has been skipped
CI / nodejs-build (push) Successful in 3m5s
CI / test-go (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
Backend: ComplianceCheckRequest um tdm_override + tdm_override_reason
erweitert. Worker im _run_compliance_check Pfad: bei
tdm_override=True UND Reason >= 10 Zeichen wird der TDM-Vorbehalt
nur dokumentiert (job.tdm_override.{reason, original_status}) und
NICHT als Abbruch-Grund gewertet. Ohne Reason: Override ignoriert.
Audit-Spur via logger.warning(reason).

Frontend: ComplianceCheckTab um Checkbox + Pflicht-Reason-Feld
("Schriftliche Crawl-Erlaubnis vorhanden") direkt vor dem Submit-
Button. Pflicht: Reason >= 10 Zeichen. Submit sendet die Flags ans
Backend.

Anwendungsfall: Safetykon-Pattern — robots.txt + ai.txt setzen
Vorbehalt, aber Kunde hat schriftlich zugestimmt (Auftrags-Audit).

[guardrail-change] ComplianceCheckTab.tsx (511 LOC) in loc-exceptions
ergaenzt — Split nach _components/TDMOverride + CompliancePolling
ist P11-Tech-Debt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:56:50 +02:00
Benjamin Admin 575644c9c5 feat(audit): P8 — MC-Severity raus, Email nur harte Findings, MC-Audit als Checkliste
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m48s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Email-Hardening (mc_scorecard.top_fails):
  Neue _is_hard_finding-Heuristik filtert konditionale MCs ohne
  Negativ-Beleg aus den Top-Auffaelligkeiten. matched_text leer + Label
  enthaelt "falls/sofern/wenn/soweit/ggf." -> raus, landet nur noch im
  MC-Audit als "selbst pruefen". DATA-2066-A05 (kostenfreie Abschaltung
  Standortdaten) ist das prototypische Beispiel.

MC-Audit-Frontend (audit/[checkId]/page.tsx):
  Severity-Spalte (CRITICAL/HIGH/MEDIUM/LOW) entfernt — der MC-Audit
  ist eine Checkliste, keine Severity-Drohung. Stattdessen:
   - Spalte "Prioritaet" mit 3-Tier aus regulation-Mapping:
     Gesetz (DSGVO/ePrivacy/TDDDG/...) / Behoerden-Leitlinie
     (EDPB/DSK/EuGH/...) / Best-Practice (ISO/NIST/BSI)
   - 3-Status: erfuellt (✓) / nicht erfuellt (✗) / selbst pruefen (?)
     / nicht anwendbar (—). rowReviewStatus() leitet "selbst pruefen"
     aus matched_text-leer + konditionalem Label ab.
   - Filter umgebaut auf 5 Stati statt 4
   - Default-Filter "Nicht erfuellt" (vorher "Nur Fail")

Bonus: f.payload.risk_label TS-Cast im FindingsTab clean gemacht
(unknown -> string).

Effekt:
  - Email an die GF zeigt nur noch echte Belege ("DSB fehlt",
    "Gebuehr fuer Widerruf")
  - MC-Audit ist eine sachliche Pruefliste fuer den Compliance-Officer

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:30:04 +02:00
Benjamin Admin 6c223c7c9b feat(compliance-check): exec-summary + voll-audit + TDM-respect + cookie-KB-extended + saving-scan-funnel
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m43s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P1 — Exec-Summary oben im Email-Report (4 KPIs + 2 CTAs, dunkler Gradient)
P3 — no_direct_sales-Flag fuer OEM-Konfigurator-Sites; AGB/Widerruf/AGB als
     "NICHT ANWENDBAR" (grau) statt "NICHT GEFUNDEN" (rot)
P5 — Voll-Audit Unification: alle Findings (MC + Pflichtangaben + Vendor +
     Redundanz) in /data/compliance_audits.db.unified_findings; neuer
     /api/compliance/agent/findings/<id> Endpoint + FindingsTab im Audit-UI
     mit Filter + CSV-Export
P7 — Crawl-Hardening: TDM-Reservation-Check (robots.txt / ai.txt / Header /
     Meta) vor jedem Run mit 24h-Cache; HeadlessChrome-UA (Firma noch nicht
     gegruendet — Switch via BREAKPILOT_BRANDED_UA env); per-Domain
     Rate-Limit 1 req/s + max 2 concurrent
P2 — Cookie-Knowledge-DB additiv erweitert (35 -> 74 Cookies): Adobe, Meta,
     Microsoft, LinkedIn, TikTok, HubSpot, Marketo, Salesforce, Hotjar,
     FullStory, Mouseflow, Intercom, Drift, Zendesk, Cloudflare, Stripe,
     OneTrust/Cookiebot/Usercentrics, Matomo, Pinterest, Snapchat, X/Twitter,
     YouTube, Vimeo, Klaviyo, Mailchimp, Mixpanel, Segment, Amplitude,
     Optimizely, Datadog; Wire-in in cookie_function_classifier liefert
     compliance_risk-Label (kritisch/hoch/mittel/gering) pro Vendor
A  — k-Anonymitaets-Helper (benchmark_k_anonymity) fuer P6-Vorbereitung
B  — Cross-Tenant-Domain-Assertion im /findings-Endpoint (expected_domain
     Query-Param -> 403 bei Mismatch)
C  — Saving-Scan-Funnel: /api/compliance/agent/saving-scan/start mit
     Validierung + 24h-Rate-Limit pro Domain + Lead-Persistenz in
     saving_scan_leads + Auto-Discovery via _run_compliance_check; 6 Tests
D  — Risk-Badge im Email-Vendor-Row

Rechtliche Leitplanken (Memory feedback_oem_data_legal.md): nur eigene
Knapp-Bewertungen + Source-Pointer, keine 1:1-Kopien fremder CMP-Texte.
TDM-Opt-Out-Respect nach § 44b UrhG. KEINE Schema-Aenderungen — alles in
Sidecar-SQLite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:48:34 +02:00
Benjamin Admin df7d83134b feat(agent): migrate compliance-check results to banner + documents (M1-M5)
After a compliance-check run finishes, the user can now apply the
extracted vendor inventory directly to their own:

  - CookieBanner config (admin /sdk/einwilligungen)
  - Cookie-Policy / VVT-Register / Privacy-Policy templates
    (admin /sdk/document-generator)

Backend:
  - migration_to_banner.py: vendor list -> CookieBannerConfig with
    ESSENTIAL/PERFORMANCE/PERSONALIZATION/EXTERNAL_MEDIA buckets +
    review flags (broken opt-out URLs, missing expiry, no cookies listed)
  - migration_to_document.py: vendor list -> pre-fills for 3 doc
    templates, recipient-type aware (INTERNAL/GROUP/PROCESSOR/CONTROLLER)
  - agent_migration_routes.py: GET /banner-preview, /document-preview,
    /summary keyed on check_id
  - compliance_audit_log: new check_payloads table persists cmp_vendors +
    extracted_profile so the preview survives an app restart
  - tests: 9 mapper units + 4 endpoint integration tests

Frontend:
  - MigrationPanel.tsx: modal showing banner-config diff + document
    pre-fills, plus links into the existing editors
  - ComplianceCheckTab.tsx: replaces standalone audit link with the
    panel; net -3 lines, stays at the 500-cap

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 14:06:28 +02:00