breakpilot-compliance

Author	SHA1	Message	Date
Benjamin Admin	7aabfbe5b5	feat(controls): tenant suppression: per-tenant applicability override Shared layer for all surfaces (workspace lawyers, cyber-risk project, admin): a tenant marks a control as "not applicable" → it is hidden in that tenant's use-case views (and, in future, repo scans). - Migration 156: compliance.control_suppressions (PK tenant_id+control_uuid), reversible (active + reverted_*), auditable (actor/reason/created_at). [migration-approved] - Service control_suppression: suppress/revert/list_suppressions + suppressed_control_uuids (shared filter). - Routes: GET/POST /v1/controls/suppressions + POST .../{uuid}/revert (X-Tenant-ID). - controls_for_use_case: optional X-Tenant-ID + include_suppressed; suppressed controls hidden by default (never deleted), suppressed_count, suppressed flag per control. Agents/CRA without a tenant are unaffected. - Tests: request validation + import safety (E2E cycle verified against macmini). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-16 16:35:38 +02:00
Benjamin Admin	9e9d780902	feat(cra): management progress view (ticket status readback) Reads each finding's lifecycle (status + tracker_issue_url) back from the scanner and rolls it up into a management picture: % done, 4 phases (open/in progress/done/excluded), open residual risk by severity, progress per CRA requirement, and a task/ticket table with Jira link. New endpoint GET/POST /api/v1/cra/progress (thin → service cra_progress, purely deterministic, no /assess schema drift). Frontend: ProgressView in level 1 (CRACyberView), live per scanner repo, otherwise demo status. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 10:10:45 +02:00
Benjamin Admin	7a4f086151	feat(cra): measure provenance + license class per standard source Every standard reference of a measure is license-classified (eu_law / public_domain / open / paid_reference); paid-reference standards are kept as a reference only, their text is never stored (idea/expression). Curated measures carry tier 'core', AI/fallback measures 'review' (indicative). Frontend shows source badges + an "indicative" label. Methodology in docs-src/development/mapping-methodology.md (scenario C, due diligence). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 10:10:20 +02:00
Benjamin Admin	6c619ecc42	feat(cra): curated measures library: all 40 CRA requirements covered - data/measures_curated.json: 24 deduplicated, standards-backed measures (9 existing M540-548 + 15 new M600-614), full text + norm_refs + multi-reg covers. Covers all 40 CRA-AI-x (previously only 17). - cra_annex_i_data loads the library defensively: MEASURES=superset, MEASURE_DETAILS (full text), mapped_measures derived from covers. Fallback = the 9 hard-coded measures. - Mapper: open_measures now carry name+description+norm_refs (real full texts). - useCRA: merge uses backend full texts instead of the demo lookup. - Tests: coverage (40/40) + full text in the assessment. Source: hand-curated/researched externally, deduplicated + mapped here. Machinery Regulation/NIS2/IEC measures follow once their spine exists. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 07:44:13 +02:00
Benjamin Admin	4c206aa332	feat(cra): persist scanner-repo→IACE-project mapping (pull flow) [migration-approved] Replaces the ephemeral dropdown selection with DB persistence per IACE project: - Migration 156: compliance_cra_scanner_repo_map (tenant_id, iace_project_id PK, scanner_repo_id). Additive + idempotent. - GET/PUT /v1/cra/scanner-repo-map/{iace_project_id} (upsert/clear). - useCRA loads the stored repo on load + persists on selection. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 07:05:33 +02:00
Benjamin Admin	0a6e57ac02	feat(use-case-controls): addressee axis: out-of-scope advisory + additive GOV tag Two-pass Haiku classification (conservative + re-confirmation of every non-unternehmen classification) of the review-tier atoms: who has to fulfil the obligation? - Migration 155: atom_classification.addressee (unternehmen/oeffentliche_stelle/aufsichtsbefugnis/staat_eu/dritter/meta), additive, no CHECK. [migration-approved] - Service: addressee + applicable + is_gov per control; include_out_of_scope param (default false -> out-of-scope advisory hidden, NEVER deleted); out_of_scope_count. Pure helpers addressee_applicable/addressee_is_gov (+ tests). - Route: optional include_out_of_scope query (contract-safe, additive). - Frontend: GOV chip (additive) + "kein Kunden-Pruefaspekt" (not a customer audit aspect) chip + one-click toggle to show the out-of-scope atoms. Data: 40,859 addressee tags loaded on macmini (81% applicable, 19% advisory, 3,146 GOV). Conservative: NULL/unclear = applicable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-16 06:58:37 +02:00
Benjamin Admin	90def4d857	feat(cra): flow-2 UI: select scanner repo → real assessment - GET /v1/cra/scanner-repos: distinct repo_ids (+counts) from the scanner MCP for the picker. - useCRA: scannerRepo state; on selection POST /assess-from-scanner (real findings), otherwise by-iace/demo as before. - ScannerRepoPicker in the CRA/Cyber tab; empty selection = demo, repo selected = real findings. Mapping repo_id↔project is currently UI-side (ephemeral); DB persistence per project follows. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 05:49:15 +02:00
Benjamin Admin	926dc02a09	feat(use-case-controls): relevant as a tier instead of a hard filter + provenance The hard relevant=true filter hid ~25% of the corpus (40,926 atoms), ~70% of which are real obligations (500-item validation). relevant becomes a tier: - Service: tier param (core=default, protects agent/CRA; all=everything incl. review), ORDER BY relevant DESC; per control relevant/tier/source_type (own_library when license_rule=3, otherwise derived) + source_regulation/article; core_count/review_count. Pure helpers tier_label + source_type (+ tests). - Route: optional tier query (default core), contract-safe (additive). - Frontend: coverage drill-down /sdk/coverage/[useCase]: core obligations vs. "zur fachlichen Pruefung" (for expert review), each with a provenance badge; the overview shows the delta. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-15 20:58:25 +02:00
Benjamin Admin	e140477c0b	feat(cra): pull flow: pull findings from the scanner MCP + assess them (2) We act as MCP client to the compliance-scanner-agent: - scanner_mcp_client.fetch_findings(): streamablehttp_client + ClientSession → list_findings, parses JSON text into finding dicts. Config via SCANNER_MCP_URL/SCANNER_MCP_TOKEN (unset = empty → UI keeps the demo). Transport is lazily imported. - POST /v1/cra/assess-from-scanner: raw scanner dicts → tolerant mapper (keeps scan_type/cvss_score/file_path) → assess + breadth. - Tests: parse_findings_text + no-config path. Live wiring of the UI follows once your endpoint+token are in place (then just set the env + point useCRA at /assess-from-scanner). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-15 19:05:44 +02:00
Benjamin Admin	731076835d	fix(cra): name conformity-path tiles correctly + gating per CRA Art. 32 (a) Labels: modules assigned correctly: Module A = self-assessment, Module B+C = notified body, EUCC = separate certificate (not Module H), "harmonised standard" is not a module but a presumption of conformity. NO harmonised standard has been published for the CRA yet → tile shown as "noch nicht verfügbar" (not yet available; expected ~2027), not selectable, with a note. (page/path/documents labels.) (b) Gating: important class II + critical products must NOT self-assess; a harmonised standard alone is not sufficient there → ALLOWED_PATHS IMPORTANT_II/CRITICAL = {eucc, notified_body}; DEFAULT_FOR II = notified_body. _PATH_HINT adjusted accordingly. Regression test test_cra_conformity_paths.py. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-15 13:49:00 +02:00
Benjamin Admin	00f304fed9	feat(controls): 5 new use cases + machinery fix + corpus/license overview - Registry: arbeitsrecht, gesellschaftsrecht, insolvenzrecht, csrd, bafin_it + mapper rules for previously unmapped source laws, Machinery Guide 2006/42 -> maschinen. Now 43 use cases (axis 1 / license 1+2 complete). - corpus_overview service + GET /v1/controls/corpus: source documents with license tier + atom count + use case + curated license catalog. - list_use_cases carries atom_classification counts (atom_total/atom_relevant). - Frontend /sdk/coverage: use-case overview + corpus documents + license catalog. - Tests: registry mappings (new domains), corpus tier labels, coverage helpers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 21:49:22 +02:00
Benjamin Admin	60f988f3cb	feat(cra): hard CRA<->IACE link — IACE tab pulls the linked assessment [migration-approved] Migration 153 adds compliance_cra_projects.linked_iace_project_id (additive, idempotent). New thin router cra_link_routes.py: POST /projects/{id}/link-iace sets the reference; GET /by-iace/{iace_project_id} returns the linked CRA project + its latest assessment snapshot. The IACE "CRA / Cyber" tab now resolves the linked CRA assessment first (real, from the snapshot) and only falls back to the demo scenario when nothing is linked. One assessment, two views. [migration-approved] — user approved the new column for the CRA<->IACE reference. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 19:22:29 +02:00
Benjamin Admin	b2392fb680	refactor(cra): readiness fetches Machinery-Reg obligations from use_case=maschinen Follow-up to the machinery_reg_cyber.py removal: the readiness endpoint now pulls Machinery Regulation 2023/1230 cyber-with-safety obligations from the shared Controls-API (use_case=maschinen), tagged "Maschinen-VO", best-effort. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 15:39:39 +02:00
Benjamin Admin	add16ad970	refactor(cra): pull Machinery-Reg obligations from Controls-API, drop hardcode Machinery Regulation 2023/1230 cyber-with-safety obligations are already in the shared Controls-API (use_case=maschinen, atom-grain, classified, license-clean) — so remove the hand-authored machinery_reg_cyber.py spine. The readiness check now fetches them from use_case=maschinen (sub_topics sicherheitsanforderungen -> code, risikomanagement -> process, konformitaetsbewertung -> document), tagged source "Maschinen-VO" alongside the CRA obligations. Same pattern as the security cluster; no own formulation, no license question. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 15:39:03 +02:00
Benjamin Admin	b0f78ae9a3	feat(cra): readiness derives obligations from Machinery Reg 2023/1230 too Machine/plant builders are hit by BOTH the CRA and the new Machinery Regulation. New machinery_reg_cyber.py models its two well-corroborated Annex III cyber-with-safety essential requirements (1.1.9 protection against corruption, 1.2.1 control-system safety incl. foreseeable manipulation) in our own words; EU legal text is freely reusable (Commission Decision 2011/833/EU, source acknowledged), harmonised standards referenced by identifier only. The readiness check asks "is it machinery?" and, if so, adds these obligations tagged "Maschinen-VO" alongside the CRA ones; the combination is visible (regulations list + per-item source badge). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 14:26:08 +02:00
Benjamin Admin	9660724a2c	feat(cra): CRA Readiness Check lead-magnet on /sdk/cra (Track A) Low-friction, stateless readiness check (no project/DB): business-scope answers (internet / parameter app / remote maintenance / updates / firmware / personal data / critical infra) -> Annex III/IV classification (reuses _classify) + a high-level guideline grouped Code / Prozess / Dokumentation (via Annex I evidence_type) + conformity path + deadlines + rough effort + the "we implement" hook and a CTA into the existing project workflow. Endpoint POST /api/v1/cra/readiness. Reuse + reframe of the existing CRA module, no duplicate questionnaire. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 13:33:09 +02:00
Benjamin Admin	437c2c8fa1	feat(cra): hardware path — derive cyber findings from networked components For hardware CE projects (no repo) each networked component (controller/hmi/ gateway/drive/remote_access/sensor) yields typical ICS vulnerability CLASSES (real CWE + "CISA-ICS — product-specific check" framing, NO fabricated CVEs); they flow through the same CRA engine. /assess accepts components[]. MappedFinding now echoes title/location/cwe so the response is self-contained for any finding source. Live CISA-ICS/NVD per-product CVE lookup is the later enrichment. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 12:37:22 +02:00
Benjamin Admin	c7845f67d6	feat(cra): attach network_security regulatory breadth (shared Controls-API) Semantic breadth (2): each finding's CRA-AI is mapped to a network_security sub_topic and enriched with atom-grain, framework-traceable obligations from the shared Controls-API (compliance.atom_classification) — at the endpoint/view layer (SessionLocal), NOT in the pure mapper. CRA-AI anchor + curated measure + NIST/OWASP crosswalk stay the lead; this is breadth + source evidence. Only network_security is queried (atom-grain), scoped by sub_topic + limit. Frontend renders it under the collapsible best-practice depth (control_id · title · source). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 10:45:21 +02:00
Benjamin Admin	4d01e99ca1	feat(controls): atom-grain path in get_controls_for_use_case Reads compliance.atom_classification (Haiku pass: relevant + sub_topic + canonical_obligation) when present -> precise, sub-topic-organized controls per topic; master-grain seed stays as fallback for unprocessed topics. New optional sub_topic filter + subtopic_counts facet + granularity flag in the response. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 09:47:49 +02:00
Benjamin Admin	cf917ab733	feat(cra): versioned assessment snapshots — CRA Art. 13 running system (step 3) Persist each CRA assessment as a versioned, auditable snapshot over the product lifecycle. Reuses the existing compliance_cra_documents table (NO new schema, frozen DB respected): doc_type='doc_risk_assessment', full assessment in generation_context, requirements_coverage summary, auto-incrementing version, prior version superseded. New endpoints: POST /projects/{id}/assess-snapshot, GET /projects/{id}/assess-snapshots (history), GET /assess-snapshots/{id}. Additive (no contract baseline change). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 09:27:09 +02:00
Benjamin Admin	10c32d7f7c	feat(cra): cyber-meets-safety bridge as real logic (step 2) Deterministic bridge (cra_safety_bridge.py): a cyber finding's attack capability (remote_actuation / code_tampering / integrity_loss / auth_bypass, derived from its CRA category) is matched against what each CE safety function is vulnerable to. A match re-opens the mitigated hazard, flags the finding safety_impact (which floors it to P0), and produces the cross-link. Endpoint accepts safety_functions; frontend passes the project's safety functions and renders the LIVE cross-links (no more hardcode). Safety functions are demo input now; come from the CE risk assessment in production. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 08:59:41 +02:00
Benjamin Admin	12fa179bfd	feat(cra): coarse priority engine — P0 floor + customer weights + quick wins Deterministic prioritisation on top of the mapper (cra_prioritizer.py): a non-negotiable P0 floor (safety-function compromise / actively exploited / CRITICAL — customer weights cannot demote) plus a discretionary tier ranked by severity x the customer's weight (high/medium/low) for the 5 business objectives (access/data/network_api/supply_updates/monitoring). Quick-win flag (high impact, low effort) for a second view; each finding carries a short priority reason. Endpoint accepts weights + per-finding safety_impact/exploited. Rough pre-sort only (devs re-sort in Jira). No DB. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 08:21:56 +02:00
Benjamin Admin	34a678caef	feat(cra): standalone POST /api/v1/cra/assess endpoint Live HTTP entry for the deterministic CRA assessment — repo-scanner findings in, CRA Annex I mapping + risk + curated measures + NIST/OWASP golden-set crosswalk out. Project-less (works for any customer, no CE-RA/FMEA required); reuses the tested mapper, same logic the MCP server exposes. Additive endpoint (no contract baseline change); no DB. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-14 07:19:01 +02:00
Benjamin Admin	a4b405077f	feat(controls): shared get_controls_for_use_case retrieval API Read-only layer (service + thin route + tests) that returns the controls mapped to a use-case/topic, ranked by a deterministic precision proxy (is_primary + mapping confidence + registry keyword relevance) over the existing mc_use_case_mappings seed. No schema change. Shared handoff point: the document specialist agents AND the CRA finding-mapper draw from this one controls index instead of separate retrievals. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-13 21:37:18 +02:00
Benjamin Admin	d1ea54b378	feat(audit-report): exec summary, top-N per module, statistics, overall analysis User feedback implemented: cookie title fix (no longer rendered as a bare "Befund" (finding); title built from cookie/type/vendor), executive summary at the top, per module statistics (counts + severity bars + MCs) + only the top-3 findings + a pointer to "N more" with frontend link (snapshot_id) + interim conclusion, browser overview, overall analysis, clearer "limitations" sentence, report version number. 6 tests green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-13 15:57:07 +02:00
Benjamin Admin	d720db07dd	feat(audit-report): deterministic text report per audit (MD + PDF) + "Bericht" (report) tab Business-ready report from the snapshot module results (no re-crawl, no LLM): introduction, test scope + methodology, management summary (4-status), detailed findings per module, measures, legal notice. Co-pilot tone, tracking count rather than raw cookie count, standards referenced only (no standard text). - audit_report.py: assemble_report (pure) + render_markdown + render_pdf (reportlab) - snapshot_check_routes: GET /report (structure+md) + GET /report.pdf - Frontend: AuditReportTab + proxies (report, report/pdf) + "Bericht" tab - Tests: 5 assembler tests (compliance/tests → checked in CI) + 1 Vitest Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-13 14:50:45 +02:00
Benjamin Admin	85a8a1d545	feat(browser-matrix): cross-browser findings + browser-default classification (Phase 4) - browser_cross_finding: deterministic view over the matrix (no second engine, no LLM). Finds inconsistencies BETWEEN browsers (cookies before consent / rejection not respected universally / banner links missing) and puts them in context: Safari ITP / Brave Shields / Firefox ETP mask violations on the client side → a "clean" result on a strict engine is NOT evidence of compliance; the permissive engines (Chrome/Edge) are what count. Coverage note for unavailable browsers. Per finding: title/detail/severity/affected/measure. - snapshot_check_routes: cross_findings computed fresh in run + GET (not persisted). - BrowserBehaviorView: "Cross-Browser-Befunde" (cross-browser findings) block above the table. - Tests: test_browser_cross_finding (6). Open (follow-up task): live detection of Borlabs consent history (needs a consent-tester storage scan). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-12 23:22:57 +02:00
Benjamin Admin	c7fde93061	feat(backend): on-demand browser behavior matrix + snapshot persistence (Phase 2) - check_snapshot: update_browser_matrix/load_browser_matrix, migration-free in banner_result.browser_matrix (JSONB jsonb_set, own scanned_at) - snapshot_check_routes: POST /snapshots/{id}/browser-behavior/run runs /scan-matrix LIVE (re-crawl per engine, only measurable live) and persists the result; GET /snapshots/{id}/browser-behavior returns the stored matrix without a re-crawl. Profile set = 4 default engines + Brave/Chrome/Edge. - consent-tester multi_browser_scanner: Semaphore(2) against OOM (7 browsers in parallel blew the 2g mem_limit) - Pydantic model with Optional[List[...]] (not `| None`) → Py3.9-safe - Tests: _snapshot_scan_url + request defaults (5) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-12 23:03:28 +02:00
Benjamin Admin	403e3c66d2	feat(cookie): declaration-vs-library diff view + funnel KPI For the subset matched by the library (~32%), shows per cookie the field deviations declared→library (category/lifetime/purpose) as a diff card, plus an honest funnel (total → checked → deviating); unmatched cookies cannot be checked (no pass/fail), in keeping with the tone. - analyze_cookies: 'expected' target value on tracker_as_necessary/excessive_lifetime/missing_purpose (+ _CAT_LABEL_DE). - new cookie_declaration_diff.build_declaration_diff: pure regroup aggregation of the findings per cookie (single source = analyze_cookies), hint types (third_country/eu_alternative) deliberately excluded. - cookie-check exposes out['declaration_diff']. - CookieDeclarationDiff.tsx at the top of the cookie tab (before Panel/ResultView). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 21:00:50 +02:00
Benjamin Admin	97e39579d5	feat(cookie+routing): storage-type filter + legal_notice capture-only #3 storage filter: cookie-check exposes the per-cookie storage type (storage_inventory.per_cookie); CookieResultView gets filter chips (Cookie/Local Storage/Framework …) + a storage column, vendors without a matching hit are hidden, the KPI shows the filtered count. A routing: legal_notice is now a canonical doc type. Dedicated discovery rule (legal-disclaimer/rechtlicher-hinweis) BEFORE impressum → the disclaimer page is no longer substituted as the Impressum (the reason the cross-doc reconciliation never fired). capture-only: persisted as doc_entry for B, but not scored individually (no 0% noise, since it has no checklist of its own). Selectable as an option in the scan form. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 20:45:18 +02:00
Benjamin Admin	0f6cdc93fd	fix(snapshot): cookie dedup + faster Impressum tab + table count - Cookies are deduplicated by name per vendor (consent-phase duplicates; BMW 2196 → ~772) in cookie-check + get_snapshot, which fixes inflated tile/finding counts. - Impressum snapshot check skips the ~40s LLM step (context skip_llm) → the tab loads immediately instead of staying empty. - Vendor table shows only the cookie count (no word 'Cookies' on each row). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 19:54:15 +02:00
Benjamin Admin	ba7d98be36	feat(reconcile): B: cross-doc reconciliation (obligation fulfilled in another doc) An 'X missing'/'to be checked' finding is suppressed when the obligation is fulfilled in ANOTHER snapshot document (e.g. at BMW, § 36 VSBG / the ODR link appear in the AGB/'Rechtlicher Hinweis', not in the Impressum → was a false positive). Conservative allowlist (impressum: verbraucher_streitbeilegung, odr_link) against false reconciliation. Wired into _run_doc_agent (all doc checks). Frontend: 'In anderem Dokument abgedeckt' (covered in another document) section. Takes full effect after scan + legal capture. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 15:42:16 +02:00
Benjamin Admin	7258744107	refactor+feat: snapshot router split + generic ChecklistAgent + AGB module - Item 2: snapshot doc checks (cookie/impressum/dse/agb) moved into snapshot_check_routes.py (agent_compliance_check_routes.py 464→365 lines); same paths, registered in main.py. - ChecklistAgent base: DSE logic generalised (L1/L2, short titles, _severity_ override hook). DSEAgent + AGBAgent are now thin subclasses → future doc agents (widerruf/avv/…) become trivial. - Item 4: AGBAgent (§§ 305 ff. BGB, AGB_CHECKLIST) + agb-check + AGB tab via AgentModuleTab. No library firehose. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 14:23:29 +02:00
Benjamin Admin	3c6deac1c5	fix(dse+linter): third-country applicability, no na detail, short titles, linter word boundaries - Linter: FORBIDDEN_OUTPUT_TERMS matched on word boundaries → 'Schutzgarantien'/'geeignete Garantien' (Art. 46) pass, 'garantiert' claims remain blocked. - DSE: the L2 detail is skipped instead of reporting 'na' when the L1 mandatory item is missing (no misleading 'not applicable' for e.g. the transfer mechanism). - DSE: third country → HIGH when a third-country transfer is documented (scan_context via AgentInput.context); BMW (corporate group, US providers) is not a soft MEDIUM. - DSE: title/measure kept short (drives the recommendation title); detailed reasoning goes into evidence, which fixes headings truncated at 120 characters. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 13:43:24 +02:00
Benjamin Admin	76be96556d	feat(dse): curated DSEAgent + snapshot tab (Art. 13/14, no firehose) DSEAgent wraps the existing ART13_CHECKLIST (33 curated mandatory items L1 + detail checks L2) → structured AgentOutput, NOT the 90k-library firehose (eCall/health/telecom noise). GET /snapshots/{id}/dse-check mirrors impressum-check; doc_input_from_snapshot generalised. Frontend: generic AgentModuleTab (lazy → AgentResultTab) for Impressum + DSE; DSE tab on the snapshot page. Plus HRB pattern \d→\d+ (full register number as evidence). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 12:46:46 +02:00
Benjamin Admin	5b36b3f367	feat(impressum): snapshot module tab: ImpressumAgent on stored text The snapshot detail page becomes module tabs (Cookies & Tracking | Impressum). Backend GET /snapshots/{id}/impressum-check runs the v3 ImpressumAgent on the stored Impressum text (no re-crawl); input construction extracted into impressum_input_from_snapshot() (pure + tested: text/scope/company_name fallback/None path). Frontend loads lazily on tab switch and renders with the existing AgentResultTab (no second engine). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 11:24:44 +02:00
Benjamin Admin	05a1795ea8	feat(cookie): ② documentation drift: policy vs. browser reality The cookie-check endpoint now returns out["drift"] (audit_cookie_compliance): declared (cookie policy text) vs. actually loaded (browser). Frontend shows the reality-check strip at the top of the panel: X documented · Y loaded · Z undocumented. Pins the contract with test_cookie_drift.py (undocumented-but-loaded + both drift directions) + Vitest for the drift strip. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 09:33:41 +02:00
Benjamin Admin	289988d23e	feat(cookie): ① storage inventory + storage_transparency finding Separates real cookies from other terminal-device storage (Local/Session Storage, IndexedDB, Salesforce framework artifacts); § 25 TDDDG is technology-neutral. - cookie_storage_inventory: detect_storage_type (name patterns ComponentDefStorage/__MUTEX/LSKey + lifetime text) + build_storage_inventory + storage_transparency summary finding ('X listed as cookies -> Y real + Z other'). - Endpoint cookie-check returns storage_inventory; frontend shows the breakdown. Tests: 4 + frontend Vitest green. Differentiator: '740 -> 132 + 608'. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 09:05:29 +02:00
Benjamin Admin	d18ef79f18	feat(cookie): per-cookie library matching (2287-entry OCD + 35-entry rich) + panel - analyze_cookies matches cookies against BOTH libraries: compliance.cookie_library (2287, OCD/CC0: category/retention) + the 35-entry rich DB (technical_necessity/reid/schrems/eu_alternative). 5 finding types: tracker_as_necessary, missing_purpose, excessive_lifetime (Art. 5), third_country (Art. 44), eu_alternative (commercial). - Endpoint GET /snapshots/{id}/cookie-check (load_big_library batch + analyze). - Frontend CookieLibraryPanel in the snapshot detail. - Fix CookieResultView: purpose no longer truncated to 60 characters; role 'unknown' shown as a dash instead of 'Unbekannt'. Tests: 7 backend + frontend Vitest green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 08:18:25 +02:00
Benjamin Admin	97575cc9c0	feat(agent): 4-status model (NOT_APPLICABLE/INSUFFICIENT_EVIDENCE/POSSIBLY_APPLICABLE) for Impressum Canonical compliance data model, with the Impressum agent as reference: - CheckStatus enum + Finding.status kept SEPARATE from severity (verdict ≠ risk) - Undetermined legal form (neither text nor wizard) → INSUFFICIENT_EVIDENCE (INFO) instead of a hard HIGH FAIL; legal_form_dependent gate + detect_legal_form_present - § 18 MStV grey area (corporate blog via has_editorial_content) → POSSIBLY_APPLICABLE (LOW review hint); 3 levels via scope_disposition - Recommendations only from real FAILs; mc_insufficient/mc_possibly aggregates - Frontend: verdict pill + coverage vocabulary - 19 new tests (test_four_status.py, AgentFindingCard); CI suite 204 green, v3 25 / GT 13 unchanged Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 22:38:11 +02:00
Benjamin Admin	b7a7e70731	feat(agent): Impressum legal-form gates + optional VAT ID (Phase 3) The 8 audit classification fields (scan_context) now drive the agents' business_scope (previously stored but not used). Legal-form gates as opt-out (excludes_scope): Verein (association) -> no commercial-register finding, e.K. (registered sole trader) -> no authorized-representatives finding; an unknown legal form remains applicable. VAT ID (USt-IdNr) optional -> missing = no finding. Legal mapping confirmed by the domain expert. - _classification.py: scan_context_to_scope (8 fields -> scope tokens) - mcs.py: MC.excludes_scope + MC.optional; IMP-MC-004/006 gate tokens; IMP-MC-005 optional; scope_matches respects excludes_scope - agent.py: optional -> no finding when absent - _agent_outputs.py: scope = union of scan_context and the LLM-profile fallback - Tests green: v3 25, ground truth 13, CI path 14 (+ SSE loop fix) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 20:37:56 +02:00
Benjamin Admin	65de90114a	feat(agent): SSE, progressive topic tabs (Phase 2) The compliance check now streams progressive events; the Impressum tab appears as soon as the topic is finished, instead of everything arriving at once at the end. Additive: polling for the final result remains. - backend: _sse.py (Queue/emit/event_generator) + endpoint /compliance-check/{id}/stream; _update emits progress, run_agent_outputs emits topic (now runs early, after Phase B), the orchestrator emits complete/error. - frontend: SSE proxy route + EventSource in ComplianceCheckTab merges topic events into agent_outputs -> the tab appears progressively. - Tests: backend 5 passed (SSE + agent_outputs); tsc 0 new errors, vitest 2 passed, check-loc 0. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 19:07:26 +02:00
Benjamin Admin	e21984e0ad	feat(agent): structured result tabs, Impressum (Phase 1) The compliance check additionally stores a structured v3 AgentOutput per topic in result.agent_outputs (additive; B18 HTML + firehose mail remain untouched). Frontend: a standardized result tab instead of the firehose: Impressum tab (AgentResultTab) + "All checks (raw)" (ChecklistView). - backend: _agent_outputs.py calls the registered v3 ImpressumAgent, wired into _orchestrator after B18, surfaced via _phase_f_persist. - frontend: AgentResultView (extracted from AgentSlotCard, DRY), AgentResultTab, ComplianceResultTabs; ComplianceCheckTab 490->391 lines. - Tests: backend 2 passed, frontend 2 passed; tsc 0 new errors; check-loc 0. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-10 18:32:06 +02:00
Benjamin Admin	08c08fcba2	feat(crawl): Completeness: Shadow DOM/hidden links + interaction fixpoint + Wayback CDX orphans. So that the specialist agents work on complete website content: A: _find_dsi_links now pierces the Shadow DOM (web components such as Usercentrics/Mercedes) recursively; hidden (display:none) links are captured + flagged as coverage metadata. B: _expand_to_fixpoint expands accordions/tabs/hover menus in a loop until the DOM is stable (instead of 1 pass); extended selectors; coverage telemetry (rounds, expanded elements, DOM growth, shadow/hidden links) → response + backend log. C: legacy_url_cdx.cdx_enumerate lists, via the Wayback CDX API, ALL URLs of the domain that were ever archived → finds orphan/legacy pages that were never in the slug grid (e.g. a /datenschutz page that is no longer linked but still reachable by direct URL). Feeds into the existing legacy-URL inventory. Tests: test_legacy_url_cdx.py (6) + consent-tester/tests/test_dsi_discovery.py (pure helper + real-browser integration). All green, LOC gate green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 12:33:34 +02:00
Benjamin Admin	593baace7c	fix(agents): HTML entity decode before the agent + pattern tolerates '('. Bug with BMW: dsi-discovery returns HTML entities ( ) as literal strings without decoding. Example in the BMW Impressum: 'wird gesetzlich durch den Vorstand (Milan Nedeljkovic, …)'. My pattern expects ':' / '.' / whitespace after Vorstand → it does not match the '&' → false-positive HIGH finding. Fix 1 (main fix): the test harness calls html.unescape() before agent.evaluate(), so that every agent receives clean text, decoupled from dsi-discovery quirks. Fix 2 (belt and suspenders): the pattern now also tolerates '(' directly after Vorstand/Geschaeftsfuehrer (in case decoding ever fails). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-08 18:45:37 +02:00
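The two fixes in 593baace7c can be illustrated with a minimal sketch. The pattern and the helper names (`REPRESENTATIVE_RE`, `normalize_text`, `has_representative`) are hypothetical, since the commit message does not show the agent's actual pattern; only `html.unescape()` is taken from the message itself.

```python
import html
import re

# Hypothetical pattern (not the agent's real one): a role keyword followed by
# ':', '.', whitespace, or '(' (the '(' is the tolerance added by Fix 2).
REPRESENTATIVE_RE = re.compile(
    r"\b(Vorstand|Gesch(?:ä|ae)ftsf(?:ü|ue)hrer)\s*[:.(\s]",
    re.IGNORECASE,
)


def normalize_text(raw: str) -> str:
    """Fix 1: decode HTML entities so every agent sees plain text."""
    return html.unescape(raw)


def has_representative(raw: str) -> bool:
    """Return True if a representative role is named in the decoded text."""
    return REPRESENTATIVE_RE.search(normalize_text(raw)) is not None


if __name__ == "__main__":
    raw = "wird gesetzlich durch den Vorstand&#40;Milan Nedeljkovic&#41; vertreten"
    # Without decoding, the '&' right after "Vorstand" defeats the pattern.
    print(REPRESENTATIVE_RE.search(raw) is not None)
    # With decoding, the entity becomes '(' and the pattern matches.
    print(has_representative(raw))
```

Decoding once in the harness keeps individual agent patterns simple; the extra '(' in the character class only matters if undecoded or unusual input slips through.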
Benjamin Admin	361a5e7605	feat(agents): test harness uses the full compliance pipeline for fetching. Instead of the simple dsi-discovery wrapper function, the test harness now calls _fetch_text() from agent_check/_fetch.py, the FULL pipeline that the production compliance check also uses: - consent-tester dsi-discovery with a 240s timeout (instead of 120s) - doc_type-aware max_documents (1 for cookie/dse, 3 for impressum) - CMP payload capture (ePaaS, OneTrust …) - HTTP fallback with browser User-Agent + DomainRateLimiter - HTML tag strip when Playwright fails. This makes Cloudflare-/anti-bot-protected sites such as BMW and Elli work in the test harness as well; previously they timed out after 90s. Plus: on an empty fetch, the slot shows a clear error message (roughly: 'Cloudflare/anti-bot protected. Tip: paste the text manually') instead of failing silently. cmp_payloads now also land in the vault. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-08 18:38:59 +02:00
Benjamin Admin	3ae4e60c9d	feat(agents): SSE endpoint + agent test tab (5 URLs in parallel). Backend: - specialist_agent_routes.py: GET /agents, POST /test/start (run_id), GET /test/stream/{run_id} (SSE), GET /run/{run_id}/result, GET /run/{run_id}/artifacts, GET /run/{run_id}/artifact/{path}, DELETE /run/{run_id}, GET /runs. - Per-URL async orchestrator: text fetch via consent-tester dsi-discovery → agent.evaluate() → vault.put_json + stream events. - Tests: 7/7 green. Frontend: - /api/sdk/v1/specialist-agent proxy with SSE passthrough. - AgentTestTab.tsx: agent selector + 5 URL slots + live events + speedometer (OK/N-A/HIGH/MEDIUM/LOW) + findings + recommendations + escalation log + artifact link per slot. - New tab "Agent-Test" in /sdk/agent. User request 2026-06-08: test each agent in isolation, 5 URLs simultaneously, live updates instead of a polling waiting game. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-08 17:47:05 +02:00
Benjamin Admin	d6b8bf87c2	fix: 4 bugs together: B22 PDF + B17 walk fallback + company_name + plausibility fallback. (1) B22 cross-domain (fix #59): the Elli test did NOT find the AGB (terms and conditions) on logpay.de although the URL in doc_entries was correct. Suspected cause: discovery Phase A drops/overwrites the original URL when the PDF fetch fails (word_count=0). Fix: _collect_audit_urls() iterates over state.doc_entries + rejected_url + req.documents; cross-domain hosting is independent of the text content. Plus trace logging for future diagnosis. Dedup by (doc_type, host_sld). (2) B17 audit-walk fail fallback (fix #60): BMW v5 had audit_walk=None without a hint in the mail. Presumably a 180s timeout during the OneTrust CMP banner tour. Fix: timeout 180s → 300s. Plus: on failure, a hint stub with the error reason is written to state["audit_walk"] + the HTML block, so the reviewer sees the failure instead of a silent skip. (3) company_name + origin_domain in the backend (fix #61): the frontend has been sending the two fields since `ec03317`; the backend ignored them. Fix: ComplianceCheckRequest schema extended with company_name + origin_domain. phase_e_email prioritizes user input over the URL heuristic for site_name. If origin_domain is given and no domain can be derived from doc_entries, the user input is adopted as the domain. (4) Plausibility LLM fallback model (fix #62): qwen3:30b-a3b frequently returns empty format='json' responses on large DSEs (privacy policies; BMW 122 FAIL); the circuit breaker kicked in, but the phase remained useless. Fix: default model switched to qwen2.5:7b (4× smaller, more reliable with format=json, sufficient reasoning for PASS/MODIFY/DROP classification). Plus Strategy C introduced: a fallback model (llama3.2:3b) when the primary stays empty. BATCH_SIZE 4 → 3. ENV switches PLAUSIBILITY_LLM_MODEL + PLAUSIBILITY_FALLBACK_MODEL for tuning. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-08 16:39:33 +02:00
Benjamin Admin	d208a2bde2	feat: mail restructuring + B22 cross-domain doc detector. User feedback BMW v5: "740 cookies dropped to 31, overview lost". Three adjustments: Mail restructuring (_executive_summary.py + _compose.py): - render_executive_summary(): top-of-mail TL;DR with compliance score (large + colored), top 3 findings by severity, cookie statistics (declared/browser/third country), severity distribution chips. - collapsible(): wraps every block in <details>/<summary>. Mailpit + all modern mail clients render this natively. - _compose.py: all 18+ B blocks + per_doc + per_theme + legacy_html in accordions. ONLY critical findings + immediate measures are always open; the reviewer sees ~15 lines of overview and expands selectively. - The cookie inventory (742) now has its own section at the very top (accordion "🍪 Cookie-Inventar"), with vendor cards alongside. B22 cross-domain legal doc detector (cross_domain_doc_check.py): real example from user feedback: Elli's AGB is hosted on docs.logpay.de instead of elli.eco. The detector recognizes an SLD mismatch: - HIGH for agb / widerruf (contract-relevant) - MEDIUM for dse / nutzungsbedingungen - INFO for cookie / impressum (best practice). Norm: GDPR Art. 28 (DPA obligation for hosting) + Art. 13(1)(e) (recipients) + § 312i BGB (cool URLs). 9/9 tests green incl. the Elli/LogPay pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-08 11:35:55 +02:00
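The SLD-mismatch rule from d208a2bde2 can be sketched as follows. This is not the code in cross_domain_doc_check.py: the function names are hypothetical, and the "last two host labels" heuristic is a simplifying assumption that would mis-handle suffixes such as co.uk (a real detector would more likely consult the Public Suffix List). Only the doc types and severities are taken from the commit message.

```python
from typing import Optional
from urllib.parse import urlsplit

# Severity per doc_type, as listed in the commit message.
SEVERITY_BY_DOC_TYPE = {
    "agb": "HIGH",
    "widerruf": "HIGH",
    "dse": "MEDIUM",
    "nutzungsbedingungen": "MEDIUM",
    "cookie": "INFO",
    "impressum": "INFO",
}


def registrable_domain(url: str) -> str:
    """Assumption: approximate the registrable domain by the last two labels."""
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def cross_domain_severity(origin_url: str, doc_type: str, doc_url: str) -> Optional[str]:
    """Return a severity if the document is hosted under a different domain."""
    if registrable_domain(origin_url) == registrable_domain(doc_url):
        return None  # same site, nothing to report
    return SEVERITY_BY_DOC_TYPE.get(doc_type)


if __name__ == "__main__":
    # The Elli/LogPay case from the commit message (the PDF path is made up).
    print(cross_domain_severity("https://www.elli.eco", "agb", "https://docs.logpay.de/agb.pdf"))
```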
Benjamin Admin	5c5d676f01	feat: Plan B + A + C: DSE version MCs + legacy URL + multi-version. Three related mechanisms for DSE provability + URL hygiene. Plan B + PDF, version-provability MCs (dse_checks.py): - mc-dse_version_date (HIGH): a visible "Stand"/version date is mandatory. 12 regex patterns: "Stand: April 2024", ISO date, "Letzte Aktualisierung", "Version 3.2", English variants ("Last updated", "Effective date as of …"). Norm: Art. 7(1) GDPR (demonstrability of consent). - mc-dse_version_proof (MED): PDF download or versioned archive URL. A pure HTML DSE without a snapshot is legally fragile. 8 patterns: .pdf, download hint, web.archive.org, /dse-vNNN.html. Norm: DSK-Orientierungshilfe 2024. Plan A, legacy URL discovery (legacy_url_discovery.py + B20): four complementary sources: A.1 parse /sitemap.xml + sub-sitemaps and filter for compliance-relevant slugs; A.2 archive.org/wayback/available per slug: if Wayback shows a snapshot ≥18 months old AND the page still returns 200 today AND it is not in the footer → legacy suspicion; A.3 slug permutations: 6 doc_types × 6 slug variants × 5 language prefixes × 4 brand parameters; A.4 banner modal links (via the consent-tester stage 4 tour). Mail block "🗂️ Legacy-URL-Inventar" with table: URL · HTTP · Wayback age · Footer · Recommendation (301/offline/keep). The engine does NOT decide what is legacy; it presents the inventory and the customer chooses. Real-world smoke test Elli: /en/cookies → HTTP 200, Wayback 69 months old, not in footer → "legacy suspicion, set a 301"; /en/impressum → HTTP 302, redirected → "keep". Plan C, multi-version DSE analysis (multi_version_dse.py): if ≥2 DSE URLs are reachable, extract DSB name + date + word count + SHA-256 per variant and flag inconsistencies (date_divergent, dsb_divergent, no_date_count). Mail block "📑 Mehrere DSE-Versionen erkannt" with comparison table + red notice "Only one version can be valid". Example Elli: /de/datenschutz (Mollstr DSB, 2022) vs /de/datenschutzerklaerung?brand=elli (Proliance, no date). API response extended with legacy_url_inventory + html_blocks.legacy_urls + multi_version_dse_html in the V2 layout. ENV override: LEGACY_URL_DISABLED=1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-08 10:04:14 +02:00
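The Plan C comparison from 5c5d676f01 might look roughly like the sketch below. The `DseVariant` type and `compare_variants` function are hypothetical, and the exact semantics of the flags are an assumption (here a flag is "divergent" when more than one distinct non-empty value occurs); only the flag names, the SHA-256 fingerprint and the word count come from the commit message.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DseVariant:
    """One reachable privacy-policy variant (hypothetical structure)."""
    url: str
    text: str
    dsb_name: Optional[str]      # data protection officer, if one was extracted
    version_date: Optional[str]  # visible version date, if one was extracted


def compare_variants(variants: List[DseVariant]) -> Dict[str, object]:
    """Fingerprint each variant and flag inconsistencies between them."""
    dates = {v.version_date for v in variants if v.version_date}
    dsbs = {v.dsb_name for v in variants if v.dsb_name}
    return {
        # Assumption: divergence means more than one distinct non-empty value.
        "date_divergent": len(dates) > 1,
        "dsb_divergent": len(dsbs) > 1,
        "no_date_count": sum(1 for v in variants if not v.version_date),
        "sha256": {v.url: hashlib.sha256(v.text.encode("utf-8")).hexdigest() for v in variants},
        "word_count": {v.url: len(v.text.split()) for v in variants},
    }


if __name__ == "__main__":
    # Shaped after the Elli example; the texts are placeholders.
    report = compare_variants([
        DseVariant("/de/datenschutz", "Text der ersten Variante", "Mollstr", "2022"),
        DseVariant("/de/datenschutzerklaerung?brand=elli", "Anderer Text", "Proliance", None),
    ])
    print(report["dsb_divergent"], report["date_divergent"], report["no_date_count"])
```

Under this assumption the Elli pair is flagged through dsb_divergent and no_date_count rather than date_divergent, because only one of the two variants carries a date.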

357 Commits