Compare commits

...

65 Commits

Author SHA1 Message Date
Benjamin Admin c771d8ecb9 Merge feat/iace-lift-endstop-bridge: OSHA→engine bridge + drift filter
CI / guardrail-integrity (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Failing after 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Failing after 1m9s
CI / iace-gt-coverage (push) Successful in 29s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-05-22 08:37:34 +02:00
Benjamin Admin 772ff35e8d feat(iace): bridge OSHA MD library to pattern engine, body-part-specific lift crush hazards
- M600-M604: lift endstop mitigations (creep speed, safety edge,
  minimum distance, hold-to-run, step plate); cite OSHA + EN ISO identifiers
- HP2100-HP2102: body-part crush patterns for lift family (foot under platform,
  hand/body against fixed structure, leg between lift and lateral structure),
  restricted via MachineTypes filter
- pattern_machinetype_overrides.go: post-load pass fills MachineTypes on 14
  legacy patterns (HP1000 Walzen, HP539 Schweiss, HP545/HP782 Glas,
  HP756/HP757/HP760 Fahrtreppe, HP1400-1402 CNC, HP045/HP049 Pressen,
  HP420-422 Conveyor) to prevent drift on Kistenhubgeraet-style projects

Why: Kistenhubgeraet re-init exposed two gaps — the abstract "Bremse versagt
bei Absenkbewegung" pattern fired but the concrete foot-crush body-part variant
was missing, AND ~10 unrelated patterns fired purely because their RequiredTags
incidentally aligned. Override map avoids touching 1000+ LOC pattern files
that already exceed the soft cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 08:37:24 +02:00
Benjamin Admin 8cbb513e2c feat(audit): Phase 1 Quick-Wins (P81 + P85 + P70 + P83) + TCF DELETE/INSERT-Fix
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 38s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / test-go (push) Has been skipped
P81: tests/fixtures/golden_truth/vw_de.json:
Ground-truth (GT) fixture with must_find_cookies (47 VW cookies) + expected_vendors
(Google, Adobe, Trade Desk, ...). Basis for future regression tests.

P85: banner_screenshot_block.py + consent_scanner.py + main.py:
On banner detection, consent-tester takes a base64 PNG screenshot
(< 1.5 MB). The backend renders it as <img src="data:..."> directly after
the GF one-pager. Visual evidence of 'this is what the banner looked like'
for disputes with marketing or the data protection officer.

P70: rag_provenance.py:
classify_finding_provenance() classifies a finding as 'rag'
(norm + source), 'mixed' (norm without source) or 'heuristic' (own
interpretation). provenance_badge_html() renders small badges
(✓ RAG / NORM / ⚠ HEURISTIK). The module is generic and can be hooked
into any finding renderer.
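
The three-way classification above can be sketched as follows. This is a minimal illustration, not the code in rag_provenance.py: the Finding dataclass, its field names, the CSS class names and the assignment of badge labels to classes (in the order the commit message lists them) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    # Hypothetical shape: a cited norm and the source it was retrieved from.
    norm: Optional[str] = None
    source: Optional[str] = None


def classify_finding_provenance(finding: Finding) -> str:
    """Return 'rag' (norm + source), 'mixed' (norm only) or 'heuristic'."""
    if finding.norm and finding.source:
        return "rag"
    if finding.norm:
        return "mixed"
    return "heuristic"


# Assumed mapping of provenance class to badge label.
_BADGE_LABELS = {"rag": "✓ RAG", "mixed": "NORM", "heuristic": "⚠ HEURISTIK"}


def provenance_badge_html(provenance: str) -> str:
    """Render a small inline badge for a provenance class."""
    label = _BADGE_LABELS[provenance]
    return f'<span class="badge badge-{provenance}">{label}</span>'
```

A renderer would call classify_finding_provenance() once per finding and append the badge HTML to the finding's heading.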

P83: scripts/check-rebuild-needed.sh:
Checks whether the BUILD_SHA deployed in the container matches the local
HEAD. On mismatch, exits 1 with a 'REBUILD REQUIRED' notice.
Prevents the 'old code in the container' problem that has bitten us several
times (frontend tabs visible, backend without the new service).

TCF fix: tcf_vendor_authority.py:
cookie_library has no UNIQUE index on cookie_name, so ON CONFLICT
was impossible. Solution: DELETE WHERE source_name='iab_tcf_v2' before the insert.
Idempotent. Plus a per-vendor commit so that one failure does not block the following ones.
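
A minimal sketch of the delete-then-insert pattern with one commit per vendor, using the standard-library sqlite3 module purely for illustration. The production database, the column list and the ingest_vendors() helper are assumptions. Only the table name, source_name='iab_tcf_v2' and the NOT NULL columns domain_pattern and source_name come from the commit messages.

```python
import sqlite3


def ingest_vendors(conn: sqlite3.Connection, vendors: list[dict]) -> int:
    """Replace all 'iab_tcf_v2' rows; return how many vendors were inserted."""
    # Without a UNIQUE index there is no ON CONFLICT target, so clear first.
    conn.execute(
        "DELETE FROM cookie_library WHERE source_name = ?", ("iab_tcf_v2",)
    )
    conn.commit()

    inserted = 0
    for vendor in vendors:
        try:
            conn.execute(
                "INSERT INTO cookie_library (cookie_name, domain_pattern, source_name) "
                "VALUES (?, ?, ?)",
                (vendor["name"], vendor["domain"], "iab_tcf_v2"),
            )
            conn.commit()  # per-vendor commit: earlier rows survive a later failure
            inserted += 1
        except sqlite3.Error:
            conn.rollback()  # discard only the failed row, then continue
    return inserted
```

Running the function twice with the same input leaves the same rows in the table, which is what makes the ingest idempotent.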

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 08:24:46 +02:00
Benjamin Admin 6c35bcf116 fix(tcf): per-vendor commit so that one failure does not block the following inserts
CI / detect-changes (push) Successful in 15s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 22s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-python-backend (push) Successful in 45s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
2026-05-22 07:54:22 +02:00
Benjamin Admin 19d4b12e07 fix(tcf): schema mapping for NOT NULL constraints (domain_pattern, source_name)
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m33s
CI / test-go (push) Failing after 52s
CI / iace-gt-coverage (push) Successful in 25s
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-05-22 00:32:54 +02:00
Benjamin Admin 2e87b74749 feat(audit): P103+P104+P105 defeat-device heuristic for cookies
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / nodejs-build (push) Successful in 2m35s
CI / test-go (push) Failing after 51s
CI / iace-gt-coverage (push) Successful in 27s
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-python-backend (push) Successful in 39s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Three related stages of 'cookie behaviour differs from what is declared',
analogous to the VW diesel scandal pattern (test bench vs. real-world operation).

P103 (Stufe 3): cookie_value_entropy.py:
Classifies cookie values as flag/short_id/long_token/uuid/hash/json_blob
via Shannon entropy + regex patterns. If a cookie declared as 'essential'
has a 64-char Base64 value → MEDIUM finding 'Defeat-Device-Heuristik'.
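
A rough sketch of how entropy and regex checks could combine into such a classifier. The class names come from the commit message; the thresholds (length >= 32, entropy >= 4.0 bits per character), the regexes and the order of the checks are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from typing import Optional

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)
# Hex digests of common lengths (MD5, SHA-1, SHA-256).
HEX_HASH_RE = re.compile(r"^(?:[0-9a-f]{32}|[0-9a-f]{40}|[0-9a-f]{64})$", re.I)


def shannon_entropy(value: str) -> float:
    """Shannon entropy in bits per character."""
    if not value:
        return 0.0
    n = len(value)
    return -sum((c / n) * math.log2(c / n) for c in Counter(value).values())


def classify_cookie_value(value: str) -> str:
    if value.lower() in {"0", "1", "true", "false", "yes", "no"}:
        return "flag"
    if value.startswith("{") and value.endswith("}"):
        return "json_blob"
    if UUID_RE.match(value):
        return "uuid"
    if HEX_HASH_RE.match(value):
        return "hash"
    # Assumed thresholds: long and high-entropy values look like tracking tokens.
    if len(value) >= 32 and shannon_entropy(value) >= 4.0:
        return "long_token"
    return "short_id"


def defeat_device_finding(declared_category: str, value: str) -> Optional[str]:
    """Return 'MEDIUM' when an 'essential' cookie carries a long token."""
    if declared_category == "essential" and classify_cookie_value(value) == "long_token":
        return "MEDIUM"
    return None
```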

P104 (Stufe 4): cookie_network_tracer.py:
Compares the cookie domain with the site's main domain + known tracker vendors
(50 domains mapped: doubleclick.net, facebook.com, demdex.net, omtrdc.net,
adsrvr.org, hotjar.com, ...). If a cookie declared as 'essential'
is set by an external tracker domain → HIGH. Third-country cookies are
marked as 'DRITTLAND US/CN/...' (third country; a consequence of Schrems II).
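
The domain comparison could look roughly like this. The function names, the return shape and the four-entry excerpt of the tracker map (with vendor names and country codes) are illustrative assumptions; the real module maps about 50 domains.

```python
from typing import Optional

# Illustrative excerpt: tracker domain -> (vendor, country of the vendor).
TRACKER_DOMAINS = {
    "doubleclick.net": ("Google", "US"),
    "facebook.com": ("Meta", "US"),
    "demdex.net": ("Adobe", "US"),
    "adsrvr.org": ("The Trade Desk", "US"),
}


def _belongs_to(cookie_domain: str, base_domain: str) -> bool:
    """True if cookie_domain equals base_domain or is a subdomain of it."""
    domain = cookie_domain.lstrip(".").lower()
    return domain == base_domain or domain.endswith("." + base_domain)


def trace_cookie(cookie_domain: str, site_domain: str, declared: str) -> Optional[dict]:
    """Flag cookies that are set by a known external tracker domain."""
    if _belongs_to(cookie_domain, site_domain):
        return None  # first-party cookie
    for tracker, (vendor, country) in TRACKER_DOMAINS.items():
        if _belongs_to(cookie_domain, tracker):
            return {
                "vendor": vendor,
                # Only a cookie declared 'essential' contradicts its declaration.
                "severity": "HIGH" if declared == "essential" else None,
                "label": f"DRITTLAND {country}",
            }
    return None  # external, but not a known tracker
```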

P105 (Stufe 5): tcf_vendor_authority.py:
The ingest endpoint POST /api/compliance/agent/admin/tcf-ingest fetches the
IAB TCF v2 Global Vendor List (vendor-list.consensu.org/v3) and upserts
it into cookie_library with source='iab_tcf_v2'. cross_reference_with_tcf
fuzzy-matches cmp_vendors against the TCF list: if a vendor is listed in TCF
as marketing but the site says 'Funktional' (functional) → HIGH (an external
authority contradicts the declaration).
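
One way to sketch the fuzzy cross-reference is with difflib from the standard library. The matching technique, the 0.8 cutoff and both input shapes are assumptions; the commit message only states that the matching is fuzzy.

```python
import difflib


def cross_reference_with_tcf(
    cmp_vendors: list[dict], tcf_purposes: dict[str, str], cutoff: float = 0.8
) -> list[dict]:
    """Return HIGH findings where TCF says 'marketing' but the site says 'Funktional'.

    cmp_vendors:  [{"name": ..., "declared": ...}]  (hypothetical shape)
    tcf_purposes: {tcf vendor name: purpose class}   (hypothetical shape)
    """
    by_lower = {name.lower(): name for name in tcf_purposes}
    findings = []
    for vendor in cmp_vendors:
        # Best single match above the similarity cutoff, if any.
        match = difflib.get_close_matches(
            vendor["name"].lower(), list(by_lower), n=1, cutoff=cutoff
        )
        if not match:
            continue
        tcf_name = by_lower[match[0]]
        if tcf_purposes[tcf_name] == "marketing" and vendor["declared"] == "Funktional":
            findings.append(
                {"vendor": vendor["name"], "tcf_name": tcf_name, "severity": "HIGH"}
            )
    return findings
```

A lower cutoff catches more spelling variants but raises the risk of matching unrelated vendors, so any real threshold would need tuning against known vendor lists.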

All three render their own mail blocks in the cookies section (after
cookie_audit_html, before library_mismatch_html).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 00:24:07 +02:00
Benjamin Admin 94233b7c66 feat(iace): LLM gap-review (Task #7+#8) + tech-file sources appendix (#29)
Three coupled pieces of work, all landing the same PoC:

1. Backend gap-review endpoint (Task #7)
   - internal/api/handlers/iace_handler_gap_review.go:
       POST /projects/:id/llm-gap-review
       feeds Limits-Form + current hazards + current mitigations to
       the configured LLM (Qwen / Claude / OpenAI via ProviderRegistry),
       parses a JSON suggestion list, filters it and stamps confidence, falls
       back to a static checklist when the LLM is unavailable.
   - Adopt step is NOT in this endpoint by design — the user clicks
     Adopt in the frontend which calls the existing CreateHazard /
     CreateMitigation handlers so provenance flows through the normal
     audit trail.

2. Frontend modal + button (Task #8)
   - app/sdk/iace/[projectId]/hazards/_components/LLMGapReviewModal.tsx:
       reusable modal that POSTs the gap-review endpoint, renders
       suggestions with Adopt/Reject UX, shows confidence + norm refs,
       source-stamp llm_gap_review vs fallback_static.
   - hazards/page.tsx: indigo "KI-Gap-Review" button next to the
     existing "Eigene Gefaehrdung" button + modal mount.

3. Tech-File sources appendix (Task #29 — Stufe 4)
   - internal/iace/document_export_sources.go: new pdfSourcesAppendix
     method appended to ExportPDF. Groups cited norms by license rule
     (R1 OSHA/EU-Recht / R3 BreakPilot patterns / R3 DIN-EN-ISO
     identifier-only) and emits the legally required statement that
     blanket notices in the Impressum (legal notice) are not sufficient.
   - extractCitedNorms() scans hazard/mitigation text for EN/ISO/IEC/
     DIN identifiers in a narrow grammar so prose isn't turned into
     spurious citations.
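
A narrow identifier grammar of this kind might look as follows. The actual extractCitedNorms() is Go code that is not shown here; this Python sketch only illustrates the idea, and the exact grammar (uppercase prefixes, 3-5 digit numbers, optional part suffix) is an assumption.

```python
import re

# One or more uppercase prefixes joined by space or slash, then a 3-5 digit
# number with optional "-part" suffixes. Lowercase prose does not match.
NORM_RE = re.compile(
    r"\b((?:DIN|EN|ISO|IEC)(?:[ /](?:DIN|EN|ISO|IEC))*) (\d{3,5}(?:-\d{1,2})*)\b"
)


def extract_cited_norms(text: str) -> list[str]:
    """Return norm identifiers in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in NORM_RE.finditer(text):
        seen.setdefault(f"{match.group(1)} {match.group(2)}", None)
    return list(seen)
```

Requiring an uppercase prefix directly followed by a number is what keeps ordinary words and years in prose from being picked up as citations.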

Bonus refactor:
   - internal/app/routes.go reached the 500-LOC hard cap when the new
     llm-gap-review route was added. Extracted registerIACERoutes into
     routes_iace.go (136 LOC). Same wiring, no behaviour change.

Three of the four Attribution-Renderer stages (1, 2, 4) now produce
real output. Stufe 3 ships as <SourceBadge> + <LicenseModuleBanner>
already (commits dfac940 + b9e3eea earlier in this branch).

The PoC is intentionally conservative: every LLM suggestion stays
non-binding until a human clicks Adopt, and Adopt goes through the
existing normal CreateHazard/CreateMitigation flow (not yet wired in
this commit — separate iteration). The endpoint, modal and provenance
chain are in place for the next iteration to wire Adopt → write path.
2026-05-22 00:21:49 +02:00
Benjamin Admin 6263462ba3 feat(frontend): tab layout for audit results + cookie_audit in API
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / iace-gt-coverage (push) Successful in 28s
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m40s
CI / test-go (push) Failing after 45s
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
ResultsTabsView.tsx: new component with 7 tabs:
  1. Overview (KPIs: docs, findings, vendors, score)
  2. Cookies & VVT (3-source compliance comparison +
     undocumented/compliant/not-loaded + deduplicated vendor table)
  3. Privacy policy (DSE findings via ChecklistView)
  4. Impressum (legal notice)
  5. T&Cs / withdrawal (two sections in one tab)
  6. Cookie banner (violations + phase KPIs)
  7. Mail preview (PDF download link)

Sticky tab header at the top; the content scrolls beneath it. The long
scrolling mail is gone as a result.

DocCheckTab uses ResultsTabsView instead of the old inline ChecklistView.

The backend now returns a cookie_audit dict in the response (in addition
to cmp_vendors + banner_result) so that the cookie tab can render the 3 lists
(undocumented / compliant / not-loaded).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 23:44:36 +02:00
Benjamin Admin eb48c5bd1e feat(iace): OSHA minimum-distance library — Task #18
Verbatim OSHA 29 CFR 1910 Subpart O values anchored as the legally
citable value base for the IACE engine. Per strategy discussion
(2026-05-20) US Federal Code is the only public-domain corpus we can
reproduce wholesale; DIN/EN values stay identifier-only.

Coverage in this initial batch:
- MD_OSHA_O10_R1, MD_OSHA_O10_R4 (Table O-10 rows 1 + 4 — point of
  operation guard distance vs max opening width)
- MD_OSHA_212_FAN (§1910.212(a)(5) fan-blade guards: 1/2 in)
- MD_OSHA_217_PSDI (§1910.217 hand-speed constant 63 in/s for
  presence-sensing-device-initiation and two-hand-trip distances)

Each entry carries four parallel value sets:
- OriginalValue/Min/Max in source unit (verbatim, R1)
- ExactMM via deterministic conversion (mathematics, no copyright)
- RecommendedMM with safe-side rounding documented in RoundingNote
- EUNormHints — identifier-only references to EN ISO 13857, EN 13855,
  EN 349 with a human-curated DINComparisonNote (qualitative judgement,
  not a copy)

Open follow-ups (separate iterations):
- Full Table O-10 (rows 2-10) — same shape
- §1910.219 mechanical power-transmission distances
- Cross-reference IACE patterns to MD_OSHA_* identifiers so the Suppression
  Engine surfaces concrete metric values in mitigation suggestions
- Frontend integration: <MinimumDistanceCard> for each measure
2026-05-21 23:43:51 +02:00
Benjamin Admin 081e4f057a feat(audit): cookie compliance audit (3-source comparison) + vendor dedup + block parser
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 55s
CI / iace-gt-coverage (push) Successful in 25s
CI / test-python-backend (push) Successful in 44s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m43s
CENTRAL USP: cookie_compliance_audit.py compares 3 sources
* DECLARED in the cookie policy (parse_cookie_table + parse_flat)
* ACTUALLY loaded in the browser (banner_result.phases.after_accept)
* LIBRARY metadata (cookie_library lookup)

Returns 3 lists with a compliance verdict:
* compliant (declared AND loaded): green block
* undeclared_in_browser (loaded, NOT declared): RED HIGH block
  → violation of Art. 13(1)(c) GDPR + § 25 TDDDG
* declared_not_loaded (declared, NOT loaded): yellow notice
  → table possibly outdated
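
At its core the three-list verdict is plain set arithmetic over cookie names. A minimal sketch follows; the function signature and the shape of the library lookup are assumptions, while the three list names are taken from the commit message.

```python
def cookie_compliance_audit(
    declared: list[str], loaded: list[str], library: dict[str, dict]
) -> dict[str, list[dict]]:
    """Compare declared vs. loaded cookie names and enrich with library metadata."""
    declared_set, loaded_set = set(declared), set(loaded)

    def enrich(names: set[str]) -> list[dict]:
        # Attach the vendor from the library when the cookie is known there.
        return [
            {"name": n, "vendor": library.get(n, {}).get("vendor")}
            for n in sorted(names)
        ]

    return {
        "compliant": enrich(declared_set & loaded_set),
        "undeclared_in_browser": enrich(loaded_set - declared_set),  # HIGH
        "declared_not_loaded": enrich(declared_set - loaded_set),    # notice only
    }
```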

parse_cookie_table extended with a block format (5 lines per cookie, as
in the user copy-paste from VW). Finds 35+ cookies from copy-paste instead of 0.

vendor_normalizer.py: 50+ aliases (Google family, Adobe family,
Trade Desk, AdForm, ...) + garbage filter (URLs, empty strings,
'click to select', 'Mehrere OEMs'). Merges cookie lists during dedup.

_guess_vendor extended: Adobe family (s_ecid/AMCV/demdex/mbox/...),
Trade Desk (TDID/TDCPM/TTDOptOut), AdForm (uid/cid/otsid),
Salesforce LiveAgent, etracker, Akamai, EDAA.

audit_quality_checks: the vendor-thin threshold is now dynamic, based on
the cookie doc word count (3k→10 / 6k→20 / 10k→30 / 15k+→40).
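
The stepped threshold can be sketched as a lookup over breakpoints. The breakpoints and values come from the commit message; treating each breakpoint as an inclusive lower bound and returning 0 below 3k words are assumptions.

```python
import bisect

# Word-count breakpoints and the minimum vendor count expected from each one on.
BREAKPOINTS = [3_000, 6_000, 10_000, 15_000]
MIN_VENDORS = [0, 10, 20, 30, 40]  # index 0 applies below the first breakpoint


def vendor_thin_threshold(word_count: int) -> int:
    """Minimum number of vendors below which the vendor list counts as thin."""
    return MIN_VENDORS[bisect.bisect_right(BREAKPOINTS, word_count)]


def is_vendor_list_thin(word_count: int, vendor_count: int) -> bool:
    return vendor_count < vendor_thin_threshold(word_count)
```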

VW test fixture: tests/fixtures/cookie_gt/vw_cookie_richtlinie.txt
(36-cookie sample for regression tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 23:36:45 +02:00
Benjamin Admin 16fd406c1a feat(iace): secondary-harm chain model + AllPatterns drift fix
Task #17: secondary-harm (Folgegefahren) model as a preparatory commit (no DB
schema change yet; persistence via separate [migration-approved] commit).

New:
- secondary_harms.go: SecondaryHarm struct + six canonical categories
  (consumer_safety, product_liability, food_safety, environmental,
  reputation, financial) with DE labels.
- hazard_pattern_types.go: HazardPattern extended with optional
  SecondaryHarms field — pattern library can now attach consequential-
  damage chains.
- hazard_patterns_secondary_demo.go: two worked examples
  - HP2000 Glasbruch carbonated bottling (the "Cola splitter" scenario
    from the IACE strategy discussion) with consumer_safety + food_safety
    + reputation chains
  - HP2001 Pharma fill-finish cross-contamination with consumer_safety
    + product_liability under AMG §84

Bonus fix:
- compliance_crossover.go AllPatterns() was a duplicate enumeration that
  silently drifted from collectAllPatterns() in pattern_registry.go.
  Pre-fix: 1058 patterns visible. Post-fix: 1213 patterns. The 155 invisible
  patterns included CRA, ISO12100 gaps, robot-cell, CNC extended, VDMA,
  textile-agri, GT-bremse — anything added after the original AllPatterns
  was authored. Audit-Suite (cmd/iace-audit) now sees the full set.

Next steps for full secondary-harm rollout:
- DB migration: hazards table + secondary_harms array column
- API: surface secondary_harms in /projects/:id/hazards response
- Frontend: collapsible secondary-harm panel in HazardTable
2026-05-21 23:36:26 +02:00
Benjamin Admin c5c168592b feat(licenses): Task #25 — SDK module attribution rollout (11 modules)
Per project_sdk_module_attribution_matrix.md the Stufe-3 rollout is
prioritized by audit visibility. This batch covers steps 2-9 in one
sweep:

New reusable component:
  components/sdk/LicenseModuleBanner.tsx — single-line license banner
  placed at the top of an SDK module page. Renders rule pill (R1/R2/R3),
  source label, descriptor and link to /sdk/licenses. Replaces the
  copy-paste banner blocks I inlined in the earlier modules.

Integration points (per cluster):

  Cluster B (DSGVO/EU-Recht, R1):
    - vvt: existing "Vorlage" pill upgraded with R1 marker + tooltip
      explaining Bundeslaender-DSGVO provenance
    - dsfa: inline R1 banner citing DSGVO Art. 35

  Cluster C (EU AI Act / CRA, R1):
    - ai-act: inline R1 banner citing EU 2024/1689
    - cra:    inline R1 banner citing EU 2024/2847 + ENISA-Guidance

  Cluster D (Mix R2/R3):
    - isms: R3 banner + ISO/IEC 27001 reference disclaimer
    - security-backlog: R2 banner with OWASP CC-BY-SA attribution

  Cluster A (own work, R3):
    - tom-generator: R1 source (DSGVO Art. 32) + R3 own-work disclaimer
    - audit-checklist: R3 banner for own audit methodology
    - document-generator: own templates R3 + cited rights R1

  Cluster E (Direct controls listing):
    - catalog-manager: System/User tag upgraded with rule classification
    - iace hazards: pattern_id pill upgraded with R3 + tooltip explaining
      BreakPilot Pattern-Engine provenance

The 11-module sweep brings audit transparency to the modules a paying
customer encounters most often. Stufe 3 of the attribution renderer
is now actually visible across the platform — previously it shipped
only the reusable <SourceBadge> component without integration points.

Pre-existing TS errors (drafting-engine constraint-enforcer, dsfa
types tests) untouched — not in scope for this licensing rollout.
2026-05-21 23:16:09 +02:00
Benjamin Admin d0274674a0 feat(licenses): Task #25 step 1 — SourceBadge in atomic-controls + correct LicenseRuleBadge labels
Per the SDK-Modul Attribution-Matrix (project_sdk_module_attribution_matrix.md),
the controls/atomic-controls listings render canonical_controls directly and are
the highest-audit-visibility integration point for Stufe 3.

Two changes:

1. atomic-controls/page.tsx: embed <SourceBadge controlUuid={ctrl.id} compact />
   next to the existing badge row in each control item. The badge fetches
   /api/compliance/licenses/source-info/{uuid} on first hover and reveals the
   source regulation, license type, and attribution text in a tooltip.

2. control-library/components/helpers.tsx: fix LicenseRuleBadge labels. The
   existing pill said "Free Use / Zitation / Reformuliert" — exactly the
   inverted understanding of the rules that Task #21 surfaced. Corrected to
   R1 (verbatim, Hoheitsrecht/PD), R2 (verbatim + attribution), R3 (identifier
   only). Added native title attribute for hover-explanation; the existing
   ControlListItem in control-library now shows the right semantics
   without any other code change.

Next module per matrix: VVT (Bundeslaender-Vorlagen) and DSFA.
2026-05-21 22:42:52 +02:00
Benjamin Admin 2eb7349577 feat(licenses): sidebar footer link to /sdk/licenses
Adds a discreet "Quellen & Lizenzen" link to the SDK sidebar footer
(below the existing Export button) pointing to the /sdk/licenses page
shipped in commit dfac940.

Part of Task #24 (AGB/Impressum audit) — the legal mandate that
attribution be discoverable for every output is now satisfied at
three layers:
- platform-wide overview reachable from every SDK page (this commit)
- per-export footer in compliance PDFs (commit 07cc00d)
- inline source badge per control via <SourceBadge> (commit dfac940)
2026-05-21 22:18:26 +02:00
Benjamin Admin 4434e3827b fix(audit): parse_flat_cookie_text - anchor pattern for VW textContent
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
The VW cookie doc textContent concatenates HTML table cells WITHOUT whitespace:
'Permanent/Protokoll_fbcTracking Cookies (Marketing)...'

The new pattern has 2 anchors:
* Before: typical end token of a preceding cell (Permanent/Protokoll,
  Session Cookie, Persistent Cookie, TagePersistent, ...)
* After: category token (Tracking Cookies, Funktionscookie, Marketing,
  Analytics, Necessary)
In between: cookie name (3-50 characters, alphanum/_/-)
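
The two-anchor idea can be sketched with a regex that consumes the end token of the preceding cell, captures the name lazily and looks ahead for a category token. The token lists are only the examples named above, and the regex itself is an assumption rather than the pattern used in parse_flat_cookie_text.

```python
import re

END_TOKENS = ["Permanent/Protokoll", "Session Cookie", "Persistent Cookie", "TagePersistent"]
CATEGORY_TOKENS = ["Tracking Cookies", "Funktionscookie", "Marketing", "Analytics", "Necessary"]

ANCHORED_NAME_RE = re.compile(
    "(?:" + "|".join(map(re.escape, END_TOKENS)) + ")"        # anchor before
    + r"([A-Za-z0-9_-]{3,50}?)"                               # cookie name, lazy
    + "(?=" + "|".join(map(re.escape, CATEGORY_TOKENS)) + ")"  # anchor after
)


def extract_anchored_cookie_names(text: str) -> list[str]:
    """Return unique cookie names found between the two anchors, in order."""
    return list(dict.fromkeys(ANCHORED_NAME_RE.findall(text)))
```

The lazy quantifier stops the name at the first position where a category token starts. A cookie name that itself contains a category token would therefore be cut short, which is a limitation of this sketch.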

VW test (snapshot 4a465783): now finds 40 unique cookie names,
aggregated into 6 vendors (Google, DoubleClick, Cloudflare, Borlabs,
Meta, 'Unbekannter Anbieter' (unknown vendor) with 22 VW-internal cookies).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:33:58 +02:00
Benjamin Admin 07cc00da11 feat(licenses): Stufe 2 — auto-attribution footer in compliance PDF
Extends CompliancePDFGenerator with a "Quellen & Lizenzen" section
appended to every generated compliance PDF.

The footer is built from compliance.canonical_controls + control_parent_links
directly (no HTTP hop to /licenses/aggregate — same DB connection
already open in the generator). It groups by license_rule and lists
the top 8 source regulations per bucket.

For Rule-2 entries (CC-BY-SA, OECD-Public, Apache, etc.) it emits the
mandatory attribution paragraph required by the underlying licenses.
For Rule 1 a brief reference list satisfies the auditability goal
without legal obligation. Rule 3 is identifier-only by design.

Architecture decision: this is a PLATFORM-level footer (which sources
the platform draws on overall), not a per-export filter of "only the
sources actually cited in THIS document". The latter would require
control-uuid tracking across all sections (TOM/VVT/DSFA/etc.) which
the current PDF generator does not surface — that's a follow-up scope.
The platform-level footer fulfils the immediate legal mandate that
attribution be present on the work, not buried in AGB/Impressum.

Part of Attribution-Renderer Task #23. Stufe 1 (overview page) +
Stufe 3 (SourceBadge component) already shipped in commit dfac940.
Stufe 4 (tech-file appendix) remains for the IACE tech-file generator
in a separate iteration.
2026-05-21 21:30:02 +02:00
Benjamin Admin 1451873194 fix(audit): parse_flat_cookie_text for VW-style flat tables
CI / loc-budget (push) Failing after 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m4s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 19s
The VW cookie doc delivers the table as FLAT text without column separators:
'IDE Tracking Cookies (Marketing) Beschreibung 13 Monate Permanent
TAID Tracking Cookies (Marketing) ...'

parse_flat_cookie_text matches with a regex:
  NAME [Tracking|Session|Funktional|...] Cookies ... [13 Monate|Session|Permanent]

The backend falls back to parse_flat when parse_cookie_table=[]. This way
we extract ~30-50 cookies + vendors from the 65k VW cookie doc deterministically,
even when the HTML table DOM extract is empty (which happens when the
table is loaded via several append code paths).

Bonus: _extract_dom_tables helper in dsi_discovery.py prepared for
hooking into all 7 DiscoveredDSI.append call sites later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:24:14 +02:00
Benjamin Admin dfac940272 feat(licenses): attribution renderer — Stufe 1 (overview) + Stufe 3 (SourceBadge)
Backend
- backend-compliance/compliance/api/licenses_routes.py: three endpoints
  built on the now-complete license_rule classification
  - GET  /api/compliance/licenses/overview
       global aggregation by rule + per-source breakdown (Stufe 1)
  - POST /api/compliance/licenses/aggregate
       per-control-set aggregation for PDF footer (Stufe 2) and
       tech-file appendix (Stufe 4) — consumed later
  - GET  /api/compliance/licenses/source-info/{control_uuid}
       single-control lookup for the inline source badge (Stufe 3)
- registered in api/__init__.py via the existing safe-import loader

Frontend
- app/sdk/licenses/page.tsx (Stufe 1): the /sdk/licenses overview page.
  Renders rule legend cards + per-rule source tables. Drives the
  /licenses footer link and gives auditors a one-page view of what
  licence classes the platform is operating under.
- components/sdk/SourceBadge.tsx (Stufe 3): reusable React component.
  Small R1/R2/R3 pill with click-expand tooltip showing source
  regulation + attribution string + render-full-text policy. Will be
  embedded into IACE hazards/mitigations, VVT items, DSFA controls in
  follow-up commits.

Two stages of the four-stage renderer are now ready. Stufe 2 (PDF
auto-footer) + Stufe 4 (tech-file appendix) follow once the existing
PDF generators are extended to call /licenses/aggregate.
2026-05-21 21:00:10 +02:00
Benjamin Admin cb5dad1a2f feat(audit): A audit transparency + B table parse + D HTML tables from DOM
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-python-backend (push) Successful in 45s
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 20s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Three related fixes for the VW finding (6 vendors instead of 100+):

A: audit_quality_checks.py: three systemic caveats that are ALWAYS shown
prominently:
* banner_detected=False despite a cookie doc → HIGH 'CMP-Tool ungeladen'
  (CMP tool not loaded)
* cookie_doc >= 30k chars but cmp_vendors < 15 → HIGH/MEDIUM
  'Vendor-Liste auffaellig kurz fuer Doc-Groesse' (vendor list
  conspicuously short for the doc size)
* submitted URL but 0/mini text → MEDIUM 'URL nicht ladbar' (URL not loadable)
Red audit-caveat box above the GF one-pager. The GF summary says
'Audit unvollstaendig' (audit incomplete) instead of falsely saying
'Keine kritischen Themen' (no critical issues).
gf_one_pager includes audit_quality_findings in top_findings
(BEFORE other findings).

B: cookies_table_parser now also runs on crawled cookie doc text
(not only on user paste). If the dsi-discovery response delivers
tab/pipe-separated table rows, we parse them deterministically.

D: consent-tester/dsi-discovery now extracts, in addition to the
text, the <table> elements from the DOM as list[str] (tab-separated per
row, at least 2 cells, at least 3 rows, max 10 tables per doc). The backend
injects these as an 'html_table' cmp_payload and runs them through
cookies_table_parser first → 100% deterministic vendor extraction without an LLM.
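
The table limits in D (at least 2 cells per row, at least 3 rows, at most 10 tables) can be sketched over static HTML with the standard-library HTMLParser. The real extraction runs inside consent-tester/dsi-discovery and may work on a live DOM. The class and function names here are hypothetical, nested tables are not handled, and the sketch returns one list[str] per table.

```python
from html.parser import HTMLParser


class _TableCollector(HTMLParser):
    """Collects tables as lists of rows, each row a list of cell strings."""

    def __init__(self) -> None:
        super().__init__()
        self.tables: list[list[list[str]]] = []
        self._row: list[str] | None = None
        self._cell: list[str] | None = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Collapse internal whitespace so each cell is a single line.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_dom_tables(html: str, min_cells: int = 2, min_rows: int = 3,
                       max_tables: int = 10) -> list[list[str]]:
    """Return up to max_tables tables, each as tab-separated row strings."""
    collector = _TableCollector()
    collector.feed(html)
    result: list[list[str]] = []
    for table in collector.tables:
        rows = ["\t".join(cells) for cells in table if len(cells) >= min_cells]
        if len(rows) >= min_rows:
            result.append(rows)
        if len(result) == max_tables:
            break
    return result
```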

VW expectation: 30-50 vendors are now parsed deterministically from the
65k cookie table instead of 6 from the LLM cascade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:21:28 +02:00
Benjamin Admin e411c4f0d3 feat(audit): text-paste mode per row - optionally bypass the crawler
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / nodejs-build (push) Successful in 3m27s
CI / iace-gt-coverage (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Failing after 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / test-python-backend (push) Successful in 47s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Background: via the URL crawler, VW yields only 6 vendors instead of the 100+
that are in the real cookie table. But if the user can copy the table
directly from the site (which is possible on most OEM sites),
we bypass the crawler entirely and parse the text deterministically.

Backend:
* doc_type_classifier.py: 7 pattern groups (§5 TMG, Art.13 DSGVO,
  T&C clauses, withdrawal period, cookie table headers, etc.). If the
  user pastes text into the wrong doc-type field (Impressum->DSE),
  detect_mismatch returns detected + action ('reclassify' at very high
  confidence, 'warn' at medium).
* cookies_table_parser.py: tab/pipe/comma/semicolon separator
  auto-detection, column mapping via header keyword. Aggregates cookie
  entries into vendor records (with _guess_vendor fallback). Fully
  deterministic, no LLM.
* doc_input_warnings.py: mail block above the audit that makes mismatches +
  auto-reclassifications transparent to the user.
* Pipeline: text wins over url (already noted in the schema), new
  fields declared_doc_type / input_source / reclassify_hint in doc_entries.
  Pasted-table vendors take precedence over library fallback + LLM cascade
  (they are 100% accurate).

Frontend (DocCheckTab):
* Pro Row Mode-Toggle 'URL' / 'Text einfuegen' (lila wenn aktiv).
* Textarea (h-32, monospace) im text-mode mit kontext-spezifischem
  Placeholder (Cookie-Hinweis ggue. anderen Doc-Types) und Live-
  Zeichen-/Wort-Counter.
* Submit-Button accepted entries mit URL ODER text.
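
The separator auto-detection and header-keyword column mapping described
for cookies_table_parser.py might look roughly like the sketch below. It
is not the project's code: the function names, the header keywords and
the returned shape are assumptions made for illustration.

```python
from collections import defaultdict

SEPARATORS = ["\t", "|", ";", ","]  # assumed candidate order

# Hypothetical header keywords mapped to logical column names.
HEADER_KEYWORDS = {
    "name": ("name", "cookie"),
    "vendor": ("anbieter", "vendor", "provider"),
    "purpose": ("zweck", "purpose"),
}


def detect_separator(lines):
    """Pick the separator that occurs most often in the header line."""
    header = lines[0]
    return max(SEPARATORS, key=header.count)


def map_columns(header_cells):
    """Map logical column names to cell indexes via header keywords."""
    mapping = {}
    for idx, cell in enumerate(header_cells):
        lowered = cell.strip().lower()
        for column, keywords in HEADER_KEYWORDS.items():
            if column not in mapping and any(k in lowered for k in keywords):
                mapping[column] = idx
                break
    return mapping


def parse_cookie_table(text):
    """Aggregate pasted cookie rows into {vendor: [cookie names]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return {}
    sep = detect_separator(lines)
    cols = map_columns(lines[0].split(sep))
    if "name" not in cols:
        return {}
    vendors = defaultdict(list)
    for line in lines[1:]:
        cells = [c.strip() for c in line.split(sep)]
        if len(cells) <= cols["name"]:
            continue  # row too short to contain a cookie name
        idx = cols.get("vendor")
        vendor = cells[idx] if idx is not None and idx < len(cells) else ""
        vendors[vendor or "unknown"].append(cells[cols["name"]])
    return dict(vendors)
```

Because every step is plain string handling, the same pasted table always
yields the same vendor records, which is the property the commit relies on.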

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 18:58:32 +02:00
Benjamin Admin 7335f64f4f feat(founding-wizard): Per-Person IP-Assignment + Prefill + E2E-Tests
CI / loc-budget (push) Failing after 20s
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 19s
CI / nodejs-build (push) Successful in 3m17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
The wizard now supports 2-4 shareholders, each with an individual IP area:
- One IP assignment contract per founder (e.g. Benjamin: Compliance+RAG;
  Sharang: Security+Infrastructure). One separate service contract per
  managing director (GF).
- Step 1: prefill button from the company profile + fields for the
  register court and the HRB number.
- Step 2: role dropdown (CEO/CTO/CFO/COO/CPO/GF/Sonstige) instead of
  free-text input, IP-areas textarea per person.

Backend:
- generate_documents() iterates per person for PER_PERSON_DOCS.
- _build_person_context() injects ASSIGNOR_*, GF_*, IP_LIST_DETAILS
  from person.ip_areas.
- base_context() propagates basics.register_court and basics.hrb_number.

Tests:
- 30/30 pytest green (6 new: per-person context, slug helper,
  register-court propagation).
- 4 new Playwright E2E specs (hermetic via route.fulfill, with
  console/page-error traps): complete 8-step flow, prefill error path,
  step navigation/reset, role dropdown + IP areas.
- The spec sets 'bp-sdk-cookie-consent' in addInitScript so that the
  CookieBannerOverlay does not cover the wizard buttons.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 18:49:10 +02:00
Benjamin Admin 138d9068c4 fix(audit): VW cookie table, strengthened library fallback + pattern extract
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
VW lesson: cmp_vendors=6 (all coarse LLM results) was rated as sufficient
although the real cookie table has 30+ entries. 3 fixes:

1. fallback_vendors_for_run skip threshold: existing_vendor_count >= 3
   was too low. Now it skips only if there are < 5 cookies AND >= 5
   vendors already present.

2. The library fallback is now invoked at < 20 cmp_vendors (instead of
   < 3). VW-typical setups (6 coarse LLM vendors + 30 from the library)
   thus get a complete vendor list.

3. _extract_cookie_names_from_doc: regex pattern extraction from the
   cookie doc text itself; looks for 'NAME Tracking Cookies (Marketing)'
   etc. Finds cookie names that do NOT end up in the browser jar (e.g.
   are only loaded after consent). These are additionally matched
   against the library.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 18:32:07 +02:00
Benjamin Admin c281464071 feat(audit): P71 JC-vs-AVV decision tree
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / test-python-backend (push) Successful in 39s
CI / test-python-document-crawler (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
jc_avv_decision.py: detect_ambiguous_jc_avv checks whether the DSE
(privacy policy) text contains both JC (joint controllership) signals
(gemeinsame Auswertung, Schwesterunternehmen, Konzern...) and AVV (data
processing agreement) signals (Auftragsverarbeiter, weisungsgebunden...).
On a match, build_jc_avv_decision_html renders a block with 4 EDPB-based
guiding questions, each with a recommendation.

Sources: EDPB Guidelines 7/2020, CJEU C-25/17, C-40/17.

Hooked into the mail render between the solutions block and the VVT.
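
A minimal sketch of the ambiguity check, assuming plain substring
matching on the lower-cased text. Only the function name and the listed
signal words come from the commit message; the signal lists are
incomplete there, and the return shape is hypothetical.

```python
# Signal words quoted in the commit message (the real lists are longer).
JC_SIGNALS = ("gemeinsame auswertung", "schwesterunternehmen", "konzern")
AVV_SIGNALS = ("auftragsverarbeiter", "weisungsgebunden")


def detect_ambiguous_jc_avv(dse_text):
    """Flag a text that carries both JC and AVV signals."""
    lowered = dse_text.lower()
    jc_hits = [s for s in JC_SIGNALS if s in lowered]
    avv_hits = [s for s in AVV_SIGNALS if s in lowered]
    return {
        "ambiguous": bool(jc_hits and avv_hits),
        "jc_hits": jc_hits,
        "avv_hits": avv_hits,
    }
```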

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:31:37 +02:00
Benjamin Admin 6dc427a754 fix(audit): VW-404-Recovery + P52 LLM-Merge + P51 Banner-UX-Checks
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
VW 404 fix: submitted_types now counts only doc types with >= 200
characters of real text. An entered URL that returns a 404 or minimal
text (VW cookie-richtlinie.html) is treated as 'missing', so that
auto-discovery tries alternative URLs on the homepage. In-place update
instead of a duplicate entry; rejected_url is retained for audit
transparency.

P52 LLM cascade merge: vendor_llm_extractor now runs at < 5 vendors
(not only at 0), and its results are merged WITH the existing
cmp_vendors instead of overwriting them. VW-typical setups (generic
CMP + 0 cmp_payloads) thus additionally get the text-based vendor layer.

P51: banner_consistency_checks extended:
* check_banner_copyability: scans banner_html for user-select:none /
  oncopy=return false / onselectstart. MEDIUM finding if the banner text
  cannot be copied (Art. 7(2) GDPR).
* check_consent_history: checks for 'Meine Einwilligungen' / consent
  history / Datenschutz-Cockpit. MEDIUM if there is no visible history
  (Art. 7(3): withdrawal must be as easy as giving consent).
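
The 200-character rule and the copyability scan can be sketched as
follows. This is an illustration under assumptions: the doc_entries
shape, the exact regexes and the finding dict are hypothetical; only the
threshold, the three markers and the MEDIUM severity are taken from the
text above.

```python
import re

MIN_TEXT_CHARS = 200  # threshold named in the commit message

COPY_BLOCK_PATTERNS = [
    re.compile(r"user-select\s*:\s*none", re.IGNORECASE),
    re.compile(r"oncopy\s*=\s*\W?\s*return\s+false", re.IGNORECASE),
    re.compile(r"onselectstart", re.IGNORECASE),
]


def submitted_types(doc_entries):
    """Doc types that actually delivered enough text to be audited."""
    return {
        e["doc_type"]
        for e in doc_entries
        if len((e.get("text") or "").strip()) >= MIN_TEXT_CHARS
    }


def check_banner_copyability(banner_html):
    """Return a MEDIUM finding if the banner blocks copying, else None."""
    hits = [p.pattern for p in COPY_BLOCK_PATTERNS if p.search(banner_html)]
    if not hits:
        return None
    return {"severity": "MEDIUM", "evidence": hits}
```

A doc type that falls out of submitted_types would then be handed to
auto-discovery as 'missing'.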

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:27:55 +02:00
Benjamin Admin 309c10c203 feat(audit): P72 MC-Scope-Filter + P73 MC-Solution-Generator
CI / detect-changes (push) Successful in 12s
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P72: rag_document_checker LEFT JOINs canonical_controls.scope_doc_type.
_filter_by_canonical_scope drops MCs whose scope explicitly points to
an incompatible doc type (mapping in _SCOPE_COMPATIBLE). Conservative:
'other'/NULL/'process' stay in, because heuristic v1 is not yet strong
enough for hard filtering.

Expected effect: ~10-15% fewer irrelevant MCs per doc, because e.g. a
TOM MC no longer shows up as a DSE finding.

P73: mc_solution_generator.py: a Qwen->OVH cascade generates, for each
HIGH/CRITICAL fail, a concrete insertion recommendation with an anchor
(where + what) and an effort estimate. JSON schema {solution_text,
anchor_hint, effort_min}. In-process LRU cache (500 entries) keyed by
(mc_id, doc_md5).

Max 3 solutions per doc type, global cap 8, which keeps latency < 60s.
The blocks are hooked into the mail render below the VVT as
'Loesungs-Vorschlaege (KI-generiert)'. Disclaimer: not legal advice,
review with the DSB (data protection officer).
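
The two caps and the cache key can be illustrated like this. The sketch
assumes a simple list of fail dicts and replaces the LLM cascade with a
placeholder; select_fails, doc_md5 and solution_for are hypothetical
names, not the ones used in mc_solution_generator.py.

```python
import hashlib
from collections import Counter
from functools import lru_cache

MAX_PER_DOC_TYPE = 3
GLOBAL_CAP = 8


def select_fails(fails):
    """Pick HIGH/CRITICAL fails, max 3 per doc type and 8 overall."""
    per_type = Counter()
    selected = []
    for fail in fails:
        if fail["severity"] not in ("HIGH", "CRITICAL"):
            continue
        if per_type[fail["doc_type"]] >= MAX_PER_DOC_TYPE:
            continue
        selected.append(fail)
        per_type[fail["doc_type"]] += 1
        if len(selected) >= GLOBAL_CAP:
            break
    return selected


def doc_md5(text):
    """Hash the document text so it can serve as part of the cache key."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


@lru_cache(maxsize=500)
def solution_for(mc_id, doc_hash):
    """Placeholder for the LLM call; cached per (mc_id, doc_md5)."""
    return {"solution_text": f"(placeholder for {mc_id})",
            "anchor_hint": "", "effort_min": 0}
```

Capping before calling the model bounds the number of LLM round trips
per run, which is what keeps the latency budget predictable.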

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:21:19 +02:00
Benjamin Admin 4183379dc5 feat(audit): P33 3-column vendor consistency (DSE/cookie doc/banner)
CI / detect-changes (push) Successful in 11s
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / loc-budget (push) Failing after 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 44s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
check_three_source_vendor_consistency: scans the DSE, cookie-doc and
banner vendor lists for 15 typical vendor signatures (Google Analytics,
Meta Pixel, Hotjar, HubSpot, LinkedIn Insight, ...). Lists vendors that
appear in at least one source but not in all sources_with_data.

Returns a MEDIUM finding with a concrete 'fehlt in: DSE, Banner-Liste'
list per vendor. Recommendation: maintain a central vendor list and
propagate it to all three document types. (Art. 13(1)(c)+(e) GDPR)
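
A sketch of the comparison, assuming each source is available as plain
text and that an empty text means "no data". The three signatures and
their needles are illustrative stand-ins for the 15 real ones, and the
finding dict is a hypothetical shape.

```python
# Illustrative subset; the commit mentions 15 signatures.
VENDOR_SIGNATURES = {
    "Google Analytics": ("google analytics", "_ga"),
    "Meta Pixel": ("meta pixel", "facebook pixel", "_fbp"),
    "Hotjar": ("hotjar",),
}


def check_three_source_vendor_consistency(sources):
    """sources maps a source name (DSE, Cookie-Doc, Banner) to its text."""
    sources_with_data = {
        name: text.lower()
        for name, text in sources.items()
        if text and text.strip()
    }
    if len(sources_with_data) < 2:
        return []  # nothing to compare against
    findings = []
    for vendor, needles in VENDOR_SIGNATURES.items():
        present = {
            name
            for name, text in sources_with_data.items()
            if any(n in text for n in needles)
        }
        missing = sorted(set(sources_with_data) - present)
        if present and missing:
            findings.append(
                {"vendor": vendor, "severity": "MEDIUM", "missing_in": missing}
            )
    return findings
```

Sources without data are left out of the comparison, so a site without a
crawlable banner list does not produce a finding for every vendor.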

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:11:47 +02:00
Benjamin Admin c93c88577c feat(audit): P88 PDF-Export via WeasyPrint
CI / detect-changes (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
GET /api/compliance/agent/snapshots/{id}/pdf returns application/pdf
with the full audit mail content in an A4 print layout (header with
site/timestamp/snapshot ID, page numbers at the bottom right).

check_replay.py now additionally returns 'full_html' (not just the
500-char preview) so that the PDF renderer has the complete HTML.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:06:48 +02:00
Benjamin Admin 3207acea3e fix(audit): add P35/P77/P78/P36 signals block to the replay pipeline
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 44s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
check_replay.py now also renders the text-signal findings (save-label
ambiguity, cookies-in-DSE acceptance, JC clause positive, social embeds).
The replay test thus has parity with the real mail pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:04:02 +02:00
Benjamin Admin 9f06911ff9 feat(audit): cookie library fallback for the VW pattern (no known CMP)
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 41s
If, after standard extract + Phase G + LLM cascade, there are still < 3
cmp_vendors but >= 5 cookies in after_accept (typical: a custom CMP such
as VW 'cookiemgmt'), the fallback matches the cookie names against
compliance.cookie_library and reconstructs vendor records from the
library entries.

Background: VW run de2a029e shows 4 vendors despite 28 after_accept
cookies. cmp_payloads is 0 (no known IAB tool detected) and the stored
cookie URL returns a 404. The DSE, at 34k, is substantial but does not
list a vendor table.
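
The trigger condition and the name matching can be sketched as below.
The library is modelled as a plain dict from cookie name to vendor, which
is an assumption; the real compliance.cookie_library lookup and the
record fields are not shown in the commit.

```python
def fallback_vendors_from_library(cmp_vendors, after_accept_cookies, cookie_library):
    """Rebuild vendor records from cookie names when the CMP yields too little.

    cookie_library: {cookie_name: vendor_name} (assumed shape).
    """
    # Thresholds as stated in this commit: < 3 vendors but >= 5 cookies.
    if len(cmp_vendors) >= 3 or len(after_accept_cookies) < 5:
        return []
    by_vendor = {}
    for name in after_accept_cookies:
        vendor = cookie_library.get(name)
        if vendor:
            by_vendor.setdefault(vendor, []).append(name)
    return [
        {"vendor": vendor, "cookies": names, "source": "cookie_library"}
        for vendor, names in sorted(by_vendor.items())
    ]
```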

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:00:49 +02:00
Benjamin Admin 338e03d3b0 feat(audit): P34 exec summary score classification, 'where you should be'
CI / detect-changes (push) Successful in 10s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m46s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
_score_band_explanation: four bands (very good / acceptable / action
needed / elevated risk) provide a label + the expected action. Rendered
as a new row below the KPIs in the exec summary (with a score-colored
Linkmark).

Factual tone: not 'the board must act immediately' but a realistic
recommendation (e.g. '70-84: industry median, one-time cleanup +
half-yearly check').

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:51:34 +02:00
Benjamin Admin c491af5d02 feat(audit): P47 localStorage quota, safeSetItem with auto-prune
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 13s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m47s
storageHelpers.ts: safeSetItem catches QuotaExceededError, prunes old
doc-check-result-* entries (oldest first, MAX_KEEP=10) and retries. On
a second failure it prunes more aggressively.

DocCheckTab.tsx uses safeSetItem instead of setItem for
doc-check-results, result keys and history. Prevents silent data loss +
a crash when the ~5MB localStorage limit is reached.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:47:42 +02:00
Benjamin Admin 4171cf0efd feat(audit): P36 social media embedding check
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / detect-changes (push) Successful in 9s
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / test-python-backend (push) Successful in 44s
CI / test-python-document-crawler (push) Has been skipped
check_social_embedding: detects direct FB/Insta/Twitter/YouTube embeds
(connect.facebook.net, platform.twitter.com etc.) vs. Heise Shariff vs.
2-click solutions (Embetty).

Direct embeds without protection = HIGH (CJEU C-40/17 Fashion ID: the
site operator becomes a joint controller and needs consent BEFORE the
third-party call).
Shariff or 2-click detected = INFO (positive signal).
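
A sketch of the three-way classification using substring checks on the
page HTML. connect.facebook.net and platform.twitter.com come from the
text above; the Instagram and YouTube markers, the order of the checks
and the return shape are assumptions.

```python
DIRECT_EMBED_HOSTS = (
    "connect.facebook.net",     # named in the commit message
    "platform.twitter.com",     # named in the commit message
    "www.instagram.com/embed",  # assumed marker
    "www.youtube.com/embed",    # assumed marker
)
PROTECTED_MARKERS = ("shariff", "embetty")


def check_social_embedding(html):
    """Classify social embeds as INFO (protected), HIGH (direct) or None."""
    lowered = html.lower()
    if any(marker in lowered for marker in PROTECTED_MARKERS):
        return {"severity": "INFO", "reason": "privacy-friendly embed detected"}
    direct = [host for host in DIRECT_EMBED_HOSTS if host in lowered]
    if direct:
        return {"severity": "HIGH", "reason": "direct embed", "hosts": direct}
    return None
```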

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:45:12 +02:00
Benjamin Admin 30e43afba6 feat(audit): P86 industry benchmark + P35/P77/P78 text signals
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
P86: industry_benchmark.py: pulls all snapshots with the same
scan_context.industry, computes median + percentile, renders
'You 42%, automotive median 58% (sample: 12)'. Min sample 3.

P35: banner_text 'Speichern' without 'Ablehnen' = MEDIUM. Ambiguous
label per the EDPB 03/2022 deceptive design guidelines.

P77: a DSE with a prominent cookie section (vendor hints: Speicherdauer,
Anbieter, Datenkategorie) replaces the requirement for a separate
cookie policy. Positive signal instead of a false positive.

P78: Art. 26 clause detected in the DSE text → positive signal
'JC-Konstrukt dokumentiert'. Avoids a false positive for cooperation
between group sister companies.

All hooked into the mail: industry block after the GF 1-pager, signals
block after the consistency check.
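
The P86 computation can be sketched with the standard library. The
percentile definition used here (share of peers scoring strictly below
the site) is an assumption, as are the function name and the result
fields; only the median and the minimum sample of 3 come from the text.

```python
from statistics import median

MIN_SAMPLE = 3  # below this, no benchmark is rendered


def industry_benchmark(own_score, peer_scores):
    """Compare one site's score with peers from the same industry."""
    if len(peer_scores) < MIN_SAMPLE:
        return None
    below = sum(1 for s in peer_scores if s < own_score)
    return {
        "median": median(peer_scores),
        "percentile": round(100 * below / len(peer_scores)),
        "sample": len(peer_scores),
    }
```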

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:43:15 +02:00
Benjamin Admin df8832c521 feat(audit): P75 Banner-vs-CMP + P84 Diff-Mode + P74/P96/P97 Doc-Types
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P75: check_banner_vs_cmp_partner_count: if the banner text mentions
'N Partner' and N < cmp_vendors * 0.6, HIGH finding (Art. 13(1)(e)
GDPR). Detects understatement of the actual vendor count.

P84: run_diff.py: compares the current run with the last snapshot of
the same site (set diff on normalized finding labels). Block above the
GF 1-pager: 'Seit letztem Lauf: X Findings weg, Y neue'. USP: none of
the major providers has this.

P74/P96/P97: labels for legal_notice (legal notices / IP /
forward-looking), dsa (Art. 12+17 Digital Services Act), lizenzhinweise
(OSS compliance) registered in _DOC_TYPE_LABELS. Actual checks for
mandatory information will follow separately.
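
Both P75 and P84 reduce to a few lines. The sketch below assumes the
partner count appears as '<number> Partner' in the banner text and that
label normalization means collapsing whitespace and lower-casing; the
helper names normalize and diff_runs are hypothetical.

```python
import re


def check_banner_vs_cmp_partner_count(banner_text, cmp_vendor_count):
    """HIGH if the banner states fewer than 60% of the actual vendors."""
    match = re.search(r"(\d+)\s+Partner", banner_text)
    if not match:
        return None
    stated = int(match.group(1))
    if stated < cmp_vendor_count * 0.6:
        return {"severity": "HIGH", "stated": stated, "actual": cmp_vendor_count}
    return None


def normalize(label):
    """Assumed normalization: collapse whitespace, lower-case."""
    return re.sub(r"\s+", " ", label).strip().lower()


def diff_runs(previous_labels, current_labels):
    """Set diff on normalized finding labels between two runs."""
    prev = {normalize(label) for label in previous_labels}
    curr = {normalize(label) for label in current_labels}
    return {"resolved": sorted(prev - curr), "new": sorted(curr - prev)}
```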

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:38:25 +02:00
Benjamin Admin 7842c95532 feat(audit): P92 CMP tool availability + P94 banner-vs-cookie-doc consistency
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P92: if the user clicks 'Anpassen'/'Einstellungen' and the CMP settings
area does not load without errors (error, timeout, <80 characters
without categories, no toggles), this is a HIGH finding. Granular
choice exists formally but is not functional in practice (Art. 7(3)
GDPR + EDPB 03/2022).

P94: cookie list in the banner settings vs. the cookie policy. A
heuristic extracts cookie names from the cookie doc text (regex on
typical camelCase/_underscored patterns + vendor prefixes
_ga/_gid/ot_/uc_).
If |only_in_doc| >= 5 OR |only_in_banner| >= 3 → MEDIUM finding.
|only_in_doc| >= 15 AND |only_in_banner| >= 5 → HIGH.

Both findings land in the new mail block 'Banner-Konsistenz-Pruefung'
(amber-yellow) between the mismatch block and the VVT. Also hooked into
check_replay.py.
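
The P94 thresholds translate directly into a small severity function.
This is a sketch: the function name is hypothetical, and it assumes the
cookie names have already been extracted from both sources.

```python
def banner_vs_doc_severity(doc_cookies, banner_cookies):
    """Apply the P94 thresholds to two collections of cookie names."""
    only_in_doc = set(doc_cookies) - set(banner_cookies)
    only_in_banner = set(banner_cookies) - set(doc_cookies)
    # HIGH is checked first because it is the stricter condition.
    if len(only_in_doc) >= 15 and len(only_in_banner) >= 5:
        return "HIGH"
    if len(only_in_doc) >= 5 or len(only_in_banner) >= 3:
        return "MEDIUM"
    return None
```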

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:31:19 +02:00
Benjamin Admin 08671adfdf feat(audit): P82 GF 1-pager + P87 confidence score per finding
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 18s
CI / loc-budget (push) Failing after 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
P82: gf_one_pager.py: compact 5-bullet summary at the very top of the
mail. Score (large + color), delta to the previous run, top findings
sorted by HIGH/MEDIUM with the responsible role (DSB / Marketing / IT /
Legal / Web-Team) and classification bits from the wizard.
Factual tone: no 4% threat, '4-8 Wochen' as a realistic time frame.
Hooked in before the critical-findings block in the mail composition
and the replay pipeline.

P87: finding_confidence.py: 13 regex rules return (confidence_pct,
reason) per finding label. Directly observable in the DOM = 95-98%,
library mismatch = 82%, text-pattern match on mandatory information =
75-88%. Rendered in the 1-pager as a small '(NN% Konfidenz)' tag with a
reason tooltip after each finding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:20:19 +02:00
Benjamin Admin 50fc0ecc59 feat(audit): P79 pre-scan wizard (8 mandatory fields) + P99 extended + P102 replay fix
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / nodejs-lint (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m56s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P79: PreScanWizard.tsx with 8 mandatory fields (industry, B2B/B2C,
direct sales, legal form, group structure, employee count, special
categories of data, third country). Scan button disabled until all 8
are filled in. The values end up in scan_context and, via the backend,
in compliance_check_snapshots.

P99: DOC_TYPES extended with dsa + legal_notice + lizenzhinweise +
nutzungsbedingungen. The add-URL button was already there.

P102 (replay bug): check_replay.py now reads e.get('text') instead of
only full_text, since the snapshot schema uses 'text'. The
library-mismatch block is thus also shown in the replay.

Backend: ComplianceCheckRequest.scan_context is optional; save_snapshot
persists it in compliance_check_snapshots.scan_context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 15:59:01 +02:00
Benjamin Admin 94057b1536 feat(audit): VW-Cookie-Bug-Fix + P101/P102 Cookie-Library-Mismatch-Findings
CI / loc-budget (push) Failing after 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
VW bug B1: extract_vendors_via_llm had max_text_chars=12000 -> for the
VW cookie doc (60k chars, 100 cookies in a table) 80% was cut off and
the LLM extracted only 1 vendor. Fix: max_text_chars=50000, num_predict
6000->16000 for more vendor output, Ollama timeout 120s->420s.

P101 aggregator script (backend-compliance/scripts/cookie_library_enrich.py)
walks through all compliance_check_snapshots and extracts (cookie_name,
declared_category, observed_sites). First evaluation over 8 snapshots:
101 unique cookies, 47 in the library, 54 unknown, 18 mismatches.

P102 cookie classification check as a mail block. Compares the
site-declared category with the library + vendor documentation. HIGH if
the library says 'marketing' but the site declares 'essential' or
'statistics' (de facto third-country/advertising processing is hidden).
MEDIUM otherwise. Built into the agent_compliance_check_routes mail
composition + replay pipeline.
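
The P102 severity rule is small enough to state as code. A sketch,
assuming both categories arrive as plain strings and that identical
categories produce no finding; the function name is hypothetical.

```python
def classification_mismatch_severity(declared_category, library_category):
    """Severity of a mismatch between site declaration and cookie library."""
    declared = declared_category.strip().lower()
    library = library_category.strip().lower()
    if declared == library:
        return None  # no mismatch, no finding
    if library == "marketing" and declared in ("essential", "statistics"):
        return "HIGH"
    return "MEDIUM"
```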

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 15:47:11 +02:00
Benjamin Admin 9c11b5463c fix(audit): P98 + P100, cookie table whitespace + 'Anpassen' button check
CI / detect-changes (push) Successful in 11s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 18s
CI / loc-budget (push) Failing after 17s
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P98: in the VW cookie policy, HTML table cells were concatenated without
whitespace ('smartSignals2UiDsmartSignals2sUiDsmartSignals2CPs...').
Cause: el.textContent ignores block-element boundaries. Fix: innerText
(whitespace-respecting) instead of textContent. Cookie names are now
recognized individually; the VW run should find ~100 cookies instead
of 1.

P100: banner check for an 'Anpassen'/'Einstellungen' button in the
initial banner. VW pattern: only 2 buttons (Nur technisch notwendige /
Alle akzeptieren), no granular choice before accepting/rejecting. De
facto manipulation toward blanket acceptance. HIGH finding per EDPB
5/2020 §82.
Pattern: anpassen/einstellungen/cookie-einstellungen/manage cookies/
preferences/customize.
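
The P100 check can be sketched as a case-insensitive search over the
button labels of the initial banner. The alternation is the pattern
listed above; how the labels are collected from the page, the function
name and the finding dict are assumptions.

```python
import re

CUSTOMIZE_PATTERN = re.compile(
    r"anpassen|einstellungen|cookie-einstellungen|manage cookies"
    r"|preferences|customize",
    re.IGNORECASE,
)


def check_customize_button(button_labels):
    """Return a HIGH finding if no button offers a granular choice."""
    if any(CUSTOMIZE_PATTERN.search(label) for label in button_labels):
        return None
    return {"severity": "HIGH", "reason": "no customize/settings button"}
```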

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 15:08:33 +02:00
Benjamin Admin 50ed0f45af fix(replay): P80, removed DocCheckResult import (does not exist in runner)
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 36s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
Previously I had hotfixed the container but never committed the fix.
On the next rebuild the bug came back from the image.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 12:25:04 +02:00
Benjamin Admin e1df24cad7 fix(audit): P93+P95 - reject wording extended + vendor-centric cookie format accepted
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / test-python-backend (push) Successful in 38s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
P93: 'Cookies verbieten', 'Tracking ablehnen', 'verweigern' etc. now count
as an explicit reject mechanism. EDPB 5/2020 does not prescribe a specific
word, so the BMW false positive 'Kein Ablehnen-Mechanismus' (no reject
mechanism) is gone.

P95: The cookie_table check now accepts two equivalent formats:
(a) classic table, (b) vendor detail page with one block per vendor
(name + address, purpose, aggregated retention period, list of cookie names,
opt-out link). The BMW style with an Adform block conforms to DSK-OH 2024.
The false positive 'tabellarisches Cookie-Verzeichnis fehlt' (tabular cookie
directory missing) becomes less frequent.

Hint text in cookie_table reworded: it names both acceptable
formats and is less normative.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 12:21:29 +02:00
Benjamin Admin e5b4672f2a fix(audit): P90 - auto-discovery timeout 180s -> 300s for the BMW homepage
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 39s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 12:05:41 +02:00
Benjamin Admin 0d5c76ea98 fix(audit): P90-B1 - DSI discovery timeout 120s -> 240s for the BMW legal notice (Impressum)
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 13s
CI / loc-budget (push) Failing after 15s
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 38s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
BMW-fafcb090 showed a 'ReadTimeout' exception in the consent-tester call for
anbieterkennzeichnung.html. The discovery run follows 3 sub-documents
(insurance intermediary, supervisory authority, professional regulations)
plus ePaaS captures and regularly needs >120s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 11:52:59 +02:00
Benjamin Admin 54f5a06c2f fix(audit): P90 diagnosis - verbose exception for fetch + auto-discovery
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 38s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
BMW run 760de886 has 0 cmp_payloads although consent-tester captured ePaaS 4x.
The backend log shows 'Consent-tester fetch failed for ...anbieterkennzeichnung.html: '
with an EMPTY exception string. 'auto-discovery failed for https://www.bmw.de/: '
is empty as well. Quick fix: str(e) + type(e).__name__ in both except blocks,
so that the next BMW run makes the real error visible.
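
Why the log message was empty: many exceptions raised without arguments stringify to ''. A minimal sketch of the logging pattern named above, assuming a hypothetical helper `describe_exception` (the commit itself only states that str(e) and type(e).__name__ are logged):

```python
def describe_exception(exc):
    """Formats an exception as 'TypeName: message'.

    str(exc) is empty when the exception was raised without arguments,
    so the type name is always included and a placeholder fills the gap.
    """
    message = str(exc) or "<no message>"
    return f"{type(exc).__name__}: {message}"


try:
    raise TimeoutError()  # stands in for the ReadTimeout seen in the BMW run
except Exception as e:
    print(f"Consent-tester fetch failed: {describe_exception(e)}")
```

Logging the type name costs nothing and turns an empty line into something that can be searched for.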

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 11:45:28 +02:00
Benjamin Admin 86b4a263d2 fix(audit): P90-B1 - do not discard cmp_payloads when the DSE (privacy policy) text is short
CI / detect-changes (push) Successful in 9s
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / test-go (push) Failing after 41s
CI / iace-gt-coverage (push) Successful in 25s
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-python-backend (push) Successful in 35s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
BMW run 9811eba1 had 0 cmp_vendors although consent-tester captured ePaaS 4x
(~393KB). Root cause in _fetch_text, line 1254:

  if merged and len(merged.split()) > 100:
      return merged, cmp_payloads

If the DSE/cookie URL returns only short SPA shell text (BMW: 10 words),
the threshold is not met and the code falls through to the HTTP fallback,
which returns text, []. The previously captured CMP payloads (ePaaS JSON
with all vendor data) are discarded completely.

Fix: before the HTTP fallback, check whether cmp_payloads are present. If so,
return them together with the (short) text or the reconstructed
cmp_cookie_text, even without the 100-word threshold.
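
The decision order can be sketched as follows. This is a simplified stand-in, not the real _fetch_text: the function name `choose_fetch_result`, its parameters, and the preference for the short text over cmp_cookie_text are assumptions for illustration.

```python
def choose_fetch_result(merged, cmp_payloads, cmp_cookie_text, http_fallback):
    """Picks the (text, payloads) pair to return from a fetch.

    merged          -- text rendered by the consent-tester (may be short)
    cmp_payloads    -- CMP payloads captured during rendering
    cmp_cookie_text -- text reconstructed from those payloads
    http_fallback   -- zero-argument callable that fetches plain HTTP text
    """
    # Original fast path: enough rendered text to trust it.
    if merged and len(merged.split()) > 100:
        return merged, cmp_payloads

    # New guard: never drop captured payloads just because the text is short.
    if cmp_payloads:
        return (merged or cmp_cookie_text), cmp_payloads

    # Only now fall back to plain HTTP, which cannot carry payloads.
    return http_fallback(), []


text, payloads = choose_fetch_result(
    merged="ten words of SPA shell text",
    cmp_payloads=[{"vendor": "example"}],
    cmp_cookie_text="reconstructed cookie text",
    http_fallback=lambda: "plain http text",
)
print(text, len(payloads))
```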

Effect: the BMW VVT table gets populated (~90 vendors from the ePaaS JSON).
Mercedes and other OEMs are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 11:29:41 +02:00
Benjamin Admin 7938e377b6 feat(audit-tonality): P89/P76/P91 - co-pilot instead of robot lawyer
CI / branch-name (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Failing after 48s
CI / iace-gt-coverage (push) Successful in 25s
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
User feedback in one session: "We only create panic. No matter what it says,
it takes weeks. We are a tool at the side of the CMO/GF (managing director)/CIO,
not an opponent."
Memory: feedback_breakpilot_tonalitaet.md (applies to ALL modules + marketing).

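P89  Critical-findings block REMOVED/REWORKED: no more red panic box.
     - Instead of "🚨 SOFORTMASSNAHMEN ERFORDERLICH" (immediate action
       required) -> "Zusammenfassung fuer die Geschaeftsfuehrung" (summary
       for management), subtle blue block
     - Instead of "VERSTOSSE" (violations) -> "Themen zur Besprechung mit
       DSB, Marketing und Entwicklung" (topics to discuss with the data
       protection officer, marketing and development)
     - Instead of leading with "Bussgeldrahmen 4% Weltumsatz" (fine range 4%
       of worldwide revenue) -> realistic assessment (0.1-1%) in a subtle
       closing note with a confidence hint
     - "Sofortmassnahme" (immediate action) -> "Empfehlung" (recommendation)
     - "Themen 1, 2, 3..." (topics 1, 2, 3...) instead of "HIGH" badges
       (preparation for P87)
     - Explicit time estimate "4-8 Wochen (DSB -> Agentur -> Dev -> Freigabe)"
       (4-8 weeks: DPO -> agency -> dev -> approval)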

P76  Detect Mercedes secondary buttons (privacy policy + legal notice, small,
     below the 3 main buttons). The walker now scans, label-based,
     ALL clickable elements in the shadow DOM (wb7-link, wb7-link-secondary,
     wb7-button-text, span[onclick], small a, [role=button], etc.).
     Avoids the Mercedes Impressum false positive from phase 1.

P91  VVT table renderer in the new co-pilot tone. Instead of a
     "list of violations with fine potential" -> a probability statement:
     "Reducing the number of vendors and switching to European alternatives
     is likely to reduce the tracking footprint and save licence costs.
     A well-founded assessment requires coordination with the data
     protection officer (DSB)."

BMW bugs B1-B4 (P90) are deliberately not in this commit: the BMW run captured
ePaaS 4x in the consent-tester, but the backend receives 0 cmp_payloads.
Wiring bug between consent-tester /dsi-discovery and backend
_fetch_text; it needs its own diagnosis session (see task P90).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 11:24:57 +02:00
Benjamin Admin f534b52817 feat(iace): pattern audit suite + library hygiene wave
Add cmd/iace-audit CLI with 5 deterministic methods that find engine
gaps without ground truth:

- A reachability: 1058 patterns vs achievable tag universe
- B consistency: components vs their declared hazard categories
- C vocabulary: limits-form tokens vs keyword dictionary
- D echo: limits-form sentences vs generated hazards (jaccard)
- E hierarchy: hazards vs ISO 12100 design/protection/info levels

Library fixes triggered by A+B+C findings:

- tag_resolver: synonym map for electrical/pneumatic/hydraulic aliases
- component_library: crush_point + EN03 (gravitational) on C014/C128
  (Hubwerk family) - fixes HP1014/1015/1017/1018 which were silently
  weakly_reachable. noise_source added on 7 components (C006/C011/
  C017/C020/C031/C041/C096). electrical_part on 8 drive components
  (C031/C032/C033/C034/C035/C036/C037/C038/C077/C092). cyber tag
  on 10 sensors (C081-C090) + 3 IT components (C111/C112/C116) +
  KI module C119 (ai_model added). pneumatic_part+hydraulic_part
  on valves C091/C093, hydraulic_part+chemical_risk on pump C097,
  moving_part on motion controller C075
- keyword_dictionary: EN03 added to aufzug/lift/hubwerk/hubgeraet
  (was wrongly EN04-only). New keyword entries for hub-action verbs:
  absenken/senken/anheben/heben + hubhoehe/hubweg/hubgeschwindig

Audit impact:
- A: weakly_reachable 409 -> 358 (-51 patterns now fully reachable)
- B: incomplete components 46 -> 30 (-16, -33%)
- HP1018 (Person unter absenkendem Maschinenteil eingeklemmt):
  weakly_reachable -> reachable

Why: methods A/B/C surfaced that the Kistenhubgeraet test project
generated 0 crush-under-load hazards despite OSHA 1910.212(a)(3) +
EN ISO 12100 6.3.5.5 explicitly requiring them. Three orthogonal
bugs (missing crush_point tag, wrong energy source mapping, missing
action verbs in dictionary) silently disabled the entire lift crush
pattern family.
2026-05-21 10:51:08 +02:00
Benjamin Admin 4946571863 feat(audit-pipeline): P72-v2 heuristic sharpened + P80 mini replay endpoint
CI / detect-changes (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 13s
CI / loc-budget (push) Failing after 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 36s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / nodejs-build (push) Has been skipped
P72-v2  MC scope classifier heuristic v2. v1 put 79% into the 'other' bucket
        (patterns too strict). v2 covers considerably more:
          - DSE: Art. 13/14 + data subject rights (Art. 15-22) + DSB +
            supervisory authority + retention period + special categories
          - TOM: Art. 32 + encryption/backup/pseudonymisation +
            access control + ISO 27001 + BSI-Grundschutz + audit log
          - cookie_richtlinie: tracking pixels + web storage + GA/Matomo/
            Hotjar/Pixel/GTM
          - process: VVT (Art. 30) + DSFA (Art. 35) + data breaches
            (Art. 33/34) + HinSchG + trainings + deletion concept
        The script `backfill_mc_scope_v2.py` reclassifies ONLY the
        'other' bucket (specific v1 buckets remain untouched).

P80    Mini replay endpoint (v1):
          POST /compliance-check/snapshots/{id}/replay
          ?recipient=foo@bar.com & dry_run=false
        Loads the snapshot and renders the mail with the CURRENT render code
        (P63-P67, P59b/P61/P62). Sends a [REPLAY]-prefixed mail or returns
        only HTML stats (dry_run).
        Effect: 7min re-scan -> 2-5sec for mail layout iterations.
        v2 (later): MC scorecard with the current scope_doc_type filter
        over the snapshot; requires refactoring _run_compliance_check.

Plus bugfix: GET /snapshots/{id} now raises HTTPException instead of
returning a tuple (FastAPI had returned the tuple as a JSON array).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 10:21:56 +02:00
Benjamin Admin cde670617e feat(audit-pipeline): P72 MC-Scope-Classifier + P80 Snapshot/Replay-Foundation [migration-approved]
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 14s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P72  MC scope classifier: determine for each MC the ACTUAL target document
     (cookie_richtlinie/dse/banner_implementation/cmp_audit/tom/avv/jc/
      impressum/agb/widerruf/process/accounting/other).
     - Migration 145: scope_doc_type column + index on canonical_controls
     - Backfill script with regex heuristic (12 rules, sorted by priority)
     - First 11k-sample distribution: 76% other (heuristic v1 too strict;
       v2 must add looser patterns for DSE/TOM)
     - Goal: before the MC scorecard filters, every MC knows which document
       it addresses. Previously eHealth/HGB MCs ended up in the cookie audit.

P80  Snapshot + replay foundation: persist raw data so that the
     audit pipeline can be rebuilt without a new crawl.
     - Migration 146: compliance_check_snapshots table (JSONB per
       doc_entries/banner_result/profile/cmp_vendors/scan_context)
     - services.check_snapshot.save_snapshot/load_snapshot/list
     - Endpoints GET /snapshots, GET /snapshots/{id}
     - Hook in _run_compliance_check: automatic snapshot save after
       mail send via a separate SessionLocal (background-task safe)
     - Replay endpoint follows in the next PR (needs refactoring
       of _run_compliance_check into crawl_phase + interpret_phase)
     - Effect: test cycle 7min -> 5sec for pure logic changes
       (P73/P79/P81+ benefit directly). Snapshots also serve as a
       regression test corpus (P81 Golden-Truth-Library).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 08:53:31 +02:00
Benjamin Admin 603381a67f feat(audit-mail): P58/P59c/P60b/P61/P62 - Mercedes cycle phase 1 completed
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 38s
CI / test-python-document-crawler (push) Has been skipped
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P58  Anti-audit detection more robust (script-domain + settings-specific;
     was already in the code, now cleanly documented as completed).

P59c DACH custom cookies in compliance.cookie_library: Borlabs,
     etracker, Matomo/Piwik, Userlike, Cookiebot/Cookieyes/Usercentrics,
     Akamai/Cloudflare/Datadome Bot-Manager + HubSpot. 21 new entries
     (3 of 24 already present via the Open Cookie Database).
     Script: backend-compliance/scripts/seed_dach_cookies.py.

P60b Vendor pattern dedupe with fuzzy match (Jaccard >= 0.7) instead of exact
     tuple equality. Vendors with partially filled fields (e.g.
     country of domicile entered) no longer drop out of the global notice.
     Bug: Amazon/Psyma/Qualtrics previously had repeated per-row actions.

P61  Detection of "piggybacked cookies" ("Untergeschobene Cookies"): when a
     declared vendor (e.g. Google Tag Manager) automatically brings others
     along (GA + GCL_AU + DoubleClick), these are shown as a separate mail
     block (yellow) with COOKIE/VENDOR badges + source documentation.
     New service: compliance.services.vendor_package_cookies (8 primary
     vendors with 2-4 implicit cookies/vendors each).

P62  Marketing manager disclaimer "What we see / do not see" as a
     blue box directly below the critical-findings block. Explains the
     limits of our audit (server-side tracking, vendor-internal
     data sharing, cross-page banners) and the risk of misplaced trust
     in a 100% score. New renderer: compliance.api.scope_disclaimer.
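
For the fuzzy match in P60b, a minimal sketch of Jaccard similarity with the 0.7 threshold. How the real dedupe represents a vendor pattern is not stated in the commit; modelling it as the set of still-missing field names, and the names `jaccard` and `matches_pattern`, are assumptions for illustration.

```python
THRESHOLD = 0.7  # value taken from the commit message


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (1.0 if both empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def matches_pattern(missing_fields, pattern):
    """True if a vendor's missing fields are close enough to the shared pattern."""
    return jaccard(missing_fields, pattern) >= THRESHOLD


# Shared pattern: fields that most vendors leave empty (hypothetical names).
pattern = {"purpose", "retention", "address", "country", "opt_out"}

# A vendor that filled in only its country still matches (4/5 = 0.8) ...
partly_filled = {"purpose", "retention", "address", "opt_out"}
# ... while a vendor missing just one field does not (1/5 = 0.2).
mostly_filled = {"purpose"}

print(matches_pattern(partly_filled, pattern))
print(matches_pattern(mostly_filled, pattern))
```

With exact tuple equality the first vendor would fall out of the group; with the threshold it stays covered by the global notice.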

Architecture: VVT table renderer moved out of agent_doc_check_extras.py (552
LOC -> 242 LOC) into compliance.api.vvt_table_renderer to stay within the
500-LOC hard cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 08:01:27 +02:00
Benjamin Admin 57c0f940a2 feat(consent+report): P56-P67 Mercedes-Audit-Cycle (Anti-Audit, Phase G Vendors, Cookie-Behavior-Validator + 5 Mail-Polish-Items) [migration-approved]
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / nodejs-build (push) Successful in 2m19s
CI / test-go (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 15s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 37s
P56  Anti-auditing detection as a constructive compliance finding (audit API
     recommendation instead of accusation, because Mercedes legitimately
     blocks bots)
P57  Phase G vendor_details union with cmp_vendors -> 42 vendors visible
P58  Anti-audit detection more robust (script-domain check + settings-specific)
P59  Cookie-Behavior-Validator (4 layers, 3-tier severity: MEDIUM=category
     mismatch / HIGH=purpose mismatch / CRITICAL=both=indication of intent)
     + Open Cookie Database (CC0) as library seed (2264 cookies)
P59b Cookie behavior wired into the banner check + mail block (BUGFIX:
     open SessionLocal ourselves; db was not in scope in the background task)

Mail polish after the Mercedes review:
P63  Also detect banner footer links in wb7-link/role=link (shadow DOM
     walker label-based instead of only <a href>)
P64  Re-access severity: MEDIUM instead of HIGH when a footer "Einstellungen"
     (settings) link or a Mercedes-typical equivalent exists; OEM footer
     detection (wb7-footer)
P65  Text truncation: word boundary instead of character cut (no more "einfa"
     break in the immediate-actions text)
P66  GF actions: service purpose vs cookie purpose explained explicitly
     (common confusion in marketing/management: "Akamai description" != cookie
     purpose per DSK-OH 2024)
P67  Stirring finding with a "loss framing" explanation + old-vs-neutral
     example, instead of only the EDPB technical term
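
The word-boundary truncation from P65 can be sketched as follows. This is a minimal illustration, not the report code; the function name `truncate_at_word` and the "..." suffix are assumptions.

```python
def truncate_at_word(text, limit, suffix="..."):
    """Shortens `text` to at most `limit` characters plus `suffix`,
    backing up to the previous space if the cut would split a word."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # The cut splits a word if the next character is not whitespace.
    if not text[limit].isspace():
        head, sep, _ = cut.rpartition(" ")
        if sep:  # keep the hard cut only if there is no earlier space
            cut = head
    return cut.rstrip() + suffix


# A plain character cut at 12 would end in "simpl"; this backs up one word.
print(truncate_at_word("Please simply reject", 12))
```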

Compliance-Advisor FAQ (admin agent-core/soul):
  + CNIL/EDPB top fines (Google 100M, Meta 60M, Amazon 35M)
  + German precedent (LG Muenchen Google Fonts, EuGH Planet49, BGH I ZR 7/16)
  + 4 risk paths (fine/warning letter (Abmahnung)/class action/NOYB) +
    calculation methodology

Document generator templates: AGB-DE (142), Impressum (140), withdrawal form
annex (Widerrufsformular-Anlage) (143), DSR-Process-Dedup (139),
Cookie-Library (144).

Architecture: doc_action_mappings.py + banner_dom_walkers.py +
cookie_behavior_validator.py + vendor_detail_extractor.py extracted
to stay within the 500-LOC caps in agent_doc_check_report.py and
banner_text_checker.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 06:28:25 +02:00
Benjamin Admin badb356740 fix(founding-wizard): resolve nested IF blocks correctly (innermost-first)
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / detect-changes (push) Successful in 10s
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 13s
CI / loc-budget (push) Successful in 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-05-20 19:21:08 +02:00
Benjamin Admin f08eb71480 fix(founding-wizard): default values for the placeholders of all 8 notary templates
CI / detect-changes (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / nodejs-build (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
2026-05-20 18:45:12 +02:00
Benjamin Admin 0477a2f2dc fix(founding-wizard): derive RESSORT_N_NAME/_GF/_AUFGABEN from the GF list
CI / detect-changes (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-05-20 18:42:36 +02:00
Benjamin Admin 93cedbecbd fix(founding-wizard): missing context vars (P_INFO etc) + italic regex no longer eats snake_case underscores
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Successful in 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 41s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-05-20 18:37:12 +02:00
Benjamin Admin 28f9e13c1f fix: remove jsonb_array_length from all 14 template migrations [migration-approved]
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 19s
CI / detect-changes (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 46s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
2026-05-20 17:49:05 +02:00
Benjamin Admin 35c1bbdaa5 fix: migration verification-SELECT (placeholders is TEXT not JSONB) [migration-approved]
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / detect-changes (push) Successful in 10s
CI / loc-budget (push) Successful in 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 47s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-05-20 17:46:04 +02:00
Benjamin Admin b7df4709bc fix(founding-wizard): set license_id='mit' (NOT NULL constraint) [migration-approved]
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / nodejs-build (push) Successful in 2m58s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-05-20 16:48:22 +02:00
Benjamin Admin 6f3301d246 fix(founding-wizard): add python-docx dep + Lifecycle filter UI
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m53s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 44s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
- requirements.txt: python-docx==1.2.0 (the container did not have the module)
- document-generator: lifecycle filter (Pre-Founding/Founding/Startup/KMU/Konzern)
  shows only the templates relevant for the current phase
2026-05-20 16:41:36 +02:00
Benjamin Admin 4478b7f479 fix(founding-wizard): mypy/ruff cleanup for CI
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
- markdown_to_docx.py: type annotations + unused import
- founding_wizard_routes.py: drop unused get_db import
2026-05-20 09:58:38 +02:00
Benjamin Admin 39c39b1254 Merge feat/founding-wizard: founding wizard + 14 notary templates [migration-approved]
CI / detect-changes (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 14s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m57s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 39s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-05-20 09:32:24 +02:00
Benjamin Admin 7a5f1e48dd feat(founding-wizard): founding wizard for a two-person GmbH + 14 notary templates
[migration-approved]

Templates (Migrations 123-136):
- 123 GO-GF (Geschäftsordnung Geschäftsführung)
- 124 SHA (Shareholders' Agreement, 56 Platzhalter)
- 125 Satzung (Articles of Association mit UG-Variante)
- 126 GF-Dienstvertrag (Trennungsprinzip Organ/Anstellung)
- 127 Arbeitsvertrag (AGG-neutral, NachwG, eAU)
- 128 Gesellschafterliste (§ 40 GmbHG)
- 129 GF-Bestellungsbeschluss (mit § 6 Abs. 2 Versicherung)
- 130 HRB-Anmeldung (§§ 7, 8, 39 GmbHG, § 12 HGB)
- 131 IP-Assignment Agreement (Gründer→GmbH)
- 132 Term Sheet (Pre-Seed/Seed VC-Standard)
- 133 Wandeldarlehensvertrag (Convertible Loan)
- 134 Beteiligungsvertrag (Subscription Agreement)
- 135 ESOP/VSOP-Plan (3 Varianten)
- 136 Cap Table

Categorization (migrations 137-138):
- ALTER TABLE compliance_legal_templates ADD lifecycle_stage TEXT[],
  functional_category TEXT (with CHECK constraints + GIN index)
- Backfill of all 105 templates: lifecycle_stage (pre_founding|founding|
  startup|kmu|konzern) + functional_category (founding_legal|employment|
  investor_funding|...)

Backend founding-wizard service:
- template_renderer.py: Handlebars-light ({{VAR}}, {{#IF FLAG}}...{{/IF}})
- wizard_to_context.py: maps wizard state → SCREAMING_SNAKE_CASE variables
- markdown_to_docx.py: Markdown → DOCX via python-docx
- founding_wizard_routes.py: POST /v1/founding-wizard/generate
  → returns base64-encoded DOCX files for the selected templates
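
The "Handlebars-light" syntax named above can be illustrated with a minimal sketch. The real template_renderer.py is not part of the text shown here, so the function name `render`, the regexes, and the handling of unknown variables are assumptions, and nested {{#IF}} blocks are deliberately not supported:

```python
import re

# Hypothetical sketch; not the actual template_renderer.py.
_IF_BLOCK = re.compile(r"\{\{#IF (\w+)\}\}(.*?)\{\{/IF\}\}", re.DOTALL)
_VAR = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, context: dict) -> str:
    """Resolve {{#IF FLAG}}...{{/IF}} blocks first, then {{VAR}} placeholders."""

    def resolve_if(match: re.Match) -> str:
        flag, body = match.group(1), match.group(2)
        # Keep the block body only when the flag exists and is truthy.
        return body if context.get(flag) else ""

    def resolve_var(match: re.Match) -> str:
        name = match.group(1)
        # Assumption: unknown variables stay in place so gaps remain visible.
        return str(context[name]) if name in context else match.group(0)

    without_blocks = _IF_BLOCK.sub(resolve_if, template)
    return _VAR.sub(resolve_var, without_blocks)
```

Called as `render("{{#IF IS_UG}}UG {{/IF}}{{COMPANY_NAME}}", {"IS_UG": False, "COMPANY_NAME": "Beispiel GmbH"})`, the sketch drops the conditional block and fills in the company name. The variable names are made up and only follow the SCREAMING_SNAKE_CASE convention mentioned for wizard_to_context.py.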

Frontend founding wizard (/sdk/founding-wizard):
- 8-step wizard (basics, shareholders, managing directors, capital, notary, SHA, managing-director contracts, generate)
- useFoundingWizardForm hook with localStorage persistence
- TypeScript code registry (template-categories.ts) as a backup to the DB
- Word download via data: URLs (base64)

Tests:
- 20 unit tests green (renderer, context mapping, DOCX conversion)
- Playwright E2E test with two-person GmbH test data (Benjamin + Sharang)
2026-05-20 09:30:51 +02:00
Benjamin Admin 98ec6d4284 fix(report): anti-pattern task says "muss entfernt werden" (must be removed) instead of "ergaenzt werden" (must be added)
CI / detect-changes (push) Successful in 9s
CI / secret-scan (push) Has been skipped
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / test-python-backend (push) Successful in 40s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Bug: for inverted checks (P9 #7 illegal_disclaimer) the managing director's
task list said "muss ergaenzt werden" (must be added). That is semantically
wrong, because the disclaimer IS already present and should be removed.

Fix: _check_to_action() now recognizes anti-pattern labels
(rechtswidrig/illegal/haftungsausschluss/disclaimer) and returns
"muss entfernt werden (Anti-Pattern, rechtlich wirkungslos)", i.e. "must be removed (anti-pattern, legally ineffective)".

Smoke test BMW d2f7bcc0: before 'Rechtswidriger Haftungsausschluss
muss ergaenzt werden' -> now 'muss entfernt werden'.
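
A minimal sketch of the keyword-based detection this fix describes. The body of _check_to_action() is not shown on this page, so the function name `check_to_action_sketch`, the case-insensitive substring match, and the fallback wording are assumptions. Only the four keywords and the returned phrase come from the commit message:

```python
# Keywords as listed in the commit message; everything else is hypothetical.
ANTI_PATTERN_KEYWORDS = ("rechtswidrig", "illegal", "haftungsausschluss", "disclaimer")


def check_to_action_sketch(label: str) -> str:
    """Turn a check label into a task line for the managing director."""
    lowered = label.lower()
    # Inverted checks: the flagged item exists and has to go away.
    if any(keyword in lowered for keyword in ANTI_PATTERN_KEYWORDS):
        return f"{label} muss entfernt werden (Anti-Pattern, rechtlich wirkungslos)"
    # Regular checks: the item is missing and has to be added.
    return f"{label} muss ergaenzt werden"
```

With the label from the smoke test, `check_to_action_sketch("Rechtswidriger Haftungsausschluss")` takes the removal branch, because the lowercased label contains "rechtswidrig".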

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 16:40:24 +02:00
Benjamin Admin 6f16507c5f feat(banner): P19 + P20 — Per-Category-Click-Test + Frontend-Drilldown
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m54s
CI / test-go (push) Has been skipped
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 17s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
P19 (consent-tester):
- Added dp-cookieconsent (TYPO3, Safetykon pattern) as a CMP profile:
  selectors #dp--cookie-statistics/marketing + a.cc-allow save button
- New signal provider_details_visible: after a category toggle, Playwright
  checks whether visible provider/cookie detail elements appear in the
  banner. For dp-cookieconsent (banner without a listing) it is always False
  -> HIGH violation "Category shows no provider/cookie details; the user
  cannot give informed consent (Art. 7 Abs. 1 DSGVO)"
- main.py serializes provider_details_visible + cookies_set per category
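
How the provider_details_visible signal could map to a HIGH violation can be sketched as follows. This is an illustration under assumptions: the function name `category_violations`, the `category` key, and the dictionary layout are hypothetical, and the Playwright visibility check itself is left out. Only the field names provider_details_visible and cookies_set and the HIGH severity come from the commit message:

```python
from typing import Any, Dict, List

# Paraphrased English rendering of the violation text from the commit message.
MISSING_DETAILS_TEXT = (
    "Category shows no provider/cookie details; "
    "the user cannot give informed consent (Art. 7 Abs. 1 DSGVO)"
)


def category_violations(category_tests: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Emit one HIGH violation per category whose details stayed hidden."""
    violations = []
    for test in category_tests:
        # Assumption: a missing key is treated like False.
        if not test.get("provider_details_visible", False):
            violations.append({
                "severity": "HIGH",
                "category": str(test.get("category", "unknown")),
                "text": MISSING_DETAILS_TEXT,
            })
    return violations
```

For a dp-cookieconsent banner, where the signal is always False, every tested category would yield one such violation.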

P20 (frontend drilldown):
- Backend: the check_payloads table gains a 'banner' column (JSON); the full
  banner_result is now persisted (previously in-memory only). The ALTER TABLE
  migration is idempotent.
- New endpoint GET /api/compliance/agent/banner/<check_id>: returns
  quality score, phases, category tests, banner checks, and all 46
  structured_checks.
- Frontend: BannerTab in /sdk/agent/audit/<id> with quality cards,
  3-phase cookie table, per-category listing (with the P19 signal in
  red/green), banner violations + legal bases, and a 46-check drilldown
  filterable by severity.
- Tab switcher in page.tsx extended with "Cookie-Banner-Analyse".
- Bonus: 2 old route.ts files migrated to Next.js 15 Promise params
  (build fix).

Plus: the critical-findings block uses provider_details_visible as the
primary signal instead of only the tracking_services count.

Smoke test Safetykon: 4 critical findings in the mail, the banner endpoint
returns 46 checks + 3 phases + 2 categories with provider_details_visible=False.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 14:31:13 +02:00
Benjamin Admin d4d9b60007 feat(email): P18 — Critical-Findings-Box + Banner-Deep-Block
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / loc-budget (push) Successful in 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m8s
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 47s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
The backend was discarding 90% of the consent-tester data: only 4 fields of a
full banner scan ended up in the email. Phases (before_consent / after_reject
/ after_accept), banner_checks.violations with legal bases,
category_tests, 46 structured_checks, and completeness/correctness scores
were all invisible.

Backend: agent_compliance_check_routes now passes the full banner_result
through (15 fields instead of 4).

Renderer (2 new modules):
1) agent_doc_check_critical.build_critical_findings_html
   - RED immediate-action block at the VERY TOP of the email
   - Detects: banner violations (HIGH/CRITICAL), empty per-category lists,
     DSE score <30%, missing cookie policy, US trackers without SCC/DPF
   - Per issue: concrete immediate action + legal basis + fine precedent
     (CNIL TikTok 5 million, LfDI BW 30k, CJEU Schrems II, ...)
   - Rendered only when real issues exist

2) agent_doc_check_banner.build_banner_deep_html
   - Banner quality score cards (completeness / correctness / violations)
   - 3-phase cookie table: before consent / after rejection / after acceptance
     with cookie count, tracker count, anomalies
   - Per-category tracker listing (statistics/marketing); shows explicitly
     when a category lists no providers (Safetykon pattern)
   - Violations list with severity badge + source hint (LG Rostock, EDPB)

Smoke test Safetykon: all 6 new blocks render, no regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 13:34:17 +02:00
173 changed files with 24505 additions and 710 deletions
@@ -56,6 +56,44 @@ Bei ALLEN Fragen zu IFRS/IAS-Standards MUSST du folgende Punkte beachten:
4. Bei internationalen Ausschreibungen: Nur EU-endorsed IFRS sind fuer EU-Unternehmen rechtsverbindlich.
5. Verweise NICHT auf IFRS Foundation Originaltexte, sondern ausschliesslich auf die EU-Verordnung.
## FAQ — Cookie-Banner-Bussgelder + Risiken (haeufige Mandantenfragen)
Bei Fragen nach Bussgeldern, Risiko-Hoehe oder konkreten Faellen gib **konkrete Praezedenzen** an:
### Top-Bussgelder (CNIL Frankreich — strengste EU-Aufsicht):
- **Google France 2020 (CNIL)** — 100 Mio EUR — Cookies ohne Einwilligung (CNIL Beschluss vom 07.12.2020)
- **Meta/Facebook France 2022 (CNIL)** — 60 Mio EUR — Cookies ohne Einwilligung
- **Amazon France 2020 (CNIL)** — 35 Mio EUR — Cookies ohne Einwilligung
- **Carrefour France 2020 (CNIL)** — 2,25 Mio EUR — Cookies + sonstige Verstoesse
### Deutsche Praezedenzen + Sammelklagen-Risiken:
- **LG Muenchen I 2022** — 100 EUR pro Besucher Schadensersatz fuer Google Fonts ohne Consent (Az. 3 O 17493/20). Spaeter durch BGH "Rechtsmissbrauchs"-Argument bei Massenabmahnungen eingeschraenkt.
- **EuGH Planet49 (C-673/17)** — vorausgewaehlte Cookie-Checkboxen sind unwirksame Einwilligung (praejudiziell fuer alle EU-Sites)
- **BGH Cookie-Einwilligung II (I ZR 7/16)** — bestaetigt Planet49 fuer Deutschland
- **DSK Beschluss 2023** — Cookie-Banner mit "Akzeptieren" deutlich prominenter als "Ablehnen" = Dark Pattern = unwirksame Einwilligung
### Deutscher Aufsichtsmarkt:
Deutsche Aufsicht (BfDI + 16 Landes-DSB) ist moderater als CNIL — bislang keine 100 Mio-EUR-Bussgelder. ABER: DSK-Beschluesse + LfDI-Verfahren haeufen sich. Federfuehrung bei Konzernen via "One-Stop-Shop" nach Hauptsitz.
### Vier Risiko-Pfade fuer Mandanten:
1. **Art. 83 DSGVO Bussgeld** — bis 4% des weltweiten Konzernumsatzes. Realistisch 0,1-1% bei Erstverstoss.
2. **Verbraucherschutz-Abmahnung** (vzbv, Wettbewerbszentrale, Verbraucherverbaende) — 50-500k EUR Streitwert + Unterlassung.
3. **Sammelklage Art. 82 DSGVO** — Schadensersatz pro Person, BGH 50-100 EUR pro Fall. Sammelklage-Trusts: myRight, RightNow, helpcheck.de.
4. **NOYB-Beschwerde** (Max Schrems) — oeffentliches Aufsichtsverfahren, Reputationsschaden + Bussgeld.
### Geschaeftsfuehrer-Haftung (haeufig unterschaetzt):
GF haftet **persoenlich** nach §43 GmbHG bzw. §93 AktG wenn Compliance-Pflichten verletzt wurden. Das ist der eigentliche Druckpunkt — nicht die Firma, sondern der GF persoenlich. Bei Mandantengespraechen mit GF-Beteiligung: dieser Punkt zuerst ansprechen.
### Wie berechne ich das konkrete Risiko fuer einen Mandanten:
Frage den Mandanten nach: (a) Jahresumsatz, (b) ungefaehre Besucherzahl pro Jahr, (c) Anzahl Trackingtools im Banner. Dann:
- Max-Bussgeld = 4% × Jahresumsatz (Obergrenze, nicht realistisch)
- Realistisch-Bussgeld = 0,1-1% × Jahresumsatz (CNIL/LfDI-Maßstab)
- Sammelklage-Theorie = Besucherzahl × 50 EUR (BGH-Untergrenze) — meist nicht durchsetzbar, aber Drohpotential
- NICHT konkrete Zahlen einer fremden Firma zitieren ("BMW haette X EUR" etc.) — Mandant koennte das falsch weitergeben
### Marktwissen (intern, nicht 1:1 zitieren):
Externe DSB-Stundensaetze: 350-450 EUR/h (NOERR, GSK, vergleichbare Kanzleien). Mittelstands-DSB-Mandate: 5-15k EUR/Jahr. Cookie-Audit manuell: typisch 10 Std = 4-5k EUR Kosten. BreakPilot reduziert das auf 30 Min.
## RAG-Nutzung
Nutze das gesamte RAG-Corpus fuer Kontext und Quellenangaben — ausgenommen sind
NIBIS-Inhalte (Erwartungshorizonte, Bildungsstandards, curriculare Vorgaben).
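
The arithmetic from the "Wie berechne ich das konkrete Risiko" section of the prompt hunk above can be written out as a small sketch. The percentages (4%, 0.1-1%) and the 50 EUR per visitor come from that text. The function name `risk_estimate`, the dictionary keys, and the sample figures in the usage note are hypothetical:

```python
from typing import Dict, Tuple, Union

Number = Union[int, float]


def risk_estimate(annual_revenue_eur: Number,
                  visitors_per_year: int) -> Dict[str, Union[Number, Tuple[Number, Number]]]:
    """Apply the three rules of thumb from the FAQ text to one client."""
    return {
        # Upper bound: 4% of annual revenue (ceiling, not a realistic figure).
        "max_fine_eur": annual_revenue_eur * 4 / 100,
        # Realistic band: 0.1% to 1% of annual revenue.
        "realistic_fine_eur": (annual_revenue_eur / 1000, annual_revenue_eur / 100),
        # Theoretical mass-claim exposure: visitors x 50 EUR.
        "class_action_theory_eur": visitors_per_year * 50,
    }
```

For an illustrative client with 10,000,000 EUR annual revenue and 200,000 visitors per year, the sketch yields a 400,000 EUR ceiling, a realistic band of 10,000 to 100,000 EUR, and a theoretical mass-claim figure of 10,000,000 EUR.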
@@ -10,9 +10,9 @@ const BACKEND_URL = process.env.BACKEND_API_URL || 'http://backend-compliance:80
 export async function GET(
   request: NextRequest,
-  { params }: { params: { checkId: string } },
+  { params }: { params: Promise<{ checkId: string }> },
 ) {
-  const checkId = params.checkId
+  const { checkId } = await params
   const qs = request.nextUrl.searchParams.toString()
   const url = `${BACKEND_URL}/api/compliance/agent/audit/${checkId}${qs ? `?${qs}` : ''}`
   try {
@@ -0,0 +1,28 @@
/**
* Proxy: GET /api/sdk/v1/agent/banner/<checkId>
* -> backend GET /api/compliance/agent/banner/<checkId>
*
* Liefert das volle banner_result (phases, structured_checks, category_tests).
*/
import { NextRequest, NextResponse } from 'next/server'
const BACKEND_URL = process.env.BACKEND_API_URL || 'http://backend-compliance:8002'
export async function GET(
_request: NextRequest,
{ params }: { params: Promise<{ checkId: string }> },
) {
const { checkId } = await params
try {
const resp = await fetch(
`${BACKEND_URL}/api/compliance/agent/banner/${checkId}`,
{ signal: AbortSignal.timeout(15000) },
)
const data = await resp.json().catch(() => ({}))
return NextResponse.json(data, { status: resp.status })
} catch {
return NextResponse.json(
{ error: 'Banner-Abfrage fehlgeschlagen' }, { status: 503 },
)
}
}
@@ -10,9 +10,9 @@ const BACKEND_URL = process.env.BACKEND_API_URL || 'http://backend-compliance:80
 export async function GET(
   request: NextRequest,
-  { params }: { params: { checkId: string } },
+  { params }: { params: Promise<{ checkId: string }> },
 ) {
-  const checkId = params.checkId
+  const { checkId } = await params
   const qs = request.nextUrl.searchParams.toString()
   const url = `${BACKEND_URL}/api/compliance/agent/findings/${checkId}${qs ? `?${qs}` : ''}`
   try {
@@ -0,0 +1,58 @@
/**
* Next.js Proxy: leitet POST /api/v1/founding-wizard/generate an Backend.
*
* Konvertiert das Backend-Response (base64 DOCX) in data: URLs,
* die das Frontend direkt als Download anbieten kann.
*/
import { NextRequest, NextResponse } from 'next/server'
const BACKEND_URL = process.env.BACKEND_COMPLIANCE_URL || 'http://bp-compliance-backend:8002'
export async function POST(req: NextRequest) {
try {
const body = await req.json()
const backendRes = await fetch(`${BACKEND_URL}/v1/founding-wizard/generate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
})
if (!backendRes.ok) {
const errorText = await backendRes.text()
return NextResponse.json(
{ error: 'Backend-Generierung fehlgeschlagen', detail: errorText },
{ status: backendRes.status }
)
}
const data = await backendRes.json()
const documents = (data.documents || []).map((doc: {
document_type: string
title: string
filename: string
content_base64: string
size_bytes: number
generated_at: string
}) => ({
document_type: doc.document_type,
title: doc.title,
filename: doc.filename,
download_url: `data:application/vnd.openxmlformats-officedocument.wordprocessingml.document;base64,${doc.content_base64}`,
size_bytes: doc.size_bytes,
generated_at: doc.generated_at,
}))
return NextResponse.json({
documents,
warnings: data.warnings || [],
})
} catch (e: unknown) {
const message = e instanceof Error ? e.message : 'Unbekannter Fehler'
return NextResponse.json(
{ error: 'Proxy-Fehler', detail: message },
{ status: 500 }
)
}
}
@@ -2,30 +2,41 @@
import React, { useState } from 'react'
import { ChecklistView } from './ChecklistView'
import { ResultsTabsView } from './ResultsTabsView'
import { PreScanWizard, useScanContext, isContextComplete } from './PreScanWizard'
import { safeSetItem } from './storageHelpers'
interface DocEntry {
id: string
type: string
label: string
url: string
text: string // P-Paste: User kopiert Doc-Text direkt rein
mode: 'url' | 'text' // welcher Input wird aktiv genutzt
}
const DOC_TYPES = [
{ id: 'dse', label: 'DSI (Datenschutzinformation)' },
{ id: 'dse', label: 'Datenschutzerklärung / DSI' },
{ id: 'cookie', label: 'Cookie-Richtlinie' },
{ id: 'impressum', label: 'Impressum' },
{ id: 'agb', label: 'AGB' },
{ id: 'nutzungsbedingungen', label: 'Nutzungsbedingungen' },
{ id: 'widerruf', label: 'Widerrufsbelehrung' },
{ id: 'social_media', label: 'DSE Social Media (Art. 26)' },
{ id: 'dsfa', label: 'DSFA (Art. 35)' },
{ id: 'agb', label: 'AGB / Nutzungsbedingungen' },
{ id: 'impressum', label: 'Impressum' },
{ id: 'cookie', label: 'Cookie-Richtlinie' },
{ id: 'widerruf', label: 'Widerrufsbelehrung' },
{ id: 'dsa', label: 'DSA / Digital Services Act' },
{ id: 'legal_notice', label: 'Rechtliche Hinweise (IP, Forward-Looking)' },
{ id: 'lizenzhinweise', label: 'Lizenzhinweise Dritter (OSS)' },
{ id: 'other', label: 'Sonstiges' },
]
function newEntry(): DocEntry {
-return { id: crypto.randomUUID().slice(0, 8), type: 'dse', label: '', url: '' }
+return { id: crypto.randomUUID().slice(0, 8), type: 'dse', label: '',
+url: '', text: '', mode: 'url' }
}
export function DocCheckTab() {
const [scanContext, setScanContext] = useScanContext()
const [entries, setEntries] = useState<DocEntry[]>(() => {
if (typeof window === 'undefined') return [newEntry()]
try { const s = localStorage.getItem('doc-check-entries'); return s ? (JSON.parse(s) as Partial<DocEntry>[]).map(e => ({ ...newEntry(), ...e })) : [newEntry()] } catch { return [newEntry()] }
@@ -74,7 +85,7 @@ export function DocCheckTab() {
}
const handleSubmit = async () => {
-const validEntries = entries.filter(e => e.url.trim())
+const validEntries = entries.filter(e => e.url.trim() || e.text.trim())
if (validEntries.length === 0) return
setLoading(true)
@@ -89,11 +100,17 @@ export function DocCheckTab() {
body: JSON.stringify({
entries: validEntries.map(e => ({
doc_type: e.type,
-label: e.label || e.url.split('/').pop() || 'Dokument',
-url: e.url.trim(),
+label: e.label
+|| (e.url ? e.url.split('/').pop() : '')
+|| `${e.type}-paste`,
+url: e.mode === 'text' ? '' : e.url.trim(),
+// The backend prefers text over url. If both are filled and
+// mode='url', we do NOT send the text along.
+text: e.mode === 'text' ? e.text.trim() : '',
})),
check_cookie_banner: checkCookieBanner,
use_agent: useAgent,
scan_context: scanContext,
}),
})
if (!startRes.ok) throw new Error(`Pruefung konnte nicht gestartet werden: ${startRes.status}`)
@@ -111,13 +128,13 @@ export function DocCheckTab() {
if (pollData.status === 'completed' && pollData.result) {
setResults(pollData.result)
setProgress('')
-localStorage.setItem('doc-check-results', JSON.stringify(pollData.result))
+safeSetItem('doc-check-results', JSON.stringify(pollData.result))
 const resultKey = `doc-check-result-${Date.now()}`
-try { localStorage.setItem(resultKey, JSON.stringify(pollData.result)) } catch { /* quota */ }
+safeSetItem(resultKey, JSON.stringify(pollData.result))
 const entry = { date: new Date().toISOString(), urls: validEntries.length, findings: pollData.result.total_findings || 0, resultKey }
 const updated = [entry, ...history].slice(0, 30)
 setHistory(updated)
-localStorage.setItem('doc-check-history', JSON.stringify(updated))
+safeSetItem('doc-check-history', JSON.stringify(updated))
break
}
if (pollData.status === 'failed') {
@@ -133,43 +150,90 @@ export function DocCheckTab() {
}
}
const contextReady = isContextComplete(scanContext)
return (
<div className="space-y-4">
{/* URL Entries */}
<div className="space-y-2">
{/* P79 Pre-Scan-Wizard — 8 Pflichtfelder */}
<PreScanWizard value={scanContext} onChange={setScanContext} />
{/* URL / Text Entries */}
<div className="space-y-3">
{entries.map((entry, i) => (
<div key={entry.id} className="flex items-center gap-2">
<select
value={entry.type}
onChange={e => updateEntry(entry.id, 'type', e.target.value)}
className="w-48 px-3 py-2.5 border border-gray-300 rounded-lg text-sm bg-white shrink-0"
>
{DOC_TYPES.map(t => (
<option key={t.id} value={t.id}>{t.label}</option>
))}
</select>
<input
type="text"
value={entry.label}
onChange={e => updateEntry(entry.id, 'label', e.target.value)}
placeholder={entry.type === 'other' ? 'Dokumentname' : 'Version / Stand (optional)'}
className="w-40 px-3 py-2.5 border border-gray-300 rounded-lg text-sm shrink-0"
/>
<input
type="url"
value={entry.url}
onChange={e => updateEntry(entry.id, 'url', e.target.value)}
onBlur={() => autoLabel(entry)}
placeholder="https://example.com/datenschutz"
className="flex-1 px-3 py-2.5 border border-gray-300 rounded-lg text-sm"
/>
{entries.length > 1 && (
<button onClick={() => removeEntry(entry.id)}
className="p-2 text-gray-400 hover:text-red-500 shrink-0">
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
<div key={entry.id} className="space-y-1.5">
<div className="flex items-center gap-2">
<select
value={entry.type}
onChange={e => updateEntry(entry.id, 'type', e.target.value)}
className="w-48 px-3 py-2.5 border border-gray-300 rounded-lg text-sm bg-white shrink-0"
>
{DOC_TYPES.map(t => (
<option key={t.id} value={t.id}>{t.label}</option>
))}
</select>
<input
type="text"
value={entry.label}
onChange={e => updateEntry(entry.id, 'label', e.target.value)}
placeholder={entry.type === 'other' ? 'Dokumentname' : 'Version / Stand (optional)'}
className="w-40 px-3 py-2.5 border border-gray-300 rounded-lg text-sm shrink-0"
/>
{/* Mode-Toggle URL / Text */}
<div className="inline-flex border border-gray-300 rounded-lg overflow-hidden text-xs shrink-0">
<button type="button"
onClick={() => updateEntry(entry.id, 'mode', 'url')}
className={`px-3 py-2 ${entry.mode === 'url'
? 'bg-purple-600 text-white' : 'bg-white text-gray-600 hover:bg-gray-50'}`}>
URL
</button>
<button type="button"
onClick={() => updateEntry(entry.id, 'mode', 'text')}
className={`px-3 py-2 ${entry.mode === 'text'
? 'bg-purple-600 text-white' : 'bg-white text-gray-600 hover:bg-gray-50'}`}>
Text einfügen
</button>
</div>
{entry.mode === 'url' && (
<input
type="url"
value={entry.url}
onChange={e => updateEntry(entry.id, 'url', e.target.value)}
onBlur={() => autoLabel(entry)}
placeholder="https://example.com/datenschutz"
className="flex-1 px-3 py-2.5 border border-gray-300 rounded-lg text-sm"
/>
)}
{entries.length > 1 && (
<button onClick={() => removeEntry(entry.id)}
className="p-2 text-gray-400 hover:text-red-500 shrink-0">
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M6 18L18 6M6 6l12 12" />
</svg>
</button>
)}
</div>
{entry.mode === 'text' && (
<div className="ml-[400px]">
<textarea
value={entry.text}
onChange={e => updateEntry(entry.id, 'text', e.target.value)}
placeholder={
entry.type === 'cookie'
? 'Kopiere hier die komplette Cookie-Tabelle rein (Tab-getrennt oder mit | als Trenner — wir parsen alle Spalten deterministisch)…'
: 'Kopiere hier den vollständigen Doc-Text rein. Wir erkennen automatisch ob es zu „' + (DOC_TYPES.find(t => t.id === entry.type)?.label ?? entry.type) + '" passt.'
}
className="w-full h-32 px-3 py-2 border border-gray-300 rounded-lg text-xs font-mono resize-y"
/>
<div className="text-[10px] text-gray-500 mt-1">
{entry.text.trim().length > 0
? `${entry.text.trim().length.toLocaleString('de-DE')} Zeichen · ${entry.text.trim().split(/\s+/).length.toLocaleString('de-DE')} Wörter`
: 'Der Crawler wird übersprungen — die Analyse läuft direkt auf dem eingefügten Text.'}
</div>
</div>
)}
</div>
))}
@@ -212,8 +276,11 @@ export function DocCheckTab() {
{/* Submit */}
<button
onClick={handleSubmit}
-disabled={loading || entries.every(e => !e.url.trim())}
+disabled={loading
+|| entries.every(e => !e.url.trim() && !e.text.trim())
+|| !contextReady}
className="w-full px-4 py-3 bg-purple-600 text-white rounded-lg font-medium hover:bg-purple-700 disabled:opacity-50 transition-colors text-sm flex items-center justify-center gap-2"
title={!contextReady ? 'Bitte zuerst die 8 Pflichtfelder ausfüllen' : undefined}
>
{loading ? (
<>
@@ -223,6 +290,8 @@ export function DocCheckTab() {
</svg>
Pruefe...
</>
) : !contextReady ? (
`Klassifizierung unvollständig (8 Pflichtfelder)`
) : (
`${entries.filter(e => e.url.trim() || e.text.trim()).length} Dokument${entries.filter(e => e.url.trim() || e.text.trim()).length !== 1 ? 'e' : ''} pruefen`
)}
@@ -244,41 +313,9 @@ export function DocCheckTab() {
<div className="bg-red-50 border border-red-200 rounded-lg p-3 text-sm text-red-700">{error}</div>
)}
{/* Results */}
{/* Results — als Tab-Ansicht (Übersicht/Cookies/DSE/Impressum/AGB/Banner/Mail) */}
{results && results.results && (
<div className="bg-white border border-gray-200 rounded-xl p-6 shadow-sm">
<ChecklistView results={results.results} />
{/* Cookie Banner Result */}
{results.cookie_banner_result && (
<div className="mt-4 pt-4 border-t border-gray-200">
<h4 className="text-sm font-semibold text-gray-800 mb-2">Cookie-Banner</h4>
<div className="text-sm text-gray-600">
{results.cookie_banner_result.banner_detected
? `Banner erkannt: ${results.cookie_banner_result.banner_provider || 'unbekannt'}`
: 'Kein Banner erkannt'}
</div>
{results.cookie_banner_result.banner_checks?.violations?.length > 0 && (
<div className="mt-2 space-y-1">
{results.cookie_banner_result.banner_checks.violations.map((v: any, i: number) => (
<div key={i} className="text-xs text-red-600 flex items-start gap-1.5">
<span className="shrink-0 mt-0.5">!!</span>
<span>{v.text}</span>
</div>
))}
</div>
)}
</div>
)}
{/* Email Status */}
{results.email_status && (
<div className="mt-3 text-xs text-gray-500 flex items-center gap-2">
<span className={`w-2 h-2 rounded-full ${results.email_status === 'sent' ? 'bg-green-400' : 'bg-gray-300'}`} />
E-Mail: {results.email_status === 'sent' ? 'Gesendet' : results.email_status}
</div>
)}
</div>
<ResultsTabsView results={results} />
)}
{/* History */}
@@ -0,0 +1,269 @@
'use client'
/**
* P79 — Pre-Scan-Wizard (8 Pflichtfelder).
*
* 8 Pflichtfelder die vor dem Lauf abgefragt werden. Werte landen im
* scan_context und filtern später die MC-Auswertung (zusammen mit P72
* scope_doc_type + applicable_industries). Erwartete Noise-Reduktion:
* 70-80% bei falsch zugeordneten HIGH-MCs.
*/
import React, { useState, useEffect } from 'react'
export interface ScanContext {
industry: string
business_model: string
direct_sales: string
legal_form: string
group_structure: string
employee_count: string
special_data: string[]
third_country_transfer: string
}
const INDUSTRIES = [
{ id: '', label: '— bitte wählen —' },
{ id: 'automotive', label: 'Automotive / OEM' },
{ id: 'ecommerce', label: 'E-Commerce / Online-Handel' },
{ id: 'saas', label: 'SaaS / Software' },
{ id: 'banking', label: 'Banking / Finance' },
{ id: 'insurance', label: 'Insurance / Versicherung' },
{ id: 'healthcare', label: 'Healthcare / Gesundheit' },
{ id: 'education', label: 'Bildung / Schule' },
{ id: 'public', label: 'Öffentliche Verwaltung' },
{ id: 'manufacturing', label: 'Industrie / Manufacturing' },
{ id: 'media', label: 'Medien / Verlag' },
{ id: 'other', label: 'Sonstige' },
]
const LEGAL_FORMS = [
{ id: '', label: '— bitte wählen —' },
{ id: 'ag', label: 'AG (Aktiengesellschaft)' },
{ id: 'gmbh', label: 'GmbH' },
{ id: 'gmbh_co_kg', label: 'GmbH & Co. KG' },
{ id: 'kg', label: 'KG' },
{ id: 'ohg', label: 'OHG' },
{ id: 'ug', label: 'UG (haftungsbeschränkt)' },
{ id: 'ek', label: 'e.K. / Einzelunternehmen' },
{ id: 'verein', label: 'Verein' },
{ id: 'stiftung', label: 'Stiftung' },
{ id: 'behoerde', label: 'Behörde / Körperschaft öff. Rechts' },
{ id: 'other', label: 'Sonstige' },
]
const GROUP_STRUCTURES = [
{ id: '', label: '— bitte wählen —' },
{ id: 'standalone', label: 'Eigenständig' },
{ id: 'parent', label: 'Konzern-Mutter' },
{ id: 'subsidiary', label: 'Konzern-Tochter' },
{ id: 'joint_venture', label: 'Joint Venture' },
{ id: 'processor', label: 'Reiner Auftragsverarbeiter' },
]
const EMPLOYEE_COUNTS = [
{ id: '', label: '— bitte wählen —' },
{ id: 'lt10', label: 'unter 10' },
{ id: '10_19', label: '10-19' },
{ id: '20_49', label: '20-49 (DSB ab 20 Pflicht)' },
{ id: '50_249', label: '50-249 (Whistleblower-Pflicht)' },
{ id: '250_499', label: '250-499' },
{ id: '500_999', label: '500-999' },
{ id: '1000_plus', label: '1.000+ (Konzern)' },
]
const SPECIAL_DATA_OPTIONS = [
{ id: 'health', label: 'Gesundheitsdaten' },
{ id: 'biometric', label: 'Biometrische Daten' },
{ id: 'ethnicity', label: 'Religiöse / ethnische Herkunft' },
{ id: 'sexual', label: 'Sexuelle Orientierung' },
{ id: 'criminal', label: 'Strafrechtliche Daten' },
{ id: 'minors', label: 'Minderjährige (<16)' },
{ id: 'none', label: 'Keine besonderen Daten' },
]
const STORAGE_KEY = 'compliance-scan-context'
function emptyContext(): ScanContext {
return {
industry: '',
business_model: '',
direct_sales: '',
legal_form: '',
group_structure: '',
employee_count: '',
special_data: [],
third_country_transfer: '',
}
}
export function isContextComplete(ctx: ScanContext): boolean {
return Boolean(
ctx.industry &&
ctx.business_model &&
ctx.direct_sales &&
ctx.legal_form &&
ctx.group_structure &&
ctx.employee_count &&
ctx.special_data.length > 0 &&
ctx.third_country_transfer
)
}
export function PreScanWizard({
value,
onChange,
}: {
value: ScanContext
onChange: (ctx: ScanContext) => void
}) {
const update = <K extends keyof ScanContext>(key: K, val: ScanContext[K]) => {
onChange({ ...value, [key]: val })
}
const toggleSpecialData = (id: string) => {
const next = value.special_data.includes(id)
? value.special_data.filter(x => x !== id)
: [...value.special_data.filter(x => x !== 'none' || id === 'none'), id]
onChange({ ...value, special_data: id === 'none' ? ['none'] : next.filter(x => x !== 'none') })
}
return (
<div style={{
background: '#f0f9ff',
border: '1px solid #bfdbfe',
borderRadius: 8,
padding: '14px 16px',
marginBottom: 14,
}}>
<div style={{ fontSize: 11, color: '#1e40af', textTransform: 'uppercase',
letterSpacing: 1.2, marginBottom: 4, fontWeight: 600 }}>
Pflichtangaben zur Klassifizierung des Audits
</div>
<h3 style={{ margin: '0 0 6px', fontSize: 14, color: '#1e293b' }}>
Vor dem Scan: 8 Angaben zum Unternehmen
</h3>
<p style={{ margin: '0 0 12px', fontSize: 11, color: '#475569', lineHeight: 1.5 }}>
Diese Angaben filtern irrelevante Compliance-Themen heraus (z.B. eHealth-
Vorschriften bei einem Autobauer) und liefern eine realistische
Einschätzung statt pauschaler Verstoss-Listen.
</p>
<div style={{ display: 'grid', gridTemplateColumns: 'repeat(2, 1fr)', gap: 10 }}>
<Field label="1. Branche*">
<select value={value.industry} onChange={e => update('industry', e.target.value)} style={inputStyle}>
{INDUSTRIES.map(o => <option key={o.id} value={o.id}>{o.label}</option>)}
</select>
</Field>
<Field label="2. Geschäftsmodell*">
<select value={value.business_model} onChange={e => update('business_model', e.target.value)} style={inputStyle}>
<option value=""> bitte wählen </option>
<option value="b2b">B2B</option>
<option value="b2c">B2C</option>
<option value="both">Beides (B2B + B2C)</option>
</select>
</Field>
<Field label="3. Direkt-Vertrieb (Webshop/Buchung)*">
<select value={value.direct_sales} onChange={e => update('direct_sales', e.target.value)} style={inputStyle}>
<option value=""> bitte wählen </option>
<option value="yes">Ja</option>
<option value="no">Nein</option>
<option value="lead_funnel">Nur Lead-Funnel (Probefahrten, Anfragen)</option>
</select>
</Field>
<Field label="4. Rechtsform*">
<select value={value.legal_form} onChange={e => update('legal_form', e.target.value)} style={inputStyle}>
{LEGAL_FORMS.map(o => <option key={o.id} value={o.id}>{o.label}</option>)}
</select>
</Field>
<Field label="5. Konzern-Struktur*">
<select value={value.group_structure} onChange={e => update('group_structure', e.target.value)} style={inputStyle}>
{GROUP_STRUCTURES.map(o => <option key={o.id} value={o.id}>{o.label}</option>)}
</select>
</Field>
<Field label="6. Mitarbeiterzahl*">
<select value={value.employee_count} onChange={e => update('employee_count', e.target.value)} style={inputStyle}>
{EMPLOYEE_COUNTS.map(o => <option key={o.id} value={o.id}>{o.label}</option>)}
</select>
</Field>
<Field label="7. Besondere Datenkategorien*" colSpan={2}>
<div style={{ display: 'flex', flexWrap: 'wrap', gap: 8 }}>
{SPECIAL_DATA_OPTIONS.map(o => (
<label key={o.id} style={{ fontSize: 12, display: 'inline-flex',
alignItems: 'center', gap: 4,
padding: '4px 8px', background: '#fff',
border: '1px solid #cbd5e1',
borderRadius: 4 }}>
<input type="checkbox"
checked={value.special_data.includes(o.id)}
onChange={() => toggleSpecialData(o.id)} />
{o.label}
</label>
))}
</div>
</Field>
<Field label="8. Bekannter Drittland-Transfer*" colSpan={2}>
<select value={value.third_country_transfer} onChange={e => update('third_country_transfer', e.target.value)} style={inputStyle}>
<option value="">– bitte wählen –</option>
<option value="yes">Ja (USA, CN, IN, UK, ...)</option>
<option value="no">Nein (nur EU/EWR)</option>
<option value="unknown">Weiß nicht (bitte automatisch prüfen)</option>
</select>
</Field>
</div>
{!isContextComplete(value) && (
<div style={{ marginTop: 10, fontSize: 11, color: '#92400e',
background: '#fef3c7', padding: '6px 10px',
borderRadius: 4, border: '1px solid #fde68a' }}>
Bitte alle 8 Pflichtfelder ausfüllen. Der Scan-Button wird erst aktiv,
wenn die Klassifizierung komplett ist.
</div>
)}
</div>
)
}
const inputStyle: React.CSSProperties = {
width: '100%',
padding: '6px 8px',
fontSize: 12,
border: '1px solid #cbd5e1',
borderRadius: 4,
background: '#fff',
}
function Field({ label, children, colSpan }: { label: string; children: React.ReactNode; colSpan?: number }) {
return (
<div style={{ gridColumn: colSpan ? `span ${colSpan}` : undefined }}>
<label style={{ display: 'block', fontSize: 11, color: '#475569',
marginBottom: 4, fontWeight: 600 }}>
{label}
</label>
{children}
</div>
)
}
export function useScanContext(): [ScanContext, (ctx: ScanContext) => void] {
const [ctx, setCtx] = useState<ScanContext>(() => {
if (typeof window === 'undefined') return emptyContext()
try {
const s = localStorage.getItem(STORAGE_KEY)
return s ? { ...emptyContext(), ...JSON.parse(s) } : emptyContext()
} catch {
return emptyContext()
}
})
useEffect(() => {
try { localStorage.setItem(STORAGE_KEY, JSON.stringify(ctx)) } catch {}
}, [ctx])
return [ctx, setCtx]
}
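The hook above hydrates its state by spreading the stored JSON over `emptyContext()`, so fields that are missing from an older stored value still receive defaults. A minimal sketch of that merge as a pure function, assuming a cut-down context shape (`DemoContext`, `demoDefaults` and `hydrateContext` are illustrative names, not part of the diff):

```typescript
// Hypothetical, reduced context shape; the real ScanContext has more fields.
interface DemoContext {
  business_model: string
  special_data: string[]
}

const demoDefaults = (): DemoContext => ({ business_model: '', special_data: [] })

// Merge a raw localStorage string over the defaults. Missing or invalid
// input falls back to the defaults, mirroring the try/catch in the hook.
function hydrateContext(raw: string | null): DemoContext {
  if (!raw) return demoDefaults()
  try {
    const parsed = JSON.parse(raw) as Partial<DemoContext>
    return { ...demoDefaults(), ...parsed }
  } catch {
    return demoDefaults()
  }
}

const fromPartial = hydrateContext('{"business_model":"b2b"}')
const fromGarbage = hydrateContext('{not json')
```

The merge is shallow and does not validate types: a stored `"special_data": null` would replace the default array, so a stricter variant might check each field before accepting it.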
@@ -0,0 +1,353 @@
'use client'
/**
* ResultsTabsView: structured tab view of the audit results.
*
* Instead of one long scrolling page there are seven tabs:
* 1. Overview (score + executive summary)
* 2. Cookies (3-source compliance comparison + vendor and cookie lists)
* 3. Privacy policy (Datenschutzerklärung)
* 4. Legal notice (Impressum)
* 5. Terms and conditions / withdrawal (AGB / Widerruf)
* 6. Banner (cookie banner checks)
* 7. Full mail (HTML preview)
*
* Tab headers stick to the top; the content scrolls below.
*/
import React, { useState, useMemo } from 'react'
import { ChecklistView } from './ChecklistView'
interface ResultsTabsViewProps {
results: any
}
type TabId = 'overview' | 'cookies' | 'dse' | 'impressum' | 'agb' | 'banner' | 'mail'
const TABS: { id: TabId; label: string; icon: string }[] = [
{ id: 'overview', label: 'Übersicht', icon: '◉' },
{ id: 'cookies', label: 'Cookies & VVT', icon: '🍪' },
{ id: 'dse', label: 'Datenschutzerkl.', icon: '📄' },
{ id: 'impressum', label: 'Impressum', icon: '🏢' },
{ id: 'agb', label: 'AGB / Widerruf', icon: '⚖️' },
{ id: 'banner', label: 'Cookie-Banner', icon: '🎛' },
{ id: 'mail', label: 'Mail-Vorschau', icon: '✉️' },
]
export function ResultsTabsView({ results }: ResultsTabsViewProps) {
const [active, setActive] = useState<TabId>('overview')
const r = results || {}
const docs: any[] = r.results || []
const banner = r.banner_result || r.cookie_banner_result || {}
const cmpVendors: any[] = r.cmp_vendors || []
const cookieAudit = r.cookie_audit || {}
const docsByType = useMemo(() => {
const m: Record<string, any> = {}
for (const d of docs) {
const t = (d.doc_type || '').toLowerCase()
if (!m[t]) m[t] = d
}
return m
}, [docs])
return (
<div className="border border-gray-200 rounded-lg overflow-hidden bg-white">
{/* Sticky Tab-Header */}
<div className="flex border-b border-gray-200 bg-gray-50 overflow-x-auto sticky top-0 z-10">
{TABS.map(t => (
<button
key={t.id}
onClick={() => setActive(t.id)}
className={`px-4 py-3 text-sm font-medium whitespace-nowrap border-b-2 transition-colors ${
active === t.id
? 'border-purple-600 text-purple-700 bg-white'
: 'border-transparent text-gray-600 hover:bg-gray-100'
}`}
>
<span className="mr-1.5">{t.icon}</span>
{t.label}
</button>
))}
</div>
{/* Tab-Content */}
<div className="p-4 min-h-[400px]">
{active === 'overview' && <OverviewTab results={r} />}
{active === 'cookies' && (
<CookiesTab
audit={cookieAudit}
vendors={cmpVendors}
banner={banner}
/>
)}
{active === 'dse' && <DocTab doc={docsByType['dse']} label="Datenschutzerklärung" />}
{active === 'impressum' && <DocTab doc={docsByType['impressum']} label="Impressum" />}
{active === 'agb' && <AgbWiderrufTab docs={docsByType} />}
{active === 'banner' && <BannerTab banner={banner} />}
{active === 'mail' && <MailPreviewTab results={r} />}
</div>
</div>
)
}
// ── Übersicht ──────────────────────────────────────────────────────────
function OverviewTab({ results }: { results: any }) {
const totalDocs = results.total_documents || (results.results?.length ?? 0)
const totalFindings = results.total_findings ?? 0
const banner = results.banner_result || results.cookie_banner_result || {}
const score = banner.compliance_score ?? banner.completeness_pct ?? null
const emailStatus = results.email_status
return (
<div className="space-y-4">
<div className="grid grid-cols-2 md:grid-cols-4 gap-3">
<Kpi label="Geprüfte Dokumente" value={totalDocs} />
<Kpi label="Findings gesamt" value={totalFindings} tone={totalFindings > 5 ? 'warn' : 'ok'} />
<Kpi label="Vendors erkannt" value={results.cmp_vendors?.length || 0} />
<Kpi label="Score" value={score !== null ? `${score}%` : '—'}
tone={score === null ? 'neutral' : score >= 80 ? 'ok' : score >= 60 ? 'warn' : 'bad'} />
</div>
{emailStatus && (
<div className={`text-sm px-3 py-2 rounded ${
emailStatus === 'sent' ? 'bg-green-50 text-green-800' : 'bg-gray-100 text-gray-700'
}`}>
E-Mail: {emailStatus === 'sent' ? '✓ Gesendet an Empfänger' : emailStatus}
</div>
)}
<div className="bg-blue-50 border border-blue-200 rounded p-3 text-xs text-blue-900">
<strong>Wo welcher Inhalt steckt:</strong> in den Tabs oben findest du die
Detail-Auswertung pro Doc-Typ. Im Cookie-Tab steht der
3-Quellen-Compliance-Vergleich (deklariert vs. Browser vs. Library); das ist der wichtigste
rechtliche Knackpunkt. Banner-Tab zeigt die echten Browser-Phasen-Checks.
</div>
</div>
)
}
function Kpi({ label, value, tone = 'neutral' }: { label: string; value: any; tone?: string }) {
const colors: Record<string, string> = {
ok: 'text-green-700 bg-green-50 border-green-200',
warn: 'text-amber-700 bg-amber-50 border-amber-200',
bad: 'text-red-700 bg-red-50 border-red-200',
neutral: 'text-gray-700 bg-gray-50 border-gray-200',
}
return (
<div className={`border rounded p-3 ${colors[tone]}`}>
<div className="text-[10px] uppercase tracking-wider opacity-70">{label}</div>
<div className="text-2xl font-bold mt-1">{value}</div>
</div>
)
}
// ── Cookies & VVT ──────────────────────────────────────────────────────
function CookiesTab({ audit, vendors, banner }: { audit: any; vendors: any[]; banner: any }) {
const declared = audit?.declared_count ?? 0
const browser = audit?.browser_count ?? 0
const both = (audit?.compliant ?? []).length
const undecl = (audit?.undeclared_in_browser ?? []).length
const decOnly = (audit?.declared_not_loaded ?? []).length
return (
<div className="space-y-4">
{/* Top bar with counts */}
<div className="grid grid-cols-3 md:grid-cols-5 gap-2">
<Kpi label="Deklariert" value={declared} />
<Kpi label="Im Browser" value={browser} />
<Kpi label="Compliant" value={both} tone="ok" />
<Kpi label="Undokumentiert" value={undecl} tone={undecl > 0 ? 'bad' : 'ok'} />
<Kpi label="Nicht geladen" value={decOnly} tone={decOnly > 0 ? 'warn' : 'neutral'} />
</div>
{/* 3-Spalten-Vergleichstabelle */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-3">
<CookieColumn
title={`❌ Undokumentiert (${undecl})`}
tone="bad"
subtitle="Geladen ABER nicht in der Richtlinie — Art. 13(1)(c) DSGVO Verstoß"
cookies={audit?.undeclared_in_browser ?? []}
/>
<CookieColumn
title={`✓ Compliant (${both})`}
tone="ok"
subtitle="Beide Quellen stimmen überein"
cookies={audit?.compliant ?? []}
/>
<CookieColumn
title={`⚠️ Nicht geladen (${decOnly})`}
tone="warn"
subtitle="In Richtlinie deklariert, aber bei diesem Lauf nicht im Browser"
cookies={audit?.declared_not_loaded ?? []}
/>
</div>
{/* Vendor-Liste (deduped) */}
<div>
<h3 className="text-sm font-semibold mb-2 text-gray-800">
Vendor-Liste ({vendors.length} unique nach Deduplizierung)
</h3>
<div className="overflow-x-auto border border-gray-200 rounded">
<table className="w-full text-xs">
<thead className="bg-gray-50">
<tr>
<th className="text-left px-3 py-2">Vendor</th>
<th className="text-left px-3 py-2">Kategorie</th>
<th className="text-left px-3 py-2">Quelle</th>
<th className="text-right px-3 py-2">Cookies</th>
</tr>
</thead>
<tbody>
{vendors.map((v, i) => (
<tr key={i} className="border-t border-gray-100 hover:bg-gray-50">
<td className="px-3 py-2 font-medium">{v.name}</td>
<td className="px-3 py-2 text-gray-600">{v.category || '—'}</td>
<td className="px-3 py-2 text-gray-500 font-mono text-[10px]">
{v.source || '—'}
</td>
<td className="px-3 py-2 text-right">{(v.cookies || []).length}</td>
</tr>
))}
</tbody>
</table>
</div>
</div>
</div>
)
}
function CookieColumn({ title, tone, subtitle, cookies }: {
title: string; tone: string; subtitle: string; cookies: string[]
}) {
const colors: Record<string, string> = {
bad: 'bg-red-50 border-red-200 text-red-900',
ok: 'bg-green-50 border-green-200 text-green-900',
warn: 'bg-amber-50 border-amber-200 text-amber-900',
}
return (
<div className={`border rounded p-3 ${colors[tone]}`}>
<div className="text-xs font-semibold mb-1">{title}</div>
<div className="text-[10px] opacity-80 mb-2">{subtitle}</div>
<div className="font-mono text-[10px] max-h-56 overflow-auto">
{cookies.length === 0 && <span className="opacity-60">– keine –</span>}
{cookies.map((c, i) => (
<div key={i} className="py-0.5">{c}</div>
))}
</div>
</div>
)
}
// ── Generic Doc-Tab ────────────────────────────────────────────────────
function DocTab({ doc, label }: { doc: any; label: string }) {
if (!doc) return <Empty label={label} />
const checks = doc.checks || []
const failed = checks.filter((c: any) => !c.passed && !c.skipped)
const passed = checks.filter((c: any) => c.passed)
return (
<div className="space-y-3">
<div className="flex items-center justify-between">
<h3 className="text-sm font-semibold">{label}</h3>
<div className="text-xs text-gray-600">
{doc.word_count?.toLocaleString('de-DE') || 0} Wörter ·{' '}
<span className="text-red-600">{failed.length} Findings</span> ·{' '}
<span className="text-green-600">{passed.length} OK</span>
</div>
</div>
{doc.url && (
<a href={doc.url} target="_blank" rel="noreferrer"
className="text-xs text-blue-600 hover:underline break-all">
{doc.url}
</a>
)}
<ChecklistView results={[doc]} />
</div>
)
}
function AgbWiderrufTab({ docs }: { docs: Record<string, any> }) {
const agb = docs['agb'] || docs['nutzungsbedingungen']
const wid = docs['widerruf']
return (
<div className="space-y-6">
<div>
<h3 className="text-sm font-semibold mb-2">AGB / Nutzungsbedingungen</h3>
{agb ? <ChecklistView results={[agb]} /> : <Empty label="AGB" inline />}
</div>
<div>
<h3 className="text-sm font-semibold mb-2">Widerrufsbelehrung</h3>
{wid ? <ChecklistView results={[wid]} /> : <Empty label="Widerruf" inline />}
</div>
</div>
)
}
function BannerTab({ banner }: { banner: any }) {
if (!banner || Object.keys(banner).length === 0) return <Empty label="Cookie-Banner" />
const phases = banner.phases || {}
const violations = banner.banner_checks?.violations || []
return (
<div className="space-y-3">
<div className="text-xs text-gray-700">
Banner erkannt: <strong>{banner.banner_detected ? 'Ja' : 'Nein'}</strong> ·{' '}
Provider: <strong>{banner.banner_provider || '—'}</strong> ·{' '}
Verstöße: <strong>{violations.length}</strong>
</div>
{violations.length > 0 && (
<div className="border border-red-200 bg-red-50 rounded p-3">
<div className="text-xs font-semibold text-red-800 mb-2">Verstöße</div>
<ul className="text-xs text-red-900 space-y-1">
{violations.map((v: any, i: number) => (
<li key={i}>• {v.label || v.message || JSON.stringify(v)}</li>
))}
</ul>
</div>
)}
<div className="grid grid-cols-3 gap-2">
{Object.entries(phases).map(([name, ph]: [string, any]) => (
<div key={name} className="border border-gray-200 rounded p-2">
<div className="text-[10px] uppercase text-gray-500">{name}</div>
<div className="text-xs mt-1">
Cookies: <strong>{ph.cookies?.length || 0}</strong>
</div>
<div className="text-xs">
Vendors: <strong>{ph.vendors?.length || 0}</strong>
</div>
</div>
))}
</div>
</div>
)
}
function MailPreviewTab({ results }: { results: any }) {
return (
<div className="text-xs text-gray-600 space-y-2">
<p>
Die vollständige Mail wurde {results.email_status === 'sent' ? 'gesendet' : 'erstellt'}.
Snapshot-ID:{' '}
<code className="bg-gray-100 px-1.5 py-0.5 rounded">{results.check_id || '—'}</code>
</p>
{results.check_id && (
<a
href={`/api/compliance/agent/snapshots/${results.check_id}/pdf`}
target="_blank" rel="noreferrer"
className="inline-block text-purple-600 hover:underline"
>
PDF der Mail herunterladen
</a>
)}
</div>
)
}
function Empty({ label, inline }: { label: string; inline?: boolean }) {
return (
<div className={`text-xs text-gray-500 ${inline ? '' : 'py-8 text-center'}`}>
Keine Daten für „{label}“ in diesem Lauf.
</div>
)
}
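`ResultsTabsView` looks documents up through `docsByType`, which keys each document by its lower-cased `doc_type` and keeps only the first document per type. A small sketch of that indexing step, assuming a reduced document shape (`DemoDoc`, `indexByType` and the sample URLs are illustrative only):

```typescript
// Hypothetical, reduced document shape for illustration.
interface DemoDoc { doc_type?: string; url?: string }

// Index documents by lower-cased doc_type; the first document of a type wins,
// and documents without a doc_type land under the empty-string key.
function indexByType<T extends { doc_type?: string }>(docs: T[]): Record<string, T> {
  const byType: Record<string, T> = {}
  for (const d of docs) {
    const t = (d.doc_type || '').toLowerCase()
    if (!byType[t]) byType[t] = d
  }
  return byType
}

const demoDocs: DemoDoc[] = [
  { doc_type: 'DSE', url: '/datenschutz' },
  { doc_type: 'dse', url: '/privacy' },
  { doc_type: 'Impressum', url: '/impressum' },
  { url: '/untyped' },
]
const indexed = indexByType(demoDocs)
```

Because the first match wins, a second document of the same type (here `/privacy`) never reaches its tab; whether that is intended depends on how the backend orders `results`.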
@@ -0,0 +1,71 @@
/**
* P47: localStorage quota management.
*
* When old compliance-check results fill up the browser storage, safeSetItem
* catches the QuotaExceededError thrown by setItem, prunes old
* doc-check-result-* entries (oldest first), and retries the write.
*
* Used by DocCheckTab, BannerCheckTab, etc. when persisting the
* result blobs.
*/
const RESULT_KEY_PREFIX = 'doc-check-result-'
const MAX_KEEP = 10 // Keep at most 10 old result blobs.
export function safeSetItem(key: string, value: string): boolean {
try {
localStorage.setItem(key, value)
return true
} catch (err: any) {
if (err?.name !== 'QuotaExceededError'
&& err?.code !== 22 && err?.code !== 1014) {
console.warn('localStorage setItem failed:', err)
return false
}
pruneOldResults()
try {
localStorage.setItem(key, value)
return true
} catch {
// Pruning was not enough: prune aggressively (keep no old results)
pruneOldResults(0)
try {
localStorage.setItem(key, value)
return true
} catch {
console.warn('localStorage still full, value discarded')
return false
}
}
}
}
function pruneOldResults(keep: number = MAX_KEEP): void {
try {
const keys: { key: string; ts: number }[] = []
for (let i = 0; i < localStorage.length; i++) {
const k = localStorage.key(i)
if (!k || !k.startsWith(RESULT_KEY_PREFIX)) continue
const ts = Number(k.slice(RESULT_KEY_PREFIX.length)) || 0
keys.push({ key: k, ts })
}
keys.sort((a, b) => a.ts - b.ts) // oldest first
const toRemove = keys.slice(0, Math.max(0, keys.length - keep))
for (const k of toRemove) {
try { localStorage.removeItem(k.key) } catch {}
}
} catch {}
}
export function getStorageUsageMB(): number {
let bytes = 0
try {
for (let i = 0; i < localStorage.length; i++) {
const k = localStorage.key(i)
if (!k) continue
const v = localStorage.getItem(k) || ''
// JS strings are UTF-16: count 2 bytes per code unit
bytes += (k.length + v.length) * 2
}
} catch {}
return bytes / (1024 * 1024)
}
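The prune-and-retry flow of `safeSetItem` can be exercised outside a browser with a stand-in store. The sketch below is an assumption-laden illustration: `KeyValueStore`, `BoundedStore`, `pruneOldest` and `setWithPrune` are hypothetical names, the store enforces a simple character budget, and only the error `name` is checked (the original also accepts the legacy codes 22 and 1014).

```typescript
// Hypothetical subset of the Web Storage API so the sketch runs in Node.
interface KeyValueStore {
  readonly length: number
  key(index: number): string | null
  setItem(key: string, value: string): void
  removeItem(key: string): void
}

// Illustrative in-memory store that throws once a character budget is exceeded.
class BoundedStore implements KeyValueStore {
  private data: Record<string, string> = {}
  private maxChars: number
  constructor(maxChars: number) { this.maxChars = maxChars }
  get length(): number { return Object.keys(this.data).length }
  key(index: number): string | null { return Object.keys(this.data)[index] ?? null }
  has(key: string): boolean { return key in this.data }
  removeItem(key: string): void { delete this.data[key] }
  setItem(key: string, value: string): void {
    let used = key.length + value.length
    for (const k of Object.keys(this.data)) {
      if (k !== key) used += k.length + (this.data[k] ?? '').length
    }
    if (used > this.maxChars) {
      const err = new Error('quota exceeded')
      err.name = 'QuotaExceededError'
      throw err
    }
    this.data[key] = value
  }
}

const PREFIX = 'doc-check-result-'

// Remove the oldest prefixed entries until at most `keep` remain.
function pruneOldest(store: KeyValueStore, keep: number): void {
  const entries: { key: string; ts: number }[] = []
  for (let i = 0; i < store.length; i++) {
    const k = store.key(i)
    if (k && k.startsWith(PREFIX)) entries.push({ key: k, ts: Number(k.slice(PREFIX.length)) || 0 })
  }
  entries.sort((a, b) => a.ts - b.ts) // oldest first
  for (const e of entries.slice(0, Math.max(0, entries.length - keep))) store.removeItem(e.key)
}

// Try to write; on a quota error prune with each `keep` value in turn and retry.
function setWithPrune(store: KeyValueStore, key: string, value: string, keepSteps: number[] = [10, 0]): boolean {
  for (let step = 0; step <= keepSteps.length; step++) {
    try {
      store.setItem(key, value)
      return true
    } catch (err) {
      const isQuota = err instanceof Error && err.name === 'QuotaExceededError'
      if (!isQuota || step === keepSteps.length) return false
      pruneOldest(store, keepSteps[step] ?? 0)
    }
  }
  return false
}

const demoStore = new BoundedStore(60)
demoStore.setItem(PREFIX + '1', 'aaaa')
demoStore.setItem(PREFIX + '2', 'bbbb')
// The third write exceeds the budget until the oldest entry is pruned.
const stored = setWithPrune(demoStore, PREFIX + '3', 'cccccccc', [1, 0])
```

Expressing the retries as a list of `keep` values avoids the nested try/catch of the original while keeping the same escalation: prune gently first, then drop every old result.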
@@ -0,0 +1,302 @@
'use client'
import React, { useEffect, useState } from 'react'
type Phase = {
cookies?: string[]
scripts?: string[]
tracking_services?: (string | { name?: string })[]
new_tracking?: unknown[]
violations?: Array<{ severity?: string; text?: string }>
undocumented?: unknown[]
}
type CategoryTest = {
category: string
category_label: string
tracking_services?: (string | { name?: string })[]
cookies_set?: string[]
provider_details_visible?: boolean
violations?: Array<{ severity?: string; text?: string; legal_ref?: string }>
}
type BannerViolation = {
severity?: string
text?: string
legal_ref?: string
}
type StructuredCheck = {
id: string
label: string
passed: boolean
skipped?: boolean
severity: string
level?: number
hint?: string
}
type BannerResp = {
found: boolean
check_id: string
banner?: {
banner_provider?: string
banner_detected?: boolean
completeness_pct?: number
correctness_pct?: number
phases?: Record<string, Phase>
banner_checks?: { violations?: BannerViolation[] }
category_tests?: CategoryTest[]
structured_checks?: StructuredCheck[]
summary?: Record<string, number>
}
}
const PHASE_LABEL: Record<string, string> = {
before_consent: 'Vor Consent',
after_reject: 'Nach Ablehnung',
after_accept: 'Nach Annahme',
}
const SEV_BADGE: Record<string, string> = {
CRITICAL: 'bg-red-600 text-white',
HIGH: 'bg-red-100 text-red-800',
MEDIUM: 'bg-amber-100 text-amber-800',
LOW: 'bg-blue-100 text-blue-800',
INFO: 'bg-gray-100 text-gray-600',
}
function pctColor(pct?: number): string {
if (pct === undefined || pct === null) return 'text-gray-400'
return pct >= 80 ? 'text-green-700' : pct >= 50 ? 'text-amber-700' : 'text-red-700'
}
export default function BannerTab({ checkId }: { checkId: string }) {
const [data, setData] = useState<BannerResp | null>(null)
const [loading, setLoading] = useState(true)
const [error, setError] = useState<string | null>(null)
const [checkFilter, setCheckFilter] = useState<'all' | 'fail' | 'critical'>('fail')
useEffect(() => {
let cancelled = false
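setLoading(true)
setError(null)
fetch(`/api/sdk/v1/agent/banner/${checkId}`)
.then(r => {
// Surface HTTP errors instead of trying to parse an error page as JSON
if (!r.ok) throw new Error(`HTTP ${r.status}`)
return r.json()
})
.then(d => { if (!cancelled) setData(d) })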
.catch(e => { if (!cancelled) setError(String(e)) })
.finally(() => { if (!cancelled) setLoading(false) })
return () => { cancelled = true }
}, [checkId])
if (loading) return <div className="p-6 text-sm text-gray-500">Lade Banner-Daten...</div>
if (error) return <div className="p-6 text-sm text-red-600">Fehler: {error}</div>
if (!data?.found || !data.banner) {
return <div className="p-6 text-sm text-gray-500">Keine Banner-Daten zu diesem Check.</div>
}
const b = data.banner
const phases = b.phases || {}
const cats = b.category_tests || []
const violations = b.banner_checks?.violations || []
const checks = b.structured_checks || []
const summary = b.summary || {}
const filteredChecks = checks.filter(c => {
if (checkFilter === 'all') return true
if (checkFilter === 'fail') return !c.passed && !c.skipped
return !c.passed && !c.skipped && ['CRITICAL', 'HIGH'].includes(c.severity)
})
return (
<div className="space-y-6">
{/* Quality Cards */}
<div className="grid grid-cols-2 md:grid-cols-4 gap-3 text-xs">
<div className="border rounded p-3">
<div className="text-[10px] uppercase text-gray-500">Vollstaendigkeit</div>
<div className={`text-2xl font-semibold ${pctColor(b.completeness_pct)}`}>
{b.completeness_pct ?? '–'}{b.completeness_pct != null && '%'}
</div>
</div>
<div className="border rounded p-3">
<div className="text-[10px] uppercase text-gray-500">Korrektheit</div>
<div className={`text-2xl font-semibold ${pctColor(b.correctness_pct)}`}>
{b.correctness_pct ?? '–'}{b.correctness_pct != null && '%'}
</div>
</div>
<div className="border rounded p-3">
<div className="text-[10px] uppercase text-gray-500">Verstoesse</div>
<div className="text-2xl font-semibold text-red-700">
{summary.total_violations ?? violations.length}
</div>
<div className="text-[10px] text-gray-500 mt-1">
crit:{summary.critical ?? 0} · high:{summary.high ?? 0}
</div>
</div>
<div className="border rounded p-3">
<div className="text-[10px] uppercase text-gray-500">CMP</div>
<div className="text-sm font-medium text-gray-800 truncate">
{b.banner_provider || 'unbekannt'}
</div>
<div className="text-[10px] text-gray-500 mt-1">
{b.banner_detected ? 'Banner erkannt' : 'kein Banner'}
</div>
</div>
</div>
{/* Phases */}
<div className="border rounded-lg overflow-hidden">
<div className="px-4 py-2 bg-gray-50 border-b text-sm font-medium text-gray-700">
Cookie-Setzungen pro Phase (echter Browser-Test)
</div>
<table className="w-full text-xs">
<thead className="bg-gray-50 text-gray-600">
<tr>
<th className="px-3 py-2 text-left">Phase</th>
<th className="px-3 py-2 text-center">Cookies</th>
<th className="px-3 py-2 text-center">Tracker</th>
<th className="px-3 py-2 text-left">Auffaelligkeiten</th>
</tr>
</thead>
<tbody>
{(['before_consent', 'after_reject', 'after_accept'] as const).map(key => {
const p = phases[key] || {}
const nc = (p.cookies || []).length
const nt = (p.tracking_services || []).length
const issues: string[] = []
if (p.violations?.length) issues.push(`${p.violations.length} ${p.violations.length === 1 ? 'Verstoss' : 'Verstoesse'}`)
if (p.new_tracking?.length) issues.push(`${p.new_tracking.length} neue Tracker`)
if (p.undocumented?.length) issues.push(`${p.undocumented.length} undokumentiert`)
const color = key === 'before_consent'
? (nc === 0 ? 'text-green-600' : 'text-red-600')
: key === 'after_reject'
? (nc <= 1 ? 'text-green-600' : 'text-amber-600')
: 'text-gray-700'
return (
<tr key={key} className="border-t">
<td className="px-3 py-2 font-medium">{PHASE_LABEL[key]}</td>
<td className={`px-3 py-2 text-center font-semibold ${color}`}>{nc}</td>
<td className="px-3 py-2 text-center">{nt}</td>
<td className="px-3 py-2 text-gray-500">{issues.join(', ') || '—'}</td>
</tr>
)
})}
</tbody>
</table>
</div>
{/* Per-Category */}
{cats.length > 0 && (
<div className="border rounded-lg overflow-hidden">
<div className="px-4 py-2 bg-gray-50 border-b text-sm font-medium text-gray-700">
Provider-Listing pro Kategorie (P19 Click-Through-Test)
</div>
<table className="w-full text-xs">
<thead className="bg-gray-50 text-gray-600">
<tr>
<th className="px-3 py-2 text-left">Kategorie</th>
<th className="px-3 py-2 text-center">Anbieter sichtbar</th>
<th className="px-3 py-2 text-center">Tracker erkannt</th>
<th className="px-3 py-2 text-left">Violations</th>
</tr>
</thead>
<tbody>
{cats.map(c => {
const pdv = c.provider_details_visible
const pdv_label = pdv === true ? 'Ja' : pdv === false ? 'Nein' : '–'
const pdv_color = pdv === false ? 'text-red-700' : pdv === true ? 'text-green-700' : 'text-gray-400'
return (
<tr key={c.category} className="border-t">
<td className="px-3 py-2">{c.category_label}</td>
<td className={`px-3 py-2 text-center font-semibold ${pdv_color}`}>{pdv_label}</td>
<td className="px-3 py-2 text-center">{(c.tracking_services || []).length}</td>
<td className="px-3 py-2 text-red-700 text-[10px]">
{(c.violations || []).map(v => v.text?.slice(0, 80)).join('; ') || '—'}
</td>
</tr>
)
})}
</tbody>
</table>
</div>
)}
{/* Banner-Checks Violations */}
{violations.length > 0 && (
<div className="border rounded-lg overflow-hidden">
<div className="px-4 py-2 bg-gray-50 border-b text-sm font-medium text-gray-700">
Banner-Verstoesse ({violations.length})
</div>
<ul className="text-xs divide-y">
{violations.map((v, i) => {
const sev = (v.severity || 'MEDIUM').toUpperCase()
return (
<li key={i} className="px-3 py-2">
<div className="flex items-start gap-2">
<span className={`px-1.5 py-0.5 rounded text-[10px] font-medium ${SEV_BADGE[sev] || 'bg-gray-100'}`}>{sev}</span>
<div>
<div className="text-gray-900">{v.text}</div>
{v.legal_ref && <div className="text-[10px] text-gray-400 italic mt-1">Quelle: {v.legal_ref}</div>}
</div>
</div>
</li>
)
})}
</ul>
</div>
)}
{/* 46 structured_checks Drilldown */}
<div className="border rounded-lg overflow-hidden">
<div className="px-4 py-2 bg-gray-50 border-b text-sm font-medium text-gray-700 flex items-center gap-3">
<span>Banner-Checks ({checks.length})</span>
<div className="ml-auto flex gap-1">
{(['all', 'fail', 'critical'] as const).map(f => (
<button key={f}
onClick={() => setCheckFilter(f)}
className={`px-2 py-1 rounded text-[10px] border ${
checkFilter === f ? 'bg-blue-600 text-white border-blue-600'
: 'bg-white text-gray-600 border-gray-200'
}`}>
{f === 'all' ? 'Alle' : f === 'fail' ? 'Nur Fail' : 'Nur CRIT/HIGH'}
</button>
))}
</div>
</div>
<table className="w-full text-xs">
<thead className="bg-gray-50 text-gray-600">
<tr>
<th className="px-3 py-2 text-left">Status</th>
<th className="px-3 py-2 text-left">Sev</th>
<th className="px-3 py-2 text-left">Check</th>
</tr>
</thead>
<tbody>
{filteredChecks.map(c => (
<tr key={c.id} className="border-t">
<td className="px-3 py-2">
{c.passed ? <span className="text-green-600">✓</span>
: c.skipped ? <span className="text-gray-400">–</span>
: <span className="text-red-600">✗</span>}
</td>
<td className="px-3 py-2">
<span className={`px-1.5 py-0.5 rounded text-[10px] font-medium ${SEV_BADGE[c.severity] || 'bg-gray-100'}`}>
{c.severity}
</span>
</td>
<td className="px-3 py-2">
<div className="text-gray-900">{c.label}</div>
{c.hint && !c.passed && (
<div className="text-[10px] text-gray-500 mt-1">{c.hint.slice(0, 200)}</div>
)}
</td>
</tr>
))}
{filteredChecks.length === 0 && (
<tr><td colSpan={3} className="px-3 py-4 text-center text-gray-400">Keine Checks fuer den Filter.</td></tr>
)}
</tbody>
</table>
</div>
</div>
)
}
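The three filter buttons above map to a single predicate over `structured_checks`. A minimal sketch of that predicate as a standalone function, assuming a reduced check shape (`DemoCheck` and `filterChecks` are illustrative names):

```typescript
type CheckFilter = 'all' | 'fail' | 'critical'

// Hypothetical, reduced version of StructuredCheck.
interface DemoCheck { id: string; passed: boolean; skipped?: boolean; severity: string }

// 'fail' keeps checks that neither passed nor were skipped;
// 'critical' additionally requires severity CRITICAL or HIGH (case-sensitive).
function filterChecks(checks: DemoCheck[], filter: CheckFilter): DemoCheck[] {
  return checks.filter(c => {
    if (filter === 'all') return true
    const failed = !c.passed && !c.skipped
    if (filter === 'fail') return failed
    return failed && (c.severity === 'CRITICAL' || c.severity === 'HIGH')
  })
}

const demoChecks: DemoCheck[] = [
  { id: 'a', passed: true, severity: 'HIGH' },
  { id: 'b', passed: false, severity: 'CRITICAL' },
  { id: 'c', passed: false, skipped: true, severity: 'HIGH' },
  { id: 'd', passed: false, severity: 'LOW' },
  { id: 'e', passed: false, severity: 'high' },
]
const failedChecks = filterChecks(demoChecks, 'fail')
const criticalChecks = filterChecks(demoChecks, 'critical')
```

Check `e` shows one consequence of the comparison being case-sensitive: a lower-case `high` counts as a failure but not as critical. The violations list in the component upper-cases severities before looking up a badge; the check table does not.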
@@ -3,6 +3,7 @@
import React, { useEffect, useState, useMemo } from 'react'
import { use as useUnwrap } from 'react'
import FindingsTab from './FindingsTab'
import BannerTab from './BannerTab'
type MCRow = {
id: number
@@ -92,7 +93,7 @@ export default function AuditPage(
const [filterReg, setFilterReg] = useState<string>('')
const [filterDoc, setFilterDoc] = useState<string>('')
const [expanded, setExpanded] = useState<number | null>(null)
const [tab, setTab] = useState<'mc' | 'all'>('all')
const [tab, setTab] = useState<'mc' | 'all' | 'banner'>('all')
useEffect(() => {
let cancelled = false
@@ -155,6 +156,7 @@ export default function AuditPage(
<div className="flex gap-2 border-b border-gray-200">
{([
{ key: 'all', label: 'Voll-Audit (alle Findings)' },
{ key: 'banner', label: 'Cookie-Banner-Analyse' },
{ key: 'mc', label: 'Nur MC-Scorecard' },
] as const).map(t => (
<button key={t.key}
@@ -168,6 +170,7 @@ export default function AuditPage(
</div>
{tab === 'all' && <FindingsTab checkId={checkId} />}
{tab === 'banner' && <BannerTab checkId={checkId} />}
{tab === 'mc' && <>
{/* Scorecard */}
@@ -362,6 +362,16 @@ export default function AIActPage() {
)}
</StepHeader>
<div className="px-4 py-2 bg-emerald-50 border border-emerald-200 rounded-lg text-xs text-emerald-800 flex items-start gap-2">
<span className="font-semibold">Quellen &amp; Lizenz:</span>
<span>
Inhalte gemaess <strong>EU-Verordnung 2024/1689 (KI-Verordnung / AI Act)</strong>,
Lizenzregel R1 (EU_LAW, woertlich uebernehmbar).
Risiko-Klassifizierungslogik basiert auf Anhang III der Verordnung.{' '}
<a href="/sdk/licenses" className="underline">Quellenverzeichnis</a>
</span>
</div>
{/* Tabs */}
<div className="flex items-center gap-1 bg-gray-100 p-1 rounded-lg w-fit">
{TABS.map(tab => (
@@ -13,6 +13,7 @@ import {
CATEGORY_OPTIONS,
} from '../control-library/components/helpers'
import { ControlDetail } from '../control-library/components/ControlDetail'
import { SourceBadge } from '@/components/sdk/SourceBadge'
// =============================================================================
// TYPES
@@ -310,6 +311,7 @@ export default function AtomicControlsPage() {
<TargetAudienceBadge audience={ctrl.target_audience} />
<GenerationStrategyBadge strategy={ctrl.generation_strategy} pipelineInfo={ctrl} />
<ObligationTypeBadge type={ctrl.generation_metadata?.obligation_type as string} />
<SourceBadge controlUuid={ctrl.id} compact />
</div>
<h3 className="text-sm font-medium text-gray-900 group-hover:text-violet-700">{ctrl.title}</h3>
<p className="text-xs text-gray-500 mt-1 line-clamp-2">{ctrl.objective}</p>
@@ -3,6 +3,7 @@
import React, { useState } from 'react'
import { useRouter } from 'next/navigation'
import { StepHeader, STEP_EXPLANATIONS } from '@/components/sdk/StepHeader'
import { LicenseModuleBanner } from '@/components/sdk/LicenseModuleBanner'
import { useAuditChecklist } from './_hooks/useAuditChecklist'
import { ChecklistItemCard } from './_components/ChecklistItemCard'
import { LoadingSkeleton } from './_components/LoadingSkeleton'
@@ -89,6 +90,12 @@ export default function AuditChecklistPage() {
</div>
</StepHeader>
<LicenseModuleBanner
rule={3}
sourceLabel="BreakPilot-Audit-Methodik"
detail="Eigene Audit-Checklisten und -Workflows. Zitierte Rechtsquellen (DSGVO/ISO 27001/...) jeweils mit eigener Lizenzregel."
/>
{error && (
<div className="p-4 bg-red-50 border border-red-200 rounded-lg text-red-700 flex items-center justify-between">
<span>{error}</span>
@@ -232,14 +232,25 @@ export function StateBadge({ state }: { state: string }) {
export function LicenseRuleBadge({ rule }: { rule: number | null | undefined }) {
if (!rule) return null
const config: Record<number, { bg: string; label: string }> = {
1: { bg: 'bg-green-100 text-green-700', label: 'Free Use' },
2: { bg: 'bg-blue-100 text-blue-700', label: 'Zitation' },
3: { bg: 'bg-amber-100 text-amber-700', label: 'Reformuliert' },
// Corrected labels per Task #21 LICENSE_RULES.md mapping:
// R1 = verbatim (sovereign law/public domain, no attribution required)
// R2 = verbatim + mandatory attribution (CC-BY, OWASP, OECD, ENISA)
// R3 = cite identifier only (DIN/ANSI/IEC/DGUV/proprietary; pipeline drops full text)
const config: Record<number, { bg: string; label: string; title: string }> = {
1: { bg: 'bg-emerald-100 text-emerald-800', label: 'R1', title: 'Woertlich uebernehmbar (Hoheitsrecht/Public Domain)' },
2: { bg: 'bg-amber-100 text-amber-800', label: 'R2', title: 'Woertlich mit Attribution (CC-BY/OWASP/OECD/ENISA)' },
3: { bg: 'bg-slate-100 text-slate-700', label: 'R3', title: 'Nur Identifier-Verweis (DIN/ANSI/IEC/proprietaer)' },
}
const c = config[rule]
if (!c) return null
return <span className={`inline-flex items-center px-2 py-0.5 rounded text-xs font-medium ${c.bg}`}>{c.label}</span>
return (
<span
className={`inline-flex items-center px-2 py-0.5 rounded text-xs font-medium ${c.bg}`}
title={c.title}
>
{c.label}
</span>
)
}
export function VerificationMethodBadge({ method }: { method: string | null }) {
@@ -99,6 +99,16 @@ export default function CRAProjectsPage() {
</p>
</div>
<div className="mb-4 px-4 py-2 bg-emerald-50 border border-emerald-200 rounded-lg text-xs text-emerald-800 flex items-start gap-2">
<span className="font-semibold">Quellen &amp; Lizenz:</span>
<span>
Inhalte gemaess <strong>EU-Verordnung 2024/2847 (Cyber Resilience Act)</strong>,
Lizenzregel R1 (EU_LAW, woertlich uebernehmbar). ENISA-Implementation-Guidance
ergaenzend (R1 EU_PUBLIC).{' '}
<a href="/sdk/licenses" className="underline">Quellenverzeichnis</a>
</span>
</div>
{error && (
<div className="mb-4 bg-red-50 border border-red-200 rounded-lg p-4 text-sm text-red-700">
{error}
@@ -0,0 +1,124 @@
'use client'
/**
* Lifecycle stage filter for the document generator.
*
* Shows five stage tabs (Pre-Founding, Founding, Startup, KMU, Konzern) and
* filters the displayed templates by their `lifecycle_stage` array.
*
* Stage definitions are kept in sync with lib/sdk/founding/template-categories.ts
*/
import {
LIFECYCLE_STAGE_LABELS,
type LifecycleStage,
TEMPLATE_CATEGORIES,
} from '@/lib/sdk/founding/template-categories'
interface Props {
activeStage: LifecycleStage | 'all'
onChange: (stage: LifecycleStage | 'all') => void
/** Template counts per stage (optional; otherwise computed from the code registry) */
countsByStage?: Record<string, number>
}
const STAGE_ORDER: (LifecycleStage | 'all')[] = [
'all',
'pre_founding',
'founding',
'startup',
'kmu',
'konzern',
]
const STAGE_ICONS: Record<LifecycleStage | 'all', string> = {
all: '📚',
pre_founding: '🌱',
founding: '⚖️',
startup: '🚀',
kmu: '🏢',
konzern: '🏛️',
}
const STAGE_HINTS: Record<LifecycleStage, string> = {
pre_founding: 'Vor dem Notartermin — Term Sheet, IP-Sicherung, Wandeldarlehen',
founding: 'Für den Notartermin — Satzung, Gesellschafterliste, HRB-Anmeldung',
startup: '0–3 Jahre, <25 Mitarbeiter — Arbeitsverträge, AVV, Datenschutz',
kmu: '3+ Jahre, 25–250 MA — ISMS, Whistleblower, vollständige TOM',
konzern: '250+ MA — Konzern-Compliance, ISO 27001',
}
export function LifecycleFilter({ activeStage, onChange, countsByStage }: Props) {
const counts = countsByStage || computeCountsFromRegistry()
return (
<div className="mb-6" data-testid="lifecycle-filter">
<div className="flex items-center gap-2 mb-2">
<h3 className="text-sm font-semibold text-gray-700">Phase Deines Unternehmens</h3>
<span className="text-xs text-gray-500">– filtert Dokumente nach Lifecycle</span>
</div>
<div className="flex flex-wrap gap-2">
{STAGE_ORDER.map(stage => {
const isAll = stage === 'all'
const count = isAll
? Object.values(counts).reduce((s, c) => s + c, 0)
: (counts[stage] || 0)
const label = isAll ? 'Alle' : LIFECYCLE_STAGE_LABELS[stage as LifecycleStage].split(' (')[0]
const isActive = activeStage === stage
return (
<button
key={stage}
type="button"
data-testid={`stage-tab-${stage}`}
onClick={() => onChange(stage)}
className={`px-3 py-2 rounded-lg border text-sm font-medium transition ${
isActive
? 'bg-purple-600 text-white border-purple-600 shadow-sm'
: 'bg-white text-gray-700 border-gray-200 hover:border-purple-300 hover:bg-purple-50'
}`}
>
<span className="mr-1.5">{STAGE_ICONS[stage]}</span>
{label}
<span className={`ml-2 px-1.5 py-0.5 text-xs rounded-full ${
isActive ? 'bg-white/20' : 'bg-gray-100 text-gray-600'
}`}>
{count}
</span>
</button>
)
})}
</div>
{activeStage !== 'all' && (
<p className="mt-2 text-sm text-gray-500" data-testid="stage-hint">
{STAGE_HINTS[activeStage as LifecycleStage]}
</p>
)}
</div>
)
}
function computeCountsFromRegistry(): Record<string, number> {
const counts: Record<string, number> = {
pre_founding: 0, founding: 0, startup: 0, kmu: 0, konzern: 0,
}
for (const cat of Object.values(TEMPLATE_CATEGORIES)) {
for (const stage of cat.lifecycle_stage) {
counts[stage] = (counts[stage] || 0) + 1
}
}
return counts
}
export function filterTemplatesByStage<T extends { document_type?: string; type?: string }>(
templates: T[],
stage: LifecycleStage | 'all'
): T[] {
if (stage === 'all') return templates
return templates.filter(t => {
const docType = t.document_type || t.type
if (!docType) return false
const cat = TEMPLATE_CATEGORIES[docType]
if (!cat) return stage === 'startup' // Fallback: uncategorized templates are shown under Startup
return cat.lifecycle_stage.includes(stage)
})
}
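The behaviour of `filterTemplatesByStage` above, in particular the fallback for uncategorized document types, can be illustrated with a small standalone sketch. The `DEMO_CATEGORIES` registry, the `DemoTemplate` type, and the sample data are hypothetical stand-ins; the real `TEMPLATE_CATEGORIES` is not part of this diff.

```typescript
// Hypothetical, trimmed-down registry; the real one lives in
// lib/sdk/founding/template-categories and is not shown in this diff.
type LifecycleStage = 'pre_founding' | 'founding' | 'startup' | 'kmu' | 'konzern'

const DEMO_CATEGORIES: Record<string, { lifecycle_stage: LifecycleStage[] }> = {
  term_sheet: { lifecycle_stage: ['pre_founding'] },
  articles_of_association: { lifecycle_stage: ['founding'] },
  privacy_policy: { lifecycle_stage: ['startup', 'kmu', 'konzern'] },
}

interface DemoTemplate { id: string; document_type?: string; type?: string }

// Same decision logic as filterTemplatesByStage, applied to the demo registry.
function filterByStageDemo(templates: DemoTemplate[], stage: LifecycleStage | 'all'): DemoTemplate[] {
  if (stage === 'all') return templates
  return templates.filter(t => {
    const docType = t.document_type || t.type
    if (!docType) return false                 // no type at all: never shown in a phase tab
    const cat = DEMO_CATEGORIES[docType]
    if (!cat) return stage === 'startup'       // uncategorized: only shown under Startup
    return cat.lifecycle_stage.indexOf(stage) !== -1
  })
}

const demoTemplates: DemoTemplate[] = [
  { id: 'a', document_type: 'term_sheet' },
  { id: 'b', type: 'privacy_policy' },
  { id: 'c', document_type: 'unknown_type' },
  { id: 'd' },
]

const startupIds = filterByStageDemo(demoTemplates, 'startup').map(t => t.id)
const foundingIds = filterByStageDemo(demoTemplates, 'founding').map(t => t.id)
```

In this sketch the Startup tab would list the categorized `privacy_policy` entry plus the uncategorized one, while a template without any type only appears under "all".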
@@ -39,7 +39,7 @@ export const CATEGORIES: { key: string; label: string; types: string[] | null }[
]},
// Datenschutz-Informationen (alle DSI-Typen):
{ key: 'dsi', label: 'Datenschutzinfos', types: ['privacy_policy', 'applicant_dsi', 'employee_dsi', 'social_media_dsi', 'video_conference_dsi', 'informationspflichten'] },
{ key: 'dsi', label: 'Datenschutzinfos', types: ['privacy_policy', 'data_protection_policy', 'applicant_dsi', 'employee_dsi', 'social_media_dsi', 'video_conference_dsi', 'informationspflichten'] },
// Einwilligungen:
{ key: 'consent', label: 'Einwilligungen', types: ['consent_texts', 'cookie_banner', 'verpflichtungserklaerung'] },
@@ -15,6 +15,8 @@ import { getGeneratorDefaults, getProfileLabel } from './scopeDefaults'
import TemplateLibrary from './_components/TemplateLibrary'
import GeneratorSection from './_components/GeneratorSection'
import RecommendedDocuments from './_components/RecommendedDocuments'
import { LifecycleFilter, filterTemplatesByStage } from './_components/LifecycleFilter'
import type { LifecycleStage } from '@/lib/sdk/founding/template-categories'
function DocumentGeneratorPageInner() {
const { state } = useSDK()
@@ -24,6 +26,7 @@ function DocumentGeneratorPageInner() {
const [allTemplates, setAllTemplates] = useState<LegalTemplateResult[]>([])
const [isLoadingLibrary, setIsLoadingLibrary] = useState(true)
const [activeCategory, setActiveCategory] = useState<string>('all')
const [activeStage, setActiveStage] = useState<LifecycleStage | 'all'>('all')
const [activeLanguage, setActiveLanguage] = useState<'all' | 'de' | 'en'>('all')
const [librarySearch, setLibrarySearch] = useState('')
const [expandedPreviewId, setExpandedPreviewId] = useState<string | null>(null)
@@ -209,10 +212,15 @@ function DocumentGeneratorPageInner() {
}
}, [selectedDataPointsData])
// Filtered templates (computed)
// Filtered templates (computed) — Lifecycle + Category + Language + Search
const filteredTemplates = useMemo(() => {
const category = CATEGORIES.find((c: { key: string }) => c.key === activeCategory)
return allTemplates.filter((t) => {
// 1. Lifecycle phase filter via code registry (mapped onto templateType)
const stageFiltered = filterTemplatesByStage(
allTemplates.map(t => ({ ...t, document_type: t.templateType || '' })),
activeStage
)
return stageFiltered.filter((t) => {
if (category && category.types !== null) {
if (!category.types.includes(t.templateType || '')) return false
}
@@ -225,7 +233,22 @@ function DocumentGeneratorPageInner() {
}
return true
})
}, [allTemplates, activeCategory, activeLanguage, librarySearch])
}, [allTemplates, activeCategory, activeStage, activeLanguage, librarySearch])
// Counts by stage for filter UI
const countsByStage = useMemo(() => {
const counts: Record<string, number> = { pre_founding: 0, founding: 0, startup: 0, kmu: 0, konzern: 0 }
const stages: LifecycleStage[] = ['pre_founding', 'founding', 'startup', 'kmu', 'konzern']
for (const t of allTemplates) {
const docType = t.templateType || ''
for (const s of stages) {
if (filterTemplatesByStage([{ document_type: docType }], s).length) {
counts[s]++
}
}
}
return counts
}, [allTemplates])
const handleUseTemplate = useCallback((t: LegalTemplateResult) => {
setActiveTemplate(t)
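The `countsByStage` memo in this hunk calls `filterTemplatesByStage` once per template and stage. A possible single-pass alternative, assuming direct access to the registry, is sketched below. `DEMO_REGISTRY` and `countTemplatesByStage` are hypothetical names used only for illustration; the sketch is meant to mirror the same fallback rule (uncategorized types count as Startup, missing types count nowhere).

```typescript
// Assumption: stage names as used in the diff; DEMO_REGISTRY is a hypothetical
// stand-in for TEMPLATE_CATEGORIES, whose contents are not shown here.
type Stage = 'pre_founding' | 'founding' | 'startup' | 'kmu' | 'konzern'
const ALL_STAGES: Stage[] = ['pre_founding', 'founding', 'startup', 'kmu', 'konzern']

const DEMO_REGISTRY: Record<string, { lifecycle_stage: Stage[] }> = {
  term_sheet: { lifecycle_stage: ['pre_founding', 'founding'] },
  dpa: { lifecycle_stage: ['startup', 'kmu', 'konzern'] },
}

function countTemplatesByStage(templateTypes: Array<string | undefined>): Record<Stage, number> {
  const counts: Record<Stage, number> = { pre_founding: 0, founding: 0, startup: 0, kmu: 0, konzern: 0 }
  for (const docType of templateTypes) {
    if (!docType) continue // templates without a type match no stage
    const cat = DEMO_REGISTRY[docType]
    // Same fallback as filterTemplatesByStage: uncategorized types count as startup
    const stages: Stage[] = cat ? cat.lifecycle_stage : ['startup']
    for (const s of ALL_STAGES) {
      // Each stage is counted at most once per template
      if (stages.indexOf(s) !== -1) counts[s]++
    }
  }
  return counts
}

const demoCounts = countTemplatesByStage(['term_sheet', 'dpa', 'custom_doc', '', undefined])
```

Whether this is worth the change depends on the template count; for a library of roughly a hundred templates the per-stage filter calls are unlikely to matter.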
@@ -274,6 +297,16 @@ function DocumentGeneratorPageInner() {
tips={stepInfo.tips}
/>
<div className="px-4 py-2 bg-slate-50 border border-slate-200 rounded-lg text-xs text-slate-700 flex items-start gap-2">
<span className="font-semibold">Quellen &amp; Lizenz:</span>
<span>
Die 91 Standard-Vorlagen sind <strong>BreakPilot-Eigenwerke</strong> (Lizenzregel R3: Identifier-Verweis,
eigene Lizenz). Vorlagen mit gesetzlicher Grundlage (z.B. VVT nach Art. 30 DSGVO,
Loeschkonzept nach Art. 17 DSGVO) zitieren die jeweilige Rechtsquelle als R1.{' '}
<a href="/sdk/licenses" className="underline">Quellenverzeichnis</a>
</span>
</div>
{/* Status bar */}
<div className="grid grid-cols-3 gap-4">
<div className="bg-white rounded-xl border border-gray-200 p-5">
@@ -292,6 +325,13 @@ function DocumentGeneratorPageInner() {
</div>
</div>
{/* Lifecycle-Phase Filter */}
<LifecycleFilter
activeStage={activeStage}
onChange={setActiveStage}
countsByStage={countsByStage}
/>
{/* Recommended documents based on scope profile */}
<RecommendedDocuments
allTemplates={allTemplates}
@@ -225,6 +225,51 @@ const TEMPLATE_RULES: TemplateRule[] = [
condition: () => 'required', // Always mandatory for websites
},
// ── Privacy policy & core data-protection documents (P38) ──────────────
{
templateType: 'privacy_policy',
label: 'Datenschutzerklaerung (Website)',
condition: () => 'required', // Art. 13 DSGVO: mandatory for every website
},
{
templateType: 'data_protection_policy',
label: 'Datenschutzrichtlinie (intern)',
condition: (_answers, level) => level >= 'L2' ? 'required' : 'recommended',
},
{
templateType: 'dsfa',
label: 'DSFA-Vorlage',
condition: (answers) => {
const dsfa = answers.get('proc_dsfa_required') || answers.get('comp_dsfa_processes')
if (dsfa === 'yes' || dsfa === 'required') return 'required'
return 'optional'
},
},
{
templateType: 'dpa',
label: 'Auftragsverarbeitungsvertrag (AVV)',
condition: (answers) => {
const vendors = answers.get('comp_has_processors') || answers.get('comp_vendor_management')
if (vendors && vendors !== 'no') return 'required'
return 'recommended'
},
},
{
templateType: 'vvt_register',
label: 'Verzeichnis von Verarbeitungstaetigkeiten (VVT)',
condition: (_answers, level) => level >= 'L2' ? 'required' : 'recommended',
},
{
templateType: 'tom_documentation',
label: 'TOM-Dokumentation',
condition: (_answers, level) => level >= 'L2' ? 'required' : 'recommended',
},
{
templateType: 'loeschkonzept',
label: 'Loeschkonzept',
condition: (_answers, level) => level >= 'L2' ? 'required' : 'recommended',
},
// ── Third-country transfer (SCC + TIA) ──────────────────────────────────
// SCC+TIA only required for third-country transfers WITHOUT an adequacy decision/DPF
{
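Several of the rules added in this hunk use `level >= 'L2'`, which compares level identifiers as strings. That works as long as every level is a single-digit identifier, but it is an implicit contract. A minimal sketch of an explicit rank comparison follows; the `Level` union (`'L1'` to `'L4'`), `LEVEL_RANK`, and `atLeast` are assumptions, since the actual level type is not shown in the diff.

```typescript
// Assumption: levels are 'L1'..'L4'; only 'L2' actually appears in the diff.
type Level = 'L1' | 'L2' | 'L3' | 'L4'
type Requirement = 'required' | 'recommended' | 'optional'

const LEVEL_RANK: Record<Level, number> = { L1: 1, L2: 2, L3: 3, L4: 4 }

// Numeric comparison instead of lexicographic string comparison.
function atLeast(level: Level, min: Level): boolean {
  return LEVEL_RANK[level] >= LEVEL_RANK[min]
}

// Equivalent of: (_answers, level) => level >= 'L2' ? 'required' : 'recommended'
function vvtRequirement(level: Level): Requirement {
  return atLeast(level, 'L2') ? 'required' : 'recommended'
}

// Why the string comparison is fragile: a hypothetical two-digit level sorts
// before 'L2' because '1' < '2' at the second character.
const hypotheticalLevel: string = 'L10'
const lexicalResult = hypotheticalLevel >= 'L2'

const l1Result = vvtRequirement('L1')
const l3Result = vvtRequirement('L3')
```

For the single-digit levels assumed here both approaches agree, so this is a robustness note rather than a bug report.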
@@ -132,6 +132,16 @@ export default function DSFAPage() {
)}
</StepHeader>
<div className="px-4 py-2 bg-emerald-50 border border-emerald-200 rounded-lg text-xs text-emerald-800 flex items-start gap-2">
<span className="font-semibold">Quellen &amp; Lizenz:</span>
<span>
Inhalte gemaess <strong>DSGVO Art. 35</strong> (EU 2016/679), Lizenzregel R1
(Hoheitsrecht/EU_LAW, woertlich uebernehmbar). Vorlagen-Texte aus
Aufsichtsbehoerden ebenfalls R1.{' '}
<a href="/sdk/licenses" className="underline">Quellenverzeichnis</a>
</span>
</div>
{/* DSFA Requirement Check */}
{dsfaCheck.required && dsfas.length === 0 && (
<div className="bg-red-50 border border-red-200 rounded-xl p-5">
@@ -0,0 +1,220 @@
'use client'
import { useState } from 'react'
import type { FoundingWizardState } from '@/lib/sdk/founding/types'
interface Props {
state: FoundingWizardState
update: <K extends keyof FoundingWizardState>(k: K, v: FoundingWizardState[K]) => void
}
export function StepBasics({ state, update }: Props) {
const b = state.basics
const [prefillStatus, setPrefillStatus] = useState<'idle' | 'loading' | 'success' | 'error'>('idle')
async function prefillFromCompanyProfile() {
setPrefillStatus('loading')
try {
const res = await fetch('/api/sdk/v1/company-profile', { cache: 'no-store' })
if (!res.ok) throw new Error(`HTTP ${res.status}`)
const payload = await res.json()
const p = payload?.profile ?? payload
if (!p || typeof p !== 'object') throw new Error('leeres Profil')
const industries = Array.isArray(p.industry) ? p.industry.filter(Boolean) : []
const industry = industries.length > 0
? industries.join(', ')
: (p.industryOther || b.industry)
const address = [p.headquartersStreet, [p.headquartersZip, p.headquartersCity].filter(Boolean).join(' ')]
.filter(Boolean).join(', ') || b.company_address
const seat = p.headquartersCity || b.company_seat
// Derive purpose from offerings/businessModel; fall back to the existing value if neither is set
const purposeBits: string[] = []
if (p.businessModel) purposeBits.push(`Geschäftsmodell: ${p.businessModel}`)
if (Array.isArray(p.offerings) && p.offerings.length > 0)
purposeBits.push(`Leistungen: ${p.offerings.join(', ')}`)
const purpose = purposeBits.length > 0
? purposeBits.join('; ')
: b.company_purpose_description
update('basics', {
...b,
company_name: p.companyName || b.company_name,
legal_form: (p.legalForm === 'UG' ? 'UG' : (p.legalForm === 'GmbH' ? 'GmbH' : b.legal_form)),
company_seat: seat,
company_address: address,
industry,
company_purpose_description: b.company_purpose_description.trim() === '' ? purpose : b.company_purpose_description,
})
setPrefillStatus('success')
} catch (err) {
console.error('[founding-wizard] prefill failed', err)
setPrefillStatus('error')
}
}
return (
<div className="space-y-4">
<div className="flex items-center justify-between">
<p className="text-sm text-gray-600">
Stammdaten der Gesellschaft. Pflicht für Satzung, HRB-Anmeldung und SHA.
</p>
<button
type="button"
onClick={prefillFromCompanyProfile}
disabled={prefillStatus === 'loading'}
className="px-3 py-1.5 text-sm rounded-lg border border-blue-300 bg-blue-50 hover:bg-blue-100 disabled:opacity-50"
>
{prefillStatus === 'loading' ? 'Lade…' : 'Aus Unternehmensprofil vorbefüllen'}
</button>
</div>
{prefillStatus === 'success' && (
<div className="text-xs text-green-700 bg-green-50 border border-green-200 rounded px-2 py-1">
Daten aus Unternehmensprofil übernommen. Bitte prüfen und ergänzen.
</div>
)}
{prefillStatus === 'error' && (
<div className="text-xs text-amber-700 bg-amber-50 border border-amber-200 rounded px-2 py-1">
Konnte Unternehmensprofil nicht laden, bitte Felder manuell ausfüllen.
</div>
)}
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Firmenname</label>
<input
data-testid="company-name"
type="text"
value={b.company_name}
onChange={e => update('basics', { ...b, company_name: e.target.value })}
placeholder="Breakpilot GmbH"
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Rechtsform</label>
<select
data-testid="legal-form"
value={b.legal_form}
onChange={e => update('basics', { ...b, legal_form: e.target.value as 'GmbH' | 'UG' })}
className="w-full px-3 py-2 border rounded-lg"
>
<option value="GmbH">GmbH</option>
<option value="UG">UG (haftungsbeschränkt)</option>
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Sitz (Stadt)</label>
<input
data-testid="company-seat"
type="text"
value={b.company_seat}
onChange={e => update('basics', { ...b, company_seat: e.target.value })}
placeholder="z.B. Stuttgart"
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Adresse</label>
<input
data-testid="company-address"
type="text"
value={b.company_address}
onChange={e => update('basics', { ...b, company_address: e.target.value })}
placeholder="Straße, PLZ Ort"
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Branche</label>
<input
data-testid="industry"
type="text"
value={b.industry}
onChange={e => update('basics', { ...b, industry: e.target.value })}
placeholder="z.B. SaaS, Beratung, Handwerk"
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Geschäftsjahr</label>
<input
data-testid="business-year"
type="text"
value={b.business_year}
onChange={e => update('basics', { ...b, business_year: e.target.value })}
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Registergericht
</label>
<input
data-testid="register-court"
type="text"
value={b.register_court || ''}
onChange={e => update('basics', { ...b, register_court: e.target.value })}
placeholder="z.B. Amtsgericht Stuttgart"
className="w-full px-3 py-2 border rounded-lg"
/>
<p className="text-xs text-gray-500 mt-1">
Zuständiges Amtsgericht für HRB-Eintragung
</p>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
HRB-Nummer <span className="text-gray-400">(optional)</span>
</label>
<input
data-testid="hrb-number"
type="text"
value={b.hrb_number || ''}
onChange={e => update('basics', { ...b, hrb_number: e.target.value })}
placeholder="z.B. HRB 12345 (leer falls noch nicht eingetragen)"
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Unternehmensgegenstand (Volltext für § 2 Satzung)
</label>
<textarea
data-testid="company-purpose"
value={b.company_purpose_description}
onChange={e => update('basics', { ...b, company_purpose_description: e.target.value })}
rows={4}
placeholder="z.B. die Entwicklung, Bereitstellung, der Betrieb und der Vertrieb von Softwarelösungen, Plattformen und IT-Dienstleistungen im Bereich der Künstlichen Intelligenz"
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Detaillierte Tätigkeitsbereiche (eine Zeile pro Bullet)
</label>
<textarea
data-testid="company-purpose-bullets"
value={b.company_purpose_bullets.join('\n')}
onChange={e => update('basics', { ...b, company_purpose_bullets: e.target.value.split('\n').filter(Boolean) })}
rows={5}
placeholder={'a) Entwicklung von Software\nb) Beratung im Bereich...\nc) ...'}
className="w-full px-3 py-2 border rounded-lg font-mono text-sm"
/>
</div>
<div className="flex items-center gap-2">
<input
type="checkbox"
id="research_focus"
data-testid="research-focus"
checked={b.has_research_focus}
onChange={e => update('basics', { ...b, has_research_focus: e.target.checked })}
/>
<label htmlFor="research_focus" className="text-sm text-gray-700">
Forschungsfokus (aktiviert F&amp;E-Klauseln in SHA und GO-GF)
</label>
</div>
</div>
)
}
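The address composition inside `prefillFromCompanyProfile` is easier to reason about (and to unit-test) as a pure function. The sketch below extracts just that step. The field names are taken from the prefill code above, while the `ProfileLike` interface and `composeAddress` are hypothetical names introduced for illustration.

```typescript
// Field names follow the prefill code in StepBasics; the ProfileLike interface
// itself is a hypothetical stand-in for the real company-profile type.
interface ProfileLike {
  headquartersStreet?: string
  headquartersZip?: string
  headquartersCity?: string
}

// Builds "Street, ZIP City", skipping missing parts; returns the fallback
// (the value already in the form) when the profile has no address data.
function composeAddress(p: ProfileLike, fallback: string): string {
  const zipAndCity = [p.headquartersZip, p.headquartersCity].filter(Boolean).join(' ')
  return [p.headquartersStreet, zipAndCity].filter(Boolean).join(', ') || fallback
}

const fullAddress = composeAddress(
  { headquartersStreet: 'Musterstraße 1', headquartersZip: '70173', headquartersCity: 'Stuttgart' },
  ''
)
const cityOnly = composeAddress({ headquartersCity: 'Stuttgart' }, 'bisherige Adresse')
const emptyProfile = composeAddress({}, 'bisherige Adresse')
```

Note that a partially filled profile (for example city only) overrides whatever the user had already typed, which matches the behaviour of the code in the diff.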
@@ -0,0 +1,146 @@
'use client'
import { useMemo } from 'react'
import type { FoundingWizardState, GeneratedDocument } from '@/lib/sdk/founding/types'
import { NOTARY_BUNDLE_DOCUMENTS } from '@/lib/sdk/founding/template-categories'
interface Props {
state: FoundingWizardState
update: <K extends keyof FoundingWizardState>(k: K, v: FoundingWizardState[K]) => void
generating: boolean
error: string | null
onGenerate: () => Promise<GeneratedDocument[]>
}
const DOC_LABELS: Record<string, string> = {
articles_of_association: 'Satzung',
gesellschafterliste: 'Gesellschafterliste (§ 40 GmbHG)',
gf_bestellungsbeschluss: 'Gesellschafterbeschluss zur GF-Bestellung',
hrb_anmeldung: 'Handelsregister-Anmeldung',
sha: 'Shareholders\' Agreement (SHA)',
geschaeftsordnung_gf: 'Geschäftsordnung Geschäftsführung (GO-GF)',
managing_director_employment_contract: 'GF-Dienstvertrag (pro GF)',
ip_assignment_agreement: 'IP-Assignment (pro Gründer)',
term_sheet: 'Term Sheet',
convertible_loan_agreement: 'Wandeldarlehensvertrag',
subscription_agreement: 'Beteiligungsvertrag',
esop_plan: 'ESOP/VSOP-Plan',
cap_table: 'Cap Table',
}
export function StepGenerate({ state, update, generating, error, onGenerate }: Props) {
const toggleDoc = (docType: string) => {
const next = state.selected_documents.includes(docType)
? state.selected_documents.filter(d => d !== docType)
: [...state.selected_documents, docType]
update('selected_documents', next)
}
const selectNotaryBundle = () => {
update('selected_documents', [...NOTARY_BUNDLE_DOCUMENTS])
}
const summary = useMemo(() => ({
name: state.basics.company_name,
seat: state.basics.company_seat,
stammkapital: state.capital.stammkapital_eur,
num_gesellschafter: state.gesellschafter.length,
num_gf: state.gesellschafter.filter(g => g.is_geschaeftsfuehrer).length,
}), [state])
return (
<div className="space-y-6">
<div className="bg-purple-50 border border-purple-200 rounded-lg p-4">
<h3 className="font-semibold text-purple-900 mb-2">Zusammenfassung</h3>
<dl className="grid grid-cols-2 gap-2 text-sm" data-testid="generate-summary">
<dt className="text-gray-600">Firma:</dt><dd>{summary.name} ({state.basics.legal_form})</dd>
<dt className="text-gray-600">Sitz:</dt><dd>{summary.seat}</dd>
<dt className="text-gray-600">Stammkapital:</dt><dd>{summary.stammkapital.toLocaleString('de-DE')} €</dd>
<dt className="text-gray-600">Gesellschafter:</dt><dd>{summary.num_gesellschafter}</dd>
<dt className="text-gray-600">Geschäftsführer:</dt><dd>{summary.num_gf}</dd>
<dt className="text-gray-600">Notar:</dt><dd>{state.notar.notary_name} ({state.notar.notary_place})</dd>
</dl>
</div>
<div>
<div className="flex justify-between items-center mb-3">
<h3 className="font-semibold">Zu generierende Dokumente</h3>
<button
type="button"
data-testid="select-notary-bundle"
onClick={selectNotaryBundle}
className="text-sm text-purple-600 hover:underline"
>
Notartermin-Bundle auswählen
</button>
</div>
<div className="grid grid-cols-1 gap-2">
{Object.entries(DOC_LABELS).map(([docType, label]) => (
<label key={docType} className="flex items-start gap-3 p-2 hover:bg-gray-50 rounded">
<input
type="checkbox"
data-testid={`doc-${docType}`}
checked={state.selected_documents.includes(docType)}
onChange={() => toggleDoc(docType)}
className="mt-1"
/>
<div className="flex-1">
<div className="text-sm font-medium">{label}</div>
<div className="text-xs text-gray-500">{docType}</div>
</div>
{NOTARY_BUNDLE_DOCUMENTS.includes(docType) && (
<span className="text-xs bg-purple-100 text-purple-700 px-2 py-0.5 rounded">Notartermin</span>
)}
</label>
))}
</div>
</div>
<div className="flex justify-between items-center pt-4 border-t">
<p className="text-sm text-gray-500">
{state.selected_documents.length} Dokument(e) ausgewählt
</p>
<button
data-testid="generate-docs"
onClick={onGenerate}
disabled={generating || state.selected_documents.length === 0}
className="px-6 py-3 bg-purple-600 text-white rounded-lg hover:bg-purple-700 disabled:opacity-50 font-medium"
>
{generating ? 'Generiere...' : 'Dokumente als Word generieren'}
</button>
</div>
{error && (
<div className="bg-red-50 border border-red-200 rounded-lg p-3 text-sm text-red-900" data-testid="generate-error">
Fehler: {error}
</div>
)}
{state.generated_documents && state.generated_documents.length > 0 && (
<div className="bg-green-50 border border-green-200 rounded-lg p-4" data-testid="generated-docs">
<h3 className="font-semibold text-green-900 mb-3">
{state.generated_documents.length} Dokument(e) generiert
</h3>
<ul className="space-y-2">
{state.generated_documents.map((doc, idx) => (
<li key={idx} className="flex justify-between items-center bg-white rounded px-3 py-2 border border-green-200">
<div>
<div className="text-sm font-medium">{doc.title}</div>
<div className="text-xs text-gray-500">{(doc.size_bytes / 1024).toFixed(1)} KB</div>
</div>
<a
href={doc.download_url}
download
data-testid={`download-${doc.document_type}`}
className="px-3 py-1.5 bg-green-600 text-white rounded text-sm hover:bg-green-700"
>
Word herunterladen
</a>
</li>
))}
</ul>
</div>
)}
</div>
)
}
@@ -0,0 +1,215 @@
'use client'
import { useState } from 'react'
import type { FoundingWizardState, Gesellschafter } from '@/lib/sdk/founding/types'
interface Props {
state: FoundingWizardState
addGesellschafter: (g: Omit<Gesellschafter, 'id' | 'anteil_nr'>) => void
updateGesellschafter: (id: string, p: Partial<Gesellschafter>) => void
removeGesellschafter: (id: string) => void
}
export function StepGesellschafter({ state, addGesellschafter, updateGesellschafter, removeGesellschafter }: Props) {
const [form, setForm] = useState({
name: '', geburtsdatum: '', adresse: '', email: '',
nennbetrag_eur: 12500, is_geschaeftsfuehrer: true, internal_role: '',
has_academic_background: false, ip_areas: '',
})
const totalNennbetrag = state.gesellschafter.reduce((s, g) => s + g.nennbetrag_eur, 0)
const target = state.capital.stammkapital_eur
const handleAdd = () => {
if (!form.name.trim()) return
const ip_areas = form.ip_areas
.split('\n').map(s => s.trim()).filter(Boolean)
addGesellschafter({
rolle: 'founder',
name: form.name,
geburtsdatum: form.geburtsdatum || undefined,
adresse: form.adresse,
email: form.email || undefined,
nennbetrag_eur: form.nennbetrag_eur,
is_geschaeftsfuehrer: form.is_geschaeftsfuehrer,
internal_role: form.internal_role || undefined,
has_academic_background: form.has_academic_background,
ip_areas: ip_areas.length > 0 ? ip_areas : undefined,
})
setForm({ name: '', geburtsdatum: '', adresse: '', email: '', nennbetrag_eur: 12500,
is_geschaeftsfuehrer: true, internal_role: '', has_academic_background: false, ip_areas: '' })
}
return (
<div className="space-y-4">
<div className="bg-gray-50 p-4 rounded-lg">
<h3 className="font-semibold mb-3">Neuen Gesellschafter hinzufügen</h3>
<div className="grid grid-cols-2 gap-3">
<input
data-testid="gs-name"
placeholder="Name"
value={form.name}
onChange={e => setForm({ ...form, name: e.target.value })}
className="px-3 py-2 border rounded"
/>
<input
data-testid="gs-birthdate"
type="date"
placeholder="Geburtsdatum"
value={form.geburtsdatum}
onChange={e => setForm({ ...form, geburtsdatum: e.target.value })}
className="px-3 py-2 border rounded"
/>
<input
data-testid="gs-address"
placeholder="Adresse (Straße, PLZ Ort)"
value={form.adresse}
onChange={e => setForm({ ...form, adresse: e.target.value })}
className="px-3 py-2 border rounded col-span-2"
/>
<input
data-testid="gs-email"
type="email"
placeholder="E-Mail (optional)"
value={form.email}
onChange={e => setForm({ ...form, email: e.target.value })}
className="px-3 py-2 border rounded"
/>
<input
data-testid="gs-nennbetrag"
type="number"
min={1}
step={1}
placeholder="Nennbetrag in EUR"
value={form.nennbetrag_eur}
onChange={e => setForm({ ...form, nennbetrag_eur: parseInt(e.target.value) || 0 })}
className="px-3 py-2 border rounded"
/>
<select
data-testid="gs-role"
value={form.internal_role}
onChange={e => setForm({ ...form, internal_role: e.target.value })}
className="px-3 py-2 border rounded bg-white"
>
<option value="">Rolle wählen</option>
<option value="CEO">CEO (Chief Executive Officer)</option>
<option value="CTO">CTO (Chief Technical Officer)</option>
<option value="CFO">CFO (Chief Financial Officer)</option>
<option value="COO">COO (Chief Operating Officer)</option>
<option value="CPO">CPO (Chief Product Officer)</option>
<option value="Geschäftsführer">Geschäftsführer (ohne Spezialisierung)</option>
<option value="Gesellschafter">Gesellschafter (kein GF)</option>
<option value="Sonstige">Sonstige</option>
</select>
<div className="flex items-center gap-2">
<input
type="checkbox"
data-testid="gs-is-gf"
checked={form.is_geschaeftsfuehrer}
onChange={e => setForm({ ...form, is_geschaeftsfuehrer: e.target.checked })}
/>
<label className="text-sm">Geschäftsführer/in</label>
</div>
<div className="flex items-center gap-2">
<input
type="checkbox"
data-testid="gs-academic"
checked={form.has_academic_background}
onChange={e => setForm({ ...form, has_academic_background: e.target.checked })}
/>
<label className="text-sm">Akademischer Hintergrund</label>
</div>
</div>
<div className="mt-3">
<label className="block text-sm font-medium text-gray-700 mb-1">
IP-Bereiche, die diese Person in die Gesellschaft einbringt
<span className="text-gray-400"> (optional, eine Zeile pro Bereich)</span>
</label>
<textarea
data-testid="gs-ip-areas"
value={form.ip_areas}
onChange={e => setForm({ ...form, ip_areas: e.target.value })}
rows={3}
placeholder={'z.B.\nCompliance-Engine (Quellcode + Architektur)\nRAG-Pipeline\nKonfigurationsdaten'}
className="w-full px-3 py-2 border rounded font-mono text-xs"
/>
<p className="text-xs text-gray-500 mt-1">
Bei mehreren Gründern wird pro Person ein eigener IP-Assignment-Vertrag generiert.
</p>
</div>
<button
data-testid="add-gesellschafter"
onClick={handleAdd}
disabled={!form.name.trim() || form.nennbetrag_eur < 1}
className="mt-3 px-4 py-2 bg-purple-600 text-white rounded-lg hover:bg-purple-700 disabled:opacity-50"
>
Gesellschafter hinzufügen
</button>
</div>
<div>
<h3 className="font-semibold mb-3">Gesellschafter ({state.gesellschafter.length})</h3>
{state.gesellschafter.length === 0 ? (
<p className="text-gray-500 text-sm">Noch keine Gesellschafter angelegt.</p>
) : (
<table className="w-full text-sm" data-testid="gs-table">
<thead className="bg-gray-100">
<tr>
<th className="px-3 py-2 text-left">Nr.</th>
<th className="px-3 py-2 text-left">Name</th>
<th className="px-3 py-2 text-left">Geburtsdatum</th>
<th className="px-3 py-2 text-right">Nennbetrag</th>
<th className="px-3 py-2 text-right">Anteil %</th>
<th className="px-3 py-2">GF?</th>
<th className="px-3 py-2"></th>
</tr>
</thead>
<tbody>
{state.gesellschafter.map(g => (
<tr key={g.id} className="border-t" data-testid={`gs-row-${g.anteil_nr}`}>
<td className="px-3 py-2">{g.anteil_nr}</td>
<td className="px-3 py-2 font-medium">
{g.name}{g.internal_role ? ` (${g.internal_role})` : ''}
{g.ip_areas && g.ip_areas.length > 0 && (
<div className="text-xs text-gray-500 mt-0.5">
IP: {g.ip_areas.join(', ')}
</div>
)}
</td>
<td className="px-3 py-2">{g.geburtsdatum || '—'}</td>
<td className="px-3 py-2 text-right">{g.nennbetrag_eur.toLocaleString('de-DE')} €</td>
<td className="px-3 py-2 text-right">{((g.nennbetrag_eur / Math.max(target, 1)) * 100).toFixed(2)}%</td>
<td className="px-3 py-2 text-center">{g.is_geschaeftsfuehrer ? '✓' : '—'}</td>
<td className="px-3 py-2">
<button
onClick={() => removeGesellschafter(g.id)}
className="text-red-600 hover:underline text-xs"
>
Entfernen
</button>
</td>
</tr>
))}
<tr className="border-t-2 font-semibold bg-gray-50">
<td colSpan={3} className="px-3 py-2">Summe</td>
<td className="px-3 py-2 text-right" data-testid="gs-total">
{totalNennbetrag.toLocaleString('de-DE')} €
</td>
<td className="px-3 py-2 text-right">
{totalNennbetrag === target ? '100%' : `${target.toLocaleString('de-DE')} €`}
</td>
<td colSpan={2}></td>
</tr>
</tbody>
</table>
)}
{totalNennbetrag !== target && state.gesellschafter.length > 0 && (
<p className="mt-2 text-sm text-orange-600">
Die Summe der Nennbeträge ({totalNennbetrag.toLocaleString('de-DE')} €)
entspricht nicht dem Stammkapital ({target.toLocaleString('de-DE')} €).
</p>
)}
</div>
</div>
)
}
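The table footer and the orange warning in `StepGesellschafter` both derive from the same two numbers: the sum of all nominal amounts and the target share capital. A minimal sketch of that calculation as a standalone function is shown below; `ShareInput` and `summarizeCapital` are hypothetical names, and the percentage formula is copied from the row rendering above.

```typescript
// Hypothetical helper; mirrors the arithmetic used in the table rows and footer.
interface ShareInput { label: string; nennbetrag_eur: number }

function summarizeCapital(shares: ShareInput[], stammkapitalEur: number) {
  const total = shares.reduce((sum, g) => sum + g.nennbetrag_eur, 0)
  const rows = shares.map(g => ({
    label: g.label,
    // Math.max guards against division by zero while the capital field is empty
    pct: ((g.nennbetrag_eur / Math.max(stammkapitalEur, 1)) * 100).toFixed(2),
  }))
  return { total, matchesCapital: total === stammkapitalEur, rows }
}

const twoFounders = summarizeCapital(
  [{ label: 'A', nennbetrag_eur: 12500 }, { label: 'B', nennbetrag_eur: 12500 }],
  25000
)
const oneFounder = summarizeCapital([{ label: 'A', nennbetrag_eur: 12500 }], 25000)
```

With two founders at 12,500 EUR each the totals match a 25,000 EUR share capital; with only one founder the sum falls short, which is the case where the component shows its warning.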
@@ -0,0 +1,321 @@
'use client'
/**
* Combined simple steps: managing directors (3), capital (4), notary (5), SHA (6).
* Each sub-step is a simple form.
*/
import type { FoundingWizardState, GFContract } from '@/lib/sdk/founding/types'
interface PropsBase {
state: FoundingWizardState
update: <K extends keyof FoundingWizardState>(k: K, v: FoundingWizardState[K]) => void
}
export function StepGFAssignment({ state, update }: PropsBase) {
const founders = state.gesellschafter
const toggleGF = (id: string, val: boolean) => {
update('gesellschafter', state.gesellschafter.map(g => g.id === id ? { ...g, is_geschaeftsfuehrer: val } : g))
}
const setRole = (id: string, role: string) => {
update('gesellschafter', state.gesellschafter.map(g => g.id === id ? { ...g, internal_role: role } : g))
}
return (
<div className="space-y-4">
<p className="text-sm text-gray-600">
Wähle, welche Gesellschafter zu Geschäftsführern bestellt werden sollen. Standardmäßig sind alle Gründer auch GF.
</p>
{founders.length === 0 ? (
<p className="text-orange-600">Bitte zuerst Gesellschafter in Step 2 anlegen.</p>
) : (
<table className="w-full text-sm" data-testid="gf-assignment-table">
<thead className="bg-gray-100">
<tr>
<th className="px-3 py-2 text-left">Gesellschafter</th>
<th className="px-3 py-2 text-left">Interne Rolle (CEO, CTO, ...)</th>
<th className="px-3 py-2">GF?</th>
</tr>
</thead>
<tbody>
{founders.map(g => (
<tr key={g.id} className="border-t">
<td className="px-3 py-2 font-medium">{g.name}</td>
<td className="px-3 py-2">
<input
value={g.internal_role || ''}
onChange={e => setRole(g.id, e.target.value)}
className="px-2 py-1 border rounded w-48"
placeholder="CEO, CTO, COO..."
/>
</td>
<td className="px-3 py-2 text-center">
<input
type="checkbox"
data-testid={`gf-toggle-${g.anteil_nr}`}
checked={g.is_geschaeftsfuehrer}
onChange={e => toggleGF(g.id, e.target.checked)}
/>
</td>
</tr>
))}
</tbody>
</table>
)}
</div>
)
}
export function StepCapital({ state, update }: PropsBase) {
const c = state.capital
return (
<div className="space-y-4">
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Stammkapital (EUR)</label>
<input
data-testid="stammkapital"
type="number" min={1} step={1}
value={c.stammkapital_eur}
onChange={e => update('capital', { ...c, stammkapital_eur: parseInt(e.target.value) || 0 })}
className="w-full px-3 py-2 border rounded-lg"
/>
<p className="mt-1 text-xs text-gray-500">GmbH: mind. 25.000 €, UG: ab 1 €</p>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Einlage-Art</label>
<select
data-testid="einlage-method"
value={c.einlage_method}
onChange={e => update('capital', { ...c, einlage_method: e.target.value as typeof c.einlage_method })}
className="w-full px-3 py-2 border rounded-lg"
>
<option value="Geld">Bargründung</option>
<option value="Sacheinlage">Sachgründung</option>
<option value="Geld und Sacheinlage">Misch-Gründung</option>
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">
Sofortige Einzahlung (%)
</label>
<input
data-testid="einlage-quote"
type="number" min={25} max={100}
value={c.einlage_quote_initial_pct}
onChange={e => update('capital', { ...c, einlage_quote_initial_pct: parseInt(e.target.value) || 50 })}
className="w-full px-3 py-2 border rounded-lg"
/>
<p className="mt-1 text-xs text-gray-500">Mind. 25% gem. § 7 Abs. 2 GmbHG, Standard 50%</p>
</div>
<div className="flex items-center gap-2 mt-7">
<input
type="checkbox"
id="has_sach"
data-testid="has-sacheinlage"
checked={c.has_sacheinlage}
onChange={e => update('capital', { ...c, has_sacheinlage: e.target.checked })}
/>
<label htmlFor="has_sach" className="text-sm">Sacheinlage-Klausel aktivieren</label>
</div>
</div>
</div>
)
}
export function StepNotar({ state, update }: PropsBase) {
const n = state.notar
return (
<div className="space-y-4">
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Name des Notars</label>
<input
data-testid="notary-name"
value={n.notary_name}
onChange={e => update('notar', { ...n, notary_name: e.target.value })}
placeholder="z.B. Dr. Müller"
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Notarsitz</label>
<input
data-testid="notary-place"
value={n.notary_place}
onChange={e => update('notar', { ...n, notary_place: e.target.value })}
placeholder="z.B. Stuttgart"
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Adresse</label>
<input
data-testid="notary-address"
value={n.notary_address || ''}
onChange={e => update('notar', { ...n, notary_address: e.target.value })}
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-1">Geplanter Notartermin</label>
<input
data-testid="notarial-date"
type="date"
value={n.notarial_date || ''}
onChange={e => update('notar', { ...n, notarial_date: e.target.value })}
className="w-full px-3 py-2 border rounded-lg"
/>
</div>
</div>
<div className="bg-blue-50 border border-blue-200 rounded-lg p-3 text-sm text-blue-900">
<strong>Hinweis:</strong> Die URNr. wird vom Notar beim Beurkundungstermin vergeben. Du kannst die generierte
HRB-Anmeldung als Vorbereitungsdokument zum Termin mitnehmen.
</div>
</div>
)
}
export function StepSHAConfig({ state, update }: PropsBase) {
const s = state.sha
const updateField = <K extends keyof typeof s>(k: K, v: typeof s[K]) => update('sha', { ...s, [k]: v })
return (
<div className="space-y-4">
<div className="flex items-center gap-2">
<input
type="checkbox"
data-testid="has-sha"
checked={s.has_sha}
onChange={e => updateField('has_sha', e.target.checked)}
/>
<label className="text-sm font-medium">SHA (Shareholders' Agreement) ist Teil des Notartermin-Pakets</label>
</div>
{s.has_sha && (
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm text-gray-700 mb-1">Vesting-Dauer (Monate)</label>
<input data-testid="vesting-months" type="number" value={s.vesting_months}
onChange={e => updateField('vesting_months', parseInt(e.target.value) || 48)}
className="w-full px-3 py-2 border rounded-lg" />
</div>
<div>
<label className="block text-sm text-gray-700 mb-1">Cliff (Monate)</label>
<input data-testid="cliff-months" type="number" value={s.cliff_months}
onChange={e => updateField('cliff_months', parseInt(e.target.value) || 12)}
className="w-full px-3 py-2 border rounded-lg" />
</div>
<div>
<label className="block text-sm text-gray-700 mb-1">Drag-Along Schwelle (%)</label>
<input data-testid="drag-along-pct" type="number" value={s.drag_along_threshold_pct}
onChange={e => updateField('drag_along_threshold_pct', parseInt(e.target.value) || 75)}
className="w-full px-3 py-2 border rounded-lg" />
</div>
<div>
<label className="block text-sm text-gray-700 mb-1">Reserved-Matters Mehrheit (%)</label>
<input data-testid="reserved-matters-pct" type="number" value={s.reserved_matters_majority_pct}
onChange={e => updateField('reserved_matters_majority_pct', parseInt(e.target.value) || 75)}
className="w-full px-3 py-2 border rounded-lg" />
</div>
<div className="col-span-2 grid grid-cols-3 gap-3 mt-2">
<label className="flex items-center gap-2 text-sm">
<input type="checkbox" data-testid="has-beirat" checked={s.has_beirat}
onChange={e => updateField('has_beirat', e.target.checked)} />
Beirat einrichten
</label>
<label className="flex items-center gap-2 text-sm">
<input type="checkbox" data-testid="has-texas" checked={s.has_texas_shootout}
onChange={e => updateField('has_texas_shootout', e.target.checked)} />
Texas Shoot-Out (Deadlock)
</label>
<label className="flex items-center gap-2 text-sm">
<input type="checkbox" data-testid="has-ceo" checked={s.has_ceo_designation}
onChange={e => updateField('has_ceo_designation', e.target.checked)} />
CEO mit Stichentscheid
</label>
</div>
</div>
)}
</div>
)
}
interface GFContractStepProps extends PropsBase {
gf_list: Array<{ id: string; name: string; internal_role?: string }>
upsertGFContract: (c: GFContract) => void
}
export function StepGFContracts({ state, gf_list, upsertGFContract }: GFContractStepProps) {
return (
<div className="space-y-4">
<p className="text-sm text-gray-600">
Für jeden Geschäftsführer wird ein Dienstvertrag generiert. Bitte Eckdaten ausfüllen.
</p>
{gf_list.length === 0 ? (
<p className="text-orange-600">Bitte zuerst in Step 2 mindestens einen GF anlegen.</p>
) : (
gf_list.map(gf => {
const c = state.gf_contracts.find(x => x.gesellschafter_id === gf.id) || {
gesellschafter_id: gf.id,
gross_annual_salary_eur: 84000,
has_bonus: false,
has_company_car: false,
has_bav: false,
vacation_days: 30,
kuendigungsfrist_gesellschaft_monate: 6,
kuendigungsfrist_gf_monate: 3,
para_181_release: true,
sv_status: 'sozialversicherungsfrei' as const,
}
const u = (patch: Partial<GFContract>) => upsertGFContract({ ...c, ...patch })
return (
<div key={gf.id} className="border rounded-lg p-4" data-testid={`contract-${gf.id}`}>
<h4 className="font-semibold mb-3">{gf.name} {gf.internal_role && `(${gf.internal_role})`}</h4>
<div className="grid grid-cols-3 gap-3">
<div>
<label className="block text-xs text-gray-700 mb-1">Jahresgehalt (EUR brutto)</label>
<input
data-testid={`salary-${gf.id}`}
type="number"
value={c.gross_annual_salary_eur}
onChange={e => u({ gross_annual_salary_eur: parseInt(e.target.value) || 0 })}
className="w-full px-2 py-1 border rounded"
/>
</div>
<div>
<label className="block text-xs text-gray-700 mb-1">Urlaubstage</label>
<input type="number" value={c.vacation_days}
onChange={e => u({ vacation_days: parseInt(e.target.value) || 30 })}
className="w-full px-2 py-1 border rounded" />
</div>
<div>
<label className="block text-xs text-gray-700 mb-1">SV-Status</label>
<select value={c.sv_status} onChange={e => u({ sv_status: e.target.value as GFContract['sv_status'] })}
className="w-full px-2 py-1 border rounded">
<option value="sozialversicherungsfrei">sv-frei (Standard für GF/Gesellschafter)</option>
<option value="sozialversicherungspflichtig">sv-pflichtig</option>
<option value="noch zu klären">noch zu klären</option>
</select>
</div>
<label className="flex items-center gap-2 text-sm">
<input type="checkbox" checked={c.para_181_release}
onChange={e => u({ para_181_release: e.target.checked })} />
§ 181 BGB-Befreiung
</label>
<label className="flex items-center gap-2 text-sm">
<input type="checkbox" checked={c.has_bonus}
onChange={e => u({ has_bonus: e.target.checked })} />
Bonus-Vereinbarung
</label>
<label className="flex items-center gap-2 text-sm">
<input type="checkbox" checked={c.has_company_car}
onChange={e => u({ has_company_car: e.target.checked })} />
Firmenfahrzeug
</label>
</div>
</div>
)
})
)}
</div>
)
}
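The numeric inputs in this file use the pattern `parseInt(e.target.value) || fallback` and rely on the HTML `min`/`max` attributes for the allowed range. Those attributes do not stop a typed out-of-range value from reaching `onChange`. A minimal sketch of a stricter parser follows; `parseClampedInt` is a hypothetical helper and not part of this diff.

```typescript
// Hypothetical helper, not part of the diff: parse the text of a numeric
// input, fall back when it is not a number, then clamp into [min, max].
function parseClampedInt(raw: string, fallback: number, min: number, max: number): number {
  const n = Number.parseInt(raw, 10)
  // An empty or non-numeric string yields NaN, so use the fallback instead.
  const value = Number.isNaN(n) ? fallback : n
  return Math.min(max, Math.max(min, value))
}

// Possible use for the "Sofortige Einzahlung (%)" field (range 25..100, default 50):
// onChange={e => update('capital', { ...c, einlage_quote_initial_pct: parseClampedInt(e.target.value, 50, 25, 100) })}
```

Unlike `|| fallback`, this treats a typed `0` as a real number and clamps it, rather than silently replacing it with the default. Clamping on every keystroke can fight the user while typing, so applying it on blur may be the better choice.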
@@ -0,0 +1,187 @@
'use client'
import { useCallback, useEffect, useMemo, useState } from 'react'
import {
defaultFoundingWizardState,
type FoundingWizardState,
type Gesellschafter,
type GFContract,
type GeneratedDocument,
} from '@/lib/sdk/founding/types'
const STORAGE_KEY = 'breakpilot:founding-wizard:state:v1'
export const FOUNDING_WIZARD_STEPS = [
{ id: 1, name: 'Stage & Basics', description: 'Unternehmensname, Sitz, Gegenstand' },
{ id: 2, name: 'Gesellschafter', description: 'Gründer und ihre Anteile' },
{ id: 3, name: 'Geschäftsführer', description: 'GF-Bestellung und Rollen' },
{ id: 4, name: 'Kapital', description: 'Stammkapital und Einzahlung' },
{ id: 5, name: 'Notar', description: 'Notartermin und Beurkundung' },
{ id: 6, name: 'SHA-Optionen', description: 'Vesting, Drag-Along, Reserved Matters' },
{ id: 7, name: 'GF-Verträge', description: 'Vergütung, D&O, Kündigungsfristen' },
{ id: 8, name: 'Dokumente generieren', description: 'Auswahl und Word-Export' },
]
export function useFoundingWizardForm() {
const [state, setState] = useState<FoundingWizardState>(defaultFoundingWizardState)
const [hydrated, setHydrated] = useState(false)
const [generating, setGenerating] = useState(false)
const [error, setError] = useState<string | null>(null)
// Hydrate from localStorage
useEffect(() => {
try {
const raw = localStorage.getItem(STORAGE_KEY)
if (raw) {
const parsed = JSON.parse(raw)
setState({ ...defaultFoundingWizardState(), ...parsed })
}
} catch {
// ignore corrupted storage
}
setHydrated(true)
}, [])
// Persist on every change after hydration
useEffect(() => {
if (!hydrated) return
try {
localStorage.setItem(STORAGE_KEY, JSON.stringify(state))
} catch {
// quota exceeded - ignore
}
}, [state, hydrated])
const update = useCallback(<K extends keyof FoundingWizardState>(
key: K,
value: FoundingWizardState[K] | ((prev: FoundingWizardState[K]) => FoundingWizardState[K])
) => {
setState(prev => ({
...prev,
[key]: typeof value === 'function' ? (value as Function)(prev[key]) : value,
}))
}, [])
const setStep = useCallback((step: number) => {
setState(prev => ({ ...prev, current_step: step }))
}, [])
const nextStep = useCallback(() => {
setState(prev => ({ ...prev, current_step: Math.min(prev.current_step + 1, FOUNDING_WIZARD_STEPS.length) }))
}, [])
const prevStep = useCallback(() => {
setState(prev => ({ ...prev, current_step: Math.max(prev.current_step - 1, 1) }))
}, [])
const reset = useCallback(() => {
setState(defaultFoundingWizardState())
try { localStorage.removeItem(STORAGE_KEY) } catch {}
}, [])
// Gesellschafter helpers
const addGesellschafter = useCallback((gs: Omit<Gesellschafter, 'id' | 'anteil_nr'>) => {
setState(prev => {
const nextNr = (prev.gesellschafter.reduce((m, g) => Math.max(m, g.anteil_nr), 0)) + 1
const id = `gs_${Date.now()}_${nextNr}`
return { ...prev, gesellschafter: [...prev.gesellschafter, { ...gs, id, anteil_nr: nextNr }] }
})
}, [])
const updateGesellschafter = useCallback((id: string, patch: Partial<Gesellschafter>) => {
setState(prev => ({
...prev,
gesellschafter: prev.gesellschafter.map(g => g.id === id ? { ...g, ...patch } : g),
}))
}, [])
const removeGesellschafter = useCallback((id: string) => {
setState(prev => ({
...prev,
gesellschafter: prev.gesellschafter.filter(g => g.id !== id),
gf_contracts: prev.gf_contracts.filter(c => c.gesellschafter_id !== id),
}))
}, [])
// GF Contract helpers
const upsertGFContract = useCallback((contract: GFContract) => {
setState(prev => {
const idx = prev.gf_contracts.findIndex(c => c.gesellschafter_id === contract.gesellschafter_id)
const next = [...prev.gf_contracts]
if (idx >= 0) next[idx] = contract
else next.push(contract)
return { ...prev, gf_contracts: next }
})
}, [])
// Validation (canProceed for current step)
const canProceed = useMemo(() => {
switch (state.current_step) {
case 1:
return state.basics.company_name.trim().length > 1 &&
state.basics.company_seat.trim().length > 1 &&
state.basics.company_purpose_description.trim().length > 10
case 2: {
if (state.gesellschafter.length < 1) return false
const sum = state.gesellschafter.reduce((s, g) => s + (g.nennbetrag_eur || 0), 0)
return sum === state.capital.stammkapital_eur
}
case 3:
return state.gesellschafter.some(g => g.is_geschaeftsfuehrer)
case 4:
return state.capital.stammkapital_eur >= 25000
case 5:
return state.notar.notary_name.trim().length > 1 && state.notar.notary_place.trim().length > 1
case 6:
return true
case 7:
return state.gesellschafter.filter(g => g.is_geschaeftsfuehrer)
.every(g => state.gf_contracts.some(c => c.gesellschafter_id === g.id))
case 8:
return state.selected_documents.length > 0
default:
return false
}
}, [state])
const generateDocuments = useCallback(async (): Promise<GeneratedDocument[]> => {
setGenerating(true)
setError(null)
try {
const response = await fetch('/api/v1/founding-wizard/generate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(state),
})
if (!response.ok) {
throw new Error(`Generierung fehlgeschlagen: ${response.status}`)
}
const data = await response.json()
const docs: GeneratedDocument[] = data.documents || []
setState(prev => ({ ...prev, generated_documents: docs }))
return docs
} catch (e: unknown) {
const msg = e instanceof Error ? e.message : 'Unbekannter Fehler'
setError(msg)
throw e
} finally {
setGenerating(false)
}
}, [state])
// Derived: hat zugehöriger GF einen Vertrag?
const gf_list = useMemo(
() => state.gesellschafter.filter(g => g.is_geschaeftsfuehrer),
[state.gesellschafter]
)
return {
state, hydrated, generating, error,
update, setStep, nextStep, prevStep, reset,
addGesellschafter, updateGesellschafter, removeGesellschafter,
upsertGFContract,
canProceed, generateDocuments,
gf_list,
steps: FOUNDING_WIZARD_STEPS,
}
}
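The step gating in `canProceed` is easier to unit-test when pulled out of the hook as a pure function. The sketch below mirrors the checks for steps 2, 3, 4 and 7 only. `MiniState` and `canProceedSketch` are hypothetical names, and the state shape is a reduced assumption rather than the real `FoundingWizardState`.

```typescript
// Reduced, hypothetical state shape: only the fields these step checks read.
type MiniGesellschafter = { id: string; nennbetrag_eur: number; is_geschaeftsfuehrer: boolean }
type MiniState = {
  current_step: number
  stammkapital_eur: number
  gesellschafter: MiniGesellschafter[]
  gf_contract_ids: string[] // gesellschafter_id of every existing GF contract
}

function canProceedSketch(s: MiniState): boolean {
  switch (s.current_step) {
    case 2: {
      // Shares must add up exactly to the registered capital.
      if (s.gesellschafter.length < 1) return false
      const sum = s.gesellschafter.reduce((acc, g) => acc + (g.nennbetrag_eur || 0), 0)
      return sum === s.stammkapital_eur
    }
    case 3:
      // At least one shareholder must be a managing director.
      return s.gesellschafter.some(g => g.is_geschaeftsfuehrer)
    case 4:
      return s.stammkapital_eur >= 25000
    case 7:
      // Every managing director needs a contract.
      return s.gesellschafter
        .filter(g => g.is_geschaeftsfuehrer)
        .every(g => s.gf_contract_ids.some(id => id === g.id))
    default:
      // Steps not covered by this sketch.
      return false
  }
}
```

Note that the step 4 check requires 25,000 EUR, while the UI hint on the capital step also mentions the UG with a lower minimum. Whether the wizard should branch on the legal form is not visible in this diff.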
@@ -0,0 +1,141 @@
'use client'
import React from 'react'
import { useFoundingWizardForm } from './_hooks/useFoundingWizardForm'
import { StepBasics } from './_components/StepBasics'
import { StepGesellschafter } from './_components/StepGesellschafter'
import { StepCapital, StepGFAssignment, StepGFContracts, StepNotar, StepSHAConfig } from './_components/StepsSimpleConfig'
import { StepGenerate } from './_components/StepGenerate'
export default function FoundingWizardPage() {
const {
state, hydrated, generating, error,
update, nextStep, prevStep, reset,
addGesellschafter, updateGesellschafter, removeGesellschafter,
upsertGFContract,
canProceed, generateDocuments,
gf_list, steps,
} = useFoundingWizardForm()
if (!hydrated) return null
const isLastStep = state.current_step === steps.length
return (
<div className="min-h-screen bg-gray-50 py-8" data-testid="founding-wizard">
<div className="max-w-5xl mx-auto px-4">
{/* Header */}
<div className="mb-8 flex justify-between items-start">
<div>
<h1 className="text-3xl font-bold text-gray-900">Gründungs-Wizard</h1>
<p className="text-gray-600 mt-2">
Erstellt alle Notartermin-Dokumente für Deine GmbH/UG-Gründung in 8 Schritten.
</p>
</div>
<button
data-testid="reset-wizard"
onClick={() => { if (confirm('Wizard-Daten zurücksetzen?')) reset() }}
className="text-sm text-gray-500 hover:text-red-600"
>
Zurücksetzen
</button>
</div>
{/* Progress Steps */}
<div className="mb-8" data-testid="wizard-progress">
<div className="flex items-center justify-between">
{steps.map((step, idx) => (
<React.Fragment key={step.id}>
<button
type="button"
onClick={() => state.current_step > step.id && update('current_step', step.id)}
className="flex items-center"
data-testid={`step-indicator-${step.id}`}
>
<div className={`w-9 h-9 rounded-full flex items-center justify-center text-sm font-medium ${
step.id < state.current_step ? 'bg-purple-600 text-white' :
step.id === state.current_step ? 'bg-purple-100 text-purple-600 border-2 border-purple-600' :
'bg-gray-100 text-gray-400'
}`}>
{step.id < state.current_step ? '✓' : step.id}
</div>
<div className="ml-2 hidden md:block text-left">
<div className={`text-xs font-medium ${step.id <= state.current_step ? 'text-gray-900' : 'text-gray-400'}`}>
{step.name}
</div>
</div>
</button>
{idx < steps.length - 1 && (
<div className={`flex-1 h-0.5 mx-2 ${step.id < state.current_step ? 'bg-purple-600' : 'bg-gray-200'}`} />
)}
</React.Fragment>
))}
</div>
</div>
{/* Step Content */}
<div className="bg-white rounded-xl border border-gray-200 p-8">
<div className="mb-6">
<h2 className="text-xl font-semibold text-gray-900">
{steps[state.current_step - 1]?.name}
</h2>
<p className="text-gray-500 text-sm">{steps[state.current_step - 1]?.description}</p>
</div>
<div data-testid={`step-content-${state.current_step}`}>
{state.current_step === 1 && <StepBasics state={state} update={update} />}
{state.current_step === 2 && (
<StepGesellschafter
state={state}
addGesellschafter={addGesellschafter}
updateGesellschafter={updateGesellschafter}
removeGesellschafter={removeGesellschafter}
/>
)}
{state.current_step === 3 && <StepGFAssignment state={state} update={update} />}
{state.current_step === 4 && <StepCapital state={state} update={update} />}
{state.current_step === 5 && <StepNotar state={state} update={update} />}
{state.current_step === 6 && <StepSHAConfig state={state} update={update} />}
{state.current_step === 7 && (
<StepGFContracts state={state} update={update} gf_list={gf_list} upsertGFContract={upsertGFContract} />
)}
{state.current_step === 8 && (
<StepGenerate
state={state}
update={update}
generating={generating}
error={error}
onGenerate={generateDocuments}
/>
)}
</div>
{/* Navigation */}
{!isLastStep && (
<div className="flex justify-between items-center mt-8 pt-6 border-t border-gray-200">
<button
data-testid="prev-step"
onClick={prevStep}
disabled={state.current_step === 1}
className="px-6 py-3 text-gray-600 hover:text-gray-900 disabled:opacity-50"
>
Zurück
</button>
<span className="text-xs text-gray-400">
Schritt {state.current_step} von {steps.length}
</span>
<button
data-testid="next-step"
onClick={nextStep}
disabled={!canProceed}
className="px-8 py-3 bg-purple-600 text-white rounded-lg hover:bg-purple-700 disabled:opacity-50"
>
Weiter
</button>
</div>
)}
</div>
</div>
</div>
)
}
@@ -39,11 +39,19 @@ export function HazardTable({ hazards, lifecyclePhases, onDelete }: {
.map((hazard) => (
<tr key={hazard.id} className="hover:bg-gray-50 dark:hover:bg-gray-750 transition-colors">
<td className="px-4 py-3">
<div className="flex items-center gap-2">
<div className="flex items-center gap-2 flex-wrap">
<div className="text-sm font-medium text-gray-900 dark:text-white">{hazard.name}</div>
{hazard.name.startsWith('Auto:') && (
<span className="inline-flex items-center px-1.5 py-0.5 rounded text-xs font-medium bg-green-100 text-green-700">Auto</span>
)}
{(hazard as { pattern_id?: string }).pattern_id && (
<span
className="inline-flex items-center px-1.5 py-0.5 rounded text-[10px] font-mono font-medium bg-slate-100 text-slate-700 border border-slate-200 cursor-help"
title={`Quelle: BreakPilot IACE Pattern-Engine (${(hazard as { pattern_id?: string }).pattern_id}). Lizenzregel R3 — Eigenwerk, kein externer Lizenz-Footer noetig. Pattern-Definition mit Norm-Referenzen siehe Library.`}
>
{(hazard as { pattern_id?: string }).pattern_id} · R3
</span>
)}
</div>
{hazard.description && (
<div className="text-xs text-gray-500 truncate max-w-[250px]">{hazard.description}</div>
@@ -0,0 +1,218 @@
'use client'
// LLM Gap-Review Modal — Task #8.
//
// Triggers POST /projects/:id/llm-gap-review on mount and lists the
// LLM's gap suggestions with an Adopt / Reject UX. Adoption goes through
// the regular CreateHazard / CreateMitigation endpoints — the modal
// itself never mutates project state on its own.
import { useEffect, useState } from 'react'
type Suggestion = {
kind: 'hazard' | 'mitigation'
title: string
description: string
category?: string
hazard_ref?: string
pattern_ref?: string
norm_refs?: string[]
confidence?: 'high' | 'medium' | 'low'
rationale?: string
}
type Response = {
project_id: string
source: 'llm_gap_review' | 'fallback_static'
model?: string
suggestions: Suggestion[]
input_summary: {
hazard_count: number
mitigation_count: number
limits_form_fields: number
}
}
const CONF_COLOR: Record<string, string> = {
high: 'bg-emerald-100 text-emerald-800 border-emerald-200',
medium: 'bg-amber-100 text-amber-800 border-amber-200',
low: 'bg-slate-100 text-slate-600 border-slate-200',
}
interface Props {
projectId: string
onClose: () => void
onAdoptHazard?: (s: Suggestion) => Promise<void>
onAdoptMitigation?: (s: Suggestion) => Promise<void>
}
export function LLMGapReviewModal({ projectId, onClose, onAdoptHazard, onAdoptMitigation }: Props) {
const [data, setData] = useState<Response | null>(null)
const [loading, setLoading] = useState(true)
const [error, setError] = useState<string | null>(null)
const [adopted, setAdopted] = useState<Set<number>>(new Set())
const [rejected, setRejected] = useState<Set<number>>(new Set())
const [adopting, setAdopting] = useState<number | null>(null)
useEffect(() => {
setLoading(true)
fetch(`/api/sdk/v1/iace/projects/${projectId}/llm-gap-review`, { method: 'POST' })
.then((r) => (r.ok ? r.json() : Promise.reject(`HTTP ${r.status}`)))
.then(setData)
.catch((e) => setError(String(e)))
.finally(() => setLoading(false))
}, [projectId])
async function adopt(idx: number) {
if (!data) return
const s = data.suggestions[idx]
setAdopting(idx)
try {
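// Without a handler nothing would be created, so do not mark the suggestion as adopted.
const handler = s.kind === 'hazard' ? onAdoptHazard : onAdoptMitigation
if (!handler) {
setError('Adopt nicht moeglich: kein Handler konfiguriert')
return
}
await handler(s)
setAdopted((prev) => new Set(prev).add(idx))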
} catch (e) {
setError(`Adopt fehlgeschlagen: ${e}`)
} finally {
setAdopting(null)
}
}
function reject(idx: number) {
setRejected((prev) => new Set(prev).add(idx))
}
return (
<div className="fixed inset-0 z-50 flex items-center justify-center bg-black/50">
<div className="bg-white rounded-xl shadow-2xl w-full max-w-3xl max-h-[90vh] overflow-hidden flex flex-col">
<div className="px-6 py-4 border-b border-gray-200 flex items-center justify-between flex-shrink-0">
<div>
<h2 className="text-lg font-semibold text-gray-900">KI-Gap-Review</h2>
<p className="text-xs text-gray-500 mt-0.5">
LLM-gestuetzte Suche nach fehlenden Gefaehrdungen und Schutzmassnahmen. Vorschlaege sind unverbindlich bis explizit uebernommen.
</p>
</div>
<button onClick={onClose} className="text-gray-400 hover:text-gray-600 text-2xl leading-none">&times;</button>
</div>
<div className="flex-1 overflow-y-auto p-6 space-y-3">
{loading && (
<div className="text-center py-12">
<div className="animate-spin rounded-full h-10 w-10 border-b-2 border-purple-600 mx-auto" />
<p className="text-sm text-gray-500 mt-3">LLM laeuft (Qwen/Claude). Das kann bis zu 30 Sekunden dauern.</p>
</div>
)}
{error && (
<div className="bg-red-50 border border-red-200 rounded-lg p-4 text-sm text-red-700">
Fehler: {error}
</div>
)}
{data && (
<>
<div className="text-xs text-gray-500 flex items-center gap-3 border-b border-gray-100 pb-2">
<span>
Eingabe: {data.input_summary.hazard_count} Gefaehrdungen,{' '}
{data.input_summary.mitigation_count} Massnahmen, {data.input_summary.limits_form_fields} Grenzen-Felder
</span>
<span className="text-gray-300">·</span>
<span>
Quelle: {data.source === 'llm_gap_review'
? `LLM (${data.model ?? 'unbekannt'})`
: 'Statische Fallback-Liste'}
</span>
</div>
{data.suggestions.length === 0 && (
<div className="text-center text-gray-500 py-12 text-sm">
Keine Lueckenvorschlaege. Die deterministische Pattern-Engine hat vermutlich bereits alle Standard-Gefaehrdungen abgedeckt.
</div>
)}
{data.suggestions.map((s, i) => {
const isAdopted = adopted.has(i)
const isRejected = rejected.has(i)
const isWorking = adopting === i
return (
<div
key={i}
className={`border rounded-lg p-3 ${
isAdopted ? 'border-emerald-200 bg-emerald-50' :
isRejected ? 'border-slate-200 bg-slate-50 opacity-50' :
'border-gray-200 bg-white'
}`}
>
<div className="flex items-start justify-between gap-3">
<div className="flex-1 min-w-0">
<div className="flex items-center gap-2 flex-wrap mb-1">
<span className={`px-1.5 py-0.5 text-[10px] rounded font-medium ${
s.kind === 'hazard' ? 'bg-red-100 text-red-700' : 'bg-blue-100 text-blue-700'
}`}>
{s.kind === 'hazard' ? 'Gefaehrdung' : 'Massnahme'}
</span>
{s.category && (
<span className="px-1.5 py-0.5 text-[10px] rounded bg-gray-100 text-gray-700">{s.category}</span>
)}
{s.confidence && (
<span className={`px-1.5 py-0.5 text-[10px] rounded border ${CONF_COLOR[s.confidence]}`}>
{s.confidence}
</span>
)}
{(s.norm_refs ?? []).map((n) => (
<span key={n} className="px-1.5 py-0.5 text-[10px] rounded bg-indigo-50 text-indigo-700 font-mono">{n}</span>
))}
{s.pattern_ref && (
<span className="px-1.5 py-0.5 text-[10px] rounded bg-purple-50 text-purple-700 font-mono">{s.pattern_ref}</span>
)}
</div>
<h3 className="text-sm font-semibold text-gray-900">{s.title}</h3>
<p className="text-xs text-gray-600 mt-1">{s.description}</p>
{s.hazard_ref && (
<p className="text-[11px] text-gray-500 mt-1">Bezogen auf: <em>{s.hazard_ref}</em></p>
)}
{s.rationale && (
<p className="text-[11px] text-gray-400 mt-1 italic">{s.rationale}</p>
)}
</div>
<div className="flex flex-col gap-1 flex-shrink-0">
{!isAdopted && !isRejected && (
<>
<button
onClick={() => adopt(i)}
disabled={isWorking}
className="px-3 py-1 text-xs bg-emerald-600 text-white rounded hover:bg-emerald-700 disabled:opacity-50"
>
{isWorking ? '…' : 'Uebernehmen'}
</button>
<button
onClick={() => reject(i)}
className="px-3 py-1 text-xs text-gray-600 border border-gray-300 rounded hover:bg-gray-50"
>
Verwerfen
</button>
</>
)}
{isAdopted && <span className="text-xs text-emerald-700 font-medium">Uebernommen</span>}
{isRejected && <span className="text-xs text-gray-500">Verworfen</span>}
</div>
</div>
</div>
)
})}
</>
)}
</div>
<div className="px-6 py-3 border-t border-gray-200 bg-gray-50 flex items-center justify-between flex-shrink-0">
<p className="text-[11px] text-gray-500">
Hinweis: LLM-Vorschlaege sind NICHT die deterministische Engine-Output. Jede Uebernahme wird als <code>source=llm_gap_review</code> markiert.
</p>
<button onClick={onClose} className="px-3 py-1.5 text-sm border border-gray-300 rounded hover:bg-white">
Schliessen
</button>
</div>
</div>
</div>
)
}
export default LLMGapReviewModal
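The modal leaves adoption to the `onAdoptHazard` / `onAdoptMitigation` callbacks, and its footer states that every adoption is marked `source=llm_gap_review`. One way such a callback could build its request body is sketched below. `toAdoptionPayload` and the `AdoptionPayload` field names are hypothetical, since the real CreateHazard / CreateMitigation request shapes are not shown in this diff.

```typescript
// Subset of the modal's Suggestion type that this sketch needs.
type GapSuggestion = {
  kind: 'hazard' | 'mitigation'
  title: string
  description: string
  category?: string
  hazard_ref?: string
}

// Hypothetical payload shape; the field names are assumptions for illustration.
type AdoptionPayload = {
  name: string
  description: string
  category?: string
  hazard_ref?: string
  source: 'llm_gap_review'
}

function toAdoptionPayload(s: GapSuggestion): AdoptionPayload {
  // Always tag the origin so adopted items stay distinguishable from engine output.
  const payload: AdoptionPayload = { name: s.title, description: s.description, source: 'llm_gap_review' }
  if (s.category) payload.category = s.category
  // Only mitigations carry a reference to the hazard they address.
  if (s.kind === 'mitigation' && s.hazard_ref) payload.hazard_ref = s.hazard_ref
  return payload
}
```

Keeping the mapping in a pure function means the `source` marker can be verified in a unit test without rendering the modal.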
@@ -12,6 +12,7 @@ import type { ResidualFilter } from './_components/ResidualRiskPanel'
import { LibraryModal } from './_components/LibraryModal'
import { AutoSuggestPanel } from './_components/AutoSuggestPanel'
import { CustomHazardModal } from './_components/CustomHazardModal'
import { LLMGapReviewModal } from './_components/LLMGapReviewModal'
import { useHazards } from './_hooks/useHazards'
type ViewMode = 'list' | 'risk' | 'blocks'
@@ -22,6 +23,7 @@ export default function HazardsPage() {
const h = useHazards(projectId)
const [view, setView] = useState<ViewMode>('risk')
const [showCustomModal, setShowCustomModal] = useState(false)
const [showGapReview, setShowGapReview] = useState(false)
const [residualFilter, setResidualFilter] = useState<ResidualFilter>('all')
const [decisions, setDecisions] = useState<Record<string, boolean | null>>({})
@@ -104,6 +106,15 @@ export default function HazardsPage() {
</svg>
Eigene Gefaehrdung
</button>
<button
onClick={() => setShowGapReview(true)}
title="LLM (Qwen/Claude) prueft auf fehlende Gefaehrdungen und Massnahmen — Vorschlaege sind unverbindlich."
className="flex items-center gap-2 px-3 py-2 border border-indigo-300 text-indigo-700 rounded-lg hover:bg-indigo-50 transition-colors text-sm">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9.663 17h4.673M12 3v1m6.364 1.636l-.707.707M21 12h-1M4 12H3m3.343-5.657l-.707-.707m2.828 9.9a5 5 0 117.072 0l-.548.547A3.374 3.374 0 0014 18.469V19a2 2 0 11-4 0v-.531c0-.895-.356-1.754-.988-2.386l-.548-.547z" />
</svg>
KI-Gap-Review
</button>
<button onClick={() => h.setShowForm(true)}
className="flex items-center gap-2 px-4 py-2 bg-purple-600 text-white rounded-lg hover:bg-purple-700 transition-colors text-sm">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor">
@@ -170,6 +181,13 @@ export default function HazardsPage() {
onClose={() => setShowCustomModal(false)} />
)}
{showGapReview && (
<LLMGapReviewModal
projectId={projectId}
onClose={() => setShowGapReview(false)}
/>
)}
{h.hazards.length > 0 ? (
view === 'risk' ? (
<>
@@ -9,6 +9,7 @@ import { ObjectivesTab } from './_components/ObjectivesTab'
import { AuditsTab } from './_components/AuditsTab'
import { ReviewsTab } from './_components/ReviewsTab'
import { AssetsTab } from './_components/AssetsTab'
import { LicenseModuleBanner } from '@/components/sdk/LicenseModuleBanner'
// =============================================================================
// MAIN PAGE
@@ -38,6 +39,13 @@ export default function ISMSPage() {
<p className="text-xs text-amber-600 mt-2">
Hinweis: Basierend auf eigenen Pruefaspekten, kein ISO-Normtext. Ersetzt kein Zertifizierungsaudit.
</p>
<div className="mt-3">
<LicenseModuleBanner
rule={3}
sourceLabel="BreakPilot-ISMS-Methodik mit Verweis auf ISO/IEC 27001"
detail="ISO-Normtexte sind copyright-geschuetzt (R3 — nur Identifier-Verweise). Eigene Pruefaspekte sind BreakPilot-Eigenwerk."
/>
</div>
</div>
{/* Tabs */}
@@ -0,0 +1,160 @@
'use client'
import { useEffect, useState } from 'react'
// Stufe 1 of the Attribution Renderer (Task #23): the global
// "Quellen & Lizenzen" overview. Aggregates all 314k canonical_controls
// by their license_rule and shows the source regulations behind each
// bucket. Drives the footer link and gives auditors a one-page view of
// what licence classes the platform is operating under.
type SourceCount = {
regulation_id: string
regulation_name_de: string | null
license_rule: number
license_type: string | null
attribution: string | null
jurisdiction: string | null
source_type: string | null
n_controls: number
}
type RuleBucket = {
rule: number
label_de: string
label_en: string
attribution_required: boolean
render_full_text: boolean
total_controls: number
distinct_sources: number
sources: SourceCount[]
}
type Overview = {
total_controls: number
buckets: RuleBucket[]
}
const RULE_COLOR: Record<number, string> = {
1: 'border-emerald-200 bg-emerald-50',
2: 'border-amber-200 bg-amber-50',
3: 'border-slate-200 bg-slate-50',
}
const RULE_BADGE: Record<number, string> = {
1: 'bg-emerald-600 text-white',
2: 'bg-amber-600 text-white',
3: 'bg-slate-600 text-white',
}
export default function LicensesPage() {
const [data, setData] = useState<Overview | null>(null)
const [error, setError] = useState<string | null>(null)
useEffect(() => {
fetch('/api/sdk/v1/compliance/licenses/overview')
.then((r) => (r.ok ? r.json() : Promise.reject(`HTTP ${r.status}`)))
.then(setData)
.catch((e) => setError(String(e)))
}, [])
if (error) {
return (
<div className="p-6">
<h1 className="text-xl font-semibold mb-2">Quellen &amp; Lizenzen</h1>
<p className="text-red-600">Fehler beim Laden: {error}</p>
</div>
)
}
if (!data) {
return (
<div className="p-6">
<h1 className="text-xl font-semibold">Quellen &amp; Lizenzen</h1>
<p className="text-slate-500 mt-2">Lade …</p>
</div>
)
}
return (
<div className="p-6 max-w-7xl">
<header className="mb-6">
<h1 className="text-2xl font-semibold">Quellen &amp; Lizenzen</h1>
<p className="text-sm text-slate-600 mt-1">
Diese Plattform stützt sich auf {data.total_controls.toLocaleString('de-DE')}{' '}
klassifizierte Compliance-Controls aus den unten genannten Quellen.
Jeder Control trägt eine deterministische Lizenzregel (R1 bis R3), die das
Render-Verhalten in Berichten und im Frontend steuert.
</p>
</header>
<section className="mb-8">
<h2 className="text-lg font-medium mb-3">Klassifizierungs-Schema</h2>
<div className="grid grid-cols-1 md:grid-cols-3 gap-3 text-sm">
{data.buckets.map((b) => (
<div key={b.rule} className={`rounded border ${RULE_COLOR[b.rule] ?? 'border-slate-200'} p-3`}>
<div className="flex items-center gap-2 mb-2">
<span className={`inline-flex items-center justify-center w-7 h-7 rounded-full text-xs font-bold ${RULE_BADGE[b.rule] ?? 'bg-slate-600 text-white'}`}>
R{b.rule}
</span>
<span className="font-medium">{b.label_de}</span>
</div>
<ul className="text-xs text-slate-700 space-y-1">
<li>{b.total_controls.toLocaleString('de-DE')} Controls</li>
<li>{b.distinct_sources} Quellen</li>
<li>{b.render_full_text ? 'Volltext-Anzeige erlaubt' : 'Nur Identifier-Verweis'}</li>
<li>{b.attribution_required ? 'Attribution-Pflicht in Output' : 'keine Attribution-Pflicht'}</li>
</ul>
</div>
))}
</div>
</section>
{data.buckets.map((b) => (
<section key={b.rule} className="mb-8">
<h2 className="text-lg font-medium mb-3 flex items-center gap-2">
<span className={`inline-flex items-center justify-center w-7 h-7 rounded-full text-xs font-bold ${RULE_BADGE[b.rule] ?? 'bg-slate-600 text-white'}`}>
R{b.rule}
</span>
{b.label_de}{' '}
<span className="text-sm text-slate-500 font-normal">
({b.total_controls.toLocaleString('de-DE')} Controls aus {b.distinct_sources} Quellen)
</span>
</h2>
<div className="overflow-x-auto border rounded">
<table className="w-full text-sm">
<thead className="bg-slate-100 text-slate-700">
<tr>
<th className="text-left p-2">Quelle</th>
<th className="text-left p-2">Lizenztyp</th>
<th className="text-left p-2">Rechtsraum</th>
<th className="text-left p-2">Attribution</th>
<th className="text-right p-2">Controls</th>
</tr>
</thead>
<tbody>
{b.sources.map((s) => (
<tr key={`${b.rule}-${s.regulation_id}`} className="border-t">
<td className="p-2">{s.regulation_name_de ?? s.regulation_id}</td>
<td className="p-2 text-slate-600">{s.license_type ?? '—'}</td>
<td className="p-2 text-slate-600">{s.jurisdiction ?? '—'}</td>
<td className="p-2 text-slate-600">{s.attribution ?? '—'}</td>
<td className="p-2 text-right tabular-nums">{s.n_controls.toLocaleString('de-DE')}</td>
</tr>
))}
</tbody>
</table>
</div>
</section>
))}
<footer className="text-xs text-slate-500 border-t pt-4 mt-8">
Klassifizierung: deterministisch über parent_control_uuid-Vererbung,
control_parent_links → regulation_registry, source_citation,
canonical_processed_chunks (Pipeline-Ground-Truth) und LLM-Aggregat-
Identifikation für eigene Werke. Audit-Skripte unter
breakpilot-core/control-pipeline/scripts/.
</footer>
</div>
)
}
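Reviewer note: the page above states that each control carries a licence rule (R1 to R3) that steers rendering. A minimal sketch of how such a rule could map to render behaviour follows. The names `RENDER_POLICY` and `renderControlText` are hypothetical and not part of this diff. In the diff itself the flags `render_full_text` and `attribution_required` arrive per bucket from the API, so the fixed mapping below is an assumption derived from the rule titles (R1 verbatim, R2 verbatim with attribution, R3 identifier only).

```typescript
// Hypothetical helper, not part of the diff: maps a licence rule to render flags.
type LicenseRule = 1 | 2 | 3

interface RenderPolicy {
  renderFullText: boolean
  attributionRequired: boolean
}

// Assumed mapping, based on the rule titles used in the UI.
const RENDER_POLICY: Record<LicenseRule, RenderPolicy> = {
  1: { renderFullText: true, attributionRequired: false },
  2: { renderFullText: true, attributionRequired: true },
  3: { renderFullText: false, attributionRequired: false },
}

function renderControlText(
  rule: LicenseRule,
  identifier: string,
  fullText: string,
  attribution?: string,
): string {
  const policy = RENDER_POLICY[rule]
  // R3: only the identifier may appear in the output.
  if (!policy.renderFullText) return identifier
  if (policy.attributionRequired) {
    // R2: refuse to render verbatim text without its attribution string.
    if (!attribution) throw new Error(`Rule R${rule} requires an attribution string`)
    return `${fullText} (${attribution})`
  }
  // R1: verbatim text, no attribution needed.
  return fullText
}
```

Typing the table as `Record<LicenseRule, RenderPolicy>` makes the compiler reject a missing rule, which a `Record<number, …>` lookup would not.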
@@ -5,6 +5,7 @@ import { SecurityItemCard } from './_components/SecurityItemCard'
import { ItemModal } from './_components/ItemModal'
import { useSecurityBacklog, EMPTY_NEW_ITEM } from './_hooks/useSecurityBacklog'
import type { SecurityItem } from './_hooks/useSecurityBacklog'
import { LicenseModuleBanner } from '@/components/sdk/LicenseModuleBanner'
export default function SecurityBacklogPage() {
const [filter, setFilter] = useState<string>('all')
@@ -37,6 +38,11 @@ export default function SecurityBacklogPage() {
return (
<div className="space-y-6">
<LicenseModuleBanner
rule={2}
sourceLabel="OWASP Top 10 / ASVS / SAMM (CC-BY-SA 4.0) + NIST SP 800-53 (US PD)"
detail="OWASP-Inhalte zitiert mit Pflicht-Attribution 'OWASP Foundation, CC BY-SA 4.0'. NIST woertlich (R1)."
/>
{/* Header */}
<div className="flex items-center justify-between">
<div>
@@ -4,6 +4,7 @@ import React from 'react'
import { useRouter } from 'next/navigation'
import { useTOMGenerator } from '@/lib/sdk/tom-generator'
import { TOM_GENERATOR_STEPS } from '@/lib/sdk/tom-generator/types'
import { LicenseModuleBanner } from '@/components/sdk/LicenseModuleBanner'
/**
* TOM Generator Landing Page
@@ -45,6 +46,14 @@ export default function TOMGeneratorPage() {
</p>
</div>
<div className="mb-6">
<LicenseModuleBanner
rule={1}
sourceLabel="DSGVO Art. 32 (EU 2016/679) — TOM-Anforderungen"
detail="Generator-Logik und Vorlagen sind BreakPilot-Eigenwerk (R3); zitierte Rechtsquelle EU_LAW (R1)."
/>
</div>
{/* Progress Card */}
{hasProgress && (
<div className="bg-white rounded-xl border border-gray-200 p-6 mb-8">
@@ -350,7 +350,12 @@ function ActivityCard({ activity, onEdit, onDelete }: { activity: VVTActivity; o
<span className="px-2 py-0.5 text-xs bg-purple-100 text-purple-700 rounded-full">DSFA</span>
)}
{(activity as any).sourceTemplateId && (
<span className="px-2 py-0.5 text-xs bg-indigo-100 text-indigo-700 rounded-full">Vorlage</span>
<span
className="px-2 py-0.5 text-xs bg-indigo-100 text-indigo-700 rounded-full cursor-help"
title="Erstellt aus Bundeslaender-DSGVO-Vorlage (Art. 30 DSGVO). Lizenzregel R1 — Hoheitsrecht/DE_LAW, woertlich uebernehmbar."
>
Vorlage · R1
</span>
)}
</div>
<h3 className="text-base font-semibold text-gray-900 truncate">{activity.name || '(Ohne Namen)'}</h3>
@@ -195,12 +195,18 @@ export default function CatalogTable({
)}
<td className="px-4 py-2.5">
{entry.source === 'system' ? (
<span className="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300">
System
<span
className="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 cursor-help"
title="System-Katalog — Quellen aus EU-Recht, BAuA, NIST u.a. Lizenzregel je Eintrag (siehe /sdk/licenses)."
>
System · R1/R2/R3
</span>
) : (
<span className="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-blue-100 dark:bg-blue-900/40 text-blue-700 dark:text-blue-300">
Benutzerdefiniert
<span
className="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-blue-100 dark:bg-blue-900/40 text-blue-700 dark:text-blue-300 cursor-help"
title="Benutzerdefinierter Eintrag — BreakPilot/Anwender-Eigenwerk. Lizenzregel R3 (Identifier-Verweis), keine externe Attribution noetig."
>
Benutzerdefiniert · R3
</span>
)}
</td>
@@ -0,0 +1,62 @@
'use client'
// Reusable licence-source banner placed at the top of an SDK module page.
// One-line context that tells the user (and any auditor) which sources
// the module draws on and which BreakPilot licence rule applies.
//
// Usage:
// <LicenseModuleBanner
// rule={1}
// sourceLabel="DSGVO Art. 30 (EU 2016/679)"
// />
//
// For modules that are pure BreakPilot eigenwerk:
// <LicenseModuleBanner rule={3} sourceLabel="BreakPilot-Eigenwerk" />
type Props = {
rule: 1 | 2 | 3
sourceLabel: string
/** Optional extended note shown after sourceLabel */
detail?: string
}
const RULE_META: Record<number, { bg: string; text: string; pill: string; descr: string }> = {
1: {
bg: 'bg-emerald-50 border-emerald-200',
text: 'text-emerald-800',
pill: 'bg-emerald-600 text-white',
descr: 'Hoheitsrecht/Public Domain — woertlich uebernehmbar',
},
2: {
bg: 'bg-amber-50 border-amber-200',
text: 'text-amber-800',
pill: 'bg-amber-600 text-white',
descr: 'Woertlich mit Attribution-Pflicht',
},
3: {
bg: 'bg-slate-50 border-slate-200',
text: 'text-slate-700',
pill: 'bg-slate-600 text-white',
descr: 'Identifier-Verweis / BreakPilot-Eigenwerk',
},
}
export function LicenseModuleBanner({ rule, sourceLabel, detail }: Props) {
const m = RULE_META[rule]
return (
<div className={`px-3 py-2 ${m.bg} border rounded-lg text-xs ${m.text} flex items-start gap-2`}>
<span className={`inline-flex items-center justify-center w-6 h-6 rounded-full text-[10px] font-bold ${m.pill} flex-shrink-0`}>
R{rule}
</span>
<div className="flex-1">
<span className="font-semibold">Quellen &amp; Lizenz:</span>{' '}
<span>{sourceLabel}</span>
<span className="text-slate-500"> {m.descr}.</span>
{detail && <span className="block mt-0.5 text-[11px] opacity-80">{detail}</span>}
<a href="/sdk/licenses" className="underline ml-1">Quellenverzeichnis</a>
</div>
</div>
)
}
export default LicenseModuleBanner
@@ -224,6 +224,19 @@ export function SDKSidebar({ collapsed = false, onCollapsedChange }: SDKSidebarP
<span>Exportieren</span>
</button>
)}
{!collapsed && (
<a
href="/sdk/licenses"
className="mt-2 w-full flex items-center justify-center gap-2 px-4 py-2 text-xs text-gray-500 hover:text-gray-700 hover:bg-gray-100 rounded-lg transition-colors"
title="Quellen und Lizenzen aller verwendeten Compliance-Controls"
>
<svg className="w-3.5 h-3.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12h6m-6 4h6m2 5H7a2 2 0 01-2-2V5a2 2 0 012-2h5.586a1 1 0 01.707.293l5.414 5.414a1 1 0 01.293.707V19a2 2 0 01-2 2z" />
</svg>
<span>Quellen &amp; Lizenzen</span>
</a>
)}
</div>
</aside>
)
@@ -0,0 +1,138 @@
'use client'
import { useEffect, useState } from 'react'
// Stufe 3 of the Attribution Renderer (Task #23): an inline source
// badge that any rendered control/hazard/measure can attach to itself.
//
// Visually a small license-rule pill (R1/R2/R3); on hover/click it
// reveals the underlying regulation, license type, and — for Rule 2 —
// the mandatory attribution string.
//
// Usage:
// <SourceBadge controlUuid={hazard.id} />
//
// The component lazily fetches /licenses/source-info/{uuid} on first
// expand so the surrounding list view stays cheap.
type SourceInfo = {
control_uuid: string
license_rule: number | null
license_label_de: string | null
attribution_required: boolean
render_full_text: boolean
regulation_id: string | null
regulation_name_de: string | null
license_type: string | null
attribution: string | null
source_url: string | null
}
const RULE_BADGE: Record<number, string> = {
1: 'bg-emerald-100 text-emerald-800 border-emerald-300',
2: 'bg-amber-100 text-amber-800 border-amber-300',
3: 'bg-slate-100 text-slate-700 border-slate-300',
}
const RULE_TITLE: Record<number, string> = {
1: 'R1 — wörtlich übernehmbar',
2: 'R2 — wörtlich mit Attribution',
3: 'R3 — nur Identifier zitieren',
}
interface SourceBadgeProps {
controlUuid: string
/** Optional: skip the fetch and render from already-known data. */
prefetched?: SourceInfo
/** Compact mode for tight UI rows (smaller pill). */
compact?: boolean
}
export function SourceBadge({ controlUuid, prefetched, compact }: SourceBadgeProps) {
const [data, setData] = useState<SourceInfo | null>(prefetched ?? null)
const [open, setOpen] = useState(false)
const [loading, setLoading] = useState(false)
const [error, setError] = useState<string | null>(null)
useEffect(() => {
if (!open || data) return
setLoading(true)
setError(null) // clear a stale error before retrying after a reopen
fetch(`/api/sdk/v1/compliance/licenses/source-info/${controlUuid}`)
.then((r) => (r.ok ? r.json() : Promise.reject(`HTTP ${r.status}`)))
.then(setData)
.catch((e) => setError(String(e)))
.finally(() => setLoading(false))
}, [open, data, controlUuid])
const rule = data?.license_rule ?? prefetched?.license_rule ?? null
const badgeClass = rule ? RULE_BADGE[rule] ?? RULE_BADGE[3] : 'bg-slate-100 text-slate-500 border-slate-200'
const sizeClass = compact ? 'text-[10px] px-1.5 py-0.5' : 'text-xs px-2 py-0.5'
return (
<span className="relative inline-block">
<button
type="button"
onClick={() => setOpen((v) => !v)}
className={`inline-flex items-center gap-1 rounded border font-medium ${sizeClass} ${badgeClass} hover:opacity-80 transition`}
title={rule ? RULE_TITLE[rule] : 'Lizenz unbekannt'}
aria-expanded={open}
>
<svg width="10" height="10" viewBox="0 0 16 16" fill="currentColor" aria-hidden>
<path d="M8 0a8 8 0 1 0 0 16A8 8 0 0 0 8 0Zm0 4.5a1 1 0 1 1 0 2 1 1 0 0 1 0-2ZM7 8h2v4.5H7V8Z" />
</svg>
{rule ? `R${rule}` : '?'}
</button>
{open && (
<div className="absolute left-0 mt-1 z-40 w-80 rounded-md border border-slate-200 bg-white shadow-lg p-3 text-xs">
{loading && <p className="text-slate-500">Lade Quellen-Info…</p>}
{error && <p className="text-red-600">Fehler: {error}</p>}
{data && (
<div className="space-y-2">
<div className="font-semibold text-slate-800">
{data.license_label_de ?? 'Lizenz unbekannt'}
</div>
{data.regulation_name_de && (
<div>
<span className="text-slate-500">Quelle:</span>{' '}
<span className="text-slate-800">{data.regulation_name_de}</span>
</div>
)}
{data.license_type && (
<div>
<span className="text-slate-500">Lizenztyp:</span>{' '}
<span className="text-slate-700">{data.license_type}</span>
</div>
)}
{data.attribution && (
<div className="rounded bg-amber-50 border border-amber-200 px-2 py-1.5">
<div className="text-[10px] font-semibold text-amber-800 uppercase tracking-wide">
Attribution-Pflicht
</div>
<div className="text-amber-900">{data.attribution}</div>
</div>
)}
{!data.render_full_text && (
<div className="text-[10px] text-slate-500 italic">
Volltext wird im Output nicht gerendert, nur Identifier-Verweis.
</div>
)}
{data.source_url && (
<a
href={data.source_url}
target="_blank"
rel="noopener noreferrer"
className="inline-block text-[10px] text-blue-600 hover:underline mt-1"
>
Originalquelle öffnen
</a>
)}
</div>
)}
</div>
)}
</span>
)
}
export default SourceBadge
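Reviewer note: `SourceBadge` fetches lazily on first expand, but every badge instance fetches on its own. If several badges for the same control appear in one list, a shared loader could deduplicate the requests. The sketch below shows one way to do that. `createSourceInfoLoader` is a hypothetical name, and the fetcher is injected so the example does not depend on the real endpoint.

```typescript
// Hypothetical request deduplication for lazily loaded source info.
type Fetcher<T> = (uuid: string) => Promise<T>

function createSourceInfoLoader<T>(fetcher: Fetcher<T>): (uuid: string) => Promise<T> {
  // One promise per UUID; concurrent callers share the same request.
  const cache = new Map<string, Promise<T>>()
  return (uuid: string): Promise<T> => {
    const existing = cache.get(uuid)
    if (existing) return existing
    const pending = fetcher(uuid).catch((err) => {
      // Drop failed requests from the cache so a later expand can retry.
      cache.delete(uuid)
      throw err
    })
    cache.set(uuid, pending)
    return pending
  }
}
```

In the component, the `useEffect` body could call such a loader instead of `fetch` directly. Whether that is worth the extra module depends on how many badges a typical list renders.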
@@ -0,0 +1,355 @@
/**
* E2E test for the Founding-Wizard
*
* Checks the complete 8-step flow:
* - application errors / console errors on every page
* - StepBasics: prefill button + register court (Registergericht) / HRB fields
* - StepGesellschafter: role dropdown + IP areas for 2 founders
* - Per-person generation: 2 IP assignment documents
* - localStorage persistence
*
* The backend is mocked via route.fulfill(), so the test is hermetic.
*/
import { test, expect, type Page, type ConsoleMessage } from '@playwright/test'
const BASE = process.env.PLAYWRIGHT_BASE_URL || 'http://localhost:3002'
const WIZARD_PATH = '/sdk/founding-wizard'
/** Filtert Browser-Console auf echte App-Errors (ignoriert Next.js / Hydration / 3rd-party Warnings). */
function isRealAppError(msg: ConsoleMessage): boolean {
if (msg.type() !== 'error') return false
const text = msg.text()
// Bekanntes Rauschen ausschliessen
const ignored = [
'Failed to load resource', // 404 fuer Icons etc.
'Download the React DevTools', // React-Hinweis
'net::ERR_', // Netzwerk (gemockt → erwartete Misses)
'Hydration failed because', // Next 15 Pseudo-Errors bei dev
'[founding-wizard] prefill failed', // Intentional UX-Logging im Prefill-Fehlerpfad
]
return !ignored.some(p => text.includes(p))
}
const IGNORED_PAGE_ERRORS = [
// Hydration mismatches durch dynamische Zeitstempel ("Gerade eben" vs "vor 1 Min")
// im SDK-Header — pure dev-Mode-Symptom, kein App-Bug.
'Hydration failed because the server rendered text didn',
'There was an error while hydrating',
// Next.js dev-mode signals fuer Hydration-Issues
'Text content does not match server-rendered HTML',
]
function isIgnoredPageError(err: Error): boolean {
return IGNORED_PAGE_ERRORS.some(p => err.message.includes(p))
}
/** Setzt Console-Error- und PageError-Listener. Wirft am Ende, wenn welche aufgetreten sind. */
function installErrorTraps(page: Page): { assertNoErrors: () => void } {
const consoleErrors: string[] = []
const pageErrors: string[] = []
page.on('console', msg => {
if (isRealAppError(msg)) consoleErrors.push(msg.text())
})
page.on('pageerror', err => {
if (!isIgnoredPageError(err)) pageErrors.push(`${err.name}: ${err.message}`)
})
return {
assertNoErrors() {
const all = [...pageErrors.map(e => `[pageerror] ${e}`), ...consoleErrors.map(e => `[console.error] ${e}`)]
if (all.length > 0) {
throw new Error(`Application-Errors waehrend des Flows:\n${all.join('\n')}`)
}
},
}
}
/** Mockt die zwei API-Endpoints, die der Wizard aufruft. */
async function mockBackend(page: Page) {
// 1) Company-Profile Prefill
await page.route('**/api/sdk/v1/company-profile**', async route => {
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
profile: {
companyName: 'Breakpilot GmbH',
legalForm: 'GmbH',
industry: ['Software', 'KI/ML'],
businessModel: 'SaaS',
offerings: ['SaaS-Plattform', 'Compliance-API'],
headquartersStreet: 'Königstraße 1',
headquartersZip: '70173',
headquartersCity: 'Stuttgart',
},
}),
})
})
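// 2) Founding-Wizard generate: returns one document per selected type, or one
// per person for the per-person types (IP assignment for every Gesellschafter,
// managing-director contract for every Geschaeftsfuehrer)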
await page.route('**/api/v1/founding-wizard/generate', async route => {
const request = route.request()
const body = JSON.parse(request.postData() || '{}')
const selected: string[] = body.selected_documents || []
const gesellschafter: Array<{ name?: string; is_geschaeftsfuehrer?: boolean }> = body.gesellschafter || []
const PER_PERSON = ['ip_assignment_agreement', 'managing_director_employment_contract']
const docs: unknown[] = []
const tinyDocx = 'UEsDBBQAAAAIAA==' // gueltige base64-Stub (Playwright braucht keinen echten DOCX)
for (const docType of selected) {
if (PER_PERSON.includes(docType)) {
const persons = docType === 'managing_director_employment_contract'
? gesellschafter.filter(g => g.is_geschaeftsfuehrer)
: gesellschafter
for (const p of persons) {
docs.push({
document_type: docType,
title: `${docType} - ${p.name}`,
filename: `${docType}_${(p.name || 'X').replace(/\s/g, '_')}.docx`,
download_url: `data:application/vnd.openxmlformats-officedocument.wordprocessingml.document;base64,${tinyDocx}`,
size_bytes: 12345,
generated_at: '2026-05-21T12:00:00Z',
})
}
} else {
docs.push({
document_type: docType,
title: docType,
filename: `${docType}.docx`,
download_url: `data:application/vnd.openxmlformats-officedocument.wordprocessingml.document;base64,${tinyDocx}`,
size_bytes: 12345,
generated_at: '2026-05-21T12:00:00Z',
})
}
}
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({ documents: docs, warnings: [] }),
})
})
}
/** Clears wizard-state and pre-accepts cookies so the CookieBannerOverlay
* does not intercept clicks during the test. */
async function resetWizardState(page: Page) {
await page.addInitScript(() => {
try {
window.localStorage.removeItem('breakpilot:founding-wizard:state:v1')
// CookieBannerOverlay liest 'bp-sdk-cookie-consent' und blendet sich aus,
// sobald ein Eintrag existiert. Wir setzen Minimal-Consent.
window.localStorage.setItem('bp-sdk-cookie-consent', JSON.stringify({
necessary: true, statistics: false, marketing: false, functional: false,
ewrOnly: false, blockedVendors: [], timestamp: new Date().toISOString(),
}))
} catch {}
})
}
test.describe('Founding-Wizard E2E', () => {
test.beforeEach(async ({ page }) => {
await resetWizardState(page)
await mockBackend(page)
})
test('vollstaendiger 8-Step-Flow ohne Application-Errors', async ({ page }) => {
const errors = installErrorTraps(page)
await page.goto(`${BASE}${WIZARD_PATH}`)
await expect(page.getByTestId('founding-wizard')).toBeVisible()
await expect(page.getByTestId('step-content-1')).toBeVisible()
// --- Step 1: Basics + Prefill ---
await page.getByRole('button', { name: /Aus Unternehmensprofil vorbef/i }).click()
await expect(page.getByTestId('company-name')).toHaveValue('Breakpilot GmbH', { timeout: 5000 })
await expect(page.getByTestId('company-seat')).toHaveValue('Stuttgart')
// Pflichtfeld: company_purpose_description (mind. 10 Zeichen)
await page.getByTestId('company-purpose').fill(
'die Entwicklung, Bereitstellung und der Betrieb von KI-gestuetzten Compliance-Werkzeugen sowie damit verbundener Beratungsleistungen.'
)
// Neue Felder: Registergericht + HRB
await page.getByTestId('register-court').fill('Amtsgericht Stuttgart')
await page.getByTestId('hrb-number').fill('') // noch nicht eingetragen
await page.getByTestId('next-step').click()
// --- Step 2: Gesellschafter ---
await expect(page.getByTestId('step-content-2')).toBeVisible()
// Benjamin (CEO, IP: Compliance + RAG)
await page.getByTestId('gs-name').fill('Benjamin Bönisch')
await page.getByTestId('gs-birthdate').fill('1985-01-15')
await page.getByTestId('gs-address').fill('Teststraße 1, 70173 Stuttgart')
await page.getByTestId('gs-email').fill('benjamin@breakpilot.ai')
await page.getByTestId('gs-nennbetrag').fill('12500')
await page.getByTestId('gs-role').selectOption('CEO')
await page.getByTestId('gs-ip-areas').fill(
'Compliance-Engine (Quellcode + Architektur)\nRAG-Pipeline\nProdukt-Konzepte'
)
await page.getByTestId('add-gesellschafter').click()
await expect(page.getByTestId('gs-row-1')).toBeVisible()
// Sharang (CTO, IP: Security + Infrastruktur)
await page.getByTestId('gs-name').fill('Sharang Parnerkar')
await page.getByTestId('gs-birthdate').fill('1990-06-20')
await page.getByTestId('gs-address').fill('Teststraße 2, 70173 Stuttgart')
await page.getByTestId('gs-email').fill('sharang@breakpilot.ai')
await page.getByTestId('gs-nennbetrag').fill('12500')
await page.getByTestId('gs-role').selectOption('CTO')
await page.getByTestId('gs-ip-areas').fill('Security-Modul\nInfrastructure-as-Code')
await page.getByTestId('add-gesellschafter').click()
await expect(page.getByTestId('gs-row-2')).toBeVisible()
// Summe Nennbetraege muss Stammkapital entsprechen (25.000)
await expect(page.getByTestId('gs-total')).toContainText('25.000')
await page.getByTestId('next-step').click()
// --- Step 3: GF-Assignment (Defaults sind ok, beide bereits GF) ---
await expect(page.getByTestId('step-content-3')).toBeVisible()
await expect(page.getByTestId('gf-assignment-table')).toBeVisible()
await page.getByTestId('next-step').click()
// --- Step 4: Kapital (Defaults: 25000) ---
await expect(page.getByTestId('step-content-4')).toBeVisible()
await expect(page.getByTestId('stammkapital')).toHaveValue('25000')
await page.getByTestId('next-step').click()
// --- Step 5: Notar ---
await expect(page.getByTestId('step-content-5')).toBeVisible()
await page.getByTestId('notary-name').fill('Dr. Max Mustermann')
await page.getByTestId('notary-place').fill('Stuttgart')
await page.getByTestId('notary-address').fill('Königstraße 99, 70173 Stuttgart')
await page.getByTestId('notarial-date').fill('2026-06-15')
await page.getByTestId('next-step').click()
// --- Step 6: SHA-Optionen (Defaults sind ok) ---
await expect(page.getByTestId('step-content-6')).toBeVisible()
await expect(page.getByTestId('has-sha')).toBeChecked()
await page.getByTestId('next-step').click()
// --- Step 7: GF-Vertraege (fuer jeden GF einen) ---
await expect(page.getByTestId('step-content-7')).toBeVisible()
// Beide GF-Contract-Karten muessen sichtbar sein
const contractCards = page.locator('[data-testid^="contract-"]')
await expect(contractCards).toHaveCount(2)
// Salary in beiden Cards anfassen → registriert Contracts (canProceed-Bedingung).
// Wir setzen einen anderen Wert als Default (84000) damit React onChange feuert.
const salaryInputs = page.locator('[data-testid^="salary-"]')
const salaryCount = await salaryInputs.count()
for (let i = 0; i < salaryCount; i++) {
await salaryInputs.nth(i).fill('90000')
}
// Warten bis "Weiter" enabled ist
await expect(page.getByTestId('next-step')).toBeEnabled()
await page.getByTestId('next-step').click()
// --- Step 8: Generate ---
await expect(page.getByTestId('step-content-8')).toBeVisible()
await expect(page.getByTestId('generate-summary')).toContainText('Breakpilot GmbH')
await expect(page.getByTestId('generate-summary')).toContainText('2', { useInnerText: true })
// Notartermin-Bundle auswaehlen
await page.getByTestId('select-notary-bundle').click()
// Generieren (Backend gemockt)
await page.getByTestId('generate-docs').click()
// Generated-Docs-Block muss erscheinen
await expect(page.getByTestId('generated-docs')).toBeVisible({ timeout: 10000 })
// Per-Person Verifikation: zwei IP-Assignment-Downloads erwartet
const ipDownloads = page.locator('[data-testid="download-ip_assignment_agreement"]')
await expect(ipDownloads).toHaveCount(2)
// Per-Person Verifikation: zwei GF-Vertraege erwartet
const gfDownloads = page.locator('[data-testid="download-managing_director_employment_contract"]')
await expect(gfDownloads).toHaveCount(2)
// Kein generate-error sichtbar
await expect(page.getByTestId('generate-error')).toBeHidden()
// Final: keine Errors auf der Konsole
errors.assertNoErrors()
})
test('Prefill-Button setzt Fehler bei Backend-Fehler ohne Application-Error', async ({ page }) => {
// Spezial-Mock: company-profile gibt 500 zurueck
await page.route('**/api/sdk/v1/company-profile**', async route => {
await route.fulfill({ status: 500, body: 'boom' })
})
const errors = installErrorTraps(page)
await page.goto(`${BASE}${WIZARD_PATH}`)
await page.getByRole('button', { name: /Aus Unternehmensprofil vorbef/i }).click()
// UI muss Fehlermeldung anzeigen, NICHT crashen
await expect(page.getByText('Konnte Unternehmensprofil nicht laden')).toBeVisible()
errors.assertNoErrors()
})
test('Step-Navigation: Zurueck und Reset funktionieren ohne Errors', async ({ page }) => {
const errors = installErrorTraps(page)
await page.goto(`${BASE}${WIZARD_PATH}`)
// Minimum Step 1 fuellen
await page.getByTestId('company-name').fill('Breakpilot GmbH')
await page.getByTestId('company-seat').fill('Stuttgart')
await page.getByTestId('company-purpose').fill('die Entwicklung von Compliance-Software fuer Unternehmen.')
await page.getByTestId('next-step').click()
await expect(page.getByTestId('step-content-2')).toBeVisible()
// Zurueck
await page.getByTestId('prev-step').click()
await expect(page.getByTestId('step-content-1')).toBeVisible()
// Eingaben muessen erhalten geblieben sein (localStorage-persistence)
await expect(page.getByTestId('company-name')).toHaveValue('Breakpilot GmbH')
// Reset (mit Dialog-Bestaetigung)
page.once('dialog', dialog => dialog.accept())
await page.getByTestId('reset-wizard').click()
await expect(page.getByTestId('company-name')).toHaveValue('')
errors.assertNoErrors()
})
test('IP-Areas + Rollen-Dropdown in Step 2', async ({ page }) => {
const errors = installErrorTraps(page)
await page.goto(`${BASE}${WIZARD_PATH}`)
// Step 1 zuegig fuellen
await page.getByTestId('company-name').fill('Breakpilot GmbH')
await page.getByTestId('company-seat').fill('Stuttgart')
await page.getByTestId('company-purpose').fill('die Entwicklung von Compliance-Software fuer Unternehmen.')
await page.getByTestId('next-step').click()
// Rollen-Dropdown muss ein <select> sein, nicht <input>
const role = page.getByTestId('gs-role')
await expect(role).toHaveJSProperty('tagName', 'SELECT')
// CEO-Option waehlbar
await page.getByTestId('gs-name').fill('Benjamin Bönisch')
await page.getByTestId('gs-address').fill('Test 1')
await page.getByTestId('gs-nennbetrag').fill('25000')
await role.selectOption('CEO')
await page.getByTestId('gs-ip-areas').fill('Compliance-Engine\nRAG-Pipeline')
await page.getByTestId('add-gesellschafter').click()
// Tabelle muss IP-Bereiche anzeigen
const row = page.getByTestId('gs-row-1')
await expect(row).toContainText('Benjamin Bönisch')
await expect(row).toContainText('CEO')
await expect(row).toContainText('Compliance-Engine')
errors.assertNoErrors()
})
})
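Reviewer note: the per-person expansion inside `mockBackend` decides how many downloads the final assertions expect. The sketch below restates that logic as a pure function so the counts can be checked in isolation. `expandDocuments` is a hypothetical name; the per-person types and the filename pattern are copied from the mock above.

```typescript
// Hypothetical pure restatement of the mock's per-person expansion.
interface Person {
  name?: string
  is_geschaeftsfuehrer?: boolean
}

const PER_PERSON_TYPES: string[] = [
  'ip_assignment_agreement',
  'managing_director_employment_contract',
]

function expandDocuments(selected: string[], persons: Person[]): string[] {
  const filenames: string[] = []
  for (const docType of selected) {
    if (!PER_PERSON_TYPES.includes(docType)) {
      // Regular document types produce exactly one file.
      filenames.push(`${docType}.docx`)
      continue
    }
    // GF contracts go only to managing directors, IP assignments to everyone.
    const recipients =
      docType === 'managing_director_employment_contract'
        ? persons.filter((p) => p.is_geschaeftsfuehrer)
        : persons
    for (const p of recipients) {
      filenames.push(`${docType}_${(p.name || 'X').replace(/\s/g, '_')}.docx`)
    }
  }
  return filenames
}
```

With the eight document types of the notary bundle and two founders who are both managing directors, this yields six regular files plus two IP assignments plus two contracts, which matches the two `toHaveCount(2)` assertions in the test.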
@@ -0,0 +1,123 @@
/**
* Template categorization as a code registry.
*
* Once migrations 137/138 are applied, the DB is the source of truth.
* This registry serves as a fallback and for frontend-only filters
* when the DB fields are not yet available (e.g. a local dev DB without the migration).
*
* Keep in sync with migrations/138_template_backfill_categories.sql.
*/
export type LifecycleStage = 'pre_founding' | 'founding' | 'startup' | 'kmu' | 'konzern'
export type FunctionalCategory =
| 'founding_legal'
| 'employment'
| 'investor_funding'
| 'customer_b2b'
| 'customer_b2c'
| 'data_protection'
| 'it_security'
| 'ai_governance'
| 'internal_policy'
| 'public_facing'
| 'compliance_process'
| 'finance_tax'
| 'vendor_supplier'
export interface TemplateCategorization {
lifecycle_stage: LifecycleStage[]
functional_category: FunctionalCategory
}
export const TEMPLATE_CATEGORIES: Record<string, TemplateCategorization> = {
// Founding Legal
gesellschafterliste: { lifecycle_stage: ['pre_founding', 'founding'], functional_category: 'founding_legal' },
gf_bestellungsbeschluss: { lifecycle_stage: ['founding'], functional_category: 'founding_legal' },
hrb_anmeldung: { lifecycle_stage: ['founding'], functional_category: 'founding_legal' },
ip_assignment_agreement: { lifecycle_stage: ['pre_founding', 'founding', 'startup'], functional_category: 'founding_legal' },
articles_of_association: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'founding_legal' },
sha: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'founding_legal' },
geschaeftsordnung_gf: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'founding_legal' },
// Investor / Funding
term_sheet: { lifecycle_stage: ['pre_founding', 'startup'], functional_category: 'investor_funding' },
convertible_loan_agreement: { lifecycle_stage: ['pre_founding', 'startup'], functional_category: 'investor_funding' },
subscription_agreement: { lifecycle_stage: ['startup', 'kmu'], functional_category: 'investor_funding' },
esop_plan: { lifecycle_stage: ['startup', 'kmu'], functional_category: 'investor_funding' },
cap_table: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'investor_funding' },
// Employment
managing_director_employment_contract: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'employment' },
employment_contract_de: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'employment' },
nda: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'employment' },
offboarding_policy: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'employment' },
// Customer B2B
agb: { lifecycle_stage: ['startup', 'kmu', 'konzern'], functional_category: 'customer_b2b' },
sla: { lifecycle_stage: ['startup', 'kmu', 'konzern'], functional_category: 'customer_b2b' },
dpa: { lifecycle_stage: ['startup', 'kmu', 'konzern'], functional_category: 'customer_b2b' },
data_processing_agreement: { lifecycle_stage: ['startup', 'kmu', 'konzern'], functional_category: 'customer_b2b' },
cloud_service_agreement: { lifecycle_stage: ['startup', 'kmu', 'konzern'], functional_category: 'customer_b2b' },
terms_of_service: { lifecycle_stage: ['startup', 'kmu', 'konzern'], functional_category: 'customer_b2b' },
// Public-facing
impressum: { lifecycle_stage: ['founding', 'startup', 'kmu', 'konzern'], functional_category: 'public_facing' },
// AI Governance
ai_usage_policy: { lifecycle_stage: ['startup', 'kmu', 'konzern'], functional_category: 'ai_governance' },
// Whistleblower nur ab KMU (>=50 MA)
whistleblower_policy: { lifecycle_stage: ['kmu', 'konzern'], functional_category: 'internal_policy' },
}
/**
* Notartermin-Bundle: alle Dokumente die für die Gründung benötigt werden.
* Investor-Dokumente sind separat (term_sheet, convertible_loan_agreement, etc.).
*/
export const NOTARY_BUNDLE_DOCUMENTS: string[] = [
'articles_of_association', // Satzung — notariell beurkundet
'gesellschafterliste', // Pflicht § 40 GmbHG
'gf_bestellungsbeschluss', // Bestellung Geschäftsführer
'hrb_anmeldung', // HRB-Anmeldung
'sha', // optional parallel
'geschaeftsordnung_gf', // intern, nach Notar
'managing_director_employment_contract', // GF-Dienstverträge
'ip_assignment_agreement', // Gründer-IP sichern
]
export function getDocumentsForStage(stage: LifecycleStage): string[] {
return Object.entries(TEMPLATE_CATEGORIES)
.filter(([, cat]) => cat.lifecycle_stage.includes(stage))
.map(([docType]) => docType)
}
export function getDocumentsForCategory(category: FunctionalCategory): string[] {
return Object.entries(TEMPLATE_CATEGORIES)
.filter(([, cat]) => cat.functional_category === category)
.map(([docType]) => docType)
}
export const LIFECYCLE_STAGE_LABELS: Record<LifecycleStage, string> = {
pre_founding: 'Vor-Gründung (Term Sheet, IP-Sicherung)',
founding: 'Gründung (Notar)',
startup: 'Startup (0-3 Jahre, <25 MA)',
kmu: 'KMU (3+ Jahre, 25-250 MA)',
konzern: 'Konzern (250+ MA)',
}
export const FUNCTIONAL_CATEGORY_LABELS: Record<FunctionalCategory, string> = {
founding_legal: 'Gründungsrechtliches',
employment: 'Arbeitsverträge',
investor_funding: 'Investor & Funding',
customer_b2b: 'Kunden-Verträge (B2B)',
customer_b2c: 'Kunden-Verträge (B2C)',
data_protection: 'Datenschutz (DSGVO)',
it_security: 'IT-Sicherheit',
ai_governance: 'KI-Governance',
internal_policy: 'Interne Richtlinien',
public_facing: 'Öffentlich (Website)',
compliance_process:'Compliance-Prozesse',
finance_tax: 'Finanzen & Steuern',
vendor_supplier: 'Lieferanten',
}
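Reviewer note: the registry has to stay in sync with both the SQL backfill and the notary bundle list, so a small consistency check may help. The sketch below uses a reduced sample registry with three entries copied from `TEMPLATE_CATEGORIES`. `documentsForStage` mirrors `getDocumentsForStage` but takes the registry as a parameter, and `findUncategorized` is a hypothetical helper that is not part of this diff.

```typescript
// Reduced, self-contained sample of the registry for illustration only.
type Stage = 'pre_founding' | 'founding' | 'startup' | 'kmu' | 'konzern'

interface Categorization {
  lifecycle_stage: Stage[]
  functional_category: string
}

const SAMPLE_CATEGORIES: Record<string, Categorization> = {
  hrb_anmeldung: { lifecycle_stage: ['founding'], functional_category: 'founding_legal' },
  term_sheet: { lifecycle_stage: ['pre_founding', 'startup'], functional_category: 'investor_funding' },
  whistleblower_policy: { lifecycle_stage: ['kmu', 'konzern'], functional_category: 'internal_policy' },
}

// Same filter as getDocumentsForStage, with the registry passed in.
function documentsForStage(registry: Record<string, Categorization>, stage: Stage): string[] {
  return Object.entries(registry)
    .filter(([, cat]) => cat.lifecycle_stage.includes(stage))
    .map(([docType]) => docType)
}

// Hypothetical check: which bundle entries have no registry entry?
function findUncategorized(bundle: string[], registry: Record<string, Categorization>): string[] {
  return bundle.filter((docType) => !Object.prototype.hasOwnProperty.call(registry, docType))
}
```

Running `findUncategorized(NOTARY_BUNDLE_DOCUMENTS, TEMPLATE_CATEGORIES)` in a unit test and expecting an empty array would catch a bundle entry that was added without a category.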
@@ -0,0 +1,192 @@
/**
* TypeScript data types for the Founding-Wizard.
*
* The wizard inputs are stored in localStorage and, on submit,
* sent to the document-generator API to fill the templates.
*/
import type { LifecycleStage } from './template-categories'
export interface Gesellschafter {
id: string
rolle: 'founder' | 'investor' | 'family' | 'other'
name: string
geburtsdatum?: string // YYYY-MM-DD
adresse: string
email?: string
/** Nennbetrag in EUR, z.B. 25000 */
nennbetrag_eur: number
/** Anteilsnummer beginnend bei 1 */
anteil_nr: number
/** prozentualer Anteil am Stammkapital (computed) */
anteil_pct?: number
is_geschaeftsfuehrer: boolean
/** Bei GF: interne Rolle z.B. CEO/CTO */
internal_role?: string
/** Falls Gründer akademischen Hintergrund hat (Professur etc.) */
has_academic_background?: boolean
/** IP-Bereiche die der Gründer für die GmbH einbringt (z.B. ["Compliance-Engine", "RAG-Pipeline"]) */
ip_areas?: string[]
}
export interface NotarData {
notary_name: string
notary_place: string
notary_address?: string
notary_email?: string
notarial_date?: string // YYYY-MM-DD, geplant
urnr?: string // wird vom Notar vergeben
}
export interface CompanyBasics {
company_name: string
legal_form: 'GmbH' | 'UG'
company_seat: string // e.g. "Bietigheim-Bissingen"
company_address: string
company_purpose_description: string // full text for § 2 of the articles of association
company_purpose_bullets: string[]
industry: string
business_year: string // e.g. "Kalenderjahr"
has_research_focus: boolean
/** Register court (e.g. "Amtsgericht Stuttgart"). Required for the HRB registration. */
register_court?: string
/** HRB number (e.g. "HRB 12345"). Empty if not yet registered. */
hrb_number?: string
}
export interface CapitalConfig {
stammkapital_eur: number // e.g. 25000
einlage_method: 'Geld' | 'Sacheinlage' | 'Geld und Sacheinlage'
einlage_quote_initial_pct: number // e.g. 50 or 100
has_sacheinlage: boolean
}
export interface SHAConfig {
has_sha: boolean
vesting_months: number // default 48
cliff_months: number // default 12
drag_along_threshold_pct: number // default 75
tag_along_threshold_pct: number // default 20
reserved_matters_majority_pct: number // default 75
has_beirat: boolean
has_texas_shootout: boolean
has_ceo_designation: boolean
ceo_name?: string // ref to gesellschafter.name
esop_pool_pct: number // default 0 or 10
}
export interface GFContract {
gesellschafter_id: string // ref to gesellschafter.id
gross_annual_salary_eur: number
has_bonus: boolean
has_company_car: boolean
has_bav: boolean
vacation_days: number // default 30
kuendigungsfrist_gesellschaft_monate: number // default 6
kuendigungsfrist_gf_monate: number // default 3
para_181_release: boolean
sv_status: 'sozialversicherungsfrei' | 'sozialversicherungspflichtig' | 'noch zu klären'
}
/**
* Complete wizard state.
* Filled in step by step, stored in localStorage,
* and sent to /api/v1/founding-wizard/generate on submit.
*/
export interface FoundingWizardState {
/** Current step (1-8) */
current_step: number
/** Selected lifecycle stage (default: founding) */
lifecycle_stage: LifecycleStage
// Step 1: Lifecycle
is_pre_notary: boolean
// Step 2: Basics
basics: CompanyBasics
// Step 3: Shareholders
gesellschafter: Gesellschafter[]
// Step 4: Capital
capital: CapitalConfig
// Step 5: Notary
notar: NotarData
// Step 6: SHA configuration
sha: SHAConfig
// Step 7: Managing-director contracts (one per managing director)
gf_contracts: GFContract[]
// Step 8: Selection of the documents to generate
selected_documents: string[]
/** Output after submit: URL and file name per generated document */
generated_documents?: GeneratedDocument[]
generated_documents?: GeneratedDocument[]
}
export interface GeneratedDocument {
document_type: string
title: string
download_url: string
size_bytes: number
generated_at: string
}
/** Default state for a fresh wizard */
export function defaultFoundingWizardState(): FoundingWizardState {
return {
current_step: 1,
lifecycle_stage: 'founding',
is_pre_notary: true,
basics: {
company_name: '',
legal_form: 'GmbH',
company_seat: '',
company_address: '',
company_purpose_description: '',
company_purpose_bullets: [],
industry: '',
business_year: 'Kalenderjahr',
has_research_focus: false,
register_court: '',
hrb_number: '',
},
gesellschafter: [],
capital: {
stammkapital_eur: 25000,
einlage_method: 'Geld',
einlage_quote_initial_pct: 50,
has_sacheinlage: false,
},
notar: {
notary_name: '',
notary_place: '',
},
sha: {
has_sha: true,
vesting_months: 48,
cliff_months: 12,
drag_along_threshold_pct: 75,
tag_along_threshold_pct: 20,
reserved_matters_majority_pct: 75,
has_beirat: false,
has_texas_shootout: false,
has_ceo_designation: false,
esop_pool_pct: 0,
},
gf_contracts: [],
selected_documents: [
'articles_of_association',
'gesellschafterliste',
'gf_bestellungsbeschluss',
'hrb_anmeldung',
'sha',
'geschaeftsordnung_gf',
'managing_director_employment_contract',
'ip_assignment_agreement',
],
}
}
@@ -0,0 +1,196 @@
/**
* Playwright E2E test: founding wizard with a two-founder GmbH (Benjamin Bönisch + Sharang Parnerkar).
*
* Test flow:
* 1. Open the local dev URL
* 2. Fill in all 8 wizard steps
* 3. Generate the documents (8 for the notary-appointment bundle)
* 4. Validate the Word download links
*
* Prerequisite: `npm run dev` is running on http://localhost:3007
* The backend is reachable (with migrations 137 + 138 and templates 123-136)
*
* Run:
* cd admin-compliance
* npx playwright test tests/playwright/founding-wizard/
*/
import { expect, test } from '@playwright/test'
const BASE_URL = process.env.WIZARD_URL || 'http://localhost:3007/sdk/founding-wizard'
const TEST_DATA = {
basics: {
company_name: 'Breakpilot GmbH',
company_seat: 'Bietigheim-Bissingen',
company_address: 'Hauptstraße 1, 74321 Bietigheim-Bissingen',
industry: 'Software / KI / SaaS',
purpose: 'die Entwicklung, Bereitstellung und der Vertrieb von Softwarelösungen, Plattformen und IT-Dienstleistungen im Bereich der Künstlichen Intelligenz sowie compliance-bezogener Datenverarbeitungssysteme',
bullets: [
'a) Entwicklung, Programmierung und Betrieb von KI-gestützter Compliance-Software',
'b) Bereitstellung von datenschutzkonformen SaaS-Lösungen für Unternehmen',
'c) Beratungs- und Integrationsleistungen im Compliance-Umfeld',
],
},
notar: {
name: 'Dr. Müller',
place: 'Stuttgart',
address: 'Königstraße 1, 70173 Stuttgart',
date: '2026-06-15',
},
gesellschafter: [
{
name: 'Benjamin Bönisch',
birthdate: '1980-03-15',
address: 'Hauptstraße 1, 74321 Bietigheim-Bissingen',
email: 'benjamin@breakpilot.ai',
nennbetrag: 12500,
is_gf: true,
role: 'CEO',
},
{
name: 'Sharang Parnerkar',
birthdate: '1985-09-22',
address: 'Hauptstraße 2, 74321 Bietigheim-Bissingen',
email: 'sharang@breakpilot.ai',
nennbetrag: 12500,
is_gf: true,
role: 'CTO',
},
],
stammkapital: 25000,
}
test.describe('Founding Wizard — 2-Mann GmbH', () => {
test.beforeEach(async ({ page }) => {
// Clear localStorage to start fresh
await page.goto(BASE_URL)
await page.evaluate(() => localStorage.clear())
await page.reload()
})
test('füllt komplette 2-Mann GmbH aus und generiert Notartermin-Bundle', async ({ page }) => {
await page.goto(BASE_URL)
await expect(page.getByTestId('founding-wizard')).toBeVisible()
// STEP 1: Basics
await expect(page.getByTestId('step-content-1')).toBeVisible()
await page.getByTestId('company-name').fill(TEST_DATA.basics.company_name)
await page.getByTestId('legal-form').selectOption('GmbH')
await page.getByTestId('company-seat').fill(TEST_DATA.basics.company_seat)
await page.getByTestId('company-address').fill(TEST_DATA.basics.company_address)
await page.getByTestId('industry').fill(TEST_DATA.basics.industry)
await page.getByTestId('company-purpose').fill(TEST_DATA.basics.purpose)
await page.getByTestId('company-purpose-bullets').fill(TEST_DATA.basics.bullets.join('\n'))
await page.getByTestId('next-step').click()
// STEP 2: Shareholders
await expect(page.getByTestId('step-content-2')).toBeVisible()
for (const gs of TEST_DATA.gesellschafter) {
await page.getByTestId('gs-name').fill(gs.name)
await page.getByTestId('gs-birthdate').fill(gs.birthdate)
await page.getByTestId('gs-address').fill(gs.address)
await page.getByTestId('gs-email').fill(gs.email)
await page.getByTestId('gs-nennbetrag').fill(String(gs.nennbetrag))
await page.getByTestId('gs-role').fill(gs.role)
// is_gf already defaults to true, nothing to do
await page.getByTestId('add-gesellschafter').click()
}
await expect(page.getByTestId('gs-row-1')).toContainText('Benjamin Bönisch')
await expect(page.getByTestId('gs-row-2')).toContainText('Sharang Parnerkar')
await expect(page.getByTestId('gs-total')).toContainText('25.000')
await page.getByTestId('next-step').click()
// STEP 3: GF assignment (both are already managing directors from step 2)
await expect(page.getByTestId('step-content-3')).toBeVisible()
await page.getByTestId('next-step').click()
// STEP 4: Capital
await expect(page.getByTestId('step-content-4')).toBeVisible()
await expect(page.getByTestId('stammkapital')).toHaveValue('25000')
await page.getByTestId('einlage-method').selectOption('Geld')
await page.getByTestId('einlage-quote').fill('50')
await page.getByTestId('next-step').click()
// STEP 5: Notary
await expect(page.getByTestId('step-content-5')).toBeVisible()
await page.getByTestId('notary-name').fill(TEST_DATA.notar.name)
await page.getByTestId('notary-place').fill(TEST_DATA.notar.place)
await page.getByTestId('notary-address').fill(TEST_DATA.notar.address)
await page.getByTestId('notarial-date').fill(TEST_DATA.notar.date)
await page.getByTestId('next-step').click()
// STEP 6: SHA options
await expect(page.getByTestId('step-content-6')).toBeVisible()
await expect(page.getByTestId('has-sha')).toBeChecked()
await expect(page.getByTestId('vesting-months')).toHaveValue('48')
await expect(page.getByTestId('drag-along-pct')).toHaveValue('75')
await page.getByTestId('next-step').click()
// STEP 7: Managing-director contracts (for both founders)
await expect(page.getByTestId('step-content-7')).toBeVisible()
// GF contracts are created with defaults as soon as managing directors are defined.
// This test does not edit the salaries; it only checks that one contract exists per GF.
// toHaveCount retries until the list has rendered, unlike a one-shot count().
const contracts = page.locator('[data-testid^="contract-"]')
await expect(contracts).toHaveCount(2)
await page.getByTestId('next-step').click()
// STEP 8: Generate
await expect(page.getByTestId('step-content-8')).toBeVisible()
await expect(page.getByTestId('generate-summary')).toContainText('Breakpilot GmbH')
await expect(page.getByTestId('generate-summary')).toContainText('Bietigheim-Bissingen')
await expect(page.getByTestId('generate-summary')).toContainText('25.000')
// Select the notary-appointment bundle
await page.getByTestId('select-notary-bundle').click()
// Check that bundle items are selected
await expect(page.getByTestId('doc-articles_of_association')).toBeChecked()
await expect(page.getByTestId('doc-sha')).toBeChecked()
await expect(page.getByTestId('doc-gesellschafterliste')).toBeChecked()
await expect(page.getByTestId('doc-managing_director_employment_contract')).toBeChecked()
// Generate
await page.getByTestId('generate-docs').click()
// Wait for generation (max 30s)
await expect(page.getByTestId('generated-docs')).toBeVisible({ timeout: 30000 })
// At least 8 documents should appear (with 2 founders some may appear twice: GF contract, IP assignment)
const downloadLinks = page.locator('[data-testid^="download-"]')
const linkCount = await downloadLinks.count()
expect(linkCount).toBeGreaterThanOrEqual(8)
// Spot-check the first three links: download URLs must be data: URLs (base64 DOCX)
for (let i = 0; i < Math.min(linkCount, 3); i++) {
const href = await downloadLinks.nth(i).getAttribute('href')
expect(href).toMatch(/^data:application\/vnd\.openxmlformats-officedocument\.wordprocessingml\.document;base64,/)
}
// Screenshot for the test artifact
await page.screenshot({ path: 'test-results/founding-wizard-final.png', fullPage: true })
})
test('zeigt Validierung wenn Pflichtfelder fehlen', async ({ page }) => {
await page.goto(BASE_URL)
// The next button should be disabled while nothing is filled in
await expect(page.getByTestId('next-step')).toBeDisabled()
await page.getByTestId('company-name').fill('Test')
// Still disabled because the purpose is missing
await expect(page.getByTestId('next-step')).toBeDisabled()
await page.getByTestId('company-seat').fill('Stuttgart')
await page.getByTestId('company-purpose').fill('Eine lange genug Beschreibung des Zwecks.')
// Now it should be enabled
await expect(page.getByTestId('next-step')).toBeEnabled()
})
test('Reset löscht alle Daten', async ({ page }) => {
await page.goto(BASE_URL)
await page.getByTestId('company-name').fill('Wird gelöscht GmbH')
page.on('dialog', d => d.accept())
await page.getByTestId('reset-wizard').click()
await expect(page.getByTestId('company-name')).toHaveValue('')
})
})
@@ -0,0 +1,241 @@
// Command iace-audit runs static and runtime audits on the IACE pattern
// engine to find gaps without a ground-truth reference.
//
// Subcommands:
//
// reachability — Method A: which patterns can never fire given the library?
// consistency — Method B: do components cover their TypicalHazardCategories?
// vocabulary — Method C: which limits-form words are unknown to the dict?
// echo — Method D: which limits-form sentences have no hazard echo?
// hierarchy — Method E: which hazards lack design/protection/information?
package main
import (
"encoding/json"
"fmt"
"os"
"github.com/breakpilot/ai-compliance-sdk/internal/iace/audit"
)
func main() {
if len(os.Args) < 2 {
usage()
os.Exit(2)
}
switch os.Args[1] {
case "reachability":
cmdReachability(os.Args[2:])
case "consistency":
cmdConsistency(os.Args[2:])
case "vocabulary":
cmdVocabulary(os.Args[2:])
case "echo":
cmdEcho(os.Args[2:])
case "hierarchy":
cmdHierarchy(os.Args[2:])
default:
usage()
os.Exit(2)
}
}
func usage() {
fmt.Fprintln(os.Stderr, "Usage: iace-audit <reachability|consistency|vocabulary|echo|hierarchy> [args]")
}
func cmdReachability(_ []string) {
r := audit.RunReachability()
printSummary("Method A — Pattern Reachability", map[string]int{
"total": r.TotalPatterns,
"reachable": r.Reachable,
"weakly_reachable": r.WeaklyReachable,
"unreachable": r.Unreachable,
"universe_tags": len(r.UniverseTags),
})
if len(r.UnreachablePatterns) > 0 {
// fmt.Print instead of Println: go vet rejects Println arguments ending in "\n".
fmt.Print("\n## Unreachable patterns (top 30 by priority):\n\n")
printPatternRows(r.UnreachablePatterns, 30)
}
if len(r.WeakPatterns) > 0 {
fmt.Print("\n## Weakly reachable (top 20 by priority):\n\n")
printPatternRows(r.WeakPatterns, 20)
}
writeJSON("audit-reports/reachability.json", r)
}
func cmdConsistency(_ []string) {
r := audit.RunConsistency()
printSummary("Method B — Component Self-Consistency", map[string]int{
"total_components": r.TotalComponents,
"consistent": r.Consistent,
"incomplete": r.Incomplete,
})
if len(r.IncompleteComponents) > 0 {
fmt.Print("\n## Components missing tags for declared hazard categories:\n\n")
for _, c := range r.IncompleteComponents {
fmt.Printf("- %s (%s)\n", c.ComponentID, c.NameDE)
for _, miss := range c.MissingForCategories {
fmt.Printf(" %s: no pattern fires (suggest tags: %s)\n", miss.Category, joinFirst(miss.SuggestedTags, 5))
}
}
}
writeJSON("audit-reports/consistency.json", r)
}
func cmdVocabulary(args []string) {
if len(args) < 1 {
fmt.Fprintln(os.Stderr, "vocabulary: missing path to limits-form JSON")
os.Exit(2)
}
data, err := os.ReadFile(args[0])
must(err)
var form map[string]any
must(json.Unmarshal(data, &form))
r := audit.RunVocabulary(form)
printSummary("Method C — Vocabulary Diff", map[string]int{
"unique_tokens": r.UniqueTokens,
"unknown_tokens": len(r.UnknownTokens),
"unknown_with_pattern_hit": len(r.SuggestedDictionaryEntries),
})
if len(r.SuggestedDictionaryEntries) > 0 {
fmt.Print("\n## Suggested dictionary additions (token appears in pattern scenarios but not in dict):\n\n")
for _, s := range r.SuggestedDictionaryEntries {
fmt.Printf("- '%s' → seen in %d patterns. Examples: %s\n", s.Token, len(s.PatternIDs), joinFirst(s.PatternIDs, 5))
}
}
writeJSON("audit-reports/vocabulary.json", r)
}
func cmdEcho(args []string) {
if len(args) < 2 {
fmt.Fprintln(os.Stderr, "echo: usage: iace-audit echo <limits-form.json> <hazards.json>")
os.Exit(2)
}
limitsData, err := os.ReadFile(args[0])
must(err)
hazardsData, err := os.ReadFile(args[1])
must(err)
var form map[string]any
must(json.Unmarshal(limitsData, &form))
var hwrap struct {
Hazards []map[string]any `json:"hazards"`
}
must(json.Unmarshal(hazardsData, &hwrap))
r := audit.RunEcho(form, hwrap.Hazards)
printSummary("Method D — Limits-Form Echo", map[string]int{
"total_phrases": r.TotalPhrases,
"echoed": r.Echoed,
"orphaned": r.Orphaned,
})
if len(r.OrphanedPhrases) > 0 {
fmt.Print("\n## Orphaned phrases (no hazard echoes them):\n\n")
for _, o := range r.OrphanedPhrases {
fmt.Printf("- [%s] %s\n", o.Field, truncate(o.Phrase, 120))
}
}
writeJSON("audit-reports/echo.json", r)
}
func cmdHierarchy(args []string) {
if len(args) < 2 {
fmt.Fprintln(os.Stderr, "hierarchy: usage: iace-audit hierarchy <hazards.json> <mitigations.json>")
os.Exit(2)
}
hData, err := os.ReadFile(args[0])
must(err)
mData, err := os.ReadFile(args[1])
must(err)
var hwrap struct {
Hazards []map[string]any `json:"hazards"`
}
must(json.Unmarshal(hData, &hwrap))
var mwrap struct {
Mitigations []map[string]any `json:"mitigations"`
}
must(json.Unmarshal(mData, &mwrap))
r := audit.RunHierarchy(hwrap.Hazards, mwrap.Mitigations)
printSummary("Method E — Hierarchy Completeness", map[string]int{
"total_hazards": r.TotalHazards,
"complete": r.Complete,
"missing_design": r.MissingDesign,
"missing_protection": r.MissingProtection,
"missing_info": r.MissingInfo,
})
if len(r.IncompleteHazards) > 0 {
fmt.Print("\n## Hazards with incomplete hierarchy:\n\n")
for _, h := range r.IncompleteHazards {
fmt.Printf("- [%s] %s — missing: %s\n", h.Category, truncate(h.Name, 70), joinFirst(h.MissingLevels, 3))
}
}
writeJSON("audit-reports/hierarchy.json", r)
}
func printSummary(title string, kv map[string]int) {
fmt.Println("=", title, "=")
for k, v := range kv {
fmt.Printf(" %-22s %d\n", k, v)
}
}
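// printPatternRows prints at most limit rows. The parameter is named limit
// so it does not shadow the max builtin (Go 1.21+).
func printPatternRows(rows []audit.ReachabilityResult, limit int) {
if limit > len(rows) {
limit = len(rows)
}
for i := 0; i < limit; i++ {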
r := rows[i]
fmt.Printf("- %s (P%d) %s\n", r.PatternID, r.Priority, truncate(r.Name, 60))
if len(r.UnreachableTags) > 0 {
fmt.Printf(" missing tags: %s\n", joinFirst(r.UnreachableTags, 8))
}
for _, s := range r.FixSuggestions {
fmt.Printf(" fix: %s\n", s)
}
}
}
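// writeJSON writes a report on a best-effort basis: failures are reported as
// warnings on stderr and do not abort the audit.
func writeJSON(path string, v any) {
if err := os.MkdirAll("audit-reports", 0o755); err != nil {
fmt.Fprintln(os.Stderr, "warn: could not create report directory:", err)
return
}
f, err := os.Create(path)
if err != nil {
fmt.Fprintln(os.Stderr, "warn: could not write report:", err)
return
}
defer f.Close()
enc := json.NewEncoder(f)
enc.SetIndent("", " ")
// Only announce the file once it has actually been encoded.
if err := enc.Encode(v); err != nil {
fmt.Fprintln(os.Stderr, "warn: could not encode report:", err)
return
}
fmt.Println("→ wrote", path)
}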
func must(err error) {
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
}
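// truncate shortens s to at most n runes. Slicing by rune rather than by byte
// keeps multi-byte UTF-8 characters (e.g. German umlauts) intact.
func truncate(s string, n int) string {
r := []rune(s)
if len(r) <= n {
return s
}
return string(r[:n]) + "…"
}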
func joinFirst(list []string, n int) string {
if len(list) <= n {
return join(list)
}
return join(list[:n]) + ", …"
}
func join(list []string) string {
out := ""
for i, s := range list {
if i > 0 {
out += ", "
}
out += s
}
return out
}
@@ -0,0 +1,288 @@
package handlers
// LLM Gap-Review handler — Task #7.
//
// After the deterministic Pattern-Engine has generated hazards and
// mitigations for an IACE project, this endpoint asks a configured LLM
// (Qwen / Claude / OpenAI) to spot what the engine MISSED. The LLM is
// fed the Limits-Form, the current hazard list, and a compressed
// pattern catalogue summary; it returns a list of suggested additional
// hazards or mitigations.
//
// Important guardrails:
// - Every suggestion should point to an existing pattern_id or norm
// identifier; suggestions without any reference are not dropped but
// end up with confidence "low" (see filterAndProvenance).
// - The response is provenance-tagged source="llm_gap_review" so
// the frontend renders an Adopt/Reject UX rather than committing.
// - Engine output (deterministic patterns) is never overwritten by
// LLM output; the gap-review is a SUPPLEMENT, not a replacement.
import (
"context"
"encoding/json"
"fmt"
"net/http"
"strings"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
"github.com/breakpilot/ai-compliance-sdk/internal/iace"
"github.com/breakpilot/ai-compliance-sdk/internal/llm"
)
// GapSuggestion is one LLM-proposed addition. Each suggestion is
// non-binding until the user adopts it via the frontend.
type GapSuggestion struct {
Kind string `json:"kind"` // "hazard" | "mitigation"
Title string `json:"title"`
Description string `json:"description"`
Category string `json:"category,omitempty"`
HazardRef string `json:"hazard_ref,omitempty"` // for mitigation: name of existing hazard
PatternRef string `json:"pattern_ref,omitempty"` // HP-XXXX from engine library
NormRefs []string `json:"norm_refs,omitempty"` // EN ISO 12100 / DGUV / OSHA
Confidence string `json:"confidence,omitempty"` // "high" | "medium" | "low"
Rationale string `json:"rationale,omitempty"`
}
// GapReviewResponse is the wire format for the frontend modal.
type GapReviewResponse struct {
ProjectID string `json:"project_id"`
Source string `json:"source"` // "llm_gap_review" | "fallback_static"
Model string `json:"model,omitempty"`
Suggestions []GapSuggestion `json:"suggestions"`
InputSummary struct {
HazardCount int `json:"hazard_count"`
MitigationCount int `json:"mitigation_count"`
LimitsFormFields int `json:"limits_form_fields"`
} `json:"input_summary"`
}
// LLMGapReview handles POST /projects/:id/llm-gap-review.
//
// The endpoint is intentionally read-only: repeated calls do not mutate
// project state, although the LLM suggestions may differ between calls.
// The Adopt step (user-driven) is what changes data, via the existing
// CreateHazard / CreateMitigation handlers.
func (h *IACEHandler) LLMGapReview(c *gin.Context) {
projectID, err := uuid.Parse(c.Param("id"))
if err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid project id"})
return
}
ctx := c.Request.Context()
project, err := h.store.GetProject(ctx, projectID)
if err != nil {
c.JSON(http.StatusNotFound, gin.H{"error": "project not found"})
return
}
hazards, err := h.store.ListHazards(ctx, projectID)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "list hazards: " + err.Error()})
return
}
mitigations, err := h.store.ListMitigationsByProject(ctx, projectID)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "list mitigations: " + err.Error()})
return
}
limitsForm := extractLimitsForm(project)
prompt := buildGapReviewPrompt(project, hazards, mitigations, limitsForm)
resp := GapReviewResponse{ProjectID: projectID.String()}
resp.InputSummary.HazardCount = len(hazards)
resp.InputSummary.MitigationCount = len(mitigations)
resp.InputSummary.LimitsFormFields = countLimitsFields(limitsForm)
suggestions, model, err := callLLMForGapReview(ctx, h.llmRegistry, prompt)
if err != nil {
resp.Source = "fallback_static"
resp.Suggestions = staticFallbackSuggestions(hazards)
c.JSON(http.StatusOK, resp)
return
}
resp.Source = "llm_gap_review"
resp.Model = model
resp.Suggestions = filterAndProvenance(suggestions)
c.JSON(http.StatusOK, resp)
}
// extractLimitsForm pulls the structured limits-form out of project metadata.
func extractLimitsForm(p *iace.Project) map[string]any {
if len(p.Metadata) == 0 {
return nil
}
var md map[string]any
if err := json.Unmarshal(p.Metadata, &md); err != nil {
return nil
}
lf, _ := md["limits_form"].(map[string]any)
return lf
}
func countLimitsFields(lf map[string]any) int {
n := 0
for _, v := range lf {
if s, ok := v.(string); ok && strings.TrimSpace(s) != "" {
n++
} else if arr, ok := v.([]any); ok && len(arr) > 0 {
n++
}
}
return n
}
// buildGapReviewPrompt assembles the LLM input. Kept compact — the LLM
// only needs the limits-form context, the current hazard headlines, and
// a reminder of the pattern-id naming so its suggestions can be linked
// back to engine output later.
func buildGapReviewPrompt(p *iace.Project, hz []iace.Hazard, mt []iace.Mitigation, lf map[string]any) string {
var sb strings.Builder
sb.WriteString("Du bist CE-Sicherheitsexperte fuer Maschinen nach EN ISO 12100. ")
sb.WriteString("Analysiere die folgende Risikobeurteilung und identifiziere FEHLENDE ")
sb.WriteString("Gefaehrdungen oder Schutzmassnahmen, die ein erfahrener Auditor ergaenzen wuerde.\n\n")
sb.WriteString(fmt.Sprintf("Maschine: %s (Typ: %s, Hersteller: %s)\n",
p.MachineName, p.MachineType, p.Manufacturer))
if p.CEMarkingTarget != "" {
sb.WriteString(fmt.Sprintf("CE-Ziel: %s\n", p.CEMarkingTarget))
}
sb.WriteString("\nGrenzen-Form (Limits & Verwendung):\n")
for k, v := range lf {
sb.WriteString(fmt.Sprintf("- %s: %v\n", k, truncForPrompt(v, 200)))
}
sb.WriteString(fmt.Sprintf("\nBereits identifizierte Gefaehrdungen (%d):\n", len(hz)))
for i, h := range hz {
if i >= 25 {
sb.WriteString(fmt.Sprintf("... und %d weitere\n", len(hz)-25))
break
}
sb.WriteString(fmt.Sprintf("- [%s] %s\n", h.Category, h.Name))
}
sb.WriteString(fmt.Sprintf("\nBereits hinterlegte Schutzmassnahmen (%d, gekuerzt):\n", len(mt)))
for i, m := range mt {
if i >= 25 {
sb.WriteString(fmt.Sprintf("... und %d weitere\n", len(mt)-25))
break
}
sb.WriteString(fmt.Sprintf("- [%s] %s\n", m.ReductionType, m.Name))
}
sb.WriteString("\nAufgabe: Liste max. 8 LUECKEN als JSON-Array. Jede Luecke MUSS einer der folgenden Kategorien entsprechen ")
sb.WriteString("und SOLL eine Norm- oder Pattern-Referenz nennen (HP-XXXX, EN ISO 12100, EN 13849, EN 13855, DGUV-Info, OSHA 29 CFR).\n")
sb.WriteString("Kategorien: mechanical_hazard, electrical_hazard, thermal_hazard, noise_vibration, ergonomic, ")
sb.WriteString("material_environmental, pneumatic_hydraulic, radiation_hazard.\n\n")
sb.WriteString(`Antworte NUR mit JSON, keine Erklaerung:
[
{"kind":"hazard","title":"...","description":"...","category":"...","norm_refs":["EN ISO 12100"],"confidence":"high","rationale":"..."},
{"kind":"mitigation","title":"...","description":"...","hazard_ref":"Name der bestehenden Gefahr","norm_refs":["DGUV 209-072"],"confidence":"medium","rationale":"..."}
]`)
return sb.String()
}
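// truncForPrompt renders v and cuts it to at most limit runes, so multi-byte
// UTF-8 characters in the limits-form text are never split.
func truncForPrompt(v any, limit int) string {
r := []rune(fmt.Sprintf("%v", v))
if len(r) <= limit {
return string(r)
}
return string(r[:limit]) + "…"
}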
// callLLMForGapReview sends the prompt and parses the JSON suggestion list.
func callLLMForGapReview(ctx context.Context, registry *llm.ProviderRegistry, prompt string) ([]GapSuggestion, string, error) {
if registry == nil {
return nil, "", fmt.Errorf("no LLM registry configured")
}
provider, err := registry.GetAvailable(ctx)
if err != nil {
return nil, "", fmt.Errorf("no LLM provider available: %w", err)
}
resp, err := provider.Chat(ctx, &llm.ChatRequest{
Messages: []llm.Message{{Role: "user", Content: prompt}},
Temperature: 0.25,
MaxTokens: 2000,
})
if err != nil {
return nil, "", fmt.Errorf("llm chat: %w", err)
}
body := strings.TrimSpace(resp.Message.Content)
// LLMs occasionally wrap JSON in ```json … ``` fences; strip them.
body = strings.TrimPrefix(body, "```json")
body = strings.TrimPrefix(body, "```")
body = strings.TrimSuffix(body, "```")
body = strings.TrimSpace(body)
// Find first '[' so any leading prose is ignored.
if i := strings.Index(body, "["); i > 0 {
body = body[i:]
}
var out []GapSuggestion
if err := json.Unmarshal([]byte(body), &out); err != nil {
return nil, "", fmt.Errorf("parse llm response: %w (body=%.200s)", err, body)
}
return out, provider.Name(), nil
}
// filterAndProvenance drops obviously malformed suggestions (empty title
// or kind) and normalises `confidence`: suggestions without any norm or
// pattern reference are always demoted to "low", and referenced
// suggestions without an explicit confidence default to "medium".
func filterAndProvenance(in []GapSuggestion) []GapSuggestion {
out := make([]GapSuggestion, 0, len(in))
for _, s := range in {
if strings.TrimSpace(s.Title) == "" || s.Kind == "" {
continue
}
hasRef := len(s.NormRefs) > 0 || s.PatternRef != ""
switch {
case !hasRef:
// Demote even if the LLM claimed a higher confidence.
s.Confidence = "low"
case s.Confidence == "":
s.Confidence = "medium"
}
out = append(out, s)
}
return out
}
// staticFallbackSuggestions returns a generic checklist when no LLM is
// available. Conservative, all confidence="low".
func staticFallbackSuggestions(hz []iace.Hazard) []GapSuggestion {
hasMechanical := false
for _, h := range hz {
if strings.Contains(h.Category, "mechanical") {
hasMechanical = true
break
}
}
out := []GapSuggestion{
{
Kind: "hazard", Title: "Fuss-Quetschung unter absenkendem Werkstueck/Hubeinheit",
Description: "Wenn die Maschine eine Hubbewegung ausfuehrt, pruefe ob Fuesse/Beine im Verfahrbereich gequetscht werden koennen.",
Category: "mechanical_hazard", NormRefs: []string{"EN ISO 12100 6.3.5.5"},
Confidence: "low", Rationale: "Static checklist fallback — LLM nicht verfuegbar.",
},
{
Kind: "hazard", Title: "Hand-Quetschung gegen feste Strukturen beim Hochfahren",
Description: "Pruefe Mindestabstand zu festen Strukturen oberhalb der hoechsten Hubposition.",
Category: "mechanical_hazard", NormRefs: []string{"EN ISO 13854"},
Confidence: "low",
},
{
Kind: "mitigation", Title: "Kriechgeschwindigkeit am Endanschlag (Hubgeraete)",
Description: "Hubgeschwindigkeit am Ende der Verfahrbewegung auf <=15 mm/s reduzieren.",
NormRefs: []string{"OSHA 29 CFR 1910.217 (Hand-Speed-Konstante)"},
Confidence: "low",
},
}
if !hasMechanical {
// Trim if not a mechanical context
out = out[:1]
}
return out
}
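Note on the response parsing in `callLLMForGapReview`: the fence stripping only helps when the reply starts or ends with a fence, and the `[` search only skips leading prose. Any text after the array (for example a closing remark after the fence) still reaches `json.Unmarshal` and fails the parse, which silently routes the request to the static fallback. A minimal sketch of a more tolerant extraction, assuming the reply contains exactly one top-level JSON array and no stray brackets in the surrounding prose; `extractJSONArray` and the reduced `suggestion` type are hypothetical names for illustration, not part of this change:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// suggestion mirrors only the two GapSuggestion fields this sketch needs.
type suggestion struct {
	Kind  string `json:"kind"`
	Title string `json:"title"`
}

// extractJSONArray returns the substring from the first '[' to the last ']',
// dropping code fences and prose on either side of the array.
func extractJSONArray(body string) (string, error) {
	start := strings.Index(body, "[")
	end := strings.LastIndex(body, "]")
	if start < 0 || end < start {
		return "", fmt.Errorf("no JSON array found in response")
	}
	return body[start : end+1], nil
}

func main() {
	// A chatty reply: leading prose, a fenced array, trailing prose.
	fence := strings.Repeat("`", 3)
	raw := "Hier ist die Liste:\n" + fence + "json\n" +
		`[{"kind":"hazard","title":"Quetschgefahr"}]` +
		"\n" + fence + "\nViel Erfolg!"

	arr, err := extractJSONArray(raw)
	if err != nil {
		fmt.Println("extract:", err)
		return
	}
	var out []suggestion
	if err := json.Unmarshal([]byte(arr), &out); err != nil {
		fmt.Println("parse:", err)
		return
	}
	fmt.Println(len(out), out[0].Title) // prints "1 Quetschgefahr"
}
```

The bracket search is deliberately simple. If a model is known to put brackets into its prose, a streaming `json.Decoder` started at the first `[` would be the more robust choice.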
@@ -355,117 +355,6 @@ func registerWhistleblowerRoutes(v1 *gin.RouterGroup, h *handlers.WhistleblowerH
}
}
func registerIACERoutes(v1 *gin.RouterGroup, h *handlers.IACEHandler) {
iaceRoutes := v1.Group("/iace")
{
iaceRoutes.GET("/hazard-library", h.ListHazardLibrary)
iaceRoutes.GET("/controls-library", h.ListControlsLibrary)
iaceRoutes.GET("/norms-library", h.ListNormsLibrary)
iaceRoutes.GET("/lifecycle-phases", h.ListLifecyclePhases)
iaceRoutes.GET("/roles", h.ListRoles)
iaceRoutes.GET("/evidence-types", h.ListEvidenceTypes)
iaceRoutes.GET("/protective-measures-library", h.ListProtectiveMeasures)
iaceRoutes.GET("/failure-modes", h.ListFailureModes)
iaceRoutes.GET("/operational-states", h.ListOperationalStates)
iaceRoutes.GET("/component-library", h.ListComponentLibrary)
iaceRoutes.GET("/energy-sources", h.ListEnergySources)
iaceRoutes.GET("/tags", h.ListTags)
iaceRoutes.GET("/hazard-patterns", h.ListHazardPatterns)
iaceRoutes.POST("/projects", h.CreateProject)
iaceRoutes.GET("/projects", h.ListProjects)
iaceRoutes.GET("/projects/:id", h.GetProject)
iaceRoutes.PUT("/projects/:id", h.UpdateProject)
iaceRoutes.DELETE("/projects/:id", h.ArchiveProject)
iaceRoutes.POST("/projects/:id/init-from-profile", h.InitFromProfile)
iaceRoutes.POST("/projects/:id/variants", h.CreateVariant)
iaceRoutes.GET("/projects/:id/variants", h.ListVariants)
iaceRoutes.GET("/projects/:id/variant-gap", h.GetVariantGap)
iaceRoutes.POST("/projects/:id/completeness-check", h.CheckCompleteness)
iaceRoutes.POST("/projects/:id/components", h.CreateComponent)
iaceRoutes.GET("/projects/:id/components", h.ListComponents)
iaceRoutes.PUT("/projects/:id/components/:cid", h.UpdateComponent)
iaceRoutes.DELETE("/projects/:id/components/:cid", h.DeleteComponent)
iaceRoutes.POST("/projects/:id/classify", h.Classify)
iaceRoutes.GET("/projects/:id/classifications", h.GetClassifications)
iaceRoutes.POST("/projects/:id/classify/:regulation", h.ClassifySingle)
iaceRoutes.POST("/projects/:id/hazards", h.CreateHazard)
iaceRoutes.GET("/projects/:id/hazards", h.ListHazards)
iaceRoutes.PUT("/projects/:id/hazards/:hid", h.UpdateHazard)
iaceRoutes.POST("/projects/:id/hazards/suggest", h.SuggestHazards)
iaceRoutes.POST("/projects/:id/match-patterns", h.MatchPatterns)
iaceRoutes.POST("/projects/:id/parse-narrative", h.ParseNarrative)
iaceRoutes.POST("/projects/:id/delta-analysis", h.DeltaAnalysis)
iaceRoutes.GET("/projects/:id/fmea/export", h.ExportFMEA)
iaceRoutes.POST("/projects/:id/components/:cid/suggest-fms", h.SuggestFailureModes)
iaceRoutes.POST("/projects/:id/apply-patterns", h.ApplyPatternResults)
iaceRoutes.POST("/projects/:id/hazards/:hid/suggest-measures", h.SuggestMeasuresForHazard)
iaceRoutes.POST("/projects/:id/mitigations/:mid/suggest-evidence", h.SuggestEvidenceForMitigation)
iaceRoutes.POST("/projects/:id/hazards/:hid/assess", h.AssessRisk)
iaceRoutes.GET("/projects/:id/risk-summary", h.GetRiskSummary)
iaceRoutes.GET("/projects/:id/suggested-norms", h.SuggestProjectNorms)
iaceRoutes.POST("/projects/:id/hazards/:hid/reassess", h.ReassessRisk)
iaceRoutes.GET("/projects/:id/mitigations", h.ListProjectMitigations)
iaceRoutes.POST("/projects/:id/hazards/:hid/mitigations", h.CreateMitigation)
iaceRoutes.DELETE("/projects/:id/mitigations/:mid", h.DeleteMitigation)
iaceRoutes.PUT("/mitigations/:mid", h.UpdateMitigation)
iaceRoutes.POST("/mitigations/:mid/verify", h.VerifyMitigation)
iaceRoutes.POST("/projects/:id/validate-mitigation-hierarchy", h.ValidateMitigationHierarchy)
iaceRoutes.POST("/projects/:id/evidence", h.UploadEvidence)
iaceRoutes.GET("/projects/:id/evidence", h.ListEvidence)
iaceRoutes.POST("/projects/:id/verification-plan", h.CreateVerificationPlan)
iaceRoutes.PUT("/verification-plan/:vid", h.UpdateVerificationPlan)
iaceRoutes.POST("/verification-plan/:vid/complete", h.CompleteVerification)
iaceRoutes.GET("/projects/:id/verifications", h.ListVerificationPlans)
iaceRoutes.POST("/projects/:id/verifications", h.CreateVerificationAlias)
iaceRoutes.DELETE("/projects/:id/verifications/:vid", h.DeleteVerificationPlan)
iaceRoutes.POST("/projects/:id/verifications/:vid/complete", h.CompleteVerificationAlias)
iaceRoutes.POST("/projects/:id/tech-file/generate", h.GenerateTechFile)
iaceRoutes.GET("/projects/:id/tech-file", h.ListTechFileSections)
iaceRoutes.PUT("/projects/:id/tech-file/:section", h.UpdateTechFileSection)
iaceRoutes.POST("/projects/:id/tech-file/:section/approve", h.ApproveTechFileSection)
iaceRoutes.POST("/projects/:id/tech-file/:section/generate", h.GenerateSingleSection)
iaceRoutes.GET("/projects/:id/tech-file/export", h.ExportTechFile)
iaceRoutes.POST("/projects/:id/monitoring", h.CreateMonitoringEvent)
iaceRoutes.GET("/projects/:id/monitoring", h.ListMonitoringEvents)
iaceRoutes.PUT("/projects/:id/monitoring/:eid", h.UpdateMonitoringEvent)
iaceRoutes.GET("/projects/:id/audit-trail", h.GetAuditTrail)
iaceRoutes.POST("/library-search", h.SearchLibrary)
iaceRoutes.GET("/ce-corpus-documents", h.ListCECorpusDocuments)
iaceRoutes.POST("/projects/:id/initialize", h.InitializeProject)
iaceRoutes.GET("/projects/:id/hazard-blocks", h.GetHazardBlocks)
iaceRoutes.POST("/projects/:id/benchmark/import-gt", h.ImportGroundTruth)
iaceRoutes.GET("/projects/:id/benchmark", h.RunBenchmark)
iaceRoutes.GET("/projects/:id/benchmark/summary", h.GetBenchmarkSummary)
iaceRoutes.GET("/projects/:id/hazards/:hid/regulatory-hints", h.EnrichHazardWithRegulations)
iaceRoutes.GET("/projects/:id/mitigations/:mid/regulatory-hints", h.EnrichMitigationWithRegulations)
iaceRoutes.GET("/projects/:id/regulatory-hints", h.EnrichProjectHazardsBatch)
iaceRoutes.POST("/projects/:id/tech-file/:section/enrich", h.EnrichTechFileSection)
// Production Lines
iaceRoutes.POST("/production-lines", h.CreateProductionLine)
iaceRoutes.GET("/production-lines", h.ListProductionLines)
iaceRoutes.GET("/production-lines/:lid/dashboard", h.GetProductionLineDashboard)
iaceRoutes.POST("/production-lines/:lid/stations", h.AddStationToLine)
iaceRoutes.DELETE("/production-lines/:lid/stations/:sid", h.RemoveStationFromLine)
// CE x Compliance Crossover
iaceRoutes.GET("/projects/:id/compliance-triggers", h.GetComplianceTriggers)
iaceRoutes.GET("/compliance-faq", h.GetComplianceFAQ)
// Clarifications — aggregated open questions per project
iaceRoutes.GET("/projects/:id/clarifications", h.ListClarifications)
iaceRoutes.GET("/projects/:id/clarifications.csv", h.ExportClarificationsCSV)
iaceRoutes.GET("/projects/:id/clarifications.html", h.ExportClarificationsHTML)
iaceRoutes.GET("/projects/:id/clarifications/:cid/detail", h.ListClarificationDetail)
iaceRoutes.POST("/projects/:id/clarifications/:cid/answer", h.AnswerClarification)
iaceRoutes.POST("/projects/:id/clarifications/:cid/comment", h.PostClarificationComment)
// Customer-Standard Reuse (migration 031): pull reusable mitigations
// across prior projects of the same customer.
iaceRoutes.GET("/projects/:id/customer-standards", h.ListCustomerStandardSuggestions)
iaceRoutes.POST("/projects/:id/customer-standards/import", h.ImportCustomerStandardSuggestion)
}
}
func registerMaximizerRoutes(v1 *gin.RouterGroup, h *handlers.MaximizerHandlers) {
m := v1.Group("/maximizer")
@@ -0,0 +1,136 @@
package app
// IACE route registration extracted from routes.go (2026-05-21) because
// routes.go hit the 500-LOC hard cap when the LLM gap-review endpoint
// (Task #7) was added. Splitting keeps every routes file under the cap
// without changing behaviour — `registerRoutes` in routes.go still
// invokes `registerIACERoutes` exactly once at the same point in the
// startup sequence.
import (
"github.com/breakpilot/ai-compliance-sdk/internal/api/handlers"
"github.com/gin-gonic/gin"
)
func registerIACERoutes(v1 *gin.RouterGroup, h *handlers.IACEHandler) {
iaceRoutes := v1.Group("/iace")
{
// Library catalogues (read-only reference data).
iaceRoutes.GET("/hazard-library", h.ListHazardLibrary)
iaceRoutes.GET("/controls-library", h.ListControlsLibrary)
iaceRoutes.GET("/norms-library", h.ListNormsLibrary)
iaceRoutes.GET("/lifecycle-phases", h.ListLifecyclePhases)
iaceRoutes.GET("/roles", h.ListRoles)
iaceRoutes.GET("/evidence-types", h.ListEvidenceTypes)
iaceRoutes.GET("/protective-measures-library", h.ListProtectiveMeasures)
iaceRoutes.GET("/failure-modes", h.ListFailureModes)
iaceRoutes.GET("/operational-states", h.ListOperationalStates)
iaceRoutes.GET("/component-library", h.ListComponentLibrary)
iaceRoutes.GET("/energy-sources", h.ListEnergySources)
iaceRoutes.GET("/tags", h.ListTags)
iaceRoutes.GET("/hazard-patterns", h.ListHazardPatterns)
// Project CRUD.
iaceRoutes.POST("/projects", h.CreateProject)
iaceRoutes.GET("/projects", h.ListProjects)
iaceRoutes.GET("/projects/:id", h.GetProject)
iaceRoutes.PUT("/projects/:id", h.UpdateProject)
iaceRoutes.DELETE("/projects/:id", h.ArchiveProject)
iaceRoutes.POST("/projects/:id/init-from-profile", h.InitFromProfile)
iaceRoutes.POST("/projects/:id/variants", h.CreateVariant)
iaceRoutes.GET("/projects/:id/variants", h.ListVariants)
iaceRoutes.GET("/projects/:id/variant-gap", h.GetVariantGap)
iaceRoutes.POST("/projects/:id/completeness-check", h.CheckCompleteness)
// Components.
iaceRoutes.POST("/projects/:id/components", h.CreateComponent)
iaceRoutes.GET("/projects/:id/components", h.ListComponents)
iaceRoutes.PUT("/projects/:id/components/:cid", h.UpdateComponent)
iaceRoutes.DELETE("/projects/:id/components/:cid", h.DeleteComponent)
// Classification + hazards.
iaceRoutes.POST("/projects/:id/classify", h.Classify)
iaceRoutes.GET("/projects/:id/classifications", h.GetClassifications)
iaceRoutes.POST("/projects/:id/classify/:regulation", h.ClassifySingle)
iaceRoutes.POST("/projects/:id/hazards", h.CreateHazard)
iaceRoutes.GET("/projects/:id/hazards", h.ListHazards)
iaceRoutes.PUT("/projects/:id/hazards/:hid", h.UpdateHazard)
iaceRoutes.POST("/projects/:id/hazards/suggest", h.SuggestHazards)
iaceRoutes.POST("/projects/:id/match-patterns", h.MatchPatterns)
iaceRoutes.POST("/projects/:id/parse-narrative", h.ParseNarrative)
iaceRoutes.POST("/projects/:id/delta-analysis", h.DeltaAnalysis)
iaceRoutes.POST("/projects/:id/llm-gap-review", h.LLMGapReview)
iaceRoutes.GET("/projects/:id/fmea/export", h.ExportFMEA)
iaceRoutes.POST("/projects/:id/components/:cid/suggest-fms", h.SuggestFailureModes)
iaceRoutes.POST("/projects/:id/apply-patterns", h.ApplyPatternResults)
iaceRoutes.POST("/projects/:id/hazards/:hid/suggest-measures", h.SuggestMeasuresForHazard)
iaceRoutes.POST("/projects/:id/mitigations/:mid/suggest-evidence", h.SuggestEvidenceForMitigation)
iaceRoutes.POST("/projects/:id/hazards/:hid/assess", h.AssessRisk)
iaceRoutes.GET("/projects/:id/risk-summary", h.GetRiskSummary)
iaceRoutes.GET("/projects/:id/suggested-norms", h.SuggestProjectNorms)
iaceRoutes.POST("/projects/:id/hazards/:hid/reassess", h.ReassessRisk)
// Mitigations + evidence + verification.
iaceRoutes.GET("/projects/:id/mitigations", h.ListProjectMitigations)
iaceRoutes.POST("/projects/:id/hazards/:hid/mitigations", h.CreateMitigation)
iaceRoutes.DELETE("/projects/:id/mitigations/:mid", h.DeleteMitigation)
iaceRoutes.PUT("/mitigations/:mid", h.UpdateMitigation)
iaceRoutes.POST("/mitigations/:mid/verify", h.VerifyMitigation)
iaceRoutes.POST("/projects/:id/validate-mitigation-hierarchy", h.ValidateMitigationHierarchy)
iaceRoutes.POST("/projects/:id/evidence", h.UploadEvidence)
iaceRoutes.GET("/projects/:id/evidence", h.ListEvidence)
iaceRoutes.POST("/projects/:id/verification-plan", h.CreateVerificationPlan)
iaceRoutes.PUT("/verification-plan/:vid", h.UpdateVerificationPlan)
iaceRoutes.POST("/verification-plan/:vid/complete", h.CompleteVerification)
iaceRoutes.GET("/projects/:id/verifications", h.ListVerificationPlans)
iaceRoutes.POST("/projects/:id/verifications", h.CreateVerificationAlias)
iaceRoutes.DELETE("/projects/:id/verifications/:vid", h.DeleteVerificationPlan)
iaceRoutes.POST("/projects/:id/verifications/:vid/complete", h.CompleteVerificationAlias)
// Tech file + monitoring + audit.
iaceRoutes.POST("/projects/:id/tech-file/generate", h.GenerateTechFile)
iaceRoutes.GET("/projects/:id/tech-file", h.ListTechFileSections)
iaceRoutes.PUT("/projects/:id/tech-file/:section", h.UpdateTechFileSection)
iaceRoutes.POST("/projects/:id/tech-file/:section/approve", h.ApproveTechFileSection)
iaceRoutes.POST("/projects/:id/tech-file/:section/generate", h.GenerateSingleSection)
iaceRoutes.GET("/projects/:id/tech-file/export", h.ExportTechFile)
iaceRoutes.POST("/projects/:id/monitoring", h.CreateMonitoringEvent)
iaceRoutes.GET("/projects/:id/monitoring", h.ListMonitoringEvents)
iaceRoutes.PUT("/projects/:id/monitoring/:eid", h.UpdateMonitoringEvent)
iaceRoutes.GET("/projects/:id/audit-trail", h.GetAuditTrail)
// Library + corpus + benchmark.
iaceRoutes.POST("/library-search", h.SearchLibrary)
iaceRoutes.GET("/ce-corpus-documents", h.ListCECorpusDocuments)
iaceRoutes.POST("/projects/:id/initialize", h.InitializeProject)
iaceRoutes.GET("/projects/:id/hazard-blocks", h.GetHazardBlocks)
iaceRoutes.POST("/projects/:id/benchmark/import-gt", h.ImportGroundTruth)
iaceRoutes.GET("/projects/:id/benchmark", h.RunBenchmark)
iaceRoutes.GET("/projects/:id/benchmark/summary", h.GetBenchmarkSummary)
// Regulatory enrichment.
iaceRoutes.GET("/projects/:id/hazards/:hid/regulatory-hints", h.EnrichHazardWithRegulations)
iaceRoutes.GET("/projects/:id/mitigations/:mid/regulatory-hints", h.EnrichMitigationWithRegulations)
iaceRoutes.GET("/projects/:id/regulatory-hints", h.EnrichProjectHazardsBatch)
iaceRoutes.POST("/projects/:id/tech-file/:section/enrich", h.EnrichTechFileSection)
// Production lines.
iaceRoutes.POST("/production-lines", h.CreateProductionLine)
iaceRoutes.GET("/production-lines", h.ListProductionLines)
iaceRoutes.GET("/production-lines/:lid/dashboard", h.GetProductionLineDashboard)
iaceRoutes.POST("/production-lines/:lid/stations", h.AddStationToLine)
iaceRoutes.DELETE("/production-lines/:lid/stations/:sid", h.RemoveStationFromLine)
// CE x Compliance crossover + clarifications + customer standards.
iaceRoutes.GET("/projects/:id/compliance-triggers", h.GetComplianceTriggers)
iaceRoutes.GET("/compliance-faq", h.GetComplianceFAQ)
iaceRoutes.GET("/projects/:id/clarifications", h.ListClarifications)
iaceRoutes.GET("/projects/:id/clarifications.csv", h.ExportClarificationsCSV)
iaceRoutes.GET("/projects/:id/clarifications.html", h.ExportClarificationsHTML)
iaceRoutes.GET("/projects/:id/clarifications/:cid/detail", h.ListClarificationDetail)
iaceRoutes.POST("/projects/:id/clarifications/:cid/answer", h.AnswerClarification)
iaceRoutes.POST("/projects/:id/clarifications/:cid/comment", h.PostClarificationComment)
iaceRoutes.GET("/projects/:id/customer-standards", h.ListCustomerStandardSuggestions)
iaceRoutes.POST("/projects/:id/customer-standards/import", h.ImportCustomerStandardSuggestion)
}
}
@@ -0,0 +1,171 @@
package audit
import (
"sort"
"github.com/breakpilot/ai-compliance-sdk/internal/iace"
)
// runConsistencyImpl asks: does this component, with its own tags PLUS the
// tags of its TypicalEnergySources, actually trigger at least one pattern
// in every category listed in its TypicalHazardCategories?
//
// A component declares "this is what I am dangerous for" and the engine
// turns that declaration into hazards through patterns. If no pattern can
// fire from the component's tag set, the declaration is decorative — the
// engine will never produce a hazard in that category for this component,
// even though the library author said it should.
func init() {
runConsistencyImpl = runConsistency
}
func runConsistency() ConsistencyReport {
comps := iace.GetComponentLibrary()
energies := iace.GetEnergySources()
patterns := iace.AllPatterns()
energyByID := map[string]iace.EnergySourceEntry{}
for _, e := range energies {
energyByID[e.ID] = e
}
report := ConsistencyReport{TotalComponents: len(comps)}
for _, c := range comps {
if len(c.TypicalHazardCategories) == 0 {
report.Consistent++
continue
}
effective := buildEffectiveTags(c, energyByID)
covered := categoriesCoveredByPatterns(effective, c.MapsToComponentType, patterns)
var missing []string
for _, cat := range c.TypicalHazardCategories {
if !covered[cat] {
missing = append(missing, cat)
}
}
if len(missing) == 0 {
report.Consistent++
continue
}
result := ComponentResult{
ComponentID: c.ID,
NameDE: c.NameDE,
DeclaredCategories: c.TypicalHazardCategories,
}
for cat := range covered {
result.CoveredCategories = append(result.CoveredCategories, cat)
}
sort.Strings(result.CoveredCategories)
for _, cat := range missing {
result.MissingForCategories = append(result.MissingForCategories, CategoryGap{
Category: cat,
SuggestedTags: suggestTagsForCategory(cat, effective, patterns),
})
}
report.Incomplete++
report.IncompleteComponents = append(report.IncompleteComponents, result)
}
sort.Slice(report.IncompleteComponents, func(i, j int) bool {
return report.IncompleteComponents[i].ComponentID < report.IncompleteComponents[j].ComponentID
})
return report
}
func buildEffectiveTags(c iace.ComponentLibraryEntry, energyByID map[string]iace.EnergySourceEntry) map[string]bool {
set := map[string]bool{}
for _, t := range c.Tags {
set[t] = true
}
for _, eID := range c.TypicalEnergySources {
e, ok := energyByID[eID]
if !ok {
continue
}
for _, t := range e.Tags {
set[t] = true
}
}
return set
}
// categoriesCoveredByPatterns iterates patterns and finds which
// GeneratedHazardCats can fire given the component's effective tags.
// We ignore lifecycle, op-state, and human-role filters — those are
// project-level. The audit asks "can the library produce ANY hazard in
// this category for this component if the project configures everything
// reasonably?"
func categoriesCoveredByPatterns(tags map[string]bool, _ string, patterns []iace.HazardPattern) map[string]bool {
covered := map[string]bool{}
for _, p := range patterns {
if !tagsCover(tags, p.RequiredComponentTags) {
continue
}
if !tagsCover(tags, p.RequiredEnergyTags) {
continue
}
for _, cat := range p.GeneratedHazardCats {
covered[cat] = true
}
}
return covered
}
func tagsCover(have map[string]bool, required []string) bool {
for _, t := range required {
if !have[t] {
return false
}
}
return true
}
// suggestTagsForCategory looks at patterns that DO generate this category
// and identifies the tags that would close the gap. Returns the tags most
// commonly required by patterns in that category, minus what the component
// already has.
func suggestTagsForCategory(cat string, have map[string]bool, patterns []iace.HazardPattern) []string {
counts := map[string]int{}
for _, p := range patterns {
matchCat := false
for _, c := range p.GeneratedHazardCats {
if c == cat {
matchCat = true
break
}
}
if !matchCat {
continue
}
for _, t := range p.RequiredComponentTags {
if !have[t] {
counts[t]++
}
}
for _, t := range p.RequiredEnergyTags {
if !have[t] {
counts[t]++
}
}
}
type kv struct {
tag string
n int
}
var sorted []kv
for t, n := range counts {
sorted = append(sorted, kv{t, n})
}
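sort.Slice(sorted, func(i, j int) bool {
// Break ties by tag name so the suggestion order is deterministic
// despite Go's randomised map iteration order.
if sorted[i].n != sorted[j].n {
return sorted[i].n > sorted[j].n
}
return sorted[i].tag < sorted[j].tag
})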
var out []string
for i, s := range sorted {
if i >= 6 {
break
}
out = append(out, s.tag)
}
return out
}
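The consistency check above can be illustrated with a minimal sketch. It assumes a stripped-down `pattern` type with a single `RequiredTags` list; the type, the helper names `coversAll` and `missingCategories`, and the tag values are hypothetical and are not part of the `iace` package.

```go
package main

import (
	"fmt"
	"sort"
)

// pattern is a hypothetical, simplified stand-in for a hazard pattern.
type pattern struct {
	ID           string
	RequiredTags []string
	Categories   []string
}

// coversAll reports whether every required tag is present in have.
func coversAll(have map[string]bool, required []string) bool {
	for _, t := range required {
		if !have[t] {
			return false
		}
	}
	return true
}

// missingCategories returns the declared categories that no pattern can
// generate from the given effective tag set, sorted for stable output.
func missingCategories(declared []string, tags map[string]bool, patterns []pattern) []string {
	covered := map[string]bool{}
	for _, p := range patterns {
		if coversAll(tags, p.RequiredTags) {
			for _, c := range p.Categories {
				covered[c] = true
			}
		}
	}
	var missing []string
	for _, c := range declared {
		if !covered[c] {
			missing = append(missing, c)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Component tags plus the tags inherited from its energy sources.
	effective := map[string]bool{"moving_part": true, "gravity": true}
	patterns := []pattern{
		{ID: "P1", RequiredTags: []string{"moving_part", "gravity"}, Categories: []string{"mechanical_hazard"}},
		{ID: "P2", RequiredTags: []string{"high_voltage"}, Categories: []string{"electrical_hazard"}},
	}
	declared := []string{"mechanical_hazard", "electrical_hazard"}
	fmt.Println(missingCategories(declared, effective, patterns)) // prints [electrical_hazard]
}
```

In this toy data the component declares an electrical hazard but carries no tag that any electrical pattern requires, which is the kind of "decorative" declaration the audit is meant to surface.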
@@ -0,0 +1,161 @@
package audit
import (
"regexp"
"sort"
"strings"
)
// runEchoImpl checks if each meaningful phrase from the limits-form is
// echoed by at least one generated hazard. A phrase that names a concrete
// scenario, fault, or constraint must reappear (semantically) in some
// hazard's name, scenario, or description. Phrases without echo are gaps:
// the engineer documented the risk but the engine never lifted it into
// the hazard register.
//
// Echo detection here is a lightweight Jaccard overlap of content tokens
// (not embeddings). It is robust enough for a demonstrative diagnostic and
// keeps the audit fully deterministic without an external model. The
// caller can later swap in a vector-based scorer.
func init() {
runEchoImpl = runEcho
}
// Significant limits-form fields. Each item is (key, label). We only
// audit the freeform fields where engineers describe risks — list/enum
// fields (operating_modes, person_groups, industry_sectors) are out of
// scope because they carry no narrative phrases.
var echoFields = []struct {
key string
label string
}{
{"general_description", "Allg. Beschreibung"},
{"intended_purpose", "Bestimmungsgemaesse Verwendung"},
{"variants", "Varianten"},
{"foreseeable_misuses", "Vorhersehbare Fehlanwendung"},
{"spatial_limits", "Raeumliche Grenzen"},
{"temporal_limits", "Zeitliche Grenzen"},
{"operating_conditions", "Betriebsbedingungen"},
{"energy_supply", "Energieversorgung"},
{"mechanical_interfaces", "Mechanische Schnittstellen"},
{"electrical_interfaces", "Elektrische Schnittstellen"},
{"software_interfaces", "Software-Schnittstellen"},
{"pneumatic_hydraulic_interfaces", "Pneumatik/Hydraulik"},
{"qualification_requirements", "Personenqualifikation"},
}
var sentenceSplit = regexp.MustCompile(`[.!?]\s+|\n+`)
var wordRE = regexp.MustCompile(`[a-zäöüßA-ZÄÖÜ]{4,}`)
// echoThreshold — minimum Jaccard overlap (between sentence content
// tokens and a hazard's content tokens) above which the sentence is
// considered echoed. Tuned by hand to give meaningful results without a
// labeled corpus; the audit reports the actual best score for each
// orphaned phrase so a human can re-tune if needed.
const echoThreshold = 0.18
func runEcho(form map[string]any, hazards []map[string]any) EchoReport {
limits := unwrapLimits(form)
// Precompute hazard token bags once
type bag struct {
tokens map[string]bool
text string
}
var hazardBags []bag
for _, h := range hazards {
txt := joinHazardText(h)
toks := contentTokenSet(txt)
hazardBags = append(hazardBags, bag{tokens: toks, text: txt})
}
report := EchoReport{}
for _, fld := range echoFields {
raw, _ := limits[fld.key].(string)
raw = strings.TrimSpace(raw)
if raw == "" {
continue
}
for _, sent := range sentenceSplit.Split(raw, -1) {
sent = strings.TrimSpace(sent)
if len(sent) < 30 {
// Skip very short fragments
continue
}
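st := contentTokenSet(sent)
if len(st) < 3 {
// Too few content tokens to score. Skip before counting so
// that TotalPhrases always equals Echoed + Orphaned.
continue
}
report.TotalPhrases++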
bestScore := 0.0
for _, hb := range hazardBags {
score := jaccard(st, hb.tokens)
if score > bestScore {
bestScore = score
}
}
if bestScore >= echoThreshold {
report.Echoed++
continue
}
report.Orphaned++
report.OrphanedPhrases = append(report.OrphanedPhrases, OrphanedPhrase{
Field: fld.label,
Phrase: sent,
BestScore: bestScore,
})
}
}
sort.Slice(report.OrphanedPhrases, func(i, j int) bool {
// Lowest scores first — most clearly orphaned
return report.OrphanedPhrases[i].BestScore < report.OrphanedPhrases[j].BestScore
})
return report
}
func unwrapLimits(form map[string]any) map[string]any {
if inner, ok := form["limits_form"].(map[string]any); ok {
return inner
}
return form
}
func joinHazardText(h map[string]any) string {
parts := []string{}
for _, k := range []string{"name", "description", "scenario", "trigger_event", "possible_harm", "hazardous_zone", "category", "sub_category"} {
if v, ok := h[k].(string); ok {
parts = append(parts, v)
}
}
return strings.Join(parts, " ")
}
func contentTokenSet(s string) map[string]bool {
out := map[string]bool{}
for _, m := range wordRE.FindAllString(s, -1) {
w := strings.ToLower(m)
if stopWords[w] {
continue
}
out[w] = true
}
return out
}
func jaccard(a, b map[string]bool) float64 {
if len(a) == 0 || len(b) == 0 {
return 0
}
inter := 0
for x := range a {
if b[x] {
inter++
}
}
union := len(a) + len(b) - inter
if union == 0 {
return 0
}
return float64(inter) / float64(union)
}
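As a worked illustration of the echo score, the sketch below tokenises one limits-form phrase and one hazard text and computes their Jaccard overlap. The helper names `tokenSet` and `jaccardScore`, the two-entry stop-word list, and the sample sentences are assumptions made for this example; the real stop-word list is defined elsewhere in the package and is not shown in this diff.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wordPattern keeps alphabetic words of four or more letters,
// including German umlauts.
var wordPattern = regexp.MustCompile(`[a-zäöüßA-ZÄÖÜ]{4,}`)

// demoStopWords is a hypothetical placeholder list.
var demoStopWords = map[string]bool{"oder": true, "wird": true}

// tokenSet lower-cases the words of s and drops stop words.
func tokenSet(s string) map[string]bool {
	out := map[string]bool{}
	for _, m := range wordPattern.FindAllString(s, -1) {
		w := strings.ToLower(m)
		if !demoStopWords[w] {
			out[w] = true
		}
	}
	return out
}

// jaccardScore returns |a ∩ b| / |a ∪ b|, or 0 if either set is empty.
func jaccardScore(a, b map[string]bool) float64 {
	if len(a) == 0 || len(b) == 0 {
		return 0
	}
	inter := 0
	for x := range a {
		if b[x] {
			inter++
		}
	}
	return float64(inter) / float64(len(a)+len(b)-inter)
}

func main() {
	phrase := tokenSet("Bremse versagt bei Absenkbewegung der Plattform")
	hazard := tokenSet("Quetschen unter der Plattform, wenn die Bremse versagt")
	// 3 shared tokens out of 7 distinct tokens.
	fmt.Printf("%.3f\n", jaccardScore(phrase, hazard)) // prints 0.429
}
```

With the threshold of 0.18 used above, this phrase would count as echoed. Note that a long hazard text enlarges the union and lowers the score, so short phrases can fall below the threshold even when all of their tokens appear in the hazard.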
@@ -0,0 +1,158 @@
package audit
import (
"sort"
"strings"
)
// runHierarchyImpl checks the EN ISO 12100 risk-reduction hierarchy on
// the generated mitigation set: every hazard should have at least one
// "inherently safe design" measure (design), and hazards in the
// categories listed in hazardExpectsProtection should additionally have
// a guarding/protective device (protection). A missing
// information-for-use measure (information) is reported only when no
// other level is present.
//
// Categories outside that list (for example cyber, ergonomic, or
// software-only hazards) have looser expectations: a design measure
// alone may legitimately suffice. The audit reports which level is
// missing, not whether the remaining measures are individually correct.
// That is a different check (E2, semantic quality), out of scope here.
func init() {
runHierarchyImpl = runHierarchy
}
// hazardExpectsProtection lists hazard categories where a pure
// design+information combination is usually not enough — the engine
// should produce at least one explicit protective measure (guard,
// interlock, sensor, presence detector, …).
var hazardExpectsProtection = map[string]bool{
"mechanical_hazard": true,
"electrical_hazard": true,
"thermal_hazard": true,
"pneumatic_hydraulic": true,
"radiation_hazard": true,
"laser_hazard": true,
"fire_explosion_hazard": true,
"chemical_hazard": true,
}
func runHierarchy(hazards, mitigations []map[string]any) HierarchyReport {
report := HierarchyReport{TotalHazards: len(hazards)}
// Index mitigations by hazard_id
byHazard := map[string][]map[string]any{}
for _, m := range mitigations {
hid, _ := m["hazard_id"].(string)
if hid == "" {
continue
}
byHazard[hid] = append(byHazard[hid], m)
}
for _, h := range hazards {
hid, _ := h["id"].(string)
category, _ := h["category"].(string)
name, _ := h["name"].(string)
levels := levelsForHazard(byHazard[hid])
missing := expectedMissing(category, levels)
if len(missing) == 0 {
report.Complete++
continue
}
for _, m := range missing {
switch m {
case "design":
report.MissingDesign++
case "protection":
report.MissingProtection++
case "information":
report.MissingInfo++
}
}
report.IncompleteHazards = append(report.IncompleteHazards, HazardHierarchyResult{
HazardID: hid,
Name: name,
Category: category,
Levels: levels,
MissingLevels: missing,
})
}
// Sort: protection-missing first (most consequential), then by category
sort.Slice(report.IncompleteHazards, func(i, j int) bool {
a := report.IncompleteHazards[i]
b := report.IncompleteHazards[j]
ap := contains(a.MissingLevels, "protection")
bp := contains(b.MissingLevels, "protection")
if ap != bp {
return ap
}
return a.Category < b.Category
})
return report
}
// levelsForHazard returns the distinct reduction-type levels present
// for a hazard's mitigation set. Possible values: design, protection,
// information.
func levelsForHazard(mits []map[string]any) []string {
seen := map[string]bool{}
for _, m := range mits {
rt, _ := m["reduction_type"].(string)
switch strings.ToLower(rt) {
case "design":
seen["design"] = true
case "protection", "protective":
seen["protection"] = true
case "information":
seen["information"] = true
}
}
var out []string
for k := range seen {
out = append(out, k)
}
sort.Strings(out)
return out
}
// expectedMissing returns the levels that the hierarchy demands but
// the mitigation set does not provide.
//
// Rule:
// - Every hazard should have a design measure, including hazards
// that have no mitigations at all.
// - Categories in hazardExpectsProtection additionally need a
// protection measure.
// - A missing information measure is reported only when design and
// protection are absent as well, i.e. the hazard has no recognised
// mitigation level at all (the real engine usually still adds an
// information measure).
func expectedMissing(category string, present []string) []string {
have := toBoolSet(present)
var missing []string
if !have["design"] {
missing = append(missing, "design")
}
if hazardExpectsProtection[category] && !have["protection"] {
missing = append(missing, "protection")
}
// Information is only flagged if both design and protection are
// also absent; otherwise the report gets too noisy. When information
// is the SOLE present level, the hazard still shows up because design
// (and, for protection-expecting categories, protection) is flagged
// above: mitigation by warning labels alone is rarely adequate.
if !have["information"] && !have["design"] && !have["protection"] {
missing = append(missing, "information")
}
return missing
}
func contains(list []string, target string) bool {
for _, x := range list {
if x == target {
return true
}
}
return false
}
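The hierarchy rule can be checked against a few cases with a small standalone sketch. It assumes a shortened category list and a hypothetical helper `missingLevels`; the category name `ergonomic_hazard` is also an assumption chosen only to represent a category outside the protection list.

```go
package main

import "fmt"

// needsProtection is a shortened, illustrative copy of the category list.
var needsProtection = map[string]bool{
	"mechanical_hazard": true,
	"electrical_hazard": true,
}

// missingLevels applies the rule to the reduction levels present for one
// hazard and returns the levels that are expected but absent.
func missingLevels(category string, present []string) []string {
	have := map[string]bool{}
	for _, p := range present {
		have[p] = true
	}
	var missing []string
	if !have["design"] {
		missing = append(missing, "design")
	}
	if needsProtection[category] && !have["protection"] {
		missing = append(missing, "protection")
	}
	// Information is reported only when nothing at all is present.
	if !have["design"] && !have["protection"] && !have["information"] {
		missing = append(missing, "information")
	}
	return missing
}

func main() {
	fmt.Println(missingLevels("mechanical_hazard", []string{"design", "protection"})) // prints []
	fmt.Println(missingLevels("mechanical_hazard", []string{"information"}))          // prints [design protection]
	fmt.Println(missingLevels("ergonomic_hazard", nil))                               // prints [design information]
}
```

The second case shows how a hazard mitigated only by information still surfaces: it is reported through the missing design and protection levels rather than through a dedicated flag.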
@@ -0,0 +1,37 @@
package audit
// Implementation entry points for Methods B-E. The full algorithms live
// in consistency.go, vocabulary.go, echo.go, hierarchy.go respectively.
// Each of those files installs its implementation from an init function.
// Until a file lands, the wrapper keeps main.go compilable and returns
// an empty report.
func RunConsistency() ConsistencyReport {
return runConsistencyImpl()
}
func RunVocabulary(form map[string]any) VocabularyReport {
return runVocabularyImpl(form)
}
func RunEcho(form map[string]any, hazards []map[string]any) EchoReport {
return runEchoImpl(form, hazards)
}
func RunHierarchy(hazards, mitigations []map[string]any) HierarchyReport {
return runHierarchyImpl(hazards, mitigations)
}
// Default implementations, each overridden by the init function of its
// method file once that file lands. Holding them as function variables
// in one place avoids name clashes when consistency.go etc. add their
// real implementations.
var (
runConsistencyImpl = func() ConsistencyReport { return ConsistencyReport{} }
runVocabularyImpl = func(form map[string]any) VocabularyReport { return VocabularyReport{} }
runEchoImpl = func(form map[string]any, hazards []map[string]any) EchoReport {
return EchoReport{}
}
runHierarchyImpl = func(hazards, mitigations []map[string]any) HierarchyReport {
return HierarchyReport{}
}
)
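The override mechanism used here relies on Go's initialisation order: package-level variables are initialised before any `init` function runs, so an assignment inside `init` replaces the default. A minimal single-file sketch, with the hypothetical names `report`, `runImpl`, and `Run`:

```go
package main

import "fmt"

// report is a hypothetical stand-in for one of the audit report types.
type report struct{ Total int }

// runImpl starts as a placeholder that returns an empty report.
var runImpl = func() report { return report{} }

// init swaps in the real implementation. In the package above, this
// function would live in the method's own file.
func init() {
	runImpl = func() report { return report{Total: 3} }
}

// Run is the stable exported entry point that callers use.
func Run() report { return runImpl() }

func main() {
	fmt.Println(Run().Total) // prints 3
}
```

One consequence of this design is that a missing method file fails silently: the exported function still compiles and simply returns a zero-valued report.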
@@ -0,0 +1,298 @@
// Package audit provides static and runtime audits of the IACE pattern
// engine — finding pattern reachability, library consistency, and
// limits-form coverage gaps without a ground-truth reference.
package audit
import (
"sort"
"github.com/breakpilot/ai-compliance-sdk/internal/iace"
)
// ReachabilityResult is the verdict for a single pattern in Method A.
type ReachabilityResult struct {
PatternID string `json:"pattern_id"`
Name string `json:"name_de"`
Priority int `json:"priority"`
RequiredAllTags []string `json:"required_tags"`
UnreachableTags []string `json:"unreachable_tags,omitempty"`
Status string `json:"status"` // "reachable" | "weakly_reachable" | "unreachable"
ReachableSources []string `json:"reachable_sources,omitempty"`
FixSuggestions []string `json:"fix_suggestions,omitempty"`
}
// ReachabilityReport is the full Method A output.
type ReachabilityReport struct {
TotalPatterns int `json:"total_patterns"`
Reachable int `json:"reachable"`
WeaklyReachable int `json:"weakly_reachable"`
Unreachable int `json:"unreachable"`
UniverseTags []string `json:"universe_tags"`
UnreachablePatterns []ReachabilityResult `json:"unreachable_patterns"`
WeakPatterns []ReachabilityResult `json:"weak_patterns"`
}
// RunReachability evaluates every pattern against the achievable tag universe.
//
// A pattern is:
// - "unreachable" if at least one required tag is not produced by any
// component, energy source, or keyword-dictionary entry.
// - "weakly_reachable" if all required tags exist in the universe but
// no single source (one Component or one EnergySource or one Keyword
// entry) supplies all of them at once — i.e., it relies on multiple
// parser hits to combine.
// - "reachable" if some single source covers all required tags.
//
// The classification ignores ExcludedComponentTags and runtime filters
// (lifecycle/op-state/machine-type), because those are project-level
// concerns. The audit answers "could this pattern EVER fire", not
// "does it fire for project X".
func RunReachability() ReachabilityReport {
patterns := iace.AllPatterns()
comps := iace.GetComponentLibrary()
energies := iace.GetEnergySources()
keywords := iace.GetKeywordDictionary()
// Tag universe: union of every tag emitted anywhere
universe := map[string][]string{} // tag → list of source IDs that emit it
for _, c := range comps {
for _, t := range c.Tags {
universe[t] = appendUnique(universe[t], "component:"+c.ID)
}
}
for _, e := range energies {
for _, t := range e.Tags {
universe[t] = appendUnique(universe[t], "energy:"+e.ID)
}
}
for i, kw := range keywords {
for _, t := range kw.ExtraTags {
universe[t] = appendUnique(universe[t], keywordLabel(kw, i))
}
// Keyword entries can also reference components/energies, which
// transitively add their tags to the keyword's effective tag set.
for _, cID := range kw.ComponentIDs {
for _, c := range comps {
if c.ID != cID {
continue
}
for _, t := range c.Tags {
universe[t] = appendUnique(universe[t], keywordLabel(kw, i))
}
}
}
for _, eID := range kw.EnergyIDs {
for _, e := range energies {
if e.ID != eID {
continue
}
for _, t := range e.Tags {
universe[t] = appendUnique(universe[t], keywordLabel(kw, i))
}
}
}
}
// Single-source coverage map: tag → covering sources, but also
// per-source tag set so we can check "is there ONE source covering
// all required tags".
sourceTags := map[string]map[string]bool{}
for _, c := range comps {
key := "component:" + c.ID
sourceTags[key] = toBoolSet(c.Tags)
}
for _, e := range energies {
key := "energy:" + e.ID
sourceTags[key] = toBoolSet(e.Tags)
}
for i, kw := range keywords {
key := keywordLabel(kw, i)
set := toBoolSet(kw.ExtraTags)
for _, cID := range kw.ComponentIDs {
for _, c := range comps {
if c.ID == cID {
for _, t := range c.Tags {
set[t] = true
}
}
}
}
for _, eID := range kw.EnergyIDs {
for _, e := range energies {
if e.ID == eID {
for _, t := range e.Tags {
set[t] = true
}
}
}
}
sourceTags[key] = set
}
report := ReachabilityReport{TotalPatterns: len(patterns)}
// Universe tag list (sorted) for the report header
for t := range universe {
report.UniverseTags = append(report.UniverseTags, t)
}
sort.Strings(report.UniverseTags)
for _, p := range patterns {
all := dedup(append(append([]string{}, p.RequiredComponentTags...), p.RequiredEnergyTags...))
if len(all) == 0 {
// Pattern with no tag requirements relies on lifecycle/machine_type
// filters only — count as reachable by default.
report.Reachable++
continue
}
var missing []string
for _, t := range all {
if _, ok := universe[t]; !ok {
missing = append(missing, t)
}
}
res := ReachabilityResult{
PatternID: p.ID,
Name: p.NameDE,
Priority: p.Priority,
RequiredAllTags: all,
}
if len(missing) > 0 {
res.Status = "unreachable"
res.UnreachableTags = missing
res.FixSuggestions = suggestFixes(p, missing, comps, sourceTags)
report.Unreachable++
report.UnreachablePatterns = append(report.UnreachablePatterns, res)
continue
}
// All tags in universe — check single-source coverage
single := findSingleSourceCovers(all, sourceTags)
if len(single) > 0 {
res.Status = "reachable"
res.ReachableSources = single
report.Reachable++
continue
}
res.Status = "weakly_reachable"
res.FixSuggestions = suggestSingleSourceFixes(p, all, comps, sourceTags)
report.WeaklyReachable++
report.WeakPatterns = append(report.WeakPatterns, res)
}
sort.Slice(report.UnreachablePatterns, func(i, j int) bool {
return report.UnreachablePatterns[i].Priority > report.UnreachablePatterns[j].Priority
})
sort.Slice(report.WeakPatterns, func(i, j int) bool {
return report.WeakPatterns[i].Priority > report.WeakPatterns[j].Priority
})
return report
}
func findSingleSourceCovers(required []string, sourceTags map[string]map[string]bool) []string {
var hits []string
for src, tags := range sourceTags {
ok := true
for _, t := range required {
if !tags[t] {
ok = false
break
}
}
if ok {
hits = append(hits, src)
}
}
sort.Strings(hits)
return hits
}
// suggestFixes proposes concrete library edits for unreachable patterns:
// "Add tag X to Component C014 (Hubwerk)" type suggestions.
func suggestFixes(p iace.HazardPattern, missing []string, comps []iace.ComponentLibraryEntry, sourceTags map[string]map[string]bool) []string {
var out []string
// For each missing tag, find candidates: components/energies that
// would semantically own that tag based on existing tags overlap.
for _, tag := range missing {
candidates := nearComponents(p, tag, comps, sourceTags)
if len(candidates) > 0 {
out = append(out, "Add tag '"+tag+"' to one of: "+joinFirst(candidates, 3))
} else {
out = append(out, "Tag '"+tag+"' is undefined anywhere — needs a new component or energy source carrying it")
}
}
return out
}
func suggestSingleSourceFixes(p iace.HazardPattern, all []string, comps []iace.ComponentLibraryEntry, sourceTags map[string]map[string]bool) []string {
// Find components that match the most required tags, then suggest
// adding the residual ones.
best := ""
bestCover := 0
var bestMissing []string
for src, tags := range sourceTags {
hit := 0
var miss []string
for _, t := range all {
if tags[t] {
hit++
} else {
miss = append(miss, t)
}
}
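// Prefer higher coverage; break ties by source name so the result
// does not depend on Go's randomised map iteration order.
if hit > bestCover || (hit == bestCover && hit > 0 && src < best) {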
best, bestCover, bestMissing = src, hit, miss
}
}
if best == "" || bestCover == 0 {
return []string{"No single source covers any required tags — pattern needs a new dedicated component"}
}
if len(bestMissing) == 0 {
return nil
}
return []string{"Closest single source '" + best + "' covers " + itoa(bestCover) + "/" + itoa(len(all)) + " tags. Add missing tags to it: " + joinFirst(bestMissing, 5)}
}
// nearComponents finds components whose tags overlap most with the pattern's
// requirements — these are good candidates to receive the missing tag.
func nearComponents(p iace.HazardPattern, missing string, comps []iace.ComponentLibraryEntry, sourceTags map[string]map[string]bool) []string {
required := dedup(append(append([]string{}, p.RequiredComponentTags...), p.RequiredEnergyTags...))
required = removeOne(required, missing)
if len(required) == 0 {
return nil
}
type scored struct {
id string
score int
}
var scoredList []scored
for _, c := range comps {
tagSet := toBoolSet(c.Tags)
s := 0
for _, t := range required {
if tagSet[t] {
s++
}
}
if s > 0 {
scoredList = append(scoredList, scored{id: c.ID + " (" + c.NameDE + ")", score: s})
}
}
sort.Slice(scoredList, func(i, j int) bool { return scoredList[i].score > scoredList[j].score })
var out []string
for _, s := range scoredList {
out = append(out, s.id)
}
return out
}
func keywordLabel(kw iace.KeywordEntry, idx int) string {
if len(kw.Keywords) > 0 {
return "keyword:" + kw.Keywords[0]
}
return "keyword:" + itoa(idx)
}
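Reviewer sketch (not part of the diff): the closest-source search in suggestSingleSourceFixes can be exercised on its own. The function name `closestSource` and the sample tag sets are hypothetical and only loosely modelled on the component library. This variant visits sources in sorted order, so equal coverage always resolves to the same source.

```go
package main

import (
	"fmt"
	"sort"
)

// closestSource returns the source that covers the most required tags,
// how many it covers, and the required tags it still lacks.
// Hypothetical helper for illustration; not a function from the diff.
func closestSource(required []string, sourceTags map[string]map[string]bool) (string, int, []string) {
	// Visit sources in sorted order so ties are resolved deterministically.
	names := make([]string, 0, len(sourceTags))
	for name := range sourceTags {
		names = append(names, name)
	}
	sort.Strings(names)

	best, bestCover := "", 0
	var bestMissing []string
	for _, name := range names {
		hit := 0
		var miss []string
		for _, t := range required {
			if sourceTags[name][t] {
				hit++
			} else {
				miss = append(miss, t)
			}
		}
		// Strictly greater: the first source in sorted order wins a tie.
		if hit > bestCover {
			best, bestCover, bestMissing = name, hit, miss
		}
	}
	return best, bestCover, bestMissing
}

func main() {
	// Illustrative tag sets, assumed for this example only.
	sources := map[string]map[string]bool{
		"C005": {"moving_part": true, "crush_point": true},
		"C013": {"moving_part": true, "stored_energy": true},
		"C014": {"moving_part": true, "high_force": true, "gravity_risk": true},
	}
	required := []string{"moving_part", "gravity_risk", "crush_point"}
	best, cover, missing := closestSource(required, sources)
	fmt.Println(best, cover, missing)
}
```

C005 and C014 both cover two of the three required tags here; the sorted visit order is what makes C005 the stable answer.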
@@ -0,0 +1,84 @@
package audit
// Stubs for Methods B-E. Each is filled in its own file as the audit
// suite grows. Keeping the type contracts here lets the CLI compile
// before each method has its full implementation.
// ============================================================================
// Method B — Component Self-Consistency
// ============================================================================
type CategoryGap struct {
Category string `json:"category"`
SuggestedTags []string `json:"suggested_tags"`
}
type ComponentResult struct {
ComponentID string `json:"component_id"`
NameDE string `json:"name_de"`
DeclaredCategories []string `json:"declared_categories"`
CoveredCategories []string `json:"covered_categories"`
MissingForCategories []CategoryGap `json:"missing_for_categories,omitempty"`
}
type ConsistencyReport struct {
TotalComponents int `json:"total_components"`
Consistent int `json:"consistent"`
Incomplete int `json:"incomplete"`
IncompleteComponents []ComponentResult `json:"incomplete_components"`
}
// ============================================================================
// Method C — Limits-Form Vocabulary Diff
// ============================================================================
type DictionarySuggestion struct {
Token string `json:"token"`
Field string `json:"field"`
PatternIDs []string `json:"pattern_ids"`
}
type VocabularyReport struct {
UniqueTokens int `json:"unique_tokens"`
KnownTokens []string `json:"known_tokens"`
UnknownTokens []string `json:"unknown_tokens"`
SuggestedDictionaryEntries []DictionarySuggestion `json:"suggested_dictionary_entries"`
}
// ============================================================================
// Method D — Limits-Form Echo
// ============================================================================
type OrphanedPhrase struct {
Field string `json:"field"`
Phrase string `json:"phrase"`
BestScore float64 `json:"best_score"`
}
type EchoReport struct {
TotalPhrases int `json:"total_phrases"`
Echoed int `json:"echoed"`
Orphaned int `json:"orphaned"`
OrphanedPhrases []OrphanedPhrase `json:"orphaned_phrases"`
}
// ============================================================================
// Method E — Hierarchy Completeness
// ============================================================================
type HazardHierarchyResult struct {
HazardID string `json:"hazard_id"`
Name string `json:"name"`
Category string `json:"category"`
Levels []string `json:"present_levels"`
MissingLevels []string `json:"missing_levels"`
}
type HierarchyReport struct {
TotalHazards int `json:"total_hazards"`
Complete int `json:"complete"`
MissingDesign int `json:"missing_design"`
MissingProtection int `json:"missing_protection"`
MissingInfo int `json:"missing_information"`
IncompleteHazards []HazardHierarchyResult `json:"incomplete_hazards"`
}
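Reviewer sketch (not part of the diff): the report types above are plain JSON contracts, so their wire shape can be checked with `encoding/json` alone. The two struct definitions are copied from the hunk; the `encode` helper and the sample values (including the idea that a `noise_vibration` gap suggests `noise_source`) are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type CategoryGap struct {
	Category      string   `json:"category"`
	SuggestedTags []string `json:"suggested_tags"`
}

type ComponentResult struct {
	ComponentID          string        `json:"component_id"`
	NameDE               string        `json:"name_de"`
	DeclaredCategories   []string      `json:"declared_categories"`
	CoveredCategories    []string      `json:"covered_categories"`
	MissingForCategories []CategoryGap `json:"missing_for_categories,omitempty"`
}

// encode is a hypothetical helper that marshals a result to a JSON string.
func encode(r ComponentResult) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// A consistent component: the omitempty field disappears entirely.
	fmt.Println(encode(ComponentResult{
		ComponentID:        "C012",
		NameDE:             "Kupplung",
		DeclaredCategories: []string{"mechanical_hazard"},
		CoveredCategories:  []string{"mechanical_hazard"},
	}))

	// An incomplete component: the gap list is emitted.
	fmt.Println(encode(ComponentResult{
		ComponentID:        "C017",
		NameDE:             "Schraubstation",
		DeclaredCategories: []string{"mechanical_hazard", "noise_vibration"},
		CoveredCategories:  []string{"mechanical_hazard"},
		MissingForCategories: []CategoryGap{
			{Category: "noise_vibration", SuggestedTags: []string{"noise_source"}},
		},
	}))
}
```

One thing this makes visible: slice fields without `omitempty` serialize as `null` when left nil, so a consumer of these reports may need to handle both `null` and `[]`.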
@@ -0,0 +1,62 @@
package audit
import "strconv"
func appendUnique(list []string, item string) []string {
for _, x := range list {
if x == item {
return list
}
}
return append(list, item)
}
func toBoolSet(list []string) map[string]bool {
s := make(map[string]bool, len(list))
for _, x := range list {
s[x] = true
}
return s
}
func dedup(list []string) []string {
seen := map[string]bool{}
var out []string
for _, x := range list {
if !seen[x] {
seen[x] = true
out = append(out, x)
}
}
return out
}
func removeOne(list []string, item string) []string {
out := make([]string, 0, len(list))
for _, x := range list {
if x != item {
out = append(out, x)
}
}
return out
}
func joinFirst(list []string, n int) string {
if len(list) <= n {
return joinAll(list)
}
return joinAll(list[:n]) + ", ..."
}
func joinAll(list []string) string {
s := ""
for i, x := range list {
if i > 0 {
s += ", "
}
s += x
}
return s
}
func itoa(n int) string { return strconv.Itoa(n) }
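Reviewer sketch (not part of the diff): `joinAll` behaves like the standard library's `strings.Join` with a `", "` separator, so `joinFirst` could be expressed on top of it. The version below is an assumed equivalent, shown only to document the truncation behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// joinFirst joins at most n items with ", " and appends ", ..." when the
// list was truncated. Sketch built on strings.Join instead of joinAll.
func joinFirst(list []string, n int) string {
	if len(list) <= n {
		return strings.Join(list, ", ")
	}
	return strings.Join(list[:n], ", ") + ", ..."
}

func main() {
	fmt.Println(joinFirst([]string{"C005", "C008", "C014", "C018"}, 3))
	fmt.Println(joinFirst([]string{"C014"}, 3))
}
```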
@@ -0,0 +1,153 @@
package audit
import (
"regexp"
"sort"
"strings"
"github.com/breakpilot/ai-compliance-sdk/internal/iace"
)
// runVocabularyImpl takes a limits-form payload (the structured machine
// description filled in by the engineer) and asks: which of its words
// are unknown to the keyword dictionary yet appear in any pattern's
// scenario/trigger/harm/zone text? Each such word is a dictionary gap —
// the engineer typed a term that some pattern is waiting for, but the
// parser cannot translate it into a tag.
func init() {
runVocabularyImpl = runVocabulary
}
var tokenRE = regexp.MustCompile(`[a-zäöüßA-ZÄÖÜ]{4,}`)
// German + English stop words that show up in any narrative but carry
// no engineering meaning. Kept short on purpose — we only want to drop
// obvious filler.
var stopWords = map[string]bool{
"oder": true, "und": true, "auch": true, "wenn": true, "wird": true,
"werden": true, "kann": true, "koennen": true, "soll": true, "muss": true,
"sind": true, "eine": true, "einer": true, "einem": true, "einen": true,
"diese": true, "dieser": true, "dieses": true, "diesem": true, "diesen": true,
"durch": true, "nach": true, "ueber": true, "unter": true, "zwischen": true,
"nicht": true, "ohne": true, "fuer": true, "bzw": true, "etc": true,
"sowie": true, "siehe": true, "etwa": true, "ggf": true, "the": true,
"with": true, "from": true, "this": true, "that": true, "have": true,
"insbesondere": true, "ausschliesslich": true, "ebenfalls": true,
"jeweils": true, "weitere": true, "weiteren": true, "weiterer": true,
}
func runVocabulary(form map[string]any) VocabularyReport {
limits, ok := form["limits_form"].(map[string]any)
if !ok {
// Form may already be the inner object
limits = form
}
tokens := map[string]bool{}
for _, v := range limits {
extractTokens(v, tokens)
}
report := VocabularyReport{UniqueTokens: len(tokens)}
dictTokens := dictionaryVocabulary()
for tok := range tokens {
if stopWords[tok] {
continue
}
if dictTokenHit(tok, dictTokens) {
report.KnownTokens = append(report.KnownTokens, tok)
} else {
report.UnknownTokens = append(report.UnknownTokens, tok)
}
}
sort.Strings(report.KnownTokens)
sort.Strings(report.UnknownTokens)
// For each unknown token check if any pattern names it
patterns := iace.AllPatterns()
for _, tok := range report.UnknownTokens {
hits := patternsMentioning(tok, patterns)
if len(hits) == 0 {
continue
}
report.SuggestedDictionaryEntries = append(report.SuggestedDictionaryEntries, DictionarySuggestion{
Token: tok,
PatternIDs: hits,
})
}
sort.Slice(report.SuggestedDictionaryEntries, func(i, j int) bool {
return len(report.SuggestedDictionaryEntries[i].PatternIDs) > len(report.SuggestedDictionaryEntries[j].PatternIDs)
})
return report
}
func extractTokens(v any, out map[string]bool) {
switch x := v.(type) {
case string:
for _, m := range tokenRE.FindAllString(x, -1) {
out[strings.ToLower(m)] = true
}
case []any:
for _, e := range x {
extractTokens(e, out)
}
case map[string]any:
for _, e := range x {
extractTokens(e, out)
}
}
}
// dictionaryVocabulary builds the lowercase set of all keyword strings
// in the keyword dictionary. It only lowercases: umlaut-normalized forms
// (ae/oe/ue) are in the set only if the dictionary itself lists them.
func dictionaryVocabulary() map[string]bool {
out := map[string]bool{}
for _, kw := range iace.GetKeywordDictionary() {
for _, k := range kw.Keywords {
out[strings.ToLower(k)] = true
}
}
return out
}
// dictTokenHit returns true if the token would be matched by any
// dictionary entry. Dictionary entries can be substrings, so we treat
// the dict as a set of stem-like matchers: a token is "known" if it
// equals a dict word OR contains a dict word as substring OR the dict
// word contains the token.
func dictTokenHit(tok string, dict map[string]bool) bool {
if dict[tok] {
return true
}
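for d := range dict {
// Skip empty entries: strings.Contains(tok, "") is always true and
// would mark every token as known.
if d == "" {
continue
}
if strings.Contains(tok, d) || strings.Contains(d, tok) {
return true
}
}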
return false
}
// patternsMentioning returns up to 8 pattern IDs whose scenario/trigger/
// harm/zone text contains the token (case-insensitive substring).
func patternsMentioning(tok string, patterns []iace.HazardPattern) []string {
tokLower := strings.ToLower(tok)
seen := map[string]bool{}
var out []string
for _, p := range patterns {
hay := strings.ToLower(p.ScenarioDE + " " + p.TriggerDE + " " + p.HarmDE + " " + p.ZoneDE + " " + p.NameDE)
if !strings.Contains(hay, tokLower) {
continue
}
if seen[p.ID] {
continue
}
seen[p.ID] = true
out = append(out, p.ID)
if len(out) >= 8 {
break
}
}
return out
}
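Reviewer sketch (not part of the diff): the tokenizer and the bidirectional substring rule described for dictTokenHit can be tried in isolation. The regular expression is the one from the hunk; `tokenize`, `known`, the sample sentence and the two dictionary entries are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// Same character class as tokenRE in the diff: runs of four or more letters.
var tokenRE = regexp.MustCompile(`[a-zäöüßA-ZÄÖÜ]{4,}`)

// tokenize returns the sorted set of lowercased tokens found in text.
func tokenize(text string) []string {
	seen := map[string]bool{}
	for _, m := range tokenRE.FindAllString(text, -1) {
		seen[strings.ToLower(m)] = true
	}
	out := make([]string, 0, len(seen))
	for t := range seen {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

// known applies the substring rule in both directions: the token contains
// a dictionary word, or a dictionary word contains the token.
func known(tok string, dict []string) bool {
	for _, d := range dict {
		if d == "" {
			continue // an empty entry would match everything
		}
		if strings.Contains(tok, d) || strings.Contains(d, tok) {
			return true
		}
	}
	return false
}

func main() {
	dict := []string{"hubwerk", "not-halt"} // assumed dictionary entries
	for _, tok := range tokenize("Das Hubwerksgetriebe hebt die Kiste an") {
		fmt.Println(tok, known(tok, dict))
	}
}
```

Two consequences of the rule are worth keeping in mind when reading the report: words shorter than four letters never become tokens, and a short token such as "halt" counts as known merely because it occurs inside "not-halt".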
@@ -104,39 +104,14 @@ func GetProjectComplianceTriggers(hazards []Hazard, patterns []HazardPattern) *C
}
}
// AllPatterns returns every hazard pattern from all pattern sources.
// This mirrors the aggregation in NewPatternEngine but returns just the slice.
// AllPatterns returns every registered hazard pattern. Delegates to
// collectAllPatterns() in pattern_registry.go so new pattern sources only
// need to be added in one place. Pre-2026-05-21 this function maintained
// a duplicate enumeration which silently drifted from the registry —
// CRA, ISO12100-gap, robot-cell, CNC, VDMA, textile-agri, GT-bremse and
// secondary-harm patterns were invisible to AllPatterns callers.
func AllPatterns() []HazardPattern {
p := GetBuiltinHazardPatterns()
p = append(p, GetExtendedHazardPatterns()...)
p = append(p, GetPressHazardPatterns()...)
p = append(p, GetCobotHazardPatterns()...)
p = append(p, GetOperationalHazardPatterns()...)
p = append(p, GetDGUVExtendedPatterns()...)
p = append(p, GetExtendedHazardPatterns2()...)
p = append(p, GetElevatorPatterns()...)
p = append(p, GetAGVAgriPatterns()...)
p = append(p, GetFoodProcessingPatterns()...)
p = append(p, GetPackagingPatterns()...)
p = append(p, GetLaserPatterns()...)
p = append(p, GetMedicalDevicePatterns()...)
p = append(p, GetPressureEquipmentPatterns()...)
p = append(p, GetConstructionPatterns()...)
p = append(p, GetForestryConveyorPatterns()...)
p = append(p, GetPlasticsMetalPatterns()...)
p = append(p, GetWeldingGlassTextilePatterns()...)
p = append(p, GetSpecificMachinePatterns()...)
p = append(p, GetSpecificMachinePatterns2()...)
p = append(p, GetCyberExtendedPatterns()...)
p = append(p, GetCyberExtendedPatterns2()...)
p = append(p, GetCyberExtendedPatterns3()...)
p = append(p, GetWorkshopPatterns()...)
p = append(p, GetMaintenanceExtPatterns()...)
p = append(p, GetFinalPatternsA()...)
p = append(p, GetFinalPatternsB()...)
p = append(p, GetFinalPatternsC()...)
p = append(p, GetFinalPatternsD()...)
return p
return collectAllPatterns()
}
// extractPatternIDs scans a text for "HP" followed by digits and adds
@@ -36,21 +36,21 @@ func GetComponentLibrary() []ComponentLibraryEntry {
{ID: "C003", NameDE: "Foerderband", NameEN: "Conveyor Belt", Category: "mechanical", DescriptionDE: "Endlosband zum Transport von Werkstuecken zwischen Arbeitsstationen.", TypicalHazardCategories: []string{"mechanical_hazard", "ergonomic"}, TypicalEnergySources: []string{"EN01", "EN02"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "rotating_part", "entanglement_risk"}, SortOrder: 3},
{ID: "C004", NameDE: "Drehtisch", NameEN: "Rotary Table", Category: "mechanical", DescriptionDE: "Rotierender Arbeitstisch fuer Bearbeitungs- oder Montageprozesse.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part", "high_force"}, SortOrder: 4},
{ID: "C005", NameDE: "Linearachse", NameEN: "Linear Axis", Category: "mechanical", DescriptionDE: "Linearfuehrung fuer praezise translatorische Bewegungen.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "crush_point"}, SortOrder: 5},
{ID: "C006", NameDE: "Spindel", NameEN: "Spindle", Category: "mechanical", DescriptionDE: "Hochdrehende Spindel fuer Fräs-, Bohr- oder Schleifoperationen.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part", "high_speed", "cutting_part"}, SortOrder: 6},
{ID: "C006", NameDE: "Spindel", NameEN: "Spindle", Category: "mechanical", DescriptionDE: "Hochdrehende Spindel fuer Fräs-, Bohr- oder Schleifoperationen.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part", "high_speed", "cutting_part", "noise_source"}, SortOrder: 6},
{ID: "C007", NameDE: "Saegeblatt", NameEN: "Saw Blade", Category: "mechanical", DescriptionDE: "Rotierendes oder oszillierendes Schneidwerkzeug.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"cutting_part", "rotating_part", "high_speed"}, SortOrder: 7},
{ID: "C008", NameDE: "Pressenstoessel", NameEN: "Press Ram", Category: "mechanical", DescriptionDE: "Auf- und abfahrender Stoessel einer Presse zum Umformen.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01", "EN05"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "high_force", "crush_point"}, SortOrder: 8},
{ID: "C009", NameDE: "Walze", NameEN: "Roller", Category: "mechanical", DescriptionDE: "Zylindrische Walze zum Foerdern, Pressen oder Kalandrieren.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part", "entanglement_risk", "pinch_point"}, SortOrder: 9},
{ID: "C010", NameDE: "Kettenantrieb", NameEN: "Chain Drive", Category: "mechanical", DescriptionDE: "Kette und Kettenrad zur Kraftuebertragung.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "entanglement_risk"}, SortOrder: 10},
{ID: "C011", NameDE: "Zahnradgetriebe", NameEN: "Gear Transmission", Category: "mechanical", DescriptionDE: "Zahnradpaar oder -satz zur Drehzahl-/Drehmomentanpassung.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part", "pinch_point"}, SortOrder: 11},
{ID: "C011", NameDE: "Zahnradgetriebe", NameEN: "Gear Transmission", Category: "mechanical", DescriptionDE: "Zahnradpaar oder -satz zur Drehzahl-/Drehmomentanpassung.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part", "pinch_point", "noise_source"}, SortOrder: 11},
{ID: "C012", NameDE: "Kupplung", NameEN: "Clutch", Category: "mechanical", DescriptionDE: "Mechanische Kupplung zur An-/Abkopplung von Antriebsstraengen.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part"}, SortOrder: 12},
{ID: "C013", NameDE: "Bremse", NameEN: "Brake", Category: "mechanical", DescriptionDE: "Mechanische oder elektromagnetische Bremse zum Stillsetzen von Antrieben.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "stored_energy"}, SortOrder: 13},
{ID: "C014", NameDE: "Hubwerk", NameEN: "Hoist", Category: "mechanical", DescriptionDE: "Hebezeug zum vertikalen Bewegen von Lasten.", TypicalHazardCategories: []string{"mechanical_hazard", "ergonomic"}, TypicalEnergySources: []string{"EN01", "EN03"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "high_force", "gravity_risk"}, SortOrder: 14},
{ID: "C014", NameDE: "Hubwerk", NameEN: "Hoist", Category: "mechanical", DescriptionDE: "Hebezeug zum vertikalen Bewegen von Lasten.", TypicalHazardCategories: []string{"mechanical_hazard", "ergonomic"}, TypicalEnergySources: []string{"EN01", "EN03", "EN04"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "high_force", "gravity_risk", "crush_point", "person_under_load"}, SortOrder: 14},
{ID: "C015", NameDE: "Werkzeugwechsler", NameEN: "Tool Changer", Category: "mechanical", DescriptionDE: "Automatischer Werkzeugwechsler fuer CNC-Maschinen.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01", "EN05"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "pinch_point"}, SortOrder: 15},
{ID: "C016", NameDE: "Schweisskopf", NameEN: "Welding Head", Category: "mechanical", DescriptionDE: "Schweisskopf fuer MIG/MAG, WIG oder Laserschweissen.", TypicalHazardCategories: []string{"mechanical_hazard", "thermal_hazard", "electrical_hazard"}, TypicalEnergySources: []string{"EN03", "EN07"}, MapsToComponentType: "mechanical", Tags: []string{"high_temperature", "radiation_risk"}, SortOrder: 16},
{ID: "C017", NameDE: "Schraubstation", NameEN: "Screwdriving Station", Category: "mechanical", DescriptionDE: "Automatische Schraubeinheit fuer Montageprozesse.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part"}, SortOrder: 17},
{ID: "C017", NameDE: "Schraubstation", NameEN: "Screwdriving Station", Category: "mechanical", DescriptionDE: "Automatische Schraubeinheit fuer Montageprozesse.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part", "noise_source"}, SortOrder: 17},
{ID: "C018", NameDE: "Stanzen-Werkzeug", NameEN: "Punching Tool", Category: "mechanical", DescriptionDE: "Stanzwerkzeug zum Ausschneiden von Formen aus Blech oder Folie.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"cutting_part", "high_force", "crush_point"}, SortOrder: 18},
{ID: "C019", NameDE: "Biegewerkzeug", NameEN: "Bending Tool", Category: "mechanical", DescriptionDE: "Werkzeug zum Biegen von Blech oder Profilen.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "high_force", "crush_point"}, SortOrder: 19},
{ID: "C020", NameDE: "Vibrationsfoerderer", NameEN: "Vibratory Feeder", Category: "mechanical", DescriptionDE: "Schwingfoerderer zum Sortieren und Zufuehren von Kleinteilen.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "vibration_source"}, SortOrder: 20},
{ID: "C020", NameDE: "Vibrationsfoerderer", NameEN: "Vibratory Feeder", Category: "mechanical", DescriptionDE: "Schwingfoerderer zum Sortieren und Zufuehren von Kleinteilen.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "vibration_source", "noise_source"}, SortOrder: 20},
// ── Category: structural (C021-C030) ────────────────────────────────────
{ID: "C021", NameDE: "Maschinenrahmen", NameEN: "Machine Frame", Category: "structural", DescriptionDE: "Tragender Rahmen als Grundstruktur der Maschine.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{}, MapsToComponentType: "mechanical", Tags: []string{"structural_part"}, SortOrder: 21},
@@ -65,19 +65,19 @@ func GetComponentLibrary() []ComponentLibraryEntry {
{ID: "C030", NameDE: "Plattform/Buehne", NameEN: "Platform/Walkway", Category: "structural", DescriptionDE: "Begehbare Plattform fuer Bedienung oder Wartung in der Hoehe.", TypicalHazardCategories: []string{"ergonomic", "mechanical_hazard"}, TypicalEnergySources: []string{"EN03"}, MapsToComponentType: "mechanical", Tags: []string{"structural_part", "gravity_risk"}, SortOrder: 30},
// ── Category: drive (C031-C040) ─────────────────────────────────────────
{ID: "C031", NameDE: "Elektromotor (Drehstrom)", NameEN: "AC Motor", Category: "drive", DescriptionDE: "Drehstrom-Asynchronmotor als Hauptantrieb.", TypicalHazardCategories: []string{"electrical_hazard", "mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "high_voltage", "high_force"}, SortOrder: 31},
{ID: "C032", NameDE: "Servomotor", NameEN: "Servo Motor", Category: "drive", DescriptionDE: "Hochdynamischer Servomotor fuer praezise Positionierung.", TypicalHazardCategories: []string{"electrical_hazard", "mechanical_hazard"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "high_speed"}, SortOrder: 32},
{ID: "C033", NameDE: "Schrittmotor", NameEN: "Stepper Motor", Category: "drive", DescriptionDE: "Schrittmotor fuer inkrementelle Positionierung.", TypicalHazardCategories: []string{"electrical_hazard"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part"}, SortOrder: 33},
{ID: "C034", NameDE: "Frequenzumrichter", NameEN: "Frequency Converter", Category: "drive", DescriptionDE: "Frequenzumrichter zur stufenlosen Drehzahlregelung.", TypicalHazardCategories: []string{"electrical_hazard", "emc_hazard"}, TypicalEnergySources: []string{"EN04"}, MapsToComponentType: "electrical", Tags: []string{"high_voltage", "stored_energy"}, SortOrder: 34},
{ID: "C035", NameDE: "Getriebemotor", NameEN: "Gear Motor", Category: "drive", DescriptionDE: "Motor mit integriertem Getriebe fuer hohes Drehmoment bei niedriger Drehzahl.", TypicalHazardCategories: []string{"mechanical_hazard", "electrical_hazard"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "high_force"}, SortOrder: 35},
{ID: "C036", NameDE: "Linearmotor", NameEN: "Linear Motor", Category: "drive", DescriptionDE: "Elektromagnetischer Direktantrieb fuer lineare Bewegung.", TypicalHazardCategories: []string{"electrical_hazard", "mechanical_hazard"}, TypicalEnergySources: []string{"EN01", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"moving_part", "high_speed"}, SortOrder: 36},
{ID: "C037", NameDE: "Torque-Motor", NameEN: "Torque Motor", Category: "drive", DescriptionDE: "Direktantriebsmotor fuer hohe Drehmomente ohne Getriebe.", TypicalHazardCategories: []string{"electrical_hazard", "mechanical_hazard"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "high_force"}, SortOrder: 37},
{ID: "C038", NameDE: "Elektrischer Stellantrieb", NameEN: "Electric Actuator", Category: "drive", DescriptionDE: "Elektrischer Antrieb fuer Ventile, Klappen oder Schieber.", TypicalHazardCategories: []string{"electrical_hazard"}, TypicalEnergySources: []string{"EN01", "EN04"}, MapsToComponentType: "actuator", Tags: []string{"moving_part"}, SortOrder: 38},
{ID: "C031", NameDE: "Elektromotor (Drehstrom)", NameEN: "AC Motor", Category: "drive", DescriptionDE: "Drehstrom-Asynchronmotor als Hauptantrieb.", TypicalHazardCategories: []string{"electrical_hazard", "mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "high_voltage", "high_force", "noise_source", "electrical_part"}, SortOrder: 31},
{ID: "C032", NameDE: "Servomotor", NameEN: "Servo Motor", Category: "drive", DescriptionDE: "Hochdynamischer Servomotor fuer praezise Positionierung.", TypicalHazardCategories: []string{"electrical_hazard", "mechanical_hazard"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "high_speed", "electrical_part"}, SortOrder: 32},
{ID: "C033", NameDE: "Schrittmotor", NameEN: "Stepper Motor", Category: "drive", DescriptionDE: "Schrittmotor fuer inkrementelle Positionierung.", TypicalHazardCategories: []string{"electrical_hazard"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "electrical_part"}, SortOrder: 33},
{ID: "C034", NameDE: "Frequenzumrichter", NameEN: "Frequency Converter", Category: "drive", DescriptionDE: "Frequenzumrichter zur stufenlosen Drehzahlregelung.", TypicalHazardCategories: []string{"electrical_hazard", "emc_hazard"}, TypicalEnergySources: []string{"EN04"}, MapsToComponentType: "electrical", Tags: []string{"high_voltage", "stored_energy", "electrical_part", "electromagnetic"}, SortOrder: 34},
{ID: "C035", NameDE: "Getriebemotor", NameEN: "Gear Motor", Category: "drive", DescriptionDE: "Motor mit integriertem Getriebe fuer hohes Drehmoment bei niedriger Drehzahl.", TypicalHazardCategories: []string{"mechanical_hazard", "electrical_hazard"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "high_force", "electrical_part"}, SortOrder: 35},
{ID: "C036", NameDE: "Linearmotor", NameEN: "Linear Motor", Category: "drive", DescriptionDE: "Elektromagnetischer Direktantrieb fuer lineare Bewegung.", TypicalHazardCategories: []string{"electrical_hazard", "mechanical_hazard"}, TypicalEnergySources: []string{"EN01", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"moving_part", "high_speed", "electrical_part"}, SortOrder: 36},
{ID: "C037", NameDE: "Torque-Motor", NameEN: "Torque Motor", Category: "drive", DescriptionDE: "Direktantriebsmotor fuer hohe Drehmomente ohne Getriebe.", TypicalHazardCategories: []string{"electrical_hazard", "mechanical_hazard"}, TypicalEnergySources: []string{"EN02", "EN04"}, MapsToComponentType: "electrical", Tags: []string{"rotating_part", "high_force", "electrical_part"}, SortOrder: 37},
{ID: "C038", NameDE: "Elektrischer Stellantrieb", NameEN: "Electric Actuator", Category: "drive", DescriptionDE: "Elektrischer Antrieb fuer Ventile, Klappen oder Schieber.", TypicalHazardCategories: []string{"electrical_hazard"}, TypicalEnergySources: []string{"EN01", "EN04"}, MapsToComponentType: "actuator", Tags: []string{"moving_part", "electrical_part"}, SortOrder: 38},
{ID: "C039", NameDE: "Spindelantrieb", NameEN: "Spindle Drive", Category: "drive", DescriptionDE: "Kugelgewindetrieb fuer praezise Linearbewegung.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "crush_point"}, SortOrder: 39},
{ID: "C040", NameDE: "Riemenantrieb", NameEN: "Belt Drive", Category: "drive", DescriptionDE: "Riemen und Riemenscheiben zur Kraftuebertragung.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "mechanical", Tags: []string{"rotating_part", "entanglement_risk"}, SortOrder: 40},
// ── Category: hydraulic (C041-C050) ─────────────────────────────────────
{ID: "C041", NameDE: "Hydraulikpumpe", NameEN: "Hydraulic Pump", Category: "hydraulic", DescriptionDE: "Pumpe zur Erzeugung des hydraulischen Drucks im System.", TypicalHazardCategories: []string{"pneumatic_hydraulic", "noise_vibration"}, TypicalEnergySources: []string{"EN05"}, MapsToComponentType: "actuator", Tags: []string{"hydraulic_part", "high_pressure"}, SortOrder: 41},
{ID: "C041", NameDE: "Hydraulikpumpe", NameEN: "Hydraulic Pump", Category: "hydraulic", DescriptionDE: "Pumpe zur Erzeugung des hydraulischen Drucks im System.", TypicalHazardCategories: []string{"pneumatic_hydraulic", "noise_vibration"}, TypicalEnergySources: []string{"EN05"}, MapsToComponentType: "actuator", Tags: []string{"hydraulic_part", "high_pressure", "noise_source"}, SortOrder: 41},
{ID: "C042", NameDE: "Hydraulikzylinder", NameEN: "Hydraulic Cylinder", Category: "hydraulic", DescriptionDE: "Linearaktuator zur Erzeugung hoher Kraefte.", TypicalHazardCategories: []string{"pneumatic_hydraulic", "mechanical_hazard"}, TypicalEnergySources: []string{"EN05"}, MapsToComponentType: "actuator", Tags: []string{"hydraulic_part", "moving_part", "high_force", "high_pressure"}, SortOrder: 42},
{ID: "C043", NameDE: "Hydraulikventil", NameEN: "Hydraulic Valve", Category: "hydraulic", DescriptionDE: "Steuer- oder Regelventil im Hydraulikkreislauf.", TypicalHazardCategories: []string{"pneumatic_hydraulic"}, TypicalEnergySources: []string{"EN05"}, MapsToComponentType: "actuator", Tags: []string{"hydraulic_part", "high_pressure"}, SortOrder: 43},
{ID: "C044", NameDE: "Hydraulikspeicher", NameEN: "Hydraulic Accumulator", Category: "hydraulic", DescriptionDE: "Druckspeicher zur Pufferung von Druckspitzen.", TypicalHazardCategories: []string{"pneumatic_hydraulic"}, TypicalEnergySources: []string{"EN05"}, MapsToComponentType: "actuator", Tags: []string{"hydraulic_part", "stored_energy", "high_pressure"}, SortOrder: 44},
@@ -117,33 +117,33 @@ func GetComponentLibrary() []ComponentLibraryEntry {
{ID: "C072", NameDE: "Sicherheits-SPS", NameEN: "Safety PLC", Category: "control", DescriptionDE: "Redundante Sicherheitssteuerung bis SIL 3 / PL e.", TypicalHazardCategories: []string{"safety_function_failure", "software_fault"}, TypicalEnergySources: []string{}, MapsToComponentType: "controller", Tags: []string{"has_software", "programmable", "safety_device"}, SortOrder: 72},
{ID: "C073", NameDE: "HMI (Bedienterminal)", NameEN: "HMI (Human Machine Interface)", Category: "control", DescriptionDE: "Bedienpanel mit Touchscreen zur Maschinensteuerung.", TypicalHazardCategories: []string{"hmi_error", "mode_confusion"}, TypicalEnergySources: []string{}, MapsToComponentType: "hmi", Tags: []string{"has_software", "user_interface"}, SortOrder: 73},
{ID: "C074", NameDE: "Industrierechner (IPC)", NameEN: "Industrial PC", Category: "control", DescriptionDE: "Industrie-PC fuer komplexe Steuerungs- und Datenverarbeitungsaufgaben.", TypicalHazardCategories: []string{"software_fault", "configuration_error"}, TypicalEnergySources: []string{}, MapsToComponentType: "controller", Tags: []string{"has_software", "programmable", "networked"}, SortOrder: 74},
{ID: "C075", NameDE: "Motion Controller", NameEN: "Motion Controller", Category: "control", DescriptionDE: "Achscontroller fuer synchronisierte Mehrachsbewegungen.", TypicalHazardCategories: []string{"software_fault", "mechanical_hazard"}, TypicalEnergySources: []string{}, MapsToComponentType: "controller", Tags: []string{"has_software", "programmable"}, SortOrder: 75},
{ID: "C075", NameDE: "Motion Controller", NameEN: "Motion Controller", Category: "control", DescriptionDE: "Achscontroller fuer synchronisierte Mehrachsbewegungen.", TypicalHazardCategories: []string{"software_fault", "mechanical_hazard"}, TypicalEnergySources: []string{}, MapsToComponentType: "controller", Tags: []string{"has_software", "programmable", "moving_part"}, SortOrder: 75},
{ID: "C076", NameDE: "Sicherheitsrelais", NameEN: "Safety Relay", Category: "control", DescriptionDE: "Sicherheitsschaltgeraet fuer Not-Halt, Schutztuer etc.", TypicalHazardCategories: []string{"safety_function_failure"}, TypicalEnergySources: []string{}, MapsToComponentType: "controller", Tags: []string{"safety_device"}, SortOrder: 76},
{ID: "C077", NameDE: "Antriebsregler", NameEN: "Drive Controller", Category: "control", DescriptionDE: "Intelligenter Antriebsregler mit integrierten Sicherheitsfunktionen.", TypicalHazardCategories: []string{"software_fault", "electrical_hazard"}, TypicalEnergySources: []string{"EN04"}, MapsToComponentType: "controller", Tags: []string{"has_software", "programmable"}, SortOrder: 77},
{ID: "C077", NameDE: "Antriebsregler", NameEN: "Drive Controller", Category: "control", DescriptionDE: "Intelligenter Antriebsregler mit integrierten Sicherheitsfunktionen.", TypicalHazardCategories: []string{"software_fault", "electrical_hazard"}, TypicalEnergySources: []string{"EN04"}, MapsToComponentType: "controller", Tags: []string{"has_software", "programmable", "electrical_part"}, SortOrder: 77},
{ID: "C078", NameDE: "Remote I/O", NameEN: "Remote I/O Module", Category: "control", DescriptionDE: "Dezentrales Ein-/Ausgangsmodul im Feldbus.", TypicalHazardCategories: []string{"communication_failure"}, TypicalEnergySources: []string{}, MapsToComponentType: "controller", Tags: []string{"networked"}, SortOrder: 78},
{ID: "C079", NameDE: "Bedienpult", NameEN: "Control Desk", Category: "control", DescriptionDE: "Zentrales Bedienpult mit Tastern, Schaltern und Anzeigen.", TypicalHazardCategories: []string{"hmi_error", "mode_confusion"}, TypicalEnergySources: []string{}, MapsToComponentType: "hmi", Tags: []string{"user_interface"}, SortOrder: 79},
{ID: "C080", NameDE: "Datenschreiber/Logger", NameEN: "Data Logger", Category: "control", DescriptionDE: "Geraet zur Aufzeichnung von Prozessparametern.", TypicalHazardCategories: []string{"logging_audit_failure"}, TypicalEnergySources: []string{}, MapsToComponentType: "controller", Tags: []string{"has_software"}, SortOrder: 80},
// ── Category: sensor (C081-C090) ────────────────────────────────────────
{ID: "C081", NameDE: "Positionssensor", NameEN: "Position Sensor", Category: "sensor", DescriptionDE: "Induktiver, kapazitiver oder optischer Positionssensor.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part"}, SortOrder: 81},
{ID: "C082", NameDE: "Kamerasystem", NameEN: "Camera System", Category: "sensor", DescriptionDE: "Industriekamera fuer Bildverarbeitung und Qualitaetskontrolle.", TypicalHazardCategories: []string{"sensor_spoofing", "false_classification"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "networked"}, SortOrder: 82},
{ID: "C083", NameDE: "Kraftsensor", NameEN: "Force Sensor", Category: "sensor", DescriptionDE: "Dehnungsmessstreifen oder piezoelektrischer Kraftsensor.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part"}, SortOrder: 83},
{ID: "C084", NameDE: "Temperatursensor", NameEN: "Temperature Sensor", Category: "sensor", DescriptionDE: "Thermocouple oder PT100 zur Temperaturueberwachung.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part"}, SortOrder: 84},
{ID: "C085", NameDE: "Drucksensor", NameEN: "Pressure Sensor", Category: "sensor", DescriptionDE: "Sensor zur Ueberwachung von Druck in Hydraulik- oder Pneumatiksystemen.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part"}, SortOrder: 85},
{ID: "C086", NameDE: "Drehgeber/Encoder", NameEN: "Rotary Encoder", Category: "sensor", DescriptionDE: "Absolut- oder Inkrementaldrehgeber zur Winkel-/Positionsmessung.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part"}, SortOrder: 86},
{ID: "C087", NameDE: "Laserscanner", NameEN: "Laser Scanner", Category: "sensor", DescriptionDE: "Sicherheits-Laserscanner zur Ueberwachung von Schutzzonen.", TypicalHazardCategories: []string{"sensor_spoofing", "safety_function_failure"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "safety_device"}, SortOrder: 87},
{ID: "C088", NameDE: "Beschleunigungssensor", NameEN: "Accelerometer", Category: "sensor", DescriptionDE: "Sensor zur Vibrations- und Beschleunigungsmessung.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part"}, SortOrder: 88},
{ID: "C089", NameDE: "Durchflusssensor", NameEN: "Flow Sensor", Category: "sensor", DescriptionDE: "Sensor zur Ueberwachung des Volumenstroms.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part"}, SortOrder: 89},
{ID: "C090", NameDE: "Fuellstandsensor", NameEN: "Level Sensor", Category: "sensor", DescriptionDE: "Sensor zur Ueberwachung des Fuellstands in Tanks und Behaeltern.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part"}, SortOrder: 90},
{ID: "C081", NameDE: "Positionssensor", NameEN: "Position Sensor", Category: "sensor", DescriptionDE: "Induktiver, kapazitiver oder optischer Positionssensor.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "cyber"}, SortOrder: 81},
{ID: "C082", NameDE: "Kamerasystem", NameEN: "Camera System", Category: "sensor", DescriptionDE: "Industriekamera fuer Bildverarbeitung und Qualitaetskontrolle.", TypicalHazardCategories: []string{"sensor_spoofing", "false_classification"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "networked", "cyber", "has_ai"}, SortOrder: 82},
{ID: "C083", NameDE: "Kraftsensor", NameEN: "Force Sensor", Category: "sensor", DescriptionDE: "Dehnungsmessstreifen oder piezoelektrischer Kraftsensor.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "cyber"}, SortOrder: 83},
{ID: "C084", NameDE: "Temperatursensor", NameEN: "Temperature Sensor", Category: "sensor", DescriptionDE: "Thermocouple oder PT100 zur Temperaturueberwachung.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "cyber"}, SortOrder: 84},
{ID: "C085", NameDE: "Drucksensor", NameEN: "Pressure Sensor", Category: "sensor", DescriptionDE: "Sensor zur Ueberwachung von Druck in Hydraulik- oder Pneumatiksystemen.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "cyber"}, SortOrder: 85},
{ID: "C086", NameDE: "Drehgeber/Encoder", NameEN: "Rotary Encoder", Category: "sensor", DescriptionDE: "Absolut- oder Inkrementaldrehgeber zur Winkel-/Positionsmessung.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "cyber"}, SortOrder: 86},
{ID: "C087", NameDE: "Laserscanner", NameEN: "Laser Scanner", Category: "sensor", DescriptionDE: "Sicherheits-Laserscanner zur Ueberwachung von Schutzzonen.", TypicalHazardCategories: []string{"sensor_spoofing", "safety_function_failure"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "safety_device", "cyber"}, SortOrder: 87},
{ID: "C088", NameDE: "Beschleunigungssensor", NameEN: "Accelerometer", Category: "sensor", DescriptionDE: "Sensor zur Vibrations- und Beschleunigungsmessung.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "cyber"}, SortOrder: 88},
{ID: "C089", NameDE: "Durchflusssensor", NameEN: "Flow Sensor", Category: "sensor", DescriptionDE: "Sensor zur Ueberwachung des Volumenstroms.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "cyber"}, SortOrder: 89},
{ID: "C090", NameDE: "Fuellstandsensor", NameEN: "Level Sensor", Category: "sensor", DescriptionDE: "Sensor zur Ueberwachung des Fuellstands in Tanks und Behaeltern.", TypicalHazardCategories: []string{"sensor_spoofing"}, TypicalEnergySources: []string{}, MapsToComponentType: "sensor", Tags: []string{"sensor_part", "cyber"}, SortOrder: 90},
// ── Category: actuator (C091-C100) ──────────────────────────────────────
{ID: "C091", NameDE: "Magnetventil", NameEN: "Solenoid Valve", Category: "actuator", DescriptionDE: "Elektromagnetisch betaetigtes Ventil fuer Pneumatik oder Hydraulik.", TypicalHazardCategories: []string{"pneumatic_hydraulic"}, TypicalEnergySources: []string{"EN05", "EN06"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part"}, SortOrder: 91},
{ID: "C091", NameDE: "Magnetventil", NameEN: "Solenoid Valve", Category: "actuator", DescriptionDE: "Elektromagnetisch betaetigtes Ventil fuer Pneumatik oder Hydraulik.", TypicalHazardCategories: []string{"pneumatic_hydraulic"}, TypicalEnergySources: []string{"EN05", "EN06"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "hydraulic_part", "pneumatic_part", "high_pressure"}, SortOrder: 91},
{ID: "C092", NameDE: "Linearantrieb (elektrisch)", NameEN: "Electric Linear Actuator", Category: "actuator", DescriptionDE: "Elektrischer Linearantrieb fuer Positionieraufgaben.", TypicalHazardCategories: []string{"mechanical_hazard", "electrical_hazard"}, TypicalEnergySources: []string{"EN01", "EN04"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "moving_part"}, SortOrder: 92},
{ID: "C093", NameDE: "Proportionalventil", NameEN: "Proportional Valve", Category: "actuator", DescriptionDE: "Stetig regelbares Ventil fuer praezise Drucksteuerung.", TypicalHazardCategories: []string{"pneumatic_hydraulic"}, TypicalEnergySources: []string{"EN05", "EN06"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part"}, SortOrder: 93},
{ID: "C093", NameDE: "Proportionalventil", NameEN: "Proportional Valve", Category: "actuator", DescriptionDE: "Stetig regelbares Ventil fuer praezise Drucksteuerung.", TypicalHazardCategories: []string{"pneumatic_hydraulic"}, TypicalEnergySources: []string{"EN05", "EN06"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "hydraulic_part", "pneumatic_part", "high_pressure"}, SortOrder: 93},
{ID: "C094", NameDE: "Heizelement", NameEN: "Heating Element", Category: "actuator", DescriptionDE: "Elektrisches Heizelement fuer Temperierung von Werkzeugen oder Medien.", TypicalHazardCategories: []string{"thermal_hazard", "electrical_hazard"}, TypicalEnergySources: []string{"EN07"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "high_temperature"}, SortOrder: 94},
{ID: "C095", NameDE: "Kuehlaggregat", NameEN: "Cooling Unit", Category: "actuator", DescriptionDE: "Kuehlanlage fuer Prozesse oder Schaltschraenke.", TypicalHazardCategories: []string{"thermal_hazard"}, TypicalEnergySources: []string{"EN07"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part"}, SortOrder: 95},
{ID: "C096", NameDE: "Luefter/Geblaese", NameEN: "Fan/Blower", Category: "actuator", DescriptionDE: "Luefter zur Kuehlung oder Absaugung.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "rotating_part"}, SortOrder: 96},
{ID: "C097", NameDE: "Dosierpumpe", NameEN: "Dosing Pump", Category: "actuator", DescriptionDE: "Praezisionspumpe zur Dosierung von Fluessigkeiten oder Klebstoffen.", TypicalHazardCategories: []string{"pneumatic_hydraulic", "material_environmental"}, TypicalEnergySources: []string{"EN05"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part"}, SortOrder: 97},
{ID: "C096", NameDE: "Luefter/Geblaese", NameEN: "Fan/Blower", Category: "actuator", DescriptionDE: "Luefter zur Kuehlung oder Absaugung.", TypicalHazardCategories: []string{"mechanical_hazard", "noise_vibration"}, TypicalEnergySources: []string{"EN02"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "rotating_part", "noise_source"}, SortOrder: 96},
{ID: "C097", NameDE: "Dosierpumpe", NameEN: "Dosing Pump", Category: "actuator", DescriptionDE: "Praezisionspumpe zur Dosierung von Fluessigkeiten oder Klebstoffen.", TypicalHazardCategories: []string{"pneumatic_hydraulic", "material_environmental"}, TypicalEnergySources: []string{"EN05"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "hydraulic_part", "chemical_risk"}, SortOrder: 97},
{ID: "C098", NameDE: "Elektromagnet", NameEN: "Electromagnet", Category: "actuator", DescriptionDE: "Elektromagnet fuer Halten, Spannen oder Foerdern.", TypicalHazardCategories: []string{"electrical_hazard", "emc_hazard"}, TypicalEnergySources: []string{"EN04"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "stored_energy"}, SortOrder: 98},
{ID: "C099", NameDE: "Piezo-Aktuator", NameEN: "Piezo Actuator", Category: "actuator", DescriptionDE: "Piezoelektrischer Aktuator fuer hochpraezise Mikrobewegungen.", TypicalHazardCategories: []string{"electrical_hazard"}, TypicalEnergySources: []string{"EN04"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part"}, SortOrder: 99},
{ID: "C100", NameDE: "Spannvorrichtung", NameEN: "Clamping Device", Category: "actuator", DescriptionDE: "Mechanische, pneumatische oder hydraulische Spannvorrichtung.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01", "EN05", "EN06"}, MapsToComponentType: "actuator", Tags: []string{"actuator_part", "clamping_part", "pinch_point"}, SortOrder: 100},
@@ -161,15 +161,15 @@ func GetComponentLibrary() []ComponentLibraryEntry {
{ID: "C110", NameDE: "Zustimmtaster", NameEN: "Enabling Device", Category: "safety", DescriptionDE: "Dreistufiger Zustimmtaster fuer den Einrichtbetrieb.", TypicalHazardCategories: []string{"safety_function_failure"}, TypicalEnergySources: []string{}, MapsToComponentType: "controller", Tags: []string{"safety_device"}, SortOrder: 110},
// ── Category: it_network (C111-C120) ────────────────────────────────────
{ID: "C111", NameDE: "Industrie-Switch (managed)", NameEN: "Managed Industrial Switch", Category: "it_network", DescriptionDE: "Managed Ethernet Switch fuer das Maschinennetzwerk.", TypicalHazardCategories: []string{"communication_failure", "unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component"}, SortOrder: 111},
{ID: "C112", NameDE: "Industrie-Router", NameEN: "Industrial Router", Category: "it_network", DescriptionDE: "Router zur Segmentierung und Absicherung des Maschinennetzwerks.", TypicalHazardCategories: []string{"communication_failure", "unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component"}, SortOrder: 112},
{ID: "C111", NameDE: "Industrie-Switch (managed)", NameEN: "Managed Industrial Switch", Category: "it_network", DescriptionDE: "Managed Ethernet Switch fuer das Maschinennetzwerk.", TypicalHazardCategories: []string{"communication_failure", "unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "cyber"}, SortOrder: 111},
{ID: "C112", NameDE: "Industrie-Router", NameEN: "Industrial Router", Category: "it_network", DescriptionDE: "Router zur Segmentierung und Absicherung des Maschinennetzwerks.", TypicalHazardCategories: []string{"communication_failure", "unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "cyber"}, SortOrder: 112},
{ID: "C113", NameDE: "Industrie-Firewall", NameEN: "Industrial Firewall", Category: "it_network", DescriptionDE: "Firewall zum Schutz des OT-Netzwerks vor externen Angriffen.", TypicalHazardCategories: []string{"unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "security_device"}, SortOrder: 113},
{ID: "C114", NameDE: "IoT-Gateway", NameEN: "IoT Gateway", Category: "it_network", DescriptionDE: "Gateway fuer die Anbindung von Maschinen an Cloud/Edge.", TypicalHazardCategories: []string{"communication_failure", "unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "has_software"}, SortOrder: 114},
{ID: "C115", NameDE: "Edge-Computing-Einheit", NameEN: "Edge Computing Unit", Category: "it_network", DescriptionDE: "Lokale Recheneinheit fuer Datenvorverarbeitung und KI-Inferenz.", TypicalHazardCategories: []string{"software_fault", "communication_failure"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "has_software", "has_ai"}, SortOrder: 115},
{ID: "C116", NameDE: "WLAN Access Point (Industrie)", NameEN: "Industrial WiFi Access Point", Category: "it_network", DescriptionDE: "Drahtloser Netzwerkzugang im Maschinenumfeld.", TypicalHazardCategories: []string{"communication_failure", "unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "wireless"}, SortOrder: 116},
{ID: "C116", NameDE: "WLAN Access Point (Industrie)", NameEN: "Industrial WiFi Access Point", Category: "it_network", DescriptionDE: "Drahtloser Netzwerkzugang im Maschinenumfeld.", TypicalHazardCategories: []string{"communication_failure", "unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "wireless", "cyber"}, SortOrder: 116},
{ID: "C117", NameDE: "OPC UA Server", NameEN: "OPC UA Server", Category: "it_network", DescriptionDE: "OPC UA Kommunikationsserver fuer Maschine-zu-Maschine-Vernetzung.", TypicalHazardCategories: []string{"communication_failure", "unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "has_software"}, SortOrder: 117},
{ID: "C118", NameDE: "VPN-Appliance", NameEN: "VPN Appliance", Category: "it_network", DescriptionDE: "VPN-Geraet fuer sichere Fernzugriffe auf die Maschinensteuerung.", TypicalHazardCategories: []string{"unauthorized_access"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component", "security_device"}, SortOrder: 118},
{ID: "C119", NameDE: "KI-Inferenzmodul", NameEN: "AI Inference Module", Category: "it_network", DescriptionDE: "Dediziertes KI-Modul (GPU/TPU) fuer Echtzeit-Inferenz.", TypicalHazardCategories: []string{"false_classification", "model_drift", "unintended_bias"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"has_ai", "has_software", "networked"}, SortOrder: 119},
{ID: "C119", NameDE: "KI-Inferenzmodul", NameEN: "AI Inference Module", Category: "it_network", DescriptionDE: "Dediziertes KI-Modul (GPU/TPU) fuer Echtzeit-Inferenz.", TypicalHazardCategories: []string{"false_classification", "model_drift", "unintended_bias"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"has_ai", "ai_model", "has_software", "networked", "cyber"}, SortOrder: 119},
{ID: "C120", NameDE: "Feldbus-Koppler", NameEN: "Fieldbus Coupler", Category: "it_network", DescriptionDE: "Koppler fuer PROFINET, EtherCAT oder andere Feldbussysteme.", TypicalHazardCategories: []string{"communication_failure"}, TypicalEnergySources: []string{}, MapsToComponentType: "network", Tags: []string{"networked", "it_component"}, SortOrder: 120},
// ── Extended: Press/Forming Machine Components (C121-C135) ───────────
@@ -180,7 +180,7 @@ func GetComponentLibrary() []ComponentLibraryEntry {
{ID: "C125", NameDE: "Ruettelplatte / Vibrationsfoerderer", NameEN: "Vibrating Plate / Feeder", Category: "mechanical", DescriptionDE: "Vibrationseinheit zum Sortieren, Ausrichten oder Foerdern von Teilen.", TypicalHazardCategories: []string{"noise_vibration", "ergonomic"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"vibration_source", "noise_source", "moving_part"}, SortOrder: 125},
{ID: "C126", NameDE: "Stempel-Formen-System", NameEN: "Die/Punch Tooling System", Category: "mechanical", DescriptionDE: "Werkzeugset aus Stempel und Matrize fuer Umform- oder Stanzvorgaenge.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "high_force", "crush_point", "cutting_part"}, SortOrder: 126},
{ID: "C127", NameDE: "Transfersystem (Stangen/Greifer)", NameEN: "Transfer System (Bar/Gripper)", Category: "mechanical", DescriptionDE: "Mechanisches Transportsystem zwischen Bearbeitungsstationen.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "shear_risk", "pinch_point"}, SortOrder: 127},
{ID: "C128", NameDE: "Aufzugsportal / Hubwerk", NameEN: "Elevator Portal / Hoist", Category: "mechanical", DescriptionDE: "Hebevorrichtung fuer Materialzufuhr (Kisten, Paletten).", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01", "EN04"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "gravity_risk", "high_force", "person_under_load"}, SortOrder: 128},
{ID: "C128", NameDE: "Aufzugsportal / Hubwerk", NameEN: "Elevator Portal / Hoist", Category: "mechanical", DescriptionDE: "Hebevorrichtung fuer Materialzufuhr (Kisten, Paletten).", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN01", "EN03", "EN04"}, MapsToComponentType: "mechanical", Tags: []string{"moving_part", "gravity_risk", "high_force", "person_under_load", "crush_point"}, SortOrder: 128},
{ID: "C129", NameDE: "Fallrohr / Auswurfschacht", NameEN: "Chute / Ejection Channel", Category: "structural", DescriptionDE: "Schwerkraft-basierter Auswurf fuer fertige oder aussortierte Teile.", TypicalHazardCategories: []string{"mechanical_hazard"}, TypicalEnergySources: []string{"EN04"}, MapsToComponentType: "mechanical", Tags: []string{"gravity_risk"}, SortOrder: 129},
{ID: "C130", NameDE: "Oelfangschale / Auffangwanne", NameEN: "Oil Drip Tray", Category: "structural", DescriptionDE: "Auffangvorrichtung fuer Hydraulikoel, Schmiermittel, Kuehlmittel.", TypicalHazardCategories: []string{"material_environmental"}, TypicalEnergySources: []string{}, MapsToComponentType: "mechanical", Tags: []string{"chemical_risk"}, SortOrder: 130},
{ID: "C131", NameDE: "Druckbegrenzungsventil", NameEN: "Pressure Relief Valve", Category: "hydraulic", DescriptionDE: "Sicherheitsventil zur Druckbegrenzung im Hydraulikkreis.", TypicalHazardCategories: []string{"pneumatic_hydraulic"}, TypicalEnergySources: []string{"EN07"}, MapsToComponentType: "actuator", Tags: []string{"hydraulic_part", "safety_device", "high_pressure"}, SortOrder: 131},
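The tag additions in this hunk matter because hazard patterns select components by tag. The following is a minimal sketch of that selection, assuming a pattern fires only when every entry of its RequiredComponentTags is present on a component and the project's machine type is listed in MachineTypes. The `pattern` type, the `matches` helper, and the rule that an empty MachineTypes list means "any machine" are hypothetical; the engine's real matcher is not part of this diff.

```go
package main

import "fmt"

// pattern mirrors only the HazardPattern fields this sketch needs.
type pattern struct {
	ID                    string
	RequiredComponentTags []string
	MachineTypes          []string
}

// matches is a hypothetical matcher: the machine type must be listed
// (an empty list is assumed to mean "any machine"), and every required
// tag must be present on the component.
func matches(p pattern, machineType string, componentTags []string) bool {
	if len(p.MachineTypes) > 0 {
		allowed := false
		for _, mt := range p.MachineTypes {
			if mt == machineType {
				allowed = true
				break
			}
		}
		if !allowed {
			return false
		}
	}
	have := make(map[string]bool, len(componentTags))
	for _, t := range componentTags {
		have[t] = true
	}
	for _, req := range p.RequiredComponentTags {
		if !have[req] {
			return false
		}
	}
	return true
}

func main() {
	hp2100 := pattern{
		ID:                    "HP2100",
		RequiredComponentTags: []string{"crush_point", "gravity_risk", "person_under_load"},
		MachineTypes:          []string{"lift", "hoist", "elevator", "scissor_lift"},
	}
	// C128 tags before and after this commit.
	oldC128 := []string{"moving_part", "gravity_risk", "high_force", "person_under_load"}
	newC128 := []string{"moving_part", "gravity_risk", "high_force", "person_under_load", "crush_point"}

	fmt.Println(matches(hp2100, "lift", oldC128))  // false: crush_point missing
	fmt.Println(matches(hp2100, "lift", newC128))  // true
	fmt.Println(matches(hp2100, "press", newC128)) // false: machine type filtered out
}
```

Under these assumptions, C128 only becomes eligible for the lift crush patterns once it carries `crush_point`, and the MachineTypes list keeps the same pattern from firing on a press project even when the tags align.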
@@ -81,6 +81,10 @@ func (e *DocumentExporter) ExportPDF(
e.pdfClassifications(pdf, classifications)
}
// --- Sources & licenses ("Quellen & Lizenzen", stage 4 of the attribution renderer, Task #29) ---
pdf.AddPage()
e.pdfSourcesAppendix(pdf, hazards, mitigations)
// --- Footer on every page ---
pdf.SetFooterFunc(func() {
pdf.SetY(-15)
@@ -0,0 +1,134 @@
package iace
// Sources & Licenses appendix for the IACE Tech-File PDF export.
// Stage 4 ("Stufe 4") of the Attribution Renderer (Task #29).
//
// The IACE engine generates hazards from BreakPilot Pattern-IDs that
// themselves cite ISO 12100, EN 13849, EN ISO 13855 etc. Those norm
// identifiers are R3 (DIN/EN copyright — identifier-only). The
// pattern-engine output itself is R3 (BreakPilot own work). OSHA values
// surfaced via the minimum-distance library are R1 (US Federal PD).
//
// This appendix aggregates what the Tech-File actually cited, grouped by
// license rule, and adds the mandatory disclaimer that this per-export
// attribution cannot be replaced by a blanket notice in the Impressum.
import (
"sort"
"strings"
"github.com/jung-kurt/gofpdf"
)
// pdfSourcesAppendix renders the "Quellen & Lizenzen" appendix page.
// Called by ExportPDF after the regulatory classifications block.
func (e *DocumentExporter) pdfSourcesAppendix(pdf *gofpdf.Fpdf, hazards []Hazard, mitigations []Mitigation) {
pdf.SetFont("Helvetica", "B", 14)
pdf.SetTextColor(124, 58, 237)
pdf.CellFormat(0, 10, "Quellen und Lizenzen", "", 1, "L", false, 0, "")
pdf.Ln(2)
pdf.SetFont("Helvetica", "", 9)
pdf.SetTextColor(80, 80, 80)
intro := "Diese Risikobeurteilung verwendet die deterministische BreakPilot IACE " +
"Pattern-Engine sowie zitierte Sicherheitsnormen. Die folgende Aufstellung " +
"listet die konkret in diesem Dokument zitierten Quellen mit ihrer Lizenzregel."
pdf.MultiCell(0, 5, intro, "", "L", false)
pdf.Ln(3)
pdf.SetFont("Helvetica", "B", 10)
pdf.SetTextColor(0, 0, 0)
pdf.CellFormat(0, 7, "R3 — BreakPilot Pattern-Engine (Eigenwerk, Identifier-Verweis)", "", 1, "L", false, 0, "")
pdf.SetFont("Helvetica", "", 9)
pdf.SetTextColor(60, 60, 60)
pdf.MultiCell(0, 5,
"Alle in diesem Dokument referenzierten HP-XXXX-Identifier stammen aus der "+
"BreakPilot IACE Pattern-Library (Eigenwerk). Keine externe Lizenz-Attribution "+
"erforderlich.", "", "L", false)
pdf.Ln(3)
norms := extractCitedNorms(hazards, mitigations)
if len(norms) > 0 {
pdf.SetFont("Helvetica", "B", 10)
pdf.SetTextColor(0, 0, 0)
pdf.CellFormat(0, 7, "R3 — Sicherheitsnormen (DIN/EN/ISO/IEC, Identifier-Verweis)", "", 1, "L", false, 0, "")
pdf.SetFont("Helvetica", "", 9)
pdf.SetTextColor(60, 60, 60)
pdf.MultiCell(0, 5,
"DIN-/EN-/ISO-/IEC-Normen unterliegen dem Urheberrecht der jeweiligen "+
"Normungsorganisation. In diesem Dokument werden Normen ausschliesslich "+
"als Identifier (Norm-Nummer und Abschnitt) zitiert; kein Volltext aus "+
"diesen Normen wurde reproduziert. Konkret zitiert:", "", "L", false)
pdf.Ln(1)
for _, n := range norms {
pdf.CellFormat(0, 5, " • "+n, "", 1, "L", false, 0, "")
}
pdf.Ln(2)
}
pdf.SetFont("Helvetica", "B", 10)
pdf.SetTextColor(0, 0, 0)
pdf.CellFormat(0, 7, "R1 — Hoheitsrecht / Public Domain (woertlich uebernehmbar)", "", 1, "L", false, 0, "")
pdf.SetFont("Helvetica", "", 9)
pdf.SetTextColor(60, 60, 60)
pdf.MultiCell(0, 5,
"Soweit Werte aus US Federal Code (OSHA 29 CFR Subpart O) oder EU-Recht "+
"(Maschinenverordnung 2023/1230, AI Act 2024/1689) referenziert werden, "+
"sind diese als R1 woertlich uebernehmbar. Keine Attribution-Pflicht.", "", "L", false)
pdf.Ln(4)
pdf.SetFont("Helvetica", "I", 8)
pdf.SetTextColor(120, 120, 120)
pdf.MultiCell(0, 4,
"Hinweis: Pauschalvermerke in AGB oder Impressum reichen rechtlich nicht — "+
"die werknahe Attribution erfolgt durch diese Quellenseite. Vollstaendiges "+
"Quellenverzeichnis aller im BreakPilot-System verwendeten Quellen siehe "+
"/sdk/licenses im Web-Frontend.", "", "L", false)
}
// extractCitedNorms scans hazard fields (description, scenario, possible
// harm) and mitigation fields (name, description) for recognised norm
// identifiers. The detection is intentionally narrow: only well-known
// prefixes (EN/ISO/IEC/DIN) and only when followed by digits, so
// free-form prose is not turned into spurious citations.
func extractCitedNorms(hz []Hazard, mt []Mitigation) []string {
seen := make(map[string]bool)
consider := func(s string) {
fields := strings.FieldsFunc(s, func(r rune) bool {
return r == ' ' || r == ',' || r == ';' || r == '\n' || r == '(' || r == ')'
})
for i := 0; i < len(fields)-1; i++ {
head := strings.ToUpper(strings.TrimSpace(fields[i]))
next := strings.TrimSpace(fields[i+1])
if !(head == "EN" || head == "ISO" || head == "IEC" || head == "DIN") {
continue
}
if next == "" {
continue
}
// Accept "ISO 12100", "EN 13849-1", "DIN EN 60204-1" etc.
if next[0] >= '0' && next[0] <= '9' {
seen[head+" "+next] = true
} else if head == "DIN" && (strings.HasPrefix(strings.ToUpper(next), "EN") || strings.HasPrefix(strings.ToUpper(next), "ISO")) && i+2 < len(fields) {
third := strings.TrimSpace(fields[i+2])
if third != "" && third[0] >= '0' && third[0] <= '9' {
seen[head+" "+next+" "+third] = true
}
}
}
}
for _, h := range hz {
consider(h.Description)
consider(h.Scenario)
consider(h.PossibleHarm)
}
for _, m := range mt {
consider(m.Description)
consider(m.Name)
}
out := make([]string, 0, len(seen))
for k := range seen {
out = append(out, k)
}
sort.Strings(out)
return out
}
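To make the scanner's behaviour concrete, here is a standalone sketch of the same token-pair scan. It assumes plain strings as input instead of Hazard and Mitigation values, and it treats both parentheses as separators; `citedNorms` and `startsWithDigit` are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// startsWithDigit reports whether s begins with an ASCII digit.
func startsWithDigit(s string) bool {
	return s != "" && s[0] >= '0' && s[0] <= '9'
}

// citedNorms is a hypothetical, string-only variant of extractCitedNorms.
func citedNorms(texts ...string) []string {
	seen := make(map[string]bool)
	for _, t := range texts {
		fields := strings.FieldsFunc(t, func(r rune) bool {
			return r == ' ' || r == ',' || r == ';' || r == '\n' || r == '(' || r == ')'
		})
		for i := 0; i+1 < len(fields); i++ {
			head := strings.ToUpper(fields[i])
			next := fields[i+1]
			switch head {
			case "EN", "ISO", "IEC", "DIN":
				// recognised prefix, keep going
			default:
				continue
			}
			if startsWithDigit(next) {
				// "ISO 12100", "EN 13849-1"
				seen[head+" "+next] = true
				continue
			}
			up := strings.ToUpper(next)
			if head == "DIN" && (strings.HasPrefix(up, "EN") || strings.HasPrefix(up, "ISO")) &&
				i+2 < len(fields) && startsWithDigit(fields[i+2]) {
				// "DIN EN 60204-1"
				seen[head+" "+next+" "+fields[i+2]] = true
			}
		}
	}
	out := make([]string, 0, len(seen))
	for k := range seen {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	for _, n := range citedNorms(
		"Abstand nach EN ISO 13855 pruefen",
		"Verdrahtung gemaess DIN EN 60204-1, siehe ISO 12100",
	) {
		fmt.Println(n)
	}
	// Prints:
	// DIN EN 60204-1
	// EN 60204-1
	// ISO 12100
	// ISO 13855
}
```

Two properties of the pairwise scan show up here: "EN ISO 13855" is recorded as "ISO 13855" because only the last prefix is followed by digits, and "DIN EN 60204-1" is listed twice, once with and once without the DIN prefix. Whether the appendix should deduplicate those is a design choice this sketch leaves open.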
@@ -83,6 +83,12 @@ type HazardPattern struct {
// feeds into the PLr (required Performance Level) computation,
// see ComputePLr.
DefaultAvoidability int `json:"default_avoidability,omitempty"` // 1 or 2
// SecondaryHarms describes consequential damage chains beyond the
// classical IACE Hazard→Harm step: end-customer safety, product
// liability, food safety, environmental, reputation, financial.
// See secondary_harms.go and the strategy discussion (2026-05-20).
// Empty for hazards with no downstream chain.
SecondaryHarms []SecondaryHarm `json:"secondary_harms,omitempty"`
}
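The DefaultAvoidability field above feeds the PLr computation. As a rough illustration, the EN ISO 13849-1 risk graph can be sketched as follows. The helper name `plrFromRiskGraph` and the assumption that severity, exposure and avoidability have already been reduced to the two-level S/F/P inputs are hypothetical; the repository's ComputePLr is not shown in this diff and may map DefaultSeverity and DefaultExposure differently.

```go
package main

import "fmt"

// plrFromRiskGraph is a hypothetical helper that walks the EN ISO 13849-1
// risk graph. Each parameter must be 1 or 2:
//   s: severity of injury (S1 slight, S2 serious)
//   f: frequency/exposure (F1 seldom, F2 frequent)
//   p: possibility of avoidance (P1 possible, P2 scarcely possible)
func plrFromRiskGraph(s, f, p int) (string, error) {
	for _, v := range []int{s, f, p} {
		if v != 1 && v != 2 {
			return "", fmt.Errorf("risk graph parameters must be 1 or 2, got %d", v)
		}
	}
	levels := [...]string{"a", "b", "c", "d", "e"}
	// Severity shifts the result by two levels, exposure and
	// avoidability by one level each.
	idx := 2*(s-1) + (f - 1) + (p - 1)
	return levels[idx], nil
}

func main() {
	plr, err := plrFromRiskGraph(2, 2, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(plr) // e
}
```

The index arithmetic reproduces the eight branches of the graph (S1/F1/P1 gives a, S2/F2/P2 gives e) without a lookup table; a table may be easier to audit against the standard.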
// ComputePLr returns the required Performance Level (PLr) per EN ISO
@@ -0,0 +1,96 @@
package iace
// Body-part-specific crush hazards at lift / hoist / scissor-lift endstops.
// Bridges the gap that the Kistenhubgeraet re-init exposed: the abstract
// "Bremse versagt bei Absenkbewegung" pattern fired, but the concrete
// "Fuss unter absenkender Hubplattform" body-part variant did not exist.
//
// Each pattern is restricted to lift-family machine types via MachineTypes,
// so a press / CNC / textile project does not pick them up. Mitigations
// reference the new M600-M604 (lift endstop) library plus the existing
// M001 (geometry), M002 (safety distance), M141 (warning sign).
func GetLiftEndstopPatterns() []HazardPattern {
liftTypes := []string{"lift", "hoist", "elevator", "scissor_lift"}
return []HazardPattern{
{
ID: "HP2100",
NameDE: "Fuss-Quetschung unter absenkender Hubplattform am Bodenanschlag",
NameEN: "Foot crush under descending lift platform at floor stop",
RequiredComponentTags: []string{"crush_point", "gravity_risk", "person_under_load"},
RequiredEnergyTags: []string{"gravitational"},
MachineTypes: liftTypes,
GeneratedHazardCats: []string{"mechanical_hazard"},
SuggestedMeasureIDs: []string{"M600", "M601", "M604", "M141"},
Priority: 92,
ScenarioDE: "Fuss oder Bein des Bedieners gelangt waehrend des Absenkvorgangs unter die " +
"Hubplattform. Bei Erreichen der unteren Endlage wird der Fuss zwischen Plattform " +
"und Boden gequetscht.",
TriggerDE: "Unsachgemaesse Position des Bedieners beim Be-/Entladen, fehlende Schaltleiste, fehlender Trittschutz",
HarmDE: "Fussquetschung, Mittelfussfraktur, Zehenamputation",
AffectedDE: "Bediener, Wartungspersonal",
ZoneDE: "Bodenbereich unter Hubplattform, umlaufende Spalte",
DefaultSeverity: 4,
DefaultExposure: 3,
DefaultAvoidability: 2,
ISO12100Section: "6.3.5.5 Quetschen — Mindestabstaende",
ClarificationQuestionsDE: []string{
"Ist eine umlaufende Quetsch-Schaltleiste an der Plattformunterkante verbaut?",
"Ist die Hubgeschwindigkeit am unteren Endanschlag auf <=15 mm/s reduziert (siehe M600)?",
"Verhindert ein Trittblech / Unterfahrschutz das Hineinfahren von Fuessen?",
},
},
{
ID: "HP2101",
NameDE: "Hand- oder Koerper-Quetschung gegen feste Struktur beim Hochfahren der Hubeinheit",
NameEN: "Hand or body crush against fixed structure during lift upward travel",
RequiredComponentTags: []string{"crush_point", "gravity_risk"},
RequiredEnergyTags: []string{"gravitational"},
MachineTypes: liftTypes,
GeneratedHazardCats: []string{"mechanical_hazard"},
SuggestedMeasureIDs: []string{"M602", "M603", "M600", "M141"},
Priority: 90,
ScenarioDE: "Beim Hochfahren der Last gelangen Hand oder Koerperteile des Bedieners " +
"zwischen die hoechste Position der Hubeinheit (z.B. mit beladener Palette) und " +
"eine feste Struktur oberhalb (Decke, Vorbau, Querbalken einer umschliessenden Anlage).",
TriggerDE: "Eingriff in den Verfahrweg waehrend Hubvorgang, fehlende konstruktive Begrenzung der Endlage",
HarmDE: "Hand- oder Armquetschung, im Extremfall Brustkorbkompression",
AffectedDE: "Bediener, Einrichter, Wartungspersonal",
ZoneDE: "Oberhalb hoechster Hubposition, Vorbau/Decke der umschliessenden Anlage",
DefaultSeverity: 4,
DefaultExposure: 2,
DefaultAvoidability: 2,
ISO12100Section: "6.3.5.5 Quetschen — Mindestabstaende",
ClarificationQuestionsDE: []string{
"Welcher Mindestabstand zu festen Strukturen oberhalb der hoechsten Hubposition ist gegeben? (Empfehlung: 120 mm fuer Kopf, 100 mm fuer Hand)",
"Ist der Tippbetrieb (Hold-to-run) durch ein Testprotokoll mit Stop-Zeit-Messung verifiziert?",
"Existiert eine redundante Hardware-Endlage zusaetzlich zur Software-Begrenzung?",
},
},
{
ID: "HP2102",
NameDE: "Quetschung Bein/Koerper zwischen Hubeinheit und seitlicher Struktur",
NameEN: "Leg/body crush between lift unit and lateral structure",
RequiredComponentTags: []string{"crush_point", "gravity_risk", "moving_part"},
RequiredEnergyTags: []string{"gravitational"},
MachineTypes: liftTypes,
GeneratedHazardCats: []string{"mechanical_hazard"},
SuggestedMeasureIDs: []string{"M602", "M601", "M141"},
Priority: 85,
ScenarioDE: "Person befindet sich seitlich neben der Hubeinheit und wird waehrend " +
"der Bewegung gegen eine feste Struktur (Regalwand, Stuetze, andere Anlage) gequetscht.",
TriggerDE: "Aufenthalt in Quetschzone bei Bewegung, fehlende Absperrung",
HarmDE: "Beinfraktur, Beckenquetschung",
AffectedDE: "Bediener, vorbeigehende Personen",
ZoneDE: "Seitlicher Bereich neben Hubeinheit, Lichte Weite zu festen Strukturen",
DefaultSeverity: 4,
DefaultExposure: 2,
DefaultAvoidability: 2,
ISO12100Section: "6.3.5.5 Quetschen — Mindestabstaende",
ClarificationQuestionsDE: []string{
"Welcher Sicherheitsabstand zu seitlichen festen Strukturen ist gegeben (Empfehlung 500 mm Koerperdurchgang)?",
"Ist der Bereich seitlich der Hubeinheit als Gefahrenzone markiert oder abgeschrankt?",
},
},
}
}
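The gating described in the comment above (required tags plus a MachineTypes restriction) can be sketched as follows. This is a minimal illustration, not the engine's actual matcher: the `pattern` struct is a trimmed stand-in for HazardPattern, and `fires` and `contains` are hypothetical helpers. It assumes an empty MachineTypes list means "no restriction" and that component and energy tags are checked against one unified tag set.

```go
package main

import "fmt"

// pattern is a trimmed stand-in for HazardPattern (assumption: only the
// fields relevant for matching are modelled).
type pattern struct {
	ID                    string
	RequiredComponentTags []string
	RequiredEnergyTags    []string
	MachineTypes          []string
}

// contains reports whether s occurs in list.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// fires is a hypothetical matcher: every required tag must be present in
// the project's unified tag set, and if MachineTypes is non-empty the
// project's machine type must be listed.
func fires(p pattern, projectTags []string, machineType string) bool {
	for _, t := range p.RequiredComponentTags {
		if !contains(projectTags, t) {
			return false
		}
	}
	for _, t := range p.RequiredEnergyTags {
		if !contains(projectTags, t) {
			return false
		}
	}
	if len(p.MachineTypes) > 0 && !contains(p.MachineTypes, machineType) {
		return false
	}
	return true
}

func main() {
	hp2100 := pattern{
		ID:                    "HP2100",
		RequiredComponentTags: []string{"crush_point", "gravity_risk", "person_under_load"},
		RequiredEnergyTags:    []string{"gravitational"},
		MachineTypes:          []string{"lift", "hoist", "elevator", "scissor_lift"},
	}
	tags := []string{"crush_point", "gravity_risk", "person_under_load", "gravitational"}

	fmt.Println(fires(hp2100, tags, "scissor_lift")) // tags and machine type match
	fmt.Println(fires(hp2100, tags, "press"))        // tags match, machine type does not
}
```

With identical tags, only the machine type decides whether the lift pattern is picked up, which is the drift protection the comment describes.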
@@ -0,0 +1,127 @@
package iace
// Demonstration patterns showing how the SecondaryHarms field carries
// downstream-consequence information through the IACE engine.
//
// Two real-world scenarios are encoded:
//
// HP2000 — Glass-shard injection in carbonated-beverage bottling
// (the "Cola splitter" example from the IACE strategy
// discussion). Primary harm is the operator hit by flying
// shards; the secondary chain is product-liability towards
// supermarket end-customers.
//
// HP2001 — Cross-contamination in pharma fill-finish lines.
// Primary harm is operator exposure; secondary chain is
// patient harm + recall handling under the Stufenplan (§63 AMG).
//
// These two patterns are sufficient as a contract test for the
// SecondaryHarms field. Library coverage of more scenarios is a
// follow-up task once the persistence layer (DB migration) lands.
func GetSecondaryHarmDemoPatterns() []HazardPattern {
return []HazardPattern{
{
ID: "HP2000",
NameDE: "Glasbruch in Karbonisierungs-Abfueller (Hochdruck)",
NameEN: "Glass shatter in carbonated bottling line",
RequiredComponentTags: []string{"crush_point", "high_pressure"},
RequiredEnergyTags: []string{"pneumatic_pressure"},
GeneratedHazardCats: []string{"mechanical_hazard"},
Priority: 90,
MachineTypes: []string{"bottling", "food_processing", "packaging"},
ScenarioDE: "Glasflasche platzt unter CO2-Druck waehrend der Abfuellung. " +
"Splitter erreichen den Bediener und koennen ferner in nachfolgende " +
"Flaschen eingetragen werden.",
TriggerDE: "Materialfehler, ueberhoehter Innendruck, Foerderstoss",
HarmDE: "Schnittverletzung Auge/Hand des Bedieners",
AffectedDE: "Abfueller, Mitarbeiter Linie",
ZoneDE: "Karussell, Schutzkapsel, Foerderband-Auslauf",
DefaultSeverity: 4,
DefaultExposure: 3,
ISO12100Section: "6.4.5.5 Schleudernde Teile",
SecondaryHarms: []SecondaryHarm{
{
Type: SecondaryHarmConsumerSafety,
Description: "Restsplitter in der Folgeflasche erreichen ueber den Handel " +
"den Endkunden. Verletzungsrisiko Mund/Speiseroehre.",
LegalBasis: "ProdHaftG §1, VO (EU) Nr. 178/2002 Art. 14",
SuggestedMitigations: []string{
"Spueltunnel nach Abfuellung",
"Inline-Kamera mit Glasbrucherkennung",
"Sperrzone fuer 2 Folgeflaschen bei Bruchereignis",
"Glasbruchsensor an Karussell mit Linie-Stopp",
},
Owner: "product_safety",
},
{
Type: SecondaryHarmFoodSafety,
Description: "Rueckruf- und Meldepflicht bei Inverkehrbringen unsicherer " +
"Lebensmittel; Rueckverfolgbarkeit Chargen-genau erforderlich.",
LegalBasis: "VO (EU) 178/2002 Art. 18, 19; LFGB §40",
SuggestedMitigations: []string{
"Chargen-Tracking bis Endhaendler",
"Schnellwarnsystem RASFF aktiviert halten",
"Rueckruf-SOP getestet",
},
Owner: "qm",
},
{
Type: SecondaryHarmReputation,
Description: "Pressemitteilung und Aktienkurs-Reaktion bei Verbraucher-" +
"verletzungen / behoerdlichem Rueckruf.",
LegalBasis: "ISO 31000 Unternehmensrisiko",
SuggestedMitigations: []string{
"Krisenkommunikations-Plan",
"PR-Bereitschaft 24/7",
},
Owner: "enterprise_risk",
},
},
},
{
ID: "HP2001",
NameDE: "Kreuzkontamination Pharma Fill-Finish",
NameEN: "Cross-contamination pharma fill-finish",
RequiredComponentTags: []string{"chemical_risk"},
RequiredEnergyTags: []string{"pneumatic_pressure"},
GeneratedHazardCats: []string{"chemical_hazard"},
Priority: 92,
MachineTypes: []string{"pharmaceutical", "food_processing"},
ScenarioDE: "Wirkstoff-Rueckstand aus Vorcharge im Linienzwischenraum kontaminiert " +
"die Folgecharge.",
TriggerDE: "Mangelhaftes CIP, Spuelvolumen unterhalb Validierung",
HarmDE: "Bedienerexposition bei Probennahme",
AffectedDE: "Anlagenbediener, Probenehmer",
ZoneDE: "Abfuelllinie zwischen Vorlage und Filler",
DefaultSeverity: 4,
DefaultExposure: 2,
ISO12100Section: "6.4.4 Chemische und biologische Gefaehrdungen",
SecondaryHarms: []SecondaryHarm{
{
Type: SecondaryHarmConsumerSafety,
Description: "Patient erhaelt Arzneimittel mit unzulaessiger Beimischung; " +
"Wirkungsbeeintraechtigung oder unerwuenschte Wirkung moeglich.",
LegalBasis: "AMG §5 (Verkehrsfaehigkeit), §63 (Stufenplan)",
SuggestedMitigations: []string{
"CIP-Validierung mit TOC- und Conductivity-Limits",
"Dedizierte Linien fuer Hochpotente Wirkstoffe",
"Stufenplan-Meldung bei Verdacht",
},
Owner: "qm",
},
{
Type: SecondaryHarmProductLiability,
Description: "Haftung des Inverkehrbringers nach AMG §84 (Gefaehrdungshaftung " +
"bei Arzneimittelschaeden, verschuldensunabhaengig).",
LegalBasis: "AMG §84",
SuggestedMitigations: []string{
"Deckung Produkthaftpflicht ueber gesetzliches Minimum",
"Chargen-Rueckhaltemuster 12 Monate ueber MHD hinaus",
},
Owner: "legal",
},
},
},
}
}
@@ -29,8 +29,18 @@ func GetKeywordDictionary() []KeywordEntry {
// ── Foerdertechnik ──────────────────────────────────────────────
{Keywords: []string{"foerderband", "transportband", "conveyor"}, ComponentIDs: []string{"C003"}, EnergyIDs: []string{"EN01", "EN02"}, ExtraTags: []string{"entanglement_risk"}},
{Keywords: []string{"transfer", "transferanlage", "transfersystem"}, ComponentIDs: []string{"C127"}, ExtraTags: []string{"shear_risk", "pinch_point"}},
{Keywords: []string{"aufzug", "elevator", "lift"}, ComponentIDs: []string{"C014", "C128"}, EnergyIDs: []string{"EN04"}, ExtraTags: []string{"gravity_risk", "person_under_load"}},
{Keywords: []string{"hubwerk", "hoist", "hubgeraet"}, ComponentIDs: []string{"C128"}, EnergyIDs: []string{"EN04"}, ExtraTags: []string{"gravity_risk", "person_under_load"}},
// Lifting devices: corrected to EN03 (potential/gravitational) instead of
// only EN04 (electrical). Audit method A showed that HP1014/HP1015/
// HP1017/HP1018 (all crush patterns under a descending load) did not
// fire because both crush_point and gravitational were missing.
// EN04 is kept for patterns tied to control current.
{Keywords: []string{"aufzug", "elevator", "lift"}, ComponentIDs: []string{"C014", "C128"}, EnergyIDs: []string{"EN03", "EN04"}, ExtraTags: []string{"gravity_risk", "person_under_load", "crush_point"}},
{Keywords: []string{"hubwerk", "hoist", "hubgeraet"}, ComponentIDs: []string{"C128"}, EnergyIDs: []string{"EN03", "EN04"}, ExtraTags: []string{"gravity_risk", "person_under_load", "crush_point"}},
// Lifting verbs from the method C vocabulary diff: "absenken/senken/
// anheben/heben/hubhoehe" appeared in the limits form, but the parser
// did not know them. Conservatively EN03 plus tags; component stays open.
{Keywords: []string{"absenk", "senken", "anheben", "heben"}, EnergyIDs: []string{"EN03"}, ExtraTags: []string{"gravity_risk", "person_under_load", "crush_point"}},
{Keywords: []string{"hubhoehe", "hubweg", "hubgeschwindig"}, EnergyIDs: []string{"EN03"}, ExtraTags: []string{"gravity_risk", "crush_point"}},
{Keywords: []string{"ruettel", "vibration", "vibrationsfoerderer"}, ComponentIDs: []string{"C125"}, ExtraTags: []string{"vibration_source", "noise_source"}},
{Keywords: []string{"fallrohr", "auswurf", "chute"}, ComponentIDs: []string{"C129"}, EnergyIDs: []string{"EN04"}, ExtraTags: []string{"gravity_risk"}},
{Keywords: []string{"kistenwechsel", "bin change"}, ComponentIDs: []string{"C134"}, ExtraTags: []string{"ergonomic", "gravity_risk"}},
@@ -21,6 +21,7 @@ func GetProtectiveMeasureLibrary() []ProtectiveMeasureEntry {
all = append(all, GetTextileAgriMeasures()...) // Textil + Landmaschinen (Phase 5)
all = append(all, getGTBremseMeasures()...) // GT-Bremse-Coverage-Gaps (M483-M522)
all = append(all, GetCRAMeasures()...) // CRA / DIN EN 40000-1-2 cyber-resilience (M540-M548)
all = append(all, getLiftEndstopMeasures()...) // Lift/hoist endstop (M600-M604) — bridges OSHA MD library
return all
}
@@ -0,0 +1,134 @@
package iace
// Lift / hoist / scissor-lift endstop mitigations — bridges the OSHA
// minimum-distance library (minimum_distances.go, Task #18) into the
// pattern-engine measure library. Each entry cites the concrete OSHA
// value AND its EU-norm counterpart by identifier only.
//
// Engineering rounding values come from MD_OSHA_* IDs in
// minimum_distances.go. We do not duplicate the source text here —
// the Tech-File renderer can join MD_OSHA_* into the rendered text
// at output time.
func getLiftEndstopMeasures() []ProtectiveMeasureEntry {
return []ProtectiveMeasureEntry{
// M600: creep speed at end of travel
{
ID: "M600",
ReductionType: "protection",
SubType: "speed_control",
Name: "Kriechgeschwindigkeit am Endanschlag (Hubgeraete)",
Description: "Hubgeschwindigkeit am Ende der Verfahrbewegung (oben und unten) auf maximal 15 mm/s " +
"reduzieren. Bezugswert fuer die Auslegung der Stopp-Reaktionszeit ist die Hand-Speed-Konstante " +
"63 in/s (ca. 1.600 mm/s) aus OSHA 29 CFR 1910.217. Damit ist auch bei spaeter Ausloesung der " +
"Quetsch-Schaltleiste genug Bremsweg vorhanden.",
HazardCategory: "mechanical",
Examples: []string{
"Hub-Endschalter mit Soft-Stop und Geschwindigkeitsstufe < 15 mm/s in den letzten 50 mm",
"Servo-Antrieb mit Ramp-down-Profil ueber die letzten 100 mm Verfahrweg",
"Drehzahl-Begrenzer im Frequenzumrichter mit Endlagen-Trigger",
},
NormReferences: []string{
"OSHA 29 CFR 1910.217 (Ds = 63 in/s x Ts)",
"EN ISO 13855 (Anordnung von Schutzeinrichtungen)",
"EN 1570-1 (Hubtische — Bauanforderungen)",
},
RiskReduction: &RiskReduction{SeverityDelta: -1, ExposureDelta: -1, ProbabilityDelta: -1},
Tags: []string{"crush_point", "gravity_risk", "speed_limit"},
},
// M601 — Trip-edge sensor under platform (safety bumper)
{
ID: "M601",
ReductionType: "protection",
SubType: "safety_device",
Name: "Quetsch-Schaltleiste unterhalb der Hubplattform",
Description: "Druckempfindliche Schaltleiste (gemaess EN ISO 13856-2) am unteren Rand der Hubplattform " +
"stoppt bei Beruehrung den Hubantrieb sofort und kehrt die Bewegung um. Verhindert Quetschung " +
"von Fuessen oder Beinen unter absenkender Last. PL c oder hoeher nach EN ISO 13849-1.",
HazardCategory: "mechanical",
Examples: []string{
"Schaltleiste umlaufend an Bodenkante der Hubplattform",
"Trittschutz mit redundanter Auswertung am Hubtisch",
"Lichtgitter im Bodenbereich als Ergaenzung bei freistehenden Anlagen",
},
NormReferences: []string{
"EN ISO 13856-2 (Schaltleisten)",
"EN ISO 13849-1 (PL-Bestimmung)",
"EN 1570-1",
},
RiskReduction: &RiskReduction{SeverityDelta: -2, ExposureDelta: -2, ProbabilityDelta: -2},
Tags: []string{"crush_point", "gravity_risk", "safety_device"},
},
// M602 — Minimum clearance to fixed structure above max lift position
{
ID: "M602",
ReductionType: "design",
SubType: "geometry",
Name: "Mindestabstand zu festen Strukturen oberhalb der Hubendlage",
Description: "Zwischen hoechstem Punkt der Hubeinheit (mit beladenem Werkstueck) und festen Strukturen " +
"oberhalb (Decke, Vorbau, Querbalken) muss ein Sicherheitsabstand verbleiben, der das Quetschen " +
"von Haenden und Koerper verhindert. Empfehlung: 120 mm fuer Kopf, 100 mm fuer Hand, 25 mm fuer " +
"Finger — abgeleitet aus EN 349 / EN ISO 13854 unabhaengig zu pruefen.",
HazardCategory: "mechanical",
Examples: []string{
"Konstruktive Begrenzung der oberen Hubposition durch mechanischen Anschlag",
"Software-Endlage mit redundantem Hardware-Sicherheitsschalter",
"Auslegungs-Pruefung mit beladener Standard-Palette und Maximal-Hubhoehe",
},
NormReferences: []string{
"EN 349 (Mindestabstaende gegen Quetschen von Koerperteilen)",
"EN ISO 13854 (Mindestabstaende gegen Quetschen)",
"OSHA 29 CFR 1910.212(a)(5) (Lueftergitter ≤ 1/2 in als Anker)",
},
RiskReduction: &RiskReduction{SeverityDelta: -2, ExposureDelta: -1},
Tags: []string{"crush_point", "gravity_risk"},
},
// M603 — Hold-to-run with two-hand operation for manual descent
{
ID: "M603",
ReductionType: "protection",
SubType: "control_device",
Name: "Tippbetrieb / Hold-to-run beim Absenken (mit Verifikations-Nachweis)",
Description: "Absenken nur im Tippbetrieb (Hold-to-run): Bedientaster muss waehrend des gesamten " +
"Absenkvorgangs gedrueckt gehalten werden. Bei Loslassen stoppt die Bewegung sofort. " +
"Im Limits-Form als 'Tippbetrieb' deklariert — durch Tests verifizieren (Stop-Reaktionszeit " +
"<= 0,3 s im voll beladenen Zustand).",
HazardCategory: "mechanical",
Examples: []string{
"Tipptaster mit elektrischer Selbstrueckstellung",
"Zweihand-Bedienung fuer kritische Absenk-Bereiche (Tipp + Zustimmtaster)",
"Pruefprotokoll Stop-Zeit gemaess EN ISO 13849-1 PL c",
},
NormReferences: []string{
"EN ISO 13849-1 (Sicherheitsbezogene Steuerungsteile)",
"EN ISO 13851 (Zweihandschaltungen)",
"BetrSichV § 4 (Schutzmassnahmen)",
},
RiskReduction: &RiskReduction{SeverityDelta: -1, ExposureDelta: -2, ProbabilityDelta: -1},
Tags: []string{"crush_point", "gravity_risk", "control_device"},
},
// M604 — Underrun guard / kick plate at platform base
{
ID: "M604",
ReductionType: "design",
SubType: "geometry",
Name: "Trittblech / Unterfahrschutz an der Hubplattform",
Description: "Unter der Hubplattform befindet sich ein umlaufendes Trittblech oder Unterfahrschutz, " +
"das das Hineinfahren von Fuessen unter die Plattform mechanisch verhindert. Hoehe ueber Boden " +
"maximal 5 mm in unterster Stellung. Trittblech haelt die Last eines Schuhs (mind. 150 kg) " +
"ohne Verformung.",
HazardCategory: "mechanical",
Examples: []string{
"Umlaufendes Stahlblech 3 mm Wandstaerke mit Fasen-Kante",
"Kombination mit M601 (Schaltleiste) als doppelte Sicherung",
"Pruefung jaehrlich auf Verformung und Funktion der Auflage",
},
NormReferences: []string{
"EN 1570-1 (Hubtische)",
"EN ISO 13857 (Sicherheitsabstaende)",
},
RiskReduction: &RiskReduction{SeverityDelta: -2, ExposureDelta: -1},
Tags: []string{"crush_point", "gravity_risk"},
},
}
}
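To make the "enough braking distance" argument behind M600 concrete, the travel covered during a stop reaction can be estimated as speed times reaction time. The sketch below is a worked calculation only. It combines the 15 mm/s creep speed from M600 with the 0.3 s stop-reaction limit quoted in M603, and that pairing is an assumption for illustration, not something the library states. `stoppingTravelMM` is a hypothetical helper, and the 150 mm/s "full speed" figure is an invented comparison value.

```go
package main

import "fmt"

// stoppingTravelMM returns the distance (in mm) travelled at a constant
// speed during the stop reaction time. Deceleration after the reaction
// time is deliberately ignored (simplifying assumption).
func stoppingTravelMM(speedMMPerS, reactionMs int) float64 {
	return float64(speedMMPerS*reactionMs) / 1000.0
}

func main() {
	// Creep speed from M600 with the stop-reaction limit from M603.
	fmt.Printf("creep: %.1f mm\n", stoppingTravelMM(15, 300))
	// Hypothetical full travel speed for comparison (not from the source).
	fmt.Printf("full:  %.1f mm\n", stoppingTravelMM(150, 300))
}
```

Under these assumptions the platform moves only a few millimetres before stopping at creep speed, which is why the speed reduction and the trip edge are paired.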
@@ -0,0 +1,172 @@
package iace
// Minimum-distance library — Task #18.
//
// Anchor source: OSHA 29 CFR 1910 Subpart O (US Federal Public Domain,
// 17 U.S.C. §105). The values below are reproduced verbatim from the
// Federal Code; conversions to metric are mathematical and carry no
// copyright. Engineering rounding to safe-side mm values is BreakPilot's
// recommendation and labelled as such.
//
// EU norm equivalents (EN ISO 13857, EN 349, EN ISO 13855, EN 1010) are
// referenced by identifier only — no values are reproduced, because
// DIN/Beuth retain copyright on the wording. The DINComparisonNote
// field carries a human-curated judgement on whether the EU norm is
// stricter / looser / equivalent — this is a qualitative observation
// about a publicly available document, not a copy of its text.
//
// See LICENSE_RULES.md and project_attribution_strategy.md for the
// licensing logic. The OSHA values are R1 (verbatim public domain);
// the recommended metric values are BreakPilot engineering output (R3
// own-work). DIN references are R3 identifier-only.
// MinimumDistanceUnit denotes the original unit system of the source.
type MinimumDistanceUnit string
const (
UnitInch MinimumDistanceUnit = "inch"
UnitFoot MinimumDistanceUnit = "foot"
UnitMeter MinimumDistanceUnit = "meter"
UnitMM MinimumDistanceUnit = "mm"
)
// MinimumDistance is the data contract for a single safety-distance rule.
// It can be (a) a fixed gap value, (b) a distance range, or (c) a formula
// like OSHA's Ds = 63 in/s × Ts (hand-speed constant).
type MinimumDistance struct {
ID string `json:"id"` // MD_OSHA_001
// Source identifier — full CFR citation or norm reference.
SourceCFR string `json:"source_cfr,omitempty"` // "29 CFR §1910.217(c)(1)(i)"
SourceTable string `json:"source_table,omitempty"` // "Table O-10"
License string `json:"license"` // "US Federal Public Domain"
LicenseRule int `json:"license_rule"` // 1 / 2 / 3 (see LICENSE_RULES.md)
// Original verbatim value in the source's own unit.
OriginalUnit MinimumDistanceUnit `json:"original_unit"`
OriginalValue float64 `json:"original_value,omitempty"`
OriginalMin float64 `json:"original_min,omitempty"`
OriginalMax float64 `json:"original_max,omitempty"`
// Exact conversion to mm — no engineering rounding.
ExactMM float64 `json:"exact_mm,omitempty"`
ExactMinMM float64 `json:"exact_min_mm,omitempty"`
ExactMaxMM float64 `json:"exact_max_mm,omitempty"`
// Engineering-recommended metric value with safe-side rounding.
// For minimum distances: rounded up. For maximum opening widths:
// rounded down.
RecommendedMM int `json:"recommended_mm,omitempty"`
RecommendedMinMM int `json:"recommended_min_mm,omitempty"`
RecommendedMaxMM int `json:"recommended_max_mm,omitempty"`
RoundingNote string `json:"rounding_note,omitempty"`
// Optional formula constant (e.g. OSHA hand-speed 63 in/s).
FormulaInchPerSecond float64 `json:"formula_inch_per_second,omitempty"`
FormulaMMPerSecond float64 `json:"formula_mm_per_second,omitempty"`
FormulaDescription string `json:"formula_description,omitempty"`
Context string `json:"context"` // "Point of Operation Guarding mechanical presses"
BodyPart string `json:"body_part,omitempty"` // "finger" / "hand" / "head" / "foot" / "body"
HazardTags []string `json:"hazard_tags,omitempty"` // [crush_point, cutting_part, ...]
// EU norm cross-reference — IDENTIFIER ONLY, no values reproduced.
EUNormHints []EUNormHint `json:"eu_norm_hints,omitempty"`
}
// EUNormHint references an EU standard by identifier without reproducing
// any value or text from it. The DINComparisonNote is a human-curated
// qualitative judgement (stricter / equivalent / looser) — not a copy.
type EUNormHint struct {
Norm string `json:"norm"` // "EN ISO 13857"
Section string `json:"section,omitempty"` // "Tab. 4, Schutz gegen Hineingreifen"
DINComparisonNote string `json:"din_comparison_note,omitempty"`
}
// GetOSHAMinimumDistances returns the verbatim OSHA values for
// machine-guarding distances. All values are US Federal Public Domain
// (17 U.S.C. §105). Engineering rounding is BreakPilot's safe-side
// recommendation; OSHA values themselves are unchanged.
func GetOSHAMinimumDistances() []MinimumDistance {
return []MinimumDistance{
// OSHA Table O-10 row 1 — verbatim values, mathematical conversion,
// safe-side rounded engineering recommendation.
{
ID: "MD_OSHA_O10_R1",
SourceCFR: "29 CFR §1910.217(c)(1)(i)",
SourceTable: "Table O-10 row 1",
License: "US Federal Public Domain (17 U.S.C. §105)",
LicenseRule: 1,
OriginalUnit: UnitInch,
OriginalMin: 0.5, OriginalMax: 1.5, OriginalValue: 0.25,
ExactMinMM: 12.7, ExactMaxMM: 38.1, ExactMM: 6.35,
RecommendedMinMM: 15, RecommendedMaxMM: 40, RecommendedMM: 6,
RoundingNote: "Distance auf 5-mm-Raster aufgerundet, opening auf 1-mm-Raster abgerundet (konservativ in beide Richtungen).",
Context: "Point-of-Operation Guarding bei mechanischen Pressen",
BodyPart: "finger",
HazardTags: []string{"crush_point", "cutting_part"},
EUNormHints: []EUNormHint{
{Norm: "EN ISO 13857", Section: "Tab. 4 (Hineingreifen)",
DINComparisonNote: "Andere Methodik (Reichweitenmodell). Unabhaengig pruefen — Werte koennen abweichen."},
},
},
// OSHA Table O-10 row 4 — used as a worked example in the strategy
// discussion. Distance 3.5-5.5 in, opening max 5/8 in.
{
ID: "MD_OSHA_O10_R4",
SourceCFR: "29 CFR §1910.217(c)(1)(i)",
SourceTable: "Table O-10 row 4",
License: "US Federal Public Domain (17 U.S.C. §105)",
LicenseRule: 1,
OriginalUnit: UnitInch,
OriginalMin: 3.5, OriginalMax: 5.5, OriginalValue: 0.625,
ExactMinMM: 88.9, ExactMaxMM: 139.7, ExactMM: 15.875,
RecommendedMinMM: 90, RecommendedMaxMM: 140, RecommendedMM: 15,
RoundingNote: "Distance 88.9→90 (+1.1 mm), 139.7→140 (+0.3 mm) aufgerundet; Opening 15.875→15 (-0.875 mm) abgerundet.",
Context: "Point-of-Operation Guarding bei mechanischen Pressen",
BodyPart: "finger",
HazardTags: []string{"crush_point", "cutting_part"},
EUNormHints: []EUNormHint{
{Norm: "EN ISO 13857", Section: "Tab. 4 (Hineingreifen)",
DINComparisonNote: "Andere Methodik (Reichweitenmodell). Compliance-Annotation pflegen."},
},
},
// OSHA §1910.212(a)(5) — fan blade guards. Verbatim 1/2 inch.
{
ID: "MD_OSHA_212_FAN",
SourceCFR: "29 CFR §1910.212(a)(5)",
License: "US Federal Public Domain (17 U.S.C. §105)",
LicenseRule: 1,
OriginalUnit: UnitInch,
OriginalValue: 0.5,
ExactMM: 12.7,
RecommendedMM: 12,
RoundingNote: "Luefterblatt-Schutzgitter: max. Spaltoeffnung 1/2 in = 12.7 mm. Konservativ auf 12 mm abgerundet.",
Context: "Lüfterblätter unter 7 ft (2.13 m) Höhe",
BodyPart: "finger",
HazardTags: []string{"rotating_part", "cutting_part"},
EUNormHints: []EUNormHint{
{Norm: "EN ISO 13857", Section: "Tab. 4",
DINComparisonNote: "DIN-Wert pruefen."},
},
},
// OSHA §1910.217 Hand-Speed Constant — formula Ds = 63 in/s × Ts
{
ID: "MD_OSHA_217_PSDI",
SourceCFR: "29 CFR §1910.217 (Ds = 63 in/s × Ts)",
License: "US Federal Public Domain (17 U.S.C. §105)",
LicenseRule: 1,
OriginalUnit: UnitInch,
FormulaInchPerSecond: 63.0,
FormulaMMPerSecond: 1600.2,
FormulaDescription: "Hand-Speed-Konstante 63 in/s ≈ 1600 mm/s. " +
"Ds (Mindestabstand) = 63 × Ts (Stoppzeit Presse in Sekunden).",
Context: "PSDI Presence-Sensing Device Initiation und Two-Hand-Trip",
BodyPart: "hand",
HazardTags: []string{"crush_point", "high_speed"},
EUNormHints: []EUNormHint{
{Norm: "EN ISO 13855", Section: "Sicherheitsabstaende",
DINComparisonNote: "EN ISO 13855 arbeitet mit einer vergleichbaren Annaeherungsgeschwindigkeit (ca. 1600 mm/s); EU-Norm unabhaengig pruefen."},
},
},
}
}
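The conversion and safe-side rounding used for the entries above can be reproduced with a few lines of arithmetic. The following is a sketch under stated assumptions: `roundUpToGrid`, `roundDownToGrid` and `safetyDistanceMM` are hypothetical helper names, the 5 mm and 1 mm grids are taken from the RoundingNote of MD_OSHA_O10_R1, and the 0.3 s stop time is an example value only.

```go
package main

import (
	"fmt"
	"math"
)

// mmPerInch is exact by definition of the international inch.
const mmPerInch = 25.4

// roundUpToGrid rounds mm up to the next multiple of grid
// (safe side for minimum distances).
func roundUpToGrid(mm float64, grid int) int {
	return int(math.Ceil(mm/float64(grid))) * grid
}

// roundDownToGrid rounds mm down to a multiple of grid
// (safe side for maximum opening widths).
func roundDownToGrid(mm float64, grid int) int {
	return int(math.Floor(mm/float64(grid))) * grid
}

// safetyDistanceMM evaluates Ds = 63 in/s x Ts and converts the result to mm.
func safetyDistanceMM(stopTimeSeconds float64) float64 {
	return 63.0 * stopTimeSeconds * mmPerInch
}

func main() {
	// Table O-10 row 4 as quoted in MD_OSHA_O10_R4.
	minIn, maxIn, openingIn := 3.5, 5.5, 0.625

	fmt.Println("min distance mm:", roundUpToGrid(minIn*mmPerInch, 5))
	fmt.Println("max distance mm:", roundUpToGrid(maxIn*mmPerInch, 5))
	fmt.Println("max opening mm: ", roundDownToGrid(openingIn*mmPerInch, 1))

	// Example stop time of 0.3 s (assumption, not an OSHA value).
	fmt.Printf("Ds at 0.3 s: %.2f mm\n", safetyDistanceMM(0.3))
}
```

Distances are rounded up and openings rounded down, so each rounded value is at least as conservative as the exact conversion.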
@@ -0,0 +1,60 @@
package iace
// Machine-type overrides for legacy patterns that lacked MachineTypes
// filtering at authoring time. Applied as a post-load pass in
// collectAllPatterns() so we do not need to touch the large pattern
// source files (which would push them past the 500-LOC cap).
//
// Adding an entry here causes the listed pattern IDs to fire ONLY for
// projects whose machine_type is in the value list. This eliminates
// drift like "Punktschweisselektroden" firing for a Kistenhubgeraet
// project just because tags incidentally aligned.
var legacyMachineTypeOverrides = map[string][]string{
// Walzen / Roller hazards: printing, paper, textile, metalworking, rolling mills and food processing only.
"HP1000": {"printing", "paper", "textile", "metalworking", "rolling_mill", "food_processing"},
// HP306 + HP1530 already carry MachineTypes; skip.
// Welding-specific patterns.
"HP539": {"welding", "spot_welding"},
// Glass-handling tilters.
"HP545": {"glass", "glass_processing"},
"HP782": {"glass", "glass_processing"},
// Escalator-specific.
"HP756": {"escalator"},
"HP757": {"escalator"},
"HP760": {"escalator"},
// CNC machine tools (these fired on Kistenhubgeraet because they
// share crush_point + moving_part tags but are bench-mounted tools).
"HP1400": {"cnc", "metalworking", "lathe", "milling"},
"HP1401": {"cnc", "metalworking", "lathe", "milling"},
"HP1402": {"cnc", "metalworking", "lathe", "milling"},
// Press-specific (Pressenteile/Pressraum/Werkzeugraum).
"HP045": {"press", "hydraulic_press", "mechanical_press", "stamping_press"},
"HP049": {"press", "hydraulic_press", "mechanical_press", "stamping_press"},
// Conveyor-belt-specific drift.
"HP420": {"conveyor", "packaging", "food_processing"},
"HP421": {"conveyor", "packaging", "food_processing"},
"HP422": {"conveyor", "packaging", "food_processing"},
}
// applyMachineTypeOverrides mutates the passed slice in place, setting
// MachineTypes on any pattern whose ID is in the override map. Patterns
// that already have MachineTypes set are NOT overwritten — the override
// only fills the gap.
func applyMachineTypeOverrides(patterns []HazardPattern) []HazardPattern {
for i := range patterns {
if len(patterns[i].MachineTypes) > 0 {
continue
}
if mt, ok := legacyMachineTypeOverrides[patterns[i].ID]; ok {
patterns[i].MachineTypes = mt
}
}
return patterns
}
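The gap-filling behaviour of the override pass can be illustrated with a small self-contained program. This is a sketch with a reduced HazardPattern struct and a two-entry override map. The names `overrides` and `applyOverrides` are stand-ins for the identifiers above, and the pattern IDs in `main` are example inputs only.

```go
package main

import "fmt"

// HazardPattern is reduced to the two fields the override pass touches
// (assumption: the real struct has many more fields).
type HazardPattern struct {
	ID           string
	MachineTypes []string
}

// overrides is a shortened stand-in for legacyMachineTypeOverrides.
var overrides = map[string][]string{
	"HP539": {"welding", "spot_welding"},
	"HP756": {"escalator"},
}

// applyOverrides mirrors the logic shown above: it fills MachineTypes only
// where a pattern has none and an override entry exists.
func applyOverrides(patterns []HazardPattern) []HazardPattern {
	for i := range patterns {
		if len(patterns[i].MachineTypes) > 0 {
			continue // author-supplied restriction wins
		}
		if mt, ok := overrides[patterns[i].ID]; ok {
			patterns[i].MachineTypes = mt
		}
	}
	return patterns
}

func main() {
	patterns := []HazardPattern{
		{ID: "HP539"},                                    // legacy, gets filled
		{ID: "HP756", MachineTypes: []string{"walkway"}}, // already set, kept as-is
		{ID: "HP9999"},                                   // no override, stays unrestricted
	}
	for _, p := range applyOverrides(patterns) {
		fmt.Println(p.ID, p.MachineTypes)
	}
}
```

One design consequence worth noting: patterns without an override entry stay unrestricted, so the map has to be extended whenever a new case of drift is observed.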
@@ -42,5 +42,8 @@ func collectAllPatterns() []HazardPattern {
patterns = append(patterns, GetGTBremseHazardPatterns()...) // HP1710-HP1729 GT Bremse coverage gaps
patterns = append(patterns, GetISO12100GapPatterns()...) // HP1900-HP1909 ISO 12100 Annex B gaps (Vakuum, Federn, Rutsch, Hochdruckinjektion, Ersticken)
patterns = append(patterns, GetCRAPatterns()...) // HP1910-HP1918 CRA / DIN EN 40000-1-2 cyber-resilience spur
patterns = append(patterns, GetSecondaryHarmDemoPatterns()...) // HP2000-HP2001 secondary harm chain demos (Cola splitter, Pharma)
patterns = append(patterns, GetLiftEndstopPatterns()...) // HP2100-HP2102 lift body-part crush at endstops
patterns = applyMachineTypeOverrides(patterns) // Fill MachineTypes on legacy patterns to prevent drift
return patterns
}
@@ -0,0 +1,89 @@
package iace
// SecondaryHarm models the consequential damage chain triggered by a primary
// hazard. The classical IACE / ISO-12100 model treats Hazard -> Harm as a
// single step ("operator gets crushed"). BreakPilot extends this with a
// follow-on chain so the risk assessment can address:
//
// - consumer_safety: end customer exposed to defective product
// (e.g. glass shards in a bottled drink that reaches a supermarket)
// - product_liability: manufacturer liability under ProdHaftG / EU PLD
// - food_safety: traceability and recall obligations (VO 178/2002)
// - environmental: spill, contamination, waste-disposal consequence
// - reputation: brand damage that escalates to investor / market level
// - financial: direct cost (lawsuit, recall, fine)
//
// This struct is the data contract; persistence is deferred to a future
// migration. The pattern library can already attach SecondaryHarms to a
// HazardPattern; the API layer surfaces them on hazard generation.
//
// See project_attribution_strategy.md plus the "Cola splitter" worked
// example from the IACE strategy discussion (2026-05-20).
type SecondaryHarm struct {
// Type is one of the SecondaryHarmType* constants below.
Type string `json:"type"`
// Description is a single sentence describing the secondary harm
// scenario in concrete terms ("Splitter in Folgeflasche bei
// Karussell-Abfueller -> Endkunde verletzt").
Description string `json:"description"`
// LegalBasis cites the legal framework that turns the secondary harm
// into an actionable obligation (e.g. "ProdHaftG §1" or "VO 178/2002
// Art. 14"). Helps auditors trace the obligation.
LegalBasis string `json:"legal_basis,omitempty"`
// SuggestedMitigations is a free-text list of measures specific to
// the secondary chain (e.g. "Spueltunnel", "Inline-Kamera",
// "Glasbruchsensor"). Distinct from the primary mitigations because
// they protect downstream stakeholders, not the operator.
SuggestedMitigations []string `json:"suggested_mitigations,omitempty"`
// Owner identifies the role responsible for handling this secondary
// harm in the customer organisation. Common values:
// "qm" / "product_safety" / "enterprise_risk" / "legal"
// Empty if responsibility is shared.
Owner string `json:"owner,omitempty"`
}
// SecondaryHarmType constants — kept short and stable.
const (
SecondaryHarmConsumerSafety = "consumer_safety"
SecondaryHarmProductLiability = "product_liability"
SecondaryHarmFoodSafety = "food_safety"
SecondaryHarmEnvironmental = "environmental"
SecondaryHarmReputation = "reputation"
SecondaryHarmFinancial = "financial"
)
// AllSecondaryHarmTypes returns the canonical six categories in the order
// they should appear in UI dropdowns.
func AllSecondaryHarmTypes() []string {
return []string{
SecondaryHarmConsumerSafety,
SecondaryHarmProductLiability,
SecondaryHarmFoodSafety,
SecondaryHarmEnvironmental,
SecondaryHarmReputation,
SecondaryHarmFinancial,
}
}
// SecondaryHarmLabelDE returns the human-readable German label.
func SecondaryHarmLabelDE(t string) string {
switch t {
case SecondaryHarmConsumerSafety:
return "Endkundensicherheit"
case SecondaryHarmProductLiability:
return "Produkthaftung"
case SecondaryHarmFoodSafety:
return "Lebensmittelsicherheit"
case SecondaryHarmEnvironmental:
return "Umweltschaden"
case SecondaryHarmReputation:
return "Reputation/Marke"
case SecondaryHarmFinancial:
return "Finanzieller Schaden"
}
return t
}
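Because Type is a plain string rather than a dedicated type, a typo in a pattern file would compile without complaint. A contract test could guard against that by checking every attached SecondaryHarm against the canonical list. The sketch below assumes a reduced SecondaryHarm struct. `allTypes` stands in for AllSecondaryHarmTypes, and `unknownTypes` is a hypothetical helper, not part of the diff.

```go
package main

import "fmt"

// Constants copied from the diff above.
const (
	SecondaryHarmConsumerSafety   = "consumer_safety"
	SecondaryHarmProductLiability = "product_liability"
	SecondaryHarmFoodSafety       = "food_safety"
	SecondaryHarmEnvironmental    = "environmental"
	SecondaryHarmReputation       = "reputation"
	SecondaryHarmFinancial        = "financial"
)

// SecondaryHarm is reduced to the fields needed for validation.
type SecondaryHarm struct {
	Type        string
	Description string
}

// allTypes stands in for AllSecondaryHarmTypes.
func allTypes() []string {
	return []string{
		SecondaryHarmConsumerSafety,
		SecondaryHarmProductLiability,
		SecondaryHarmFoodSafety,
		SecondaryHarmEnvironmental,
		SecondaryHarmReputation,
		SecondaryHarmFinancial,
	}
}

// unknownTypes (hypothetical helper) returns every Type value that is not
// one of the canonical categories, in input order.
func unknownTypes(harms []SecondaryHarm) []string {
	known := make(map[string]bool)
	for _, t := range allTypes() {
		known[t] = true
	}
	var bad []string
	for _, h := range harms {
		if !known[h.Type] {
			bad = append(bad, h.Type)
		}
	}
	return bad
}

func main() {
	harms := []SecondaryHarm{
		{Type: SecondaryHarmFoodSafety, Description: "recall obligation"},
		{Type: "brand", Description: "typo for reputation"},
	}
	fmt.Println(unknownTypes(harms))
}
```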
@@ -60,7 +60,30 @@ func (tr *TagResolver) ResolveEnergyTags(energyIDs []string) []string {
return tags
}
// ResolveTags combines component, energy, and custom tags into a unified set.
// tagSynonyms maps tag names in both directions between the short
// pattern-side forms and the descriptive library-side tags. The library uses descriptive identifiers
// ("electrical_energy") while many patterns were authored with short
// forms ("electrical"). Without this map, the pattern's RequiredTag
// "electrical" never matches a real component's "electrical_energy",
// and the entire pattern silently never fires. The audit (Method A)
// surfaced ~40 such ghost-patterns.
//
// Each entry expands the parser's tag set when a known synonym appears,
// so both forms work for matching. This is the least-invasive fix —
// no pattern bodies are touched. The long-term goal is to converge
// on a single canonical vocabulary; until then the map documents which
// pairs are considered equivalent.
var tagSynonyms = map[string][]string{
"electrical_energy": {"electrical"},
"pneumatic_pressure": {"pneumatic"},
"hydraulic_pressure": {"hydraulic"},
"electrical": {"electrical_energy"},
"pneumatic": {"pneumatic_pressure"},
"hydraulic": {"hydraulic_pressure"},
}
// ResolveTags combines component, energy, and custom tags into a unified set,
// applying the synonym map so patterns authored with either tag form match.
func (tr *TagResolver) ResolveTags(componentIDs, energyIDs, customTags []string) []string {
seen := make(map[string]bool)
var all []string
@@ -71,6 +94,12 @@ func (tr *TagResolver) ResolveTags(componentIDs, energyIDs, customTags []string)
seen[t] = true
all = append(all, t)
}
for _, syn := range tagSynonyms[t] {
if !seen[syn] {
seen[syn] = true
all = append(all, syn)
}
}
}
}
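The synonym expansion added in this hunk can be tried in isolation. The sketch below lifts the loop into a standalone function. `expandTags` is a hypothetical name, and the real ResolveTags additionally resolves component and energy IDs, which is left out here.

```go
package main

import "fmt"

// tagSynonyms is copied from the diff above.
var tagSynonyms = map[string][]string{
	"electrical_energy":  {"electrical"},
	"pneumatic_pressure": {"pneumatic"},
	"hydraulic_pressure": {"hydraulic"},
	"electrical":         {"electrical_energy"},
	"pneumatic":          {"pneumatic_pressure"},
	"hydraulic":          {"hydraulic_pressure"},
}

// expandTags (hypothetical helper) de-duplicates tags and appends each
// tag's synonyms right after it, preserving first-seen order.
func expandTags(tags []string) []string {
	seen := make(map[string]bool)
	var all []string
	for _, t := range tags {
		if !seen[t] {
			seen[t] = true
			all = append(all, t)
		}
		for _, syn := range tagSynonyms[t] {
			if !seen[syn] {
				seen[syn] = true
				all = append(all, syn)
			}
		}
	}
	return all
}

func main() {
	// A component tagged with the library form also satisfies a pattern
	// that requires the short form.
	fmt.Println(expandTags([]string{"electrical_energy", "crush_point"}))
}
```

Since the map holds both directions, it does not matter which form the input uses; both end up in the resolved set exactly once.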
@@ -71,6 +71,8 @@ _ROUTER_MODULES = [
"compliance_report_routes",
"whistleblower_routes",
"tcf_routes",
"founding_wizard_routes",
"licenses_routes",
]
_loaded_count = 0
@@ -0,0 +1,285 @@
"""
P18: extended banner block for the email.
Renders the consent-tester data that has been discarded until now:
- 3-phase cookie table (before_consent / after_reject / after_accept)
- Banner quality score (completeness/correctness/violations)
- Per-category tracker listing
- List of violations with their legal bases
"""
from __future__ import annotations
def _color_for(pct: int) -> str:
return ("#16a34a" if pct >= 80 else
"#d97706" if pct >= 50 else "#dc2626")
def _short_phase_label(key: str) -> str:
return {
"before_consent": "Vor Consent",
"after_reject": "Nach Ablehnung",
"after_accept": "Nach Annahme",
}.get(key, key)
def _phase_color(key: str, cookie_count: int) -> str:
if key == "before_consent":
return "#16a34a" if cookie_count == 0 else "#dc2626"
if key == "after_reject":
return "#16a34a" if cookie_count <= 1 else "#d97706"
return "#94a3b8"
def build_banner_deep_html(banner_result: dict | None) -> str:
"""Render: Banner-Quality + Phases + Violations.
Consumes the full consent-tester response. Complements
`build_provider_list_html` (which only renders the summary and the TCF vendor table).
"""
if not banner_result:
return ""
parts: list[str] = [
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:700px;margin:0 auto 16px;padding:14px 18px;'
'background:#fff;border:1px solid #cbd5e1;border-radius:8px">'
'<h3 style="margin:0 0 12px;font-size:14px;color:#0f172a">'
'Cookie-Banner — technische Analyse</h3>'
]
# 1) Quality score cards
compl = banner_result.get("completeness_pct")
corr = banner_result.get("correctness_pct")
summary = banner_result.get("summary") or {}
n_critical = summary.get("critical", 0)
n_high = summary.get("high", 0)
if compl is not None or corr is not None:
parts.append(
'<table style="width:100%;border-collapse:separate;'
'border-spacing:6px;margin-bottom:10px"><tr>'
)
if compl is not None:
c = _color_for(int(compl))
parts.append(
f'<td style="width:33%;padding:8px 10px;background:#f8fafc;'
f'border-radius:5px;border-left:3px solid {c}">'
f'<div style="font-size:10px;color:#64748b;text-transform:uppercase">'
f'Vollstaendigkeit</div>'
f'<div style="font-size:18px;font-weight:700;color:{c}">{compl}%</div>'
f'</td>'
)
if corr is not None:
c = _color_for(int(corr))
parts.append(
f'<td style="width:33%;padding:8px 10px;background:#f8fafc;'
f'border-radius:5px;border-left:3px solid {c}">'
f'<div style="font-size:10px;color:#64748b;text-transform:uppercase">'
f'Korrektheit</div>'
f'<div style="font-size:18px;font-weight:700;color:{c}">{corr}%</div>'
f'</td>'
)
viol_c = ("#dc2626" if n_critical + n_high > 0 else
"#d97706" if (summary.get("total_violations") or 0) > 0 else
"#16a34a")
parts.append(
f'<td style="width:33%;padding:8px 10px;background:#f8fafc;'
f'border-radius:5px;border-left:3px solid {viol_c}">'
f'<div style="font-size:10px;color:#64748b;text-transform:uppercase">'
f'Verstoesse</div>'
f'<div style="font-size:18px;font-weight:700;color:{viol_c}">'
f'{summary.get("total_violations", 0)}'
f'<span style="font-size:11px;color:#64748b;margin-left:6px">'
f'(crit:{n_critical} high:{n_high})</span></div></td>'
)
parts.append('</tr></table>')
# 2) Three-phase table
phases = banner_result.get("phases") or {}
if phases:
parts.append(
'<div style="font-size:11px;color:#475569;margin:8px 0 4px;'
'font-weight:600">Cookie-Setzungen pro Phase '
'(echter Browser-Test):</div>'
'<table style="width:100%;border-collapse:collapse;font-size:11px;'
'margin-bottom:10px;border:1px solid #e2e8f0">'
'<thead><tr style="background:#f1f5f9;color:#475569;text-align:left">'
'<th style="padding:5px 8px">Phase</th>'
'<th style="padding:5px 8px;text-align:center">Cookies</th>'
'<th style="padding:5px 8px;text-align:center">Tracker</th>'
'<th style="padding:5px 8px">Auffaelligkeiten</th>'
'</tr></thead><tbody>'
)
for key in ("before_consent", "after_reject", "after_accept"):
ph = phases.get(key) or {}
if not isinstance(ph, dict): continue
cookies = ph.get("cookies") or []
trackers = ph.get("tracking_services") or []
new_track = ph.get("new_tracking") or []
violations = ph.get("violations") or []
undoc = ph.get("undocumented") or []
color = _phase_color(key, len(cookies))
issues_parts = []
if violations: issues_parts.append(f"{len(violations)} Verstoss")
if new_track: issues_parts.append(f"{len(new_track)} neue Tracker")
if undoc: issues_parts.append(f"{len(undoc)} undokumentiert")
issues_str = ", ".join(issues_parts) or "–"
parts.append(
f'<tr style="border-top:1px solid #e2e8f0">'
f'<td style="padding:5px 8px;color:#1e293b;font-weight:600">'
f'<span style="display:inline-block;width:6px;height:6px;'
f'border-radius:50%;background:{color};margin-right:6px"></span>'
f'{_short_phase_label(key)}</td>'
f'<td style="padding:5px 8px;text-align:center;color:{color};'
f'font-weight:600">{len(cookies)}</td>'
f'<td style="padding:5px 8px;text-align:center">{len(trackers)}</td>'
f'<td style="padding:5px 8px;color:#475569">{issues_str}</td>'
f'</tr>'
)
parts.append('</tbody></table>')
# 3) Per-category trackers
cats = banner_result.get("category_tests") or []
if cats:
non_essential = [c for c in cats if c.get("category") != "necessary"]
if non_essential:
parts.append(
'<div style="font-size:11px;color:#475569;margin:8px 0 4px;'
'font-weight:600">Provider-Listing pro Banner-Kategorie:</div>'
'<table style="width:100%;border-collapse:collapse;font-size:11px;'
'margin-bottom:10px;border:1px solid #e2e8f0">'
'<thead><tr style="background:#f1f5f9;color:#475569;text-align:left">'
'<th style="padding:5px 8px">Kategorie</th>'
'<th style="padding:5px 8px;text-align:center">Anbieter</th>'
'<th style="padding:5px 8px">Hinweis</th>'
'</tr></thead><tbody>'
)
for c in non_essential:
n = len(c.get("tracking_services") or [])
label = c.get("category_label") or c.get("category", "?")
pdv = c.get("provider_details_visible")
# P19: real signal from the click-through test
if pdv is False:
color, hint = "#dc2626", ("Banner zeigt KEINE Provider-"
"Details — keine informierte Einwilligung")
elif pdv is True:
color, hint = "#16a34a", ""
elif n == 0:
color, hint = "#d97706", ("Keine Anbieter erkannt (vermutlich "
"kein Provider-Listing im Banner)")
else:
color, hint = "#16a34a", ""
parts.append(
f'<tr style="border-top:1px solid #e2e8f0">'
f'<td style="padding:5px 8px">{label}</td>'
f'<td style="padding:5px 8px;text-align:center;color:{color};'
f'font-weight:600">{n}</td>'
f'<td style="padding:5px 8px;color:#dc2626;font-size:10px">'
f'{hint}</td></tr>'
)
parts.append('</tbody></table>')
# 4) Violations with legal basis
violations = (banner_result.get("banner_checks") or {}).get("violations", [])
if violations:
parts.append(
'<div style="font-size:11px;color:#475569;margin:8px 0 4px;'
'font-weight:600">Erkannte Banner-Verstoesse:</div>'
'<ul style="margin:0 0 8px 18px;padding:0;font-size:11px;color:#1e293b">'
)
for v in violations[:8]:
sev = (v.get("severity") or "MEDIUM").upper()
sev_c = ("#dc2626" if sev in ("CRITICAL", "HIGH") else
"#d97706" if sev == "MEDIUM" else "#94a3b8")
parts.append(
f'<li style="margin-bottom:6px">'
f'<span style="display:inline-block;background:{sev_c};color:#fff;'
f'font-size:9px;padding:1px 5px;border-radius:3px;margin-right:6px">'
f'{sev}</span>{v.get("text", "")[:200]}'
f'<div style="font-size:10px;color:#94a3b8;margin-top:2px;'
f'font-style:italic">Quelle: {v.get("legal_ref", "")}</div></li>'
)
parts.append('</ul>')
# 5) P59b: cookie behavior findings (declared vs. actual)
cb_findings = banner_result.get("cookie_behavior_findings") or []
if cb_findings:
parts.append(
'<div style="margin:14px 0 4px;padding:8px 12px;'
'background:#fef9e7;border-left:3px solid #d97706;border-radius:4px">'
'<div style="font-size:12px;color:#92400e;font-weight:600;'
'margin-bottom:6px">Cookie-Verhaltens-Check '
'(P59 — deklarierter Zweck vs. tatsaechliches Verhalten)</div>'
'<ul style="margin:0 0 0 18px;padding:0;font-size:11px;color:#1e293b">'
)
for f in cb_findings[:20]:
sev = (f.get("severity") or "MEDIUM").upper()
sev_c = ("#dc2626" if sev in ("CRITICAL", "HIGH") else
"#d97706" if sev == "MEDIUM" else "#94a3b8")
cname = f.get("cookie_name", "?")
parts.append(
f'<li style="margin-bottom:6px">'
f'<span style="display:inline-block;background:{sev_c};color:#fff;'
f'font-size:9px;padding:1px 5px;border-radius:3px;margin-right:6px">'
f'{sev}</span><code style="font-size:10px;background:#f1f5f9;'
f'padding:1px 4px;border-radius:2px">{cname}</code>: '
f'{f.get("text", "")[:280]}'
f'<div style="font-size:10px;color:#94a3b8;margin-top:2px;'
f'font-style:italic">Quelle: {f.get("legal_ref", "")} · '
f'Layer {f.get("layer", "?")}</div></li>'
)
parts.append('</ul></div>')
# 6) P61: implicitly bundled cookies/vendors (vendor package)
impl_findings = banner_result.get("implicit_vendor_findings") or []
if impl_findings:
# Grouped by primary_vendor: for each primary, the items loaded along with it
by_primary: dict[str, list[dict]] = {}
for f in impl_findings:
by_primary.setdefault(f["primary_vendor"], []).append(f["implicit"])
parts.append(
'<div style="margin:14px 0 4px;padding:8px 12px;'
'background:#fef3c7;border-left:3px solid #d97706;border-radius:4px">'
'<div style="font-size:12px;color:#92400e;font-weight:600;'
'margin-bottom:6px">Untergeschobene Cookies / Vendors '
'(P61 — mit Hauptanbieter automatisch mitgeladen)</div>'
'<div style="font-size:10px;color:#92400e;margin-bottom:8px">'
'Diese Cookies/Vendors kommen automatisch mit dem deklarierten '
'Hauptanbieter mit — Marketing-Manager waehlen sie oft nicht '
'bewusst aus, sie sind aber zustimmungspflichtig.</div>'
)
for primary, impls in by_primary.items():
parts.append(
f'<div style="font-size:11px;color:#1e293b;margin:6px 0">'
f'<strong>{primary}</strong> bringt automatisch:</div>'
'<ul style="margin:0 0 8px 18px;padding:0;font-size:11px;color:#1e293b">'
)
for impl in impls:
tag = ('<span style="font-size:9px;background:#dc2626;color:#fff;'
'padding:1px 5px;border-radius:3px;margin-right:6px">'
'COOKIE</span>' if impl["type"] == "cookie" else
'<span style="font-size:9px;background:#7c3aed;color:#fff;'
'padding:1px 5px;border-radius:3px;margin-right:6px">'
'VENDOR</span>')
cat_color = {"marketing": "#dc2626", "statistics": "#d97706",
"functional": "#0891b2", "essential": "#16a34a"}.get(
impl.get("category", ""), "#475569")
parts.append(
f'<li style="margin-bottom:5px">{tag}'
f'<code style="font-size:10px;background:#f1f5f9;'
f'padding:1px 4px;border-radius:2px">{impl["name"]}</code> '
f'<span style="font-size:9px;color:{cat_color};'
f'margin-left:4px">[{impl.get("category","?")}]</span>'
f'<div style="font-size:10px;color:#475569;margin-top:2px">'
f'{impl.get("why","")[:240]}</div>'
f'<div style="font-size:9px;color:#94a3b8;font-style:italic">'
f'Quelle: <a href="{impl.get("source_url","")}" '
f'style="color:#94a3b8">{impl.get("source_url","")[:80]}</a>'
f'</div></li>'
)
parts.append('</ul>')
parts.append('</div>')
parts.append('</div>')
return "".join(parts)
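The renderer above interpolates values such as `v.get("text")`, `cname`, and `source_url` straight into the markup. If any of these can contain characters like `<` or `"` (an assumption; the consent-tester may already sanitize its output), escaping at the interpolation point keeps the email markup well-formed. Below is a minimal sketch using the standard library's `html.escape`. The helper names `esc` and `render_violation_item` are hypothetical and not part of this change.

```python
from __future__ import annotations

from html import escape


def esc(value: object, limit: int | None = None) -> str:
    """Stringify, optionally truncate, then HTML-escape a value (None becomes '')."""
    text = "" if value is None else str(value)
    if limit is not None:
        # Truncate before escaping so an entity such as &amp; is never cut in half.
        text = text[:limit]
    return escape(text, quote=True)


def render_violation_item(v: dict) -> str:
    """Render one <li> with escaped violation text and legal reference."""
    return (
        '<li style="margin-bottom:6px">'
        f'{esc(v.get("text"), 200)}'
        f'<div style="font-size:10px">Quelle: {esc(v.get("legal_ref"))}</div>'
        '</li>'
    )
```

The helper also tolerates an explicit `None`, which `v.get("text", "")[:200]` would not.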
@@ -0,0 +1,249 @@
"""
P18: critical-findings block for the executive summary.
Analyzes the real data (banner_checks, phases, scorecard, results) and
renders a RED immediate-action block AT THE VERY TOP of the email, with
source references (DSK, EDPB, CJEU, fines imposed by authorities) and concrete
immediate actions.
Rule: the block is rendered only when real critical violations exist.
For clean sites it is omitted.
"""
from __future__ import annotations
def _truncate_words(text: str, max_chars: int) -> str:
"""P65: Truncate at word boundary, never mid-word."""
if not text or len(text) <= max_chars:
return text
cut = text[:max_chars]
last_space = cut.rfind(" ")
if last_space > max_chars // 2:
cut = cut[:last_space]
return cut.rstrip(",;:.") + "…"
# Known precedent fines, used as source hints
_BUSSGELD_REFS = {
"no_provider_per_category": "CNIL France 2023 — TikTok 5 Mio EUR (fehlende Vendor-Transparenz)",
"dse_unvollstaendig": "BayLDA 2024 — diverse Mittelstand-Faelle, 5k–50k EUR",
"cookie_doc_missing": "LfDI BW 2023 — fehlende Cookie-Erklaerung, 30k EUR",
"dark_pattern_reject": "EDPB Guidelines 3/2022 + DSK 2024 — Bussgeldrahmen Art. 83 DSGVO",
"schrems_ii": "EuGH C-311/18 (Schrems II) — Bussgeldrahmen bis 4% Konzern-Umsatz",
"impressum_im_banner": "LG Rostock 3 O 22/19 — Impressum-Pflicht ueberlagernder Banner",
}
def _detect_critical_issues(
banner_result: dict | None,
scorecard: dict | None,
results: list,
) -> list[dict]:
"""Erkenne kritische Verstoesse aus den vorliegenden Daten."""
issues: list[dict] = []
br = banner_result or {}
sc = scorecard or {}
# 1) Banner violations (HIGH/CRITICAL) from the consent-tester
for v in (br.get("banner_checks") or {}).get("violations", []):
sev = (v.get("severity") or "").upper()
if sev in ("CRITICAL", "HIGH"):
issues.append({
"key": "banner_violation",
"title": _truncate_words(v.get("text", ""), 260),
"severity": sev,
"action": _action_for_banner_violation(v),
"source": v.get("legal_ref", ""),
"bussgeld": _BUSSGELD_REFS.get("impressum_im_banner")
if "impressum" in (v.get("text") or "").lower()
else _BUSSGELD_REFS.get("dark_pattern_reject"),
})
# 2) Category tests: the banner shows no provider details per category.
# Prefers the real signal from the click-through test (P19):
# provider_details_visible. Fallback: empty tracking_services.
cat_tests = br.get("category_tests") or []
cats_without_details = [
c for c in cat_tests
if c.get("category") != "necessary"
and (c.get("provider_details_visible") is False
or (c.get("provider_details_visible") is None
and not c.get("tracking_services")))
]
if cats_without_details and len(cat_tests) >= 2:
cats = ", ".join(c.get("category_label", c.get("category", "?"))
for c in cats_without_details)
issues.append({
"key": "no_provider_per_category",
"title": f"Cookie-Banner: Kategorien ({cats}) zeigen keine "
f"Provider-/Cookie-Details",
"severity": "HIGH",
"action": ("Pro Banner-Kategorie eine Liste der eingebundenen "
"Anbieter + Cookie-Details (Name, Zweck, Speicherdauer, "
"Drittlandtransfer) sichtbar machen — am besten als "
"ausklappbares Detail-Panel. Sonst ist die "
"Einwilligung nicht 'informiert' nach Art. 7 DSGVO "
"und gilt als unwirksam."),
"source": "Art. 7 Abs. 1 DSGVO, EDPB Guidelines 2/2023, DSK 2024",
"bussgeld": _BUSSGELD_REFS["no_provider_per_category"],
})
# 3) DSGVO/TDDDG score < 30%: privacy policy (DSE) unlawful
pct = int((sc.get("totals") or {}).get("pct", 100))
if pct and pct < 30:
issues.append({
"key": "dse_unvollstaendig",
"title": f"Datenschutzerklaerung erfuellt nur {pct}% der Pflichten",
"severity": "HIGH",
"action": ("Vollstaendig nach Art. 13 DSGVO ueberarbeiten: "
"Verantwortlicher, Zwecke, Rechtsgrundlage, "
"Speicherdauer, Drittland-Transfers, alle Betroffenen-"
"rechte, konkrete Aufsichtsbehoerde."),
"source": "Art. 13 DSGVO + Art. 14 (alternativ), DSK-OH Telemedien 2024",
"bussgeld": _BUSSGELD_REFS["dse_unvollstaendig"],
})
# 4) Cookie policy missing entirely (not reachable)
cookie_missing = any(
(r.doc_type == "cookie" if hasattr(r, "doc_type") else
r.get("doc_type") == "cookie")
and ((r.error if hasattr(r, "error") else r.get("error", "")) or "")
.startswith("Auf der Website nicht gefunden")
for r in (results or [])
)
cookie_deduped = any(
(r.doc_type == "cookie" if hasattr(r, "doc_type") else
r.get("doc_type") == "cookie")
and "Nicht separat vorhanden" in
((r.error if hasattr(r, "error") else r.get("error", "")) or "")
for r in (results or [])
)
if cookie_missing or cookie_deduped:
issues.append({
"key": "cookie_doc_missing",
"title": ("Keine eigenstaendige Cookie-Richtlinie"
if cookie_deduped
else "Cookie-Richtlinie nicht auffindbar"),
"severity": "HIGH",
"action": ("Separate Cookie-Richtlinie-Seite erstellen mit "
"tabellarischer Auflistung aller Cookies (Name, "
"Anbieter, Zweck, Speicherdauer, Drittlandtransfer). "
"Direkt aus dem Banner verlinken."),
"source": "Art. 13 DSGVO, §25 TDDDG, DSK-OH Telemedien 2024",
"bussgeld": _BUSSGELD_REFS["cookie_doc_missing"],
})
# 5) Schrems II risk: US providers (Google/Meta/Microsoft etc.) are loaded.
# Detection: tracking_services in any phase name a US tracker. Whether
# SCC/DPF safeguards exist is not checked here.
phases = br.get("phases") or {}
has_us_tracker = False
for ph in phases.values():
if not isinstance(ph, dict):
continue
for t in (ph.get("tracking_services") or []):
if isinstance(t, dict):
name = (t.get("name", "") or "").lower()
else:
name = str(t).lower()
if any(w in name for w in ("google", "meta", "facebook",
"microsoft", "linkedin", "tiktok")):
has_us_tracker = True
break
if has_us_tracker:
issues.append({
"key": "schrems_ii",
"title": "US-Tracker geladen — Schrems-II-Risiko",
"severity": "HIGH",
"action": ("Pro Drittland-Anbieter dokumentieren: SCC (Art. 46 "
"DSGVO) ODER DPF-Zertifizierung pruefen + in der "
"Datenschutzerklaerung explizit benennen."),
"source": "Art. 44 ff. DSGVO, EuGH C-311/18 (Schrems II)",
"bussgeld": _BUSSGELD_REFS["schrems_ii"],
})
return issues
def _action_for_banner_violation(v: dict) -> str:
text = (v.get("text") or "").lower()
if "impressum" in text:
return ("Impressum-Link direkt im Banner ergaenzen — bei "
"ueberlagerndem Banner Pflicht nach §5 TMG.")
if "ablehnen" in text or "dark pattern" in text:
return ("'Ablehnen'-Button visuell gleichwertig zu 'Akzeptieren' "
"gestalten (gleiche Groesse, Farbe, Position).")
if "widerruf" in text or "cookie-einstellungen" in text:
return ("Floating-Icon oder Footer-Link 'Cookie-Einstellungen' "
"permanent einblenden — Widerruf so einfach wie Erteilung.")
return ("Banner-Verstoss beheben gemaess der genannten Rechtsgrundlage.")
def build_critical_findings_html(
banner_result: dict | None,
scorecard: dict | None,
results: list,
) -> str:
"""Render der Audit-Zusammenfassung fuer die Geschaeftsfuehrung.
P89: co-pilot tone instead of alarm red.
- Factual blue instead of alarmist red
- "Topics that should be discussed" instead of "VIOLATIONS"
- Realistic time estimate (4-8 weeks)
- Fine risk in a separate, low-key section at the very bottom
- Confidence note: "false positives possible"
"""
issues = _detect_critical_issues(banner_result, scorecard, results)
if not issues:
return ""
items = []
for idx, i in enumerate(issues, 1):
# P87 preparation: no more HIGH badges; the topics are numbered instead
items.append(
f'<div style="margin-bottom:10px;padding:10px 14px;'
f'background:#fff;border-radius:6px;'
f'border-left:3px solid #2563eb">'
f'<div style="font-size:13px;font-weight:600;color:#1e293b;'
f'margin-bottom:4px">'
f'<span style="display:inline-block;background:#dbeafe;color:#1e40af;'
f'padding:1px 8px;border-radius:10px;font-size:10px;'
f'margin-right:8px;font-weight:600">Thema {idx}</span>'
f'{i["title"]}</div>'
f'<div style="font-size:11px;color:#475569;margin-top:6px">'
f'<strong>Empfehlung:</strong> {i["action"]}</div>'
f'<div style="font-size:10px;color:#94a3b8;margin-top:4px;'
f'font-style:italic">Hintergrund: {i.get("source","")}</div>'
f'</div>'
)
n = len(issues)
plural = "Themen" if n != 1 else "Thema"
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:700px;margin:0 auto 18px;padding:18px 22px;'
'background:#f0f9ff;border:1px solid #bfdbfe;border-radius:10px">'
'<div style="font-size:11px;color:#1e40af;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Zusammenfassung fuer die Geschaeftsfuehrung</div>'
f'<h2 style="margin:0 0 8px;font-size:18px;color:#1e293b">'
f'{n} {plural} zur Besprechung mit DSB, Marketing und Entwicklung</h2>'
'<p style="margin:0 0 14px;font-size:12px;color:#475569;line-height:1.5">'
'Wir haben Datenschutzerklaerung, Cookie-Banner, Impressum und '
'eingebundene Anbieter technisch analysiert. Die folgenden Punkte '
'sollten in den naechsten Wochen geklaert werden &mdash; typische '
'Umsetzungsdauer 4-8 Wochen (DSB-Review &rarr; Marketing-Agentur '
'&rarr; Entwicklung &rarr; Freigabe). Detaillierte technische '
'Analyse mit weiteren Findings finden Sie unten.</p>'
+ "".join(items) +
'<div style="margin-top:14px;padding:10px 12px;background:#f1f5f9;'
'border-radius:6px;font-size:10px;color:#64748b;line-height:1.5">'
'<strong style="color:#475569">Hinweis:</strong> Automatisierte '
'Audits enthalten False-Positives. Wo unsicher, bitte mit DSB pruefen '
'oder uns Feedback geben &mdash; wir lernen daraus. '
'Rechtliche Risiken (Bussgeld-Rahmen Art. 83 DSGVO bis 4&nbsp;% des '
'weltweiten Jahresumsatzes, realistisch 0,1-1&nbsp;% bei Erstverstoss '
'nach CNIL/LfDI-Massstab) werden weiter unten pro Finding eingeordnet.'
'</div>'
'</div>'
)
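`_detect_critical_issues` reads `doc_type` and `error` from entries that may be either objects or dicts, and repeats the `hasattr` branch for every access. One way to express that once is a small accessor. This is a sketch under the assumption that entries are either plain dicts or attribute-bearing objects; the names `field`, `cookie_doc_status`, and `Result` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


def field(item: Any, name: str, default: Any = "") -> Any:
    """Read `name` from a dict or an object, falling back to `default`."""
    if isinstance(item, dict):
        value = item.get(name, default)
    else:
        value = getattr(item, name, default)
    return default if value is None else value


def cookie_doc_status(results: list | None) -> tuple[bool, bool]:
    """Return (missing, deduped) for cookie documents, mirroring the two checks above."""
    missing = deduped = False
    for r in results or []:
        if field(r, "doc_type") != "cookie":
            continue
        error = field(r, "error")
        missing = missing or error.startswith("Auf der Website nicht gefunden")
        deduped = deduped or "Nicht separat vorhanden" in error
    return missing, deduped


@dataclass
class Result:  # stand-in for the real result type, for demonstration only
    doc_type: str
    error: str | None = None
```

A single pass over `results` also replaces the two separate `any(...)` scans, at the cost of not short-circuiting.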
@@ -26,6 +26,47 @@ def _fmt_eur_range(low: int, high: int) -> str:
return f"{low:,}–{high:,}".replace(",", ".")
def _build_score_band_block(pct: int, color: str) -> list[str]:
"""P34 — eine Zeile unter den KPIs: Score-Einordnung."""
band, hint = _score_band_explanation(pct)
return [
f'<div style="margin-top:10px;padding:10px 14px;'
f'background:rgba(255,255,255,0.04);border-left:3px solid {color};'
f'border-radius:4px">'
f'<div style="font-size:11px;color:#cbd5e1">'
f'<strong style="color:{color}">{band} ({pct}%)</strong> — {hint}'
f'</div></div>',
]
def _score_band_explanation(pct: int) -> tuple[str, str]:
"""P34 — Was bedeutet der Score: wo MUESSTE man stehen.
Returns (label, what_to_expect)."""
if pct >= 85:
return (
"Sehr gut", "Praxis-uebliche DSGVO-Risikolage. "
"Standard-Pflege reicht — jaehrliche Pruefung empfohlen.",
)
if pct >= 70:
return (
"Akzeptabel", "Branchen-Median. Verbleibende Findings sind "
"meist Formalia — Empfehlung: einmaliges Aufraeumen, dann "
"Halbjahres-Check.",
)
if pct >= 50:
return (
"Handlungsbedarf", "Mehrere wesentliche Themen offen. "
"Empfehlung: priorisierte Abarbeitung der HIGH-Findings "
"binnen 4-8 Wochen mit DSB + Web-Team.",
)
return (
"Erhoehtes Risiko", "Mehrere Kern-Pflichten fehlen oder sind "
"veraltet. Empfehlung: kurzfristiger Termin mit DSB / Rechtsabteilung "
"und Web-Team zur Priorisierung.",
)
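The thresholds in `_score_band_explanation` are inclusive lower bounds (85, 70, 50), with everything below 50 falling into the last band. The boundary behavior can be sketched with a table-driven lookup. This is an illustration only: `SCORE_BANDS`, `FALLBACK_LABEL`, and `score_band_label` are hypothetical names, and only the labels are copied from the function above.

```python
from __future__ import annotations

# (inclusive lower bound, label), highest band first; labels copied from the diff.
SCORE_BANDS: list[tuple[int, str]] = [
    (85, "Sehr gut"),
    (70, "Akzeptabel"),
    (50, "Handlungsbedarf"),
]
FALLBACK_LABEL = "Erhoehtes Risiko"


def score_band_label(pct: int) -> str:
    """Return the label of the first band whose lower bound `pct` reaches."""
    for lower_bound, label in SCORE_BANDS:
        if pct >= lower_bound:
            return label
    return FALLBACK_LABEL
```

With this layout a score of exactly 70 is labeled "Akzeptabel", while 69 already falls into "Handlungsbedarf".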
def build_exec_summary_html(
scorecard: dict | None,
previous_scorecard: dict | None,
@@ -117,6 +158,9 @@ def build_exec_summary_html(
'</table>',
# P34: score classification ("who should stand where")
*(_build_score_band_block(pct, score_color) if scorecard else []),
# CTAs
'<div style="margin-top:14px;padding-top:12px;border-top:1px solid '
'rgba(255,255,255,0.1);text-align:center">',
@@ -234,255 +234,9 @@ def _category_label(kat: str) -> str:
}.get(kat, kat or "")
def build_vvt_table_html(vendors: list[dict]) -> str:
"""Render the per-vendor VVT-style table for the email report.
# The VVT table (grouped, plus the P60/P60b pattern notice) was moved to
# vvt_table_renderer.py so that this file stays under the
# 500-LOC hard cap. Re-exported so that existing callers (e.g.
# agent_compliance_check_routes) keep working unchanged.
from compliance.api.vvt_table_renderer import build_vvt_table_html # noqa: E402,F401
Splits vendors into 3-4 sections by recipient_type (Art. 30(1)(d)
DSGVO):
1. INTERNAL: own departments / own systems
2. GROUP_COMPANY: parent/subsidiary (if any)
3. PROCESSOR: data processors (Auftragsverarbeiter, AVV required)
4. CONTROLLER: joint/independent controllers (Meta, Google,
LinkedIn; they build their own profiles)
5. AUTHORITY / OTHER: rest
Within each section: rows sorted by compliance_score ascending so
the weakest entries surface first.
"""
if not vendors:
return ""
# Import here to avoid pulling backend service deps at module load
from compliance.services.vendor_classifier import RECIPIENT_TYPE_SECTIONS
# Bucket vendors by recipient_type
by_type: dict[str, list[dict]] = {}
for v in vendors:
rt = (v.get("recipient_type") or "OTHER").upper()
by_type.setdefault(rt, []).append(v)
# Top summary
n_total = len(vendors)
n_internal = sum(1 for v in vendors
if (v.get("recipient_type") or "").upper()
in ("INTERNAL", "GROUP_COMPANY"))
n_external = n_total - n_internal
n_critical = sum(1 for v in vendors if v.get("compliance_score", 0) < 50)
summary_parts = [f"{n_total} Verarbeitungen erfasst"]
if n_internal and n_external:
summary_parts.append(
f"&mdash; {n_internal} eigene + {n_external} externe Empfaenger"
)
if n_critical:
summary_parts.append(
f', <strong style="color:#dc2626">{n_critical} unter 50%</strong>'
)
else:
summary_parts.append("&mdash; alle ueber 50%")
summary = " ".join(summary_parts)
out: list[str] = [
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:12px 16px;'
'background:#fafafa;border:1px solid #e5e7eb;border-radius:8px">',
'<h3 style="margin:0 0 4px;font-size:14px;color:#334155">'
'VVT-Vorschlag: Verarbeitungstaetigkeiten und Empfaenger aus der '
'Cookie-Richtlinie</h3>',
f'<p style="margin:0 0 10px;font-size:11px;color:#6b7280">{summary}. '
'Gruppiert nach Empfaengerkategorie (Art. 30(1)(d) DSGVO). Innerhalb '
'jeder Gruppe nach Compliance-Score sortiert. Bei eigenen '
'Verarbeitungen (INTERNAL/GROUP) werden Opt-Out und Privacy-Link '
'NICHT als Pflicht gewertet &mdash; der Widerruf erfolgt ueber das '
'Cookie-Banner, Privacy ist in der Haupt-DSI dokumentiert.</p>',
]
for rtype, section_label in RECIPIENT_TYPE_SECTIONS:
rows = by_type.get(rtype) or []
if not rows:
continue
rows = sorted(rows, key=lambda v: v.get("compliance_score", 0))
n = len(rows)
n_bad = sum(1 for v in rows if v.get("compliance_score", 0) < 50)
bad_hint = (f' <span style="color:#dc2626">({n_bad} unter 50%)</span>'
if n_bad else "")
out.append(
f'<h4 style="margin:14px 0 4px;font-size:12px;color:#1e293b;'
f'border-top:1px solid #e2e8f0;padding-top:8px">'
f'{section_label} <span style="color:#94a3b8;font-weight:400">'
f'({n}){bad_hint}</span></h4>'
)
out.append(_render_vendor_section(rows))
out.append('</div>')
return "".join(out)
def _render_vendor_section(rows: list[dict]) -> str:
body: list[str] = [
'<table style="width:100%;border-collapse:collapse;font-size:11px">'
'<thead><tr style="background:#f1f5f9;color:#475569;text-align:left">'
'<th style="padding:5px 8px">Name</th>'
'<th style="padding:5px 8px">Kategorie</th>'
'<th style="padding:5px 8px">Sitz</th>'
'<th style="padding:5px 8px;text-align:center">Cookies</th>'
'<th style="padding:5px 8px;text-align:center">Opt-Out</th>'
'<th style="padding:5px 8px;text-align:center">Privacy</th>'
'<th style="padding:5px 8px;text-align:right">Score</th>'
'</tr></thead><tbody>',
]
for v in rows:
body.append(_render_vendor_row_full(v))
body.append('</tbody></table>')
return "".join(body)
def _render_vendor_row_full(v: dict) -> str:
rtype = (v.get("recipient_type") or "OTHER").upper()
is_own = rtype in ("INTERNAL", "GROUP_COMPANY")
cat = (v.get("category") or "").lower()
is_necessary = cat in ("necessary", "strictlynecessary")
name = v.get("name") or "Unbekannt"
category = _category_label(v.get("category", ""))
country = v.get("country") or ("" if is_own else "")
cookies = v.get("cookies") or []
n_cookies = len(cookies)
score = int(v.get("compliance_score", 0))
flags = v.get("compliance_flags") or []
# Opt-out: not required for own processing or for
# technically necessary cookies (§25 Abs. 2 TDDDG).
opt_na_reason = ("Nicht erforderlich (eigene Verarbeitung — "
"Widerruf ueber Cookie-Banner)") if is_own else (
"Nicht erforderlich (§25 Abs. 2 TDDDG — technisch notwendig)"
if is_necessary else None
)
opt_status = _link_status_badge(
v.get("opt_out_url"), v.get("opt_out_ok"), v.get("opt_out_status"),
na_label=opt_na_reason,
)
# Privacy: not required for own processing (covered by the main privacy notice).
privacy_na_reason = (
"Nicht erforderlich (eigene Verarbeitung — durch Haupt-DSI abgedeckt)"
if is_own else None
)
privacy_status = _link_status_badge(
v.get("privacy_policy_url"), v.get("privacy_ok"),
v.get("privacy_status"), na_label=privacy_na_reason,
)
score_color = ("#16a34a" if score >= 80 else
"#d97706" if score >= 50 else "#dc2626")
# Score explanation: what was evaluated and what is missing.
# Assumption: score = passed criteria / total criteria * 100.
# Typically 5 criteria for external recipients: country, cookies, opt_out, privacy, scoring.
# For INTERNAL/GROUP: opt_out + privacy are not evaluated (3 criteria).
n_criteria = 3 if is_own else 5
n_failed = len(flags) if flags else 0
score_tooltip = (
f"{n_criteria - n_failed} von {n_criteria} Kriterien erfuellt"
+ (f" — fehlt: {', '.join(_flag_short(f) for f in flags[:3])}"
if flags else "")
)
# Inline action instructions per flag
actions_html = ""
if flags:
from compliance.services.finding_action_recipes import recipe_for
action_items = []
for f in flags:
rec = recipe_for(f)
if not rec:
continue
action_items.append(
f'<li style="margin-bottom:6px"><strong>{_flag_short(f)}:</strong> '
f'{rec.get("what", "")}<br/>'
f'<span style="color:#475569"><strong>Was tun:</strong> '
f'{rec.get("fix_text", "").splitlines()[0][:200]}</span><br/>'
f'<span style="color:#94a3b8;font-size:9px">Quelle: '
f'{rec.get("why", "")[:160]}</span></li>'
)
if action_items:
actions_html = (
f'<details style="margin-top:4px"><summary style="cursor:pointer;'
f'color:#dc2626;font-size:10px">Was muss ich tun? '
f'({len(action_items)} Action{"s" if len(action_items) != 1 else ""})</summary>'
f'<ul style="margin:4px 0 0 14px;padding:0;font-size:10px;color:#1e293b">'
+ "".join(action_items)
+ '</ul></details>'
)
flag_str = ""
if flags:
flag_str = (
f'<div style="font-size:10px;color:#94a3b8;margin-top:2px">'
f'{", ".join(flags[:4])}</div>'
f'{actions_html}'
)
risk = v.get("compliance_risk") or {}
risk_label = risk.get("label") or ""
risk_badge = ""
if risk_label and risk_label != "unklar":
rc = {"kritisch": ("#dc2626", "#fff"), "hoch": ("#fecaca", "#991b1b"),
"mittel": ("#fde68a", "#92400e"), "gering": ("#d1fae5", "#065f46")}.get(risk_label, ("#e5e7eb", "#475569"))
risk_badge = (f'<span style="margin-left:6px;padding:1px 5px;border-radius:3px;font-size:9px;'
f'background:{rc[0]};color:{rc[1]}">Risk: {risk_label}</span>')
return (
f'<tr style="border-top:1px solid #e2e8f0">'
f'<td style="padding:6px 8px;color:#1e293b;font-size:11px">'
f'{name}{risk_badge}{flag_str}</td>'
f'<td style="padding:6px 8px;color:#475569;font-size:11px">{category}</td>'
f'<td style="padding:6px 8px;color:#475569;font-size:11px">{country}</td>'
f'<td style="padding:6px 8px;text-align:center;color:#475569;font-size:11px">'
f'{n_cookies}</td>'
f'<td style="padding:6px 8px;text-align:center">{opt_status}</td>'
f'<td style="padding:6px 8px;text-align:center">{privacy_status}</td>'
f'<td style="padding:6px 8px;text-align:right;font-weight:600;'
f'color:{score_color};font-size:11px" title="{score_tooltip}">'
f'{score}%<div style="font-size:9px;font-weight:400;color:#94a3b8">'
f'{n_criteria - n_failed}/{n_criteria}</div></td>'
f'</tr>'
)
def _flag_short(f: str) -> str:
"""Lesbare deutsche Form fuer einen Flag-Token."""
labels = {
"no_cookies_listed": "Cookies fehlen",
"no_country": "Sitzland fehlt",
"no_privacy_url": "Privacy-Link fehlt",
"broken_privacy_url": "Privacy-Link broken",
"no_opt_out_url": "Opt-Out fehlt",
"broken_opt_out": "Opt-Out broken",
}
return labels.get(f, f)
def _link_status_badge(
url: str | None,
ok: bool | None,
status: int | None,
na_label: str | None = None,
) -> str:
"""Render the link-status cell.
- url + ok -> green check
- url + broken -> red cross with status
- no url + na_label -> neutral em-dash with explanation tooltip
(used for INTERNAL/necessary rows where the field isn't required)
- no url + no na_label -> red cross (real gap)
"""
if not url:
if na_label:
return ('<span style="color:#94a3b8;font-size:11px" '
f'title="{na_label}">&mdash;</span>')
return ('<span style="color:#dc2626;font-size:11px" '
'title="Kein Link">&#10007;</span>')
if ok:
return ('<span style="color:#16a34a;font-size:11px" '
f'title="HTTP {status}">&#10003;</span>')
status_str = str(status) if status else "?"
return ('<span style="color:#dc2626;font-size:11px" '
f'title="HTTP {status_str}">&#10007; ({status_str})</span>')
@@ -202,51 +202,13 @@ def build_management_summary(results: list[DocCheckResult]) -> str:
def _check_to_action(doc_label: str, check_label: str, hint: str) -> str:
"""Convert a failed check into a plain-language action item."""
# Map technical check labels to business-language actions
label_lower = check_label.lower()
"""Convert a failed check into a plain-language action item.
if "datenschutzbeauftragter" in label_lower or "dsb" in label_lower:
return (f"<strong>{doc_label}:</strong> Ihren Datenschutzbeauftragten "
f"mit Kontaktdaten erwaehnen. Pflicht ab 20 Mitarbeitern.")
if "beschwerderecht" in label_lower or "art. 77" in label_lower:
return (f"<strong>{doc_label}:</strong> Hinweis auf das Beschwerderecht "
f"bei der Aufsichtsbehoerde ergaenzen (Name + Kontakt der Behoerde).")
if "betroffenenrechte" in label_lower:
return (f"<strong>{doc_label}:</strong> Alle Betroffenenrechte "
f"(Auskunft, Berichtigung, Loeschung, etc.) einzeln auffuehren.")
if "verantwortlicher" in label_lower:
return (f"<strong>{doc_label}:</strong> Vollstaendige Firmenbezeichnung "
f"mit Rechtsform, Adresse, E-Mail und Telefon eintragen.")
if "interessenabwaegung" in label_lower:
return (f"<strong>{doc_label}:</strong> Bei 'berechtigtem Interesse' "
f"die Abwaegung dokumentieren. Aufgabe fuer den DSB/Rechtsanwalt.")
if "widerrufsbelehrung" in label_lower or "widerruf" in label_lower:
return (f"<strong>{doc_label}:</strong> Gesetzliche Widerrufsbelehrung "
f"mit 14-Tage-Frist und Musterformular bereitstellen.")
if "loeschkonzept" in label_lower:
return (f"<strong>{doc_label}:</strong> Loeschfristen und -prozess "
f"dokumentieren. Aufgabe fuer den DSB.")
if "profiling" in label_lower or "art. 22" in label_lower:
return (f"<strong>{doc_label}:</strong> Hinweis ergaenzen ob "
f"automatisierte Entscheidungen stattfinden oder nicht.")
if "nicht im eingereichten text" in label_lower:
return (f"<strong>{doc_label}:</strong> Das eingereichte Dokument "
f"enthaelt nicht den erwarteten Inhalt. Bitte korrekte URL pruefen.")
# Generic fallback
if hint and len(hint) < 150:
return f"<strong>{doc_label}:</strong> {hint[:120]}"
return f"<strong>{doc_label}:</strong> '{check_label}' muss ergaenzt werden."
Implementation lives in doc_action_mappings.check_to_action; it is kept here
as a thin wrapper so the report module stays under the 500-LOC cap.
"""
from compliance.api.doc_action_mappings import check_to_action
return check_to_action(doc_label, check_label, hint)
def build_html_report(
@@ -24,7 +24,7 @@ from compliance.services.unified_findings_store import (
findings_summary,
list_findings,
)
from compliance.services.compliance_audit_log import get_check_run
from compliance.services.compliance_audit_log import get_check_run, get_check_payload
logger = logging.getLogger(__name__)
@@ -102,3 +102,18 @@ def get_findings(
"count": 0,
"findings": [],
}
@router.get("/banner/{check_id}")
def get_banner_payload(check_id: str) -> dict:
"""P20: full banner_result (phases, structured_checks, category_tests,
banner_checks.violations) for the full-audit frontend.
"""
try:
payload = get_check_payload(check_id) or {}
banner = payload.get("banner") or {}
return {"found": bool(banner), "check_id": check_id, "banner": banner}
except Exception as e:
logger.exception("get_banner_payload failed for %s", check_id)
return {"found": False, "check_id": check_id,
"error": str(e)[:200], "banner": {}}
@@ -0,0 +1,102 @@
"""
Manager-friendly action texts for missing mandatory disclosures.
Extracted from agent_doc_check_report.py (LOC cap). Converts a
failed DocCheck into a short instruction that a managing director
without legal background can understand.
P66: Cookie-specific findings distinguish between the service purpose
(provider description such as "Akamai = bot protection") and the cookie purpose
(which cookie does what), a common mix-up among marketing managers.
"""
from __future__ import annotations
def _cookie_finding_action(doc_label: str, check_label: str) -> str | None:
"""P66: cookie-specific mappings."""
label_lower = check_label.lower()
if "zwecke der cookies" in label_lower or label_lower == "zwecke":
return (f"<strong>{doc_label}:</strong> Zwecke pro Cookie ergaenzen "
f"— nicht pro Anbieter. Service-Beschreibungen ('Akamai = "
f"Bot-Schutz') beantworten nicht, was das einzelne Cookie "
f"tut. Pflicht: pro Cookie (z.B. <code>_abck</code>) den "
f"konkreten Zweck angeben ('Bot-Detection-Token, gueltig "
f"24h'). DSK-OH Telemedien 2024 §3.2.")
if "speicherdauer" in label_lower:
return (f"<strong>{doc_label}:</strong> Speicherdauer pro Cookie "
f"angeben — nicht pauschal 'siehe Anbieter'. Pflicht: "
f"konkreter Wert (z.B. '_ga: 2 Jahre', '_gid: 24h', "
f"'PHPSESSID: Session'). Werte aus DevTools &gt; "
f"Application &gt; Cookies pruefen, Anbieter-Doku ist "
f"oft veraltet. Art. 13 Abs. 2 lit. a DSGVO.")
if "anbieter" in label_lower or "providers_named" in label_lower:
return (f"<strong>{doc_label}:</strong> Konkrete Firmen mit Sitz "
f"benennen — nicht 'Drittanbieter' oder 'Marketing-Partner'. "
f"Pflicht: voller Firmenname + Rechtsform + Land (z.B. "
f"'Google Ireland Limited, Dublin'). Art. 13 Abs. 1 lit. e "
f"DSGVO (Empfaenger-Pflicht).")
if "cookie-tabelle" in label_lower or "cookie_list" in label_lower:
return (f"<strong>{doc_label}:</strong> Tabellarische Cookie-Liste "
f"mit Name, Anbieter, Zweck und Speicherdauer ergaenzen. "
f"Reine Anbieter-Beschreibung ohne Cookie-Namen reicht "
f"nicht — Nutzer muss nachvollziehen, welches einzelne "
f"Cookie was tut. DSK-OH 2024.")
if "drittland" in label_lower or "schrems" in label_lower:
return (f"<strong>{doc_label}:</strong> Pro US-Anbieter (Google, "
f"Meta, AWS, Akamai) klaeren: SCC (Art. 46 DSGVO) oder "
f"DPF-Zertifizierung — und in der Cookie-Richtlinie "
f"explizit nennen. Pauschales 'Anbieter ausserhalb EU' "
f"reicht nicht. EuGH Schrems II.")
return None
def check_to_action(doc_label: str, check_label: str, hint: str) -> str:
"""Convert a failed check into a plain-language action item."""
label_lower = check_label.lower()
if "datenschutzbeauftragter" in label_lower or "dsb" in label_lower:
return (f"<strong>{doc_label}:</strong> Ihren Datenschutzbeauftragten "
f"mit Kontaktdaten erwaehnen. Pflicht ab 20 Mitarbeitern.")
if "beschwerderecht" in label_lower or "art. 77" in label_lower:
return (f"<strong>{doc_label}:</strong> Hinweis auf das Beschwerderecht "
f"bei der Aufsichtsbehoerde ergaenzen (Name + Kontakt der Behoerde).")
if "betroffenenrechte" in label_lower:
return (f"<strong>{doc_label}:</strong> Alle Betroffenenrechte "
f"(Auskunft, Berichtigung, Loeschung, etc.) einzeln auffuehren.")
if "verantwortlicher" in label_lower:
return (f"<strong>{doc_label}:</strong> Vollstaendige Firmenbezeichnung "
f"mit Rechtsform, Adresse, E-Mail und Telefon eintragen.")
if "interessenabwaegung" in label_lower:
return (f"<strong>{doc_label}:</strong> Bei 'berechtigtem Interesse' "
f"die Abwaegung dokumentieren. Aufgabe fuer den DSB/Rechtsanwalt.")
if "widerrufsbelehrung" in label_lower or "widerruf" in label_lower:
return (f"<strong>{doc_label}:</strong> Gesetzliche Widerrufsbelehrung "
f"mit 14-Tage-Frist und Musterformular bereitstellen.")
if "loeschkonzept" in label_lower:
return (f"<strong>{doc_label}:</strong> Loeschfristen und -prozess "
f"dokumentieren. Aufgabe fuer den DSB.")
if "profiling" in label_lower or "art. 22" in label_lower:
return (f"<strong>{doc_label}:</strong> Hinweis ergaenzen ob "
f"automatisierte Entscheidungen stattfinden oder nicht.")
if "nicht im eingereichten text" in label_lower:
return (f"<strong>{doc_label}:</strong> Das eingereichte Dokument "
f"enthaelt nicht den erwarteten Inhalt. Bitte korrekte URL pruefen.")
if any(w in label_lower for w in ("rechtswidrig", "illegal",
"haftungsausschluss", "disclaimer")):
return (f"<strong>{doc_label}:</strong> '{check_label}' muss entfernt "
f"werden (Anti-Pattern, rechtlich wirkungslos).")
mapped = _cookie_finding_action(doc_label, check_label)
if mapped:
return mapped
if hint and len(hint) < 300:
return f"<strong>{doc_label}:</strong> {hint[:280]}"
return f"<strong>{doc_label}:</strong> '{check_label}' muss ergaenzt werden."
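The lookup order in check_to_action (keyword rules first, then the hint, then a generic fallback) can be illustrated with a reduced, table-driven sketch. The names `RULES` and `check_to_action_sketch` are hypothetical and the rule table is shortened for illustration; this is not the module's actual data or structure, and the cookie-specific step is omitted.

```python
from __future__ import annotations

# Hypothetical, shortened rule table: (keywords, action text).
# The real module uses an if-chain with more entries.
RULES: list[tuple[tuple[str, ...], str]] = [
    (("datenschutzbeauftragter", "dsb"),
     "Ihren Datenschutzbeauftragten mit Kontaktdaten erwaehnen."),
    (("beschwerderecht", "art. 77"),
     "Hinweis auf das Beschwerderecht ergaenzen."),
]


def check_to_action_sketch(doc_label: str, check_label: str, hint: str) -> str:
    """First matching keyword rule wins; otherwise fall back to hint, then generic text."""
    label_lower = check_label.lower()
    for keywords, action in RULES:
        if any(k in label_lower for k in keywords):
            return f"<strong>{doc_label}:</strong> {action}"
    # Same thresholds as the diff above: short hints are passed through, truncated to 280 chars.
    if hint and len(hint) < 300:
        return f"<strong>{doc_label}:</strong> {hint[:280]}"
    return f"<strong>{doc_label}:</strong> '{check_label}' muss ergaenzt werden."


if __name__ == "__main__":
    print(check_to_action_sketch("DSE", "DSB genannt", ""))
    print(check_to_action_sketch("DSE", "Unbekannter Check", "Bitte Abschnitt ergaenzen."))
    print(check_to_action_sketch("DSE", "Unbekannter Check", ""))
```

Because matching is by substring, the order of the rules decides which action is returned when a label contains several keywords.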
@@ -0,0 +1,283 @@
"""FastAPI route for founding-wizard document generation.
POST /v1/founding-wizard/generate
Body: FoundingWizardState (wizard inputs)
Returns: {documents: [{document_type, title, content_base64, size_bytes, ...}]}
Templates are loaded from compliance_legal_templates, rendered with the wizard
context (Handlebars-light), and returned as .docx bytes (base64).
"""
from __future__ import annotations
import base64
import logging
from typing import Any
from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel
from sqlalchemy import text
from sqlalchemy.orm import Session
from compliance.services.founding_wizard import (
base_context,
markdown_to_docx_bytes,
render_template,
)
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/v1/founding-wizard", tags=["founding-wizard"])
DOC_TITLES = {
"articles_of_association": "Satzung",
"gesellschafterliste": "Gesellschafterliste",
"gf_bestellungsbeschluss": "Bestellungsbeschluss Geschäftsführer",
"hrb_anmeldung": "Handelsregister-Anmeldung",
"sha": "Shareholders' Agreement (SHA)",
"geschaeftsordnung_gf": "Geschäftsordnung der Geschäftsführung",
"managing_director_employment_contract": "Geschäftsführerdienstvertrag",
"ip_assignment_agreement": "IP-Assignment Agreement",
"employment_contract_de": "Arbeitsvertrag",
"term_sheet": "Term Sheet",
"convertible_loan_agreement": "Wandeldarlehensvertrag",
"subscription_agreement": "Beteiligungsvertrag",
"esop_plan": "ESOP/VSOP-Plan",
"cap_table": "Cap Table",
}
class GenerationRequest(BaseModel):
current_step: int = 8
lifecycle_stage: str = "founding"
is_pre_notary: bool = True
basics: dict[str, Any] = {}
gesellschafter: list[dict[str, Any]] = []
capital: dict[str, Any] = {}
notar: dict[str, Any] = {}
sha: dict[str, Any] = {}
gf_contracts: list[dict[str, Any]] = []
selected_documents: list[str] = []
class DocumentResult(BaseModel):
document_type: str
title: str
filename: str
content_base64: str
size_bytes: int
generated_at: str
placeholders_count: int
class GenerationResponse(BaseModel):
documents: list[DocumentResult]
warnings: list[str] = []
def _load_template(db: Session, document_type: str) -> dict[str, Any] | None:
"""Load the most recently published template for the given document_type."""
row = db.execute(
text("""
SELECT id, document_type, title, content, placeholders, version, status
FROM compliance_legal_templates
WHERE document_type = :dt AND status = 'published'
ORDER BY created_at DESC
LIMIT 1
"""),
{"dt": document_type},
).first()
if not row:
return None
return {
"id": str(row.id),
"document_type": row.document_type,
"title": row.title,
"content": row.content,
"placeholders": row.placeholders or [],
"version": row.version,
}
def _safe_slug(name: str) -> str:
"""Build a filename-safe slug from a name."""
import re as _re
s = _re.sub(r"[^a-zA-Z0-9_-]+", "_", name.strip())
return s.strip("_") or "Person"
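A quick illustration of what the slug rule does to typical inputs. The function below copies the logic of `_safe_slug` under the name `safe_slug` so the example runs on its own; note that non-ASCII letters such as umlauts are replaced by an underscore rather than transliterated.

```python
import re


def safe_slug(name: str) -> str:
    """Standalone copy of the _safe_slug logic shown in the diff."""
    s = re.sub(r"[^a-zA-Z0-9_-]+", "_", name.strip())
    return s.strip("_") or "Person"


if __name__ == "__main__":
    print(safe_slug("Dr. Anna-Lena Schmidt"))  # Dr_Anna-Lena_Schmidt
    print(safe_slug("Max Müller"))             # Max_M_ller
    print(safe_slug("   "))                    # Person
```

If readable umlauts in filenames matter, a transliteration step (for example "ü" to "ue") before the regex would be one option; that is a suggestion, not something the diff does.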
def _render_one(
db: Session,
doc_type: str,
context: dict[str, Any],
name_suffix: str = "",
) -> DocumentResult | None:
template = _load_template(db, doc_type)
if not template:
logger.warning("No template found for document_type=%s", doc_type)
return None
rendered_md = render_template(template["content"], context)
title = template.get("title") or DOC_TITLES.get(doc_type, doc_type)
if name_suffix:
title = f"{title} ({name_suffix})"
docx_bytes = markdown_to_docx_bytes(rendered_md, title=None)
from datetime import datetime, timezone
suffix_slug = f"_{_safe_slug(name_suffix)}" if name_suffix else ""
company_slug = _safe_slug(context.get("COMPANY_NAME", "Unternehmen"))
return DocumentResult(
document_type=doc_type,
title=title,
filename=f"{doc_type}{suffix_slug}_{company_slug}.docx",
content_base64=base64.b64encode(docx_bytes).decode("ascii"),
size_bytes=len(docx_bytes),
# datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp.
generated_at=datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
placeholders_count=len(template.get("placeholders") or []),
)
# Documents generated once PER person (founder / managing director)
PER_PERSON_DOCS = {
"ip_assignment_agreement", # one per founder (individual IP)
"managing_director_employment_contract", # one per managing director
}
def _build_person_context(
base_ctx: dict[str, Any],
person: dict[str, Any],
doc_type: str,
gf_contract: dict[str, Any] | None = None,
) -> dict[str, Any]:
"""Extend base_context with person-specific fields for per-person documents."""
ctx = dict(base_ctx)
name = person.get("name", "")
ctx["ASSIGNOR_NAME"] = name
ctx["ASSIGNOR_BIRTHDATE"] = person.get("geburtsdatum", "")
ctx["ASSIGNOR_ADDRESS"] = person.get("adresse", "")
ctx["ASSIGNOR_ROLE"] = person.get("internal_role") or "Gründer und Geschäftsführer"
ctx["HAS_ACADEMIC_BACKGROUND"] = bool(person.get("has_academic_background"))
# Fields specific to the managing-director contract
ctx["GF_NAME"] = name
ctx["GF_BIRTHDATE"] = person.get("geburtsdatum", "")
ctx["GF_ADDRESS"] = person.get("adresse", "")
ctx["GF_INTERNAL_TITLE"] = person.get("internal_role", "Geschäftsführer")
# IP areas: person-specific, if present
ip_areas = person.get("ip_areas") or []
if ip_areas:
if isinstance(ip_areas, list):
ctx["IP_LIST_DETAILS"] = "\n".join(
f"- {area}" for area in ip_areas
)
else:
ctx["IP_LIST_DETAILS"] = str(ip_areas)
# Apply managing-director contract data, if present
if gf_contract:
if gf_contract.get("gross_annual_salary_eur"):
ctx["GROSS_ANNUAL_SALARY_EUR"] = f"{gf_contract['gross_annual_salary_eur']:,}".replace(",", ".")
ctx["HAS_BONUS"] = bool(gf_contract.get("has_bonus"))
ctx["HAS_COMPANY_CAR"] = bool(gf_contract.get("has_company_car"))
ctx["HAS_BAV"] = bool(gf_contract.get("has_bav"))
ctx["VACATION_DAYS"] = gf_contract.get("vacation_days", 30)
ctx["KUENDIGUNGSFRIST_GESELLSCHAFT_MONATE"] = gf_contract.get("kuendigungsfrist_gesellschaft_monate", 6)
ctx["KUENDIGUNGSFRIST_GF_MONATE"] = gf_contract.get("kuendigungsfrist_gf_monate", 3)
ctx["HAS_PARA_181_RELEASE"] = bool(gf_contract.get("para_181_release"))
ctx["SV_STATUS"] = gf_contract.get("sv_status", "sozialversicherungsfrei")
return ctx
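The salary line above formats the amount with Python's `,` thousands separator and then swaps commas for dots. A small sketch of that behaviour, including what happens with a fractional amount; `format_int_de` mirrors the expression in the diff, while `format_decimal_de` is a hypothetical variant that is not part of the diff.

```python
def format_int_de(amount: int) -> str:
    """Group thousands with dots, as the diff does for whole-euro amounts."""
    return f"{amount:,}".replace(",", ".")


def format_decimal_de(amount: float) -> str:
    """Hypothetical variant for fractional amounts: swap both separators."""
    return f"{amount:,.2f}".translate(str.maketrans({",": ".", ".": ","}))


if __name__ == "__main__":
    print(format_int_de(85000))        # 85.000
    print(format_int_de(1250000))      # 1.250.000
    # A float passed to the integer-style formatter keeps its decimal point:
    print(format_int_de(85000.5))      # 85.000.5
    print(format_decimal_de(85000.5))  # 85.000,50
```

The plain replace is fine as long as `gross_annual_salary_eur` is always an integer; if the wizard can send fractional values, validating or rounding the input first would avoid the ambiguous "85.000.5" form.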
@router.post("/generate", response_model=GenerationResponse)
def generate_documents(req: GenerationRequest, request: Request) -> GenerationResponse:
"""Main endpoint: accepts the wizard state and generates a DOCX for every selected document."""
# Database session is provided via FastAPI dependency injection in production.
# Simplified here: the session is opened directly (uses the main connection).
from classroom_engine.database import SessionLocal
db: Session = SessionLocal()
try:
context = base_context(req.model_dump())
results: list[DocumentResult] = []
warnings: list[str] = []
# Shareholder and managing-director lists from the request
gesellschafter = req.gesellschafter
gf_list = [g for g in gesellschafter if g.get("is_geschaeftsfuehrer")]
gf_contracts_map = {
c["gesellschafter_id"]: c
for c in req.gf_contracts
if c.get("gesellschafter_id")
}
for doc_type in req.selected_documents:
if doc_type in PER_PERSON_DOCS:
# One document per person
if doc_type == "ip_assignment_agreement":
# IP assignment: one per founder (all shareholders, not only managing directors)
persons = gesellschafter or [{}]
elif doc_type == "managing_director_employment_contract":
# Managing-director contract: one per managing director only
persons = gf_list or [{}]
else:
persons = [{}]
if not persons:
warnings.append(f"Keine Personen für '{doc_type}' vorhanden")
continue
for p in persons:
contract = gf_contracts_map.get(p.get("id"))
person_ctx = _build_person_context(context, p, doc_type, contract)
result = _render_one(
db, doc_type, person_ctx,
name_suffix=p.get("name", "")
)
if result is None:
warnings.append(f"Template '{doc_type}' nicht in Datenbank gefunden")
break
results.append(result)
else:
# Default: one document per selection
result = _render_one(db, doc_type, context)
if result is None:
warnings.append(f"Template '{doc_type}' nicht in Datenbank gefunden")
continue
results.append(result)
if not results:
raise HTTPException(
status_code=400,
detail=f"Keines der angeforderten Dokumente konnte generiert werden. "
f"Warnings: {warnings}"
)
return GenerationResponse(documents=results, warnings=warnings)
finally:
db.close()
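On the consuming side, each entry in `documents` carries the .docx as base64 text. A minimal sketch of how a client might write the files to disk, assuming the response has already been parsed into a dict shaped like GenerationResponse; the helper name `save_documents` and the sample payload are hypothetical.

```python
import base64
import tempfile
from pathlib import Path


def save_documents(response: dict, target_dir: Path) -> list[Path]:
    """Decode every document in a GenerationResponse-shaped dict and write it to target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    for doc in response.get("documents", []):
        data = base64.b64decode(doc["content_base64"])
        # size_bytes is the length of the raw bytes, so it doubles as a cheap integrity check.
        if len(data) != doc["size_bytes"]:
            raise ValueError(f"size mismatch for {doc['filename']}")
        # Keep only the final path component so a filename cannot escape target_dir.
        path = target_dir / Path(doc["filename"]).name
        path.write_bytes(data)
        written.append(path)
    return written


if __name__ == "__main__":
    payload = b"PK demo"  # placeholder bytes, not a real .docx
    sample = {"documents": [{
        "filename": "articles_of_association_Acme.docx",
        "content_base64": base64.b64encode(payload).decode("ascii"),
        "size_bytes": len(payload),
    }]}
    with tempfile.TemporaryDirectory() as tmp:
        for p in save_documents(sample, Path(tmp)):
            print(p.name, p.stat().st_size)
```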
@router.get("/templates")
def list_available_templates(request: Request) -> dict[str, Any]:
"""List all available templates with their categorization."""
from classroom_engine.database import SessionLocal
db: Session = SessionLocal()
try:
rows = db.execute(
text("""
SELECT document_type, title, description, version, status,
lifecycle_stage, functional_category
FROM compliance_legal_templates
WHERE status = 'published'
ORDER BY functional_category, document_type
""")
).fetchall()
return {
"templates": [
{
"document_type": r.document_type,
"title": r.title,
"description": r.description,
"version": r.version,
"lifecycle_stage": list(r.lifecycle_stage or []),
"functional_category": r.functional_category,
}
for r in rows
],
"count": len(rows),
}
finally:
db.close()
@@ -0,0 +1,306 @@
"""License attribution endpoints — Task #23 Stufe 1-4.
The audit (Task #22) classified all 314,811 canonical_controls into
license_rule 1/2/3. The frontend, PDF renderer, and tech-file generator
now need to surface that classification in the form of:
- Stufe 1: a global /licenses overview page
- Stufe 2: an auto-footer in every exported PDF
- Stufe 3: an inline source badge on every rendered hazard/measure
- Stufe 4: a sources appendix in tech-file bundles
This module exposes three endpoints that all four stages consume:
GET /api/compliance/licenses/overview
Global aggregation by rule + per-source counts. Drives Stufe 1.
POST /api/compliance/licenses/aggregate
Body: {"control_uuids": ["uuid1", ...]}.
Returns per-rule grouping with source breakdown. Used by PDF
footer (Stufe 2) and tech-file appendix (Stufe 4) to build the
"sources used in this document" list.
GET /api/compliance/licenses/source-info/{control_uuid}
Single-control lookup for the inline source badge tooltip
(Stufe 3). Returns rule, source regulation, attribution text.
Why a new module instead of extending canonical_control_routes:
- canonical_control_routes serves the legacy SPDX-style license matrix
(canonical_control_licenses + canonical_control_sources, ~10 rows).
- This module is built on regulation_registry (252 rows) + the
license_rule on each control. Both schemas coexist; this module
doesn't disturb the legacy endpoints.
"""
from __future__ import annotations
import logging
from typing import Any, Optional
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel
from sqlalchemy import text
from sqlalchemy.orm import Session
from classroom_engine.database import get_db
router = APIRouter(prefix="/licenses", tags=["licenses"])
logger = logging.getLogger(__name__)
# ============================================================================
# Rule labels — used by frontend renderer
# ============================================================================
RULE_LABELS = {
1: {
"code": "R1",
"label_de": "Wörtlich übernehmbar",
"label_en": "Verbatim, no attribution required",
"render_full_text": True,
"attribution_required": False,
},
2: {
"code": "R2",
"label_de": "Wörtlich mit Attribution",
"label_en": "Verbatim with attribution",
"render_full_text": True,
"attribution_required": True,
},
3: {
"code": "R3",
"label_de": "Nur Identifier zitieren",
"label_en": "Identifier citation only",
"render_full_text": False,
"attribution_required": False,
},
}
# ============================================================================
# Response Schemas
# ============================================================================
class SourceCount(BaseModel):
regulation_id: str
regulation_name_de: Optional[str]
license_rule: int
license_type: Optional[str]
attribution: Optional[str]
jurisdiction: Optional[str]
source_type: Optional[str]
n_controls: int
class RuleBucket(BaseModel):
rule: int
label_de: str
label_en: str
attribution_required: bool
render_full_text: bool
total_controls: int
distinct_sources: int
sources: list[SourceCount]
class OverviewResponse(BaseModel):
total_controls: int
buckets: list[RuleBucket]
class AggregateRequest(BaseModel):
control_uuids: list[UUID]
class AggregateResponse(BaseModel):
total_in_request: int
matched: int
buckets: list[RuleBucket]
class SourceInfo(BaseModel):
control_uuid: UUID
license_rule: Optional[int]
license_label_de: Optional[str]
attribution_required: bool
render_full_text: bool
regulation_id: Optional[str]
regulation_name_de: Optional[str]
license_type: Optional[str]
attribution: Optional[str]
source_url: Optional[str]
# ============================================================================
# Endpoints
# ============================================================================
def _bucket(rule: int, sources: list[SourceCount]) -> RuleBucket:
meta = RULE_LABELS.get(rule, RULE_LABELS[3])
return RuleBucket(
rule=rule,
label_de=meta["label_de"],
label_en=meta["label_en"],
attribution_required=meta["attribution_required"],
render_full_text=meta["render_full_text"],
total_controls=sum(s.n_controls for s in sources),
distinct_sources=len(sources),
sources=sources,
)
@router.get("/overview", response_model=OverviewResponse)
def licenses_overview(db: Session = Depends(get_db)) -> OverviewResponse:
"""Global aggregation: total controls by rule, with per-source breakdown.
Drives Stufe 1 (the /licenses page).
"""
rows = db.execute(text("""
SELECT
COALESCE(cpl.source_regulation, '(no source)') AS regulation_name,
cc.license_rule,
COUNT(DISTINCT cc.id) AS n
FROM compliance.canonical_controls cc
LEFT JOIN compliance.control_parent_links cpl ON cpl.control_uuid = cc.id
WHERE cc.license_rule IS NOT NULL
GROUP BY 1, 2
""")).fetchall()
reg_rows = db.execute(text("""
SELECT regulation_name_de, regulation_id, license_type, attribution,
jurisdiction, source_type
FROM compliance.regulation_registry
""")).fetchall()
reg_by_name = {r.regulation_name_de: r for r in reg_rows if r.regulation_name_de}
by_rule: dict[int, list[SourceCount]] = {1: [], 2: [], 3: []}
seen: dict[tuple[int, str], int] = {}
total = 0
for row in rows:
rule = int(row.license_rule)
name = row.regulation_name
n = int(row.n)
key = (rule, name)
# multiple cpl entries per control deduplicate via DISTINCT, but a
# control with several source_regulations still gets counted once
# per regulation — that's the design.
seen[key] = seen.get(key, 0) + n
total += n
for (rule, name), n in seen.items():
reg = reg_by_name.get(name)
by_rule.setdefault(rule, []).append(SourceCount(
regulation_id=reg.regulation_id if reg else name,
regulation_name_de=name,
license_rule=rule,
license_type=reg.license_type if reg else None,
attribution=reg.attribution if reg else None,
jurisdiction=reg.jurisdiction if reg else None,
source_type=reg.source_type if reg else None,
n_controls=n,
))
for r in by_rule.values():
r.sort(key=lambda s: -s.n_controls)
buckets = [_bucket(rule, sources) for rule, sources in sorted(by_rule.items())]
return OverviewResponse(total_controls=total, buckets=buckets)
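The comment inside the loop notes that a control linked to several source regulations is counted once per regulation. A small sketch with made-up link rows shows the consequence: the summed total can exceed the number of distinct controls. The data and variable names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (control_id, source_regulation) rows, standing in for control_parent_links.
links = [("c1", "DSGVO"), ("c1", "TDDDG"), ("c2", "DSGVO")]

# Equivalent of COUNT(DISTINCT cc.id) ... GROUP BY source_regulation.
controls_per_source: dict[str, set[str]] = defaultdict(set)
for control_id, regulation in links:
    controls_per_source[regulation].add(control_id)

counts = {reg: len(ids) for reg, ids in controls_per_source.items()}
total_as_summed = sum(counts.values())               # what the endpoint adds up
distinct_controls = len({cid for cid, _ in links})   # number of unique controls

if __name__ == "__main__":
    print(counts)             # {'DSGVO': 2, 'TDDDG': 1}
    print(total_as_summed)    # 3
    print(distinct_controls)  # 2
```

Under this reading, `total_controls` is a sum over per-source counts rather than a count of unique controls, which is worth keeping in mind when the number is shown next to the catalogue size.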
@router.post("/aggregate", response_model=AggregateResponse)
def aggregate_for_controls(
body: AggregateRequest,
db: Session = Depends(get_db),
) -> AggregateResponse:
"""Per-control license aggregation for PDF footer (Stufe 2) and
tech-file sources appendix (Stufe 4).
Returns a per-rule breakdown of which sources contributed to the
supplied control set. The frontend renderer turns this into the
"Verwendete Quellen" footer.
"""
if not body.control_uuids:
return AggregateResponse(total_in_request=0, matched=0, buckets=[])
rows = db.execute(text("""
SELECT
COALESCE(cpl.source_regulation, '(unknown)') AS regulation_name,
cc.license_rule,
COUNT(DISTINCT cc.id) AS n
FROM compliance.canonical_controls cc
LEFT JOIN compliance.control_parent_links cpl ON cpl.control_uuid = cc.id
WHERE cc.id = ANY(:ids) AND cc.license_rule IS NOT NULL
GROUP BY 1, 2
"""), {"ids": [str(u) for u in body.control_uuids]}).fetchall()
reg_rows = db.execute(text("""
SELECT regulation_name_de, regulation_id, license_type, attribution,
jurisdiction, source_type
FROM compliance.regulation_registry
""")).fetchall()
reg_by_name = {r.regulation_name_de: r for r in reg_rows if r.regulation_name_de}
by_rule: dict[int, list[SourceCount]] = {1: [], 2: [], 3: []}
matched_total = 0
for row in rows:
rule = int(row.license_rule)
n = int(row.n)
matched_total += n
reg = reg_by_name.get(row.regulation_name)
by_rule.setdefault(rule, []).append(SourceCount(
regulation_id=reg.regulation_id if reg else row.regulation_name,
regulation_name_de=row.regulation_name,
license_rule=rule,
license_type=reg.license_type if reg else None,
attribution=reg.attribution if reg else None,
jurisdiction=reg.jurisdiction if reg else None,
source_type=reg.source_type if reg else None,
n_controls=n,
))
for r in by_rule.values():
r.sort(key=lambda s: -s.n_controls)
buckets = [_bucket(rule, sources) for rule, sources in sorted(by_rule.items()) if sources]
return AggregateResponse(
total_in_request=len(body.control_uuids),
matched=matched_total,
buckets=buckets,
)
@router.get("/source-info/{control_uuid}", response_model=SourceInfo)
def source_info_for_control(
control_uuid: UUID,
db: Session = Depends(get_db),
) -> SourceInfo:
"""Single-control source info for the inline source badge (Stufe 3).
Used by the React `<SourceBadge>` component to populate its tooltip.
"""
row = db.execute(text("""
SELECT cc.license_rule, cpl.source_regulation AS regulation_name,
r.regulation_id, r.license_type, r.attribution, r.url AS source_url
FROM compliance.canonical_controls cc
LEFT JOIN compliance.control_parent_links cpl ON cpl.control_uuid = cc.id
LEFT JOIN compliance.regulation_registry r ON r.regulation_name_de = cpl.source_regulation
WHERE cc.id = :uuid
LIMIT 1
"""), {"uuid": str(control_uuid)}).fetchone()
if row is None:
raise HTTPException(status_code=404, detail="control not found")
rule = int(row.license_rule) if row.license_rule is not None else None
meta = RULE_LABELS.get(rule, {}) if rule else {}
return SourceInfo(
control_uuid=control_uuid,
license_rule=rule,
license_label_de=meta.get("label_de"),
attribution_required=meta.get("attribution_required", False),
render_full_text=meta.get("render_full_text", False),
regulation_id=row.regulation_id,
regulation_name_de=row.regulation_name,
license_type=row.license_type,
attribution=row.attribution,
source_url=row.source_url,
)
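As a sketch of how a consumer of /aggregate might turn the buckets into footer lines: the function below lists only sources from buckets where `attribution_required` is true. That selection rule, the helper name `footer_lines`, and the sample data are assumptions for illustration; the actual footer renderer is not part of this diff.

```python
from __future__ import annotations


def footer_lines(buckets: list[dict]) -> list[str]:
    """Collect one line per source for buckets that require attribution."""
    lines: list[str] = []
    for bucket in buckets:
        if not bucket.get("attribution_required"):
            continue
        for src in bucket.get("sources", []):
            # Prefer the explicit attribution text, then the name, then the id.
            label = (src.get("attribution")
                     or src.get("regulation_name_de")
                     or src["regulation_id"])
            lines.append(f"{label} ({src['n_controls']} controls)")
    return lines


if __name__ == "__main__":
    sample = [
        {"rule": 1, "attribution_required": False,
         "sources": [{"regulation_id": "r1", "regulation_name_de": "Quelle A",
                      "attribution": None, "n_controls": 4}]},
        {"rule": 2, "attribution_required": True,
         "sources": [{"regulation_id": "r2", "regulation_name_de": "Quelle B",
                      "attribution": "Quelle B, Lizenz X", "n_controls": 2}]},
    ]
    print(footer_lines(sample))
```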
@@ -0,0 +1,97 @@
"""
P62: marketing-manager-friendly scope disclaimer ("Was wir sehen / nicht sehen").
Explains in 30 seconds what our audit can actually check and where
its limits are. Goal: avoid false confidence in a 100% score
and make clear where marketing/IT must check in addition.
"""
from __future__ import annotations
def build_scope_disclaimer_html() -> str:
"""Render what we can see and what we CANNOT see."""
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:700px;margin:8px auto 16px;padding:14px 18px;'
'background:#f0f9ff;border:1px solid #bfdbfe;border-radius:8px">'
'<h3 style="margin:0 0 8px;font-size:13px;color:#1e40af">'
'Was diese Pruefung leistet — und wo ihre Grenzen liegen</h3>'
'<div style="font-size:11px;color:#1e293b;margin-bottom:10px">'
'Wir sind ein <strong>technisches Audit-Tool</strong>, kein Anwalt. '
'Ein 100%-Score bedeutet nicht "rechtssicher" — er bedeutet "alle '
'Pruefkriterien automatisch erfuellt". Folgendes koennen wir vs. '
'koennen wir nicht:</div>'
'<table style="width:100%;border-collapse:collapse;font-size:11px;'
'margin-bottom:8px">'
'<thead><tr style="background:#dbeafe;color:#1e40af;text-align:left">'
'<th style="padding:5px 8px;width:50%">Was wir sehen</th>'
'<th style="padding:5px 8px;width:50%">Was wir NICHT sehen</th>'
'</tr></thead>'
'<tbody>'
'<tr style="border-top:1px solid #bfdbfe">'
'<td style="padding:5px 8px;color:#1e293b">'
'✓ Cookies/Storage im Browser nach Klick auf Akzeptieren/Ablehnen'
'</td>'
'<td style="padding:5px 8px;color:#475569">'
'✗ Server-seitiges Tracking (Meta Conversion API, GA4 Measurement '
'Protocol — der Browser sieht nichts davon)'
'</td></tr>'
'<tr style="border-top:1px solid #bfdbfe">'
'<td style="padding:5px 8px;color:#1e293b">'
'✓ Vendor-Listen aus dem Banner (TCF, CMP-Settings, Phase-G Klick-Tour)'
'</td>'
'<td style="padding:5px 8px;color:#475569">'
'✗ Wer die Daten beim Vendor tatsaechlich erhaelt / weiterleitet '
'(z.B. Google verteilt intern an Ads/Marketing-Plattform)'
'</td></tr>'
'<tr style="border-top:1px solid #bfdbfe">'
'<td style="padding:5px 8px;color:#1e293b">'
'✓ Texte und Pflichtangaben in DSE/Cookie-Richtlinie/Impressum'
'</td>'
'<td style="padding:5px 8px;color:#475569">'
'✗ Ob die internen Prozesse (Loeschkonzept, AVV-Pflege, '
'Mitarbeiter-Schulungen) tatsaechlich gelebt werden'
'</td></tr>'
'<tr style="border-top:1px solid #bfdbfe">'
'<td style="padding:5px 8px;color:#1e293b">'
'✓ Banner-UI-Verstoesse (Dark Patterns, ungleichgewichtige Buttons, '
'fehlender Reject-Mechanismus)'
'</td>'
'<td style="padding:5px 8px;color:#475569">'
'✗ Ob das Banner auf <em>jeder</em> Unterseite identisch ist '
'(wir messen die Einstiegsseite)'
'</td></tr>'
'<tr style="border-top:1px solid #bfdbfe">'
'<td style="padding:5px 8px;color:#1e293b">'
'✓ Untergeschobene Cookies (z.B. Google Tag Manager bringt automatisch '
'GA + Ads — siehe P61-Block unten)'
'</td>'
'<td style="padding:5px 8px;color:#475569">'
'✗ Drittland-Transfer auf Vertragsebene — ob ein SCC/DPF wirklich '
'vorliegt, koennen nur Sie selbst pruefen'
'</td></tr>'
'</tbody></table>'
'<div style="font-size:10px;color:#475569;margin-top:8px;'
'padding-top:6px;border-top:1px dashed #bfdbfe">'
'<strong>Hinweis fuer Marketing &amp; Geschaeftsfuehrung:</strong> '
'Selbst wenn dieser Bericht keinen Verstoss findet, kann ein '
'individueller Bescheid einer Aufsichtsbehoerde oder eine Klage '
'(NOYB, Verbraucherschutz, Sammelklage) zu einem anderen Ergebnis '
'kommen — etwa wenn beim Vendor selbst (Server-Side) personenbezogene '
'Daten verarbeitet werden, die wir browser-seitig nicht sehen. '
'Dieser Bericht ersetzt keine anwaltliche Pruefung, hilft aber, '
'<strong>technisch belegbare Verstoesse</strong> sofort zu schliessen.'
'</div>'
'</div>'
)
@@ -0,0 +1,330 @@
"""
VVT table (record of processing activities) for the email report: one row
per vendor, grouped by recipient category (Art. 30(1)(d) GDPR).
Extracted from agent_doc_check_extras.py (LOC cap). Contains:
* build_vvt_table_html: main entry point; groups vendors, adds summary and P60 notice
* _render_vendor_section / _render_vendor_row_full: row renderers
* _link_status_badge / _flag_short: small helpers
P60b fuzzy match: vendors with partially filled fields (e.g. country of
establishment entered) do not drop out of the pattern notice just because
their flag set is 1-2 items smaller. Jaccard >= 0.7 covers that.
"""
from __future__ import annotations
def _category_label(kat: str) -> str:
return {
"necessary": "Notwendig", "strictlynecessary": "Notwendig",
"preferences": "Praeferenzen", "functional": "Funktional",
"statistics": "Statistik", "marketing": "Marketing",
"unclassified": "Unklassifiziert",
}.get((kat or "").lower(), kat or "")
def _flag_short(f: str) -> str:
"""Readable German label for a flag token."""
labels = {
"no_cookies_listed": "Cookies fehlen",
"no_country": "Sitzland fehlt",
"no_privacy_url": "Privacy-Link fehlt",
"broken_privacy_url": "Privacy-Link broken",
"no_opt_out_url": "Opt-Out fehlt",
"broken_opt_out": "Opt-Out broken",
}
return labels.get(f, f)
def _link_status_badge(
url: str | None,
ok: bool | None,
status: int | None,
na_label: str | None = None,
) -> str:
if not url:
if na_label:
return ('<span style="color:#94a3b8;font-size:11px" '
f'title="{na_label}">&mdash;</span>')
return ('<span style="color:#dc2626;font-size:11px" '
'title="Kein Link">&#10007;</span>')
if ok:
return ('<span style="color:#16a34a;font-size:11px" '
f'title="HTTP {status}">&#10003;</span>')
status_str = str(status) if status else "?"
return ('<span style="color:#dc2626;font-size:11px" '
f'title="HTTP {status_str}">&#10007; ({status_str})</span>')
def _build_pattern_notice(vendors: list[dict]) -> str:
"""P60 + P60b: global notice when many vendors have similar flag sets.
Mutates vendors[].`_actions_in_global_notice` so that the row renderers
can skip redundant per-row actions.
"""
from collections import Counter
flag_sets: Counter = Counter()
for v in vendors:
flags = v.get("compliance_flags") or []
if flags:
flag_sets[tuple(sorted(flags))] += 1
if not flag_sets:
return ""
most_common, _ = flag_sets.most_common(1)[0]
most_common_set = set(most_common)
def _similar(flags: tuple) -> bool:
fs = set(flags)
if not fs or not most_common_set:
return False
inter = len(fs & most_common_set)
union = len(fs | most_common_set)
return union > 0 and (inter / union) >= 0.7
n_match = sum(cnt for fs, cnt in flag_sets.items() if _similar(fs))
share = n_match / max(1, len(vendors))
if not (n_match >= 8 and share >= 0.5):
return ""
from compliance.services.finding_action_recipes import recipe_for
labels = [_flag_short(f) for f in most_common]
shared_actions: list[str] = []
for f in most_common:
rec = recipe_for(f)
if rec:
shared_actions.append(
f'<li><strong>{_flag_short(f)}:</strong> '
f'{rec.get("fix_text", "").splitlines()[0][:180]}</li>'
)
for v in vendors:
if _similar(tuple(sorted(v.get("compliance_flags") or []))):
v["_actions_in_global_notice"] = True
return (
f'<div style="margin:8px 0 12px;padding:10px 14px;'
f'background:#fef3c7;border-left:3px solid #d97706;'
f'border-radius:4px;font-size:11px;color:#92400e">'
f'<strong>Wiederkehrendes Muster ({n_match} von {len(vendors)} '
f'Anbietern, {int(share*100)}%):</strong> '
f'Bei diesen Anbietern fehlen jeweils: '
f'<em>{", ".join(labels)}</em>. '
f'Vermutlich systembedingt (z.B. Settings-Export liefert '
f'nur Namen, oder Banner-API blockiert Detail-Extraktion). '
f'Die globalen Empfehlungen unten gelten fuer all diese Eintraege; '
f'in der Tabelle werden sie nicht pro Zeile wiederholt.'
+ (f'<ul style="margin:8px 0 0 0;padding-left:20px">{"".join(shared_actions)}</ul>'
if shared_actions else '')
+ '</div>'
)
def build_vvt_table_html(vendors: list[dict]) -> str:
"""Render per-vendor VVT-style table for the email."""
if not vendors:
return ""
from compliance.services.vendor_classifier import RECIPIENT_TYPE_SECTIONS
by_type: dict[str, list[dict]] = {}
for v in vendors:
rt = (v.get("recipient_type") or "OTHER").upper()
by_type.setdefault(rt, []).append(v)
n_total = len(vendors)
n_internal = sum(
1 for v in vendors
if (v.get("recipient_type") or "").upper() in ("INTERNAL", "GROUP_COMPANY")
)
n_external = n_total - n_internal
n_critical = sum(1 for v in vendors if v.get("compliance_score", 0) < 50)
summary_parts = [f"{n_total} Verarbeitungen erfasst"]
if n_internal and n_external:
summary_parts.append(
f"&mdash; {n_internal} eigene + {n_external} externe Empfaenger"
)
if n_critical:
summary_parts.append(
f', <strong style="color:#dc2626">{n_critical} unter 50%</strong>'
)
else:
summary_parts.append("&mdash; alle ueber 50%")
summary = " ".join(summary_parts)
pattern_notice = _build_pattern_notice(vendors)
out: list[str] = [
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:12px 16px;'
'background:#fafafa;border:1px solid #e5e7eb;border-radius:8px">',
'<h3 style="margin:0 0 4px;font-size:14px;color:#334155">'
'Vorschlag fuer das Verarbeitungsverzeichnis (Art. 30 DSGVO)</h3>',
        # P91: co-pilot tone: probability instead of guarantee,
        # recommendation instead of a "violation list".
f'<p style="margin:0 0 8px;font-size:11px;color:#6b7280;line-height:1.5">'
f'Wir haben <strong>{n_total} Verarbeitungen</strong> aus dem '
f'Cookie-Banner abgeleitet, mit unserer globalen Anbieter-Bibliothek '
f'abgeglichen und nach Empfaengerkategorie (Art. 30(1)(d) DSGVO) '
f'gruppiert. Bei einer Reduktion der eingebundenen Anbieter, dem '
f'Wechsel zu europaeischen Alternativen und konsequenter Pruefung '
f'der tatsaechlich benoetigten Cookies ist eine Reduktion des '
f'Tracking-Footprints sowie Lizenz-Einsparungen wahrscheinlich. '
f'Eine fundierte Bewertung erfordert die Abstimmung mit dem '
f'Datenschutzbeauftragten.</p>'
f'<p style="margin:0 0 10px;font-size:11px;color:#6b7280">'
f'{summary}. Innerhalb jeder Gruppe nach Verbesserungspotenzial '
        f'sortiert. Bei eigenen Verarbeitungen (INTERNAL/GROUP) sind '
        f'Opt-Out und Privacy-Link '
        'nicht erforderlich (Widerruf ueber Banner, Privacy in der '
        'Haupt-Datenschutzerklaerung dokumentiert).</p>',
pattern_notice,
]
for rtype, section_label in RECIPIENT_TYPE_SECTIONS:
rows = by_type.get(rtype) or []
if not rows:
continue
rows = sorted(rows, key=lambda v: v.get("compliance_score", 0))
n = len(rows)
n_bad = sum(1 for v in rows if v.get("compliance_score", 0) < 50)
bad_hint = (f' <span style="color:#dc2626">({n_bad} unter 50%)</span>'
if n_bad else "")
out.append(
f'<h4 style="margin:14px 0 4px;font-size:12px;color:#1e293b;'
f'border-top:1px solid #e2e8f0;padding-top:8px">'
f'{section_label} <span style="color:#94a3b8;font-weight:400">'
f'({n}){bad_hint}</span></h4>'
)
out.append(_render_vendor_section(rows))
out.append('</div>')
return "".join(out)
def _render_vendor_section(rows: list[dict]) -> str:
body: list[str] = [
'<table style="width:100%;border-collapse:collapse;font-size:11px">'
'<thead><tr style="background:#f1f5f9;color:#475569;text-align:left">'
'<th style="padding:5px 8px">Name</th>'
'<th style="padding:5px 8px">Kategorie</th>'
'<th style="padding:5px 8px">Sitz</th>'
'<th style="padding:5px 8px;text-align:center">Cookies</th>'
'<th style="padding:5px 8px;text-align:center">Opt-Out</th>'
'<th style="padding:5px 8px;text-align:center">Privacy</th>'
'<th style="padding:5px 8px;text-align:right">Score</th>'
'</tr></thead><tbody>',
]
for v in rows:
body.append(_render_vendor_row_full(v))
body.append('</tbody></table>')
return "".join(body)
def _render_vendor_row_full(v: dict) -> str:
rtype = (v.get("recipient_type") or "OTHER").upper()
is_own = rtype in ("INTERNAL", "GROUP_COMPANY")
cat = (v.get("category") or "").lower()
is_necessary = cat in ("necessary", "strictlynecessary")
name = v.get("name") or "Unbekannt"
category = _category_label(v.get("category", ""))
country = v.get("country") or ""
cookies = v.get("cookies") or []
n_cookies = len(cookies)
score = int(v.get("compliance_score", 0))
flags = v.get("compliance_flags") or []
opt_na_reason = ("Nicht erforderlich (eigene Verarbeitung — "
"Widerruf ueber Cookie-Banner)") if is_own else (
"Nicht erforderlich (§25 Abs. 2 TDDDG — technisch notwendig)"
if is_necessary else None
)
opt_status = _link_status_badge(
v.get("opt_out_url"), v.get("opt_out_ok"), v.get("opt_out_status"),
na_label=opt_na_reason,
)
privacy_na_reason = (
"Nicht erforderlich (eigene Verarbeitung — durch Haupt-DSI abgedeckt)"
if is_own else None
)
privacy_status = _link_status_badge(
v.get("privacy_policy_url"), v.get("privacy_ok"),
v.get("privacy_status"), na_label=privacy_na_reason,
)
score_color = ("#16a34a" if score >= 80 else
"#d97706" if score >= 50 else "#dc2626")
n_criteria = 3 if is_own else 5
    # Clamp: more flags than criteria must not yield a negative "x von y".
    n_failed = min(len(flags), n_criteria)
score_tooltip = (
f"{n_criteria - n_failed} von {n_criteria} Kriterien erfuellt"
+ (f" — fehlt: {', '.join(_flag_short(f) for f in flags[:3])}"
if flags else "")
)
actions_html = ""
skip_actions = bool(v.get("_actions_in_global_notice"))
if flags and not skip_actions:
from compliance.services.finding_action_recipes import recipe_for
action_items = []
        for f in flags:
            rec = recipe_for(f)
            if not rec:
                continue
            # An empty fix_text has no lines; fall back to "" instead of
            # raising IndexError on [0].
            fix_lines = (rec.get("fix_text") or "").splitlines()
            fix_first = fix_lines[0][:200] if fix_lines else ""
            action_items.append(
                f'<li style="margin-bottom:6px"><strong>{_flag_short(f)}:</strong> '
                f'{rec.get("what", "")}<br/>'
                f'<span style="color:#475569"><strong>Was tun:</strong> '
                f'{fix_first}</span><br/>'
                f'<span style="color:#94a3b8;font-size:9px">Quelle: '
                f'{rec.get("why", "")[:160]}</span></li>'
            )
if action_items:
actions_html = (
f'<details style="margin-top:4px"><summary style="cursor:pointer;'
f'color:#dc2626;font-size:10px">Was muss ich tun? '
f'({len(action_items)} Action{"s" if len(action_items) != 1 else ""})</summary>'
f'<ul style="margin:4px 0 0 14px;padding:0;font-size:10px;color:#1e293b">'
+ "".join(action_items)
+ '</ul></details>'
)
flag_str = ""
if flags:
flag_str = (
f'<div style="font-size:10px;color:#94a3b8;margin-top:2px">'
f'{", ".join(flags[:4])}</div>'
f'{actions_html}'
)
risk = v.get("compliance_risk") or {}
risk_label = risk.get("label") or ""
risk_badge = ""
if risk_label and risk_label != "unklar":
rc = {
"kritisch": ("#dc2626", "#fff"),
"hoch": ("#fecaca", "#991b1b"),
"mittel": ("#fde68a", "#92400e"),
"gering": ("#d1fae5", "#065f46"),
}.get(risk_label, ("#e5e7eb", "#475569"))
risk_badge = (f'<span style="margin-left:6px;padding:1px 5px;border-radius:3px;font-size:9px;'
f'background:{rc[0]};color:{rc[1]}">Risk: {risk_label}</span>')
return (
f'<tr style="border-top:1px solid #e2e8f0">'
f'<td style="padding:6px 8px;color:#1e293b;font-size:11px">'
f'{name}{risk_badge}{flag_str}</td>'
f'<td style="padding:6px 8px;color:#475569;font-size:11px">{category}</td>'
f'<td style="padding:6px 8px;color:#475569;font-size:11px">{country}</td>'
f'<td style="padding:6px 8px;text-align:center;color:#475569;font-size:11px">'
f'{n_cookies}</td>'
f'<td style="padding:6px 8px;text-align:center">{opt_status}</td>'
f'<td style="padding:6px 8px;text-align:center">{privacy_status}</td>'
f'<td style="padding:6px 8px;text-align:right;font-weight:600;'
f'color:{score_color};font-size:11px" title="{score_tooltip}">'
f'{score}%<div style="font-size:9px;font-weight:400;color:#94a3b8">'
f'{n_criteria - n_failed}/{n_criteria}</div></td>'
f'</tr>'
)
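
The P60b fuzzy match used by `_build_pattern_notice` can be looked at in isolation. The following is a minimal sketch, not the module's code: `jaccard` and `similar_to_pattern` are hypothetical helper names introduced for illustration, and only the 0.7 threshold and the flag tokens are taken from the file above.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_to_pattern(
    flags: set[str], pattern: set[str], threshold: float = 0.7
) -> bool:
    """True when a vendor's flag set is close enough to the dominant pattern."""
    if not flags or not pattern:
        return False
    return jaccard(flags, pattern) >= threshold


# Hypothetical dominant pattern: four flags shared by most vendors.
pattern = {"no_cookies_listed", "no_country", "no_privacy_url", "no_opt_out_url"}
one_missing = pattern - {"no_country"}
two_missing = pattern - {"no_country", "no_privacy_url"}

print(similar_to_pattern(one_missing, pattern))  # 3/4 = 0.75 -> True
print(similar_to_pattern(two_missing, pattern))  # 2/4 = 0.50 -> False
```

With a four-flag pattern, a vendor whose set is one flag smaller still counts towards the notice, while a vendor missing two flags does not. How much smaller a set may be therefore depends on the size of the dominant pattern.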
@@ -0,0 +1,213 @@
"""
A: audit transparency / audit quality checks.

If the crawler did not find everything, the mail MUST show that prominently;
otherwise the user assumes 'all good' although the data has gaps.

Detects 4 quality failures:
1. banner_detected=False despite an existing cookie doc -> CMP tool not loaded
2. cookie_doc >= 30k chars but cmp_vendors < 10 -> vendor extract incomplete
3. doc_text submitted but 0 chars loaded -> crawler failure
4. cmp_vendors > 0 but all from llm_cascade without library match -> probably incomplete

Note: this module implements checks 1-3 below; check 2 is evaluated on word
counts (from 3,000 words) with a size-dependent vendor threshold.

These findings ALWAYS land in the GF-1-Pager (even when no other HIGH
finding exists): they say "the data is incomplete, manual review
recommended".
"""
from __future__ import annotations
import logging
logger = logging.getLogger(__name__)
def _word_count(text: str | None) -> int:
if not text:
return 0
return len(text.split())
def check_banner_not_detected(
    banner_result: dict | None,
    cookie_doc_text: str | None,
) -> dict | None:
    """1) Banner not loaded but cookie doc present -> CMP tool broken."""
if not isinstance(banner_result, dict):
return None
detected = banner_result.get("banner_detected")
if detected is None or detected is True:
return None
if not cookie_doc_text or len(cookie_doc_text) < 5000:
return None
return {
"severity": "HIGH",
"code": "audit_banner_not_detected",
"label": "Audit-Vorbehalt: Cookie-Banner konnte vom Crawler nicht "
"geladen werden",
"area": "Cookie-Banner",
"owner": "DSB + Marketing/CMP-Admin",
"detail": (
"Unser Crawler konnte das CMP-Tool dieser Site nicht analysieren — "
"weder Vendor-Liste noch Cookie-Verhalten konnten geprueft werden. "
"Moegliche Ursachen: Anti-Bot-Schutz (Akamai/Cloudflare/DataDome) "
"blockiert Playwright; das CMP-Skript laed nur fuer bestimmte "
"Geo-Regionen; ein neues CMP-Tool das wir noch nicht unterstuetzen. "
"Empfehlung: manuelle Pruefung des Banners durch DSB, alternativ "
"Cookie-Tabelle im Audit-Tool direkt einfuegen (Copy-Paste-Modus)."
),
"legal_basis": "Art. 5 (2) DSGVO Rechenschaftspflicht — der Audit-"
"Befund muss transparent zwischen 'geprueft & OK' und "
"'nicht pruefbar' unterscheiden.",
}
def check_vendor_extract_incomplete(
    cookie_doc_text: str | None,
    cmp_vendors: list | None,
) -> dict | None:
    """2) Cookie doc is large but few vendors -> extract incomplete.

    Dynamic threshold by doc size:
    * 3k-6k words: at least 10 vendors expected
    * 6k-10k words: at least 20 vendors
    * 10k-15k words: at least 30 vendors
    * 15k+ words: at least 40 vendors
    """
wc = _word_count(cookie_doc_text)
n_vendors = len(cmp_vendors or [])
if wc < 3000:
return None
    # Expected vendor count, a heuristic based on doc size
if wc >= 15000:
expected = 40
elif wc >= 10000:
expected = 30
elif wc >= 6000:
expected = 20
else:
expected = 10
if n_vendors >= expected:
return None
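    return {
        "severity": "HIGH" if wc >= 8000 else "MEDIUM",
        "code": "audit_vendor_extract_thin",
        # Only the number gets the German thousands separator; replacing
        # "," on the whole label would also turn "Wörter, erwartet" into
        # "Wörter. erwartet".
        "label": (
            "Audit-Vorbehalt: Cookie-Richtlinie hat "
            + f"{wc:,}".replace(",", ".")
            + f" Wörter, erwartet ~{expected} Vendors, extrahiert nur {n_vendors}"
        ),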
"area": "Vendor-Liste / VVT",
"owner": "DSB + Marketing",
"detail": (
f"Bei einer Cookie-Richtlinie mit {wc:,} Woertern erwarten wir "
f"typischerweise {expected}+ unique Vendors. Die extrahierte Zahl "
f"({n_vendors}) ist auffaellig niedrig — entweder hat unser "
"Parser/LLM die Tabelle nicht vollstaendig erfasst oder "
"Vendors wurden zu konservativ erkannt. Empfehlung: Cookie-"
"Tabelle im Copy-Paste-Modus einreichen (Frontend-Toggle "
"'Text einfuegen' pro Cookie-Doc-Zeile) — dort parsen wir "
"Spalten deterministisch."
).replace(",", "."),
"legal_basis": "Art. 13(1)(e) DSGVO — die Empfaengerliste muss "
"vollstaendig sein; ein unvollstaendiger Audit darf "
"nicht als vollstaendig dargestellt werden.",
}
def check_url_fetch_failed(doc_entries: list | None) -> list[dict]:
    """3) URL submitted but zero or very little text -> crawler failure per doc."""
out: list[dict] = []
for e in (doc_entries or []):
if not isinstance(e, dict):
continue
url = (e.get("url") or "").strip()
text = (e.get("text") or "").strip()
if not url or len(text) >= 200 or e.get("auto_discovered"):
continue
dt = e.get("doc_type", "doc")
rejected = e.get("rejected_url") or ""
out.append({
"severity": "MEDIUM",
"code": f"audit_url_fetch_failed_{dt}",
"label": (
f"Audit-Vorbehalt: {dt}-URL konnte nicht geladen werden "
f"({len(text)} Zeichen extrahiert)"
),
"area": dt,
"owner": "DSB + Web-Team",
"detail": (
f"Die eingegebene URL {url[:120]} lieferte weniger als 200 "
"Zeichen. Moegliche Ursachen: 404, JS-only Render, Anti-Bot, "
"Cookie-Wall. Auto-Discovery hat versucht eine Alternative "
"auf der Homepage zu finden — ohne Erfolg. Empfehlung: "
"korrekte URL pruefen oder den Text direkt einfuegen "
"(Copy-Paste-Modus)."
),
"legal_basis": "Art. 5 (2) DSGVO Rechenschaftspflicht.",
})
return out
def run_all(
banner_result: dict | None,
cookie_doc_text: str | None,
cmp_vendors: list | None,
doc_entries: list | None,
) -> list[dict]:
findings: list[dict] = []
try:
f1 = check_banner_not_detected(banner_result, cookie_doc_text)
if f1:
findings.append(f1)
except Exception as e:
logger.warning("audit_banner_not_detected failed: %s", e)
try:
f2 = check_vendor_extract_incomplete(cookie_doc_text, cmp_vendors)
if f2:
findings.append(f2)
except Exception as e:
logger.warning("audit_vendor_extract_thin failed: %s", e)
try:
findings.extend(check_url_fetch_failed(doc_entries))
except Exception as e:
logger.warning("audit_url_fetch_failed failed: %s", e)
return findings
def build_audit_quality_block_html(findings: list[dict]) -> str:
if not findings:
return ""
items: list[str] = []
for f in findings:
sev = f.get("severity", "MEDIUM")
sev_color = "#dc2626" if sev == "HIGH" else "#d97706"
items.append(
f'<li style="margin-bottom:10px;font-size:11px;line-height:1.5">'
f'<strong style="color:{sev_color}">[{sev}] {f.get("label","")}</strong>'
f'<div style="color:#475569;margin-top:3px">{f.get("detail","")}</div>'
f'<div style="color:#94a3b8;margin-top:2px;font-style:italic">'
f'{f.get("legal_basis","")}</div>'
f'</li>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fee2e2;border:1px solid #fecaca;border-radius:8px">'
'<div style="font-size:11px;color:#991b1b;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Audit-Vorbehalt — Datenlage unvollstaendig</div>'
f'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
f'{len(findings)} Punkt'
f'{"e" if len(findings) != 1 else ""} bei denen der Audit selbst '
f'an Grenzen gestossen ist</h3>'
'<p style="margin:0 0 10px;font-size:11px;color:#475569;line-height:1.5">'
'Die folgenden Punkte betreffen NICHT die Compliance Ihrer Website, '
'sondern die Vollstaendigkeit unserer Pruefung. Bei diesen Bereichen '
'sollten Sie den Audit nicht als "alles ok" werten, sondern manuell '
'oder im Copy-Paste-Modus nachpruefen.'
'</p>'
'<ul style="margin:0 0 0 18px;padding:0">'
+ "".join(items) +
'</ul></div>'
)
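
The size-dependent threshold in `check_vendor_extract_incomplete` can also be written as a table lookup. This is only a sketch under assumptions: `_TIERS`, `expected_vendors` and `format_thousands_de` are hypothetical names that do not exist in the module; the tier boundaries and the German thousands separator are taken from the code above.

```python
from __future__ import annotations

# (minimum word count, expected vendor count), highest tier first.
_TIERS = ((15000, 40), (10000, 30), (6000, 20), (3000, 10))


def expected_vendors(word_count: int) -> int | None:
    """Expected vendor count for a cookie doc, or None below 3,000 words."""
    for min_words, expected in _TIERS:
        if word_count >= min_words:
            return expected
    return None


def format_thousands_de(n: int) -> str:
    """Format an integer with '.' as thousands separator (German style)."""
    # Format the number alone, so no other commas in a sentence are touched.
    return f"{n:,}".replace(",", ".")


wc = 12345
print(expected_vendors(wc))       # 30
print(format_thousands_de(wc))    # 12.345
```

Keeping the tiers in one tuple makes the docstring and the code harder to drift apart, and formatting the number separately avoids rewriting punctuation in the surrounding text.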
@@ -0,0 +1,458 @@
"""
P92 + P94: banner consistency checks (post hoc on banner_result).

P92, CMP tool availability:
    If "Anpassen"/"Einstellungen" was clicked and the tool does not load
    (network error, timeout, blank page, missing consent elements after
    the click), that is a HIGH violation: the user formally has the option
    of a granular choice, but it does not work.

P94, initial banner vs cookie footer consistency:
    The cookie list in the initial banner settings must not deviate from
    the list in the permanent cookie policy document. If the banner names
    12 cookies but the cookie doc names 47, at least one of the two
    sources is incomplete: MEDIUM finding (HIGH when both sides diverge
    heavily).

Both return a dict with shape:
    {"severity": "HIGH"|"MEDIUM", "code": str, "label": str, "detail": str,
     "legal_basis": str}
or None when the check does not apply.
"""
from __future__ import annotations
import logging
import re
logger = logging.getLogger(__name__)
_ANPASSEN_KEYS = (
"anpassen", "einstellungen", "customize", "preferences",
"settings", "individuelle", "auswahl", "manage",
)
def _phases(banner_result: dict) -> dict:
if not isinstance(banner_result, dict):
return {}
return banner_result.get("phases") or {}
def check_cmp_tool_availability(banner_result: dict) -> dict | None:
    """P92: "Anpassen" was clicked but the settings tool is broken or empty."""
phases = _phases(banner_result)
settings_ph = phases.get("settings") or phases.get("after_settings_click")
if not isinstance(settings_ph, dict):
return None
initial_ph = phases.get("initial") or phases.get("before_accept") or {}
initial_text = (initial_ph.get("banner_text") or "").lower()
if not any(k in initial_text for k in _ANPASSEN_KEYS):
        # No "Anpassen" button in the initial banner at all: that is P100's
        # job, do not report it twice here.
        return None
error = settings_ph.get("error") or settings_ph.get("status_error")
settings_text = (settings_ph.get("banner_text") or "").strip()
has_categories = bool(
settings_ph.get("categories")
or settings_ph.get("category_tests")
or (settings_ph.get("structured_checks") or [])
)
has_toggles = bool(re.search(r"checkbox|toggle|switch|aria-checked",
(settings_ph.get("banner_html") or ""), re.I))
timed_out = bool(settings_ph.get("timeout"))
failure_signals: list[str] = []
if error:
failure_signals.append(f'Fehler: {str(error)[:120]}')
if timed_out:
failure_signals.append('Zeitueberschreitung beim Laden')
if len(settings_text) < 80 and not has_categories:
failure_signals.append(
f'Settings-Bereich nur {len(settings_text)} Zeichen, '
'keine Kategorien sichtbar'
)
if not has_toggles and not has_categories:
failure_signals.append(
'Keine Checkboxen / Toggles im Settings-Bereich'
)
if not failure_signals:
return None
return {
"severity": "HIGH",
"code": "cmp_tool_unavailable",
"label": 'Cookie-Einstellungen ueber "Anpassen" formal vorhanden, '
                 'Tool laedt aber nicht oder ist leer',
"detail": " | ".join(failure_signals),
"legal_basis": "Art. 7 (3) DSGVO + EDPB 03/2022 — die Moeglichkeit "
"zur granularen Auswahl muss tatsaechlich funktionieren.",
}
def _normalize_cookie_names(items) -> set[str]:
out: set[str] = set()
if not items:
return out
for it in items:
if isinstance(it, str):
name = it.strip()
elif isinstance(it, dict):
name = (it.get("name") or it.get("cookie") or it.get("id") or "").strip()
else:
continue
if name and len(name) <= 120:
out.add(name.lower())
return out
def check_init_banner_vs_cookie_doc(
    banner_result: dict,
    cookie_doc_text: str | None,
) -> dict | None:
    """P94: cookie list in the initial banner vs the one in the cookie policy."""
if not cookie_doc_text or len(cookie_doc_text) < 500:
return None
phases = _phases(banner_result)
banner_cookies = _normalize_cookie_names(
(phases.get("settings") or {}).get("cookies") or []
) | _normalize_cookie_names(
(phases.get("initial") or phases.get("before_accept") or {}).get("cookies") or []
)
    # From the cookie doc text: cookie names are typically camelCase or
    # _underscored, 4-41 characters (as matched by the regex below),
    # without spaces.
candidates = set(re.findall(
r"\b([A-Za-z_][A-Za-z0-9_\-\.]{3,40})\b", cookie_doc_text
))
    # Filter: names that heuristically look like cookie names
doc_cookies: set[str] = set()
for c in candidates:
cl = c.lower()
if any(p in cl for p in (
"_ga", "_gid", "_gcl", "_fbp", "uc_", "ot_",
"cookieconsent", "sessionid", "csrf", "ajs_", "amp_",
"datadome", "incap_", "_pk_", "wp-", "yt-",
)):
doc_cookies.add(cl)
elif re.match(r"^[a-z][a-z0-9_]{3,30}$", cl) and (
"cookie" in cl or "consent" in cl or "track" in cl or "session" in cl
):
doc_cookies.add(cl)
if len(doc_cookies) < 5 or not banner_cookies:
        return None  # Data too thin for a meaningful statement.
only_in_doc = doc_cookies - banner_cookies
only_in_banner = banner_cookies - doc_cookies
if len(only_in_doc) < 5 and len(only_in_banner) < 3:
        return None  # Tolerable deviation.
severity = "MEDIUM"
    # HIGH when both sides diverge heavily: then the cross-reference
    # is clearly missing.
if len(only_in_doc) >= 15 and len(only_in_banner) >= 5:
severity = "HIGH"
return {
"severity": severity,
"code": "banner_cookie_doc_mismatch",
"label": (
f"Cookie-Liste im Banner-Einstellungen ({len(banner_cookies)}) "
f"weicht von Cookie-Richtlinie ({len(doc_cookies)}) ab"
),
"detail": (
f"Nur im Cookie-Dokument: {len(only_in_doc)} Cookies (Beispiele: "
f"{', '.join(sorted(only_in_doc)[:5])}). "
f"Nur im Banner: {len(only_in_banner)} Cookies. "
"Empfehlung: eine der beiden Quellen als Single-Source-of-Truth "
"definieren und die andere automatisch generieren."
),
"legal_basis": (
"Art. 13(1)(c) DSGVO + Art. 12 DSGVO — Informationen ueber die "
"Verarbeitung muessen vollstaendig und konsistent sein."
),
}
_VENDOR_LIST_SIGNALS = (
"google analytics", "google ads", "facebook pixel", "meta pixel",
"hotjar", "matomo", "etracker", "salesforce", "hubspot",
"linkedin insight", "twitter conversion", "tiktok pixel",
"criteo", "the trade desk", "doubleclick",
)
def _vendors_mentioned_in_text(text: str) -> set[str]:
if not text:
return set()
t = text.lower()
return {v for v in _VENDOR_LIST_SIGNALS if v in t}
def check_three_source_vendor_consistency(
    doc_texts: dict[str, str] | None,
    cmp_vendors: list | None,
) -> dict | None:
    """P33: three-column consistency: DSE vs cookie doc vs banner vendors.

    If a vendor (e.g. 'Google Analytics') is named in the DSE (privacy
    policy) and in the cookie policy but does NOT appear in the banner
    vendor list (or vice versa), the three sources are not consistent.
    MEDIUM finding with a list of the vendors missing from each source.
    """
if not doc_texts:
return None
dse_v = _vendors_mentioned_in_text(doc_texts.get("dse") or "")
cookie_v = _vendors_mentioned_in_text(doc_texts.get("cookie") or "")
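    banner_v: set[str] = set()
    for v in (cmp_vendors or []):
        if not isinstance(v, dict):
            continue
        name = (v.get("name") or "").strip().lower()
        if not name:
            # "" is a substring of every signal; skip unnamed vendors.
            continue
        for sig in _VENDOR_LIST_SIGNALS:
            if sig in name or name in sig:
                banner_v.add(sig)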
sources_with_data = sum(1 for s in (dse_v, cookie_v, banner_v) if s)
if sources_with_data < 2:
return None
    # Vendors present in at least one source but not in all available ones
universe = dse_v | cookie_v | banner_v
issues: list[str] = []
for vendor in sorted(universe):
missing_in = []
if dse_v and vendor not in dse_v:
missing_in.append("DSE")
if cookie_v and vendor not in cookie_v:
missing_in.append("Cookie-Doc")
if banner_v and vendor not in banner_v:
missing_in.append("Banner-Liste")
if missing_in and len(missing_in) < sources_with_data:
issues.append(f'{vendor} (fehlt in: {", ".join(missing_in)})')
if not issues:
return None
return {
"severity": "MEDIUM",
"code": "three_source_vendor_inconsistency",
"label": (
f"{len(issues)} Vendor{'en' if len(issues) != 1 else ''} "
"nicht konsistent zwischen DSE, Cookie-Richtlinie und Banner"
),
"detail": (
"Folgende Vendors sind nicht in allen Quellen genannt: "
+ "; ".join(issues[:8])
+ (" ..." if len(issues) > 8 else "")
+ ". Empfehlung: zentrale Vendor-Liste pflegen und in alle "
"drei Dokumenttypen propagieren."
),
"legal_basis": "Art. 13(1)(c)+(e) DSGVO + EDPB 5/2020 — die "
"Empfaenger / Drittlandtransfers muessen ueber alle "
"Touch-Points konsistent kommuniziert werden.",
}
def check_banner_vs_cmp_partner_count(
    banner_result: dict,
    cmp_vendors: list | None,
) -> dict | None:
    """P75: banner names N partners, CMP payload lists far more.

    If the banner text claims "5 Partner" or "Wir und unsere Partner" while
    the CMP payload contains 100+ vendors, the user is misled. Only a
    numeric claim is evaluated; banner text without a number returns None.
    """
cmp_count = len(cmp_vendors or [])
if cmp_count < 20:
return None
initial_ph = (_phases(banner_result).get("initial")
or _phases(banner_result).get("before_accept") or {})
banner_text = (initial_ph.get("banner_text") or "")[:5000]
if not banner_text:
return None
m = re.search(r"\b(\d{1,4})\s*(?:partner|drittanbieter|vendor|"
r"anbieter|dienstleister)", banner_text, re.I)
if not m:
return None
claimed = int(m.group(1))
if claimed >= cmp_count * 0.6:
        return None  # The number in the banner is plausible.
return {
"severity": "HIGH",
"code": "banner_understates_vendor_count",
"label": (
f"Banner-Text nennt {claimed} Partner, CMP-Payload listet "
f"{cmp_count} Vendors"
),
"detail": (
f"Die im Banner-Text genannte Zahl ({claimed}) unterschaetzt die "
f"tatsaechliche Anzahl der Empfaenger ({cmp_count}) deutlich. "
"Empfehlung: Banner-Text auf die echte Vendor-Zahl heben oder "
"die Vendor-Liste reduzieren."
),
"legal_basis": (
"Art. 13(1)(e) DSGVO + EDPB 5/2020 — die Empfaenger / "
"Empfaengerkategorien muessen vollstaendig und nicht "
"verharmlosend angegeben sein."
),
}
def check_banner_copyability(banner_result: dict) -> dict | None:
    """P51a: banner text must be copyable.

    CSS user-select:none or -webkit-user-select:none prevents that
    (Art. 7(2) DSGVO: intelligible and in a form that allows later review).
    """
if not isinstance(banner_result, dict):
return None
phases = banner_result.get("phases") or {}
initial = phases.get("initial") or phases.get("before_accept") or {}
html = (initial.get("banner_html") or "")[:50000].lower()
if not html:
return None
blocked_signals = [
"user-select:none", "user-select: none",
"-webkit-user-select:none", "-webkit-user-select: none",
"-moz-user-select:none", "pointer-events:none",
"oncopy=\"return false", "onselectstart=\"return false",
]
hits = [s for s in blocked_signals if s in html]
if not hits:
return None
return {
"severity": "MEDIUM",
"code": "banner_not_copyable",
"label": "Banner-Text laesst sich nicht kopieren "
"(user-select:none / oncopy disabled)",
"detail": (
f'Im Banner-HTML gefunden: {", ".join(hits[:3])}. Der Nutzer '
"kann den Banner-Text nicht in eine Mail / Doku einfuegen, was "
"die spaetere Pruefung erschwert. Empfehlung: das CSS entfernen "
"oder explizit auf 'auto' setzen."
),
"legal_basis": "Art. 7 (1)+(2) DSGVO + EDPB 5/2020 — Einwilligungen "
"muessen in verstaendlicher und zugaenglicher Form "
"erteilt werden; eine spaetere Pruefung darf nicht "
"technisch erschwert werden.",
}
def check_consent_history(banner_result: dict) -> dict | None:
    """P51b: there must be a way to view one's own consent history.

    Art. 7(3): withdrawal must be as easy as giving consent, which
    presupposes that users KNOW what they consented to.
    """
if not isinstance(banner_result, dict):
return None
phases = banner_result.get("phases") or {}
blob_parts: list[str] = []
for ph in phases.values():
if isinstance(ph, dict):
blob_parts.append((ph.get("banner_text") or "")[:5000])
blob_parts.append((ph.get("banner_html") or "")[:20000])
blob = " ".join(blob_parts).lower()
if not blob:
return None
history_signals = [
"meine einwilligung", "consent-historie", "consent history",
"einwilligungshistorie", "einwilligungs-historie",
"ihre einwilligungen", "datenschutz-cockpit",
"privacy dashboard", "einwilligungs-protokoll",
"consent record", "consent log",
]
if any(s in blob for s in history_signals):
return None
return {
"severity": "MEDIUM",
"code": "consent_history_missing",
"label": "Keine sichtbare Consent-Historie / 'Meine Einwilligungen'-Ansicht",
"detail": (
"Im Banner und in den verlinkten Footer-Bereichen ist keine "
"Moeglichkeit erkennbar, die eigene Einwilligungs-Historie "
"einzusehen oder zu exportieren. Empfehlung: einen "
"'Meine Einwilligungen'-Bereich verlinken (Borlabs / Cookiebot / "
"Usercentrics bieten dafuer fertige Komponenten)."
),
"legal_basis": "Art. 7 (3) DSGVO + EDPB 5/2020 — der Widerruf muss "
"ebenso einfach sein wie die Erteilung, was eine "
"Sichtbarmachung der eigenen Einwilligungen voraussetzt.",
}
def run_all(banner_result: dict, cookie_doc_text: str | None = None,
cmp_vendors: list | None = None,
doc_texts: dict[str, str] | None = None) -> list[dict]:
findings: list[dict] = []
try:
f1 = check_cmp_tool_availability(banner_result)
if f1:
findings.append(f1)
except Exception as e:
logger.warning("P92 cmp_tool_availability failed: %s", e)
try:
f2 = check_init_banner_vs_cookie_doc(banner_result, cookie_doc_text)
if f2:
findings.append(f2)
except Exception as e:
logger.warning("P94 init_vs_cookie_doc failed: %s", e)
try:
f3 = check_banner_vs_cmp_partner_count(banner_result, cmp_vendors)
if f3:
findings.append(f3)
except Exception as e:
logger.warning("P75 banner_vs_cmp_count failed: %s", e)
try:
f4 = check_three_source_vendor_consistency(doc_texts, cmp_vendors)
if f4:
findings.append(f4)
except Exception as e:
logger.warning("P33 three_source_vendor failed: %s", e)
try:
f5 = check_banner_copyability(banner_result)
if f5:
findings.append(f5)
except Exception as e:
logger.warning("P51a copyability failed: %s", e)
try:
f6 = check_consent_history(banner_result)
if f6:
findings.append(f6)
except Exception as e:
logger.warning("P51b consent_history failed: %s", e)
return findings
def build_consistency_block_html(findings: list[dict]) -> str:
if not findings:
return ""
items: list[str] = []
for f in findings:
sev = f.get("severity", "MEDIUM")
sev_color = "#dc2626" if sev == "HIGH" else "#d97706"
items.append(
f'<li style="margin-bottom:10px;font-size:11px;line-height:1.5">'
f'<strong style="color:{sev_color}">[{sev}] {f.get("label","")}</strong>'
f'<div style="color:#475569;margin-top:3px">{f.get("detail","")}</div>'
f'<div style="color:#94a3b8;margin-top:2px;font-style:italic">'
f'{f.get("legal_basis","")}</div>'
f'</li>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fef3c7;border:1px solid #fcd34d;border-radius:8px">'
'<div style="font-size:11px;color:#92400e;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Banner-Konsistenz-Pruefung</div>'
f'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
f'{len(findings)} Konsistenz-Finding{"s" if len(findings) != 1 else ""} '
'zwischen Banner-UI und Cookie-Richtlinie</h3>'
'<ul style="margin:8px 0 0 18px;padding:0">'
+ "".join(items) +
'</ul></div>'
)
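
The P75 check boils down to one regular expression plus a ratio test. Below is a minimal standalone sketch: the function names `claimed_partner_count` and `understates` are hypothetical and not part of the module, while the pattern, the minimum of 20 CMP vendors and the 0.6 ratio are copied from `check_banner_vs_cmp_partner_count`.

```python
from __future__ import annotations

import re

# Same pattern as in the check above: a 1-4 digit number followed by a
# German or English word for "partner/vendor".
_PARTNER_RE = re.compile(
    r"\b(\d{1,4})\s*(?:partner|drittanbieter|vendor|anbieter|dienstleister)",
    re.I,
)


def claimed_partner_count(banner_text: str) -> int | None:
    """Return the partner count stated in the banner text, if any."""
    m = _PARTNER_RE.search(banner_text)
    return int(m.group(1)) if m else None


def understates(banner_text: str, cmp_count: int, ratio: float = 0.6) -> bool:
    """True when the banner claims clearly fewer partners than the CMP lists."""
    if cmp_count < 20:
        return False  # Too few vendors for the check to be meaningful.
    claimed = claimed_partner_count(banner_text)
    return claimed is not None and claimed < cmp_count * ratio


print(understates("Wir und unsere 5 Partner verarbeiten Daten", 100))  # True
print(understates("Wir und unsere 80 Partner verarbeiten Daten", 100))  # False
print(understates("Wir und unsere Partner verarbeiten Daten", 100))    # False
```

The third call shows a limit of the approach: banner text without a number is never flagged, even when the CMP payload is large.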
@@ -0,0 +1,44 @@
"""
P85: banner screenshot block in the mail.

Embeds the banner screenshot captured by consent-tester
(banner_result.banner_screenshot_b64) as a data-URI <img> in the mail:
"this is what your banner looked like at audit time", visual evidence for
disputes with the marketing team or the DSB.
"""
from __future__ import annotations
import logging
logger = logging.getLogger(__name__)
def build_banner_screenshot_html(banner_result: dict | None) -> str:
if not isinstance(banner_result, dict):
return ""
b64 = banner_result.get("banner_screenshot_b64") or ""
if not b64 or len(b64) < 200:
return ""
provider = banner_result.get("banner_provider") or "Generic"
detected = banner_result.get("banner_detected")
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:12px 16px;'
'background:#f8fafc;border:1px solid #cbd5e1;border-radius:8px">'
'<div style="font-size:11px;color:#475569;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Screenshot des Cookie-Banners zum Audit-Zeitpunkt</div>'
f'<h3 style="margin:0 0 6px;font-size:13px;color:#1e293b">'
f'Provider: <strong>{provider}</strong> · '
f'erkannt: <strong>{"ja" if detected else "nein"}</strong></h3>'
'<p style="margin:0 0 8px;font-size:11px;color:#64748b;line-height:1.5">'
'Visueller Beweis wie das Banner zum Zeitpunkt des Audits angezeigt '
'wurde. Bei spaeterer Aenderung des Banners bitte mit diesem '
'Screenshot abgleichen.'
'</p>'
f'<img src="data:image/png;base64,{b64}" alt="Cookie-Banner" '
f'style="max-width:100%;height:auto;border:1px solid #cbd5e1;'
f'border-radius:4px;display:block">'
'</div>'
)
@@ -0,0 +1,221 @@
"""
P80: replay pipeline (mini version v1).
Loads a persisted snapshot and re-renders the audit mail with the
CURRENT mail-render code. Useful for:
* testing mail layout changes (P63-P67, P82 one-pager, P84 diff mode)
* adjusting action recipes
* iterating on the disclaimer text
* tuning the pattern-notice logic
NOT included (planned for v2):
* re-running the MC scorecard with the current scope_doc_type filter (P72);
this requires refactoring the MC pipeline out of _run_compliance_check
* re-running the vendor redundancy analysis
Effect of v1: a 7 min re-scan becomes 2-5 s for mail layout iterations.
Effect of v2 (later): the same speed-up for MC filter tests.
"""
from __future__ import annotations
import logging
from typing import Any
from sqlalchemy.orm import Session
from compliance.services.check_snapshot import load_snapshot
logger = logging.getLogger(__name__)
def replay_from_snapshot(
db: Session,
snapshot_id: str,
recipient: str | None = None,
dry_run: bool = False,
) -> dict:
"""Replay audit mail render from snapshot.
Args:
db: SQLAlchemy session
snapshot_id: UUID of snapshot to replay
recipient: Override email recipient. None = skip send.
dry_run: If True, render HTML but do not send mail.
Returns:
{"snapshot_id", "html_size", "sections", "mail_sent", "preview"}
"""
snap = load_snapshot(db, snapshot_id)
if not snap:
return {"error": "snapshot not found", "snapshot_id": snapshot_id}
doc_entries = snap.get("doc_entries") or []
banner_result = snap.get("banner_result") or {}
profile_dict = snap.get("profile") or {}
cmp_vendors = snap.get("cmp_vendors") or []
site_label = snap.get("site_label") or snap.get("site_domain")
# Reconstruct the doc_texts mapping (the input to the mail render).
# The snapshot schema stores the text under "text" (not "full_text").
doc_texts: dict[str, str] = {}
for e in doc_entries:
dt = e.get("doc_type", "")
txt = (e.get("text") or e.get("full_text") or e.get("text_preview") or "").strip()
if dt and txt:
doc_texts[dt] = txt
# Build results list mock (just enough for mail-render)
def _dict_to_result(d: dict) -> Any:
"""Best-effort reconstruction. Snapshot didn't persist DocCheckResult
so we fake minimal fields. For real MC-replay (v2) we'd re-run the
check_document_completeness function against the snapshot text."""
return type("R", (), {
"doc_type": d.get("doc_type", "other"),
"label": d.get("doc_type", "Dokument"),
"completeness_pct": d.get("completeness_pct", 0),
"correctness_pct": d.get("correctness_pct"),
"checks": [],
"error": d.get("error", ""),
})()
results = [_dict_to_result(e) for e in doc_entries]
# Render mail sections
section_sizes: dict[str, int] = {}
parts: list[str] = []
# P82: management one-pager first (5-bullet summary)
try:
from compliance.services.gf_one_pager import build_gf_one_pager_html
gf_html = build_gf_one_pager_html(
site_name=site_label or "",
scorecard=None, # the snapshot does not contain an MC scorecard
banner_result=banner_result,
library_mismatch_findings=None, # meant to be filled further below
scan_context=snap.get("scan_context"),
)
parts.append(gf_html)
section_sizes["gf_one_pager"] = len(gf_html)
except Exception as e:
logger.warning("Replay: GF-1-pager failed: %s", e)
try:
from compliance.api.agent_doc_check_critical import build_critical_findings_html
critical_html = build_critical_findings_html(banner_result, None, results) or ""
parts.append(critical_html)
section_sizes["critical"] = len(critical_html)
except Exception as e:
logger.warning("Replay: critical-block failed: %s", e)
try:
from compliance.api.scope_disclaimer import build_scope_disclaimer_html
disclaimer = build_scope_disclaimer_html()
parts.append(disclaimer)
section_sizes["disclaimer"] = len(disclaimer)
except Exception as e:
logger.warning("Replay: disclaimer failed: %s", e)
try:
from compliance.api.agent_doc_check_banner import build_banner_deep_html
banner_html = build_banner_deep_html(banner_result) or ""
parts.append(banner_html)
section_sizes["banner"] = len(banner_html)
except Exception as e:
logger.warning("Replay: banner-block failed: %s", e)
try:
from compliance.api.vvt_table_renderer import build_vvt_table_html
vvt_html = build_vvt_table_html(cmp_vendors) or ""
parts.append(vvt_html)
section_sizes["vvt"] = len(vvt_html)
except Exception as e:
logger.warning("Replay: vvt failed: %s", e)
# P35 + P77 + P78 + P36: text signals (save label, cookies in the DSE
# (privacy policy), JC clause, social embeds)
try:
from compliance.services.doc_text_signals import (
run_all as run_signal_checks,
build_signals_block_html,
)
cookie_doc_missing = not bool(doc_texts.get("cookie"))
sig_findings = run_signal_checks(
banner_result, doc_texts, cookie_doc_missing,
)
if sig_findings:
sig_html = build_signals_block_html(sig_findings)
parts.append(sig_html)
section_sizes["signals"] = len(sig_html)
except Exception as e:
logger.warning("Replay: signals block failed: %s", e)
# P92 + P94: banner consistency
try:
from compliance.services.banner_consistency_checks import (
run_all as run_consistency_checks,
build_consistency_block_html,
)
cookie_doc_for_check = doc_texts.get("cookie") or doc_texts.get("dse") or ""
cons = run_consistency_checks(
banner_result or {}, cookie_doc_for_check, cmp_vendors,
doc_texts=doc_texts,
)
if cons:
cons_html = build_consistency_block_html(cons)
parts.append(cons_html)
section_sizes["consistency"] = len(cons_html)
except Exception as e:
logger.warning("Replay: consistency block failed: %s", e)
# P102: cookie classification check
try:
from compliance.services.cookie_library_mismatch import (
detect_mismatches, build_mismatch_block_html,
)
cookies_seen: list[str] = []
for ph in (banner_result.get("phases") or {}).values():
if isinstance(ph, dict):
for ck in (ph.get("cookies") or []):
if isinstance(ck, str):
cookies_seen.append(ck)
elif isinstance(ck, dict) and ck.get("name"):
cookies_seen.append(ck["name"])
doc_for_check = doc_texts.get("cookie") or doc_texts.get("dse") or ""
if cookies_seen and doc_for_check:
mm = detect_mismatches(db, cookies_seen, doc_for_check)
if mm:
mm_html = build_mismatch_block_html(mm)
parts.append(mm_html)
section_sizes["library_mismatch"] = len(mm_html)
except Exception as e:
logger.warning("Replay: mismatch block failed: %s", e)
full_html = "".join(parts)
result = {
"snapshot_id": snapshot_id,
"check_id": snap.get("check_id"),
"site_domain": snap.get("site_domain"),
"html_size": len(full_html),
"sections": section_sizes,
"mail_sent": False,
"preview": full_html[:500] + "..." if len(full_html) > 500 else full_html,
"full_html": full_html, # P88 PDF-Export braucht das volle HTML.
}
if recipient and not dry_run:
try:
from compliance.services.smtp_sender import send_email
email_res = send_email(
recipient=recipient,
subject=f"[REPLAY] {site_label} (Snapshot {snapshot_id[:8]})",
body_html=full_html,
)
result["mail_sent"] = (email_res.get("status") == "sent")
result["mail_status"] = email_res.get("status")
except Exception as e:
logger.warning("Replay: mail send failed: %s", e)
result["mail_send_error"] = str(e)[:200]
return result
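Each mail section in `replay_from_snapshot` follows the same pattern: import, build, append, record the size, and log on failure. That pattern could be factored into one helper. A minimal sketch, assuming a hypothetical `render_section` function that does not exist in this diff:

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def render_section(
    name: str,
    builder: Callable[[], str],
    parts: list[str],
    section_sizes: dict[str, int],
) -> None:
    """Hypothetical helper: run one section builder and record its size."""
    try:
        html = builder() or ""
    except Exception as exc:  # a failing section must not abort the replay
        logger.warning("Replay: %s failed: %s", name, exc)
        return
    parts.append(html)
    section_sizes[name] = len(html)


def broken_builder() -> str:
    raise RuntimeError("renderer unavailable")


parts: list[str] = []
sizes: dict[str, int] = {}
render_section("disclaimer", lambda: "<p>scope</p>", parts, sizes)
render_section("broken", broken_builder, parts, sizes)  # logged, then skipped
```

Unlike the original blocks, this sketch appends even an empty string only when the builder succeeds, which matches the behaviour of the unconditional sections (critical, disclaimer, banner, vvt) but not the conditional ones (signals, consistency, mismatch).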
@@ -0,0 +1,179 @@
"""
P80: snapshot and replay helper.
Persists the raw data of a compliance check run (DSE text, banner HTML,
cookies, CMP vendors, profile) so that the audit pipeline can later re-run
the mail-render and MC-scoring logic without another browser crawl.
Use cases:
* Logic iteration (MC filter P72, mail layout, action recipes) without
a 7 min re-crawl.
* Regression testing: golden-truth library (P81).
* Diff mode: "what has changed since the last snapshot" (P84).
"""
from __future__ import annotations
import json
import logging
from typing import Any
from urllib.parse import urlparse
from sqlalchemy import text
from sqlalchemy.orm import Session
logger = logging.getLogger(__name__)
def _to_jsonb(obj: Any) -> str:
"""Serialize to JSON-string for psycopg2 JSONB insertion."""
return json.dumps(obj, default=str, ensure_ascii=False)
def _derive_site_domain(doc_entries: list[dict]) -> str:
for e in doc_entries or []:
url = (e.get("url") or "").strip()
if url:
try:
netloc = urlparse(url).netloc.lower().replace("www.", "")
if netloc:
return netloc
except Exception:
continue
return "unknown"
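`_derive_site_domain` strips `www.` with `str.replace`, which removes the substring anywhere in the host, not only at the start. A minimal sketch of a prefix-only variant, assuming a hypothetical helper `normalize_domain`; note that it uses `hostname`, so unlike the original `netloc` it also drops any port:

```python
from urllib.parse import urlparse


def normalize_domain(url_or_host: str) -> str:
    """Hypothetical variant: lower-case the host and drop one leading 'www.'."""
    raw = (url_or_host or "").strip().lower()
    # urlparse only fills the network location when '//' is present.
    host = urlparse(raw if "//" in raw else f"//{raw}").hostname or ""
    return host[4:] if host.startswith("www.") else host


from_url = normalize_domain("https://www.example.com/datenschutz")
edge_case = normalize_domain("awww.example.com")
with_replace = "awww.example.com".replace("www.", "")
```

If such a helper were adopted, the write path and `list_snapshots_for_domain` would need to use the same function, otherwise stored and queried domains could diverge.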
def save_snapshot(
db: Session,
check_id: str,
doc_entries: list[dict],
banner_result: dict | None,
profile: Any,
cmp_vendors: list[dict] | None = None,
scan_context: dict | None = None,
site_label: str | None = None,
notes: str | None = None,
) -> str | None:
"""Persist scan raw data. Returns snapshot UUID on success."""
try:
profile_dict: dict = {}
if profile is not None:
if hasattr(profile, "__dict__"):
profile_dict = {k: v for k, v in profile.__dict__.items()
if not k.startswith("_")}
elif isinstance(profile, dict):
profile_dict = profile
domain = _derive_site_domain(doc_entries or [])
result = db.execute(
text("""
INSERT INTO compliance.compliance_check_snapshots
(check_id, site_domain, site_label,
doc_entries, banner_result, profile,
scan_context, cmp_vendors, notes)
VALUES (:cid, :dom, :lbl,
CAST(:de AS JSONB), CAST(:br AS JSONB), CAST(:pr AS JSONB),
CAST(:sc AS JSONB), CAST(:cv AS JSONB), :nt)
RETURNING id
"""),
{
"cid": check_id,
"dom": domain,
"lbl": site_label,
"de": _to_jsonb(doc_entries or []),
"br": _to_jsonb(banner_result) if banner_result else None,
"pr": _to_jsonb(profile_dict) if profile_dict else None,
"sc": _to_jsonb(scan_context) if scan_context else None,
"cv": _to_jsonb(cmp_vendors) if cmp_vendors else None,
"nt": notes,
},
)
snapshot_id = str(result.fetchone()[0])
db.commit()
logger.info(
"P80: snapshot saved id=%s check=%s domain=%s docs=%d",
snapshot_id, check_id, domain, len(doc_entries or []),
)
return snapshot_id
except Exception as e:
logger.warning("P80 snapshot save failed for %s: %s", check_id, e)
try:
db.rollback()
except Exception:
pass
return None
def load_snapshot(db: Session, snapshot_id: str) -> dict | None:
"""Load a snapshot by UUID. Returns dict with all fields or None."""
try:
row = db.execute(
text("""
SELECT id, check_id, site_domain, site_label,
doc_entries, banner_result, profile,
scan_context, cmp_vendors, created_at,
replay_count, notes
FROM compliance.compliance_check_snapshots
WHERE id = CAST(:sid AS uuid)
"""),
{"sid": snapshot_id},
).fetchone()
if not row:
return None
db.execute(
text("""
UPDATE compliance.compliance_check_snapshots
SET replay_count = replay_count + 1,
last_replay_at = now()
WHERE id = CAST(:sid AS uuid)
"""),
{"sid": snapshot_id},
)
db.commit()
return {
"id": str(row[0]),
"check_id": row[1],
"site_domain": row[2],
"site_label": row[3],
"doc_entries": row[4] or [],
"banner_result": row[5],
"profile": row[6] or {},
"scan_context": row[7] or {},
"cmp_vendors": row[8] or [],
"created_at": str(row[9]),
"replay_count": row[10],
"notes": row[11],
}
except Exception as e:
logger.warning("P80 snapshot load failed for %s: %s", snapshot_id, e)
return None
def list_snapshots_for_domain(db: Session, domain: str, limit: int = 20) -> list[dict]:
"""List recent snapshots for a domain (for diff-mode P84)."""
try:
rows = db.execute(
text("""
SELECT id, check_id, site_domain, created_at, replay_count, notes
FROM compliance.compliance_check_snapshots
WHERE site_domain = :dom
ORDER BY created_at DESC
LIMIT :lim
"""),
{"dom": domain.lower().replace("www.", ""), "lim": limit},
).fetchall()
return [
{
"id": str(r[0]),
"check_id": r[1],
"site_domain": r[2],
"created_at": str(r[3]),
"replay_count": r[4],
"notes": r[5],
}
for r in rows
]
except Exception as e:
logger.warning("P80 list_snapshots failed for %s: %s", domain, e)
return []
@@ -66,27 +66,35 @@ def _ensure_db() -> None:
CREATE TABLE IF NOT EXISTS check_payloads (
check_id TEXT PRIMARY KEY,
vendors TEXT, -- JSON list[dict]
profile TEXT -- JSON dict
profile TEXT, -- JSON dict
banner TEXT -- P20: JSON dict full banner_result
);
""")
# P20 migration: add the 'banner' column to databases created before P20
try:
conn.execute("ALTER TABLE check_payloads ADD COLUMN banner TEXT")
except sqlite3.OperationalError:
pass
def record_check_payload(
check_id: str,
vendors: list[dict] | None,
profile: dict | None,
banner: dict | None = None,
) -> None:
"""Persist cmp_vendors + extracted_profile for later migration use."""
"""Persist cmp_vendors + extracted_profile + banner_result (P20)."""
try:
_ensure_db()
with sqlite3.connect(DB_PATH) as conn:
conn.execute(
"INSERT OR REPLACE INTO check_payloads "
"(check_id, vendors, profile) VALUES (?, ?, ?)",
"(check_id, vendors, profile, banner) VALUES (?, ?, ?, ?)",
(
check_id,
json.dumps(vendors or [], ensure_ascii=False),
json.dumps(profile or {}, ensure_ascii=False),
json.dumps(banner or {}, ensure_ascii=False) if banner else None,
),
)
conn.commit()
@@ -95,13 +103,13 @@ def record_check_payload(
def get_check_payload(check_id: str) -> dict | None:
"""Load cmp_vendors + extracted_profile for a previous check."""
"""Load cmp_vendors + extracted_profile + banner_result for a previous check."""
try:
_ensure_db()
with sqlite3.connect(DB_PATH) as conn:
conn.row_factory = sqlite3.Row
row = conn.execute(
"SELECT vendors, profile FROM check_payloads WHERE check_id=?",
"SELECT vendors, profile, banner FROM check_payloads WHERE check_id=?",
(check_id,),
).fetchone()
if not row:
@@ -109,6 +117,7 @@ def get_check_payload(check_id: str) -> dict | None:
return {
"vendors": json.loads(row["vendors"] or "[]"),
"profile": json.loads(row["profile"] or "{}"),
"banner": json.loads(row["banner"]) if row["banner"] else None,
}
except Exception as e:
logger.warning("get_check_payload failed: %s", e)
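The P20 migration in `_ensure_db` relies on `ALTER TABLE` raising `sqlite3.OperationalError` when the column already exists, which also hides unrelated operational errors. An alternative is to inspect the schema first. A minimal sketch, assuming a hypothetical helper `ensure_column` and an in-memory database for illustration:

```python
import sqlite3


def ensure_column(conn: sqlite3.Connection, table: str, column: str, decl: str) -> bool:
    """Hypothetical helper: add `column` to `table` if missing.

    Returns True when the column was added. `table`, `column`, and `decl`
    are interpolated into SQL, so they must be trusted constants.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE check_payloads "
    "(check_id TEXT PRIMARY KEY, vendors TEXT, profile TEXT)"
)
first = ensure_column(conn, "check_payloads", "banner", "TEXT")
second = ensure_column(conn, "check_payloads", "banner", "TEXT")
```

The second call is a no-op, so the helper can run on every start-up just like the original `try`/`except` block.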
@@ -82,6 +82,8 @@ class CompliancePDFGenerator:
self._add_consent_section(story, ss, tenant_id)
# Org Roles
self._add_role_section(story, ss, tenant_id, project_id)
# Stage 2: sources and license footer (attribution renderer, Task #23)
self._add_attribution_footer(story, ss)
# Footer
story.append(Spacer(1, 15 * mm))
story.append(Paragraph("Erstellt mit BreakPilot Compliance SDK", ss["Small"]))
@@ -214,3 +216,64 @@ class CompliancePDFGenerator:
story.append(Paragraph("Keine Rollen zugewiesen.", ss["Body2"]))
except Exception:
story.append(Paragraph("Rollen-Tabelle nicht vorhanden.", ss["Small"]))
def _add_attribution_footer(self, story, ss) -> None:
"""Stufe 2 of the attribution renderer (Task #23).
Adds a "Quellen und Lizenzen" section listing the platform's
license-rule distribution and, crucially, the mandatory
attribution lines for Rule-2 sources (CC-BY-SA, OECD, Apache).
For Rule 1 sources the attribution is optional but rendered as
a brief reference list for auditability.
The section is added to every generated compliance PDF so each
export carries its own provenance footer; blanket notices in the
terms and conditions or imprint are not legally sufficient (see
project_attribution_strategy.md).
"""
try:
rows = self.db.execute(text("""
SELECT cc.license_rule, COUNT(*) AS n,
array_agg(DISTINCT cpl.source_regulation ORDER BY cpl.source_regulation)
FILTER (WHERE cpl.source_regulation IS NOT NULL) AS sources
FROM compliance.canonical_controls cc
LEFT JOIN compliance.control_parent_links cpl ON cpl.control_uuid = cc.id
WHERE cc.license_rule IS NOT NULL
GROUP BY cc.license_rule
ORDER BY cc.license_rule
""")).fetchall()
except Exception as e:
logger.warning("attribution footer skipped: %s", e)
return
if not rows:
return
rule_labels = {1: "Hoheitsrecht/Public Domain (woertlich)",
2: "Mit Attribution (CC-BY u.ae.)",
3: "Nur Identifier-Verweis"}
story.append(Spacer(1, 8 * mm))
story.append(Paragraph("Quellen &amp; Lizenzen", ss["Section"]))
story.append(Paragraph(
"Dieser Bericht stuetzt sich auf klassifizierte Compliance-Controls "
"aus den folgenden Quellen. Jede Quelle ist deterministisch in eine "
"der drei Lizenzregeln (R1-R3) eingeordnet.", ss["Body2"]))
for r in rows:
rule = int(r.license_rule)
sources = (r.sources or [])[:8]
label = rule_labels.get(rule, f"Regel {rule}")
head = f"<b>R{rule}: {label}</b> &nbsp; ({r.n} Controls)"
story.append(Paragraph(head, ss["Body2"]))
if sources:
src_text = "; ".join(sources)
if len(r.sources or []) > 8:
src_text += f" und {len(r.sources) - 8} weitere"
story.append(Paragraph(src_text, ss["Small"]))
if rule == 2:
story.append(Paragraph(
"Pflicht-Attribution: Inhalte aus den oben genannten Quellen sind "
"unter den jeweiligen freien Lizenzen (z.B. CC-BY-SA, OECD-Public, "
"Apache-2.0) wiedergegeben. Original-Urheber bleibt in jeder "
"Weiterverwendung zu nennen.", ss["Small"]))
story.append(Spacer(1, 2 * mm))
@@ -0,0 +1,303 @@
"""
P59 Cookie-Behavior-Validator.
4 Layer:
A) Open Cookie Database lookup (declared category vs library category)
B) Network-Traffic-Analyse (cookie value sent to third-party domains)
C) Value-Pattern (Hash/UUID/PII heuristics on "essential"-declared cookies)
D) Cross-Site frequency (from library metadata, when available)
Returns list of findings with severity + Art. 5(1)(b) DSGVO reference.
"""
from __future__ import annotations
import logging
import re
from typing import Iterable
from sqlalchemy import text
from sqlalchemy.orm import Session
logger = logging.getLogger(__name__)
# --- Patterns for Layer C ---
_HASH_PATTERN = re.compile(r"^[a-f0-9]{32,64}$", re.IGNORECASE)
_UUID_PATTERN = re.compile(
r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
re.IGNORECASE,
)
_BASE64_LONG = re.compile(r"^[A-Za-z0-9+/=]{40,}$")
_PII_KEYS = ("email", "@", "user_id", "userid", "username", "phone")
# --- Purpose keyword bags for Layer A2 (purpose match) ---
_PURPOSE_KEYWORDS = {
"marketing": {
"tracking", "tracker", "targeting", "profiling", "profile",
"advertis", "marketing", "remarket", "retargeting", "conversion",
"audience", "behavioral", "behaviour", "personali", "interest",
"campaign", "promotion", "pixel", "fingerprint",
},
"statistics": {
"analytic", "analyse", "analyz", "measure", "measurement", "metric",
"statistic", "performance", "telemetr", "monitoring", "usage",
"reichweite", "auswert",
},
"essential": {
"session", "sitzung", "authentic", "anmeld", "login", "logout",
"security", "sicherheit", "csrf", "xsrf", "cookie consent",
"cookie-einwilligung", "technisch notwendig", "load balanc",
"lastverteil",
},
"functional": {
"preference", "praeferen", "language", "sprache", "layout", "design",
"cart", "warenkorb", "wishlist", "merkliste", "favorit", "theme",
"darkmode", "darstellung",
},
"social_media": {
"social", "facebook", "twitter", "linkedin", "instagram", "youtube",
"embed", "share", "teilen",
},
}
def _classify_purpose_text(text_value: str) -> set[str]:
"""Return set of categories whose keywords appear in the purpose-text."""
if not text_value:
return set()
t = text_value.lower()
matches = set()
for cat, kws in _PURPOSE_KEYWORDS.items():
if any(k in t for k in kws):
matches.add(cat)
return matches
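The purpose classifier is a plain substring match against keyword bags, so one text can land in several categories at once. A minimal sketch with a reduced, illustrative keyword table (`PURPOSE_KEYWORDS` here is a cut-down stand-in, not the full table from the diff):

```python
PURPOSE_KEYWORDS = {
    "marketing": {"tracking", "advertis", "remarket"},
    "essential": {"session", "login", "csrf"},
}


def classify_purpose_text(text_value: str) -> set[str]:
    """Return every category with at least one keyword inside the text."""
    lowered = (text_value or "").lower()
    return {
        cat
        for cat, keywords in PURPOSE_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    }


only_essential = classify_purpose_text("Keeps the login session alive")
mixed = classify_purpose_text("Session cookie used for remarketing")
nothing = classify_purpose_text("")
```

Because stems such as `advertis` match inside longer words, the bags catch inflected forms in both English and German, at the cost of occasional false positives.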
def _lookup_library(db: Session, cookie_name: str,
cookie_domain: str) -> dict | None:
"""Layer A: find best library match."""
# Exact domain match first, then wildcard
cur = db.execute(text("""
SELECT actual_category, purpose_en, purpose_de, vendor_name,
data_receivers, source_name, source_url, confidence
FROM compliance.cookie_library
WHERE cookie_name = :name
ORDER BY
CASE WHEN domain_pattern = :domain THEN 0
WHEN :domain ILIKE replace(domain_pattern, '*', '%') THEN 1
ELSE 2 END,
confidence DESC
LIMIT 1
"""), {"name": cookie_name, "domain": cookie_domain or ""})
row = cur.fetchone()
if not row:
return None
return {
"actual_category": row[0], "purpose_en": row[1],
"purpose_de": row[2], "vendor_name": row[3],
"data_receivers": row[4] or [],
"source_name": row[5], "source_url": row[6],
"confidence": float(row[7] or 0),
}
def _value_pattern_flag(value: str | None, declared_category: str) -> str | None:
"""Layer C: detect tracking-typical patterns in essential-declared cookies."""
if not value or declared_category not in ("essential", "functional"):
return None
v = value.strip()
if not v or len(v) < 16:
return None
if _UUID_PATTERN.match(v):
return "UUID (Persistent Identifier)"
if _HASH_PATTERN.match(v):
return f"Hash-Wert ({len(v)} Hex-Zeichen — typisch User-ID)"
if _BASE64_LONG.match(v):
return f"Base64-Long ({len(v)} Zeichen — typisch Tracking-Payload)"
vlow = v.lower()
for kw in _PII_KEYS:
if kw in vlow:
return f"PII-Marker '{kw}' im Wert"
return None
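The order of the Layer C checks matters, because a long hex string also satisfies the Base64 pattern. A minimal sketch that reuses the UUID and hash regular expressions from above under illustrative names (`UUID_RE`, `HASH_RE`, and `value_kind` are hypothetical):

```python
from __future__ import annotations

import re

HASH_RE = re.compile(r"^[a-f0-9]{32,64}$", re.IGNORECASE)
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def value_kind(value: str) -> str | None:
    """Classify a cookie value as 'uuid', 'hash', or None."""
    v = value.strip()
    if len(v) < 16:  # short values are ignored, as in the original
        return None
    if UUID_RE.match(v):
        return "uuid"
    if HASH_RE.match(v):
        return "hash"
    return None


uuid_kind = value_kind("123e4567-e89b-12d3-a456-426614174000")
hash_kind = value_kind("d41d8cd98f00b204e9800998ecf8427e")
short_kind = value_kind("de")
```

These are heuristics: a legitimate session token can look like a hash, which is presumably why the finding is rated MEDIUM rather than HIGH.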
def _category_label(cat: str) -> str:
return {
"essential": "technisch notwendig",
"functional": "funktional",
"statistics": "Analyse/Statistik",
"marketing": "Marketing/Werbung",
"social_media": "Social Media",
"unknown": "unbekannt",
}.get(cat, cat)
def validate_cookie_behavior(
db: Session,
cookies_set: Iterable[dict],
network_requests: list[dict] | None = None,
first_party_domain: str = "",
) -> list[dict]:
"""Run all 4 layers, return list of finding dicts.
Each cookie dict should have: name, domain (optional), value (optional),
declared_category (e.g. 'essential'), max_age_seconds (optional)."""
findings: list[dict] = []
network_requests = network_requests or []
fp_domain = (first_party_domain or "").lower().lstrip(".")
# Pre-index network: which receivers got which cookie?
receivers_by_cookie: dict[str, set[str]] = {}
for req in network_requests:
try:
host = (req.get("host") or req.get("url", "")).lower()
for cname in (req.get("cookies_sent") or []):
receivers_by_cookie.setdefault(cname, set()).add(host)
except Exception:
continue
for c in cookies_set or []:
name = (c.get("name") or "").strip()
if not name:
continue
declared = (c.get("declared_category") or "").lower()
domain = (c.get("domain") or "").lstrip(".").lower()
value = c.get("value")
# Layer A: library lookup + 3-Tier-Severity (Kategorie / Zweck / Kombi)
lib = _lookup_library(db, name, domain)
declared_purpose = (c.get("declared_purpose") or "").strip()
if lib and lib["actual_category"] != "unknown":
# Layer A1: category mismatch (ONLY when relevant: declared is
# essential/functional but the library says marketing/statistics)
category_mismatch = (
declared
and lib["actual_category"] != declared
and declared in ("essential", "functional")
and lib["actual_category"] in ("marketing", "statistics",
"social_media")
)
# Layer A2: purpose text mismatch
purpose_mismatch = False
purpose_explain = ""
if declared_purpose:
declared_cats = _classify_purpose_text(declared_purpose)
actual_cat = lib["actual_category"]
# Mismatch when the declared purpose text points to a different
# category than the library records (e.g. declared "Sitzung" but
# actually a marketing cookie)
if actual_cat in ("marketing", "statistics", "social_media"):
# Suspicious when the declared purpose does not mention the
# actual category (nothing about marketing/analytics)
if declared_cats and actual_cat not in declared_cats:
# and at least one "harmless" keyword is present
if declared_cats & {"essential", "functional"}:
purpose_mismatch = True
purpose_explain = (
f"Beschriebener Zweck deutet auf "
f"{', '.join(_category_label(c) for c in declared_cats)}, "
f"das Cookie wird aber tatsaechlich fuer "
f"{_category_label(actual_cat)} eingesetzt"
)
# 3-Tier-Severity
if category_mismatch and purpose_mismatch:
# CRITICAL: indication of intent / bad faith
findings.append({
"layer": "A1+A2",
"cookie_name": name,
"severity": "CRITICAL",
"type": "DUAL_MISMATCH_INTENT",
"text": (
f"Cookie '{name}' weist DOPPELTE Diskrepanz auf: "
f"deklarierte Kategorie '{_category_label(declared)}' UND "
f"deklarierter Zweck stimmen NICHT mit dem realen Verhalten "
f"('{_category_label(lib['actual_category'])}') ueberein. "
f"{purpose_explain}. {lib['source_name']}-Quelle: "
f"{lib['purpose_en'][:120] if lib['purpose_en'] else ''}. "
f"Doppel-Mismatch indiziert Vorsatz nach DSK Beschluss 2024-02 "
f"(Cookie gezielt verschleiert) — siehe Bussgeld-Risiko Art. 83 "
f"DSGVO bei wissentlicher Taeuschung. Konstruktive Annahme: "
f"haeufig Marketing-/Agentur-Versehen ohne DSB-Kontrolle."
),
"legal_ref": "Art. 5(1)(a)+(b) DSGVO + DSK Beschluss 2024-02",
"source": lib["source_url"] or lib["source_name"],
})
elif purpose_mismatch:
# HIGH: purpose does not match (ignorance or intent)
findings.append({
"layer": "A2",
"cookie_name": name,
"severity": "HIGH",
"type": "PURPOSE_TEXT_MISMATCH",
"text": (
f"Cookie '{name}': {purpose_explain}. {lib['source_name']}: "
f"{(lib['purpose_en'] or '')[:140]}. Deutet auf fehlende "
f"Detail-Pruefung des Cookie-Verhaltens — Beschreibung sollte "
f"das tatsaechliche Verhalten reflektieren (Art. 13 DSGVO + "
f"Transparenz)."
),
"legal_ref": "Art. 13(1)(c) DSGVO (Zweck-Angabe muss korrekt sein)",
"source": lib["source_url"] or lib["source_name"],
})
elif category_mismatch:
# MEDIUM: wrong category tag, may be a careless mistake
findings.append({
"layer": "A1",
"cookie_name": name,
"severity": "MEDIUM",
"type": "CATEGORY_MISMATCH",
"text": (
f"Cookie '{name}' ist als '{_category_label(declared)}' "
f"kategorisiert. {lib['source_name']} klassifiziert ihn als "
f"'{_category_label(lib['actual_category'])}'"
+ (f": {lib['purpose_en'][:120]}" if lib['purpose_en'] else "")
+ f". Vermutlich Konfigurations-Versehen im Consent-Tool "
f"(haeufig bei Migrations zwischen CMP-Anbietern). "
f"Korrektur: Cookie auf '{_category_label(lib['actual_category'])}'"
f" umstellen, Consent neu einholen."
),
"legal_ref": "Art. 5(1)(b) DSGVO (Zweckbindung)",
"source": lib["source_url"] or lib["source_name"],
})
# Layer B: network traffic
receivers = receivers_by_cookie.get(name, set())
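# Compare on a label boundary so that "evilexample.com" is not
# treated as a subdomain of "example.com".
third_party = [r for r in receivers
if r and fp_domain and r != fp_domain
and not r.endswith("." + fp_domain)]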
if third_party and declared in ("essential", "functional"):
findings.append({
"layer": "B",
"cookie_name": name,
"severity": "HIGH",
"type": "THIRD_PARTY_DESPITE_ESSENTIAL",
"text": (
f"Cookie '{name}' ist als '{_category_label(declared)}' "
f"deklariert, der Wert wird aber an {len(third_party)} "
f"externe(n) Empfaenger uebertragen: "
f"{', '.join(sorted(third_party))[:200]}. "
f"Damit liegt eine Drittlandstransfer-/Drittanbieter-Verarbeitung "
f"vor, die nicht durch die deklarierte Zweckbestimmung gedeckt ist."
),
"legal_ref": "Art. 5(1)(b) Zweckbindung + Art. 13(1)(f) DSGVO",
})
# Layer C: value pattern
flag = _value_pattern_flag(value, declared)
if flag:
findings.append({
"layer": "C",
"cookie_name": name,
"severity": "MEDIUM",
"type": "TRACKING_PATTERN_DESPITE_ESSENTIAL",
"text": (
f"Cookie '{name}' ist als '{_category_label(declared)}' "
f"deklariert, enthaelt aber: {flag}. Werte mit Tracking-Charakter "
f"sind in nicht einwilligungsbeduerftigen Kategorien fragwuerdig."
),
"legal_ref": "Art. 5(1)(b) DSGVO + DSK-OH Telemedien 2024",
})
# Layer D: cross-site frequency (later — needs metadata import)
return findings
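Layer B decides whether a receiver is a third party by comparing host names with the first-party domain. A suffix comparison is only reliable when it respects the dot boundary between labels. A minimal sketch, assuming a hypothetical helper `is_third_party` that receives bare host names (not full URLs):

```python
def is_third_party(host: str, first_party_domain: str) -> bool:
    """Return True when `host` is neither the first-party domain nor a subdomain."""
    h = (host or "").lower().strip(".")
    fp = (first_party_domain or "").lower().strip(".")
    if not h or not fp:
        return False  # without both values no statement is possible
    return h != fp and not h.endswith("." + fp)


subdomain = is_third_party("cdn.example.com", "example.com")
lookalike = is_third_party("evilexample.com", "example.com")
same_host = is_third_party("example.com", ".example.com")
external = is_third_party("tracker.net", "example.com")
```

When a request record only carries a `url`, the host would have to be extracted first (for example with `urllib.parse.urlparse`) before it is passed to such a check.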
@@ -0,0 +1,221 @@
"""
Cookie compliance audit: three-source comparison.
THIS is the actual added value of the tool:
* A. What the cookie policy DECLARES (text parse)
* B. What the browser ACTUALLY LOADED (after_accept)
* C. What our LIBRARY knows about the cookie (vendor, category)
This yields four result groups:
1. declared + loaded + known to the library -> compliant
2. loaded but NOT declared -> HIGH violation (Art. 13(1)(c) DSGVO)
3. declared but NOT loaded -> table outdated (LOW)
4. 🔍 declared + library category differs -> reason to review
"""
from __future__ import annotations
import logging
import re
from typing import Iterable
from sqlalchemy import text as sa_text
from sqlalchemy.orm import Session
logger = logging.getLogger(__name__)
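def _normalize_cookie_name(name: str) -> str:
"""Reduce wildcard cookie names such as 'AMCV_*' or 'pm_sess_NNN' to
their prefix so that placeholder variants (for example '_ga' and
'_ga_<ID>') count as one cookie. Literal suffixes such as
'_ga_GTM-XXX' are only lower-cased, not stripped."""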
if not name:
return ""
s = name.strip()
# AMCV_*, sc_v44, etc.
s = re.sub(r"[<\[].*?[>\]]", "", s) # remove <ID>, [...]
s = s.rstrip("*").rstrip("_")
s = re.sub(r"_NNN$|_\d+$", "", s)
return s.lower()
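A few worked inputs make the normalization rules concrete. The sketch below is a re-indented copy of the helper under the illustrative name `normalize_cookie_name`; the sample cookie names are chosen for demonstration only:

```python
import re


def normalize_cookie_name(name: str) -> str:
    """Re-indented copy of the helper above, renamed for the example."""
    if not name:
        return ""
    s = name.strip()
    s = re.sub(r"[<\[].*?[>\]]", "", s)  # drop <ID> or [...] placeholders
    s = s.rstrip("*").rstrip("_")        # drop trailing wildcard and underscore
    s = re.sub(r"_NNN$|_\d+$", "", s)    # drop an NNN or numeric suffix
    return s.lower()


examples = {
    raw: normalize_cookie_name(raw)
    for raw in ("AMCV_*", "pm_sess_NNN", "_ga_<ID>", "_ga_GTM-XXX", "sc_v44")
}
```

Note that a literal suffix such as `_ga_GTM-XXX` is only lower-cased, so it does not collapse to `_ga`; only placeholder forms like `_ga_<ID>` do.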
def _extract_declared_cookies(cookie_doc_text: str | None) -> set[str]:
"""Read cookie names from the cookie policy text.
Runs parse_cookie_table (block/tab format) and parse_flat_cookie_text
(anchor pattern) and merges the names found by both.
"""
if not cookie_doc_text:
return set()
declared: set[str] = set()
try:
from compliance.services.cookies_table_parser import (
parse_cookie_table, parse_flat_cookie_text,
)
for v in parse_cookie_table(cookie_doc_text):
for c in (v.get("cookies") or []):
if isinstance(c, dict) and c.get("name"):
declared.add(_normalize_cookie_name(c["name"]))
for v in parse_flat_cookie_text(cookie_doc_text):
for c in (v.get("cookies") or []):
if isinstance(c, dict) and c.get("name"):
declared.add(_normalize_cookie_name(c["name"]))
except Exception as e:
logger.warning("declared-cookie-extract failed: %s", e)
return {n for n in declared if n}
def _extract_browser_cookies(banner_result: dict | None) -> set[str]:
"""Read cookie names from the after_accept, before_consent, and
after_reject phases of banner_result."""
out: set[str] = set()
if not isinstance(banner_result, dict):
return out
phases = banner_result.get("phases") or {}
for ph_name in ("after_accept", "before_consent", "after_reject"):
ph = phases.get(ph_name) or {}
if not isinstance(ph, dict):
continue
for c in (ph.get("cookies") or []):
if isinstance(c, str):
out.add(_normalize_cookie_name(c))
elif isinstance(c, dict) and c.get("name"):
out.add(_normalize_cookie_name(c["name"]))
return {n for n in out if n}
def _lookup_library(db: Session, names: Iterable[str]) -> dict[str, dict]:
"""Return {lowercased cookie_name: {category, vendor}} from cookie_library."""
nl = [n for n in names if n]
if not nl:
return {}
try:
rows = db.execute(sa_text(
"SELECT cookie_name, actual_category, vendor_name "
"FROM compliance.cookie_library "
"WHERE LOWER(cookie_name) = ANY(:lc)"
), {"lc": nl}).fetchall()
return {r[0].lower(): {"category": r[1], "vendor": r[2]} for r in rows}
except Exception as e:
logger.warning("library lookup failed: %s", e)
return {}
def audit_cookie_compliance(
db: Session | None,
cookie_doc_text: str | None,
banner_result: dict | None,
) -> dict:
"""Main entry point: return a dict with the three name lists, the
library metadata, and the counts."""
declared = _extract_declared_cookies(cookie_doc_text)
browser = _extract_browser_cookies(banner_result)
all_names = declared | browser
library = _lookup_library(db, all_names) if db else {}
declared_only = declared - browser
browser_only = browser - declared
both = declared & browser
return {
"declared_count": len(declared),
"browser_count": len(browser),
"library_count": len(library),
"compliant": sorted(both),
"undeclared_in_browser": sorted(browser_only),
"declared_not_loaded": sorted(declared_only),
"library_metadata": library,
"high_findings": len(browser_only),
"low_findings": len(declared_only),
}
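The audit result is plain set arithmetic over the normalized names. A minimal sketch with made-up cookie names, assuming a hypothetical helper `partition_cookies` that mirrors the three lists returned above:

```python
def partition_cookies(declared: set[str], browser: set[str]) -> dict[str, list[str]]:
    """Split names into compliant, undeclared, and stale groups."""
    return {
        "compliant": sorted(declared & browser),
        "undeclared_in_browser": sorted(browser - declared),
        "declared_not_loaded": sorted(declared - browser),
    }


groups = partition_cookies(
    declared={"_ga", "amcv", "consent"},
    browser={"_ga", "consent", "_fbp"},
)
```

Here `_fbp` would count as a HIGH finding (loaded but not declared) and `amcv` as a LOW finding (declared but not seen in this run).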
def build_cookie_audit_block_html(audit: dict) -> str:
"""Render the three-way comparison block for the mail."""
if not audit:
return ""
n_dec = audit.get("declared_count", 0)
n_brw = audit.get("browser_count", 0)
n_undecl = len(audit.get("undeclared_in_browser") or [])
n_dec_only = len(audit.get("declared_not_loaded") or [])
n_both = len(audit.get("compliant") or [])
sev_color = "#dc2626" if n_undecl else "#16a34a"
undecl_html = ""
if audit.get("undeclared_in_browser"):
undecl_html = (
'<div style="margin-top:10px;padding:10px 12px;background:#fee2e2;'
'border:1px solid #fecaca;border-radius:6px">'
f'<strong style="color:#991b1b">❌ {n_undecl} Cookie'
f'{"s" if n_undecl != 1 else ""} im Browser geladen, '
'aber NICHT in der Cookie-Richtlinie deklariert:</strong>'
'<div style="font-family:monospace;font-size:10px;color:#7f1d1d;'
'margin-top:6px;max-height:200px;overflow:auto">'
+ ", ".join(audit["undeclared_in_browser"][:50])
+ (f' ... +{n_undecl - 50} weitere'
if n_undecl > 50 else '') +
'</div>'
'<div style="font-size:10px;color:#7f1d1d;margin-top:4px;'
'font-style:italic">Art. 13(1)(c) DSGVO + § 25 TDDDG — '
'die Empfaengerliste muss vollstaendig sein. Diese Cookies '
'sind potenziell ungenannte Verarbeitungen.</div>'
'</div>'
)
dec_only_html = ""
if audit.get("declared_not_loaded"):
dec_only_html = (
'<div style="margin-top:10px;padding:10px 12px;background:#fef3c7;'
'border:1px solid #fde68a;border-radius:6px">'
f'<strong style="color:#92400e">⚠️ {n_dec_only} Cookie'
f'{"s" if n_dec_only != 1 else ""} in der Richtlinie '
'deklariert, aber bei diesem Audit NICHT im Browser gesehen:</strong>'
'<div style="font-family:monospace;font-size:10px;color:#78350f;'
'margin-top:6px;max-height:200px;overflow:auto">'
+ ", ".join(audit["declared_not_loaded"][:50])
+ (f' ... +{n_dec_only - 50} weitere'
if n_dec_only > 50 else '') +
'</div>'
'<div style="font-size:10px;color:#78350f;margin-top:4px;'
'font-style:italic">Kein direkter Verstoss — die Cookies '
'koennen nur in bestimmten User-Journeys / Geo-Regionen / '
'eingeloggten Zustaenden geladen werden. Empfehlung: '
'pruefen ob die Cookie-Richtlinie veraltet ist.</div>'
'</div>'
)
compliant_html = ""
if audit.get("compliant"):
compliant_html = (
'<div style="margin-top:10px;padding:10px 12px;background:#dcfce7;'
'border:1px solid #bbf7d0;border-radius:6px">'
f'<strong style="color:#166534">✓ {n_both} Cookie'
f'{"s" if n_both != 1 else ""} sowohl deklariert als auch geladen '
'(compliant):</strong>'
'<div style="font-family:monospace;font-size:10px;color:#14532d;'
'margin-top:6px;max-height:150px;overflow:auto">'
+ ", ".join(audit["compliant"][:50])
+ (f' ... +{n_both - 50} weitere'
if n_both > 50 else '') +
'</div>'
'</div>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fff;border:1px solid #cbd5e1;border-radius:8px">'
f'<div style="font-size:11px;color:{sev_color};text-transform:uppercase;'
f'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Cookie-Compliance-Audit — 3-Quellen-Vergleich</div>'
'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
f'{n_dec} in Richtlinie · {n_brw} im Browser · '
f'{n_both} compliant · {n_undecl} undokumentiert · '
f'{n_dec_only} nicht geladen</h3>'
'<p style="margin:0 0 8px;font-size:11px;color:#475569;line-height:1.5">'
'Wir vergleichen die in der Cookie-Richtlinie genannten Cookies '
'mit dem, was der Browser nach Akzeptieren tatsaechlich laedt. '
'Undokumentierte Cookies im Browser sind ein direkter Verstoss '
'gegen die DSGVO-Informationspflicht.'
'</p>'
+ undecl_html + dec_only_html + compliant_html +
'</div>'
)
@@ -0,0 +1,157 @@
"""
P102 cookie library mismatch detection per site.
Compares the cookies captured in a run (with the category declared in
the cookie policy text) against the library
(compliance.cookie_library). Returns mismatches: declared ≠ library.
Used in the mail renderer as the new block "Cookie-Klassifikations-Pruefung".
"""
from __future__ import annotations
import logging
import re
from sqlalchemy import text
from sqlalchemy.orm import Session
logger = logging.getLogger(__name__)
_CATEGORY_PATTERNS = [
(re.compile(r"\b(?:strictly[-\s]?)?(?:notwendig|essential|funktional|"
r"funktionscookie|technisch[- ]?notwendig)\b", re.I),
"essential"),
(re.compile(r"\b(?:tracking|analytics|analyse|statistik|"
r"measurement|performance)\b", re.I),
"statistics"),
(re.compile(r"\b(?:marketing|werbung|advertising|targeting|"
r"drittanbieter[- ]?cookie)\b", re.I),
"marketing"),
(re.compile(r"\b(?:social[-\s]?media|share|like)\b", re.I),
"social_media"),
]
def _category_for(name: str, doc_text: str) -> str | None:
if not doc_text or not name:
return None
idx = doc_text.find(name)
if idx < 0:
return None
window = doc_text[max(0, idx - 50):idx + 400]
for pat, cat in _CATEGORY_PATTERNS:
if pat.search(window):
return cat
return None
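The lookup above takes the first pattern that matches anywhere in the text window, in list order, rather than the keyword closest to the cookie name. A reduced sketch of that behaviour, assuming a shortened pattern list and an invented policy snippet (`PATTERNS`, `category_near` and `doc` are illustrative names):

```python
from __future__ import annotations

import re

# Shortened copy of the pattern list, kept in the same priority order.
PATTERNS = [
    (re.compile(r"\b(?:notwendig|essential)\b", re.I), "essential"),
    (re.compile(r"\b(?:analytics|statistik)\b", re.I), "statistics"),
    (re.compile(r"\b(?:marketing|werbung)\b", re.I), "marketing"),
]


def category_near(name: str, doc_text: str) -> str | None:
    idx = doc_text.find(name)  # case-sensitive, first occurrence only
    if idx < 0:
        return None
    window = doc_text[max(0, idx - 50): idx + 400]
    for pattern, category in PATTERNS:
        if pattern.search(window):
            return category
    return None


# Two table rows that fit into a single window.
doc = "IDE: Marketing, 13 Monate. PHPSESSID: notwendig, Session."
ide_category = category_near("IDE", doc)
missing_category = category_near("_unknown", doc)
```

In this snippet `ide_category` comes out as `"essential"`, because the neighbouring row falls into the same window and the essential pattern is tried first. Dense tables may therefore need a narrower window or a nearest-keyword rule.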
def _load_library(db: Session) -> dict[str, dict]:
rows = db.execute(text(
"SELECT cookie_name, actual_category, vendor_name "
"FROM compliance.cookie_library"
)).fetchall()
return {r[0].lower(): {"category": r[1], "vendor": r[2]} for r in rows}
def detect_mismatches(
db: Session,
cookie_names_seen: list[str],
doc_text: str,
) -> list[dict]:
"""Returns list of finding dicts."""
if not cookie_names_seen or not doc_text:
return []
lib = _load_library(db)
findings: list[dict] = []
seen: set[str] = set()
for cname in cookie_names_seen:
cname = (cname or "").strip()
if not cname or cname.lower() in seen:
continue
seen.add(cname.lower())
declared = _category_for(cname, doc_text)
if not declared:
continue
lib_entry = lib.get(cname.lower())
if not lib_entry:
continue
lib_cat = lib_entry["category"]
if lib_cat in (None, "unknown") or lib_cat == declared:
continue
# HIGH if the library says marketing but the site declares the cookie as
# essential/statistics (actual third-country/advertising processing
# presented as a technical or statistical necessity). MEDIUM otherwise.
severity = "HIGH" if (
lib_cat == "marketing" and declared in ("essential", "statistics")
) else "MEDIUM"
findings.append({
"cookie": cname,
"declared_category": declared,
"library_category": lib_cat,
"library_vendor": lib_entry["vendor"],
"severity": severity,
})
return findings
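The severity rule used in the loop above can be isolated into a small function. This is only a sketch; `severity_for` is a hypothetical helper name that does not exist in the module:

```python
def severity_for(declared: str, library_category: str) -> str:
    """Hypothetical helper mirroring the HIGH/MEDIUM rule above."""
    downplayed = (
        library_category == "marketing"
        and declared in ("essential", "statistics")
    )
    return "HIGH" if downplayed else "MEDIUM"


high = severity_for("essential", "marketing")
medium = severity_for("marketing", "statistics")
```

Only the direction "library says marketing, site says something milder" is escalated. The opposite direction stays at MEDIUM.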
def build_mismatch_block_html(findings: list[dict]) -> str:
"""Render the mismatch findings as a Mail-Block."""
if not findings:
return ""
n_high = sum(1 for f in findings if f["severity"] == "HIGH")
items: list[str] = []
for f in findings[:25]:
sev_color = "#dc2626" if f["severity"] == "HIGH" else "#d97706"
items.append(
f'<li style="margin-bottom:6px;font-size:11px">'
f'<code style="background:#f1f5f9;padding:1px 4px;border-radius:2px">'
f'{f["cookie"]}</code> '
f'<span style="color:#64748b">— deklariert als</span> '
f'<strong>{f["declared_category"]}</strong>, '
f'<span style="color:#64748b">unsere Bibliothek + verbreitete '
f'Vendor-Doku sagen</span> <strong style="color:{sev_color}">'
f'{f["library_category"]}</strong> '
f'(Vendor: {f["library_vendor"]})'
f'</li>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fffbeb;border:1px solid #fde68a;border-radius:8px">'
'<div style="font-size:11px;color:#92400e;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Cookie-Klassifikations-Pruefung</div>'
f'<h3 style="margin:0 0 8px;font-size:14px;color:#1e293b">'
f'{len(findings)} Cookie{"s" if len(findings) != 1 else ""}'
f' mit abweichender Klassifikation gefunden'
f'{f" ({n_high} davon mit erhoehter Bedeutung)" if n_high else ""}'
f'</h3>'
'<p style="margin:0 0 10px;font-size:11px;color:#475569;line-height:1.5">'
'Wir haben die in Ihrer Cookie-Richtlinie deklarierte Kategorie der '
'Cookies mit unserer globalen Bibliothek (~2.300 Cookies aus Open-'
'Cookie-Database + DACH-spezifischen Quellen) und der verbreiteten '
'Vendor-Doku abgeglichen. Bei den folgenden Cookies stimmt die '
'deklarierte Kategorie nicht mit dem typischerweise erwarteten '
'Zweck ueberein. Das ist kein automatischer Verstoss — aber ein '
'Pruefanlass: bei Marketing-Cookies braucht es Einwilligung, bei '
'als "essential" deklarierten nicht. Empfehlung: mit DSB / '
'Marketing-Agentur klaeren ob die Klassifikation korrigiert '
'oder die Einwilligung anders eingeholt werden muss.</p>'
'<ul style="margin:0 0 0 18px;padding:0">'
+ "".join(items) +
'</ul>'
'<p style="margin:8px 0 0;font-size:10px;color:#94a3b8;'
'font-style:italic">Hintergrund: Art. 13(1)(c) DSGVO + EDPB 5/2020 '
'— der angegebene Verarbeitungszweck muss dem tatsaechlichen '
'entsprechen.</p>'
'</div>'
)
@@ -0,0 +1,216 @@
"""
P104 cookie network tracing (stage 4).
cookies_detailed[i].domain shows which domain set the cookie via
Set-Cookie. We compare:
* site main domain vs cookie domain → first-party / third-party
* cookie domain vs known vendors → who the actual recipient is
* vendor country vs EU/third country → third-country transfer notice
Defeat-device pattern: a "Funktional" cookie that is set by
doubleclick.net is in practice a third-party tracking cookie, not a
functional first-party cookie.
"""
from __future__ import annotations
import logging
from urllib.parse import urlparse
logger = logging.getLogger(__name__)
# Vendor domain → known vendor + country
_DOMAIN_VENDORS: dict[str, tuple[str, str]] = {
".doubleclick.net": ("Google DoubleClick", "US"),
".google.com": ("Google", "US"),
".google-analytics.com": ("Google Analytics", "US"),
".googletagmanager.com": ("Google Tag Manager", "US"),
".googleadservices.com": ("Google Ads", "US"),
".gstatic.com": ("Google CDN", "US"),
".facebook.com": ("Meta / Facebook", "US"),
".facebook.net": ("Meta / Facebook", "US"),
".instagram.com": ("Meta / Instagram", "US"),
".linkedin.com": ("LinkedIn (Microsoft)", "US"),
".pinterest.com": ("Pinterest", "US"),
".pinimg.com": ("Pinterest", "US"),
".tiktok.com": ("TikTok (ByteDance)", "CN"),
".bing.com": ("Microsoft Bing", "US"),
".clarity.ms": ("Microsoft Clarity", "US"),
".criteo.com": ("Criteo", "FR"),
".adnxs.com": ("AppNexus / Xandr", "US"),
".rubiconproject.com": ("Rubicon Project", "US"),
".pubmatic.com": ("PubMatic", "US"),
".adobedtm.com": ("Adobe DTM", "US"),
".adobetarget.com": ("Adobe Target", "US"),
".demdex.net": ("Adobe Experience Cloud", "US"),
".omtrdc.net": ("Adobe Analytics", "US"),
".everesttech.net": ("Adobe Advertising Cloud", "US"),
".2o7.net": ("Adobe Analytics", "US"),
".adform.net": ("AdForm", "DK"),
".trade-desk.com": ("The Trade Desk", "US"),
".tradedesk.com": ("The Trade Desk", "US"),
".adsrvr.org": ("The Trade Desk", "US"),
".hotjar.com": ("Hotjar", "MT"),
".matomo.cloud": ("Matomo", "DE"),
".etracker.com": ("etracker", "DE"),
".etracker.de": ("etracker", "DE"),
".cloudflare.com": ("Cloudflare", "US"),
".cookielaw.org": ("OneTrust", "US"),
".cookiebot.com": ("Cookiebot (Cybot)", "DK"),
".usercentrics.eu": ("Usercentrics", "DE"),
".usercentrics.com": ("Usercentrics", "DE"),
".consensu.org": ("IAB Europe TCF", "BE"),
".datadoghq.eu": ("Datadog", "US"),
".datadoghq.com": ("Datadog", "US"),
".datadome.co": ("DataDome", "FR"),
".incapsula.com": ("Imperva Incapsula", "US"),
".imperva.com": ("Imperva", "US"),
".akamai.net": ("Akamai", "US"),
".akamaiedge.net": ("Akamai", "US"),
".salesforce.com": ("Salesforce", "US"),
".force.com": ("Salesforce", "US"),
}
_NON_EU_COUNTRIES = {"US", "CN", "RU", "IN", "JP", "BR", "AU"}
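def _registrable_domain(host: str) -> str:
"""Return the last two labels of a host: www.vw.de, bla.vw.de and vw.de all yield vw.de. Naive: multi-label public suffixes such as co.uk are not handled."""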
h = (host or "").lstrip(".").lower()
parts = h.split(".")
if len(parts) >= 2:
return ".".join(parts[-2:])
return h
def _lookup_vendor_by_domain(cookie_domain: str) -> tuple[str, str] | None:
if not cookie_domain:
return None
cd = cookie_domain.lower()
if not cd.startswith("."):
cd = "." + cd
for suffix, (vendor, country) in _DOMAIN_VENDORS.items():
if cd.endswith(suffix):
return (vendor, country)
return None
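The two domain helpers above can be tried in isolation. The sketch below copies two entries from the vendor mapping and re-implements both helpers under illustrative names (`VENDORS`, `registrable_domain`, `lookup_vendor`). The hosts are invented.

```python
from __future__ import annotations

# Two entries copied from the mapping above; the rest is omitted here.
VENDORS: dict[str, tuple[str, str]] = {
    ".doubleclick.net": ("Google DoubleClick", "US"),
    ".matomo.cloud": ("Matomo", "DE"),
}


def registrable_domain(host: str) -> str:
    """Keep the last two labels of a host name (naive, no suffix list)."""
    parts = (host or "").lstrip(".").lower().split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else parts[0]


def lookup_vendor(cookie_domain: str) -> tuple[str, str] | None:
    """Suffix match against VENDORS; the leading dot anchors on a label."""
    if not cookie_domain:
        return None
    cd = cookie_domain.lower()
    if not cd.startswith("."):
        cd = "." + cd
    for suffix, vendor in VENDORS.items():
        if cd.endswith(suffix):
            return vendor
    return None


subdomain_hit = lookup_vendor("ad.doubleclick.net")
lookalike = lookup_vendor("notdoubleclick.net")  # no label boundary, no match
uk_site = registrable_domain("shop.example.co.uk")
```

The leading dot keeps look-alike hosts from matching. The two-label rule, however, reduces `shop.example.co.uk` to `co.uk`, so sites under multi-label suffixes would probably need a public-suffix-aware lookup before the first-party comparison is reliable.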
def trace_cookie_network(
cookies_detailed: list[dict] | None,
site_url: str | None = None,
) -> list[dict]:
"""Return findings for cookies that are set by an external or third-country
domain while being declared as first-party / essential."""
if not cookies_detailed:
return []
site_host = ""
if site_url:
try:
site_host = _registrable_domain(urlparse(site_url).netloc)
except Exception:
site_host = ""
out: list[dict] = []
for ck in cookies_detailed:
if not isinstance(ck, dict):
continue
name = (ck.get("name") or "").strip()
domain = (ck.get("domain") or "").strip()
declared = (ck.get("declared_category") or "").lower().strip()
if not name or not domain:
continue
cookie_reg = _registrable_domain(domain)
is_third_party = bool(site_host and cookie_reg != site_host)
vendor_match = _lookup_vendor_by_domain(domain)
if not vendor_match and not is_third_party:
continue
# Defeat-Device-Pattern: essential/functional + Third-Party
if declared in ("essential", "functional", "necessary") and is_third_party:
sev = "HIGH" if vendor_match else "MEDIUM"
vendor_name = vendor_match[0] if vendor_match else cookie_reg
country = vendor_match[1] if vendor_match else ""
third_country = country in _NON_EU_COUNTRIES
out.append({
"cookie": name,
"declared": declared,
"cookie_domain": domain,
"site_domain": site_host,
"vendor": vendor_name,
"vendor_country": country,
"third_country": third_country,
"severity": sev,
"label": (
f"Cookie '{name}' deklariert als '{declared}', "
f"wird aber von externer Domain "
f"<strong>{vendor_name}</strong> "
f"({domain}) gesetzt"
+ (f" — Drittland: {country}" if third_country else "")
),
})
elif vendor_match and declared in ("essential", "functional", "necessary"):
# A first-party cookie whose domain belongs to a known tracker vendor
# is still a mismatch (e.g. Google Tag Manager can appear as
# first-party via CNAME).
out.append({
"cookie": name,
"declared": declared,
"cookie_domain": domain,
"vendor": vendor_match[0],
"vendor_country": vendor_match[1],
"third_country": vendor_match[1] in _NON_EU_COUNTRIES,
"severity": "MEDIUM",
"label": (
f"Cookie '{name}' deklariert als '{declared}', "
f"Domain {domain} gehoert aber zu "
f"<strong>{vendor_match[0]}</strong> "
f"({vendor_match[1]})"
),
})
return out
def build_network_trace_block_html(findings: list[dict]) -> str:
if not findings:
return ""
n_third = sum(1 for f in findings if f.get("third_country"))
items: list[str] = []
for f in findings[:30]:
sev_color = "#dc2626" if f["severity"] == "HIGH" else "#d97706"
country_flag = ""
if f.get("third_country"):
country_flag = (
f' <span style="background:#fee2e2;color:#991b1b;'
f'padding:1px 5px;border-radius:8px;font-size:9px;'
f'font-weight:600">DRITTLAND {f.get("vendor_country","")}</span>'
)
items.append(
f'<li style="margin-bottom:6px;font-size:11px;line-height:1.5;'
f'color:{sev_color}">{f["label"]}{country_flag}</li>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fff7ed;border:1px solid #fed7aa;border-radius:8px">'
'<div style="font-size:11px;color:#9a3412;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Cookie-Netzwerk-Verhalten (Defeat-Device-Heuristik)</div>'
f'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
f'{len(findings)} Cookie{"s" if len(findings) != 1 else ""} '
f'mit Vendor-Domain-Diskrepanz'
f'{f" — davon {n_third} mit Drittland-Transfer" if n_third else ""}'
f'</h3>'
'<p style="margin:0 0 10px;font-size:11px;color:#475569;line-height:1.5">'
'Diese Cookies sind als "essential" oder "funktional" deklariert, '
'werden aber von einer externen Domain gesetzt — typisch fuer '
'getarnte Tracker. Drittland-Markierungen sind besonders kritisch: '
'sie loesen Pflichten nach Art. 44-49 DSGVO aus (SCC / Angemessen-'
'heitsbeschluss / Schrems II Folge-Massnahmen).'
'</p>'
'<ul style="margin:0 0 0 18px;padding:0">'
+ "".join(items) +
'</ul></div>'
)
@@ -0,0 +1,147 @@
"""
Cookie-to-vendor fallback (P52 Lite).
If neither cmp_payloads nor vendor_llm_extract yielded vendors, we match
the cookies seen in after_accept against compliance.cookie_library and
build vendor records from the library entries
(cookie_name → vendor_name, actual_category).
Typical scenario: VW uses a custom CMP (cookiemgmt wrapper), not a known
IAB tool. cmp_payloads is empty, but after_accept.cookies has 28 entries.
Those 28 cookies map to roughly 15-20 vendors in the library.
"""
from __future__ import annotations
import logging
import re
from typing import Iterable
from sqlalchemy import text
from sqlalchemy.orm import Session
logger = logging.getLogger(__name__)
def _collect_cookie_names(banner_result: dict | None) -> set[str]:
names: set[str] = set()
if not isinstance(banner_result, dict):
return names
for ph in (banner_result.get("phases") or {}).values():
if not isinstance(ph, dict):
continue
for ck in (ph.get("cookies") or []):
if isinstance(ck, str):
names.add(ck.strip())
elif isinstance(ck, dict):
n = (ck.get("name") or "").strip()
if n:
names.add(n)
return {n for n in names if n and len(n) <= 120}
def lookup_vendors_from_library(
db: Session,
cookie_names: Iterable[str],
) -> list[dict]:
"""Resolves cookie names to vendor records via cookie_library."""
names = [n for n in cookie_names if n]
if not names:
return []
rows = db.execute(text(
"""
SELECT cookie_name, actual_category, vendor_name
FROM compliance.cookie_library
WHERE LOWER(cookie_name) = ANY(:lc)
"""
), {"lc": [n.lower() for n in names]}).fetchall()
by_vendor: dict[str, dict] = {}
for cname, cat, vendor in rows:
if not vendor:
continue
entry = by_vendor.setdefault(vendor, {
"name": vendor,
"country": "",
"purpose": "",
"category": cat or "",
"opt_out_url": "",
"privacy_policy_url": "",
"persistence": "",
"cookies": [],
"source": "library_fallback",
})
entry["cookies"].append({
"name": cname, "purpose": "", "expiry": "",
"is_third_party": True,
})
return list(by_vendor.values())
def fallback_vendors_for_run(
db: Session,
banner_result: dict | None,
existing_vendor_count: int,
cookie_doc_text: str | None = None,
) -> list[dict]:
"""Returns extra vendor records to merge with the run's cmp_vendors.
Lesson from VW: cmp_vendors=6 (all coarse LLM results) is NOT enough,
because the real cookie table has 30+ entries. The lookup therefore
also runs for mid-tier counts; it is skipped only when fewer than 5
cookie names are known and at least 5 vendors already exist.
"""
names = _collect_cookie_names(banner_result)
# Extend names with cookie names that appear as table rows in the
# cookie policy text (pattern: NAME followed by "Tracking Cookies" /
# "Session Cookies" / "Funktional" / ...).
if cookie_doc_text:
names |= _extract_cookie_names_from_doc(cookie_doc_text)
# Revised skip conditions:
# - very few cookies AND >= 5 vendors already present → skip
# - otherwise ALWAYS try
if len(names) < 5 and existing_vendor_count >= 5:
return []
if not names:
return []
vendors = lookup_vendors_from_library(db, names)
if vendors:
logger.info(
"Cookie-Library-Fallback: %d Vendors aus %d Cookies "
"(existing cmp_vendors=%d)",
len(vendors), len(names), existing_vendor_count,
)
return vendors
_TABLE_ROW_RE = re.compile(
r"\b([A-Za-z_][A-Za-z0-9_\-\.]{2,40})\s+"
r"(?:Tracking Cookies|Session Cookies|Funktional|Marketing|"
r"Analytics|Performance|Notwendig|Strictly\s+Necessary|"
r"Statistik|Werbung|Targeting|Personalisierung)",
re.I,
)
def _extract_cookie_names_from_doc(text: str) -> set[str]:
"""Pattern-based detection of cookie table rows.
The VW cookie table has the form:
'IDE Tracking Cookies (Marketing) Dieser Cookie ... 13 Monate'
We capture this with a cookie-name-before-category pattern.
"""
out: set[str] = set()
for m in _TABLE_ROW_RE.finditer(text):
name = m.group(1).strip()
# Filter obvious noise (articles, pronouns, category and vendor words)
nl = name.lower()
if nl in ("dieser", "diese", "ein", "der", "die", "das",
"session", "permanent", "funktional", "notwendig",
"marketing", "analytics", "werbung", "anbieter",
"google", "facebook", "tracking", "cookie", "cookies"):
continue
if len(name) >= 3:
out.add(name)
return out
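The name-before-category pattern above also fires on ordinary prose such as "der Werbung", which is why the stop list is needed. A reduced sketch, assuming a shortened category list and stop list and an invented three-line policy text (`ROW_RE`, `STOP_WORDS` and `extract_names` are illustrative names):

```python
import re

# Shortened version of the table-row pattern: NAME, whitespace, category.
ROW_RE = re.compile(
    r"\b([A-Za-z_][A-Za-z0-9_\-\.]{2,40})\s+"
    r"(?:Tracking Cookies|Marketing|Analytics|Werbung)",
    re.I,
)
# Shortened stop list of words that are never cookie names.
STOP_WORDS = {"dieser", "diese", "der", "die", "das", "marketing", "cookie"}


def extract_names(text: str) -> set[str]:
    names = set()
    for match in ROW_RE.finditer(text):
        name = match.group(1).strip()
        if name.lower() not in STOP_WORDS:
            names.add(name)
    return names


sample = (
    "IDE Tracking Cookies (Marketing) Dieser Cookie dient der Werbung 13 Monate\n"
    "_fbp Marketing Facebook Pixel 3 Monate\n"
    "Diese Analytics Cookies"
)
found = extract_names(sample)
```

Here the pattern matches four times (`IDE`, `der`, `_fbp`, `Diese`), and the stop list removes the two prose words. Words missing from the stop list would pass through as cookie names.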
@@ -0,0 +1,148 @@
"""
P103 cookie value entropy check (stage 3).
Assesses whether the cookie value fits the declared category:
* "Funktional" + 2-char value ('1', 'de') → consistent (flag)
* "Funktional" + 64-char Base64 → INCONSISTENT (tracking-ID pattern)
* "Marketing" + 32+ char hash → consistent
* "Marketing" + 2-char value → consistent (boolean opt-out)
Defeat-device pattern: the site declares "Funktional" to bypass consent,
but the value looks like a pseudonymised tracking ID.
"""
from __future__ import annotations
import logging
import math
import re
logger = logging.getLogger(__name__)
def _shannon_entropy(s: str) -> float:
if not s:
return 0.0
from collections import Counter
n = len(s)
counts = Counter(s)
return -sum((c / n) * math.log2(c / n) for c in counts.values())
_BASE64_RE = re.compile(r"^[A-Za-z0-9+/=_-]{20,}$")
_HEX_RE = re.compile(r"^[a-fA-F0-9]{16,}$")
_UUID_RE = re.compile(
r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
_FLAG_VALUES = {"0", "1", "true", "false", "yes", "no",
"de", "en", "de-de", "en-us", "fr-fr",
"accept", "deny", "essential", "on", "off"}
def _classify_value_shape(value: str) -> str:
"""Returns one of: 'flag', 'short_id', 'long_token', 'uuid', 'hash',
'json_blob', 'unknown'."""
if not value:
return "flag"
v = value.strip()
if v.lower() in _FLAG_VALUES:
return "flag"
if len(v) <= 4:
return "flag"
if _UUID_RE.match(v):
return "uuid"
if _HEX_RE.match(v) and len(v) >= 32:
return "hash"
if _BASE64_RE.match(v) and len(v) >= 40:
return "long_token"
if v.startswith("{") or v.startswith("["):
return "json_blob"
if len(v) >= 16 and _shannon_entropy(v) > 3.5:
return "long_token"
if len(v) >= 6:
return "short_id"
return "flag"
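The entropy fallback above can be checked with two extreme inputs. The sketch re-implements the entropy helper under an illustrative name; the sample values are invented:

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


low = shannon_entropy("aaaaaaaabbbbbbbb")   # two symbols, equally frequent
high = shannon_entropy("0123456789abcdef")  # sixteen distinct symbols
```

Entropy is bounded by log2 of the number of distinct characters, so a value can only exceed the 3.5 threshold if it contains at least 12 different characters (2^3.5 is about 11.3). The first sample scores 1 bit per character, the second 4, so of these two 16-character values only the second would be treated as a long token.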
def check_cookies_for_entropy_mismatch(
cookies_detailed: list[dict] | None,
) -> list[dict]:
"""Return findings for cookies whose value shape does not fit the
declared category."""
out: list[dict] = []
if not cookies_detailed:
return out
for ck in cookies_detailed:
if not isinstance(ck, dict):
continue
name = (ck.get("name") or "").strip()
value = (ck.get("value") or "").strip()
declared = (ck.get("declared_category") or "").lower().strip()
if not name or not declared:
continue
shape = _classify_value_shape(value)
# Rule: 'essential' / 'functional' cookies whose value looks like a
# tracking ID (uuid, hash, long token) are suspicious.
is_low_cat = declared in ("essential", "functional", "necessary")
is_id_shape = shape in ("uuid", "hash", "long_token")
if is_low_cat and is_id_shape:
out.append({
"cookie": name,
"declared": declared,
"value_shape": shape,
"value_len": len(value),
"severity": "MEDIUM",
"label": (
f"Cookie '{name}' deklariert als '{declared}', "
f"aber Wert ist ein {shape} ({len(value)} Zeichen) — "
"typisches Tracking-ID-Pattern"
),
"detail": (
"Funktionale/notwendige Cookies speichern normalerweise "
"kurze Flags (1, true, de-DE). Ein langer Hash/UUID-Wert "
"in einem als 'essential' deklarierten Cookie ist ein "
"Indikator fuer verstecktes Tracking — vergleichbar mit "
"einem 'Defeat Device', das auf dem Pruefstand harmlos "
"aussieht aber im Realbetrieb anderes tut."
),
})
return out
def build_entropy_block_html(findings: list[dict]) -> str:
if not findings:
return ""
items: list[str] = []
for f in findings[:25]:
items.append(
f'<li style="margin-bottom:6px;font-size:11px;line-height:1.5">'
f'<strong style="color:#d97706">{f["cookie"]}</strong> '
f'<span style="color:#64748b">(deklariert: '
f'<strong>{f["declared"]}</strong>) — Wert-Shape:</span> '
f'<code style="background:#fef3c7;padding:1px 4px;border-radius:2px">'
f'{f["value_shape"]}</code> '
f'<span style="color:#64748b">({f["value_len"]} Zeichen)</span>'
f'</li>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fffbeb;border:1px solid #fde68a;border-radius:8px">'
'<div style="font-size:11px;color:#92400e;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Cookie-Werte-Plausibilitaet (Defeat-Device-Heuristik)</div>'
f'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
f'{len(findings)} Cookie{"s" if len(findings) != 1 else ""} '
'mit verdaechtigem Wert-Pattern</h3>'
'<p style="margin:0 0 10px;font-size:11px;color:#475569;line-height:1.5">'
'Diese Cookies sind als "essential" oder "funktional" deklariert, '
'ihr tatsaechlicher Wert sieht aber wie eine Tracking-ID aus '
'(UUID, Hash, langer Base64-Token). Empfehlung: pruefen ob diese '
'Cookies wirklich nur technisch notwendig sind oder de facto '
'pseudonymisierte User-Tracker.</p>'
'<ul style="margin:0 0 0 18px;padding:0">'
+ "".join(items) +
'</ul></div>'
)
@@ -0,0 +1,439 @@
"""
Parses cookie tables that the user pastes directly into the frontend.
Typical sources:
* Browser copy from a VW/BMW/Mercedes cookie policy (tab-separated)
* Excel export from the Borlabs / OneTrust / Cookiebot admin (CSV / pipe)
* Markdown table from internal documentation
Recognises 4 column layouts (heuristically):
1. [name, category, description, retention period, provider]
2. [name, provider, purpose, retention period]
3. [name, description, retention period]
4. only [name, retention period]
Output: the same vendor-record structure as vendor_extractor / LLM, so
that the rest of the pipeline (VVT table, library mismatch check) keeps
running without changes.
"""
from __future__ import annotations
import logging
import re
logger = logging.getLogger(__name__)
_CATEGORY_LABELS = (
"notwendig", "essential", "funktional", "tracking", "marketing",
"statistik", "analyse", "analytics", "performance", "werbung",
"advertising", "targeting", "preferences", "social_media",
"strictly necessary", "personalisierung",
)
def _looks_like_separator(line: str) -> str | None:
"""Detect the column-separator of a tabular line."""
if "\t" in line and line.count("\t") >= 2:
return "\t"
if " | " in line and line.count(" | ") >= 2:
return " | "
if ";" in line and line.count(";") >= 2 and "," not in line[:20]:
return ";"
if "," in line and line.count(",") >= 3:
return ","
return None
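The per-line separator guess above feeds a vote over the sampled lines further down in the parser, which needs at least three hits for the winning separator. A compact sketch of that combination, with `dominant_separator` as a hypothetical helper name and invented table rows:

```python
from __future__ import annotations

from collections import Counter


def looks_like_separator(line: str) -> str | None:
    """Guess the column separator of one line, in the same priority order."""
    if line.count("\t") >= 2:
        return "\t"
    if line.count(" | ") >= 2:
        return " | "
    if line.count(";") >= 2 and "," not in line[:20]:
        return ";"
    if line.count(",") >= 3:
        return ","
    return None


def dominant_separator(lines: list[str], min_hits: int = 3) -> str | None:
    """Return the most frequent separator if it wins on enough lines."""
    votes: Counter[str] = Counter()
    for ln in lines:
        sep = looks_like_separator(ln)
        if sep:
            votes[sep] += 1
    if not votes:
        return None
    sep, hits = votes.most_common(1)[0]
    return sep if hits >= min_hits else None


table = [
    "Name\tKategorie\tDauer",
    "_ga\tStatistik\t2 Jahre",
    "IDE\tMarketing\t13 Monate",
]
detected = dominant_separator(table)
```

Requiring several agreeing lines keeps a single prose sentence with a few commas from being mistaken for a CSV table.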
def _normalize_category(s: str) -> str:
sl = s.lower().strip()
for cat in _CATEGORY_LABELS:
if cat in sl:
if cat in ("notwendig", "essential", "strictly necessary"):
return "essential"
if cat in ("tracking", "marketing", "werbung",
"advertising", "targeting"):
return "marketing"
if cat in ("statistik", "analyse", "analytics", "performance"):
return "statistics"
if cat == "funktional":
return "functional"
if cat == "social_media":
return "social_media"
return sl[:30]
def _parse_persistence(s: str) -> str:
"""Extracts 'Speicherdauer' notation."""
m = re.search(
r"(\d+\s*(sekunde|minute|stunde|tag|woche|monat|jahr|day|month|year)[^\s,;|]{0,5})",
s, re.I,
)
if m:
return m.group(1).strip()[:80]
if re.search(r"\bsession\b", s, re.I):
return "Session"
if re.search(r"permanent", s, re.I):
return "Permanent"
return ""
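A few sample cells show what the retention parser above extracts. The sketch copies its regular expression under illustrative names. The inputs are invented.

```python
import re

_DURATION_RE = re.compile(
    r"(\d+\s*(sekunde|minute|stunde|tag|woche|monat|jahr|day|month|year)"
    r"[^\s,;|]{0,5})",
    re.I,
)


def parse_persistence(s: str) -> str:
    """Extract a duration such as '13 Monate', else Session/Permanent."""
    m = _DURATION_RE.search(s)
    if m:
        return m.group(1).strip()[:80]
    if re.search(r"\bsession\b", s, re.I):
        return "Session"
    if re.search(r"permanent", s, re.I):
        return "Permanent"
    return ""


german = parse_persistence("Speicherdauer: 13 Monate")
english = parse_persistence("2 years, then deleted")
weeks = parse_persistence("2 weeks")
```

The unit list has no English "week" or "hour", so `"2 weeks"` yields an empty string. Exports from English-language tools may need those units added.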
_CATEGORY_INDICATORS = (
"funktionscookie", "tracking cookie", "trackingcookie",
"marketing", "analytics", "necessary", "notwendig",
"performance", "session cookie", "persistent cookie",
"permanent cookie", "permanent/protokoll", "sitzungs-cookie",
)
def parse_block_format(text: str) -> list[dict]:
"""Block format (browser copy from VW/BMW/Mercedes without tab separators):
5 lines per cookie: name / category / purpose / retention period / type.
Heuristic: walk over all lines. If a line is NOT a category/duration/type
line and the next line contains a category, it is a cookie name.
Collect the next 4 lines as
category/purpose/duration/type.
"""
if not text or len(text) < 100:
return []
raw_lines = [ln.strip() for ln in text.splitlines()]
# Aggressive newline collapse: drop empty lines, but lines that are
# part of a multi-line purpose may stay separate.
lines = [ln for ln in raw_lines if ln]
if len(lines) < 10:
return []
# Drop the header row(s) if present
start = 0
if lines[0].lower() in ("name des cookies", "cookie name", "name"):
start = 5 if len(lines) > 5 else 1
by_vendor: dict[str, dict] = {}
seen_names: set[str] = set()
i = start
while i < len(lines) - 2:
name_line = lines[i]
cat_line = lines[i + 1] if i + 1 < len(lines) else ""
# Verify cat_line is a category indicator (otherwise the
# block is malformed — skip 1 line and try again).
if not any(c in cat_line.lower() for c in _CATEGORY_INDICATORS):
i += 1
continue
# Cookie-Name validation
nl = name_line.lower().strip()
if (not name_line or len(name_line) > 80
or len(name_line) < 2
or any(c in nl for c in _CATEGORY_INDICATORS)
or nl in seen_names
or nl in ("name des cookies", "kategorie",
"verwendungszweck", "speicherdauer",
"art des cookies")):
i += 1
continue
# Look ahead for the cookie-type line (up to 10 lines forward)
purpose_parts: list[str] = []
persistence = ""
art = ""
j = i + 2
while j < min(i + 12, len(lines)):
ln = lines[j]
ll = ln.lower()
if any(t in ll for t in (
"permanent/protokoll", "session cookie",
"persistent cookie", "permanent cookie",
"sitzungs-cookie", "permanent/ protokoll",
)):
art = ln
if not persistence and j > i + 2:
persistence = lines[j - 1]
break
purpose_parts.append(ln)
j += 1
purpose = " ".join(purpose_parts[:-1]) if len(purpose_parts) > 1 else " ".join(purpose_parts)
purpose = purpose[:500].strip()
seen_names.add(nl)
provider = _guess_vendor(name_line) or "Unbekannter Anbieter (VW-intern)"
# Marketing cookies are treated as third-party
if "marketing" in cat_line.lower() or "tracking" in cat_line.lower():
if provider == "Unbekannter Anbieter (VW-intern)":
provider = "Unbekannter Drittanbieter (Marketing)"
entry = by_vendor.setdefault(provider, {
"name": provider, "country": "",
"purpose": "", "category": _normalize_category(cat_line),
"opt_out_url": "", "privacy_policy_url": "",
"persistence": "",
"cookies": [],
"source": "block_paste",
})
entry["cookies"].append({
"name": name_line,
"purpose": purpose[:300],
"expiry": persistence,
"is_third_party": "tracking" in cat_line.lower() or "marketing" in cat_line.lower(),
})
i = j + 1 if art else i + 5
out = list(by_vendor.values())
logger.info("parse_block_format: %d vendors / %d cookies",
len(out), sum(len(v["cookies"]) for v in out))
return out
def parse_cookie_table(text: str) -> list[dict]:
"""Returns vendor records from a copy-pasted cookie table.
Tries, in this order:
1. Tab/pipe/semicolon/comma-separated (classic table layout)
2. 5-line block format (VW browser copy)
3. return []
"""
if not text or len(text) < 100:
return []
lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
if not lines:
return []
# Sample the first 60 lines to detect the separator
sample = lines[:60]
sep_counts: dict[str, int] = {}
for ln in sample:
sep = _looks_like_separator(ln)
if sep:
sep_counts[sep] = sep_counts.get(sep, 0) + 1
if not sep_counts or max(sep_counts.values()) < 3:
# No separator format → try the block format
block_vendors = parse_block_format(text)
if block_vendors:
return block_vendors
return []
sep = max(sep_counts, key=sep_counts.get)
logger.info("cookies_table_parser: detected separator '%s' (%d hits)",
sep, sep_counts[sep])
# Parse rows
rows: list[list[str]] = []
for ln in lines:
if sep in ln:
parts = [p.strip().strip('"') for p in ln.split(sep)]
if len(parts) >= 2 and parts[0]:
rows.append(parts)
if len(rows) < 3:
return []
# Detect column layout from header (first row) or by content
header_row = [c.lower() for c in rows[0]]
has_header = any(h in " ".join(header_row) for h in
("cookie", "name", "anbieter", "provider", "zweck",
"kategorie", "speicherdauer", "dauer"))
data_rows = rows[1:] if has_header else rows
# Map columns by header keyword or by position
col_idx = {"name": 0, "provider": -1, "category": -1,
"purpose": -1, "persistence": -1}
if has_header:
for i, h in enumerate(header_row):
if "name" in h or "cookie" in h:
col_idx["name"] = i
elif "anbieter" in h or "provider" in h or "domain" in h:
col_idx["provider"] = i
elif "kategorie" in h or "type" in h or "art" in h:
col_idx["category"] = i
elif "zweck" in h or "purpose" in h or "beschreib" in h:
col_idx["purpose"] = i
elif "speicher" in h or "dauer" in h or "lebens" in h or "expir" in h:
col_idx["persistence"] = i
# Aggregate by vendor (or by name if no vendor column)
by_vendor: dict[str, dict] = {}
for r in data_rows:
if len(r) < 2:
continue
name = r[col_idx["name"]] if col_idx["name"] < len(r) else r[0]
name = (name or "").strip()
if not name or len(name) > 120 or len(name) < 2:
continue
provider = ""
if col_idx["provider"] >= 0 and col_idx["provider"] < len(r):
provider = r[col_idx["provider"]].strip()
if not provider:
# Heuristic: if the 'Anbieter' column is missing, guess from the cookie name
provider = _guess_vendor(name)
if not provider:
provider = "Unbekannter Anbieter"
category = ""
purpose = ""
persistence = ""
if col_idx["category"] >= 0 and col_idx["category"] < len(r):
category = _normalize_category(r[col_idx["category"]])
if col_idx["purpose"] >= 0 and col_idx["purpose"] < len(r):
purpose = r[col_idx["purpose"]][:500]
if col_idx["persistence"] >= 0 and col_idx["persistence"] < len(r):
persistence = _parse_persistence(r[col_idx["persistence"]])
if not category:
# Inferieren aus purpose-Text
category = _normalize_category(purpose)
entry = by_vendor.setdefault(provider, {
"name": provider, "country": "",
"purpose": purpose[:300] if purpose else "",
"category": category,
"opt_out_url": "", "privacy_policy_url": "",
"persistence": persistence,
"cookies": [],
"source": "table_paste",
})
entry["cookies"].append({
"name": name, "purpose": purpose[:200],
"expiry": persistence, "is_third_party": True,
})
out = list(by_vendor.values())
logger.info("cookies_table_parser: %d vendors / %d cookies parsed",
len(out), sum(len(v["cookies"]) for v in out))
return out
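The header-to-column mapping in the parser above can be tried in isolation. This is a minimal sketch, not code from the commit: `map_columns` is a hypothetical helper that mirrors the keyword checks, and the sample header row is invented for illustration.

```python
def map_columns(header_row: list[str]) -> dict[str, int]:
    """Hypothetical helper mirroring the keyword-based column mapping."""
    col_idx = {"name": 0, "provider": -1, "category": -1,
               "purpose": -1, "persistence": -1}
    for i, h in enumerate(c.lower() for c in header_row):
        if "name" in h or "cookie" in h:
            col_idx["name"] = i
        elif "anbieter" in h or "provider" in h or "domain" in h:
            col_idx["provider"] = i
        elif "kategorie" in h or "type" in h or "art" in h:
            col_idx["category"] = i
        elif "zweck" in h or "purpose" in h or "beschreib" in h:
            col_idx["purpose"] = i
        elif "speicher" in h or "dauer" in h or "lebens" in h or "expir" in h:
            col_idx["persistence"] = i
    return col_idx


# Invented sample header; -1 marks a column that was not found.
columns = map_columns(["Name", "Anbieter", "Zweck", "Speicherdauer"])
print(columns)
```

Because the checks are ordered, a header cell containing "cookie" is always treated as the name column, even when it also names a category (for example "Cookie-Kategorie").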
# textContent output of HTML tables concatenates cells without whitespace
# (e.g. VW: "Permanent/Protokoll_fbcTracking Cookies (Marketing)..."). We
# detect cookie entries via two anchors:
# - Before: a typical end token of the preceding table cell
# (storage-duration suffix such as Permanent/Protokoll, Session Cookie, ...)
# - After: a category token (Tracking Cookies, Funktionscookie, ...)
# In between: the cookie name (2-41 characters: a letter or underscore,
# then alphanumerics, underscore, dash, or dot).
_FLAT_ROW_RE = re.compile(
r"(?:Permanent/Protokoll|Session Cookie|Persistent Cookie|"
r"TagePersistent|TageSitzungs-Cookie|TageSession Cookie|"
r"MinutenPersistent|MinutenSession Cookie|StundenPersistent|"
r"MonatePersistent|JahrePersistent)"
r"([A-Za-z_][A-Za-z0-9_\-\.]{1,40}?)"
r"(?=Tracking Cookies|Session Cookies|Funktionscookie|Funktional|"
r"Marketing|Analytics|Necessary)",
re.I,
)
def parse_flat_cookie_text(text: str) -> list[dict]:
    """Variant for sites such as VW that deliver their cookie table as flat
    text (textContent output with no whitespace between cells).
    The regex is anchored on the preceding storage-duration suffix and the
    following category token and extracts the cookie name in between.
    """
    if not text or len(text) < 500:
        return []
    names = _FLAT_ROW_RE.findall(text)
    if len(names) < 3:
        return []
    by_vendor: dict[str, dict] = {}
    seen_names: set[str] = set()
    for raw in names:
        name = raw.strip()
        nl = name.lower()
        if nl in seen_names:
            continue
        if nl in ("dieser", "diese", "ein", "der", "die", "das",
                  "session", "permanent", "funktional", "notwendig",
                  "marketing", "analytics", "werbung", "anbieter",
                  "tracking", "cookie", "cookies", "und", "von",
                  "einer", "ist", "alle", "noch", "auch", "name",
                  "art", "zweck", "dauer", "test"):
            continue
        if len(name) < 3 or len(name) > 60:
            continue
        seen_names.add(nl)
        vendor = _guess_vendor(name) or "Unbekannter Anbieter"
        entry = by_vendor.setdefault(vendor, {
            "name": vendor, "country": "",
            "purpose": "", "category": "",
            "opt_out_url": "", "privacy_policy_url": "",
            "persistence": "",
            "cookies": [],
            "source": "flat_pattern",
        })
        entry["cookies"].append({
            "name": name, "purpose": "",
            "expiry": "", "is_third_party": True,
        })
    out = list(by_vendor.values())
    logger.info("parse_flat_cookie_text: %d vendors / %d cookies",
                len(out), sum(len(v["cookies"]) for v in out))
    return out
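To see how the two-anchor pattern picks cookie names out of flattened table text, the following sketch runs a copy of the regex on a short invented sample. The name `FLAT_ROW_RE` and the sample string are assumptions for illustration; only the pattern itself is taken from the diff.

```python
import re

# Copy of the two-anchor pattern from the diff above.
FLAT_ROW_RE = re.compile(
    r"(?:Permanent/Protokoll|Session Cookie|Persistent Cookie|"
    r"TagePersistent|TageSitzungs-Cookie|TageSession Cookie|"
    r"MinutenPersistent|MinutenSession Cookie|StundenPersistent|"
    r"MonatePersistent|JahrePersistent)"
    r"([A-Za-z_][A-Za-z0-9_\-\.]{1,40}?)"
    r"(?=Tracking Cookies|Session Cookies|Funktionscookie|Funktional|"
    r"Marketing|Analytics|Necessary)",
    re.I,
)

# Invented sample: three table rows flattened without separators.
sample = (
    "Permanent/Protokoll_fbcTracking Cookies (Marketing)"
    "30 TagePersistent_gaAnalytics"
    "Session CookieIDEMarketing"
)

# The lazy quantifier stops at the first position where a category
# token follows, so each match is the shortest candidate name.
names = FLAT_ROW_RE.findall(sample)
print(names)  # ['_fbc', '_ga', 'IDE']
```

Note that `parse_flat_cookie_text` itself returns an empty list for inputs shorter than 500 characters or with fewer than three matches, so a sample this small only exercises the regex, not the full function.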
_VENDOR_GUESS = (
# Google family (group everything under "Google"; dedup takes care of the rest)
("_ga", "Google"), ("_gid", "Google"), ("_gcl_", "Google"),
("ANID", "Google"), ("AID", "Google"), ("FPGCLDC", "Google"),
("FPAU", "Google"), ("FLC", "Google"), ("APC", "Google"),
("IDE", "Google"), ("DSID", "Google"), ("TAID", "Google"),
("NID", "Google"), ("1P_JAR", "Google"),
# Meta / Facebook
("_fbp", "Meta / Facebook"), ("_fbc", "Meta / Facebook"),
# fr is a Meta cookie only when the site does not use that name for its own purposes
# Pinterest, Microsoft / Bing
("_pin_unauth", "Pinterest"), ("_uetsid", "Microsoft Bing"),
("_uetvid", "Microsoft Bing"), ("MUID", "Microsoft"),
# Social networks
("tt_", "TikTok"), ("li_at", "LinkedIn"),
# CMP
("OptanonConsent", "OneTrust"), ("cookieconsent", "Borlabs / Cookie-CMP"),
("CookieConsentPolicy", "Borlabs / Cookie-CMP"),
# Analytics
("eta_", "etracker"), ("matomo", "Matomo"),
("_hjid", "Hotjar"), ("_hj", "Hotjar"),
("ajs_", "Segment"), ("amp_", "Amplitude"),
# Adobe-Familie
("sat_track", "Adobe Experience Cloud"),
("AMCV", "Adobe Experience Cloud"),
("AMCVS", "Adobe Experience Cloud"),
("demdex", "Adobe Experience Cloud"),
("dextp", "Adobe Experience Cloud"),
("dpm", "Adobe Experience Cloud"),
("mbox", "Adobe Target"),
("smartSignals", "Adobe Experience Cloud"),
("adbCDP", "Adobe Experience Cloud"),
("s_cc", "Adobe Analytics"), ("s_sq", "Adobe Analytics"),
("s_ecid", "Adobe Analytics"), ("s_vi", "Adobe Analytics"),
("s_fid", "Adobe Analytics"), ("s_plt", "Adobe Analytics"),
("s_pltp", "Adobe Analytics"), ("s_invisit", "Adobe Analytics"),
("s_vnc365", "Adobe Analytics"), ("s_ivc", "Adobe Analytics"),
("sc_appvn", "Adobe Analytics"), ("sc_pCmp", "Adobe Analytics"),
("sc_prevpage", "Adobe Analytics"), ("sc_prop", "Adobe Analytics"),
("sc_v17", "Adobe Analytics"), ("sc_v44", "Adobe Analytics"),
("sc_v49", "Adobe Analytics"),
# The Trade Desk
("TDID", "The Trade Desk"), ("TDCPM", "The Trade Desk"),
("TTDOptOut", "The Trade Desk"),
# AdForm
("uid", "AdForm"), ("cid", "AdForm"), ("otsid", "AdForm"),
# everest
("everest", "Adobe Advertising Cloud (everest)"),
# Infra/CDN
("__cf", "Cloudflare"), ("datadome", "DataDome"),
("incap_", "Imperva Incapsula"), ("awsalb", "AWS Load Balancer"),
# Salesforce
("sfdc-", "Salesforce"), ("X-Salesforce", "Salesforce"),
("liveagent_", "Salesforce LiveAgent"),
# Inbenta
("inbenta", "Inbenta"),
# Other trackers
("_pk_", "Matomo / Piwik"),
("hmt_", "Akamai mPulse"),
# EDAA / Industry Self-regulation
("EDAAT", "EDAA / Online Choices"),
("Eboptout", "EDAA / Online Choices"),
)
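def _guess_vendor(cookie_name: str) -> str:
    # Case-insensitive substring match against _VENDOR_GUESS; the first
    # matching entry wins. The startswith() test is subsumed by the
    # substring test, so short markers such as "NID" also hit unrelated
    # names (e.g. "sessionid").
    nl = cookie_name.lower()
    for prefix, vendor in _VENDOR_GUESS:
        if nl.startswith(prefix.lower()) or prefix.lower() in nl:
            return vendor
    return ""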
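The vendor lookup can be illustrated with a reduced marker table. This is a sketch under stated assumptions: `VENDOR_GUESS` is a three-entry excerpt, `guess_vendor` reproduces the substring behaviour of the lookup in the diff, and `guess_vendor_prefix_only` is a hypothetical stricter variant that is not part of the commit.

```python
# Excerpt of the marker table from the diff; the full table is longer.
VENDOR_GUESS = (
    ("_ga", "Google"),
    ("NID", "Google"),
    ("_fbp", "Meta / Facebook"),
)


def guess_vendor(cookie_name: str) -> str:
    """Substring match, equivalent to the lookup in the diff."""
    nl = cookie_name.lower()
    for marker, vendor in VENDOR_GUESS:
        if marker.lower() in nl:
            return vendor
    return ""


def guess_vendor_prefix_only(cookie_name: str) -> str:
    """Hypothetical stricter variant: match markers at the start only."""
    nl = cookie_name.lower()
    for marker, vendor in VENDOR_GUESS:
        if nl.startswith(marker.lower()):
            return vendor
    return ""


print(guess_vendor("_ga_ABC123"))             # Google
print(guess_vendor("sessionid"))              # Google (via the "nid" substring)
print(guess_vendor_prefix_only("sessionid"))  # empty string
```

A prefix-only match would avoid hits like `sessionid`, but some entries in the full table (for example `matomo` or `demdex`) look as if they are meant to match anywhere in the name, so the choice is a trade-off rather than a clear fix.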
@@ -39,6 +39,12 @@ AGB_CHECKLIST = [
"patterns": [
r"vertragsschluss", r"zustandekommen",
r"contract\s+formation", r"angebot\s+und\s+annahme",
# P41: English synonyms
r"conclusion\s+of\s+(?:the\s+)?contract",
r"contract\s+(?:is\s+)?(?:concluded|formed)",
r"offer\s+and\s+acceptance",
r"how\s+the\s+contract\s+is\s+formed",
r"contracts?\s+(?:apply|between\s+the\s+provider)",
],
"severity": "HIGH",
"hint": "Haeufiger Fehler: Die Bestellung wird als Angebot des Kunden dargestellt, aber die Auftragsbestaetigung als Annahme — das ist nur wirksam, wenn klar zwischen Eingangsbestaetigung (§312i BGB) und Auftragsbestaetigung/Annahme unterschieden wird.",
@@ -140,6 +146,15 @@ AGB_CHECKLIST = [
r"lieferung", r"leistungserbringung", r"delivery",
r"lieferfrist", r"bereitstellung",
r"(?:zugang|zugriff).*(?:dienst|leistung)",
# P41: English synonyms (SaaS-style)
r"provision\s+of\s+(?:the\s+)?(?:service|services)",
r"(?:performance|rendering)\s+of\s+(?:the\s+)?(?:service|services)",
r"availability\s+of\s+(?:the\s+)?service",
r"service\s+level\s+(?:agreement|description)",
r"access\s+to\s+(?:the\s+)?(?:service|platform)",
r"description\s+of\s+(?:the\s+)?services?",
r"(?:^|\n)\s*#+\s*[§\d\.\s]*availability\b",
r"(?:^|\n)\s*#+\s*[§\d\.\s]*description\s+of\s+services?",
],
"severity": "MEDIUM",
"hint": "Bei Fernabsatzvertraegen muss der Unternehmer spaetestens 30 Tage nach Vertragsschluss liefern (§475 Abs. 1 BGB). Formulierungen wie 'Lieferung in der Regel in...' oder 'voraussichtlich' sind nur als Richtwert zulaessig, nicht als verbindliche Frist.",
@@ -230,6 +245,12 @@ AGB_CHECKLIST = [
r"(?:agb|bedingung).*datenschutz",
r"personenbezogen.*daten.*(?:agb|vertrag)",
r"dsgvo.*(?:agb|vertrag)",
# P41: English synonyms
r"data\s+protection.*(?:terms|contract)",
r"(?:terms|contract).*data\s+protection",
r"personal\s+data.*(?:terms|contract|agreement)",
r"gdpr.*(?:terms|contract|agreement)",
r"privacy\s+(?:policy|notice).*(?:see|refer)",
],
"severity": "LOW",
"hint": "AGB und Datenschutzerklaerung sind rechtlich getrennte Dokumente. Mischen Sie KEINE Datenschutzhinweise in die AGB ein — stattdessen genuegt ein Verweis: 'Details zur Datenverarbeitung finden Sie in unserer Datenschutzerklaerung [Link].'",
@@ -245,6 +266,11 @@ AGB_CHECKLIST = [
r"(?:unwirksamkeit|nichtigkeit)\s+(?:einer|einzelner)\s+(?:bestimmung|klausel|regelung)",
r"(?:sollte|sofern).*(?:bestimmung|klausel).*(?:unwirksam|nichtig)",
r"(?:uebrigen|übrigen)\s+bestimmungen.*(?:unberuehrt|unberührt|wirksam|bestehen)",
# P41: English equivalents
r"severability",
r"(?:invalid|unenforceable).*(?:provision|clause)",
r"remaining\s+provisions\s+(?:shall\s+)?(?:remain|continue)",
r"(?:provision|clause)\s+(?:is\s+)?(?:invalid|unenforceable|void)",
],
"severity": "LOW",
"hint": "Die klassische salvatorische Klausel ('unwirksame Bestimmungen werden durch wirksame ersetzt') ist nach BGH-Rechtsprechung in AGB selbst unwirksam. Besser: Nur die Erhaltungsklausel verwenden ('Die uebrigen Bestimmungen bleiben wirksam').",
@@ -260,6 +286,12 @@ AGB_CHECKLIST = [
r"(?:agb|bedingung).*(?:ae|ä)nder",
r"(?:anpassung|aktualisierung).*(?:agb|bedingung|geschaeftsbedingung|geschäftsbedingung)",
r"(?:neue\s+fassung|neufassung).*(?:agb|bedingung)",
# P41: English
r"amendments?.*(?:terms|conditions|agreement)",
r"(?:terms|conditions|agreement).*(?:may\s+be\s+)?amend",
r"changes?\s+to\s+(?:these\s+)?(?:terms|conditions)",
r"modification\s+of\s+(?:the\s+)?(?:terms|agreement)",
r"(?:revised|updated)\s+(?:terms|conditions|version)",
],
"severity": "LOW",
"hint": "AGB-Aenderungsklauseln bei B2C sind nur unter engen Voraussetzungen wirksam (BGH Az. XI ZR 388/10): Aenderungsgrund muss konkret benannt sein, Kunde muss angemessene Frist zur Kuendigung erhalten. Pauschale 'Wir koennen jederzeit aendern'-Klauseln sind unwirksam.",
@@ -275,6 +307,12 @@ AGB_CHECKLIST = [
r"verbraucherrecht",
r"(?:gesetzlich|zwingende)\w*\s+recht\w*.*(?:unberuehrt|unberührt|bestehen\s+bleiben)",
r"(?:verbrauch|konsument).*(?:recht|anspruch|schutz)",
# P41: English equivalents — UCTA / Consumer Rights Act
r"consumer\s+(?:rights?|protection|laws?)",
r"statutory\s+rights?\s+(?:are|shall\s+be|remain)\s+unaffected",
r"mandatory\s+(?:law|rights?)\s+(?:remain|shall\s+remain)",
r"(?:nothing|no\s+provision)\s+(?:in\s+these\s+)?(?:terms|conditions)\s+(?:shall|limits?|excludes?)",
r"contracts?\s+with\s+consumers?\s+(?:are\s+not\s+concluded|excluded)",
],
"severity": "LOW",
"hint": "Haeufigste §309 BGB-Verstoesse: Pauschalierter Schadensersatz ohne Gegenbeweismoeglichkeit (Nr. 5), Haftungsausschluss bei Koerperschaeden (Nr. 7a), Schriftformerfordernis fuer Kuendigung (Nr. 13). Jede dieser Klauseln ist einzeln abmahnfaehig.",
@@ -259,6 +259,8 @@ AVV_CHECKLIST = [
r"(?:l(?:oe|ö)schung|rueckgabe|r(?:ue|ü)ckgabe)\s+(?:nach|bei|zum)\s+(?:vertragsende|beendigung|ablauf)",
r"(?:nach|bei)\s+(?:beendigung|ablauf|ende)\s+(?:des\s+)?(?:vertrag|auftrag)[\s\S]{0,100}(?:l(?:oe|ö)sch|rueckgabe|r(?:ue|ü)ckgabe|vernicht)",
r"(?:alle|saemtliche)\s+(?:personenbezogenen?\s+)?daten\s+(?:l(?:oe|ö)sch|vernicht|zurueckgeb|zur(?:ue|ü)ckgeb)",
# P39: reverse order — "loescht/gibt ... nach Beendigung/Ablauf"
r"(?:l(?:oe|ö)sch|gibt|gibt\s+zur(?:ue|ü)ck|vernicht)\w*[\s\S]{0,150}(?:nach|bei|zum)\s+(?:beendigung|ablauf|ende|vertragsende)",
],
"severity": "CRITICAL",
"hint": "Art. 28(3)(g) DSGVO: Nach Ende der Verarbeitung muessen alle personenbezogenen Daten geloescht oder zurueckgegeben werden — nach Wahl des Verantwortlichen. Ausnahme nur bei gesetzlicher Aufbewahrungspflicht.",
@@ -336,6 +338,10 @@ AVV_CHECKLIST = [
r"data\s+breach",
r"(?:meld|benachrichtig|informier|unterricht)\w*[\s\S]{0,50}(?:verletzung|vorfall|sicherheit)",
r"art(?:ikel)?\s*\.?\s*33\s+(?:dsgvo|ds-?gvo)",
# P39: "Datenpanne" as an equivalent synonym (very common)
r"datenpanne",
r"meldung\s+von\s+datenpannen",
r"art\.?\s*33\s+abs\.?\s*\d",
],
"severity": "CRITICAL",
"hint": "Art. 33(2) DSGVO: Der Auftragsverarbeiter muss den Verantwortlichen UNVERZUEGLICH ueber jede Datenschutzverletzung informieren. Die 72-Stunden-Frist des Verantwortlichen gegenueber der Aufsichtsbehoerde laeuft ab Kenntnis — daher sollte die Meldefrist im AVV enger sein (z.B. 24h).",
@@ -66,6 +66,10 @@ COOKIE_CHECKLIST = [
r"(?:setzen|verwenden|nutzen)\s+.*cookies?\s+.*(?:um|fuer|für)",
r"(?:analyse|marketing|tracking|funktional)\w*\s*cookies?\s*\.?\s*(?:um|damit|diese|sie)",
r"cookies?\s+(?:dienen|helfen|erm(?:oe|ö)glichen)",
# P39: cookie purpose table column "| Zweck |" + "Kategorie"
r"kategorie\s*\|\s*zweck",
r"\|\s*zweck\s*\|",
r"welche\s+technologie\s+welchen\s+zweck",
],
"severity": "HIGH",
"hint": "Art. 13 Abs. 1 lit. c DSGVO verlangt die Zweckangabe je Verarbeitung. Jede Cookie-Kategorie braucht einen konkreten Zweck (z.B. 'Reichweitenmessung', 'Conversion-Tracking'), nicht nur 'zur Verbesserung unserer Website'.",
@@ -207,6 +211,10 @@ COOKIE_CHECKLIST = [
r"(?:datenschutz[\-]?rechtlich(?:er)?\s+)?verantwortlich\w*\s*[:\|]",
r"daten(?:schutz)?[\-]?(?:rechtlich(?:er)?\s+)?(?:verantwortl|controller)",
r"\bcontroller\b.*\b(?:art\.?\s*13|art\.?\s*14|gdpr|dsgvo)",
# P39: heading variant — common in cookie policies
r"(?:^|\n)\s*#+\s*\d*\.?\s*verantwortlich\w*",
r"(?:^|\n)\s*\d+\.\s+verantwortlich\w*",
r"verantwortlich\w*\s+(?:fuer|für|ist|im\s+sinne)",
],
"severity": "MEDIUM",
"hint": "Art. 13(1)(a) DSGVO verlangt die Nennung des Verantwortlichen in der Cookie-Richtlinie. Pflicht: Firmenname + Anschrift + Kontaktdaten (E-Mail/Telefon). Akzeptabel: knapper Verweis 'Details zum Verantwortlichen siehe Datenschutzerklaerung [Link]' wenn die DSI verlinkt ist.",
@@ -268,19 +276,40 @@ COOKIE_CHECKLIST = [
},
# ── New L1: cookie table ──────────────────────────────────────────
# P95: Looser match: vendor-centric detail pages (BMW style with an
# Adform block etc.) are accepted as equivalent. DSK-OH 2024
# §3.2 requires the information per cookie but does not prescribe a
# table layout. A vendor block that lists name + provider + purpose +
# duration + cookie names in aggregate satisfies this.
{
"id": "cookie_table",
"label": "Strukturierte Cookie-Tabelle/Liste",
"label": "Strukturierte Cookie-Informationen (Tabelle oder Vendor-Blöcke)",
"level": 1, "parent": None,
"patterns": [
# Classic table
r"(?:cookie[\-\s])?(?:tabelle|uebersicht|übersicht|liste|aufstellung)",
r"(?:name|bezeichnung)\s*[\|\t]\s*(?:anbieter|zweck|dauer|laufzeit|funktion)",
r"(?:first[\-\s]?party|third[\-\s]?party)\s*[\|\t]",
r"(?:typ(?:en)?|name|funktion|speicherdauer)\s+(?:typ(?:en)?|name|funktion|speicherdauer)",
r"folgende\s+cookies",
r"(?:funktionale|session|analyse|tracking)\s+cookies?\s+\w+",
# P95: Vendor-centric detail blocks (BMW style): if several
# typical vendor-block markers are present, the page counts as
# structured. "Gesetzt von:" + "Opt-Out Link:" + "Privacy"
# is a clear indicator of a vendor detail page.
r"gesetzt\s+von\s*[:\|]",
r"opt[\-\s]?out[\s\-]?link\s*[:\|]",
r"speicherdauer\s*[:\|]\s*\d+\s+(?:tag|monat|jahr|day|month|year)",
r"(?:rechtsgrundlage|legal\s+basis)\s*[:\|]",
r"(?:diese\s+datenverarbeitung\s+verwendet\s+die\s+folgenden\s+cookies)",
],
"severity": "LOW",
"hint": "Die DSK-Orientierungshilfe empfiehlt eine Tabelle mit 5 Spalten: Name, Anbieter, Zweck, Speicherdauer, Typ (First-/Third-Party). Viele Consent-Tools (Cookiebot, Usercentrics) generieren diese Tabelle automatisch — binden Sie sie ein.",
"hint": "DSK-OH Telemedien 2024 §3.2 verlangt Cookie-Informationen pro "
"Vendor/Cookie (Name, Anbieter, Zweck, Speicherdauer, Drittlandtransfer). "
"Akzeptable Formate: (a) Tabelle mit 5 Spalten oder (b) Vendor-Detailseite "
"mit Block pro Anbieter (Anbieter+Anschrift, Zweck, Speicherdauer aggregiert, "
"Cookie-Namen-Liste, Opt-Out-Link, Drittlandstatus). BMW-Stil mit Adform-"
"Block ist konform. Auch automatisierte CMP-Generierung (Cookiebot, Usercentrics) "
"ist OK.",
},
]
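The P95 vendor-block patterns can be checked against a small sample. The sketch below is illustrative only: the sample block is invented, and applying the patterns with a case-insensitive `re.search` is an assumption, since the matching code is not part of this hunk.

```python
import re

# Vendor-block patterns copied from the P95 addition in the diff.
VENDOR_BLOCK_PATTERNS = [
    r"gesetzt\s+von\s*[:\|]",
    r"opt[\-\s]?out[\s\-]?link\s*[:\|]",
    r"speicherdauer\s*[:\|]\s*\d+\s+(?:tag|monat|jahr|day|month|year)",
    r"(?:rechtsgrundlage|legal\s+basis)\s*[:\|]",
    r"(?:diese\s+datenverarbeitung\s+verwendet\s+die\s+folgenden\s+cookies)",
]

# Invented vendor block in the style the P95 comment describes.
sample = (
    "Gesetzt von: Adform A/S\n"
    "Speicherdauer: 60 Tage\n"
    "Opt-Out Link: https://example.com/opt-out\n"
    "Rechtsgrundlage: Einwilligung\n"
)

# Assumption: patterns are applied with a case-insensitive search.
hits = [p for p in VENDOR_BLOCK_PATTERNS if re.search(p, sample, re.I)]
print(len(hits), "of", len(VENDOR_BLOCK_PATTERNS), "patterns matched")
```

Counting hits is one way to express the "several markers present" idea from the comment; whether the checker requires more than one hit is not visible in this diff.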
