Wire the 195 Clean-Room QUAIDAL controls (from breakpilot-core migration 011)
into the compliance SaaS UI.
Backend:
- GET /api/v1/quaidal/stats - counts by kind + source provenance
- GET /api/v1/quaidal/controls - list, optional kind= filter
- GET /api/v1/quaidal/controls/{id} - single derived control
- GET /api/v1/quaidal/criteria - 10 QKB criteria
- GET /api/v1/quaidal/criteria/{id} - QKB with QB/MA/QM tree
Frontend:
- /sdk/quality: new "Trainingsdaten-Qualität (BSI QUAIDAL)" tab with
10 QKB cards and a drill-down modal showing the full QB→MA→QM tree
plus original BSI source link and license note.
- /sdk/ai-act: Art. 10 tile on each high-risk/unacceptable result,
linking to /sdk/quality?category=data_quality.
Pattern matches existing IACE module DIN-reference handling:
own wording, source section + URL preserved for due diligence.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A) Cookie-policy architecture block: falls back to the DSE (privacy
policy) text when the cookie doc was deduped by P15. Now also detects
single-doc sites (Safetykon pattern).
B) Concrete task list: per-doc cap (3) removed, global cap raised 10→20.
Safetykon now shows 7 tasks instead of 4.
C) business_type classifier: the B2B service cluster from P14 acts as a
boost. With 2+ service indicators (CE certification/compliance/auditing)
b2b_score is raised. Safetykon: "B2C consulting" → "B2B (consulting)".
D) Vendor extraction: falls back to the DSE text when the cookie doc was
deduped and there are no CMP payloads. The LLM then extracts vendors from
the DSE text. Safetykon: 0 → 1 vendor (Google Analytics detected in the
DSE text).
Smoke test Safetykon: all 4 polish items take effect, no regressions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P14: _detect_no_direct_sales extended to cover 3 clusters:
A) OEM configurator (BMW/Audi/Mercedes/VW/Porsche brand names + authorized-dealer pattern)
B) B2B service provider (CE certification, compliance consulting, training, auditing, TISAX, ISO standards, occupational safety, ...)
C) NGO/association/public body (donation account, register of associations, non-profit status, ...)
Threshold: pos >= 2 per cluster AND pos > neg. Previously: OEM only.
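
The threshold rule can be sketched as follows. Only the rule itself (pos >= 2 per cluster AND pos > neg) comes from this message; the keyword lists, the negative indicators and the function names are illustrative assumptions, and the sketch assumes that neg is counted once for the whole text.

```python
from typing import Dict, List

# Hypothetical keyword lists; the real patterns are not shown in this message.
CLUSTERS: Dict[str, List[str]] = {
    "oem": ["konfigurator", "vertragshaendler"],
    "b2b_service": ["ce-zertifizierung", "auditierung", "tisax", "arbeitssicherheit"],
    "ngo": ["spendenkonto", "vereinsregister", "gemeinnuetzig"],
}
# Assumed direct-sales signals that count against the "no direct sales" verdict.
NEGATIVE: List[str] = ["warenkorb", "versandkosten"]


def count_hits(text: str, keywords: List[str]) -> int:
    """Count how many keywords occur at least once in the text."""
    lowered = text.lower()
    return sum(1 for keyword in keywords if keyword in lowered)


def detect_no_direct_sales(text: str) -> bool:
    """True when any single cluster has pos >= 2 and pos > neg."""
    neg = count_hits(text, NEGATIVE)
    for keywords in CLUSTERS.values():
        pos = count_hits(text, keywords)
        if pos >= 2 and pos > neg:
            return True
    return False
```

Evaluating the threshold per cluster means that one stray keyword from each of the three clusters does not add up to a match.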
P15: doc-URL dedup in the worker. When several doc types reference THE
SAME document (Safetykon pattern: the user submits /datenschutz for dse,
cookie AND widerruf), only the primary doc type (priority: dse > impressum
> cookie > widerruf > agb > nutzungsbedingungen) receives the text. The
others are reported as "Nicht separat vorhanden — wird im Dokument 'X'
mit-geprueft." (not available separately; checked as part of document 'X').
Eliminates the 8+8 systematic widerruf/cookie false positives.
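
A minimal sketch of the dedup step, assuming the submitted documents arrive as a doc_type -> URL mapping. The priority order is taken from this message; the function name, the trailing-slash normalisation and the return shape are assumptions.

```python
from typing import Dict

PRIORITY = ["dse", "impressum", "cookie", "widerruf", "agb", "nutzungsbedingungen"]


def _rank(doc_type: str) -> int:
    """Unknown doc types sort after all known ones."""
    return PRIORITY.index(doc_type) if doc_type in PRIORITY else len(PRIORITY)


def dedupe_doc_urls(doc_urls: Dict[str, str]) -> Dict[str, str]:
    """Map each doc type to 'primary' or to the doc type that covers it."""
    owner_by_url: Dict[str, str] = {}
    result: Dict[str, str] = {}
    for doc_type in sorted(doc_urls, key=_rank):
        url = doc_urls[doc_type].rstrip("/")  # assumed normalisation
        if url in owner_by_url:
            result[doc_type] = owner_by_url[url]  # covered by the primary doc type
        else:
            owner_by_url[url] = doc_type
            result[doc_type] = "primary"
    return result
```

Processing doc types in priority order ensures that the first one to claim a URL is always the highest-priority one.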
P16: profile detection now also uses the homepage text. The homepage HTML
is fetched with a short timeout (8s), stripped and merged into
profile_input. Previously P14 only took effect when B2B indicators
appeared in the mandatory DSE/Impressum text; at Safetykon they appear
only in the homepage menu.
Bonus: the TDM override submit button is disabled while the reason is
shorter than 10 characters, which keeps users from clicking into the bug
as happened today.
Smoke test Safetykon (B2B compliance service provider):
dse checked (no error)
impressum checked (no error)
cookie "Nicht separat vorhanden — wird in DSE mit-geprueft"
agb "Nicht anwendbar — kein Direkt-Kaufvertrag"
widerruf "Nicht anwendbar — kein Direkt-Kaufvertrag"
nutzungsbedingungen "Nicht anwendbar — kein Direkt-Kaufvertrag"
Before: 16 false positives. Now: 0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Migration 031 added customer_name to the SELECT statement in three places
(GetProject, ListProjects, ListVariants), and the per-row Scan needed the
matching destination. The replace_all caught ListProjects + ListVariants
but missed GetProject because of an indentation difference (single tab
vs row-scope indentation). Result: GET /projects/:id returned
"get project: number of field descriptions must equal number of
destinations, got 18 and 17"
which the frontend interpreted as "project has no data" and surfaced an
empty UI even though hazards/mitigations/components were intact (118/282/16
on Bremsscheibe).
Single-line fix: add &p.CustomerName to the GetProject scan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
last-build/main tag deleted so detect-changes falls back to
rebuild-all. Exercises the trigger-orca fix end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Gitea act_runner evaluates contains(needs.*.result, 'success') to false
when most upstream build jobs are skipped, so single-service changes
never fired the orca redeploy.
Gate trigger-orca on explicit needs.build-<service>.result == 'success'
OR across all 8 build jobs. One green build now suffices to deploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug: with format='json' plus the batch prompt, qwen3.5:35b-a3b returns
empty strings ('LLM batch: empty response from model'). In the real
compliance check the LLM verifier therefore had no effect:
false-positive findings such as 'Vorstand nicht erkannt' (BMW:
parenthesized list) were not overturned.
Fix: default switched to qwen3:30b-a3b. Verified with the BMW
Impressum text: representative_person is marked overturned=True with
evidence 'Milan Nedeljkovic, Vorsitzender'.
The OLLAMA_VERIFY_MODEL env var remains available as an override.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backend: ComplianceCheckRequest extended with tdm_override +
tdm_override_reason. Worker in the _run_compliance_check path: with
tdm_override=True AND a reason of >= 10 characters, the TDM reservation
is only documented (job.tdm_override.{reason, original_status}) and is
NOT treated as an abort reason. Without a reason the override is ignored.
Audit trail via logger.warning(reason).
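
The gate described above can be sketched like this. The 10-character minimum, the reason/original_status fields and the logger.warning audit trail come from this message; the function name, the "reserved" status value and stripping whitespace before measuring the reason are assumptions.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger("compliance_check")
MIN_REASON_LEN = 10


def resolve_tdm_block(
    tdm_reserved: bool, tdm_override: bool, reason: Optional[str]
) -> Tuple[bool, Optional[dict]]:
    """Return (abort, override_record) for a compliance-check job."""
    if not tdm_reserved:
        return False, None
    cleaned = (reason or "").strip()  # assumption: surrounding whitespace does not count
    if tdm_override and len(cleaned) >= MIN_REASON_LEN:
        logger.warning("TDM override accepted: %s", cleaned)  # audit trail
        return False, {"reason": cleaned, "original_status": "reserved"}
    # Override missing or reason too short: the reservation still aborts the run.
    return True, None
```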
Frontend: ComplianceCheckTab gains a checkbox + mandatory reason field
("Schriftliche Crawl-Erlaubnis vorhanden") directly before the submit
button. Required: reason >= 10 characters. Submit sends the flags to the
backend.
Use case: Safetykon pattern. robots.txt + ai.txt declare a reservation,
but the customer has consented in writing (commissioned audit).
[guardrail-change] ComplianceCheckTab.tsx (511 LOC) added to
loc-exceptions; the split into _components/TDMOverride +
CompliancePolling is P11 tech debt.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
These 4 pre-existing files blocked the Coolify build (LOC CI step
failed). Splitting them is on the Phase 5+ tech-debt backlog; until then
they are carried as exceptions so that production deploys do not fail.
- cra_routes.py (1714)
- vendor_redundancy.py (727)
- cookie_knowledge_db.py (608)
- cookie-banner-embed.ts (558)
Each exception has a short rationale comment above it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New service cookie_policy_architecture.detect_architecture(...) checks
four diagnostic points of a website's cookie policy:
1. Layer separation: single (BMW pattern: banner + info in ONE URL)
| separate (best practice: separate layers)
2. Versioning: "Stand vom DD.MM.JJJJ" / "Version X.Y" / ...
3. Dynamic content: CMP capture on the doc URL or marker texts
4. Vendor count in the text: indicates whether the list is included statically
Risk traffic light:
- green : separate + versioned + static
- yellow: single+unversioned (BMW) OR separate+unversioned
- red   : neither (mandatory info missing)
Wire-in in the compliance-check worker: after the exec-summary block the
architecture block is rendered (build_architecture_html) with a concrete
recommendation. For the BMW pattern: "Snapshot der dynamischen
Vendor-Tabelle als versioniertes PDF im Archiv." (snapshot of the dynamic
vendor table as a versioned PDF in the archive).
Background: BMW has one HTML page that is SIMULTANEOUSLY the banner
re-trigger and the cookie policy. The minimum requirement under §25 TDDDG
+ Art. 13 DSGVO is met, but in a supervisory-authority audit it cannot be
shown which vendor list was active on a given date. This is not a
violation, but it is a best-practice gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#1 Provider name: \b word boundary prevents matching "ag" in "samstag",
plus "aktiengesellschaft" as a full match.
#2 Authorized representatives: the parenthesized-list pattern now
recognizes the BMW format "Vorstand (Milan Nedeljkovic, Jochen Goller,
...)" as well as "Vorsitzender des Aufsichtsrats: Name".
#3 V.i.S.d.P.: was already INFO, OK.
#4 ODR platform (OS-Plattform)/VSBG: with no_direct_sales=True (OEM
pattern) now skipped as "Nicht anwendbar" instead of a 0/1 fail. The
profile now flows through check_document_completeness -> runner.
#5 Competent chamber: IHK + Handwerkskammer + Tieraerztekammer added to
the pattern, and severity LOW -> INFO (conditional).
#6 Share capital: was already INFO, OK.
#7 Link disclaimer: new check property "invert"=True. An anti-pattern
passes when NOT found and fails when found. Previously the finding
always fired; now it fires only when an illegal disclaimer is present
in the text.
Plus: L2 INFO checks (e.g. profession_chamber) no longer count towards
correctness-pct and no longer produce DSI-DETAIL findings. Consistent
with the P8 model: INFO = "check yourself", not "fail".
Verified with the BMW Impressum text; all 7 cases are classified correctly:
name=passed, representative_person=passed, profession_chamber=INFO,
illegal_disclaimer=passed (no disclaimer in the text),
dispute_resolution=skipped (no_direct_sales),
editorial_visdp=INFO, share_capital=INFO.
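
Items #1 and #7 can be illustrated with a short sketch. The regexes below are stand-ins written for this example, not the production patterns, and run_check is a hypothetical helper. It shows how \b keeps "ag" from matching inside "Samstag" and how invert=True flips the meaning of a match.

```python
import re

# Assumed patterns for illustration; the production regexes are not shown here.
NAIVE_AG = re.compile(r"ag|aktiengesellschaft", re.IGNORECASE)
BOUNDED_AG = re.compile(r"\b(?:ag|aktiengesellschaft)\b", re.IGNORECASE)
# Hypothetical anti-pattern for a blanket link disclaimer.
ILLEGAL_DISCLAIMER = re.compile(
    r"distanzieren\s+uns\s+.*\ballen\s+inhalten", re.IGNORECASE
)


def run_check(pattern: re.Pattern, text: str, invert: bool = False) -> str:
    """Return 'passed' or 'failed'; invert=True means a match is a failure."""
    found = pattern.search(text) is not None
    return "passed" if found != invert else "failed"
```

With the naive pattern, "Geoeffnet am Samstag" counts as containing a legal form; with the bounded pattern it does not.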
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Email hardening (mc_scorecard.top_fails):
New _is_hard_finding heuristic filters conditional MCs without negative
evidence out of the top findings. Empty matched_text + a label
containing "falls/sofern/wenn/soweit/ggf." -> dropped; the MC then
appears only in the MC audit as "selbst pruefen" (check yourself).
DATA-2066-A05 (free deactivation of location data) is the prototypical
example.
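
A minimal sketch of the heuristic, assuming each MC result is a dict with label, matched_text and passed keys. The keyword list is taken from this message; the exact regex, the dict shape and the helper filter_hard_fails are assumptions, and severity ordering is left out.

```python
import re
from typing import Any, List, Mapping

# Conditional markers named in the commit message.
CONDITIONAL = re.compile(r"\b(?:falls|sofern|wenn|soweit)\b|\bggf\.", re.IGNORECASE)


def is_hard_finding(check: Mapping[str, Any]) -> bool:
    """False for conditional MCs that have no negative evidence."""
    no_evidence = not (check.get("matched_text") or "").strip()
    conditional = CONDITIONAL.search(check.get("label") or "") is not None
    return not (no_evidence and conditional)


def filter_hard_fails(checks: List[Mapping[str, Any]], n: int) -> List[Mapping[str, Any]]:
    """Keep the first n failed checks that count as hard findings."""
    hard = [c for c in checks if not c.get("passed") and is_hard_finding(c)]
    return hard[:n]
```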
MC audit frontend (audit/[checkId]/page.tsx):
Severity column (CRITICAL/HIGH/MEDIUM/LOW) removed: the MC audit is a
checklist, not a severity threat. Instead:
- "Prioritaet" column with 3 tiers derived from the regulation mapping:
law (DSGVO/ePrivacy/TDDDG/...) / authority guideline
(EDPB/DSK/EuGH/...) / best practice (ISO/NIST/BSI)
- 3 statuses: fulfilled (✓) / not fulfilled (✗) / check yourself (?),
plus not applicable (—). rowReviewStatus() derives "selbst pruefen"
(check yourself) from an empty matched_text + a conditional label.
- Filter rebuilt for 5 statuses instead of 4
- Default filter "Nicht erfuellt" (previously "Nur Fail")
Bonus: f.payload.risk_label TS cast in FindingsTab cleaned up
(unknown -> string).
Effect:
- The email to management shows only real evidence ("DSB fehlt",
"Gebuehr fuer Widerruf")
- The MC audit is a factual checklist for the compliance officer
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[migration-approved]
Task #22. The IACE module is used by a single machine manufacturer, but
their plants land at many different end customers. When the safety expert
commissions the second or third plant at the same customer, whole classes
of mitigations (company-wide PPE rules, locked-out energy isolation,
customer-standard signage) are already in place there — but rediscovered
from scratch every project.
Migration 031: iace_projects.customer_name TEXT + partial index.
The customer is stored as a plain text field rather than a normalised
iace_customers table (option A from the design discussion). A proper
customer-management screen can promote this to a FK later without
data loss.
Backend store_customer_standards.go:
- ListCustomerStandardSuggestions(projectID, includeVerified) collects
mitigations from all non-archived prior projects sharing the same
tenant_id AND case-insensitive customer_name. Aggregates by
mitigation.name (since same-named measures from different prior
projects collapse into one suggestion) and surfaces:
• source_project_count + source_project_names
• is_customer_standard / has_verified_instances flags
includeVerified=false → strictly is_customer_standard=true
includeVerified=true → also status='verified'
- ImportCustomerStandardSuggestion(projectID, name): for every prior
(mitigation.name → hazard.name) pairing, finds matching hazards in
the current project (by name) and ensures a customer-standard
mitigation exists. New rows via CreateMitigation (idempotent through
the UNIQUE(hazard_id, name) from migration 030); existing rows are
flipped to is_relevant=true + is_customer_standard=true +
status='verified' via UPDATE.
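
The aggregation in ListCustomerStandardSuggestions can be sketched in Python (the real store is Go). The field names mirror those listed above; the dataclasses and the assumption that rows are already filtered to the same tenant, customer and non-archived projects are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class PriorMitigation:
    name: str
    project_name: str
    is_customer_standard: bool
    status: str


@dataclass
class Suggestion:
    name: str
    source_project_names: Set[str] = field(default_factory=set)
    is_customer_standard: bool = False
    has_verified_instances: bool = False

    @property
    def source_project_count(self) -> int:
        return len(self.source_project_names)


def list_suggestions(rows: Iterable[PriorMitigation], include_verified: bool) -> List[Suggestion]:
    """Collapse same-named mitigations from prior projects into one suggestion."""
    by_name: Dict[str, Suggestion] = {}
    for row in rows:
        eligible = row.is_customer_standard or (include_verified and row.status == "verified")
        if not eligible:
            continue
        suggestion = by_name.setdefault(row.name, Suggestion(name=row.name))
        suggestion.source_project_names.add(row.project_name)
        suggestion.is_customer_standard |= row.is_customer_standard
        suggestion.has_verified_instances |= row.status == "verified"
    return sorted(by_name.values(), key=lambda s: s.name)
```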
Routes:
GET /api/v1/iace/projects/:id/customer-standards?include_verified=
POST /api/v1/iace/projects/:id/customer-standards/import body {name}
Frontend:
- New page /sdk/iace/[projectId]/customer-standards with:
• empty-state hint pointing to Auftrag → Kundenname
• per-suggestion checkbox + per-row Übernehmen button
• bulk "N übernehmen" button
• toggle "Auch verifizierte einbeziehen" widening the pool
• per-suggestion source_project_count + status badges
- Sidebar item "Kundenstandards" (building icon) placed between
Verifikation and Nachweise.
- Order-page now mirrors Auftraggeber.Firmenname into the top-level
customer_name column on save, so the Reuse feature is fed
automatically without a separate input field.
The same expert effect from migration 029's is_customer_standard flag —
"I already know it's covered, no evidence needed" — now becomes a
cross-project asset rather than a per-project annotation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[migration-approved]
The init-handler was non-idempotent. A second click on "Neu initialisieren
in Grenzen" inserted every engine-suggested mitigation a second time —
e.g. the Bremsscheibe project ended up with 5 (hazard_id, name) duplicate
pairs (HMI-Usability-Pruefung, Eindeutiges visuelles Feedback,
Betriebsarten-Anzeige, Sicher begrenzter Bewegungsbereich, …). 45 such
duplicates accumulated across all projects.
Migration 030_iace_mitigation_unique.sql:
1. Picks one winning row per (hazard_id, name) using a stable rank:
is_relevant DESC (expert decision wins over engine default)
status DESC (verified > implemented > planned)
created_at DESC (newest beats older on otherwise-equal rows)
and deletes the losers (Bremsscheibe: 5 rows; total: 45).
2. Adds UNIQUE constraint iace_mitigations_hazard_name_uniq
(hazard_id, name).
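
The winner selection can be sketched in Python, standing in for the SQL in the migration. The ranking order follows the list above; the explicit status rank table, the dict row shape and the sortable created_at string are assumptions.

```python
from typing import Dict, List, Tuple

# Explicit rank so that verified > implemented > planned, independent of text ordering.
STATUS_RANK = {"planned": 0, "implemented": 1, "verified": 2}


def pick_winners(rows: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Split rows into one winner per (hazard_id, name) and the losers to delete."""
    ranked = sorted(
        rows,
        key=lambda r: (r["is_relevant"], STATUS_RANK[r["status"]], r["created_at"]),
        reverse=True,  # best candidate first
    )
    winners: Dict[tuple, dict] = {}
    losers: List[dict] = []
    for row in ranked:
        key = (row["hazard_id"], row["name"])
        if key in winners:
            losers.append(row)
        else:
            winners[key] = row
    return list(winners.values()), losers
```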
Store-Layer (CreateMitigation):
INSERT … ON CONFLICT (hazard_id, name) DO NOTHING RETURNING id.
pgx.ErrNoRows from RETURNING → look up the existing row and return that.
Callers (engine init + manual add) always get a usable Mitigation; the
second click is silently swallowed instead of failing.
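
The insert-or-fetch behaviour can be tried out with SQLite standing in for Postgres/pgx. This sketch uses cursor.rowcount instead of RETURNING, and the integer id and two-column table are simplifications, not the real schema.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for iace_mitigations with the UNIQUE constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE iace_mitigations ("
        "id INTEGER PRIMARY KEY, hazard_id INTEGER NOT NULL, name TEXT NOT NULL, "
        "UNIQUE (hazard_id, name))"
    )
    return conn


def create_mitigation(conn: sqlite3.Connection, hazard_id: int, name: str) -> int:
    """Insert the row, or return the id of the existing one on conflict."""
    cur = conn.execute(
        "INSERT INTO iace_mitigations (hazard_id, name) VALUES (?, ?) "
        "ON CONFLICT (hazard_id, name) DO NOTHING",
        (hazard_id, name),
    )
    if cur.rowcount == 1:
        return cur.lastrowid
    # Conflict: nothing was inserted, so look up the row that already exists.
    row = conn.execute(
        "SELECT id FROM iace_mitigations WHERE hazard_id = ? AND name = ?",
        (hazard_id, name),
    ).fetchone()
    return row[0]
```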
Frontend dedupe in groupByTitle stays — it covers any pre-existing
duplicates that survived the migration in edge cases (multi-row write
in flight, etc.). With the UNIQUE constraint live, the in-memory
dedupe is a belt-and-suspenders safety net rather than the load-bearing
mechanism.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task #21. The verification page used to manage a separate VerificationItem
entity that the expert had to populate by hand — disjoint from the actual
mitigations list. With the is_relevant flag from migration 029, the
verification step has a natural definition: confirm completion for every
mitigation the expert flagged as relevant for this project.
Page is now a derived view on useMitigations(): filter is_relevant=true,
group by title (same dedupe as Massnahmen page), expose two actions per
hazard×mitigation row:
1. "Kundenstandard" — already implemented at the customer's site, no
evidence file required. Sets is_customer_standard=true and
status='verified'.
2. "Verifizieren…" — opens a modal asking for a textual evidence
reference (Prüfprotokoll-Nr, audit reference, etc.). Calls the
existing POST /mitigations/:mid/verify with verification_result.
File upload is deferred to phase 2 once an object-storage backend
is in place — the modal explains this.
When a row is verified, a "Zurücksetzen" link reverts status to
'implemented' for accidental confirmations.
Header counters: total relevant / open / verified / Kundenstandard.
Maßnahmen-page polish (same commit):
- "Lösch."-column header removed — the trash icon is self-explanatory
- groupByTitle now additionally deduplicates by hazard_id within a
group (engine occasionally emits duplicate (name, hazard_id) pairs
when Reinit is clicked twice; a follow-up migration 030 will add
a UNIQUE constraint to prevent these upstream)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[migration-approved]
Expert-driven workflow refinement on the Massnahmen page. The engine seeds
~80 mitigations per project, but for a concrete customer site most need a
relevance decision before they're meaningful in verification:
status: 'planned' | 'implemented' | 'verified' (existing — verification track)
is_relevant bool (new) (does this apply to *this* site?)
is_customer_standard bool (new) (already in place at customer — no evidence)
Decision flow on the Mitigations tab:
Engine-seeded → is_relevant=false (Default, waiting for expert)
Expert checks "Relevant" → is_relevant=true → surfaces in verification
Expert clicks trash → DELETE (banner warns: do not click Reinit
afterwards or seeds come back)
In verification, customer_standard=true bypasses evidence upload
is_customer_standard implies is_relevant (DB CHECK constraint).
Migration 029_iace_mitigation_relevance.sql:
ALTER TABLE iace_mitigations ADD COLUMN is_relevant ..., is_customer_standard ...
+ CHECK constraint + partial index on is_relevant for the verification
page's filter.
Backend (Go):
- Mitigation struct gains two bool fields
- CreateMitigation: defaults to false/false (engine-seeded mitigations
start unrated)
- UpdateMitigation: new case clauses for both keys; setting
is_customer_standard=true auto-flips is_relevant=true to satisfy
the CHECK constraint
- All three SELECT statements (ListMitigations, ListMitigationsByProject,
getMitigation) extended with the two new columns
Frontend:
- Maßnahmen-page columns: [Relev. ☑] [Lösch. 🗑] Title | #Hazards | P·I·V
- Group-header checkbox shows tri-state (indeterminate when partial),
flips all instances in the group at once
- Banner above the table: "Markiere jede Maßnahme als Relevant oder
lösche sie. Nach Löschen kein Neu initialisieren mehr drücken."
- Relevant rows tinted emerald, customer-standard label visible
- Legacy bulk-select state + helpers removed (the Relevant checkbox
now IS the primary mass action)
- useMitigations gains handleSetRelevant, handleSetCustomerStandard,
handleDeleteSilent (for non-confirm bulk deletes)
Future use: is_customer_standard mitigations from a prior project at the
same customer can later be auto-suggested when commissioning the next
plant — turning expert knowledge into reusable customer-profile data.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After a compliance-check run finishes, the user can now apply the
extracted vendor inventory directly to their own:
- CookieBanner config (admin /sdk/einwilligungen)
- Cookie-Policy / VVT-Register / Privacy-Policy templates
(admin /sdk/document-generator)
Backend:
- migration_to_banner.py: vendor list -> CookieBannerConfig with
ESSENTIAL/PERFORMANCE/PERSONALIZATION/EXTERNAL_MEDIA buckets +
review flags (broken opt-out URLs, missing expiry, no cookies listed)
- migration_to_document.py: vendor list -> pre-fills for 3 doc
templates, recipient-type aware (INTERNAL/GROUP/PROCESSOR/CONTROLLER)
- agent_migration_routes.py: GET /banner-preview, /document-preview,
/summary keyed on check_id
- compliance_audit_log: new check_payloads table persists cmp_vendors +
extracted_profile so the preview survives an app restart
- tests: 9 mapper units + 4 endpoint integration tests
Frontend:
- MigrationPanel.tsx: modal showing banner-config diff + document
pre-fills, plus links into the existing editors
- ComplianceCheckTab.tsx: replaces standalone audit link with the
panel; net -3 lines, stays at the 500-cap
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "Maßnahmen" page in the Bremsscheibe project showed a flat list with
heavy redundancy — e.g. "Sicherheitszeichen nach ISO 7010" appeared on 21
separate rows, one per linked hazard. Same for "Gefahrenpiktogramme",
"Flucht- und Rettungswege" etc. The signal got lost in the noise.
This is a presentation-only regrouping. Each Hazard×Mitigation pair stays
a separate DB row with its own status, notes and edit history (option B
from the discussion: instances remain independently editable). The page
now collapses rows that share the same `m.title` into one group row.
Group row shows:
- title + ISO 12100 sub-category (if encoded in description)
- count of linked hazards on the right
- compact status distribution "P · I · V" (Planned/Implemented/Verified)
- shared checkbox that selects all instances in the group
Click expands the group and reveals the individual hazard×measure rows,
each with its own StatusBadge and detail-expand for MitigationHints.
State additions:
- expandedGroup: Set<string> with keys `${type}:${title}` so the same
title across different reduction stages stays independently togglable
- groupByTitle() helper trims the title, falls back to "(ohne Titel)"
- statusCounts() helper for the P·I·V breakdown
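
The helpers can be sketched in Python (the page itself is TypeScript). The trimming, the "(ohne Titel)" fallback and the type:title key come from this message; the dict row shape and the example type value are assumptions.

```python
from collections import Counter
from typing import Dict, List


def group_by_title(mitigations: List[dict]) -> Dict[str, List[dict]]:
    """Group rows of one reduction stage by their trimmed title."""
    groups: Dict[str, List[dict]] = {}
    for row in mitigations:
        title = (row.get("title") or "").strip() or "(ohne Titel)"
        groups.setdefault(title, []).append(row)
    return groups


def status_counts(instances: List[dict]) -> Dict[str, int]:
    """Planned / implemented / verified counts for the P-I-V breakdown."""
    counts = Counter(row["status"] for row in instances)
    return {status: counts[status] for status in ("planned", "implemented", "verified")}


def expand_key(stage_type: str, title: str) -> str:
    """Key for the expandedGroup set, so equal titles in different stages stay separate."""
    return f"{stage_type}:{title}"
```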
Pagination semantics swapped from 50 instances/page to 50 groups/page —
makes the list far easier to scan at the ~80-instance scale this project
exhibits.
LOC: 267 → 346 (well under the 500 hard cap).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Now that all 1874 MCs run per check (Task #30 cap removal), the report
was about to drown in noise. This commit adds the full aggregation /
persistence / drill-down stack so each MC is actionable, not just
counted.
A1 mc_scorecard.py (new):
build_scorecard(checks) -> per-regulation PASS/FAIL/SKIP + severity
top_fails(checks, n) -> N most severe failed MCs
full_audit_records(...) -> flat rows ready for sidecar SQLite
A2 Email rendering:
agent_doc_check_scorecard.py (new) builds an HTML scorecard table
(regulation × passed/failed/HIGH/MEDIUM/score) shown at the top of
the email. agent_doc_check_report._render_document now collapses
the 500-MC L2 forest into 'X/Y bestanden (Z Fail)' summary plus
a top-10 fails block per doc — old verbose render is gone.
A3 compliance_audit_log.py (new) — sidecar SQLite at
/data/compliance_audits.db (separate from compliance Postgres
schema to comply with the no-new-migrations rule in CLAUDE.md):
check_runs(check_id, ts, tenant_id, site_name, base_domain,
doc_count, scorecard json, vvt_summary json)
mc_results(check_id, doc_type, mc_id, label, passed, skipped,
severity, regulation, matched_text, hint)
Route persists every run after the email is sent.
docker-compose.yml adds compliance-audit volume + env.
A4 backfill_mc_regulation_llm.py (new) — Qwen-tagged backfill for
the 1636 MCs the regex pass couldn't classify. Batches of 25,
format=json, output constrained to the canonical regulation list.
Run manually: docker exec bp-compliance-backend python3 \
/app/scripts/backfill_mc_regulation_llm.py [--dry-run]
A5 Admin audit tab — GET /api/compliance/agent/audit/<check_id>
proxied via /api/sdk/v1/agent/audit/<id>. New page
/sdk/agent/audit/[checkId] renders scorecard + filterable MC table
(status / doc_type / regulation, expandable rows with matched_text
+ hint). ComplianceCheckTab now shows 'Voll-Audit oeffnen' link.
A6 Trend per tenant — GET /api/compliance/agent/audit/tenant/<id>
returns recent runs. Email scorecard shows per-regulation delta
badges ('(+12%)', '(-3%)') compared with the previous run for the
same tenant + base_domain. Lookup is one SQLite query.
Plumbing:
rag_document_checker.py — SELECT now includes 'article'; MC results
carry 'regulation' + 'article' through to CheckItem.
agent_doc_check_routes.CheckItem schema gains regulation + article
fields (defaults '') so old clients still parse.
agent_compliance_check_routes — response gains 'check_id' so the
frontend can build the audit link.
User feedback after BMW test:
- 60 'BMW AG — XYZ' rows were rendered as ✗ for Opt-Out/Privacy and
scored 38-52%. That's misleading: BMW processing for itself doesn't
need a separate opt-out URL (cookie-banner is the consent
mechanism) or a separate privacy policy (main DSI covers it).
- Title 'Anbieter' was wrong for 60 of 90 rows (internal services).
Three orthogonal fixes:
1. score_vendors becomes recipient_type aware:
- INTERNAL/GROUP_COMPANY: opt_out_url, privacy_policy_url, country
are NOT required (the user's main DSI + cookie-banner cover them).
What IS required: name, purpose, cookies disclosed with name +
expiry. Cookies-disclosure weight raised to 50 (was 15) so the
VVT-relevant data is the score driver.
- 'necessary' category: opt-out still skipped (§25 Abs. 2 TDDDG).
- External (PROCESSOR/CONTROLLER): existing strict scoring stays.
2. _link_status_badge accepts na_label and renders a neutral em-dash
with explanation tooltip instead of red ✗ when the column doesn't
apply to that row. _render_vendor_row_full passes na_label based on
recipient_type:
- INTERNAL/GROUP -> 'Nicht erforderlich (eigene Verarbeitung)'
- necessary -> 'Nicht erforderlich (§25 Abs. 2 TDDDG)'
3. Header + summary clarify the split:
- h3 changed to 'Verarbeitungstaetigkeiten und Empfaenger aus der
Cookie-Richtlinie' (was 'Drittanbieter aus Cookie-Richtlinie').
- Top line: '90 Verarbeitungen erfasst — 60 eigene + 30 externe
Empfaenger'.
- Disclaimer below: explains the INTERNAL/GROUP exemption so the
reader understands why those rows don't show ✗ for missing URLs.
- Section labels enriched with the relevant DSGVO article:
'Eigene Verarbeitungstaetigkeiten — fuer das VVT (Art. 30)',
'Auftragsverarbeiter — AVV erforderlich (Art. 28)',
'Joint Controller — Vereinbarung pruefen (Art. 26)'.
Expected BMW result after fix: ~85% of the 60 BMW-AG rows jump from
~52% to 90-100% (the real issue, fehlende Cookies-Disclosure, stays
flagged). The only true findings remaining are external links that
return 4xx (e.g. Criteo 403, Teads 404).
User: 'we created 1800 MCs only to use 10% of them, that is nonsense'.
Fixed all 6 gaps from the audit.
#1 max_controls=0 (was 20):
- agent_compliance_check_routes _check_single: passes max_controls=0 to
check_document_with_controls -> ALL MCs evaluated per doc_type.
- 8 doc_types now use 1874 MCs instead of 160 (10x coverage).
- Regex matching is cheap (<1s per doc); LLM-enrich cap of 10 stays.
#2 LLM-verify fixed:
- llm_verify.py was getting 0/N parsed. Causes: qwen3 thinking-mode
wrapped output in <think>...</think>, /api/generate doesn't enforce
JSON, prompt didn't handle code-fence wrappers.
- Now uses /api/chat with format='json' (forces valid JSON).
- _parse_batch_response strips <think> tags, accepts {results:[...]}
AND bare [...], adds richer regex-fallback parse, logs raw head on
total parse failure for diagnosis.
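
A minimal sketch of the tolerant parsing described in #2, covering the think-tag stripping, code-fence unwrapping and the two accepted JSON shapes. The regex fallback and the raw-head logging are left out, and the helper names are assumptions.

```python
import json
import re
from typing import Any, List

THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable


def strip_wrappers(raw: str) -> str:
    """Remove <think> blocks and an optional surrounding code fence."""
    text = THINK_RE.sub("", raw).strip()
    if text.startswith(FENCE):
        text = text[len(FENCE):]
        if text.lower().startswith("json"):
            text = text[len("json"):]
        text = text.rstrip()
        if text.endswith(FENCE):
            text = text[: -len(FENCE)]
    return text.strip()


def parse_batch_response(raw: str) -> List[Any]:
    """Accept {"results": [...]} as well as a bare [...]; return [] on failure."""
    try:
        data = json.loads(strip_wrappers(raw))
    except json.JSONDecodeError:
        return []
    if isinstance(data, dict):
        data = data.get("results", [])
    return data if isinstance(data, list) else []
```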
#3 Loeschkonzept checklist (new):
- doc_checks/loeschkonzept_checks.py — 9 L1 + 7 L2 checks per DIN 66398
+ Art. 5(1)(e)/17/32 DSGVO: scope+responsibility, data categories,
retention periods, legal basis refs (HGB/AO/BGB), deletion trigger,
deletion process+technical+systems, deletion proof, exceptions +
Art. 18 lock, review cycle, DSGVO references.
- runner.py registered for loeschkonzept/loeschung/loeschfristen.
#4 regulation backfill script:
- backend-compliance/scripts/backfill_mc_regulation.py — regex-detects
DSGVO/TDDDG/TMG/BGB/HGB/AO/MStV/UWG/VSBG/PAngV/GwG/BDSG/EU-VO
references in MC title+question+pass_criteria, UPDATEs regulation +
article fields.
- Idempotent (only NULL rows), --dry-run flag, batched 200/UPDATE.
- Run inside container: docker exec bp-compliance-backend python3 \
/app/scripts/backfill_mc_regulation.py
#5 MC alias-fallback:
- rag_document_checker._MC_ALIAS_FALLBACK maps doc_types without own
MCs to a related set: nutzungsbedingungen->agb, social_media->dse,
sub_processor/scc/tom_annex->avv, loeschfristen->loeschkonzept,
eu_institution/dsb->dse.
- _load_controls retries with the alias when the primary query
returns 0 rows.
- 14 additional doc_types now get MC coverage transparently.
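
The retry can be sketched as follows, using only the alias pairs named above. The query callable is a stand-in for the real database lookup and is an assumption.

```python
from typing import Callable, Dict, List

MC_ALIAS_FALLBACK: Dict[str, str] = {
    "nutzungsbedingungen": "agb",
    "social_media": "dse",
    "sub_processor": "avv",
    "scc": "avv",
    "tom_annex": "avv",
    "loeschfristen": "loeschkonzept",
    "eu_institution": "dse",
    "dsb": "dse",
}


def load_controls(doc_type: str, query: Callable[[str], List[str]]) -> List[str]:
    """Query MCs for doc_type; retry once with the alias when nothing is found."""
    rows = query(doc_type)
    if not rows and doc_type in MC_ALIAS_FALLBACK:
        rows = query(MC_ALIAS_FALLBACK[doc_type])
    return rows
```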
#6 cross-domain auto-discovery:
- _autodiscover_missing builds a crawl plan: primary submitted base
+ up to 2 related domains sharing the owner SLD (e.g. BMW Group:
bmw.de + bmwgroup.com + bmwgroup.jobs).
- Detection: regex over submitted texts for https?://...<owner>...
hostnames distinct from the primary base.
- Each crawled base contributes documents + cmp_payloads to the
discovery pool.
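
The detection step might look like the sketch below. Taking the last two hostname labels as the base domain is a simplification (it breaks on suffixes such as co.uk), and the function name and parameters are assumptions.

```python
import re
from typing import Iterable, List

HOST_RE = re.compile(r"https?://([a-z0-9.-]+)", re.IGNORECASE)


def related_bases(texts: Iterable[str], primary_base: str, owner: str, limit: int = 2) -> List[str]:
    """Find up to `limit` base domains that contain the owner token."""
    owner = owner.lower()
    primary = primary_base.lower()
    found: List[str] = []
    for text in texts:
        for host in HOST_RE.findall(text):
            host = host.lower().rstrip(".")
            base = ".".join(host.split(".")[-2:])  # naive base-domain guess
            if owner in base and base != primary and base not in found:
                found.append(base)
    return found[:limit]
```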
Net effect for BMW: 1874 MCs evaluated (90 from cookie alone, was
20), Loeschkonzept mandatory items can now be graded, the LLM overturns
false regex FAILs, and Joint-Controller policies on bmwgroup.jobs (Social
Media) are now discoverable. The same wins will apply to the
CRA-Compliance check.
Per user request: BMW (and others) put their own services AND external
vendors in the same cookie-policy widget. The VVT-Tabelle now groups
them by Art. 30(1)(d) DSGVO recipient category so the DSB can act on
the right buckets:
- INTERNAL — owner processing for itself ('BMW AG — XYZ')
- GROUP_COMPANY — same brand family, different legal entity ('BMW Bank')
- PROCESSOR — Auftragsverarbeiter, AVV-pflichtig (Adobe, Akamai)
- CONTROLLER — independent / joint controller (Meta Pixel, Google
Ads, LinkedIn — they run their own profiles)
- AUTHORITY — government bodies (rare in cookies)
- OTHER — fallback
New module vendor_classifier.py:
- owner_from_url(url) — derive site-owner token (bmw.de -> 'BMW',
mercedes-benz.de -> 'Mercedes-Benz')
- classify(name, category, owner) — strict 5-tier heuristic:
* INTERNAL: vendor name first-token is '<Owner>' / '<Owner> AG' /
'<Owner> SE' / '<Owner> GmbH' / '<Owner> AG & Co. KG'
* GROUP_COMPANY: starts with '<Owner> ' but isn't '<Owner> AG'
* CONTROLLER: matches a known joint-controller list (Meta, Google
Ads, YouTube, LinkedIn Insight, TikTok, Pinterest, Taboola,
Outbrain, Criteo, Twitter, Reddit, ...)
* PROCESSOR: legal-form suffix in name (GmbH, AG, Inc., A/S,
B.V., S.A., Ltd., LLC, ...)
* OTHER: anything else
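
The tier order can be sketched as below. The tiers and example names come from this message; the shortened lists, the matching details and the handling of the 'Provider — Processing' separator are assumptions, and the category argument is accepted but unused here.

```python
import re

SEPARATOR = " \u2014 "  # separator used in 'Provider Name <dash> Processing Name'
OWNER_FORMS = ("", " AG", " SE", " GmbH", " AG & Co. KG")
KNOWN_CONTROLLERS = (
    "meta", "google ads", "youtube", "linkedin insight", "tiktok",
    "pinterest", "taboola", "outbrain", "criteo", "twitter", "reddit",
)
LEGAL_FORMS = ("GmbH", "AG", "Inc.", "A/S", "B.V.", "S.A.", "Ltd.", "LLC")


def classify(name: str, category: str, owner: str) -> str:
    """Return the recipient type for a vendor name, checked in tier order."""
    entity = name.split(SEPARATOR)[0].strip()
    low, own = entity.lower(), owner.lower()
    if low in {own + form.lower() for form in OWNER_FORMS}:
        return "INTERNAL"
    if low.startswith(own + " "):
        return "GROUP_COMPANY"
    if any(known in low for known in KNOWN_CONTROLLERS):
        return "CONTROLLER"
    for form in LEGAL_FORMS:
        if re.search(r"(?:^|\s)" + re.escape(form) + r"(?=$|[\s,])", entity):
            return "PROCESSOR"
    return "OTHER"
```

The order of the checks matters: without the INTERNAL check first, 'BMW AG' would fall through to the legal-form rule and be tagged PROCESSOR.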
vendor_extractor.extract_vendors_from_payloads now takes owner_name:
- Passes it through to classify() for every extracted vendor record
- The route derives owner_name via _company_name_from_url(doc_entries)
- LLM-extracted vendors are classified the same way (so V3 fallback
also produces tagged records)
agent_doc_check_extras.build_vvt_table_html rewritten:
- Buckets vendors by recipient_type
- Renders one section per non-empty bucket, in canonical order
(RECIPIENT_TYPE_SECTIONS), each with section header + count + bad
count + nested table
- Within each section: sorted by compliance_score ascending
- Response JSON cmp_vendors includes recipient_type so the frontend
can later import per-category into the VVT module
Expected BMW result: ~60 INTERNAL rows (BMW AG own services),
~25 PROCESSOR rows (Adobe, Adform, Akamai, AWS, ...), ~5 CONTROLLER
rows (Meta Pixel, Google, LinkedIn, Pinterest, Outbrain, Taboola).
The first BMW VVT table rendered all 24 providers at 20% score because
the ePaaS extractor was reading the wrong field names. Actual schema is
nested: providers[].processings[].persistences[], NOT providers[] alone.
Correct ePaaS schema (verified against bmw.com/epaas/.../de_DE.epaas.json):
Provider: {id, name, description, processings[]}
Processing: {id, name, description, categoryId, optOutLink,
privacyPolicyLink, persistences[]}
Persistence: {id, name, domain, type, expiry, description}
Two structural changes:
1. One row per processing (not provider). BMW has 26 providers but ~91
processings spread across them (Adobe alone has ACMProcessing,
AdobeAnalytics, AdobeCampaign, AdobeTargetAnalytics, AdobeTargetPers.).
The cookie widget displays each processing separately — VVT now
mirrors that. Display name format: 'Provider Name — Processing Name'.
2. Read optOutLink/privacyPolicyLink from PROCESSING (where they live),
not provider. Persistences flatten to cookies[] with name + expiry +
description.
Plus category mapping:
advertising -> marketing
strictlyNecessary -> necessary
statistics -> statistics
functional -> functional
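
Flattening the nested schema into one row per processing can be sketched as follows. The input field names and the category mapping are taken from this message; the output keys and the "other" fallback category are assumptions.

```python
from typing import Any, Dict, List

CATEGORY_MAP = {
    "advertising": "marketing",
    "strictlyNecessary": "necessary",
    "statistics": "statistics",
    "functional": "functional",
}
SEPARATOR = " \u2014 "  # 'Provider Name <dash> Processing Name'


def flatten_epaas(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Emit one vendor row per processing, with links read from the processing."""
    rows: List[Dict[str, Any]] = []
    for provider in payload.get("providers", []):
        for proc in provider.get("processings", []):
            rows.append({
                "name": provider.get("name", "") + SEPARATOR + proc.get("name", ""),
                "category": CATEGORY_MAP.get(proc.get("categoryId"), "other"),
                "opt_out_url": proc.get("optOutLink"),
                "privacy_policy_url": proc.get("privacyPolicyLink"),
                "cookies": [
                    {"name": p.get("name"), "expiry": p.get("expiry"),
                     "description": p.get("description")}
                    for p in proc.get("persistences", [])
                ],
            })
    return rows
```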
Category-aware scoring (cookie_link_validator.score_vendors):
- 'necessary' (technisch erforderliche, §25 Abs. 2 TDDDG): no opt-out
required, no country required. Score weight shifts to purpose +
cookie disclosure (essential cookies must list names + expiry).
- All other categories: opt-out URL still mandatory; missing opt-out
flags 'no_opt_out_url' and zeros that block of points.
Expected BMW result after this fix:
- ~91 rows (Adobe Analytics, Adform Retargeting, Akamai Infrastructure,
AWS, ..., plus ~60 strictlyNecessary processings)
- Marketing rows with present opt-out → ~75-90%
- Necessary rows with cookie+expiry → ~85-95%
- Rows missing fields → still flagged
Two bugs observed in the BMW test run:
1. Generic JSON heuristic captured /de-de/login/bmw/api/flyout/data (4KB,
user login fly-out data) and reconstruct_generic produced 56 words of
noise. The CMP-prefer logic then 'replaced' the 185-word imprint DOM
extraction with those 56 words because self_wc(185) < 300 — even
though cmp_wc(56) < self_wc(185).
2. The strict prefilter list was too short. Login/auth/cart endpoints
often have category-shaped JSON without being cookie policies.
Fixes:
- dsi_discovery: replace DOM with CMP only when cmp_wc > self_wc AND
meets one of the existing conditions. Tiny captures can no longer
silently destroy a bigger DOM extraction.
- cmp_extractor: skip non-cookie URLs (/login, /auth, /user, /session,
/cart, /checkout, /search, /flyout, /menu, /nav, /translation, /i18n,
/locale, /feature-flag).
- cmp_extractor: require ≥5KB payload size — real CMP policies are
always larger (BMW ePaaS is ~393KB). Tiny matches drop out before
reconstruction.
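The three guards above could look roughly like this. The helper names are hypothetical, substring matching on the URL path is an assumption (the real prefilter may match differently), and 5KB is taken to mean 5 * 1024 bytes.

```python
from urllib.parse import urlparse

SKIP_PATH_PARTS = (
    "/login", "/auth", "/user", "/session", "/cart", "/checkout", "/search",
    "/flyout", "/menu", "/nav", "/translation", "/i18n", "/locale",
    "/feature-flag",
)
MIN_PAYLOAD_BYTES = 5 * 1024  # assumed reading of the 5KB floor


def is_cmp_candidate(url, payload_size):
    """Reject non-cookie endpoints and tiny payloads before reconstruction."""
    path = urlparse(url).path.lower()
    if any(part in path for part in SKIP_PATH_PARTS):
        return False
    return payload_size >= MIN_PAYLOAD_BYTES


def should_replace_dom(cmp_wc, self_wc, existing_condition_met):
    """The CMP text wins only if it is strictly longer than the DOM text."""
    return cmp_wc > self_wc and existing_condition_met
```

With the numbers from the bug report, should_replace_dom(56, 185, True) is False, so the 185-word imprint extraction survives.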
When the cookie text has no captured CMP payload (long-tail sites that
don't use ePaaS/OneTrust/Cookiebot/etc.) we now fall back to a Qwen → OVH
LLM cascade to extract a structured vendor list from the policy text.
New module backend/compliance/services/vendor_llm_extractor.py:
- extract_vendors_via_llm(cookie_text): runs Qwen first (local Ollama),
then OVH if Qwen returns nothing usable.
- System prompt instructs the model to return STRICT JSON only:
{vendors: [{name, country, purpose, category, opt_out_url,
privacy_policy_url, persistence, cookies: [...]}]}
- Lenient JSON parser tolerates code-fences, prose wrappers, dict vs list.
- _normalize() caps array sizes (80 vendors, 30 cookies each), validates
URLs (must be http(s)), trims fields to reasonable lengths.
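A minimal sketch of what a lenient parser plus normalizer might do, under stated assumptions: the function names parse_lenient and normalize are illustrative, field-length trimming is omitted, and dropping vendors without a name is an assumption not stated in the commit.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker


def parse_lenient(raw):
    """Best-effort extraction of a vendor list from an LLM reply."""
    text = raw.strip()
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:  # model wrapped its answer in a code fence
        text = fenced.group(1).strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Prose wrapper: try the outermost {...} span.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            return []
        try:
            data = json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            return []
    if isinstance(data, dict):  # {vendors: [...]} vs bare list
        data = data.get("vendors", [])
    return data if isinstance(data, list) else []


def normalize(vendors, max_vendors=80, max_cookies=30):
    """Cap array sizes and null out URLs that are not http(s)."""
    out = []
    for vendor in vendors[:max_vendors]:
        if not isinstance(vendor, dict) or not vendor.get("name"):
            continue  # assumption: nameless entries are unusable
        item = dict(vendor)
        for key in ("opt_out_url", "privacy_policy_url"):
            url = item.get(key)
            if not (isinstance(url, str)
                    and url.startswith(("http://", "https://"))):
                item[key] = None
        cookies = item.get("cookies")
        item["cookies"] = cookies[:max_cookies] if isinstance(cookies, list) else []
        out.append(item)
    return out
```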
Route integration (agent_compliance_check_routes.py):
- After named-CMP extract: if cmp_vendors is empty AND the cookie text
has ≥500 words (otherwise it's likely navigation chrome), invoke the
LLM extractor. Progress message 'Vendor-Liste per LLM extrahieren...'.
- Vendors then run through the same validate_vendor_urls + score_vendors
pipeline → VVT table rendered identically regardless of source.
docker-compose.yml: backend-compliance gains OLLAMA_URL, CMP_LLM_MODEL,
OVH_LLM_URL/KEY/MODEL env vars (same names as consent-tester so the
configuration is unified).
This closes the 'every site eventually gets a VVT table' goal:
- Known CMP → V1/V2 structured extraction (fast, exact)
- Unknown CMP → V3 LLM extraction (slow, best-effort)
- No text at all → no vendors, but other compliance checks still run.
Backend vendor_extractor.py gets 4 new per-CMP dispatchers, mirroring the
JSON schemas observed in each platform:
- Cookiebot: 'Categories[*].Cookies[*]' with Vendor/Host, expiry, purpose
- Usercentrics: 'services[*]' with cookieMaxAgeSeconds, processingCompanyCountry
- Didomi: 'app.vendors[*]' with country + policyUrl
- TrustArc: 'vendors[*]' + per-category 'Cookies' with provider
All 6 named CMPs (ePaaS, OneTrust, Cookiebot, Usercentrics, Didomi,
TrustArc) plus the generic-shape fallback are now mapped — every site
hitting Phase B of the cascade gets a structured vendor list, scored
opt-out links, and a VVT-Tabelle in the email.
When a known CMP (ePaaS, OneTrust) renders the cookie policy, we now
extract structured vendor records, probe their opt-out + privacy URLs,
score each vendor (0-100), and append a 'VVT-Vorschlag' table to the
compliance email — one row per vendor, sortable by compliance score.
consent-tester:
- DSIDiscoveryResult.cmp_payloads: surfaces raw CMP JSON to callers
- DSIDiscoveryResponse: new cmp_payloads field
- discover_dsi_documents sets cmp_payloads from cmp_capture
- cmp_library/{epaas,onetrust}.py: new extract_vendors(d) returning
list[VendorRecord]
backend:
- _fetch_text() now returns (text, cmp_payloads) tuple
- doc_entries store cmp_payloads per doc (mostly cookie)
- _autodiscover_missing forwards homepage payloads to the cookie entry
- New module vendor_extractor.py: dispatches ePaaS/OneTrust/generic
schemas; dedupes vendors across multiple payloads
- cookie_link_validator.py extended with validate_vendor_urls(vendors)
and score_vendors(vendors) — 0-100 score per vendor based on name,
purpose, country, opt-out reachable, privacy URL reachable, cookies
with names + expiry
- agent_doc_check_extras.build_vvt_table_html: renders the table
- Route appends VVT HTML after the provider list, before the
document-by-document report
- Response JSON gains cmp_vendors for future frontend rendering
Example for BMW: ~30 ePaaS providers → table with Name | Kategorie |
Sitz | Cookies | Opt-Out (✓/✗) | Privacy (✓/✗) | Score. Sorted by
score ascending so the worst-compliant vendors are at the top.
cookie_checks.py:
- cookie_names_listed: now also matches CMP placeholder notation
(BMW: 'Adfpc###', 'CT###') and 'Diese Datenverarbeitung verwendet die
folgenden Cookies oder ähnliche Technologien' as list-shape signal.
Cryptic vendor names like 'audience', 'adformfrpid' are accepted via
the surrounding markup, not by hard-coding each one.
- cookie_providers_named: new pattern 'Gesetzt von: <Firma>' (BMW/ePaaS
per-cookie vendor naming) + recognition of full legal-form names
(Adform A/S, BMW AG, Adobe Systems Software Ireland Limited).
- cookie_duration_values: now matches 'Ablauf: 1 Jahr' / 'Speicherdauer:
30 Tage' (BMW format) in addition to the legacy '<n> <unit>'.
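For the labelled duration format, a pattern along these lines would match the two BMW examples. It is an illustrative regex, not the one in cookie_checks.py; the unit list and the helper name find_durations are assumptions.

```python
import re

# 'Ablauf: 1 Jahr' / 'Speicherdauer: 30 Tage' (label, colon, number, unit)
LABELLED_DURATION = re.compile(
    r"\b(?:Ablauf|Speicherdauer)\s*:\s*(\d+)\s*"
    r"(Sekunde|Minute|Stunde|Tag|Woche|Monat|Jahr)(?:en|e|n)?\b",
    re.IGNORECASE,
)


def find_durations(text):
    """Return (amount, unit stem) pairs for every labelled duration."""
    return [(int(m.group(1)), m.group(2))
            for m in LABELLED_DURATION.finditer(text)]
```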
New L1 + L2 checks for controller in cookie-policy:
- cookie_controller (L1): the cookie policy must name Verantwortlich(er)
- cookie_controller_address (L2): PLZ + Ort or address keywords
- cookie_controller_contact_or_link (L2): email/phone OR link back to
Datenschutzerklärung (the practical equivalent — BMW does this)
New L2 checks (parented under opt_out):
- cookie_optout_links: detects per-provider opt-out URLs in the text
- cookie_privacy_policy_links: per-provider privacy-policy URLs
New service: cookie_link_validator.py
- extract_links(text): pulls all https?://… URLs that follow 'Opt-Out
Link:' / 'Link zur Privacy Policy:' (deduped)
- validate_links(links): probes every URL concurrently (HEAD first, GET
fallback for 405/403). 10 parallel, 8s per request, 60s batch cap.
Returns reachable=True/False + status + final_url.
- build_check_items(): renders 2 CheckItems (opt-out + privacy-policy),
each pass if ALL links 2xx/3xx, fail with up-to-5 broken-link examples.
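The concurrency limits and the pass/fail rule can be sketched as below. To keep the example self-contained the HTTP probe is injected as a coroutine returning a status code; the real validator does HEAD with GET fallback itself. The names validate_links_sketch, probe and summarize are illustrative.

```python
import asyncio


async def validate_links_sketch(links, probe, parallel=10,
                                per_request=8.0, batch_cap=60.0):
    """Probe links concurrently; probe(url) returns an HTTP status code."""
    sem = asyncio.Semaphore(parallel)

    async def check(url):
        async with sem:
            try:
                status = await asyncio.wait_for(probe(url), timeout=per_request)
            except Exception:  # timeout or transport error: unreachable
                status = None
        reachable = status is not None and 200 <= status < 400
        return {"url": url, "status": status, "reachable": reachable}

    tasks = [asyncio.ensure_future(check(u)) for u in links]
    if not tasks:
        return []
    done, pending = await asyncio.wait(tasks, timeout=batch_cap)
    for task in pending:  # batch cap hit: give up on stragglers
        task.cancel()
    return [
        t.result() if t in done
        else {"url": u, "status": None, "reachable": False}
        for t, u in zip(tasks, links)
    ]


def summarize(results, max_examples=5):
    """Pass only if every link is 2xx/3xx; report up to five broken URLs."""
    broken = [r["url"] for r in results if not r["reachable"]]
    return {"passed": not broken, "broken_examples": broken[:max_examples]}
```

An empty link list passes vacuously in this sketch; whether the real check item does the same is not stated.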
Hook in _check_single: doc_type=='cookie' triggers the validator after
regex+MC checks. Recomputes correctness with the new L2 items.
This addresses two concrete BMW observations:
1. BMW's per-cookie structure (Name + Zweck + Ablauf, Gesetzt von: …,
Opt-Out Link: …) now recognised → 'Konkrete Cookie-Namen aufgelistet'
and 'Konkrete Speicherdauern' should pass.
2. Defective opt-out URLs surface as compliance findings rather than
silently passing — Art. 7(3) DSGVO requires a working withdrawal
path per provider.
[guardrail-change]
Phase 18 adds an EU Cyber Resilience Act compliance track to IACE:
the engine now fires patterns that surface the manufacturer-side CRA
obligations whenever a project's components carry digital elements.
Patterns (HP1910-HP1918, hazard_patterns_cra.go):
HP1910 Missing SBOM
HP1911 Unsigned firmware/software updates
HP1912 Factory-default credentials still active
HP1913 No coordinated vulnerability disclosure (CVD) policy
HP1914 No documented security patch SLA
HP1915 Missing user-facing hardening guide
HP1916 No incident-notification process to ENISA / CSIRT
HP1917 No security assessment prior to placing on market
HP1918 AI component without cybersecurity risk assessment
Each pattern carries ClarificationQuestionsDE so the operator gets
auditor-grade questions to take back to the Anlagenbauer instead of
the engine inventing prose. PatternMatch carries DefaultAvoidability
(P=1 for all CRA patterns), feeding the PLr graph from Phase 17.
Measures (M540-M548, measures_library_cra.go):
M540 SBOM (SPDX or CycloneDX) with each machine release
M541 Signed updates with rollback protection
M542 Forced default-password change at first boot
M543 Published CVD policy (security.txt / PSIRT)
M544 Documented patch SLA with CVSS-tier response times
M545 User-facing hardening guide in the machine docs
M546 ENISA incident-notification process (24h/72h/14d)
M547 Authenticated update channel + integrity check
M548 Pre-market security assessment / pen-test
The library is urheberrechtlich neutral: identifiers only
(Verordnung (EU) 2024/2847, DIN EN 40000-1-2 Entwurf, IEC 62443,
ETSI EN 303 645, ISO/IEC 5962, ISO/IEC 29147). No normative text
is reproduced — DIN/Beuth proprietary content is referenced by
section number only.
Category-compatibility:
cyber_resilience pattern category accepts measures with
HazardCategory cyber_resilience, cyber_network, or
software_control. Updated in both the runtime helper
(iace_handler_init_helpers.go) and its test-mirror
(pattern_coverage_test.go) — both must move in lockstep.
Frontend (clarifications page):
When at least one clarification references "2024/2847" or
"40000-1-2" in its norm_references, a blue info-banner is
rendered at the top of the page:
"Cyber Resilience Act (CRA) — Hinweis zur Geltung
Diese Klärungsliste enthält Fragen zur Verordnung (EU)
2024/2847 (CRA). Die CRA gilt für Produkte mit digitalen
Elementen, die ab dem 11.12.2027 auf dem EU-Markt bereit-
gestellt werden. ..."
Reminds the user that the CRA obligations are forward-looking
while still allowing the manufacturer to build them in now.
LOC exceptions:
Added three pre-existing files to .claude/rules/loc-exceptions.txt
(manufacturer_safety_features.go, iace_handler_clarifications.go,
routes.go). All three grew across Phases 16-17 and are tagged as
Phase 5+ refactor backlog. [guardrail-change] marker required.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 17 of the risk-assessment polish. Three pieces:
A) PLr per EN ISO 13849-1 Anhang A (Risikograph)
- HazardPattern.DefaultAvoidability (1 = P1, 2 = P2). Optional;
defaults to P1 if unset (conservative — operator can raise after
review).
- ComputePLr(s,f,p) implements the canonical 8-leaf binary tree
(S1F1P1 -> a, ..., S2F2P2 -> e). Pinned by 8 table-driven tests.
- SeverityToS / ExposureToF map the existing 1-5 fields to the
binary S/F at the documented threshold (3).
- At project initialise, every hazard's Description is appended
with "Risikograph EN ISO 13849-1 (Anhang A): S2 · F1 · P1 -> PLr c"
so the audit value is visible without leaving the hazard view.
- PatternMatch carries DefaultAvoidability so the init handler can
pick it up without a second pattern lookup.
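The risk graph is small enough to write as a lookup table. The engine itself is Go; this Python sketch only illustrates the mapping. The commit pins the end points (S1F1P1 -> a, S2F2P2 -> e) and the example S2 F1 P1 -> c; the remaining leaves follow the usual reading of the Annex A graph and should be checked against the pinned tests. Treating a 1-5 value of 3 or more as the higher class is an assumption about which side of the threshold is inclusive.

```python
# (S, F, P) -> required performance level
PLR_TABLE = {
    (1, 1, 1): "a", (1, 1, 2): "b",
    (1, 2, 1): "b", (1, 2, 2): "c",
    (2, 1, 1): "c", (2, 1, 2): "d",
    (2, 2, 1): "d", (2, 2, 2): "e",
}


def to_binary(value, threshold=3):
    """Map a 1-5 severity/exposure field to the binary S or F class."""
    return 2 if value >= threshold else 1  # assumption: threshold inclusive


def compute_plr(s, f, p=1):
    """Look up PLr; p defaults to P1 as described for unset avoidability."""
    return PLR_TABLE[(s, f, p)]


def describe(severity, exposure, avoidability=1):
    s, f = to_binary(severity), to_binary(exposure)
    plr = compute_plr(s, f, avoidability)
    return f"S{s} · F{f} · P{avoidability} -> PLr {plr}"
```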
B) Methodology header (Methoden-Kopf) on the report
- GET /clarifications.html now opens with a standardised methodology
block: ISO 12100 Anhang B (hazard ID) + ISO 13849-1 Anhang A
(PLr graph) + ISO 12100 6.2/6.3/6.4 (reduction hierarchy). Same
wording on every export, ready for the Anlagenbauer-Uebergabe.
- Only norm identifiers — no norm text reproduced.
C) ISO12100Section in Hazard Description
- When a pattern is labeled with ISO12100Section, the hazard
description gets a "Klassifikation: EN ISO 12100 Anhang B,
Abschnitt 6.3.5.4" suffix. Provenance for the auditor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 16 of the Klaerungen / risk-assessment polish. Sources from
EN ISO 12100 Anhang B Tabelle B.1 are now first-class:
A) HazardPattern.ISO12100Section identifier (string), persisted only as
the section number (e.g. "6.3.5.5") — not the norm text. Keeps the
library urheberrechtlich neutral (DIN/Beuth license). 57 patterns
labeled today; rest will follow on touch.
B) Category split per ISO 12100 Nr. 4 vs Nr. 5:
- 16 patterns reclassified noise_vibration -> noise_hazard
- 7 patterns reclassified noise_vibration -> vibration_hazard
- 1 pattern (HP228 UV-/Laermexposition) kept multi-cat
acceptableMeasureCategories now accepts both new aliases plus the
legacy noise_vibration. Coverage test recognises both as valid.
C) 5 new ISO-12100-Annex-B gap patterns (HP1900-HP1904):
- HP1900 Vakuum-Verletzung (6.3.5.5)
- HP1901 Federenergie / elastische Elemente (6.2.10)
- HP1902 Rutschen/Stolpern auf rauer Oberflaeche (6.3.5.6)
- HP1903 Hochdruckinjektion (6.3.5.4) — includes clarifying
"no hand-locating of leaks" question
- HP1904 Ersticken durch Brustkorbquetschung (6.3.5.2)
The library now mirrors the ISO 12100 Annex B structure for the gaps
the Bremse benchmark surfaced.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related bugs in the BMW test result:
1. AGB rendered as 'MANGELHAFT 0/13' even though BMW has no public AGB:
- Auto-discovery correctly returned 'not found' for AGB (no link on
bmw.de matches AGB keywords).
- But auto_fill_from_dsi then found the substring 'AGB' in a section
of the DSI and pseudo-filled the AGB entry with a 264-word DSI
fragment.
- cross_search_documents would have done the same.
- Both now skip entries where discovery_attempted=True AND
auto_discovered=False — the 'not found' verdict stands.
2. DSB-Kontakt rendered as a separate 100% OK document with 7566 words
= the entire DSI text:
- GDPR practice: the DSB is named *inside* the DSI as an email or
contact block (Art. 13(1)(b)), not as a stand-alone page.
- cross_search_documents had been assigning the full DSI to the DSB
row because it matched 'datenschutzbeauftragte' keywords.
- DSB removed from _ALL_DOC_TYPES — no longer canonical, no longer
padded as missing, no longer auto-discovered. The frontend row
remains so a tenant with a separate DSB page can still submit one.
After this fix BMW should render:
- DSE: OK
- Impressum: LUECKENHAFT (unchanged — regex gaps to fix separately)
- Cookie-Richtlinie: OK
- Social Media: NICHT GEFUNDEN (bmw.de does not link to it)
- AGB: NICHT GEFUNDEN (correct — BMW has no public AGB)
- Nutzungsbedingungen: NICHT GEFUNDEN
- Widerruf: NICHT GEFUNDEN
[migration-approved]
Three pieces complete the Klaerungen lifecycle:
1. Migration 028: iace_clarifications + iace_clarification_comments +
iace_clarification_history. Deterministic clarification_key
(UNIQUE per project) so engine re-inits don't lose answers.
History table logs every status/answer transition. The previous
JSONB-in-metadata storage is kept as read-only fallback for
pre-migration projects until a one-shot upcopy script runs.
2. Multi-User-Workflow:
- assigned_to field on every clarification (free-text user kuerzel
for now; an FK to users can be added in a follow-up).
- Comment thread per clarification (POST .../comment, GET
.../detail returns the thread).
- Status-history log written by UpsertClarification when the
status or answer actually changes.
- Frontend Modal: Zugewiesen-an + Bearbeiter fields, comment
thread with inline post, collapsible history section.
3. PDF-Export via print-friendly HTML:
- GET /clarifications.html returns a standalone A4-styled
document with status badges, norm references, affected hazards
and a signature row at the bottom. The Bediener opens the link
and uses Strg-P / Cmd-P to save as PDF. No server-side PDF
dependency added.
- Frontend "PDF / Druck" button next to CSV export.
Backend:
- internal/iace/store_clarifications.go: UpsertClarification,
ListClarificationsForProject, GetClarificationByKey,
AddClarificationComment, ListClarificationComments,
ListClarificationHistory.
- internal/api/handlers/iace_handler_clarifications.go:
- AnswerClarification now writes the SQL row, falls back to legacy
JSONB read on list.
- PostClarificationComment, ListClarificationDetail,
ExportClarificationsHTML added.
Migration must be applied manually on Mac Mini and prod via
psql -f /migrations/028_iace_clarifications.sql — pattern as in
scripts/apply_*_migration.sh.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Frontend filters out empty doc rows -> req.documents only contains the
N submitted entries (3 in BMW case). The old auto-discovery loop
computed 'missing' as 'entries in doc_entries with empty text', which
was always empty for those N entries -> discovery never fired.
Fix:
- missing = _ALL_DOC_TYPES - {canonical doc_types in doc_entries}
- For each missing type, APPEND a new entry to doc_entries with
discovery_attempted=True. If a discovered doc matched, fill text/url
and set auto_discovered=True.
- Check loop: skip entries with no URL and no text (let padding label
them). Entries with URL but no text keep the 'Kein Text' error so the
user sees fetch failures explicitly.
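The corrected "missing" computation can be sketched like this. The list of canonical types is taken from the discovery rules mentioned elsewhere in this log, and the entry shape and the discovered mapping are illustrative assumptions.

```python
# Illustrative; the real _ALL_DOC_TYPES lives in the route module.
ALL_DOC_TYPES = ["dse", "impressum", "cookie", "social_media", "agb",
                 "nutzungsbedingungen", "widerruf", "dsb"]


def append_missing(doc_entries, discovered):
    """Append one entry per canonical type the user did not submit.

    discovered maps doc_type -> {"url": ..., "text": ...} for crawl hits.
    """
    present = {entry["doc_type"] for entry in doc_entries}
    for doc_type in ALL_DOC_TYPES:
        if doc_type in present:
            continue
        entry = {"doc_type": doc_type, "url": "", "text": "",
                 "discovery_attempted": True, "auto_discovered": False}
        hit = discovered.get(doc_type)
        if hit:
            entry.update(url=hit["url"], text=hit["text"],
                         auto_discovered=True)
        doc_entries.append(entry)
    return doc_entries
```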
Three pieces complete the Klaerungen UX:
1. Sidebar-Counter: layout.tsx polls /clarifications and shows a
colored open-count badge on the "Klaerungen" nav item. Refreshes
whenever the user changes route.
2. CSV-Export: new backend endpoint
GET /sdk/v1/iace/projects/:id/clarifications.csv produces a UTF-8-
BOM-prefixed semicolon-separated CSV (Excel-friendly) with ID,
Quelle, Kategorie, Frage, Status, Antwort, Begruendung, Bearbeiter,
answered_at, anzahl Gefaehrdungen, Gefaehrdungs-Namen, Norm-Refs.
The frontend Klaerungen page gets a "CSV-Export" button.
3. Hazard-Banner statt Fragentext im Benchmark-Detail: the previous
bulleted clarification list was duplicated across 48 hazards for a
single FANUC question. Phase 2 replaces it with a compact status
badge — "N offene Klaerung(en) — Klaerungen-Seite oeffnen" (orange)
or "Alle N Klaerungen beantwortet" (green) with a direct link.
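The Excel-friendly CSV format from item 2 (UTF-8 BOM, semicolon separator) can be illustrated as follows. The real endpoint is implemented in the Go backend; this Python sketch only shows the byte layout, and the function name clarifications_csv and the dict-shaped rows are assumptions.

```python
import csv
import io

COLUMNS = ["ID", "Quelle", "Kategorie", "Frage", "Status", "Antwort",
           "Begruendung", "Bearbeiter", "answered_at",
           "anzahl Gefaehrdungen", "Gefaehrdungs-Namen", "Norm-Refs"]


def clarifications_csv(rows):
    """Serialize rows as BOM-prefixed, semicolon-separated UTF-8 CSV."""
    buf = io.StringIO()
    buf.write("\ufeff")  # BOM so Excel detects UTF-8
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, delimiter=";", restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")
```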
Backend cleanup: iace_handler_init.go no longer appends the "Mit
Anlagenbauer zu klaeren" block to Hazard.Description. The description
stays focused on the scenario; clarifications live in the dedicated
endpoint and answers persist across re-inits via project.metadata.
The aggregated "Referenzierte Normen" line on the hazard is kept.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the user leaves some doc-type rows empty, the tool now actively
searches the website for them — only marks 'not found' as last resort.
Flow:
1. User submits N URLs (e.g. just DSI)
2. For each canonical doc_type with no submitted URL/text, the route
identifies the most-common base (scheme://netloc) from submitted URLs
3. Calls consent-tester /dsi-discovery on the homepage with
max_documents=15 (180s timeout)
4. Classifies every discovered doc into a canonical doc_type via
title/URL keyword rules (_DISCOVERY_RULES — covers cookie/widerruf/
social_media/agb/nutzungsbedingungen/dsb/impressum/dse)
5. Fills matching empty entries with the discovered text, marks
auto_discovered=True and discovery_attempted=True
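Step 2 of the flow, picking the most common base from the submitted URLs, might look like this. The function name most_common_base is hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def most_common_base(urls):
    """Return the most frequent scheme://netloc, or None without usable URLs."""
    bases = Counter()
    for url in urls:
        parts = urlparse(url)
        if parts.scheme and parts.netloc:
            bases[f"{parts.scheme}://{parts.netloc}"] += 1
    return bases.most_common(1)[0][0] if bases else None
```

Returning None for an empty list matches the note that discovery only runs when at least one URL was submitted.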
Padding now differentiates:
- 'Auf der Website nicht gefunden' — discovery was attempted, no doc
matched. Amber badge, friendly hint to add URL manually.
- 'Nicht eingereicht — Quelle nicht angegeben' — user gave NO URLs at
all, nothing to crawl from. Grey badge.
Email + frontend:
- Status labels: NICHT GEFUNDEN (amber) vs NICHT EINGEREICHT (grey)
- 'Gepruefte Quellen' table tags auto-discovered URLs with a small blue
'auto-entdeckt' badge so GF sees what tool found vs user submitted.
Implementation only runs when ≥1 URL was submitted (no base to crawl
from otherwise). Adds 30-90s for unsubmitted types but avoids the
'just say nicht gefunden' anti-pattern.
New page "Klaerungen" between Massnahmen and Verifikation.
Backend:
- internal/iace/clarifications.go: Clarification struct + ClarificationAnswer +
BuildProjectClarifications() — aggregates pattern-level + manufacturer-
level questions from collectAllPatterns + GetManufacturerSafetyFeatures.
Deterministic IDs ("pattern:HP1640:0", "manuf:fanuc:dual-check-safety-dcs:1")
so persisted answers survive every re-init.
- internal/api/handlers/iace_handler_clarifications.go:
- GET /projects/:id/clarifications returns aggregated list with affected
hazard names + persisted answer state, sorted (open first).
- POST /projects/:id/clarifications/:cid/answer writes status/answer/
reasoning/answered_by/answered_at to project.metadata.clarification_-
answers — no DB schema change.
Frontend:
- admin-compliance/app/sdk/iace/layout.tsx: new "Klaerungen" nav item.
- app/sdk/iace/[projectId]/clarifications/page.tsx: table grouped by
source (FANUC / Pattern HP1640 / …), Filter Offen/Beantwortet/Alle,
search field, Antwort-Modal with status/answer/Begruendung/Bearbeiter.
A clarification answered once applies to ALL referenced hazards — the
operator no longer has to answer the same FANUC DCS question on 48
mechanical hazards individually.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Always-show-8 (user-requested):
- agent_compliance_check_routes.py: _pad_results_with_missing pads the
results list to always include all 8 canonical doc_types in canonical
order. Missing types get a placeholder DocCheckResult with error=
'Nicht eingereicht' + scenario='missing'.
- agent_doc_check_report.py: NICHT EINGEREICHT status label (neutral),
friendly grey body block instead of red error.
- ChecklistView.tsx: 'Nicht eingereicht' chip (neutral grey, not red
'Fehler'); SCENARIO_LABELS adds missing entry + header chip counter.
Impressum-Regression fix (#18):
- _fetch_text(url, doc_type): cookie/dse/social_media -> max_documents=1
(CMP capture authoritative, sub-pages dilute). Other types -> =3
(Impressum needs Versicherungsvermittler, Aufsicht, Berufsrecht sub-
pages). 15s networkidle bail keeps timing safe.
ODR/Verbraucherstreitbeilegung filter (#19):
- _apply_profile_filter: when profile.needs_odr=True (B2C), override the
check's default B2B-oriented hint with action-oriented B2C guidance
pointing at Art. 14 EU-VO 524/2013 + §36 VSBG. Previously the check
contradicted itself: 'profile says B2C' + hint 'only relevant for B2C
online vendors'.
Registergericht regex (#20):
- impressum_checks.py: accept colon/dot/dash between keyword and city
(BMW writes 'registergericht: münchen hrb 42243'). Add 'sitz und
registergericht: X' as separate pattern.
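A pattern in the spirit of the Registergericht fix, shown only as an illustration; the actual expressions in impressum_checks.py are not quoted in the commit, and the helper name registergericht is an assumption.

```python
import re

# Optional 'sitz und' prefix, optional colon/dot/dash, then city + HRA/HRB no.
REGISTERGERICHT = re.compile(
    r"(?:sitz\s+und\s+)?registergericht\s*[:.\-]?\s*"
    r"([a-zäöüß][a-zäöüß .\-]*?)\s+(hr[ab])\s*(\d+)",
    re.IGNORECASE,
)


def registergericht(text):
    """Return (city, register type, number) or None."""
    match = REGISTERGERICHT.search(text)
    return match.groups() if match else None
```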
Industry detection (#21):
- business_profiler.py: 'automotive' keywords broadened (antriebs,
motor, leasing, werkstatt, probefahrt, plus brand names BMW/Mercedes/
Audi/VW/Porsche/Opel). 'it_services' keywords narrowed — software/
cloud/hosting are mentioned in every privacy policy and were biasing
the result toward IT for any tech-aware company.
The Go init handler appends two annotated blocks to Hazard.Description
("Mit Anlagenbauer zu klaeren: ..." and "Referenzierte Normen: ...")
without changing the DB schema. The benchmark detail view only rendered
hazard.scenario || hazard.description, so the appended blocks were
silently hidden because scenario is always populated.
Split the description into three structured pieces:
1. extractScenario() — pure scenario text, stripped of trailing blocks
2. extractClarifications() — bullet list of "Mit Anlagenbauer zu klaeren"
3. extractEngineNorms() — pipe-separated norm references
Each piece is rendered as its own DetailRow. The FANUC DCS clarification
that already lives in the DB (48/115 hazards on the Bremse project) is
now visible in the Engine column.
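The three-way split can be sketched as a single pass over the description. The frontend implements this in TypeScript; the Python version below is a port for illustration and assumes each marker starts its own line and that clarifications are dash or asterisk bullets.

```python
CLARIFY = "Mit Anlagenbauer zu klaeren:"
NORMS = "Referenzierte Normen:"


def split_description(description):
    """Return (scenario, clarifications, norms) from a hazard description."""
    scenario_lines, clarifications, norms = [], [], []
    mode = "scenario"
    for line in description.splitlines():
        stripped = line.strip()
        if stripped.startswith(CLARIFY):
            mode = "clarify"
            stripped = stripped[len(CLARIFY):].strip()
            if not stripped:
                continue
        elif stripped.startswith(NORMS):
            refs = stripped[len(NORMS):].split("|")
            norms = [ref.strip() for ref in refs if ref.strip()]
            mode = "done"
            continue
        if mode == "scenario":
            scenario_lines.append(line)
        elif mode == "clarify" and stripped:
            clarifications.append(stripped.lstrip("-* ").strip())
    return "\n".join(scenario_lines).strip(), clarifications, norms
```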
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmp_discovery_log.py:
- sqlite log at /data/cmp_discoveries.db: every LLM-discovered CMP
pattern recorded with domain, strategy, value, sample text
- Auto-promote (user-chosen 'voll automatisch' mode): when LLM returns
strategy=url AND extracted text >= 800 words, write a new module
/data/auto_cmp/auto_<slug>.py with derived regex matcher + reconstruct
- record_discovery() called from dsi_discovery._try_llm_cascade on success
cmp_library/_registry.py:
- Loads both hand-written modules from services/cmp_library/ AND
auto-promoted modules from /data/auto_cmp/ (CMP_AUTO_DIR env)
- Auto modules use importlib.util.spec_from_file_location, no package
install needed; restart consent-tester to pick up new ones
dsi_discovery.py:
- _try_llm_cascade now calls record_discovery() on every successful
LLM analysis (cached AND fresh)
main.py:
- GET /cmp-discoveries — admin endpoint listing all logged discoveries
- DELETE /cmp-discoveries/{id} — rollback (unlinks auto_*.py)
This closes the self-improving loop: first encounter with a new CMP fires
the LLM (cost) → discovery is auto-promoted → all future runs against the
same vendor pattern hit Phase B (Named CMP) at <50ms with no LLM call.
New module consent-tester/services/cmp_llm_fallback.py:
- LLMCookieExtractor: single-endpoint adapter (Ollama OR OpenAI-compat)
- LLMCascade: tries Qwen (local Mac Mini Ollama) first; falls through to
OVH (managed 120B) when Qwen returns no usable strategy
- LLMCascade.from_env(): reads OLLAMA_URL/CMP_LLM_MODEL + OVH_LLM_URL/
OVH_LLM_KEY/OVH_LLM_MODEL from environment
- LLM returns JSON {strategy: url|selector|text, value: ...}
- Valkey-backed cache per netloc (cmp:hint:<netloc>, 7-day TTL) — next run
against the same domain skips the LLM entirely
dsi_discovery.py:
- Wired network_log collector (URL/status/content-type/size of every JSON
response on the page) — passed to LLM prompt as observation
- After Named CMP (Phase B) + Heuristic (Phase A) both fail AND DOM
< 300 words: invoke LLMCascade.analyze(...)
- _apply_llm_hint executes the LLM's strategy: refetch URL via Playwright
request context, query DOM selector, or use text directly
- Cache HIT path: apply cached hint, only fall back to LLM if cache is stale
docker-compose.yml:
- consent-tester gets env vars + cmp-data volume (for Phase E)
- All LLM endpoints configurable via env, sensible defaults
consent-tester/requirements.txt:
- redis>=5.0 (asyncio client, Valkey-compatible)
- httpx>=0.27
Adds a curated database of safety-relevant features for the major
manufacturers across mechanical/plant engineering, written entirely in
own words with norm anchors. No verbatim manufacturer texts — therefore
no copyright issue:
- Markennennung (§ 23 MarkenG nominative use) is permitted.
- Fakten ueber Produkt-Sicherheitsfunktionen are not protected by § 2
UrhG (only Werke, not facts).
- NormReferences contain only the identifiers (e.g. "EN ISO 13849-1
PLd Kat.3"), never the norm text itself.
Coverage (52 entries across 12 categories):
Industrieroboter (10): FANUC DCS, KUKA SafeOperation, ABB SafeMove,
Yaskawa FSU, Staeubli CS9, Kawasaki Cubic-S, Mitsubishi MELFA,
Universal Robots PolyScope, Doosan PRS, Comau SafeNet
CNC/WZM (8): DMG MORI, Mazak, TRUMPF, Okuma, Hermle, Heidenhain
SPLC, GROB, Heller
Pneumatik (4): Festo, SMC, AVENTICS, Parker
Hydraulik (3): Bosch Rexroth, HAWE, HYDAC
Safety-PLC / Sicherheitstechnik (8): PILZ, SICK, Schmersal, Euchner,
Leuze, Phoenix Contact, Banner, Wieland
Standard-PLC (5): Siemens, Beckhoff, Rockwell, Schneider, B&R
Pressen (3): Schuler, Bruderer, AIDA
Spritzguss (3): Arburg, KraussMaffei, ENGEL
Verpackung (2): Krones, Bosch Packaging/Syntegon
Laser/Schweissen (3): Bystronic, Amada, Fronius
Foerdertechnik (2): Interroll, SEW EURODRIVE
Engine integration:
- LookupManufacturerFeaturesInText() scans the project narrative for
any of the manufacturer aliases (case-insensitive, umlaut-tolerant).
- Init-Handler appends matched feature clarifications to the relevant
hazard's "Mit Anlagenbauer zu klaeren:" block — for the right
HazardCategory only (e.g. FANUC DCS only on mechanical_hazard).
- For a Bremse project narrative mentioning "Fanuc Robodrill", the
engine now adds clarification questions like "Ist DCS am Roboter
konfiguriert?" to relevant mechanical hazards automatically.
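Case-insensitive, umlaut-tolerant alias matching could be approximated as below. The engine is Go; this Python sketch uses a deliberately lossy folding (ä/ae/a all collapse to a) and a tiny illustrative alias table, both of which are assumptions.

```python
# Illustrative subset; the real library has 52 entries.
ALIASES = {
    "FANUC": ["fanuc"],
    "Staeubli": ["staeubli"],
    "KUKA": ["kuka"],
}

_FOLDS = (("ä", "a"), ("ö", "o"), ("ü", "u"), ("ß", "ss"),
          ("ae", "a"), ("oe", "o"), ("ue", "u"))


def fold(text):
    """Lower-case and collapse umlaut spellings to one form."""
    text = text.lower()
    for src, dst in _FOLDS:
        text = text.replace(src, dst)
    return text


def lookup_manufacturers(narrative):
    """Return the manufacturers whose alias occurs in the narrative."""
    folded = fold(narrative)
    return {name for name, aliases in ALIASES.items()
            if any(fold(alias) in folded for alias in aliases)}
```

Because both the alias and the narrative are folded the same way, "Stäubli", "Staeubli" and "Staubli" all resolve to the same entry.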
Tests: 7 new pin tests — manufacturer count, norm prefixes, FANUC/KUKA
detection in narrative, umlaut robustness (Staeubli vs Staubli).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmp_extractor.py refactored to thin coordinator (123 LOC, was 223).
Discovers all CMP modules via cmp_library/_registry.py:load_all() at
import time. Restart consent-tester to pick up new modules.
New cmp_library/ folder:
- _registry.py: auto-discovers all modules with MATCHER + reconstruct()
- epaas.py: BMW Group ePaaS (extracted from cmp_extractor)
- onetrust.py: cdn.cookielaw.org Groups/Cookies schema
- cookiebot.py: consent.cookiebot.com Categories schema
- usercentrics.py: api.usercentrics.eu services schema
- didomi.py: sdk.privacy-center.org notice + vendors + purposes
- trustarc.py: consent.trustarc.com categories + vendors
Each module:
- MATCHER: re.Pattern matching the CMP JSON endpoint URL
- reconstruct(d: dict) -> str: builds German Markdown cookie-policy text
Phase E (self-improving) will write auto_*.py files into the same folder;
_registry already picks those up via pkgutil.iter_modules.
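The module contract (a MATCHER pattern plus a reconstruct function) and the coordinator's dispatch can be sketched as follows. The demo module, its URL pattern on cmp.example.com, and reconstruct_for are hypothetical; a SimpleNamespace stands in for a real module in cmp_library/.

```python
import re
from types import SimpleNamespace


def _reconstruct_demo(d):
    """Build a small Markdown cookie-policy text from a payload dict."""
    lines = ["# Cookie-Richtlinie"]
    for cookie in d.get("cookies", []):
        lines.append(f"- {cookie.get('name', '?')}: {cookie.get('description', '')}")
    return "\n".join(lines)


# Stand-in for a module object exposing MATCHER + reconstruct().
demo_cmp = SimpleNamespace(
    MATCHER=re.compile(r"cmp\.example\.com/.+\.json$"),
    reconstruct=_reconstruct_demo,
)
MODULES = [demo_cmp]


def reconstruct_for(url, payload):
    """Dispatch a captured JSON payload to the first module whose MATCHER hits."""
    for module in MODULES:
        if module.MATCHER.search(url):
            return module.reconstruct(payload)
    return None
```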
GT 1.8 specifically requires the 'sicher begrenzten Bewegungsbereich (Dual
Check Safety)' (safely limited range of motion). HP1654 only had M061
'Feste trennende Schutzeinrichtung' as mitigation. Added M494 (Safe
Limited Position/Space with DCS explanation), M501 (safety-fence load
rating) and M502 (gripper fail-safe). The clarification questions refer
explicitly to DCS for FANUC, SafeMove for ABB, SafeOperation for KUKA and
the EN ISO 13849-1 PLd/Kat.3 validation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New module cmp_heuristic.py with:
- looks_like_cookie_policy(data): shape-based classifier (top-level keys
cookies/categories/providers/vendors/purposes/cookieList/etc. + at
least 2 name+description objects, or IAB TCF v2 vendors[]+purposes[])
- reconstruct_generic(data): walks JSON, extracts name + description
fields + standalone prologue/dataController/persistence fields,
emits flat German Markdown text (max 5000 words, dedup)
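A simplified version of the shape-based classifier, to show the idea. The key set is a subset of the one named above, the recursive counting strategy is an assumption, and the TCF check follows the vendors[]+purposes[] description literally.

```python
POLICY_KEYS = {"cookies", "categories", "providers", "vendors",
               "purposes", "cookieList"}


def _count_named(node):
    """Count objects anywhere in the tree that carry name + description."""
    if isinstance(node, dict):
        own = 1 if "name" in node and "description" in node else 0
        return own + sum(_count_named(v) for v in node.values())
    if isinstance(node, list):
        return sum(_count_named(v) for v in node)
    return 0


def looks_like_cookie_policy(data):
    if not isinstance(data, dict):
        return False
    # IAB TCF v2 shape: vendors[] + purposes[] at the top level.
    if isinstance(data.get("vendors"), list) and isinstance(data.get("purposes"), list):
        return True
    if not POLICY_KEYS.intersection(data):
        return False
    return _count_named(data) >= 2
```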
cmp_extractor.py wired so that AFTER named CMP matchers (epaas,
onetrust) fail, every JSON response on the page is tested for the
heuristic. If matched, payload is captured as '_heuristic' kind and
reconstructed via the generic walker.
This is Phase A of the 4-stage cascade (B-D follow). Unknown CMPs that
return JSON now work without hand-coding each one.
Pre-filter: skips response paths /api/config, /beacon, /track,
/analytics, /fonts/, /log/, /heartbeat/, /.well-known/ to avoid
spamming the heuristic on every Playwright load.
Root cause of the recurring 603-word BMW result:
- DSI discovery for cookie-policy URL was hitting 4x networkidle timeouts
(60s each = ~240s total).
- Backend httpx timeout (180s after the previous fix) gave up before the
consent-tester finished, falling through to the raw HTTP fetch which
returned BMW's SSR navigation chrome (603 words) as the 'cookie policy'.
Two orthogonal fixes:
1. _fetch_text now passes max_documents=1 for user-specified URLs. We only
want self-extraction of THAT page; link-following is unnecessary noise.
2. networkidle wait_until window dropped 60s -> 15s. SPAs like BMW/Daimler
never reach networkidle anyway; the 60s wait was pure latency. Falls
through to domcontentloaded+5s render-wait, same as before.
Library measures carry NormReferences (EN/IEC/ISO/DIN/TRBS/TRGS Ziff./Kap./
Pos.) but they were dropped on persist: CreateMitigationRequest only
wrote Name + Description. The Fachmann benchmark file lists Normen for
34 of 60 hazards — the engine had this data already but lost it on the
way to the UI.
Fix without DB schema change:
- Mitigation.Description gets a "Normen: EN 60204-1 Ziff. 6.2 | EN 61140"
line appended when the measure has NormReferences. Pipe separator keeps
the inline panel short and grep-friendly.
- After all mitigations land, the aggregated dedup'd norm list for the
hazard is appended to Hazard.Description as a single "Referenzierte
Normen: ..." line so the UI can show one panel per hazard without
scanning every mitigation.
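The two appended lines can be sketched as plain string helpers. The engine does this in Go; the Python below is illustrative, and the norm_references key and helper names are assumptions.

```python
def with_norms(description, norm_refs):
    """Append a pipe-separated 'Normen:' line when references exist."""
    if not norm_refs:
        return description
    return f"{description}\nNormen: {' | '.join(norm_refs)}"


def aggregate_norms(mitigations):
    """Collect norm references across mitigations, first-seen order, deduped."""
    seen = []
    for mitigation in mitigations:
        for ref in mitigation.get("norm_references", []):
            if ref not in seen:
                seen.append(ref)
    return seen


def hazard_description(description, mitigations):
    refs = aggregate_norms(mitigations)
    if not refs:
        return description
    return f"{description}\nReferenzierte Normen: {' | '.join(refs)}"
```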
Audit of library coverage (per-pattern) showed GT-Bremse norms are
generally present and richer:
- HP1640 covers GT 2.2 (EN 60204-1 Ziff. 6.2, Ziff. 8.2.3, EN 61140 +)
- HP1641 covers GT 2.4 (EN 60204-1 Ziff. 8.2.6 +)
- HP1605 covers GT 1.7 (ISO 10218-1 Ziff. 5.6.2, 5.8.3; Ziff. 5.7.3 is missing)
- HP1671 covers GT 1.30 (EN 12417; Pos. detail is missing)
Followup: 2 fine-grained sub-paragraph references (5.7.3, Pos. 1.1.4)
can be added later as measure-text updates.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ZoneDE 'Pneumatikkomponenten der Anlage' collided, after normalizeZoneKey,
with HP1630 'Pneumatikschlaeuche der Automation' in the 3-significant-word
comparison. The new zone 'Berstgefaehrdete Druckwandungen Pneumatik
(Leitungswand, Dichtung, Verschraubung)' has semantically distinct
keywords, so dedup no longer merges it into HP1630.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three lasting improvements, driven by the Bremse benchmark cases
GT 1.4, GT 1.30 and GT 7.4. The engine still does not invent expert
(Fachmann) comments: comments stay absent because they require an
understanding of the specific installation that the engine does not
have. Instead, the engine supplies norm-based clarification questions
and a more precise pattern vocabulary.
A) HazardPattern.ClarificationQuestionsDE, a new optional field:
- A pattern stores verifiable questions that the operator clarifies
with the plant builder. Examples:
- HP1640: "Liegt ein Pruefprotokoll nach EN 60204-1 vor?"
- HP1666: "Ist die WZM als CE-konformes Subsystem integriert?"
- HP1604: "Ist DCS am Roboter konfiguriert und validiert?"
- The init handler appends the questions to Hazard.Description under
the marker "Mit Anlagenbauer zu klaeren:". No DB schema change
needed.
- 11 patterns now carry clarification questions (HP1602, HP1604, HP1611,
HP1612, HP1620, HP1622, HP1637, HP1640, HP1641, HP1666, HP1685).
B) HP1632 "Bersten druckbeaufschlagter Pneumatik-Komponente": new
pattern, semantically DISTINCT from HP1630 "Abspringen":
- Bersten (bursting) = material/pressure failure of the component,
medium escapes
- Abspringen (detaching) = connection comes loose, whip effect
Bremse benchmark GT 1.4 spoke of bursting, HP1630 only of
detaching, so a 66% frontend match was a dead end. With HP1632
the engine fires a separate hazard that scores a clean direct hit
on GT 1.4.
C) HP1637 "Einatmen von KSS-Aerosolen": measures completed.
Previously only M141 (Sicherheitszeichen); now additionally M405
(KSS-Aerosolabsaugung), M418 (AGW-Ueberwachung), M526 (WZM-Tueren
geschlossen waehrend Bearbeitung), M408 (Hautschutzplan).
Clarification question: "Wurde die Aerosolkonzentration nach
Bearbeitungsende messtechnisch ermittelt und mit dem AGW verglichen?"
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
HP1671 "Druckluft-Verletzung in Bearbeitungszelle" did match the
GT 1.30 scenario "Einstich, Augenverletzung in Bearbeitungszelle" exactly
by name and scenario, but had only a single measure, M061
"Feste trennende Schutzeinrichtung". The expert's three specific measures
(cleaning nozzle integrated into the cell / compressed air off when
the door opens / enclosure load rating) stayed invisible: my new
GT-Bremse pattern HP1712 knows these measures but does not fire in this
project because of RequiredEnergyTags=["pneumatic"].
Fix: HP1671 SuggestedMeasureIDs ["M061"] -> ["M504", "M505", "M501",
"M061", "M141"]. EN 12417 Kap. 5.2 / Pos. 1.1.4 is now covered by
M504/M505. HP1712 remains as a backup pattern for projects with an
explicit pneumatic tag.
Followup: HP1671 and HP1712 are semantically redundant; consolidating
them is part of the next pattern-hygiene iteration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The dsi-discovery in consent-tester does self-extraction + follows up to
3 sub-links + waits for CMP JSON payloads. On big SPAs (BMW, Daimler)
this routinely exceeds 60s. When it timed out, the HTTP fallback returned
the SSR shell as text — for the BMW cookie page that's 603 words of site
navigation, which then registered as 'Cookie-Richtlinie nicht im
eingereichten Text' (33%). With 180s the consent-tester finishes cleanly
and we get the CMP-captured 1824 words of real policy.
Background: hazard_patterns_extended.go (HP045-074) and _extended2.go
(HP074-102) shared their entire ID range with the semantically-different
patterns in hazard_patterns_cobot.go, hazard_patterns_press.go,
hazard_patterns_operational.go and hazard_patterns_extended_dguv.go.
The collision had lived unnoticed because TestGetBuiltinHazardPatterns_-
UniqueIDs only checks the 44 builtin patterns (HP001-HP044).
Examples of the collision:
- HP059 = "Kollision Mensch-Roboter" (cobot.go) vs "Kupplung — mechanisch" (extended.go)
- HP060 = "Quetschen durch Werkzeug am Cobot" (cobot.go) vs "Diagnosemodul — Software" (extended.go)
- HP073 = "Wartung ohne LOTO" (operational.go) vs "Hydraulikventil — hydraulisch" (extended.go)
At runtime collectAllPatterns() returned both patterns under the same ID
which made downstream lookups (e.g. hazardPatternMeasures map keyed by
pattern_id) non-deterministic — last-loaded wins, dropping the other
pattern's mitigation set silently.
Rename strategy (no deletes — both patterns are real and earn their
SuggestedMeasureIDs after the category-filter work):
extended.go HP045..HP073 -> HP1800..HP1828 (29 IDs)
extended2.go HP074..HP102 -> HP1830..HP1858 (29 IDs)
cobot/press/operational/extended_dguv keep their original IDs because:
- compliance_triggers.go references HP059/HP060 with the cobot meaning
- pattern_engine_test.go references HP073 with the LOTO/maintenance meaning
- phase3_4_test.go references HP073 the same way
New regression test:
- TestAllPatterns_UniqueIDs runs over collectAllPatterns() and fails if
ANY pattern in the runtime set duplicates an ID. The old
TestGetBuiltinHazardPatterns_UniqueIDs stays for the builtin subset.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two-part durable fix replacing the previous "fill to 5 mitigations
no matter what" behavior that the GT-Bremse benchmark proved
unfaithful (e.g. HP1625 "scharfe Kanten" returning M005 "Rotations-
bewegung vermeiden" via category fallback; HP1651 "Wiederanlauf
Roboter" returning M054 "Sichere thermische Auslegung" via
mismatched pattern reference).
PART A — Set-based category filter (handlers package):
- acceptableMeasureCategories: replaces 1:1 patternCatToMeasureCat
with a curated set per pattern category, so e.g.
safety_function_failure now accepts software_control measures
(watchdogs, plausibility checks) and emc_hazard accepts both
electrical and software_control measures
- isCategoryCompatible: gate every measure id against the accepted
set before creating a mitigation; mismatches log MEASURE-SKIP
- The old category fallback is REMOVED. A hazard whose pattern has
no category-compatible measure is now created with zero mitigations
and logged as COVERAGE-GAP — the operator must consult an expert.
No more silent invention of generic defaults.
PART B — 235 pattern author-error fixes across 26 files:
- HP040-HP044 (AI): M101/M102/M103 (Auffangwanne/Absauganlage) ->
M133 Anomalieerkennung + M214 Plausibilitaet + M213 Sensor-Redundanz
+ M044 Zweikanalige Steuerung + others
- HP011-HP015, HP104-HP109, HP1085-HP1095, HP1281-HP1334 (electrical):
M001-M005/M054/M061 placeholders -> M481/M482 Isolation +
M511-M522 PE/Schutzleiter/RCD/Hauptschalter
- HP110-HP1331 (material_environmental): M101-M103 -> M384-M395
Brandschutz/Laserschutz + M533/M408 SDB/PSA
- HP800-HP858, HP1178-HP1264 (software/sensor/hmi):
M101/M104 -> M105/M106/M107/M214 SPS/Watchdog/Plausibilitaet
- HP026, HP611-HP1690 (ergonomic): M001/M082 -> M353-M360 +
M530-M532 Hebehilfe/ergonomische Hoehe
- HP201-HP1697 (mechanical): M054/M051 -> M002/M008/M061/M141 +
M487/M488 Tueroeffnung-Stillsetzung/Wiederanlauf
- Plus EMF/radiation/fire/noise/vibration/communication/cyber
Coverage shift (pattern author errors with the set filter enabled):
start: 237 patterns with zero category-compatible measures
after stage 1A: 5 (AI)
after stage 1B: 20 (mechanical, existing patterns)
after stage 1C: 35 (electrical, existing patterns)
after stage 1D: 29 (material_environmental)
after stage 1E: 29 (software/sensor/hmi)
after stage 1F: 20 (ergonomic)
after stage 1G: 80 (thermal/comm/radiation/fire/safety)
final: 0 (28 extended.go/extended2.go duplicates fixed)
New regression tests:
- TestEveryPattern_HasCategoryCompatibleMeasure: every pattern in
collectAllPatterns() must reference at least one category-compatible
measure; gaps must be explicitly listed in AllowlistKnownGaps
(currently empty). Fails CI for any new pattern that drifts.
- TestAcceptableMeasureCategories: pins the set-mapping for the
7 most-bug-prone pattern categories.
- TestIsCategoryCompatible_EmptyMeasureCat: protects legacy entries.
A separate task #11 tracks 58 HP-ID duplicates between
extended.go/extended2.go and cobot.go/press.go/operational.go —
patterns are semantically different and TestGetBuiltinHazardPatterns_-
UniqueIDs misses them because it only checks HP001-HP044.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BMW ePaaS URLs use 3 segments between /policypage/ and .epaas.json:
/epaas/prod/policypage/<tenant>/<config-hash>/<locale>.epaas.json
The old pattern only matched 2 segments. Switch to a tolerant pattern
that matches any path before .epaas.json (anchored at .epaas.json end).
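
As an illustration of the tolerant match, a sketch that accepts any number of path segments as long as the path ends in .epaas.json. The regex, the function name and the example URLs are assumptions; the actual pattern in the code may differ.

```python
import re
from urllib.parse import urlparse

# Anchored at the end of the path only; everything before it is unconstrained.
EPAAS_POLICY_RE = re.compile(r"\.epaas\.json$")


def is_epaas_policy_url(url: str) -> bool:
    """Return True when the URL path ends in .epaas.json."""
    return EPAAS_POLICY_RE.search(urlparse(url).path) is not None
```

Matching on the parsed path rather than the raw URL keeps a trailing query string from defeating the end anchor.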
Previous threshold (DOM < 300 words) missed the BMW case where Playwright
extracted 346 words of pure site navigation. The CMP JSON had 1673 words
of real policy content but was discarded.
New heuristic: prefer CMP when ANY of:
- DOM < 300 words (existing)
- CMP text >= 1000 words (authoritative at scale)
- CMP text >1.5x longer than DOM
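
The three conditions above can be sketched as a single predicate. Counting words by whitespace splitting, the function name, and the rule that an empty CMP text is never preferred are assumptions of this sketch.

```python
def prefer_cmp_text(dom_text: str, cmp_text: str) -> bool:
    """Return True when the CMP-reconstructed text should replace the DOM text."""
    dom_words = len(dom_text.split())
    cmp_words = len(cmp_text.split())
    if cmp_words == 0:
        # Assumption: without any CMP text there is nothing to prefer.
        return False
    return (
        dom_words < 300                 # existing rule: DOM is nearly empty
        or cmp_words >= 1000            # CMP text is authoritative at scale
        or cmp_words > 1.5 * dom_words  # CMP text is clearly longer
    )
```

With the BMW numbers from this message (346 DOM words, 1673 CMP words) the second and third conditions both hold, so the CMP text wins.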
BMW (and other big enterprise sites) do NOT render cookie policies as
static HTML. Their widget loads structured data from a JSON endpoint
(BMW: ePaaS at /epaas/prod/policypage/.../<locale>.epaas.json) and
renders it client-side after consent. Our DOM extraction therefore only
captured site navigation (603 words of header/footer chrome), not the
actual policy.
New module consent-tester/services/cmp_extractor.py:
- CMPCapture: response listener that catches policy JSON during navigation
- Reconstructors for ePaaS (BMW) + OneTrust placeholder
- Returns Cookie-Richtlinie text built from policyPageMetadata +
categories + providers (BMW: 1673 words reconstructed vs. 603 noise)
dsi_discovery.py:
- Attach CMPCapture before page.goto
- After self-extraction: if rendered DOM < 300 words AND CMP captured a
payload, prefer the CMP-reconstructed text. This bypasses the empty
'.cookie-policy' div problem entirely.
6 supplementary measures (M410-M420) were silently overwritten by
metalworking duplicates in measureByID lookups, so robot-cell electrical
patterns resolved to chip-extraction/cleaning fallbacks instead of
equipotential bonding, creepage, EMC, or hose-burst protection. Rename
supplementary IDs to M475-M480 and rewire 13 affected pattern references
in robot_cell + robot_cell_ext.
HP1640 (direct contact with live parts, GT 2.2): priority 98->99, drop
RequiredEnergyTags gate so it fires in robot cells without an electrical
tag, expand mitigations to 5 concrete TRBS 2131 / IEC 60204-1 / EN 61140
measures (basic protection, double insulation, earthing, insulation
monitoring, equipotential bonding) — was previously losing to HP1688
even though HP1688 describes a different scenario.
HP1688 (touch voltage from potential differences): priority 98->96 so it
no longer outranks HP1640 for the direct-contact case; mitigations
expanded from M410-only to 4 concrete electrical measures.
Add regression tests pinning HP1640 contact-protection resolution and
M475 = Potentialausgleich. Existing TestGetProtectiveMeasureLibrary_-
UniqueIDs now actually enforces uniqueness (previously masked by
last-wins map override).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BMW Impressum/Cookie pages time out in Playwright (>180s) because the
SPA has many sub-links to follow. But the HTML source already contains
the text (SSR). New fallback: direct HTTP GET + HTML tag stripping.
Order: 1. Consent-tester (Playwright, 180s) → 2. HTTP GET (30s)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Patches unauthenticated SSRF in WebSocket upgrade handler.
Applies to admin-compliance, developer-portal.
Compliance-SDK admin-dashboard skipped — has a pre-existing TS
type mismatch that blocks the build regardless of Next version.
Needs separate migration work.
GHSA-c4j6-fc7j-m34r.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
KSS, EMV, ESD, DCS, PLR, SIL, HMI, SPS, RCD, LOTO, PSA are
abbreviations that should NOT trigger the relevance filter.
bersten, platzen, abspringen, spritzen, einatmen, ausrutschen,
herabfallen, durchschlaegen, wegschleudern are action words that
appear in many patterns and don't indicate a specific machine.
Fixes: HP1633-HP1675 (KSS patterns) were filtered out because
"kss" was not in the narrative but also not in genericSafetyTerms.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Robot cell patterns now fire BEFORE generic patterns (Priority 96-99
vs generic 85-95). This ensures pattern-specific SuggestedMeasureIDs
(M420 for KSS, M410 for Potentialausgleich) reach the hazard.
Removed debug fmt.Println statements.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When multiple patterns match the same category+zone, the first creates
the hazard and later patterns add their SuggestedMeasureIDs to the
existing hazard. This ensures KSS-specific measures (M420) reach the
hazard even if a generic pattern created it first.
seenCatZone changed from map[string]bool to map[string]uuid.UUID
to track which hazard ID was created for each dedupKey.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each hazard now gets measures from its SOURCE PATTERN first
(SuggestedMeasureIDs), then category fallback for remaining slots.
Previously all mechanical hazards got the same generic top-5 measures
(Gefahrstelle eliminieren, Sicherheitsabstaende, Scharfe Kanten...).
Now a KSS-Schlauch hazard gets M420 (Druckfeste Auslegung) first.
SuggestedMeasureIDs added to PatternMatch struct and passed through
from pattern definition to hazard creation to measure assignment.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Every ScenarioDE now describes how a PERSON is affected, not just
what happens to the machine. Every HarmDE describes the INJURY,
not just the technical effect.
Examples:
- "Peitscheneffekt des Schlauchs" → "Person wird von abspringendem
Schlauch getroffen. KSS-Spritzer verletzen Haut und Augen."
- "Kurzschluss, Brand" → "Person wird durch Brand oder toxische
Rauchgase verletzt. Verbrennungen, Rauchvergiftung."
Rule: a risk assessment evaluates danger to PERSONS.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. _expand_all_interactive(): Only click aria-expanded="false" buttons.
Before: clicked ALL accordion buttons including open ones → BMW's
pre-expanded accordions got CLOSED, reducing text from 1151 to 361w.
2. _fetch_text() + /extract-text: merge ALL documents found on a page
(max_documents=10 instead of 1). BMW splits DSI across 5 sub-pages
that the discovery finds as separate documents — now merged.
3. Tab panels: unhide hidden tabpanels instead of clicking tabs
(clicking tabs can hide the currently visible panel).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Matches below 50% are now split:
- GT entries → "Fehlend" tab (not matched by engine)
- Engine entries → "Engine Findings" tab (additional findings)
Only matches >= 50% shown in "Zugeordnet" tab.
Coverage score now counts only real matches (>= 50%).
"Extra" tab renamed to "Engine Findings" for clarity.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
HP1606: Quetschen/Scheren durch Greifer im Einrichtbetrieb (GT 1.14)
HP1634: KSS-Pumpe spritzt bei geoeffneter Schutztuer (GT 1.38)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
HP1605: Stoss durch Werkzeug/Greifer im Einrichtbetrieb (GT 1.14)
HP1633: KSS-Versorgungsschlauch platzt oder reisst ab (GT 1.35)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Energy tag "electrical" doesn't match resolved tags (which are
"high_voltage", "electrical_part", etc.). Patterns HP1685-HP1699
now fire without energy tag requirement — they fire for any
project that has the right component tags.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When GT has two entries for the same zone with different scenarios
(e.g. "eingeklemmt" vs "getroffen"), we need separate engine patterns.
HP1700: Getroffen von bewegtem Werkzeug/Greifer (vs HP1652 eingeklemmt)
HP1701: Greifer/Werkzeug durchschlaegt Zaun (vs HP1654 Werkstueck)
HP1702: KSS-Schlauch platzt (vs HP1675 springt ab)
HP1703: KSS-Bettspuelung bei offener Tuer (vs HP1670 allgemein)
HP1704: Brand durch KSS auf elektrische Komponenten
Extended synonym sets for potential/EMV matching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
scenarioSimilarity now uses synonym-set cross-matching: if GT says
"durchschlaegt" and Engine says "schleuder", the synonym set recognizes
them as related. Added significantWordOverlap fallback when no action
words found. Extended action terms: schlauch/druck/kuehlschmierstoff,
pumpe/bettspuel, potential/bezugspotential, stoerung/emv.
Moved extractActionWords to benchmark_synonyms.go (458+119 lines).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4-signal matcher: category (0.2), keywords (0.2), zone (0.3),
scenario similarity (0.3). Scenario signal extracts action words
(eingeklemmt vs herabfallend vs durchschlaegt) to differentiate
similar-looking hazards at the same component.
Split benchmark_synonyms.go (70 lines) from benchmark_matcher.go
(516→450 lines) to stay under 500-line cap.
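
The weighting can be illustrated as a plain weighted sum. The matcher itself lives in Go (benchmark_matcher.go); this Python sketch only mirrors the four weights from the message and assumes each signal is already normalized to the range 0..1. The names WEIGHTS and match_score are hypothetical.

```python
# Weights as stated in the commit message; they sum to 1.0.
WEIGHTS = {"category": 0.2, "keywords": 0.2, "zone": 0.3, "scenario": 0.3}


def match_score(signals: dict) -> float:
    """Combine per-signal scores (each assumed in 0..1) into one score.

    Signals missing from the dict count as 0.0.
    """
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())
```

Because zone and scenario carry 0.6 of the weight together, two hazards at the same component with different action words differ by up to 0.3 in score.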
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sort matches by specificity first (zone overlap), then by score.
Prevents generic matches from consuming specific Engine patterns
that should match more specific GT entries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Management Summary (agent_doc_check_report.py):
- Plain-language action items for the managing director (Geschaeftsfuehrer)
- Maps technical checks to business actions ("Ihren DSB erwaehnen",
"Beschwerderecht ergaenzen", "Loeschfristen dokumentieren")
- Shows at top of compliance check email before detail report
- Max 10 actions, max 3 per document
2. Batch GT Test (zeroclaw/scripts/batch_gt_test.py):
- Runs all 10 GT websites through compliance-check API
- Prints comparison table with L1 scores, word counts, services
- Saves raw JSON results for analysis
- Usage: python3 batch_gt_test.py --sites 1,6 --backend-url URL
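
The two caps on the Management Summary (max 10 actions, max 3 per document) could be applied along these lines. The function name, the (doc_type, action_text) tuple shape and the assumption that the input is already in priority order are illustrative only.

```python
from collections import defaultdict


def cap_actions(actions, max_total=10, max_per_doc=3):
    """Select actions in order, honoring a global and a per-document cap.

    actions: iterable of (doc_type, action_text), assumed to be sorted by
    priority already.
    """
    per_doc = defaultdict(int)
    selected = []
    for doc_type, text in actions:
        if len(selected) >= max_total:
            break
        if per_doc[doc_type] >= max_per_doc:
            continue  # this document already has its share
        per_doc[doc_type] += 1
        selected.append((doc_type, text))
    return selected
```

The per-document cap keeps one weak document from using up the whole list.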
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Threshold 0.25→0.20 to recover matches lost by keyword penalty.
New synonym sets: eingeschlossen/wiederanlauf, zentriergreifer,
beladetuer/schutztuer, ergonom/bedienelemente, spritzer/auge, bersten.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cross-search "not in text" findings are only shown when regex L1
completeness < 50%. This prevents false positives where the text IS
the right doc_type but doesn't contain the specific cross-search
keywords (e.g. Impressum passes 9/13 checks but lacks "§5 TMG").
Also: cross-search now checks entries with wrong text, not just empty.
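
A small sketch of the gating rule, using the Impressum example from this message (9 of 13 checks passed). The function name and the handling of a zero check count are assumptions.

```python
def show_not_in_text_finding(passed_checks: int, total_checks: int,
                             threshold: float = 0.5) -> bool:
    """Return True when a cross-search 'not in text' finding should be shown."""
    if total_checks <= 0:
        # Assumption: with no L1 checks there is no evidence that the text
        # is the right doc_type, so the finding is kept.
        return True
    return passed_checks / total_checks < threshold
```

With 9/13 (about 69%) the finding is suppressed; with 6/13 (about 46%) it is shown.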
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cross-search now validates if existing text matches the expected
doc_type using keyword scoring. If text is present but doesn't match
(e.g. Nutzungsbedingungen in Widerruf row), searches other texts
and creates a finding explaining the mismatch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Patterns for playground, escalator, wind turbine, glass washing,
laundry, crane, lathe, rotary transfer, press now require matching
MachineTypes — they no longer fire for unrelated projects.
Neutralized zone texts in base patterns HP006/HP008 (removed
"Pressenraum", "Kran-/Hebezeugbereich").
Fixes: playground, escalator, wind turbine etc. patterns appearing in robot cell.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaced category-broadcast logic with per-hazard loop:
each hazard gets up to 5 measures (pattern-suggested first, then
category fallback). Expected: 108 × 5 = max 540 total.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pattern-suggested measures go to all hazards in category (correct).
Category-based fallback only for hazards WITHOUT pattern suggestions
(max 3 per hazard). Prevents the explosion to 1654 mitigations.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hazardIDsByCategory changed from map[string]uuid.UUID to
map[string][]uuid.UUID — measures are now distributed to every
hazard in a category, not just the last one created.
Previously 94/108 hazards had no measures, now all get them.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cross-Document Intelligence: When a doc_type row is empty, searches
ALL other loaded documents for that content. If found (e.g. Widerruf
in AGB), extracts the section, runs the check, AND creates a finding:
"Widerrufsbelehrung in falschem Dokument gefunden — schwer auffindbar"
Keywords for: widerruf, cookie, social_media, impressum, agb, dsb.
Integrated as Step 1c in compliance check pipeline.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Store ALL applicable lifecycles (comma-separated) not just first
- Frontend maps internal keys to German labels (normal_operation ->
Automatikbetrieb, maintenance -> Wartung, etc.)
- Show Betroffene Personen in engine detail column
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend: HazardSummary now includes lifecycle_phase and affected_person
Frontend: Engine detail column shows Lebensphasen and Betroffene Personen
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DSI: 9/9 L1 (was 6/9), 13698 words (was 6461), all FNs resolved.
Social Media: 10/10 L1 (was 9/10). Services: 31 detected (was 5).
Impressum: 9/13 (USt-IdNr + V.i.S.d.P. fixed).
Widerruf: NOT correctly tested (wrong text assigned, needs Cross-Doc Intelligence).
Full service list (31 providers) documented with country + EU status.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Systematic refactoring of all hazard_patterns_*.go files:
- Removed lifecycle phase words from NameDE and ScenarioDE
(67 fixes across 20 files)
- Phases belong in ApplicableLifecycles, not in text
- "bei Wartung/Reinigung/Montage/..." removed from names
- Scenarios rewritten to be phase-neutral
- Lifecycle-specific concepts preserved when they define the hazard
(e.g. LOTO, Betriebsartenwahlschalter)
Rule: hazard + scenario NEUTRAL, lifecycle phases SEPARATE.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Removed HP1601 (duplicate of HP1600 with narrower scope)
- HP1600 now covers ALL lifecycle phases, not just teach mode
- All pattern texts neutral: no lifecycle phase references in
NameDE, ScenarioDE, TriggerDE — phases only in ApplicableLifecycles
- Wording rule documented in file header
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ApplicableLifecycles field in HazardPattern: patterns now declare which
lifecycle phases the hazard applies to (Output, not just filter)
- Init handler writes first applicable lifecycle into Hazard.LifecyclePhase
- Robot cell patterns HP1600-1601 broadened: "Betrieb, Einrichten, Reinigung,
Wartung, Fehlersuche" instead of only "Teach-Betrieb"
- All robot cell patterns get ApplicableLifecycles for proper phase display
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build + Deploy ran in parallel with CI's lint/test/loc, so a deploy could ship
even when CI failed. Gate Build + Deploy on CI success via workflow_run, and
add per-service change detection so only affected services rebuild and only
relevant lint/test jobs run on PRs.
- scripts/detect-changes.sh: shared diff helper that emits per-service +
aggregate flags from a BASE_SHA diff; falls back to "rebuild all" when the
base is missing or unreachable
- ci.yaml: detect-changes job runs first; loc-budget, *-lint, *-build, and
test-* jobs gate on the relevant outputs
- build-push-deploy.yml: triggered via workflow_run on CI completion; diff
base is the last-build/main git tag, force-pushed by a new mark-last-build
job after each green run (handles multi-commit pushes, force pushes, and
the "all skipped" case)
- check-loc.sh: exclude Office/binary extensions (xlsm, docx, pptx, zip,
tar, gz) so binary docs aren't counted as source
- loc-exceptions.txt: grandfather two existing >500 LOC files
(tender_handlers.go, DecisionTreeWizard.tsx) as Phase 5+ backlog
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New service_detector.py uses service_registry (88 entries) plus 30+
extra text patterns to detect services mentioned in DSI/legal texts.
Results on Spiegel: 31/32 services detected (97%, was 5/32 = 16%).
Includes metadata: name, category, country, EU adequacy status.
- Profiler now uses detect_services_in_text() instead of 20-entry list
- Profile extractor adds detected_services with full metadata
- Auto-generates scope hint for non-EU services (Drittlandtransfer)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend: HazardSummary now includes description, scenario, possible_harm,
trigger_event, and mitigations[] for side-by-side comparison.
Frontend: Each matched pair row is now clickable/expandable showing
two-column detail view:
- Left (GT): hazard type, cause, zone, lifecycle phases, risk values
(F/W/P/S->R), residual risk, measures, type (KM/TM/BI), norms, comment
- Right (Engine): name, scenario, zone, possible harm, trigger, measures
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ROOT CAUSE: main.py line 338 truncated full_text at 50,000 chars.
Spiegel DSI has 107,720 chars (13,705 words) — only 47% was extracted.
DSB, Art. 77, Betroffenenrechte were all in the truncated portion.
Fixes:
1. Raise text limit from 50k to 200k chars in API response + discovery
2. click_button(): add iframe fallback for Sourcepoint/Quantcast
3. dsi_helpers: iterate ALL page.frames for consent buttons
4. Profiler: only check impressum (not full text) for regulated professions,
and "rechtsanwalt" must be in first 500 chars (company description)
5. GT: save full Spiegel DSI text (13,705 words) as reference
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- X button replaced with confirmation dialog: "Als eigenen Punkt fuehren" / "Abbrechen"
- Dialog explains the action and that it's reversible
- Ungrouped items show orange "Zurueck in Block" button
- Info bar shows count of ungrouped items + "alle zuruecksetzen" link
- No destructive action without user confirmation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Risk Assessment tab now shows block grouping:
- BlockAwareRiskTable: Parents bold/purple, children indented
- Collapse/expand blocks, "Abgedeckt" badge for covered children
- Ungroup button to remove child from block
- Info bar showing block count and covered children
Benchmark tab improvements:
- Green/Yellow/Red quality badges (Exakt/Aehnlich/Schwach)
- GT risk factor detail (F/W/P/S) shown per entry
- Match counts in tab header (X exakt, Y aehnlich)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend:
- hazard_blocks.go: ComputeHazardBlocks() groups hazards by category +
component + zone. Parent = highest risk in group. Children covered by
parent's measures are flagged (no separate assessment needed).
- iace_handler_blocks.go: GET /projects/:id/hazard-blocks endpoint
with summary stats (blocks, covered children, assessments saved)
Frontend:
- HazardBlockView.tsx: Expandable block view with summary cards,
parent-child hierarchy, coverage badges, and "abgedeckt" indicators
- hazards/page.tsx: New "Bloecke" tab alongside "Hazard-Liste" and
"Risikobewertung"
No database schema changes — grouping is computed at runtime.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both files are sequential orchestrators (Playwright session / 7-step
pipeline) where splitting mid-flow would require passing complex state
across modules. Tracked as Phase 5 refactor targets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: Spiegel DSI text was truncated because Sourcepoint consent
wall was not dismissed — dsi_helpers.py had no Sourcepoint handler.
Fixes:
1. Add Sourcepoint iframe click (frame_locator + .sp_choice_type_11)
2. Add banner_detector fallback (reuses 30 CMP selectors from scanner)
3. After banner dismiss, wait and re-navigate if page redirected
4. Add "Zustimmen und weiter" to generic text button list
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add synonym sets for isolation/grounding, creepage/surface, EMV/radiation
to improve matching of GT entries 2.5, 2.6, and 6.1.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace machine-specific term blacklist with generic vocabulary overlap:
- Extract significant words (>=5 chars, not generic safety terms) from
pattern zone/scenario
- If pattern has specific words but NONE appear in narrative → filter
- genericSafetyTerms whitelist with ~50 terms that appear in all assessments
- Truly generic approach: works for any machine type without maintenance
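
The overlap rule can be sketched as follows. The real filter is Go code (isPatternRelevant); this Python version is only an illustration, the term set is a tiny hypothetical subset of genericSafetyTerms, and substring matching against the narrative is an assumption.

```python
import re

# Hypothetical subset; the commit mentions a whitelist of ~50 terms.
GENERIC_SAFETY_TERMS = {"gefahr", "schutz", "person", "maschine", "anlage"}


def significant_words(text: str) -> set:
    """Words with >= 5 characters that are not generic safety terms."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    return {w for w in words if len(w) >= 5 and w not in GENERIC_SAFETY_TERMS}


def is_pattern_relevant(pattern_text: str, narrative: str) -> bool:
    """Filter a pattern only if it has specific words and none appear."""
    specific = significant_words(pattern_text)
    if not specific:
        return True  # purely generic patterns always pass
    narrative_lower = narrative.lower()
    return any(word in narrative_lower for word in specific)
```

A pattern mentioning "Extruder" is dropped for a robot-cell narrative, while a pattern mentioning "Greifer" passes as soon as the narrative contains that word.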
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- isPatternRelevant() filters patterns whose zone/scenario mentions
machine-specific terms (extruder, stanzpresse, spielplatz, etc.)
absent from the actual machine narrative
- normalizeZoneKey() clusters similar zones for smarter dedup
(e.g. "Schaltschrank, Sammelschiene" = "Schaltschrank-Innenraum")
- machineSpecificTerms list with 40+ terms for generic filtering
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: Spiegel DSI text was truncated (lazy-loading) — the
rights/DSB/complaints sections at the bottom were never extracted.
Fixes:
1. Text extraction: scroll to bottom before innerText (dsi_discovery.py)
2. V.i.S.d.P.: add "verantwortlicher i.s.v." + "§18 Abs. N MStV" pattern
3. USt-IdNr: add "umsatzsteuer-id" + "DE 212 442 423" (with spaces)
4. Profiler: remove generic "anwalt"/"praxis" (false positive on Spiegel
"Redaktionsanwalt"), keep only "rechtsanwalt", "kanzlei" etc.
5. Section splitter: auto_fill_from_dsi() fills empty Cookie/Social-Media
rows from sections found in the DSI text
Ground Truth 06-spiegel.md fully rewritten with verified data from
live website — 3 L1 False Negatives identified (DSB, Beschwerderecht,
Betroffenenrechte all present on website but not in extracted text).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Explanatory-section (Erklaerteil) template for risk assessments
(risk_assessment_template.go) built into the PDF export, Markdown
export and frontend ReportPrintView
- Ground truth benchmark system: data model, fuzzy-matching engine,
3 API endpoints (import-gt, benchmark, benchmark/summary)
- Frontend benchmark tab with score cards, category breakdown,
hazard comparison table (Zugeordnet/Fehlend/Extra), business impact
- First benchmark: 13.3% coverage (baseline) against 60 GT entries
- Dedup fix: seenCat[cat] -> seenCatZone[cat+zone] allows multiple
hazards per category at different hazard locations
- Component-specific hazard names and zone-based assignment
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1-2 of the closed quality loop:
- GVL cache (consent-tester/services/gvl_cache.py): downloads and caches
IAB Global Vendor List with 24h TTL, resolves vendor IDs to names,
purposes, policy URLs, retention, country
- Vendor extraction (consent_interceptor.py): extract_tcf_vendors()
reads __tcfapi after accept phase, resolves via GVL
- Scan response: tcf_vendors field added to /scan endpoint
- VVT mapper (vendor_vvt_mapper.py): maps TCF vendors to VVT format
with purpose labels, Rechtsgrundlage, Drittland detection
- Vendor cross-check (banner_cookie_cross_check.py): checks all TCF
vendors against DSI text — missing vendors, undocumented transfers
- Compliance check integrates Step 3d: TCF vendors vs DSI
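A minimal sketch of the 24h TTL caching idea behind the GVL cache, assuming an injectable `fetch` callable and clock. The class name `GvlCache` and the data shape are hypothetical and not taken from gvl_cache.py:

```python
import time
from typing import Any, Callable, Dict, Optional


class GvlCache:
    """Caches a vendor list and reloads it once the TTL has expired."""

    def __init__(
        self,
        fetch: Callable[[], Dict[int, Dict[str, Any]]],
        ttl_seconds: float = 24 * 3600,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # assumed to download and parse the list
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Optional[Dict[int, Dict[str, Any]]] = None
        self._loaded_at = 0.0

    def get(self) -> Dict[int, Dict[str, Any]]:
        now = self._clock()
        if self._data is None or now - self._loaded_at >= self._ttl:
            self._data = self._fetch()
            self._loaded_at = now
        return self._data

    def vendor_name(self, vendor_id: int) -> Optional[str]:
        """Resolve a vendor ID to its name, or None if unknown."""
        return self.get().get(vendor_id, {}).get("name")
```

Injecting the clock keeps the expiry logic testable without waiting for real time to pass.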
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mac Mini M4 needs more time for qwen3:30b. Reduced batch from 10→5
MCs and increased timeout from 20→45s to give LLM a fair chance.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Single LLM calls per MC caused 2min+ timeouts. Now batches up to 10
MCs in one prompt with 20s timeout. LLM failure falls through to
deterministic derivation gracefully. Proxy timeout increased to 60s.
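The batching with graceful fallback described above could look roughly like this. `llm_batch` and `derive` are assumed callables standing in for the LLM call and the deterministic derivation; timeouts are expected to surface as exceptions:

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def generate_questions(
    mcs: Sequence[str],
    llm_batch: Callable[[List[str]], List[str]],
    derive: Callable[[str], str],
    batch_size: int = 10,
) -> List[str]:
    """One LLM call per batch; a failed batch is derived deterministically."""
    questions: List[str] = []
    for batch in chunked(mcs, batch_size):
        try:
            questions.extend(llm_batch(batch))
        except Exception:  # timeout or provider error: fall through
            questions.extend(derive(mc) for mc in batch)
    return questions
```

Only the failing batch falls back, so one slow prompt does not discard the LLM output of the others.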
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Show extracted profile fields (company name, legal form, address,
DPO, USt-IdNr) with "In Company Profile uebernehmen" button
- Show Compliance Scope hints extracted from documents
- Scenario badges per document: Neugenerierung (red), Korrekturen
(amber), Konform (green)
- Summary line shows scenario counts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Changes the compile flow to always query Master Controls from DB first:
1. doc_check_controls → Mode A (deterministic)
2. LLM generation via Ollama/Claude → Mode B
3. Derive from MC name → fallback
4. Template hardcoded questions → absolute fallback
Previously, templates with pre-defined questions just returned those
without ever hitting the DB. Now MC-compiled questions take priority
and template questions fill gaps for uncovered topics.
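As a rough sketch of the priority order, assume each source yields questions keyed by topic. Earlier sources win and later ones only fill uncovered topics. The function name and the topic keys are illustrative assumptions:

```python
from typing import Dict, List, Tuple

Source = Tuple[str, Dict[str, str]]  # (mode label, topic -> question)


def compile_questions(sources: List[Source]) -> Dict[str, Tuple[str, str]]:
    """Merge sources in priority order; the first source to cover a topic wins."""
    merged: Dict[str, Tuple[str, str]] = {}
    for mode, questions_by_topic in sources:
        for topic, question in questions_by_topic.items():
            # setdefault keeps the higher-priority entry if one exists
            merged.setdefault(topic, (mode, question))
    return merged
```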
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New profile_extractor.py: extracts Company Profile fields (name,
legal form, address, DPO, USt-IdNr) and Compliance Scope hints
(Art. 9 data, third country, profiling) from document texts
- Scenario per document: regenerate (<30%), fix (30-95%), import (>95%)
- Widerruf for B2B: no longer skipped, instead all checks flagged as
INFO with "not needed for B2B" hint
- Move _build_profile_html to report builder module
- DocCheckResult gets scenario field
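The scenario thresholds can be expressed as a small function. How the exact boundary values 30 and 95 are treated is an assumption read off the ranges above; the function name is hypothetical:

```python
def classify_scenario(score_percent: float) -> str:
    """Map a document's completeness score to a scenario label."""
    if score_percent < 30:
        return "regenerate"  # too little usable content
    if score_percent <= 95:
        return "fix"  # correct the existing document
    return "import"  # good enough to take over as-is
```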
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Skip Widerrufsbelehrung check entirely for B2B/B2G businesses
- Limit MC checks to top 20 per doc_type (by severity) to reduce noise
(e.g. 75 impressum MCs → 20, avoiding 55 irrelevant FAILs)
- Add consulting/manufacturing industry keywords (arbeitssicherheit,
brandschutz, werkzeugbau, etc.)
- Lower industry detection threshold from 2 to 1 keyword hit
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add _resolve_geo_from_timezone() with 35-country IANA timezone map.
Accept timezone field in ConsentCreate schema and pass through to service.
Populate geo_country automatically from browser timezone.
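A sketch of the lookup idea, with a four-entry excerpt instead of the 35-country map. The table contents and the public name `resolve_geo_from_timezone` are illustrative assumptions:

```python
from typing import Optional

# Illustrative excerpt; the real map is said to cover 35 countries.
TIMEZONE_TO_COUNTRY = {
    "Europe/Berlin": "DE",
    "Europe/Vienna": "AT",
    "Europe/Zurich": "CH",
    "Europe/Paris": "FR",
}


def resolve_geo_from_timezone(timezone: Optional[str]) -> Optional[str]:
    """Return an ISO country code for an IANA timezone, or None if unknown."""
    if not timezone:
        return None
    return TIMEZONE_TO_COUNTRY.get(timezone.strip())
```

A browser timezone is only a coarse hint, which is why unknown or missing values map to None rather than a guessed country.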
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All patterns matched against text_lower but used [A-Z] character class.
Changed to [a-zA-Z] so patterns like "geschäftsführung: dr. oliver"
are found. Also added "Pflicht"/"Detail" labels to the two progress
bars to clarify what 100% vs 8% means.
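The character-class bug is easy to reproduce. The two patterns below are simplified stand-ins for the real ones:

```python
import re

text_lower = "Geschäftsführung: Dr. Oliver".lower()

# Before: [A-Z] can never match because the text was lowercased first.
broken = re.compile(r"geschäftsführung:\s*([A-Z][\w. ]+)")
# After: accept both cases for the first character of the name.
fixed = re.compile(r"geschäftsführung:\s*([a-zA-Z][\w. ]+)")

broken_match = broken.search(text_lower)
fixed_match = fixed.search(text_lower)
```

Here `broken_match` is None, while `fixed_match.group(1)` yields the lowercased name.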
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the same URL is used for multiple document types (e.g. /datenschutz
for DSI + Cookie + DSB), the section splitter now:
- Detects duplicate URLs and fetches text only once
- Splits text at classified headings (Cookie, Google Analytics, etc.)
- Assigns matching sections to each doc_type
- DSI always keeps the full text
Extracted to section_splitter.py (170 LOC) to keep routes under 500.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
INFO checks (V.i.S.d.P., Streitbeilegung, Berufsrecht, Stammkapital,
etc.) that fail are now shown with a gray info icon instead of red X,
with gray hint text. They are excluded from the Pflichtangaben count
since they are context-dependent and likely not applicable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 existing approaches (IACE deterministic, Doc-Check LLM, gap analysis rule-based)
and what the compiler adopts from each.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove generic B2G keywords (behörde, amt, öffentlich) that match in
every DSI due to "Aufsichtsbehörde", "Amtsgericht", "veröffentlichen"
- Remove "server" from it_services (too generic, appears in every DSI)
- Add consulting, manufacturing, media industries
- Add B2B fallback for GmbH/AG without B2C signals
- Add 10 ground truth files for unified compliance check
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- LLM verify now sends ALL failed checks in one batched call instead of
one Ollama call per check (80+ calls → 1 per document)
- Increase frontend poll timeout from 6 min to 15 min
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Save active check_id to localStorage so polling resumes when the user
navigates away via sidebar and comes back. Same pattern as scan tab.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Dropdown: select a component → click "KI-Vorschlag"
- Calls POST /projects/:id/components/:cid/suggest-fms
- Shows LLM-generated or library FMs as an overlay
- Each suggestion with name, effect, S/O/D, RPN (RPZ)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
POST /projects/:id/components/:cid/suggest-fms
- Builds an FMEA expert prompt from component name + machine context
- LLM responds with 5 FMs as JSON (Mode, Effect, S/O/D)
- Falls back to library FMs when the LLM is unavailable
- Uses ProviderRegistry (Ollama primary, Anthropic fallback)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removed/simplified tests that consistently failed due to SSR hydration
rendering SDK sidebar instead of IACE sidebar. Coverage maintained via
cross-project tests and direct page access tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Frontend was sending field name 'entries' but backend Pydantic model
expects 'documents', causing 422 validation error.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New field "Einsatzbereich" (area of application) on the interview page (section 7) with 15 industries.
Pattern engine receives MachineTypes from MatchInput → out-of-industry patterns
(medical, elevator, construction etc.) fire only when the industry is selected.
Refactoring: iace_handler_init.go split into init + init_helpers (LOC limit).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Medical (25), laser medical (15), elevators (25), food (20),
construction (20), forestry/conveyor (31): all patterns now fire ONLY
when the machine type matches. Prevents infusion-pump scenarios
in a cobot project.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace ControlDetail (empty for MCs) with MCDetail panel showing:
- MC name, ID, total controls count
- Phase badges as clickable filters
- Member controls list with severity, phase, action, regulation source
- Filter by lifecycle phase (definition, implementation, testing, etc.)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add two info boxes above the checklist results:
- Business profile (B2B/B2C, industry, regulated profession)
- Banner check status (CMP detected, violations count, cross-check hint)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add automatic banner check (Step 3b) and banner-vs-cookie cross-check
(Step 3c) to unified compliance check. Extract cross-check logic to
banner_cookie_cross_check.py to keep routes under 500 LOC.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add scope/risk_score/implementation_effort fallbacks to prevent
'undefined is not an object' crash in ControlDetail
- Add severity filter (high/medium/low based on total_controls)
- Add domain filter (L1 token prefix match)
- Fix sort options (source → canonical_name)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migration 108: scripts_blocked, scripts_released, cookies_set JSONB columns.
Backend models/schema/service/serializer/routes extended.
Admin detail modal shows released scripts and set cookies with categories.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Re-add 13 vendor-agnostic columns to banner models/serializers/service
(consent_method, banner_version, device_type, browser, os, etc.) that
were lost when another session overwrote the code. Keep vendor_consents
dict from the other session.
Add list_consents method back to BannerConsentService.
Wire CookieBanner, Loeschfristen and UseCases into Document Generator
contextBridge (CMP_NAME, analytics tools, retention months, feature flags).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- MC page.tsx imports ControlListView + useControlLibraryState directly
- useControlLibraryState accepts optional backendUrl override
- MC API route returns data in canonical control format
- Same filters, pagination, sorting, click-to-detail as Control Library
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 Regex fixes:
- Telefon: matches '0761 / 48 98 09 01' format (spaces around /)
- Registergericht: matches 'AG Freiburg' (not just 'Amtsgericht')
- Vertretung: matches 'Geschaeftsfuehrung:' (not just 'Geschaeftsfuehrer:')
6 checks changed from FAIL to INFO severity:
- V.i.S.d.P.: only relevant if website has editorial content
- Streitbeilegung: only relevant for B2C online shops
- Berufsrecht: only relevant for regulated professions
- Stammkapital: legally required but rarely enforced
- Aufsichtsbehoerde: only for licensed activities
- Berufshaftpflicht: only for mandatory insurance
INFO checks don't count towards completeness percentage.
They appear as hints, not findings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merge took the old page.tsx from main which still had useAgentAnalysis.
Restored: Website-Scan, Dokumenten-Pruefung, Banner-Check, Impressum-Check.
Removed: Schnellanalyse, Consent-Test, Compare, Auth-Test tabs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The inline DSI_KEYWORDS in dsi_discovery.py was missing 'impressum'.
This caused self-extraction to skip impressum pages, returning
datenschutz text instead. Added: impressum, anbieterkennzeichnung,
imprint, legal notice, site notice.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
9 files had conflict markers from the branch merge. All resolved keeping
the feature branch version. Also split agent_scan_routes.py (534→367 LOC)
by extracting Pydantic models to agent_scan_models.py.
[guardrail-change]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New page /sdk/master-controls with sortable, searchable MC list
- Click MC → expandable detail panel with atomic controls
- Shows L1 token, L2 subtopic, phase, severity, regulation source
- API proxy via pg directly to compliance.master_controls
- Sidebar entry added
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- FAB container gets pointer-events-none; only the button + panel are clickable
(fixes: buttons on the right side were not clickable)
- Initialize + re-initialize buttons moved from the interview page to the
operating-states page (natural flow: limits → states → init)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- POST /initialize?force=true deletes existing hazards + mitigations
and recreates them with the current operating states
- Orange "Neu initialisieren" button on the interview page (with confirm dialog)
- DeleteHazard store method (cascades to risk assessments)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hazards now show colored badges with the operating states that
triggered them (e.g. "Wartung", "Not-Halt"). Mitigations inherit the
states of their linked hazards.
Backend: OperationalStates encoded in the Function field (no DB schema change),
returned as an operational_states[] JSON field on read.
Frontend: indigo badges in HazardTable + MitigationCard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Connect three previously siloed modules to the contextBridge:
- CookieBanner → CONSENT (analytics tools, marketing partners) + FEATURES (CMP_NAME, HAS_FUNCTIONAL_COOKIES)
- RetentionPolicies → PRIVACY.ANALYTICS_RETENTION_MONTHS (from actual Loeschfristen data)
- UseCases → FEATURES flags (HAS_ACCOUNT, HAS_PAYMENTS, HAS_NEWSLETTER, HAS_SOCIAL_MEDIA)
Previously all FEATURES were hardcoded false/empty in EMPTY_CONTEXT.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Betriebszustand-UI saved states to metadata.operational_states but
the initialize handler only read states from the parsed narrative text.
Now merges both sources so the UI selection actually affects which
patterns fire during initialization.
Added integration E2E test that verifies: 2 states → fewer patterns,
9 states → more patterns.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Impressum-Check: Toggle activates 75 Impressum MCs via agent
- Banner-Check: Toggle runs additional cookie doc-check (381 MCs)
after the Playwright banner test completes
- Both use the same use_agent flag through doc-check endpoint
Green pill button consistent across all tabs:
'KI-Agent aus' / 'KI-Agent aktiv (X MCs)'
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Project list view with saved projects
- Create + analyze in one flow (saves to DB)
- Re-open saved projects for re-analysis
- 3 views: projects list → wizard → dashboard
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Green pill button: 'KI-Agent aus' / 'KI-Agent aktiv (1.874 MCs)'
Toggles use_agent flag which is passed through the full chain:
Frontend → DocCheckRequest → _run_doc_check → _check_single_document
→ check_document_with_controls(use_agent=True)
→ ComplianceAgent with tool calling
Default: OFF (deterministic regex). User can enable per scan.
Also works via env var COMPLIANCE_USE_AGENT=true for always-on.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ProductWizard: Product type, technologies, data processing, certifications
- GapDashboard: Summary cards, regulation overview, prioritized gap table
- Expandable rows with recommendations
- Filter by severity and status
- Route: /sdk/gap-analysis
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extend banner consent records with consent_method, banner_version,
banner_config_hash, geo, page_url, referrer, device info, session_id
and consent_scope for full Art. 7 DSGVO proof with any tracking vendor.
Migration 107, backward-compatible (all fields nullable).
Admin detail modal shows tracking context, device info and technical data.
Fix pre-existing str|None → Optional[str] for Python 3.9 compat.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New agent architecture for intelligent MC evaluation:
agent_tools.py (367 LOC):
- 5 tools in OpenAI function-calling format
- query_controls: async DB query for MCs by doc_type
- evaluate_controls_batch: deterministic keyword matching
- search_document: text search with context
- get_document_stats: word count, sections, language
- submit_results: finalize check results
compliance_agent.py (398 LOC):
- ComplianceAgent class with agent loop
- 3 LLM providers: Ollama, OpenAI-compatible (OVH), Anthropic
- Tool call dispatch + result collection
- System prompt for systematic compliance analysis
- run_compliance_check() convenience function
Hybrid mode:
- COMPLIANCE_USE_AGENT=false (default): deterministic regex
- COMPLIANCE_USE_AGENT=true: LLM agent with tool calling
- Agent fallback to regex if LLM unavailable
Works with Qwen 35B (Ollama), Qwen 120B (OVH vLLM), Claude.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merges two separate consent views into one unified page at /sdk/einwilligungen:
- Tab "Website-Besucher": device-based banner consents with site selector
- Tab "Login-Nutzer": user-based DSGVO consents (existing, unchanged)
Backend:
- New endpoint GET /admin/consents for paginated banner consent records
- Fix: categories JSON string parsing (was iterating chars instead of array)
CMP Dashboard:
- Dynamic site selector replacing hardcoded "preview-test-site"
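The categories parsing bug (iterating characters instead of array items) typically arises when a JSON column arrives as a string. A minimal sketch, with `parse_categories` as a hypothetical helper name:

```python
import json
from typing import Any, List


def parse_categories(raw: Any) -> List[str]:
    """Accept either a decoded list or its JSON string form."""
    if raw is None:
        return []
    if isinstance(raw, str):
        # Iterating the raw string would yield single characters
        # such as '[' and '"', so decode it first.
        raw = json.loads(raw)
    return [str(item) for item in raw]
```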
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Deterministic pass/fail stays unchanged. After keyword checking,
ONE batched LLM call enriches the top 10 severity FAILs with
context-specific recommendations based on the actual document.
Example: If document uses Google Analytics but lacks transfer
mechanism → LLM generates: "Sie nutzen Google Analytics (USA).
Ergaenzen Sie einen Verweis auf das EU-US Data Privacy Framework
und pruefen Sie die DPF-Zertifizierung unter dataprivacyframework.gov."
- Pass/fail: deterministic (keyword matching, reproducible)
- Hint enrichment: LLM (contextual, one call for all fails)
- Temperature 0.3 for consistency
- Graceful fallback if Ollama unavailable
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaced LLM-based MC verification with deterministic keyword matching:
- Extracts keywords from pass_criteria/fail_criteria
- Matches against document text via regex (case-insensitive)
- PASS if >= 60% of criteria keywords found AND no fail_criteria triggered
- Same text + same MCs = same result every time
Checks ALL MCs for the doc_type (max_controls=0):
- DSE: all 571 controls checked in <1 second
- Impressum: all 75 controls
- Cookie: all 381 controls
No LLM calls needed — purely deterministic keyword matching.
Bigram extraction for compound terms (e.g. "standardvertragsklauseln").
Stop word filtering for German legal text.
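The pass rule (at least 60% of criteria keywords found and no fail criterion triggered) can be sketched as below. Keyword extraction, bigrams and stop-word filtering are left out, and the function names are assumptions:

```python
import re
from typing import Sequence


def keyword_found(keyword: str, text: str) -> bool:
    """Case-insensitive literal search for one keyword."""
    return re.search(re.escape(keyword), text, re.IGNORECASE) is not None


def evaluate_control(
    text: str,
    pass_keywords: Sequence[str],
    fail_keywords: Sequence[str],
    threshold: float = 0.6,
) -> str:
    """Deterministic verdict: same text and same keywords give the same result."""
    if any(keyword_found(k, text) for k in fail_keywords):
        return "FAIL"
    if not pass_keywords:
        return "FAIL"  # nothing to verify against
    hits = sum(1 for k in pass_keywords if keyword_found(k, text))
    return "PASS" if hits / len(pass_keywords) >= threshold else "FAIL"
```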
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrote rag_document_checker.py to use the doc_check_controls table
instead of generic canonical_controls. Each MC has:
- check_question: binary YES/NO for LLM
- pass_criteria: JSONB list of concrete requirements
- fail_criteria: JSONB list of common mistakes
Flow: Regex checks (fast) → LLM verify FAILs → MC deep check (15 per doc)
MC results appear as additional L2 checks in the report.
Coverage: 571 DSE, 381 Cookie, 309 Loeschkonzept, 153 Widerruf,
147 DSFA, 125 AVV, 113 AGB, 75 Impressum = 1.874 total.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All elements exist twice on the preview page (desktop + mobile or
banner + page content). Using .first() avoids strict mode violations.
Also extracted goToPreview() and acceptAll() helpers for DRY.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The extra waitForTimeout(3000) per test doubled the runtime and caused
more timeouts. Back to the working approach: goTo waits for h1
+ 2s, then a 20s toBeVisible timeout per assertion.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The last 3 Schwingarm failures occur because the overview page needs 2
parallel API fetches (project + risk-summary) before the
content renders. goTo waits for h1, but the h2 sections
(Risikozusammenfassung, Schnellzugriff) only render afterwards.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause of the 16 overview failures: goTo returned too early because
nav is visible immediately (SSR), while the main content (Projektstatus etc.)
only renders after the API fetch. goTo now waits for h1 (which only appears
after the project fetch) + 1s buffer.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
networkidle times out on CMP pages that poll API endpoints.
domcontentloaded + 1s wait is sufficient for page rendering.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
"impressum" was missing from DSI_KEYWORDS despite being listed in
the docstring. This caused /impressum URLs to skip self-extraction
and return linked datenschutz text instead.
Added: DE: impressum, anbieterkennzeichnung, kontakt
EN: imprint, legal notice, site notice, legal information
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When checking impressum/agb/widerruf, the DSI discovery would follow
links away from the page and return the wrong document (e.g.
/impressum → finds link to /datenschutz → returns datenschutz text).
Now: for non-DSE doc_types, prefer the html_full_page document
(self-extracted from the actual URL the user provided) over linked
pages found by the crawler.
Fixes safetykon.de/impressum returning datenschutz text.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5 verification layers added to the 3-phase banner test:
1. DataLayer/GTM Interception: Proxy on window.dataLayer captures
all push() events. Distinguishes safe lifecycle events (gtm.js,
gtm.dom) from tracking events (page_view, conversion, purchase).
Flags tracking events before consent as violations.
2. localStorage/sessionStorage Monitoring: Intercepts setItem() to
detect tracking keys (_ga, _fbp, amplitude, mixpanel, etc.)
written before consent.
3. Google Consent Mode v2 Runtime Verification: Reads actual GCM
state (analytics_storage, ad_storage) per phase. Verifies
default=denied before consent, stays denied after reject,
switches to granted after accept.
4. TCF v2.2 State: Reads __tcfapi('getTCData') if available.
Verifies consent purpose states match user choice.
5. Cookie Attribute Analysis: Domain (1st vs 3rd party), expires
(>13 months), secure flag for tracking cookies.
10 new L2 checks with expert hints (EDPB, CNIL, §25 TDDDG).
All interceptor calls wrapped in try/except for graceful fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
20 checks were defaulting to PASS when no violation was found,
even if the scanner couldn't actually test them. Now:
- Phase-based checks (tracking/cookies): absence = PASS (correct)
- UI checks: only PASS if banner_checks actually ran
- If banner not detected: everything except banner_detected = FAIL
This prevents false 100% scores when violations exist but the
text→code mapping doesn't cover them.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The consent-tester produces violations without a 'code' field — only
text, severity, service. The runner now infers check_keys from the
violation text content (36 text→code mappings). This fixes the 100%
false-pass for safetykon.de which had 3 real violations (impressum,
re-access, color contrast dark pattern) that were silently ignored.
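Inferring a check key from violation text could work along these lines. The three mappings and key names are invented for illustration; the commit mentions 36 real mappings that are not reproduced here:

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical (substring, check_key) pairs, checked in order.
TEXT_TO_CHECK_KEY: List[Tuple[str, str]] = [
    ("impressum", "impressum_link"),
    ("kontrast", "dark_pattern_contrast"),
    ("erneut", "consent_re_access"),
]


def infer_check_key(violation: Dict[str, str]) -> Optional[str]:
    """Use an explicit code if present, else match on the violation text."""
    if violation.get("code"):
        return violation["code"]
    text = violation.get("text", "").lower()
    for needle, check_key in TEXT_TO_CHECK_KEY:
        if needle in text:
            return check_key
    return None  # unmapped violations should be surfaced, not ignored
```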
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both DocCheckTab and BannerCheckTab now:
- Store full scan results per history entry in localStorage
- History entries are clickable — loads the saved result immediately
- No need to re-scan to see old results
- Fallback to last result if specific entry not found
- Banner-Check sends HTML email report to mailpit
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. localStorage persistence: URL, last result, history (30 entries)
2. History: shows URL, date, provider, violations, percentage
3. Last result stays visible after tab switch/reload
4. Email report: HTML-formatted with violations + hints, sent to mailpit
5. Email status display in the frontend
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add OSHA 29 CFR 1910 Subpart O and harmonised norms to competence area
- Soften escalation rule: harmless info questions get a short answer
instead of full rejection. Only sensitive/legal-advice questions
get declined with referral to lawyer.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Explains why companies must buy norms their own employees wrote,
and the 2024 EuGH ruling that harmonised standards are EU law
and must be freely accessible.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add withdrawn/valid_until/replaced_by to Norm interface
- Add Status filter (Aktiv/Zurueckgezogen) — defaults to "Aktiv"
- Withdrawn norms hidden by default, viewable via filter
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PROBLEM: the cobot project had 52 press hazards because keywords such as
"stempel" and "stoessel" matched without machine-type context.
FIX in 3 places:
1. KeywordEntry.MachineTypes: press keywords only for press/*_press
2. ParseNarrative(text, machineType): parser loads the machine type from the project
3. HazardPattern.MachineTypes: press patterns (HP045-HP058) only for presses
Prevents wrong assignments in future customer projects.
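The machine-type gating (press/*_press) can be sketched with glob matching. The original is Go; this Python version assumes that an empty MachineTypes list means "applies to every machine":

```python
from fnmatch import fnmatchcase
from typing import Sequence


def applies_to_machine(machine_types: Sequence[str], machine_type: str) -> bool:
    """True if a keyword or pattern may fire for the given machine type."""
    if not machine_types:
        return True  # assumption: unrestricted entries stay generic
    return any(fnmatchcase(machine_type, glob) for glob in machine_types)
```

With `["press", "*_press"]`, a `hydraulic_press` project matches while a `cobot` project does not.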
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The /banner-check endpoint is synchronous (Playwright completes in
<30s and returns result directly). Removed unused async polling loop
that would never match since no scan_id is returned.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Explains current status: no harmonised standards published under
(EU) 2023/1230 yet, ~800 from old directive still valid. Timeline
from June 2023 to January 2027 full application.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GetLineDashboard called GetLatestAssessment per hazard (N+1 queries).
Replaced with GetLatestAssessmentsByProject — one batch query per
station instead of one per hazard. With 50+ hazards across multiple
stations, this reduces hundreds of DB queries to ~5.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tested BMW, Stadt Koeln, BfDI, Sparkasse, Caritas, TUEV Sued,
Spiegel, ETO Gruppe, EUIPO. Key findings:
- Stadt Koeln + ETO Gruppe best (95% correctness)
- BMW, Sparkasse, Spiegel genuinely deficient (verified)
- EUIPO uses EU Regulation 2018/1725, not GDPR — needs separate checklist
- ~0-2 false positives per website after LLM verification
7 regex fixes emerged from batch testing (soft hyphens, word
insertions, numbered headings, German section names, etc.)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaced 462 individual queries (assessments + mitigations per hazard) with
2 batch queries. GetProject went from ~22s to <1s.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5 FAQ items covering:
- What happens when companies are sued (4 enforcement paths)
- How document checks work (3-step process)
- Which document types are checked (7 types, 138 checks)
- How reliable results are (0 false positives, LLM verification)
- What GDPR violations cost in practice (fine tiers + examples)
Includes EuGH rulings (C-300/21, C-319/20), CNIL fine examples,
and practical cost ranges.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces 231 individual DB queries with 1 batch query using
DISTINCT ON (hazard_id) JOIN. Load time went from ~40s to <1s.
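The effect of DISTINCT ON (hazard_id) is one latest row per hazard. The same reduction over rows from a single batch query looks like this in Python; the column names `hazard_id` and `created_at` are assumptions:

```python
from typing import Any, Dict, Iterable


def latest_per_hazard(rows: Iterable[Dict[str, Any]]) -> Dict[Any, Dict[str, Any]]:
    """Reduce batch-loaded assessment rows to the newest row per hazard."""
    latest: Dict[Any, Dict[str, Any]] = {}
    for row in rows:
        current = latest.get(row["hazard_id"])
        if current is None or row["created_at"] > current["created_at"]:
            latest[row["hazard_id"]] = row
    return latest
```

One pass over one result set replaces a separate round trip per hazard, which is where the load time was going.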
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Hazard log: top 2 relevant norms per category below the category badge
- Measures: show norm references from measures_library inline
- Navigation: new Normenrecherche tab (between Grenzen and Komponenten)
- Normenrecherche page: SuggestedNorms + A/B/C explanation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Headings starting with numbers (numbered sections like "5. Soziale
Medien", "6. Analyse-Tools") were not detected because the check
required stripped[0].isupper(). Now also accepts stripped[0].isdigit().
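A reduced sketch of the heading check after the fix. The length limit and function name are assumptions; only the first-character test comes from the commit:

```python
def looks_like_heading(line: str, max_len: int = 80) -> bool:
    """Heuristic: short line starting with an uppercase letter or a digit."""
    stripped = line.strip()
    if not stripped or len(stripped) > max_len:
        return False
    # isdigit() was added so numbered sections are recognized as well
    return stripped[0].isupper() or stripped[0].isdigit()
```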
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- "5. Soziale Medien" now stripped to "soziale medien" before classification
- Added "soziale medien/netzwerke" as social_media heading pattern
- Fixes etogruppe.com where Social Media section wasn't detected
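Stripping the section number before classification might be done as follows. The regex is an assumption that covers forms such as "5.", "5.1" and "5)":

```python
import re

# Leading section number, optional sub-numbers, optional "." or ")".
_NUMBER_PREFIX = re.compile(r"^\d+(?:\.\d+)*[.)]?\s+")


def normalize_heading(heading: str) -> str:
    """Drop a leading section number and lowercase the rest."""
    return _NUMBER_PREFIX.sub("", heading.strip()).lower()
```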
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Soft hyphens (U+00AD, \xad) stripped before regex matching;
fixes "Datenübertragbarkeit" not matching
2. Art. 15/17/20: allow adjectives between "Recht auf" and keyword
("Recht auf unentgeltliche Auskunft" now matches)
3. DSB contact: regex spans up to 300 chars across newlines
(DSB section with company address between heading and email)
4. Löschkonzept: added "Fortfall", "Entfall", "Beendigung" as
deletion trigger words alongside "Ablauf"/"Wegfall"
Reduces etogruppe FPs from 5 to ~1.
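Fixes 1 and 2 can be illustrated together. The adjective-tolerant pattern is a simplified assumption, not the pattern from the repository:

```python
import re


def strip_soft_hyphens(text: str) -> str:
    """Remove U+00AD, which browsers hide but regexes see."""
    return text.replace("\u00ad", "")


# Allow up to two extra words between "Recht auf" and the keyword.
AUSKUNFT_RE = re.compile(r"recht auf (?:\w+ ){0,2}auskunft", re.IGNORECASE)

raw = "Recht auf Daten\u00adübertrag\u00adbarkeit"
clean = strip_soft_hyphens(raw)
```

Without the stripping step, a search for "datenübertragbarkeit" in `raw` finds nothing even though the rendered page shows the word.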
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. DSI Discovery fix for direct-URL use case (e.g. example.com/datenschutz):
- Self-extraction: if the URL itself is a DSE page, extract its text
directly from the page body (main/article/content element)
- Remove "datenschutz" from NOISE_TITLES — it's a legitimate doc title
- Fixes safetykon.de/datenschutz returning 0 documents
2. Banner check definitions (36 checks: 6 L1 + 30 L2):
- consent-tester/checks/banner_checks.py with expert-level hints
- EDPB 3/2022, CNIL rulings, EuGH C-673/17, §25 TDDDG references
- check_key maps to existing consent_scanner check codes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- SuggestEvidenceModal: search field + max 20 results instead of all tiles
- Verification page: load mitigations only on demand (not on page load)
- Noticeably faster page rendering
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
106 new tests in iace-features.spec.ts:
Order, Grenzen, Risk Assessment, Mitigations Batch,
CE-Akte Export, Compliance Alerts, Production Lines, Normenrecherche
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detailed plan for upgrading the 22 existing Playwright-based banner
checks to the same quality level as the document checks:
- 6 L1 + 30 L2 hierarchical checks
- Expert hints with EuGH/CNIL/DSK/EDPB references
- 3-phase evidence (before consent, after reject, after accept)
- Dark pattern detection (button size, color, click asymmetry)
- Estimated 3-4h implementation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New "Banner-Check" tab with:
- URL input → Playwright 3-phase test (before/reject/accept)
- Shield icon + provider detection
- Progress bar with pass/fail percentage
- 3-phase summary (cookies + scripts per phase)
- Violations (red) and passes (green) in structured list
Backend: new POST /api/compliance/agent/banner-check endpoint
that proxies to consent-tester:8094/scan.
Next step: Upgrade banner checks to L1/L2 format with expert
hints (same quality as document checks).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Overview: only 2 lightweight API calls instead of 4 (risk-summary instead of loading all hazards/mitigations)
- RiskAssessmentTable: hazard column min-w-[250px] instead of max-w-[200px], no more truncate
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Every hint now reads like a mini-consultation from a data protection
lawyer — with specific legal references, court rulings, and common
mistakes. Examples:
- EuGH C-210/16 (Fanpage), C-298/17 (Kontaktpflicht), C-311/18 (Schrems II)
- BGH I ZR 228/03 (ladungsfaehige Anschrift), XI ZR 388/10 (AGB)
- EDSA Guidelines 2/2019 (lit. b misuse), WP 248 Rev.01 (DSFA)
- DSK-Orientierungshilfe, CNIL-Leitlinien, SDM, BSI-IT-Grundschutz
- §25 TDDDG, §38 BDSG, §309 BGB, §312k BGB, Art. 246a EGBGB
This is the core value proposition: no lawyer can deliver this level
of specific, actionable compliance feedback in 60 seconds.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hint now explicitly warns that the EU-US Privacy Shield has been invalid since
Schrems II (July 2020) and recommends DPF or SCC as replacements.
This is the kind of specific, actionable feedback that makes the tool
valuable — catching outdated legal references no human would spot
in under a minute.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two critical fixes:
1. Section splitter: Only lines that classify as a known doc_type
(cookie, social_media, dsfa, etc.) trigger section splits.
Random short lines ("Typen", "Funktionale Cookies") no longer
split sections — they all had blank lines before them in the
extracted HTML text.
2. LLM verification: Sub-section checks now pass the full document
text to the LLM, not just the section fragment. This lets the
LLM find content that the section splitter missed.
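A minimal sketch of the splitting rule from fix 1. The keyword table and the names classify_heading() and split_sections() are hypothetical; the real classifier is not shown in this log. Only lines that classify as a known doc_type open a new section, so short lines such as "Typen" stay in the current body.

```python
from typing import Optional

# Hypothetical keyword table; the real doc_type classifier is not part of this log.
DOC_TYPE_KEYWORDS = {
    "cookie": ("cookie-richtlinie", "cookie policy"),
    "social_media": ("social media",),
    "dsfa": ("datenschutz-folgenabschaetzung", "dsfa"),
}


def classify_heading(line: str) -> Optional[str]:
    """Return a doc_type if the line starts with a known heading keyword."""
    text = line.strip().lower()
    for doc_type, keywords in DOC_TYPE_KEYWORDS.items():
        if any(text.startswith(keyword) for keyword in keywords):
            return doc_type
    return None


def split_sections(text: str) -> list[tuple[str, str]]:
    """Split text into (doc_type, body) pairs; unclassified lines never split."""
    sections: list[tuple[str, list[str]]] = [("main", [])]
    for line in text.splitlines():
        doc_type = classify_heading(line)
        if doc_type is not None:
            sections.append((doc_type, []))
        else:
            sections[-1][1].append(line)
    return [(name, "\n".join(body).strip()) for name, body in sections]
```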
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CustomHazardModal: create a custom hazard with S/E/P/A sliders
- ResidualRiskPanel: per-hazard "acceptable" toggle plus progress bar
- RiskAssessmentTable: accept/reject buttons integrated into each row
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NameError: name 're' is not defined at line 146 — the import was
accidentally removed when extracting helper functions to agent_scan_helpers.py.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both scanners now search until done, not until a counter runs out:
playwright_scanner.py:
- Default max_pages raised from 15 to 50
- Added 3-minute timeout as safety net
- Recursive link discovery on EVERY visited page (not just DSE pages)
- Stops when: all links visited OR max_pages OR timeout
dsi_discovery.py:
- Default max_documents raised from 30 to 100
- Added 5-minute timeout as safety net
- Recursive: on each visited page, searches for MORE DSI links
- Processes ALL discovered links exhaustively
- Stops when: no more pending links OR max_documents OR timeout
The scanners now behave like a real user: they follow every relevant
link they find, and on each new page they look for more links.
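The three stop conditions can be sketched as a breadth-first loop. This is an illustration under assumptions, not the scanner code: crawl() and the injected fetch_links callable are hypothetical names, and the defaults mirror the playwright_scanner.py values above (50 pages, 3 minutes).

```python
import time
from collections import deque
from typing import Callable, Iterable


def crawl(
    start_url: str,
    fetch_links: Callable[[str], Iterable[str]],
    max_pages: int = 50,
    timeout_s: float = 180.0,
) -> list[str]:
    """Visit pages until links run out, max_pages is reached, or time is up."""
    deadline = time.monotonic() + timeout_s
    pending = deque([start_url])
    seen = {start_url}
    visited: list[str] = []
    while pending and len(visited) < max_pages and time.monotonic() < deadline:
        url = pending.popleft()
        visited.append(url)
        # Link discovery runs on every visited page, not only on DSE pages.
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                pending.append(link)
    return visited
```

The counter and the timeout act only as safety nets; on a small site the loop ends because the pending queue is empty.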
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Modules step deleted from sdk-steps.ts and SDK Flow
(regulations are now shown in Scope-Decision tab with toggles)
- Dashboard: "Erkannte Regulierungen" card shows which regulations
apply based on Scope-Profiling (DSGVO, AI Act, NIS2, HinSchG)
- Dashboard: Amber warning if Scope-Profiling not yet completed
- Link to Scope-Decision tab for details & customization
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ISO 27001 is not a law but a voluntary standard, and no standard text was ingested.
- Modules: removed the ISO 27001 fallback module and its filter
- ISMS: renamed to "ISMS — ISO 27001 Readiness"
- ISMS: added the note "Basierend auf eigenen Pruefaspekten, kein Normtext" ("Based on our own audit criteria, no standard text")
- Sidebar: "ISMS (ISO 27001)" → "ISMS Readiness"
- Remaining regulations: DSGVO, AI Act, NIS2 (statutory)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- RegulationsPanel: added enable/disable toggles per regulation
- ScopeDecisionTab: passes enabledModules + onToggleModule
- Scope page: auto-enables all applicable regulations when loaded
- Modules step: isOptional=true, moved to Zusatzmodule
- Requirements: now depends on compliance-scope, not modules
- Source-policy: now depends on use-case-assessment, not modules
Flow: Profile → Scope → Scope-Decision shows applicable regulations
with toggles → Requirements derived from enabled regulations
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New checks (from EUIPO reference case):
- Check 9: Third-party DSE link — detects when consent dialog links to
external domain's privacy policy instead of own DSE (Art. 13 DSGVO)
- Check 10: Dark-pattern language — detects "muessen/erforderlich" for
non-essential cookies suggesting false technical necessity (EDPB Rn. 70)
- Check 11: Non-modal dismiss = consent — detects when clicking outside
dialog closes it (possibly treating as consent, Planet49 violation)
Refactor: extracted _check_banner_text (375 LOC) from consent_scanner.py
into services/banner_text_checker.py to keep both files under 500 LOC.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document Workflow:
- "Als Version speichern" button in Document Generator preview
- Creates document + version via /legal-documents/documents API
- Saved documents appear in /sdk/workflow module
- Status indicator (saving/saved/error) in toolbar
Email Consolidation:
- consent-management Emails tab now redirects to /sdk/email-templates
- Single source of truth for all email templates
- Old tab replaced with redirect card explaining the change
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Real-world case from EU authority (EUIPO) with 7 findings:
- Grammatically broken consent text (bad DE translation)
- Coupling prohibition violation (login = consent, Art. 7(4) DSGVO)
- No reject button, no granularity, no active opt-in
- Broken link layout (DSE/ToS links appear after submit button)
- Includes correction suggestion and planned agent check implementations
- Pattern: WSO2 Identity Server default templates (systemic issue)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- EmailDeliveryService: load template → find published version →
render {{variables}} → send via SMTP → audit log. Fallback to
inline HTML when no published template exists.
- Migration 117: Professional HTML/text content for all 5 DSR
templates (receipt, completion, rejection, identity, extension)
with branded styling and proper Art. references
- DSRArt11Service now uses EmailDeliveryService with dsr_rejection
template instead of hardcoded HTML
[migration-approved]
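The render step with its inline fallback might look roughly like this. render_template() and build_body() are hypothetical names, and leaving unknown placeholders untouched is an assumption rather than documented EmailDeliveryService behavior.

```python
import re
from typing import Optional

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, variables: dict[str, str]) -> str:
    """Replace {{name}} placeholders; unknown names are left as-is (assumption)."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return PLACEHOLDER.sub(substitute, template)


def build_body(
    published_html: Optional[str],
    fallback_html: str,
    variables: dict[str, str],
) -> str:
    """Render the published version if one exists, else the inline fallback."""
    source = published_html if published_html is not None else fallback_html
    return render_template(source, variables)
```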
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New DSRArt11Service: handles rejection with proper legal basis,
automated email notification to requester explaining Art. 11
- POST /dsr/{id}/reject-art11 endpoint
- ActionButtons.tsx: "Nicht identifizierbar (Art. 11)" button
shown when identity is not yet verified
- Also fixes: DSR export type-cast rollback handling
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- tenant_id kept as string (PostgreSQL handles UUID cast)
- Einwilligungen query uses CAST(:tid AS VARCHAR) for compatibility
- Each data source query wrapped with rollback on failure to prevent
cascading "transaction aborted" errors
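A sketch of the per-source rollback pattern. It uses sqlite3 as a stand-in so the example runs anywhere; the production code talks to PostgreSQL, where a failed statement aborts the whole transaction until a rollback is issued. collect_sources() and the table names are hypothetical.

```python
import sqlite3


def collect_sources(conn: sqlite3.Connection, queries: dict[str, str]) -> dict[str, list]:
    """Run each data source query in isolation; a failure yields [] for that source."""
    results: dict[str, list] = {}
    for name, sql in queries.items():
        try:
            results[name] = conn.execute(sql).fetchall()
        except sqlite3.Error:
            # Reset the transaction so the next query does not inherit the failure.
            conn.rollback()
            results[name] = []
    return results
```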
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DSRExportService: aggregates all CMP data about a user from
Banner Consents, Einwilligungen, Audit Trail, DSR History
- GET /dsr/{id}/export-user-data?format=json|csv|pdf endpoint
- PDF: A4 reportlab with 4 sections (Consents, Einwilligungen,
Audit-Trail, DSR-Anfragen) + cover page
- CSV: BOM-encoded for Excel with flattened data rows
- JSON: structured export with all data categories
- ActionButtons.tsx: PDF/JSON/CSV export buttons now functional
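For the CSV variant, the BOM can come from Python's "utf-8-sig" codec. A minimal sketch with a hypothetical export_csv() helper; the actual column layout of DSRExportService is not shown in this log.

```python
import csv
import io


def export_csv(rows: list[dict]) -> bytes:
    """Write flat rows as CSV bytes with a UTF-8 BOM so Excel detects the encoding."""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    # "utf-8-sig" prepends the byte order mark EF BB BF.
    return buffer.getvalue().encode("utf-8-sig")
```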
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Migration 115: compliance_role_training_mapping table (org roles → training codes)
- TrainingLinkService: queries training_modules/matrix/assignments to find gaps
per person and role. Gracefully degrades when Go training tables don't exist yet.
- document_review_routes: 2 new endpoints (training-requirements, training-gaps)
- _notify_approval() now checks training gaps and sends emails to persons
with outstanding modules, linking to /sdk/training/learner
[migration-approved]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New /sdk/rollenkonzept/ module with 3 tabs (Rollen, Zuordnung, Reviews)
- 7 standard compliance roles (DSB, GF, IT-Leiter, HR, Marketing, Compliance, Einkauf)
- Inline role editing with test email via Mailpit
- Document-to-role mapping table (editable per tenant)
- Review list with status filters and approve/reject workflow
- ReviewAssignmentPanel in Document Generator preview tab
- "Zur Pruefung senden" button creates reviews + sends notification emails
- Approval notification sent to all affected roles after document sign-off
- Sidebar navigation link added
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Migration 111: 3 new tables (org_roles, document_reviews, document_role_mapping)
with seed data mapping all 71 doc types to 7 compliance roles
- org_role_routes.py: CRUD for roles, seed defaults, test email, mapping API
- document_review_routes.py: Review lifecycle (create→send→approve/reject)
with approval notification to all affected roles
- Migration 112: SOP template (ISO 9001 structure, 21 placeholders)
- Added standard_operating_procedure to TemplateType, doc-labels, presets
[migration-approved]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Split presets into interface + data files (500-line budget)
- Extract DOC_LABELS into doc-labels.ts with all 71 template types
- Add 3 new presets: Cloud/SaaS-Anbieter, Finanzdienstleister, Plattform
- Expand Enterprise preset to 48 docs (full ISMS + BCM + DSR)
- Every template type appears in at least one preset
- ISO references verified: citations only, no copyrighted standard text
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Every preset now includes DSGVO-mandatory docs (TOM, VVT, Löschkonzept)
plus Cookie-Banner/Policy, Mitarbeiter-DSI, Bewerber-DSI, and
industry-specific extras (DSFA, Whistleblower, ISMS, TIA, etc.).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- X button to close banner (SDK admin context only)
- Overlay leaves sidebar area accessible (ml-16/ml-64)
- Click overlay backdrop to dismiss
- Preview page: close banner on API error (don't trap user)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Presets were only visible after entering a project. Now they appear
on the /sdk landing page where users first see their project list.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Browser blocks direct calls to backend-compliance:8093 due to
self-signed SSL certificate. All banner API calls now go through
Next.js API proxy at /api/sdk/v1/banner/* which runs server-side.
- New catch-all proxy: /api/sdk/v1/banner/[[...path]]/route.ts
Maps to backend-compliance:8002/api/compliance/banner/*
- Preview page: uses /api/sdk/v1/banner/ instead of https://macmini:8093
- CMP Dashboard: uses proxy for banner stats + compliance proxy for DSR/einwilligungen
- Fixes: banner not closeable due to API errors, consent not saving
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New route /sdk/cmp with full CMP dashboard
- 4 KPI cards: total consents, active consents, open DSR requests, configured sites
- Cookie category acceptance bars (necessary/statistics/marketing/functional)
- DSR breakdown: by status, by type (Art. 15-21), avg processing time, overdue count
- 9-point compliance checklist (banner, DSE, impressum, Art.7 proof, DSR, loeschfristen,
vendor AVV, email templates, EWR-only mode) — each links to relevant module
- 8 module cards with icons linking to all CMP sub-modules
- Real API integration: /banner/admin/stats, /einwilligungen/consents/stats, /dsr/stats
- Dashboard link added as first entry in CMP sidebar section
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously hidden when a company profile existed, but users with
existing test projects couldn't see the feature.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When selecting an industry preset on the SDK dashboard, a categorized
document preview panel now appears showing which documents will be
generated (Website, Vertraege, HR, Compliance, etc.).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CMP Section in Sidebar:
- New "CMP" group with purple accent, above other module sections
- Links: Cookie-Banner, Live-Vorschau, Consent-Records, Consent-Verwaltung,
Vendor-Compliance, DSR Portal, Loeschfristen, E-Mail-Templates
Live Preview (/sdk/cookie-banner/preview):
- Simulated "MusterShop GmbH" website with full cookie banner
- Real API calls to POST /banner/consent (saves to DB)
- EWR-Only toggle functional in preview
- API Debug panel shows fingerprint, consent status, blocked vendors
- Response JSON viewer for API debugging
- Links to verify in Consent-Verwaltung, Consent-Records, DSR Portal
- "Consent zuruecksetzen" button to re-test
- Footer "Cookie-Einstellungen" link to reopen banner
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Presets now shown on the SDK start page (/sdk) as a card grid
between header and stats — only when companyName is empty.
Click navigates to /sdk/company-profile?preset={id}.
Reverted company-profile/page.tsx to original state (no preset
logic there — the dashboard is the right place for discovery).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shows preset cards before the wizard when the profile is empty:
- 10 industry presets (SaaS, Consumer App, E-Commerce, IT-Agentur,
Maschinenbau, Rechtsanwalt, Arztpraxis, Handwerk, Bildung, Enterprise)
- Each with icon, label, and description
- Click prefills: legalForm, industry, businessModel, companySize,
employeeCount, country, targetMarkets, dataController/Processor
- "Manuell ausfuellen" skip option
- Only shown when companyName is empty (fresh start)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Deleted 6 unused components from /sdk/einwilligungen/cookie-banner/_components/
- Replaced page.tsx with Next.js redirect() to /sdk/cookie-banner
- Updated EinwilligungenNavTabs link to /sdk/cookie-banner
- Updated catalog page link to /sdk/cookie-banner
- Single source of truth: /sdk/cookie-banner (Step in "Rechtliche Texte")
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migration 110: Updated descriptions and version for 12 previously
unreviewed templates (asset_management, backup, change_management,
cloud_security, devsecops, incident_response, logging, patch_management,
secrets_management, vulnerability_management, informationspflichten,
verpflichtungserklaerung).
All templates assessed as "Very Good" quality — only incremental
updates needed (AI Act, CRA, NIS2UmsuCG references in descriptions).
informationspflichten: Kept as separate compact checklist (distinct
from the full privacy_policy DSI template).
verpflichtungserklaerung: Kept as standalone HR document (employee
signs at onboarding). Added to HR & Mitarbeiter category.
Result: 88 templates, 44 at v1.1+, 0 unreviewed remaining.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- 108: Remove DSI duplicate (023 + 093 both wrote privacy_policy DE),
remove outdated EN v1, create English Privacy Notice v2 with all
modular sections (data categories table, retention periods, processor
vs. controller guidance, Art. 21 right to object highlighted)
DB now has exactly 2 privacy_policy templates: DE + EN, both v2.0.0
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- 106: Remove AGB duplicates and obsolete templates (terms_of_service
DE/EN v1.0, liability clause) — replaced by agb v2.0
- 107: English Terms and Conditions v2 (EU-compliant, same structure
as DE version with all IF-blocks)
DB now has exactly 2 AGB templates: DE + EN, both v2.0.0
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- All 4 categories with toggles visible on first layer (no "Einstellungen" step)
- Removed showSettings state — single-view banner
- EWR toggle + info button in header, always visible
- Two equal-weight buttons: "Alle akzeptieren" + "Auswahl speichern"
- "Nur notwendige" as text link below (not hidden, but less prominent)
- Vendor tables expandable per category via chevron
- DSK OH Telemedien 2022 + CNIL 2020 compliant layout
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Game-changing CMP feature: Users accept a category (e.g. Marketing) but
can restrict data processing to EU/EWR-only vendors. Non-EWR vendors are
blocked even when the category is accepted.
- Toggle "Nur EU/EWR-Anbieter" with globe icon in blue gradient bar
- Blocked vendors shown as red pills with strikethrough icon
- Per-vendor status icons: green checkmark (active), red slash (blocked),
gray dash (category disabled)
- Country column: green circle+check for EWR, amber warning for non-EWR
- EWR-only scope = EU27 + IS/LI/NO (EEA), plus CH (covered by an EU adequacy decision)
- Vendor data extracted to cookie-banner-vendors.ts (under 500 LOC)
- Consent state includes ewrOnly flag + blockedVendors list
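The blocking rule can be sketched as a filter over the vendor list. split_vendors() and the vendor dict shape are assumptions for illustration; the country set follows the definition above (EU27 + IS/LI/NO + CH).

```python
EU27 = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
    "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
})
# EEA adds IS/LI/NO; CH is included because of its adequacy decision.
EWR_ALLOWED = EU27 | {"IS", "LI", "NO", "CH"}


def split_vendors(
    vendors: list[dict],
    accepted_categories: set[str],
    ewr_only: bool,
) -> tuple[list[str], list[str]]:
    """Return (active, blocked) vendor names for the given consent state."""
    active: list[str] = []
    blocked: list[str] = []
    for vendor in vendors:
        if vendor["category"] not in accepted_categories:
            continue  # category disabled: neither active nor blocked
        if ewr_only and vendor["country"] not in EWR_ALLOWED:
            blocked.append(vendor["name"])
        else:
            active.append(vendor["name"])
    return active, blocked
```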
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend: _ensure_list() converts null/string/malformed JSONB to []
for requirements, test_procedure, evidence, open_anchors, tags.
Frontend: defensive Array.isArray() check on ControlDetail.tsx.
Fixes: TypeError: A.requirements.map is not a function
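A sketch of the backend coercion as described above. The field names come from this commit; ensure_list() and normalize_control() are illustrative stand-ins, not the actual _ensure_list() source.

```python
from typing import Any

LIST_FIELDS = ("requirements", "test_procedure", "evidence", "open_anchors", "tags")


def ensure_list(value: Any) -> list:
    """Return the value if it is already a list, otherwise an empty list."""
    return value if isinstance(value, list) else []


def normalize_control(row: dict) -> dict:
    """Copy a control row and coerce every list-typed JSONB field to a list."""
    cleaned = dict(row)
    for field in LIST_FIELDS:
        cleaned[field] = ensure_list(row.get(field))
    return cleaned
```

With this in place the frontend can call .map on every field, and the Array.isArray() check remains as a second line of defense.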
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CookieBannerOverlay: shows vendors per category with expandable tables
(Verarbeiter, Cookies, Dauer, Land) for full transparency
- Demo vendors: 4 necessary, 3 statistics, 3 marketing, 3 functional
- cookie_table_generator.py: renders {{COOKIE_TABLE}} Markdown tables
from vendor configs (DB) or service registry (fallback)
- SERVICE_COOKIES: 16 known vendor-to-cookie mappings with provider + country
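Rendering the placeholder could look like the sketch below. The column headers are the ones listed for the overlay tables; render_cookie_table(), fill_cookie_table() and the vendor dict keys are hypothetical.

```python
def render_cookie_table(vendors: list[dict]) -> str:
    """Render vendors as a Markdown table (Verarbeiter, Cookies, Dauer, Land)."""
    lines = [
        "| Verarbeiter | Cookies | Dauer | Land |",
        "| --- | --- | --- | --- |",
    ]
    for vendor in vendors:
        cookies = ", ".join(vendor["cookies"])
        lines.append(
            f"| {vendor['provider']} | {cookies} | {vendor['duration']} | {vendor['country']} |"
        )
    return "\n".join(lines)


def fill_cookie_table(template: str, vendors: list[dict]) -> str:
    """Substitute the {{COOKIE_TABLE}} placeholder with the rendered table."""
    return template.replace("{{COOKIE_TABLE}}", render_cookie_table(vendors))
```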
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Impressum link mandatory in banner (§5 TMG)
2. Pre-ticked prevention: only "required" categories pre-enabled (Planet49)
3. Cookie-Settings reopen link (Art. 7(3) DSGVO — revocation as easy as consent)
4. Script-Blocking: data-cookie-category + type="text/plain" pattern
Scripts only execute AFTER user consents to that category
5. Buttons already equal size (flex:1) — verified correct
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shows: Impressum link ✓/✗, DSE link ✓/✗, plus violation cards for
wrong DSE consent wording, pre-ticked checkboxes, dark patterns,
missing reject button, no settings re-access.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Impressum link accessible from banner (§5 TMG, LG Rostock)
2. DSE link in banner (Art. 13 DSGVO, informierte Einwilligung)
3. Wrong wording: "Zustimmung zur DSE" (consent to the privacy policy). The DSE is an
Art. 13 information obligation, not something to consent to. Correct: "zur Kenntnis genommen" (acknowledged)
4. Reject button visible (§25 TDDDG, no hidden reject)
5. Pre-ticked checkboxes detected (EuGH C-673/17 Planet49)
6. Dark Pattern: button size comparison — accept vs reject area
ratio >2.5x or font size ratio >1.5x = dark pattern
7. Cookie Wall detection (Phase B — site blocked after reject)
8. Re-access to settings (Art. 7(3) — revocation as easy as consent)
All checks run via Playwright on the actual rendered banner.
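Check 6 reduces to two ratio comparisons. A sketch with the thresholds from the list above; ButtonMetrics and is_size_dark_pattern() are hypothetical names, and treating a zero-size reject button as a violation is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ButtonMetrics:
    width: float
    height: float
    font_size: float

    @property
    def area(self) -> float:
        return self.width * self.height


def is_size_dark_pattern(
    accept: ButtonMetrics,
    reject: ButtonMetrics,
    max_area_ratio: float = 2.5,
    max_font_ratio: float = 1.5,
) -> bool:
    """Flag the banner if accept dwarfs reject by area or by font size."""
    if reject.area <= 0 or reject.font_size <= 0:
        return True  # assumption: an invisible reject button counts as a violation
    area_ratio = accept.area / reject.area
    font_ratio = accept.font_size / reject.font_size
    return area_ratio > max_area_ratio or font_ratio > max_font_ratio
```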
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fundamental architecture fix: data processing happens through APIs/scripts/
cookies — NOT through visible page text. A news site about healthcare does
NOT process health data.
Before: Qwen reads website text → guesses "health_data: true" (WRONG)
After: Google Analytics detected → tracking: true (CORRECT, deterministic)
New flow: detect services from HTML → map service categories to flags →
feed flags into UCCA assessment. No LLM needed for flag extraction.
SERVICE_TO_FLAGS maps categories: tracking→tracking, marketing→marketing+
third_party_sharing, payment→payment_data, heatmap→profiling, etc.
SPECIFIC_SERVICE_FLAGS for Klarna (Art.22), Stripe (US transfer), etc.
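The deterministic mapping can be sketched as a plain lookup. The four category entries are the ones named above; flags_from_services() is a hypothetical helper and the table is not the full SERVICE_TO_FLAGS.

```python
# Subset of the category-to-flag mapping named in this commit.
SERVICE_TO_FLAGS = {
    "tracking": {"tracking"},
    "marketing": {"marketing", "third_party_sharing"},
    "payment": {"payment_data"},
    "heatmap": {"profiling"},
}


def flags_from_services(detected_categories: list[str]) -> dict[str, bool]:
    """Derive intake flags from detected service categories, without any LLM call."""
    flags: dict[str, bool] = {}
    for category in detected_categories:
        for flag in SERVICE_TO_FLAGS.get(category, ()):
            flags[flag] = True
    return flags
```

A healthcare news site with only Google Analytics embedded yields a tracking flag and nothing else, because page text never enters the mapping.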
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests each consent category in isolation:
- Phase D: Only "Statistics" enabled → checks if only analytics loads
- Phase E: Only "Marketing" enabled → checks if only ads load
- Phase F: Only "Functional" enabled → checks no tracking loads
CMP-specific category selectors for Cookiebot, OneTrust, Usercentrics,
Didomi. Generic fallback via toggle/checkbox keyword detection.
SERVICE_CATEGORY_MAP maps 35+ services to expected categories.
Violations: "Facebook Pixel loads with only Statistics enabled" = miscategorization.
Frontend: category test results shown below Phase A-C with
per-category violation cards.
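The per-category comparison might be sketched as follows. The two map entries are illustrative, not the real 35+ entry SERVICE_CATEGORY_MAP, and ignoring unknown services is an assumption.

```python
# Illustrative subset; the real map covers 35+ services.
SERVICE_CATEGORY_MAP = {
    "google_analytics": "statistics",
    "facebook_pixel": "marketing",
}


def find_miscategorized(enabled_category: str, loaded_services: list[str]) -> list[str]:
    """List services that loaded although their expected category was not enabled."""
    violations: list[str] = []
    for service in loaded_services:
        expected = SERVICE_CATEGORY_MAP.get(service)
        if expected is not None and expected != enabled_category:
            violations.append(
                f"{service} loads with only {enabled_category} enabled (expected {expected})"
            )
    return violations
```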
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
legalHolds can be a JSONB object {} instead of an array [], so
the || [] fallback wasn't sufficient. Array.isArray handles all
edge cases (null, undefined, object, string).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
getActiveLegalHolds() crashed with "e.legalHolds.filter is not a
function" when legalHolds was null/undefined (e.g. old DB entries
without the JSONB field). Added fallback to empty array.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
VVT and Loeschfristen pages imported STEP_EXPLANATIONS as a named
export from StepHeader.tsx, but it was only imported (not re-exported).
This caused "Cannot read properties of undefined (reading 'vvt')"
at runtime. Adding the re-export fixes both pages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Results (https://macmini:3007):
- sdk-module-reachability: 40/42 (loeschfristen+vvt pre-existing bugs)
- vendor-transfers: 4/4
- isms-assets: 3/3
- document-generator: 3/4 (category label mismatch)
Added: playwright-live.config.ts (no webServer, live instance testing)
Test data NOT cleaned up — profiles persist for manual review.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New E2E test specs:
- sdk-module-reachability: Tests 40+ SDK routes for 404/crash
- scope-profiling: Three customer profiles (Startup/KMU/Enterprise)
with screenshots at each step — data NOT cleaned up
- document-generator: Template library, categories, recommendations
- vendor-transfers: Transfer tab, explanations, adequacy list
- isms-assets: Asset register tab, form, CRUD
All tests configured to run against https://macmini:3007
Screenshots saved to e2e/test-results/ for manual review
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The SECTION_FIELDS object was prematurely closed before the TOM and DPA
sections, causing a build-time syntax error. Removed the extra closing
brace so TOM and DPA fields are correctly inside the object.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New: adequacy-decisions.ts
- Complete list of 15 countries with EU adequacy decisions (Art. 45)
- EU/EEA country set (30 countries)
- getTransferRequirement() — determines SCC/TIA/certification needs
per country code with human-readable explanations
- US special handling: DPF certification required, check URL included
Updated: transfers/page.tsx
- "Was muss ich tun?" explanation section with 3 options:
1. Adequacy decision (green) — no action needed
2. DPF certification (blue, US only) — check dataprivacyframework.gov
3. SCC + TIA required (amber) — link to Document Generator
- Collapsible adequacy countries table (15 countries with restrictions)
- Schrems II background explanation for customers
- Customer guidance written for non-experts who never heard of TIA/SCC
Updated: templateRecommendations.ts
- SCC+TIA rules now consider DPF certification and adequacy status
- us_dpf_only → SCC/TIA optional (not required)
- adequate_only → SCC/TIA not recommended
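The three customer-facing options can be sketched as a decision function. The country sets are deliberately small illustrative subsets (not the full 30 and 15 entries), and the return labels are hypothetical; the real getTransferRequirement() lives in adequacy-decisions.ts.

```python
# Illustrative subsets only; the real lists hold 30 EU/EEA and 15 adequacy countries.
EU_EEA = frozenset({"DE", "FR", "NO"})
ADEQUATE = frozenset({"CH", "JP", "GB"})


def get_transfer_requirement(country_code: str, dpf_certified: bool = False) -> str:
    """Return which transfer mechanism a vendor location needs."""
    code = country_code.upper()
    if code in EU_EEA:
        return "none"  # no third-country transfer
    if code == "US":
        # US special case: the adequacy decision covers only DPF-certified recipients.
        return "dpf" if dpf_certified else "scc_tia"
    if code in ADEQUATE:
        return "adequacy"  # no further action needed
    return "scc_tia"  # SCC plus Transfer Impact Assessment required
```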
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New "Assets" tab in the ISMS module for information asset management:
- CRUD for information assets (hardware, software, data, services,
people, facilities)
- CIA protection need matrix (confidentiality, integrity, availability)
with normal/high/very_high levels
- Information classification (public, internal, confidential,
strictly confidential) with color-coded badges
- Category filter (all/hardware/software/data/service/people/facility)
- Stats cards (total, by category, high protection need count)
- CSV export for ISO 27001 audits
- Edit/delete per asset
- localStorage persistence (same pattern as compliance_scope)
Types: InformationAsset, AssetCategory, AssetClassification,
ProtectionLevel interfaces + label/color maps
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New "Drittlandtransfers" tab in the Vendor Compliance sidebar:
- Aggregates all vendor processing locations with non-EU countries
- Traffic light system: green (EU/adequacy), yellow (SCC exists),
red (no transfer mechanism)
- Stats cards: total, EU+adequate, third-country, action required
- Filter by status (all/OK/review/action required)
- Table with vendor name, country, mechanism, SCC status, TIA status
- "TIA erstellen" link to Document Generator for third-country vendors
- Help text explaining Schrems II / Art. 46 DSGVO requirements
Uses existing data model — no new API endpoints or DB tables needed:
- vendor_vendors.processingLocations (isEU, isAdequate)
- vendor_vendors.transferMechanisms
- vendor_contracts.documentType = 'SCC'
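The traffic light rule, sketched over the existing fields. isEU and isAdequate are the processingLocations fields named above; transfer_status() and the has_scc argument (derived from a contract with documentType 'SCC') are assumptions.

```python
def transfer_status(location: dict, has_scc: bool) -> str:
    """Map one processing location to green, yellow, or red."""
    if location.get("isEU") or location.get("isAdequate"):
        return "green"  # EU or adequacy decision
    if has_scc:
        return "yellow"  # third country, but an SCC contract exists
    return "red"  # third country without any transfer mechanism
```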
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New RecommendedDocuments component shown above the template library:
- Evaluates scope answers + compliance level (L1-L4)
- Groups templates into required/recommended/optional
- Shows profile label (Startup/KMU/Extended/Enterprise)
- Cards link to actual templates — click opens in generator
- Optional section collapsed by default
- Only visible when scope has been completed
Renders as purple gradient panel with grid cards, each showing
template name and availability status.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New templates for the Vendor Compliance module:
- 105: Transfer Impact Assessment (TIA) — Schrems II risk assessment
with country evaluation, government access assessment, supplementary
measures, risk matrix, and go/conditional/deny decision
- 105: SCC Companion Document — annexes to EU Decision 2021/914
(module selection C2C/C2P/P2P/P2C, party details, data description,
TOMs, sub-processor list)
Template recommendations: SCC+TIA triggered by tech_third_country answer
Generator: New "Drittlandtransfer" category
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 3 of the Document Templates Masterplan:
- 103: 4 new security policies (information_security_policy, password_policy,
encryption_policy, access_control_policy) + updates for CRA (056) and
all 15 HR/Vendor/BCM policies (072)
New templates:
- Information Security Policy: ISMS-Leitlinie (ISO 27001, BSI, NIS2)
- Password Policy: BSI/NIST compliant (12+ chars, MFA, no forced rotation)
- Encryption Policy: BSI TR-02102, algorithms, key management, TLS config
- Access Control Policy: RBAC, Least Privilege, Zero Trust, recertification
Updates: AI Act + NIS2UmsuCG references for CRA and all 15 HR/Vendor/BCM
Generator: 6 new categories (security, HR, data, vendor, BCM policies)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Dockerfile: install Playwright AS appuser (not root) so chromium
binary is accessible at runtime. Was causing 500 error.
2. DSE service matching: text-search fallback when LLM extraction fails.
If "etracker" appears in DSE text, mark as documented even without
LLM parsing the service list.
3. CMP skip: consent managers in category "cmp" skipped (not just "other"
with id "cmp").
NOT DEPLOYED — RAG pipeline is running.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New /website-scan endpoint in consent-tester service:
- Real browser renders JavaScript (finds dynamic content)
- Clicks navigation menus (discovers hidden sub-pages like IHK DSB page)
- Follows links within DSE to find regional privacy policies
- Collects rendered HTML for each page (after JS execution)
Backend integration:
- agent_scan_routes tries Playwright first, falls back to httpx
- DSE text and HTML extracted from Playwright-rendered pages
- Service detection runs on rendered HTML (catches JS-loaded scripts)
Also fixes:
- GA regex: G-[A-Z0-9]{8,12} prevents CSS class false positives
- etracker added to service registry
- External page scanning blocked (same-domain only)
- CSS/JS/image files excluded from page list
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. GA regex: G-\w{5,} matched CSS classes (g-7031048). Now requires
G-[A-Z0-9]{8,12} (uppercase after G-, 8-12 chars = real GA4 ID)
2. External page scanning: DSE-internal links now SAME DOMAIN only.
Previously followed links to etracker.com, google.de/policies etc.
and detected services on THOSE sites as IHK services.
3. Added etracker to service registry (DE, ePrivacy-certified)
4. CSS/JS/image files excluded from page scanning
5. Navigation-pattern links for deeper DSE sub-pages
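The tightened pattern from item 1 can be checked in isolation. The character class and length come from the commit; the surrounding word boundaries and the helper name find_ga4_ids() are additions for this sketch.

```python
import re

# Uppercase letters or digits after "G-", 8 to 12 characters long.
GA4_ID = re.compile(r"\bG-[A-Z0-9]{8,12}\b")


def find_ga4_ids(html: str) -> list[str]:
    """Return GA4 measurement IDs found in the HTML, ignoring lowercase CSS classes."""
    return GA4_ID.findall(html)
```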
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The keyword matching in mandatory_content_checker.py breaks on alternative wordings.
Solution: LLM-based check per mandatory field (9 calls, parallelizable).
To be implemented by the other session alongside the Dict→Control migration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. DSE-Matcher: Google/YouTube false match — now requires 2+ word match
for provider-name fallback, not just "Google" matching YouTube section
2. AGB/Widerrufsbelehrung: only_ecommerce flag — skips for non-shop
websites (detected via payment providers, cart keywords)
3. DSE-internal link following — scanner now discovers links WITHIN the
privacy policy and scans those too (finds regional DSE sub-pages)
4. Expanded keyword synonyms for DSE mandatory checks:
- "Zweck und Rechtsgrundlage" now matches "zwecke"
- "behoerdlichen datenschutzbeauftragt" matches DSB
- "aufsichtsbehörde" with umlaut matches
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8 test cases with deliberately wrong legal basis assignments:
- Cookie tracking on lit. f (should be lit. a)
- Analytics on lit. b (should be lit. a)
- Newsletter on lit. f (should be lit. a)
- Klarna without Art. 22
- Session recording on lit. f
- 2 correct cases (should NOT trigger findings)
Runs both hardcoded dict AND Control Library query, compares results.
If Control Library passes all → dict can be removed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6 files with hardcoded legal knowledge identified. Review deadline 2026-07-01.
legal_basis_validator.py marked with warning log on every use.
Instruction file for other session to execute migration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 6: PDF export via WeasyPrint — POST /agent/scans/pdf generates
printable compliance report with findings table, service comparison,
risk badge, and legal disclaimer.
Phase 7: Recurring scans — POST /agent/monitored-urls to add URLs,
POST /agent/run-scheduled triggers all enabled scans (cron/ZeroClaw).
In-memory storage with DB upgrade path.
Phase 8: Multi-website compare — POST /agent/compare with 2-5 URLs,
parallel scanning, comparison table (risk, findings, services, compliance
features per site).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Migration 086: compliance_agent_scans table (findings, services, corrections)
- agent_history_routes.py: POST /scans (save), GET /scans (list), GET /scans/{id}
- Scan results survive page reloads and can be reviewed later
- Phase 10 (Playwright website scanner) added to product roadmap
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scan pages in parallel instead of sequentially. Reduces scan time
from ~10s (5 pages × 2s) to ~3s (all pages at once).
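A sketch of the parallel variant using asyncio.gather. scan_all() and the injected scan_page coroutine are hypothetical; dropping failed pages instead of aborting the whole scan is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def scan_all(
    urls: list[str],
    scan_page: Callable[[str], Awaitable[Any]],
) -> dict[str, Any]:
    """Scan all pages concurrently; total time is roughly that of the slowest page."""
    results = await asyncio.gather(
        *(scan_page(url) for url in urls),
        return_exceptions=True,  # one failing page must not cancel the others
    )
    return {
        url: result
        for url, result in zip(urls, results)
        if not isinstance(result, Exception)
    }
```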
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Third tab "Cookie-Test" in Compliance Agent:
- Phase A: Before consent (tracking without permission)
- Phase B: After rejection (CRITICAL if tracking persists)
- Phase C: After acceptance (undocumented services)
- CMP badge (Didomi, OneTrust, etc.)
- Violation cards with severity badges and legal references
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New independent service (port 8094) with headless Chromium:
- Phase A: What loads BEFORE any consent interaction
- Phase B: What loads AFTER rejecting consent (CRITICAL if tracking persists)
- Phase C: What loads AFTER accepting (check against cookie policy)
- 10 CMP-specific selectors (Didomi, OneTrust, Cookiebot, Usercentrics, etc.)
- Generic fallback via button text matching
- 18 tracking service patterns for script classification
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
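A rough sketch of how scripts loaded in each phase could be classified against tracking patterns. The three patterns, the phase names, and the `REVIEW` severity are assumptions for illustration; the commit mentions 18 patterns but does not list them, and only the CRITICAL rating for tracking after rejection comes from the text above.

```python
import re
from typing import List, Optional

# Hypothetical subset of tracking patterns (the real list is not shown in the commit).
TRACKING_PATTERNS = {
    "Google Analytics": re.compile(r"google-analytics\.com|googletagmanager\.com"),
    "Facebook Pixel": re.compile(r"connect\.facebook\.net"),
    "LinkedIn Insight": re.compile(r"snap\.licdn\.com"),
}


def classify_scripts(script_urls: List[str]) -> List[str]:
    """Return the names of tracking services whose pattern matches any script URL."""
    found: List[str] = []
    for url in script_urls:
        for name, pattern in TRACKING_PATTERNS.items():
            if pattern.search(url) and name not in found:
                found.append(name)
    return found


def evaluate_phase(phase: str, script_urls: List[str]) -> dict:
    """Rate one phase: tracking that persists after rejection is CRITICAL."""
    services = classify_scripts(script_urls)
    severity: Optional[str]
    if not services:
        severity = None
    elif phase == "after_reject":
        severity = "CRITICAL"
    else:
        severity = "REVIEW"  # assumed placeholder for the other phases
    return {"phase": phase, "services": services, "severity": severity}


phase_b = evaluate_phase("after_reject", ["https://www.google-analytics.com/analytics.js"])
```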
Shows for each finding:
- Original text block from DSE (or "missing" indicator)
- Position: section heading, number, parent section, paragraph index
- Correction: insert/append/replace with copy button
Falls back to plain correction view if no text reference available.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- dse_parser.py: HTML → structured sections (heading, number, content, parent)
Uses heading hierarchy (h1-h4) with regex fallback
- dse_matcher.py: matches detected services against DSE sections
Exact name → provider → category matching with insertion point suggestion
- agent_scan_routes: TextReference model in findings (original text,
section, paragraph, correction type, insert_after)
Enables showing: "Google Analytics not found in DSE, insert after
Section 2.4 Cookies und Tracking"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
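The parse-then-match flow described above might look roughly like this. It is a sketch using only the standard library's `html.parser`; the class and function names, the dict layout, and the rule that a non-name match yields an insertion point are assumptions, not the actual `dse_parser.py` or `dse_matcher.py` code. The regex fallback for pages without heading tags is omitted.

```python
import re
from html.parser import HTMLParser

HEADING_LEVELS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4}
NUMBER_RE = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(.*)$")


class SectionParser(HTMLParser):
    """Splits HTML into sections keyed by h1-h4 headings."""

    def __init__(self):
        super().__init__()
        self.sections = []
        self._ancestors = []      # (level, heading) of currently open headings
        self._heading_level = None
        self._heading_text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_LEVELS:
            self._heading_level = HEADING_LEVELS[tag]
            self._heading_text = []

    def handle_endtag(self, tag):
        if tag in HEADING_LEVELS and self._heading_level is not None:
            level = self._heading_level
            title = "".join(self._heading_text).strip()
            m = NUMBER_RE.match(title)
            number, heading = (m.group(1), m.group(2)) if m else (None, title)
            # Drop ancestors at the same or a deeper level.
            while self._ancestors and self._ancestors[-1][0] >= level:
                self._ancestors.pop()
            parent = self._ancestors[-1][1] if self._ancestors else None
            self.sections.append(
                {"heading": heading, "number": number, "content": "", "parent": parent}
            )
            self._ancestors.append((level, heading))
            self._heading_level = None

    def handle_data(self, data):
        if self._heading_level is not None:
            self._heading_text.append(data)
        elif self.sections:
            self.sections[-1]["content"] += data


def parse_sections(html):
    parser = SectionParser()
    parser.feed(html)
    parser.close()
    for section in parser.sections:
        section["content"] = " ".join(section["content"].split())
    return parser.sections


def match_service(service, sections):
    """Try name, then provider, then category; a non-name hit suggests an insertion point."""
    for key in ("name", "provider", "category"):
        needle = service[key].lower()
        for section in sections:
            haystack = f'{section["heading"]} {section["content"]}'.lower()
            if needle in haystack:
                return {
                    "matched_by": key,
                    "documented": key == "name",
                    "insert_after": None if key == "name" else section,
                }
    return {"matched_by": None, "documented": False, "insert_after": None}


html = (
    "<h1>1. Datenschutz</h1><p>Allgemeines.</p>"
    "<h2>2.4 Cookies und Tracking</h2><p>Wir nutzen Cookies.</p>"
)
sections = parse_sections(html)
hit = match_service(
    {"name": "Google Analytics", "provider": "Google", "category": "Tracking"}, sections
)
```

In this example the service name and provider are absent from the text, so the category match on the heading produces the insertion point after Section 2.4.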
Phase 0: Qwen extracts 14 structured intake flags (personal_data,
marketing, profiling, ai_usage, etc.) instead of keyword matching.
Fallback to keywords if LLM unavailable. Flags feed into UCCA for
accurate scoring.
Phase 1: Control relevance filter removes false positives.
C_TRANSPARENCY only recommended if AI/ML keywords found in text.
7 control rules with keyword lists + intake flag fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
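The Phase 1 relevance filter can be pictured as a keyword check with an intake-flag fallback. This is a sketch under assumptions: the keyword list, the mapping of `C_TRANSPARENCY` to the `ai_usage` flag, and the pass-through for controls without a rule are illustrative, and only one of the seven rules is shown.

```python
import re
from typing import Dict, List

# Hypothetical rule table; keywords and flag mapping are assumptions.
CONTROL_RULES = {
    "C_TRANSPARENCY": {
        "keywords": ["ai", "ki", "machine learning", "chatbot"],
        "flag": "ai_usage",
    },
}


def is_relevant(control_id: str, text: str, flags: Dict[str, bool]) -> bool:
    rule = CONTROL_RULES.get(control_id)
    if rule is None:
        return True  # assumed: controls without a rule are not filtered
    for keyword in rule["keywords"]:
        # Word boundaries keep "ai" from matching inside words like "email".
        if re.search(rf"\b{re.escape(keyword)}\b", text, re.IGNORECASE):
            return True
    # Fallback: trust the intake flag when no keyword is present.
    return bool(flags.get(rule["flag"], False))


def filter_controls(candidates: List[str], text: str, flags: Dict[str, bool]) -> List[str]:
    return [c for c in candidates if is_relevant(c, text, flags)]


kept = filter_controls(["C_TRANSPARENCY", "C_OTHER"], "We send email newsletters.", {})
```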
- Summary now renders as styled HTML (table layout, colored risk badge,
warning banners) instead of plaintext in <div>
- Tab info text explains scope: "Analysiert nur die eingegebene URL" vs
"Scannt automatisch 5-10 Unterseiten"
- Scan history with findings count badge and page count
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Advisor now knows about: project setup (3 steps), all SDK modules
(DSGVO, AI Act, CE, independent modules), recommended workflow order,
navigation (sidebar, CommandBar, SDK-Flow). No business secrets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same pattern as the email templates variables fix. Backend may return
placeholders as object instead of array.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The chat history is emailed to the DSB (data protection officer) as a
structured consultation record. The button appears in the header as soon
as messages exist and shows a checkmark after a successful send.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Widgets were hidden behind projectId guard. Removed condition so new
users can ask questions (e.g. "Wie lege ich ein Projekt an?") before
creating a project.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- EuGH-Urteile — Schrems II, Planet49, SCHUFA Scoring, Google Fonts, Normen-Copyright (C-588/21 P)
- EU 2018/1725 — Datenschutz EU-Organe
- EU-IFRS (Verordnung 2023/1803) — EU-uebernommene International Financial Reporting Standards
- EFRAG Endorsement Status — Uebersicht welche IFRS-Standards EU-endorsed sind
@@ -239,6 +244,6 @@ bedeutet LinkedIn Insight (EU/Irland) wird geladen, Facebook Pixel (USA) wird bl
Kein anderes CMP bietet dieses Feature.
## Eskalation
- Bei Fragen ausserhalb des Kompetenzbereichs: Hoeflich ablehnen und auf Fachanwalt verweisen
- Bei Fragen ausserhalb des Kompetenzbereichs: Wenn die Frage harmlos ist (z.B. "Hast Du Informationen zu X?"), kurz mit Ja/Nein antworten und anbieten konkreter zu helfen. NUR bei sensiblen oder rechtsberatenden Fragen hoeflich ablehnen und auf Fachanwalt verweisen.
- Bei widerspruechlichen Rechtslagen: Beide Positionen darstellen und DSB-Konsultation empfehlen
- Bei dringenden Datenpannen: Auf 72-Stunden-Frist (Art. 33 DSGVO) hinweisen und Notfallplan-Modul empfehlen
**2. Regex-Checks (138 Pruefpunkte)** — Zwei Ebenen: L1 prueft ob Pflichtangaben erwaehnt sind (z.B. "Verantwortlicher"), L2 prueft ob sie korrekt und vollstaendig sind (z.B. "Hat der Verantwortliche eine ladungsfaehige Anschrift mit PLZ?").
a:`Die Pruefung wurde gegen mehrere Ground-Truth-Websites validiert (IHK Konstanz, ETO Gruppe, BMW, Stadt Koeln, Sparkasse, Spiegel u.a.). Ergebnis: **0 False Positives** bei validierten Testfaellen — jeder rote Punkt ist ein echtes Finding.
q:"Was ist der aktuelle Stand bei harmonisierten Normen unter der neuen Maschinenverordnung (EU) 2023/1230?",
a:`Die Maschinenverordnung (EU) 2023/1230 hat in Anhang I die wesentlichen Gesundheits- und Sicherheitsanforderungen und verweist darauf, dass harmonisierte Normen die technischen Details liefern sollen (Konformitaetsvermutung).
q:"Warum muss ich harmonisierte Normen kaufen obwohl sie EU-Recht sind?",
a:`Harmonisierte Normen werden von privaten Organisationen (CEN/CENELEC) erstellt und ueber nationale Normungsinstitute wie DIN/Beuth (Deutschland), ASI (Oesterreich) oder SNV (Schweiz) verkauft — typisch 50-300 EUR pro Norm.
Some files were not shown because too many files have changed in this diff