breakpilot-compliance

Author	SHA1	Message	Date
Benjamin Admin	e9002175ac	feat(iace): manufacturer safety feature library (Stage A, 50+ entries) Adds a curated database of safety-relevant features for the major manufacturers across mechanical/plant engineering, written entirely in our own words with norm anchors. No verbatim manufacturer texts, therefore no copyright issue: - Naming brands (§ 23 MarkenG, nominative use) is permitted. - Facts about product safety functions are not protected by § 2 UrhG (only works, not facts). - NormReferences contain only the identifiers (e.g. "EN ISO 13849-1 PLd Kat.3"), never the norm text itself. Coverage (52 entries across 12 categories): Industrial robots (10): FANUC DCS, KUKA SafeOperation, ABB SafeMove, Yaskawa FSU, Staeubli CS9, Kawasaki Cubic-S, Mitsubishi MELFA, Universal Robots PolyScope, Doosan PRS, Comau SafeNet CNC/WZM (machine tools) (8): DMG MORI, Mazak, TRUMPF, Okuma, Hermle, Heidenhain SPLC, GROB, Heller Pneumatics (4): Festo, SMC, AVENTICS, Parker Hydraulics (3): Bosch Rexroth, HAWE, HYDAC Safety PLC / safety technology (8): PILZ, SICK, Schmersal, Euchner, Leuze, Phoenix Contact, Banner, Wieland Standard PLC (5): Siemens, Beckhoff, Rockwell, Schneider, B&R Presses (3): Schuler, Bruderer, AIDA Injection moulding (3): Arburg, KraussMaffei, ENGEL Packaging (2): Krones, Bosch Packaging/Syntegon Laser/welding (3): Bystronic, Amada, Fronius Conveyor technology (2): Interroll, SEW EURODRIVE Engine integration: - LookupManufacturerFeaturesInText() scans the project narrative for any of the manufacturer aliases (case-insensitive, umlaut-tolerant). - The init handler appends matched feature clarifications to the relevant hazard's "Mit Anlagenbauer zu klaeren:" block, and only for the right HazardCategory (e.g. FANUC DCS only on mechanical_hazard). - For a Bremse project narrative mentioning "Fanuc Robodrill", the engine now automatically adds clarification questions like "Ist DCS am Roboter konfiguriert?" to the relevant mechanical hazards. 
Tests: 7 new pin tests — manufacturer count, norm prefixes, FANUC/KUKA detection in narrative, umlaut robustness (Staeubli vs Staubli). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 23:04:56 +02:00
Benjamin Admin	7e426c31f1	feat(consent-tester): Phase B — named CMP library + plugin architecture cmp_extractor.py refactored to thin coordinator (123 LOC, was 223). Discovers all CMP modules via cmp_library/_registry.py:load_all() at import time. Restart consent-tester to pick up new modules. New cmp_library/ folder: - _registry.py: auto-discovers all modules with MATCHER + reconstruct() - epaas.py: BMW Group ePaaS (extracted from cmp_extractor) - onetrust.py: cdn.cookielaw.org Groups/Cookies schema - cookiebot.py: consent.cookiebot.com Categories schema - usercentrics.py: api.usercentrics.eu services schema - didomi.py: sdk.privacy-center.org notice + vendors + purposes - trustarc.py: consent.trustarc.com categories + vendors Each module: - MATCHER: re.Pattern matching the CMP JSON endpoint URL - reconstruct(d: dict) -> str: builds German Markdown cookie-policy text Phase E (self-improving) will write auto_*.py files into the same folder; _registry already picks those up via pkgutil.iter_modules.	2026-05-16 22:59:48 +02:00
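The plugin contract described in 7e426c31f1 (each module exposes a `MATCHER` pattern and a `reconstruct()` function, discovered via `pkgutil.iter_modules`) could look roughly like the sketch below. It is not the repository's `_registry.py`: `collect_plugins`, `find_plugin`, the body of `load_all` and the two stand-in modules are assumptions made for illustration.

```python
import importlib
import pkgutil
import re
from types import SimpleNamespace
from typing import Callable, Iterable, List, Optional, Tuple

# (module name, URL matcher, reconstruct function)
Plugin = Tuple[str, re.Pattern, Callable[[dict], str]]


def collect_plugins(modules: Iterable[object]) -> List[Plugin]:
    """Keep only modules that expose both MATCHER and reconstruct()."""
    plugins: List[Plugin] = []
    for mod in modules:
        matcher = getattr(mod, "MATCHER", None)
        reconstruct = getattr(mod, "reconstruct", None)
        if isinstance(matcher, re.Pattern) and callable(reconstruct):
            plugins.append((getattr(mod, "__name__", "?"), matcher, reconstruct))
    return plugins


def load_all(package) -> List[Plugin]:
    """Import every submodule of `package` and collect the valid plugins."""
    modules = [
        importlib.import_module(f"{package.__name__}.{info.name}")
        for info in pkgutil.iter_modules(package.__path__)
    ]
    return collect_plugins(modules)


def find_plugin(plugins: List[Plugin], url: str) -> Optional[Plugin]:
    """Return the first plugin whose MATCHER matches the response URL."""
    for plugin in plugins:
        if plugin[1].search(url):
            return plugin
    return None


# Stand-in modules for illustration only (not the real cmp_library code).
onetrust = SimpleNamespace(
    __name__="onetrust",
    MATCHER=re.compile(r"cdn\.cookielaw\.org"),
    reconstruct=lambda d: "# Cookie-Richtlinie\n" + ", ".join(d.get("Groups", [])),
)
helper = SimpleNamespace(__name__="_helpers")  # no MATCHER, so it is ignored

PLUGINS = collect_plugins([onetrust, helper])
```

Because discovery only checks for the two attributes, a generated `auto_*.py` file dropped into the folder would be picked up without touching the coordinator, which matches the Phase E note in the commit message.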
Benjamin Admin	4f19310130	fix(iace): HP1654 gripper breaks through fence: DCS reference GT 1.8 specifically requires the 'safely limited motion range (Dual Check Safety)'. HP1654 had only M061 'Feste trennende Schutzeinrichtung' (fixed guard) as a mitigation. Added M494 (Safe Limited Position/Space with DCS explanation), M501 (safety-fence load rating) and M502 (gripper fail-safe). The clarification questions refer explicitly to DCS for FANUC, SafeMove for ABB, SafeOperation for KUKA and the EN ISO 13849-1 PLd/Kat.3 validation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:56:40 +02:00
Benjamin Admin	8283483909	feat(consent-tester): Phase A — generic JSON cookie-policy heuristic New module cmp_heuristic.py with: - looks_like_cookie_policy(data): shape-based classifier (top-level keys cookies/categories/providers/vendors/purposes/cookieList/etc. + at least 2 name+description objects, or IAB TCF v2 vendors[]+purposes[]) - reconstruct_generic(data): walks JSON, extracts name + description fields + standalone prologue/dataController/persistence fields, emits flat German Markdown text (max 5000 words, dedup) cmp_extractor.py wired so that AFTER named CMP matchers (epaas, onetrust) fail, every JSON response on the page is tested for the heuristic. If matched, payload is captured as '_heuristic' kind and reconstructed via the generic walker. This is Phase A of the 4-stage cascade (B-D follow). Unknown CMPs that return JSON now work without hand-coding each one. Pre-filter: skips response paths /api/config, /beacon, /track, /analytics, /fonts/, /log/, /heartbeat/, /.well-known/ to avoid spamming the heuristic on every Playwright load.	2026-05-16 22:56:20 +02:00
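A minimal sketch of the shape-based classifier described in 8283483909, under stated assumptions: it uses only the top-level keys the message names (the "etc." keys are left out), and the helper `_count_described_objects` is a hypothetical name. The real `cmp_heuristic.py` may weigh these signals differently.

```python
from typing import Any

# Only the keys named in the commit message; the real list is longer.
POLICY_KEYS = {"cookies", "categories", "providers", "vendors", "purposes", "cookieList"}


def _count_described_objects(node: Any) -> int:
    """Count dicts anywhere in the tree that carry a name and a description."""
    if isinstance(node, dict):
        own = int(
            isinstance(node.get("name"), str)
            and isinstance(node.get("description"), str)
        )
        return own + sum(_count_described_objects(v) for v in node.values())
    if isinstance(node, list):
        return sum(_count_described_objects(v) for v in node)
    return 0


def looks_like_cookie_policy(data: Any) -> bool:
    """Return True when a JSON payload has the shape of a cookie policy."""
    if not isinstance(data, dict):
        return False
    # IAB TCF v2 shape: vendors[] together with purposes[].
    vendors, purposes = data.get("vendors"), data.get("purposes")
    if isinstance(vendors, list) and vendors and isinstance(purposes, list) and purposes:
        return True
    # Otherwise require a known top-level key plus two described objects.
    if not POLICY_KEYS & data.keys():
        return False
    return _count_described_objects(data) >= 2
```

Requiring two name+description objects rather than one keeps single-entry config payloads from matching.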
Benjamin Admin	9814b56f2f	fix(cookie-extract): max_documents=1 + faster networkidle bail (Phase 0 fix) Root cause of the recurring 603-word BMW result: - DSI discovery for the cookie-policy URL was hitting 4x networkidle timeouts (60s each = ~240s total). - The backend httpx timeout (180s after the previous fix) gave up before the consent-tester finished, falling through to the raw HTTP fetch, which returned BMW's SSR navigation chrome (603 words) as the 'cookie policy'. Two orthogonal fixes: 1. _fetch_text now passes max_documents=1 for user-specified URLs. We only want self-extraction of THAT page; link-following is unnecessary noise. 2. networkidle wait_until window dropped 60s -> 15s. SPAs like BMW/Daimler never reach networkidle anyway; the 60s wait was pure latency. Falls through to domcontentloaded+5s render-wait, same as before.	2026-05-16 22:53:23 +02:00
Benjamin Admin	69729ef6ac	feat(iace): norm references in mitigations + aggregated norm panel per hazard Library measures carry NormReferences (EN/IEC/ISO/DIN/TRBS/TRGS Ziff./Kap./Pos.) but they were dropped on persist: CreateMitigationRequest only wrote Name + Description. The expert benchmark file lists norms for 34 of 60 hazards; the engine already had this data but lost it on the way to the UI. Fix without DB schema change: - Mitigation.Description gets a "Normen: EN 60204-1 Ziff. 6.2 | EN 61140" line appended when the measure has NormReferences. The pipe separator keeps the inline panel short and grep-friendly. - After all mitigations land, the aggregated, deduplicated norm list for the hazard is appended to Hazard.Description as a single "Referenzierte Normen: ..." line so the UI can show one panel per hazard without scanning every mitigation. An audit of library coverage (per pattern) showed that the GT-Bremse norms are generally present and richer: - HP1640 covers GT 2.2 (EN 60204-1 Ziff. 6.2, Ziff. 8.2.3, EN 61140 +) - HP1641 covers GT 2.4 (EN 60204-1 Ziff. 8.2.6 +) - HP1605 covers GT 1.7 (ISO 10218-1 Ziff. 5.6.2, 5.8.3; Ziff. 5.7.3 is missing) - HP1671 covers GT 1.30 (EN 12417; the Pos. detail is missing) Followup: 2 fine-grained sub-paragraph references (5.7.3, Pos. 1.1.4) can be added later as measure-text updates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:51:50 +02:00
Benjamin Admin	35d6422247	fix(iace): HP1632 burst pattern: unique zone for dedup ZoneDE 'Pneumatikkomponenten der Anlage' collides, after normalizeZoneKey, with HP1630 'Pneumatikschlaeuche der Automation' in the 3-significant-word comparison. The new zone 'Berstgefaehrdete Druckwandungen Pneumatik (Leitungswand, Dichtung, Verschraubung)' has semantically distinct keywords, so dedup no longer merges it into HP1630. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:34:51 +02:00
Benjamin Admin	5ea68ebea4	feat(iace): clarification questions + HP1632 Bersten + HP1637 KSS-Aerosol fix Three lasting improvements, driven by the Bremse benchmark cases GT 1.4, GT 1.30 and GT 7.4. The engine still does not invent expert comments: comments are left out because they require an understanding of the specific plant that the engine does not have. Instead, the engine supplies norm-based clarification questions and a more precise pattern vocabulary. A) HazardPattern.ClarificationQuestionsDE, a new optional field: - A pattern stores verifiable questions that the operator clarifies with the plant builder. Examples: - HP1640: "Liegt ein Pruefprotokoll nach EN 60204-1 vor?" - HP1666: "Ist die WZM als CE-konformes Subsystem integriert?" - HP1604: "Ist DCS am Roboter konfiguriert und validiert?" - The init handler appends the questions to Hazard.Description under the marker "Mit Anlagenbauer zu klaeren:". No DB schema change is required. - 11 patterns now carry clarification questions (HP1602, HP1604, HP1611, HP1612, HP1620, HP1622, HP1637, HP1640, HP1641, HP1666, HP1685). B) HP1632 "Bersten druckbeaufschlagter Pneumatik-Komponente" (bursting of a pressurized pneumatic component), a new pattern that is semantically DISTINCT from HP1630 "Abspringen" (detaching): - Bersten = material/pressure failure of the component, medium escapes - Abspringen = connection comes loose, whip effect The Bremse benchmark GT 1.4 spoke of Bersten, HP1630 only of Abspringen, so a 66% frontend match was a dead end. With HP1632 the engine fires a separate hazard that gives a clean direct hit on GT 1.4. C) HP1637 "Einatmen von KSS-Aerosolen" (inhalation of KSS aerosols), measures completed: previously only M141 (safety signs), now additionally M405 (KSS aerosol extraction), M418 (AGW monitoring), M526 (WZM doors closed during machining), M408 (skin protection plan). Clarification question: "Wurde die Aerosolkonzentration nach Bearbeitungsende messtechnisch ermittelt und mit dem AGW verglichen?" 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:23:56 +02:00
Benjamin Admin	41023f6343	fix(iace): HP1671 compressed-air injury: 4 additional GT-1.30 measures HP1671 "Druckluft-Verletzung in Bearbeitungszelle" did match the GT-1.30 scenario "Einstich, Augenverletzung in Bearbeitungszelle" exactly by name and scenario, but had only a single measure, M061 "Feste trennende Schutzeinrichtung". The expert's three specific measures (cleaning nozzle integrated in the cell / compressed air off when the door opens / enclosure load rating) stayed invisible, because my new GT-Bremse pattern HP1712 knows these measures but does not fire in this project due to RequiredEnergyTags=["pneumatic"]. Fix: HP1671 SuggestedMeasureIDs ["M061"] -> ["M504", "M505", "M501", "M061", "M141"]. EN 12417 Kap. 5.2 / Pos. 1.1.4 is now covered by M504/M505. HP1712 remains as a backup pattern for projects with an explicit pneumatic tag. Followup: HP1671 and HP1712 are semantically redundant; consolidation is part of the next pattern-hygiene iteration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:08:05 +02:00
Benjamin Admin	6689b37f95	fix(agent): bump _fetch_text timeout 60s->180s The dsi-discovery in consent-tester does self-extraction + follows up to 3 sub-links + waits for CMP JSON payloads. On big SPAs (BMW, Daimler) this routinely exceeds 60s. When it timed out, the HTTP fallback returned the SSR shell as text — for the BMW cookie page that's 603 words of site navigation, which then registered as 'Cookie-Richtlinie nicht im eingereichten Text' (33%). With 180s the consent-tester finishes cleanly and we get the CMP-captured 1824 words of real policy.	2026-05-16 22:00:42 +02:00
Benjamin Admin	80d62a0c5f	fix(iace): rename 58 duplicate HP-IDs in extended.go/extended2.go Background: hazard_patterns_extended.go (HP045-074) and _extended2.go (HP074-102) shared their entire ID range with the semantically different patterns in hazard_patterns_cobot.go, hazard_patterns_press.go, hazard_patterns_operational.go and hazard_patterns_extended_dguv.go. The collision had lived unnoticed because TestGetBuiltinHazardPatterns_UniqueIDs only checks the 44 builtin patterns (HP001-HP044). Examples of the collision: - HP059 = "Kollision Mensch-Roboter" (cobot.go) vs "Kupplung — mechanisch" (extended.go) - HP060 = "Quetschen durch Werkzeug am Cobot" (cobot.go) vs "Diagnosemodul — Software" (extended.go) - HP073 = "Wartung ohne LOTO" (operational.go) vs "Hydraulikventil — hydraulisch" (extended.go) At runtime collectAllPatterns() returned both patterns under the same ID, which made downstream lookups (e.g. the hazardPatternMeasures map keyed by pattern_id) non-deterministic: last-loaded wins, silently dropping the other pattern's mitigation set. Rename strategy (no deletes, because both patterns are real and earn their SuggestedMeasureIDs after the category-filter work): extended.go HP045..HP073 -> HP1800..HP1828 (29 IDs) extended2.go HP074..HP102 -> HP1830..HP1858 (29 IDs) cobot/press/operational/extended_dguv keep their original IDs because: - compliance_triggers.go references HP059/HP060 with the cobot meaning - pattern_engine_test.go references HP073 with the LOTO/maintenance meaning - phase3_4_test.go references HP073 the same way New regression test: - TestAllPatterns_UniqueIDs runs over collectAllPatterns() and fails if ANY pattern in the runtime set duplicates an ID. The old TestGetBuiltinHazardPatterns_UniqueIDs stays for the builtin subset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 22:00:06 +02:00
Benjamin Admin	6a3e96d54c	fix(iace): set-based measure-category filter + 235 pattern-author fixes Two-part lasting fix replacing the previous "fill to 5 mitigations no matter what" behavior that the GT-Bremse benchmark proved unfaithful (e.g. HP1625 "scharfe Kanten" returning M005 "Rotationsbewegung vermeiden" via category fallback; HP1651 "Wiederanlauf Roboter" returning M054 "Sichere thermische Auslegung" via a mismatched pattern reference). PART A: Set-based category filter (handlers package): - acceptableMeasureCategories: replaces the 1:1 patternCatToMeasureCat with a curated set per pattern category, so e.g. safety_function_failure now accepts software_control measures (watchdogs, plausibility checks) and emc_hazard accepts both electrical and software_control measures - isCategoryCompatible: gates every measure ID against the accepted set before creating a mitigation; mismatches log MEASURE-SKIP - The old category fallback is REMOVED. A hazard whose pattern has no category-compatible measure is now created with zero mitigations and logged as COVERAGE-GAP, so the operator must consult an expert. No more silent invention of generic defaults. 
PART B: 235 pattern author-error fixes across 26 files: - HP040-HP044 (AI): M101/M102/M103 (Auffangwanne/Absauganlage) -> M133 Anomalieerkennung + M214 Plausibilitaet + M213 Sensor-Redundanz + M044 Zweikanalige Steuerung + others - HP011-HP015, HP104-HP109, HP1085-HP1095, HP1281-HP1334 (electrical): M001-M005/M054/M061 placeholders -> M481/M482 Isolation + M511-M522 PE/Schutzleiter/RCD/Hauptschalter - HP110-HP1331 (material_environmental): M101-M103 -> M384-M395 Brandschutz/Laserschutz + M533/M408 SDB/PSA - HP800-HP858, HP1178-HP1264 (software/sensor/hmi): M101/M104 -> M105/M106/M107/M214 SPS/Watchdog/Plausibilitaet - HP026, HP611-HP1690 (ergonomic): M001/M082 -> M353-M360 + M530-M532 Hebehilfe/ergonomische Hoehe - HP201-HP1697 (mechanical): M054/M051 -> M002/M008/M061/M141 + M487/M488 Tueroeffnung-Stillsetzung/Wiederanlauf - Plus EMF/radiation/fire/noise/vibration/communication/cyber Coverage shift (pattern-author errors with the set filter enabled): start: 237 patterns with zero category-compatible measures after Stage 1A: 5 (AI) after Stage 1B: 20 (mechanical, existing) after Stage 1C: 35 (electrical, existing) after Stage 1D: 29 (material_environmental) after Stage 1E: 29 (software/sensor/hmi) after Stage 1F: 20 (ergonomic) after Stage 1G: 80 (thermal/comm/radiation/fire/safety) final: 0 (28 extended.go/extended2.go duplicates fixed) New regression tests: - TestEveryPattern_HasCategoryCompatibleMeasure: every pattern in collectAllPatterns() must reference at least one category-compatible measure; gaps must be explicitly listed in AllowlistKnownGaps (currently empty). Fails CI for any new pattern that drifts. - TestAcceptableMeasureCategories: pins the set-mapping for the 7 most-bug-prone pattern categories. - TestIsCategoryCompatible_EmptyMeasureCat: protects legacy entries. 
A separate task #11 tracks 58 HP-ID duplicates between extended.go/extended2.go and cobot.go/press.go/operational.go: the patterns are semantically different, and TestGetBuiltinHazardPatterns_UniqueIDs misses them because it only checks HP001-HP044. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 21:11:02 +02:00
Benjamin Admin	938f9a6c51	fix(cmp): tolerate variable URL segments in ePaaS policy pattern BMW ePaaS URLs use 3 segments between /policypage/ and .epaas.json: /epaas/prod/policypage/<tenant>/<config-hash>/<locale>.epaas.json The old pattern only matched 2 segments. Switch to a tolerant pattern that matches any path before .epaas.json (anchored at .epaas.json end).	2026-05-16 20:58:48 +02:00
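The difference between the two patterns in 938f9a6c51 can be illustrated as follows. Both regular expressions are reconstructions from the commit text, not copies of the repository's code, and the host `example.invalid` and the tenant and hash values are placeholders.

```python
import re

# Hypothetical reconstruction of the old pattern: exactly two path
# segments between /policypage/ and the .epaas.json suffix.
OLD_EPAAS = re.compile(r"/policypage/[^/]+/[^/]+\.epaas\.json$")

# Tolerant variant: any path after /policypage/, anchored at the
# .epaas.json end of the URL.
NEW_EPAAS = re.compile(r"/policypage/.+\.epaas\.json$")


def is_epaas_policy_url(url: str, pattern: re.Pattern = NEW_EPAAS) -> bool:
    """Return True when the response URL looks like an ePaaS policy JSON."""
    return pattern.search(url) is not None


three_segments = (
    "https://example.invalid/epaas/prod/policypage/tenant/abc123/de_DE.epaas.json"
)
two_segments = "https://example.invalid/epaas/prod/policypage/tenant/de_DE.epaas.json"
```

With these definitions the old pattern accepts `two_segments` but rejects `three_segments`, while the tolerant pattern accepts both and still rejects URLs that do not end in `.epaas.json`.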
Benjamin Admin	17a93bc694	fix(consent-tester): prefer CMP-JSON over thin DOM extraction Previous threshold (DOM < 300 words) missed the BMW case where Playwright extracted 346 words of pure site navigation. The CMP JSON had 1673 words of real policy content but was discarded. New heuristic: prefer CMP when ANY of: - DOM < 300 words (existing) - CMP text >= 1000 words (authoritative at scale) - CMP text >1.5x longer than DOM	2026-05-16 20:56:11 +02:00
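The three-way preference rule from 17a93bc694 can be sketched as a single predicate. The function names and the whitespace-based word count are assumptions; the thresholds (300 words, 1000 words, factor 1.5) come from the commit message.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def prefer_cmp_text(dom_text: str, cmp_text: str) -> bool:
    """Decide whether the CMP-reconstructed text should replace the DOM text."""
    dom_words = word_count(dom_text)
    cmp_words = word_count(cmp_text)
    if cmp_words == 0:
        # Assumption: with no CMP payload there is nothing to prefer.
        return False
    return (
        dom_words < 300                 # thin DOM extraction (existing rule)
        or cmp_words >= 1000            # CMP text is authoritative at scale
        or cmp_words > 1.5 * dom_words  # CMP text is clearly longer
    )
```

For the BMW case in the message (346 DOM words versus 1673 CMP words) the first condition fails, but the second and third both hold, so the CMP text wins.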
Benjamin Admin	1792c6f896	fix(consent-tester): capture CMP JSON to extract dynamically-loaded cookie policies BMW (and other big enterprise sites) do NOT render cookie policies as static HTML. Their widget loads structured data from a JSON endpoint (BMW: ePaaS at /epaas/prod/policypage/.../<locale>.epaas.json) and renders it client-side after consent. Our DOM extraction therefore only captured site navigation (603 words of header/footer chrome), not the actual policy. New module consent-tester/services/cmp_extractor.py: - CMPCapture: response listener that catches policy JSON during navigation - Reconstructors for ePaaS (BMW) + OneTrust placeholder - Returns Cookie-Richtlinie text built from policyPageMetadata + categories + providers (BMW: 1673 words reconstructed vs. 603 noise) dsi_discovery.py: - Attach CMPCapture before page.goto - After self-extraction: if rendered DOM < 300 words AND CMP captured a payload, prefer the CMP-reconstructed text. This bypasses the empty '.cookie-policy' div problem entirely.	2026-05-16 20:50:15 +02:00
Benjamin Admin	e61e9d9e2a	feat(agent): progress_pct + 6 BMW-run improvements Backend (agent_compliance_check_routes.py): - progress_pct (0-100%) in the job state, distributed across all phases (loading 0-30, profile 35-40, checking 40-80, banner 80-92, report 95-100) - Status texts unified ("Texte laden X/N", "Pruefen X/N") - The company name for the email subject is now derived from the URL (bmw.de -> "BMW", mercedes-benz.de -> "Mercedes-Benz") instead of the unreliable extracted_profile.companyName (which often matched juris.de) - The email report now includes the banner + TCF vendor list (build_provider_list_html) Backend (agent_doc_check_extras.py, new): - build_scanned_urls_html: checked URLs as a table at the top of the report (makes transparent to management which sources were actually fetched) - Cross-domain notice when there is more than one netloc (BMW: bmw.de / bmwgroup.com / bmwgroup.jobs; discoverability under Art. 12 GDPR) - build_provider_list_html: banner box + TCF vendor table with the columns Name | Kategorie | Zweck | Drittland | Rechtsgrundlage Backend (business_profiler.py): - §34d GewO insurance-intermediary notices no longer count as the "finance" industry (this caused BMW to be wrongly classified as B2B/finance) - New industry "automotive" (Fahrzeug/KFZ/Konfigurator/Modellpalette) - B2B keywords: generic terms such as "unternehmen", "beratung", "consulting" removed (they matched in every corporate text) - B2C fallback: with consumer signals ("widerruf", "kunde", editorial content) the classification leans towards b2c instead of b2b Frontend (ComplianceCheckTab.tsx): - Progress bar with width % and an XX% readout on the right - reads data.progress_pct from the polling response Consent-Tester (dsi_discovery.py): - Critical fix to cookie-policy extraction: wait_for_function until body.innerText > 500 chars (BMW SPA rendering needed more time) - _extract_text_robust: 3-strategy extraction (selectors -> body cleanup -> P/LI/TD tags) - _extract_text_from_iframes: reads OneTrust/Sourcepoint/Usercentrics iframe contents (some cookie policies live there) Addresses all findings from the BMW ground-truth comparison.	2026-05-16 17:53:14 +02:00
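One way to spread `progress_pct` over the phase windows listed in e61e9d9e2a is linear interpolation inside each window. This is only a sketch: the English phase keys, the function signature and the interpolation itself are assumptions, and only the percentage windows come from the commit message.

```python
from typing import Dict, Tuple

# Percentage window per phase, as listed in the commit message.
PHASE_RANGES: Dict[str, Tuple[int, int]] = {
    "load": (0, 30),
    "profile": (35, 40),
    "check": (40, 80),
    "banner": (80, 92),
    "report": (95, 100),
}


def progress_pct(phase: str, done: int, total: int) -> int:
    """Map 'done of total' items in a phase onto the phase's window."""
    start, end = PHASE_RANGES[phase]
    # Clamp the fraction so bad counters can never leave the window.
    fraction = 0.0 if total <= 0 else min(max(done / total, 0.0), 1.0)
    return int(round(start + (end - start) * fraction))
```

For example, two of four texts loaded gives 15, and five of ten documents checked gives 60. Clamping keeps the bar monotonic within a phase even if a counter overshoots.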
Benjamin Admin	4d1e0a7f8e	feat(iace): GT-Bremse coverage — 59 expert measures + 7 hazard patterns Systematic gap analysis of the Bremse ground-truth file (60 entries, 100 unique expert measures) revealed only ~5% library coverage. This commit closes the documented gaps with concrete, norm-anchored mitigations. Library additions (M481-M539, 59 entries): - M481-M482 Low-voltage isolation (>= 2,0 / 2x1,0 / 1,0 MOhm + IP2X/IPXXB per EN 60204-1 Ziff. 6.2/8.2.3) — primary trigger of this work - M483-M485 Pneumatic safety (component pressure rating, hose retention, depressurization per EN ISO 4414) - M486-M490 Robot-cell access (tool-secured fence, dual-channel door monitor, intentional restart, anti-trap inside opening, HMI sight line per ISO 10218-2) - M491-M493 Teach mode (key/password mode selector, safe reduced speed <= 250 mm/s, hold-to-run with 3-stage enabler per ISO 10218-1) - M494-M500 Geometry constants (Safe Limited Position, reach-over 250 mm @ 2250 mm fence, conveyor opening >= 850 mm, 25 mm finger gap, band speed <= 100 mm/s per EN ISO 13857 / EN 619) - M501-M507 Enclosure load rating, gripper fail-safe, centring gripper stop on door, MWF nozzle integration, floor load capacity per DIN 1055-3 - M508-M517 Electrical cabling + PE protection (environment-rated, drag chain, strain relief, 10 mm² Cu PE, dual PE, monitoring, continuity check, class-II equipment, SELV/PELV per EN 60204-1) - M518-M522 RCD, cable cross-section, overcurrent in each active conductor, IP22 water ingress, lockable main switch - M523-M539 Teach-locked door, WZM door interlock, dual-channel door switch, machining-doors-closed for aerosol retention, post-NOTHALT release, >25 kg lifting aid (DGUV 208-016), 95-120 cm control height, ergonomic conveyor height, SDS/PSA reference, BA instructions for depressurization/clamp release/max weight/pinch warning/slip warning/dead-state cleaning New hazard patterns (HP1710-HP1717): floor overload, gripper failure throw, compressed-air injury in 
machining cell, manual handling load + awkward posture, MWF skin contact, live-cabinet cleaning short, pneumatic stored-energy. Existing patterns rewired to the new measures: HP1600, HP1602-1606, HP1610-1612, HP1620-1622, HP1630/1631/1633, HP1640/1641, HP1660/1661, HP1675, HP1685, HP1688, HP1689, HP1698-1704. Tooling: - scripts/gt_measure_gap_analysis.py: 4-signal fuzzy matcher (Jaccard, token recall, substring containment, norm-reference overlap). Outputs markdown + JSON. - gt_coverage_test.go: 23 expert-validated (GT-Nr, pattern, measure) triples + a norm-reference presence test for every new expert measure (no generic 'do X safely' entries allowed). - .gitea/workflows/ci.yaml: new iace-gt-coverage job enforces MIN_COVERAGE_PCT (70%) on Strong+Weak GT coverage; never lower without explicit decision. Coverage shift: 5% Strong -> 30% Strong, 0% -> 72% Strong+Weak. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 13:08:52 +02:00
Benjamin Admin	bf9d8a5ed3	fix(iace): resolve M-ID collisions for electrical/pressure patterns 6 supplementary measures (M410-M420) were silently overwritten by metalworking duplicates in measureByID lookups, so robot-cell electrical patterns resolved to chip-extraction/cleaning fallbacks instead of equipotential bonding, creepage, EMC, or hose-burst protection. Rename supplementary IDs to M475-M480 and rewire 13 affected pattern references in robot_cell + robot_cell_ext. HP1640 (direct contact with live parts, GT 2.2): priority 98->99, drop the RequiredEnergyTags gate so it fires in robot cells without an electrical tag, expand mitigations to 5 concrete TRBS 2131 / IEC 60204-1 / EN 61140 measures (basic protection, double insulation, earthing, insulation monitoring, equipotential bonding). It was previously losing to HP1688 even though HP1688 describes a different scenario. HP1688 (touch voltage from potential differences): priority 98->96 so it no longer outranks HP1640 for the direct-contact case; mitigations expanded from M410-only to 4 concrete electrical measures. Add regression tests pinning HP1640 contact-protection resolution and M475 = Potentialausgleich. The existing TestGetProtectiveMeasureLibrary_UniqueIDs now actually enforces uniqueness (previously masked by the last-wins map override). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 10:12:55 +02:00
Benjamin Admin	d45e08e25f	fix: reduce Playwright timeout 180s→60s, increase poll limit 15→25min	2026-05-16 00:47:28 +02:00
Benjamin Admin	3dbf3aa34a	feat: HTTP fallback for text extraction when Playwright times out BMW Impressum/Cookie pages time out in Playwright (>180s) because the SPA has many sub-links to follow. But the HTML source already contains the text (SSR). New fallback: direct HTTP GET + HTML tag stripping. Order: 1. Consent-tester (Playwright, 180s) → 2. HTTP GET (30s) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 23:16:10 +02:00
Benjamin Admin	77308b783f	debug: log CreateMitigation errors	2026-05-15 21:52:04 +02:00
Sharang Parnerkar	3784988d00	chore: bump next 15.1.0 → 15.5.16 (CVE-2026-44578) Patches unauthenticated SSRF in WebSocket upgrade handler. Applies to admin-compliance, developer-portal. Compliance-SDK admin-dashboard skipped: it has a pre-existing TS type mismatch that blocks the build regardless of Next version. Needs separate migration work. GHSA-c4j6-fc7j-m34r. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 18:48:36 +02:00
Benjamin Admin	9797234ff6	fix(iace): add abbreviations + action words to genericSafetyTerms KSS, EMV, ESD, DCS, PLR, SIL, HMI, SPS, RCD, LOTO, PSA are abbreviations that should NOT trigger the relevance filter. bersten, platzen, abspringen, spritzen, einatmen, ausrutschen, herabfallen, durchschlaegen, wegschleudern are action words that appear in many patterns and don't indicate a specific machine. Fixes: HP1633-HP1675 (KSS patterns) were filtered out because "kss" was not in the narrative but also not in genericSafetyTerms. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 16:05:20 +02:00
Benjamin Admin	7080eb5f45	fix(iace): boost robot cell priorities 96-99, remove debug code Robot cell patterns now fire BEFORE generic patterns (Priority 96-99 vs generic 85-95). This ensures pattern-specific SuggestedMeasureIDs (M420 for KSS, M410 for Potentialausgleich) reach the hazard. Removed debug fmt.Println statements. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 16:01:52 +02:00
Benjamin Admin	c93cf2719a	debug: trace M420 in Priority-1 loop	2026-05-15 14:56:05 +02:00
Benjamin Admin	7a27dbc01b	debug: check M420 in measureByID	2026-05-15 14:53:49 +02:00
Benjamin Admin	de35dfce18	debug: add pattern-measure count to init step details	2026-05-15 14:51:26 +02:00
Benjamin Admin	69240faf24	fix(iace): accumulate SuggestedMeasureIDs across dedup'd patterns When multiple patterns match the same category+zone, the first creates the hazard and later patterns add their SuggestedMeasureIDs to the existing hazard. This ensures KSS-specific measures (M420) reach the hazard even if a generic pattern created it first. seenCatZone changed from map[string]bool to map[string]uuid.UUID to track which hazard ID was created for each dedupKey. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 14:45:37 +02:00
Benjamin Admin	f34305c0a1	fix: increase dsi-discovery timeout 90s→300s, reduce max_documents 10→5	2026-05-15 14:21:13 +02:00
Benjamin Admin	2b5376ed54	fix(iace): pattern-specific measures take priority over category fallback Each hazard now gets measures from its SOURCE PATTERN first (SuggestedMeasureIDs), then category fallback for remaining slots. Previously all mechanical hazards got the same generic top-5 measures (Gefahrstelle eliminieren, Sicherheitsabstaende, Scharfe Kanten...). Now a KSS-Schlauch hazard gets M420 (Druckfeste Auslegung) first. SuggestedMeasureIDs added to PatternMatch struct and passed through from pattern definition to hazard creation to measure assignment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 14:17:32 +02:00
Benjamin Admin	958c03ab40	fix(iace): add human reference to all 33 robot cell patterns Every ScenarioDE now describes how a PERSON is affected, not just what happens to the machine. Every HarmDE describes the INJURY, not just the technical effect. Examples: - "Peitscheneffekt des Schlauchs" → "Person wird von abspringendem Schlauch getroffen. KSS-Spritzer verletzen Haut und Augen." - "Kurzschluss, Brand" → "Person wird durch Brand oder toxische Rauchgase verletzt. Verbrennungen, Rauchvergiftung." Rule: a risk assessment evaluates the danger to PERSONS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 13:43:54 +02:00
Benjamin Admin	fca67c1f43	fix: accordion close bug + merge multi-page DSIs (BMW fix) 1. _expand_all_interactive(): Only click aria-expanded="false" buttons. Before: clicked ALL accordion buttons including open ones → BMW's pre-expanded accordions got CLOSED, reducing the text from 1151 to 361 words. 2. _fetch_text() + /extract-text: merge ALL documents found on a page (max_documents=10 instead of 1). BMW splits the DSI across 5 sub-pages that the discovery finds as separate documents; these are now merged. 3. Tab panels: unhide hidden tabpanels instead of clicking tabs (clicking tabs can hide the currently visible panel). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 13:32:04 +02:00
Benjamin Admin	70af018da5	docs(gt): BMW cross-domain finding — 3 domains, no AGB, Social Media on jobs portal	2026-05-15 13:21:27 +02:00
Benjamin Admin	0182c91ef9	docs(gt): BMW fully verified — URLs, DSB, Impressum, Social Media data	2026-05-15 12:01:20 +02:00
Benjamin Admin	a67cfa7c4a	fix(gt): update BMW URLs (all old URLs are 404 since 2026)	2026-05-15 10:38:07 +02:00
Benjamin Admin	3b7ab4cbd7	feat(iace): 50% display threshold, weak matches shown separately Matches below 50% are now split: - GT entries → "Fehlend" tab (not matched by engine) - Engine entries → "Engine Findings" tab (additional findings) Only matches >= 50% are shown in the "Zugeordnet" tab. The coverage score now counts only real matches (>= 50%). The "Extra" tab is renamed to "Engine Findings" for clarity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 10:33:29 +02:00
Benjamin Admin	3469105d18	feat(iace): HP1606 + HP1634 — target 100% GT coverage HP1606: Quetschen/Scheren durch Greifer im Einrichtbetrieb (GT 1.14) HP1634: KSS-Pumpe spritzt bei geoeffneter Schutztuer (GT 1.38) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 10:20:42 +02:00
Benjamin Admin	1414c63515	feat(iace): HP1605 + HP1633 — final 2 patterns for GT coverage HP1605: Stoss durch Werkzeug/Greifer im Einrichtbetrieb (GT 1.14) HP1633: KSS-Versorgungsschlauch platzt oder reisst ab (GT 1.35) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 10:16:39 +02:00
Benjamin Admin	9f87bc5a2c	fix: include website/company name in compliance-check email subject	2026-05-15 10:15:34 +02:00
Benjamin Admin	f5f4de7359	fix(iace): remove RequiredEnergyTags from electrical patterns Energy tag "electrical" doesn't match resolved tags (which are "high_voltage", "electrical_part", etc.). Patterns HP1685-HP1699 now fire without energy tag requirement — they fire for any project that has the right component tags. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 10:13:00 +02:00
Benjamin Admin	38d15d4d29	feat(iace): 5 differentiated patterns for GT duplicate scenarios When GT has two entries for the same zone with different scenarios (e.g. "eingeklemmt" vs "getroffen"), we need separate engine patterns. HP1700: Getroffen von bewegtem Werkzeug/Greifer (vs HP1652 eingeklemmt) HP1701: Greifer/Werkzeug durchschlaegt Zaun (vs HP1654 Werkstueck) HP1702: KSS-Schlauch platzt (vs HP1675 springt ab) HP1703: KSS-Bettspuelung bei offener Tuer (vs HP1670 allgemein) HP1704: Brand durch KSS auf elektrische Komponenten Extended synonym sets for potential/EMV matching. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 10:08:21 +02:00
Benjamin Admin	003eafa75d	fix(iace): synonym-cross-matching + expanded action words scenarioSimilarity now uses synonym-set cross-matching: if GT says "durchschlaegt" and Engine says "schleuder", the synonym set recognizes them as related. Added significantWordOverlap fallback when no action words found. Extended action terms: schlauch/druck/kuehlschmierstoff, pumpe/bettspuel, potential/bezugspotential, stoerung/emv. Moved extractActionWords to benchmark_synonyms.go (458+119 lines). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 10:03:23 +02:00
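A minimal sketch of the synonym-set cross-matching from commit 003eafa75d, written in Python for illustration (the commit places the real code in benchmark_synonyms.go). Only the "durchschlaegt"/"schleuder" pairing is taken from the commit message; the second set, the substring matching, and all function names are assumptions.

```python
# Each tuple is one synonym set. The first pairing is from the commit message;
# the second set is a hypothetical example.
SYNONYM_SETS = [
    ("durchschlaegt", "schleuder"),
    ("eingeklemmt", "quetsch"),
]


def synonym_sets_hit(text):
    """Return the indices of all synonym sets with at least one term in `text`."""
    lowered = text.lower()
    return {
        index
        for index, terms in enumerate(SYNONYM_SETS)
        if any(term in lowered for term in terms)
    }


def scenarios_related(gt_text, engine_text):
    """Two scenarios are related if they hit at least one common synonym set."""
    return bool(synonym_sets_hit(gt_text) & synonym_sets_hit(engine_text))
```

Substring matching lets a stem such as "schleuder" match an inflected form like "herausgeschleudert"; whether the engine matches stems this way is not stated in the log.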
Benjamin Admin	b82853a95b	feat(iace): scenario-based matching + split benchmark_synonyms.go 4-signal matcher: category (0.2), keywords (0.2), zone (0.3), scenario similarity (0.3). Scenario signal extracts action words (eingeklemmt vs herabfallend vs durchschlaegt) to differentiate similar-looking hazards at the same component. Split benchmark_synonyms.go (70 lines) from benchmark_matcher.go (516→450 lines) to stay under 500-line cap. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 09:58:12 +02:00
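The 4-signal matcher from commit b82853a95b amounts to a weighted sum. The Python sketch below uses the weights stated in the commit message; the function name and the assumption that each signal is normalized to the range 0..1 are illustrative, not taken from the Go source.

```python
# Weights as stated in the commit message; they sum to 1.0.
WEIGHTS = {"category": 0.2, "keywords": 0.2, "zone": 0.3, "scenario": 0.3}


def combined_score(signals):
    """Weighted sum of the four signals; a missing signal counts as 0.0.

    Each signal value is assumed to lie in the range 0..1.
    """
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())
```

Because zone and scenario together carry 0.6 of the weight, two hazards at the same component with different action words can still be told apart even when category and keywords agree.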
Benjamin Admin	c060ac222a	fix(iace): prioritize zone-specific matches in greedy assignment Sort matches by specificity first (zone overlap), then by score. Prevents generic matches from consuming specific Engine patterns that should match more specific GT entries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 09:45:08 +02:00
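The ordering change in commit c060ac222a can be illustrated with a small greedy assignment in Python. The sort key (zone overlap first, then score) follows the commit message; the tuple layout, the identifiers, and the one-to-one assignment rule are assumptions made for this sketch.

```python
def greedy_assign(candidates):
    """Assign each GT entry to at most one engine pattern, greedily.

    `candidates` is a list of (gt_id, engine_id, zone_overlap, score) tuples.
    Candidates are processed by zone overlap first and score second, so a
    zone-specific pairing is settled before a generic one can claim the pattern.
    """
    ordered = sorted(candidates, key=lambda c: (c[2], c[3]), reverse=True)
    used_gt, used_engine = set(), set()
    assignment = {}
    for gt_id, engine_id, _zone_overlap, _score in ordered:
        if gt_id in used_gt or engine_id in used_engine:
            continue  # one side is already taken by a more specific match
        assignment[gt_id] = engine_id
        used_gt.add(gt_id)
        used_engine.add(engine_id)
    return assignment
```

Sorting by score alone would let a generic GT entry with a high score take a pattern that a zone-specific entry needs; sorting by specificity first avoids that.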
Benjamin Admin	659c0505f8	fix: format code in batch test output	2026-05-15 09:44:48 +02:00
Benjamin Admin	02c2325e1b	feat(iace): 2 final patterns (Kriechstrecken, EMV) + matcher synonyms HP1698: Kurzschluss durch unzureichende Luft-/Kriechstrecken (GT 2.6) HP1699: EMV-Stoereinfluss auf Sicherheitsfunktionen (GT 6.1) Extended synonym sets: durchschlag/bewegungsbereich, potentialausgleich, kriechstreck, kuehlschmierstoff/bettspuel, rutsch/stolper. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 09:42:14 +02:00
Benjamin Admin	d72aa10691	feat: management summary for GF + batch GT test script. 1. Management Summary (agent_doc_check_report.py): - Plain-language action items for the managing director (Geschaeftsfuehrer) - Maps technical checks to business actions ("Ihren DSB erwaehnen", "Beschwerderecht ergaenzen", "Loeschfristen dokumentieren") - Shown at the top of the compliance-check email, before the detail report - Max 10 actions, max 3 per document 2. Batch GT Test (zeroclaw/scripts/batch_gt_test.py): - Runs all 10 GT websites through the compliance-check API - Prints a comparison table with L1 scores, word counts, services - Saves raw JSON results for analysis - Usage: python3 batch_gt_test.py --sites 1,6 --backend-url URL Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 09:39:19 +02:00
Benjamin Admin	3c05ff8ef6	fix(iace): lower threshold 0.20 + more synonym sets for GT matching Threshold 0.25→0.20 to recover matches lost by keyword penalty. New synonym sets: eingeschlossen/wiederanlauf, zentriergreifer, beladetuer/schutztuer, ergonom/bedienelemente, spritzer/auge, bersten. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 09:31:12 +02:00
Benjamin Admin	935c9205b9	feat(iace): 25 new robot cell patterns (HP1650-HP1697) + matcher fix New patterns from GT benchmark gap analysis: - HP1650-1655: Robot arm motion limit, restart safety, tool/workpiece crushing, workpiece penetrates fence, reaching over fence - HP1660-1661: Centering gripper crushing (outside/inside cell) - HP1665-1666: Machine tool loading door, machining workspace - HP1670-1671: Coolant splash eyes, compressed air injury - HP1675: Coolant hose burst/detachment - HP1680: Workpiece/tunnel crushing at conveyor - HP1685-1689: Indirect contact, cabinet contact, liquid ingress fire, potential differences, RCD socket protection - HP1690-1691: Ergonomic loading/control position - HP1695: Burns from hot workpieces - HP1697: Machine collapse through floor Matcher: keyword overlap penalty — matches without shared hazard-type keywords AND low zone score get 0.5x penalty to prevent false matches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 09:28:01 +02:00
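The keyword overlap penalty from commit 935c9205b9 could look roughly like this in Python. The 0.5x factor and the two conditions are from the commit message; the cut-off for a "low" zone score is not given in the log, so `low_zone` is a hypothetical parameter with an arbitrary default.

```python
def apply_keyword_penalty(score, shared_keywords, zone_score, low_zone=0.3):
    """Halve the score when both weak-match conditions hold.

    `shared_keywords` is the set of hazard-type keywords both sides share.
    `low_zone` is an assumed cut-off; the log does not state the real value.
    """
    if not shared_keywords and zone_score < low_zone:
        return score * 0.5
    return score
```

Requiring both conditions means that a match with a strong zone overlap is not penalized just because the wording differs.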
Benjamin Admin	826ce2a1b8	fix(cross-doc): suppress false positives when regex checks already pass Cross-search "not in text" findings are only shown when regex L1 completeness < 50%. This prevents false positives where the text IS the right doc_type but doesn't contain the specific cross-search keywords (e.g. Impressum passes 9/13 checks but lacks "§5 TMG"). Also: cross-search now checks entries with wrong text, not just empty. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 00:54:33 +02:00
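The suppression rule from commit 826ce2a1b8 can be sketched in Python as a simple gate. The 50% limit and the 9/13 Impressum example are from the commit message; the function name and its arguments are assumptions.

```python
def visible_cross_findings(findings, checks_passed, checks_total):
    """Show cross-search "not in text" findings only for weak regex results.

    Findings are kept when the regex L1 completeness is below 50%; otherwise
    the text is treated as the right doc_type and the findings are dropped.
    """
    completeness = checks_passed / checks_total if checks_total else 0.0
    return list(findings) if completeness < 0.5 else []
```

For the Impressum case in the commit message (9 of 13 checks passed, about 69%), a missing "§5 TMG" keyword would no longer be reported.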


1115 Commits