Commit Graph

2 Commits

Author SHA1 Message Date
Benjamin Admin c13aa9183a feat(ai-sdk): vocab->tag proposer (P2 slice 5, type 3)
Extends Method C: for each unknown narrative token that pattern text names, suggest
the keyword_dictionary tag = the RequiredComponentTags shared by the naming
patterns (ranked by frequency, kept only when shared by >=40% of them, top 3).
Surfaces real dictionary gaps like "zwischenkreis" -> stored_energy and
"updates" -> has_software, which close coverage without hand-editing the dict.

Two precision fixes to Method C while here:
- patternsMentioning now matches WHOLE WORDS, not substrings — substring matching
  flagged fragments like "stehen" inside "entstehen" and produced nonsensical
  tag suggestions.
- a token is only proposed with a tag if one is shared by >=40% of its naming
  patterns, so diffuse common verbs (spread across categories) drop out.

Wired into iace-audit propose -> audit-reports/vocab.{md,json}. Residual
common-verb noise is left to the human/LLM filter rather than a hand-grown
stopword list. Type 4 (coverage blind spots) + P3 (pin accepted proposals into a
GT case) remain for slice 6.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-26 10:27:01 +02:00
Benjamin Admin f534b52817 feat(iace): pattern audit suite + library hygiene wave
Add cmd/iace-audit CLI with 5 deterministic methods that find engine
gaps without ground truth:

- A reachability: 1058 patterns vs achievable tag universe
- B consistency: components vs their declared hazard categories
- C vocabulary: limits-form tokens vs keyword dictionary
- D echo: limits-form sentences vs generated hazards (jaccard)
- E hierarchy: hazards vs ISO 12100 design/protection/info levels

Library fixes triggered by A+B+C findings:

- tag_resolver: synonym map for electrical/pneumatic/hydraulic aliases
- component_library: crush_point + EN03 (gravitational) on C014/C128
  (Hubwerk family) - fixes HP1014/1015/1017/1018 which were silently
  weakly_reachable. noise_source added on 7 components (C006/C011/
  C017/C020/C031/C041/C096). electrical_part on 8 drive components
  (C031/C032/C033/C034/C035/C036/C037/C038/C077/C092). cyber tag
  on 10 sensors (C081-C090) + 3 IT components (C111/C112/C116) +
  KI module C119 (ai_model added). pneumatic_part+hydraulic_part
  on valves C091/C093, hydraulic_part+chemical_risk on pump C097,
  moving_part on motion controller C075
- keyword_dictionary: EN03 added to aufzug/lift/hubwerk/hubgeraet
  (was wrongly EN04-only). New keyword entries for hub-action verbs:
  absenken/senken/anheben/heben + hubhoehe/hubweg/hubgeschwindig

Audit impact:
- A: weakly_reachable 409 -> 358 (-51 patterns now fully reachable)
- B: incomplete components 46 -> 30 (-16, -33%)
- HP1018 (Person unter absenkendem Maschinenteil eingeklemmt):
  weakly_reachable -> reachable

Why: methods A/B/C surfaced that the Kistenhubgeraet test project
generated 0 crush-under-load hazards despite OSHA 1910.212(a)(3) +
EN ISO 12100 6.3.5.5 explicitly requiring them. Three orthogonal
bugs (missing crush_point tag, wrong energy source mapping, missing
action verbs in dictionary) silently disabled the entire lift crush
pattern family.
2026-05-21 10:51:08 +02:00