breakpilot-compliance/ai-compliance-sdk/internal/iace/gt_coverage_test.go
Benjamin Admin 4d1e0a7f8e feat(iace): GT-Bremse coverage — 59 expert measures + 7 hazard patterns
Systematic gap analysis of the Bremse ground-truth file (60 entries,
100 unique expert measures) revealed only ~5% library coverage. This
commit closes the documented gaps with concrete, norm-anchored
mitigations.

Library additions (M481-M539, 59 entries):
- M481-M482  Low-voltage insulation resistance (>= 2.0 / 2x1.0 /
             1.0 MOhm + IP2X/IPXXB per EN 60204-1 clauses 6.2/8.2.3),
             the primary trigger of this work
- M483-M485  Pneumatic safety (component pressure rating, hose
             retention, depressurization per EN ISO 4414)
- M486-M490  Robot-cell access (tool-secured fence, dual-channel
             door monitor, intentional restart, anti-trap inside
             opening, HMI sight line per ISO 10218-2)
- M491-M493  Teach mode (key/password mode selector, safe reduced
             speed <= 250 mm/s, hold-to-run with 3-stage enabler
             per ISO 10218-1)
- M494-M500  Geometry constants (Safe Limited Position, reach-over
             250 mm @ 2250 mm fence, conveyor opening >= 850 mm,
             25 mm finger gap, band speed <= 100 mm/s per
             EN ISO 13857 / EN 619)
- M501-M507  Enclosure load rating, gripper fail-safe, centring
             gripper stop on door, MWF nozzle integration, floor
             load capacity per DIN 1055-3
- M508-M517  Electrical cabling + PE protection (environment-rated,
             drag chain, strain relief, 10 mm² Cu PE, dual PE,
             monitoring, continuity check, class-II equipment,
             SELV/PELV per EN 60204-1)
- M518-M522  RCD, cable cross-section, overcurrent in each active
             conductor, IP22 water ingress, lockable main switch
- M523-M539  Teach-locked door, machine-tool door interlock,
             dual-channel door switch, machining doors closed for
             aerosol retention, release after emergency stop, lifting
             aid above 25 kg (DGUV 208-016), 95-120 cm control height,
             ergonomic conveyor height, SDS/PPE reference, operating
             manual instructions for depressurization/clamp release/
             max weight/pinch warning/slip warning/dead-state cleaning

New hazard patterns (HP1710-HP1717):
floor overload, gripper failure throw, compressed-air injury in
machining cell, manual handling load + awkward posture, MWF skin
contact, live-cabinet cleaning short, pneumatic stored-energy.

Existing patterns rewired to the new measures: HP1600, HP1602-1606,
HP1610-1612, HP1620-1622, HP1630/1631/1633, HP1640/1641, HP1660/1661,
HP1675, HP1685, HP1688, HP1689, HP1698-1704.

Tooling:
- scripts/gt_measure_gap_analysis.py: 4-signal fuzzy matcher
  (Jaccard, token recall, substring containment, norm-reference
  overlap). Outputs markdown + JSON.
- gt_coverage_test.go: 23 expert-validated (GT-Nr, pattern, measure)
  triples + a norm-reference presence test for every new expert
  measure (no generic 'do X safely' entries allowed).
- .gitea/workflows/ci.yaml: new iace-gt-coverage job enforces
  MIN_COVERAGE_PCT (70%) on Strong+Weak GT coverage; never lower
  without explicit decision.

Coverage shift: 5% Strong -> 30% Strong, 0% -> 72% Strong+Weak.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 13:08:52 +02:00


package iace

import (
	"strings"
	"testing"
)

// TestGTBremse_PinnedHazardToMeasureMappings is a regression net for the IACE
// benchmark fix. Each pinned (GT-Nr, hazard pattern, measure) triple was
// validated in an expert review in 2026-05 against testdata/ground_truth_bremse.json.
// If a pattern stops referencing one of its listed measures, this test fails,
// because the underlying GT scenario would no longer be answered with the
// expert-grade mitigation.
//
// Adding an entry here pins the engine's answer for a specific GT scenario.
// Removing an entry means the GT scenario is no longer covered by the same
// concrete measure (e.g. because the library was reorganized). That requires
// an active decision, not silent drift.
func TestGTBremse_PinnedHazardToMeasureMappings(t *testing.T) {
	cases := []struct {
		gtNr             string
		patternID        string
		requiredMeasures []string
	}{
		// GT 2.1/2.2: Electric shock from direct contact
		// Expert demand: concrete insulation resistance (MOhm) + IP2X enclosure
		{"2.1/2.2", "HP1640", []string{"M481", "M482"}},
		// GT 2.4: Protective-conductor fault (leakage currents > 10 mA)
		// Expert demand: mechanical protection + 10 mm² Cu + monitoring + continuous connection
		{"2.4", "HP1641", []string{"M511", "M512", "M514", "M515"}},
		// GT 2.5: Indirect contact: continuous protective conductor + class II / extra-low voltage
		{"2.5", "HP1685", []string{"M511", "M512", "M515", "M516"}},
		// GT 2.7: RCD on socket-outlet circuits
		{"2.7", "HP1689", []string{"M518"}},
		// GT 2.12: Equipotential bonding between plant sections
		{"2.12", "HP1688", []string{"M475", "M477"}},
		// GT 1.3: Pneumatic components + hose retention
		{"1.3", "HP1630", []string{"M483", "M484", "M485"}},
		// GT 1.5: Residual pneumatic energy after shutdown
		{"1.5", "HP1717", []string{"M485", "M534"}},
		// GT 1.7: Teach mode with key switch + 250 mm/s + enabling device
		{"1.7", "HP1605", []string{"M491", "M492", "M493"}},
		// GT 1.8: Safely limited range of motion + fence load rating
		{"1.8", "HP1604", []string{"M494", "M501"}},
		// GT 1.10/1.18: Reach-over safety distance
		{"1.10/1.18", "HP1602", []string{"M495", "M486"}},
		// GT 1.11: Conveyor geometry (distance + opening size)
		{"1.11", "HP1621", []string{"M496", "M497", "M498"}},
		// GT 1.22: Gripper failure + ejected workpiece
		{"1.22", "HP1711", []string{"M501", "M502", "M536"}},
		// GT 1.24: Trapped inside the cell: opening from inside + intentional restart
		{"1.24", "HP1603", []string{"M489", "M488"}},
		// GT 1.26: Conveyor speed < 100 mm/s
		{"1.26", "HP1620", []string{"M498", "M499"}},
		// GT 1.27: Mechanical stop at the conveyor end
		{"1.27", "HP1622", []string{"M500"}},
		// GT 1.30: Compressed-air cleaning nozzle
		{"1.30", "HP1712", []string{"M504", "M505"}},
		// GT 1.32: Machine-tool loading door + dual-channel door switch
		// No new measure is pinned: HP1634 already referenced M061 before this
		// work, so this case only verifies that the pattern still exists.
		{"1.32", "HP1634", []string{}},
		// GT 1.34/2.10: MWF pressure hose
		{"1.34/2.10", "HP1675", []string{"M484", "M483"}},
		// GT 1.38/1.39: MWF outlet at the bottom + limited pressure
		{"1.38/1.39", "HP1703", []string{"M505", "M506", "M526"}},
		// GT 2.9: Water/cleaning at the control cabinet
		{"2.9", "HP1716", []string{"M521", "M522", "M539"}},
		// GT 7.1: MWF skin contact
		{"7.1", "HP1715", []string{"M408", "M533"}},
		// GT 8.1: Manual workpiece handling + lifting aid above 25 kg
		{"8.1", "HP1713", []string{"M530", "M532"}},
		// GT 8.2: Ergonomic position of control elements
		{"8.2", "HP1714", []string{"M531"}},
	}
	// Index the library and the patterns by ID for O(1) lookups below.
	patterns := collectAllPatterns()
	measureByID := make(map[string]ProtectiveMeasureEntry)
	for _, m := range GetProtectiveMeasureLibrary() {
		measureByID[m.ID] = m
	}
	patternByID := make(map[string]HazardPattern)
	for _, p := range patterns {
		patternByID[p.ID] = p
	}
	for _, c := range cases {
		t.Run(c.gtNr+"_"+c.patternID, func(t *testing.T) {
			p, ok := patternByID[c.patternID]
			if !ok {
				t.Fatalf("pattern %s missing: GT %s no longer covered", c.patternID, c.gtNr)
			}
			suggested := make(map[string]bool)
			for _, m := range p.SuggestedMeasureIDs {
				suggested[m] = true
			}
			for _, req := range c.requiredMeasures {
				// A required measure must exist in the library ...
				if _, exists := measureByID[req]; !exists {
					t.Errorf("required measure %s referenced by GT %s does not exist in library", req, c.gtNr)
					continue
				}
				// ... and must still be suggested by the pinned pattern.
				if !suggested[req] {
					t.Errorf("pattern %s no longer suggests %s: GT %s expert mitigation lost (current: %v)",
						c.patternID, req, c.gtNr, p.SuggestedMeasureIDs)
				}
			}
		})
	}
}

// TestGTBremse_ExpertMeasuresAllResolvable pins the static-text expectation
// that every expert measure added during the 2026-05 GT coverage work
// (M481-M539) exists in the library and carries at least one norm reference
// with a recognized prefix (EN, IEC, ISO, DIN, DGUV, ...). A measure without a
// concrete norm reference is a regression: generic "Safe X" entries were
// exactly the problem this work was meant to fix.
func TestGTBremse_ExpertMeasuresAllResolvable(t *testing.T) {
	expertIDs := []string{
		"M481", "M482", "M483", "M484", "M485", "M486", "M487", "M488", "M489", "M490",
		"M491", "M492", "M493", "M494", "M495", "M496", "M497", "M498", "M499", "M500",
		"M501", "M502", "M503", "M504", "M505", "M506", "M507", "M508", "M509", "M510",
		"M511", "M512", "M513", "M514", "M515", "M516", "M517", "M518", "M519", "M520",
		"M521", "M522", "M523", "M524", "M525", "M526", "M527", "M528", "M529", "M530",
		"M531", "M532", "M533", "M534", "M535", "M536", "M537", "M538", "M539",
	}
	measureByID := make(map[string]ProtectiveMeasureEntry)
	for _, m := range GetProtectiveMeasureLibrary() {
		measureByID[m.ID] = m
	}
	// "EN ISO" and "DIN EN" are already matched by "EN " and "DIN "; they are
	// listed explicitly for readability.
	knownPrefixes := []string{"EN ", "IEC ", "ISO ", "DIN ", "TRBS", "TRGS", "ASR ", "DGUV", "OSHA", "VDE", "EN ISO", "DIN EN"}
	for _, id := range expertIDs {
		m, ok := measureByID[id]
		if !ok {
			t.Errorf("expert measure %s missing from library", id)
			continue
		}
		if len(m.NormReferences) == 0 {
			t.Errorf("measure %s (%q) has no NormReferences: concrete norm anchor missing", id, m.Name)
			continue
		}
		// One reference with a recognized prefix is enough.
		found := false
		for _, nr := range m.NormReferences {
			for _, p := range knownPrefixes {
				if strings.HasPrefix(nr, p) {
					found = true
					break
				}
			}
			if found {
				break
			}
		}
		if !found {
			t.Errorf("measure %s (%q) NormReferences %v contain no recognized norm prefix",
				id, m.Name, m.NormReferences)
		}
	}
}