feat(iace): benchmark risk comparison (traffic lights) + misuse pattern + 1:n matcher

#1 Risk-number comparison in the benchmark: ComputeRiskComparison derives the tool's S/F/W/P values and Fine-Kinney score for each matched hazard and compares them to the ground-truth (GT) values. The result is exposed on the benchmark response and rendered in a new RiskComparison table with GREEN/YELLOW/RED traffic lights on the risk number R (as in the Excel), plus per-axis within-1 agreement cards.

#2 Generic misuse pattern HP2103 "Personenbefoerderung auf Hebezeug" (transporting persons on lifting equipment): gated to lift-family machine types, so it fires for ANY lifting device rather than for one specific machine.

#3 Benchmark matcher is now 1:n: one broad engine hazard may cover several fine-grained GT sub-scenarios (foot/hand/leg crush), so coverage reflects real risk coverage rather than 1:1 wording matches.

Validated on BOTH ground truths (robot cell + lift): leakage 0, ghosts 0, coverage held.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-09 17:24:52 +02:00
parent ef746ea8f0
commit 2677bca9ca
8 changed files with 284 additions and 1 deletions
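
A minimal sketch of the two metrics the commit message describes, per-axis within-1 agreement and 1:n coverage. It is illustrative only: the type names, field names, and both helper functions (`within1Agreement`, `coverage`) are assumptions and do not reflect the actual ComputeRiskComparison code or the benchmark response schema.

```typescript
// Hypothetical shapes; none of these names come from the commit itself.
type Axis = 'S' | 'F' | 'W' | 'P'
type AxisScores = Record<Axis, number>

interface RiskPair {
  gtId: string      // ground-truth hazard id
  tool: AxisScores  // values derived by the tool
  gt: AxisScores    // values assigned by the professional
}

// Share of pairs (0..1) where tool and GT differ by at most 1 on each axis.
function within1Agreement(pairs: RiskPair[]): Record<Axis, number> {
  const axes: Axis[] = ['S', 'F', 'W', 'P']
  const out: Record<Axis, number> = { S: 0, F: 0, W: 0, P: 0 }
  if (pairs.length === 0) return out
  for (const axis of axes) {
    const hits = pairs.filter(p => Math.abs(p.tool[axis] - p.gt[axis]) <= 1).length
    out[axis] = hits / pairs.length
  }
  return out
}

// 1:n coverage: each engine hazard id maps to the GT ids it covers,
// so one broad hazard can account for several GT sub-scenarios.
function coverage(gtIds: string[], matches: Map<string, string[]>): number {
  if (gtIds.length === 0) return 0
  const covered = new Set<string>()
  matches.forEach(ids => ids.forEach(id => covered.add(id)))
  return gtIds.filter(id => covered.has(id)).length / gtIds.length
}
```

With a 1:1 matcher, a single engine hazard matched to `foot` would leave `hand` and `leg` counted as uncovered; in the 1:n form above all three count as covered by that one hazard.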
@@ -6,6 +6,7 @@ import { useBenchmark } from './_hooks/useBenchmark'
 import { GTImportForm } from './_components/GTImportForm'
 import { HazardComparisonTable } from './_components/HazardComparisonTable'
 import { CategoryBreakdown } from './_components/CategoryBreakdown'
+import { RiskComparison } from './_components/RiskComparison'

 export default function BenchmarkPage() {
  const { projectId } = useParams<{ projectId: string }>()
@@ -102,6 +103,9 @@ export default function BenchmarkPage() {
          {/* Category Breakdown */}
          <CategoryBreakdown breakdown={result.category_breakdown || []} />

+          {/* Risk-number comparison (tool vs professional) with traffic lights */}
+          <RiskComparison pairs={result.risk_comparison} agreement={result.risk_agreement} />
+
          {/* Hazard Comparison Table */}
          <HazardComparisonTable
            matched={result.matched_pairs || []}