5e18df63b1
Executes the accident-statistics pipeline for the risk anchors:

- Refresh contactModeEvidence with real Eurostat ESAW figures (dataset hsw_ph3_08, reference year 2023): impact 24.0%/21.4%, struck-by 13.0%/23.8%, sharp 14.5%, trapped/crushed 13.8% (fatal), + new physical/mental-stress mode 24.7% → ergonomic. GT-calibrated tier VALUES unchanged; the real data confirms the ordering.
- Add the versioned source document (datasources/esaw_accident_stats_2023.md, ESAW CC BY 4.0 + OSHA public-domain context) that is ingested into the core RAG collection bp_iace_accident_stats for searchable evidence.
- Whitelist bp_iace_accident_stats in the RAG search handler so seeding can full-text search the statistics with citation at seed time.

Two-layer design: the small license-tagged code table stays the deterministic tier/citation lookup; the RAG holds the searchable source evidence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
62 lines
3.0 KiB
Markdown
# Risk-estimation data sources & licenses

Provenance for the probability (W) / avoidance (P) tiers in `risk_estimation.go`
(`contactModeTable`). We do **not** vendor any raw dataset — only the small
aggregate facts used as anchors plus our own calibrated tiers live in code.

## What we use and how

The tiers are derived in two steps:

1. **Anchor** — the *relative ordering* of injury contact modes from public,
   permissively-licensed occupational-accident statistics (which mechanisms are
   more vs. less frequent).
2. **Calibrate** — adjust the tier *values* to our own ground-truth (GT) corpus
   (the professional's W/P per mode). Well-sampled modes are set to the GT mean;
   sparse modes use conservative defaults, so that no tier is overfitted to a
   sample as small as two GT entries.

The numbers in code are therefore **ours**, not a copy of any dataset, and they
do **not** reproduce any standard's risk-graph table, decision tree or matrix.

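
The calibration step can be sketched in Go as follows. This is a minimal illustration, not the code in `risk_estimation.go`: the function name `calibrateTier`, the `minSamples` threshold of 3, the use of `float64` for tier values, and the sample values are all assumptions made for the example.

```go
package main

import "fmt"

// minSamples is an ASSUMED cut-off for "well-sampled"; this document only
// says that a two-sample mode is too sparse to calibrate against.
const minSamples = 3

// calibrateTier returns the ground-truth (GT) mean when a contact mode has
// at least minSamples GT values, and the conservative default otherwise.
func calibrateTier(gtValues []float64, conservativeDefault float64) float64 {
	if len(gtValues) < minSamples {
		return conservativeDefault
	}
	sum := 0.0
	for _, v := range gtValues {
		sum += v
	}
	return sum / float64(len(gtValues))
}

func main() {
	// Hypothetical GT samples, for illustration only.
	wellSampled := []float64{2, 2, 3, 3}
	sparse := []float64{1, 1}

	fmt.Println(calibrateTier(wellSampled, 3)) // mean of the four GT values
	fmt.Println(calibrateTier(sparse, 3))      // falls back to the default
}
```

The fallback keeps a mode with only one or two GT entries from dragging its tier away from the conservative value.
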
## Primary source — Eurostat ESAW

- **Dataset:** European Statistics on Accidents at Work (ESAW), contact mode of injury.
- **License:** **CC BY 4.0** — commercial and non-commercial reuse permitted,
  source acknowledgement required.
- **Attribution string:** `Source: Eurostat (ESAW), CC BY 4.0` — surface this in
  any generated risk-assessment export that shows engine risk numbers.
- **URL:** https://ec.europa.eu/eurostat/statistics-explained/index.php/Accidents_at_work_-_statistics_on_causes_and_circumstances
- **Aggregate facts used (anchor only):** contact-mode shares of accidents at
  work. **Dataset `hsw_ph3_08`, reference year 2023** (Figure 7, "contact —
  mode of injury"), EU shares:
  - Physical/mental stress: 24.7% (non-fatal)
  - Impact with stationary object (victim in motion): 24.0% (non-fatal) / 21.4% (fatal)
  - Contact with sharp/pointed/rough agent: 14.5% (non-fatal)
  - Struck by object in motion / collision: 13.0% (non-fatal) / 23.8% (fatal)
  - Trapped / crushed: 13.8% (fatal)

Retrieved 2026-06. The source document is also ingested into the core RAG
collection `bp_iace_accident_stats` for searchable evidence at seeding time.

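
To illustrate that the anchor facts contribute only an ordering, the following Go sketch ranks the non-fatal shares listed above and prints the attribution string next to the result. The type `modeShare`, the variable `esawNonFatal2023`, the constant `esawAttribution`, and the function `rankByShare` are hypothetical names chosen for the example; the actual layout of `contactModeTable` is not shown in this document.

```go
package main

import (
	"fmt"
	"sort"
)

// esawAttribution is the attribution string required by CC BY 4.0,
// as given in this document.
const esawAttribution = "Source: Eurostat (ESAW), CC BY 4.0"

// modeShare is a hypothetical record: one contact mode and its EU share
// of non-fatal accidents at work, in percent.
type modeShare struct {
	Mode        string
	NonFatalPct float64
}

// esawNonFatal2023 holds the non-fatal shares quoted in this document
// (dataset hsw_ph3_08, reference year 2023).
var esawNonFatal2023 = []modeShare{
	{"physical/mental stress", 24.7},
	{"impact with stationary object", 24.0},
	{"contact with sharp/pointed/rough agent", 14.5},
	{"struck by object in motion / collision", 13.0},
}

// rankByShare returns the mode names ordered from most to least frequent.
// It sorts a copy, so the input slice is left untouched.
func rankByShare(shares []modeShare) []string {
	sorted := make([]modeShare, len(shares))
	copy(sorted, shares)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].NonFatalPct > sorted[j].NonFatalPct
	})
	names := make([]string, len(sorted))
	for i, s := range sorted {
		names[i] = s.Mode
	}
	return names
}

func main() {
	for i, name := range rankByShare(esawNonFatal2023) {
		fmt.Printf("%d. %s\n", i+1, name)
	}
	fmt.Println(esawAttribution)
}
```

Only the rank order leaves this function; the percentages themselves never become tier values.
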
## Acceptable supplements

- **US BLS / OSHA** (Bureau of Labor Statistics, occupational injuries) — **U.S.
  Government work, public domain**; free for any use.
- **UK HSE** (RIDDOR / kinds-of-accident) — **Open Government Licence v3**;
  commercial reuse with attribution.

## Explicitly excluded

- **DGUV statistics** — terms grant only editorial use and forbid modification
  / re-licensing; **unsuitable for a commercial product**. Not used.
- **DIN / Beuth / ISO / IEC standards** (e.g. risk-graph tables, parameter
  decision trees, SIL/PL matrices) — copyrighted; **not reproduced or
  re-implemented**. Our model uses only the universal, non-protectable risk
  *dimensions* (severity, frequency, probability, avoidance).

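
The include/exclude decisions above could be encoded as a small deny-by-default lookup, sketched here in Go. The names `sourcePolicy`, `sourcePolicies`, and `mayUse` are hypothetical and do not refer to existing code; the license labels are taken from this document, and the map keys are shorthand chosen for the example.

```go
package main

import "fmt"

// sourcePolicy is a hypothetical record of the licensing decision for one
// statistics source.
type sourcePolicy struct {
	License             string
	Allowed             bool
	AttributionRequired bool
}

// sourcePolicies mirrors the decisions recorded in this document.
var sourcePolicies = map[string]sourcePolicy{
	"Eurostat ESAW": {License: "CC BY 4.0", Allowed: true, AttributionRequired: true},
	"US BLS / OSHA": {License: "U.S. Government work, public domain", Allowed: true},
	"UK HSE":        {License: "Open Government Licence v3", Allowed: true, AttributionRequired: true},
	"DGUV":          {License: "editorial use only", Allowed: false},
}

// mayUse reports whether a source may feed the anchors. Unknown sources are
// rejected, so a new dataset must be reviewed and listed before use.
func mayUse(name string) bool {
	p, ok := sourcePolicies[name]
	return ok && p.Allowed
}

func main() {
	for _, name := range []string{"Eurostat ESAW", "DGUV", "some new dataset"} {
		fmt.Printf("%s: allowed=%v\n", name, mayUse(name))
	}
}
```

Rejecting unlisted sources by default means a missing entry fails closed rather than silently admitting an unreviewed license.
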
## Maintenance

When a tier in `contactModeTable` changes, record the source figure and the GT
calibration basis here. Add this file to the repository SBOM / license register
alongside software dependencies.