# Risk-estimation data sources & licenses

Provenance for the probability (W) / avoidance (P) tiers in `risk_estimation.go` (`contactModeTable`). We do **not** vendor any raw dataset: only the small aggregate facts used as anchors, plus our own calibrated tiers, live in code.

## What we use and how

The tiers are derived in two steps:

1. **Anchor:** the *relative ordering* of injury contact modes from public, permissively-licensed occupational-accident statistics (which mechanisms are more vs. less frequent).
2. **Calibrate:** adjust the tier *values* to our own ground-truth corpus (the professional's W/P per mode). Well-sampled modes are set to the GT mean; sparse modes use conservative defaults (no overfitting to a 2-GT sample).

The numbers in code are therefore **ours**, not a copy of any dataset, and they do **not** reproduce any standard's risk-graph table, decision tree or matrix.

## Primary source: Eurostat ESAW

- **Dataset:** European Statistics on Accidents at Work (ESAW), contact mode of injury.
- **License:** **CC BY 4.0**. Commercial and non-commercial reuse permitted, source acknowledgement required.
- **Attribution string:** `Source: Eurostat (ESAW), CC BY 4.0`. Surface this in any generated risk-assessment export that shows engine risk numbers.
- **URL:** https://ec.europa.eu/eurostat/statistics-explained/index.php/Accidents_at_work_-_statistics_on_causes_and_circumstances
- **Aggregate facts used (anchor only):** contact-mode shares of accidents at work. **Dataset `hsw_ph3_08`, reference year 2023** (Figure 7, "contact — mode of injury"), EU shares:
  - Physical/mental stress: 24.7% (non-fatal)
  - Impact with stationary object (victim in motion): 24.0% (non-fatal) / 21.4% (fatal)
  - Contact with sharp/pointed/rough agent: 14.5% (non-fatal)
  - Struck by object in motion / collision: 13.0% (non-fatal) / 23.8% (fatal)
  - Trapped / crushed: 13.8% (fatal)

Retrieved 2026-06.
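As an illustration of how the anchor facts and the attribution string might sit in code, here is a minimal Go sketch. It is not the contents of `risk_estimation.go`: the names `anchorShare`, `nonFatalAnchors`, `rankByShare` and `withAttribution` are hypothetical, and only the non-fatal shares and the attribution string are taken from the list above.

```go
package main

import (
	"fmt"
	"sort"
)

// eurostatAttribution is the attribution string quoted in this document.
const eurostatAttribution = "Source: Eurostat (ESAW), CC BY 4.0"

// anchorShare is a hypothetical record for one aggregate anchor fact.
type anchorShare struct {
	Mode        string
	NonFatalPct float64 // EU share of non-fatal accidents, in percent
}

// nonFatalAnchors holds the non-fatal shares listed in this document
// (dataset hsw_ph3_08, reference year 2023). No raw data is vendored.
var nonFatalAnchors = []anchorShare{
	{"Physical/mental stress", 24.7},
	{"Impact with stationary object (victim in motion)", 24.0},
	{"Contact with sharp/pointed/rough agent", 14.5},
	{"Struck by object in motion / collision", 13.0},
}

// rankByShare returns the mode names ordered from most to least frequent.
// Only this relative ordering is meant to be used as an anchor.
func rankByShare(in []anchorShare) []string {
	sorted := append([]anchorShare(nil), in...) // copy, leave input untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].NonFatalPct > sorted[j].NonFatalPct
	})
	out := make([]string, len(sorted))
	for i, a := range sorted {
		out[i] = a.Mode
	}
	return out
}

// withAttribution appends the attribution string as a footer to an export body.
func withAttribution(body string) string {
	return body + "\n\n" + eurostatAttribution
}

func main() {
	for i, mode := range rankByShare(nonFatalAnchors) {
		fmt.Printf("%d. %s\n", i+1, mode)
	}
	fmt.Println(withAttribution("Risk assessment export (engine risk numbers)"))
}
```

Keeping the attribution in a single constant is one way to make sure every export path emits exactly the same string.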
The source document is also ingested into the core RAG collection `bp_iace_accident_stats` for searchable evidence at seeding time.

## Acceptable supplements

- **US BLS / OSHA** (Bureau of Labor Statistics, occupational injuries): **U.S. Government work, public domain**; free for any use.
- **UK HSE** (RIDDOR / kinds-of-accident): **Open Government Licence v3**; commercial reuse with attribution.

## Explicitly excluded

- **DGUV statistics:** terms grant only editorial use and forbid modification / re-licensing; **unsuitable for a commercial product**. Not used.
- **DIN / Beuth / ISO / IEC standards** (e.g. risk-graph tables, parameter decision trees, SIL/PL matrices): copyrighted; **not reproduced or re-implemented**. Our model uses only the universal, non-protectable risk *dimensions* (severity, frequency, probability, avoidance).

## Maintenance

When a tier in `contactModeTable` changes, record the source figure and the GT calibration basis here. Add this file to the repository SBOM / license register alongside software dependencies.
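
The calibration rule described under "What we use and how" (GT mean for well-sampled modes, a conservative default for sparse ones) can be sketched roughly as below. This is an illustration under stated assumptions, not the code in `risk_estimation.go`: the function name `calibrate`, the cut-off `minSamples = 3`, and the example tier values are all hypothetical. The document only establishes that a 2-GT sample counts as too sparse.

```go
package main

import "fmt"

// minSamples is an assumed cut-off for "well-sampled". The document only
// states that a 2-GT sample is too sparse to fit to.
const minSamples = 3

// calibrate returns the mean of the ground-truth values when the mode is
// well-sampled, and the conservative fallback otherwise.
func calibrate(gt []float64, fallback float64) float64 {
	if len(gt) < minSamples {
		return fallback // sparse mode: do not overfit to a handful of samples
	}
	sum := 0.0
	for _, v := range gt {
		sum += v
	}
	return sum / float64(len(gt))
}

func main() {
	// Hypothetical tier values, for illustration only.
	wellSampled := []float64{2, 3, 3, 4}
	sparse := []float64{1, 2}

	fmt.Println("well-sampled:", calibrate(wellSampled, 4))
	fmt.Println("sparse:", calibrate(sparse, 4))
}
```

Recording the sample count and the fallback used for each mode alongside the tier value makes the "GT calibration basis" easy to reproduce when this file is updated.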