breakpilot-compliance/backend-compliance/knowledge/programs/README.md

# Domain Knowledge Program — the production line for every domain

**The architecture is stable. From here the value comes from DOMAIN MODELLING, not more software.**
The real bottleneck is no longer architecture, controls, or even "knowledge"; it is **domain
modelling**. Phase B is therefore organised as ONE program with one sub-program per domain, each run
through the SAME production line. No new runtime framework is introduced (ADR-008/009, Freeze v1.0).

## The customer enters by INDUSTRY, not by regulation

A customer never says "explain ISO 9001". They say "I build packaging machines", "I'm an automotive
supplier", or "I build parking systems". So the pipeline starts at the industry:

```
Industry → Domain Model → Requirement Sources → Requirements → Capabilities → … → Reality / Verification
```

## The 7-stage checklist (identical for EVERY domain)

| # | Stage | Owner |
|---|---|---|
| 1 | **Domain Model** (industry → what world is this?) | Reasoning / curation |
| 2 | **Requirement Sources** (which regulations/standards/specs apply) | Legal Knowledge |
| 3 | **Capability Registry** (capabilities the sources require) | Compliance Execution |
| 4 | **Transition Patterns** (source-state → domain delta) | Reasoning |
| 5 | **Playbooks** (how to implement each capability) | Reasoning |
| 6 | **Reference Scenarios** (canonical regression + expected outcomes) | Reasoning |
| 7 | **Completeness** (auditable coverage per domain) | Reasoning / curation |

This is the scaling mechanism: every new domain reuses the same production line; the existing engines
(Scope, Gap, Capability Delta, Optimization, Playbooks, Reference, Completeness) extend automatically.
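
As a minimal sketch of how the checklist could be tracked per domain, the seven stages and their owners from the table can be held in one ordered structure. The names `STAGES` and `missing_stages` are hypothetical and not part of the existing engines.

```python
from __future__ import annotations

# The seven checklist stages and their owners, in table order.
STAGES: tuple[tuple[str, str], ...] = (
    ("Domain Model", "Reasoning / curation"),
    ("Requirement Sources", "Legal Knowledge"),
    ("Capability Registry", "Compliance Execution"),
    ("Transition Patterns", "Reasoning"),
    ("Playbooks", "Reasoning"),
    ("Reference Scenarios", "Reasoning"),
    ("Completeness", "Reasoning / curation"),
)


def missing_stages(completed: set[str]) -> list[str]:
    """Return the stages a domain has not completed yet, in checklist order."""
    known = {name for name, _owner in STAGES}
    unknown = completed - known
    if unknown:
        # Reject typos early instead of silently treating them as progress.
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    return [name for name, _owner in STAGES if name not in completed]


# Illustrative state for one domain (values are made up for the example).
print(missing_stages({"Domain Model", "Requirement Sources"}))
```

Because every domain is checked against the same tuple, adding a domain only adds data and leaves the checklist unchanged.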

## A domain knows its typical sources → pre-onboarding HYPOTHESIS (the ETO insight)

Each domain definition lists `typical_requirement_sources` and `typical_certifications`. So before
onboarding, BreakPilot can say "this process world is *probably* present", as a **hypothesis, not a
truth**. We don't want to know whether an automotive supplier has ISO 9001 (everyone does); we want
to know **which company capabilities are therefore probably already present**. This feeds Company 2A
as `inferred`, never `confirmed`.
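
The hypothesis step could look roughly like the sketch below. Only the field names `typical_requirement_sources` and `typical_certifications` and the two statuses come from this document. The classes, the function name, the certification-to-capability mapping, and the example values are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    INFERRED = "inferred"
    CONFIRMED = "confirmed"


@dataclass(frozen=True)
class DomainDefinition:
    name: str
    typical_requirement_sources: tuple[str, ...]
    typical_certifications: tuple[str, ...]


@dataclass(frozen=True)
class CapabilityHypothesis:
    capability: str
    because_of: str
    # A pre-onboarding hypothesis is never created as CONFIRMED.
    status: Status = Status.INFERRED


# Hypothetical mapping; the real capability registry is not shown in this README.
CAPABILITIES_BY_CERTIFICATION: dict[str, tuple[str, ...]] = {
    "ISO 9001": ("document control", "internal audit"),
}


def pre_onboarding_hypotheses(domain: DomainDefinition) -> list[CapabilityHypothesis]:
    """Derive probably-present capabilities from a domain's typical certifications."""
    hypotheses = []
    for certification in domain.typical_certifications:
        for capability in CAPABILITIES_BY_CERTIFICATION.get(certification, ()):
            hypotheses.append(CapabilityHypothesis(capability, because_of=certification))
    return hypotheses


# Illustrative domain definition (values assumed, not read from automotive.yaml).
automotive = DomainDefinition(
    name="automotive",
    typical_requirement_sources=("IATF", "TISAX"),
    typical_certifications=("ISO 9001",),
)
for hypothesis in pre_onboarding_hypotheses(automotive):
    print(hypothesis.capability, hypothesis.status.value, "via", hypothesis.because_of)
```

Defaulting `status` to `INFERRED` keeps the "hypothesis, not a truth" rule in the data model: promoting a capability to `confirmed` has to be an explicit later step.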

## Per-domain KPI — reproducible, not marketing

Progress per domain is **derived from the Regulatory Completeness Engine + the actual corpus**
(computed-not-stored): identified requirement sources · modelled capabilities · transition patterns ·
playbooks · passed reference scenarios · consciously declared corpus gaps. Rendered as a bar
(`Industrial ███████░░░ 70 %`). These are reproducible quality metrics — no curated numbers.
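
A minimal sketch of the computed-not-stored idea follows. The aggregation formula is not specified here, so the unweighted mean of per-dimension ratios below is an assumption, as are the function names and the `(achieved, expected)` input shape. Declared corpus gaps are left out of the ratio in this sketch.

```python
from __future__ import annotations


def domain_progress(counts: dict[str, tuple[int, int]]) -> float:
    """Mean of achieved/expected across dimensions (assumed formula, not the engine's)."""
    if not counts:
        raise ValueError("at least one dimension is required")
    ratios = []
    for dimension, (achieved, expected) in counts.items():
        if expected <= 0 or not 0 <= achieved <= expected:
            raise ValueError(f"invalid counts for {dimension!r}")
        ratios.append(achieved / expected)
    return sum(ratios) / len(ratios)


def render_bar(label: str, ratio: float, width: int = 10) -> str:
    """Render a progress ratio in [0, 1] as a text bar in the README's format."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    filled = round(ratio * width)
    bar = "█" * filled + "░" * (width - filled)
    return f"{label} {bar} {round(ratio * 100)} %"


# Made-up counts for illustration; real values would be derived from the corpus.
example_counts = {
    "requirement_sources": (7, 10),
    "capabilities": (7, 10),
}
print(render_bar("Industrial", domain_progress(example_counts)))
```

Since the bar is recomputed from counts on every call, two runs over the same corpus give the same number, which is what makes the KPI reproducible.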

## Domain Knowledge Program v1 — backlog (by current customer value)

| Rank | Domain | File | Typical sources |
|---|---|---|---|
| 1 | **Industrial Automation** | `industrial_automation.yaml` | CRA · Machinery Regulation · EMC · RED · Data Act · IEC 62443 · NIS2 |
| 2 | Environmental | `environmental.yaml` | Water · Chemicals · Air · Energy · Waste · Product responsibility |
| 3 | Automotive | `automotive.yaml` | IATF · TISAX · UNECE R155/R156 · ASPICE · OEM requirement specifications |
| 4 | Medical | `medical.yaml` | MDR · IEC 62304 · ISO 14971 |
| 5 | Energy | `energy.yaml` | depends on the target market |

The work shifts decisively from software development to knowledge production; the competitive
advantage now comes from the quality and breadth of the modelled domains.