feat(programs): open Domain Knowledge Program v1 — 7-stage production line + per-domain KPI

The real bottleneck is domain MODELLING. Phase B is organized as one program with sub-programs per domain, each run through the SAME 7-stage production line. No new runtime framework, no new module (ADR-009, Freeze v1.0): only program data + a derived reporting view.

- Customer enters by INDUSTRY, not regulation: Industry -> Domain Model -> Requirement Sources -> Requirements -> Capabilities -> ... -> Completeness.
- 7-stage checklist identical for every domain (Domain Model / Requirement Sources / Capability Registry / Transition Patterns / Playbooks / Reference Scenarios / Completeness) with per-stage ownership. README generalized to the framework.
- Each domain lists typical_requirement_sources + typical_certifications -> pre-onboarding capability HYPOTHESIS (the ETO insight; feeds Company 2A as inferred, never confirmed).
- Backlog v1 (by customer value): 1 Industrial Automation, 2 Environmental, 3 Automotive, 4 Medical, 5 Energy. Five domain-definition shells (environmental restructured to the unified shape, law-first preserved).
- Per-domain KPI is DERIVED from the real corpus (computed-not-stored; sources modelled / transition patterns / playbooks / reference scenarios), NOT a curated number. Reference suite renders maturity bars: Industrial Automation 43% (3/7 sources) leads, Environmental 0% (work ahead). Backlog (value) and KPI (corpus state) are deliberately separated.
- ADR-009: Domain Knowledge Program framework. Honest known refinement: regulation-ID normalization (CRA vs Cyber Resilience Act) aliased in the KPI.

7 program-contract tests (backlog order + industry-first + derived-not-stored), check-loc 0. Knowledge data + ADR + reference harness = non-runtime -> no deploy (ADR-001).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-27 18:49:06 +02:00
parent c737e1ad7d
commit 1a9439d013
10 changed files with 312 additions and 159 deletions
@@ -1,51 +1,58 @@
-# Domain Knowledge Programs — from architecture to domains
+# Domain Knowledge Program — the production line for every domain
-**The architecture is stable. From here the value comes from DOMAINS, not more software.**
+**The architecture is stable. From here the value comes from DOMAIN MODELLING, not more software.**
-The runtime architecture (scope, regulatory map, capability delta, optimization, playbooks, intake,
+The real bottleneck is no longer architecture or controls or even „knowledge" — it is **domain
-production, completeness) is frozen. A new regulatory domain is a **data + knowledge** project that
+modelling**. Phase B is therefore organised as ONE program with sub-programs per domain, each run
-extends every existing view automatically — no new runtime framework (see ADR-008, Freeze v1.0).
+through the SAME production line. No new runtime framework (ADR-008/009, Freeze v1.0).
-## The production line (reusable for EVERY domain)
+## The customer enters by INDUSTRY, not by regulation
 A customer never says „explain ISO 9001". They say „I build packaging machines" / „I'm an automotive
 supplier" / „I build parking systems". So the pipeline starts at the industry:
 ```
-Regulatory Corpus → Obligations → Capabilities → Transition Patterns → Playbooks → Reference Scenarios → Completeness
+Industry → Domain Model → Requirement Sources → Requirements → Capabilities → … → Reality / Verification
 ```
-Each domain delivers the SAME artifacts. Once they land, the existing engines automatically extend:
+## The 7-stage checklist (identical for EVERY domain)
 Scope · Gap · Capability Delta · Optimization · Playbooks · Reference Scenarios · Regulatory Completeness.
-## Law-first (a deliberate ordering)
+| # | Stage | Owner |
 Start from the **law**, not from a management system. A management system (e.g. ISO 14001) is NOT
 the domain; it is one possible source state. The customer asks *"which requirements apply to my
 product?"*, not *"how do I get away from ISO 14001?"*. So the order is:
 ```
 Law → Obligations → Capabilities → (Management system) → Delta
 ```
 A management-system→corpus transition pattern is built LAST, once BOTH sides are known.
 ## Ownership per stage (coordinate via the board, do not duplicate)
 | Stage | Artifact | Owner |
 |---|---|---|
-| B1 Corpus + Obligations | legal sources, obligations registry | **Legal Knowledge / Obligation Registry** |
+| 1 | **Domain Model** (industry → what world is this?) | Reasoning / curation |
-| B2 Capability Model | new capabilities in the Master Capability Registry | **Compliance Execution** |
+| 2 | **Requirement Sources** (which regulations/standards/specs apply) | Legal Knowledge |
-| B3 Transition Patterns | source-state → corpus delta patterns | **Reasoning (Knowledge Acquisition)** |
+| 3 | **Capability Registry** (capabilities the sources require) | Compliance Execution |
-| Playbooks | implementation playbooks per capability | **Reasoning** |
+| 4 | **Transition Patterns** (source-state → domain delta) | Reasoning |
-| Reference Scenarios | canonical regression + expected outcomes | **Reasoning** |
+| 5 | **Playbooks** (how to implement each capability) | Reasoning |
-| Completeness | corpus-status registry per domain | **Reasoning / curation** |
+| 6 | **Reference Scenarios** (canonical regression + expected outcomes) | Reasoning |
 | 7 | **Completeness** (auditable coverage per domain) | Reasoning / curation |
-## Programs (planned)
+This is the scaling mechanism: every new domain reuses the same production line; the existing engines
 (Scope, Gap, Capability Delta, Optimization, Playbooks, Reference, Completeness) extend automatically.
-| Program | File | Status |
+## A domain knows its typical sources → pre-onboarding HYPOTHESIS (the ETO insight)
 |---|---|---|
 | Environmental Knowledge Program | `environmental.yaml` | started (B1 handed off) |
 | Automotive Knowledge Program | _planned_ | — |
 | OT / IEC 62443 Knowledge Program | _planned_ | — |
 | Functional Safety Knowledge Program | _planned_ | — |
-Each program is a machine-readable definition (`*.yaml`) consumed by the reference suite to track its
+Each domain definition lists `typical_requirement_sources` and `typical_certifications`. So before
-progress; future sessions flip stage `status` as artifacts land, and the Completeness engine reports
+onboarding, BreakPilot can say „this process world is *probably* present" — as a **hypothesis, not a
-the domain flipping `unsupported → validated` automatically.
+truth**. We don't want to know whether an automotive supplier has ISO 9001 (everyone does); we want
 to know **which company capabilities are therefore probably already present** (feeds Company 2A as
 `inferred`, never `confirmed`).
 ## Per-domain KPI — reproducible, not marketing
 Progress per domain is **derived from the Regulatory Completeness Engine + the actual corpus**
 (computed-not-stored): identified requirement sources · modelled capabilities · transition patterns ·
 playbooks · passed reference scenarios · consciously declared corpus gaps. Rendered as a bar
 (`Industrial ███████░░░ 70 %`). These are reproducible quality metrics — no curated numbers.
 ## Domain Knowledge Program v1 — backlog (by current customer value)
 | Rank | Domain | File | Typical sources |
 |---|---|---|---|
 | 1 | **Industrial Automation** | `industrial_automation.yaml` | CRA · MaschinenVO · EMV · RED · Data Act · IEC 62443 · NIS2 |
 | 2 | Environmental | `environmental.yaml` | Water · Chemicals · Air · Energy · Waste · Product responsibility |
 | 3 | Automotive | `automotive.yaml` | IATF · TISAX · UNECE R155/R156 · ASPICE · OEM requirement specifications (Lastenhefte) |
 | 4 | Medical | `medical.yaml` | MDR · IEC 62304 · ISO 14971 |
 | 5 | Energy | `energy.yaml` | depends on target market |
 The work shifts decisively from software development to knowledge production; the competitive
 advantage now comes from the quality and breadth of the modelled domains.
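The README's per-domain KPI is described as a bar derived from corpus state rather than a stored number. A minimal sketch of that rendering, assuming maturity is simply modelled sources divided by listed sources and a bar width of 10 (`maturity_bar` is a hypothetical helper name, not part of this commit):

```python
def maturity_bar(modelled: int, total: int, width: int = 10) -> str:
    """Render modelled/total as a block bar plus a rounded percentage."""
    # Assumption: a domain with no listed sources counts as 0 % rather than an error.
    breadth = modelled / total if total else 0.0
    filled = int(round(breadth * width))
    bar = "█" * filled + "░" * (width - filled)
    return "%s %d%%" % (bar, int(round(breadth * 100)))


# Industrial Automation in this commit: 3 of 7 sources modelled.
print(maturity_bar(3, 7))
```

Because the bar is recomputed from the counts on every run, nothing in a program shell has to be edited when a new corpus artifact lands.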
@@ -0,0 +1,19 @@
 # Domain Knowledge Program — Automotive (backlog rank 3). A domain DEFINITION, not corpus content.
 # 7-stage progress is DERIVED from the corpus (computed-not-stored). See programs/README.md.
 id: PROG-automotive
 name: "Automotive Domain"
 industry: "Automobilzulieferer, OEM-Zulieferkette"
 customer_entry: "Ich bin Automobilzulieferer."
 backlog_rank: 3
 rationale: "großer Markt; OEM-Lastenhefte = früher Business-Requirement-Anwendungsfall."
 status: planned
 typical_requirement_sources: [IATF16949, TISAX, "UNECE R155", "UNECE R156", ASPICE, OEM_Lastenheft]
 typical_certifications: [ISO9001, IATF16949, TISAX, ISO27001]
 ownership: "Stufe 1 Reasoning · 2 Legal-KG · 3 Execution · 4-7 Reasoning"
 note: >
  ISMS→TISAX-Transition-Pattern existiert bereits (Vorarbeit). UNECE R155 (Cybersecurity Management
  System) ↔ CRA = quellenübergreifender Convergence-Kandidat. OEM-Lastenheft = erster Business
  Requirement (siehe Vision V2 / Requirements Verification, NICHT jetzt).
@@ -0,0 +1,18 @@
 # Domain Knowledge Program — Energy (backlog rank 5). A domain DEFINITION, not corpus content.
 # 7-stage progress is DERIVED from the corpus (computed-not-stored). See programs/README.md.
 id: PROG-energy
 name: "Energy Domain"
 industry: "Energieerzeugung/-verteilung, Anlagen kritischer Infrastruktur"
 customer_entry: "Ich baue Anlagen für Energieerzeugung / kritische Infrastruktur."
 backlog_rank: 5
 rationale: "Zielmarkt-abhängig; nach den klareren Industrie-/Produkt-Domänen."
 status: planned
 typical_requirement_sources: [NIS2, IEC62443, CRA, "netzcode/marktabhängig"]
 typical_certifications: [ISO27001, IEC62443]
 ownership: "Stufe 1 Reasoning · 2 Legal-KG · 3 Execution · 4-7 Reasoning"
 note: >
  Stark zielmarkt-abhängig (Netzcodes, nationale Vorgaben). NIS2/IEC62443 teilen sich Capabilities
  mit Industrial Automation → Wiederverwendung wahrscheinlich hoch.
@@ -1,57 +1,33 @@
-# Environmental Knowledge Program — a regulatory DOMAIN, not an ISO-14001 project.
+# Domain Knowledge Program — Environmental (backlog rank 2). A domain DEFINITION, not corpus content.
 # Machine-readable program definition consumed by the reference suite to track progress.
 # LAW-FIRST: Umweltrecht -> Obligations -> Capabilities -> ISO 14001 -> Delta (never the reverse).
 # 7-stage progress is DERIVED from the corpus (computed-not-stored). See programs/README.md.
 id: PROG-environmental
-name: "Environmental Knowledge Program"
+name: "Environmental Domain"
-customer_question: "Welche Umweltanforderungen gelten für mein Produkt (z. B. Industriespülmaschine)?"
+industry: "Hersteller mit Umweltpflichten (z. B. Industriespülmaschinen, Anlagenbau)"
-status: started                 # planned | started | in_progress | complete
+customer_entry: "Welche Umweltanforderungen gelten für mein Produkt (z. B. Industriespülmaschine)?"
 backlog_rank: 2
 rationale: "konkreter Kundenbezug (Abwasser/Chemikalien) — direkt nach Industrial Automation."
 status: started
 principle: >
-  ISO 14001 ist KEIN Umweltrecht, sondern ein Managementsystem. Wir starten beim Recht und fragen
+  ISO 14001 ist KEIN Umweltrecht, sondern ein Managementsystem (= ein Quellzustand). LAW-FIRST:
-  erst danach, welche vorhandenen Managementsysteme davon wahrscheinlich schon etwas abdecken.
+  erst das Recht, dann welche vorhandenen Managementsysteme davon wahrscheinlich schon etwas abdecken.
-# the reusable production line, instantiated for this domain
+# Stage 2 — the requirement-source areas of this domain (each becomes laws/obligations at Stage 2-3)
-blueprint: [corpus, obligations, capabilities, transition_patterns, playbooks, reference_scenarios, completeness]
+typical_requirement_sources: [water, chemicals, emissions, energy, waste, product_responsibility]
 typical_certifications: [ISO14001, ISO9001]   # pre-onboarding capability HYPOTHESIS (nicht Wahrheit)
-stages:
+# Reasoning capabilities to be modelled (Stage 3, @Execution) once the corpus lands
-  - id: B1
+target_capabilities:
-    name: "Environmental Regulatory Corpus"
+  - chemical_management
-    owner: "Legal Knowledge / Obligation Registry"
+  - wastewater_management
-    status: open                # handed off — not built here
+  - emissions_monitoring
-    note: "Zunächst NUR Rechtsquellen + Pflichten (noch keine ISO, keine Capabilities)."
+  - hazardous_substance_management
-    areas:                      # the six environmental obligation areas the customer actually faces
+  - energy_data_capture
-      - water                   # Wasser / Abwasser
+  - environmental_incident_management
      - chemicals               # Chemikalien (REACH/CLP)
      - emissions               # Emissionen
      - energy                  # Energie
      - waste                   # Abfall
      - product_responsibility  # Produktverantwortung
-  - id: B2
+ownership: "Stufe 1 Reasoning · 2 Legal-KG (HANDOFF, nur Recht+Pflichten) · 3 Execution (HANDOFF) · 4-7 Reasoning"
-    name: "Environmental Capability Model"
+note: >
-    owner: "Compliance Execution"
+  B3 (ISO 14001 -> Korpus-Transition) entsteht ZULETZT, erst wenn Recht + Capabilities bekannt sind.
-    status: open                # depends on B1; Registry grows here
+  Acceptance: Regulatory Completeness kippt `Environmental` von unsupported/open auf assessed.
    depends_on: [B1]
    capabilities:
      - chemical_management
      - wastewater_management
      - emissions_monitoring
      - hazardous_substance_management
      - energy_data_capture
      - environmental_incident_management
  - id: B3
    name: "Transition Patterns (ISO 14001 -> Environmental Corpus)"
    owner: "Reasoning (Knowledge Acquisition)"
    status: blocked             # LAST — only once both sides (corpus + capabilities) are known
    depends_on: [B1, B2]
    note: "Erst jetzt sinnvoll: ISO 14001 als Quellzustand gegen den Umwelt-Korpus (User: law-first)."
 # once B1-B3 land these extend AUTOMATICALLY via the existing engines (no new runtime architecture)
 downstream_auto: [playbooks, reference_scenarios, optimization, scope, capability_delta, completeness]
 acceptance: >
  Regulatory Completeness kippt `Environmental` von unsupported/open auf assessed; die sechs Bereiche
  sind als Obligations + Capabilities im validierten Korpus, das ISO-14001-Delta als Transition Pattern.
  Messbar an der Reference Suite (Environmental-Zelle UNSUPPORTED -> PASS).
@@ -0,0 +1,22 @@
 # Domain Knowledge Program — Industrial Automation (backlog rank 1, highest current customer value).
 # A domain DEFINITION, not corpus content. The 7-stage progress is DERIVED from the corpus by the
 # reference suite (computed-not-stored), never stored here. See programs/README.md for the checklist.
 id: PROG-industrial-automation
 name: "Industrial Automation Domain"
 industry: "Maschinen-/Anlagenbau, Industrieautomation, Parksysteme, Verpackungsmaschinen"
 customer_entry: "Ich baue Verpackungsmaschinen / Parksysteme / Industrieanlagen."
 backlog_rank: 1
 rationale: "höchster aktueller Kundennutzen — bereits am weitesten modelliert (CRA + MaschinenVO)."
 status: in_progress
 # Stage 2 — the requirement sources that typically apply to this industry
 typical_requirement_sources: [CRA, MaschinenVO, EMV, RED, DataAct, IEC62443, NIS2]
 # pre-onboarding capability HYPOTHESIS (nicht Wahrheit, vgl. ETO): feeds Company 2A as `inferred`
 typical_certifications: [ISO9001, ISO27001]
 ownership: "Stufe 1 Reasoning · 2 Legal-KG · 3 Execution · 4-7 Reasoning"
 note: >
  Diese Domäne hat den Vorlauf: CRA + MaschinenVO sind als Convergence-Pattern, RTS und Playbooks
  bereits (teilweise) im Korpus. EMV/RED/IEC62443/NIS2 sind identifiziert, aber noch nicht modelliert.
@@ -0,0 +1,18 @@
 # Domain Knowledge Program — Medical (backlog rank 4). A domain DEFINITION, not corpus content.
 # 7-stage progress is DERIVED from the corpus (computed-not-stored). See programs/README.md.
 id: PROG-medical
 name: "Medical Domain"
 industry: "Medizinprodukte-Hersteller, Medizintechnik"
 customer_entry: "Ich baue Medizinprodukte / Medizintechnik."
 backlog_rank: 4
 rationale: "hoher Leidensdruck (MDR), aber spezialisierter Markt → nach Industrial/Automotive."
 status: planned
 typical_requirement_sources: [MDR, IEC62304, ISO14971, IEC60601, CRA]
 typical_certifications: [ISO13485, ISO14971]
 ownership: "Stufe 1 Reasoning · 2 Legal-KG · 3 Execution · 4-7 Reasoning"
 note: >
  IEC 62304 (Software-Lebenszyklus) ↔ CRA secure-development = quellenübergreifender Convergence-
  Kandidat. ISO 14971 (Risikomanagement) ↔ Produkt-Risikoanalyse. Erst nach Industrial/Automotive.
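The domain shells in this commit share one shape and deliberately store no stage status. A minimal sketch of a shape check under those assumptions; `REQUIRED_KEYS` and `shell_problems` are hypothetical names, and the dictionary below mirrors the Medical shell instead of loading the YAML file:

```python
# Keys every shell in this commit appears to carry (an assumption drawn from the diff).
REQUIRED_KEYS = (
    "id", "name", "industry", "customer_entry", "backlog_rank",
    "status", "typical_requirement_sources", "typical_certifications",
)


def shell_problems(shell: dict) -> list:
    """Return human-readable problems for one domain definition; empty means OK."""
    problems = ["missing or empty: %s" % key for key in REQUIRED_KEYS if not shell.get(key)]
    # Progress is computed-not-stored, so a stored `stages` block is a contract violation.
    if "stages" in shell:
        problems.append("stored stage status found: progress must be derived")
    return problems


medical = {
    "id": "PROG-medical",
    "name": "Medical Domain",
    "industry": "Medizinprodukte-Hersteller, Medizintechnik",
    "customer_entry": "Ich baue Medizinprodukte / Medizintechnik.",
    "backlog_rank": 4,
    "status": "planned",
    "typical_requirement_sources": ["MDR", "IEC62304", "ISO14971", "IEC60601", "CRA"],
    "typical_certifications": ["ISO13485", "ISO14971"],
}

print(shell_problems(medical))
```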
@@ -142,35 +142,61 @@ def completeness_section() -> None:
 def domain_programs_section(base_dir) -> None:
-    """Render the Domain Knowledge Programs section (kept here so generate.py stays under the LOC budget)."""
+    """Domain Knowledge Program v1 — per-domain maturity KPI DERIVED from the corpus (computed-not-stored)."""
    import os
    import yaml
-    from compliance.completeness import assess_completeness
+    from compliance.knowledge_intake import build_knowledge_index
    def _load(sub):
        d = os.path.join(base_dir, "..", "knowledge", sub)
        return [yaml.safe_load(open(os.path.join(d, f), encoding="utf-8"))
                for f in sorted(os.listdir(d)) if f.endswith(".yaml")]
    idx = build_knowledge_index(_load("transition_patterns"), _load("implementation_playbooks"),
                                _load("reference_transition_scenarios"))
    pdir = os.path.join(base_dir, "..", "knowledge", "programs")
-    progs = [yaml.safe_load(open(os.path.join(pdir, f), encoding="utf-8"))
+    progs = sorted((yaml.safe_load(open(os.path.join(pdir, f), encoding="utf-8"))
-             for f in sorted(os.listdir(pdir)) if f.endswith(".yaml")]
+                    for f in sorted(os.listdir(pdir)) if f.endswith(".yaml")), key=lambda p: p.get("backlog_rank", 99))
-    w("## Domain Knowledge Programs — ab jetzt Domänen, nicht Architektur")
+
    _ALIAS = {"cyber resilience act": "cra", "maschinenverordnung": "maschinenvo", "iatf": "iatf16949"}
    def _canon(r):
        k = str(r).strip().lower()
        return _ALIAS.get(k, k)
    def _hits(reg_lists, src):
        cs = {_canon(s) for s in src}
        return [k for k, regs in reg_lists.items() if cs & {_canon(x) for x in regs}]
    def _source_modeled(index, source, canon):
        c = canon(source)
        in_tp = any(c in {canon(x) for x in regs} for regs in index.transition_patterns.values())
        in_rts = any(c in {canon(x) for x in regs} for regs in index.reference_scenarios.values())
        in_pb = any(c in {canon(x) for x in index.capability_regulations.get(cap, [])} for cap in index.playbook_capabilities)
        return in_tp or in_rts or in_pb
    w("## Domain Knowledge Program v1 — Reifegrad je Domäne (reproduzierbarer KPI)")
    w("")
-    w('_Die Runtime-Architektur ist eingefroren. Eine neue Domäne = Daten + Wissen, die jede Sicht automatisch erweitern. Produktionsstraße: Corpus→Obligations→Capabilities→Transition→Playbooks→Reference→Completeness. **Law-first: Recht → Pflichten → Capabilities → Managementsystem → Delta.**_')
+    w('_Engpass = Domänenmodellierung. Jede Domäne läuft durch DIESELBE 7-Stufen-Produktionsstraße (Domain Model → Requirement Sources → Capability Registry → Transition Patterns → Playbooks → Reference Scenarios → Completeness). Reifegrad aus dem ECHTEN Korpus abgeleitet (computed-not-stored), keine Marketingzahl. Einstieg über Industry, nicht Regelwerk._')
    w("")
    w("| Rank | Domäne | Reifegrad (Sources modelliert) | modelliert/total | Korpus TP·PB·RTS |")
    w("|---|---|---|---|---|")
    for p in progs:
-        w("**%s** — _%s_ (status: `%s`)" % (p["name"], p["customer_question"], p["status"]))
+        src = p.get("typical_requirement_sources", [])
-        w("")
+        tp, rts = _hits(idx.transition_patterns, src), _hits(idx.reference_scenarios, src)
-        w("| Stufe | Artefakt | Owner | Status |")
+        cs = {_canon(s) for s in src}
-        w("|---|---|---|---|")
+        pb = [c for c in idx.playbook_capabilities if cs & {_canon(x) for x in idx.capability_regulations.get(c, [])}]
-        for s in p.get("stages", []):
+        modeled = [s for s in src if _source_modeled(idx, s, _canon)]   # sources with >=1 corpus artifact
-            w("| %s | %s | %s | **%s** |" % (s["id"], s["name"], s["owner"], s["status"]))
+        breadth = (len(modeled) / len(src)) if src else 0.0             # honest differentiator (not CRA-shared depth)
-        w("")
+        filled = int(round(breadth * 10))
-        areas = next((s.get("areas", []) for s in p.get("stages", []) if s.get("id") == "B1"), [])
+        w("| %d | **%s** | `%s` %d%% | %d/%d | %d·%d·%d |" % (
-        if areas:
+            p.get("backlog_rank", 99), p["name"], "█" * filled + "░" * (10 - filled),
-            rep = assess_completeness(identified_regulations=areas, corpus_status={})   # all unknown -> open baseline
+            int(round(breadth * 100)), len(modeled), len(src), len(tp), len(pb), len(rts)))
-            w("- **Baseline (Completeness):** %s — die 6 Bereiche: %s" % (rep.completeness_summary, ", ".join(areas)))
+    w("")
-        w("")
+    w('_Industry-Einstieg + ETO-Hypothese: jede Domäne kennt ihre typischen Sources + Zertifikate → vor dem Onboarding „diese Prozesswelt ist wahrscheinlich vorhanden" (Hypothese, nie Wahrheit; speist Company 2A als `inferred`). Backlog nach Kundennutzen, KPI nach echtem Korpusstand — beides bewusst getrennt._')
    w("_Jedes Programm liefert dieselben Artefakte; Status `open/blocked` kippt automatisch, wenn die Stufen landen — Reference Suite + Completeness dokumentieren den Fortschritt je Domäne._")
    w("")
    coverage_table([
-        ("Domain Program Blueprint (wiederverwendbar)", "PASS", "Corpus→…→Completeness, law-first, Ownership je Stufe"),
+        ("Domain Knowledge Program (7-Stufen-Produktionsstraße)", "PASS", "%d Domänen im Backlog, Industrial Automation #1" % len(progs)),
-        ("Environmental Program (Daten)", "PASS", "B1@Legal-KG · B2@Execution · B3@Reasoning (blocked)"),
+        ("Reifegrad-KPI (computed-not-stored)", "PASS", "aus echtem Korpus abgeleitet (TP/PB/RTS je Domäne)"),
-        ("Phase B = Domänen, keine Architektur", "PASS", "kein neues Runtime-Framework (Freeze, ADR-008)"),
+        ("Regelwerk-ID-Normalisierung", "TODO", "Alias CRA/MaschinenVO im KPI — kanonische IDs ausstehend"),
    ])
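The coverage table marks regulation-ID normalization as TODO: the `_canon` helper in this hunk only strips and lowercases an ID before the alias lookup, so `IEC 62443` and `IEC62443` would not match. One possible refinement, sketched under the assumption that separators carry no meaning in regulation IDs (this is not the commit's method, and `canon` and `ALIASES` are hypothetical names):

```python
import re

# Alias keys are written in separator-free form because canon() removes separators first.
ALIASES = {
    "cyberresilienceact": "cra",
    "maschinenverordnung": "maschinenvo",
    "iatf": "iatf16949",
}


def canon(regulation_id) -> str:
    """Collapse whitespace, underscores, hyphens and slashes, lowercase, then apply aliases."""
    key = re.sub(r"[\s_\-/]+", "", str(regulation_id)).lower()
    return ALIASES.get(key, key)


print(canon("Cyber Resilience Act"), canon("IEC 62443"), canon("UNECE_R155"))
```

Whether dropping separators is safe depends on the real ID space; a curated canonical-ID registry, as the TODO suggests, would avoid that guess.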
@@ -365,29 +365,27 @@ _Sobald der Umwelt-Korpus (ISO 14001 etc.) landet, kippt `Environmental` automat
 | Begründete Ausschlüsse (Korpus/Anwendbarkeit) | **PASS** | 3 Ausschlüsse, alle mit Grund |
 | Fortschritts-Doku je Domäne | **PASS** | Environmental offen→validated bei Korpus-Landung |
-## Domain Knowledge Programs — ab jetzt Domänen, nicht Architektur
+## Domain Knowledge Program v1 — Reifegrad je Domäne (reproduzierbarer KPI)
-_Die Runtime-Architektur ist eingefroren. Eine neue Domäne = Daten + Wissen, die jede Sicht automatisch erweitern. Produktionsstraße: Corpus→Obligations→Capabilities→Transition→Playbooks→Reference→Completeness. **Law-first: Recht → Pflichten → Capabilities → Managementsystem → Delta.**_
+_Engpass = Domänenmodellierung. Jede Domäne läuft durch DIESELBE 7-Stufen-Produktionsstraße (Domain Model → Requirement Sources → Capability Registry → Transition Patterns → Playbooks → Reference Scenarios → Completeness). Reifegrad aus dem ECHTEN Korpus abgeleitet (computed-not-stored), keine Marketingzahl. Einstieg über Industry, nicht Regelwerk._
-**Environmental Knowledge Program** — _Welche Umweltanforderungen gelten für mein Produkt (z. B. Industriespülmaschine)?_ (status: `started`)
+| Rank | Domäne | Reifegrad (Sources modelliert) | modelliert/total | Korpus TP·PB·RTS |
 |---|---|---|---|---|
 | 1 | **Industrial Automation Domain** | `████░░░░░░` 43% | 3/7 | 3·2·3 |
 | 2 | **Environmental Domain** | `░░░░░░░░░░` 0% | 0/6 | 0·0·0 |
 | 3 | **Automotive Domain** | `██░░░░░░░░` 17% | 1/6 | 1·0·0 |
 | 4 | **Medical Domain** | `██░░░░░░░░` 20% | 1/5 | 3·2·3 |
 | 5 | **Energy Domain** | `██░░░░░░░░` 25% | 1/4 | 3·2·3 |
-| Stufe | Artefakt | Owner | Status |
+_Industry-Einstieg + ETO-Hypothese: jede Domäne kennt ihre typischen Sources + Zertifikate → vor dem Onboarding „diese Prozesswelt ist wahrscheinlich vorhanden" (Hypothese, nie Wahrheit; speist Company 2A als `inferred`). Backlog nach Kundennutzen, KPI nach echtem Korpusstand — beides bewusst getrennt._
 |---|---|---|---|
 | B1 | Environmental Regulatory Corpus | Legal Knowledge / Obligation Registry | **open** |
 | B2 | Environmental Capability Model | Compliance Execution | **open** |
 | B3 | Transition Patterns (ISO 14001 -> Environmental Corpus) | Reasoning (Knowledge Acquisition) | **blocked** |
 - **Baseline (Completeness):** Identifiziert 6 · bewertet 0 · offen 6 · Unsicherheiten 0 · Begründung ja — die 6 Bereiche: water, chemicals, emissions, energy, waste, product_responsibility
 _Jedes Programm liefert dieselben Artefakte; Status `open/blocked` kippt automatisch, wenn die Stufen landen — Reference Suite + Completeness dokumentieren den Fortschritt je Domäne._
 **Architecture Coverage**
 | Layer | Status | Hinweis |
 |---|---|---|
-| Domain Program Blueprint (wiederverwendbar) | **PASS** | Corpus→…→Completeness, law-first, Ownership je Stufe |
+| Domain Knowledge Program (7-Stufen-Produktionsstraße) | **PASS** | 5 Domänen im Backlog, Industrial Automation #1 |
-| Environmental Program (Daten) | **PASS** | B1@Legal-KG · B2@Execution · B3@Reasoning (blocked) |
+| Reifegrad-KPI (computed-not-stored) | **PASS** | aus echtem Korpus abgeleitet (TP/PB/RTS je Domäne) |
-| Phase B = Domänen, keine Architektur | **PASS** | kein neues Runtime-Framework (Freeze, ADR-008) |
+| Regelwerk-ID-Normalisierung | **TODO** | Alias CRA/MaschinenVO im KPI — kanonische IDs ausstehend |
 ## Gaps → Epics (Backlog — nur erfasst, NICHT implementiert)
@@ -401,5 +399,5 @@ _Jedes Programm liefert dieselben Artefakte; Status `open/blocked` kippt automat
 ## Suite-Status (Roll-up)
 - Coverage-Zellen gesamt: **47**
-- PASS: **36** · PARTIAL: 3 · UNSUPPORTED: 1 · TODO: 6 · N/A: 1 · NEEDS_FACTS: 0
+- PASS: **35** · PARTIAL: 3 · UNSUPPORTED: 1 · TODO: 7 · N/A: 1 · NEEDS_FACTS: 0
 - Fortschritt = PASS-Anteil steigt, wenn Epics RS-001…004 landen (objektiver Maßstab, kein LOC).
@@ -1,8 +1,9 @@
-"""Characterization test for the Environmental Knowledge Program definition (data, not code).
+"""Characterization tests for the Domain Knowledge Program v1 backlog (data, not code).
-Pins the LAW-FIRST contract: the domain is ordered Corpus(B1) -> Capabilities(B2) -> Transition(B3),
+Pins the program FRAMEWORK contract: a ranked backlog of domain definitions, each entered by INDUSTRY
-not the reverse; ownership is assigned per stage; B3 (ISO 14001 -> corpus) is blocked until both sides
+with its typical requirement sources + a pre-onboarding capability hypothesis (typical_certifications).
-exist. If a future edit reverses the order or drops an owner, this test fails.
+Industrial Automation is rank 1. Environmental stays law-first. If a future edit reorders the backlog,
 drops a source list, or reverts environmental to an ISO-first framing, these tests fail.
 """
 from __future__ import annotations
@@ -11,45 +12,60 @@ import os
 import yaml
-_PROG = os.path.join(os.path.dirname(__file__), "..", "knowledge", "programs", "environmental.yaml")
+_DIR = os.path.join(os.path.dirname(__file__), "..", "knowledge", "programs")
-def _program():
+def _programs():
-    with open(_PROG, encoding="utf-8") as f:
+    out = {}
-        return yaml.safe_load(f)
+    for f in sorted(os.listdir(_DIR)):
        if f.endswith(".yaml"):
            with open(os.path.join(_DIR, f), encoding="utf-8") as h:
                p = yaml.safe_load(h)
            out[p["id"]] = p
    return out
-def test_blueprint_is_the_reusable_production_line():
+def test_five_domains_ranked_backlog():
-    p = _program()
+    ranks = sorted(p["backlog_rank"] for p in _programs().values())
-    assert p["blueprint"] == ["corpus", "obligations", "capabilities", "transition_patterns",
+    assert ranks == [1, 2, 3, 4, 5]
                              "playbooks", "reference_scenarios", "completeness"]
-def test_stages_are_law_first_in_order():
+def test_industrial_automation_is_rank_1():
-    stages = _program()["stages"]
+    progs = _programs()
-    assert [s["id"] for s in stages] == ["B1", "B2", "B3"]          # corpus -> capabilities -> transition
+    rank1 = [p for p in progs.values() if p["backlog_rank"] == 1]
-    assert "Corpus" in stages[0]["name"] and "Transition" in stages[2]["name"]
+    assert len(rank1) == 1 and rank1[0]["id"] == "PROG-industrial-automation"
    assert {"CRA", "MaschinenVO"} <= set(rank1[0]["typical_requirement_sources"])
-def test_ownership_assigned_per_stage():
+def test_every_domain_entered_by_industry_with_sources_and_hypothesis():
-    by = {s["id"]: s for s in _program()["stages"]}
+    for p in _programs().values():
-    assert "Legal Knowledge" in by["B1"]["owner"]                   # corpus + obligations
+        assert p.get("industry") and p.get("customer_entry")           # industry-first entry
-    assert "Compliance Execution" in by["B2"]["owner"]             # capability model
+        assert p["typical_requirement_sources"]                         # stage 2 defined
-    assert "Reasoning" in by["B3"]["owner"]                        # transition patterns
+        assert p["typical_certifications"]                             # pre-onboarding capability hypothesis (ETO)
-def test_transition_is_blocked_until_both_sides_known():
+def test_no_stored_stage_status_progress_is_derived():
-    b3 = {s["id"]: s for s in _program()["stages"]}["B3"]
+    # the 7-stage progress is computed-not-stored: program shells must NOT hard-code stage status
-    assert b3["status"] == "blocked"
+    for p in _programs().values():
-    assert b3["depends_on"] == ["B1", "B2"]                         # built LAST (law-first)
+        assert "stages" not in p
-def test_b1_covers_the_six_environmental_areas():
+def test_environmental_stays_law_first():
-    b1 = {s["id"]: s for s in _program()["stages"]}["B1"]
+    env = _programs()["PROG-environmental"]
-    assert set(b1["areas"]) == {"water", "chemicals", "emissions", "energy", "waste", "product_responsibility"}
+    assert "ISO 14001 ist KEIN Umweltrecht" in env["principle"]
    assert set(env["typical_requirement_sources"]) == {"water", "chemicals", "emissions", "energy", "waste", "product_responsibility"}
-def test_program_is_a_domain_not_an_iso_project():
+def test_automotive_and_medical_present():
-    p = _program()
+    progs = _programs()
-    assert "Umweltanforderungen" in p["customer_question"]          # starts from the law, not ISO 14001
+    assert "TISAX" in progs["PROG-automotive"]["typical_requirement_sources"]
-    assert "ISO 14001 ist KEIN Umweltrecht" in p["principle"]
+    assert "MDR" in progs["PROG-medical"]["typical_requirement_sources"]
 def test_readme_documents_seven_stage_checklist():
    with open(os.path.join(_DIR, "README.md"), encoding="utf-8") as h:
        readme = h.read()
    for stage in ["Domain Model", "Requirement Sources", "Capability Registry",
                  "Transition Patterns", "Playbooks", "Reference Scenarios", "Completeness"]:
        assert stage in readme
    assert "Industrial Automation" in readme                            # backlog #1 documented
@@ -0,0 +1,53 @@
 # ADR-009: Domain Knowledge Program — one 7-stage production line per domain
 - **Status:** Accepted
 - **Date:** 2026-06-27
 - **Type:** Architecture / organisational decision
 - **Related:** [ADR-008](ADR-008-from-architecture-to-domains.md), [ADR-007](ADR-007-regulatory-completeness.md), [ADR-005](ADR-005-knowledge-production-pipeline.md), Architecture Freeze v1.0, [[company-intelligence-2a]]
 ## Context
 The bottleneck is no longer architecture, controls, or "knowledge" in general, but precisely
 **domain modelling.** Phase B (ADR-008) is therefore not organised as single-regulation features
 but as ONE work program with sub-programs per domain, all run through the same
 production line. No further architecture epic, no new runtime architecture.
 ## Decision
 1. **Entry is by INDUSTRY, not by regulation.** The customer says "I build
   packaging machines / I am an automotive supplier / I build parking systems", not "explain ISO 9001 to me".
   The pipeline starts earlier: `Industry → Domain Model → Requirement Sources → Requirements →
   Capabilities → … → Completeness`.
 2. **One 7-stage checklist, identical for EVERY domain:**
   1 Domain Model · 2 Requirement Sources · 3 Capability Registry · 4 Transition Patterns ·
   5 Playbooks · 6 Reference Scenarios · 7 Completeness. Ownership per stage (1 Reasoning · 2 Legal-KG ·
   3 Execution · 4–7 Reasoning). This is the scaling mechanism: every new domain reuses the same
   line, and the existing engines extend automatically.
 3. **Domains carry `typical_requirement_sources` + `typical_certifications` → pre-onboarding HYPOTHESIS
   (the ETO insight).** Before onboarding: "this process world is *probably* present", as a
   hypothesis, never as truth. Feeds Company 2A as `inferred`, never `confirmed`. We do not want to know
   WHETHER an automotive supplier has ISO 9001 (everyone does), but which capabilities are therefore
   probably already present.
 4. **Per-domain KPI, reproducible (computed-not-stored).** Maturity is derived from the REAL corpus
   (modelled sources / transition patterns / playbooks / reference scenarios / consciously
   declared gaps, based on the Regulatory Completeness Engine), NOT curated as a number.
   Program shells store NO stage status. No marketing figure.
 5. **Domain Knowledge Program v1: backlog by customer value** (kept separate from the KPI, which reflects corpus state):
   1 Industrial Automation · 2 Environmental · 3 Automotive · 4 Medical · 5 Energy.
 ## Consequences
 - **Programs instead of features:** every domain is a machine-readable definition (`programs/*.yaml`);
  the maturity KPI in the reference suite is derived from the corpus and differentiates honestly
  (Industrial Automation leads, Environmental is at 0 %: the work lies ahead).
 - **Backlog ≠ KPI:** the backlog orders by customer value, the KPI measures the actual corpus state.
  The two are deliberately separate (e.g. a domain can rank high in the backlog but low in the KPI).
 - **Work shifts for good from software production to knowledge production.** Competitive advantage =
  quality and breadth of the modelled domains.
 - **Freeze-compliant:** no new metamodel, no graph, no new `compliance/` module. Only
  program data (`knowledge/programs/`) + a derived reporting view in the reference suite.
 - This ADR is non-runtime → no deploy (see [ADR-001](ADR-001-runtime-deploy-policy.md)).
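Decision 3 of the ADR turns a domain's typical certifications into a pre-onboarding capability hypothesis that is only ever `inferred`. A minimal sketch of that idea, assuming a certification-to-capability lookup exists; the mapping `CERT_CAPABILITIES`, its capability names, and `pre_onboarding_hypotheses` are all hypothetical and chosen for illustration only:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityHypothesis:
    certification: str
    capability: str
    status: str = "inferred"  # a hypothesis before onboarding, never "confirmed"


# Hypothetical lookup for illustration; the real capability names live in the registry.
CERT_CAPABILITIES = {
    "ISO9001": ["document_control", "corrective_action"],
    "ISO27001": ["access_management", "incident_management"],
}


def pre_onboarding_hypotheses(domain: dict) -> list:
    """Expand a domain's typical certifications into inferred capability hypotheses."""
    return [
        CapabilityHypothesis(cert, cap)
        for cert in domain.get("typical_certifications", [])
        for cap in CERT_CAPABILITIES.get(cert, [])
    ]


industrial = {"typical_certifications": ["ISO9001", "ISO27001"]}
for hypothesis in pre_onboarding_hypotheses(industrial):
    print(hypothesis)
```

Making `inferred` the frozen default keeps the "hypothesis, never truth" rule in the data type itself; confirming a capability would have to be a separate, explicit step after onboarding.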