Merge pull request 'feat: open Domain Knowledge Program v1 (7-stage production line + per-domain KPI)' (#23) from feat/domain-knowledge-program-v1 into main

pilotadmin
2026-06-27 18:50:12 +02:00
10 changed files with 312 additions and 159 deletions
+47 -40
@@ -1,51 +1,58 @@
# Domain Knowledge Programs - from architecture to domains
# Domain Knowledge Program — the production line for every domain
**The architecture is stable. From here the value comes from DOMAINS, not more software.**
The runtime architecture (scope, regulatory map, capability delta, optimization, playbooks, intake,
production, completeness) is frozen. A new regulatory domain is a **data + knowledge** project that
extends every existing view automatically — no new runtime framework (see ADR-008, Freeze v1.0).
**The architecture is stable. From here the value comes from DOMAIN MODELLING, not more software.**
The real bottleneck is no longer architecture or controls or even „knowledge" — it is **domain
modelling**. Phase B is therefore organised as ONE program with sub-programs per domain, each run
through the SAME production line. No new runtime framework (ADR-008/009, Freeze v1.0).
## The production line (reusable for EVERY domain)
## The customer enters by INDUSTRY, not by regulation
A customer never says "explain ISO 9001". They say "I build packaging machines" / "I'm an automotive
supplier" / "I build parking systems". So the pipeline starts at the industry:
```
Regulatory Corpus → Obligations → Capabilities → Transition Patterns → Playbooks → Reference Scenarios → Completeness
Industry → Domain Model → Requirement Sources → Requirements → Capabilities → … → Reality / Verification
```
Each domain delivers the SAME artifacts. Once they land, the existing engines automatically extend:
Scope · Gap · Capability Delta · Optimization · Playbooks · Reference Scenarios · Regulatory Completeness.
## The 7-stage checklist (identical for EVERY domain)
## Law-first (a deliberate ordering)
Start from the **law**, not from a management system. A management system (e.g. ISO 14001) is NOT
the domain; it is one possible source state. The customer asks *"Which requirements apply to my
product?"*, not *"How do I move away from ISO 14001?"*. So the order is:
```
Law → Obligations → Capabilities → (Management system) → Delta
```
A management-system→corpus transition pattern is built LAST, once BOTH sides are known.
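
The law-first ordering can be read as a small dependency graph. The sketch below is only an illustration under assumptions: the stage identifiers are hypothetical labels modelled on the B1/B2/B3 naming, and nothing in the repository is known to use `graphlib` for this.

```python
from graphlib import TopologicalSorter

# Hypothetical stage graph: each stage maps to the set of stages it depends on.
# The identifiers are illustrative labels, not names taken from the codebase.
STAGE_DEPENDENCIES = {
    "B1_corpus_obligations": set(),
    "B2_capabilities": {"B1_corpus_obligations"},
    "B3_transition_patterns": {"B1_corpus_obligations", "B2_capabilities"},
}


def build_order(dependencies):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = build_order(STAGE_DEPENDENCIES)
print(" -> ".join(order))
```

Because the transition stage depends on both other stages, any valid ordering places it last, which is the point of the law-first rule.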
## Ownership per stage (coordinate via the board, do not duplicate)
| Stage | Artifact | Owner |
| # | Stage | Owner |
|---|---|---|
| B1 Corpus + Obligations | legal sources, obligations registry | **Legal Knowledge / Obligation Registry** |
| B2 Capability Model | new capabilities in the Master Capability Registry | **Compliance Execution** |
| B3 Transition Patterns | source-state → corpus delta patterns | **Reasoning (Knowledge Acquisition)** |
| Playbooks | implementation playbooks per capability | **Reasoning** |
| Reference Scenarios | canonical regression + expected outcomes | **Reasoning** |
| Completeness | corpus-status registry per domain | **Reasoning / curation** |
| 1 | **Domain Model** (industry → what world is this?) | Reasoning / curation |
| 2 | **Requirement Sources** (which regulations/standards/specs apply) | Legal Knowledge |
| 3 | **Capability Registry** (capabilities the sources require) | Compliance Execution |
| 4 | **Transition Patterns** (source-state → domain delta) | Reasoning |
| 5 | **Playbooks** (how to implement each capability) | Reasoning |
| 6 | **Reference Scenarios** (canonical regression + expected outcomes) | Reasoning |
| 7 | **Completeness** (auditable coverage per domain) | Reasoning / curation |
## Programs (planned)
This is the scaling mechanism: every new domain reuses the same production line; the existing engines
(Scope, Gap, Capability Delta, Optimization, Playbooks, Reference, Completeness) extend automatically.
| Program | File | Status |
|---|---|---|
| Environmental Knowledge Program | `environmental.yaml` | started (B1 handed off) |
| Automotive Knowledge Program | _planned_ | — |
| OT / IEC 62443 Knowledge Program | _planned_ | — |
| Functional Safety Knowledge Program | _planned_ | — |
## A domain knows its typical sources → pre-onboarding HYPOTHESIS (the ETO insight)
Each program is a machine-readable definition (`*.yaml`) consumed by the reference suite to track its
progress; future sessions flip stage `status` as artifacts land, and the Completeness engine reports
the domain flipping `unsupported → validated` automatically.
Each domain definition lists `typical_requirement_sources` and `typical_certifications`. So before
onboarding, BreakPilot can say „this process world is *probably* present" — as a **hypothesis, not a
truth**. We don't want to know whether an automotive supplier has ISO 9001 (everyone does); we want
to know **which company capabilities are therefore probably already present** (feeds Company 2A as
`inferred`, never `confirmed`).
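
A minimal sketch of how such a pre-onboarding hypothesis might be represented. Everything here is an assumption for illustration: the `CERT_IMPLIES` mapping, the capability names, and the `CapabilityHypothesis` type are hypothetical and do not describe the actual Company 2A model.

```python
from dataclasses import dataclass

# Hypothetical mapping from a certification to capabilities it usually implies.
# The capability names are invented for this example.
CERT_IMPLIES = {
    "ISO9001": ["document_control", "internal_audit"],
    "ISO27001": ["risk_assessment", "access_control"],
}


@dataclass(frozen=True)
class CapabilityHypothesis:
    capability: str
    evidence: str              # the certification that suggests the capability
    status: str = "inferred"   # a hypothesis; never "confirmed" before onboarding


def infer_capabilities(typical_certifications):
    """Turn a domain's typical certifications into de-duplicated hypotheses."""
    seen = set()
    hypotheses = []
    for cert in typical_certifications:
        for capability in CERT_IMPLIES.get(cert, []):
            if capability not in seen:
                seen.add(capability)
                hypotheses.append(CapabilityHypothesis(capability, cert))
    return hypotheses


for h in infer_capabilities(["ISO9001", "ISO27001"]):
    print(h.capability, h.status, "(from %s)" % h.evidence)
```

Keeping `status` fixed at `inferred` in the type makes it harder to accidentally treat a domain-level guess as a confirmed company fact.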
## Per-domain KPI — reproducible, not marketing
Progress per domain is **derived from the Regulatory Completeness Engine + the actual corpus**
(computed-not-stored): identified requirement sources · modelled capabilities · transition patterns ·
playbooks · passed reference scenarios · consciously declared corpus gaps. Rendered as a bar
(`Industrial ███████░░░ 70 %`). These are reproducible quality metrics — no curated numbers.
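
The bar rendering can be sketched as follows. This mirrors the breadth calculation in the reference-suite generator changed by this PR (modelled sources divided by typical sources, rounded to a ten-cell bar); the function name `maturity_bar` and the `width` parameter are assumptions made for the example.

```python
def maturity_bar(modeled, total, width=10):
    """Render a maturity bar from counts that are derived, never stored."""
    breadth = (modeled / total) if total else 0.0
    filled = int(round(breadth * width))
    percent = int(round(breadth * 100))
    return "%s %d%%" % ("█" * filled + "░" * (width - filled), percent)


print("Industrial", maturity_bar(3, 7))
print("Environmental", maturity_bar(0, 6))
```

Since the inputs are counts taken from the corpus at report time, re-running the report on the same corpus reproduces the same bar.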
## Domain Knowledge Program v1 — backlog (by current customer value)
| Rank | Domain | File | Typical sources |
|---|---|---|---|
| 1 | **Industrial Automation** | `industrial_automation.yaml` | CRA · MaschinenVO · EMV · RED · Data Act · IEC 62443 · NIS2 |
| 2 | Environmental | `environmental.yaml` | Water · Chemicals · Air · Energy · Waste · Product responsibility |
| 3 | Automotive | `automotive.yaml` | IATF · TISAX · UNECE R155/R156 · ASPICE · OEM requirement specifications (Lastenhefte) |
| 4 | Medical | `medical.yaml` | MDR · IEC 62304 · ISO 14971 |
| 5 | Energy | `energy.yaml` | depends on target market |
The work shifts decisively from software development to knowledge production; the competitive
advantage now comes from the quality and breadth of the modelled domains.
@@ -0,0 +1,19 @@
# Domain Knowledge Program — Automotive (backlog rank 3). A domain DEFINITION, not corpus content.
# 7-stage progress is DERIVED from the corpus (computed-not-stored). See programs/README.md.
id: PROG-automotive
name: "Automotive Domain"
industry: "Automobilzulieferer, OEM-Zulieferkette"
customer_entry: "Ich bin Automobilzulieferer."
backlog_rank: 3
rationale: "großer Markt; OEM-Lastenhefte = früher Business-Requirement-Anwendungsfall."
status: planned
typical_requirement_sources: [IATF16949, TISAX, "UNECE R155", "UNECE R156", ASPICE, OEM_Lastenheft]
typical_certifications: [ISO9001, IATF16949, TISAX, ISO27001]
ownership: "Stufe 1 Reasoning · 2 Legal-KG · 3 Execution · 4-7 Reasoning"
note: >
ISMS→TISAX-Transition-Pattern existiert bereits (Vorarbeit). UNECE R155 (Cybersecurity Management
System) ↔ CRA = quellenübergreifender Convergence-Kandidat. OEM-Lastenheft = erster Business
Requirement (siehe Vision V2 / Requirements Verification, NICHT jetzt).
@@ -0,0 +1,18 @@
# Domain Knowledge Program — Energy (backlog rank 5). A domain DEFINITION, not corpus content.
# 7-stage progress is DERIVED from the corpus (computed-not-stored). See programs/README.md.
id: PROG-energy
name: "Energy Domain"
industry: "Energieerzeugung/-verteilung, Anlagen kritischer Infrastruktur"
customer_entry: "Ich baue Anlagen für Energieerzeugung / kritische Infrastruktur."
backlog_rank: 5
rationale: "Zielmarkt-abhängig; nach den klareren Industrie-/Produkt-Domänen."
status: planned
typical_requirement_sources: [NIS2, IEC62443, CRA, "netzcode/marktabhängig"]
typical_certifications: [ISO27001, IEC62443]
ownership: "Stufe 1 Reasoning · 2 Legal-KG · 3 Execution · 4-7 Reasoning"
note: >
Stark zielmarkt-abhängig (Netzcodes, nationale Vorgaben). NIS2/IEC62443 teilen sich Capabilities
mit Industrial Automation → Wiederverwendung wahrscheinlich hoch.
@@ -1,39 +1,25 @@
# Environmental Knowledge Program — a regulatory DOMAIN, not an ISO-14001 project.
# Machine-readable program definition consumed by the reference suite to track progress.
# Domain Knowledge Program — Environmental (backlog rank 2). A domain DEFINITION, not corpus content.
# LAW-FIRST: Umweltrecht -> Obligations -> Capabilities -> ISO 14001 -> Delta (never the reverse).
# 7-stage progress is DERIVED from the corpus (computed-not-stored). See programs/README.md.
id: PROG-environmental
name: "Environmental Knowledge Program"
customer_question: "Welche Umweltanforderungen gelten für mein Produkt (z. B. Industriespülmaschine)?"
status: started # planned | started | in_progress | complete
name: "Environmental Domain"
industry: "Hersteller mit Umweltpflichten (z. B. Industriespülmaschinen, Anlagenbau)"
customer_entry: "Welche Umweltanforderungen gelten für mein Produkt (z. B. Industriespülmaschine)?"
backlog_rank: 2
rationale: "konkreter Kundenbezug (Abwasser/Chemikalien) — direkt nach Industrial Automation."
status: started
principle: >
ISO 14001 ist KEIN Umweltrecht, sondern ein Managementsystem. Wir starten beim Recht und fragen
erst danach, welche vorhandenen Managementsysteme davon wahrscheinlich schon etwas abdecken.
ISO 14001 ist KEIN Umweltrecht, sondern ein Managementsystem (= ein Quellzustand). LAW-FIRST:
erst das Recht, dann welche vorhandenen Managementsysteme davon wahrscheinlich schon etwas abdecken.
# the reusable production line, instantiated for this domain
blueprint: [corpus, obligations, capabilities, transition_patterns, playbooks, reference_scenarios, completeness]
# Stage 2 — the requirement-source areas of this domain (each becomes laws/obligations at Stage 2-3)
typical_requirement_sources: [water, chemicals, emissions, energy, waste, product_responsibility]
typical_certifications: [ISO14001, ISO9001] # pre-onboarding capability HYPOTHESIS (nicht Wahrheit)
stages:
- id: B1
name: "Environmental Regulatory Corpus"
owner: "Legal Knowledge / Obligation Registry"
status: open # handed off — not built here
note: "Zunächst NUR Rechtsquellen + Pflichten (noch keine ISO, keine Capabilities)."
areas: # the six environmental obligation areas the customer actually faces
- water # Wasser / Abwasser
- chemicals # Chemikalien (REACH/CLP)
- emissions # Emissionen
- energy # Energie
- waste # Abfall
- product_responsibility # Produktverantwortung
- id: B2
name: "Environmental Capability Model"
owner: "Compliance Execution"
status: open # depends on B1; Registry grows here
depends_on: [B1]
capabilities:
# Reasoning capabilities to be modelled (Stage 3, @Execution) once the corpus lands
target_capabilities:
- chemical_management
- wastewater_management
- emissions_monitoring
@@ -41,17 +27,7 @@ stages:
- energy_data_capture
- environmental_incident_management
- id: B3
name: "Transition Patterns (ISO 14001 -> Environmental Corpus)"
owner: "Reasoning (Knowledge Acquisition)"
status: blocked # LAST — only once both sides (corpus + capabilities) are known
depends_on: [B1, B2]
note: "Erst jetzt sinnvoll: ISO 14001 als Quellzustand gegen den Umwelt-Korpus (User: law-first)."
# once B1-B3 land these extend AUTOMATICALLY via the existing engines (no new runtime architecture)
downstream_auto: [playbooks, reference_scenarios, optimization, scope, capability_delta, completeness]
acceptance: >
Regulatory Completeness kippt `Environmental` von unsupported/open auf assessed; die sechs Bereiche
sind als Obligations + Capabilities im validierten Korpus, das ISO-14001-Delta als Transition Pattern.
Messbar an der Reference Suite (Environmental-Zelle UNSUPPORTED -> PASS).
ownership: "Stufe 1 Reasoning · 2 Legal-KG (HANDOFF, nur Recht+Pflichten) · 3 Execution (HANDOFF) · 4-7 Reasoning"
note: >
B3 (ISO 14001 -> Korpus-Transition) entsteht ZULETZT, erst wenn Recht + Capabilities bekannt sind.
Acceptance: Regulatory Completeness kippt `Environmental` von unsupported/open auf assessed.
@@ -0,0 +1,22 @@
# Domain Knowledge Program — Industrial Automation (backlog rank 1, highest current customer value).
# A domain DEFINITION, not corpus content. The 7-stage progress is DERIVED from the corpus by the
# reference suite (computed-not-stored), never stored here. See programs/README.md for the checklist.
id: PROG-industrial-automation
name: "Industrial Automation Domain"
industry: "Maschinen-/Anlagenbau, Industrieautomation, Parksysteme, Verpackungsmaschinen"
customer_entry: "Ich baue Verpackungsmaschinen / Parksysteme / Industrieanlagen."
backlog_rank: 1
rationale: "höchster aktueller Kundennutzen — bereits am weitesten modelliert (CRA + MaschinenVO)."
status: in_progress
# Stage 2 — the requirement sources that typically apply to this industry
typical_requirement_sources: [CRA, MaschinenVO, EMV, RED, DataAct, IEC62443, NIS2]
# pre-onboarding capability HYPOTHESIS (nicht Wahrheit, vgl. ETO): feeds Company 2A as `inferred`
typical_certifications: [ISO9001, ISO27001]
ownership: "Stufe 1 Reasoning · 2 Legal-KG · 3 Execution · 4-7 Reasoning"
note: >
Diese Domäne hat den Vorlauf: CRA + MaschinenVO sind als Convergence-Pattern, RTS und Playbooks
bereits (teilweise) im Korpus. EMV/RED/IEC62443/NIS2 sind identifiziert, aber noch nicht modelliert.
@@ -0,0 +1,18 @@
# Domain Knowledge Program — Medical (backlog rank 4). A domain DEFINITION, not corpus content.
# 7-stage progress is DERIVED from the corpus (computed-not-stored). See programs/README.md.
id: PROG-medical
name: "Medical Domain"
industry: "Medizinprodukte-Hersteller, Medizintechnik"
customer_entry: "Ich baue Medizinprodukte / Medizintechnik."
backlog_rank: 4
rationale: "hoher Leidensdruck (MDR), aber spezialisierter Markt → nach Industrial/Automotive."
status: planned
typical_requirement_sources: [MDR, IEC62304, ISO14971, IEC60601, CRA]
typical_certifications: [ISO13485, ISO14971]
ownership: "Stufe 1 Reasoning · 2 Legal-KG · 3 Execution · 4-7 Reasoning"
note: >
IEC 62304 (Software-Lebenszyklus) ↔ CRA secure-development = quellenübergreifender Convergence-
Kandidat. ISO 14971 (Risikomanagement) ↔ Produkt-Risikoanalyse. Erst nach Industrial/Automotive.
@@ -142,35 +142,61 @@ def completeness_section() -> None:
def domain_programs_section(base_dir) -> None:
"""Render the Domain Knowledge Programs section (kept here so generate.py stays under the LOC budget)."""
"""Domain Knowledge Program v1 — per-domain maturity KPI DERIVED from the corpus (computed-not-stored)."""
import os
import yaml
from compliance.completeness import assess_completeness
from compliance.knowledge_intake import build_knowledge_index
def _load(sub):
d = os.path.join(base_dir, "..", "knowledge", sub)
return [yaml.safe_load(open(os.path.join(d, f), encoding="utf-8"))
for f in sorted(os.listdir(d)) if f.endswith(".yaml")]
idx = build_knowledge_index(_load("transition_patterns"), _load("implementation_playbooks"),
_load("reference_transition_scenarios"))
pdir = os.path.join(base_dir, "..", "knowledge", "programs")
progs = [yaml.safe_load(open(os.path.join(pdir, f), encoding="utf-8"))
for f in sorted(os.listdir(pdir)) if f.endswith(".yaml")]
w("## Domain Knowledge Programs — ab jetzt Domänen, nicht Architektur")
progs = sorted((yaml.safe_load(open(os.path.join(pdir, f), encoding="utf-8"))
for f in sorted(os.listdir(pdir)) if f.endswith(".yaml")), key=lambda p: p.get("backlog_rank", 99))
_ALIAS = {"cyber resilience act": "cra", "maschinenverordnung": "maschinenvo", "iatf": "iatf16949"}
def _canon(r):
k = str(r).strip().lower()
return _ALIAS.get(k, k)
def _hits(reg_lists, src):
cs = {_canon(s) for s in src}
return [k for k, regs in reg_lists.items() if cs & {_canon(x) for x in regs}]
def _source_modeled(index, source, canon):
c = canon(source)
in_tp = any(c in {canon(x) for x in regs} for regs in index.transition_patterns.values())
in_rts = any(c in {canon(x) for x in regs} for regs in index.reference_scenarios.values())
in_pb = any(c in {canon(x) for x in index.capability_regulations.get(cap, [])} for cap in index.playbook_capabilities)
return in_tp or in_rts or in_pb
w("## Domain Knowledge Program v1 — Reifegrad je Domäne (reproduzierbarer KPI)")
w("")
w('_Die Runtime-Architektur ist eingefroren. Eine neue Domäne = Daten + Wissen, die jede Sicht automatisch erweitern. Produktionsstraße: Corpus→Obligations→Capabilities→Transition→Playbooks→Reference→Completeness. **Law-first: Recht → Pflichten → Capabilities → Managementsystem → Delta.**_')
w('_Engpass = Domänenmodellierung. Jede Domäne läuft durch DIESELBE 7-Stufen-Produktionsstraße (Domain Model → Requirement Sources → Capability Registry → Transition Patterns → Playbooks → Reference Scenarios → Completeness). Reifegrad aus dem ECHTEN Korpus abgeleitet (computed-not-stored), keine Marketingzahl. Einstieg über Industry, nicht Regelwerk._')
w("")
w("| Rank | Domäne | Reifegrad (Sources modelliert) | modelliert/total | Korpus TP·PB·RTS |")
w("|---|---|---|---|---|")
for p in progs:
w("**%s** — _%s_ (status: `%s`)" % (p["name"], p["customer_question"], p["status"]))
src = p.get("typical_requirement_sources", [])
tp, rts = _hits(idx.transition_patterns, src), _hits(idx.reference_scenarios, src)
cs = {_canon(s) for s in src}
pb = [c for c in idx.playbook_capabilities if cs & {_canon(x) for x in idx.capability_regulations.get(c, [])}]
modeled = [s for s in src if _source_modeled(idx, s, _canon)] # sources with >=1 corpus artifact
breadth = (len(modeled) / len(src)) if src else 0.0 # honest differentiator (not CRA-shared depth)
filled = int(round(breadth * 10))
w("| %d | **%s** | `%s` %d%% | %d/%d | %d·%d·%d |" % (
p.get("backlog_rank", 99), p["name"], "█" * filled + "░" * (10 - filled),
int(round(breadth * 100)), len(modeled), len(src), len(tp), len(pb), len(rts)))
w("")
w("| Stufe | Artefakt | Owner | Status |")
w("|---|---|---|---|")
for s in p.get("stages", []):
w("| %s | %s | %s | **%s** |" % (s["id"], s["name"], s["owner"], s["status"]))
w("")
areas = next((s.get("areas", []) for s in p.get("stages", []) if s.get("id") == "B1"), [])
if areas:
rep = assess_completeness(identified_regulations=areas, corpus_status={}) # all unknown -> open baseline
w("- **Baseline (Completeness):** %s — die 6 Bereiche: %s" % (rep.completeness_summary, ", ".join(areas)))
w("")
w("_Jedes Programm liefert dieselben Artefakte; Status `open/blocked` kippt automatisch, wenn die Stufen landen — Reference Suite + Completeness dokumentieren den Fortschritt je Domäne._")
w('_Industry-Einstieg + ETO-Hypothese: jede Domäne kennt ihre typischen Sources + Zertifikate → vor dem Onboarding „diese Prozesswelt ist wahrscheinlich vorhanden" (Hypothese, nie Wahrheit; speist Company 2A als `inferred`). Backlog nach Kundennutzen, KPI nach echtem Korpusstand — beides bewusst getrennt._')
w("")
coverage_table([
("Domain Program Blueprint (wiederverwendbar)", "PASS", "Corpus→…→Completeness, law-first, Ownership je Stufe"),
("Environmental Program (Daten)", "PASS", "B1@Legal-KG · B2@Execution · B3@Reasoning (blocked)"),
("Phase B = Domänen, keine Architektur", "PASS", "kein neues Runtime-Framework (Freeze, ADR-008)"),
("Domain Knowledge Program (7-Stufen-Produktionsstraße)", "PASS", "%d Domänen im Backlog, Industrial Automation #1" % len(progs)),
("Reifegrad-KPI (computed-not-stored)", "PASS", "aus echtem Korpus abgeleitet (TP/PB/RTS je Domäne)"),
("Regelwerk-ID-Normalisierung", "TODO", "Alias CRA/MaschinenVO im KPI — kanonische IDs ausstehend"),
])
@@ -365,29 +365,27 @@ _Sobald der Umwelt-Korpus (ISO 14001 etc.) landet, kippt `Environmental` automat
| Begründete Ausschlüsse (Korpus/Anwendbarkeit) | **PASS** | 3 Ausschlüsse, alle mit Grund |
| Fortschritts-Doku je Domäne | **PASS** | Environmental offen→validated bei Korpus-Landung |
## Domain Knowledge Programs — ab jetzt Domänen, nicht Architektur
## Domain Knowledge Program v1 — Reifegrad je Domäne (reproduzierbarer KPI)
_Die Runtime-Architektur ist eingefroren. Eine neue Domäne = Daten + Wissen, die jede Sicht automatisch erweitern. Produktionsstraße: Corpus→Obligations→Capabilities→Transition→Playbooks→Reference→Completeness. **Law-first: Recht → Pflichten → Capabilities → Managementsystem → Delta.**_
_Engpass = Domänenmodellierung. Jede Domäne läuft durch DIESELBE 7-Stufen-Produktionsstraße (Domain Model → Requirement Sources → Capability Registry → Transition Patterns → Playbooks → Reference Scenarios → Completeness). Reifegrad aus dem ECHTEN Korpus abgeleitet (computed-not-stored), keine Marketingzahl. Einstieg über Industry, nicht Regelwerk._
**Environmental Knowledge Program** - _Welche Umweltanforderungen gelten für mein Produkt (z. B. Industriespülmaschine)?_ (status: `started`)
| Rank | Domäne | Reifegrad (Sources modelliert) | modelliert/total | Korpus TP·PB·RTS |
|---|---|---|---|---|
| 1 | **Industrial Automation Domain** | `████░░░░░░` 43% | 3/7 | 3·2·3 |
| 2 | **Environmental Domain** | `░░░░░░░░░░` 0% | 0/6 | 0·0·0 |
| 3 | **Automotive Domain** | `██░░░░░░░░` 17% | 1/6 | 1·0·0 |
| 4 | **Medical Domain** | `██░░░░░░░░` 20% | 1/5 | 3·2·3 |
| 5 | **Energy Domain** | `██░░░░░░░░` 25% | 1/4 | 3·2·3 |
| Stufe | Artefakt | Owner | Status |
|---|---|---|---|
| B1 | Environmental Regulatory Corpus | Legal Knowledge / Obligation Registry | **open** |
| B2 | Environmental Capability Model | Compliance Execution | **open** |
| B3 | Transition Patterns (ISO 14001 -> Environmental Corpus) | Reasoning (Knowledge Acquisition) | **blocked** |
- **Baseline (Completeness):** Identifiziert 6 · bewertet 0 · offen 6 · Unsicherheiten 0 · Begründung ja — die 6 Bereiche: water, chemicals, emissions, energy, waste, product_responsibility
_Jedes Programm liefert dieselben Artefakte; Status `open/blocked` kippt automatisch, wenn die Stufen landen — Reference Suite + Completeness dokumentieren den Fortschritt je Domäne._
_Industry-Einstieg + ETO-Hypothese: jede Domäne kennt ihre typischen Sources + Zertifikate → vor dem Onboarding „diese Prozesswelt ist wahrscheinlich vorhanden" (Hypothese, nie Wahrheit; speist Company 2A als `inferred`). Backlog nach Kundennutzen, KPI nach echtem Korpusstand — beides bewusst getrennt._
**Architecture Coverage**
| Layer | Status | Hinweis |
|---|---|---|
| Domain Program Blueprint (wiederverwendbar) | **PASS** | Corpus→…→Completeness, law-first, Ownership je Stufe |
| Environmental Program (Daten) | **PASS** | B1@Legal-KG · B2@Execution · B3@Reasoning (blocked) |
| Phase B = Domänen, keine Architektur | **PASS** | kein neues Runtime-Framework (Freeze, ADR-008) |
| Domain Knowledge Program (7-Stufen-Produktionsstraße) | **PASS** | 5 Domänen im Backlog, Industrial Automation #1 |
| Reifegrad-KPI (computed-not-stored) | **PASS** | aus echtem Korpus abgeleitet (TP/PB/RTS je Domäne) |
| Regelwerk-ID-Normalisierung | **TODO** | Alias CRA/MaschinenVO im KPI — kanonische IDs ausstehend |
## Gaps → Epics (Backlog — nur erfasst, NICHT implementiert)
@@ -401,5 +399,5 @@ _Jedes Programm liefert dieselben Artefakte; Status `open/blocked` kippt automat
## Suite-Status (Roll-up)
- Coverage-Zellen gesamt: **47**
- PASS: **36** · PARTIAL: 3 · UNSUPPORTED: 1 · TODO: 6 · N/A: 1 · NEEDS_FACTS: 0
- PASS: **35** · PARTIAL: 3 · UNSUPPORTED: 1 · TODO: 7 · N/A: 1 · NEEDS_FACTS: 0
- Fortschritt = PASS-Anteil steigt, wenn Epics RS-001…004 landen (objektiver Maßstab, kein LOC).
@@ -1,8 +1,9 @@
"""Characterization test for the Environmental Knowledge Program definition (data, not code).
"""Characterization tests for the Domain Knowledge Program v1 backlog (data, not code).
Pins the LAW-FIRST contract: the domain is ordered Corpus(B1) -> Capabilities(B2) -> Transition(B3),
not the reverse; ownership is assigned per stage; B3 (ISO 14001 -> corpus) is blocked until both sides
exist. If a future edit reverses the order or drops an owner, this test fails.
Pins the program FRAMEWORK contract: a ranked backlog of domain definitions, each entered by INDUSTRY
with its typical requirement sources + a pre-onboarding capability hypothesis (typical_certifications).
Industrial Automation is rank 1. Environmental stays law-first. If a future edit reorders the backlog,
drops a source list, or reverts environmental to an ISO-first framing, these tests fail.
"""
from __future__ import annotations
@@ -11,45 +12,60 @@ import os
import yaml
_PROG = os.path.join(os.path.dirname(__file__), "..", "knowledge", "programs", "environmental.yaml")
_DIR = os.path.join(os.path.dirname(__file__), "..", "knowledge", "programs")
def _program():
with open(_PROG, encoding="utf-8") as f:
return yaml.safe_load(f)
def _programs():
out = {}
for f in sorted(os.listdir(_DIR)):
if f.endswith(".yaml"):
with open(os.path.join(_DIR, f), encoding="utf-8") as h:
p = yaml.safe_load(h)
out[p["id"]] = p
return out
def test_blueprint_is_the_reusable_production_line():
p = _program()
assert p["blueprint"] == ["corpus", "obligations", "capabilities", "transition_patterns",
"playbooks", "reference_scenarios", "completeness"]
def test_five_domains_ranked_backlog():
ranks = sorted(p["backlog_rank"] for p in _programs().values())
assert ranks == [1, 2, 3, 4, 5]
def test_stages_are_law_first_in_order():
stages = _program()["stages"]
assert [s["id"] for s in stages] == ["B1", "B2", "B3"] # corpus -> capabilities -> transition
assert "Corpus" in stages[0]["name"] and "Transition" in stages[2]["name"]
def test_industrial_automation_is_rank_1():
progs = _programs()
rank1 = [p for p in progs.values() if p["backlog_rank"] == 1]
assert len(rank1) == 1 and rank1[0]["id"] == "PROG-industrial-automation"
assert {"CRA", "MaschinenVO"} <= set(rank1[0]["typical_requirement_sources"])
def test_ownership_assigned_per_stage():
by = {s["id"]: s for s in _program()["stages"]}
assert "Legal Knowledge" in by["B1"]["owner"] # corpus + obligations
assert "Compliance Execution" in by["B2"]["owner"] # capability model
assert "Reasoning" in by["B3"]["owner"] # transition patterns
def test_every_domain_entered_by_industry_with_sources_and_hypothesis():
for p in _programs().values():
assert p.get("industry") and p.get("customer_entry") # industry-first entry
assert p["typical_requirement_sources"] # stage 2 defined
assert p["typical_certifications"] # pre-onboarding capability hypothesis (ETO)
def test_transition_is_blocked_until_both_sides_known():
b3 = {s["id"]: s for s in _program()["stages"]}["B3"]
assert b3["status"] == "blocked"
assert b3["depends_on"] == ["B1", "B2"] # built LAST (law-first)
def test_no_stored_stage_status_progress_is_derived():
# the 7-stage progress is computed-not-stored: program shells must NOT hard-code stage status
for p in _programs().values():
assert "stages" not in p
def test_b1_covers_the_six_environmental_areas():
b1 = {s["id"]: s for s in _program()["stages"]}["B1"]
assert set(b1["areas"]) == {"water", "chemicals", "emissions", "energy", "waste", "product_responsibility"}
def test_environmental_stays_law_first():
env = _programs()["PROG-environmental"]
assert "ISO 14001 ist KEIN Umweltrecht" in env["principle"]
assert set(env["typical_requirement_sources"]) == {"water", "chemicals", "emissions", "energy", "waste", "product_responsibility"}
def test_program_is_a_domain_not_an_iso_project():
p = _program()
assert "Umweltanforderungen" in p["customer_question"] # starts from the law, not ISO 14001
assert "ISO 14001 ist KEIN Umweltrecht" in p["principle"]
def test_automotive_and_medical_present():
progs = _programs()
assert "TISAX" in progs["PROG-automotive"]["typical_requirement_sources"]
assert "MDR" in progs["PROG-medical"]["typical_requirement_sources"]
def test_readme_documents_seven_stage_checklist():
with open(os.path.join(_DIR, "README.md"), encoding="utf-8") as h:
readme = h.read()
for stage in ["Domain Model", "Requirement Sources", "Capability Registry",
"Transition Patterns", "Playbooks", "Reference Scenarios", "Completeness"]:
assert stage in readme
assert "Industrial Automation" in readme # backlog #1 documented
@@ -0,0 +1,53 @@
# ADR-009: Domain Knowledge Program — one 7-stage production line per domain
- **Status:** Accepted
- **Date:** 2026-06-27
- **Type:** Architecture / organisational decision
- **Related:** [ADR-008](ADR-008-from-architecture-to-domains.md), [ADR-007](ADR-007-regulatory-completeness.md), [ADR-005](ADR-005-knowledge-production-pipeline.md), Architecture Freeze v1.0, [[company-intelligence-2a]]
## Context
The bottleneck is no longer architecture, controls, or "knowledge" in general, but precisely
**domain modelling.** Phase B (ADR-008) is therefore not organised as single-regulation features
but as ONE work program with sub-programs per domain, all run through the same
production line. No further architecture epic, no new runtime architecture.
## Decision
1. **Entry via the INDUSTRY, not via the regulation.** The customer says "I build
packaging machines / I'm an automotive supplier / I build parking systems", not "explain ISO 9001 to me".
The pipeline starts one step earlier: `Industry → Domain Model → Requirement Sources → Requirements →
Capabilities → … → Completeness`.
2. **One 7-stage checklist, identical for EVERY domain:**
1 Domain Model · 2 Requirement Sources · 3 Capability Registry · 4 Transition Patterns ·
5 Playbooks · 6 Reference Scenarios · 7 Completeness. Ownership per stage (1 Reasoning · 2 Legal-KG ·
3 Execution · 4-7 Reasoning). This is the scaling mechanism: every new domain uses the same
production line, and the existing engines extend automatically.
3. **Domains carry `typical_requirement_sources` + `typical_certifications` → pre-onboarding HYPOTHESIS
(ETO insight).** Before onboarding: "this process world is *probably* present", as a
hypothesis, never a truth. Feeds Company 2A as `inferred`, never `confirmed`. We do not want to know
WHETHER an automotive supplier has ISO 9001 (everyone does), but which capabilities are
therefore probably already present.
4. **Per-domain KPI, reproducible (computed-not-stored).** Maturity is derived from the ACTUAL corpus
(modelled sources / transition patterns / playbooks / reference scenarios / consciously
declared gaps, based on the Regulatory Completeness Engine), NOT kept as a curated number.
Program shells store NO stage status. No marketing number.
5. **Domain Knowledge Program v1: backlog by customer value** (separate from the KPI by corpus state):
1 Industrial Automation · 2 Environmental · 3 Automotive · 4 Medical · 5 Energy.
## Consequences
- **Programs instead of features:** every domain is a machine-readable definition (`programs/*.yaml`);
the maturity KPI in the reference suite is derived from the corpus and differentiates honestly
(Industrial Automation leads, Environmental is at 0 %; the work lies ahead of us).
- **Backlog ≠ KPI:** the backlog orders by customer value, the KPI measures the actual corpus state;
the two are deliberately separate (e.g. a domain can rank high in the backlog but low in the KPI).
- **Work shifts for good from software production to knowledge production.** Competitive advantage =
quality and breadth of the modelled domains.
- **Freeze-compliant:** no new metamodel, no graph, no new `compliance/` module. Only
program data (`knowledge/programs/`) + a derived reporting view in the reference suite.
- This ADR is non-runtime → no deploy (see [ADR-001](ADR-001-runtime-deploy-policy.md)).
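
The separation of backlog and KPI can be illustrated with a short sketch. The counts are taken from the maturity table in the generated reference-suite report of this PR; the dictionary layout and variable names are assumptions for the example, not the real data model.

```python
# (backlog_rank, modelled sources, typical sources) per domain, copied from the
# generated maturity table; the structure itself is hypothetical.
domains = [
    {"name": "Industrial Automation", "backlog_rank": 1, "modeled": 3, "total": 7},
    {"name": "Environmental", "backlog_rank": 2, "modeled": 0, "total": 6},
    {"name": "Automotive", "backlog_rank": 3, "modeled": 1, "total": 6},
    {"name": "Medical", "backlog_rank": 4, "modeled": 1, "total": 5},
    {"name": "Energy", "backlog_rank": 5, "modeled": 1, "total": 4},
]

# Backlog view: ordered by customer value (a curated rank).
by_backlog = sorted(domains, key=lambda d: d["backlog_rank"])

# KPI view: ordered by the derived share of modelled sources.
by_kpi = sorted(domains, key=lambda d: d["modeled"] / d["total"], reverse=True)

print([d["name"] for d in by_backlog])
print([d["name"] for d in by_kpi])
```

In this data Environmental is second in the backlog but last in the KPI ordering, which is exactly the kind of divergence the two separate views are meant to expose.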