Merge pull request 'Cross-Domain MCAP Convergence Analysis (Phase Omega pause)' (#38) from feat/mcap-convergence-analysis into main
@@ -0,0 +1,47 @@
# Cross-Domain MCAP Convergence Analysis: where does the knowledge model converge?

_The question is not "which MCAPs occur most often?" (frequency is misleading) but "which MCAPs CARRY the largest part of the system?". The answer is a deterministic **impact score** (no ML): an internal engineering tool, pure aggregation over existing data (5 transition patterns + 7 automotive sources). Non-runtime, no real names._

## Impact score (deterministic)

> `Impact = distinct Sources + distinct Target-Types + distinct Domains + distinct Journeys + Regulatory Leverage + Business Leverage`

- 62 distinct capabilities (MCAP candidates) aggregated across all sources.

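As a minimal sketch of how the score adds up, assume each capability is stored as a record of sets. The field names mirror the analysis script in this PR, where regulatory leverage counts the distinct sources of type `regulation` and business leverage counts those of type `contract`. The sample record and its journey names are hypothetical.

```python
from __future__ import annotations


def impact(entry: dict[str, set[str]]) -> int:
    """Sum the sizes of the six distinct-value sets of one capability."""
    keys = ("sources", "types", "domains", "journeys", "regs", "markets")
    return sum(len(entry[k]) for k in keys)


# Hypothetical capability record; the values are illustrative, not real data.
example = {
    "sources": {"CRA", "TISAX"},
    "types": {"regulation", "certification"},
    "domains": {"industrial_automation", "automotive"},
    "journeys": {"journey_a", "journey_b"},
    "regs": {"CRA"},   # regulatory leverage: sources of type "regulation"
    "markets": set(),  # business leverage: sources of type "contract"
}

print(impact(example))  # 2 + 2 + 2 + 2 + 1 + 0 = 9
```

Because every term is a set size, adding the same source twice does not change the score.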
## 1. Core MCAPs: highest impact (the load-bearing nodes)

| Capability | Impact | Sources | Types | Domains | Journeys |
|---|---:|---:|---:|---:|---:|
| `secure_signed_update_distribution` | **18** | 5 | 2 | 2 | 4 |
| `technical_vulnerability_management` | **17** | 5 | 3 | 2 | 4 |
| `access_control_and_authentication` | **15** | 4 | 2 | 2 | 5 |
| `incident_management` | **14** | 4 | 2 | 2 | 4 |
| `product_cyber_risk_assessment` | **13** | 3 | 1 | 2 | 4 |
| `secure_development_lifecycle` | **11** | 2 | 2 | 2 | 4 |
| `supplier_security` | **11** | 3 | 2 | 2 | 3 |
| `ce_conformity_assessment_and_technical_documentation` | **9** | 2 | 1 | 1 | 3 |
| `coordinated_vulnerability_disclosure` | **9** | 1 | 1 | 2 | 4 |
| `sbom_creation` | **9** | 1 | 1 | 2 | 4 |

→ High impact means one node connects many sources ACROSS types, domains, and journeys, not that it appears "in 40 documents of one standards family". The Impact column can exceed the sum of the four listed counts because it also includes Regulatory Leverage and Business Leverage, which the table does not list separately.

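The difference between raw frequency and distinct counting can be illustrated with a short sketch. It assumes, hypothetically, that each mention is recorded as a `(capability, source_family)` pair; neither this pair layout nor the sample names `secure_update`, `family_x`, or `source_a` to `source_c` come from the analysis data.

```python
from __future__ import annotations

from collections import Counter

# Hypothetical mentions: 40 documents of one standards family
# versus three mentions from three different sources.
mentions = [("document_changes", "family_x")] * 40 + [
    ("secure_update", "source_a"),
    ("secure_update", "source_b"),
    ("secure_update", "source_c"),
]

# Raw frequency: how often a capability is mentioned at all.
frequency = Counter(cap for cap, _ in mentions)

# Distinct counting: how many different sources mention it.
distinct_sources: dict[str, set[str]] = {}
for cap, source in mentions:
    distinct_sources.setdefault(cap, set()).add(source)

for cap in sorted(frequency):
    print(cap, frequency[cap], len(distinct_sources[cap]))
```

By frequency the generic capability wins 40 to 3; by distinct sources it loses 1 to 3.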
## 2. Emerging MCAPs: connect ≥2 domains (bridges between requirement worlds)

- `secure_signed_update_distribution`: 2 domains (automotive, industrial_automation), 2 types.
- `technical_vulnerability_management`: 2 domains (automotive, industrial_automation), 3 types.
- `access_control_and_authentication`: 2 domains (automotive, industrial_automation), 2 types.
- `incident_management`: 2 domains (automotive, industrial_automation), 2 types.
- `product_cyber_risk_assessment`: 2 domains (automotive, industrial_automation), 1 type.
- `secure_development_lifecycle`: 2 domains (automotive, industrial_automation), 2 types.
- `supplier_security`: 2 domains (automotive, industrial_automation), 2 types.
- `coordinated_vulnerability_disclosure`: 2 domains (automotive, industrial_automation), 1 type.
- _(Real "growth over time" needs historical snapshots; the proxy used here is the current domain span.)_

## 3. Isolated MCAPs: only 1 source/journey (review: specialised OR convergence overlooked?)

- 36 in total, including: `account_energy_consumption`, `cybersecurity_management_system`, `document_update_campaigns`, `document_waste_streams`, `issue_battery_passport`, `machine_safety_risk_assessment`, `measure_air_emissions`, `mechanical_safety_and_guards`.

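Reports 2 and 3 are plain filters over the same per-capability records. The sketch below assumes records shaped as dictionaries of sets, as in the analysis script; the capability names `cap_bridge` and `cap_local` and all set values are hypothetical.

```python
from __future__ import annotations

# Hypothetical records; only the set sizes matter for the two filters.
records: dict[str, dict[str, set[str]]] = {
    "cap_bridge": {
        "sources": {"s1", "s2"},
        "domains": {"automotive", "industrial_automation"},
        "journeys": {"j1", "j2"},
    },
    "cap_local": {
        "sources": {"s3"},
        "domains": {"environmental"},
        "journeys": {"j3"},
    },
}

# Emerging: the capability spans at least two domains.
emerging = [c for c, e in records.items() if len(e["domains"]) >= 2]

# Isolated: exactly one source and exactly one journey.
isolated = [
    c for c, e in records.items()
    if len(e["sources"]) == 1 and len(e["journeys"]) == 1
]

print(emerging, isolated)
```

An isolated capability is a review prompt rather than a verdict: it may be genuinely specialised, or it may duplicate a capability that already exists under another name.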
## 4. Suspicious MCAPs: suspected wrong abstraction level (expert review)

- **Possibly too coarse** (generic verb, broad but only 1 type): `document_and_change_control`, `manage_chemical_substances`.
- **Possibly too fine** (isolated + very specific name): `operating_instructions_and_safety_information`, `provide_dedicated_security_contact`, `provide_functional_safety_evidence`, `restrict_hazardous_substances_rohs`, `secure_by_default_no_default_credentials`, `threat_analysis_and_risk_assessment`.
- The analysis therefore shows not only WHICH MCAPs matter, but also whether they are defined at the **right level of abstraction**.

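The two abstraction-level checks can be sketched as simple predicates. The thresholds (name length of at least 34 characters, at least 2 sources, at most 1 type) follow the analysis script in this PR, and the prefix tuple is a subset of its `GENERIC` list. The helper names `too_coarse` and `too_fine` and the sample set values are hypothetical.

```python
from __future__ import annotations

# Subset of the generic verb prefixes used by the analysis script.
GENERIC_PREFIXES = ("document_", "manage_", "control_")
MIN_LEN_TOO_FINE = 34  # name length used as a proxy for "very specific"


def too_coarse(name: str, entry: dict[str, set[str]]) -> bool:
    """Generic verb prefix, several sources, but at most one target type."""
    return (
        name.startswith(GENERIC_PREFIXES)
        and len(entry["types"]) <= 1
        and len(entry["sources"]) >= 2
    )


def too_fine(name: str, entry: dict[str, set[str]]) -> bool:
    """Isolated (one source, one journey) and a very long name."""
    is_isolated = len(entry["sources"]) == 1 and len(entry["journeys"]) == 1
    return is_isolated and len(name) >= MIN_LEN_TOO_FINE


# Hypothetical set values; only the capability names appear in the report.
broad = {"sources": {"REACH", "RoHS"}, "types": {"regulation"}, "journeys": {"j1"}}
single = {"sources": {"s1"}, "types": {"regulation"}, "journeys": {"j1"}}

print(too_coarse("manage_chemical_substances", broad))
print(too_fine("provide_dedicated_security_contact", single))
print(too_fine("issue_battery_passport", single))
```

Both checks are naming heuristics, so their hits are candidates for expert review rather than confirmed modelling errors.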
## Finding

> **A core is beginning to emerge:** 11 of 62 capabilities reach Impact ≥ 8 (load-bearing nodes), and 14 connect ≥2 domains. The knowledge model is still young (5 patterns + 1 automotive profile), but the method is in place: once Medical, Payment, and further domains are added as DATA, the same aggregation will show whether the expected stable core of 30–50 highly convergent MCAPs forms, that is, the shared structural core behind very different requirement worlds. That is a deeper proof of value than "supports one more standard". Pure aggregation, 0 runtime, 0 new architecture.

@@ -0,0 +1,146 @@
# ruff: noqa
# mypy: ignore-errors
"""Cross-Domain MCAP Convergence Analysis: where does the knowledge model converge? (Phase Ω, pause)

After Automotive the user paused on adding domains to ask a deeper question. NOT "which MCAPs occur most
often?" (frequency deceives: a generic `document_changes` may be in 40 sources but is not the product
core) but "which MCAPs CARRY the largest part of the system?". The answer is a deterministic MCAP IMPACT
SCORE (no AI), an internal engineering tool, computed by aggregating over the EXISTING data only.

    Impact(MCAP) = distinct Requirement Sources + distinct Target Types + distinct Domains
                   + distinct Journeys + Regulatory Leverage + Business Leverage

Four reports, all pure aggregation (no new runtime, no new architecture):
  1. Core       - highest impact (the cross-cutting nodes that carry the system)
  2. Emerging   - span >= 2 domains (bridges across requirement worlds)
  3. Isolated   - only one source/journey/domain (specialised, OR convergence not yet recognised)
  4. Suspicious - probably cut too coarse (generic) or too fine (one hyper-specific occurrence)

Non-runtime -> no deploy.
Run: cd backend-compliance && PYTHONPATH=. python3 reference_scenarios/mcap_convergence_analysis.py
"""
from __future__ import annotations

import os

import yaml

# Report lines are collected here and printed in one go at the end.
OUT = []


def w(s=""):
    """Append one line to the report buffer."""
    OUT.append(s)


_HERE = os.path.dirname(__file__)
_TP = os.path.join(_HERE, "..", "knowledge", "transition_patterns")

# pattern file -> (domain, default target_type, default sources)
PATTERN_META = {
    "transition_pattern_iso27001_to_cra_maschinenvo_v1.yaml": ("industrial_automation", "regulation", ["CRA", "MaschinenVO"]),
    "transition_pattern_iso27001_to_cra_v1.yaml": ("industrial_automation", "regulation", ["CRA"]),
    "transition_pattern_iso9001_to_cra_v1.yaml": ("industrial_automation", "regulation", ["CRA"]),
    "transition_pattern_isms_to_tisax_v1.yaml": ("automotive", "certification", ["TISAX"]),
    "transition_pattern_iso14001_to_environmental_v1.yaml": ("environmental", "regulation",
        ["REACH", "RoHS", "Batterieverordnung", "Wasserrecht", "Abwasservorschriften", "Energiemanagement", "Kreislaufwirtschaft", "Emissionsschutz"]),
}

# capability -> dict of sets we aggregate
idx = {}


def _ent(cap):
    return idx.setdefault(cap, {"sources": set(), "types": set(), "domains": set(), "journeys": set(),
                                "regs": set(), "markets": set()})


def _add(cap, sources, stype, domain, journey):
    e = _ent(cap)
    e["sources"] |= set(sources)
    e["types"].add(stype)
    e["domains"].add(domain)
    e["journeys"].add(journey)
    if stype == "regulation":  # regulatory leverage
        e["regs"] |= set(sources)
    if stype == "contract":  # business leverage
        e["markets"] |= set(sources)


# ── A) transition patterns: each pattern is a journey with a target/domain ───────────────────
# IMPORTANT (anti-frequency-deception): a `likely_covered` cap is PROVIDED BY the source cert (one
# certification source), NOT required by every target regulation. Attributing all target sources to it
# would inflate management caps on raw frequency alone. Only `delta` caps name their real target sources.
for fname, (domain, ttype, default_sources) in PATTERN_META.items():
    with open(os.path.join(_TP, fname), encoding="utf-8") as fh:
        p = yaml.safe_load(fh)
    journey = p.get("id", fname)
    # `or {}` also guards against keys that are present but empty (None) in the YAML
    cert = ((p.get("transition_goal") or {}).get("from") or {}).get("standard", journey)
    for a in p.get("likely_covered", []):
        _add(a["capability"], [cert], "certification", domain, journey)  # provided by the cert
    for d in p.get("delta_requirements", []):
        srcs = d.get("covers_targets") or default_sources  # required by these target sources
        _add(d["capability"], srcs, ttype, domain, journey)

# ── B) automotive multi-source data: precise per-source attribution ──────────────────────────
_AUTO = os.path.join(_HERE, "..", "knowledge", "domains", "automotive", "source_capabilities.yaml")
with open(_AUTO, encoding="utf-8") as fh:
    A = yaml.safe_load(fh)
for s in A["sources"]:
    for cap in s["requires"]:
        _add(cap, [s["id"]], s["type"], "automotive", "automotive_ecu")

# ── Impact score (deterministic) ─────────────────────────────────────────────────────────────
def impact(e):
    return len(e["sources"]) + len(e["types"]) + len(e["domains"]) + len(e["journeys"]) + len(e["regs"]) + len(e["markets"])


# highest impact first; ties are broken alphabetically so the output stays deterministic
scored = sorted(idx.items(), key=lambda kv: (-impact(kv[1]), kv[0]))
GENERIC = ("document_", "manage_", "control_", "conduct_", "operate_", "run_", "assign_", "plan_", "approve_")

w("# Cross-Domain MCAP Convergence Analysis — wo konvergiert das Wissensmodell?")
w("")
w('_Nicht „welche MCAPs kommen am häufigsten vor?" (Häufigkeit täuscht), sondern „welche MCAPs TRAGEN den größten Teil des Systems?". Deterministischer **Impact-Score** (kein ML), internes Engineering-Werkzeug, reine Aggregation über vorhandene Daten (5 Transition Patterns + 7 Automotive-Quellen). Non-runtime, keine echten Namen._')
w("")
w("## Impact-Score (deterministisch)")
w("> `Impact = distinct Sources + distinct Target-Types + distinct Domains + distinct Journeys + Regulatory Leverage + Business Leverage`")
w("- %d distinct Capabilities (MCAP-Kandidaten) über alle Quellen aggregiert." % len(idx))
w("")

# ── 1. Core MCAPs ─────────────────────────────────────────────────────────
w("## 1. Core MCAPs — höchster Impact (die tragenden Knoten)")
w("| Capability | Impact | Sources | Types | Domains | Journeys |")
w("|---|---:|---:|---:|---:|---:|")
for cap, e in scored[:10]:
    w("| `%s` | **%d** | %d | %d | %d | %d |" % (cap, impact(e), len(e["sources"]), len(e["types"]), len(e["domains"]), len(e["journeys"])))
w("")
w('→ Hoher Impact = ein Knoten verbindet viele Quellen ÜBER Typen/Domänen/Journeys hinweg — nicht „in 40 Dokumenten einer Normenfamilie".')
w("")

# ── 2. Emerging MCAPs (cross-domain bridges) ──────────────────────────────
emerging = [(c, e) for c, e in scored if len(e["domains"]) >= 2]
w("## 2. Emerging MCAPs — verbinden ≥2 Domänen (Brücken zwischen Anforderungswelten)")
for cap, e in emerging[:8]:
    w("- `%s` — %d Domänen (%s), %d Typen." % (cap, len(e["domains"]), ", ".join(sorted(e["domains"])), len(e["types"])))
w('- _(Echtes „Wachstum über Zeit" braucht historische Snapshots — hier Proxy = Domänen-Spannweite jetzt.)_')
w("")

# ── 3. Isolated MCAPs ─────────────────────────────────────────────────────
isolated = [(c, e) for c, e in scored if len(e["sources"]) == 1 and len(e["journeys"]) == 1]
w("## 3. Isolated MCAPs — nur 1 Quelle/Journey (Review: spezialisiert ODER Konvergenz übersehen?)")
w("- %d Stück, u. a.: %s." % (len(isolated), ", ".join("`%s`" % c for c, _ in isolated[:8])))
w("")

# ── 4. Suspicious MCAPs (abstraction level) ───────────────────────────────
too_coarse = [(c, e) for c, e in scored if c.startswith(GENERIC) and len(e["types"]) <= 1 and len(e["sources"]) >= 2]
too_fine = [(c, e) for c, e in isolated if len(c) >= 34]
w("## 4. Suspicious MCAPs — Abstraktionsgrad-Verdacht (Experten-Review)")
w("- **Evtl. zu grob** (generisches Verb, breit aber nur 1 Typ): %s." % (", ".join("`%s`" % c for c, _ in too_coarse[:6]) or "—"))
w("- **Evtl. zu fein** (isoliert + sehr spezifischer Name): %s." % (", ".join("`%s`" % c for c, _ in too_fine[:6]) or "—"))
w("- Die Analyse sagt damit nicht nur WELCHE MCAPs wichtig sind, sondern auch, ob sie auf dem **richtigen Abstraktionsniveau** definiert sind.")
w("")

# ── Befund ─────────────────────────────────────────────────────────────────
core_cut = [c for c, e in scored if impact(e) >= 8]
cross = [c for c, e in scored if len(e["domains"]) >= 2]
w("## Befund")
w("")
w('> **Ein Kern beginnt sich zu zeigen:** %d von %d Capabilities erreichen Impact ≥ 8 (tragende Knoten), %d verbinden ≥2 Domänen. Bislang ist das Wissensmodell noch jung (5 Patterns + 1 Automotive-Profil), aber die Methode steht: sobald Medical/Payment/weitere Domänen als DATEN hinzukommen, zeigt dieselbe Aggregation, ob sich der erwartete stabile Kern von 30–50 hochkonvergenten MCAPs bildet — der gemeinsame Strukturkern hinter sehr unterschiedlichen Anforderungswelten. Das ist ein tieferer Wertnachweis als „eine weitere Norm unterstützt". Reine Aggregation, 0 Runtime, 0 neue Architektur.' % (len(core_cut), len(idx), len(cross)))
w("")

print("\n".join(OUT))
@@ -0,0 +1,65 @@
"""Cross-Domain MCAP Convergence Analysis: impact over frequency (Phase Ω pause).

Pins the deterministic MCAP Impact analysis and, critically, the anti-frequency-deception property:
a capability that bridges many target TYPES / domains / journeys (secure_signed_update_distribution)
must outrank a high-frequency single-domain management cap (conduct_internal_environmental_audits).
Four reports (Core / Emerging / Isolated / Suspicious), pure aggregation over existing data, no runtime.
"""

from __future__ import annotations

import os
import subprocess
import sys


def _run():
    """Run the analysis script as a subprocess and return its stdout (the Markdown report)."""
    root = os.path.join(os.path.dirname(__file__), "..")
    r = subprocess.run(
        [sys.executable, "reference_scenarios/mcap_convergence_analysis.py"],
        cwd=root, env={**os.environ, "PYTHONPATH": "."}, capture_output=True, text=True,
    )
    assert r.returncode == 0, r.stderr
    return r.stdout


def _section(out, header):
    """Return the report text from `header` up to (not including) the next `## ` heading."""
    start = out.index(header)
    nxt = out.find("\n## ", start + 1)
    return out[start: nxt if nxt != -1 else len(out)]


def test_runs_end_to_end():
    out = _run()
    assert "Cross-Domain MCAP Convergence Analysis" in out
    assert "Impact = distinct Sources" in out


def test_core_is_cross_cutting_not_frequency():
    out = _run()
    core = _section(out, "## 1. Core MCAPs")
    # the most cross-cutting capability tops the Core report
    assert "`secure_signed_update_distribution` | **18** |" in core
    assert "technical_vulnerability_management" in core
    # a high-frequency BUT single-domain management cap must NOT be in Core (frequency != impact)
    assert "conduct_internal_environmental_audits" not in core


def test_all_four_reports_present():
    out = _run()
    for header in ["## 1. Core MCAPs", "## 2. Emerging MCAPs", "## 3. Isolated MCAPs", "## 4. Suspicious MCAPs"]:
        assert header in out


def test_isolated_and_suspicious_are_review_tools():
    out = _run()
    iso = _section(out, "## 3. Isolated MCAPs")
    assert "issue_battery_passport" in iso or "measure_air_emissions" in iso
    susp = _section(out, "## 4. Suspicious MCAPs")
    assert "zu grob" in susp and "zu fein" in susp


def test_abstraction_level_signal():
    out = _run()
    assert "richtigen Abstraktionsniveau" in out
    assert "Strukturkern" in out