feat(mission): Customer Mission #1 — the platform as one connected expert system (end-to-end)

Turn the architecture inside-out: instead of refining classes/registries/journeys, force the whole platform to behave as ONE expert system and run a real consulting project end-to-end, measuring how often the consultant has to "jump" (special-case glue instead of a clean engine-to-engine handoff). A Reference Scenario asks "is the knowledge correct?"; a Customer Mission asks "can a customer WORK with it?". This is the last big architecture test before broad corpus expansion.

- reference_scenarios/mission_machine_builder.py: a synthetic machine builder (ISO9001 + ISMS + CE + PLC + remote maintenance + cloud + 80 devs + EU; no real names) asks "what must I do in the next 6 months?". Runs the REAL engines: Regulatory Map -> Journey selection -> Capability Delta (RS-005) -> Roadmap (leverage) -> Playbooks -> Evidence -> Verification -> Completeness, and produces the 6-month consulting answer ("the top-5 measures close 9/16 = 56%, starting with the ones that satisfy CRA AND MaschinenVO at once").
- Flow-Continuity audit (the actual test): 5 CLEAN, 2 JUMPS, 2 deliberate DEPENDENCIES. The two real seams: (1) Scope -> Journey (no `certs x targets -> journeys` selector engine; the data exists in transitions.yaml, only the selection is glue); (2) Evidence -> Verification (parked, Vision V2). The two dependencies (cert->capability map @Execution, corpus_status curation) are intended ownership boundaries, not architecture breaks.
- Finding: the platform carries the WHOLE consulting flow end-to-end. Once the Scope->Journey selector exists, the foundation is essentially done; from there the work is knowledge, not architecture.

4 end-to-end tests (mission runs, exactly two known jumps, full flow present, no real company names). check-loc 0. Non-runtime harness -> no deploy (ADR-001).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-28 08:39:26 +02:00
parent f652e2d4ed
commit 98f67e75d9
3 changed files with 283 additions and 0 deletions
@@ -0,0 +1,171 @@
+# ruff: noqa
+# mypy: ignore-errors
+"""Customer Mission #1 — a full consulting simulation, NOT another architecture artifact.
+
+A Reference Scenario asks "is the knowledge correct?". A Customer Mission asks "can a customer actually
+WORK with it?": it forces the whole platform to behave as ONE connected expert system, from the first
+question to a prioritised 6-month plan, and MEASURES how often the consultant had to "jump" (special-case
+glue instead of a clean engine-to-engine handoff). If Mission #1 runs without jumps, the architecture is
+probably done; the remaining work is knowledge, not foundation.
+
+Synthetic machine builder (NO real company names). Runs the REAL engines end-to-end.
+Run:  cd backend-compliance && PYTHONPATH=. python3 reference_scenarios/mission_machine_builder.py
+Not product code; not imported by the app. Non-runtime -> no deploy.
+"""
+from __future__ import annotations
+
+import os
+import yaml
+
+from compliance.profile.canonical import (
+    CanonicalProductRegulatoryProfile as P, CanonicalProductType as PT,
+    EconomicOperatorRole as Role, CanonicalLifecyclePhase as LP,
+)
+from compliance.regulatory_map.renderer import render_regulatory_map
+from compliance.company import CompanyContext, Certification, CapabilityMappingEntry, build_company_profile
+from compliance.reasoning.enums import Confidence
+from compliance.transition_reasoning import (
+    TransitionContext, TransitionGoal, TargetRequirement, assess_transition, CoverageStatus,
+)
+from compliance.optimization import roadmap_from_delta, select_within_budget
+from compliance.playbook import playbooks_for_plan
+from compliance.completeness import assess_completeness
+
+OUT = []    # report lines (Markdown), joined and printed at the end
+JUMPS = []  # (handoff, status, note) tuples: the flow-continuity audit
+
+
+def w(s=""):
+    OUT.append(s)
+
+
+def step(handoff, status, note):
+    JUMPS.append((handoff, status, note))
+
+
+_HERE = os.path.dirname(__file__)
+_K = os.path.join(_HERE, "..", "knowledge")
+
+w('# Customer Mission #1, machine builder: "What must I do in the next 6 months?"')
+w("")
+w('_NOT a demo, NOT a Reference Scenario: a complete simulation of a consulting project with the REAL engines. What is measured is how often the consultant has to "jump" (special-case logic instead of a clean engine flow). Synthetic customer, no real names._')
+w("")
+w("## The customer (synthetic)")
+w("> ISO 9001 · ISMS (ISO 27001) · CE process · PLC · remote maintenance · cloud · 80 developers · EU export")
+w('> **One question:** "What must I do in the next six months?"')
+w("")
+
+# ── 1. Scope: what applies? (regulatory map) ───────────────────────────────
+prod = P(name="mb", product_type=PT.MACHINERY, markets=["EU"], economic_operator_role=Role.MANUFACTURER,
+         lifecycle_phase=LP.PLACING_ON_MARKET, is_machine=True, is_component=False, has_embedded_software=True,
+         connected_to_internet=True, has_remote_access=True, generates_usage_data=None)
+rm = render_regulatory_map(prod)
+appl = [v.regulation_id for v in rm.applicable_regulations]
+unc = [v.regulation_id for v in rm.uncertain_regulations]
+w("## 1. Scope: what applies? _(Regulatory Map)_")
+w("- **Applies:** %s" % ", ".join(appl))
+w("- **Uncertain (follow-up question):** %s" % ", ".join(unc))
+w("- **Overlaps:** %s" % ", ".join(ov.overlap_group_id for ov in rm.overlaps))
+w("")
+step("Onboarding → Scope", "CLEAN", "The Regulatory Map derives CRA/MaschinenVO/EMV from the product profile; RED/DataAct/NIS2 are uncertain.")
+
+# ── 2. Journey: which transitions? (certs + targets -> transitions) ────────
+# the company HAS ISO 27001 + ISO 9001; the product triggers CRA + MaschinenVO.
+# THERE IS NO ENGINE that selects the journeys from (certs x targets); we do it by hand here.
+w("## 2. Journey: which transitions? _(from certifications + targets)_")
+w("- Holds **ISO 27001 + ISO 9001**, product = connected machine → target **CRA + MaschinenVO**.")
+w("- Selected journey: **ISO 27001 → CRA + MaschinenVO** (convergence pattern) + QM side ISO 9001 → MaschinenVO.")
+w("- ⚠️ The transitions exist as DATA in `knowledge/programs/transitions.yaml`, but **no engine selects them from certifications + targets**; selected manually here.")
+w("")
+step("Scope → Journey", "JUMP", "No selector engine `certs × applicable-targets → journeys`; the journey choice is glue (the data exists in transitions.yaml).")
+with open(os.path.join(_K, "transition_patterns", "transition_pattern_iso27001_to_cra_maschinenvo_v1.yaml"), encoding="utf-8") as fh:
+    CP = yaml.safe_load(fh)
+
+# ── 3. Capability Delta: what is missing? (Company 2A + RS-005) ────────────
+have = [a["capability"] for a in CP["likely_covered"]]
+cmap = {"ISO27001": CapabilityMappingEntry(capability_ids=have, confidence=Confidence.MEDIUM)}
+prof = build_company_profile(CompanyContext(company_id="mb", certifications=[Certification(certification_id="ISO27001")]), cmap)
+reqs = [TargetRequirement(capability_id=a["capability"]) for a in CP["likely_covered"]]
+reqs += [TargetRequirement(capability_id=d["capability"], question_intent=d.get("needed_information", "verify_existence"))
+         for d in CP["delta_requirements"]]
+assess = assess_transition(TransitionContext(company_id="mb", target=TransitionGoal(target_id="CRA+MaschinenVO")), reqs, prof)
+missing = sorted({c.capability_id for c in assess.coverage if c.status == CoverageStatus.MISSING})
+w("## 3. Capability Delta: what is missing? _(Company 2A + RS-005)_")
+w("> %s" % assess.summary.headline)
+w("- Probably in place (from the ISMS, World 1): %s" % ", ".join(assess.summary.probably_covered[:4]) + " …")
+w("- Missing (delta): %d capabilities, e.g. %s …" % (len(missing), ", ".join(missing[:4])))
+w("")
+step("Journey → Capability Delta", "CLEAN", "assess_transition(company profile, required) → coverage + delta; clean engine handoff.")
+step("Certifications → Capabilities (Dependency)", "DEPENDENCY", "The cert→capability map is Execution-owned and injected (mocked here): a deliberate ownership boundary, not an architecture break.")
+
+# ── 4. Roadmap: what first? (Optimization / Leverage) ──────────────────────
+delta_t = {d["capability"]: d.get("covers_targets", []) for d in CP["delta_requirements"]}
+opt = roadmap_from_delta(assess, delta_t)
+bud = select_within_budget({m.capability_id: m.covers for m in opt.ranked_measures}, 5)
+w("## 4. Roadmap: what first? _(Optimization, highest leverage)_")
+w("> %s" % opt.headline)
+w("- **Top measures:** %s" % ", ".join("`%s`(%d)" % (m.capability_id, m.leverage) for m in opt.ranked_measures[:5]))
+w("")
+step("Capability Delta → Roadmap", "CLEAN", "roadmap_from_delta(assessment, covers_targets) → measures ranked by leverage; clean.")
+
+# ── 5. Playbooks: how to implement? ────────────────────────────────────────
+kb = {}
+for f in sorted(n for n in os.listdir(os.path.join(_K, "implementation_playbooks")) if n.endswith(".yaml")):
+    with open(os.path.join(_K, "implementation_playbooks", f), encoding="utf-8") as fh:
+        d = yaml.safe_load(fh)
+    kb[d["capability_id"]] = d
+pbs = playbooks_for_plan(opt, kb)
+have_pb = [p for p in pbs if p.status != "missing"]
+w("## 5. Playbooks: how to implement? _(consultant renderer)_")
+w("- %d of %d measures have a playbook; %d need content (machine-safety playbooks delegated to @IACE)." % (len(have_pb), len(pbs), len(pbs) - len(have_pb)))
+w("")
+step("Roadmap → Playbook", "CLEAN", "playbooks_for_plan(plan, knowledge) → a journey per measure; missing content = honest `missing` stubs.")
+
+# ── 6. Evidence: what to document? ─────────────────────────────────────────
+ev = sorted({e for d in CP["delta_requirements"] for e in d.get("expected_evidence", [])})
+w("## 6. Evidence: what to document? _(expected_evidence)_")
+w("- Required evidence (excerpt): %s …" % ", ".join(ev[:6]))
+w("")
+step("Playbook → Evidence", "CLEAN", "expected_evidence carries through from pattern/playbook: a data field, no break.")
+
+# ── 7. Verification: can I prove it? ───────────────────────────────────────
+w("## 7. Verification: can I PROVE it?")
+w("- ⚠️ **Not built**: the verification layer (Evidence × Reality → proven) is Vision V2 (parked, Task #45).")
+w("")
+step("Evidence → Verification", "JUMP", "Verification layer is missing (deliberately parked, Vision V2 / Requirements Verification Platform).")
+
+# ── 8. Completeness: how certain/complete? ─────────────────────────────────
+corpus = {r: ("validated" if r in ("CRA", "MaschinenVO", "DataAct") else "unsupported") for r in appl + unc}
+rep = assess_completeness(appl + unc, corpus,
+                          uncertain=[{"regulation": "DataAct", "deciding_question": "generates_usage_data", "reason": "generates_usage_data unknown"}])
+w("## 8. Completeness: how certain/complete? _(auditable)_")
+w("> %s" % rep.completeness_summary)
+w("- Open/justified: %s" % ", ".join("`%s`(%s)" % (e.subject, e.resolution) for e in rep.exclusions))
+w("")
+step("Completeness (Dependency)", "DEPENDENCY", "corpus_status (which regulations are validated) is curated/injected, not derived from the corpus.")
+
+# ── The 6-month answer ─────────────────────────────────────────────────────
+w("## The 6-month answer (consulting narrative)")
+w("")
+w('> "As a machine builder you are affected by **CRA + MaschinenVO** (and EMV); RED/Data Act/NIS2 can only be settled after **one follow-up question** (`generates_usage_data`). Your ISMS *probably* covers the information-security side (to be confirmed). **%d measures** are open. **If you implement the top 5 by regulatory leverage in the next 6 months, you close %d of %d identified requirements (%d%%)**, starting with the measures that satisfy CRA AND MaschinenVO at once. For each one there is an implementation playbook and the required evidence; what we could NOT yet assess (EMV/RED/NIS2) is reported transparently."' % (
+    len(missing), bud.requirements_closed, bud.total_requirements, int(round(bud.coverage_ratio * 100))))
+w("")
+
+# ── Flow-continuity audit (the actual test) ────────────────────────────────
+clean = sum(1 for _, s, _ in JUMPS if s == "CLEAN")
+jumps = sum(1 for _, s, _ in JUMPS if s == "JUMP")
+deps = sum(1 for _, s, _ in JUMPS if s == "DEPENDENCY")
+w("## Flow-Continuity Audit: the actual test")
+w("")
+w("| Handoff | Status | Finding |")
+w("|---|---|---|")
+for h, s, n in JUMPS:
+    icon = {"CLEAN": "✅ clean", "JUMP": "⚠️ JUMP", "DEPENDENCY": "🔌 dependency"}[s]
+    w("| %s | %s | %s |" % (h, icon, n))
+w("")
+w("**%d clean · %d jumps · %d deliberate dependencies.**" % (clean, jumps, deps))
+w("")
+w("**Finding:** The platform carries the **entire consulting flow** end-to-end, from the customer question to the prioritised 6-month list of measures with playbooks, evidence and honest completeness. **Exactly TWO real jumps:** (1) **Scope → Journey**: a selector engine `certifications × targets → journeys` is missing (the data exists, only the selection is glue); (2) **Evidence → Verification**: a deliberately parked layer (Vision V2). The two dependencies (cert→capability map @Execution, corpus_status curation) are intended ownership boundaries, not architecture breaks. → **Once the Scope→Journey selector exists, the foundation is essentially done; from there the work is knowledge, not architecture.**")
+w("")
+
+print("\n".join(OUT))
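
The audit names Scope -> Journey as the one jump that a small engine could close: the transitions already exist as data, and only the selection from `certs x targets` is done by hand. What follows is a minimal sketch of what such a selector might look like, not the platform's implementation. Everything in it is an assumption made for illustration: the `Transition` record and its field names, the `select_journeys` function, the journey ids, and the ranking rule (most applicable targets covered first). The real schema of `transitions.yaml` is not shown in this commit, so the sample data is inlined rather than loaded from it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One journey entry. Field names are hypothetical, not the real transitions.yaml schema."""
    journey_id: str
    from_cert: str
    to_targets: frozenset[str]


def select_journeys(certs, targets, transitions):
    """Select journeys that start from a held certification and reach an applicable target.

    Results are ordered by the number of applicable targets a journey covers
    (descending), then by journey_id so the order is deterministic.
    """
    held = set(certs)
    wanted = set(targets)
    hits = [t for t in transitions if t.from_cert in held and t.to_targets & wanted]
    return sorted(hits, key=lambda t: (-len(t.to_targets & wanted), t.journey_id))


# Hypothetical sample data standing in for knowledge/programs/transitions.yaml.
TRANSITIONS = [
    Transition("iso27001_to_cra_maschinenvo", "ISO27001", frozenset({"CRA", "MaschinenVO"})),
    Transition("iso9001_to_maschinenvo", "ISO9001", frozenset({"MaschinenVO"})),
    Transition("iso27001_to_nis2", "ISO27001", frozenset({"NIS2"})),
    Transition("iso13485_to_mdr", "ISO13485", frozenset({"MDR"})),
]

# The mission's customer: holds ISO 27001 + ISO 9001; CRA, MaschinenVO and EMV apply.
selected = select_journeys(["ISO27001", "ISO9001"], ["CRA", "MaschinenVO", "EMV"], TRANSITIONS)
print([t.journey_id for t in selected])
# → ['iso27001_to_cra_maschinenvo', 'iso9001_to_maschinenvo']
```

With the synthetic customer from this mission, the sketch returns the same two journeys that step 2 selects manually. Ranking by covered targets is one possible design choice; it mirrors the roadmap's preference for measures that satisfy CRA and MaschinenVO at once, but an actual engine might weigh uncertain regulations or pattern confidence differently.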