feat(mission): Customer Mission #1 — the platform as one connected expert system (end-to-end)

Turn the architecture inside-out: instead of refining classes/registries/journeys, force the whole
platform to behave as ONE expert system and run a real consulting project end-to-end — measuring how
often the consultant has to "jump" (special-case glue instead of a clean engine-to-engine handoff). A
Reference Scenario asks "is the knowledge correct?"; a Customer Mission asks "can a customer WORK with
it?". This is the last big architecture test before broad corpus expansion.

- reference_scenarios/mission_machine_builder.py: a synthetic machine builder (ISO9001 + ISMS + CE +
  PLC + remote maintenance + cloud + 80 devs + EU; no real names) asks "what must I do in the next 6
  months?". Runs the REAL engines: Regulatory Map -> Journey selection -> Capability Delta (RS-005) ->
  Roadmap (leverage) -> Playbooks -> Evidence -> Verification -> Completeness, and produces the 6-month
  consulting answer ("the top-5 measures close 9/16 = 56%, starting with the ones that satisfy CRA AND
  MaschinenVO at once").
- Flow-Continuity audit (the actual test): 5 CLEAN, 2 JUMPS, 2 deliberate DEPENDENCIES. The two real
  seams: (1) Scope -> Journey (no `certs x targets -> journeys` selector engine; the data exists in
  transitions.yaml, only the selection is glue); (2) Evidence -> Verification (parked, Vision V2). The
  two dependencies (cert->capability map @Execution, corpus_status curation) are intended ownership
  boundaries, not architecture breaks.
- Finding: the platform carries the WHOLE consulting flow end-to-end. Once the Scope->Journey selector
  exists, the foundation is essentially done — from there the work is knowledge, not architecture.
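
A minimal sketch of what the missing Scope -> Journey selector could look like. This is an illustration under assumptions, not the planned engine: the `Transition` record, its field names, the journey ids and `select_journeys` are all hypothetical, because the real schema of transitions.yaml is not shown in this commit.

```python
from dataclasses import dataclass


# Hypothetical record: the real transitions.yaml schema is not part of this commit.
@dataclass(frozen=True)
class Transition:
    journey_id: str
    source_cert: str
    targets: frozenset


def select_journeys(certs, applicable_targets, transitions):
    """Keep transitions whose source cert is held and that reach at least one
    applicable target; journeys covering more applicable targets sort first."""
    held = set(certs)
    wanted = set(applicable_targets)
    hits = [t for t in transitions if t.source_cert in held and t.targets & wanted]
    return sorted(hits, key=lambda t: (-len(t.targets & wanted), t.journey_id))


# Illustrative data only (assumed ids, not taken from the knowledge base).
TRANSITIONS = [
    Transition("iso27001_to_cra_maschinenvo", "ISO27001", frozenset({"CRA", "MaschinenVO"})),
    Transition("iso9001_to_maschinenvo", "ISO9001", frozenset({"MaschinenVO"})),
    Transition("iso27001_to_nis2", "ISO27001", frozenset({"NIS2"})),
    Transition("tisax_to_cra", "TISAX", frozenset({"CRA"})),
]

selected = select_journeys(["ISO27001", "ISO9001"], ["CRA", "MaschinenVO", "EMV"], TRANSITIONS)
print([t.journey_id for t in selected])
# ['iso27001_to_cra_maschinenvo', 'iso9001_to_maschinenvo']
```

Sorting by the number of applicable targets a journey reaches mirrors the leverage idea used for the roadmap: the convergence journey that serves CRA and MaschinenVO at once comes before the single-target one.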

4 end-to-end tests (mission runs, exactly two known jumps, full flow present, no real company names).
check-loc 0. Non-runtime harness -> no deploy (ADR-001).
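
The "exactly two known jumps" check can be sketched as follows. The tuple shape (handoff, status, note) and the nine handoff names follow the harness in this commit; `AUDIT`, `KNOWN_JUMPS`, `tally` and `unexpected_jumps` are hypothetical names for this sketch, and the notes are left empty for brevity. The actual test file is not shown here and may be structured differently.

```python
from collections import Counter

# Same (handoff, status, note) shape as JUMPS in the harness; notes omitted.
AUDIT = [
    ("Onboarding → Scope", "CLEAN", ""),
    ("Scope → Journey", "JUMP", ""),
    ("Journey → Capability Delta", "CLEAN", ""),
    ("Zertifikate → Capabilities (Dependency)", "DEPENDENCY", ""),
    ("Capability Delta → Roadmap", "CLEAN", ""),
    ("Roadmap → Playbook", "CLEAN", ""),
    ("Playbook → Evidence", "CLEAN", ""),
    ("Evidence → Verification", "JUMP", ""),
    ("Completeness (Dependency)", "DEPENDENCY", ""),
]

# The two seams the commit accepts as known; anything else is a regression.
KNOWN_JUMPS = {"Scope → Journey", "Evidence → Verification"}


def tally(audit):
    """Count handoffs per status."""
    return Counter(status for _, status, _ in audit)


def unexpected_jumps(audit, known):
    """Return JUMP handoffs that are not on the known list."""
    return sorted(h for h, s, _ in audit if s == "JUMP" and h not in known)


counts = tally(AUDIT)
print(counts["CLEAN"], counts["JUMP"], counts["DEPENDENCY"])  # 5 2 2
print(unexpected_jumps(AUDIT, KNOWN_JUMPS))  # []
```

Checking the jump names and not only the count means a new seam cannot hide behind the removal of an old one.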

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Benjamin Admin
2026-06-28 08:39:26 +02:00
parent f652e2d4ed
commit 98f67e75d9
3 changed files with 283 additions and 0 deletions
@@ -0,0 +1,171 @@
# ruff: noqa
# mypy: ignore-errors
"""Customer Mission #1 — a full consulting simulation, NOT another architecture artifact.
A Reference Scenario asks "is the knowledge correct?". A Customer Mission asks "can a customer actually
WORK with it?". It forces the whole platform to behave as ONE connected expert system, from the first
question to a prioritised 6-month plan, and MEASURES how often the consultant had to "jump" (special-case
glue instead of a clean engine-to-engine handoff). If Mission #1 runs without jumps, the architecture is
probably done; the remaining work is knowledge, not foundation.
Synthetic machine builder (NO real company names). Runs the REAL engines end-to-end.
Run: cd backend-compliance && PYTHONPATH=. python3 reference_scenarios/mission_machine_builder.py
Not product code; not imported by the app. Non-runtime -> no deploy.
"""
from __future__ import annotations
import os
import yaml
from compliance.profile.canonical import (
    CanonicalProductRegulatoryProfile as P, CanonicalProductType as PT,
    EconomicOperatorRole as Role, CanonicalLifecyclePhase as LP,
)
from compliance.regulatory_map.renderer import render_regulatory_map
from compliance.company import CompanyContext, Certification, CapabilityMappingEntry, build_company_profile
from compliance.reasoning.enums import Confidence
from compliance.transition_reasoning import (
    TransitionContext, TransitionGoal, TargetRequirement, assess_transition, CoverageStatus,
)
from compliance.optimization import roadmap_from_delta, select_within_budget
from compliance.playbook import playbooks_for_plan
from compliance.completeness import assess_completeness
OUT = []
JUMPS = [] # (handoff, status, note) — the flow-continuity audit
def w(s=""):
    # Append one line to the markdown report.
    OUT.append(s)


def step(handoff, status, note):
    # Record one handoff for the flow-continuity audit.
    JUMPS.append((handoff, status, note))
_HERE = os.path.dirname(__file__)
_K = os.path.join(_HERE, "..", "knowledge")
w('# Customer Mission #1 — Maschinenbauer: „Was muss ich in den nächsten 6 Monaten tun?"')
w("")
w('_KEINE Demo, KEIN Reference Scenario — eine vollständige Simulation eines Beratungsprojekts mit den ECHTEN Engines. Gemessen wird, wie oft der Berater „springen" muss (Sonderlogik statt sauberem Engine-Fluss). Synthetischer Kunde, keine echten Namen._')
w("")
w("## Der Kunde (synthetisch)")
w("> ISO 9001 · ISMS (ISO 27001) · CE-Prozess · SPS · Fernwartung · Cloud · 80 Entwickler · Export EU")
w('> **Eine Frage:** „Was muss ich in den nächsten sechs Monaten tun?"')
w("")
# ── 1. Scope: what applies? (regulatory map) ──────────────────────────────
prod = P(name="mb", product_type=PT.MACHINERY, markets=["EU"], economic_operator_role=Role.MANUFACTURER,
         lifecycle_phase=LP.PLACING_ON_MARKET, is_machine=True, is_component=False, has_embedded_software=True,
         connected_to_internet=True, has_remote_access=True, generates_usage_data=None)
rm = render_regulatory_map(prod)
appl = [v.regulation_id for v in rm.applicable_regulations]
unc = [v.regulation_id for v in rm.uncertain_regulations]
w("## 1. Scope — was gilt? _(Regulatory Map)_")
w("- **Gilt:** %s" % ", ".join(appl))
w("- **Unsicher (Rückfrage):** %s" % ", ".join(unc))
w("- **Overlaps:** %s" % ", ".join(ov.overlap_group_id for ov in rm.overlaps))
w("")
step("Onboarding → Scope", "CLEAN", "Regulatory Map leitet aus dem Produktprofil CRA/MaschinenVO/EMV ab; RED/DataAct/NIS2 unsicher.")
# ── 2. Journey: which transitions? (certs + targets -> transitions) ───────
# the company HAS ISO 27001 + ISO 9001; the product triggers CRA + MaschinenVO.
# THERE IS NO ENGINE that selects the journeys from (certs x targets) — we do it by hand here.
w("## 2. Journey — welche Übergänge? _(aus Zertifikaten + Zielen)_")
w("- Hat **ISO 27001 + ISO 9001**, Produkt = vernetzte Maschine → Ziel **CRA + MaschinenVO**.")
w("- Gewählte Journey: **ISO 27001 → CRA + MaschinenVO** (Convergence-Pattern) + QM-Seite ISO 9001 → MaschinenVO.")
w("- ⚠️ Die Übergänge stehen als DATEN in `knowledge/programs/transitions.yaml`, aber **keine Engine wählt sie aus Zertifikaten+Zielen** — hier manuell selektiert.")
w("")
step("Scope → Journey", "JUMP", "Kein Selektor-Engine `certs × applicable-targets → journeys` — die Journey-Wahl ist Glue (Daten existieren in transitions.yaml).")
with open(os.path.join(_K, "transition_patterns", "transition_pattern_iso27001_to_cra_maschinenvo_v1.yaml"), encoding="utf-8") as fh:
    CP = yaml.safe_load(fh)
# ── 3. Capability Delta: what is missing? (Company 2A + RS-005) ───────────
have = [a["capability"] for a in CP["likely_covered"]]
cmap = {"ISO27001": CapabilityMappingEntry(capability_ids=have, confidence=Confidence.MEDIUM)}
prof = build_company_profile(CompanyContext(company_id="mb", certifications=[Certification(certification_id="ISO27001")]), cmap)
reqs = [TargetRequirement(capability_id=a["capability"]) for a in CP["likely_covered"]]
reqs += [TargetRequirement(capability_id=d["capability"], question_intent=d.get("needed_information", "verify_existence"))
         for d in CP["delta_requirements"]]
assess = assess_transition(TransitionContext(company_id="mb", target=TransitionGoal(target_id="CRA+MaschinenVO")), reqs, prof)
missing = sorted({c.capability_id for c in assess.coverage if c.status == CoverageStatus.MISSING})
w("## 3. Capability Delta — was fehlt? _(Company 2A + RS-005)_")
w("> %s" % assess.summary.headline)
w("- Vermutlich vorhanden (aus ISMS, Welt 1): %s" % ", ".join(assess.summary.probably_covered[:4]))
w("- Fehlt (Delta): %d Capabilities, z. B. %s" % (len(missing), ", ".join(missing[:4])))
w("")
step("Journey → Capability Delta", "CLEAN", "assess_transition(Company-Profil, Required) → Coverage + Delta; sauberer Engine-Handoff.")
step("Zertifikate → Capabilities (Dependency)", "DEPENDENCY", "cert→capability-Map ist Execution-owned + injiziert (hier gemockt) — bewusste Ownership-Grenze, kein Architektur-Bruch.")
# ── 4. Roadmap: what comes first? (Optimization / Leverage) ───────────────
delta_t = {d["capability"]: d.get("covers_targets", []) for d in CP["delta_requirements"]}
opt = roadmap_from_delta(assess, delta_t)
bud = select_within_budget({m.capability_id: m.covers for m in opt.ranked_measures}, 5)
w("## 4. Roadmap — was zuerst? _(Optimization, größter Hebel)_")
w("> %s" % opt.headline)
w("- **Top-Maßnahmen:** %s" % ", ".join("`%s`(%d)" % (m.capability_id, m.leverage) for m in opt.ranked_measures[:5]))
w("")
step("Capability Delta → Roadmap", "CLEAN", "roadmap_from_delta(assessment, covers_targets) → Maßnahmen nach Hebel; sauber.")
# ── 5. Playbooks: how to implement? ───────────────────────────────────────
kb = {}
for f in sorted(os.listdir(os.path.join(_K, "implementation_playbooks"))):
    if f.endswith(".yaml"):
        # Close each playbook file after parsing instead of leaking the handle.
        with open(os.path.join(_K, "implementation_playbooks", f), encoding="utf-8") as fh:
            d = yaml.safe_load(fh)
        kb[d["capability_id"]] = d
pbs = playbooks_for_plan(opt, kb)
have_pb = [p for p in pbs if p.status != "missing"]
w("## 5. Playbooks — wie umsetzen? _(Berater-Renderer)_")
w("- %d von %d Maßnahmen haben ein Playbook; %d brauchen Inhalt (Maschinensicherheits-Playbooks @IACE delegiert)." % (len(have_pb), len(pbs), len(pbs) - len(have_pb)))
w("")
step("Roadmap → Playbook", "CLEAN", "playbooks_for_plan(plan, knowledge) → Reise je Maßnahme; fehlender Inhalt = ehrliche `missing`-Stubs.")
# ── 6. Evidence: what must be documented? ─────────────────────────────────
ev = sorted({e for d in CP["delta_requirements"] for e in d.get("expected_evidence", [])})
w("## 6. Nachweise — was belegen? _(expected_evidence)_")
w("- Geforderte Nachweise (Auszug): %s" % ", ".join(ev[:6]))
w("")
step("Playbook → Evidence", "CLEAN", "expected_evidence trägt aus Pattern/Playbook durch — Datenfeld, kein Bruch.")
# ── 7. Verification: can I prove it? ──────────────────────────────────────
w("## 7. Verification — kann ich es BEWEISEN?")
w("- ⚠️ **Nicht gebaut** — der Verification-Layer (Evidence × Reality → bewiesen) ist Vision V2 (geparkt, Task #45).")
w("")
step("Evidence → Verification", "JUMP", "Verification-Layer fehlt (bewusst geparkt, Vision V2 / Requirements Verification Platform).")
# ── 8. Completeness: how certain/complete? ────────────────────────────────
corpus = {r: ("validated" if r in ("CRA", "MaschinenVO", "DataAct") else "unsupported") for r in appl + unc}
rep = assess_completeness(appl + unc, corpus,
                          uncertain=[{"regulation": "DataAct", "deciding_question": "generates_usage_data", "reason": "generates_usage_data unbekannt"}])
w("## 8. Completeness — wie sicher/vollständig? _(auditierbar)_")
w("> %s" % rep.completeness_summary)
w("- Offen/begründet: %s" % ", ".join("`%s`(%s)" % (e.subject, e.resolution) for e in rep.exclusions))
w("")
step("Completeness (Dependency)", "DEPENDENCY", "corpus_status (welche Regelwerke validiert) wird kuratiert/injiziert, nicht aus dem Korpus abgeleitet.")
# ── The 6-month answer ────────────────────────────────────────────────────
w("## Die 6-Monats-Antwort (Beratungsnarrativ)")
w("")
w('> „Sie sind als Maschinenbauer von **CRA + MaschinenVO** (und EMV) betroffen; RED/Data Act/NIS2 sind erst nach **einer Rückfrage** (`generates_usage_data`) zu klären. Ihr ISMS deckt die Informationssicherheits-Seite *wahrscheinlich* ab (zu bestätigen). Offen sind **%d Maßnahmen**. **Wenn Sie in den nächsten 6 Monaten die Top-5 nach regulatorischem Hebel umsetzen, schließen Sie %d von %d identifizierten Anforderungen (%d%%)** — beginnend mit den Maßnahmen, die CRA UND MaschinenVO gleichzeitig erfüllen. Für jede gibt es ein Umsetzungs-Playbook und die geforderten Nachweise; was wir noch NICHT bewerten konnten (EMV/RED/NIS2), weisen wir transparent aus."' % (
    len(missing), bud.requirements_closed, bud.total_requirements, int(round(bud.coverage_ratio * 100))))
w("")
# ── Flow-continuity audit (the actual test) ───────────────────────────────
clean = sum(1 for _, s, _ in JUMPS if s == "CLEAN")
jumps = sum(1 for _, s, _ in JUMPS if s == "JUMP")
deps = sum(1 for _, s, _ in JUMPS if s == "DEPENDENCY")
w("## Flow-Continuity-Audit — der eigentliche Test")
w("")
w("| Übergang | Status | Befund |")
w("|---|---|---|")
for h, s, n in JUMPS:
    icon = {"CLEAN": "✅ sauber", "JUMP": "⚠️ SPRUNG", "DEPENDENCY": "🔌 Dependency"}[s]
    w("| %s | %s | %s |" % (h, icon, n))
w("")
w("**%d sauber · %d Sprünge · %d bewusste Dependencies.**" % (clean, jumps, deps))
w("")
w("**Befund:** Die Plattform trägt den **gesamten Beratungsfluss** end-to-end — von der Kundenfrage bis zur priorisierten 6-Monats-Maßnahmenliste mit Playbooks, Nachweisen und ehrlicher Vollständigkeit. **Genau ZWEI echte Sprünge:** (1) **Scope → Journey** — es fehlt ein Selektor-Engine `Zertifikate × Ziele → Journeys` (die Daten existieren, nur die Auswahl ist Glue); (2) **Evidence → Verification** — bewusst geparkter Layer (Vision V2). Die zwei Dependencies (cert→capability-Map @Execution, corpus_status-Kuratierung) sind gewollte Ownership-Grenzen, keine Architektur-Brüche. → **Wenn der Scope→Journey-Selektor steht, ist das Fundament im Wesentlichen fertig; ab dann ist die Arbeit Wissen, nicht Architektur.**")
w("")
print("\n".join(OUT))