Commit Graph

1678 Commits

Author SHA1 Message Date
Benjamin Admin 978052b5a2 fix(onboarding): decouple partial/indicative signals from detected — partial no longer removes a question
Fix B of the pre-#59 semantic correction. The Silent Pass had only TWO effective states though the data
carries three: a `detected` mapping (a concrete artifact) AND a `partial` mapping (an indicative signal,
e.g. a CI pipeline -> secure-development-lifecycle) both flowed through capability_ids() and were fed to
the Advisor as already-present — so a weak indication silently removed a question, exactly the Welt-1/
Welt-2 transparency we want to keep.

Now three distinct states:
  - detected   -> reduces the delta immediately (auto_detected, not asked).   [unchanged]
  - partial    -> raises assumption strength but does NOT replace the question (surfaced as `indications`,
                  the capability stays in the delta and is still asked).
  - requirement-> describes a target, never the present state (already handled by Fix A's kind split).

Changes (data + thin wiring, no new architecture):
  - SilentIntakeResult.capability_ids() returns only relationship==detected; new indicative_capability_ids()
    returns the partial ones.
  - advisor_start() gains indicative_capabilities (NOT fed into the profile) and surfaces result.indications
    = indicative ∩ required − auto_detected.
  - AdvisorResult / AdvisorResponse gain `indications` (additive, contract-safe); the service passes the
    indicative ids through.

Tests: a partial CI signal is indicative-not-detected and does NOT shrink the delta; end-to-end it appears
in `indications`, not `auto_detected`, and the gap is still asked. 28 onboarding tests pass, mypy --strict
clean on the onboarding modules, demo runs, check-loc 0. Runtime effect -> deploy + smoke.
2026-06-28 16:02:35 +02:00
pilotadmin 19931208a9 Merge pull request 'fix(onboarding): observation vs requirement signals — demanded ≠ present (Fix A)' (#48) from feat/signal-kind-split into main
CI / detect-changes (push) Successful in 5s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / validate-canonical-controls (push) Successful in 4s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 23s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-28 15:53:10 +02:00
Benjamin Admin c39787ad96 fix(onboarding): separate observation vs requirement signals — a demanded SBOM is not a present SBOM
Semantic correction of the knowledge base BEFORE the empirical loop (#59) is built — otherwise the
Observation Store would learn from already-misclassified signals. The Silent Pass conflated two kinds of
signal into one: an OBSERVATION ("I saw an SBOM in the repo") and a REQUIREMENT ("a tender DEMANDS an
SBOM"). They were aliased to the same canonical id, so a tender clause read as "SBOM already present" and
suppressed the very question that should have been asked.

Fix — make the kind explicit and authoritative (no new architecture, data + thin wiring):
  - `kind` ∈ {observation, requirement} on ProducedSignal (producer may declare) and on the canonical
    SignalVocabularyEntry (AUTHORITATIVE — a mislabelled producer cannot collapse the two).
  - Vocabulary split: sbom_file_found → sbom_present (obs) + sbom_required (req);
    security_txt_or_cvd_policy → cvd_policy_present (obs) + psirt_required (req); add signed_updates_required.
    requirement signals are intentionally UNMAPPED in intake_signal_map (they describe a target, not state).
  - silent_intake() consumes ONLY kind==observation; requirement signals are preserved in
    `requirements_seen` (visible/auditable) but NEVER become a detected capability.
  - normalize_signals() stamps the vocabulary's kind onto every IntakeSignal; unknown ids still pass through.

This is the same Observation-vs-Requirement split the Requirements Verification Platform rests on:
observations are reality, requirements are targets, and their comparison is the delta. A tender / OEM spec /
law now produces requirement signals; scanners / repos / documents produce observation signals.

Tests: rewrote the two test_signal_producer cases that previously ASSERTED the bug (tender == repo) to pin
the correct split; regression — `requires_sbom` yields no capability + stays in requirements_seen while
`cyclonedx_found` still detects sbom_creation; endpoint-level regression that a tender requirement does not
auto-detect and the gap stays asked; vocabulary-kind-overrides-mislabelled-producer. 25 onboarding tests
pass, mypy --strict clean, demo runs, check-loc 0. Runtime effect → deploy + smoke. (Fix A; partial-vs-
detected decoupling follows as Fix B before #59.)
2026-06-28 15:52:50 +02:00
pilotadmin b5b6cdddb3 Merge pull request 'POST /onboarding/advisor-start — expose the Advisor at runtime (#58)' (#47) from feat/onboarding-advisor-endpoint into main
CI / detect-changes (push) Successful in 5s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / nodejs-lint (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 3s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-python-backend (push) Successful in 23s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
2026-06-28 15:14:05 +02:00
Benjamin Admin a4123ace71 feat: POST /onboarding/advisor-start — expose the Smart Onboarding Advisor at runtime (#58)
This exposes the existing Smart Onboarding Advisor through a runtime endpoint; it does not add new
reasoning logic. Tightly scoped: adapter boundary + endpoint, no big frontend, no persistence, no
empirical learning, no new scanners, no LLM.

  POST /onboarding/advisor-start : (company + certifications + target + scanner_findings[ProducedSignal])
        -> Normalizer -> Silent Knowledge Pass -> Advisor -> { silent_intake_summary, inferred_assumptions,
           rejected_assumptions, top_5_questions, capability_delta, top_measures, evidence_requests,
           completeness_summary, auto_detected, headline }
  GET  /onboarding/targets       : the supported target ids (CRA, TISAX, MDR, Environmental)

compliance/services/onboarding_service.py is the app-caller: it loads the curated knowledge (hypothesis
library, signal vocabulary + map, the target's required capabilities) once and calls the pure, tested
orchestration (normalize_signals -> silent_intake -> advisor_start). The scanner ADAPTER boundary is the
ProducedSignal format the request carries — existing scanners emit it, no new scanners. Thin handler
(<30 LOC), registered in the auto-load list. No DB. Additive to the OpenAPI contract (contract test is
additive-friendly; baseline regenerates on CI/py3.12). First deployable runtime feature -> dev deploy +
smoke. mypy --strict clean, 22 onboarding tests pass, check-loc 0.
2026-06-28 15:14:00 +02:00
pilotadmin 3bb48f2147 Merge pull request 'Signal Producer interface + Normalizer — one signal language' (#46) from feat/silent-knowledge-pass into main 2026-06-28 14:51:08 +02:00
Benjamin Admin c2c8f7e424 feat: Signal Producer interface + Normalizer — one signal language for all sources (before #58)
Not scanner stubs — the scanners exist. The Silent Pass needs only their UNIFIED output. This adds the
small common DATA FORMAT (not a new module/framework) the user asked for, exactly the Requirement-
Source / MCAP / regulation-alias pattern: many inputs, one language.

  Producer A / B / C  ->  normalize_signals (vocabulary: id + aliases)  ->  canonical IntakeSignal  ->  Silent Pass

- ProducedSignal {signal_id, source_type, confidence, evidence, provenance} = what ANY source emits
  (website scanner, repo scanner, PDF parser, tender parser, API, the user).
- knowledge/onboarding/signal_vocabulary.yaml reduces producer dialects to a canonical signal: "SBOM
  present" arrives as cyclonedx_found / spdx_found / sbom_uploaded / requires_sbom (tender) — all become
  `sbom_file_found`. The Silent Pass cannot tell where it came from -> no per-scanner special logic, ever.
- Unknown signals pass through (a new producer stays visible). confidence/evidence/provenance flow to
  the detected capability for the audit trail.

A tender that "requires SBOM" now produces the same effect as a repo that HAS one — fits Vision V2
(Requirement Source over Regulation). Endpoint (#58) then has its final shape: POST -> Producers ->
Normalizer -> Silent Pass -> Profile -> Delta -> Questions -> Roadmap. Non-runtime -> no deploy. mypy
--strict clean, 14 onboarding tests pass, check-loc 0.
2026-06-28 14:49:57 +02:00
pilotadmin b70c1b7c37 Merge pull request 'Silent Knowledge Pass — recognise before asking (Phase 0)' (#45) from feat/silent-knowledge-pass into main 2026-06-28 14:34:31 +02:00
Benjamin Admin 9c33582412 feat: Silent Knowledge Pass — recognise before asking (Phase 0, before the endpoint)
Not the endpoint yet — the bigger knowledge lever first. The Advisor can say "I need 5 answers" but
does not yet decide what it can find out by ITSELF. The Silent Knowledge Pass runs in front of the
Advisor and, from signals existing scanners/parsers already produce (website, repository, documents,
product data), deterministically derives capabilities the company demonstrably HAS + product facts
that drive scope — so every recognised item shrinks the delta and removes a question.

compliance/onboarding/silent_intake.py: silent_intake(signals, signal_map) -> detected_capabilities
(+ evidence already in hand) + product_facts. The signal->conclusion map is curated DATA
(knowledge/onboarding/intake_signal_map.yaml), signals are injected (scanners are upstream). Pure,
deterministic, no LLM. advisor_start gains detected_capabilities (folded into the profile at HIGH
confidence -> covered, not asked) and an auto_detected result + headline.

The experience flips from a question wall to "we already recognised 4 capabilities, 2 product facts
and have 4 pieces of evidence in hand — only these few remain". Order now: Silent Pass -> #58
endpoint/frontend -> #59 empirical loop. NOT new architecture, just an orchestration step in front.
Non-runtime (no app caller) -> no deploy. 15 onboarding tests pass, mypy --strict clean, check-loc 0.
2026-06-28 14:34:27 +02:00
Benjamin Admin 23d977e26b deploy: promote macmini staging to dev — merge live dev ai-sdk fixes (ePrivacy/§25 TDDDG, national-law subsidiarity) with +86 backend-compliance commits (Phase Ω, onboarding, hypotheses). Coordinated GO from all 4 sessions.
CI / detect-changes (push) Successful in 6s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / validate-canonical-controls (push) Successful in 3s
CI / loc-budget (push) Successful in 17s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Has been skipped
2026-06-28 13:57:07 +02:00
pilotadmin 88b83d4daf Merge pull request 'Observation Model — empirical learning unit (Task 59a)' (#44) from feat/observation-model into main 2026-06-28 13:34:54 +02:00
Benjamin Admin 98d616d82b feat: Observation Model — the empirical learning unit, defined BEFORE persistence (Task 59a)
The learning point is not the hypothesis, it is the QUESTION — and confirmed/refuted is too coarse.
"partial, only critical suppliers" or "certified but not lived" are not "wrong", they are valuable
knowledge. So the chain is Hypothesis -> Question -> Observation -> (Review) -> Hypothesis, and the
observation model must be defined cleanly before any store/API (else thousands of too-coarse
observations get migrated later).

compliance/onboarding/observations.py:
  - ObservationType: confirmed / partial / refuted / not_applicable / unknown (richer than binary).
  - Observation: {hypothesis_id, capability, question, answer (free text), observation_type,
    scope_note ("only critical suppliers"), evidence_uploaded, reviewed, reviewed_by}.
  - empirical_distribution() -> a DISTRIBUTION (confirmed 61 / partial 31 / refuted 8), not one %.
  - empirical_confidence() -> (confirmed + 0.5*partial) / (confirmed+partial+refuted); n.a./unknown
    excluded; None until calibrated.
  - REVIEW GATE: only reviewed observations calibrate — a raw answer never changes a hypothesis (no
    learning from outliers).

Refactor: the hypothesis is now PURE curated knowledge — the binary observations counter and any
confidence are removed from CapabilityHypothesis and the YAML; confidence is COMPUTED from the separate
reviewed observation stream. Pure, mypy --strict clean. Persistence/aggregation/calibration are 59b/c/d.
Non-runtime -> no deploy. 12 tests pass, check-loc 0.
2026-06-28 13:31:43 +02:00
pilotadmin 59b7006e5a Merge pull request 'Certification Capability Hypotheses — capability-centric + empirical confidence' (#42) from feat/certification-hypotheses into main 2026-06-28 13:17:20 +02:00
Benjamin Admin 2d2cb2a244 feat: Certification Capability Hypotheses — capability-centric library + empirical confidence
The bottleneck is knowledge, not the endpoint. This builds the knowledge the Onboarding Advisor needs,
restructured per the user's key insight: NOT "ISO27001 -> 30 capabilities" but each hypothesis as its
own object "capability -> supported_by: [certs]". A capability is written ONCE with all supporting
certs, so the shared management-system core (document control, incident, supplier, audit, access,
asset, monitoring, training, crypto, release, risk) covers most certifications with ~18 hypotheses
instead of ~300 — and multi-certification merges AUTOMATICALLY (a company's inferred caps = every
hypothesis whose supported_by intersects its certs).

Welt-1 throughout: "IF cert present, EXPECT capability (verification required)", never "erfüllt".
Capabilities NO cert suggests (SBOM, signed updates, CVD, support period) have no hypothesis -> they
stay in the delta and get asked. confidence is EMPIRICAL: computed from real-onboarding observations
(confirmed/(confirmed+refuted)), None until calibrated — never an LLM/expert score (record_observation
+ empirical_confidence). The long-term moat: knowledge that learns from reality, not from a norm.

compliance/onboarding/hypotheses.py (resolve_for_certifications / inferred_hypotheses / empirical_
confidence / record_observation) feeds the existing advisor_start unchanged; the demo now runs on the
curated library. Pure, mypy --strict clean, library is DATA (no norm text, no real names). Non-runtime
-> no deploy. 12 tests pass, check-loc 0.
2026-06-28 13:16:45 +02:00
pilotadmin 02c9fdb18e Merge pull request 'Smart Onboarding Advisor (ADR-012) — orchestration over existing engines' (#41) from feat/smart-onboarding-advisor into main 2026-06-28 12:46:23 +02:00
Benjamin Admin 3ba90f49cf feat: Smart Onboarding Advisor — make the knowledge usable in onboarding (ADR-012)
The user-named "right next runtime step": stop building knowledge, start using it automatically in
onboarding — no sales training, no regulation picking. compliance/onboarding/ is an ORCHESTRATOR (not
a new engine) wiring Company 2A -> RS-005 -> optimization -> completeness:

  advisor_start(input, cert_hypotheses, target_requirements, ...) -> AdvisorResult

From (company + products + certifications + target) it returns inferred_assumptions, rejected_
assumptions, next_best_questions (<=5, ranked by information_gain + leverage + unknown_high_risk +
evidence_missing, each self-explaining), capability_delta, top_measures, evidence_requests,
unsupported_domains, completeness_summary. apply_answer() updates the profile (delta shrinks).

Welt-1 throughout: certificates REDUCE questions but satisfy nothing automatically (verification_
required); relevance(evidence,target) keeps ISO 14001 out of the CRA result. Certificate->capability
hypotheses + target requirements are INJECTED (curated knowledge, outsourced; not in code).

All 7 acceptance criteria pass; mypy --strict clean. First app-caller wiring the engines into a
product flow — still no endpoint/persistence, so 0 runtime effect -> no deploy yet (deploys when
POST /onboarding/advisor-start + frontend are wired). check-loc 0.
2026-06-28 12:45:49 +02:00
pilotadmin 009083882a Merge pull request 'Capability Convergence Explanation + Core/Domain (Phase Omega)' (#40) from feat/capability-families-and-core-domain into main 2026-06-28 12:26:53 +02:00
Benjamin Admin a98076196b feat: Capability Convergence Explanation — why the registry converges + Core/Domain (Phase Ω)
The mature step after Medical is not the next domain but understanding WHY the registry converges.
Three derived views over existing data (no ML, no new architecture):
  1. Why converge? — a domain matrix per cross-domain MCAP + a curated REASON (the moat: not "MCAP-X
     exists" but "why MCAP-X must exist": software product / supply chain / product operation /
     universal process).
  2. Capability Families — ~75 MCAPs collapse to ~15 curated families (knowledge/capability_families/
     families.yaml), each with the reason it is universal or domain-specific.
  3. Core vs Domain — a COMPUTED property (not a new class): Core recurs across >=2 independent domains
     AND source types; Domain stays in one. Medical made it obvious (new medical caps are nearly all
     Domain; update/SBOM/access/logging are Core).

Non-runtime -> no deploy. 4 tests pass, check-loc 0.
2026-06-28 12:26:22 +02:00
pilotadmin afe5a98474 Merge pull request 'Medical stress test + Missing Convergence report (Phase Omega #3)' (#39) from feat/medical-stress-test-and-missing-convergence into main 2026-06-28 12:10:38 +02:00
Benjamin Admin 80f2e2f619 feat: Medical stress test (safety+security coupled) + Missing Convergence report (Phase Ω #3)
Medical before Payment: the harder scientific test (safety AND security coupled, full lifecycle,
deep risk/evidence demands). ISO 13485 runs through the SAME engine as ISO 27001 -> CRA, only new
data, 0 runtime. The key result: IEC 81001-5-1 (health-software security) pulls in the SAME security
MCAPs as the CRA, so Medical REUSES cyber capabilities (the safety/security coupling appears as
capability reuse) while adding 7 genuinely new medical caps (clinical evaluation, software safety
classification, ISO 14971 risk file, benefit-risk). rejected_assumptions intact.

Effect on the convergence core: secure_signed_update_distribution 18 -> 24 and
technical_vulnerability_management 17 -> 23, now spanning 3 domains (cyber + industrial + medical) —
the core visibly GROWS, exactly the convergence signal.

New 5th report: MISSING CONVERGENCE — deterministic (no ML) token-cluster detector for potential
structural duplications: a name token shared by >=3 MCAPs across >=2 distinct sources is flagged for
EXPERT REVIEW (never auto-merged). Surfaces e.g. the `risk` cluster (6 risk MCAPs across 6 sources)
and `security`/`software`; single-source decompositions are filtered out. Complements Suspicious by
looking at cross-source duplication, not single MCAPs.

Also records the durable modelling rule extracted from the frequency fix: evidence is attributed to
its ORIGIN; its value against a target is computed later (relevance(evidence,target)). Ledger now 8
sources, Architecture Stability 8/8 = 100%. Non-runtime -> no deploy. 29 tests pass, check-loc 0.
2026-06-28 12:09:52 +02:00
pilotadmin 897e9464a7 Merge pull request 'Cross-Domain MCAP Convergence Analysis (Phase Omega pause)' (#38) from feat/mcap-convergence-analysis into main 2026-06-28 11:48:30 +02:00
Benjamin Admin c160bb8291 feat: Cross-Domain MCAP Convergence Analysis — which capabilities carry the system (Phase Ω pause)
After Automotive, pause on domains and ask the deeper question: not "which MCAPs occur most often?"
(frequency deceives) but "which MCAPs CARRY the largest part of the system?". A deterministic MCAP
Impact Score (no AI) aggregates over the EXISTING data only:

  Impact = distinct Sources + Target Types + Domains + Journeys + Regulatory + Business Leverage

Critically anti-frequency-deception: a `likely_covered` cap is attributed to its source CERT (one
source), not to every target regulation — otherwise generic management caps win on raw frequency.
With that fix the Core surfaces the true cross-cutting nodes: secure_signed_update_distribution (18),
technical_vulnerability_management (17), access_control, incident_management, sbom_creation,
product_cyber_risk_assessment — exactly the bridges the user predicted; the high-frequency single-
domain environmental management caps correctly drop out.

Four reports, pure aggregation (no runtime, no new architecture): Core (highest impact), Emerging
(>=2 domains), Isolated (1 source/journey — specialised or convergence-not-yet-seen), Suspicious
(too coarse: generic verbs; too fine: hyper-specific isolated names) — an abstraction-level review
tool for domain experts. 11/62 caps already reach impact >=8; the method is ready to reveal whether a
30-50 MCAP core forms as Medical/Payment arrive. Non-runtime -> no deploy. 5 tests pass, check-loc 0.
2026-06-28 11:48:04 +02:00
pilotadmin a2332fb13d Merge pull request 'Automotive convergence stress test (Phase Omega #2)' (#37) from feat/automotive-convergence-stress-test into main 2026-06-28 11:31:03 +02:00
Benjamin Admin 90c3fe16b5 feat: Automotive convergence stress test — same capability from many sources (Phase Ω #2)
Not another domain to prove agnosticism (Environmental did that) but a DIFFERENT property: can the
SAME capability be fed by many overlapping Requirement Sources at once without the model becoming
unstable? Realistic setup — a supplier with ISO 9001 + IATF 16949 + TISAX + ASPICE + CSMS + SUMS
developing an ECU for OEM X. Seven sources (CRA, UNECE R155/CSMS, R156/SUMS, IATF, TISAX, ASPICE,
OEM X) with deliberate overlap, run through the SAME engine (0 runtime code, data only).

Three new measurements (user-requested):
  - Capability Convergence: technical_vulnerability_management = 4 sources across 3 source TYPES
    (regulation + certification + contract); secure_signed_update_distribution = 4 sources. The
    overlap is where the economic value lives ("one capability replaces five evidence worlds").
  - Existing-vs-New: 13/27 required caps reuse existing cyber/environmental MCAPs (48%) -> the
    registry is starting to converge; the automotive-specific rest (CSMS/SUMS/ASPICE/functional
    safety) is expectedly new (a maturity hint, not an architecture break).
  - Business Leverage: a convergent capability satisfies N regulations AND unlocks the OEM market —
    more convincing to a GF than "satisfies five laws". (Regulatory Leverage counts regulations;
    Business Leverage counts regulations + markets/customers.)

Ledger gains the automotive row (0/0, 14 new types, data_only); stability stays 7/7 = 100%. The
verdict recommends the user's next step: NOT a new domain but PAUSE and analyse the registry for the
cross-domain high-convergence core MCAPs. Non-runtime -> no deploy. 12 tests pass, check-loc 0.
2026-06-28 11:30:30 +02:00
pilotadmin e0d9816c99 Merge pull request 'Environmental stress test — architecture works outside cyber (Phase Omega)' (#36) from feat/environmental-stress-test into main 2026-06-28 11:10:36 +02:00
Benjamin Admin fbbd0957bd feat: Environmental stress test — the architecture works OUTSIDE cyber (Phase Ω, data-only)
First NON-cyber stress test. Every prior journey was cyber (infosec/software/product security).
Environmental brings a completely different mental model (substance flows, emissions, water,
chemicals, energy, circularity). The claim under test: RS-005 carries it UNCHANGED — only new DATA,
zero runtime code.

ISO 14001 (an EMS) is modelled as a Company Profile and run through the SAME engines as ISO 27001 ->
CRA (new pattern transition_pattern_iso14001_to_environmental_v1.yaml, capabilities as VERBS):
  - ISO 14001 yields 5 environmental MANAGEMENT capabilities (Welt-1, probably present)
  - the concrete substance/emission/water/material EVIDENCE is the 11-capability delta
  - rejected_assumptions state what ISO 14001 does NOT produce (substance lists, REACH, emissions,
    battery passports, water analyses) — preserving the Welt-1/Welt-2 separation
  - the Journey Matcher stays domain-agnostic: ISO14001->Environmental 100%, cyber journeys 0%

Result: a non-cyber domain ran through Reality -> ... -> Journey with 0 new runtime classes and 0
new pipeline — a stronger generality proof than ten more cyber regulations.

Also extends the Architecture Stability ledger with the third KPI column the user requested — "new
capability types" — as a granularity Frühindikator (a domain needing ~80 new types at 0 runtime would
flag a too-coarse/too-fine capability model). Environmental = 16 types (5 mgmt + 11 evidence), in
range. Ledger now flags cyber vs non_cyber family. Non-runtime -> no deploy. 19 tests pass, check-loc 0.
2026-06-28 11:10:07 +02:00
pilotadmin 2805256c33 Merge pull request 'Architecture Stability + Knowledge Velocity KPI (Phase Omega)' (#35) from feat/architecture-stability-kpi into main 2026-06-28 10:49:22 +02:00
Benjamin Admin cefacb87af feat: Architecture Stability + Knowledge Velocity KPI — Phase Ω (Evidence of Generality)
The focus has shifted: no more architecture epics (the Journey Matcher was the last building
block). The question is no longer "can the architecture do this?" but "where does it fail under
real domain knowledge?". This operationalises the two KPIs almost nobody measures, as a non-
runtime, auditable ledger:

  - Architecture Stability : per integrated Requirement Source — new runtime classes? new pipeline?
  - Knowledge Velocity     : can a domain EXPERT integrate a source data-only, without a developer?

A new domain is a ROW in knowledge/architecture_stability/integration_ledger.yaml (data), never a
code change — so the KPI improves by adding data, which IS the proof. Current state: 6 sources
across 5 target types (CRA, MaschinenVO, TISAX, Tender, OEM, Environmental) = 6/6 = 100% stability
and 100% data-only. The pipeline functions are listed honestly as one-time, domain-agnostic
infrastructure (now frozen), so the KPI cannot be gamed.

The test is a LIVING GUARDRAIL: it fails the day a source needs runtime code, surfacing the exact
moment generality breaks. Non-runtime -> no deploy. 5 tests pass, check-loc 0.
2026-06-28 10:49:00 +02:00
pilotadmin d0575d286f Merge pull request 'Journey Matcher — Delta -> Journey (ADR-011)' (#34) from feat/journey-matcher into main 2026-06-28 10:37:02 +02:00
Benjamin Admin 80bf1993e0 feat: Journey Matcher — the delta explains the journey (Delta -> Journey, ADR-011)
The sanctioned last architectural building block. Reverses the order: not Goal -> Journey -> Delta
but Goal -> Required -> Delta -> Journey. A Journey is the EXPLANATION of the Capability Delta, not
its cause — so this is a Matcher/Explainer, not a Selector.

New module compliance/journey_matcher/ = the third independent, interchangeable function of the
pipeline, beside Company 2A (Evidence -> Capability) and RS-005 (Capability -> Delta):

  match_journeys(delta, journeys, context) -> ranked, auditable explanation

- Looks ONLY at the Capability Delta — never at certificates, regulation, tenders or the goal.
  Journey signatures are certificate-agnostic capability clusters (Input -> Output pattern).
- score = share of the delta a journey explains (recall over the missing capabilities); journey_only
  documents where a journey reaches beyond the delta so a broad journey is not silently preferred.
- Deliberately dumb + deterministic (pure set overlap; NO ML/embeddings/LLM), fully auditable
  (matched / unexplained / journey_only / context signals); a learning ranker can sit on top later.
- Signatures injected, engine hermetic. mypy --strict clean.

Validated on the real patterns (demo): a CRA+MaschinenVO delta ranks the convergence journey 100%,
"ISO27001 -> CRA" 56% (misses the machine-safety caps), "ISMS -> TISAX" 0%. This resolves the
"Scope -> Journey" jump from Customer Mission #1. Freeze exception explicitly authorised; non-runtime
-> no deploy. 12 tests pass, check-loc 0.
2026-06-28 10:36:43 +02:00
pilotadmin 3c6e2a2acc Merge pull request 'Customer Mission #5 — a non-security target (evidence relevance flips)' (#33) from feat/customer-mission-5-non-security into main 2026-06-28 10:18:52 +02:00
Benjamin Admin dbf7b9b587 feat: Customer Mission #5 — a non-security target, evidence relevance flips both ways
Closes the Evidence-Relevance(Target) claim by testing it on a deliberately NON-security target
(a hand-authored environmental / material-evidence Required set — no corpus, no ISO-14001 norm
model, no new module). One company profile, three targets through the same engine:

  - ISO 14001: keine (CRA) / keine (TISAX) / HOCH (environmental)   <- flips
  - ISO 27001: hoch (CRA) / hoch (TISAX) / keine (environmental)    <- flips the other way
  - PSIRT:     hoch (CRA) / keine (TISAX) / keine (environmental)

Proves relevance(evidence, target) is two-sided: no evidence is relevant "in itself"; relevance
only arises against a target -> it must be computed, never stored as an attribute of the evidence.

With this, the target-type diversity for the later selector is complete (Regulation · Certification
· Contract/Tender · OEM-Spec · Environmental/Material) — five target types through one engine, so a
Scope→Journey selector finally makes sense. Synthetic, no real names. Non-runtime -> no deploy. 5 tests.
2026-06-28 10:18:28 +02:00
pilotadmin 5cba0504df Merge pull request 'Customer Mission #4 — a second, different contract target' (#32) from feat/customer-mission-4-second-contract into main 2026-06-28 09:54:42 +02:00
pilotadmin 77d6bc5551 Merge pull request 'Customer Mission #3 — one profile, three target types' (#31) from feat/customer-mission-3-target-types into main 2026-06-28 09:54:21 +02:00
pilotadmin d196ad1cab Merge pull request 'Customer Mission #2 + IACE machine-safety playbooks + RS-004 linking' (#30) from feat/customer-mission-2-multicert into main 2026-06-28 09:54:00 +02:00
Benjamin Admin b71771e52e feat: Customer Mission #4 — a second, different contract target (no tender-special-logic)
One contract example (Mission #3's public tender) is not enough to safely generalise: it risks
baking tender-shaped assumptions into the later Scope→Journey selector. This mission runs TWO
deliberately different contract sub-types against the same company through the IDENTICAL engine:

  - public tender   (procurement: pentest report, references, support SLA, SBOM)  -> delta 4
  - private OEM spec (Lastenheft: CSMS, functional safety, SUMS, ASPICE)            -> delta 3

The two deltas are completely DISJOINT (no shared missing capability), proving the contracts are
genuinely different — yet there is no per-contract code: assess_transition treats each as a plain
Required set, exactly like a regulation or a certification. Evidence-Relevance is target-relative
even between two contracts (TISAX worth more to the automotive OEM than to the generic tender).

Conclusion: "Contract" as a requirement source is now covered by >=2 diverse cases, so the later
selector can treat any contract uniformly. Synthetic company + synthetic contracts (NO real names).
Non-runtime -> no deploy. 5 tests pass.
2026-06-28 09:42:31 +02:00
Benjamin Admin 256bb0607d feat: Customer Mission #3 — one profile, three target TYPES (Requirements Verification proof)
Proves the next thing after Mission #2: the pipeline is target-type-agnostic. One company
profile runs against THREE deliberately different target types through the identical engine
(assess_transition):
  - CRA          (Regulation)  -> delta 8
  - TISAX        (Certification) -> delta 3
  - public tender (Contract, synthetic) -> delta 4

A regulation, a certification and a contract all reduce to required capabilities; Profile −
Required = Delta does not care which. That is the Requirements Verification Platform: the
requirement SOURCE is swappable, the pipeline stays.

Makes Evidence-Relevance(Target) concrete: the same evidence is worth a different amount per
target. PSIRT = hoch(CRA)/keine(TISAX)/mittel(tender); ISO 14001 = keine against all three
security targets but would be hoch against an environmental target. Relevance is a function
of the target, not an attribute of the evidence.

Also: cross-target-TYPE convergence (8 capabilities satisfy >=2 of the 3 target types) — the
leverage one level above law-convergence.

Synthetic company + synthetic tender (NO real names). Non-runtime -> no deploy. 5 tests pass.
2026-06-28 09:30:28 +02:00
Benjamin Admin ff9a66fb72 chore: regenerate Customer Mission #1 snapshot — IACE machine-safety playbooks close the gap (2/12 -> 7/12)
The 4 machine-safety playbooks (+ CE conformity) delegated to the IACE session now exist, so
Mission #1's end-to-end run finds content for them. Generated artifact only; non-runtime.
2026-06-28 09:10:31 +02:00
Benjamin Admin 363c76d274 feat(rs-004): PROPOSED MaschinenVO obligation->capability linking (safety-expert input)
Additive proposal for RS-004 (MaschinenVO/EMV registry-linking gap), from IACE's
machinery-safety authority. Links all 31 MaschVO obligations to capability/control
targets: 2 high-confidence cyber-safety bridges wired to existing CRA-core
obligations + capabilities (the CRA<->MaschinenVO convergence), the rest as
safety-expert capability candidates for Execution to mint and Legal-KG to ratify.

Asserts nothing into the obligation/capability registries — status=PROPOSED,
for_ratification_by legal-knowledge-graph + execution. Respects semantic-authority
(propose, don't assert across authorities) and the knowledge freeze (data, no new
classes). EMV obligation authoring + reg-id/scope wiring explicitly left to their
owners (out_of_scope).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-28 09:08:54 +02:00
Benjamin Admin dfb2c6dfdb Merge remote-tracking branch 'origin/feat/iace-machinery-playbooks' into feat/customer-mission-2-multicert 2026-06-28 09:08:54 +02:00
Benjamin Admin 16d6ad4122 feat(knowledge): CE conformity + technical documentation playbook draft (machinery)
5th machinery-safety playbook, capability ce_conformity_assessment_and_technical_
documentation — referenced by the ISO27001->CRA+MaschinenVO transition pattern and
listed as content-missing. Covers MaschVO conformity assessment (Annex XI), technical
file (Annex IV), EU declaration (Annex V) and CE marking; notes the CRA<->MaschinenVO
integrated technical file. status: draft, with canonical_action verb. New file only ->
non-runtime, no deploy, conflict-free ride-along. capability_id unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-28 09:02:06 +02:00
Benjamin Admin 3856bb3a4f feat: Customer Mission #2 — the company arrives with a PROFILE, not a journey
Second mission, deliberately different from #1: a highly-certified company (ISO 9001 +
ISO 27001 + ISO 14001 + TISAX + CE + PSIRT) asking „what do WE still need for the CRA?".
Stresses Mission #1's one open seam (Scope → Journey) and proves the reframe with the
real engines:

- The start is a Company Capability Profile (certs aggregated), NOT a single cert→target
  journey. Certifications are OBSERVATIONS feeding the profile.
- Evidence is target-relative: ISO 14001 is in the profile but irrelevant to the CRA;
  PSIRT covers two CRA-delta capabilities. More evidence = smaller delta (12 → 9).
- The „journey" is the computed delta (Profile, Target) — not a thing a selector picks.
  This SHRINKS Mission #1's jump: the seam is profile-intake + target-pick, not a
  journey-matcher engine. There is no „ISO 27001 → CRA"; only „Profile → CRA".

Records the 5 per-mission selection-rationale questions (which journey/why/decisive
info/model-extended?/new-parameter?). Selector input = (Company Profile, Target), which
collapses the 2^N cert-combination explosion.

Non-runtime (reference_scenarios + tests only) -> no deploy. 6 tests pass; check-loc 0.
2026-06-28 09:00:51 +02:00
Benjamin Admin 0b962b41fa feat(knowledge): 4 machinery-safety implementation playbook drafts (Reasoning delegation)
Fulfils the board delegation Reasoning -> IACE (line 45): expert FIRST DRAFTS for the
4 MaschinenVO capabilities the Reference-Suite playbook dashboard lists as "content
missing": machine_safety_risk_assessment (ISO 12100), mechanical_safety_and_guards
(ISO 14120/14119/13850/13849), operating_instructions_and_safety_information
(ISO 12100 6.4 / IEC 82079), protection_against_corruption_of_safety_functions
(MaschVO Annex III 1.1.9 = the CRA<->MaschinenVO cyber-safety bridge).

Schema per knowledge/implementation_playbooks/README.md. status: draft (expert draft,
non-normative). Includes the optional canonical_action verb-formulation (capability-is-
a-verb experiment). New files only -> non-runtime, no deploy, conflict-free ride-along.
Capability ids unchanged (Execution registry contract). Owner verifies + integrates.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-28 08:47:56 +02:00
pilotadmin b6c400902e Merge pull request 'feat: Customer Mission #1 — the platform as one connected expert system (end-to-end)' (#29) from feat/customer-mission-1 into main 2026-06-28 08:40:12 +02:00
Benjamin Admin 98f67e75d9 feat(mission): Customer Mission #1 — the platform as one connected expert system (end-to-end)
Turn the architecture inside-out: instead of refining classes/registries/journeys, force the whole
platform to behave as ONE expert system and run a real consulting project end-to-end — measuring how
often the consultant has to "jump" (special-case glue instead of a clean engine-to-engine handoff). A
Reference Scenario asks "is the knowledge correct?"; a Customer Mission asks "can a customer WORK with
it?". This is the last big architecture test before broad corpus expansion.

- reference_scenarios/mission_machine_builder.py: a synthetic machine builder (ISO9001 + ISMS + CE +
  PLC + remote maintenance + cloud + 80 devs + EU; no real names) asks "what must I do in the next 6
  months?". Runs the REAL engines: Regulatory Map -> Journey selection -> Capability Delta (RS-005) ->
  Roadmap (leverage) -> Playbooks -> Evidence -> Verification -> Completeness, and produces the 6-month
  consulting answer ("the top-5 measures close 9/16 = 56%, starting with the ones that satisfy CRA AND
  MaschinenVO at once").
- Flow-Continuity audit (the actual test): 5 CLEAN, 2 JUMPS, 2 deliberate DEPENDENCIES. The two real
  seams: (1) Scope -> Journey (no `certs x targets -> journeys` selector engine; the data exists in
  transitions.yaml, only the selection is glue); (2) Evidence -> Verification (parked, Vision V2). The
  two dependencies (cert->capability map @Execution, corpus_status curation) are intended ownership
  boundaries, not architecture breaks.
- Finding: the platform carries the WHOLE consulting flow end-to-end. Once the Scope->Journey selector
  exists, the foundation is essentially done — from there the work is knowledge, not architecture.

4 end-to-end tests (mission runs, exactly two known jumps, full flow present, no real company names).
check-loc 0. Non-runtime harness -> no deploy (ADR-001).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-28 08:39:26 +02:00
pilotadmin f652e2d4ed Merge pull request 'feat: Domain Vocabulary — identity vs representation + regulation aliases (fixes KPI normalization)' (#28) from feat/domain-vocabulary into main 2026-06-28 08:12:15 +02:00
Benjamin Admin ecae5bc7f1 feat(vocabulary): Domain Vocabulary — identity vs representation; regulation aliases fix the KPI normalization
Before the next Journey: the LANGUAGE. With 5 knowledge objects but no vocabulary, the same reise gets
named four different ways (ISO9001->MaschinenVO vs Quality Management->Product Safety vs ...). The spec
answers ONE question: which terms are IDENTITIES and which are REPRESENTATIONS of the same meaning?

- spec docs-src/architecture/domain-vocabulary-spec-v1.md (PROPOSAL): identity hierarchy
  (Requirement RQ / Capability MCAP [Registry 2C] / regulation-source-target / Journey Class MJRN
  [PROVISIONAL] / Journey instance / Playbook MPLB); canonical name + aliases; capability vocabulary =
  the Capability Registry (not rebuilt); reorder Vocabulary -> Transition #2 -> #3 -> Rule of Three.
- knowledge/vocabulary/regulations.yaml: regulation/standard IDENTITIES (id + canonical + aliases).
  SOLVES the regulation-ID normalization the KPIs flagged: CRA == "Cyber Resilience Act" == "Regulation
  (EU) 2024/2847" all resolve to `cra`; ISO9001/QMS -> iso9001; etc. Shared artifact (@Legal-KG/@Execution
  please adopt).
- knowledge/vocabulary/journey_classes.yaml (PROVISIONAL): clusters our transitions into classes
  (Information Security -> Product Cybersecurity; Quality Management -> Product Compliance/Safety).
  Finding: ISO9001->MaschinenVO is an INSTANCE of an existing class (like ISO9001->CRA, ISO13485->MDR),
  not a new kind -> avoids duplication. Journey Class is a new abstraction -> its own Rule of Three (no
  MJRN minting yet).
- reference suite: both KPIs now read aliases from regulations.yaml instead of hard-coded maps; the
  "Regelwerk-ID-Normalisierung" line flips TODO -> PASS. KPI numbers unchanged (vocab is a superset).
- Side effect = Requirements Intelligence: a Tender "Security Patch Procedure" resolves to MCAP-0017.

7 vocabulary tests (17 with domain programs), check-loc 0. Knowledge data + spec + reference harness =
non-runtime -> no deploy (ADR-001). No new module, no runtime change, no minting (Freeze).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-28 08:11:30 +02:00
pilotadmin 23a6f02ec2 Merge pull request 'docs: sharpen Journey canonicalization gate (two conditions + diverse transitions + model-change balance)' (#27) from docs/journey-canon-criteria into main 2026-06-28 07:43:13 +02:00
Benjamin Admin 4a7412e4f2 docs(spec): sharpen Journey canonicalization gate — two conditions, diverse transitions, model-change balance
User 2026-06-28: canonicalization is NOT just "3 transitions built". Two conditions:
1. >= 3 deliberately DIFFERENT transitions (the more different the character, the stronger the
   evidence — not three similar security transitions): ISO27001->CRA (security->cyber), ISO9001->
   MaschinenVO (QM->product safety), TISAX->CRA (automotive security->cyber).
2. NO structural extension of the Journey model in the last two transitions (or only clearly
   justified, general extensions). Per-transition maturity test: "did the MODEL need extending, or
   were only DATA added?" — tracked as a balance sheet.

Only when both hold (3 diverse + model stable in the last two) -> rename Transition Pattern -> Journey,
ratify ADR-011, derive renderers. Matches the pattern at Compiler / Layout families / Master Controls:
become the standard only after proving stable under DIFFERENT loads. Non-runtime -> no deploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-28 07:42:26 +02:00
pilotadmin 0cb224a7f1 Merge pull request 'docs: Journey model — Accepted as Concept, Pending Canonicalization (Rule of Three)' (#26) from docs/journey-provisional-acceptance into main 2026-06-28 07:32:55 +02:00