The user-named "right next runtime step": stop building knowledge, start using it automatically in
onboarding — no sales training, no regulation picking. compliance/onboarding/ is an ORCHESTRATOR (not
a new engine) wiring Company 2A -> RS-005 -> optimization -> completeness:
advisor_start(input, cert_hypotheses, target_requirements, ...) -> AdvisorResult
From (company + products + certifications + target) it returns inferred_assumptions, rejected_
assumptions, next_best_questions (<=5, ranked by information_gain + leverage + unknown_high_risk +
evidence_missing, each self-explaining), capability_delta, top_measures, evidence_requests,
unsupported_domains, completeness_summary. apply_answer() updates the profile (delta shrinks).
Welt-1 throughout: certificates REDUCE questions but satisfy nothing automatically (verification_
required); relevance(evidence,target) keeps ISO 14001 out of the CRA result. Certificate->capability
hypotheses + target requirements are INJECTED (curated knowledge, outsourced; not in code).
All 7 acceptance criteria pass; mypy --strict clean. First app-caller wiring the engines into a
product flow — still no endpoint/persistence, so 0 runtime effect -> no deploy yet (deploys when
POST /onboarding/advisor-start + frontend are wired). check-loc 0.
The sanctioned last architectural building block. Reverses the order: not Goal -> Journey -> Delta
but Goal -> Required -> Delta -> Journey. A Journey is the EXPLANATION of the Capability Delta, not
its cause — so this is a Matcher/Explainer, not a Selector.
New module compliance/journey_matcher/ = the third independent, interchangeable function of the
pipeline, beside Company 2A (Evidence -> Capability) and RS-005 (Capability -> Delta):
match_journeys(delta, journeys, context) -> ranked, auditable explanation
- Looks ONLY at the Capability Delta — never at certificates, regulation, tenders or the goal.
Journey signatures are certificate-agnostic capability clusters (Input -> Output pattern).
- score = share of the delta a journey explains (recall over the missing capabilities); journey_only
documents where a journey reaches beyond the delta so a broad journey is not silently preferred.
- Deliberately dumb + deterministic (pure set overlap; NO ML/embeddings/LLM), fully auditable
(matched / unexplained / journey_only / context signals); a learning ranker can sit on top later.
- Signatures injected, engine hermetic. mypy --strict clean.
Validated on the real patterns (demo): a CRA+MaschinenVO delta ranks the convergence journey 100%,
"ISO27001 -> CRA" 56% (misses the machine-safety caps), "ISMS -> TISAX" 0%. This resolves the
"Scope -> Journey" jump from Customer Mission #1. Freeze exception explicitly authorised; non-runtime
-> no deploy. 12 tests pass, check-loc 0.
Before the next Journey: the LANGUAGE. With 5 knowledge objects but no vocabulary, the same reise gets
named four different ways (ISO9001->MaschinenVO vs Quality Management->Product Safety vs ...). The spec
answers ONE question: which terms are IDENTITIES and which are REPRESENTATIONS of the same meaning?
- spec docs-src/architecture/domain-vocabulary-spec-v1.md (PROPOSAL): identity hierarchy
(Requirement RQ / Capability MCAP [Registry 2C] / regulation-source-target / Journey Class MJRN
[PROVISIONAL] / Journey instance / Playbook MPLB); canonical name + aliases; capability vocabulary =
the Capability Registry (not rebuilt); reorder Vocabulary -> Transition #2 -> #3 -> Rule of Three.
- knowledge/vocabulary/regulations.yaml: regulation/standard IDENTITIES (id + canonical + aliases).
SOLVES the regulation-ID normalization the KPIs flagged: CRA == "Cyber Resilience Act" == "Regulation
(EU) 2024/2847" all resolve to `cra`; ISO9001/QMS -> iso9001; etc. Shared artifact (@Legal-KG/@Execution
please adopt).
- knowledge/vocabulary/journey_classes.yaml (PROVISIONAL): clusters our transitions into classes
(Information Security -> Product Cybersecurity; Quality Management -> Product Compliance/Safety).
Finding: ISO9001->MaschinenVO is an INSTANCE of an existing class (like ISO9001->CRA, ISO13485->MDR),
not a new kind -> avoids duplication. Journey Class is a new abstraction -> its own Rule of Three (no
MJRN minting yet).
- reference suite: both KPIs now read aliases from regulations.yaml instead of hard-coded maps; the
"Regelwerk-ID-Normalisierung" line flips TODO -> PASS. KPI numbers unchanged (vocab is a superset).
- Side effect = Requirements Intelligence: a Tender "Security Patch Procedure" resolves to MCAP-0017.
7 vocabulary tests (17 with domain programs), check-loc 0. Knowledge data + spec + reference harness =
non-runtime -> no deploy (ADR-001). No new module, no runtime change, no minting (Freeze).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User 2026-06-28: canonicalization is NOT just "3 transitions built". Two conditions:
1. >= 3 deliberately DIFFERENT transitions (the more different the character, the stronger the
evidence — not three similar security transitions): ISO27001->CRA (security->cyber), ISO9001->
MaschinenVO (QM->product safety), TISAX->CRA (automotive security->cyber).
2. NO structural extension of the Journey model in the last two transitions (or only clearly
justified, general extensions). Per-transition maturity test: "did the MODEL need extending, or
were only DATA added?" — tracked as a balance sheet.
Only when both hold (3 diverse + model stable in the last two) -> rename Transition Pattern -> Journey,
ratify ADR-011, derive renderers. Matches the pattern at Compiler / Layout families / Master Controls:
become the standard only after proving stable under DIFFERENT loads. Non-runtime -> no deploy.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User decision (2026-06-28): provisional acceptance. Journey is now the preferred way of THINKING, but
the persisted artifact stays "Transition Pattern" — NO rename, NO migration, NO runtime change. Per the
Rule of Three, Journey becomes the official primary entity only after it proves itself on >=3 distinct
transitions (1. ISO27001->CRA done, 2. ISO9001->MaschinenVO, 3. TISAX->CRA). Only then: rename to
Journey, ratify ADR-011, derive renderers officially. Erst beweisen, dann kanonisieren — as with Master
Controls/Capabilities.
Also makes the two-axis separation durable (the most valuable finding): Atomic Requirement -> Capability
-> Journey (transition axis) vs Capability -> Playbook (implementation axis). Journey belongs to the
transition; Playbook stays capability-owned, referenced by any number of journeys. We do NOT force-unify.
Non-runtime doc -> no deploy (ADR-001).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Customers don't buy "EMV domain"; they buy "we have ISO 9001, help us with the CRA". The sellable
unit of knowledge is the TRANSITION (from -> to), not the law and not the capability. This reframes
the backlog from "model EMV next" to "the top demanded transitions". No new runtime framework (ADR-010).
- knowledge/programs/transitions.yaml: the Operational Knowledge backlog — the ~20-30 actually demanded
transitions (of ~N*(N-1) possible) with priority. ISO27001->CRA, ISO9001->CRA, ISO9001->MaschinenVO
(all 5-star), IEC62443->CRA, TISAX->CRA, ISO27001/IEC62443->NIS2, ISO14001->Umweltrecht.
- Transition Coverage KPI (reference suite, computed-not-stored): per transition a status DERIVED from
the transition-pattern corpus (reviewed/validated/proven -> Gold, draft -> 🟡, none -> ⚪). Honest
current state: ISO27001->CRA ✅ reviewed, ISO9001->CRA 🟡 draft, rest ⚪. Highest-priority gap =
ISO9001->MaschinenVO (the next Track-B work) — a far stronger product indicator than "EMV 30% modelled".
- Three knowledge layers documented: Regulatory -> Operational (transitions/playbooks/deltas, the
biggest differentiator) -> Verification (Vision V2). A domain is a TRANSITION PROGRAM with two tracks:
Track A breadth (model sources, @Legal-KG/@Execution) + Track B product (transitions/playbooks/RTS
per source, @Reasoning).
- ADR-010: the transition is the unit of knowledge; Transition Coverage KPI; three layers; two tracks.
10 program/transition-contract tests, check-loc 0. Knowledge data + ADR + reference harness =
non-runtime -> no deploy (ADR-001). No new module, no runtime change.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The real bottleneck is domain MODELLING. Phase B is organized as one program with sub-programs per
domain, each run through the SAME 7-stage production line. No new runtime framework, no new module
(ADR-009, Freeze v1.0) — only program data + a derived reporting view.
- Customer enters by INDUSTRY, not regulation: Industry -> Domain Model -> Requirement Sources ->
Requirements -> Capabilities -> ... -> Completeness.
- 7-stage checklist identical for every domain (Domain Model / Requirement Sources / Capability
Registry / Transition Patterns / Playbooks / Reference Scenarios / Completeness) with per-stage
ownership. README generalized to the framework.
- Each domain lists typical_requirement_sources + typical_certifications -> pre-onboarding capability
HYPOTHESIS (the ETO insight; feeds Company 2A as inferred, never confirmed).
- Backlog v1 (by customer value): 1 Industrial Automation, 2 Environmental, 3 Automotive, 4 Medical,
5 Energy. Five domain-definition shells (environmental restructured to the unified shape, law-first
preserved).
- Per-domain KPI is DERIVED from the real corpus (computed-not-stored; sources modelled / transition
patterns / playbooks / reference scenarios), NOT a curated number. Reference suite renders maturity
bars: Industrial Automation 43% (3/7 sources) leads, Environmental 0% (work ahead). Backlog (value)
and KPI (corpus state) are deliberately separated.
- ADR-009: Domain Knowledge Program framework. Honest known refinement: regulation-ID normalization
(CRA vs Cyber Resilience Act) aliased in the KPI.
7 program-contract tests (backlog order + industry-first + derived-not-stored), check-loc 0.
Knowledge data + ADR + reference harness = non-runtime -> no deploy (ADR-001).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The architecture is stable; from here the value comes from DOMAINS, not more software. Phase B is
organized as law-first Domain Knowledge Programs, each delivering the same production line: Corpus ->
Obligations -> Capabilities -> Transition Patterns -> Playbooks -> Reference Scenarios -> Completeness.
No new runtime framework (Freeze v1.0).
- knowledge/programs/README.md: reusable Domain Program blueprint (production line, per-stage ownership,
law-first ordering, planned programs Environmental/Automotive/IEC62443/Functional-Safety).
- knowledge/programs/environmental.yaml: the Environmental domain as DATA. Law-first: B1 Environmental
Regulatory Corpus (water/chemicals/emissions/energy/waste/product-responsibility — law + obligations
only) -> B2 Capability Model -> B3 Transition Patterns (ISO 14001 -> corpus, built LAST). ISO 14001
is a source state, NOT the domain.
- Ownership handoffs: B1 -> Legal Knowledge, B2 -> Compliance Execution, B3+/playbooks/reference ->
Reasoning. Coordinate via the board; no session builds another's artifacts.
- reference suite: "Domain Knowledge Programs" section renders the program stages + a measurable
Completeness baseline (6 areas, 0 assessed today) that flips automatically as stages land.
- ADR-008: from architecture to domains; Phase B as law-first programs; architecture frozen.
6 program-contract tests (law-first order + ownership pinned), check-loc 0. Knowledge data + ADR +
reference harness = non-runtime -> no deploy (ADR-001). No new module, no runtime change.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase A½. The move from feature to product development: for every assessment, answer "how sure are
we that this answer is COMPLETE?" — different from confidence. The product never claims full coverage;
it makes its own knowledge state transparent and auditable. Shows what we do NOT know and why.
- compliance/completeness/: assess_completeness(identified, corpus_status, uncertain, assumptions,
assessed_obligations) -> CompletenessReport. Separates IDENTIFIED from ASSESSED (validated corpus
AND determined applicability) and justifies every gap. Two kinds of open: corpus gap (future_corpus)
and applicability uncertainty (query_required + deciding question, e.g. Data Act / generates_usage_data).
- The metric is COUNTS, never a single percentage: "Identifiziert N · bewertet M · offen K ·
Unsicherheiten U · Begründung ja" + an honest audit statement.
- ADR-007: auditable honesty; phase order A factory -> A½ Completeness -> B new domains; the
transparency selling point. Deterministic, no LLM; corpus status + obligation count injected.
- reference suite: "Regulatory Completeness" section runs an industrial-dishwasher assessment
(assessed CRA/MaschinenVO; open EMV/Environmental=future_corpus, Data Act=query_required) and notes
Environmental flips open->validated automatically once the corpus lands.
11 completeness tests (54 with adjacent modules), mypy --strict clean (15 files), check-loc 0.
Product code with no app caller + ADR/reference = non-runtime -> no deploy (ADR-001). Freeze-safe.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase A1. The real knowledge production is not writing — it is TARGETED UPDATING: when 20 documents
arrive, which 5 change our knowledge and which 15 are ignorable? Before the parser, Knowledge Intake
classifies a new document (no content extraction) and intersects its signals with an index of the
existing knowledge to emit a Knowledge Package (an impact analysis).
- compliance/knowledge_intake/: build_knowledge_index(patterns, playbooks, reference_scenarios,
obligation_index) + assess_document_impact(descriptor, index) -> KnowledgePackage. Deterministic,
NO content extraction, NO LLM. Surfaces affected capabilities / playbooks / transition patterns /
reference scenarios / (injected) obligations, whether it is a new domain, and a triage level
(HIGH / LOW / NONE / NEW_DOMAIN) with a recommendation.
- ADR-006: Knowledge Intake = classify + impact before extraction; full factory Intake -> Package ->
Parser -> Draft -> Review -> Published; phase order A1 Intake / A2 Draft / A3 Review.
- reference suite: "Knowledge Intake" section triages 3 example documents (CRA SBOM-FAQ -> high,
14C/2PB/3RTS/2Obl; environmental guidance -> new_domain; marketing blog -> ignorable). Section
lives in _helpers.py to keep generate.py under the 500-LOC budget.
- Honest known refinement surfaced by intake: regulation-ID normalization (CRA vs Cyber Resilience Act).
10 intake tests (60 with the adjacent modules), mypy --strict clean (16 files), check-loc 0.
Product code with no app caller + ADR/reference = non-runtime -> no deploy (ADR-001). Freeze-safe.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The bottleneck is not content, it is knowledge PRODUCTION. Instead of writing 200 playbooks by
hand, generate drafts deterministically from data the software already owns, then have an expert
review them. Mirrors the legal pipeline (Gesetz -> Parser -> Obligation -> Review) for BreakPilot's
own knowledge: new Capability -> Registry -> Transition Pattern -> Playbook Draft Generator ->
Expert Review -> versioned Playbook.
- compliance/knowledge_production/: generate_playbook_draft(capability, requirement, control_links)
+ drafts_from_pattern(pattern) -> one PlaybookDraft per delta capability. Owned fields (why /
closes_regulations / expected_evidence / typical_controls) are assembled with per-field provenance;
the practitioner know-how (tools / process_steps / how_others) is left as an explicit TODO.
- DraftStatus lifecycle (Freigabestatus): draft_generated -> in_review -> reviewed -> validated ->
proven. Deterministic, NO LLM in the core (any model enrichment stays offline/advisory/propose-only).
- ADR-005: extends "the engine does not change, the corpus grows" with "and the corpus is not written
by hand — it is deterministically prepared, then curated".
- reference suite: "Knowledge Production" section turns the convergence pattern into 12 auto-assembled
drafts (why/closes/evidence filled, tools/steps TODO) -> review 12 drafts, don't write 12 playbooks.
10 tests (50 with playbook/optimization/transition/company), mypy --strict clean, check-loc 0.
Product code with no app caller + ADR/reference = non-runtime -> no deploy (ADR-001). Freeze-safe.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Roadmap item 4. After WHAT applies / WHAT is missing / WHICH first, the GF asks HOW. The
Implementation Playbook renders, for one capability, the full journey — why / which regulations
it closes / tools / process / evidence / controls — and chains the Optimization Roadmap into
per-measure playbooks. Another renderer over the same Capability spine (ADR-003/004), not a new
engine: ~95% of the data already exists, it just needs a different rendering.
- compliance/playbook/: build_playbook() + playbooks_for_plan() (chains optimization -> playbook,
acyclic; reuses leverage for "closes which regulations"). Capabilities without curated content
render as honest status:missing stubs — the content-owed signal.
- knowledge/implementation_playbooks/: curated knowledge layer (Reasoning Knowledge Acquisition),
two deep expert drafts (SBOM, CVD/PSIRT, status draft, expert-draft-not-normative) + README.
The bottleneck is now CONTENT, not software; Playbook (own knowledge) != regulatory domain.
- ADR-004: Implementation Playbooks = renderer + knowledge layer; content is the bottleneck.
- reference suite: "Implementation Playbook" section renders the SBOM journey + Roadmap->Playbook
table (high-leverage caps flagged "fehlt (Inhalt)" — content backlog, highest leverage first).
- refactor: extracted markdown helpers to reference_scenarios/_helpers.py to keep generate.py
under the 500-LOC budget.
9 playbook tests (40 with optimization+transition+company), mypy --strict clean, check-loc 0.
Product code with no app caller + knowledge/ADR/reference = non-runtime -> no deploy (ADR-001).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Roadmap item 5. GAP analysis and measure-prioritisation are the SAME computation: Required −
Known = the Capability Delta. The Capability Delta Engine (RS-005) computes it once; renderers
read that ONE delta. Interview Renderer (missing info → questions) was already built; this adds
the Roadmap/Management Renderer (missing capabilities → measures ranked by regulatory leverage).
- compliance/optimization/: regulatory_leverage() + select_within_budget() (pure leverage math)
+ roadmap_from_delta(assessment, ...) — the keystone binding optimization to the RS-005 delta
(dependency optimization → transition_reasoning, acyclic; the delta engine stays hermetic).
leverage(measure) = number of regulatory requirements it closes at once (e.g. patch management
→ CRA+MaschinenVO+IEC62443+ISO27001 = 4). No new corpus, no new meta-model class (freeze v1.0).
- Welt-1 honesty: percentages are exact count ratios over the IDENTIFIED requirements (the known
delta), never "% gesetzeskonform".
- reference suite: "Regulatory Optimization" section runs the SAME convergence delta → ranked
measures + budget answer + the management sentence "of N identified requirements you close M
with the top-K measures (X%) — highest regulatory leverage".
- ADR-003: Capability Delta Engine — one delta, many renderers; rename Gap → Capability Delta.
13 optimization tests (31 with transition+company), mypy --strict clean, check-loc 0.
Product code with no app caller + ADR/reference = non-runtime → no deploy (ADR-001).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Aligns the spec with RS-005 v0: the Transition Planning Engine owns the INFORMATION
GAPS (TransitionQuestionRequest), not the questions. Chain: Planning Engine ->
TransitionQuestionRequest -> Question Renderer (RS-005.1) -> Interview. RS-005.1
(renderer/templates) deliberately deferred; GeneratedQuestion reframed as the renderer's
output (a swappable policy layer), not part of the engine.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
v1.1: interview questions are GENERATED from the existing (Master) Controls, not
hand-written. Three building blocks: Control->question_intent (corpus/Execution),
~30-40 Master Question Templates (Reasoning), Transition-Prioritization (certs decide
which generated questions can be skipped; 217->19 funnel, reuses Company 2A + cert map).
v1.2: knowledge production. LLMs produce the first expert DRAFT (the prioritization per
transition); BreakPilot reviews + versions + OWNS the canonical library (in Git, not the
AI; model-independent, MDQ-00127 v4). Offline multi-model workflow, NOT runtime
(deterministic-first: LLM offline-propose, never online-mutate). Hard boundary: the
library is an expert DRAFT, not a normative/legal proof -- "cert probably covers X" is
Welt-1 (ClaimCoverage), never "erfuellt" (anti-fake-evidence).
Reframes the 100 seed questions as validation/template-extraction set. Spec only, no
code; non-runtime docs -> no deploy (ADR-001).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Second reasoning mode (extends, does not replace): BreakPilot answers MIGRATION
questions (start state -> target state -> delta), not regulation Q&A. New package
compliance/transition_reasoning/ (spec only). Transition Reasoning is RCI
generalized; reuses Company 2A (have), Master Capability Registry (MCAP) and RCI.
MDQ Registry = 4th identity-machine instance (after Master Controls/Obligations/
Capabilities): every Master Delta Question is a versioned, identifiable knowledge
unit (verifies MCAP, supports obligations, transition patterns, evidence types,
information gain, confidence impact, follow-up). Transition Patterns hold only MDQ
references -> reuse across transitions. Delta interview = information-gain
optimization, not a sequential questionnaire.
ADR-002: transitions are DATA (patterns + capability/MDQ knowledge), never engine
or metamodel extensions. 100 seed questions captured as v1.
Spec only (no code; freeze-respecting: additive package, no new graph/base class/
meta-model class). Non-runtime docs -> no deploy (ADR-001).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
A dev deploy must always have a verifiable runtime effect. Deploy only on
runtime/API/data-model/reasoning/security changes; docs, reference suites, ADRs,
board and ownership texts are merged to origin/main but NOT pushed to dev (no Orca
build). Keeps the CI/CD history meaningful: every build == a runtime change.
Architecture/release decision (not a developer convention) -> own folder
docs-src/architecture/adr/. Non-runtime: this commit triggers no deploy, per its
own policy.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
9 docs + index in docs-src/architecture/ documenting the deterministic
retrieval engine: retrieval pipeline, authority rerank, source_class,
source_role, control-intent + diversity, assessment, confidence,
explainability + supersede, framework_* layer. Each doc carries the exact
constants, the rationale behind them, code refs, and the failure class
it addresses. Audit/onboarding reference.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>