Surfaces fired patterns whose zone names terms the machine's narrative never
mentions — foreign framing that leaks through terms not yet in domainGateTerms
(once a term is a gate term, the ghost-pattern invariant already fences it out).
- FindFramingCandidates (proposer_framing.go): per fired pattern, zone terms with
no narrative echo (minus a generic hazard-location stoplist). Echo matching is
bidirectional to survive German compounding (narrative "Steuerung" echoes zone
"Steuerungssystem"). Heuristic verdict foreign (fully orphan) / plausible
(partial). Over-surfaces by design — human/LLM is the precision filter.
- Wired into iace-audit propose -> audit-reports/framing.{md,json}, threshold via
IACE_FRAMING_MIN_ORPHAN (default 0.6).
Honest finding: genuine wrong-MACHINE framing (Walzen, Transportbaender) no longer
fires thanks to the machine-type gate; the residual is mostly cyber/control
patterns with generic-industrial zone vocabulary, candidates for re-framing.
Proposal types 3-4 (vocab->tag, coverage blind spots) remain for slice 5.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Makes the offline proposer runnable end-to-end.
- BuildProposerInput (proposer_input.go): non-test engine->hazards path. The
PatternMatch->Hazard converter is lifted out of the GT test files into
production scope so both the tests and the CLI share one pipeline.
- iace-audit propose <narrative.json> [<ground-truth.json>]: detect candidates ->
GT-screen survivors (when a ground truth is given) -> judge (HeuristicJudge by
default, LLMJudge over ollama when IACE_PROPOSE_LLM=1) -> write the human-review
queue to audit-reports/proposals.{md,json}. Propose-only.
Smoke run on a dishwasher narrative: 32 fired -> 3 candidates -> queue with a
confident duplicate, a confident distinct, and one punted to the LLM judge; GT
wall recall-safe. Live qwen is opt-in via env; the heuristic default keeps the
tool runnable (and CI deterministic) without a model. Proposal types 2-4
(foreign-framing gates, vocab->tag, coverage blind spots) remain for slice 4.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the semantic judgement layer on top of the slice-1 detector + GT wall.
DEV-TIME, propose-only — nothing mutates the library or runtime.
- CandidateJudge interface with two implementations: HeuristicJudge
(deterministic default/fallback, used in tests) and LLMJudge (offline, over the
shared llm.ProviderRegistry via the LLMCompleter adapter). LLMJudge degrades to
"uncertain" on any transport/parse error — it can never break a run.
- BuildJudgePrompt: the ISO 12100 same-vs-distinct prompt, unit-tested
deterministically even though the call is not.
- RenderProposalQueue: markdown human-review queue with a suggested action per
candidate (supersede / keep both / needs review).
On real warewashing output the heuristic punts to "uncertain — needs the LLM
judge" for exactly the two recall-safe near-dupes (HP807/HP033 update,
HP101/HP096 winding-vs-friction), making the LLM's role explicit. All 3 GTs
unaffected (read-only). Live qwen wiring + a CLI/file queue are slice 3.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First thin slice of the offline library-improvement proposer. DEV-TIME ONLY,
propose-only — it never mutates the pattern library or the runtime.
- FindDedupCandidates (proposer_dedup.go): structural near-duplicate detection
over the fired patterns (category + measure/zone/scenario overlap). Bakes in
the P1 lesson: only same-category pairs compare, and pairs with different
operational states are never proposed (normal-operation vs maintenance are
legitimately distinct, e.g. HP011 vs HP077).
- ScreenSupersession (proposer_screen.go): the wall. A proposal is safe only if
(1) dropping the hazard does not reduce GT recall AND (2) keep/drop do not
credit DIFFERENT GT entries. Check 2 catches distinct hazards that merely share
measures (HP2201 hot surface GT 1.3 vs HP2202 hot ware GT 1.4) which recall
alone would wave through.
On real warewashing output: 3 candidates -> 1 BLOCKED (distinct GT), 2
RECALL-SAFE for human/LLM review (the update + winding/friction near-dupes).
Nothing auto-applied. All 3 GTs unaffected (read-only). The LLM judgement and a
CLI/file queue are slice 2.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
HP013 (stored electrical energy) fires for dishwashers via the broad stored_energy
tag but its zone is framed for Batteriefaecher/USV-Anlagen, which a dishwasher does
not have. The precise residual-voltage pattern HP144 (Frequenzumrichter/Zwischenkreis,
Priority 90) already fires and covers the same hazard. Add HP013 to the
warewashing-scoped supersession set so the duplicate is dropped only when
dom_warewashing is present.
Warewashing recall stays 100% (25/25), precision 92.6% -> 96.2%. Kistenhub/Bremse
keep HP013 (no dom_warewashing); 26 Bremse pins + benchmark unaffected.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The generic hot-surface patterns HP016 (high_temperature) and HP018 (actuator
burn) fire for dishwashers via broad tags and duplicate the precise warewashing
pattern HP2201 (Boiler/Tank/Spuelkammer). Suppress HP016/HP018 only when
dom_warewashing is present, so the specific pattern wins and the duplicate is
dropped. Scoped to the domain tag -> Kistenhub/Bremse and every non-warewashing
machine keep the generic patterns unchanged.
Warewashing recall stays 100% (25/25), precision 90% -> 92.6% (2 dupes removed).
Bremse 26 pins and Kistenhub benchmark unaffected.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ListHazards returned hazards in pattern-firing order, which reads as a jumble.
Sort by EN ISO 12100 hazard group (A. Mechanisch, B. Elektrisch, C. Thermisch,
D. Pneumatik/Hydraulik, E. Laerm, F. Ergonomie, G. Stoffe, H. Software/Steuerung,
I. Cyber, J. KI), stable within a group. Matches the frontend CATEGORY_LABELS.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
HP1717 was gated on the generic stored_energy tag (carried by a frequency
converter's DC link) + pneumatic_pressure (emitted by "Boiler unter Druck"),
so it leaked into the dishwasher despite the absence of any pneumatics. Require
pneumatic_part instead. The Bremse pin is a static pattern->measure check
(unaffected); full suite incl. Bremse coverage and Kistenhub 97.1% unchanged.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
HP753 (lithium thermal runaway), HP754 (battery off-gassing) and HP755 (HV
battery shock) were gated on stored_energy, which a frequency converter (C034,
DC-link capacitors) legitimately carries — so they leaked into any machine with
a VFD (surfaced by the dishwasher after the Frequenzumrichter narrative). Now
require the "battery" tag; add lithium/batteriespeicher synonyms so real
battery-storage machines still emit it.
GT #3 100% recall unchanged, battery themes gone from the dishwasher log;
Kistenhub 97.1% and Bremse pinned mappings unchanged.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User-Antwort auf „wie verteilen wir die Arbeit": nach BESITZ der Datenmodelle, NICHT nach
Regulierung. 3 Domaenen (Legal Knowledge / Compliance Execution / Product Knowledge), jede
besitzt EIN Modell (andere read-only). 3 Vertraege: Legal->Compliance citation_span->legal_basis ·
Product->Compliance Feature->Capability (WICHTIGSTE Schnittstelle) · Compliance->Legal
obligation_id->legal_basis. Product Knowledge Graph = naechster Meilenstein (Reasoning-Session
umfokussieren, besitzt schon CanonicalProductRegulatoryProfile+Navigator). NIS2 verschoben.
Offene Fragen: Legal-KG-Owner, IACE-4.-Session, Compliance-2-Branch-Split.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
evidence_required lists only required:true rows; repo_scan was required:false so
attack_surface_minimization surfaced config_export alone. An attack-surface scan
IS required to evidence a minimized attack surface. Adds a test pinning the curated
evidence_required set per NIST obligation (the table test only checked control count).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Registry materialized the generic CORE security objectives (#5b, Modell C), so
the two broad NIST controls now point at their canonical parents instead of the
domain-scoped matches:
SI-7 -> software_integrity_protection (CORE, Annex I (2)(f))
CM-7 -> attack_surface_minimization (CORE, Annex I (2)(j))
Non-breaking: the domain-scoped obligations stay valid and specialize the CORE.
SI-7 evidence = sbom + config_export (SBOM evidences component/supply-chain
integrity; config = signing/secure-boot). Export proposed_obligation_id + handler
test (2 CORE cases) updated. go test green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User-Entscheidung: Metamodell als v1.0 einfrieren (nur META-SEMANTIK: 6 Klassen + Kanten-
Vokabular + Attribute; NICHT Registry/Capabilities/Procedures). Architektur-Freeze in Kraft:
neue Regulierung = DATEN nicht Architektur; 0 neue Objektklassen erwartet; reopen nur bei
nachgewiesenem Scheitern (Hazard/Threat = einzige bekannte künftige Öffnungs-Ursache, nur fuer
FMEA). Reuse-Metrik-KPI definiert (Wissens-Akkumulations-Beweis). Validiert gegen 5
Regulierungsarten (DSGVO/CRA/MaschVO/Data-Act/NIS2). Erster Live-Durchlauf: MaschVO.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
User-Stresstest VOR der naechsten Regulierung: passt MaschVO/Data-Act/AI-Act/NIS2 ins
6-Klassen-Modell (Obligation/Capability/Procedure/Control/Evidence + Guidance) OHNE neue
Objektklasse? Ergebnis 4x NEIN -> Compliance Meta Model steht. 2 Verfeinerungen
(realized_by Capability OPTIONAL; Risiko-Niveau/Frist/Hazard-Schwere/Risiko-Tier = Attribute,
keine Klassen). 1 Watch-Point: Hazard/Threat (erst noetig bei quantitativem FMEA-Risiko als
First-Class-Knoten, nicht fuer Compliance-Abbildung). Kein Code, keine Regulierung ingestiert.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Registry grew to 95 (Capability materialization #5b added CORE obligations).
Keep the ai-sdk build-context copy current so obligation-status reflects the
live registry contract.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Prior gitea push's build-ai-sdk failed on a transient registry push (arm64 built
clean on macmini; amd64 cross-compile is green) and last-build/main got poisoned
to that SHA, so a plain re-run scopes to nothing. A real touch in ai-compliance-sdk/
re-scopes the build. Also documents the synced-copy contract for
data/obligations/obligation_join_keys.json.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Aligns provide_security_updates -> SI-2 evidence to the curated acceptance set:
config_export (secure-update mechanism config) + test_report (patch verification).
For "provide updates" the patch-verification test is more on-point than a vuln
scan; repo_scan stays on CM-7 for attack-surface.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Schema-Papier statt capabilities.json (User-Entscheidung). Befund: die 8 SHARED_CAPABILITY-
Cluster zerfallen in Typ-1 (technische Capabilities: mfa/tls/code_signing/session/anomaly)
und Typ-2 (Sicherheitsziele: attack_surface_min/software_integrity = die #4-Gaps). Empfehlung
Modell C: Capability = EINZIGE neue Klasse; Sicherheitsziele = CORE Legal Obligations
(CORE/DOMAIN existiert bereits). Kanten-Graph (realized_by/specializes/...). guidance_basis
gehört konzeptionell an die Capability. 4 Entscheidungen offen (User). #5b Materialisierung
GEGATED auf Modell-Annahme — keine Daten verschoben.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Registry filled proposed_obligation_id for the 3 NIST primary_implementation
controls: SI-7->signed_update_integrity, SI-2->provide_security_updates,
CM-7->remote_access_attack_surface_min. Adopted onto cra_nist.jsonl so the join
is now EXACT (obligation_id) instead of the coarse citation_unit fallback.
obligation-status now surfaces SI-2 under provide_security_updates; test extended.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Vertical slice over the Compliance Execution Graph: obligation_id -> accepted
controls -> required evidence -> status. NEVER auto-asserts fulfillment - with
no evidence collection wired (MVP), a mapped obligation is "not_assessed" and
every required evidence is "missing". Fail-closed: no id -> 400; unknown id ->
unknown_obligation; mapped-but-no-control -> unmapped; graph not loaded -> 503.
- ComplianceGraphHandlers (separate from the DB-backed ObligationsHandlers):
loads Registry join keys + accepted control mappings + evidence once at start.
- LoadComplianceGraph: candidate-path resolution across dev/container/test.
- Data plumbing: Dockerfile now COPYs data/{control_mappings,evidence_requirements,
obligations}; data/obligations/obligation_join_keys.json is a SYNCED COPY of the
repo-root Registry contract (re-sync on Registry growth).
- Table-driven handler test (mapped/unmapped/unknown/400 + no-fulfillment-claim).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds SI-7/SI-2/CM-7 to controls_for_obligation_mapping.json (7 OWASP -> 10),
mapping_type=primary_implementation (the single canonical control per obligation).
proposed_obligation_id left empty for the Registry to assign. Notes aligned to the
updates family (join_keys 93): SI-2 -> provide_security_updates (strong),
SI-7 -> signed_update_integrity (partial; SI-7 broader), CM-7 ->
remote_access_attack_surface_min (partial; CM-7 broader).
Origin-only (data/tooling; backend does not load obligations/* at runtime) -> no Orca.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CRA Annex I Part I (2)(e)/(2)(l)/(2)(i) had no clean OWASP target (rejected:
"Mapping ueber NIST/BSI erforderlich"). Their NIST home, curated + accepted:
(2)(e) Integritaet -> SI-7 (Software/Firmware/Information Integrity)
(2)(l) Sichere Updates -> SI-2 (Flaw Remediation)
(2)(i) Angriffsflaeche -> CM-7 (Least Functionality)
New mapping_type=primary_implementation = the single canonical control per
obligation (stronger than implements/supports); related controls (SC-3(3),
RA-5, AC-6, SI-16, ...) follow later as supports.
Evidence is framework-AGNOSTIC: SI-7/SI-2/CM-7 reuse the shared evidence_type
catalog (config_export/test_report/repo_scan) - same types carry CRA, NIST,
ISO 27001, IEC 62443, BSI. (framework,control) is only the link, not the type.
obligation_id left empty: the Obligation Registry assigns it (exported via
controls_for_obligation_mapping.json), then we adopt. go test ./internal/ucca green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Obligation Registry filled proposed_obligation_id (7/7) + cut the logging
family (obligations 47->66). Adopted obligation_id onto our 7 accepted CRA->OWASP
mappings; the join now prefers the EXACT obligation_id over the coarse
citation_unit (which stays as fallback for not-yet-adopted rows).
Effect: semantic coverage 2->4 (user_authentication_required,
credential_confidentiality_protection, auth_key_management,
event_logging_security_events). Befund 1 resolved: V11.2.1 crypto now sits under
credential_confidentiality_protection, not user_authentication_required.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
V6.x->user_authentication_required, V11.2.1->credential_confidentiality_protection,
V11.7.1->auth_key_management; semantisch (NICHT CRA-Anker, die sind approximativ).
V16.x pending bis Logging-Cut. anchor_quality_note dokumentiert.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
obligations/controls_for_obligation_mapping.json — the Compliance Execution
Graph's accepted controls (V6 auth / V11 crypto / V16 logging) handed to the
Obligation Registry to propose the SEMANTIC control->obligation_id, replacing
the coarse citation_unit interim join (Befund 1). Registry fills
proposed_obligation_id; we then adopt it into control_mapping.obligation_id.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AssessObligationStatus traverses obligation_id -> (citation_unit) -> accepted
controls -> required evidence -> status (erfuellt|offen|unklar). Evidence
presence is a callback; MVP passes nil (nothing collected yet) -> offen.
citation_spans = "pending" until the Legal-Knowledge-Graph session attaches
them. This is the vertical slice that makes the graph a product feature:
"CRA obligation fulfilled because evidence X/Y/Z is present", not "a doc exists".
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Consumes the cross-session contract obligations/obligation_join_keys.json (47
obligation_ids). Interim bridge = citation_unit (our source_norm <-> registry
citation_units), to be hardened to the stable obligation_id (field now optional
on ControlMapping).
ComputeObligationCoverage joins the 47 registry obligations to our accepted
control mappings: covered=2 (user_authentication_required, firmware_software_
authentication), mapped_rejected=3 ((2)(e) -> our OWASP mappings rejected,
route via NIST/BSI), uncovered=42. This coverage signal is the feedback to the
Obligation session for what to cut/refine next.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Macht meine Seite des Cross-Session-Vertrags konkret: obligation_id ist der stabile Join-Key
zwischen Legal Knowledge Graph (citation_spans -> obligation_id) und Compliance Execution Graph
(control_mapping.source_norm -> obligation_id). Export aller 47 obligation_ids (CRA: 11 sbom +
7 vuln + 29 auth) mit citation_units als Interim-Brücke. Disziplin: obligation_id nie neu
vergeben (re-link, Pendant zu span_id/control_uuid).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The last edge of the compliance graph: what concrete, fresh evidence proves a
framework control is met (config_export/test_report/sbom/audit_log/pentest/...
from github/ci/scanner/manual_upload, with a freshness requirement).
Seeded for all 7 accepted CRA->OWASP controls (Auth/Crypto/Logging). A graph
test enforces connectivity: every accepted control must carry >=1 required
evidence — no dangling node in Obligation -> Control -> Evidence.
This is what will let the Advisor state "the CRA requirement is fulfilled" from
present evidence, not from the mere existence of a document.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>