breakpilot-compliance

Author	SHA1	Message	Date
Benjamin Admin	77459d06d6	fix(onboarding): apply hypothesis/vocabulary review decisions (ISO13485, patch-policy rationale, summary) Two reviewed knowledge decisions (2026-06-28) + the deferred cosmetic counter, before #59. 1. ISO13485 removed from the incident_management hypothesis. ISO 13485 CAPA / quality-safety incident handling is NOT security incident management — the mapping was too broad and would seed false hypotheses for the empirical loop. A dedicated manage_quality_and_safety_incidents capability can come later IF a target needs it; not forced now. (ISO27001/TISAX/IEC62443 keep incident_management.) 2. patch_policy_doc -> secure_signed_update_distribution stays `partial`, but the curated rationale is sharpened: "indicates update governance, does not evidence signed distribution" (a patch policy is not proof of SIGNED distribution). New optional SignalMapping.rationale field carries the curated note. (github_actions_ci -> SDL and dependency_scanning -> vuln-mgmt reviewed and APPROVED as-is.) 3. Cosmetic (folded in since we touched the file): the silent-intake summary now counts detected and indications SEPARATELY ("N automatisch erkannt, M Indikation(en)") instead of lumping partial signals into "automatisch erkannt" — consistent with the three-state model just shipped. Tests: ISO13485 no longer resolves to incident_management; summary counts split correctly. 29 onboarding tests pass, mypy --strict clean, demo runs, check-loc 0. Runtime-visible (hypothesis resolution + summary text) -> deploy + smoke.	2026-06-28 16:18:28 +02:00
Benjamin Admin	978052b5a2	fix(onboarding): decouple partial/indicative signals from detected — partial no longer removes a question Fix B of the pre-#59 semantic correction. The Silent Pass had only TWO effective states though the data carries three: a `detected` mapping (a concrete artifact) AND a `partial` mapping (an indicative signal, e.g. a CI pipeline -> secure-development-lifecycle) both flowed through capability_ids() and were fed to the Advisor as already-present — so a weak indication silently removed a question, exactly the Welt-1/Welt-2 transparency we want to keep. Now three distinct states: - detected -> reduces the delta immediately (auto_detected, not asked). [unchanged] - partial -> raises assumption strength but does NOT replace the question (surfaced as `indications`, the capability stays in the delta and is still asked). - requirement -> describes a target, never the present state (already handled by Fix A's kind split). Changes (data + thin wiring, no new architecture): - SilentIntakeResult.capability_ids() returns only relationship==detected; new indicative_capability_ids() returns the partial ones. - advisor_start() gains indicative_capabilities (NOT fed into the profile) and surfaces result.indications = indicative ∩ required − auto_detected. - AdvisorResult / AdvisorResponse gain `indications` (additive, contract-safe); the service passes the indicative ids through. Tests: a partial CI signal is indicative-not-detected and does NOT shrink the delta; end-to-end it appears in `indications`, not `auto_detected`, and the gap is still asked. 28 onboarding tests pass, mypy --strict clean on the onboarding modules, demo runs, check-loc 0. Runtime effect -> deploy + smoke.	2026-06-28 16:02:35 +02:00
Benjamin Admin	c39787ad96	fix(onboarding): separate observation vs requirement signals — a demanded SBOM is not a present SBOM Semantic correction of the knowledge base BEFORE the empirical loop (#59) is built — otherwise the Observation Store would learn from already-misclassified signals. The Silent Pass conflated two kinds of signal into one: an OBSERVATION ("I saw an SBOM in the repo") and a REQUIREMENT ("a tender DEMANDS an SBOM"). They were aliased to the same canonical id, so a tender clause read as "SBOM already present" and suppressed the very question that should have been asked. Fix — make the kind explicit and authoritative (no new architecture, data + thin wiring): - `kind` ∈ {observation, requirement} on ProducedSignal (producer may declare) and on the canonical SignalVocabularyEntry (AUTHORITATIVE — a mislabelled producer cannot collapse the two). - Vocabulary split: sbom_file_found → sbom_present (obs) + sbom_required (req); security_txt_or_cvd_policy → cvd_policy_present (obs) + psirt_required (req); add signed_updates_required. requirement signals are intentionally UNMAPPED in intake_signal_map (they describe a target, not state). - silent_intake() consumes ONLY kind==observation; requirement signals are preserved in `requirements_seen` (visible/auditable) but NEVER become a detected capability. - normalize_signals() stamps the vocabulary's kind onto every IntakeSignal; unknown ids still pass through. This is the same Observation-vs-Requirement split the Requirements Verification Platform rests on: observations are reality, requirements are targets, and their comparison is the delta. A tender / OEM spec / law now produces requirement signals; scanners / repos / documents produce observation signals. Tests: rewrote the two test_signal_producer cases that previously ASSERTED the bug (tender == repo) to pin the correct split; regression — `requires_sbom` yields no capability + stays in requirements_seen while `cyclonedx_found` still detects sbom_creation; endpoint-level regression that a tender requirement does not auto-detect and the gap stays asked; vocabulary-kind-overrides-mislabelled-producer. 25 onboarding tests pass, mypy --strict clean, demo runs, check-loc 0. Runtime effect → deploy + smoke. (Fix A; partial-vs-detected decoupling follows as Fix B before #59.)	2026-06-28 15:52:50 +02:00
Benjamin Admin	a4123ace71	feat: POST /onboarding/advisor-start — expose the Smart Onboarding Advisor at runtime (#58) This exposes the existing Smart Onboarding Advisor through a runtime endpoint; it does not add new reasoning logic. Tightly scoped: adapter boundary + endpoint, no big frontend, no persistence, no empirical learning, no new scanners, no LLM. POST /onboarding/advisor-start : (company + certifications + target + scanner_findings[ProducedSignal]) -> Normalizer -> Silent Knowledge Pass -> Advisor -> { silent_intake_summary, inferred_assumptions, rejected_assumptions, top_5_questions, capability_delta, top_measures, evidence_requests, completeness_summary, auto_detected, headline } GET /onboarding/targets : the supported target ids (CRA, TISAX, MDR, Environmental) compliance/services/onboarding_service.py is the app-caller: it loads the curated knowledge (hypothesis library, signal vocabulary + map, the target's required capabilities) once and calls the pure, tested orchestration (normalize_signals -> silent_intake -> advisor_start). The scanner ADAPTER boundary is the ProducedSignal format the request carries — existing scanners emit it, no new scanners. Thin handler (<30 LOC), registered in the auto-load list. No DB. Additive to the OpenAPI contract (contract test is additive-friendly; baseline regenerates on CI/py3.12). First deployable runtime feature -> dev deploy + smoke. mypy --strict clean, 22 onboarding tests pass, check-loc 0.	2026-06-28 15:14:00 +02:00
Benjamin Admin	c2c8f7e424	feat: Signal Producer interface + Normalizer — one signal language for all sources (before #58) Not scanner stubs — the scanners exist. The Silent Pass needs only their UNIFIED output. This adds the small common DATA FORMAT (not a new module/framework) the user asked for, exactly the Requirement-Source / MCAP / regulation-alias pattern: many inputs, one language. Producer A / B / C -> normalize_signals (vocabulary: id + aliases) -> canonical IntakeSignal -> Silent Pass - ProducedSignal {signal_id, source_type, confidence, evidence, provenance} = what ANY source emits (website scanner, repo scanner, PDF parser, tender parser, API, the user). - knowledge/onboarding/signal_vocabulary.yaml reduces producer dialects to a canonical signal: "SBOM present" arrives as cyclonedx_found / spdx_found / sbom_uploaded / requires_sbom (tender) — all become `sbom_file_found`. The Silent Pass cannot tell where it came from -> no per-scanner special logic, ever. - Unknown signals pass through (a new producer stays visible). confidence/evidence/provenance flow to the detected capability for the audit trail. A tender that "requires SBOM" now produces the same effect as a repo that HAS one — fits Vision V2 (Requirement Source over Regulation). Endpoint (#58) then has its final shape: POST -> Producers -> Normalizer -> Silent Pass -> Profile -> Delta -> Questions -> Roadmap. Non-runtime -> no deploy. mypy --strict clean, 14 onboarding tests pass, check-loc 0.	2026-06-28 14:49:57 +02:00
Benjamin Admin	9c33582412	feat: Silent Knowledge Pass — recognise before asking (Phase 0, before the endpoint) Not the endpoint yet — the bigger knowledge lever first. The Advisor can say "I need 5 answers" but does not yet decide what it can find out by ITSELF. The Silent Knowledge Pass runs in front of the Advisor and, from signals existing scanners/parsers already produce (website, repository, documents, product data), deterministically derives capabilities the company demonstrably HAS + product facts that drive scope — so every recognised item shrinks the delta and removes a question. compliance/onboarding/silent_intake.py: silent_intake(signals, signal_map) -> detected_capabilities (+ evidence already in hand) + product_facts. The signal->conclusion map is curated DATA (knowledge/onboarding/intake_signal_map.yaml), signals are injected (scanners are upstream). Pure, deterministic, no LLM. advisor_start gains detected_capabilities (folded into the profile at HIGH confidence -> covered, not asked) and an auto_detected result + headline. The experience flips from a question wall to "we already recognised 4 capabilities, 2 product facts and have 4 pieces of evidence in hand — only these few remain". Order now: Silent Pass -> #58 endpoint/frontend -> #59 empirical loop. NOT new architecture, just an orchestration step in front. Non-runtime (no app caller) -> no deploy. 15 onboarding tests pass, mypy --strict clean, check-loc 0.	2026-06-28 14:34:27 +02:00
Benjamin Admin	98d616d82b	feat: Observation Model — the empirical learning unit, defined BEFORE persistence (Task 59a) The learning point is not the hypothesis, it is the QUESTION — and confirmed/refuted is too coarse. "partial, only critical suppliers" or "certified but not lived" are not "wrong", they are valuable knowledge. So the chain is Hypothesis -> Question -> Observation -> (Review) -> Hypothesis, and the observation model must be defined cleanly before any store/API (else thousands of too-coarse observations get migrated later). compliance/onboarding/observations.py: - ObservationType: confirmed / partial / refuted / not_applicable / unknown (richer than binary). - Observation: {hypothesis_id, capability, question, answer (free text), observation_type, scope_note ("only critical suppliers"), evidence_uploaded, reviewed, reviewed_by}. - empirical_distribution() -> a DISTRIBUTION (confirmed 61 / partial 31 / refuted 8), not one %. - empirical_confidence() -> (confirmed + 0.5*partial) / (confirmed+partial+refuted); n.a./unknown excluded; None until calibrated. - REVIEW GATE: only reviewed observations calibrate — a raw answer never changes a hypothesis (no learning from outliers). Refactor: the hypothesis is now PURE curated knowledge — the binary observations counter and any confidence are removed from CapabilityHypothesis and the YAML; confidence is COMPUTED from the separate reviewed observation stream. Pure, mypy --strict clean. Persistence/aggregation/calibration are 59b/c/d. Non-runtime -> no deploy. 12 tests pass, check-loc 0.	2026-06-28 13:31:43 +02:00
Benjamin Admin	2d2cb2a244	feat: Certification Capability Hypotheses — capability-centric library + empirical confidence The bottleneck is knowledge, not the endpoint. This builds the knowledge the Onboarding Advisor needs, restructured per the user's key insight: NOT "ISO27001 -> 30 capabilities" but each hypothesis as its own object "capability -> supported_by: [certs]". A capability is written ONCE with all supporting certs, so the shared management-system core (document control, incident, supplier, audit, access, asset, monitoring, training, crypto, release, risk) covers most certifications with ~18 hypotheses instead of ~300 — and multi-certification merges AUTOMATICALLY (a company's inferred caps = every hypothesis whose supported_by intersects its certs). Welt-1 throughout: "IF cert present, EXPECT capability (verification required)", never "erfüllt". Capabilities NO cert suggests (SBOM, signed updates, CVD, support period) have no hypothesis -> they stay in the delta and get asked. confidence is EMPIRICAL: computed from real-onboarding observations (confirmed/(confirmed+refuted)), None until calibrated — never an LLM/expert score (record_observation + empirical_confidence). The long-term moat: knowledge that learns from reality, not from a norm. compliance/onboarding/hypotheses.py (resolve_for_certifications / inferred_hypotheses / empirical_confidence / record_observation) feeds the existing advisor_start unchanged; the demo now runs on the curated library. Pure, mypy --strict clean, library is DATA (no norm text, no real names). Non-runtime -> no deploy. 12 tests pass, check-loc 0.	2026-06-28 13:16:45 +02:00
Benjamin Admin	3ba90f49cf	feat: Smart Onboarding Advisor — make the knowledge usable in onboarding (ADR-012) The user-named "right next runtime step": stop building knowledge, start using it automatically in onboarding — no sales training, no regulation picking. compliance/onboarding/ is an ORCHESTRATOR (not a new engine) wiring Company 2A -> RS-005 -> optimization -> completeness: advisor_start(input, cert_hypotheses, target_requirements, ...) -> AdvisorResult From (company + products + certifications + target) it returns inferred_assumptions, rejected_assumptions, next_best_questions (<=5, ranked by information_gain + leverage + unknown_high_risk + evidence_missing, each self-explaining), capability_delta, top_measures, evidence_requests, unsupported_domains, completeness_summary. apply_answer() updates the profile (delta shrinks). Welt-1 throughout: certificates REDUCE questions but satisfy nothing automatically (verification_required); relevance(evidence,target) keeps ISO 14001 out of the CRA result. Certificate->capability hypotheses + target requirements are INJECTED (curated knowledge, outsourced; not in code). All 7 acceptance criteria pass; mypy --strict clean. First app-caller wiring the engines into a product flow — still no endpoint/persistence, so 0 runtime effect -> no deploy yet (deploys when POST /onboarding/advisor-start + frontend are wired). check-loc 0.	2026-06-28 12:45:49 +02:00
Benjamin Admin	80bf1993e0	feat: Journey Matcher — the delta explains the journey (Delta -> Journey, ADR-011) The sanctioned last architectural building block. Reverses the order: not Goal -> Journey -> Delta but Goal -> Required -> Delta -> Journey. A Journey is the EXPLANATION of the Capability Delta, not its cause — so this is a Matcher/Explainer, not a Selector. New module compliance/journey_matcher/ = the third independent, interchangeable function of the pipeline, beside Company 2A (Evidence -> Capability) and RS-005 (Capability -> Delta): match_journeys(delta, journeys, context) -> ranked, auditable explanation - Looks ONLY at the Capability Delta — never at certificates, regulation, tenders or the goal. Journey signatures are certificate-agnostic capability clusters (Input -> Output pattern). - score = share of the delta a journey explains (recall over the missing capabilities); journey_only documents where a journey reaches beyond the delta so a broad journey is not silently preferred. - Deliberately dumb + deterministic (pure set overlap; NO ML/embeddings/LLM), fully auditable (matched / unexplained / journey_only / context signals); a learning ranker can sit on top later. - Signatures injected, engine hermetic. mypy --strict clean. Validated on the real patterns (demo): a CRA+MaschinenVO delta ranks the convergence journey 100%, "ISO27001 -> CRA" 56% (misses the machine-safety caps), "ISMS -> TISAX" 0%. This resolves the "Scope -> Journey" jump from Customer Mission #1. Freeze exception explicitly authorised; non-runtime -> no deploy. 12 tests pass, check-loc 0.	2026-06-28 10:36:43 +02:00
Benjamin Admin	aa99111a87	feat(completeness): Regulatory Completeness Engine — auditable coverage, not confidence Phase A½. The move from feature to product development: for every assessment, answer "how sure are we that this answer is COMPLETE?" — different from confidence. The product never claims full coverage; it makes its own knowledge state transparent and auditable. Shows what we do NOT know and why. - compliance/completeness/: assess_completeness(identified, corpus_status, uncertain, assumptions, assessed_obligations) -> CompletenessReport. Separates IDENTIFIED from ASSESSED (validated corpus AND determined applicability) and justifies every gap. Two kinds of open: corpus gap (future_corpus) and applicability uncertainty (query_required + deciding question, e.g. Data Act / generates_usage_data). - The metric is COUNTS, never a single percentage: "Identifiziert N · bewertet M · offen K · Unsicherheiten U · Begründung ja" + an honest audit statement. - ADR-007: auditable honesty; phase order A factory -> A½ Completeness -> B new domains; the transparency selling point. Deterministic, no LLM; corpus status + obligation count injected. - reference suite: "Regulatory Completeness" section runs an industrial-dishwasher assessment (assessed CRA/MaschinenVO; open EMV/Environmental=future_corpus, Data Act=query_required) and notes Environmental flips open->validated automatically once the corpus lands. 11 completeness tests (54 with adjacent modules), mypy --strict clean (15 files), check-loc 0. Product code with no app caller + ADR/reference = non-runtime -> no deploy (ADR-001). Freeze-safe. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-27 14:16:12 +02:00
Benjamin Admin	07e392913f	feat(knowledge-intake): classify a document + assess its impact before extraction Phase A1. The real knowledge production is not writing — it is TARGETED UPDATING: when 20 documents arrive, which 5 change our knowledge and which 15 are ignorable? Before the parser, Knowledge Intake classifies a new document (no content extraction) and intersects its signals with an index of the existing knowledge to emit a Knowledge Package (an impact analysis). - compliance/knowledge_intake/: build_knowledge_index(patterns, playbooks, reference_scenarios, obligation_index) + assess_document_impact(descriptor, index) -> KnowledgePackage. Deterministic, NO content extraction, NO LLM. Surfaces affected capabilities / playbooks / transition patterns / reference scenarios / (injected) obligations, whether it is a new domain, and a triage level (HIGH / LOW / NONE / NEW_DOMAIN) with a recommendation. - ADR-006: Knowledge Intake = classify + impact before extraction; full factory Intake -> Package -> Parser -> Draft -> Review -> Published; phase order A1 Intake / A2 Draft / A3 Review. - reference suite: "Knowledge Intake" section triages 3 example documents (CRA SBOM-FAQ -> high, 14C/2PB/3RTS/2Obl; environmental guidance -> new_domain; marketing blog -> ignorable). Section lives in _helpers.py to keep generate.py under the 500-LOC budget. - Honest known refinement surfaced by intake: regulation-ID normalization (CRA vs Cyber Resilience Act). 10 intake tests (60 with the adjacent modules), mypy --strict clean (16 files), check-loc 0. Product code with no app caller + ADR/reference = non-runtime -> no deploy (ADR-001). Freeze-safe. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-27 13:58:59 +02:00
Benjamin Admin	b6cfc0a503	feat(knowledge-production): Playbook Draft Generator — prepare the corpus deterministically The bottleneck is not content, it is knowledge PRODUCTION. Instead of writing 200 playbooks by hand, generate drafts deterministically from data the software already owns, then have an expert review them. Mirrors the legal pipeline (Gesetz -> Parser -> Obligation -> Review) for BreakPilot's own knowledge: new Capability -> Registry -> Transition Pattern -> Playbook Draft Generator -> Expert Review -> versioned Playbook. - compliance/knowledge_production/: generate_playbook_draft(capability, requirement, control_links) + drafts_from_pattern(pattern) -> one PlaybookDraft per delta capability. Owned fields (why / closes_regulations / expected_evidence / typical_controls) are assembled with per-field provenance; the practitioner know-how (tools / process_steps / how_others) is left as an explicit TODO. - DraftStatus lifecycle (Freigabestatus): draft_generated -> in_review -> reviewed -> validated -> proven. Deterministic, NO LLM in the core (any model enrichment stays offline/advisory/propose-only). - ADR-005: extends "the engine does not change, the corpus grows" with "and the corpus is not written by hand — it is deterministically prepared, then curated". - reference suite: "Knowledge Production" section turns the convergence pattern into 12 auto-assembled drafts (why/closes/evidence filled, tools/steps TODO) -> review 12 drafts, don't write 12 playbooks. 10 tests (50 with playbook/optimization/transition/company), mypy --strict clean, check-loc 0. Product code with no app caller + ADR/reference = non-runtime -> no deploy (ADR-001). Freeze-safe. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-27 13:31:31 +02:00
Benjamin Admin	78f0ffa9de	feat(playbook): Implementation Playbooks — the Berater renderer ("wie komme ich dort hin?") Roadmap item 4. After WHAT applies / WHAT is missing / WHICH first, the GF asks HOW. The Implementation Playbook renders, for one capability, the full journey — why / which regulations it closes / tools / process / evidence / controls — and chains the Optimization Roadmap into per-measure playbooks. Another renderer over the same Capability spine (ADR-003/004), not a new engine: ~95% of the data already exists, it just needs a different rendering. - compliance/playbook/: build_playbook() + playbooks_for_plan() (chains optimization -> playbook, acyclic; reuses leverage for "closes which regulations"). Capabilities without curated content render as honest status:missing stubs — the content-owed signal. - knowledge/implementation_playbooks/: curated knowledge layer (Reasoning Knowledge Acquisition), two deep expert drafts (SBOM, CVD/PSIRT, status draft, expert-draft-not-normative) + README. The bottleneck is now CONTENT, not software; Playbook (own knowledge) != regulatory domain. - ADR-004: Implementation Playbooks = renderer + knowledge layer; content is the bottleneck. - reference suite: "Implementation Playbook" section renders the SBOM journey + Roadmap->Playbook table (high-leverage caps flagged "fehlt (Inhalt)" — content backlog, highest leverage first). - refactor: extracted markdown helpers to reference_scenarios/_helpers.py to keep generate.py under the 500-LOC budget. 9 playbook tests (40 with optimization+transition+company), mypy --strict clean, check-loc 0. Product code with no app caller + knowledge/ADR/reference = non-runtime -> no deploy (ADR-001). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-27 10:38:13 +02:00
Benjamin Admin	cfafa31ea2	feat(optimization): Regulatory Optimization — Roadmap/Management renderer over the Capability Delta Roadmap item 5. GAP analysis and measure-prioritisation are the SAME computation: Required − Known = the Capability Delta. The Capability Delta Engine (RS-005) computes it once; renderers read that ONE delta. Interview Renderer (missing info → questions) was already built; this adds the Roadmap/Management Renderer (missing capabilities → measures ranked by regulatory leverage). - compliance/optimization/: regulatory_leverage() + select_within_budget() (pure leverage math) + roadmap_from_delta(assessment, ...) — the keystone binding optimization to the RS-005 delta (dependency optimization → transition_reasoning, acyclic; the delta engine stays hermetic). leverage(measure) = number of regulatory requirements it closes at once (e.g. patch management → CRA+MaschinenVO+IEC62443+ISO27001 = 4). No new corpus, no new meta-model class (freeze v1.0). - Welt-1 honesty: percentages are exact count ratios over the IDENTIFIED requirements (the known delta), never "% gesetzeskonform". - reference suite: "Regulatory Optimization" section runs the SAME convergence delta → ranked measures + budget answer + the management sentence "of N identified requirements you close M with the top-K measures (X%) — highest regulatory leverage". - ADR-003: Capability Delta Engine — one delta, many renderers; rename Gap → Capability Delta. 13 optimization tests (31 with transition+company), mypy --strict clean, check-loc 0. Product code with no app caller + ADR/reference = non-runtime → no deploy (ADR-001). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-27 09:49:38 +02:00
Benjamin Admin	66be23f0c4	feat(convergence): first Regulatory Convergence Pattern (ISO27001 -> CRA + MaschinenVO) The first multi-regulation pattern: each capability declares `covers_targets`, so we can answer the convergence USP — "which capability satisfies CRA AND MaschinenVO at once?" - knowledge: transition_pattern_iso27001_to_cra_maschinenvo_v1.yaml (pattern_type: regulatory_convergence, status draft). The cyber-safety bridge = MaschinenVO Annex III 1.1.9 "protection against corruption" overlapping CRA integrity. 4 convergence capabilities cover BOTH; 5 CRA-only; 3 MaschinenVO-only. - product: compliance/transition_reasoning/convergence.py — regulatory_convergence() pure/deterministic/computed-not-stored, no new graph/class (freeze v1.0 untouched). No app caller yet -> non-runtime, no deploy (ADR-001). - reference suite: Cross-Regulation Capability Mapping section renders the customer sentence "von N neuen Massnahmen erfuellen M gleichzeitig CRA und MaschinenVO". - README: term -> Regulatory Transition / Convergence Pattern; covers_targets documented. - tests: test_regulatory_convergence (18 transition+company pass), mypy --strict clean. Curated expert knowledge, AI first draft (L1/draft) — Annex/Article refs indicative, review_required by a machinery-safety expert. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-27 09:12:30 +02:00
Benjamin Admin	77de7e794c	feat(transition): Transition Reasoning v0 (RS-005) — Transition Planning Engine Second reasoning mode, scope per user: the engine owns the INFORMATION GAPS, not the questions. assess_transition(context, target_requirements, company_profile) emits ranked TransitionQuestionRequest {capability, control, reason, question_intent, expected_evidence, priority, information_gain} -- NOT rendered question text. Rendering (intent+subject->sentence) is a separate swappable layer (RS-005.1), not here. Consumes the Company Capability Profile (2A) as "have" + injected TargetRequirement (Execution-owned placeholder) as "required" -- no required-capability data in product code (EMPTY_REQUIREMENTS, mocks only in tests). A certification-derived capability is probably_covered (Welt 1) -> a confirmation request, never already_covered/"erfuellt". Deterministic, computed-not-stored, no percentages. Activates 2A/2C/RCI (first consumer of the Company profile). Freeze-respecting: additive package, no new graph/base class/meta-model class. 9 tests, mypy --strict clean, LOC ok. No endpoint/UI/RAG; question rendering deliberately deferred to RS-005.1. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-27 07:31:11 +02:00
Benjamin Admin	6ccc6c87c1	feat(capability): Master Capability Registry v0 (Phase 2C, Compliance Execution domain) Third instance of the identity-machine pattern (after Master Controls and Master Obligations). New compliance/capability/ package: MasterCapability with stable MCAP ids, CapabilityCandidate minting, seven typed relation types, a VERSIONED derivation policy, and identity lifecycle (merge/split/deprecate/redirect with provenance). Stored: identities, sources, relationship types, policy versions, lifecycle events, provenance. Derived (never stored): confidence/status via evaluate_relation under a policy version. Hard rule (structurally guarded): a certification alone can never yield CONFIRMED — only CONFIRMS + concrete artifact (or expert) does. Built from the Reasoning session per user directive but this IS the Compliance Execution model (Execution owns Capability) — handed off via the board. Metadata-first: CapabilityRelation is registry metadata, NOT a new meta-model class (freeze v1.0 untouched). No Company-Gap, no real ISO/cert mappings, no UI/RAG, no generic canonicalization engine. 11 tests; mypy --strict clean; LOC ok. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 21:35:12 +02:00
Benjamin Admin	8c893ca783	feat(company): Company Intelligence 2A — Company Capability Profile foundation HEAD of the spine Company->Capability->Product->Regulation->Obligation->Procedure->Evidence. New compliance/company/ package: CompanyContext container + a four-state trust model (declared/inferred/confirmed/unknown). Hard rule (structural): a certification yields at most an INFERRED candidate and is never auto-treated as CONFIRMED/"erfuellt". A certification produces evidence-of-capability; only real ExistingEvidence promotes a capability to CONFIRMED. Ownership: Reasoning owns the container + trust-state; the Certification->Capability mapping is Execution's domain, consumed via an injected contract. No mapping data in product code (tests inject mocks). No endpoint/UI/RAG/new regs/controls; no meta-model classes (freeze v1.0 untouched). 8 tests; mypy --strict clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 14:59:42 +02:00
Benjamin Admin	a5687bbc65	feat(rci): Regulatory Change Intelligence foundation (delta over the stored map) RCI/Delta as a read-/reasoning layer ON TOP of the product-first pipeline. Answers "what changes relative to my existing Regulatory Map?" — NOT "what does the new law say in general". No UI, no ingestion (newsletter/mailbox), no RAG, no new regulations/controls, no legal evaluation outside the stored map. - 4 core objects (compliance/rci/schemas.py): ComplianceBaseline (snapshot of profile + map + registry obligations + required/present evidence), RegulatoryChange (simulated/provided INPUT), ObligationDelta (delta_type NEW|CHANGED|REMOVED|ALREADY_COVERED|NEEDS_REVIEW|NOT_APPLICABLE), ChangeImpactSummary. delta_type is a THIRD vocabulary, disjoint from ClaimCoverage (Welt 1) and ComplianceStatus (Welt 2). - create_baseline() snapshots the existing pipeline once; assess_change() computes deltas deterministically against the snapshot (no re-evaluation). - 12 tests = the 5 acceptance questions (affects product? new/changed? already covered by evidence? needs human review? not relevant?) + repeal/uncertain-reg/missing-evidence/boundary. Existing pipeline tests stay green; mypy clean; LOC ok. - App/reasoning types only — no compliance-meta-model classes (freeze v1.0 untouched). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 13:45:23 +02:00
Benjamin Admin	50ae9e94d1	feat(interpretation-in-map): judge a customer interpretation within the map (step 5) Thin adapter — it judges the customer's reading WITHIN the already-built RegulatoryMap, it does not assess abstract legal questions and it is not RCI. - Reuses the existing assess_interpretation (no new legal reasoning); the 6 verdicts (plausible/too_narrow/too_broad/partially_correct/unsupported/uncertain) pass through unchanged. - Restricts affected_regulations/affected_obligations to those present in the map (intersection); links to the map's uncertain regulations. - Touched unsupported domains (wastewater/chemicals/...) are reported as future_corpus_domains (future_corpus_needed) — never pseudo-evaluated. - Customer-readable explanation ("Ihre Interpretation ist wahrscheinlich zu eng. … Betroffen in Ihrer Map: CRA."). - POST /reasoning/interpretation-in-map (renders the map, then interprets). - 7 tests; 63 green (existing reasoning MVP stays green), mypy clean, LOC ok. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 10:58:00 +02:00
Benjamin Admin	9312ad18ef	feat(regulatory-map): customer-readable read-model over the scope (step 4) The Map Renderer explains the engine's state, it does not extend it. Pure composition of resolve_product_scope (scope verdict) + derive_obligations (registry-linked obligations + overlaps) into one RegulatoryMap. - product_summary, trigger_facts, applicable/uncertain/excluded regulations, unsupported_domains, overlaps (shared_obligations), shared_evidence, and a customer-readable executive_summary. - No own legal decisions: applicable/uncertain mirror the scope verdict exactly. - Obligations shown ONLY when registry-linkable (registry_anchor) — MaschinenVO/ EMV obligations are proposed, so they render empty + a note, never as linked. Overlaps/shared_evidence likewise filtered to registry-linked members. - Uncertain regulations link to the navigator question that would resolve them (RED -> has_radio_module, DataAct -> generates_usage_data). - Environmental appears only as unsupported_domain; executive_summary has NO percentage (counts + "no further regulations identified" instead). - POST /reasoning/regulatory-map (thin handler). Response types are presentation- level, not meta-model classes (freeze v1.0 untouched). - 9 tests; 56 green (existing reasoning MVP stays green), mypy clean, LOC ok. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 10:36:06 +02:00
Benjamin Admin	4e8eb2dc0e	feat(product-scope): gate Navigator facts, then reuse discover_scope (step 3) Connects the Navigator's fact-gate to the existing reasoning discover_scope — the Scope Engine decides only once the minimum (P0) facts are released. - resolve_product_scope(canonical): if not ready_for_scope -> NEEDS_FACTS (missing_facts + suggested_questions, discover_scope NOT run); else project canonical->reasoning profile and run the EXISTING discover_scope exactly once -> RESOLVED with applicable/excluded/uncertain regulations. - Environmental triggers surface ONLY as unsupported_domains (future_corpus_needed), never as a legal evaluation — transparency, no false completeness. - POST /reasoning/product-scope (thin handler) returns case NEEDS_FACTS or RESOLVED. - No new scope rules, no new regulations, no environmental-law evaluation, no UI, no Go, no RAG, no percent-compliance. Response types are application-level, not meta-model classes (freeze v1.0 untouched). - 6 tests incl. discover_scope spy (0 calls when gated, exactly 1 when ready), category separation, environmental-as-unsupported-only. 47 tests green (existing reasoning MVP tests stay green), mypy clean, LOC ok. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 10:21:27 +02:00
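The fact gate in the commit above can be sketched as follows. This is a minimal illustration, not the repository's code: the `P0_FIELDS` names and the dict-based profile are assumptions (the log only names the P0 topics markets, role, lifecycle, machine/component), and `discover_scope` is passed in as a callable so the "0 calls when gated, exactly 1 when ready" behaviour is easy to observe.

```python
from typing import Any, Callable, Dict, List

# Hypothetical P0 field names; the commit log only names the P0 topics.
P0_FIELDS = ("markets", "role", "lifecycle_phase", "is_machine")


def resolve_product_scope(
    profile: Dict[str, Any],
    discover_scope: Callable[[Dict[str, Any]], Dict[str, List[str]]],
) -> Dict[str, Any]:
    """Gate on P0 facts, then delegate to the existing scope engine."""
    missing = [name for name in P0_FIELDS if profile.get(name) is None]
    if missing:
        # Not ready: report what is missing and do NOT run discover_scope.
        return {"case": "NEEDS_FACTS", "missing_facts": missing}
    # Ready: run the existing engine exactly once and pass its verdict through.
    return {"case": "RESOLVED", **discover_scope(profile)}
```

A spy passed as `discover_scope` would record no calls for an incomplete profile and exactly one call for a complete one.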
Benjamin Admin	78aeedafae	feat(navigator): Product Regulatory Navigator as a thin missing-facts layer Step 2 of the convergence sequence. The Navigator sits over the CanonicalProductRegulatoryProfile (prefilled from company-profile / ProductWizard) and reports ONLY which facts are still missing + prioritized questions to collect them. It decides which facts are needed, NEVER what applies — that stays with the Scope Engine (step 3). No regulation logic, no UI, no Go, no RAG. - NavigatorQuestion (interaction type, NOT a compliance-meta-model class — freeze v1.0 untouched): question_id, target_field, label, why_needed, regulatory_domains_unblocked (static metadata), answer_type, options, priority. - QUESTION_CATALOG: 12 questions over canonical gaps — P0 (markets, role, lifecycle, machine/component), P1 (radio, usage-data, security-function, environmental wastewater/air/chemicals triggers), P2 (structured BOM). - engine: navigate() -> missing_facts + suggested_questions (priority-sorted) + completeness_summary (ready_for_scope = no P0 missing); apply_answers() -> updated profile. Pure field-presence; no scope import. - 8 tests: <=10 questions for a filled company-profile, known facts not re-asked, environmental = trigger questions only (no law evaluation), apply round-trip, P0 ordering, ready_for_scope. 41 tests green, mypy clean, LOC ok. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 10:05:27 +02:00
Benjamin Admin	739a477d3f	feat(profile): CanonicalProductRegulatoryProfile convergence layer (types + mappers + tests) ONE canonical product profile so the Go gap engine and the Python reasoning engine stop diverging ("SPS mit Remote Access" means the same everywhere). gap.ProductProfile LEADS; the reasoning ProductProfile becomes an adapter/DTO. Types + mappers only — no regulation logic, no Go changes, no UI, no new questions. - CanonicalProductRegulatoryProfile mirrors gap.ProductProfile + the Navigator gaps the audit found: economic-operator role, radio_module, generates_usage_data, lifecycle_phase, structured BOM (ProductComponent), safety-vs-security split, machine-vs-component + a forward-looking EnvironmentalImpact domain (wastewater/ air/chemicals triggers — fields only, no rules yet). - Mappers: from_product_wizard (lossless), from_company_profile (prefill incl. the machineBuilder block), to_gap_profile (emits the unchanged gap JSON shape), to_reasoning_profile (projects into the reasoning ProductProfile; AI stays delegated to ai-act/ucca). Only profile->reasoning is coupled; reasoning stays hermetic. - 10 tests = the 10 acceptance criteria incl. ProductWizard round-trip lossless, markets no longer forced ['EU'], and canonical->reasoning->discover_scope proving one semantic profile drives the engine. 33 tests green, mypy clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 09:52:46 +02:00
Benjamin Admin	6673c8052b	fix(reasoning): drop "vollständig" from ClaimCoverage wording [F1 final] "vollständig" still implied fulfillment. potentially_addresses now reads "… adressiert N Pflichten direkt und M teilweise; K werden durch die Aussage nicht berührt. … Dies ist keine Konformitätsaussage." Enum value kept (potentially_addresses chosen over addresses_claimed for product clarity). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 00:49:20 +02:00
Benjamin Admin	5e5002c883	refactor(reasoning): enforce ClaimCoverage (Welt 1) vs ComplianceStatus (Welt 2) boundary [F1] Architecture-validation finding: the implementation mode produced compliance- flavored output ("teilweise erfüllt", "covered") from a mere customer claim, blurring the line to the Execution layer. This is a design decision, not a text fix — the reasoning layer judges only the customer's STATEMENT, never conformity. - CoverageStatus -> ClaimCoverage; values are claim-relative + carry "potential": potentially_addresses / partially_addresses / does_not_address / insufficient_information. - ImplementationAssessment -> ClaimObligationMapping (coverage_status -> claim_coverage); ImplementationResponse -> ImplementationReasoningResponse (assessments -> mappings, + explicit `disclaimer`); request renamed; engine entry assess_implementation -> reason_implementation_claim. - Endpoint /reasoning/implementation-assessment -> /reasoning/implementation-reasoning. - Summary/explanations reworded: "adressiert wahrscheinlich N Pflichten … für eine Bewertung der tatsächlichen Umsetzung sind Nachweise erforderlich (keine Konformitätsaussage)". No "erfüllt"/"abgedeckt" leaks. - New guard test asserts no compliance verdict leaks (no "erfüllt"; disclaimer separates ClaimCoverage from ComplianceStatus). 23 tests green, mypy clean. Discovery (scope/obligations) was already structurally claim-free and unaffected. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-26 00:37:57 +02:00
Benjamin Admin	1607c89459	feat(reasoning): Regulatory Reasoning Engine MVP (scope/obligations/implementation/interpretation) Deterministic reasoning layer ON TOP of the Legal Knowledge Graph (obligation registry) and the Compliance Execution Graph (control mapping/evidence). Answers which regulations apply to a concrete product, which obligations follow, whether the customer's implementation covers them, and whether a customer interpretation is too narrow/broad/plausible. - ProductProfile with tri-state facts (Optional[bool]=None => uncertain, never false security); safe predicate evaluator (no eval). - 6 regulation triggers (CRA/MaschinenVO/RED/EMV/DataAct/NIS2) with missing-fact prompts; 24 obligation scope rules. - CRA obligation_ids RE-USED verbatim from the registry (93 ids) — never re-minted (control_uuid trap); Machine/Data-Act flagged proposed=True. - required_evidence constrained to the framework-agnostic shared evidence catalog; capabilities echo the planned Obligation->Capability layer. - Overlap groups (CRA<->MaschinenVO cyber-safety) + evidence-for-multiple (USP). - 4 endpoints POST /reasoning/{scope,obligations,implementation-assessment, interpretation-assessment}; thin handlers, registered in api/__init__.py. - 22 tests (5 machine-builder scenarios + 10 acceptance questions). No DB migration, no RAG, no new controls. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-25 19:30:53 +02:00
Benjamin Admin	c1ea9458a7	Add met_count and recall_limited_obligations to shadow telemetry Enriches the obligation shadow telemetry with two fields for cross-company evaluation: met_count (covered obligations) + recall_limited_obligations (which obligations are recall-limited); this enables concentration analysis across companies. 7-company shadow: 136 control findings → 29 obligation findings (4.7×); recall_limited only 6/29, concentrated on third_country/safeguards in 2/7 companies → LLM fix is bounded. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-24 20:15:45 +02:00
Benjamin Admin	0631a98bdd	Mark recall-limited obligations in DSE shadow telemetry Separates three categories in the shadow instead of a blanket FAILED: - real gap (failed_by_current_checker) - redundant control FP (collapses via OR to MET) - checker reach problem (recall_limited) obligation_taxonomy.py: decision_method_required=LLM for recipients_disclosed, third_country_transfer_disclosed, safeguards_disclosed, safeguards_accessible (versioned registry artifact until a DB table exists, v1 spec). Empirically: TeamViewer 0/22 kw+emb despite the obligation being met (cos 0.49-0.57) → CONTENT/LLM class, not a threshold fix. compute_obligation_shadow segregates FAILED/PARTIAL via requires_llm(): teamviewer 5 findings → 2 real + 3 recall_limited. 9 new unit tests (41 green in total). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-24 13:46:21 +02:00
Benjamin Admin	c3542f7dfe	feat(dse): obligation shadow telemetry Wires the Obligation Aggregation Engine into v3_engine as Layer 4 (SHADOW): it additionally derives obligation results from the results, EXCLUSIVELY for telemetry. It does NOT modify results; user-visible findings are unchanged. - _obligation_shadow.py: fetch_obligation_markers (legal_obligations + applicability) + compute_obligation_shadow (pure): legacy_control_findings, obligation_shadow_results, collapse_factor, na_count, met_failed_delta, top_collapsed_obligations - met signal = legacy passed (no additional checker call/key) E2E (3 companies, real engine): 57 control findings → 14 obligation findings (4.1×); redundancy collapses where evidence exists, real gaps stay FAILED. 6 unit tests green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-24 12:59:52 +02:00
Benjamin Admin	7ec29999a2	feat(obligation): obligation applicability predicates Minimal applicability hook for the Obligation Aggregation Engine: decides from the document text whether a conditional obligation is applicable (True/False/None). - has_third_country_transfer · uses_legitimate_interest · direct_marketing (+ alias legitimate_interest_or_public_task) - unknown predicate → None → caller keeps default=applicable (fail-safe, never a silent NA) - profiling/employment/telecom/health/data_act follow as the next batch Re-benchmark (Opus-GT, 3 companies): the predicates correctly detect transfer/legitimate interest/direct marketing → no false NA; the NA-flip probe confirms FEHLT→NA without transfer. 14 unit tests green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-24 12:43:42 +02:00
Benjamin Admin	402a42d30d	feat(obligation): obligation-level aggregation engine First implementation of the Legal Obligation Layer v1: aggregates assessments at criterion/control level into findings at obligation level (Regulation → Legal Obligation → Control → Criterion). - regulation-agnostic (obligation_id/tier/met/legal_basis/conditional) - fail-safe: LM applicable=false→NA · none met→FAILED · all→MET · some→PARTIAL; BP/OPT covered→MET, otherwise OPEN (never FAILED); LM not assessable→UNDETERMINED (legacy result kept) - redundancy collapse via OR per legal_basis requirement → no artificial PARTIAL - applicability as a hook (predicate engine follows separately) Shadow benchmark (Opus-GT, 3 companies): 38 control findings → 13 obligation findings (2.9×); ~23 redundant false positives structurally corrected, real gaps preserved, PARTIAL=0. 16/16 unit tests green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-24 12:28:03 +02:00
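The fail-safe rules listed in the commit above could look roughly like this. It is a sketch under assumptions: the function name, the tuple shape of a control result, and the handling of a mix of assessed and unassessed requirements (unassessed ones are ignored once at least one requirement is decided) are illustrative choices, not taken from the repository.

```python
from typing import Dict, List, Optional, Tuple

# (legal_basis, met) per control; met=None means the checker could not assess it.
ControlResult = Tuple[str, Optional[bool]]


def aggregate_obligation(
    tier: str, applicable: Optional[bool], controls: List[ControlResult]
) -> str:
    if tier != "LEGAL_MINIMUM":
        # BEST_PRACTICE / OPTIONAL never fail; they are covered or still open.
        return "MET" if any(met is True for _, met in controls) else "OPEN"
    if applicable is False:
        return "NA"  # applicable=None stays applicable (fail-safe default)
    # Redundancy collapse: OR over all controls that share a legal_basis.
    per_basis: Dict[str, Optional[bool]] = {}
    for basis, met in controls:
        prev = per_basis.get(basis)
        if met is True or prev is True:
            per_basis[basis] = True
        elif met is False or prev is False:
            per_basis[basis] = False
        else:
            per_basis[basis] = None
    decided = [value for value in per_basis.values() if value is not None]
    if not decided:
        return "UNDETERMINED"  # nothing assessable -> caller keeps the legacy result
    met_count = sum(decided)
    if met_count == 0:
        return "FAILED"
    return "MET" if met_count == len(decided) else "PARTIAL"
```

Two controls that evidence the same `legal_basis`, one passing and one failing, collapse to a single met requirement, which is why redundant controls no longer produce an artificial PARTIAL.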
Benjamin Admin	067118b12d	fix(cascade): give OVH/gpt-oss reasoning headroom so Tier-2 isn't silently dead gpt-oss-120b is a reasoning model: it spends output tokens on chain-of-thought before the answer. deep_check called _call_ovh with max_tokens=400, which length-capped it mid-reasoning -> content=null -> the OVH tier returned nothing and the cascade always skipped Tier-2. Floor the OVH budget to >=2000, fall back to reasoning_content when content is null, and raise the client timeout to 90s for the slower reasoning path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-22 17:37:48 +02:00
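The two guards described in the commit above (a token floor and a fallback to `reasoning_content`) can be illustrated with plain functions over an OpenAI-style chat-completion payload. The helper names and the assumption that the response is a dict with `choices[0].message` are hypothetical; only the 2000-token floor and the `content` / `reasoning_content` fields come from the commit message.

```python
from typing import Any, Dict, Optional

OVH_MIN_TOKENS = 2000  # floor named in the commit message


def floor_budget(max_tokens: int, minimum: int = OVH_MIN_TOKENS) -> int:
    """Never give the reasoning model less than the minimum output budget."""
    return max(max_tokens, minimum)


def extract_answer(response: Dict[str, Any]) -> Optional[str]:
    """Prefer content; fall back to reasoning_content when content is null."""
    choices = response.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content") or message.get("reasoning_content")
```

With these guards a length-capped reply that carries only reasoning text still yields a string, so the tier is no longer skipped silently.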
Benjamin Admin	5ff08a240b	feat(dse): tiered 3-state evaluator + Layer-3 wiring (compliance_tier) Tiered evaluation with compliance_tier gating (only LEGAL_MINIMUM determines ERFÜLLT/TEILWEISE/FEHLT, i.e. met/partial/missing; BEST_PRACTICE/OPTIONAL → recommendations). Deterministic-first: EMBEDDING presence + cached Haiku only for sufficiency → reproducible (resolves the measured judge variance). Layer-3 in v3_engine is gated on tiered_criteria, fail-safe (UNBESTIMMT, i.e. undetermined → legacy). Open calibration: presence threshold (step 2). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-22 17:37:48 +02:00
Benjamin Admin	3e3644f83d	feat(checkers): platform router + Haiku sufficiency tier; cookie is first consumer Generalise "Embedding finds, Claude decides" into the shared Pruefer-Library: - router.route_and_check dispatches control -> sensor_classification -> Checker. - build_spec reads sensor_classification (CONTENT/LLM -> judge=haiku, the validated sufficiency tier; the Qwen-first cascade is disproven for sufficiency). - LLMChecker gains a Haiku-direct tier (reuses the validated deep_check prompt). - Cookie Layer-3 now routes through route_and_check instead of bespoke code, so cookie is the first real router consumer -- proves the architecture end-to-end. Reproduces the validated result via the shared path: FN 159->14, recall 0.13->0.92, precision 0.89 (vs bespoke 12/0.93/0.90 -- within Haiku noise). Tests: 10/10 (router dispatch + build_spec + haiku tier + cookie rewire). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-22 17:37:48 +02:00
Benjamin Admin	e809d0bc1c	feat(cookie): Layer-3 sufficiency-judge — Haiku re-judges embedding/boost rescues The embedding/boost auto-rescue is intentionally optimistic (finds the topic, not fulfilment) -> 159 FN over-rescues vs Opus-GT (recall 0.13). Layer-3 re-judges exactly the rescued passes with the validated Haiku judge (cohort cookie_sufficiency_v1 P0.89/R0.91) -- NOT the Qwen-first cascade (local is disproven as a sufficiency judge) -- and un-passes them when the obligation is not concretely met. Gated to the full check (not skip_llm). Measured (5-firm Opus-GT, engine+L3): FN 159->12, recall 0.13->0.93, precision 0.96->0.90 (276 rescues corrected). "Embedding finds, Claude decides." Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-22 17:37:48 +02:00
Benjamin Admin	869e7aeb1e	fix(cookie): gate non-COOKIE_POLICY controls out of the cookie-policy scan The cookie agent loaded 100 controls, 11 of which have no COOKIE_POLICY in applicable_artifacts -- Security/TOM/Audit (PROCESS) or Banner-behaviour (BEHAVIOR) controls that produce nonsense findings against a cookie policy (e.g. "TOMs not documented"). Add a cookie classification gate (analogous to the DSE gate, keyed on COOKIE_POLICY, without the needs_review carve-out since the artifact signal is decisive and the set is inventory-verified). Controls are routed out, not deleted. Effect vs Opus-GT: FP 16->11, FN 179->159; the remaining FN=159 over-rescue is a separate (judge/criteria) question, not routing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-22 17:37:48 +02:00
Benjamin_Boenisch	38a347a82a	feat(platform): live-wire AGB v2 + DSE v3 + Architektur-Tab (#29) AGB v2 (decision_method routing, 71%FP->~0) + DSE v3 (4-layer, recovered from container) + Architektur-Tab into /sdk/agent live path. Incl CI robustness (detect-changes.sh + PR-head checkout) + security (hardcoded Qdrant key removed, gitleaks allowlist). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-21 12:58:26 +00:00
Benjamin Bönisch	43e02f794a	feat(cra): consume SBOM + DAST findings from the scanner MCP Sharang's compliance-scanner-agent exposes SBOM (sbom_vuln_report) + DAST (list_dast_findings) as separate MCP tools (not via list_findings). The new fetch_all_findings(repo_id) pulls list_findings + SBOM + DAST in ONE MCP session and normalizes them into the Finding schema: - SBOM: one finding per vulnerable package (not per CVE), cwe=CWE-1395 -> deterministically CRA-AI-22 (robust against package names like "sqlite"). - DAST: cwe/endpoint/vuln_type carried over -> mapping via cwe/keywords. assess-from-scanner uses fetch_all_findings + returns source.breakdown (code/sbom/dast). DAST has no repo_id filter in the MCP -> dast_repo_scoped:false (deployment-wide, flagged transparently). Real MCP data: Kitchenasty 58 code + 35 sbom + 81 dast -> 174 mapped (coverage 94.3%, all 35 SBOM -> CRA-AI-22). Also includes the Qdrant->prod copy script (#42, verbatim macmini->prod). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-18 12:05:05 +02:00
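The SBOM normalization rule from the commit above (one finding per vulnerable package rather than per CVE, always tagged CWE-1395 and mapped to CRA-AI-22) might be sketched like this. The input shape (`package` and `cve` keys) and the output field names are assumptions for illustration; the real Finding schema is not shown in the log.

```python
from typing import Dict, List


def sbom_findings(vulnerabilities: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Group raw SBOM vulnerabilities into one finding per package."""
    cves_by_package: Dict[str, List[str]] = {}
    for vuln in vulnerabilities:
        cves_by_package.setdefault(vuln["package"], []).append(vuln["cve"])
    return [
        {
            "source": "sbom",
            "package": package,
            "cves": sorted(cves),
            # Fixed CWE, so the mapping never depends on the package name.
            "cwe": "CWE-1395",
            "control": "CRA-AI-22",
        }
        for package, cves in cves_by_package.items()
    ]
```

Keying the control on the fixed CWE rather than on keywords is what keeps a package named "sqlite" from being mistaken for a database-related finding.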
Benjamin Bönisch	8f21650d74	feat(sdk): customer documents + CRA reporting, screening removed from the frontend - /sdk/dokumente: customer view restricted to published legal documents (view + download); proxy with an allow-list for /public only, so templates/drafts/generator stay unreachable. - /sdk/cra-meldewesen: CRA Art. 14 reporting (24h/72h/14d cascade) with deadline tracking + ENISA SRP export draft (no live API). Backend: cra_meldewesen (pure, tested) + cra_incident_store (schema-neutral over compliance_cra_documents) + /api/v1/cra/incidents (additive, contract-safe). - Screening (self-scan) removed from the frontend: flow-stepper entry hidden (visibleWhen), dashboard tile + import button removed. Repo scanning runs externally in the Compliance Scanner; the backend router stays mounted for now (contract stability). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-17 21:21:28 +02:00
Benjamin Bönisch	72093e5501	fix(cra): map scanner findings completely + lower assess-from-scanner latency Point 2 (coverage): semgrep/gdpr findings without a CWE stayed unmapped (~21%). The mapper now uses the scanner rule_id + targeted keywords (gdpr -> data minimization CRA-AI-17, path-traversal/prototype-pollution -> CRA-AI-20, nginx-header/Docker hardening -> CRA-AI-1/4, insecure-websocket -> CRA-AI-15). Real scanner data: unmapped 19/92 -> 0/92 (coverage 100%). Point 3 (latency): enrich_findings_with_breadth ran ~6 aggregate queries per (use_case,sub_topic) pair but used only the list. Now ONE batched query (breadth_controls_batch) for all pairs + a process cache (TTL 1800s). macmini: cold 0.23s / warm 0.000s. Prod root cause: atom_classification without a (use_case,sub_topic) index after the DB swap -> index recommended to the DB owner. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-17 13:17:51 +02:00
Benjamin Admin	fda94afd5f	fix(cra): prod hang guard for /readiness machinery + more robust datasheet JSON parse #1 _machinery_obligations: SET statement_timeout=4s + run_in_threadpool; on prod the machinery query hung for ~30s (slow/unindexed DB after the DB swap) and blocked the async worker. Now: when the query is slow, a graceful 'no machinery obligations' instead of a hang. (Missing prod index = Controls/DB session.) #2 parse_grenzen_json: tolerant of ```json fences / JSON wrapped in prose (hosted models such as OVH sometimes ignore response_format) → datasheet extraction returns fields via the OVH fallback as well. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-17 07:39:39 +02:00
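A tolerant parse of the kind described for parse_grenzen_json in the commit above can be approximated by taking the span from the first opening brace to the last closing brace, which covers both Markdown code fences and surrounding prose. The function name and the choice to return None on failure are assumptions for this sketch; the repository's actual implementation is not shown in the log.

```python
import json
from typing import Any, Dict, Optional


def parse_tolerant_json(raw: str) -> Optional[Dict[str, Any]]:
    """Parse the span from the first '{' to the last '}' as one JSON object."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end < start:
        return None  # no object-shaped span in the reply
    try:
        parsed = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    # Only accept a JSON object; a bare list or scalar is not a field mapping.
    return parsed if isinstance(parsed, dict) else None
```

This deliberately handles only a single object per reply; a reply containing several separate objects would fail to parse and yield None rather than a guess.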
Benjamin Admin	9e2655bfef	fix(cra): IACE create id wrapper + MaschinenVO gets its own section 1) createProject read proj.id, but the create response is {project:{id}} → 'Projekt anlegen' (create project) was broken. Now proj.project?.id. Verified E2E (create→put limits_form→get→delete = 200). 2) MaschinenVO safety obligations were mixed into the CRA cyber buckets (code/process/docs) → miscategorized (machine safety ≠ CRA Annex I cyber). Now a separate response list machinery_guideline + a separate frontend section 'Maschinensicherheit (MaschinenVO 2023/1230)'; the glued-on 'MaschVO' badge is dropped as a result. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-17 00:12:52 +02:00
Benjamin Admin	fae826e1f7	fix(cra): 35B datasheet extraction, thinking mode off (think=false) qwen3.5:35b-a3b is a thinking model → it generated reasoning first and hit the 90s timeout → empty extraction. llm_cascade is additively extended with a think param (the cache key includes think); the datasheet extractor sets think=False → clean JSON in ~1s. Default unchanged for all other cascade users. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 20:22:57 +02:00
Benjamin Admin	b217429d39	feat(cra): datasheet extraction on local 35B + llm_status fix llm_cascade is additively model-aware (optional model param, the cache key includes model_hint → no collision; default unchanged for all other users). The datasheet extractor now uses qwen3.5:35b-a3b (CRA_DATASHEET_MODEL, the same model as the Compliance Advisor) for better semantic mapping. Plus llm_status (ok|empty|unavailable) + logging instead of a silent except; the frontend shows a notice on 'unavailable' instead of empty fields (important on prod without local Ollama → cascade fallback or notice). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 19:53:48 +02:00
Benjamin Admin	cfdc5fe277	feat(cra): datasheet→limits extractor (hybrid, local 35B) Hybrid extraction datasheet → IACE limits (ISO 12100): deterministic detector (interfaces/units via regex) + local 35B via llm_cascade (Qwen-local-first) for the semantic mapping onto the real LimitsFormData keys. Invent nothing: a field that is not in the text stays empty, and every filled field carries a source quote. Essential ISO 12100 fields that stay empty → targeted follow-up questions (foreseeable_misuses, person_groups, qualification, temporal_limits …). Endpoint POST /api/v1/cra/extract-datasheet. 13 tests green (pure parts). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 19:06:07 +02:00
Benjamin Admin	62fafaaec5	feat(cra): MaschinenVO hazard derivation + cyber-safety bridge 3-tier MaschinenVO verdict (direct / safety-relevant / not relevant) from a hazard-to-persons signal: a component is not a machine, but if its function can endanger people on failure OR manipulation (motion, laser/eye, force, temperature, electrical), it is safety-relevant; the obligation falls on the machine builder, the supplier provides evidence, and a cyber attack can override the safety function (cyber-safety bridge). OWIS-with-laser thus correctly lands as a 'safety-relevant component'. Engine + /readiness additive; frontend: hazard question + hazard types, MaschinenVO result block. Presets updated (OWIS: laser+motion, Zwick: motion). 22 tests green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 18:48:52 +02:00
Benjamin Admin	3afb0e7f4d	feat(cra): neutral front-door verdict engine (mandatory/advisable/not affected) Pure, deterministic verdict layer over the existing Annex III/IV classification (no fourth classifier): separates legal obligation from market pressure. Core: the placing on the market (from 11.12.2027), not the time of development, decides; existing products that continue to be sold after the deadline fall under the CRA. Producer types (component/end_device/machine_integrator/software_app) control the default assumptions (plant builder: connectivity/OTA assumed) + the verdict emphasis (component => market pressure). Plus an evidence checklist (SBOM/VDP/patch/lifecycle/threat model/logging/auth/incident) + maturity level. /readiness additively extended (verdict/maturity/digital_elements/producer_type). 15 tests green. Examples: OWIS PS90+, ZwickRoell roboTest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 17:17:55 +02:00
Benjamin Admin	7aabfbe5b5	feat(controls): tenant suppression, a per-tenant applicability override Shared layer for all surfaces (workspace lawyers, cyber-risk project, admin): a tenant marks a control as "not applicable" → it is hidden in that tenant's use-case views (and, in future, repo scans). - Migration 156: compliance.control_suppressions (PK tenant_id+control_uuid), reversible (active + reverted_*), auditable (actor/reason/created_at). [migration-approved] - Service control_suppression: suppress/revert/list_suppressions + suppressed_control_uuids (shared filter). - Routes: GET/POST /v1/controls/suppressions + POST .../{uuid}/revert (X-Tenant-ID). - controls_for_use_case: optional X-Tenant-ID + include_suppressed; suppressed controls are hidden by default (never deleted), suppressed_count, suppressed flag per control. Agents/CRA without a tenant are unaffected. - Tests: request validation + import safety (E2E cycle proven against macmini). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-16 16:35:38 +02:00


549 Commits