breakpilot-compliance

Benjamin_Boenisch/breakpilot-compliance

Commit Graph

Author	SHA1	Message	Date

Benjamin Admin

c39787ad96

fix(onboarding): separate observation vs requirement signals — a demanded SBOM is not a present SBOM

Semantic correction of the knowledge base BEFORE the empirical loop (#59) is built — otherwise the
Observation Store would learn from already-misclassified signals. The Silent Pass conflated two kinds of
signal into one: an OBSERVATION ("I saw an SBOM in the repo") and a REQUIREMENT ("a tender DEMANDS an
SBOM"). They were aliased to the same canonical id, so a tender clause read as "SBOM already present" and
suppressed the very question that should have been asked.

Fix — make the kind explicit and authoritative (no new architecture, data + thin wiring):
  - `kind` ∈ {observation, requirement} on ProducedSignal (producer may declare) and on the canonical
    SignalVocabularyEntry (AUTHORITATIVE — a mislabelled producer cannot collapse the two).
  - Vocabulary split: sbom_file_found → sbom_present (obs) + sbom_required (req);
    security_txt_or_cvd_policy → cvd_policy_present (obs) + psirt_required (req); add signed_updates_required.
    requirement signals are intentionally UNMAPPED in intake_signal_map (they describe a target, not state).
  - silent_intake() consumes ONLY kind==observation; requirement signals are preserved in
    `requirements_seen` (visible/auditable) but NEVER become a detected capability.
  - normalize_signals() stamps the vocabulary's kind onto every IntakeSignal; unknown ids still pass through.
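A minimal sketch of the split described above, not the repository's actual code. The dataclass fields, the `VOCABULARY` and `INTAKE_SIGNAL_MAP` contents, the return shape of `silent_intake`, and the default kind for unknown ids are all assumptions made for illustration; only the names `ProducedSignal`, `IntakeSignal`, `normalize_signals`, `silent_intake`, `requirements_seen` and the signal ids come from this message.

```python
from dataclasses import dataclass
from typing import Literal, Optional

Kind = Literal["observation", "requirement"]


@dataclass(frozen=True)
class ProducedSignal:
    signal_id: str
    kind: Optional[Kind] = None  # the producer MAY declare a kind


@dataclass(frozen=True)
class IntakeSignal:
    canonical_id: str
    kind: Kind


# Hypothetical vocabulary: producer id -> (canonical id, AUTHORITATIVE kind).
VOCABULARY: dict[str, tuple[str, Kind]] = {
    "cyclonedx_found": ("sbom_present", "observation"),
    "requires_sbom": ("sbom_required", "requirement"),
}

# Hypothetical intake map: requirement ids are intentionally left unmapped.
INTAKE_SIGNAL_MAP: dict[str, str] = {"sbom_present": "sbom_creation"}


def normalize_signals(produced: list[ProducedSignal]) -> list[IntakeSignal]:
    normalized = []
    for signal in produced:
        entry = VOCABULARY.get(signal.signal_id)
        if entry is None:
            # Unknown ids pass through. Assumption: keep the producer's kind,
            # falling back to "observation" when none was declared.
            normalized.append(IntakeSignal(signal.signal_id, signal.kind or "observation"))
        else:
            canonical_id, kind = entry  # the vocabulary kind wins over the producer's
            normalized.append(IntakeSignal(canonical_id, kind))
    return normalized


def silent_intake(signals: list[IntakeSignal]) -> dict[str, list[str]]:
    # Only observations may become detected capabilities.
    detected = {
        INTAKE_SIGNAL_MAP[s.canonical_id]
        for s in signals
        if s.kind == "observation" and s.canonical_id in INTAKE_SIGNAL_MAP
    }
    # Requirements stay visible for auditing but never detect anything.
    requirements_seen = {s.canonical_id for s in signals if s.kind == "requirement"}
    return {
        "detected_capabilities": sorted(detected),
        "requirements_seen": sorted(requirements_seen),
    }
```

Under these assumptions, a producer that mislabels `requires_sbom` as an observation is still treated as a requirement, because the vocabulary entry decides the kind.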

This is the same Observation-vs-Requirement split the Requirements Verification Platform rests on:
observations are reality, requirements are targets, and their comparison is the delta. A tender / OEM spec /
law now produces requirement signals; scanners / repos / documents produce observation signals.
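The "comparison is the delta" idea can be sketched as a set difference. This is an illustration only: the `SATISFIED_BY` pairing and the `requirement_delta` helper are hypothetical and do not appear in the commit; in particular, pairing `psirt_required` with `cvd_policy_present` is an assumption drawn from the vocabulary split above.

```python
# Hypothetical pairing: requirement id -> observation id that would satisfy it.
SATISFIED_BY: dict[str, str] = {
    "sbom_required": "sbom_present",
    "psirt_required": "cvd_policy_present",
}


def requirement_delta(requirements_seen: set[str], observations_seen: set[str]) -> list[str]:
    """Return the requirements that no observation currently satisfies."""
    return sorted(
        req
        for req in requirements_seen
        # A requirement with no known satisfying observation stays open.
        if SATISFIED_BY.get(req) not in observations_seen
    )
```

With requirements `sbom_required`, `psirt_required`, `signed_updates_required` and the single observation `sbom_present`, this sketch leaves `psirt_required` and `signed_updates_required` open.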

Tests: rewrote the two test_signal_producer cases that previously ASSERTED the bug (tender == repo) to pin
the correct split; regression — `requires_sbom` yields no capability + stays in requirements_seen while
`cyclonedx_found` still detects sbom_creation; endpoint-level regression that a tender requirement does not
auto-detect and the gap stays asked; vocabulary-kind-overrides-mislabelled-producer. 25 onboarding tests
pass, mypy --strict clean, demo runs, check-loc 0. Runtime effect → deploy + smoke. (Fix A;
partial-vs-detected decoupling follows as Fix B before #59.)

2026-06-28 15:52:50 +02:00

Benjamin Admin

a4123ace71

feat: POST /onboarding/advisor-start — expose the Smart Onboarding Advisor at runtime (#58)

This exposes the existing Smart Onboarding Advisor through a runtime endpoint; it does not add new
reasoning logic. Tightly scoped: adapter boundary + endpoint, no big frontend, no persistence, no
empirical learning, no new scanners, no LLM.

  POST /onboarding/advisor-start : (company + certifications + target + scanner_findings[ProducedSignal])
        -> Normalizer -> Silent Knowledge Pass -> Advisor -> { silent_intake_summary, inferred_assumptions,
           rejected_assumptions, top_5_questions, capability_delta, top_measures, evidence_requests,
           completeness_summary, auto_detected, headline }
  GET  /onboarding/targets       : the supported target ids (CRA, TISAX, MDR, Environmental)
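A rough sketch of what a client-side payload and response check for this endpoint might look like. The top-level request fields and the response keys are taken from the description above; the shape of each field (for example `company` as a plain dict, `target` as the string "CRA", the `signal_id`/`kind` keys inside a finding) and the helper names are assumptions, not the actual OpenAPI contract.

```python
import json
from typing import Any

# Response keys as listed in the endpoint description above.
EXPECTED_RESPONSE_KEYS = frozenset({
    "silent_intake_summary", "inferred_assumptions", "rejected_assumptions",
    "top_5_questions", "capability_delta", "top_measures", "evidence_requests",
    "completeness_summary", "auto_detected", "headline",
})


def build_advisor_start_request(
    company: dict[str, Any],
    certifications: list[str],
    target: str,
    scanner_findings: list[dict[str, Any]],
) -> str:
    """Serialize a hypothetical request body for POST /onboarding/advisor-start."""
    return json.dumps({
        "company": company,
        "certifications": certifications,
        "target": target,
        "scanner_findings": scanner_findings,  # ProducedSignal-shaped dicts (assumed shape)
    })


def missing_response_keys(response: dict[str, Any]) -> list[str]:
    """List the documented response keys that are absent from a decoded response."""
    return sorted(EXPECTED_RESPONSE_KEYS - set(response))
```

A smoke test could post the serialized body and assert that `missing_response_keys` returns an empty list for the decoded reply.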

compliance/services/onboarding_service.py is the app-caller: it loads the curated knowledge (hypothesis
library, signal vocabulary + map, the target's required capabilities) once and calls the pure, tested
orchestration (normalize_signals -> silent_intake -> advisor_start). The scanner ADAPTER boundary is the
ProducedSignal format the request carries — existing scanners emit it, no new scanners. Thin handler
(<30 LOC), registered in the auto-load list. No DB. Additive to the OpenAPI contract (contract test is
additive-friendly; baseline regenerates on CI/py3.12). First deployable runtime feature -> dev deploy +
smoke. mypy --strict clean, 22 onboarding tests pass, check-loc 0.

2026-06-28 15:14:00 +02:00

2 Commits