breakpilot-compliance/backend-compliance/compliance/services/onboarding_service.py
Benjamin Admin 978052b5a2 fix(onboarding): decouple partial/indicative signals from detected — partial no longer removes a question
Fix B of the pre-#59 semantic correction. The Silent Pass had only TWO effective states though the data
carries three: a `detected` mapping (a concrete artifact) AND a `partial` mapping (an indicative signal,
e.g. a CI pipeline -> secure-development-lifecycle) both flowed through capability_ids() and were fed to
the Advisor as already-present. As a result, a weak indication silently removed a question, undermining
exactly the Welt-1/Welt-2 transparency we want to keep.

Now three distinct states:
  - detected   -> reduces the delta immediately (auto_detected, not asked).   [unchanged]
  - partial    -> raises assumption strength but does NOT replace the question (surfaced as `indications`,
                  the capability stays in the delta and is still asked).
  - requirement-> describes a target, never the present state (already handled by Fix A's kind split).
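
The three states above can be illustrated with a minimal sketch. `Mapping`, `split_states` and `questions_to_ask` are hypothetical names, not the repository's API, and the capability ids other than `secure-development-lifecycle` are made up for illustration. The sketch only assumes that each mapping carries a capability id and a `relationship` of `detected`, `partial` or `requirement`.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set, Tuple


@dataclass(frozen=True)
class Mapping:
    """Hypothetical stand-in for one signal-to-capability mapping."""
    capability_id: str
    relationship: str  # "detected", "partial", or "requirement"


def split_states(mappings: Sequence[Mapping]) -> Tuple[Set[str], Set[str]]:
    """Return (detected ids, partial ids); requirement mappings land in neither set."""
    detected = {m.capability_id for m in mappings if m.relationship == "detected"}
    partial = {m.capability_id for m in mappings if m.relationship == "partial"}
    return detected, partial


def questions_to_ask(required: Sequence[str], mappings: Sequence[Mapping]) -> List[str]:
    """Only a detected capability leaves the delta; partial and requirement do not."""
    detected, _ = split_states(mappings)
    return [cap for cap in required if cap not in detected]


mappings = [
    Mapping("asset-inventory", "detected"),                # illustrative id
    Mapping("secure-development-lifecycle", "partial"),    # e.g. a CI pipeline signal
    Mapping("vulnerability-handling", "requirement"),      # illustrative id
]
required = ["asset-inventory", "secure-development-lifecycle", "vulnerability-handling"]
delta = questions_to_ask(required, mappings)
```

In this sketch the partial CI signal leaves `secure-development-lifecycle` in `delta`, so the question is still asked.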

Changes (data + thin wiring, no new architecture):
  - SilentIntakeResult.capability_ids() returns only relationship==detected; new indicative_capability_ids()
    returns the partial ones.
  - advisor_start() gains indicative_capabilities (NOT fed into the profile) and surfaces result.indications
    = indicative ∩ required − auto_detected.
  - AdvisorResult / AdvisorResponse gain `indications` (additive, contract-safe); the service passes the
    indicative ids through.
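
The `indications` rule above (indicative ∩ required − auto_detected) can be sketched as plain set arithmetic. `compute_indications` is a hypothetical helper, not the signature of `advisor_start`, and the ids other than `secure-development-lifecycle` are illustrative. The sorting is an assumption made here only to get a stable order.

```python
from typing import Iterable, List


def compute_indications(
    indicative: Iterable[str],
    required: Iterable[str],
    auto_detected: Iterable[str],
) -> List[str]:
    """indications = (indicative & required) - auto_detected, sorted for stable output."""
    return sorted((set(indicative) & set(required)) - set(auto_detected))


indications = compute_indications(
    # "unrelated-capability" is not required by the target, so it is dropped.
    indicative=["secure-development-lifecycle", "logging", "unrelated-capability"],
    required=["secure-development-lifecycle", "logging", "incident-response"],
    # "logging" is already detected, so it is not repeated as an indication.
    auto_detected=["logging"],
)
```

Under these assumptions an indicative id surfaces only when the target actually requires it and nothing stronger has already detected it.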

Tests: a partial CI signal is indicative-not-detected and does NOT shrink the delta; end-to-end it appears
in `indications`, not `auto_detected`, and the gap is still asked. 28 onboarding tests pass, mypy --strict
clean on the onboarding modules, demo runs, check-loc 0. Runtime effect -> deploy + smoke.
2026-06-28 16:02:35 +02:00

"""Onboarding Advisor service — the app-caller that loads knowledge and runs the pure orchestration.
This is the SERVICE layer that makes the Smart Onboarding Advisor runtime-usable: it loads the curated
knowledge (certification hypotheses, signal vocabulary + map, the target's required capabilities) once
and calls the already-built, pure orchestration (normalize_signals -> silent_intake -> advisor_start).
It adds NO new reasoning logic — it only exposes what exists. No DB, no persistence (by scope).
"""
from __future__ import annotations

import os
from typing import Any, Dict, List, Sequence, Tuple

import yaml

from compliance.onboarding import (
    AdvisorResult,
    CapabilityHypothesis,
    OnboardingInput,
    ProducedSignal,
    SignalMapping,
    SignalVocabularyEntry,
    advisor_start,
    normalize_signals,
    resolve_for_certifications,
    silent_intake,
)
from compliance.transition_reasoning import TargetRequirement

_K = os.path.join(os.path.dirname(__file__), "..", "..", "knowledge")


def _load(*parts: str) -> Any:
    # Use a context manager so the file handle is closed deterministically.
    with open(os.path.join(_K, *parts), encoding="utf-8") as fh:
        return yaml.safe_load(fh)


_HYP_LIB = [CapabilityHypothesis(**h) for h in _load("certification_hypotheses", "hypotheses.yaml")["hypotheses"]]
_VOCAB = [SignalVocabularyEntry(**v) for v in _load("onboarding", "signal_vocabulary.yaml")["signals"]]
_SIGNAL_MAP = [SignalMapping(**m) for m in _load("onboarding", "intake_signal_map.yaml")["mappings"]]

# target id -> transition pattern that defines its required capabilities (curated registry)
_TARGET_PATTERNS = {
    "CRA": "transition_pattern_iso27001_to_cra_maschinenvo_v1.yaml",
    "TISAX": "transition_pattern_isms_to_tisax_v1.yaml",
    "MDR": "transition_pattern_iso13485_to_medical_v1.yaml",
    "Environmental": "transition_pattern_iso14001_to_environmental_v1.yaml",
}


def supported_targets() -> List[str]:
    return sorted(_TARGET_PATTERNS)


def _target(target_id: str) -> Tuple[List[TargetRequirement], Dict[str, List[str]]]:
    pat = _load("transition_patterns", _TARGET_PATTERNS[target_id])
    reqs = [TargetRequirement(capability_id=a["capability"]) for a in pat["likely_covered"]]
    reqs += [TargetRequirement(capability_id=d["capability"], question_intent=d.get("needed_information", "verify_existence"),
                               expected_evidence=d.get("expected_evidence", [])) for d in pat["delta_requirements"]]
    covers = {d["capability"]: d.get("covers_targets", []) for d in pat["delta_requirements"]}
    return reqs, covers


def run_advisor(
    company: str, certifications: Sequence[str], target: str,
    signals: Sequence[ProducedSignal], known_evidence: Sequence[str],
    products: Sequence[str], markets: Sequence[str], industry: str = "",
) -> Tuple[AdvisorResult, str]:
    """Producers (ProducedSignal) -> Normalizer -> Silent Pass -> Advisor.

    Returns a tuple of the AdvisorResult and the Silent Pass summary (`si.summary`).
    `target` must be a supported target id. Raises KeyError otherwise (the handler maps it to 400/404).
    """
    reqs, covers = _target(target)
    si = silent_intake(normalize_signals(signals, _VOCAB), _SIGNAL_MAP)
    inp = OnboardingInput(company=company, industry=industry or None, products=list(products),
                          markets=list(markets), certifications=list(certifications),
                          known_evidence=list(known_evidence), target=[target])
    # Detected ids reduce the delta; indicative (partial) ids are passed separately
    # and only surface as `indications`, so the question is still asked.
    result = advisor_start(
        inp, resolve_for_certifications(certifications, _HYP_LIB), reqs, target_id=target,
        covers_targets=covers, corpus_status={target: "validated"},
        detected_capabilities=si.capability_ids(), indicative_capabilities=si.indicative_capability_ids())
    return result, si.summary