Files
breakpilot-compliance/backend-compliance/compliance/onboarding/__init__.py
T
Benjamin Admin 2d2cb2a244 feat: Certification Capability Hypotheses — capability-centric library + empirical confidence
The bottleneck is knowledge, not the endpoint. This builds the knowledge the Onboarding Advisor needs,
restructured per the user's key insight: NOT "ISO27001 -> 30 capabilities" but each hypothesis as its
own object "capability -> supported_by: [certs]". A capability is written ONCE with all supporting
certs, so the shared management-system core (document control, incident, supplier, audit, access,
asset, monitoring, training, crypto, release, risk) covers most certifications with ~18 hypotheses
instead of ~300 — and multi-certification merges AUTOMATICALLY (a company's inferred caps = every
hypothesis whose supported_by intersects its certs).

Welt-1 throughout: "IF cert present, EXPECT capability (verification required)", never "erfüllt".
Capabilities NO cert suggests (SBOM, signed updates, CVD, support period) have no hypothesis -> they
stay in the delta and get asked. confidence is EMPIRICAL: computed from real-onboarding observations
(confirmed/(confirmed+refuted)), None until calibrated — never an LLM/expert score (record_observation
+ empirical_confidence). The long-term moat: knowledge that learns from reality, not from a norm.

compliance/onboarding/hypotheses.py (resolve_for_certifications / inferred_hypotheses / empirical_
confidence / record_observation) feeds the existing advisor_start unchanged; the demo now runs on the
curated library. Pure, mypy --strict clean, library is DATA (no norm text, no real names). Non-runtime
-> no deploy. 12 tests pass, check-loc 0.
2026-06-28 13:16:45 +02:00

45 lines
1.3 KiB
Python

"""Smart Onboarding Advisor — the onboarding runtime step (orchestration over existing engines).
Turns (company + products + certifications + target) into inferred assumptions, the next best questions
(<=5, each self-explaining), the capability delta, top measures, evidence requests and completeness —
with NO sales interpretation and NO regulation picking. Orchestrator only: no new engine/registry/
meta-model; certificate->capability hypotheses and target requirements are INJECTED.
"""
from __future__ import annotations
from .engine import advisor_start, apply_answer
from .hypotheses import (
CapabilityHypothesis,
HypothesisObservations,
empirical_confidence,
inferred_hypotheses,
record_observation,
resolve_for_certifications,
)
from .schemas import (
AdvisorMeasure,
AdvisorQuestion,
AdvisorResult,
InferredAssumption,
OnboardingInput,
RejectedAssumption,
)
__all__ = [
"advisor_start",
"apply_answer",
"OnboardingInput",
"AdvisorResult",
"AdvisorQuestion",
"AdvisorMeasure",
"InferredAssumption",
"RejectedAssumption",
"CapabilityHypothesis",
"HypothesisObservations",
"empirical_confidence",
"record_observation",
"inferred_hypotheses",
"resolve_for_certifications",
]