feat: Observation Model — the empirical learning unit, defined BEFORE persistence (Task 59a)
The learning unit is not the hypothesis but the QUESTION, and confirmed/refuted is too coarse:
answers like "partial, only critical suppliers" or "certified but not practised" are not "wrong",
they are valuable knowledge. So the chain is Hypothesis -> Question -> Observation -> (Review) ->
Hypothesis, and the observation model must be defined cleanly before any store/API exists
(otherwise thousands of too-coarse observations would have to be migrated later).
compliance/onboarding/observations.py:
- ObservationType: confirmed / partial / refuted / not_applicable / unknown (richer than binary).
- Observation: {hypothesis_id, capability, question, answer (free text), observation_type,
scope_note ("only critical suppliers"), evidence_uploaded, reviewed, reviewed_by}.
- empirical_distribution() -> a DISTRIBUTION (e.g. confirmed 61 / partial 31 / refuted 8), not a single %.
- empirical_confidence() -> (confirmed + 0.5*partial) / (confirmed+partial+refuted); not_applicable and
unknown are excluded from the base; returns None while that base is empty (nothing calibrated yet).
- REVIEW GATE: only reviewed observations calibrate — a raw answer never changes a hypothesis (no
learning from outliers).
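The model described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in compliance/onboarding/observations.py: the field defaults, the choice to count only reviewed observations in the distribution, and the rounding to two decimals are assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class ObservationType(str, Enum):
    CONFIRMED = "confirmed"
    PARTIAL = "partial"
    REFUTED = "refuted"
    NOT_APPLICABLE = "not_applicable"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Observation:
    hypothesis_id: str
    observation_type: ObservationType
    capability: str = ""             # defaults are assumptions
    question: str = ""
    answer: str = ""                 # free text
    scope_note: str = ""             # e.g. "only critical suppliers"
    evidence_uploaded: bool = False
    reviewed: bool = False
    reviewed_by: Optional[str] = None


def empirical_distribution(observations: Iterable[Observation]) -> dict[str, int]:
    """Count observations per type. Assumption: only reviewed ones are counted."""
    counts = Counter(o.observation_type.value for o in observations if o.reviewed)
    return {t.value: counts.get(t.value, 0) for t in ObservationType}


def empirical_confidence(observations: Iterable[Observation]) -> Optional[float]:
    """(confirmed + 0.5*partial) / (confirmed + partial + refuted), or None."""
    dist = empirical_distribution(observations)
    base = dist["confirmed"] + dist["partial"] + dist["refuted"]
    if base == 0:
        return None  # review gate: nothing reviewed means nothing calibrated
    # Rounding to two decimals is an assumption of this sketch.
    return round((dist["confirmed"] + 0.5 * dist["partial"]) / base, 2)
```

With one reviewed observation each of confirmed, partial, refuted and not_applicable, the base is 3 and the result is (1 + 0.5) / 3 = 0.5; a single unreviewed confirmed observation yields None.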
Refactor: the hypothesis is now PURE curated knowledge — the binary observations counter and any
confidence are removed from CapabilityHypothesis and the YAML; confidence is COMPUTED from the separate
reviewed observation stream. Pure, mypy --strict clean. Persistence/aggregation/calibration are 59b/c/d.
Non-runtime -> no deploy. 12 tests pass, check-loc 0.
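The separation described in the refactor paragraph might look like the sketch below: the hypothesis carries no counter and no confidence, and a confidence is derived on demand from a separate stream. The field names on CapabilityHypothesis, the ReviewedOutcome stand-in and the helper confidence_for are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CapabilityHypothesis:
    # Curated knowledge only: no observation counter, no stored confidence.
    # Field names are illustrative assumptions.
    hypothesis_id: str
    capability: str
    statement: str


@dataclass(frozen=True)
class ReviewedOutcome:
    # Hypothetical, minimal stand-in for an entry in the observation stream.
    hypothesis_id: str
    outcome: str          # "confirmed", "partial", "refuted", "not_applicable", "unknown"
    reviewed: bool = False


# Weights from the commit message; not_applicable/unknown carry no weight
# and are left out of the base entirely.
WEIGHTS = {"confirmed": 1.0, "partial": 0.5, "refuted": 0.0}


def confidence_for(
    hypothesis: CapabilityHypothesis, stream: Sequence[ReviewedOutcome]
) -> Optional[float]:
    """Derive confidence for one hypothesis from reviewed stream entries only."""
    scored = [
        WEIGHTS[o.outcome]
        for o in stream
        if o.reviewed
        and o.hypothesis_id == hypothesis.hypothesis_id
        and o.outcome in WEIGHTS
    ]
    if not scored:
        return None  # nothing reviewed for this hypothesis yet
    return sum(scored) / len(scored)
```

Because the hypothesis object never stores the number, adding or re-reviewing observations changes the computed value without touching the curated YAML.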
@@ -14,12 +14,13 @@ import yaml
 
 from compliance.onboarding import (
     CapabilityHypothesis,
-    HypothesisObservations,
+    Observation,
+    ObservationType,
     OnboardingInput,
     advisor_start,
     empirical_confidence,
+    empirical_distribution,
     inferred_hypotheses,
-    record_observation,
     resolve_for_certifications,
 )
 from compliance.transition_reasoning import TargetRequirement
@@ -47,13 +48,21 @@ def test_multi_certification_merges_automatically():
     assert "sbom_creation" not in caps and "secure_signed_update_distribution" not in caps
 
 
-def test_empirical_confidence_is_computed_not_assigned():
-    obs = HypothesisObservations()
-    assert empirical_confidence(obs) is None  # null until observed
-    obs = record_observation(obs, True)
-    obs = record_observation(obs, True)
-    obs = record_observation(obs, False)
-    assert empirical_confidence(obs) == 0.67  # 2 / 3, from observations only
+def test_observations_are_richer_than_binary_and_review_gated():
+    # the learning unit is the QUESTION; an answer can be partial with a scope note, not just yes/no
+    raw = [Observation(hypothesis_id="HYP-supplier", observation_type=ObservationType.CONFIRMED)]
+    assert empirical_confidence(raw) is None  # unreviewed -> does NOT calibrate (review gate)
+    obs = [
+        Observation(hypothesis_id="HYP-supplier", observation_type=ObservationType.CONFIRMED, reviewed=True),
+        Observation(hypothesis_id="HYP-supplier", observation_type=ObservationType.PARTIAL,
+                    scope_note="nur kritische Lieferanten", reviewed=True),
+        Observation(hypothesis_id="HYP-supplier", observation_type=ObservationType.REFUTED, reviewed=True),
+        Observation(hypothesis_id="HYP-supplier", observation_type=ObservationType.NOT_APPLICABLE, reviewed=True),
+    ]
+    dist = empirical_distribution(obs)  # a DISTRIBUTION, not a single percentage
+    assert dist["confirmed"] == 1 and dist["partial"] == 1 and dist["refuted"] == 1 and dist["not_applicable"] == 1
+    # confidence = (confirmed + 0.5*partial) / (confirmed+partial+refuted); n.a. excluded from the base
+    assert empirical_confidence(obs) == 0.5
 
 
 def test_resolve_adapts_to_advisor_input():