feat: Signal Producer interface + Normalizer — one signal language for all sources (before #58)

Not scanner stubs: the scanners already exist. The Silent Pass needs only their unified output. This
adds the small common data format (not a new module or framework) the user asked for, following
exactly the Requirement-Source / MCAP / regulation-alias pattern: many inputs, one language.

  Producer A / B / C  ->  normalize_signals (vocabulary: id + aliases)  ->  canonical IntakeSignal  ->  Silent Pass

- ProducedSignal {signal_id, source_type, confidence, evidence, provenance} = what ANY source emits
  (website scanner, repo scanner, PDF parser, tender parser, API, the user).
- knowledge/onboarding/signal_vocabulary.yaml reduces producer dialects to a canonical signal: "SBOM
  present" arrives as cyclonedx_found / spdx_found / sbom_uploaded / requires_sbom (tender) — all become
  `sbom_file_found`. The Silent Pass cannot tell where it came from -> no per-scanner special logic, ever.
- Unknown signals pass through (a new producer stays visible). confidence/evidence/provenance flow to
  the detected capability for the audit trail.
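
A minimal sketch of the alias reduction described in the bullets above, not the code from this commit.
Assumptions: the field types, the SignalVocabularyEntry fields (id, aliases), the normalize_signals
signature, and the in-memory VOCABULARY list standing in for signal_vocabulary.yaml are all
hypothetical. The real normalizer emits IntakeSignal; the sketch returns ProducedSignal with the
canonical id so that it stays self-contained.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProducedSignal:
    # Field names come from the commit message; the types are assumed.
    signal_id: str
    source_type: str
    confidence: float
    evidence: str
    provenance: str


@dataclass(frozen=True)
class SignalVocabularyEntry:
    # Hypothetical shape: one canonical id plus the producer dialects that map to it.
    id: str
    aliases: tuple[str, ...] = ()


def normalize_signals(
    signals: list[ProducedSignal],
    vocabulary: list[SignalVocabularyEntry],
) -> list[ProducedSignal]:
    """Rewrite each signal_id to its canonical id; unknown ids pass through unchanged."""
    alias_to_canonical: dict[str, str] = {}
    for entry in vocabulary:
        alias_to_canonical[entry.id] = entry.id
        for alias in entry.aliases:
            alias_to_canonical[alias] = entry.id
    # confidence, evidence and provenance are copied untouched for the audit trail.
    return [
        replace(s, signal_id=alias_to_canonical.get(s.signal_id, s.signal_id))
        for s in signals
    ]


# Stand-in for knowledge/onboarding/signal_vocabulary.yaml (contents assumed).
VOCABULARY = [
    SignalVocabularyEntry(
        id="sbom_file_found",
        aliases=("cyclonedx_found", "spdx_found", "sbom_uploaded", "requires_sbom"),
    ),
]

produced = [
    ProducedSignal("cyclonedx_found", "repo_scanner", 0.9, "bom.json", "repo"),
    ProducedSignal("requires_sbom", "tender_parser", 0.7, "section 4.2", "tender.pdf"),
    ProducedSignal("brand_new_signal", "api", 0.5, "", "api"),
]
normalized = normalize_signals(produced, VOCABULARY)
```

In this sketch both SBOM dialects collapse to `sbom_file_found`, while `brand_new_signal` keeps its id
and so stays visible downstream.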

A tender that "requires SBOM" now produces the same effect as a repo that HAS one — fits Vision V2
(Requirement Source over Regulation). Endpoint (#58) then has its final shape: POST -> Producers ->
Normalizer -> Silent Pass -> Profile -> Delta -> Questions -> Roadmap. Non-runtime -> no deploy. mypy
--strict clean, 14 onboarding tests pass, check-loc 0.
Benjamin Admin
2026-06-28 14:49:57 +02:00
parent 9c33582412
commit c2c8f7e424
7 changed files with 184 additions and 16 deletions
@@ -21,6 +21,11 @@ from .observations import (
     empirical_distribution,
     reviewed,
 )
+from .signals import (
+    ProducedSignal,
+    SignalVocabularyEntry,
+    normalize_signals,
+)
 from .silent_intake import (
     DetectedCapability,
     IntakeSignal,
@@ -61,4 +66,7 @@ __all__ = [
     "DetectedCapability",
     "ProductFact",
     "SilentIntakeResult",
+    "ProducedSignal",
+    "SignalVocabularyEntry",
+    "normalize_signals",
 ]