feat: Signal Producer interface + Normalizer — one signal language for all sources (before #58)
These are not scanner stubs: the scanners already exist. The Silent Pass needs only their UNIFIED output. This adds
the small common DATA FORMAT (not a new module/framework) the user asked for, following exactly the
Requirement-Source / MCAP / regulation-alias pattern: many inputs, one language.
Producer A / B / C -> normalize_signals (vocabulary: id + aliases) -> canonical IntakeSignal -> Silent Pass
- ProducedSignal {signal_id, source_type, confidence, evidence, provenance} = what ANY source emits
(website scanner, repo scanner, PDF parser, tender parser, API, the user).
- knowledge/onboarding/signal_vocabulary.yaml reduces producer dialects to a canonical signal: "SBOM
present" arrives as cyclonedx_found / spdx_found / sbom_uploaded / requires_sbom (tender) — all become
`sbom_file_found`. The Silent Pass cannot tell where it came from -> no per-scanner special logic, ever.
- Unknown signals pass through unchanged, so a new producer stays visible. confidence/evidence/provenance
flow through to the detected capability for the audit trail.
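
The normalization step described above can be sketched roughly as follows. This is a minimal illustration, not the code in signals.py: it uses plain dataclasses as stand-ins for the pydantic models, and `VOCABULARY`, `build_alias_index` and the in-memory dict layout are assumptions standing in for however signal_vocabulary.yaml is actually loaded.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ProducedSignal:
    """What any producer emits (fields taken from the commit message)."""
    signal_id: str
    source_type: str
    confidence: float = 1.0
    evidence: Optional[str] = None
    provenance: str = ""


@dataclass
class IntakeSignal:
    """Stand-in for the canonical signal the Silent Pass consumes."""
    source: str
    signal: str
    confidence: float = 1.0
    evidence: Optional[str] = None
    provenance: str = ""


# Assumed in-memory form of the vocabulary: canonical id -> producer aliases.
VOCABULARY: Dict[str, List[str]] = {
    "sbom_file_found": [
        "cyclonedx_found", "spdx_found", "sbom_uploaded", "requires_sbom",
    ],
}


def build_alias_index(vocabulary: Dict[str, List[str]]) -> Dict[str, str]:
    """Flatten the vocabulary into alias -> canonical id (hypothetical helper)."""
    index: Dict[str, str] = {}
    for canonical, aliases in vocabulary.items():
        index[canonical] = canonical
        for alias in aliases:
            index[alias] = canonical
    return index


def normalize_signals(
    produced: List[ProducedSignal],
    vocabulary: Dict[str, List[str]] = VOCABULARY,
) -> List[IntakeSignal]:
    """Map producer dialects to canonical ids; unknown ids pass through as-is."""
    index = build_alias_index(vocabulary)
    return [
        IntakeSignal(
            source=p.source_type,
            signal=index.get(p.signal_id, p.signal_id),
            confidence=p.confidence,
            evidence=p.evidence,
            provenance=p.provenance,
        )
        for p in produced
    ]
```

Under these assumptions, `requires_sbom` from a tender and `cyclonedx_found` from a repository both come out as `sbom_file_found`, while an id that is not in the vocabulary keeps its original name and stays visible downstream.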
A tender that "requires SBOM" now produces the same effect as a repo that HAS one, which fits Vision V2
(Requirement Source over Regulation). The endpoint (#58) then has its final shape: POST -> Producers ->
Normalizer -> Silent Pass -> Profile -> Delta -> Questions -> Roadmap. Non-runtime change -> no deploy.
mypy --strict clean, 14 onboarding tests pass, check-loc 0.
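
To illustrate the "same effect" claim, here is a small sketch of a Silent Pass over canonical signals. It is a simplified stand-in for silent_intake, not the real function: the dataclasses replace the pydantic models, and `SIGNAL_TO_CAPABILITY`, the capability name `sbom_management` and the sample provenance strings are hypothetical values chosen only for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CanonicalSignal:
    """Stand-in for IntakeSignal, reduced to the fields used here."""
    source: str
    signal: str
    confidence: float = 1.0
    provenance: str = ""


@dataclass
class Detected:
    """Stand-in for DetectedCapability."""
    capability: str
    source: str
    confidence: float
    provenance: str


# Hypothetical mapping from canonical signal id to capability.
SIGNAL_TO_CAPABILITY: Dict[str, str] = {"sbom_file_found": "sbom_management"}


def silent_pass(signals: List[CanonicalSignal]) -> Dict[str, Detected]:
    """Detect capabilities; the first signal for a capability wins."""
    caps: Dict[str, Detected] = {}
    for s in signals:
        capability = SIGNAL_TO_CAPABILITY.get(s.signal)
        if capability and capability not in caps:
            caps[capability] = Detected(
                capability=capability,
                source="%s:%s" % (s.source, s.signal),
                confidence=s.confidence,
                provenance=s.provenance,
            )
    return caps


# Illustrative inputs: one canonical signal from a tender, one from a repo.
from_tender = silent_pass(
    [CanonicalSignal("tender", "sbom_file_found", 0.8, "tender clause")]
)
from_repo = silent_pass(
    [CanonicalSignal("repository", "sbom_file_found", 1.0, "bom.json")]
)
```

Both calls detect the same capability; only the audit-trail fields (`source`, `confidence`, `provenance`) differ, which is the information the diff below threads through to DetectedCapability.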
@@ -20,11 +20,15 @@ from pydantic import BaseModel, Field
 
 
 class IntakeSignal(BaseModel):
-    """One finding a scanner/parser produced (no LLM here — the scanners are upstream)."""
+    """A CANONICAL signal the Silent Pass consumes. Producer-agnostic: the same `signal` may have come
+    from a website, a repo, a PDF, a tender or the user — normalize_signals() unified them (see signals.py)."""
 
-    source: str # website / repository / document / product
-    signal: str # signal id, e.g. "sbom_file_found"
-    detail: str = "" # optional (url, filename) for the audit trail
+    source: str # source_type: website / repository / document / product / tender / user
+    signal: str # CANONICAL signal id, e.g. "sbom_file_found"
+    confidence: float = 1.0 # carried from the producer
+    evidence: Optional[str] = None # the artifact already in hand
+    provenance: str = "" # where it came from (url / filename / tender clause) — audit trail
+    detail: str = "" # free-text (kept for back-compat)
 
 
 class SignalMapping(BaseModel):
@@ -43,6 +47,8 @@ class DetectedCapability(BaseModel):
     relationship: str = "detected"
     source: str = "" # which signal/source detected it (audit trail)
     evidence: Optional[str] = None
+    confidence: float = 1.0 # carried from the producing signal
+    provenance: str = "" # where the signal came from
 
 
 class ProductFact(BaseModel):
@@ -82,7 +88,8 @@ def silent_intake(
             if m.capability and m.capability not in caps:
                 caps[m.capability] = DetectedCapability(
                     capability=m.capability, relationship=m.relationship,
-                    source="%s:%s" % (s.source, s.signal), evidence=m.evidence)
+                    source="%s:%s" % (s.source, s.signal), evidence=m.evidence,
+                    confidence=s.confidence, provenance=s.provenance)
             if m.evidence:
                 evidence.add(m.evidence)
             if m.product_fact: