feat: Silent Knowledge Pass — recognise before asking (Phase 0, before the endpoint)

The endpoint is not next; the bigger knowledge lever comes first. The Advisor can already say "I need 5 answers", but it does not yet decide what it can find out by ITSELF.

The Silent Knowledge Pass runs in front of the Advisor. From signals that existing scanners/parsers already produce (website, repository, documents, product data), it deterministically derives capabilities the company demonstrably HAS, plus product facts that drive scope. Every recognised item shrinks the delta and removes a question.

- compliance/onboarding/silent_intake.py: silent_intake(signals, signal_map) -> detected_capabilities (+ evidence already in hand) + product_facts.
- The signal->conclusion map is curated DATA (knowledge/onboarding/intake_signal_map.yaml); signals are injected (scanners are upstream). Pure, deterministic, no LLM.
- advisor_start gains detected_capabilities (folded into the profile at HIGH confidence -> covered, not asked) and an auto_detected result + headline.

The experience flips from a question wall to "we already recognised 4 capabilities, 2 product facts and have 4 pieces of evidence in hand; only these few remain".

Order now: Silent Pass -> #58 endpoint/frontend -> #59 empirical loop. NOT new architecture, just an orchestration step in front. Non-runtime (no app caller) -> no deploy. 15 onboarding tests pass, mypy --strict clean, check-loc 0.
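
For orientation, here is a minimal sketch of what a pure, map-driven pass of this kind could look like. It is not the code in compliance/onboarding/silent_intake.py. The SignalMapping fields (capability, evidence, fact_key, fact_value), the DetectedCapability, ProductFact and SilentIntakeResult classes, and the concrete signal->conclusion pairs are assumptions, inferred from the demo signals and the rendered report in the diff below. The real schema of intake_signal_map.yaml is not shown in this commit.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class IntakeSignal:
    source: str          # e.g. "website", "repository", "document", "product"
    signal: str          # signal name emitted by an upstream scanner
    detail: str = ""


@dataclass(frozen=True)
class SignalMapping:
    # Hypothetical schema: one signal may yield a capability, evidence, and/or a fact.
    signal: str
    capability: Optional[str] = None
    evidence: Optional[str] = None
    fact_key: Optional[str] = None
    fact_value: Optional[str] = None


@dataclass(frozen=True)
class DetectedCapability:
    capability: str
    source: str


@dataclass(frozen=True)
class ProductFact:
    key: str
    value: str


@dataclass
class SilentIntakeResult:
    detected_capabilities: List[DetectedCapability] = field(default_factory=list)
    product_facts: List[ProductFact] = field(default_factory=list)
    evidence_found: List[str] = field(default_factory=list)

    def capability_ids(self) -> List[str]:
        return [d.capability for d in self.detected_capabilities]


def silent_intake(signals: List[IntakeSignal], signal_map: List[SignalMapping]) -> SilentIntakeResult:
    """Pure lookup: no I/O, no LLM. Unknown signals conclude nothing."""
    by_signal = {m.signal: m for m in signal_map}
    caps: Dict[str, DetectedCapability] = {}
    facts: Dict[str, ProductFact] = {}
    evidence: Set[str] = set()
    for s in signals:
        m = by_signal.get(s.signal)
        if m is None:
            continue
        if m.capability:
            caps.setdefault(m.capability, DetectedCapability(m.capability, s.source))
        if m.evidence:
            evidence.add(m.evidence)
        if m.fact_key and m.fact_value is not None:
            facts.setdefault(m.fact_key, ProductFact(m.fact_key, m.fact_value))
    # Sort by id so the result does not depend on the order of the input signals.
    return SilentIntakeResult(
        detected_capabilities=[caps[k] for k in sorted(caps)],
        product_facts=[facts[k] for k in sorted(facts)],
        evidence_found=sorted(evidence),
    )


# Assumed mappings, reconstructed from the demo signals and the rendered report.
SIGNAL_MAP = [
    SignalMapping("security_txt_or_cvd_policy", capability="coordinated_vulnerability_disclosure", evidence="cvd_policy"),
    SignalMapping("sbom_file_found", capability="sbom_creation", evidence="sbom"),
    SignalMapping("signed_releases", capability="secure_signed_update_distribution", evidence="signing_config"),
    SignalMapping("product_risk_assessment_doc", capability="product_cyber_risk_assessment", evidence="product_risk_assessment"),
    SignalMapping("cloud_connectivity", fact_key="connected_to_internet", fact_value="true"),
    SignalMapping("plc_sps", fact_key="is_machine", fact_value="true"),
]

SIGNALS = [
    IntakeSignal("website", "security_txt_or_cvd_policy", "/.well-known/security.txt"),
    IntakeSignal("repository", "sbom_file_found", "sbom.cdx.json"),
    IntakeSignal("repository", "signed_releases"),
    IntakeSignal("document", "product_risk_assessment_doc"),
    IntakeSignal("product", "cloud_connectivity"),
    IntakeSignal("product", "plc_sps"),
]

result = silent_intake(SIGNALS, SIGNAL_MAP)
```

Because the conclusions live in the map and the function only performs lookups, the same signals always yield the same result, and a new signal->conclusion pair is a data change rather than a code change.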
2026-06-28 14:34:27 +02:00
parent 23d977e26b
commit 9c33582412
8 changed files with 290 additions and 30 deletions
@@ -5,8 +5,14 @@ _Eingabe: Unternehmen + Produkte + Zertifizierungen + Ziel. Den Rest macht die O
 ## Eingabe
 > Zertifizierungen: **ISO9001, ISO27001, ISO14001, TISAX** · Produkt: **Parkschein-/Schrankensystem** · Ziel: **CRA**

+## Phase 0 — Stille Vorbefüllung (BEVOR eine Frage erscheint)
+> Stille Vorbefüllung: 4 Fähigkeit(en) automatisch erkannt, 2 Produktfakt(en), 4 Nachweis(e) bereits vorhanden.
+- **Automatisch erkannte Fähigkeiten:** `coordinated_vulnerability_disclosure`, `product_cyber_risk_assessment`, `sbom_creation`, `secure_signed_update_distribution`
+- **Produktfakten (steuern den Scope):** `connected_to_internet=true`, `is_machine=true`
+- **Nachweise bereits in der Hand (kein Upload nötig):** cvd_policy, product_risk_assessment, sbom, signing_config
+
 ## Was wir erkannt haben
-> 17 Anforderungen erkannt · 5 wahrscheinlich abgedeckt · 5 zu klären
+> 17 Anforderungen erkannt · 4 automatisch erkannt (Intake) · 5 wahrscheinlich (Zertifikate) · 5 zu klären

 **Aus Ihren Zertifizierungen abgeleitet (zu bestätigen, nicht automatisch erfüllt):**
 - ISO9001 legt 1 relevante Fähigkeit(en) nahe — Verifikation erforderlich, nicht automatisch erfüllt
@@ -16,26 +22,26 @@ _Eingabe: Unternehmen + Produkte + Zertifizierungen + Ziel. Den Rest macht die O

 ## Die wenigen offenen Punkte — nur die nächsten besten Fragen
 **Frage 1 von 5** _(Informationswert 8)_
-> product cyber risk assessment? — _Warum fragen wir das: Keine Anhaltspunkte im Unternehmensprofil — klären._
-
-**Frage 2 von 5** _(Informationswert 8)_
 > protection against corruption of safety functions? — _Warum fragen wir das: Keine Anhaltspunkte im Unternehmensprofil — klären._

-**Frage 3 von 5** _(Informationswert 8)_
-> secure signed update distribution? — _Warum fragen wir das: Keine Anhaltspunkte im Unternehmensprofil — klären._
-
-**Frage 4 von 5** _(Informationswert 7)_
-> coordinated vulnerability disclosure? — _Warum fragen wir das: Keine Anhaltspunkte im Unternehmensprofil — klären._
-
-**Frage 5 von 5** _(Informationswert 7)_
+**Frage 2 von 5** _(Informationswert 7)_
 > exploited vuln and incident reporting? — _Warum fragen wir das: Keine Anhaltspunkte im Unternehmensprofil — klären._

+**Frage 3 von 5** _(Informationswert 7)_
+> machine safety risk assessment? — _Warum fragen wir das: Keine Anhaltspunkte im Unternehmensprofil — klären._
+
+**Frage 4 von 5** _(Informationswert 7)_
+> mechanical safety and guards? — _Warum fragen wir das: Keine Anhaltspunkte im Unternehmensprofil — klären._
+
+**Frage 5 von 5** _(Informationswert 7)_
+> operating instructions and safety information? — _Warum fragen wir das: Keine Anhaltspunkte im Unternehmensprofil — klären._
+
 ## Womit zuerst anfangen (größter Hebel)
- `product_cyber_risk_assessment` — schließt 2 Anforderung(en): CRA, MaschinenVO
 - `protection_against_corruption_of_safety_functions` — schließt 2 Anforderung(en): CRA, MaschinenVO
- `secure_signed_update_distribution` — schließt 2 Anforderung(en): CRA, MaschinenVO
- `coordinated_vulnerability_disclosure` — schließt 1 Anforderung(en): CRA
 - `exploited_vuln_and_incident_reporting` — schließt 1 Anforderung(en): CRA
+- `machine_safety_risk_assessment` — schließt 1 Anforderung(en): MaschinenVO
+- `mechanical_safety_and_guards` — schließt 1 Anforderung(en): MaschinenVO
+- `operating_instructions_and_safety_information` — schließt 1 Anforderung(en): MaschinenVO

 ## Vollständigkeit (ehrlich)
 > Identifiziert 1 · bewertet 1 · offen 0 · Unsicherheiten 0 · Begründung ja
@@ -12,7 +12,10 @@ from __future__ import annotations
 import os
 import yaml

-from compliance.onboarding import CapabilityHypothesis, OnboardingInput, advisor_start, resolve_for_certifications
+from compliance.onboarding import (
+    CapabilityHypothesis, IntakeSignal, OnboardingInput, SignalMapping,
+    advisor_start, resolve_for_certifications, silent_intake,
+)
 from compliance.transition_reasoning import TargetRequirement

 OUT = []
@@ -37,7 +40,18 @@ inp = OnboardingInput(company="synthetisch", industry="machine_builder",
                      certifications=["ISO9001", "ISO27001", "ISO14001", "TISAX"],
                      known_evidence=["CE process"], target=["CRA"])
 hyp = resolve_for_certifications(inp.certifications, _lib)
-res = advisor_start(inp, hyp, req, target_id="CRA", covers_targets=covers, corpus_status={"CRA": "validated"})
+# Phase 0 — Silent Knowledge Pass: recognise everything possible from scanner signals BEFORE asking.
+_smap = [SignalMapping(**m) for m in yaml.safe_load(
+    open(os.path.join(os.path.dirname(__file__), "..", "knowledge", "onboarding", "intake_signal_map.yaml"), encoding="utf-8"))["mappings"]]
+_signals = [IntakeSignal(source="website", signal="security_txt_or_cvd_policy", detail="/.well-known/security.txt"),
+            IntakeSignal(source="repository", signal="sbom_file_found", detail="sbom.cdx.json"),
+            IntakeSignal(source="repository", signal="signed_releases"),
+            IntakeSignal(source="document", signal="product_risk_assessment_doc"),
+            IntakeSignal(source="product", signal="cloud_connectivity"),
+            IntakeSignal(source="product", signal="plc_sps")]
+si = silent_intake(_signals, _smap)
+res = advisor_start(inp, hyp, req, target_id="CRA", covers_targets=covers, corpus_status={"CRA": "validated"},
+                    detected_capabilities=si.capability_ids())

 w("# Smart Onboarding Advisor — was der Nutzer sieht (automatisch, ohne Vertrieb)")
 w("")
@@ -46,6 +60,12 @@ w("")
 w("## Eingabe")
 w("> Zertifizierungen: **%s** · Produkt: **%s** · Ziel: **%s**" % (", ".join(inp.certifications), inp.products[0], ", ".join(inp.target)))
 w("")
+w("## Phase 0 — Stille Vorbefüllung (BEVOR eine Frage erscheint)")
+w("> %s" % si.summary)
+w("- **Automatisch erkannte Fähigkeiten:** %s" % ", ".join("`%s`" % d.capability for d in si.detected_capabilities))
+w("- **Produktfakten (steuern den Scope):** %s" % ", ".join("`%s=%s`" % (f.key, f.value) for f in si.product_facts))
+w("- **Nachweise bereits in der Hand (kein Upload nötig):** %s" % ", ".join(si.evidence_found))
+w("")
 w("## Was wir erkannt haben")
 w("> %s" % res.headline)
 w("")