e809d0bc1c
The embedding/boost auto-rescue is intentionally optimistic: it finds the topic, not fulfilment. Against Opus-GT this produces 159 FN over-rescues (recall 0.13).

Layer-3 re-judges exactly the rescued passes with the validated Haiku judge (cohort cookie_sufficiency_v1, P0.89/R0.91), NOT the Qwen-first cascade (local is disproven as a sufficiency judge), and un-passes them when the obligation is not concretely met. Gated to the full check (not run under skip_llm).

Measured (5-firm Opus-GT, engine+L3): FN 159->12, recall 0.13->0.93, precision 0.96->0.90 (276 rescues corrected).

"Embedding finds, Claude decides."

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
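
Illustrative note: the Layer-3 gate described above could look roughly like the sketch below. This is not the code in this commit. `CheckResult`, `Judge`, and `layer3_rejudge` are hypothetical names, and the judge is modelled as a plain callable standing in for the Haiku sufficiency judge. The sketch shows the three properties the message states: only rescued passes are re-judged, a failed re-judgement un-passes the result, and nothing runs when `skip_llm` is set.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical record for one obligation check (not the engine's real type)."""
    obligation: str        # the requirement being checked
    evidence: str          # text the embedding/boost stage matched
    passed: bool           # current verdict
    rescued: bool = False  # True if the pass came from the auto-rescue


# Hypothetical judge signature: returns True only when the evidence
# concretely meets the obligation (stand-in for the Haiku judge).
Judge = Callable[[str, str], bool]


def layer3_rejudge(
    results: Iterable[CheckResult],
    judge: Judge,
    skip_llm: bool = False,
) -> List[CheckResult]:
    """Re-judge only rescued passes; every other result is returned unchanged."""
    out: List[CheckResult] = []
    for r in results:
        # The judge is consulted only for rescued passes in the full check.
        needs_rejudge = (not skip_llm) and r.passed and r.rescued
        if needs_rejudge and not judge(r.obligation, r.evidence):
            # Topic was found, but the obligation is not concretely met: un-pass.
            r = replace(r, passed=False)
        out.append(r)
    return out
```

Restricting the judge to rescued passes keeps the number of LLM calls bounded by the number of rescues, and leaves non-rescued verdicts exactly as the engine produced them.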