feat: Integrate 1,874 Master Controls into document checking

Rewrote rag_document_checker.py to use the doc_check_controls table
instead of the generic canonical_controls. Each Master Control (MC) has:
- check_question: binary YES/NO question for the LLM
- pass_criteria: JSONB list of concrete requirements
- fail_criteria: JSONB list of common mistakes
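
A minimal sketch of how one such control could be held in memory and turned into a YES/NO prompt. Only the three field names come from this commit message; the class name `DocCheckControl`, the `control_id` field, the helper `build_prompt`, the prompt wording, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DocCheckControl:
    """Hypothetical in-memory view of one doc_check_controls row."""
    control_id: str                                          # assumed identifier
    check_question: str                                      # binary YES/NO question
    pass_criteria: list[str] = field(default_factory=list)   # concrete requirements
    fail_criteria: list[str] = field(default_factory=list)   # common mistakes


def build_prompt(control: DocCheckControl, doc_text: str) -> str:
    """Assemble a YES/NO prompt from one control (illustrative wording only)."""
    lines = [
        f"Question: {control.check_question}",
        "Answer YES only if all of these hold:",
        *[f"- {c}" for c in control.pass_criteria],
        "Answer NO if any of these apply:",
        *[f"- {c}" for c in control.fail_criteria],
        "Document:",
        doc_text,
        "Answer with YES or NO.",
    ]
    return "\n".join(lines)


# Made-up sample control, not taken from the real table.
example = DocCheckControl(
    control_id="EXAMPLE-1",
    check_question="Does the document name a responsible contact?",
    pass_criteria=["A name and an address are given"],
    fail_criteria=["Only a generic e-mail address is given"],
)
prompt = build_prompt(example, "Example document text.")
```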

Flow: regex checks (fast) → LLM verification of FAILs → MC deep check
(up to 15 controls per document). MC results appear as additional L2
checks in the report.
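
The three stages could be sketched roughly as follows. This is not the code from rag_document_checker.py: `run_flow`, its parameters, and the result dictionaries are hypothetical, and the two LLM steps are passed in as plain callables so the sketch runs without a model.

```python
import re

MAX_CONTROLS = 15  # mirrors the per-document limit named in the commit message


def run_flow(doc_text, regex_rules, llm_confirms_fail, control_questions, ask_llm):
    """Hypothetical three-stage flow: regex, LLM re-check of FAILs, MC deep check."""
    results = []
    for name, pattern in regex_rules.items():
        # Stage 1: cheap regex check.
        passed = re.search(pattern, doc_text, re.IGNORECASE) is not None
        # Stage 2: only regex FAILs are sent to the LLM for verification.
        if not passed and not llm_confirms_fail(name, doc_text):
            passed = True  # the LLM overruled the regex FAIL
        results.append({"label": name, "passed": passed, "source": "regex"})
    # Stage 3: at most MAX_CONTROLS Master Controls, reported as level-2 checks.
    for question in control_questions[:MAX_CONTROLS]:
        results.append({
            "label": question,
            "passed": ask_llm(question, doc_text),
            "level": 2,
            "source": "mc",
        })
    return results


# Stub callables stand in for the real LLM calls.
results = run_flow(
    doc_text="No details here.",
    regex_rules={"has_contact": r"contact"},
    llm_confirms_fail=lambda name, text: True,
    control_questions=[f"Q{i}" for i in range(20)],
    ask_llm=lambda question, text: False,
)
```

Passing the LLM steps in as callables keeps the expensive calls out of the control flow, so the ordering and the 15-control cap can be exercised with stubs.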

Coverage: 571 DSE, 381 Cookie, 309 Loeschkonzept, 153 Widerruf,
147 DSFA, 125 AVV, 113 AGB, 75 Impressum = 1,874 total.
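
The per-type counts above can be checked against the stated total with a few lines. The dictionary is only a transcription of the figures in this message, and the variable names are illustrative.

```python
# Control counts per document type, copied from the commit message.
coverage = {
    "DSE": 571,
    "Cookie": 381,
    "Loeschkonzept": 309,
    "Widerruf": 153,
    "DSFA": 147,
    "AVV": 125,
    "AGB": 113,
    "Impressum": 75,
}

total = sum(coverage.values())
print(total)  # 1874
```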

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-05-10 21:06:03 +02:00
parent d339d1edc7
commit 26b222d53d
2 changed files with 138 additions and 182 deletions
@@ -283,10 +283,23 @@ async def _check_single_document(entry: DocCheckEntry) -> list[DocCheckResult]:
     # Main document check (full text against primary type)
     main_result = await _run_checklist(doc_text, entry.doc_type, entry.label, entry.url, word_count)
-    # Control Library deep check — DISABLED until doc-check-specific
-    # Master Controls with binary pass/fail criteria are available.
-    # See: zeroclaw/INSTRUCTION-master-controls-for-doc-check.md
-    # Code: compliance/services/rag_document_checker.py (ready to re-enable)
+    # Master Control deep check — 1.874 doc_check_controls with
+    # binary pass/fail criteria verified by LLM (Qwen)
+    try:
+        from compliance.services.rag_document_checker import check_document_with_controls
+        mc_results = await check_document_with_controls(
+            doc_text, entry.doc_type, entry.label, max_controls=15,
+        )
+        if mc_results:
+            # Add MC results as additional checks to the main result
+            for mc in mc_results:
+                main_result.checks.append(CheckItem(**mc))
+            # Recompute correctness with MC results
+            l2 = [c for c in main_result.checks if c.level == 2 and not c.skipped]
+            l2_passed = sum(1 for c in l2 if c.passed)
+            main_result.correctness_pct = round(l2_passed / len(l2) * 100) if l2 else 0
+    except Exception as e:
+        logger.warning("MC check skipped: %s", e)
     all_results.append(main_result)
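
The recomputation step in the hunk above can be tried in isolation. In this sketch the `Check` dataclass is a hypothetical stand-in for the project's `CheckItem`, carrying only the three attributes the diff reads (`level`, `passed`, `skipped`), and `correctness_pct` is an illustrative helper name.

```python
from dataclasses import dataclass


@dataclass
class Check:
    """Hypothetical stand-in for CheckItem with only the fields the diff uses."""
    level: int
    passed: bool
    skipped: bool = False


def correctness_pct(checks):
    """Share of passed, non-skipped level-2 checks as a whole percentage."""
    l2 = [c for c in checks if c.level == 2 and not c.skipped]
    if not l2:
        return 0  # same fallback as the diff when no level-2 checks remain
    l2_passed = sum(1 for c in l2 if c.passed)
    return round(l2_passed / len(l2) * 100)


checks = [
    Check(level=1, passed=True),                 # ignored: not level 2
    Check(level=2, passed=True),
    Check(level=2, passed=False),
    Check(level=2, passed=True),
    Check(level=2, passed=False, skipped=True),  # ignored: skipped
]
pct = correctness_pct(checks)  # 2 of 3 counted checks passed -> 67
```

Because appended MC checks are level 2, each one changes the denominator, which is why the percentage has to be recomputed after they are added.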