feat(consent-tester): Phase E — self-improving CMP library
cmp_discovery_log.py:
- SQLite log at /data/cmp_discoveries.db: every LLM-discovered CMP
  pattern is recorded with domain, strategy, value, and sample text
- Auto-promote (user-chosen 'voll automatisch', i.e. fully automatic, mode):
  when the LLM returns strategy=url AND the extracted text is >= 800 words,
  write a new module /data/auto_cmp/auto_<slug>.py with a derived regex
  matcher + reconstruct
- record_discovery() is called from dsi_discovery._try_llm_cascade on success
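A minimal sketch of the logging and promotion rule described above, assuming a single table and a word count based on whitespace splitting. The names log_discovery, should_auto_promote, and open_log, as well as the table layout, are hypothetical and not taken from cmp_discovery_log.py; only the strategy=url and 800-word criteria come from this message.

```python
import sqlite3

MIN_WORDS = 800  # threshold stated in the commit message


def should_auto_promote(strategy: str, extracted_text: str) -> bool:
    """Return True when a discovery meets the auto-promote criteria."""
    # Assumption: "words" means whitespace-separated tokens.
    return strategy == "url" and len(extracted_text.split()) >= MIN_WORDS


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the discovery log and create the (hypothetical) table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS discoveries (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               domain TEXT NOT NULL,
               strategy TEXT NOT NULL,
               value TEXT NOT NULL,
               sample_text TEXT,
               promoted INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return conn


def log_discovery(conn: sqlite3.Connection, domain: str, strategy: str,
                  value: str, extracted_text: str) -> int:
    """Insert one discovery row and return its id."""
    promoted = should_auto_promote(strategy, extracted_text)
    cur = conn.execute(
        "INSERT INTO discoveries (domain, strategy, value, sample_text, promoted) "
        "VALUES (?, ?, ?, ?, ?)",
        # Store only a short sample of the text, not the full extraction.
        (domain, strategy, value, extracted_text[:500], int(promoted)),
    )
    conn.commit()
    return cur.lastrowid
```

In this sketch the promotion decision is stored as a flag on the row; writing the auto_<slug>.py module itself is left out.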
cmp_library/_registry.py:
- Loads both hand-written modules from services/cmp_library/ AND
  auto-promoted modules from /data/auto_cmp/ (configurable via the
  CMP_AUTO_DIR env var)
- Auto modules are loaded with importlib.util.spec_from_file_location, so no
  package install is needed; restart consent-tester to pick up new ones
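The file-based loading described above could look roughly like the following. This is a sketch under assumptions: the function name load_auto_modules and the choice to skip modules that fail to import are illustrative and not confirmed by _registry.py; the auto_*.py naming, the /data/auto_cmp default, and the CMP_AUTO_DIR variable come from this message.

```python
import importlib.util
import os
from pathlib import Path
from types import ModuleType
from typing import List, Optional


def load_auto_modules(auto_dir: Optional[str] = None) -> List[ModuleType]:
    """Import every auto_*.py file in the auto-promote directory."""
    directory = Path(auto_dir or os.environ.get("CMP_AUTO_DIR", "/data/auto_cmp"))
    modules: List[ModuleType] = []
    if not directory.is_dir():
        return modules
    for path in sorted(directory.glob("auto_*.py")):
        # Build a module spec straight from the file path; no package needed.
        spec = importlib.util.spec_from_file_location(path.stem, path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except Exception:
            # Assumption: a broken generated module is skipped, not fatal.
            continue
        modules.append(module)
    return modules
```

Because the modules are executed once at load time, newly written files are only seen on the next call, which is consistent with the restart requirement noted above.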
dsi_discovery.py:
- _try_llm_cascade now calls record_discovery() on every successful
LLM analysis (cached AND fresh)
main.py:
- GET /cmp-discoveries — admin endpoint listing all logged discoveries
- DELETE /cmp-discoveries/{id} — rollback (unlinks auto_*.py)
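The rollback behind the DELETE endpoint can be sketched as follows. This assumes the log row stores the path of its generated module in a module_path column; that column, the explicit connection argument, and the name rollback_discovery are hypothetical and may differ from the real delete_discovery().

```python
import sqlite3
from pathlib import Path


def rollback_discovery(conn: sqlite3.Connection, disc_id: int) -> bool:
    """Remove a discovery row and unlink its auto-promoted module, if any."""
    row = conn.execute(
        "SELECT module_path FROM discoveries WHERE id = ?", (disc_id,)
    ).fetchone()
    if row is None:
        return False  # unknown id: nothing to roll back
    module_path = row[0]
    if module_path:
        # missing_ok avoids an error if the file was already removed (Python 3.8+).
        Path(module_path).unlink(missing_ok=True)
    conn.execute("DELETE FROM discoveries WHERE id = ?", (disc_id,))
    conn.commit()
    return True
```

An already-loaded module stays in memory after its file is unlinked, so a restart would presumably still be needed for the rollback to take effect.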
This closes the self-improving loop: the first encounter with a new CMP
triggers an LLM call (at cost), the discovery is auto-promoted, and all
future runs against the same vendor pattern hit Phase B (Named CMP) in
<50ms with no LLM call.
@@ -344,3 +344,20 @@ async def dsi_discovery(req: DSIDiscoveryRequest):
         errors=result.errors,
         scanned_at=datetime.now(timezone.utc).isoformat(),
     )
+
+
+# ── Admin: CMP discoveries (Phase E) ────────────────────────────────
+
+@app.get("/cmp-discoveries")
+async def cmp_discoveries(limit: int = 200):
+    """List LLM-discovered CMP patterns (Phase E auto-promote log)."""
+    from services.cmp_discovery_log import list_discoveries
+    return {"discoveries": list_discoveries(limit=limit)}
+
+
+@app.delete("/cmp-discoveries/{disc_id}")
+async def cmp_discovery_delete(disc_id: int):
+    """Delete a discovery + its auto-promoted module (rollback)."""
+    from services.cmp_discovery_log import delete_discovery
+    ok = delete_discovery(disc_id)
+    return {"deleted": ok, "id": disc_id}