Benjamin Admin b151951448 fix(pipeline): make dedup Phase 2 resilient — paginated, timeout, per-control error handling
- Paginated DB queries (100 rows/page) instead of loading all 166k rows
- Individual timeout (30s) per embedding + qdrant call
- Per-control try/except — one failure doesn't kill the job
- Sequential processing (no asyncio.gather) for stability
- Progress logging every 500 controls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 15:31:28 +02:00
S
Description
No description provided
61 MiB
Languages
Python 38.3%
TypeScript 37.8%
Go 18.9%
HTML 3.2%
Shell 0.7%
Other 1.1%