compliance-scanner-agent/compliance-agent
Sharang Parnerkar b96dda11fb
CI / Check (pull_request) Successful in 10m34s
fix: live progress + concurrency for embedding builds
The embedding build progress was only written to MongoDB after all
batches had completed (the final flush and status update happened only
at the very end), so the dashboard showed "0/N chunks (0%)" for the
entire run, then jumped straight to "complete." For a repo with 2k+
chunks this looked like the build was stuck.

Three fixes:
- pipeline: call update_build(Running, embedded_count, ...) after each
  batch so /api/v1/chat/:repo_id/status reflects real progress, and
  flush embeddings to Mongo every 200 records so a partial failure does
  not lose everything.
- pipeline: drive batches with FuturesUnordered at concurrency=4 so
  litellm requests overlap instead of going strictly serial (112
  sequential requests for a 2221-chunk repo were the wall-time floor).
- llm client: give the reqwest client a 300s request timeout and 10s
  connect timeout. Previously LlmClient used reqwest::Client::new()
  with no timeout, so a hung embedding call would block the build
  indefinitely.
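
The progress and flush cadence described above can be sketched with
the standard library alone. This is an illustration, not the pipeline
code: Progress, batch_count and simulate_build are hypothetical names,
and the batch size of 20 is an assumption chosen because it reproduces
the 112 requests quoted for 2221 chunks. Only the 200-record flush
threshold comes from this commit.

```rust
/// Hypothetical progress snapshot; stands in for whatever
/// update_build(Running, embedded_count, ...) actually records.
#[derive(Debug, Clone, PartialEq)]
struct Progress {
    embedded: usize,
    total: usize,
}

/// Number of embedding requests needed for `chunks` chunks when each
/// request carries at most `batch_size` chunks (ceiling division).
fn batch_count(chunks: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch_size must be positive");
    (chunks + batch_size - 1) / batch_size
}

/// Walks the batches in order. Records one progress update after every
/// batch, one flush whenever at least `flush_every` records are
/// buffered, and a final flush for any remainder.
/// Returns (progress updates, size of each flush).
fn simulate_build(
    total: usize,
    batch_size: usize,
    flush_every: usize,
) -> (Vec<Progress>, Vec<usize>) {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut updates = Vec::new();
    let mut flushes = Vec::new();
    let mut buffered = 0;
    let mut embedded = 0;

    while embedded < total {
        // The last batch may be smaller than batch_size.
        let n = batch_size.min(total - embedded);
        embedded += n;
        buffered += n;

        // Per-batch status update, so a status endpoint sees movement.
        updates.push(Progress { embedded, total });

        // Periodic flush, so a later failure loses at most one buffer.
        if buffered >= flush_every {
            flushes.push(buffered);
            buffered = 0;
        }
    }

    // Final flush for whatever is still buffered.
    if buffered > 0 {
        flushes.push(buffered);
    }
    (updates, flushes)
}
```

Under these assumptions, simulate_build(2221, 20, 200) produces 112
progress updates and 12 flushes (eleven of 200 records and a final one
of 21), instead of a single update at the end of the run.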

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 11:39:37 +02:00