fix: live progress + concurrency for embedding builds #80

2026-05-13T09:41:19Z

sharang commented

Summary

Live progress: write embedded_chunks to MongoDB after each batch (previously only at end-of-run), so the dashboard's /status poll shows real movement instead of frozen 0/N until completion.
Concurrency: drive embedding batches via FuturesUnordered with up to 4 in-flight requests against litellm. For a 2221-chunk repo (112 sequential batches at size 20) this is the main wall-time reduction.
HTTP timeout: give LlmClient's reqwest client a 300s request timeout + 10s connect timeout. Previously reqwest::Client::new() had no timeout, so a hung embedding call could block the entire build forever.

Also flushes embeddings to Mongo every 200 records so a partial failure doesn't discard everything.

Background: a recent embedding build (~2221 chunks) appeared stuck at 0/2221 (0%) in the dashboard. The litellm endpoint was actually healthy (verified: 386ms for a single embedding via bge-multilingual-gemma2) — chunks were being embedded the whole time, but update_build was only called at the very end, so the UI never reflected progress. With this PR, the bar advances per batch.
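The batch arithmetic quoted in the summary can be checked with a small sketch. The function names here are hypothetical and not taken from the PR, and the "wave" count assumes every request takes the same time, which real litellm calls will not.

```rust
/// Batches needed to cover `chunks` items at `batch_size` items per batch
/// (ceiling division).
fn batch_count(chunks: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch_size must be non-zero");
    (chunks + batch_size - 1) / batch_size
}

/// Idealized number of request "waves" when up to `in_flight` batches run
/// at once and every batch takes the same time. Hypothetical helper: real
/// request latencies vary, so this is only a rough lower bound on rounds.
fn wave_count(batches: usize, in_flight: usize) -> usize {
    assert!(in_flight > 0, "in_flight must be non-zero");
    (batches + in_flight - 1) / in_flight
}
```

For the repo described above, `batch_count(2221, 20)` gives 112 batches, and `wave_count(112, 4)` gives 28 rounds at 4 in-flight requests instead of 112 strictly serial ones.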

Test plan

Trigger a fresh embedding build on a multi-thousand-chunk repo and confirm /api/v1/chat/:repo_id/status shows embedded_chunks climbing from 0 toward total_chunks while the build is in progress, not just at the end.
Confirm dashboard "Building embeddings: X/Y" advances in real time.
Confirm final state is Completed with embedded_chunks == total_chunks.
Sanity: re-running the build cleanly deletes old embeddings before re-inserting (existing behavior, unchanged).
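The first three checks could be automated roughly as follows. This is a sketch that assumes the `(embedded_chunks, total_chunks)` pairs have already been collected by polling the status endpoint; `check_progress` is a hypothetical helper, not part of the repo.

```rust
/// Validates a series of polled `(embedded_chunks, total_chunks)` samples.
/// Passes only if the counter never decreases, at least one sample shows
/// partial progress (strictly between 0 and total), and the last sample
/// has embedded_chunks == total_chunks.
fn check_progress(samples: &[(u64, u64)]) -> Result<(), String> {
    let &(last_embedded, last_total) = samples
        .last()
        .ok_or_else(|| "no samples collected".to_string())?;

    // The counter must never move backwards between consecutive polls.
    for pair in samples.windows(2) {
        if pair[1].0 < pair[0].0 {
            return Err(format!(
                "embedded_chunks went backwards: {} -> {}",
                pair[0].0, pair[1].0
            ));
        }
    }

    // A frozen 0/N followed by a jump to N/N is the bug being fixed,
    // so require at least one intermediate value.
    let saw_partial = samples
        .iter()
        .any(|&(embedded, total)| embedded > 0 && embedded < total);
    if !saw_partial {
        return Err("no intermediate progress observed".to_string());
    }

    if last_embedded != last_total {
        return Err(format!("final state is {last_embedded}/{last_total}"));
    }
    Ok(())
}
```

A run that only ever reports `0/2221` and then `2221/2221` fails this check, which matches the pre-fix behavior described in the background.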

🤖 Generated with Claude Code

sharang added 1 commit 2026-05-13 09:41:20 +00:00

fix: live progress + concurrency for embedding builds

CI / Check (pull_request) Successful in 10m34s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped

b96dda11fb

Embedding build progress was written to MongoDB only after all batches
had completed (the final flush and status update both happened at the
very end), so the dashboard showed "0/N chunks (0%)" for the entire
run and then jumped straight to "complete." For a repo with 2k+ chunks
this looked like the build was stuck.

Three fixes:
- pipeline: call update_build(Running, embedded_count, ...) after each
  batch so /api/v1/chat/:repo_id/status reflects real progress, and
  flush embeddings to Mongo every 200 records so a partial failure does
  not lose everything.
- pipeline: drive batches with FuturesUnordered at concurrency=4 so
  litellm requests overlap instead of going strictly serial (112
  sequential requests for a 2221-chunk repo were the wall-time floor).
- llm client: give the reqwest client a 300s request timeout and 10s
  connect timeout. Previously LlmClient used reqwest::Client::new()
  with no timeout, so a hung embedding call would block the build
  indefinitely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
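The per-batch progress update and the 200-record flush described in the first fix can be sketched as below. This is an illustration under assumptions, not the code from the commit: `BuildSink`, `RecordingSink`, `consume_batches`, and `FLUSH_EVERY` are hypothetical names, the sink stands in for the MongoDB calls, records are plain `u32` ids rather than embeddings, and batches are consumed sequentially rather than as they complete from `FuturesUnordered`.

```rust
/// Hypothetical stand-in for the MongoDB writes made by the pipeline.
trait BuildSink {
    /// Record progress so a status poll can see it mid-run.
    fn update_progress(&mut self, embedded: usize, total: usize);
    /// Persist a buffer of finished records.
    fn flush(&mut self, records: Vec<u32>);
}

/// Flush threshold taken from the commit message (every 200 records).
const FLUSH_EVERY: usize = 200;

/// Consumes finished batches, reporting progress after each batch and
/// flushing buffered records whenever the buffer reaches FLUSH_EVERY.
/// Returns the number of records processed.
fn consume_batches<S: BuildSink>(
    sink: &mut S,
    batches: impl IntoIterator<Item = Vec<u32>>,
    total: usize,
) -> usize {
    let mut pending: Vec<u32> = Vec::new();
    let mut embedded = 0;
    for batch in batches {
        embedded += batch.len();
        pending.extend(batch);
        if pending.len() >= FLUSH_EVERY {
            // Hand the buffer to the sink and start a fresh one.
            sink.flush(std::mem::take(&mut pending));
        }
        // Progress is reported per batch, not only at the end of the run.
        sink.update_progress(embedded, total);
    }
    if !pending.is_empty() {
        // Final flush for the tail that never reached the threshold.
        sink.flush(pending);
    }
    embedded
}

/// In-memory sink that records calls, useful for checking the behavior.
#[derive(Default)]
struct RecordingSink {
    progress: Vec<(usize, usize)>,
    flush_sizes: Vec<usize>,
}

impl BuildSink for RecordingSink {
    fn update_progress(&mut self, embedded: usize, total: usize) {
        self.progress.push((embedded, total));
    }
    fn flush(&mut self, records: Vec<u32>) {
        self.flush_sizes.push(records.len());
    }
}
```

With 2221 records in batches of 20, this sketch reports progress 112 times and flushes 12 times (eleven flushes of 200 and a final flush of 21), so a failure late in the run would lose at most the unflushed tail.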

sharang merged commit 927fbc8ecb into main

2026-05-13 10:01:05 +00:00

Reference: sharang/compliance-scanner-agent#80