fix: live progress + concurrency for embedding builds #80
Summary
- Write `embedded_chunks` to MongoDB after each batch (previously only at end-of-run), so the dashboard's `/status` poll shows real movement instead of a frozen `0/N` until completion.
- Run embedding batches through `FuturesUnordered` with up to 4 in-flight requests against litellm. For a 2221-chunk repo (112 sequential batches at size 20) this is the main wall-time reduction.
- Give `LlmClient`'s reqwest client a 300s request timeout + 10s connect timeout. Previously `reqwest::Client::new()` had no timeout, so a hung embedding call could block the entire build forever.

Also flushes embeddings to Mongo every 200 records so a partial failure doesn't discard everything.
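The numbers in the summary can be checked with a small std-only sketch. It is not the code from this PR: `RecordingStore` and `simulate_build` are hypothetical stand-ins for the MongoDB writes, the simulation runs batches sequentially rather than through `FuturesUnordered`, and flushing once at least 200 records are pending is an assumed reading of "every 200 records".

```rust
/// Number of batches needed to cover `total` items at `batch_size` per batch.
fn batch_count(total: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch_size must be non-zero");
    (total + batch_size - 1) / batch_size
}

/// Lower bound on sequential rounds when up to `in_flight` batches run at once.
fn min_rounds(batches: usize, in_flight: usize) -> usize {
    assert!(in_flight > 0, "in_flight must be non-zero");
    (batches + in_flight - 1) / in_flight
}

/// Hypothetical stand-in for the MongoDB-backed store: it only records
/// what would have been written.
#[derive(Default)]
struct RecordingStore {
    /// Value of `embedded_chunks` written after each batch.
    progress_updates: Vec<usize>,
    /// Number of embedding records written by each flush.
    flush_sizes: Vec<usize>,
}

/// Simulates a build: progress is updated after every batch, and buffered
/// embeddings are flushed once at least `flush_every` records are pending
/// (assumed policy), with a final flush for any remainder.
fn simulate_build(total: usize, batch_size: usize, flush_every: usize) -> RecordingStore {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let mut store = RecordingStore::default();
    let mut embedded = 0;
    let mut pending = 0;
    while embedded < total {
        let n = batch_size.min(total - embedded);
        embedded += n;
        pending += n;
        // Per-batch progress write: this is what lets the status poll move.
        store.progress_updates.push(embedded);
        if pending >= flush_every {
            store.flush_sizes.push(pending);
            pending = 0;
        }
    }
    if pending > 0 {
        store.flush_sizes.push(pending);
    }
    store
}
```

For the repo size quoted above, `batch_count(2221, 20)` gives 112 batches and `min_rounds(112, 4)` gives 28 rounds at best, while `simulate_build(2221, 20, 200)` records 112 progress updates instead of a single one at the end. With real concurrent requests the batches complete out of order, but the running count still only increases.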
Background: a recent embedding build (~2221 chunks) appeared stuck at `0/2221 (0%)` in the dashboard. The litellm endpoint was actually healthy (verified: 386ms for a single embedding via `bge-multilingual-gemma2`). Chunks were being embedded the whole time, but `update_build` was only called at the very end, so the UI never reflected progress. With this PR, the bar advances per batch.

Test plan
- `/api/v1/chat/:repo_id/status` shows `embedded_chunks` climbing from `0` toward `total_chunks` while the build is in progress, not just at the end.
- The build finishes as `Completed` with `embedded_chunks == total_chunks`.

🤖 Generated with Claude Code