Benjamin Admin 77869e32f4 feat(ocr-pipeline): use word-lookup instead of cell-OCR for cell grid
Replace per-cell Tesseract re-runs with lookup of pre-existing full-page
words from row.words. Words are filtered by X-overlap with column bounds.
This fixes phantom rows with garbage text, missing last words, and
incomplete example text by using the more reliable full-page OCR results.
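
A minimal sketch of the lookup described above, not the pipeline's actual
code. It assumes each entry in row.words carries its text plus a left edge
and width in page pixels. The names Word, x_overlap_ratio, cell_text and the
min_overlap threshold of 0.5 are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical shape of one full-page OCR word; the real row.words
    # entries may use different field names.
    text: str
    left: int   # x of the left edge, in page pixels
    width: int  # box width, in page pixels


def x_overlap_ratio(word, col_left, col_right):
    """Return the fraction of the word's width inside [col_left, col_right]."""
    if word.width <= 0:
        return 0.0
    overlap = min(word.left + word.width, col_right) - max(word.left, col_left)
    return max(0, overlap) / word.width


def cell_text(row_words, col_left, col_right, min_overlap=0.5):
    """Join the row's words that mostly lie within the column, left to right."""
    hits = [
        w for w in row_words
        if x_overlap_ratio(w, col_left, col_right) >= min_overlap
    ]
    hits.sort(key=lambda w: w.left)
    return " ".join(w.text for w in hits)


# Example row with three columns; no Tesseract call is made per cell.
row_words = [
    Word("apple", 10, 60),
    Word("Apfel", 210, 55),
    Word("an", 400, 20),
    Word("apple", 425, 60),
]
columns = [(0, 200), (200, 390), (390, 600)]
cells = [cell_text(row_words, left, right) for left, right in columns]
```

Because every cell is built from the same full-page word list, a column with
no overlapping words yields an empty string instead of re-OCR noise. A word
that straddles a column boundary exactly at the threshold would land in both
cells in this sketch, so a real implementation may want a tie-break rule.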

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 07:24:46 +01:00
42 MiB
Languages
TypeScript 60.2%
Python 32.9%
Go 5.5%
C# 0.8%
CSS 0.2%
Other 0.3%