962bbbe9f6c0b8560e1c6b0f895e3bf7fff9eb3a
- Add Rule 3 to junk-row filter: rows where no word is longer than 2 chars are removed as scattered OCR debris from illustrations
- Fully disable spanning-header detection which falsely flagged IPA transcriptions and vocabulary entries as spanning headers
- First-row heuristic remains for genuine header detection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
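A minimal sketch of the Rule 3 check described in the commit message, for illustration only. The commit does not show the implementation, so everything here is an assumption: the names `is_debris_row` and `drop_debris_rows` are hypothetical, a row is modeled as a list of cell strings, words are split on whitespace, and rows with no words at all are left for other rules to handle.

```python
from typing import List

# Assumption: a "word" is any whitespace-separated token inside a cell.
MAX_DEBRIS_WORD_LEN = 2  # from the commit message: "no word is longer than 2 chars"


def is_debris_row(row: List[str]) -> bool:
    """Return True when the row has words and none is longer than 2 characters."""
    words = [word for cell in row for word in cell.split()]
    if not words:
        # Assumption: empty rows are not this rule's concern.
        return False
    return all(len(word) <= MAX_DEBRIS_WORD_LEN for word in words)


def drop_debris_rows(rows: List[List[str]]) -> List[List[str]]:
    """Keep only the rows that Rule 3 does not classify as OCR debris."""
    return [row for row in rows if not is_debris_row(row)]


# Hypothetical table rows: two real entries and two rows of scattered fragments.
rows = [
    ["word", "IPA", "meaning"],
    ["a", "||", "x"],
    ["cat", "kæt", "a small animal"],
    ["", "o .", "1"],
]
cleaned = drop_debris_rows(rows)
```

In this sketch a short IPA transcription such as `kæt` survives because it has three characters, while rows made up only of one- and two-character fragments are dropped.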
Languages
TypeScript 57.9%
Python 33.8%
Go 6.8%
C# 0.8%
Shell 0.2%
Other 0.3%