7252f9a9564c15dd7d7bd3d69c0fa6ae32cb95b3
Replace gap-based splitting with alignment-bin approach: cluster word
left-edges within 8px tolerance, find the leftmost bin with >= 10% of
words as the true column start, split off any words to its left as a
sub-column. This correctly handles both page references ("p.59") and
misread exclamation marks ("!" → "I") even when the pixel gap is small.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Description
No description provided
Languages
TypeScript
57.9%
Python
33.8%
Go
6.8%
C#
0.8%
Shell
0.2%
Other
0.3%