Benjamin Admin 4f37afa222 feat(ocr-pipeline): verticality filter for column detection
Clusters now track Y-positions of their words and filter by vertical
coverage (>=30% primary, >=15%+5words secondary) to reject noise from
indentations or page numbers. Merge distance widened to 3% content width.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 19:57:13 +01:00
Description
No description provided
42 MiB
Languages
TypeScript 60.2%
Python 32.9%
Go 5.5%
C# 0.8%
CSS 0.2%
Other 0.3%