- Reduce left-side threshold from 35% to 20% of content width
- Strong language signal (eng/deu > 0.3) now prevents page_ref assignment
- Increase column_ignore word threshold from 3 to 8 for edge columns
- Apply language guard to Level 1 and Level 2 classification

Fixes: column with deu=0.921 was misclassified as page_ref because
reference score check ran before language analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
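
Illustrative sketch of the intended check order, not the actual implementation. The `Column` fields, the `classify` function, the `REF_SCORE_MIN` cutoff, the `"other"` fallback label, and the strict `<` comparisons are all assumptions made for the example. Only the 20% left-side threshold, the 0.3 language guard, and the 8-word edge threshold come from this change.

```python
from dataclasses import dataclass

LEFT_SIDE_MAX = 0.20       # left-side threshold: 20% of content width (was 35%)
LANG_GUARD = 0.3           # eng/deu score above this blocks page_ref
EDGE_IGNORE_WORDS = 8      # column_ignore word threshold for edge columns (was 3)
REF_SCORE_MIN = 0.5        # hypothetical cutoff, not taken from the commit


@dataclass
class Column:
    """Hypothetical per-column features used only for this sketch."""
    x_rel: float           # left edge as a fraction of content width
    word_count: int
    eng: float             # English language score
    deu: float             # German language score
    ref_score: float       # reference-likeness score
    is_edge: bool = False


def classify(col: Column) -> str:
    # Language analysis runs first so it can veto page_ref later on.
    strong_language = max(col.eng, col.deu) > LANG_GUARD

    # Sparse edge columns are ignored (assumed strict comparison).
    if col.is_edge and col.word_count < EDGE_IGNORE_WORDS:
        return "column_ignore"

    # page_ref needs a left-side position, a high reference score,
    # and no strong language signal.
    if (
        not strong_language
        and col.x_rel < LEFT_SIDE_MAX
        and col.ref_score >= REF_SCORE_MIN
    ):
        return "page_ref"

    return "other"         # hypothetical fallback label
```

With these assumptions, the column from the bug report (deu=0.921) no longer becomes page_ref even when its reference score is high, because the language guard is evaluated before the reference score is consulted.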