1. Pipe divider fix: Changed OCR char-confusion regex so | between
letters (Ka|me|rad) is NOT converted to I. Only standalone/
word-boundary pipes are converted (|ch → Ich, | want → I want).
2. Alphabet sidebar detection improvements:
- _filter_decorative_margin() now considers 2-char words (OCR reads
"Aa", "Bb" from sidebars), lowered min strip from 8→6
- _filter_border_strip_words() lowered decorative threshold from 50%→45%
- New step 4f: grid-level thin-edge-column filter as safety net —
removes edge columns with <35% fill rate and >60% short text
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>