Implement _split_oversized_rows() in detect_row_geometry() (Step 7) to split content rows taller than 1.5× the median row height using a local horizontal projection. This produces correctly sized rows before word OCR runs, rather than working around oversized rows in Step 5 with sub-cell splitting hacks.

Removed Step 5 workarounds: _split_oversized_entries(), sub-cell splitting in build_word_grid(), and the median_row_h calculation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
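For readers unfamiliar with the technique, here is a minimal sketch of splitting oversized rows with a local horizontal projection. It is not the code from this commit: the function name `split_oversized_rows_sketch`, the `(top, bottom)` row representation with an exclusive bottom, the binarized `ink` array, and the quarter-row search window are all assumptions made for illustration. Only the 1.5× median threshold comes from the message above.

```python
import numpy as np


def split_oversized_rows_sketch(ink, rows, factor=1.5):
    """Split rows taller than factor * median height at low-ink scanlines.

    ink  : 2D array, nonzero where a pixel is text (assumed representation).
    rows : list of (top, bottom) pixel ranges, bottom exclusive (assumed).
    """
    if not rows:
        return []
    median_h = float(np.median([b - t for t, b in rows]))
    result = []
    for top, bottom in rows:
        h = bottom - top
        if median_h <= 0 or h <= factor * median_h:
            result.append((top, bottom))  # normal row, keep as-is
            continue
        # Local horizontal projection: ink pixels per scanline in this row only.
        profile = (ink[top:bottom] > 0).sum(axis=1)
        n_parts = max(2, int(round(h / median_h)))
        cuts = set()
        for k in range(1, n_parts):
            expected = k * h / n_parts
            # Search a small window around the expected boundary (hypothetical width).
            lo = max(1, int(expected - median_h / 4))
            hi = min(h - 1, int(expected + median_h / 4) + 1)
            if lo < hi:
                cuts.add(top + lo + int(np.argmin(profile[lo:hi])))
        edges = [top] + sorted(cuts) + [bottom]
        result.extend(zip(edges, edges[1:]))
    return result
```

Calling `split_oversized_rows_sketch(ink, rows)` returns a new row list in which each oversized row is replaced by sub-rows cut at the scanline with the least ink near each expected boundary. Because the split happens at the geometry stage, later stages see only normally sized rows.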