Preserve alphabetic marker columns, broaden junk filter, enable IPA in grid

- _merge_inline_marker_columns: skip merge when ≥50% of words are alphabetic (preserves "to", "in", "der" columns)
- Rule 2 (oversized stub): widen to ≤3 words / ≤5 chars (catches "SEA &")
- IPA phonetics: map longest-avg-text column to column_en so fix_cell_phonetics runs in the grid editor
- ocr_pipeline_overlays: add missing split_page_into_zones import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
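
The three heuristics named in the message can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the helper names `should_skip_marker_merge`, `is_oversized_stub`, and `longest_avg_text_column` are hypothetical, and it assumes that the character limit in Rule 2 applies to the whole stripped string (spaces included) and that "longest-avg-text" means the highest mean cell length.

```python
from typing import Dict, List, Optional


def should_skip_marker_merge(words: List[str], threshold: float = 0.5) -> bool:
    """Return True when at least `threshold` of the non-empty words are purely alphabetic.

    Hypothetical stand-in for the check inside _merge_inline_marker_columns:
    a column of real words ("to", "in", "der") should not be merged as a marker column.
    """
    cleaned = [w.strip() for w in words if w.strip()]
    if not cleaned:
        return False
    alpha = sum(1 for w in cleaned if w.isalpha())
    return alpha / len(cleaned) >= threshold


def is_oversized_stub(text: str, max_words: int = 3, max_chars: int = 5) -> bool:
    """Return True for short junk fragments such as "SEA &".

    Assumption: the character limit counts the stripped text including spaces.
    """
    stripped = text.strip()
    tokens = stripped.split()
    return 0 < len(tokens) <= max_words and len(stripped) <= max_chars


def longest_avg_text_column(columns: Dict[str, List[str]]) -> Optional[str]:
    """Return the name of the column whose cells have the highest mean length."""
    best_name, best_avg = None, -1.0
    for name, cells in columns.items():
        if not cells:
            continue
        avg = sum(len(c) for c in cells) / len(cells)
        if avg > best_avg:
            best_name, best_avg = name, avg
    return best_name
```

Under these assumptions, the column picked by `longest_avg_text_column` would be the one mapped to `column_en` before `fix_cell_phonetics` runs.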
@@ -25,7 +25,7 @@ from ocr_pipeline_common import (
 )
 from ocr_pipeline_session_store import get_session_db, get_session_image
 from cv_color_detect import _COLOR_HEX, _COLOR_RANGES
-from cv_box_detect import detect_boxes
+from cv_box_detect import detect_boxes, split_page_into_zones
 from ocr_pipeline_rows import _draw_box_exclusion_overlay
 
 logger = logging.getLogger(__name__)