New _detect_colspan_cells() in grid_editor_helpers.py:
- Runs after _build_cells() for every zone (content + box)
- Detects word-blocks that extend across column boundaries
- Merges affected cells into spanning_header with colspan=N
- Uses column midpoints to determine which columns are covered
- Works for full-page scans and box zones equally
Also fixes box flowing/bullet_list row height fields (y_min_px/y_max_px).
Removed duplicate spanning logic from cv_box_layout.py — now uses
the generic _detect_colspan_cells from grid_editor_helpers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Box 3 empty rows: flowing/bullet_list rows were missing y_min_px/
y_max_px fields that GridTable uses for row height calculation.
Added _px and _pct variants.
Box 2 spanning cells: rows with fewer word-blocks than columns
(e.g., "In Britain..." spanning 2 columns) are now detected and
merged into spanning_header cells. GridTable already renders
spanning_header cells across the full row width.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New pipeline step between Gutter Repair and Ground Truth that processes
embedded boxes (grammar tips, exercises) independently from the main grid.
Backend:
- cv_box_layout.py: classify_box_layout() detects flowing/columnar/
bullet_list/header_only layout types per box
- build_box_zone_grid(): layout-aware grid building (single-column for
flowing text, independent columns for tabular content)
- POST /sessions/{id}/build-box-grids endpoint with SmartSpellChecker
- Layout type overridable per box via request body
Frontend:
- StepBoxGridReview.tsx: shows each box with cropped image + editable
GridTable. Layout type dropdown per box. Auto-builds on first load.
- Auto-skip when no boxes detected on page
- Pipeline steps updated: 13 steps (0-12), Ground Truth moved to 12
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>