Each zone becomes its own Excel sheet tab with independent column widths:
- Sheet "Vokabeln": main content zone with EN/DE/example columns
- Sheet "Pounds and euros": Box 1 with its own 4-column layout
- Sheet "German leihen": Box 2 with single column for flowing text
This solves the column-width conflict: each box gets column
widths optimized for its own content, which is impossible in a single
unified sheet (Excel limitation: width is set per column, not per cell).
Sheet tabs are visible at the bottom (showSheetTabs: true).
Box sheets get a colored tab (from box_bg_hex).
The first sheet is active by default.
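A minimal sketch of the zone-to-sheet mapping, written in Python for illustration only (the actual conversion lives in the frontend). The sheet keys `name`, `order`, `status`, `color` and `config.columnlen` follow the Luckysheet-style data format that Fortune Sheet is assumed to use; the zone keys `title` and `column_widths` are hypothetical, only `box_bg_hex` comes from the message above.

```python
from typing import Any


def zones_to_sheets(zones: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Build one sheet description per zone, each with its own column widths."""
    sheets = []
    for order, zone in enumerate(zones):
        sheet: dict[str, Any] = {
            "name": zone["title"],            # hypothetical zone field
            "order": order,
            "status": 1 if order == 0 else 0,  # first sheet is the active one
            "config": {
                # per-sheet column widths, keyed by column index as a string
                "columnlen": {
                    str(i): w for i, w in enumerate(zone["column_widths"])
                },
            },
        }
        # Only box zones carry a background color; use it as the tab color.
        if zone.get("box_bg_hex"):
            sheet["color"] = zone["box_bg_hex"]
        sheets.append(sheet)
    return sheets
```

Because every sheet carries its own `columnlen` map, a 4-column box and a single-column box no longer compete for the same widths.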
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Install @fortune-sheet/react (MIT, v1.0.4) as Excel-like spreadsheet
component. New SpreadsheetView.tsx converts unified grid data to
Fortune Sheet format (celldata, merge config, column/row sizes).
StepAnsicht now has Spreadsheet/Grid toggle:
- Spreadsheet mode: full Fortune Sheet with toolbar (bold, italic,
color, borders, merge cells, text wrap, undo/redo)
- Grid mode: existing GridTable for quick editing
Box-origin cells get light tinted background in spreadsheet view.
Colspan cells converted to Fortune Sheet merge format.
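The colspan-to-merge conversion can be sketched as follows, in Python for illustration (SpreadsheetView.tsx itself is TypeScript). The `mc` entry and the `"row_col"` keys in `config.merge` reflect the Luckysheet-style format that Fortune Sheet is assumed to accept; the input keys `row`, `col` and `text` are hypothetical, only `colspan` is named in this log.

```python
from typing import Any


def colspan_to_merge(
    cells: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], dict[str, Any]]:
    """Convert grid cells into celldata plus a merge config for colspan cells."""
    celldata = []
    merge: dict[str, dict[str, int]] = {}
    for cell in cells:
        value: dict[str, Any] = {"v": cell["text"], "m": cell["text"]}
        span = cell.get("colspan", 1)
        if span > 1:
            # rs/cs = number of rows/columns covered by the merged range
            mc = {"r": cell["row"], "c": cell["col"], "rs": 1, "cs": span}
            value["mc"] = mc
            merge[f"{cell['row']}_{cell['col']}"] = dict(mc)
        celldata.append({"r": cell["row"], "c": cell["col"], "v": value})
    return celldata, {"merge": merge}
```

This sketch only emits the anchor cell of each merge; whether the covered cells also need placeholder entries is left open here.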
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend (unified_grid.py):
- build_unified_grid(): merges content + box zones into one zone
- Dominant row height from median of content row spacings
- Full-width boxes: rows integrated directly
- Partial-width boxes: extra rows inserted when the box has more text
lines than the standard rows it spans can hold (e.g., 7 lines in a 5-row height)
- Box-origin cells tagged with source_zone_type + box_region metadata
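As a rough illustration of the partial-width rule above, the number of extra rows could be computed like this. The function name and the rounding choice are assumptions, not taken from unified_grid.py.

```python
def extra_rows_needed(box_height: float, row_height: float, text_lines: int) -> int:
    """Rows to insert so every text line of a partial-width box gets a row.

    box_height and row_height are in the same pixel space; row_height is the
    dominant (median) content row height.
    """
    if row_height <= 0:
        raise ValueError("row_height must be positive")
    # How many standard rows the box already spans (at least one).
    rows_that_fit = max(1, round(box_height / row_height))
    return max(0, text_lines - rows_that_fit)
```

With the example from the list, a box that is 5 standard rows tall and holds 7 text lines needs 2 extra rows.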
Backend (grid_editor_api.py):
- POST /sessions/{id}/build-unified-grid → persists as unified_grid_result
- GET /sessions/{id}/unified-grid → retrieve persisted result
Frontend:
- GridEditorCell: added source_zone_type, box_region fields
- GridTable: box-origin cells get tinted background + left border
- StepAnsicht: split-view with original image (left) + editable
unified GridTable (right). Auto-builds on first load.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Content sections: use dominant (median) row height from all content
rows instead of per-section average. This ensures uniform row height
above and below boxes (the standard case on textbook pages).
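A minimal sketch of the dominant row height, assuming the input is the list of content row y-positions (the function name is hypothetical). The median is what keeps a large gap caused by a box from skewing the result.

```python
from statistics import median


def dominant_row_height(row_y_positions: list[float]) -> float:
    """Median spacing between consecutive content rows."""
    ys = sorted(row_y_positions)
    spacings = [b - a for a, b in zip(ys, ys[1:])]
    if not spacings:
        raise ValueError("need at least two rows to measure spacing")
    return median(spacings)
```

For rows at 100, 150, 200, 250, 600, 650 the spacings are 50, 50, 50, 350, 50; a mean would be pulled up to 110 by the box gap, while the median stays at 50.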
Box sections: distribute height proportionally by text line count
per row. A header (1 line) gets 1/7 of box height, a bullet with
3 lines gets 3/7. Fixes Box 2 where row 3 was cut off because
even distribution didn't account for multi-line cells.
Removed overflow:hidden from box container to prevent clipping.
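The proportional distribution for box sections can be sketched as below. The function name is hypothetical, and treating a row with no text as one line is an assumption.

```python
def distribute_box_height(box_height: float, line_counts: list[int]) -> list[float]:
    """Split box_height across rows in proportion to their text line counts."""
    # Assumption: an empty row still occupies one line of height.
    counts = [max(1, n) for n in line_counts]
    total = sum(counts)
    return [box_height * n / total for n in counts]
```

For a box with a 1-line header and two 3-line bullets (7 lines total), the header gets 1/7 of the height and each bullet gets 3/7.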
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Content rows were incorrectly filtered out when their Y overlapped
with a box, even if the box only covered the right half of the page.
Now checks both Y AND X overlap: a row is excluded only if it
starts within the box's horizontal range.
Fixes: rows next to Box 2 (lend, coconut, taste) were missing from
reconstruction because Box 2 (x=871, w=525) only covers the right
side, but left-side content rows at x≈148 were being filtered.
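The combined check can be sketched like this. `Rect` and `row_hidden_by_box` are hypothetical names; Box 2's x=871 and w=525 come from the message above, while the y and h values in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float


def row_hidden_by_box(row: Rect, box: Rect) -> bool:
    """True only if the row overlaps the box vertically AND starts inside it."""
    y_overlap = row.y < box.y + box.h and box.y < row.y + row.h
    starts_in_box = box.x <= row.x <= box.x + box.w
    return y_overlap and starts_in_box
```

With `box = Rect(x=871, y=900, w=525, h=300)`, a row starting at x=148 on the same y-band is kept, while a row starting at x=900 is excluded.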
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major rewrite of reconstruction rendering:
- Page split into vertical sections (content/box) around box boundaries
- Content sections: uniform row height = (last_row_y - first_row_y) / (n-1)
- Box sections: rows evenly distributed within box height
- Content rows positioned absolutely at original y-coordinates
- Font size derived from row height (55% of row height)
- Multi-line cells (bullets) get expanded height with indentation
- Boxes render at exact bbox position with colored border
- Preparation for unified grid where boxes become part of main grid
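The content-section metrics from the list above can be sketched as follows, assuming `row_ys` holds the y-coordinates of the section's rows in top-to-bottom order (the function name is hypothetical).

```python
def content_section_metrics(row_ys: list[float]) -> tuple[float, float]:
    """Return (row_height, font_size) for one content section.

    row_height = (last_row - first_row) / (n - 1)
    font_size  = 55% of row_height
    """
    n = len(row_ys)
    if n < 2:
        raise ValueError("need at least two rows to derive a row height")
    row_height = (row_ys[-1] - row_ys[0]) / (n - 1)
    font_size = 0.55 * row_height
    return row_height, font_size
```

Note that the formula needs at least two rows; how a single-row section is handled is not stated in this message.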
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace manual word_box positioning (wild/unsnapped) with the
server-rendered words-overlay image from the OCR step endpoint.
This shows the same cleanly snapped red letters as the OCR step.
Endpoint: /sessions/{id}/image/words-overlay
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Font: use font_size_suggestion_px * scale directly (removed 0.85 factor)
- Row height: calculate from row-to-row spacing (y_min of next row
minus y_min of current row) instead of text height (y_max - y_min).
This produces correct line spacing matching the original layout.
- Multi-line cells: height multiplied by line count
Content zone should now span from ~250 to ~2050 matching the original.
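A sketch of the spacing-based row height, assuming each row is a dict with `y_min` and `y_max`. Falling back to the text height for the last row is an assumption, since the message does not say what happens when there is no next row.

```python
def row_heights(rows: list[dict[str, float]]) -> list[float]:
    """Height per row from row-to-row spacing instead of text height."""
    heights = []
    for i, row in enumerate(rows):
        if i + 1 < len(rows):
            # Distance from this row's top to the next row's top.
            height = rows[i + 1]["y_min"] - row["y_min"]
        else:
            # Assumption: the last row has no successor, so use its text height.
            height = row["y_max"] - row["y_min"]
        heights.append(height)
    return heights


def font_size_px(font_size_suggestion_px: float, scale: float) -> float:
    """Font size is the suggestion scaled directly, with no extra factor."""
    return font_size_suggestion_px * scale
```

Using the spacing rather than `y_max - y_min` includes the whitespace between lines, which is why the reconstructed zone regains its full height.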
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Left panel: Original scan + OCR word overlay (red text at exact
word_box positions) + coordinate grid
Right panel: Reconstructed layout + same coordinate grid
Features:
- Coordinate grid toggle with 50/100/200px spacing options
- Grid lines labeled with pixel coordinates in original image space
- Both panels share the same scale for direct visual comparison
- OCR overlay shows detected text in red mono font at original positions
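The coordinate grid can be sketched as a list of (label, screen position) pairs, where the label is the pixel coordinate in original image space and the position is scaled by the factor both panels share. The function name and signature are hypothetical.

```python
ALLOWED_SPACINGS = (50, 100, 200)


def grid_lines(image_size: int, spacing: int, scale: float) -> list[tuple[int, float]]:
    """Grid lines along one axis as (original_px_label, scaled_position)."""
    if spacing not in ALLOWED_SPACINGS:
        raise ValueError(f"spacing must be one of {ALLOWED_SPACINGS}")
    return [
        (coord, coord * scale)
        for coord in range(0, image_size + 1, spacing)
    ]
```

Calling it once with the image width and once with the image height gives the vertical and horizontal lines; using the same `scale` in both panels keeps them directly comparable.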
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New pipeline step showing the reconstructed page with all zones
positioned at their original coordinates:
- Content zones with vocabulary grid cells
- Box zones with colored borders (from structure detection)
- Colspan cells rendered across multiple columns
- Multi-line cells (bullets) with pre-wrap whitespace
- Toggle to overlay original scan image at 15% opacity
- Proportionally scaled to viewport width
- Pure CSS positioning (no canvas/Fabric.js)
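The proportional scaling behind the pure-CSS layout can be sketched like this, in Python for illustration (the component itself is frontend code). The function name, the bbox keys and the one-decimal rounding are assumptions.

```python
def css_box(bbox: dict[str, float], image_width: float, viewport_width: float) -> dict[str, str]:
    """Absolute-position style for a zone, scaled from image to viewport space."""
    if image_width <= 0:
        raise ValueError("image_width must be positive")
    scale = viewport_width / image_width
    return {
        "position": "absolute",
        "left": f"{bbox['x'] * scale:.1f}px",
        "top": f"{bbox['y'] * scale:.1f}px",
        "width": f"{bbox['w'] * scale:.1f}px",
        "height": f"{bbox['h'] * scale:.1f}px",
    }
```

Because every zone uses the same scale, the overlay of the original scan lines up with the reconstructed zones at any viewport width.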
Pipeline: 14 steps (0-13), Ground Truth moved to Step 13.
Added colspan field to GridEditorCell type.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>