breakpilot-lehrer

Author	SHA1	Message	Date
Benjamin Admin	610825ac14	SpreadsheetView: add bullet marker (•) for multi-line cells Multi-line cells (containing \n) that don't already start with a bullet character get • prepended in the frontend. This ensures bullet points are visible regardless of whether the backend inserted them (depends on when boxes were last rebuilt). Skips header rows and cells that already have •, -, or – prefix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 17:53:54 +02:00
Benjamin Admin	6aec4742e5	SpreadsheetView: keep bullets as single cells with text-wrap Revert row expansion: multi-line bullet cells stay as single cells with \n and text-wrap (tb='2'). This way the text reflows when the user resizes the column, like normal Excel behavior. Row height auto-scales by line count (24px * lines). Vertical alignment: top (vt=0) for multi-line cells. Removed leading-space indentation hack (didn't work reliably). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 17:07:07 +02:00
Benjamin Admin	f2bc62b4f5	SpreadsheetView: bullet indentation, expanded rows, box borders Multi-line cells (\n): expanded into separate rows so each line gets its own cell. Continuation lines (after •) indented with leading spaces. Bullet marker lines (•) are bold. Font-size detection: cells with word_box height >1.3x median get bold and larger font (fs=12) for box titles. Headers: is_header rows always bold with light background tint. Box borders: thick colored outside border + thin inner grid lines. Content zone: light gray grid borders. Auto-fit column widths from longest text per column. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 16:15:43 +02:00
Benjamin Admin	674c9e949e	SpreadsheetView: auto-fit column widths to longest text Column widths now calculated from the longest text in each column (~7.5px per character + padding). Takes the maximum of auto-fit width and scaled original pixel width. Multi-line cells: uses the longest line for width calculation. Spanning header cells excluded from width calculation (they span multiple columns and would inflate single-column widths). Minimum column width: 60px. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 09:43:50 +02:00
Benjamin Admin	e131aa719e	SpreadsheetView: formatting improvements for Excel-like display Height: sheet height auto-calculated from row count (26px/row + toolbar), no more cutoff at 21 rows. Row count set to exact (no padding). Box borders: thick colored outside border + thin inner grid lines. Content zone: light gray grid lines on all cells. Headers: bold (bl=1) for is_header rows. Larger font detected via word_box height comparison (>1.3x median → fs=12 + bold). Box cells: light tinted background from box_bg_hex. Header cells in boxes: slightly stronger tint. Multi-line cells: text wrap enabled (tb='2'), \n preserved. Bullet points (•) and indentation preserved in cell text. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 09:29:50 +02:00
Benjamin Admin	17f0fdb2ed	Refactor: extract _build_grid_core into grid_build_core.py + clean StepAnsicht grid_editor_api.py: 2411 → 474 lines - Extracted _build_grid_core() (1892 lines) into grid_build_core.py - API file now only contains endpoints (build, save, get, gutter, box, unified) StepAnsicht.tsx: 212 → 112 lines - Removed useGridEditor imports (not needed for read-only spreadsheet) - Removed unified grid fetch/build (not used with multi-sheet approach) - Removed Spreadsheet/Grid toggle (only spreadsheet mode now) - Simple: fetch grid-editor data → pass to SpreadsheetView Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 08:54:55 +02:00
Benjamin Admin	d4353d76fb	SpreadsheetView: multi-sheet tabs instead of unified single sheet Each zone becomes its own Excel sheet tab with independent column widths: - Sheet "Vokabeln": main content zone with EN/DE/example columns - Sheet "Pounds and euros": Box 1 with its own 4-column layout - Sheet "German leihen": Box 2 with single column for flowing text This solves the column-width conflict: boxes have different column widths optimized for their content, which is impossible in a single unified sheet (Excel limitation: column width is per-column, not per-cell). Sheet tabs visible at bottom (showSheetTabs: true). Box sheets get colored tab (from box_bg_hex). First sheet active by default. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 00:51:21 +02:00
Benjamin Admin	b42f394833	Integrate Fortune Sheet spreadsheet editor in StepAnsicht Install @fortune-sheet/react (MIT, v1.0.4) as Excel-like spreadsheet component. New SpreadsheetView.tsx converts unified grid data to Fortune Sheet format (celldata, merge config, column/row sizes). StepAnsicht now has Spreadsheet/Grid toggle: - Spreadsheet mode: full Fortune Sheet with toolbar (bold, italic, color, borders, merge cells, text wrap, undo/redo) - Grid mode: existing GridTable for quick editing Box-origin cells get light tinted background in spreadsheet view. Colspan cells converted to Fortune Sheet merge format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 00:08:03 +02:00
Benjamin Admin	c1a903537b	Unified Grid: merge all zones into single Excel-like grid Backend (unified_grid.py): - build_unified_grid(): merges content + box zones into one zone - Dominant row height from median of content row spacings - Full-width boxes: rows integrated directly - Partial-width boxes: extra rows inserted when box has more text lines than standard rows fit (e.g., 7 lines in 5-row height) - Box-origin cells tagged with source_zone_type + box_region metadata Backend (grid_editor_api.py): - POST /sessions/{id}/build-unified-grid → persists as unified_grid_result - GET /sessions/{id}/unified-grid → retrieve persisted result Frontend: - GridEditorCell: added source_zone_type, box_region fields - GridTable: box-origin cells get tinted background + left border - StepAnsicht: split-view with original image (left) + editable unified GridTable (right). Auto-builds on first load. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 23:37:55 +02:00
Benjamin Admin	7085c87618	StepAnsicht: dominant row height for content + proportional box rows Content sections: use dominant (median) row height from all content rows instead of per-section average. This ensures uniform row height above and below boxes (the standard case on textbook pages). Box sections: distribute height proportionally by text line count per row. A header (1 line) gets 1/7 of box height, a bullet with 3 lines gets 3/7. Fixes Box 2 where row 3 was cut off because even distribution didn't account for multi-line cells. Removed overflow:hidden from box container to prevent clipping. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 17:43:02 +02:00
Benjamin Admin	1b7e095176	StepAnsicht: fix row filtering for partial-width boxes Content rows were incorrectly filtered out when their Y overlapped with a box, even if the box only covered the right half of the page. Now checks both Y AND X overlap — rows are only excluded if they start within the box's horizontal range. Fixes: rows next to Box 2 (lend, coconut, taste) were missing from reconstruction because Box 2 (x=871, w=525) only covers the right side, but left-side content rows at x≈148 were being filtered. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 17:00:28 +02:00
Benjamin Admin	dcb873db35	StepAnsicht: section-based layout with averaged row heights Major rewrite of reconstruction rendering: - Page split into vertical sections (content/box) around box boundaries - Content sections: uniform row height = (last_row - first_row) / (n-1) - Box sections: rows evenly distributed within box height - Content rows positioned absolutely at original y-coordinates - Font size derived from row height (55% of row height) - Multi-line cells (bullets) get expanded height with indentation - Boxes render at exact bbox position with colored border - Preparation for unified grid where boxes become part of main grid Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 15:29:40 +02:00
Benjamin Admin	fd39d13d06	StepAnsicht: use server-rendered OCR overlay image Replace manual word_box positioning (wild/unsnapped) with the server-rendered words-overlay image from the OCR step endpoint. This shows the same cleanly snapped red letters as the OCR step. Endpoint: /sessions/{id}/image/words-overlay Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 23:26:54 +02:00
Benjamin Admin	c5733a171b	StepAnsicht: fix font size and row spacing to match original - Font: use font_size_suggestion_px * scale directly (removed 0.85 factor) - Row height: calculate from row-to-row spacing (y_min of next row minus y_min of current row) instead of text height (y_max - y_min). This produces correct line spacing matching the original layout. - Multi-line cells: height multiplied by line count Content zone should now span from ~250 to ~2050 matching the original. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 23:24:27 +02:00
Benjamin Admin	18213f0bde	StepAnsicht: split-view with coordinate grid for comparison Left panel: Original scan + OCR word overlay (red text at exact word_box positions) + coordinate grid Right panel: Reconstructed layout + same coordinate grid Features: - Coordinate grid toggle with 50/100/200px spacing options - Grid lines labeled with pixel coordinates in original image space - Both panels share the same scale for direct visual comparison - OCR overlay shows detected text in red mono font at original positions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 23:00:22 +02:00
Benjamin Admin	cd8eb6ce46	Add Ansicht step (Step 12): read-only page layout preview New pipeline step showing the reconstructed page with all zones positioned at their original coordinates: - Content zones with vocabulary grid cells - Box zones with colored borders (from structure detection) - Colspan cells rendered across multiple columns - Multi-line cells (bullets) with pre-wrap whitespace - Toggle to overlay original scan image at 15% opacity - Proportionally scaled to viewport width - Pure CSS positioning (no canvas/Fabric.js) Pipeline: 14 steps (0-13), Ground Truth moved to Step 13. Added colspan field to GridEditorCell type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 22:42:33 +02:00
Benjamin Admin	48de4d98cd	Fix infinite loop in StepBoxGridReview auto-build Auto-build was triggering on every grid.zones.length change, which happens on every rebuild (zone indices increment). Now uses a ref to ensure auto-build fires only once. Also removed boxZones.length===0 condition that could trigger unnecessary builds. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 17:06:11 +02:00
Benjamin Admin	8b29d20940	StepBoxGridReview: show box border color from structure detection - Use box_bg_hex for border color (from Step 7 structure detection) - Numbered color badges per box - Show color name in box header - Add box_bg_color/box_bg_hex to GridZone type Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 23:18:36 +02:00
Benjamin Admin	12b194ad1a	Fix StepBoxGridReview: match GridTable props interface GridTable expects zone (singular), onSelectCell, onCellTextChange, onToggleColumnBold, onToggleRowHeader, onNavigate, not the incorrect prop names from the first version. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 22:39:38 +02:00
Benjamin Admin	5da9a550bf	Add Box-Grid-Review step (Step 11) to OCR pipeline New pipeline step between Gutter Repair and Ground Truth that processes embedded boxes (grammar tips, exercises) independently from the main grid. Backend: - cv_box_layout.py: classify_box_layout() detects flowing/columnar/ bullet_list/header_only layout types per box - build_box_zone_grid(): layout-aware grid building (single-column for flowing text, independent columns for tabular content) - POST /sessions/{id}/build-box-grids endpoint with SmartSpellChecker - Layout type overridable per box via request body Frontend: - StepBoxGridReview.tsx: shows each box with cropped image + editable GridTable. Layout type dropdown per box. Auto-builds on first load. - Auto-skip when no boxes detected on page - Pipeline steps updated: 13 steps (0-12), Ground Truth moved to 12 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 17:26:06 +02:00
Benjamin Admin	8c482ce8dd	Fix Grid Build step: show grid-editor summary instead of word_result The Grid Build step was showing word_result.grid_shape (from the initial OCR word clustering, often just 1 column) instead of the grid-editor summary (zone-based, with correct column/row/cell counts). Now reads summary.total_rows/total_columns/total_cells from the grid-editor result. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 21:01:18 +02:00
Benjamin Admin	2828871e42	Show detected page number in session header Extracts page_number from grid_editor_result when opening a session and displays it as "S. 233" badge in the SessionHeader, next to the category and GT badges. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 00:20:53 +02:00
Benjamin Admin	611e1ee33d	Add GT badge to grouped sessions and sub-pages in session list The GT badge was only shown on ungrouped SessionRow items. Now also visible on document group rows (e.g. "GT 1/2") and individual pages within expanded groups. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 23:54:55 +02:00
Benjamin Admin	e6f8e12f44	Show full Grid-Review in Ground Truth step + GT badge in session list - StepGroundTruth now shows the split view (original image + table) so the user can verify the final result before marking as GT - Backend session list now returns is_ground_truth flag - SessionList shows amber "GT" badge for marked sessions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 19:34:32 +02:00
Benjamin Admin	d1e7dd1c4a	Fix gutter repair: detect short fragments + show spell alternatives - Lower min word length from 3→2 for hyphen-join candidates so fragments like "ve" (from "ver-künden") are no longer skipped - Return all spellchecker candidates instead of just top-1, so user can pick the correct form (e.g. "stammeln" vs "stammelt") - Frontend shows clickable alternative buttons for spell_fix suggestions - Backend accepts text_overrides in apply endpoint for user-selected alternatives Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 19:09:12 +02:00
Benjamin Admin	71e1b10ac7	Add gutter repair step to OCR Kombi pipeline New step "Wortkorrektur" between Grid-Review and Ground Truth that detects and fixes words truncated or blurred at the book gutter (binding area) of double-page scans. Uses pyspellchecker (DE+EN) for validation. Two repair strategies: - hyphen_join: words split across rows with missing chars (ve + künden → verkünden) - spell_fix: garbled trailing chars from gutter blur (stammeli → stammeln) Interactive frontend with per-suggestion accept/reject and batch controls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 18:50:16 +02:00
Benjamin Admin	0168ab1a67	Remove Hauptseite/Box tabs from Kombi pipeline Page-split now creates independent sessions that appear directly in the session list. After split, the UI switches to the first child session. BoxSessionTabs, sub-session state, and parent-child tracking removed from Kombi code. Legacy ocr-overlay still uses BoxSessionTabs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 17:43:58 +01:00
Benjamin Admin	9f68bd3425	feat: Implement page-split step with auto-detection and sub-session naming StepPageSplit now: - Auto-calls POST /page-split on step entry - Shows oriented image + detection result - If double page: creates sub-sessions named "Title — S. 1/2" - If single page: green badge "keine Trennung noetig" - Manual "Weiter" button (no auto-advance) Also: - StepOrientation wrapper simplified (no page-split in orientation) - StepUpload passes name back via onUploaded(sid, name) - page.tsx: after page-split "Weiter" switches to first sub-session - useKombiPipeline exposes setSessionName Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-26 17:56:45 +01:00
Benjamin Admin	469f09d1e1	fix: Redesign StepUpload for manual step control StepUpload now has 3 phases: 1. File selection: drop zone / file picker → shows preview 2. Review: title input, category, file info → "Hochladen" button 3. Uploaded: shows session image → "Weiter" button No more auto-advance after upload. User controls every step. openSession() removed from onUploaded callback to prevent step-reset race condition. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-26 17:35:36 +01:00
Benjamin Admin	d26a9f60ab	Add OCR Kombi Pipeline: modular 11-step architecture with multi-page support Phase 1 of the clean architecture refactor: Replaces the 751-line ocr-overlay monolith with a modular pipeline. Each step gets its own component file. Frontend: /ai/ocr-kombi route with 11 steps (Upload, Orientation, PageSplit, Deskew, Dewarp, ContentCrop, OCR, Structure, GridBuild, GridReview, GroundTruth). Session list supports document grouping for multi-page uploads. Backend: New ocr_kombi/ module with multi-page PDF upload (splits PDF into N sessions with shared document_group_id). DB migration adds document_group_id and page_number columns. Old /ai/ocr-overlay remains fully functional for A/B testing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-26 15:55:28 +01:00
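
Note on commit 674c9e949e: the auto-fit rule it describes (about 7.5px per character plus padding, longest line of multi-line cells, spanning cells skipped, 60px minimum, never narrower than the scaled original width) can be sketched roughly as follows. This is an illustrative Python sketch, not the project's SpreadsheetView code: the function name, the cell dictionary keys (`col`, `text`, `colspan`) and the 16px padding value are assumptions made for the example.

```python
CHAR_PX = 7.5       # approximate width per character, as stated in the commit
PADDING_PX = 16     # assumed padding; the commit does not give the actual value
MIN_WIDTH_PX = 60   # minimum column width, as stated in the commit


def autofit_column_widths(cells, n_cols, scaled_original_widths):
    """Return one width per column (hypothetical helper).

    cells: list of dicts with keys 'col', 'text' and optional 'colspan'.
    scaled_original_widths: per-column pixel widths scaled from the scan.
    """
    longest = [0] * n_cols
    for cell in cells:
        # Spanning cells would inflate a single column, so they are skipped.
        if cell.get("colspan", 1) > 1:
            continue
        # For multi-line cells only the longest line matters.
        line_len = max(len(line) for line in cell.get("text", "").split("\n"))
        col = cell["col"]
        longest[col] = max(longest[col], line_len)

    widths = []
    for col in range(n_cols):
        auto = longest[col] * CHAR_PX + PADDING_PX
        widths.append(max(MIN_WIDTH_PX, auto, scaled_original_widths[col]))
    return widths


example_cells = [
    {"col": 0, "text": "lend"},
    {"col": 1, "text": "• one\n• a much longer line"},
    {"col": 0, "text": "spanning header text here", "colspan": 2},
]
example_widths = autofit_column_widths(example_cells, 2, [50, 100])
```

Taking the maximum of the three candidates means a column never shrinks below what the original page layout suggested, while still growing for long text.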

30 Commits
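
Note on commit 1b7e095176: the fix it describes, excluding a content row only when it overlaps a box vertically and also starts inside the box's horizontal range, might look like the sketch below. It is a Python illustration under stated assumptions, not the StepAnsicht implementation: the `Box` and `Row` types and the function name are hypothetical, and only `x=871`, `w=525` and the row position `x≈148` come from the commit message (the y values are made up for the example).

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float


@dataclass
class Row:
    x: float       # x-coordinate where the row's content starts
    y_min: float
    y_max: float


def row_hidden_by_box(row: Row, box: Box) -> bool:
    """True only if the row overlaps the box in Y and starts within its X range."""
    y_overlap = row.y_min < box.y + box.h and row.y_max > box.y
    x_inside = box.x <= row.x < box.x + box.w
    return y_overlap and x_inside


# Box 2 from the commit message covers only the right side of the page.
box2 = Box(x=871, y=1000, w=525, h=300)        # y and h are assumed values
left_row = Row(x=148, y_min=1050, y_max=1080)  # e.g. a row next to the box
inner_row = Row(x=900, y_min=1050, y_max=1080)

kept = [r for r in (left_row, inner_row) if not row_hidden_by_box(r, box2)]
```

With the earlier Y-only check, `left_row` would have been dropped as well; adding the X test keeps rows that merely sit beside a partial-width box.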