1. Pipe divider fix: Changed OCR char-confusion regex so | between
letters (Ka|me|rad) is NOT converted to I. Only standalone/
word-boundary pipes are converted (|ch → Ich, | want → I want).
2. Alphabet sidebar detection improvements:
- _filter_decorative_margin() now considers 2-char words (OCR reads
"Aa", "Bb" from sidebars), lowered min strip from 8→6
- _filter_border_strip_words() lowered decorative threshold from 50%→45%
- New step 4f: grid-level thin-edge-column filter as safety net —
removes edge columns with <35% fill rate and >60% short text
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The right panel (grid area) had no vertical overflow handling, causing
the last ~5 rows to be clipped and invisible. Added overflow-y-auto
with max-height 80vh, and removed overflow-hidden from the GridTable
wrapper that was cutting off content.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- build-grid now saves the automatic OCR result as ground_truth.auto_grid_snapshot
- mark-ground-truth includes a correction_diff comparing auto vs corrected
- New endpoint GET /correction-diff returns detailed diff with per-col_type
accuracy breakdown (english, german, ipa, etc.)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Page-split sub-sessions (current_step=2) had orientation marked as skipped
but uiStep remained at 0 (orientation step), causing StepOrientation to
render for a sub-session that has no orientation data. Now advances to
uiStep=1 (deskew) when orientation is skipped.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The reset useEffects in StepOrientation/Deskew/Dewarp/Crop were clearing
orientationResult when sessionId changed (e.g. during handleOrientationComplete),
causing the right side of ImageCompareView to show nothing. Using key={sessionId}
on the step components instead forces React to remount with fresh state when
switching sessions, without interfering with the upload/orientation flow.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Step components (Deskew, Dewarp, Crop, Orientation) had local state
guards that prevented reloading when sessionId changed via sub-session
tab clicks. Added useEffect reset hooks that clear all local state
when sessionId changes, allowing the component to properly reload
the new session's data.
Also renamed "Box N" to "Seite N" in BoxSessionTabs per user feedback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The page-split detection was only implemented in the regular pipeline
page but not in the OCR Overlay page where the user actually tests
with Kombi mode. Now the overlay page has full sub-session support:
- openSession: handles sub_sessions, parent_session_id, skip logic
for page-split vs crop-based sub-sessions, preserves current mode
- handleOrientationComplete: async, fetches API to detect sub-sessions
- BoxSessionTabs: shown between stepper and step content
- handleNext: returns to parent after sub-session completion
- handleSessionChange/handleBoxSessionsCreated: session switching
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
handleOrientationComplete was checking subSessions from React state,
but due to batching the state was still empty when the user clicked
"Seiten verarbeiten". Now fetches session data directly from the API
to reliably detect sub-sessions and auto-open the first one.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After orientation detection, the frontend now automatically calls the
page-split endpoint. When a double-page book spread is detected, two
sub-sessions are created and each goes through the full pipeline
(deskew/dewarp/crop) independently — essential because each page of a
spread tilts differently due to the spine.
Frontend changes:
- StepOrientation: calls POST /page-split after orientation, shows
split info ("Doppelseite erkannt"), notifies parent of sub-sessions
- page.tsx: distinguishes page-split sub-sessions (current_step < 5)
from crop-based sub-sessions (current_step >= 5). Page-split subs
only skip orientation, not deskew/dewarp/crop.
- page.tsx: handleOrientationComplete opens first sub-session when
page-split was detected
Backend changes (orientation_crop_api.py):
- page-split endpoint falls back to original image when orientation
rotated a landscape spread to portrait
- start_step parameter: 1 if split from original, 2 if from oriented
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each page of a double-page scan tilts differently due to the book spine.
The new POST /page-split endpoint detects spreads after orientation and
creates sub-sessions that go through the full pipeline (deskew, dewarp,
crop, etc.) individually, so each page gets its own deskew correction.
Also fixes border-strip filter incorrectly removing German translation
words by adding a decorative-strip validation check.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. page_crop: Score all dark runs by center-proximity × darkness ×
narrowness instead of picking the widest. Fixes ad810209 where a
wide dark area at 35% was chosen over the actual spine at 50%.
2. cv_words_first: Replace x-center-only word→column assignment with
overlap-based three-pass strategy (overlap → midpoint-range → nearest).
Fixes truncated German translations like "Schal" instead of
"Schal - die Schals" in session 079cd0d9.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Users can now right-click any cell to set text color (red, green, blue,
orange, purple, black) or remove the color bar without changing text.
A "reset" option restores the OCR-detected color. This enables accurate
Ground Truth marking when OCR assigns colors to wrong cells.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Color bar (red/colored indicator) now only shows when word_boxes
text still matches the cell text — editing the cell hides stale colors
- New "Auto-Korrektur" button: detects dominant prefix+number patterns
per column (e.g. p.70, p.71) and completes partial entries (.65 → p.65)
— requires 3+ matching entries before correcting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resize handle: wider (9px), z-40 (above z-30 buttons).
Add-column button moved to bottom-right corner to avoid overlap.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New ImageLayoutEditor: SVG overlay on original scan with draggable
column dividers, horizontal guidelines (margins/header/footer),
double-click to add columns, x-button to delete
- GridTable: MIN_COL_WIDTH 40→80px for better readability
- Arrow up/down keys navigate between rows in the grid editor
- Ctrl+Click for multi-cell selection, Ctrl+B to toggle bold on selection
- getAdjacentCell works for cells that don't exist yet (new rows/cols)
- deleteColumn now merges x-boundaries correctly
- Session restore fix: grid_editor_result/structure_result in session GET
- Footer row 3-state cycle, auto-create cells for empty footer rows
- Grid save/build/GT-mark now advance current_step=11
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New document category "Woerterbuch" (frontend type + backend validation)
- Column delete: hover column header → red "x" button (with confirmation)
- Column add: hover column header → "+" button inserts after that column
- Both operations support undo/redo, update cell IDs and summary
- Available in both GridEditor and StepGridReview (Kombi last step)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New StepGridReview component: split-view (scan image left, grid right),
confidence stats, row-accept buttons, zoom controls
- Kombi Pipeline case 6 now uses StepGridReview instead of plain GridEditor
- Kombi step label changed to "Review & GT"
- Ground Truth queue page simplified to overview/navigation only
(links to Kombi pipeline for actual review work)
- Deep-link support: /ai/ocr-overlay?session=xxx&mode=kombi
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Ground-truth: zone.columns use 'label' not 'col_type' — calling
.replace() on undefined crashed the page after grid data loaded
- Model-management: same AIToolsSidebarResponsive wrapper bug as the
other pages — does not render children
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AIToolsSidebarResponsive does not accept children — it renders only a
sidebar nav. Using it as a wrapper caused page content to never render.
Replaced with plain div, matching the pattern used by ocr-pipeline.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both pages passed `moduleId` which is not a valid prop for PagePurpose.
The component expects explicit title/purpose/audience — calling
audience.join() on undefined caused the client-side crash.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new Projekt documentation pages covering product vision
(offline-first desktop app for teachers), 6-phase development roadmap,
and 3-tier hardware strategy with distribution plan.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run detect_graphic_elements() in the grid pipeline after image loading
and remove ALL words whose centroids fall inside detected graphic regions,
regardless of confidence. Previously only low-confidence words (conf < 50)
were removed, letting artifacts like "Tr", "Su" survive.
Changes:
- grid_editor_api.py: Import and call detect_graphic_elements() at Step 3a,
passing only significant words (len >= 3) to avoid short artifacts fooling
the text-vs-graphic heuristic. Hard-filter all words in graphic regions.
- cv_graphic_detect.py: Lower density threshold from 20% to 5% for large
regions (>100x80px) — photos/illustrations have low color saturation.
Raise page-spanning limit from 50% to 60% width/height.
Tested: 5 ground-truth sessions pass regression (079cd0d9, d8533a2c,
2838c7a7, 4233d7e3, 5997b635). Session 5997 now detects 2 graphic regions
and removes 29 artifact words including "Tr" and "Su".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
OCR splits words at syllable marks into overlapping word_boxes (e.g.
"zu" + "tiefst" with 52% x-overlap). Step 5i previously removed the
lower-confidence box, losing the prefix. Now: when both boxes are
alphabetic text with 20-75% overlap, MERGE them into one word_box
("zutiefst") instead of removing.
Also relaxed artifact cell filter: 2-char alphabetic text like "Zw"
(dictionary guide word) is no longer removed. Only non-alphabetic
short text like "a=" is filtered.
Results for session 5997: "tiefst"→"zutiefst", "zu"→"zuständig",
"Zu die Zuschüsse"→"Zuschuss, die Zuschüsse", "Zw" restored.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three fixes for dictionary page session 5997:
1. Heading detection: column_1 cells with article words (die/der/das)
now count as content cells, preventing "die Zuschrift, die Zuschriften"
from being falsely merged into a spanning heading cell.
2. Step 5j-pre: new artifact cell filter removes short garbled text from
OCR on image areas (e.g. "7 EN", "Tr", "\\", "PEE", "a="). Cells
survive earlier filters because their rows have real content in other
columns. Also cleans up empty rows after removal.
3. Footer "PEE" auto-fixed: artifact filter removes the noise cell,
empty row gets cleaned up, footer detection no longer sees it.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dictionary pages have 2 dictionary columns, each with article + headword
sub-columns. The right article column (die/der at x≈626) had only 14.3%
row coverage — below the 20% secondary threshold. Lowered to 12% so
dictionary article columns qualify. Also strip pipe characters from
individual word_box text (not just cell text) to remove OCR syllable
separation marks (e.g. "zu|trau|en" → "zutrauen").
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The border strip filter (Step 4e) used the LARGEST x-gap which incorrectly
removed base words along with edge artifacts. Now uses a two-stage approach:
1. _filter_border_strip_words() pre-filters raw words BEFORE column detection,
scanning from the page edge inward to find the FIRST significant gap (>30px)
2. Step 4e runs as fallback only when pre-filter didn't apply
Session 4233 now correctly detects 3 columns (base word | oder | synonyms)
instead of 2. Threshold raised from 15% to 20% to handle pages with many
edge artifacts. All 4 ground-truth sessions pass regression.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Step 5i rule (a) only caught blue tiny symbols. Graphic fragments from
page illustrations (e.g. orange quote mark from man illustration) were
missed. Now filters any non-black colored word_box with area < 200 and
confidence < 85.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Content word_boxes in test used x-spacing (i%3)*100 which created
internal gaps larger than the border-to-content gap. Changed to
(i%2)*51 so content words overlap and the border gap remains dominant.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Textbooks with decorative alphabet strips along page edges produce
OCR artifacts (scattered colored letters at x<150 while real content
starts at x>=179). Step 4e detects a significant x-gap (>30px) between
a small cluster (<15% of total word_boxes) near the page edge and the
main content, then removes the border-strip word_boxes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The frontend renders colored cells from the word_boxes array order,
not from cell.text. After post-processing steps (5i bullet removal etc),
word_boxes could remain in their original insertion order instead of
left-to-right reading order. Step 5j now explicitly sorts word_boxes
using _group_words_into_lines before the result is built.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cell text was rebuilt using naive (top, left) sorting after removing
word_boxes in Steps 4c/4d/5i. This produced wrong word order when
words on the same visual line had slightly different top values (1-6px).
Now uses _words_to_reading_order_text() which groups words into visual
lines by y-tolerance before sorting by x within each line, matching
the initial cell text construction in _build_cells.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Step 5i: For word_boxes with >90% x-overlap and different text, use IPA
dictionary to decide which to keep (e.g. "tightly" in dict, "fighily" not).
Red threshold raised from 80 to 90 to catch remaining scanner artifacts
like "tight" and "5" that were still misclassified as red.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Scanner artifacts on black text produce slight warm tint (hue ~0, sat ~60)
that was misclassified as red. Now requires median_sat >= 80 specifically
for red classification, since genuine red text always has high saturation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changed `typeof grid.zones[][]` to `GridZone[][]` which was causing
a silent build error, preventing the vsplit zone grouping logic from
being compiled into the production bundle.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pages with two side-by-side vocabulary columns separated by a vertical
black line are now split into independent sub-zones before row/column
detection. Each sub-zone gets its own rows, preventing misalignment from
different heading rhythms.
- _detect_vertical_dividers(): finds pipe word_boxes at consistent x
positions spanning >50% of zone height
- _split_zone_at_vertical_dividers(): creates left/right PageZone objects
with layout_hint and vsplit_group metadata
- Column union skips vsplit zones (independent column sets)
- Frontend renders vsplit zones side by side via flex layout
- PageZone gets layout_hint + vsplit_group fields
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Post-processing steps like 5h (slash-IPA conversion) modify cell.text
but not individual word_boxes. The colored per-word display showed
stale word_box text instead of the corrected cell text. Now falls
back to the plain input when texts don't match.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevents false narrow columns from text overflow at page edges.
Session 355f3c84 had a 3-row/4% tertiary cluster creating a spurious
third column from right-column text overflow.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Reject /.../ matches containing spaces, parens, or commas (e.g. sb/sth up)
- Second pass converts trailing /ipa2/ after [ipa1] (double pronunciation)
- Validate standalone /ipa/ at start against same reject pattern
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dictionary-style pages print IPA between slashes (e.g. tiger /'taiga/).
Step 5h detects these patterns, looks up the headword in the IPA dictionary
for proper Unicode IPA, and falls back to OCR text when not found.
Converts /ipa/ to [ipa] bracket notation matching the rest of the pipeline.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Step 4d removes "|" and "||" word_boxes that OCR produces when reading
physical vertical divider lines between columns. Also strips stray pipe
chars from cell text.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_is_grammar_bracket_content now splits "no pl" into ["no", "pl"]
instead of treating it as single token "no pl".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two fixes:
1. Add pl, sg, no, also, ae, be etc. to _GRAMMAR_BRACKET_WORDS so
annotations like (pl) and (no pl) are not replaced with IPA.
2. Skip articles (the, a, an) in fix_ipa_continuation_cell — they
never get IPA in vocabulary books.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Footer rows (e.g. page numbers) now show "F" in amber below the row
number, mirroring the blue "H" label for headers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three fixes:
1. fix_ipa_continuation_cell: when headword has inline IPA like
"beat [bˈiːt] , beat, beaten", only generate IPA for uncovered
words (beaten), not words already shown (beat). When bracket is
at end like "the Highlands [ˈhaɪləndz]", return inline IPA directly.
2. Step 5d: recover garbled IPA from word_boxes when Step 5c emptied
the cell text (e.g. "[n, nn]" → "").
3. Added 2 tests for inline IPA behavior (35 total).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Footer rows like "two hundred and twelve" are no longer removed from
the grid. Instead they stay in cells/rows and get tagged so the
frontend can render them differently.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>