Commit Graph

  • 52637778b9 SmartSpellChecker: boundary repair + context split + abbreviation awareness Benjamin Admin 2026-04-12 15:41:17 +02:00
  • f6372b8c69 Integrate SmartSpellChecker into build-grid finalization Benjamin Admin 2026-04-12 14:54:01 +02:00
  • 909d0729f6 Add SmartSpellChecker + refactor vocab-worksheet page.tsx Benjamin Admin 2026-04-12 12:25:01 +02:00
  • 04fa01661c Move IPA/syllable toggles to vocabulary tab toolbar Benjamin Admin 2026-04-12 10:17:14 +02:00
  • bf9d24e108 Replace IPA/syllable checkboxes with full dropdowns in vocab-worksheet Benjamin Admin 2026-04-12 10:10:22 +02:00
  • 0f17eb3cd9 Fix IPA:Aus — strip all brackets before skipping IPA block Benjamin Admin 2026-04-12 10:05:22 +02:00
  • 5244e10728 Fix IPA/syllable race condition: loadGrid no longer depends on buildGrid Benjamin Admin 2026-04-12 09:59:49 +02:00
  • a6c5f56003 Fix IPA strip: match all square brackets, not just Unicode IPA Benjamin Admin 2026-04-12 09:53:16 +02:00
  • 584e07eb21 Strip English IPA when mode excludes EN (nur DE / Aus) Benjamin Admin 2026-04-12 09:49:22 +02:00
  • 54b1c7d7d7 Fix IPA/syllable first-click not working (off-by-one in initialLoadDone) Benjamin Admin 2026-04-12 09:40:57 +02:00
  • d8a2331038 Fix IPA/syllable mode change requiring double-click Benjamin Admin 2026-04-12 09:32:02 +02:00
  • ad78e26143 Fix word-split: handle IPA brackets, contractions, and tiebreaker Benjamin Admin 2026-04-12 09:13:02 +02:00
  • 4f4e6c31fa Fix word-split tiebreaker: prefer longer first word Benjamin Admin 2026-04-12 09:05:14 +02:00
  • 7ffa4c90f9 Lower word-split threshold from 7 to 4 chars Benjamin Admin 2026-04-12 08:59:02 +02:00
  • 656cadbb1e Remove page-number footers from grid, promote to metadata Benjamin Admin 2026-04-12 08:50:20 +02:00
  • 757c8460c9 Detect written-out page numbers as footer rows Benjamin Admin 2026-04-12 08:39:43 +02:00
  • 501de4374a Keep page references as visible column cells Benjamin Admin 2026-04-12 08:27:44 +02:00
  • 774bbc50d3 Add debug logging for empty-column-removal Benjamin Admin 2026-04-11 22:45:22 +02:00
  • 9ceee4e07c Protect page references from junk-row removal Benjamin Admin 2026-04-11 22:40:37 +02:00
  • f23aaaea51 Fix false header detection: skip continuation lines and mid-column cells Benjamin Admin 2026-04-11 22:21:09 +02:00
  • cde13c9623 Fix IPA stripping digits after headwords (Theme 1 → Theme) Benjamin Admin 2026-04-11 22:13:45 +02:00
  • 2e42167c73 Remove empty columns from grid zones Benjamin Admin 2026-04-11 22:04:49 +02:00
  • 5eff4cf877 Fix page refs deleted as artifacts + IPA spacing for DE mode Benjamin Admin 2026-04-11 22:01:25 +02:00
  • 7f4b8757ff Fix IPA spacing + add zone debug logging for marker column issue Benjamin Admin 2026-04-11 21:51:52 +02:00
  • 7263328edb Fix marker column detection: remove min-rows requirement Benjamin Admin 2026-04-11 21:24:25 +02:00
  • 8c482ce8dd Fix Grid Build step: show grid-editor summary instead of word_result Benjamin Admin 2026-04-11 21:01:18 +02:00
  • 00f7a7154c Fix left-side gutter detection: find peak instead of scanning from edge Benjamin Admin 2026-04-11 16:52:23 +02:00
  • 9c5e950c99 Fix multi-page PDF upload: include session_id for first page Benjamin Admin 2026-04-11 16:26:25 +02:00
  • 6e494a43ab Apply merged-word splitting to grid-editor cells Benjamin Admin 2026-04-11 14:52:00 +02:00
  • 53b0d77853 Multi-page PDF support: create one session per page Benjamin Admin 2026-04-11 14:39:48 +02:00
  • aed0edbf6d Fix word split scoring: prefer longer words over short ones Benjamin Admin 2026-04-11 14:14:23 +02:00
  • 9e2c301723 Add merged-word splitting to OCR spell review Benjamin Admin 2026-04-11 14:11:16 +02:00
  • 633e301bfd Add camera gutter detection via vertical continuity analysis Benjamin Admin 2026-04-11 13:58:14 +02:00
  • 9b5e8c6b35 Restructure upload flow: document first, then preview + naming Benjamin Admin 2026-04-11 12:53:47 +02:00
  • 682b306e51 Use grid-build zones for vocab extraction (4-column detection) Benjamin Admin 2026-04-11 01:17:40 +02:00
  • 3e3116d2fd Fix vocab extraction: show all columns for generic layouts Benjamin Admin 2026-04-11 01:11:40 +02:00
  • 9a8ce69782 Fix vocab extraction: use original column types for EN/DE classification Benjamin Admin 2026-04-11 01:07:49 +02:00
  • 66f8a7b708 Improve vocab-worksheet UX: better status messages + error details Benjamin Admin 2026-04-11 00:55:56 +02:00
  • 3b78baf37f Replace old OCR pipeline with Kombi pipeline + add IPA/syllable toggles Benjamin Admin 2026-04-11 00:43:42 +02:00
  • 2828871e42 Show detected page number in session header Benjamin Admin 2026-04-11 00:20:53 +02:00
  • 5c96def4ec Skip valid line-break hyphenations in gutter repair Benjamin Admin 2026-04-11 00:14:21 +02:00
  • 611e1ee33d Add GT badge to grouped sessions and sub-pages in session list Benjamin Admin 2026-04-10 23:54:55 +02:00
  • 49d5212f0c Fix hyphen-join: preserve next row + skip valid hyphenations Benjamin Admin 2026-04-10 19:49:07 +02:00
  • e6f8e12f44 Show full Grid-Review in Ground Truth step + GT badge in session list Benjamin Admin 2026-04-10 19:34:32 +02:00
  • aabd849e35 Fix hyphen-join: strip trailing punctuation from continuation word Benjamin Admin 2026-04-10 19:25:28 +02:00
  • d1e7dd1c4a Fix gutter repair: detect short fragments + show spell alternatives Benjamin Admin 2026-04-10 19:09:12 +02:00
  • 71e1b10ac7 Add gutter repair step to OCR Kombi pipeline Benjamin Admin 2026-04-10 18:50:16 +02:00
  • 21b69e06be Fix cross-column word assignment by splitting OCR merge artifacts Benjamin Admin 2026-03-28 10:54:41 +01:00
  • 0168ab1a67 Remove Hauptseite/Box tabs from Kombi pipeline Benjamin Admin 2026-03-27 17:43:58 +01:00
  • 925f4356ce Use spellchecker instead of pyphen for pipe autocorrect validation Benjamin Admin 2026-03-27 16:47:42 +01:00
  • cc4cb3bc2f Add pipe auto-correction and graphic artifact filter for grid builder Benjamin Admin 2026-03-27 16:33:38 +01:00
  • 0685fb12da Fix Bug 3: recover OCR-lost prefixes via overlap merge + chain merging Benjamin Admin 2026-03-27 15:49:52 +01:00
  • 96ea23164d Fix word-gap merge: add missing pronouns to stop words, reduce threshold Benjamin Admin 2026-03-27 15:35:12 +01:00
  • a8773d5b00 Fix 4 Grid Editor bugs: syllable modes, heading detection, word gaps Benjamin Admin 2026-03-27 15:24:35 +01:00
  • 9f68bd3425 feat: Implement page-split step with auto-detection and sub-session naming Benjamin Admin 2026-03-26 17:56:45 +01:00
  • 469f09d1e1 fix: Redesign StepUpload for manual step control Benjamin Admin 2026-03-26 17:35:36 +01:00
  • 3bb04b25ab fix: OCR Kombi upload race condition — openSession was resetting step to 0 Benjamin Admin 2026-03-26 17:10:04 +01:00
  • 85fe0a73d6 docs: Add OCR Kombi Pipeline to MkDocs and cross-reference from OCR Pipeline Benjamin Admin 2026-03-26 16:09:40 +01:00
  • eaade3cad2 feat: Mechanical engineering sector + extended INDUSTRY_REGULATION_MAP Benjamin Admin 2026-03-26 15:59:31 +01:00
  • d26a9f60ab Add OCR Kombi Pipeline: modular 11-step architecture with multi-page support Benjamin Admin 2026-03-26 15:55:28 +01:00
  • d26233b5b3 Add page number display to StepGridReview summary bar Benjamin Admin 2026-03-26 11:21:44 +01:00
  • e019dde01b Extract page number as metadata instead of silently removing it Benjamin Admin 2026-03-26 08:52:09 +01:00
  • 5af5d821a5 Fix 3 grid issues: artifact cells, connector col noise, footer false positive Benjamin Admin 2026-03-26 08:18:55 +01:00
  • 525de55791 Fix syllable+IPA combination: strip bracket content before IPA guard Benjamin Admin 2026-03-26 00:03:10 +01:00
  • f860eb66e6 Add German IPA support (wiki-pronunciation-dict + epitran) Benjamin Admin 2026-03-25 22:18:20 +01:00
  • a73ddce43d Fix missing PageZone import in grid_editor_helpers.py Benjamin Admin 2026-03-25 22:04:21 +01:00
  • 47e83d90bd Remove IPA:DE option — no German IPA dictionary available Benjamin Admin 2026-03-25 21:53:43 +01:00
  • 76cd1ac020 Fix false headers on sparse layouts and IPA corruption on German text Benjamin Admin 2026-03-25 21:49:05 +01:00
  • 256df820cd Auto-rebuild grid when IPA or syllable mode dropdown changes Benjamin Admin 2026-03-25 20:43:20 +01:00
  • 7773c51304 Fix en/de mode edge case on docs without detected English column Benjamin Admin 2026-03-25 08:37:15 +01:00
  • 83c058e400 Add language-specific IPA and syllable modes (de/en) Benjamin Admin 2026-03-25 08:16:29 +01:00
  • 34680732f8 Add IPA and syllable mode toggles, fix false IPA on German documents Benjamin Admin 2026-03-25 08:04:44 +01:00
  • c42924a94a Fix IPA correction persistence and false-positive prefix matching Benjamin Admin 2026-03-25 07:26:32 +01:00
  • 9ea217bdfc Fix IPA correction for dictionary pages (WIP) Benjamin Admin 2026-03-24 23:54:14 +01:00
  • 4feec7c7b7 Lower syllable pipe-ratio threshold from 5% to 1% Benjamin Admin 2026-03-24 23:17:08 +01:00
  • ed7fc99fc4 Improve syllable divider insertion for dictionary pages Benjamin Admin 2026-03-24 19:44:29 +01:00
  • 7fbcae954b fix: auto-trigger orientation for page-split sessions without result Benjamin Admin 2026-03-24 17:19:56 +01:00
  • f931091b57 refactor: independent sessions for page-split + URL-based pipeline navigation Benjamin Admin 2026-03-24 17:05:33 +01:00
  • f34340de9c Fix sub-session completion flow: navigate to next incomplete sub-session Benjamin Admin 2026-03-24 16:33:56 +01:00
  • 55de6c21d2 Fix session resume: auto-open most advanced sub-session on parent click Benjamin Admin 2026-03-24 16:04:53 +01:00
  • 52b66ebe07 Fix NameError: _text_has_garbled_ipa not imported in grid_editor_helpers Benjamin Admin 2026-03-24 15:11:29 +01:00
  • 424e5c51d4 fix: remove nested scrollbar in grid editor Benjamin Admin 2026-03-24 15:06:28 +01:00
  • 12b4c61bac refactor: extract grid helpers + generic CV-gated syllable insertion Benjamin Admin 2026-03-24 14:39:33 +01:00
  • d9b2aa82e9 fix: CV-gated syllable insertion + grid editor scroll Benjamin Admin 2026-03-24 14:31:16 +01:00
  • 364086b86e feat: auto-insert syllable dividers via pyphen on dictionary pages Benjamin Admin 2026-03-24 14:17:26 +01:00
  • fe754398c0 fix: Step 4f sidebar detection uses avg text length instead of fill ratio Benjamin Admin 2026-03-24 14:10:43 +01:00
  • be86a7d14d fix: preserve pipe syllable dividers + detect alphabet sidebar columns Benjamin Admin 2026-03-24 13:52:11 +01:00
  • 19a5f69272 fix: make Grid Editor vertically scrollable so all rows are visible Benjamin Admin 2026-03-24 13:33:52 +01:00
  • ea09fc75df fix: resolve circular import with lazy import for _build_reference_snapshot Benjamin Admin 2026-03-24 13:18:21 +01:00
  • 410d36f3de feat: save automatic grid snapshot before manual edits for GT comparison Benjamin Admin 2026-03-24 13:16:44 +01:00
  • 72ce4420cb fix: advance uiStep past skipped orientation for page-split sub-sessions Benjamin Admin 2026-03-24 12:59:36 +01:00
  • 63dfb4d06f fix: replace reset useEffects with key prop for step component remount Benjamin Admin 2026-03-24 12:20:50 +01:00
  • 08a91ba2be Fix sub-session tab switching: reset step state on sessionId change Benjamin Admin 2026-03-24 12:04:23 +01:00
  • 49a36364a8 Add double-page split support to OCR Overlay (Kombi 7 Schritte) Benjamin Admin 2026-03-24 11:48:26 +01:00
  • 14fd8e0b1e Fix page-split: fetch sub-sessions from API instead of React state Benjamin Admin 2026-03-24 11:22:15 +01:00
  • 247b79674d Add double-page spread detection to frontend pipeline Benjamin Admin 2026-03-24 11:09:44 +01:00
  • 40815dafd1 feat(ocr-pipeline): add page-split endpoint for double-page book spreads Benjamin Admin 2026-03-24 10:53:06 +01:00
  • 2a21127f01 fix(ocr-pipeline): improve page crop spine detection and cell assignment Benjamin Admin 2026-03-24 09:23:30 +01:00
  • 9d34c5201e feat(grid-editor): add manual cell color control via right-click menu Benjamin Admin 2026-03-24 08:51:18 +01:00
  • d54814fa70 feat: color bar respects edits + column pattern auto-correction Benjamin Admin 2026-03-24 08:38:11 +01:00