Commit Graph

4 Commits

Author SHA1 Message Date
Benjamin Admin
a73ddce43d Fix missing PageZone import in grid_editor_helpers.py
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m52s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 15s
The zone merging function used PageZone but the import was only
in grid_editor_api.py. Caused NameError on sessions that trigger
zone merging (e.g. original_scan_b59a1b1b).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 22:04:21 +01:00
Benjamin Admin
76cd1ac020 Fix false headers on sparse layouts and IPA corruption on German text
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 33s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m55s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 17s
1. Header detection: Add 25% cap to single-cell heading heuristic.
   On German synonym dicts where most rows naturally have only 1
   content cell, the old logic marked 60%+ of rows as headers.

2. IPA de/all mode: Use "column_text" (light processing) for non-
   English columns instead of "column_en" (full processing). The
   full path runs _insert_missing_ipa() which splits on whitespace,
   matches English prefixes ("bildschön" → "bild"), and truncates
   the rest — destroying German comma-separated synonym lists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 21:49:05 +01:00
Benjamin Admin
52b66ebe07 Fix NameError: _text_has_garbled_ipa not imported in grid_editor_helpers
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 1m52s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 16s
After refactoring grid_editor_api.py into helpers, the function
_text_has_garbled_ipa was used in _detect_heading_rows_by_single_cell
but never imported from cv_ocr_engines. This caused HTTP 500 on
build-grid for sessions that trigger single-cell heading detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:11:29 +01:00
Benjamin Admin
12b4c61bac refactor: extract grid helpers + generic CV-gated syllable insertion
1. Extracted 1367 lines of helper functions from grid_editor_api.py
   (3051→1620 lines) into grid_editor_helpers.py (filters, detectors,
   zone grid building).

2. Created cv_syllable_detect.py with generic CV+pyphen logic:
   - Checks EVERY word_box for vertical pipe lines (not just first word)
   - No article-column dependency — works with any dictionary layout
   - CV morphological detection gates pyphen insertion

3. Grid editor scroll: calc(100vh-200px) for reliable scrolling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 14:39:33 +01:00