Commit Graph

3 Commits

Author SHA1 Message Date
Benjamin Admin
ba0f659d1e Preserve = and (= tokens in grid build and cell text cleanup
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 47s
CI / test-python-klausur (push) Failing after 2m34s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 42s
= signs are used as definition markers in textbooks ("film = 1. Film").
They were incorrectly removed by two filters:

1. grid_build_core.py Step 5j-pre: _PURE_JUNK_RE matched "=" as
   artifact noise. Now exempts =, (=, ;, :, - and similar meaningful
   punctuation tokens.

2. cv_ocr_engines.py _is_noise_tail_token: "pure non-alpha" check
   removed trailing = tokens. Now exempts meaningful punctuation.

Fixes: "film = 1. Film; 2. filmen" losing the = sign,
       "(= I won and he lost.)" losing the (=.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:04:27 +02:00
Benjamin Admin
0599c72cc1 Fix IPA continuation: don't replace normal text with IPA
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m39s
CI / test-python-agent-core (push) Successful in 35s
CI / test-nodejs-website (push) Successful in 19s
Text like "Betonung auf der 1. Silbe: profit ['profit]" was
incorrectly detected as garbled IPA and replaced with generated
IPA transcription of the previous row's example sentence.

Added guard: if the cell text contains >=3 recognizable words
(3+ letter alpha tokens), it's normal text, not garbled IPA.
Garbled IPA is typically short and has no real dictionary words.

Fixes: Row 13 C3 showing IPA instead of pronunciation hint text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:28:58 +02:00
Benjamin Admin
17f0fdb2ed Refactor: extract _build_grid_core into grid_build_core.py + clean StepAnsicht
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Failing after 19s
CI / test-go-edu-search (push) Failing after 23s
CI / test-python-klausur (push) Failing after 10s
CI / test-python-agent-core (push) Failing after 9s
CI / test-nodejs-website (push) Failing after 26s
grid_editor_api.py: 2411 → 474 lines
- Extracted _build_grid_core() (1892 lines) into grid_build_core.py
- API file now only contains endpoints (build, save, get, gutter, box, unified)

StepAnsicht.tsx: 212 → 112 lines
- Removed useGridEditor imports (not needed for read-only spreadsheet)
- Removed unified grid fetch/build (not used with multi-sheet approach)
- Removed Spreadsheet/Grid toggle (only spreadsheet mode now)
- Simple: fetch grid-editor data → pass to SpreadsheetView

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:54:55 +02:00