Commit Graph

4 Commits

Author SHA1 Message Date
Benjamin Admin
ccba2bb887 fix(ocr-pipeline): show sub-columns in reconstruction and LLM review steps
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 1m54s
CI / test-python-agent-core (push) Successful in 18s
CI / test-nodejs-website (push) Successful in 21s
- Add marker/bbox_marker fields to WordEntry type
- Add page_ref/column_marker colors to StepReconstruction
- Make StepLlmReview table dynamic based on columns_used metadata,
  showing all detected columns (EN, DE, Example, page_ref, marker)
  instead of hardcoded EN/DE/Beispiel only

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 10:36:27 +01:00
Benjamin Admin
dbf0db0c13 feat(ocr-pipeline): improve LLM review UI + add reconstruction step
StepLlmReview: Show full vocab table with image overlay, row-level
status tracking (pending/active/reviewed/corrected/skipped), and
auto-scroll during SSE streaming. Load previous results on mount.

StepReconstruction: New step 7 with editable text fields at original
bbox positions over dewarped image. Zoom controls, tab navigation,
color-coded columns, save to backend.

Backend: Add POST /sessions/{id}/reconstruction endpoint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 12:19:21 +01:00
Benjamin Admin
2a493890b6 feat(ocr-pipeline): add SSE streaming and phonetic filter to LLM review
- Stream LLM review results batch-by-batch (8 entries per batch) via SSE
- Frontend shows live progress bar, batch log, and corrections appearing
- Skip entries with IPA phonetic transcriptions (already dictionary-corrected)
- Refactor llm_review_entries into reusable helpers for both streaming and non-streaming paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 11:46:06 +01:00
Benjamin Admin
938d1d69cf feat(ocr-pipeline): add LLM-based OCR correction step (Step 6)
Replace the placeholder "Koordinaten" step with an LLM review step that
sends vocab entries to qwen3:30b-a3b via Ollama for OCR error correction
(e.g. "8en" → "Ben"). Teachers can review, accept/reject individual
corrections in a diff table before applying them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 11:13:17 +01:00