Commit Graph

3 Commits

Author SHA1 Message Date
Benjamin Admin
8e4cbd84c2 Invalidate grid_editor_result when exclude regions change
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m59s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 17s
When exclude regions are saved or deleted, the cached grid result is
cleared so the grid rebuilds with updated exclusions on the next step.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 09:19:09 +01:00
Benjamin Admin
f9d71d50d1 Add exclude region marking in Structure step
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m47s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 16s
Users can now draw rectangles on the document image in the Structure
Detection step to mark areas (e.g. header graphics, alphabet strips)
that should be excluded from OCR results during grid building.

- Backend: PUT/DELETE endpoints for exclude regions stored in structure_result
- Backend: _build_grid_core() filters all words inside user-defined exclude regions
- Frontend: Interactive rectangle drawing with visual overlay and delete buttons
- Preserve exclude regions when re-running structure detection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 09:08:30 +01:00
Benjamin Admin
ec287fd12e refactor: split ocr_pipeline_api.py (5426 lines) into 8 modules
Each module is under 1050 lines:
- ocr_pipeline_common.py (354) - shared state, cache, models, helpers
- ocr_pipeline_sessions.py (483) - session CRUD, image serving, doc-type
- ocr_pipeline_geometry.py (1025) - deskew, dewarp, structure, columns
- ocr_pipeline_rows.py (348) - row detection, box-overlay helper
- ocr_pipeline_words.py (876) - word detection (SSE), paddle-direct
- ocr_pipeline_ocr_merge.py (615) - merge helpers, kombi endpoints
- ocr_pipeline_postprocess.py (929) - LLM review, reconstruction, export
- ocr_pipeline_auto.py (705) - auto-mode orchestrator, reprocess

ocr_pipeline_api.py is now a 61-line thin wrapper that re-exports
router, _cache, and test-imported symbols for backward compatibility.
No changes needed in main.py or tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 08:42:00 +01:00