Commit Graph

5 Commits

Author SHA1 Message Date
Benjamin Admin
c3f1547e32 feat: add Excel-like grid editor for OCR overlay (Kombi mode step 6)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 28s
CI / test-python-klausur (push) Failing after 2m1s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 17s
Backend: new grid_editor_api.py with build-grid endpoint that detects
bordered boxes, splits page into zones, clusters columns/rows per zone
from Kombi word positions. New DB column grid_editor_result JSONB.

Frontend: GridEditor component with editable HTML tables per zone,
column bold toggle, header row toggle, undo/redo, keyboard navigation
(Tab/Enter/Arrow), image overlay verification, and save/load.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 23:41:03 +01:00
Benjamin Admin
16dc77e5c2 chore: add migration 005_add_doc_type.sql
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m54s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 18s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 13:54:56 +01:00
Benjamin Admin
954103cdf2 feat(ocr-pipeline): add Step 5 word recognition (grid from columns × rows)
Backend: build_word_grid() intersects column regions with content rows,
OCRs each cell with language-specific Tesseract, and returns vocabulary
entries with percent-based bounding boxes. New endpoints: POST /words,
GET /image/words-overlay, ground-truth save/retrieve for words.
Frontend: StepWordRecognition with overview + step-through labeling modes,
goToStep callback for row correction feedback loop.
MkDocs: OCR Pipeline documentation added.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 02:18:29 +01:00
Benjamin Admin
04b83d5f46 feat(ocr-pipeline): add row detection step with horizontal gap analysis
Add Step 4 (row detection) between column detection and word recognition.
Uses horizontal projection profiles + whitespace gaps (same method as columns).
Includes header/footer classification via gap-size heuristics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 01:14:31 +01:00
Benjamin Admin
aa06ae0f61 feat: Persistente Sessions (PostgreSQL) + Spaltenerkennung (Step 3)
Sessions werden jetzt in PostgreSQL gespeichert statt in-memory.
Neue Session-Liste mit Name, Datum, Schritt. Sessions ueberleben
Browser-Refresh und Container-Neustart. Step 3 nutzt analyze_layout()
fuer automatische Spaltenerkennung mit farbigem Overlay.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 22:16:37 +01:00