-
080fcb5e3c
feat: 180° rotation for pixel matching in overlay mode
Benjamin Admin
2026-03-10 17:19:14 +01:00
-
bcd97e7d78
feat: overlay mode for full-page table reconstruction with pixel positioning
Benjamin Admin
2026-03-10 16:18:47 +01:00
-
7f8615b8c1
fix: normalize font size to the most frequent value (mode)
Benjamin Admin
2026-03-10 14:28:23 +01:00
-
2055597ba4
fix: pixel overlay for all cells + automatic font size + no contentEditable
Benjamin Admin
2026-03-10 13:25:16 +01:00
-
ad28f9420a
feat: pixel-based word positioning in the overlay
Benjamin Admin
2026-03-10 12:36:57 +01:00
-
6314e60464
fix: monospace font in the overlay for correct whitespace alignment
Benjamin Admin
2026-03-10 11:50:53 +01:00
-
d530738b12
fix: move useMemo before early returns (React Rules of Hooks)
Benjamin Admin
2026-03-10 11:35:59 +01:00
-
ca7d44e543
fix: align overlay column by column via median snap
Benjamin Admin
2026-03-10 11:20:06 +01:00
-
e44e319ccf
feat: text overlay reconstruction in StepLlmReview
Benjamin Admin
2026-03-10 11:07:11 +01:00
-
6bb023bdc1
fix: generate vocab_entries for column_text sub-sessions
Benjamin Admin
2026-03-10 10:28:27 +01:00
-
13553fc5e6
fix: column_text type for sub-sessions in the correction table
Benjamin Admin
2026-03-10 09:48:40 +01:00
-
964c916a81
fix: _clean_cell_text removes currency symbols at end of line
Benjamin Admin
2026-03-10 09:41:25 +01:00
-
13510b62cc
debug: set log level to INFO for sub-session cell contents
Benjamin Admin
2026-03-10 09:33:56 +01:00
-
3a791179af
debug: logging for sub-session word detection
Benjamin Admin
2026-03-10 09:31:34 +01:00
-
f65bd11919
fix: sub-session row detection uses word grouping instead of gap detection
Benjamin Admin
2026-03-10 09:05:24 +01:00
-
785b4d7655
fix: zones=None crash in sub-session row detection
Benjamin Admin
2026-03-10 08:50:58 +01:00
-
2716495250
fix: sub-session row detection: cache Tesseract+inv in the column step
Benjamin Admin
2026-03-10 08:43:26 +01:00
-
23b7840ea7
feat: full-row OCR with spacing for box sub-sessions
Benjamin Admin
2026-03-10 08:28:29 +01:00
-
34adb437d0
fix: image endpoints fall back to original for sub-sessions
Benjamin Admin
2026-03-09 23:30:38 +01:00
-
ceaef9c6a6
fix: promote original_bgr to cropped_bgr for sub-sessions
Benjamin Admin
2026-03-09 22:57:39 +01:00
-
9047339f0d
fix: sub-sessions start directly at the column step, skipping preprocessing
Benjamin Admin
2026-03-09 22:51:16 +01:00
-
2592ef233b
feat: frontend sub-sessions (boxes) in the OCR pipeline UI
Benjamin Admin
2026-03-09 20:33:59 +01:00
-
256efef3ea
feat: box zones through the entire pipeline + sub-sessions for box content
Benjamin Admin
2026-03-09 18:24:34 +01:00
-
4610137ecc
fix: remove box regions from the image instead of detecting columns separately per zone
Benjamin Admin
2026-03-09 17:03:05 +01:00
-
fb46450802
fix: alignment validation only for suspicious gaps (>2x median width)
Benjamin Admin
2026-03-09 16:27:14 +01:00
-
11126c4436
fix: UnboundLocalError edge_tolerance in Step 5c
Benjamin Admin
2026-03-09 16:18:47 +01:00
-
7a0ded7562
fix: left-edge alignment validation for column gaps
Benjamin Admin
2026-03-09 16:11:58 +01:00
-
04be24a89e
fix: missing imports RAPIDOCR_AVAILABLE and _RE_ALPHA in cv_cell_grid.py
Benjamin Admin
2026-03-09 15:59:24 +01:00
-
cf9dde9876
fix: move _group_words_into_lines to cv_ocr_engines.py
Benjamin Admin
2026-03-09 15:24:56 +01:00
-
60c4138660
fix: _MIN_WORD_CONF as a module constant instead of a local variable
Benjamin Admin
2026-03-09 15:12:02 +01:00
-
7005b18561
feat: generic box detection for zone-based column detection
Benjamin Admin
2026-03-09 15:06:23 +01:00
-
e60254bc75
fix: all post-crop steps use the cropped image instead of the dewarped one
Benjamin Admin
2026-03-09 09:10:10 +01:00
-
156a818246
refactor: move crop after deskew/dewarp + content-based book-scan crop
Benjamin Admin
2026-03-09 08:52:11 +01:00
-
eb45bb4879
fix: numpy array or-expression in crop/deskew + ImageCompareView labels
Benjamin Admin
2026-03-09 08:02:44 +01:00
-
2763631711
feat: orientation + cropping as steps 1-2 in the OCR pipeline
Benjamin Admin
2026-03-08 23:55:23 +01:00
-
9a5a35bff1
refactor: split cv_vocab_pipeline.py into 6 modules (8163 → 6 + facade)
Benjamin Admin
2026-03-08 23:46:47 +01:00
-
931ab92c92
feat: integrate orientation detection into the OCR pipeline deskew
Benjamin Admin
2026-03-08 22:31:36 +01:00
-
853638b03c
Revert "fix: run _split_broad_columns only when there is at most 1 broad column"
Benjamin Admin
2026-03-07 22:55:24 +01:00
-
d98359fceb
fix: run _split_broad_columns only when there is at most 1 broad column
Benjamin Admin
2026-03-07 22:51:14 +01:00
-
e1ae5d5fa9
fix: ignore edge gaps in _split_broad_columns + return tuple on empty result
Benjamin Admin
2026-03-07 22:16:29 +01:00
-
4e8ea77140
fix: treat empty columns as structural + label 2-column layout correctly
Benjamin Admin
2026-03-07 19:35:21 +01:00
-
e8ba5ec073
fix: orientation detection at PDF upload instead of only at OCR
Benjamin Admin
2026-03-07 19:11:45 +01:00
-
02631dc4e0
feat: split broad columns by word gap + show rotated scans in the frontend
Benjamin Admin
2026-03-07 18:16:32 +01:00
-
a5635e0c43
feat: automatic orientation detection for upside-down scans
Benjamin Admin
2026-03-07 17:26:21 +01:00
-
7a1bd5e82d
refactor: use positional_column_regions in the OCR pipeline as well
Benjamin Admin
2026-03-07 17:20:51 +01:00
-
b0bfc0a960
feat: show session ID in the vocab worksheet header
Benjamin Admin
2026-03-07 17:16:47 +01:00
-
a5df2b6e15
fix: replace column classification in the vocab worksheet with position-based assignment
Benjamin Admin
2026-03-07 17:07:11 +01:00
-
b697963186
fix: use Alpine-compatible addgroup/adduser flags in Dockerfiles
Sharang Parnerkar
2026-03-06 22:38:31 +01:00
-
14c8bb5da0
chore: LLM qwen3:30b-a3b → qwen3.5:35b-a3b
Benjamin Admin
2026-03-06 07:32:39 +01:00
-
4532f68173
fix: restrict word validation to segment words
Benjamin Admin
2026-03-05 23:13:19 +01:00
-
391449fedf
fix: segment page at sub-headers, use largest segment for projection
Benjamin Admin
2026-03-05 23:07:23 +01:00
-
cb2b924a7b
fix: word-coverage gap detection as fallback for illustrations
Benjamin Admin
2026-03-05 22:58:27 +01:00
-
8f3a50b981
fix: mask full-width rows before column detection
Benjamin Admin
2026-03-05 22:50:27 +01:00
-
0f821afb23
feat(sbom): teacher-specific: 17 core/compliance entries removed, descriptions adjusted
Benjamin Admin
2026-03-05 20:34:20 +01:00
-
2ad391e4e4
feat: fine-tuning with 7 sliders for deskew/dewarp
Benjamin Admin
2026-03-05 18:22:33 +01:00
-
e0decac7a0
feat: added Unified Inbox to the communication navigation
Benjamin Admin
2026-03-05 18:04:30 +01:00
-
d39d249daa
feat: add pass 3 text-line regression to deskew pipeline
Benjamin Admin
2026-03-05 17:53:11 +01:00
-
538d5c732e
feat: two-pass deskew with wider angle range and residual correction
Benjamin Admin
2026-03-05 17:34:57 +01:00
-
b9c3c47a37
refactor: removed LLM Compare entirely, added Video/Voice/Alerts sidebar
Benjamin Admin
2026-03-05 17:34:54 +01:00
-
9912997187
refactor: adopted Jitsi/Matrix/Voice from Core, deleted Camunda/BPMN, communication nav
Benjamin Admin
2026-03-05 17:01:47 +01:00
-
2ec4d8aabd
fix: JSX syntax — IIFE wrapping for vocabulary tab
Benjamin Admin
2026-03-05 17:01:33 +01:00
-
24366880ad
feat: vocab worksheet — full-quality images, insert triangles, dynamic columns
Benjamin Admin
2026-03-05 16:49:15 +01:00
-
20b341d839
fix: vocab worksheet fills full browser width, fix missing thumbnails
Benjamin Admin
2026-03-05 16:30:04 +01:00
-
d5be7b6f77
fix: vocab worksheet — wider table, show original pages, better layout
Benjamin Admin
2026-03-05 16:07:25 +01:00
-
b7ae36e92b
feat: use OCR pipeline instead of LLM vision for vocab worksheet extraction
Benjamin Admin
2026-03-05 15:35:44 +01:00
-
9ea77ba157
fix: Abschliessen button returns to session list on last pipeline step
Benjamin Admin
2026-03-05 15:05:48 +01:00
-
4f9cf3b9e8
fix: validation step buttons unreachable — reduce panel height + sticky bar
Benjamin Admin
2026-03-05 14:54:01 +01:00
-
b8a9493310
fix: deskew iterative — use vertical Sobel edges + vertical projection
Benjamin Admin
2026-03-05 14:23:43 +01:00
-
68a6b97654
fix: use gradient score instead of variance for iterative deskew
Benjamin Admin
2026-03-05 14:11:19 +01:00
-
af1b12c97d
feat: iterative projection-profile deskew (2-phase variance optimization)
Benjamin Admin
2026-03-05 13:46:44 +01:00
-
770aea611f
fix: correct example field (fixes iberqueren), disable cell-level bold
Benjamin Admin
2026-03-05 13:15:59 +01:00
-
1a2efbf075
fix: relative bold detection (page median), fix save/finish buttons
Benjamin Admin
2026-03-05 13:02:16 +01:00
-
cd12755da6
feat: OCR umlaut confusion correction + bold detection via stroke-width
Benjamin Admin
2026-03-05 12:06:57 +01:00
-
40cfc1acdd
fix: validation step — original image URL, white background, dynamic font size
Benjamin Admin
2026-03-05 11:40:24 +01:00
-
aa136a9f80
chore: add mflux model download script for off-peak scheduling
Benjamin Admin
2026-03-05 11:20:53 +01:00
-
e6858010c2
feat: RAG Chunk Browser: all collections + 59 EDPB/WP29/DSFA entries
Benjamin Admin
2026-03-05 11:01:14 +01:00
-
1cc69d6b5e
feat: OCR pipeline step 8 — validation view with image detection & generation
Benjamin Admin
2026-03-05 10:40:37 +01:00
-
293e7914d8
feat: improved OCR pipeline session manager with categories, thumbnails, pipeline logging
Benjamin Admin
2026-03-05 09:44:38 +01:00
-
a58dfca1d8
fix: move char-confusion fix to correction step, add spell + page-ref corrections
Benjamin Admin
2026-03-05 00:26:13 +01:00
-
fd99d4f875
cleanup: remove sheet-specific code, reduce logging, document constants
Benjamin Admin
2026-03-05 00:04:02 +01:00
-
1e0c6bb4b5
feat: hybrid OCR — full-page for broad columns, cell-crop for narrow
Benjamin Admin
2026-03-04 23:38:44 +01:00
-
e6dc3fcdd7
fix: only replace phonetics in english field, fix grammar detection
Benjamin Admin
2026-03-04 23:19:03 +01:00
-
edbdac3203
fix: improve phonetic bracket replacement logic
Benjamin Admin
2026-03-04 23:13:34 +01:00
-
99573a46ef
debug: add phonetic bracket replacement logging
Benjamin Admin
2026-03-04 23:01:01 +01:00
-
6ad4b84584
fix: broaden phonetic bracket regex to catch Tesseract-garbled IPA
Benjamin Admin
2026-03-04 22:53:50 +01:00
-
f94a3836f8
fix: use Tesseract as default engine for cell-first OCR instead of RapidOCR
Benjamin Admin
2026-03-04 22:30:34 +01:00
-
34c649c8be
fix: send SSE keepalive events every 5s during batch OCR
Benjamin Admin
2026-03-04 22:21:14 +01:00
-
dd16c88007
fix: retry words request on 400/404 + add backend diagnostic logging
Benjamin Admin
2026-03-04 20:15:54 +01:00
-
9cbf0fb278
fix: removed fake Compliance Advisor from Lehrer KI-Admin
Benjamin Admin
2026-03-04 20:15:50 +01:00
-
90ecb46bed
fix: force 3x upscale for short RapidOCR crops + lower box_thresh
Benjamin Admin
2026-03-04 19:47:36 +01:00
-
bb0e23303c
debug: log RapidOCR upscale dimensions to verify scaling
Benjamin Admin
2026-03-04 18:18:03 +01:00
-
604da26b24
fix: upscale RapidOCR crops to min 150px (was 64px), matching Tesseract
Benjamin Admin
2026-03-04 17:38:06 +01:00
-
113a1c10e5
fix: add 3px cell padding + upscale small RapidOCR crops + diagnostic logging
Benjamin Admin
2026-03-04 16:45:59 +01:00
-
e4bdb3cc24
debug: add diagnostic logging to _ocr_cell_crop for empty cell investigation
Benjamin Admin
2026-03-04 16:35:33 +01:00
-
d0e7966925
fix: use header/footer row boundaries for _heal_row_gaps in cell-first OCR
Benjamin Admin
2026-03-04 15:44:13 +01:00
-
68d230c297
fix: use batch-then-stream SSE for cell-first OCR
Benjamin Admin
2026-03-04 14:51:55 +01:00
-
16dc77e5c2
chore: add migration 005_add_doc_type.sql
Benjamin Admin
2026-03-04 13:54:56 +01:00
-
29c74a9962
feat: cell-first OCR + document type detection + dynamic pipeline steps
Benjamin Admin
2026-03-04 13:52:38 +01:00
-
00a74b3144
revert: remove marker column OCR special handling
Benjamin Admin
2026-03-04 11:52:59 +01:00
-
489835a279
fix: detect red/coloured markers in OCR pipeline
Benjamin Admin
2026-03-04 11:38:12 +01:00