breakpilot-lehrer

Author	SHA1	Message	Date
Benjamin Admin	f65bd11919	fix: Sub-session row detection uses word grouping instead of gap detection. Gap-based detection finds too few gaps in small box images and merges rows (7 raw gaps -> 4 validated -> only 3 rows instead of 6). Sub-sessions now call _build_rows_from_word_grouping() directly, which clusters words by Y position and is more robust for complex box layouts. Additionally: all zones=None crashes fixed (every occurrence replaced with .get("zones") or []). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 09:05:24 +01:00
Benjamin Admin	785b4d7655	fix: zones=None crash in sub-session row detection. column_result.get("zones", []) returns None when the key exists with the value None. Changed to .get("zones") or []. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 08:50:58 +01:00
Benjamin Admin	2716495250	fix: Sub-session row detection, cache Tesseract + inv in the column step. Previously _word_dicts, _inv and _content_bounds were not cached for sub-sessions, so detect_rows fell back to detect_column_geometry(). That could fail on small box images with <5 words. Tesseract + binarization now run directly in the pseudo-column block, and the intermediates are cached. Additionally: detailed comments on row detection (detect_row_geometry, _regularize_row_grid). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 08:43:26 +01:00
Benjamin Admin	23b7840ea7	feat: Full-row OCR with spacing for box sub-sessions. Sub-sessions skip column detection and instead use a pseudo-column spanning the full width. Text is reconstructed with proportional spacing from word positions to preserve the spatial layout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 08:28:29 +01:00
Benjamin Admin	34adb437d0	fix: Image endpoints fall back to original for sub-sessions. All image endpoints (cropped, columns-overlay, rows-overlay, words-overlay) only looked for cropped/dewarped. Sub-sessions only have an original image. New helper _get_base_image_png() with fallback chain: cropped > dewarped > original. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 23:30:38 +01:00
Benjamin Admin	ceaef9c6a6	fix: Promote sub-session original_bgr to cropped_bgr. Column, row and word detection look for cropped_bgr or dewarped_bgr. Sub-sessions only have original_bgr (the box crop). original_bgr is now automatically set as cropped_bgr, both when building the cache and on creation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 22:57:39 +01:00
Benjamin Admin	9047339f0d	fix: Sub-sessions start directly at columns, skipping preprocessing. Box sub-sessions already have a cropped image. Orientation, deskew, dewarp and crop are bypassed (skipped). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 22:51:16 +01:00
Benjamin Admin	2592ef233b	feat: Frontend sub-sessions (boxes) in OCR pipeline UI. - BoxSessionTabs: tab bar for switching between main and box sessions - StepColumnDetection: box info + "Create box sessions" button ("Box-Sessions erstellen") - page.tsx: session switching, sub-session state, auto-return after completion - types.ts: SubSession, PageZone, extended SessionInfo/ColumnResult Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 20:33:59 +01:00
Benjamin Admin	256efef3ea	feat: Box zones through the entire pipeline + sub-sessions for box content. - Red semi-transparent box marking in all overlays (columns, rows, words) - Row detection: combined-image approach excludes box regions - Word detection: rows inside box zones are filtered out - Sub-sessions: parent_session_id/box_index in DB schema - POST /sessions/{id}/create-box-sessions creates sub-sessions from box regions - Session info shows sub-sessions or the parent link - Session list hides sub-sessions by default - Reconstruction: Fabric JSON merges sub-session cells at box positions - Save-Reconstruction routes box{N}_* updates to sub-sessions - GET /sessions/{id}/vocab-entries/merged for merged entries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 18:24:34 +01:00
Benjamin Admin	4610137ecc	fix: Remove box regions from the image instead of detecting columns separately per zone. Content strips above/below boxes are stitched into one image, and column detection runs once on the combined image. Removes Step 5c (suspicion-based gap alignment), since the new approach solves the problem at the root. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 17:03:05 +01:00
Benjamin Admin	fb46450802	fix: Alignment validation only for suspicious gaps (>2x median width). Previously all interior gaps were checked, which wrongly removed genuine column separations (EN→DE). Now only gaps that would produce a disproportionately wide right column (>2x median column width) are checked. Threshold lowered to 15%. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 16:27:14 +01:00
Benjamin Admin	11126c4436	fix: UnboundLocalError edge_tolerance in Step 5c. The variable was referenced before its definition in Step 7. Introduced a separate margin_thresh variable for Step 5c. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 16:18:47 +01:00
Benjamin Admin	7a0ded7562	fix: Left-edge alignment validation for column gaps. Interior gaps are now checked: to the right of the gap, at least 25% of the words must share a common left edge. Prevents false column splits inside wide columns (e.g. an Example column with short and long entries). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 16:11:58 +01:00
Benjamin Admin	04be24a89e	fix: missing imports RAPIDOCR_AVAILABLE and _RE_ALPHA in cv_cell_grid.py. Further NameError problems from the module refactoring: both symbols are used in cv_cell_grid.py but are defined in cv_ocr_engines.py and were not imported. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:59:24 +01:00
Benjamin Admin	cf9dde9876	fix: move _group_words_into_lines to cv_ocr_engines.py. The function was defined only in cv_review.py but was also used in cv_ocr_engines.py and cv_layout.py, causing a NameError at runtime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:24:56 +01:00
Benjamin Admin	60c4138660	fix: _MIN_WORD_CONF as a module constant instead of a local variable. NameError in build_cell_grid_v2 because _MIN_WORD_CONF was defined only locally in _ocr_cell_crop and build_cell_grid. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:12:02 +01:00
Benjamin Admin	7005b18561	feat: generic box detection for zone-based column detection. - New file cv_box_detect.py: 2-stage algorithm (lines + color) - DetectedBox/PageZone dataclasses in cv_vocab_types.py - detect_column_geometry_zoned() in cv_layout.py - API endpoints extended: zones/boxes_detected in column_result - Overlay functions draw box borders as dashed rectangles - Fix: numpy array or-chaining in 7 places in ocr_pipeline_api.py - 12 unit tests for box detection and zone splitting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:06:23 +01:00
Benjamin Admin	e60254bc75	fix: all post-crop steps use the cropped instead of the dewarped image. Column, row and word overlays and all subsequent steps (LLM review, reconstruction) now read image/cropped with a fallback to image/dewarped. Added tests for page_crop.py (25 tests). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 09:10:10 +01:00
Benjamin Admin	156a818246	refactor: move crop after deskew/dewarp + content-based book-scan crop. New pipeline order: orientation → deskew → dewarp → crop → columns... Crop now operates on the already straightened image, which gives better results. page_crop.py completely replaced: adaptive threshold + 4-edge detection (book-spine shadow on the left, ink projection for all margins) instead of Otsu + largest contour. Backend: step numbers, input images and reprocess cascade adjusted. Frontend: PIPELINE_STEPS reordered, switch cases and "before" images updated. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 08:52:11 +01:00
Benjamin Admin	eb45bb4879	fix: numpy array or-chaining in crop/deskew + ImageCompareView labels. - orientation_crop_api.py: `array or array` replaced with `is not None` (ValueError with numpy arrays) - ocr_pipeline_api.py: same fix for the deskew fallback chain - ImageCompareView.tsx: fallback text uses rightLabel instead of "Begradigung" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 08:02:44 +01:00
Benjamin Admin	2763631711	feat: Orientation + crop as steps 1-2 in the OCR pipeline. Two new wizard steps before deskew: - Step 1: orientation detection (0/90/180/270° via Tesseract OSD) - Step 2: page-margin detection and cropping (remove scanner borders) Backend: - orientation_crop_api.py: POST /orientation, POST /crop, POST /crop/skip - page_crop.py: detect_and_crop_page() with format detection (A4/A5/Letter) - Session store: orientation_result, crop_result fields - Pipeline uses the cropped image for deskew/dewarp Frontend: - StepOrientation.tsx: upload + auto-orientation + before/after - StepCrop.tsx: auto-crop + format badge + skip option - Pipeline stepper: 10 steps (was 8) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 23:55:23 +01:00
Benjamin Admin	9a5a35bff1	refactor: split cv_vocab_pipeline.py into 6 modules (8163 → 6 + facade). Monolithic 8163-line file split into focused modules: - cv_vocab_types.py (156 lines): dataclasses, constants, IPA, feature flags - cv_preprocessing.py (1166 lines): image I/O, orientation, deskew, dewarp - cv_layout.py (3036 lines): document type, columns, rows, classification - cv_ocr_engines.py (1282 lines): OCR engines, vocab postprocessing, text cleaning - cv_cell_grid.py (1510 lines): cell grid v2 + legacy, vocab conversion - cv_review.py (1184 lines): LLM/spell review, pipeline orchestration. cv_vocab_pipeline.py is now a re-export facade (35 lines); all existing imports remain unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 23:46:47 +01:00
Benjamin Admin	931ab92c92	feat: integrate orientation detection into OCR pipeline deskew. detect_and_fix_orientation() now runs before the deskew step in the OCR pipeline, so scans rotated by 90/180/270° are corrected automatically. The frontend shows the orientation correction as an info banner. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 22:31:36 +01:00
Benjamin Admin	853638b03c	Revert "fix: run _split_broad_columns only when there is at most 1 wide column". This reverts commit `d98359fceb`.	2026-03-07 22:55:24 +01:00
Benjamin Admin	d98359fceb	fix: run _split_broad_columns only when there is at most 1 wide column. If 2+ wide content columns already exist, the layout is probably already correctly separated into EN/DE. The split runs only when a single wide column contains EN+DE combined. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 22:51:14 +01:00
Benjamin Admin	e1ae5d5fa9	fix: ignore edge gaps in _split_broad_columns + return tuple on empty result. Gaps touching the column edge (margins) are now excluded; only interior gaps are considered as split candidates. Fixes the problem of trailing whitespace being wrongly chosen as the largest gap. The early return in _run_ocr_pipeline_for_page now correctly returns ([], rotation) instead of []. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 22:16:29 +01:00
Benjamin Admin	4e8ea77140	fix: treat empty columns as structural + label 2-column layout correctly. Columns with <=2 words and <15% width are now classified as column_marker instead of as a content column. With 2 wide content columns, the right one is labeled column_example instead of column_de, since the left column contains EN+DE combined. OSD zoom raised from 1.0 to 2.0 for more reliable orientation detection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 19:35:21 +01:00
Benjamin Admin	e8ba5ec073	fix: orientation detection at PDF upload instead of only at OCR. Rotation is now detected in upload_pdf_get_info() so that thumbnails in the page selection are already shown the right way up. Added debug logging for _split_broad_columns. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 19:11:45 +01:00
Benjamin Admin	02631dc4e0	feat: split wide columns by word gap + show rotated scans in the frontend. _split_broad_columns() detects EN/DE mixtures in wide columns via word-coverage analysis and splits them at the largest gap. Thumbnails and page images are rotated server-side with fitz; the frontend reloads thumbnails after OCR processing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 18:16:32 +01:00
Benjamin Admin	a5635e0c43	feat: automatic orientation detection for upside-down scans. Tesseract OSD detects 0/90/180/270° rotation and corrects it automatically before deskew. Solves the problem with book scanners where every second page is upside down. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 17:26:21 +01:00
Benjamin Admin	7a1bd5e82d	refactor: use positional_column_regions in the OCR pipeline as well. The shared function positional_column_regions() in cv_vocab_pipeline.py is now used by both paths (Vocab-Worksheet + OCR Pipeline Admin). classify_column_types() is kept as legacy. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 17:20:51 +01:00
Benjamin Admin	b0bfc0a960	feat: show session ID in the Vocab-Worksheet header. Shows the first 8 characters of the session ID next to the subtitle so the session can be easily identified and communicated. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 17:16:47 +01:00
Benjamin Admin	a5df2b6e15	fix: replace column classification in the Vocab-Worksheet with position-based assignment. Language-based scoring (classify_column_types) caused swapped columns on page 3 for example sentences with many English function words. The new _positional_column_regions() assigns columns purely geometrically (left→right). OCR Pipeline Admin remains unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 17:07:11 +01:00
Benjamin Admin	14c8bb5da0	chore: LLM qwen3:30b-a3b → qwen3.5:35b-a3b. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 07:32:39 +01:00
Benjamin Admin	4532f68173	fix: restrict word validation to segment words. Words from sub-header regions overlapped correct column gaps and caused word validation to wrongly discard gaps. Now only words from the selected segment are used for validation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 23:13:19 +01:00
Benjamin Admin	391449fedf	fix: segment page at sub-headers, use largest segment for projection. Instead of masking full-width rows, the page is now split into segments at large horizontal gaps (sub-headers, chapter boundaries). The largest segment is used for the vertical projection. Illustrations and headings therefore no longer interfere. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 23:07:23 +01:00
Benjamin Admin	cb2b924a7b	fix: word-coverage gap detection as fallback for illustrations. When pixel-based projection finds too few column gaps (e.g. because illustrations/graphics fill the gaps), a word-based gap detection now runs as an intermediate step before clustering. Tesseract word bounding boxes are immune to decorative graphics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 22:58:27 +01:00
Benjamin Admin	8f3a50b981	fix: mask full-width rows before column detection. Colored full-width sub-headers (e.g. "Unit 4: Bonnie Scotland") filled in the column gaps in the vertical projection profile and led to 11 detected columns instead of 5. Rows with >40% ink density are now masked before projection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 22:50:27 +01:00
Benjamin Admin	0f821afb23	feat(sbom): Lehrer-specific: 17 core/compliance entries removed, descriptions adjusted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 20:34:20 +01:00
Benjamin Admin	2ad391e4e4	feat: fine-tuning with 7 sliders for deskew/dewarp. New collapsible panel under the dewarp step ("Entzerrung") with individual sliders: - 3 rotation sliders (P1 Iterative, P2 Word-Alignment, P3 Textline) - 4 shear sliders (methods A-D) with radio selection - Combined preview and ground-truth saving - Backend: POST /sessions/{id}/adjust-combined endpoint Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 18:22:33 +01:00
Benjamin Admin	e0decac7a0	feat: Unified Inbox added to the "Kommunikation" navigation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 18:04:30 +01:00
Benjamin Admin	d39d249daa	feat: add pass 3 text-line regression to deskew pipeline. After iterative projection (pass 1) and word-alignment (pass 2), a third pass uses Tesseract word positions + linear regression per text line to measure and correct residual rotation. This catches cases where passes 1-2 leave significant slope (e.g. 1.7° residual on heavily skewed scans). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 17:53:11 +01:00
Benjamin Admin	538d5c732e	feat: two-pass deskew with wider angle range and residual correction - Increase iterative deskew coarse_range from ±2° to ±5° to handle heavily skewed scans - New deskew_two_pass(): runs iterative projection first, then word-alignment on the corrected image to detect/fix residual skew (applied when residual ≥ 0.3°) - OCR pipeline API auto_deskew now uses deskew_two_pass by default - Vocab worksheet _run_ocr_pipeline_for_page uses deskew_two_pass - Deskew result now includes angle_residual and two_pass_debug Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 17:34:57 +01:00
Benjamin Admin	b9c3c47a37	refactor: LLM Compare removed entirely, Video/Voice/Alerts sidebar added - LLM Compare pages, configs and all references deleted - "Kommunikation" category in sidebar with Video & Chat, Voice Service, Alerts - Compliance SDK category removed from sidebar Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 17:34:54 +01:00
Benjamin Admin	9912997187	refactor: Jitsi/Matrix/Voice taken over from Core, Camunda/BPMN deleted, "Kommunikation" nav - Voice service moved from Core to Lehrer (bp-lehrer-voice-service) - 4 Jitsi services + 2 Synapse services added to docker-compose.yml - Camunda removed entirely: workflow pages, workflow-config.ts, bpmn-js deps - CAMUNDA_URL removed from backend-lehrer environment - Sidebar: categories "Compliance SDK" + "Katalogverwaltung" removed - Sidebar: new category "Kommunikation" with Video & Chat, Voice Service, Alerts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 17:01:47 +01:00
Benjamin Admin	2ec4d8aabd	fix: JSX syntax, IIFE wrapping for vocabulary tab. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 17:01:33 +01:00
Benjamin Admin	24366880ad	feat: vocab worksheet: full-quality images, insert triangles, dynamic columns - Original pages rendered at full resolution (pdf-page-image endpoint, zoom=2.0) instead of downscaled thumbnails - Insert-row triangles on left margin between every row (hover to reveal) - Dynamic extra columns: "+" button in header adds custom columns (e.g. Aussprache, Wortart), removable via hover-x on column header - Extra columns stored per-page (pageExtraColumns state) so different source pages can have different column structures - Grid template adjusts dynamically based on number of columns Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 16:49:15 +01:00
Benjamin Admin	20b341d839	fix: vocab worksheet fills full browser width, fix missing thumbnails - Remove max-w-7xl constraint on content area so panels stretch to edges - Fall back to direct API thumbnail URLs when blob URLs are empty - Original pages now reliably show even if preloaded thumbnails failed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 16:30:04 +01:00
Benjamin Admin	d5be7b6f77	fix: vocab worksheet: wider table, show original pages, better layout - Swap from 3/5-2/5 grid to 1/3-2/3 flexbox (original left, table right) - Table uses 3 equal 1fr columns for EN/DE/example instead of cramped 13-col grid - Full viewport height minus header (calc(100vh - 240px)) for more visible rows - Show only processed pages in original preview (filtered by selectedPages) - Remove per-row insert buttons to reduce vertical noise - Compact row spacing (py-1.5) to fit ~15+ rows without scrolling Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 16:07:25 +01:00
Benjamin Admin	b7ae36e92b	feat: use OCR pipeline instead of LLM vision for vocab worksheet extraction. process-single-page now runs the full CV pipeline (deskew → dewarp → columns → rows → cell-first OCR v2 → LLM review) for much better extraction quality. Falls back to LLM vision if pipeline imports are unavailable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-05 15:35:44 +01:00

291 Commits