Benjamin Admin 2716495250
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 29s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m9s
CI / test-python-agent-core (push) Successful in 18s
CI / test-nodejs-website (push) Successful in 20s
fix: Sub-Session Zeilenerkennung — Tesseract+inv im Spalten-Schritt cachen
Bisher wurden _word_dicts, _inv und _content_bounds fuer Sub-Sessions
nicht gecacht, sodass detect_rows auf detect_column_geometry() zurueckfiel.
Das konnte bei kleinen Box-Bildern mit <5 Woertern fehlschlagen.

Jetzt laeuft Tesseract + Binarisierung direkt im Pseudo-Spalten-Block,
und die Intermediates werden gecacht. Zusaetzlich ausfuehrliche Kommentare
zur Zeilenerkennung (detect_row_geometry, _regularize_row_grid).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 08:43:26 +01:00
Description
No description provided
42 MiB
Languages
TypeScript 60.2%
Python 32.9%
Go 5.5%
C# 0.8%
CSS 0.2%
Other 0.3%