breakpilot-lehrer

Benjamin Admin e3f939a628 refactor(ocr-pipeline): make post-processing fully generic

Three non-generic solutions replaced with universal heuristics:

1. Cell-OCR fallback: instead of restricting to column_en/column_de,
   now checks pixel density (>2% dark pixels) for ANY column type.
   Truly empty cells are skipped without running Tesseract.

2. Example-sentence detection: instead of checking for example-column
   text (worksheet-specific), now uses sentence heuristics (>=4 words
   or ends with sentence punctuation). Short EN text without DE is
   kept as a vocab entry (OCR may have missed the translation).

3. Comma-split: re-enabled with singular/plural detection. Pairs like
   "mouse, mice" / "Maus, Mäuse" are kept together. Verb forms like
   "break, broke, broken" are still split into individual entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-02 09:27:30 +01:00

backend

refactor(ocr-pipeline): make post-processing fully generic

2026-03-02 09:27:30 +01:00

docs

Initial commit: breakpilot-lehrer - Lehrer KI Platform

2026-02-11 23:47:26 +01:00

frontend

Initial commit: breakpilot-lehrer - Lehrer KI Platform

2026-02-11 23:47:26 +01:00

scripts

Initial commit: breakpilot-lehrer - Lehrer KI Platform

2026-02-11 23:47:26 +01:00

Dockerfile

perf(klausur-service): split Dockerfile into base + app layer

2026-02-26 17:43:24 +01:00

Dockerfile.base

fix(ocr-pipeline): add libgl1 for RapidOCR OpenCV dependency

2026-02-28 17:30:12 +01:00