4 Commits

Author SHA1 Message Date
Benjamin Admin
21ea458fcf feat(ocr-review): replace LLM with rule-based spell-checker (REVIEW_ENGINE=spell)
- Add pyspellchecker (MIT) to requirements for EN+DE dictionary lookup
- New spell_review_entries_sync() + spell_review_entries_streaming():
  - Dictionary-backed substitution: checks if corrected word is known
  - Structural rule: digit at pos 0 + lowercase rest → most likely letter
    (e.g. "8en"→"Ben", "8uch"→"Buch", "5ee"→"See", "6eld"→"Geld")
  - Pattern rule: "|." → "1." for numbered list prefixes
  - Standalone "|" → "I" (capital I)
  - IPA entries still protected via existing _entry_needs_review filter
  - Headings/untranslated words (e.g. "Story") are untouched (no suspicious characters)
- llm_review_entries + llm_review_entries_streaming: route via REVIEW_ENGINE
  env var ("spell" default, "llm" to restore previous behaviour)
- docker-compose.yml: REVIEW_ENGINE=${REVIEW_ENGINE:-spell}
- LLM code preserved for fallback (set REVIEW_ENGINE=llm in .env)
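
The rules listed in this commit message can be illustrated with a minimal sketch. It is not the code from the commit: the names `fix_token`, `fix_text`, `DIGIT_TO_LETTER`, and `KNOWN_WORDS` are hypothetical, the digit-to-letter map contains only the three confusions shown in the examples above, and a small in-memory word set stands in for the pyspellchecker EN+DE dictionary lookup.

```python
import re

# Digit-to-letter confusions taken from the examples in the commit message
# (8 -> B, 5 -> S, 6 -> G). The real mapping may cover more digits.
DIGIT_TO_LETTER = {"8": "B", "5": "S", "6": "G"}

# Assumption: a tiny word set standing in for the EN+DE dictionary that the
# commit obtains from pyspellchecker.
KNOWN_WORDS = {"ben", "buch", "see", "geld"}

# Structural rule: one digit at position 0 followed only by lowercase letters.
LEADING_DIGIT = re.compile(r"^(\d)([a-zäöüß]+)$")


def fix_token(token, known_words=KNOWN_WORDS):
    """Apply the rule-based corrections to a single whitespace-delimited token."""
    # Standalone pipe is read as a capital I.
    if token == "|":
        return "I"
    # Pattern rule: "|." at the start of a token is a numbered-list prefix "1.".
    if token.startswith("|."):
        return "1." + token[2:]
    # Structural rule, accepted only if the corrected word is in the dictionary.
    match = LEADING_DIGIT.match(token)
    if match:
        letter = DIGIT_TO_LETTER.get(match.group(1))
        if letter is not None:
            candidate = letter + match.group(2)
            if candidate.lower() in known_words:
                return candidate
    # No suspicious characters, or no dictionary-backed fix: leave untouched.
    return token


def fix_text(text, known_words=KNOWN_WORDS):
    """Correct every token in a string; runs of whitespace collapse to one space."""
    return " ".join(fix_token(tok, known_words) for tok in text.split())


if __name__ == "__main__":
    print(fix_text("|. 8uch | Story 8xyz"))
```

Requiring a dictionary hit before accepting a substitution keeps the rule conservative: a token such as "8xyz" is left unchanged because "Bxyz" is not a known word, and a clean word such as "Story" never matches any rule.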

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 15:04:27 +01:00
Benjamin Admin
d481e0087b deps: add eng-to-ipa for IPA dictionary lookup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 20:23:40 +01:00
Benjamin Admin
4ec7c20490 feat(ocr-pipeline): add rapidocr + onnxruntime to requirements
RapidOCR uses PaddleOCR models on ONNX Runtime, works natively on ARM64.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 17:08:21 +01:00
Benjamin Boenisch
5a31f52310 Initial commit: breakpilot-lehrer - Lehrer KI Platform
Services: Admin-Lehrer, Backend-Lehrer, Studio v2, Website,
Klausur-Service, School-Service, Voice-Service, Geo-Service,
BreakPilot Drive, Agent-Core

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 23:47:26 +01:00