Feature 1 — StarRating: 1-3 stars per exercise (100%=3, 70%=2, <70%=1)
Feature 2 — Progress bar in UnitCard with Leitner box distribution
Feature 3 — Listening exercise: hear word via TTS, choose correct translation
Feature 4 — Matching game: tap-to-match EN↔DE pairs (6 per round)
Feature 5 — Pronunciation: word with syllable bows + mic → STT comparison
Feature 6 — Syllable bows in FlashCards (SyllableBow under word + IPA)
UnitCard now shows 6 exercise types: Karten, Quiz, Tippen, Hoeren,
Zuordnen, Sprechen. Progress bar and star count displayed.
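A minimal sketch of the star mapping from Feature 1. It assumes that scores from 70% up to (but not including) 100% earn 2 stars, which the commit does not spell out; the function name and signature are hypothetical.

```python
def stars_for_score(correct: int, total: int) -> int:
    """Map an exercise result to 1-3 stars (100% = 3, >=70% = 2, else 1)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if correct >= total:
        return 3
    # Integer comparison avoids float rounding at exactly 70%.
    # Assumption: the 70% boundary is inclusive.
    if correct * 10 >= total * 7:
        return 2
    return 1
```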
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shared types extracted to shared/types/:
- companion.ts (33+ types, was 100% duplicated admin-lehrer ↔ studio-v2)
- klausur.ts (18+ types, was 95% duplicated across 4 locations)
- ocr-labeling.ts (11 types, was 100% duplicated admin-lehrer ↔ website)
Original type files replaced with re-exports for backward compatibility.
tsconfig.json paths updated with @shared/* alias in all 3 services.
Docker: Changed build context from ./service to . (root) so shared/
is accessible. Dockerfiles updated to COPY service/ + shared/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
audio_service.py: Connects to compliance-tts-service (Piper TTS,
MIT license) for high-quality German (Thorsten) and English (Lessac)
voices. Audio cached as MP3 on first request.
vocabulary_api.py: New endpoints:
- GET /vocabulary/word/{id}/audio/{lang} — word pronunciation
- GET /vocabulary/word/{id}/audio-syllables/{lang} — slow syllable-by-syllable
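The cache-on-first-request behaviour could look roughly like this. The helper name, the file naming scheme, and the synthesize callback (standing in for the call to the TTS service) are assumptions for illustration, not the actual audio_service.py code.

```python
from pathlib import Path
from typing import Callable


def get_cached_audio(cache_dir: Path, word_id: int, lang: str,
                     synthesize: Callable[[], bytes]) -> bytes:
    """Return MP3 bytes for a word, synthesizing only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{word_id}_{lang}.mp3"  # hypothetical naming scheme
    if path.exists():
        return path.read_bytes()
    audio = synthesize()  # e.g. an HTTP call to the TTS service
    path.write_bytes(audio)
    return audio
```

Only the first request for a given word and language pays the synthesis cost; later requests are served from disk.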
Anton App analysis: identified 5 features to adopt (star system,
games as rewards, progress bars, listening exercises, matching exercises).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A sed replacement had left orphaned hostname references in the story
page and empty lines in the getApiBase functions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
HTTPS pages cannot fetch from HTTP backend ports. Added Next.js
API route proxies for /api/vocabulary, /api/learning-units, /api/progress
that forward to backend-lehrer internally (same Docker network, HTTP).
All frontend pages now use same-origin requests (getApiBase = '')
instead of direct port:8001 connections.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Trigram extension and index are now created in a separate try/catch
so table creation succeeds even without pg_trgm. Search falls back
to ILIKE when trigram functions are not available.
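A sketch of the fallback as a query selector. The table and column names are hypothetical; the `%` operator and similarity() are provided by pg_trgm, and the pg_extension lookup is one possible way to detect whether the extension is installed.

```python
# Query that reports whether pg_trgm is installed in the current database.
HAS_TRGM_SQL = "SELECT 1 FROM pg_extension WHERE extname = 'pg_trgm'"


def build_search_query(has_trgm: bool) -> str:
    """Return a parameterized search query ($1 = term, $2 = limit)."""
    if has_trgm:
        # Fuzzy match, ranked by trigram similarity.
        return ("SELECT * FROM vocabulary_words "
                "WHERE english % $1 "
                "ORDER BY similarity(english, $1) DESC LIMIT $2")
    # Fallback: plain case-insensitive substring match.
    return ("SELECT * FROM vocabulary_words "
            "WHERE english ILIKE '%' || $1 || '%' "
            "ORDER BY english LIMIT $2")
```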
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Strip SQLAlchemy dialect prefix from DATABASE_URL for asyncpg.
Set search_path via server_settings on pool creation.
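Roughly what the two changes amount to, as a sketch. The helper names and the schema argument are hypothetical; server_settings is the asyncpg connection parameter used to set search_path.

```python
import re


def to_asyncpg_dsn(url: str) -> str:
    """Turn 'postgresql+asyncpg://...' into 'postgresql://...'."""
    return re.sub(r"^(postgres(?:ql)?)\+\w+://", r"\1://", url)


async def create_pool(url: str, schema: str):
    # Imported lazily so the helper above works without asyncpg installed.
    import asyncpg
    return await asyncpg.create_pool(
        dsn=to_asyncpg_dsn(url),
        server_settings={"search_path": schema},
    )
```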
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Strategic pivot: Studio-v2 becomes a language learning platform.
Compliance guardrail added to CLAUDE.md — no scan/OCR of third-party
content in customer frontend. Upload of OWN materials remains allowed.
Phase 1.1 — vocabulary_db.py: PostgreSQL model for 160k+ words
with english, german, IPA, syllables, examples, images, audio,
difficulty, tags, translations (multilingual). Trigram search index.
Phase 1.2 — vocabulary_api.py: Search, browse, filters, bulk import,
learning unit creation from word selection. Creates QA items with
enhanced fields (IPA, syllables, image, audio) for flashcards.
Phase 1.3 — /vocabulary page: Search bar with POS/difficulty filters,
word cards with audio buttons, unit builder sidebar. Teacher selects
words → creates learning unit → redirects to flashcards.
Sidebar: Added "Woerterbuch" (/vocabulary) and "Lernmodule" (/learn).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
qwen2.5vl:32b needs ~100 GB of RAM and crashes Ollama.
llama3.2-vision:11b is already installed and fits in memory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New module vision_ocr_fusion.py: Sends scan image + OCR word
coordinates + document type to Qwen2.5-VL 32B. The LLM reads
the image visually while using OCR positions as structural hints.
Key features:
- Document-type-aware prompts (Vokabelseite, Woerterbuch, etc.)
- OCR words grouped into lines with x/y coordinates in prompt
- Low-confidence words marked with (?) for LLM attention
- Continuation row merging instructions in prompt
- JSON response parsing with markdown code block handling
- Fallback to original OCR on any error
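The response parsing and fallback from the list above might be sketched as follows. The function name and the assumption that the model returns a JSON list of rows are illustrative; the real module may use a different shape.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence marker
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_llm_rows(response: str, ocr_fallback: list) -> list:
    """Extract a JSON list from an LLM reply, else return the OCR result."""
    match = _FENCED.search(response)
    payload = match.group(1) if match else response
    try:
        data = json.loads(payload.strip())
    except json.JSONDecodeError:
        return ocr_fallback  # any parse error: keep the original OCR
    return data if isinstance(data, list) else ocr_fallback
```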
Frontend (admin-lehrer Grid Review):
- "Vision-LLM" checkbox toggle
- "Typ" dropdown (Vokabelseite, Woerterbuch, etc.)
- Steps 1-3 now default to inactive
Activate: Check "Vision-LLM", select document type, click "OCR neu + Grid".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New endpoint POST /sessions/{id}/rerun-ocr-and-build-grid that:
1. Runs scan quality assessment
2. Applies CLAHE enhancement if degraded (controlled by enhance toggle)
3. Re-runs dual-engine OCR (RapidOCR + Tesseract) with min_conf filter
4. Merges OCR results and stores updated word_result
5. Builds grid with max_columns constraint
Frontend: Orange "OCR neu + Grid" button in GridToolbar.
Unlike "Neu berechnen" (which only rebuilds grid from existing words),
this button re-runs the full OCR pipeline with quality settings.
The CLAHE toggle now actually has an effect: it enhances the image
before OCR runs, not after.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The max_columns parameter was only implemented in cv_words_first.py
(vocab-worksheet path) but NOT in _build_grid_core, which is what
the admin OCR Kombi pipeline uses. That pipeline relies on
grid_editor_helpers._cluster_columns_by_alignment(), which has its
own column detection.
Fix: Post-processing step 5k merges the narrowest columns after grid
building when a zone has more columns than max_columns. Cells from
merged columns get their text appended to the target column.
min_conf word filtering was already working (applied before grid build).
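A simplified sketch of the column merge in step 5k. The data model (a list of column widths plus rows of cell text) is hypothetical, and so is the choice of target: here the narrowest column is folded into its left neighbour (or the right one if it is the first column), with text joined in reading order.

```python
def merge_narrow_columns(widths: list[float], rows: list[list[str]],
                         max_columns: int):
    """Merge narrowest columns until at most max_columns remain."""
    widths = list(widths)
    rows = [list(r) for r in rows]
    if max_columns <= 0:  # 0 = unlimited
        return widths, rows
    while len(widths) > max_columns:
        src = min(range(len(widths)), key=widths.__getitem__)
        dst = src - 1 if src > 0 else 1  # assumed target: a neighbour
        widths[dst] += widths[src]
        for row in rows:
            # Keep left-to-right reading order when joining cell text.
            parts = [row[dst], row[src]] if dst < src else [row[src], row[dst]]
            row[dst] = " ".join(p for p in parts if p)
            del row[src]
        del widths[src]
    return widths, rows
```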
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Dev-only toggles belong in admin-lehrer (port 3002) only.
The customer frontend runs the pipeline with optimal defaults
and shows only the finished results.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each quality improvement step can now be toggled independently:
- CLAHE checkbox (Step 3: image enhancement on/off)
- MaxCols dropdown (Step 2: 0=unlimited, 2-5)
- MinConf dropdown (Step 1: auto/20/30/40/50/60)
Backend: Query params enhance, max_cols, min_conf on process-single-page.
Response includes active_steps dict showing which steps are enabled.
Frontend: Toggle controls in VocabularyTab above the table.
This allows empirical A/B testing of each step on the same scan.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 1: scan_quality.py — Laplacian blur + contrast scoring, adjusts
OCR confidence threshold (40 for good scans, 30 for degraded).
Quality report included in API response + shown in frontend.
Step 2: max_columns parameter in cv_words_first.py — limits column
detection to 3 for vocab tables, preventing phantom columns D/E
from degraded OCR fragments.
Step 3: ocr_image_enhance.py — CLAHE contrast + bilateral filter
denoising + unsharp mask, only for degraded scans (gated by
quality score). Pattern from handwriting_htr_api.py.
Frontend: quality info shown in extraction status after processing.
Reprocess button now derives pages from vocabulary data.
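To illustrate the idea behind Step 1, here is a dependency-free sketch that scores sharpness by the variance of a 4-neighbour Laplacian and picks the confidence threshold (40 for good scans, 30 for degraded). The cutoff values, the contrast measure (max minus min), and the function names are assumptions; scan_quality.py may compute these differently.

```python
from statistics import pvariance


def laplacian_variance(gray: list[list[int]]) -> float:
    """Variance of the 4-neighbour Laplacian; low values suggest blur.

    Expects a grayscale image of at least 3x3 pixels.
    """
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def min_conf_for_scan(gray: list[list[int]],
                      blur_cutoff: float = 100.0,      # assumed cutoff
                      contrast_cutoff: int = 40) -> int:  # assumed cutoff
    """Return the OCR confidence threshold: 40 if good, 30 if degraded."""
    pixels = [p for row in gray for p in row]
    contrast = max(pixels) - min(pixels)
    degraded = (laplacian_variance(gray) < blur_cutoff
                or contrast < contrast_cutoff)
    return 30 if degraded else 40
```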
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Types from deleted ocr-pipeline/types.ts inlined into ocr-kombi/types.ts.
All imports updated across components/ocr-kombi/ and components/ocr-pipeline/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs fixed:
1. reprocessPages() failed silently after session resume because
successfulPages was empty. Now derives pages from vocabulary
source_page or selectedPages as fallback.
2. process-single-page endpoint built vocabulary entries WITHOUT
applying merge logic (_merge_wrapped_rows, _merge_continuation_rows).
Now applies full merge pipeline after vocabulary extraction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allows reprocessing pages from the vocabulary view to apply
new merge logic without navigating back to page selection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When textbook authors wrap text within a cell (e.g. long German
translations), OCR treats each physical line as a separate row.
New _merge_wrapped_rows() detects this by checking if the primary
column (EN) is empty — indicating a continuation, not a new entry.
Handles: empty EN + DE text, empty EN + example text, parenthetical
continuations like "(bei)", triple wraps, comma-separated lists.
12 tests added covering all cases.
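The core rule (an empty EN cell marks a continuation of the previous entry) can be sketched like this. The dict keys and the sample words are hypothetical, and the real _merge_wrapped_rows() handles more cases than this simplified version.

```python
def merge_wrapped_rows(rows: list[dict]) -> list[dict]:
    """Fold rows with an empty EN cell into the preceding entry."""
    merged: list[dict] = []
    for row in rows:
        en = row.get("en", "").strip()
        if not en and merged:
            prev = merged[-1]
            # Append any wrapped DE or example text to the previous entry.
            for key in ("de", "example"):
                extra = row.get(key, "").strip()
                if extra:
                    prev[key] = (prev.get(key, "") + " " + extra).strip()
        else:
            merged.append(dict(row))
    return merged
```

Because each continuation is appended to the last kept entry, double and triple wraps collapse into one row.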
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 3.2 — MicrophoneInput.tsx: Browser Web Speech API for
speech-to-text recognition (EN+DE), integrated for pronunciation practice.
Phase 4.1 — Story Generator: LLM-powered mini-stories using vocabulary
words, with highlighted vocab in HTML output. Backend endpoint
POST /learning-units/{id}/generate-story + frontend /learn/[unitId]/story.
Phase 4.2 — SyllableBow.tsx: SVG arc component for syllable visualization
under words, clickable for per-syllable TTS.
Phase 4.3 — Gamification system:
- CoinAnimation.tsx: Floating coin rewards with accumulator
- CrownBadge.tsx: Crown/medal display for milestones
- ProgressRing.tsx: Circular progress indicator
- progress_api.py: Backend tracking coins, crowns, streaks per unit
Also adds "Geschichte" exercise type button to UnitCard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New feature: After OCR vocabulary extraction, users can generate interactive
learning modules (flashcards, quiz, type trainer) with one click.
Frontend (studio-v2):
- Fortune Sheet spreadsheet editor tab in vocab-worksheet
- "Lernmodule generieren" button in ExportTab
- /learn page with unit overview and exercise type cards
- /learn/[unitId]/flashcards — Flip-card trainer with Leitner spaced repetition
- /learn/[unitId]/quiz — Multiple choice quiz with explanations
- /learn/[unitId]/type — Type-in trainer with Levenshtein distance feedback
- AudioButton component using Web Speech API for EN+DE TTS
Backend (klausur-service):
- vocab_learn_bridge.py: Converts VocabularyEntry[] to analysis_data format
- POST /sessions/{id}/generate-learning-unit endpoint
Backend (backend-lehrer):
- generate-qa, generate-mc, generate-cloze endpoints on learning units
- get-qa/mc/cloze data retrieval endpoints
- Leitner progress update + next review items endpoints
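For context, a sketch of a classic Leitner update as the progress endpoints might apply it: a correct answer moves the item up one box, a wrong answer sends it back to box 1. The five boxes and the review intervals are assumptions, not values taken from the backend.

```python
from datetime import date, timedelta

# Assumed review interval in days per box; the real values may differ.
INTERVAL_DAYS = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16}


def update_leitner(box: int, correct: bool, today: date) -> tuple[int, date]:
    """Return the new box and the next review date for one item."""
    new_box = min(box + 1, max(INTERVAL_DAYS)) if correct else 1
    return new_box, today + timedelta(days=INTERVAL_DAYS[new_box])
```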
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The tokenizer regex only matches alphabetic characters, so text
before the first word match (like "(= " in "(= I won...") was
silently dropped when reassembling the corrected text.
Now preserves text[:first_match_start] as a leading prefix.
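A minimal sketch of the reassembly with the leading prefix preserved. The regex and the fix_word callback are stand-ins for the real tokenizer and correction step.

```python
import re
from typing import Callable

WORD = re.compile(r"[A-Za-z]+")  # assumed: alphabetic tokens only


def correct_text(text: str, fix_word: Callable[[str], str]) -> str:
    """Apply fix_word to each token while keeping all non-word text."""
    matches = list(WORD.finditer(text))
    if not matches:
        return text
    parts = [text[:matches[0].start()]]  # leading prefix, e.g. "(= "
    for i, m in enumerate(matches):
        parts.append(fix_word(m.group()))
        next_start = (matches[i + 1].start()
                      if i + 1 < len(matches) else len(text))
        parts.append(text[m.end():next_start])  # separators and trailing text
    return "".join(parts)
```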
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>