Compare commits

..

199 Commits

Author SHA1 Message Date
Benjamin Admin
7fc5464df7 Switch Vision-LLM Fusion to llama3.2-vision:11b
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m35s
CI / test-python-agent-core (push) Successful in 19s
CI / test-nodejs-website (push) Successful in 28s
qwen2.5vl:32b needs ~100GB RAM and crashes Ollama.
llama3.2-vision:11b is already installed and fits in memory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 00:44:59 +02:00
Benjamin Admin
5fbf0f4ee2 Fix: _merge_paddle_tesseract takes 2 args not 4
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 32s
CI / test-python-klausur (push) Failing after 2m37s
CI / test-python-agent-core (push) Successful in 20s
CI / test-nodejs-website (push) Successful in 24s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 00:33:49 +02:00
Benjamin Admin
2f8270f77b Add Vision-LLM OCR Fusion (Step 4) for degraded scans
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m43s
CI / test-python-agent-core (push) Successful in 20s
CI / test-nodejs-website (push) Successful in 27s
New module vision_ocr_fusion.py: Sends scan image + OCR word
coordinates + document type to Qwen2.5-VL 32B. The LLM reads
the image visually while using OCR positions as structural hints.

Key features:
- Document-type-aware prompts (Vokabelseite, Woerterbuch, etc.)
- OCR words grouped into lines with x/y coordinates in prompt
- Low-confidence words marked with (?) for LLM attention
- Continuation row merging instructions in prompt
- JSON response parsing with markdown code block handling
- Fallback to original OCR on any error

Frontend (admin-lehrer Grid Review):
- "Vision-LLM" checkbox toggle
- "Typ" dropdown (Vokabelseite, Woerterbuch, etc.)
- Steps 1-3 defaults set to inactive

Activate: Check "Vision-LLM", select document type, click "OCR neu + Grid".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 00:24:22 +02:00
Benjamin Admin
00eb9f26f6 Add "OCR neu + Grid" button to Grid Review
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 51s
CI / test-go-edu-search (push) Successful in 42s
CI / test-python-klausur (push) Failing after 2m53s
CI / test-python-agent-core (push) Successful in 21s
CI / test-nodejs-website (push) Successful in 55s
New endpoint POST /sessions/{id}/rerun-ocr-and-build-grid that:
1. Runs scan quality assessment
2. Applies CLAHE enhancement if degraded (controlled by enhance toggle)
3. Re-runs dual-engine OCR (RapidOCR + Tesseract) with min_conf filter
4. Merges OCR results and stores updated word_result
5. Builds grid with max_columns constraint

Frontend: Orange "OCR neu + Grid" button in GridToolbar.
Unlike "Neu berechnen" (which only rebuilds grid from existing words),
this button re-runs the full OCR pipeline with quality settings.

Now CLAHE toggle actually has an effect — it enhances the image
before OCR runs, not after.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 16:55:01 +02:00
Benjamin Admin
141f69ceaa Fix: max_columns now works in OCR Kombi build-grid pipeline
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 49s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 2m31s
CI / test-python-agent-core (push) Successful in 27s
CI / test-nodejs-website (push) Successful in 30s
The max_columns parameter was only implemented in cv_words_first.py
(vocab-worksheet path) but NOT in _build_grid_core which is what
the admin OCR Kombi pipeline uses. The Kombi pipeline uses
grid_editor_helpers._cluster_columns_by_alignment() which has its
own column detection.

Fix: Post-processing step 5k merges narrowest columns after grid
building when zone has more columns than max_columns. Cells from
merged columns get their text appended to the target column.

min_conf word filtering was already working (applied before grid build).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 16:40:39 +02:00
Benjamin Admin
2baad68060 Remove A/B testing toggles from studio-v2 (customer frontend)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m50s
CI / test-python-agent-core (push) Successful in 38s
CI / test-nodejs-website (push) Successful in 43s
Dev-only toggles belong in admin-lehrer (port 3002) only.
The customer frontend runs the pipeline with optimal defaults
and shows only the finished results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 16:18:44 +02:00
Benjamin Admin
25e5a7415a Add A/B testing toggles to OCR Kombi Grid Review
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 39s
CI / test-go-edu-search (push) Successful in 28s
CI / test-python-klausur (push) Failing after 2m27s
CI / test-python-agent-core (push) Successful in 19s
CI / test-nodejs-website (push) Successful in 24s
Quality step toggles in admin-lehrer StepGridReview (port 3002):
- CLAHE checkbox (Step 3: image enhancement)
- MaxCol dropdown (Step 2: column limit, 0=off)
- MinConf dropdown (Step 1: OCR confidence, 0=auto)

Parameters flow through: StepGridReview → useGridEditor → build-grid
endpoint → _build_grid_core. MinConf filters words before grid building.

Toggle settings, click "Neu berechnen" to test each step individually.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 16:09:17 +02:00
Benjamin Admin
545c8676b0 Add A/B testing toggles for OCR quality steps
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 30s
CI / test-go-edu-search (push) Successful in 28s
CI / test-python-klausur (push) Failing after 2m33s
CI / test-python-agent-core (push) Successful in 26s
CI / test-nodejs-website (push) Successful in 18s
Each quality improvement step can now be toggled independently:
- CLAHE checkbox (Step 3: image enhancement on/off)
- MaxCols dropdown (Step 2: 0=unlimited, 2-5)
- MinConf dropdown (Step 1: auto/20/30/40/50/60)

Backend: Query params enhance, max_cols, min_conf on process-single-page.
Response includes active_steps dict showing which steps are enabled.
Frontend: Toggle controls in VocabularyTab above the table.

This allows empirical A/B testing of each step on the same scan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 15:27:26 +02:00
Benjamin Admin
2f34ee9ede Add scan quality scoring, column limit, image enhancement (Steps 1-3)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 32s
CI / test-python-klausur (push) Failing after 2m21s
CI / test-python-agent-core (push) Successful in 28s
CI / test-nodejs-website (push) Successful in 20s
Step 1: scan_quality.py — Laplacian blur + contrast scoring, adjusts
OCR confidence threshold (40 for good scans, 30 for degraded).
Quality report included in API response + shown in frontend.

Step 2: max_columns parameter in cv_words_first.py — limits column
detection to 3 for vocab tables, preventing phantom columns D/E
from degraded OCR fragments.

Step 3: ocr_image_enhance.py — CLAHE contrast + bilateral filter
denoising + unsharp mask, only for degraded scans (gated by
quality score). Pattern from handwriting_htr_api.py.

Frontend: quality info shown in extraction status after processing.
Reprocess button now derives pages from vocabulary data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 14:58:39 +02:00
Benjamin Admin
5a154b744d fix: migrate ocr-pipeline types to ocr-kombi after page deletion
Types from deleted ocr-pipeline/types.ts inlined into ocr-kombi/types.ts.
All imports updated across components/ocr-kombi/ and components/ocr-pipeline/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 14:22:09 +02:00
Benjamin Admin
f39cbe9283 refactor: remove unused pages and backends (model-management, OCR legacy, GPU/vast.ai, video-chat, matrix)
Deleted pages:
- /ai/model-management (mock data only, no real backend)
- /ai/ocr-compare (old /vocab/ backend, replaced by ocr-kombi)
- /ai/ocr-pipeline (minimal session browser, redundant)
- /ai/ocr-overlay (legacy monolith, redundant)
- /ai/gpu (vast.ai GPU management, no longer used)
- /infrastructure/gpu (same)
- /communication/video-chat (moved to core)
- /communication/matrix (moved to core)

Deleted backends:
- backend-lehrer/infra/vast_client.py + vast_power.py
- backend-lehrer/meetings_api.py + jitsi_api.py
- website/app/api/admin/gpu/
- edu-search-service/scripts/vast_ai_extractor.py

Total: ~7,800 LOC removed. All code preserved in git history.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 13:14:12 +02:00
Benjamin Admin
5abdfa202e chore: install refactoring guardrails (Phase 0) [guardrail-change]
- scripts/check-loc.sh: LOC budget checker (500 LOC hard cap)
- .claude/rules/architecture.md: split triggers, patterns per language
- .claude/rules/loc-exceptions.txt: documented escape hatches
- AGENTS.python.md: FastAPI conventions (routes thin, service layer)
- AGENTS.go.md: Go/Gin conventions (handler ≤40 LOC)
- AGENTS.typescript.md: Next.js conventions (page.tsx ≤250 LOC, colocation)
- CLAUDE.md extended with guardrail section + commit markers

273 files currently exceed 500 LOC — to be addressed phase by phase.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 12:25:36 +02:00
Benjamin Admin
9b0e310978 Fix: reprocess button works after session resume + apply merge logic
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 45s
CI / test-go-edu-search (push) Successful in 46s
CI / test-python-klausur (push) Failing after 2m37s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 34s
Two bugs fixed:
1. reprocessPages() failed silently after session resume because
   successfulPages was empty. Now derives pages from vocabulary
   source_page or selectedPages as fallback.

2. process-single-page endpoint built vocabulary entries WITHOUT
   applying merge logic (_merge_wrapped_rows, _merge_continuation_rows).
   Now applies full merge pipeline after vocabulary extraction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 00:46:15 +02:00
Benjamin Admin
46c2acb2f4 Add "Neu verarbeiten" button to VocabularyTab
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 53s
CI / test-go-edu-search (push) Successful in 53s
CI / test-python-klausur (push) Failing after 2m44s
CI / test-python-agent-core (push) Successful in 1m3s
CI / test-nodejs-website (push) Successful in 36s
Allows reprocessing pages from the vocabulary view to apply
new merge logic without navigating back to page selection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 08:37:13 +02:00
Benjamin Admin
b8f1b71652 Fix: merge cell-wrap continuation rows in vocabulary extraction
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 58s
CI / test-go-edu-search (push) Successful in 48s
CI / test-python-agent-core (push) Has been cancelled
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-klausur (push) Has started running
When textbook authors wrap text within a cell (e.g. long German
translations), OCR treats each physical line as a separate row.
New _merge_wrapped_rows() detects this by checking if the primary
column (EN) is empty — indicating a continuation, not a new entry.

Handles: empty EN + DE text, empty EN + example text, parenthetical
continuations like "(bei)", triple wraps, comma-separated lists.

12 tests added covering all cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 08:32:45 +02:00
Benjamin Admin
6a165b36e5 Add Phase 5.1: LearningProgress dashboard widget
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 51s
CI / test-go-edu-search (push) Successful in 46s
CI / test-python-klausur (push) Failing after 2m39s
CI / test-python-agent-core (push) Successful in 41s
CI / test-nodejs-website (push) Successful in 32s
Eltern-Dashboard widget showing per-unit learning stats:
accuracy ring, coins, crowns, streak, and recent unit list.
Uses ProgressRing and CrownBadge gamification components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:26:44 +02:00
Benjamin Admin
9dddd80d7a Add Phases 3.2-4.3: STT, stories, syllables, gamification
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 37s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-agent-core (push) Has been cancelled
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-klausur (push) Has started running
Phase 3.2 — MicrophoneInput.tsx: Browser Web Speech API for
speech-to-text recognition (EN+DE), integrated for pronunciation practice.

Phase 4.1 — Story Generator: LLM-powered mini-stories using vocabulary
words, with highlighted vocab in HTML output. Backend endpoint
POST /learning-units/{id}/generate-story + frontend /learn/[unitId]/story.

Phase 4.2 — SyllableBow.tsx: SVG arc component for syllable visualization
under words, clickable for per-syllable TTS.

Phase 4.3 — Gamification system:
- CoinAnimation.tsx: Floating coin rewards with accumulator
- CrownBadge.tsx: Crown/medal display for milestones
- ProgressRing.tsx: Circular progress indicator
- progress_api.py: Backend tracking coins, crowns, streaks per unit

Also adds "Geschichte" exercise type button to UnitCard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:22:52 +02:00
Benjamin Admin
20a0585eb1 Add interactive learning modules MVP (Phases 1-3.1)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 44s
CI / test-go-edu-search (push) Successful in 51s
CI / test-python-klausur (push) Failing after 2m44s
CI / test-python-agent-core (push) Successful in 33s
CI / test-nodejs-website (push) Successful in 34s
New feature: After OCR vocabulary extraction, users can generate interactive
learning modules (flashcards, quiz, type trainer) with one click.

Frontend (studio-v2):
- Fortune Sheet spreadsheet editor tab in vocab-worksheet
- "Lernmodule generieren" button in ExportTab
- /learn page with unit overview and exercise type cards
- /learn/[unitId]/flashcards — Flip-card trainer with Leitner spaced repetition
- /learn/[unitId]/quiz — Multiple choice quiz with explanations
- /learn/[unitId]/type — Type-in trainer with Levenshtein distance feedback
- AudioButton component using Web Speech API for EN+DE TTS

Backend (klausur-service):
- vocab_learn_bridge.py: Converts VocabularyEntry[] to analysis_data format
- POST /sessions/{id}/generate-learning-unit endpoint

Backend (backend-lehrer):
- generate-qa, generate-mc, generate-cloze endpoints on learning units
- get-qa/mc/cloze data retrieval endpoints
- Leitner progress update + next review items endpoints

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:13:23 +02:00
Benjamin Admin
4561320e0d Fix SmartSpellChecker: preserve leading non-alpha text like (=
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 47s
CI / test-python-klausur (push) Failing after 2m36s
CI / test-python-agent-core (push) Successful in 35s
CI / test-nodejs-website (push) Successful in 33s
The tokenizer regex only matches alphabetic characters, so text
before the first word match (like "(= " in "(= I won...") was
silently dropped when reassembling the corrected text.

Now preserves text[:first_match_start] as a leading prefix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:41:33 +02:00
Benjamin Admin
596864431b Rule (a2): switch from allow-list to block-list for symbol removal
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 47s
CI / test-python-klausur (push) Failing after 2m42s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 36s
Instead of keeping only specific symbols (_KEEP_SYMBOLS), now only
removes explicitly decorative symbols (_REMOVE_SYMBOLS: > < ~ \ ^ etc).
All other punctuation (= ( ) ; : - etc.) is preserved by default.

This is more robust: any new symbol used in textbooks will be kept
unless it's in the small block-list of known decorative artifacts.

Fixes: (= token still being removed on page 5 despite being in
the allow-list (possibly due to Unicode variants or whitespace).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:34:21 +02:00
Benjamin Admin
c8027eb7f9 Fix: preserve = ; : - and other meaningful symbols in word_boxes
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 40s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m38s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 18s
Rule (a2) in Step 5i removed word_boxes with no letters/digits as
"graphic OCR artifacts". This incorrectly removed = signs used as
definition markers in textbooks ("film = 1. Film; 2. filmen").

Added exception list _KEEP_SYMBOLS for meaningful punctuation:
= (= =) ; : - – — / + • · ( ) & * → ← ↔

The root cause: PaddleOCR returns "film = 1. Film; 2. filmen" as one
block, which gets split into word_boxes ["film", "=", "1.", ...].
The "=" word_box had no alphanumeric chars and was removed as artifact.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:18:35 +02:00
Benjamin Admin
ba0f659d1e Preserve = and (= tokens in grid build and cell text cleanup
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 47s
CI / test-python-klausur (push) Failing after 2m34s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 42s
= signs are used as definition markers in textbooks ("film = 1. Film").
They were incorrectly removed by two filters:

1. grid_build_core.py Step 5j-pre: _PURE_JUNK_RE matched "=" as
   artifact noise. Now exempts =, (=, ;, :, - and similar meaningful
   punctuation tokens.

2. cv_ocr_engines.py _is_noise_tail_token: "pure non-alpha" check
   removed trailing = tokens. Now exempts meaningful punctuation.

Fixes: "film = 1. Film; 2. filmen" losing the = sign,
       "(= I won and he lost.)" losing the (=.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:04:27 +02:00
Benjamin Admin
50bfd6e902 Fix gutter repair: don't suggest corrections for words with parentheses
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 50s
CI / test-go-edu-search (push) Successful in 50s
CI / test-python-klausur (push) Failing after 2m37s
CI / test-python-agent-core (push) Successful in 40s
CI / test-nodejs-website (push) Successful in 31s
Words like "probieren)" or "Englisch)" were incorrectly flagged as
gutter OCR errors because the closing parenthesis wasn't stripped
before dictionary lookup. The spellchecker then suggested "probierend"
(replacing ) with d, edit distance 1).

Two fixes:
1. Strip trailing/leading parentheses in _try_spell_fix before checking
   if the bare word is valid — skip correction if it is
2. Add )( to the rstrip characters in the analysis phase so
   "probieren)" becomes "probieren" for the known-word check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:38:22 +02:00
Benjamin Admin
0599c72cc1 Fix IPA continuation: don't replace normal text with IPA
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m39s
CI / test-python-agent-core (push) Successful in 35s
CI / test-nodejs-website (push) Successful in 19s
Text like "Betonung auf der 1. Silbe: profit ['profit]" was
incorrectly detected as garbled IPA and replaced with generated
IPA transcription of the previous row's example sentence.

Added guard: if the cell text contains >=3 recognizable words
(3+ letter alpha tokens), it's normal text, not garbled IPA.
Garbled IPA is typically short and has no real dictionary words.

Fixes: Row 13 C3 showing IPA instead of pronunciation hint text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:28:58 +02:00
Benjamin Admin
5fad2d420d test+docs(rag): Tests und Entwicklerdoku fuer RAG Landkarte
- 44 Vitest-Tests: JSON-Struktur, Branchen-Zuordnung, Applicability
  Notes, Dokumenttyp-Verteilung, keine Duplikate
- MkDocs-Seite: Architektur, 10 Branchen, Zuordnungslogik,
  Integration in andere Projekte, Datenquellen

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:47:54 +02:00
Benjamin Admin
c8e5e498b5 feat(rag): Applicability Notes UI + Branchen-Review
- Matrix-Zeilen aufklappbar: Klick zeigt Branchenrelevanz-Erklaerung,
  Beschreibung und Gueltigkeitsdatum
- 27 Branchen-Zuordnungen korrigiert:
  - OWASP/NIST/CISA/SBOM-Standards → alle (Kunden entwickeln Software)
  - BSI-TR-03161 → leer (DiGA, nicht Zielmarkt)
  - BSI 200-4, ENISA Supply Chain → alle (CRA/NIS2-Pflicht)
  - EAA/BFSG → +automotive (digitale Interfaces)
- 264 horizontal, 42 sektorspezifisch, 14 nicht zutreffend

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:15:01 +02:00
Benjamin Admin
261f686dac Add OCR Pipeline Extensions developer docs + update vocab-worksheet docs
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 39s
CI / test-python-klausur (push) Failing after 2m36s
CI / test-python-agent-core (push) Successful in 26s
CI / test-nodejs-website (push) Successful in 40s
New: .claude/rules/ocr-pipeline-extensions.md
- Complete documentation for SmartSpellChecker, Box-Grid-Review (Step 11),
  Ansicht/Spreadsheet (Step 12), Unified Grid
- All 14 pipeline steps listed
- Backend/frontend file structure with line counts
- 66 tests documented
- API endpoints, data flow, formatting rules

Updated: .claude/rules/vocab-worksheet.md
- Added Frontend Refactoring section (page.tsx → 14 files)
- Updated format extension instructions (constants.ts instead of page.tsx)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 18:35:16 +02:00
Benjamin Admin
3d3c2b30db Add tests for unified_grid and cv_box_layout
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 50s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m30s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 34s
test_unified_grid.py (10 tests):
- Dominant row height calculation (regular, gaps filtered, single row)
- Box classification (full-width, partial left/right, text line count)
- Unified grid building (content-only, box integration, cell tagging)

test_box_layout.py (13 tests):
- Layout classification (header_only, flowing, bullet_list)
- Line grouping by y-proximity
- Flowing layout indent grouping (bullet + continuations → \n)
- Row/column field completeness for GridTable compatibility

Total: 66 tests passing (43 smart_spell + 13 box_layout + 10 unified)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 18:18:52 +02:00
Benjamin Admin
1d22f649ae fix(rag): Branchen auf 10 VDMA/VDA/BDI-Sektoren korrigiert
Alte 17 "Branchen" (inkl. IoT, KI, HR, KRITIS) durch 10 echte
Industriesektoren ersetzt: Automotive, Maschinenbau, Elektrotechnik,
Chemie, Metall, Energie, Transport, Handel, Konsumgueter, Bau.

Zuordnungslogik: 244 horizontal (alle), 65 sektorspezifisch,
11 nicht zutreffend (Finanz/Medizin/Plattformen).
102 applicability_notes mit Begruendung pro Regulierung.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:56:28 +02:00
Benjamin Admin
610825ac14 SpreadsheetView: add bullet marker (•) for multi-line cells
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 44s
CI / test-go-edu-search (push) Successful in 40s
CI / test-python-klausur (push) Failing after 2m34s
CI / test-python-agent-core (push) Successful in 33s
CI / test-nodejs-website (push) Successful in 38s
Multi-line cells (containing \n) that don't already start with a
bullet character get • prepended in the frontend. This ensures
bullet points are visible regardless of whether the backend inserted
them (depends on when boxes were last rebuilt).

Skips header rows and cells that already have •, -, or – prefix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:53:54 +02:00
Benjamin Admin
6aec4742e5 SpreadsheetView: keep bullets as single cells with text-wrap
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 44s
CI / test-go-edu-search (push) Successful in 35s
CI / test-python-klausur (push) Failing after 2m37s
CI / test-python-agent-core (push) Successful in 27s
CI / test-nodejs-website (push) Successful in 31s
Revert row expansion — multi-line bullet cells stay as single cells
with \n and text-wrap (tb='2'). This way the text reflows when the
user resizes the column, like normal Excel behavior.

Row height auto-scales by line count (24px * lines).
Vertical alignment: top (vt=0) for multi-line cells.
Removed leading-space indentation hack (didn't work reliably).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:07:07 +02:00
Benjamin Admin
0491c2eb84 feat(rag): dynamische Branchen-Regulierungs-Matrix aus JSON
Hardcodierte REGULATIONS/INDUSTRIES/INDUSTRY_REGULATION_MAP durch
JSON-Import ersetzt. 320 Dokumente in 17 Kategorien mit collapsible
Sektionen pro doc_type. page.tsx von 3672 auf 2655 Zeilen reduziert.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:01:51 +02:00
Benjamin Admin
f2bc62b4f5 SpreadsheetView: bullet indentation, expanded rows, box borders
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 46s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m43s
CI / test-python-agent-core (push) Successful in 35s
CI / test-nodejs-website (push) Successful in 1m4s
Multi-line cells (\n): expanded into separate rows so each line gets
its own cell. Continuation lines (after •) indented with leading spaces.
Bullet marker lines (•) are bold.

Font-size detection: cells with word_box height >1.3x median get bold
and larger font (fs=12) for box titles.

Headers: is_header rows always bold with light background tint.

Box borders: thick colored outside border + thin inner grid lines.
Content zone: light gray grid borders.

Auto-fit column widths from longest text per column.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:15:43 +02:00
Benjamin Admin
674c9e949e SpreadsheetView: auto-fit column widths to longest text
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Failing after 22s
CI / test-go-edu-search (push) Failing after 23s
CI / test-python-klausur (push) Failing after 11s
CI / test-python-agent-core (push) Failing after 8s
CI / test-nodejs-website (push) Failing after 24s
Column widths now calculated from the longest text in each column
(~7.5px per character + padding). Takes the maximum of auto-fit
width and scaled original pixel width.

Multi-line cells: uses the longest line for width calculation.
Spanning header cells excluded from width calculation (they span
multiple columns and would inflate single-column widths).

Minimum column width: 60px.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 09:43:50 +02:00
Benjamin Admin
e131aa719e SpreadsheetView: formatting improvements for Excel-like display
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Failing after 21s
CI / test-go-edu-search (push) Failing after 19s
CI / test-python-klausur (push) Failing after 11s
CI / test-python-agent-core (push) Failing after 10s
CI / test-nodejs-website (push) Failing after 23s
Height: sheet height auto-calculated from row count (26px/row + toolbar),
no more cutoff at 21 rows. Row count set to exact (no padding).

Box borders: thick colored outside border + thin inner grid lines.
Content zone: light gray grid lines on all cells.

Headers: bold (bl=1) for is_header rows. Larger font detected via
word_box height comparison (>1.3x median → fs=12 + bold).

Box cells: light tinted background from box_bg_hex.
Header cells in boxes: slightly stronger tint.

Multi-line cells: text wrap enabled (tb='2'), \n preserved.
Bullet points (•) and indentation preserved in cell text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 09:29:50 +02:00
Benjamin Admin
17f0fdb2ed Refactor: extract _build_grid_core into grid_build_core.py + clean StepAnsicht
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Failing after 19s
CI / test-go-edu-search (push) Failing after 23s
CI / test-python-klausur (push) Failing after 10s
CI / test-python-agent-core (push) Failing after 9s
CI / test-nodejs-website (push) Failing after 26s
grid_editor_api.py: 2411 → 474 lines
- Extracted _build_grid_core() (1892 lines) into grid_build_core.py
- API file now only contains endpoints (build, save, get, gutter, box, unified)

StepAnsicht.tsx: 212 → 112 lines
- Removed useGridEditor imports (not needed for read-only spreadsheet)
- Removed unified grid fetch/build (not used with multi-sheet approach)
- Removed Spreadsheet/Grid toggle (only spreadsheet mode now)
- Simple: fetch grid-editor data → pass to SpreadsheetView

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:54:55 +02:00
Benjamin Admin
d4353d76fb SpreadsheetView: multi-sheet tabs instead of unified single sheet
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 36s
CI / test-go-edu-search (push) Successful in 36s
CI / test-python-klausur (push) Failing after 2m21s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 31s
Each zone becomes its own Excel sheet tab with independent column widths:
- Sheet "Vokabeln": main content zone with EN/DE/example columns
- Sheet "Pounds and euros": Box 1 with its own 4-column layout
- Sheet "German leihen": Box 2 with single column for flowing text

This solves the column-width conflict: boxes have different column
widths optimized for their content, which is impossible in a single
unified sheet (Excel limitation: column width is per-column, not per-cell).

Sheet tabs visible at bottom (showSheetTabs: true).
Box sheets get colored tab (from box_bg_hex).
First sheet active by default.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:51:21 +02:00
Benjamin Admin
b42f394833 Integrate Fortune Sheet spreadsheet editor in StepAnsicht
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 36s
CI / test-go-edu-search (push) Successful in 31s
CI / test-python-klausur (push) Failing after 2m40s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 33s
Install @fortune-sheet/react (MIT, v1.0.4) as Excel-like spreadsheet
component. New SpreadsheetView.tsx converts unified grid data to
Fortune Sheet format (celldata, merge config, column/row sizes).

StepAnsicht now has Spreadsheet/Grid toggle:
- Spreadsheet mode: full Fortune Sheet with toolbar (bold, italic,
  color, borders, merge cells, text wrap, undo/redo)
- Grid mode: existing GridTable for quick editing

Box-origin cells get light tinted background in spreadsheet view.
Colspan cells converted to Fortune Sheet merge format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:08:03 +02:00
Benjamin Admin
c1a903537b Unified Grid: merge all zones into single Excel-like grid
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 32s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m35s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 33s
Backend (unified_grid.py):
- build_unified_grid(): merges content + box zones into one zone
- Dominant row height from median of content row spacings
- Full-width boxes: rows integrated directly
- Partial-width boxes: extra rows inserted when box has more text
  lines than standard rows fit (e.g., 7 lines in 5-row height)
- Box-origin cells tagged with source_zone_type + box_region metadata

Backend (grid_editor_api.py):
- POST /sessions/{id}/build-unified-grid → persists as unified_grid_result
- GET /sessions/{id}/unified-grid → retrieve persisted result

Frontend:
- GridEditorCell: added source_zone_type, box_region fields
- GridTable: box-origin cells get tinted background + left border
- StepAnsicht: split-view with original image (left) + editable
  unified GridTable (right). Auto-builds on first load.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:37:55 +02:00
Benjamin Admin
7085c87618 StepAnsicht: dominant row height for content + proportional box rows
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 33s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m35s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 31s
Content sections: use dominant (median) row height from all content
rows instead of per-section average. This ensures uniform row height
above and below boxes (the standard case on textbook pages).

Box sections: distribute height proportionally by text line count
per row. A header (1 line) gets 1/7 of box height, a bullet with
3 lines gets 3/7. Fixes Box 2 where row 3 was cut off because
even distribution didn't account for multi-line cells.

Removed overflow:hidden from box container to prevent clipping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:43:02 +02:00
Benjamin Admin
1b7e095176 StepAnsicht: fix row filtering for partial-width boxes
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 45s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 2m34s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 36s
Content rows were incorrectly filtered out when their Y overlapped
with a box, even if the box only covered the right half of the page.
Now checks both Y AND X overlap — rows are only excluded if they
start within the box's horizontal range.

Fixes: rows next to Box 2 (lend, coconut, taste) were missing from
reconstruction because Box 2 (x=871, w=525) only covers the right
side, but left-side content rows at x≈148 were being filtered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:00:28 +02:00
Benjamin Admin
dcb873db35 StepAnsicht: section-based layout with averaged row heights
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 38s
CI / test-go-edu-search (push) Successful in 38s
CI / test-python-klausur (push) Failing after 2m28s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 40s
Major rewrite of reconstruction rendering:
- Page split into vertical sections (content/box) around box boundaries
- Content sections: uniform row height = (last_row - first_row) / (n-1)
- Box sections: rows evenly distributed within box height
- Content rows positioned absolutely at original y-coordinates
- Font size derived from row height (55% of row height)
- Multi-line cells (bullets) get expanded height with indentation
- Boxes render at exact bbox position with colored border
- Preparation for unified grid where boxes become part of main grid

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:29:40 +02:00
Benjamin Admin
fd39d13d06 StepAnsicht: use server-rendered OCR overlay image
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 40s
CI / test-go-edu-search (push) Successful in 41s
CI / test-python-klausur (push) Failing after 2m38s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 24s
Replace manual word_box positioning (wild/unsnapped) with the
server-rendered words-overlay image from the OCR step endpoint.
This shows the same cleanly snapped red letters as the OCR step.

Endpoint: /sessions/{id}/image/words-overlay

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 23:26:54 +02:00
Benjamin Admin
c5733a171b StepAnsicht: fix font size and row spacing to match original
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 40s
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-agent-core (push) Has been cancelled
CI / test-python-klausur (push) Has been cancelled
- Font: use font_size_suggestion_px * scale directly (removed 0.85 factor)
- Row height: calculate from row-to-row spacing (y_min of next row
  minus y_min of current row) instead of text height (y_max - y_min).
  This produces correct line spacing matching the original layout.
- Multi-line cells: height multiplied by line count

Content zone should now span from ~250 to ~2050 matching the original.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 23:24:27 +02:00
Benjamin Admin
18213f0bde StepAnsicht: split-view with coordinate grid for comparison
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 45s
CI / test-go-edu-search (push) Successful in 40s
CI / test-python-klausur (push) Failing after 2m37s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 36s
Left panel: Original scan + OCR word overlay (red text at exact
word_box positions) + coordinate grid
Right panel: Reconstructed layout + same coordinate grid

Features:
- Coordinate grid toggle with 50/100/200px spacing options
- Grid lines labeled with pixel coordinates in original image space
- Both panels share the same scale for direct visual comparison
- OCR overlay shows detected text in red mono font at original positions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 23:00:22 +02:00
Benjamin Admin
cd8eb6ce46 Add Ansicht step (Step 12) — read-only page layout preview
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 39s
CI / test-go-edu-search (push) Successful in 49s
CI / test-python-klausur (push) Failing after 2m33s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 36s
New pipeline step showing the reconstructed page with all zones
positioned at their original coordinates:
- Content zones with vocabulary grid cells
- Box zones with colored borders (from structure detection)
- Colspan cells rendered across multiple columns
- Multi-line cells (bullets) with pre-wrap whitespace
- Toggle to overlay original scan image at 15% opacity
- Proportionally scaled to viewport width
- Pure CSS positioning (no canvas/Fabric.js)

Pipeline: 14 steps (0-13), Ground Truth moved to Step 13.
Added colspan field to GridEditorCell type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 22:42:33 +02:00
Benjamin Admin
2c2bdf903a Fix GridTable: replace ternary chain with IIFE for cell rendering
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 44s
CI / test-go-edu-search (push) Successful in 36s
CI / test-python-klausur (push) Failing after 2m28s
CI / test-python-agent-core (push) Successful in 36s
CI / test-nodejs-website (push) Successful in 31s
Chained ternary (colored ? div : multiline ? textarea : input) caused
webpack SWC parser issues. Replaced with IIFE {(() => { if/return })()}
which is more robust and readable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:10:22 +02:00
Benjamin Admin
947ff6bdcb Fix JSX ternary nesting for textarea/input in GridTable
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 41s
CI / test-python-klausur (push) Failing after 2m32s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 28s
Remove extra curly braces around the textarea/input ternary that
caused webpack syntax error. The ternary is now a chained condition:
hasColoredWords ? <div> : text.includes('\n') ? <textarea> : <input>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:02:22 +02:00
Benjamin Admin
92e4021898 Fix GridTable JSX syntax error in colspan rendering
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 31s
CI / test-go-edu-search (push) Successful in 42s
CI / test-python-klausur (push) Failing after 2m43s
CI / test-python-agent-core (push) Successful in 33s
CI / test-nodejs-website (push) Successful in 39s
Mismatched closing tags from previous colspan edit caused webpack
build failure. Cleaned up spanning cell map() return structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:52:26 +02:00
Benjamin Admin
108f1b1a2a GridTable: render multi-line cells with textarea
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 46s
CI / test-go-edu-search (push) Successful in 46s
CI / test-python-klausur (push) Failing after 2m53s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 34s
Cells containing \n (bullet items with continuation lines) now use
<textarea> instead of <input type=text>, making all lines visible.
Row height auto-expands based on line count in the cell.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:17:29 +02:00
Benjamin Admin
48de4d98cd Fix infinite loop in StepBoxGridReview auto-build
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 35s
CI / test-python-klausur (push) Failing after 2m41s
CI / test-python-agent-core (push) Successful in 37s
CI / test-nodejs-website (push) Successful in 35s
Auto-build was triggering on every grid.zones.length change, which
happens on every rebuild (zone indices increment). Now uses a ref
to ensure auto-build fires only once. Also removed boxZones.length===0
condition that could trigger unnecessary builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:06:11 +02:00
Benjamin Admin
b5900f1aff Bullet indentation detection: group continuation lines into bullets
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 45s
CI / test-go-edu-search (push) Successful in 41s
CI / test-python-klausur (push) Failing after 2m49s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 34s
Flowing/bullet_list layout now analyzes left-edge indentation:
- Lines at minimum indent = bullet start / main level
- Lines indented >15px more = continuation (belongs to previous bullet)
- Continuation lines merged with \n into parent bullet cell
- Missing bullet markers (•) auto-added when pattern is clear

Example: 7 OCR lines → 3 items (1 header + 2 bullets × 3 lines each)
"German leihen" header, then two bullet groups with indented examples.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:57:16 +02:00
Benjamin Admin
baac98f837 Filter false-positive boxes in header/footer margins
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 55s
CI / test-go-edu-search (push) Successful in 1m0s
CI / test-python-klausur (push) Failing after 2m35s
CI / test-python-agent-core (push) Successful in 27s
CI / test-nodejs-website (push) Successful in 27s
Boxes whose vertical center falls within top/bottom 7% of image
height are filtered out (page numbers, unit headers, running footers).
At typical scan resolutions, 7% ≈ 2.5cm margin.

Fixes: "Box 1" containing just "3" from "Unit 3" page header being
incorrectly treated as an embedded box.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:38:53 +02:00
Benjamin Admin
496d34d822 Fix box empty rows: add x_min_px/x_max_px to flowing/header columns
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 55s
CI / test-go-edu-search (push) Successful in 51s
CI / test-python-klausur (push) Failing after 2m7s
CI / test-python-agent-core (push) Successful in 26s
CI / test-nodejs-website (push) Successful in 31s
GridTable calculates column widths from col.x_max_px - col.x_min_px.
Flowing and header_only layouts were missing these fields, producing
NaN widths which collapsed the CSS grid layout and showed empty rows
with only row numbers visible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 13:01:11 +02:00
Benjamin Admin
709e41e050 GridTable: support partial colspan (2-of-4 columns)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 39s
CI / test-go-edu-search (push) Successful in 40s
CI / test-python-klausur (push) Failing after 2m16s
CI / test-python-agent-core (push) Successful in 28s
CI / test-nodejs-website (push) Successful in 31s
Previously GridTable only supported full-row spanning (one cell across
all columns). Now renders each spanning_header cell with its actual
colspan, positioned at the correct grid column. This allows rows like
"In Britain..." (colspan=2) + "In Germany..." (colspan=2) to render
side by side instead of only showing the first cell.

Also fix box row fields: is_header always set (was undefined for
flowing/bullet_list), y_min_px/y_max_px for header_only rows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:47:14 +02:00
Benjamin Admin
7b3e8c576d Fix NameError: span_cells removed but still referenced in log
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 51s
CI / test-python-klausur (push) Failing after 2m42s
CI / test-python-agent-core (push) Successful in 39s
CI / test-nodejs-website (push) Successful in 38s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:20:11 +02:00
Benjamin Admin
868f99f109 Fix colspan text + box row fields for GridTable compatibility
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 41s
CI / test-python-klausur (push) Failing after 2m49s
CI / test-python-agent-core (push) Successful in 42s
CI / test-nodejs-website (push) Successful in 33s
Colspan: use original word-block text instead of split cell texts.
Prevents "euros a nd cents" from split_cross_column_words.

Box rows: add is_header field (was undefined, causing GridTable
rendering issues). Add y_min_px/y_max_px to header_only rows.
These missing fields caused empty rows with only row numbers visible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:08:49 +02:00
Benjamin Admin
dc25f243a4 Fix colspan: use original words before split_cross_column_words
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 47s
CI / test-python-klausur (push) Failing after 2m33s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 35s
_split_cross_column_words was destroying the colspan information by
cutting word-blocks at column boundaries BEFORE _detect_colspan_cells
could analyze them. Now passes original (pre-split) words to colspan
detection while using split words for cell building.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:58:32 +02:00
Benjamin Admin
c62ff7cd31 Generic colspan detection for merged cells in grids and boxes
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 33s
CI / test-go-edu-search (push) Successful in 38s
CI / test-python-klausur (push) Failing after 2m45s
CI / test-python-agent-core (push) Successful in 38s
CI / test-nodejs-website (push) Successful in 34s
New _detect_colspan_cells() in grid_editor_helpers.py:
- Runs after _build_cells() for every zone (content + box)
- Detects word-blocks that extend across column boundaries
- Merges affected cells into spanning_header with colspan=N
- Uses column midpoints to determine which columns are covered
- Works for full-page scans and box zones equally

Also fixes box flowing/bullet_list row height fields (y_min_px/y_max_px).

Removed duplicate spanning logic from cv_box_layout.py — now uses
the generic _detect_colspan_cells from grid_editor_helpers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:38:03 +02:00
Benjamin Admin
5d91698c3b Fix box grid: row height fields + spanning cell detection
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 46s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m36s
CI / test-python-agent-core (push) Successful in 33s
CI / test-nodejs-website (push) Successful in 37s
Box 3 empty rows: flowing/bullet_list rows were missing y_min_px/
y_max_px fields that GridTable uses for row height calculation.
Added _px and _pct variants.

Box 2 spanning cells: rows with fewer word-blocks than columns
(e.g., "In Britain..." spanning 2 columns) are now detected and
merged into spanning_header cells. GridTable already renders
spanning_header cells across the full row width.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:46:43 +02:00
Benjamin Admin
5fa5767c9a Fix box column detection: use low gap_threshold for small zones
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 39s
CI / test-python-klausur (push) Failing after 2m48s
CI / test-python-agent-core (push) Successful in 38s
CI / test-nodejs-website (push) Successful in 30s
PaddleOCR returns multi-word blocks (whole phrases), so ALL inter-word
gaps in small zones (boxes, ≤60 words) are column boundaries. Previous
3x-median approach produced thresholds too high to detect real columns.

New approach for small zones: gap_threshold = max(median_h * 1.0, 25).
This correctly detects 4 columns in "Pounds and euros" box where gaps
range from 50-297px and word height is ~31px.

Also includes SmartSpellChecker fixes from previous commits:
- Frequency-based scoring, IPA protection, slash→l, rare-word threshold

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 07:55:29 +02:00
Benjamin Admin
693803fb7c SmartSpellChecker: frequency scoring, IPA protection, slash→l fix
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 42s
CI / test-python-klausur (push) Failing after 2m55s
CI / test-python-agent-core (push) Successful in 37s
CI / test-nodejs-website (push) Successful in 31s
Major improvements:
- Frequency-based boundary repair: always tries repair, uses word
  frequency product to decide (Pound sand→Pounds and: 2000x better)
- IPA bracket protection: words inside [brackets] are never modified,
  even when brackets land in tokenizer separators
- Slash→l substitution: "p/" → "pl" for italic l misread as slash
- Abbreviation guard uses rare-word threshold (freq < 1e-6) instead
  of binary known/unknown — prevents "Can I" → "Ca nI" while still
  fixing "ats th." → "at sth."
- Tokenizer includes / character for slash-word detection

43 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 07:36:39 +02:00
Benjamin Admin
31089df36f SmartSpellChecker: frequency-based boundary repair for valid word pairs
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 40s
CI / test-python-klausur (push) Failing after 2m42s
CI / test-python-agent-core (push) Successful in 37s
CI / test-nodejs-website (push) Successful in 35s
Previously, boundary repair was skipped when both words were valid
dictionary words (e.g., "Pound sand", "wit hit", "done euro").
Now uses word-frequency scoring (product of bigram frequencies) to
decide if the repair produces a more common word pair.

Threshold: repair accepted when new pair is >5x more frequent, or
when repair produces a known abbreviation.

New fixes: Pound sand→Pounds and (2000x), wit hit→with it (100000x),
done euro→one euro (7x).

43 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 07:00:22 +02:00
Benjamin Admin
7b294f9150 Cap gap_threshold at 25% of zone_w for column detection
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 46s
CI / test-go-edu-search (push) Successful in 52s
CI / test-python-klausur (push) Failing after 2m51s
CI / test-python-agent-core (push) Successful in 40s
CI / test-nodejs-website (push) Successful in 34s
In small zones (boxes), intra-phrase gaps inflate the median gap,
causing gap_threshold to become too large to detect real column
boundaries. Cap at 25% of zone width to prevent this.

Example: Box "Pounds and euros" has 4 columns at x≈148,534,751,1137
but gap_threshold was 531 (larger than the column gaps themselves).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:58:15 +02:00
Benjamin Admin
8b29d20940 StepBoxGridReview: show box border color from structure detection
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m46s
CI / test-python-agent-core (push) Successful in 35s
CI / test-nodejs-website (push) Successful in 35s
- Use box_bg_hex for border color (from Step 7 structure detection)
- Numbered color badges per box
- Show color name in box header
- Add box_bg_color/box_bg_hex to GridZone type

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:18:36 +02:00
Benjamin Admin
12b194ad1a Fix StepBoxGridReview: match GridTable props interface
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 46s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m50s
CI / test-python-agent-core (push) Successful in 37s
CI / test-nodejs-website (push) Successful in 38s
GridTable expects zone (singular), onSelectCell, onCellTextChange,
onToggleColumnBold, onToggleRowHeader, onNavigate — not the
incorrect prop names from the first version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:39:38 +02:00
Benjamin Admin
058eadb0e4 Fix build-box-grids: use structure_result boxes + raw OCR words
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 48s
CI / test-go-edu-search (push) Successful in 44s
CI / test-python-klausur (push) Failing after 2m47s
CI / test-python-agent-core (push) Successful in 33s
CI / test-nodejs-website (push) Successful in 36s
- Source boxes from structure_result (Step 7) instead of grid zones
- Use raw_paddle_words (top/left/width/height) instead of grid cells
- Create new box zones from all detected boxes (not just existing zones)
- Sort zones by y-position for correct reading order
- Include box background color metadata

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:50:28 +02:00
Benjamin Admin
5da9a550bf Add Box-Grid-Review step (Step 11) to OCR pipeline
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 44s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m52s
CI / test-python-agent-core (push) Successful in 36s
CI / test-nodejs-website (push) Successful in 37s
New pipeline step between Gutter Repair and Ground Truth that processes
embedded boxes (grammar tips, exercises) independently from the main grid.

Backend:
- cv_box_layout.py: classify_box_layout() detects flowing/columnar/
  bullet_list/header_only layout types per box
- build_box_zone_grid(): layout-aware grid building (single-column for
  flowing text, independent columns for tabular content)
- POST /sessions/{id}/build-box-grids endpoint with SmartSpellChecker
- Layout type overridable per box via request body

Frontend:
- StepBoxGridReview.tsx: shows each box with cropped image + editable
  GridTable. Layout type dropdown per box. Auto-builds on first load.
- Auto-skip when no boxes detected on page
- Pipeline steps updated: 13 steps (0-12), Ground Truth moved to 12

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:26:06 +02:00
Benjamin Admin
52637778b9 SmartSpellChecker: boundary repair + context split + abbreviation awareness
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 51s
CI / test-go-edu-search (push) Successful in 47s
CI / test-python-klausur (push) Failing after 2m54s
CI / test-python-agent-core (push) Successful in 35s
CI / test-nodejs-website (push) Successful in 35s
New features:
- Boundary repair: "ats th." → "at sth." (shifted OCR word boundaries)
  Tries shifting 1-2 chars between adjacent words, accepts if result
  includes a known abbreviation or produces better dictionary matches
- Context split: "anew book" → "a new book" (ambiguous word merges)
  Explicit allow/deny list for article+word patterns (alive, alone, etc.)
- Abbreviation awareness: 120+ known abbreviations (sth, sb, adj, etc.)
  are now recognized as valid words, preventing false corrections
- Quality gate: boundary repairs only accepted when result scores
  higher than original (known words + abbreviations)

40 tests passing, all edge cases covered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:41:17 +02:00
Benjamin Admin
f6372b8c69 Integrate SmartSpellChecker into build-grid finalization
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m45s
CI / test-python-agent-core (push) Successful in 36s
CI / test-nodejs-website (push) Successful in 40s
SmartSpellChecker now runs during grid build (not just LLM review),
so corrections are visible immediately in the grid editor.

Language detection per column:
- EN column detected via IPA signals (existing logic)
- All other columns assumed German for vocab tables
- Auto-detection for single/two-column layouts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 14:54:01 +02:00
Benjamin Admin
909d0729f6 Add SmartSpellChecker + refactor vocab-worksheet page.tsx
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 45s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m51s
CI / test-python-agent-core (push) Successful in 36s
CI / test-nodejs-website (push) Successful in 37s
SmartSpellChecker (klausur-service):
- Language-aware OCR post-correction without LLMs
- Dual-dictionary heuristic for EN/DE language detection
- Context-based a/I disambiguation via bigram lookup
- Multi-digit substitution (sch00l→school)
- Cross-language guard (don't false-correct DE words in EN column)
- Umlaut correction (Schuler→Schüler, uber→über)
- Integrated into spell_review_entries_sync() pipeline
- 31 tests, 9ms/100 corrections

Vocab-worksheet refactoring (studio-v2):
- Split 2337-line page.tsx into 14 files
- Custom hook useVocabWorksheet.ts (all state + logic)
- 9 components in components/ directory
- types.ts, constants.ts for shared definitions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 12:25:01 +02:00
Benjamin Admin
04fa01661c Move IPA/syllable toggles to vocabulary tab toolbar
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 49s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m51s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 36s
Dropdowns are now in the vocabulary table header (after processing),
not in the worksheet settings (before processing). Changing a mode
automatically reprocesses all successful pages with the new settings.
Same dropdown options as the OCR pipeline grid editor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:17:14 +02:00
Benjamin Admin
bf9d24e108 Replace IPA/syllable checkboxes with full dropdowns in vocab-worksheet
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 47s
CI / test-python-klausur (push) Failing after 2m41s
CI / test-python-agent-core (push) Successful in 39s
CI / test-nodejs-website (push) Successful in 42s
Vocab worksheet now has the same IPA/syllable mode options as the
OCR pipeline grid editor: Auto, nur EN, nur DE, Alle, Aus.
Previously only had on/off checkboxes mapping to auto/none.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:10:22 +02:00
Benjamin Admin
0f17eb3cd9 Fix IPA:Aus — strip all brackets before skipping IPA block
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 49s
CI / test-go-edu-search (push) Successful in 35s
CI / test-python-klausur (push) Failing after 2m53s
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-agent-core (push) Has started running
When ipa_mode=none, the entire IPA processing block was skipped,
including the bracket-stripping logic. Now strips ALL square brackets
from content columns BEFORE the skip, so IPA:Aus actually removes
all IPA from the display.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:05:22 +02:00
Benjamin Admin
5244e10728 Fix IPA/syllable race condition: loadGrid no longer depends on buildGrid
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m55s
CI / test-python-agent-core (push) Successful in 35s
CI / test-nodejs-website (push) Has been cancelled
loadGrid depended on buildGrid (for 404 fallback), which depended on
ipaMode/syllableMode. Every mode change created a new loadGrid ref,
triggering StepGridReview's useEffect to load the OLD saved grid,
overwriting the freshly rebuilt one.

Now loadGrid only depends on sessionId. The 404 fallback builds inline
with current modes. Mode changes are handled exclusively by the
separate rebuild useEffect.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:59:49 +02:00
Benjamin Admin
a6c5f56003 Fix IPA strip: match all square brackets, not just Unicode IPA
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 45s
CI / test-go-edu-search (push) Successful in 41s
CI / test-python-klausur (push) Failing after 2m49s
CI / test-python-agent-core (push) Successful in 29s
CI / test-nodejs-website (push) Successful in 23s
OCR text contains ASCII IPA approximations like [kompa'tifn] instead
of Unicode [kˈɒmpətɪʃən]. The strip regex required Unicode IPA chars
inside brackets and missed the ASCII ones. Now strips all [bracket]
content from excluded columns since square brackets in vocab columns
are always IPA.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:53:16 +02:00
Benjamin Admin
584e07eb21 Strip English IPA when mode excludes EN (nur DE / Aus)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 46s
CI / test-python-agent-core (push) Has been cancelled
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-klausur (push) Has been cancelled
English IPA from the original OCR scan (e.g. [ˈgrænˌdæd]) was always
shown because fix_cell_phonetics only ADDS/CORRECTS but never removes.
Now strips IPA brackets containing Unicode IPA chars from the EN column
when ipa_mode is "de" or "none".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:49:22 +02:00
Benjamin Admin
54b1c7d7d7 Fix IPA/syllable first-click not working (off-by-one in initialLoadDone)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 46s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m52s
CI / test-python-agent-core (push) Successful in 36s
CI / test-nodejs-website (push) Successful in 38s
The old guard checked if grid was loaded AND set initialLoadDone in
the same pass, then returned without rebuilding. This meant the first
user-triggered mode change was always swallowed.

Simplified to a mount-skip ref: skip exactly the first useEffect trigger
(component mount), rebuild on every subsequent trigger (user changes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:40:57 +02:00
Benjamin Admin
d8a2331038 Fix IPA/syllable mode change requiring double-click
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m58s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 38s
The useEffect for mode changes called buildGrid() which was a
useCallback closing over stale ipaMode/syllableMode values due to
React's asynchronous state batching. The first click triggered a
rebuild with the OLD mode; only the second click used the new one.

Now inlines the API call directly in the useEffect, reading ipaMode
and syllableMode from the effect's closure which always has the
current values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:32:02 +02:00
Benjamin Admin
ad78e26143 Fix word-split: handle IPA brackets, contractions, and tiebreaker
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 46s
CI / test-python-klausur (push) Failing after 2m57s
CI / test-python-agent-core (push) Successful in 36s
CI / test-nodejs-website (push) Successful in 41s
1. Strip IPA brackets [ipa] before attempting word split, so
   "makeadecision[dɪsˈɪʒən]" is processed as "makeadecision"
2. Handle contractions: "solet's" → split "solet" → "so let" + "'s"
3. DP tiebreaker: prefer longer first word when scores are equal
   ("task is" over "ta skis")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:13:02 +02:00
Benjamin Admin
4f4e6c31fa Fix word-split tiebreaker: prefer longer first word
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 39s
CI / test-python-klausur (push) Failing after 2m44s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 35s
"taskis" was split as "ta skis" instead of "task is" because both
have the same DP score. Changed comparison from > to >= so that
later candidates (with longer first words) win ties.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:05:14 +02:00
Benjamin Admin
7ffa4c90f9 Lower word-split threshold from 7 to 4 chars
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 50s
CI / test-go-edu-search (push) Successful in 46s
CI / test-python-klausur (push) Failing after 2m48s
CI / test-python-agent-core (push) Successful in 37s
CI / test-nodejs-website (push) Successful in 38s
Short merged words like "anew" (a new), "Imadea" (I made a),
"makeadecision" (make a decision) were missed because the split
threshold was too high. Now processes tokens >= 4 chars.

English single-letter words (a, I) are already handled by the DP
algorithm which allows them as valid split points.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:59:02 +02:00
Benjamin Admin
656cadbb1e Remove page-number footers from grid, promote to metadata
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 40s
CI / test-python-klausur (push) Failing after 2m55s
CI / test-python-agent-core (push) Successful in 30s
CI / test-nodejs-website (push) Successful in 37s
Footer rows that are page numbers (digits or written-out like
"two hundred and nine") are now removed from the grid entirely
and promoted to the page_number metadata field. Non-page-number
footer content stays as a visible footer row.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:50:20 +02:00
Benjamin Admin
757c8460c9 Detect written-out page numbers as footer rows
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 44s
CI / test-python-klausur (push) Failing after 2m46s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 39s
"two hundred and nine" (22 chars) was kept as a content row because
the footer detection only accepted text ≤20 chars. Now recognizes
written-out number words (English + German) as page numbers regardless
of length.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:39:43 +02:00
Benjamin Admin
501de4374a Keep page references as visible column cells
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 41s
CI / test-python-klausur (push) Failing after 2m49s
CI / test-python-agent-core (push) Successful in 37s
CI / test-nodejs-website (push) Successful in 35s
Step 5g was extracting page refs (p.55, p.70) as zone metadata and
removing them from the cell table. Users want to see them as a
separate column. Now keeps cells in place while still extracting
metadata for the frontend header display.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:27:44 +02:00
Benjamin Admin
774bbc50d3 Add debug logging for empty-column-removal
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 54s
CI / test-python-klausur (push) Failing after 2m53s
CI / test-python-agent-core (push) Successful in 39s
CI / test-nodejs-website (push) Successful in 39s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:45:22 +02:00
Benjamin Admin
9ceee4e07c Protect page references from junk-row removal
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Failing after 11s
CI / test-go-edu-search (push) Successful in 57s
CI / test-python-klausur (push) Failing after 2m49s
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-agent-core (push) Has been cancelled
Rows containing only a page reference (p.55, S.12) were removed as
"oversized stubs" (Rule 2) when their word-box height exceeded the
median. Now skips Rule 2 if any word matches the page-ref pattern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:40:37 +02:00
Benjamin Admin
f23aaaea51 Fix false header detection: skip continuation lines and mid-column cells
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 54s
CI / test-go-edu-search (push) Successful in 57s
CI / test-python-klausur (push) Failing after 2m57s
CI / test-python-agent-core (push) Successful in 28s
CI / test-nodejs-website (push) Successful in 34s
Single-cell rows were incorrectly detected as headings when they were
actually continuation lines. Two new guards:
1. Text starting with "(" is a continuation (e.g. "(usw.)", "(TV-Serie)")
2. Single cells beyond the first two content columns are overflow lines,
   not headings. Real headings appear in the first columns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:21:09 +02:00
Benjamin Admin
cde13c9623 Fix IPA stripping digits after headwords (Theme 1 → Theme)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 46s
CI / test-go-edu-search (push) Successful in 42s
CI / test-python-klausur (push) Failing after 2m46s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 30s
_insert_missing_ipa stripped "1" from "Theme 1" because it treated
the digit as garbled OCR phonetics. Now treats pure digits/numbering
patterns (1, 2., 3)) as delimiters that stop the garble-stripping.

Also fixes _has_non_dict_trailing which incorrectly flagged "Theme 1"
as having non-dictionary trailing text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:13:45 +02:00
Benjamin Admin
2e42167c73 Remove empty columns from grid zones
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 52s
CI / test-go-edu-search (push) Successful in 39s
CI / test-python-klausur (push) Failing after 2m43s
CI / test-python-agent-core (push) Successful in 34s
CI / test-nodejs-website (push) Successful in 29s
Columns with zero cells (e.g. from tertiary detection where the word
was assigned to a neighboring column by overlap) are stripped from the
final result. Remaining columns and cells are re-indexed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:04:49 +02:00
Benjamin Admin
5eff4cf877 Fix page refs deleted as artifacts + IPA spacing for DE mode
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 54s
CI / test-go-edu-search (push) Successful in 41s
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-agent-core (push) Has been cancelled
CI / test-python-klausur (push) Has started running
1. Step 5j-pre wrongly classified "p.43", "p.50" etc as artifacts
   (mixed digits+letters, <=5 chars). Added exception for page
   reference patterns (p.XX, S.XX).

2. IPA spacing regex was too narrow (only matched Unicode IPA chars).
   Now matches any [bracket] content >=2 chars directly after a letter,
   fixing German IPA like "Opa[oːpa]" → "Opa [oːpa]".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:01:25 +02:00
Benjamin Admin
7f4b8757ff Fix IPA spacing + add zone debug logging for marker column issue
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 55s
CI / test-go-edu-search (push) Successful in 49s
CI / test-python-klausur (push) Failing after 2m48s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 37s
1. Ensure space before IPA brackets in cell text: "word[ipa]" → "word [ipa]"
   Applied as final cleanup in grid-build finalization.

2. Add debug logging for zone-word assignment to diagnose why marker
   column cells are empty despite correct column detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:51:52 +02:00
Benjamin Admin
7263328edb Fix marker column detection: remove min-rows requirement
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 30s
CI / test-go-edu-search (push) Successful in 30s
CI / test-python-klausur (push) Failing after 2m55s
CI / test-python-agent-core (push) Successful in 18s
CI / test-nodejs-website (push) Successful in 22s
Words to the left of the first detected column boundary must always
form their own column, regardless of how few rows they appear in.
Previously required 4+ distinct rows for tertiary (margin) columns,
which missed page references like p.62, p.63, p.64 (only 3 rows).

Now any cluster at the left/right margin with a clear gap to the
nearest significant column qualifies as its own column.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:24:25 +02:00
Benjamin Admin
8c482ce8dd Fix Grid Build step: show grid-editor summary instead of word_result
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 31s
CI / test-go-edu-search (push) Successful in 31s
CI / test-python-klausur (push) Failing after 2m31s
CI / test-python-agent-core (push) Successful in 21s
CI / test-nodejs-website (push) Successful in 23s
The Grid Build step was showing word_result.grid_shape (from the initial
OCR word clustering, often just 1 column) instead of the grid-editor
summary (zone-based, with correct column/row/cell counts). Now reads
summary.total_rows/total_columns/total_cells from the grid-editor result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:01:18 +02:00
Benjamin Admin
00f7a7154c Fix left-side gutter detection: find peak instead of scanning from edge
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 40s
CI / test-go-edu-search (push) Successful in 42s
CI / test-python-klausur (push) Failing after 2m39s
CI / test-python-agent-core (push) Successful in 30s
CI / test-nodejs-website (push) Successful in 32s
Left-side book fold shadows have a V-shape: brightness dips from the
edge toward a peak at ~5-10% of width, then rises again. The previous
algorithm scanned from the edge inward and immediately found a low
dark fraction (0.13 at x=0), missing the gutter entirely.

Now finds the PEAK of the dark fraction profile first, then scans from
that peak toward the page center to find the transition point. Works
for both V-shaped left gutters and edge-darkening right gutters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:52:23 +02:00
Benjamin Admin
9c5e950c99 Fix multi-page PDF upload: include session_id for first page
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-nodejs-website (push) Successful in 36s
CI / test-python-klausur (push) Failing after 10m2s
CI / test-go-edu-search (push) Failing after 10m9s
CI / test-python-agent-core (push) Failing after 14m58s
The frontend expects session_id in the upload response, but multi-page
PDFs returned only document_group_id + pages[]. Now includes session_id
pointing to the first page for backwards compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:26:25 +02:00
Benjamin Admin
6e494a43ab Apply merged-word splitting to grid-editor cells
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 44s
CI / test-python-klausur (push) Failing after 2m28s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 32s
The spell review only runs on vocab entries, but the OCR pipeline's
grid-editor cells also contain merged words (e.g. "atmyschool").
Now splits merged words directly in the grid-build finalization step,
right before returning the result. Uses the same _try_split_merged_word()
dictionary-based DP algorithm.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:52:00 +02:00
Benjamin Admin
53b0d77853 Multi-page PDF support: create one session per page
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Failing after 27s
CI / test-go-edu-search (push) Successful in 39s
CI / test-python-klausur (push) Failing after 2m36s
CI / test-python-agent-core (push) Successful in 24s
CI / test-nodejs-website (push) Successful in 35s
When uploading a PDF with > 1 page to the OCR pipeline, each page
now gets its own session (grouped by document_group_id). Previously
only page 1 was processed. The response includes a pages array with
all session IDs so the frontend can navigate between them.

Single-page PDFs and images continue to work as before.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:39:48 +02:00
Benjamin Admin
aed0edbf6d Fix word split scoring: prefer longer words over short ones
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Failing after 20s
CI / test-go-edu-search (push) Successful in 43s
CI / test-python-klausur (push) Failing after 2m41s
CI / test-python-agent-core (push) Successful in 24s
CI / test-nodejs-website (push) Successful in 30s
"Comeon" was split as "Com eon" instead of "Come on" because both
are 2-word splits. Now uses sum-of-squared-lengths as tiebreaker:
"come"(16) + "on"(4) = 20 > "com"(9) + "eon"(9) = 18.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:14:23 +02:00
Benjamin Admin
9e2c301723 Add merged-word splitting to OCR spell review
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 38s
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-agent-core (push) Has been cancelled
CI / test-python-klausur (push) Has been cancelled
OCR often merges adjacent words when spacing is tight, e.g.
"atmyschool" → "at my school", "goodidea" → "good idea".

New _try_split_merged_word() uses dynamic programming to find the
shortest sequence of dictionary words covering the token. Integrated
as step 5 in _spell_fix_token() after general spell correction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:11:16 +02:00
Benjamin Admin
633e301bfd Add camera gutter detection via vertical continuity analysis
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 45s
CI / test-go-edu-search (push) Successful in 32s
CI / test-python-klausur (push) Failing after 2m49s
CI / test-python-agent-core (push) Successful in 30s
CI / test-nodejs-website (push) Successful in 32s
Scanner shadow detection (range > 40, darkest < 180) fails on camera
book scans where the gutter shadow is subtle (range ~25, darkest ~214).

New _detect_gutter_continuity() detects gutters by their unique property:
the shadow runs continuously from top to bottom without interruption.
Divides the image into horizontal strips and checks what fraction of
strips are darker than the page median at each column. A gutter column
has >= 75% of strips darker. The transition point where the smoothed
dark fraction drops below 50% marks the crop boundary.

Integrated as fallback between scanner shadow and binary projection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 13:58:14 +02:00
Benjamin Admin
9b5e8c6b35 Restructure upload flow: document first, then preview + naming
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 36s
CI / test-go-edu-search (push) Successful in 38s
CI / test-python-klausur (push) Failing after 2m39s
CI / test-python-agent-core (push) Successful in 33s
CI / test-nodejs-website (push) Successful in 24s
Step 1 is now document selection (full width).
After selecting a file, Step 2 shows a side-by-side layout with
document preview (3/5 width, scrollable, with fullscreen modal)
and session naming (2/5 width, with start button).

Also adds PDF preview via blob URL before upload.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:53:47 +02:00
Benjamin Admin
682b306e51 Use grid-build zones for vocab extraction (4-column detection)
Some checks failed
CI / go-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 42s
CI / test-python-klausur (push) Failing after 2m44s
CI / test-python-agent-core (push) Successful in 29s
CI / test-nodejs-website (push) Successful in 36s
The initial build_grid_from_words() under-clusters to 1 column while
_build_grid_core() correctly finds 4 columns (marker, EN, DE, example).
Now extracts vocab from grid zones directly, with heuristic to skip
narrow marker columns. Falls back to original cells if zones fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 01:17:40 +02:00
Benjamin Admin
3e3116d2fd Fix vocab extraction: show all columns for generic layouts
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 43s
CI / test-go-edu-search (push) Successful in 41s
CI / test-python-klausur (push) Failing after 2m36s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 36s
When columns can't be classified as EN/DE, map them by position:
col 0 → english, col 1 → german, col 2+ → example. This ensures
vocabulary pages are always extracted, even without explicit
language classification. Classified pages still use the proper
EN/DE/example mapping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 01:11:40 +02:00
Benjamin Admin
9a8ce69782 Fix vocab extraction: use original column types for EN/DE classification
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 37s
CI / test-go-edu-search (push) Successful in 39s
CI / test-python-agent-core (push) Has been cancelled
CI / test-nodejs-website (push) Has been cancelled
CI / test-python-klausur (push) Has been cancelled
The grid-build zones use generic column types, losing the EN/DE
classification from build_grid_from_words(). Now extracts improved
cells from grid zones but classifies them using the original
columns_meta which has the correct column_en/column_de types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 01:07:49 +02:00
Benjamin Admin
66f8a7b708 Improve vocab-worksheet UX: better status messages + error details
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 38s
CI / test-go-edu-search (push) Successful in 45s
CI / test-python-klausur (push) Failing after 2m19s
CI / test-python-agent-core (push) Successful in 33s
CI / test-nodejs-website (push) Successful in 35s
- Change "PDF wird analysiert..." to "PDF wird hochgeladen..." (accurate)
- Switch to pages tab immediately after upload (before thumbnails load)
- Show progressive status: "5 Seiten erkannt. Vorschau wird geladen..."
- Show backend error detail instead of generic "HTTP 404"
- Backend returns helpful message when session not in memory after restart

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 00:55:56 +02:00
Benjamin Admin
3b78baf37f Replace old OCR pipeline with Kombi pipeline + add IPA/syllable toggles
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 37s
CI / test-python-klausur (push) Failing after 2m22s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 33s
Backend:
- _run_ocr_pipeline_for_page() now runs the full Kombi pipeline:
  orientation → deskew → dewarp → content crop → dual-engine OCR
  (RapidOCR + Tesseract merge) → _build_grid_core() with pipe-autocorrect,
  word-gap merge, dictionary detection
- Accepts ipa_mode and syllable_mode query params on process-single-page
- Pipeline sessions are visible in admin OCR Kombi UI for debugging

Frontend (vocab-worksheet):
- New "Anzeigeoptionen" section with IPA and syllable toggles
- Settings are passed to process-single-page as query parameters

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 00:43:42 +02:00
Benjamin Admin
2828871e42 Show detected page number in session header
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 39s
CI / test-go-edu-search (push) Successful in 42s
CI / test-python-klausur (push) Failing after 2m21s
CI / test-python-agent-core (push) Successful in 27s
CI / test-nodejs-website (push) Successful in 28s
Extracts page_number from grid_editor_result when opening a session
and displays it as "S. 233" badge in the SessionHeader, next to the
category and GT badges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 00:20:53 +02:00
Benjamin Admin
5c96def4ec Skip valid line-break hyphenations in gutter repair
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 39s
CI / test-go-edu-search (push) Successful in 38s
CI / test-python-klausur (push) Failing after 2m33s
CI / test-python-agent-core (push) Successful in 32s
CI / test-nodejs-website (push) Successful in 31s
Words ending with "-" where the stem is a known word (e.g. "wunder-"
→ "wunder" is known) are valid line-break hyphenations, not gutter
errors. Gutter problems cause the hyphen to be LOST ("ve" instead of
"ver-"), so a visible hyphen + known stem = intentional word-wrap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 00:14:21 +02:00
Benjamin Admin
611e1ee33d Add GT badge to grouped sessions and sub-pages in session list
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 39s
CI / test-go-edu-search (push) Successful in 41s
CI / test-python-klausur (push) Failing after 2m29s
CI / test-python-agent-core (push) Successful in 28s
CI / test-nodejs-website (push) Successful in 34s
The GT badge was only shown on ungrouped SessionRow items. Now also
visible on document group rows (e.g. "GT 1/2") and individual pages
within expanded groups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 23:54:55 +02:00
Benjamin Admin
49d5212f0c Fix hyphen-join: preserve next row + skip valid hyphenations
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 40s
CI / test-python-klausur (push) Failing after 2m26s
CI / test-python-agent-core (push) Successful in 27s
CI / test-nodejs-website (push) Successful in 31s
Two bugs fixed:
- Apply no longer removes the continuation word from the next row.
  "künden" stays in row 31 — only the current row is repaired
  ("ve" → "ver-"). The original line-break layout is preserved.
- Analysis now skips words that already end with "-" when the direct
  join with the next row is a known word (valid hyphenation, not an error).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:49:07 +02:00
Benjamin Admin
e6f8e12f44 Show full Grid-Review in Ground Truth step + GT badge in session list
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 34s
CI / test-go-edu-search (push) Successful in 37s
CI / test-python-klausur (push) Failing after 2m18s
CI / test-python-agent-core (push) Successful in 22s
CI / test-nodejs-website (push) Successful in 27s
- StepGroundTruth now shows the split view (original image + table)
  so the user can verify the final result before marking as GT
- Backend session list now returns is_ground_truth flag
- SessionList shows amber "GT" badge for marked sessions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:34:32 +02:00
Benjamin Admin
aabd849e35 Fix hyphen-join: strip trailing punctuation from continuation word
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 50s
CI / test-go-edu-search (push) Successful in 47s
CI / test-python-klausur (push) Failing after 2m35s
CI / test-python-agent-core (push) Successful in 31s
CI / test-nodejs-website (push) Successful in 34s
The next-row word "künden," had a trailing comma, causing dictionary
lookup to fail for "verkünden,". Now strips .,;:!? before joining.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:25:28 +02:00
Benjamin Admin
d1e7dd1c4a Fix gutter repair: detect short fragments + show spell alternatives
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 48s
CI / test-go-edu-search (push) Successful in 49s
CI / test-python-klausur (push) Failing after 2m37s
CI / test-python-agent-core (push) Successful in 35s
CI / test-nodejs-website (push) Successful in 35s
- Lower min word length from 3→2 for hyphen-join candidates so fragments
  like "ve" (from "ver-künden") are no longer skipped
- Return all spellchecker candidates instead of just top-1, so user can
  pick the correct form (e.g. "stammeln" vs "stammelt")
- Frontend shows clickable alternative buttons for spell_fix suggestions
- Backend accepts text_overrides in apply endpoint for user-selected alternatives

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:09:12 +02:00
Benjamin Admin
71e1b10ac7 Add gutter repair step to OCR Kombi pipeline
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 41s
CI / test-go-edu-search (push) Successful in 36s
CI / test-python-klausur (push) Failing after 2m31s
CI / test-python-agent-core (push) Successful in 28s
CI / test-nodejs-website (push) Successful in 29s
New step "Wortkorrektur" between Grid-Review and Ground Truth that detects
and fixes words truncated or blurred at the book gutter (binding area) of
double-page scans. Uses pyspellchecker (DE+EN) for validation.

Two repair strategies:
- hyphen_join: words split across rows with missing chars (ve + künden → verkünden)
- spell_fix: garbled trailing chars from gutter blur (stammeli → stammeln)

Interactive frontend with per-suggestion accept/reject and batch controls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 18:50:16 +02:00
Benjamin Admin
21b69e06be Fix cross-column word assignment by splitting OCR merge artifacts
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 47s
CI / test-go-edu-search (push) Successful in 36s
CI / test-python-klausur (push) Failing after 2m21s
CI / test-python-agent-core (push) Successful in 19s
CI / test-nodejs-website (push) Successful in 23s
When OCR merges adjacent words from different columns into one word box
(e.g. "sichzie" spanning Col 1+2, "dasZimmer" crossing boundary), the
grid builder assigned the entire merged word to one column.

New _split_cross_column_words() function splits these at column
boundaries using case transitions and spellchecker validation to
avoid false positives on real words like "oder", "Kabel", "Zeitung".

Regression: 12/12 GT sessions pass with diff=+0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 10:54:41 +01:00
Benjamin Admin
0168ab1a67 Remove Hauptseite/Box tabs from Kombi pipeline
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m15s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 20s
Page-split now creates independent sessions that appear directly in
the session list. After split, the UI switches to the first child
session. BoxSessionTabs, sub-session state, and parent-child tracking
removed from Kombi code. Legacy ocr-overlay still uses BoxSessionTabs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 17:43:58 +01:00
Benjamin Admin
925f4356ce Use spellchecker instead of pyphen for pipe autocorrect validation
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 2m29s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 20s
pyphen is a pattern-based hyphenator that accepts nonsense strings
like "Zeplpelin". Switch to spellchecker (frequency-based word list)
which correctly rejects garbled words and can suggest corrections.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 16:47:42 +01:00
Benjamin Admin
cc4cb3bc2f Add pipe auto-correction and graphic artifact filter for grid builder
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 2m10s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 19s
- autocorrect_pipe_artifacts(): strips OCR pipe artifacts from printed
  syllable dividers, validates with pyphen, tries char-deletion near
  pipe positions for garbled words (e.g. "Ze|plpe|lin" → "Zeppelin")
- Rule (a2): filters isolated non-alphanumeric word boxes (≤2 chars,
  no letters/digits) — catches small icons OCR'd as ">", "<" etc.
- Both fixes are generic: pyphen-validated, no session-specific logic

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 16:33:38 +01:00
Benjamin Admin
0685fb12da Fix Bug 3: recover OCR-lost prefixes via overlap merge + chain merging
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 29s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 2m24s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 19s
When OCR merge expands a prefix word box (e.g. "zer" w=42 → w=104),
it heavily overlaps (>75%) with the next fragment ("brech"). The grid
builder's overlap filter previously removed the prefix as a duplicate.

Fix: when overlap > 75% but both boxes are alphabetic with different
text and one is ≤ 4 chars, merge instead of removing. Also enable
chain merging via merge_parent tracking so "zer" + "brech" + "lich"
→ "zerbrechlich" in a single pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 15:49:52 +01:00
Benjamin Admin
96ea23164d Fix word-gap merge: add missing pronouns to stop words, reduce threshold
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 38s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 2m13s
CI / test-python-agent-core (push) Successful in 18s
CI / test-nodejs-website (push) Successful in 22s
- Add du/dich/dir/mich/mir/uns/euch/ihm/ihn to _STOP_WORDS to prevent
  false merges like "du" + "zerlegst" → "duzerlegst"
- Reduce max_short threshold from 6 to 5 to prevent merging multi-word
  phrases like "ziehen lassen" → "ziehenlassen"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 15:35:12 +01:00
Benjamin Admin
a8773d5b00 Fix 4 Grid Editor bugs: syllable modes, heading detection, word gaps
1. Syllable "Original" (auto) mode: only normalize cells that already
   have | from OCR — don't add new syllable marks via pyphen to words
   without printed dividers on the original scan.

2. Syllable "Aus" (none) mode: strip residual | chars from OCR text
   so cells display clean (e.g. "Zel|le" → "Zelle").

3. Heading detection: add text length guard in single-cell heuristic —
   words > 4 alpha chars starting lowercase (like "zentral") are regular
   vocabulary, not section headings.

4. Word-gap merge: new merge_word_gaps_in_zones() step with relaxed
   threshold (6 chars) fixes OCR splits like "zerknit tert" → "zerknittert".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 15:24:35 +01:00
Benjamin Admin
9f68bd3425 feat: Implement page-split step with auto-detection and sub-session naming
StepPageSplit now:
- Auto-calls POST /page-split on step entry
- Shows oriented image + detection result
- If double page: creates sub-sessions named "Title — S. 1/2"
- If single page: green badge "keine Trennung noetig"
- Manual "Weiter" button (no auto-advance)

Also:
- StepOrientation wrapper simplified (no page-split in orientation)
- StepUpload passes name back via onUploaded(sid, name)
- page.tsx: after page-split "Weiter" switches to first sub-session
- useKombiPipeline exposes setSessionName

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 17:56:45 +01:00
Benjamin Admin
469f09d1e1 fix: Redesign StepUpload for manual step control
StepUpload now has 3 phases:
1. File selection: drop zone / file picker → shows preview
2. Review: title input, category, file info → "Hochladen" button
3. Uploaded: shows session image → "Weiter" button

No more auto-advance after upload. User controls every step.
openSession() removed from onUploaded callback to prevent
step-reset race condition.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 17:35:36 +01:00
Benjamin Admin
3bb04b25ab fix: OCR Kombi upload race condition — openSession was resetting step to 0
openSession mapped dbStep=1 to uiStep=0 (upload), overriding handleNext's
advancement to step 1. Fix: sessions always exist post-upload, so always
skip past the upload step in openSession.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 17:10:04 +01:00
Benjamin Admin
85fe0a73d6 docs: Add OCR Kombi Pipeline to MkDocs and cross-reference from OCR Pipeline
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 30s
CI / test-go-edu-search (push) Successful in 30s
CI / test-python-klausur (push) Failing after 2m28s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 18s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 16:09:40 +01:00
Benjamin Admin
eaade3cad2 feat: Maschinenbau-Branche + INDUSTRY_REGULATION_MAP erweitert
- Neue Branche "Maschinenbau" mit 15 Regularien (MACHINERY_REG, BLUE_GUIDE, CRA, etc.)
- BDSG zu allen DE-relevanten Branchen hinzugefuegt
- Nationale Gesetze (HGB, AO, BGB, UrhG, etc.) branchenspezifisch gemapped
- IoT erweitert: MACHINERY_REG, BLUE_GUIDE, NIS2, DE_ELEKTROG
- THEMATIC_GROUPS: Produktsicherheit um MACHINERY_REG + BLUE_GUIDE erweitert

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:59:31 +01:00
Benjamin Admin
d26a9f60ab Add OCR Kombi Pipeline: modular 11-step architecture with multi-page support
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 29s
CI / test-go-edu-search (push) Successful in 28s
CI / test-python-klausur (push) Failing after 2m24s
CI / test-python-agent-core (push) Successful in 22s
CI / test-nodejs-website (push) Successful in 20s
Phase 1 of the clean architecture refactor: Replaces the 751-line ocr-overlay
monolith with a modular pipeline. Each step gets its own component file.

Frontend: /ai/ocr-kombi route with 11 steps (Upload, Orientation, PageSplit,
Deskew, Dewarp, ContentCrop, OCR, Structure, GridBuild, GridReview, GroundTruth).
Session list supports document grouping for multi-page uploads.

Backend: New ocr_kombi/ module with multi-page PDF upload (splits PDF into N
sessions with shared document_group_id). DB migration adds document_group_id
and page_number columns.

Old /ai/ocr-overlay remains fully functional for A/B testing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 15:55:28 +01:00
Benjamin Admin
d26233b5b3 Add page number display to StepGridReview summary bar
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 48s
CI / test-go-edu-search (push) Successful in 36s
CI / test-python-klausur (push) Failing after 2m17s
CI / test-python-agent-core (push) Successful in 18s
CI / test-nodejs-website (push) Successful in 24s
The page_number was only shown in GridEditor.tsx (ocr-overlay) but
the OCR pipeline uses StepGridReview.tsx which has its own summary bar.
Display the extracted page number (e.g. "S. 233") next to the
dictionary detection badge.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 11:21:44 +01:00
Benjamin Admin
e019dde01b Extract page number as metadata instead of silently removing it
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 36s
CI / test-python-klausur (push) Failing after 2m9s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 21s
_filter_footer_words now returns page number info (text, y_pct, number)
instead of just removing footer words. The page number is included in
the grid result as `page_number` and displayed in the frontend summary
bar as "S. 233".

This preserves page numbers for later page concatenation in the
customer frontend while still removing them from the grid content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 08:52:09 +01:00
Benjamin Admin
5af5d821a5 Fix 3 grid issues: artifact cells, connector col noise, footer false positive
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 34s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 2m9s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 18s
1. Add per-cell artifact filter (4b2): removes single-word cells with
   ≤2 chars and confidence <65 (e.g. "as" from stray OCR marks)

2. Add narrow connector column normalization (4d2): when ≥60% of cells
   in a column share the same short text (e.g. "oder"), normalize
   near-match outliers like "oderb" → "oder"

3. Fix footer detection: require short text (≤20 chars) and no commas.
   Comma-separated lists like "Uhrzeit, Vergangenheit, Zukunft" are
   content continuations, not page numbers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 08:18:55 +01:00
Benjamin Admin
525de55791 Fix syllable+IPA combination: strip bracket content before IPA guard
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 35s
CI / test-go-edu-search (push) Successful in 34s
CI / test-python-klausur (push) Failing after 2m16s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 18s
The _IPA_RE check in _syllabify_text() skipped entire cells containing
any IPA character. After German IPA insertion adds [bɪltʃøn], the check
blocked syllabification entirely. Now strips bracket content before
checking, so programmatically inserted IPA doesn't prevent syllable
divider insertion on the surrounding text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 00:03:10 +01:00
Benjamin Admin
f860eb66e6 Add German IPA support (wiki-pronunciation-dict + epitran)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 33s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m12s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 17s
Hybrid approach mirroring English IPA:
- Primary: wiki-pronunciation-dict (636k entries, CC-BY-SA, Wiktionary)
- Fallback: epitran rule-based G2P (MIT license)

IPA modes now use language-appropriate dictionaries:
- auto/en: English IPA (Britfone + eng_to_ipa)
- de: German IPA (wiki-pronunciation-dict + epitran)
- all: EN column gets English IPA, other columns get German IPA
- none: disabled

Frontend shows CC-BY-SA attribution when German IPA is active.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 22:18:20 +01:00
Benjamin Admin
a73ddce43d Fix missing PageZone import in grid_editor_helpers.py
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m52s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 15s
The zone merging function used PageZone but the import was only
in grid_editor_api.py. Caused NameError on sessions that trigger
zone merging (e.g. original_scan_b59a1b1b).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 22:04:21 +01:00
Benjamin Admin
47e83d90bd Remove IPA:DE option — no German IPA dictionary available
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 1m54s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 16s
Our IPA system only has English dictionaries (Britfone MIT, eng_to_ipa
MIT). The "IPA: nur DE" option was useless at best and misleading.
Removed from dropdown, type definition, and API validation.

Syllable DE mode stays — pyphen has a German hyphenation dictionary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 21:53:43 +01:00
Benjamin Admin
76cd1ac020 Fix false headers on sparse layouts and IPA corruption on German text
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 33s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m55s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 17s
1. Header detection: Add 25% cap to single-cell heading heuristic.
   On German synonym dicts where most rows naturally have only 1
   content cell, the old logic marked 60%+ of rows as headers.

2. IPA de/all mode: Use "column_text" (light processing) for non-
   English columns instead of "column_en" (full processing). The
   full path runs _insert_missing_ipa() which splits on whitespace,
   matches English prefixes ("bildschön" → "bild"), and truncates
   the rest — destroying German comma-separated synonym lists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 21:49:05 +01:00
Benjamin Admin
256df820cd Auto-rebuild grid when IPA or syllable mode dropdown changes
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 2m0s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 15s
The dropdowns only updated state but didn't trigger buildGrid().
Now a useEffect watches ipaMode/syllableMode and rebuilds
automatically (skipping the initial mount).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 20:43:20 +01:00
Benjamin Admin
7773c51304 Fix en/de mode edge case on docs without detected English column
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m54s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 17s
When no IPA signals exist (e.g. German-only dicts), the fallback
that guesses en_col_type was incorrectly triggered for en/de modes,
causing false IPA and syllable insertions. Now only fires for 'all'
mode. Syllable en mode also returns empty set when no EN column found.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 08:37:15 +01:00
Benjamin Admin
83c058e400 Add language-specific IPA and syllable modes (de/en)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m50s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 15s
Extend ipa_mode and syllable_mode toggles with language options:
- auto: smart detection (default)
- en: only English headword column
- de: only German definition columns
- all: all content columns
- none: skip entirely

Also improve English column auto-detection: use garbled IPA patterns
(apostrophes, colons) in addition to bracket patterns. This correctly
identifies English dictionary pages where OCR produces garbled ASCII
instead of bracket IPA.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 08:16:29 +01:00
Benjamin Admin
34680732f8 Add IPA and syllable mode toggles, fix false IPA on German documents
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 2m1s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 15s
Backend: Remove en_col_type fallback heuristic (longest avg text) that
incorrectly identified German columns as English. IPA now only applied
when OCR bracket patterns are actually found. Add ipa_mode (auto/all/none)
and syllable_mode (auto/all/none) query params to build-grid API.

Frontend: Add IPA and Silben dropdown selects to GridToolbar. Modes
are passed as query params on rebuild. Auto = current smart detection,
All = force for all words, Aus = skip entirely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 08:04:44 +01:00
Benjamin Admin
c42924a94a Fix IPA correction persistence and false-positive prefix matching
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 34s
CI / test-go-edu-search (push) Successful in 24s
CI / test-python-klausur (push) Failing after 1m57s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 21s
Step 5i was overwriting IPA-corrected text from Step 5c when
reconstructing cells from word_boxes. Added _ipa_corrected flag
to preserve corrections. Also tightened merged-token prefix matching
(min prefix 4 chars, min suffix 3 chars) to prevent false positives
like "sis" being extracted from "si:said".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 07:26:32 +01:00
Benjamin Admin
9ea217bdfc Fix IPA correction for dictionary pages (WIP)
- Fix Step 5h: restrict slash-IPA conversion to English headword column
  only — prevents converting "der/die/das" to "der [dər]das" in German
  columns (confirmed working)
- Fix _text_has_garbled_ipa: detect embedded apostrophes in merged
  tokens like "Scotland'skotland" where OCR reads ˈ as '
- Fix _insert_missing_ipa: detect dictionary word prefix in merged
  trailing tokens like "fictionsalans'fIkfn" → extract "fiction" with IPA
- Move en_col_type to wider scope for Step 5h access

Note: Fixes 1+2 confirmed working in unit tests but not yet applying
in the full build-grid pipeline — needs further debugging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 23:54:14 +01:00
Benjamin Admin
4feec7c7b7 Lower syllable pipe-ratio threshold from 5% to 1%
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m58s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 16s
Real dictionary pages have only ~3% OCR-detected pipes because the thin
syllable divider lines are hard for OCR to read. The primary false-positive
guard (article_col_index check) already blocks synonym dictionaries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 23:17:08 +01:00
Benjamin Admin
ed7fc99fc4 Improve syllable divider insertion for dictionary pages
Rewrite cv_syllable_detect.py with pyphen-first approach:
- Remove unreliable CV gate (morphological pipe detection)
- Strip existing pipes and re-syllabify via pyphen (DE then EN)
- Merge pipe-gap spaces where OCR split words at divider positions
- Guard merges with function word blacklist and punctuation checks

Add false-positive prevention:
- Pre-check: skip if <5% of cells have existing | from OCR
- Call-site check: require article_col_index (der/die/das column)
- Prevents syllabification of synonym dictionaries and word lists

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 19:44:29 +01:00
Benjamin Admin
7fbcae954b fix: auto-trigger orientation for page-split sessions without result
Page-split sessions (start_step=1) have no orientation_result stored.
StepOrientation now auto-runs orientation detection when loading an
existing session that lacks a result.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 17:19:56 +01:00
Benjamin Admin
f931091b57 refactor: independent sessions for page-split + URL-based pipeline navigation
Page-split now creates independent sessions (no parent_session_id),
parent marked as status='split' and hidden from list. Navigation uses
useSearchParams for URL-based step tracking (browser back/forward works).
page.tsx reduced from 684 to 443 lines via usePipelineNavigation hook.

Box sub-sessions (column detection) remain unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 17:05:33 +01:00
Benjamin Admin
f34340de9c Fix sub-session completion flow: navigate to next incomplete sub-session
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 1m51s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 15s
Instead of returning to parent (which creates a redirect loop), the
handleNext function now finds the next incomplete sub-session and opens
it directly. When all sub-sessions are done, returns to session list.

Also fixes openSession auto-redirect to prefer the first incomplete
sub-session over the most advanced one.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 16:33:56 +01:00
Benjamin Admin
55de6c21d2 Fix session resume: auto-open most advanced sub-session on parent click
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 1m46s
CI / test-python-agent-core (push) Successful in 37s
CI / test-nodejs-website (push) Successful in 15s
When reopening a parent session that has page-split sub-sessions,
the UI was showing the parent's pipeline step (always step 1/Orientation)
instead of navigating to the sub-sessions. Now automatically opens the
most advanced sub-session, matching the behavior of handleOrientationComplete.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 16:04:53 +01:00
Benjamin Admin
52b66ebe07 Fix NameError: _text_has_garbled_ipa not imported in grid_editor_helpers
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 1m52s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 16s
After refactoring grid_editor_api.py into helpers, the function
_text_has_garbled_ipa was used in _detect_heading_rows_by_single_cell
but never imported from cv_ocr_engines. This caused HTTP 500 on
build-grid for sessions that trigger single-cell heading detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:11:29 +01:00
Benjamin Admin
424e5c51d4 fix: remove nested scrollbar in grid editor
Removed overflow-y-auto and maxHeight from the grid container div.
The page itself handles scrolling — nested scroll containers caused
the bottom rows to be cut off after editing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:06:28 +01:00
Benjamin Admin
12b4c61bac refactor: extract grid helpers + generic CV-gated syllable insertion
1. Extracted 1367 lines of helper functions from grid_editor_api.py
   (3051→1620 lines) into grid_editor_helpers.py (filters, detectors,
   zone grid building).

2. Created cv_syllable_detect.py with generic CV+pyphen logic:
   - Checks EVERY word_box for vertical pipe lines (not just first word)
   - No article-column dependency — works with any dictionary layout
   - CV morphological detection gates pyphen insertion

3. Grid editor scroll: calc(100vh-200px) for reliable scrolling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 14:39:33 +01:00
Benjamin Admin
d9b2aa82e9 fix: CV-gated syllable insertion + grid editor scroll
1. Syllable dividers now require CV validation: morphological vertical
   line detection checks if word_box image actually shows thin isolated
   pipe lines before applying pyphen. Only first word per cell gets
   pipes (matching dictionary print layout).

2. Grid editor scroll: changed maxHeight from 80vh to calc(100vh-200px)
   so editor remains scrollable after edits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 14:31:16 +01:00
Benjamin Admin
364086b86e feat: auto-insert syllable dividers via pyphen on dictionary pages
OCR engines don't detect | pipe chars used as syllable dividers in
dictionaries. After dictionary detection (is_dict=True), use pyphen
(MIT) to insert syllable breaks into headword cells. Tries DE first,
then EN. Skips IPA content, short words, and cells already containing |.

Also adds pyphen>=0.16.0 to requirements.txt.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 14:17:26 +01:00
Benjamin Admin
fe754398c0 fix: Step 4f sidebar detection uses avg text length instead of fill ratio
Column_1 data showed avg_len=1.0 with 13 single-char cells (alphabet
letters from sidebar). Old fill_ratio check (76% > 35%) missed it.
New criteria: avg_len ≤ 1.5 AND ≥ 70% single chars → removes column.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 14:10:43 +01:00
Benjamin Admin
be86a7d14d fix: preserve pipe syllable dividers + detect alphabet sidebar columns
1. Pipe divider fix: Changed OCR char-confusion regex so | between
   letters (Ka|me|rad) is NOT converted to I. Only standalone/
   word-boundary pipes are converted (|ch → Ich, | want → I want).

2. Alphabet sidebar detection improvements:
   - _filter_decorative_margin() now considers 2-char words (OCR reads
     "Aa", "Bb" from sidebars), lowered min strip from 8→6
   - _filter_border_strip_words() lowered decorative threshold from 50%→45%
   - New step 4f: grid-level thin-edge-column filter as safety net —
     removes edge columns with <35% fill rate and >60% short text

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 13:52:11 +01:00
Benjamin Admin
19a5f69272 fix: make Grid Editor vertically scrollable so all rows are visible
The right panel (grid area) had no vertical overflow handling, causing
the last ~5 rows to be clipped and invisible. Added overflow-y-auto
with max-height 80vh, and removed overflow-hidden from the GridTable
wrapper that was cutting off content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 13:33:52 +01:00
Benjamin Admin
ea09fc75df fix: resolve circular import with lazy import for _build_reference_snapshot
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 13:18:21 +01:00
Benjamin Admin
410d36f3de feat: save automatic grid snapshot before manual edits for GT comparison
- build-grid now saves the automatic OCR result as ground_truth.auto_grid_snapshot
- mark-ground-truth includes a correction_diff comparing auto vs corrected
- New endpoint GET /correction-diff returns detailed diff with per-col_type
  accuracy breakdown (english, german, ipa, etc.)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 13:16:44 +01:00
Benjamin Admin
72ce4420cb fix: advance uiStep past skipped orientation for page-split sub-sessions
Page-split sub-sessions (current_step=2) had orientation marked as skipped
but uiStep remained at 0 (orientation step), causing StepOrientation to
render for a sub-session that has no orientation data. Now advances to
uiStep=1 (deskew) when orientation is skipped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 12:59:36 +01:00
Benjamin Admin
63dfb4d06f fix: replace reset useEffects with key prop for step component remount
The reset useEffects in StepOrientation/Deskew/Dewarp/Crop were clearing
orientationResult when sessionId changed (e.g. during handleOrientationComplete),
causing the right side of ImageCompareView to show nothing. Using key={sessionId}
on the step components instead forces React to remount with fresh state when
switching sessions, without interfering with the upload/orientation flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 12:20:50 +01:00
Benjamin Admin
08a91ba2be Fix sub-session tab switching: reset step state on sessionId change
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m53s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 17s
Step components (Deskew, Dewarp, Crop, Orientation) had local state
guards that prevented reloading when sessionId changed via sub-session
tab clicks. Added useEffect reset hooks that clear all local state
when sessionId changes, allowing the component to properly reload
the new session's data.

Also renamed "Box N" to "Seite N" in BoxSessionTabs per user feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 12:04:23 +01:00
Benjamin Admin
49a36364a8 Add double-page split support to OCR Overlay (Kombi 7 Schritte)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 2m5s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 16s
The page-split detection was only implemented in the regular pipeline
page but not in the OCR Overlay page where the user actually tests
with Kombi mode. Now the overlay page has full sub-session support:

- openSession: handles sub_sessions, parent_session_id, skip logic
  for page-split vs crop-based sub-sessions, preserves current mode
- handleOrientationComplete: async, fetches API to detect sub-sessions
- BoxSessionTabs: shown between stepper and step content
- handleNext: returns to parent after sub-session completion
- handleSessionChange/handleBoxSessionsCreated: session switching

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 11:48:26 +01:00
Benjamin Admin
14fd8e0b1e Fix page-split: fetch sub-sessions from API instead of React state
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 37s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 2m1s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 17s
handleOrientationComplete was checking subSessions from React state,
but due to batching the state was still empty when the user clicked
"Seiten verarbeiten". Now fetches session data directly from the API
to reliably detect sub-sessions and auto-open the first one.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 11:22:15 +01:00
Benjamin Admin
247b79674d Add double-page spread detection to frontend pipeline
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 36s
CI / test-go-edu-search (push) Successful in 34s
CI / test-python-klausur (push) Failing after 2m0s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 18s
After orientation detection, the frontend now automatically calls the
page-split endpoint. When a double-page book spread is detected, two
sub-sessions are created and each goes through the full pipeline
(deskew/dewarp/crop) independently — essential because each page of a
spread tilts differently due to the spine.

Frontend changes:
- StepOrientation: calls POST /page-split after orientation, shows
  split info ("Doppelseite erkannt"), notifies parent of sub-sessions
- page.tsx: distinguishes page-split sub-sessions (current_step < 5)
  from crop-based sub-sessions (current_step >= 5). Page-split subs
  only skip orientation, not deskew/dewarp/crop.
- page.tsx: handleOrientationComplete opens first sub-session when
  page-split was detected

Backend changes (orientation_crop_api.py):
- page-split endpoint falls back to original image when orientation
  rotated a landscape spread to portrait
- start_step parameter: 1 if split from original, 2 if from oriented

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 11:09:44 +01:00
Benjamin Admin
40815dafd1 feat(ocr-pipeline): add page-split endpoint for double-page book spreads
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 2m1s
CI / test-python-agent-core (push) Successful in 19s
CI / test-nodejs-website (push) Successful in 20s
Each page of a double-page scan tilts differently due to the book spine.
The new POST /page-split endpoint detects spreads after orientation and
creates sub-sessions that go through the full pipeline (deskew, dewarp,
crop, etc.) individually, so each page gets its own deskew correction.

Also fixes border-strip filter incorrectly removing German translation
words by adding a decorative-strip validation check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 10:53:06 +01:00
Benjamin Admin
2a21127f01 fix(ocr-pipeline): improve page crop spine detection and cell assignment
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m54s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 17s
1. page_crop: Score all dark runs by center-proximity × darkness ×
   narrowness instead of picking the widest. Fixes ad810209 where a
   wide dark area at 35% was chosen over the actual spine at 50%.

2. cv_words_first: Replace x-center-only word→column assignment with
   overlap-based three-pass strategy (overlap → midpoint-range → nearest).
   Fixes truncated German translations like "Schal" instead of
   "Schal - die Schals" in session 079cd0d9.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 09:23:30 +01:00
Benjamin Admin
9d34c5201e feat(grid-editor): add manual cell color control via right-click menu
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 23s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m53s
CI / test-python-agent-core (push) Successful in 13s
CI / test-nodejs-website (push) Successful in 15s
Users can now right-click any cell to set text color (red, green, blue,
orange, purple, black) or remove the color bar without changing text.
A "reset" option restores the OCR-detected color. This enables accurate
Ground Truth marking when OCR assigns colors to wrong cells.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 08:51:18 +01:00
Benjamin Admin
d54814fa70 feat: color bar respects edits + column pattern auto-correction
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 23s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m56s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 15s
- Color bar (red/colored indicator) now only shows when word_boxes
  text still matches the cell text — editing the cell hides stale colors
- New "Auto-Korrektur" button: detects dominant prefix+number patterns
  per column (e.g. p.70, p.71) and completes partial entries (.65 → p.65)
  — requires 3+ matching entries before correcting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 08:38:11 +01:00
Benjamin Admin
d6f4944bcc fix: remove maxHeight limit on grid editor — shows all rows
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m50s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 19s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 08:24:50 +01:00
Benjamin Admin
ee0d9c881e fix: column resize handle now accessible above add/delete buttons
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 2m1s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 15s
Resize handle: wider (9px), z-40 (above z-30 buttons).
Add-column button moved to bottom-right corner to avoid overlap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 08:20:04 +01:00
Benjamin Admin
65f4ce1947 feat: ImageLayoutEditor, arrow-key nav, multi-select bold, wider columns
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 32s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m52s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 18s
- New ImageLayoutEditor: SVG overlay on original scan with draggable
  column dividers, horizontal guidelines (margins/header/footer),
  double-click to add columns, x-button to delete
- GridTable: MIN_COL_WIDTH 40→80px for better readability
- Arrow up/down keys navigate between rows in the grid editor
- Ctrl+Click for multi-cell selection, Ctrl+B to toggle bold on selection
- getAdjacentCell works for cells that don't exist yet (new rows/cols)
- deleteColumn now merges x-boundaries correctly
- Session restore fix: grid_editor_result/structure_result in session GET
- Footer row 3-state cycle, auto-create cells for empty footer rows
- Grid save/build/GT-mark now advance current_step=11

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 07:45:39 +01:00
Benjamin Admin
4e668660a7 feat: add Woerterbuch category + column add/delete in grid editor
- New document category "Woerterbuch" (frontend type + backend validation)
- Column delete: hover column header → red "x" button (with confirmation)
- Column add: hover column header → "+" button inserts after that column
- Both operations support undo/redo, update cell IDs and summary
- Available in both GridEditor and StepGridReview (Kombi last step)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 16:27:12 +01:00
Benjamin Admin
7a6eadde8b feat: integrate Ground Truth review into Kombi Pipeline last step
- New StepGridReview component: split-view (scan image left, grid right),
  confidence stats, row-accept buttons, zoom controls
- Kombi Pipeline case 6 now uses StepGridReview instead of plain GridEditor
- Kombi step label changed to "Review & GT"
- Ground Truth queue page simplified to overview/navigation only
  (links to Kombi pipeline for actual review work)
- Deep-link support: /ai/ocr-overlay?session=xxx&mode=kombi

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 15:04:23 +01:00
Benjamin Admin
4e809c3860 fix: ground-truth crash on col_type + remove AIToolsSidebarResponsive from model-management
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m0s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 18s
- Ground-truth: zone.columns use 'label' not 'col_type' — calling
  .replace() on undefined crashed the page after grid data loaded
- Model-management: same AIToolsSidebarResponsive wrapper bug as the
  other pages — does not render children

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 10:14:02 +01:00
Benjamin Admin
dccbb909bc fix: remove AIToolsSidebarResponsive wrapper from ground-truth and regression pages
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 2m0s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 17s
AIToolsSidebarResponsive does not accept children — it renders only a
sidebar nav. Using it as a wrapper caused page content to never render.
Replaced with plain div, matching the pattern used by ocr-pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 09:57:52 +01:00
Benjamin Admin
be7f5f1872 feat: Sprint 2 — TrOCR ONNX, PP-DocLayout, Model Management
D2: TrOCR ONNX export script (printed + handwritten, int8 quantization)
D3: PP-DocLayout ONNX export script (download or Docker-based conversion)
B3: Model Management admin page (PyTorch vs ONNX status, benchmarks, config)
A4: TrOCR ONNX service with runtime routing (auto/pytorch/onnx via TROCR_BACKEND)
A5: PP-DocLayout ONNX detection with OpenCV fallback (via GRAPHIC_DETECT_BACKEND)
B4: Structure Detection UI toggle (OpenCV vs PP-DocLayout) with class color coding
C3: TrOCR-ONNX.md documentation
C4: OCR-Pipeline.md ONNX section added
C5: mkdocs.yml nav updated, optimum added to requirements.txt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 09:53:02 +01:00
Benjamin Admin
c695b659fb fix: PagePurpose props on ground-truth and regression pages
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m53s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 17s
Both pages passed `moduleId` which is not a valid prop for PagePurpose.
The component expects explicit title/purpose/audience — calling
audience.join() on undefined caused the client-side crash.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 09:43:36 +01:00
Benjamin Admin
a1e079b911 feat: Sprint 1 — IPA hardening, regression framework, ground-truth review
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 28s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m55s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 19s
Track A (Backend):
- Compound word IPA decomposition (schoolbag→school+bag)
- Trailing garbled IPA fragment removal after brackets (R21 fix)
- Regression runner with DB persistence, history endpoints
- Page crop determinism verified with tests

Track B (Frontend):
- OCR Regression dashboard (/ai/ocr-regression)
- Ground Truth Review workflow (/ai/ocr-ground-truth)
  with split-view, confidence highlighting, inline edit,
  batch mark, progress tracking

Track C (Docs):
- OCR-Pipeline.md v5.0 (Steps 5e-5h)
- Regression testing guide
- mkdocs.yml nav update

Track D (Infra):
- TrOCR baseline benchmark script
- run-regression.sh shell script
- Migration 008: regression_runs table

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 09:21:27 +01:00
Benjamin Admin
f5d5d6c59c docs: add Vision, Roadmap, and Hardware strategy to MkDocs
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m58s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 18s
Add three new Projekt documentation pages covering product vision
(offline-first desktop app for teachers), 6-phase development roadmap,
and 3-tier hardware strategy with distribution plan.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 08:54:22 +01:00
Benjamin Admin
4a44ad7986 fix: hard-filter OCR words inside detected graphic regions
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m51s
CI / test-python-agent-core (push) Successful in 18s
CI / test-nodejs-website (push) Successful in 16s
Run detect_graphic_elements() in the grid pipeline after image loading
and remove ALL words whose centroids fall inside detected graphic regions,
regardless of confidence. Previously only low-confidence words (conf < 50)
were removed, letting artifacts like "Tr", "Su" survive.

Changes:
- grid_editor_api.py: Import and call detect_graphic_elements() at Step 3a,
  passing only significant words (len >= 3) to avoid short artifacts fooling
  the text-vs-graphic heuristic. Hard-filter all words in graphic regions.
- cv_graphic_detect.py: Lower density threshold from 20% to 5% for large
  regions (>100x80px) — photos/illustrations have low color saturation.
  Raise page-spanning limit from 50% to 60% width/height.

Tested: 5 ground-truth sessions pass regression (079cd0d9, d8533a2c,
2838c7a7, 4233d7e3, 5997b635). Session 5997 now detects 2 graphic regions
and removes 29 artifact words including "Tr" and "Su".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 10:18:23 +01:00
Benjamin Admin
7b3319be2e fix: merge syllable-split word_boxes + keep dictionary guide words
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 27s
CI / test-go-edu-search (push) Successful in 28s
CI / test-python-klausur (push) Failing after 1m56s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 17s
OCR splits words at syllable marks into overlapping word_boxes (e.g.
"zu" + "tiefst" with 52% x-overlap). Step 5i previously removed the
lower-confidence box, losing the prefix. Now: when both boxes are
alphabetic text with 20-75% overlap, MERGE them into one word_box
("zutiefst") instead of removing.

Also relaxed artifact cell filter: 2-char alphabetic text like "Zw"
(dictionary guide word) is no longer removed. Only non-alphabetic
short text like "a=" is filtered.

Results for session 5997: "tiefst"→"zutiefst", "zu"→"zuständig",
"Zu die Zuschüsse"→"Zuschuss, die Zuschüsse", "Zw" restored.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 08:21:00 +01:00
Benjamin Admin
882b177fc3 fix: remove image-area artifacts + fix heading false positive for dictionary entries
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 28s
CI / test-go-edu-search (push) Successful in 28s
CI / test-python-klausur (push) Failing after 1m55s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 16s
Three fixes for dictionary page session 5997:

1. Heading detection: column_1 cells with article words (die/der/das)
   now count as content cells, preventing "die Zuschrift, die Zuschriften"
   from being falsely merged into a spanning heading cell.

2. Step 5j-pre: new artifact cell filter removes short garbled text from
   OCR on image areas (e.g. "7 EN", "Tr", "\\", "PEE", "a="). Cells
   survive earlier filters because their rows have real content in other
   columns. Also cleans up empty rows after removal.

3. Footer "PEE" auto-fixed: artifact filter removes the noise cell,
   empty row gets cleaned up, footer detection no longer sees it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 07:59:24 +01:00
Benjamin Admin
1fae39dbb8 fix: lower secondary column threshold + strip pipe chars from word_boxes
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 35s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 1m57s
CI / test-python-agent-core (push) Successful in 21s
CI / test-nodejs-website (push) Successful in 18s
Dictionary pages have 2 dictionary columns, each with article + headword
sub-columns. The right article column (die/der at x≈626) had only 14.3%
row coverage — below the 20% secondary threshold. Lowered to 12% so
dictionary article columns qualify. Also strip pipe characters from
individual word_box text (not just cell text) to remove OCR syllable
separation marks (e.g. "zu|trau|en" → "zutrauen").

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 07:44:03 +01:00
Benjamin Admin
46c8c28d34 fix: border strip pre-filter + 3-column detection for vocabulary tables
The border strip filter (Step 4e) used the LARGEST x-gap which incorrectly
removed base words along with edge artifacts. Now uses a two-stage approach:
1. _filter_border_strip_words() pre-filters raw words BEFORE column detection,
   scanning from the page edge inward to find the FIRST significant gap (>30px)
2. Step 4e runs as fallback only when pre-filter didn't apply

Session 4233 now correctly detects 3 columns (base word | oder | synonyms)
instead of 2. Threshold raised from 15% to 20% to handle pages with many
edge artifacts. All 4 ground-truth sessions pass regression.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 21:01:43 +01:00
Benjamin Admin
4000110501 fix: extend tiny symbol filter to all non-black colors, raise area to 200
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 1m49s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 17s
Step 5i rule (a) only caught blue tiny symbols. Graphic fragments from
page illustrations (e.g. orange quote mark from man illustration) were
missed. Now filters any non-black colored word_box with area < 200 and
confidence < 85.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 18:05:31 +01:00
Benjamin Admin
2acf8696bf fix: correct border strip test data to avoid false internal gaps
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 36s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 1m52s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 17s
Content word_boxes in test used x-spacing (i%3)*100 which created
internal gaps larger than the border-to-content gap. Changed to
(i%2)*51 so content words overlap and the border gap remains dominant.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 17:24:33 +01:00
Benjamin Admin
c0e1118870 feat: detect and remove page-border decoration strip artifacts (Step 4e)
Textbooks with decorative alphabet strips along page edges produce
OCR artifacts (scattered colored letters at x<150 while real content
starts at x>=179). Step 4e detects a significant x-gap (>30px) between
a small cluster (<15% of total word_boxes) near the page edge and the
main content, then removes the border-strip word_boxes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 17:20:45 +01:00
Benjamin Admin
f31a7175a2 fix: normalize word_box order to reading order for frontend display (Step 5j)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 25s
CI / test-python-klausur (push) Failing after 2m1s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 16s
The frontend renders colored cells from the word_boxes array order,
not from cell.text. After post-processing steps (5i bullet removal etc),
word_boxes could remain in their original insertion order instead of
left-to-right reading order. Step 5j now explicitly sorts word_boxes
using _group_words_into_lines before the result is built.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 19:21:37 +01:00
Benjamin Admin
bacbfd88f1 Fix word ordering in cell text rebuild (Steps 4c, 4d, 5i)
Cell text was rebuilt using naive (top, left) sorting after removing
word_boxes in Steps 4c/4d/5i. This produced wrong word order when
words on the same visual line had slightly different top values (1-6px).

Now uses _words_to_reading_order_text() which groups words into visual
lines by y-tolerance before sorting by x within each line, matching
the initial cell text construction in _build_cells.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 18:45:33 +01:00
Benjamin Admin
2c63beff04 Fix bullet overlap disambiguation + raise red threshold to 90
Step 5i: For word_boxes with >90% x-overlap and different text, use IPA
dictionary to decide which to keep (e.g. "tightly" in dict, "fighily" not).

Red threshold raised from 80 to 90 to catch remaining scanner artifacts
like "tight" and "5" that were still misclassified as red.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 18:21:00 +01:00
Benjamin Admin
82433b4bad Step 5i: Remove blue bullet/artifact and overlapping duplicate word_boxes
Dictionary pages have small blue square bullets before entries that OCR
reads as text artifacts. Three detection rules:
a) Tiny blue symbols (area < 150, conf < 85): catches ©, e, * etc.
b) X-overlapping word_boxes (>40%): remove lower confidence one
c) Duplicate blue text with gap < 6px: remove one copy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 18:17:07 +01:00
Benjamin Admin
d889a6959e Fix red false-positive in color detection for scanned black text
Scanner artifacts on black text produce slight warm tint (hue ~0, sat ~60)
that was misclassified as red. Now requires median_sat >= 80 specifically
for red classification, since genuine red text always has high saturation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 17:18:44 +01:00
Benjamin Admin
bc1804ad18 Fix vsplit side-by-side rendering: invalid TypeScript type annotation
Changed `typeof grid.zones[][]` to `GridZone[][]` which was causing
a silent build error, preventing the vsplit zone grouping logic from
being compiled into the production bundle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 17:09:52 +01:00
Benjamin Admin
45b83560fd Vertical zone split: detect divider lines and create independent sub-zones
Pages with two side-by-side vocabulary columns separated by a vertical
black line are now split into independent sub-zones before row/column
detection. Each sub-zone gets its own rows, preventing misalignment from
different heading rhythms.

- _detect_vertical_dividers(): finds pipe word_boxes at consistent x
  positions spanning >50% of zone height
- _split_zone_at_vertical_dividers(): creates left/right PageZone objects
  with layout_hint and vsplit_group metadata
- Column union skips vsplit zones (independent column sets)
- Frontend renders vsplit zones side by side via flex layout
- PageZone gets layout_hint + vsplit_group fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 16:38:12 +01:00
Benjamin Admin
e4fa634a63 Fix GridTable: show cell.text when it diverges from word_boxes
Post-processing steps like 5h (slash-IPA conversion) modify cell.text
but not individual word_boxes. The colored per-word display showed
stale word_box text instead of the corrected cell text. Now falls
back to the plain input when texts don't match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 15:05:10 +01:00
Benjamin Admin
76ba83eecb Tighten tertiary column detection: require 4+ rows and 5% coverage
Prevents false narrow columns from text overflow at page edges.
Session 355f3c84 had a 3-row/4% tertiary cluster creating a spurious
third column from right-column text overflow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:50:03 +01:00
Benjamin Admin
04092a0a66 Fix Step 5h: reject grammar patterns in slash-IPA, convert trailing variants
- Reject /.../ matches containing spaces, parens, or commas (e.g. sb/sth up)
- Second pass converts trailing /ipa2/ after [ipa1] (double pronunciation)
- Validate standalone /ipa/ at start against same reject pattern

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:40:28 +01:00
Benjamin Admin
7fafd297e7 Step 5h: convert slash-delimited IPA to bracket notation with dict lookup
Dictionary-style pages print IPA between slashes (e.g. tiger /'taiga/).
Step 5h detects these patterns, looks up the headword in the IPA dictionary
for proper Unicode IPA, and falls back to OCR text when not found.
Converts /ipa/ to [ipa] bracket notation matching the rest of the pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:36:08 +01:00
Benjamin Admin
7ac09b5941 Filter pipe-character word_boxes from OCR column divider artifacts
Step 4d removes "|" and "||" word_boxes that OCR produces when reading
physical vertical divider lines between columns. Also strips stray pipe
chars from cell text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:09:50 +01:00
189 changed files with 673462 additions and 14628 deletions

View File

@@ -256,3 +256,45 @@ ssh macmini "cd /Users/benjaminadmin/Projekte/breakpilot-lehrer && git push all
| `website/app/admin/klausur-korrektur/` | Korrektur-Workspace |
| `backend-lehrer/classroom_api.py` | Classroom Engine |
| `backend-lehrer/state_engine_api.py` | State Engine |
---
## Code-Qualitaet Guardrails (NON-NEGOTIABLE)
> Vollstaendige Details: `.claude/rules/architecture.md`
> Ausnahmen: `.claude/rules/loc-exceptions.txt`
### File Size Budget
- **Hard Cap: 500 LOC** pro Datei
- Wenn eine Aenderung eine Datei ueber 500 LOC bringen wuerde: **erst splitten, dann aendern**
- Ausnahmen nur mit Begruendung in `loc-exceptions.txt` + `[guardrail-change]` Commit-Marker
### Architektur
- **Python:** Routes duenn → Business Logic in Services → Persistenz in Repositories
- **Go:** Handler ≤40 LOC → Service-Layer → Repository-Pattern
- **TypeScript/Next.js:** page.tsx duenn → Server Actions, Queries, Components auslagern
- **Types:** Monolithische types.ts frueh splitten, types.ts + types/ Shadowing vermeiden
### Workflow (bei jeder Aenderung)
1. Datei lesen + LOC pruefen
2. Wenn nahe am Budget → erst splitten
3. Minimale kohaerente Aenderung
4. Verifikation (Tests + Lint)
5. Zusammenfassung: Was geaendert, was verifiziert, Restrisiko
### Commit-Marker
- `[migration-approved]` — Schema-/Migrations-Aenderungen
- `[guardrail-change]` — Aenderungen an .claude/**, scripts/check-loc.sh
- `[split-required]` — Aenderung beginnt mit Datei-Split
- `[interface-change]` — Public API Contracts geaendert
### LOC-Check ausfuehren
```bash
bash scripts/check-loc.sh --changed # nur geaenderte Dateien
bash scripts/check-loc.sh --all # alle Dateien (zeigt alle Violations)
```

View File

@@ -0,0 +1,46 @@
# Architecture Rule — BreakPilot Lehrer
## File Size Budget
Hard default: **500 LOC max** per file.
Soft targets:
- Handler/Router/Service: 300-400 LOC
- Models/Schemas/Types: 200-300 LOC
- Utilities: 100-200 LOC
Ausnahmen nur in `.claude/rules/loc-exceptions.txt` mit Begruendung.
## Split-Trigger
Sofort splitten wenn:
- Datei ueberschreitet 500 LOC
- Datei wuerde nach Aenderung 500 LOC ueberschreiten
- Datei mischt Transport + Business Logic + Persistence
- Datei enthaelt mehrere unabhaengig testbare Verantwortlichkeiten
## Python (backend-lehrer, klausur-service, voice-service)
- Routes duenn halten — Business Logic in Services
- Persistenz in Repositories/Data-Access-Module
- Pydantic Schemas nach Domain splitten
- Zirkulaere Imports vermeiden
## Go (school-service, edu-search-service)
- Handler duenn halten (≤40 LOC)
- Business Logic in Services/Use-Cases
- Transport/Request-Decoding getrennt von Domain-Logik
## TypeScript / Next.js (admin-lehrer, studio-v2, website)
- page.tsx duenn halten — Server Actions, Queries, Forms auslagern
- Monolithische types.ts frueh splitten
- types.ts + types/ Shadowing vermeiden
- Shared Client/Server Types explizit trennen
## Entscheidungsreihenfolge
1. Bestehendes kleines kohaeesives Modul wiederverwenden
2. Neues Modul in der Naehe erstellen
3. Ueberfuellte Datei splitten, neues Verhalten in richtiges Split-Modul
4. Nur als letzter Ausweg: Grosse bestehende Datei erweitern

View File

@@ -0,0 +1,20 @@
# LOC Exceptions — BreakPilot Lehrer
# Format: <glob> | owner=<person> | reason=<why> | review=<date>
#
# Jede Ausnahme braucht Begruendung und Review-Datum.
# Temporaere Ausnahmen muessen mit [guardrail-change] Commit-Marker versehen werden.
# Generated / Build Artifacts
**/node_modules/** | owner=infra | reason=npm packages | review=permanent
**/.next/** | owner=infra | reason=Next.js build output | review=permanent
**/__pycache__/** | owner=infra | reason=Python bytecode | review=permanent
**/venv/** | owner=infra | reason=Python virtualenv | review=permanent
# Test-Dateien (duerfen groesser sein fuer Table-Driven Tests)
**/tests/test_cv_vocab_pipeline.py | owner=klausur | reason=umfangreiche OCR Pipeline Tests | review=2026-07-01
**/tests/test_rbac.py | owner=klausur | reason=RBAC Test-Matrix | review=2026-07-01
**/tests/test_grid_editor_api.py | owner=klausur | reason=Grid Editor Integrationstests | review=2026-07-01
# Legacy — TEMPORAER bis Refactoring abgeschlossen
# Dateien hier werden Phase fuer Phase abgearbeitet und entfernt.
# KEINE neuen Ausnahmen ohne [guardrail-change] Commit-Marker!

View File

@@ -0,0 +1,237 @@
# OCR Pipeline Erweiterungen - Entwicklerdokumentation
**Status:** Produktiv
**Letzte Aktualisierung:** 2026-04-15
**URL:** https://macmini:3002/ai/ocr-kombi
---
## Uebersicht
Erweiterungen der OCR Kombi Pipeline (14 Steps, 0-13):
- **SmartSpellChecker** — LLM-freie OCR-Korrektur mit Spracherkennung
- **Box-Grid-Review** (Step 11) — Eingebettete Boxen verarbeiten
- **Ansicht/Spreadsheet** (Step 12) — Fortune Sheet Excel-Editor
---
## Pipeline Steps
| Step | ID | Name | Komponente |
|------|----|------|------------|
| 0 | upload | Upload | StepUpload |
| 1 | orientation | Orientierung | StepOrientation |
| 2 | page-split | Seitentrennung | StepPageSplit |
| 3 | deskew | Begradigung | StepDeskew |
| 4 | dewarp | Entzerrung | StepDewarp |
| 5 | content-crop | Zuschneiden | StepContentCrop |
| 6 | ocr | OCR | StepOcr |
| 7 | structure | Strukturerkennung | StepStructure |
| 8 | grid-build | Grid-Aufbau | StepGridBuild |
| 9 | grid-review | Grid-Review | StepGridReview |
| 10 | gutter-repair | Wortkorrektur | StepGutterRepair |
| **11** | **box-review** | **Box-Review** | **StepBoxGridReview** |
| **12** | **ansicht** | **Ansicht** | **StepAnsicht** |
| 13 | ground-truth | Ground Truth | StepGroundTruth |
Step-Definitionen: `admin-lehrer/app/(admin)/ai/ocr-kombi/types.ts`
---
## SmartSpellChecker
**Datei:** `klausur-service/backend/smart_spell.py`
**Tests:** `tests/test_smart_spell.py` (43 Tests)
**Lizenz:** Nur pyspellchecker (MIT) — kein LLM, kein Hunspell
### Features
| Feature | Methode |
|---------|---------|
| Spracherkennung | Dual-Dictionary EN/DE Heuristik |
| a/I Disambiguation | Bigram-Kontext (Folgewort-Lookup) |
| Boundary Repair | Frequenz-basiert: `Pound sand``Pounds and` |
| Context Split | `anew``a new` (Allow/Deny-Liste) |
| Multi-Digit | BFS: `sch00l``school` |
| Cross-Language Guard | DE-Woerter in EN-Spalte nicht falsch korrigieren |
| Umlaut-Korrektur | `Schuler``Schueler` |
| IPA-Schutz | Inhalte in [Klammern] nie aendern |
| Slash→l | `p/``pl` (kursives l als / erkannt) |
| Abkuerzungen | 120+ aus `_KNOWN_ABBREVIATIONS` |
### Integration
```python
# In cv_review.py (LLM Review Step):
from smart_spell import SmartSpellChecker
_smart = SmartSpellChecker()
result = _smart.correct_text(text, lang="en") # oder "de" oder "auto"
# In grid_editor_api.py (Grid Build + Box Build):
# Automatisch nach Grid-Aufbau und Box-Grid-Aufbau
```
### Frequenz-Scoring
Boundary Repair vergleicht Wort-Frequenz-Produkte:
- `old_freq = word_freq(w1) * word_freq(w2)`
- `new_freq = word_freq(repaired_w1) * word_freq(repaired_w2)`
- Akzeptiert wenn `new_freq > old_freq * 5`
- Abkuerzungs-Bonus nur wenn Original-Woerter selten (freq < 1e-6)
---
## Box-Grid-Review (Step 11)
**Frontend:** `admin-lehrer/components/ocr-kombi/StepBoxGridReview.tsx`
**Backend:** `klausur-service/backend/cv_box_layout.py`, `grid_editor_api.py`
**Tests:** `tests/test_box_layout.py` (13 Tests)
### Backend-Endpoints
```
POST /api/v1/ocr-pipeline/sessions/{id}/build-box-grids
```
Verarbeitet alle erkannten Boxen aus `structure_result`:
1. Filtert Header/Footer-Boxen (obere/untere 7% der Bildhoehe)
2. Extrahiert OCR-Woerter pro Box aus `raw_paddle_words`
3. Klassifiziert Layout: `flowing` | `columnar` | `bullet_list` | `header_only`
4. Baut Grid mit layout-spezifischer Logik
5. Wendet SmartSpellChecker an
### Box Layout Klassifikation (`cv_box_layout.py`)
| Layout | Erkennung | Grid-Aufbau |
|--------|-----------|-------------|
| `header_only` | ≤5 Woerter oder 1 Zeile | 1 Zelle, alles zusammen |
| `flowing` | Gleichmaessige Zeilenbreite | 1 Spalte, Bullet-Gruppierung per Einrueckung |
| `bullet_list` | ≥40% Zeilen mit Bullet-Marker | 1 Spalte, Bullet-Items |
| `columnar` | Mehrere X-Cluster | Standard-Spaltenerkennung |
### Bullet-Einrueckung
Erkennung ueber Left-Edge-Analyse:
- Minimale Einrueckung = Bullet-Ebene
- Zeilen mit >15px mehr Einrueckung = Folgezeilen
- Folgezeilen werden mit `\n` in die Bullet-Zelle integriert
- Fehlende `•` Marker werden automatisch ergaenzt
### Colspan-Erkennung (`grid_editor_helpers.py`)
Generische Funktion `_detect_colspan_cells()`:
- Laeuft nach `_build_cells()` fuer ALLE Zonen
- Nutzt Original-Wort-Bloecke (vor `_split_cross_column_words`)
- Wort-Block der ueber Spaltengrenze reicht → `spanning_header` mit `colspan=N`
- Beispiel: "In Britain you pay with pounds and pence." ueber 2 Spalten
### Spalten-Erkennung in Boxen
Fuer kleine Zonen (≤60 Woerter):
- `gap_threshold = max(median_h * 1.0, 25)` statt `3x median`
- PaddleOCR liefert Multi-Word-Bloecke → alle Gaps sind Spalten-Gaps
---
## Ansicht / Spreadsheet (Step 12)
**Frontend:** `admin-lehrer/components/ocr-kombi/StepAnsicht.tsx`, `SpreadsheetView.tsx`
**Bibliothek:** `@fortune-sheet/react` (MIT, v1.0.4)
### Architektur
Split-View:
- **Links:** Original-Scan mit OCR-Overlay (`/image/words-overlay`)
- **Rechts:** Fortune Sheet Spreadsheet mit Multi-Sheet-Tabs
### Multi-Sheet Ansatz
Jede Zone wird ein eigenes Sheet-Tab:
- Sheet "Vokabeln" — Hauptgrid mit EN/DE Spalten
- Sheet "Pounds and euros" — Box 1 mit eigenen 4 Spalten
- Sheet "German leihen" — Box 2 als Fliesstexttext
Grund: Spaltenbreiten sind pro Zone unterschiedlich optimiert. Excel-Limitation: Spaltenbreite gilt fuer die ganze Spalte.
### Zell-Formatierung
| Format | Quelle | Fortune Sheet Property |
|--------|--------|----------------------|
| Fett | `is_header`, `is_bold`, groessere Schrift | `bl: 1` |
| Schriftfarbe | OCR word_boxes color | `fc: '#hex'` |
| Hintergrund | Box bg_hex, Header | `bg: '#hex08'` |
| Text-Wrap | Mehrzeilige Zellen (\n) | `tb: '2'` |
| Vertikal oben | Mehrzeilige Zellen | `vt: 0` |
| Groessere Schrift | word_box height >1.3x median | `fs: 12` |
### Spaltenbreiten
Auto-Fit: `max(laengster_text * 7.5 + 16, original_px * scaleFactor)`
### Toolbar
`undo, redo, font-bold, font-italic, font-strikethrough, font-color, background, font-size, horizontal-align, vertical-align, text-wrap, merge-cell, border`
---
## Unified Grid (Backend)
**Datei:** `klausur-service/backend/unified_grid.py`
**Tests:** `tests/test_unified_grid.py` (10 Tests)
Mergt alle Zonen in ein einzelnes Grid (fuer Export/Analyse):
```
POST /api/v1/ocr-pipeline/sessions/{id}/build-unified-grid
GET /api/v1/ocr-pipeline/sessions/{id}/unified-grid
```
- Dominante Zeilenhoehe = Median der Content-Row-Abstaende
- Full-Width Boxen: Rows direkt integriert
- Partial-Width Boxen: Extra-Rows eingefuegt wenn Box mehr Zeilen hat
- Box-Zellen mit `source_zone_type: "box"` und `box_region` Metadaten
---
## Dateistruktur
### Backend (klausur-service)
| Datei | Zeilen | Beschreibung |
|-------|--------|--------------|
| `grid_build_core.py` | 1943 | `_build_grid_core()` — Haupt-Grid-Aufbau |
| `grid_editor_api.py` | 474 | REST-Endpoints (build, save, get, gutter, box, unified) |
| `grid_editor_helpers.py` | 1737 | Helper: Spalten, Rows, Cells, Colspan, Header |
| `smart_spell.py` | 587 | SmartSpellChecker |
| `cv_box_layout.py` | 339 | Box-Layout-Klassifikation + Grid-Aufbau |
| `unified_grid.py` | 425 | Unified Grid Builder |
### Frontend (admin-lehrer)
| Datei | Zeilen | Beschreibung |
|-------|--------|--------------|
| `StepBoxGridReview.tsx` | 283 | Box-Review Step 11 |
| `StepAnsicht.tsx` | 112 | Ansicht Step 12 (Split-View) |
| `SpreadsheetView.tsx` | ~160 | Fortune Sheet Integration |
| `GridTable.tsx` | 652 | Grid-Editor Tabelle (Steps 9-11) |
| `useGridEditor.ts` | 985 | Grid-Editor Hook |
### Tests
| Datei | Tests | Beschreibung |
|-------|-------|--------------|
| `test_smart_spell.py` | 43 | Spracherkennung, Boundary Repair, IPA-Schutz |
| `test_box_layout.py` | 13 | Layout-Klassifikation, Bullet-Gruppierung |
| `test_unified_grid.py` | 10 | Unified Grid, Box-Klassifikation |
| **Gesamt** | **66** | |
---
## Aenderungshistorie
| Datum | Aenderung |
|-------|-----------|
| 2026-04-15 | Fortune Sheet Multi-Sheet Tabs, Bullet-Points, Auto-Fit, Refactoring |
| 2026-04-14 | Unified Grid, Ansicht Step, Colspan-Erkennung |
| 2026-04-13 | Box-Grid-Review Step, Spalten in Boxen, Header/Footer Filter |
| 2026-04-12 | SmartSpellChecker, Frequency Scoring, IPA-Schutz, Vocab-Worksheet Refactoring |

View File

@@ -188,11 +188,35 @@ ssh macmini "docker compose up -d klausur-service studio-v2"
---
## Frontend Refactoring (2026-04-12)
`page.tsx` wurde von 2337 Zeilen in 14 Dateien aufgeteilt:
```
studio-v2/app/vocab-worksheet/
├── page.tsx # 198 Zeilen — Orchestrator
├── types.ts # Interfaces, VocabWorksheetHook
├── constants.ts # API-Base, Formats, Defaults
├── useVocabWorksheet.ts # 843 Zeilen — Custom Hook (alle State + Logik)
└── components/
├── UploadScreen.tsx # Session-Liste + Dokument-Auswahl
├── PageSelection.tsx # PDF-Seitenauswahl
├── VocabularyTab.tsx # Vokabel-Tabelle + IPA/Silben
├── WorksheetTab.tsx # Format-Auswahl + Konfiguration
├── ExportTab.tsx # PDF-Download
├── OcrSettingsPanel.tsx # OCR-Filter Einstellungen
├── FullscreenPreview.tsx # Vollbild-Vorschau Modal
├── QRCodeModal.tsx # QR-Upload Modal
└── OcrComparisonModal.tsx # OCR-Vergleich Modal
```
---
## Erweiterung: Neue Formate hinzufuegen
1. **Backend**: Neuen Generator in `klausur-service/backend/` erstellen
2. **API**: Neuen Endpoint in `vocab_worksheet_api.py` hinzufuegen
3. **Frontend**: Format zu `worksheetFormats` Array in `page.tsx` hinzufuegen
3. **Frontend**: Format zu `worksheetFormats` Array in `constants.ts` hinzufuegen
4. **Doku**: Diese Datei aktualisieren
---

9
.claude/settings.json Normal file
View File

@@ -0,0 +1,9 @@
{
"permissions": {
"allow": [
"Bash",
"Write",
"Read"
]
}
}

36
AGENTS.go.md Normal file
View File

@@ -0,0 +1,36 @@
# AGENTS.go.md — Go/Gin Konventionen
## Architektur
- `handlers/`: HTTP Transport nur — Decode, Validate, Call Service, Encode Response
- `service/` oder `usecase/`: Business Logic
- `repo/`: Storage/Integration
- `model/` oder `domain/`: Domain Entities
- `tests/`: Table-driven Tests bevorzugen
## Regeln
1. Handler ≤40 LOC — nur Decode → Service → Encode
2. Business Logic NICHT in Handlers verstecken
3. Grosse Handler nach Resource/Verb splitten
4. Request/Response DTOs nah am Transport halten
5. Interfaces nur an echten Boundaries (nicht ueberall fuer Mocks)
6. Keine Giant-Utility-Dateien
7. Generated Files nicht manuell editieren
## Split-Trigger
- Handler-Datei ueberschreitet 400-500 LOC
- Unrelated Endpoints zusammengruppiert
- Encoding/Decoding dominiert die Handler-Datei
- Service-Logik und Transport-Logik gemischt
## Verifikation
```bash
gofmt -l . | grep -q . && exit 1
go vet ./...
golangci-lint run --timeout=5m
go test -race ./...
go build ./...
```

36
AGENTS.python.md Normal file
View File

@@ -0,0 +1,36 @@
# AGENTS.python.md — Python/FastAPI Konventionen
## Architektur
- `routes/` oder `api/`: Request/Response nur — kein Business Logic
- `services/`: Business Logic
- `repositories/`: Persistenz/Data Access
- `schemas/`: Pydantic Models, nach Domain gesplittet
- `tests/`: Spiegelt Produktions-Layout
## Regeln
1. Route-Dateien duenn halten (≤300 LOC)
2. Wenn eine Route-Datei 300-400 LOC erreicht → nach Resource/Operation splitten
3. Schema-Dateien nach Domain splitten wenn sie wachsen
4. Modul-Level Singleton-Kopplung vermeiden (Tests patchen falsches Symbol)
5. Patch immer das Symbol das vom getesteten Modul importiert wird
6. Dependency Injection bevorzugen statt versteckte Imports
7. Pydantic v2: `from __future__ import annotations` NICHT verwenden (bricht Pydantic)
8. Migrationen getrennt von Refactorings halten
## Split-Trigger
- Datei naehert sich oder ueberschreitet 500 LOC
- Zirkulaere Imports erscheinen
- Tests brauchen tiefes Patching
- API-Schemas mischen verschiedene Domains
- Service-Datei macht Transport UND DB-Logik
## Verifikation
```bash
ruff check .
mypy . --ignore-missing-imports --no-error-summary
pytest tests/ -x -q --no-header
```

55
AGENTS.typescript.md Normal file
View File

@@ -0,0 +1,55 @@
# AGENTS.typescript.md — Next.js Konventionen
## Architektur
- `app/.../page.tsx`: Minimale Seiten-Komposition (≤250 LOC)
- `app/.../actions.ts`: Server Actions
- `app/.../queries.ts`: Data Loading
- `app/.../_components/`: View-Teile (Colocation)
- `app/.../_hooks/`: Seiten-spezifische Hooks (Colocation)
- `types/` oder `types/*.ts`: Domain-spezifische Types
- `schemas/`: Zod/Validierungs-Schemas
- `lib/`: Shared Utilities
## Regeln
1. page.tsx duenn halten (≤250 LOC)
2. Grosse Seiten frueh in Sections/Components splitten
3. KEINE einzelne types.ts als Catch-All
4. types.ts UND types/ Shadowing vermeiden (eines waehlen!)
5. Server/Client Module-Grenzen explizit halten
6. Pure Helpers und schmale Props bevorzugen
7. API-Client Types getrennt von handgeschriebenen Domain Types
## Colocation Pattern (bevorzugt)
```
app/(admin)/ai/rag/
page.tsx ← duenn, komponiert nur
_components/
SearchPanel.tsx
ResultsTable.tsx
FilterBar.tsx
_hooks/
useRagSearch.ts
actions.ts ← Server Actions
queries.ts ← Data Fetching
```
## Split-Trigger
- page.tsx ueberschreitet 250-350 LOC
- types.ts ueberschreitet 200-300 LOC
- Form-Logik, Server Actions und Rendering in einer Datei
- Mehrere unabhaengig testbare Sections vorhanden
- Imports werden broechig
## Verifikation
```bash
npx tsc --noEmit
npm run lint
npm run build
```
> `npm run build` ist PFLICHT — `tsc` allein reicht nicht.

View File

@@ -1,395 +0,0 @@
'use client'
/**
* GPU Infrastructure Admin Page
*
* vast.ai GPU Management for LLM Processing
* Part of KI-Werkzeuge
*/
import { useEffect, useState, useCallback } from 'react'
import { PagePurpose } from '@/components/common/PagePurpose'
import { AIToolsSidebarResponsive } from '@/components/ai/AIToolsSidebar'
interface VastStatus {
instance_id: number | null
status: string
gpu_name: string | null
dph_total: number | null
endpoint_base_url: string | null
last_activity: string | null
auto_shutdown_in_minutes: number | null
total_runtime_hours: number | null
total_cost_usd: number | null
account_credit: number | null
account_total_spend: number | null
session_runtime_minutes: number | null
session_cost_usd: number | null
message: string | null
error?: string
}
export default function GPUInfrastructurePage() {
const [status, setStatus] = useState<VastStatus | null>(null)
const [loading, setLoading] = useState(true)
const [actionLoading, setActionLoading] = useState<string | null>(null)
const [error, setError] = useState<string | null>(null)
const [message, setMessage] = useState<string | null>(null)
const API_PROXY = '/api/admin/gpu'
const fetchStatus = useCallback(async () => {
setLoading(true)
setError(null)
try {
const response = await fetch(API_PROXY)
const data = await response.json()
if (!response.ok) {
throw new Error(data.error || `HTTP ${response.status}`)
}
setStatus(data)
} catch (err) {
setError(err instanceof Error ? err.message : 'Verbindungsfehler')
setStatus({
instance_id: null,
status: 'error',
gpu_name: null,
dph_total: null,
endpoint_base_url: null,
last_activity: null,
auto_shutdown_in_minutes: null,
total_runtime_hours: null,
total_cost_usd: null,
account_credit: null,
account_total_spend: null,
session_runtime_minutes: null,
session_cost_usd: null,
message: 'Verbindung fehlgeschlagen'
})
} finally {
setLoading(false)
}
}, [])
useEffect(() => {
fetchStatus()
}, [fetchStatus])
useEffect(() => {
const interval = setInterval(fetchStatus, 30000)
return () => clearInterval(interval)
}, [fetchStatus])
const powerOn = async () => {
setActionLoading('on')
setError(null)
setMessage(null)
try {
const response = await fetch(API_PROXY, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ action: 'on' }),
})
const data = await response.json()
if (!response.ok) {
throw new Error(data.error || data.detail || 'Aktion fehlgeschlagen')
}
setMessage('Start angefordert')
setTimeout(fetchStatus, 3000)
setTimeout(fetchStatus, 10000)
} catch (err) {
setError(err instanceof Error ? err.message : 'Fehler beim Starten')
fetchStatus()
} finally {
setActionLoading(null)
}
}
const powerOff = async () => {
setActionLoading('off')
setError(null)
setMessage(null)
try {
const response = await fetch(API_PROXY, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ action: 'off' }),
})
const data = await response.json()
if (!response.ok) {
throw new Error(data.error || data.detail || 'Aktion fehlgeschlagen')
}
setMessage('Stop angefordert')
setTimeout(fetchStatus, 3000)
setTimeout(fetchStatus, 10000)
} catch (err) {
setError(err instanceof Error ? err.message : 'Fehler beim Stoppen')
fetchStatus()
} finally {
setActionLoading(null)
}
}
const getStatusBadge = (s: string) => {
const baseClasses = 'px-3 py-1 rounded-full text-sm font-semibold uppercase'
switch (s) {
case 'running':
return `${baseClasses} bg-green-100 text-green-800`
case 'stopped':
case 'exited':
return `${baseClasses} bg-red-100 text-red-800`
case 'loading':
case 'scheduling':
case 'creating':
case 'starting...':
case 'stopping...':
return `${baseClasses} bg-yellow-100 text-yellow-800`
default:
return `${baseClasses} bg-slate-100 text-slate-600`
}
}
const getCreditColor = (credit: number | null) => {
if (credit === null) return 'text-slate-500'
if (credit < 5) return 'text-red-600'
if (credit < 15) return 'text-yellow-600'
return 'text-green-600'
}
return (
<div>
{/* Page Purpose */}
<PagePurpose
title="GPU Infrastruktur"
purpose="Verwalten Sie die vast.ai GPU-Instanzen fuer LLM-Verarbeitung und OCR. Starten/Stoppen Sie GPUs bei Bedarf und ueberwachen Sie Kosten in Echtzeit."
audience={['DevOps', 'Entwickler', 'System-Admins']}
architecture={{
services: ['vast.ai API', 'Ollama', 'VLLM'],
databases: ['PostgreSQL (Logs)'],
}}
relatedPages={[
{ name: 'Test Quality (BQAS)', href: '/ai/test-quality', description: 'Golden Suite & Tests' },
{ name: 'Magic Help', href: '/ai/magic-help', description: 'TrOCR Testing' },
]}
collapsible={true}
defaultCollapsed={true}
/>
{/* KI-Werkzeuge Sidebar */}
<AIToolsSidebarResponsive currentTool="gpu" />
{/* Status Cards */}
<div className="bg-white rounded-xl border border-slate-200 p-6 mb-6">
<div className="grid grid-cols-2 md:grid-cols-3 lg:grid-cols-6 gap-6">
<div>
<div className="text-sm text-slate-500 mb-2">Status</div>
{loading ? (
<span className="px-3 py-1 rounded-full text-sm font-semibold bg-slate-100 text-slate-600">
Laden...
</span>
) : (
<span className={getStatusBadge(
actionLoading === 'on' ? 'starting...' :
actionLoading === 'off' ? 'stopping...' :
status?.status || 'unknown'
)}>
{actionLoading === 'on' ? 'starting...' :
actionLoading === 'off' ? 'stopping...' :
status?.status || 'unbekannt'}
</span>
)}
</div>
<div>
<div className="text-sm text-slate-500 mb-2">GPU</div>
<div className="font-semibold text-slate-900">
{status?.gpu_name || '-'}
</div>
</div>
<div>
<div className="text-sm text-slate-500 mb-2">Kosten/h</div>
<div className="font-semibold text-slate-900">
{status?.dph_total ? `$${status.dph_total.toFixed(3)}` : '-'}
</div>
</div>
<div>
<div className="text-sm text-slate-500 mb-2">Auto-Stop</div>
<div className="font-semibold text-slate-900">
{status && status.auto_shutdown_in_minutes !== null
? `${status.auto_shutdown_in_minutes} min`
: '-'}
</div>
</div>
<div>
<div className="text-sm text-slate-500 mb-2">Budget</div>
<div className={`font-bold text-lg ${getCreditColor(status?.account_credit ?? null)}`}>
{status && status.account_credit !== null
? `$${status.account_credit.toFixed(2)}`
: '-'}
</div>
</div>
<div>
<div className="text-sm text-slate-500 mb-2">Session</div>
<div className="font-semibold text-slate-900">
{status && status.session_runtime_minutes !== null && status.session_cost_usd !== null
? `${Math.round(status.session_runtime_minutes)} min / $${status.session_cost_usd.toFixed(3)}`
: '-'}
</div>
</div>
</div>
{/* Buttons */}
<div className="flex items-center gap-4 mt-6 pt-6 border-t border-slate-200">
<button
onClick={powerOn}
disabled={actionLoading !== null || status?.status === 'running'}
className="px-6 py-2 bg-orange-600 text-white rounded-lg font-medium hover:bg-orange-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
>
Starten
</button>
<button
onClick={powerOff}
disabled={actionLoading !== null || status?.status !== 'running'}
className="px-6 py-2 bg-red-600 text-white rounded-lg font-medium hover:bg-red-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
>
Stoppen
</button>
<button
onClick={fetchStatus}
disabled={loading}
className="px-4 py-2 border border-slate-300 text-slate-700 rounded-lg font-medium hover:bg-slate-50 disabled:opacity-50 transition-colors"
>
{loading ? 'Aktualisiere...' : 'Aktualisieren'}
</button>
{message && (
<span className="ml-4 text-sm text-green-600 font-medium">{message}</span>
)}
{error && (
<span className="ml-4 text-sm text-red-600 font-medium">{error}</span>
)}
</div>
</div>
{/* Extended Stats */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6 mb-6">
<div className="bg-white rounded-xl border border-slate-200 p-6">
<h3 className="font-semibold text-slate-900 mb-4">Kosten-Uebersicht</h3>
<div className="space-y-4">
<div className="flex justify-between items-center">
<span className="text-slate-600">Session Laufzeit</span>
<span className="font-semibold">
{status && status.session_runtime_minutes !== null
? `${Math.round(status.session_runtime_minutes)} Minuten`
: '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">Session Kosten</span>
<span className="font-semibold">
{status && status.session_cost_usd !== null
? `$${status.session_cost_usd.toFixed(4)}`
: '-'}
</span>
</div>
<div className="flex justify-between items-center pt-4 border-t border-slate-100">
<span className="text-slate-600">Gesamtlaufzeit</span>
<span className="font-semibold">
{status && status.total_runtime_hours !== null
? `${status.total_runtime_hours.toFixed(1)} Stunden`
: '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">Gesamtkosten</span>
<span className="font-semibold">
{status && status.total_cost_usd !== null
? `$${status.total_cost_usd.toFixed(2)}`
: '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">vast.ai Ausgaben</span>
<span className="font-semibold">
{status && status.account_total_spend !== null
? `$${status.account_total_spend.toFixed(2)}`
: '-'}
</span>
</div>
</div>
</div>
<div className="bg-white rounded-xl border border-slate-200 p-6">
<h3 className="font-semibold text-slate-900 mb-4">Instanz-Details</h3>
<div className="space-y-4">
<div className="flex justify-between items-center">
<span className="text-slate-600">Instanz ID</span>
<span className="font-mono text-sm">
{status?.instance_id || '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">GPU</span>
<span className="font-semibold">
{status?.gpu_name || '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">Stundensatz</span>
<span className="font-semibold">
{status?.dph_total ? `$${status.dph_total.toFixed(4)}/h` : '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">Letzte Aktivitaet</span>
<span className="text-sm">
{status?.last_activity
? new Date(status.last_activity).toLocaleString('de-DE')
: '-'}
</span>
</div>
{status?.endpoint_base_url && status.status === 'running' && (
<div className="pt-4 border-t border-slate-100">
<div className="text-slate-600 text-sm mb-1">Endpoint</div>
<code className="text-xs bg-slate-100 px-2 py-1 rounded block overflow-x-auto">
{status.endpoint_base_url}
</code>
</div>
)}
</div>
</div>
</div>
{/* Info */}
<div className="bg-violet-50 border border-violet-200 rounded-xl p-4">
<div className="flex gap-3">
<svg className="w-5 h-5 text-violet-600 flex-shrink-0 mt-0.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
<div>
<h4 className="font-semibold text-violet-900">Auto-Shutdown</h4>
<p className="text-sm text-violet-800 mt-1">
Die GPU-Instanz wird automatisch gestoppt, wenn sie laengere Zeit inaktiv ist.
Der Status wird alle 30 Sekunden automatisch aktualisiert.
</p>
</div>
</div>
</div>
</div>
)
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,420 @@
'use client'
/**
* Ground-Truth Queue & Progress
*
* Overview page showing all sessions with their GT status.
* Clicking a session opens it in the Kombi Pipeline (/ai/ocr-overlay)
* where the actual review (split-view, inline edit, GT marking) happens.
*/
import { useState, useEffect, useCallback } from 'react'
import { useRouter } from 'next/navigation'
import { PagePurpose } from '@/components/common/PagePurpose'
const KLAUSUR_API = '/klausur-api'
// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------
interface Session {
id: string
name: string
filename: string
status: string
created_at: string
document_category: string | null
has_ground_truth: boolean
}
interface GTSession {
session_id: string
name: string
filename: string
document_category: string | null
pipeline: string | null
saved_at: string | null
summary: {
total_zones: number
total_columns: number
total_rows: number
total_cells: number
}
}
// ---------------------------------------------------------------------------
// Component
// ---------------------------------------------------------------------------
export default function GroundTruthQueuePage() {
const router = useRouter()
const [allSessions, setAllSessions] = useState<Session[]>([])
const [gtSessions, setGtSessions] = useState<GTSession[]>([])
const [filter, setFilter] = useState<'all' | 'unreviewed' | 'reviewed'>('all')
const [loading, setLoading] = useState(true)
const [selectedSessions, setSelectedSessions] = useState<Set<string>>(new Set())
const [marking, setMarking] = useState(false)
const [markResult, setMarkResult] = useState<string | null>(null)
// Load sessions + GT sessions
const loadData = useCallback(async () => {
setLoading(true)
try {
const [sessRes, gtRes] = await Promise.all([
fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions?limit=200`),
fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/ground-truth-sessions`),
])
if (sessRes.ok) {
const data = await sessRes.json()
const gtSet = new Set<string>()
if (gtRes.ok) {
const gtData = await gtRes.json()
const gts: GTSession[] = gtData.sessions || []
setGtSessions(gts)
for (const g of gts) gtSet.add(g.session_id)
}
const sessions: Session[] = (data.sessions || [])
.filter((s: any) => !s.parent_session_id)
.map((s: any) => ({
id: s.id,
name: s.name || '',
filename: s.filename || '',
status: s.status || 'active',
created_at: s.created_at || '',
document_category: s.document_category || null,
has_ground_truth: gtSet.has(s.id),
}))
setAllSessions(sessions)
}
} catch (e) {
console.error('Failed to load data:', e)
} finally {
setLoading(false)
}
}, [])
useEffect(() => {
loadData()
}, [loadData])
// Filtered sessions
const filteredSessions = allSessions.filter((s) => {
if (filter === 'unreviewed') return !s.has_ground_truth
if (filter === 'reviewed') return s.has_ground_truth
return true
})
const reviewedCount = allSessions.filter((s) => s.has_ground_truth).length
const totalCount = allSessions.length
const pct = totalCount > 0 ? Math.round((reviewedCount / totalCount) * 100) : 0
// Open session in Kombi pipeline
const openInPipeline = (sessionId: string) => {
router.push(`/ai/ocr-overlay?session=${sessionId}&mode=kombi`)
}
// Batch mark as GT
const batchMark = async () => {
setMarking(true)
let success = 0
for (const sid of selectedSessions) {
try {
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}/mark-ground-truth?pipeline=kombi`,
{ method: 'POST' },
)
if (res.ok) success++
} catch {
/* skip */
}
}
setSelectedSessions(new Set())
setMarking(false)
setMarkResult(`${success} Sessions als Ground Truth markiert`)
setTimeout(() => setMarkResult(null), 3000)
loadData()
}
const toggleSelect = (id: string) => {
setSelectedSessions((prev) => {
const next = new Set(prev)
if (next.has(id)) next.delete(id)
else next.add(id)
return next
})
}
const selectAll = () => {
if (selectedSessions.size === filteredSessions.length) {
setSelectedSessions(new Set())
} else {
setSelectedSessions(new Set(filteredSessions.map((s) => s.id)))
}
}
return (
<div className="space-y-6">
<div className="max-w-5xl mx-auto p-4 space-y-4">
<PagePurpose
title="Ground Truth Queue"
purpose="Uebersicht aller OCR-Sessions und deren Ground-Truth-Status. Zum Pruefen und Korrigieren eine Session oeffnen — sie wird im Kombi-Modus (OCR Overlay) bearbeitet."
audience={['Entwickler', 'QA']}
defaultCollapsed
architecture={{
services: ['klausur-service (FastAPI, Port 8086)'],
databases: ['PostgreSQL (ocr_pipeline_sessions)'],
}}
relatedPages={[
{
name: 'Kombi Pipeline',
href: '/ai/ocr-overlay',
description: 'Sessions bearbeiten und GT markieren',
},
{
name: 'OCR Regression',
href: '/ai/ocr-regression',
description: 'Regressions-Tests',
},
]}
/>
{/* Progress Bar */}
<div className="bg-white rounded-lg border border-slate-200 p-4">
<div className="flex items-center justify-between mb-2">
<h2 className="text-lg font-bold text-slate-900">
Ground Truth Fortschritt
</h2>
<span className="text-sm text-slate-500">
{reviewedCount} von {totalCount} markiert ({pct}%)
</span>
</div>
<div className="w-full bg-slate-100 rounded-full h-2.5">
<div
className="bg-teal-500 h-2.5 rounded-full transition-all duration-500"
style={{ width: `${pct}%` }}
/>
</div>
<div className="flex items-center gap-4 mt-2 text-xs text-slate-500">
<span className="flex items-center gap-1">
<span className="w-2 h-2 rounded-full bg-teal-400" />
{reviewedCount} Ground Truth
</span>
<span className="flex items-center gap-1">
<span className="w-2 h-2 rounded-full bg-slate-300" />
{totalCount - reviewedCount} offen
</span>
<span>
{gtSessions.reduce((sum, g) => sum + g.summary.total_cells, 0)}{' '}
Referenz-Zellen gesamt
</span>
</div>
</div>
{/* Filter + Actions */}
<div className="flex items-center gap-4 flex-wrap">
<div className="flex gap-1 bg-slate-100 rounded-lg p-1">
{(['all', 'unreviewed', 'reviewed'] as const).map((f) => (
<button
key={f}
onClick={() => setFilter(f)}
className={`px-3 py-1.5 text-sm rounded-md transition-colors ${
filter === f
? 'bg-white text-slate-900 shadow-sm font-medium'
: 'text-slate-500 hover:text-slate-700'
}`}
>
{f === 'all'
? 'Alle'
: f === 'unreviewed'
? 'Offen'
: 'Ground Truth'}
<span className="ml-1 text-xs text-slate-400">
(
{
allSessions.filter((s) =>
f === 'unreviewed'
? !s.has_ground_truth
: f === 'reviewed'
? s.has_ground_truth
: true,
).length
}
)
</span>
</button>
))}
</div>
<div className="ml-auto flex items-center gap-2">
{selectedSessions.size > 0 && (
<button
onClick={batchMark}
disabled={marking}
className="px-3 py-1.5 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700 disabled:opacity-50"
>
{marking
? 'Markiere...'
: `${selectedSessions.size} als GT markieren`}
</button>
)}
<button
onClick={selectAll}
className="px-3 py-1.5 text-sm text-slate-500 hover:text-slate-700 border border-slate-200 rounded-lg hover:bg-slate-50"
>
{selectedSessions.size === filteredSessions.length
? 'Keine auswaehlen'
: 'Alle auswaehlen'}
</button>
</div>
</div>
{/* Toast */}
{markResult && (
<div className="p-3 rounded-lg text-sm bg-emerald-50 text-emerald-700 border border-emerald-200">
{markResult}
</div>
)}
{/* Session List */}
{loading ? (
<div className="text-center py-12 text-slate-400">
Lade Sessions...
</div>
) : filteredSessions.length === 0 ? (
<div className="text-center py-12 text-slate-400">
<p className="text-lg">Keine Sessions in dieser Ansicht</p>
</div>
) : (
<div className="bg-white rounded-lg border border-slate-200 overflow-hidden">
<table className="w-full text-sm">
<thead>
<tr className="border-b border-slate-200 bg-slate-50 text-left text-slate-500">
<th className="px-4 py-2 w-8">
<input
type="checkbox"
checked={
selectedSessions.size === filteredSessions.length &&
filteredSessions.length > 0
}
onChange={selectAll}
className="rounded border-slate-300"
/>
</th>
<th className="px-4 py-2 font-medium">Status</th>
<th className="px-4 py-2 font-medium">Session</th>
<th className="px-4 py-2 font-medium">Kategorie</th>
<th className="px-4 py-2 font-medium">Erstellt</th>
<th className="px-4 py-2 font-medium text-right">
Aktion
</th>
</tr>
</thead>
<tbody>
{filteredSessions.map((s) => {
const gt = gtSessions.find((g) => g.session_id === s.id)
return (
<tr
key={s.id}
className="border-b border-slate-50 hover:bg-slate-50 transition-colors"
>
<td className="px-4 py-2">
<input
type="checkbox"
checked={selectedSessions.has(s.id)}
onChange={() => toggleSelect(s.id)}
className="rounded border-slate-300"
/>
</td>
<td className="px-4 py-2">
{s.has_ground_truth ? (
<span className="inline-flex items-center gap-1 px-2 py-0.5 rounded-full text-xs font-medium bg-emerald-100 text-emerald-700 border border-emerald-200">
<svg
className="w-3 h-3"
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M5 13l4 4L19 7"
/>
</svg>
GT
</span>
) : (
<span className="inline-flex items-center px-2 py-0.5 rounded-full text-xs font-medium bg-slate-100 text-slate-500 border border-slate-200">
Offen
</span>
)}
</td>
<td className="px-4 py-2">
<div className="flex items-center gap-3">
<div className="flex-shrink-0 w-8 h-8 rounded bg-slate-100 overflow-hidden">
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${s.id}/thumbnail?size=64`}
alt=""
className="w-full h-full object-cover"
loading="lazy"
onError={(e) => {
;(e.target as HTMLImageElement).style.display =
'none'
}}
/>
</div>
<div className="min-w-0">
<div className="font-medium text-slate-900 truncate">
{s.name || s.filename || s.id.slice(0, 8)}
</div>
{gt && (
<div className="text-xs text-slate-400">
{gt.summary.total_cells} Zellen,{' '}
{gt.summary.total_zones} Zonen
</div>
)}
</div>
</div>
</td>
<td className="px-4 py-2">
{s.document_category ? (
<span className="text-xs bg-slate-100 px-1.5 py-0.5 rounded text-slate-600">
{s.document_category}
</span>
) : (
<span className="text-xs text-slate-300"></span>
)}
</td>
<td className="px-4 py-2 text-slate-500">
{new Date(s.created_at).toLocaleDateString('de-DE', {
day: '2-digit',
month: '2-digit',
year: '2-digit',
})}
</td>
<td className="px-4 py-2 text-right">
<button
onClick={() => openInPipeline(s.id)}
className="px-3 py-1 text-xs bg-teal-600 text-white rounded hover:bg-teal-700 transition-colors"
>
{s.has_ground_truth
? 'Ueberpruefen'
: 'Im Kombi-Modus oeffnen'}
</button>
</td>
</tr>
)
})}
</tbody>
</table>
</div>
)}
</div>
</div>
)
}

View File

@@ -0,0 +1,173 @@
'use client'
import { Suspense } from 'react'
import { PagePurpose } from '@/components/common/PagePurpose'
import { KombiStepper } from '@/components/ocr-kombi/KombiStepper'
import { SessionList } from '@/components/ocr-kombi/SessionList'
import { SessionHeader } from '@/components/ocr-kombi/SessionHeader'
import { StepUpload } from '@/components/ocr-kombi/StepUpload'
import { StepOrientation } from '@/components/ocr-kombi/StepOrientation'
import { StepPageSplit } from '@/components/ocr-kombi/StepPageSplit'
import { StepDeskew } from '@/components/ocr-kombi/StepDeskew'
import { StepDewarp } from '@/components/ocr-kombi/StepDewarp'
import { StepContentCrop } from '@/components/ocr-kombi/StepContentCrop'
import { StepOcr } from '@/components/ocr-kombi/StepOcr'
import { StepStructure } from '@/components/ocr-kombi/StepStructure'
import { StepGridBuild } from '@/components/ocr-kombi/StepGridBuild'
import { StepGridReview } from '@/components/ocr-kombi/StepGridReview'
import { StepGutterRepair } from '@/components/ocr-kombi/StepGutterRepair'
import { StepBoxGridReview } from '@/components/ocr-kombi/StepBoxGridReview'
import { StepAnsicht } from '@/components/ocr-kombi/StepAnsicht'
import { StepGroundTruth } from '@/components/ocr-kombi/StepGroundTruth'
import { useKombiPipeline } from './useKombiPipeline'
function OcrKombiContent() {
const {
currentStep,
sessionId,
sessionName,
loadingSessions,
activeCategory,
isGroundTruth,
pageNumber,
steps,
gridSaveRef,
groupedSessions,
loadSessions,
openSession,
handleStepClick,
handleNext,
handleNewSession,
deleteSession,
renameSession,
updateCategory,
setSessionId,
setSessionName,
setIsGroundTruth,
} = useKombiPipeline()
const renderStep = () => {
switch (currentStep) {
case 0:
return (
<StepUpload
sessionId={sessionId}
onUploaded={(sid, name) => {
setSessionId(sid)
setSessionName(name)
loadSessions()
}}
onNext={handleNext}
/>
)
case 1:
return (
<StepOrientation
sessionId={sessionId}
onNext={() => handleNext()}
onSessionList={() => { loadSessions(); handleNewSession() }}
/>
)
case 2:
return (
<StepPageSplit
sessionId={sessionId}
sessionName={sessionName}
onNext={handleNext}
onSplitComplete={(childId, childName) => {
// Switch to the first child session and refresh the list
setSessionId(childId)
setSessionName(childName)
loadSessions()
}}
/>
)
case 3:
return <StepDeskew sessionId={sessionId} onNext={handleNext} />
case 4:
return <StepDewarp sessionId={sessionId} onNext={handleNext} />
case 5:
return <StepContentCrop sessionId={sessionId} onNext={handleNext} />
case 6:
return <StepOcr sessionId={sessionId} onNext={handleNext} />
case 7:
return <StepStructure sessionId={sessionId} onNext={handleNext} />
case 8:
return <StepGridBuild sessionId={sessionId} onNext={handleNext} />
case 9:
return <StepGridReview sessionId={sessionId} onNext={handleNext} saveRef={gridSaveRef} />
case 10:
return <StepGutterRepair sessionId={sessionId} onNext={handleNext} />
case 11:
return <StepBoxGridReview sessionId={sessionId} onNext={handleNext} />
case 12:
return <StepAnsicht sessionId={sessionId} onNext={handleNext} />
case 13:
return (
<StepGroundTruth
sessionId={sessionId}
isGroundTruth={isGroundTruth}
onMarked={() => setIsGroundTruth(true)}
gridSaveRef={gridSaveRef}
/>
)
default:
return null
}
}
return (
<div className="space-y-6">
<PagePurpose
title="OCR Kombi Pipeline"
purpose="Modulare 11-Schritt-Pipeline: Upload, Vorverarbeitung, Dual-Engine-OCR (PP-OCRv5 + Tesseract), Strukturerkennung, Grid-Aufbau und Review. Multi-Page-Dokument-Unterstuetzung."
audience={['Entwickler']}
architecture={{
services: ['klausur-service (FastAPI)', 'OpenCV', 'Tesseract', 'PaddleOCR'],
databases: ['PostgreSQL Sessions'],
}}
relatedPages={[
{ name: 'OCR Regression', href: '/ai/ocr-regression', description: 'Regressionstests' },
]}
defaultCollapsed
/>
<SessionList
items={groupedSessions()}
loading={loadingSessions}
activeSessionId={sessionId}
onOpenSession={(sid) => openSession(sid)}
onNewSession={handleNewSession}
onDeleteSession={deleteSession}
onRenameSession={renameSession}
onUpdateCategory={updateCategory}
/>
{sessionId && sessionName && (
<SessionHeader
sessionName={sessionName}
activeCategory={activeCategory}
isGroundTruth={isGroundTruth}
pageNumber={pageNumber}
onUpdateCategory={(cat) => updateCategory(sessionId, cat)}
/>
)}
<KombiStepper
steps={steps}
currentStep={currentStep}
onStepClick={handleStepClick}
/>
<div className="min-h-[400px]">{renderStep()}</div>
</div>
)
}
export default function OcrKombiPage() {
return (
<Suspense fallback={<div className="p-4 text-sm text-gray-400">Lade...</div>}>
<OcrKombiContent />
</Suspense>
)
}

View File

@@ -0,0 +1,266 @@
// OCR Pipeline Types — migrated from deleted ocr-pipeline/types.ts
export type PipelineStepStatus = 'pending' | 'active' | 'completed' | 'failed' | 'skipped'
export interface PipelineStep {
id: string
name: string
icon: string
status: PipelineStepStatus
}
export type DocumentCategory =
| 'vokabelseite' | 'woerterbuch' | 'buchseite' | 'arbeitsblatt' | 'klausurseite'
| 'mathearbeit' | 'statistik' | 'zeitung' | 'formular' | 'handschrift' | 'sonstiges'
export const DOCUMENT_CATEGORIES: { value: DocumentCategory; label: string; icon: string }[] = [
{ value: 'vokabelseite', label: 'Vokabelseite', icon: '📖' },
{ value: 'woerterbuch', label: 'Woerterbuch', icon: '📕' },
{ value: 'buchseite', label: 'Buchseite', icon: '📚' },
{ value: 'arbeitsblatt', label: 'Arbeitsblatt', icon: '📝' },
{ value: 'klausurseite', label: 'Klausurseite', icon: '📄' },
{ value: 'mathearbeit', label: 'Mathearbeit', icon: '🔢' },
{ value: 'statistik', label: 'Statistik', icon: '📊' },
{ value: 'zeitung', label: 'Zeitung', icon: '📰' },
{ value: 'formular', label: 'Formular', icon: '📋' },
{ value: 'handschrift', label: 'Handschrift', icon: '✍️' },
{ value: 'sonstiges', label: 'Sonstiges', icon: '📎' },
]
export interface SessionListItem {
id: string
name: string
filename: string
status: string
current_step: number
document_category?: DocumentCategory
doc_type?: string
parent_session_id?: string
document_group_id?: string
page_number?: number
is_ground_truth?: boolean
created_at: string
updated_at?: string
}
export interface SubSession {
id: string
name: string
box_index: number
current_step?: number
status?: string
}
export interface OrientationResult {
orientation_degrees: number
corrected: boolean
duration_seconds: number
}
export interface CropResult {
crop_applied: boolean
crop_rect?: { x: number; y: number; width: number; height: number }
crop_rect_pct?: { x: number; y: number; width: number; height: number }
original_size: { width: number; height: number }
cropped_size: { width: number; height: number }
detected_format?: string
format_confidence?: number
aspect_ratio?: number
border_fractions?: { top: number; bottom: number; left: number; right: number }
skipped?: boolean
duration_seconds?: number
}
export interface DeskewResult {
session_id: string
angle_hough: number
angle_word_alignment: number
angle_iterative?: number
angle_residual?: number
angle_textline?: number
angle_applied: number
method_used: 'hough' | 'word_alignment' | 'manual' | 'iterative' | 'two_pass' | 'three_pass' | 'manual_combined'
confidence: number
duration_seconds: number
deskewed_image_url: string
binarized_image_url: string
}
export interface DewarpDetection {
method: string
shear_degrees: number
confidence: number
}
export interface DewarpResult {
session_id: string
method_used: string
shear_degrees: number
confidence: number
duration_seconds: number
dewarped_image_url: string
detections?: DewarpDetection[]
}
export interface SessionInfo {
session_id: string
filename: string
name?: string
image_width: number
image_height: number
original_image_url: string
current_step?: number
document_category?: DocumentCategory
doc_type?: string
orientation_result?: OrientationResult
crop_result?: CropResult
deskew_result?: DeskewResult
dewarp_result?: DewarpResult
sub_sessions?: SubSession[]
parent_session_id?: string
box_index?: number
document_group_id?: string
page_number?: number
}
export interface StructureGraphic {
x: number; y: number; w: number; h: number
area: number; shape: string; color_name: string; color_hex: string; confidence: number
}
export interface ExcludeRegion {
x: number; y: number; w: number; h: number; label?: string
}
export interface StructureBox {
x: number; y: number; w: number; h: number
confidence: number; border_thickness: number
bg_color_name?: string; bg_color_hex?: string
}
export interface StructureZone {
index: number; zone_type: 'content' | 'box'
x: number; y: number; w: number; h: number
}
export interface DocLayoutRegion {
x: number; y: number; w: number; h: number
class_name: string; confidence: number
}
export interface StructureResult {
image_width: number; image_height: number
content_bounds: { x: number; y: number; w: number; h: number }
boxes: StructureBox[]; zones: StructureZone[]
graphics: StructureGraphic[]; exclude_regions?: ExcludeRegion[]
color_pixel_counts: Record<string, number>
has_words: boolean; word_count: number
border_ghosts_removed?: number; duration_seconds: number
layout_regions?: DocLayoutRegion[]
detection_method?: 'opencv' | 'ppdoclayout'
}
export interface WordBbox { x: number; y: number; w: number; h: number }
export interface OcrWordBox {
text: string; left: number; top: number; width: number; height: number; conf: number
color?: string; color_name?: string; recovered?: boolean
}
export interface ColumnMeta { index: number; type: string; x: number; width: number }
export interface GridCell {
cell_id: string; row_index: number; col_index: number; col_type: string
text: string; confidence: number; bbox_px: WordBbox; bbox_pct: WordBbox
ocr_engine?: string; is_bold?: boolean
status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
word_boxes?: OcrWordBox[]
}
export interface WordEntry {
row_index: number; english: string; german: string; example: string
source_page?: string; marker?: string; confidence: number
bbox: WordBbox; bbox_en: WordBbox | null; bbox_de: WordBbox | null; bbox_ex: WordBbox | null
bbox_ref?: WordBbox | null; bbox_marker?: WordBbox | null
status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
}
export interface GridResult {
cells: GridCell[]
grid_shape: { rows: number; cols: number; total_cells: number }
columns_used: ColumnMeta[]
layout: 'vocab' | 'generic'
image_width: number; image_height: number; duration_seconds: number
ocr_engine?: string; vocab_entries?: WordEntry[]; entries?: WordEntry[]; entry_count?: number
summary: {
total_cells: number; non_empty_cells: number; low_confidence: number
total_entries?: number; with_english?: number; with_german?: number
}
llm_review?: {
changes: { row_index: number; field: string; old: string; new: string }[]
model_used: string; duration_ms: number; entries_corrected: number
applied_count?: number; applied_at?: string
}
}
// --- Kombi V2 Pipeline ---
export const KOMBI_V2_STEPS: PipelineStep[] = [
{ id: 'upload', name: 'Upload', icon: '📤', status: 'pending' },
{ id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
{ id: 'page-split', name: 'Seitentrennung', icon: '📖', status: 'pending' },
{ id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
{ id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
{ id: 'content-crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
{ id: 'ocr', name: 'OCR', icon: '🔀', status: 'pending' },
{ id: 'structure', name: 'Strukturerkennung', icon: '🔍', status: 'pending' },
{ id: 'grid-build', name: 'Grid-Aufbau', icon: '🧱', status: 'pending' },
{ id: 'grid-review', name: 'Grid-Review', icon: '📊', status: 'pending' },
{ id: 'gutter-repair', name: 'Wortkorrektur', icon: '🩹', status: 'pending' },
{ id: 'box-review', name: 'Box-Review', icon: '📦', status: 'pending' },
{ id: 'ansicht', name: 'Ansicht', icon: '👁️', status: 'pending' },
{ id: 'ground-truth', name: 'Ground Truth', icon: '✅', status: 'pending' },
]
export const KOMBI_V2_UI_TO_DB: Record<number, number> = {
0: 1, 1: 2, 2: 2, 3: 3, 4: 4, 5: 5, 6: 8, 7: 9, 8: 10, 9: 11, 10: 11, 11: 11, 12: 11, 13: 12,
}
export function dbStepToKombiV2Ui(dbStep: number): number {
if (dbStep <= 1) return 0
if (dbStep === 2) return 1
if (dbStep === 3) return 3
if (dbStep === 4) return 4
if (dbStep === 5) return 5
if (dbStep <= 8) return 6
if (dbStep === 9) return 7
if (dbStep === 10) return 8
if (dbStep === 11) return 9
return 13
}
export interface DocumentGroup {
group_id: string; title: string; page_count: number; sessions: DocumentGroupSession[]
}
export interface DocumentGroupSession {
id: string; name: string; page_number: number; current_step: number
status: string; document_category?: DocumentCategory; created_at: string
}
export type OcrEngineSource = 'both' | 'paddle_only' | 'tesseract_only' | 'conflict_paddle' | 'conflict_tesseract'
export interface OcrTransparentWord {
text: string; left: number; top: number; width: number; height: number
conf: number; engine_source: OcrEngineSource
}
export interface OcrTransparentResult {
raw_tesseract: { words: OcrTransparentWord[] }
raw_paddle: { words: OcrTransparentWord[] }
merged: { words: OcrTransparentWord[] }
stats: {
total_words: number; both_agree: number; paddle_only: number
tesseract_only: number; conflict_paddle_wins: number; conflict_tesseract_wins: number
}
}

View File

@@ -0,0 +1,298 @@
'use client'
import { useCallback, useEffect, useState, useRef } from 'react'
import { useSearchParams } from 'next/navigation'
import type { PipelineStep, DocumentCategory, SessionListItem } from './types'
import { KOMBI_V2_STEPS, dbStepToKombiV2Ui } from './types'
export type { SessionListItem }
const KLAUSUR_API = '/klausur-api'
/** Groups sessions by document_group_id for the session list */
export interface DocumentGroupView {
group_id: string
title: string
sessions: SessionListItem[]
page_count: number
}
function initSteps(): PipelineStep[] {
return KOMBI_V2_STEPS.map((s, i) => ({
...s,
status: i === 0 ? 'active' : 'pending',
}))
}
export function useKombiPipeline() {
const [currentStep, setCurrentStep] = useState(0)
const [sessionId, setSessionId] = useState<string | null>(null)
const [sessionName, setSessionName] = useState('')
const [sessions, setSessions] = useState<SessionListItem[]>([])
const [loadingSessions, setLoadingSessions] = useState(true)
const [activeCategory, setActiveCategory] = useState<DocumentCategory | undefined>(undefined)
const [isGroundTruth, setIsGroundTruth] = useState(false)
const [pageNumber, setPageNumber] = useState<number | null>(null)
const [steps, setSteps] = useState<PipelineStep[]>(initSteps())
const searchParams = useSearchParams()
const deepLinkHandled = useRef(false)
const gridSaveRef = useRef<(() => Promise<void>) | null>(null)
// ---- Session loading ----
const loadSessions = useCallback(async () => {
setLoadingSessions(true)
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
if (res.ok) {
const data = await res.json()
setSessions((data.sessions || []).filter((s: SessionListItem) => !s.parent_session_id))
}
} catch (e) {
console.error('Failed to load sessions:', e)
} finally {
setLoadingSessions(false)
}
}, [])
useEffect(() => { loadSessions() }, [loadSessions])
// ---- Group sessions by document_group_id ----
const groupedSessions = useCallback((): (SessionListItem | DocumentGroupView)[] => {
const groups = new Map<string, SessionListItem[]>()
const ungrouped: SessionListItem[] = []
for (const s of sessions) {
if (s.document_group_id) {
const existing = groups.get(s.document_group_id) || []
existing.push(s)
groups.set(s.document_group_id, existing)
} else {
ungrouped.push(s)
}
}
const result: (SessionListItem | DocumentGroupView)[] = []
// Sort groups by earliest created_at
const sortedGroups = Array.from(groups.entries()).sort((a, b) => {
const aTime = Math.min(...a[1].map(s => new Date(s.created_at).getTime()))
const bTime = Math.min(...b[1].map(s => new Date(s.created_at).getTime()))
return bTime - aTime
})
for (const [groupId, groupSessions] of sortedGroups) {
groupSessions.sort((a, b) => (a.page_number || 0) - (b.page_number || 0))
// Extract base title (remove " — S. X" suffix)
const baseName = groupSessions[0]?.name?.replace(/ — S\. \d+$/, '') || 'Dokument'
result.push({
group_id: groupId,
title: baseName,
sessions: groupSessions,
page_count: groupSessions.length,
})
}
for (const s of ungrouped) {
result.push(s)
}
// Sort by creation time (most recent first)
const getTime = (item: SessionListItem | DocumentGroupView): number => {
if ('group_id' in item) {
return Math.min(...item.sessions.map((s: SessionListItem) => new Date(s.created_at).getTime()))
}
return new Date(item.created_at).getTime()
}
result.sort((a, b) => getTime(b) - getTime(a))
return result
}, [sessions])
// ---- Open session ----
const openSession = useCallback(async (sid: string) => {
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
if (!res.ok) return
const data = await res.json()
setSessionId(sid)
setSessionName(data.name || data.filename || '')
setActiveCategory(data.document_category || undefined)
setIsGroundTruth(!!data.ground_truth?.build_grid_reference)
setPageNumber(data.grid_editor_result?.page_number?.number ?? null)
// Determine UI step from DB state
const dbStep = data.current_step || 1
const hasGrid = !!data.grid_editor_result
const hasStructure = !!data.structure_result
const hasWords = !!data.word_result
const hasGutterRepair = !!(data.ground_truth?.gutter_repair)
let uiStep: number
if (hasGrid && hasGutterRepair) {
uiStep = 10 // gutter-repair (already analysed)
} else if (hasGrid) {
uiStep = 9 // grid-review
} else if (hasStructure) {
uiStep = 8 // grid-build
} else if (hasWords) {
uiStep = 7 // structure
} else {
uiStep = dbStepToKombiV2Ui(dbStep)
}
// Sessions only exist after upload, so always skip the upload step
if (uiStep === 0) {
uiStep = 1
}
setSteps(
KOMBI_V2_STEPS.map((s, i) => ({
...s,
status: i < uiStep ? 'completed' : i === uiStep ? 'active' : 'pending',
})),
)
setCurrentStep(uiStep)
} catch (e) {
console.error('Failed to open session:', e)
}
}, [])
// ---- Deep link handling ----
useEffect(() => {
if (deepLinkHandled.current) return
const urlSession = searchParams.get('session')
const urlStep = searchParams.get('step')
if (urlSession) {
deepLinkHandled.current = true
openSession(urlSession).then(() => {
if (urlStep) {
const stepIdx = parseInt(urlStep, 10)
if (!isNaN(stepIdx) && stepIdx >= 0 && stepIdx < KOMBI_V2_STEPS.length) {
setCurrentStep(stepIdx)
}
}
})
}
}, [searchParams, openSession])
// ---- Step navigation ----
const goToStep = useCallback((step: number) => {
setCurrentStep(step)
setSteps(prev =>
prev.map((s, i) => ({
...s,
status: i < step ? 'completed' : i === step ? 'active' : 'pending',
})),
)
}, [])
const handleStepClick = useCallback((index: number) => {
if (index <= currentStep || steps[index].status === 'completed') {
setCurrentStep(index)
}
}, [currentStep, steps])
const handleNext = useCallback(() => {
if (currentStep >= steps.length - 1) {
// Last step → return to session list
setSteps(initSteps())
setCurrentStep(0)
setSessionId(null)
loadSessions()
return
}
const nextStep = currentStep + 1
setSteps(prev =>
prev.map((s, i) => {
if (i === currentStep) return { ...s, status: 'completed' }
if (i === nextStep) return { ...s, status: 'active' }
return s
}),
)
setCurrentStep(nextStep)
}, [currentStep, steps, loadSessions])
// ---- Session CRUD ----
const handleNewSession = useCallback(() => {
setSessionId(null)
setSessionName('')
setCurrentStep(0)
setSteps(initSteps())
}, [])
const deleteSession = useCallback(async (sid: string) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, { method: 'DELETE' })
setSessions(prev => prev.filter(s => s.id !== sid))
if (sessionId === sid) handleNewSession()
} catch (e) {
console.error('Failed to delete session:', e)
}
}, [sessionId, handleNewSession])
const renameSession = useCallback(async (sid: string, newName: string) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name: newName }),
})
setSessions(prev => prev.map(s => s.id === sid ? { ...s, name: newName } : s))
if (sessionId === sid) setSessionName(newName)
} catch (e) {
console.error('Failed to rename session:', e)
}
}, [sessionId])
const updateCategory = useCallback(async (sid: string, category: DocumentCategory) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ document_category: category }),
})
setSessions(prev => prev.map(s => s.id === sid ? { ...s, document_category: category } : s))
if (sessionId === sid) setActiveCategory(category)
} catch (e) {
console.error('Failed to update category:', e)
}
}, [sessionId])
return {
// State
currentStep,
sessionId,
sessionName,
sessions,
loadingSessions,
activeCategory,
isGroundTruth,
pageNumber,
steps,
gridSaveRef,
// Computed
groupedSessions,
// Actions
loadSessions,
openSession,
goToStep,
handleStepClick,
handleNext,
handleNewSession,
deleteSession,
renameSession,
updateCategory,
setSessionId,
setSessionName,
setIsGroundTruth,
}
}

View File

@@ -1,635 +0,0 @@
'use client'
import { useCallback, useEffect, useState } from 'react'
import { PagePurpose } from '@/components/common/PagePurpose'
import { PipelineStepper } from '@/components/ocr-pipeline/PipelineStepper'
import { StepOrientation } from '@/components/ocr-pipeline/StepOrientation'
import { StepDeskew } from '@/components/ocr-pipeline/StepDeskew'
import { StepDewarp } from '@/components/ocr-pipeline/StepDewarp'
import { StepCrop } from '@/components/ocr-pipeline/StepCrop'
import { StepStructureDetection } from '@/components/ocr-pipeline/StepStructureDetection'
import { StepRowDetection } from '@/components/ocr-pipeline/StepRowDetection'
import { StepWordRecognition } from '@/components/ocr-pipeline/StepWordRecognition'
import { OverlayReconstruction } from '@/components/ocr-overlay/OverlayReconstruction'
import { PaddleDirectStep } from '@/components/ocr-overlay/PaddleDirectStep'
import { GridEditor } from '@/components/grid-editor/GridEditor'
import { OVERLAY_PIPELINE_STEPS, PADDLE_DIRECT_STEPS, KOMBI_STEPS, DOCUMENT_CATEGORIES, dbStepToOverlayUi, type PipelineStep, type SessionListItem, type DocumentCategory } from './types'
const KLAUSUR_API = '/klausur-api'
export default function OcrOverlayPage() {
const [mode, setMode] = useState<'pipeline' | 'paddle-direct' | 'kombi'>('pipeline')
const [currentStep, setCurrentStep] = useState(0)
const [sessionId, setSessionId] = useState<string | null>(null)
const [sessionName, setSessionName] = useState<string>('')
const [sessions, setSessions] = useState<SessionListItem[]>([])
const [loadingSessions, setLoadingSessions] = useState(true)
const [editingName, setEditingName] = useState<string | null>(null)
const [editNameValue, setEditNameValue] = useState('')
const [editingCategory, setEditingCategory] = useState<string | null>(null)
const [activeCategory, setActiveCategory] = useState<DocumentCategory | undefined>(undefined)
const [editingActiveCategory, setEditingActiveCategory] = useState(false)
const [isGroundTruth, setIsGroundTruth] = useState(false)
const [gtSaving, setGtSaving] = useState(false)
const [gtMessage, setGtMessage] = useState('')
const [steps, setSteps] = useState<PipelineStep[]>(
OVERLAY_PIPELINE_STEPS.map((s, i) => ({
...s,
status: i === 0 ? 'active' : 'pending',
})),
)
useEffect(() => {
loadSessions()
}, [])
const loadSessions = async () => {
setLoadingSessions(true)
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
if (res.ok) {
const data = await res.json()
// Filter to only show top-level sessions (no sub-sessions)
setSessions((data.sessions || []).filter((s: SessionListItem) => !s.parent_session_id))
}
} catch (e) {
console.error('Failed to load sessions:', e)
} finally {
setLoadingSessions(false)
}
}
const openSession = useCallback(async (sid: string) => {
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
if (!res.ok) return
const data = await res.json()
setSessionId(sid)
setSessionName(data.name || data.filename || '')
setActiveCategory(data.document_category || undefined)
setIsGroundTruth(!!data.ground_truth?.build_grid_reference)
setGtMessage('')
// Check if this session was processed with paddle_direct, kombi, or rapid_kombi
const ocrEngine = data.word_result?.ocr_engine
const isPaddleDirect = ocrEngine === 'paddle_direct'
const isKombi = ocrEngine === 'kombi' || ocrEngine === 'rapid_kombi'
if (isPaddleDirect || isKombi) {
const m = isKombi ? 'kombi' : 'paddle-direct'
const baseSteps = isKombi ? KOMBI_STEPS : PADDLE_DIRECT_STEPS
setMode(m)
// For Kombi: if grid_editor_result exists, jump to grid editor step (6)
// If structure_result exists, jump to grid editor (6)
// If word_result exists, jump to structure step (5)
const hasGrid = isKombi && data.grid_editor_result
const hasStructure = isKombi && data.structure_result
const hasWords = isKombi && data.word_result
const activeStep = hasGrid ? 6 : hasStructure ? 6 : hasWords ? 5 : 4
setSteps(
baseSteps.map((s, i) => ({
...s,
status: i < activeStep ? 'completed' : i === activeStep ? 'active' : 'pending',
})),
)
setCurrentStep(activeStep)
} else {
setMode('pipeline')
// Map DB step to overlay UI step
const dbStep = data.current_step || 1
const uiStep = dbStepToOverlayUi(dbStep)
setSteps(
OVERLAY_PIPELINE_STEPS.map((s, i) => ({
...s,
status: i < uiStep ? 'completed' : i === uiStep ? 'active' : 'pending',
})),
)
setCurrentStep(uiStep)
}
} catch (e) {
console.error('Failed to open session:', e)
}
}, [])
const deleteSession = useCallback(async (sid: string) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, { method: 'DELETE' })
setSessions((prev) => prev.filter((s) => s.id !== sid))
if (sessionId === sid) {
setSessionId(null)
setCurrentStep(0)
const baseSteps = mode === 'kombi' ? KOMBI_STEPS : mode === 'paddle-direct' ? PADDLE_DIRECT_STEPS : OVERLAY_PIPELINE_STEPS
setSteps(baseSteps.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
}
} catch (e) {
console.error('Failed to delete session:', e)
}
}, [sessionId, mode])
const renameSession = useCallback(async (sid: string, newName: string) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name: newName }),
})
setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, name: newName } : s)))
if (sessionId === sid) setSessionName(newName)
} catch (e) {
console.error('Failed to rename session:', e)
}
setEditingName(null)
}, [sessionId])
const updateCategory = useCallback(async (sid: string, category: DocumentCategory) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ document_category: category }),
})
setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, document_category: category } : s)))
if (sessionId === sid) setActiveCategory(category)
} catch (e) {
console.error('Failed to update category:', e)
}
setEditingCategory(null)
}, [sessionId])
const handleStepClick = (index: number) => {
if (index <= currentStep || steps[index].status === 'completed') {
setCurrentStep(index)
}
}
const goToStep = (step: number) => {
setCurrentStep(step)
setSteps((prev) =>
prev.map((s, i) => ({
...s,
status: i < step ? 'completed' : i === step ? 'active' : 'pending',
})),
)
}
const handleNext = () => {
if (currentStep >= steps.length - 1) {
// Last step completed — return to session list
const baseSteps = mode === 'kombi' ? KOMBI_STEPS : mode === 'paddle-direct' ? PADDLE_DIRECT_STEPS : OVERLAY_PIPELINE_STEPS
setSteps(baseSteps.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
setCurrentStep(0)
setSessionId(null)
loadSessions()
return
}
const nextStep = currentStep + 1
setSteps((prev) =>
prev.map((s, i) => {
if (i === currentStep) return { ...s, status: 'completed' }
if (i === nextStep) return { ...s, status: 'active' }
return s
}),
)
setCurrentStep(nextStep)
}
const handleOrientationComplete = (sid: string) => {
setSessionId(sid)
loadSessions()
handleNext()
}
const handleNewSession = () => {
setSessionId(null)
setSessionName('')
setCurrentStep(0)
const baseSteps = mode === 'kombi' ? KOMBI_STEPS : mode === 'paddle-direct' ? PADDLE_DIRECT_STEPS : OVERLAY_PIPELINE_STEPS
setSteps(baseSteps.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
}
const stepNames: Record<number, string> = {
1: 'Orientierung',
2: 'Begradigung',
3: 'Entzerrung',
4: 'Zuschneiden',
5: 'Zeilen',
6: 'Woerter',
7: 'Overlay',
}
const reprocessFromStep = useCallback(async (uiStep: number) => {
if (!sessionId) return
// Map overlay UI step to DB step
const dbStepMap: Record<number, number> = { 0: 2, 1: 3, 2: 4, 3: 5, 4: 7, 5: 8, 6: 9 }
const dbStep = dbStepMap[uiStep] || uiStep + 1
if (!confirm(`Ab Schritt ${uiStep + 1} (${stepNames[uiStep + 1] || '?'}) neu verarbeiten?`)) return
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/reprocess`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ from_step: dbStep }),
})
if (!res.ok) {
const data = await res.json().catch(() => ({}))
console.error('Reprocess failed:', data.detail || res.status)
return
}
goToStep(uiStep)
} catch (e) {
console.error('Reprocess error:', e)
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [sessionId, goToStep])
const handleMarkGroundTruth = async () => {
if (!sessionId) return
setGtSaving(true)
setGtMessage('')
try {
const resp = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/mark-ground-truth?pipeline=${mode}`,
{ method: 'POST' }
)
if (!resp.ok) {
const body = await resp.text().catch(() => '')
throw new Error(`Ground Truth fehlgeschlagen (${resp.status}): ${body}`)
}
const data = await resp.json()
setIsGroundTruth(true)
setGtMessage(`Ground Truth gespeichert (${data.cells_saved} Zellen)`)
setTimeout(() => setGtMessage(''), 5000)
} catch (e) {
setGtMessage(e instanceof Error ? e.message : String(e))
} finally {
setGtSaving(false)
}
}
const isLastStep = currentStep === steps.length - 1
const showGtButton = isLastStep && sessionId != null
const renderStep = () => {
if (mode === 'paddle-direct' || mode === 'kombi') {
switch (currentStep) {
case 0:
return <StepOrientation sessionId={sessionId} onNext={handleOrientationComplete} />
case 1:
return <StepDeskew sessionId={sessionId} onNext={handleNext} />
case 2:
return <StepDewarp sessionId={sessionId} onNext={handleNext} />
case 3:
return <StepCrop sessionId={sessionId} onNext={handleNext} />
case 4:
if (mode === 'kombi') {
return (
<PaddleDirectStep
sessionId={sessionId}
onNext={handleNext}
endpoint="paddle-kombi"
title="Kombi-Modus"
description="PP-OCRv5 und Tesseract laufen parallel. Koordinaten werden gewichtet gemittelt fuer optimale Positionierung."
icon="🔀"
buttonLabel="PP-OCRv5 + Tesseract starten"
runningLabel="PP-OCRv5 + Tesseract laufen..."
engineKey="kombi"
/>
)
}
return <PaddleDirectStep sessionId={sessionId} onNext={handleNext} />
case 5:
return mode === 'kombi' ? (
<StepStructureDetection sessionId={sessionId} onNext={handleNext} />
) : null
case 6:
return mode === 'kombi' ? (
<GridEditor sessionId={sessionId} onNext={handleNext} />
) : null
default:
return null
}
}
switch (currentStep) {
case 0:
return <StepOrientation sessionId={sessionId} onNext={handleOrientationComplete} />
case 1:
return <StepDeskew sessionId={sessionId} onNext={handleNext} />
case 2:
return <StepDewarp sessionId={sessionId} onNext={handleNext} />
case 3:
return <StepCrop sessionId={sessionId} onNext={handleNext} />
case 4:
return <StepRowDetection sessionId={sessionId} onNext={handleNext} />
case 5:
return <StepWordRecognition sessionId={sessionId} onNext={handleNext} goToStep={goToStep} skipHealGaps />
case 6:
return <OverlayReconstruction sessionId={sessionId} onNext={handleNext} />
default:
return null
}
}
return (
<div className="space-y-6">
<PagePurpose
title="OCR Overlay"
purpose="Ganzseitige Overlay-Rekonstruktion: Scan begradigen, Zeilen und Woerter erkennen, dann pixelgenau ueber das Bild legen. Ohne Spaltenerkennung — ideal fuer Arbeitsblaetter."
audience={['Entwickler']}
architecture={{
services: ['klausur-service (FastAPI)', 'OpenCV', 'Tesseract'],
databases: ['PostgreSQL Sessions'],
}}
relatedPages={[
{ name: 'OCR Pipeline', href: '/ai/ocr-pipeline', description: 'Volle Pipeline mit Spalten' },
{ name: 'OCR Vergleich', href: '/ai/ocr-compare', description: 'Methoden-Vergleich' },
]}
defaultCollapsed
/>
{/* Session List */}
<div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4">
<div className="flex items-center justify-between mb-3">
<h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
Sessions ({sessions.length})
</h3>
<button
onClick={handleNewSession}
className="text-xs px-3 py-1.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
>
+ Neue Session
</button>
</div>
{loadingSessions ? (
<div className="text-sm text-gray-400 py-2">Lade Sessions...</div>
) : sessions.length === 0 ? (
<div className="text-sm text-gray-400 py-2">Noch keine Sessions vorhanden.</div>
) : (
<div className="space-y-1.5 max-h-[320px] overflow-y-auto">
{sessions.map((s) => {
const catInfo = DOCUMENT_CATEGORIES.find(c => c.value === s.document_category)
return (
<div
key={s.id}
className={`relative flex items-start gap-3 px-3 py-2.5 rounded-lg text-sm transition-colors cursor-pointer ${
sessionId === s.id
? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
: 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
}`}
>
{/* Thumbnail */}
<div
className="flex-shrink-0 w-12 h-12 rounded-md overflow-hidden bg-gray-100 dark:bg-gray-700"
onClick={() => openSession(s.id)}
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${s.id}/thumbnail?size=96`}
alt=""
className="w-full h-full object-cover"
loading="lazy"
onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
/>
</div>
{/* Info */}
<div className="flex-1 min-w-0" onClick={() => openSession(s.id)}>
{editingName === s.id ? (
<input
autoFocus
value={editNameValue}
onChange={(e) => setEditNameValue(e.target.value)}
onBlur={() => renameSession(s.id, editNameValue)}
onKeyDown={(e) => {
if (e.key === 'Enter') renameSession(s.id, editNameValue)
if (e.key === 'Escape') setEditingName(null)
}}
onClick={(e) => e.stopPropagation()}
className="w-full px-1 py-0.5 text-sm border rounded dark:bg-gray-700 dark:border-gray-600"
/>
) : (
<div className="truncate font-medium text-gray-700 dark:text-gray-300">
{s.name || s.filename}
</div>
)}
<button
onClick={(e) => {
e.stopPropagation()
navigator.clipboard.writeText(s.id)
const btn = e.currentTarget
btn.textContent = 'Kopiert!'
setTimeout(() => { btn.textContent = `ID: ${s.id.slice(0, 8)}` }, 1500)
}}
className="text-[10px] font-mono text-gray-400 hover:text-teal-500 transition-colors"
title={`Volle ID: ${s.id} — Klick zum Kopieren`}
>
ID: {s.id.slice(0, 8)}
</button>
<div className="text-xs text-gray-400 flex gap-2 mt-0.5">
<span>{new Date(s.created_at).toLocaleDateString('de-DE', { day: '2-digit', month: '2-digit', year: '2-digit', hour: '2-digit', minute: '2-digit' })}</span>
</div>
</div>
{/* Category Badge */}
<div className="flex flex-col gap-1 items-end flex-shrink-0" onClick={(e) => e.stopPropagation()}>
<button
onClick={() => setEditingCategory(editingCategory === s.id ? null : s.id)}
className={`text-[10px] px-1.5 py-0.5 rounded-full border transition-colors ${
catInfo
? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
: 'bg-gray-50 dark:bg-gray-700 border-gray-200 dark:border-gray-600 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300'
}`}
title="Kategorie setzen"
>
{catInfo ? `${catInfo.icon} ${catInfo.label}` : '+ Kategorie'}
</button>
</div>
{/* Actions */}
<div className="flex flex-col gap-0.5 flex-shrink-0">
<button
onClick={(e) => {
e.stopPropagation()
setEditNameValue(s.name || s.filename)
setEditingName(s.id)
}}
className="p-1 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300"
title="Umbenennen"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.232 5.232l3.536 3.536m-2.036-5.036a2.5 2.5 0 113.536 3.536L6.5 21.036H3v-3.572L16.732 3.732z" />
</svg>
</button>
<button
onClick={(e) => {
e.stopPropagation()
if (confirm('Session loeschen?')) deleteSession(s.id)
}}
className="p-1 text-gray-400 hover:text-red-500"
title="Loeschen"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
</svg>
</button>
</div>
{/* Category dropdown */}
{editingCategory === s.id && (
<div
className="absolute right-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64"
onClick={(e) => e.stopPropagation()}
>
{DOCUMENT_CATEGORIES.map((cat) => (
<button
key={cat.value}
onClick={() => updateCategory(s.id, cat.value)}
className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
s.document_category === cat.value
? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
: 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
}`}
>
{cat.icon} {cat.label}
</button>
))}
</div>
)}
</div>
)
})}
</div>
)}
</div>
{/* Active session info + category picker */}
{sessionId && sessionName && (
<div className="relative flex items-center gap-3 text-sm text-gray-500 dark:text-gray-400">
<span>Aktive Session: <span className="font-medium text-gray-700 dark:text-gray-300">{sessionName}</span></span>
<button
onClick={() => setEditingActiveCategory(!editingActiveCategory)}
className={`text-xs px-2.5 py-1 rounded-full border transition-colors ${
activeCategory
? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300 hover:bg-teal-100 dark:hover:bg-teal-900/50'
: 'bg-amber-50 dark:bg-amber-900/20 border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300 hover:bg-amber-100 dark:hover:bg-amber-900/40 animate-pulse'
}`}
>
{activeCategory ? (() => {
const cat = DOCUMENT_CATEGORIES.find(c => c.value === activeCategory)
return cat ? `${cat.icon} ${cat.label}` : activeCategory
})() : 'Kategorie setzen'}
</button>
{isGroundTruth && (
<span className="text-xs px-2 py-0.5 rounded-full bg-amber-50 dark:bg-amber-900/20 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300">
GT
</span>
)}
{editingActiveCategory && (
<div className="absolute left-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64">
{DOCUMENT_CATEGORIES.map((cat) => (
<button
key={cat.value}
onClick={() => {
updateCategory(sessionId, cat.value)
setEditingActiveCategory(false)
}}
className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
activeCategory === cat.value
? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
: 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
}`}
>
{cat.icon} {cat.label}
</button>
))}
</div>
)}
</div>
)}
{/* Mode Toggle */}
<div className="flex items-center gap-1 bg-gray-100 dark:bg-gray-800 rounded-lg p-1 w-fit">
<button
onClick={() => {
if (mode === 'pipeline') return
setMode('pipeline')
setCurrentStep(0)
setSessionId(null)
setSteps(OVERLAY_PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
}}
className={`px-3 py-1.5 text-xs font-medium rounded-md transition-colors ${
mode === 'pipeline'
? 'bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-200 shadow-sm'
: 'text-gray-500 dark:text-gray-400 hover:text-gray-700 dark:hover:text-gray-300'
}`}
>
Pipeline (7 Schritte)
</button>
<button
onClick={() => {
if (mode === 'paddle-direct') return
setMode('paddle-direct')
setCurrentStep(0)
setSessionId(null)
setSteps(PADDLE_DIRECT_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
}}
className={`px-3 py-1.5 text-xs font-medium rounded-md transition-colors ${
mode === 'paddle-direct'
? 'bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-200 shadow-sm'
: 'text-gray-500 dark:text-gray-400 hover:text-gray-700 dark:hover:text-gray-300'
}`}
>
PP-OCRv5 Direct (5 Schritte)
</button>
<button
onClick={() => {
if (mode === 'kombi') return
setMode('kombi')
setCurrentStep(0)
setSessionId(null)
setSteps(KOMBI_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
}}
className={`px-3 py-1.5 text-xs font-medium rounded-md transition-colors ${
mode === 'kombi'
? 'bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-200 shadow-sm'
: 'text-gray-500 dark:text-gray-400 hover:text-gray-700 dark:hover:text-gray-300'
}`}
>
Kombi (7 Schritte)
</button>
</div>
<PipelineStepper
steps={steps}
currentStep={currentStep}
onStepClick={handleStepClick}
onReprocess={mode === 'pipeline' && sessionId != null ? reprocessFromStep : undefined}
/>
<div className="min-h-[400px]">{renderStep()}</div>
{/* Ground Truth button bar — visible on last step */}
{showGtButton && (
<div className="sticky bottom-0 bg-white dark:bg-gray-900 border-t dark:border-gray-700 py-3 px-4 -mx-1 flex items-center justify-between rounded-b-xl">
<div className="text-sm text-gray-500 dark:text-gray-400">
{gtMessage && (
<span className={gtMessage.includes('fehlgeschlagen') ? 'text-red-500' : 'text-amber-600 dark:text-amber-400'}>
{gtMessage}
</span>
)}
</div>
<button
onClick={handleMarkGroundTruth}
disabled={gtSaving}
className="px-4 py-2 text-sm bg-amber-600 text-white rounded hover:bg-amber-700 disabled:opacity-50"
>
{gtSaving ? 'Speichere...' : isGroundTruth ? 'Ground Truth aktualisieren' : 'Als Ground Truth markieren'}
</button>
</div>
)}
</div>
)
}

View File

@@ -1,87 +0,0 @@
import type { PipelineStep } from '../ocr-pipeline/types'
// Re-export types used by overlay components
export type {
PipelineStep,
PipelineStepStatus,
SessionListItem,
SessionInfo,
DocumentCategory,
DocumentTypeResult,
OrientationResult,
CropResult,
DeskewResult,
DewarpResult,
RowResult,
RowItem,
GridResult,
GridCell,
OcrWordBox,
WordBbox,
ColumnMeta,
} from '../ocr-pipeline/types'
export { DOCUMENT_CATEGORIES } from '../ocr-pipeline/types'
/**
* 7-step pipeline for full-page overlay reconstruction.
* Skips: Spalten (columns), LLM-Review (Korrektur), Ground-Truth (Validierung)
*/
export const OVERLAY_PIPELINE_STEPS: PipelineStep[] = [
{ id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
{ id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
{ id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
{ id: 'crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
{ id: 'rows', name: 'Zeilen', icon: '📏', status: 'pending' },
{ id: 'words', name: 'Woerter', icon: '🔤', status: 'pending' },
{ id: 'reconstruction', name: 'Overlay', icon: '🏗️', status: 'pending' },
]
/** Map from overlay UI step index to DB step number (1-indexed) */
export const OVERLAY_UI_TO_DB: Record<number, number> = {
0: 2, // orientation
1: 3, // deskew
2: 4, // dewarp
3: 5, // crop
4: 6, // rows (skip columns=6 in DB, rows=7 — but we reuse DB step numbering)
5: 7, // words
6: 9, // reconstruction
}
/**
* 5-step pipeline for Paddle Direct mode.
* Same preprocessing (orient/deskew/dewarp/crop), then PaddleOCR replaces rows+words+overlay.
*/
export const PADDLE_DIRECT_STEPS: PipelineStep[] = [
{ id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
{ id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
{ id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
{ id: 'crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
{ id: 'paddle-direct', name: 'PP-OCRv5 + Overlay', icon: '⚡', status: 'pending' },
]
/**
* 5-step pipeline for Kombi mode (PP-OCRv5 + Tesseract).
* Same preprocessing, then both engines run and results are merged.
*/
export const KOMBI_STEPS: PipelineStep[] = [
{ id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
{ id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
{ id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
{ id: 'crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
{ id: 'kombi', name: 'PP-OCRv5 + Tesseract', icon: '🔀', status: 'pending' },
{ id: 'structure', name: 'Struktur', icon: '🔍', status: 'pending' },
{ id: 'grid-editor', name: 'Tabelle', icon: '📊', status: 'pending' },
]
/** Map from DB step to overlay UI step index */
export function dbStepToOverlayUi(dbStep: number): number {
// DB: 1=start, 2=orient, 3=deskew, 4=dewarp, 5=crop, 6=columns, 7=rows, 8=words, 9=recon, 10=gt
if (dbStep <= 2) return 0 // orientation
if (dbStep === 3) return 1 // deskew
if (dbStep === 4) return 2 // dewarp
if (dbStep === 5) return 3 // crop
if (dbStep <= 7) return 4 // rows (skip columns)
if (dbStep === 8) return 5 // words
return 6 // reconstruction
}

View File

@@ -1,624 +0,0 @@
'use client'
import { useCallback, useEffect, useState } from 'react'
import { PagePurpose } from '@/components/common/PagePurpose'
import { PipelineStepper } from '@/components/ocr-pipeline/PipelineStepper'
import { StepOrientation } from '@/components/ocr-pipeline/StepOrientation'
import { StepCrop } from '@/components/ocr-pipeline/StepCrop'
import { StepDeskew } from '@/components/ocr-pipeline/StepDeskew'
import { StepDewarp } from '@/components/ocr-pipeline/StepDewarp'
import { StepStructureDetection } from '@/components/ocr-pipeline/StepStructureDetection'
import { StepColumnDetection } from '@/components/ocr-pipeline/StepColumnDetection'
import { StepRowDetection } from '@/components/ocr-pipeline/StepRowDetection'
import { StepWordRecognition } from '@/components/ocr-pipeline/StepWordRecognition'
import { StepLlmReview } from '@/components/ocr-pipeline/StepLlmReview'
import { StepReconstruction } from '@/components/ocr-pipeline/StepReconstruction'
import { StepGroundTruth } from '@/components/ocr-pipeline/StepGroundTruth'
import { BoxSessionTabs } from '@/components/ocr-pipeline/BoxSessionTabs'
import { PIPELINE_STEPS, DOCUMENT_CATEGORIES, type PipelineStep, type SessionListItem, type DocumentTypeResult, type DocumentCategory, type SubSession } from './types'
const KLAUSUR_API = '/klausur-api'
export default function OcrPipelinePage() {
const [currentStep, setCurrentStep] = useState(0)
const [sessionId, setSessionId] = useState<string | null>(null)
const [sessionName, setSessionName] = useState<string>('')
const [sessions, setSessions] = useState<SessionListItem[]>([])
const [loadingSessions, setLoadingSessions] = useState(true)
const [editingName, setEditingName] = useState<string | null>(null)
const [editNameValue, setEditNameValue] = useState('')
const [editingCategory, setEditingCategory] = useState<string | null>(null)
const [docTypeResult, setDocTypeResult] = useState<DocumentTypeResult | null>(null)
const [activeCategory, setActiveCategory] = useState<DocumentCategory | undefined>(undefined)
const [subSessions, setSubSessions] = useState<SubSession[]>([])
const [parentSessionId, setParentSessionId] = useState<string | null>(null)
const [steps, setSteps] = useState<PipelineStep[]>(
PIPELINE_STEPS.map((s, i) => ({
...s,
status: i === 0 ? 'active' : 'pending',
})),
)
// Load session list on mount
useEffect(() => {
loadSessions()
}, [])
const loadSessions = async () => {
setLoadingSessions(true)
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
if (res.ok) {
const data = await res.json()
setSessions(data.sessions || [])
}
} catch (e) {
console.error('Failed to load sessions:', e)
} finally {
setLoadingSessions(false)
}
}
const openSession = useCallback(async (sid: string, keepSubSessions?: boolean) => {
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
if (!res.ok) return
const data = await res.json()
setSessionId(sid)
setSessionName(data.name || data.filename || '')
setActiveCategory(data.document_category || undefined)
// Sub-session handling
if (data.sub_sessions && data.sub_sessions.length > 0) {
setSubSessions(data.sub_sessions)
setParentSessionId(sid)
} else if (data.parent_session_id) {
// This is a sub-session — keep parent info but don't reset sub-session list
setParentSessionId(data.parent_session_id)
} else if (!keepSubSessions) {
setSubSessions([])
setParentSessionId(null)
}
// Restore doc type result if available
const savedDocType: DocumentTypeResult | null = data.doc_type_result || null
setDocTypeResult(savedDocType)
// Determine which step to jump to based on current_step
const dbStep = data.current_step || 1
// DB steps: 1=start, 2=orientation, 3=deskew, 4=dewarp, 5=crop, 6=columns, ...
// UI steps are 0-indexed: 0=orientation, 1=deskew, 2=dewarp, 3=crop, 4=columns, ...
let uiStep = Math.max(0, dbStep - 1)
const skipSteps = [...(savedDocType?.skip_steps || [])]
// Sub-sessions: image is already cropped, skip pre-processing steps
// Jump directly to columns (UI step 4) unless already further ahead
const isSubSession = !!data.parent_session_id
const SUB_SESSION_SKIP = ['orientation', 'deskew', 'dewarp', 'crop']
if (isSubSession) {
for (const s of SUB_SESSION_SKIP) {
if (!skipSteps.includes(s)) skipSteps.push(s)
}
if (uiStep < 4) uiStep = 4 // columns step (index 4)
}
setSteps(
PIPELINE_STEPS.map((s, i) => ({
...s,
status: skipSteps.includes(s.id)
? 'skipped'
: i < uiStep ? 'completed' : i === uiStep ? 'active' : 'pending',
})),
)
setCurrentStep(uiStep)
} catch (e) {
console.error('Failed to open session:', e)
}
}, [])
const deleteSession = useCallback(async (sid: string) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, { method: 'DELETE' })
setSessions((prev) => prev.filter((s) => s.id !== sid))
if (sessionId === sid) {
setSessionId(null)
setCurrentStep(0)
setDocTypeResult(null)
setSubSessions([])
setParentSessionId(null)
setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
}
} catch (e) {
console.error('Failed to delete session:', e)
}
}, [sessionId])
const renameSession = useCallback(async (sid: string, newName: string) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name: newName }),
})
setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, name: newName } : s)))
if (sessionId === sid) setSessionName(newName)
} catch (e) {
console.error('Failed to rename session:', e)
}
setEditingName(null)
}, [sessionId])
const updateCategory = useCallback(async (sid: string, category: DocumentCategory) => {
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ document_category: category }),
})
setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, document_category: category } : s)))
if (sessionId === sid) setActiveCategory(category)
} catch (e) {
console.error('Failed to update category:', e)
}
setEditingCategory(null)
}, [sessionId])
const deleteAllSessions = useCallback(async () => {
if (!confirm('Alle Sessions loeschen? Dies kann nicht rueckgaengig gemacht werden.')) return
try {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`, { method: 'DELETE' })
setSessions([])
setSessionId(null)
setCurrentStep(0)
setDocTypeResult(null)
setActiveCategory(undefined)
setSubSessions([])
setParentSessionId(null)
setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
} catch (e) {
console.error('Failed to delete all sessions:', e)
}
}, [])
const handleStepClick = (index: number) => {
if (index <= currentStep || steps[index].status === 'completed') {
setCurrentStep(index)
}
}
const goToStep = (step: number) => {
setCurrentStep(step)
setSteps((prev) =>
prev.map((s, i) => ({
...s,
status: i < step ? 'completed' : i === step ? 'active' : 'pending',
})),
)
}
const handleNext = () => {
if (currentStep >= steps.length - 1) {
// Last step completed
if (parentSessionId && sessionId !== parentSessionId) {
// Sub-session completed — update its status and stay in tab view
setSubSessions((prev) =>
prev.map((s) => s.id === sessionId ? { ...s, status: 'completed', current_step: 10 } : s)
)
// Switch back to parent
handleSessionChange(parentSessionId)
return
}
// Main session: return to session list
setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
setCurrentStep(0)
setSessionId(null)
setSubSessions([])
setParentSessionId(null)
loadSessions()
return
}
// Find the next non-skipped step
const skipSteps = docTypeResult?.skip_steps || []
let nextStep = currentStep + 1
while (nextStep < steps.length && skipSteps.includes(PIPELINE_STEPS[nextStep]?.id)) {
nextStep++
}
if (nextStep >= steps.length) nextStep = steps.length - 1
setSteps((prev) =>
prev.map((s, i) => {
if (i === currentStep) return { ...s, status: 'completed' }
if (i === nextStep) return { ...s, status: 'active' }
// Mark skipped steps between current and next
if (i > currentStep && i < nextStep && skipSteps.includes(PIPELINE_STEPS[i]?.id)) {
return { ...s, status: 'skipped' }
}
return s
}),
)
setCurrentStep(nextStep)
}
const handleOrientationComplete = (sid: string) => {
setSessionId(sid)
// Reload session list to show the new session
loadSessions()
handleNext()
}
const handleCropNext = async () => {
// Auto-detect document type after crop (last image-processing step), then advance
if (sessionId) {
try {
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/detect-type`,
{ method: 'POST' },
)
if (res.ok) {
const data: DocumentTypeResult = await res.json()
setDocTypeResult(data)
// Mark skipped steps immediately
const skipSteps = data.skip_steps || []
if (skipSteps.length > 0) {
setSteps((prev) =>
prev.map((s) =>
skipSteps.includes(s.id) ? { ...s, status: 'skipped' } : s,
),
)
}
}
} catch (e) {
console.error('Doc type detection failed:', e)
// Not critical — continue without it
}
}
handleNext()
}
const handleDocTypeChange = (newDocType: DocumentTypeResult['doc_type']) => {
if (!docTypeResult) return
// Build new skip_steps based on doc type
let skipSteps: string[] = []
if (newDocType === 'full_text') {
skipSteps = ['columns', 'rows']
}
// vocab_table and generic_table: no skips
const updated: DocumentTypeResult = {
...docTypeResult,
doc_type: newDocType,
skip_steps: skipSteps,
pipeline: newDocType === 'full_text' ? 'full_page' : 'cell_first',
}
setDocTypeResult(updated)
// Update step statuses
setSteps((prev) =>
prev.map((s) => {
if (skipSteps.includes(s.id)) return { ...s, status: 'skipped' as const }
if (s.status === 'skipped') return { ...s, status: 'pending' as const }
return s
}),
)
}
const handleNewSession = () => {
setSessionId(null)
setSessionName('')
setCurrentStep(0)
setDocTypeResult(null)
setSubSessions([])
setParentSessionId(null)
setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
}
const handleSessionChange = useCallback((newSessionId: string) => {
openSession(newSessionId, true)
}, [openSession])
const handleBoxSessionsCreated = useCallback((subs: SubSession[]) => {
setSubSessions(subs)
if (sessionId) setParentSessionId(sessionId)
}, [sessionId])
const stepNames: Record<number, string> = {
1: 'Orientierung',
2: 'Begradigung',
3: 'Entzerrung',
4: 'Zuschneiden',
5: 'Spalten',
6: 'Zeilen',
7: 'Woerter',
8: 'Struktur',
9: 'Korrektur',
10: 'Rekonstruktion',
11: 'Validierung',
}
const reprocessFromStep = useCallback(async (uiStep: number) => {
if (!sessionId) return
const dbStep = uiStep + 1 // UI is 0-indexed, DB is 1-indexed
if (!confirm(`Ab Schritt ${dbStep} (${stepNames[dbStep] || '?'}) neu verarbeiten? Nachfolgende Daten werden geloescht.`)) return
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/reprocess`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ from_step: dbStep }),
})
if (!res.ok) {
const data = await res.json().catch(() => ({}))
console.error('Reprocess failed:', data.detail || res.status)
return
}
// Reset UI steps
goToStep(uiStep)
} catch (e) {
console.error('Reprocess error:', e)
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [sessionId, goToStep])
const renderStep = () => {
switch (currentStep) {
case 0:
return <StepOrientation sessionId={sessionId} onNext={handleOrientationComplete} />
case 1:
return <StepDeskew sessionId={sessionId} onNext={handleNext} />
case 2:
return <StepDewarp sessionId={sessionId} onNext={handleNext} />
case 3:
return <StepCrop sessionId={sessionId} onNext={handleCropNext} />
case 4:
return <StepColumnDetection sessionId={sessionId} onNext={handleNext} onBoxSessionsCreated={handleBoxSessionsCreated} />
case 5:
return <StepRowDetection sessionId={sessionId} onNext={handleNext} />
case 6:
return <StepWordRecognition sessionId={sessionId} onNext={handleNext} goToStep={goToStep} />
case 7:
return <StepStructureDetection sessionId={sessionId} onNext={handleNext} />
case 8:
return <StepLlmReview sessionId={sessionId} onNext={handleNext} />
case 9:
return <StepReconstruction sessionId={sessionId} onNext={handleNext} />
case 10:
return <StepGroundTruth sessionId={sessionId} onNext={handleNext} />
default:
return null
}
}
return (
<div className="space-y-6">
<PagePurpose
title="OCR Pipeline"
purpose="Schrittweise Seitenrekonstruktion: Scan begradigen, Spalten erkennen, Woerter lokalisieren und die Seite Wort fuer Wort nachbauen. Ziel: 10 Vokabelseiten fehlerfrei rekonstruieren."
audience={['Entwickler', 'Data Scientists']}
architecture={{
services: ['klausur-service (FastAPI)', 'OpenCV', 'Tesseract'],
databases: ['PostgreSQL Sessions'],
}}
relatedPages={[
{ name: 'OCR Vergleich', href: '/ai/ocr-compare', description: 'Methoden-Vergleich' },
{ name: 'OCR-Labeling', href: '/ai/ocr-labeling', description: 'Trainingsdaten' },
]}
defaultCollapsed
/>
{/* Session List */}
<div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4">
<div className="flex items-center justify-between mb-3">
<h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
Sessions ({sessions.length})
</h3>
<div className="flex gap-2">
{sessions.length > 0 && (
<button
onClick={deleteAllSessions}
className="text-xs px-3 py-1.5 text-red-600 hover:bg-red-50 dark:hover:bg-red-900/20 rounded-lg transition-colors"
title="Alle Sessions loeschen"
>
Alle loeschen
</button>
)}
<button
onClick={handleNewSession}
className="text-xs px-3 py-1.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
>
+ Neue Session
</button>
</div>
</div>
{loadingSessions ? (
<div className="text-sm text-gray-400 py-2">Lade Sessions...</div>
) : sessions.length === 0 ? (
<div className="text-sm text-gray-400 py-2">Noch keine Sessions vorhanden.</div>
) : (
<div className="space-y-1.5 max-h-[320px] overflow-y-auto">
{sessions.map((s) => {
const catInfo = DOCUMENT_CATEGORIES.find(c => c.value === s.document_category)
return (
<div
key={s.id}
className={`relative flex items-start gap-3 px-3 py-2.5 rounded-lg text-sm transition-colors cursor-pointer ${
sessionId === s.id
? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
: 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
}`}
>
{/* Thumbnail */}
<div
className="flex-shrink-0 w-12 h-12 rounded-md overflow-hidden bg-gray-100 dark:bg-gray-700"
onClick={() => openSession(s.id)}
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${s.id}/thumbnail?size=96`}
alt=""
className="w-full h-full object-cover"
loading="lazy"
onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
/>
</div>
{/* Info */}
<div className="flex-1 min-w-0" onClick={() => openSession(s.id)}>
{editingName === s.id ? (
<input
autoFocus
value={editNameValue}
onChange={(e) => setEditNameValue(e.target.value)}
onBlur={() => renameSession(s.id, editNameValue)}
onKeyDown={(e) => {
if (e.key === 'Enter') renameSession(s.id, editNameValue)
if (e.key === 'Escape') setEditingName(null)
}}
onClick={(e) => e.stopPropagation()}
className="w-full px-1 py-0.5 text-sm border rounded dark:bg-gray-700 dark:border-gray-600"
/>
) : (
<div className="truncate font-medium text-gray-700 dark:text-gray-300">
{s.name || s.filename}
</div>
)}
{/* ID row */}
<button
onClick={(e) => {
e.stopPropagation()
navigator.clipboard.writeText(s.id)
const btn = e.currentTarget
btn.textContent = 'Kopiert!'
setTimeout(() => { btn.textContent = `ID: ${s.id.slice(0, 8)}` }, 1500)
}}
className="text-[10px] font-mono text-gray-400 hover:text-teal-500 transition-colors"
title={`Volle ID: ${s.id} — Klick zum Kopieren`}
>
ID: {s.id.slice(0, 8)}
</button>
<div className="text-xs text-gray-400 flex gap-2 mt-0.5">
<span>{new Date(s.created_at).toLocaleDateString('de-DE', { day: '2-digit', month: '2-digit', year: '2-digit', hour: '2-digit', minute: '2-digit' })}</span>
<span>Schritt {s.current_step}: {stepNames[s.current_step] || '?'}</span>
</div>
</div>
{/* Badges */}
<div className="flex flex-col gap-1 items-end flex-shrink-0" onClick={(e) => e.stopPropagation()}>
{/* Category Badge */}
<button
onClick={() => setEditingCategory(editingCategory === s.id ? null : s.id)}
className={`text-[10px] px-1.5 py-0.5 rounded-full border transition-colors ${
catInfo
? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
: 'bg-gray-50 dark:bg-gray-700 border-gray-200 dark:border-gray-600 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300'
}`}
title="Kategorie setzen"
>
{catInfo ? `${catInfo.icon} ${catInfo.label}` : '+ Kategorie'}
</button>
{/* Doc Type Badge (read-only) */}
{s.doc_type && (
<span className="text-[10px] px-1.5 py-0.5 rounded-full bg-gray-100 dark:bg-gray-700 text-gray-500 dark:text-gray-400 border border-gray-200 dark:border-gray-600">
{s.doc_type}
</span>
)}
</div>
{/* Action buttons */}
<div className="flex flex-col gap-0.5 flex-shrink-0">
<button
onClick={(e) => {
e.stopPropagation()
setEditNameValue(s.name || s.filename)
setEditingName(s.id)
}}
className="p-1 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300"
title="Umbenennen"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.232 5.232l3.536 3.536m-2.036-5.036a2.5 2.5 0 113.536 3.536L6.5 21.036H3v-3.572L16.732 3.732z" />
</svg>
</button>
<button
onClick={(e) => {
e.stopPropagation()
if (confirm('Session loeschen?')) deleteSession(s.id)
}}
className="p-1 text-gray-400 hover:text-red-500"
title="Loeschen"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
</svg>
</button>
</div>
{/* Category dropdown (inline) */}
{editingCategory === s.id && (
<div
className="absolute right-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64"
onClick={(e) => e.stopPropagation()}
>
{DOCUMENT_CATEGORIES.map((cat) => (
<button
key={cat.value}
onClick={() => updateCategory(s.id, cat.value)}
className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
s.document_category === cat.value
? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
: 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
}`}
>
{cat.icon} {cat.label}
</button>
))}
</div>
)}
</div>
)
})}
</div>
)}
</div>
{/* Active session info */}
{sessionId && sessionName && (
<div className="flex items-center gap-3 text-sm text-gray-500 dark:text-gray-400">
<span>Aktive Session: <span className="font-medium text-gray-700 dark:text-gray-300">{sessionName}</span></span>
{activeCategory && (() => {
const cat = DOCUMENT_CATEGORIES.find(c => c.value === activeCategory)
return cat ? <span className="text-xs px-2 py-0.5 rounded-full bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300">{cat.icon} {cat.label}</span> : null
})()}
{docTypeResult && (
<span className="text-xs px-2 py-0.5 rounded-full bg-gray-100 dark:bg-gray-700 text-gray-500 dark:text-gray-400 border border-gray-200 dark:border-gray-600">
{docTypeResult.doc_type}
</span>
)}
</div>
)}
<PipelineStepper
steps={steps}
currentStep={currentStep}
onStepClick={handleStepClick}
onReprocess={sessionId ? reprocessFromStep : undefined}
docTypeResult={docTypeResult}
onDocTypeChange={handleDocTypeChange}
/>
{subSessions.length > 0 && parentSessionId && sessionId && (
<BoxSessionTabs
parentSessionId={parentSessionId}
subSessions={subSessions}
activeSessionId={sessionId}
onSessionChange={handleSessionChange}
/>
)}
<div className="min-h-[400px]">{renderStep()}</div>
</div>
)
}

View File

@@ -1,412 +0,0 @@
export type PipelineStepStatus = 'pending' | 'active' | 'completed' | 'failed' | 'skipped'
export interface PipelineStep {
id: string
name: string
icon: string
status: PipelineStepStatus
}
export type DocumentCategory =
| 'vokabelseite' | 'buchseite' | 'arbeitsblatt' | 'klausurseite'
| 'mathearbeit' | 'statistik' | 'zeitung' | 'formular' | 'handschrift' | 'sonstiges'
export const DOCUMENT_CATEGORIES: { value: DocumentCategory; label: string; icon: string }[] = [
{ value: 'vokabelseite', label: 'Vokabelseite', icon: '📖' },
{ value: 'buchseite', label: 'Buchseite', icon: '📚' },
{ value: 'arbeitsblatt', label: 'Arbeitsblatt', icon: '📝' },
{ value: 'klausurseite', label: 'Klausurseite', icon: '📄' },
{ value: 'mathearbeit', label: 'Mathearbeit', icon: '🔢' },
{ value: 'statistik', label: 'Statistik', icon: '📊' },
{ value: 'zeitung', label: 'Zeitung', icon: '📰' },
{ value: 'formular', label: 'Formular', icon: '📋' },
{ value: 'handschrift', label: 'Handschrift', icon: '✍️' },
{ value: 'sonstiges', label: 'Sonstiges', icon: '📎' },
]
export interface SessionListItem {
id: string
name: string
filename: string
status: string
current_step: number
document_category?: DocumentCategory
doc_type?: string
created_at: string
updated_at?: string
parent_session_id?: string | null
box_index?: number | null
}
export interface SubSession {
id: string
name: string
box_index: number
current_step?: number
status?: string
}
export interface PipelineLogEntry {
step: string
completed_at: string
success: boolean
duration_ms?: number
metrics: Record<string, unknown>
}
export interface PipelineLog {
steps: PipelineLogEntry[]
}
export interface DocumentTypeResult {
doc_type: 'vocab_table' | 'full_text' | 'generic_table'
confidence: number
pipeline: 'cell_first' | 'full_page'
skip_steps: string[]
features?: Record<string, unknown>
duration_seconds?: number
}
export interface OrientationResult {
orientation_degrees: number
corrected: boolean
duration_seconds: number
}
export interface CropResult {
crop_applied: boolean
crop_rect?: { x: number; y: number; width: number; height: number }
crop_rect_pct?: { x: number; y: number; width: number; height: number }
original_size: { width: number; height: number }
cropped_size: { width: number; height: number }
detected_format?: string
format_confidence?: number
aspect_ratio?: number
border_fractions?: { top: number; bottom: number; left: number; right: number }
skipped?: boolean
duration_seconds?: number
}
export interface SessionInfo {
session_id: string
filename: string
name?: string
image_width: number
image_height: number
original_image_url: string
current_step?: number
document_category?: DocumentCategory
doc_type?: string
orientation_result?: OrientationResult
crop_result?: CropResult
deskew_result?: DeskewResult
dewarp_result?: DewarpResult
column_result?: ColumnResult
row_result?: RowResult
word_result?: GridResult
doc_type_result?: DocumentTypeResult
sub_sessions?: SubSession[]
parent_session_id?: string
box_index?: number
}
export interface DeskewResult {
session_id: string
angle_hough: number
angle_word_alignment: number
angle_iterative?: number
angle_residual?: number
angle_textline?: number
angle_applied: number
method_used: 'hough' | 'word_alignment' | 'manual' | 'iterative' | 'two_pass' | 'three_pass' | 'manual_combined'
confidence: number
duration_seconds: number
deskewed_image_url: string
binarized_image_url: string
}
export interface DeskewGroundTruth {
is_correct: boolean
corrected_angle?: number
notes?: string
}
export interface DewarpDetection {
method: string
shear_degrees: number
confidence: number
}
export interface DewarpResult {
session_id: string
method_used: string
shear_degrees: number
confidence: number
duration_seconds: number
dewarped_image_url: string
detections?: DewarpDetection[]
}
export interface DewarpGroundTruth {
is_correct: boolean
corrected_shear?: number
notes?: string
}
export interface PageRegion {
type: 'column_en' | 'column_de' | 'column_example' | 'page_ref'
| 'column_marker' | 'column_text' | 'column_ignore' | 'header' | 'footer'
x: number
y: number
width: number
height: number
classification_confidence?: number
classification_method?: string
}
export interface PageZone {
zone_type: 'content' | 'box'
y_start: number
y_end: number
box?: { x: number; y: number; width: number; height: number }
}
export interface ColumnResult {
columns: PageRegion[]
duration_seconds: number
zones?: PageZone[]
}
export interface ColumnGroundTruth {
is_correct: boolean
corrected_columns?: PageRegion[]
notes?: string
}
export interface ManualColumnDivider {
xPercent: number // Position in % of image width (0-100)
}
export type ColumnTypeKey = PageRegion['type']
export interface RowResult {
rows: RowItem[]
summary: Record<string, number>
total_rows: number
duration_seconds: number
}
export interface RowItem {
index: number
x: number
y: number
width: number
height: number
word_count: number
row_type: 'content' | 'header' | 'footer'
gap_before: number
}
export interface RowGroundTruth {
is_correct: boolean
corrected_rows?: RowItem[]
notes?: string
}
export interface StructureGraphic {
x: number
y: number
w: number
h: number
area: number
shape: string // image, illustration
color_name: string
color_hex: string
confidence: number
}
export interface ExcludeRegion {
x: number
y: number
w: number
h: number
label?: string
}
export interface StructureResult {
image_width: number
image_height: number
content_bounds: { x: number; y: number; w: number; h: number }
boxes: StructureBox[]
zones: StructureZone[]
graphics: StructureGraphic[]
exclude_regions?: ExcludeRegion[]
color_pixel_counts: Record<string, number>
has_words: boolean
word_count: number
border_ghosts_removed?: number
duration_seconds: number
}
export interface StructureBox {
x: number
y: number
w: number
h: number
confidence: number
border_thickness: number
bg_color_name?: string
bg_color_hex?: string
}
export interface StructureZone {
index: number
zone_type: 'content' | 'box'
x: number
y: number
w: number
h: number
}
export interface WordBbox {
x: number
y: number
w: number
h: number
}
export interface OcrWordBox {
text: string
left: number // absolute image x in px
top: number // absolute image y in px
width: number // px
height: number // px
conf: number
color?: string // hex color of detected text, e.g. '#dc2626'
color_name?: string // 'black' | 'red' | 'blue' | 'green' | 'orange' | 'purple' | 'yellow'
recovered?: boolean // true if this word was recovered via color detection
}
export interface GridCell {
cell_id: string // "R03_C1"
row_index: number
col_index: number
col_type: string
text: string
confidence: number
bbox_px: WordBbox
bbox_pct: WordBbox
ocr_engine?: string
is_bold?: boolean
status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
word_boxes?: OcrWordBox[] // per-word bounding boxes from OCR engine
}
export interface ColumnMeta {
index: number
type: string
x: number
width: number
}
export interface GridResult {
cells: GridCell[]
grid_shape: { rows: number; cols: number; total_cells: number }
columns_used: ColumnMeta[]
layout: 'vocab' | 'generic'
image_width: number
image_height: number
duration_seconds: number
ocr_engine?: string
vocab_entries?: WordEntry[] // Only when layout='vocab'
entries?: WordEntry[] // Backwards compat alias for vocab_entries
entry_count?: number
summary: {
total_cells: number
non_empty_cells: number
low_confidence: number
// Only when layout='vocab':
total_entries?: number
with_english?: number
with_german?: number
}
llm_review?: {
changes: { row_index: number; field: string; old: string; new: string }[]
model_used: string
duration_ms: number
entries_corrected: number
applied_count?: number
applied_at?: string
}
}
export interface WordEntry {
row_index: number
english: string
german: string
example: string
source_page?: string
marker?: string
confidence: number
bbox: WordBbox
bbox_en: WordBbox | null
bbox_de: WordBbox | null
bbox_ex: WordBbox | null
bbox_ref?: WordBbox | null
bbox_marker?: WordBbox | null
status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
}
/** @deprecated Use GridResult instead */
export interface WordResult {
entries: WordEntry[]
entry_count: number
image_width: number
image_height: number
duration_seconds: number
ocr_engine?: string
summary: {
total_entries: number
with_english: number
with_german: number
low_confidence: number
}
}
export interface WordGroundTruth {
is_correct: boolean
corrected_entries?: WordEntry[]
notes?: string
}
export interface ImageRegion {
bbox_pct: { x: number; y: number; w: number; h: number }
prompt: string
description: string
image_b64: string | null
style: 'educational' | 'cartoon' | 'sketch' | 'clipart' | 'realistic'
}
export type ImageStyle = ImageRegion['style']
export const IMAGE_STYLES: { value: ImageStyle; label: string }[] = [
{ value: 'educational', label: 'Lehrbuch' },
{ value: 'cartoon', label: 'Cartoon' },
{ value: 'sketch', label: 'Skizze' },
{ value: 'clipart', label: 'Clipart' },
{ value: 'realistic', label: 'Realistisch' },
]
export const PIPELINE_STEPS: PipelineStep[] = [
{ id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
{ id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
{ id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
{ id: 'crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
{ id: 'columns', name: 'Spalten', icon: '📊', status: 'pending' },
{ id: 'rows', name: 'Zeilen', icon: '📏', status: 'pending' },
{ id: 'words', name: 'Woerter', icon: '🔤', status: 'pending' },
{ id: 'structure', name: 'Struktur', icon: '🔍', status: 'pending' },
{ id: 'llm-review', name: 'Korrektur', icon: '✏️', status: 'pending' },
{ id: 'reconstruction', name: 'Rekonstruktion', icon: '🏗️', status: 'pending' },
{ id: 'ground-truth', name: 'Validierung', icon: '✅', status: 'pending' },
]

View File

@@ -0,0 +1,403 @@
'use client'
/**
* OCR Regression Dashboard
*
* Shows all ground-truth sessions, runs regression tests,
* displays pass/fail results with diff details, and shows history.
*/
import { useState, useEffect, useCallback } from 'react'
import { PagePurpose } from '@/components/common/PagePurpose'
const KLAUSUR_API = '/klausur-api'
// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------
interface GTSession {
session_id: string
name: string
filename: string
document_category: string | null
pipeline: string | null
saved_at: string | null
summary: {
total_zones: number
total_columns: number
total_rows: number
total_cells: number
}
}
interface DiffSummary {
structural_changes: number
cells_missing: number
cells_added: number
text_changes: number
col_type_changes: number
}
interface RegressionResult {
session_id: string
name: string
status: 'pass' | 'fail' | 'error'
error?: string
diff_summary?: DiffSummary
reference_summary?: Record<string, number>
current_summary?: Record<string, number>
structural_diffs?: Array<{ field: string; reference: number; current: number }>
cell_diffs?: Array<{ type: string; cell_id: string; reference?: string; current?: string }>
}
interface RegressionRun {
id: string
run_at: string
status: string
total: number
passed: number
failed: number
errors: number
duration_ms: number
triggered_by: string
}
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function StatusBadge({ status }: { status: string }) {
const cls =
status === 'pass'
? 'bg-emerald-100 text-emerald-800 border-emerald-200'
: status === 'fail'
? 'bg-red-100 text-red-800 border-red-200'
: 'bg-amber-100 text-amber-800 border-amber-200'
return (
<span className={`inline-flex items-center px-2.5 py-0.5 rounded-full text-xs font-medium border ${cls}`}>
{status === 'pass' ? 'Pass' : status === 'fail' ? 'Fail' : 'Error'}
</span>
)
}
function formatDate(iso: string | null) {
if (!iso) return '—'
return new Date(iso).toLocaleString('de-DE', {
day: '2-digit', month: '2-digit', year: 'numeric',
hour: '2-digit', minute: '2-digit',
})
}
// ---------------------------------------------------------------------------
// Component
// ---------------------------------------------------------------------------
export default function OCRRegressionPage() {
const [sessions, setSessions] = useState<GTSession[]>([])
const [results, setResults] = useState<RegressionResult[]>([])
const [history, setHistory] = useState<RegressionRun[]>([])
const [running, setRunning] = useState(false)
const [overallStatus, setOverallStatus] = useState<string | null>(null)
const [durationMs, setDurationMs] = useState<number | null>(null)
const [expandedSession, setExpandedSession] = useState<string | null>(null)
const [tab, setTab] = useState<'current' | 'history'>('current')
// Load ground-truth sessions
const loadSessions = useCallback(async () => {
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/ground-truth-sessions`)
if (res.ok) {
const data = await res.json()
setSessions(data.sessions || [])
}
} catch (e) {
console.error('Failed to load GT sessions:', e)
}
}, [])
// Load history
const loadHistory = useCallback(async () => {
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/regression/history?limit=20`)
if (res.ok) {
const data = await res.json()
setHistory(data.runs || [])
}
} catch (e) {
console.error('Failed to load history:', e)
}
}, [])
useEffect(() => {
loadSessions()
loadHistory()
}, [loadSessions, loadHistory])
// Run all regressions
const runAll = async () => {
setRunning(true)
setResults([])
setOverallStatus(null)
setDurationMs(null)
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/regression/run?triggered_by=manual`, {
method: 'POST',
})
if (res.ok) {
const data = await res.json()
setResults(data.results || [])
setOverallStatus(data.status)
setDurationMs(data.duration_ms)
loadHistory()
}
} catch (e) {
console.error('Regression run failed:', e)
setOverallStatus('error')
} finally {
setRunning(false)
}
}
const totalPass = results.filter(r => r.status === 'pass').length
const totalFail = results.filter(r => r.status === 'fail').length
const totalError = results.filter(r => r.status === 'error').length
return (
<div className="space-y-6">
<div className="max-w-7xl mx-auto p-6 space-y-6">
<PagePurpose
title="OCR Regression Tests"
purpose="Automatische Regressions-Tests fuer die OCR-Pipeline: Ground-Truth Sessions neu auswerten und gegen Referenz-Ergebnisse vergleichen."
audience={['Entwickler', 'QA']}
defaultCollapsed
architecture={{
services: ['klausur-service (FastAPI, Port 8086)'],
databases: ['PostgreSQL (regression_runs, ocr_pipeline_sessions)'],
}}
relatedPages={[
{ name: 'OCR Pipeline', href: '/ai/ocr-pipeline', description: 'OCR-Pipeline ausfuehren' },
{ name: 'Ground Truth Review', href: '/ai/ocr-ground-truth', description: 'Sessions pruefen & markieren' },
]}
/>
{/* Header + Run Button */}
<div className="flex items-center justify-between">
<div>
<h1 className="text-2xl font-bold text-slate-900">OCR Regression Tests</h1>
<p className="text-sm text-slate-500 mt-1">
{sessions.length} Ground-Truth Session{sessions.length !== 1 ? 's' : ''}
</p>
</div>
<button
onClick={runAll}
disabled={running || sessions.length === 0}
className="inline-flex items-center gap-2 px-4 py-2.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 disabled:opacity-50 disabled:cursor-not-allowed font-medium transition-colors"
>
{running ? (
<>
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
</svg>
Laeuft...
</>
) : (
'Alle Tests starten'
)}
</button>
</div>
{/* Overall Result Banner */}
{overallStatus && (
<div className={`rounded-lg p-4 border ${
overallStatus === 'pass'
? 'bg-emerald-50 border-emerald-200'
: 'bg-red-50 border-red-200'
}`}>
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<StatusBadge status={overallStatus} />
<span className="font-medium text-slate-900">
{totalPass} bestanden, {totalFail} fehlgeschlagen, {totalError} Fehler
</span>
</div>
{durationMs !== null && (
<span className="text-sm text-slate-500">{(durationMs / 1000).toFixed(1)}s</span>
)}
</div>
</div>
)}
{/* Tabs */}
<div className="border-b border-slate-200">
<nav className="flex gap-4">
{(['current', 'history'] as const).map(t => (
<button
key={t}
onClick={() => setTab(t)}
className={`pb-3 px-1 text-sm font-medium border-b-2 transition-colors ${
tab === t
? 'border-teal-500 text-teal-600'
: 'border-transparent text-slate-500 hover:text-slate-700'
}`}
>
{t === 'current' ? 'Aktuelle Ergebnisse' : 'Verlauf'}
</button>
))}
</nav>
</div>
{/* Current Results Tab */}
{tab === 'current' && (
<div className="space-y-3">
{results.length === 0 && !running && (
<div className="text-center py-12 text-slate-400">
<p className="text-lg">Keine Ergebnisse</p>
<p className="text-sm mt-1">Klicken Sie &quot;Alle Tests starten&quot; um die Regression zu laufen.</p>
</div>
)}
{results.map(r => (
<div
key={r.session_id}
className="bg-white rounded-lg border border-slate-200 overflow-hidden"
>
<div
className="flex items-center justify-between px-4 py-3 cursor-pointer hover:bg-slate-50 transition-colors"
onClick={() => setExpandedSession(expandedSession === r.session_id ? null : r.session_id)}
>
<div className="flex items-center gap-3 min-w-0">
<StatusBadge status={r.status} />
<span className="font-medium text-slate-900 truncate">{r.name || r.session_id}</span>
</div>
<div className="flex items-center gap-4 text-sm text-slate-500">
{r.diff_summary && (
<span>
{r.diff_summary.text_changes} Text, {r.diff_summary.structural_changes} Struktur
</span>
)}
{r.error && <span className="text-red-500">{r.error}</span>}
<svg className={`w-4 h-4 transition-transform ${expandedSession === r.session_id ? 'rotate-180' : ''}`} fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M19 9l-7 7-7-7" />
</svg>
</div>
</div>
{/* Expanded Details */}
{expandedSession === r.session_id && r.status === 'fail' && (
<div className="border-t border-slate-100 px-4 py-3 bg-slate-50 space-y-3">
{/* Structural Diffs */}
{r.structural_diffs && r.structural_diffs.length > 0 && (
<div>
<h4 className="text-xs font-medium text-slate-500 uppercase mb-1">Strukturelle Aenderungen</h4>
<div className="space-y-1">
{r.structural_diffs.map((d, i) => (
<div key={i} className="text-sm">
<span className="font-mono text-slate-600">{d.field}</span>: {d.reference} {d.current}
</div>
))}
</div>
</div>
)}
{/* Cell Diffs */}
{r.cell_diffs && r.cell_diffs.length > 0 && (
<div>
<h4 className="text-xs font-medium text-slate-500 uppercase mb-1">
Zellen-Aenderungen ({r.cell_diffs.length})
</h4>
<div className="max-h-60 overflow-y-auto space-y-1">
{r.cell_diffs.slice(0, 50).map((d, i) => (
<div key={i} className="text-sm font-mono bg-white rounded px-2 py-1 border border-slate-100">
<span className={`text-xs px-1 rounded ${
d.type === 'text_change' ? 'bg-amber-100 text-amber-700'
: d.type === 'cell_missing' ? 'bg-red-100 text-red-700'
: 'bg-blue-100 text-blue-700'
}`}>
{d.type}
</span>{' '}
<span className="text-slate-500">{d.cell_id}</span>
{d.reference && (
<>
{' '}<span className="line-through text-red-400">{d.reference}</span>
</>
)}
{d.current && (
<>
{' '}<span className="text-emerald-600">{d.current}</span>
</>
)}
</div>
))}
{r.cell_diffs.length > 50 && (
<p className="text-xs text-slate-400">... und {r.cell_diffs.length - 50} weitere</p>
)}
</div>
</div>
)}
</div>
)}
</div>
))}
{/* Ground Truth Sessions Overview (when no results yet) */}
{results.length === 0 && sessions.length > 0 && (
<div>
<h3 className="text-sm font-medium text-slate-700 mb-2">Ground-Truth Sessions</h3>
<div className="grid gap-2">
{sessions.map(s => (
<div key={s.session_id} className="bg-white rounded-lg border border-slate-200 px-4 py-3 flex items-center justify-between">
<div>
<span className="font-medium text-slate-900">{s.name || s.session_id}</span>
<span className="text-sm text-slate-400 ml-2">{s.filename}</span>
</div>
<div className="text-sm text-slate-500">
{s.summary.total_cells} Zellen, {s.summary.total_zones} Zonen
{s.pipeline && <span className="ml-2 text-xs bg-slate-100 px-1.5 py-0.5 rounded">{s.pipeline}</span>}
</div>
</div>
))}
</div>
</div>
)}
</div>
)}
{/* History Tab */}
{tab === 'history' && (
<div className="space-y-2">
{history.length === 0 ? (
<p className="text-center py-8 text-slate-400">Noch keine Laeufe aufgezeichnet.</p>
) : (
<table className="w-full text-sm">
<thead>
<tr className="border-b border-slate-200 text-left text-slate-500">
<th className="pb-2 font-medium">Datum</th>
<th className="pb-2 font-medium">Status</th>
<th className="pb-2 font-medium text-right">Gesamt</th>
<th className="pb-2 font-medium text-right">Pass</th>
<th className="pb-2 font-medium text-right">Fail</th>
<th className="pb-2 font-medium text-right">Dauer</th>
<th className="pb-2 font-medium">Trigger</th>
</tr>
</thead>
<tbody>
{history.map(run => (
<tr key={run.id} className="border-b border-slate-100 hover:bg-slate-50">
<td className="py-2">{formatDate(run.run_at)}</td>
<td className="py-2"><StatusBadge status={run.status} /></td>
<td className="py-2 text-right">{run.total}</td>
<td className="py-2 text-right text-emerald-600">{run.passed}</td>
<td className="py-2 text-right text-red-600">{run.failed + run.errors}</td>
<td className="py-2 text-right text-slate-500">{(run.duration_ms / 1000).toFixed(1)}s</td>
<td className="py-2 text-slate-400">{run.triggered_by}</td>
</tr>
))}
</tbody>
</table>
)}
</div>
)}
</div>
</div>
)
}

View File

@@ -0,0 +1,252 @@
import { describe, it, expect } from 'vitest'
import ragData from '../rag-documents.json'
/**
* Tests fuer rag-documents.json — Branchen-Regulierungs-Matrix
*
* Validiert die JSON-Struktur, Branchen-Zuordnung und Datenintegritaet
* der 320 Dokumente fuer die RAG Landkarte.
*/
const VALID_INDUSTRY_IDS = ragData.industries.map((i: any) => i.id)
const VALID_DOC_TYPE_IDS = ragData.doc_types.map((dt: any) => dt.id)
describe('rag-documents.json — Struktur', () => {
it('sollte doc_types, industries und documents enthalten', () => {
expect(ragData).toHaveProperty('doc_types')
expect(ragData).toHaveProperty('industries')
expect(ragData).toHaveProperty('documents')
expect(Array.isArray(ragData.doc_types)).toBe(true)
expect(Array.isArray(ragData.industries)).toBe(true)
expect(Array.isArray(ragData.documents)).toBe(true)
})
it('sollte genau 10 Branchen haben (VDMA/VDA/BDI)', () => {
expect(ragData.industries).toHaveLength(10)
const ids = ragData.industries.map((i: any) => i.id)
expect(ids).toContain('automotive')
expect(ids).toContain('maschinenbau')
expect(ids).toContain('elektrotechnik')
expect(ids).toContain('chemie')
expect(ids).toContain('metall')
expect(ids).toContain('energie')
expect(ids).toContain('transport')
expect(ids).toContain('handel')
expect(ids).toContain('konsumgueter')
expect(ids).toContain('bau')
})
it('sollte keine Pseudo-Branchen enthalten (IoT, KI, HR, KRITIS, etc.)', () => {
const ids = ragData.industries.map((i: any) => i.id)
expect(ids).not.toContain('iot')
expect(ids).not.toContain('ai')
expect(ids).not.toContain('hr')
expect(ids).not.toContain('kritis')
expect(ids).not.toContain('ecommerce')
expect(ids).not.toContain('tech')
expect(ids).not.toContain('media')
expect(ids).not.toContain('public')
})
it('sollte 17 Dokumenttypen haben', () => {
expect(ragData.doc_types.length).toBe(17)
})
it('sollte mindestens 300 Dokumente haben', () => {
expect(ragData.documents.length).toBeGreaterThanOrEqual(300)
})
it('sollte jede Branche name und icon haben', () => {
ragData.industries.forEach((ind: any) => {
expect(ind).toHaveProperty('id')
expect(ind).toHaveProperty('name')
expect(ind).toHaveProperty('icon')
expect(ind.name.length).toBeGreaterThan(0)
})
})
it('sollte jeden doc_type mit id, label, icon und sort haben', () => {
ragData.doc_types.forEach((dt: any) => {
expect(dt).toHaveProperty('id')
expect(dt).toHaveProperty('label')
expect(dt).toHaveProperty('icon')
expect(dt).toHaveProperty('sort')
})
})
})
describe('rag-documents.json — Dokument-Validierung', () => {
it('sollte keine doppelten Codes haben', () => {
const codes = ragData.documents.map((d: any) => d.code)
const unique = new Set(codes)
expect(unique.size).toBe(codes.length)
})
it('sollte Pflichtfelder bei jedem Dokument haben', () => {
ragData.documents.forEach((doc: any) => {
expect(doc).toHaveProperty('code')
expect(doc).toHaveProperty('name')
expect(doc).toHaveProperty('doc_type')
expect(doc).toHaveProperty('industries')
expect(doc).toHaveProperty('in_rag')
expect(doc).toHaveProperty('rag_collection')
expect(doc.code.length).toBeGreaterThan(0)
expect(doc.name.length).toBeGreaterThan(0)
expect(Array.isArray(doc.industries)).toBe(true)
})
})
it('sollte nur gueltige doc_type IDs verwenden', () => {
ragData.documents.forEach((doc: any) => {
expect(VALID_DOC_TYPE_IDS).toContain(doc.doc_type)
})
})
it('sollte nur gueltige industry IDs verwenden (oder "all")', () => {
ragData.documents.forEach((doc: any) => {
doc.industries.forEach((ind: string) => {
if (ind !== 'all') {
expect(VALID_INDUSTRY_IDS).toContain(ind)
}
})
})
})
it('sollte gueltige rag_collection Namen verwenden', () => {
const validCollections = [
'bp_compliance_ce',
'bp_compliance_gesetze',
'bp_compliance_datenschutz',
'bp_dsfa_corpus',
'bp_legal_templates',
'bp_compliance_recht',
'bp_nibis_eh',
]
ragData.documents.forEach((doc: any) => {
expect(validCollections).toContain(doc.rag_collection)
})
})
})
describe('rag-documents.json — Branchen-Zuordnungslogik', () => {
const findDoc = (code: string) => ragData.documents.find((d: any) => d.code === code)
describe('Horizontale Regulierungen (alle Branchen)', () => {
const horizontalCodes = [
'GDPR', 'BDSG_FULL', 'EPRIVACY', 'TDDDG', 'AIACT', 'CRA',
'NIS2', 'GPSR', 'PLD', 'EUCSA', 'DATAACT',
]
horizontalCodes.forEach((code) => {
it(`${code} sollte fuer alle Branchen gelten`, () => {
const doc = findDoc(code)
if (doc) {
expect(doc.industries).toContain('all')
}
})
})
})
describe('Sektorspezifische Regulierungen', () => {
it('Maschinenverordnung sollte Maschinenbau, Automotive, Elektrotechnik enthalten', () => {
const doc = findDoc('MACHINERY_REG')
if (doc) {
expect(doc.industries).toContain('maschinenbau')
expect(doc.industries).toContain('automotive')
expect(doc.industries).toContain('elektrotechnik')
expect(doc.industries).not.toContain('all')
}
})
it('ElektroG sollte Elektrotechnik und Automotive enthalten', () => {
const doc = findDoc('DE_ELEKTROG')
if (doc) {
expect(doc.industries).toContain('elektrotechnik')
expect(doc.industries).toContain('automotive')
}
})
it('BattDG sollte Automotive und Elektrotechnik enthalten', () => {
const doc = findDoc('DE_BATTDG')
if (doc) {
expect(doc.industries).toContain('automotive')
expect(doc.industries).toContain('elektrotechnik')
}
})
it('ENISA ICS/SCADA sollte Energie, Maschinenbau, Chemie enthalten', () => {
const doc = findDoc('ENISA_ICS_SCADA')
if (doc) {
expect(doc.industries).toContain('energie')
expect(doc.industries).toContain('maschinenbau')
expect(doc.industries).toContain('chemie')
}
})
})
describe('Nicht zutreffende Regulierungen (Finanz/Medizin/Plattformen)', () => {
const emptyIndustryCodes = ['DORA', 'PSD2', 'MiCA', 'AMLR', 'EHDS', 'DSA', 'DMA', 'MDR']
emptyIndustryCodes.forEach((code) => {
it(`${code} sollte keine Branchen-Zuordnung haben`, () => {
const doc = findDoc(code)
if (doc) {
expect(doc.industries).toHaveLength(0)
}
})
})
})
describe('BSI-TR-03161 (DiGA) sollte nicht zutreffend sein', () => {
['BSI-TR-03161-1', 'BSI-TR-03161-2', 'BSI-TR-03161-3'].forEach((code) => {
it(`${code} sollte keine Branchen-Zuordnung haben`, () => {
const doc = findDoc(code)
if (doc) {
expect(doc.industries).toHaveLength(0)
}
})
})
})
})
describe('rag-documents.json — Applicability Notes', () => {
it('sollte applicability_note bei Dokumenten mit description haben', () => {
const withDescription = ragData.documents.filter((d: any) => d.description)
const withNote = withDescription.filter((d: any) => d.applicability_note)
// Mindestens 90% der Dokumente mit Beschreibung sollten eine Note haben
expect(withNote.length / withDescription.length).toBeGreaterThan(0.9)
})
it('horizontale Regulierungen sollten "alle Branchen" in der Note erwaehnen', () => {
const gdpr = ragData.documents.find((d: any) => d.code === 'GDPR')
if (gdpr?.applicability_note) {
expect(gdpr.applicability_note.toLowerCase()).toContain('alle branchen')
}
})
it('nicht zutreffende sollten "nicht zutreffend" in der Note erwaehnen', () => {
const dora = ragData.documents.find((d: any) => d.code === 'DORA')
if (dora?.applicability_note) {
expect(dora.applicability_note.toLowerCase()).toContain('nicht zutreffend')
}
})
})
describe('rag-documents.json — Dokumenttyp-Verteilung', () => {
it('sollte Dokumente in jedem doc_type haben', () => {
ragData.doc_types.forEach((dt: any) => {
const count = ragData.documents.filter((d: any) => d.doc_type === dt.id).length
expect(count).toBeGreaterThan(0)
})
})
it('sollte EU-Verordnungen als groesste Kategorie haben (mind. 15)', () => {
const euRegs = ragData.documents.filter((d: any) => d.doc_type === 'eu_regulation')
expect(euRegs.length).toBeGreaterThanOrEqual(15)
})
it('sollte EDPB Leitlinien als umfangreichste Kategorie haben (mind. 40)', () => {
const edpb = ragData.documents.filter((d: any) => d.doc_type === 'edpb_guideline')
expect(edpb.length).toBeGreaterThanOrEqual(40)
})
})

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,593 +0,0 @@
'use client'
/**
* Voice Service Admin Page (migrated from website/admin/voice)
*
* Displays:
* - Voice-First Architecture Overview
* - Developer Guide Content
* - Live Voice Demo (embedded from studio-v2)
* - Task State Machine Documentation
* - DSGVO Compliance Information
*/
import { useState } from 'react'
import Link from 'next/link'
import { PagePurpose } from '@/components/common/PagePurpose'
type TabType = 'overview' | 'demo' | 'tasks' | 'intents' | 'dsgvo' | 'api'
// Task State Machine data
const TASK_STATES = [
{ state: 'DRAFT', description: 'Task erstellt, noch nicht verarbeitet', color: 'bg-gray-100 text-gray-800', next: ['QUEUED', 'PAUSED'] },
{ state: 'QUEUED', description: 'In Warteschlange fuer Verarbeitung', color: 'bg-blue-100 text-blue-800', next: ['RUNNING', 'PAUSED'] },
{ state: 'RUNNING', description: 'Wird aktuell verarbeitet', color: 'bg-yellow-100 text-yellow-800', next: ['READY', 'PAUSED'] },
{ state: 'READY', description: 'Fertig, wartet auf User-Bestaetigung', color: 'bg-green-100 text-green-800', next: ['APPROVED', 'REJECTED', 'PAUSED'] },
{ state: 'APPROVED', description: 'Vom User bestaetigt', color: 'bg-emerald-100 text-emerald-800', next: ['COMPLETED'] },
{ state: 'REJECTED', description: 'Vom User abgelehnt', color: 'bg-red-100 text-red-800', next: ['DRAFT'] },
{ state: 'COMPLETED', description: 'Erfolgreich abgeschlossen', color: 'bg-teal-100 text-teal-800', next: [] },
{ state: 'EXPIRED', description: 'TTL ueberschritten', color: 'bg-orange-100 text-orange-800', next: [] },
{ state: 'PAUSED', description: 'Vom User pausiert', color: 'bg-purple-100 text-purple-800', next: ['DRAFT', 'QUEUED', 'RUNNING', 'READY'] },
]
// Intent Types (22 types organized by group)
const INTENT_GROUPS = [
{
group: 'Notizen',
color: 'bg-blue-50 border-blue-200',
intents: [
{ type: 'student_observation', example: 'Notiz zu Max: heute wiederholt gestoert', description: 'Schuelerbeobachtungen' },
{ type: 'reminder', example: 'Erinner mich morgen an Konferenz', description: 'Erinnerungen setzen' },
{ type: 'homework_check', example: '7b Mathe Hausaufgabe kontrollieren', description: 'Hausaufgaben pruefen' },
{ type: 'conference_topic', example: 'Thema Lehrerkonferenz: iPad-Regeln', description: 'Konferenzthemen' },
{ type: 'correction_thought', example: 'Aufgabe 3: haeufiger Fehler erklaeren', description: 'Korrekturgedanken' },
]
},
{
group: 'Content-Generierung',
color: 'bg-green-50 border-green-200',
intents: [
{ type: 'worksheet_generate', example: 'Erstelle 3 Lueckentexte zu Vokabeln', description: 'Arbeitsblaetter erstellen' },
{ type: 'quiz_generate', example: '10-Minuten Vokabeltest mit Loesungen', description: 'Quiz/Tests erstellen' },
{ type: 'quick_activity', example: '10 Minuten Einstieg, 5 Aufgaben', description: 'Schnelle Aktivitaeten' },
{ type: 'differentiation', example: 'Zwei Schwierigkeitsstufen: Basis und Plus', description: 'Differenzierung' },
]
},
{
group: 'Kommunikation',
color: 'bg-yellow-50 border-yellow-200',
intents: [
{ type: 'parent_letter', example: 'Neutraler Elternbrief wegen Stoerungen', description: 'Elternbriefe erstellen' },
{ type: 'class_message', example: 'Nachricht an 8a: Hausaufgaben bis Mittwoch', description: 'Klassennachrichten' },
]
},
{
group: 'Canvas-Editor',
color: 'bg-purple-50 border-purple-200',
intents: [
{ type: 'canvas_edit', example: 'Ueberschriften groesser, Zeilenabstand kleiner', description: 'Formatierung aendern' },
{ type: 'canvas_layout', example: 'Alles auf eine Seite, Drucklayout A4', description: 'Layout anpassen' },
{ type: 'canvas_element', example: 'Kasten fuer Merke hinzufuegen', description: 'Elemente hinzufuegen' },
{ type: 'canvas_image', example: 'Bild 2 nach links, Pfeil auf Aufgabe 3', description: 'Bilder positionieren' },
]
},
{
group: 'RAG & Korrektur',
color: 'bg-pink-50 border-pink-200',
intents: [
{ type: 'operator_checklist', example: 'Operatoren-Checkliste fuer diese Aufgabe', description: 'Operatoren abrufen' },
{ type: 'eh_passage', example: 'Erwartungshorizont-Passage zu diesem Thema', description: 'EH-Passagen suchen' },
{ type: 'feedback_suggestion', example: 'Kurze Feedbackformulierung vorschlagen', description: 'Feedback vorschlagen' },
]
},
{
group: 'Follow-up (TaskOrchestrator)',
color: 'bg-teal-50 border-teal-200',
intents: [
{ type: 'task_summary', example: 'Fasse alle offenen Tasks zusammen', description: 'Task-Uebersicht' },
{ type: 'convert_note', example: 'Mach aus der Notiz von gestern einen Elternbrief', description: 'Notizen konvertieren' },
{ type: 'schedule_reminder', example: 'Erinner mich morgen an das Gespraech mit Max', description: 'Erinnerungen planen' },
]
},
]
// DSGVO Data Categories
const DSGVO_CATEGORIES = [
{ category: 'Audio', processing: 'NUR transient im RAM, NIEMALS persistiert', storage: 'Keine', ttl: '-', icon: '🎤', risk: 'low' },
{ category: 'PII (Schuelernamen)', processing: 'NUR auf Lehrergeraet', storage: 'Client-side', ttl: '-', icon: '👤', risk: 'high' },
{ category: 'Pseudonyme', processing: 'Server erlaubt (student_ref, class_ref)', storage: 'Valkey Cache', ttl: '24h', icon: '🔢', risk: 'low' },
{ category: 'Transkripte', processing: 'NUR verschluesselt (AES-256-GCM)', storage: 'PostgreSQL', ttl: '7 Tage', icon: '📝', risk: 'medium' },
{ category: 'Task States', processing: 'TaskOrchestrator', storage: 'Valkey', ttl: '30 Tage', icon: '📋', risk: 'low' },
{ category: 'Audit Logs', processing: 'Nur truncated IDs, keine PII', storage: 'PostgreSQL', ttl: '90 Tage', icon: '📊', risk: 'low' },
]
// API Endpoints
const API_ENDPOINTS = [
{ method: 'POST', path: '/api/v1/sessions', description: 'Voice Session erstellen' },
{ method: 'GET', path: '/api/v1/sessions/{id}', description: 'Session Status abrufen' },
{ method: 'DELETE', path: '/api/v1/sessions/{id}', description: 'Session beenden' },
{ method: 'GET', path: '/api/v1/sessions/{id}/tasks', description: 'Pending Tasks abrufen' },
{ method: 'POST', path: '/api/v1/tasks', description: 'Task erstellen' },
{ method: 'GET', path: '/api/v1/tasks/{id}', description: 'Task Status abrufen' },
{ method: 'PUT', path: '/api/v1/tasks/{id}/transition', description: 'Task State aendern' },
{ method: 'DELETE', path: '/api/v1/tasks/{id}', description: 'Task loeschen' },
{ method: 'WS', path: '/ws/voice', description: 'Voice Streaming (WebSocket)' },
{ method: 'GET', path: '/health', description: 'Health Check' },
]
export default function VoiceMatrixPage() {
const [activeTab, setActiveTab] = useState<TabType>('overview')
const [demoLoaded, setDemoLoaded] = useState(false)
const tabs = [
{ id: 'overview', name: 'Architektur', icon: '🏗️' },
{ id: 'demo', name: 'Live Demo', icon: '🎤' },
{ id: 'tasks', name: 'Task States', icon: '📋' },
{ id: 'intents', name: 'Intents (22)', icon: '🎯' },
{ id: 'dsgvo', name: 'DSGVO', icon: '🔒' },
{ id: 'api', name: 'API', icon: '🔌' },
]
return (
<div>
{/* Page Purpose */}
<PagePurpose
title="Voice Service"
purpose="Voice-First Interface mit PersonaPlex-7B & TaskOrchestrator. Konfigurieren und testen Sie den Voice-Service fuer Lehrer-Interaktionen per Sprache."
audience={['Entwickler', 'Admins']}
architecture={{
services: ['voice-service (Python, Port 8091)', 'studio-v2 (Next.js)', 'valkey (Cache)'],
databases: ['PostgreSQL', 'Valkey Cache'],
}}
relatedPages={[
{ name: 'Matrix & Jitsi', href: '/communication/matrix', description: 'Kommunikation Monitoring' },
{ name: 'GPU Infrastruktur', href: '/infrastructure/gpu', description: 'GPU fuer Voice-Service' },
]}
collapsible={true}
defaultCollapsed={false}
/>
{/* Quick Links */}
<div className="mb-6 flex flex-wrap gap-3">
<a
href="https://macmini:3001/voice-test"
target="_blank"
rel="noopener noreferrer"
className="flex items-center gap-2 px-4 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
>
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M19 11a7 7 0 01-7 7m0 0a7 7 0 01-7-7m7 7v4m0 0H8m4 0h4m-4-8a3 3 0 01-3-3V5a3 3 0 116 0v6a3 3 0 01-3 3z" />
</svg>
Voice Test (Studio)
</a>
<a
href="https://macmini:8091/health"
target="_blank"
rel="noopener noreferrer"
className="flex items-center gap-2 px-4 py-2 bg-green-100 text-green-700 rounded-lg hover:bg-green-200 transition-colors"
>
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
Health Check
</a>
<Link
href="/development/docs"
className="flex items-center gap-2 px-4 py-2 bg-slate-100 text-slate-700 rounded-lg hover:bg-slate-200 transition-colors"
>
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12h6m-6 4h6m2 5H7a2 2 0 01-2-2V5a2 2 0 012-2h5.586a1 1 0 01.707.293l5.414 5.414a1 1 0 01.293.707V19a2 2 0 01-2 2z" />
</svg>
Developer Docs
</Link>
</div>
{/* Stats Overview */}
<div className="grid grid-cols-2 md:grid-cols-6 gap-4 mb-6">
<div className="bg-white rounded-lg shadow p-4">
<div className="text-3xl font-bold text-teal-600">8091</div>
<div className="text-sm text-slate-500">Port</div>
</div>
<div className="bg-white rounded-lg shadow p-4">
<div className="text-3xl font-bold text-blue-600">22</div>
<div className="text-sm text-slate-500">Task Types</div>
</div>
<div className="bg-white rounded-lg shadow p-4">
<div className="text-3xl font-bold text-purple-600">9</div>
<div className="text-sm text-slate-500">Task States</div>
</div>
<div className="bg-white rounded-lg shadow p-4">
<div className="text-3xl font-bold text-green-600">24kHz</div>
<div className="text-sm text-slate-500">Audio Rate</div>
</div>
<div className="bg-white rounded-lg shadow p-4">
<div className="text-3xl font-bold text-orange-600">80ms</div>
<div className="text-sm text-slate-500">Frame Size</div>
</div>
<div className="bg-white rounded-lg shadow p-4">
<div className="text-3xl font-bold text-red-600">0</div>
<div className="text-sm text-slate-500">Audio Persist</div>
</div>
</div>
{/* Tabs */}
<div className="bg-white rounded-lg shadow mb-6">
<div className="border-b border-slate-200 px-4">
<div className="flex gap-1 overflow-x-auto">
{tabs.map((tab) => (
<button
key={tab.id}
onClick={() => setActiveTab(tab.id as TabType)}
className={`px-4 py-3 text-sm font-medium whitespace-nowrap transition-colors border-b-2 ${
activeTab === tab.id
? 'border-teal-600 text-teal-600'
: 'border-transparent text-slate-500 hover:text-slate-700'
}`}
>
<span className="mr-2">{tab.icon}</span>
{tab.name}
</button>
))}
</div>
</div>
<div className="p-6">
{/* Overview Tab */}
{activeTab === 'overview' && (
<div className="space-y-6">
<h3 className="text-lg font-semibold text-slate-900">Voice-First Architektur</h3>
{/* Architecture Diagram */}
<div className="bg-slate-50 rounded-lg p-6 font-mono text-sm overflow-x-auto">
<pre className="text-slate-700">{`
┌──────────────────────────────────────────────────────────────────┐
│ LEHRERGERAET (PWA / App) │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ VoiceCapture.tsx │ voice-encryption.ts │ voice-api.ts │ │
│ │ Mikrofon │ AES-256-GCM │ WebSocket Client │ │
│ └────────────────────────────────────────────────────────────┘ │
└───────────────────────────┬──────────────────────────────────────┘
│ WebSocket (wss://)
┌──────────────────────────────────────────────────────────────────┐
│ VOICE SERVICE (Port 8091) │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ main.py │ streaming.py │ sessions.py │ tasks.py │ │
│ └────────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ task_orchestrator.py │ intent_router.py │ encryption │ │
│ └────────────────────────────────────────────────────────────┘ │
└───────────────────────────┬──────────────────────────────────────┘
┌──────────────────┼──────────────────┐
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ PersonaPlex-7B │ │ Ollama Fallback │ │ Valkey Cache │
│ (A100 GPU) │ │ (Mac Mini) │ │ (Sessions) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
`}</pre>
</div>
{/* Technology Stack */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-4">
<div className="bg-blue-50 border border-blue-200 rounded-lg p-4">
<h4 className="font-semibold text-blue-800 mb-2">Voice Model (Produktion)</h4>
<p className="text-sm text-blue-700">PersonaPlex-7B (NVIDIA)</p>
<p className="text-xs text-blue-600 mt-1">Full-Duplex Speech-to-Speech</p>
<p className="text-xs text-blue-500">Lizenz: MIT + NVIDIA Open Model</p>
</div>
<div className="bg-green-50 border border-green-200 rounded-lg p-4">
<h4 className="font-semibold text-green-800 mb-2">Agent Orchestration</h4>
<p className="text-sm text-green-700">TaskOrchestrator</p>
<p className="text-xs text-green-600 mt-1">Task State Machine</p>
<p className="text-xs text-green-500">Lizenz: Proprietary</p>
</div>
<div className="bg-purple-50 border border-purple-200 rounded-lg p-4">
<h4 className="font-semibold text-purple-800 mb-2">Audio Codec</h4>
<p className="text-sm text-purple-700">Mimi (24kHz, 80ms)</p>
<p className="text-xs text-purple-600 mt-1">Low-Latency Streaming</p>
<p className="text-xs text-purple-500">Lizenz: MIT</p>
</div>
</div>
{/* Key Files */}
<div>
<h4 className="font-semibold text-slate-800 mb-3">Wichtige Dateien</h4>
<div className="bg-white border border-slate-200 rounded-lg overflow-hidden">
<table className="min-w-full divide-y divide-slate-200">
<thead className="bg-slate-50">
<tr>
<th className="px-4 py-2 text-left text-xs font-medium text-slate-500 uppercase">Datei</th>
<th className="px-4 py-2 text-left text-xs font-medium text-slate-500 uppercase">Beschreibung</th>
</tr>
</thead>
<tbody className="divide-y divide-slate-200">
<tr><td className="px-4 py-2 font-mono text-sm">voice-service/main.py</td><td className="px-4 py-2 text-sm text-slate-600">FastAPI Entry, WebSocket Handler</td></tr>
<tr><td className="px-4 py-2 font-mono text-sm">voice-service/services/task_orchestrator.py</td><td className="px-4 py-2 text-sm text-slate-600">Task State Machine</td></tr>
<tr><td className="px-4 py-2 font-mono text-sm">voice-service/services/intent_router.py</td><td className="px-4 py-2 text-sm text-slate-600">Intent Detection (22 Types)</td></tr>
<tr><td className="px-4 py-2 font-mono text-sm">voice-service/services/encryption_service.py</td><td className="px-4 py-2 text-sm text-slate-600">Namespace Key Management</td></tr>
<tr><td className="px-4 py-2 font-mono text-sm">studio-v2/components/voice/VoiceCapture.tsx</td><td className="px-4 py-2 text-sm text-slate-600">Frontend Mikrofon + Crypto</td></tr>
<tr><td className="px-4 py-2 font-mono text-sm">studio-v2/lib/voice/voice-encryption.ts</td><td className="px-4 py-2 text-sm text-slate-600">AES-256-GCM Client-side</td></tr>
</tbody>
</table>
</div>
</div>
</div>
)}
{/* Demo Tab */}
{activeTab === 'demo' && (
<div className="space-y-4">
<div className="flex items-center justify-between">
<h3 className="text-lg font-semibold text-slate-900">Live Voice Demo</h3>
<a
href="https://macmini:3001/voice-test"
target="_blank"
rel="noopener noreferrer"
className="text-sm text-teal-600 hover:text-teal-700 flex items-center gap-1"
>
In neuem Tab oeffnen
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M10 6H6a2 2 0 00-2 2v10a2 2 0 002 2h10a2 2 0 002-2v-4M14 4h6m0 0v6m0-6L10 14" />
</svg>
</a>
</div>
<div className="bg-slate-100 rounded-lg p-4 text-sm text-slate-600 mb-4">
<p><strong>Hinweis:</strong> Die Demo erfordert, dass der Voice Service (Port 8091) und das Studio-v2 Frontend (Port 3001) laufen.</p>
<code className="block mt-2 bg-slate-200 p-2 rounded">docker compose up -d voice-service && cd studio-v2 && npm run dev</code>
</div>
{/* Embedded Demo */}
<div className="relative bg-slate-900 rounded-lg overflow-hidden" style={{ height: '600px' }}>
{!demoLoaded && (
<div className="absolute inset-0 flex items-center justify-center">
<button
onClick={() => setDemoLoaded(true)}
className="px-6 py-3 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors flex items-center gap-2"
>
<svg className="w-6 h-6" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M14.752 11.168l-3.197-2.132A1 1 0 0010 9.87v4.263a1 1 0 001.555.832l3.197-2.132a1 1 0 000-1.664z" />
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
Voice Demo laden
</button>
</div>
)}
{demoLoaded && (
<iframe
src="https://macmini:3001/voice-test?embed=true"
className="w-full h-full border-0"
title="Voice Demo"
allow="microphone"
/>
)}
</div>
</div>
)}
{/* Task States Tab */}
{activeTab === 'tasks' && (
<div className="space-y-6">
<h3 className="text-lg font-semibold text-slate-900">Task State Machine (TaskOrchestrator)</h3>
{/* State Diagram */}
<div className="bg-slate-50 rounded-lg p-6 font-mono text-sm overflow-x-auto">
<pre className="text-slate-700">{`
DRAFT → QUEUED → RUNNING → READY
┌───────────┴───────────┐
│ │
APPROVED REJECTED
│ │
COMPLETED DRAFT (revision)
Any State → EXPIRED (TTL)
Any State → PAUSED (User Interrupt)
`}</pre>
</div>
{/* States Table */}
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
{TASK_STATES.map((state) => (
<div key={state.state} className={`${state.color} rounded-lg p-4`}>
<div className="font-semibold text-lg">{state.state}</div>
<p className="text-sm mt-1">{state.description}</p>
{state.next.length > 0 && (
<div className="mt-2 text-xs">
<span className="opacity-75">Naechste:</span>{' '}
{state.next.join(', ')}
</div>
)}
</div>
))}
</div>
</div>
)}
{/* Intents Tab */}
{activeTab === 'intents' && (
<div className="space-y-6">
<h3 className="text-lg font-semibold text-slate-900">Intent Types (22 unterstuetzte Typen)</h3>
{INTENT_GROUPS.map((group) => (
<div key={group.group} className={`${group.color} border rounded-lg p-4`}>
<h4 className="font-semibold text-slate-800 mb-3">{group.group}</h4>
<div className="space-y-2">
{group.intents.map((intent) => (
<div key={intent.type} className="bg-white rounded-lg p-3 shadow-sm">
<div className="flex items-start justify-between">
<div>
<code className="text-sm font-mono text-teal-700 bg-teal-50 px-2 py-0.5 rounded">
{intent.type}
</code>
<p className="text-sm text-slate-600 mt-1">{intent.description}</p>
</div>
</div>
<div className="mt-2 text-xs text-slate-500 italic">
Beispiel: &quot;{intent.example}&quot;
</div>
</div>
))}
</div>
</div>
))}
</div>
)}
{/* DSGVO Tab */}
{activeTab === 'dsgvo' && (
<div className="space-y-6">
<h3 className="text-lg font-semibold text-slate-900">DSGVO-Compliance</h3>
{/* Key Principles */}
<div className="bg-green-50 border border-green-200 rounded-lg p-4">
<h4 className="font-semibold text-green-800 mb-2">Kernprinzipien</h4>
<ul className="list-disc list-inside text-sm text-green-700 space-y-1">
<li><strong>Audio NIEMALS persistiert</strong> - Nur transient im RAM</li>
<li><strong>Namespace-Verschluesselung</strong> - Key nur auf Lehrergeraet</li>
<li><strong>Keine Klartext-PII serverseitig</strong> - Nur verschluesselt oder pseudonymisiert</li>
<li><strong>TTL-basierte Auto-Loeschung</strong> - 7/30/90 Tage je nach Kategorie</li>
</ul>
</div>
{/* Data Categories Table */}
<div className="bg-white border border-slate-200 rounded-lg overflow-hidden">
<table className="min-w-full divide-y divide-slate-200">
<thead className="bg-slate-50">
<tr>
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Kategorie</th>
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Verarbeitung</th>
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Speicherort</th>
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">TTL</th>
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Risiko</th>
</tr>
</thead>
<tbody className="divide-y divide-slate-200">
{DSGVO_CATEGORIES.map((cat) => (
<tr key={cat.category}>
<td className="px-4 py-3">
<span className="mr-2">{cat.icon}</span>
<span className="font-medium">{cat.category}</span>
</td>
<td className="px-4 py-3 text-sm text-slate-600">{cat.processing}</td>
<td className="px-4 py-3 text-sm text-slate-600">{cat.storage}</td>
<td className="px-4 py-3 text-sm text-slate-600">{cat.ttl}</td>
<td className="px-4 py-3">
<span className={`px-2 py-1 rounded text-xs font-medium ${
cat.risk === 'low' ? 'bg-green-100 text-green-700' :
cat.risk === 'medium' ? 'bg-yellow-100 text-yellow-700' :
'bg-red-100 text-red-700'
}`}>
{cat.risk.toUpperCase()}
</span>
</td>
</tr>
))}
</tbody>
</table>
</div>
{/* Audit Log Info */}
<div className="bg-slate-50 border border-slate-200 rounded-lg p-4">
<h4 className="font-semibold text-slate-800 mb-2">Audit Logs (ohne PII)</h4>
<div className="grid grid-cols-2 gap-4 text-sm">
<div>
<span className="text-green-600 font-medium">Erlaubt:</span>
<ul className="list-disc list-inside text-slate-600 mt-1">
<li>ref_id (truncated)</li>
<li>content_type</li>
<li>size_bytes</li>
<li>ttl_hours</li>
</ul>
</div>
<div>
<span className="text-red-600 font-medium">Verboten:</span>
<ul className="list-disc list-inside text-slate-600 mt-1">
<li>user_name</li>
<li>content / transcript</li>
<li>email</li>
<li>student_name</li>
</ul>
</div>
</div>
</div>
</div>
)}
{/* API Tab */}
{activeTab === 'api' && (
<div className="space-y-6">
<h3 className="text-lg font-semibold text-slate-900">Voice Service API (Port 8091)</h3>
{/* REST Endpoints */}
<div className="bg-white border border-slate-200 rounded-lg overflow-hidden">
<table className="min-w-full divide-y divide-slate-200">
<thead className="bg-slate-50">
<tr>
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Methode</th>
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Endpoint</th>
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Beschreibung</th>
</tr>
</thead>
<tbody className="divide-y divide-slate-200">
{API_ENDPOINTS.map((ep, idx) => (
<tr key={idx}>
<td className="px-4 py-3">
<span className={`px-2 py-1 rounded text-xs font-medium ${
ep.method === 'GET' ? 'bg-green-100 text-green-700' :
ep.method === 'POST' ? 'bg-blue-100 text-blue-700' :
ep.method === 'PUT' ? 'bg-yellow-100 text-yellow-700' :
ep.method === 'DELETE' ? 'bg-red-100 text-red-700' :
'bg-purple-100 text-purple-700'
}`}>
{ep.method}
</span>
</td>
<td className="px-4 py-3 font-mono text-sm">{ep.path}</td>
<td className="px-4 py-3 text-sm text-slate-600">{ep.description}</td>
</tr>
))}
</tbody>
</table>
</div>
{/* WebSocket Protocol */}
<div className="bg-slate-50 rounded-lg p-4">
<h4 className="font-semibold text-slate-800 mb-3">WebSocket Protocol</h4>
<div className="grid grid-cols-1 md:grid-cols-2 gap-4 text-sm">
<div className="bg-white rounded-lg p-3 border border-slate-200">
<div className="font-medium text-slate-700 mb-2">Client Server</div>
<ul className="list-disc list-inside text-slate-600 space-y-1">
<li><code className="bg-slate-100 px-1 rounded">Binary</code>: Int16 PCM Audio (24kHz, 80ms)</li>
<li><code className="bg-slate-100 px-1 rounded">JSON</code>: {`{type: "config|end_turn|interrupt"}`}</li>
</ul>
</div>
<div className="bg-white rounded-lg p-3 border border-slate-200">
<div className="font-medium text-slate-700 mb-2">Server Client</div>
<ul className="list-disc list-inside text-slate-600 space-y-1">
<li><code className="bg-slate-100 px-1 rounded">Binary</code>: Audio Response (base64)</li>
<li><code className="bg-slate-100 px-1 rounded">JSON</code>: {`{type: "transcript|intent|status|error"}`}</li>
</ul>
</div>
</div>
</div>
{/* Example curl commands */}
<div className="bg-slate-900 rounded-lg p-4 text-sm">
<h4 className="font-semibold text-slate-300 mb-3">Beispiel: Session erstellen</h4>
<pre className="text-green-400 overflow-x-auto">{`curl -X POST https://macmini:8091/api/v1/sessions \\
-H "Content-Type: application/json" \\
-d '{
"namespace_id": "ns-12345678abcdef12345678abcdef12",
"key_hash": "sha256:dGVzdGtleWhhc2h0ZXN0a2V5aGFzaHRlc3Q=",
"device_type": "pwa"
}'`}</pre>
</div>
</div>
)}
</div>
</div>
</div>
)
}

View File

@@ -1,635 +0,0 @@
'use client'
/**
* Video & Chat Admin Page
*
* Matrix & Jitsi Monitoring Dashboard
* Provides system statistics, active calls, user metrics, and service health
* Migrated from website/app/admin/communication
*/
import { useEffect, useState, useCallback } from 'react'
import Link from 'next/link'
import { PagePurpose } from '@/components/common/PagePurpose'
import { getModuleByHref } from '@/lib/navigation'
interface MatrixStats {
total_users: number
active_users: number
total_rooms: number
active_rooms: number
messages_today: number
messages_this_week: number
status: 'online' | 'offline' | 'degraded'
}
interface JitsiStats {
active_meetings: number
total_participants: number
meetings_today: number
average_duration_minutes: number
peak_concurrent_users: number
total_minutes_today: number
status: 'online' | 'offline' | 'degraded'
}
interface TrafficStats {
matrix: {
bandwidth_in_mb: number
bandwidth_out_mb: number
messages_per_minute: number
media_uploads_today: number
media_size_mb: number
}
jitsi: {
bandwidth_in_mb: number
bandwidth_out_mb: number
video_streams_active: number
audio_streams_active: number
estimated_hourly_gb: number
}
total: {
bandwidth_in_mb: number
bandwidth_out_mb: number
estimated_monthly_gb: number
}
}
interface CommunicationStats {
matrix: MatrixStats
jitsi: JitsiStats
traffic?: TrafficStats
last_updated: string
}
interface ActiveMeeting {
room_name: string
display_name: string
participants: number
started_at: string
duration_minutes: number
}
interface RecentRoom {
room_id: string
name: string
member_count: number
last_activity: string
room_type: 'class' | 'parent' | 'staff' | 'general'
}
export default function VideoChatPage() {
const [stats, setStats] = useState<CommunicationStats | null>(null)
const [activeMeetings, setActiveMeetings] = useState<ActiveMeeting[]>([])
const [recentRooms, setRecentRooms] = useState<RecentRoom[]>([])
const [loading, setLoading] = useState(true)
const [error, setError] = useState<string | null>(null)
const moduleInfo = getModuleByHref('/communication/video-chat')
// Use local API proxy
const fetchStats = useCallback(async () => {
try {
const response = await fetch('/api/admin/communication/stats')
if (!response.ok) {
throw new Error(`HTTP ${response.status}`)
}
const data = await response.json()
setStats(data)
setActiveMeetings(data.active_meetings || [])
setRecentRooms(data.recent_rooms || [])
setError(null)
} catch (err) {
setError(err instanceof Error ? err.message : 'Verbindungsfehler')
// Set mock data for display purposes when API unavailable
setStats({
matrix: {
total_users: 0,
active_users: 0,
total_rooms: 0,
active_rooms: 0,
messages_today: 0,
messages_this_week: 0,
status: 'offline'
},
jitsi: {
active_meetings: 0,
total_participants: 0,
meetings_today: 0,
average_duration_minutes: 0,
peak_concurrent_users: 0,
total_minutes_today: 0,
status: 'offline'
},
last_updated: new Date().toISOString()
})
} finally {
setLoading(false)
}
}, [])
useEffect(() => {
fetchStats()
}, [fetchStats])
// Auto-refresh every 15 seconds
useEffect(() => {
const interval = setInterval(fetchStats, 15000)
return () => clearInterval(interval)
}, [fetchStats])
const getStatusBadge = (status: string) => {
const baseClasses = 'px-3 py-1 rounded-full text-xs font-semibold uppercase'
switch (status) {
case 'online':
return `${baseClasses} bg-green-100 text-green-800`
case 'degraded':
return `${baseClasses} bg-yellow-100 text-yellow-800`
case 'offline':
return `${baseClasses} bg-red-100 text-red-800`
default:
return `${baseClasses} bg-slate-100 text-slate-600`
}
}
const getRoomTypeBadge = (type: string) => {
const baseClasses = 'px-2 py-0.5 rounded text-xs font-medium'
switch (type) {
case 'class':
return `${baseClasses} bg-blue-100 text-blue-700`
case 'parent':
return `${baseClasses} bg-purple-100 text-purple-700`
case 'staff':
return `${baseClasses} bg-orange-100 text-orange-700`
default:
return `${baseClasses} bg-slate-100 text-slate-600`
}
}
const formatDuration = (minutes: number) => {
if (minutes < 60) return `${Math.round(minutes)} Min.`
const hours = Math.floor(minutes / 60)
const mins = Math.round(minutes % 60)
return `${hours}h ${mins}m`
}
const formatTimeAgo = (dateStr: string) => {
const date = new Date(dateStr)
const now = new Date()
const diffMs = now.getTime() - date.getTime()
const diffMins = Math.floor(diffMs / 60000)
if (diffMins < 1) return 'gerade eben'
if (diffMins < 60) return `vor ${diffMins} Min.`
if (diffMins < 1440) return `vor ${Math.floor(diffMins / 60)} Std.`
return `vor ${Math.floor(diffMins / 1440)} Tagen`
}
// Traffic estimation helpers for SysEleven planning
const calculateEstimatedTraffic = (direction: 'in' | 'out'): number => {
const messages = stats?.matrix?.messages_today || 0
const callMinutes = stats?.jitsi?.total_minutes_today || 0
const participants = stats?.jitsi?.total_participants || 0
const messageTrafficMB = messages * 0.002
const videoTrafficMB = callMinutes * participants * 0.011
if (direction === 'in') {
return messageTrafficMB * 0.3 + videoTrafficMB * 0.4
}
return messageTrafficMB * 0.7 + videoTrafficMB * 0.6
}
const calculateHourlyEstimate = (): number => {
const activeParticipants = stats?.jitsi?.total_participants || 0
return activeParticipants * 0.675
}
const calculateMonthlyEstimate = (): number => {
const dailyCallMinutes = stats?.jitsi?.total_minutes_today || 0
const avgParticipants = stats?.jitsi?.peak_concurrent_users || 1
const monthlyMinutes = dailyCallMinutes * 22
return (monthlyMinutes * avgParticipants * 11) / 1024
}
const getResourceRecommendation = (): string => {
const peakUsers = stats?.jitsi?.peak_concurrent_users || 0
const monthlyGB = calculateMonthlyEstimate()
if (monthlyGB < 10 || peakUsers < 5) {
return 'Starter (1 vCPU, 2GB RAM, 100GB Traffic)'
} else if (monthlyGB < 50 || peakUsers < 20) {
return 'Standard (2 vCPU, 4GB RAM, 500GB Traffic)'
} else if (monthlyGB < 200 || peakUsers < 50) {
return 'Professional (4 vCPU, 8GB RAM, 2TB Traffic)'
} else {
return 'Enterprise (8+ vCPU, 16GB+ RAM, Unlimited Traffic)'
}
}
return (
<div>
{/* Page Purpose */}
<PagePurpose
title={moduleInfo?.module.name || 'Video & Chat'}
purpose={moduleInfo?.module.purpose || 'Matrix & Jitsi Monitoring Dashboard'}
audience={moduleInfo?.module.audience || ['Admins', 'DevOps']}
architecture={{
services: ['synapse (Matrix)', 'jitsi-meet', 'prosody', 'jvb'],
databases: ['PostgreSQL', 'synapse-db'],
}}
collapsible={true}
defaultCollapsed={true}
/>
{/* Quick Actions */}
<div className="flex gap-3 mb-6">
<Link
href="/communication/video-chat/wizard"
className="px-4 py-2 bg-green-600 text-white rounded-lg hover:bg-green-700 transition-colors text-sm font-medium"
>
Test Wizard starten
</Link>
<button
onClick={fetchStats}
disabled={loading}
className="px-4 py-2 border border-slate-300 rounded-lg hover:bg-slate-50 disabled:opacity-50 text-sm"
>
{loading ? 'Lade...' : 'Aktualisieren'}
</button>
</div>
{/* Service Status Overview */}
<div className="grid grid-cols-1 md:grid-cols-2 gap-6 mb-6">
{/* Matrix Status Card */}
<div className="bg-white rounded-xl border border-slate-200 p-6">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-purple-100 rounded-lg flex items-center justify-center">
<svg className="w-6 h-6 text-purple-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M8 12h.01M12 12h.01M16 12h.01M21 12c0 4.418-4.03 8-9 8a9.863 9.863 0 01-4.255-.949L3 20l1.395-3.72C3.512 15.042 3 13.574 3 12c0-4.418 4.03-8 9-8s9 3.582 9 8z" />
</svg>
</div>
<div>
<h3 className="font-semibold text-slate-900">Matrix (Synapse)</h3>
<p className="text-sm text-slate-500">E2EE Messaging</p>
</div>
</div>
<span className={getStatusBadge(stats?.matrix.status || 'offline')}>
{stats?.matrix.status || 'offline'}
</span>
</div>
<div className="grid grid-cols-3 gap-4">
<div>
<div className="text-2xl font-bold text-slate-900">{stats?.matrix.total_users || 0}</div>
<div className="text-xs text-slate-500">Benutzer</div>
</div>
<div>
<div className="text-2xl font-bold text-slate-900">{stats?.matrix.active_users || 0}</div>
<div className="text-xs text-slate-500">Aktiv</div>
</div>
<div>
<div className="text-2xl font-bold text-slate-900">{stats?.matrix.total_rooms || 0}</div>
<div className="text-xs text-slate-500">Raeume</div>
</div>
</div>
<div className="mt-4 pt-4 border-t border-slate-100">
<div className="flex justify-between text-sm">
<span className="text-slate-500">Nachrichten heute</span>
<span className="font-medium">{stats?.matrix.messages_today || 0}</span>
</div>
<div className="flex justify-between text-sm mt-1">
<span className="text-slate-500">Diese Woche</span>
<span className="font-medium">{stats?.matrix.messages_this_week || 0}</span>
</div>
</div>
</div>
{/* Jitsi Status Card */}
<div className="bg-white rounded-xl border border-slate-200 p-6">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-blue-100 rounded-lg flex items-center justify-center">
<svg className="w-6 h-6 text-blue-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M15 10l4.553-2.276A1 1 0 0121 8.618v6.764a1 1 0 01-1.447.894L15 14M5 18h8a2 2 0 002-2V8a2 2 0 00-2-2H5a2 2 0 00-2 2v8a2 2 0 002 2z" />
</svg>
</div>
<div>
<h3 className="font-semibold text-slate-900">Jitsi Meet</h3>
<p className="text-sm text-slate-500">Videokonferenzen</p>
</div>
</div>
<span className={getStatusBadge(stats?.jitsi.status || 'offline')}>
{stats?.jitsi.status || 'offline'}
</span>
</div>
<div className="grid grid-cols-3 gap-4">
<div>
<div className="text-2xl font-bold text-green-600">{stats?.jitsi.active_meetings || 0}</div>
<div className="text-xs text-slate-500">Live Calls</div>
</div>
<div>
<div className="text-2xl font-bold text-slate-900">{stats?.jitsi.total_participants || 0}</div>
<div className="text-xs text-slate-500">Teilnehmer</div>
</div>
<div>
<div className="text-2xl font-bold text-slate-900">{stats?.jitsi.meetings_today || 0}</div>
<div className="text-xs text-slate-500">Calls heute</div>
</div>
</div>
<div className="mt-4 pt-4 border-t border-slate-100">
<div className="flex justify-between text-sm">
<span className="text-slate-500">Durchschnittliche Dauer</span>
<span className="font-medium">{formatDuration(stats?.jitsi.average_duration_minutes || 0)}</span>
</div>
<div className="flex justify-between text-sm mt-1">
<span className="text-slate-500">Peak gleichzeitig</span>
<span className="font-medium">{stats?.jitsi.peak_concurrent_users || 0} Nutzer</span>
</div>
</div>
</div>
</div>
{/* Traffic & Bandwidth Statistics */}
<div className="bg-white rounded-xl border border-slate-200 p-6 mb-6">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-emerald-100 rounded-lg flex items-center justify-center">
<svg className="w-6 h-6 text-emerald-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 7h8m0 0v8m0-8l-8 8-4-4-6 6" />
</svg>
</div>
<div>
<h3 className="font-semibold text-slate-900">Traffic & Bandbreite</h3>
<p className="text-sm text-slate-500">SysEleven Ressourcenplanung</p>
</div>
</div>
<span className="px-3 py-1 rounded-full text-xs font-semibold uppercase bg-emerald-100 text-emerald-800">
Live
</span>
</div>
<div className="grid grid-cols-2 md:grid-cols-4 gap-4 mb-4">
<div className="bg-slate-50 rounded-lg p-4">
<div className="text-xs text-slate-500 mb-1">Eingehend (heute)</div>
<div className="text-2xl font-bold text-slate-900">
{stats?.traffic?.total?.bandwidth_in_mb?.toFixed(1) || calculateEstimatedTraffic('in').toFixed(1)} MB
</div>
</div>
<div className="bg-slate-50 rounded-lg p-4">
<div className="text-xs text-slate-500 mb-1">Ausgehend (heute)</div>
<div className="text-2xl font-bold text-slate-900">
{stats?.traffic?.total?.bandwidth_out_mb?.toFixed(1) || calculateEstimatedTraffic('out').toFixed(1)} MB
</div>
</div>
<div className="bg-slate-50 rounded-lg p-4">
<div className="text-xs text-slate-500 mb-1">Geschaetzt/Stunde</div>
<div className="text-2xl font-bold text-blue-600">
{stats?.traffic?.jitsi?.estimated_hourly_gb?.toFixed(2) || calculateHourlyEstimate().toFixed(2)} GB
</div>
</div>
<div className="bg-slate-50 rounded-lg p-4">
<div className="text-xs text-slate-500 mb-1">Geschaetzt/Monat</div>
<div className="text-2xl font-bold text-emerald-600">
{stats?.traffic?.total?.estimated_monthly_gb?.toFixed(1) || calculateMonthlyEstimate().toFixed(1)} GB
</div>
</div>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 gap-4">
{/* Matrix Traffic */}
<div className="border border-slate-200 rounded-lg p-4">
<div className="flex items-center gap-2 mb-3">
<div className="w-3 h-3 bg-purple-500 rounded-full"></div>
<span className="text-sm font-medium text-slate-700">Matrix Messaging</span>
</div>
<div className="space-y-2 text-sm">
<div className="flex justify-between">
<span className="text-slate-500">Nachrichten/Min</span>
<span className="font-medium">{stats?.traffic?.matrix?.messages_per_minute || Math.round((stats?.matrix?.messages_today || 0) / (new Date().getHours() || 1) / 60)}</span>
</div>
<div className="flex justify-between">
<span className="text-slate-500">Media Uploads heute</span>
<span className="font-medium">{stats?.traffic?.matrix?.media_uploads_today || 0}</span>
</div>
<div className="flex justify-between">
<span className="text-slate-500">Media Groesse</span>
<span className="font-medium">{stats?.traffic?.matrix?.media_size_mb?.toFixed(1) || '0.0'} MB</span>
</div>
</div>
</div>
{/* Jitsi Traffic */}
<div className="border border-slate-200 rounded-lg p-4">
<div className="flex items-center gap-2 mb-3">
<div className="w-3 h-3 bg-blue-500 rounded-full"></div>
<span className="text-sm font-medium text-slate-700">Jitsi Video</span>
</div>
<div className="space-y-2 text-sm">
<div className="flex justify-between">
<span className="text-slate-500">Video Streams aktiv</span>
<span className="font-medium">{stats?.traffic?.jitsi?.video_streams_active || (stats?.jitsi?.total_participants || 0)}</span>
</div>
<div className="flex justify-between">
<span className="text-slate-500">Audio Streams aktiv</span>
<span className="font-medium">{stats?.traffic?.jitsi?.audio_streams_active || (stats?.jitsi?.total_participants || 0)}</span>
</div>
<div className="flex justify-between">
<span className="text-slate-500">Bitrate geschaetzt</span>
<span className="font-medium">{((stats?.jitsi?.total_participants || 0) * 1.5).toFixed(1)} Mbps</span>
</div>
</div>
</div>
</div>
{/* SysEleven Recommendation */}
<div className="mt-4 p-4 bg-emerald-50 border border-emerald-200 rounded-lg">
<h4 className="text-sm font-semibold text-emerald-800 mb-2">SysEleven Empfehlung</h4>
<div className="text-sm text-emerald-700">
<p>Basierend auf aktuellem Traffic: <strong>{getResourceRecommendation()}</strong></p>
<p className="mt-1 text-xs text-emerald-600">
Peak Teilnehmer: {stats?.jitsi?.peak_concurrent_users || 0} |
Durchschnittliche Call-Dauer: {stats?.jitsi?.average_duration_minutes?.toFixed(0) || 0} Min. |
Calls heute: {stats?.jitsi?.meetings_today || 0}
</p>
</div>
</div>
</div>
{/* Active Meetings */}
<div className="bg-white rounded-xl border border-slate-200 p-6 mb-6">
<div className="flex items-center justify-between mb-4">
<h3 className="font-semibold text-slate-900">Aktive Meetings</h3>
</div>
{activeMeetings.length === 0 ? (
<div className="text-center py-8 text-slate-500">
<svg className="w-12 h-12 mx-auto mb-3 text-slate-300" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M15 10l4.553-2.276A1 1 0 0121 8.618v6.764a1 1 0 01-1.447.894L15 14M5 18h8a2 2 0 002-2V8a2 2 0 00-2-2H5a2 2 0 00-2 2v8a2 2 0 002 2z" />
</svg>
<p>Keine aktiven Meetings</p>
</div>
) : (
<div className="overflow-x-auto">
<table className="w-full">
<thead>
<tr className="text-left text-xs text-slate-500 uppercase border-b border-slate-200">
<th className="pb-3 pr-4">Meeting</th>
<th className="pb-3 pr-4">Teilnehmer</th>
<th className="pb-3 pr-4">Gestartet</th>
<th className="pb-3">Dauer</th>
</tr>
</thead>
<tbody className="divide-y divide-slate-100">
{activeMeetings.map((meeting, idx) => (
<tr key={idx} className="text-sm">
<td className="py-3 pr-4">
<div className="font-medium text-slate-900">{meeting.display_name}</div>
<div className="text-xs text-slate-500">{meeting.room_name}</div>
</td>
<td className="py-3 pr-4">
<span className="inline-flex items-center gap-1">
<span className="w-2 h-2 bg-green-500 rounded-full animate-pulse" />
{meeting.participants}
</span>
</td>
<td className="py-3 pr-4 text-slate-500">{formatTimeAgo(meeting.started_at)}</td>
<td className="py-3 font-medium">{formatDuration(meeting.duration_minutes)}</td>
</tr>
))}
</tbody>
</table>
</div>
)}
</div>
{/* Recent Chat Rooms & Usage Stats */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6 mb-6">
<div className="bg-white rounded-xl border border-slate-200 p-6">
<h3 className="font-semibold text-slate-900 mb-4">Aktive Chat-Raeume</h3>
{recentRooms.length === 0 ? (
<div className="text-center py-6 text-slate-500">
<p>Keine aktiven Raeume</p>
</div>
) : (
<div className="space-y-3">
{recentRooms.slice(0, 5).map((room, idx) => (
<div key={idx} className="flex items-center justify-between p-3 bg-slate-50 rounded-lg">
<div className="flex items-center gap-3">
<div className="w-8 h-8 bg-slate-200 rounded-lg flex items-center justify-center">
<svg className="w-4 h-4 text-slate-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M17 20h5v-2a3 3 0 00-5.356-1.857M17 20H7m10 0v-2c0-.656-.126-1.283-.356-1.857M7 20H2v-2a3 3 0 015.356-1.857M7 20v-2c0-.656.126-1.283.356-1.857m0 0a5.002 5.002 0 019.288 0M15 7a3 3 0 11-6 0 3 3 0 016 0z" />
</svg>
</div>
<div>
<div className="font-medium text-slate-900 text-sm">{room.name}</div>
<div className="text-xs text-slate-500">{room.member_count} Mitglieder</div>
</div>
</div>
<div className="flex items-center gap-2">
<span className={getRoomTypeBadge(room.room_type)}>{room.room_type}</span>
<span className="text-xs text-slate-400">{formatTimeAgo(room.last_activity)}</span>
</div>
</div>
))}
</div>
)}
</div>
{/* Usage Statistics */}
<div className="bg-white rounded-xl border border-slate-200 p-6">
<h3 className="font-semibold text-slate-900 mb-4">Nutzungsstatistiken</h3>
<div className="space-y-4">
<div>
<div className="flex justify-between text-sm mb-1">
<span className="text-slate-600">Call-Minuten heute</span>
<span className="font-semibold">{stats?.jitsi.total_minutes_today || 0} Min.</span>
</div>
<div className="w-full bg-slate-100 rounded-full h-2">
<div
className="bg-blue-600 h-2 rounded-full transition-all"
style={{ width: `${Math.min((stats?.jitsi.total_minutes_today || 0) / 500 * 100, 100)}%` }}
/>
</div>
</div>
<div>
<div className="flex justify-between text-sm mb-1">
<span className="text-slate-600">Aktive Chat-Raeume</span>
<span className="font-semibold">{stats?.matrix.active_rooms || 0} / {stats?.matrix.total_rooms || 0}</span>
</div>
<div className="w-full bg-slate-100 rounded-full h-2">
<div
className="bg-purple-600 h-2 rounded-full transition-all"
style={{ width: `${stats?.matrix.total_rooms ? ((stats.matrix.active_rooms / stats.matrix.total_rooms) * 100) : 0}%` }}
/>
</div>
</div>
<div>
<div className="flex justify-between text-sm mb-1">
<span className="text-slate-600">Aktive Nutzer</span>
<span className="font-semibold">{stats?.matrix.active_users || 0} / {stats?.matrix.total_users || 0}</span>
</div>
<div className="w-full bg-slate-100 rounded-full h-2">
<div
className="bg-green-600 h-2 rounded-full transition-all"
style={{ width: `${stats?.matrix.total_users ? ((stats.matrix.active_users / stats.matrix.total_users) * 100) : 0}%` }}
/>
</div>
</div>
</div>
{/* Quick Actions */}
<div className="mt-6 pt-4 border-t border-slate-100">
<h4 className="text-sm font-medium text-slate-700 mb-3">Schnellaktionen</h4>
<div className="flex flex-wrap gap-2">
<a
href="http://localhost:8448/_synapse/admin"
target="_blank"
rel="noopener noreferrer"
className="px-3 py-1.5 text-sm bg-purple-100 text-purple-700 rounded-lg hover:bg-purple-200 transition-colors"
>
Synapse Admin
</a>
<a
href="http://localhost:8443"
target="_blank"
rel="noopener noreferrer"
className="px-3 py-1.5 text-sm bg-blue-100 text-blue-700 rounded-lg hover:bg-blue-200 transition-colors"
>
Jitsi Meet
</a>
</div>
</div>
</div>
</div>
{/* Connection Info */}
<div className="bg-blue-50 border border-blue-200 rounded-xl p-4">
<div className="flex gap-3">
<svg className="w-5 h-5 text-blue-600 flex-shrink-0 mt-0.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
<div>
<h4 className="font-semibold text-blue-900">Service Konfiguration</h4>
<p className="text-sm text-blue-800 mt-1">
<strong>Matrix Homeserver:</strong> http://localhost:8448 (Synapse)<br />
<strong>Jitsi Meet:</strong> http://localhost:8443<br />
<strong>Auto-Refresh:</strong> Alle 15 Sekunden
</p>
{error && (
<p className="text-sm text-red-600 mt-2">
<strong>Fehler:</strong> {error} - Backend nicht erreichbar
</p>
)}
{stats?.last_updated && (
<p className="text-xs text-blue-600 mt-2">
Letzte Aktualisierung: {new Date(stats.last_updated).toLocaleString('de-DE')}
</p>
)}
</div>
</div>
</div>
</div>
)
}

View File

@@ -1,366 +0,0 @@
'use client'
/**
* Video & Chat Wizard Page
*
* Interactive learning and testing wizard for Matrix & Jitsi integration
* Migrated from website/app/admin/communication/wizard
*/
import { useState } from 'react'
import Link from 'next/link'
import {
WizardStepper,
WizardNavigation,
EducationCard,
ArchitectureContext,
TestRunner,
TestSummary,
type WizardStep,
type TestCategoryResult,
type FullTestResults,
type EducationContent,
type ArchitectureContextType,
} from '@/components/wizard'
// ==============================================
// Constants
// ==============================================
const BACKEND_URL = process.env.NEXT_PUBLIC_BACKEND_URL || 'http://localhost:8000'
const STEPS: WizardStep[] = [
{ id: 'welcome', name: 'Willkommen', icon: '👋', status: 'pending' },
{ id: 'api-health', name: 'API Status', icon: '💚', status: 'pending', category: 'api-health' },
{ id: 'matrix', name: 'Matrix', icon: '💬', status: 'pending', category: 'matrix' },
{ id: 'jitsi', name: 'Jitsi', icon: '📹', status: 'pending', category: 'jitsi' },
{ id: 'summary', name: 'Zusammenfassung', icon: '📊', status: 'pending' },
]
const EDUCATION_CONTENT: Record<string, EducationContent> = {
'welcome': {
title: 'Willkommen zum Video & Chat Wizard',
content: [
'Sichere Kommunikation ist das Rueckgrat moderner Bildungsplattformen.',
'',
'BreakPilot nutzt zwei Open-Source Systeme:',
'• Matrix Synapse: Dezentraler Messenger (Ende-zu-Ende verschluesselt)',
'• Jitsi Meet: Video-Konferenzen (WebRTC-basiert)',
'',
'Beide Systeme sind DSGVO-konform und self-hosted.',
'',
'In diesem Wizard testen wir:',
'• Matrix Homeserver und Federation',
'• Jitsi Video-Konferenz Server',
'• Integration mit der Schulverwaltung',
],
},
'api-health': {
title: 'Communication API - Backend Integration',
content: [
'Die Communication API verbindet Matrix und Jitsi mit BreakPilot.',
'',
'Funktionen:',
'• Automatische Raum-Erstellung fuer Klassen',
'• Eltern-Lehrer DM-Raeume',
'• Meeting-Planung mit Kalender-Integration',
'• Benachrichtigungen bei neuen Nachrichten',
'',
'Endpunkte:',
'• /api/v1/communication/admin/stats',
'• /api/v1/communication/admin/matrix/users',
'• /api/v1/communication/rooms',
],
},
'matrix': {
title: 'Matrix Synapse - Dezentraler Messenger',
content: [
'Matrix ist ein offenes Protokoll fuer sichere Kommunikation.',
'',
'Vorteile gegenueber WhatsApp/Teams:',
'• Ende-zu-Ende Verschluesselung (E2EE)',
'• Dezentral: Kein Single Point of Failure',
'• Federation: Kommunikation mit anderen Schulen',
'• Self-Hosted: Volle Datenkontrolle',
'',
'Raum-Typen in BreakPilot:',
'• Klassen-Info (Ankuendigungen)',
'• Elternvertreter-Raum',
'• Lehrer-Eltern DM',
'• Fachgruppen',
],
},
'jitsi': {
title: 'Jitsi Meet - Video-Konferenzen',
content: [
'Jitsi ist eine Open-Source Alternative zu Zoom/Teams.',
'',
'Features:',
'• WebRTC: Keine Software-Installation noetig',
'• Bildschirmfreigabe und Whiteboard',
'• Breakout-Raeume fuer Gruppenarbeit',
'• Aufzeichnung (optional, lokal)',
'',
'Anwendungsfaelle:',
'• Elternsprechtage (online)',
'• Fernunterricht bei Schulausfall',
'• Lehrerkonferenzen',
'• Foerdergespraeche',
],
},
'summary': {
title: 'Test-Zusammenfassung',
content: [
'Hier sehen Sie eine Uebersicht aller durchgefuehrten Tests:',
'• Matrix Homeserver Verfuegbarkeit',
'• Jitsi Server Status',
'• API-Integration',
],
},
}
const ARCHITECTURE_CONTEXTS: Record<string, ArchitectureContextType> = {
'api-health': {
layer: 'api',
services: ['backend', 'consent-service'],
dependencies: ['PostgreSQL', 'Matrix Synapse', 'Jitsi'],
dataFlow: ['Browser', 'FastAPI', 'Go Service', 'Matrix/Jitsi'],
},
'matrix': {
layer: 'service',
services: ['matrix'],
dependencies: ['PostgreSQL', 'Federation', 'TURN Server'],
dataFlow: ['Element Client', 'Matrix Synapse', 'Federation', 'PostgreSQL'],
},
'jitsi': {
layer: 'service',
services: ['jitsi'],
dependencies: ['Prosody XMPP', 'JVB', 'TURN/STUN'],
dataFlow: ['Browser', 'Nginx', 'Prosody', 'Jitsi Videobridge'],
},
}
// ==============================================
// Main Component
// ==============================================
export default function VideoChatWizardPage() {
const [currentStep, setCurrentStep] = useState(0)
const [steps, setSteps] = useState<WizardStep[]>(STEPS)
const [categoryResults, setCategoryResults] = useState<Record<string, TestCategoryResult>>({})
const [fullResults, setFullResults] = useState<FullTestResults | null>(null)
const [isLoading, setIsLoading] = useState(false)
const [error, setError] = useState<string | null>(null)
const currentStepData = steps[currentStep]
const isTestStep = currentStepData?.category !== undefined
const isWelcome = currentStepData?.id === 'welcome'
const isSummary = currentStepData?.id === 'summary'
const runCategoryTest = async (category: string) => {
setIsLoading(true)
setError(null)
try {
const response = await fetch(`${BACKEND_URL}/api/admin/communication-tests/${category}`, {
method: 'POST',
})
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${response.statusText}`)
}
const result: TestCategoryResult = await response.json()
setCategoryResults((prev) => ({ ...prev, [category]: result }))
setSteps((prev) =>
prev.map((step) =>
step.category === category
? { ...step, status: result.failed === 0 ? 'completed' : 'failed' }
: step
)
)
} catch (err) {
setError(err instanceof Error ? err.message : 'Unbekannter Fehler')
} finally {
setIsLoading(false)
}
}
const runAllTests = async () => {
setIsLoading(true)
setError(null)
try {
const response = await fetch(`${BACKEND_URL}/api/admin/communication-tests/run-all`, {
method: 'POST',
})
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${response.statusText}`)
}
const results: FullTestResults = await response.json()
setFullResults(results)
setSteps((prev) =>
prev.map((step) => {
if (step.category) {
const catResult = results.categories.find((c) => c.category === step.category)
if (catResult) {
return { ...step, status: catResult.failed === 0 ? 'completed' : 'failed' }
}
}
return step
})
)
const newCategoryResults: Record<string, TestCategoryResult> = {}
results.categories.forEach((cat) => {
newCategoryResults[cat.category] = cat
})
setCategoryResults(newCategoryResults)
} catch (err) {
setError(err instanceof Error ? err.message : 'Unbekannter Fehler')
} finally {
setIsLoading(false)
}
}
const goToNext = () => {
if (currentStep < steps.length - 1) {
setSteps((prev) =>
prev.map((step, idx) =>
idx === currentStep && step.status === 'pending'
? { ...step, status: 'completed' }
: step
)
)
setCurrentStep((prev) => prev + 1)
}
}
const goToPrev = () => {
if (currentStep > 0) {
setCurrentStep((prev) => prev - 1)
}
}
const handleStepClick = (index: number) => {
if (index <= currentStep || steps[index - 1]?.status !== 'pending') {
setCurrentStep(index)
}
}
return (
<div>
{/* Header */}
<div className="bg-white rounded-lg border border-slate-200 p-4 mb-6 flex items-center justify-between">
<div className="flex items-center">
<span className="text-3xl mr-3">💬</span>
<div>
<h2 className="text-lg font-bold text-gray-800">Video & Chat Test Wizard</h2>
<p className="text-sm text-gray-600">Matrix Messenger & Jitsi Video</p>
</div>
</div>
<Link href="/communication/video-chat" className="text-blue-600 hover:text-blue-800 text-sm">
&larr; Zurueck zu Video & Chat
</Link>
</div>
{/* Stepper */}
<div className="bg-white rounded-lg border border-slate-200 p-6 mb-6">
<WizardStepper steps={steps} currentStep={currentStep} onStepClick={handleStepClick} />
</div>
{/* Content */}
<div className="bg-white rounded-lg border border-slate-200 p-6">
<div className="flex items-center mb-6">
<span className="text-3xl mr-3">{currentStepData?.icon}</span>
<div>
<h2 className="text-xl font-bold text-gray-800">
Schritt {currentStep + 1}: {currentStepData?.name}
</h2>
<p className="text-gray-500 text-sm">
{currentStep + 1} von {steps.length}
</p>
</div>
</div>
<EducationCard content={EDUCATION_CONTENT[currentStepData?.id || '']} />
{isTestStep && currentStepData?.category && ARCHITECTURE_CONTEXTS[currentStepData.category] && (
<ArchitectureContext
context={ARCHITECTURE_CONTEXTS[currentStepData.category]}
currentStep={currentStepData.name}
/>
)}
{error && (
<div className="bg-red-50 border border-red-200 text-red-700 rounded-lg p-4 mb-6">
<strong>Fehler:</strong> {error}
</div>
)}
{isWelcome && (
<div className="text-center py-8">
<button
onClick={goToNext}
className="bg-blue-600 text-white px-8 py-3 rounded-lg font-medium hover:bg-blue-700 transition-colors"
>
Wizard starten
</button>
</div>
)}
{isTestStep && currentStepData?.category && (
<TestRunner
category={currentStepData.category}
categoryResult={categoryResults[currentStepData.category]}
isLoading={isLoading}
onRunTests={() => runCategoryTest(currentStepData.category!)}
/>
)}
{isSummary && (
<div>
{!fullResults ? (
<div className="text-center py-8">
<p className="text-gray-600 mb-4">
Fuehren Sie alle Tests aus um eine Zusammenfassung zu sehen.
</p>
<button
onClick={runAllTests}
disabled={isLoading}
className={`px-6 py-3 rounded-lg font-medium transition-colors ${
isLoading
? 'bg-gray-400 cursor-not-allowed'
: 'bg-blue-600 text-white hover:bg-blue-700'
}`}
>
{isLoading ? 'Alle Tests laufen...' : 'Alle Tests ausfuehren'}
</button>
</div>
) : (
<TestSummary results={fullResults} />
)}
</div>
)}
<WizardNavigation
currentStep={currentStep}
totalSteps={steps.length}
onPrev={goToPrev}
onNext={goToNext}
showNext={!isSummary}
isLoading={isLoading}
/>
</div>
<div className="text-center text-gray-500 text-sm mt-6">
Diese Tests pruefen die Matrix- und Jitsi-Integration.
Bei Fragen wenden Sie sich an das IT-Team.
</div>
</div>
)
}

View File

@@ -1,390 +0,0 @@
'use client'
/**
* GPU Infrastructure Admin Page
*
* vast.ai GPU Management for LLM Processing
*/
import { useEffect, useState, useCallback } from 'react'
import { PagePurpose } from '@/components/common/PagePurpose'
interface VastStatus {
instance_id: number | null
status: string
gpu_name: string | null
dph_total: number | null
endpoint_base_url: string | null
last_activity: string | null
auto_shutdown_in_minutes: number | null
total_runtime_hours: number | null
total_cost_usd: number | null
account_credit: number | null
account_total_spend: number | null
session_runtime_minutes: number | null
session_cost_usd: number | null
message: string | null
error?: string
}
export default function GPUInfrastructurePage() {
const [status, setStatus] = useState<VastStatus | null>(null)
const [loading, setLoading] = useState(true)
const [actionLoading, setActionLoading] = useState<string | null>(null)
const [error, setError] = useState<string | null>(null)
const [message, setMessage] = useState<string | null>(null)
const API_PROXY = '/api/admin/gpu'
const fetchStatus = useCallback(async () => {
setLoading(true)
setError(null)
try {
const response = await fetch(API_PROXY)
const data = await response.json()
if (!response.ok) {
throw new Error(data.error || `HTTP ${response.status}`)
}
setStatus(data)
} catch (err) {
setError(err instanceof Error ? err.message : 'Verbindungsfehler')
setStatus({
instance_id: null,
status: 'error',
gpu_name: null,
dph_total: null,
endpoint_base_url: null,
last_activity: null,
auto_shutdown_in_minutes: null,
total_runtime_hours: null,
total_cost_usd: null,
account_credit: null,
account_total_spend: null,
session_runtime_minutes: null,
session_cost_usd: null,
message: 'Verbindung fehlgeschlagen'
})
} finally {
setLoading(false)
}
}, [])
useEffect(() => {
fetchStatus()
}, [fetchStatus])
useEffect(() => {
const interval = setInterval(fetchStatus, 30000)
return () => clearInterval(interval)
}, [fetchStatus])
const powerOn = async () => {
setActionLoading('on')
setError(null)
setMessage(null)
try {
const response = await fetch(API_PROXY, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ action: 'on' }),
})
const data = await response.json()
if (!response.ok) {
throw new Error(data.error || data.detail || 'Aktion fehlgeschlagen')
}
setMessage('Start angefordert')
setTimeout(fetchStatus, 3000)
setTimeout(fetchStatus, 10000)
} catch (err) {
setError(err instanceof Error ? err.message : 'Fehler beim Starten')
fetchStatus()
} finally {
setActionLoading(null)
}
}
const powerOff = async () => {
setActionLoading('off')
setError(null)
setMessage(null)
try {
const response = await fetch(API_PROXY, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ action: 'off' }),
})
const data = await response.json()
if (!response.ok) {
throw new Error(data.error || data.detail || 'Aktion fehlgeschlagen')
}
setMessage('Stop angefordert')
setTimeout(fetchStatus, 3000)
setTimeout(fetchStatus, 10000)
} catch (err) {
setError(err instanceof Error ? err.message : 'Fehler beim Stoppen')
fetchStatus()
} finally {
setActionLoading(null)
}
}
const getStatusBadge = (s: string) => {
const baseClasses = 'px-3 py-1 rounded-full text-sm font-semibold uppercase'
switch (s) {
case 'running':
return `${baseClasses} bg-green-100 text-green-800`
case 'stopped':
case 'exited':
return `${baseClasses} bg-red-100 text-red-800`
case 'loading':
case 'scheduling':
case 'creating':
case 'starting...':
case 'stopping...':
return `${baseClasses} bg-yellow-100 text-yellow-800`
default:
return `${baseClasses} bg-slate-100 text-slate-600`
}
}
const getCreditColor = (credit: number | null) => {
if (credit === null) return 'text-slate-500'
if (credit < 5) return 'text-red-600'
if (credit < 15) return 'text-yellow-600'
return 'text-green-600'
}
return (
<div>
{/* Page Purpose */}
<PagePurpose
title="GPU Infrastruktur"
purpose="Verwalten Sie die vast.ai GPU-Instanzen fuer LLM-Verarbeitung und OCR. Starten/Stoppen Sie GPUs bei Bedarf und ueberwachen Sie Kosten in Echtzeit."
audience={['DevOps', 'Entwickler', 'System-Admins']}
architecture={{
services: ['vast.ai API', 'Ollama', 'VLLM'],
databases: ['PostgreSQL (Logs)'],
}}
relatedPages={[
{ name: 'Security', href: '/infrastructure/security', description: 'DevSecOps Dashboard' },
{ name: 'Builds', href: '/infrastructure/builds', description: 'CI/CD Pipeline' },
]}
collapsible={true}
defaultCollapsed={true}
/>
{/* Status Cards */}
<div className="bg-white rounded-xl border border-slate-200 p-6 mb-6">
<div className="grid grid-cols-2 md:grid-cols-3 lg:grid-cols-6 gap-6">
<div>
<div className="text-sm text-slate-500 mb-2">Status</div>
{loading ? (
<span className="px-3 py-1 rounded-full text-sm font-semibold bg-slate-100 text-slate-600">
Laden...
</span>
) : (
<span className={getStatusBadge(
actionLoading === 'on' ? 'starting...' :
actionLoading === 'off' ? 'stopping...' :
status?.status || 'unknown'
)}>
{actionLoading === 'on' ? 'starting...' :
actionLoading === 'off' ? 'stopping...' :
status?.status || 'unbekannt'}
</span>
)}
</div>
<div>
<div className="text-sm text-slate-500 mb-2">GPU</div>
<div className="font-semibold text-slate-900">
{status?.gpu_name || '-'}
</div>
</div>
<div>
<div className="text-sm text-slate-500 mb-2">Kosten/h</div>
<div className="font-semibold text-slate-900">
{status?.dph_total ? `$${status.dph_total.toFixed(3)}` : '-'}
</div>
</div>
<div>
<div className="text-sm text-slate-500 mb-2">Auto-Stop</div>
<div className="font-semibold text-slate-900">
{status && status.auto_shutdown_in_minutes !== null
? `${status.auto_shutdown_in_minutes} min`
: '-'}
</div>
</div>
<div>
<div className="text-sm text-slate-500 mb-2">Budget</div>
<div className={`font-bold text-lg ${getCreditColor(status?.account_credit ?? null)}`}>
{status && status.account_credit !== null
? `$${status.account_credit.toFixed(2)}`
: '-'}
</div>
</div>
<div>
<div className="text-sm text-slate-500 mb-2">Session</div>
<div className="font-semibold text-slate-900">
{status && status.session_runtime_minutes !== null && status.session_cost_usd !== null
? `${Math.round(status.session_runtime_minutes)} min / $${status.session_cost_usd.toFixed(3)}`
: '-'}
</div>
</div>
</div>
{/* Buttons */}
<div className="flex items-center gap-4 mt-6 pt-6 border-t border-slate-200">
<button
onClick={powerOn}
disabled={actionLoading !== null || status?.status === 'running'}
className="px-6 py-2 bg-orange-600 text-white rounded-lg font-medium hover:bg-orange-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
>
Starten
</button>
<button
onClick={powerOff}
disabled={actionLoading !== null || status?.status !== 'running'}
className="px-6 py-2 bg-red-600 text-white rounded-lg font-medium hover:bg-red-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
>
Stoppen
</button>
<button
onClick={fetchStatus}
disabled={loading}
className="px-4 py-2 border border-slate-300 text-slate-700 rounded-lg font-medium hover:bg-slate-50 disabled:opacity-50 transition-colors"
>
{loading ? 'Aktualisiere...' : 'Aktualisieren'}
</button>
{message && (
<span className="ml-4 text-sm text-green-600 font-medium">{message}</span>
)}
{error && (
<span className="ml-4 text-sm text-red-600 font-medium">{error}</span>
)}
</div>
</div>
{/* Extended Stats */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6 mb-6">
<div className="bg-white rounded-xl border border-slate-200 p-6">
<h3 className="font-semibold text-slate-900 mb-4">Kosten-Uebersicht</h3>
<div className="space-y-4">
<div className="flex justify-between items-center">
<span className="text-slate-600">Session Laufzeit</span>
<span className="font-semibold">
{status && status.session_runtime_minutes !== null
? `${Math.round(status.session_runtime_minutes)} Minuten`
: '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">Session Kosten</span>
<span className="font-semibold">
{status && status.session_cost_usd !== null
? `$${status.session_cost_usd.toFixed(4)}`
: '-'}
</span>
</div>
<div className="flex justify-between items-center pt-4 border-t border-slate-100">
<span className="text-slate-600">Gesamtlaufzeit</span>
<span className="font-semibold">
{status && status.total_runtime_hours !== null
? `${status.total_runtime_hours.toFixed(1)} Stunden`
: '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">Gesamtkosten</span>
<span className="font-semibold">
{status && status.total_cost_usd !== null
? `$${status.total_cost_usd.toFixed(2)}`
: '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">vast.ai Ausgaben</span>
<span className="font-semibold">
{status && status.account_total_spend !== null
? `$${status.account_total_spend.toFixed(2)}`
: '-'}
</span>
</div>
</div>
</div>
<div className="bg-white rounded-xl border border-slate-200 p-6">
<h3 className="font-semibold text-slate-900 mb-4">Instanz-Details</h3>
<div className="space-y-4">
<div className="flex justify-between items-center">
<span className="text-slate-600">Instanz ID</span>
<span className="font-mono text-sm">
{status?.instance_id || '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">GPU</span>
<span className="font-semibold">
{status?.gpu_name || '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">Stundensatz</span>
<span className="font-semibold">
{status?.dph_total ? `$${status.dph_total.toFixed(4)}/h` : '-'}
</span>
</div>
<div className="flex justify-between items-center">
<span className="text-slate-600">Letzte Aktivitaet</span>
<span className="text-sm">
{status?.last_activity
? new Date(status.last_activity).toLocaleString('de-DE')
: '-'}
</span>
</div>
{status?.endpoint_base_url && status.status === 'running' && (
<div className="pt-4 border-t border-slate-100">
<div className="text-slate-600 text-sm mb-1">Endpoint</div>
<code className="text-xs bg-slate-100 px-2 py-1 rounded block overflow-x-auto">
{status.endpoint_base_url}
</code>
</div>
)}
</div>
</div>
</div>
{/* Info */}
<div className="bg-orange-50 border border-orange-200 rounded-xl p-4">
<div className="flex gap-3">
<svg className="w-5 h-5 text-orange-600 flex-shrink-0 mt-0.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
<div>
<h4 className="font-semibold text-orange-900">Auto-Shutdown</h4>
<p className="text-sm text-orange-800 mt-1">
Die GPU-Instanz wird automatisch gestoppt, wenn sie laengere Zeit inaktiv ist.
Der Status wird alle 30 Sekunden automatisch aktualisiert.
</p>
</div>
</div>
</div>
</div>
)
}

View File

@@ -2,6 +2,7 @@
import { useCallback, useEffect, useState } from 'react'
import { useGridEditor } from './useGridEditor'
import type { GridZone } from './types'
import { GridToolbar } from './GridToolbar'
import { GridTable } from './GridTable'
import { GridImageOverlay } from './GridImageOverlay'
@@ -31,6 +32,14 @@ export function GridEditor({ sessionId, onNext }: GridEditorProps) {
canUndo,
canRedo,
getAdjacentCell,
deleteColumn,
addColumn,
deleteRow,
addRow,
ipaMode,
setIpaMode,
syllableMode,
setSyllableMode,
} = useGridEditor(sessionId)
const [showOverlay, setShowOverlay] = useState(false)
@@ -160,6 +169,16 @@ export function GridEditor({ sessionId, onNext }: GridEditorProps) {
+{grid.summary.recovered_colored} recovered
</span>
)}
{grid.dictionary_detection?.is_dictionary && (
<span className="px-1.5 py-0.5 rounded bg-blue-50 dark:bg-blue-900/20 text-blue-600 dark:text-blue-400 border border-blue-200 dark:border-blue-800">
Woerterbuch ({Math.round(grid.dictionary_detection.confidence * 100)}%)
</span>
)}
{grid.page_number?.text && (
<span className="px-1.5 py-0.5 rounded bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 border border-gray-200 dark:border-gray-600">
S. {grid.page_number.text}
</span>
)}
<span className="text-gray-400">
{grid.duration_seconds.toFixed(1)}s
</span>
@@ -173,11 +192,15 @@ export function GridEditor({ sessionId, onNext }: GridEditorProps) {
canUndo={canUndo}
canRedo={canRedo}
showOverlay={showOverlay}
ipaMode={ipaMode}
syllableMode={syllableMode}
onSave={saveGrid}
onUndo={undo}
onRedo={redo}
onRebuild={buildGrid}
onToggleOverlay={() => setShowOverlay(!showOverlay)}
onIpaModeChange={setIpaMode}
onSyllableModeChange={setSyllableMode}
/>
</div>
@@ -186,25 +209,70 @@ export function GridEditor({ sessionId, onNext }: GridEditorProps) {
<GridImageOverlay sessionId={sessionId} grid={grid} />
)}
{/* Zone tables */}
{/* Zone tables — group vsplit zones side by side */}
<div className="space-y-4">
{grid.zones.map((zone) => (
<div
key={zone.zone_index}
className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 overflow-hidden"
>
<GridTable
zone={zone}
layoutMetrics={grid.layout_metrics}
selectedCell={selectedCell}
onSelectCell={setSelectedCell}
onCellTextChange={updateCellText}
onToggleColumnBold={toggleColumnBold}
onToggleRowHeader={toggleRowHeader}
onNavigate={handleNavigate}
/>
</div>
))}
{(() => {
// Group consecutive zones with same vsplit_group
const groups: GridZone[][] = []
for (const zone of grid.zones) {
const prev = groups[groups.length - 1]
if (
prev &&
zone.vsplit_group != null &&
prev[0].vsplit_group === zone.vsplit_group
) {
prev.push(zone)
} else {
groups.push([zone])
}
}
return groups.map((group) =>
group.length === 1 ? (
<div
key={group[0].zone_index}
className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 overflow-hidden"
>
<GridTable
zone={group[0]}
layoutMetrics={grid.layout_metrics}
selectedCell={selectedCell}
onSelectCell={setSelectedCell}
onCellTextChange={updateCellText}
onToggleColumnBold={toggleColumnBold}
onToggleRowHeader={toggleRowHeader}
onNavigate={handleNavigate}
onDeleteColumn={deleteColumn}
onAddColumn={addColumn}
onDeleteRow={deleteRow}
onAddRow={addRow}
/>
</div>
) : (
<div
key={`vsplit-${group[0].vsplit_group}`}
className="flex gap-2"
>
{group.map((zone) => (
<div
key={zone.zone_index}
className="flex-1 min-w-0 bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 overflow-hidden"
>
<GridTable
zone={zone}
layoutMetrics={grid.layout_metrics}
selectedCell={selectedCell}
onSelectCell={setSelectedCell}
onCellTextChange={updateCellText}
onToggleColumnBold={toggleColumnBold}
onToggleRowHeader={toggleRowHeader}
onNavigate={handleNavigate}
/>
</div>
))}
</div>
),
)
})()}
</div>
{/* Tip */}

View File

@@ -7,18 +7,35 @@ interface GridTableProps {
zone: GridZone
layoutMetrics?: LayoutMetrics
selectedCell: string | null
selectedCells?: Set<string>
onSelectCell: (cellId: string) => void
onToggleCellSelection?: (cellId: string) => void
onCellTextChange: (cellId: string, text: string) => void
onToggleColumnBold: (zoneIndex: number, colIndex: number) => void
onToggleRowHeader: (zoneIndex: number, rowIndex: number) => void
onNavigate: (cellId: string, direction: 'up' | 'down' | 'left' | 'right') => void
onDeleteColumn?: (zoneIndex: number, colIndex: number) => void
onAddColumn?: (zoneIndex: number, afterColIndex: number) => void
onDeleteRow?: (zoneIndex: number, rowIndex: number) => void
onAddRow?: (zoneIndex: number, afterRowIndex: number) => void
onSetCellColor?: (cellId: string, color: string | null | undefined) => void
}
/** Color palette for the right-click cell color menu. */
const COLOR_OPTIONS: { label: string; value: string | null }[] = [
{ label: 'Rot', value: '#dc2626' },
{ label: 'Gruen', value: '#16a34a' },
{ label: 'Blau', value: '#2563eb' },
{ label: 'Orange', value: '#ea580c' },
{ label: 'Lila', value: '#9333ea' },
{ label: 'Schwarz', value: null },
]
/** Gutter width for row numbers (px). */
const ROW_NUM_WIDTH = 36
/** Minimum column width in px so columns remain usable. */
const MIN_COL_WIDTH = 40
const MIN_COL_WIDTH = 80
/** Minimum row height in px. */
const MIN_ROW_HEIGHT = 26
@@ -27,14 +44,22 @@ export function GridTable({
zone,
layoutMetrics,
selectedCell,
selectedCells,
onSelectCell,
onToggleCellSelection,
onCellTextChange,
onToggleColumnBold,
onToggleRowHeader,
onNavigate,
onDeleteColumn,
onAddColumn,
onDeleteRow,
onAddRow,
onSetCellColor,
}: GridTableProps) {
const containerRef = useRef<HTMLDivElement>(null)
const [containerWidth, setContainerWidth] = useState(0)
const [colorMenu, setColorMenu] = useState<{ cellId: string; x: number; y: number } | null>(null)
// ----------------------------------------------------------------
// Observe container width for scaling
@@ -82,12 +107,18 @@ export function GridTable({
const row = zone.rows.find((r) => r.index === rowIndex)
if (!row) return Math.max(MIN_ROW_HEIGHT, avgRowHeightPx * scale)
// Multi-line cells (containing \n): expand height based on line count
const rowCells = zone.cells.filter((c) => c.row_index === rowIndex)
const maxLines = Math.max(1, ...rowCells.map((c) => (c.text ?? '').split('\n').length))
if (maxLines > 1) {
const lineH = Math.max(MIN_ROW_HEIGHT, avgRowHeightPx * scale)
return lineH * maxLines
}
if (isHeader) {
// Headers keep their measured height
const measuredH = row.y_max_px - row.y_min_px
return Math.max(MIN_ROW_HEIGHT, measuredH * scale)
}
// Content rows use average for uniformity
return Math.max(MIN_ROW_HEIGHT, avgRowHeightPx * scale)
}
@@ -109,12 +140,18 @@ export function GridTable({
} else if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault()
onNavigate(cellId, 'down')
} else if (e.key === 'ArrowUp' && e.altKey) {
} else if (e.key === 'ArrowUp') {
e.preventDefault()
onNavigate(cellId, 'up')
} else if (e.key === 'ArrowDown' && e.altKey) {
} else if (e.key === 'ArrowDown') {
e.preventDefault()
onNavigate(cellId, 'down')
} else if (e.key === 'ArrowLeft' && e.altKey) {
e.preventDefault()
onNavigate(cellId, 'left')
} else if (e.key === 'ArrowRight' && e.altKey) {
e.preventDefault()
onNavigate(cellId, 'right')
} else if (e.key === 'Escape') {
;(e.target as HTMLElement).blur()
}
@@ -130,9 +167,13 @@ export function GridTable({
cellMap.set(`${cell.row_index}_${cell.col_index}`, cell)
}
/** Dominant non-black color from a cell's word_boxes, or null. */
/** Dominant non-black color from a cell's word_boxes, or null.
* `color_override` takes priority when set. */
const getCellColor = (cell: (typeof zone.cells)[0] | undefined): string | null => {
if (!cell?.word_boxes?.length) return null
if (!cell) return null
// Manual override: explicit color or null (= "clear color bar")
if (cell.color_override !== undefined) return cell.color_override ?? null
if (!cell.word_boxes?.length) return null
for (const wb of cell.word_boxes) {
if (wb.color_name && wb.color_name !== 'black' && wb.color) {
return wb.color
@@ -245,11 +286,11 @@ export function GridTable({
{/* Header: row-number corner */}
<div className="sticky left-0 z-10 px-1 py-1.5 text-[10px] text-gray-400 dark:text-gray-500 border-b border-r border-gray-200 dark:border-gray-700 bg-gray-50 dark:bg-gray-800/50" />
{/* Header: column labels with resize handles */}
{/* Header: column labels with resize handles + delete/add */}
{zone.columns.map((col, ci) => (
<div
key={col.index}
className={`relative px-2 py-1.5 text-xs font-medium border-b border-r border-gray-200 dark:border-gray-700 bg-gray-50 dark:bg-gray-800/50 cursor-pointer select-none transition-colors hover:bg-gray-100 dark:hover:bg-gray-700 ${
className={`group/colhdr relative px-2 py-1.5 text-xs font-medium border-b border-r border-gray-200 dark:border-gray-700 bg-gray-50 dark:bg-gray-800/50 cursor-pointer select-none transition-colors hover:bg-gray-100 dark:hover:bg-gray-700 ${
col.bold ? 'text-teal-700 dark:text-teal-300' : 'text-gray-600 dark:text-gray-400'
}`}
onClick={() => onToggleColumnBold(zone.zone_index, col.index)}
@@ -263,10 +304,40 @@ export function GridTable({
</span>
)}
</div>
{/* Right-edge resize handle */}
{/* Delete column button (visible on hover) */}
{onDeleteColumn && numCols > 1 && (
<button
className="absolute top-0 left-0 w-4 h-4 flex items-center justify-center bg-red-500 text-white rounded-br text-[9px] leading-none opacity-0 group-hover/colhdr:opacity-100 transition-opacity z-30"
onClick={(e) => {
e.stopPropagation()
if (confirm(`Spalte "${col.label}" loeschen?`)) {
onDeleteColumn(zone.zone_index, col.index)
}
}}
title={`Spalte "${col.label}" loeschen`}
>
x
</button>
)}
{/* Add column button — small icon at bottom-right, below resize handle */}
{onAddColumn && (
<button
className="absolute -right-[7px] -bottom-[7px] w-[14px] h-[14px] flex items-center justify-center text-teal-500 opacity-0 group-hover/colhdr:opacity-100 transition-opacity z-30 hover:bg-teal-100 dark:hover:bg-teal-900/40 rounded-full bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700"
onClick={(e) => {
e.stopPropagation()
onAddColumn(zone.zone_index, col.index)
}}
title={`Spalte nach "${col.label}" einfuegen`}
>
<svg className="w-2.5 h-2.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={3}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4v16m8-8H4" />
</svg>
</button>
)}
{/* Right-edge resize handle — wide grab area, highest z-index */}
{ci < numCols - 1 && (
<div
className="absolute top-0 right-0 w-[5px] h-full cursor-col-resize hover:bg-teal-400/40 z-20"
className="absolute top-0 -right-[4px] w-[9px] h-full cursor-col-resize hover:bg-teal-400/40 z-40"
onMouseDown={(e) => {
e.stopPropagation()
handleColResizeStart(ci, e.clientX)
@@ -289,7 +360,7 @@ export function GridTable({
<div key={row.index} style={{ display: 'contents' }}>
{/* Row number cell */}
<div
className={`relative sticky left-0 z-10 flex items-center justify-center text-[10px] border-b border-r border-gray-200 dark:border-gray-700 cursor-pointer select-none transition-colors hover:bg-gray-100 dark:hover:bg-gray-700 ${
className={`group/rowhdr relative sticky left-0 z-10 flex items-center justify-center text-[10px] border-b border-r border-gray-200 dark:border-gray-700 cursor-pointer select-none transition-colors hover:bg-gray-100 dark:hover:bg-gray-700 ${
row.is_header
? 'bg-blue-50 dark:bg-blue-900/20 text-blue-600 dark:text-blue-400 font-medium'
: row.is_footer
@@ -298,11 +369,41 @@ export function GridTable({
}`}
style={{ height: `${rowH}px` }}
onClick={() => onToggleRowHeader(zone.zone_index, row.index)}
title={`Zeile ${row.index + 1} — Klick fuer Header-Toggle`}
title={`Zeile ${row.index + 1} — Klick: ${row.is_header ? 'Footer' : row.is_footer ? 'Normal' : 'Header'}`}
>
{row.index + 1}
{row.is_header && <span className="block text-[8px]">H</span>}
{row.is_footer && <span className="block text-[8px]">F</span>}
{/* Delete row button (visible on hover) */}
{onDeleteRow && zone.rows.length > 1 && (
<button
className="absolute top-0 left-0 w-4 h-4 flex items-center justify-center bg-red-500 text-white rounded-br text-[9px] leading-none opacity-0 group-hover/rowhdr:opacity-100 transition-opacity z-30"
onClick={(e) => {
e.stopPropagation()
if (confirm(`Zeile ${row.index + 1} loeschen?`)) {
onDeleteRow(zone.zone_index, row.index)
}
}}
title={`Zeile ${row.index + 1} loeschen`}
>
x
</button>
)}
{/* Add row button (visible on hover, below this row) */}
{onAddRow && (
<button
className="absolute -bottom-[7px] left-0 w-full h-[14px] flex items-center justify-center text-teal-500 opacity-0 group-hover/rowhdr:opacity-100 transition-opacity z-30 hover:bg-teal-100 dark:hover:bg-teal-900/40 rounded"
onClick={(e) => {
e.stopPropagation()
onAddRow(zone.zone_index, row.index)
}}
title={`Zeile nach ${row.index + 1} einfuegen`}
>
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2.5}>
<path strokeLinecap="round" strokeLinejoin="round" d="M12 4v16m8-8H4" />
</svg>
</button>
)}
{/* Bottom-edge resize handle */}
<div
className="absolute bottom-0 left-0 w-full h-[4px] cursor-row-resize hover:bg-teal-400/40 z-20"
@@ -315,46 +416,43 @@ export function GridTable({
{/* Cells — spanning header or normal columns */}
{isSpanning ? (
<div
className="border-b border-r border-gray-200 dark:border-gray-700 bg-blue-50/50 dark:bg-blue-900/10 flex items-center"
style={{
gridColumn: `2 / ${numCols + 2}`,
height: `${rowH}px`,
}}
>
{(() => {
const spanCell = zone.cells.find(
(c) => c.row_index === row.index && c.col_type === 'spanning_header',
)
if (!spanCell) return null
const cellId = spanCell.cell_id
const isSelected = selectedCell === cellId
const cellColor = getCellColor(spanCell)
return (
<div className="flex items-center w-full">
{cellColor && (
<span
className="flex-shrink-0 w-1.5 self-stretch rounded-l-sm"
style={{ backgroundColor: cellColor }}
/>
)}
<input
id={`cell-${cellId}`}
type="text"
value={spanCell.text}
onChange={(e) => onCellTextChange(cellId, e.target.value)}
onFocus={() => onSelectCell(cellId)}
onKeyDown={(e) => handleKeyDown(e, cellId)}
className={`w-full px-3 py-1 bg-transparent border-0 outline-none text-center ${
isSelected ? 'ring-2 ring-teal-500 ring-inset rounded' : ''
<>
{zone.cells
.filter((c) => c.row_index === row.index && c.col_type === 'spanning_header')
.sort((a, b) => a.col_index - b.col_index)
.map((spanCell) => {
const colspan = spanCell.colspan || numCols
const cellId = spanCell.cell_id
const isSelected = selectedCell === cellId
const cellColor = getCellColor(spanCell)
const gridColStart = spanCell.col_index + 2
const gridColEnd = gridColStart + colspan
return (
<div
key={cellId}
className={`border-b border-r border-gray-200 dark:border-gray-700 bg-blue-50/50 dark:bg-blue-900/10 flex items-center ${
isSelected ? 'ring-2 ring-teal-500 ring-inset z-10' : ''
}`}
style={{ color: cellColor || undefined }}
spellCheck={false}
/>
</div>
)
})()}
</div>
style={{ gridColumn: `${gridColStart} / ${gridColEnd}`, height: `${rowH}px` }}
>
{cellColor && (
<span className="flex-shrink-0 w-1.5 self-stretch rounded-l-sm" style={{ backgroundColor: cellColor }} />
)}
<input
id={`cell-${cellId}`}
type="text"
value={spanCell.text}
onChange={(e) => onCellTextChange(cellId, e.target.value)}
onFocus={() => onSelectCell(cellId)}
onKeyDown={(e) => handleKeyDown(e, cellId)}
className="w-full px-3 py-1 bg-transparent border-0 outline-none text-center"
style={{ color: cellColor || undefined }}
spellCheck={false}
/>
</div>
)
})}
</>
) : (
zone.columns.map((col) => {
const cell = cellMap.get(`${row.index}_${col.index}`)
@@ -364,21 +462,45 @@ export function GridTable({
const isSelected = selectedCell === cellId
const isBold = col.bold || cell?.is_bold
const isLowConf = cell && cell.confidence > 0 && cell.confidence < 60
const cellColor = getCellColor(cell)
const isMultiSelected = selectedCells?.has(cellId)
// Show per-word colored display only when word_boxes
// match the cell text. Post-processing steps (e.g. 5h
// slash-IPA → bracket conversion) modify cell.text but
// not individual word_boxes, so we fall back to the
// plain input when they diverge.
const wbText = cell?.word_boxes?.map((wb) => wb.text).join(' ') ?? ''
const textMatches = !cell?.text || wbText === cell.text
// Color: prefer manual override, else word_boxes when text matches
const hasOverride = cell?.color_override !== undefined
const cellColor = hasOverride ? getCellColor(cell) : (textMatches ? getCellColor(cell) : null)
const hasColoredWords =
cell?.word_boxes?.some(
!hasOverride &&
textMatches &&
(cell?.word_boxes?.some(
(wb) => wb.color_name && wb.color_name !== 'black',
) ?? false
) ?? false)
return (
<div
key={col.index}
className={`relative border-b border-r border-gray-200 dark:border-gray-700 flex items-center ${
isSelected ? 'ring-2 ring-teal-500 ring-inset z-10' : ''
} ${isLowConf ? 'bg-amber-50/50 dark:bg-amber-900/10' : ''} ${
row.is_header ? 'bg-blue-50/50 dark:bg-blue-900/10' : ''
}`}
style={{ height: `${rowH}px` }}
} ${isMultiSelected ? 'bg-teal-50/60 dark:bg-teal-900/20' : ''} ${
isLowConf && !isMultiSelected ? 'bg-amber-50/50 dark:bg-amber-900/10' : ''
} ${row.is_header && !isMultiSelected ? 'bg-blue-50/50 dark:bg-blue-900/10' : ''}`}
style={{
height: `${rowH}px`,
...(cell?.box_region?.bg_hex ? {
backgroundColor: `${cell.box_region.bg_hex}12`,
borderLeft: cell.box_region.border ? `3px solid ${cell.box_region.bg_hex}60` : undefined,
} : {}),
}}
onContextMenu={(e) => {
if (onSetCellColor) {
e.preventDefault()
setColorMenu({ cellId, x: e.clientX, y: e.clientY })
}
}}
>
{cellColor && (
<span
@@ -388,44 +510,88 @@ export function GridTable({
/>
)}
{/* Per-word colored display when not editing */}
{hasColoredWords && !isSelected ? (
<div
className={`w-full px-2 cursor-text truncate ${isBold ? 'font-bold' : 'font-normal'}`}
onClick={() => {
onSelectCell(cellId)
setTimeout(() => document.getElementById(`cell-${cellId}`)?.focus(), 0)
}}
>
{cell!.word_boxes!.map((wb, i) => (
<span
key={i}
style={
wb.color_name && wb.color_name !== 'black'
? { color: wb.color }
: undefined
}
{(() => {
const cellText = cell?.text ?? ''
const isMultiLine = cellText.includes('\n')
if (hasColoredWords && !isSelected) {
return (
<div
className={`w-full px-2 cursor-text truncate ${isBold ? 'font-bold' : 'font-normal'}`}
onClick={(e) => {
if ((e.metaKey || e.ctrlKey) && onToggleCellSelection) {
onToggleCellSelection(cellId)
} else {
onSelectCell(cellId)
setTimeout(() => document.getElementById(`cell-${cellId}`)?.focus(), 0)
}
}}
>
{wb.text}
{i < cell!.word_boxes!.length - 1 ? ' ' : ''}
</span>
))}
</div>
) : (
<input
id={`cell-${cellId}`}
type="text"
value={cell?.text ?? ''}
onChange={(e) => {
if (cell) onCellTextChange(cellId, e.target.value)
}}
onFocus={() => onSelectCell(cellId)}
onKeyDown={(e) => handleKeyDown(e, cellId)}
className={`w-full px-2 bg-transparent border-0 outline-none ${
isBold ? 'font-bold' : 'font-normal'
}`}
spellCheck={false}
/>
)}
{cell!.word_boxes!.map((wb, i) => (
<span
key={i}
style={
wb.color_name && wb.color_name !== 'black'
? { color: wb.color }
: undefined
}
>
{wb.text}
{i < cell!.word_boxes!.length - 1 ? ' ' : ''}
</span>
))}
</div>
)
}
if (isMultiLine) {
return (
<textarea
id={`cell-${cellId}`}
value={cellText}
onChange={(e) => onCellTextChange(cellId, e.target.value)}
onFocus={() => onSelectCell(cellId)}
onClick={(e) => {
if ((e.metaKey || e.ctrlKey) && onToggleCellSelection) {
e.preventDefault()
onToggleCellSelection(cellId)
}
}}
onKeyDown={(e) => {
if (e.key === 'Tab') {
e.preventDefault()
onNavigate(cellId, e.shiftKey ? 'left' : 'right')
}
}}
rows={cellText.split('\n').length}
className={`w-full px-2 bg-transparent border-0 outline-none resize-none ${
isBold ? 'font-bold' : 'font-normal'
}`}
style={{ color: cellColor || undefined }}
spellCheck={false}
/>
)
}
return (
<input
id={`cell-${cellId}`}
type="text"
value={cellText}
onChange={(e) => onCellTextChange(cellId, e.target.value)}
onFocus={() => onSelectCell(cellId)}
onClick={(e) => {
if ((e.metaKey || e.ctrlKey) && onToggleCellSelection) {
e.preventDefault()
onToggleCellSelection(cellId)
}
}}
onKeyDown={(e) => handleKeyDown(e, cellId)}
className={`w-full px-2 bg-transparent border-0 outline-none ${
isBold ? 'font-bold' : 'font-normal'
}`}
style={{ color: cellColor || undefined }}
spellCheck={false}
/>
)
})()}
</div>
)
})
@@ -434,6 +600,53 @@ export function GridTable({
)
})}
</div>
{/* Color context menu (right-click) */}
{colorMenu && onSetCellColor && (
<div
className="fixed inset-0 z-50"
onClick={() => setColorMenu(null)}
onContextMenu={(e) => { e.preventDefault(); setColorMenu(null) }}
>
<div
className="absolute bg-white dark:bg-gray-800 rounded-lg shadow-lg border border-gray-200 dark:border-gray-700 py-1 min-w-[140px]"
style={{ left: colorMenu.x, top: colorMenu.y }}
onClick={(e) => e.stopPropagation()}
>
<div className="px-3 py-1 text-[10px] text-gray-400 dark:text-gray-500 font-medium uppercase tracking-wider">
Textfarbe
</div>
{COLOR_OPTIONS.map((opt) => (
<button
key={opt.label}
className="w-full px-3 py-1.5 text-xs text-left hover:bg-gray-100 dark:hover:bg-gray-700 flex items-center gap-2"
onClick={() => {
onSetCellColor(colorMenu.cellId, opt.value)
setColorMenu(null)
}}
>
{opt.value ? (
<span className="w-3 h-3 rounded-full flex-shrink-0" style={{ backgroundColor: opt.value }} />
) : (
<span className="w-3 h-3 rounded-full flex-shrink-0 border border-gray-300 dark:border-gray-600" />
)}
<span>{opt.label}</span>
</button>
))}
<div className="border-t border-gray-100 dark:border-gray-700 mt-1 pt-1">
<button
className="w-full px-3 py-1.5 text-xs text-left hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-500 dark:text-gray-400"
onClick={() => {
onSetCellColor(colorMenu.cellId, undefined)
setColorMenu(null)
}}
>
Farbe zuruecksetzen (OCR)
</button>
</div>
</div>
</div>
)}
</div>
)
}

View File

@@ -1,16 +1,38 @@
'use client'
import type { IpaMode, SyllableMode } from './useGridEditor'
interface GridToolbarProps {
dirty: boolean
saving: boolean
canUndo: boolean
canRedo: boolean
showOverlay: boolean
ipaMode: IpaMode
syllableMode: SyllableMode
onSave: () => void
onUndo: () => void
onRedo: () => void
onRebuild: () => void
onToggleOverlay: () => void
onIpaModeChange: (mode: IpaMode) => void
onSyllableModeChange: (mode: SyllableMode) => void
}
const IPA_LABELS: Record<IpaMode, string> = {
auto: 'IPA: Auto',
en: 'IPA: nur EN',
de: 'IPA: nur DE',
all: 'IPA: Alle',
none: 'IPA: Aus',
}
const SYLLABLE_LABELS: Record<SyllableMode, string> = {
auto: 'Silben: Original',
en: 'Silben: nur EN',
de: 'Silben: nur DE',
all: 'Silben: Alle',
none: 'Silben: Aus',
}
export function GridToolbar({
@@ -19,11 +41,15 @@ export function GridToolbar({
canUndo,
canRedo,
showOverlay,
ipaMode,
syllableMode,
onSave,
onUndo,
onRedo,
onRebuild,
onToggleOverlay,
onIpaModeChange,
onSyllableModeChange,
}: GridToolbarProps) {
return (
<div className="flex items-center gap-2 flex-wrap">
@@ -67,6 +93,40 @@ export function GridToolbar({
Bild-Overlay
</button>
{/* IPA mode */}
<div className="flex items-center gap-1">
<select
value={ipaMode}
onChange={(e) => onIpaModeChange(e.target.value as IpaMode)}
className="px-2 py-1.5 text-xs rounded-md border border-gray-200 dark:border-gray-700 bg-white dark:bg-gray-800 text-gray-600 dark:text-gray-400"
title="Lautschrift (IPA): Auto = nur erkannte EN-Woerter, DE = deutsches IPA (Wiktionary), Alle = EN + DE, Aus = keine"
>
{(Object.keys(IPA_LABELS) as IpaMode[]).map((m) => (
<option key={m} value={m}>{IPA_LABELS[m]}</option>
))}
</select>
{(ipaMode === 'de' || ipaMode === 'all') && (
<span
className="text-[9px] text-gray-400 dark:text-gray-500 cursor-help"
title="DE-Lautschrift: Wiktionary (CC-BY-SA 4.0) + epitran (MIT). EN-Lautschrift: Britfone (MIT) + eng_to_ipa (MIT)."
>
CC-BY-SA
</span>
)}
</div>
{/* Syllable mode */}
<select
value={syllableMode}
onChange={(e) => onSyllableModeChange(e.target.value as SyllableMode)}
className="px-2 py-1.5 text-xs rounded-md border border-gray-200 dark:border-gray-700 bg-white dark:bg-gray-800 text-gray-600 dark:text-gray-400"
title="Silbentrennung: Original = nur wo im Scan vorhanden, Alle = fuer alle Woerter, Aus = keine"
>
{(Object.keys(SYLLABLE_LABELS) as SyllableMode[]).map((m) => (
<option key={m} value={m}>{SYLLABLE_LABELS[m]}</option>
))}
</select>
{/* Rebuild */}
<button
onClick={onRebuild}

View File

@@ -0,0 +1,386 @@
'use client'
/**
* ImageLayoutEditor — SVG overlay on the original scan image.
*
* Shows draggable vertical column dividers and horizontal guidelines
* (margins, header/footer zones). Double-click to add a column,
* click the × on a divider to remove it.
*/
import { useCallback, useRef } from 'react'
import type { GridZone, LayoutDividers } from './types'
interface ImageLayoutEditorProps {
imageUrl: string
zones: GridZone[]
imageWidth: number
layoutDividers?: LayoutDividers
zoom: number
onZoomChange: (zoom: number) => void
onColumnDividerMove: (zoneIndex: number, boundaryIndex: number, newXPct: number) => void
onHorizontalsChange: (horizontals: LayoutDividers['horizontals']) => void
onCommitUndo: () => void
onSplitColumnAt: (zoneIndex: number, xPct: number) => void
onDeleteColumn: (zoneIndex: number, colIndex: number) => void
}
const HORIZ_COLORS: Record<string, string> = {
top_margin: 'rgba(239, 68, 68, 0.6)',
header_bottom: 'rgba(59, 130, 246, 0.6)',
footer_top: 'rgba(249, 115, 22, 0.6)',
bottom_margin: 'rgba(239, 68, 68, 0.6)',
}
const HORIZ_LABELS: Record<string, string> = {
top_margin: 'Rand oben',
header_bottom: 'Kopfzeile',
footer_top: 'Fusszeile',
bottom_margin: 'Rand unten',
}
const HORIZ_DEFAULTS: Record<string, number> = {
top_margin: 3,
header_bottom: 10,
footer_top: 92,
bottom_margin: 97,
}
function clamp(val: number, min: number, max: number) {
return Math.max(min, Math.min(max, val))
}
export function ImageLayoutEditor({
imageUrl,
zones,
layoutDividers,
zoom,
onZoomChange,
onColumnDividerMove,
onHorizontalsChange,
onCommitUndo,
onSplitColumnAt,
onDeleteColumn,
}: ImageLayoutEditorProps) {
const wrapperRef = useRef<HTMLDivElement>(null)
const draggingRef = useRef<
| { type: 'col'; zoneIndex: number; boundaryIndex: number }
| { type: 'horiz'; key: string }
| null
>(null)
const horizontalsRef = useRef(layoutDividers?.horizontals ?? {})
horizontalsRef.current = layoutDividers?.horizontals ?? {}
const horizontals = layoutDividers?.horizontals ?? {}
// Compute column boundaries for each zone
const zoneBoundaries = zones.map((zone) => {
const sorted = [...zone.columns].sort((a, b) => a.index - b.index)
const boundaries: number[] = []
if (sorted.length > 0) {
const hasValidPct = sorted.some((c) => c.x_max_pct > 0)
if (hasValidPct) {
boundaries.push(sorted[0].x_min_pct)
for (const col of sorted) {
boundaries.push(col.x_max_pct)
}
} else {
// Fallback: evenly distribute within zone bbox
const zoneX = zone.bbox_pct.x || 0
const zoneW = zone.bbox_pct.w || 100
for (let i = 0; i <= sorted.length; i++) {
boundaries.push(zoneX + (i / sorted.length) * zoneW)
}
}
}
return { zone, boundaries }
})
const startDrag = useCallback(
(
info: NonNullable<typeof draggingRef.current>,
e: React.MouseEvent,
) => {
e.preventDefault()
e.stopPropagation()
draggingRef.current = info
onCommitUndo()
const handleMove = (ev: MouseEvent) => {
const wrap = wrapperRef.current
if (!wrap || !draggingRef.current) return
const rect = wrap.getBoundingClientRect()
const xPct = clamp(((ev.clientX - rect.left) / rect.width) * 100, 0, 100)
const yPct = clamp(((ev.clientY - rect.top) / rect.height) * 100, 0, 100)
if (draggingRef.current.type === 'col') {
onColumnDividerMove(
draggingRef.current.zoneIndex,
draggingRef.current.boundaryIndex,
xPct,
)
} else {
onHorizontalsChange({
...horizontalsRef.current,
[draggingRef.current.key]: yPct,
})
}
}
const handleUp = () => {
draggingRef.current = null
document.removeEventListener('mousemove', handleMove)
document.removeEventListener('mouseup', handleUp)
document.body.style.cursor = ''
document.body.style.userSelect = ''
}
document.body.style.cursor = info.type === 'col' ? 'col-resize' : 'row-resize'
document.body.style.userSelect = 'none'
document.addEventListener('mousemove', handleMove)
document.addEventListener('mouseup', handleUp)
},
[onColumnDividerMove, onHorizontalsChange, onCommitUndo],
)
const toggleHorizontal = (key: string) => {
const current = horizontals[key as keyof typeof horizontals]
if (current != null) {
const next = { ...horizontals }
delete next[key as keyof typeof next]
onHorizontalsChange(next)
} else {
onCommitUndo()
onHorizontalsChange({
...horizontals,
[key]: HORIZ_DEFAULTS[key],
})
}
}
const handleDoubleClick = (e: React.MouseEvent) => {
const wrap = wrapperRef.current
if (!wrap) return
const rect = wrap.getBoundingClientRect()
const xPct = clamp(((e.clientX - rect.left) / rect.width) * 100, 0, 100)
const yPct = clamp(((e.clientY - rect.top) / rect.height) * 100, 0, 100)
// Find which zone this click is in
for (const { zone } of zoneBoundaries) {
const zy = zone.bbox_pct.y || 0
const zh = zone.bbox_pct.h || 100
if (yPct >= zy && yPct <= zy + zh) {
onSplitColumnAt(zone.zone_index, xPct)
return
}
}
// Fallback: use first zone
if (zones.length > 0) {
onSplitColumnAt(zones[0].zone_index, xPct)
}
}
return (
<div className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 overflow-hidden flex flex-col">
{/* Header */}
<div className="flex items-center justify-between px-3 py-2 border-b border-gray-100 dark:border-gray-700 bg-gray-50 dark:bg-gray-800/50">
<span className="text-xs font-medium text-gray-600 dark:text-gray-400">
Layout-Editor
</span>
<div className="flex items-center gap-2">
<button
onClick={() => onZoomChange(Math.max(50, zoom - 25))}
className="px-2 py-0.5 text-xs bg-gray-200 dark:bg-gray-700 rounded hover:bg-gray-300 dark:hover:bg-gray-600 text-gray-700 dark:text-gray-300"
>
-
</button>
<span className="text-xs text-gray-500 dark:text-gray-400 w-10 text-center">
{zoom}%
</span>
<button
onClick={() => onZoomChange(Math.min(300, zoom + 25))}
className="px-2 py-0.5 text-xs bg-gray-200 dark:bg-gray-700 rounded hover:bg-gray-300 dark:hover:bg-gray-600 text-gray-700 dark:text-gray-300"
>
+
</button>
<button
onClick={() => onZoomChange(100)}
className="px-2 py-0.5 text-xs bg-gray-200 dark:bg-gray-700 rounded hover:bg-gray-300 dark:hover:bg-gray-600 text-gray-700 dark:text-gray-300"
>
Fit
</button>
</div>
</div>
{/* Horizontal line toggles */}
<div className="flex items-center gap-1.5 px-3 py-1.5 border-b border-gray-100 dark:border-gray-700 bg-gray-50/50 dark:bg-gray-800/30 flex-wrap">
{Object.entries(HORIZ_LABELS).map(([key, label]) => {
const isActive = horizontals[key as keyof typeof horizontals] != null
return (
<button
key={key}
onClick={() => toggleHorizontal(key)}
className={`px-2 py-0.5 text-[10px] rounded border transition-colors ${
isActive
? 'font-medium'
: 'border-gray-200 dark:border-gray-700 text-gray-400 dark:text-gray-500 hover:text-gray-600 dark:hover:text-gray-400'
}`}
style={
isActive
? {
color: HORIZ_COLORS[key],
borderColor: HORIZ_COLORS[key] + '80',
}
: undefined
}
>
{label}
</button>
)
})}
<span className="text-[10px] text-gray-400 dark:text-gray-500 ml-auto">
Doppelklick = Spalte einfuegen
</span>
</div>
{/* Scrollable image with SVG overlay */}
<div className="flex-1 overflow-auto p-2">
<div
ref={wrapperRef}
style={{ width: `${zoom}%`, position: 'relative', maxWidth: 'none' }}
onDoubleClick={handleDoubleClick}
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={imageUrl}
alt="Original scan"
style={{ width: '100%', display: 'block' }}
draggable={false}
/>
{/* SVG overlay */}
<svg
style={{
position: 'absolute',
top: 0,
left: 0,
width: '100%',
height: '100%',
pointerEvents: 'none',
}}
viewBox="0 0 100 100"
preserveAspectRatio="none"
>
{/* Column boundary lines per zone */}
{zoneBoundaries.map(({ zone, boundaries }) =>
boundaries.map((xPct, bi) => {
const yTop = zone.bbox_pct.y || 0
const yBottom = (zone.bbox_pct.y || 0) + (zone.bbox_pct.h || 100)
const isEdge = bi === 0 || bi === boundaries.length - 1
const isInterior = bi > 0 && bi < boundaries.length - 1
return (
<g key={`z${zone.zone_index}-b${bi}`}>
{/* Wide invisible hit area */}
<rect
x={xPct - 0.8}
y={yTop}
width={1.6}
height={yBottom - yTop}
fill="transparent"
style={{ cursor: 'col-resize', pointerEvents: 'all' }}
onMouseDown={(e) =>
startDrag(
{ type: 'col', zoneIndex: zone.zone_index, boundaryIndex: bi },
e,
)
}
/>
{/* Visible line */}
<line
x1={xPct}
y1={yTop}
x2={xPct}
y2={yBottom}
stroke={isEdge ? 'rgba(20, 184, 166, 0.35)' : 'rgba(20, 184, 166, 0.7)'}
strokeWidth={isEdge ? 0.15 : 0.25}
strokeDasharray={isEdge ? '0.8,0.4' : '0.5,0.3'}
style={{ pointerEvents: 'none' }}
/>
{/* Delete button for interior dividers */}
{isInterior && zone.columns.length > 1 && (
<g
style={{ pointerEvents: 'all', cursor: 'pointer' }}
onClick={(e) => {
e.stopPropagation()
onDeleteColumn(zone.zone_index, bi)
}}
>
<circle
cx={xPct}
cy={Math.max(yTop + 1.5, 1.5)}
r={1.2}
fill="rgba(239, 68, 68, 0.8)"
/>
<text
x={xPct}
y={Math.max(yTop + 1.5, 1.5) + 0.5}
textAnchor="middle"
fill="white"
fontSize="1.4"
fontWeight="bold"
style={{ pointerEvents: 'none' }}
>
x
</text>
</g>
)}
</g>
)
}),
)}
{/* Horizontal guideline lines */}
{Object.entries(horizontals).map(([key, yPct]) => {
if (yPct == null) return null
const color = HORIZ_COLORS[key] ?? 'rgba(156, 163, 175, 0.6)'
return (
<g key={`horiz-${key}`}>
{/* Wide invisible hit area */}
<rect
x={0}
y={yPct - 0.6}
width={100}
height={1.2}
fill="transparent"
style={{ cursor: 'row-resize', pointerEvents: 'all' }}
onMouseDown={(e) => startDrag({ type: 'horiz', key }, e)}
/>
{/* Visible line */}
<line
x1={0}
y1={yPct}
x2={100}
y2={yPct}
stroke={color}
strokeWidth={0.2}
strokeDasharray="1,0.5"
style={{ pointerEvents: 'none' }}
/>
{/* Label */}
<text
x={1}
y={yPct - 0.5}
fill={color}
fontSize="1.6"
style={{ pointerEvents: 'none' }}
>
{HORIZ_LABELS[key]}
</text>
</g>
)
})}
</svg>
</div>
</div>
</div>
)
}

View File

@@ -1,4 +1,4 @@
import type { OcrWordBox } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { OcrWordBox } from '@/app/(admin)/ai/ocr-kombi/types'
// Re-export for convenience
export type { OcrWordBox }
@@ -11,6 +11,22 @@ export interface LayoutMetrics {
font_size_suggestion_px: number
}
/** Dictionary detection result from backend analysis. */
export interface DictionaryDetection {
is_dictionary: boolean
confidence: number
signals: Record<string, unknown>
article_col_index: number | null
headword_col_index: number | null
}
/** Page number extracted from footer region of the scan. */
export interface PageNumber {
text: string
y_pct: number
number?: number
}
/** A complete structured grid with zones, ready for the Excel-like editor. */
export interface StructuredGrid {
session_id: string
@@ -21,8 +37,11 @@ export interface StructuredGrid {
summary: GridSummary
formatting: GridFormatting
layout_metrics?: LayoutMetrics
dictionary_detection?: DictionaryDetection
page_number?: PageNumber | null
duration_seconds: number
edited?: boolean
layout_dividers?: LayoutDividers
}
export interface GridSummary {
@@ -52,6 +71,12 @@ export interface GridZone {
rows: GridRow[]
cells: GridEditorCell[]
header_rows: number[]
layout_hint?: 'left_of_vsplit' | 'right_of_vsplit' | 'middle_of_vsplit'
vsplit_group?: number
box_layout_type?: 'flowing' | 'columnar' | 'bullet_list' | 'header_only'
box_grid_reviewed?: boolean
box_bg_color?: string
box_bg_hex?: string
}
export interface BBox {
@@ -99,6 +124,28 @@ export interface GridEditorCell {
word_boxes: OcrWordBox[]
ocr_engine: string
is_bold: boolean
/** Manual color override: hex string or null to clear. */
color_override?: string | null
/** Number of columns this cell spans (merged cell). Default 1. */
colspan?: number
/** Source zone type when in unified grid. */
source_zone_type?: 'content' | 'box'
/** Box visual metadata for cells from box zones. */
box_region?: {
bg_hex?: string
bg_color?: string
border?: boolean
}
}
/** Layout dividers for the visual column/margin editor on the original image. */
export interface LayoutDividers {
horizontals: {
top_margin?: number
header_bottom?: number
footer_top?: number
bottom_margin?: number
}
}
/** Cell formatting applied by the user in the editor. */

View File

@@ -1,5 +1,5 @@
import { useCallback, useRef, useState } from 'react'
import type { StructuredGrid, GridZone } from './types'
import { useCallback, useEffect, useRef, useState } from 'react'
import type { StructuredGrid, GridZone, LayoutDividers } from './types'
const KLAUSUR_API = '/klausur-api'
const MAX_UNDO = 50
@@ -14,6 +14,9 @@ export interface GridEditorState {
selectedZone: number | null
}
export type IpaMode = 'auto' | 'all' | 'de' | 'en' | 'none'
export type SyllableMode = 'auto' | 'all' | 'de' | 'en' | 'none'
export function useGridEditor(sessionId: string | null) {
const [grid, setGrid] = useState<StructuredGrid | null>(null)
const [loading, setLoading] = useState(false)
@@ -22,6 +25,17 @@ export function useGridEditor(sessionId: string | null) {
const [dirty, setDirty] = useState(false)
const [selectedCell, setSelectedCell] = useState<string | null>(null)
const [selectedZone, setSelectedZone] = useState<number | null>(null)
const [ipaMode, setIpaMode] = useState<IpaMode>('auto')
const [syllableMode, setSyllableMode] = useState<SyllableMode>('auto')
// OCR Quality Steps (A/B testing toggles — defaults off for now)
const [ocrEnhance, setOcrEnhance] = useState(false)
const [ocrMaxCols, setOcrMaxCols] = useState(0)
const [ocrMinConf, setOcrMinConf] = useState(0)
// Vision-LLM Fusion (Step 4)
const [visionFusion, setVisionFusion] = useState(false)
const [documentCategory, setDocumentCategory] = useState('vokabelseite')
// Undo/redo stacks store serialized zone arrays
const undoStack = useRef<string[]>([])
@@ -44,8 +58,14 @@ export function useGridEditor(sessionId: string | null) {
setLoading(true)
setError(null)
try {
const params = new URLSearchParams()
params.set('ipa_mode', ipaMode)
params.set('syllable_mode', syllableMode)
params.set('enhance', String(ocrEnhance))
if (ocrMaxCols > 0) params.set('max_cols', String(ocrMaxCols))
if (ocrMinConf > 0) params.set('min_conf', String(ocrMinConf))
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid`,
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid?${params}`,
{ method: 'POST' },
)
if (!res.ok) {
@@ -62,7 +82,41 @@ export function useGridEditor(sessionId: string | null) {
} finally {
setLoading(false)
}
}, [sessionId])
}, [sessionId, ipaMode, syllableMode, ocrEnhance, ocrMaxCols, ocrMinConf])
/** Re-run OCR with current quality settings, then rebuild grid */
const rerunOcr = useCallback(async () => {
if (!sessionId) return
setLoading(true)
setError(null)
try {
const params = new URLSearchParams()
params.set('ipa_mode', ipaMode)
params.set('syllable_mode', syllableMode)
params.set('enhance', String(ocrEnhance))
if (ocrMaxCols > 0) params.set('max_cols', String(ocrMaxCols))
if (ocrMinConf > 0) params.set('min_conf', String(ocrMinConf))
params.set('vision_fusion', String(visionFusion))
if (documentCategory) params.set('doc_category', documentCategory)
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/rerun-ocr-and-build-grid?${params}`,
{ method: 'POST' },
)
if (!res.ok) {
const data = await res.json().catch(() => ({}))
throw new Error(data.detail || `HTTP ${res.status}`)
}
const data: StructuredGrid = await res.json()
setGrid(data)
setDirty(false)
undoStack.current = []
redoStack.current = []
} catch (e) {
setError(e instanceof Error ? e.message : String(e))
} finally {
setLoading(false)
}
}, [sessionId, ipaMode, syllableMode, ocrEnhance, ocrMaxCols, ocrMinConf, visionFusion, documentCategory])
const loadGrid = useCallback(async () => {
if (!sessionId) return
@@ -73,8 +127,22 @@ export function useGridEditor(sessionId: string | null) {
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/grid-editor`,
)
if (res.status === 404) {
// No grid yet — build it
await buildGrid()
// No grid yet — build it with current modes
const params = new URLSearchParams()
params.set('ipa_mode', ipaMode)
params.set('syllable_mode', syllableMode)
params.set('enhance', String(ocrEnhance))
if (ocrMaxCols > 0) params.set('max_cols', String(ocrMaxCols))
if (ocrMinConf > 0) params.set('min_conf', String(ocrMinConf))
const buildRes = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid?${params}`,
{ method: 'POST' },
)
if (buildRes.ok) {
const data: StructuredGrid = await buildRes.json()
setGrid(data)
setDirty(false)
}
return
}
if (!res.ok) {
@@ -91,7 +159,50 @@ export function useGridEditor(sessionId: string | null) {
} finally {
setLoading(false)
}
}, [sessionId, buildGrid])
// Only depends on sessionId — mode changes are handled by the
// separate useEffect below, not by re-triggering loadGrid.
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [sessionId])
// Auto-rebuild when IPA or syllable mode changes (skip initial mount).
// We call the API directly with the new values instead of going through
// the buildGrid callback, which may still close over stale state due to
// React's asynchronous state batching.
const mountedRef = useRef(false)
useEffect(() => {
if (!mountedRef.current) {
// Skip the first trigger (component mount) — don't rebuild yet
mountedRef.current = true
return
}
if (!sessionId) return
const rebuild = async () => {
setLoading(true)
setError(null)
try {
const params = new URLSearchParams()
params.set('ipa_mode', ipaMode)
params.set('syllable_mode', syllableMode)
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid?${params}`,
{ method: 'POST' },
)
if (!res.ok) {
const data = await res.json().catch(() => ({}))
throw new Error(data.detail || `HTTP ${res.status}`)
}
const data: StructuredGrid = await res.json()
setGrid(data)
setDirty(false)
} catch (e) {
setError(e instanceof Error ? e.message : String(e))
} finally {
setLoading(false)
}
}
rebuild()
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [ipaMode, syllableMode])
// ------------------------------------------------------------------
// Save
@@ -134,12 +245,40 @@ export function useGridEditor(sessionId: string | null) {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((zone) => ({
...zone,
cells: zone.cells.map((cell) =>
cell.cell_id === cellId ? { ...cell, text: newText } : cell,
),
})),
zones: prev.zones.map((zone) => {
// Check if cell exists
const existing = zone.cells.find((c) => c.cell_id === cellId)
if (existing) {
return {
...zone,
cells: zone.cells.map((cell) =>
cell.cell_id === cellId ? { ...cell, text: newText } : cell,
),
}
}
// Cell doesn't exist — create it if the cellId belongs to this zone
// cellId format: Z{zone}_R{row}_C{col}
const match = cellId.match(/^Z(\d+)_R(\d+)_C(\d+)$/)
if (!match || parseInt(match[1]) !== zone.zone_index) return zone
const rowIndex = parseInt(match[2])
const colIndex = parseInt(match[3])
const col = zone.columns.find((c) => c.index === colIndex)
const newCell = {
cell_id: cellId,
zone_index: zone.zone_index,
row_index: rowIndex,
col_index: colIndex,
col_type: col?.label ?? '',
text: newText,
confidence: 0,
bbox_px: { x: 0, y: 0, w: 0, h: 0 },
bbox_pct: { x: 0, y: 0, w: 0, h: 0 },
word_boxes: [],
ocr_engine: 'manual',
is_bold: false,
}
return { ...zone, cells: [...zone.cells, newCell] }
}),
}
})
setDirty(true)
@@ -192,6 +331,7 @@ export function useGridEditor(sessionId: string | null) {
if (!grid) return
pushUndo(grid.zones)
// Cycle: normal → header → footer → normal
setGrid((prev) => {
if (!prev) return prev
return {
@@ -200,9 +340,16 @@ export function useGridEditor(sessionId: string | null) {
if (zone.zone_index !== zoneIndex) return zone
return {
...zone,
rows: zone.rows.map((r) =>
r.index === rowIndex ? { ...r, is_header: !r.is_header } : r,
),
rows: zone.rows.map((r) => {
if (r.index !== rowIndex) return r
if (!r.is_header && !r.is_footer) {
return { ...r, is_header: true, is_footer: false }
} else if (r.is_header) {
return { ...r, is_header: false, is_footer: true }
} else {
return { ...r, is_header: false, is_footer: false }
}
}),
}
}),
}
@@ -212,6 +359,570 @@ export function useGridEditor(sessionId: string | null) {
[grid, pushUndo],
)
// ------------------------------------------------------------------
// Column management
// ------------------------------------------------------------------
const deleteColumn = useCallback(
(zoneIndex: number, colIndex: number) => {
if (!grid) return
const zone = grid.zones.find((z) => z.zone_index === zoneIndex)
if (!zone || zone.columns.length <= 1) return // keep at least 1 column
pushUndo(grid.zones)
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((z) => {
if (z.zone_index !== zoneIndex) return z
const deletedCol = z.columns.find((c) => c.index === colIndex)
const newColumns = z.columns
.filter((c) => c.index !== colIndex)
.map((c, i) => {
const result = { ...c, index: i, label: `column_${i + 1}` }
// Merge x-boundary: previous column absorbs deleted column's space
if (deletedCol) {
if (c.index === colIndex - 1) {
result.x_max_pct = deletedCol.x_max_pct
result.x_max_px = deletedCol.x_max_px
} else if (colIndex === 0 && c.index === 1) {
result.x_min_pct = deletedCol.x_min_pct
result.x_min_px = deletedCol.x_min_px
}
}
return result
})
const newCells = z.cells
.filter((c) => c.col_index !== colIndex)
.map((c) => {
const newCI = c.col_index > colIndex ? c.col_index - 1 : c.col_index
return {
...c,
col_index: newCI,
cell_id: `Z${zoneIndex}_R${String(c.row_index).padStart(2, '0')}_C${newCI}`,
}
})
return { ...z, columns: newColumns, cells: newCells }
}),
summary: {
...prev.summary,
total_columns: prev.summary.total_columns - 1,
total_cells: prev.zones.reduce(
(sum, z) =>
sum +
(z.zone_index === zoneIndex
? z.cells.filter((c) => c.col_index !== colIndex).length
: z.cells.length),
0,
),
},
}
})
setDirty(true)
},
[grid, pushUndo],
)
const addColumn = useCallback(
(zoneIndex: number, afterColIndex: number) => {
if (!grid) return
const zone = grid.zones.find((z) => z.zone_index === zoneIndex)
if (!zone) return
pushUndo(grid.zones)
const newColIndex = afterColIndex + 1
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((z) => {
if (z.zone_index !== zoneIndex) return z
// Shift existing columns
const shiftedCols = z.columns.map((c) =>
c.index > afterColIndex ? { ...c, index: c.index + 1, label: `column_${c.index + 2}` } : c,
)
// Insert new column
const refCol = z.columns.find((c) => c.index === afterColIndex) || z.columns[z.columns.length - 1]
const newCol = {
index: newColIndex,
label: `column_${newColIndex + 1}`,
x_min_px: refCol.x_max_px,
x_max_px: refCol.x_max_px + (refCol.x_max_px - refCol.x_min_px),
x_min_pct: refCol.x_max_pct,
x_max_pct: Math.min(100, refCol.x_max_pct + (refCol.x_max_pct - refCol.x_min_pct)),
bold: false,
}
const allCols = [...shiftedCols, newCol].sort((a, b) => a.index - b.index)
// Shift existing cells and create empty cells for new column
const shiftedCells = z.cells.map((c) => {
if (c.col_index > afterColIndex) {
const newCI = c.col_index + 1
return {
...c,
col_index: newCI,
cell_id: `Z${zoneIndex}_R${String(c.row_index).padStart(2, '0')}_C${newCI}`,
}
}
return c
})
// Create empty cells for each row
const newCells = z.rows.map((row) => ({
cell_id: `Z${zoneIndex}_R${String(row.index).padStart(2, '0')}_C${newColIndex}`,
zone_index: zoneIndex,
row_index: row.index,
col_index: newColIndex,
col_type: `column_${newColIndex + 1}`,
text: '',
confidence: 0,
bbox_px: { x: 0, y: 0, w: 0, h: 0 },
bbox_pct: { x: 0, y: 0, w: 0, h: 0 },
word_boxes: [],
ocr_engine: 'manual',
is_bold: false,
}))
return { ...z, columns: allCols, cells: [...shiftedCells, ...newCells] }
}),
summary: {
...prev.summary,
total_columns: prev.summary.total_columns + 1,
total_cells: prev.summary.total_cells + (zone?.rows.length ?? 0),
},
}
})
setDirty(true)
},
[grid, pushUndo],
)
// ------------------------------------------------------------------
// Row management
// ------------------------------------------------------------------
const deleteRow = useCallback(
(zoneIndex: number, rowIndex: number) => {
if (!grid) return
const zone = grid.zones.find((z) => z.zone_index === zoneIndex)
if (!zone || zone.rows.length <= 1) return // keep at least 1 row
pushUndo(grid.zones)
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((z) => {
if (z.zone_index !== zoneIndex) return z
const newRows = z.rows
.filter((r) => r.index !== rowIndex)
.map((r, i) => ({ ...r, index: i }))
const newCells = z.cells
.filter((c) => c.row_index !== rowIndex)
.map((c) => {
const newRI = c.row_index > rowIndex ? c.row_index - 1 : c.row_index
return {
...c,
row_index: newRI,
cell_id: `Z${zoneIndex}_R${String(newRI).padStart(2, '0')}_C${c.col_index}`,
}
})
return { ...z, rows: newRows, cells: newCells }
}),
summary: {
...prev.summary,
total_rows: prev.summary.total_rows - 1,
total_cells: prev.zones.reduce(
(sum, z) =>
sum +
(z.zone_index === zoneIndex
? z.cells.filter((c) => c.row_index !== rowIndex).length
: z.cells.length),
0,
),
},
}
})
setDirty(true)
},
[grid, pushUndo],
)
const addRow = useCallback(
(zoneIndex: number, afterRowIndex: number) => {
if (!grid) return
const zone = grid.zones.find((z) => z.zone_index === zoneIndex)
if (!zone) return
pushUndo(grid.zones)
const newRowIndex = afterRowIndex + 1
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((z) => {
if (z.zone_index !== zoneIndex) return z
// Shift existing rows
const shiftedRows = z.rows.map((r) =>
r.index > afterRowIndex ? { ...r, index: r.index + 1 } : r,
)
// Insert new row
const refRow = z.rows.find((r) => r.index === afterRowIndex) || z.rows[z.rows.length - 1]
const newRow = {
index: newRowIndex,
y_min_px: refRow.y_max_px,
y_max_px: refRow.y_max_px + (refRow.y_max_px - refRow.y_min_px),
y_min_pct: refRow.y_max_pct,
y_max_pct: Math.min(100, refRow.y_max_pct + (refRow.y_max_pct - refRow.y_min_pct)),
is_header: false,
is_footer: false,
}
const allRows = [...shiftedRows, newRow].sort((a, b) => a.index - b.index)
// Shift existing cells
const shiftedCells = z.cells.map((c) => {
if (c.row_index > afterRowIndex) {
const newRI = c.row_index + 1
return {
...c,
row_index: newRI,
cell_id: `Z${zoneIndex}_R${String(newRI).padStart(2, '0')}_C${c.col_index}`,
}
}
return c
})
// Create empty cells for each column
const newCells = z.columns.map((col) => ({
cell_id: `Z${zoneIndex}_R${String(newRowIndex).padStart(2, '0')}_C${col.index}`,
zone_index: zoneIndex,
row_index: newRowIndex,
col_index: col.index,
col_type: col.label,
text: '',
confidence: 0,
bbox_px: { x: 0, y: 0, w: 0, h: 0 },
bbox_pct: { x: 0, y: 0, w: 0, h: 0 },
word_boxes: [],
ocr_engine: 'manual',
is_bold: false,
}))
return { ...z, rows: allRows, cells: [...shiftedCells, ...newCells] }
}),
summary: {
...prev.summary,
total_rows: prev.summary.total_rows + 1,
total_cells: prev.summary.total_cells + (zone?.columns.length ?? 0),
},
}
})
setDirty(true)
},
[grid, pushUndo],
)
// ------------------------------------------------------------------
// Layout editing (image overlay)
// ------------------------------------------------------------------
/** Capture current state for undo — call once at drag start. */
const commitUndoPoint = useCallback(() => {
if (!grid) return
pushUndo(grid.zones)
}, [grid, pushUndo])
/** Move a column boundary. boundaryIndex 0 = left edge of col 0, etc. */
const updateColumnDivider = useCallback(
(zoneIndex: number, boundaryIndex: number, newXPct: number) => {
if (!grid) return
setGrid((prev) => {
if (!prev) return prev
const imgW = prev.image_width || 1
const newPx = Math.round((newXPct / 100) * imgW)
return {
...prev,
zones: prev.zones.map((z) => {
if (z.zone_index !== zoneIndex) return z
return {
...z,
columns: z.columns.map((col) => {
// Right edge of the column before this boundary
if (col.index === boundaryIndex - 1) {
return { ...col, x_max_pct: newXPct, x_max_px: newPx }
}
// Left edge of the column at this boundary
if (col.index === boundaryIndex) {
return { ...col, x_min_pct: newXPct, x_min_px: newPx }
}
return col
}),
}
}),
}
})
setDirty(true)
},
[grid],
)
/** Update horizontal layout guidelines (margins, header, footer). */
const updateLayoutHorizontals = useCallback(
(horizontals: LayoutDividers['horizontals']) => {
if (!grid) return
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
layout_dividers: {
...(prev.layout_dividers || { horizontals: {} }),
horizontals,
},
}
})
setDirty(true)
},
[grid],
)
/** Split a column at a given x percentage, creating a new column. */
const splitColumnAt = useCallback(
(zoneIndex: number, xPct: number) => {
if (!grid) return
const zone = grid.zones.find((z) => z.zone_index === zoneIndex)
if (!zone) return
const sorted = [...zone.columns].sort((a, b) => a.index - b.index)
const targetCol = sorted.find((c) => c.x_min_pct <= xPct && c.x_max_pct >= xPct)
if (!targetCol) return
pushUndo(grid.zones)
const newColIndex = targetCol.index + 1
const imgW = grid.image_width || 1
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((z) => {
if (z.zone_index !== zoneIndex) return z
const leftCol = {
...targetCol,
x_max_pct: xPct,
x_max_px: Math.round((xPct / 100) * imgW),
}
const rightCol = {
index: newColIndex,
label: `column_${newColIndex + 1}`,
x_min_pct: xPct,
x_max_pct: targetCol.x_max_pct,
x_min_px: Math.round((xPct / 100) * imgW),
x_max_px: targetCol.x_max_px,
bold: false,
}
const updatedCols = z.columns.map((c) => {
if (c.index === targetCol.index) return leftCol
if (c.index > targetCol.index) return { ...c, index: c.index + 1, label: `column_${c.index + 2}` }
return c
})
const allCols = [...updatedCols, rightCol].sort((a, b) => a.index - b.index)
const shiftedCells = z.cells.map((c) => {
if (c.col_index > targetCol.index) {
const newCI = c.col_index + 1
return {
...c,
col_index: newCI,
cell_id: `Z${zoneIndex}_R${String(c.row_index).padStart(2, '0')}_C${newCI}`,
}
}
return c
})
const newCells = z.rows.map((row) => ({
cell_id: `Z${zoneIndex}_R${String(row.index).padStart(2, '0')}_C${newColIndex}`,
zone_index: zoneIndex,
row_index: row.index,
col_index: newColIndex,
col_type: `column_${newColIndex + 1}`,
text: '',
confidence: 0,
bbox_px: { x: 0, y: 0, w: 0, h: 0 },
bbox_pct: { x: 0, y: 0, w: 0, h: 0 },
word_boxes: [],
ocr_engine: 'manual',
is_bold: false,
}))
return { ...z, columns: allCols, cells: [...shiftedCells, ...newCells] }
}),
summary: {
...prev.summary,
total_columns: prev.summary.total_columns + 1,
total_cells: prev.summary.total_cells + (zone.rows.length),
},
}
})
setDirty(true)
},
[grid, pushUndo],
)
// ------------------------------------------------------------------
// Column pattern auto-correction
// ------------------------------------------------------------------
/**
* Detect dominant prefix+number patterns per column and complete
* partial matches. E.g. if 3+ cells read "p.70", "p.71", etc.,
* a cell reading ".65" is corrected to "p.65".
* Returns the number of corrections made.
*/
const autoCorrectColumnPatterns = useCallback(() => {
if (!grid) return 0
pushUndo(grid.zones)
let totalFixed = 0
const numberPattern = /^(.+?)(\d+)\s*$/
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((zone) => {
// Group cells by column
const cellsByCol = new Map<number, { cell: (typeof zone.cells)[0]; idx: number }[]>()
zone.cells.forEach((cell, idx) => {
const arr = cellsByCol.get(cell.col_index) || []
arr.push({ cell, idx })
cellsByCol.set(cell.col_index, arr)
})
const newCells = [...zone.cells]
for (const [, colEntries] of cellsByCol) {
// Count prefix occurrences
const prefixCounts = new Map<string, number>()
for (const { cell } of colEntries) {
const m = cell.text.trim().match(numberPattern)
if (m) {
prefixCounts.set(m[1], (prefixCounts.get(m[1]) || 0) + 1)
}
}
// Find dominant prefix (>= 3 occurrences)
let dominantPrefix = ''
let maxCount = 0
for (const [prefix, count] of prefixCounts) {
if (count >= 3 && count > maxCount) {
dominantPrefix = prefix
maxCount = count
}
}
if (!dominantPrefix) continue
// Fix partial matches — entries that are just [.?\s*]NUMBER
for (const { cell, idx } of colEntries) {
const text = cell.text.trim()
if (!text || text.startsWith(dominantPrefix)) continue
const numMatch = text.match(/^[.\s]*(\d+)\s*$/)
if (numMatch) {
newCells[idx] = { ...newCells[idx], text: `${dominantPrefix}${numMatch[1]}` }
totalFixed++
}
}
}
return { ...zone, cells: newCells }
}),
}
})
if (totalFixed > 0) setDirty(true)
return totalFixed
}, [grid, pushUndo])
// ------------------------------------------------------------------
// Multi-select & bulk formatting
// ------------------------------------------------------------------
const [selectedCells, setSelectedCells] = useState<Set<string>>(new Set())
const toggleCellSelection = useCallback(
(cellId: string) => {
setSelectedCells((prev) => {
const next = new Set(prev)
if (next.has(cellId)) next.delete(cellId)
else next.add(cellId)
return next
})
},
[],
)
const clearCellSelection = useCallback(() => {
setSelectedCells(new Set())
}, [])
/**
* Set a manual color override on a cell.
* - hex string (e.g. "#dc2626"): force text color
* - null: force no color (clear bar)
* - undefined: remove override, restore OCR-detected color
*/
const setCellColor = useCallback(
(cellId: string, color: string | null | undefined) => {
if (!grid) return
pushUndo(grid.zones)
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((zone) => ({
...zone,
cells: zone.cells.map((cell) => {
if (cell.cell_id !== cellId) return cell
if (color === undefined) {
// Remove override entirely — restore OCR behavior
const { color_override: _, ...rest } = cell
return rest
}
return { ...cell, color_override: color }
}),
})),
}
})
setDirty(true)
},
[grid, pushUndo],
)
/** Toggle bold on all selected cells (and their columns). */
const toggleSelectedBold = useCallback(() => {
if (!grid || selectedCells.size === 0) return
pushUndo(grid.zones)
// Determine if we're turning bold on or off (majority rule)
const cells = grid.zones.flatMap((z) => z.cells)
const selectedArr = cells.filter((c) => selectedCells.has(c.cell_id))
const boldCount = selectedArr.filter((c) => c.is_bold).length
const newBold = boldCount < selectedArr.length / 2
setGrid((prev) => {
if (!prev) return prev
return {
...prev,
zones: prev.zones.map((zone) => ({
...zone,
cells: zone.cells.map((cell) =>
selectedCells.has(cell.cell_id) ? { ...cell, is_bold: newBold } : cell,
),
})),
}
})
setDirty(true)
setSelectedCells(new Set())
}, [grid, selectedCells, pushUndo])
// ------------------------------------------------------------------
// Undo / Redo
// ------------------------------------------------------------------
@@ -243,20 +954,37 @@ export function useGridEditor(sessionId: string | null) {
(cellId: string, direction: 'up' | 'down' | 'left' | 'right'): string | null => {
if (!grid) return null
for (const zone of grid.zones) {
// Find the cell or derive row/col from cellId pattern
const cell = zone.cells.find((c) => c.cell_id === cellId)
if (!cell) continue
let currentRow: number, currentCol: number
if (cell) {
currentRow = cell.row_index
currentCol = cell.col_index
} else {
// Try to parse from cellId: Z{zone}_R{row}_C{col}
const match = cellId.match(/^Z(\d+)_R(\d+)_C(\d+)$/)
if (!match || parseInt(match[1]) !== zone.zone_index) continue
currentRow = parseInt(match[2])
currentCol = parseInt(match[3])
}
let targetRow = cell.row_index
let targetCol = cell.col_index
let targetRow = currentRow
let targetCol = currentCol
if (direction === 'up') targetRow--
if (direction === 'down') targetRow++
if (direction === 'left') targetCol--
if (direction === 'right') targetCol++
// Check bounds
const hasRow = zone.rows.some((r) => r.index === targetRow)
const hasCol = zone.columns.some((c) => c.index === targetCol)
if (!hasRow || !hasCol) return null
// Return existing cell ID or construct one
const target = zone.cells.find(
(c) => c.row_index === targetRow && c.col_index === targetCol,
)
return target?.cell_id ?? null
return target?.cell_id ?? `Z${zone.zone_index}_R${String(targetRow).padStart(2, '0')}_C${targetCol}`
}
return null
},
@@ -271,6 +999,7 @@ export function useGridEditor(sessionId: string | null) {
dirty,
selectedCell,
selectedZone,
selectedCells,
setSelectedCell,
setSelectedZone,
buildGrid,
@@ -284,5 +1013,33 @@ export function useGridEditor(sessionId: string | null) {
canUndo,
canRedo,
getAdjacentCell,
deleteColumn,
addColumn,
deleteRow,
addRow,
commitUndoPoint,
updateColumnDivider,
updateLayoutHorizontals,
splitColumnAt,
toggleCellSelection,
clearCellSelection,
toggleSelectedBold,
autoCorrectColumnPatterns,
setCellColor,
ipaMode,
setIpaMode,
syllableMode,
setSyllableMode,
ocrEnhance,
setOcrEnhance,
ocrMaxCols,
setOcrMaxCols,
ocrMinConf,
setOcrMinConf,
visionFusion,
setVisionFusion,
documentCategory,
setDocumentCategory,
rerunOcr,
}
}

View File

@@ -0,0 +1,59 @@
'use client'
import type { PipelineStep } from '@/app/(admin)/ai/ocr-kombi/types'
interface KombiStepperProps {
steps: PipelineStep[]
currentStep: number
onStepClick: (index: number) => void
}
export function KombiStepper({ steps, currentStep, onStepClick }: KombiStepperProps) {
return (
<div className="flex items-center gap-0.5 px-3 py-2.5 bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 overflow-x-auto">
{steps.map((step, index) => {
const isActive = index === currentStep
const isCompleted = step.status === 'completed'
const isFailed = step.status === 'failed'
const isSkipped = step.status === 'skipped'
const isClickable = (index <= currentStep || isCompleted) && !isSkipped
return (
<div key={step.id} className="flex items-center flex-shrink-0">
{index > 0 && (
<div
className={`h-0.5 w-4 mx-0.5 ${
isSkipped
? 'bg-gray-200 dark:bg-gray-700 border-t border-dashed border-gray-400'
: index <= currentStep ? 'bg-teal-400' : 'bg-gray-300 dark:bg-gray-600'
}`}
/>
)}
<button
onClick={() => isClickable && onStepClick(index)}
disabled={!isClickable}
className={`flex items-center gap-1 px-2 py-1 rounded-full text-xs font-medium transition-all whitespace-nowrap ${
isSkipped
? 'bg-gray-100 text-gray-400 dark:bg-gray-800 dark:text-gray-600 line-through'
: isActive
? 'bg-teal-100 text-teal-700 dark:bg-teal-900/40 dark:text-teal-300 ring-2 ring-teal-400'
: isCompleted
? 'bg-green-100 text-green-700 dark:bg-green-900/40 dark:text-green-300'
: isFailed
? 'bg-red-100 text-red-700 dark:bg-red-900/40 dark:text-red-300'
: 'text-gray-400 dark:text-gray-500'
} ${isClickable ? 'cursor-pointer hover:opacity-80' : 'cursor-default'}`}
title={step.name}
>
<span className="text-sm">
{isSkipped ? '-' : isCompleted ? '\u2713' : isFailed ? '\u2717' : step.icon}
</span>
<span className="hidden lg:inline">{step.name}</span>
<span className="lg:hidden">{index + 1}</span>
</button>
</div>
)
})}
</div>
)
}

View File

@@ -0,0 +1,73 @@
'use client'
import { useState } from 'react'
import { DOCUMENT_CATEGORIES, type DocumentCategory } from '@/app/(admin)/ai/ocr-kombi/types'
interface SessionHeaderProps {
sessionName: string
activeCategory?: DocumentCategory
isGroundTruth: boolean
pageNumber?: number | null
onUpdateCategory: (category: DocumentCategory) => void
}
export function SessionHeader({
sessionName,
activeCategory,
isGroundTruth,
pageNumber,
onUpdateCategory,
}: SessionHeaderProps) {
const [showCategoryPicker, setShowCategoryPicker] = useState(false)
const catInfo = DOCUMENT_CATEGORIES.find(c => c.value === activeCategory)
return (
<div className="relative flex items-center gap-3 text-sm text-gray-500 dark:text-gray-400">
<span>
Aktive Session:{' '}
<span className="font-medium text-gray-700 dark:text-gray-300">{sessionName}</span>
</span>
<button
onClick={() => setShowCategoryPicker(!showCategoryPicker)}
className={`text-xs px-2.5 py-1 rounded-full border transition-colors ${
activeCategory
? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300 hover:bg-teal-100'
: 'bg-amber-50 dark:bg-amber-900/20 border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300 hover:bg-amber-100 animate-pulse'
}`}
>
{catInfo ? `${catInfo.icon} ${catInfo.label}` : 'Kategorie setzen'}
</button>
{pageNumber != null && (
<span className="text-xs px-2 py-0.5 rounded-full bg-gray-100 dark:bg-gray-700 border border-gray-200 dark:border-gray-600 text-gray-600 dark:text-gray-300">
S. {pageNumber}
</span>
)}
{isGroundTruth && (
<span className="text-xs px-2 py-0.5 rounded-full bg-amber-50 dark:bg-amber-900/20 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300">
GT
</span>
)}
{showCategoryPicker && (
<div className="absolute left-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64">
{DOCUMENT_CATEGORIES.map(cat => (
<button
key={cat.value}
onClick={() => {
onUpdateCategory(cat.value)
setShowCategoryPicker(false)
}}
className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
activeCategory === cat.value
? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
: 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
}`}
>
{cat.icon} {cat.label}
</button>
))}
</div>
)}
</div>
)
}

View File

@@ -0,0 +1,376 @@
'use client'
import { useState } from 'react'
import { DOCUMENT_CATEGORIES, type DocumentCategory } from '@/app/(admin)/ai/ocr-kombi/types'
import type { SessionListItem, DocumentGroupView } from '@/app/(admin)/ai/ocr-kombi/useKombiPipeline'
const KLAUSUR_API = '/klausur-api'
interface SessionListProps {
items: (SessionListItem | DocumentGroupView)[]
loading: boolean
activeSessionId: string | null
onOpenSession: (sid: string) => void
onNewSession: () => void
onDeleteSession: (sid: string) => void
onRenameSession: (sid: string, newName: string) => void
onUpdateCategory: (sid: string, category: DocumentCategory) => void
}
function isGroup(item: SessionListItem | DocumentGroupView): item is DocumentGroupView {
return 'group_id' in item
}
export function SessionList({
items,
loading,
activeSessionId,
onOpenSession,
onNewSession,
onDeleteSession,
onRenameSession,
onUpdateCategory,
}: SessionListProps) {
const [editingName, setEditingName] = useState<string | null>(null)
const [editNameValue, setEditNameValue] = useState('')
const [editingCategory, setEditingCategory] = useState<string | null>(null)
const [expandedGroups, setExpandedGroups] = useState<Set<string>>(new Set())
const toggleGroup = (groupId: string) => {
setExpandedGroups(prev => {
const next = new Set(prev)
if (next.has(groupId)) next.delete(groupId)
else next.add(groupId)
return next
})
}
return (
<div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4">
<div className="flex items-center justify-between mb-3">
<h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
Sessions ({items.length})
</h3>
<button
onClick={onNewSession}
className="text-xs px-3 py-1.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
>
+ Neue Session
</button>
</div>
{loading ? (
<div className="text-sm text-gray-400 py-2">Lade Sessions...</div>
) : items.length === 0 ? (
<div className="text-sm text-gray-400 py-2">Noch keine Sessions vorhanden.</div>
) : (
<div className="space-y-1.5 max-h-[320px] overflow-y-auto">
{items.map(item =>
isGroup(item) ? (
<GroupRow
key={item.group_id}
group={item}
expanded={expandedGroups.has(item.group_id)}
activeSessionId={activeSessionId}
onToggle={() => toggleGroup(item.group_id)}
onOpenSession={onOpenSession}
onDeleteSession={onDeleteSession}
/>
) : (
<SessionRow
key={item.id}
session={item}
isActive={activeSessionId === item.id}
editingName={editingName}
editNameValue={editNameValue}
editingCategory={editingCategory}
onOpenSession={() => onOpenSession(item.id)}
onStartRename={() => {
setEditNameValue(item.name || item.filename)
setEditingName(item.id)
}}
onFinishRename={(newName) => {
onRenameSession(item.id, newName)
setEditingName(null)
}}
onCancelRename={() => setEditingName(null)}
onEditNameChange={setEditNameValue}
onToggleCategory={() => setEditingCategory(editingCategory === item.id ? null : item.id)}
onUpdateCategory={(cat) => {
onUpdateCategory(item.id, cat)
setEditingCategory(null)
}}
onDelete={() => {
if (confirm('Session loeschen?')) onDeleteSession(item.id)
}}
/>
)
)}
</div>
)}
</div>
)
}
// ---- Group row (multi-page document) ----
function GroupRow({
group,
expanded,
activeSessionId,
onToggle,
onOpenSession,
onDeleteSession,
}: {
group: DocumentGroupView
expanded: boolean
activeSessionId: string | null
onToggle: () => void
onOpenSession: (sid: string) => void
onDeleteSession: (sid: string) => void
}) {
const isActive = group.sessions.some(s => s.id === activeSessionId)
return (
<div>
<div
onClick={onToggle}
className={`flex items-center gap-3 px-3 py-2 rounded-lg text-sm cursor-pointer transition-colors ${
isActive
? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
: 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
}`}
>
<span className="text-base">{expanded ? '\u25BC' : '\u25B6'}</span>
<div className="flex-1 min-w-0">
<div className="truncate font-medium text-gray-700 dark:text-gray-300">
{group.title}
</div>
<div className="text-xs text-gray-400">
{group.page_count} Seiten
</div>
</div>
<div className="flex items-center gap-1.5">
{group.sessions.some(s => s.is_ground_truth) && (
<span className="text-[10px] px-1.5 py-0.5 rounded-full bg-amber-100 dark:bg-amber-900/30 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300">
GT {group.sessions.filter(s => s.is_ground_truth).length}/{group.sessions.length}
</span>
)}
<span className="text-xs px-2 py-0.5 rounded-full bg-blue-50 dark:bg-blue-900/20 border border-blue-200 dark:border-blue-800 text-blue-600 dark:text-blue-400">
Dokument
</span>
</div>
</div>
{expanded && (
<div className="ml-6 mt-1 space-y-1 border-l-2 border-gray-200 dark:border-gray-700 pl-3">
{group.sessions.map(s => (
<div
key={s.id}
className={`flex items-center gap-2 px-2 py-1.5 rounded text-xs cursor-pointer transition-colors ${
activeSessionId === s.id
? 'bg-teal-50 dark:bg-teal-900/30 text-teal-700 dark:text-teal-300'
: 'hover:bg-gray-50 dark:hover:bg-gray-700/50 text-gray-600 dark:text-gray-400'
}`}
onClick={() => onOpenSession(s.id)}
>
{/* Thumbnail */}
<div className="flex-shrink-0 w-8 h-8 rounded overflow-hidden bg-gray-100 dark:bg-gray-700">
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${s.id}/thumbnail?size=64`}
alt=""
className="w-full h-full object-cover"
loading="lazy"
onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
/>
</div>
<span className="truncate flex-1">S. {s.page_number || '?'}</span>
{s.is_ground_truth && (
<span className="text-[9px] px-1 py-0.5 rounded bg-amber-100 dark:bg-amber-900/30 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300">GT</span>
)}
<span className="text-[10px] text-gray-400">Step {s.current_step}</span>
<button
onClick={(e) => {
e.stopPropagation()
if (confirm('Seite loeschen?')) onDeleteSession(s.id)
}}
className="p-0.5 text-gray-400 hover:text-red-500"
title="Loeschen"
>
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
</svg>
</button>
</div>
))}
</div>
)}
</div>
)
}
// ---- Single session row ----
function SessionRow({
session,
isActive,
editingName,
editNameValue,
editingCategory,
onOpenSession,
onStartRename,
onFinishRename,
onCancelRename,
onEditNameChange,
onToggleCategory,
onUpdateCategory,
onDelete,
}: {
session: SessionListItem
isActive: boolean
editingName: string | null
editNameValue: string
editingCategory: string | null
onOpenSession: () => void
onStartRename: () => void
onFinishRename: (name: string) => void
onCancelRename: () => void
onEditNameChange: (val: string) => void
onToggleCategory: () => void
onUpdateCategory: (cat: DocumentCategory) => void
onDelete: () => void
}) {
const catInfo = DOCUMENT_CATEGORIES.find(c => c.value === session.document_category)
const isEditing = editingName === session.id
return (
<div
className={`relative flex items-start gap-3 px-3 py-2.5 rounded-lg text-sm transition-colors cursor-pointer ${
isActive
? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
: 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
}`}
>
{/* Thumbnail */}
<div
className="flex-shrink-0 w-12 h-12 rounded-md overflow-hidden bg-gray-100 dark:bg-gray-700"
onClick={onOpenSession}
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${session.id}/thumbnail?size=96`}
alt=""
className="w-full h-full object-cover"
loading="lazy"
onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
/>
</div>
{/* Info */}
<div className="flex-1 min-w-0" onClick={onOpenSession}>
{isEditing ? (
<input
autoFocus
value={editNameValue}
onChange={(e) => onEditNameChange(e.target.value)}
onBlur={() => onFinishRename(editNameValue)}
onKeyDown={(e) => {
if (e.key === 'Enter') onFinishRename(editNameValue)
if (e.key === 'Escape') onCancelRename()
}}
onClick={(e) => e.stopPropagation()}
className="w-full px-1 py-0.5 text-sm border rounded dark:bg-gray-700 dark:border-gray-600"
/>
) : (
<div className="truncate font-medium text-gray-700 dark:text-gray-300">
{session.name || session.filename}
</div>
)}
<button
onClick={(e) => {
e.stopPropagation()
navigator.clipboard.writeText(session.id)
const btn = e.currentTarget
btn.textContent = 'Kopiert!'
setTimeout(() => { btn.textContent = `ID: ${session.id.slice(0, 8)}` }, 1500)
}}
className="text-[10px] font-mono text-gray-400 hover:text-teal-500 transition-colors"
title={`Volle ID: ${session.id} — Klick zum Kopieren`}
>
ID: {session.id.slice(0, 8)}
</button>
<div className="text-xs text-gray-400 mt-0.5">
{new Date(session.created_at).toLocaleDateString('de-DE', {
day: '2-digit', month: '2-digit', year: '2-digit',
hour: '2-digit', minute: '2-digit',
})}
</div>
</div>
{/* Category + GT badge */}
<div className="flex flex-col gap-1 items-end flex-shrink-0" onClick={(e) => e.stopPropagation()}>
<button
onClick={onToggleCategory}
className={`text-[10px] px-1.5 py-0.5 rounded-full border transition-colors ${
catInfo
? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
: 'bg-gray-50 dark:bg-gray-700 border-gray-200 dark:border-gray-600 text-gray-400 hover:text-gray-600'
}`}
title="Kategorie setzen"
>
{catInfo ? `${catInfo.icon} ${catInfo.label}` : '+ Kategorie'}
</button>
{session.is_ground_truth && (
<span className="text-[10px] px-1.5 py-0.5 rounded-full bg-amber-100 dark:bg-amber-900/30 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300" title="Ground Truth markiert">
GT
</span>
)}
</div>
{/* Actions */}
<div className="flex flex-col gap-0.5 flex-shrink-0">
<button
onClick={(e) => { e.stopPropagation(); onStartRename() }}
className="p-1 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300"
title="Umbenennen"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M15.232 5.232l3.536 3.536m-2.036-5.036a2.5 2.5 0 113.536 3.536L6.5 21.036H3v-3.572L16.732 3.732z" />
</svg>
</button>
<button
onClick={(e) => { e.stopPropagation(); onDelete() }}
className="p-1 text-gray-400 hover:text-red-500"
title="Loeschen"
>
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
</svg>
</button>
</div>
{/* Category dropdown */}
{editingCategory === session.id && (
<div
className="absolute right-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64"
onClick={(e) => e.stopPropagation()}
>
{DOCUMENT_CATEGORIES.map(cat => (
<button
key={cat.value}
onClick={() => onUpdateCategory(cat.value)}
className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
session.document_category === cat.value
? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
: 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
}`}
>
{cat.icon} {cat.label}
</button>
))}
</div>
)}
</div>
)
}

View File

@@ -0,0 +1,241 @@
'use client'
/**
* SpreadsheetView — Fortune Sheet with multi-sheet support.
*
* Each zone (content + boxes) becomes its own Excel sheet tab,
* so each can have independent column widths optimized for its content.
*/
import { useMemo } from 'react'
import dynamic from 'next/dynamic'
const Workbook = dynamic(
() => import('@fortune-sheet/react').then((m) => m.Workbook),
{ ssr: false, loading: () => <div className="py-8 text-center text-sm text-gray-400">Spreadsheet wird geladen...</div> },
)
import '@fortune-sheet/react/dist/index.css'
import type { GridZone } from '@/components/grid-editor/types'
interface SpreadsheetViewProps {
gridData: any
height?: number
}
/** No expansion — keep multi-line cells as single cells with \n and text-wrap. */
/** Convert a single zone to a Fortune Sheet sheet object. */
function zoneToSheet(zone: GridZone, sheetIndex: number, isFirst: boolean): any {
const isBox = zone.zone_type === 'box'
const boxColor = (zone as any).box_bg_hex || ''
// Sheet name
let name: string
if (!isBox) {
name = 'Vokabeln'
} else {
const firstText = zone.cells?.[0]?.text ?? `Box ${sheetIndex}`
const cleaned = firstText.replace(/[^\w\s\u00C0-\u024F„"]/g, '').trim()
name = cleaned.length > 25 ? cleaned.slice(0, 25) + '…' : cleaned || `Box ${sheetIndex}`
}
const numCols = zone.columns?.length || 1
const numRows = zone.rows?.length || 0
const expandedCells = zone.cells || []
// Compute zone-wide median word height for font-size detection
const allWordHeights = zone.cells
.flatMap((c: any) => (c.word_boxes || []).map((wb: any) => wb.height || 0))
.filter((h: number) => h > 0)
const medianWordH = allWordHeights.length
? [...allWordHeights].sort((a, b) => a - b)[Math.floor(allWordHeights.length / 2)]
: 0
// Build celldata
const celldata: any[] = []
const merges: Record<string, any> = {}
for (const cell of expandedCells) {
const r = cell.row_index
const c = cell.col_index
const text = cell.text ?? ''
// Row metadata
const row = zone.rows?.find((rr) => rr.index === r)
const isHeader = row?.is_header ?? false
// Font size detection from word_boxes
const avgWbH = cell.word_boxes?.length
? cell.word_boxes.reduce((s: number, wb: any) => s + (wb.height || 0), 0) / cell.word_boxes.length
: 0
const isLargerFont = avgWbH > 0 && medianWordH > 0 && avgWbH > medianWordH * 1.3
const v: any = { v: text, m: text }
// Bold: headers, is_bold, larger font
if (cell.is_bold || isHeader || isLargerFont) {
v.bl = 1
}
// Larger font for box titles
if (isLargerFont && isBox) {
v.fs = 12
}
// Multi-line text (bullets with \n): enable text wrap + vertical top align
// Add bullet marker (•) if multi-line and no bullet present
if (text.includes('\n') && !isHeader) {
if (!text.startsWith('•') && !text.startsWith('-') && !text.startsWith('') && r > 0) {
text = '• ' + text
v.v = text
v.m = text
}
v.tb = '2' // text wrap
v.vt = 0 // vertical align: top
}
// Header row background
if (isHeader) {
v.bg = isBox ? `${boxColor || '#2563eb'}18` : '#f0f4ff'
}
// Box cells: light tinted background
if (isBox && !isHeader && boxColor) {
v.bg = `${boxColor}08`
}
// Text color from OCR
const color = cell.color_override
?? cell.word_boxes?.find((wb: any) => wb.color_name && wb.color_name !== 'black')?.color
if (color) v.fc = color
celldata.push({ r, c, v })
// Colspan → merge
const colspan = cell.colspan || 0
if (colspan > 1 || cell.col_type === 'spanning_header') {
const cs = colspan || numCols
merges[`${r}_${c}`] = { r, c, rs: 1, cs }
}
}
// Column widths — auto-fit based on longest text
const columnlen: Record<string, number> = {}
for (const col of (zone.columns || [])) {
const colCells = expandedCells.filter(
(c: any) => c.col_index === col.index && c.col_type !== 'spanning_header'
)
let maxTextLen = 0
for (const c of colCells) {
const len = (c.text ?? '').length
if (len > maxTextLen) maxTextLen = len
}
const autoWidth = Math.max(60, maxTextLen * 7.5 + 16)
const pxW = (col.x_max_px ?? 0) - (col.x_min_px ?? 0)
const scaledPxW = Math.max(60, Math.round(pxW * (numCols <= 2 ? 0.6 : 0.4)))
columnlen[String(col.index)] = Math.round(Math.max(autoWidth, scaledPxW))
}
// Row heights — taller for multi-line cells
const rowlen: Record<string, number> = {}
for (const row of (zone.rows || [])) {
const rowCells = expandedCells.filter((c: any) => c.row_index === row.index)
const maxLines = Math.max(1, ...rowCells.map((c: any) => (c.text ?? '').split('\n').length))
const baseH = 24
rowlen[String(row.index)] = Math.max(baseH, baseH * maxLines)
}
// Border info
const borderInfo: any[] = []
// Box: colored outside border
if (isBox && boxColor && numRows > 0 && numCols > 0) {
borderInfo.push({
rangeType: 'range',
borderType: 'border-outside',
color: boxColor,
style: 5,
range: [{ row: [0, numRows - 1], column: [0, numCols - 1] }],
})
borderInfo.push({
rangeType: 'range',
borderType: 'border-inside',
color: `${boxColor}40`,
style: 1,
range: [{ row: [0, numRows - 1], column: [0, numCols - 1] }],
})
}
// Content zone: light grid lines
if (!isBox && numRows > 0 && numCols > 0) {
borderInfo.push({
rangeType: 'range',
borderType: 'border-all',
color: '#e5e7eb',
style: 1,
range: [{ row: [0, numRows - 1], column: [0, numCols - 1] }],
})
}
return {
name,
id: `zone_${zone.zone_index}`,
celldata,
row: numRows,
column: Math.max(numCols, 1),
status: isFirst ? 1 : 0,
color: isBox ? boxColor : undefined,
config: {
merge: Object.keys(merges).length > 0 ? merges : undefined,
columnlen,
rowlen,
borderInfo: borderInfo.length > 0 ? borderInfo : undefined,
},
}
}
export function SpreadsheetView({ gridData, height = 600 }: SpreadsheetViewProps) {
const sheets = useMemo(() => {
if (!gridData?.zones) return []
const sorted = [...gridData.zones].sort((a: GridZone, b: GridZone) => {
if (a.zone_type === 'content' && b.zone_type !== 'content') return -1
if (a.zone_type !== 'content' && b.zone_type === 'content') return 1
return (a.bbox_px?.y ?? 0) - (b.bbox_px?.y ?? 0)
})
return sorted
.filter((z: GridZone) => z.cells && z.cells.length > 0)
.map((z: GridZone, i: number) => zoneToSheet(z, i, i === 0))
}, [gridData])
const maxRows = Math.max(0, ...sheets.map((s: any) => s.row || 0))
const estimatedHeight = Math.max(height, maxRows * 26 + 80)
if (sheets.length === 0) {
return <div className="p-4 text-center text-gray-400">Keine Daten für Spreadsheet.</div>
}
return (
<div style={{ width: '100%', height: `${estimatedHeight}px` }}>
<Workbook
data={sheets}
lang="en"
showToolbar
showFormulaBar={false}
showSheetTabs
toolbarItems={[
'undo', 'redo', '|',
'font-bold', 'font-italic', 'font-strikethrough', '|',
'font-color', 'background', '|',
'font-size', '|',
'horizontal-align', 'vertical-align', '|',
'text-wrap', 'merge-cell', '|',
'border',
]}
/>
</div>
)
}

View File

@@ -0,0 +1,110 @@
'use client'
/**
* StepAnsicht — Excel-like Spreadsheet View.
*
* Left: Original scan with OCR word overlay
* Right: Fortune Sheet spreadsheet with multi-sheet tabs per zone
*/
import { useEffect, useRef, useState } from 'react'
import dynamic from 'next/dynamic'
const SpreadsheetView = dynamic(
() => import('./SpreadsheetView').then((m) => m.SpreadsheetView),
{ ssr: false, loading: () => <div className="py-8 text-center text-sm text-gray-400">Spreadsheet wird geladen...</div> },
)
const KLAUSUR_API = '/klausur-api'
interface StepAnsichtProps {
sessionId: string | null
onNext: () => void
}
export function StepAnsicht({ sessionId, onNext }: StepAnsichtProps) {
const [gridData, setGridData] = useState<any>(null)
const [loading, setLoading] = useState(true)
const [error, setError] = useState<string | null>(null)
const leftRef = useRef<HTMLDivElement>(null)
const [leftHeight, setLeftHeight] = useState(600)
// Load grid data on mount
useEffect(() => {
if (!sessionId) return
;(async () => {
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/grid-editor`)
if (!res.ok) throw new Error(`HTTP ${res.status}`)
setGridData(await res.json())
} catch (e) {
setError(e instanceof Error ? e.message : 'Fehler beim Laden')
} finally {
setLoading(false)
}
})()
}, [sessionId])
// Track left panel height
useEffect(() => {
if (!leftRef.current) return
const ro = new ResizeObserver(([e]) => setLeftHeight(e.contentRect.height))
ro.observe(leftRef.current)
return () => ro.disconnect()
}, [])
if (loading) {
return (
<div className="flex items-center justify-center py-16">
<div className="w-8 h-8 border-4 border-teal-500 border-t-transparent rounded-full animate-spin" />
<span className="ml-3 text-gray-500">Lade Spreadsheet...</span>
</div>
)
}
if (error || !gridData) {
return (
<div className="p-8 text-center">
<p className="text-red-500 mb-4">{error || 'Keine Grid-Daten.'}</p>
<button onClick={onNext} className="px-5 py-2 bg-teal-600 text-white rounded-lg">Weiter </button>
</div>
)
}
return (
<div className="space-y-3">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h3 className="text-lg font-semibold text-gray-900 dark:text-white">Ansicht Spreadsheet</h3>
<p className="text-sm text-gray-500 dark:text-gray-400">
Jede Zone als eigenes Sheet-Tab. Spaltenbreiten pro Sheet optimiert.
</p>
</div>
<button onClick={onNext} className="px-5 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 text-sm font-medium">
Weiter
</button>
</div>
{/* Split view */}
<div className="flex gap-2">
{/* LEFT: Original + OCR overlay */}
<div ref={leftRef} className="w-1/3 border border-gray-300 dark:border-gray-600 rounded-lg overflow-hidden bg-white dark:bg-gray-900 flex-shrink-0">
<div className="px-2 py-1 bg-black/60 text-white text-[10px] font-medium">Original + OCR</div>
{sessionId && (
<img
src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/words-overlay`}
alt="Original + OCR"
className="w-full h-auto"
/>
)}
</div>
{/* RIGHT: Fortune Sheet — height adapts to content */}
<div className="flex-1 border border-gray-300 dark:border-gray-600 rounded-lg overflow-hidden bg-white dark:bg-gray-900">
<SpreadsheetView gridData={gridData} height={Math.max(700, leftHeight)} />
</div>
</div>
</div>
)
}

View File

@@ -0,0 +1,283 @@
'use client'
import { useCallback, useEffect, useRef, useState } from 'react'
import { useGridEditor } from '@/components/grid-editor/useGridEditor'
import type { GridZone } from '@/components/grid-editor/types'
import { GridTable } from '@/components/grid-editor/GridTable'
const KLAUSUR_API = '/klausur-api'
type BoxLayoutType = 'flowing' | 'columnar' | 'bullet_list' | 'header_only'
const LAYOUT_LABELS: Record<BoxLayoutType, string> = {
flowing: 'Fließtext',
columnar: 'Tabelle/Spalten',
bullet_list: 'Aufzählung',
header_only: 'Überschrift',
}
interface StepBoxGridReviewProps {
sessionId: string | null
onNext: () => void
}
export function StepBoxGridReview({ sessionId, onNext }: StepBoxGridReviewProps) {
const {
grid,
loading,
saving,
error,
dirty,
selectedCell,
setSelectedCell,
loadGrid,
saveGrid,
updateCellText,
toggleColumnBold,
toggleRowHeader,
undo,
redo,
canUndo,
canRedo,
getAdjacentCell,
commitUndoPoint,
selectedCells,
toggleCellSelection,
clearCellSelection,
toggleSelectedBold,
setCellColor,
deleteColumn,
addColumn,
deleteRow,
addRow,
} = useGridEditor(sessionId)
const [building, setBuilding] = useState(false)
const [buildError, setBuildError] = useState<string | null>(null)
// Load grid on mount
useEffect(() => {
if (sessionId) loadGrid()
}, [sessionId]) // eslint-disable-line react-hooks/exhaustive-deps
// Get box zones
const boxZones: GridZone[] = (grid?.zones || []).filter(
(z: GridZone) => z.zone_type === 'box'
)
// Build box grids via backend
const buildBoxGrids = useCallback(async (overrides?: Record<string, string>) => {
if (!sessionId) return
setBuilding(true)
setBuildError(null)
try {
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-box-grids`,
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ overrides: overrides || {} }),
},
)
if (!res.ok) {
const data = await res.json().catch(() => ({}))
throw new Error(data.detail || `HTTP ${res.status}`)
}
await loadGrid()
} catch (e) {
setBuildError(e instanceof Error ? e.message : String(e))
} finally {
setBuilding(false)
}
}, [sessionId, loadGrid])
// Handle layout type change for a specific box zone
const changeLayoutType = useCallback(async (boxIdx: number, layoutType: string) => {
await buildBoxGrids({ [String(boxIdx)]: layoutType })
}, [buildBoxGrids])
// Auto-build once on first load if box zones have no cells
const autoBuildDone = useRef(false)
useEffect(() => {
if (!grid || loading || building || autoBuildDone.current) return
const needsBuild = boxZones.some(z => !z.cells || z.cells.length === 0)
if (needsBuild && sessionId) {
autoBuildDone.current = true
buildBoxGrids()
}
}, [grid, loading]) // eslint-disable-line react-hooks/exhaustive-deps
if (loading) {
return (
<div className="flex items-center justify-center py-16">
<div className="w-8 h-8 border-4 border-teal-500 border-t-transparent rounded-full animate-spin" />
<span className="ml-3 text-gray-500">Lade Grid...</span>
</div>
)
}
// No boxes after build attempt — skip step
if (!building && boxZones.length === 0) {
return (
<div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-8 text-center">
<div className="text-4xl mb-3">📦</div>
<h3 className="text-lg font-semibold text-gray-900 dark:text-white mb-2">
Keine Boxen erkannt
</h3>
<p className="text-gray-500 dark:text-gray-400 mb-6">
Auf dieser Seite wurden keine eingebetteten Boxen (Grammatik-Tipps, Übungen etc.) erkannt.
</p>
<button
onClick={onNext}
className="px-6 py-2.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors font-medium"
>
Weiter
</button>
</div>
)
}
return (
<div className="space-y-4">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h3 className="text-lg font-semibold text-gray-900 dark:text-white">
Box-Review ({boxZones.length} {boxZones.length === 1 ? 'Box' : 'Boxen'})
</h3>
<p className="text-sm text-gray-500 dark:text-gray-400">
Eingebettete Boxen prüfen und korrigieren. Layout-Typ kann pro Box angepasst werden.
</p>
</div>
<div className="flex items-center gap-2">
{dirty && (
<button
onClick={saveGrid}
disabled={saving}
className="px-4 py-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700 transition-colors text-sm font-medium disabled:opacity-50"
>
{saving ? 'Speichere...' : 'Speichern'}
</button>
)}
<button
onClick={() => buildBoxGrids()}
disabled={building}
className="px-4 py-2 bg-amber-600 text-white rounded-lg hover:bg-amber-700 transition-colors text-sm font-medium disabled:opacity-50"
>
{building ? 'Verarbeite...' : 'Alle Boxen neu aufbauen'}
</button>
<button
onClick={async () => {
if (dirty) await saveGrid()
onNext()
}}
className="px-5 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors text-sm font-medium"
>
Weiter
</button>
</div>
</div>
{/* Errors */}
{(error || buildError) && (
<div className="p-3 bg-red-50 dark:bg-red-900/30 border border-red-200 dark:border-red-800 rounded-lg text-red-700 dark:text-red-300 text-sm">
{error || buildError}
</div>
)}
{building && (
<div className="flex items-center gap-3 p-4 bg-amber-50 dark:bg-amber-900/20 border border-amber-200 dark:border-amber-800 rounded-lg">
<div className="w-5 h-5 border-2 border-amber-500 border-t-transparent rounded-full animate-spin" />
<span className="text-amber-700 dark:text-amber-300 text-sm">Box-Grids werden aufgebaut...</span>
</div>
)}
{/* Box zones */}
{boxZones.map((zone, boxIdx) => {
const boxColor = zone.box_bg_hex || '#d97706' // amber fallback
const boxColorName = zone.box_bg_color || 'box'
return (
<div
key={zone.zone_index}
className="bg-white dark:bg-gray-800 rounded-xl overflow-hidden"
style={{ border: `3px solid ${boxColor}` }}
>
{/* Box header */}
<div
className="flex items-center justify-between px-4 py-3 border-b"
style={{ backgroundColor: `${boxColor}15`, borderColor: `${boxColor}30` }}
>
<div className="flex items-center gap-3">
<div
className="w-8 h-8 rounded-lg flex items-center justify-center text-white text-sm font-bold"
style={{ backgroundColor: boxColor }}
>
{boxIdx + 1}
</div>
<div>
<span className="font-medium text-gray-900 dark:text-white">
Box {boxIdx + 1}
</span>
<span className="text-xs text-gray-500 dark:text-gray-400 ml-2">
{zone.bbox_px?.w}x{zone.bbox_px?.h}px
{zone.cells?.length ? ` | ${zone.cells.length} Zellen` : ''}
{zone.box_layout_type ? ` | ${LAYOUT_LABELS[zone.box_layout_type as BoxLayoutType] || zone.box_layout_type}` : ''}
{boxColorName !== 'box' ? ` | ${boxColorName}` : ''}
</span>
</div>
</div>
<div className="flex items-center gap-2">
<label className="text-xs text-gray-500 dark:text-gray-400">Layout:</label>
<select
value={zone.box_layout_type || 'flowing'}
onChange={(e) => changeLayoutType(boxIdx, e.target.value)}
disabled={building}
className="text-xs px-2 py-1 rounded border border-gray-300 dark:border-gray-600 bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-200"
>
{Object.entries(LAYOUT_LABELS).map(([key, label]) => (
<option key={key} value={key}>{label}</option>
))}
</select>
</div>
</div>
{/* Box grid table */}
<div className="p-3">
{zone.cells && zone.cells.length > 0 ? (
<GridTable
zone={zone}
selectedCell={selectedCell}
selectedCells={selectedCells}
onSelectCell={setSelectedCell}
onCellTextChange={updateCellText}
onToggleColumnBold={toggleColumnBold}
onToggleRowHeader={toggleRowHeader}
onNavigate={(cellId, dir) => {
const next = getAdjacentCell(cellId, dir)
if (next) setSelectedCell(next)
}}
onDeleteColumn={deleteColumn}
onAddColumn={addColumn}
onDeleteRow={deleteRow}
onAddRow={addRow}
onToggleCellSelection={toggleCellSelection}
onSetCellColor={setCellColor}
/>
) : (
<div className="text-center py-8 text-gray-400">
<p className="text-sm">Keine Zellen erkannt.</p>
<button
onClick={() => buildBoxGrids({ [String(boxIdx)]: 'flowing' })}
className="mt-2 text-xs text-amber-600 hover:text-amber-700"
>
Als Fließtext verarbeiten
</button>
</div>
)}
</div>
</div>
)
})}
</div>
)
}

View File

@@ -0,0 +1,13 @@
'use client'
import { StepCrop as BaseStepCrop } from '@/components/ocr-pipeline/StepCrop'
interface StepContentCropProps {
sessionId: string | null
onNext: () => void
}
/** Thin wrapper around the shared StepCrop component */
export function StepContentCrop({ sessionId, onNext }: StepContentCropProps) {
return <BaseStepCrop key={sessionId} sessionId={sessionId} onNext={onNext} />
}

View File

@@ -0,0 +1,13 @@
'use client'
import { StepDeskew as BaseStepDeskew } from '@/components/ocr-pipeline/StepDeskew'
interface StepDeskewProps {
sessionId: string | null
onNext: () => void
}
/** Thin wrapper around the shared StepDeskew component */
export function StepDeskew({ sessionId, onNext }: StepDeskewProps) {
return <BaseStepDeskew key={sessionId} sessionId={sessionId} onNext={onNext} />
}

View File

@@ -0,0 +1,13 @@
'use client'
import { StepDewarp as BaseStepDewarp } from '@/components/ocr-pipeline/StepDewarp'
interface StepDewarpProps {
sessionId: string | null
onNext: () => void
}
/** Thin wrapper around the shared StepDewarp component */
export function StepDewarp({ sessionId, onNext }: StepDewarpProps) {
return <BaseStepDewarp key={sessionId} sessionId={sessionId} onNext={onNext} />
}

View File

@@ -0,0 +1,117 @@
'use client'
import { useState, useEffect } from 'react'
const KLAUSUR_API = '/klausur-api'
interface StepGridBuildProps {
sessionId: string | null
onNext: () => void
}
/**
* Step 9: Grid Build.
* Triggers the build-grid endpoint and shows progress.
*/
export function StepGridBuild({ sessionId, onNext }: StepGridBuildProps) {
const [building, setBuilding] = useState(false)
const [result, setResult] = useState<{ rows: number; cols: number; cells: number } | null>(null)
const [error, setError] = useState('')
const [autoTriggered, setAutoTriggered] = useState(false)
useEffect(() => {
if (!sessionId || autoTriggered) return
// Check if grid already exists
checkExistingGrid()
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [sessionId])
const checkExistingGrid = async () => {
if (!sessionId) return
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/grid-editor`)
if (res.ok) {
const data = await res.json()
// Use grid-editor summary (accurate zone-based counts)
const summary = data.summary
if (summary) {
setResult({ rows: summary.total_rows || 0, cols: summary.total_columns || 0, cells: summary.total_cells || 0 })
return
}
}
} catch { /* no existing grid */ }
// Auto-trigger build
setAutoTriggered(true)
buildGrid()
}
const buildGrid = async () => {
if (!sessionId) return
setBuilding(true)
setError('')
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid`, {
method: 'POST',
})
if (!res.ok) {
const data = await res.json().catch(() => ({}))
throw new Error(data.detail || `Grid-Build fehlgeschlagen (${res.status})`)
}
const data = await res.json()
// Use grid-editor summary (zone-based, more accurate than word_result.grid_shape)
const summary = data.summary
if (summary) {
setResult({ rows: summary.total_rows || 0, cols: summary.total_columns || 0, cells: summary.total_cells || 0 })
} else {
const shape = data.grid_shape || { rows: 0, cols: 0, total_cells: 0 }
setResult({ rows: shape.rows, cols: shape.cols, cells: shape.total_cells })
}
} catch (e) {
setError(e instanceof Error ? e.message : String(e))
} finally {
setBuilding(false)
}
}
return (
<div className="space-y-4">
{building && (
<div className="flex items-center gap-3 p-6 bg-blue-50 dark:bg-blue-900/20 rounded-xl border border-blue-200 dark:border-blue-800">
<div className="animate-spin w-5 h-5 border-2 border-blue-400 border-t-transparent rounded-full" />
<span className="text-sm text-blue-600 dark:text-blue-400">Grid wird aufgebaut...</span>
</div>
)}
{result && (
<div className="space-y-3">
<div className="p-4 bg-green-50 dark:bg-green-900/20 rounded-xl border border-green-200 dark:border-green-800">
<div className="text-sm font-medium text-green-700 dark:text-green-300">
Grid erstellt: {result.rows} Zeilen, {result.cols} Spalten, {result.cells} Zellen
</div>
</div>
<button
onClick={onNext}
className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700"
>
Weiter zum Review
</button>
</div>
)}
{error && (
<div className="space-y-3">
<div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
{error}
</div>
<button
onClick={buildGrid}
className="px-4 py-2 bg-orange-600 text-white text-sm rounded-lg hover:bg-orange-700"
>
Erneut versuchen
</button>
</div>
)}
</div>
)
}

View File

@@ -0,0 +1,15 @@
'use client'
import { StepGridReview as BaseStepGridReview } from '@/components/ocr-pipeline/StepGridReview'
import type { MutableRefObject } from 'react'
interface StepGridReviewProps {
sessionId: string | null
onNext: () => void
saveRef: MutableRefObject<(() => Promise<void>) | null>
}
/** Thin wrapper around the shared StepGridReview component */
export function StepGridReview({ sessionId, onNext, saveRef }: StepGridReviewProps) {
return <BaseStepGridReview sessionId={sessionId} onNext={onNext} saveRef={saveRef} />
}

View File

@@ -0,0 +1,295 @@
'use client'
import { useCallback, useEffect, useRef, useState } from 'react'
import { useGridEditor } from '@/components/grid-editor/useGridEditor'
import { GridTable } from '@/components/grid-editor/GridTable'
import { ImageLayoutEditor } from '@/components/grid-editor/ImageLayoutEditor'
import type { GridZone } from '@/components/grid-editor/types'
const KLAUSUR_API = '/klausur-api'
interface StepGroundTruthProps {
sessionId: string | null
isGroundTruth: boolean
onMarked: () => void
gridSaveRef: React.MutableRefObject<(() => Promise<void>) | null>
}
/**
* Step 12: Ground Truth marking.
*
* Shows the full Grid-Review view (original image + table) so the user
* can verify the final result before marking as Ground Truth reference.
*/
export function StepGroundTruth({ sessionId, isGroundTruth, onMarked, gridSaveRef }: StepGroundTruthProps) {
const {
grid,
loading,
saving,
error,
dirty,
selectedCell,
selectedCells,
setSelectedCell,
loadGrid,
saveGrid,
updateCellText,
toggleColumnBold,
toggleRowHeader,
undo,
redo,
canUndo,
canRedo,
getAdjacentCell,
deleteColumn,
addColumn,
deleteRow,
addRow,
toggleCellSelection,
clearCellSelection,
toggleSelectedBold,
setCellColor,
} = useGridEditor(sessionId)
const [showImage, setShowImage] = useState(true)
const [zoom, setZoom] = useState(100)
const [markSaving, setMarkSaving] = useState(false)
const [message, setMessage] = useState('')
// Expose save function via ref
useEffect(() => {
if (gridSaveRef) {
gridSaveRef.current = async () => {
if (dirty) await saveGrid()
}
return () => { gridSaveRef.current = null }
}
}, [gridSaveRef, dirty, saveGrid])
// Load grid on mount
useEffect(() => {
if (sessionId) loadGrid()
}, [sessionId, loadGrid])
// Keyboard shortcuts
useEffect(() => {
const handler = (e: KeyboardEvent) => {
if ((e.metaKey || e.ctrlKey) && e.key === 'z' && !e.shiftKey) {
e.preventDefault(); undo()
} else if ((e.metaKey || e.ctrlKey) && e.key === 'z' && e.shiftKey) {
e.preventDefault(); redo()
} else if ((e.metaKey || e.ctrlKey) && e.key === 's') {
e.preventDefault(); saveGrid()
} else if ((e.metaKey || e.ctrlKey) && e.key === 'b') {
e.preventDefault()
if (selectedCells.size > 0) toggleSelectedBold()
} else if (e.key === 'Escape') {
clearCellSelection()
}
}
window.addEventListener('keydown', handler)
return () => window.removeEventListener('keydown', handler)
}, [undo, redo, saveGrid, selectedCells, toggleSelectedBold, clearCellSelection])
const handleNavigate = useCallback(
(cellId: string, direction: 'up' | 'down' | 'left' | 'right') => {
const target = getAdjacentCell(cellId, direction)
if (target) {
setSelectedCell(target)
setTimeout(() => {
const el = document.getElementById(`cell-${target}`)
if (el) {
el.focus()
if (el instanceof HTMLInputElement) el.select()
}
}, 0)
}
},
[getAdjacentCell, setSelectedCell],
)
const handleMark = async () => {
if (!sessionId) return
setMarkSaving(true)
setMessage('')
try {
if (dirty) await saveGrid()
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/mark-ground-truth?pipeline=kombi`,
{ method: 'POST' },
)
if (!res.ok) {
const body = await res.text().catch(() => '')
throw new Error(`Ground Truth fehlgeschlagen (${res.status}): ${body}`)
}
const data = await res.json()
setMessage(`Ground Truth gespeichert (${data.cells_saved} Zellen)`)
onMarked()
} catch (e) {
setMessage(e instanceof Error ? e.message : String(e))
} finally {
setMarkSaving(false)
}
}
if (!sessionId) {
return <div className="text-center py-12 text-gray-400">Keine Session ausgewaehlt.</div>
}
if (loading) {
return (
<div className="flex items-center justify-center py-16">
<div className="flex items-center gap-3 text-gray-500 dark:text-gray-400">
<svg className="w-5 h-5 animate-spin" fill="none" viewBox="0 0 24 24">
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
</svg>
Grid wird geladen...
</div>
</div>
)
}
if (error) {
return (
<div className="bg-red-50 dark:bg-red-900/20 border border-red-200 dark:border-red-800 rounded-lg p-4">
<p className="text-sm text-red-700 dark:text-red-300">Fehler: {error}</p>
</div>
)
}
if (!grid || !grid.zones.length) {
return <div className="text-center py-12 text-gray-400">Kein Grid vorhanden.</div>
}
const imageUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/cropped`
return (
<div className="space-y-3">
{/* GT Header Bar */}
<div className="flex items-center justify-between p-3 bg-amber-50 dark:bg-amber-900/10 rounded-xl border border-amber-200 dark:border-amber-800">
<div>
<h3 className="text-sm font-medium text-amber-700 dark:text-amber-300">
Ground Truth
{isGroundTruth && <span className="ml-2 text-xs font-normal text-amber-500">(bereits markiert)</span>}
</h3>
<p className="text-xs text-amber-600 dark:text-amber-400 mt-0.5">
Pruefen Sie das Ergebnis und markieren Sie es als Referenz fuer Regressionstests.
</p>
</div>
<div className="flex items-center gap-2">
{dirty && (
<button
onClick={saveGrid}
disabled={saving}
className="px-3 py-1.5 text-xs bg-teal-600 text-white rounded-lg hover:bg-teal-700 disabled:opacity-50"
>
{saving ? 'Speichere...' : 'Speichern'}
</button>
)}
<button
onClick={handleMark}
disabled={markSaving}
className="px-4 py-1.5 text-xs bg-amber-600 text-white rounded-lg hover:bg-amber-700 disabled:opacity-50"
>
{markSaving ? 'Speichere...' : isGroundTruth ? 'GT aktualisieren' : 'Als Ground Truth markieren'}
</button>
</div>
</div>
{message && (
<div className={`text-sm p-2 rounded ${message.includes('fehlgeschlagen') ? 'text-red-500 bg-red-50 dark:bg-red-900/20' : 'text-amber-600 dark:text-amber-400 bg-amber-50 dark:bg-amber-900/10'}`}>
{message}
</div>
)}
{/* Stats */}
<div className="flex items-center gap-4 text-xs flex-wrap">
<span className="text-gray-500 dark:text-gray-400">
{grid.summary.total_zones} Zone(n), {grid.summary.total_columns} Spalten,{' '}
{grid.summary.total_rows} Zeilen, {grid.summary.total_cells} Zellen
</span>
<button
onClick={() => setShowImage(!showImage)}
className={`px-2.5 py-1 rounded text-xs border transition-colors ${
showImage
? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
: 'bg-gray-50 dark:bg-gray-800 border-gray-200 dark:border-gray-700 text-gray-500 dark:text-gray-400'
}`}
>
{showImage ? 'Bild ausblenden' : 'Bild einblenden'}
</button>
</div>
{/* Split View: Image left + Grid right */}
<div className={showImage ? 'grid grid-cols-2 gap-3' : ''} style={{ minHeight: '55vh' }}>
{showImage && (
<ImageLayoutEditor
imageUrl={imageUrl}
zones={grid.zones}
imageWidth={grid.image_width}
layoutDividers={grid.layout_dividers}
zoom={zoom}
onZoomChange={setZoom}
onColumnDividerMove={() => {}}
onHorizontalsChange={() => {}}
onCommitUndo={() => {}}
onSplitColumnAt={() => {}}
onDeleteColumn={() => {}}
/>
)}
<div className="space-y-3">
{(() => {
const groups: GridZone[][] = []
for (const zone of grid.zones) {
const prev = groups[groups.length - 1]
if (prev && zone.vsplit_group != null && prev[0].vsplit_group === zone.vsplit_group) {
prev.push(zone)
} else {
groups.push([zone])
}
}
return groups.map((group) => (
<div key={group[0].vsplit_group ?? group[0].zone_index}>
<div className={`${group.length > 1 ? 'flex gap-2' : ''}`}>
{group.map((zone) => (
<div
key={zone.zone_index}
className={`${group.length > 1 ? 'flex-1 min-w-0' : ''} bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700`}
>
<GridTable
zone={zone}
layoutMetrics={grid.layout_metrics}
selectedCell={selectedCell}
selectedCells={selectedCells}
onSelectCell={setSelectedCell}
onToggleCellSelection={toggleCellSelection}
onCellTextChange={updateCellText}
onToggleColumnBold={toggleColumnBold}
onToggleRowHeader={toggleRowHeader}
onNavigate={handleNavigate}
onDeleteColumn={deleteColumn}
onAddColumn={addColumn}
onDeleteRow={deleteRow}
onAddRow={addRow}
onSetCellColor={setCellColor}
/>
</div>
))}
</div>
</div>
))
})()}
</div>
</div>
{/* Keyboard tips */}
<div className="text-[11px] text-gray-400 dark:text-gray-500 flex items-center gap-4">
<span>Tab: naechste Zelle</span>
<span>Ctrl+Z/Y: Undo/Redo</span>
<span>Ctrl+S: Speichern</span>
</div>
</div>
)
}

View File

@@ -0,0 +1,422 @@
'use client'
import { useState, useEffect, useCallback } from 'react'
const KLAUSUR_API = '/klausur-api'
interface GutterSuggestion {
id: string
type: 'hyphen_join' | 'spell_fix'
zone_index: number
row_index: number
col_index: number
col_type: string
cell_id: string
original_text: string
suggested_text: string
next_row_index: number
next_row_cell_id: string
next_row_text: string
missing_chars: string
display_parts: string[]
alternatives: string[]
confidence: number
reason: string
}
interface GutterRepairResult {
suggestions: GutterSuggestion[]
stats: {
words_checked: number
gutter_candidates: number
suggestions_found: number
error?: string
}
duration_seconds: number
}
interface StepGutterRepairProps {
sessionId: string | null
onNext: () => void
}
/**
* Step 11: Gutter Repair (Wortkorrektur).
* Detects words truncated at the book gutter and proposes corrections.
* User can accept/reject each suggestion individually or in batch.
*/
export function StepGutterRepair({ sessionId, onNext }: StepGutterRepairProps) {
const [loading, setLoading] = useState(false)
const [applying, setApplying] = useState(false)
const [result, setResult] = useState<GutterRepairResult | null>(null)
const [accepted, setAccepted] = useState<Set<string>>(new Set())
const [rejected, setRejected] = useState<Set<string>>(new Set())
const [selectedText, setSelectedText] = useState<Record<string, string>>({})
const [applied, setApplied] = useState(false)
const [error, setError] = useState('')
const [applyMessage, setApplyMessage] = useState('')
const analyse = useCallback(async () => {
if (!sessionId) return
setLoading(true)
setError('')
setApplied(false)
setApplyMessage('')
try {
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/gutter-repair`,
{ method: 'POST' },
)
if (!res.ok) {
const body = await res.json().catch(() => ({}))
throw new Error(body.detail || `Analyse fehlgeschlagen (${res.status})`)
}
const data: GutterRepairResult = await res.json()
setResult(data)
// Auto-accept all suggestions with high confidence
const autoAccept = new Set<string>()
for (const s of data.suggestions) {
if (s.confidence >= 0.85) {
autoAccept.add(s.id)
}
}
setAccepted(autoAccept)
setRejected(new Set())
} catch (e) {
setError(e instanceof Error ? e.message : String(e))
} finally {
setLoading(false)
}
}, [sessionId])
// Auto-trigger analysis on mount
useEffect(() => {
if (sessionId) analyse()
}, [sessionId, analyse])
const toggleSuggestion = (id: string) => {
setAccepted(prev => {
const next = new Set(prev)
if (next.has(id)) {
next.delete(id)
setRejected(r => new Set(r).add(id))
} else {
next.add(id)
setRejected(r => { const n = new Set(r); n.delete(id); return n })
}
return next
})
}
const acceptAll = () => {
if (!result) return
setAccepted(new Set(result.suggestions.map(s => s.id)))
setRejected(new Set())
}
const rejectAll = () => {
if (!result) return
setRejected(new Set(result.suggestions.map(s => s.id)))
setAccepted(new Set())
}
const applyAccepted = async () => {
if (!sessionId || accepted.size === 0) return
setApplying(true)
setApplyMessage('')
try {
const res = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/gutter-repair/apply`,
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
accepted: Array.from(accepted),
text_overrides: selectedText,
}),
},
)
if (!res.ok) {
const body = await res.json().catch(() => ({}))
throw new Error(body.detail || `Anwenden fehlgeschlagen (${res.status})`)
}
const data = await res.json()
setApplied(true)
setApplyMessage(`${data.applied_count} Korrektur(en) angewendet.`)
} catch (e) {
setApplyMessage(e instanceof Error ? e.message : String(e))
} finally {
setApplying(false)
}
}
const suggestions = result?.suggestions || []
const hasSuggestions = suggestions.length > 0
return (
<div className="space-y-4">
{/* Header */}
<div className="flex items-center justify-between">
<div>
<h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
Wortkorrektur (Buchfalz)
</h3>
<p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
Erkennt abgeschnittene oder unscharfe Woerter am Buchfalz und Bindestrich-Trennungen ueber Zeilen hinweg.
</p>
</div>
{result && !loading && (
<button
onClick={analyse}
className="px-3 py-1.5 text-xs bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 rounded-lg hover:bg-gray-200 dark:hover:bg-gray-600"
>
Erneut analysieren
</button>
)}
</div>
{/* Loading */}
{loading && (
<div className="flex items-center gap-3 p-6 bg-blue-50 dark:bg-blue-900/20 rounded-xl border border-blue-200 dark:border-blue-800">
<div className="animate-spin w-5 h-5 border-2 border-blue-400 border-t-transparent rounded-full" />
<span className="text-sm text-blue-600 dark:text-blue-400">Analysiere Woerter am Buchfalz...</span>
</div>
)}
{/* Error */}
{error && (
<div className="space-y-3">
<div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
{error}
</div>
<button
onClick={analyse}
className="px-4 py-2 bg-orange-600 text-white text-sm rounded-lg hover:bg-orange-700"
>
Erneut versuchen
</button>
</div>
)}
{/* No suggestions */}
{result && !hasSuggestions && !loading && (
<div className="p-4 bg-green-50 dark:bg-green-900/20 rounded-xl border border-green-200 dark:border-green-800">
<div className="text-sm font-medium text-green-700 dark:text-green-300">
Keine Buchfalz-Fehler erkannt.
</div>
<div className="text-xs text-green-600 dark:text-green-400 mt-1">
{result.stats.words_checked} Woerter geprueft, {result.stats.gutter_candidates} Kandidaten am Rand analysiert.
</div>
</div>
)}
{/* Suggestions list */}
{hasSuggestions && !loading && (
<>
{/* Stats bar */}
<div className="flex items-center justify-between p-3 bg-gray-50 dark:bg-gray-800 rounded-lg">
<div className="text-xs text-gray-500 dark:text-gray-400">
{suggestions.length} Vorschlag/Vorschlaege &middot;{' '}
{result!.stats.words_checked} Woerter geprueft &middot;{' '}
{result!.duration_seconds}s
</div>
<div className="flex gap-2">
<button
onClick={acceptAll}
disabled={applied}
className="px-2 py-1 text-xs bg-green-100 dark:bg-green-900/30 text-green-700 dark:text-green-300 rounded hover:bg-green-200 dark:hover:bg-green-900/50 disabled:opacity-50"
>
Alle akzeptieren
</button>
<button
onClick={rejectAll}
disabled={applied}
className="px-2 py-1 text-xs bg-red-100 dark:bg-red-900/30 text-red-700 dark:text-red-300 rounded hover:bg-red-200 dark:hover:bg-red-900/50 disabled:opacity-50"
>
Alle ablehnen
</button>
</div>
</div>
{/* Suggestion cards */}
<div className="space-y-2">
{suggestions.map((s) => {
const isAccepted = accepted.has(s.id)
const isRejected = rejected.has(s.id)
return (
<div
key={s.id}
className={`p-3 rounded-lg border transition-colors ${
applied
? isAccepted
? 'bg-green-50 dark:bg-green-900/10 border-green-200 dark:border-green-800'
: 'bg-gray-50 dark:bg-gray-800/50 border-gray-200 dark:border-gray-700 opacity-60'
: isAccepted
? 'bg-green-50 dark:bg-green-900/10 border-green-300 dark:border-green-700'
: isRejected
? 'bg-red-50 dark:bg-red-900/10 border-red-200 dark:border-red-800 opacity-60'
: 'bg-white dark:bg-gray-800 border-gray-200 dark:border-gray-700'
}`}
>
<div className="flex items-start justify-between gap-3">
{/* Left: suggestion details */}
<div className="flex-1 min-w-0">
{/* Type badge */}
<div className="flex items-center gap-2 mb-1.5">
<span className={`inline-flex px-1.5 py-0.5 text-[10px] font-medium rounded ${
s.type === 'hyphen_join'
? 'bg-purple-100 dark:bg-purple-900/30 text-purple-700 dark:text-purple-300'
: 'bg-orange-100 dark:bg-orange-900/30 text-orange-700 dark:text-orange-300'
}`}>
{s.type === 'hyphen_join' ? 'Zeilenumbruch' : 'Buchfalz-Korrektur'}
</span>
<span className="text-[10px] text-gray-400">
Zeile {s.row_index + 1}, Spalte {s.col_index + 1}
{s.col_type && ` (${s.col_type.replace('column_', '')})`}
</span>
<span className={`text-[10px] ${
s.confidence >= 0.9 ? 'text-green-500' :
s.confidence >= 0.7 ? 'text-yellow-500' : 'text-red-500'
}`}>
{Math.round(s.confidence * 100)}%
</span>
</div>
{/* Correction display */}
{s.type === 'hyphen_join' ? (
<div className="space-y-1">
<div className="flex items-center gap-2 text-sm">
<span className="font-mono text-red-600 dark:text-red-400 line-through">
{s.original_text}
</span>
<span className="text-gray-400 text-xs">Z.{s.row_index + 1}</span>
<span className="text-gray-300 dark:text-gray-600">+</span>
<span className="font-mono text-red-600 dark:text-red-400 line-through">
{s.next_row_text.split(' ')[0]}
</span>
<span className="text-gray-400 text-xs">Z.{s.next_row_index + 1}</span>
<span className="text-gray-400">&rarr;</span>
<span className="font-mono text-green-600 dark:text-green-400 font-semibold">
{s.suggested_text}
</span>
</div>
{s.missing_chars && (
<div className="text-[10px] text-gray-400">
Fehlende Zeichen: <span className="font-mono font-semibold">{s.missing_chars}</span>
{' '}&middot; Darstellung: <span className="font-mono">{s.display_parts.join(' | ')}</span>
</div>
)}
</div>
) : (
<div className="space-y-1">
<div className="flex items-center gap-2 text-sm">
<span className="font-mono text-red-600 dark:text-red-400 line-through">
{s.original_text}
</span>
<span className="text-gray-400">&rarr;</span>
<span className="font-mono text-green-600 dark:text-green-400 font-semibold">
{selectedText[s.id] || s.suggested_text}
</span>
</div>
{/* Alternatives: show other candidates the user can pick */}
{s.alternatives && s.alternatives.length > 0 && !applied && (
<div className="flex items-center gap-1.5 flex-wrap">
<span className="text-[10px] text-gray-400">Alternativen:</span>
{[s.suggested_text, ...s.alternatives].map((alt) => {
const isSelected = (selectedText[s.id] || s.suggested_text) === alt
return (
<button
key={alt}
onClick={() => setSelectedText(prev => ({ ...prev, [s.id]: alt }))}
className={`px-1.5 py-0.5 text-[11px] font-mono rounded transition-colors ${
isSelected
? 'bg-green-200 dark:bg-green-800 text-green-800 dark:text-green-200 font-semibold'
: 'bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 hover:bg-gray-200 dark:hover:bg-gray-600'
}`}
>
{alt}
</button>
)
})}
</div>
)}
</div>
)}
</div>
{/* Right: accept/reject toggle */}
{!applied && (
<button
onClick={() => toggleSuggestion(s.id)}
className={`flex-shrink-0 w-8 h-8 rounded-full flex items-center justify-center text-sm transition-colors ${
isAccepted
? 'bg-green-500 text-white hover:bg-green-600'
: isRejected
? 'bg-red-400 text-white hover:bg-red-500'
: 'bg-gray-200 dark:bg-gray-600 text-gray-500 dark:text-gray-300 hover:bg-gray-300 dark:hover:bg-gray-500'
}`}
title={isAccepted ? 'Akzeptiert (klicken zum Ablehnen)' : isRejected ? 'Abgelehnt (klicken zum Akzeptieren)' : 'Klicken zum Akzeptieren'}
>
{isAccepted ? '\u2713' : isRejected ? '\u2717' : '?'}
</button>
)}
</div>
</div>
)
})}
</div>
{/* Apply / Next buttons */}
<div className="flex items-center gap-3 pt-2">
{!applied ? (
<button
onClick={applyAccepted}
disabled={applying || accepted.size === 0}
className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700 disabled:opacity-50"
>
{applying ? 'Wird angewendet...' : `${accepted.size} Korrektur(en) anwenden`}
</button>
) : (
<button
onClick={onNext}
className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700"
>
Weiter zu Ground Truth
</button>
)}
{!applied && (
<button
onClick={onNext}
className="px-4 py-2 text-sm text-gray-500 dark:text-gray-400 hover:text-gray-700 dark:hover:text-gray-200"
>
Ueberspringen
</button>
)}
</div>
{/* Apply result message */}
{applyMessage && (
<div className={`text-sm p-2 rounded ${
applyMessage.includes('fehlgeschlagen')
? 'text-red-500 bg-red-50 dark:bg-red-900/20'
: 'text-green-600 dark:text-green-400 bg-green-50 dark:bg-green-900/20'
}`}>
{applyMessage}
</div>
)}
</>
)}
{/* Skip button when no suggestions */}
{result && !hasSuggestions && !loading && (
<button
onClick={onNext}
className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700"
>
Weiter zu Ground Truth
</button>
)}
</div>
)
}

View File

@@ -0,0 +1,30 @@
'use client'
import { PaddleDirectStep } from '@/components/ocr-overlay/PaddleDirectStep'
interface StepOcrProps {
sessionId: string | null
onNext: () => void
}
/**
* Step 7: OCR (Kombi mode = PaddleOCR + Tesseract).
*
* Phase 1: Uses the existing PaddleDirectStep with kombi endpoint.
* Phase 3 (later) will add transparent 3-phase progress + engine comparison.
*/
export function StepOcr({ sessionId, onNext }: StepOcrProps) {
return (
<PaddleDirectStep
sessionId={sessionId}
onNext={onNext}
endpoint="paddle-kombi"
title="Kombi-Modus"
description="PP-OCRv5 und Tesseract laufen parallel. Koordinaten werden gewichtet gemittelt fuer optimale Positionierung."
icon="🔀"
buttonLabel="PP-OCRv5 + Tesseract starten"
runningLabel="PP-OCRv5 + Tesseract laufen..."
engineKey="kombi"
/>
)
}

View File

@@ -0,0 +1,21 @@
'use client'
import { StepOrientation as BaseStepOrientation } from '@/components/ocr-pipeline/StepOrientation'
interface StepOrientationProps {
sessionId: string | null
onNext: () => void
onSessionList: () => void
}
/** Thin wrapper — adapts the shared StepOrientation to the Kombi pipeline's simpler onNext() */
export function StepOrientation({ sessionId, onNext, onSessionList }: StepOrientationProps) {
return (
<BaseStepOrientation
key={sessionId}
sessionId={sessionId}
onNext={() => onNext()}
onSessionList={onSessionList}
/>
)
}

View File

@@ -0,0 +1,198 @@
'use client'
import { useState, useEffect, useRef } from 'react'
const KLAUSUR_API = '/klausur-api'
interface PageSplitResult {
multi_page: boolean
page_count?: number
page_splits?: { x: number; y: number; width: number; height: number; page_index: number }[]
sub_sessions?: { id: string; name: string; page_index: number }[]
used_original?: boolean
duration_seconds?: number
}
interface StepPageSplitProps {
sessionId: string | null
sessionName: string
onNext: () => void
onSplitComplete: (firstChildId: string, firstChildName: string) => void
}
export function StepPageSplit({ sessionId, sessionName, onNext, onSplitComplete }: StepPageSplitProps) {
const [detecting, setDetecting] = useState(false)
const [splitResult, setSplitResult] = useState<PageSplitResult | null>(null)
const [error, setError] = useState('')
const didDetect = useRef(false)
// Auto-detect page split when step opens
useEffect(() => {
if (!sessionId || didDetect.current) return
didDetect.current = true
detectPageSplit()
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [sessionId])
const detectPageSplit = async () => {
if (!sessionId) return
setDetecting(true)
setError('')
try {
// First check if this session was already split (status='split')
const sessionRes = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}`)
if (sessionRes.ok) {
const sessionData = await sessionRes.json()
if (sessionData.status === 'split' && sessionData.crop_result?.multi_page) {
// Already split — find the child sessions in the session list
const listRes = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
if (listRes.ok) {
const listData = await listRes.json()
// Child sessions have names like "ParentName — Seite N"
const baseName = sessionName || sessionData.name || ''
const children = (listData.sessions || [])
.filter((s: { name?: string }) => s.name?.startsWith(baseName + ' — '))
.sort((a: { name: string }, b: { name: string }) => a.name.localeCompare(b.name))
if (children.length > 0) {
setSplitResult({
multi_page: true,
page_count: children.length,
sub_sessions: children.map((s: { id: string; name: string }, i: number) => ({
id: s.id, name: s.name, page_index: i,
})),
})
onSplitComplete(children[0].id, children[0].name)
setDetecting(false)
return
}
}
}
}
// Run page-split detection
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/page-split`, {
method: 'POST',
})
if (!res.ok) {
const data = await res.json().catch(() => ({}))
throw new Error(data.detail || 'Seitentrennung fehlgeschlagen')
}
const data: PageSplitResult = await res.json()
setSplitResult(data)
if (data.multi_page && data.sub_sessions?.length) {
// Rename sub-sessions to "Title — S. 1", "Title — S. 2"
const baseName = sessionName || 'Dokument'
for (let i = 0; i < data.sub_sessions.length; i++) {
const sub = data.sub_sessions[i]
const newName = `${baseName} — S. ${i + 1}`
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sub.id}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name: newName }),
}).catch(() => {})
sub.name = newName
}
// Signal parent to switch to the first child session
onSplitComplete(data.sub_sessions[0].id, data.sub_sessions[0].name)
}
} catch (e) {
setError(e instanceof Error ? e.message : String(e))
} finally {
setDetecting(false)
}
}
if (!sessionId) return null
const imageUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/oriented`
return (
<div className="space-y-4">
{/* Image */}
<div className="relative rounded-lg overflow-hidden bg-gray-100 dark:bg-gray-700">
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={imageUrl}
alt="Orientiertes Bild"
className="w-full object-contain max-h-[500px]"
onError={(e) => {
// Fallback to non-oriented image
(e.target as HTMLImageElement).src =
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image`
}}
/>
</div>
{/* Detection status */}
{detecting && (
<div className="flex items-center gap-2 text-teal-600 dark:text-teal-400 text-sm">
<div className="animate-spin w-4 h-4 border-2 border-teal-500 border-t-transparent rounded-full" />
Doppelseiten-Erkennung laeuft...
</div>
)}
{/* Detection result */}
{splitResult && !detecting && (
splitResult.multi_page ? (
<div className="bg-blue-50 dark:bg-blue-900/20 rounded-lg border border-blue-200 dark:border-blue-700 p-4 space-y-2">
<div className="text-sm font-medium text-blue-700 dark:text-blue-300">
Doppelseite erkannt {splitResult.page_count} Seiten getrennt
</div>
<p className="text-xs text-blue-600 dark:text-blue-400">
Jede Seite wird als eigene Session weiterverarbeitet (eigene Begradigung, Entzerrung, etc.).
{splitResult.used_original && ' Trennung auf Originalbild, da Orientierung die Doppelseite gedreht hat.'}
</p>
<div className="flex gap-2 mt-2">
{splitResult.sub_sessions?.map(s => (
<span
key={s.id}
className="text-xs px-2.5 py-1 rounded-md bg-blue-100 dark:bg-blue-800/40 text-blue-700 dark:text-blue-300 font-medium"
>
{s.name}
</span>
))}
</div>
{splitResult.duration_seconds != null && (
<div className="text-xs text-gray-400">{splitResult.duration_seconds.toFixed(1)}s</div>
)}
</div>
) : (
<div className="bg-green-50 dark:bg-green-900/20 rounded-lg border border-green-200 dark:border-green-800 p-4">
<div className="flex items-center gap-2 text-sm font-medium text-green-700 dark:text-green-300">
<span>&#10003;</span> Einzelseite keine Trennung noetig
</div>
{splitResult.duration_seconds != null && (
<div className="text-xs text-gray-400 mt-1">{splitResult.duration_seconds.toFixed(1)}s</div>
)}
</div>
)
)}
{/* Error */}
{error && (
<div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
{error}
<button
onClick={() => { didDetect.current = false; detectPageSplit() }}
className="ml-2 text-teal-600 hover:underline"
>
Erneut versuchen
</button>
</div>
)}
{/* Next button — only show when detection is done */}
{(splitResult || error) && !detecting && (
<div className="flex justify-end">
<button
onClick={onNext}
className="px-6 py-2.5 bg-teal-600 text-white text-sm font-medium rounded-lg hover:bg-teal-700 transition-colors"
>
Weiter &rarr;
</button>
</div>
)}
</div>
)
}

View File

@@ -0,0 +1,13 @@
'use client'
import { StepStructureDetection } from '@/components/ocr-pipeline/StepStructureDetection'
interface StepStructureProps {
sessionId: string | null
onNext: () => void
}
/** Thin wrapper around the shared StepStructureDetection component */
export function StepStructure({ sessionId, onNext }: StepStructureProps) {
return <StepStructureDetection sessionId={sessionId} onNext={onNext} />
}

View File

@@ -0,0 +1,303 @@
'use client'
import { useState, useCallback, useEffect } from 'react'
import { DOCUMENT_CATEGORIES, type DocumentCategory } from '@/app/(admin)/ai/ocr-kombi/types'
const KLAUSUR_API = '/klausur-api'
interface StepUploadProps {
sessionId: string | null
onUploaded: (sessionId: string, name: string) => void
onNext: () => void
}
export function StepUpload({ sessionId, onUploaded, onNext }: StepUploadProps) {
const [dragging, setDragging] = useState(false)
const [uploading, setUploading] = useState(false)
const [selectedFile, setSelectedFile] = useState<File | null>(null)
const [preview, setPreview] = useState<string | null>(null)
const [title, setTitle] = useState('')
const [category, setCategory] = useState<DocumentCategory>('vokabelseite')
const [error, setError] = useState('')
// Clean up preview URL on unmount
useEffect(() => {
return () => { if (preview) URL.revokeObjectURL(preview) }
}, [preview])
const handleFileSelect = useCallback((file: File) => {
setSelectedFile(file)
setError('')
if (file.type.startsWith('image/')) {
setPreview(URL.createObjectURL(file))
} else {
setPreview(null)
}
// Auto-fill title from filename if empty
if (!title.trim()) {
setTitle(file.name.replace(/\.[^.]+$/, ''))
}
}, [title])
const handleUpload = useCallback(async () => {
if (!selectedFile) return
setUploading(true)
setError('')
try {
const formData = new FormData()
formData.append('file', selectedFile)
if (title.trim()) formData.append('name', title.trim())
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`, {
method: 'POST',
body: formData,
})
if (!res.ok) {
const data = await res.json().catch(() => ({}))
throw new Error(data.detail || `Upload fehlgeschlagen (${res.status})`)
}
const data = await res.json()
const sid = data.session_id || data.id
// Set category
if (category) {
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ document_category: category }),
})
}
onUploaded(sid, title.trim() || selectedFile.name)
} catch (e) {
setError(e instanceof Error ? e.message : String(e))
} finally {
setUploading(false)
}
}, [selectedFile, title, category, onUploaded])
const handleDrop = useCallback((e: React.DragEvent) => {
e.preventDefault()
setDragging(false)
const file = e.dataTransfer.files[0]
if (file) handleFileSelect(file)
}, [handleFileSelect])
const handleInputChange = useCallback((e: React.ChangeEvent<HTMLInputElement>) => {
const file = e.target.files?.[0]
if (file) handleFileSelect(file)
}, [handleFileSelect])
const clearFile = useCallback(() => {
setSelectedFile(null)
if (preview) URL.revokeObjectURL(preview)
setPreview(null)
}, [preview])
// ---- Phase 2: Uploaded → show result + "Weiter" ----
if (sessionId) {
return (
<div className="space-y-4">
<div className="bg-green-50 dark:bg-green-900/20 border border-green-200 dark:border-green-800 rounded-lg p-4">
<div className="flex items-center gap-2 text-green-700 dark:text-green-300 text-sm font-medium mb-3">
<span>&#10003;</span> Dokument hochgeladen
</div>
<div className="flex gap-4">
<div className="w-48 h-64 rounded-lg overflow-hidden bg-gray-100 dark:bg-gray-700 flex-shrink-0 border border-gray-200 dark:border-gray-600">
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image`}
alt="Hochgeladenes Dokument"
className="w-full h-full object-contain"
onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
/>
</div>
<div className="text-sm text-gray-600 dark:text-gray-400">
<div className="font-medium text-gray-700 dark:text-gray-300 mb-1">
{title || 'Dokument'}
</div>
<div className="text-xs text-gray-400 mt-1">
Kategorie: {DOCUMENT_CATEGORIES.find(c => c.value === category)?.label || category}
</div>
<div className="text-xs font-mono text-gray-400 mt-1">
Session: {sessionId.slice(0, 8)}...
</div>
</div>
</div>
</div>
<div className="flex justify-end">
<button
onClick={onNext}
className="px-6 py-2.5 bg-teal-600 text-white text-sm font-medium rounded-lg hover:bg-teal-700 transition-colors"
>
Weiter &rarr;
</button>
</div>
</div>
)
}
// ---- Phase 1b: File selected → preview + "Hochladen" ----
if (selectedFile) {
return (
<div className="space-y-4">
{/* Title input */}
<div>
<label className="block text-sm font-medium text-gray-700 dark:text-gray-300 mb-1">
Titel
</label>
<input
type="text"
value={title}
onChange={(e) => setTitle(e.target.value)}
placeholder="z.B. Vokabeln Unit 3"
className="w-full px-3 py-2 border border-gray-300 dark:border-gray-600 rounded-lg bg-white dark:bg-gray-800 text-sm"
/>
</div>
{/* Category selector */}
<div>
<label className="block text-sm font-medium text-gray-700 dark:text-gray-300 mb-1">
Kategorie
</label>
<div className="grid grid-cols-4 gap-1.5">
{DOCUMENT_CATEGORIES.map(cat => (
<button
key={cat.value}
onClick={() => setCategory(cat.value)}
className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
category === cat.value
? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300 ring-1 ring-teal-400'
: 'bg-gray-50 dark:bg-gray-700 text-gray-600 dark:text-gray-400 hover:bg-gray-100'
}`}
>
{cat.icon} {cat.label}
</button>
))}
</div>
</div>
{/* File preview */}
<div className="border border-gray-200 dark:border-gray-700 rounded-xl p-4">
<div className="flex items-start gap-4">
{preview ? (
<div className="w-36 h-48 rounded-lg overflow-hidden bg-gray-100 dark:bg-gray-700 flex-shrink-0 border border-gray-200 dark:border-gray-600">
{/* eslint-disable-next-line @next/next/no-img-element */}
<img src={preview} alt="Vorschau" className="w-full h-full object-contain" />
</div>
) : (
<div className="w-36 h-48 rounded-lg bg-gray-100 dark:bg-gray-700 flex-shrink-0 flex items-center justify-center border border-gray-200 dark:border-gray-600">
<span className="text-3xl">&#128196;</span>
</div>
)}
<div className="flex-1 min-w-0">
<div className="font-medium text-sm text-gray-700 dark:text-gray-300 truncate">
{selectedFile.name}
</div>
<div className="text-xs text-gray-400 mt-1">
{(selectedFile.size / 1024 / 1024).toFixed(1)} MB
</div>
<button
onClick={clearFile}
className="text-xs text-red-500 hover:text-red-700 mt-2"
>
Andere Datei waehlen
</button>
</div>
</div>
<button
onClick={handleUpload}
disabled={uploading}
className="mt-4 w-full px-4 py-2.5 bg-teal-600 text-white text-sm font-medium rounded-lg hover:bg-teal-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
>
{uploading ? 'Wird hochgeladen...' : 'Hochladen'}
</button>
</div>
{error && (
<div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
{error}
</div>
)}
</div>
)
}
// ---- Phase 1a: No file → drop zone ----
return (
<div className="space-y-4">
{/* Title input */}
<div>
<label className="block text-sm font-medium text-gray-700 dark:text-gray-300 mb-1">
Titel (optional)
</label>
<input
type="text"
value={title}
onChange={(e) => setTitle(e.target.value)}
placeholder="z.B. Vokabeln Unit 3"
className="w-full px-3 py-2 border border-gray-300 dark:border-gray-600 rounded-lg bg-white dark:bg-gray-800 text-sm"
/>
</div>
{/* Category selector */}
<div>
<label className="block text-sm font-medium text-gray-700 dark:text-gray-300 mb-1">
Kategorie
</label>
<div className="grid grid-cols-4 gap-1.5">
{DOCUMENT_CATEGORIES.map(cat => (
<button
key={cat.value}
onClick={() => setCategory(cat.value)}
className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
category === cat.value
? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300 ring-1 ring-teal-400'
: 'bg-gray-50 dark:bg-gray-700 text-gray-600 dark:text-gray-400 hover:bg-gray-100'
}`}
>
{cat.icon} {cat.label}
</button>
))}
</div>
</div>
{/* Drop zone */}
<div
onDragOver={(e) => { e.preventDefault(); setDragging(true) }}
onDragLeave={() => setDragging(false)}
onDrop={handleDrop}
className={`border-2 border-dashed rounded-xl p-12 text-center transition-colors ${
dragging
? 'border-teal-400 bg-teal-50 dark:bg-teal-900/20'
: 'border-gray-300 dark:border-gray-600 hover:border-gray-400'
}`}
>
<div className="text-4xl mb-3">&#128228;</div>
<div className="text-sm text-gray-600 dark:text-gray-400 mb-2">
Bild oder PDF hierher ziehen
</div>
<label className="inline-block px-4 py-2 bg-teal-600 text-white text-sm rounded-lg cursor-pointer hover:bg-teal-700">
Datei auswaehlen
<input
type="file"
accept="image/*,.pdf"
onChange={handleInputChange}
className="hidden"
/>
</label>
</div>
{error && (
<div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
{error}
</div>
)}
</div>
)
}

View File

@@ -1,6 +1,6 @@
'use client'
import type { SubSession } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { SubSession } from '@/app/(admin)/ai/ocr-kombi/types'
interface BoxSessionTabsProps {
parentSessionId: string
@@ -21,6 +21,7 @@ function getStatusIcon(sub: SubSession): string {
return STATUS_ICONS.pending
}
/** Tabs for box sub-sessions (from column detection zone_type='box'). */
export function BoxSessionTabs({ parentSessionId, subSessions, activeSessionId, onSessionChange }: BoxSessionTabsProps) {
if (subSessions.length === 0) return null
@@ -28,7 +29,6 @@ export function BoxSessionTabs({ parentSessionId, subSessions, activeSessionId,
return (
<div className="flex items-center gap-1.5 px-1 py-1.5 bg-gray-50 dark:bg-gray-800/50 rounded-xl border border-gray-200 dark:border-gray-700">
{/* Main session tab */}
<button
onClick={() => onSessionChange(parentSessionId)}
className={`px-3 py-1.5 rounded-lg text-xs font-medium transition-colors ${
@@ -42,7 +42,6 @@ export function BoxSessionTabs({ parentSessionId, subSessions, activeSessionId,
<div className="w-px h-5 bg-gray-200 dark:bg-gray-700" />
{/* Sub-session tabs */}
{subSessions.map((sub) => {
const isActive = activeSessionId === sub.id
const icon = getStatusIcon(sub)

View File

@@ -1,7 +1,7 @@
'use client'
import { useState, useMemo } from 'react'
import type { ColumnResult, ColumnGroundTruth, PageRegion } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { ColumnResult, ColumnGroundTruth, PageRegion } from '@/app/(admin)/ai/ocr-kombi/types'
interface ColumnControlsProps {
columnResult: ColumnResult | null

View File

@@ -1,7 +1,7 @@
'use client'
import { useState } from 'react'
import type { DeskewResult, DeskewGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { DeskewResult, DeskewGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'
interface DeskewControlsProps {
deskewResult: DeskewResult | null

View File

@@ -1,7 +1,7 @@
'use client'
import { useEffect, useState } from 'react'
import type { DeskewResult, DewarpResult, DewarpDetection, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { DeskewResult, DewarpResult, DewarpDetection, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'
interface DewarpControlsProps {
dewarpResult: DewarpResult | null

View File

@@ -1,7 +1,7 @@
'use client'
import { useCallback, useEffect, useRef, useState } from 'react'
import type { GridCell } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { GridCell } from '@/app/(admin)/ai/ocr-kombi/types'
const KLAUSUR_API = '/klausur-api'

View File

@@ -1,7 +1,7 @@
'use client'
import { useCallback, useEffect, useRef, useState } from 'react'
import type { ColumnTypeKey, PageRegion } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { ColumnTypeKey, PageRegion } from '@/app/(admin)/ai/ocr-kombi/types'
const COLUMN_TYPES: { value: ColumnTypeKey; label: string }[] = [
{ value: 'column_en', label: 'EN' },

View File

@@ -1,6 +1,6 @@
'use client'
import { PipelineStep, DocumentTypeResult } from '@/app/(admin)/ai/ocr-pipeline/types'
import { PipelineStep, DocumentTypeResult } from '@/app/(admin)/ai/ocr-kombi/types'
const DOC_TYPE_LABELS: Record<string, string> = {
vocab_table: 'Vokabeltabelle',

View File

@@ -1,10 +1,10 @@
'use client'
import { useCallback, useEffect, useState } from 'react'
import type { ColumnResult, ColumnGroundTruth, PageRegion, SubSession } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { ColumnResult, ColumnGroundTruth, PageRegion, SubSession } from '@/app/(admin)/ai/ocr-kombi/types'
import { ColumnControls } from './ColumnControls'
import { ManualColumnEditor } from './ManualColumnEditor'
import type { ColumnTypeKey } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { ColumnTypeKey } from '@/app/(admin)/ai/ocr-kombi/types'
const KLAUSUR_API = '/klausur-api'

View File

@@ -1,7 +1,7 @@
'use client'
import { useEffect, useState } from 'react'
import type { CropResult } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { CropResult } from '@/app/(admin)/ai/ocr-kombi/types'
import { ImageCompareView } from './ImageCompareView'
const KLAUSUR_API = '/klausur-api'

View File

@@ -1,7 +1,7 @@
'use client'
import { useCallback, useEffect, useState } from 'react'
import type { DeskewGroundTruth, DeskewResult, SessionInfo } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { DeskewGroundTruth, DeskewResult, SessionInfo } from '@/app/(admin)/ai/ocr-kombi/types'
import { DeskewControls } from './DeskewControls'
import { ImageCompareView } from './ImageCompareView'

View File

@@ -1,7 +1,7 @@
'use client'
import { useCallback, useEffect, useState } from 'react'
import type { DeskewResult, DewarpResult, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { DeskewResult, DewarpResult, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'
import { DewarpControls } from './DewarpControls'
import { ImageCompareView } from './ImageCompareView'

View File

@@ -0,0 +1,537 @@
'use client'
/**
* StepGridReview — Last step of the Kombi Pipeline
*
* Split view: original scan on the left, GridEditor on the right.
* Adds confidence stats, row-accept buttons, and integrates with
* the GT marking flow in the parent page.
*/
import { useCallback, useEffect, useRef, useState, type MutableRefObject } from 'react'
import { useGridEditor } from '@/components/grid-editor/useGridEditor'
import type { GridZone, LayoutDividers } from '@/components/grid-editor/types'
import { GridToolbar } from '@/components/grid-editor/GridToolbar'
import { GridTable } from '@/components/grid-editor/GridTable'
import { ImageLayoutEditor } from '@/components/grid-editor/ImageLayoutEditor'
const KLAUSUR_API = '/klausur-api'
interface StepGridReviewProps {
sessionId: string | null
onNext?: () => void
saveRef?: MutableRefObject<(() => Promise<void>) | null>
}
export function StepGridReview({ sessionId, onNext, saveRef }: StepGridReviewProps) {
const {
grid,
loading,
saving,
error,
dirty,
selectedCell,
selectedCells,
setSelectedCell,
buildGrid,
loadGrid,
saveGrid,
updateCellText,
toggleColumnBold,
toggleRowHeader,
undo,
redo,
canUndo,
canRedo,
getAdjacentCell,
deleteColumn,
addColumn,
deleteRow,
addRow,
commitUndoPoint,
updateColumnDivider,
updateLayoutHorizontals,
splitColumnAt,
toggleCellSelection,
clearCellSelection,
toggleSelectedBold,
autoCorrectColumnPatterns,
setCellColor,
ipaMode,
setIpaMode,
syllableMode,
setSyllableMode,
ocrEnhance,
setOcrEnhance,
ocrMaxCols,
setOcrMaxCols,
ocrMinConf,
setOcrMinConf,
visionFusion,
setVisionFusion,
documentCategory,
setDocumentCategory,
rerunOcr,
} = useGridEditor(sessionId)
const [showImage, setShowImage] = useState(true)
const [zoom, setZoom] = useState(100)
const [acceptedRows, setAcceptedRows] = useState<Set<string>>(new Set())
// Expose save function to parent via ref (for GT marking auto-save)
useEffect(() => {
if (saveRef) {
saveRef.current = async () => {
if (dirty) await saveGrid()
}
return () => { saveRef.current = null }
}
}, [saveRef, dirty, saveGrid])
// Load grid on mount
useEffect(() => {
if (sessionId) loadGrid()
}, [sessionId, loadGrid])
// Reset accepted rows when session changes
useEffect(() => {
setAcceptedRows(new Set())
}, [sessionId])
// Keyboard shortcuts
useEffect(() => {
const handler = (e: KeyboardEvent) => {
if ((e.metaKey || e.ctrlKey) && e.key === 'z' && !e.shiftKey) {
e.preventDefault()
undo()
} else if ((e.metaKey || e.ctrlKey) && e.key === 'z' && e.shiftKey) {
e.preventDefault()
redo()
} else if ((e.metaKey || e.ctrlKey) && e.key === 's') {
e.preventDefault()
saveGrid()
} else if ((e.metaKey || e.ctrlKey) && e.key === 'b') {
e.preventDefault()
if (selectedCells.size > 0) {
toggleSelectedBold()
}
} else if (e.key === 'Escape') {
clearCellSelection()
}
}
window.addEventListener('keydown', handler)
return () => window.removeEventListener('keydown', handler)
}, [undo, redo, saveGrid, selectedCells, toggleSelectedBold, clearCellSelection])
const handleNavigate = useCallback(
(cellId: string, direction: 'up' | 'down' | 'left' | 'right') => {
const target = getAdjacentCell(cellId, direction)
if (target) {
setSelectedCell(target)
setTimeout(() => {
const el = document.getElementById(`cell-${target}`)
if (el) {
el.focus()
if (el instanceof HTMLInputElement) el.select()
}
}, 0)
}
},
[getAdjacentCell, setSelectedCell],
)
const acceptRow = (zoneIdx: number, rowIdx: number) => {
setAcceptedRows((prev) => {
const next = new Set(prev)
const key = `${zoneIdx}-${rowIdx}`
if (next.has(key)) next.delete(key)
else next.add(key)
return next
})
}
const acceptAllRows = () => {
if (!grid) return
const all = new Set<string>()
for (const zone of grid.zones) {
for (const row of zone.rows) {
all.add(`${zone.zone_index}-${row.index}`)
}
}
setAcceptedRows(all)
}
// Confidence stats
const allCells = grid?.zones?.flatMap((z) => z.cells) || []
const lowConfCells = allCells.filter(
(c) => c.confidence > 0 && c.confidence < 60,
)
const totalRows = grid?.zones?.reduce((sum, z) => sum + z.rows.length, 0) ?? 0
if (!sessionId) {
return (
<div className="text-center py-12 text-gray-400">
Keine Session ausgewaehlt.
</div>
)
}
if (loading) {
return (
<div className="flex items-center justify-center py-16">
<div className="flex items-center gap-3 text-gray-500 dark:text-gray-400">
<svg className="w-5 h-5 animate-spin" fill="none" viewBox="0 0 24 24">
<circle
className="opacity-25"
cx="12"
cy="12"
r="10"
stroke="currentColor"
strokeWidth="4"
/>
<path
className="opacity-75"
fill="currentColor"
d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z"
/>
</svg>
Grid wird geladen...
</div>
</div>
)
}
if (error) {
return (
<div className="bg-red-50 dark:bg-red-900/20 border border-red-200 dark:border-red-800 rounded-lg p-4">
<p className="text-sm text-red-700 dark:text-red-300">
Fehler: {error}
</p>
<button
onClick={buildGrid}
className="mt-2 text-xs px-3 py-1.5 bg-red-600 text-white rounded hover:bg-red-700"
>
Erneut versuchen
</button>
</div>
)
}
if (!grid || !grid.zones.length) {
return (
<div className="text-center py-12">
<p className="text-gray-400 mb-4">Kein Grid vorhanden.</p>
<button
onClick={buildGrid}
className="px-4 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 text-sm"
>
Grid aus OCR-Ergebnissen erstellen
</button>
</div>
)
}
const imageUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/cropped`
return (
<div className="space-y-3">
{/* Review Stats Bar */}
<div className="flex items-center gap-4 text-xs flex-wrap">
<span className="text-gray-500 dark:text-gray-400">
{grid.summary.total_zones} Zone(n), {grid.summary.total_columns} Spalten,{' '}
{grid.summary.total_rows} Zeilen, {grid.summary.total_cells} Zellen
</span>
{grid.dictionary_detection?.is_dictionary && (
<span className="px-2 py-0.5 rounded-full bg-blue-50 dark:bg-blue-900/20 text-blue-600 dark:text-blue-400 border border-blue-200 dark:border-blue-800">
Woerterbuch ({Math.round(grid.dictionary_detection.confidence * 100)}%)
</span>
)}
{grid.page_number?.text && (
<span className="px-1.5 py-0.5 rounded bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 border border-gray-200 dark:border-gray-600">
S. {grid.page_number.number ?? grid.page_number.text}
</span>
)}
{lowConfCells.length > 0 && (
<span className="px-2 py-0.5 rounded-full bg-red-50 dark:bg-red-900/20 text-red-600 dark:text-red-400 border border-red-200 dark:border-red-800">
{lowConfCells.length} niedrige Konfidenz
</span>
)}
<span className="text-gray-400 dark:text-gray-500">
{acceptedRows.size}/{totalRows} Zeilen akzeptiert
</span>
{acceptedRows.size < totalRows && (
<button
onClick={acceptAllRows}
className="text-teal-600 dark:text-teal-400 hover:text-teal-700 dark:hover:text-teal-300"
>
Alle akzeptieren
</button>
)}
{/* OCR Quality Steps (A/B Testing) */}
<span className="text-gray-400 dark:text-gray-500">|</span>
<label className="flex items-center gap-1 cursor-pointer" title="Step 3: CLAHE + Bilateral-Filter Enhancement">
<input type="checkbox" checked={ocrEnhance} onChange={(e) => setOcrEnhance(e.target.checked)} className="rounded w-3 h-3" />
<span className="text-gray-500 dark:text-gray-400">CLAHE</span>
</label>
<label className="flex items-center gap-1" title="Step 2: Max Spaltenanzahl (0=unbegrenzt)">
<span className="text-gray-500 dark:text-gray-400">MaxCol:</span>
<select value={ocrMaxCols} onChange={(e) => setOcrMaxCols(Number(e.target.value))} className="px-1 py-0.5 text-xs rounded border border-gray-200 dark:border-gray-600 bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-300">
<option value={0}>off</option>
<option value={2}>2</option>
<option value={3}>3</option>
<option value={4}>4</option>
<option value={5}>5</option>
</select>
</label>
<label className="flex items-center gap-1" title="Step 1: Min OCR Confidence (0=auto)">
<span className="text-gray-500 dark:text-gray-400">MinConf:</span>
<select value={ocrMinConf} onChange={(e) => setOcrMinConf(Number(e.target.value))} className="px-1 py-0.5 text-xs rounded border border-gray-200 dark:border-gray-600 bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-300">
<option value={0}>auto</option>
<option value={20}>20</option>
<option value={30}>30</option>
<option value={40}>40</option>
<option value={50}>50</option>
<option value={60}>60</option>
</select>
</label>
<span className="text-gray-400 dark:text-gray-500">|</span>
<label className="flex items-center gap-1 cursor-pointer" title="Step 4: Vision-LLM Fusion — Qwen2.5-VL korrigiert OCR anhand des Bildes">
<input type="checkbox" checked={visionFusion} onChange={(e) => setVisionFusion(e.target.checked)} className="rounded w-3 h-3 accent-orange-500" />
<span className={`${visionFusion ? 'text-orange-500 dark:text-orange-400 font-medium' : 'text-gray-500 dark:text-gray-400'}`}>Vision-LLM</span>
</label>
<label className="flex items-center gap-1" title="Dokumenttyp fuer Vision-LLM Prompt">
<span className="text-gray-500 dark:text-gray-400">Typ:</span>
<select value={documentCategory} onChange={(e) => setDocumentCategory(e.target.value)} className="px-1 py-0.5 text-xs rounded border border-gray-200 dark:border-gray-600 bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-300">
<option value="vokabelseite">Vokabelseite</option>
<option value="woerterbuch">Woerterbuch</option>
<option value="arbeitsblatt">Arbeitsblatt</option>
<option value="buchseite">Buchseite</option>
<option value="sonstiges">Sonstiges</option>
</select>
</label>
<div className="ml-auto flex items-center gap-2">
<button
onClick={() => {
const n = autoCorrectColumnPatterns()
if (n === 0) alert('Keine Muster-Korrekturen gefunden.')
else alert(`${n} Zelle(n) korrigiert (Muster-Vervollstaendigung).`)
}}
className="px-2.5 py-1 rounded text-xs border border-purple-200 dark:border-purple-700 bg-purple-50 dark:bg-purple-900/20 text-purple-700 dark:text-purple-300 hover:bg-purple-100 dark:hover:bg-purple-900/40 transition-colors"
title="Erkennt Muster wie p.70, p.71 und vervollstaendigt partielle Eintraege wie .65 zu p.65"
>
Auto-Korrektur
</button>
<button
onClick={() => setShowImage(!showImage)}
className={`px-2.5 py-1 rounded text-xs border transition-colors ${
showImage
? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
: 'bg-gray-50 dark:bg-gray-800 border-gray-200 dark:border-gray-700 text-gray-500 dark:text-gray-400'
}`}
>
{showImage ? 'Bild ausblenden' : 'Bild einblenden'}
</button>
<span className="text-gray-400 dark:text-gray-500">
{grid.duration_seconds.toFixed(1)}s
</span>
</div>
</div>
{/* Toolbar */}
<div className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 px-3 py-2">
<GridToolbar
dirty={dirty}
saving={saving}
canUndo={canUndo}
canRedo={canRedo}
showOverlay={false}
ipaMode={ipaMode}
syllableMode={syllableMode}
onSave={saveGrid}
onUndo={undo}
onRedo={redo}
onRebuild={buildGrid}
onToggleOverlay={() => setShowImage(!showImage)}
onIpaModeChange={setIpaMode}
onSyllableModeChange={setSyllableMode}
/>
<button
onClick={rerunOcr}
disabled={loading}
className="ml-2 px-3 py-1.5 text-xs font-medium rounded border border-orange-300 dark:border-orange-700 bg-orange-50 dark:bg-orange-900/20 text-orange-700 dark:text-orange-300 hover:bg-orange-100 dark:hover:bg-orange-900/40 transition-colors disabled:opacity-50"
title="OCR komplett neu ausfuehren mit aktuellen Quality-Step-Einstellungen (CLAHE, MinConf), dann Grid neu bauen"
>
{loading ? 'OCR laeuft...' : 'OCR neu + Grid'}
</button>
</div>
{/* Split View: Image left + Grid right */}
<div
className={showImage ? 'grid grid-cols-2 gap-3' : ''}
style={{ minHeight: '55vh' }}
>
{/* Left: Original Image with Layout Editor */}
{showImage && (
<ImageLayoutEditor
imageUrl={imageUrl}
zones={grid.zones}
imageWidth={grid.image_width}
layoutDividers={grid.layout_dividers}
zoom={zoom}
onZoomChange={setZoom}
onColumnDividerMove={updateColumnDivider}
onHorizontalsChange={updateLayoutHorizontals}
onCommitUndo={commitUndoPoint}
onSplitColumnAt={splitColumnAt}
onDeleteColumn={deleteColumn}
/>
)}
{/* Right: Grid with row-accept buttons */}
<div className="space-y-3">
{/* Zone tables with row-accept buttons */}
{(() => {
// Group consecutive zones with same vsplit_group
const groups: GridZone[][] = []
for (const zone of grid.zones) {
const prev = groups[groups.length - 1]
if (
prev &&
zone.vsplit_group != null &&
prev[0].vsplit_group === zone.vsplit_group
) {
prev.push(zone)
} else {
groups.push([zone])
}
}
return groups.map((group) => (
<div key={group[0].vsplit_group ?? group[0].zone_index}>
{/* Row-accept sidebar wraps each zone group */}
<div className="flex gap-1">
{/* Accept buttons column */}
<div className="flex-shrink-0 pt-[52px]">
{group[0].rows.map((row) => {
const key = `${group[0].zone_index}-${row.index}`
const isAccepted = acceptedRows.has(key)
return (
<button
key={row.index}
onClick={() =>
acceptRow(group[0].zone_index, row.index)
}
className={`w-6 h-6 mb-px rounded flex items-center justify-center transition-colors ${
isAccepted
? 'bg-emerald-100 dark:bg-emerald-900/30 text-emerald-600 dark:text-emerald-400'
: 'bg-gray-100 dark:bg-gray-800 text-gray-300 dark:text-gray-600 hover:bg-emerald-100 dark:hover:bg-emerald-900/30 hover:text-emerald-500'
}`}
title={
isAccepted
? 'Klick zum Entfernen'
: 'Zeile als korrekt markieren'
}
>
<svg
className="w-3.5 h-3.5"
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2.5}
d="M5 13l4 4L19 7"
/>
</svg>
</button>
)
})}
</div>
{/* Grid table(s) */}
<div
className={`flex-1 min-w-0 ${group.length > 1 ? 'flex gap-2' : ''}`}
>
{group.map((zone) => (
<div
key={zone.zone_index}
className={`${group.length > 1 ? 'flex-1 min-w-0' : ''} bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700`}
>
<GridTable
zone={zone}
layoutMetrics={grid.layout_metrics}
selectedCell={selectedCell}
selectedCells={selectedCells}
onSelectCell={setSelectedCell}
onToggleCellSelection={toggleCellSelection}
onCellTextChange={updateCellText}
onToggleColumnBold={toggleColumnBold}
onToggleRowHeader={toggleRowHeader}
onNavigate={handleNavigate}
onDeleteColumn={deleteColumn}
onAddColumn={addColumn}
onDeleteRow={deleteRow}
onAddRow={addRow}
onSetCellColor={setCellColor}
/>
</div>
))}
</div>
</div>
</div>
))
})()}
</div>
</div>
{/* Multi-select toolbar */}
{selectedCells.size > 0 && (
<div className="flex items-center gap-3 px-3 py-2 bg-teal-50 dark:bg-teal-900/20 border border-teal-200 dark:border-teal-800 rounded-lg text-xs">
<span className="text-teal-700 dark:text-teal-300 font-medium">
{selectedCells.size} Zellen markiert
</span>
<button
onClick={toggleSelectedBold}
className="px-2.5 py-1 bg-teal-600 text-white rounded hover:bg-teal-700 transition-colors font-medium"
>
B Fett umschalten
</button>
<button
onClick={clearCellSelection}
className="px-2 py-1 text-teal-600 dark:text-teal-400 hover:text-teal-800 dark:hover:text-teal-200 transition-colors"
>
Auswahl aufheben (Esc)
</button>
</div>
)}
{/* Tips + Next */}
<div className="flex items-center justify-between">
<div className="text-[11px] text-gray-400 dark:text-gray-500 flex items-center gap-4">
<span>Tab: naechste Zelle</span>
<span>Pfeiltasten: Navigation</span>
<span>Ctrl+Klick: Mehrfachauswahl</span>
<span>Ctrl+B: Fett</span>
<span>Rechtsklick: Farbe</span>
<span>Ctrl+Z/Y: Undo/Redo</span>
<span>Ctrl+S: Speichern</span>
</div>
{onNext && (
<button
onClick={async () => {
if (dirty) await saveGrid()
onNext()
}}
className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700 transition-colors"
>
Fertig
</button>
)}
</div>
</div>
)
}

View File

@@ -3,8 +3,8 @@
import { useCallback, useEffect, useRef, useState } from 'react'
import type {
GridCell, ColumnMeta, ImageRegion, ImageStyle,
} from '@/app/(admin)/ai/ocr-pipeline/types'
import { IMAGE_STYLES as STYLES } from '@/app/(admin)/ai/ocr-pipeline/types'
} from '@/app/(admin)/ai/ocr-kombi/types'
import { IMAGE_STYLES as STYLES } from '@/app/(admin)/ai/ocr-kombi/types'
const KLAUSUR_API = '/klausur-api'

View File

@@ -1,7 +1,7 @@
'use client'
import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
import type { GridCell, GridResult, WordEntry, ColumnMeta } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { GridCell, GridResult, WordEntry, ColumnMeta } from '@/app/(admin)/ai/ocr-kombi/types'
import { usePixelWordPositions } from './usePixelWordPositions'
const KLAUSUR_API = '/klausur-api'

View File

@@ -1,26 +1,36 @@
'use client'
import { useCallback, useEffect, useState } from 'react'
import type { OrientationResult, SessionInfo } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { OrientationResult, SessionInfo } from '@/app/(admin)/ai/ocr-kombi/types'
import { ImageCompareView } from './ImageCompareView'
const KLAUSUR_API = '/klausur-api'
interface PageSplitResult {
multi_page: boolean
page_count?: number
sub_sessions?: { id: string; name: string; page_index: number }[]
used_original?: boolean
duration_seconds?: number
}
interface StepOrientationProps {
sessionId?: string | null
onNext: (sessionId: string) => void
onSessionList?: () => void
}
export function StepOrientation({ sessionId: existingSessionId, onNext }: StepOrientationProps) {
export function StepOrientation({ sessionId: existingSessionId, onNext, onSessionList }: StepOrientationProps) {
const [session, setSession] = useState<SessionInfo | null>(null)
const [orientationResult, setOrientationResult] = useState<OrientationResult | null>(null)
const [pageSplitResult, setPageSplitResult] = useState<PageSplitResult | null>(null)
const [uploading, setUploading] = useState(false)
const [detecting, setDetecting] = useState(false)
const [error, setError] = useState<string | null>(null)
const [dragOver, setDragOver] = useState(false)
const [sessionName, setSessionName] = useState('')
// Reload session data when navigating back
// Reload session data when navigating back — auto-trigger orientation if missing
useEffect(() => {
if (!existingSessionId || session) return
@@ -41,6 +51,28 @@ export function StepOrientation({ sessionId: existingSessionId, onNext }: StepOr
if (data.orientation_result) {
setOrientationResult(data.orientation_result)
} else {
// Session exists but orientation not yet run (e.g. page-split session)
// Auto-trigger orientation detection
setDetecting(true)
try {
const orientRes = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${existingSessionId}/orientation`,
{ method: 'POST' },
)
if (orientRes.ok) {
const orientData = await orientRes.json()
setOrientationResult({
orientation_degrees: orientData.orientation_degrees,
corrected: orientData.corrected,
duration_seconds: orientData.duration_seconds,
})
}
} catch (e) {
console.error('Auto-orientation failed:', e)
} finally {
setDetecting(false)
}
}
} catch (e) {
console.error('Failed to reload session:', e)
@@ -92,6 +124,21 @@ export function StepOrientation({ sessionId: existingSessionId, onNext }: StepOr
corrected: orientData.corrected,
duration_seconds: orientData.duration_seconds,
})
// Auto-trigger page-split detection (double-page book spreads)
try {
const splitRes = await fetch(
`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${data.session_id}/page-split`,
{ method: 'POST' },
)
if (splitRes.ok) {
const splitData: PageSplitResult = await splitRes.json()
setPageSplitResult(splitData)
}
} catch (e) {
console.error('Page-split detection failed:', e)
// Not critical — continue as single page
}
} catch (e) {
setError(e instanceof Error ? e.message : 'Unbekannter Fehler')
} finally {
@@ -225,15 +272,47 @@ export function StepOrientation({ sessionId: existingSessionId, onNext }: StepOr
</div>
)}
{/* Page-split result */}
{pageSplitResult?.multi_page && (
<div className="bg-blue-50 dark:bg-blue-900/20 rounded-lg border border-blue-200 dark:border-blue-700 p-4">
<div className="text-sm font-medium text-blue-700 dark:text-blue-300">
Doppelseite erkannt {pageSplitResult.page_count} unabhaengige Sessions erstellt
</div>
<p className="text-xs text-blue-600 dark:text-blue-400 mt-1">
Jede Seite wird als eigene Session durch die Pipeline verarbeitet.
{pageSplitResult.used_original && ' (Seitentrennung auf dem Originalbild, da die Orientierung die Doppelseite gedreht hat.)'}
</p>
<div className="flex gap-2 mt-2">
{pageSplitResult.sub_sessions?.map((s) => (
<span
key={s.id}
className="text-xs px-2 py-1 rounded-md bg-blue-100 dark:bg-blue-800/40 text-blue-700 dark:text-blue-300"
>
{s.name}
</span>
))}
</div>
</div>
)}
{/* Next button */}
{orientationResult && (
<div className="flex justify-end">
<button
onClick={() => onNext(session.session_id)}
className="px-6 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium transition-colors"
>
Weiter &rarr;
</button>
{pageSplitResult?.multi_page ? (
<button
onClick={() => onSessionList?.()}
className="px-6 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium transition-colors"
>
Zur Session-Liste &rarr;
</button>
) : (
<button
onClick={() => onNext(session.session_id)}
className="px-6 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium transition-colors"
>
Weiter &rarr;
</button>
)}
</div>
)}

View File

@@ -2,7 +2,7 @@
import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
import dynamic from 'next/dynamic'
import type { GridResult, GridCell, ColumnResult, RowResult, PageZone, PageRegion, RowItem, StructureResult, StructureBox, StructureGraphic } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { GridResult, GridCell, ColumnResult, RowResult, PageZone, PageRegion, RowItem, StructureResult, StructureBox, StructureGraphic } from '@/app/(admin)/ai/ocr-kombi/types'
import { usePixelWordPositions } from './usePixelWordPositions'
const KLAUSUR_API = '/klausur-api'

View File

@@ -1,7 +1,7 @@
'use client'
import { useCallback, useEffect, useState } from 'react'
import type { RowResult, RowGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { RowResult, RowGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'
const KLAUSUR_API = '/klausur-api'

View File

@@ -1,7 +1,7 @@
'use client'
import { useCallback, useEffect, useRef, useState } from 'react'
import type { ExcludeRegion, StructureResult } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { ExcludeRegion, StructureResult } from '@/app/(admin)/ai/ocr-kombi/types'
const KLAUSUR_API = '/klausur-api'
@@ -19,6 +19,26 @@ const COLOR_HEX: Record<string, string> = {
purple: '#9333ea',
}
type DetectionMethod = 'auto' | 'opencv' | 'ppdoclayout'
/** Color map for PP-DocLayout region classes */
const DOCLAYOUT_CLASS_COLORS: Record<string, string> = {
table: '#2563eb',
figure: '#16a34a',
title: '#ea580c',
text: '#6b7280',
list: '#9333ea',
header: '#0ea5e9',
footer: '#64748b',
equation: '#dc2626',
}
const DOCLAYOUT_DEFAULT_COLOR = '#a3a3a3'
function getDocLayoutColor(className: string): string {
return DOCLAYOUT_CLASS_COLORS[className.toLowerCase()] || DOCLAYOUT_DEFAULT_COLOR
}
/**
* Convert a mouse event on the image container to image-pixel coordinates.
* The image uses object-contain inside an A4-ratio container, so we need
@@ -96,6 +116,7 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
const [error, setError] = useState<string | null>(null)
const [hasRun, setHasRun] = useState(false)
const [overlayTs, setOverlayTs] = useState(0)
const [detectionMethod, setDetectionMethod] = useState<DetectionMethod>('auto')
// Exclude region drawing state
const [excludeRegions, setExcludeRegions] = useState<ExcludeRegion[]>([])
@@ -106,7 +127,9 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
const [drawMode, setDrawMode] = useState(false)
const containerRef = useRef<HTMLDivElement>(null)
const overlayContainerRef = useRef<HTMLDivElement>(null)
const [containerSize, setContainerSize] = useState({ w: 0, h: 0 })
const [overlayContainerSize, setOverlayContainerSize] = useState({ w: 0, h: 0 })
// Track container size for overlay positioning
useEffect(() => {
@@ -121,6 +144,19 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
return () => obs.disconnect()
}, [])
// Track overlay container size for PP-DocLayout region overlays
useEffect(() => {
const el = overlayContainerRef.current
if (!el) return
const obs = new ResizeObserver((entries) => {
for (const entry of entries) {
setOverlayContainerSize({ w: entry.contentRect.width, h: entry.contentRect.height })
}
})
obs.observe(el)
return () => obs.disconnect()
}, [])
// Auto-trigger detection on mount
useEffect(() => {
if (!sessionId || hasRun) return
@@ -131,7 +167,8 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
setError(null)
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/detect-structure`, {
const params = detectionMethod !== 'auto' ? `?method=${detectionMethod}` : ''
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/detect-structure${params}`, {
method: 'POST',
})
@@ -158,7 +195,8 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
setDetecting(true)
setError(null)
try {
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/detect-structure`, {
const params = detectionMethod !== 'auto' ? `?method=${detectionMethod}` : ''
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/detect-structure${params}`, {
method: 'POST',
})
if (!res.ok) throw new Error('Erneute Erkennung fehlgeschlagen')
@@ -278,6 +316,31 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
</div>
)}
{/* Detection method toggle */}
<div className="flex items-center gap-2">
<span className="text-xs font-medium text-gray-500 dark:text-gray-400">Methode:</span>
{(['auto', 'opencv', 'ppdoclayout'] as DetectionMethod[]).map((method) => (
<button
key={method}
onClick={() => setDetectionMethod(method)}
className={`px-3 py-1.5 text-xs rounded-md font-medium transition-colors ${
detectionMethod === method
? 'bg-teal-600 text-white'
: 'bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 hover:bg-gray-200 dark:hover:bg-gray-600'
}`}
>
{method === 'auto' ? 'Auto' : method === 'opencv' ? 'OpenCV' : 'PP-DocLayout'}
</button>
))}
<span className="text-[10px] text-gray-400 dark:text-gray-500 ml-1">
{detectionMethod === 'auto'
? 'PP-DocLayout wenn verfuegbar, sonst OpenCV'
: detectionMethod === 'ppdoclayout'
? 'ONNX-basierte Layouterkennung mit Klassifikation'
: 'Klassische OpenCV-Konturerkennung'}
</span>
</div>
{/* Draw mode toggle */}
{result && (
<div className="flex items-center gap-3">
@@ -376,8 +439,17 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
<div className="space-y-2">
<div className="text-xs font-medium text-gray-500 dark:text-gray-400 uppercase tracking-wider">
Erkannte Struktur
{result?.detection_method && (
<span className="ml-2 text-[10px] font-normal normal-case">
({result.detection_method === 'ppdoclayout' ? 'PP-DocLayout' : 'OpenCV'})
</span>
)}
</div>
<div className="relative bg-gray-100 dark:bg-gray-800 rounded-lg overflow-hidden" style={{ aspectRatio: '210/297' }}>
<div
ref={overlayContainerRef}
className="relative bg-gray-100 dark:bg-gray-800 rounded-lg overflow-hidden"
style={{ aspectRatio: '210/297' }}
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={overlayUrl}
@@ -387,7 +459,52 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
(e.target as HTMLImageElement).style.display = 'none'
}}
/>
{/* PP-DocLayout region overlays with class colors and labels */}
{result?.layout_regions && overlayContainerSize.w > 0 && result.layout_regions.map((region, i) => {
const pos = imageToOverlayPct(region, overlayContainerSize.w, overlayContainerSize.h, result.image_width, result.image_height)
const color = getDocLayoutColor(region.class_name)
return (
<div
key={`layout-${i}`}
className="absolute border-2 pointer-events-none"
style={{
...pos,
borderColor: color,
backgroundColor: `${color}18`,
}}
>
<span
className="absolute -top-4 left-0 px-1 py-px text-[9px] font-medium text-white rounded-sm whitespace-nowrap leading-tight"
style={{ backgroundColor: color }}
>
{region.class_name} {Math.round(region.confidence * 100)}%
</span>
</div>
)
})}
</div>
{/* PP-DocLayout legend */}
{result?.layout_regions && result.layout_regions.length > 0 && (() => {
const usedClasses = [...new Set(result.layout_regions!.map((r) => r.class_name.toLowerCase()))]
return (
<div className="flex flex-wrap gap-x-3 gap-y-1 px-1">
{usedClasses.sort().map((cls) => (
<span key={cls} className="inline-flex items-center gap-1 text-[10px] text-gray-500 dark:text-gray-400">
<span
className="w-2.5 h-2.5 rounded-sm border"
style={{
backgroundColor: `${getDocLayoutColor(cls)}30`,
borderColor: getDocLayoutColor(cls),
}}
/>
{cls}
</span>
))}
</div>
)
})()}
</div>
</div>
@@ -430,6 +547,11 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
<span className="inline-flex items-center gap-1.5 px-3 py-1 rounded-full bg-amber-50 dark:bg-amber-900/20 text-amber-700 dark:text-amber-400 text-xs font-medium">
{result.boxes.length} Box(en)
</span>
{result.layout_regions && result.layout_regions.length > 0 && (
<span className="inline-flex items-center gap-1.5 px-3 py-1 rounded-full bg-indigo-50 dark:bg-indigo-900/20 text-indigo-700 dark:text-indigo-400 text-xs font-medium">
{result.layout_regions.length} Layout-Region(en)
</span>
)}
{result.graphics && result.graphics.length > 0 && (
<span className="inline-flex items-center gap-1.5 px-3 py-1 rounded-full bg-purple-50 dark:bg-purple-900/20 text-purple-700 dark:text-purple-400 text-xs font-medium">
{result.graphics.length} Grafik(en)
@@ -451,6 +573,11 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
</span>
)}
<span className="text-gray-400 text-xs ml-auto">
{result.detection_method && (
<span className="mr-1.5">
{result.detection_method === 'ppdoclayout' ? 'PP-DocLayout' : 'OpenCV'} |
</span>
)}
{result.image_width}x{result.image_height}px | {result.duration_seconds}s
</span>
</div>
@@ -491,6 +618,37 @@ export function StepStructureDetection({ sessionId, onNext }: StepStructureDetec
</div>
)}
{/* PP-DocLayout regions detail */}
{result.layout_regions && result.layout_regions.length > 0 && (
<div>
<h4 className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-2">
PP-DocLayout Regionen ({result.layout_regions.length})
</h4>
<div className="space-y-1.5">
{result.layout_regions.map((region, i) => {
const color = getDocLayoutColor(region.class_name)
return (
<div key={i} className="flex items-center gap-3 text-xs">
<span
className="w-3 h-3 rounded-sm flex-shrink-0 border"
style={{ backgroundColor: `${color}40`, borderColor: color }}
/>
<span className="text-gray-600 dark:text-gray-400 font-medium min-w-[60px]">
{region.class_name}
</span>
<span className="font-mono text-gray-500">
{region.w}x{region.h}px @ ({region.x}, {region.y})
</span>
<span className="text-gray-400">
{Math.round(region.confidence * 100)}%
</span>
</div>
)
})}
</div>
</div>
)}
{/* Zones detail */}
<div>
<h4 className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-2">Seitenzonen</h4>

View File

@@ -1,7 +1,7 @@
'use client'
import { useCallback, useEffect, useRef, useState } from 'react'
import type { GridResult, GridCell, WordEntry, WordGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { GridResult, GridCell, WordEntry, WordGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'
const KLAUSUR_API = '/klausur-api'

View File

@@ -0,0 +1,328 @@
/**
* Tests for usePixelWordPositions hook.
*
* The hook performs pixel-based word positioning using an offscreen canvas.
* Since Canvas/getImageData is not available in jsdom, we test the pure
* computation logic by extracting and testing the algorithms directly.
*/
import { describe, it, expect } from 'vitest'
// ---------------------------------------------------------------------------
// Extract pure computation functions from the hook for testing
// ---------------------------------------------------------------------------
interface Cluster {
start: number
end: number
}
/**
* Cluster detection: find runs of dark pixels above a threshold.
* Replicates the cluster detection logic in usePixelWordPositions.
*/
function findClusters(proj: number[], ch: number, cw: number): Cluster[] {
const threshold = Math.max(1, ch * 0.03)
const minGap = Math.max(5, Math.round(cw * 0.02))
const clusters: Cluster[] = []
let inCluster = false
let clStart = 0
let gap = 0
for (let x = 0; x < cw; x++) {
if (proj[x] >= threshold) {
if (!inCluster) { clStart = x; inCluster = true }
gap = 0
} else if (inCluster) {
gap++
if (gap > minGap) {
clusters.push({ start: clStart, end: x - gap })
inCluster = false
gap = 0
}
}
}
if (inCluster) clusters.push({ start: clStart, end: cw - 1 - gap })
return clusters
}
/**
* Mirror clusters for 180° rotation.
* Replicates the rotation logic in usePixelWordPositions.
*/
function mirrorClusters(clusters: Cluster[], cw: number): Cluster[] {
return clusters.map(c => ({
start: cw - 1 - c.end,
end: cw - 1 - c.start,
})).reverse()
}
/**
* Compute fontRatio from cluster width, measured text width, and cell height.
* Replicates the font ratio calculation.
*/
function computeFontRatio(
clusterW: number,
measuredWidth: number,
refFontSize: number,
ch: number,
): number {
const autoFontPx = refFontSize * (clusterW / measuredWidth)
return Math.min(autoFontPx / ch, 1.0)
}
/**
* Mode normalization: find the most common fontRatio (bucketed to 0.02).
* Replicates the mode normalization in usePixelWordPositions.
*/
function normalizeFontRatios(ratios: number[]): number {
if (ratios.length === 0) return 0
const buckets = new Map<number, number>()
for (const r of ratios) {
const key = Math.round(r * 50) / 50
buckets.set(key, (buckets.get(key) || 0) + 1)
}
let modeRatio = ratios[0]
let modeCount = 0
for (const [ratio, count] of buckets) {
if (count > modeCount) { modeRatio = ratio; modeCount = count }
}
return modeRatio
}
/**
* Coordinate transform for 180° rotation.
*/
function transformCellCoords180(
x: number, y: number, w: number, h: number,
imgW: number, imgH: number,
): { cx: number; cy: number } {
return {
cx: Math.round((100 - x - w) / 100 * imgW),
cy: Math.round((100 - y - h) / 100 * imgH),
}
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
describe('findClusters', () => {
it('should find a single cluster', () => {
// Simulate a projection with dark pixels from x=10 to x=50
const proj = new Array(100).fill(0)
for (let x = 10; x <= 50; x++) proj[x] = 10
const clusters = findClusters(proj, 100, 100)
expect(clusters.length).toBe(1)
expect(clusters[0].start).toBe(10)
expect(clusters[0].end).toBe(50)
})
it('should find multiple clusters separated by gaps', () => {
const proj = new Array(200).fill(0)
// Two word groups with a gap between
for (let x = 10; x <= 40; x++) proj[x] = 10
for (let x = 80; x <= 120; x++) proj[x] = 10
const clusters = findClusters(proj, 100, 200)
expect(clusters.length).toBe(2)
expect(clusters[0].start).toBe(10)
expect(clusters[1].start).toBe(80)
})
it('should merge clusters with small gaps', () => {
// Gap smaller than minGap should not split clusters
const proj = new Array(100).fill(0)
for (let x = 10; x <= 30; x++) proj[x] = 10
// Small gap (3px) — minGap = max(5, 100*0.02) = 5
for (let x = 34; x <= 50; x++) proj[x] = 10
const clusters = findClusters(proj, 100, 100)
expect(clusters.length).toBe(1) // merged into one cluster
})
it('should return empty for all-white projection', () => {
const proj = new Array(100).fill(0)
const clusters = findClusters(proj, 100, 100)
expect(clusters.length).toBe(0)
})
})
describe('mirrorClusters', () => {
it('should mirror clusters for 180° rotation', () => {
const clusters: Cluster[] = [
{ start: 10, end: 50 },
{ start: 80, end: 120 },
]
const cw = 200
const mirrored = mirrorClusters(clusters, cw)
// Cluster at (10,50) → (cw-1-50, cw-1-10) = (149, 189)
// Cluster at (80,120) → (cw-1-120, cw-1-80) = (79, 119)
// After reverse: [(79,119), (149,189)]
expect(mirrored.length).toBe(2)
expect(mirrored[0]).toEqual({ start: 79, end: 119 })
expect(mirrored[1]).toEqual({ start: 149, end: 189 })
})
it('should maintain left-to-right order after mirroring', () => {
const clusters: Cluster[] = [
{ start: 5, end: 30 },
{ start: 50, end: 80 },
{ start: 100, end: 130 },
]
const mirrored = mirrorClusters(clusters, 200)
// After mirroring and reversing, order should be left-to-right
for (let i = 1; i < mirrored.length; i++) {
expect(mirrored[i].start).toBeGreaterThan(mirrored[i - 1].start)
}
})
it('should handle single cluster', () => {
const clusters: Cluster[] = [{ start: 20, end: 80 }]
const mirrored = mirrorClusters(clusters, 200)
expect(mirrored.length).toBe(1)
expect(mirrored[0]).toEqual({ start: 119, end: 179 })
})
})
describe('computeFontRatio', () => {
it('should compute ratio based on cluster vs measured width', () => {
// Cluster is 100px wide, measured text at 40px font is 200px → autoFont = 20px
// Cell height = 30px → ratio = 20/30 = 0.667
const ratio = computeFontRatio(100, 200, 40, 30)
expect(ratio).toBeCloseTo(0.667, 2)
})
it('should cap ratio at 1.0', () => {
// Very large cluster relative to measured text
const ratio = computeFontRatio(400, 100, 40, 30)
expect(ratio).toBe(1.0)
})
it('should handle small cluster width', () => {
const ratio = computeFontRatio(10, 200, 40, 30)
expect(ratio).toBeCloseTo(0.067, 2)
})
})
describe('normalizeFontRatios', () => {
it('should return the most common ratio', () => {
const ratios = [0.5, 0.5, 0.5, 0.3, 0.3, 0.7]
const mode = normalizeFontRatios(ratios)
expect(mode).toBe(0.5)
})
it('should bucket ratios to nearest 0.02', () => {
// 0.51 and 0.49 both round to 0.50 (nearest 0.02)
const ratios = [0.51, 0.49, 0.50, 0.30]
const mode = normalizeFontRatios(ratios)
expect(mode).toBe(0.50)
})
it('should handle empty array', () => {
expect(normalizeFontRatios([])).toBe(0)
})
it('should handle single ratio', () => {
expect(normalizeFontRatios([0.65])).toBe(0.66) // rounded to nearest 0.02
})
})
describe('transformCellCoords180', () => {
it('should transform cell coordinates for 180° rotation', () => {
// Cell at x=10%, y=20%, w=30%, h=5% on a 1000x2000 image
const { cx, cy } = transformCellCoords180(10, 20, 30, 5, 1000, 2000)
// Expected: cx = (100 - 10 - 30) / 100 * 1000 = 600
// cy = (100 - 20 - 5) / 100 * 2000 = 1500
expect(cx).toBe(600)
expect(cy).toBe(1500)
})
it('should handle cell at origin', () => {
const { cx, cy } = transformCellCoords180(0, 0, 50, 50, 1000, 1000)
// Expected: cx = (100 - 0 - 50) / 100 * 1000 = 500
// cy = (100 - 0 - 50) / 100 * 1000 = 500
expect(cx).toBe(500)
expect(cy).toBe(500)
})
it('should handle cell at bottom-right', () => {
const { cx, cy } = transformCellCoords180(80, 90, 20, 10, 1000, 2000)
// Expected: cx = (100 - 80 - 20) / 100 * 1000 = 0
// cy = (100 - 90 - 10) / 100 * 2000 = 0
expect(cx).toBe(0)
expect(cy).toBe(0)
})
})
describe('sub-session coordinate conversion', () => {
/**
* Test the coordinate conversion from sub-session (box-relative)
* to parent (page-absolute) coordinates.
* Replicates the logic in StepReconstruction loadSessionData.
*/
it('should convert sub-session cell coords to parent space', () => {
const imgW = 1746
const imgH = 2487
// Box zone in pixels
const box = { x: 50, y: 1145, width: 1100, height: 270 }
// Box in percent
const boxXPct = (box.x / imgW) * 100
const boxYPct = (box.y / imgH) * 100
const boxWPct = (box.width / imgW) * 100
const boxHPct = (box.height / imgH) * 100
// Sub-session cell at (10%, 20%, 80%, 15%) relative to box
const subCell = { x: 10, y: 20, w: 80, h: 15 }
const parentX = boxXPct + (subCell.x / 100) * boxWPct
const parentY = boxYPct + (subCell.y / 100) * boxHPct
const parentW = (subCell.w / 100) * boxWPct
const parentH = (subCell.h / 100) * boxHPct
// Box start in percent: x ≈ 2.86%, y ≈ 46.04%
expect(parentX).toBeCloseTo(boxXPct + 0.1 * boxWPct, 2)
expect(parentY).toBeCloseTo(boxYPct + 0.2 * boxHPct, 2)
expect(parentW).toBeCloseTo(0.8 * boxWPct, 2)
expect(parentH).toBeCloseTo(0.15 * boxHPct, 2)
// All values should be within 0-100%
expect(parentX).toBeGreaterThan(0)
expect(parentY).toBeGreaterThan(0)
expect(parentX + parentW).toBeLessThan(100)
expect(parentY + parentH).toBeLessThan(100)
})
it('should place sub-cell at box origin when sub coords are 0,0', () => {
const imgW = 1000
const imgH = 2000
const box = { x: 100, y: 500, width: 800, height: 200 }
const boxXPct = (box.x / imgW) * 100 // 10%
const boxYPct = (box.y / imgH) * 100 // 25%
const parentX = boxXPct + (0 / 100) * ((box.width / imgW) * 100)
const parentY = boxYPct + (0 / 100) * ((box.height / imgH) * 100)
expect(parentX).toBeCloseTo(10, 1)
expect(parentY).toBeCloseTo(25, 1)
})
})

View File

@@ -1,5 +1,5 @@
import { useEffect, useState } from 'react'
import type { GridCell } from '@/app/(admin)/ai/ocr-pipeline/types'
import type { GridCell } from '@/app/(admin)/ai/ocr-kombi/types'
export interface WordPosition {
xPct: number

View File

@@ -49,22 +49,6 @@ export const navigation: NavCategory[] = [
purpose: 'E-Mail-Konten verwalten und KI-Kategorisierung nutzen. IMAP/SMTP Konfiguration, Vorlagen und Audit-Log.',
audience: ['Support', 'Admins'],
},
{
id: 'video-chat',
name: 'Video & Chat',
href: '/communication/video-chat',
description: 'Matrix & Jitsi Monitoring',
purpose: 'Dashboard fuer Matrix Synapse und Jitsi Meet. Service-Status, aktive Meetings, Traffic-Analyse und Ressourcen-Empfehlungen.',
audience: ['Admins', 'DevOps'],
},
{
id: 'voice-service',
name: 'Voice Service',
href: '/communication/matrix',
description: 'PersonaPlex-7B & TaskOrchestrator',
purpose: 'Voice-First Interface Konfiguration und Architektur-Dokumentation. Live Demo, Task States, Intents und DSGVO-Informationen.',
audience: ['Entwickler', 'Admins'],
},
{
id: 'alerts',
name: 'Alerts Monitoring',
@@ -133,29 +117,11 @@ export const navigation: NavCategory[] = [
// KI-Werkzeuge: Standalone-Tools fuer Entwicklung & QA
// -----------------------------------------------------------------------
{
id: 'ocr-compare',
name: 'OCR Vergleich',
href: '/ai/ocr-compare',
description: 'OCR-Methoden & Vokabel-Extraktion',
purpose: 'Vergleichen Sie verschiedene OCR-Methoden (lokales LLM, Vision LLM, PaddleOCR, Tesseract, Anthropic) fuer Vokabel-Extraktion. Grid-Overlay, Block-Review und LLM-Vergleich.',
audience: ['Entwickler', 'Data Scientists', 'Lehrer'],
subgroup: 'KI-Werkzeuge',
},
{
id: 'ocr-pipeline',
name: 'OCR Pipeline',
href: '/ai/ocr-pipeline',
description: 'Schrittweise Seitenrekonstruktion',
purpose: 'Schrittweise Seitenrekonstruktion: Scan begradigen, Spalten erkennen, Woerter lokalisieren und die Seite Wort fuer Wort nachbauen. 6-Schritt-Pipeline mit Ground Truth Validierung.',
audience: ['Entwickler', 'Data Scientists'],
subgroup: 'KI-Werkzeuge',
},
{
id: 'ocr-overlay',
name: 'OCR Overlay',
href: '/ai/ocr-overlay',
description: 'Ganzseitige Overlay-Rekonstruktion',
purpose: 'Arbeitsblatt ohne Spaltenerkennung direkt als Overlay rekonstruieren. Vereinfachte 7-Schritt-Pipeline.',
id: 'ocr-kombi',
name: 'OCR Kombi',
href: '/ai/ocr-kombi',
description: 'Modulare 11-Schritt-Pipeline',
purpose: 'Modulare OCR-Pipeline mit Dual-Engine (PP-OCRv5 + Tesseract), Strukturerkennung, Grid-Aufbau und Review. Multi-Page-Dokument-Unterstuetzung.',
audience: ['Entwickler'],
subgroup: 'KI-Werkzeuge',
},
@@ -169,19 +135,27 @@ export const navigation: NavCategory[] = [
oldAdminPath: '/admin/quality',
subgroup: 'KI-Werkzeuge',
},
{
id: 'gpu',
name: 'GPU Infrastruktur',
href: '/ai/gpu',
description: 'vast.ai GPU Management',
purpose: 'Verwalten Sie GPU-Instanzen auf vast.ai fuer ML-Training und Inferenz.',
audience: ['DevOps', 'Entwickler'],
oldAdminPath: '/admin/gpu',
subgroup: 'KI-Werkzeuge',
},
// -----------------------------------------------------------------------
// KI-Anwendungen: Endnutzer-orientierte KI-Module
// -----------------------------------------------------------------------
{
id: 'ocr-regression',
name: 'OCR Regression',
href: '/ai/ocr-regression',
description: 'Regressions-Tests & Ground Truth',
purpose: 'Regressions-Tests fuer die OCR-Pipeline ausfuehren. Zeigt Pass/Fail pro Ground-Truth Session, Diff-Details und Verlauf vergangener Laeufe.',
audience: ['Entwickler', 'QA'],
subgroup: 'KI-Werkzeuge',
},
{
id: 'ocr-ground-truth',
name: 'Ground Truth Review',
href: '/ai/ocr-ground-truth',
description: 'Ground Truth pruefen & markieren',
purpose: 'Effiziente Massenpruefung von OCR-Sessions. Split-View mit Confidence-Highlighting, Quick-Accept und Batch-Markierung als Ground Truth.',
audience: ['Entwickler', 'QA'],
subgroup: 'KI-Werkzeuge',
},
{
id: 'agents',
name: 'Agent Management',

File diff suppressed because it is too large Load Diff

View File

@@ -18,6 +18,8 @@
"test:all": "vitest run && playwright test --project=chromium"
},
"dependencies": {
"@fortune-sheet/react": "^1.0.4",
"fabric": "^6.0.0",
"jspdf": "^4.1.0",
"jszip": "^3.10.1",
"lucide-react": "^0.468.0",
@@ -26,7 +28,6 @@
"react-dom": "^18.3.1",
"reactflow": "^11.11.4",
"recharts": "^2.15.0",
"fabric": "^6.0.0",
"uuid": "^13.0.0"
},
"devDependencies": {

View File

@@ -1,10 +1 @@
"""
Infrastructure management module.
Provides control plane for external GPU resources (vast.ai).
"""
from .vast_client import VastAIClient
from .vast_power import router as vast_router
__all__ = ["VastAIClient", "vast_router"]
# Infrastructure module (vast.ai GPU management removed — see git history)

View File

@@ -1,419 +0,0 @@
"""
Vast.ai REST API Client.
Verwendet die offizielle vast.ai API statt CLI fuer mehr Stabilitaet.
API Dokumentation: https://docs.vast.ai/api
"""
import asyncio
import logging
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional, Dict, Any, List
import httpx
logger = logging.getLogger(__name__)
class InstanceStatus(Enum):
"""Vast.ai Instance Status."""
RUNNING = "running"
STOPPED = "stopped"
EXITED = "exited"
LOADING = "loading"
SCHEDULING = "scheduling"
CREATING = "creating"
UNKNOWN = "unknown"
@dataclass
class AccountInfo:
"""Informationen ueber den vast.ai Account."""
credit: float # Aktuelles Guthaben in USD
balance: float # Balance (meist 0)
total_spend: float # Gesamtausgaben
username: str
email: str
has_billing: bool
@classmethod
def from_api_response(cls, data: Dict[str, Any]) -> "AccountInfo":
"""Erstellt AccountInfo aus API Response."""
return cls(
credit=data.get("credit", 0.0),
balance=data.get("balance", 0.0),
total_spend=abs(data.get("total_spend", 0.0)), # API gibt negativ zurück
username=data.get("username", ""),
email=data.get("email", ""),
has_billing=data.get("has_billing", False),
)
def to_dict(self) -> Dict[str, Any]:
"""Serialisiert zu Dictionary."""
return {
"credit": self.credit,
"balance": self.balance,
"total_spend": self.total_spend,
"username": self.username,
"email": self.email,
"has_billing": self.has_billing,
}
@dataclass
class InstanceInfo:
"""Informationen ueber eine vast.ai Instanz."""
id: int
status: InstanceStatus
machine_id: Optional[int] = None
gpu_name: Optional[str] = None
num_gpus: int = 1
gpu_ram: Optional[float] = None # GB
cpu_ram: Optional[float] = None # GB
disk_space: Optional[float] = None # GB
dph_total: Optional[float] = None # $/hour
public_ipaddr: Optional[str] = None
ports: Dict[str, Any] = field(default_factory=dict)
label: Optional[str] = None
image_uuid: Optional[str] = None
started_at: Optional[datetime] = None
@classmethod
def from_api_response(cls, data: Dict[str, Any]) -> "InstanceInfo":
"""Erstellt InstanceInfo aus API Response."""
status_map = {
"running": InstanceStatus.RUNNING,
"exited": InstanceStatus.EXITED,
"loading": InstanceStatus.LOADING,
"scheduling": InstanceStatus.SCHEDULING,
"creating": InstanceStatus.CREATING,
}
actual_status = data.get("actual_status", "unknown")
status = status_map.get(actual_status, InstanceStatus.UNKNOWN)
# Parse ports mapping
ports = {}
if "ports" in data and data["ports"]:
ports = data["ports"]
# Parse started_at
started_at = None
if "start_date" in data and data["start_date"]:
try:
started_at = datetime.fromtimestamp(data["start_date"], tz=timezone.utc)
except (ValueError, TypeError):
pass
return cls(
id=data.get("id", 0),
status=status,
machine_id=data.get("machine_id"),
gpu_name=data.get("gpu_name"),
num_gpus=data.get("num_gpus", 1),
gpu_ram=data.get("gpu_ram"),
cpu_ram=data.get("cpu_ram"),
disk_space=data.get("disk_space"),
dph_total=data.get("dph_total"),
public_ipaddr=data.get("public_ipaddr"),
ports=ports,
label=data.get("label"),
image_uuid=data.get("image_uuid"),
started_at=started_at,
)
def get_endpoint_url(self, internal_port: int = 8001) -> Optional[str]:
"""Berechnet die externe URL fuer einen internen Port."""
if not self.public_ipaddr:
return None
# vast.ai mapped interne Ports auf externe Ports
# Format: {"8001/tcp": [{"HostIp": "0.0.0.0", "HostPort": "12345"}]}
port_key = f"{internal_port}/tcp"
if port_key in self.ports:
port_info = self.ports[port_key]
if isinstance(port_info, list) and port_info:
host_port = port_info[0].get("HostPort")
if host_port:
return f"http://{self.public_ipaddr}:{host_port}"
# Fallback: Direkter Port
return f"http://{self.public_ipaddr}:{internal_port}"
def to_dict(self) -> Dict[str, Any]:
"""Serialisiert zu Dictionary."""
return {
"id": self.id,
"status": self.status.value,
"machine_id": self.machine_id,
"gpu_name": self.gpu_name,
"num_gpus": self.num_gpus,
"gpu_ram": self.gpu_ram,
"cpu_ram": self.cpu_ram,
"disk_space": self.disk_space,
"dph_total": self.dph_total,
"public_ipaddr": self.public_ipaddr,
"ports": self.ports,
"label": self.label,
"started_at": self.started_at.isoformat() if self.started_at else None,
}
class VastAIClient:
"""
Async Client fuer vast.ai REST API.
Verwendet die offizielle API unter https://console.vast.ai/api/v0/
"""
BASE_URL = "https://console.vast.ai/api/v0"
def __init__(self, api_key: str, timeout: float = 30.0):
self.api_key = api_key
self.timeout = timeout
self._client: Optional[httpx.AsyncClient] = None
async def _get_client(self) -> httpx.AsyncClient:
"""Lazy Client-Erstellung."""
if self._client is None or self._client.is_closed:
self._client = httpx.AsyncClient(
timeout=self.timeout,
headers={
"Accept": "application/json",
},
)
return self._client
async def close(self) -> None:
"""Schliesst den HTTP Client."""
if self._client and not self._client.is_closed:
await self._client.aclose()
self._client = None
def _build_url(self, endpoint: str) -> str:
"""Baut vollstaendige URL mit API Key."""
sep = "&" if "?" in endpoint else "?"
return f"{self.BASE_URL}{endpoint}{sep}api_key={self.api_key}"
async def list_instances(self) -> List[InstanceInfo]:
"""Listet alle Instanzen auf."""
client = await self._get_client()
url = self._build_url("/instances/")
try:
response = await client.get(url)
response.raise_for_status()
data = response.json()
instances = []
if "instances" in data:
for inst_data in data["instances"]:
instances.append(InstanceInfo.from_api_response(inst_data))
return instances
except httpx.HTTPStatusError as e:
logger.error(f"vast.ai API error listing instances: {e}")
raise
async def get_instance(self, instance_id: int) -> Optional[InstanceInfo]:
"""Holt Details einer spezifischen Instanz."""
client = await self._get_client()
url = self._build_url(f"/instances/{instance_id}/")
try:
response = await client.get(url)
response.raise_for_status()
data = response.json()
if "instances" in data:
instances = data["instances"]
# API gibt bei einzelner Instanz ein dict zurück, bei Liste eine Liste
if isinstance(instances, list) and instances:
return InstanceInfo.from_api_response(instances[0])
elif isinstance(instances, dict):
# Füge ID hinzu falls nicht vorhanden
if "id" not in instances:
instances["id"] = instance_id
return InstanceInfo.from_api_response(instances)
elif isinstance(data, dict) and "id" in data:
return InstanceInfo.from_api_response(data)
return None
except httpx.HTTPStatusError as e:
if e.response.status_code == 404:
return None
logger.error(f"vast.ai API error getting instance {instance_id}: {e}")
raise
async def start_instance(self, instance_id: int) -> bool:
"""Startet eine gestoppte Instanz."""
client = await self._get_client()
url = self._build_url(f"/instances/{instance_id}/")
try:
response = await client.put(
url,
json={"state": "running"},
)
response.raise_for_status()
logger.info(f"vast.ai instance {instance_id} start requested")
return True
except httpx.HTTPStatusError as e:
logger.error(f"vast.ai API error starting instance {instance_id}: {e}")
return False
async def stop_instance(self, instance_id: int) -> bool:
"""Stoppt eine laufende Instanz (haelt Disk)."""
client = await self._get_client()
url = self._build_url(f"/instances/{instance_id}/")
try:
response = await client.put(
url,
json={"state": "stopped"},
)
response.raise_for_status()
logger.info(f"vast.ai instance {instance_id} stop requested")
return True
except httpx.HTTPStatusError as e:
logger.error(f"vast.ai API error stopping instance {instance_id}: {e}")
return False
async def destroy_instance(self, instance_id: int) -> bool:
"""Loescht eine Instanz komplett (Disk weg!)."""
client = await self._get_client()
url = self._build_url(f"/instances/{instance_id}/")
try:
response = await client.delete(url)
response.raise_for_status()
logger.info(f"vast.ai instance {instance_id} destroyed")
return True
except httpx.HTTPStatusError as e:
logger.error(f"vast.ai API error destroying instance {instance_id}: {e}")
return False
async def set_label(self, instance_id: int, label: str) -> bool:
"""Setzt ein Label fuer eine Instanz."""
client = await self._get_client()
url = self._build_url(f"/instances/{instance_id}/")
try:
response = await client.put(
url,
json={"label": label},
)
response.raise_for_status()
return True
except httpx.HTTPStatusError as e:
logger.error(f"vast.ai API error setting label on instance {instance_id}: {e}")
return False
async def wait_for_status(
self,
instance_id: int,
target_status: InstanceStatus,
timeout_seconds: int = 300,
poll_interval: float = 5.0,
) -> Optional[InstanceInfo]:
"""
Wartet bis eine Instanz einen bestimmten Status erreicht.
Returns:
InstanceInfo wenn Status erreicht, None bei Timeout.
"""
deadline = asyncio.get_event_loop().time() + timeout_seconds
while asyncio.get_event_loop().time() < deadline:
instance = await self.get_instance(instance_id)
if instance and instance.status == target_status:
return instance
if instance:
logger.debug(
f"vast.ai instance {instance_id} status: {instance.status.value}, "
f"waiting for {target_status.value}"
)
await asyncio.sleep(poll_interval)
logger.warning(
f"Timeout waiting for instance {instance_id} to reach {target_status.value}"
)
return None
async def wait_for_health(
self,
instance: InstanceInfo,
health_path: str = "/health",
internal_port: int = 8001,
timeout_seconds: int = 600,
poll_interval: float = 5.0,
) -> bool:
"""
Wartet bis der Health-Endpoint erreichbar ist.
Returns:
True wenn Health OK, False bei Timeout.
"""
endpoint = instance.get_endpoint_url(internal_port)
if not endpoint:
logger.error("No endpoint URL available for health check")
return False
health_url = f"{endpoint.rstrip('/')}{health_path}"
logger.info(f"Waiting for health at {health_url}")
deadline = asyncio.get_event_loop().time() + timeout_seconds
health_client = httpx.AsyncClient(timeout=5.0)
try:
while asyncio.get_event_loop().time() < deadline:
try:
response = await health_client.get(health_url)
if 200 <= response.status_code < 300:
logger.info(f"Health check passed: {health_url}")
return True
except Exception as e:
logger.debug(f"Health check failed: {e}")
await asyncio.sleep(poll_interval)
logger.warning(f"Health check timeout: {health_url}")
return False
finally:
await health_client.aclose()
async def get_account_info(self) -> Optional[AccountInfo]:
"""
Holt Account-Informationen inkl. Credit/Budget.
Returns:
AccountInfo oder None bei Fehler.
"""
client = await self._get_client()
url = self._build_url("/users/current/")
try:
response = await client.get(url)
response.raise_for_status()
data = response.json()
return AccountInfo.from_api_response(data)
except httpx.HTTPStatusError as e:
logger.error(f"vast.ai API error getting account info: {e}")
return None
except Exception as e:
logger.error(f"Error getting vast.ai account info: {e}")
return None

View File

@@ -1,618 +0,0 @@
"""
Vast.ai Power Control API.
Stellt Endpoints bereit fuer:
- Start/Stop von vast.ai Instanzen
- Status-Abfrage
- Auto-Shutdown bei Inaktivitaet
- Kosten-Tracking
Sicherheit: Alle Endpoints erfordern CONTROL_API_KEY.
"""
import asyncio
import json
import logging
import os
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional, Dict, Any, List
from fastapi import APIRouter, Depends, HTTPException, Header, BackgroundTasks
from pydantic import BaseModel, Field
from .vast_client import VastAIClient, InstanceInfo, InstanceStatus, AccountInfo
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/infra/vast", tags=["Infrastructure"])
# -------------------------
# Configuration (ENV)
# -------------------------
VAST_API_KEY = os.getenv("VAST_API_KEY")
VAST_INSTANCE_ID = os.getenv("VAST_INSTANCE_ID") # Numeric instance ID
CONTROL_API_KEY = os.getenv("CONTROL_API_KEY") # Admin key for these endpoints
# Health check configuration
VAST_HEALTH_PORT = int(os.getenv("VAST_HEALTH_PORT", "8001"))
VAST_HEALTH_PATH = os.getenv("VAST_HEALTH_PATH", "/health")
VAST_WAIT_TIMEOUT_S = int(os.getenv("VAST_WAIT_TIMEOUT_S", "600")) # 10 min
# Auto-shutdown configuration
AUTO_SHUTDOWN_ENABLED = os.getenv("VAST_AUTO_SHUTDOWN", "true").lower() == "true"
AUTO_SHUTDOWN_MINUTES = int(os.getenv("VAST_AUTO_SHUTDOWN_MINUTES", "30"))
# State persistence (in /tmp for container compatibility)
STATE_PATH = Path(os.getenv("VAST_STATE_PATH", "/tmp/vast_state.json"))
AUDIT_PATH = Path(os.getenv("VAST_AUDIT_PATH", "/tmp/vast_audit.log"))
# -------------------------
# State Management
# -------------------------
class VastState:
"""
Persistenter State fuer vast.ai Kontrolle.
Speichert:
- Aktueller Endpunkt (weil IP sich aendern kann)
- Letzte Aktivitaet (fuer Auto-Shutdown)
- Kosten-Tracking
"""
def __init__(self, path: Path = STATE_PATH):
self.path = path
self._state: Dict[str, Any] = self._load()
def _load(self) -> Dict[str, Any]:
"""Laedt State von Disk."""
if not self.path.exists():
return {
"desired_state": None,
"endpoint_base_url": None,
"last_activity": None,
"last_start": None,
"last_stop": None,
"total_runtime_seconds": 0,
"total_cost_usd": 0.0,
}
try:
return json.loads(self.path.read_text(encoding="utf-8"))
except Exception:
return {}
def _save(self) -> None:
"""Speichert State auf Disk."""
self.path.parent.mkdir(parents=True, exist_ok=True)
self.path.write_text(
json.dumps(self._state, ensure_ascii=False, indent=2),
encoding="utf-8",
)
def get(self, key: str, default: Any = None) -> Any:
return self._state.get(key, default)
def set(self, key: str, value: Any) -> None:
self._state[key] = value
self._save()
def update(self, data: Dict[str, Any]) -> None:
self._state.update(data)
self._save()
def record_activity(self) -> None:
"""Zeichnet letzte Aktivitaet auf (fuer Auto-Shutdown)."""
self._state["last_activity"] = datetime.now(timezone.utc).isoformat()
self._save()
def get_last_activity(self) -> Optional[datetime]:
"""Gibt letzte Aktivitaet als datetime."""
ts = self._state.get("last_activity")
if ts:
return datetime.fromisoformat(ts)
return None
def record_start(self) -> None:
"""Zeichnet Start-Zeit auf."""
self._state["last_start"] = datetime.now(timezone.utc).isoformat()
self._state["desired_state"] = "RUNNING"
self._save()
def record_stop(self, dph_total: Optional[float] = None) -> None:
"""Zeichnet Stop-Zeit auf und berechnet Kosten."""
now = datetime.now(timezone.utc)
self._state["last_stop"] = now.isoformat()
self._state["desired_state"] = "STOPPED"
# Berechne Runtime und Kosten
last_start = self._state.get("last_start")
if last_start:
start_dt = datetime.fromisoformat(last_start)
runtime_seconds = (now - start_dt).total_seconds()
self._state["total_runtime_seconds"] = (
self._state.get("total_runtime_seconds", 0) + runtime_seconds
)
if dph_total:
hours = runtime_seconds / 3600
cost = hours * dph_total
self._state["total_cost_usd"] = (
self._state.get("total_cost_usd", 0.0) + cost
)
logger.info(
f"Session cost: ${cost:.3f} ({runtime_seconds/60:.1f} min @ ${dph_total}/h)"
)
self._save()
# Global state instance
_state = VastState()
# -------------------------
# Audit Logging
# -------------------------
def audit_log(event: str, actor: str = "system", meta: Optional[Dict[str, Any]] = None) -> None:
"""Schreibt Audit-Log Eintrag."""
meta = meta or {}
line = json.dumps(
{
"ts": datetime.now(timezone.utc).isoformat(),
"event": event,
"actor": actor,
"meta": meta,
},
ensure_ascii=False,
)
AUDIT_PATH.parent.mkdir(parents=True, exist_ok=True)
with AUDIT_PATH.open("a", encoding="utf-8") as f:
f.write(line + "\n")
logger.info(f"AUDIT: {event} by {actor}")
# -------------------------
# Request/Response Models
# -------------------------
class PowerOnRequest(BaseModel):
wait_for_health: bool = Field(default=True, description="Warten bis LLM bereit")
health_path: str = Field(default=VAST_HEALTH_PATH)
health_port: int = Field(default=VAST_HEALTH_PORT)
class PowerOnResponse(BaseModel):
status: str
instance_id: Optional[int] = None
endpoint_base_url: Optional[str] = None
health_url: Optional[str] = None
message: Optional[str] = None
class PowerOffRequest(BaseModel):
pass # Keine Parameter noetig
class PowerOffResponse(BaseModel):
status: str
session_runtime_minutes: Optional[float] = None
session_cost_usd: Optional[float] = None
message: Optional[str] = None
class VastStatusResponse(BaseModel):
instance_id: Optional[int] = None
status: str
gpu_name: Optional[str] = None
dph_total: Optional[float] = None
endpoint_base_url: Optional[str] = None
last_activity: Optional[str] = None
auto_shutdown_in_minutes: Optional[int] = None
total_runtime_hours: Optional[float] = None
total_cost_usd: Optional[float] = None
# Budget / Credit Informationen
account_credit: Optional[float] = None # Verbleibendes Guthaben in USD
account_total_spend: Optional[float] = None # Gesamtausgaben auf vast.ai
# Session-Kosten (seit letztem Start)
session_runtime_minutes: Optional[float] = None
session_cost_usd: Optional[float] = None
message: Optional[str] = None
class CostStatsResponse(BaseModel):
total_runtime_hours: float
total_cost_usd: float
sessions_count: int
avg_session_minutes: float
# -------------------------
# Security Dependency
# -------------------------
def require_control_key(x_api_key: Optional[str] = Header(default=None)) -> None:
"""
Admin-Schutz fuer Control-Endpoints.
Header: X-API-Key: <CONTROL_API_KEY>
"""
if not CONTROL_API_KEY:
raise HTTPException(
status_code=500,
detail="CONTROL_API_KEY not configured on server",
)
if x_api_key != CONTROL_API_KEY:
raise HTTPException(status_code=401, detail="Unauthorized")
# -------------------------
# Auto-Shutdown Background Task
# -------------------------
_shutdown_task: Optional[asyncio.Task] = None
async def auto_shutdown_monitor() -> None:
"""
Hintergrund-Task der bei Inaktivitaet die Instanz stoppt.
Laeuft permanent wenn Instanz an ist und prueft alle 60s ob
Aktivitaet stattfand. Stoppt Instanz wenn keine Aktivitaet
seit AUTO_SHUTDOWN_MINUTES.
"""
if not VAST_API_KEY or not VAST_INSTANCE_ID:
return
client = VastAIClient(VAST_API_KEY)
try:
while True:
await asyncio.sleep(60) # Check every minute
if not AUTO_SHUTDOWN_ENABLED:
continue
last_activity = _state.get_last_activity()
if not last_activity:
continue
# Berechne Inaktivitaet
now = datetime.now(timezone.utc)
inactive_minutes = (now - last_activity).total_seconds() / 60
if inactive_minutes >= AUTO_SHUTDOWN_MINUTES:
logger.info(
f"Auto-shutdown triggered: {inactive_minutes:.1f} min inactive"
)
audit_log(
"auto_shutdown",
actor="system",
meta={"inactive_minutes": inactive_minutes},
)
# Hole aktuelle Instanz-Info fuer Kosten
instance = await client.get_instance(int(VAST_INSTANCE_ID))
dph = instance.dph_total if instance else None
# Stop
await client.stop_instance(int(VAST_INSTANCE_ID))
_state.record_stop(dph_total=dph)
audit_log("auto_shutdown_complete", actor="system")
except asyncio.CancelledError:
pass
except Exception as e:
logger.error(f"Auto-shutdown monitor error: {e}")
finally:
await client.close()
def start_auto_shutdown_monitor() -> None:
"""Startet den Auto-Shutdown Monitor."""
global _shutdown_task
if _shutdown_task is None or _shutdown_task.done():
_shutdown_task = asyncio.create_task(auto_shutdown_monitor())
logger.info("Auto-shutdown monitor started")
def stop_auto_shutdown_monitor() -> None:
"""Stoppt den Auto-Shutdown Monitor."""
global _shutdown_task
if _shutdown_task and not _shutdown_task.done():
_shutdown_task.cancel()
logger.info("Auto-shutdown monitor stopped")
# -------------------------
# API Endpoints
# -------------------------
@router.get("/status", response_model=VastStatusResponse, dependencies=[Depends(require_control_key)])
async def get_status() -> VastStatusResponse:
"""
Gibt Status der vast.ai Instanz zurueck.
Inkludiert:
- Aktueller Status (running/stopped/etc)
- GPU Info und Kosten pro Stunde
- Endpoint URL
- Auto-Shutdown Timer
- Gesamtkosten
- Account Credit (verbleibendes Budget)
- Session-Kosten (seit letztem Start)
"""
if not VAST_API_KEY or not VAST_INSTANCE_ID:
return VastStatusResponse(
status="unconfigured",
message="VAST_API_KEY or VAST_INSTANCE_ID not set",
)
client = VastAIClient(VAST_API_KEY)
try:
instance = await client.get_instance(int(VAST_INSTANCE_ID))
if not instance:
return VastStatusResponse(
instance_id=int(VAST_INSTANCE_ID),
status="not_found",
message=f"Instance {VAST_INSTANCE_ID} not found",
)
# Hole Account-Info fuer Budget/Credit
account_info = await client.get_account_info()
account_credit = account_info.credit if account_info else None
account_total_spend = account_info.total_spend if account_info else None
# Update endpoint if running
endpoint = None
if instance.status == InstanceStatus.RUNNING:
endpoint = instance.get_endpoint_url(VAST_HEALTH_PORT)
if endpoint:
_state.set("endpoint_base_url", endpoint)
# Calculate auto-shutdown timer
auto_shutdown_minutes = None
if AUTO_SHUTDOWN_ENABLED and instance.status == InstanceStatus.RUNNING:
last_activity = _state.get_last_activity()
if last_activity:
inactive = (datetime.now(timezone.utc) - last_activity).total_seconds() / 60
auto_shutdown_minutes = max(0, int(AUTO_SHUTDOWN_MINUTES - inactive))
# Berechne aktuelle Session-Kosten (wenn Instanz laeuft)
session_runtime_minutes = None
session_cost_usd = None
last_start = _state.get("last_start")
# Falls Instanz laeuft aber kein last_start gesetzt (z.B. nach Container-Neustart),
# nutze start_date aus der vast.ai API falls vorhanden, sonst jetzt
if instance.status == InstanceStatus.RUNNING and not last_start:
if instance.started_at:
_state.set("last_start", instance.started_at.isoformat())
last_start = instance.started_at.isoformat()
else:
_state.record_start()
last_start = _state.get("last_start")
if last_start and instance.status == InstanceStatus.RUNNING:
start_dt = datetime.fromisoformat(last_start)
session_runtime_minutes = (datetime.now(timezone.utc) - start_dt).total_seconds() / 60
if instance.dph_total:
session_cost_usd = (session_runtime_minutes / 60) * instance.dph_total
return VastStatusResponse(
instance_id=instance.id,
status=instance.status.value,
gpu_name=instance.gpu_name,
dph_total=instance.dph_total,
endpoint_base_url=endpoint or _state.get("endpoint_base_url"),
last_activity=_state.get("last_activity"),
auto_shutdown_in_minutes=auto_shutdown_minutes,
total_runtime_hours=_state.get("total_runtime_seconds", 0) / 3600,
total_cost_usd=_state.get("total_cost_usd", 0.0),
account_credit=account_credit,
account_total_spend=account_total_spend,
session_runtime_minutes=session_runtime_minutes,
session_cost_usd=session_cost_usd,
)
finally:
await client.close()
@router.post("/power/on", response_model=PowerOnResponse, dependencies=[Depends(require_control_key)])
async def power_on(
payload: PowerOnRequest,
background_tasks: BackgroundTasks,
) -> PowerOnResponse:
"""
Startet die vast.ai Instanz.
1. Startet Instanz via API
2. Wartet auf Status RUNNING
3. Optional: Wartet auf Health-Endpoint
4. Startet Auto-Shutdown Monitor
"""
if not VAST_API_KEY or not VAST_INSTANCE_ID:
raise HTTPException(
status_code=500,
detail="VAST_API_KEY or VAST_INSTANCE_ID not configured",
)
instance_id = int(VAST_INSTANCE_ID)
audit_log("power_on_requested", meta={"instance_id": instance_id})
client = VastAIClient(VAST_API_KEY)
try:
# Start instance
success = await client.start_instance(instance_id)
if not success:
raise HTTPException(status_code=502, detail="Failed to start instance")
_state.record_start()
_state.record_activity()
# Wait for running status
instance = await client.wait_for_status(
instance_id,
InstanceStatus.RUNNING,
timeout_seconds=300,
)
if not instance:
return PowerOnResponse(
status="starting",
instance_id=instance_id,
message="Instance start requested but not yet running. Check status.",
)
# Get endpoint
endpoint = instance.get_endpoint_url(payload.health_port)
if endpoint:
_state.set("endpoint_base_url", endpoint)
# Wait for health if requested
if payload.wait_for_health:
health_ok = await client.wait_for_health(
instance,
health_path=payload.health_path,
internal_port=payload.health_port,
timeout_seconds=VAST_WAIT_TIMEOUT_S,
)
if not health_ok:
audit_log("power_on_health_timeout", meta={"instance_id": instance_id})
return PowerOnResponse(
status="running_unhealthy",
instance_id=instance_id,
endpoint_base_url=endpoint,
message=f"Instance running but health check failed at {endpoint}{payload.health_path}",
)
# Start auto-shutdown monitor
start_auto_shutdown_monitor()
audit_log("power_on_complete", meta={
"instance_id": instance_id,
"endpoint": endpoint,
})
return PowerOnResponse(
status="running",
instance_id=instance_id,
endpoint_base_url=endpoint,
health_url=f"{endpoint}{payload.health_path}" if endpoint else None,
message="Instance running and healthy",
)
finally:
await client.close()
@router.post("/power/off", response_model=PowerOffResponse, dependencies=[Depends(require_control_key)])
async def power_off(payload: PowerOffRequest) -> PowerOffResponse:
"""
Stoppt die vast.ai Instanz (behaelt Disk).
Berechnet Session-Kosten und -Laufzeit.
"""
if not VAST_API_KEY or not VAST_INSTANCE_ID:
raise HTTPException(
status_code=500,
detail="VAST_API_KEY or VAST_INSTANCE_ID not configured",
)
instance_id = int(VAST_INSTANCE_ID)
audit_log("power_off_requested", meta={"instance_id": instance_id})
# Stop auto-shutdown monitor
stop_auto_shutdown_monitor()
client = VastAIClient(VAST_API_KEY)
try:
# Get current info for cost calculation
instance = await client.get_instance(instance_id)
dph = instance.dph_total if instance else None
# Calculate session stats before updating state
session_runtime = 0.0
session_cost = 0.0
last_start = _state.get("last_start")
if last_start:
start_dt = datetime.fromisoformat(last_start)
session_runtime = (datetime.now(timezone.utc) - start_dt).total_seconds() / 60
if dph:
session_cost = (session_runtime / 60) * dph
# Stop instance
success = await client.stop_instance(instance_id)
if not success:
raise HTTPException(status_code=502, detail="Failed to stop instance")
_state.record_stop(dph_total=dph)
audit_log("power_off_complete", meta={
"instance_id": instance_id,
"session_minutes": session_runtime,
"session_cost": session_cost,
})
return PowerOffResponse(
status="stopped",
session_runtime_minutes=session_runtime,
session_cost_usd=session_cost,
message=f"Instance stopped. Session: {session_runtime:.1f} min, ${session_cost:.3f}",
)
finally:
await client.close()
@router.post("/activity", dependencies=[Depends(require_control_key)])
async def record_activity() -> Dict[str, str]:
"""
Zeichnet Aktivitaet auf (verzoegert Auto-Shutdown).
Sollte von LLM Gateway aufgerufen werden bei jedem Request.
"""
_state.record_activity()
return {"status": "recorded", "last_activity": _state.get("last_activity")}
@router.get("/costs", response_model=CostStatsResponse, dependencies=[Depends(require_control_key)])
async def get_costs() -> CostStatsResponse:
"""
Gibt Kosten-Statistiken zurueck.
"""
total_seconds = _state.get("total_runtime_seconds", 0)
total_cost = _state.get("total_cost_usd", 0.0)
# TODO: Sessions count from audit log
sessions = 1 if total_seconds > 0 else 0
avg_minutes = (total_seconds / 60 / sessions) if sessions > 0 else 0
return CostStatsResponse(
total_runtime_hours=total_seconds / 3600,
total_cost_usd=total_cost,
sessions_count=sessions,
avg_session_minutes=avg_minutes,
)
@router.get("/audit", dependencies=[Depends(require_control_key)])
async def get_audit_log(limit: int = 50) -> List[Dict[str, Any]]:
"""
Gibt letzte Audit-Log Eintraege zurueck.
"""
if not AUDIT_PATH.exists():
return []
lines = AUDIT_PATH.read_text(encoding="utf-8").strip().split("\n")
entries = []
for line in lines[-limit:]:
try:
entries.append(json.loads(line))
except json.JSONDecodeError:
continue
return list(reversed(entries)) # Neueste zuerst

View File

@@ -1,199 +0,0 @@
"""
BreakPilot Jitsi API
Ermoeglicht das Versenden von Jitsi-Meeting-Einladungen per Email.
"""
import os
import uuid
from datetime import datetime
from typing import Optional, List
from pydantic import BaseModel, Field
from fastapi import APIRouter, HTTPException
router = APIRouter(prefix="/api/jitsi", tags=["Jitsi"])
# Standard Jitsi Server (kann konfiguriert werden)
JITSI_SERVER = os.getenv("JITSI_SERVER", "https://meet.jit.si")
# ==========================================
# PYDANTIC MODELS
# ==========================================
class JitsiInvitation(BaseModel):
"""Model fuer Jitsi-Meeting-Einladung."""
to_email: str = Field(..., description="Email-Adresse des Teilnehmers")
to_name: str = Field(..., description="Name des Teilnehmers")
organizer_name: str = Field(default="BreakPilot Lehrer", description="Name des Organisators")
meeting_title: str = Field(..., description="Titel des Meetings")
meeting_date: str = Field(..., description="Datum z.B. '20. Dezember 2024'")
meeting_time: str = Field(..., description="Uhrzeit z.B. '14:00 Uhr'")
room_name: Optional[str] = Field(None, description="Raumname (wird generiert wenn leer)")
additional_info: Optional[str] = Field(None, description="Zusaetzliche Informationen")
class JitsiInvitationResponse(BaseModel):
"""Antwort auf eine Jitsi-Einladung."""
success: bool
jitsi_url: str
room_name: str
email_sent: bool
email_error: Optional[str] = None
class JitsiBulkInvitation(BaseModel):
"""Model fuer mehrere Jitsi-Einladungen."""
recipients: List[dict] = Field(..., description="Liste von {email, name} Objekten")
organizer_name: str = Field(default="BreakPilot Lehrer")
meeting_title: str
meeting_date: str
meeting_time: str
room_name: Optional[str] = None
additional_info: Optional[str] = None
class JitsiBulkResponse(BaseModel):
"""Antwort auf Bulk-Einladungen."""
jitsi_url: str
room_name: str
sent: int
failed: int
errors: List[str]
# ==========================================
# HELPER FUNCTIONS
# ==========================================
def generate_room_name() -> str:
"""Generiert einen sicheren Raumnamen."""
# UUID-basiert fuer Sicherheit
unique_id = uuid.uuid4().hex[:12]
return f"BreakPilot-{unique_id}"
def build_jitsi_url(room_name: str) -> str:
"""Erstellt die vollstaendige Jitsi-URL."""
return f"{JITSI_SERVER}/{room_name}"
# ==========================================
# API ENDPOINTS
# ==========================================
@router.post("/invite", response_model=JitsiInvitationResponse)
async def send_jitsi_invitation(invitation: JitsiInvitation):
"""
Sendet eine Jitsi-Meeting-Einladung per Email.
Der Empfaenger kann dem Meeting ueber den Browser beitreten,
ohne Matrix oder andere Software installieren zu muessen.
"""
# Raumname generieren oder verwenden
room_name = invitation.room_name or generate_room_name()
jitsi_url = build_jitsi_url(room_name)
email_sent = False
email_error = None
try:
from email_service import email_service
result = email_service.send_jitsi_invitation(
to_email=invitation.to_email,
to_name=invitation.to_name,
organizer_name=invitation.organizer_name,
meeting_title=invitation.meeting_title,
meeting_date=invitation.meeting_date,
meeting_time=invitation.meeting_time,
jitsi_url=jitsi_url,
additional_info=invitation.additional_info
)
email_sent = result.success
if not result.success:
email_error = result.error
except Exception as e:
email_error = str(e)
return JitsiInvitationResponse(
success=email_sent,
jitsi_url=jitsi_url,
room_name=room_name,
email_sent=email_sent,
email_error=email_error
)
@router.post("/invite/bulk", response_model=JitsiBulkResponse)
async def send_bulk_jitsi_invitations(bulk: JitsiBulkInvitation):
"""
Sendet Jitsi-Einladungen an mehrere Empfaenger.
Alle Empfaenger erhalten eine Einladung zum selben Meeting.
"""
# Gemeinsamer Raumname fuer alle
room_name = bulk.room_name or generate_room_name()
jitsi_url = build_jitsi_url(room_name)
sent = 0
failed = 0
errors = []
try:
from email_service import email_service
for recipient in bulk.recipients:
if not recipient.get("email"):
errors.append(f"Fehlende Email fuer {recipient.get('name', 'Unbekannt')}")
failed += 1
continue
result = email_service.send_jitsi_invitation(
to_email=recipient["email"],
to_name=recipient.get("name", ""),
organizer_name=bulk.organizer_name,
meeting_title=bulk.meeting_title,
meeting_date=bulk.meeting_date,
meeting_time=bulk.meeting_time,
jitsi_url=jitsi_url,
additional_info=bulk.additional_info
)
if result.success:
sent += 1
else:
failed += 1
errors.append(f"{recipient.get('email')}: {result.error}")
except Exception as e:
errors.append(f"Allgemeiner Fehler: {str(e)}")
return JitsiBulkResponse(
jitsi_url=jitsi_url,
room_name=room_name,
sent=sent,
failed=failed,
errors=errors[:20] # Max 20 Fehler zurueckgeben
)
@router.get("/room")
async def generate_meeting_room():
"""
Generiert einen neuen Meeting-Raum.
Gibt die URL zurueck ohne Einladungen zu senden.
"""
room_name = generate_room_name()
jitsi_url = build_jitsi_url(room_name)
return {
"room_name": room_name,
"jitsi_url": jitsi_url,
"server": JITSI_SERVER,
"created_at": datetime.utcnow().isoformat()
}

View File

@@ -1,5 +1,9 @@
from typing import List, Dict, Any, Optional
from datetime import datetime
from pathlib import Path
import json
import os
import logging
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
@@ -15,6 +19,8 @@ from learning_units import (
delete_learning_unit,
)
logger = logging.getLogger(__name__)
router = APIRouter(
prefix="/learning-units",
@@ -49,6 +55,11 @@ class RemoveWorksheetPayload(BaseModel):
worksheet_file: str
class GenerateFromAnalysisPayload(BaseModel):
analysis_data: Dict[str, Any]
num_questions: int = 8
# ---------- Hilfsfunktion: Backend-Modell -> Frontend-Objekt ----------
@@ -195,3 +206,171 @@ def api_delete_learning_unit(unit_id: str):
raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
return {"status": "deleted", "id": unit_id}
# ---------- Generator-Endpunkte ----------
LERNEINHEITEN_DIR = os.path.expanduser("~/Arbeitsblaetter/Lerneinheiten")
def _save_analysis_and_get_path(unit_id: str, analysis_data: Dict[str, Any]) -> Path:
"""Save analysis_data to disk and return the path."""
os.makedirs(LERNEINHEITEN_DIR, exist_ok=True)
path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_analyse.json"
with open(path, "w", encoding="utf-8") as f:
json.dump(analysis_data, f, ensure_ascii=False, indent=2)
return path
@router.post("/{unit_id}/generate-qa")
def api_generate_qa(unit_id: str, payload: GenerateFromAnalysisPayload):
"""Generate Q&A items with Leitner fields from analysis data."""
lu = get_learning_unit(unit_id)
if not lu:
raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
analysis_path = _save_analysis_and_get_path(unit_id, payload.analysis_data)
try:
from ai_processing.qa_generator import generate_qa_from_analysis
qa_path = generate_qa_from_analysis(analysis_path, num_questions=payload.num_questions)
with open(qa_path, "r", encoding="utf-8") as f:
qa_data = json.load(f)
# Update unit status
update_learning_unit(unit_id, LearningUnitUpdate(status="qa_generated"))
logger.info(f"Generated QA for unit {unit_id}: {len(qa_data.get('qa_items', []))} items")
return qa_data
except Exception as e:
logger.error(f"QA generation failed for {unit_id}: {e}")
raise HTTPException(status_code=500, detail=f"QA-Generierung fehlgeschlagen: {e}")
@router.post("/{unit_id}/generate-mc")
def api_generate_mc(unit_id: str, payload: GenerateFromAnalysisPayload):
"""Generate multiple choice questions from analysis data."""
lu = get_learning_unit(unit_id)
if not lu:
raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
analysis_path = _save_analysis_and_get_path(unit_id, payload.analysis_data)
try:
from ai_processing.mc_generator import generate_mc_from_analysis
mc_path = generate_mc_from_analysis(analysis_path, num_questions=payload.num_questions)
with open(mc_path, "r", encoding="utf-8") as f:
mc_data = json.load(f)
update_learning_unit(unit_id, LearningUnitUpdate(status="mc_generated"))
logger.info(f"Generated MC for unit {unit_id}: {len(mc_data.get('questions', []))} questions")
return mc_data
except Exception as e:
logger.error(f"MC generation failed for {unit_id}: {e}")
raise HTTPException(status_code=500, detail=f"MC-Generierung fehlgeschlagen: {e}")
@router.post("/{unit_id}/generate-cloze")
def api_generate_cloze(unit_id: str, payload: GenerateFromAnalysisPayload):
"""Generate cloze (fill-in-the-blank) items from analysis data."""
lu = get_learning_unit(unit_id)
if not lu:
raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
analysis_path = _save_analysis_and_get_path(unit_id, payload.analysis_data)
try:
from ai_processing.cloze_generator import generate_cloze_from_analysis
cloze_path = generate_cloze_from_analysis(analysis_path)
with open(cloze_path, "r", encoding="utf-8") as f:
cloze_data = json.load(f)
update_learning_unit(unit_id, LearningUnitUpdate(status="cloze_generated"))
logger.info(f"Generated Cloze for unit {unit_id}: {len(cloze_data.get('cloze_items', []))} items")
return cloze_data
except Exception as e:
logger.error(f"Cloze generation failed for {unit_id}: {e}")
raise HTTPException(status_code=500, detail=f"Cloze-Generierung fehlgeschlagen: {e}")
@router.get("/{unit_id}/qa")
def api_get_qa(unit_id: str):
"""Get generated QA items for a unit."""
qa_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_qa.json"
if not qa_path.exists():
raise HTTPException(status_code=404, detail="Keine QA-Daten gefunden.")
with open(qa_path, "r", encoding="utf-8") as f:
return json.load(f)
@router.get("/{unit_id}/mc")
def api_get_mc(unit_id: str):
"""Get generated MC questions for a unit."""
mc_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_mc.json"
if not mc_path.exists():
raise HTTPException(status_code=404, detail="Keine MC-Daten gefunden.")
with open(mc_path, "r", encoding="utf-8") as f:
return json.load(f)
@router.get("/{unit_id}/cloze")
def api_get_cloze(unit_id: str):
"""Get generated cloze items for a unit."""
cloze_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_cloze.json"
if not cloze_path.exists():
raise HTTPException(status_code=404, detail="Keine Cloze-Daten gefunden.")
with open(cloze_path, "r", encoding="utf-8") as f:
return json.load(f)
@router.post("/{unit_id}/leitner/update")
def api_update_leitner(unit_id: str, item_id: str, correct: bool):
"""Update Leitner progress for a QA item."""
qa_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_qa.json"
if not qa_path.exists():
raise HTTPException(status_code=404, detail="Keine QA-Daten gefunden.")
try:
from ai_processing.qa_generator import update_leitner_progress
result = update_leitner_progress(qa_path, item_id, correct)
return result
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
@router.get("/{unit_id}/leitner/next")
def api_get_next_review(unit_id: str, limit: int = 5):
"""Get next Leitner review items."""
qa_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_qa.json"
if not qa_path.exists():
raise HTTPException(status_code=404, detail="Keine QA-Daten gefunden.")
try:
from ai_processing.qa_generator import get_next_review_items
items = get_next_review_items(qa_path, limit=limit)
return {"items": items, "count": len(items)}
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
class StoryGeneratePayload(BaseModel):
vocabulary: List[Dict[str, Any]]
language: str = "en"
grade_level: str = "5-8"
@router.post("/{unit_id}/generate-story")
def api_generate_story(unit_id: str, payload: StoryGeneratePayload):
"""Generate a short story using vocabulary words."""
lu = get_learning_unit(unit_id)
if not lu:
raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
try:
from story_generator import generate_story
result = generate_story(
vocabulary=payload.vocabulary,
language=payload.language,
grade_level=payload.grade_level,
)
return result
except Exception as e:
logger.error(f"Story generation failed for {unit_id}: {e}")
raise HTTPException(status_code=500, detail=f"Story-Generierung fehlgeschlagen: {e}")

View File

@@ -40,7 +40,6 @@ os.environ["DATABASE_URL"] = DATABASE_URL
# ---------------------------------------------------------------------------
LLM_GATEWAY_ENABLED = os.getenv("LLM_GATEWAY_ENABLED", "false").lower() == "true"
ALERTS_AGENT_ENABLED = os.getenv("ALERTS_AGENT_ENABLED", "false").lower() == "true"
VAST_API_KEY = os.getenv("VAST_API_KEY")
# ---------------------------------------------------------------------------
@@ -106,21 +105,20 @@ app.include_router(correction_router, prefix="/api")
from learning_units_api import router as learning_units_router
app.include_router(learning_units_router, prefix="/api")
# --- 4b. Learning Progress ---
from progress_api import router as progress_router
app.include_router(progress_router, prefix="/api")
from unit_api import router as unit_router
app.include_router(unit_router) # Already has /api/units prefix
from unit_analytics_api import router as unit_analytics_router
app.include_router(unit_analytics_router) # Already has /api/analytics prefix
# --- 5. Meetings / Jitsi ---
from meetings_api import router as meetings_api_router
app.include_router(meetings_api_router) # Already has /api/meetings prefix
from recording_api import router as recording_api_router
app.include_router(recording_api_router) # Already has /api/recordings prefix
from jitsi_api import router as jitsi_router
app.include_router(jitsi_router) # Already has /api/jitsi prefix
# --- 6. Messenger ---
from messenger_api import router as messenger_router
@@ -180,11 +178,6 @@ if ALERTS_AGENT_ENABLED:
from alerts_agent.api import router as alerts_router
app.include_router(alerts_router, prefix="/api", tags=["Alerts Agent"])
# --- 14. vast.ai GPU Infrastructure (optional) ---
if VAST_API_KEY:
from infra.vast_power import router as vast_router
app.include_router(vast_router, tags=["GPU Infrastructure"])
# ---------------------------------------------------------------------------
# Middleware (from shared middleware/ package)

View File

@@ -1,443 +0,0 @@
"""
Meetings API Module
Backend API endpoints for Jitsi Meet integration
"""
import os
import uuid
import httpx
from datetime import datetime, timedelta
from typing import Optional, List
from fastapi import APIRouter, HTTPException, Depends
from pydantic import BaseModel, EmailStr
router = APIRouter(prefix="/api/meetings", tags=["meetings"])
# ============================================
# Configuration
# ============================================
JITSI_BASE_URL = os.getenv("JITSI_PUBLIC_URL", "http://localhost:8443")
CONSENT_SERVICE_URL = os.getenv("CONSENT_SERVICE_URL", "http://localhost:8081")
# ============================================
# Models
# ============================================
class MeetingConfig(BaseModel):
enable_lobby: bool = True
enable_recording: bool = False
start_with_audio_muted: bool = True
start_with_video_muted: bool = False
require_display_name: bool = True
enable_breakout: bool = False
class CreateMeetingRequest(BaseModel):
type: str = "quick" # quick, scheduled, training, parent, class
title: str = "Neues Meeting"
duration: int = 60
scheduled_at: Optional[str] = None
config: Optional[MeetingConfig] = None
description: Optional[str] = None
invites: Optional[List[str]] = None
class ScheduleMeetingRequest(BaseModel):
title: str
scheduled_at: str
duration: int = 60
description: Optional[str] = None
invites: Optional[List[str]] = None
class TrainingRequest(BaseModel):
title: str
description: Optional[str] = None
scheduled_at: str
duration: int = 120
max_participants: int = 20
trainer: str
config: Optional[MeetingConfig] = None
class ParentTeacherRequest(BaseModel):
student_name: str
parent_name: str
parent_email: Optional[str] = None
scheduled_at: str
reason: Optional[str] = None
send_invite: bool = True
duration: int = 30
class MeetingResponse(BaseModel):
room_name: str
join_url: str
moderator_url: Optional[str] = None
password: Optional[str] = None
expires_at: Optional[str] = None
class MeetingStats(BaseModel):
active: int = 0
scheduled: int = 0
recordings: int = 0
participants: int = 0
class ActiveMeeting(BaseModel):
room_name: str
title: str
participants: int
started_at: str
# ============================================
# In-Memory Storage (for demo purposes)
# In production, use database
# ============================================
scheduled_meetings = []
active_meetings = []
trainings = []
recordings = []
# ============================================
# Helper Functions
# ============================================
def generate_room_name(prefix: str = "meeting") -> str:
"""Generate a unique room name"""
return f"{prefix}-{uuid.uuid4().hex[:8]}"
def generate_password() -> str:
"""Generate a simple password"""
return uuid.uuid4().hex[:8]
def build_jitsi_url(room_name: str, config: Optional[MeetingConfig] = None) -> str:
"""Build Jitsi meeting URL with config parameters"""
params = []
if config:
if config.start_with_audio_muted:
params.append("config.startWithAudioMuted=true")
if config.start_with_video_muted:
params.append("config.startWithVideoMuted=true")
if config.require_display_name:
params.append("config.requireDisplayName=true")
# Common config
params.extend([
"config.prejoinPageEnabled=false",
"config.disableDeepLinking=true",
"config.defaultLanguage=de",
"interfaceConfig.SHOW_JITSI_WATERMARK=false",
"interfaceConfig.SHOW_BRAND_WATERMARK=false"
])
url = f"{JITSI_BASE_URL}/{room_name}"
if params:
url += "#" + "&".join(params)
return url
async def call_consent_service(endpoint: str, method: str = "GET", data: dict = None) -> dict:
"""Call the consent service API"""
async with httpx.AsyncClient() as client:
url = f"{CONSENT_SERVICE_URL}{endpoint}"
if method == "GET":
response = await client.get(url)
elif method == "POST":
response = await client.post(url, json=data)
else:
raise ValueError(f"Unsupported method: {method}")
if response.status_code >= 400:
return None
return response.json()
# ============================================
# API Endpoints
# ============================================
@router.get("/stats", response_model=MeetingStats)
async def get_meeting_stats():
"""Get meeting statistics"""
return MeetingStats(
active=len(active_meetings),
scheduled=len(scheduled_meetings),
recordings=len(recordings),
participants=sum(m.get("participants", 0) for m in active_meetings)
)
@router.get("/active", response_model=List[ActiveMeeting])
async def get_active_meetings():
"""Get list of active meetings"""
return [
ActiveMeeting(
room_name=m["room_name"],
title=m["title"],
participants=m.get("participants", 0),
started_at=m.get("started_at", datetime.now().isoformat())
)
for m in active_meetings
]
@router.post("/create", response_model=MeetingResponse)
async def create_meeting(request: CreateMeetingRequest):
"""Create a new meeting"""
config = request.config or MeetingConfig()
# Generate room name based on type
if request.type == "quick":
room_name = generate_room_name("quick")
elif request.type == "training":
room_name = generate_room_name("schulung")
elif request.type == "parent":
room_name = generate_room_name("elterngespraech")
elif request.type == "class":
room_name = generate_room_name("klasse")
else:
room_name = generate_room_name("meeting")
join_url = build_jitsi_url(room_name, config)
# Store meeting if scheduled
if request.scheduled_at:
scheduled_meetings.append({
"room_name": room_name,
"title": request.title,
"scheduled_at": request.scheduled_at,
"duration": request.duration,
"config": config.model_dump() if config else None
})
return MeetingResponse(
room_name=room_name,
join_url=join_url
)
@router.post("/schedule", response_model=MeetingResponse)
async def schedule_meeting(request: ScheduleMeetingRequest):
"""Schedule a new meeting"""
room_name = generate_room_name("meeting")
meeting = {
"room_name": room_name,
"title": request.title,
"scheduled_at": request.scheduled_at,
"duration": request.duration,
"description": request.description,
"invites": request.invites or []
}
scheduled_meetings.append(meeting)
join_url = build_jitsi_url(room_name)
# TODO: Send email invites if configured
return MeetingResponse(
room_name=room_name,
join_url=join_url
)
@router.post("/training", response_model=MeetingResponse)
async def create_training(request: TrainingRequest):
"""Create a training session"""
# Generate room name from title
title_slug = request.title.lower().replace(" ", "-")[:20]
room_name = f"schulung-{title_slug}-{uuid.uuid4().hex[:4]}"
config = request.config or MeetingConfig(
enable_lobby=True,
enable_recording=True,
start_with_audio_muted=True
)
training = {
"room_name": room_name,
"title": request.title,
"description": request.description,
"scheduled_at": request.scheduled_at,
"duration": request.duration,
"max_participants": request.max_participants,
"trainer": request.trainer,
"config": config.model_dump()
}
trainings.append(training)
scheduled_meetings.append(training)
join_url = build_jitsi_url(room_name, config)
return MeetingResponse(
room_name=room_name,
join_url=join_url
)
@router.post("/parent-teacher", response_model=MeetingResponse)
async def create_parent_teacher_meeting(request: ParentTeacherRequest):
"""Create a parent-teacher meeting"""
# Generate room name with student name and date
student_slug = request.student_name.lower().replace(" ", "-")[:15]
date_str = datetime.fromisoformat(request.scheduled_at).strftime("%Y%m%d-%H%M")
room_name = f"elterngespraech-{student_slug}-{date_str}"
# Generate password for security
password = generate_password()
config = MeetingConfig(
enable_lobby=True,
enable_recording=False,
start_with_audio_muted=False
)
meeting = {
"room_name": room_name,
"title": f"Elterngespräch - {request.student_name}",
"student_name": request.student_name,
"parent_name": request.parent_name,
"parent_email": request.parent_email,
"scheduled_at": request.scheduled_at,
"duration": request.duration,
"reason": request.reason,
"password": password,
"config": config.model_dump()
}
scheduled_meetings.append(meeting)
join_url = build_jitsi_url(room_name, config)
# TODO: Send email invite to parents if configured
return MeetingResponse(
room_name=room_name,
join_url=join_url,
password=password
)
@router.get("/scheduled")
async def get_scheduled_meetings():
"""Get all scheduled meetings"""
return scheduled_meetings
@router.get("/trainings")
async def get_trainings():
"""Get all training sessions"""
return trainings
@router.delete("/{room_name}")
async def delete_meeting(room_name: str):
"""Delete a scheduled meeting"""
# Find and remove the meeting (in-place modification)
for i, m in enumerate(scheduled_meetings):
if m["room_name"] == room_name:
scheduled_meetings.pop(i)
break
return {"status": "deleted"}
# ============================================
# Recording Endpoints
# ============================================
@router.get("/recordings")
async def get_recordings():
"""Get list of recordings"""
# Demo data
return [
{
"id": "docker-basics",
"title": "Docker Grundlagen Schulung",
"date": "2025-12-10T10:00:00",
"duration": "1:30:00",
"size_mb": 156,
"participants": 15
},
{
"id": "team-kw49",
"title": "Team-Meeting KW 49",
"date": "2025-12-06T14:00:00",
"duration": "1:00:00",
"size_mb": 98,
"participants": 8
},
{
"id": "parent-mueller",
"title": "Elterngespräch - Max Müller",
"date": "2025-12-02T16:00:00",
"duration": "0:28:00",
"size_mb": 42,
"participants": 2
}
]
@router.get("/recordings/{recording_id}")
async def get_recording(recording_id: str):
"""Get recording details"""
return {
"id": recording_id,
"title": "Recording " + recording_id,
"date": "2025-12-10T10:00:00",
"duration": "1:30:00",
"size_mb": 156,
"download_url": f"/api/recordings/{recording_id}/download"
}
@router.get("/recordings/{recording_id}/download")
async def download_recording(recording_id: str):
"""Download a recording"""
# In production, this would stream the actual file
raise HTTPException(status_code=404, detail="Recording file not found (demo mode)")
@router.delete("/recordings/{recording_id}")
async def delete_recording(recording_id: str):
"""Delete a recording"""
return {"status": "deleted", "id": recording_id}
# ============================================
# Health Check
# ============================================
@router.get("/health")
async def health_check():
"""Check meetings service health"""
# Check Jitsi availability
jitsi_healthy = False
try:
async with httpx.AsyncClient(timeout=5.0) as client:
response = await client.get(JITSI_BASE_URL)
jitsi_healthy = response.status_code == 200
except Exception:
pass
return {
"status": "healthy" if jitsi_healthy else "degraded",
"jitsi_url": JITSI_BASE_URL,
"jitsi_available": jitsi_healthy,
"scheduled_meetings": len(scheduled_meetings),
"active_meetings": len(active_meetings)
}

View File

@@ -0,0 +1,131 @@
"""
Progress API — Tracks student learning progress per unit.
Stores coins, crowns, streak data, and exercise completion stats.
Uses JSON file storage (same pattern as learning_units.py).
"""
import os
import json
import logging
from datetime import datetime, date
from typing import Dict, Any, Optional, List
from pathlib import Path
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
logger = logging.getLogger(__name__)
router = APIRouter(
prefix="/progress",
tags=["progress"],
)
PROGRESS_DIR = os.path.expanduser("~/Arbeitsblaetter/Lerneinheiten/progress")
def _ensure_dir():
os.makedirs(PROGRESS_DIR, exist_ok=True)
def _progress_path(unit_id: str) -> Path:
return Path(PROGRESS_DIR) / f"{unit_id}.json"
def _load_progress(unit_id: str) -> Dict[str, Any]:
path = _progress_path(unit_id)
if path.exists():
with open(path, "r", encoding="utf-8") as f:
return json.load(f)
return {
"unit_id": unit_id,
"coins": 0,
"crowns": 0,
"streak_days": 0,
"last_activity": None,
"exercises": {
"flashcards": {"completed": 0, "correct": 0, "incorrect": 0},
"quiz": {"completed": 0, "correct": 0, "incorrect": 0},
"type": {"completed": 0, "correct": 0, "incorrect": 0},
"story": {"generated": 0},
},
"created_at": datetime.now().isoformat(),
}
def _save_progress(unit_id: str, data: Dict[str, Any]):
_ensure_dir()
path = _progress_path(unit_id)
with open(path, "w", encoding="utf-8") as f:
json.dump(data, f, ensure_ascii=False, indent=2)
class RewardPayload(BaseModel):
exercise_type: str # flashcards, quiz, type, story
correct: bool = True
first_try: bool = True
@router.get("/{unit_id}")
def get_progress(unit_id: str):
"""Get learning progress for a unit."""
return _load_progress(unit_id)
@router.post("/{unit_id}/reward")
def add_reward(unit_id: str, payload: RewardPayload):
"""Record an exercise result and award coins."""
progress = _load_progress(unit_id)
# Update exercise stats
ex = progress["exercises"].get(payload.exercise_type, {"completed": 0, "correct": 0, "incorrect": 0})
ex["completed"] = ex.get("completed", 0) + 1
if payload.correct:
ex["correct"] = ex.get("correct", 0) + 1
else:
ex["incorrect"] = ex.get("incorrect", 0) + 1
progress["exercises"][payload.exercise_type] = ex
# Award coins
if payload.correct:
coins = 3 if payload.first_try else 1
else:
coins = 0
progress["coins"] = progress.get("coins", 0) + coins
# Update streak
today = date.today().isoformat()
last = progress.get("last_activity")
if last != today:
if last == (date.today().replace(day=date.today().day - 1)).isoformat() if date.today().day > 1 else None:
progress["streak_days"] = progress.get("streak_days", 0) + 1
elif last != today:
progress["streak_days"] = 1
progress["last_activity"] = today
# Award crowns for milestones
total_correct = sum(
e.get("correct", 0) for e in progress["exercises"].values() if isinstance(e, dict)
)
progress["crowns"] = total_correct // 20 # 1 crown per 20 correct answers
_save_progress(unit_id, progress)
return {
"coins_awarded": coins,
"total_coins": progress["coins"],
"crowns": progress["crowns"],
"streak_days": progress["streak_days"],
}
@router.get("/")
def list_all_progress():
"""List progress for all units."""
_ensure_dir()
results = []
for f in Path(PROGRESS_DIR).glob("*.json"):
with open(f, "r", encoding="utf-8") as fh:
results.append(json.load(fh))
return results

View File

@@ -0,0 +1,108 @@
"""
Story Generator — Creates short stories using vocabulary words.
Generates age-appropriate mini-stories (3-5 sentences) that incorporate
the given vocabulary words, marked with <mark> tags for highlighting.
Uses Ollama (local LLM) for generation.
"""
import os
import json
import logging
import requests
from typing import List, Dict, Any, Optional
logger = logging.getLogger(__name__)
OLLAMA_URL = os.getenv("OLLAMA_BASE_URL", "http://host.docker.internal:11434")
STORY_MODEL = os.getenv("STORY_MODEL", "llama3.1:8b")
def generate_story(
vocabulary: List[Dict[str, str]],
language: str = "en",
grade_level: str = "5-8",
max_words: int = 5,
) -> Dict[str, Any]:
"""
Generate a short story incorporating vocabulary words.
Args:
vocabulary: List of dicts with 'english' and 'german' keys
language: 'en' for English story, 'de' for German story
grade_level: Target grade level
max_words: Maximum vocab words to include (to keep story short)
Returns:
Dict with 'story_html', 'story_text', 'vocab_used', 'language'
"""
# Select subset of vocabulary
words = vocabulary[:max_words]
word_list = [w.get("english", "") if language == "en" else w.get("german", "") for w in words]
word_list = [w for w in word_list if w.strip()]
if not word_list:
return {"story_html": "", "story_text": "", "vocab_used": [], "language": language}
lang_name = "English" if language == "en" else "German"
words_str = ", ".join(word_list)
prompt = f"""Write a short story (3-5 sentences) in {lang_name} for a grade {grade_level} student.
The story MUST use these vocabulary words: {words_str}
Rules:
1. The story should be fun and age-appropriate
2. Each vocabulary word must appear at least once
3. Keep sentences simple and clear
4. The story should make sense and be engaging
Write ONLY the story, nothing else. No title, no introduction."""
try:
resp = requests.post(
f"{OLLAMA_URL}/api/generate",
json={
"model": STORY_MODEL,
"prompt": prompt,
"stream": False,
"options": {"temperature": 0.8, "num_predict": 300},
},
timeout=30,
)
resp.raise_for_status()
story_text = resp.json().get("response", "").strip()
except Exception as e:
logger.error(f"Story generation failed: {e}")
# Fallback: simple template story
story_text = _fallback_story(word_list, language)
# Mark vocabulary words in the story
story_html = story_text
vocab_found = []
for word in word_list:
if word.lower() in story_html.lower():
# Case-insensitive replacement preserving original case
import re
pattern = re.compile(re.escape(word), re.IGNORECASE)
story_html = pattern.sub(
lambda m: f'<mark class="vocab-highlight">{m.group()}</mark>',
story_html,
count=1,
)
vocab_found.append(word)
return {
"story_html": story_html,
"story_text": story_text,
"vocab_used": vocab_found,
"vocab_total": len(word_list),
"language": language,
}
def _fallback_story(words: List[str], language: str) -> str:
"""Simple fallback when LLM is unavailable."""
if language == "de":
return f"Heute habe ich neue Woerter gelernt: {', '.join(words)}. Es war ein guter Tag zum Lernen."
return f"Today I learned new words: {', '.join(words)}. It was a great day for learning."

View File

@@ -0,0 +1,166 @@
# OCR Pipeline Regression Testing
**Stand:** 2026-03-23
---
## Uebersicht
Das Regression Framework stellt sicher, dass Aenderungen an der OCR-Pipeline keine bestehenden
Ergebnisse verschlechtern. Ground-Truth Sessions dienen als Referenz — nach jeder Code-Aenderung
wird die Pipeline neu ausgefuehrt und das Ergebnis mit der Referenz verglichen.
---
## Ground Truth markieren
### Via Admin-UI (empfohlen)
1. Oeffne die OCR Pipeline: [/ai/ocr-pipeline](https://macmini:3002/ai/ocr-pipeline)
2. Lade eine Session und fuehre alle Pipeline-Schritte aus
3. Pruefe das Ergebnis im Grid Editor (Schritt 10)
4. Korrigiere Fehler falls noetig (Inline-Edit)
5. Klicke **"Als Ground Truth markieren"**
### Via API
```bash
# Bestehende Session als Ground Truth markieren
curl -X POST "http://macmini:8086/api/v1/ocr-pipeline/sessions/{session_id}/mark-ground-truth"
# Ground Truth entfernen
curl -X DELETE "http://macmini:8086/api/v1/ocr-pipeline/sessions/{session_id}/mark-ground-truth"
# Alle Ground-Truth Sessions auflisten
curl "http://macmini:8086/api/v1/ocr-pipeline/ground-truth-sessions"
```
### Via Ground-Truth Review UI
Fuer die Massenpruefung von 50-100 Sessions:
1. Oeffne [/ai/ocr-ground-truth](https://macmini:3002/ai/ocr-ground-truth)
2. Filter auf "Offen" (ungeprueft)
3. Split-View: Bild links, Grid rechts pruefen
4. Korrekte Zeilen mit Haekchen bestaetigen
5. Fehler inline korrigieren
6. "Markieren & Weiter" fuer naechste Session
---
## Regression ausfuehren
### Via Shell-Script (CI/CD)
```bash
# Standard: macmini:8086
./scripts/run-regression.sh
# Custom URL
./scripts/run-regression.sh http://localhost:8086
# Exit-Codes:
# 0 = alle bestanden
# 1 = Fehler gefunden
# 2 = Verbindungsfehler
```
### Via Admin-UI
1. Oeffne [/ai/ocr-regression](https://macmini:3002/ai/ocr-regression)
2. Klicke **"Alle Tests starten"**
3. Ergebnis: Pass/Fail pro Session mit Diff-Details
### Via API
```bash
# Alle Ground-Truth Sessions testen
curl -X POST "http://macmini:8086/api/v1/ocr-pipeline/regression/run?triggered_by=script"
# Einzelne Session testen
curl -X POST "http://macmini:8086/api/v1/ocr-pipeline/sessions/{session_id}/regression/run"
# Verlauf abrufen
curl "http://macmini:8086/api/v1/ocr-pipeline/regression/history?limit=20"
```
---
## Ergebnisse lesen
### Diff-Typen
| Typ | Beschreibung |
|-----|-------------|
| `structural_changes` | Anzahl Zonen, Spalten oder Zeilen hat sich geaendert |
| `text_change` | Text einer Zelle hat sich geaendert |
| `cell_missing` | Zelle war in der Referenz, fehlt jetzt |
| `cell_added` | Neue Zelle die in der Referenz nicht existierte |
| `col_type_change` | Spaltentyp einer Zelle hat sich geaendert |
### Status-Bewertung
- **pass**: Keine Diffs → Code-Aenderung hat keine Auswirkung
- **fail**: Diffs gefunden → pruefen ob gewollt (Feature) oder ungewollt (Regression)
- **error**: Pipeline-Fehler → Build oder Config-Problem
### Verlauf
Alle Laeufe werden in der Tabelle `regression_runs` persistiert:
```sql
SELECT id, run_at, status, total, passed, failed, errors, duration_ms, triggered_by
FROM regression_runs
ORDER BY run_at DESC
LIMIT 10;
```
---
## Best Practices
### Ground-Truth Sessions waehlen
Decke verschiedene Seitentypen ab:
- Woerterbuchseiten (2-3 Spalten, IPA-Klammern)
- Uebungsseiten (Tabellen, Checkboxen)
- Seiten mit Illustrationen
- Seiten ohne IPA (reines Deutsch-Vokabular)
- Verschiedene Verlage und Layouts
### Workflow vor jedem Commit
```bash
# 1. Regression laufen lassen
./scripts/run-regression.sh
# 2. Bei Failure: Diff pruefen
# - Gewollte Aenderung? → Ground Truth aktualisieren
# - Ungewollte Regression? → Code fixen
# 3. Bei Pass: Commit
git add . && git commit -m "fix: ..."
```
---
## Datenbank-Schema
```sql
CREATE TABLE regression_runs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
run_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
status VARCHAR(20) NOT NULL, -- pass, fail, error
total INT NOT NULL DEFAULT 0,
passed INT NOT NULL DEFAULT 0,
failed INT NOT NULL DEFAULT 0,
errors INT NOT NULL DEFAULT 0,
duration_ms INT,
results JSONB NOT NULL DEFAULT '[]', -- Detail-Ergebnisse pro Session
triggered_by VARCHAR(50) DEFAULT 'manual'
);
```
Ground-Truth Referenzen werden im `ground_truth` JSONB-Feld der
`ocr_pipeline_sessions` Tabelle gespeichert.

View File

@@ -0,0 +1,233 @@
# Hardware-Anforderungen & Distribution
**Stand:** 2026-03-23
---
## 3-Tier Hardware-Strategie
Nicht jeder Lehrer hat dieselbe Hardware. Statt "alles oder nichts" definieren wir drei klare Tiers:
### Tier 1: Basis (jedes Geraet mit 4+ GB RAM)
**Gedruckt-OCR + Arbeitsblaetter + Tests — funktioniert ueberall.**
| Feature | Status |
|---------|--------|
| Tesseract OCR (gedruckt) | ~30 MB Modell, ~200 MB RAM |
| RapidOCR / PP-OCRv5 (ONNX) | ~40 MB Modell, ~300 MB RAM |
| Vokabel-Arbeitsblaetter generieren | Regelbasiert, kein ML |
| Multiple-Choice / Lueckentexte | Regelbasiert |
| Notenspiegel-Berechnung | Statistik |
| PDF-Export | Deterministische Templates |
| Rechtschreibpruefung | ~5 MB Woerterbuch |
**Speicherbedarf:** ~300 MB App + Modelle
**Laeuft auf:**
- Jeder Windows-Laptop (4+ GB RAM, beliebige CPU)
- iPad (alle Generationen inkl. Basis-iPad)
- Android-Tablet ab 4 GB RAM (Galaxy Tab A8/A9)
- Chromebook (4 GB RAM)
- MacBook (alle)
---
### Tier 2: Erweitert (8+ GB RAM, moderne CPU)
**+ Handschrift-OCR — typischer Lehrer-Laptop oder iPad Air.**
| Feature | Zusaetzlich zu Tier 1 |
|---------|----------------------|
| TrOCR Handschrift-OCR (int8 quantisiert) | ~560 MB Modell, ~1.5 GB RAM |
| Text-Embeddings (MiniLM) | ~90 MB Modell, ~100 MB RAM |
| OpenCV Layout-/Grafik-Analyse | ~400 MB RAM (bei A4 300 DPI) |
| Lokale Vektorsuche | Eingebettete DB |
**Speicherbedarf:** ~1.0-1.3 GB App + Modelle
**Mindestanforderungen:**
| Spec | Minimum | Empfohlen |
|------|---------|-----------|
| **RAM** | 8 GB | 16 GB |
| **CPU** | Intel i5 10. Gen / Apple M1 / Snapdragon 8 Gen 1 | Intel i5 12. Gen+ / Apple M2+ |
| **Speicher frei** | 5 GB | 10 GB |
| **OS** | Windows 10, macOS 12+, iPadOS 17+ | Aktuell |
**Laeuft auf:**
- Windows-Laptop 8 GB RAM (typischer Lehrer-Dienstlaptop, z.B. Lenovo L15 i5) — **~60-150 Sek/Seite Handschrift-OCR**
- MacBook Air M1+ 8 GB — **~30-60 Sek/Seite** (Neural Engine Beschleunigung)
- iPad Air M2+ (8 GB) — mit CoreML Konvertierung, **~20-40 Sek/Seite**
- iPad Pro M4+ (12-16 GB) — komfortabel
- Samsung Galaxy Tab S8/S9 (8 GB) — mit NNAPI/QNN, **~30-50 Sek/Seite**
**Laeuft NICHT auf:**
- iPad Basis (4-6 GB RAM) — iOS killt den Prozess bei TrOCR
- Budget-Android-Tablets (Galaxy Tab A8/A9, 3-4 GB RAM)
- Chromebooks mit 4 GB RAM
- Windows-Laptops mit 4 GB RAM
---
### Tier 3: Voll (16+ GB RAM / Cloud / Schul-Server)
**+ KI-Gutachten + RAG + Voice — braucht LLM.**
| Feature | Zusaetzlich zu Tier 2 |
|---------|----------------------|
| KI-Gutachten-Generierung | LLM (llama3.2 ~4 GB oder Cloud) |
| RAG / EH-Suche | Qdrant + Embedding-Service |
| Voice-Assistent | Whisper + LLM |
| KI-Worksheet-Modifikation | Vision-LLM |
**Optionen:**
| Variante | Hardware | Kosten |
|----------|---------|--------|
| **Schul-Server** | Mac Mini oder vergleichbar (Specs noch zu ermitteln) | Noch zu ermitteln |
| **Cloud** | BreakPilot Cloud-Service (gehostet) | Monatlich |
| **Lehrer-PC 16+ GB** | Lokales Ollama mit kleinen LLMs | 0 (eigene Hardware) |
!!! note "Server-Anforderungen noch offen"
Die genauen Hardware-Anforderungen fuer einen Schul-Server (RAM, CPU, Speicher) muessen im Projektverlauf durch Benchmarks ermittelt werden. Unsere Entwicklungsmaschine ist ein Mac Mini M4 Pro mit 64 GB RAM (~3.200 EUR) — das ist aber die Obergrenze, nicht die Empfehlung. Moeglicherweise reicht fuer den Schulbetrieb deutlich weniger.
---
## Typische Lehrer-Hardware in Deutschland (2024-2026)
### Digitalpakt-Kontext
- **Digitalpakt 1.0** (2019-2024): 6,5 Mrd. EUR, 97% der Mittel abgerufen
- **Digitalpakt 2.0** (2025-2030): 5 Mrd. EUR, Fokus Infrastruktur + Geraete
- Beschafft werden: Tablets, Laptops, interaktive Whiteboards, WLAN
### Was Schulen kaufen
| Geraet | Typische Specs | Preis | Verbreitung |
|--------|---------------|-------|-------------|
| **iPad (Basis)** | A16 Chip, 6 GB RAM, 64-128 GB | ~350-400 EUR | Sehr hoch |
| **iPad Air** | M2/M3 Chip, 8 GB RAM, 128-512 GB | ~520-650 EUR | Hoch |
| **Lehrer-Laptop** | Intel i5 10. Gen, 8 GB RAM, 256 GB SSD | ~400-600 EUR | Standard |
| **Samsung Galaxy Tab A** | 3-4 GB RAM, 32-128 GB | ~230-280 EUR | Gering |
| **Chromebook** | 4 GB RAM | ~250-350 EUR | Minimal (in DE) |
!!! info "iPad dominiert"
iPads sind an fast allen Schulen in Deutschland der Standard. Die meisten Schulen beschaffen iPad (Basis) oder iPad Air ueber MDM-Loesungen wie JAMF.
---
## Modell-Groessen und RAM-Bedarf
| Modell | Float32 | Int8 (quantisiert) | RAM |
|--------|---------|-------------------|-----|
| TrOCR-large Handschrift | 2.23 GB | **~560 MB** | 3-4 GB / **1-1.5 GB** |
| RapidOCR / PP-OCRv5 | ~40-90 MB | — | 200-500 MB |
| Tesseract (DE+EN) | ~30-50 MB | — | 200-300 MB |
| MiniLM Embeddings | ~90 MB | ~43 MB | 50-110 MB |
| pyspellchecker | ~5 MB | — | <10 MB |
| OpenCV (A4 300 DPI) | — | — | 200-400 MB |
| **Gesamt Offline-Kern** | **~2.6-3.0 GB** | **~1.0-1.3 GB** | **5-7 GB / 3-4.5 GB** |
!!! warning "TrOCR ist der Flaschenhals"
TrOCR-large (2.23 GB float32) **muss** als int8 ONNX quantisiert werden fuer 8 GB Geraete. Ohne Quantisierung laeuft es nur auf 16+ GB Hardware. Alternativ: TrOCR-base (~1.3 GB float32, ~330 MB int8) mit moderatem Qualitaetsverlust.
---
## ONNX Runtime: Plattform-Support
| Plattform | ONNX Runtime | Hardware-Beschleunigung |
|-----------|-------------|------------------------|
| **Windows** | Ja (nativ) | CPU, DirectML, OpenVINO |
| **macOS** | Ja (nativ) | CPU, **CoreML** (Neural Engine, 3.5x schneller) |
| **iOS** | Ja (ORT Mobile) | **CoreML** EP, XNNPACK |
| **Android** | Ja (ORT Mobile) | **NNAPI** EP, QNN EP, XNNPACK |
| **ChromeOS (Crostini)** | Ja (Linux-Binary) | Nur CPU |
| **Browser (WASM)** | Ja, aber **15-17x langsamer** als nativ | WebGPU (5x langsamer als nativ GPU) |
!!! note "Apple-Geraete am besten"
Durch CoreML + Neural Engine sind Apple-Geraete (M1+ Mac, iPad Air/Pro) die beste Consumer-Hardware fuer unsere Modelle. ~3.5x schneller als reines CPU-Inference.
---
## Distributions-Strategie
### Desktop (Primaer: Lehrer-Laptops)
| Plattform | Methode | Anmerkung |
|-----------|---------|-----------|
| **Windows** | .exe/.msi Installer von Website | Code-Signing-Zertifikat empfohlen (~300 EUR/Jahr) |
| **macOS** | .dmg von Website | Apple Notarization erforderlich (99 EUR/Jahr Developer Account) |
| **Linux** | AppImage oder .deb | Keine Signierung noetig |
**1-2 GB Installer ist fuer Desktop voellig normal** (VSCode, Slack, etc. sind aehnlich gross).
### Tablets (Sekundaer: iPads, Android)
| Plattform | Methode | Anmerkung |
|-----------|---------|-----------|
| **iOS/iPadOS** | Apple Custom Apps via Apple Business Manager + MDM | Nicht im oeffentlichen App Store, nur fuer Schulen via JAMF/Mosyle |
| **Android** | Managed Google Play (private Channel) | Fuer Schulen mit Google Workspace for Education |
**Strategie:** Kleine App (~50-100 MB Initial-Download), Modelle werden beim ersten Start nachgeladen mit Fortschrittsanzeige.
### App Store Limits
| Store | Max. Download | On-Demand Assets |
|-------|-------------|-----------------|
| **Apple App Store** | 4 GB | Bis 20 GB (On-Demand Resources) |
| **Google Play** | 200 MB (AAB) | Bis 2 GB+ (Play Asset Delivery) |
| **Microsoft Store** | 10 GB | — |
!!! info "1-2 GB App-Groesse kein Problem"
Games sind routinemaessig 2-5 GB. Fuer B2B-Education-Apps via MDM ist die Groesse kein Hindernis, da IT-Admins die Verteilung steuern.
### Warum KEIN PWA?
Progressive Web Apps sind **nicht geeignet** fuer unseren Offline-KI-Anwendungsfall:
- iOS Safari begrenzt Storage auf ~1 GB pro Origin und **kann Daten jederzeit loeschen**
- WebAssembly ONNX Runtime ist **15-17x langsamer** als native Ausfuehrung
- Kein zuverlaessiger persistenter Speicher fuer 1+ GB Modelldaten
- PWA waere akzeptabel als leichtgewichtiger Companion (Dashboard, Ergebnisse ansehen), aber nicht fuer OCR/ML-Workloads
---
## Framework-Empfehlung
| Plattform | Framework | Vorteil |
|-----------|-----------|---------|
| **Desktop** | **Tauri** (Rust + Web-Frontend) | 5-10 MB Basis (vs. 200 MB Electron), beste ML-Integration via Rust |
| **Desktop (Alternative)** | Electron | Schnellste Entwicklung wenn Team Web-fokussiert |
| **Mobile** | **Flutter** | Single Codebase iOS+Android, gute FFI fuer native ML-Libs |
| **Mobile (Alternative)** | React Native | Wenn Frontend-Team bereits React-erfahren |
!!! warning "Desktop und Mobile nicht in einer Codebase"
Die UX-Anforderungen sind zu unterschiedlich. Desktop = volle OCR-Pipeline + Korrektur-Workspace. Mobile = Scannen, Ergebnisse ansehen, Lernunits bearbeiten.
---
## Empfohlene Hardware fuer Schulen
### Minimum (Tier 2 — Handschrift-OCR offline)
> **Lehrer-Laptop:** Intel i5 (10. Gen+) oder Apple M1+, **8 GB RAM**, 256 GB SSD, Windows 10/11 oder macOS 12+
>
> **Oder iPad Air M2+** mit 128 GB Speicher
### Optimal (Tier 3 — alle Features offline)
> **Schul-Server:** Hardware-Anforderungen noch zu ermitteln.
>
> Muss LLMs lokal ausfuehren (Ollama) und als RAG-Server fuer die Schule dienen.
> Lehrer verbinden sich ueber WLAN und nutzen Tier-3-Features ueber den Server.
>
> *Referenz: Unsere Entwicklungsmaschine (Mac Mini M4 Pro, 64 GB RAM, ~3.200 EUR) laeuft komfortabel. Ob ein guenstigeres Modell (z.B. 32 GB RAM, ~1.500-2.000 EUR) ausreicht, wird im Projektverlauf durch Benchmarks geklaert.*
### Oder: BreakPilot Cloud
> Schulen ohne eigenen Server koennen Tier-3-Features ueber den BreakPilot Cloud-Service nutzen.
> Tier 1 + 2 funktionieren immer offline auf dem Endgeraet.

152
docs-src/projekt/roadmap.md Normal file
View File

@@ -0,0 +1,152 @@
# Entwicklungsroadmap
**Stand:** 2026-03-23
---
## Ueberblick
```mermaid
graph LR
P1[Phase 1<br>OCR Haerten] --> P2[Phase 2<br>Lernunits]
P1 --> P4[Phase 4<br>Korrektur]
P2 --> P3[Phase 3<br>Schuljahr]
P4 --> P5[Phase 5<br>Zeugnisse]
P5 --> P6[Phase 6<br>Onboarding]
```
Phase 2 (Lernunits) und Phase 4 (Korrektur) koennen **parallel** laufen.
---
## Phase 1: OCR-Pipeline Haerten (April 2026)
**Ziel:** OCR-Pipeline robust genug fuer alle gaengigen Schulbuch-Layouts.
| Task | Beschreibung | Aufwand |
|------|-------------|---------|
| IPA-Korrekturen | Garbled OCR-IPA in Headword-Zellen ersetzen | Klein |
| PP-DocLayout ONNX | Bessere Grafik-/Bilderkennung via ONNX-Konvertierung | Mittel |
| Page-Crop Determinismus | Spine-Shadow-Bug in `page_crop.py` fixen | Klein |
| TrOCR Finetuning | Handschrift-OCR Qualitaet mit Labeling-Daten verbessern | Gross |
| TrOCR ONNX + Int8 | Modell fuer Offline-Deployment quantisieren (~560 MB statt 2.2 GB) | Mittel |
| Ground-Truth erweitern | Regression-Test-Basis auf 10+ Sessions | Klein |
**Ergebnis:** Zuverlaessige OCR fuer gedruckte Texte + erste brauchbare Handschrift-Erkennung auf Consumer-Hardware.
---
## Phase 2: Lernunit-Generator (Mai-Juni 2026)
**Ziel:** Aus OCR-Daten automatisch Lernmodule in verschiedenen Formaten generieren.
| Task | Beschreibung | Aufwand |
|------|-------------|---------|
| Multiple-Choice-Generator | Automatisch aus extrahierten Vokabeln | Mittel |
| Lueckentext-Generator | Saetze mit Luecken aus OCR-Text | Mittel |
| Lernplakat-Generator | Druckbare Zusammenfassungen als PDF | Mittel |
| Grammatik-Test-Generator | Regelbasiert, kein LLM noetig | Gross |
| Spaced-Repetition-Engine | Leitner-System fuer Vokabel-Wiederholung | Mittel |
| Companion-App erweitern | Schueler-Player fuer alle Unit-Formate | Mittel |
**Abhaengigkeit:** Vokabel-Extraktion aus Phase 1 (existiert bereits).
**Ergebnis:** Lehrer scannt Buchseite → System generiert 5+ Lernunit-Varianten.
---
## Phase 3: Schuljahres-Begleitung (Juli-August 2026)
**Ziel:** Proaktives System das den Lehrer durch das Schuljahr fuehrt.
| Task | Beschreibung | Aufwand |
|------|-------------|---------|
| Schulkalender-Import | ICS-Format, Bundesland-spezifische Ferien | Klein |
| Phasen-Engine erweitern | State Machine fuer Schuljahres-Phasen | Mittel |
| Automatische Erinnerungen | 3 Wochen / 1 Woche vor KA via Matrix | Mittel |
| Lernunit-Versand an Eltern | Automatischer Versand via Matrix-Chat | Klein |
| Fortschritts-Dashboard | Uebersicht: Wer hat was gelernt? Wo sind Schwaechen? | Mittel |
| Eltern-View | Vereinfachte Ansicht fuer Eltern in Studio v2 | Mittel |
**Abhaengigkeit:** Lernunit-Generator aus Phase 2, Matrix-Anbindung (existiert).
**Ergebnis:** System erinnert automatisch an kommende Klassenarbeiten, verschickt Lernmaterial.
---
## Phase 4: Klausur-Korrektur Komplett (September-Oktober 2026)
**Ziel:** Vollstaendiger Korrektur-Workflow inkl. deterministische Gutachten.
| Task | Beschreibung | Aufwand |
|------|-------------|---------|
| Handschrift-OCR Integration | TrOCR in Korrektur-Workflow einbinden | Mittel |
| EH-Abgleich | Keyword-Matching gegen NiBiS-Erwartungshorizonte | Mittel |
| RS/Grammatik-Pruefung | Automatisch auf OCR-erkanntem Text | Mittel |
| Gutachten-Template-Engine | Deterministische Gutachten aus Korrekturdaten | Gross |
| Konsistenz-Check | Automatischer Abgleich ueber alle 24 Gutachten | Mittel |
| Zweitgutachter-Workflow | Visibility-Regeln, Einigung, Drittkorrektur | Mittel |
| NiBiS EH-RAG | Erwartungshorizonte in RAG ingestieren (Erweiterung der bestehenden Compliance-RAG-Pipeline) | Mittel |
**Abhaengigkeit:** Handschrift-OCR aus Phase 1, RAG-Pipeline (existiert fuer Compliance, muss fuer NiBiS erweitert werden).
**Ergebnis:** Lehrer scannt Klausur → System liefert RS/Grammatik-Fehler + EH-Abgleich + Gutachten-Entwurf.
---
## Phase 5: Zeugnis-Generator (November-Dezember 2026)
**Ziel:** Zeugnisse aus gesammelten Jahresdaten generieren.
| Task | Beschreibung | Aufwand |
|------|-------------|---------|
| Zeugnis-Templates | Pro Schulform/Jahrgang fuer Niedersachsen | Gross |
| Daten-Aggregation | Alle Noten, Kompetenzen, Fehlzeiten zusammenfuehren | Mittel |
| Textbaustein-System | Deterministische Formulierungen aus Datenpunkten | Mittel |
| PDF-Export | Schulform-konforme Zeugnisse als PDF | Mittel |
| Bundesland-Konfiguration | Start Niedersachsen, erweiterbar auf andere | Klein |
**Abhaengigkeit:** Notenspiegel (existiert), Daten aus dem gesamten Schuljahr.
**Ergebnis:** Ein-Klick-Zeugniserstellung aus allen gesammelten Daten.
---
## Phase 6: Lehrer-Onboarding + Polish (Q1 2027)
**Ziel:** Lehrer koennen mitten im Schuljahr einsteigen.
| Task | Beschreibung | Aufwand |
|------|-------------|---------|
| Notizbuch-OCR | Handschriftliche Notizbuecher → Klassen/Schueler anlegen | Gross |
| Status-Erkennung | System erkennt wo im Schuljahr der Lehrer steht | Mittel |
| Guided Onboarding Wizard | Schritt-fuer-Schritt Einrichtung | Mittel |
| Differenzierung | 3 Niveaus (Basis/Standard/Erweitert) fuer Lernunits | Mittel |
| Export-Formate | IMS QTI, CSV/Excel fuer Interoperabilitaet | Klein |
| Kollaborations-Features | Materialpool fuer Fachschaften | Gross |
---
## Offline vs Cloud pro Phase
| Phase | Offline (deterministisch) | Cloud/LLM (optional) |
|-------|--------------------------|---------------------|
| 1 OCR Haerten | Tesseract, RapidOCR, TrOCR, OpenCV | qwen2.5vl fuer HTR |
| 2 Lernunits | Templates, Algorithmen, Spaced Repetition | KI-Varianten |
| 3 Schuljahres-Begleitung | State-Engine, Kalender, Erinnerungen | — |
| 4 Korrektur Komplett | RS/Grammatik, Keyword-Match, Templates | KI-Gutachten, RAG |
| 5 Zeugnisse | Templates, Daten-Aggregation | KI-Textverbesserung |
| 6 Onboarding | TrOCR, Wizard, Export | — |
---
## Meilensteine
| Datum | Meilenstein |
|-------|------------|
| **April 2026** | OCR-Pipeline stabil, TrOCR quantisiert und offline-faehig |
| **Juni 2026** | Lernunit-Generator mit 5+ Formaten, Companion-Player fuer Schueler |
| **August 2026** | Schuljahres-Begleitung aktiv, automatischer Lernunit-Versand an Eltern |
| **Oktober 2026** | Klausur-Korrektur End-to-End: Scan → Gutachten (deterministisch) |
| **Dezember 2026** | Zeugnis-Generator fuer Niedersachsen produktiv |
| **Maerz 2027** | Lehrer-Onboarding aus handschriftlichen Notizbuechern, Kollaboration |

View File

@@ -0,0 +1,191 @@
# BreakPilot Lehrer — Vision, Mission & Projektuebersicht
## Vision
**Jeder Lehrer in Deutschland hat einen digitalen Assistenten, der repetitive Verwaltungsarbeit uebernimmt — offline, datenschutzkonform und auf jeder Hardware.**
BreakPilot Lehrer befreit Lehrkraefte von Buerokratie, damit sie sich auf das konzentrieren koennen, wofuer sie Lehrer geworden sind: Unterrichten.
---
## Mission
Wir entwickeln eine **Offline-First Desktop- und Tablet-App**, die Lehrer proaktiv durch das Schuljahr begleitet. Die App erkennt handschriftliche und gedruckte Dokumente, automatisiert Korrektur-Workflows, generiert Lernmaterial und erstellt Zeugnisse — alles deterministisch, ohne Cloud-Abhaengigkeit und DSGVO-konform.
**Unsere drei Prinzipien:**
1. **Offline First** — Alle Kernfunktionen laufen ohne Internet, ohne GPU, auf normaler Lehrer-Hardware
2. **Deterministisch vor KI** — Jede Funktion wird zuerst regelbasiert implementiert. LLM ist optionales Upgrade, nie Kernabhaengigkeit
3. **Datenschutz by Design** — Alle Schuelerdaten bleiben lokal auf dem Geraet des Lehrers. Keine Cloud-Speicherung personenbezogener Daten
---
## Das Problem
### Korrektur: 6 Stunden pro Schueler
Eine Deutsch-Abiturarbeit erfordert im Durchschnitt **6 Stunden Korrekturzeit pro Schueler**. Bei 24 Schuelern sitzt ein Lehrer **zwei Wochen** ausschliesslich an der Korrektur. Der Prozess ist streng reglementiert:
1. Rechtschreibfehler markieren (Durchgang 1)
2. Grammatikfehler markieren (Durchgang 2)
3. Inhalt gegen den offiziellen Erwartungshorizont (EH) bewerten (Durchgang 3)
4. Notenspiegel ueber alle Schueler erstellen
5. Gutachten pro Schueler schreiben
6. Alle 24 Gutachten aufeinander abstimmen (Konsistenz)
7. Zweitgutachter bewertet unabhaengig
8. Bei >= 4 Punkte Differenz: Drittkorrektur
### Verwaltung: Fragmentierte Werkzeuge
Lehrer jonglieren zwischen handschriftlichen Notizbuechern, Excel-Tabellen, Word-Dokumenten und verschiedenen Schulportalen. Es gibt kein einheitliches System, das den Schuljahres-Rhythmus kennt und vorausschauend die richtigen Werkzeuge aktiviert.
---
## Die Loesung: BreakPilot Lehrer
### Fuer Lehrer (primaere Zielgruppe)
| Funktion | Was es tut | Offline? |
|----------|-----------|----------|
| **Klausur-Korrektur** | Scan → OCR → RS/Grammatik-Pruefung → EH-Abgleich → Notenspiegel → Gutachten | Ja |
| **Vokabel-Arbeitsblaetter** | Buchseite scannen → Vokabeln extrahieren → 5+ Arbeitsblatt-Formate generieren | Ja |
| **Lernunit-Generator** | Aus OCR-Daten Multiple-Choice, Lueckentexte, Lernplakate erstellen | Ja |
| **Zeugnis-Generator** | Template-basierte Zeugnisse aus gesammelten Jahresdaten | Ja |
| **Notenspiegel** | Automatische Berechnung + Fairness-Analyse + Ausreisser-Erkennung | Ja |
| **Schuljahres-Planer** | Proaktive Aufgabensteuerung basierend auf Schulkalender | Ja |
### Fuer Eltern (sekundaere Zielgruppe)
- Erhalten automatisch generierte Lernmodule via Chat (Matrix)
- Waehlen passende Uebungen fuer ihr Kind aus
- Begleiten den Lernfortschritt mit Spaced-Repetition
### Fuer Schueler (Endnutzer)
- Lernen in automatisch generierten Modulen (Vokabeln, Grammatik, Multiple Choice)
- Fortschritt wird getrackt, Schwaechen erkannt
- Erinnerungen vor Klassenarbeiten
---
## Automatischer Workflow (Beispiel)
```
Trigger: "In 3 Wochen Englisch-Klassenarbeit, Unit 4"
1. Lehrer scannt Buchseiten 54-58 ein
|
2. OCR-Pipeline extrahiert Text, Vokabeln, Grammatik
|
3. System generiert automatisch:
- Multiple-Choice-Test
- Deutsch → Englisch Vokabelabfrage
- Englisch → Deutsch Vokabelabfrage
- Lueckentexte
- Lernplakat (druckbar)
- Grammatiktests
|
4. Lernunits gehen an Eltern via Chat
"Englisch Unit 4 — 5 Lernmodule verfuegbar"
|
5. Eltern waehlen passende Units fuer ihr Kind
|
6. Kind lernt in der App, Fortschritt wird getrackt
|
7. Automatische Erinnerungen bis zur Klassenarbeit
"Noch 5 Tage — 12 von 45 Vokabeln gelernt"
```
---
## Schuljahres-Begleitung
Das System kennt den Schuljahresrhythmus und aktiviert proaktiv die passenden Funktionen:
| Phase | System-Aktion |
|-------|---------------|
| **Schulbeginn** | Klassen einrichten, Schueler anlegen, Onboarding |
| **Unterrichtsphase** | Lernmaterial vorbereiten, Units aus Buchseiten erstellen |
| **3 Wochen vor Klassenarbeit** | Lernunits automatisch generieren + an Eltern senden |
| **1 Woche vor Klassenarbeit** | Erinnerungen, Fortschritt pruefen |
| **Klassenarbeit geschrieben** | Korrektur-Workflow aktivieren |
| **Nach Korrektur** | Notenspiegel erstellen, Gutachten generieren |
| **Konferenzen** | Berichte aggregieren und vorbereiten |
| **Zeugniszeit** | Zeugnis-Generator mit allen gesammelten Daten |
| **Ferien** | Wiederholungsmaterial, naechstes Halbjahr vorbereiten |
---
## Technologie
### Offline-Kern (~1.3 GB quantisiert)
| Komponente | Technologie | Groesse |
|------------|-------------|---------|
| OCR (gedruckt) | Tesseract + RapidOCR (ONNX) | ~70 MB |
| OCR (Handschrift) | TrOCR (ONNX, int8 quantisiert) | ~560 MB |
| Layout-/Grafik-Erkennung | OpenCV | 0 |
| Rechtschreibung | pyspellchecker EN+DE | ~5 MB |
| Text-Embeddings | all-MiniLM-L6-v2 | ~90 MB |
| App + Logik | Python/TypeScript | ~200 MB |
### Optionale Cloud-Features
| Feature | Wofuer | Abhaengigkeit |
|---------|--------|---------------|
| KI-Gutachten | Gutachten-Vorschlaege aus Korrekturdaten | LLM (Cloud/GPU) |
| RAG / EH-Suche | Erwartungshorizont semantisch durchsuchen | Qdrant + Embedding |
| Spracheingabe | Aufgaben per Freitext einsprechen | Whisper |
| KI-Worksheet-Modifikation | Lernmaterial KI-optimiert anpassen | LLM |
---
## Datenschutz (DSGVO)
- **Alle Schuelerdaten, Noten und Klausuren bleiben lokal** auf dem Geraet des Lehrers
- Keine Cloud-Speicherung personenbezogener Daten
- Cloud-Features (wenn aktiviert) verarbeiten nur anonymisierte Daten
- Kommunikation ueber Matrix (Ende-zu-Ende-verschluesselt)
- Keine Audioaufnahmen gespeichert (Voice-Service)
---
## Bundesland-Erweiterbarkeit
Start mit **Niedersachsen** als Pilotbundesland. Die Architektur erlaubt spaetere Erweiterung auf andere Bundeslaender mit eigenen:
- Erwartungshorizonten und Pruefungsformaten
- Notensystemen (15-Punkte vs. Noten 1-6)
- Zeugnisvorlagen und Verwaltungsvorschriften
- Schulformen und Jahrgangsstufen
---
## Aktueller Stand (Maerz 2026)
| Modul | Status |
|-------|--------|
| OCR-Pipeline (gedruckt) | 85% — Tesseract + RapidOCR + PaddleOCR, 10-Step Pipeline |
| Vokabel-Arbeitsblaetter | 90% — 4 Formate + NRU, PDF-Export |
| Klausur-Korrektur | 70% — Scan, Annotation, 5-Kriterien, Gutachten, Fairness |
| Notenspiegel | 85% — 15-Punkte, Histogramm, Ausreisser |
| Admin Dashboard | 85% — 12 Sektionen, AI-Tools, SBOM, Security |
| Studio v2 (Lehrer-UI) | 80% — 12+ Apps, 7 Sprachen, Dark/Light Mode |
| RAG / EH-Suche | 80% — Qdrant + Embedding + Hybrid Search |
| Voice-Service | 60% — WebSocket, Whisper, Intent-Router |
| Messaging | 75% — Matrix/Synapse integriert |
| Video | 70% — Jitsi integriert |
| Handschrift-OCR | 40% — TrOCR integriert, Qualitaet verbessern |
| Lernunit-Generator | 25% — Backend-API + Companion UI vorhanden |
| Zeugnis-Generator | 30% — API + Models vorhanden |
| Schuljahres-Planer | 15% — State-Engine API vorhanden |
---
## Kontakt
**BreakPilot** — KI-Bildungsplattform fuer den deutschen Schulalltag
- Pilotphase: Niedersachsen
- Hardware: Offline auf jedem Laptop mit 8 GB RAM
- Datenschutz: DSGVO-konform, alle Daten lokal

View File

@@ -0,0 +1,253 @@
# OCR Kombi Pipeline - Modulare 11-Schritt-Architektur
**Version:** 1.0.0
**Status:** Phase 1 implementiert (Grundgeruest + DB)
**URL:** https://macmini:3002/ai/ocr-kombi
## Uebersicht
Die OCR Kombi Pipeline ist der Nachfolger des OCR-Overlay-Monolithen (`/ai/ocr-overlay`).
Sie zerlegt den OCR-Prozess in **11 modulare Schritte** mit je einer eigenen Komponente
pro Frontend- und Backend-Datei. Ziel: schnelles Debugging, klare Verantwortlichkeiten,
Multi-Page-Dokument-Unterstuetzung.
**Primaerer Modus:** Kombi (PaddleOCR + Tesseract) — der einzige Modus, den der User nutzt.
### Warum ein Refactor?
| Problem (alt) | Loesung (neu) |
|----------------|---------------|
| `page.tsx` = 751-Zeilen-Monolith mit 3 Modi | `page.tsx` = ~140-Zeilen-Orchestrator, je 1 Datei pro Step |
| Upload, Orientierung und Page-Split in einem Step | 3 separate Steps mit eigener Logik |
| Keine Multi-Page-Dokument-Unterstuetzung | `document_group_id` + `page_number` auf DB-Ebene |
| OCR intransparent (eine Blackbox) | 3-Phasen-Fortschritt + Engine-Attribution pro Wort (geplant) |
| `grid_editor_api.py` = 1801 Zeilen | 4 Module + Orchestrator (geplant) |
---
## Pipeline-Schritte
| # | Step | Frontend | Backend | Beschreibung |
|---|------|----------|---------|--------------|
| 1 | Upload | `StepUpload.tsx` | `step_upload.py` | Bild/PDF hochladen, Titel, Kategorie. Multi-Page-PDF → N Sessions |
| 2 | Orientierung | `StepOrientation.tsx` | (shared) | Rotation 90/180/270 erkennen + korrigieren |
| 3 | Seitentrennung | `StepPageSplit.tsx` | (shared) | Doppelseiten erkennen + splitten |
| 4 | Begradigung | `StepDeskew.tsx` | (shared) | Hough Lines + Word Alignment |
| 5 | Entzerrung | `StepDewarp.tsx` | (shared) | Shear-Korrektur (Vertikalkanten-Drift) |
| 6 | Zuschneiden | `StepContentCrop.tsx` | (shared) | Scanner-Raender entfernen (nach Begradigung!) |
| 7 | OCR | `StepOcr.tsx` | (shared) | Tesseract + PaddleOCR + Merge |
| 8 | Strukturerkennung | `StepStructure.tsx` | (shared) | Boxen, Zonen, Farben, Grafiken |
| 9 | Grid-Aufbau | `StepGridBuild.tsx` | (shared) | Strukturiertes Grid aus OCR + Struktur |
| 10 | Grid-Review | `StepGridReview.tsx` | (shared) | Excel-Editor, IPA, Silben, Korrekturen |
| 11 | Ground Truth | `StepGroundTruth.tsx` | (shared) | Validierung + GT-Markierung |
!!! note "Crop nach Dewarp"
Seitentrennung (Step 3) passiert **vor** Begradigung — richtig, weil jede Haelfte
unabhaengig begradigt wird. Der Content-Crop (Step 6) bleibt **nach** Dewarp,
weil content-basierter Crop auf geradem Bild besser funktioniert.
---
## Multi-Page-Dokument-Gruppierung
### Problem
Ein Lehrer scannt 10 Vokabelseiten als eine PDF-Datei. Im Endnutzer-Frontend soll das
ein zusammenhaengendes Dokument sein. Alle Seiten muessen spaeter zu gemeinsamen
Lern-Units verarbeitet werden koennen.
### Loesung: `document_group_id` + `page_number`
Zwei neue Felder auf `ocr_pipeline_sessions` (Migration `009_add_document_group.sql`):
```sql
ALTER TABLE ocr_pipeline_sessions
ADD COLUMN IF NOT EXISTS document_group_id UUID,
ADD COLUMN IF NOT EXISTS page_number INT;
```
| Upload-Typ | document_group_id | page_number |
|------------|-------------------|-------------|
| Einzelbild | neues UUID | 1 |
| Multi-Page-PDF (10 Seiten) | gleiches UUID fuer alle 10 | 1..10 |
| Doppelseiten-Split von S. 3 | gleiches UUID | neue S. 3 + S. 4, Rest umkontiert |
### Benennung
Upload-Titel "Vokabeln Unit 3" erzeugt:
- "Vokabeln Unit 3 — S. 1"
- "Vokabeln Unit 3 — S. 2"
- ...
- "Vokabeln Unit 3 — S. 10"
### Session-Liste im Admin
Gruppierte Anzeige: Ein Dokument-Header ("Vokabeln Unit 3, 10 Seiten") mit aufklappbaren
Einzel-Sessions darunter. Jede Session hat eigenen Pipeline-Status.
---
## API-Endpoints
### Neue Endpoints (OCR Kombi)
| Methode | Pfad | Beschreibung |
|---------|------|--------------|
| POST | `/api/v1/ocr-kombi/upload` | Upload: Einzelbild oder Multi-Page-PDF |
| GET | `/api/v1/ocr-kombi/documents/{group_id}` | Alle Sessions einer Dokumentgruppe |
### Bestehende Endpoints (wiederverwendet)
Die Kombi-Pipeline nutzt alle bestehenden Endpoints aus `/api/v1/ocr-pipeline/`:
| Methode | Pfad | Step |
|---------|------|------|
| POST | `/sessions` | Upload (Legacy, Einzelbild) |
| POST | `/sessions/{id}/orientation` | Orientierung |
| POST | `/sessions/{id}/page-split` | Seitentrennung |
| POST | `/sessions/{id}/deskew` | Begradigung |
| POST | `/sessions/{id}/dewarp` | Entzerrung |
| POST | `/sessions/{id}/crop` | Zuschneiden |
| POST | `/sessions/{id}/paddle-kombi` | OCR (Kombi) |
| POST | `/sessions/{id}/detect-structure` | Strukturerkennung |
| POST | `/sessions/{id}/build-grid` | Grid-Aufbau |
| POST | `/sessions/{id}/save-grid` | Grid speichern |
| GET | `/sessions/{id}/grid-editor` | Grid laden |
| POST | `/sessions/{id}/mark-ground-truth` | GT markieren |
---
## Dateistruktur
### Frontend
```
admin-lehrer/app/(admin)/ai/ocr-kombi/
├── page.tsx # ~140 Zeilen, Orchestrator mit Suspense-Boundary
├── types.ts # KOMBI_V2_STEPS (11 Steps), DocumentGroup-Types, OCR-Transparenz-Types
└── useKombiPipeline.ts # Hook: Session-State, Step-Navigation, Dokument-Gruppierung
admin-lehrer/components/ocr-kombi/
├── KombiStepper.tsx # 11-Step-Indikator (kompakt, scrollbar)
├── SessionList.tsx # Gruppierte Session-Liste (Dokumentgruppen aufklappbar)
├── SessionHeader.tsx # Aktive Session: Name + Kategorie + GT-Badge
├── StepUpload.tsx # Drag-Drop + Titel + Kategorie-Auswahl
├── StepOrientation.tsx # Wrapper → shared StepOrientation
├── StepPageSplit.tsx # Doppelseiten-Erkennung + Auto-Advance
├── StepDeskew.tsx # Wrapper → shared StepDeskew
├── StepDewarp.tsx # Wrapper → shared StepDewarp
├── StepContentCrop.tsx # Wrapper → shared StepCrop
├── StepOcr.tsx # Wrapper → PaddleDirectStep (kombi endpoint)
├── StepStructure.tsx # Wrapper → shared StepStructureDetection
├── StepGridBuild.tsx # Auto-Trigger build-grid + Ergebnis-Anzeige
├── StepGridReview.tsx # Wrapper → shared StepGridReview (mit saveRef)
└── StepGroundTruth.tsx # GT-Markierung mit Auto-Save
```
### Backend
```
klausur-service/backend/ocr_kombi/
├── __init__.py
├── router.py # Composite Router (/api/v1/ocr-kombi)
└── step_upload.py # Multi-Page-PDF → N Sessions + document_group_id
```
### Shared (wiederverwendet)
Die Kombi-Pipeline nutzt alle bestehenden Backend-Module:
- `orientation_crop_api.py` — Orientierung, Page-Split, Crop
- `ocr_pipeline_api.py` — Deskew, Dewarp
- `ocr_pipeline_ocr_merge.py` — PaddleOCR + Tesseract Merge
- `grid_editor_api.py` — Grid-Aufbau + Editor
- `ocr_pipeline_session_store.py` — DB-Layer (erweitert um document_group_id)
- Alle `cv_*.py` — CV-Algorithmen
### Migration
```
migrations/009_add_document_group.sql # document_group_id UUID + page_number INT + Index
```
---
## Implementierungsphasen
| Phase | Status | Beschreibung |
|-------|--------|--------------|
| **1: Grundgeruest + DB** | Implementiert | DB-Migration, Types, Hook, Stepper, SessionList, page.tsx, Navigation, Backend-Router |
| **2: Vorverarbeitungs-Steps** | Geplant | Multi-Page-PDF-Upload, Orientierung ohne Upload, Seitentrennung mit document_group_id |
| **3: OCR-Transparenz** | Geplant | 3-Phasen-Fortschritt, Engine-Attribution pro Wort, Farbkodierung |
| **4: Grid-Pipeline aufteilen** | Geplant | grid_editor_api.py → 4 Module + Orchestrator |
| **5: Restliche Steps** | Geplant | Structure, GridBuild, GridReview, GroundTruth voll integriert |
| **6: Features migrieren** | Spaeter | LLM-Review-Streaming, Labeling-Mode, Bild-Generierung |
| **7: Aufraeumen** | Spaeter | /ai/ocr-overlay und /ai/ocr-pipeline loeschen |
---
## OCR-Transparenz (Phase 3, geplant)
### 3-Phasen-Fortschritt in Step 7
1. "Tesseract laeuft..." (Fortschrittsbalken)
2. "PaddleOCR laeuft..." (Fortschrittsbalken)
3. "Merge laeuft..." (Fortschrittsbalken)
### Engine-Attribution pro Wort
Vergleichsansicht mit Farbkodierung:
| Farbe | Bedeutung |
|-------|-----------|
| Gruen | Beide Engines einig |
| Blau | Nur PaddleOCR |
| Orange | Nur Tesseract |
| Gelb | Konflikt, PaddleOCR gewaehlt |
| Rot | Konflikt, Tesseract gewaehlt |
### Geplanter Endpoint
```
POST /sessions/{id}/ocr-kombi-transparent
→ { raw_tesseract, raw_paddle, merged, engine_source_per_word }
```
---
## Grid-Pipeline-Aufteilung (Phase 4, geplant)
`grid_editor_api.py` (1801 Zeilen) wird aufgeteilt in:
| Modul | Inhalt | ~Zeilen |
|-------|--------|---------|
| `grid_build_filters.py` | Margin, Footer, Header, Exclude, Grafik-Filter | ~200 |
| `grid_build_zones.py` | Box-Detect, Page-Zones, Vert-Dividers | ~250 |
| `grid_build_columns.py` | Spalten-Clustering + Union-Merge + Zone-Grids | ~300 |
| `grid_build_postprocess.py` | Row/Cell-Postprocessing, IPA, Farben, Dictionary | ~500 |
`grid_editor_api.py` wird zum schlanken Orchestrator.
---
## Verhaeltnis zu bestehenden Pipelines
| Pipeline | Route | Status | Beschreibung |
|----------|-------|--------|--------------|
| **OCR Kombi** | `/ai/ocr-kombi` | Aktiv (neu) | Modulare 11-Schritt-Pipeline |
| OCR Overlay | `/ai/ocr-overlay` | Legacy | 751-Zeilen-Monolith, 3 Modi |
| OCR Pipeline | `/ai/ocr-pipeline` | Legacy | Volle Pipeline mit Spalten |
| OCR Compare | `/ai/ocr-compare` | Eigenstaendig | Methoden-Vergleich |
Die alte OCR Overlay (`/ai/ocr-overlay`) bleibt waehrend des gesamten Umbaus parallel nutzbar
fuer A/B-Tests. Sobald die Kombi-Pipeline feature-complete ist, werden die alten Pipelines
in Phase 7 entfernt.
---
## Aenderungshistorie
| Datum | Version | Aenderung |
|-------|---------|-----------|
| 2026-03-26 | 1.0.0 | **Phase 1:** Grundgeruest mit 11-Step-Architektur, DB-Migration (document_group_id, page_number), Backend-Router mit Multi-Page-Upload, Frontend mit SessionList (gruppiert), KombiStepper, 13 Step-Komponenten, useKombiPipeline Hook, Navigation |

View File

@@ -1,7 +1,7 @@
# OCR Pipeline - Schrittweise Seitenrekonstruktion
**Version:** 4.7.0
**Status:** Produktiv (Schritte 110 + Grid Editor implementiert)
**Version:** 5.1.0
**Status:** Produktiv (Schritte 110 + Grid Editor + Regression Framework)
**URL:** https://macmini:3002/ai/ocr-pipeline
## Uebersicht
@@ -149,7 +149,9 @@ klausur-service/backend/
├── ocr_pipeline_api.py # FastAPI Router (Schritte 2-10)
├── orientation_crop_api.py # FastAPI Router (Schritte 1 + 4)
├── grid_editor_api.py # Grid Editor: build-grid, save-grid, grid-editor
├── grid_editor_helpers.py # Footer-Filterung, Seitenzahl-Extraktion
├── cv_ocr_engines.py # OCR-Engines, IPA-Korrektur, Britfone-Woerterbuch
├── cv_syllable_detect.py # Deutsche Silbentrennung (Silben:DE Modus)
├── cv_box_detect.py # Box-Erkennung + Zonen-Aufteilung
├── cv_graphic_detect.py # Grafik-/Bilderkennung (Region-basiert)
├── cv_color_detect.py # Farbtext-Erkennung (HSV-Analyse)
@@ -1081,6 +1083,8 @@ Rekonstruktion fuer Vokabelseiten mit komplexen Layouts (Bilder, Ueberschriften,
| Datei | Beschreibung |
|-------|--------------|
| `grid_editor_api.py` | `_build_grid_core()` Pipeline, alle Steps |
| `grid_editor_helpers.py` | `_filter_footer_words()` → Seitenzahl-Extraktion, Footer-Filterung |
| `cv_syllable_detect.py` | Deutsche Silbentrennung mit IPA-Kompatibilitaet |
| `cv_ocr_engines.py` | IPA-Korrektur, Britfone-Woerterbuch, Garbled-IPA-Erkennung |
| `cv_vocab_types.py` | `PageZone` (mit `image_overlays`), `ColumnGeometry` |
| `tests/test_grid_editor_api.py` | 27 Tests |
@@ -1106,9 +1110,15 @@ Kombi-Wortdaten
├─ Step 4: Farb-Annotation
│ → detect_word_colors(): HSV-Farbanalyse aller word_boxes
├─ Step 4b2: Per-Cell Artifact Filter
│ → Einzel-Wort-Zellen mit ≤2 Zeichen und conf < 65 entfernen
├─ Step 4c: Oversized Word Box Removal
│ → word_boxes > 3x Median entfernen (Grafik-Artefakte)
├─ Step 4d2: Connector Column Normalization
│ → Dominante Kurzwoerter in schmalen Spalten normalisieren
├─ Step 5: Overlay-Wort-Filter
│ → Woerter innerhalb image_overlays entfernen
@@ -1197,6 +1207,157 @@ des Headwords der vorherigen Zeile). Diese werden von PaddleOCR als garbled Text
4. Schlaegt IPA im Britfone-Woerterbuch nach
5. Beruecksichtigt alle Wortteile (z.B. "close sth. down" → `[klˈəʊz dˈaʊn]`)
### Per-Cell Artifact Filter (Step 4b2)
Entfernt OCR-Rauschen auf Zellebene: Zellen mit genau einer `word_box`, maximal 2 Zeichen
und Confidence unter 65 werden als Artefakte klassifiziert und entfernt.
**Konstanten:**
| Parameter | Wert | Beschreibung |
|-----------|------|--------------|
| `_ARTIFACT_MAX_LEN` | 2 | Maximale Textlaenge fuer Artefakt-Verdacht |
| `_ARTIFACT_CONF_THRESHOLD` | 65 | Confidence-Schwelle (darunter = Artefakt) |
**Sicherheit:** Einzelne Zeichen mit hoher Confidence (z.B. rote `!`-Marker mit conf=98)
werden **nicht** entfernt, da ihre Confidence ueber dem Schwellwert liegt.
**Typische Artefakte:** `(as)` conf=55, `u)` conf=44 — OCR-Noise aus Seitenraendern
oder Schatten.
### Connector Column Normalization (Step 4d2)
Erkennt schmale Spalten mit einem dominanten Kurzwort (z.B. "oder", "and", "bzw.")
und normalisiert OCR-Fehler bei denen das dominante Wort mit Rauschen versehen wurde.
**Algorithmus:**
1. Pro Spalte: Zaehle Textvorkommen aller Zellen
2. Pruefe ob ein dominantes Wort existiert (≥ 60% der Zellen, max 10 Zeichen)
3. Fuer Zellen die mit dem dominanten Wort **beginnen** und max 2 Zeichen laenger sind:
Normalisiere auf das dominante Wort
**Beispiel:** Spalte mit "oder" in 80% der Zellen → `"oderb"` wird zu `"oder"` normalisiert.
### Compound Word IPA Decomposition (Step 5e)
Zusammengesetzte Woerter wie "schoolbag" oder "blackbird" haben oft keinen eigenen
IPA-Eintrag im Woerterbuch. Die Funktion `_decompose_compound()` zerlegt sie:
1. Probiere jede Teilungsposition (min. 3 Zeichen pro Teil)
2. Wenn beide Teile im Woerterbuch stehen → IPA verketten
3. Waehle die Teilung mit dem laengsten ersten Teil
**Beispiele:**
| Eingabe | Zerlegung | IPA |
|---------|-----------|-----|
| schoolbag | school + bag | skˈuːl + bæɡ |
| blackbird | black + bird | blæk + bˈɜːd |
| ice-cream | ice + cream | aɪs + kɹˈiːm |
### Trailing Garbled Fragment Removal (Step 5f)
Nach korrekt erkanntem IPA (z.B. `seat [sˈiːt]`) haengt OCR manchmal
eine garbled Kopie der IPA-Transkription an: `seat [sˈiːt] belt si:t belt`.
**`_strip_post_bracket_garbled()`** erkennt und entfernt diese:
1. Alles nach dem letzten `]` scannen
2. Woerter mit IPA-Markern (`:`, `ə`, `ɪ` etc.) → garbled, entfernen
3. Echte Woerter (Woerterbuch, Deutsch, Delimiter) → behalten
4. **Multi-Wort-Headword:** "belt" ist ein echtes Wort, aber wenn danach
garbled IPA kommt, wird nur "belt" behalten, der Rest entfernt
### Regression Framework (Step 5g)
Ground-Truth Sessions koennen als Referenz markiert werden. Nach jeder
Code-Aenderung vergleicht `POST /regression/run` die aktuelle Pipeline-Ausgabe
mit den gespeicherten Referenzen:
- **Strukturelle Diffs:** Zonen, Spalten, Zeilen (Anzahl-Aenderungen)
- **Zellen-Diffs:** Text-Aenderungen, fehlende/neue Zellen, col_type-Aenderungen
- **Persistenz:** Ergebnisse in `regression_runs` Tabelle fuer Trend-Analyse
- **Shell-Script:** `scripts/run-regression.sh` fuer CI-Integration
Admin-UI: [/ai/ocr-regression](https://macmini:3002/ai/ocr-regression)
### Ground Truth Review Workflow (Step 5h)
Admin-UI fuer effiziente Massenpruefung von Sessions:
- **Split-View:** Original-Bild links, erkannter Grid rechts
- **Confidence-Highlighting:** Niedrige Konfidenz rot hervorgehoben
- **Quick-Accept:** Korrekte Zeilen mit einem Klick bestaetigen
- **Inline-Edit:** Text direkt im Grid korrigieren
- **Session-Queue:** Automatisch naechste Session laden
- **Batch-Mark:** Mehrere Sessions gleichzeitig als Ground Truth markieren
Admin-UI: [/ai/ocr-ground-truth](https://macmini:3002/ai/ocr-ground-truth)
### Page Number Extraction
Die Footer-Filterung (`_filter_footer_words` in `grid_editor_helpers.py`) erkennt
Seitenzahlen in den untersten 5% des Bildes und gibt sie als Metadaten zurueck,
statt sie stillschweigend zu entfernen.
**Algorithmus:**
1. Woerter in den untersten 5% des Bildes identifizieren
2. Wenn ≤ 3 Woerter mit ≤ 10 Zeichen Gesamtlaenge: Als Seitenzahl extrahieren
3. Rueckgabe als `PageNumber`-Objekt: `{text, y_pct, number?}`
4. Ziffern werden separat als `number` (Integer) extrahiert
**Datentyp:**
```typescript
interface PageNumber {
text: string // Roh-OCR-Text (z.B. "u)233")
y_pct: number // Vertikale Position in Prozent
number?: number // Extrahierte Zahl (z.B. 233)
}
```
**Frontend-Anzeige:**
In der Summary-Leiste (GridEditor + StepGridReview) als Badge: `S. 233`.
Zeigt bevorzugt `page_number.number` (saubere Zahl), Fallback auf `page_number.text`.
**Zweck:** Spaetere Zusammenfuehrung aufeinanderfolgender Seiten im Kundenfrontend.
### Footer-Zeilen-Erkennung (Verbesserung)
Die Footer-Erkennung wurde um zwei Pruefungen erweitert, um falsch-positive
Footer-Markierungen bei Content-Zeilen zu verhindern:
| Pruefung | Bedingung | Grund |
|----------|-----------|-------|
| Komma-Check | `',' in text` → kein Footer | Content-Saetze enthalten Kommas, Seitenzahlen nicht |
| Laengen-Check | `len(text) > 20` → kein Footer | Seitenzahlen sind kurz, Content-Zeilen lang |
**Vorher:** `"Uhrzeit, Vergangenheit, Zukunft"` wurde als Footer markiert.
**Nachher:** Nur tatsaechliche Seitenzahlen (kurz, ohne Kommas) werden als Footer erkannt.
### Silben + IPA Kombination (Fix)
**Datei:** `cv_syllable_detect.py`
Wenn beide Modi (Silben:DE und IPA) aktiviert sind, blockierte der `_IPA_RE`-Guard
die Silbentrennung, weil programmatisch eingefuegte IPA-Klammern (z.B. `[bɪltʃøn]`)
IPA-Zeichen enthalten.
**Loesung:** Vor der IPA-Pruefung wird Bracket-Content entfernt:
```python
# Bracket-Content strippen, da programmatisch eingefuegt
text_no_brackets = re.sub(r'\[[^\]]*\]', '', text)
if _IPA_RE.search(text_no_brackets):
return text # Echte IPA im Fliesstext → keine Silbentrennung
```
So wird `"Bild·chen [bɪltʃøn]"` korrekt silbifiziert: Die Silbenpunkte bleiben erhalten,
und die IPA in Klammern wird nicht als Blockiergrund gewertet.
### `en_col_type` Erkennung
Die Erkennung der Englisch-Headword-Spalte nutzt **Bracket-IPA-Pattern-Count**
@@ -1532,10 +1693,41 @@ cd klausur-service/backend && pytest tests/test_paddle_kombi.py -v # 36 Tests
---
## ONNX Backends und PP-DocLayout (Sprint 2)
### TrOCR ONNX Runtime
Ab Sprint 2 unterstuetzt die Pipeline **TrOCR mit ONNX Runtime** als Alternative zu PyTorch.
ONNX reduziert den RAM-Verbrauch von ~1.1 GB auf ~300 MB pro Modell und beschleunigt
die Inferenz um ~3x. Ideal fuer Hardware Tier 2 (8 GB RAM).
**Backend-Auswahl:** Umgebungsvariable `TROCR_BACKEND` (`auto` | `pytorch` | `onnx`).
Im `auto`-Modus wird ONNX bevorzugt, wenn exportierte Modelle vorhanden sind.
Vollstaendige Dokumentation: [TrOCR ONNX Runtime](TrOCR-ONNX.md)
### PP-DocLayout (Document Layout Analysis)
PP-DocLayout ersetzt die bisherige manuelle Zonen-Erkennung durch ein vortrainiertes
Layout-Analyse-Modell. Es erkennt automatisch:
- **Tabellen** (vocab_table, generic_table)
- **Ueberschriften** (title, section_header)
- **Bilder/Grafiken** (figure, illustration)
- **Textbloecke** (paragraph, list)
PP-DocLayout laeuft als ONNX-Modell (~15 MB) und benoetigt kein PyTorch.
Die Ergebnisse fliessen in Schritt 5 (Spaltenerkennung) und den Grid Editor ein.
---
## Aenderungshistorie
| Datum | Version | Aenderung |
|-------|---------|----------|
| 2026-03-26 | 5.2.0 | **OCR Kombi Pipeline:** Neuer modularer Nachfolger als 11-Schritt-Architektur unter `/ai/ocr-kombi`. Eigene Dokumentation: [OCR Kombi Pipeline](OCR-Kombi-Pipeline.md). Phase 1 (Grundgeruest + DB) implementiert: DB-Migration (`document_group_id`, `page_number`), Frontend-Orchestrator, 13 Step-Komponenten, Backend-Router mit Multi-Page-Upload. |
| 2026-03-26 | 5.1.0 | **Grid Quality & Metadata:** Per-Cell Artifact Filter (Step 4b2: ≤2 Zeichen + conf < 65), Connector Column Normalization (Step 4d2: dominante Kurzwoerter), Footer-Erkennung verbessert (Komma/Laengen-Check), Seitenzahl-Extraktion als Metadaten (`page_number` Feld im Grid-Result), Frontend-Anzeige in Summary-Leiste. Silben+IPA-Kombination gefixt (Bracket-Content vor IPA-Guard strippen). |
| 2026-03-23 | 5.0.0 | **Phase 1 Sprint 1:** Compound-IPA-Zerlegung (`_decompose_compound`), Trailing-Garbled-Fragment-Entfernung (Multi-Wort-Headwords), Regression Framework mit DB-Persistenz + History + Shell-Script, Ground-Truth Review Workflow UI, Page-Crop Determinismus verifiziert. Admin-Seiten: `/ai/ocr-regression`, `/ai/ocr-ground-truth`. |
| 2026-03-20 | 4.7.0 | Grid Editor: Zone Merging ueber Bilder (`image_overlays`), Heading Detection (Farbe + Hoehe), Ghost-Filter (borderless-aware), Oversized Word Box Removal, IPA Phonetic Correction (Britfone), IPA Continuation Detection, `en_col_type` via Bracket-Count. 27 Tests. |
| 2026-03-16 | 4.6.0 | Strukturerkennung (Schritt 8): Region-basierte Grafikerkennung (`cv_graphic_detect.py`) mit Zwei-Pass-Verfahren (Farbregionen + schwarze Illustrationen), Wort-Ueberlappungs-Filter, Box/Zonen/Farb-Analyse. Schritt laeuft nach Worterkennung. |
| 2026-03-12 | 4.5.0 | Kombi-Modus (PaddleOCR + Tesseract): Beide Engines laufen parallel, Koordinaten werden IoU-basiert gematcht und confidence-gewichtet gemittelt. Ungematchte Tesseract-Woerter (Bullets, Symbole) werden hinzugefuegt. 3er-Toggle in OCR Overlay. |

View File

@@ -0,0 +1,204 @@
# RAG Landkarte — Branchen-Regulierungs-Matrix
## Uebersicht
Die RAG Landkarte zeigt eine interaktive Matrix aller 320 Compliance-Dokumente im RAG-System, gruppiert nach Dokumenttyp und zugeordnet zu 10 Industriebranchen.
**URL**: `https://macmini:3002/ai/rag` → Tab "Landkarte"
**Letzte Aktualisierung**: 2026-04-15
## Architektur
```
rag-documents.json ← Zentrale Datendatei (320 Dokumente)
├── doc_types[] ← 17 Dokumenttypen (EU-VO, DE-Gesetz, etc.)
├── industries[] ← 10 Branchen (VDMA/VDA/BDI)
└── documents[] ← Alle Dokumente mit Branchen-Mapping
├── code ← Eindeutiger Identifier
├── name ← Anzeigename
├── doc_type ← Verweis auf doc_types.id
├── industries[] ← ["all"] oder ["automotive", "chemie", ...]
├── in_rag ← true (alle im RAG)
├── rag_collection ← Qdrant Collection Name
├── description? ← Beschreibung (fuer ~100 Hauptregulierungen)
├── applicability_note? ← Begruendung der Branchenzuordnung
└── effective_date? ← Gueltigkeitsdatum
rag-constants.ts ← RAG-Metadaten (Chunks, Qdrant-IDs)
page.tsx ← Frontend (importiert aus JSON)
```
## Dateien
| Pfad | Beschreibung |
|------|--------------|
| `admin-lehrer/app/(admin)/ai/rag/rag-documents.json` | Alle 320 Dokumente mit Branchen-Mapping |
| `admin-lehrer/app/(admin)/ai/rag/rag-constants.ts` | REGULATIONS_IN_RAG (Chunk-Counts, Qdrant-IDs) |
| `admin-lehrer/app/(admin)/ai/rag/page.tsx` | Frontend-Rendering |
| `admin-lehrer/app/(admin)/ai/rag/__tests__/rag-documents.test.ts` | 44 Tests fuer JSON-Validierung |
## Branchen (10 Industriesektoren)
Die Branchen orientieren sich an den Mitgliedsverbaenden von VDMA, VDA und BDI:
| ID | Branche | Icon | Typische Kunden |
|----|---------|------|-----------------|
| `automotive` | Automobilindustrie | 🚗 | OEMs, Tier-1/2 Zulieferer |
| `maschinenbau` | Maschinen- & Anlagenbau | ⚙️ | Werkzeugmaschinen, Automatisierung |
| `elektrotechnik` | Elektro- & Digitalindustrie | ⚡ | Embedded Systems, Steuerungstechnik |
| `chemie` | Chemie- & Prozessindustrie | 🧪 | Grundstoffchemie, Spezialchemie |
| `metall` | Metallindustrie | 🔩 | Stahl, Aluminium, Metallverarbeitung |
| `energie` | Energie & Versorgung | 🔋 | Energieerzeugung, Netzbetreiber |
| `transport` | Transport & Logistik | 🚚 | Gueterverkehr, Schiene, Luftfahrt |
| `handel` | Handel | 🏪 | Einzel-/Grosshandel, E-Commerce |
| `konsumgueter` | Konsumgueter & Lebensmittel | 📦 | FMCG, Lebensmittel, Verpackung |
| `bau` | Bauwirtschaft | 🏗️ | Hoch-/Tiefbau, Gebaeudeautomation |
!!! warning "Keine Pseudo-Branchen"
Es werden bewusst **keine** Querschnittsthemen wie IoT, KI, HR, KRITIS oder E-Commerce als "Branchen" gefuehrt. Diese sind Technologien, Abteilungen oder Klassifizierungen — keine Wirtschaftssektoren.
## Zuordnungslogik
### Drei Ebenen
| Ebene | `industries` Wert | Anzahl | Beispiele |
|-------|-------------------|--------|-----------|
| **Horizontal** | `["all"]` | 264 | DSGVO, AI Act, CRA, NIS2, BetrVG |
| **Sektorspezifisch** | `["automotive", "chemie", ...]` | 42 | Maschinenverordnung, ElektroG, BattDG |
| **Nicht zutreffend** | `[]` | 14 | DORA, MiCA, EHDS, DSA |
### Horizontal (alle Branchen)
Regulierungen die **branchenuebergreifend** gelten:
- **Datenschutz**: DSGVO, BDSG, ePrivacy, TDDDG, SCC, DPF
- **KI**: AI Act (jedes Unternehmen das KI einsetzt)
- **Cybersecurity**: CRA (jedes Produkt mit digitalen Elementen), NIS2, EUCSA
- **Produktsicherheit**: GPSR, Produkthaftungs-RL
- **Arbeitsrecht**: BetrVG, AGG, KSchG, ArbSchG, LkSG
- **Handels-/Steuerrecht**: HGB, AO, UStG
- **Software-Security**: OWASP Top 10, NIST SSDF, CISA Secure by Design
- **Supply Chain**: CycloneDX, SPDX, SLSA (CRA verlangt SBOM)
- **Alle Leitlinien**: EDPB, DSK, DSFA-Listen, Gerichtsurteile
### Sektorspezifisch
| Regulierung | Branchen | Begruendung |
|-------------|----------|-------------|
| Maschinenverordnung | Maschinenbau, Automotive, Elektrotechnik, Metall, Bau | Hersteller von Maschinen und zugehoerigen Produkten |
| ElektroG | Elektrotechnik, Automotive, Konsumgueter | Elektro-/Elektronikgeraete |
| BattDG/BattVO | Automotive, Elektrotechnik, Energie | Batterien und Akkumulatoren |
| VerpackG | Konsumgueter, Handel, Chemie | Verpackungspflichtige Produkte |
| PAngV, UWG, VSBG | Handel, Konsumgueter | Verbraucherschutz im Verkauf |
| BSI-KritisV, KRITIS-Dachgesetz | Energie, Transport, Chemie | KRITIS-Sektoren |
| ENISA ICS/SCADA | Maschinenbau, Elektrotechnik, Automotive, Chemie, Energie, Transport | Industrielle Steuerungstechnik |
| NIST SP 800-82 (OT) | Maschinenbau, Automotive, Elektrotechnik, Chemie, Energie, Metall | Operational Technology |
### Nicht zutreffend
Dokumente die **im RAG bleiben** aber fuer keine der 10 Zielbranchen relevant sind:
| Code | Name | Grund |
|------|------|-------|
| DORA | Digital Operational Resilience Act | Finanzsektor |
| PSD2 | Zahlungsdiensterichtlinie | Zahlungsdienstleister |
| MiCA | Markets in Crypto-Assets | Krypto-Maerkte |
| AMLR | AML-Verordnung | Geldwaesche-Bekaempfung |
| EHDS | Europaeischer Gesundheitsdatenraum | Gesundheitswesen |
| DSA | Digital Services Act | Online-Plattformen |
| DMA | Digital Markets Act | Gatekeeper-Plattformen |
| MDR | Medizinprodukteverordnung | Medizintechnik |
| BSI-TR-03161 | DiGA-Sicherheit (3 Teile) | Digitale Gesundheitsanwendungen |
## Dokumenttypen (17)
| doc_type | Label | Anzahl | Beispiele |
|----------|-------|--------|-----------|
| `eu_regulation` | EU-Verordnungen | 22 | DSGVO, AI Act, CRA, DORA |
| `eu_directive` | EU-Richtlinien | 14 | ePrivacy, NIS2, PSD2 |
| `eu_guidance` | EU-Leitfaeden | 9 | Blue Guide, GPAI CoP |
| `de_law` | Deutsche Gesetze | 41 | BDSG, BGB, HGB, BetrVG |
| `at_law` | Oesterreichische Gesetze | 11 | DSG AT, ECG, KSchG |
| `ch_law` | Schweizer Gesetze | 8 | revDSG, DSV, OR |
| `national_law` | Nationale Datenschutzgesetze | 17 | UK DPA, LOPDGDD, UAVG |
| `bsi_standard` | BSI Standards & TR | 4 | BSI 200-4, BSI-TR-03161 |
| `edpb_guideline` | EDPB/WP29 Leitlinien | 50 | Consent, Controller/Processor |
| `dsk_guidance` | DSK Orientierungshilfen | 57 | Kurzpapiere, OH Telemedien |
| `court_decision` | Gerichtsurteile | 20 | BAG M365, BGH Planet49 |
| `dsfa_list` | DSFA Muss-Listen | 20 | Pro Bundesland + DSK |
| `nist_standard` | NIST Standards | 11 | CSF 2.0, SSDF, AI RMF |
| `owasp_standard` | OWASP Standards | 6 | Top 10, ASVS, API Security |
| `enisa_guidance` | ENISA Guidance | 6 | Supply Chain, ICS/SCADA |
| `international` | Internationale Standards | 7 | CVSS, CycloneDX, SPDX |
| `legal_template` | Vorlagen & Muster | 17 | GitHub Policies, VVT-Muster |
## Integration in andere Projekte
### JSON importieren
```typescript
import ragData from './rag-documents.json'
const documents = ragData.documents // 320 Dokumente
const docTypes = ragData.doc_types // 17 Kategorien
const industries = ragData.industries // 10 Branchen
```
### Matrix-Logik
```typescript
// Pruefen ob Dokument fuer Branche gilt
const applies = (doc, industryId) =>
doc.industries.includes(industryId) || doc.industries.includes('all')
// Dokumente nach Typ gruppieren
const grouped = Object.groupBy(documents, d => d.doc_type)
// Nur sektorspezifische Dokumente fuer eine Branche
const forAutomotive = documents.filter(d =>
d.industries.includes('automotive') && !d.industries.includes('all')
)
```
### RAG-Status pruefen
```typescript
import { REGULATIONS_IN_RAG } from './rag-constants'
const isInRag = (code: string) => code in REGULATIONS_IN_RAG
const chunks = REGULATIONS_IN_RAG['GDPR']?.chunks // 423
```
## Datenquellen
| Quelle | Pfad | Beschreibung |
|--------|------|--------------|
| RAG-Inventar | `~/Desktop/RAG-Dokumenten-Inventar.md` | 386 Quelldateien |
| rag-documents.json | `admin-lehrer/.../rag/rag-documents.json` | 320 konsolidierte Dokumente |
| rag-constants.ts | `admin-lehrer/.../rag/rag-constants.ts` | Qdrant-Metadaten |
## Tests
```bash
cd admin-lehrer
npx vitest run app/\(admin\)/ai/rag/__tests__/rag-documents.test.ts
```
44 Tests validieren:
- JSON-Struktur (doc_types, industries, documents)
- 10 echte Branchen (keine Pseudo-Branchen)
- Pflichtfelder und gueltige Referenzen
- Horizontale Regulierungen (DSGVO, AI Act, CRA → "all")
- Sektorspezifische Zuordnungen (Maschinenverordnung, ElektroG)
- Nicht zutreffende Regulierungen (DORA, MiCA → leer)
- Applicability Notes vorhanden und korrekt
## Aenderungshistorie
| Datum | Aenderung |
|-------|-----------|
| 2026-04-15 | Initiale Implementierung: 320 Dokumente, 10 Branchen, 17 Typen |
| 2026-04-15 | Branchen-Review: OWASP/SBOM → alle, BSI-TR-03161 → leer |
| 2026-04-15 | Applicability Notes UI: Aufklappbare Erklaerungen pro Dokument |

View File

@@ -0,0 +1,83 @@
# TrOCR ONNX Runtime
## Uebersicht
TrOCR (Transformer-based OCR) kann sowohl mit PyTorch als auch mit ONNX Runtime betrieben werden. ONNX bietet deutlich reduzierten RAM-Verbrauch und schnellere Inferenz — ideal fuer den Offline-Betrieb auf Hardware Tier 2 (8 GB RAM).
## Export-Prozess
### Voraussetzungen
- Python 3.10+
- `optimum>=1.17.0` (Apache-2.0)
- `onnxruntime` (bereits installiert via RapidOCR)
### Export ausfuehren
```bash
python scripts/export-trocr-onnx.py --model both
```
Dies exportiert:
- `models/onnx/trocr-base-printed/` (~85 MB fp32 → ~25 MB int8)
- `models/onnx/trocr-base-handwritten/` (~85 MB fp32 → ~25 MB int8)
### Quantisierung
Int8-Quantisierung reduziert Modellgroesse um ~70% mit weniger als 2% Genauigkeitsverlust.
## Runtime-Konfiguration
### Umgebungsvariablen
| Variable | Default | Beschreibung |
|----------|---------|--------------|
| `TROCR_BACKEND` | `auto` | Backend-Auswahl: `auto`, `pytorch`, `onnx` |
| `TROCR_ONNX_DIR` | (siehe unten) | Pfad zu ONNX-Modellen |
### Backend-Modi
- **auto** (empfohlen): ONNX wenn verfuegbar, sonst PyTorch Fallback
- **pytorch**: Erzwingt PyTorch (hoeherer RAM, aber bewaehrt)
- **onnx**: Erzwingt ONNX (Fehler wenn Modell nicht vorhanden)
### Modell-Pfade (Suchpfade)
1. `$TROCR_ONNX_DIR/trocr-base-{variant}/`
2. `/root/.cache/huggingface/onnx/trocr-base-{variant}/` (Docker)
3. `models/onnx/trocr-base-{variant}/` (lokale Entwicklung)
## Hardware-Anforderungen
| Metrik | PyTorch float32 | ONNX int8 |
|--------|-----------------|-----------|
| Modell-Groesse | ~340 MB | ~50 MB |
| RAM (Printed) | ~1.1 GB | ~300 MB |
| RAM (Handwritten) | ~1.1 GB | ~300 MB |
| Inferenz/Zeile | ~120 ms | ~40 ms |
| Mindest-RAM | 4 GB | 2 GB |
## Benchmark
```bash
# PyTorch Baseline (Sprint 1)
python scripts/benchmark-trocr.py > benchmark-baseline.json
# ONNX Benchmark
python scripts/benchmark-trocr.py --backend onnx > benchmark-onnx.json
```
Benchmark-Vergleiche koennen im Admin unter `/ai/model-management` eingesehen werden.
## Verifikation
Der Export-Script verifiziert automatisch, dass ONNX-Output weniger als 2% von PyTorch abweicht. Manuelle Verifikation:
```python
# Im Container
python -c "
from services.trocr_onnx_service import is_onnx_available, get_onnx_model_status
print('Available:', is_onnx_available())
print('Status:', get_onnx_model_status())
"
```

View File

@@ -1,320 +0,0 @@
#!/usr/bin/env python3
"""
vast.ai Profile Extractor Script
Dieses Skript läuft auf vast.ai und extrahiert Profildaten von Universitäts-Webseiten.
Verwendung auf vast.ai:
1. Lade dieses Skript auf deine vast.ai Instanz
2. Installiere Abhängigkeiten: pip install requests beautifulsoup4 openai
3. Setze Umgebungsvariablen:
- BREAKPILOT_API_URL=http://deine-ip:8086
- BREAKPILOT_API_KEY=dev-key
- OPENAI_API_KEY=sk-...
4. Starte: python vast_ai_extractor.py
"""
import os
import sys
import json
import time
import logging
import requests
from bs4 import BeautifulSoup
from typing import Optional, Dict, Any, List
# Logging Setup
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
# Configuration
API_URL = os.environ.get('BREAKPILOT_API_URL', 'http://localhost:8086')
API_KEY = os.environ.get('BREAKPILOT_API_KEY', 'dev-key')
OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY', '')
BATCH_SIZE = 10
SLEEP_BETWEEN_REQUESTS = 1 # Sekunden zwischen Requests (respektiere rate limits)
def fetch_pending_profiles(limit: int = 50) -> List[Dict]:
"""Hole Profile die noch extrahiert werden müssen."""
try:
response = requests.get(
f"{API_URL}/api/v1/ai/extraction/pending",
params={"limit": limit},
headers={"Authorization": f"Bearer {API_KEY}"},
timeout=30
)
response.raise_for_status()
data = response.json()
return data.get("tasks", [])
except Exception as e:
logger.error(f"Fehler beim Abrufen der Profile: {e}")
return []
def fetch_profile_page(url: str) -> Optional[str]:
"""Lade den HTML-Inhalt einer Profilseite."""
try:
headers = {
'User-Agent': 'Mozilla/5.0 (compatible; BreakPilot-Crawler/1.0; +https://breakpilot.de)',
'Accept': 'text/html,application/xhtml+xml',
'Accept-Language': 'de-DE,de;q=0.9,en;q=0.8',
}
response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()
return response.text
except Exception as e:
logger.error(f"Fehler beim Laden von {url}: {e}")
return None
def extract_with_beautifulsoup(html: str, url: str) -> Dict[str, Any]:
"""Extrahiere Basis-Informationen mit BeautifulSoup (ohne AI)."""
soup = BeautifulSoup(html, 'html.parser')
data = {}
# Email suchen
email_links = soup.find_all('a', href=lambda x: x and x.startswith('mailto:'))
if email_links:
email = email_links[0]['href'].replace('mailto:', '').split('?')[0]
data['email'] = email
# Telefon suchen
phone_links = soup.find_all('a', href=lambda x: x and x.startswith('tel:'))
if phone_links:
data['phone'] = phone_links[0]['href'].replace('tel:', '')
# ORCID suchen
orcid_links = soup.find_all('a', href=lambda x: x and 'orcid.org' in x)
if orcid_links:
orcid = orcid_links[0]['href']
# Extrahiere ORCID ID
if '/' in orcid:
data['orcid'] = orcid.split('/')[-1]
# Google Scholar suchen
scholar_links = soup.find_all('a', href=lambda x: x and 'scholar.google' in x)
if scholar_links:
href = scholar_links[0]['href']
if 'user=' in href:
data['google_scholar_id'] = href.split('user=')[1].split('&')[0]
# ResearchGate suchen
rg_links = soup.find_all('a', href=lambda x: x and 'researchgate.net' in x)
if rg_links:
data['researchgate_url'] = rg_links[0]['href']
# LinkedIn suchen
linkedin_links = soup.find_all('a', href=lambda x: x and 'linkedin.com' in x)
if linkedin_links:
data['linkedin_url'] = linkedin_links[0]['href']
# Institut/Abteilung Links sammeln (für Hierarchie-Erkennung)
base_domain = '/'.join(url.split('/')[:3])
department_links = []
for link in soup.find_all('a', href=True):
href = link['href']
text = link.get_text(strip=True)
# Suche nach Links die auf Institute/Fakultäten hindeuten
if any(kw in text.lower() for kw in ['institut', 'fakultät', 'fachbereich', 'abteilung', 'lehrstuhl']):
if href.startswith('/'):
href = base_domain + href
if href.startswith('http'):
department_links.append({'url': href, 'name': text})
if department_links:
# Nimm den ersten gefundenen Department-Link
data['department_url'] = department_links[0]['url']
data['department_name'] = department_links[0]['name']
return data
def extract_with_ai(html: str, url: str, full_name: str) -> Dict[str, Any]:
"""Extrahiere strukturierte Daten mit OpenAI GPT."""
if not OPENAI_API_KEY:
logger.warning("Kein OPENAI_API_KEY gesetzt - nutze nur BeautifulSoup")
return extract_with_beautifulsoup(html, url)
try:
import openai
client = openai.OpenAI(api_key=OPENAI_API_KEY)
# Reduziere HTML auf relevanten Text
soup = BeautifulSoup(html, 'html.parser')
# Entferne Scripts, Styles, etc.
for tag in soup(['script', 'style', 'nav', 'footer', 'header']):
tag.decompose()
# Extrahiere Text
text = soup.get_text(separator='\n', strip=True)
# Limitiere auf 8000 Zeichen für API
text = text[:8000]
prompt = f"""Analysiere diese Universitäts-Profilseite für {full_name} und extrahiere folgende Informationen im JSON-Format:
{{
"email": "email@uni.de oder null",
"phone": "Telefonnummer oder null",
"office": "Raum/Büro oder null",
"position": "Position/Titel (z.B. Wissenschaftlicher Mitarbeiter, Professorin) oder null",
"department_name": "Name des Instituts/der Abteilung oder null",
"research_interests": ["Liste", "der", "Forschungsthemen"] oder [],
"teaching_topics": ["Liste", "der", "Lehrveranstaltungen/Fächer"] oder [],
"supervisor_name": "Name des Vorgesetzten/Lehrstuhlinhabers falls erkennbar oder null"
}}
Profilseite von {url}:
{text}
Antworte NUR mit dem JSON-Objekt, keine Erklärungen."""
response = client.chat.completions.create(
model="gpt-4o-mini", # Kostengünstig und schnell
messages=[{"role": "user", "content": prompt}],
temperature=0.1,
max_tokens=500
)
result_text = response.choices[0].message.content.strip()
# Parse JSON (entferne eventuelle Markdown-Blöcke)
if result_text.startswith('```'):
result_text = result_text.split('```')[1]
if result_text.startswith('json'):
result_text = result_text[4:]
ai_data = json.loads(result_text)
# Kombiniere mit BeautifulSoup-Ergebnissen (für Links wie ORCID)
bs_data = extract_with_beautifulsoup(html, url)
# AI-Daten haben Priorität, aber BS-Daten für spezifische Links
for key in ['orcid', 'google_scholar_id', 'researchgate_url', 'linkedin_url']:
if key in bs_data and bs_data[key]:
ai_data[key] = bs_data[key]
return ai_data
except Exception as e:
logger.error(f"AI-Extraktion fehlgeschlagen: {e}")
return extract_with_beautifulsoup(html, url)
def submit_extracted_data(staff_id: str, data: Dict[str, Any]) -> bool:
"""Sende extrahierte Daten zurück an BreakPilot."""
try:
payload = {"staff_id": staff_id, **data}
# Entferne None-Werte
payload = {k: v for k, v in payload.items() if v is not None}
response = requests.post(
f"{API_URL}/api/v1/ai/extraction/submit",
json=payload,
headers={
"Authorization": f"Bearer {API_KEY}",
"Content-Type": "application/json"
},
timeout=30
)
response.raise_for_status()
return True
except Exception as e:
logger.error(f"Fehler beim Senden der Daten für {staff_id}: {e}")
return False
def process_profiles():
"""Hauptschleife: Hole Profile, extrahiere Daten, sende zurück."""
logger.info(f"Starte Extraktion - API: {API_URL}")
processed = 0
errors = 0
while True:
# Hole neue Profile
profiles = fetch_pending_profiles(limit=BATCH_SIZE)
if not profiles:
logger.info("Keine weiteren Profile zum Verarbeiten. Warte 60 Sekunden...")
time.sleep(60)
continue
logger.info(f"Verarbeite {len(profiles)} Profile...")
for profile in profiles:
staff_id = profile['staff_id']
url = profile['profile_url']
full_name = profile.get('full_name', 'Unbekannt')
logger.info(f"Verarbeite: {full_name} - {url}")
# Lade Profilseite
html = fetch_profile_page(url)
if not html:
errors += 1
continue
# Extrahiere Daten
extracted = extract_with_ai(html, url, full_name)
if extracted:
# Sende zurück
if submit_extracted_data(staff_id, extracted):
processed += 1
logger.info(f"Erfolgreich: {full_name} - Email: {extracted.get('email', 'N/A')}")
else:
errors += 1
else:
errors += 1
# Rate limiting
time.sleep(SLEEP_BETWEEN_REQUESTS)
logger.info(f"Batch abgeschlossen. Gesamt: {processed} erfolgreich, {errors} Fehler")
def main():
"""Einstiegspunkt."""
logger.info("=" * 60)
logger.info("BreakPilot vast.ai Profile Extractor")
logger.info("=" * 60)
# Prüfe Konfiguration
if not API_KEY:
logger.error("BREAKPILOT_API_KEY nicht gesetzt!")
sys.exit(1)
if not OPENAI_API_KEY:
logger.warning("OPENAI_API_KEY nicht gesetzt - nutze nur BeautifulSoup-Extraktion")
# Teste Verbindung
try:
response = requests.get(
f"{API_URL}/v1/health",
headers={"Authorization": f"Bearer {API_KEY}"},
timeout=10
)
logger.info(f"API-Verbindung OK: {response.status_code}")
except Exception as e:
logger.error(f"Kann API nicht erreichen: {e}")
logger.error(f"Stelle sicher dass {API_URL} erreichbar ist!")
sys.exit(1)
# Starte Verarbeitung
try:
process_profiles()
except KeyboardInterrupt:
logger.info("Beendet durch Benutzer")
except Exception as e:
logger.error(f"Unerwarteter Fehler: {e}")
sys.exit(1)
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,339 @@
"""
Box layout classifier — detects internal layout type of embedded boxes.
Classifies each box as: flowing | columnar | bullet_list | header_only
and provides layout-appropriate grid building.
Used by the Box-Grid-Review step to rebuild box zones with correct structure.
"""
import logging
import re
import statistics
from typing import Any, Dict, List, Optional, Tuple
logger = logging.getLogger(__name__)
# Bullet / list-item patterns at the start of a line
_BULLET_RE = re.compile(
r'^[\-\u2022\u2013\u2014\u25CF\u25CB\u25AA\u25A0•·]\s' # dash, bullet chars
r'|^\d{1,2}[.)]\s' # numbered: "1) " or "1. "
r'|^[a-z][.)]\s' # lettered: "a) " or "a. "
)
def classify_box_layout(
words: List[Dict],
box_w: int,
box_h: int,
) -> str:
"""Classify the internal layout of a detected box.
Args:
words: OCR word dicts within the box (with top, left, width, height, text)
box_w: Box width in pixels
box_h: Box height in pixels
Returns:
'header_only' | 'bullet_list' | 'columnar' | 'flowing'
"""
if not words:
return "header_only"
# Group words into lines by y-proximity
lines = _group_into_lines(words)
# Header only: very few words or single line
total_words = sum(len(line) for line in lines)
if total_words <= 5 or len(lines) <= 1:
return "header_only"
# Bullet list: check if majority of lines start with bullet patterns
bullet_count = 0
for line in lines:
first_text = line[0].get("text", "") if line else ""
if _BULLET_RE.match(first_text):
bullet_count += 1
# Also check if first word IS a bullet char
elif first_text.strip() in ("-", "", "", "", "·", "", ""):
bullet_count += 1
if bullet_count >= len(lines) * 0.4 and bullet_count >= 2:
return "bullet_list"
# Columnar: check for multiple distinct x-clusters
if len(lines) >= 3 and _has_column_structure(words, box_w):
return "columnar"
# Default: flowing text
return "flowing"
def _group_into_lines(words: List[Dict]) -> List[List[Dict]]:
"""Group words into lines by y-proximity."""
if not words:
return []
sorted_words = sorted(words, key=lambda w: (w["top"], w["left"]))
heights = [w["height"] for w in sorted_words if w.get("height", 0) > 0]
median_h = statistics.median(heights) if heights else 20
y_tolerance = max(median_h * 0.5, 5)
lines: List[List[Dict]] = []
current_line: List[Dict] = [sorted_words[0]]
current_y = sorted_words[0]["top"]
for w in sorted_words[1:]:
if abs(w["top"] - current_y) <= y_tolerance:
current_line.append(w)
else:
lines.append(sorted(current_line, key=lambda ww: ww["left"]))
current_line = [w]
current_y = w["top"]
if current_line:
lines.append(sorted(current_line, key=lambda ww: ww["left"]))
return lines
def _has_column_structure(words: List[Dict], box_w: int) -> bool:
"""Check if words have multiple distinct left-edge clusters (columns)."""
if box_w <= 0:
return False
lines = _group_into_lines(words)
if len(lines) < 3:
return False
# Collect left-edges of non-first words in each line
# (first word of each line often aligns regardless of columns)
left_edges = []
for line in lines:
for w in line[1:]: # skip first word
left_edges.append(w["left"])
if len(left_edges) < 4:
return False
# Check if left edges cluster into 2+ distinct groups
left_edges.sort()
gaps = [left_edges[i + 1] - left_edges[i] for i in range(len(left_edges) - 1)]
if not gaps:
return False
median_gap = statistics.median(gaps)
# A column gap is typically > 15% of box width
column_gap_threshold = box_w * 0.15
large_gaps = [g for g in gaps if g > column_gap_threshold]
return len(large_gaps) >= 1
def build_box_zone_grid(
zone_words: List[Dict],
box_x: int,
box_y: int,
box_w: int,
box_h: int,
zone_index: int,
img_w: int,
img_h: int,
layout_type: Optional[str] = None,
) -> Dict[str, Any]:
"""Build a grid for a box zone with layout-aware processing.
If layout_type is None, auto-detects it.
For 'flowing' and 'bullet_list', forces single-column layout.
For 'columnar', uses the standard multi-column detection.
For 'header_only', creates a single cell.
Returns the same format as _build_zone_grid (columns, rows, cells, header_rows).
"""
from grid_editor_helpers import _build_zone_grid, _cluster_rows
if not zone_words:
return {
"columns": [],
"rows": [],
"cells": [],
"header_rows": [],
"box_layout_type": layout_type or "header_only",
"box_grid_reviewed": False,
}
# Auto-detect layout if not specified
if not layout_type:
layout_type = classify_box_layout(zone_words, box_w, box_h)
logger.info(
"Box zone %d: layout_type=%s, %d words, %dx%d",
zone_index, layout_type, len(zone_words), box_w, box_h,
)
if layout_type == "header_only":
# Single cell with all text concatenated
all_text = " ".join(
w.get("text", "") for w in sorted(zone_words, key=lambda ww: (ww["top"], ww["left"]))
).strip()
return {
"columns": [{"col_index": 0, "index": 0, "label": "column_text", "col_type": "column_1",
"x_min_px": box_x, "x_max_px": box_x + box_w,
"x_min_pct": round(box_x / img_w * 100, 2) if img_w else 0,
"x_max_pct": round((box_x + box_w) / img_w * 100, 2) if img_w else 0,
"bold": False}],
"rows": [{"index": 0, "row_index": 0,
"y_min": box_y, "y_max": box_y + box_h, "y_center": box_y + box_h / 2,
"y_min_px": box_y, "y_max_px": box_y + box_h,
"y_min_pct": round(box_y / img_h * 100, 2) if img_h else 0,
"y_max_pct": round((box_y + box_h) / img_h * 100, 2) if img_h else 0,
"is_header": True}],
"cells": [{
"cell_id": f"Z{zone_index}_R0C0",
"row_index": 0,
"col_index": 0,
"col_type": "column_1",
"text": all_text,
"word_boxes": zone_words,
}],
"header_rows": [0],
"box_layout_type": layout_type,
"box_grid_reviewed": False,
}
if layout_type in ("flowing", "bullet_list"):
# Force single column — each line becomes one row with one cell.
# Detect bullet structure from indentation and merge continuation
# lines into the bullet they belong to.
lines = _group_into_lines(zone_words)
column = {
"col_index": 0, "index": 0, "label": "column_text", "col_type": "column_1",
"x_min_px": box_x, "x_max_px": box_x + box_w,
"x_min_pct": round(box_x / img_w * 100, 2) if img_w else 0,
"x_max_pct": round((box_x + box_w) / img_w * 100, 2) if img_w else 0,
"bold": False,
}
# --- Detect indentation levels ---
line_indents = []
for line_words in lines:
if not line_words:
line_indents.append(0)
continue
min_left = min(w["left"] for w in line_words)
line_indents.append(min_left - box_x)
# Find the minimum indent (= bullet/main level)
valid_indents = [ind for ind in line_indents if ind >= 0]
min_indent = min(valid_indents) if valid_indents else 0
# Indentation threshold: lines indented > 15px more than minimum
# are continuation lines belonging to the previous bullet
INDENT_THRESHOLD = 15
# --- Group lines into logical items (bullet + continuations) ---
# Each item is a list of line indices
items: List[List[int]] = []
for li, indent in enumerate(line_indents):
is_continuation = (indent > min_indent + INDENT_THRESHOLD) and len(items) > 0
if is_continuation:
items[-1].append(li)
else:
items.append([li])
logger.info(
"Box zone %d flowing: %d lines → %d items (indents=%s, min=%d, threshold=%d)",
zone_index, len(lines), len(items),
[int(i) for i in line_indents], int(min_indent), INDENT_THRESHOLD,
)
# --- Build rows and cells from grouped items ---
rows = []
cells = []
header_rows = []
for row_idx, item_line_indices in enumerate(items):
# Collect all words from all lines in this item
item_words = []
item_texts = []
for li in item_line_indices:
if li < len(lines):
item_words.extend(lines[li])
line_text = " ".join(w.get("text", "") for w in lines[li]).strip()
if line_text:
item_texts.append(line_text)
if not item_words:
continue
y_min = min(w["top"] for w in item_words)
y_max = max(w["top"] + w["height"] for w in item_words)
y_center = (y_min + y_max) / 2
row = {
"index": row_idx,
"row_index": row_idx,
"y_min": y_min,
"y_max": y_max,
"y_center": y_center,
"y_min_px": y_min,
"y_max_px": y_max,
"y_min_pct": round(y_min / img_h * 100, 2) if img_h else 0,
"y_max_pct": round(y_max / img_h * 100, 2) if img_h else 0,
"is_header": False,
}
rows.append(row)
# Join multi-line text with newline for display
merged_text = "\n".join(item_texts)
# Add bullet marker if this is a bullet item without one
first_text = item_texts[0] if item_texts else ""
is_bullet = len(item_line_indices) > 1 or _BULLET_RE.match(first_text)
if is_bullet and not _BULLET_RE.match(first_text) and row_idx > 0:
# Continuation item without bullet — add one
merged_text = "" + merged_text
cell = {
"cell_id": f"Z{zone_index}_R{row_idx}C0",
"row_index": row_idx,
"col_index": 0,
"col_type": "column_1",
"text": merged_text,
"word_boxes": item_words,
}
cells.append(cell)
# Detect header: first item if it has no continuation lines and is short
if len(items) >= 2:
first_item_texts = []
for li in items[0]:
if li < len(lines):
first_item_texts.append(" ".join(w.get("text", "") for w in lines[li]).strip())
first_text = " ".join(first_item_texts)
if (len(first_text) < 40
or first_text.isupper()
or first_text.rstrip().endswith(':')):
header_rows = [0]
return {
"columns": [column],
"rows": rows,
"cells": cells,
"header_rows": header_rows,
"box_layout_type": layout_type,
"box_grid_reviewed": False,
}
# Columnar: use standard grid builder with independent column detection
result = _build_zone_grid(
zone_words, box_x, box_y, box_w, box_h,
zone_index, img_w, img_h,
global_columns=None, # detect columns independently
)
# Colspan detection is now handled generically by _detect_colspan_cells
# in grid_editor_helpers.py (called inside _build_zone_grid).
result["box_layout_type"] = layout_type
result["box_grid_reviewed"] = False
return result

View File

@@ -1447,6 +1447,90 @@ def _merge_phonetic_continuation_rows(
return merged
def _merge_wrapped_rows(
entries: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
"""Merge rows where the primary column (EN) is empty — cell wrap continuation.
In textbook vocabulary tables, columns are often narrow, so the author
wraps text within a cell. OCR treats each physical line as a separate row.
The key indicator: if the EN column is empty but DE/example have text,
this row is a continuation of the previous row's cells.
Example (original textbook has ONE row):
Row 2: EN="take part (in)" DE="teilnehmen (an), mitmachen" EX="More than 200 singers took"
Row 3: EN="" DE="(bei)" EX="part in the concert."
→ Merged: EN="take part (in)" DE="teilnehmen (an), mitmachen (bei)" EX="More than 200 singers took part in the concert."
Also handles the reverse case: DE empty but EN has text (wrap in EN column).
"""
if len(entries) < 2:
return entries
merged: List[Dict[str, Any]] = []
for entry in entries:
en = (entry.get('english') or '').strip()
de = (entry.get('german') or '').strip()
ex = (entry.get('example') or '').strip()
if not merged:
merged.append(entry)
continue
prev = merged[-1]
prev_en = (prev.get('english') or '').strip()
prev_de = (prev.get('german') or '').strip()
prev_ex = (prev.get('example') or '').strip()
# Case 1: EN is empty → continuation of previous row
# (DE or EX have text that should be appended to previous row)
if not en and (de or ex) and prev_en:
if de:
if prev_de.endswith(','):
sep = ' ' # "Wort," + " " + "Ausdruck"
elif prev_de.endswith(('-', '(')):
sep = '' # "teil-" + "nehmen" or "(" + "bei)"
else:
sep = ' '
prev['german'] = (prev_de + sep + de).strip()
if ex:
sep = ' ' if prev_ex else ''
prev['example'] = (prev_ex + sep + ex).strip()
logger.debug(
f"Merged wrapped row {entry.get('row_index')} into previous "
f"(empty EN): DE={prev['german']!r}, EX={prev.get('example', '')!r}"
)
continue
# Case 2: DE is empty, EN has text that looks like continuation
# (starts with lowercase or is a parenthetical like "(bei)")
if en and not de and prev_de:
is_paren = en.startswith('(')
first_alpha = next((c for c in en if c.isalpha()), '')
starts_lower = first_alpha and first_alpha.islower()
if (is_paren or starts_lower) and len(en.split()) < 5:
sep = ' ' if prev_en and not prev_en.endswith((',', '-', '(')) else ''
prev['english'] = (prev_en + sep + en).strip()
if ex:
sep2 = ' ' if prev_ex else ''
prev['example'] = (prev_ex + sep2 + ex).strip()
logger.debug(
f"Merged wrapped row {entry.get('row_index')} into previous "
f"(empty DE): EN={prev['english']!r}"
)
continue
merged.append(entry)
if len(merged) < len(entries):
logger.info(
f"_merge_wrapped_rows: merged {len(entries) - len(merged)} "
f"continuation rows ({len(entries)}{len(merged)})"
)
return merged
def _merge_continuation_rows(
entries: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
@@ -1561,6 +1645,9 @@ def build_word_grid(
# --- Post-processing pipeline (deterministic, no LLM) ---
n_raw = len(entries)
# 0. Merge cell-wrap continuation rows (empty primary column = text wrap)
entries = _merge_wrapped_rows(entries)
# 0a. Merge phonetic-only continuation rows into previous entry
entries = _merge_phonetic_continuation_rows(entries)

View File

@@ -178,6 +178,15 @@ def detect_word_colors(
sat_pixels = text_pixels[text_pixels[:, 1] > sat_threshold]
median_hue = float(np.median(sat_pixels[:, 0]))
name = _hue_to_color_name(median_hue)
# Red requires higher saturation — scanner artifacts on black
# text often produce a slight warm tint (hue ~0) with low
# saturation that would otherwise be misclassified as red.
if name == "red" and median_sat < 90:
wb["color"] = _COLOR_HEX["black"]
wb["color_name"] = "black"
continue
wb["color"] = _COLOR_HEX.get(name, _COLOR_HEX["black"])
wb["color_name"] = name
colored_count += 1

View File

@@ -0,0 +1,413 @@
"""
PP-DocLayout ONNX Document Layout Detection.
Uses PP-DocLayout ONNX model to detect document structure regions:
table, figure, title, text, list, header, footer, equation, reference, abstract
Fallback: If ONNX model not available, returns empty list (caller should
fall back to OpenCV-based detection in cv_graphic_detect.py).
DATENSCHUTZ: Alle Verarbeitung erfolgt lokal.
"""
import logging
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Optional
import numpy as np
logger = logging.getLogger(__name__)
__all__ = [
"detect_layout_regions",
"is_doclayout_available",
"get_doclayout_status",
"LayoutRegion",
"DOCLAYOUT_CLASSES",
]
# ---------------------------------------------------------------------------
# Class labels (PP-DocLayout default order)
# ---------------------------------------------------------------------------
DOCLAYOUT_CLASSES = [
"table", "figure", "title", "text", "list",
"header", "footer", "equation", "reference", "abstract",
]
# ---------------------------------------------------------------------------
# Data types
# ---------------------------------------------------------------------------
@dataclass
class LayoutRegion:
"""A detected document layout region."""
x: int
y: int
width: int
height: int
label: str # table, figure, title, text, list, etc.
confidence: float
label_index: int # raw class index
# ---------------------------------------------------------------------------
# ONNX model loading
# ---------------------------------------------------------------------------
_MODEL_SEARCH_PATHS = [
# 1. Explicit environment variable
os.environ.get("DOCLAYOUT_ONNX_PATH", ""),
# 2. Docker default cache path
"/root/.cache/huggingface/onnx/pp-doclayout/model.onnx",
# 3. Local dev relative to working directory
"models/onnx/pp-doclayout/model.onnx",
]
_onnx_session: Optional[object] = None
_model_path: Optional[str] = None
_load_attempted: bool = False
_load_error: Optional[str] = None
def _find_model_path() -> Optional[str]:
"""Search for the ONNX model file in known locations."""
for p in _MODEL_SEARCH_PATHS:
if p and Path(p).is_file():
return str(Path(p).resolve())
return None
def _load_onnx_session():
"""Lazy-load the ONNX runtime session (once)."""
global _onnx_session, _model_path, _load_attempted, _load_error
if _load_attempted:
return _onnx_session
_load_attempted = True
path = _find_model_path()
if path is None:
_load_error = "ONNX model not found in any search path"
logger.info("PP-DocLayout: %s", _load_error)
return None
try:
import onnxruntime as ort # type: ignore[import-untyped]
sess_options = ort.SessionOptions()
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
# Prefer CPU keeps the GPU free for OCR / LLM.
providers = ["CPUExecutionProvider"]
_onnx_session = ort.InferenceSession(path, sess_options, providers=providers)
_model_path = path
logger.info("PP-DocLayout: model loaded from %s", path)
except ImportError:
_load_error = "onnxruntime not installed"
logger.info("PP-DocLayout: %s", _load_error)
except Exception as exc:
_load_error = str(exc)
logger.warning("PP-DocLayout: failed to load model from %s: %s", path, exc)
return _onnx_session
# ---------------------------------------------------------------------------
# Public helpers
# ---------------------------------------------------------------------------
def is_doclayout_available() -> bool:
"""Return True if the ONNX model can be loaded successfully."""
return _load_onnx_session() is not None
def get_doclayout_status() -> Dict:
"""Return diagnostic information about the DocLayout backend."""
_load_onnx_session() # ensure we tried
return {
"available": _onnx_session is not None,
"model_path": _model_path,
"load_error": _load_error,
"classes": DOCLAYOUT_CLASSES,
"class_count": len(DOCLAYOUT_CLASSES),
}
# ---------------------------------------------------------------------------
# Pre-processing
# ---------------------------------------------------------------------------
_INPUT_SIZE = 800 # PP-DocLayout expects 800x800
def preprocess_image(img_bgr: np.ndarray) -> tuple:
"""Resize + normalize image for PP-DocLayout ONNX input.
Returns:
(input_tensor, scale_x, scale_y, pad_x, pad_y)
where scale/pad allow mapping boxes back to original coords.
"""
orig_h, orig_w = img_bgr.shape[:2]
# Compute scale to fit within _INPUT_SIZE keeping aspect ratio
scale = min(_INPUT_SIZE / orig_w, _INPUT_SIZE / orig_h)
new_w = int(orig_w * scale)
new_h = int(orig_h * scale)
import cv2 # local import — cv2 is always available in this service
resized = cv2.resize(img_bgr, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
# Pad to _INPUT_SIZE x _INPUT_SIZE with gray (114)
pad_x = (_INPUT_SIZE - new_w) // 2
pad_y = (_INPUT_SIZE - new_h) // 2
padded = np.full((_INPUT_SIZE, _INPUT_SIZE, 3), 114, dtype=np.uint8)
padded[pad_y:pad_y + new_h, pad_x:pad_x + new_w] = resized
# Normalize to [0, 1] float32
blob = padded.astype(np.float32) / 255.0
# HWC → CHW
blob = blob.transpose(2, 0, 1)
# Add batch dimension → (1, 3, 800, 800)
blob = np.expand_dims(blob, axis=0)
return blob, scale, pad_x, pad_y
# ---------------------------------------------------------------------------
# Non-Maximum Suppression (NMS)
# ---------------------------------------------------------------------------
def _compute_iou(box_a: np.ndarray, box_b: np.ndarray) -> float:
"""Compute IoU between two boxes [x1, y1, x2, y2]."""
ix1 = max(box_a[0], box_b[0])
iy1 = max(box_a[1], box_b[1])
ix2 = min(box_a[2], box_b[2])
iy2 = min(box_a[3], box_b[3])
inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
if inter == 0:
return 0.0
area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
union = area_a + area_b - inter
return inter / union if union > 0 else 0.0
def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float = 0.5) -> List[int]:
"""Apply greedy Non-Maximum Suppression.
Args:
boxes: (N, 4) array of [x1, y1, x2, y2].
scores: (N,) confidence scores.
iou_threshold: Overlap threshold for suppression.
Returns:
List of kept indices.
"""
if len(boxes) == 0:
return []
order = np.argsort(scores)[::-1].tolist()
keep: List[int] = []
while order:
i = order.pop(0)
keep.append(i)
remaining = []
for j in order:
if _compute_iou(boxes[i], boxes[j]) < iou_threshold:
remaining.append(j)
order = remaining
return keep
# ---------------------------------------------------------------------------
# Post-processing
# ---------------------------------------------------------------------------
def _postprocess(
outputs: list,
scale: float,
pad_x: int,
pad_y: int,
orig_w: int,
orig_h: int,
confidence_threshold: float,
max_regions: int,
) -> List[LayoutRegion]:
"""Parse ONNX output tensors into LayoutRegion list.
PP-DocLayout ONNX typically outputs one tensor of shape
(1, N, 6) or three tensors (boxes, scores, class_ids).
We handle both common formats.
"""
regions: List[LayoutRegion] = []
# --- Determine output format ---
if len(outputs) == 1:
# Single tensor: (1, N, 4+1+1) = (batch, detections, [x1,y1,x2,y2,score,class])
raw = np.squeeze(outputs[0]) # (N, 6) or (N, 5+num_classes)
if raw.ndim == 1:
raw = raw.reshape(1, -1)
if raw.shape[0] == 0:
return []
if raw.shape[1] == 6:
# Format: x1, y1, x2, y2, score, class_id
all_boxes = raw[:, :4]
all_scores = raw[:, 4]
all_classes = raw[:, 5].astype(int)
elif raw.shape[1] > 6:
# Format: x1, y1, x2, y2, obj_conf, cls0_conf, cls1_conf, ...
all_boxes = raw[:, :4]
cls_scores = raw[:, 5:]
all_classes = np.argmax(cls_scores, axis=1)
all_scores = raw[:, 4] * np.max(cls_scores, axis=1)
else:
logger.warning("PP-DocLayout: unexpected output shape %s", raw.shape)
return []
elif len(outputs) == 3:
# Three tensors: boxes (N,4), scores (N,), class_ids (N,)
all_boxes = np.squeeze(outputs[0])
all_scores = np.squeeze(outputs[1])
all_classes = np.squeeze(outputs[2]).astype(int)
if all_boxes.ndim == 1:
all_boxes = all_boxes.reshape(1, 4)
all_scores = np.array([all_scores])
all_classes = np.array([all_classes])
else:
logger.warning("PP-DocLayout: unexpected %d output tensors", len(outputs))
return []
# --- Confidence filter ---
mask = all_scores >= confidence_threshold
boxes = all_boxes[mask]
scores = all_scores[mask]
classes = all_classes[mask]
if len(boxes) == 0:
return []
# --- NMS ---
keep_idxs = nms(boxes, scores, iou_threshold=0.5)
boxes = boxes[keep_idxs]
scores = scores[keep_idxs]
classes = classes[keep_idxs]
# --- Scale boxes back to original image coordinates ---
for i in range(len(boxes)):
x1, y1, x2, y2 = boxes[i]
# Remove padding offset
x1 = (x1 - pad_x) / scale
y1 = (y1 - pad_y) / scale
x2 = (x2 - pad_x) / scale
y2 = (y2 - pad_y) / scale
# Clamp to original dimensions
x1 = max(0, min(x1, orig_w))
y1 = max(0, min(y1, orig_h))
x2 = max(0, min(x2, orig_w))
y2 = max(0, min(y2, orig_h))
w = int(round(x2 - x1))
h = int(round(y2 - y1))
if w < 5 or h < 5:
continue
cls_idx = int(classes[i])
label = DOCLAYOUT_CLASSES[cls_idx] if 0 <= cls_idx < len(DOCLAYOUT_CLASSES) else f"class_{cls_idx}"
regions.append(LayoutRegion(
x=int(round(x1)),
y=int(round(y1)),
width=w,
height=h,
label=label,
confidence=round(float(scores[i]), 4),
label_index=cls_idx,
))
# Sort by confidence descending, limit
regions.sort(key=lambda r: r.confidence, reverse=True)
return regions[:max_regions]
# ---------------------------------------------------------------------------
# Main detection function
# ---------------------------------------------------------------------------
def detect_layout_regions(
img_bgr: np.ndarray,
confidence_threshold: float = 0.5,
max_regions: int = 50,
) -> List[LayoutRegion]:
"""Detect document layout regions using PP-DocLayout ONNX model.
Args:
img_bgr: BGR color image (OpenCV format).
confidence_threshold: Minimum confidence to keep a detection.
max_regions: Maximum number of regions to return.
Returns:
List of LayoutRegion sorted by confidence descending.
Returns empty list if model is not available.
"""
session = _load_onnx_session()
if session is None:
return []
if img_bgr is None or img_bgr.size == 0:
return []
orig_h, orig_w = img_bgr.shape[:2]
# Pre-process
input_tensor, scale, pad_x, pad_y = preprocess_image(img_bgr)
# Run inference
try:
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: input_tensor})
except Exception as exc:
logger.warning("PP-DocLayout inference failed: %s", exc)
return []
# Post-process
regions = _postprocess(
outputs,
scale=scale,
pad_x=pad_x,
pad_y=pad_y,
orig_w=orig_w,
orig_h=orig_h,
confidence_threshold=confidence_threshold,
max_regions=max_regions,
)
if regions:
label_counts: Dict[str, int] = {}
for r in regions:
label_counts[r.label] = label_counts.get(r.label, 0) + 1
logger.info(
"PP-DocLayout: %d regions (%s)",
len(regions),
", ".join(f"{k}: {v}" for k, v in sorted(label_counts.items())),
)
else:
logger.debug("PP-DocLayout: no regions above threshold %.2f", confidence_threshold)
return regions

View File

@@ -120,6 +120,57 @@ def detect_graphic_elements(
if img_bgr is None:
return []
# ------------------------------------------------------------------
# Try PP-DocLayout ONNX first if available
# ------------------------------------------------------------------
import os
backend = os.environ.get("GRAPHIC_DETECT_BACKEND", "auto")
if backend in ("doclayout", "auto"):
try:
from cv_doclayout_detect import detect_layout_regions, is_doclayout_available
if is_doclayout_available():
regions = detect_layout_regions(img_bgr)
if regions:
_LABEL_TO_COLOR = {
"figure": ("image", "green", _COLOR_HEX.get("green", "#16a34a")),
"table": ("image", "blue", _COLOR_HEX.get("blue", "#2563eb")),
}
converted: List[GraphicElement] = []
for r in regions:
shape, color_name, color_hex = _LABEL_TO_COLOR.get(
r.label,
(r.label, "gray", _COLOR_HEX.get("gray", "#6b7280")),
)
converted.append(GraphicElement(
x=r.x,
y=r.y,
width=r.width,
height=r.height,
area=r.width * r.height,
shape=shape,
color_name=color_name,
color_hex=color_hex,
confidence=r.confidence,
contour=None,
))
converted.sort(key=lambda g: g.area, reverse=True)
result = converted[:max_elements]
if result:
shape_counts: Dict[str, int] = {}
for g in result:
shape_counts[g.shape] = shape_counts.get(g.shape, 0) + 1
logger.info(
"GraphicDetect (PP-DocLayout): %d elements (%s)",
len(result),
", ".join(f"{s}: {c}" for s, c in sorted(shape_counts.items())),
)
return result
except Exception as e:
logger.warning("PP-DocLayout failed, falling back to OpenCV: %s", e)
# ------------------------------------------------------------------
# OpenCV fallback (original logic)
# ------------------------------------------------------------------
h, w = img_bgr.shape[:2]
logger.debug("GraphicDetect: image %dx%d, %d word_boxes, %d detected_boxes",
@@ -170,7 +221,7 @@ def detect_graphic_elements(
continue
# Skip page-spanning regions
if bw > w * 0.5 or bh > h * 0.5:
if bw > w * 0.6 or bh > h * 0.6:
logger.debug("GraphicDetect PASS1 skip page-spanning (%d,%d) %dx%d", bx, by, bw, bh)
continue
@@ -232,12 +283,16 @@ def detect_graphic_elements(
if color_pixel_count < 200:
continue
# (d) Very low density → thin strokes, almost certainly text
if density < 0.20:
# (d) Very low density → thin strokes, almost certainly text.
# Large regions (photos/illustrations) can have low color density
# because most pixels are grayscale ink. Use a lower threshold
# for regions bigger than 100×80 px.
_min_density = 0.05 if (bw > 100 and bh > 80) else 0.20
if density < _min_density:
logger.info(
"GraphicDetect PASS1 skip low-density (%d,%d) %dx%d "
"density=%.0f%% (likely colored text)",
bx, by, bw, bh, density * 100,
"density=%.0f%% (min=%.0f%%, likely colored text)",
bx, by, bw, bh, density * 100, _min_density * 100,
)
continue

View File

@@ -0,0 +1,610 @@
"""
Gutter Repair — detects and fixes words truncated or blurred at the book gutter.
When scanning double-page spreads, the binding area (gutter) causes:
1. Blurry/garbled trailing characters ("stammeli""stammeln")
2. Words split across lines with a hyphen lost in the gutter
("ve" + "künden""verkünden")
This module analyses grid cells, identifies gutter-edge candidates, and
proposes corrections using pyspellchecker (DE + EN).
Lizenz: Apache 2.0 (kommerziell nutzbar)
DATENSCHUTZ: Alle Verarbeitung erfolgt lokal.
"""
import itertools
import logging
import re
import time
import uuid
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional, Tuple
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Spellchecker setup (lazy, cached)
# ---------------------------------------------------------------------------
_spell_de = None
_spell_en = None
_SPELL_AVAILABLE = False
def _init_spellcheckers():
"""Lazy-load DE + EN spellcheckers (cached across calls)."""
global _spell_de, _spell_en, _SPELL_AVAILABLE
if _spell_de is not None:
return
try:
from spellchecker import SpellChecker
_spell_de = SpellChecker(language='de', distance=1)
_spell_en = SpellChecker(language='en', distance=1)
_SPELL_AVAILABLE = True
logger.info("Gutter repair: spellcheckers loaded (DE + EN)")
except ImportError:
logger.warning("pyspellchecker not installed — gutter repair unavailable")
def _is_known(word: str) -> bool:
"""Check if a word is known in DE or EN dictionary."""
_init_spellcheckers()
if not _SPELL_AVAILABLE:
return False
w = word.lower()
return bool(_spell_de.known([w])) or bool(_spell_en.known([w]))
def _spell_candidates(word: str, lang: str = "both") -> List[str]:
"""Get all plausible spellchecker candidates for a word (deduplicated)."""
_init_spellcheckers()
if not _SPELL_AVAILABLE:
return []
w = word.lower()
seen: set = set()
results: List[str] = []
for checker in ([_spell_de, _spell_en] if lang == "both"
else [_spell_de] if lang == "de"
else [_spell_en]):
if checker is None:
continue
cands = checker.candidates(w)
if cands:
for c in cands:
if c and c != w and c not in seen:
seen.add(c)
results.append(c)
return results
# ---------------------------------------------------------------------------
# Gutter position detection
# ---------------------------------------------------------------------------
# Minimum word length for spell-fix (very short words are often legitimate)
_MIN_WORD_LEN_SPELL = 3
# Minimum word length for hyphen-join candidates (fragments at the gutter
# can be as short as 1-2 chars, e.g. "ve" from "ver-künden")
_MIN_WORD_LEN_HYPHEN = 2
# How close to the right column edge a word must be to count as "gutter-adjacent".
# Expressed as fraction of column width (e.g. 0.75 = rightmost 25%).
_GUTTER_EDGE_THRESHOLD = 0.70
# Small common words / abbreviations that should NOT be repaired
_STOPWORDS = frozenset([
# German
"ab", "an", "am", "da", "er", "es", "im", "in", "ja", "ob", "so", "um",
"zu", "wo", "du", "eh", "ei", "je", "na", "nu", "oh",
# English
"a", "am", "an", "as", "at", "be", "by", "do", "go", "he", "if", "in",
"is", "it", "me", "my", "no", "of", "on", "or", "so", "to", "up", "us",
"we",
])
# IPA / phonetic patterns — skip these cells
_IPA_RE = re.compile(r'[\[\]/ˈˌːʃʒθðŋɑɒæɔəɛɪʊʌ]')
def _is_ipa_text(text: str) -> bool:
"""True if text looks like IPA transcription."""
return bool(_IPA_RE.search(text))
def _word_is_at_gutter_edge(word_bbox: Dict, col_x: float, col_width: float) -> bool:
"""Check if a word's right edge is near the right boundary of its column."""
if col_width <= 0:
return False
word_right = word_bbox.get("left", 0) + word_bbox.get("width", 0)
col_right = col_x + col_width
# Word's right edge within the rightmost portion of the column
relative_pos = (word_right - col_x) / col_width
return relative_pos >= _GUTTER_EDGE_THRESHOLD
# ---------------------------------------------------------------------------
# Suggestion types
# ---------------------------------------------------------------------------
@dataclass
class GutterSuggestion:
"""A single correction suggestion."""
id: str = field(default_factory=lambda: str(uuid.uuid4())[:8])
type: str = "" # "hyphen_join" | "spell_fix"
zone_index: int = 0
row_index: int = 0
col_index: int = 0
col_type: str = ""
cell_id: str = ""
original_text: str = ""
suggested_text: str = ""
# For hyphen_join:
next_row_index: int = -1
next_row_cell_id: str = ""
next_row_text: str = ""
missing_chars: str = ""
display_parts: List[str] = field(default_factory=list)
# Alternatives (other plausible corrections the user can pick from)
alternatives: List[str] = field(default_factory=list)
# Meta:
confidence: float = 0.0
reason: str = "" # "gutter_truncation" | "gutter_blur" | "hyphen_continuation"
def to_dict(self) -> Dict[str, Any]:
return asdict(self)
# ---------------------------------------------------------------------------
# Core repair logic
# ---------------------------------------------------------------------------
_TRAILING_PUNCT_RE = re.compile(r'[.,;:!?\)\]]+$')
def _try_hyphen_join(
word_text: str,
next_word_text: str,
max_missing: int = 3,
) -> Optional[Tuple[str, str, float]]:
"""Try joining two fragments with 0..max_missing interpolated chars.
Strips trailing punctuation from the continuation word before testing
(e.g. "künden,""künden") so dictionary lookup succeeds.
Returns (joined_word, missing_chars, confidence) or None.
"""
base = word_text.rstrip("-").rstrip()
# Strip trailing punctuation from continuation (commas, periods, etc.)
raw_continuation = next_word_text.lstrip()
continuation = _TRAILING_PUNCT_RE.sub('', raw_continuation)
if not base or not continuation:
return None
# 1. Direct join (no missing chars)
direct = base + continuation
if _is_known(direct):
return (direct, "", 0.95)
# 2. Try with 1..max_missing missing characters
# Use common letters, weighted by frequency in German/English
_COMMON_CHARS = "enristaldhgcmobwfkzpvjyxqu"
for n_missing in range(1, max_missing + 1):
for chars in itertools.product(_COMMON_CHARS[:15], repeat=n_missing):
candidate = base + "".join(chars) + continuation
if _is_known(candidate):
missing = "".join(chars)
# Confidence decreases with more missing chars
conf = 0.90 - (n_missing - 1) * 0.10
return (candidate, missing, conf)
return None
def _try_spell_fix(
word_text: str, col_type: str = "",
) -> Optional[Tuple[str, float, List[str]]]:
"""Try to fix a single garbled gutter word via spellchecker.
Returns (best_correction, confidence, alternatives_list) or None.
The alternatives list contains other plausible corrections the user
can choose from (e.g. "stammelt" vs "stammeln").
"""
if len(word_text) < _MIN_WORD_LEN_SPELL:
return None
# Strip trailing/leading parentheses and check if the bare word is valid.
# Words like "probieren)" or "(Englisch" are valid words with punctuation,
# not OCR errors. Don't suggest corrections for them.
stripped = word_text.strip("()")
if stripped and _is_known(stripped):
return None
# Determine language priority from column type
if "en" in col_type:
lang = "en"
elif "de" in col_type:
lang = "de"
else:
lang = "both"
candidates = _spell_candidates(word_text, lang=lang)
if not candidates and lang != "both":
candidates = _spell_candidates(word_text, lang="both")
if not candidates:
return None
# Preserve original casing
is_upper = word_text[0].isupper()
def _preserve_case(w: str) -> str:
if is_upper and w:
return w[0].upper() + w[1:]
return w
# Sort candidates by edit distance (closest first)
scored = []
for c in candidates:
dist = _edit_distance(word_text.lower(), c.lower())
scored.append((dist, c))
scored.sort(key=lambda x: x[0])
best_dist, best = scored[0]
best = _preserve_case(best)
conf = max(0.5, 1.0 - best_dist * 0.15)
# Build alternatives (all other candidates, also case-preserved)
alts = [_preserve_case(c) for _, c in scored[1:] if c.lower() != best.lower()]
# Limit to top 5 alternatives
alts = alts[:5]
return (best, conf, alts)
def _edit_distance(a: str, b: str) -> int:
"""Simple Levenshtein distance."""
if len(a) < len(b):
return _edit_distance(b, a)
if len(b) == 0:
return len(a)
prev = list(range(len(b) + 1))
for i, ca in enumerate(a):
curr = [i + 1]
for j, cb in enumerate(b):
cost = 0 if ca == cb else 1
curr.append(min(curr[j] + 1, prev[j + 1] + 1, prev[j] + cost))
prev = curr
return prev[len(b)]
# ---------------------------------------------------------------------------
# Grid analysis
# ---------------------------------------------------------------------------
def analyse_grid_for_gutter_repair(
grid_data: Dict[str, Any],
image_width: int = 0,
) -> Dict[str, Any]:
"""Analyse a structured grid and return gutter repair suggestions.
Args:
grid_data: The grid_editor_result from the session (zones→cells structure).
image_width: Image width in pixels (for determining gutter side).
Returns:
Dict with "suggestions" list and "stats".
"""
t0 = time.time()
_init_spellcheckers()
if not _SPELL_AVAILABLE:
return {
"suggestions": [],
"stats": {"error": "pyspellchecker not installed"},
"duration_seconds": 0,
}
zones = grid_data.get("zones", [])
suggestions: List[GutterSuggestion] = []
words_checked = 0
gutter_candidates = 0
for zi, zone in enumerate(zones):
columns = zone.get("columns", [])
cells = zone.get("cells", [])
if not columns or not cells:
continue
# Build column lookup: col_index → {x, width, type}
col_info: Dict[int, Dict] = {}
for col in columns:
ci = col.get("index", col.get("col_index", -1))
col_info[ci] = {
"x": col.get("x_min_px", col.get("x", 0)),
"width": col.get("x_max_px", col.get("width", 0)) - col.get("x_min_px", col.get("x", 0)),
"type": col.get("type", col.get("col_type", "")),
}
# Build row→col→cell lookup
cell_map: Dict[Tuple[int, int], Dict] = {}
max_row = 0
for cell in cells:
ri = cell.get("row_index", 0)
ci = cell.get("col_index", 0)
cell_map[(ri, ci)] = cell
if ri > max_row:
max_row = ri
# Determine which columns are at the gutter edge.
# For a left page: rightmost content columns.
# For now, check ALL columns — a word is a candidate if it's at the
# right edge of its column AND not a known word.
for (ri, ci), cell in cell_map.items():
text = (cell.get("text") or "").strip()
if not text:
continue
if _is_ipa_text(text):
continue
words_checked += 1
col = col_info.get(ci, {})
col_type = col.get("type", "")
# Get word boxes to check position
word_boxes = cell.get("word_boxes", [])
# Check the LAST word in the cell (rightmost, closest to gutter)
cell_words = text.split()
if not cell_words:
continue
last_word = cell_words[-1]
# Skip stopwords
if last_word.lower().rstrip(".,;:!?-") in _STOPWORDS:
continue
last_word_clean = last_word.rstrip(".,;:!?)(")
if len(last_word_clean) < _MIN_WORD_LEN_HYPHEN:
continue
# Check if the last word is at the gutter edge
is_at_edge = False
if word_boxes:
last_wb = word_boxes[-1]
is_at_edge = _word_is_at_gutter_edge(
last_wb, col.get("x", 0), col.get("width", 1)
)
else:
# No word boxes — use cell bbox
bbox = cell.get("bbox_px", {})
is_at_edge = _word_is_at_gutter_edge(
{"left": bbox.get("x", 0), "width": bbox.get("w", 0)},
col.get("x", 0), col.get("width", 1)
)
if not is_at_edge:
continue
# Word is at gutter edge — check if it's a known word
if _is_known(last_word_clean):
continue
# Check if the word ends with "-" (explicit hyphen break)
ends_with_hyphen = last_word.endswith("-")
# If the word already ends with "-" and the stem (without
# the hyphen) is a known word, this is a VALID line-break
# hyphenation — not a gutter error. Gutter problems cause
# the hyphen to be LOST ("ve" instead of "ver-"), so a
# visible hyphen + known stem = intentional word-wrap.
# Example: "wunder-" → "wunder" is known → skip.
if ends_with_hyphen:
stem = last_word_clean.rstrip("-")
if stem and _is_known(stem):
continue
gutter_candidates += 1
# --- Strategy 1: Hyphen join with next row ---
next_cell = cell_map.get((ri + 1, ci))
if next_cell:
next_text = (next_cell.get("text") or "").strip()
next_words = next_text.split()
if next_words:
first_next = next_words[0]
first_next_clean = _TRAILING_PUNCT_RE.sub('', first_next)
first_alpha = next((c for c in first_next if c.isalpha()), "")
# Also skip if the joined word is known (covers compound
# words where the stem alone might not be in the dictionary)
if ends_with_hyphen and first_next_clean:
direct = last_word_clean.rstrip("-") + first_next_clean
if _is_known(direct):
continue
# Continuation likely if:
# - explicit hyphen, OR
# - next row starts lowercase (= not a new entry)
if ends_with_hyphen or (first_alpha and first_alpha.islower()):
result = _try_hyphen_join(last_word_clean, first_next)
if result:
joined, missing, conf = result
# Build display parts: show hyphenation for original layout
if ends_with_hyphen:
display_p1 = last_word_clean.rstrip("-")
if missing:
display_p1 += missing
display_p1 += "-"
else:
display_p1 = last_word_clean
if missing:
display_p1 += missing + "-"
else:
display_p1 += "-"
suggestion = GutterSuggestion(
type="hyphen_join",
zone_index=zi,
row_index=ri,
col_index=ci,
col_type=col_type,
cell_id=cell.get("cell_id", f"R{ri:02d}_C{ci}"),
original_text=last_word,
suggested_text=joined,
next_row_index=ri + 1,
next_row_cell_id=next_cell.get("cell_id", f"R{ri+1:02d}_C{ci}"),
next_row_text=next_text,
missing_chars=missing,
display_parts=[display_p1, first_next],
confidence=conf,
reason="gutter_truncation" if missing else "hyphen_continuation",
)
suggestions.append(suggestion)
continue # skip spell_fix if hyphen_join found
# --- Strategy 2: Single-word spell fix (only for longer words) ---
fix_result = _try_spell_fix(last_word_clean, col_type)
if fix_result:
corrected, conf, alts = fix_result
suggestion = GutterSuggestion(
type="spell_fix",
zone_index=zi,
row_index=ri,
col_index=ci,
col_type=col_type,
cell_id=cell.get("cell_id", f"R{ri:02d}_C{ci}"),
original_text=last_word,
suggested_text=corrected,
alternatives=alts,
confidence=conf,
reason="gutter_blur",
)
suggestions.append(suggestion)
duration = round(time.time() - t0, 3)
logger.info(
"Gutter repair: checked %d words, %d gutter candidates, %d suggestions (%.2fs)",
words_checked, gutter_candidates, len(suggestions), duration,
)
return {
"suggestions": [s.to_dict() for s in suggestions],
"stats": {
"words_checked": words_checked,
"gutter_candidates": gutter_candidates,
"suggestions_found": len(suggestions),
},
"duration_seconds": duration,
}
def apply_gutter_suggestions(
grid_data: Dict[str, Any],
accepted_ids: List[str],
suggestions: List[Dict[str, Any]],
) -> Dict[str, Any]:
"""Apply accepted gutter repair suggestions to the grid data.
Modifies cells in-place and returns summary of changes.
Args:
grid_data: The grid_editor_result (zones→cells).
accepted_ids: List of suggestion IDs the user accepted.
suggestions: The full suggestions list (from analyse_grid_for_gutter_repair).
Returns:
Dict with "applied_count" and "changes" list.
"""
accepted_set = set(accepted_ids)
accepted_suggestions = [s for s in suggestions if s.get("id") in accepted_set]
zones = grid_data.get("zones", [])
changes: List[Dict[str, Any]] = []
for s in accepted_suggestions:
zi = s.get("zone_index", 0)
ri = s.get("row_index", 0)
ci = s.get("col_index", 0)
stype = s.get("type", "")
if zi >= len(zones):
continue
zone_cells = zones[zi].get("cells", [])
# Find the target cell
target_cell = None
for cell in zone_cells:
if cell.get("row_index") == ri and cell.get("col_index") == ci:
target_cell = cell
break
if not target_cell:
continue
old_text = target_cell.get("text", "")
if stype == "spell_fix":
# Replace the last word in the cell text
original_word = s.get("original_text", "")
corrected = s.get("suggested_text", "")
if original_word and corrected:
# Replace from the right (last occurrence)
idx = old_text.rfind(original_word)
if idx >= 0:
new_text = old_text[:idx] + corrected + old_text[idx + len(original_word):]
target_cell["text"] = new_text
changes.append({
"type": "spell_fix",
"zone_index": zi,
"row_index": ri,
"col_index": ci,
"cell_id": target_cell.get("cell_id", ""),
"old_text": old_text,
"new_text": new_text,
})
elif stype == "hyphen_join":
# Current cell: replace last word with the hyphenated first part
original_word = s.get("original_text", "")
joined = s.get("suggested_text", "")
display_parts = s.get("display_parts", [])
next_ri = s.get("next_row_index", -1)
if not original_word or not joined or not display_parts:
continue
# The first display part is what goes in the current row
first_part = display_parts[0] if display_parts else ""
# Replace the last word in current cell with the restored form.
# The next row is NOT modified — "künden" stays in its row
# because the original book layout has it there. We only fix
# the truncated word in the current row (e.g. "ve" → "ver-").
idx = old_text.rfind(original_word)
if idx >= 0:
new_text = old_text[:idx] + first_part + old_text[idx + len(original_word):]
target_cell["text"] = new_text
changes.append({
"type": "hyphen_join",
"zone_index": zi,
"row_index": ri,
"col_index": ci,
"cell_id": target_cell.get("cell_id", ""),
"old_text": old_text,
"new_text": new_text,
"joined_word": joined,
})
logger.info("Gutter repair applied: %d/%d suggestions", len(changes), len(accepted_suggestions))
return {
"applied_count": len(accepted_suggestions),
"changes": changes,
}

View File

@@ -0,0 +1,135 @@
"""German IPA insertion for grid editor cells.
Hybrid approach:
1. Primary lookup: wiki-pronunciation-dict (636k entries, CC-BY-SA)
2. Fallback: epitran rule-based G2P (MIT license)
German IPA data sourced from Wiktionary contributors (CC-BY-SA 4.0).
Attribution required — see grid editor UI.
Lizenz: Code Apache-2.0, IPA-Daten CC-BY-SA 4.0 (Wiktionary)
DATENSCHUTZ: Alle Verarbeitung erfolgt lokal.
"""
import logging
import re
from typing import Dict, List, Optional, Set
logger = logging.getLogger(__name__)
# IPA/phonetic characters — skip cells that already contain IPA
_IPA_RE = re.compile(r'[\[\]ˈˌːʃʒθðŋɑɒæɔəɛɜɪʊʌ]')
def _lookup_ipa_de(word: str) -> Optional[str]:
"""Look up German IPA for a single word.
Returns IPA string or None if not found.
"""
from cv_vocab_types import _de_ipa_dict, _epitran_de, DE_IPA_AVAILABLE
if not DE_IPA_AVAILABLE and _epitran_de is None:
return None
lower = word.lower().strip()
if not lower:
return None
# 1. Dictionary lookup (636k entries)
ipa = _de_ipa_dict.get(lower)
if ipa:
return ipa
# 2. epitran fallback (rule-based)
if _epitran_de is not None:
try:
result = _epitran_de.transliterate(word)
if result and result != word.lower():
return result
except Exception:
pass
return None
def _insert_ipa_for_text(text: str) -> str:
"""Insert German IPA after each recognized word in a text string.
Handles comma-separated lists:
"bildschön, blendend""bildschön [bɪltʃøn], blendend [blɛndənt]"
Skips cells already containing IPA brackets.
"""
if not text or _IPA_RE.search(text):
return text
# Split on comma/semicolon sequences, keeping separators
tokens = re.split(r'([,;:]+\s*)', text)
result = []
changed = False
for tok in tokens:
# Keep separators as-is
if not tok or re.match(r'^[,;:\s]+$', tok):
result.append(tok)
continue
# Process words within this token
words = tok.split()
new_words = []
for w in words:
# Strip punctuation for lookup
clean = re.sub(r'[^a-zA-ZäöüÄÖÜß]', '', w)
if len(clean) < 3:
new_words.append(w)
continue
ipa = _lookup_ipa_de(clean)
if ipa:
new_words.append(f"{w} [{ipa}]")
changed = True
else:
new_words.append(w)
result.append(' '.join(new_words))
return ''.join(result) if changed else text
def insert_german_ipa(
cells: List[Dict],
target_cols: Set[str],
) -> int:
"""Insert German IPA transcriptions into cells of target columns.
Args:
cells: Flat list of all cells (modified in-place).
target_cols: Set of col_type values to process.
Returns:
Number of cells modified.
"""
from cv_vocab_types import DE_IPA_AVAILABLE, _epitran_de
if not DE_IPA_AVAILABLE and _epitran_de is None:
logger.warning("German IPA not available — skipping")
return 0
count = 0
for cell in cells:
ct = cell.get("col_type", "")
if ct not in target_cols:
continue
text = cell.get("text", "")
if not text.strip():
continue
new_text = _insert_ipa_for_text(text)
if new_text != text:
cell["text"] = new_text
cell["_ipa_corrected"] = True
count += 1
if count:
logger.info(f"German IPA inserted in {count} cells")
return count

Some files were not shown because too many files have changed in this diff Show More