Switch Vision-LLM Fusion to llama3.2-vision:11b

qwen2.5vl:32b needs ~100GB RAM and crashes Ollama. llama3.2-vision:11b is already installed and fits in memory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix: _merge_paddle_tesseract takes 2 args not 4
2026-04-24 00:44:59 +02:00 · 2026-04-24 00:33:49 +02:00 · 2026-04-24 00:24:22 +02:00 · 2026-04-23 16:55:01 +02:00 · 2026-04-23 16:40:39 +02:00 · 2026-04-23 16:18:44 +02:00
165 changed files with 663605 additions and 14846 deletions
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -256,3 +256,45 @@ ssh macmini "cd /Users/benjaminadmin/Projekte/breakpilot-lehrer && git push all
 | `website/app/admin/klausur-korrektur/` | Korrektur-Workspace |
 | `backend-lehrer/classroom_api.py` | Classroom Engine |
 | `backend-lehrer/state_engine_api.py` | State Engine |
+
+---
+
+## Code-Qualitaet Guardrails (NON-NEGOTIABLE)
+
+> Vollstaendige Details: `.claude/rules/architecture.md`
+> Ausnahmen: `.claude/rules/loc-exceptions.txt`
+
+### File Size Budget
+
+- **Hard Cap: 500 LOC** pro Datei
+- Wenn eine Aenderung eine Datei ueber 500 LOC bringen wuerde: **erst splitten, dann aendern**
+- Ausnahmen nur mit Begruendung in `loc-exceptions.txt` + `[guardrail-change]` Commit-Marker
+
+### Architektur
+
+- **Python:** Routes duenn → Business Logic in Services → Persistenz in Repositories
+- **Go:** Handler ≤40 LOC → Service-Layer → Repository-Pattern
+- **TypeScript/Next.js:** page.tsx duenn → Server Actions, Queries, Components auslagern
+- **Types:** Monolithische types.ts frueh splitten, types.ts + types/ Shadowing vermeiden
+
+### Workflow (bei jeder Aenderung)
+
+1. Datei lesen + LOC pruefen
+2. Wenn nahe am Budget → erst splitten
+3. Minimale kohaerente Aenderung
+4. Verifikation (Tests + Lint)
+5. Zusammenfassung: Was geaendert, was verifiziert, Restrisiko
+
+### Commit-Marker
+
+- `[migration-approved]` — Schema-/Migrations-Aenderungen
+- `[guardrail-change]` — Aenderungen an .claude/**, scripts/check-loc.sh
+- `[split-required]` — Aenderung beginnt mit Datei-Split
+- `[interface-change]` — Public API Contracts geaendert
+
+### LOC-Check ausfuehren
+
+```bash
+bash scripts/check-loc.sh --changed   # nur geaenderte Dateien
+bash scripts/check-loc.sh --all       # alle Dateien (zeigt alle Violations)
+```
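Review note: the contents of `scripts/check-loc.sh` are not part of this excerpt. The following is only a minimal Python sketch of what a 500-LOC budget check could look like. The names `count_loc` and `find_violations` are hypothetical, and counting every physical line (including blanks and comments) is an assumption.

```python
from pathlib import Path

HARD_CAP = 500  # LOC budget per file, taken from the guardrail text above


def count_loc(path: Path) -> int:
    """Count physical lines (assumption: the real script may count differently)."""
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def find_violations(paths: list[Path], cap: int = HARD_CAP) -> list[tuple[Path, int]]:
    """Return (path, loc) pairs for every file whose line count exceeds the cap."""
    violations = []
    for path in paths:
        loc = count_loc(path)
        if loc > cap:
            violations.append((path, loc))
    return violations
```

A file with exactly 500 lines passes under this sketch; only 501 and above is reported, which matches the wording "ueber 500 LOC".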
--- a/.claude/rules/architecture.md
+++ b/.claude/rules/architecture.md
@@ -0,0 +1,46 @@
+# Architecture Rule — BreakPilot Lehrer
+
+## File Size Budget
+
+Hard default: **500 LOC max** per file.
+Soft targets:
+- Handler/Router/Service: 300-400 LOC
+- Models/Schemas/Types: 200-300 LOC
+- Utilities: 100-200 LOC
+
+Ausnahmen nur in `.claude/rules/loc-exceptions.txt` mit Begruendung.
+
+## Split-Trigger
+
+Sofort splitten wenn:
+- Datei ueberschreitet 500 LOC
+- Datei wuerde nach Aenderung 500 LOC ueberschreiten
+- Datei mischt Transport + Business Logic + Persistence
+- Datei enthaelt mehrere unabhaengig testbare Verantwortlichkeiten
+
+## Python (backend-lehrer, klausur-service, voice-service)
+
+- Routes duenn halten — Business Logic in Services
+- Persistenz in Repositories/Data-Access-Module
+- Pydantic Schemas nach Domain splitten
+- Zirkulaere Imports vermeiden
+
+## Go (school-service, edu-search-service)
+
+- Handler duenn halten (≤40 LOC)
+- Business Logic in Services/Use-Cases
+- Transport/Request-Decoding getrennt von Domain-Logik
+
+## TypeScript / Next.js (admin-lehrer, studio-v2, website)
+
+- page.tsx duenn halten — Server Actions, Queries, Forms auslagern
+- Monolithische types.ts frueh splitten
+- types.ts + types/ Shadowing vermeiden
+- Shared Client/Server Types explizit trennen
+
+## Entscheidungsreihenfolge
+
+1. Bestehendes kleines kohaesives Modul wiederverwenden
+2. Neues Modul in der Naehe erstellen
+3. Ueberfuellte Datei splitten, neues Verhalten in richtiges Split-Modul
+4. Nur als letzter Ausweg: Grosse bestehende Datei erweitern
--- a/.claude/rules/loc-exceptions.txt
+++ b/.claude/rules/loc-exceptions.txt
@@ -0,0 +1,20 @@
+# LOC Exceptions — BreakPilot Lehrer
+# Format: <glob> | owner=<person> | reason=<why> | review=<date>
+#
+# Jede Ausnahme braucht Begruendung und Review-Datum.
+# Temporaere Ausnahmen muessen mit [guardrail-change] Commit-Marker versehen werden.
+
+# Generated / Build Artifacts
+**/node_modules/** | owner=infra | reason=npm packages | review=permanent
+**/.next/** | owner=infra | reason=Next.js build output | review=permanent
+**/__pycache__/** | owner=infra | reason=Python bytecode | review=permanent
+**/venv/** | owner=infra | reason=Python virtualenv | review=permanent
+
+# Test-Dateien (duerfen groesser sein fuer Table-Driven Tests)
+**/tests/test_cv_vocab_pipeline.py | owner=klausur | reason=umfangreiche OCR Pipeline Tests | review=2026-07-01
+**/tests/test_rbac.py | owner=klausur | reason=RBAC Test-Matrix | review=2026-07-01
+**/tests/test_grid_editor_api.py | owner=klausur | reason=Grid Editor Integrationstests | review=2026-07-01
+
+# Legacy — TEMPORAER bis Refactoring abgeschlossen
+# Dateien hier werden Phase fuer Phase abgearbeitet und entfernt.
+# KEINE neuen Ausnahmen ohne [guardrail-change] Commit-Marker!
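Review note: the exception format `<glob> | owner=<person> | reason=<why> | review=<date>` is simple enough to parse line by line. The sketch below is an illustration, not the parser used by the repository. `LocException`, `parse_exceptions` and `is_exempt` are hypothetical names, and matching with `fnmatch` (where `*` also crosses `/`) is an assumption about how the globs are meant to be interpreted.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class LocException:
    glob: str
    owner: str
    reason: str
    review: str


def parse_exceptions(text: str) -> list[LocException]:
    """Parse exception lines, skipping blank lines and '#' comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        glob, *fields = [part.strip() for part in line.split("|")]
        # Each remaining field has the form key=value.
        meta = dict(field.split("=", 1) for field in fields)
        entries.append(LocException(glob, meta["owner"], meta["reason"], meta["review"]))
    return entries


def is_exempt(path: str, entries: list[LocException]) -> bool:
    """True if the path matches at least one exception glob."""
    return any(fnmatch(path, entry.glob) for entry in entries)
```

With `fnmatch`, a pattern such as `**/tests/test_rbac.py` requires at least one directory separator before `tests/`, so a file at the repository root level would not match under this sketch.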
--- a/.claude/rules/ocr-pipeline-extensions.md
+++ b/.claude/rules/ocr-pipeline-extensions.md
@@ -0,0 +1,237 @@
+# OCR Pipeline Erweiterungen - Entwicklerdokumentation
+
+**Status:** Produktiv
+**Letzte Aktualisierung:** 2026-04-15
+**URL:** https://macmini:3002/ai/ocr-kombi
+
+---
+
+## Uebersicht
+
+Erweiterungen der OCR Kombi Pipeline (14 Steps, 0-13):
+- **SmartSpellChecker** — LLM-freie OCR-Korrektur mit Spracherkennung
+- **Box-Grid-Review** (Step 11) — Eingebettete Boxen verarbeiten
+- **Ansicht/Spreadsheet** (Step 12) — Fortune Sheet Excel-Editor
+
+---
+
+## Pipeline Steps
+
+| Step | ID | Name | Komponente |
+|------|----|------|------------|
+| 0 | upload | Upload | StepUpload |
+| 1 | orientation | Orientierung | StepOrientation |
+| 2 | page-split | Seitentrennung | StepPageSplit |
+| 3 | deskew | Begradigung | StepDeskew |
+| 4 | dewarp | Entzerrung | StepDewarp |
+| 5 | content-crop | Zuschneiden | StepContentCrop |
+| 6 | ocr | OCR | StepOcr |
+| 7 | structure | Strukturerkennung | StepStructure |
+| 8 | grid-build | Grid-Aufbau | StepGridBuild |
+| 9 | grid-review | Grid-Review | StepGridReview |
+| 10 | gutter-repair | Wortkorrektur | StepGutterRepair |
+| **11** | **box-review** | **Box-Review** | **StepBoxGridReview** |
+| **12** | **ansicht** | **Ansicht** | **StepAnsicht** |
+| 13 | ground-truth | Ground Truth | StepGroundTruth |
+
+Step-Definitionen: `admin-lehrer/app/(admin)/ai/ocr-kombi/types.ts`
+
+---
+
+## SmartSpellChecker
+
+**Datei:** `klausur-service/backend/smart_spell.py`
+**Tests:** `tests/test_smart_spell.py` (43 Tests)
+**Lizenz:** Nur pyspellchecker (MIT) — kein LLM, kein Hunspell
+
+### Features
+
+| Feature | Methode |
+|---------|---------|
+| Spracherkennung | Dual-Dictionary EN/DE Heuristik |
+| a/I Disambiguation | Bigram-Kontext (Folgewort-Lookup) |
+| Boundary Repair | Frequenz-basiert: `Pound sand`→`Pounds and` |
+| Context Split | `anew`→`a new` (Allow/Deny-Liste) |
+| Multi-Digit | BFS: `sch00l`→`school` |
+| Cross-Language Guard | DE-Woerter in EN-Spalte nicht falsch korrigieren |
+| Umlaut-Korrektur | `Schuler`→`Schueler` |
+| IPA-Schutz | Inhalte in [Klammern] nie aendern |
+| Slash→l | `p/`→`pl` (kursives l als / erkannt) |
+| Abkuerzungen | 120+ aus `_KNOWN_ABBREVIATIONS` |
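Review note: the "Multi-Digit" row names BFS as the method for repairs such as `sch00l` to `school`. The sketch below shows one way a breadth-first search over digit substitutions could work. It is not taken from `smart_spell.py`. The mapping `DIGIT_TO_LETTERS`, the function name `repair_digits` and the plain vocabulary set are all assumptions for illustration.

```python
from collections import deque
from typing import Optional

# Assumed OCR confusions (digit -> candidate letters); the real table is not shown.
DIGIT_TO_LETTERS = {"0": "o", "1": "li", "5": "s", "8": "b"}


def repair_digits(token: str, vocabulary: set[str]) -> Optional[str]:
    """Breadth-first search: replace one digit per step until a known word appears."""
    queue = deque([token])
    seen = {token}
    while queue:
        candidate = queue.popleft()
        if candidate.isalpha() and candidate in vocabulary:
            return candidate
        for index, char in enumerate(candidate):
            for letter in DIGIT_TO_LETTERS.get(char, ""):
                replaced = candidate[:index] + letter + candidate[index + 1:]
                if replaced not in seen:
                    seen.add(replaced)
                    queue.append(replaced)
    return None  # no substitution path leads to a vocabulary word
```

Because the search is breadth-first, candidates with fewer substitutions are examined before candidates with more, and the `seen` set prevents the same string from being expanded twice.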
+
+### Integration
+
+```python
+# In cv_review.py (LLM Review Step):
+from smart_spell import SmartSpellChecker
+_smart = SmartSpellChecker()
+result = _smart.correct_text(text, lang="en")  # oder "de" oder "auto"
+
+# In grid_editor_api.py (Grid Build + Box Build):
+# Automatisch nach Grid-Aufbau und Box-Grid-Aufbau
+```
+
+### Frequenz-Scoring
+
+Boundary Repair vergleicht Wort-Frequenz-Produkte:
+- `old_freq = word_freq(w1) * word_freq(w2)`
+- `new_freq = word_freq(repaired_w1) * word_freq(repaired_w2)`
+- Akzeptiert wenn `new_freq > old_freq * 5`
+- Abkuerzungs-Bonus nur wenn Original-Woerter selten (freq < 1e-6)
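Review note: the acceptance rule above can be written down in a few lines. This is a sketch under stated assumptions: `TOY_FREQ` is an invented frequency table (the real frequencies presumably come from the spell-checker dictionary), `accept_boundary_repair` is a hypothetical name, and the abbreviation bonus is left out.

```python
# Invented frequencies for illustration only.
TOY_FREQ = {"pound": 2e-5, "sand": 1e-5, "pounds": 4e-5, "and": 2e-2}


def word_freq(word: str) -> float:
    """Look up a relative word frequency; unknown words count as 0."""
    return TOY_FREQ.get(word.lower(), 0.0)


def accept_boundary_repair(w1: str, w2: str, repaired_w1: str, repaired_w2: str,
                           factor: float = 5.0) -> bool:
    """Accept the repair only if the new frequency product beats the old one by the factor."""
    old_freq = word_freq(w1) * word_freq(w2)
    new_freq = word_freq(repaired_w1) * word_freq(repaired_w2)
    return new_freq > old_freq * factor
```

With the toy table, `Pound sand` to `Pounds and` is accepted because the very common word `and` dominates the product, while the reverse repair is rejected.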
+
+---
+
+## Box-Grid-Review (Step 11)
+
+**Frontend:** `admin-lehrer/components/ocr-kombi/StepBoxGridReview.tsx`
+**Backend:** `klausur-service/backend/cv_box_layout.py`, `grid_editor_api.py`
+**Tests:** `tests/test_box_layout.py` (13 Tests)
+
+### Backend-Endpoints
+
+```
+POST /api/v1/ocr-pipeline/sessions/{id}/build-box-grids
+```
+
+Verarbeitet alle erkannten Boxen aus `structure_result`:
+1. Filtert Header/Footer-Boxen (obere/untere 7% der Bildhoehe)
+2. Extrahiert OCR-Woerter pro Box aus `raw_paddle_words`
+3. Klassifiziert Layout: `flowing` | `columnar` | `bullet_list` | `header_only`
+4. Baut Grid mit layout-spezifischer Logik
+5. Wendet SmartSpellChecker an
+
+### Box Layout Klassifikation (`cv_box_layout.py`)
+
+| Layout | Erkennung | Grid-Aufbau |
+|--------|-----------|-------------|
+| `header_only` | ≤5 Woerter oder 1 Zeile | 1 Zelle, alles zusammen |
+| `flowing` | Gleichmaessige Zeilenbreite | 1 Spalte, Bullet-Gruppierung per Einrueckung |
+| `bullet_list` | ≥40% Zeilen mit Bullet-Marker | 1 Spalte, Bullet-Items |
+| `columnar` | Mehrere X-Cluster | Standard-Spaltenerkennung |
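Review note: the table gives the thresholds but not the order in which the checks run. The sketch below assumes the order `header_only`, `bullet_list`, `columnar`, `flowing`. `classify_box_layout`, `BULLET_MARKERS` and the precomputed `x_cluster_count` argument are hypothetical, and the uniform-line-width test for `flowing` is reduced to a fallback.

```python
BULLET_MARKERS = ("•", "-", "*")  # assumption: the real marker set is not shown


def classify_box_layout(lines: list[list[str]], x_cluster_count: int) -> str:
    """Classify a box from its text lines (one word list per line).

    x_cluster_count is the number of distinct left-edge clusters,
    assumed to be computed elsewhere.
    """
    word_count = sum(len(line) for line in lines)
    if word_count <= 5 or len(lines) <= 1:
        return "header_only"
    bullet_lines = sum(1 for line in lines if line and line[0].startswith(BULLET_MARKERS))
    if bullet_lines / len(lines) >= 0.4:
        return "bullet_list"
    if x_cluster_count > 1:
        return "columnar"
    return "flowing"
```

Checking `header_only` first also guards the bullet ratio against a division by zero for empty boxes.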
+
+### Bullet-Einrueckung
+
+Erkennung ueber Left-Edge-Analyse:
+- Minimale Einrueckung = Bullet-Ebene
+- Zeilen mit >15px mehr Einrueckung = Folgezeilen
+- Folgezeilen werden mit `\n` in die Bullet-Zelle integriert
+- Fehlende `•` Marker werden automatisch ergaenzt
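Review note: the left-edge rule above can be sketched as follows. This is an illustration rather than the code in `cv_box_layout.py`. The function name `group_bullet_lines` and the input shape (pairs of left edge in pixels and line text) are assumptions.

```python
def group_bullet_lines(lines: list[tuple[int, str]], indent_px: int = 15) -> list[str]:
    """Group (left_edge_px, text) lines into bullet items, in reading order."""
    if not lines:
        return []
    base = min(left for left, _ in lines)  # minimal indentation marks the bullet level
    items: list[str] = []
    for left, text in lines:
        if left - base > indent_px and items:
            # Continuation line: join it to the current bullet cell with a newline.
            items[-1] += "\n" + text
        else:
            # Bullet-level line: add the marker if OCR dropped it.
            items.append(text if text.startswith("•") else "• " + text)
    return items
```

For example, a line at 40px following a bullet at 10px is merged into that bullet, while a line at 12px starts a new item and receives a `•` marker.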
+
+### Colspan-Erkennung (`grid_editor_helpers.py`)
+
+Generische Funktion `_detect_colspan_cells()`:
+- Laeuft nach `_build_cells()` fuer ALLE Zonen
+- Nutzt Original-Wort-Bloecke (vor `_split_cross_column_words`)
+- Wort-Block der ueber Spaltengrenze reicht → `spanning_header` mit `colspan=N`
+- Beispiel: "In Britain you pay with pounds and pence." ueber 2 Spalten
+
+### Spalten-Erkennung in Boxen
+
+Fuer kleine Zonen (≤60 Woerter):
+- `gap_threshold = max(median_h * 1.0, 25)` statt `3x median`
+- PaddleOCR liefert Multi-Word-Bloecke → alle Gaps sind Spalten-Gaps
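Review note: a small sketch of the threshold switch described above. `column_gap_threshold` is a hypothetical name, and reading "3x median" as three times the median word height is an assumption, since the excerpt does not say which median is meant.

```python
from statistics import median


def column_gap_threshold(word_heights: list[float], word_count: int) -> float:
    """Return the horizontal gap (px) above which two words count as separate columns."""
    median_h = median(word_heights)
    if word_count <= 60:
        # Small zone (typically a box): lower threshold with a 25px floor.
        return max(median_h * 1.0, 25)
    # Larger zones: assumed default of three times the median height.
    return median_h * 3
```

The 25px floor matters for small fonts: with a median height of 22px the threshold is 25px, not 22px.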
+
+---
+
+## Ansicht / Spreadsheet (Step 12)
+
+**Frontend:** `admin-lehrer/components/ocr-kombi/StepAnsicht.tsx`, `SpreadsheetView.tsx`
+**Bibliothek:** `@fortune-sheet/react` (MIT, v1.0.4)
+
+### Architektur
+
+Split-View:
+- **Links:** Original-Scan mit OCR-Overlay (`/image/words-overlay`)
+- **Rechts:** Fortune Sheet Spreadsheet mit Multi-Sheet-Tabs
+
+### Multi-Sheet Ansatz
+
+Jede Zone wird ein eigenes Sheet-Tab:
+- Sheet "Vokabeln" — Hauptgrid mit EN/DE Spalten
+- Sheet "Pounds and euros" — Box 1 mit eigenen 4 Spalten
+- Sheet "German leihen" — Box 2 als Fliesstext
+
+Grund: Spaltenbreiten sind pro Zone unterschiedlich optimiert. Excel-Limitation: Spaltenbreite gilt fuer die ganze Spalte.
+
+### Zell-Formatierung
+
+| Format | Quelle | Fortune Sheet Property |
+|--------|--------|----------------------|
+| Fett | `is_header`, `is_bold`, groessere Schrift | `bl: 1` |
+| Schriftfarbe | OCR word_boxes color | `fc: '#hex'` |
+| Hintergrund | Box bg_hex, Header | `bg: '#hex08'` |
+| Text-Wrap | Mehrzeilige Zellen (\n) | `tb: '2'` |
+| Vertikal oben | Mehrzeilige Zellen | `vt: 0` |
+| Groessere Schrift | word_box height >1.3x median | `fs: 12` |
+
+### Spaltenbreiten
+
+Auto-Fit: `max(laengster_text * 7.5 + 16, original_px * scaleFactor)`
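Review note: the auto-fit formula, written out as a function. The frontend code is TypeScript; Python is used here only to keep the review examples in one language. `auto_fit_width` is a hypothetical name, and measuring `laengster_text` as a character count is an assumption.

```python
def auto_fit_width(cell_texts: list[str], original_px: float, scale_factor: float) -> float:
    """Column width in px: the wider of the text-based estimate and the scaled original width."""
    longest_text = max((len(text) for text in cell_texts), default=0)
    text_based = longest_text * 7.5 + 16  # 7.5px per character plus 16px padding
    return max(text_based, original_px * scale_factor)
```

A column whose longest cell is `pounds` (6 characters) gets at least 61px, even if the scaled original column was narrower.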
+
+### Toolbar
+
+`undo, redo, font-bold, font-italic, font-strikethrough, font-color, background, font-size, horizontal-align, vertical-align, text-wrap, merge-cell, border`
+
+---
+
+## Unified Grid (Backend)
+
+**Datei:** `klausur-service/backend/unified_grid.py`
+**Tests:** `tests/test_unified_grid.py` (10 Tests)
+
+Mergt alle Zonen in ein einzelnes Grid (fuer Export/Analyse):
+
+```
+POST /api/v1/ocr-pipeline/sessions/{id}/build-unified-grid
+GET  /api/v1/ocr-pipeline/sessions/{id}/unified-grid
+```
+
+- Dominante Zeilenhoehe = Median der Content-Row-Abstaende
+- Full-Width Boxen: Rows direkt integriert
+- Partial-Width Boxen: Extra-Rows eingefuegt wenn Box mehr Zeilen hat
+- Box-Zellen mit `source_zone_type: "box"` und `box_region` Metadaten
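Review note: the first bullet defines the dominant row height as the median of the distances between content rows. A minimal sketch, assuming each row is represented by its top y-coordinate. `dominant_row_height` is a hypothetical name, not a function confirmed to exist in `unified_grid.py`.

```python
from statistics import median


def dominant_row_height(row_tops: list[float]) -> float:
    """Median distance between consecutive content rows, given their top y-coordinates."""
    ordered = sorted(row_tops)
    gaps = [lower - upper for upper, lower in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two rows to measure a row distance")
    return median(gaps)
```

Using the median rather than the mean keeps a single large gap, such as the space left by an embedded box, from inflating the row height.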
+
+---
+
+## Dateistruktur
+
+### Backend (klausur-service)
+
+| Datei | Zeilen | Beschreibung |
+|-------|--------|--------------|
+| `grid_build_core.py` | 1943 | `_build_grid_core()` — Haupt-Grid-Aufbau |
+| `grid_editor_api.py` | 474 | REST-Endpoints (build, save, get, gutter, box, unified) |
+| `grid_editor_helpers.py` | 1737 | Helper: Spalten, Rows, Cells, Colspan, Header |
+| `smart_spell.py` | 587 | SmartSpellChecker |
+| `cv_box_layout.py` | 339 | Box-Layout-Klassifikation + Grid-Aufbau |
+| `unified_grid.py` | 425 | Unified Grid Builder |
+
+### Frontend (admin-lehrer)
+
+| Datei | Zeilen | Beschreibung |
+|-------|--------|--------------|
+| `StepBoxGridReview.tsx` | 283 | Box-Review Step 11 |
+| `StepAnsicht.tsx` | 112 | Ansicht Step 12 (Split-View) |
+| `SpreadsheetView.tsx` | ~160 | Fortune Sheet Integration |
+| `GridTable.tsx` | 652 | Grid-Editor Tabelle (Steps 9-11) |
+| `useGridEditor.ts` | 985 | Grid-Editor Hook |
+
+### Tests
+
+| Datei | Tests | Beschreibung |
+|-------|-------|--------------|
+| `test_smart_spell.py` | 43 | Spracherkennung, Boundary Repair, IPA-Schutz |
+| `test_box_layout.py` | 13 | Layout-Klassifikation, Bullet-Gruppierung |
+| `test_unified_grid.py` | 10 | Unified Grid, Box-Klassifikation |
+| **Gesamt** | **66** | |
+
+---
+
+## Aenderungshistorie
+
+| Datum | Aenderung |
+|-------|-----------|
+| 2026-04-15 | Fortune Sheet Multi-Sheet Tabs, Bullet-Points, Auto-Fit, Refactoring |
+| 2026-04-14 | Unified Grid, Ansicht Step, Colspan-Erkennung |
+| 2026-04-13 | Box-Grid-Review Step, Spalten in Boxen, Header/Footer Filter |
+| 2026-04-12 | SmartSpellChecker, Frequency Scoring, IPA-Schutz, Vocab-Worksheet Refactoring |
--- a/.claude/rules/vocab-worksheet.md
+++ b/.claude/rules/vocab-worksheet.md
@@ -188,11 +188,35 @@ ssh macmini "docker compose up -d klausur-service studio-v2"

 ---

+## Frontend Refactoring (2026-04-12)
+
+`page.tsx` wurde von 2337 Zeilen in 14 Dateien aufgeteilt:
+
+```
+studio-v2/app/vocab-worksheet/
+├── page.tsx                 # 198 Zeilen — Orchestrator
+├── types.ts                 # Interfaces, VocabWorksheetHook
+├── constants.ts             # API-Base, Formats, Defaults
+├── useVocabWorksheet.ts     # 843 Zeilen — Custom Hook (alle State + Logik)
+└── components/
+    ├── UploadScreen.tsx      # Session-Liste + Dokument-Auswahl
+    ├── PageSelection.tsx     # PDF-Seitenauswahl
+    ├── VocabularyTab.tsx     # Vokabel-Tabelle + IPA/Silben
+    ├── WorksheetTab.tsx      # Format-Auswahl + Konfiguration
+    ├── ExportTab.tsx         # PDF-Download
+    ├── OcrSettingsPanel.tsx   # OCR-Filter Einstellungen
+    ├── FullscreenPreview.tsx  # Vollbild-Vorschau Modal
+    ├── QRCodeModal.tsx        # QR-Upload Modal
+    └── OcrComparisonModal.tsx # OCR-Vergleich Modal
+```
+
+---
+
 ## Erweiterung: Neue Formate hinzufuegen

 1. **Backend**: Neuen Generator in `klausur-service/backend/` erstellen
 2. **API**: Neuen Endpoint in `vocab_worksheet_api.py` hinzufuegen
-3. **Frontend**: Format zu `worksheetFormats` Array in `page.tsx` hinzufuegen
+3. **Frontend**: Format zu `worksheetFormats` Array in `constants.ts` hinzufuegen
 4. **Doku**: Diese Datei aktualisieren

 ---
--- a/.claude/settings.json
+++ b/.claude/settings.json
@@ -0,0 +1,9 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash",
+      "Write",
+      "Read"
+    ]
+  }
+}
--- a/AGENTS.go.md
+++ b/AGENTS.go.md
@@ -0,0 +1,36 @@
+# AGENTS.go.md — Go/Gin Konventionen
+
+## Architektur
+
+- `handlers/`: HTTP Transport nur — Decode, Validate, Call Service, Encode Response
+- `service/` oder `usecase/`: Business Logic
+- `repo/`: Storage/Integration
+- `model/` oder `domain/`: Domain Entities
+- `tests/`: Table-driven Tests bevorzugen
+
+## Regeln
+
+1. Handler ≤40 LOC — nur Decode → Service → Encode
+2. Business Logic NICHT in Handlers verstecken
+3. Grosse Handler nach Resource/Verb splitten
+4. Request/Response DTOs nah am Transport halten
+5. Interfaces nur an echten Boundaries (nicht ueberall fuer Mocks)
+6. Keine Giant-Utility-Dateien
+7. Generated Files nicht manuell editieren
+
+## Split-Trigger
+
+- Handler-Datei ueberschreitet 400-500 LOC
+- Unrelated Endpoints zusammengruppiert
+- Encoding/Decoding dominiert die Handler-Datei
+- Service-Logik und Transport-Logik gemischt
+
+## Verifikation
+
+```bash
+gofmt -l . | grep -q . && exit 1
+go vet ./...
+golangci-lint run --timeout=5m
+go test -race ./...
+go build ./...
+```
--- a/AGENTS.python.md
+++ b/AGENTS.python.md
@@ -0,0 +1,36 @@
+# AGENTS.python.md — Python/FastAPI Konventionen
+
+## Architektur
+
+- `routes/` oder `api/`: Request/Response nur — kein Business Logic
+- `services/`: Business Logic
+- `repositories/`: Persistenz/Data Access
+- `schemas/`: Pydantic Models, nach Domain gesplittet
+- `tests/`: Spiegelt Produktions-Layout
+
+## Regeln
+
+1. Route-Dateien duenn halten (≤300 LOC)
+2. Wenn eine Route-Datei 300-400 LOC erreicht → nach Resource/Operation splitten
+3. Schema-Dateien nach Domain splitten wenn sie wachsen
+4. Modul-Level Singleton-Kopplung vermeiden (Tests patchen falsches Symbol)
+5. Patch immer das Symbol das vom getesteten Modul importiert wird
+6. Dependency Injection bevorzugen statt versteckte Imports
+7. Pydantic v2: `from __future__ import annotations` NICHT verwenden (bricht Pydantic)
+8. Migrationen getrennt von Refactorings halten
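Review note: rules 4 and 5 are easy to get wrong, so here is a small self-contained demonstration. The modules `ocr_client` and `grading` are hypothetical and are built in memory only so the example runs without extra files. In a real code base they would be ordinary `.py` files.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical module that defines the function.
ocr_client = types.ModuleType("ocr_client")
exec("def run_ocr(path):\n    return 'real'", ocr_client.__dict__)
sys.modules["ocr_client"] = ocr_client

# Hypothetical module under test; it binds its own name 'run_ocr' at import time.
grading = types.ModuleType("grading")
exec("from ocr_client import run_ocr\n"
     "def grade(path):\n"
     "    return run_ocr(path)", grading.__dict__)
sys.modules["grading"] = grading


def grade_with_patch(target: str) -> str:
    """Call grading.grade() while the given dotted target is replaced by a mock."""
    with patch(target, return_value="fake"):
        return grading.grade("scan.png")
```

Patching `ocr_client.run_ocr` leaves `grading.grade` untouched, because `grading` already holds its own reference to the original function. Patching `grading.run_ocr` replaces the name that the tested module actually looks up.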
+
+## Split-Trigger
+
+- Datei naehert sich oder ueberschreitet 500 LOC
+- Zirkulaere Imports erscheinen
+- Tests brauchen tiefes Patching
+- API-Schemas mischen verschiedene Domains
+- Service-Datei macht Transport UND DB-Logik
+
+## Verifikation
+
+```bash
+ruff check .
+mypy . --ignore-missing-imports --no-error-summary
+pytest tests/ -x -q --no-header
+```
--- a/AGENTS.typescript.md
+++ b/AGENTS.typescript.md
@@ -0,0 +1,55 @@
+# AGENTS.typescript.md — Next.js Konventionen
+
+## Architektur
+
+- `app/.../page.tsx`: Minimale Seiten-Komposition (≤250 LOC)
+- `app/.../actions.ts`: Server Actions
+- `app/.../queries.ts`: Data Loading
+- `app/.../_components/`: View-Teile (Colocation)
+- `app/.../_hooks/`: Seiten-spezifische Hooks (Colocation)
+- `types/` oder `types/*.ts`: Domain-spezifische Types
+- `schemas/`: Zod/Validierungs-Schemas
+- `lib/`: Shared Utilities
+
+## Regeln
+
+1. page.tsx duenn halten (≤250 LOC)
+2. Grosse Seiten frueh in Sections/Components splitten
+3. KEINE einzelne types.ts als Catch-All
+4. types.ts UND types/ Shadowing vermeiden (eines waehlen!)
+5. Server/Client Module-Grenzen explizit halten
+6. Pure Helpers und schmale Props bevorzugen
+7. API-Client Types getrennt von handgeschriebenen Domain Types
+
+## Colocation Pattern (bevorzugt)
+
+```
+app/(admin)/ai/rag/
+  page.tsx              ← duenn, komponiert nur
+  _components/
+    SearchPanel.tsx
+    ResultsTable.tsx
+    FilterBar.tsx
+  _hooks/
+    useRagSearch.ts
+  actions.ts            ← Server Actions
+  queries.ts            ← Data Fetching
+```
+
+## Split-Trigger
+
+- page.tsx ueberschreitet 250-350 LOC
+- types.ts ueberschreitet 200-300 LOC
+- Form-Logik, Server Actions und Rendering in einer Datei
+- Mehrere unabhaengig testbare Sections vorhanden
+- Imports werden bruechig
+
+## Verifikation
+
+```bash
+npx tsc --noEmit
+npm run lint
+npm run build
+```
+
+> `npm run build` ist PFLICHT — `tsc` allein reicht nicht.
--- a/admin-lehrer/app/(admin)/ai/gpu/page.tsx
+++ b/admin-lehrer/app/(admin)/ai/gpu/page.tsx
@@ -1,395 +0,0 @@
-'use client'
-
-/**
- * GPU Infrastructure Admin Page
- *
- * vast.ai GPU Management for LLM Processing
- * Part of KI-Werkzeuge
- */
-
-import { useEffect, useState, useCallback } from 'react'
-import { PagePurpose } from '@/components/common/PagePurpose'
-import { AIToolsSidebarResponsive } from '@/components/ai/AIToolsSidebar'
-
-interface VastStatus {
-  instance_id: number | null
-  status: string
-  gpu_name: string | null
-  dph_total: number | null
-  endpoint_base_url: string | null
-  last_activity: string | null
-  auto_shutdown_in_minutes: number | null
-  total_runtime_hours: number | null
-  total_cost_usd: number | null
-  account_credit: number | null
-  account_total_spend: number | null
-  session_runtime_minutes: number | null
-  session_cost_usd: number | null
-  message: string | null
-  error?: string
-}
-
-export default function GPUInfrastructurePage() {
-  const [status, setStatus] = useState<VastStatus | null>(null)
-  const [loading, setLoading] = useState(true)
-  const [actionLoading, setActionLoading] = useState<string | null>(null)
-  const [error, setError] = useState<string | null>(null)
-  const [message, setMessage] = useState<string | null>(null)
-
-  const API_PROXY = '/api/admin/gpu'
-
-  const fetchStatus = useCallback(async () => {
-    setLoading(true)
-    setError(null)
-
-    try {
-      const response = await fetch(API_PROXY)
-      const data = await response.json()
-
-      if (!response.ok) {
-        throw new Error(data.error || `HTTP ${response.status}`)
-      }
-
-      setStatus(data)
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Verbindungsfehler')
-      setStatus({
-        instance_id: null,
-        status: 'error',
-        gpu_name: null,
-        dph_total: null,
-        endpoint_base_url: null,
-        last_activity: null,
-        auto_shutdown_in_minutes: null,
-        total_runtime_hours: null,
-        total_cost_usd: null,
-        account_credit: null,
-        account_total_spend: null,
-        session_runtime_minutes: null,
-        session_cost_usd: null,
-        message: 'Verbindung fehlgeschlagen'
-      })
-    } finally {
-      setLoading(false)
-    }
-  }, [])
-
-  useEffect(() => {
-    fetchStatus()
-  }, [fetchStatus])
-
-  useEffect(() => {
-    const interval = setInterval(fetchStatus, 30000)
-    return () => clearInterval(interval)
-  }, [fetchStatus])
-
-  const powerOn = async () => {
-    setActionLoading('on')
-    setError(null)
-    setMessage(null)
-
-    try {
-      const response = await fetch(API_PROXY, {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ action: 'on' }),
-      })
-
-      const data = await response.json()
-
-      if (!response.ok) {
-        throw new Error(data.error || data.detail || 'Aktion fehlgeschlagen')
-      }
-
-      setMessage('Start angefordert')
-      setTimeout(fetchStatus, 3000)
-      setTimeout(fetchStatus, 10000)
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Fehler beim Starten')
-      fetchStatus()
-    } finally {
-      setActionLoading(null)
-    }
-  }
-
-  const powerOff = async () => {
-    setActionLoading('off')
-    setError(null)
-    setMessage(null)
-
-    try {
-      const response = await fetch(API_PROXY, {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ action: 'off' }),
-      })
-
-      const data = await response.json()
-
-      if (!response.ok) {
-        throw new Error(data.error || data.detail || 'Aktion fehlgeschlagen')
-      }
-
-      setMessage('Stop angefordert')
-      setTimeout(fetchStatus, 3000)
-      setTimeout(fetchStatus, 10000)
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Fehler beim Stoppen')
-      fetchStatus()
-    } finally {
-      setActionLoading(null)
-    }
-  }
-
-  const getStatusBadge = (s: string) => {
-    const baseClasses = 'px-3 py-1 rounded-full text-sm font-semibold uppercase'
-    switch (s) {
-      case 'running':
-        return `${baseClasses} bg-green-100 text-green-800`
-      case 'stopped':
-      case 'exited':
-        return `${baseClasses} bg-red-100 text-red-800`
-      case 'loading':
-      case 'scheduling':
-      case 'creating':
-      case 'starting...':
-      case 'stopping...':
-        return `${baseClasses} bg-yellow-100 text-yellow-800`
-      default:
-        return `${baseClasses} bg-slate-100 text-slate-600`
-    }
-  }
-
-  const getCreditColor = (credit: number | null) => {
-    if (credit === null) return 'text-slate-500'
-    if (credit < 5) return 'text-red-600'
-    if (credit < 15) return 'text-yellow-600'
-    return 'text-green-600'
-  }
-
-  return (
-    <div>
-      {/* Page Purpose */}
-      <PagePurpose
-        title="GPU Infrastruktur"
-        purpose="Verwalten Sie die vast.ai GPU-Instanzen fuer LLM-Verarbeitung und OCR. Starten/Stoppen Sie GPUs bei Bedarf und ueberwachen Sie Kosten in Echtzeit."
-        audience={['DevOps', 'Entwickler', 'System-Admins']}
-        architecture={{
-          services: ['vast.ai API', 'Ollama', 'VLLM'],
-          databases: ['PostgreSQL (Logs)'],
-        }}
-        relatedPages={[
-          { name: 'Test Quality (BQAS)', href: '/ai/test-quality', description: 'Golden Suite & Tests' },
-          { name: 'Magic Help', href: '/ai/magic-help', description: 'TrOCR Testing' },
-        ]}
-        collapsible={true}
-        defaultCollapsed={true}
-      />
-
-      {/* KI-Werkzeuge Sidebar */}
-      <AIToolsSidebarResponsive currentTool="gpu" />
-
-      {/* Status Cards */}
-      <div className="bg-white rounded-xl border border-slate-200 p-6 mb-6">
-        <div className="grid grid-cols-2 md:grid-cols-3 lg:grid-cols-6 gap-6">
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Status</div>
-            {loading ? (
-              <span className="px-3 py-1 rounded-full text-sm font-semibold bg-slate-100 text-slate-600">
-                Laden...
-              </span>
-            ) : (
-              <span className={getStatusBadge(
-                actionLoading === 'on' ? 'starting...' :
-                actionLoading === 'off' ? 'stopping...' :
-                status?.status || 'unknown'
-              )}>
-                {actionLoading === 'on' ? 'starting...' :
-                 actionLoading === 'off' ? 'stopping...' :
-                 status?.status || 'unbekannt'}
-              </span>
-            )}
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">GPU</div>
-            <div className="font-semibold text-slate-900">
-              {status?.gpu_name || '-'}
-            </div>
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Kosten/h</div>
-            <div className="font-semibold text-slate-900">
-              {status?.dph_total ? `$${status.dph_total.toFixed(3)}` : '-'}
-            </div>
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Auto-Stop</div>
-            <div className="font-semibold text-slate-900">
-              {status && status.auto_shutdown_in_minutes !== null
-                ? `${status.auto_shutdown_in_minutes} min`
-                : '-'}
-            </div>
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Budget</div>
-            <div className={`font-bold text-lg ${getCreditColor(status?.account_credit ?? null)}`}>
-              {status && status.account_credit !== null
-                ? `$${status.account_credit.toFixed(2)}`
-                : '-'}
-            </div>
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Session</div>
-            <div className="font-semibold text-slate-900">
-              {status && status.session_runtime_minutes !== null && status.session_cost_usd !== null
-                ? `${Math.round(status.session_runtime_minutes)} min / $${status.session_cost_usd.toFixed(3)}`
-                : '-'}
-            </div>
-          </div>
-        </div>
-
-        {/* Buttons */}
-        <div className="flex items-center gap-4 mt-6 pt-6 border-t border-slate-200">
-          <button
-            onClick={powerOn}
-            disabled={actionLoading !== null || status?.status === 'running'}
-            className="px-6 py-2 bg-orange-600 text-white rounded-lg font-medium hover:bg-orange-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
-          >
-            Starten
-          </button>
-          <button
-            onClick={powerOff}
-            disabled={actionLoading !== null || status?.status !== 'running'}
-            className="px-6 py-2 bg-red-600 text-white rounded-lg font-medium hover:bg-red-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
-          >
-            Stoppen
-          </button>
-          <button
-            onClick={fetchStatus}
-            disabled={loading}
-            className="px-4 py-2 border border-slate-300 text-slate-700 rounded-lg font-medium hover:bg-slate-50 disabled:opacity-50 transition-colors"
-          >
-            {loading ? 'Aktualisiere...' : 'Aktualisieren'}
-          </button>
-
-          {message && (
-            <span className="ml-4 text-sm text-green-600 font-medium">{message}</span>
-          )}
-          {error && (
-            <span className="ml-4 text-sm text-red-600 font-medium">{error}</span>
-          )}
-        </div>
-      </div>
-
-      {/* Extended Stats */}
-      <div className="grid grid-cols-1 lg:grid-cols-2 gap-6 mb-6">
-        <div className="bg-white rounded-xl border border-slate-200 p-6">
-          <h3 className="font-semibold text-slate-900 mb-4">Kosten-Uebersicht</h3>
-          <div className="space-y-4">
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Session Laufzeit</span>
-              <span className="font-semibold">
-                {status && status.session_runtime_minutes !== null
-                  ? `${Math.round(status.session_runtime_minutes)} Minuten`
-                  : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Session Kosten</span>
-              <span className="font-semibold">
-                {status && status.session_cost_usd !== null
-                  ? `$${status.session_cost_usd.toFixed(4)}`
-                  : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center pt-4 border-t border-slate-100">
-              <span className="text-slate-600">Gesamtlaufzeit</span>
-              <span className="font-semibold">
-                {status && status.total_runtime_hours !== null
-                  ? `${status.total_runtime_hours.toFixed(1)} Stunden`
-                  : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Gesamtkosten</span>
-              <span className="font-semibold">
-                {status && status.total_cost_usd !== null
-                  ? `$${status.total_cost_usd.toFixed(2)}`
-                  : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">vast.ai Ausgaben</span>
-              <span className="font-semibold">
-                {status && status.account_total_spend !== null
-                  ? `$${status.account_total_spend.toFixed(2)}`
-                  : '-'}
-              </span>
-            </div>
-          </div>
-        </div>
-
-        <div className="bg-white rounded-xl border border-slate-200 p-6">
-          <h3 className="font-semibold text-slate-900 mb-4">Instanz-Details</h3>
-          <div className="space-y-4">
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Instanz ID</span>
-              <span className="font-mono text-sm">
-                {status?.instance_id || '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">GPU</span>
-              <span className="font-semibold">
-                {status?.gpu_name || '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Stundensatz</span>
-              <span className="font-semibold">
-                {status?.dph_total ? `$${status.dph_total.toFixed(4)}/h` : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Letzte Aktivitaet</span>
-              <span className="text-sm">
-                {status?.last_activity
-                  ? new Date(status.last_activity).toLocaleString('de-DE')
-                  : '-'}
-              </span>
-            </div>
-            {status?.endpoint_base_url && status.status === 'running' && (
-              <div className="pt-4 border-t border-slate-100">
-                <div className="text-slate-600 text-sm mb-1">Endpoint</div>
-                <code className="text-xs bg-slate-100 px-2 py-1 rounded block overflow-x-auto">
-                  {status.endpoint_base_url}
-                </code>
-              </div>
-            )}
-          </div>
-        </div>
-      </div>
-
-      {/* Info */}
-      <div className="bg-violet-50 border border-violet-200 rounded-xl p-4">
-        <div className="flex gap-3">
-          <svg className="w-5 h-5 text-violet-600 flex-shrink-0 mt-0.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
-          </svg>
-          <div>
-            <h4 className="font-semibold text-violet-900">Auto-Shutdown</h4>
-            <p className="text-sm text-violet-800 mt-1">
-              Die GPU-Instanz wird automatisch gestoppt, wenn sie laengere Zeit inaktiv ist.
-              Der Status wird alle 30 Sekunden automatisch aktualisiert.
-            </p>
-          </div>
-        </div>
-      </div>
-    </div>
-  )
-}
--- a/admin-lehrer/app/(admin)/ai/model-management/page.tsx
+++ b/admin-lehrer/app/(admin)/ai/model-management/page.tsx
@@ -1,549 +0,0 @@
-'use client'
-
-/**
- * Model Management Page
- *
- * Manage ML model backends (PyTorch vs ONNX), view status,
- * run benchmarks, and configure inference settings.
- */
-
-import { useState, useEffect, useCallback } from 'react'
-import { PagePurpose } from '@/components/common/PagePurpose'
-
-const KLAUSUR_API = '/klausur-api'
-
-// ---------------------------------------------------------------------------
-// Types
-// ---------------------------------------------------------------------------
-
-type BackendMode = 'auto' | 'pytorch' | 'onnx'
-type ModelStatus = 'available' | 'not_found' | 'loading' | 'error'
-type Tab = 'overview' | 'benchmarks' | 'configuration'
-
-interface ModelInfo {
-  name: string
-  key: string
-  pytorch: { status: ModelStatus; size_mb: number; ram_mb: number }
-  onnx: { status: ModelStatus; size_mb: number; ram_mb: number; quantized: boolean }
-}
-
-interface BenchmarkRow {
-  model: string
-  backend: string
-  quantization: string
-  size_mb: number
-  ram_mb: number
-  inference_ms: number
-  load_time_s: number
-}
-
-interface StatusInfo {
-  active_backend: BackendMode
-  loaded_models: string[]
-  cache_hits: number
-  cache_misses: number
-  uptime_s: number
-}
-
-// ---------------------------------------------------------------------------
-// Mock data (used when backend is not available)
-// ---------------------------------------------------------------------------
-
-const MOCK_MODELS: ModelInfo[] = [
-  {
-    name: 'TrOCR Printed',
-    key: 'trocr_printed',
-    pytorch: { status: 'available', size_mb: 892, ram_mb: 1800 },
-    onnx: { status: 'available', size_mb: 234, ram_mb: 620, quantized: true },
-  },
-  {
-    name: 'TrOCR Handwritten',
-    key: 'trocr_handwritten',
-    pytorch: { status: 'available', size_mb: 892, ram_mb: 1800 },
-    onnx: { status: 'not_found', size_mb: 0, ram_mb: 0, quantized: false },
-  },
-  {
-    name: 'PP-DocLayout',
-    key: 'pp_doclayout',
-    pytorch: { status: 'not_found', size_mb: 0, ram_mb: 0 },
-    onnx: { status: 'available', size_mb: 48, ram_mb: 180, quantized: false },
-  },
-]
-
-const MOCK_BENCHMARKS: BenchmarkRow[] = [
-  { model: 'TrOCR Printed', backend: 'PyTorch', quantization: 'FP32', size_mb: 892, ram_mb: 1800, inference_ms: 142, load_time_s: 3.2 },
-  { model: 'TrOCR Printed', backend: 'ONNX', quantization: 'INT8', size_mb: 234, ram_mb: 620, inference_ms: 38, load_time_s: 0.8 },
-  { model: 'TrOCR Handwritten', backend: 'PyTorch', quantization: 'FP32', size_mb: 892, ram_mb: 1800, inference_ms: 156, load_time_s: 3.4 },
-  { model: 'PP-DocLayout', backend: 'ONNX', quantization: 'FP32', size_mb: 48, ram_mb: 180, inference_ms: 22, load_time_s: 0.3 },
-]
-
-const MOCK_STATUS: StatusInfo = {
-  active_backend: 'auto',
-  loaded_models: ['trocr_printed (ONNX)', 'pp_doclayout (ONNX)'],
-  cache_hits: 1247,
-  cache_misses: 83,
-  uptime_s: 86400,
-}
-
-// ---------------------------------------------------------------------------
-// Helpers
-// ---------------------------------------------------------------------------
-
-function StatusBadge({ status }: { status: ModelStatus }) {
-  const cls =
-    status === 'available'
-      ? 'bg-emerald-100 text-emerald-800 border-emerald-200'
-      : status === 'loading'
-        ? 'bg-blue-100 text-blue-800 border-blue-200'
-        : status === 'not_found'
-          ? 'bg-slate-100 text-slate-500 border-slate-200'
-          : 'bg-red-100 text-red-800 border-red-200'
-  const label =
-    status === 'available' ? 'Verfuegbar'
-      : status === 'loading' ? 'Laden...'
-        : status === 'not_found' ? 'Nicht vorhanden'
-          : 'Fehler'
-  return (
-    <span className={`inline-flex items-center px-2 py-0.5 rounded-full text-xs font-medium border ${cls}`}>
-      {label}
-    </span>
-  )
-}
-
-function formatBytes(mb: number) {
-  if (mb === 0) return '--'
-  if (mb >= 1000) return `${(mb / 1000).toFixed(1)} GB`
-  return `${mb} MB`
-}
-
-function formatUptime(seconds: number) {
-  const h = Math.floor(seconds / 3600)
-  const m = Math.floor((seconds % 3600) / 60)
-  if (h > 0) return `${h}h ${m}m`
-  return `${m}m`
-}
-
-// ---------------------------------------------------------------------------
-// Component
-// ---------------------------------------------------------------------------
-
-export default function ModelManagementPage() {
-  const [tab, setTab] = useState<Tab>('overview')
-  const [models, setModels] = useState<ModelInfo[]>(MOCK_MODELS)
-  const [benchmarks, setBenchmarks] = useState<BenchmarkRow[]>(MOCK_BENCHMARKS)
-  const [status, setStatus] = useState<StatusInfo>(MOCK_STATUS)
-  const [backend, setBackend] = useState<BackendMode>('auto')
-  const [saving, setSaving] = useState(false)
-  const [benchmarkRunning, setBenchmarkRunning] = useState(false)
-  const [usingMock, setUsingMock] = useState(false)
-
-  // Load status
-  const loadStatus = useCallback(async () => {
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/models/status`)
-      if (res.ok) {
-        const data = await res.json()
-        setStatus(data)
-        setBackend(data.active_backend || 'auto')
-        setUsingMock(false)
-      } else {
-        setUsingMock(true)
-      }
-    } catch {
-      setUsingMock(true)
-    }
-  }, [])
-
-  // Load models
-  const loadModels = useCallback(async () => {
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/models`)
-      if (res.ok) {
-        const data = await res.json()
-        if (data.models?.length) setModels(data.models)
-      }
-    } catch {
-      // Keep mock data
-    }
-  }, [])
-
-  // Load benchmarks
-  const loadBenchmarks = useCallback(async () => {
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/models/benchmarks`)
-      if (res.ok) {
-        const data = await res.json()
-        if (data.benchmarks?.length) setBenchmarks(data.benchmarks)
-      }
-    } catch {
-      // Keep mock data
-    }
-  }, [])
-
-  useEffect(() => {
-    loadStatus()
-    loadModels()
-    loadBenchmarks()
-  }, [loadStatus, loadModels, loadBenchmarks])
-
-  // Save backend preference
-  const saveBackend = async (mode: BackendMode) => {
-    setBackend(mode)
-    setSaving(true)
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/models/backend`, {
-        method: 'PUT',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ backend: mode }),
-      })
-      await loadStatus()
-    } catch {
-      // Silently handle — mock mode
-    } finally {
-      setSaving(false)
-    }
-  }
-
-  // Run benchmark
-  const runBenchmark = async () => {
-    setBenchmarkRunning(true)
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/models/benchmark`, {
-        method: 'POST',
-      })
-      if (res.ok) {
-        const data = await res.json()
-        if (data.benchmarks?.length) setBenchmarks(data.benchmarks)
-      }
-      await loadBenchmarks()
-    } catch {
-      // Keep existing data
-    } finally {
-      setBenchmarkRunning(false)
-    }
-  }
-
-  const tabs: { key: Tab; label: string }[] = [
-    { key: 'overview', label: 'Uebersicht' },
-    { key: 'benchmarks', label: 'Benchmarks' },
-    { key: 'configuration', label: 'Konfiguration' },
-  ]
-
-  return (
-    <div className="space-y-6">
-      <div className="max-w-7xl mx-auto p-6 space-y-6">
-        <PagePurpose
-          title="Model Management"
-          purpose="Verwaltung der ML-Modelle fuer OCR und Layout-Erkennung. Vergleich von PyTorch- und ONNX-Backends, Benchmark-Tests und Backend-Konfiguration."
-          audience={['Entwickler', 'DevOps']}
-          defaultCollapsed
-          architecture={{
-            services: ['klausur-service (FastAPI, Port 8086)'],
-            databases: ['Dateisystem (Modell-Dateien)'],
-          }}
-          relatedPages={[
-            { name: 'OCR Pipeline', href: '/ai/ocr-pipeline', description: 'OCR-Pipeline ausfuehren' },
-            { name: 'OCR Vergleich', href: '/ai/ocr-compare', description: 'OCR-Methoden vergleichen' },
-            { name: 'GPU Infrastruktur', href: '/ai/gpu', description: 'GPU-Ressourcen verwalten' },
-          ]}
-        />
-
-        {/* Header */}
-        <div className="flex items-center justify-between">
-          <div>
-            <h1 className="text-2xl font-bold text-slate-900">Model Management</h1>
-            <p className="text-sm text-slate-500 mt-1">
-              {models.length} Modelle konfiguriert
-              {usingMock && (
-                <span className="ml-2 text-xs bg-amber-100 text-amber-700 px-1.5 py-0.5 rounded">
-                  Mock-Daten (Backend nicht erreichbar)
-                </span>
-              )}
-            </p>
-          </div>
-        </div>
-
-        {/* Status Cards */}
-        <div className="grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-4 gap-4">
-          <div className="bg-white rounded-lg border border-slate-200 px-4 py-3">
-            <p className="text-xs text-slate-500 uppercase font-medium">Aktives Backend</p>
-            <p className="text-lg font-semibold text-slate-900 mt-1">{status.active_backend.toUpperCase()}</p>
-          </div>
-          <div className="bg-white rounded-lg border border-slate-200 px-4 py-3">
-            <p className="text-xs text-slate-500 uppercase font-medium">Geladene Modelle</p>
-            <p className="text-lg font-semibold text-slate-900 mt-1">{status.loaded_models.length}</p>
-          </div>
-          <div className="bg-white rounded-lg border border-slate-200 px-4 py-3">
-            <p className="text-xs text-slate-500 uppercase font-medium">Cache Hit-Rate</p>
-            <p className="text-lg font-semibold text-slate-900 mt-1">
-              {status.cache_hits + status.cache_misses > 0
-                ? `${((status.cache_hits / (status.cache_hits + status.cache_misses)) * 100).toFixed(1)}%`
-                : '--'}
-            </p>
-          </div>
-          <div className="bg-white rounded-lg border border-slate-200 px-4 py-3">
-            <p className="text-xs text-slate-500 uppercase font-medium">Uptime</p>
-            <p className="text-lg font-semibold text-slate-900 mt-1">{formatUptime(status.uptime_s)}</p>
-          </div>
-        </div>
-
-        {/* Tabs */}
-        <div className="border-b border-slate-200">
-          <nav className="flex gap-4">
-            {tabs.map(t => (
-              <button
-                key={t.key}
-                onClick={() => setTab(t.key)}
-                className={`pb-3 px-1 text-sm font-medium border-b-2 transition-colors ${
-                  tab === t.key
-                    ? 'border-teal-500 text-teal-600'
-                    : 'border-transparent text-slate-500 hover:text-slate-700'
-                }`}
-              >
-                {t.label}
-              </button>
-            ))}
-          </nav>
-        </div>
-
-        {/* Overview Tab */}
-        {tab === 'overview' && (
-          <div className="space-y-4">
-            <h3 className="text-sm font-medium text-slate-700">Verfuegbare Modelle</h3>
-            <div className="grid gap-4 sm:grid-cols-2 lg:grid-cols-3">
-              {models.map(m => (
-                <div key={m.key} className="bg-white rounded-lg border border-slate-200 overflow-hidden">
-                  <div className="px-4 py-3 border-b border-slate-100">
-                    <h4 className="font-semibold text-slate-900">{m.name}</h4>
-                    <p className="text-xs text-slate-400 mt-0.5 font-mono">{m.key}</p>
-                  </div>
-                  <div className="px-4 py-3 space-y-3">
-                    {/* PyTorch */}
-                    <div className="flex items-center justify-between">
-                      <div className="flex items-center gap-2">
-                        <span className="text-xs font-medium text-slate-600 w-16">PyTorch</span>
-                        <StatusBadge status={m.pytorch.status} />
-                      </div>
-                      {m.pytorch.status === 'available' && (
-                        <span className="text-xs text-slate-400">
-                          {formatBytes(m.pytorch.size_mb)} / {formatBytes(m.pytorch.ram_mb)} RAM
-                        </span>
-                      )}
-                    </div>
-                    {/* ONNX */}
-                    <div className="flex items-center justify-between">
-                      <div className="flex items-center gap-2">
-                        <span className="text-xs font-medium text-slate-600 w-16">ONNX</span>
-                        <StatusBadge status={m.onnx.status} />
-                      </div>
-                      {m.onnx.status === 'available' && (
-                        <span className="text-xs text-slate-400">
-                          {formatBytes(m.onnx.size_mb)} / {formatBytes(m.onnx.ram_mb)} RAM
-                          {m.onnx.quantized && (
-                            <span className="ml-1 text-xs bg-violet-100 text-violet-700 px-1 rounded">INT8</span>
-                          )}
-                        </span>
-                      )}
-                    </div>
-                  </div>
-                </div>
-              ))}
-            </div>
-
-            {/* Loaded Models List */}
-            {status.loaded_models.length > 0 && (
-              <div>
-                <h3 className="text-sm font-medium text-slate-700 mb-2">Aktuell geladen</h3>
-                <div className="flex flex-wrap gap-2">
-                  {status.loaded_models.map((m, i) => (
-                    <span key={i} className="inline-flex items-center px-3 py-1 rounded-full text-sm bg-teal-50 text-teal-700 border border-teal-200">
-                      {m}
-                    </span>
-                  ))}
-                </div>
-              </div>
-            )}
-          </div>
-        )}
-
-        {/* Benchmarks Tab */}
-        {tab === 'benchmarks' && (
-          <div className="space-y-4">
-            <div className="flex items-center justify-between">
-              <h3 className="text-sm font-medium text-slate-700">PyTorch vs ONNX Vergleich</h3>
-              <button
-                onClick={runBenchmark}
-                disabled={benchmarkRunning}
-                className="inline-flex items-center gap-2 px-4 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 disabled:opacity-50 disabled:cursor-not-allowed text-sm font-medium transition-colors"
-              >
-                {benchmarkRunning ? (
-                  <>
-                    <svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
-                      <circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
-                      <path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
-                    </svg>
-                    Benchmark laeuft...
-                  </>
-                ) : (
-                  'Benchmark starten'
-                )}
-              </button>
-            </div>
-
-            <div className="bg-white rounded-lg border border-slate-200 overflow-hidden">
-              <div className="overflow-x-auto">
-                <table className="w-full text-sm">
-                  <thead>
-                    <tr className="border-b border-slate-200 bg-slate-50 text-left text-slate-500">
-                      <th className="px-4 py-3 font-medium">Modell</th>
-                      <th className="px-4 py-3 font-medium">Backend</th>
-                      <th className="px-4 py-3 font-medium">Quantisierung</th>
-                      <th className="px-4 py-3 font-medium text-right">Groesse</th>
-                      <th className="px-4 py-3 font-medium text-right">RAM</th>
-                      <th className="px-4 py-3 font-medium text-right">Inferenz</th>
-                      <th className="px-4 py-3 font-medium text-right">Ladezeit</th>
-                    </tr>
-                  </thead>
-                  <tbody>
-                    {benchmarks.map((b, i) => (
-                      <tr key={i} className="border-b border-slate-100 hover:bg-slate-50">
-                        <td className="px-4 py-3 font-medium text-slate-900">{b.model}</td>
-                        <td className="px-4 py-3">
-                          <span className={`inline-flex items-center px-2 py-0.5 rounded text-xs font-medium ${
-                            b.backend === 'ONNX'
-                              ? 'bg-violet-100 text-violet-700'
-                              : 'bg-orange-100 text-orange-700'
-                          }`}>
-                            {b.backend}
-                          </span>
-                        </td>
-                        <td className="px-4 py-3 text-slate-600">{b.quantization}</td>
-                        <td className="px-4 py-3 text-right text-slate-600">{formatBytes(b.size_mb)}</td>
-                        <td className="px-4 py-3 text-right text-slate-600">{formatBytes(b.ram_mb)}</td>
-                        <td className="px-4 py-3 text-right">
-                          <span className={`font-mono ${b.inference_ms < 50 ? 'text-emerald-600' : b.inference_ms < 100 ? 'text-amber-600' : 'text-red-600'}`}>
-                            {b.inference_ms} ms
-                          </span>
-                        </td>
-                        <td className="px-4 py-3 text-right text-slate-500">{b.load_time_s.toFixed(1)}s</td>
-                      </tr>
-                    ))}
-                  </tbody>
-                </table>
-              </div>
-            </div>
-
-            {benchmarks.length === 0 && (
-              <div className="text-center py-12 text-slate-400">
-                <p className="text-lg">Keine Benchmark-Daten</p>
-                <p className="text-sm mt-1">Klicken Sie &quot;Benchmark starten&quot; um einen Vergleich durchzufuehren.</p>
-              </div>
-            )}
-          </div>
-        )}
-
-        {/* Configuration Tab */}
-        {tab === 'configuration' && (
-          <div className="space-y-6">
-            {/* Backend Selector */}
-            <div className="bg-white rounded-lg border border-slate-200 p-5">
-              <h3 className="text-sm font-semibold text-slate-900 mb-1">Inference Backend</h3>
-              <p className="text-sm text-slate-500 mb-4">
-                Waehlen Sie welches Backend fuer die Modell-Inferenz verwendet werden soll.
-              </p>
-              <div className="space-y-3">
-                {([
-                  {
-                    mode: 'auto' as const,
-                    label: 'Auto',
-                    desc: 'ONNX wenn verfuegbar, Fallback auf PyTorch.',
-                  },
-                  {
-                    mode: 'pytorch' as const,
-                    label: 'PyTorch',
-                    desc: 'Immer PyTorch verwenden. Hoeherer RAM-Verbrauch, volle Flexibilitaet.',
-                  },
-                  {
-                    mode: 'onnx' as const,
-                    label: 'ONNX',
-                    desc: 'Immer ONNX verwenden. Schneller und weniger RAM, Fehler wenn nicht vorhanden.',
-                  },
-                ] as const).map(opt => (
-                  <label
-                    key={opt.mode}
-                    className={`flex items-start gap-3 p-3 rounded-lg border cursor-pointer transition-colors ${
-                      backend === opt.mode
-                        ? 'border-teal-300 bg-teal-50'
-                        : 'border-slate-200 hover:bg-slate-50'
-                    }`}
-                  >
-                    <input
-                      type="radio"
-                      name="backend"
-                      value={opt.mode}
-                      checked={backend === opt.mode}
-                      onChange={() => saveBackend(opt.mode)}
-                      disabled={saving}
-                      className="mt-1 text-teal-600 focus:ring-teal-500"
-                    />
-                    <div>
-                      <span className="font-medium text-slate-900">{opt.label}</span>
-                      <p className="text-sm text-slate-500 mt-0.5">{opt.desc}</p>
-                    </div>
-                  </label>
-                ))}
-              </div>
-              {saving && (
-                <p className="text-xs text-teal-600 mt-3">Speichere...</p>
-              )}
-            </div>
-
-            {/* Model Details Table */}
-            <div className="bg-white rounded-lg border border-slate-200 p-5">
-              <h3 className="text-sm font-semibold text-slate-900 mb-4">Modell-Details</h3>
-              <div className="overflow-x-auto">
-                <table className="w-full text-sm">
-                  <thead>
-                    <tr className="border-b border-slate-200 text-left text-slate-500">
-                      <th className="pb-2 font-medium">Modell</th>
-                      <th className="pb-2 font-medium">PyTorch</th>
-                      <th className="pb-2 font-medium text-right">Groesse (PT)</th>
-                      <th className="pb-2 font-medium">ONNX</th>
-                      <th className="pb-2 font-medium text-right">Groesse (ONNX)</th>
-                      <th className="pb-2 font-medium text-right">Einsparung</th>
-                    </tr>
-                  </thead>
-                  <tbody>
-                    {models.map(m => {
-                      const ptAvail = m.pytorch.status === 'available'
-                      const oxAvail = m.onnx.status === 'available'
-                      const savings = ptAvail && oxAvail && m.pytorch.size_mb > 0
-                        ? Math.round((1 - m.onnx.size_mb / m.pytorch.size_mb) * 100)
-                        : null
-                      return (
-                        <tr key={m.key} className="border-b border-slate-100">
-                          <td className="py-2.5 font-medium text-slate-900">{m.name}</td>
-                          <td className="py-2.5"><StatusBadge status={m.pytorch.status} /></td>
-                          <td className="py-2.5 text-right text-slate-500">{ptAvail ? formatBytes(m.pytorch.size_mb) : '--'}</td>
-                          <td className="py-2.5"><StatusBadge status={m.onnx.status} /></td>
-                          <td className="py-2.5 text-right text-slate-500">{oxAvail ? formatBytes(m.onnx.size_mb) : '--'}</td>
-                          <td className="py-2.5 text-right">
-                            {savings !== null ? (
-                              <span className="text-emerald-600 font-medium">-{savings}%</span>
-                            ) : (
-                              <span className="text-slate-300">--</span>
-                            )}
-                          </td>
-                        </tr>
-                      )
-                    })}
-                  </tbody>
-                </table>
-              </div>
-            </div>
-          </div>
-        )}
-      </div>
-    </div>
-  )
-}
--- a/admin-lehrer/app/(admin)/ai/ocr-compare/page.tsx
+++ b/admin-lehrer/app/(admin)/ai/ocr-compare/page.tsx
--- a/admin-lehrer/app/(admin)/ai/ocr-kombi/page.tsx
+++ b/admin-lehrer/app/(admin)/ai/ocr-kombi/page.tsx
@@ -0,0 +1,173 @@
+'use client'
+
+import { Suspense } from 'react'
+import { PagePurpose } from '@/components/common/PagePurpose'
+import { KombiStepper } from '@/components/ocr-kombi/KombiStepper'
+import { SessionList } from '@/components/ocr-kombi/SessionList'
+import { SessionHeader } from '@/components/ocr-kombi/SessionHeader'
+import { StepUpload } from '@/components/ocr-kombi/StepUpload'
+import { StepOrientation } from '@/components/ocr-kombi/StepOrientation'
+import { StepPageSplit } from '@/components/ocr-kombi/StepPageSplit'
+import { StepDeskew } from '@/components/ocr-kombi/StepDeskew'
+import { StepDewarp } from '@/components/ocr-kombi/StepDewarp'
+import { StepContentCrop } from '@/components/ocr-kombi/StepContentCrop'
+import { StepOcr } from '@/components/ocr-kombi/StepOcr'
+import { StepStructure } from '@/components/ocr-kombi/StepStructure'
+import { StepGridBuild } from '@/components/ocr-kombi/StepGridBuild'
+import { StepGridReview } from '@/components/ocr-kombi/StepGridReview'
+import { StepGutterRepair } from '@/components/ocr-kombi/StepGutterRepair'
+import { StepBoxGridReview } from '@/components/ocr-kombi/StepBoxGridReview'
+import { StepAnsicht } from '@/components/ocr-kombi/StepAnsicht'
+import { StepGroundTruth } from '@/components/ocr-kombi/StepGroundTruth'
+import { useKombiPipeline } from './useKombiPipeline'
+
+function OcrKombiContent() {
+  const {
+    currentStep,
+    sessionId,
+    sessionName,
+    loadingSessions,
+    activeCategory,
+    isGroundTruth,
+    pageNumber,
+    steps,
+    gridSaveRef,
+    groupedSessions,
+    loadSessions,
+    openSession,
+    handleStepClick,
+    handleNext,
+    handleNewSession,
+    deleteSession,
+    renameSession,
+    updateCategory,
+    setSessionId,
+    setSessionName,
+    setIsGroundTruth,
+  } = useKombiPipeline()
+
+  const renderStep = () => {
+    switch (currentStep) {
+      case 0:
+        return (
+          <StepUpload
+            sessionId={sessionId}
+            onUploaded={(sid, name) => {
+              setSessionId(sid)
+              setSessionName(name)
+              loadSessions()
+            }}
+            onNext={handleNext}
+          />
+        )
+      case 1:
+        return (
+          <StepOrientation
+            sessionId={sessionId}
+            onNext={() => handleNext()}
+            onSessionList={() => { loadSessions(); handleNewSession() }}
+          />
+        )
+      case 2:
+        return (
+          <StepPageSplit
+            sessionId={sessionId}
+            sessionName={sessionName}
+            onNext={handleNext}
+            onSplitComplete={(childId, childName) => {
+              // Switch to the first child session and refresh the list
+              setSessionId(childId)
+              setSessionName(childName)
+              loadSessions()
+            }}
+          />
+        )
+      case 3:
+        return <StepDeskew sessionId={sessionId} onNext={handleNext} />
+      case 4:
+        return <StepDewarp sessionId={sessionId} onNext={handleNext} />
+      case 5:
+        return <StepContentCrop sessionId={sessionId} onNext={handleNext} />
+      case 6:
+        return <StepOcr sessionId={sessionId} onNext={handleNext} />
+      case 7:
+        return <StepStructure sessionId={sessionId} onNext={handleNext} />
+      case 8:
+        return <StepGridBuild sessionId={sessionId} onNext={handleNext} />
+      case 9:
+        return <StepGridReview sessionId={sessionId} onNext={handleNext} saveRef={gridSaveRef} />
+      case 10:
+        return <StepGutterRepair sessionId={sessionId} onNext={handleNext} />
+      case 11:
+        return <StepBoxGridReview sessionId={sessionId} onNext={handleNext} />
+      case 12:
+        return <StepAnsicht sessionId={sessionId} onNext={handleNext} />
+      case 13:
+        return (
+          <StepGroundTruth
+            sessionId={sessionId}
+            isGroundTruth={isGroundTruth}
+            onMarked={() => setIsGroundTruth(true)}
+            gridSaveRef={gridSaveRef}
+          />
+        )
+      default:
+        return null
+    }
+  }
+
+  return (
+    <div className="space-y-6">
+      <PagePurpose
+        title="OCR Kombi Pipeline"
+        purpose="Modulare 14-Schritt-Pipeline: Upload, Vorverarbeitung, Dual-Engine-OCR (PP-OCRv5 + Tesseract), Strukturerkennung, Grid-Aufbau und Review. Multi-Page-Dokument-Unterstuetzung."
+        audience={['Entwickler']}
+        architecture={{
+          services: ['klausur-service (FastAPI)', 'OpenCV', 'Tesseract', 'PaddleOCR'],
+          databases: ['PostgreSQL Sessions'],
+        }}
+        relatedPages={[
+          { name: 'OCR Regression', href: '/ai/ocr-regression', description: 'Regressionstests' },
+        ]}
+        defaultCollapsed
+      />
+
+      <SessionList
+        items={groupedSessions()}
+        loading={loadingSessions}
+        activeSessionId={sessionId}
+        onOpenSession={(sid) => openSession(sid)}
+        onNewSession={handleNewSession}
+        onDeleteSession={deleteSession}
+        onRenameSession={renameSession}
+        onUpdateCategory={updateCategory}
+      />
+
+      {sessionId && sessionName && (
+        <SessionHeader
+          sessionName={sessionName}
+          activeCategory={activeCategory}
+          isGroundTruth={isGroundTruth}
+          pageNumber={pageNumber}
+          onUpdateCategory={(cat) => updateCategory(sessionId, cat)}
+        />
+      )}
+
+      <KombiStepper
+        steps={steps}
+        currentStep={currentStep}
+        onStepClick={handleStepClick}
+      />
+
+      <div className="min-h-[400px]">{renderStep()}</div>
+    </div>
+  )
+}
+
+export default function OcrKombiPage() {
+  return (
+    <Suspense fallback={<div className="p-4 text-sm text-gray-400">Lade...</div>}>
+      <OcrKombiContent />
+    </Suspense>
+  )
+}
--- a/admin-lehrer/app/(admin)/ai/ocr-kombi/types.ts
+++ b/admin-lehrer/app/(admin)/ai/ocr-kombi/types.ts
@@ -0,0 +1,266 @@
+// OCR Pipeline Types — migrated from deleted ocr-pipeline/types.ts
+
+export type PipelineStepStatus = 'pending' | 'active' | 'completed' | 'failed' | 'skipped'
+
+export interface PipelineStep {
+  id: string
+  name: string
+  icon: string
+  status: PipelineStepStatus
+}
+
+export type DocumentCategory =
+  | 'vokabelseite' | 'woerterbuch' | 'buchseite' | 'arbeitsblatt' | 'klausurseite'
+  | 'mathearbeit' | 'statistik' | 'zeitung' | 'formular' | 'handschrift' | 'sonstiges'
+
+export const DOCUMENT_CATEGORIES: { value: DocumentCategory; label: string; icon: string }[] = [
+  { value: 'vokabelseite', label: 'Vokabelseite', icon: '📖' },
+  { value: 'woerterbuch', label: 'Woerterbuch', icon: '📕' },
+  { value: 'buchseite', label: 'Buchseite', icon: '📚' },
+  { value: 'arbeitsblatt', label: 'Arbeitsblatt', icon: '📝' },
+  { value: 'klausurseite', label: 'Klausurseite', icon: '📄' },
+  { value: 'mathearbeit', label: 'Mathearbeit', icon: '🔢' },
+  { value: 'statistik', label: 'Statistik', icon: '📊' },
+  { value: 'zeitung', label: 'Zeitung', icon: '📰' },
+  { value: 'formular', label: 'Formular', icon: '📋' },
+  { value: 'handschrift', label: 'Handschrift', icon: '✍️' },
+  { value: 'sonstiges', label: 'Sonstiges', icon: '📎' },
+]
+
+export interface SessionListItem {
+  id: string
+  name: string
+  filename: string
+  status: string
+  current_step: number
+  document_category?: DocumentCategory
+  doc_type?: string
+  parent_session_id?: string
+  document_group_id?: string
+  page_number?: number
+  is_ground_truth?: boolean
+  created_at: string
+  updated_at?: string
+}
+
+export interface SubSession {
+  id: string
+  name: string
+  box_index: number
+  current_step?: number
+  status?: string
+}
+
+export interface OrientationResult {
+  orientation_degrees: number
+  corrected: boolean
+  duration_seconds: number
+}
+
+export interface CropResult {
+  crop_applied: boolean
+  crop_rect?: { x: number; y: number; width: number; height: number }
+  crop_rect_pct?: { x: number; y: number; width: number; height: number }
+  original_size: { width: number; height: number }
+  cropped_size: { width: number; height: number }
+  detected_format?: string
+  format_confidence?: number
+  aspect_ratio?: number
+  border_fractions?: { top: number; bottom: number; left: number; right: number }
+  skipped?: boolean
+  duration_seconds?: number
+}
+
+export interface DeskewResult {
+  session_id: string
+  angle_hough: number
+  angle_word_alignment: number
+  angle_iterative?: number
+  angle_residual?: number
+  angle_textline?: number
+  angle_applied: number
+  method_used: 'hough' | 'word_alignment' | 'manual' | 'iterative' | 'two_pass' | 'three_pass' | 'manual_combined'
+  confidence: number
+  duration_seconds: number
+  deskewed_image_url: string
+  binarized_image_url: string
+}
+
+export interface DewarpDetection {
+  method: string
+  shear_degrees: number
+  confidence: number
+}
+
+export interface DewarpResult {
+  session_id: string
+  method_used: string
+  shear_degrees: number
+  confidence: number
+  duration_seconds: number
+  dewarped_image_url: string
+  detections?: DewarpDetection[]
+}
+
+export interface SessionInfo {
+  session_id: string
+  filename: string
+  name?: string
+  image_width: number
+  image_height: number
+  original_image_url: string
+  current_step?: number
+  document_category?: DocumentCategory
+  doc_type?: string
+  orientation_result?: OrientationResult
+  crop_result?: CropResult
+  deskew_result?: DeskewResult
+  dewarp_result?: DewarpResult
+  sub_sessions?: SubSession[]
+  parent_session_id?: string
+  box_index?: number
+  document_group_id?: string
+  page_number?: number
+}
+
+export interface StructureGraphic {
+  x: number; y: number; w: number; h: number
+  area: number; shape: string; color_name: string; color_hex: string; confidence: number
+}
+
+export interface ExcludeRegion {
+  x: number; y: number; w: number; h: number; label?: string
+}
+
+export interface StructureBox {
+  x: number; y: number; w: number; h: number
+  confidence: number; border_thickness: number
+  bg_color_name?: string; bg_color_hex?: string
+}
+
+export interface StructureZone {
+  index: number; zone_type: 'content' | 'box'
+  x: number; y: number; w: number; h: number
+}
+
+export interface DocLayoutRegion {
+  x: number; y: number; w: number; h: number
+  class_name: string; confidence: number
+}
+
+export interface StructureResult {
+  image_width: number; image_height: number
+  content_bounds: { x: number; y: number; w: number; h: number }
+  boxes: StructureBox[]; zones: StructureZone[]
+  graphics: StructureGraphic[]; exclude_regions?: ExcludeRegion[]
+  color_pixel_counts: Record<string, number>
+  has_words: boolean; word_count: number
+  border_ghosts_removed?: number; duration_seconds: number
+  layout_regions?: DocLayoutRegion[]
+  detection_method?: 'opencv' | 'ppdoclayout'
+}
+
+export interface WordBbox { x: number; y: number; w: number; h: number }
+
+export interface OcrWordBox {
+  text: string; left: number; top: number; width: number; height: number; conf: number
+  color?: string; color_name?: string; recovered?: boolean
+}
+
+export interface ColumnMeta { index: number; type: string; x: number; width: number }
+
+export interface GridCell {
+  cell_id: string; row_index: number; col_index: number; col_type: string
+  text: string; confidence: number; bbox_px: WordBbox; bbox_pct: WordBbox
+  ocr_engine?: string; is_bold?: boolean
+  status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
+  word_boxes?: OcrWordBox[]
+}
+
+export interface WordEntry {
+  row_index: number; english: string; german: string; example: string
+  source_page?: string; marker?: string; confidence: number
+  bbox: WordBbox; bbox_en: WordBbox | null; bbox_de: WordBbox | null; bbox_ex: WordBbox | null
+  bbox_ref?: WordBbox | null; bbox_marker?: WordBbox | null
+  status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
+}
+
+export interface GridResult {
+  cells: GridCell[]
+  grid_shape: { rows: number; cols: number; total_cells: number }
+  columns_used: ColumnMeta[]
+  layout: 'vocab' | 'generic'
+  image_width: number; image_height: number; duration_seconds: number
+  ocr_engine?: string; vocab_entries?: WordEntry[]; entries?: WordEntry[]; entry_count?: number
+  summary: {
+    total_cells: number; non_empty_cells: number; low_confidence: number
+    total_entries?: number; with_english?: number; with_german?: number
+  }
+  llm_review?: {
+    changes: { row_index: number; field: string; old: string; new: string }[]
+    model_used: string; duration_ms: number; entries_corrected: number
+    applied_count?: number; applied_at?: string
+  }
+}
+
+// --- Kombi V2 Pipeline ---
+
+export const KOMBI_V2_STEPS: PipelineStep[] = [
+  { id: 'upload',        name: 'Upload',             icon: '📤', status: 'pending' },
+  { id: 'orientation',   name: 'Orientierung',       icon: '🔄', status: 'pending' },
+  { id: 'page-split',    name: 'Seitentrennung',     icon: '📖', status: 'pending' },
+  { id: 'deskew',        name: 'Begradigung',        icon: '📐', status: 'pending' },
+  { id: 'dewarp',        name: 'Entzerrung',         icon: '🔧', status: 'pending' },
+  { id: 'content-crop',  name: 'Zuschneiden',        icon: '✂️', status: 'pending' },
+  { id: 'ocr',           name: 'OCR',                icon: '🔀', status: 'pending' },
+  { id: 'structure',     name: 'Strukturerkennung',  icon: '🔍', status: 'pending' },
+  { id: 'grid-build',    name: 'Grid-Aufbau',        icon: '🧱', status: 'pending' },
+  { id: 'grid-review',   name: 'Grid-Review',        icon: '📊', status: 'pending' },
+  { id: 'gutter-repair', name: 'Wortkorrektur',      icon: '🩹', status: 'pending' },
+  { id: 'box-review',    name: 'Box-Review',         icon: '📦', status: 'pending' },
+  { id: 'ansicht',       name: 'Ansicht',            icon: '👁️', status: 'pending' },
+  { id: 'ground-truth',  name: 'Ground Truth',       icon: '✅', status: 'pending' },
+]
+
+export const KOMBI_V2_UI_TO_DB: Record<number, number> = {
+  0: 1, 1: 2, 2: 2, 3: 3, 4: 4, 5: 5, 6: 8, 7: 9, 8: 10, 9: 11, 10: 11, 11: 11, 12: 11, 13: 12,
+}
+
+export function dbStepToKombiV2Ui(dbStep: number): number {
+  if (dbStep <= 1) return 0
+  if (dbStep === 2) return 1
+  if (dbStep === 3) return 3
+  if (dbStep === 4) return 4
+  if (dbStep === 5) return 5
+  if (dbStep <= 8) return 6
+  if (dbStep === 9) return 7
+  if (dbStep === 10) return 8
+  if (dbStep === 11) return 9
+  return 13
+}
+
+export interface DocumentGroup {
+  group_id: string; title: string; page_count: number; sessions: DocumentGroupSession[]
+}
+
+export interface DocumentGroupSession {
+  id: string; name: string; page_number: number; current_step: number
+  status: string; document_category?: DocumentCategory; created_at: string
+}
+
+export type OcrEngineSource = 'both' | 'paddle_only' | 'tesseract_only' | 'conflict_paddle' | 'conflict_tesseract'
+
+export interface OcrTransparentWord {
+  text: string; left: number; top: number; width: number; height: number
+  conf: number; engine_source: OcrEngineSource
+}
+
+export interface OcrTransparentResult {
+  raw_tesseract: { words: OcrTransparentWord[] }
+  raw_paddle: { words: OcrTransparentWord[] }
+  merged: { words: OcrTransparentWord[] }
+  stats: {
+    total_words: number; both_agree: number; paddle_only: number
+    tesseract_only: number; conflict_paddle_wins: number; conflict_tesseract_wins: number
+  }
+}
--- a/admin-lehrer/app/(admin)/ai/ocr-kombi/useKombiPipeline.ts
+++ b/admin-lehrer/app/(admin)/ai/ocr-kombi/useKombiPipeline.ts
@@ -0,0 +1,298 @@
+'use client'
+
+import { useCallback, useEffect, useState, useRef } from 'react'
+import { useSearchParams } from 'next/navigation'
+import type { PipelineStep, DocumentCategory, SessionListItem } from './types'
+import { KOMBI_V2_STEPS, dbStepToKombiV2Ui } from './types'
+
+export type { SessionListItem }
+
+const KLAUSUR_API = '/klausur-api'
+
+/** Groups sessions by document_group_id for the session list */
+export interface DocumentGroupView {
+  group_id: string
+  title: string
+  sessions: SessionListItem[]
+  page_count: number
+}
+
+function initSteps(): PipelineStep[] {
+  return KOMBI_V2_STEPS.map((s, i) => ({
+    ...s,
+    status: i === 0 ? 'active' : 'pending',
+  }))
+}
+
+export function useKombiPipeline() {
+  const [currentStep, setCurrentStep] = useState(0)
+  const [sessionId, setSessionId] = useState<string | null>(null)
+  const [sessionName, setSessionName] = useState('')
+  const [sessions, setSessions] = useState<SessionListItem[]>([])
+  const [loadingSessions, setLoadingSessions] = useState(true)
+  const [activeCategory, setActiveCategory] = useState<DocumentCategory | undefined>(undefined)
+  const [isGroundTruth, setIsGroundTruth] = useState(false)
+  const [pageNumber, setPageNumber] = useState<number | null>(null)
+  const [steps, setSteps] = useState<PipelineStep[]>(initSteps)
+
+  const searchParams = useSearchParams()
+  const deepLinkHandled = useRef(false)
+  const gridSaveRef = useRef<(() => Promise<void>) | null>(null)
+
+  // ---- Session loading ----
+
+  const loadSessions = useCallback(async () => {
+    setLoadingSessions(true)
+    try {
+      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
+      if (res.ok) {
+        const data = await res.json()
+        setSessions((data.sessions || []).filter((s: SessionListItem) => !s.parent_session_id))
+      }
+    } catch (e) {
+      console.error('Failed to load sessions:', e)
+    } finally {
+      setLoadingSessions(false)
+    }
+  }, [])
+
+  useEffect(() => { loadSessions() }, [loadSessions])
+
+  // ---- Group sessions by document_group_id ----
+
+  const groupedSessions = useCallback((): (SessionListItem | DocumentGroupView)[] => {
+    const groups = new Map<string, SessionListItem[]>()
+    const ungrouped: SessionListItem[] = []
+
+    for (const s of sessions) {
+      if (s.document_group_id) {
+        const existing = groups.get(s.document_group_id) || []
+        existing.push(s)
+        groups.set(s.document_group_id, existing)
+      } else {
+        ungrouped.push(s)
+      }
+    }
+
+    const result: (SessionListItem | DocumentGroupView)[] = []
+
+    // Sort groups newest first, keyed by each group's earliest created_at
+    const sortedGroups = Array.from(groups.entries()).sort((a, b) => {
+      const aTime = Math.min(...a[1].map(s => new Date(s.created_at).getTime()))
+      const bTime = Math.min(...b[1].map(s => new Date(s.created_at).getTime()))
+      return bTime - aTime
+    })
+
+    for (const [groupId, groupSessions] of sortedGroups) {
+      groupSessions.sort((a, b) => (a.page_number || 0) - (b.page_number || 0))
+      // Extract base title (remove " — S. X" suffix)
+      const baseName = groupSessions[0]?.name?.replace(/ — S\. \d+$/, '') || 'Dokument'
+      result.push({
+        group_id: groupId,
+        title: baseName,
+        sessions: groupSessions,
+        page_count: groupSessions.length,
+      })
+    }
+
+    for (const s of ungrouped) {
+      result.push(s)
+    }
+
+    // Sort by creation time (most recent first)
+    const getTime = (item: SessionListItem | DocumentGroupView): number => {
+      if ('group_id' in item) {
+        return Math.min(...item.sessions.map((s: SessionListItem) => new Date(s.created_at).getTime()))
+      }
+      return new Date(item.created_at).getTime()
+    }
+    result.sort((a, b) => getTime(b) - getTime(a))
+
+    return result
+  }, [sessions])
+
+  // ---- Open session ----
+
+  const openSession = useCallback(async (sid: string) => {
+    try {
+      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
+      if (!res.ok) return
+      const data = await res.json()
+
+      setSessionId(sid)
+      setSessionName(data.name || data.filename || '')
+      setActiveCategory(data.document_category || undefined)
+      setIsGroundTruth(!!data.ground_truth?.build_grid_reference)
+      setPageNumber(data.grid_editor_result?.page_number?.number ?? null)
+
+      // Determine UI step from DB state
+      const dbStep = data.current_step || 1
+      const hasGrid = !!data.grid_editor_result
+      const hasStructure = !!data.structure_result
+      const hasWords = !!data.word_result
+      const hasGutterRepair = !!(data.ground_truth?.gutter_repair)
+
+      let uiStep: number
+      if (hasGrid && hasGutterRepair) {
+        uiStep = 10 // gutter-repair (already analysed)
+      } else if (hasGrid) {
+        uiStep = 9 // grid-review
+      } else if (hasStructure) {
+        uiStep = 8 // grid-build
+      } else if (hasWords) {
+        uiStep = 7 // structure
+      } else {
+        uiStep = dbStepToKombiV2Ui(dbStep)
+      }
+
+      // Sessions only exist after upload, so always skip the upload step
+      if (uiStep === 0) {
+        uiStep = 1
+      }
+
+      setSteps(
+        KOMBI_V2_STEPS.map((s, i) => ({
+          ...s,
+          status: i < uiStep ? 'completed' : i === uiStep ? 'active' : 'pending',
+        })),
+      )
+      setCurrentStep(uiStep)
+    } catch (e) {
+      console.error('Failed to open session:', e)
+    }
+  }, [])
+
+  // ---- Deep link handling ----
+
+  useEffect(() => {
+    if (deepLinkHandled.current) return
+    const urlSession = searchParams.get('session')
+    const urlStep = searchParams.get('step')
+    if (urlSession) {
+      deepLinkHandled.current = true
+      openSession(urlSession).then(() => {
+        if (urlStep) {
+          const stepIdx = parseInt(urlStep, 10)
+          if (!isNaN(stepIdx) && stepIdx >= 0 && stepIdx < KOMBI_V2_STEPS.length) {
+            setCurrentStep(stepIdx)
+          }
+        }
+      })
+    }
+  }, [searchParams, openSession])
+
+  // ---- Step navigation ----
+
+  const goToStep = useCallback((step: number) => {
+    setCurrentStep(step)
+    setSteps(prev =>
+      prev.map((s, i) => ({
+        ...s,
+        status: i < step ? 'completed' : i === step ? 'active' : 'pending',
+      })),
+    )
+  }, [])
+
+  const handleStepClick = useCallback((index: number) => {
+    if (index <= currentStep || steps[index].status === 'completed') {
+      setCurrentStep(index)
+    }
+  }, [currentStep, steps])
+
+  const handleNext = useCallback(() => {
+    if (currentStep >= steps.length - 1) {
+      // Last step → return to session list
+      setSteps(initSteps())
+      setCurrentStep(0)
+      setSessionId(null)
+      loadSessions()
+      return
+    }
+
+    const nextStep = currentStep + 1
+    setSteps(prev =>
+      prev.map((s, i) => {
+        if (i === currentStep) return { ...s, status: 'completed' }
+        if (i === nextStep) return { ...s, status: 'active' }
+        return s
+      }),
+    )
+    setCurrentStep(nextStep)
+  }, [currentStep, steps, loadSessions])
+
+  // ---- Session CRUD ----
+
+  const handleNewSession = useCallback(() => {
+    setSessionId(null)
+    setSessionName('')
+    setCurrentStep(0)
+    setSteps(initSteps())
+  }, [])
+
+  const deleteSession = useCallback(async (sid: string) => {
+    try {
+      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, { method: 'DELETE' })
+      setSessions(prev => prev.filter(s => s.id !== sid))
+      if (sessionId === sid) handleNewSession()
+    } catch (e) {
+      console.error('Failed to delete session:', e)
+    }
+  }, [sessionId, handleNewSession])
+
+  const renameSession = useCallback(async (sid: string, newName: string) => {
+    try {
+      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
+        method: 'PUT',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify({ name: newName }),
+      })
+      setSessions(prev => prev.map(s => s.id === sid ? { ...s, name: newName } : s))
+      if (sessionId === sid) setSessionName(newName)
+    } catch (e) {
+      console.error('Failed to rename session:', e)
+    }
+  }, [sessionId])
+
+  const updateCategory = useCallback(async (sid: string, category: DocumentCategory) => {
+    try {
+      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
+        method: 'PUT',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify({ document_category: category }),
+      })
+      setSessions(prev => prev.map(s => s.id === sid ? { ...s, document_category: category } : s))
+      if (sessionId === sid) setActiveCategory(category)
+    } catch (e) {
+      console.error('Failed to update category:', e)
+    }
+  }, [sessionId])
+
+  return {
+    // State
+    currentStep,
+    sessionId,
+    sessionName,
+    sessions,
+    loadingSessions,
+    activeCategory,
+    isGroundTruth,
+    pageNumber,
+    steps,
+    gridSaveRef,
+    // Computed
+    groupedSessions,
+    // Actions
+    loadSessions,
+    openSession,
+    goToStep,
+    handleStepClick,
+    handleNext,
+    handleNewSession,
+    deleteSession,
+    renameSession,
+    updateCategory,
+    setSessionId,
+    setSessionName,
+    setIsGroundTruth,
+  }
+}
--- a/admin-lehrer/app/(admin)/ai/ocr-overlay/page.tsx
+++ b/admin-lehrer/app/(admin)/ai/ocr-overlay/page.tsx
@@ -1,751 +0,0 @@
-'use client'
-
-import { useCallback, useEffect, useState, useRef } from 'react'
-import { useSearchParams } from 'next/navigation'
-import { PagePurpose } from '@/components/common/PagePurpose'
-import { PipelineStepper } from '@/components/ocr-pipeline/PipelineStepper'
-import { StepOrientation } from '@/components/ocr-pipeline/StepOrientation'
-import { StepDeskew } from '@/components/ocr-pipeline/StepDeskew'
-import { StepDewarp } from '@/components/ocr-pipeline/StepDewarp'
-import { StepCrop } from '@/components/ocr-pipeline/StepCrop'
-import { StepStructureDetection } from '@/components/ocr-pipeline/StepStructureDetection'
-import { StepRowDetection } from '@/components/ocr-pipeline/StepRowDetection'
-import { StepWordRecognition } from '@/components/ocr-pipeline/StepWordRecognition'
-import { OverlayReconstruction } from '@/components/ocr-overlay/OverlayReconstruction'
-import { PaddleDirectStep } from '@/components/ocr-overlay/PaddleDirectStep'
-import { GridEditor } from '@/components/grid-editor/GridEditor'
-import { StepGridReview } from '@/components/ocr-pipeline/StepGridReview'
-import { BoxSessionTabs } from '@/components/ocr-pipeline/BoxSessionTabs'
-import { OVERLAY_PIPELINE_STEPS, PADDLE_DIRECT_STEPS, KOMBI_STEPS, DOCUMENT_CATEGORIES, dbStepToOverlayUi, type PipelineStep, type SessionListItem, type DocumentCategory } from './types'
-import type { SubSession } from '../ocr-pipeline/types'
-
-const KLAUSUR_API = '/klausur-api'
-
-export default function OcrOverlayPage() {
-  const [mode, setMode] = useState<'pipeline' | 'paddle-direct' | 'kombi'>('pipeline')
-  const [currentStep, setCurrentStep] = useState(0)
-  const [sessionId, setSessionId] = useState<string | null>(null)
-  const [sessionName, setSessionName] = useState<string>('')
-  const [sessions, setSessions] = useState<SessionListItem[]>([])
-  const [loadingSessions, setLoadingSessions] = useState(true)
-  const [editingName, setEditingName] = useState<string | null>(null)
-  const [editNameValue, setEditNameValue] = useState('')
-  const [editingCategory, setEditingCategory] = useState<string | null>(null)
-  const [activeCategory, setActiveCategory] = useState<DocumentCategory | undefined>(undefined)
-  const [editingActiveCategory, setEditingActiveCategory] = useState(false)
-  const [subSessions, setSubSessions] = useState<SubSession[]>([])
-  const [parentSessionId, setParentSessionId] = useState<string | null>(null)
-  const [isGroundTruth, setIsGroundTruth] = useState(false)
-  const [gtSaving, setGtSaving] = useState(false)
-  const [gtMessage, setGtMessage] = useState('')
-  const [steps, setSteps] = useState<PipelineStep[]>(
-    OVERLAY_PIPELINE_STEPS.map((s, i) => ({
-      ...s,
-      status: i === 0 ? 'active' : 'pending',
-    })),
-  )
-
-  const searchParams = useSearchParams()
-  const deepLinkHandled = useRef(false)
-  const gridSaveRef = useRef<(() => Promise<void>) | null>(null)
-
-  useEffect(() => {
-    loadSessions()
-  }, [])
-
-  const loadSessions = async () => {
-    setLoadingSessions(true)
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
-      if (res.ok) {
-        const data = await res.json()
-        // Filter to only show top-level sessions (no sub-sessions)
-        setSessions((data.sessions || []).filter((s: SessionListItem) => !s.parent_session_id))
-      }
-    } catch (e) {
-      console.error('Failed to load sessions:', e)
-    } finally {
-      setLoadingSessions(false)
-    }
-  }
-
-  const openSession = useCallback(async (sid: string, keepSubSessions?: boolean) => {
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
-      if (!res.ok) return
-      const data = await res.json()
-
-      setSessionId(sid)
-      setSessionName(data.name || data.filename || '')
-      setActiveCategory(data.document_category || undefined)
-      setIsGroundTruth(!!data.ground_truth?.build_grid_reference)
-      setGtMessage('')
-
-      // Sub-session handling
-      if (data.sub_sessions && data.sub_sessions.length > 0) {
-        setSubSessions(data.sub_sessions)
-        setParentSessionId(sid)
-      } else if (data.parent_session_id) {
-        setParentSessionId(data.parent_session_id)
-      } else if (!keepSubSessions) {
-        setSubSessions([])
-        setParentSessionId(null)
-      }
-
-      const isSubSession = !!data.parent_session_id
-
-      // Mode detection for root sessions with word_result
-      const ocrEngine = data.word_result?.ocr_engine
-      const isPaddleDirect = ocrEngine === 'paddle_direct'
-      const isKombi = ocrEngine === 'kombi' || ocrEngine === 'rapid_kombi'
-
-      let activeMode = mode // keep current mode for sub-sessions
-      if (!isSubSession && (isPaddleDirect || isKombi)) {
-        activeMode = isKombi ? 'kombi' : 'paddle-direct'
-        setMode(activeMode)
-      } else if (!isSubSession && !ocrEngine) {
-        // Unprocessed root session: keep the user's selected mode
-        activeMode = mode
-      }
-
-      const baseSteps = activeMode === 'kombi' ? KOMBI_STEPS
-        : activeMode === 'paddle-direct' ? PADDLE_DIRECT_STEPS
-        : OVERLAY_PIPELINE_STEPS
-
-      // Determine UI step
-      let uiStep: number
-      const skipIds: string[] = []
-
-      if (!isSubSession && (isPaddleDirect || isKombi)) {
-        const hasGrid = isKombi && data.grid_editor_result
-        const hasStructure = isKombi && data.structure_result
-        uiStep = hasGrid ? 6 : hasStructure ? 6 : data.word_result ? 5 : 4
-        if (isPaddleDirect) uiStep = data.word_result ? 4 : 4
-      } else {
-        const dbStep = data.current_step || 1
-        if (dbStep <= 2) uiStep = 0
-        else if (dbStep === 3) uiStep = 1
-        else if (dbStep === 4) uiStep = 2
-        else if (dbStep === 5) uiStep = 3
-        else uiStep = 4
-
-        // Sub-session skip logic
-        if (isSubSession) {
-          if (dbStep >= 5) {
-            skipIds.push('orientation', 'deskew', 'dewarp', 'crop')
-            if (uiStep < 4) uiStep = 4
-          } else if (dbStep >= 2) {
-            skipIds.push('orientation')
-            if (uiStep < 1) uiStep = 1 // advance past skipped orientation to deskew
-          }
-        }
-      }
-
-      setSteps(
-        baseSteps.map((s, i) => ({
-          ...s,
-          status: skipIds.includes(s.id)
-            ? 'skipped'
-            : i < uiStep ? 'completed' : i === uiStep ? 'active' : 'pending',
-        })),
-      )
-      setCurrentStep(uiStep)
-    } catch (e) {
-      console.error('Failed to open session:', e)
-    }
-  }, [mode])
-
-  // Handle deep-link: ?session=xxx&mode=kombi (from GT Queue page)
-  useEffect(() => {
-    if (deepLinkHandled.current) return
-    const urlSession = searchParams.get('session')
-    const urlMode = searchParams.get('mode')
-    if (urlSession) {
-      deepLinkHandled.current = true
-      if (urlMode === 'kombi' || urlMode === 'paddle-direct') {
-        setMode(urlMode)
-        const baseSteps = urlMode === 'kombi' ? KOMBI_STEPS : PADDLE_DIRECT_STEPS
-        setSteps(baseSteps.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-      }
-      openSession(urlSession)
-    }
-  }, [searchParams, openSession])
-
-  const deleteSession = useCallback(async (sid: string) => {
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, { method: 'DELETE' })
-      setSessions((prev) => prev.filter((s) => s.id !== sid))
-      if (sessionId === sid) {
-        setSessionId(null)
-        setCurrentStep(0)
-        setSubSessions([])
-        setParentSessionId(null)
-        const baseSteps = mode === 'kombi' ? KOMBI_STEPS : mode === 'paddle-direct' ? PADDLE_DIRECT_STEPS : OVERLAY_PIPELINE_STEPS
-        setSteps(baseSteps.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-      }
-    } catch (e) {
-      console.error('Failed to delete session:', e)
-    }
-  }, [sessionId, mode])
-
-  const renameSession = useCallback(async (sid: string, newName: string) => {
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
-        method: 'PUT',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ name: newName }),
-      })
-      setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, name: newName } : s)))
-      if (sessionId === sid) setSessionName(newName)
-    } catch (e) {
-      console.error('Failed to rename session:', e)
-    }
-    setEditingName(null)
-  }, [sessionId])
-
-  const updateCategory = useCallback(async (sid: string, category: DocumentCategory) => {
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
-        method: 'PUT',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ document_category: category }),
-      })
-      setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, document_category: category } : s)))
-      if (sessionId === sid) setActiveCategory(category)
-    } catch (e) {
-      console.error('Failed to update category:', e)
-    }
-    setEditingCategory(null)
-  }, [sessionId])
-
-  const handleStepClick = (index: number) => {
-    if (index <= currentStep || steps[index].status === 'completed') {
-      setCurrentStep(index)
-    }
-  }
-
-  const goToStep = (step: number) => {
-    setCurrentStep(step)
-    setSteps((prev) =>
-      prev.map((s, i) => ({
-        ...s,
-        status: i < step ? 'completed' : i === step ? 'active' : 'pending',
-      })),
-    )
-  }
-
-  const handleNext = () => {
-    if (currentStep >= steps.length - 1) {
-      // Sub-session completed — switch back to parent
-      if (parentSessionId && sessionId !== parentSessionId) {
-        setSubSessions((prev) =>
-          prev.map((s) => s.id === sessionId ? { ...s, status: 'completed', current_step: 10 } : s)
-        )
-        handleSessionChange(parentSessionId)
-        return
-      }
-      // Last step completed — return to session list
-      const baseSteps = mode === 'kombi' ? KOMBI_STEPS : mode === 'paddle-direct' ? PADDLE_DIRECT_STEPS : OVERLAY_PIPELINE_STEPS
-      setSteps(baseSteps.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-      setCurrentStep(0)
-      setSessionId(null)
-      setSubSessions([])
-      setParentSessionId(null)
-      loadSessions()
-      return
-    }
-
-    const nextStep = currentStep + 1
-    setSteps((prev) =>
-      prev.map((s, i) => {
-        if (i === currentStep) return { ...s, status: 'completed' }
-        if (i === nextStep) return { ...s, status: 'active' }
-        return s
-      }),
-    )
-    setCurrentStep(nextStep)
-  }
-
-  const handleOrientationComplete = async (sid: string) => {
-    setSessionId(sid)
-    loadSessions()
-
-    // Check for page-split sub-sessions directly from API
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
-      if (res.ok) {
-        const data = await res.json()
-        if (data.sub_sessions?.length > 0) {
-          const subs: SubSession[] = data.sub_sessions.map((s: SubSession) => ({
-            id: s.id,
-            name: s.name,
-            box_index: s.box_index,
-            current_step: s.current_step,
-          }))
-          setSubSessions(subs)
-          setParentSessionId(sid)
-          openSession(subs[0].id, true)
-          return
-        }
-      }
-    } catch (e) {
-      console.error('Failed to check for sub-sessions:', e)
-    }
-
-    handleNext()
-  }
-
-  const handleBoxSessionsCreated = useCallback((subs: SubSession[]) => {
-    setSubSessions(subs)
-    if (sessionId) setParentSessionId(sessionId)
-  }, [sessionId])
-
-  const handleSessionChange = useCallback((newSessionId: string) => {
-    openSession(newSessionId, true)
-  }, [openSession])
-
-  const handleNewSession = () => {
-    setSessionId(null)
-    setSessionName('')
-    setCurrentStep(0)
-    setSubSessions([])
-    setParentSessionId(null)
-    const baseSteps = mode === 'kombi' ? KOMBI_STEPS : mode === 'paddle-direct' ? PADDLE_DIRECT_STEPS : OVERLAY_PIPELINE_STEPS
-    setSteps(baseSteps.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-  }
-
-  const stepNames: Record<number, string> = {
-    1: 'Orientierung',
-    2: 'Begradigung',
-    3: 'Entzerrung',
-    4: 'Zuschneiden',
-    5: 'Zeilen',
-    6: 'Woerter',
-    7: 'Overlay',
-  }
-
-  const reprocessFromStep = useCallback(async (uiStep: number) => {
-    if (!sessionId) return
-    // Map overlay UI step to DB step
-    const dbStepMap: Record<number, number> = { 0: 2, 1: 3, 2: 4, 3: 5, 4: 7, 5: 8, 6: 9 }
-    const dbStep = dbStepMap[uiStep] || uiStep + 1
-    if (!confirm(`Ab Schritt ${uiStep + 1} (${stepNames[uiStep + 1] || '?'}) neu verarbeiten?`)) return
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/reprocess`, {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ from_step: dbStep }),
-      })
-      if (!res.ok) {
-        const data = await res.json().catch(() => ({}))
-        console.error('Reprocess failed:', data.detail || res.status)
-        return
-      }
-      goToStep(uiStep)
-    } catch (e) {
-      console.error('Reprocess error:', e)
-    }
-  // eslint-disable-next-line react-hooks/exhaustive-deps
-  }, [sessionId, goToStep])
-
-  const handleMarkGroundTruth = async () => {
-    if (!sessionId) return
-    setGtSaving(true)
-    setGtMessage('')
-    try {
-      // Auto-save grid editor before marking GT (so DB has latest edits)
-      if (gridSaveRef.current) {
-        await gridSaveRef.current()
-      }
-      const resp = await fetch(
-        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/mark-ground-truth?pipeline=${mode}`,
-        { method: 'POST' }
-      )
-      if (!resp.ok) {
-        const body = await resp.text().catch(() => '')
-        throw new Error(`Ground Truth fehlgeschlagen (${resp.status}): ${body}`)
-      }
-      const data = await resp.json()
-      setIsGroundTruth(true)
-      setGtMessage(`Ground Truth gespeichert (${data.cells_saved} Zellen)`)
-      setTimeout(() => setGtMessage(''), 5000)
-    } catch (e) {
-      setGtMessage(e instanceof Error ? e.message : String(e))
-    } finally {
-      setGtSaving(false)
-    }
-  }
-
-  const isLastStep = currentStep === steps.length - 1
-  const showGtButton = isLastStep && sessionId != null
-
-  const renderStep = () => {
-    if (mode === 'paddle-direct' || mode === 'kombi') {
-      switch (currentStep) {
-        case 0:
-          return <StepOrientation key={sessionId} sessionId={sessionId} onNext={handleOrientationComplete} onSubSessionsCreated={handleBoxSessionsCreated} />
-        case 1:
-          return <StepDeskew key={sessionId} sessionId={sessionId} onNext={handleNext} />
-        case 2:
-          return <StepDewarp key={sessionId} sessionId={sessionId} onNext={handleNext} />
-        case 3:
-          return <StepCrop key={sessionId} sessionId={sessionId} onNext={handleNext} />
-        case 4:
-          if (mode === 'kombi') {
-            return (
-              <PaddleDirectStep
-                sessionId={sessionId}
-                onNext={handleNext}
-                endpoint="paddle-kombi"
-                title="Kombi-Modus"
-                description="PP-OCRv5 und Tesseract laufen parallel. Koordinaten werden gewichtet gemittelt fuer optimale Positionierung."
-                icon="🔀"
-                buttonLabel="PP-OCRv5 + Tesseract starten"
-                runningLabel="PP-OCRv5 + Tesseract laufen..."
-                engineKey="kombi"
-              />
-            )
-          }
-          return <PaddleDirectStep sessionId={sessionId} onNext={handleNext} />
-        case 5:
-          return mode === 'kombi' ? (
-            <StepStructureDetection sessionId={sessionId} onNext={handleNext} />
-          ) : null
-        case 6:
-          return mode === 'kombi' ? (
-            <StepGridReview sessionId={sessionId} onNext={handleNext} saveRef={gridSaveRef} />
-          ) : null
-        default:
-          return null
-      }
-    }
-    switch (currentStep) {
-      case 0:
-        return <StepOrientation key={sessionId} sessionId={sessionId} onNext={handleOrientationComplete} onSubSessionsCreated={handleBoxSessionsCreated} />
-      case 1:
-        return <StepDeskew key={sessionId} sessionId={sessionId} onNext={handleNext} />
-      case 2:
-        return <StepDewarp key={sessionId} sessionId={sessionId} onNext={handleNext} />
-      case 3:
-        return <StepCrop key={sessionId} sessionId={sessionId} onNext={handleNext} />
-      case 4:
-        return <StepRowDetection sessionId={sessionId} onNext={handleNext} />
-      case 5:
-        return <StepWordRecognition sessionId={sessionId} onNext={handleNext} goToStep={goToStep} skipHealGaps />
-      case 6:
-        return <OverlayReconstruction sessionId={sessionId} onNext={handleNext} />
-      default:
-        return null
-    }
-  }
-
-  return (
-    <div className="space-y-6">
-      <PagePurpose
-        title="OCR Overlay"
-        purpose="Ganzseitige Overlay-Rekonstruktion: Scan begradigen, Zeilen und Woerter erkennen, dann pixelgenau ueber das Bild legen. Ohne Spaltenerkennung — ideal fuer Arbeitsblaetter."
-        audience={['Entwickler']}
-        architecture={{
-          services: ['klausur-service (FastAPI)', 'OpenCV', 'Tesseract'],
-          databases: ['PostgreSQL Sessions'],
-        }}
-        relatedPages={[
-          { name: 'OCR Pipeline', href: '/ai/ocr-pipeline', description: 'Volle Pipeline mit Spalten' },
-          { name: 'OCR Vergleich', href: '/ai/ocr-compare', description: 'Methoden-Vergleich' },
-        ]}
-        defaultCollapsed
-      />
-
-      {/* Session List */}
-      <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4">
-        <div className="flex items-center justify-between mb-3">
-          <h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
-            Sessions ({sessions.length})
-          </h3>
-          <button
-            onClick={handleNewSession}
-            className="text-xs px-3 py-1.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
-          >
-            + Neue Session
-          </button>
-        </div>
-
-        {loadingSessions ? (
-          <div className="text-sm text-gray-400 py-2">Lade Sessions...</div>
-        ) : sessions.length === 0 ? (
-          <div className="text-sm text-gray-400 py-2">Noch keine Sessions vorhanden.</div>
-        ) : (
-          <div className="space-y-1.5 max-h-[320px] overflow-y-auto">
-            {sessions.map((s) => {
-              const catInfo = DOCUMENT_CATEGORIES.find(c => c.value === s.document_category)
-              return (
-                <div
-                  key={s.id}
-                  className={`relative flex items-start gap-3 px-3 py-2.5 rounded-lg text-sm transition-colors cursor-pointer ${
-                    sessionId === s.id
-                      ? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
-                      : 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
-                  }`}
-                >
-                  {/* Thumbnail */}
-                  <div
-                    className="flex-shrink-0 w-12 h-12 rounded-md overflow-hidden bg-gray-100 dark:bg-gray-700"
-                    onClick={() => openSession(s.id)}
-                  >
-                    {/* eslint-disable-next-line @next/next/no-img-element */}
-                    <img
-                      src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${s.id}/thumbnail?size=96`}
-                      alt=""
-                      className="w-full h-full object-cover"
-                      loading="lazy"
-                      onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
-                    />
-                  </div>
-
-                  {/* Info */}
-                  <div className="flex-1 min-w-0" onClick={() => openSession(s.id)}>
-                    {editingName === s.id ? (
-                      <input
-                        autoFocus
-                        value={editNameValue}
-                        onChange={(e) => setEditNameValue(e.target.value)}
-                        onBlur={() => renameSession(s.id, editNameValue)}
-                        onKeyDown={(e) => {
-                          if (e.key === 'Enter') renameSession(s.id, editNameValue)
-                          if (e.key === 'Escape') setEditingName(null)
-                        }}
-                        onClick={(e) => e.stopPropagation()}
-                        className="w-full px-1 py-0.5 text-sm border rounded dark:bg-gray-700 dark:border-gray-600"
-                      />
-                    ) : (
-                      <div className="truncate font-medium text-gray-700 dark:text-gray-300">
-                        {s.name || s.filename}
-                      </div>
-                    )}
-                    <button
-                      onClick={(e) => {
-                        e.stopPropagation()
-                        navigator.clipboard.writeText(s.id)
-                        const btn = e.currentTarget
-                        btn.textContent = 'Kopiert!'
-                        setTimeout(() => { btn.textContent = `ID: ${s.id.slice(0, 8)}` }, 1500)
-                      }}
-                      className="text-[10px] font-mono text-gray-400 hover:text-teal-500 transition-colors"
-                      title={`Volle ID: ${s.id} — Klick zum Kopieren`}
-                    >
-                      ID: {s.id.slice(0, 8)}
-                    </button>
-                    <div className="text-xs text-gray-400 flex gap-2 mt-0.5">
-                      <span>{new Date(s.created_at).toLocaleDateString('de-DE', { day: '2-digit', month: '2-digit', year: '2-digit', hour: '2-digit', minute: '2-digit' })}</span>
-                    </div>
-                  </div>
-
-                  {/* Category Badge */}
-                  <div className="flex flex-col gap-1 items-end flex-shrink-0" onClick={(e) => e.stopPropagation()}>
-                    <button
-                      onClick={() => setEditingCategory(editingCategory === s.id ? null : s.id)}
-                      className={`text-[10px] px-1.5 py-0.5 rounded-full border transition-colors ${
-                        catInfo
-                          ? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
-                          : 'bg-gray-50 dark:bg-gray-700 border-gray-200 dark:border-gray-600 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300'
-                      }`}
-                      title="Kategorie setzen"
-                    >
-                      {catInfo ? `${catInfo.icon} ${catInfo.label}` : '+ Kategorie'}
-                    </button>
-                  </div>
-
-                  {/* Actions */}
-                  <div className="flex flex-col gap-0.5 flex-shrink-0">
-                    <button
-                      onClick={(e) => {
-                        e.stopPropagation()
-                        setEditNameValue(s.name || s.filename)
-                        setEditingName(s.id)
-                      }}
-                      className="p-1 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300"
-                      title="Umbenennen"
-                    >
-                      <svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
-                        <path strokeLinecap="round" strokeLinejoin="round" d="M15.232 5.232l3.536 3.536m-2.036-5.036a2.5 2.5 0 113.536 3.536L6.5 21.036H3v-3.572L16.732 3.732z" />
-                      </svg>
-                    </button>
-                    <button
-                      onClick={(e) => {
-                        e.stopPropagation()
-                        if (confirm('Session loeschen?')) deleteSession(s.id)
-                      }}
-                      className="p-1 text-gray-400 hover:text-red-500"
-                      title="Loeschen"
-                    >
-                      <svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
-                        <path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
-                      </svg>
-                    </button>
-                  </div>
-
-                  {/* Category dropdown */}
-                  {editingCategory === s.id && (
-                    <div
-                      className="absolute right-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64"
-                      onClick={(e) => e.stopPropagation()}
-                    >
-                      {DOCUMENT_CATEGORIES.map((cat) => (
-                        <button
-                          key={cat.value}
-                          onClick={() => updateCategory(s.id, cat.value)}
-                          className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
-                            s.document_category === cat.value
-                              ? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
-                              : 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
-                          }`}
-                        >
-                          {cat.icon} {cat.label}
-                        </button>
-                      ))}
-                    </div>
-                  )}
-                </div>
-              )
-            })}
-          </div>
-        )}
-      </div>
-
-      {/* Active session info + category picker */}
-      {sessionId && sessionName && (
-        <div className="relative flex items-center gap-3 text-sm text-gray-500 dark:text-gray-400">
-          <span>Aktive Session: <span className="font-medium text-gray-700 dark:text-gray-300">{sessionName}</span></span>
-          <button
-            onClick={() => setEditingActiveCategory(!editingActiveCategory)}
-            className={`text-xs px-2.5 py-1 rounded-full border transition-colors ${
-              activeCategory
-                ? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300 hover:bg-teal-100 dark:hover:bg-teal-900/50'
-                : 'bg-amber-50 dark:bg-amber-900/20 border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300 hover:bg-amber-100 dark:hover:bg-amber-900/40 animate-pulse'
-            }`}
-          >
-            {activeCategory ? (() => {
-              const cat = DOCUMENT_CATEGORIES.find(c => c.value === activeCategory)
-              return cat ? `${cat.icon} ${cat.label}` : activeCategory
-            })() : 'Kategorie setzen'}
-          </button>
-          {isGroundTruth && (
-            <span className="text-xs px-2 py-0.5 rounded-full bg-amber-50 dark:bg-amber-900/20 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300">
-              GT
-            </span>
-          )}
-          {editingActiveCategory && (
-            <div className="absolute left-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64">
-              {DOCUMENT_CATEGORIES.map((cat) => (
-                <button
-                  key={cat.value}
-                  onClick={() => {
-                    updateCategory(sessionId, cat.value)
-                    setEditingActiveCategory(false)
-                  }}
-                  className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
-                    activeCategory === cat.value
-                      ? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
-                      : 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
-                  }`}
-                >
-                  {cat.icon} {cat.label}
-                </button>
-              ))}
-            </div>
-          )}
-        </div>
-      )}
-
-      {/* Mode Toggle */}
-      <div className="flex items-center gap-1 bg-gray-100 dark:bg-gray-800 rounded-lg p-1 w-fit">
-        <button
-          onClick={() => {
-            if (mode === 'pipeline') return
-            setMode('pipeline')
-            setCurrentStep(0)
-            setSessionId(null)
-            setSteps(OVERLAY_PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-          }}
-          className={`px-3 py-1.5 text-xs font-medium rounded-md transition-colors ${
-            mode === 'pipeline'
-              ? 'bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-200 shadow-sm'
-              : 'text-gray-500 dark:text-gray-400 hover:text-gray-700 dark:hover:text-gray-300'
-          }`}
-        >
-          Pipeline (7 Schritte)
-        </button>
-        <button
-          onClick={() => {
-            if (mode === 'paddle-direct') return
-            setMode('paddle-direct')
-            setCurrentStep(0)
-            setSessionId(null)
-            setSteps(PADDLE_DIRECT_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-          }}
-          className={`px-3 py-1.5 text-xs font-medium rounded-md transition-colors ${
-            mode === 'paddle-direct'
-              ? 'bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-200 shadow-sm'
-              : 'text-gray-500 dark:text-gray-400 hover:text-gray-700 dark:hover:text-gray-300'
-          }`}
-        >
-          PP-OCRv5 Direct (5 Schritte)
-        </button>
-        <button
-          onClick={() => {
-            if (mode === 'kombi') return
-            setMode('kombi')
-            setCurrentStep(0)
-            setSessionId(null)
-            setSteps(KOMBI_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-          }}
-          className={`px-3 py-1.5 text-xs font-medium rounded-md transition-colors ${
-            mode === 'kombi'
-              ? 'bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-200 shadow-sm'
-              : 'text-gray-500 dark:text-gray-400 hover:text-gray-700 dark:hover:text-gray-300'
-          }`}
-        >
-          Kombi (7 Schritte)
-        </button>
-      </div>
-
-      <PipelineStepper
-        steps={steps}
-        currentStep={currentStep}
-        onStepClick={handleStepClick}
-        onReprocess={mode === 'pipeline' && sessionId != null ? reprocessFromStep : undefined}
-      />
-
-      {subSessions.length > 0 && parentSessionId && sessionId && (
-        <BoxSessionTabs
-          parentSessionId={parentSessionId}
-          subSessions={subSessions}
-          activeSessionId={sessionId}
-          onSessionChange={handleSessionChange}
-        />
-      )}
-
-      <div className="min-h-[400px]">{renderStep()}</div>
-
-      {/* Ground Truth button bar — visible on last step */}
-      {showGtButton && (
-        <div className="sticky bottom-0 bg-white dark:bg-gray-900 border-t dark:border-gray-700 py-3 px-4 -mx-1 flex items-center justify-between rounded-b-xl">
-          <div className="text-sm text-gray-500 dark:text-gray-400">
-            {gtMessage && (
-              <span className={gtMessage.includes('fehlgeschlagen') ? 'text-red-500' : 'text-amber-600 dark:text-amber-400'}>
-                {gtMessage}
-              </span>
-            )}
-          </div>
-          <button
-            onClick={handleMarkGroundTruth}
-            disabled={gtSaving}
-            className="px-4 py-2 text-sm bg-amber-600 text-white rounded hover:bg-amber-700 disabled:opacity-50"
-          >
-            {gtSaving ? 'Speichere...' : isGroundTruth ? 'Ground Truth aktualisieren' : 'Als Ground Truth markieren'}
-          </button>
-        </div>
-      )}
-    </div>
-  )
-}
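Review note: the removed page rebuilds the step list with the same inline `map` in `handleNewSession` and in each mode-toggle handler. The sketch below shows how that status derivation could be centralised. `withActiveStep` is a hypothetical helper, not something in the removed file. The local `PipelineStep` shape and the four status values are assumptions inferred from how the removed code uses them. The real definitions live in `../ocr-pipeline/types`, which is not shown here.

```typescript
// Assumed shapes, inferred from usage in the removed page.tsx (not the real type file).
type PipelineStepStatus = 'pending' | 'active' | 'completed' | 'skipped'

interface PipelineStep {
  id: string
  name: string
  icon: string
  status: PipelineStepStatus
}

/**
 * Hypothetical helper: derive step statuses from the active index.
 * Steps whose id is in `skipIds` become 'skipped'; otherwise steps before
 * the active index are 'completed', the active one is 'active', the rest 'pending'.
 */
function withActiveStep(
  steps: PipelineStep[],
  active: number,
  skipIds: string[] = [],
): PipelineStep[] {
  return steps.map((s, i): PipelineStep => ({
    ...s,
    status: skipIds.includes(s.id)
      ? 'skipped'
      : i < active
        ? 'completed'
        : i === active
          ? 'active'
          : 'pending',
  }))
}
```

With such a helper, resetting a mode would reduce to a call like `setSteps(withActiveStep(KOMBI_STEPS, 0))`, and restoring a session could pass the skip list as the third argument.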
--- a/admin-lehrer/app/(admin)/ai/ocr-overlay/types.ts
+++ b/admin-lehrer/app/(admin)/ai/ocr-overlay/types.ts
@@ -1,87 +0,0 @@
-import type { PipelineStep } from '../ocr-pipeline/types'
-
-// Re-export types used by overlay components
-export type {
-  PipelineStep,
-  PipelineStepStatus,
-  SessionListItem,
-  SessionInfo,
-  DocumentCategory,
-  DocumentTypeResult,
-  OrientationResult,
-  CropResult,
-  DeskewResult,
-  DewarpResult,
-  RowResult,
-  RowItem,
-  GridResult,
-  GridCell,
-  OcrWordBox,
-  WordBbox,
-  ColumnMeta,
-} from '../ocr-pipeline/types'
-
-export { DOCUMENT_CATEGORIES } from '../ocr-pipeline/types'
-
-/**
- * 7-step pipeline for full-page overlay reconstruction.
- * Skips: Spalten (columns), LLM-Review (Korrektur), Ground-Truth (Validierung)
- */
-export const OVERLAY_PIPELINE_STEPS: PipelineStep[] = [
-  { id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
-  { id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
-  { id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
-  { id: 'crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
-  { id: 'rows', name: 'Zeilen', icon: '📏', status: 'pending' },
-  { id: 'words', name: 'Woerter', icon: '🔤', status: 'pending' },
-  { id: 'reconstruction', name: 'Overlay', icon: '🏗️', status: 'pending' },
-]
-
-/** Map from overlay UI step index to DB step number (1-indexed) */
-export const OVERLAY_UI_TO_DB: Record<number, number> = {
-  0: 2,  // orientation
-  1: 3,  // deskew
-  2: 4,  // dewarp
-  3: 5,  // crop
-  4: 6,  // rows (skip columns=6 in DB, rows=7 — but we reuse DB step numbering)
-  5: 7,  // words
-  6: 9,  // reconstruction
-}
-
-/**
- * 5-step pipeline for Paddle Direct mode.
- * Same preprocessing (orient/deskew/dewarp/crop), then PaddleOCR replaces rows+words+overlay.
- */
-export const PADDLE_DIRECT_STEPS: PipelineStep[] = [
-  { id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
-  { id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
-  { id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
-  { id: 'crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
-  { id: 'paddle-direct', name: 'PP-OCRv5 + Overlay', icon: '⚡', status: 'pending' },
-]
-
-/**
- * 7-step pipeline for Kombi mode (PP-OCRv5 + Tesseract).
- * Same preprocessing, then both engines run and results are merged, followed by structure detection and grid review.
- */
-export const KOMBI_STEPS: PipelineStep[] = [
-  { id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
-  { id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
-  { id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
-  { id: 'crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
-  { id: 'kombi', name: 'PP-OCRv5 + Tesseract', icon: '🔀', status: 'pending' },
-  { id: 'structure', name: 'Struktur', icon: '🔍', status: 'pending' },
-  { id: 'grid-editor', name: 'Review & GT', icon: '📊', status: 'pending' },
-]
-
-/** Map from DB step to overlay UI step index */
-export function dbStepToOverlayUi(dbStep: number): number {
-  // DB: 1=start, 2=orient, 3=deskew, 4=dewarp, 5=crop, 6=columns, 7=rows, 8=words, 9=recon, 10=gt
-  if (dbStep <= 2) return 0  // orientation
-  if (dbStep === 3) return 1 // deskew
-  if (dbStep === 4) return 2 // dewarp
-  if (dbStep === 5) return 3 // crop
-  if (dbStep <= 7) return 4  // rows (skip columns)
-  if (dbStep === 8) return 5 // words
-  return 6                   // reconstruction
-}
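Review note: the removed code carries two UI-to-DB step maps that disagree. The inline `dbStepMap` in `reprocessFromStep` uses `4: 7, 5: 8`, while the exported `OVERLAY_UI_TO_DB` uses `4: 6, 5: 7`. The sketch below is a minimal round-trip check, assuming the DB numbering from the comment in `dbStepToOverlayUi` is the authoritative one. `roundTripMismatches`, `inlineMap`, and `exportedMap` are hypothetical names introduced for illustration. The two maps and the function body are copied from the removed lines.

```typescript
// Copied from the removed dbStepToOverlayUi.
// DB: 1=start, 2=orient, 3=deskew, 4=dewarp, 5=crop, 6=columns, 7=rows, 8=words, 9=recon, 10=gt
function dbStepToOverlayUi(dbStep: number): number {
  if (dbStep <= 2) return 0  // orientation
  if (dbStep === 3) return 1 // deskew
  if (dbStep === 4) return 2 // dewarp
  if (dbStep === 5) return 3 // crop
  if (dbStep <= 7) return 4  // rows (skip columns)
  if (dbStep === 8) return 5 // words
  return 6                   // reconstruction
}

// Values of the inline dbStepMap in reprocessFromStep (page.tsx).
const inlineMap: Record<number, number> = { 0: 2, 1: 3, 2: 4, 3: 5, 4: 7, 5: 8, 6: 9 }

// Values of the exported OVERLAY_UI_TO_DB (types.ts).
const exportedMap: Record<number, number> = { 0: 2, 1: 3, 2: 4, 3: 5, 4: 6, 5: 7, 6: 9 }

/** Hypothetical check: UI indices that do not survive UI -> DB -> UI. */
function roundTripMismatches(map: Record<number, number>): number[] {
  return Object.keys(map)
    .map(Number)
    .filter((ui) => dbStepToOverlayUi(map[ui]) !== ui)
}
```

Under that assumption the inline map round-trips cleanly, while the exported map sends the words step (UI index 5) to DB step 7, which `dbStepToOverlayUi` reads back as rows.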
--- a/admin-lehrer/app/(admin)/ai/ocr-pipeline/page.tsx
+++ b/admin-lehrer/app/(admin)/ai/ocr-pipeline/page.tsx
@@ -1,684 +0,0 @@
-'use client'
-
-import { useCallback, useEffect, useState } from 'react'
-import { PagePurpose } from '@/components/common/PagePurpose'
-import { PipelineStepper } from '@/components/ocr-pipeline/PipelineStepper'
-import { StepOrientation } from '@/components/ocr-pipeline/StepOrientation'
-import { StepCrop } from '@/components/ocr-pipeline/StepCrop'
-import { StepDeskew } from '@/components/ocr-pipeline/StepDeskew'
-import { StepDewarp } from '@/components/ocr-pipeline/StepDewarp'
-import { StepStructureDetection } from '@/components/ocr-pipeline/StepStructureDetection'
-import { StepColumnDetection } from '@/components/ocr-pipeline/StepColumnDetection'
-import { StepRowDetection } from '@/components/ocr-pipeline/StepRowDetection'
-import { StepWordRecognition } from '@/components/ocr-pipeline/StepWordRecognition'
-import { StepLlmReview } from '@/components/ocr-pipeline/StepLlmReview'
-import { StepReconstruction } from '@/components/ocr-pipeline/StepReconstruction'
-import { StepGroundTruth } from '@/components/ocr-pipeline/StepGroundTruth'
-import { BoxSessionTabs } from '@/components/ocr-pipeline/BoxSessionTabs'
-import { PIPELINE_STEPS, DOCUMENT_CATEGORIES, type PipelineStep, type SessionListItem, type DocumentTypeResult, type DocumentCategory, type SubSession } from './types'
-
-const KLAUSUR_API = '/klausur-api'
-
-export default function OcrPipelinePage() {
-  const [currentStep, setCurrentStep] = useState(0)
-  const [sessionId, setSessionId] = useState<string | null>(null)
-  const [sessionName, setSessionName] = useState<string>('')
-  const [sessions, setSessions] = useState<SessionListItem[]>([])
-  const [loadingSessions, setLoadingSessions] = useState(true)
-  const [editingName, setEditingName] = useState<string | null>(null)
-  const [editNameValue, setEditNameValue] = useState('')
-  const [editingCategory, setEditingCategory] = useState<string | null>(null)
-  const [docTypeResult, setDocTypeResult] = useState<DocumentTypeResult | null>(null)
-  const [activeCategory, setActiveCategory] = useState<DocumentCategory | undefined>(undefined)
-  const [subSessions, setSubSessions] = useState<SubSession[]>([])
-  const [parentSessionId, setParentSessionId] = useState<string | null>(null)
-  const [steps, setSteps] = useState<PipelineStep[]>(
-    PIPELINE_STEPS.map((s, i) => ({
-      ...s,
-      status: i === 0 ? 'active' : 'pending',
-    })),
-  )
-
-  // Load session list on mount
-  useEffect(() => {
-    loadSessions()
-  }, [])
-
-  const loadSessions = async () => {
-    setLoadingSessions(true)
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
-      if (res.ok) {
-        const data = await res.json()
-        setSessions(data.sessions || [])
-      }
-    } catch (e) {
-      console.error('Failed to load sessions:', e)
-    } finally {
-      setLoadingSessions(false)
-    }
-  }
-
-  const openSession = useCallback(async (sid: string, keepSubSessions?: boolean) => {
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
-      if (!res.ok) return
-      const data = await res.json()
-
-      setSessionId(sid)
-      setSessionName(data.name || data.filename || '')
-      setActiveCategory(data.document_category || undefined)
-
-      // Sub-session handling
-      if (data.sub_sessions && data.sub_sessions.length > 0) {
-        setSubSessions(data.sub_sessions)
-        setParentSessionId(sid)
-        // Parent has sub-sessions — open the first incomplete one (or most advanced if all done)
-        const incomplete = data.sub_sessions.find(
-          (s: SubSession) => !s.current_step || s.current_step < 10,
-        )
-        const target = incomplete || [...data.sub_sessions].sort(
-          (a: SubSession, b: SubSession) => (b.current_step || 0) - (a.current_step || 0),
-        )[0]
-        if (target) {
-          openSession(target.id, true)
-          return
-        }
-      } else if (data.parent_session_id) {
-        // This is a sub-session — keep parent info but don't reset sub-session list
-        setParentSessionId(data.parent_session_id)
-      } else if (!keepSubSessions) {
-        setSubSessions([])
-        setParentSessionId(null)
-      }
-
-      // Restore doc type result if available
-      const savedDocType: DocumentTypeResult | null = data.doc_type_result || null
-      setDocTypeResult(savedDocType)
-
-      // Determine which step to jump to based on current_step
-      const dbStep = data.current_step || 1
-      // DB steps: 1=start, 2=orientation, 3=deskew, 4=dewarp, 5=crop, 6=columns, ...
-      // UI steps are 0-indexed: 0=orientation, 1=deskew, 2=dewarp, 3=crop, 4=columns, ...
-      let uiStep = Math.max(0, dbStep - 1)
-      const skipSteps = [...(savedDocType?.skip_steps || [])]
-
-      // Sub-session handling depends on how they were created:
-      // - Crop-based (current_step >= 5): image already cropped, skip all pre-processing
-      // - Page-split (current_step 2): orientation done on parent, skip only orientation
-      // - Page-split from original (current_step 1): needs full pipeline
-      const isSubSession = !!data.parent_session_id
-      if (isSubSession) {
-        if (dbStep >= 5) {
-          // Crop-based sub-sessions: image already cropped
-          const SUB_SESSION_SKIP = ['orientation', 'deskew', 'dewarp', 'crop']
-          for (const s of SUB_SESSION_SKIP) {
-            if (!skipSteps.includes(s)) skipSteps.push(s)
-          }
-          if (uiStep < 4) uiStep = 4 // columns step (index 4)
-        } else if (dbStep >= 2) {
-          // Page-split sub-session: parent orientation applied, skip only orientation
-          if (!skipSteps.includes('orientation')) skipSteps.push('orientation')
-          if (uiStep < 1) uiStep = 1 // advance past skipped orientation to deskew
-        }
-        // dbStep === 1: page-split from original image, needs full pipeline
-      }
-
-      setSteps(
-        PIPELINE_STEPS.map((s, i) => ({
-          ...s,
-          status: skipSteps.includes(s.id)
-            ? 'skipped'
-            : i < uiStep ? 'completed' : i === uiStep ? 'active' : 'pending',
-        })),
-      )
-      setCurrentStep(uiStep)
-    } catch (e) {
-      console.error('Failed to open session:', e)
-    }
-  }, [])
-
-  const deleteSession = useCallback(async (sid: string) => {
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, { method: 'DELETE' })
-      setSessions((prev) => prev.filter((s) => s.id !== sid))
-      if (sessionId === sid) {
-        setSessionId(null)
-        setCurrentStep(0)
-        setDocTypeResult(null)
-        setSubSessions([])
-        setParentSessionId(null)
-        setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-      }
-    } catch (e) {
-      console.error('Failed to delete session:', e)
-    }
-  }, [sessionId])
-
-  const renameSession = useCallback(async (sid: string, newName: string) => {
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
-        method: 'PUT',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ name: newName }),
-      })
-      setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, name: newName } : s)))
-      if (sessionId === sid) setSessionName(newName)
-    } catch (e) {
-      console.error('Failed to rename session:', e)
-    }
-    setEditingName(null)
-  }, [sessionId])
-
-  const updateCategory = useCallback(async (sid: string, category: DocumentCategory) => {
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
-        method: 'PUT',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ document_category: category }),
-      })
-      setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, document_category: category } : s)))
-      if (sessionId === sid) setActiveCategory(category)
-    } catch (e) {
-      console.error('Failed to update category:', e)
-    }
-    setEditingCategory(null)
-  }, [sessionId])
-
-  const deleteAllSessions = useCallback(async () => {
-    if (!confirm('Alle Sessions loeschen? Dies kann nicht rueckgaengig gemacht werden.')) return
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`, { method: 'DELETE' })
-      setSessions([])
-      setSessionId(null)
-      setCurrentStep(0)
-      setDocTypeResult(null)
-      setActiveCategory(undefined)
-      setSubSessions([])
-      setParentSessionId(null)
-      setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-    } catch (e) {
-      console.error('Failed to delete all sessions:', e)
-    }
-  }, [])
-
-  const handleStepClick = (index: number) => {
-    if (index <= currentStep || steps[index].status === 'completed') {
-      setCurrentStep(index)
-    }
-  }
-
-  const goToStep = (step: number) => {
-    setCurrentStep(step)
-    setSteps((prev) =>
-      prev.map((s, i) => ({
-        ...s,
-        status: i < step ? 'completed' : i === step ? 'active' : 'pending',
-      })),
-    )
-  }
-
-  const handleNext = () => {
-    if (currentStep >= steps.length - 1) {
-      // Last step completed
-      if (parentSessionId && sessionId !== parentSessionId) {
-        // Sub-session completed — mark it and find next incomplete one
-        const updatedSubs = subSessions.map((s) =>
-          s.id === sessionId ? { ...s, status: 'completed' as const, current_step: 10 } : s,
-        )
-        setSubSessions(updatedSubs)
-
-        // Find next incomplete sub-session
-        const nextIncomplete = updatedSubs.find(
-          (s) => s.id !== sessionId && (!s.current_step || s.current_step < 10),
-        )
-        if (nextIncomplete) {
-          // Open next incomplete sub-session
-          openSession(nextIncomplete.id, true)
-        } else {
-          // All sub-sessions done — return to session list
-          setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-          setCurrentStep(0)
-          setSessionId(null)
-          setSubSessions([])
-          setParentSessionId(null)
-          loadSessions()
-        }
-        return
-      }
-      // Main session: return to session list
-      setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-      setCurrentStep(0)
-      setSessionId(null)
-      setSubSessions([])
-      setParentSessionId(null)
-      loadSessions()
-      return
-    }
-
-    // Find the next non-skipped step
-    const skipSteps = docTypeResult?.skip_steps || []
-    let nextStep = currentStep + 1
-    while (nextStep < steps.length && skipSteps.includes(PIPELINE_STEPS[nextStep]?.id)) {
-      nextStep++
-    }
-    if (nextStep >= steps.length) nextStep = steps.length - 1
-
-    setSteps((prev) =>
-      prev.map((s, i) => {
-        if (i === currentStep) return { ...s, status: 'completed' }
-        if (i === nextStep) return { ...s, status: 'active' }
-        // Mark skipped steps between current and next
-        if (i > currentStep && i < nextStep && skipSteps.includes(PIPELINE_STEPS[i]?.id)) {
-          return { ...s, status: 'skipped' }
-        }
-        return s
-      }),
-    )
-    setCurrentStep(nextStep)
-  }
-
-  const handleOrientationComplete = async (sid: string) => {
-    setSessionId(sid)
-    loadSessions()
-
-    // Check for page-split sub-sessions directly from API
-    // (React state may not be committed yet due to batching)
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
-      if (res.ok) {
-        const data = await res.json()
-        if (data.sub_sessions?.length > 0) {
-          const subs: SubSession[] = data.sub_sessions.map((s: SubSession) => ({
-            id: s.id,
-            name: s.name,
-            box_index: s.box_index,
-            current_step: s.current_step,
-          }))
-          setSubSessions(subs)
-          setParentSessionId(sid)
-          openSession(subs[0].id, true)
-          return
-        }
-      }
-    } catch (e) {
-      console.error('Failed to check for sub-sessions:', e)
-    }
-
-    handleNext()
-  }
-
-  const handleCropNext = async () => {
-    // Auto-detect document type after crop (last image-processing step), then advance
-    if (sessionId) {
-      try {
-        const res = await fetch(
-          `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/detect-type`,
-          { method: 'POST' },
-        )
-        if (res.ok) {
-          const data: DocumentTypeResult = await res.json()
-          setDocTypeResult(data)
-
-          // Mark skipped steps immediately
-          const skipSteps = data.skip_steps || []
-          if (skipSteps.length > 0) {
-            setSteps((prev) =>
-              prev.map((s) =>
-                skipSteps.includes(s.id) ? { ...s, status: 'skipped' } : s,
-              ),
-            )
-          }
-        }
-      } catch (e) {
-        console.error('Doc type detection failed:', e)
-        // Not critical — continue without it
-      }
-    }
-    handleNext()
-  }
-
-  const handleDocTypeChange = (newDocType: DocumentTypeResult['doc_type']) => {
-    if (!docTypeResult) return
-
-    // Build new skip_steps based on doc type
-    let skipSteps: string[] = []
-    if (newDocType === 'full_text') {
-      skipSteps = ['columns', 'rows']
-    }
-    // vocab_table and generic_table: no skips
-
-    const updated: DocumentTypeResult = {
-      ...docTypeResult,
-      doc_type: newDocType,
-      skip_steps: skipSteps,
-      pipeline: newDocType === 'full_text' ? 'full_page' : 'cell_first',
-    }
-    setDocTypeResult(updated)
-
-    // Update step statuses
-    setSteps((prev) =>
-      prev.map((s) => {
-        if (skipSteps.includes(s.id)) return { ...s, status: 'skipped' as const }
-        if (s.status === 'skipped') return { ...s, status: 'pending' as const }
-        return s
-      }),
-    )
-  }
-
-  const handleNewSession = () => {
-    setSessionId(null)
-    setSessionName('')
-    setCurrentStep(0)
-    setDocTypeResult(null)
-    setSubSessions([])
-    setParentSessionId(null)
-    setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-  }
-
-  const handleSessionChange = useCallback((newSessionId: string) => {
-    openSession(newSessionId, true)
-  }, [openSession])
-
-  const handleBoxSessionsCreated = useCallback((subs: SubSession[]) => {
-    setSubSessions(subs)
-    if (sessionId) setParentSessionId(sessionId)
-  }, [sessionId])
-
-  const stepNames: Record<number, string> = {
-    1: 'Orientierung',
-    2: 'Begradigung',
-    3: 'Entzerrung',
-    4: 'Zuschneiden',
-    5: 'Spalten',
-    6: 'Zeilen',
-    7: 'Woerter',
-    8: 'Struktur',
-    9: 'Korrektur',
-    10: 'Rekonstruktion',
-    11: 'Validierung',
-  }
-
-  const reprocessFromStep = useCallback(async (uiStep: number) => {
-    if (!sessionId) return
-    const dbStep = uiStep + 1 // UI is 0-indexed, DB is 1-indexed
-    if (!confirm(`Ab Schritt ${dbStep} (${stepNames[dbStep] || '?'}) neu verarbeiten? Nachfolgende Daten werden geloescht.`)) return
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/reprocess`, {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ from_step: dbStep }),
-      })
-      if (!res.ok) {
-        const data = await res.json().catch(() => ({}))
-        console.error('Reprocess failed:', data.detail || res.status)
-        return
-      }
-      // Reset UI steps
-      goToStep(uiStep)
-    } catch (e) {
-      console.error('Reprocess error:', e)
-    }
-  // eslint-disable-next-line react-hooks/exhaustive-deps
-  }, [sessionId, goToStep])
-
-  const renderStep = () => {
-    switch (currentStep) {
-      case 0:
-        return <StepOrientation key={sessionId} sessionId={sessionId} onNext={handleOrientationComplete} onSubSessionsCreated={handleBoxSessionsCreated} />
-      case 1:
-        return <StepDeskew key={sessionId} sessionId={sessionId} onNext={handleNext} />
-      case 2:
-        return <StepDewarp key={sessionId} sessionId={sessionId} onNext={handleNext} />
-      case 3:
-        return <StepCrop key={sessionId} sessionId={sessionId} onNext={handleCropNext} />
-      case 4:
-        return <StepColumnDetection sessionId={sessionId} onNext={handleNext} onBoxSessionsCreated={handleBoxSessionsCreated} />
-      case 5:
-        return <StepRowDetection sessionId={sessionId} onNext={handleNext} />
-      case 6:
-        return <StepWordRecognition sessionId={sessionId} onNext={handleNext} goToStep={goToStep} />
-      case 7:
-        return <StepStructureDetection sessionId={sessionId} onNext={handleNext} />
-      case 8:
-        return <StepLlmReview sessionId={sessionId} onNext={handleNext} />
-      case 9:
-        return <StepReconstruction sessionId={sessionId} onNext={handleNext} />
-      case 10:
-        return <StepGroundTruth sessionId={sessionId} onNext={handleNext} />
-      default:
-        return null
-    }
-  }
-
-  return (
-    <div className="space-y-6">
-      <PagePurpose
-        title="OCR Pipeline"
-        purpose="Schrittweise Seitenrekonstruktion: Scan begradigen, Spalten erkennen, Woerter lokalisieren und die Seite Wort fuer Wort nachbauen. Ziel: 10 Vokabelseiten fehlerfrei rekonstruieren."
-        audience={['Entwickler', 'Data Scientists']}
-        architecture={{
-          services: ['klausur-service (FastAPI)', 'OpenCV', 'Tesseract'],
-          databases: ['PostgreSQL Sessions'],
-        }}
-        relatedPages={[
-          { name: 'OCR Vergleich', href: '/ai/ocr-compare', description: 'Methoden-Vergleich' },
-          { name: 'OCR-Labeling', href: '/ai/ocr-labeling', description: 'Trainingsdaten' },
-        ]}
-        defaultCollapsed
-      />
-
-      {/* Session List */}
-      <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4">
-        <div className="flex items-center justify-between mb-3">
-          <h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
-            Sessions ({sessions.length})
-          </h3>
-          <div className="flex gap-2">
-            {sessions.length > 0 && (
-              <button
-                onClick={deleteAllSessions}
-                className="text-xs px-3 py-1.5 text-red-600 hover:bg-red-50 dark:hover:bg-red-900/20 rounded-lg transition-colors"
-                title="Alle Sessions loeschen"
-              >
-                Alle loeschen
-              </button>
-            )}
-            <button
-              onClick={handleNewSession}
-              className="text-xs px-3 py-1.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
-            >
-              + Neue Session
-            </button>
-          </div>
-        </div>
-
-        {loadingSessions ? (
-          <div className="text-sm text-gray-400 py-2">Lade Sessions...</div>
-        ) : sessions.length === 0 ? (
-          <div className="text-sm text-gray-400 py-2">Noch keine Sessions vorhanden.</div>
-        ) : (
-          <div className="space-y-1.5 max-h-[320px] overflow-y-auto">
-            {sessions.map((s) => {
-              const catInfo = DOCUMENT_CATEGORIES.find(c => c.value === s.document_category)
-              return (
-                <div
-                  key={s.id}
-                  className={`relative flex items-start gap-3 px-3 py-2.5 rounded-lg text-sm transition-colors cursor-pointer ${
-                    sessionId === s.id
-                      ? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
-                      : 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
-                  }`}
-                >
-                  {/* Thumbnail */}
-                  <div
-                    className="flex-shrink-0 w-12 h-12 rounded-md overflow-hidden bg-gray-100 dark:bg-gray-700"
-                    onClick={() => openSession(s.id)}
-                  >
-                    {/* eslint-disable-next-line @next/next/no-img-element */}
-                    <img
-                      src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${s.id}/thumbnail?size=96`}
-                      alt=""
-                      className="w-full h-full object-cover"
-                      loading="lazy"
-                      onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
-                    />
-                  </div>
-
-                  {/* Info */}
-                  <div className="flex-1 min-w-0" onClick={() => openSession(s.id)}>
-                    {editingName === s.id ? (
-                      <input
-                        autoFocus
-                        value={editNameValue}
-                        onChange={(e) => setEditNameValue(e.target.value)}
-                        onBlur={() => renameSession(s.id, editNameValue)}
-                        onKeyDown={(e) => {
-                          if (e.key === 'Enter') renameSession(s.id, editNameValue)
-                          if (e.key === 'Escape') setEditingName(null)
-                        }}
-                        onClick={(e) => e.stopPropagation()}
-                        className="w-full px-1 py-0.5 text-sm border rounded dark:bg-gray-700 dark:border-gray-600"
-                      />
-                    ) : (
-                      <div className="truncate font-medium text-gray-700 dark:text-gray-300">
-                        {s.name || s.filename}
-                      </div>
-                    )}
-                    {/* ID row */}
-                    <button
-                      onClick={(e) => {
-                        e.stopPropagation()
-                        navigator.clipboard.writeText(s.id)
-                        const btn = e.currentTarget
-                        btn.textContent = 'Kopiert!'
-                        setTimeout(() => { btn.textContent = `ID: ${s.id.slice(0, 8)}` }, 1500)
-                      }}
-                      className="text-[10px] font-mono text-gray-400 hover:text-teal-500 transition-colors"
-                      title={`Volle ID: ${s.id} — Klick zum Kopieren`}
-                    >
-                      ID: {s.id.slice(0, 8)}
-                    </button>
-                    <div className="text-xs text-gray-400 flex gap-2 mt-0.5">
-                      <span>{new Date(s.created_at).toLocaleDateString('de-DE', { day: '2-digit', month: '2-digit', year: '2-digit', hour: '2-digit', minute: '2-digit' })}</span>
-                      <span>Schritt {s.current_step}: {stepNames[s.current_step] || '?'}</span>
-                    </div>
-                  </div>
-
-                  {/* Badges */}
-                  <div className="flex flex-col gap-1 items-end flex-shrink-0" onClick={(e) => e.stopPropagation()}>
-                    {/* Category Badge */}
-                    <button
-                      onClick={() => setEditingCategory(editingCategory === s.id ? null : s.id)}
-                      className={`text-[10px] px-1.5 py-0.5 rounded-full border transition-colors ${
-                        catInfo
-                          ? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
-                          : 'bg-gray-50 dark:bg-gray-700 border-gray-200 dark:border-gray-600 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300'
-                      }`}
-                      title="Kategorie setzen"
-                    >
-                      {catInfo ? `${catInfo.icon} ${catInfo.label}` : '+ Kategorie'}
-                    </button>
-                    {/* Doc Type Badge (read-only) */}
-                    {s.doc_type && (
-                      <span className="text-[10px] px-1.5 py-0.5 rounded-full bg-gray-100 dark:bg-gray-700 text-gray-500 dark:text-gray-400 border border-gray-200 dark:border-gray-600">
-                        {s.doc_type}
-                      </span>
-                    )}
-                  </div>
-
-                  {/* Action buttons */}
-                  <div className="flex flex-col gap-0.5 flex-shrink-0">
-                    <button
-                      onClick={(e) => {
-                        e.stopPropagation()
-                        setEditNameValue(s.name || s.filename)
-                        setEditingName(s.id)
-                      }}
-                      className="p-1 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300"
-                      title="Umbenennen"
-                    >
-                      <svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
-                        <path strokeLinecap="round" strokeLinejoin="round" d="M15.232 5.232l3.536 3.536m-2.036-5.036a2.5 2.5 0 113.536 3.536L6.5 21.036H3v-3.572L16.732 3.732z" />
-                      </svg>
-                    </button>
-                    <button
-                      onClick={(e) => {
-                        e.stopPropagation()
-                        if (confirm('Session loeschen?')) deleteSession(s.id)
-                      }}
-                      className="p-1 text-gray-400 hover:text-red-500"
-                      title="Loeschen"
-                    >
-                      <svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
-                        <path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
-                      </svg>
-                    </button>
-                  </div>
-
-                  {/* Category dropdown (inline) */}
-                  {editingCategory === s.id && (
-                    <div
-                      className="absolute right-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64"
-                      onClick={(e) => e.stopPropagation()}
-                    >
-                      {DOCUMENT_CATEGORIES.map((cat) => (
-                        <button
-                          key={cat.value}
-                          onClick={() => updateCategory(s.id, cat.value)}
-                          className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
-                            s.document_category === cat.value
-                              ? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
-                              : 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
-                          }`}
-                        >
-                          {cat.icon} {cat.label}
-                        </button>
-                      ))}
-                    </div>
-                  )}
-                </div>
-              )
-            })}
-          </div>
-        )}
-      </div>
-
-      {/* Active session info */}
-      {sessionId && sessionName && (
-        <div className="flex items-center gap-3 text-sm text-gray-500 dark:text-gray-400">
-          <span>Aktive Session: <span className="font-medium text-gray-700 dark:text-gray-300">{sessionName}</span></span>
-          {activeCategory && (() => {
-            const cat = DOCUMENT_CATEGORIES.find(c => c.value === activeCategory)
-            return cat ? <span className="text-xs px-2 py-0.5 rounded-full bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300">{cat.icon} {cat.label}</span> : null
-          })()}
-          {docTypeResult && (
-            <span className="text-xs px-2 py-0.5 rounded-full bg-gray-100 dark:bg-gray-700 text-gray-500 dark:text-gray-400 border border-gray-200 dark:border-gray-600">
-              {docTypeResult.doc_type}
-            </span>
-          )}
-        </div>
-      )}
-
-      <PipelineStepper
-        steps={steps}
-        currentStep={currentStep}
-        onStepClick={handleStepClick}
-        onReprocess={sessionId ? reprocessFromStep : undefined}
-        docTypeResult={docTypeResult}
-        onDocTypeChange={handleDocTypeChange}
-      />
-
-      {subSessions.length > 0 && parentSessionId && sessionId && (
-        <BoxSessionTabs
-          parentSessionId={parentSessionId}
-          subSessions={subSessions}
-          activeSessionId={sessionId}
-          onSessionChange={handleSessionChange}
-        />
-      )}
-
-      <div className="min-h-[400px]">{renderStep()}</div>
-    </div>
-  )
-}
--- a/admin-lehrer/app/(admin)/ai/ocr-pipeline/types.ts
+++ b/admin-lehrer/app/(admin)/ai/ocr-pipeline/types.ts
@@ -1,425 +0,0 @@
-export type PipelineStepStatus = 'pending' | 'active' | 'completed' | 'failed' | 'skipped'
-
-export interface PipelineStep {
-  id: string
-  name: string
-  icon: string
-  status: PipelineStepStatus
-}
-
-export type DocumentCategory =
-  | 'vokabelseite' | 'woerterbuch' | 'buchseite' | 'arbeitsblatt' | 'klausurseite'
-  | 'mathearbeit' | 'statistik' | 'zeitung' | 'formular' | 'handschrift' | 'sonstiges'
-
-export const DOCUMENT_CATEGORIES: { value: DocumentCategory; label: string; icon: string }[] = [
-  { value: 'vokabelseite', label: 'Vokabelseite', icon: '📖' },
-  { value: 'woerterbuch', label: 'Woerterbuch', icon: '📕' },
-  { value: 'buchseite', label: 'Buchseite', icon: '📚' },
-  { value: 'arbeitsblatt', label: 'Arbeitsblatt', icon: '📝' },
-  { value: 'klausurseite', label: 'Klausurseite', icon: '📄' },
-  { value: 'mathearbeit', label: 'Mathearbeit', icon: '🔢' },
-  { value: 'statistik', label: 'Statistik', icon: '📊' },
-  { value: 'zeitung', label: 'Zeitung', icon: '📰' },
-  { value: 'formular', label: 'Formular', icon: '📋' },
-  { value: 'handschrift', label: 'Handschrift', icon: '✍️' },
-  { value: 'sonstiges', label: 'Sonstiges', icon: '📎' },
-]
-
-export interface SessionListItem {
-  id: string
-  name: string
-  filename: string
-  status: string
-  current_step: number
-  document_category?: DocumentCategory
-  doc_type?: string
-  created_at: string
-  updated_at?: string
-  parent_session_id?: string | null
-  box_index?: number | null
-}
-
-export interface SubSession {
-  id: string
-  name: string
-  box_index: number
-  current_step?: number
-  status?: string
-}
-
-export interface PipelineLogEntry {
-  step: string
-  completed_at: string
-  success: boolean
-  duration_ms?: number
-  metrics: Record<string, unknown>
-}
-
-export interface PipelineLog {
-  steps: PipelineLogEntry[]
-}
-
-export interface DocumentTypeResult {
-  doc_type: 'vocab_table' | 'full_text' | 'generic_table'
-  confidence: number
-  pipeline: 'cell_first' | 'full_page'
-  skip_steps: string[]
-  features?: Record<string, unknown>
-  duration_seconds?: number
-}
-
-export interface OrientationResult {
-  orientation_degrees: number
-  corrected: boolean
-  duration_seconds: number
-}
-
-export interface CropResult {
-  crop_applied: boolean
-  crop_rect?: { x: number; y: number; width: number; height: number }
-  crop_rect_pct?: { x: number; y: number; width: number; height: number }
-  original_size: { width: number; height: number }
-  cropped_size: { width: number; height: number }
-  detected_format?: string
-  format_confidence?: number
-  aspect_ratio?: number
-  border_fractions?: { top: number; bottom: number; left: number; right: number }
-  skipped?: boolean
-  duration_seconds?: number
-}
-
-export interface SessionInfo {
-  session_id: string
-  filename: string
-  name?: string
-  image_width: number
-  image_height: number
-  original_image_url: string
-  current_step?: number
-  document_category?: DocumentCategory
-  doc_type?: string
-  orientation_result?: OrientationResult
-  crop_result?: CropResult
-  deskew_result?: DeskewResult
-  dewarp_result?: DewarpResult
-  column_result?: ColumnResult
-  row_result?: RowResult
-  word_result?: GridResult
-  doc_type_result?: DocumentTypeResult
-  sub_sessions?: SubSession[]
-  parent_session_id?: string
-  box_index?: number
-}
-
-export interface DeskewResult {
-  session_id: string
-  angle_hough: number
-  angle_word_alignment: number
-  angle_iterative?: number
-  angle_residual?: number
-  angle_textline?: number
-  angle_applied: number
-  method_used: 'hough' | 'word_alignment' | 'manual' | 'iterative' | 'two_pass' | 'three_pass' | 'manual_combined'
-  confidence: number
-  duration_seconds: number
-  deskewed_image_url: string
-  binarized_image_url: string
-}
-
-export interface DeskewGroundTruth {
-  is_correct: boolean
-  corrected_angle?: number
-  notes?: string
-}
-
-export interface DewarpDetection {
-  method: string
-  shear_degrees: number
-  confidence: number
-}
-
-export interface DewarpResult {
-  session_id: string
-  method_used: string
-  shear_degrees: number
-  confidence: number
-  duration_seconds: number
-  dewarped_image_url: string
-  detections?: DewarpDetection[]
-}
-
-export interface DewarpGroundTruth {
-  is_correct: boolean
-  corrected_shear?: number
-  notes?: string
-}
-
-export interface PageRegion {
-  type: 'column_en' | 'column_de' | 'column_example' | 'page_ref'
-      | 'column_marker' | 'column_text' | 'column_ignore' | 'header' | 'footer'
-  x: number
-  y: number
-  width: number
-  height: number
-  classification_confidence?: number
-  classification_method?: string
-}
-
-export interface PageZone {
-  zone_type: 'content' | 'box'
-  y_start: number
-  y_end: number
-  box?: { x: number; y: number; width: number; height: number }
-}
-
-export interface ColumnResult {
-  columns: PageRegion[]
-  duration_seconds: number
-  zones?: PageZone[]
-}
-
-export interface ColumnGroundTruth {
-  is_correct: boolean
-  corrected_columns?: PageRegion[]
-  notes?: string
-}
-
-export interface ManualColumnDivider {
-  xPercent: number  // Position in % of image width (0-100)
-}
-
-export type ColumnTypeKey = PageRegion['type']
-
-export interface RowResult {
-  rows: RowItem[]
-  summary: Record<string, number>
-  total_rows: number
-  duration_seconds: number
-}
-
-export interface RowItem {
-  index: number
-  x: number
-  y: number
-  width: number
-  height: number
-  word_count: number
-  row_type: 'content' | 'header' | 'footer'
-  gap_before: number
-}
-
-export interface RowGroundTruth {
-  is_correct: boolean
-  corrected_rows?: RowItem[]
-  notes?: string
-}
-
-export interface StructureGraphic {
-  x: number
-  y: number
-  w: number
-  h: number
-  area: number
-  shape: string   // image, illustration
-  color_name: string
-  color_hex: string
-  confidence: number
-}
-
-export interface ExcludeRegion {
-  x: number
-  y: number
-  w: number
-  h: number
-  label?: string
-}
-
-export interface DocLayoutRegion {
-  x: number
-  y: number
-  w: number
-  h: number
-  class_name: string
-  confidence: number
-}
-
-export interface StructureResult {
-  image_width: number
-  image_height: number
-  content_bounds: { x: number; y: number; w: number; h: number }
-  boxes: StructureBox[]
-  zones: StructureZone[]
-  graphics: StructureGraphic[]
-  exclude_regions?: ExcludeRegion[]
-  color_pixel_counts: Record<string, number>
-  has_words: boolean
-  word_count: number
-  border_ghosts_removed?: number
-  duration_seconds: number
-  /** PP-DocLayout regions (only present when method=ppdoclayout) */
-  layout_regions?: DocLayoutRegion[]
-  detection_method?: 'opencv' | 'ppdoclayout'
-}
-
-export interface StructureBox {
-  x: number
-  y: number
-  w: number
-  h: number
-  confidence: number
-  border_thickness: number
-  bg_color_name?: string
-  bg_color_hex?: string
-}
-
-export interface StructureZone {
-  index: number
-  zone_type: 'content' | 'box'
-  x: number
-  y: number
-  w: number
-  h: number
-}
-
-export interface WordBbox {
-  x: number
-  y: number
-  w: number
-  h: number
-}
-
-export interface OcrWordBox {
-  text: string
-  left: number    // absolute image x in px
-  top: number     // absolute image y in px
-  width: number   // px
-  height: number  // px
-  conf: number
-  color?: string       // hex color of detected text, e.g. '#dc2626'
-  color_name?: string  // 'black' | 'red' | 'blue' | 'green' | 'orange' | 'purple' | 'yellow'
-  recovered?: boolean  // true if this word was recovered via color detection
-}
-
-export interface GridCell {
-  cell_id: string          // "R03_C1"
-  row_index: number
-  col_index: number
-  col_type: string
-  text: string
-  confidence: number
-  bbox_px: WordBbox
-  bbox_pct: WordBbox
-  ocr_engine?: string
-  is_bold?: boolean
-  status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
-  word_boxes?: OcrWordBox[]  // per-word bounding boxes from OCR engine
-}
-
-export interface ColumnMeta {
-  index: number
-  type: string
-  x: number
-  width: number
-}
-
-export interface GridResult {
-  cells: GridCell[]
-  grid_shape: { rows: number; cols: number; total_cells: number }
-  columns_used: ColumnMeta[]
-  layout: 'vocab' | 'generic'
-  image_width: number
-  image_height: number
-  duration_seconds: number
-  ocr_engine?: string
-  vocab_entries?: WordEntry[]   // Only when layout='vocab'
-  entries?: WordEntry[]         // Backwards compat alias for vocab_entries
-  entry_count?: number
-  summary: {
-    total_cells: number
-    non_empty_cells: number
-    low_confidence: number
-    // Only when layout='vocab':
-    total_entries?: number
-    with_english?: number
-    with_german?: number
-  }
-  llm_review?: {
-    changes: { row_index: number; field: string; old: string; new: string }[]
-    model_used: string
-    duration_ms: number
-    entries_corrected: number
-    applied_count?: number
-    applied_at?: string
-  }
-}
-
-export interface WordEntry {
-  row_index: number
-  english: string
-  german: string
-  example: string
-  source_page?: string
-  marker?: string
-  confidence: number
-  bbox: WordBbox
-  bbox_en: WordBbox | null
-  bbox_de: WordBbox | null
-  bbox_ex: WordBbox | null
-  bbox_ref?: WordBbox | null
-  bbox_marker?: WordBbox | null
-  status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
-}
-
-/** @deprecated Use GridResult instead */
-export interface WordResult {
-  entries: WordEntry[]
-  entry_count: number
-  image_width: number
-  image_height: number
-  duration_seconds: number
-  ocr_engine?: string
-  summary: {
-    total_entries: number
-    with_english: number
-    with_german: number
-    low_confidence: number
-  }
-}
-
-export interface WordGroundTruth {
-  is_correct: boolean
-  corrected_entries?: WordEntry[]
-  notes?: string
-}
-
-export interface ImageRegion {
-  bbox_pct: { x: number; y: number; w: number; h: number }
-  prompt: string
-  description: string
-  image_b64: string | null
-  style: 'educational' | 'cartoon' | 'sketch' | 'clipart' | 'realistic'
-}
-
-export type ImageStyle = ImageRegion['style']
-
-export const IMAGE_STYLES: { value: ImageStyle; label: string }[] = [
-  { value: 'educational', label: 'Lehrbuch' },
-  { value: 'cartoon', label: 'Cartoon' },
-  { value: 'sketch', label: 'Skizze' },
-  { value: 'clipart', label: 'Clipart' },
-  { value: 'realistic', label: 'Realistisch' },
-]
-
-export const PIPELINE_STEPS: PipelineStep[] = [
-  { id: 'orientation', name: 'Orientierung', icon: '🔄', status: 'pending' },
-  { id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
-  { id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
-  { id: 'crop', name: 'Zuschneiden', icon: '✂️', status: 'pending' },
-  { id: 'columns', name: 'Spalten', icon: '📊', status: 'pending' },
-  { id: 'rows', name: 'Zeilen', icon: '📏', status: 'pending' },
-  { id: 'words', name: 'Woerter', icon: '🔤', status: 'pending' },
-  { id: 'structure', name: 'Struktur', icon: '🔍', status: 'pending' },
-  { id: 'llm-review', name: 'Korrektur', icon: '✏️', status: 'pending' },
-  { id: 'reconstruction', name: 'Rekonstruktion', icon: '🏗️', status: 'pending' },
-  { id: 'ground-truth', name: 'Validierung', icon: '✅', status: 'pending' },
-]
--- a/admin-lehrer/app/(admin)/ai/rag/tests/rag-documents.test.ts
+++ b/admin-lehrer/app/(admin)/ai/rag/tests/rag-documents.test.ts
@@ -0,0 +1,252 @@
+import { describe, it, expect } from 'vitest'
+import ragData from '../rag-documents.json'
+
+/**
+ * Tests for rag-documents.json: industry/regulation matrix
+ *
+ * Validates the JSON structure, industry assignments, and data integrity
+ * of the 320 documents for the RAG Landkarte.
+ */
+
+const VALID_INDUSTRY_IDS = ragData.industries.map((i: any) => i.id)
+const VALID_DOC_TYPE_IDS = ragData.doc_types.map((dt: any) => dt.id)
+
+describe('rag-documents.json: structure', () => {
+  it('should contain doc_types, industries, and documents', () => {
+    expect(ragData).toHaveProperty('doc_types')
+    expect(ragData).toHaveProperty('industries')
+    expect(ragData).toHaveProperty('documents')
+    expect(Array.isArray(ragData.doc_types)).toBe(true)
+    expect(Array.isArray(ragData.industries)).toBe(true)
+    expect(Array.isArray(ragData.documents)).toBe(true)
+  })
+
+  it('sollte genau 10 Branchen haben (VDMA/VDA/BDI)', () => {
+    expect(ragData.industries).toHaveLength(10)
+    const ids = ragData.industries.map((i: any) => i.id)
+    expect(ids).toContain('automotive')
+    expect(ids).toContain('maschinenbau')
+    expect(ids).toContain('elektrotechnik')
+    expect(ids).toContain('chemie')
+    expect(ids).toContain('metall')
+    expect(ids).toContain('energie')
+    expect(ids).toContain('transport')
+    expect(ids).toContain('handel')
+    expect(ids).toContain('konsumgueter')
+    expect(ids).toContain('bau')
+  })
+
+  it('sollte keine Pseudo-Branchen enthalten (IoT, KI, HR, KRITIS, etc.)', () => {
+    const ids = ragData.industries.map((i: any) => i.id)
+    expect(ids).not.toContain('iot')
+    expect(ids).not.toContain('ai')
+    expect(ids).not.toContain('hr')
+    expect(ids).not.toContain('kritis')
+    expect(ids).not.toContain('ecommerce')
+    expect(ids).not.toContain('tech')
+    expect(ids).not.toContain('media')
+    expect(ids).not.toContain('public')
+  })
+
+  it('sollte 17 Dokumenttypen haben', () => {
+    expect(ragData.doc_types.length).toBe(17)
+  })
+
+  it('sollte mindestens 300 Dokumente haben', () => {
+    expect(ragData.documents.length).toBeGreaterThanOrEqual(300)
+  })
+
+  it('sollte bei jeder Branche id, name und icon haben', () => {
+    ragData.industries.forEach((ind: any) => {
+      expect(ind).toHaveProperty('id')
+      expect(ind).toHaveProperty('name')
+      expect(ind).toHaveProperty('icon')
+      expect(ind.name.length).toBeGreaterThan(0)
+    })
+  })
+
+  it('sollte jeden doc_type mit id, label, icon und sort haben', () => {
+    ragData.doc_types.forEach((dt: any) => {
+      expect(dt).toHaveProperty('id')
+      expect(dt).toHaveProperty('label')
+      expect(dt).toHaveProperty('icon')
+      expect(dt).toHaveProperty('sort')
+    })
+  })
+})
+
+describe('rag-documents.json — Dokument-Validierung', () => {
+  it('sollte keine doppelten Codes haben', () => {
+    const codes = ragData.documents.map((d: any) => d.code)
+    const unique = new Set(codes)
+    expect(unique.size).toBe(codes.length)
+  })
+
+  it('sollte Pflichtfelder bei jedem Dokument haben', () => {
+    ragData.documents.forEach((doc: any) => {
+      expect(doc).toHaveProperty('code')
+      expect(doc).toHaveProperty('name')
+      expect(doc).toHaveProperty('doc_type')
+      expect(doc).toHaveProperty('industries')
+      expect(doc).toHaveProperty('in_rag')
+      expect(doc).toHaveProperty('rag_collection')
+      expect(doc.code.length).toBeGreaterThan(0)
+      expect(doc.name.length).toBeGreaterThan(0)
+      expect(Array.isArray(doc.industries)).toBe(true)
+    })
+  })
+
+  it('sollte nur gueltige doc_type IDs verwenden', () => {
+    ragData.documents.forEach((doc: any) => {
+      expect(VALID_DOC_TYPE_IDS).toContain(doc.doc_type)
+    })
+  })
+
+  it('sollte nur gueltige industry IDs verwenden (oder "all")', () => {
+    ragData.documents.forEach((doc: any) => {
+      doc.industries.forEach((ind: string) => {
+        if (ind !== 'all') {
+          expect(VALID_INDUSTRY_IDS).toContain(ind)
+        }
+      })
+    })
+  })
+
+  it('sollte gueltige rag_collection Namen verwenden', () => {
+    const validCollections = [
+      'bp_compliance_ce',
+      'bp_compliance_gesetze',
+      'bp_compliance_datenschutz',
+      'bp_dsfa_corpus',
+      'bp_legal_templates',
+      'bp_compliance_recht',
+      'bp_nibis_eh',
+    ]
+    ragData.documents.forEach((doc: any) => {
+      expect(validCollections).toContain(doc.rag_collection)
+    })
+  })
+})
+
+describe('rag-documents.json — Branchen-Zuordnungslogik', () => {
+  const findDoc = (code: string) => ragData.documents.find((d: any) => d.code === code)
+
+  describe('Horizontale Regulierungen (alle Branchen)', () => {
+    const horizontalCodes = [
+      'GDPR', 'BDSG_FULL', 'EPRIVACY', 'TDDDG', 'AIACT', 'CRA',
+      'NIS2', 'GPSR', 'PLD', 'EUCSA', 'DATAACT',
+    ]
+
+    horizontalCodes.forEach((code) => {
+      it(`${code} sollte fuer alle Branchen gelten`, () => {
+        const doc = findDoc(code)
+        if (doc) {
+          expect(doc.industries).toContain('all')
+        }
+      })
+    })
+  })
+
+  describe('Sektorspezifische Regulierungen', () => {
+    it('Maschinenverordnung sollte Maschinenbau, Automotive, Elektrotechnik enthalten', () => {
+      const doc = findDoc('MACHINERY_REG')
+      if (doc) {
+        expect(doc.industries).toContain('maschinenbau')
+        expect(doc.industries).toContain('automotive')
+        expect(doc.industries).toContain('elektrotechnik')
+        expect(doc.industries).not.toContain('all')
+      }
+    })
+
+    it('ElektroG sollte Elektrotechnik und Automotive enthalten', () => {
+      const doc = findDoc('DE_ELEKTROG')
+      if (doc) {
+        expect(doc.industries).toContain('elektrotechnik')
+        expect(doc.industries).toContain('automotive')
+      }
+    })
+
+    it('BattDG sollte Automotive und Elektrotechnik enthalten', () => {
+      const doc = findDoc('DE_BATTDG')
+      if (doc) {
+        expect(doc.industries).toContain('automotive')
+        expect(doc.industries).toContain('elektrotechnik')
+      }
+    })
+
+    it('ENISA ICS/SCADA sollte Energie, Maschinenbau, Chemie enthalten', () => {
+      const doc = findDoc('ENISA_ICS_SCADA')
+      if (doc) {
+        expect(doc.industries).toContain('energie')
+        expect(doc.industries).toContain('maschinenbau')
+        expect(doc.industries).toContain('chemie')
+      }
+    })
+  })
+
+  describe('Nicht zutreffende Regulierungen (Finanz/Medizin/Plattformen)', () => {
+    const emptyIndustryCodes = ['DORA', 'PSD2', 'MiCA', 'AMLR', 'EHDS', 'DSA', 'DMA', 'MDR']
+
+    emptyIndustryCodes.forEach((code) => {
+      it(`${code} sollte keine Branchen-Zuordnung haben`, () => {
+        const doc = findDoc(code)
+        if (doc) {
+          expect(doc.industries).toHaveLength(0)
+        }
+      })
+    })
+  })
+
+  describe('BSI-TR-03161 (DiGA) sollte nicht zutreffend sein', () => {
+    ['BSI-TR-03161-1', 'BSI-TR-03161-2', 'BSI-TR-03161-3'].forEach((code) => {
+      it(`${code} sollte keine Branchen-Zuordnung haben`, () => {
+        const doc = findDoc(code)
+        if (doc) {
+          expect(doc.industries).toHaveLength(0)
+        }
+      })
+    })
+  })
+})
+
+describe('rag-documents.json — Applicability Notes', () => {
+  it('sollte applicability_note bei Dokumenten mit description haben', () => {
+    const withDescription = ragData.documents.filter((d: any) => d.description)
+    const withNote = withDescription.filter((d: any) => d.applicability_note)
+    // Mehr als 90% der Dokumente mit Beschreibung sollten eine Note haben
+    expect(withNote.length / withDescription.length).toBeGreaterThan(0.9)
+  })
+
+  it('horizontale Regulierungen sollten "alle Branchen" in der Note erwaehnen', () => {
+    const gdpr = ragData.documents.find((d: any) => d.code === 'GDPR')
+    if (gdpr?.applicability_note) {
+      expect(gdpr.applicability_note.toLowerCase()).toContain('alle branchen')
+    }
+  })
+
+  it('nicht zutreffende sollten "nicht zutreffend" in der Note erwaehnen', () => {
+    const dora = ragData.documents.find((d: any) => d.code === 'DORA')
+    if (dora?.applicability_note) {
+      expect(dora.applicability_note.toLowerCase()).toContain('nicht zutreffend')
+    }
+  })
+})
+
+describe('rag-documents.json — Dokumenttyp-Verteilung', () => {
+  it('sollte Dokumente in jedem doc_type haben', () => {
+    ragData.doc_types.forEach((dt: any) => {
+      const count = ragData.documents.filter((d: any) => d.doc_type === dt.id).length
+      expect(count).toBeGreaterThan(0)
+    })
+  })
+
+  it('sollte mindestens 15 EU-Verordnungen haben', () => {
+    const euRegs = ragData.documents.filter((d: any) => d.doc_type === 'eu_regulation')
+    expect(euRegs.length).toBeGreaterThanOrEqual(15)
+  })
+
+  it('sollte mindestens 40 EDPB-Leitlinien haben', () => {
+    const edpb = ragData.documents.filter((d: any) => d.doc_type === 'edpb_guideline')
+    expect(edpb.length).toBeGreaterThanOrEqual(40)
+  })
+})
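Review note on the test file above: several tests wrap their expectations in `if (doc)`, so they pass without asserting anything when the looked-up code is missing from the JSON. A stricter lookup might look like the minimal sketch below. It assumes a hypothetical helper `requireDoc` and a hypothetical `RagDocument` shape, neither of which is part of this commit, and it uses a small inline sample instead of the real `rag-documents.json`.

```typescript
// Hypothetical document shape; the real JSON entries carry more fields.
interface RagDocument {
  code: string
  industries: string[]
}

// Hypothetical helper: throws when the code is unknown, so a missing
// document fails the test instead of silently skipping its assertions.
function requireDoc(documents: RagDocument[], code: string): RagDocument {
  const doc = documents.find((d) => d.code === code)
  if (!doc) {
    throw new Error(`Document with code "${code}" not found`)
  }
  return doc
}

// Inline sample data for illustration only, not taken from the real file.
const sampleDocuments: RagDocument[] = [
  { code: 'GDPR', industries: ['all'] },
  { code: 'DORA', industries: [] },
]
```

In a test, `const doc = requireDoc(ragData.documents, 'GDPR')` could then replace the `findDoc` call plus the `if (doc)` guard. Whether missing codes should fail the suite or stay optional is a design decision for the authors.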
--- a/admin-lehrer/app/(admin)/ai/rag/page.tsx
+++ b/admin-lehrer/app/(admin)/ai/rag/page.tsx
--- a/admin-lehrer/app/(admin)/ai/rag/rag-documents.json
+++ b/admin-lehrer/app/(admin)/ai/rag/rag-documents.json
--- a/admin-lehrer/app/(admin)/communication/matrix/page.tsx
+++ b/admin-lehrer/app/(admin)/communication/matrix/page.tsx
@@ -1,593 +0,0 @@
-'use client'
-
-/**
- * Voice Service Admin Page (migrated from website/admin/voice)
- *
- * Displays:
- * - Voice-First Architecture Overview
- * - Developer Guide Content
- * - Live Voice Demo (embedded from studio-v2)
- * - Task State Machine Documentation
- * - DSGVO Compliance Information
- */
-
-import { useState } from 'react'
-import Link from 'next/link'
-import { PagePurpose } from '@/components/common/PagePurpose'
-
-type TabType = 'overview' | 'demo' | 'tasks' | 'intents' | 'dsgvo' | 'api'
-
-// Task State Machine data
-const TASK_STATES = [
-  { state: 'DRAFT', description: 'Task erstellt, noch nicht verarbeitet', color: 'bg-gray-100 text-gray-800', next: ['QUEUED', 'PAUSED'] },
-  { state: 'QUEUED', description: 'In Warteschlange fuer Verarbeitung', color: 'bg-blue-100 text-blue-800', next: ['RUNNING', 'PAUSED'] },
-  { state: 'RUNNING', description: 'Wird aktuell verarbeitet', color: 'bg-yellow-100 text-yellow-800', next: ['READY', 'PAUSED'] },
-  { state: 'READY', description: 'Fertig, wartet auf User-Bestaetigung', color: 'bg-green-100 text-green-800', next: ['APPROVED', 'REJECTED', 'PAUSED'] },
-  { state: 'APPROVED', description: 'Vom User bestaetigt', color: 'bg-emerald-100 text-emerald-800', next: ['COMPLETED'] },
-  { state: 'REJECTED', description: 'Vom User abgelehnt', color: 'bg-red-100 text-red-800', next: ['DRAFT'] },
-  { state: 'COMPLETED', description: 'Erfolgreich abgeschlossen', color: 'bg-teal-100 text-teal-800', next: [] },
-  { state: 'EXPIRED', description: 'TTL ueberschritten', color: 'bg-orange-100 text-orange-800', next: [] },
-  { state: 'PAUSED', description: 'Vom User pausiert', color: 'bg-purple-100 text-purple-800', next: ['DRAFT', 'QUEUED', 'RUNNING', 'READY'] },
-]
-
-// Intent Types (22 types organized by group)
-const INTENT_GROUPS = [
-  {
-    group: 'Notizen',
-    color: 'bg-blue-50 border-blue-200',
-    intents: [
-      { type: 'student_observation', example: 'Notiz zu Max: heute wiederholt gestoert', description: 'Schuelerbeobachtungen' },
-      { type: 'reminder', example: 'Erinner mich morgen an Konferenz', description: 'Erinnerungen setzen' },
-      { type: 'homework_check', example: '7b Mathe Hausaufgabe kontrollieren', description: 'Hausaufgaben pruefen' },
-      { type: 'conference_topic', example: 'Thema Lehrerkonferenz: iPad-Regeln', description: 'Konferenzthemen' },
-      { type: 'correction_thought', example: 'Aufgabe 3: haeufiger Fehler erklaeren', description: 'Korrekturgedanken' },
-    ]
-  },
-  {
-    group: 'Content-Generierung',
-    color: 'bg-green-50 border-green-200',
-    intents: [
-      { type: 'worksheet_generate', example: 'Erstelle 3 Lueckentexte zu Vokabeln', description: 'Arbeitsblaetter erstellen' },
-      { type: 'quiz_generate', example: '10-Minuten Vokabeltest mit Loesungen', description: 'Quiz/Tests erstellen' },
-      { type: 'quick_activity', example: '10 Minuten Einstieg, 5 Aufgaben', description: 'Schnelle Aktivitaeten' },
-      { type: 'differentiation', example: 'Zwei Schwierigkeitsstufen: Basis und Plus', description: 'Differenzierung' },
-    ]
-  },
-  {
-    group: 'Kommunikation',
-    color: 'bg-yellow-50 border-yellow-200',
-    intents: [
-      { type: 'parent_letter', example: 'Neutraler Elternbrief wegen Stoerungen', description: 'Elternbriefe erstellen' },
-      { type: 'class_message', example: 'Nachricht an 8a: Hausaufgaben bis Mittwoch', description: 'Klassennachrichten' },
-    ]
-  },
-  {
-    group: 'Canvas-Editor',
-    color: 'bg-purple-50 border-purple-200',
-    intents: [
-      { type: 'canvas_edit', example: 'Ueberschriften groesser, Zeilenabstand kleiner', description: 'Formatierung aendern' },
-      { type: 'canvas_layout', example: 'Alles auf eine Seite, Drucklayout A4', description: 'Layout anpassen' },
-      { type: 'canvas_element', example: 'Kasten fuer Merke hinzufuegen', description: 'Elemente hinzufuegen' },
-      { type: 'canvas_image', example: 'Bild 2 nach links, Pfeil auf Aufgabe 3', description: 'Bilder positionieren' },
-    ]
-  },
-  {
-    group: 'RAG & Korrektur',
-    color: 'bg-pink-50 border-pink-200',
-    intents: [
-      { type: 'operator_checklist', example: 'Operatoren-Checkliste fuer diese Aufgabe', description: 'Operatoren abrufen' },
-      { type: 'eh_passage', example: 'Erwartungshorizont-Passage zu diesem Thema', description: 'EH-Passagen suchen' },
-      { type: 'feedback_suggestion', example: 'Kurze Feedbackformulierung vorschlagen', description: 'Feedback vorschlagen' },
-    ]
-  },
-  {
-    group: 'Follow-up (TaskOrchestrator)',
-    color: 'bg-teal-50 border-teal-200',
-    intents: [
-      { type: 'task_summary', example: 'Fasse alle offenen Tasks zusammen', description: 'Task-Uebersicht' },
-      { type: 'convert_note', example: 'Mach aus der Notiz von gestern einen Elternbrief', description: 'Notizen konvertieren' },
-      { type: 'schedule_reminder', example: 'Erinner mich morgen an das Gespraech mit Max', description: 'Erinnerungen planen' },
-    ]
-  },
-]
-
-// DSGVO Data Categories
-const DSGVO_CATEGORIES = [
-  { category: 'Audio', processing: 'NUR transient im RAM, NIEMALS persistiert', storage: 'Keine', ttl: '-', icon: '🎤', risk: 'low' },
-  { category: 'PII (Schuelernamen)', processing: 'NUR auf Lehrergeraet', storage: 'Client-side', ttl: '-', icon: '👤', risk: 'high' },
-  { category: 'Pseudonyme', processing: 'Server erlaubt (student_ref, class_ref)', storage: 'Valkey Cache', ttl: '24h', icon: '🔢', risk: 'low' },
-  { category: 'Transkripte', processing: 'NUR verschluesselt (AES-256-GCM)', storage: 'PostgreSQL', ttl: '7 Tage', icon: '📝', risk: 'medium' },
-  { category: 'Task States', processing: 'TaskOrchestrator', storage: 'Valkey', ttl: '30 Tage', icon: '📋', risk: 'low' },
-  { category: 'Audit Logs', processing: 'Nur truncated IDs, keine PII', storage: 'PostgreSQL', ttl: '90 Tage', icon: '📊', risk: 'low' },
-]
-
-// API Endpoints
-const API_ENDPOINTS = [
-  { method: 'POST', path: '/api/v1/sessions', description: 'Voice Session erstellen' },
-  { method: 'GET', path: '/api/v1/sessions/{id}', description: 'Session Status abrufen' },
-  { method: 'DELETE', path: '/api/v1/sessions/{id}', description: 'Session beenden' },
-  { method: 'GET', path: '/api/v1/sessions/{id}/tasks', description: 'Pending Tasks abrufen' },
-  { method: 'POST', path: '/api/v1/tasks', description: 'Task erstellen' },
-  { method: 'GET', path: '/api/v1/tasks/{id}', description: 'Task Status abrufen' },
-  { method: 'PUT', path: '/api/v1/tasks/{id}/transition', description: 'Task State aendern' },
-  { method: 'DELETE', path: '/api/v1/tasks/{id}', description: 'Task loeschen' },
-  { method: 'WS', path: '/ws/voice', description: 'Voice Streaming (WebSocket)' },
-  { method: 'GET', path: '/health', description: 'Health Check' },
-]
-
-export default function VoiceMatrixPage() {
-  const [activeTab, setActiveTab] = useState<TabType>('overview')
-  const [demoLoaded, setDemoLoaded] = useState(false)
-
-  const tabs = [
-    { id: 'overview', name: 'Architektur', icon: '🏗️' },
-    { id: 'demo', name: 'Live Demo', icon: '🎤' },
-    { id: 'tasks', name: 'Task States', icon: '📋' },
-    { id: 'intents', name: 'Intents (22)', icon: '🎯' },
-    { id: 'dsgvo', name: 'DSGVO', icon: '🔒' },
-    { id: 'api', name: 'API', icon: '🔌' },
-  ]
-
-  return (
-    <div>
-      {/* Page Purpose */}
-      <PagePurpose
-        title="Voice Service"
-        purpose="Voice-First Interface mit PersonaPlex-7B & TaskOrchestrator. Konfigurieren und testen Sie den Voice-Service fuer Lehrer-Interaktionen per Sprache."
-        audience={['Entwickler', 'Admins']}
-        architecture={{
-          services: ['voice-service (Python, Port 8091)', 'studio-v2 (Next.js)', 'valkey (Cache)'],
-          databases: ['PostgreSQL', 'Valkey Cache'],
-        }}
-        relatedPages={[
-          { name: 'Matrix & Jitsi', href: '/communication/matrix', description: 'Kommunikation Monitoring' },
-          { name: 'GPU Infrastruktur', href: '/infrastructure/gpu', description: 'GPU fuer Voice-Service' },
-        ]}
-        collapsible={true}
-        defaultCollapsed={false}
-      />
-
-      {/* Quick Links */}
-      <div className="mb-6 flex flex-wrap gap-3">
-        <a
-          href="https://macmini:3001/voice-test"
-          target="_blank"
-          rel="noopener noreferrer"
-          className="flex items-center gap-2 px-4 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
-        >
-          <svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M19 11a7 7 0 01-7 7m0 0a7 7 0 01-7-7m7 7v4m0 0H8m4 0h4m-4-8a3 3 0 01-3-3V5a3 3 0 116 0v6a3 3 0 01-3 3z" />
-          </svg>
-          Voice Test (Studio)
-        </a>
-        <a
-          href="https://macmini:8091/health"
-          target="_blank"
-          rel="noopener noreferrer"
-          className="flex items-center gap-2 px-4 py-2 bg-green-100 text-green-700 rounded-lg hover:bg-green-200 transition-colors"
-        >
-          <svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z" />
-          </svg>
-          Health Check
-        </a>
-        <Link
-          href="/development/docs"
-          className="flex items-center gap-2 px-4 py-2 bg-slate-100 text-slate-700 rounded-lg hover:bg-slate-200 transition-colors"
-        >
-          <svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12h6m-6 4h6m2 5H7a2 2 0 01-2-2V5a2 2 0 012-2h5.586a1 1 0 01.707.293l5.414 5.414a1 1 0 01.293.707V19a2 2 0 01-2 2z" />
-          </svg>
-          Developer Docs
-        </Link>
-      </div>
-
-      {/* Stats Overview */}
-      <div className="grid grid-cols-2 md:grid-cols-6 gap-4 mb-6">
-        <div className="bg-white rounded-lg shadow p-4">
-          <div className="text-3xl font-bold text-teal-600">8091</div>
-          <div className="text-sm text-slate-500">Port</div>
-        </div>
-        <div className="bg-white rounded-lg shadow p-4">
-          <div className="text-3xl font-bold text-blue-600">22</div>
-          <div className="text-sm text-slate-500">Task Types</div>
-        </div>
-        <div className="bg-white rounded-lg shadow p-4">
-          <div className="text-3xl font-bold text-purple-600">9</div>
-          <div className="text-sm text-slate-500">Task States</div>
-        </div>
-        <div className="bg-white rounded-lg shadow p-4">
-          <div className="text-3xl font-bold text-green-600">24kHz</div>
-          <div className="text-sm text-slate-500">Audio Rate</div>
-        </div>
-        <div className="bg-white rounded-lg shadow p-4">
-          <div className="text-3xl font-bold text-orange-600">80ms</div>
-          <div className="text-sm text-slate-500">Frame Size</div>
-        </div>
-        <div className="bg-white rounded-lg shadow p-4">
-          <div className="text-3xl font-bold text-red-600">0</div>
-          <div className="text-sm text-slate-500">Audio Persist</div>
-        </div>
-      </div>
-
-      {/* Tabs */}
-      <div className="bg-white rounded-lg shadow mb-6">
-        <div className="border-b border-slate-200 px-4">
-          <div className="flex gap-1 overflow-x-auto">
-            {tabs.map((tab) => (
-              <button
-                key={tab.id}
-                onClick={() => setActiveTab(tab.id as TabType)}
-                className={`px-4 py-3 text-sm font-medium whitespace-nowrap transition-colors border-b-2 ${
-                  activeTab === tab.id
-                    ? 'border-teal-600 text-teal-600'
-                    : 'border-transparent text-slate-500 hover:text-slate-700'
-                }`}
-              >
-                <span className="mr-2">{tab.icon}</span>
-                {tab.name}
-              </button>
-            ))}
-          </div>
-        </div>
-
-        <div className="p-6">
-          {/* Overview Tab */}
-          {activeTab === 'overview' && (
-            <div className="space-y-6">
-              <h3 className="text-lg font-semibold text-slate-900">Voice-First Architektur</h3>
-
-              {/* Architecture Diagram */}
-              <div className="bg-slate-50 rounded-lg p-6 font-mono text-sm overflow-x-auto">
-                <pre className="text-slate-700">{`
-┌──────────────────────────────────────────────────────────────────┐
-│                    LEHRERGERAET (PWA / App)                       │
-│  ┌────────────────────────────────────────────────────────────┐  │
-│  │ VoiceCapture.tsx │ voice-encryption.ts │ voice-api.ts      │  │
-│  │ Mikrofon         │ AES-256-GCM         │ WebSocket Client  │  │
-│  └────────────────────────────────────────────────────────────┘  │
-└───────────────────────────┬──────────────────────────────────────┘
-                            │ WebSocket (wss://)
-                            ▼
-┌──────────────────────────────────────────────────────────────────┐
-│                    VOICE SERVICE (Port 8091)                      │
-│  ┌────────────────────────────────────────────────────────────┐  │
-│  │ main.py │ streaming.py │ sessions.py │ tasks.py            │  │
-│  └────────────────────────────────────────────────────────────┘  │
-│  ┌────────────────────────────────────────────────────────────┐  │
-│  │ task_orchestrator.py │ intent_router.py │ encryption        │  │
-│  └────────────────────────────────────────────────────────────┘  │
-└───────────────────────────┬──────────────────────────────────────┘
-                            │
-         ┌──────────────────┼──────────────────┐
-         ▼                  ▼                  ▼
-┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
-│ PersonaPlex-7B  │ │ Ollama Fallback │ │ Valkey Cache    │
-│ (A100 GPU)      │ │ (Mac Mini)      │ │ (Sessions)      │
-└─────────────────┘ └─────────────────┘ └─────────────────┘
-`}</pre>
-              </div>
-
-              {/* Technology Stack */}
-              <div className="grid grid-cols-1 md:grid-cols-3 gap-4">
-                <div className="bg-blue-50 border border-blue-200 rounded-lg p-4">
-                  <h4 className="font-semibold text-blue-800 mb-2">Voice Model (Produktion)</h4>
-                  <p className="text-sm text-blue-700">PersonaPlex-7B (NVIDIA)</p>
-                  <p className="text-xs text-blue-600 mt-1">Full-Duplex Speech-to-Speech</p>
-                  <p className="text-xs text-blue-500">Lizenz: MIT + NVIDIA Open Model</p>
-                </div>
-                <div className="bg-green-50 border border-green-200 rounded-lg p-4">
-                  <h4 className="font-semibold text-green-800 mb-2">Agent Orchestration</h4>
-                  <p className="text-sm text-green-700">TaskOrchestrator</p>
-                  <p className="text-xs text-green-600 mt-1">Task State Machine</p>
-                  <p className="text-xs text-green-500">Lizenz: Proprietary</p>
-                </div>
-                <div className="bg-purple-50 border border-purple-200 rounded-lg p-4">
-                  <h4 className="font-semibold text-purple-800 mb-2">Audio Codec</h4>
-                  <p className="text-sm text-purple-700">Mimi (24kHz, 80ms)</p>
-                  <p className="text-xs text-purple-600 mt-1">Low-Latency Streaming</p>
-                  <p className="text-xs text-purple-500">Lizenz: MIT</p>
-                </div>
-              </div>
-
-              {/* Key Files */}
-              <div>
-                <h4 className="font-semibold text-slate-800 mb-3">Wichtige Dateien</h4>
-                <div className="bg-white border border-slate-200 rounded-lg overflow-hidden">
-                  <table className="min-w-full divide-y divide-slate-200">
-                    <thead className="bg-slate-50">
-                      <tr>
-                        <th className="px-4 py-2 text-left text-xs font-medium text-slate-500 uppercase">Datei</th>
-                        <th className="px-4 py-2 text-left text-xs font-medium text-slate-500 uppercase">Beschreibung</th>
-                      </tr>
-                    </thead>
-                    <tbody className="divide-y divide-slate-200">
-                      <tr><td className="px-4 py-2 font-mono text-sm">voice-service/main.py</td><td className="px-4 py-2 text-sm text-slate-600">FastAPI Entry, WebSocket Handler</td></tr>
-                      <tr><td className="px-4 py-2 font-mono text-sm">voice-service/services/task_orchestrator.py</td><td className="px-4 py-2 text-sm text-slate-600">Task State Machine</td></tr>
-                      <tr><td className="px-4 py-2 font-mono text-sm">voice-service/services/intent_router.py</td><td className="px-4 py-2 text-sm text-slate-600">Intent Detection (22 Types)</td></tr>
-                      <tr><td className="px-4 py-2 font-mono text-sm">voice-service/services/encryption_service.py</td><td className="px-4 py-2 text-sm text-slate-600">Namespace Key Management</td></tr>
-                      <tr><td className="px-4 py-2 font-mono text-sm">studio-v2/components/voice/VoiceCapture.tsx</td><td className="px-4 py-2 text-sm text-slate-600">Frontend Mikrofon + Crypto</td></tr>
-                      <tr><td className="px-4 py-2 font-mono text-sm">studio-v2/lib/voice/voice-encryption.ts</td><td className="px-4 py-2 text-sm text-slate-600">AES-256-GCM Client-side</td></tr>
-                    </tbody>
-                  </table>
-                </div>
-              </div>
-            </div>
-          )}
-
-          {/* Demo Tab */}
-          {activeTab === 'demo' && (
-            <div className="space-y-4">
-              <div className="flex items-center justify-between">
-                <h3 className="text-lg font-semibold text-slate-900">Live Voice Demo</h3>
-                <a
-                  href="https://macmini:3001/voice-test"
-                  target="_blank"
-                  rel="noopener noreferrer"
-                  className="text-sm text-teal-600 hover:text-teal-700 flex items-center gap-1"
-                >
-                  In neuem Tab oeffnen
-                  <svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-                    <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M10 6H6a2 2 0 00-2 2v10a2 2 0 002 2h10a2 2 0 002-2v-4M14 4h6m0 0v6m0-6L10 14" />
-                  </svg>
-                </a>
-              </div>
-
-              <div className="bg-slate-100 rounded-lg p-4 text-sm text-slate-600 mb-4">
-                <p><strong>Hinweis:</strong> Die Demo erfordert, dass der Voice Service (Port 8091) und das Studio-v2 Frontend (Port 3001) laufen.</p>
-                <code className="block mt-2 bg-slate-200 p-2 rounded">docker compose up -d voice-service && cd studio-v2 && npm run dev</code>
-              </div>
-
-              {/* Embedded Demo */}
-              <div className="relative bg-slate-900 rounded-lg overflow-hidden" style={{ height: '600px' }}>
-                {!demoLoaded && (
-                  <div className="absolute inset-0 flex items-center justify-center">
-                    <button
-                      onClick={() => setDemoLoaded(true)}
-                      className="px-6 py-3 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors flex items-center gap-2"
-                    >
-                      <svg className="w-6 h-6" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-                        <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M14.752 11.168l-3.197-2.132A1 1 0 0010 9.87v4.263a1 1 0 001.555.832l3.197-2.132a1 1 0 000-1.664z" />
-                        <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
-                      </svg>
-                      Voice Demo laden
-                    </button>
-                  </div>
-                )}
-                {demoLoaded && (
-                  <iframe
-                    src="https://macmini:3001/voice-test?embed=true"
-                    className="w-full h-full border-0"
-                    title="Voice Demo"
-                    allow="microphone"
-                  />
-                )}
-              </div>
-            </div>
-          )}
-
-          {/* Task States Tab */}
-          {activeTab === 'tasks' && (
-            <div className="space-y-6">
-              <h3 className="text-lg font-semibold text-slate-900">Task State Machine (TaskOrchestrator)</h3>
-
-              {/* State Diagram */}
-              <div className="bg-slate-50 rounded-lg p-6 font-mono text-sm overflow-x-auto">
-                <pre className="text-slate-700">{`
-DRAFT → QUEUED → RUNNING → READY
-                              │
-                  ┌───────────┴───────────┐
-                  │                       │
-              APPROVED                REJECTED
-                  │                       │
-              COMPLETED               DRAFT (revision)
-
-Any State → EXPIRED (TTL)
-Any State → PAUSED (User Interrupt)
-`}</pre>
-              </div>
-
-              {/* States Table */}
-              <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
-                {TASK_STATES.map((state) => (
-                  <div key={state.state} className={`${state.color} rounded-lg p-4`}>
-                    <div className="font-semibold text-lg">{state.state}</div>
-                    <p className="text-sm mt-1">{state.description}</p>
-                    {state.next.length > 0 && (
-                      <div className="mt-2 text-xs">
-                        <span className="opacity-75">Naechste:</span>{' '}
-                        {state.next.join(', ')}
-                      </div>
-                    )}
-                  </div>
-                ))}
-              </div>
-            </div>
-          )}
-
-          {/* Intents Tab */}
-          {activeTab === 'intents' && (
-            <div className="space-y-6">
-              <h3 className="text-lg font-semibold text-slate-900">Intent Types (22 unterstuetzte Typen)</h3>
-
-              {INTENT_GROUPS.map((group) => (
-                <div key={group.group} className={`${group.color} border rounded-lg p-4`}>
-                  <h4 className="font-semibold text-slate-800 mb-3">{group.group}</h4>
-                  <div className="space-y-2">
-                    {group.intents.map((intent) => (
-                      <div key={intent.type} className="bg-white rounded-lg p-3 shadow-sm">
-                        <div className="flex items-start justify-between">
-                          <div>
-                            <code className="text-sm font-mono text-teal-700 bg-teal-50 px-2 py-0.5 rounded">
-                              {intent.type}
-                            </code>
-                            <p className="text-sm text-slate-600 mt-1">{intent.description}</p>
-                          </div>
-                        </div>
-                        <div className="mt-2 text-xs text-slate-500 italic">
-                          Beispiel: &quot;{intent.example}&quot;
-                        </div>
-                      </div>
-                    ))}
-                  </div>
-                </div>
-              ))}
-            </div>
-          )}
-
-          {/* DSGVO Tab */}
-          {activeTab === 'dsgvo' && (
-            <div className="space-y-6">
-              <h3 className="text-lg font-semibold text-slate-900">DSGVO-Compliance</h3>
-
-              {/* Key Principles */}
-              <div className="bg-green-50 border border-green-200 rounded-lg p-4">
-                <h4 className="font-semibold text-green-800 mb-2">Kernprinzipien</h4>
-                <ul className="list-disc list-inside text-sm text-green-700 space-y-1">
-                  <li><strong>Audio NIEMALS persistiert</strong> - Nur transient im RAM</li>
-                  <li><strong>Namespace-Verschluesselung</strong> - Key nur auf Lehrergeraet</li>
-                  <li><strong>Keine Klartext-PII serverseitig</strong> - Nur verschluesselt oder pseudonymisiert</li>
-                  <li><strong>TTL-basierte Auto-Loeschung</strong> - 7/30/90 Tage je nach Kategorie</li>
-                </ul>
-              </div>
-
-              {/* Data Categories Table */}
-              <div className="bg-white border border-slate-200 rounded-lg overflow-hidden">
-                <table className="min-w-full divide-y divide-slate-200">
-                  <thead className="bg-slate-50">
-                    <tr>
-                      <th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Kategorie</th>
-                      <th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Verarbeitung</th>
-                      <th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Speicherort</th>
-                      <th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">TTL</th>
-                      <th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Risiko</th>
-                    </tr>
-                  </thead>
-                  <tbody className="divide-y divide-slate-200">
-                    {DSGVO_CATEGORIES.map((cat) => (
-                      <tr key={cat.category}>
-                        <td className="px-4 py-3">
-                          <span className="mr-2">{cat.icon}</span>
-                          <span className="font-medium">{cat.category}</span>
-                        </td>
-                        <td className="px-4 py-3 text-sm text-slate-600">{cat.processing}</td>
-                        <td className="px-4 py-3 text-sm text-slate-600">{cat.storage}</td>
-                        <td className="px-4 py-3 text-sm text-slate-600">{cat.ttl}</td>
-                        <td className="px-4 py-3">
-                          <span className={`px-2 py-1 rounded text-xs font-medium ${
-                            cat.risk === 'low' ? 'bg-green-100 text-green-700' :
-                            cat.risk === 'medium' ? 'bg-yellow-100 text-yellow-700' :
-                            'bg-red-100 text-red-700'
-                          }`}>
-                            {cat.risk.toUpperCase()}
-                          </span>
-                        </td>
-                      </tr>
-                    ))}
-                  </tbody>
-                </table>
-              </div>
-
-              {/* Audit Log Info */}
-              <div className="bg-slate-50 border border-slate-200 rounded-lg p-4">
-                <h4 className="font-semibold text-slate-800 mb-2">Audit Logs (ohne PII)</h4>
-                <div className="grid grid-cols-2 gap-4 text-sm">
-                  <div>
-                    <span className="text-green-600 font-medium">Erlaubt:</span>
-                    <ul className="list-disc list-inside text-slate-600 mt-1">
-                      <li>ref_id (truncated)</li>
-                      <li>content_type</li>
-                      <li>size_bytes</li>
-                      <li>ttl_hours</li>
-                    </ul>
-                  </div>
-                  <div>
-                    <span className="text-red-600 font-medium">Verboten:</span>
-                    <ul className="list-disc list-inside text-slate-600 mt-1">
-                      <li>user_name</li>
-                      <li>content / transcript</li>
-                      <li>email</li>
-                      <li>student_name</li>
-                    </ul>
-                  </div>
-                </div>
-              </div>
-            </div>
-          )}
-
-          {/* API Tab */}
-          {activeTab === 'api' && (
-            <div className="space-y-6">
-              <h3 className="text-lg font-semibold text-slate-900">Voice Service API (Port 8091)</h3>
-
-              {/* REST Endpoints */}
-              <div className="bg-white border border-slate-200 rounded-lg overflow-hidden">
-                <table className="min-w-full divide-y divide-slate-200">
-                  <thead className="bg-slate-50">
-                    <tr>
-                      <th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Methode</th>
-                      <th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Endpoint</th>
-                      <th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Beschreibung</th>
-                    </tr>
-                  </thead>
-                  <tbody className="divide-y divide-slate-200">
-                    {API_ENDPOINTS.map((ep, idx) => (
-                      <tr key={idx}>
-                        <td className="px-4 py-3">
-                          <span className={`px-2 py-1 rounded text-xs font-medium ${
-                            ep.method === 'GET' ? 'bg-green-100 text-green-700' :
-                            ep.method === 'POST' ? 'bg-blue-100 text-blue-700' :
-                            ep.method === 'PUT' ? 'bg-yellow-100 text-yellow-700' :
-                            ep.method === 'DELETE' ? 'bg-red-100 text-red-700' :
-                            'bg-purple-100 text-purple-700'
-                          }`}>
-                            {ep.method}
-                          </span>
-                        </td>
-                        <td className="px-4 py-3 font-mono text-sm">{ep.path}</td>
-                        <td className="px-4 py-3 text-sm text-slate-600">{ep.description}</td>
-                      </tr>
-                    ))}
-                  </tbody>
-                </table>
-              </div>
-
-              {/* WebSocket Protocol */}
-              <div className="bg-slate-50 rounded-lg p-4">
-                <h4 className="font-semibold text-slate-800 mb-3">WebSocket Protocol</h4>
-                <div className="grid grid-cols-1 md:grid-cols-2 gap-4 text-sm">
-                  <div className="bg-white rounded-lg p-3 border border-slate-200">
-                    <div className="font-medium text-slate-700 mb-2">Client → Server</div>
-                    <ul className="list-disc list-inside text-slate-600 space-y-1">
-                      <li><code className="bg-slate-100 px-1 rounded">Binary</code>: Int16 PCM Audio (24kHz, 80ms)</li>
-                      <li><code className="bg-slate-100 px-1 rounded">JSON</code>: {`{type: "config|end_turn|interrupt"}`}</li>
-                    </ul>
-                  </div>
-                  <div className="bg-white rounded-lg p-3 border border-slate-200">
-                    <div className="font-medium text-slate-700 mb-2">Server → Client</div>
-                    <ul className="list-disc list-inside text-slate-600 space-y-1">
-                      <li><code className="bg-slate-100 px-1 rounded">Binary</code>: Audio Response (base64)</li>
-                      <li><code className="bg-slate-100 px-1 rounded">JSON</code>: {`{type: "transcript|intent|status|error"}`}</li>
-                    </ul>
-                  </div>
-                </div>
-              </div>
-
-              {/* Example curl commands */}
-              <div className="bg-slate-900 rounded-lg p-4 text-sm">
-                <h4 className="font-semibold text-slate-300 mb-3">Beispiel: Session erstellen</h4>
-                <pre className="text-green-400 overflow-x-auto">{`curl -X POST https://macmini:8091/api/v1/sessions \\
-  -H "Content-Type: application/json" \\
-  -d '{
-    "namespace_id": "ns-12345678abcdef12345678abcdef12",
-    "key_hash": "sha256:dGVzdGtleWhhc2h0ZXN0a2V5aGFzaHRlc3Q=",
-    "device_type": "pwa"
-  }'`}</pre>
-              </div>
-            </div>
-          )}
-        </div>
-      </div>
-    </div>
-  )
-}
--- a/admin-lehrer/app/(admin)/communication/video-chat/page.tsx
+++ b/admin-lehrer/app/(admin)/communication/video-chat/page.tsx
@@ -1,635 +0,0 @@
-'use client'
-
-/**
- * Video & Chat Admin Page
- *
- * Matrix & Jitsi Monitoring Dashboard
- * Provides system statistics, active calls, user metrics, and service health
- * Migrated from website/app/admin/communication
- */
-
-import { useEffect, useState, useCallback } from 'react'
-import Link from 'next/link'
-import { PagePurpose } from '@/components/common/PagePurpose'
-import { getModuleByHref } from '@/lib/navigation'
-
-interface MatrixStats {
-  total_users: number
-  active_users: number
-  total_rooms: number
-  active_rooms: number
-  messages_today: number
-  messages_this_week: number
-  status: 'online' | 'offline' | 'degraded'
-}
-
-interface JitsiStats {
-  active_meetings: number
-  total_participants: number
-  meetings_today: number
-  average_duration_minutes: number
-  peak_concurrent_users: number
-  total_minutes_today: number
-  status: 'online' | 'offline' | 'degraded'
-}
-
-interface TrafficStats {
-  matrix: {
-    bandwidth_in_mb: number
-    bandwidth_out_mb: number
-    messages_per_minute: number
-    media_uploads_today: number
-    media_size_mb: number
-  }
-  jitsi: {
-    bandwidth_in_mb: number
-    bandwidth_out_mb: number
-    video_streams_active: number
-    audio_streams_active: number
-    estimated_hourly_gb: number
-  }
-  total: {
-    bandwidth_in_mb: number
-    bandwidth_out_mb: number
-    estimated_monthly_gb: number
-  }
-}
-
-interface CommunicationStats {
-  matrix: MatrixStats
-  jitsi: JitsiStats
-  traffic?: TrafficStats
-  last_updated: string
-}
-
-interface ActiveMeeting {
-  room_name: string
-  display_name: string
-  participants: number
-  started_at: string
-  duration_minutes: number
-}
-
-interface RecentRoom {
-  room_id: string
-  name: string
-  member_count: number
-  last_activity: string
-  room_type: 'class' | 'parent' | 'staff' | 'general'
-}
-
-export default function VideoChatPage() {
-  const [stats, setStats] = useState<CommunicationStats | null>(null)
-  const [activeMeetings, setActiveMeetings] = useState<ActiveMeeting[]>([])
-  const [recentRooms, setRecentRooms] = useState<RecentRoom[]>([])
-  const [loading, setLoading] = useState(true)
-  const [error, setError] = useState<string | null>(null)
-
-  const moduleInfo = getModuleByHref('/communication/video-chat')
-
-  // Use local API proxy
-  const fetchStats = useCallback(async () => {
-    try {
-      const response = await fetch('/api/admin/communication/stats')
-      if (!response.ok) {
-        throw new Error(`HTTP ${response.status}`)
-      }
-      const data = await response.json()
-      setStats(data)
-      setActiveMeetings(data.active_meetings || [])
-      setRecentRooms(data.recent_rooms || [])
-      setError(null)
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Verbindungsfehler')
-      // Set mock data for display purposes when API unavailable
-      setStats({
-        matrix: {
-          total_users: 0,
-          active_users: 0,
-          total_rooms: 0,
-          active_rooms: 0,
-          messages_today: 0,
-          messages_this_week: 0,
-          status: 'offline'
-        },
-        jitsi: {
-          active_meetings: 0,
-          total_participants: 0,
-          meetings_today: 0,
-          average_duration_minutes: 0,
-          peak_concurrent_users: 0,
-          total_minutes_today: 0,
-          status: 'offline'
-        },
-        last_updated: new Date().toISOString()
-      })
-    } finally {
-      setLoading(false)
-    }
-  }, [])
-
-  useEffect(() => {
-    fetchStats()
-  }, [fetchStats])
-
-  // Auto-refresh every 15 seconds
-  useEffect(() => {
-    const interval = setInterval(fetchStats, 15000)
-    return () => clearInterval(interval)
-  }, [fetchStats])
-
-  const getStatusBadge = (status: string) => {
-    const baseClasses = 'px-3 py-1 rounded-full text-xs font-semibold uppercase'
-    switch (status) {
-      case 'online':
-        return `${baseClasses} bg-green-100 text-green-800`
-      case 'degraded':
-        return `${baseClasses} bg-yellow-100 text-yellow-800`
-      case 'offline':
-        return `${baseClasses} bg-red-100 text-red-800`
-      default:
-        return `${baseClasses} bg-slate-100 text-slate-600`
-    }
-  }
-
-  const getRoomTypeBadge = (type: string) => {
-    const baseClasses = 'px-2 py-0.5 rounded text-xs font-medium'
-    switch (type) {
-      case 'class':
-        return `${baseClasses} bg-blue-100 text-blue-700`
-      case 'parent':
-        return `${baseClasses} bg-purple-100 text-purple-700`
-      case 'staff':
-        return `${baseClasses} bg-orange-100 text-orange-700`
-      default:
-        return `${baseClasses} bg-slate-100 text-slate-600`
-    }
-  }
-
-  const formatDuration = (minutes: number) => {
-    if (minutes < 60) return `${Math.round(minutes)} Min.`
-    const hours = Math.floor(minutes / 60)
-    const mins = Math.round(minutes % 60)
-    return `${hours}h ${mins}m`
-  }
-
-  const formatTimeAgo = (dateStr: string) => {
-    const date = new Date(dateStr)
-    const now = new Date()
-    const diffMs = now.getTime() - date.getTime()
-    const diffMins = Math.floor(diffMs / 60000)
-
-    if (diffMins < 1) return 'gerade eben'
-    if (diffMins < 60) return `vor ${diffMins} Min.`
-    if (diffMins < 1440) return `vor ${Math.floor(diffMins / 60)} Std.`
-    return `vor ${Math.floor(diffMins / 1440)} Tagen`
-  }
-
-  // Traffic estimation helpers for SysEleven planning
-  const calculateEstimatedTraffic = (direction: 'in' | 'out'): number => {
-    const messages = stats?.matrix?.messages_today || 0
-    const callMinutes = stats?.jitsi?.total_minutes_today || 0
-    const participants = stats?.jitsi?.total_participants || 0
-
-    const messageTrafficMB = messages * 0.002
-    const videoTrafficMB = callMinutes * participants * 0.011
-
-    if (direction === 'in') {
-      return messageTrafficMB * 0.3 + videoTrafficMB * 0.4
-    }
-    return messageTrafficMB * 0.7 + videoTrafficMB * 0.6
-  }
-
-  const calculateHourlyEstimate = (): number => {
-    const activeParticipants = stats?.jitsi?.total_participants || 0
-    return activeParticipants * 0.675
-  }
-
-  const calculateMonthlyEstimate = (): number => {
-    const dailyCallMinutes = stats?.jitsi?.total_minutes_today || 0
-    const avgParticipants = stats?.jitsi?.peak_concurrent_users || 1
-    const monthlyMinutes = dailyCallMinutes * 22
-    return (monthlyMinutes * avgParticipants * 11) / 1024
-  }
-
-  const getResourceRecommendation = (): string => {
-    const peakUsers = stats?.jitsi?.peak_concurrent_users || 0
-    const monthlyGB = calculateMonthlyEstimate()
-
-    if (monthlyGB < 10 || peakUsers < 5) {
-      return 'Starter (1 vCPU, 2GB RAM, 100GB Traffic)'
-    } else if (monthlyGB < 50 || peakUsers < 20) {
-      return 'Standard (2 vCPU, 4GB RAM, 500GB Traffic)'
-    } else if (monthlyGB < 200 || peakUsers < 50) {
-      return 'Professional (4 vCPU, 8GB RAM, 2TB Traffic)'
-    } else {
-      return 'Enterprise (8+ vCPU, 16GB+ RAM, Unlimited Traffic)'
-    }
-  }
-
-  return (
-    <div>
-      {/* Page Purpose */}
-      <PagePurpose
-        title={moduleInfo?.module.name || 'Video & Chat'}
-        purpose={moduleInfo?.module.purpose || 'Matrix & Jitsi Monitoring Dashboard'}
-        audience={moduleInfo?.module.audience || ['Admins', 'DevOps']}
-        architecture={{
-          services: ['synapse (Matrix)', 'jitsi-meet', 'prosody', 'jvb'],
-          databases: ['PostgreSQL', 'synapse-db'],
-        }}
-        collapsible={true}
-        defaultCollapsed={true}
-      />
-
-      {/* Quick Actions */}
-      <div className="flex gap-3 mb-6">
-        <Link
-          href="/communication/video-chat/wizard"
-          className="px-4 py-2 bg-green-600 text-white rounded-lg hover:bg-green-700 transition-colors text-sm font-medium"
-        >
-          Test Wizard starten
-        </Link>
-        <button
-          onClick={fetchStats}
-          disabled={loading}
-          className="px-4 py-2 border border-slate-300 rounded-lg hover:bg-slate-50 disabled:opacity-50 text-sm"
-        >
-          {loading ? 'Lade...' : 'Aktualisieren'}
-        </button>
-      </div>
-
-      {/* Service Status Overview */}
-      <div className="grid grid-cols-1 md:grid-cols-2 gap-6 mb-6">
-        {/* Matrix Status Card */}
-        <div className="bg-white rounded-xl border border-slate-200 p-6">
-          <div className="flex items-center justify-between mb-4">
-            <div className="flex items-center gap-3">
-              <div className="w-10 h-10 bg-purple-100 rounded-lg flex items-center justify-center">
-                <svg className="w-6 h-6 text-purple-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-                  <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M8 12h.01M12 12h.01M16 12h.01M21 12c0 4.418-4.03 8-9 8a9.863 9.863 0 01-4.255-.949L3 20l1.395-3.72C3.512 15.042 3 13.574 3 12c0-4.418 4.03-8 9-8s9 3.582 9 8z" />
-                </svg>
-              </div>
-              <div>
-                <h3 className="font-semibold text-slate-900">Matrix (Synapse)</h3>
-                <p className="text-sm text-slate-500">E2EE Messaging</p>
-              </div>
-            </div>
-            <span className={getStatusBadge(stats?.matrix.status || 'offline')}>
-              {stats?.matrix.status || 'offline'}
-            </span>
-          </div>
-          <div className="grid grid-cols-3 gap-4">
-            <div>
-              <div className="text-2xl font-bold text-slate-900">{stats?.matrix.total_users || 0}</div>
-              <div className="text-xs text-slate-500">Benutzer</div>
-            </div>
-            <div>
-              <div className="text-2xl font-bold text-slate-900">{stats?.matrix.active_users || 0}</div>
-              <div className="text-xs text-slate-500">Aktiv</div>
-            </div>
-            <div>
-              <div className="text-2xl font-bold text-slate-900">{stats?.matrix.total_rooms || 0}</div>
-              <div className="text-xs text-slate-500">Raeume</div>
-            </div>
-          </div>
-          <div className="mt-4 pt-4 border-t border-slate-100">
-            <div className="flex justify-between text-sm">
-              <span className="text-slate-500">Nachrichten heute</span>
-              <span className="font-medium">{stats?.matrix.messages_today || 0}</span>
-            </div>
-            <div className="flex justify-between text-sm mt-1">
-              <span className="text-slate-500">Diese Woche</span>
-              <span className="font-medium">{stats?.matrix.messages_this_week || 0}</span>
-            </div>
-          </div>
-        </div>
-
-        {/* Jitsi Status Card */}
-        <div className="bg-white rounded-xl border border-slate-200 p-6">
-          <div className="flex items-center justify-between mb-4">
-            <div className="flex items-center gap-3">
-              <div className="w-10 h-10 bg-blue-100 rounded-lg flex items-center justify-center">
-                <svg className="w-6 h-6 text-blue-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-                  <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M15 10l4.553-2.276A1 1 0 0121 8.618v6.764a1 1 0 01-1.447.894L15 14M5 18h8a2 2 0 002-2V8a2 2 0 00-2-2H5a2 2 0 00-2 2v8a2 2 0 002 2z" />
-                </svg>
-              </div>
-              <div>
-                <h3 className="font-semibold text-slate-900">Jitsi Meet</h3>
-                <p className="text-sm text-slate-500">Videokonferenzen</p>
-              </div>
-            </div>
-            <span className={getStatusBadge(stats?.jitsi.status || 'offline')}>
-              {stats?.jitsi.status || 'offline'}
-            </span>
-          </div>
-          <div className="grid grid-cols-3 gap-4">
-            <div>
-              <div className="text-2xl font-bold text-green-600">{stats?.jitsi.active_meetings || 0}</div>
-              <div className="text-xs text-slate-500">Live Calls</div>
-            </div>
-            <div>
-              <div className="text-2xl font-bold text-slate-900">{stats?.jitsi.total_participants || 0}</div>
-              <div className="text-xs text-slate-500">Teilnehmer</div>
-            </div>
-            <div>
-              <div className="text-2xl font-bold text-slate-900">{stats?.jitsi.meetings_today || 0}</div>
-              <div className="text-xs text-slate-500">Calls heute</div>
-            </div>
-          </div>
-          <div className="mt-4 pt-4 border-t border-slate-100">
-            <div className="flex justify-between text-sm">
-              <span className="text-slate-500">Durchschnittliche Dauer</span>
-              <span className="font-medium">{formatDuration(stats?.jitsi.average_duration_minutes || 0)}</span>
-            </div>
-            <div className="flex justify-between text-sm mt-1">
-              <span className="text-slate-500">Peak gleichzeitig</span>
-              <span className="font-medium">{stats?.jitsi.peak_concurrent_users || 0} Nutzer</span>
-            </div>
-          </div>
-        </div>
-      </div>
-
-      {/* Traffic & Bandwidth Statistics */}
-      <div className="bg-white rounded-xl border border-slate-200 p-6 mb-6">
-        <div className="flex items-center justify-between mb-4">
-          <div className="flex items-center gap-3">
-            <div className="w-10 h-10 bg-emerald-100 rounded-lg flex items-center justify-center">
-              <svg className="w-6 h-6 text-emerald-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-                <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 7h8m0 0v8m0-8l-8 8-4-4-6 6" />
-              </svg>
-            </div>
-            <div>
-              <h3 className="font-semibold text-slate-900">Traffic & Bandbreite</h3>
-              <p className="text-sm text-slate-500">SysEleven Ressourcenplanung</p>
-            </div>
-          </div>
-          <span className="px-3 py-1 rounded-full text-xs font-semibold uppercase bg-emerald-100 text-emerald-800">
-            Live
-          </span>
-        </div>
-
-        <div className="grid grid-cols-2 md:grid-cols-4 gap-4 mb-4">
-          <div className="bg-slate-50 rounded-lg p-4">
-            <div className="text-xs text-slate-500 mb-1">Eingehend (heute)</div>
-            <div className="text-2xl font-bold text-slate-900">
-              {stats?.traffic?.total?.bandwidth_in_mb?.toFixed(1) || calculateEstimatedTraffic('in').toFixed(1)} MB
-            </div>
-          </div>
-          <div className="bg-slate-50 rounded-lg p-4">
-            <div className="text-xs text-slate-500 mb-1">Ausgehend (heute)</div>
-            <div className="text-2xl font-bold text-slate-900">
-              {stats?.traffic?.total?.bandwidth_out_mb?.toFixed(1) || calculateEstimatedTraffic('out').toFixed(1)} MB
-            </div>
-          </div>
-          <div className="bg-slate-50 rounded-lg p-4">
-            <div className="text-xs text-slate-500 mb-1">Geschaetzt/Stunde</div>
-            <div className="text-2xl font-bold text-blue-600">
-              {stats?.traffic?.jitsi?.estimated_hourly_gb?.toFixed(2) || calculateHourlyEstimate().toFixed(2)} GB
-            </div>
-          </div>
-          <div className="bg-slate-50 rounded-lg p-4">
-            <div className="text-xs text-slate-500 mb-1">Geschaetzt/Monat</div>
-            <div className="text-2xl font-bold text-emerald-600">
-              {stats?.traffic?.total?.estimated_monthly_gb?.toFixed(1) || calculateMonthlyEstimate().toFixed(1)} GB
-            </div>
-          </div>
-        </div>
-
-        <div className="grid grid-cols-1 md:grid-cols-2 gap-4">
-          {/* Matrix Traffic */}
-          <div className="border border-slate-200 rounded-lg p-4">
-            <div className="flex items-center gap-2 mb-3">
-              <div className="w-3 h-3 bg-purple-500 rounded-full"></div>
-              <span className="text-sm font-medium text-slate-700">Matrix Messaging</span>
-            </div>
-            <div className="space-y-2 text-sm">
-              <div className="flex justify-between">
-                <span className="text-slate-500">Nachrichten/Min</span>
-                <span className="font-medium">{stats?.traffic?.matrix?.messages_per_minute || Math.round((stats?.matrix?.messages_today || 0) / (new Date().getHours() || 1) / 60)}</span>
-              </div>
-              <div className="flex justify-between">
-                <span className="text-slate-500">Media Uploads heute</span>
-                <span className="font-medium">{stats?.traffic?.matrix?.media_uploads_today || 0}</span>
-              </div>
-              <div className="flex justify-between">
-                <span className="text-slate-500">Media Groesse</span>
-                <span className="font-medium">{stats?.traffic?.matrix?.media_size_mb?.toFixed(1) || '0.0'} MB</span>
-              </div>
-            </div>
-          </div>
-
-          {/* Jitsi Traffic */}
-          <div className="border border-slate-200 rounded-lg p-4">
-            <div className="flex items-center gap-2 mb-3">
-              <div className="w-3 h-3 bg-blue-500 rounded-full"></div>
-              <span className="text-sm font-medium text-slate-700">Jitsi Video</span>
-            </div>
-            <div className="space-y-2 text-sm">
-              <div className="flex justify-between">
-                <span className="text-slate-500">Video Streams aktiv</span>
-                <span className="font-medium">{stats?.traffic?.jitsi?.video_streams_active || (stats?.jitsi?.total_participants || 0)}</span>
-              </div>
-              <div className="flex justify-between">
-                <span className="text-slate-500">Audio Streams aktiv</span>
-                <span className="font-medium">{stats?.traffic?.jitsi?.audio_streams_active || (stats?.jitsi?.total_participants || 0)}</span>
-              </div>
-              <div className="flex justify-between">
-                <span className="text-slate-500">Bitrate geschaetzt</span>
-                <span className="font-medium">{((stats?.jitsi?.total_participants || 0) * 1.5).toFixed(1)} Mbps</span>
-              </div>
-            </div>
-          </div>
-        </div>
-
-        {/* SysEleven Recommendation */}
-        <div className="mt-4 p-4 bg-emerald-50 border border-emerald-200 rounded-lg">
-          <h4 className="text-sm font-semibold text-emerald-800 mb-2">SysEleven Empfehlung</h4>
-          <div className="text-sm text-emerald-700">
-            <p>Basierend auf aktuellem Traffic: <strong>{getResourceRecommendation()}</strong></p>
-            <p className="mt-1 text-xs text-emerald-600">
-              Peak Teilnehmer: {stats?.jitsi?.peak_concurrent_users || 0} |
-              Durchschnittliche Call-Dauer: {stats?.jitsi?.average_duration_minutes?.toFixed(0) || 0} Min. |
-              Calls heute: {stats?.jitsi?.meetings_today || 0}
-            </p>
-          </div>
-        </div>
-      </div>
-
-      {/* Active Meetings */}
-      <div className="bg-white rounded-xl border border-slate-200 p-6 mb-6">
-        <div className="flex items-center justify-between mb-4">
-          <h3 className="font-semibold text-slate-900">Aktive Meetings</h3>
-        </div>
-
-        {activeMeetings.length === 0 ? (
-          <div className="text-center py-8 text-slate-500">
-            <svg className="w-12 h-12 mx-auto mb-3 text-slate-300" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-              <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M15 10l4.553-2.276A1 1 0 0121 8.618v6.764a1 1 0 01-1.447.894L15 14M5 18h8a2 2 0 002-2V8a2 2 0 00-2-2H5a2 2 0 00-2 2v8a2 2 0 002 2z" />
-            </svg>
-            <p>Keine aktiven Meetings</p>
-          </div>
-        ) : (
-          <div className="overflow-x-auto">
-            <table className="w-full">
-              <thead>
-                <tr className="text-left text-xs text-slate-500 uppercase border-b border-slate-200">
-                  <th className="pb-3 pr-4">Meeting</th>
-                  <th className="pb-3 pr-4">Teilnehmer</th>
-                  <th className="pb-3 pr-4">Gestartet</th>
-                  <th className="pb-3">Dauer</th>
-                </tr>
-              </thead>
-              <tbody className="divide-y divide-slate-100">
-                {activeMeetings.map((meeting, idx) => (
-                  <tr key={idx} className="text-sm">
-                    <td className="py-3 pr-4">
-                      <div className="font-medium text-slate-900">{meeting.display_name}</div>
-                      <div className="text-xs text-slate-500">{meeting.room_name}</div>
-                    </td>
-                    <td className="py-3 pr-4">
-                      <span className="inline-flex items-center gap-1">
-                        <span className="w-2 h-2 bg-green-500 rounded-full animate-pulse" />
-                        {meeting.participants}
-                      </span>
-                    </td>
-                    <td className="py-3 pr-4 text-slate-500">{formatTimeAgo(meeting.started_at)}</td>
-                    <td className="py-3 font-medium">{formatDuration(meeting.duration_minutes)}</td>
-                  </tr>
-                ))}
-              </tbody>
-            </table>
-          </div>
-        )}
-      </div>
-
-      {/* Recent Chat Rooms & Usage Stats */}
-      <div className="grid grid-cols-1 lg:grid-cols-2 gap-6 mb-6">
-        <div className="bg-white rounded-xl border border-slate-200 p-6">
-          <h3 className="font-semibold text-slate-900 mb-4">Aktive Chat-Raeume</h3>
-
-          {recentRooms.length === 0 ? (
-            <div className="text-center py-6 text-slate-500">
-              <p>Keine aktiven Raeume</p>
-            </div>
-          ) : (
-            <div className="space-y-3">
-              {recentRooms.slice(0, 5).map((room, idx) => (
-                <div key={idx} className="flex items-center justify-between p-3 bg-slate-50 rounded-lg">
-                  <div className="flex items-center gap-3">
-                    <div className="w-8 h-8 bg-slate-200 rounded-lg flex items-center justify-center">
-                      <svg className="w-4 h-4 text-slate-600" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-                        <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M17 20h5v-2a3 3 0 00-5.356-1.857M17 20H7m10 0v-2c0-.656-.126-1.283-.356-1.857M7 20H2v-2a3 3 0 015.356-1.857M7 20v-2c0-.656.126-1.283.356-1.857m0 0a5.002 5.002 0 019.288 0M15 7a3 3 0 11-6 0 3 3 0 016 0z" />
-                      </svg>
-                    </div>
-                    <div>
-                      <div className="font-medium text-slate-900 text-sm">{room.name}</div>
-                      <div className="text-xs text-slate-500">{room.member_count} Mitglieder</div>
-                    </div>
-                  </div>
-                  <div className="flex items-center gap-2">
-                    <span className={getRoomTypeBadge(room.room_type)}>{room.room_type}</span>
-                    <span className="text-xs text-slate-400">{formatTimeAgo(room.last_activity)}</span>
-                  </div>
-                </div>
-              ))}
-            </div>
-          )}
-        </div>
-
-        {/* Usage Statistics */}
-        <div className="bg-white rounded-xl border border-slate-200 p-6">
-          <h3 className="font-semibold text-slate-900 mb-4">Nutzungsstatistiken</h3>
-          <div className="space-y-4">
-            <div>
-              <div className="flex justify-between text-sm mb-1">
-                <span className="text-slate-600">Call-Minuten heute</span>
-                <span className="font-semibold">{stats?.jitsi.total_minutes_today || 0} Min.</span>
-              </div>
-              <div className="w-full bg-slate-100 rounded-full h-2">
-                <div
-                  className="bg-blue-600 h-2 rounded-full transition-all"
-                  style={{ width: `${Math.min((stats?.jitsi.total_minutes_today || 0) / 500 * 100, 100)}%` }}
-                />
-              </div>
-            </div>
-            <div>
-              <div className="flex justify-between text-sm mb-1">
-                <span className="text-slate-600">Aktive Chat-Raeume</span>
-                <span className="font-semibold">{stats?.matrix.active_rooms || 0} / {stats?.matrix.total_rooms || 0}</span>
-              </div>
-              <div className="w-full bg-slate-100 rounded-full h-2">
-                <div
-                  className="bg-purple-600 h-2 rounded-full transition-all"
-                  style={{ width: `${stats?.matrix.total_rooms ? ((stats.matrix.active_rooms / stats.matrix.total_rooms) * 100) : 0}%` }}
-                />
-              </div>
-            </div>
-            <div>
-              <div className="flex justify-between text-sm mb-1">
-                <span className="text-slate-600">Aktive Nutzer</span>
-                <span className="font-semibold">{stats?.matrix.active_users || 0} / {stats?.matrix.total_users || 0}</span>
-              </div>
-              <div className="w-full bg-slate-100 rounded-full h-2">
-                <div
-                  className="bg-green-600 h-2 rounded-full transition-all"
-                  style={{ width: `${stats?.matrix.total_users ? ((stats.matrix.active_users / stats.matrix.total_users) * 100) : 0}%` }}
-                />
-              </div>
-            </div>
-          </div>
-
-          {/* Quick Actions */}
-          <div className="mt-6 pt-4 border-t border-slate-100">
-            <h4 className="text-sm font-medium text-slate-700 mb-3">Schnellaktionen</h4>
-            <div className="flex flex-wrap gap-2">
-              <a
-                href="http://localhost:8448/_synapse/admin"
-                target="_blank"
-                rel="noopener noreferrer"
-                className="px-3 py-1.5 text-sm bg-purple-100 text-purple-700 rounded-lg hover:bg-purple-200 transition-colors"
-              >
-                Synapse Admin
-              </a>
-              <a
-                href="http://localhost:8443"
-                target="_blank"
-                rel="noopener noreferrer"
-                className="px-3 py-1.5 text-sm bg-blue-100 text-blue-700 rounded-lg hover:bg-blue-200 transition-colors"
-              >
-                Jitsi Meet
-              </a>
-            </div>
-          </div>
-        </div>
-      </div>
-
-      {/* Connection Info */}
-      <div className="bg-blue-50 border border-blue-200 rounded-xl p-4">
-        <div className="flex gap-3">
-          <svg className="w-5 h-5 text-blue-600 flex-shrink-0 mt-0.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
-          </svg>
-          <div>
-            <h4 className="font-semibold text-blue-900">Service Konfiguration</h4>
-            <p className="text-sm text-blue-800 mt-1">
-              <strong>Matrix Homeserver:</strong> http://localhost:8448 (Synapse)<br />
-              <strong>Jitsi Meet:</strong> http://localhost:8443<br />
-              <strong>Auto-Refresh:</strong> Alle 15 Sekunden
-            </p>
-            {error && (
-              <p className="text-sm text-red-600 mt-2">
-                <strong>Fehler:</strong> {error} - Backend nicht erreichbar
-              </p>
-            )}
-            {stats?.last_updated && (
-              <p className="text-xs text-blue-600 mt-2">
-                Letzte Aktualisierung: {new Date(stats.last_updated).toLocaleString('de-DE')}
-              </p>
-            )}
-          </div>
-        </div>
-      </div>
-    </div>
-  )
-}
--- a/admin-lehrer/app/(admin)/communication/video-chat/wizard/page.tsx
+++ b/admin-lehrer/app/(admin)/communication/video-chat/wizard/page.tsx
@@ -1,366 +0,0 @@
-'use client'
-
-/**
- * Video & Chat Wizard Page
- *
- * Interactive learning and testing wizard for Matrix & Jitsi integration
- * Migrated from website/app/admin/communication/wizard
- */
-
-import { useState } from 'react'
-import Link from 'next/link'
-import {
-  WizardStepper,
-  WizardNavigation,
-  EducationCard,
-  ArchitectureContext,
-  TestRunner,
-  TestSummary,
-  type WizardStep,
-  type TestCategoryResult,
-  type FullTestResults,
-  type EducationContent,
-  type ArchitectureContextType,
-} from '@/components/wizard'
-
-// ==============================================
-// Constants
-// ==============================================
-
-const BACKEND_URL = process.env.NEXT_PUBLIC_BACKEND_URL || 'http://localhost:8000'
-
-const STEPS: WizardStep[] = [
-  { id: 'welcome', name: 'Willkommen', icon: '👋', status: 'pending' },
-  { id: 'api-health', name: 'API Status', icon: '💚', status: 'pending', category: 'api-health' },
-  { id: 'matrix', name: 'Matrix', icon: '💬', status: 'pending', category: 'matrix' },
-  { id: 'jitsi', name: 'Jitsi', icon: '📹', status: 'pending', category: 'jitsi' },
-  { id: 'summary', name: 'Zusammenfassung', icon: '📊', status: 'pending' },
-]
-
-const EDUCATION_CONTENT: Record<string, EducationContent> = {
-  'welcome': {
-    title: 'Willkommen zum Video & Chat Wizard',
-    content: [
-      'Sichere Kommunikation ist das Rueckgrat moderner Bildungsplattformen.',
-      '',
-      'BreakPilot nutzt zwei Open-Source Systeme:',
-      '• Matrix Synapse: Dezentraler Messenger (Ende-zu-Ende verschluesselt)',
-      '• Jitsi Meet: Video-Konferenzen (WebRTC-basiert)',
-      '',
-      'Beide Systeme sind DSGVO-konform und self-hosted.',
-      '',
-      'In diesem Wizard testen wir:',
-      '• Matrix Homeserver und Federation',
-      '• Jitsi Video-Konferenz Server',
-      '• Integration mit der Schulverwaltung',
-    ],
-  },
-  'api-health': {
-    title: 'Communication API - Backend Integration',
-    content: [
-      'Die Communication API verbindet Matrix und Jitsi mit BreakPilot.',
-      '',
-      'Funktionen:',
-      '• Automatische Raum-Erstellung fuer Klassen',
-      '• Eltern-Lehrer DM-Raeume',
-      '• Meeting-Planung mit Kalender-Integration',
-      '• Benachrichtigungen bei neuen Nachrichten',
-      '',
-      'Endpunkte:',
-      '• /api/v1/communication/admin/stats',
-      '• /api/v1/communication/admin/matrix/users',
-      '• /api/v1/communication/rooms',
-    ],
-  },
-  'matrix': {
-    title: 'Matrix Synapse - Dezentraler Messenger',
-    content: [
-      'Matrix ist ein offenes Protokoll fuer sichere Kommunikation.',
-      '',
-      'Vorteile gegenueber WhatsApp/Teams:',
-      '• Ende-zu-Ende Verschluesselung (E2EE)',
-      '• Dezentral: Kein Single Point of Failure',
-      '• Federation: Kommunikation mit anderen Schulen',
-      '• Self-Hosted: Volle Datenkontrolle',
-      '',
-      'Raum-Typen in BreakPilot:',
-      '• Klassen-Info (Ankuendigungen)',
-      '• Elternvertreter-Raum',
-      '• Lehrer-Eltern DM',
-      '• Fachgruppen',
-    ],
-  },
-  'jitsi': {
-    title: 'Jitsi Meet - Video-Konferenzen',
-    content: [
-      'Jitsi ist eine Open-Source Alternative zu Zoom/Teams.',
-      '',
-      'Features:',
-      '• WebRTC: Keine Software-Installation noetig',
-      '• Bildschirmfreigabe und Whiteboard',
-      '• Breakout-Raeume fuer Gruppenarbeit',
-      '• Aufzeichnung (optional, lokal)',
-      '',
-      'Anwendungsfaelle:',
-      '• Elternsprechtage (online)',
-      '• Fernunterricht bei Schulausfall',
-      '• Lehrerkonferenzen',
-      '• Foerdergespraeche',
-    ],
-  },
-  'summary': {
-    title: 'Test-Zusammenfassung',
-    content: [
-      'Hier sehen Sie eine Uebersicht aller durchgefuehrten Tests:',
-      '• Matrix Homeserver Verfuegbarkeit',
-      '• Jitsi Server Status',
-      '• API-Integration',
-    ],
-  },
-}
-
-const ARCHITECTURE_CONTEXTS: Record<string, ArchitectureContextType> = {
-  'api-health': {
-    layer: 'api',
-    services: ['backend', 'consent-service'],
-    dependencies: ['PostgreSQL', 'Matrix Synapse', 'Jitsi'],
-    dataFlow: ['Browser', 'FastAPI', 'Go Service', 'Matrix/Jitsi'],
-  },
-  'matrix': {
-    layer: 'service',
-    services: ['matrix'],
-    dependencies: ['PostgreSQL', 'Federation', 'TURN Server'],
-    dataFlow: ['Element Client', 'Matrix Synapse', 'Federation', 'PostgreSQL'],
-  },
-  'jitsi': {
-    layer: 'service',
-    services: ['jitsi'],
-    dependencies: ['Prosody XMPP', 'JVB', 'TURN/STUN'],
-    dataFlow: ['Browser', 'Nginx', 'Prosody', 'Jitsi Videobridge'],
-  },
-}
-
-// ==============================================
-// Main Component
-// ==============================================
-
-export default function VideoChatWizardPage() {
-  const [currentStep, setCurrentStep] = useState(0)
-  const [steps, setSteps] = useState<WizardStep[]>(STEPS)
-  const [categoryResults, setCategoryResults] = useState<Record<string, TestCategoryResult>>({})
-  const [fullResults, setFullResults] = useState<FullTestResults | null>(null)
-  const [isLoading, setIsLoading] = useState(false)
-  const [error, setError] = useState<string | null>(null)
-
-  const currentStepData = steps[currentStep]
-  const isTestStep = currentStepData?.category !== undefined
-  const isWelcome = currentStepData?.id === 'welcome'
-  const isSummary = currentStepData?.id === 'summary'
-
-  const runCategoryTest = async (category: string) => {
-    setIsLoading(true)
-    setError(null)
-
-    try {
-      const response = await fetch(`${BACKEND_URL}/api/admin/communication-tests/${category}`, {
-        method: 'POST',
-      })
-
-      if (!response.ok) {
-        throw new Error(`HTTP ${response.status}: ${response.statusText}`)
-      }
-
-      const result: TestCategoryResult = await response.json()
-      setCategoryResults((prev) => ({ ...prev, [category]: result }))
-
-      setSteps((prev) =>
-        prev.map((step) =>
-          step.category === category
-            ? { ...step, status: result.failed === 0 ? 'completed' : 'failed' }
-            : step
-        )
-      )
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Unbekannter Fehler')
-    } finally {
-      setIsLoading(false)
-    }
-  }
-
-  const runAllTests = async () => {
-    setIsLoading(true)
-    setError(null)
-
-    try {
-      const response = await fetch(`${BACKEND_URL}/api/admin/communication-tests/run-all`, {
-        method: 'POST',
-      })
-
-      if (!response.ok) {
-        throw new Error(`HTTP ${response.status}: ${response.statusText}`)
-      }
-
-      const results: FullTestResults = await response.json()
-      setFullResults(results)
-
-      setSteps((prev) =>
-        prev.map((step) => {
-          if (step.category) {
-            const catResult = results.categories.find((c) => c.category === step.category)
-            if (catResult) {
-              return { ...step, status: catResult.failed === 0 ? 'completed' : 'failed' }
-            }
-          }
-          return step
-        })
-      )
-
-      const newCategoryResults: Record<string, TestCategoryResult> = {}
-      results.categories.forEach((cat) => {
-        newCategoryResults[cat.category] = cat
-      })
-      setCategoryResults(newCategoryResults)
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Unbekannter Fehler')
-    } finally {
-      setIsLoading(false)
-    }
-  }
-
-  const goToNext = () => {
-    if (currentStep < steps.length - 1) {
-      setSteps((prev) =>
-        prev.map((step, idx) =>
-          idx === currentStep && step.status === 'pending'
-            ? { ...step, status: 'completed' }
-            : step
-        )
-      )
-      setCurrentStep((prev) => prev + 1)
-    }
-  }
-
-  const goToPrev = () => {
-    if (currentStep > 0) {
-      setCurrentStep((prev) => prev - 1)
-    }
-  }
-
-  const handleStepClick = (index: number) => {
-    if (index <= currentStep || steps[index - 1]?.status !== 'pending') {
-      setCurrentStep(index)
-    }
-  }
-
-  return (
-    <div>
-      {/* Header */}
-      <div className="bg-white rounded-lg border border-slate-200 p-4 mb-6 flex items-center justify-between">
-        <div className="flex items-center">
-          <span className="text-3xl mr-3">💬</span>
-          <div>
-            <h2 className="text-lg font-bold text-gray-800">Video & Chat Test Wizard</h2>
-            <p className="text-sm text-gray-600">Matrix Messenger & Jitsi Video</p>
-          </div>
-        </div>
-        <Link href="/communication/video-chat" className="text-blue-600 hover:text-blue-800 text-sm">
-          &larr; Zurueck zu Video & Chat
-        </Link>
-      </div>
-
-      {/* Stepper */}
-      <div className="bg-white rounded-lg border border-slate-200 p-6 mb-6">
-        <WizardStepper steps={steps} currentStep={currentStep} onStepClick={handleStepClick} />
-      </div>
-
-      {/* Content */}
-      <div className="bg-white rounded-lg border border-slate-200 p-6">
-        <div className="flex items-center mb-6">
-          <span className="text-3xl mr-3">{currentStepData?.icon}</span>
-          <div>
-            <h2 className="text-xl font-bold text-gray-800">
-              Schritt {currentStep + 1}: {currentStepData?.name}
-            </h2>
-            <p className="text-gray-500 text-sm">
-              {currentStep + 1} von {steps.length}
-            </p>
-          </div>
-        </div>
-
-        <EducationCard content={EDUCATION_CONTENT[currentStepData?.id || '']} />
-
-        {isTestStep && currentStepData?.category && ARCHITECTURE_CONTEXTS[currentStepData.category] && (
-          <ArchitectureContext
-            context={ARCHITECTURE_CONTEXTS[currentStepData.category]}
-            currentStep={currentStepData.name}
-          />
-        )}
-
-        {error && (
-          <div className="bg-red-50 border border-red-200 text-red-700 rounded-lg p-4 mb-6">
-            <strong>Fehler:</strong> {error}
-          </div>
-        )}
-
-        {isWelcome && (
-          <div className="text-center py-8">
-            <button
-              onClick={goToNext}
-              className="bg-blue-600 text-white px-8 py-3 rounded-lg font-medium hover:bg-blue-700 transition-colors"
-            >
-              Wizard starten
-            </button>
-          </div>
-        )}
-
-        {isTestStep && currentStepData?.category && (
-          <TestRunner
-            category={currentStepData.category}
-            categoryResult={categoryResults[currentStepData.category]}
-            isLoading={isLoading}
-            onRunTests={() => runCategoryTest(currentStepData.category!)}
-          />
-        )}
-
-        {isSummary && (
-          <div>
-            {!fullResults ? (
-              <div className="text-center py-8">
-                <p className="text-gray-600 mb-4">
-                  Fuehren Sie alle Tests aus um eine Zusammenfassung zu sehen.
-                </p>
-                <button
-                  onClick={runAllTests}
-                  disabled={isLoading}
-                  className={`px-6 py-3 rounded-lg font-medium transition-colors ${
-                    isLoading
-                      ? 'bg-gray-400 cursor-not-allowed'
-                      : 'bg-blue-600 text-white hover:bg-blue-700'
-                  }`}
-                >
-                  {isLoading ? 'Alle Tests laufen...' : 'Alle Tests ausfuehren'}
-                </button>
-              </div>
-            ) : (
-              <TestSummary results={fullResults} />
-            )}
-          </div>
-        )}
-
-        <WizardNavigation
-          currentStep={currentStep}
-          totalSteps={steps.length}
-          onPrev={goToPrev}
-          onNext={goToNext}
-          showNext={!isSummary}
-          isLoading={isLoading}
-        />
-      </div>
-
-      <div className="text-center text-gray-500 text-sm mt-6">
-        Diese Tests pruefen die Matrix- und Jitsi-Integration.
-        Bei Fragen wenden Sie sich an das IT-Team.
-      </div>
-    </div>
-  )
-}
--- a/admin-lehrer/app/(admin)/infrastructure/gpu/page.tsx
+++ b/admin-lehrer/app/(admin)/infrastructure/gpu/page.tsx
@@ -1,390 +0,0 @@
-'use client'
-
-/**
- * GPU Infrastructure Admin Page
- *
- * vast.ai GPU Management for LLM Processing
- */
-
-import { useEffect, useState, useCallback } from 'react'
-import { PagePurpose } from '@/components/common/PagePurpose'
-
-interface VastStatus {
-  instance_id: number | null
-  status: string
-  gpu_name: string | null
-  dph_total: number | null
-  endpoint_base_url: string | null
-  last_activity: string | null
-  auto_shutdown_in_minutes: number | null
-  total_runtime_hours: number | null
-  total_cost_usd: number | null
-  account_credit: number | null
-  account_total_spend: number | null
-  session_runtime_minutes: number | null
-  session_cost_usd: number | null
-  message: string | null
-  error?: string
-}
-
-export default function GPUInfrastructurePage() {
-  const [status, setStatus] = useState<VastStatus | null>(null)
-  const [loading, setLoading] = useState(true)
-  const [actionLoading, setActionLoading] = useState<string | null>(null)
-  const [error, setError] = useState<string | null>(null)
-  const [message, setMessage] = useState<string | null>(null)
-
-  const API_PROXY = '/api/admin/gpu'
-
-  const fetchStatus = useCallback(async () => {
-    setLoading(true)
-    setError(null)
-
-    try {
-      const response = await fetch(API_PROXY)
-      const data = await response.json()
-
-      if (!response.ok) {
-        throw new Error(data.error || `HTTP ${response.status}`)
-      }
-
-      setStatus(data)
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Verbindungsfehler')
-      setStatus({
-        instance_id: null,
-        status: 'error',
-        gpu_name: null,
-        dph_total: null,
-        endpoint_base_url: null,
-        last_activity: null,
-        auto_shutdown_in_minutes: null,
-        total_runtime_hours: null,
-        total_cost_usd: null,
-        account_credit: null,
-        account_total_spend: null,
-        session_runtime_minutes: null,
-        session_cost_usd: null,
-        message: 'Verbindung fehlgeschlagen'
-      })
-    } finally {
-      setLoading(false)
-    }
-  }, [])
-
-  useEffect(() => {
-    fetchStatus()
-  }, [fetchStatus])
-
-  useEffect(() => {
-    const interval = setInterval(fetchStatus, 30000)
-    return () => clearInterval(interval)
-  }, [fetchStatus])
-
-  const powerOn = async () => {
-    setActionLoading('on')
-    setError(null)
-    setMessage(null)
-
-    try {
-      const response = await fetch(API_PROXY, {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ action: 'on' }),
-      })
-
-      const data = await response.json()
-
-      if (!response.ok) {
-        throw new Error(data.error || data.detail || 'Aktion fehlgeschlagen')
-      }
-
-      setMessage('Start angefordert')
-      setTimeout(fetchStatus, 3000)
-      setTimeout(fetchStatus, 10000)
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Fehler beim Starten')
-      fetchStatus()
-    } finally {
-      setActionLoading(null)
-    }
-  }
-
-  const powerOff = async () => {
-    setActionLoading('off')
-    setError(null)
-    setMessage(null)
-
-    try {
-      const response = await fetch(API_PROXY, {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ action: 'off' }),
-      })
-
-      const data = await response.json()
-
-      if (!response.ok) {
-        throw new Error(data.error || data.detail || 'Aktion fehlgeschlagen')
-      }
-
-      setMessage('Stop angefordert')
-      setTimeout(fetchStatus, 3000)
-      setTimeout(fetchStatus, 10000)
-    } catch (err) {
-      setError(err instanceof Error ? err.message : 'Fehler beim Stoppen')
-      fetchStatus()
-    } finally {
-      setActionLoading(null)
-    }
-  }
-
-  const getStatusBadge = (s: string) => {
-    const baseClasses = 'px-3 py-1 rounded-full text-sm font-semibold uppercase'
-    switch (s) {
-      case 'running':
-        return `${baseClasses} bg-green-100 text-green-800`
-      case 'stopped':
-      case 'exited':
-        return `${baseClasses} bg-red-100 text-red-800`
-      case 'loading':
-      case 'scheduling':
-      case 'creating':
-      case 'starting...':
-      case 'stopping...':
-        return `${baseClasses} bg-yellow-100 text-yellow-800`
-      default:
-        return `${baseClasses} bg-slate-100 text-slate-600`
-    }
-  }
-
-  const getCreditColor = (credit: number | null) => {
-    if (credit === null) return 'text-slate-500'
-    if (credit < 5) return 'text-red-600'
-    if (credit < 15) return 'text-yellow-600'
-    return 'text-green-600'
-  }
-
-  return (
-    <div>
-      {/* Page Purpose */}
-      <PagePurpose
-        title="GPU Infrastruktur"
-        purpose="Verwalten Sie die vast.ai GPU-Instanzen fuer LLM-Verarbeitung und OCR. Starten/Stoppen Sie GPUs bei Bedarf und ueberwachen Sie Kosten in Echtzeit."
-        audience={['DevOps', 'Entwickler', 'System-Admins']}
-        architecture={{
-          services: ['vast.ai API', 'Ollama', 'VLLM'],
-          databases: ['PostgreSQL (Logs)'],
-        }}
-        relatedPages={[
-          { name: 'Security', href: '/infrastructure/security', description: 'DevSecOps Dashboard' },
-          { name: 'Builds', href: '/infrastructure/builds', description: 'CI/CD Pipeline' },
-        ]}
-        collapsible={true}
-        defaultCollapsed={true}
-      />
-
-      {/* Status Cards */}
-      <div className="bg-white rounded-xl border border-slate-200 p-6 mb-6">
-        <div className="grid grid-cols-2 md:grid-cols-3 lg:grid-cols-6 gap-6">
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Status</div>
-            {loading ? (
-              <span className="px-3 py-1 rounded-full text-sm font-semibold bg-slate-100 text-slate-600">
-                Laden...
-              </span>
-            ) : (
-              <span className={getStatusBadge(
-                actionLoading === 'on' ? 'starting...' :
-                actionLoading === 'off' ? 'stopping...' :
-                status?.status || 'unknown'
-              )}>
-                {actionLoading === 'on' ? 'starting...' :
-                 actionLoading === 'off' ? 'stopping...' :
-                 status?.status || 'unbekannt'}
-              </span>
-            )}
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">GPU</div>
-            <div className="font-semibold text-slate-900">
-              {status?.gpu_name || '-'}
-            </div>
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Kosten/h</div>
-            <div className="font-semibold text-slate-900">
-              {status?.dph_total ? `$${status.dph_total.toFixed(3)}` : '-'}
-            </div>
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Auto-Stop</div>
-            <div className="font-semibold text-slate-900">
-              {status && status.auto_shutdown_in_minutes !== null
-                ? `${status.auto_shutdown_in_minutes} min`
-                : '-'}
-            </div>
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Budget</div>
-            <div className={`font-bold text-lg ${getCreditColor(status?.account_credit ?? null)}`}>
-              {status && status.account_credit !== null
-                ? `$${status.account_credit.toFixed(2)}`
-                : '-'}
-            </div>
-          </div>
-
-          <div>
-            <div className="text-sm text-slate-500 mb-2">Session</div>
-            <div className="font-semibold text-slate-900">
-              {status && status.session_runtime_minutes !== null && status.session_cost_usd !== null
-                ? `${Math.round(status.session_runtime_minutes)} min / $${status.session_cost_usd.toFixed(3)}`
-                : '-'}
-            </div>
-          </div>
-        </div>
-
-        {/* Buttons */}
-        <div className="flex items-center gap-4 mt-6 pt-6 border-t border-slate-200">
-          <button
-            onClick={powerOn}
-            disabled={actionLoading !== null || status?.status === 'running'}
-            className="px-6 py-2 bg-orange-600 text-white rounded-lg font-medium hover:bg-orange-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
-          >
-            Starten
-          </button>
-          <button
-            onClick={powerOff}
-            disabled={actionLoading !== null || status?.status !== 'running'}
-            className="px-6 py-2 bg-red-600 text-white rounded-lg font-medium hover:bg-red-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
-          >
-            Stoppen
-          </button>
-          <button
-            onClick={fetchStatus}
-            disabled={loading}
-            className="px-4 py-2 border border-slate-300 text-slate-700 rounded-lg font-medium hover:bg-slate-50 disabled:opacity-50 transition-colors"
-          >
-            {loading ? 'Aktualisiere...' : 'Aktualisieren'}
-          </button>
-
-          {message && (
-            <span className="ml-4 text-sm text-green-600 font-medium">{message}</span>
-          )}
-          {error && (
-            <span className="ml-4 text-sm text-red-600 font-medium">{error}</span>
-          )}
-        </div>
-      </div>
-
-      {/* Extended Stats */}
-      <div className="grid grid-cols-1 lg:grid-cols-2 gap-6 mb-6">
-        <div className="bg-white rounded-xl border border-slate-200 p-6">
-          <h3 className="font-semibold text-slate-900 mb-4">Kosten-Uebersicht</h3>
-          <div className="space-y-4">
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Session Laufzeit</span>
-              <span className="font-semibold">
-                {status && status.session_runtime_minutes !== null
-                  ? `${Math.round(status.session_runtime_minutes)} Minuten`
-                  : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Session Kosten</span>
-              <span className="font-semibold">
-                {status && status.session_cost_usd !== null
-                  ? `$${status.session_cost_usd.toFixed(4)}`
-                  : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center pt-4 border-t border-slate-100">
-              <span className="text-slate-600">Gesamtlaufzeit</span>
-              <span className="font-semibold">
-                {status && status.total_runtime_hours !== null
-                  ? `${status.total_runtime_hours.toFixed(1)} Stunden`
-                  : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Gesamtkosten</span>
-              <span className="font-semibold">
-                {status && status.total_cost_usd !== null
-                  ? `$${status.total_cost_usd.toFixed(2)}`
-                  : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">vast.ai Ausgaben</span>
-              <span className="font-semibold">
-                {status && status.account_total_spend !== null
-                  ? `$${status.account_total_spend.toFixed(2)}`
-                  : '-'}
-              </span>
-            </div>
-          </div>
-        </div>
-
-        <div className="bg-white rounded-xl border border-slate-200 p-6">
-          <h3 className="font-semibold text-slate-900 mb-4">Instanz-Details</h3>
-          <div className="space-y-4">
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Instanz ID</span>
-              <span className="font-mono text-sm">
-                {status?.instance_id || '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">GPU</span>
-              <span className="font-semibold">
-                {status?.gpu_name || '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Stundensatz</span>
-              <span className="font-semibold">
-                {status?.dph_total ? `$${status.dph_total.toFixed(4)}/h` : '-'}
-              </span>
-            </div>
-            <div className="flex justify-between items-center">
-              <span className="text-slate-600">Letzte Aktivitaet</span>
-              <span className="text-sm">
-                {status?.last_activity
-                  ? new Date(status.last_activity).toLocaleString('de-DE')
-                  : '-'}
-              </span>
-            </div>
-            {status?.endpoint_base_url && status.status === 'running' && (
-              <div className="pt-4 border-t border-slate-100">
-                <div className="text-slate-600 text-sm mb-1">Endpoint</div>
-                <code className="text-xs bg-slate-100 px-2 py-1 rounded block overflow-x-auto">
-                  {status.endpoint_base_url}
-                </code>
-              </div>
-            )}
-          </div>
-        </div>
-      </div>
-
-      {/* Info */}
-      <div className="bg-orange-50 border border-orange-200 rounded-xl p-4">
-        <div className="flex gap-3">
-          <svg className="w-5 h-5 text-orange-600 flex-shrink-0 mt-0.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
-            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
-          </svg>
-          <div>
-            <h4 className="font-semibold text-orange-900">Auto-Shutdown</h4>
-            <p className="text-sm text-orange-800 mt-1">
-              Die GPU-Instanz wird automatisch gestoppt, wenn sie laengere Zeit inaktiv ist.
-              Der Status wird alle 30 Sekunden automatisch aktualisiert.
-            </p>
-          </div>
-        </div>
-      </div>
-    </div>
-  )
-}
--- a/admin-lehrer/components/grid-editor/GridEditor.tsx
+++ b/admin-lehrer/components/grid-editor/GridEditor.tsx
@@ -36,6 +36,10 @@ export function GridEditor({ sessionId, onNext }: GridEditorProps) {
    addColumn,
    deleteRow,
    addRow,
+    ipaMode,
+    setIpaMode,
+    syllableMode,
+    setSyllableMode,
  } = useGridEditor(sessionId)

  const [showOverlay, setShowOverlay] = useState(false)
@@ -170,6 +174,11 @@ export function GridEditor({ sessionId, onNext }: GridEditorProps) {
            Woerterbuch ({Math.round(grid.dictionary_detection.confidence * 100)}%)
          </span>
        )}
+        {grid.page_number?.text && (
+          <span className="px-1.5 py-0.5 rounded bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 border border-gray-200 dark:border-gray-600">
+            S. {grid.page_number.text}
+          </span>
+        )}
        <span className="text-gray-400">
          {grid.duration_seconds.toFixed(1)}s
        </span>
@@ -183,11 +192,15 @@ export function GridEditor({ sessionId, onNext }: GridEditorProps) {
          canUndo={canUndo}
          canRedo={canRedo}
          showOverlay={showOverlay}
+          ipaMode={ipaMode}
+          syllableMode={syllableMode}
          onSave={saveGrid}
          onUndo={undo}
          onRedo={redo}
          onRebuild={buildGrid}
          onToggleOverlay={() => setShowOverlay(!showOverlay)}
+          onIpaModeChange={setIpaMode}
+          onSyllableModeChange={setSyllableMode}
        />
      </div>

--- a/admin-lehrer/components/grid-editor/GridTable.tsx
+++ b/admin-lehrer/components/grid-editor/GridTable.tsx
@@ -107,12 +107,18 @@ export function GridTable({
    const row = zone.rows.find((r) => r.index === rowIndex)
    if (!row) return Math.max(MIN_ROW_HEIGHT, avgRowHeightPx * scale)

+    // Multi-line cells (containing \n): expand height based on line count
+    const rowCells = zone.cells.filter((c) => c.row_index === rowIndex)
+    const maxLines = Math.max(1, ...rowCells.map((c) => (c.text ?? '').split('\n').length))
+    if (maxLines > 1) {
+      const lineH = Math.max(MIN_ROW_HEIGHT, avgRowHeightPx * scale)
+      return lineH * maxLines
+    }
+
    if (isHeader) {
-      // Headers keep their measured height
      const measuredH = row.y_max_px - row.y_min_px
      return Math.max(MIN_ROW_HEIGHT, measuredH * scale)
    }
-    // Content rows use average for uniformity
    return Math.max(MIN_ROW_HEIGHT, avgRowHeightPx * scale)
  }

@@ -410,28 +416,27 @@ export function GridTable({

              {/* Cells — spanning header or normal columns */}
              {isSpanning ? (
-                <div
-                  className="border-b border-r border-gray-200 dark:border-gray-700 bg-blue-50/50 dark:bg-blue-900/10 flex items-center"
-                  style={{
-                    gridColumn: `2 / ${numCols + 2}`,
-                    height: `${rowH}px`,
-                  }}
-                >
-                  {(() => {
-                    const spanCell = zone.cells.find(
-                      (c) => c.row_index === row.index && c.col_type === 'spanning_header',
-                    )
-                    if (!spanCell) return null
+                <>
+                  {zone.cells
+                    .filter((c) => c.row_index === row.index && c.col_type === 'spanning_header')
+                    .sort((a, b) => a.col_index - b.col_index)
+                    .map((spanCell) => {
+                      const colspan = spanCell.colspan || numCols
                      const cellId = spanCell.cell_id
                      const isSelected = selectedCell === cellId
                      const cellColor = getCellColor(spanCell)
+                      const gridColStart = spanCell.col_index + 2
+                      const gridColEnd = gridColStart + colspan
                      return (
-                      <div className="flex items-center w-full">
+                        <div
+                          key={cellId}
+                          className={`border-b border-r border-gray-200 dark:border-gray-700 bg-blue-50/50 dark:bg-blue-900/10 flex items-center ${
+                            isSelected ? 'ring-2 ring-teal-500 ring-inset z-10' : ''
+                          }`}
+                          style={{ gridColumn: `${gridColStart} / ${gridColEnd}`, height: `${rowH}px` }}
+                        >
                          {cellColor && (
-                          <span
-                            className="flex-shrink-0 w-1.5 self-stretch rounded-l-sm"
-                            style={{ backgroundColor: cellColor }}
-                          />
+                            <span className="flex-shrink-0 w-1.5 self-stretch rounded-l-sm" style={{ backgroundColor: cellColor }} />
                          )}
                          <input
                            id={`cell-${cellId}`}
@@ -440,16 +445,14 @@ export function GridTable({
                            onChange={(e) => onCellTextChange(cellId, e.target.value)}
                            onFocus={() => onSelectCell(cellId)}
                            onKeyDown={(e) => handleKeyDown(e, cellId)}
-                          className={`w-full px-3 py-1 bg-transparent border-0 outline-none text-center ${
-                            isSelected ? 'ring-2 ring-teal-500 ring-inset rounded' : ''
-                          }`}
+                            className="w-full px-3 py-1 bg-transparent border-0 outline-none text-center"
                            style={{ color: cellColor || undefined }}
                            spellCheck={false}
                          />
                        </div>
                      )
-                  })()}
-                </div>
+                    })}
+                </>
              ) : (
                zone.columns.map((col) => {
                  const cell = cellMap.get(`${row.index}_${col.index}`)
@@ -485,7 +488,13 @@ export function GridTable({
                      } ${isMultiSelected ? 'bg-teal-50/60 dark:bg-teal-900/20' : ''} ${
                        isLowConf && !isMultiSelected ? 'bg-amber-50/50 dark:bg-amber-900/10' : ''
                      } ${row.is_header && !isMultiSelected ? 'bg-blue-50/50 dark:bg-blue-900/10' : ''}`}
-                      style={{ height: `${rowH}px` }}
+                      style={{
+                        height: `${rowH}px`,
+                        ...(cell?.box_region?.bg_hex ? {
+                          backgroundColor: `${cell.box_region.bg_hex}12`,
+                          borderLeft: cell.box_region.border ? `3px solid ${cell.box_region.bg_hex}60` : undefined,
+                        } : {}),
+                      }}
                      onContextMenu={(e) => {
                        if (onSetCellColor) {
                          e.preventDefault()
@@ -501,7 +510,11 @@ export function GridTable({
                        />
                      )}
                      {/* Per-word colored display when not editing */}
-                      {hasColoredWords && !isSelected ? (
+                      {(() => {
+                        const cellText = cell?.text ?? ''
+                        const isMultiLine = cellText.includes('\n')
+                        if (hasColoredWords && !isSelected) {
+                          return (
                            <div
                              className={`w-full px-2 cursor-text truncate ${isBold ? 'font-bold' : 'font-normal'}`}
                              onClick={(e) => {
@@ -527,11 +540,41 @@ export function GridTable({
                                </span>
                              ))}
                            </div>
-                      ) : (
+                          )
+                        }
+                        if (isMultiLine) {
+                          return (
+                            <textarea
+                              id={`cell-${cellId}`}
+                              value={cellText}
+                              onChange={(e) => onCellTextChange(cellId, e.target.value)}
+                              onFocus={() => onSelectCell(cellId)}
+                              onClick={(e) => {
+                                if ((e.metaKey || e.ctrlKey) && onToggleCellSelection) {
+                                  e.preventDefault()
+                                  onToggleCellSelection(cellId)
+                                }
+                              }}
+                              onKeyDown={(e) => {
+                                if (e.key === 'Tab') {
+                                  e.preventDefault()
+                                  onNavigate(cellId, e.shiftKey ? 'left' : 'right')
+                                }
+                              }}
+                              rows={cellText.split('\n').length}
+                              className={`w-full px-2 bg-transparent border-0 outline-none resize-none ${
+                                isBold ? 'font-bold' : 'font-normal'
+                              }`}
+                              style={{ color: cellColor || undefined }}
+                              spellCheck={false}
+                            />
+                          )
+                        }
+                        return (
                          <input
                            id={`cell-${cellId}`}
                            type="text"
-                          value={cell?.text ?? ''}
+                            value={cellText}
                            onChange={(e) => onCellTextChange(cellId, e.target.value)}
                            onFocus={() => onSelectCell(cellId)}
                            onClick={(e) => {
@@ -547,7 +590,8 @@ export function GridTable({
                            style={{ color: cellColor || undefined }}
                            spellCheck={false}
                          />
-                      )}
+                        )
+                      })()}
                    </div>
                  )
                })
--- a/admin-lehrer/components/grid-editor/GridToolbar.tsx
+++ b/admin-lehrer/components/grid-editor/GridToolbar.tsx
@@ -1,16 +1,38 @@
 'use client'

+import type { IpaMode, SyllableMode } from './useGridEditor'
+
 interface GridToolbarProps {
  dirty: boolean
  saving: boolean
  canUndo: boolean
  canRedo: boolean
  showOverlay: boolean
+  ipaMode: IpaMode
+  syllableMode: SyllableMode
  onSave: () => void
  onUndo: () => void
  onRedo: () => void
  onRebuild: () => void
  onToggleOverlay: () => void
+  onIpaModeChange: (mode: IpaMode) => void
+  onSyllableModeChange: (mode: SyllableMode) => void
+}
+
+const IPA_LABELS: Record<IpaMode, string> = {
+  auto: 'IPA: Auto',
+  en: 'IPA: nur EN',
+  de: 'IPA: nur DE',
+  all: 'IPA: Alle',
+  none: 'IPA: Aus',
+}
+
+const SYLLABLE_LABELS: Record<SyllableMode, string> = {
+  auto: 'Silben: Original',
+  en: 'Silben: nur EN',
+  de: 'Silben: nur DE',
+  all: 'Silben: Alle',
+  none: 'Silben: Aus',
 }

 export function GridToolbar({
@@ -19,11 +41,15 @@ export function GridToolbar({
  canUndo,
  canRedo,
  showOverlay,
+  ipaMode,
+  syllableMode,
  onSave,
  onUndo,
  onRedo,
  onRebuild,
  onToggleOverlay,
+  onIpaModeChange,
+  onSyllableModeChange,
 }: GridToolbarProps) {
  return (
    <div className="flex items-center gap-2 flex-wrap">
@@ -67,6 +93,40 @@ export function GridToolbar({
        Bild-Overlay
      </button>

+      {/* IPA mode */}
+      <div className="flex items-center gap-1">
+        <select
+          value={ipaMode}
+          onChange={(e) => onIpaModeChange(e.target.value as IpaMode)}
+          className="px-2 py-1.5 text-xs rounded-md border border-gray-200 dark:border-gray-700 bg-white dark:bg-gray-800 text-gray-600 dark:text-gray-400"
+          title="Lautschrift (IPA): Auto = nur erkannte EN-Woerter, DE = deutsches IPA (Wiktionary), Alle = EN + DE, Aus = keine"
+        >
+          {(Object.keys(IPA_LABELS) as IpaMode[]).map((m) => (
+            <option key={m} value={m}>{IPA_LABELS[m]}</option>
+          ))}
+        </select>
+        {(ipaMode === 'de' || ipaMode === 'all') && (
+          <span
+            className="text-[9px] text-gray-400 dark:text-gray-500 cursor-help"
+            title="DE-Lautschrift: Wiktionary (CC-BY-SA 4.0) + epitran (MIT). EN-Lautschrift: Britfone (MIT) + eng_to_ipa (MIT)."
+          >
+            CC-BY-SA
+          </span>
+        )}
+      </div>
+
+      {/* Syllable mode */}
+      <select
+        value={syllableMode}
+        onChange={(e) => onSyllableModeChange(e.target.value as SyllableMode)}
+        className="px-2 py-1.5 text-xs rounded-md border border-gray-200 dark:border-gray-700 bg-white dark:bg-gray-800 text-gray-600 dark:text-gray-400"
+        title="Silbentrennung: Original = nur wo im Scan vorhanden, Alle = fuer alle Woerter, Aus = keine"
+      >
+        {(Object.keys(SYLLABLE_LABELS) as SyllableMode[]).map((m) => (
+          <option key={m} value={m}>{SYLLABLE_LABELS[m]}</option>
+        ))}
+      </select>
+
      {/* Rebuild */}
      <button
        onClick={onRebuild}
--- a/admin-lehrer/components/grid-editor/types.ts
+++ b/admin-lehrer/components/grid-editor/types.ts
@@ -1,4 +1,4 @@
-import type { OcrWordBox } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { OcrWordBox } from '@/app/(admin)/ai/ocr-kombi/types'

 // Re-export for convenience
 export type { OcrWordBox }
@@ -20,6 +20,13 @@ export interface DictionaryDetection {
  headword_col_index: number | null
 }

+/** Page number extracted from footer region of the scan. */
+export interface PageNumber {
+  text: string
+  y_pct: number
+  number?: number
+}
+
 /** A complete structured grid with zones, ready for the Excel-like editor. */
 export interface StructuredGrid {
  session_id: string
@@ -31,6 +38,7 @@ export interface StructuredGrid {
  formatting: GridFormatting
  layout_metrics?: LayoutMetrics
  dictionary_detection?: DictionaryDetection
+  page_number?: PageNumber | null
  duration_seconds: number
  edited?: boolean
  layout_dividers?: LayoutDividers
@@ -65,6 +73,10 @@ export interface GridZone {
  header_rows: number[]
  layout_hint?: 'left_of_vsplit' | 'right_of_vsplit' | 'middle_of_vsplit'
  vsplit_group?: number
+  box_layout_type?: 'flowing' | 'columnar' | 'bullet_list' | 'header_only'
+  box_grid_reviewed?: boolean
+  box_bg_color?: string
+  box_bg_hex?: string
 }

 export interface BBox {
@@ -114,6 +126,16 @@ export interface GridEditorCell {
  is_bold: boolean
  /** Manual color override: hex string or null to clear. */
  color_override?: string | null
+  /** Number of columns this cell spans (merged cell). Default 1. */
+  colspan?: number
+  /** Source zone type when in unified grid. */
+  source_zone_type?: 'content' | 'box'
+  /** Box visual metadata for cells from box zones. */
+  box_region?: {
+    bg_hex?: string
+    bg_color?: string
+    border?: boolean
+  }
 }

 /** Layout dividers for the visual column/margin editor on the original image. */
--- a/admin-lehrer/components/grid-editor/useGridEditor.ts
+++ b/admin-lehrer/components/grid-editor/useGridEditor.ts
@@ -1,4 +1,4 @@
-import { useCallback, useRef, useState } from 'react'
+import { useCallback, useEffect, useRef, useState } from 'react'
 import type { StructuredGrid, GridZone, LayoutDividers } from './types'

 const KLAUSUR_API = '/klausur-api'
@@ -14,6 +14,9 @@ export interface GridEditorState {
  selectedZone: number | null
 }

+export type IpaMode = 'auto' | 'all' | 'de' | 'en' | 'none'
+export type SyllableMode = 'auto' | 'all' | 'de' | 'en' | 'none'
+
 export function useGridEditor(sessionId: string | null) {
  const [grid, setGrid] = useState<StructuredGrid | null>(null)
  const [loading, setLoading] = useState(false)
@@ -22,6 +25,17 @@ export function useGridEditor(sessionId: string | null) {
  const [dirty, setDirty] = useState(false)
  const [selectedCell, setSelectedCell] = useState<string | null>(null)
  const [selectedZone, setSelectedZone] = useState<number | null>(null)
+  const [ipaMode, setIpaMode] = useState<IpaMode>('auto')
+  const [syllableMode, setSyllableMode] = useState<SyllableMode>('auto')
+
+  // OCR Quality Steps (A/B testing toggles — defaults off for now)
+  const [ocrEnhance, setOcrEnhance] = useState(false)
+  const [ocrMaxCols, setOcrMaxCols] = useState(0)
+  const [ocrMinConf, setOcrMinConf] = useState(0)
+
+  // Vision-LLM Fusion (Step 4)
+  const [visionFusion, setVisionFusion] = useState(false)
+  const [documentCategory, setDocumentCategory] = useState('vokabelseite')

  // Undo/redo stacks store serialized zone arrays
  const undoStack = useRef<string[]>([])
@@ -44,8 +58,14 @@ export function useGridEditor(sessionId: string | null) {
    setLoading(true)
    setError(null)
    try {
+      const params = new URLSearchParams()
+      params.set('ipa_mode', ipaMode)
+      params.set('syllable_mode', syllableMode)
+      params.set('enhance', String(ocrEnhance))
+      if (ocrMaxCols > 0) params.set('max_cols', String(ocrMaxCols))
+      if (ocrMinConf > 0) params.set('min_conf', String(ocrMinConf))
      const res = await fetch(
-        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid`,
+        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid?${params}`,
        { method: 'POST' },
      )
      if (!res.ok) {
@@ -62,7 +82,41 @@ export function useGridEditor(sessionId: string | null) {
    } finally {
      setLoading(false)
    }
-  }, [sessionId])
+  }, [sessionId, ipaMode, syllableMode, ocrEnhance, ocrMaxCols, ocrMinConf])
+
+  /** Re-run OCR with current quality settings, then rebuild grid */
+  const rerunOcr = useCallback(async () => {
+    if (!sessionId) return
+    setLoading(true)
+    setError(null)
+    try {
+      const params = new URLSearchParams()
+      params.set('ipa_mode', ipaMode)
+      params.set('syllable_mode', syllableMode)
+      params.set('enhance', String(ocrEnhance))
+      if (ocrMaxCols > 0) params.set('max_cols', String(ocrMaxCols))
+      if (ocrMinConf > 0) params.set('min_conf', String(ocrMinConf))
+      params.set('vision_fusion', String(visionFusion))
+      if (documentCategory) params.set('doc_category', documentCategory)
+      const res = await fetch(
+        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/rerun-ocr-and-build-grid?${params}`,
+        { method: 'POST' },
+      )
+      if (!res.ok) {
+        const data = await res.json().catch(() => ({}))
+        throw new Error(data.detail || `HTTP ${res.status}`)
+      }
+      const data: StructuredGrid = await res.json()
+      setGrid(data)
+      setDirty(false)
+      undoStack.current = []
+      redoStack.current = []
+    } catch (e) {
+      setError(e instanceof Error ? e.message : String(e))
+    } finally {
+      setLoading(false)
+    }
+  }, [sessionId, ipaMode, syllableMode, ocrEnhance, ocrMaxCols, ocrMinConf, visionFusion, documentCategory])

  const loadGrid = useCallback(async () => {
    if (!sessionId) return
@@ -73,8 +127,24 @@ export function useGridEditor(sessionId: string | null) {
        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/grid-editor`,
      )
      if (res.status === 404) {
-        // No grid yet — build it
-        await buildGrid()
+        // No grid yet — build it with current modes
+        const params = new URLSearchParams()
+        params.set('ipa_mode', ipaMode)
+        params.set('syllable_mode', syllableMode)
+        params.set('enhance', String(ocrEnhance))
+        if (ocrMaxCols > 0) params.set('max_cols', String(ocrMaxCols))
+        if (ocrMinConf > 0) params.set('min_conf', String(ocrMinConf))
+        const buildRes = await fetch(
+          `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid?${params}`,
+          { method: 'POST' },
+        )
+        if (!buildRes.ok) {
+          const data = await buildRes.json().catch(() => ({}))
+          throw new Error(data.detail || `HTTP ${buildRes.status}`)
+        }
+        const data: StructuredGrid = await buildRes.json()
+        setGrid(data)
+        setDirty(false)
        return
      }
      if (!res.ok) {
@@ -91,7 +159,50 @@ export function useGridEditor(sessionId: string | null) {
    } finally {
      setLoading(false)
    }
-  }, [sessionId, buildGrid])
+    // Only depends on sessionId — mode changes are handled by the
+    // separate useEffect below, not by re-triggering loadGrid.
+    // eslint-disable-next-line react-hooks/exhaustive-deps
+  }, [sessionId])
+
+  // Auto-rebuild when IPA or syllable mode changes (skip initial mount).
+  // We call the API directly with the new values instead of going through
+  // the buildGrid callback, which may still close over stale state due to
+  // React's asynchronous state batching.
+  const mountedRef = useRef(false)
+  useEffect(() => {
+    if (!mountedRef.current) {
+      // Skip the first trigger (component mount) — don't rebuild yet
+      mountedRef.current = true
+      return
+    }
+    if (!sessionId) return
+    const rebuild = async () => {
+      setLoading(true)
+      setError(null)
+      try {
+        const params = new URLSearchParams()
+        params.set('ipa_mode', ipaMode)
+        params.set('syllable_mode', syllableMode)
+        const res = await fetch(
+          `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid?${params}`,
+          { method: 'POST' },
+        )
+        if (!res.ok) {
+          const data = await res.json().catch(() => ({}))
+          throw new Error(data.detail || `HTTP ${res.status}`)
+        }
+        const data: StructuredGrid = await res.json()
+        setGrid(data)
+        setDirty(false)
+      } catch (e) {
+        setError(e instanceof Error ? e.message : String(e))
+      } finally {
+        setLoading(false)
+      }
+    }
+    rebuild()
+    // eslint-disable-next-line react-hooks/exhaustive-deps
+  }, [ipaMode, syllableMode])

  // ------------------------------------------------------------------
  // Save
@@ -915,5 +1026,20 @@ export function useGridEditor(sessionId: string | null) {
    toggleSelectedBold,
    autoCorrectColumnPatterns,
    setCellColor,
+    ipaMode,
+    setIpaMode,
+    syllableMode,
+    setSyllableMode,
+    ocrEnhance,
+    setOcrEnhance,
+    ocrMaxCols,
+    setOcrMaxCols,
+    ocrMinConf,
+    setOcrMinConf,
+    visionFusion,
+    setVisionFusion,
+    documentCategory,
+    setDocumentCategory,
+    rerunOcr,
  }
 }
--- a/admin-lehrer/components/ocr-kombi/KombiStepper.tsx
+++ b/admin-lehrer/components/ocr-kombi/KombiStepper.tsx
@@ -0,0 +1,59 @@
+'use client'
+
+import type { PipelineStep } from '@/app/(admin)/ai/ocr-kombi/types'
+
+interface KombiStepperProps {
+  steps: PipelineStep[]
+  currentStep: number
+  onStepClick: (index: number) => void
+}
+
+export function KombiStepper({ steps, currentStep, onStepClick }: KombiStepperProps) {
+  return (
+    <div className="flex items-center gap-0.5 px-3 py-2.5 bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 overflow-x-auto">
+      {steps.map((step, index) => {
+        const isActive = index === currentStep
+        const isCompleted = step.status === 'completed'
+        const isFailed = step.status === 'failed'
+        const isSkipped = step.status === 'skipped'
+        const isClickable = (index <= currentStep || isCompleted) && !isSkipped
+
+        return (
+          <div key={step.id} className="flex items-center flex-shrink-0">
+            {index > 0 && (
+              <div
+                className={`h-0.5 w-4 mx-0.5 ${
+                  isSkipped
+                    ? 'bg-gray-200 dark:bg-gray-700 border-t border-dashed border-gray-400'
+                    : index <= currentStep ? 'bg-teal-400' : 'bg-gray-300 dark:bg-gray-600'
+                }`}
+              />
+            )}
+            <button
+              onClick={() => isClickable && onStepClick(index)}
+              disabled={!isClickable}
+              className={`flex items-center gap-1 px-2 py-1 rounded-full text-xs font-medium transition-all whitespace-nowrap ${
+                isSkipped
+                  ? 'bg-gray-100 text-gray-400 dark:bg-gray-800 dark:text-gray-600 line-through'
+                  : isActive
+                    ? 'bg-teal-100 text-teal-700 dark:bg-teal-900/40 dark:text-teal-300 ring-2 ring-teal-400'
+                    : isCompleted
+                      ? 'bg-green-100 text-green-700 dark:bg-green-900/40 dark:text-green-300'
+                      : isFailed
+                        ? 'bg-red-100 text-red-700 dark:bg-red-900/40 dark:text-red-300'
+                        : 'text-gray-400 dark:text-gray-500'
+              } ${isClickable ? 'cursor-pointer hover:opacity-80' : 'cursor-default'}`}
+              title={step.name}
+            >
+              <span className="text-sm">
+                {isSkipped ? '-' : isCompleted ? '\u2713' : isFailed ? '\u2717' : step.icon}
+              </span>
+              <span className="hidden lg:inline">{step.name}</span>
+              <span className="lg:hidden">{index + 1}</span>
+            </button>
+          </div>
+        )
+      })}
+    </div>
+  )
+}
--- a/admin-lehrer/components/ocr-kombi/SessionHeader.tsx
+++ b/admin-lehrer/components/ocr-kombi/SessionHeader.tsx
@@ -0,0 +1,73 @@
+'use client'
+
+import { useState } from 'react'
+import { DOCUMENT_CATEGORIES, type DocumentCategory } from '@/app/(admin)/ai/ocr-kombi/types'
+
+interface SessionHeaderProps {
+  sessionName: string
+  activeCategory?: DocumentCategory
+  isGroundTruth: boolean
+  pageNumber?: number | null
+  onUpdateCategory: (category: DocumentCategory) => void
+}
+
+export function SessionHeader({
+  sessionName,
+  activeCategory,
+  isGroundTruth,
+  pageNumber,
+  onUpdateCategory,
+}: SessionHeaderProps) {
+  const [showCategoryPicker, setShowCategoryPicker] = useState(false)
+
+  const catInfo = DOCUMENT_CATEGORIES.find(c => c.value === activeCategory)
+
+  return (
+    <div className="relative flex items-center gap-3 text-sm text-gray-500 dark:text-gray-400">
+      <span>
+        Aktive Session:{' '}
+        <span className="font-medium text-gray-700 dark:text-gray-300">{sessionName}</span>
+      </span>
+      <button
+        onClick={() => setShowCategoryPicker(!showCategoryPicker)}
+        className={`text-xs px-2.5 py-1 rounded-full border transition-colors ${
+          activeCategory
+            ? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300 hover:bg-teal-100'
+            : 'bg-amber-50 dark:bg-amber-900/20 border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300 hover:bg-amber-100 animate-pulse'
+        }`}
+      >
+        {catInfo ? `${catInfo.icon} ${catInfo.label}` : 'Kategorie setzen'}
+      </button>
+      {pageNumber != null && (
+        <span className="text-xs px-2 py-0.5 rounded-full bg-gray-100 dark:bg-gray-700 border border-gray-200 dark:border-gray-600 text-gray-600 dark:text-gray-300">
+          S. {pageNumber}
+        </span>
+      )}
+      {isGroundTruth && (
+        <span className="text-xs px-2 py-0.5 rounded-full bg-amber-50 dark:bg-amber-900/20 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300">
+          GT
+        </span>
+      )}
+      {showCategoryPicker && (
+        <div className="absolute left-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64">
+          {DOCUMENT_CATEGORIES.map(cat => (
+            <button
+              key={cat.value}
+              onClick={() => {
+                onUpdateCategory(cat.value)
+                setShowCategoryPicker(false)
+              }}
+              className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
+                activeCategory === cat.value
+                  ? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
+                  : 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
+              }`}
+            >
+              {cat.icon} {cat.label}
+            </button>
+          ))}
+        </div>
+      )}
+    </div>
+  )
+}
--- a/admin-lehrer/components/ocr-kombi/SessionList.tsx
+++ b/admin-lehrer/components/ocr-kombi/SessionList.tsx
@@ -0,0 +1,376 @@
+'use client'
+
+import { useState } from 'react'
+import { DOCUMENT_CATEGORIES, type DocumentCategory } from '@/app/(admin)/ai/ocr-kombi/types'
+import type { SessionListItem, DocumentGroupView } from '@/app/(admin)/ai/ocr-kombi/useKombiPipeline'
+
+const KLAUSUR_API = '/klausur-api'
+
+interface SessionListProps {
+  items: (SessionListItem | DocumentGroupView)[]
+  loading: boolean
+  activeSessionId: string | null
+  onOpenSession: (sid: string) => void
+  onNewSession: () => void
+  onDeleteSession: (sid: string) => void
+  onRenameSession: (sid: string, newName: string) => void
+  onUpdateCategory: (sid: string, category: DocumentCategory) => void
+}
+
+function isGroup(item: SessionListItem | DocumentGroupView): item is DocumentGroupView {
+  return 'group_id' in item
+}
+
+export function SessionList({
+  items,
+  loading,
+  activeSessionId,
+  onOpenSession,
+  onNewSession,
+  onDeleteSession,
+  onRenameSession,
+  onUpdateCategory,
+}: SessionListProps) {
+  const [editingName, setEditingName] = useState<string | null>(null)
+  const [editNameValue, setEditNameValue] = useState('')
+  const [editingCategory, setEditingCategory] = useState<string | null>(null)
+  const [expandedGroups, setExpandedGroups] = useState<Set<string>>(new Set())
+
+  const toggleGroup = (groupId: string) => {
+    setExpandedGroups(prev => {
+      const next = new Set(prev)
+      if (next.has(groupId)) next.delete(groupId)
+      else next.add(groupId)
+      return next
+    })
+  }
+
+  return (
+    <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4">
+      <div className="flex items-center justify-between mb-3">
+        <h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
+          Sessions ({items.length})
+        </h3>
+        <button
+          onClick={onNewSession}
+          className="text-xs px-3 py-1.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
+        >
+          + Neue Session
+        </button>
+      </div>
+
+      {loading ? (
+        <div className="text-sm text-gray-400 py-2">Lade Sessions...</div>
+      ) : items.length === 0 ? (
+        <div className="text-sm text-gray-400 py-2">Noch keine Sessions vorhanden.</div>
+      ) : (
+        <div className="space-y-1.5 max-h-[320px] overflow-y-auto">
+          {items.map(item =>
+            isGroup(item) ? (
+              <GroupRow
+                key={item.group_id}
+                group={item}
+                expanded={expandedGroups.has(item.group_id)}
+                activeSessionId={activeSessionId}
+                onToggle={() => toggleGroup(item.group_id)}
+                onOpenSession={onOpenSession}
+                onDeleteSession={onDeleteSession}
+              />
+            ) : (
+              <SessionRow
+                key={item.id}
+                session={item}
+                isActive={activeSessionId === item.id}
+                editingName={editingName}
+                editNameValue={editNameValue}
+                editingCategory={editingCategory}
+                onOpenSession={() => onOpenSession(item.id)}
+                onStartRename={() => {
+                  setEditNameValue(item.name || item.filename)
+                  setEditingName(item.id)
+                }}
+                onFinishRename={(newName) => {
+                  onRenameSession(item.id, newName)
+                  setEditingName(null)
+                }}
+                onCancelRename={() => setEditingName(null)}
+                onEditNameChange={setEditNameValue}
+                onToggleCategory={() => setEditingCategory(editingCategory === item.id ? null : item.id)}
+                onUpdateCategory={(cat) => {
+                  onUpdateCategory(item.id, cat)
+                  setEditingCategory(null)
+                }}
+                onDelete={() => {
+                  if (confirm('Session loeschen?')) onDeleteSession(item.id)
+                }}
+              />
+            )
+          )}
+        </div>
+      )}
+    </div>
+  )
+}
+
+// ---- Group row (multi-page document) ----
+
+function GroupRow({
+  group,
+  expanded,
+  activeSessionId,
+  onToggle,
+  onOpenSession,
+  onDeleteSession,
+}: {
+  group: DocumentGroupView
+  expanded: boolean
+  activeSessionId: string | null
+  onToggle: () => void
+  onOpenSession: (sid: string) => void
+  onDeleteSession: (sid: string) => void
+}) {
+  const isActive = group.sessions.some(s => s.id === activeSessionId)
+
+  return (
+    <div>
+      <div
+        onClick={onToggle}
+        className={`flex items-center gap-3 px-3 py-2 rounded-lg text-sm cursor-pointer transition-colors ${
+          isActive
+            ? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
+            : 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
+        }`}
+      >
+        <span className="text-base">{expanded ? '\u25BC' : '\u25B6'}</span>
+        <div className="flex-1 min-w-0">
+          <div className="truncate font-medium text-gray-700 dark:text-gray-300">
+            {group.title}
+          </div>
+          <div className="text-xs text-gray-400">
+            {group.page_count} Seiten
+          </div>
+        </div>
+        <div className="flex items-center gap-1.5">
+          {group.sessions.some(s => s.is_ground_truth) && (
+            <span className="text-[10px] px-1.5 py-0.5 rounded-full bg-amber-100 dark:bg-amber-900/30 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300">
+              GT {group.sessions.filter(s => s.is_ground_truth).length}/{group.sessions.length}
+            </span>
+          )}
+          <span className="text-xs px-2 py-0.5 rounded-full bg-blue-50 dark:bg-blue-900/20 border border-blue-200 dark:border-blue-800 text-blue-600 dark:text-blue-400">
+            Dokument
+          </span>
+        </div>
+      </div>
+
+      {expanded && (
+        <div className="ml-6 mt-1 space-y-1 border-l-2 border-gray-200 dark:border-gray-700 pl-3">
+          {group.sessions.map(s => (
+            <div
+              key={s.id}
+              className={`flex items-center gap-2 px-2 py-1.5 rounded text-xs cursor-pointer transition-colors ${
+                activeSessionId === s.id
+                  ? 'bg-teal-50 dark:bg-teal-900/30 text-teal-700 dark:text-teal-300'
+                  : 'hover:bg-gray-50 dark:hover:bg-gray-700/50 text-gray-600 dark:text-gray-400'
+              }`}
+              onClick={() => onOpenSession(s.id)}
+            >
+              {/* Thumbnail */}
+              <div className="flex-shrink-0 w-8 h-8 rounded overflow-hidden bg-gray-100 dark:bg-gray-700">
+                {/* eslint-disable-next-line @next/next/no-img-element */}
+                <img
+                  src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${s.id}/thumbnail?size=64`}
+                  alt=""
+                  className="w-full h-full object-cover"
+                  loading="lazy"
+                  onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
+                />
+              </div>
+              <span className="truncate flex-1">S. {s.page_number || '?'}</span>
+              {s.is_ground_truth && (
+                <span className="text-[9px] px-1 py-0.5 rounded bg-amber-100 dark:bg-amber-900/30 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300">GT</span>
+              )}
+              <span className="text-[10px] text-gray-400">Step {s.current_step}</span>
+              <button
+                onClick={(e) => {
+                  e.stopPropagation()
+                  if (confirm('Seite loeschen?')) onDeleteSession(s.id)
+                }}
+                className="p-0.5 text-gray-400 hover:text-red-500"
+                title="Loeschen"
+              >
+                <svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
+                  <path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
+                </svg>
+              </button>
+            </div>
+          ))}
+        </div>
+      )}
+    </div>
+  )
+}
+
+// ---- Single session row ----
+
+function SessionRow({
+  session,
+  isActive,
+  editingName,
+  editNameValue,
+  editingCategory,
+  onOpenSession,
+  onStartRename,
+  onFinishRename,
+  onCancelRename,
+  onEditNameChange,
+  onToggleCategory,
+  onUpdateCategory,
+  onDelete,
+}: {
+  session: SessionListItem
+  isActive: boolean
+  editingName: string | null
+  editNameValue: string
+  editingCategory: string | null
+  onOpenSession: () => void
+  onStartRename: () => void
+  onFinishRename: (name: string) => void
+  onCancelRename: () => void
+  onEditNameChange: (val: string) => void
+  onToggleCategory: () => void
+  onUpdateCategory: (cat: DocumentCategory) => void
+  onDelete: () => void
+}) {
+  const catInfo = DOCUMENT_CATEGORIES.find(c => c.value === session.document_category)
+  const isEditing = editingName === session.id
+
+  return (
+    <div
+      className={`relative flex items-start gap-3 px-3 py-2.5 rounded-lg text-sm transition-colors cursor-pointer ${
+        isActive
+          ? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
+          : 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
+      }`}
+    >
+      {/* Thumbnail */}
+      <div
+        className="flex-shrink-0 w-12 h-12 rounded-md overflow-hidden bg-gray-100 dark:bg-gray-700"
+        onClick={onOpenSession}
+      >
+        {/* eslint-disable-next-line @next/next/no-img-element */}
+        <img
+          src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${session.id}/thumbnail?size=96`}
+          alt=""
+          className="w-full h-full object-cover"
+          loading="lazy"
+          onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
+        />
+      </div>
+
+      {/* Info */}
+      <div className="flex-1 min-w-0" onClick={onOpenSession}>
+        {isEditing ? (
+          <input
+            autoFocus
+            value={editNameValue}
+            onChange={(e) => onEditNameChange(e.target.value)}
+            onBlur={() => onFinishRename(editNameValue)}
+            onKeyDown={(e) => {
+              if (e.key === 'Enter') onFinishRename(editNameValue)
+              if (e.key === 'Escape') onCancelRename()
+            }}
+            onClick={(e) => e.stopPropagation()}
+            className="w-full px-1 py-0.5 text-sm border rounded dark:bg-gray-700 dark:border-gray-600"
+          />
+        ) : (
+          <div className="truncate font-medium text-gray-700 dark:text-gray-300">
+            {session.name || session.filename}
+          </div>
+        )}
+        <button
+          onClick={(e) => {
+            e.stopPropagation()
+            const btn = e.currentTarget
+            navigator.clipboard?.writeText(session.id).then(() => {
+              btn.textContent = 'Kopiert!'
+              setTimeout(() => { btn.textContent = `ID: ${session.id.slice(0, 8)}` }, 1500)
+            }).catch(() => { /* copy failed or clipboard unavailable: keep the ID label */ })
+          }}
+          className="text-[10px] font-mono text-gray-400 hover:text-teal-500 transition-colors"
+          title={`Volle ID: ${session.id} — Klick zum Kopieren`}
+        >
+          ID: {session.id.slice(0, 8)}
+        </button>
+        <div className="text-xs text-gray-400 mt-0.5">
+          {new Date(session.created_at).toLocaleDateString('de-DE', {
+            day: '2-digit', month: '2-digit', year: '2-digit',
+            hour: '2-digit', minute: '2-digit',
+          })}
+        </div>
+      </div>
+
+      {/* Category + GT badge */}
+      <div className="flex flex-col gap-1 items-end flex-shrink-0" onClick={(e) => e.stopPropagation()}>
+        <button
+          onClick={onToggleCategory}
+          className={`text-[10px] px-1.5 py-0.5 rounded-full border transition-colors ${
+            catInfo
+              ? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
+              : 'bg-gray-50 dark:bg-gray-700 border-gray-200 dark:border-gray-600 text-gray-400 hover:text-gray-600'
+          }`}
+          title="Kategorie setzen"
+        >
+          {catInfo ? `${catInfo.icon} ${catInfo.label}` : '+ Kategorie'}
+        </button>
+        {session.is_ground_truth && (
+          <span className="text-[10px] px-1.5 py-0.5 rounded-full bg-amber-100 dark:bg-amber-900/30 border border-amber-300 dark:border-amber-700 text-amber-700 dark:text-amber-300" title="Ground Truth markiert">
+            GT
+          </span>
+        )}
+      </div>
+
+      {/* Actions */}
+      <div className="flex flex-col gap-0.5 flex-shrink-0">
+        <button
+          onClick={(e) => { e.stopPropagation(); onStartRename() }}
+          className="p-1 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300"
+          title="Umbenennen"
+        >
+          <svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
+            <path strokeLinecap="round" strokeLinejoin="round" d="M15.232 5.232l3.536 3.536m-2.036-5.036a2.5 2.5 0 113.536 3.536L6.5 21.036H3v-3.572L16.732 3.732z" />
+          </svg>
+        </button>
+        <button
+          onClick={(e) => { e.stopPropagation(); onDelete() }}
+          className="p-1 text-gray-400 hover:text-red-500"
+          title="Loeschen"
+        >
+          <svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
+            <path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
+          </svg>
+        </button>
+      </div>
+
+      {/* Category dropdown */}
+      {editingCategory === session.id && (
+        <div
+          className="absolute right-0 top-full mt-1 z-20 bg-white dark:bg-gray-800 border border-gray-200 dark:border-gray-700 rounded-lg shadow-lg p-2 grid grid-cols-2 gap-1 w-64"
+          onClick={(e) => e.stopPropagation()}
+        >
+          {DOCUMENT_CATEGORIES.map(cat => (
+            <button
+              key={cat.value}
+              onClick={() => onUpdateCategory(cat.value)}
+              className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
+                session.document_category === cat.value
+                  ? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300'
+                  : 'hover:bg-gray-100 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
+              }`}
+            >
+              {cat.icon} {cat.label}
+            </button>
+          ))}
+        </div>
+      )}
+    </div>
+  )
+}
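
Review note on the mixed list above: `isGroup`, `DocumentGroupView` and `SessionListItem` are defined outside this hunk. The following is only a minimal sketch of how such a discriminating type guard and the `GT n/m` badge text could look, assuming that groups are the only items carrying a `group_id`. The interface shapes are hypothetical and reduced to the fields this component reads.

```typescript
// Hypothetical, reduced shapes; the real types live outside this diff.
interface SessionListItem {
  id: string
  filename: string
  name?: string
  is_ground_truth?: boolean
}

interface DocumentGroupView {
  group_id: string
  title: string
  page_count: number
  sessions: SessionListItem[]
}

type ListItem = SessionListItem | DocumentGroupView

// Assumption: only multi-page groups carry a group_id.
function isGroup(item: ListItem): item is DocumentGroupView {
  return 'group_id' in item
}

// Returns "GT n/m" when at least one page is marked as ground truth, else null.
function groundTruthBadge(group: DocumentGroupView): string | null {
  const marked = group.sessions.filter(s => s.is_ground_truth).length
  return marked > 0 ? `GT ${marked}/${group.sessions.length}` : null
}
```

With a guard of this kind, the `items.map(...)` branch narrows `item` to the right type, so `item.group_id` and `item.id` type-check without casts.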
--- a/admin-lehrer/components/ocr-kombi/SpreadsheetView.tsx
+++ b/admin-lehrer/components/ocr-kombi/SpreadsheetView.tsx
@@ -0,0 +1,241 @@
+'use client'
+
+/**
+ * SpreadsheetView — Fortune Sheet with multi-sheet support.
+ *
+ * Each zone (content + boxes) becomes its own Excel sheet tab,
+ * so each can have independent column widths optimized for its content.
+ */
+
+import { useMemo } from 'react'
+import dynamic from 'next/dynamic'
+
+import '@fortune-sheet/react/dist/index.css'
+
+import type { GridZone } from '@/components/grid-editor/types'
+
+const Workbook = dynamic(
+  () => import('@fortune-sheet/react').then((m) => m.Workbook),
+  { ssr: false, loading: () => <div className="py-8 text-center text-sm text-gray-400">Spreadsheet wird geladen...</div> },
+)
+
+interface SpreadsheetViewProps {
+  gridData: any
+  height?: number
+}
+
+/** No expansion — keep multi-line cells as single cells with \n and text-wrap. */
+
+/** Convert a single zone to a Fortune Sheet sheet object. */
+function zoneToSheet(zone: GridZone, sheetIndex: number, isFirst: boolean): any {
+  const isBox = zone.zone_type === 'box'
+  const boxColor = zone.box_bg_hex || ''
+
+  // Sheet name
+  let name: string
+  if (!isBox) {
+    name = 'Vokabeln'
+  } else {
+    const firstText = zone.cells?.[0]?.text ?? `Box ${sheetIndex}`
+    const cleaned = firstText.replace(/[^\w\s\u00C0-\u024F„"]/g, '').trim()
+    name = cleaned.length > 25 ? cleaned.slice(0, 25) + '…' : cleaned || `Box ${sheetIndex}`
+  }
+
+  const numCols = zone.columns?.length || 1
+  const numRows = zone.rows?.length || 0
+  const expandedCells = zone.cells || []
+
+  // Compute zone-wide median word height for font-size detection
+  const allWordHeights = expandedCells
+    .flatMap((c: any) => (c.word_boxes || []).map((wb: any) => wb.height || 0))
+    .filter((h: number) => h > 0)
+  const medianWordH = allWordHeights.length
+    ? [...allWordHeights].sort((a, b) => a - b)[Math.floor(allWordHeights.length / 2)]
+    : 0
+
+  // Build celldata
+  const celldata: any[] = []
+  const merges: Record<string, any> = {}
+
+  for (const cell of expandedCells) {
+    const r = cell.row_index
+    const c = cell.col_index
+    let text = cell.text ?? ''  // reassigned below when a bullet marker is prepended
+
+    // Row metadata
+    const row = zone.rows?.find((rr) => rr.index === r)
+    const isHeader = row?.is_header ?? false
+
+    // Font size detection from word_boxes
+    const avgWbH = cell.word_boxes?.length
+      ? cell.word_boxes.reduce((s: number, wb: any) => s + (wb.height || 0), 0) / cell.word_boxes.length
+      : 0
+    const isLargerFont = avgWbH > 0 && medianWordH > 0 && avgWbH > medianWordH * 1.3
+
+    const v: any = { v: text, m: text }
+
+    // Bold: headers, is_bold, larger font
+    if (cell.is_bold || isHeader || isLargerFont) {
+      v.bl = 1
+    }
+
+    // Larger font for box titles
+    if (isLargerFont && isBox) {
+      v.fs = 12
+    }
+
+    // Multi-line text (bullets with \n): enable text wrap + vertical top align
+    // Add bullet marker (•) if multi-line and no bullet present
+    if (text.includes('\n') && !isHeader) {
+      if (!text.startsWith('•') && !text.startsWith('-') && !text.startsWith('–') && r > 0) {
+        text = '• ' + text
+        v.v = text
+        v.m = text
+      }
+      v.tb = '2'  // text wrap
+      v.vt = 1    // vertical align: top (0 = middle, 1 = top, 2 = bottom)
+    }
+
+    // Header row background
+    if (isHeader) {
+      v.bg = isBox ? `${boxColor || '#2563eb'}18` : '#f0f4ff'
+    }
+
+    // Box cells: light tinted background
+    if (isBox && !isHeader && boxColor) {
+      v.bg = `${boxColor}08`
+    }
+
+    // Text color from OCR
+    const color = cell.color_override
+      ?? cell.word_boxes?.find((wb: any) => wb.color_name && wb.color_name !== 'black')?.color
+    if (color) v.fc = color
+
+    celldata.push({ r, c, v })
+
+    // Colspan → merge
+    const colspan = cell.colspan || 0
+    if (colspan > 1 || cell.col_type === 'spanning_header') {
+      const cs = colspan > 1 ? colspan : Math.max(1, numCols - c)  // spanning header: fill to the last column
+      merges[`${r}_${c}`] = { r, c, rs: 1, cs }
+    }
+  }
+
+  // Column widths — auto-fit based on longest text
+  const columnlen: Record<string, number> = {}
+  for (const col of (zone.columns || [])) {
+    const colCells = expandedCells.filter(
+      (c: any) => c.col_index === col.index && c.col_type !== 'spanning_header'
+    )
+    let maxTextLen = 0
+    for (const c of colCells) {
+      const len = (c.text ?? '').length
+      if (len > maxTextLen) maxTextLen = len
+    }
+    const autoWidth = Math.max(60, maxTextLen * 7.5 + 16)
+    const pxW = (col.x_max_px ?? 0) - (col.x_min_px ?? 0)
+    const scaledPxW = Math.max(60, Math.round(pxW * (numCols <= 2 ? 0.6 : 0.4)))
+    columnlen[String(col.index)] = Math.round(Math.max(autoWidth, scaledPxW))
+  }
+
+  // Row heights — taller for multi-line cells
+  const rowlen: Record<string, number> = {}
+  for (const row of (zone.rows || [])) {
+    const rowCells = expandedCells.filter((c: any) => c.row_index === row.index)
+    const maxLines = Math.max(1, ...rowCells.map((c: any) => (c.text ?? '').split('\n').length))
+    const baseH = 24
+    rowlen[String(row.index)] = Math.max(baseH, baseH * maxLines)
+  }
+
+  // Border info
+  const borderInfo: any[] = []
+
+  // Box: colored outside border
+  if (isBox && boxColor && numRows > 0 && numCols > 0) {
+    borderInfo.push({
+      rangeType: 'range',
+      borderType: 'border-outside',
+      color: boxColor,
+      style: 5,
+      range: [{ row: [0, numRows - 1], column: [0, numCols - 1] }],
+    })
+    borderInfo.push({
+      rangeType: 'range',
+      borderType: 'border-inside',
+      color: `${boxColor}40`,
+      style: 1,
+      range: [{ row: [0, numRows - 1], column: [0, numCols - 1] }],
+    })
+  }
+
+  // Content zone: light grid lines
+  if (!isBox && numRows > 0 && numCols > 0) {
+    borderInfo.push({
+      rangeType: 'range',
+      borderType: 'border-all',
+      color: '#e5e7eb',
+      style: 1,
+      range: [{ row: [0, numRows - 1], column: [0, numCols - 1] }],
+    })
+  }
+
+  return {
+    name,
+    id: `zone_${zone.zone_index}`,
+    celldata,
+    row: numRows,
+    column: Math.max(numCols, 1),
+    status: isFirst ? 1 : 0,
+    color: isBox ? boxColor : undefined,
+    config: {
+      merge: Object.keys(merges).length > 0 ? merges : undefined,
+      columnlen,
+      rowlen,
+      borderInfo: borderInfo.length > 0 ? borderInfo : undefined,
+    },
+  }
+}
+
+export function SpreadsheetView({ gridData, height = 600 }: SpreadsheetViewProps) {
+  const sheets = useMemo(() => {
+    if (!gridData?.zones) return []
+
+    const sorted = [...gridData.zones].sort((a: GridZone, b: GridZone) => {
+      if (a.zone_type === 'content' && b.zone_type !== 'content') return -1
+      if (a.zone_type !== 'content' && b.zone_type === 'content') return 1
+      return (a.bbox_px?.y ?? 0) - (b.bbox_px?.y ?? 0)
+    })
+
+    return sorted
+      .filter((z: GridZone) => z.cells && z.cells.length > 0)
+      .map((z: GridZone, i: number) => zoneToSheet(z, i, i === 0))
+  }, [gridData])
+
+  const maxRows = Math.max(0, ...sheets.map((s: any) => s.row || 0))
+  const estimatedHeight = Math.max(height, maxRows * 26 + 80)
+
+  if (sheets.length === 0) {
+    return <div className="p-4 text-center text-gray-400">Keine Daten für Spreadsheet.</div>
+  }
+
+  return (
+    <div style={{ width: '100%', height: `${estimatedHeight}px` }}>
+      <Workbook
+        data={sheets}
+        lang="en"
+        showToolbar
+        showFormulaBar={false}
+        showSheetTabs
+        toolbarItems={[
+          'undo', 'redo', '|',
+          'font-bold', 'font-italic', 'font-strikethrough', '|',
+          'font-color', 'background', '|',
+          'font-size', '|',
+          'horizontal-align', 'vertical-align', '|',
+          'text-wrap', 'merge-cell', '|',
+          'border',
+        ]}
+      />
+    </div>
+  )
+}
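
Review note on the sizing heuristics in `zoneToSheet`: the column-width and row-height rules are easier to check in isolation. The sketch below restates them as pure functions with the same constants as the diff (7.5 px per character, 16 px padding, 60 px minimum, 24 px base row height). `CellLike`, `autoColumnWidth` and `rowHeight` are hypothetical names introduced only for this illustration.

```typescript
// Hypothetical reduced cell shape; only the fields the heuristics read.
interface CellLike {
  row_index: number
  col_index: number
  text?: string | null
  col_type?: string
}

// Width = max(text-based estimate, scaled scan width), never below 60 px.
function autoColumnWidth(
  cells: CellLike[],
  colIndex: number,
  pxWidth: number,
  numCols: number,
): number {
  const maxTextLen = cells
    .filter(c => c.col_index === colIndex && c.col_type !== 'spanning_header')
    .reduce((max, c) => Math.max(max, (c.text ?? '').length), 0)
  const autoWidth = Math.max(60, maxTextLen * 7.5 + 16)
  // Narrow layouts (1-2 columns) keep more of the scanned width than wide ones.
  const scaledPxW = Math.max(60, Math.round(pxWidth * (numCols <= 2 ? 0.6 : 0.4)))
  return Math.round(Math.max(autoWidth, scaledPxW))
}

// Height = base height times the largest line count in the row (at least 1).
function rowHeight(cells: CellLike[], rowIndex: number, baseH = 24): number {
  const maxLines = Math.max(
    1,
    ...cells
      .filter(c => c.row_index === rowIndex)
      .map(c => (c.text ?? '').split('\n').length),
  )
  return baseH * maxLines
}
```

Because `maxLines` is at least 1, `baseH * maxLines` already covers the `Math.max(baseH, ...)` guard used in the diff.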
--- a/admin-lehrer/components/ocr-kombi/StepAnsicht.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepAnsicht.tsx
@@ -0,0 +1,110 @@
+'use client'
+
+/**
+ * StepAnsicht — Excel-like Spreadsheet View.
+ *
+ * Left:  Original scan with OCR word overlay
+ * Right: Fortune Sheet spreadsheet with multi-sheet tabs per zone
+ */
+
+import { useEffect, useRef, useState } from 'react'
+import dynamic from 'next/dynamic'
+
+const SpreadsheetView = dynamic(
+  () => import('./SpreadsheetView').then((m) => m.SpreadsheetView),
+  { ssr: false, loading: () => <div className="py-8 text-center text-sm text-gray-400">Spreadsheet wird geladen...</div> },
+)
+
+const KLAUSUR_API = '/klausur-api'
+
+interface StepAnsichtProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+export function StepAnsicht({ sessionId, onNext }: StepAnsichtProps) {
+  const [gridData, setGridData] = useState<any>(null)
+  const [loading, setLoading] = useState(true)
+  const [error, setError] = useState<string | null>(null)
+  const leftRef = useRef<HTMLDivElement>(null)
+  const [leftHeight, setLeftHeight] = useState(600)
+
+  // Load grid data on mount
+  useEffect(() => {
+    if (!sessionId) { setLoading(false); return }  // nothing to load: do not keep the spinner up
+    ;(async () => {
+      try {
+        const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/grid-editor`)
+        if (!res.ok) throw new Error(`HTTP ${res.status}`)
+        setGridData(await res.json())
+      } catch (e) {
+        setError(e instanceof Error ? e.message : 'Fehler beim Laden')
+      } finally {
+        setLoading(false)
+      }
+    })()
+  }, [sessionId])
+
+  // Track left panel height
+  useEffect(() => {
+    if (!leftRef.current) return
+    const ro = new ResizeObserver(([e]) => setLeftHeight(e.contentRect.height))
+    ro.observe(leftRef.current)
+    return () => ro.disconnect()
+  }, [loading, gridData])  // leftRef is only attached once the split view renders
+
+  if (loading) {
+    return (
+      <div className="flex items-center justify-center py-16">
+        <div className="w-8 h-8 border-4 border-teal-500 border-t-transparent rounded-full animate-spin" />
+        <span className="ml-3 text-gray-500">Lade Spreadsheet...</span>
+      </div>
+    )
+  }
+
+  if (error || !gridData) {
+    return (
+      <div className="p-8 text-center">
+        <p className="text-red-500 mb-4">{error || 'Keine Grid-Daten.'}</p>
+        <button onClick={onNext} className="px-5 py-2 bg-teal-600 text-white rounded-lg">Weiter →</button>
+      </div>
+    )
+  }
+
+  return (
+    <div className="space-y-3">
+      {/* Header */}
+      <div className="flex items-center justify-between">
+        <div>
+          <h3 className="text-lg font-semibold text-gray-900 dark:text-white">Ansicht — Spreadsheet</h3>
+          <p className="text-sm text-gray-500 dark:text-gray-400">
+            Jede Zone als eigenes Sheet-Tab. Spaltenbreiten pro Sheet optimiert.
+          </p>
+        </div>
+        <button onClick={onNext} className="px-5 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 text-sm font-medium">
+          Weiter →
+        </button>
+      </div>
+
+      {/* Split view */}
+      <div className="flex gap-2">
+        {/* LEFT: Original + OCR overlay */}
+        <div ref={leftRef} className="w-1/3 border border-gray-300 dark:border-gray-600 rounded-lg overflow-hidden bg-white dark:bg-gray-900 flex-shrink-0">
+          <div className="px-2 py-1 bg-black/60 text-white text-[10px] font-medium">Original + OCR</div>
+          {sessionId && (
+            <img
+              src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/words-overlay`}
+              alt="Original + OCR"
+              className="w-full h-auto"
+            />
+          )}
+        </div>
+
+        {/* RIGHT: Fortune Sheet — height adapts to content */}
+        <div className="flex-1 border border-gray-300 dark:border-gray-600 rounded-lg overflow-hidden bg-white dark:bg-gray-900">
+          <SpreadsheetView gridData={gridData} height={Math.max(700, leftHeight)} />
+        </div>
+      </div>
+    </div>
+  )
+}
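
Review note on the render branches in `StepAnsicht`: the component picks between a spinner, an error panel and the split view. The sketch below models that decision as a pure function, assuming that a missing `sessionId` should count as "nothing to load" rather than as still loading. `ViewState`, `resolveViewState` and `gridEditorUrl` are hypothetical helpers for illustration; the URL mirrors the template string used in the diff.

```typescript
type ViewState = 'loading' | 'error' | 'ready'

interface ViewInputs {
  sessionId: string | null
  loading: boolean
  error: string | null
  gridData: unknown
}

// Decide which branch to render. Assumption: without a session there is
// nothing to wait for, so the spinner is never shown in that case.
function resolveViewState({ sessionId, loading, error, gridData }: ViewInputs): ViewState {
  if (sessionId && loading) return 'loading'
  if (error || !gridData) return 'error'
  return 'ready'
}

// Same path as in the diff; encoding the id is an extra precaution, not
// something the original code does.
function gridEditorUrl(base: string, sessionId: string): string {
  return `${base}/api/v1/ocr-pipeline/sessions/${encodeURIComponent(sessionId)}/grid-editor`
}
```

Keeping the decision in one place makes it easy to see that every combination of inputs ends in exactly one of the three views.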
--- a/admin-lehrer/components/ocr-kombi/StepBoxGridReview.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepBoxGridReview.tsx
@@ -0,0 +1,283 @@
+'use client'
+
+import { useCallback, useEffect, useRef, useState } from 'react'
+import { useGridEditor } from '@/components/grid-editor/useGridEditor'
+import type { GridZone } from '@/components/grid-editor/types'
+import { GridTable } from '@/components/grid-editor/GridTable'
+
+const KLAUSUR_API = '/klausur-api'
+
+type BoxLayoutType = 'flowing' | 'columnar' | 'bullet_list' | 'header_only'
+
+const LAYOUT_LABELS: Record<BoxLayoutType, string> = {
+  flowing: 'Fließtext',
+  columnar: 'Tabelle/Spalten',
+  bullet_list: 'Aufzählung',
+  header_only: 'Überschrift',
+}
+
+interface StepBoxGridReviewProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+export function StepBoxGridReview({ sessionId, onNext }: StepBoxGridReviewProps) {
+  const {
+    grid,
+    loading,
+    saving,
+    error,
+    dirty,
+    selectedCell,
+    setSelectedCell,
+    loadGrid,
+    saveGrid,
+    updateCellText,
+    toggleColumnBold,
+    toggleRowHeader,
+    undo,
+    redo,
+    canUndo,
+    canRedo,
+    getAdjacentCell,
+    commitUndoPoint,
+    selectedCells,
+    toggleCellSelection,
+    clearCellSelection,
+    toggleSelectedBold,
+    setCellColor,
+    deleteColumn,
+    addColumn,
+    deleteRow,
+    addRow,
+  } = useGridEditor(sessionId)
+
+  const [building, setBuilding] = useState(false)
+  const [buildError, setBuildError] = useState<string | null>(null)
+
+  // Load grid on mount
+  useEffect(() => {
+    if (sessionId) loadGrid()
+  }, [sessionId]) // eslint-disable-line react-hooks/exhaustive-deps
+
+  // Get box zones
+  const boxZones: GridZone[] = (grid?.zones || []).filter(
+    (z: GridZone) => z.zone_type === 'box'
+  )
+
+  // Build box grids via backend
+  const buildBoxGrids = useCallback(async (overrides?: Record<string, string>) => {
+    if (!sessionId) return
+    setBuilding(true)
+    setBuildError(null)
+    try {
+      const res = await fetch(
+        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-box-grids`,
+        {
+          method: 'POST',
+          headers: { 'Content-Type': 'application/json' },
+          body: JSON.stringify({ overrides: overrides || {} }),
+        },
+      )
+      if (!res.ok) {
+        const data = await res.json().catch(() => ({}))
+        throw new Error(data.detail || `HTTP ${res.status}`)
+      }
+      await loadGrid()
+    } catch (e) {
+      setBuildError(e instanceof Error ? e.message : String(e))
+    } finally {
+      setBuilding(false)
+    }
+  }, [sessionId, loadGrid])
+
+  // Handle layout type change for a specific box zone
+  const changeLayoutType = useCallback(async (boxIdx: number, layoutType: string) => {
+    await buildBoxGrids({ [String(boxIdx)]: layoutType })
+  }, [buildBoxGrids])
+
+  // Auto-build once on first load if box zones have no cells
+  const autoBuildDone = useRef(false)
+  useEffect(() => {
+    if (!grid || loading || building || autoBuildDone.current) return
+    const needsBuild = boxZones.some(z => !z.cells || z.cells.length === 0)
+    if (needsBuild && sessionId) {
+      autoBuildDone.current = true
+      buildBoxGrids()
+    }
+  }, [grid, loading]) // eslint-disable-line react-hooks/exhaustive-deps
+
+  if (loading) {
+    return (
+      <div className="flex items-center justify-center py-16">
+        <div className="w-8 h-8 border-4 border-teal-500 border-t-transparent rounded-full animate-spin" />
+        <span className="ml-3 text-gray-500">Lade Grid...</span>
+      </div>
+    )
+  }
+
+  // No boxes after build attempt — skip step
+  if (!building && boxZones.length === 0) {
+    return (
+      <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-8 text-center">
+        <div className="text-4xl mb-3">📦</div>
+        <h3 className="text-lg font-semibold text-gray-900 dark:text-white mb-2">
+          Keine Boxen erkannt
+        </h3>
+        <p className="text-gray-500 dark:text-gray-400 mb-6">
+          Auf dieser Seite wurden keine eingebetteten Boxen (Grammatik-Tipps, Übungen etc.) erkannt.
+        </p>
+        <button
+          onClick={onNext}
+          className="px-6 py-2.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors font-medium"
+        >
+          Weiter →
+        </button>
+      </div>
+    )
+  }
+
+  return (
+    <div className="space-y-4">
+      {/* Header */}
+      <div className="flex items-center justify-between">
+        <div>
+          <h3 className="text-lg font-semibold text-gray-900 dark:text-white">
+            Box-Review ({boxZones.length} {boxZones.length === 1 ? 'Box' : 'Boxen'})
+          </h3>
+          <p className="text-sm text-gray-500 dark:text-gray-400">
+            Eingebettete Boxen prüfen und korrigieren. Layout-Typ kann pro Box angepasst werden.
+          </p>
+        </div>
+        <div className="flex items-center gap-2">
+          {dirty && (
+            <button
+              onClick={saveGrid}
+              disabled={saving}
+              className="px-4 py-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700 transition-colors text-sm font-medium disabled:opacity-50"
+            >
+              {saving ? 'Speichere...' : 'Speichern'}
+            </button>
+          )}
+          <button
+            onClick={() => buildBoxGrids()}
+            disabled={building}
+            className="px-4 py-2 bg-amber-600 text-white rounded-lg hover:bg-amber-700 transition-colors text-sm font-medium disabled:opacity-50"
+          >
+            {building ? 'Verarbeite...' : 'Alle Boxen neu aufbauen'}
+          </button>
+          <button
+            onClick={async () => {
+              if (dirty) await saveGrid()
+              onNext()
+            }}
+            className="px-5 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors text-sm font-medium"
+          >
+            Weiter →
+          </button>
+        </div>
+      </div>
+
+      {/* Errors */}
+      {(error || buildError) && (
+        <div className="p-3 bg-red-50 dark:bg-red-900/30 border border-red-200 dark:border-red-800 rounded-lg text-red-700 dark:text-red-300 text-sm">
+          {error || buildError}
+        </div>
+      )}
+
+      {building && (
+        <div className="flex items-center gap-3 p-4 bg-amber-50 dark:bg-amber-900/20 border border-amber-200 dark:border-amber-800 rounded-lg">
+          <div className="w-5 h-5 border-2 border-amber-500 border-t-transparent rounded-full animate-spin" />
+          <span className="text-amber-700 dark:text-amber-300 text-sm">Box-Grids werden aufgebaut...</span>
+        </div>
+      )}
+
+      {/* Box zones */}
+      {boxZones.map((zone, boxIdx) => {
+        const boxColor = zone.box_bg_hex || '#d97706' // amber fallback
+        const boxColorName = zone.box_bg_color || 'box'
+        return (
+        <div
+          key={zone.zone_index}
+          className="bg-white dark:bg-gray-800 rounded-xl overflow-hidden"
+          style={{ border: `3px solid ${boxColor}` }}
+        >
+          {/* Box header */}
+          <div
+            className="flex items-center justify-between px-4 py-3 border-b"
+            style={{ backgroundColor: `${boxColor}15`, borderColor: `${boxColor}30` }}
+          >
+            <div className="flex items-center gap-3">
+              <div
+                className="w-8 h-8 rounded-lg flex items-center justify-center text-white text-sm font-bold"
+                style={{ backgroundColor: boxColor }}
+              >
+                {boxIdx + 1}
+              </div>
+              <div>
+                <span className="font-medium text-gray-900 dark:text-white">
+                  Box {boxIdx + 1}
+                </span>
+                <span className="text-xs text-gray-500 dark:text-gray-400 ml-2">
+                  {zone.bbox_px?.w}x{zone.bbox_px?.h}px
+                  {zone.cells?.length ? ` | ${zone.cells.length} Zellen` : ''}
+                  {zone.box_layout_type ? ` | ${LAYOUT_LABELS[zone.box_layout_type as BoxLayoutType] || zone.box_layout_type}` : ''}
+                  {boxColorName !== 'box' ? ` | ${boxColorName}` : ''}
+                </span>
+              </div>
+            </div>
+            <div className="flex items-center gap-2">
+              <label className="text-xs text-gray-500 dark:text-gray-400">Layout:</label>
+              <select
+                value={zone.box_layout_type || 'flowing'}
+                onChange={(e) => changeLayoutType(boxIdx, e.target.value)}
+                disabled={building}
+                className="text-xs px-2 py-1 rounded border border-gray-300 dark:border-gray-600 bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-200"
+              >
+                {Object.entries(LAYOUT_LABELS).map(([key, label]) => (
+                  <option key={key} value={key}>{label}</option>
+                ))}
+              </select>
+            </div>
+          </div>
+
+          {/* Box grid table */}
+          <div className="p-3">
+            {zone.cells && zone.cells.length > 0 ? (
+              <GridTable
+                zone={zone}
+                selectedCell={selectedCell}
+                selectedCells={selectedCells}
+                onSelectCell={setSelectedCell}
+                onCellTextChange={updateCellText}
+                onToggleColumnBold={toggleColumnBold}
+                onToggleRowHeader={toggleRowHeader}
+                onNavigate={(cellId, dir) => {
+                  const next = getAdjacentCell(cellId, dir)
+                  if (next) setSelectedCell(next)
+                }}
+                onDeleteColumn={deleteColumn}
+                onAddColumn={addColumn}
+                onDeleteRow={deleteRow}
+                onAddRow={addRow}
+                onToggleCellSelection={toggleCellSelection}
+                onSetCellColor={setCellColor}
+              />
+            ) : (
+              <div className="text-center py-8 text-gray-400">
+                <p className="text-sm">Keine Zellen erkannt.</p>
+                <button
+                  onClick={() => buildBoxGrids({ [String(boxIdx)]: 'flowing' })}
+                  className="mt-2 text-xs text-amber-600 hover:text-amber-700"
+                >
+                  Als Fließtext verarbeiten
+                </button>
+              </div>
+            )}
+          </div>
+        </div>
+        )
+      })}
+    </div>
+  )
+}
--- a/admin-lehrer/components/ocr-kombi/StepContentCrop.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepContentCrop.tsx
@@ -0,0 +1,13 @@
+'use client'
+
+import { StepCrop as BaseStepCrop } from '@/components/ocr-pipeline/StepCrop'
+
+interface StepContentCropProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+/** Thin wrapper around the shared StepCrop component */
+export function StepContentCrop({ sessionId, onNext }: StepContentCropProps) {
+  return <BaseStepCrop key={sessionId} sessionId={sessionId} onNext={onNext} />
+}
--- a/admin-lehrer/components/ocr-kombi/StepDeskew.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepDeskew.tsx
@@ -0,0 +1,13 @@
+'use client'
+
+import { StepDeskew as BaseStepDeskew } from '@/components/ocr-pipeline/StepDeskew'
+
+interface StepDeskewProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+/** Thin wrapper around the shared StepDeskew component */
+export function StepDeskew({ sessionId, onNext }: StepDeskewProps) {
+  return <BaseStepDeskew key={sessionId} sessionId={sessionId} onNext={onNext} />
+}
--- a/admin-lehrer/components/ocr-kombi/StepDewarp.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepDewarp.tsx
@@ -0,0 +1,13 @@
+'use client'
+
+import { StepDewarp as BaseStepDewarp } from '@/components/ocr-pipeline/StepDewarp'
+
+interface StepDewarpProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+/** Thin wrapper around the shared StepDewarp component */
+export function StepDewarp({ sessionId, onNext }: StepDewarpProps) {
+  return <BaseStepDewarp key={sessionId} sessionId={sessionId} onNext={onNext} />
+}
--- a/admin-lehrer/components/ocr-kombi/StepGridBuild.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepGridBuild.tsx
@@ -0,0 +1,117 @@
+'use client'
+
+import { useState, useEffect } from 'react'
+
+const KLAUSUR_API = '/klausur-api'
+
+interface StepGridBuildProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+/**
+ * Step 9: Grid Build.
+ * Reuses an existing grid if one is found; otherwise triggers the build-grid endpoint and shows progress.
+ */
+export function StepGridBuild({ sessionId, onNext }: StepGridBuildProps) {
+  const [building, setBuilding] = useState(false)
+  const [result, setResult] = useState<{ rows: number; cols: number; cells: number } | null>(null)
+  const [error, setError] = useState('')
+  const [autoTriggered, setAutoTriggered] = useState(false)
+
+  useEffect(() => {
+    if (!sessionId || autoTriggered) return
+    // Check if grid already exists
+    checkExistingGrid()
+  // eslint-disable-next-line react-hooks/exhaustive-deps
+  }, [sessionId])
+
+  const checkExistingGrid = async () => {
+    if (!sessionId) return
+    try {
+      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/grid-editor`)
+      if (res.ok) {
+        const data = await res.json()
+        // Use grid-editor summary (accurate zone-based counts)
+        const summary = data.summary
+        if (summary) {
+          setResult({ rows: summary.total_rows || 0, cols: summary.total_columns || 0, cells: summary.total_cells || 0 })
+          return
+        }
+      }
+    } catch { /* no existing grid */ }
+
+    // Auto-trigger build
+    setAutoTriggered(true)
+    buildGrid()
+  }
+
+  const buildGrid = async () => {
+    if (!sessionId) return
+    setBuilding(true)
+    setError('')
+    try {
+      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/build-grid`, {
+        method: 'POST',
+      })
+      if (!res.ok) {
+        const data = await res.json().catch(() => ({}))
+        throw new Error(data.detail || `Grid-Build fehlgeschlagen (${res.status})`)
+      }
+      const data = await res.json()
+      // Use grid-editor summary (zone-based, more accurate than word_result.grid_shape)
+      const summary = data.summary
+      if (summary) {
+        setResult({ rows: summary.total_rows || 0, cols: summary.total_columns || 0, cells: summary.total_cells || 0 })
+      } else {
+        const shape = data.grid_shape || {}
+        setResult({ rows: shape.rows || 0, cols: shape.cols || 0, cells: shape.total_cells || 0 })
+      }
+    } catch (e) {
+      setError(e instanceof Error ? e.message : String(e))
+    } finally {
+      setBuilding(false)
+    }
+  }
+
+  return (
+    <div className="space-y-4">
+      {building && (
+        <div className="flex items-center gap-3 p-6 bg-blue-50 dark:bg-blue-900/20 rounded-xl border border-blue-200 dark:border-blue-800">
+          <div className="animate-spin w-5 h-5 border-2 border-blue-400 border-t-transparent rounded-full" />
+          <span className="text-sm text-blue-600 dark:text-blue-400">Grid wird aufgebaut...</span>
+        </div>
+      )}
+
+      {result && (
+        <div className="space-y-3">
+          <div className="p-4 bg-green-50 dark:bg-green-900/20 rounded-xl border border-green-200 dark:border-green-800">
+            <div className="text-sm font-medium text-green-700 dark:text-green-300">
+              Grid erstellt: {result.rows} Zeilen, {result.cols} Spalten, {result.cells} Zellen
+            </div>
+          </div>
+          <button
+            onClick={onNext}
+            className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700"
+          >
+            Weiter zum Review
+          </button>
+        </div>
+      )}
+
+      {error && (
+        <div className="space-y-3">
+          <div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
+            {error}
+          </div>
+          <button
+            onClick={buildGrid}
+            className="px-4 py-2 bg-orange-600 text-white text-sm rounded-lg hover:bg-orange-700"
+          >
+            Erneut versuchen
+          </button>
+        </div>
+      )}
+    </div>
+  )
+}
--- a/admin-lehrer/components/ocr-kombi/StepGridReview.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepGridReview.tsx
@@ -0,0 +1,15 @@
+'use client'
+
+import { StepGridReview as BaseStepGridReview } from '@/components/ocr-pipeline/StepGridReview'
+import type { MutableRefObject } from 'react'
+
+interface StepGridReviewProps {
+  sessionId: string | null
+  onNext: () => void
+  saveRef: MutableRefObject<(() => Promise<void>) | null>
+}
+
+/** Thin wrapper around the shared StepGridReview component */
+export function StepGridReview({ sessionId, onNext, saveRef }: StepGridReviewProps) {
+  return <BaseStepGridReview sessionId={sessionId} onNext={onNext} saveRef={saveRef} />
+}
--- a/admin-lehrer/components/ocr-kombi/StepGroundTruth.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepGroundTruth.tsx
@@ -0,0 +1,295 @@
+'use client'
+
+import { useCallback, useEffect, useState } from 'react'
+import { useGridEditor } from '@/components/grid-editor/useGridEditor'
+import { GridTable } from '@/components/grid-editor/GridTable'
+import { ImageLayoutEditor } from '@/components/grid-editor/ImageLayoutEditor'
+import type { GridZone } from '@/components/grid-editor/types'
+
+const KLAUSUR_API = '/klausur-api'
+
+interface StepGroundTruthProps {
+  sessionId: string | null
+  isGroundTruth: boolean
+  onMarked: () => void
+  gridSaveRef: React.MutableRefObject<(() => Promise<void>) | null>
+}
+
+/**
+ * Step 12: Ground Truth marking.
+ *
+ * Shows the full Grid-Review view (original image + table) so the user
+ * can verify the final result before marking as Ground Truth reference.
+ */
+export function StepGroundTruth({ sessionId, isGroundTruth, onMarked, gridSaveRef }: StepGroundTruthProps) {
+  const {
+    grid,
+    loading,
+    saving,
+    error,
+    dirty,
+    selectedCell,
+    selectedCells,
+    setSelectedCell,
+    loadGrid,
+    saveGrid,
+    updateCellText,
+    toggleColumnBold,
+    toggleRowHeader,
+    undo,
+    redo,
+    canUndo,
+    canRedo,
+    getAdjacentCell,
+    deleteColumn,
+    addColumn,
+    deleteRow,
+    addRow,
+    toggleCellSelection,
+    clearCellSelection,
+    toggleSelectedBold,
+    setCellColor,
+  } = useGridEditor(sessionId)
+
+  const [showImage, setShowImage] = useState(true)
+  const [zoom, setZoom] = useState(100)
+  const [markSaving, setMarkSaving] = useState(false)
+  const [message, setMessage] = useState('')
+
+  // Expose save function via ref
+  useEffect(() => {
+    if (gridSaveRef) {
+      gridSaveRef.current = async () => {
+        if (dirty) await saveGrid()
+      }
+      return () => { gridSaveRef.current = null }
+    }
+  }, [gridSaveRef, dirty, saveGrid])
+
+  // Load grid on mount
+  useEffect(() => {
+    if (sessionId) loadGrid()
+  }, [sessionId, loadGrid])
+
+  // Keyboard shortcuts
+  useEffect(() => {
+    const handler = (e: KeyboardEvent) => {
+      if ((e.metaKey || e.ctrlKey) && e.key.toLowerCase() === 'z' && !e.shiftKey) {
+        e.preventDefault(); undo()
+      } else if ((e.metaKey || e.ctrlKey) && e.key.toLowerCase() === 'z' && e.shiftKey) {
+        e.preventDefault(); redo()
+      } else if ((e.metaKey || e.ctrlKey) && e.key === 's') {
+        e.preventDefault(); saveGrid()
+      } else if ((e.metaKey || e.ctrlKey) && e.key === 'b') {
+        e.preventDefault()
+        if (selectedCells.size > 0) toggleSelectedBold()
+      } else if (e.key === 'Escape') {
+        clearCellSelection()
+      }
+    }
+    window.addEventListener('keydown', handler)
+    return () => window.removeEventListener('keydown', handler)
+  }, [undo, redo, saveGrid, selectedCells, toggleSelectedBold, clearCellSelection])
+
+  const handleNavigate = useCallback(
+    (cellId: string, direction: 'up' | 'down' | 'left' | 'right') => {
+      const target = getAdjacentCell(cellId, direction)
+      if (target) {
+        setSelectedCell(target)
+        setTimeout(() => {
+          const el = document.getElementById(`cell-${target}`)
+          if (el) {
+            el.focus()
+            if (el instanceof HTMLInputElement) el.select()
+          }
+        }, 0)
+      }
+    },
+    [getAdjacentCell, setSelectedCell],
+  )
+
+  const handleMark = async () => {
+    if (!sessionId) return
+    setMarkSaving(true)
+    setMessage('')
+    try {
+      if (dirty) await saveGrid()
+      const res = await fetch(
+        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/mark-ground-truth?pipeline=kombi`,
+        { method: 'POST' },
+      )
+      if (!res.ok) {
+        const body = await res.text().catch(() => '')
+        throw new Error(`Ground Truth fehlgeschlagen (${res.status}): ${body}`)
+      }
+      const data = await res.json()
+      setMessage(`Ground Truth gespeichert (${data.cells_saved} Zellen)`)
+      onMarked()
+    } catch (e) {
+      setMessage(e instanceof Error ? e.message : String(e))
+    } finally {
+      setMarkSaving(false)
+    }
+  }
+
+  if (!sessionId) {
+    return <div className="text-center py-12 text-gray-400">Keine Session ausgewaehlt.</div>
+  }
+
+  if (loading) {
+    return (
+      <div className="flex items-center justify-center py-16">
+        <div className="flex items-center gap-3 text-gray-500 dark:text-gray-400">
+          <svg className="w-5 h-5 animate-spin" fill="none" viewBox="0 0 24 24">
+            <circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
+            <path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
+          </svg>
+          Grid wird geladen...
+        </div>
+      </div>
+    )
+  }
+
+  if (error) {
+    return (
+      <div className="bg-red-50 dark:bg-red-900/20 border border-red-200 dark:border-red-800 rounded-lg p-4">
+        <p className="text-sm text-red-700 dark:text-red-300">Fehler: {error}</p>
+      </div>
+    )
+  }
+
+  if (!grid || !grid.zones.length) {
+    return <div className="text-center py-12 text-gray-400">Kein Grid vorhanden.</div>
+  }
+
+  const imageUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/cropped`
+
+  return (
+    <div className="space-y-3">
+      {/* GT Header Bar */}
+      <div className="flex items-center justify-between p-3 bg-amber-50 dark:bg-amber-900/10 rounded-xl border border-amber-200 dark:border-amber-800">
+        <div>
+          <h3 className="text-sm font-medium text-amber-700 dark:text-amber-300">
+            Ground Truth
+            {isGroundTruth && <span className="ml-2 text-xs font-normal text-amber-500">(bereits markiert)</span>}
+          </h3>
+          <p className="text-xs text-amber-600 dark:text-amber-400 mt-0.5">
+            Pruefen Sie das Ergebnis und markieren Sie es als Referenz fuer Regressionstests.
+          </p>
+        </div>
+        <div className="flex items-center gap-2">
+          {dirty && (
+            <button
+              onClick={saveGrid}
+              disabled={saving}
+              className="px-3 py-1.5 text-xs bg-teal-600 text-white rounded-lg hover:bg-teal-700 disabled:opacity-50"
+            >
+              {saving ? 'Speichere...' : 'Speichern'}
+            </button>
+          )}
+          <button
+            onClick={handleMark}
+            disabled={markSaving}
+            className="px-4 py-1.5 text-xs bg-amber-600 text-white rounded-lg hover:bg-amber-700 disabled:opacity-50"
+          >
+            {markSaving ? 'Speichere...' : isGroundTruth ? 'GT aktualisieren' : 'Als Ground Truth markieren'}
+          </button>
+        </div>
+      </div>
+
+      {message && (
+        <div className={`text-sm p-2 rounded ${message.includes('fehlgeschlagen') ? 'text-red-500 bg-red-50 dark:bg-red-900/20' : 'text-amber-600 dark:text-amber-400 bg-amber-50 dark:bg-amber-900/10'}`}>
+          {message}
+        </div>
+      )}
+
+      {/* Stats */}
+      <div className="flex items-center gap-4 text-xs flex-wrap">
+        <span className="text-gray-500 dark:text-gray-400">
+          {grid.summary.total_zones} Zone(n), {grid.summary.total_columns} Spalten,{' '}
+          {grid.summary.total_rows} Zeilen, {grid.summary.total_cells} Zellen
+        </span>
+        <button
+          onClick={() => setShowImage(!showImage)}
+          className={`px-2.5 py-1 rounded text-xs border transition-colors ${
+            showImage
+              ? 'bg-teal-50 dark:bg-teal-900/30 border-teal-200 dark:border-teal-700 text-teal-700 dark:text-teal-300'
+              : 'bg-gray-50 dark:bg-gray-800 border-gray-200 dark:border-gray-700 text-gray-500 dark:text-gray-400'
+          }`}
+        >
+          {showImage ? 'Bild ausblenden' : 'Bild einblenden'}
+        </button>
+      </div>
+
+      {/* Split View: Image left + Grid right */}
+      <div className={showImage ? 'grid grid-cols-2 gap-3' : ''} style={{ minHeight: '55vh' }}>
+        {showImage && (
+          <ImageLayoutEditor
+            imageUrl={imageUrl}
+            zones={grid.zones}
+            imageWidth={grid.image_width}
+            layoutDividers={grid.layout_dividers}
+            zoom={zoom}
+            onZoomChange={setZoom}
+            onColumnDividerMove={() => {}}
+            onHorizontalsChange={() => {}}
+            onCommitUndo={() => {}}
+            onSplitColumnAt={() => {}}
+            onDeleteColumn={() => {}}
+          />
+        )}
+
+        <div className="space-y-3">
+          {(() => {
+            const groups: GridZone[][] = []
+            for (const zone of grid.zones) {
+              const prev = groups[groups.length - 1]
+              if (prev && zone.vsplit_group != null && prev[0].vsplit_group === zone.vsplit_group) {
+                prev.push(zone)
+              } else {
+                groups.push([zone])
+              }
+            }
+            return groups.map((group) => (
+              <div key={group[0].vsplit_group != null ? `vsplit-${group[0].vsplit_group}` : `zone-${group[0].zone_index}`}>
+                <div className={`${group.length > 1 ? 'flex gap-2' : ''}`}>
+                  {group.map((zone) => (
+                    <div
+                      key={zone.zone_index}
+                      className={`${group.length > 1 ? 'flex-1 min-w-0' : ''} bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700`}
+                    >
+                      <GridTable
+                        zone={zone}
+                        layoutMetrics={grid.layout_metrics}
+                        selectedCell={selectedCell}
+                        selectedCells={selectedCells}
+                        onSelectCell={setSelectedCell}
+                        onToggleCellSelection={toggleCellSelection}
+                        onCellTextChange={updateCellText}
+                        onToggleColumnBold={toggleColumnBold}
+                        onToggleRowHeader={toggleRowHeader}
+                        onNavigate={handleNavigate}
+                        onDeleteColumn={deleteColumn}
+                        onAddColumn={addColumn}
+                        onDeleteRow={deleteRow}
+                        onAddRow={addRow}
+                        onSetCellColor={setCellColor}
+                      />
+                    </div>
+                  ))}
+                </div>
+              </div>
+            ))
+          })()}
+        </div>
+      </div>
+
+      {/* Keyboard tips */}
+      <div className="text-[11px] text-gray-400 dark:text-gray-500 flex items-center gap-4">
+        <span>Tab: naechste Zelle</span>
+        <span>Ctrl+Z / Ctrl+Shift+Z: Undo/Redo</span>
+        <span>Ctrl+S: Speichern</span>
+      </div>
+    </div>
+  )
+}
--- a/admin-lehrer/components/ocr-kombi/StepGutterRepair.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepGutterRepair.tsx
@@ -0,0 +1,422 @@
+'use client'
+
+import { useState, useEffect, useCallback } from 'react'
+
+const KLAUSUR_API = '/klausur-api'
+
+interface GutterSuggestion {
+  id: string
+  type: 'hyphen_join' | 'spell_fix'
+  zone_index: number
+  row_index: number
+  col_index: number
+  col_type: string
+  cell_id: string
+  original_text: string
+  suggested_text: string
+  next_row_index: number
+  next_row_cell_id: string
+  next_row_text: string
+  missing_chars: string
+  display_parts: string[]
+  alternatives: string[]
+  confidence: number
+  reason: string
+}
+
+interface GutterRepairResult {
+  suggestions: GutterSuggestion[]
+  stats: {
+    words_checked: number
+    gutter_candidates: number
+    suggestions_found: number
+    error?: string
+  }
+  duration_seconds: number
+}
+
+interface StepGutterRepairProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+/**
+ * Step 11: Gutter Repair (Wortkorrektur).
+ * Detects words truncated at the book gutter and proposes corrections.
+ * User can accept/reject each suggestion individually or in batch.
+ */
+export function StepGutterRepair({ sessionId, onNext }: StepGutterRepairProps) {
+  const [loading, setLoading] = useState(false)
+  const [applying, setApplying] = useState(false)
+  const [result, setResult] = useState<GutterRepairResult | null>(null)
+  const [accepted, setAccepted] = useState<Set<string>>(new Set())
+  const [rejected, setRejected] = useState<Set<string>>(new Set())
+  const [selectedText, setSelectedText] = useState<Record<string, string>>({})
+  const [applied, setApplied] = useState(false)
+  const [error, setError] = useState('')
+  const [applyMessage, setApplyMessage] = useState('')
+
+  const analyse = useCallback(async () => {
+    if (!sessionId) return
+    setLoading(true)
+    setError('')
+    setApplied(false)
+    setApplyMessage('')
+    try {
+      const res = await fetch(
+        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/gutter-repair`,
+        { method: 'POST' },
+      )
+      if (!res.ok) {
+        const body = await res.json().catch(() => ({}))
+        throw new Error(body.detail || `Analyse fehlgeschlagen (${res.status})`)
+      }
+      const data: GutterRepairResult = await res.json()
+      setResult(data)
+      // Auto-accept all suggestions with high confidence
+      const autoAccept = new Set<string>()
+      for (const s of data.suggestions) {
+        if (s.confidence >= 0.85) {
+          autoAccept.add(s.id)
+        }
+      }
+      setAccepted(autoAccept)
+      setRejected(new Set())
+    } catch (e) {
+      setError(e instanceof Error ? e.message : String(e))
+    } finally {
+      setLoading(false)
+    }
+  }, [sessionId])
+
+  // Auto-trigger analysis on mount
+  useEffect(() => {
+    if (sessionId) analyse()
+  }, [sessionId, analyse])
+
+  const toggleSuggestion = (id: string) => {
+    setAccepted(prev => {
+      const next = new Set(prev)
+      if (next.has(id)) {
+        next.delete(id)
+        setRejected(r => new Set(r).add(id))
+      } else {
+        next.add(id)
+        setRejected(r => { const n = new Set(r); n.delete(id); return n })
+      }
+      return next
+    })
+  }
+
+  const acceptAll = () => {
+    if (!result) return
+    setAccepted(new Set(result.suggestions.map(s => s.id)))
+    setRejected(new Set())
+  }
+
+  const rejectAll = () => {
+    if (!result) return
+    setRejected(new Set(result.suggestions.map(s => s.id)))
+    setAccepted(new Set())
+  }
+
+  const applyAccepted = async () => {
+    if (!sessionId || accepted.size === 0) return
+    setApplying(true)
+    setApplyMessage('')
+    try {
+      const res = await fetch(
+        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/gutter-repair/apply`,
+        {
+          method: 'POST',
+          headers: { 'Content-Type': 'application/json' },
+          body: JSON.stringify({
+            accepted: Array.from(accepted),
+            text_overrides: selectedText,
+          }),
+        },
+      )
+      if (!res.ok) {
+        const body = await res.json().catch(() => ({}))
+        throw new Error(body.detail || `Anwenden fehlgeschlagen (${res.status})`)
+      }
+      const data = await res.json()
+      setApplied(true)
+      setApplyMessage(`${data.applied_count} Korrektur(en) angewendet.`)
+    } catch (e) {
+      setApplyMessage(e instanceof Error ? e.message : String(e))
+    } finally {
+      setApplying(false)
+    }
+  }
+
+  const suggestions = result?.suggestions || []
+  const hasSuggestions = suggestions.length > 0
+
+  return (
+    <div className="space-y-4">
+      {/* Header */}
+      <div className="flex items-center justify-between">
+        <div>
+          <h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
+            Wortkorrektur (Buchfalz)
+          </h3>
+          <p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
+            Erkennt abgeschnittene oder unscharfe Woerter am Buchfalz und Bindestrich-Trennungen ueber Zeilen hinweg.
+          </p>
+        </div>
+        {result && !loading && (
+          <button
+            onClick={analyse}
+            className="px-3 py-1.5 text-xs bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 rounded-lg hover:bg-gray-200 dark:hover:bg-gray-600"
+          >
+            Erneut analysieren
+          </button>
+        )}
+      </div>
+
+      {/* Loading */}
+      {loading && (
+        <div className="flex items-center gap-3 p-6 bg-blue-50 dark:bg-blue-900/20 rounded-xl border border-blue-200 dark:border-blue-800">
+          <div className="animate-spin w-5 h-5 border-2 border-blue-400 border-t-transparent rounded-full" />
+          <span className="text-sm text-blue-600 dark:text-blue-400">Analysiere Woerter am Buchfalz...</span>
+        </div>
+      )}
+
+      {/* Error */}
+      {error && (
+        <div className="space-y-3">
+          <div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
+            {error}
+          </div>
+          <button
+            onClick={analyse}
+            className="px-4 py-2 bg-orange-600 text-white text-sm rounded-lg hover:bg-orange-700"
+          >
+            Erneut versuchen
+          </button>
+        </div>
+      )}
+
+      {/* No suggestions */}
+      {result && !hasSuggestions && !loading && (
+        <div className="p-4 bg-green-50 dark:bg-green-900/20 rounded-xl border border-green-200 dark:border-green-800">
+          <div className="text-sm font-medium text-green-700 dark:text-green-300">
+            Keine Buchfalz-Fehler erkannt.
+          </div>
+          <div className="text-xs text-green-600 dark:text-green-400 mt-1">
+            {result.stats.words_checked} Woerter geprueft, {result.stats.gutter_candidates} Kandidaten am Rand analysiert.
+          </div>
+        </div>
+      )}
+
+      {/* Suggestions list */}
+      {hasSuggestions && !loading && (
+        <>
+          {/* Stats bar */}
+          <div className="flex items-center justify-between p-3 bg-gray-50 dark:bg-gray-800 rounded-lg">
+            <div className="text-xs text-gray-500 dark:text-gray-400">
+              {suggestions.length} Vorschlag/Vorschlaege &middot;{' '}
+              {result!.stats.words_checked} Woerter geprueft &middot;{' '}
+              {result!.duration_seconds}s
+            </div>
+            <div className="flex gap-2">
+              <button
+                onClick={acceptAll}
+                disabled={applied}
+                className="px-2 py-1 text-xs bg-green-100 dark:bg-green-900/30 text-green-700 dark:text-green-300 rounded hover:bg-green-200 dark:hover:bg-green-900/50 disabled:opacity-50"
+              >
+                Alle akzeptieren
+              </button>
+              <button
+                onClick={rejectAll}
+                disabled={applied}
+                className="px-2 py-1 text-xs bg-red-100 dark:bg-red-900/30 text-red-700 dark:text-red-300 rounded hover:bg-red-200 dark:hover:bg-red-900/50 disabled:opacity-50"
+              >
+                Alle ablehnen
+              </button>
+            </div>
+          </div>
+
+          {/* Suggestion cards */}
+          <div className="space-y-2">
+            {suggestions.map((s) => {
+              const isAccepted = accepted.has(s.id)
+              const isRejected = rejected.has(s.id)
+
+              return (
+                <div
+                  key={s.id}
+                  className={`p-3 rounded-lg border transition-colors ${
+                    applied
+                      ? isAccepted
+                        ? 'bg-green-50 dark:bg-green-900/10 border-green-200 dark:border-green-800'
+                        : 'bg-gray-50 dark:bg-gray-800/50 border-gray-200 dark:border-gray-700 opacity-60'
+                      : isAccepted
+                        ? 'bg-green-50 dark:bg-green-900/10 border-green-300 dark:border-green-700'
+                        : isRejected
+                          ? 'bg-red-50 dark:bg-red-900/10 border-red-200 dark:border-red-800 opacity-60'
+                          : 'bg-white dark:bg-gray-800 border-gray-200 dark:border-gray-700'
+                  }`}
+                >
+                  <div className="flex items-start justify-between gap-3">
+                    {/* Left: suggestion details */}
+                    <div className="flex-1 min-w-0">
+                      {/* Type badge */}
+                      <div className="flex items-center gap-2 mb-1.5">
+                        <span className={`inline-flex px-1.5 py-0.5 text-[10px] font-medium rounded ${
+                          s.type === 'hyphen_join'
+                            ? 'bg-purple-100 dark:bg-purple-900/30 text-purple-700 dark:text-purple-300'
+                            : 'bg-orange-100 dark:bg-orange-900/30 text-orange-700 dark:text-orange-300'
+                        }`}>
+                          {s.type === 'hyphen_join' ? 'Zeilenumbruch' : 'Buchfalz-Korrektur'}
+                        </span>
+                        <span className="text-[10px] text-gray-400">
+                          Zeile {s.row_index + 1}, Spalte {s.col_index + 1}
+                          {s.col_type && ` (${s.col_type.replace('column_', '')})`}
+                        </span>
+                        <span className={`text-[10px] ${
+                          s.confidence >= 0.9 ? 'text-green-500' :
+                          s.confidence >= 0.7 ? 'text-yellow-500' : 'text-red-500'
+                        }`}>
+                          {Math.round(s.confidence * 100)}%
+                        </span>
+                      </div>
+
+                      {/* Correction display */}
+                      {s.type === 'hyphen_join' ? (
+                        <div className="space-y-1">
+                          <div className="flex items-center gap-2 text-sm">
+                            <span className="font-mono text-red-600 dark:text-red-400 line-through">
+                              {s.original_text}
+                            </span>
+                            <span className="text-gray-400 text-xs">Z.{s.row_index + 1}</span>
+                            <span className="text-gray-300 dark:text-gray-600">+</span>
+                            <span className="font-mono text-red-600 dark:text-red-400 line-through">
+                              {s.next_row_text.split(' ')[0]}
+                            </span>
+                            <span className="text-gray-400 text-xs">Z.{s.next_row_index + 1}</span>
+                            <span className="text-gray-400">&rarr;</span>
+                            <span className="font-mono text-green-600 dark:text-green-400 font-semibold">
+                              {s.suggested_text}
+                            </span>
+                          </div>
+                          {s.missing_chars && (
+                            <div className="text-[10px] text-gray-400">
+                              Fehlende Zeichen: <span className="font-mono font-semibold">{s.missing_chars}</span>
+                              {' '}&middot; Darstellung: <span className="font-mono">{s.display_parts.join(' | ')}</span>
+                            </div>
+                          )}
+                        </div>
+                      ) : (
+                        <div className="space-y-1">
+                          <div className="flex items-center gap-2 text-sm">
+                            <span className="font-mono text-red-600 dark:text-red-400 line-through">
+                              {s.original_text}
+                            </span>
+                            <span className="text-gray-400">&rarr;</span>
+                            <span className="font-mono text-green-600 dark:text-green-400 font-semibold">
+                              {selectedText[s.id] || s.suggested_text}
+                            </span>
+                          </div>
+                          {/* Alternatives: show other candidates the user can pick */}
+                          {s.alternatives && s.alternatives.length > 0 && !applied && (
+                            <div className="flex items-center gap-1.5 flex-wrap">
+                              <span className="text-[10px] text-gray-400">Alternativen:</span>
+                              {[s.suggested_text, ...s.alternatives].map((alt) => {
+                                const isSelected = (selectedText[s.id] || s.suggested_text) === alt
+                                return (
+                                  <button
+                                    key={alt}
+                                    onClick={() => setSelectedText(prev => ({ ...prev, [s.id]: alt }))}
+                                    className={`px-1.5 py-0.5 text-[11px] font-mono rounded transition-colors ${
+                                      isSelected
+                                        ? 'bg-green-200 dark:bg-green-800 text-green-800 dark:text-green-200 font-semibold'
+                                        : 'bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 hover:bg-gray-200 dark:hover:bg-gray-600'
+                                    }`}
+                                  >
+                                    {alt}
+                                  </button>
+                                )
+                              })}
+                            </div>
+                          )}
+                        </div>
+                      )}
+                    </div>
+
+                    {/* Right: accept/reject toggle */}
+                    {!applied && (
+                      <button
+                        onClick={() => toggleSuggestion(s.id)}
+                        className={`flex-shrink-0 w-8 h-8 rounded-full flex items-center justify-center text-sm transition-colors ${
+                          isAccepted
+                            ? 'bg-green-500 text-white hover:bg-green-600'
+                            : isRejected
+                              ? 'bg-red-400 text-white hover:bg-red-500'
+                              : 'bg-gray-200 dark:bg-gray-600 text-gray-500 dark:text-gray-300 hover:bg-gray-300 dark:hover:bg-gray-500'
+                        }`}
+                        title={isAccepted ? 'Akzeptiert (klicken zum Ablehnen)' : isRejected ? 'Abgelehnt (klicken zum Akzeptieren)' : 'Klicken zum Akzeptieren'}
+                      >
+                        {isAccepted ? '\u2713' : isRejected ? '\u2717' : '?'}
+                      </button>
+                    )}
+                  </div>
+                </div>
+              )
+            })}
+          </div>
+
+          {/* Apply / Next buttons */}
+          <div className="flex items-center gap-3 pt-2">
+            {!applied ? (
+              <button
+                onClick={applyAccepted}
+                disabled={applying || accepted.size === 0}
+                className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700 disabled:opacity-50"
+              >
+                {applying ? 'Wird angewendet...' : `${accepted.size} Korrektur(en) anwenden`}
+              </button>
+            ) : (
+              <button
+                onClick={onNext}
+                className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700"
+              >
+                Weiter zu Ground Truth
+              </button>
+            )}
+            {!applied && (
+              <button
+                onClick={onNext}
+                className="px-4 py-2 text-sm text-gray-500 dark:text-gray-400 hover:text-gray-700 dark:hover:text-gray-200"
+              >
+                Ueberspringen
+              </button>
+            )}
+          </div>
+
+          {/* Apply result message */}
+          {applyMessage && (
+            <div className={`text-sm p-2 rounded ${
+              applyMessage.includes('fehlgeschlagen')
+                ? 'text-red-500 bg-red-50 dark:bg-red-900/20'
+                : 'text-green-600 dark:text-green-400 bg-green-50 dark:bg-green-900/20'
+            }`}>
+              {applyMessage}
+            </div>
+          )}
+        </>
+      )}
+
+      {/* Skip button when no suggestions */}
+      {result && !hasSuggestions && !loading && (
+        <button
+          onClick={onNext}
+          className="px-4 py-2 bg-teal-600 text-white text-sm rounded-lg hover:bg-teal-700"
+        >
+          Weiter zu Ground Truth
+        </button>
+      )}
+    </div>
+  )
+}
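Review note on the alternatives picker in this file: the buttons are keyed by `alt` and the list is built from `[s.suggested_text, ...s.alternatives]`. If the backend ever repeats the suggested text inside `alternatives` (not visible in this diff, so this is an assumption), React would receive duplicate keys. A minimal sketch of de-duplicating while keeping the suggestion first; the helper name `uniqueCandidates` is hypothetical and not part of the codebase:

```typescript
// Hypothetical helper, not part of this PR: builds the candidate list for the
// alternatives picker without duplicates, keeping the suggested text first.
function uniqueCandidates(suggested: string, alternatives: string[] = []): string[] {
  const seen = new Set<string>()
  const out: string[] = []
  for (const candidate of [suggested, ...alternatives]) {
    if (seen.has(candidate)) continue // skip anything already listed
    seen.add(candidate)
    out.push(candidate)
  }
  return out
}
```

With such a helper the picker could map over `uniqueCandidates(s.suggested_text, s.alternatives)` and keep `key={alt}` as it is.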
--- a/admin-lehrer/components/ocr-kombi/StepOcr.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepOcr.tsx
@@ -0,0 +1,30 @@
+'use client'
+
+import { PaddleDirectStep } from '@/components/ocr-overlay/PaddleDirectStep'
+
+interface StepOcrProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+/**
+ * Step 7: OCR (Kombi mode = PaddleOCR + Tesseract).
+ *
+ * Phase 1: Uses the existing PaddleDirectStep with kombi endpoint.
+ * Phase 3 (later) will add transparent 3-phase progress + engine comparison.
+ */
+export function StepOcr({ sessionId, onNext }: StepOcrProps) {
+  return (
+    <PaddleDirectStep
+      sessionId={sessionId}
+      onNext={onNext}
+      endpoint="paddle-kombi"
+      title="Kombi-Modus"
+      description="PP-OCRv5 und Tesseract laufen parallel. Koordinaten werden gewichtet gemittelt fuer optimale Positionierung."
+      icon="🔀"
+      buttonLabel="PP-OCRv5 + Tesseract starten"
+      runningLabel="PP-OCRv5 + Tesseract laufen..."
+      engineKey="kombi"
+    />
+  )
+}
--- a/admin-lehrer/components/ocr-kombi/StepOrientation.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepOrientation.tsx
@@ -0,0 +1,21 @@
+'use client'
+
+import { StepOrientation as BaseStepOrientation } from '@/components/ocr-pipeline/StepOrientation'
+
+interface StepOrientationProps {
+  sessionId: string | null
+  onNext: () => void
+  onSessionList: () => void
+}
+
+/** Thin wrapper — adapts the shared StepOrientation to the Kombi pipeline's simpler onNext() */
+export function StepOrientation({ sessionId, onNext, onSessionList }: StepOrientationProps) {
+  return (
+    <BaseStepOrientation
+      key={sessionId}
+      sessionId={sessionId}
+      onNext={() => onNext()}
+      onSessionList={onSessionList}
+    />
+  )
+}
--- a/admin-lehrer/components/ocr-kombi/StepPageSplit.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepPageSplit.tsx
@@ -0,0 +1,198 @@
+'use client'
+
+import { useState, useEffect, useRef } from 'react'
+const KLAUSUR_API = '/klausur-api'
+
+interface PageSplitResult {
+  multi_page: boolean
+  page_count?: number
+  page_splits?: { x: number; y: number; width: number; height: number; page_index: number }[]
+  sub_sessions?: { id: string; name: string; page_index: number }[]
+  used_original?: boolean
+  duration_seconds?: number
+}
+
+interface StepPageSplitProps {
+  sessionId: string | null
+  sessionName: string
+  onNext: () => void
+  onSplitComplete: (firstChildId: string, firstChildName: string) => void
+}
+
+export function StepPageSplit({ sessionId, sessionName, onNext, onSplitComplete }: StepPageSplitProps) {
+  const [detecting, setDetecting] = useState(false)
+  const [splitResult, setSplitResult] = useState<PageSplitResult | null>(null)
+  const [error, setError] = useState('')
+  const didDetect = useRef(false)
+
+  // Auto-detect page split when step opens
+  useEffect(() => {
+    if (!sessionId || didDetect.current) return
+    didDetect.current = true
+    detectPageSplit()
+  // eslint-disable-next-line react-hooks/exhaustive-deps
+  }, [sessionId])
+
+  const detectPageSplit = async () => {
+    if (!sessionId) return
+    setDetecting(true)
+    setError('')
+    try {
+      // First check if this session was already split (status='split')
+      const sessionRes = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}`)
+      if (sessionRes.ok) {
+        const sessionData = await sessionRes.json()
+        if (sessionData.status === 'split' && sessionData.crop_result?.multi_page) {
+          // Already split — find the child sessions in the session list
+          const listRes = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
+          if (listRes.ok) {
+            const listData = await listRes.json()
+            // Child sessions have names like "ParentName — S. N" (see the rename further down)
+            const baseName = sessionName || sessionData.name || ''
+            const children = (listData.sessions || [])
+              .filter((s: { name?: string }) => s.name?.startsWith(baseName + ' — '))
+              .sort((a: { name: string }, b: { name: string }) => a.name.localeCompare(b.name, undefined, { numeric: true }))
+            if (children.length > 0) {
+              setSplitResult({
+                multi_page: true,
+                page_count: children.length,
+                sub_sessions: children.map((s: { id: string; name: string }, i: number) => ({
+                  id: s.id, name: s.name, page_index: i,
+                })),
+              })
+              onSplitComplete(children[0].id, children[0].name)
+              setDetecting(false)
+              return
+            }
+          }
+        }
+      }
+
+      // Run page-split detection
+      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/page-split`, {
+        method: 'POST',
+      })
+      if (!res.ok) {
+        const data = await res.json().catch(() => ({}))
+        throw new Error(data.detail || 'Seitentrennung fehlgeschlagen')
+      }
+      const data: PageSplitResult = await res.json()
+      setSplitResult(data)
+
+      if (data.multi_page && data.sub_sessions?.length) {
+        // Rename sub-sessions to "Title — S. 1", "Title — S. 2"
+        const baseName = sessionName || 'Dokument'
+        for (let i = 0; i < data.sub_sessions.length; i++) {
+          const sub = data.sub_sessions[i]
+          const newName = `${baseName} — S. ${i + 1}`
+          await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sub.id}`, {
+            method: 'PUT',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify({ name: newName }),
+          }).catch(() => {})
+          sub.name = newName
+        }
+
+        // Signal parent to switch to the first child session
+        onSplitComplete(data.sub_sessions[0].id, data.sub_sessions[0].name)
+      }
+    } catch (e) {
+      setError(e instanceof Error ? e.message : String(e))
+    } finally {
+      setDetecting(false)
+    }
+  }
+
+  if (!sessionId) return null
+
+  const imageUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/oriented`
+
+  return (
+    <div className="space-y-4">
+      {/* Image */}
+      <div className="relative rounded-lg overflow-hidden bg-gray-100 dark:bg-gray-700">
+        {/* eslint-disable-next-line @next/next/no-img-element */}
+        <img
+          src={imageUrl}
+          alt="Orientiertes Bild"
+          className="w-full object-contain max-h-[500px]"
+          onError={(e) => {
+            // Fallback to non-oriented image
+            (e.target as HTMLImageElement).src =
+              `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image`
+          }}
+        />
+      </div>
+
+      {/* Detection status */}
+      {detecting && (
+        <div className="flex items-center gap-2 text-teal-600 dark:text-teal-400 text-sm">
+          <div className="animate-spin w-4 h-4 border-2 border-teal-500 border-t-transparent rounded-full" />
+          Doppelseiten-Erkennung laeuft...
+        </div>
+      )}
+
+      {/* Detection result */}
+      {splitResult && !detecting && (
+        splitResult.multi_page ? (
+          <div className="bg-blue-50 dark:bg-blue-900/20 rounded-lg border border-blue-200 dark:border-blue-700 p-4 space-y-2">
+            <div className="text-sm font-medium text-blue-700 dark:text-blue-300">
+              Doppelseite erkannt — {splitResult.page_count} Seiten getrennt
+            </div>
+            <p className="text-xs text-blue-600 dark:text-blue-400">
+              Jede Seite wird als eigene Session weiterverarbeitet (eigene Begradigung, Entzerrung, etc.).
+              {splitResult.used_original && ' Trennung auf Originalbild, da Orientierung die Doppelseite gedreht hat.'}
+            </p>
+            <div className="flex gap-2 mt-2">
+              {splitResult.sub_sessions?.map(s => (
+                <span
+                  key={s.id}
+                  className="text-xs px-2.5 py-1 rounded-md bg-blue-100 dark:bg-blue-800/40 text-blue-700 dark:text-blue-300 font-medium"
+                >
+                  {s.name}
+                </span>
+              ))}
+            </div>
+            {splitResult.duration_seconds != null && (
+              <div className="text-xs text-gray-400">{splitResult.duration_seconds.toFixed(1)}s</div>
+            )}
+          </div>
+        ) : (
+          <div className="bg-green-50 dark:bg-green-900/20 rounded-lg border border-green-200 dark:border-green-800 p-4">
+            <div className="flex items-center gap-2 text-sm font-medium text-green-700 dark:text-green-300">
+              <span>&#10003;</span> Einzelseite — keine Trennung noetig
+            </div>
+            {splitResult.duration_seconds != null && (
+              <div className="text-xs text-gray-400 mt-1">{splitResult.duration_seconds.toFixed(1)}s</div>
+            )}
+          </div>
+        )
+      )}
+
+      {/* Error */}
+      {error && (
+        <div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
+          {error}
+          <button
+            onClick={() => { didDetect.current = false; detectPageSplit() }}
+            className="ml-2 text-teal-600 hover:underline"
+          >
+            Erneut versuchen
+          </button>
+        </div>
+      )}
+
+      {/* Next button — only show when detection is done */}
+      {(splitResult || error) && !detecting && (
+        <div className="flex justify-end">
+          <button
+            onClick={onNext}
+            className="px-6 py-2.5 bg-teal-600 text-white text-sm font-medium rounded-lg hover:bg-teal-700 transition-colors"
+          >
+            Weiter &rarr;
+          </button>
+        </div>
+      )}
+    </div>
+  )
+}
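Review note on ordering the child sessions: a plain string comparison places "S. 10" before "S. 2" once a document splits into ten or more pages, which would also change which child counts as the first one. One way to keep the order stable is to sort by the trailing page number. The sketch below assumes child names end in their page number; `NamedSession`, `pageNumberOf` and `childSessionsInPageOrder` are hypothetical names, not part of the codebase:

```typescript
// Hypothetical helpers, not part of this PR.
interface NamedSession { id: string; name: string }

// Extracts the trailing page number from names like "Title — S. 12".
// Names without a trailing number get Infinity so they sort last.
function pageNumberOf(name: string): number {
  const match = /(\d+)\s*$/.exec(name)
  return match ? parseInt(match[1], 10) : Number.POSITIVE_INFINITY
}

// Keeps only the children of `baseName` and returns them in page order.
// `filter` creates a new array, so the input list is not mutated by `sort`.
function childSessionsInPageOrder(sessions: NamedSession[], baseName: string): NamedSession[] {
  const prefix = baseName + ' — '
  return sessions
    .filter(s => s.name.startsWith(prefix))
    .sort((a, b) =>
      // Fall back to a string comparison when the page numbers tie or are both missing.
      pageNumberOf(a.name) - pageNumberOf(b.name) || a.name.localeCompare(b.name))
}
```

The prefix filter mirrors the `startsWith(baseName + ' — ')` check in `detectPageSplit`, so a session belonging to a different parent is not picked up.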
--- a/admin-lehrer/components/ocr-kombi/StepStructure.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepStructure.tsx
@@ -0,0 +1,13 @@
+'use client'
+
+import { StepStructureDetection } from '@/components/ocr-pipeline/StepStructureDetection'
+
+interface StepStructureProps {
+  sessionId: string | null
+  onNext: () => void
+}
+
+/** Thin wrapper around the shared StepStructureDetection component */
+export function StepStructure({ sessionId, onNext }: StepStructureProps) {
+  return <StepStructureDetection sessionId={sessionId} onNext={onNext} />
+}
--- a/admin-lehrer/components/ocr-kombi/StepUpload.tsx
+++ b/admin-lehrer/components/ocr-kombi/StepUpload.tsx
@@ -0,0 +1,303 @@
+'use client'
+
+import { useState, useCallback, useEffect } from 'react'
+import { DOCUMENT_CATEGORIES, type DocumentCategory } from '@/app/(admin)/ai/ocr-kombi/types'
+
+const KLAUSUR_API = '/klausur-api'
+
+interface StepUploadProps {
+  sessionId: string | null
+  onUploaded: (sessionId: string, name: string) => void
+  onNext: () => void
+}
+
+export function StepUpload({ sessionId, onUploaded, onNext }: StepUploadProps) {
+  const [dragging, setDragging] = useState(false)
+  const [uploading, setUploading] = useState(false)
+  const [selectedFile, setSelectedFile] = useState<File | null>(null)
+  const [preview, setPreview] = useState<string | null>(null)
+  const [title, setTitle] = useState('')
+  const [category, setCategory] = useState<DocumentCategory>('vokabelseite')
+  const [error, setError] = useState('')
+
+  // Clean up preview URL on unmount
+  useEffect(() => {
+    return () => { if (preview) URL.revokeObjectURL(preview) }
+  }, [preview])
+
+  const handleFileSelect = useCallback((file: File) => {
+    setSelectedFile(file)
+    setError('')
+    if (file.type.startsWith('image/')) {
+      setPreview(URL.createObjectURL(file))
+    } else {
+      setPreview(null)
+    }
+    // Auto-fill title from filename if empty
+    if (!title.trim()) {
+      setTitle(file.name.replace(/\.[^.]+$/, ''))
+    }
+  }, [title])
+
+  const handleUpload = useCallback(async () => {
+    if (!selectedFile) return
+    setUploading(true)
+    setError('')
+
+    try {
+      const formData = new FormData()
+      formData.append('file', selectedFile)
+      if (title.trim()) formData.append('name', title.trim())
+
+      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`, {
+        method: 'POST',
+        body: formData,
+      })
+
+      if (!res.ok) {
+        const data = await res.json().catch(() => ({}))
+        throw new Error(data.detail || `Upload fehlgeschlagen (${res.status})`)
+      }
+
+      const data = await res.json()
+      const sid = data.session_id || data.id
+
+      // Set category
+      if (category) {
+        await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
+          method: 'PUT',
+          headers: { 'Content-Type': 'application/json' },
+          body: JSON.stringify({ document_category: category }),
+        })
+      }
+
+      onUploaded(sid, title.trim() || selectedFile.name)
+    } catch (e) {
+      setError(e instanceof Error ? e.message : String(e))
+    } finally {
+      setUploading(false)
+    }
+  }, [selectedFile, title, category, onUploaded])
+
+  const handleDrop = useCallback((e: React.DragEvent) => {
+    e.preventDefault()
+    setDragging(false)
+    const file = e.dataTransfer.files[0]
+    if (file) handleFileSelect(file)
+  }, [handleFileSelect])
+
+  const handleInputChange = useCallback((e: React.ChangeEvent<HTMLInputElement>) => {
+    const file = e.target.files?.[0]
+    if (file) handleFileSelect(file)
+  }, [handleFileSelect])
+
+  const clearFile = useCallback(() => {
+    setSelectedFile(null)
+    if (preview) URL.revokeObjectURL(preview)
+    setPreview(null)
+  }, [preview])
+
+  // ---- Phase 2: Uploaded → show result + "Weiter" ----
+  if (sessionId) {
+    return (
+      <div className="space-y-4">
+        <div className="bg-green-50 dark:bg-green-900/20 border border-green-200 dark:border-green-800 rounded-lg p-4">
+          <div className="flex items-center gap-2 text-green-700 dark:text-green-300 text-sm font-medium mb-3">
+            <span>&#10003;</span> Dokument hochgeladen
+          </div>
+          <div className="flex gap-4">
+            <div className="w-48 h-64 rounded-lg overflow-hidden bg-gray-100 dark:bg-gray-700 flex-shrink-0 border border-gray-200 dark:border-gray-600">
+              {/* eslint-disable-next-line @next/next/no-img-element */}
+              <img
+                src={`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image`}
+                alt="Hochgeladenes Dokument"
+                className="w-full h-full object-contain"
+                onError={(e) => { (e.target as HTMLImageElement).style.display = 'none' }}
+              />
+            </div>
+            <div className="text-sm text-gray-600 dark:text-gray-400">
+              <div className="font-medium text-gray-700 dark:text-gray-300 mb-1">
+                {title || 'Dokument'}
+              </div>
+              <div className="text-xs text-gray-400 mt-1">
+                Kategorie: {DOCUMENT_CATEGORIES.find(c => c.value === category)?.label || category}
+              </div>
+              <div className="text-xs font-mono text-gray-400 mt-1">
+                Session: {sessionId.slice(0, 8)}...
+              </div>
+            </div>
+          </div>
+        </div>
+
+        <div className="flex justify-end">
+          <button
+            onClick={onNext}
+            className="px-6 py-2.5 bg-teal-600 text-white text-sm font-medium rounded-lg hover:bg-teal-700 transition-colors"
+          >
+            Weiter &rarr;
+          </button>
+        </div>
+      </div>
+    )
+  }
+
+  // ---- Phase 1b: File selected → preview + "Hochladen" ----
+  if (selectedFile) {
+    return (
+      <div className="space-y-4">
+        {/* Title input */}
+        <div>
+          <label className="block text-sm font-medium text-gray-700 dark:text-gray-300 mb-1">
+            Titel
+          </label>
+          <input
+            type="text"
+            value={title}
+            onChange={(e) => setTitle(e.target.value)}
+            placeholder="z.B. Vokabeln Unit 3"
+            className="w-full px-3 py-2 border border-gray-300 dark:border-gray-600 rounded-lg bg-white dark:bg-gray-800 text-sm"
+          />
+        </div>
+
+        {/* Category selector */}
+        <div>
+          <label className="block text-sm font-medium text-gray-700 dark:text-gray-300 mb-1">
+            Kategorie
+          </label>
+          <div className="grid grid-cols-4 gap-1.5">
+            {DOCUMENT_CATEGORIES.map(cat => (
+              <button
+                key={cat.value}
+                onClick={() => setCategory(cat.value)}
+                className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
+                  category === cat.value
+                    ? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300 ring-1 ring-teal-400'
+                    : 'bg-gray-50 dark:bg-gray-700 text-gray-600 dark:text-gray-400 hover:bg-gray-100'
+                }`}
+              >
+                {cat.icon} {cat.label}
+              </button>
+            ))}
+          </div>
+        </div>
+
+        {/* File preview */}
+        <div className="border border-gray-200 dark:border-gray-700 rounded-xl p-4">
+          <div className="flex items-start gap-4">
+            {preview ? (
+              <div className="w-36 h-48 rounded-lg overflow-hidden bg-gray-100 dark:bg-gray-700 flex-shrink-0 border border-gray-200 dark:border-gray-600">
+                {/* eslint-disable-next-line @next/next/no-img-element */}
+                <img src={preview} alt="Vorschau" className="w-full h-full object-contain" />
+              </div>
+            ) : (
+              <div className="w-36 h-48 rounded-lg bg-gray-100 dark:bg-gray-700 flex-shrink-0 flex items-center justify-center border border-gray-200 dark:border-gray-600">
+                <span className="text-3xl">&#128196;</span>
+              </div>
+            )}
+            <div className="flex-1 min-w-0">
+              <div className="font-medium text-sm text-gray-700 dark:text-gray-300 truncate">
+                {selectedFile.name}
+              </div>
+              <div className="text-xs text-gray-400 mt-1">
+                {(selectedFile.size / 1024 / 1024).toFixed(1)} MB
+              </div>
+              <button
+                onClick={clearFile}
+                className="text-xs text-red-500 hover:text-red-700 mt-2"
+              >
+                Andere Datei waehlen
+              </button>
+            </div>
+          </div>
+
+          <button
+            onClick={handleUpload}
+            disabled={uploading}
+            className="mt-4 w-full px-4 py-2.5 bg-teal-600 text-white text-sm font-medium rounded-lg hover:bg-teal-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
+          >
+            {uploading ? 'Wird hochgeladen...' : 'Hochladen'}
+          </button>
+        </div>
+
+        {error && (
+          <div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
+            {error}
+          </div>
+        )}
+      </div>
+    )
+  }
+
+  // ---- Phase 1a: No file → drop zone ----
+  return (
+    <div className="space-y-4">
+      {/* Title input */}
+      <div>
+        <label className="block text-sm font-medium text-gray-700 dark:text-gray-300 mb-1">
+          Titel (optional)
+        </label>
+        <input
+          type="text"
+          value={title}
+          onChange={(e) => setTitle(e.target.value)}
+          placeholder="z.B. Vokabeln Unit 3"
+          className="w-full px-3 py-2 border border-gray-300 dark:border-gray-600 rounded-lg bg-white dark:bg-gray-800 text-sm"
+        />
+      </div>
+
+      {/* Category selector */}
+      <div>
+        <label className="block text-sm font-medium text-gray-700 dark:text-gray-300 mb-1">
+          Kategorie
+        </label>
+        <div className="grid grid-cols-4 gap-1.5">
+          {DOCUMENT_CATEGORIES.map(cat => (
+            <button
+              key={cat.value}
+              onClick={() => setCategory(cat.value)}
+              className={`text-xs px-2 py-1.5 rounded-md text-left transition-colors ${
+                category === cat.value
+                  ? 'bg-teal-100 dark:bg-teal-900/40 text-teal-700 dark:text-teal-300 ring-1 ring-teal-400'
+                  : 'bg-gray-50 dark:bg-gray-700 text-gray-600 dark:text-gray-400 hover:bg-gray-100'
+              }`}
+            >
+              {cat.icon} {cat.label}
+            </button>
+          ))}
+        </div>
+      </div>
+
+      {/* Drop zone */}
+      <div
+        onDragOver={(e) => { e.preventDefault(); setDragging(true) }}
+        onDragLeave={() => setDragging(false)}
+        onDrop={handleDrop}
+        className={`border-2 border-dashed rounded-xl p-12 text-center transition-colors ${
+          dragging
+            ? 'border-teal-400 bg-teal-50 dark:bg-teal-900/20'
+            : 'border-gray-300 dark:border-gray-600 hover:border-gray-400'
+        }`}
+      >
+        <div className="text-4xl mb-3">&#128228;</div>
+        <div className="text-sm text-gray-600 dark:text-gray-400 mb-2">
+          Bild oder PDF hierher ziehen
+        </div>
+        <label className="inline-block px-4 py-2 bg-teal-600 text-white text-sm rounded-lg cursor-pointer hover:bg-teal-700">
+          Datei auswaehlen
+          <input
+            type="file"
+            accept="image/*,.pdf"
+            onChange={handleInputChange}
+            className="hidden"
+          />
+        </label>
+      </div>
+
+      {error && (
+        <div className="text-sm text-red-500 bg-red-50 dark:bg-red-900/20 p-3 rounded-lg">
+          {error}
+        </div>
+      )}
+    </div>
+  )
+}
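Review note on the title auto-fill in `handleFileSelect`: the pattern `/\.[^.]+$/` removes only the last extension, so a name that consists of nothing but an extension strips to an empty string. The upload itself already falls back to `selectedFile.name`, but the title field would stay empty. A minimal sketch of the same derivation with that fallback built in; `titleFromFilename` is a hypothetical name, not part of the codebase:

```typescript
// Hypothetical helper, not part of this PR: derives a default title from a file name.
function titleFromFilename(fileName: string): string {
  // Same pattern as in handleFileSelect: drop the last ".ext" segment.
  const stripped = fileName.replace(/\.[^.]+$/, '').trim()
  // A name such as ".pdf" strips to an empty string; fall back to the full name.
  return stripped !== '' ? stripped : fileName
}
```

For example, `titleFromFilename('Vokabeln Unit 3.pdf')` yields `'Vokabeln Unit 3'`, while a name without any extension is returned unchanged.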
--- a/admin-lehrer/components/ocr-pipeline/BoxSessionTabs.tsx
+++ b/admin-lehrer/components/ocr-pipeline/BoxSessionTabs.tsx
@@ -1,6 +1,6 @@
 'use client'

-import type { SubSession } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { SubSession } from '@/app/(admin)/ai/ocr-kombi/types'

 interface BoxSessionTabsProps {
  parentSessionId: string
@@ -21,6 +21,7 @@ function getStatusIcon(sub: SubSession): string {
  return STATUS_ICONS.pending
 }

+/** Tabs for box sub-sessions (from column detection zone_type='box'). */
 export function BoxSessionTabs({ parentSessionId, subSessions, activeSessionId, onSessionChange }: BoxSessionTabsProps) {
  if (subSessions.length === 0) return null

@@ -28,7 +29,6 @@ export function BoxSessionTabs({ parentSessionId, subSessions, activeSessionId,

  return (
    <div className="flex items-center gap-1.5 px-1 py-1.5 bg-gray-50 dark:bg-gray-800/50 rounded-xl border border-gray-200 dark:border-gray-700">
-      {/* Main session tab */}
      <button
        onClick={() => onSessionChange(parentSessionId)}
        className={`px-3 py-1.5 rounded-lg text-xs font-medium transition-colors ${
@@ -42,7 +42,6 @@ export function BoxSessionTabs({ parentSessionId, subSessions, activeSessionId,

      <div className="w-px h-5 bg-gray-200 dark:bg-gray-700" />

-      {/* Sub-session tabs */}
      {subSessions.map((sub) => {
        const isActive = activeSessionId === sub.id
        const icon = getStatusIcon(sub)
@@ -59,7 +58,7 @@ export function BoxSessionTabs({ parentSessionId, subSessions, activeSessionId,
            title={sub.name}
          >
            <span className="mr-1">{icon}</span>
-            Seite {sub.box_index + 1}
+            Box {sub.box_index + 1}
          </button>
        )
      })}
--- a/admin-lehrer/components/ocr-pipeline/ColumnControls.tsx
+++ b/admin-lehrer/components/ocr-pipeline/ColumnControls.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useState, useMemo } from 'react'
-import type { ColumnResult, ColumnGroundTruth, PageRegion } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { ColumnResult, ColumnGroundTruth, PageRegion } from '@/app/(admin)/ai/ocr-kombi/types'

 interface ColumnControlsProps {
  columnResult: ColumnResult | null
--- a/admin-lehrer/components/ocr-pipeline/DeskewControls.tsx
+++ b/admin-lehrer/components/ocr-pipeline/DeskewControls.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useState } from 'react'
-import type { DeskewResult, DeskewGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { DeskewResult, DeskewGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'

 interface DeskewControlsProps {
  deskewResult: DeskewResult | null
--- a/admin-lehrer/components/ocr-pipeline/DewarpControls.tsx
+++ b/admin-lehrer/components/ocr-pipeline/DewarpControls.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useEffect, useState } from 'react'
-import type { DeskewResult, DewarpResult, DewarpDetection, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { DeskewResult, DewarpResult, DewarpDetection, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'

 interface DewarpControlsProps {
  dewarpResult: DewarpResult | null
--- a/admin-lehrer/components/ocr-pipeline/FabricReconstructionCanvas.tsx
+++ b/admin-lehrer/components/ocr-pipeline/FabricReconstructionCanvas.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useRef, useState } from 'react'
-import type { GridCell } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { GridCell } from '@/app/(admin)/ai/ocr-kombi/types'

 const KLAUSUR_API = '/klausur-api'

--- a/admin-lehrer/components/ocr-pipeline/ManualColumnEditor.tsx
+++ b/admin-lehrer/components/ocr-pipeline/ManualColumnEditor.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useRef, useState } from 'react'
-import type { ColumnTypeKey, PageRegion } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { ColumnTypeKey, PageRegion } from '@/app/(admin)/ai/ocr-kombi/types'

 const COLUMN_TYPES: { value: ColumnTypeKey; label: string }[] = [
  { value: 'column_en', label: 'EN' },
--- a/admin-lehrer/components/ocr-pipeline/PipelineStepper.tsx
+++ b/admin-lehrer/components/ocr-pipeline/PipelineStepper.tsx
@@ -1,6 +1,6 @@
 'use client'

-import { PipelineStep, DocumentTypeResult } from '@/app/(admin)/ai/ocr-pipeline/types'
+import { PipelineStep, DocumentTypeResult } from '@/app/(admin)/ai/ocr-kombi/types'

 const DOC_TYPE_LABELS: Record<string, string> = {
  vocab_table: 'Vokabeltabelle',
--- a/admin-lehrer/components/ocr-pipeline/StepColumnDetection.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepColumnDetection.tsx
@@ -1,10 +1,10 @@
 'use client'

 import { useCallback, useEffect, useState } from 'react'
-import type { ColumnResult, ColumnGroundTruth, PageRegion, SubSession } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { ColumnResult, ColumnGroundTruth, PageRegion, SubSession } from '@/app/(admin)/ai/ocr-kombi/types'
 import { ColumnControls } from './ColumnControls'
 import { ManualColumnEditor } from './ManualColumnEditor'
-import type { ColumnTypeKey } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { ColumnTypeKey } from '@/app/(admin)/ai/ocr-kombi/types'

 const KLAUSUR_API = '/klausur-api'

--- a/admin-lehrer/components/ocr-pipeline/StepCrop.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepCrop.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useEffect, useState } from 'react'
-import type { CropResult } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { CropResult } from '@/app/(admin)/ai/ocr-kombi/types'
 import { ImageCompareView } from './ImageCompareView'

 const KLAUSUR_API = '/klausur-api'
--- a/admin-lehrer/components/ocr-pipeline/StepDeskew.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepDeskew.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useState } from 'react'
-import type { DeskewGroundTruth, DeskewResult, SessionInfo } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { DeskewGroundTruth, DeskewResult, SessionInfo } from '@/app/(admin)/ai/ocr-kombi/types'
 import { DeskewControls } from './DeskewControls'
 import { ImageCompareView } from './ImageCompareView'

--- a/admin-lehrer/components/ocr-pipeline/StepDewarp.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepDewarp.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useState } from 'react'
-import type { DeskewResult, DewarpResult, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { DeskewResult, DewarpResult, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'
 import { DewarpControls } from './DewarpControls'
 import { ImageCompareView } from './ImageCompareView'

--- a/admin-lehrer/components/ocr-pipeline/StepGridReview.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepGridReview.tsx
@@ -57,6 +57,21 @@ export function StepGridReview({ sessionId, onNext, saveRef }: StepGridReviewPro
    toggleSelectedBold,
    autoCorrectColumnPatterns,
    setCellColor,
+    ipaMode,
+    setIpaMode,
+    syllableMode,
+    setSyllableMode,
+    ocrEnhance,
+    setOcrEnhance,
+    ocrMaxCols,
+    setOcrMaxCols,
+    ocrMinConf,
+    setOcrMinConf,
+    visionFusion,
+    setVisionFusion,
+    documentCategory,
+    setDocumentCategory,
+    rerunOcr,
  } = useGridEditor(sessionId)

  const [showImage, setShowImage] = useState(true)
@@ -231,6 +246,11 @@ export function StepGridReview({ sessionId, onNext, saveRef }: StepGridReviewPro
            Woerterbuch ({Math.round(grid.dictionary_detection.confidence * 100)}%)
          </span>
        )}
+        {grid.page_number?.text && (
+          <span className="px-1.5 py-0.5 rounded bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 border border-gray-200 dark:border-gray-600">
+            S. {grid.page_number.number ?? grid.page_number.text}
+          </span>
+        )}
        {lowConfCells.length > 0 && (
          <span className="px-2 py-0.5 rounded-full bg-red-50 dark:bg-red-900/20 text-red-600 dark:text-red-400 border border-red-200 dark:border-red-800">
            {lowConfCells.length} niedrige Konfidenz
@@ -247,6 +267,50 @@ export function StepGridReview({ sessionId, onNext, saveRef }: StepGridReviewPro
            Alle akzeptieren
          </button>
        )}
+        {/* OCR Quality Steps (A/B Testing) */}
+        <span className="text-gray-400 dark:text-gray-500">|</span>
+        <label className="flex items-center gap-1 cursor-pointer" title="Step 3: CLAHE + Bilateral-Filter Enhancement">
+          <input type="checkbox" checked={ocrEnhance} onChange={(e) => setOcrEnhance(e.target.checked)} className="rounded w-3 h-3" />
+          <span className="text-gray-500 dark:text-gray-400">CLAHE</span>
+        </label>
+        <label className="flex items-center gap-1" title="Step 2: Max Spaltenanzahl (0=unbegrenzt)">
+          <span className="text-gray-500 dark:text-gray-400">MaxCol:</span>
+          <select value={ocrMaxCols} onChange={(e) => setOcrMaxCols(Number(e.target.value))} className="px-1 py-0.5 text-xs rounded border border-gray-200 dark:border-gray-600 bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-300">
+            <option value={0}>off</option>
+            <option value={2}>2</option>
+            <option value={3}>3</option>
+            <option value={4}>4</option>
+            <option value={5}>5</option>
+          </select>
+        </label>
+        <label className="flex items-center gap-1" title="Step 1: Min OCR Confidence (0=auto)">
+          <span className="text-gray-500 dark:text-gray-400">MinConf:</span>
+          <select value={ocrMinConf} onChange={(e) => setOcrMinConf(Number(e.target.value))} className="px-1 py-0.5 text-xs rounded border border-gray-200 dark:border-gray-600 bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-300">
+            <option value={0}>auto</option>
+            <option value={20}>20</option>
+            <option value={30}>30</option>
+            <option value={40}>40</option>
+            <option value={50}>50</option>
+            <option value={60}>60</option>
+          </select>
+        </label>
+
+        <span className="text-gray-400 dark:text-gray-500">|</span>
+        <label className="flex items-center gap-1 cursor-pointer" title="Step 4: Vision-LLM Fusion — das Vision-LLM korrigiert OCR anhand des Bildes">
+          <input type="checkbox" checked={visionFusion} onChange={(e) => setVisionFusion(e.target.checked)} className="rounded w-3 h-3 accent-orange-500" />
+          <span className={`${visionFusion ? 'text-orange-500 dark:text-orange-400 font-medium' : 'text-gray-500 dark:text-gray-400'}`}>Vision-LLM</span>
+        </label>
+        <label className="flex items-center gap-1" title="Dokumenttyp fuer Vision-LLM Prompt">
+          <span className="text-gray-500 dark:text-gray-400">Typ:</span>
+          <select value={documentCategory} onChange={(e) => setDocumentCategory(e.target.value)} className="px-1 py-0.5 text-xs rounded border border-gray-200 dark:border-gray-600 bg-white dark:bg-gray-700 text-gray-700 dark:text-gray-300">
+            <option value="vokabelseite">Vokabelseite</option>
+            <option value="woerterbuch">Woerterbuch</option>
+            <option value="arbeitsblatt">Arbeitsblatt</option>
+            <option value="buchseite">Buchseite</option>
+            <option value="sonstiges">Sonstiges</option>
+          </select>
+        </label>
+
        <div className="ml-auto flex items-center gap-2">
          <button
            onClick={() => {
@@ -283,12 +347,24 @@ export function StepGridReview({ sessionId, onNext, saveRef }: StepGridReviewPro
          canUndo={canUndo}
          canRedo={canRedo}
          showOverlay={false}
+          ipaMode={ipaMode}
+          syllableMode={syllableMode}
          onSave={saveGrid}
          onUndo={undo}
          onRedo={redo}
          onRebuild={buildGrid}
          onToggleOverlay={() => setShowImage(!showImage)}
+          onIpaModeChange={setIpaMode}
+          onSyllableModeChange={setSyllableMode}
        />
+        <button
+          onClick={rerunOcr}
+          disabled={loading}
+          className="ml-2 px-3 py-1.5 text-xs font-medium rounded border border-orange-300 dark:border-orange-700 bg-orange-50 dark:bg-orange-900/20 text-orange-700 dark:text-orange-300 hover:bg-orange-100 dark:hover:bg-orange-900/40 transition-colors disabled:opacity-50"
+          title="OCR komplett neu ausfuehren mit aktuellen Quality-Step-Einstellungen (CLAHE, MinConf), dann Grid neu bauen"
+        >
+          {loading ? 'OCR laeuft...' : 'OCR neu + Grid'}
+        </button>
      </div>

      {/* Split View: Image left + Grid right */}
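Review note: the toolbar tooltips above state that `0` means "unbegrenzt" for MaxCol and "auto" for MinConf. The sketch below shows one way those settings might be turned into a request query so the zero values are omitted instead of sent. It is an illustration only: `buildOcrQuery`, the `OcrQualitySettings` shape, and every query-parameter name are assumptions, not taken from `useGridEditor` or the backend.

```typescript
// Hypothetical settings shape; field names mirror the hook state used in StepGridReview.
interface OcrQualitySettings {
  ocrEnhance: boolean
  ocrMaxCols: number // 0 = unlimited
  ocrMinConf: number // 0 = auto
  visionFusion: boolean
  documentCategory: string
}

// Assumed parameter names (enhance, max_cols, min_conf, vision_fusion, doc_category).
function buildOcrQuery(s: OcrQualitySettings): string {
  const params = new URLSearchParams()
  params.set('enhance', String(s.ocrEnhance))
  // Zero means "no override", so the parameter is left out entirely.
  if (s.ocrMaxCols > 0) params.set('max_cols', String(s.ocrMaxCols))
  if (s.ocrMinConf > 0) params.set('min_conf', String(s.ocrMinConf))
  // The document category only matters for the Vision-LLM prompt.
  if (s.visionFusion) {
    params.set('vision_fusion', 'true')
    params.set('doc_category', s.documentCategory)
  }
  return params.toString()
}

const exampleQuery = buildOcrQuery({
  ocrEnhance: true,
  ocrMaxCols: 0,
  ocrMinConf: 40,
  visionFusion: false,
  documentCategory: 'vokabelseite',
})
// exampleQuery === 'enhance=true&min_conf=40'
```

Omitting the zero values keeps the "auto" and "unlimited" defaults on the server side, so the client does not need to know what they resolve to.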
--- a/admin-lehrer/components/ocr-pipeline/StepGroundTruth.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepGroundTruth.tsx
@@ -3,8 +3,8 @@
 import { useCallback, useEffect, useRef, useState } from 'react'
 import type {
  GridCell, ColumnMeta, ImageRegion, ImageStyle,
-} from '@/app/(admin)/ai/ocr-pipeline/types'
-import { IMAGE_STYLES as STYLES } from '@/app/(admin)/ai/ocr-pipeline/types'
+} from '@/app/(admin)/ai/ocr-kombi/types'
+import { IMAGE_STYLES as STYLES } from '@/app/(admin)/ai/ocr-kombi/types'

 const KLAUSUR_API = '/klausur-api'

--- a/admin-lehrer/components/ocr-pipeline/StepLlmReview.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepLlmReview.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
-import type { GridCell, GridResult, WordEntry, ColumnMeta } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { GridCell, GridResult, WordEntry, ColumnMeta } from '@/app/(admin)/ai/ocr-kombi/types'
 import { usePixelWordPositions } from './usePixelWordPositions'

 const KLAUSUR_API = '/klausur-api'
--- a/admin-lehrer/components/ocr-pipeline/StepOrientation.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepOrientation.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useState } from 'react'
-import type { OrientationResult, SessionInfo, SubSession } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { OrientationResult, SessionInfo } from '@/app/(admin)/ai/ocr-kombi/types'
 import { ImageCompareView } from './ImageCompareView'

 const KLAUSUR_API = '/klausur-api'
@@ -17,10 +17,10 @@ interface PageSplitResult {
 interface StepOrientationProps {
  sessionId?: string | null
  onNext: (sessionId: string) => void
-  onSubSessionsCreated?: (subs: SubSession[]) => void
+  onSessionList?: () => void
 }

-export function StepOrientation({ sessionId: existingSessionId, onNext, onSubSessionsCreated }: StepOrientationProps) {
+export function StepOrientation({ sessionId: existingSessionId, onNext, onSessionList }: StepOrientationProps) {
  const [session, setSession] = useState<SessionInfo | null>(null)
  const [orientationResult, setOrientationResult] = useState<OrientationResult | null>(null)
  const [pageSplitResult, setPageSplitResult] = useState<PageSplitResult | null>(null)
@@ -30,7 +30,7 @@ export function StepOrientation({ sessionId: existingSessionId, onNext, onSubSes
  const [dragOver, setDragOver] = useState(false)
  const [sessionName, setSessionName] = useState('')

-  // Reload session data when navigating back
+  // Reload session data when navigating back — auto-trigger orientation if missing
  useEffect(() => {
    if (!existingSessionId || session) return

@@ -51,6 +51,28 @@ export function StepOrientation({ sessionId: existingSessionId, onNext, onSubSes

        if (data.orientation_result) {
          setOrientationResult(data.orientation_result)
+        } else {
+          // Session exists but orientation not yet run (e.g. page-split session)
+          // Auto-trigger orientation detection
+          setDetecting(true)
+          try {
+            const orientRes = await fetch(
+              `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${existingSessionId}/orientation`,
+              { method: 'POST' },
+            )
+            if (orientRes.ok) {
+              const orientData = await orientRes.json()
+              setOrientationResult({
+                orientation_degrees: orientData.orientation_degrees,
+                corrected: orientData.corrected,
+                duration_seconds: orientData.duration_seconds,
+              })
+            }
+          } catch (e) {
+            console.error('Auto-orientation failed:', e)
+          } finally {
+            setDetecting(false)
+          }
        }
      } catch (e) {
        console.error('Failed to reload session:', e)
@@ -112,16 +134,6 @@ export function StepOrientation({ sessionId: existingSessionId, onNext, onSubSes
        if (splitRes.ok) {
          const splitData: PageSplitResult = await splitRes.json()
          setPageSplitResult(splitData)
-          if (splitData.multi_page && splitData.sub_sessions && onSubSessionsCreated) {
-            onSubSessionsCreated(
-              splitData.sub_sessions.map((s) => ({
-                id: s.id,
-                name: s.name,
-                box_index: s.page_index,
-                current_step: splitData.used_original ? 1 : 2,
-              }))
-            )
-          }
        }
      } catch (e) {
        console.error('Page-split detection failed:', e)
@@ -133,7 +145,7 @@ export function StepOrientation({ sessionId: existingSessionId, onNext, onSubSes
      setUploading(false)
      setDetecting(false)
    }
-  }, [sessionName, onSubSessionsCreated])
+  }, [sessionName])

  const handleDrop = useCallback((e: React.DragEvent) => {
    e.preventDefault()
@@ -264,10 +276,10 @@ export function StepOrientation({ sessionId: existingSessionId, onNext, onSubSes
      {pageSplitResult?.multi_page && (
        <div className="bg-blue-50 dark:bg-blue-900/20 rounded-lg border border-blue-200 dark:border-blue-700 p-4">
          <div className="text-sm font-medium text-blue-700 dark:text-blue-300">
-            Doppelseite erkannt — {pageSplitResult.page_count} Seiten
+            Doppelseite erkannt — {pageSplitResult.page_count} unabhaengige Sessions erstellt
          </div>
          <p className="text-xs text-blue-600 dark:text-blue-400 mt-1">
-            Jede Seite wird einzeln durch die Pipeline (Begradigung, Entzerrung, Zuschnitt, ...) verarbeitet.
+            Jede Seite wird als eigene Session durch die Pipeline verarbeitet.
            {pageSplitResult.used_original && ' (Seitentrennung auf dem Originalbild, da die Orientierung die Doppelseite gedreht hat.)'}
          </p>
          <div className="flex gap-2 mt-2">
@@ -286,12 +298,21 @@ export function StepOrientation({ sessionId: existingSessionId, onNext, onSubSes
      {/* Next button */}
      {orientationResult && (
        <div className="flex justify-end">
+          {pageSplitResult?.multi_page ? (
+            <button
+              onClick={() => onSessionList?.()}
+              className="px-6 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium transition-colors"
+            >
+              Zur Session-Liste &rarr;
+            </button>
+          ) : (
            <button
              onClick={() => onNext(session.session_id)}
              className="px-6 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium transition-colors"
            >
-            {pageSplitResult?.multi_page ? 'Seiten verarbeiten' : 'Weiter'} &rarr;
+              Weiter &rarr;
            </button>
+          )}
        </div>
      )}

--- a/admin-lehrer/components/ocr-pipeline/StepReconstruction.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepReconstruction.tsx
@@ -2,7 +2,7 @@

 import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
 import dynamic from 'next/dynamic'
-import type { GridResult, GridCell, ColumnResult, RowResult, PageZone, PageRegion, RowItem, StructureResult, StructureBox, StructureGraphic } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { GridResult, GridCell, ColumnResult, RowResult, PageZone, PageRegion, RowItem, StructureResult, StructureBox, StructureGraphic } from '@/app/(admin)/ai/ocr-kombi/types'
 import { usePixelWordPositions } from './usePixelWordPositions'

 const KLAUSUR_API = '/klausur-api'
--- a/admin-lehrer/components/ocr-pipeline/StepRowDetection.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepRowDetection.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useState } from 'react'
-import type { RowResult, RowGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { RowResult, RowGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'

 const KLAUSUR_API = '/klausur-api'

--- a/admin-lehrer/components/ocr-pipeline/StepStructureDetection.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepStructureDetection.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useRef, useState } from 'react'
-import type { ExcludeRegion, StructureResult } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { ExcludeRegion, StructureResult } from '@/app/(admin)/ai/ocr-kombi/types'

 const KLAUSUR_API = '/klausur-api'

--- a/admin-lehrer/components/ocr-pipeline/StepWordRecognition.tsx
+++ b/admin-lehrer/components/ocr-pipeline/StepWordRecognition.tsx
@@ -1,7 +1,7 @@
 'use client'

 import { useCallback, useEffect, useRef, useState } from 'react'
-import type { GridResult, GridCell, WordEntry, WordGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { GridResult, GridCell, WordEntry, WordGroundTruth } from '@/app/(admin)/ai/ocr-kombi/types'

 const KLAUSUR_API = '/klausur-api'

--- a/admin-lehrer/components/ocr-pipeline/tests/usePixelWordPositions.test.ts
+++ b/admin-lehrer/components/ocr-pipeline/tests/usePixelWordPositions.test.ts
@@ -0,0 +1,328 @@
+/**
+ * Tests for usePixelWordPositions hook.
+ *
+ * The hook performs pixel-based word positioning using an offscreen canvas.
+ * Since Canvas/getImageData is not available in jsdom, we test the pure
+ * computation logic by extracting and testing the algorithms directly.
+ */
+import { describe, it, expect } from 'vitest'
+
+
+// ---------------------------------------------------------------------------
+// Extract pure computation functions from the hook for testing
+// ---------------------------------------------------------------------------
+
+interface Cluster {
+  start: number
+  end: number
+}
+
+/**
+ * Cluster detection: find runs of dark pixels above a threshold.
+ * Replicates the cluster detection logic in usePixelWordPositions.
+ */
+function findClusters(proj: number[], ch: number, cw: number): Cluster[] {
+  const threshold = Math.max(1, ch * 0.03)
+  const minGap = Math.max(5, Math.round(cw * 0.02))
+  const clusters: Cluster[] = []
+  let inCluster = false
+  let clStart = 0
+  let gap = 0
+
+  for (let x = 0; x < cw; x++) {
+    if (proj[x] >= threshold) {
+      if (!inCluster) { clStart = x; inCluster = true }
+      gap = 0
+    } else if (inCluster) {
+      gap++
+      if (gap > minGap) {
+        clusters.push({ start: clStart, end: x - gap })
+        inCluster = false
+        gap = 0
+      }
+    }
+  }
+  if (inCluster) clusters.push({ start: clStart, end: cw - 1 - gap })
+
+  return clusters
+}
+
+/**
+ * Mirror clusters for 180° rotation.
+ * Replicates the rotation logic in usePixelWordPositions.
+ */
+function mirrorClusters(clusters: Cluster[], cw: number): Cluster[] {
+  return clusters.map(c => ({
+    start: cw - 1 - c.end,
+    end: cw - 1 - c.start,
+  })).reverse()
+}
+
+/**
+ * Compute fontRatio from cluster width, measured text width, and cell height.
+ * Replicates the font ratio calculation.
+ */
+function computeFontRatio(
+  clusterW: number,
+  measuredWidth: number,
+  refFontSize: number,
+  ch: number,
+): number {
+  const autoFontPx = refFontSize * (clusterW / measuredWidth)
+  return Math.min(autoFontPx / ch, 1.0)
+}
+
+/**
+ * Mode normalization: find the most common fontRatio (bucketed to 0.02).
+ * Replicates the mode normalization in usePixelWordPositions.
+ */
+function normalizeFontRatios(ratios: number[]): number {
+  if (ratios.length === 0) return 0
+  const buckets = new Map<number, number>()
+  for (const r of ratios) {
+    const key = Math.round(r * 50) / 50
+    buckets.set(key, (buckets.get(key) || 0) + 1)
+  }
+  let modeRatio = ratios[0]
+  let modeCount = 0
+  for (const [ratio, count] of buckets) {
+    if (count > modeCount) { modeRatio = ratio; modeCount = count }
+  }
+  return modeRatio
+}
+
+/**
+ * Coordinate transform for 180° rotation.
+ */
+function transformCellCoords180(
+  x: number, y: number, w: number, h: number,
+  imgW: number, imgH: number,
+): { cx: number; cy: number } {
+  return {
+    cx: Math.round((100 - x - w) / 100 * imgW),
+    cy: Math.round((100 - y - h) / 100 * imgH),
+  }
+}
+
+
+// ---------------------------------------------------------------------------
+// Tests
+// ---------------------------------------------------------------------------
+
+describe('findClusters', () => {
+  it('should find a single cluster', () => {
+    // Simulate a projection with dark pixels from x=10 to x=50
+    const proj = new Array(100).fill(0)
+    for (let x = 10; x <= 50; x++) proj[x] = 10
+
+    const clusters = findClusters(proj, 100, 100)
+    expect(clusters.length).toBe(1)
+    expect(clusters[0].start).toBe(10)
+    expect(clusters[0].end).toBe(50)
+  })
+
+  it('should find multiple clusters separated by gaps', () => {
+    const proj = new Array(200).fill(0)
+    // Two word groups with a gap between
+    for (let x = 10; x <= 40; x++) proj[x] = 10
+    for (let x = 80; x <= 120; x++) proj[x] = 10
+
+    const clusters = findClusters(proj, 100, 200)
+    expect(clusters.length).toBe(2)
+    expect(clusters[0].start).toBe(10)
+    expect(clusters[1].start).toBe(80)
+  })
+
+  it('should merge clusters with small gaps', () => {
+    // Gap smaller than minGap should not split clusters
+    const proj = new Array(100).fill(0)
+    for (let x = 10; x <= 30; x++) proj[x] = 10
+    // Small gap (3px) — minGap = max(5, 100*0.02) = 5
+    for (let x = 34; x <= 50; x++) proj[x] = 10
+
+    const clusters = findClusters(proj, 100, 100)
+    expect(clusters.length).toBe(1)  // merged into one cluster
+  })
+
+  it('should return empty for all-white projection', () => {
+    const proj = new Array(100).fill(0)
+    const clusters = findClusters(proj, 100, 100)
+    expect(clusters.length).toBe(0)
+  })
+})
+
+
+describe('mirrorClusters', () => {
+  it('should mirror clusters for 180° rotation', () => {
+    const clusters: Cluster[] = [
+      { start: 10, end: 50 },
+      { start: 80, end: 120 },
+    ]
+    const cw = 200
+
+    const mirrored = mirrorClusters(clusters, cw)
+
+    // Cluster at (10,50) → (cw-1-50, cw-1-10) = (149, 189)
+    // Cluster at (80,120) → (cw-1-120, cw-1-80) = (79, 119)
+    // After reverse: [(79,119), (149,189)]
+    expect(mirrored.length).toBe(2)
+    expect(mirrored[0]).toEqual({ start: 79, end: 119 })
+    expect(mirrored[1]).toEqual({ start: 149, end: 189 })
+  })
+
+  it('should maintain left-to-right order after mirroring', () => {
+    const clusters: Cluster[] = [
+      { start: 5, end: 30 },
+      { start: 50, end: 80 },
+      { start: 100, end: 130 },
+    ]
+
+    const mirrored = mirrorClusters(clusters, 200)
+
+    // After mirroring and reversing, order should be left-to-right
+    for (let i = 1; i < mirrored.length; i++) {
+      expect(mirrored[i].start).toBeGreaterThan(mirrored[i - 1].start)
+    }
+  })
+
+  it('should handle single cluster', () => {
+    const clusters: Cluster[] = [{ start: 20, end: 80 }]
+    const mirrored = mirrorClusters(clusters, 200)
+
+    expect(mirrored.length).toBe(1)
+    expect(mirrored[0]).toEqual({ start: 119, end: 179 })
+  })
+})
+
+
+describe('computeFontRatio', () => {
+  it('should compute ratio based on cluster vs measured width', () => {
+    // Cluster is 100px wide, measured text at 40px font is 200px → autoFont = 20px
+    // Cell height = 30px → ratio = 20/30 = 0.667
+    const ratio = computeFontRatio(100, 200, 40, 30)
+    expect(ratio).toBeCloseTo(0.667, 2)
+  })
+
+  it('should cap ratio at 1.0', () => {
+    // Very large cluster relative to measured text
+    const ratio = computeFontRatio(400, 100, 40, 30)
+    expect(ratio).toBe(1.0)
+  })
+
+  it('should handle small cluster width', () => {
+    const ratio = computeFontRatio(10, 200, 40, 30)
+    expect(ratio).toBeCloseTo(0.067, 2)
+  })
+})
+
+
+describe('normalizeFontRatios', () => {
+  it('should return the most common ratio', () => {
+    const ratios = [0.5, 0.5, 0.5, 0.3, 0.3, 0.7]
+    const mode = normalizeFontRatios(ratios)
+    expect(mode).toBe(0.5)
+  })
+
+  it('should bucket ratios to nearest 0.02', () => {
+    // 0.49 and 0.50 share the 0.50 bucket; 0.51 lands in 0.52 because Math.round(25.5) is 26
+    const ratios = [0.51, 0.49, 0.50, 0.30]
+    const mode = normalizeFontRatios(ratios)
+    expect(mode).toBe(0.50)
+  })
+
+  it('should handle empty array', () => {
+    expect(normalizeFontRatios([])).toBe(0)
+  })
+
+  it('should handle single ratio', () => {
+    expect(normalizeFontRatios([0.65])).toBe(0.66)  // rounded to nearest 0.02
+  })
+})
+
+
+describe('transformCellCoords180', () => {
+  it('should transform cell coordinates for 180° rotation', () => {
+    // Cell at x=10%, y=20%, w=30%, h=5% on a 1000x2000 image
+    const { cx, cy } = transformCellCoords180(10, 20, 30, 5, 1000, 2000)
+
+    // Expected: cx = (100 - 10 - 30) / 100 * 1000 = 600
+    //           cy = (100 - 20 - 5) / 100 * 2000 = 1500
+    expect(cx).toBe(600)
+    expect(cy).toBe(1500)
+  })
+
+  it('should handle cell at origin', () => {
+    const { cx, cy } = transformCellCoords180(0, 0, 50, 50, 1000, 1000)
+
+    // Expected: cx = (100 - 0 - 50) / 100 * 1000 = 500
+    //           cy = (100 - 0 - 50) / 100 * 1000 = 500
+    expect(cx).toBe(500)
+    expect(cy).toBe(500)
+  })
+
+  it('should handle cell at bottom-right', () => {
+    const { cx, cy } = transformCellCoords180(80, 90, 20, 10, 1000, 2000)
+
+    // Expected: cx = (100 - 80 - 20) / 100 * 1000 = 0
+    //           cy = (100 - 90 - 10) / 100 * 2000 = 0
+    expect(cx).toBe(0)
+    expect(cy).toBe(0)
+  })
+})
+
+
+describe('sub-session coordinate conversion', () => {
+  /**
+   * Test the coordinate conversion from sub-session (box-relative)
+   * to parent (page-absolute) coordinates.
+   * Replicates the logic in StepReconstruction loadSessionData.
+   */
+  it('should convert sub-session cell coords to parent space', () => {
+    const imgW = 1746
+    const imgH = 2487
+
+    // Box zone in pixels
+    const box = { x: 50, y: 1145, width: 1100, height: 270 }
+
+    // Box in percent
+    const boxXPct = (box.x / imgW) * 100
+    const boxYPct = (box.y / imgH) * 100
+    const boxWPct = (box.width / imgW) * 100
+    const boxHPct = (box.height / imgH) * 100
+
+    // Sub-session cell at (10%, 20%, 80%, 15%) relative to box
+    const subCell = { x: 10, y: 20, w: 80, h: 15 }
+
+    const parentX = boxXPct + (subCell.x / 100) * boxWPct
+    const parentY = boxYPct + (subCell.y / 100) * boxHPct
+    const parentW = (subCell.w / 100) * boxWPct
+    const parentH = (subCell.h / 100) * boxHPct
+
+    // Box start in percent: x ≈ 2.86%, y ≈ 46.04%
+    expect(parentX).toBeCloseTo(boxXPct + 0.1 * boxWPct, 2)
+    expect(parentY).toBeCloseTo(boxYPct + 0.2 * boxHPct, 2)
+    expect(parentW).toBeCloseTo(0.8 * boxWPct, 2)
+    expect(parentH).toBeCloseTo(0.15 * boxHPct, 2)
+
+    // All values should be within 0-100%
+    expect(parentX).toBeGreaterThan(0)
+    expect(parentY).toBeGreaterThan(0)
+    expect(parentX + parentW).toBeLessThan(100)
+    expect(parentY + parentH).toBeLessThan(100)
+  })
+
+  it('should place sub-cell at box origin when sub coords are 0,0', () => {
+    const imgW = 1000
+    const imgH = 2000
+    const box = { x: 100, y: 500, width: 800, height: 200 }
+
+    const boxXPct = (box.x / imgW) * 100  // 10%
+    const boxYPct = (box.y / imgH) * 100  // 25%
+
+    const parentX = boxXPct + (0 / 100) * ((box.width / imgW) * 100)
+    const parentY = boxYPct + (0 / 100) * ((box.height / imgH) * 100)
+
+    expect(parentX).toBeCloseTo(10, 1)
+    expect(parentY).toBeCloseTo(25, 1)
+  })
+})
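Review note: the tests above take the projection array `proj` as given. A minimal sketch of how such a per-column projection might be computed from RGBA pixel data follows, assuming that "dark" means luminance below a fixed cutoff. `columnProjection` and `darkThreshold` are hypothetical names and values, not taken from `usePixelWordPositions`.

```typescript
/**
 * Hypothetical helper (not from the hook): count dark pixels per column
 * of an RGBA buffer laid out like ImageData.data (row-major, 4 bytes per pixel).
 */
function columnProjection(
  data: Uint8ClampedArray,
  cw: number,
  ch: number,
  darkThreshold = 128, // assumed luminance cutoff
): number[] {
  const proj: number[] = new Array(cw).fill(0)
  for (let y = 0; y < ch; y++) {
    for (let x = 0; x < cw; x++) {
      const i = (y * cw + x) * 4
      // Rec. 601 luma approximation; the alpha channel is ignored.
      const lum = 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2]
      if (lum < darkThreshold) proj[x]++
    }
  }
  return proj
}

// Usage: a 4x2 white image whose second column is black.
const demoW = 4
const demoH = 2
const demoData = new Uint8ClampedArray(demoW * demoH * 4).fill(255)
for (let y = 0; y < demoH; y++) {
  const i = (y * demoW + 1) * 4
  demoData[i] = demoData[i + 1] = demoData[i + 2] = 0
}
const demoProj = columnProjection(demoData, demoW, demoH)
// demoProj is [0, 2, 0, 0]
```

An array built this way has one entry per column, which is the shape `findClusters(proj, ch, cw)` in the test file expects.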
--- a/admin-lehrer/components/ocr-pipeline/usePixelWordPositions.ts
+++ b/admin-lehrer/components/ocr-pipeline/usePixelWordPositions.ts
@@ -1,5 +1,5 @@
 import { useEffect, useState } from 'react'
-import type { GridCell } from '@/app/(admin)/ai/ocr-pipeline/types'
+import type { GridCell } from '@/app/(admin)/ai/ocr-kombi/types'

 export interface WordPosition {
  xPct: number
--- a/admin-lehrer/lib/navigation.ts
+++ b/admin-lehrer/lib/navigation.ts
@@ -49,22 +49,6 @@ export const navigation: NavCategory[] = [
        purpose: 'E-Mail-Konten verwalten und KI-Kategorisierung nutzen. IMAP/SMTP Konfiguration, Vorlagen und Audit-Log.',
        audience: ['Support', 'Admins'],
      },
-      {
-        id: 'video-chat',
-        name: 'Video & Chat',
-        href: '/communication/video-chat',
-        description: 'Matrix & Jitsi Monitoring',
-        purpose: 'Dashboard fuer Matrix Synapse und Jitsi Meet. Service-Status, aktive Meetings, Traffic-Analyse und Ressourcen-Empfehlungen.',
-        audience: ['Admins', 'DevOps'],
-      },
-      {
-        id: 'voice-service',
-        name: 'Voice Service',
-        href: '/communication/matrix',
-        description: 'PersonaPlex-7B & TaskOrchestrator',
-        purpose: 'Voice-First Interface Konfiguration und Architektur-Dokumentation. Live Demo, Task States, Intents und DSGVO-Informationen.',
-        audience: ['Entwickler', 'Admins'],
-      },
      {
        id: 'alerts',
        name: 'Alerts Monitoring',
@@ -133,29 +117,11 @@ export const navigation: NavCategory[] = [
      // KI-Werkzeuge: Standalone-Tools fuer Entwicklung & QA
      // -----------------------------------------------------------------------
      {
-        id: 'ocr-compare',
-        name: 'OCR Vergleich',
-        href: '/ai/ocr-compare',
-        description: 'OCR-Methoden & Vokabel-Extraktion',
-        purpose: 'Vergleichen Sie verschiedene OCR-Methoden (lokales LLM, Vision LLM, PaddleOCR, Tesseract, Anthropic) fuer Vokabel-Extraktion. Grid-Overlay, Block-Review und LLM-Vergleich.',
-        audience: ['Entwickler', 'Data Scientists', 'Lehrer'],
-        subgroup: 'KI-Werkzeuge',
-      },
-      {
-        id: 'ocr-pipeline',
-        name: 'OCR Pipeline',
-        href: '/ai/ocr-pipeline',
-        description: 'Schrittweise Seitenrekonstruktion',
-        purpose: 'Schrittweise Seitenrekonstruktion: Scan begradigen, Spalten erkennen, Woerter lokalisieren und die Seite Wort fuer Wort nachbauen. 6-Schritt-Pipeline mit Ground Truth Validierung.',
-        audience: ['Entwickler', 'Data Scientists'],
-        subgroup: 'KI-Werkzeuge',
-      },
-      {
-        id: 'ocr-overlay',
-        name: 'OCR Overlay',
-        href: '/ai/ocr-overlay',
-        description: 'Ganzseitige Overlay-Rekonstruktion',
-        purpose: 'Arbeitsblatt ohne Spaltenerkennung direkt als Overlay rekonstruieren. Vereinfachte 7-Schritt-Pipeline.',
+        id: 'ocr-kombi',
+        name: 'OCR Kombi',
+        href: '/ai/ocr-kombi',
+        description: 'Modulare 11-Schritt-Pipeline',
+        purpose: 'Modulare OCR-Pipeline mit Dual-Engine (PP-OCRv5 + Tesseract), Strukturerkennung, Grid-Aufbau und Review. Multi-Page-Dokument-Unterstuetzung.',
        audience: ['Entwickler'],
        subgroup: 'KI-Werkzeuge',
      },
@@ -169,16 +135,6 @@ export const navigation: NavCategory[] = [
        oldAdminPath: '/admin/quality',
        subgroup: 'KI-Werkzeuge',
      },
-      {
-        id: 'gpu',
-        name: 'GPU Infrastruktur',
-        href: '/ai/gpu',
-        description: 'vast.ai GPU Management',
-        purpose: 'Verwalten Sie GPU-Instanzen auf vast.ai fuer ML-Training und Inferenz.',
-        audience: ['DevOps', 'Entwickler'],
-        oldAdminPath: '/admin/gpu',
-        subgroup: 'KI-Werkzeuge',
-      },
      // -----------------------------------------------------------------------
      // KI-Anwendungen: Endnutzer-orientierte KI-Module
      // -----------------------------------------------------------------------
@@ -200,15 +156,6 @@ export const navigation: NavCategory[] = [
        audience: ['Entwickler', 'QA'],
        subgroup: 'KI-Werkzeuge',
      },
-      {
-        id: 'model-management',
-        name: 'Model Management',
-        href: '/ai/model-management',
-        description: 'ONNX & PyTorch Modell-Verwaltung',
-        purpose: 'Verfuegbare ML-Modelle verwalten (PyTorch vs ONNX), Backend umschalten, Benchmark-Vergleiche ausfuehren und RAM/Performance-Metriken einsehen.',
-        audience: ['Entwickler', 'DevOps'],
-        subgroup: 'KI-Werkzeuge',
-      },
      {
        id: 'agents',
        name: 'Agent Management',
--- a/admin-lehrer/package-lock.json
+++ b/admin-lehrer/package-lock.json
--- a/admin-lehrer/package.json
+++ b/admin-lehrer/package.json
@@ -18,6 +18,8 @@
    "test:all": "vitest run && playwright test --project=chromium"
  },
  "dependencies": {
+    "@fortune-sheet/react": "^1.0.4",
+    "fabric": "^6.0.0",
    "jspdf": "^4.1.0",
    "jszip": "^3.10.1",
    "lucide-react": "^0.468.0",
@@ -26,7 +28,6 @@
    "react-dom": "^18.3.1",
    "reactflow": "^11.11.4",
    "recharts": "^2.15.0",
-    "fabric": "^6.0.0",
    "uuid": "^13.0.0"
  },
  "devDependencies": {
--- a/backend-lehrer/infra/__init__.py
+++ b/backend-lehrer/infra/__init__.py
@@ -1,10 +1 @@
-"""
-Infrastructure management module.
-
-Provides control plane for external GPU resources (vast.ai).
-"""
-
-from .vast_client import VastAIClient
-from .vast_power import router as vast_router
-
-__all__ = ["VastAIClient", "vast_router"]
+# Infrastructure module (vast.ai GPU management removed — see git history)
--- a/backend-lehrer/infra/vast_client.py
+++ b/backend-lehrer/infra/vast_client.py
@@ -1,419 +0,0 @@
-"""
-Vast.ai REST API Client.
-
-Uses the official vast.ai API instead of the CLI for better stability.
-API documentation: https://docs.vast.ai/api
-"""
-
-import asyncio
-import logging
-from dataclasses import dataclass, field
-from datetime import datetime, timezone
-from enum import Enum
-from typing import Optional, Dict, Any, List
-
-import httpx
-
-logger = logging.getLogger(__name__)
-
-
-class InstanceStatus(Enum):
-    """Vast.ai Instance Status."""
-    RUNNING = "running"
-    STOPPED = "stopped"
-    EXITED = "exited"
-    LOADING = "loading"
-    SCHEDULING = "scheduling"
-    CREATING = "creating"
-    UNKNOWN = "unknown"
-
-
-@dataclass
-class AccountInfo:
-    """Information about the vast.ai account."""
-    credit: float  # Current credit in USD
-    balance: float  # Balance (usually 0)
-    total_spend: float  # Total spend
-    username: str
-    email: str
-    has_billing: bool
-
-    @classmethod
-    def from_api_response(cls, data: Dict[str, Any]) -> "AccountInfo":
-        """Create AccountInfo from an API response."""
-        return cls(
-            credit=data.get("credit", 0.0),
-            balance=data.get("balance", 0.0),
-            total_spend=abs(data.get("total_spend", 0.0)),  # API returns a negative value
-            username=data.get("username", ""),
-            email=data.get("email", ""),
-            has_billing=data.get("has_billing", False),
-        )
-
-    def to_dict(self) -> Dict[str, Any]:
-        """Serialize to a dictionary."""
-        return {
-            "credit": self.credit,
-            "balance": self.balance,
-            "total_spend": self.total_spend,
-            "username": self.username,
-            "email": self.email,
-            "has_billing": self.has_billing,
-        }
-
-
-@dataclass
-class InstanceInfo:
-    """Information about a vast.ai instance."""
-    id: int
-    status: InstanceStatus
-    machine_id: Optional[int] = None
-    gpu_name: Optional[str] = None
-    num_gpus: int = 1
-    gpu_ram: Optional[float] = None  # GB
-    cpu_ram: Optional[float] = None  # GB
-    disk_space: Optional[float] = None  # GB
-    dph_total: Optional[float] = None  # $/hour
-    public_ipaddr: Optional[str] = None
-    ports: Dict[str, Any] = field(default_factory=dict)
-    label: Optional[str] = None
-    image_uuid: Optional[str] = None
-    started_at: Optional[datetime] = None
-
-    @classmethod
-    def from_api_response(cls, data: Dict[str, Any]) -> "InstanceInfo":
-        """Create InstanceInfo from an API response."""
-        status_map = {
-            "running": InstanceStatus.RUNNING,
-            "exited": InstanceStatus.EXITED,
-            "loading": InstanceStatus.LOADING,
-            "scheduling": InstanceStatus.SCHEDULING,
-            "creating": InstanceStatus.CREATING,
-        }
-
-        actual_status = data.get("actual_status", "unknown")
-        status = status_map.get(actual_status, InstanceStatus.UNKNOWN)
-
-        # Parse ports mapping
-        ports = {}
-        if "ports" in data and data["ports"]:
-            ports = data["ports"]
-
-        # Parse started_at
-        started_at = None
-        if "start_date" in data and data["start_date"]:
-            try:
-                started_at = datetime.fromtimestamp(data["start_date"], tz=timezone.utc)
-            except (ValueError, TypeError):
-                pass
-
-        return cls(
-            id=data.get("id", 0),
-            status=status,
-            machine_id=data.get("machine_id"),
-            gpu_name=data.get("gpu_name"),
-            num_gpus=data.get("num_gpus", 1),
-            gpu_ram=data.get("gpu_ram"),
-            cpu_ram=data.get("cpu_ram"),
-            disk_space=data.get("disk_space"),
-            dph_total=data.get("dph_total"),
-            public_ipaddr=data.get("public_ipaddr"),
-            ports=ports,
-            label=data.get("label"),
-            image_uuid=data.get("image_uuid"),
-            started_at=started_at,
-        )
-
-    def get_endpoint_url(self, internal_port: int = 8001) -> Optional[str]:
-        """Compute the external URL for an internal port."""
-        if not self.public_ipaddr:
-            return None
-
-        # vast.ai maps internal ports to external ports
-        # Format: {"8001/tcp": [{"HostIp": "0.0.0.0", "HostPort": "12345"}]}
-        port_key = f"{internal_port}/tcp"
-        if port_key in self.ports:
-            port_info = self.ports[port_key]
-            if isinstance(port_info, list) and port_info:
-                host_port = port_info[0].get("HostPort")
-                if host_port:
-                    return f"http://{self.public_ipaddr}:{host_port}"
-
-        # Fallback: direct port
-        return f"http://{self.public_ipaddr}:{internal_port}"
-
-    def to_dict(self) -> Dict[str, Any]:
-        """Serialize to a dictionary."""
-        return {
-            "id": self.id,
-            "status": self.status.value,
-            "machine_id": self.machine_id,
-            "gpu_name": self.gpu_name,
-            "num_gpus": self.num_gpus,
-            "gpu_ram": self.gpu_ram,
-            "cpu_ram": self.cpu_ram,
-            "disk_space": self.disk_space,
-            "dph_total": self.dph_total,
-            "public_ipaddr": self.public_ipaddr,
-            "ports": self.ports,
-            "label": self.label,
-            "started_at": self.started_at.isoformat() if self.started_at else None,
-        }
-
-
-class VastAIClient:
-    """
-    Async client for the vast.ai REST API.
-
-    Uses the official API at https://console.vast.ai/api/v0/
-    """
-
-    BASE_URL = "https://console.vast.ai/api/v0"
-
-    def __init__(self, api_key: str, timeout: float = 30.0):
-        self.api_key = api_key
-        self.timeout = timeout
-        self._client: Optional[httpx.AsyncClient] = None
-
-    async def _get_client(self) -> httpx.AsyncClient:
-        """Lazy client creation."""
-        if self._client is None or self._client.is_closed:
-            self._client = httpx.AsyncClient(
-                timeout=self.timeout,
-                headers={
-                    "Accept": "application/json",
-                },
-            )
-        return self._client
-
-    async def close(self) -> None:
-        """Close the HTTP client."""
-        if self._client and not self._client.is_closed:
-            await self._client.aclose()
-            self._client = None
-
-    def _build_url(self, endpoint: str) -> str:
-        """Build the full URL including the API key."""
-        sep = "&" if "?" in endpoint else "?"
-        return f"{self.BASE_URL}{endpoint}{sep}api_key={self.api_key}"
-
-    async def list_instances(self) -> List[InstanceInfo]:
-        """List all instances."""
-        client = await self._get_client()
-        url = self._build_url("/instances/")
-
-        try:
-            response = await client.get(url)
-            response.raise_for_status()
-            data = response.json()
-
-            instances = []
-            if "instances" in data:
-                for inst_data in data["instances"]:
-                    instances.append(InstanceInfo.from_api_response(inst_data))
-
-            return instances
-
-        except httpx.HTTPStatusError as e:
-            logger.error(f"vast.ai API error listing instances: {e}")
-            raise
-
-    async def get_instance(self, instance_id: int) -> Optional[InstanceInfo]:
-        """Fetch details of a specific instance."""
-        client = await self._get_client()
-        url = self._build_url(f"/instances/{instance_id}/")
-
-        try:
-            response = await client.get(url)
-            response.raise_for_status()
-            data = response.json()
-
-            if "instances" in data:
-                instances = data["instances"]
-                # The API returns a dict for a single instance and a list for multiple
-                if isinstance(instances, list) and instances:
-                    return InstanceInfo.from_api_response(instances[0])
-                elif isinstance(instances, dict):
-                    # Add the ID if it is missing
-                    if "id" not in instances:
-                        instances["id"] = instance_id
-                    return InstanceInfo.from_api_response(instances)
-            elif isinstance(data, dict) and "id" in data:
-                return InstanceInfo.from_api_response(data)
-
-            return None
-
-        except httpx.HTTPStatusError as e:
-            if e.response.status_code == 404:
-                return None
-            logger.error(f"vast.ai API error getting instance {instance_id}: {e}")
-            raise
-
-    async def start_instance(self, instance_id: int) -> bool:
-        """Start a stopped instance."""
-        client = await self._get_client()
-        url = self._build_url(f"/instances/{instance_id}/")
-
-        try:
-            response = await client.put(
-                url,
-                json={"state": "running"},
-            )
-            response.raise_for_status()
-            logger.info(f"vast.ai instance {instance_id} start requested")
-            return True
-
-        except httpx.HTTPStatusError as e:
-            logger.error(f"vast.ai API error starting instance {instance_id}: {e}")
-            return False
-
-    async def stop_instance(self, instance_id: int) -> bool:
-        """Stop a running instance (keeps the disk)."""
-        client = await self._get_client()
-        url = self._build_url(f"/instances/{instance_id}/")
-
-        try:
-            response = await client.put(
-                url,
-                json={"state": "stopped"},
-            )
-            response.raise_for_status()
-            logger.info(f"vast.ai instance {instance_id} stop requested")
-            return True
-
-        except httpx.HTTPStatusError as e:
-            logger.error(f"vast.ai API error stopping instance {instance_id}: {e}")
-            return False
-
-    async def destroy_instance(self, instance_id: int) -> bool:
-        """Delete an instance completely (the disk is lost!)."""
-        client = await self._get_client()
-        url = self._build_url(f"/instances/{instance_id}/")
-
-        try:
-            response = await client.delete(url)
-            response.raise_for_status()
-            logger.info(f"vast.ai instance {instance_id} destroyed")
-            return True
-
-        except httpx.HTTPStatusError as e:
-            logger.error(f"vast.ai API error destroying instance {instance_id}: {e}")
-            return False
-
-    async def set_label(self, instance_id: int, label: str) -> bool:
-        """Set a label on an instance."""
-        client = await self._get_client()
-        url = self._build_url(f"/instances/{instance_id}/")
-
-        try:
-            response = await client.put(
-                url,
-                json={"label": label},
-            )
-            response.raise_for_status()
-            return True
-
-        except httpx.HTTPStatusError as e:
-            logger.error(f"vast.ai API error setting label on instance {instance_id}: {e}")
-            return False
-
-    async def wait_for_status(
-        self,
-        instance_id: int,
-        target_status: InstanceStatus,
-        timeout_seconds: int = 300,
-        poll_interval: float = 5.0,
-    ) -> Optional[InstanceInfo]:
-        """
-        Wait until an instance reaches a given status.
-
-        Returns:
-            InstanceInfo if the status was reached, None on timeout.
-        """
-        deadline = asyncio.get_event_loop().time() + timeout_seconds
-
-        while asyncio.get_event_loop().time() < deadline:
-            instance = await self.get_instance(instance_id)
-
-            if instance and instance.status == target_status:
-                return instance
-
-            if instance:
-                logger.debug(
-                    f"vast.ai instance {instance_id} status: {instance.status.value}, "
-                    f"waiting for {target_status.value}"
-                )
-
-            await asyncio.sleep(poll_interval)
-
-        logger.warning(
-            f"Timeout waiting for instance {instance_id} to reach {target_status.value}"
-        )
-        return None
-
-    async def wait_for_health(
-        self,
-        instance: InstanceInfo,
-        health_path: str = "/health",
-        internal_port: int = 8001,
-        timeout_seconds: int = 600,
-        poll_interval: float = 5.0,
-    ) -> bool:
-        """
-        Wait until the health endpoint is reachable.
-
-        Returns:
-            True if the health check passed, False on timeout.
-        """
-        endpoint = instance.get_endpoint_url(internal_port)
-        if not endpoint:
-            logger.error("No endpoint URL available for health check")
-            return False
-
-        health_url = f"{endpoint.rstrip('/')}{health_path}"
-        logger.info(f"Waiting for health at {health_url}")
-
-        deadline = asyncio.get_event_loop().time() + timeout_seconds
-        health_client = httpx.AsyncClient(timeout=5.0)
-
-        try:
-            while asyncio.get_event_loop().time() < deadline:
-                try:
-                    response = await health_client.get(health_url)
-                    if 200 <= response.status_code < 300:
-                        logger.info(f"Health check passed: {health_url}")
-                        return True
-                except Exception as e:
-                    logger.debug(f"Health check failed: {e}")
-
-                await asyncio.sleep(poll_interval)
-
-            logger.warning(f"Health check timeout: {health_url}")
-            return False
-
-        finally:
-            await health_client.aclose()
-
-    async def get_account_info(self) -> Optional[AccountInfo]:
-        """
-        Fetch account information, including credit/budget.
-
-        Returns:
-            AccountInfo, or None on error.
-        """
-        client = await self._get_client()
-        url = self._build_url("/users/current/")
-
-        try:
-            response = await client.get(url)
-            response.raise_for_status()
-            data = response.json()
-
-            return AccountInfo.from_api_response(data)
-
-        except httpx.HTTPStatusError as e:
-            logger.error(f"vast.ai API error getting account info: {e}")
-            return None
-        except Exception as e:
-            logger.error(f"Error getting vast.ai account info: {e}")
-            return None
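Note on the removed `get_endpoint_url` logic: the port lookup in the deleted client can be read in isolation as the standalone sketch below. It is a minimal illustration, not code from this commit. The function name `resolve_endpoint` and the sample IP address are hypothetical, and the mapping format is assumed to be the one quoted in the removed comment (`{"8001/tcp": [{"HostIp": ..., "HostPort": ...}]}`).

```python
from typing import Any, Dict, Optional


def resolve_endpoint(
    public_ip: Optional[str],
    ports: Dict[str, Any],
    internal_port: int = 8001,
) -> Optional[str]:
    """Map an internal container port to an externally reachable URL.

    Hypothetical helper mirroring the removed InstanceInfo.get_endpoint_url.
    """
    if not public_ip:
        # Without a public IP there is nothing to connect to.
        return None

    # Assumed mapping format: {"8001/tcp": [{"HostIp": "...", "HostPort": "12345"}]}
    mapping = ports.get(f"{internal_port}/tcp")
    if isinstance(mapping, list) and mapping:
        host_port = mapping[0].get("HostPort")
        if host_port:
            return f"http://{public_ip}:{host_port}"

    # Fallback: assume the internal port is exposed directly.
    return f"http://{public_ip}:{internal_port}"


if __name__ == "__main__":
    sample_ports = {"8001/tcp": [{"HostIp": "0.0.0.0", "HostPort": "12345"}]}
    print(resolve_endpoint("203.0.113.7", sample_ports))
    print(resolve_endpoint("203.0.113.7", {}))
    print(resolve_endpoint(None, sample_ports))
```

The fallback branch returns a URL even when no mapping exists, so a caller cannot tell a mapped port from a guessed one. That may be worth keeping in mind if this logic is ever restored from git history.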
--- a/backend-lehrer/infra/vast_power.py
+++ b/backend-lehrer/infra/vast_power.py
@@ -1,618 +0,0 @@
-"""
-Vast.ai Power Control API.
-
-Provides endpoints for:
-- Starting/stopping vast.ai instances
-- Status queries
-- Auto-shutdown on inactivity
-- Cost tracking
-
-Security: all endpoints require CONTROL_API_KEY.
-"""
-
-import asyncio
-import json
-import logging
-import os
-import time
-from datetime import datetime, timezone
-from pathlib import Path
-from typing import Optional, Dict, Any, List
-
-from fastapi import APIRouter, Depends, HTTPException, Header, BackgroundTasks
-from pydantic import BaseModel, Field
-
-from .vast_client import VastAIClient, InstanceInfo, InstanceStatus, AccountInfo
-
-logger = logging.getLogger(__name__)
-
-router = APIRouter(prefix="/infra/vast", tags=["Infrastructure"])
-
-
-# -------------------------
-# Configuration (ENV)
-# -------------------------
-VAST_API_KEY = os.getenv("VAST_API_KEY")
-VAST_INSTANCE_ID = os.getenv("VAST_INSTANCE_ID")  # Numeric instance ID
-CONTROL_API_KEY = os.getenv("CONTROL_API_KEY")  # Admin key for these endpoints
-
-# Health check configuration
-VAST_HEALTH_PORT = int(os.getenv("VAST_HEALTH_PORT", "8001"))
-VAST_HEALTH_PATH = os.getenv("VAST_HEALTH_PATH", "/health")
-VAST_WAIT_TIMEOUT_S = int(os.getenv("VAST_WAIT_TIMEOUT_S", "600"))  # 10 min
-
-# Auto-shutdown configuration
-AUTO_SHUTDOWN_ENABLED = os.getenv("VAST_AUTO_SHUTDOWN", "true").lower() == "true"
-AUTO_SHUTDOWN_MINUTES = int(os.getenv("VAST_AUTO_SHUTDOWN_MINUTES", "30"))
-
-# State persistence (in /tmp for container compatibility)
-STATE_PATH = Path(os.getenv("VAST_STATE_PATH", "/tmp/vast_state.json"))
-AUDIT_PATH = Path(os.getenv("VAST_AUDIT_PATH", "/tmp/vast_audit.log"))
-
-
-# -------------------------
-# State Management
-# -------------------------
-class VastState:
-    """
-    Persistent state for vast.ai control.
-
-    Stores:
-    - Current endpoint (because the IP can change)
-    - Last activity (for auto-shutdown)
-    - Cost tracking
-    """
-
-    def __init__(self, path: Path = STATE_PATH):
-        self.path = path
-        self._state: Dict[str, Any] = self._load()
-
-    def _load(self) -> Dict[str, Any]:
-        """Load state from disk."""
-        if not self.path.exists():
-            return {
-                "desired_state": None,
-                "endpoint_base_url": None,
-                "last_activity": None,
-                "last_start": None,
-                "last_stop": None,
-                "total_runtime_seconds": 0,
-                "total_cost_usd": 0.0,
-            }
-        try:
-            return json.loads(self.path.read_text(encoding="utf-8"))
-        except Exception:
-            return {}
-
-    def _save(self) -> None:
-        """Save state to disk."""
-        self.path.parent.mkdir(parents=True, exist_ok=True)
-        self.path.write_text(
-            json.dumps(self._state, ensure_ascii=False, indent=2),
-            encoding="utf-8",
-        )
-
-    def get(self, key: str, default: Any = None) -> Any:
-        return self._state.get(key, default)
-
-    def set(self, key: str, value: Any) -> None:
-        self._state[key] = value
-        self._save()
-
-    def update(self, data: Dict[str, Any]) -> None:
-        self._state.update(data)
-        self._save()
-
-    def record_activity(self) -> None:
-        """Record the last activity (for auto-shutdown)."""
-        self._state["last_activity"] = datetime.now(timezone.utc).isoformat()
-        self._save()
-
-    def get_last_activity(self) -> Optional[datetime]:
-        """Return the last activity as a datetime."""
-        ts = self._state.get("last_activity")
-        if ts:
-            return datetime.fromisoformat(ts)
-        return None
-
-    def record_start(self) -> None:
-        """Record the start time."""
-        self._state["last_start"] = datetime.now(timezone.utc).isoformat()
-        self._state["desired_state"] = "RUNNING"
-        self._save()
-
-    def record_stop(self, dph_total: Optional[float] = None) -> None:
-        """Record the stop time and compute the cost."""
-        now = datetime.now(timezone.utc)
-        self._state["last_stop"] = now.isoformat()
-        self._state["desired_state"] = "STOPPED"
-
-        # Compute runtime and cost
-        last_start = self._state.get("last_start")
-        if last_start:
-            start_dt = datetime.fromisoformat(last_start)
-            runtime_seconds = (now - start_dt).total_seconds()
-            self._state["total_runtime_seconds"] = (
-                self._state.get("total_runtime_seconds", 0) + runtime_seconds
-            )
-
-            if dph_total:
-                hours = runtime_seconds / 3600
-                cost = hours * dph_total
-                self._state["total_cost_usd"] = (
-                    self._state.get("total_cost_usd", 0.0) + cost
-                )
-                logger.info(
-                    f"Session cost: ${cost:.3f} ({runtime_seconds/60:.1f} min @ ${dph_total}/h)"
-                )
-
-        self._save()
-
-
-# Global state instance
-_state = VastState()
-
-
-# -------------------------
-# Audit Logging
-# -------------------------
-def audit_log(event: str, actor: str = "system", meta: Optional[Dict[str, Any]] = None) -> None:
-    """Write an audit log entry."""
-    meta = meta or {}
-    line = json.dumps(
-        {
-            "ts": datetime.now(timezone.utc).isoformat(),
-            "event": event,
-            "actor": actor,
-            "meta": meta,
-        },
-        ensure_ascii=False,
-    )
-    AUDIT_PATH.parent.mkdir(parents=True, exist_ok=True)
-    with AUDIT_PATH.open("a", encoding="utf-8") as f:
-        f.write(line + "\n")
-    logger.info(f"AUDIT: {event} by {actor}")
-
-
-# -------------------------
-# Request/Response Models
-# -------------------------
-class PowerOnRequest(BaseModel):
-    wait_for_health: bool = Field(default=True, description="Warten bis LLM bereit")
-    health_path: str = Field(default=VAST_HEALTH_PATH)
-    health_port: int = Field(default=VAST_HEALTH_PORT)
-
-
-class PowerOnResponse(BaseModel):
-    status: str
-    instance_id: Optional[int] = None
-    endpoint_base_url: Optional[str] = None
-    health_url: Optional[str] = None
-    message: Optional[str] = None
-
-
-class PowerOffRequest(BaseModel):
-    pass  # No parameters needed
-
-
-class PowerOffResponse(BaseModel):
-    status: str
-    session_runtime_minutes: Optional[float] = None
-    session_cost_usd: Optional[float] = None
-    message: Optional[str] = None
-
-
-class VastStatusResponse(BaseModel):
-    instance_id: Optional[int] = None
-    status: str
-    gpu_name: Optional[str] = None
-    dph_total: Optional[float] = None
-    endpoint_base_url: Optional[str] = None
-    last_activity: Optional[str] = None
-    auto_shutdown_in_minutes: Optional[int] = None
-    total_runtime_hours: Optional[float] = None
-    total_cost_usd: Optional[float] = None
-    # Budget / credit information
-    account_credit: Optional[float] = None  # Remaining credit in USD
-    account_total_spend: Optional[float] = None  # Total spend on vast.ai
-    # Session cost (since the last start)
-    session_runtime_minutes: Optional[float] = None
-    session_cost_usd: Optional[float] = None
-    message: Optional[str] = None
-
-
-class CostStatsResponse(BaseModel):
-    total_runtime_hours: float
-    total_cost_usd: float
-    sessions_count: int
-    avg_session_minutes: float
-
-
-# -------------------------
-# Security Dependency
-# -------------------------
-def require_control_key(x_api_key: Optional[str] = Header(default=None)) -> None:
-    """
-    Admin protection for control endpoints.
-
-    Header: X-API-Key: <CONTROL_API_KEY>
-    """
-    if not CONTROL_API_KEY:
-        raise HTTPException(
-            status_code=500,
-            detail="CONTROL_API_KEY not configured on server",
-        )
-    if x_api_key != CONTROL_API_KEY:
-        raise HTTPException(status_code=401, detail="Unauthorized")
-
-
-# -------------------------
-# Auto-Shutdown Background Task
-# -------------------------
-_shutdown_task: Optional[asyncio.Task] = None
-
-
-async def auto_shutdown_monitor() -> None:
-    """
-    Background task that stops the instance when it is inactive.
-
-    Runs continuously while the instance is on and checks every 60s whether
-    there was any activity. Stops the instance if there has been no activity
-    for AUTO_SHUTDOWN_MINUTES.
-    """
-    if not VAST_API_KEY or not VAST_INSTANCE_ID:
-        return
-
-    client = VastAIClient(VAST_API_KEY)
-
-    try:
-        while True:
-            await asyncio.sleep(60)  # Check every minute
-
-            if not AUTO_SHUTDOWN_ENABLED:
-                continue
-
-            last_activity = _state.get_last_activity()
-            if not last_activity:
-                continue
-
-            # Compute inactivity
-            now = datetime.now(timezone.utc)
-            inactive_minutes = (now - last_activity).total_seconds() / 60
-
-            if inactive_minutes >= AUTO_SHUTDOWN_MINUTES:
-                logger.info(
-                    f"Auto-shutdown triggered: {inactive_minutes:.1f} min inactive"
-                )
-                audit_log(
-                    "auto_shutdown",
-                    actor="system",
-                    meta={"inactive_minutes": inactive_minutes},
-                )
-
-                # Fetch current instance info for the cost calculation
-                instance = await client.get_instance(int(VAST_INSTANCE_ID))
-                dph = instance.dph_total if instance else None
-
-                # Stop
-                await client.stop_instance(int(VAST_INSTANCE_ID))
-                _state.record_stop(dph_total=dph)
-
-                audit_log("auto_shutdown_complete", actor="system")
-
-    except asyncio.CancelledError:
-        pass
-    except Exception as e:
-        logger.error(f"Auto-shutdown monitor error: {e}")
-    finally:
-        await client.close()
-
-
-def start_auto_shutdown_monitor() -> None:
-    """Start the auto-shutdown monitor."""
-    global _shutdown_task
-    if _shutdown_task is None or _shutdown_task.done():
-        _shutdown_task = asyncio.create_task(auto_shutdown_monitor())
-        logger.info("Auto-shutdown monitor started")
-
-
-def stop_auto_shutdown_monitor() -> None:
-    """Stop the auto-shutdown monitor."""
-    global _shutdown_task
-    if _shutdown_task and not _shutdown_task.done():
-        _shutdown_task.cancel()
-        logger.info("Auto-shutdown monitor stopped")
-
-
-# -------------------------
-# API Endpoints
-# -------------------------
-
-@router.get("/status", response_model=VastStatusResponse, dependencies=[Depends(require_control_key)])
-async def get_status() -> VastStatusResponse:
-    """
-    Return the status of the vast.ai instance.
-
-    Includes:
-    - Current status (running/stopped/etc.)
-    - GPU info and cost per hour
-    - Endpoint URL
-    - Auto-shutdown timer
-    - Total cost
-    - Account credit (remaining budget)
-    - Session cost (since the last start)
-    """
-    if not VAST_API_KEY or not VAST_INSTANCE_ID:
-        return VastStatusResponse(
-            status="unconfigured",
-            message="VAST_API_KEY or VAST_INSTANCE_ID not set",
-        )
-
-    client = VastAIClient(VAST_API_KEY)
-    try:
-        instance = await client.get_instance(int(VAST_INSTANCE_ID))
-
-        if not instance:
-            return VastStatusResponse(
-                instance_id=int(VAST_INSTANCE_ID),
-                status="not_found",
-                message=f"Instance {VAST_INSTANCE_ID} not found",
-            )
-
-        # Fetch account info for budget/credit
-        account_info = await client.get_account_info()
-        account_credit = account_info.credit if account_info else None
-        account_total_spend = account_info.total_spend if account_info else None
-
-        # Update endpoint if running
-        endpoint = None
-        if instance.status == InstanceStatus.RUNNING:
-            endpoint = instance.get_endpoint_url(VAST_HEALTH_PORT)
-            if endpoint:
-                _state.set("endpoint_base_url", endpoint)
-
-        # Calculate auto-shutdown timer
-        auto_shutdown_minutes = None
-        if AUTO_SHUTDOWN_ENABLED and instance.status == InstanceStatus.RUNNING:
-            last_activity = _state.get_last_activity()
-            if last_activity:
-                inactive = (datetime.now(timezone.utc) - last_activity).total_seconds() / 60
-                auto_shutdown_minutes = max(0, int(AUTO_SHUTDOWN_MINUTES - inactive))
-
-        # Compute the current session cost (if the instance is running)
-        session_runtime_minutes = None
-        session_cost_usd = None
-        last_start = _state.get("last_start")
-
-        # If the instance is running but last_start is not set (e.g. after a container restart),
-        # use start_date from the vast.ai API if available, otherwise the current time
-        if instance.status == InstanceStatus.RUNNING and not last_start:
-            if instance.started_at:
-                _state.set("last_start", instance.started_at.isoformat())
-                last_start = instance.started_at.isoformat()
-            else:
-                _state.record_start()
-                last_start = _state.get("last_start")
-
-        if last_start and instance.status == InstanceStatus.RUNNING:
-            start_dt = datetime.fromisoformat(last_start)
-            session_runtime_minutes = (datetime.now(timezone.utc) - start_dt).total_seconds() / 60
-            if instance.dph_total:
-                session_cost_usd = (session_runtime_minutes / 60) * instance.dph_total
-
-        return VastStatusResponse(
-            instance_id=instance.id,
-            status=instance.status.value,
-            gpu_name=instance.gpu_name,
-            dph_total=instance.dph_total,
-            endpoint_base_url=endpoint or _state.get("endpoint_base_url"),
-            last_activity=_state.get("last_activity"),
-            auto_shutdown_in_minutes=auto_shutdown_minutes,
-            total_runtime_hours=_state.get("total_runtime_seconds", 0) / 3600,
-            total_cost_usd=_state.get("total_cost_usd", 0.0),
-            account_credit=account_credit,
-            account_total_spend=account_total_spend,
-            session_runtime_minutes=session_runtime_minutes,
-            session_cost_usd=session_cost_usd,
-        )
-
-    finally:
-        await client.close()
-
-
-@router.post("/power/on", response_model=PowerOnResponse, dependencies=[Depends(require_control_key)])
-async def power_on(
-    payload: PowerOnRequest,
-    background_tasks: BackgroundTasks,
-) -> PowerOnResponse:
-    """
-    Start the vast.ai instance.
-
-    1. Start the instance via the API
-    2. Wait for status RUNNING
-    3. Optional: wait for the health endpoint
-    4. Start the auto-shutdown monitor
-    """
-    if not VAST_API_KEY or not VAST_INSTANCE_ID:
-        raise HTTPException(
-            status_code=500,
-            detail="VAST_API_KEY or VAST_INSTANCE_ID not configured",
-        )
-
-    instance_id = int(VAST_INSTANCE_ID)
-    audit_log("power_on_requested", meta={"instance_id": instance_id})
-
-    client = VastAIClient(VAST_API_KEY)
-    try:
-        # Start instance
-        success = await client.start_instance(instance_id)
-        if not success:
-            raise HTTPException(status_code=502, detail="Failed to start instance")
-
-        _state.record_start()
-        _state.record_activity()
-
-        # Wait for running status
-        instance = await client.wait_for_status(
-            instance_id,
-            InstanceStatus.RUNNING,
-            timeout_seconds=300,
-        )
-
-        if not instance:
-            return PowerOnResponse(
-                status="starting",
-                instance_id=instance_id,
-                message="Instance start requested but not yet running. Check status.",
-            )
-
-        # Get endpoint
-        endpoint = instance.get_endpoint_url(payload.health_port)
-        if endpoint:
-            _state.set("endpoint_base_url", endpoint)
-
-        # Wait for health if requested
-        if payload.wait_for_health:
-            health_ok = await client.wait_for_health(
-                instance,
-                health_path=payload.health_path,
-                internal_port=payload.health_port,
-                timeout_seconds=VAST_WAIT_TIMEOUT_S,
-            )
-
-            if not health_ok:
-                audit_log("power_on_health_timeout", meta={"instance_id": instance_id})
-                return PowerOnResponse(
-                    status="running_unhealthy",
-                    instance_id=instance_id,
-                    endpoint_base_url=endpoint,
-                    message=f"Instance running but health check failed at {endpoint}{payload.health_path}",
-                )
-
-        # Start auto-shutdown monitor
-        start_auto_shutdown_monitor()
-
-        audit_log("power_on_complete", meta={
-            "instance_id": instance_id,
-            "endpoint": endpoint,
-        })
-
-        return PowerOnResponse(
-            status="running",
-            instance_id=instance_id,
-            endpoint_base_url=endpoint,
-            health_url=f"{endpoint}{payload.health_path}" if endpoint else None,
-            message="Instance running and healthy",
-        )
-
-    finally:
-        await client.close()
-
-
-@router.post("/power/off", response_model=PowerOffResponse, dependencies=[Depends(require_control_key)])
-async def power_off(payload: PowerOffRequest) -> PowerOffResponse:
-    """
-    Stoppt die vast.ai Instanz (behaelt Disk).
-
-    Berechnet Session-Kosten und -Laufzeit.
-    """
-    if not VAST_API_KEY or not VAST_INSTANCE_ID:
-        raise HTTPException(
-            status_code=500,
-            detail="VAST_API_KEY or VAST_INSTANCE_ID not configured",
-        )
-
-    instance_id = int(VAST_INSTANCE_ID)
-    audit_log("power_off_requested", meta={"instance_id": instance_id})
-
-    # Stop auto-shutdown monitor
-    stop_auto_shutdown_monitor()
-
-    client = VastAIClient(VAST_API_KEY)
-    try:
-        # Get current info for cost calculation
-        instance = await client.get_instance(instance_id)
-        dph = instance.dph_total if instance else None
-
-        # Calculate session stats before updating state
-        session_runtime = 0.0
-        session_cost = 0.0
-        last_start = _state.get("last_start")
-        if last_start:
-            start_dt = datetime.fromisoformat(last_start)
-            session_runtime = (datetime.now(timezone.utc) - start_dt).total_seconds() / 60
-            if dph:
-                session_cost = (session_runtime / 60) * dph
-
-        # Stop instance
-        success = await client.stop_instance(instance_id)
-        if not success:
-            raise HTTPException(status_code=502, detail="Failed to stop instance")
-
-        _state.record_stop(dph_total=dph)
-
-        audit_log("power_off_complete", meta={
-            "instance_id": instance_id,
-            "session_minutes": session_runtime,
-            "session_cost": session_cost,
-        })
-
-        return PowerOffResponse(
-            status="stopped",
-            session_runtime_minutes=session_runtime,
-            session_cost_usd=session_cost,
-            message=f"Instance stopped. Session: {session_runtime:.1f} min, ${session_cost:.3f}",
-        )
-
-    finally:
-        await client.close()
-
-
-@router.post("/activity", dependencies=[Depends(require_control_key)])
-async def record_activity() -> Dict[str, str]:
-    """
-    Zeichnet Aktivitaet auf (verzoegert Auto-Shutdown).
-
-    Sollte von LLM Gateway aufgerufen werden bei jedem Request.
-    """
-    _state.record_activity()
-    return {"status": "recorded", "last_activity": _state.get("last_activity")}
-
-
-@router.get("/costs", response_model=CostStatsResponse, dependencies=[Depends(require_control_key)])
-async def get_costs() -> CostStatsResponse:
-    """
-    Gibt Kosten-Statistiken zurueck.
-    """
-    total_seconds = _state.get("total_runtime_seconds", 0)
-    total_cost = _state.get("total_cost_usd", 0.0)
-
-    # TODO: Sessions count from audit log
-    sessions = 1 if total_seconds > 0 else 0
-    avg_minutes = (total_seconds / 60 / sessions) if sessions > 0 else 0
-
-    return CostStatsResponse(
-        total_runtime_hours=total_seconds / 3600,
-        total_cost_usd=total_cost,
-        sessions_count=sessions,
-        avg_session_minutes=avg_minutes,
-    )
-
-
-@router.get("/audit", dependencies=[Depends(require_control_key)])
-async def get_audit_log(limit: int = 50) -> List[Dict[str, Any]]:
-    """
-    Gibt letzte Audit-Log Eintraege zurueck.
-    """
-    if not AUDIT_PATH.exists():
-        return []
-
-    lines = AUDIT_PATH.read_text(encoding="utf-8").strip().split("\n")
-    entries = []
-    for line in lines[-limit:]:
-        try:
-            entries.append(json.loads(line))
-        except json.JSONDecodeError:
-            continue
-
-    return list(reversed(entries))  # Neueste zuerst
--- a/backend-lehrer/jitsi_api.py
+++ b/backend-lehrer/jitsi_api.py
@@ -1,199 +0,0 @@
-"""
-BreakPilot Jitsi API
-
-Ermoeglicht das Versenden von Jitsi-Meeting-Einladungen per Email.
-"""
-
-import os
-import uuid
-from datetime import datetime
-from typing import Optional, List
-from pydantic import BaseModel, Field
-
-from fastapi import APIRouter, HTTPException
-
-router = APIRouter(prefix="/api/jitsi", tags=["Jitsi"])
-
-# Standard Jitsi Server (kann konfiguriert werden)
-JITSI_SERVER = os.getenv("JITSI_SERVER", "https://meet.jit.si")
-
-
-# ==========================================
-# PYDANTIC MODELS
-# ==========================================
-
-class JitsiInvitation(BaseModel):
-    """Model fuer Jitsi-Meeting-Einladung."""
-    to_email: str = Field(..., description="Email-Adresse des Teilnehmers")
-    to_name: str = Field(..., description="Name des Teilnehmers")
-    organizer_name: str = Field(default="BreakPilot Lehrer", description="Name des Organisators")
-    meeting_title: str = Field(..., description="Titel des Meetings")
-    meeting_date: str = Field(..., description="Datum z.B. '20. Dezember 2024'")
-    meeting_time: str = Field(..., description="Uhrzeit z.B. '14:00 Uhr'")
-    room_name: Optional[str] = Field(None, description="Raumname (wird generiert wenn leer)")
-    additional_info: Optional[str] = Field(None, description="Zusaetzliche Informationen")
-
-
-class JitsiInvitationResponse(BaseModel):
-    """Antwort auf eine Jitsi-Einladung."""
-    success: bool
-    jitsi_url: str
-    room_name: str
-    email_sent: bool
-    email_error: Optional[str] = None
-
-
-class JitsiBulkInvitation(BaseModel):
-    """Model fuer mehrere Jitsi-Einladungen."""
-    recipients: List[dict] = Field(..., description="Liste von {email, name} Objekten")
-    organizer_name: str = Field(default="BreakPilot Lehrer")
-    meeting_title: str
-    meeting_date: str
-    meeting_time: str
-    room_name: Optional[str] = None
-    additional_info: Optional[str] = None
-
-
-class JitsiBulkResponse(BaseModel):
-    """Antwort auf Bulk-Einladungen."""
-    jitsi_url: str
-    room_name: str
-    sent: int
-    failed: int
-    errors: List[str]
-
-
-# ==========================================
-# HELPER FUNCTIONS
-# ==========================================
-
-def generate_room_name() -> str:
-    """Generiert einen sicheren Raumnamen."""
-    # UUID-basiert fuer Sicherheit
-    unique_id = uuid.uuid4().hex[:12]
-    return f"BreakPilot-{unique_id}"
-
-
-def build_jitsi_url(room_name: str) -> str:
-    """Erstellt die vollstaendige Jitsi-URL."""
-    return f"{JITSI_SERVER}/{room_name}"
-
-
-# ==========================================
-# API ENDPOINTS
-# ==========================================
-
-@router.post("/invite", response_model=JitsiInvitationResponse)
-async def send_jitsi_invitation(invitation: JitsiInvitation):
-    """
-    Sendet eine Jitsi-Meeting-Einladung per Email.
-
-    Der Empfaenger kann dem Meeting ueber den Browser beitreten,
-    ohne Matrix oder andere Software installieren zu muessen.
-    """
-    # Raumname generieren oder verwenden
-    room_name = invitation.room_name or generate_room_name()
-    jitsi_url = build_jitsi_url(room_name)
-
-    email_sent = False
-    email_error = None
-
-    try:
-        from email_service import email_service
-
-        result = email_service.send_jitsi_invitation(
-            to_email=invitation.to_email,
-            to_name=invitation.to_name,
-            organizer_name=invitation.organizer_name,
-            meeting_title=invitation.meeting_title,
-            meeting_date=invitation.meeting_date,
-            meeting_time=invitation.meeting_time,
-            jitsi_url=jitsi_url,
-            additional_info=invitation.additional_info
-        )
-
-        email_sent = result.success
-        if not result.success:
-            email_error = result.error
-
-    except Exception as e:
-        email_error = str(e)
-
-    return JitsiInvitationResponse(
-        success=email_sent,
-        jitsi_url=jitsi_url,
-        room_name=room_name,
-        email_sent=email_sent,
-        email_error=email_error
-    )
-
-
-@router.post("/invite/bulk", response_model=JitsiBulkResponse)
-async def send_bulk_jitsi_invitations(bulk: JitsiBulkInvitation):
-    """
-    Sendet Jitsi-Einladungen an mehrere Empfaenger.
-
-    Alle Empfaenger erhalten eine Einladung zum selben Meeting.
-    """
-    # Gemeinsamer Raumname fuer alle
-    room_name = bulk.room_name or generate_room_name()
-    jitsi_url = build_jitsi_url(room_name)
-
-    sent = 0
-    failed = 0
-    errors = []
-
-    try:
-        from email_service import email_service
-
-        for recipient in bulk.recipients:
-            if not recipient.get("email"):
-                errors.append(f"Fehlende Email fuer {recipient.get('name', 'Unbekannt')}")
-                failed += 1
-                continue
-
-            result = email_service.send_jitsi_invitation(
-                to_email=recipient["email"],
-                to_name=recipient.get("name", ""),
-                organizer_name=bulk.organizer_name,
-                meeting_title=bulk.meeting_title,
-                meeting_date=bulk.meeting_date,
-                meeting_time=bulk.meeting_time,
-                jitsi_url=jitsi_url,
-                additional_info=bulk.additional_info
-            )
-
-            if result.success:
-                sent += 1
-            else:
-                failed += 1
-                errors.append(f"{recipient.get('email')}: {result.error}")
-
-    except Exception as e:
-        errors.append(f"Allgemeiner Fehler: {str(e)}")
-
-    return JitsiBulkResponse(
-        jitsi_url=jitsi_url,
-        room_name=room_name,
-        sent=sent,
-        failed=failed,
-        errors=errors[:20]  # Max 20 Fehler zurueckgeben
-    )
-
-
-@router.get("/room")
-async def generate_meeting_room():
-    """
-    Generiert einen neuen Meeting-Raum.
-
-    Gibt die URL zurueck ohne Einladungen zu senden.
-    """
-    room_name = generate_room_name()
-    jitsi_url = build_jitsi_url(room_name)
-
-    return {
-        "room_name": room_name,
-        "jitsi_url": jitsi_url,
-        "server": JITSI_SERVER,
-        "created_at": datetime.utcnow().isoformat()
-    }
--- a/backend-lehrer/learning_units_api.py
+++ b/backend-lehrer/learning_units_api.py
@@ -1,5 +1,9 @@
 from typing import List, Dict, Any, Optional
 from datetime import datetime
+from pathlib import Path
+import json
+import os
+import logging

 from fastapi import APIRouter, HTTPException
 from pydantic import BaseModel
@@ -15,6 +19,8 @@ from learning_units import (
    delete_learning_unit,
 )

+logger = logging.getLogger(__name__)
+

 router = APIRouter(
    prefix="/learning-units",
@@ -49,6 +55,11 @@ class RemoveWorksheetPayload(BaseModel):
    worksheet_file: str


+class GenerateFromAnalysisPayload(BaseModel):
+    analysis_data: Dict[str, Any]
+    num_questions: int = 8
+
+
 # ---------- Hilfsfunktion: Backend-Modell -> Frontend-Objekt ----------


@@ -195,3 +206,171 @@ def api_delete_learning_unit(unit_id: str):
        raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
    return {"status": "deleted", "id": unit_id}

+
+# ---------- Generator endpoints ----------
+
+LERNEINHEITEN_DIR = os.path.expanduser("~/Arbeitsblaetter/Lerneinheiten")
+
+
+def _save_analysis_and_get_path(unit_id: str, analysis_data: Dict[str, Any]) -> Path:
+    """Save analysis_data to disk and return the path."""
+    os.makedirs(LERNEINHEITEN_DIR, exist_ok=True)
+    path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_analyse.json"
+    with open(path, "w", encoding="utf-8") as f:
+        json.dump(analysis_data, f, ensure_ascii=False, indent=2)
+    return path
+
+
+@router.post("/{unit_id}/generate-qa")
+def api_generate_qa(unit_id: str, payload: GenerateFromAnalysisPayload):
+    """Generate Q&A items with Leitner fields from analysis data."""
+    lu = get_learning_unit(unit_id)
+    if not lu:
+        raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
+
+    analysis_path = _save_analysis_and_get_path(unit_id, payload.analysis_data)
+
+    try:
+        from ai_processing.qa_generator import generate_qa_from_analysis
+        qa_path = generate_qa_from_analysis(analysis_path, num_questions=payload.num_questions)
+        with open(qa_path, "r", encoding="utf-8") as f:
+            qa_data = json.load(f)
+
+        # Update unit status
+        update_learning_unit(unit_id, LearningUnitUpdate(status="qa_generated"))
+        logger.info(f"Generated QA for unit {unit_id}: {len(qa_data.get('qa_items', []))} items")
+        return qa_data
+    except Exception as e:
+        logger.exception(f"QA generation failed for {unit_id}")  # logs the traceback as well
+        raise HTTPException(status_code=500, detail=f"QA-Generierung fehlgeschlagen: {e}") from e
+
+
+@router.post("/{unit_id}/generate-mc")
+def api_generate_mc(unit_id: str, payload: GenerateFromAnalysisPayload):
+    """Generate multiple choice questions from analysis data."""
+    lu = get_learning_unit(unit_id)
+    if not lu:
+        raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
+
+    analysis_path = _save_analysis_and_get_path(unit_id, payload.analysis_data)
+
+    try:
+        from ai_processing.mc_generator import generate_mc_from_analysis
+        mc_path = generate_mc_from_analysis(analysis_path, num_questions=payload.num_questions)
+        with open(mc_path, "r", encoding="utf-8") as f:
+            mc_data = json.load(f)
+
+        update_learning_unit(unit_id, LearningUnitUpdate(status="mc_generated"))
+        logger.info(f"Generated MC for unit {unit_id}: {len(mc_data.get('questions', []))} questions")
+        return mc_data
+    except Exception as e:
+        logger.exception(f"MC generation failed for {unit_id}")  # logs the traceback as well
+        raise HTTPException(status_code=500, detail=f"MC-Generierung fehlgeschlagen: {e}") from e
+
+
+@router.post("/{unit_id}/generate-cloze")
+def api_generate_cloze(unit_id: str, payload: GenerateFromAnalysisPayload):
+    """Generate cloze (fill-in-the-blank) items from analysis data."""
+    lu = get_learning_unit(unit_id)
+    if not lu:
+        raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
+
+    analysis_path = _save_analysis_and_get_path(unit_id, payload.analysis_data)
+
+    try:
+        from ai_processing.cloze_generator import generate_cloze_from_analysis
+        cloze_path = generate_cloze_from_analysis(analysis_path)
+        with open(cloze_path, "r", encoding="utf-8") as f:
+            cloze_data = json.load(f)
+
+        update_learning_unit(unit_id, LearningUnitUpdate(status="cloze_generated"))
+        logger.info(f"Generated Cloze for unit {unit_id}: {len(cloze_data.get('cloze_items', []))} items")
+        return cloze_data
+    except Exception as e:
+        logger.exception(f"Cloze generation failed for {unit_id}")  # logs the traceback as well
+        raise HTTPException(status_code=500, detail=f"Cloze-Generierung fehlgeschlagen: {e}") from e
+
+
+@router.get("/{unit_id}/qa")
+def api_get_qa(unit_id: str):
+    """Get generated QA items for a unit."""
+    qa_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_qa.json"
+    if not qa_path.exists():
+        raise HTTPException(status_code=404, detail="Keine QA-Daten gefunden.")
+    with open(qa_path, "r", encoding="utf-8") as f:
+        return json.load(f)
+
+
+@router.get("/{unit_id}/mc")
+def api_get_mc(unit_id: str):
+    """Get generated MC questions for a unit."""
+    mc_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_mc.json"
+    if not mc_path.exists():
+        raise HTTPException(status_code=404, detail="Keine MC-Daten gefunden.")
+    with open(mc_path, "r", encoding="utf-8") as f:
+        return json.load(f)
+
+
+@router.get("/{unit_id}/cloze")
+def api_get_cloze(unit_id: str):
+    """Get generated cloze items for a unit."""
+    cloze_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_cloze.json"
+    if not cloze_path.exists():
+        raise HTTPException(status_code=404, detail="Keine Cloze-Daten gefunden.")
+    with open(cloze_path, "r", encoding="utf-8") as f:
+        return json.load(f)
+
+
+@router.post("/{unit_id}/leitner/update")
+def api_update_leitner(unit_id: str, item_id: str, correct: bool):
+    """Update Leitner progress for a QA item."""
+    qa_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_qa.json"
+    if not qa_path.exists():
+        raise HTTPException(status_code=404, detail="Keine QA-Daten gefunden.")
+    try:
+        from ai_processing.qa_generator import update_leitner_progress
+        result = update_leitner_progress(qa_path, item_id, correct)
+        return result
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+@router.get("/{unit_id}/leitner/next")
+def api_get_next_review(unit_id: str, limit: int = 5):
+    """Get next Leitner review items."""
+    qa_path = Path(LERNEINHEITEN_DIR) / f"{unit_id}_qa.json"
+    if not qa_path.exists():
+        raise HTTPException(status_code=404, detail="Keine QA-Daten gefunden.")
+    try:
+        from ai_processing.qa_generator import get_next_review_items
+        items = get_next_review_items(qa_path, limit=limit)
+        return {"items": items, "count": len(items)}
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+class StoryGeneratePayload(BaseModel):
+    vocabulary: List[Dict[str, Any]]
+    language: str = "en"
+    grade_level: str = "5-8"
+
+
+@router.post("/{unit_id}/generate-story")
+def api_generate_story(unit_id: str, payload: StoryGeneratePayload):
+    """Generate a short story using vocabulary words."""
+    lu = get_learning_unit(unit_id)
+    if not lu:
+        raise HTTPException(status_code=404, detail="Lerneinheit nicht gefunden.")
+
+    try:
+        from story_generator import generate_story
+        result = generate_story(
+            vocabulary=payload.vocabulary,
+            language=payload.language,
+            grade_level=payload.grade_level,
+        )
+        return result
+    except Exception as e:
+        logger.exception(f"Story generation failed for {unit_id}")  # logs the traceback as well
+        raise HTTPException(status_code=500, detail=f"Story-Generierung fehlgeschlagen: {e}") from e
+
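Review note: the Leitner helpers `update_leitner_progress` and `get_next_review_items` are imported from `ai_processing.qa_generator`, which this diff does not show. As a rough illustration only, the sketch below assumes a classic Leitner scheme in which a correct answer moves an item up one box and a wrong answer sends it back to box 1. The `box` field name, the box count `MAX_BOX`, and both function names are hypothetical and are not taken from the repository.

```python
from typing import Any, Dict, List

MAX_BOX = 5  # hypothetical number of Leitner boxes


def leitner_update(item: Dict[str, Any], correct: bool) -> Dict[str, Any]:
    """Return a copy of `item` with its (assumed) 'box' field updated."""
    box = int(item.get("box", 1))
    # Correct answers promote the item by one box (capped); wrong answers reset it.
    new_box = min(box + 1, MAX_BOX) if correct else 1
    return {**item, "box": new_box}


def next_review_items(items: List[Dict[str, Any]], limit: int = 5) -> List[Dict[str, Any]]:
    """Pick up to `limit` items, lowest box first (least well known)."""
    # sorted() is stable, so items in the same box keep their original order.
    return sorted(items, key=lambda it: int(it.get("box", 1)))[:limit]
```

Under these assumptions, the `correct` flag passed to `/leitner/update` would only ever change the box by one step up or reset it, and `/leitner/next` would surface the lowest boxes first. The real helpers may additionally track review dates or intervals.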
--- a/backend-lehrer/main.py
+++ b/backend-lehrer/main.py
@@ -40,7 +40,6 @@ os.environ["DATABASE_URL"] = DATABASE_URL
 # ---------------------------------------------------------------------------
 LLM_GATEWAY_ENABLED = os.getenv("LLM_GATEWAY_ENABLED", "false").lower() == "true"
 ALERTS_AGENT_ENABLED = os.getenv("ALERTS_AGENT_ENABLED", "false").lower() == "true"
-VAST_API_KEY = os.getenv("VAST_API_KEY")


 # ---------------------------------------------------------------------------
@@ -106,21 +105,20 @@ app.include_router(correction_router, prefix="/api")
 from learning_units_api import router as learning_units_router
 app.include_router(learning_units_router, prefix="/api")

+# --- 4b. Learning Progress ---
+from progress_api import router as progress_router
+app.include_router(progress_router, prefix="/api")
+
 from unit_api import router as unit_router
 app.include_router(unit_router)  # Already has /api/units prefix

 from unit_analytics_api import router as unit_analytics_router
 app.include_router(unit_analytics_router)  # Already has /api/analytics prefix

-# --- 5. Meetings / Jitsi ---
-from meetings_api import router as meetings_api_router
-app.include_router(meetings_api_router)  # Already has /api/meetings prefix

 from recording_api import router as recording_api_router
 app.include_router(recording_api_router)  # Already has /api/recordings prefix

-from jitsi_api import router as jitsi_router
-app.include_router(jitsi_router)  # Already has /api/jitsi prefix

 # --- 6. Messenger ---
 from messenger_api import router as messenger_router
@@ -180,11 +178,6 @@ if ALERTS_AGENT_ENABLED:
    from alerts_agent.api import router as alerts_router
    app.include_router(alerts_router, prefix="/api", tags=["Alerts Agent"])

-# --- 14. vast.ai GPU Infrastructure (optional) ---
-if VAST_API_KEY:
-    from infra.vast_power import router as vast_router
-    app.include_router(vast_router, tags=["GPU Infrastructure"])
-

 # ---------------------------------------------------------------------------
 # Middleware (from shared middleware/ package)
--- a/backend-lehrer/meetings_api.py
+++ b/backend-lehrer/meetings_api.py
@@ -1,443 +0,0 @@
-"""
-Meetings API Module
-Backend API endpoints for Jitsi Meet integration
-"""
-import os
-import uuid
-import httpx
-from datetime import datetime, timedelta
-from typing import Optional, List
-from fastapi import APIRouter, HTTPException, Depends
-from pydantic import BaseModel, EmailStr
-
-router = APIRouter(prefix="/api/meetings", tags=["meetings"])
-
-
-# ============================================
-# Configuration
-# ============================================
-
-JITSI_BASE_URL = os.getenv("JITSI_PUBLIC_URL", "http://localhost:8443")
-CONSENT_SERVICE_URL = os.getenv("CONSENT_SERVICE_URL", "http://localhost:8081")
-
-
-# ============================================
-# Models
-# ============================================
-
-class MeetingConfig(BaseModel):
-    enable_lobby: bool = True
-    enable_recording: bool = False
-    start_with_audio_muted: bool = True
-    start_with_video_muted: bool = False
-    require_display_name: bool = True
-    enable_breakout: bool = False
-
-
-class CreateMeetingRequest(BaseModel):
-    type: str = "quick"  # quick, scheduled, training, parent, class
-    title: str = "Neues Meeting"
-    duration: int = 60
-    scheduled_at: Optional[str] = None
-    config: Optional[MeetingConfig] = None
-    description: Optional[str] = None
-    invites: Optional[List[str]] = None
-
-
-class ScheduleMeetingRequest(BaseModel):
-    title: str
-    scheduled_at: str
-    duration: int = 60
-    description: Optional[str] = None
-    invites: Optional[List[str]] = None
-
-
-class TrainingRequest(BaseModel):
-    title: str
-    description: Optional[str] = None
-    scheduled_at: str
-    duration: int = 120
-    max_participants: int = 20
-    trainer: str
-    config: Optional[MeetingConfig] = None
-
-
-class ParentTeacherRequest(BaseModel):
-    student_name: str
-    parent_name: str
-    parent_email: Optional[str] = None
-    scheduled_at: str
-    reason: Optional[str] = None
-    send_invite: bool = True
-    duration: int = 30
-
-
-class MeetingResponse(BaseModel):
-    room_name: str
-    join_url: str
-    moderator_url: Optional[str] = None
-    password: Optional[str] = None
-    expires_at: Optional[str] = None
-
-
-class MeetingStats(BaseModel):
-    active: int = 0
-    scheduled: int = 0
-    recordings: int = 0
-    participants: int = 0
-
-
-class ActiveMeeting(BaseModel):
-    room_name: str
-    title: str
-    participants: int
-    started_at: str
-
-
-# ============================================
-# In-Memory Storage (for demo purposes)
-# In production, use database
-# ============================================
-
-scheduled_meetings = []
-active_meetings = []
-trainings = []
-recordings = []
-
-
-# ============================================
-# Helper Functions
-# ============================================
-
-def generate_room_name(prefix: str = "meeting") -> str:
-    """Generate a unique room name"""
-    return f"{prefix}-{uuid.uuid4().hex[:8]}"
-
-
-def generate_password() -> str:
-    """Generate a simple password"""
-    return uuid.uuid4().hex[:8]
-
-
-def build_jitsi_url(room_name: str, config: Optional[MeetingConfig] = None) -> str:
-    """Build Jitsi meeting URL with config parameters"""
-    params = []
-
-    if config:
-        if config.start_with_audio_muted:
-            params.append("config.startWithAudioMuted=true")
-        if config.start_with_video_muted:
-            params.append("config.startWithVideoMuted=true")
-        if config.require_display_name:
-            params.append("config.requireDisplayName=true")
-
-    # Common config
-    params.extend([
-        "config.prejoinPageEnabled=false",
-        "config.disableDeepLinking=true",
-        "config.defaultLanguage=de",
-        "interfaceConfig.SHOW_JITSI_WATERMARK=false",
-        "interfaceConfig.SHOW_BRAND_WATERMARK=false"
-    ])
-
-    url = f"{JITSI_BASE_URL}/{room_name}"
-    if params:
-        url += "#" + "&".join(params)
-
-    return url
-
-
-async def call_consent_service(endpoint: str, method: str = "GET", data: dict = None) -> dict:
-    """Call the consent service API"""
-    async with httpx.AsyncClient() as client:
-        url = f"{CONSENT_SERVICE_URL}{endpoint}"
-
-        if method == "GET":
-            response = await client.get(url)
-        elif method == "POST":
-            response = await client.post(url, json=data)
-        else:
-            raise ValueError(f"Unsupported method: {method}")
-
-        if response.status_code >= 400:
-            return None
-
-        return response.json()
-
-
-# ============================================
-# API Endpoints
-# ============================================
-
-@router.get("/stats", response_model=MeetingStats)
-async def get_meeting_stats():
-    """Get meeting statistics"""
-    return MeetingStats(
-        active=len(active_meetings),
-        scheduled=len(scheduled_meetings),
-        recordings=len(recordings),
-        participants=sum(m.get("participants", 0) for m in active_meetings)
-    )
-
-
-@router.get("/active", response_model=List[ActiveMeeting])
-async def get_active_meetings():
-    """Get list of active meetings"""
-    return [
-        ActiveMeeting(
-            room_name=m["room_name"],
-            title=m["title"],
-            participants=m.get("participants", 0),
-            started_at=m.get("started_at", datetime.now().isoformat())
-        )
-        for m in active_meetings
-    ]
-
-
-@router.post("/create", response_model=MeetingResponse)
-async def create_meeting(request: CreateMeetingRequest):
-    """Create a new meeting"""
-    config = request.config or MeetingConfig()
-
-    # Generate room name based on type
-    if request.type == "quick":
-        room_name = generate_room_name("quick")
-    elif request.type == "training":
-        room_name = generate_room_name("schulung")
-    elif request.type == "parent":
-        room_name = generate_room_name("elterngespraech")
-    elif request.type == "class":
-        room_name = generate_room_name("klasse")
-    else:
-        room_name = generate_room_name("meeting")
-
-    join_url = build_jitsi_url(room_name, config)
-
-    # Store meeting if scheduled
-    if request.scheduled_at:
-        scheduled_meetings.append({
-            "room_name": room_name,
-            "title": request.title,
-            "scheduled_at": request.scheduled_at,
-            "duration": request.duration,
-            "config": config.model_dump() if config else None
-        })
-
-    return MeetingResponse(
-        room_name=room_name,
-        join_url=join_url
-    )
-
-
-@router.post("/schedule", response_model=MeetingResponse)
-async def schedule_meeting(request: ScheduleMeetingRequest):
-    """Schedule a new meeting"""
-    room_name = generate_room_name("meeting")
-
-    meeting = {
-        "room_name": room_name,
-        "title": request.title,
-        "scheduled_at": request.scheduled_at,
-        "duration": request.duration,
-        "description": request.description,
-        "invites": request.invites or []
-    }
-
-    scheduled_meetings.append(meeting)
-
-    join_url = build_jitsi_url(room_name)
-
-    # TODO: Send email invites if configured
-
-    return MeetingResponse(
-        room_name=room_name,
-        join_url=join_url
-    )
-
-
-@router.post("/training", response_model=MeetingResponse)
-async def create_training(request: TrainingRequest):
-    """Create a training session"""
-    # Generate room name from title
-    title_slug = request.title.lower().replace(" ", "-")[:20]
-    room_name = f"schulung-{title_slug}-{uuid.uuid4().hex[:4]}"
-
-    config = request.config or MeetingConfig(
-        enable_lobby=True,
-        enable_recording=True,
-        start_with_audio_muted=True
-    )
-
-    training = {
-        "room_name": room_name,
-        "title": request.title,
-        "description": request.description,
-        "scheduled_at": request.scheduled_at,
-        "duration": request.duration,
-        "max_participants": request.max_participants,
-        "trainer": request.trainer,
-        "config": config.model_dump()
-    }
-
-    trainings.append(training)
-    scheduled_meetings.append(training)
-
-    join_url = build_jitsi_url(room_name, config)
-
-    return MeetingResponse(
-        room_name=room_name,
-        join_url=join_url
-    )
-
-
-@router.post("/parent-teacher", response_model=MeetingResponse)
-async def create_parent_teacher_meeting(request: ParentTeacherRequest):
-    """Create a parent-teacher meeting"""
-    # Generate room name with student name and date
-    student_slug = request.student_name.lower().replace(" ", "-")[:15]
-    date_str = datetime.fromisoformat(request.scheduled_at).strftime("%Y%m%d-%H%M")
-    room_name = f"elterngespraech-{student_slug}-{date_str}"
-
-    # Generate password for security
-    password = generate_password()
-
-    config = MeetingConfig(
-        enable_lobby=True,
-        enable_recording=False,
-        start_with_audio_muted=False
-    )
-
-    meeting = {
-        "room_name": room_name,
-        "title": f"Elterngespräch - {request.student_name}",
-        "student_name": request.student_name,
-        "parent_name": request.parent_name,
-        "parent_email": request.parent_email,
-        "scheduled_at": request.scheduled_at,
-        "duration": request.duration,
-        "reason": request.reason,
-        "password": password,
-        "config": config.model_dump()
-    }
-
-    scheduled_meetings.append(meeting)
-
-    join_url = build_jitsi_url(room_name, config)
-
-    # TODO: Send email invite to parents if configured
-
-    return MeetingResponse(
-        room_name=room_name,
-        join_url=join_url,
-        password=password
-    )
-
-
-@router.get("/scheduled")
-async def get_scheduled_meetings():
-    """Get all scheduled meetings"""
-    return scheduled_meetings
-
-
-@router.get("/trainings")
-async def get_trainings():
-    """Get all training sessions"""
-    return trainings
-
-
-@router.delete("/{room_name}")
-async def delete_meeting(room_name: str):
-    """Delete a scheduled meeting"""
-    # Find and remove the meeting (in-place modification)
-    for i, m in enumerate(scheduled_meetings):
-        if m["room_name"] == room_name:
-            scheduled_meetings.pop(i)
-            break
-    return {"status": "deleted"}
-
-
-# ============================================
-# Recording Endpoints
-# ============================================
-
-@router.get("/recordings")
-async def get_recordings():
-    """Get list of recordings"""
-    # Demo data
-    return [
-        {
-            "id": "docker-basics",
-            "title": "Docker Grundlagen Schulung",
-            "date": "2025-12-10T10:00:00",
-            "duration": "1:30:00",
-            "size_mb": 156,
-            "participants": 15
-        },
-        {
-            "id": "team-kw49",
-            "title": "Team-Meeting KW 49",
-            "date": "2025-12-06T14:00:00",
-            "duration": "1:00:00",
-            "size_mb": 98,
-            "participants": 8
-        },
-        {
-            "id": "parent-mueller",
-            "title": "Elterngespräch - Max Müller",
-            "date": "2025-12-02T16:00:00",
-            "duration": "0:28:00",
-            "size_mb": 42,
-            "participants": 2
-        }
-    ]
-
-
-@router.get("/recordings/{recording_id}")
-async def get_recording(recording_id: str):
-    """Get recording details"""
-    return {
-        "id": recording_id,
-        "title": "Recording " + recording_id,
-        "date": "2025-12-10T10:00:00",
-        "duration": "1:30:00",
-        "size_mb": 156,
-        "download_url": f"/api/recordings/{recording_id}/download"
-    }
-
-
-@router.get("/recordings/{recording_id}/download")
-async def download_recording(recording_id: str):
-    """Download a recording"""
-    # In production, this would stream the actual file
-    raise HTTPException(status_code=404, detail="Recording file not found (demo mode)")
-
-
-@router.delete("/recordings/{recording_id}")
-async def delete_recording(recording_id: str):
-    """Delete a recording"""
-    return {"status": "deleted", "id": recording_id}
-
-
-# ============================================
-# Health Check
-# ============================================
-
-@router.get("/health")
-async def health_check():
-    """Check meetings service health"""
-    # Check Jitsi availability
-    jitsi_healthy = False
-    try:
-        async with httpx.AsyncClient(timeout=5.0) as client:
-            response = await client.get(JITSI_BASE_URL)
-            jitsi_healthy = response.status_code == 200
-    except Exception:
-        pass
-
-    return {
-        "status": "healthy" if jitsi_healthy else "degraded",
-        "jitsi_url": JITSI_BASE_URL,
-        "jitsi_available": jitsi_healthy,
-        "scheduled_meetings": len(scheduled_meetings),
-        "active_meetings": len(active_meetings)
-    }
--- a/backend-lehrer/progress_api.py
+++ b/backend-lehrer/progress_api.py
@@ -0,0 +1,131 @@
+"""
+Progress API — Tracks student learning progress per unit.
+
+Stores coins, crowns, streak data, and exercise completion stats.
+Uses JSON file storage (same pattern as learning_units.py).
+"""
+
+import os
+import json
+import logging
+from datetime import datetime, date
+from typing import Dict, Any, Optional, List
+from pathlib import Path
+
+from fastapi import APIRouter, HTTPException
+from pydantic import BaseModel
+
+logger = logging.getLogger(__name__)
+
+router = APIRouter(
+    prefix="/progress",
+    tags=["progress"],
+)
+
+PROGRESS_DIR = os.path.expanduser("~/Arbeitsblaetter/Lerneinheiten/progress")
+
+
+def _ensure_dir():
+    os.makedirs(PROGRESS_DIR, exist_ok=True)
+
+
+def _progress_path(unit_id: str) -> Path:
+    return Path(PROGRESS_DIR) / f"{unit_id}.json"
+
+
+def _load_progress(unit_id: str) -> Dict[str, Any]:
+    path = _progress_path(unit_id)
+    if path.exists():
+        with open(path, "r", encoding="utf-8") as f:
+            return json.load(f)
+    return {
+        "unit_id": unit_id,
+        "coins": 0,
+        "crowns": 0,
+        "streak_days": 0,
+        "last_activity": None,
+        "exercises": {
+            "flashcards": {"completed": 0, "correct": 0, "incorrect": 0},
+            "quiz": {"completed": 0, "correct": 0, "incorrect": 0},
+            "type": {"completed": 0, "correct": 0, "incorrect": 0},
+            "story": {"generated": 0},
+        },
+        "created_at": datetime.now().isoformat(),
+    }
+
+
+def _save_progress(unit_id: str, data: Dict[str, Any]):
+    _ensure_dir()
+    path = _progress_path(unit_id)
+    with open(path, "w", encoding="utf-8") as f:
+        json.dump(data, f, ensure_ascii=False, indent=2)
+
+
+class RewardPayload(BaseModel):
+    exercise_type: str  # flashcards, quiz, type, story
+    correct: bool = True
+    first_try: bool = True
+
+
+@router.get("/{unit_id}")
+def get_progress(unit_id: str):
+    """Get learning progress for a unit."""
+    return _load_progress(unit_id)
+
+
+@router.post("/{unit_id}/reward")
+def add_reward(unit_id: str, payload: RewardPayload):
+    """Record an exercise result and award coins."""
+    progress = _load_progress(unit_id)
+
+    # Update exercise stats
+    ex = progress["exercises"].get(payload.exercise_type, {"completed": 0, "correct": 0, "incorrect": 0})
+    ex["completed"] = ex.get("completed", 0) + 1
+    if payload.correct:
+        ex["correct"] = ex.get("correct", 0) + 1
+    else:
+        ex["incorrect"] = ex.get("incorrect", 0) + 1
+    progress["exercises"][payload.exercise_type] = ex
+
+    # Award coins
+    if payload.correct:
+        coins = 3 if payload.first_try else 1
+    else:
+        coins = 0
+    progress["coins"] = progress.get("coins", 0) + coins
+
+    # Update streak
+    today = date.today().isoformat()
+    last = progress.get("last_activity")
+    if last != today:
+        if last == date.fromordinal(date.today().toordinal() - 1).isoformat():
+            progress["streak_days"] = progress.get("streak_days", 0) + 1
+        else:
+            progress["streak_days"] = 1
+        progress["last_activity"] = today
+
+    # Award crowns for milestones
+    total_correct = sum(
+        e.get("correct", 0) for e in progress["exercises"].values() if isinstance(e, dict)
+    )
+    progress["crowns"] = total_correct // 20  # 1 crown per 20 correct answers
+
+    _save_progress(unit_id, progress)
+
+    return {
+        "coins_awarded": coins,
+        "total_coins": progress["coins"],
+        "crowns": progress["crowns"],
+        "streak_days": progress["streak_days"],
+    }
+
+
+@router.get("/")
+def list_all_progress():
+    """List progress for all units."""
+    _ensure_dir()
+    results = []
+    for f in Path(PROGRESS_DIR).glob("*.json"):
+        with open(f, "r", encoding="utf-8") as fh:
+            results.append(json.load(fh))
+    return results
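Review note on `add_reward`: the coin and streak rules can be sketched as two pure functions, which makes the month-boundary case easy to test. This is a minimal sketch, not the code in this commit; the names `coins_for` and `next_streak` are hypothetical, and it assumes `last_activity` is stored as an ISO date string, as `_load_progress` suggests.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def coins_for(correct: bool, first_try: bool) -> int:
    """Coins per answer: 3 for a correct first try, 1 for a later correct try, else 0."""
    if not correct:
        return 0
    return 3 if first_try else 1


def next_streak(last_activity: Optional[str], streak_days: int, today: date) -> Tuple[int, str]:
    """Return (new_streak_days, new_last_activity) for an activity on `today`."""
    today_iso = today.isoformat()
    if last_activity == today_iso:
        # Second activity on the same day: the streak is unchanged.
        return streak_days, today_iso
    # Date arithmetic handles month and year boundaries.
    yesterday_iso = (today - timedelta(days=1)).isoformat()
    if last_activity == yesterday_iso:
        return streak_days + 1, today_iso
    # First activity ever, or a gap of more than one day: start over.
    return 1, today_iso
```

Passing `today` as a parameter instead of calling `date.today()` inside the function keeps the rule deterministic in tests.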
--- a/backend-lehrer/story_generator.py
+++ b/backend-lehrer/story_generator.py
@@ -0,0 +1,108 @@
+"""
+Story Generator — Creates short stories using vocabulary words.
+
+Generates age-appropriate mini-stories (3-5 sentences) that incorporate
+the given vocabulary words, marked with <mark> tags for highlighting.
+
+Uses Ollama (local LLM) for generation.
+"""
+
+import os
+import json
+import logging
+import requests
+from typing import List, Dict, Any, Optional
+
+logger = logging.getLogger(__name__)
+
+OLLAMA_URL = os.getenv("OLLAMA_BASE_URL", "http://host.docker.internal:11434")
+STORY_MODEL = os.getenv("STORY_MODEL", "llama3.1:8b")
+
+
+def generate_story(
+    vocabulary: List[Dict[str, str]],
+    language: str = "en",
+    grade_level: str = "5-8",
+    max_words: int = 5,
+) -> Dict[str, Any]:
+    """
+    Generate a short story incorporating vocabulary words.
+
+    Args:
+        vocabulary: List of dicts with 'english' and 'german' keys
+        language: 'en' for English story, 'de' for German story
+        grade_level: Target grade level
+        max_words: Maximum vocab words to include (to keep story short)
+
+    Returns:
+        Dict with 'story_html', 'story_text', 'vocab_used', 'language'
+    """
+    # Select subset of vocabulary
+    words = vocabulary[:max_words]
+    word_list = [w.get("english", "") if language == "en" else w.get("german", "") for w in words]
+    word_list = [w for w in word_list if w.strip()]
+
+    if not word_list:
+        return {"story_html": "", "story_text": "", "vocab_used": [], "language": language}
+
+    lang_name = "English" if language == "en" else "German"
+    words_str = ", ".join(word_list)
+
+    prompt = f"""Write a short story (3-5 sentences) in {lang_name} for a grade {grade_level} student.
+The story MUST use these vocabulary words: {words_str}
+
+Rules:
+1. The story should be fun and age-appropriate
+2. Each vocabulary word must appear at least once
+3. Keep sentences simple and clear
+4. The story should make sense and be engaging
+
+Write ONLY the story, nothing else. No title, no introduction."""
+
+    try:
+        resp = requests.post(
+            f"{OLLAMA_URL}/api/generate",
+            json={
+                "model": STORY_MODEL,
+                "prompt": prompt,
+                "stream": False,
+                "options": {"temperature": 0.8, "num_predict": 300},
+            },
+            timeout=30,
+        )
+        resp.raise_for_status()
+        story_text = resp.json().get("response", "").strip()
+    except Exception as e:
+        logger.error(f"Story generation failed: {e}")
+        # Fallback: simple template story
+        story_text = _fallback_story(word_list, language)
+
+    # Mark vocabulary words in the story
+    story_html = story_text
+    vocab_found = []
+    for word in word_list:
+        if word.lower() in story_html.lower():
+            # Case-insensitive replacement preserving original case
+            import re
+            pattern = re.compile(re.escape(word), re.IGNORECASE)
+            story_html = pattern.sub(
+                lambda m: f'<mark class="vocab-highlight">{m.group()}</mark>',
+                story_html,
+                count=1,
+            )
+            vocab_found.append(word)
+
+    return {
+        "story_html": story_html,
+        "story_text": story_text,
+        "vocab_used": vocab_found,
+        "vocab_total": len(word_list),
+        "language": language,
+    }
+
+
+def _fallback_story(words: List[str], language: str) -> str:
+    """Simple fallback when LLM is unavailable."""
+    if language == "de":
+        return f"Heute habe ich neue Woerter gelernt: {', '.join(words)}. Es war ein guter Tag zum Lernen."
+    return f"Today I learned new words: {', '.join(words)}. It was a great day for learning."
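Review note on the highlighting loop in `generate_story`: each word's pattern runs over `story_html`, which may already contain inserted `<mark class="vocab-highlight">` markup, so a vocabulary word such as "mark" or "class" could match inside a tag. One possible alternative is a single pass over the plain text with one combined pattern. The sketch below is an assumption-laden illustration, not the commit's code; `highlight_vocab` is a hypothetical name, and like the original it matches substrings without word boundaries and does not HTML-escape the story text.

```python
import re
from typing import List, Tuple


def highlight_vocab(story_text: str, words: List[str]) -> Tuple[str, List[str]]:
    """Wrap the first occurrence of each vocabulary word in a <mark> tag.

    Returns the HTML string and the words found, in order of appearance.
    """
    words = [w for w in words if w.strip()]
    if not words:
        return story_text, []

    # Longest words first, so that "ice cream" wins over "ice".
    ordered = sorted(words, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(w) for w in ordered), re.IGNORECASE)
    by_lower = {w.lower(): w for w in words}
    seen = set()
    found: List[str] = []

    def mark(match: "re.Match[str]") -> str:
        key = match.group().lower()
        if key in seen:
            # Only the first occurrence of each word is highlighted.
            return match.group()
        seen.add(key)
        found.append(by_lower.get(key, match.group()))
        return f'<mark class="vocab-highlight">{match.group()}</mark>'

    # One pass over the plain text: inserted markup is never re-scanned.
    return pattern.sub(mark, story_text), found
```

Because the substitution sees only the original text, the order of the vocabulary list no longer affects the result.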
--- a/docs-src/services/klausur-service/OCR-Kombi-Pipeline.md
+++ b/docs-src/services/klausur-service/OCR-Kombi-Pipeline.md
@@ -0,0 +1,253 @@
+# OCR Kombi Pipeline - Modulare 11-Schritt-Architektur
+
+**Version:** 1.0.0
+**Status:** Phase 1 implementiert (Grundgeruest + DB)
+**URL:** https://macmini:3002/ai/ocr-kombi
+
+## Uebersicht
+
+Die OCR Kombi Pipeline ist der Nachfolger des OCR-Overlay-Monolithen (`/ai/ocr-overlay`).
+Sie zerlegt den OCR-Prozess in **11 modulare Schritte** mit je einer eigenen Komponente
+pro Frontend- und Backend-Datei. Ziel: schnelles Debugging, klare Verantwortlichkeiten,
+Multi-Page-Dokument-Unterstuetzung.
+
+**Primaerer Modus:** Kombi (PaddleOCR + Tesseract) — der einzige Modus, den der User nutzt.
+
+### Warum ein Refactor?
+
+| Problem (alt) | Loesung (neu) |
+|----------------|---------------|
+| `page.tsx` = 751-Zeilen-Monolith mit 3 Modi | `page.tsx` = ~140-Zeilen-Orchestrator, je 1 Datei pro Step |
+| Upload, Orientierung und Page-Split in einem Step | 3 separate Steps mit eigener Logik |
+| Keine Multi-Page-Dokument-Unterstuetzung | `document_group_id` + `page_number` auf DB-Ebene |
+| OCR intransparent (eine Blackbox) | 3-Phasen-Fortschritt + Engine-Attribution pro Wort (geplant) |
+| `grid_editor_api.py` = 1801 Zeilen | 4 Module + Orchestrator (geplant) |
+
+---
+
+## Pipeline-Schritte
+
+| # | Step | Frontend | Backend | Beschreibung |
+|---|------|----------|---------|--------------|
+| 1 | Upload | `StepUpload.tsx` | `step_upload.py` | Bild/PDF hochladen, Titel, Kategorie. Multi-Page-PDF → N Sessions |
+| 2 | Orientierung | `StepOrientation.tsx` | (shared) | Rotation 90/180/270 erkennen + korrigieren |
+| 3 | Seitentrennung | `StepPageSplit.tsx` | (shared) | Doppelseiten erkennen + splitten |
+| 4 | Begradigung | `StepDeskew.tsx` | (shared) | Hough Lines + Word Alignment |
+| 5 | Entzerrung | `StepDewarp.tsx` | (shared) | Shear-Korrektur (Vertikalkanten-Drift) |
+| 6 | Zuschneiden | `StepContentCrop.tsx` | (shared) | Scanner-Raender entfernen (nach Begradigung!) |
+| 7 | OCR | `StepOcr.tsx` | (shared) | Tesseract + PaddleOCR + Merge |
+| 8 | Strukturerkennung | `StepStructure.tsx` | (shared) | Boxen, Zonen, Farben, Grafiken |
+| 9 | Grid-Aufbau | `StepGridBuild.tsx` | (shared) | Strukturiertes Grid aus OCR + Struktur |
+| 10 | Grid-Review | `StepGridReview.tsx` | (shared) | Excel-Editor, IPA, Silben, Korrekturen |
+| 11 | Ground Truth | `StepGroundTruth.tsx` | (shared) | Validierung + GT-Markierung |
+
+!!! note "Crop nach Dewarp"
+    Seitentrennung (Step 3) passiert **vor** Begradigung — richtig, weil jede Haelfte
+    unabhaengig begradigt wird. Der Content-Crop (Step 6) bleibt **nach** Dewarp,
+    weil content-basierter Crop auf geradem Bild besser funktioniert.
+
+---
+
+## Multi-Page-Dokument-Gruppierung
+
+### Problem
+
+Ein Lehrer scannt 10 Vokabelseiten als eine PDF-Datei. Im Endnutzer-Frontend soll das
+ein zusammenhaengendes Dokument sein. Alle Seiten muessen spaeter zu gemeinsamen
+Lern-Units verarbeitet werden koennen.
+
+### Loesung: `document_group_id` + `page_number`
+
+Zwei neue Felder auf `ocr_pipeline_sessions` (Migration `009_add_document_group.sql`):
+
+```sql
+ALTER TABLE ocr_pipeline_sessions
+    ADD COLUMN IF NOT EXISTS document_group_id UUID,
+    ADD COLUMN IF NOT EXISTS page_number INT;
+```
+
+| Upload-Typ | document_group_id | page_number |
+|------------|-------------------|-------------|
+| Einzelbild | neues UUID | 1 |
+| Multi-Page-PDF (10 Seiten) | gleiches UUID fuer alle 10 | 1..10 |
+| Doppelseiten-Split von S. 3 | gleiches UUID | neue S. 3 + S. 4, nachfolgende Seiten neu nummeriert |
+
+### Benennung
+
+Upload-Titel "Vokabeln Unit 3" erzeugt:
+
+- "Vokabeln Unit 3 — S. 1"
+- "Vokabeln Unit 3 — S. 2"
+- ...
+- "Vokabeln Unit 3 — S. 10"
+
+### Session-Liste im Admin
+
+Gruppierte Anzeige: Ein Dokument-Header ("Vokabeln Unit 3, 10 Seiten") mit aufklappbaren
+Einzel-Sessions darunter. Jede Session hat eigenen Pipeline-Status.
+
+---
+
+## API-Endpoints
+
+### Neue Endpoints (OCR Kombi)
+
+| Methode | Pfad | Beschreibung |
+|---------|------|--------------|
+| POST | `/api/v1/ocr-kombi/upload` | Upload: Einzelbild oder Multi-Page-PDF |
+| GET | `/api/v1/ocr-kombi/documents/{group_id}` | Alle Sessions einer Dokumentgruppe |
+
+### Bestehende Endpoints (wiederverwendet)
+
+Die Kombi-Pipeline nutzt alle bestehenden Endpoints aus `/api/v1/ocr-pipeline/`:
+
+| Methode | Pfad | Step |
+|---------|------|------|
+| POST | `/sessions` | Upload (Legacy, Einzelbild) |
+| POST | `/sessions/{id}/orientation` | Orientierung |
+| POST | `/sessions/{id}/page-split` | Seitentrennung |
+| POST | `/sessions/{id}/deskew` | Begradigung |
+| POST | `/sessions/{id}/dewarp` | Entzerrung |
+| POST | `/sessions/{id}/crop` | Zuschneiden |
+| POST | `/sessions/{id}/paddle-kombi` | OCR (Kombi) |
+| POST | `/sessions/{id}/detect-structure` | Strukturerkennung |
+| POST | `/sessions/{id}/build-grid` | Grid-Aufbau |
+| POST | `/sessions/{id}/save-grid` | Grid speichern |
+| GET | `/sessions/{id}/grid-editor` | Grid laden |
+| POST | `/sessions/{id}/mark-ground-truth` | GT markieren |
+
+---
+
+## Dateistruktur
+
+### Frontend
+
+```
+admin-lehrer/app/(admin)/ai/ocr-kombi/
+├── page.tsx                    # ~140 Zeilen, Orchestrator mit Suspense-Boundary
+├── types.ts                    # KOMBI_V2_STEPS (11 Steps), DocumentGroup-Types, OCR-Transparenz-Types
+└── useKombiPipeline.ts         # Hook: Session-State, Step-Navigation, Dokument-Gruppierung
+
+admin-lehrer/components/ocr-kombi/
+├── KombiStepper.tsx            # 11-Step-Indikator (kompakt, scrollbar)
+├── SessionList.tsx             # Gruppierte Session-Liste (Dokumentgruppen aufklappbar)
+├── SessionHeader.tsx           # Aktive Session: Name + Kategorie + GT-Badge
+├── StepUpload.tsx              # Drag-Drop + Titel + Kategorie-Auswahl
+├── StepOrientation.tsx         # Wrapper → shared StepOrientation
+├── StepPageSplit.tsx           # Doppelseiten-Erkennung + Auto-Advance
+├── StepDeskew.tsx              # Wrapper → shared StepDeskew
+├── StepDewarp.tsx              # Wrapper → shared StepDewarp
+├── StepContentCrop.tsx         # Wrapper → shared StepCrop
+├── StepOcr.tsx                 # Wrapper → PaddleDirectStep (kombi endpoint)
+├── StepStructure.tsx           # Wrapper → shared StepStructureDetection
+├── StepGridBuild.tsx           # Auto-Trigger build-grid + Ergebnis-Anzeige
+├── StepGridReview.tsx          # Wrapper → shared StepGridReview (mit saveRef)
+└── StepGroundTruth.tsx         # GT-Markierung mit Auto-Save
+```
+
+### Backend
+
+```
+klausur-service/backend/ocr_kombi/
+├── __init__.py
+├── router.py                   # Composite Router (/api/v1/ocr-kombi)
+└── step_upload.py              # Multi-Page-PDF → N Sessions + document_group_id
+```
+
+### Shared (wiederverwendet)
+
+Die Kombi-Pipeline nutzt alle bestehenden Backend-Module:
+
+- `orientation_crop_api.py` — Orientierung, Page-Split, Crop
+- `ocr_pipeline_api.py` — Deskew, Dewarp
+- `ocr_pipeline_ocr_merge.py` — PaddleOCR + Tesseract Merge
+- `grid_editor_api.py` — Grid-Aufbau + Editor
+- `ocr_pipeline_session_store.py` — DB-Layer (erweitert um document_group_id)
+- Alle `cv_*.py` — CV-Algorithmen
+
+### Migration
+
+```
+migrations/009_add_document_group.sql  # document_group_id UUID + page_number INT + Index
+```
+
+---
+
+## Implementierungsphasen
+
+| Phase | Status | Beschreibung |
+|-------|--------|--------------|
+| **1: Grundgeruest + DB** | Implementiert | DB-Migration, Types, Hook, Stepper, SessionList, page.tsx, Navigation, Backend-Router |
+| **2: Vorverarbeitungs-Steps** | Geplant | Multi-Page-PDF-Upload, Orientierung ohne Upload, Seitentrennung mit document_group_id |
+| **3: OCR-Transparenz** | Geplant | 3-Phasen-Fortschritt, Engine-Attribution pro Wort, Farbkodierung |
+| **4: Grid-Pipeline aufteilen** | Geplant | grid_editor_api.py → 4 Module + Orchestrator |
+| **5: Restliche Steps** | Geplant | Structure, GridBuild, GridReview, GroundTruth voll integriert |
+| **6: Features migrieren** | Spaeter | LLM-Review-Streaming, Labeling-Mode, Bild-Generierung |
+| **7: Aufraeumen** | Spaeter | /ai/ocr-overlay und /ai/ocr-pipeline loeschen |
+
+---
+
+## OCR-Transparenz (Phase 3, geplant)
+
+### 3-Phasen-Fortschritt in Step 7
+
+1. "Tesseract laeuft..." (Fortschrittsbalken)
+2. "PaddleOCR laeuft..." (Fortschrittsbalken)
+3. "Merge laeuft..." (Fortschrittsbalken)
+
+### Engine-Attribution pro Wort
+
+Vergleichsansicht mit Farbkodierung:
+
+| Farbe | Bedeutung |
+|-------|-----------|
+| Gruen | Beide Engines einig |
+| Blau | Nur PaddleOCR |
+| Orange | Nur Tesseract |
+| Gelb | Konflikt, PaddleOCR gewaehlt |
+| Rot | Konflikt, Tesseract gewaehlt |
+
+### Geplanter Endpoint
+
+```
+POST /sessions/{id}/ocr-kombi-transparent
+→ { raw_tesseract, raw_paddle, merged, engine_source_per_word }
+```
+
+---
+
+## Grid-Pipeline-Aufteilung (Phase 4, geplant)
+
+`grid_editor_api.py` (1801 Zeilen) wird aufgeteilt in:
+
+| Modul | Inhalt | ~Zeilen |
+|-------|--------|---------|
+| `grid_build_filters.py` | Margin, Footer, Header, Exclude, Grafik-Filter | ~200 |
+| `grid_build_zones.py` | Box-Detect, Page-Zones, Vert-Dividers | ~250 |
+| `grid_build_columns.py` | Spalten-Clustering + Union-Merge + Zone-Grids | ~300 |
+| `grid_build_postprocess.py` | Row/Cell-Postprocessing, IPA, Farben, Dictionary | ~500 |
+
+`grid_editor_api.py` wird zum schlanken Orchestrator.
+
+---
+
+## Verhaeltnis zu bestehenden Pipelines
+
+| Pipeline | Route | Status | Beschreibung |
+|----------|-------|--------|--------------|
+| **OCR Kombi** | `/ai/ocr-kombi` | Aktiv (neu) | Modulare 11-Schritt-Pipeline |
+| OCR Overlay | `/ai/ocr-overlay` | Legacy | 751-Zeilen-Monolith, 3 Modi |
+| OCR Pipeline | `/ai/ocr-pipeline` | Legacy | Volle Pipeline mit Spalten |
+| OCR Compare | `/ai/ocr-compare` | Eigenstaendig | Methoden-Vergleich |
+
+Die alte OCR Overlay (`/ai/ocr-overlay`) bleibt waehrend des gesamten Umbaus parallel nutzbar
+fuer A/B-Tests. Sobald die Kombi-Pipeline feature-complete ist, werden die alten Pipelines
+in Phase 7 entfernt.
+
+---
+
+## Aenderungshistorie
+
+| Datum | Version | Aenderung |
+|-------|---------|-----------|
+| 2026-03-26 | 1.0.0 | **Phase 1:** Grundgeruest mit 11-Step-Architektur, DB-Migration (document_group_id, page_number), Backend-Router mit Multi-Page-Upload, Frontend mit SessionList (gruppiert), KombiStepper, 13 Step-Komponenten, useKombiPipeline Hook, Navigation |
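Review note on the document grouping described in `OCR-Kombi-Pipeline.md`: the mapping from one upload to N sessions could look roughly like the sketch below. It is only an illustration under stated assumptions. `plan_sessions` and the dict keys are hypothetical, the real logic lives in `step_upload.py` and is not shown in this diff, and the doc does not say whether a single image also gets the page suffix (the sketch applies it uniformly).

```python
import uuid
from typing import Dict, List


def plan_sessions(title: str, page_count: int) -> List[Dict[str, object]]:
    """Plan one session per page, all sharing a single document_group_id."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    group_id = str(uuid.uuid4())  # one fresh UUID per upload
    return [
        {
            "document_group_id": group_id,
            "page_number": n,
            # Name format taken from the doc: "<title> <dash> S. <n>"
            "name": f"{title} \u2014 S. {n}",
        }
        for n in range(1, page_count + 1)
    ]
```

A single image is then the `page_count=1` case: a new UUID with `page_number` 1, matching the first table row.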
--- a/docs-src/services/klausur-service/OCR-Pipeline.md
+++ b/docs-src/services/klausur-service/OCR-Pipeline.md
@@ -1,6 +1,6 @@
 # OCR Pipeline - Schrittweise Seitenrekonstruktion

-**Version:** 5.0.0
+**Version:** 5.1.0
 **Status:** Produktiv (Schritte 1–10 + Grid Editor + Regression Framework)
 **URL:** https://macmini:3002/ai/ocr-pipeline

@@ -149,7 +149,9 @@ klausur-service/backend/
 ├── ocr_pipeline_api.py                 # FastAPI Router (Schritte 2-10)
 ├── orientation_crop_api.py             # FastAPI Router (Schritte 1 + 4)
 ├── grid_editor_api.py                  # Grid Editor: build-grid, save-grid, grid-editor
+├── grid_editor_helpers.py              # Footer-Filterung, Seitenzahl-Extraktion
 ├── cv_ocr_engines.py                   # OCR-Engines, IPA-Korrektur, Britfone-Woerterbuch
+├── cv_syllable_detect.py               # Deutsche Silbentrennung (Silben:DE Modus)
 ├── cv_box_detect.py                    # Box-Erkennung + Zonen-Aufteilung
 ├── cv_graphic_detect.py                # Grafik-/Bilderkennung (Region-basiert)
 ├── cv_color_detect.py                  # Farbtext-Erkennung (HSV-Analyse)
@@ -1081,6 +1083,8 @@ Rekonstruktion fuer Vokabelseiten mit komplexen Layouts (Bilder, Ueberschriften,
 | Datei | Beschreibung |
 |-------|--------------|
 | `grid_editor_api.py` | `_build_grid_core()` Pipeline, alle Steps |
+| `grid_editor_helpers.py` | `_filter_footer_words()` → Seitenzahl-Extraktion, Footer-Filterung |
+| `cv_syllable_detect.py` | Deutsche Silbentrennung mit IPA-Kompatibilitaet |
 | `cv_ocr_engines.py` | IPA-Korrektur, Britfone-Woerterbuch, Garbled-IPA-Erkennung |
 | `cv_vocab_types.py` | `PageZone` (mit `image_overlays`), `ColumnGeometry` |
 | `tests/test_grid_editor_api.py` | 27 Tests |
@@ -1106,9 +1110,15 @@ Kombi-Wortdaten
  ├─ Step 4: Farb-Annotation
  │   → detect_word_colors(): HSV-Farbanalyse aller word_boxes
  │
+  ├─ Step 4b2: Per-Cell Artifact Filter
+  │   → Einzel-Wort-Zellen mit ≤2 Zeichen und conf < 65 entfernen
+  │
  ├─ Step 4c: Oversized Word Box Removal
  │   → word_boxes > 3x Median entfernen (Grafik-Artefakte)
  │
+  ├─ Step 4d2: Connector Column Normalization
+  │   → Dominante Kurzwoerter in schmalen Spalten normalisieren
+  │
  ├─ Step 5: Overlay-Wort-Filter
  │   → Woerter innerhalb image_overlays entfernen
  │
@@ -1197,6 +1207,38 @@ des Headwords der vorherigen Zeile). Diese werden von PaddleOCR als garbled Text
 4. Schlaegt IPA im Britfone-Woerterbuch nach
 5. Beruecksichtigt alle Wortteile (z.B. "close sth. down" → `[klˈəʊz dˈaʊn]`)

+### Per-Cell Artifact Filter (Step 4b2)
+
+Entfernt OCR-Rauschen auf Zellebene: Zellen mit genau einer `word_box`, maximal 2 Zeichen
+und Confidence unter 65 werden als Artefakte klassifiziert und entfernt.
+
+**Konstanten:**
+
+| Parameter | Wert | Beschreibung |
+|-----------|------|--------------|
+| `_ARTIFACT_MAX_LEN` | 2 | Maximale Textlaenge fuer Artefakt-Verdacht |
+| `_ARTIFACT_CONF_THRESHOLD` | 65 | Confidence-Schwelle (darunter = Artefakt) |
+
+**Sicherheit:** Einzelne Zeichen mit hoher Confidence (z.B. rote `!`-Marker mit conf=98)
+werden **nicht** entfernt, da ihre Confidence ueber dem Schwellwert liegt.
+
+**Typische Artefakte:** `(as)` conf=55, `u)` conf=44 — OCR-Noise aus Seitenraendern
+oder Schatten.
+
+### Connector Column Normalization (Step 4d2)
+
+Erkennt schmale Spalten mit einem dominanten Kurzwort (z.B. "oder", "and", "bzw.")
+und normalisiert OCR-Fehler bei denen das dominante Wort mit Rauschen versehen wurde.
+
+**Algorithmus:**
+
+1. Pro Spalte: Zaehle Textvorkommen aller Zellen
+2. Pruefe ob ein dominantes Wort existiert (≥ 60% der Zellen, max 10 Zeichen)
+3. Fuer Zellen die mit dem dominanten Wort **beginnen** und max 2 Zeichen laenger sind:
+   Normalisiere auf das dominante Wort
+
+**Beispiel:** Spalte mit "oder" in 80% der Zellen → `"oderb"` wird zu `"oder"` normalisiert.
+
 ### Compound Word IPA Decomposition (Step 5e)

 Zusammengesetzte Woerter wie "schoolbag" oder "blackbird" haben oft keinen eigenen
@@ -1253,6 +1295,69 @@ Admin-UI fuer effiziente Massenpruefung von Sessions:

 Admin-UI: [/ai/ocr-ground-truth](https://macmini:3002/ai/ocr-ground-truth)

+### Page Number Extraction
+
+Die Footer-Filterung (`_filter_footer_words` in `grid_editor_helpers.py`) erkennt
+Seitenzahlen in den untersten 5% des Bildes und gibt sie als Metadaten zurueck,
+statt sie stillschweigend zu entfernen.
+
+**Algorithmus:**
+
+1. Woerter in den untersten 5% des Bildes identifizieren
+2. Wenn ≤ 3 Woerter mit ≤ 10 Zeichen Gesamtlaenge: Als Seitenzahl extrahieren
+3. Rueckgabe als `PageNumber`-Objekt: `{text, y_pct, number?}`
+4. Ziffern werden separat als `number` (Integer) extrahiert
+
+**Datentyp:**
+
+```typescript
+interface PageNumber {
+  text: string     // Roh-OCR-Text (z.B. "u)233")
+  y_pct: number    // Vertikale Position in Prozent
+  number?: number  // Extrahierte Zahl (z.B. 233)
+}
+```
+
+**Frontend-Anzeige:**
+
+In der Summary-Leiste (GridEditor + StepGridReview) als Badge: `S. 233`.
+Zeigt bevorzugt `page_number.number` (saubere Zahl), Fallback auf `page_number.text`.
+
+**Zweck:** Spaetere Zusammenfuehrung aufeinanderfolgender Seiten im Kundenfrontend.
+
+### Footer-Zeilen-Erkennung (Verbesserung)
+
+Die Footer-Erkennung wurde um zwei Pruefungen erweitert, um falsch-positive
+Footer-Markierungen bei Content-Zeilen zu verhindern:
+
+| Pruefung | Bedingung | Grund |
+|----------|-----------|-------|
+| Komma-Check | `',' in text` → kein Footer | Content-Saetze enthalten Kommas, Seitenzahlen nicht |
+| Laengen-Check | `len(text) > 20` → kein Footer | Seitenzahlen sind kurz, Content-Zeilen lang |
+
+**Vorher:** `"Uhrzeit, Vergangenheit, Zukunft"` wurde als Footer markiert.
+**Nachher:** Nur tatsaechliche Seitenzahlen (kurz, ohne Kommas) werden als Footer erkannt.
+
+### Silben + IPA Kombination (Fix)
+
+**Datei:** `cv_syllable_detect.py`
+
+Wenn beide Modi (Silben:DE und IPA) aktiviert sind, blockierte der `_IPA_RE`-Guard
+die Silbentrennung, weil programmatisch eingefuegte IPA-Klammern (z.B. `[bɪltʃøn]`)
+IPA-Zeichen enthalten.
+
+**Loesung:** Vor der IPA-Pruefung wird Bracket-Content entfernt:
+
+```python
+# Bracket-Content strippen, da programmatisch eingefuegt
+text_no_brackets = re.sub(r'\[[^\]]*\]', '', text)
+if _IPA_RE.search(text_no_brackets):
+    return text  # Echte IPA im Fliesstext → keine Silbentrennung
+```
+
+So wird `"Bild·chen [bɪltʃøn]"` korrekt silbifiziert: Die Silbenpunkte bleiben erhalten,
+und die IPA in Klammern wird nicht als Blockiergrund gewertet.
+
 ### `en_col_type` Erkennung

 Die Erkennung der Englisch-Headword-Spalte nutzt **Bracket-IPA-Pattern-Count**
@@ -1620,6 +1725,8 @@ Die Ergebnisse fliessen in Schritt 5 (Spaltenerkennung) und den Grid Editor ein.

 | Datum | Version | Aenderung |
 |-------|---------|----------|
+| 2026-03-26 | 5.2.0 | **OCR Kombi Pipeline:** Neuer modularer Nachfolger als 11-Schritt-Architektur unter `/ai/ocr-kombi`. Eigene Dokumentation: [OCR Kombi Pipeline](OCR-Kombi-Pipeline.md). Phase 1 (Grundgeruest + DB) implementiert: DB-Migration (`document_group_id`, `page_number`), Frontend-Orchestrator, 13 Step-Komponenten, Backend-Router mit Multi-Page-Upload. |
+| 2026-03-26 | 5.1.0 | **Grid Quality & Metadata:** Per-Cell Artifact Filter (Step 4b2: ≤2 Zeichen + conf < 65), Connector Column Normalization (Step 4d2: dominante Kurzwoerter), Footer-Erkennung verbessert (Komma/Laengen-Check), Seitenzahl-Extraktion als Metadaten (`page_number` Feld im Grid-Result), Frontend-Anzeige in Summary-Leiste. Silben+IPA-Kombination gefixt (Bracket-Content vor IPA-Guard strippen). |
 | 2026-03-23 | 5.0.0 | **Phase 1 Sprint 1:** Compound-IPA-Zerlegung (`_decompose_compound`), Trailing-Garbled-Fragment-Entfernung (Multi-Wort-Headwords), Regression Framework mit DB-Persistenz + History + Shell-Script, Ground-Truth Review Workflow UI, Page-Crop Determinismus verifiziert. Admin-Seiten: `/ai/ocr-regression`, `/ai/ocr-ground-truth`. |
 | 2026-03-20 | 4.7.0 | Grid Editor: Zone Merging ueber Bilder (`image_overlays`), Heading Detection (Farbe + Hoehe), Ghost-Filter (borderless-aware), Oversized Word Box Removal, IPA Phonetic Correction (Britfone), IPA Continuation Detection, `en_col_type` via Bracket-Count. 27 Tests. |
 | 2026-03-16 | 4.6.0 | Strukturerkennung (Schritt 8): Region-basierte Grafikerkennung (`cv_graphic_detect.py`) mit Zwei-Pass-Verfahren (Farbregionen + schwarze Illustrationen), Wort-Ueberlappungs-Filter, Box/Zonen/Farb-Analyse. Schritt laeuft nach Worterkennung. |
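Review note on Steps 4b2 and 4d2 as documented in `OCR-Pipeline.md`: the two rules can be sketched as below. This is a hedged illustration, not the implementation in `grid_editor_api.py`. The cell layout (`text`, `word_boxes`, `conf`) and the function names are assumptions, the dominance share is computed over non-empty cells only, and the "narrow column" precondition is left out. The doc also does not say how text length is measured; the sketch counts raw stripped characters, under which the documented example `(as)` would not be removed.

```python
from collections import Counter
from typing import Any, Dict, List

ARTIFACT_MAX_LEN = 2          # documented as _ARTIFACT_MAX_LEN
ARTIFACT_CONF_THRESHOLD = 65  # documented as _ARTIFACT_CONF_THRESHOLD


def is_artifact_cell(cell: Dict[str, Any]) -> bool:
    """Step 4b2: exactly one word box, at most 2 characters, confidence below 65."""
    boxes = cell.get("word_boxes", [])
    if len(boxes) != 1:
        return False
    text = cell.get("text", "").strip()
    return len(text) <= ARTIFACT_MAX_LEN and boxes[0].get("conf", 100) < ARTIFACT_CONF_THRESHOLD


def normalize_connector_column(
    texts: List[str],
    min_share: float = 0.6,
    max_word_len: int = 10,
    max_extra: int = 2,
) -> List[str]:
    """Step 4d2: snap noisy variants onto the column's dominant short word."""
    non_empty = [t.strip() for t in texts if t.strip()]
    if not non_empty:
        return list(texts)
    word, count = Counter(non_empty).most_common(1)[0]
    if len(word) > max_word_len or count / len(non_empty) < min_share:
        return list(texts)  # no dominant short word: leave the column alone
    result = []
    for t in texts:
        s = t.strip()
        # Only cells that START with the dominant word and carry a short tail.
        if s != word and s.startswith(word) and len(s) - len(word) <= max_extra:
            result.append(word)
        else:
            result.append(t)
    return result
```

With the doc's example, a column holding "oder" in four of five cells turns a trailing `"oderb"` into `"oder"`, while a high-confidence single `!` survives the artifact filter.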
--- a/docs-src/services/klausur-service/RAG-Landkarte.md
+++ b/docs-src/services/klausur-service/RAG-Landkarte.md
@@ -0,0 +1,204 @@
+# RAG Landkarte — Branchen-Regulierungs-Matrix
+
+## Uebersicht
+
+Die RAG Landkarte zeigt eine interaktive Matrix aller 320 Compliance-Dokumente im RAG-System, gruppiert nach Dokumenttyp und zugeordnet zu 10 Industriebranchen.
+
+**URL**: `https://macmini:3002/ai/rag` → Tab "Landkarte"
+
+**Letzte Aktualisierung**: 2026-04-15
+
+## Architektur
+
+```
+rag-documents.json          ← Zentrale Datendatei (320 Dokumente)
+    ├── doc_types[]          ← 17 Dokumenttypen (EU-VO, DE-Gesetz, etc.)
+    ├── industries[]         ← 10 Branchen (VDMA/VDA/BDI)
+    └── documents[]          ← Alle Dokumente mit Branchen-Mapping
+         ├── code            ← Eindeutiger Identifier
+         ├── name            ← Anzeigename
+         ├── doc_type        ← Verweis auf doc_types.id
+         ├── industries[]    ← ["all"] oder ["automotive", "chemie", ...]
+         ├── in_rag          ← true (alle im RAG)
+         ├── rag_collection  ← Qdrant Collection Name
+         ├── description?    ← Beschreibung (fuer ~100 Hauptregulierungen)
+         ├── applicability_note?  ← Begruendung der Branchenzuordnung
+         └── effective_date? ← Gueltigkeitsdatum
+
+rag-constants.ts            ← RAG-Metadaten (Chunks, Qdrant-IDs)
+page.tsx                    ← Frontend (importiert aus JSON)
+```
+
+## Dateien
+
+| Pfad | Beschreibung |
+|------|--------------|
+| `admin-lehrer/app/(admin)/ai/rag/rag-documents.json` | Alle 320 Dokumente mit Branchen-Mapping |
+| `admin-lehrer/app/(admin)/ai/rag/rag-constants.ts` | REGULATIONS_IN_RAG (Chunk-Counts, Qdrant-IDs) |
+| `admin-lehrer/app/(admin)/ai/rag/page.tsx` | Frontend-Rendering |
+| `admin-lehrer/app/(admin)/ai/rag/__tests__/rag-documents.test.ts` | 44 Tests fuer JSON-Validierung |
+
+## Branchen (10 Industriesektoren)
+
+Die Branchen orientieren sich an den Mitgliedsverbaenden von VDMA, VDA und BDI:
+
+| ID | Branche | Icon | Typische Kunden |
+|----|---------|------|-----------------|
+| `automotive` | Automobilindustrie | 🚗 | OEMs, Tier-1/2 Zulieferer |
+| `maschinenbau` | Maschinen- & Anlagenbau | ⚙️ | Werkzeugmaschinen, Automatisierung |
+| `elektrotechnik` | Elektro- & Digitalindustrie | ⚡ | Embedded Systems, Steuerungstechnik |
+| `chemie` | Chemie- & Prozessindustrie | 🧪 | Grundstoffchemie, Spezialchemie |
+| `metall` | Metallindustrie | 🔩 | Stahl, Aluminium, Metallverarbeitung |
+| `energie` | Energie & Versorgung | 🔋 | Energieerzeugung, Netzbetreiber |
+| `transport` | Transport & Logistik | 🚚 | Gueterverkehr, Schiene, Luftfahrt |
+| `handel` | Handel | 🏪 | Einzel-/Grosshandel, E-Commerce |
+| `konsumgueter` | Konsumgueter & Lebensmittel | 📦 | FMCG, Lebensmittel, Verpackung |
+| `bau` | Bauwirtschaft | 🏗️ | Hoch-/Tiefbau, Gebaeudeautomation |
+
+!!! warning "Keine Pseudo-Branchen"
+    Es werden bewusst **keine** Querschnittsthemen wie IoT, KI, HR, KRITIS oder E-Commerce als "Branchen" gefuehrt. Diese sind Technologien, Abteilungen oder Klassifizierungen — keine Wirtschaftssektoren.
+
+## Assignment logic
+
+### Three levels
+
+| Level | `industries` value | Count | Examples |
+|-------|--------------------|-------|----------|
+| **Horizontal** | `["all"]` | 264 | DSGVO, AI Act, CRA, NIS2, BetrVG |
+| **Sector-specific** | `["automotive", "chemie", ...]` | 42 | Maschinenverordnung, ElektroG, BattDG |
+| **Not applicable** | `[]` | 14 | DORA, MiCA, EHDS, DSA |
+
+### Horizontal (all industries)
+
+Regulations that apply **across industries**:
+
+- **Data protection**: DSGVO, BDSG, ePrivacy, TDDDG, SCC, DPF
+- **AI**: AI Act (every company that uses AI)
+- **Cybersecurity**: CRA (every product with digital elements), NIS2, EUCSA
+- **Product safety**: GPSR, Product Liability Directive
+- **Labour law**: BetrVG, AGG, KSchG, ArbSchG, LkSG
+- **Commercial/tax law**: HGB, AO, UStG
+- **Software security**: OWASP Top 10, NIST SSDF, CISA Secure by Design
+- **Supply chain**: CycloneDX, SPDX, SLSA (CRA requires an SBOM)
+- **All guidelines**: EDPB, DSK, DSFA lists, court decisions
+
+### Sector-specific
+
+| Regulation | Industries | Rationale |
+|------------|------------|-----------|
+| Maschinenverordnung | Mechanical engineering, Automotive, Electrical engineering, Metal, Construction | Manufacturers of machinery and related products |
+| ElektroG | Electrical engineering, Automotive, Consumer goods | Electrical/electronic equipment |
+| BattDG/BattVO | Automotive, Electrical engineering, Energy | Batteries and accumulators |
+| VerpackG | Consumer goods, Trade, Chemicals | Products subject to packaging obligations |
+| PAngV, UWG, VSBG | Trade, Consumer goods | Consumer protection in sales |
+| BSI-KritisV, KRITIS-Dachgesetz | Energy, Transport, Chemicals | KRITIS sectors |
+| ENISA ICS/SCADA | Mechanical engineering, Electrical engineering, Automotive, Chemicals, Energy, Transport | Industrial control systems |
+| NIST SP 800-82 (OT) | Mechanical engineering, Automotive, Electrical engineering, Chemicals, Energy, Metal | Operational technology |
+
+### Not applicable
+
+Documents that **remain in the RAG** but are not relevant to any of the 10 target industries:
+
+| Code | Name | Reason |
+|------|------|--------|
+| DORA | Digital Operational Resilience Act | Financial sector |
+| PSD2 | Payment Services Directive | Payment service providers |
+| MiCA | Markets in Crypto-Assets | Crypto markets |
+| AMLR | AML Regulation | Anti-money laundering |
+| EHDS | European Health Data Space | Healthcare |
+| DSA | Digital Services Act | Online platforms |
+| DMA | Digital Markets Act | Gatekeeper platforms |
+| MDR | Medical Device Regulation | Medical technology |
+| BSI-TR-03161 | DiGA security (3 parts) | Digital health applications |
+
+## Document types (17)
+
+| doc_type | Label | Count | Examples |
+|----------|-------|-------|----------|
+| `eu_regulation` | EU regulations | 22 | DSGVO, AI Act, CRA, DORA |
+| `eu_directive` | EU directives | 14 | ePrivacy, NIS2, PSD2 |
+| `eu_guidance` | EU guidance | 9 | Blue Guide, GPAI CoP |
+| `de_law` | German laws | 41 | BDSG, BGB, HGB, BetrVG |
+| `at_law` | Austrian laws | 11 | DSG AT, ECG, KSchG |
+| `ch_law` | Swiss laws | 8 | revDSG, DSV, OR |
+| `national_law` | National data protection laws | 17 | UK DPA, LOPDGDD, UAVG |
+| `bsi_standard` | BSI standards & TR | 4 | BSI 200-4, BSI-TR-03161 |
+| `edpb_guideline` | EDPB/WP29 guidelines | 50 | Consent, Controller/Processor |
+| `dsk_guidance` | DSK guidance papers | 57 | Kurzpapiere, OH Telemedien |
+| `court_decision` | Court decisions | 20 | BAG M365, BGH Planet49 |
+| `dsfa_list` | DSFA mandatory lists | 20 | Per federal state + DSK |
+| `nist_standard` | NIST standards | 11 | CSF 2.0, SSDF, AI RMF |
+| `owasp_standard` | OWASP standards | 6 | Top 10, ASVS, API Security |
+| `enisa_guidance` | ENISA guidance | 6 | Supply Chain, ICS/SCADA |
+| `international` | International standards | 7 | CVSS, CycloneDX, SPDX |
+| `legal_template` | Templates & samples | 17 | GitHub Policies, VVT template |
+
+## Integration into other projects
+
+### Importing the JSON
+
+```typescript
+import ragData from './rag-documents.json'
+
+const documents = ragData.documents    // 320 documents
+const docTypes = ragData.doc_types     // 17 categories
+const industries = ragData.industries  // 10 industries
+```
+
+### Matrix logic
+
+```typescript
+// Check whether a document applies to an industry
+const applies = (doc: { industries: string[] }, industryId: string) =>
+  doc.industries.includes(industryId) || doc.industries.includes('all')
+
+// Group documents by type (Object.groupBy is ES2024: Node 21+ or a polyfill)
+const grouped = Object.groupBy(documents, d => d.doc_type)
+
+// Only sector-specific documents for one industry
+const forAutomotive = documents.filter(d =>
+  d.industries.includes('automotive') && !d.industries.includes('all')
+)
+```
+
+### Checking RAG status
+
+```typescript
+import { REGULATIONS_IN_RAG } from './rag-constants'
+
+const isInRag = (code: string) => code in REGULATIONS_IN_RAG
+const chunks = REGULATIONS_IN_RAG['GDPR']?.chunks  // 423
+```
+
+## Data sources
+
+| Source | Path | Description |
+|--------|------|-------------|
+| RAG inventory | `~/Desktop/RAG-Dokumenten-Inventar.md` | 386 source files |
+| rag-documents.json | `admin-lehrer/.../rag/rag-documents.json` | 320 consolidated documents |
+| rag-constants.ts | `admin-lehrer/.../rag/rag-constants.ts` | Qdrant metadata |
+
+## Tests
+
+```bash
+cd admin-lehrer
+npx vitest run app/\(admin\)/ai/rag/__tests__/rag-documents.test.ts
+```
+
+The 44 tests validate:
+
+- JSON structure (doc_types, industries, documents)
+- 10 real industries (no pseudo-industries)
+- Required fields and valid references
+- Horizontal regulations (DSGVO, AI Act, CRA → "all")
+- Sector-specific assignments (Maschinenverordnung, ElektroG)
+- Not-applicable regulations (DORA, MiCA → empty)
+- Applicability notes present and correct
+
+## Change history
+
+| Date | Change |
+|------|--------|
+| 2026-04-15 | Initial implementation: 320 documents, 10 industries, 17 types |
+| 2026-04-15 | Industry review: OWASP/SBOM → all, BSI-TR-03161 → empty |
+| 2026-04-15 | Applicability notes UI: expandable explanations per document |
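
Note on the three-level assignment described in the document above: the rule can be sketched in a few lines of Python. This is a minimal illustration, not code from the repository. The function names and the sample records (including the `code` field) are assumptions made for the example; only the `industries` field and its three value shapes (`["all"]`, a list of industry IDs, `[]`) come from the document.

```python
from typing import Any, Dict, List


def applicability_level(industries: List[str]) -> str:
    """Map an `industries` list to one of the three documented levels."""
    if "all" in industries:
        return "horizontal"
    if industries:
        return "sector_specific"
    return "not_applicable"


def applies(doc: Dict[str, Any], industry_id: str) -> bool:
    """Python counterpart of the `applies` helper in the TypeScript snippet."""
    industries = doc.get("industries", [])
    return industry_id in industries or "all" in industries


# Hypothetical sample records; only the `industries` field is taken from the doc.
docs = [
    {"code": "GDPR", "industries": ["all"]},
    {"code": "ElektroG", "industries": ["elektrotechnik", "automotive", "konsumgueter"]},
    {"code": "DORA", "industries": []},
]

levels = {d["code"]: applicability_level(d["industries"]) for d in docs}
automotive_codes = [d["code"] for d in docs if applies(d, "automotive")]
print(levels)
print(automotive_codes)
```

A document with an empty `industries` list never matches any industry, which is how the "not applicable" entries stay in the RAG without showing up in an industry view.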
--- a/edu-search-service/scripts/vast_ai_extractor.py
+++ b/edu-search-service/scripts/vast_ai_extractor.py
@@ -1,320 +0,0 @@
-#!/usr/bin/env python3
-"""
-vast.ai Profile Extractor Script
-Dieses Skript läuft auf vast.ai und extrahiert Profildaten von Universitäts-Webseiten.
-
-Verwendung auf vast.ai:
-1. Lade dieses Skript auf deine vast.ai Instanz
-2. Installiere Abhängigkeiten: pip install requests beautifulsoup4 openai
-3. Setze Umgebungsvariablen:
-   - BREAKPILOT_API_URL=http://deine-ip:8086
-   - BREAKPILOT_API_KEY=dev-key
-   - OPENAI_API_KEY=sk-...
-4. Starte: python vast_ai_extractor.py
-"""
-
-import os
-import sys
-import json
-import time
-import logging
-import requests
-from bs4 import BeautifulSoup
-from typing import Optional, Dict, Any, List
-
-# Logging Setup
-logging.basicConfig(
-    level=logging.INFO,
-    format='%(asctime)s - %(levelname)s - %(message)s'
-)
-logger = logging.getLogger(__name__)
-
-# Configuration
-API_URL = os.environ.get('BREAKPILOT_API_URL', 'http://localhost:8086')
-API_KEY = os.environ.get('BREAKPILOT_API_KEY', 'dev-key')
-OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY', '')
-BATCH_SIZE = 10
-SLEEP_BETWEEN_REQUESTS = 1  # Sekunden zwischen Requests (respektiere rate limits)
-
-
-def fetch_pending_profiles(limit: int = 50) -> List[Dict]:
-    """Hole Profile die noch extrahiert werden müssen."""
-    try:
-        response = requests.get(
-            f"{API_URL}/api/v1/ai/extraction/pending",
-            params={"limit": limit},
-            headers={"Authorization": f"Bearer {API_KEY}"},
-            timeout=30
-        )
-        response.raise_for_status()
-        data = response.json()
-        return data.get("tasks", [])
-    except Exception as e:
-        logger.error(f"Fehler beim Abrufen der Profile: {e}")
-        return []
-
-
-def fetch_profile_page(url: str) -> Optional[str]:
-    """Lade den HTML-Inhalt einer Profilseite."""
-    try:
-        headers = {
-            'User-Agent': 'Mozilla/5.0 (compatible; BreakPilot-Crawler/1.0; +https://breakpilot.de)',
-            'Accept': 'text/html,application/xhtml+xml',
-            'Accept-Language': 'de-DE,de;q=0.9,en;q=0.8',
-        }
-        response = requests.get(url, headers=headers, timeout=30)
-        response.raise_for_status()
-        return response.text
-    except Exception as e:
-        logger.error(f"Fehler beim Laden von {url}: {e}")
-        return None
-
-
-def extract_with_beautifulsoup(html: str, url: str) -> Dict[str, Any]:
-    """Extrahiere Basis-Informationen mit BeautifulSoup (ohne AI)."""
-    soup = BeautifulSoup(html, 'html.parser')
-    data = {}
-
-    # Email suchen
-    email_links = soup.find_all('a', href=lambda x: x and x.startswith('mailto:'))
-    if email_links:
-        email = email_links[0]['href'].replace('mailto:', '').split('?')[0]
-        data['email'] = email
-
-    # Telefon suchen
-    phone_links = soup.find_all('a', href=lambda x: x and x.startswith('tel:'))
-    if phone_links:
-        data['phone'] = phone_links[0]['href'].replace('tel:', '')
-
-    # ORCID suchen
-    orcid_links = soup.find_all('a', href=lambda x: x and 'orcid.org' in x)
-    if orcid_links:
-        orcid = orcid_links[0]['href']
-        # Extrahiere ORCID ID
-        if '/' in orcid:
-            data['orcid'] = orcid.split('/')[-1]
-
-    # Google Scholar suchen
-    scholar_links = soup.find_all('a', href=lambda x: x and 'scholar.google' in x)
-    if scholar_links:
-        href = scholar_links[0]['href']
-        if 'user=' in href:
-            data['google_scholar_id'] = href.split('user=')[1].split('&')[0]
-
-    # ResearchGate suchen
-    rg_links = soup.find_all('a', href=lambda x: x and 'researchgate.net' in x)
-    if rg_links:
-        data['researchgate_url'] = rg_links[0]['href']
-
-    # LinkedIn suchen
-    linkedin_links = soup.find_all('a', href=lambda x: x and 'linkedin.com' in x)
-    if linkedin_links:
-        data['linkedin_url'] = linkedin_links[0]['href']
-
-    # Institut/Abteilung Links sammeln (für Hierarchie-Erkennung)
-    base_domain = '/'.join(url.split('/')[:3])
-    department_links = []
-    for link in soup.find_all('a', href=True):
-        href = link['href']
-        text = link.get_text(strip=True)
-        # Suche nach Links die auf Institute/Fakultäten hindeuten
-        if any(kw in text.lower() for kw in ['institut', 'fakultät', 'fachbereich', 'abteilung', 'lehrstuhl']):
-            if href.startswith('/'):
-                href = base_domain + href
-            if href.startswith('http'):
-                department_links.append({'url': href, 'name': text})
-
-    if department_links:
-        # Nimm den ersten gefundenen Department-Link
-        data['department_url'] = department_links[0]['url']
-        data['department_name'] = department_links[0]['name']
-
-    return data
-
-
-def extract_with_ai(html: str, url: str, full_name: str) -> Dict[str, Any]:
-    """Extrahiere strukturierte Daten mit OpenAI GPT."""
-    if not OPENAI_API_KEY:
-        logger.warning("Kein OPENAI_API_KEY gesetzt - nutze nur BeautifulSoup")
-        return extract_with_beautifulsoup(html, url)
-
-    try:
-        import openai
-        client = openai.OpenAI(api_key=OPENAI_API_KEY)
-
-        # Reduziere HTML auf relevanten Text
-        soup = BeautifulSoup(html, 'html.parser')
-
-        # Entferne Scripts, Styles, etc.
-        for tag in soup(['script', 'style', 'nav', 'footer', 'header']):
-            tag.decompose()
-
-        # Extrahiere Text
-        text = soup.get_text(separator='\n', strip=True)
-        # Limitiere auf 8000 Zeichen für API
-        text = text[:8000]
-
-        prompt = f"""Analysiere diese Universitäts-Profilseite für {full_name} und extrahiere folgende Informationen im JSON-Format:
-
-{{
-  "email": "email@uni.de oder null",
-  "phone": "Telefonnummer oder null",
-  "office": "Raum/Büro oder null",
-  "position": "Position/Titel (z.B. Wissenschaftlicher Mitarbeiter, Professorin) oder null",
-  "department_name": "Name des Instituts/der Abteilung oder null",
-  "research_interests": ["Liste", "der", "Forschungsthemen"] oder [],
-  "teaching_topics": ["Liste", "der", "Lehrveranstaltungen/Fächer"] oder [],
-  "supervisor_name": "Name des Vorgesetzten/Lehrstuhlinhabers falls erkennbar oder null"
-}}
-
-Profilseite von {url}:
-
-{text}
-
-Antworte NUR mit dem JSON-Objekt, keine Erklärungen."""
-
-        response = client.chat.completions.create(
-            model="gpt-4o-mini",  # Kostengünstig und schnell
-            messages=[{"role": "user", "content": prompt}],
-            temperature=0.1,
-            max_tokens=500
-        )
-
-        result_text = response.choices[0].message.content.strip()
-
-        # Parse JSON (entferne eventuelle Markdown-Blöcke)
-        if result_text.startswith('```'):
-            result_text = result_text.split('```')[1]
-            if result_text.startswith('json'):
-                result_text = result_text[4:]
-
-        ai_data = json.loads(result_text)
-
-        # Kombiniere mit BeautifulSoup-Ergebnissen (für Links wie ORCID)
-        bs_data = extract_with_beautifulsoup(html, url)
-
-        # AI-Daten haben Priorität, aber BS-Daten für spezifische Links
-        for key in ['orcid', 'google_scholar_id', 'researchgate_url', 'linkedin_url']:
-            if key in bs_data and bs_data[key]:
-                ai_data[key] = bs_data[key]
-
-        return ai_data
-
-    except Exception as e:
-        logger.error(f"AI-Extraktion fehlgeschlagen: {e}")
-        return extract_with_beautifulsoup(html, url)
-
-
-def submit_extracted_data(staff_id: str, data: Dict[str, Any]) -> bool:
-    """Sende extrahierte Daten zurück an BreakPilot."""
-    try:
-        payload = {"staff_id": staff_id, **data}
-
-        # Entferne None-Werte
-        payload = {k: v for k, v in payload.items() if v is not None}
-
-        response = requests.post(
-            f"{API_URL}/api/v1/ai/extraction/submit",
-            json=payload,
-            headers={
-                "Authorization": f"Bearer {API_KEY}",
-                "Content-Type": "application/json"
-            },
-            timeout=30
-        )
-        response.raise_for_status()
-        return True
-    except Exception as e:
-        logger.error(f"Fehler beim Senden der Daten für {staff_id}: {e}")
-        return False
-
-
-def process_profiles():
-    """Hauptschleife: Hole Profile, extrahiere Daten, sende zurück."""
-    logger.info(f"Starte Extraktion - API: {API_URL}")
-
-    processed = 0
-    errors = 0
-
-    while True:
-        # Hole neue Profile
-        profiles = fetch_pending_profiles(limit=BATCH_SIZE)
-
-        if not profiles:
-            logger.info("Keine weiteren Profile zum Verarbeiten. Warte 60 Sekunden...")
-            time.sleep(60)
-            continue
-
-        logger.info(f"Verarbeite {len(profiles)} Profile...")
-
-        for profile in profiles:
-            staff_id = profile['staff_id']
-            url = profile['profile_url']
-            full_name = profile.get('full_name', 'Unbekannt')
-
-            logger.info(f"Verarbeite: {full_name} - {url}")
-
-            # Lade Profilseite
-            html = fetch_profile_page(url)
-            if not html:
-                errors += 1
-                continue
-
-            # Extrahiere Daten
-            extracted = extract_with_ai(html, url, full_name)
-
-            if extracted:
-                # Sende zurück
-                if submit_extracted_data(staff_id, extracted):
-                    processed += 1
-                    logger.info(f"Erfolgreich: {full_name} - Email: {extracted.get('email', 'N/A')}")
-                else:
-                    errors += 1
-            else:
-                errors += 1
-
-            # Rate limiting
-            time.sleep(SLEEP_BETWEEN_REQUESTS)
-
-        logger.info(f"Batch abgeschlossen. Gesamt: {processed} erfolgreich, {errors} Fehler")
-
-
-def main():
-    """Einstiegspunkt."""
-    logger.info("=" * 60)
-    logger.info("BreakPilot vast.ai Profile Extractor")
-    logger.info("=" * 60)
-
-    # Prüfe Konfiguration
-    if not API_KEY:
-        logger.error("BREAKPILOT_API_KEY nicht gesetzt!")
-        sys.exit(1)
-
-    if not OPENAI_API_KEY:
-        logger.warning("OPENAI_API_KEY nicht gesetzt - nutze nur BeautifulSoup-Extraktion")
-
-    # Teste Verbindung
-    try:
-        response = requests.get(
-            f"{API_URL}/v1/health",
-            headers={"Authorization": f"Bearer {API_KEY}"},
-            timeout=10
-        )
-        logger.info(f"API-Verbindung OK: {response.status_code}")
-    except Exception as e:
-        logger.error(f"Kann API nicht erreichen: {e}")
-        logger.error(f"Stelle sicher dass {API_URL} erreichbar ist!")
-        sys.exit(1)
-
-    # Starte Verarbeitung
-    try:
-        process_profiles()
-    except KeyboardInterrupt:
-        logger.info("Beendet durch Benutzer")
-    except Exception as e:
-        logger.error(f"Unerwarteter Fehler: {e}")
-        sys.exit(1)
-
-
-if __name__ == "__main__":
-    main()
--- a/klausur-service/backend/cv_box_layout.py
+++ b/klausur-service/backend/cv_box_layout.py
@@ -0,0 +1,339 @@
+"""
+Box layout classifier — detects internal layout type of embedded boxes.
+
+Classifies each box as: flowing | columnar | bullet_list | header_only
+and provides layout-appropriate grid building.
+
+Used by the Box-Grid-Review step to rebuild box zones with correct structure.
+"""
+
+import logging
+import re
+import statistics
+from typing import Any, Dict, List, Optional
+
+logger = logging.getLogger(__name__)
+
+# Bullet / list-item patterns at the start of a line
+_BULLET_RE = re.compile(
+    r'^[\-\u2022\u2013\u2014\u25CF\u25CB\u25AA\u25A0•·]\s'  # dash, bullet chars
+    r'|^\d{1,2}[.)]\s'     # numbered: "1) " or "1. "
+    r'|^[a-z][.)]\s'       # lettered: "a) " or "a. "
+)
+
+
+def classify_box_layout(
+    words: List[Dict],
+    box_w: int,
+    box_h: int,
+) -> str:
+    """Classify the internal layout of a detected box.
+
+    Args:
+        words: OCR word dicts within the box (with top, left, width, height, text)
+        box_w: Box width in pixels
+        box_h: Box height in pixels
+
+    Returns:
+        'header_only' | 'bullet_list' | 'columnar' | 'flowing'
+    """
+    if not words:
+        return "header_only"
+
+    # Group words into lines by y-proximity
+    lines = _group_into_lines(words)
+
+    # Header only: very few words or single line
+    total_words = sum(len(line) for line in lines)
+    if total_words <= 5 or len(lines) <= 1:
+        return "header_only"
+
+    # Bullet list: at least 40% of the lines (and at least two) start with a bullet pattern
+    bullet_count = 0
+    for line in lines:
+        first_text = line[0].get("text", "") if line else ""
+        # _BULLET_RE expects whitespace after the marker, so match the joined line text
+        if _BULLET_RE.match(" ".join(w.get("text", "") for w in line)):
+            bullet_count += 1
+        elif first_text.strip() in ("-", "–", "—", "•", "·", "▪", "▸"):  # bare bullet token
+            bullet_count += 1
+    if bullet_count >= len(lines) * 0.4 and bullet_count >= 2:
+        return "bullet_list"
+
+    # Columnar: check for multiple distinct x-clusters
+    if len(lines) >= 3 and _has_column_structure(words, box_w):
+        return "columnar"
+
+    # Default: flowing text
+    return "flowing"
+
+
+def _group_into_lines(words: List[Dict]) -> List[List[Dict]]:
+    """Group words into lines by y-proximity."""
+    if not words:
+        return []
+
+    sorted_words = sorted(words, key=lambda w: (w["top"], w["left"]))
+    heights = [w["height"] for w in sorted_words if w.get("height", 0) > 0]
+    median_h = statistics.median(heights) if heights else 20
+    y_tolerance = max(median_h * 0.5, 5)
+
+    lines: List[List[Dict]] = []
+    current_line: List[Dict] = [sorted_words[0]]
+    current_y = sorted_words[0]["top"]
+
+    for w in sorted_words[1:]:
+        if abs(w["top"] - current_y) <= y_tolerance:
+            current_line.append(w)
+        else:
+            lines.append(sorted(current_line, key=lambda ww: ww["left"]))
+            current_line = [w]
+            current_y = w["top"]
+
+    if current_line:
+        lines.append(sorted(current_line, key=lambda ww: ww["left"]))
+
+    return lines
+
+
+def _has_column_structure(words: List[Dict], box_w: int) -> bool:
+    """Check if words have multiple distinct left-edge clusters (columns)."""
+    if box_w <= 0:
+        return False
+
+    lines = _group_into_lines(words)
+    if len(lines) < 3:
+        return False
+
+    # Collect left-edges of non-first words in each line
+    # (first word of each line often aligns regardless of columns)
+    left_edges = []
+    for line in lines:
+        for w in line[1:]:  # skip first word
+            left_edges.append(w["left"])
+
+    if len(left_edges) < 4:
+        return False
+
+    # Check if left edges cluster into 2+ distinct groups
+    left_edges.sort()
+    gaps = [left_edges[i + 1] - left_edges[i] for i in range(len(left_edges) - 1)]
+    if not gaps:
+        return False
+
+    # A gap wider than 15% of the box width between consecutive (sorted) left
+    # edges is treated as a column boundary.
+    column_gap_threshold = box_w * 0.15
+    large_gaps = [g for g in gaps if g > column_gap_threshold]
+
+    return len(large_gaps) >= 1
+
+
+def build_box_zone_grid(
+    zone_words: List[Dict],
+    box_x: int,
+    box_y: int,
+    box_w: int,
+    box_h: int,
+    zone_index: int,
+    img_w: int,
+    img_h: int,
+    layout_type: Optional[str] = None,
+) -> Dict[str, Any]:
+    """Build a grid for a box zone with layout-aware processing.
+
+    If layout_type is None, auto-detects it.
+    For 'flowing' and 'bullet_list', forces single-column layout.
+    For 'columnar', uses the standard multi-column detection.
+    For 'header_only', creates a single cell.
+
+    Returns the same format as _build_zone_grid (columns, rows, cells, header_rows).
+    """
+    from grid_editor_helpers import _build_zone_grid
+
+    if not zone_words:
+        return {
+            "columns": [],
+            "rows": [],
+            "cells": [],
+            "header_rows": [],
+            "box_layout_type": layout_type or "header_only",
+            "box_grid_reviewed": False,
+        }
+
+    # Auto-detect layout if not specified
+    if not layout_type:
+        layout_type = classify_box_layout(zone_words, box_w, box_h)
+
+    logger.info(
+        "Box zone %d: layout_type=%s, %d words, %dx%d",
+        zone_index, layout_type, len(zone_words), box_w, box_h,
+    )
+
+    if layout_type == "header_only":
+        # Single cell with all text concatenated
+        all_text = " ".join(
+            w.get("text", "") for w in sorted(zone_words, key=lambda ww: (ww["top"], ww["left"]))
+        ).strip()
+        return {
+            "columns": [{"col_index": 0, "index": 0, "label": "column_text", "col_type": "column_1",
+                         "x_min_px": box_x, "x_max_px": box_x + box_w,
+                         "x_min_pct": round(box_x / img_w * 100, 2) if img_w else 0,
+                         "x_max_pct": round((box_x + box_w) / img_w * 100, 2) if img_w else 0,
+                         "bold": False}],
+            "rows": [{"index": 0, "row_index": 0,
+                       "y_min": box_y, "y_max": box_y + box_h, "y_center": box_y + box_h / 2,
+                       "y_min_px": box_y, "y_max_px": box_y + box_h,
+                       "y_min_pct": round(box_y / img_h * 100, 2) if img_h else 0,
+                       "y_max_pct": round((box_y + box_h) / img_h * 100, 2) if img_h else 0,
+                       "is_header": True}],
+            "cells": [{
+                "cell_id": f"Z{zone_index}_R0C0",
+                "row_index": 0,
+                "col_index": 0,
+                "col_type": "column_1",
+                "text": all_text,
+                "word_boxes": zone_words,
+            }],
+            "header_rows": [0],
+            "box_layout_type": layout_type,
+            "box_grid_reviewed": False,
+        }
+
+    if layout_type in ("flowing", "bullet_list"):
+        # Force single column: each logical item (a line plus its indented
+        # continuation lines) becomes one row with one cell. Continuation
+        # lines are detected from indentation and merged into the preceding item.
+        lines = _group_into_lines(zone_words)
+        column = {
+            "col_index": 0, "index": 0, "label": "column_text", "col_type": "column_1",
+            "x_min_px": box_x, "x_max_px": box_x + box_w,
+            "x_min_pct": round(box_x / img_w * 100, 2) if img_w else 0,
+            "x_max_pct": round((box_x + box_w) / img_w * 100, 2) if img_w else 0,
+            "bold": False,
+        }
+
+        # --- Detect indentation levels ---
+        line_indents = []
+        for line_words in lines:
+            if not line_words:
+                line_indents.append(0)
+                continue
+            min_left = min(w["left"] for w in line_words)
+            line_indents.append(min_left - box_x)
+
+        # Find the minimum indent (= bullet/main level)
+        valid_indents = [ind for ind in line_indents if ind >= 0]
+        min_indent = min(valid_indents) if valid_indents else 0
+
+        # Indentation threshold: lines indented > 15px more than minimum
+        # are continuation lines belonging to the previous bullet
+        INDENT_THRESHOLD = 15
+
+        # --- Group lines into logical items (bullet + continuations) ---
+        # Each item is a list of line indices
+        items: List[List[int]] = []
+        for li, indent in enumerate(line_indents):
+            is_continuation = (indent > min_indent + INDENT_THRESHOLD) and len(items) > 0
+            if is_continuation:
+                items[-1].append(li)
+            else:
+                items.append([li])
+
+        logger.info(
+            "Box zone %d flowing: %d lines → %d items (indents=%s, min=%d, threshold=%d)",
+            zone_index, len(lines), len(items),
+            [int(i) for i in line_indents], int(min_indent), INDENT_THRESHOLD,
+        )
+
+        # --- Build rows and cells from grouped items ---
+        rows = []
+        cells = []
+        header_rows = []
+
+        for row_idx, item_line_indices in enumerate(items):
+            # Collect all words from all lines in this item
+            item_words = []
+            item_texts = []
+            for li in item_line_indices:
+                if li < len(lines):
+                    item_words.extend(lines[li])
+                    line_text = " ".join(w.get("text", "") for w in lines[li]).strip()
+                    if line_text:
+                        item_texts.append(line_text)
+
+            if not item_words:
+                continue
+
+            y_min = min(w["top"] for w in item_words)
+            y_max = max(w["top"] + w["height"] for w in item_words)
+            y_center = (y_min + y_max) / 2
+
+            row = {
+                "index": row_idx,
+                "row_index": row_idx,
+                "y_min": y_min,
+                "y_max": y_max,
+                "y_center": y_center,
+                "y_min_px": y_min,
+                "y_max_px": y_max,
+                "y_min_pct": round(y_min / img_h * 100, 2) if img_h else 0,
+                "y_max_pct": round(y_max / img_h * 100, 2) if img_h else 0,
+                "is_header": False,
+            }
+            rows.append(row)
+
+            # Join multi-line text with newline for display
+            merged_text = "\n".join(item_texts)
+
+            # Add bullet marker if this is a bullet item without one
+            first_text = item_texts[0] if item_texts else ""
+            is_bullet = len(item_line_indices) > 1 or _BULLET_RE.match(first_text)
+            if is_bullet and not _BULLET_RE.match(first_text) and row_idx > 0:
+                # Continuation item without bullet — add one
+                merged_text = "• " + merged_text
+
+            cell = {
+                "cell_id": f"Z{zone_index}_R{row_idx}C0",
+                "row_index": row_idx,
+                "col_index": 0,
+                "col_type": "column_1",
+                "text": merged_text,
+                "word_boxes": item_words,
+            }
+            cells.append(cell)
+
+        # Detect header: first item if it has no continuation lines and is short
+        if len(items) >= 2:
+            first_item_texts = []
+            for li in items[0]:
+                if li < len(lines):
+                    first_item_texts.append(" ".join(w.get("text", "") for w in lines[li]).strip())
+            first_text = " ".join(first_item_texts)
+            if (len(first_text) < 40
+                    or first_text.isupper()
+                    or first_text.rstrip().endswith(':')):
+                header_rows = [0]
+
+        return {
+            "columns": [column],
+            "rows": rows,
+            "cells": cells,
+            "header_rows": header_rows,
+            "box_layout_type": layout_type,
+            "box_grid_reviewed": False,
+        }
+
+    # Columnar: use standard grid builder with independent column detection
+    result = _build_zone_grid(
+        zone_words, box_x, box_y, box_w, box_h,
+        zone_index, img_w, img_h,
+        global_columns=None,  # detect columns independently
+    )
+
+    # Colspan detection is now handled generically by _detect_colspan_cells
+    # in grid_editor_helpers.py (called inside _build_zone_grid).
+
+    result["box_layout_type"] = layout_type
+    result["box_grid_reviewed"] = False
+    return result
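
Note on the continuation-line grouping in `build_box_zone_grid`: the indentation rule can be tried in isolation with the sketch below. It is a simplified stand-in written for illustration, not the function from the diff. The helper name `group_lines_by_indent` and the sample indents are assumptions; the 15 px threshold and the "more than minimum indent plus threshold" rule follow the code above.

```python
from typing import List

INDENT_THRESHOLD = 15  # pixels, same value as in build_box_zone_grid


def group_lines_by_indent(
    line_indents: List[int],
    threshold: int = INDENT_THRESHOLD,
) -> List[List[int]]:
    """Group line indices into items: a line indented more than
    `min_indent + threshold` is attached to the preceding item."""
    valid = [ind for ind in line_indents if ind >= 0]
    min_indent = min(valid) if valid else 0

    items: List[List[int]] = []
    for idx, indent in enumerate(line_indents):
        if items and indent > min_indent + threshold:
            items[-1].append(idx)  # continuation of the previous item
        else:
            items.append([idx])  # new item at the main indent level
    return items


# Hypothetical per-line indents (pixels from the box's left edge)
example = group_lines_by_indent([4, 30, 5, 6, 28, 29])
print(example)
```

With these sample indents the minimum is 4, so lines indented by more than 19 px are treated as continuations: lines 1, 4 and 5 attach to the items that precede them.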
--- a/klausur-service/backend/cv_cell_grid.py
+++ b/klausur-service/backend/cv_cell_grid.py
@@ -1447,6 +1447,90 @@ def _merge_phonetic_continuation_rows(
    return merged


+def _merge_wrapped_rows(
+    entries: List[Dict[str, Any]],
+) -> List[Dict[str, Any]]:
+    """Merge rows where the primary column (EN) is empty — cell wrap continuation.
+
+    In textbook vocabulary tables, columns are often narrow, so the author
+    wraps text within a cell. OCR treats each physical line as a separate row.
+    The key indicator: if the EN column is empty but DE/example have text,
+    this row is a continuation of the previous row's cells.
+
+    Example (original textbook has ONE row):
+      Row 2: EN="take part (in)"  DE="teilnehmen (an), mitmachen"  EX="More than 200 singers took"
+      Row 3: EN=""                DE="(bei)"                        EX="part in the concert."
+      → Merged: EN="take part (in)" DE="teilnehmen (an), mitmachen (bei)" EX="More than 200 singers took part in the concert."
+
+    Also handles the reverse case (DE empty, short lowercase or parenthesized EN text): a wrap in the EN column.
+    """
+    if len(entries) < 2:
+        return entries
+
+    merged: List[Dict[str, Any]] = []
+    for entry in entries:
+        en = (entry.get('english') or '').strip()
+        de = (entry.get('german') or '').strip()
+        ex = (entry.get('example') or '').strip()
+
+        if not merged:
+            merged.append(entry)
+            continue
+
+        prev = merged[-1]
+        prev_en = (prev.get('english') or '').strip()
+        prev_de = (prev.get('german') or '').strip()
+        prev_ex = (prev.get('example') or '').strip()
+
+        # Case 1: EN is empty → continuation of previous row
+        # (DE or EX have text that should be appended to previous row)
+        if not en and (de or ex) and prev_en:
+            if de:
+                if prev_de.endswith(','):
+                    sep = ' '  # "Wort," + " " + "Ausdruck"
+                elif prev_de.endswith(('-', '(')):
+                    sep = ''   # "teil-" + "nehmen" or "(" + "bei)"
+                else:
+                    sep = ' '
+                prev['german'] = (prev_de + sep + de).strip()
+            if ex:
+                sep = ' ' if prev_ex else ''
+                prev['example'] = (prev_ex + sep + ex).strip()
+            logger.debug(
+                f"Merged wrapped row {entry.get('row_index')} into previous "
+                f"(empty EN): DE={prev['german']!r}, EX={prev.get('example', '')!r}"
+            )
+            continue
+
+        # Case 2: DE is empty, EN has text that looks like continuation
+        # (starts with lowercase or is a parenthetical like "(bei)")
+        if en and not de and prev_de:
+            is_paren = en.startswith('(')
+            first_alpha = next((c for c in en if c.isalpha()), '')
+            starts_lower = first_alpha and first_alpha.islower()
+
+            if (is_paren or starts_lower) and len(en.split()) < 5:
+                sep = ' ' if prev_en and not prev_en.endswith((',', '-', '(')) else ''
+                prev['english'] = (prev_en + sep + en).strip()
+                if ex:
+                    sep2 = ' ' if prev_ex else ''
+                    prev['example'] = (prev_ex + sep2 + ex).strip()
+                logger.debug(
+                    f"Merged wrapped row {entry.get('row_index')} into previous "
+                    f"(empty DE): EN={prev['english']!r}"
+                )
+                continue
+
+        merged.append(entry)
+
+    if len(merged) < len(entries):
+        logger.info(
+            f"_merge_wrapped_rows: merged {len(entries) - len(merged)} "
+            f"continuation rows ({len(entries)} → {len(merged)})"
+        )
+    return merged
+
+
 def _merge_continuation_rows(
    entries: List[Dict[str, Any]],
 ) -> List[Dict[str, Any]]:
@@ -1561,6 +1645,9 @@ def build_word_grid(
    # --- Post-processing pipeline (deterministic, no LLM) ---
    n_raw = len(entries)

+    # 0. Merge cell-wrap continuation rows (empty primary column = text wrap)
+    entries = _merge_wrapped_rows(entries)
+
    # 0a. Merge phonetic-only continuation rows into previous entry
    entries = _merge_phonetic_continuation_rows(entries)

--- a/klausur-service/backend/cv_gutter_repair.py
+++ b/klausur-service/backend/cv_gutter_repair.py
@@ -0,0 +1,610 @@
+"""
+Gutter Repair — detects and fixes words truncated or blurred at the book gutter.
+
+When scanning double-page spreads, the binding area (gutter) causes:
+  1. Blurry/garbled trailing characters  ("stammeli" → "stammeln")
+  2. Words split across lines with a hyphen lost in the gutter
+     ("ve" + "künden" → "verkünden")
+
+This module analyses grid cells, identifies gutter-edge candidates, and
+proposes corrections using pyspellchecker (DE + EN).
+
+License: Apache 2.0 (commercial use permitted)
+PRIVACY: All processing happens locally.
+"""
+
+import itertools
+import logging
+import re
+import time
+import uuid
+from dataclasses import dataclass, field, asdict
+from typing import Any, Dict, List, Optional, Tuple
+
+logger = logging.getLogger(__name__)
+
+# ---------------------------------------------------------------------------
+# Spellchecker setup (lazy, cached)
+# ---------------------------------------------------------------------------
+
+_spell_de = None
+_spell_en = None
+_SPELL_AVAILABLE = False
+
+def _init_spellcheckers():
+    """Lazy-load DE + EN spellcheckers (cached across calls)."""
+    global _spell_de, _spell_en, _SPELL_AVAILABLE
+    if _spell_de is not None:
+        return
+    try:
+        from spellchecker import SpellChecker
+        _spell_de = SpellChecker(language='de', distance=1)
+        _spell_en = SpellChecker(language='en', distance=1)
+        _SPELL_AVAILABLE = True
+        logger.info("Gutter repair: spellcheckers loaded (DE + EN)")
+    except ImportError:
+        logger.warning("pyspellchecker not installed — gutter repair unavailable")
+
+
+def _is_known(word: str) -> bool:
+    """Check if a word is known in DE or EN dictionary."""
+    _init_spellcheckers()
+    if not _SPELL_AVAILABLE:
+        return False
+    w = word.lower()
+    return bool(_spell_de.known([w])) or bool(_spell_en.known([w]))
+
+
+def _spell_candidates(word: str, lang: str = "both") -> List[str]:
+    """Get all plausible spellchecker candidates for a word (deduplicated)."""
+    _init_spellcheckers()
+    if not _SPELL_AVAILABLE:
+        return []
+    w = word.lower()
+    seen: set = set()
+    results: List[str] = []
+
+    for checker in ([_spell_de, _spell_en] if lang == "both"
+                    else [_spell_de] if lang == "de"
+                    else [_spell_en]):
+        if checker is None:
+            continue
+        cands = checker.candidates(w)
+        if cands:
+            for c in cands:
+                if c and c != w and c not in seen:
+                    seen.add(c)
+                    results.append(c)
+
+    return results
+
+
+# ---------------------------------------------------------------------------
+# Gutter position detection
+# ---------------------------------------------------------------------------
+
+# Minimum word length for spell-fix (very short words are often legitimate)
+_MIN_WORD_LEN_SPELL = 3
+
+# Minimum word length for hyphen-join candidates (fragments at the gutter
+# can be as short as 1-2 chars, e.g. "ve" from "ver-künden")
+_MIN_WORD_LEN_HYPHEN = 2
+
+# How close to the right column edge a word must be to count as "gutter-adjacent".
+# Expressed as a fraction of column width (0.70 = rightmost 30% of the column).
+_GUTTER_EDGE_THRESHOLD = 0.70
+
+# Small common words / abbreviations that should NOT be repaired
+_STOPWORDS = frozenset([
+    # German
+    "ab", "an", "am", "da", "er", "es", "im", "in", "ja", "ob", "so", "um",
+    "zu", "wo", "du", "eh", "ei", "je", "na", "nu", "oh",
+    # English
+    "a", "am", "an", "as", "at", "be", "by", "do", "go", "he", "if", "in",
+    "is", "it", "me", "my", "no", "of", "on", "or", "so", "to", "up", "us",
+    "we",
+])
+
+# IPA / phonetic patterns — skip these cells
+_IPA_RE = re.compile(r'[\[\]/ˈˌːʃʒθðŋɑɒæɔəɛɪʊʌ]')
+
+
+def _is_ipa_text(text: str) -> bool:
+    """True if text looks like IPA transcription."""
+    return bool(_IPA_RE.search(text))
+
+
+def _word_is_at_gutter_edge(word_bbox: Dict, col_x: float, col_width: float) -> bool:
+    """Check if a word's right edge is near the right boundary of its column."""
+    if col_width <= 0:
+        return False
+    word_right = word_bbox.get("left", 0) + word_bbox.get("width", 0)
+    col_right = col_x + col_width
+    # Word's right edge within the rightmost portion of the column
+    relative_pos = (word_right - col_x) / col_width
+    return relative_pos >= _GUTTER_EDGE_THRESHOLD
+
+
+# ---------------------------------------------------------------------------
+# Suggestion types
+# ---------------------------------------------------------------------------
+
+@dataclass
+class GutterSuggestion:
+    """A single correction suggestion."""
+    id: str = field(default_factory=lambda: str(uuid.uuid4())[:8])
+    type: str = ""             # "hyphen_join" | "spell_fix"
+    zone_index: int = 0
+    row_index: int = 0
+    col_index: int = 0
+    col_type: str = ""
+    cell_id: str = ""
+    original_text: str = ""
+    suggested_text: str = ""
+    # For hyphen_join:
+    next_row_index: int = -1
+    next_row_cell_id: str = ""
+    next_row_text: str = ""
+    missing_chars: str = ""
+    display_parts: List[str] = field(default_factory=list)
+    # Alternatives (other plausible corrections the user can pick from)
+    alternatives: List[str] = field(default_factory=list)
+    # Meta:
+    confidence: float = 0.0
+    reason: str = ""           # "gutter_truncation" | "gutter_blur" | "hyphen_continuation"
+
+    def to_dict(self) -> Dict[str, Any]:
+        return asdict(self)
+
+
+# ---------------------------------------------------------------------------
+# Core repair logic
+# ---------------------------------------------------------------------------
+
+_TRAILING_PUNCT_RE = re.compile(r'[.,;:!?\)\]]+$')
+
+
+def _try_hyphen_join(
+    word_text: str,
+    next_word_text: str,
+    max_missing: int = 3,
+) -> Optional[Tuple[str, str, float]]:
+    """Try joining two fragments with 0..max_missing interpolated chars.
+
+    Strips trailing punctuation from the continuation word before testing
+    (e.g. "künden," → "künden") so dictionary lookup succeeds.
+
+    Returns (joined_word, missing_chars, confidence) or None.
+    """
+    base = word_text.rstrip("-").rstrip()
+    # Strip trailing punctuation from continuation (commas, periods, etc.)
+    raw_continuation = next_word_text.lstrip()
+    continuation = _TRAILING_PUNCT_RE.sub('', raw_continuation)
+
+    if not base or not continuation:
+        return None
+
+    # 1. Direct join (no missing chars)
+    direct = base + continuation
+    if _is_known(direct):
+        return (direct, "", 0.95)
+
+    # 2. Try with 1..max_missing missing characters
+    # Use common letters, weighted by frequency in German/English
+    _COMMON_CHARS = "enristaldhgcmobwfkzpvjyxqu"
+
+    for n_missing in range(1, max_missing + 1):
+        for chars in itertools.product(_COMMON_CHARS[:15], repeat=n_missing):
+            candidate = base + "".join(chars) + continuation
+            if _is_known(candidate):
+                missing = "".join(chars)
+                # Confidence decreases with more missing chars
+                conf = 0.90 - (n_missing - 1) * 0.10
+                return (candidate, missing, conf)
+
+    return None
+
+
+def _try_spell_fix(
+    word_text: str, col_type: str = "",
+) -> Optional[Tuple[str, float, List[str]]]:
+    """Try to fix a single garbled gutter word via spellchecker.
+
+    Returns (best_correction, confidence, alternatives_list) or None.
+    The alternatives list contains other plausible corrections the user
+    can choose from (e.g. "stammelt" vs "stammeln").
+    """
+    if len(word_text) < _MIN_WORD_LEN_SPELL:
+        return None
+
+    # Strip trailing/leading parentheses and check if the bare word is valid.
+    # Words like "probieren)" or "(Englisch" are valid words with punctuation,
+    # not OCR errors. Don't suggest corrections for them.
+    stripped = word_text.strip("()")
+    if stripped and _is_known(stripped):
+        return None
+
+    # Determine language priority from column type
+    if "en" in col_type:
+        lang = "en"
+    elif "de" in col_type:
+        lang = "de"
+    else:
+        lang = "both"
+
+    candidates = _spell_candidates(word_text, lang=lang)
+    if not candidates and lang != "both":
+        candidates = _spell_candidates(word_text, lang="both")
+
+    if not candidates:
+        return None
+
+    # Preserve original casing
+    is_upper = word_text[0].isupper()
+
+    def _preserve_case(w: str) -> str:
+        if is_upper and w:
+            return w[0].upper() + w[1:]
+        return w
+
+    # Sort candidates by edit distance (closest first)
+    scored = []
+    for c in candidates:
+        dist = _edit_distance(word_text.lower(), c.lower())
+        scored.append((dist, c))
+    scored.sort(key=lambda x: x[0])
+
+    best_dist, best = scored[0]
+    best = _preserve_case(best)
+    conf = max(0.5, 1.0 - best_dist * 0.15)
+
+    # Build alternatives (all other candidates, also case-preserved)
+    alts = [_preserve_case(c) for _, c in scored[1:] if c.lower() != best.lower()]
+    # Limit to top 5 alternatives
+    alts = alts[:5]
+
+    return (best, conf, alts)
+
+
+def _edit_distance(a: str, b: str) -> int:
+    """Simple Levenshtein distance."""
+    if len(a) < len(b):
+        return _edit_distance(b, a)
+    if len(b) == 0:
+        return len(a)
+    prev = list(range(len(b) + 1))
+    for i, ca in enumerate(a):
+        curr = [i + 1]
+        for j, cb in enumerate(b):
+            cost = 0 if ca == cb else 1
+            curr.append(min(curr[j] + 1, prev[j + 1] + 1, prev[j] + cost))
+        prev = curr
+    return prev[len(b)]
+
+
+# ---------------------------------------------------------------------------
+# Grid analysis
+# ---------------------------------------------------------------------------
+
+def analyse_grid_for_gutter_repair(
+    grid_data: Dict[str, Any],
+    image_width: int = 0,
+) -> Dict[str, Any]:
+    """Analyse a structured grid and return gutter repair suggestions.
+
+    Args:
+        grid_data: The grid_editor_result from the session (zones→cells structure).
+        image_width: Image width in pixels (reserved for determining the gutter side; currently unused).
+
+    Returns:
+        Dict with "suggestions" list and "stats".
+    """
+    t0 = time.time()
+    _init_spellcheckers()
+
+    if not _SPELL_AVAILABLE:
+        return {
+            "suggestions": [],
+            "stats": {"error": "pyspellchecker not installed"},
+            "duration_seconds": 0,
+        }
+
+    zones = grid_data.get("zones", [])
+    suggestions: List[GutterSuggestion] = []
+    words_checked = 0
+    gutter_candidates = 0
+
+    for zi, zone in enumerate(zones):
+        columns = zone.get("columns", [])
+        cells = zone.get("cells", [])
+        if not columns or not cells:
+            continue
+
+        # Build column lookup: col_index → {x, width, type}
+        col_info: Dict[int, Dict] = {}
+        for col in columns:
+            ci = col.get("index", col.get("col_index", -1))
+            col_info[ci] = {
+                "x": col.get("x_min_px", col.get("x", 0)),
+                "width": (col["x_max_px"] - col.get("x_min_px", col.get("x", 0))) if "x_max_px" in col else col.get("width", 0),
+                "type": col.get("type", col.get("col_type", "")),
+            }
+
+        # Build row→col→cell lookup
+        cell_map: Dict[Tuple[int, int], Dict] = {}
+        max_row = 0
+        for cell in cells:
+            ri = cell.get("row_index", 0)
+            ci = cell.get("col_index", 0)
+            cell_map[(ri, ci)] = cell
+            if ri > max_row:
+                max_row = ri
+
+        # Determine which columns are at the gutter edge.
+        # For a left page: rightmost content columns.
+        # For now, check ALL columns — a word is a candidate if it's at the
+        # right edge of its column AND not a known word.
+        for (ri, ci), cell in cell_map.items():
+            text = (cell.get("text") or "").strip()
+            if not text:
+                continue
+            if _is_ipa_text(text):
+                continue
+
+            words_checked += 1
+            col = col_info.get(ci, {})
+            col_type = col.get("type", "")
+
+            # Get word boxes to check position
+            word_boxes = cell.get("word_boxes", [])
+
+            # Check the LAST word in the cell (rightmost, closest to gutter)
+            cell_words = text.split()
+            if not cell_words:
+                continue
+
+            last_word = cell_words[-1]
+
+            # Skip stopwords
+            if last_word.lower().rstrip(".,;:!?-") in _STOPWORDS:
+                continue
+
+            last_word_clean = last_word.rstrip(".,;:!?)(")
+            if len(last_word_clean) < _MIN_WORD_LEN_HYPHEN:
+                continue
+
+            # Check if the last word is at the gutter edge
+            is_at_edge = False
+            if word_boxes:
+                last_wb = word_boxes[-1]
+                is_at_edge = _word_is_at_gutter_edge(
+                    last_wb, col.get("x", 0), col.get("width", 1)
+                )
+            else:
+                # No word boxes — use cell bbox
+                bbox = cell.get("bbox_px", {})
+                is_at_edge = _word_is_at_gutter_edge(
+                    {"left": bbox.get("x", 0), "width": bbox.get("w", 0)},
+                    col.get("x", 0), col.get("width", 1)
+                )
+
+            if not is_at_edge:
+                continue
+
+            # Word is at gutter edge — check if it's a known word
+            if _is_known(last_word_clean):
+                continue
+
+            # Check if the word ends with "-" (explicit hyphen break)
+            ends_with_hyphen = last_word.endswith("-")
+
+            # If the word already ends with "-" and the stem (without
+            # the hyphen) is a known word, this is a VALID line-break
+            # hyphenation — not a gutter error.  Gutter problems cause
+            # the hyphen to be LOST ("ve" instead of "ver-"), so a
+            # visible hyphen + known stem = intentional word-wrap.
+            # Example: "wunder-" → "wunder" is known → skip.
+            if ends_with_hyphen:
+                stem = last_word_clean.rstrip("-")
+                if stem and _is_known(stem):
+                    continue
+
+            gutter_candidates += 1
+
+            # --- Strategy 1: Hyphen join with next row ---
+            next_cell = cell_map.get((ri + 1, ci))
+            if next_cell:
+                next_text = (next_cell.get("text") or "").strip()
+                next_words = next_text.split()
+                if next_words:
+                    first_next = next_words[0]
+                    first_next_clean = _TRAILING_PUNCT_RE.sub('', first_next)
+                    first_alpha = next((c for c in first_next if c.isalpha()), "")
+
+                    # Also skip if the joined word is known (covers compound
+                    # words where the stem alone might not be in the dictionary)
+                    if ends_with_hyphen and first_next_clean:
+                        direct = last_word_clean.rstrip("-") + first_next_clean
+                        if _is_known(direct):
+                            continue
+
+                    # Continuation likely if:
+                    # - explicit hyphen, OR
+                    # - next row starts lowercase (= not a new entry)
+                    if ends_with_hyphen or (first_alpha and first_alpha.islower()):
+                        result = _try_hyphen_join(last_word_clean, first_next)
+                        if result:
+                            joined, missing, conf = result
+                            # Build display parts: show hyphenation for original layout
+                            if ends_with_hyphen:
+                                display_p1 = last_word_clean.rstrip("-")
+                                if missing:
+                                    display_p1 += missing
+                                display_p1 += "-"
+                            else:
+                                display_p1 = last_word_clean
+                                if missing:
+                                    display_p1 += missing + "-"
+                                else:
+                                    display_p1 += "-"
+
+                            suggestion = GutterSuggestion(
+                                type="hyphen_join",
+                                zone_index=zi,
+                                row_index=ri,
+                                col_index=ci,
+                                col_type=col_type,
+                                cell_id=cell.get("cell_id", f"R{ri:02d}_C{ci}"),
+                                original_text=last_word,
+                                suggested_text=joined,
+                                next_row_index=ri + 1,
+                                next_row_cell_id=next_cell.get("cell_id", f"R{ri+1:02d}_C{ci}"),
+                                next_row_text=next_text,
+                                missing_chars=missing,
+                                display_parts=[display_p1, first_next],
+                                confidence=conf,
+                                reason="gutter_truncation" if missing else "hyphen_continuation",
+                            )
+                            suggestions.append(suggestion)
+                            continue  # skip spell_fix if hyphen_join found
+
+            # --- Strategy 2: Single-word spell fix (only for longer words) ---
+            fix_result = _try_spell_fix(last_word_clean, col_type)
+            if fix_result:
+                corrected, conf, alts = fix_result
+                suggestion = GutterSuggestion(
+                    type="spell_fix",
+                    zone_index=zi,
+                    row_index=ri,
+                    col_index=ci,
+                    col_type=col_type,
+                    cell_id=cell.get("cell_id", f"R{ri:02d}_C{ci}"),
+                    original_text=last_word,
+                    suggested_text=corrected,
+                    alternatives=alts,
+                    confidence=conf,
+                    reason="gutter_blur",
+                )
+                suggestions.append(suggestion)
+
+    duration = round(time.time() - t0, 3)
+
+    logger.info(
+        "Gutter repair: checked %d words, %d gutter candidates, %d suggestions (%.2fs)",
+        words_checked, gutter_candidates, len(suggestions), duration,
+    )
+
+    return {
+        "suggestions": [s.to_dict() for s in suggestions],
+        "stats": {
+            "words_checked": words_checked,
+            "gutter_candidates": gutter_candidates,
+            "suggestions_found": len(suggestions),
+        },
+        "duration_seconds": duration,
+    }
+
+
+def apply_gutter_suggestions(
+    grid_data: Dict[str, Any],
+    accepted_ids: List[str],
+    suggestions: List[Dict[str, Any]],
+) -> Dict[str, Any]:
+    """Apply accepted gutter repair suggestions to the grid data.
+
+    Modifies cells in-place and returns summary of changes.
+
+    Args:
+        grid_data: The grid_editor_result (zones→cells).
+        accepted_ids: List of suggestion IDs the user accepted.
+        suggestions: The full suggestions list (from analyse_grid_for_gutter_repair).
+
+    Returns:
+        Dict with "applied_count" and "changes" list.
+    """
+    accepted_set = set(accepted_ids)
+    accepted_suggestions = [s for s in suggestions if s.get("id") in accepted_set]
+
+    zones = grid_data.get("zones", [])
+    changes: List[Dict[str, Any]] = []
+
+    for s in accepted_suggestions:
+        zi = s.get("zone_index", 0)
+        ri = s.get("row_index", 0)
+        ci = s.get("col_index", 0)
+        stype = s.get("type", "")
+
+        if zi >= len(zones):
+            continue
+        zone_cells = zones[zi].get("cells", [])
+
+        # Find the target cell
+        target_cell = None
+        for cell in zone_cells:
+            if cell.get("row_index") == ri and cell.get("col_index") == ci:
+                target_cell = cell
+                break
+
+        if not target_cell:
+            continue
+
+        old_text = target_cell.get("text", "")
+
+        if stype == "spell_fix":
+            # Replace the last word in the cell text
+            original_word = s.get("original_text", "")
+            corrected = s.get("suggested_text", "")
+            if original_word and corrected:
+                # Replace from the right (last occurrence)
+                idx = old_text.rfind(original_word)
+                if idx >= 0:
+                    new_text = old_text[:idx] + corrected + old_text[idx + len(original_word):]
+                    target_cell["text"] = new_text
+                    changes.append({
+                        "type": "spell_fix",
+                        "zone_index": zi,
+                        "row_index": ri,
+                        "col_index": ci,
+                        "cell_id": target_cell.get("cell_id", ""),
+                        "old_text": old_text,
+                        "new_text": new_text,
+                    })
+
+        elif stype == "hyphen_join":
+            # Current cell: replace last word with the hyphenated first part
+            original_word = s.get("original_text", "")
+            joined = s.get("suggested_text", "")
+            display_parts = s.get("display_parts", [])
+            next_ri = s.get("next_row_index", -1)
+
+            if not original_word or not joined or not display_parts:
+                continue
+
+            # The first display part is what goes in the current row
+            first_part = display_parts[0] if display_parts else ""
+
+            # Replace the last word in current cell with the restored form.
+            # The next row is NOT modified — "künden" stays in its row
+            # because the original book layout has it there. We only fix
+            # the truncated word in the current row (e.g. "ve" → "ver-").
+            idx = old_text.rfind(original_word)
+            if idx >= 0:
+                new_text = old_text[:idx] + first_part + old_text[idx + len(original_word):]
+                target_cell["text"] = new_text
+                changes.append({
+                    "type": "hyphen_join",
+                    "zone_index": zi,
+                    "row_index": ri,
+                    "col_index": ci,
+                    "cell_id": target_cell.get("cell_id", ""),
+                    "old_text": old_text,
+                    "new_text": new_text,
+                    "joined_word": joined,
+                })
+
+    logger.info("Gutter repair applied: %d/%d suggestions", len(changes), len(accepted_suggestions))
+
+    return {
+        "applied_count": len(changes),
+        "changes": changes,
+    }
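Note on the hyphen-join strategy in cv_gutter_repair.py above: the following is a minimal, self-contained sketch of the idea (join two fragments, interpolating up to a few missing characters until a dictionary word appears). It is not the module's code. `KNOWN_WORDS` is a hypothetical toy word set standing in for the pyspellchecker lookup, and the function name, alphabet and `max_missing` default are assumptions chosen for illustration.

```python
import itertools
from typing import Optional, Set, Tuple

# Hypothetical toy dictionary; the real module queries pyspellchecker instead.
KNOWN_WORDS: Set[str] = {"verkünden", "teilnehmen"}


def try_hyphen_join(
    base: str,
    continuation: str,
    max_missing: int = 2,
    alphabet: str = "enristaldhgcmob",
) -> Optional[Tuple[str, str]]:
    """Return (joined_word, missing_chars), or None if no known word is found."""
    base = base.rstrip("-")
    if not base or not continuation:
        return None
    # n_missing == 0 is the direct join; product(..., repeat=0) yields one empty tuple.
    for n_missing in range(0, max_missing + 1):
        for chars in itertools.product(alphabet, repeat=n_missing):
            missing = "".join(chars)
            candidate = (base + missing + continuation).lower()
            if candidate in KNOWN_WORDS:
                return candidate, missing
    return None


print(try_hyphen_join("ve", "künden"))
print(try_hyphen_join("teil-", "nehmen"))
```

Trying shorter insertions first means a direct join always wins over an interpolated one, which mirrors why the module assigns lower confidence as more characters have to be guessed.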
--- a/klausur-service/backend/cv_ipa_german.py
+++ b/klausur-service/backend/cv_ipa_german.py
@@ -0,0 +1,135 @@
+"""German IPA insertion for grid editor cells.
+
+Hybrid approach:
+  1. Primary lookup: wiki-pronunciation-dict (636k entries, CC-BY-SA)
+  2. Fallback: epitran rule-based G2P (MIT license)
+
+German IPA data sourced from Wiktionary contributors (CC-BY-SA 4.0).
+Attribution required — see grid editor UI.
+
+License: code Apache-2.0, IPA data CC-BY-SA 4.0 (Wiktionary)
+PRIVACY: All processing happens locally.
+"""
+
+import logging
+import re
+from typing import Dict, List, Optional, Set
+
+logger = logging.getLogger(__name__)
+
+# IPA/phonetic characters — skip cells that already contain IPA
+_IPA_RE = re.compile(r'[\[\]ˈˌːʃʒθðŋɑɒæɔəɛɜɪʊʌ]')
+
+
+def _lookup_ipa_de(word: str) -> Optional[str]:
+    """Look up German IPA for a single word.
+
+    Returns IPA string or None if not found.
+    """
+    from cv_vocab_types import _de_ipa_dict, _epitran_de, DE_IPA_AVAILABLE
+
+    if not DE_IPA_AVAILABLE and _epitran_de is None:
+        return None
+
+    lower = word.lower().strip()
+    if not lower:
+        return None
+
+    # 1. Dictionary lookup (636k entries)
+    ipa = _de_ipa_dict.get(lower)
+    if ipa:
+        return ipa
+
+    # 2. epitran fallback (rule-based)
+    if _epitran_de is not None:
+        try:
+            result = _epitran_de.transliterate(word)
+            if result and result != word.lower():
+                return result
+        except Exception:
+            pass
+
+    return None
+
+
+def _insert_ipa_for_text(text: str) -> str:
+    """Insert German IPA after each recognized word in a text string.
+
+    Handles comma-separated lists:
+      "bildschön, blendend" → "bildschön [bɪltʃøn], blendend [blɛndənt]"
+
+    Skips cells already containing IPA brackets.
+    """
+    if not text or _IPA_RE.search(text):
+        return text
+
+    # Split on comma/semicolon/colon sequences, keeping separators
+    tokens = re.split(r'([,;:]+\s*)', text)
+    result = []
+    changed = False
+
+    for tok in tokens:
+        # Keep separators as-is
+        if not tok or re.match(r'^[,;:\s]+$', tok):
+            result.append(tok)
+            continue
+
+        # Process words within this token
+        words = tok.split()
+        new_words = []
+        for w in words:
+            # Strip punctuation for lookup
+            clean = re.sub(r'[^a-zA-ZäöüÄÖÜß]', '', w)
+            if len(clean) < 3:
+                new_words.append(w)
+                continue
+
+            ipa = _lookup_ipa_de(clean)
+            if ipa:
+                new_words.append(f"{w} [{ipa}]")
+                changed = True
+            else:
+                new_words.append(w)
+
+        result.append(' '.join(new_words))
+
+    return ''.join(result) if changed else text
+
+
+def insert_german_ipa(
+    cells: List[Dict],
+    target_cols: Set[str],
+) -> int:
+    """Insert German IPA transcriptions into cells of target columns.
+
+    Args:
+        cells: Flat list of all cells (modified in-place).
+        target_cols: Set of col_type values to process.
+
+    Returns:
+        Number of cells modified.
+    """
+    from cv_vocab_types import DE_IPA_AVAILABLE, _epitran_de
+
+    if not DE_IPA_AVAILABLE and _epitran_de is None:
+        logger.warning("German IPA not available — skipping")
+        return 0
+
+    count = 0
+    for cell in cells:
+        ct = cell.get("col_type", "")
+        if ct not in target_cols:
+            continue
+        text = cell.get("text", "")
+        if not text.strip():
+            continue
+
+        new_text = _insert_ipa_for_text(text)
+        if new_text != text:
+            cell["text"] = new_text
+            cell["_ipa_corrected"] = True
+            count += 1
+
+    if count:
+        logger.info(f"German IPA inserted in {count} cells")
+    return count
--- a/klausur-service/backend/cv_ocr_engines.py
+++ b/klausur-service/backend/cv_ocr_engines.py
@@ -1030,6 +1030,15 @@ def _text_has_garbled_ipa(text: str) -> bool:
        # Contains IPA special characters
        if any(c in w for c in 'əɪɛɒʊʌæɑɔʃʒθðŋ'):
            return True
+        # Embedded apostrophe suggesting merged garbled IPA with stress mark.
+        # E.g. "Scotland'skotland" — OCR reads ˈ as '.
+        # Guard: apostrophe must come after ≥3 chars and be followed by ≥3
+        # chars, the first one lowercase, to avoid contractions (don't, won't, o'clock).
+        if "'" in w and not w.startswith("'"):
+            apos_idx = w.index("'")
+            after = w[apos_idx + 1:]
+            if apos_idx >= 3 and len(after) >= 3 and after[0].islower():
+                return True
    return False


@@ -1173,6 +1182,10 @@ def _insert_missing_ipa(text: str, pronunciation: str = 'british') -> str:
                if wj in ('–', '—', '-', '/', '|', ',', ';'):
                    kept.extend(words[j:])
                    break
+                # Pure digits or numbering (e.g. "1", "2.", "3)") — keep
+                if re.match(r'^[\d.)\-]+$', wj):
+                    kept.extend(words[j:])
+                    break
                # Starts with uppercase — likely German or proper noun
                clean_j = re.sub(r'[^a-zA-Z]', '', wj)
                if clean_j and clean_j[0].isupper():
@@ -1183,6 +1196,19 @@ def _insert_missing_ipa(text: str, pronunciation: str = 'british') -> str:
                    if _lookup_ipa(clean_j, pronunciation):
                        kept.extend(words[j:])
                        break
+                # Merged token: dictionary word + garbled IPA stuck together.
+                # E.g. "fictionsalans'fIkfn" starts with "fiction".
+                # Extract the dictionary prefix (≥4 chars) and add it with
+                # IPA, but only if enough chars remain after the prefix (≥3)
+                # to look like garbled IPA, not just a plural 's'.
+                if clean_j and len(clean_j) >= 7:
+                    for pend in range(min(len(clean_j) - 3, 15), 3, -1):
+                        prefix_j = clean_j[:pend]
+                        prefix_ipa = _lookup_ipa(prefix_j, pronunciation)
+                        if prefix_ipa:
+                            kept.append(f"{prefix_j} [{prefix_ipa}]")
+                            break
+                    break  # rest of this token is garbled
                # Otherwise — likely garbled phonetics, skip
            words = kept
            break
@@ -1221,6 +1247,9 @@ def _has_non_dict_trailing(text: str, pronunciation: str = 'british') -> bool:
        wj = words[j]
        if wj in ('–', '—', '-', '/', '|', ',', ';'):
            return False
+        # Pure digits or numbering (e.g. "1", "2.", "3)") — not garbled IPA
+        if re.match(r'^[\d.)\-]+$', wj):
+            return False
        clean_j = re.sub(r'[^a-zA-Z]', '', wj)
        if clean_j and clean_j[0].isupper():
            return False
@@ -1852,6 +1881,11 @@ def _is_noise_tail_token(token: str) -> bool:
    if t.endswith(']'):
        return False

+    # Keep meaningful punctuation tokens used in textbooks
+    # = (definition marker), (= (definition opener), ; (separator)
+    if t in ('=', '(=', '=)', ';', ':', '-', '–', '—', '/', '+', '&'):
+        return False
+
    # Pure non-alpha → noise ("3", ")", "|")
    alpha_chars = _RE_ALPHA.findall(t)
    if not alpha_chars:
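
Review note on the `_insert_missing_ipa` hunk above: the merged-token rule walks candidate prefixes from longest to shortest and keeps the first one with an IPA entry. The sketch below isolates that loop so the bounds are easier to check. `TOY_IPA` and `split_dictionary_prefix` are hypothetical names used only for illustration; the real code calls `_lookup_ipa`, whose behaviour is not shown in this diff.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the dictionaries behind _lookup_ipa (assumption).
TOY_IPA: Dict[str, str] = {"fiction": "ˈfɪkʃn", "cat": "kæt"}


def split_dictionary_prefix(
    clean: str, ipa: Dict[str, str] = TOY_IPA
) -> Optional[str]:
    """Return 'prefix [ipa]' for the longest dictionary prefix, else None."""
    # Tokens shorter than 7 letters cannot hold a 4-letter prefix plus
    # at least 3 leftover letters of garbled IPA.
    if len(clean) < 7:
        return None
    # Longest prefix first; stop at 4 letters, always leaving >= 3 behind.
    for end in range(min(len(clean) - 3, 15), 3, -1):
        prefix = clean[:end]
        if prefix in ipa:
            return f"{prefix} [{ipa[prefix]}]"
    return None
```

With this toy dictionary, `"fictionsalansfIkfn"` yields the `fiction` prefix, while `"catsxyz"` yields nothing because `cat` is below the 4-letter minimum.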
--- a/klausur-service/backend/cv_review.py
+++ b/klausur-service/backend/cv_review.py
@@ -720,6 +720,62 @@ def _spell_dict_knows(word: str) -> bool:
    return bool(_en_spell.known([w])) or bool(_de_spell.known([w]))


+def _try_split_merged_word(token: str) -> Optional[str]:
+    """Try to split a merged word like 'atmyschool' into 'at my school'.
+
+    Uses dynamic programming to find the shortest sequence of dictionary
+    words that covers the entire token.  Only returns a result when the
+    split produces at least 2 words and ALL parts are known dictionary words.
+
+    Preserves original capitalisation by mapping back to the input string.
+    """
+    if not _SPELL_AVAILABLE or len(token) < 4:
+        return None
+
+    lower = token.lower()
+    n = len(lower)
+
+    # dp[i] = (word_lengths, sum_of_squared_lengths) for the best split of
+    # lower[:i], or None.  Splits are ranked by (-word_count, sum_sq):
+    # fewer words first, then longer words (e.g. "come on" over "com eon").
+    dp: list = [None] * (n + 1)
+    dp[0] = ([], 0)
+
+    for i in range(1, n + 1):
+        for j in range(max(0, i - 20), i):
+            if dp[j] is None:
+                continue
+            candidate = lower[j:i]
+            word_len = i - j
+            if word_len == 1 and candidate not in ('a', 'i'):
+                continue
+            if _spell_dict_knows(candidate):
+                prev_words, prev_sq = dp[j]
+                new_words = prev_words + [word_len]
+                new_sq = prev_sq + word_len * word_len
+                new_key = (-len(new_words), new_sq)
+                if dp[i] is None:
+                    dp[i] = (new_words, new_sq)
+                else:
+                    old_key = (-len(dp[i][0]), dp[i][1])
+                    if new_key >= old_key:
+                        # >= so that later splits (longer first word) win ties
+                        dp[i] = (new_words, new_sq)
+
+    if dp[n] is None or len(dp[n][0]) < 2:
+        return None
+
+    # Reconstruct with original casing
+    result = []
+    pos = 0
+    for wlen in dp[n][0]:
+        result.append(token[pos:pos + wlen])
+        pos += wlen
+
+    logger.debug("Split merged word: %r → %r", token, " ".join(result))
+    return " ".join(result)
+
+
 def _spell_fix_token(token: str, field: str = "") -> Optional[str]:
    """Return corrected form of token, or None if no fix needed/possible.

@@ -777,6 +833,14 @@ def _spell_fix_token(token: str, field: str = "") -> Optional[str]:
                    correction = correction[0].upper() + correction[1:]
                if _spell_dict_knows(correction):
                    return correction
+
+    # 5. Merged-word split: OCR often merges adjacent words when spacing
+    #    is too tight, e.g. "atmyschool" → "at my school"
+    if len(token) >= 4 and token.isalpha():
+        split = _try_split_merged_word(token)
+        if split:
+            return split
+
    return None


@@ -817,10 +881,25 @@ def spell_review_entries_sync(entries: List[Dict]) -> Dict:
    """Rule-based OCR correction: spell-checker + structural heuristics.

    Deterministic — never translates, never touches IPA, never hallucinates.
+    Uses SmartSpellChecker for language-aware corrections with context-based
+    disambiguation (a/I), multi-digit substitution, and cross-language guard.
    """
    t0 = time.time()
    changes: List[Dict] = []
    all_corrected: List[Dict] = []
+
+    # Use SmartSpellChecker if available, fall back to legacy _spell_fix_field
+    _smart = None
+    try:
+        from smart_spell import SmartSpellChecker
+        _smart = SmartSpellChecker()
+        logger.debug("spell_review: using SmartSpellChecker")
+    except Exception:
+        logger.debug("spell_review: SmartSpellChecker not available, using legacy")
+
+    # Map field names → language codes for SmartSpellChecker
+    _LANG_MAP = {"english": "en", "german": "de", "example": "auto"}
+
    for i, entry in enumerate(entries):
        e = dict(entry)
        # Page-ref normalization (always, regardless of review status)
@@ -843,9 +922,18 @@ def spell_review_entries_sync(entries: List[Dict]) -> Dict:
            old_val = (e.get(field_name) or "").strip()
            if not old_val:
                continue
-            # example field is mixed-language — try German first (for umlauts)
+
+            if _smart:
+                # SmartSpellChecker path — language-aware, context-based
+                lang_code = _LANG_MAP.get(field_name, "en")
+                result = _smart.correct_text(old_val, lang=lang_code)
+                new_val = result.corrected
+                was_changed = result.changed
+            else:
+                # Legacy path
                lang = "german" if field_name in ("german", "example") else "english"
                new_val, was_changed = _spell_fix_field(old_val, field=lang)
+
            if was_changed and new_val != old_val:
                changes.append({
                    "row_index": e.get("row_index", i),
@@ -857,12 +945,13 @@ def spell_review_entries_sync(entries: List[Dict]) -> Dict:
                e["llm_corrected"] = True
        all_corrected.append(e)
    duration_ms = int((time.time() - t0) * 1000)
+    model_name = "smart-spell-checker" if _smart else "spell-checker"
    return {
        "entries_original": entries,
        "entries_corrected": all_corrected,
        "changes": changes,
        "skipped_count": 0,
-        "model_used": "spell-checker",
+        "model_used": model_name,
        "duration_ms": duration_ms,
    }

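
Review note on `_try_split_merged_word`: the ranking rule (fewest words, then largest sum of squared lengths) can be exercised without the spell-checker by passing in a toy word list. This is a minimal sketch under that assumption; `split_merged` and `TOY_WORDS` are hypothetical names, and the real function consults `_spell_dict_knows` instead.

```python
from typing import Callable, List, Optional, Tuple

# Toy word list standing in for the EN/DE spell dictionaries (assumption).
TOY_WORDS = {"a", "i", "at", "my", "school", "come", "on", "com", "eon"}


def split_merged(token: str, knows: Callable[[str], bool]) -> Optional[str]:
    """Split token into >= 2 known words: fewest words, then longest words."""
    lower = token.lower()
    n = len(lower)
    # best[i] = (word_lengths, sum_of_squared_lengths) for lower[:i]
    best: List[Optional[Tuple[List[int], int]]] = [None] * (n + 1)
    best[0] = ([], 0)
    for i in range(1, n + 1):
        for j in range(max(0, i - 20), i):
            if best[j] is None:
                continue
            part = lower[j:i]
            # Single letters are only accepted for the words "a" and "i".
            if len(part) == 1 and part not in ("a", "i"):
                continue
            if not knows(part):
                continue
            lengths = best[j][0] + [i - j]
            squares = best[j][1] + (i - j) ** 2
            key = (-len(lengths), squares)
            if best[i] is None or key >= (-len(best[i][0]), best[i][1]):
                best[i] = (lengths, squares)
    if best[n] is None or len(best[n][0]) < 2:
        return None
    # Rebuild from the original token so the input casing survives.
    words, pos = [], 0
    for length in best[n][0]:
        words.append(token[pos:pos + length])
        pos += length
    return " ".join(words)
```

In this toy setup `"comeon"` has two 2-word splits; the squared-length term (16 + 4 versus 9 + 9) is what selects `"come on"`.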
--- a/klausur-service/backend/cv_syllable_detect.py
+++ b/klausur-service/backend/cv_syllable_detect.py
@@ -1,11 +1,15 @@
 """
-CV-based syllable divider detection and insertion for dictionary pages.
+Syllable divider insertion for dictionary pages.

-Two-step approach:
-  1. CV: morphological vertical line detection checks if a word_box image
-     contains thin, isolated pipe-like vertical lines (syllable dividers).
-  2. pyphen: inserts syllable breaks at linguistically correct positions
-     for words where CV confirmed the presence of dividers.
+For confirmed dictionary pages (is_dictionary=True), processes content
+column cells (auto mode: only cells that already contain |):
+  1. Strips existing | dividers for clean normalization
+  2. Merges pipe-gap spaces (where OCR split a word at a divider position)
+  3. Applies pyphen syllabification to each word >= 3 alpha chars (DE then EN)
+  4. Only modifies words that pyphen recognizes — garbled OCR stays as-is
+
+No CV gate needed — the dictionary detection confidence is sufficient.
+pyphen uses Hunspell/TeX hyphenation dictionaries and is very reliable.

 Lizenz: Apache 2.0 (kommerziell nutzbar)
 DATENSCHUTZ: Alle Verarbeitung erfolgt lokal.
@@ -13,94 +17,488 @@ DATENSCHUTZ: Alle Verarbeitung erfolgt lokal.

 import logging
 import re
-from typing import Any, Dict, List
+from typing import Any, Dict, List, Optional, Tuple

-import cv2
 import numpy as np

 logger = logging.getLogger(__name__)

-
-def _word_has_pipe_lines(img_gray: np.ndarray, wb: Dict) -> bool:
-    """CV check: does this word_box image show thin vertical pipe dividers?
-
-    Uses morphological opening with a tall thin kernel to isolate vertical
-    structures, then filters for thin (≤4px), isolated contours that are
-    NOT at the word edges (those would be l, I, 1 etc.).
-    """
-    x = wb.get("left", 0)
-    y = wb.get("top", 0)
-    w = wb.get("width", 0)
-    h = wb.get("height", 0)
-    if w < 30 or h < 12:
-        return False
-    ih, iw = img_gray.shape[:2]
-    y1, y2 = max(0, y), min(ih, y + h)
-    x1, x2 = max(0, x), min(iw, x + w)
-    roi = img_gray[y1:y2, x1:x2]
-    if roi.size == 0:
-        return False
-    rh, rw = roi.shape
-
-    # Binarize (ink = white on black background)
-    _, binary = cv2.threshold(
-        roi, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
-    )
-
-    # Morphological opening: keep only tall vertical structures (≥55% height)
-    kern_h = max(int(rh * 0.55), 8)
-    kernel = np.ones((kern_h, 1), np.uint8)
-    vertical = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
-
-    # Find surviving contours
-    contours, _ = cv2.findContours(
-        vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
-    )
-
-    margin = max(int(rw * 0.08), 3)
-    for cnt in contours:
-        cx, cy, cw, ch = cv2.boundingRect(cnt)
-        if cw > 4:
-            continue  # too wide for a pipe
-        if cx < margin or cx + cw > rw - margin:
-            continue  # at word edge — likely l, I, 1
-        # Check isolation: adjacent columns should be mostly empty (ink-free)
-        left_zone = binary[cy:cy + ch, max(0, cx - 3):cx]
-        right_zone = binary[cy:cy + ch, cx + cw:min(rw, cx + cw + 3)]
-        left_ink = np.mean(left_zone) if left_zone.size else 255
-        right_ink = np.mean(right_zone) if right_zone.size else 255
-        if left_ink < 80 and right_ink < 80:
-            return True  # isolated thin vertical line = pipe divider
-    return False
-
-
-# IPA/phonetic bracket pattern — don't hyphenate transcriptions
+# IPA/phonetic characters — skip cells containing these
 _IPA_RE = re.compile(r'[\[\]ˈˌːʃʒθðŋɑɒæɔəɛɜɪʊʌ]')

+# Common German words that should NOT be merged with adjacent tokens.
+# These are function words that appear as standalone words between
+# headwords/definitions on dictionary pages.
+_STOP_WORDS = frozenset([
+    # Articles
+    'der', 'die', 'das', 'dem', 'den', 'des',
+    'ein', 'eine', 'einem', 'einen', 'einer',
+    # Pronouns
+    'du', 'er', 'es', 'sie', 'wir', 'ihr', 'ich', 'man', 'sich',
+    'dich', 'dir', 'mich', 'mir', 'uns', 'euch', 'ihm', 'ihn',
+    # Prepositions
+    'mit', 'von', 'zu', 'für', 'auf', 'in', 'an', 'um', 'am', 'im',
+    'aus', 'bei', 'nach', 'vor', 'bis', 'durch', 'über', 'unter',
+    'zwischen', 'ohne', 'gegen',
+    # Conjunctions
+    'und', 'oder', 'als', 'wie', 'wenn', 'dass', 'weil', 'aber',
+    # Adverbs
+    'auch', 'noch', 'nur', 'schon', 'sehr', 'nicht',
+    # Verbs
+    'ist', 'hat', 'wird', 'kann', 'soll', 'muss', 'darf',
+    'sein', 'haben',
+    # Other
+    'kein', 'keine', 'keinem', 'keinen', 'keiner',
+])
+
+# Cached hyphenators
+_hyph_de = None
+_hyph_en = None
+
+# Cached spellchecker (for autocorrect_pipe_artifacts)
+_spell_de = None
+
+
+def _get_hyphenators():
+    """Lazy-load pyphen hyphenators (cached across calls)."""
+    global _hyph_de, _hyph_en
+    if _hyph_de is not None:
+        return _hyph_de, _hyph_en
+    try:
+        import pyphen
+    except ImportError:
+        return None, None
+    _hyph_de = pyphen.Pyphen(lang='de_DE')
+    _hyph_en = pyphen.Pyphen(lang='en_US')
+    return _hyph_de, _hyph_en
+
+
+def _get_spellchecker():
+    """Lazy-load German spellchecker (cached across calls)."""
+    global _spell_de
+    if _spell_de is not None:
+        return _spell_de
+    try:
+        from spellchecker import SpellChecker
+    except ImportError:
+        return None
+    _spell_de = SpellChecker(language='de')
+    return _spell_de
+
+
+def _is_known_word(word: str, hyph_de, hyph_en) -> bool:
+    """Check whether pyphen recognises a word (DE or EN)."""
+    if len(word) < 2:
+        return False
+    return ('|' in hyph_de.inserted(word, hyphen='|')
+            or '|' in hyph_en.inserted(word, hyphen='|'))
+
+
+def _is_real_word(word: str) -> bool:
+    """Check whether spellchecker knows this word (case-insensitive)."""
+    spell = _get_spellchecker()
+    if spell is None:
+        return False
+    return word.lower() in spell
+
+
+def _hyphenate_word(word: str, hyph_de, hyph_en) -> Optional[str]:
+    """Try to hyphenate a word using DE then EN dictionary.
+
+    Returns word with | separators, or None if not recognized.
+    """
+    hyph = hyph_de.inserted(word, hyphen='|')
+    if '|' in hyph:
+        return hyph
+    hyph = hyph_en.inserted(word, hyphen='|')
+    if '|' in hyph:
+        return hyph
+    return None
+
+
+def _autocorrect_piped_word(word_with_pipes: str) -> Optional[str]:
+    """Try to correct a word that has OCR pipe artifacts.
+
+    Printed syllable divider lines on dictionary pages confuse OCR:
+    the vertical stroke is often read as an extra character (commonly
+    ``l``, ``I``, ``1``, ``i``) adjacent to where the pipe appears.
+    Sometimes OCR reads one divider as ``|`` and another as a letter,
+    so the garbled character may be far from any detected pipe.
+
+    Uses ``spellchecker`` (frequency-based word list) for validation —
+    unlike pyphen which is a pattern-based hyphenator and accepts
+    nonsense strings like "Zeplpelin".
+
+    Strategy:
+        1. Strip ``|`` — if spellchecker knows the result, done.
+        2. Try deleting each pipe-like character (l, I, 1, i, t).
+           OCR inserts extra chars that resemble vertical strokes.
+        3. Fall back to spellchecker's own ``correction()`` method.
+        4. Preserve the original casing of the first letter.
+    """
+    stripped = word_with_pipes.replace('|', '')
+    if not stripped or len(stripped) < 3:
+        return stripped  # too short to validate
+
+    # Step 1: if the stripped word is already a real word, done
+    if _is_real_word(stripped):
+        return stripped
+
+    # Step 2: try deleting pipe-like characters (most likely artifacts)
+    _PIPE_LIKE = frozenset('lI1it')
+    for idx in range(len(stripped)):
+        if stripped[idx] not in _PIPE_LIKE:
+            continue
+        candidate = stripped[:idx] + stripped[idx + 1:]
+        if len(candidate) >= 3 and _is_real_word(candidate):
+            return candidate
+
+    # Step 3: use spellchecker's built-in correction
+    spell = _get_spellchecker()
+    if spell is not None:
+        suggestion = spell.correction(stripped.lower())
+        if suggestion and suggestion != stripped.lower():
+            # Preserve original first-letter case
+            if stripped[0].isupper():
+                suggestion = suggestion[0].upper() + suggestion[1:]
+            return suggestion
+
+    return None  # could not fix
+
+
+def autocorrect_pipe_artifacts(
+    zones_data: List[Dict], session_id: str,
+) -> int:
+    """Strip OCR pipe artifacts and correct garbled words in-place.
+
+    Printed syllable divider lines on dictionary scans are read by OCR
+    as ``|`` characters embedded in words (e.g. ``Zel|le``, ``Ze|plpe|lin``).
+    This function:
+
+    1. Strips ``|`` from every word in content cells.
+    2. Validates with spellchecker (real dictionary lookup).
+    3. If not recognised, tries deleting pipe-like characters or uses
+       spellchecker's correction (e.g. ``Zeplpelin`` → ``Zeppelin``).
+    4. Updates both word-box texts and cell text.
+
+    Returns the number of cells modified.
+    """
+    spell = _get_spellchecker()
+    if spell is None:
+        logger.warning("spellchecker not available — pipe autocorrect limited")
+        # Pipes are still stripped below; only dictionary validation and
+        # correction are unavailable without spellchecker.
+
+    modified = 0
+    for z in zones_data:
+        for cell in z.get("cells", []):
+            ct = cell.get("col_type", "")
+            if not ct.startswith("column_"):
+                continue
+
+            cell_changed = False
+
+            # --- Fix word boxes ---
+            for wb in cell.get("word_boxes", []):
+                wb_text = wb.get("text", "")
+                if "|" not in wb_text:
+                    continue
+
+                # Separate leading and trailing non-letter characters
+                m = re.match(
+                    r'^([^a-zA-ZäöüÄÖÜßẞ]*)'
+                    r'(.*?)'
+                    r'([^a-zA-ZäöüÄÖÜßẞ]*)$',
+                    wb_text,
+                )
+                if not m:
+                    continue
+                lead, core, trail = m.group(1), m.group(2), m.group(3)
+                if "|" not in core:
+                    continue
+
+                corrected = _autocorrect_piped_word(core)
+                if corrected is not None and corrected != core:
+                    wb["text"] = lead + corrected + trail
+                    cell_changed = True
+
+            # --- Rebuild cell text from word boxes ---
+            if cell_changed:
+                wbs = cell.get("word_boxes", [])
+                if wbs:
+                    cell["text"] = " ".join(
+                        (wb.get("text") or "") for wb in wbs
+                    )
+                modified += 1
+
+            # --- Fallback: strip residual | from cell text ---
+            # (covers cases where word_boxes don't exist or weren't fixed)
+            text = cell.get("text", "")
+            if "|" in text:
+                clean = text.replace("|", "")
+                if clean != text:
+                    cell["text"] = clean
+                    if not cell_changed:
+                        modified += 1
+
+    if modified:
+        logger.info(
+            "build-grid session %s: autocorrected pipe artifacts in %d cells",
+            session_id, modified,
+        )
+    return modified
+
+
+def _try_merge_pipe_gaps(text: str, hyph_de) -> str:
+    """Merge fragments separated by single spaces where OCR split at a pipe.
+
+    Example: "Kaf fee" -> "Kaffee" (pyphen recognizes the merged word).
+    Multi-step: "Ka bel jau" -> "Kabel jau" -> "Kabeljau".
+
+    Guards against false merges:
+    - The FIRST token must be pure alpha (word start — no attached punctuation)
+    - The second token may have trailing punctuation (comma, period) which
+      stays attached to the merged word: "Kä" + "fer," -> "Käfer,"
+    - Common German function words (der, die, das, ...) are never merged
+    - At least one fragment must be very short (<=3 alpha chars)
+    """
+    parts = text.split(' ')
+    if len(parts) < 2:
+        return text
+
+    result = [parts[0]]
+    i = 1
+    while i < len(parts):
+        prev = result[-1]
+        curr = parts[i]
+
+        # Extract alpha-only core for lookup
+        prev_alpha = re.sub(r'[^a-zA-ZäöüÄÖÜßẞ]', '', prev)
+        curr_alpha = re.sub(r'[^a-zA-ZäöüÄÖÜßẞ]', '', curr)
+
+        # Guard 1: first token must be pure alpha (word-start fragment)
+        #          second token may have trailing punctuation
+        # Guard 2: neither alpha core can be a common German function word
+        # Guard 3: the shorter fragment must be <= 3 chars (pipe-gap signal)
+        # Guard 4: combined length must be >= 4
+        should_try = (
+            prev == prev_alpha  # first token: pure alpha (word start)
+            and prev_alpha and curr_alpha
+            and prev_alpha.lower() not in _STOP_WORDS
+            and curr_alpha.lower() not in _STOP_WORDS
+            and min(len(prev_alpha), len(curr_alpha)) <= 3
+            and len(prev_alpha) + len(curr_alpha) >= 4
+        )
+
+        if should_try:
+            merged_alpha = prev_alpha + curr_alpha
+            hyph = hyph_de.inserted(merged_alpha, hyphen='-')
+            if '-' in hyph:
+                # pyphen recognizes merged word — collapse the space
+                result[-1] = prev + curr
+                i += 1
+                continue
+
+        result.append(curr)
+        i += 1
+
+    return ' '.join(result)
+
+
+def merge_word_gaps_in_zones(zones_data: List[Dict], session_id: str) -> int:
+    """Merge OCR word-gap fragments in cell texts using pyphen validation.
+
+    OCR often splits words at syllable boundaries into separate word_boxes,
+    producing text like "zerknit tert" instead of "zerknittert".  This
+    function tries to merge adjacent fragments in every content cell.
+
+    More permissive than ``_try_merge_pipe_gaps`` (threshold 5 instead of 3)
+    but still guarded by pyphen dictionary lookup and stop-word exclusion.
+
+    Returns the number of cells modified.
+    """
+    hyph_de, _ = _get_hyphenators()
+    if hyph_de is None:
+        return 0
+
+    modified = 0
+    for z in zones_data:
+        for cell in z.get("cells", []):
+            ct = cell.get("col_type", "")
+            if not ct.startswith("column_"):
+                continue
+            text = cell.get("text", "")
+            if not text or " " not in text:
+                continue
+
+            # Skip IPA cells
+            text_no_brackets = re.sub(r'\[[^\]]*\]', '', text)
+            if _IPA_RE.search(text_no_brackets):
+                continue
+
+            new_text = _try_merge_word_gaps(text, hyph_de)
+            if new_text != text:
+                cell["text"] = new_text
+                modified += 1
+
+    if modified:
+        logger.info(
+            "build-grid session %s: merged word gaps in %d cells",
+            session_id, modified,
+        )
+    return modified
+
+
+def _try_merge_word_gaps(text: str, hyph_de) -> str:
+    """Merge OCR word fragments with relaxed threshold (max_short=5).
+
+    Similar to ``_try_merge_pipe_gaps`` but allows slightly longer fragments
+    (max_short=5 instead of 3).  Still requires pyphen to recognize the
+    merged word.
+    """
+    parts = text.split(' ')
+    if len(parts) < 2:
+        return text
+
+    result = [parts[0]]
+    i = 1
+    while i < len(parts):
+        prev = result[-1]
+        curr = parts[i]
+
+        prev_alpha = re.sub(r'[^a-zA-ZäöüÄÖÜßẞ]', '', prev)
+        curr_alpha = re.sub(r'[^a-zA-ZäöüÄÖÜßẞ]', '', curr)
+
+        should_try = (
+            prev == prev_alpha
+            and prev_alpha and curr_alpha
+            and prev_alpha.lower() not in _STOP_WORDS
+            and curr_alpha.lower() not in _STOP_WORDS
+            and min(len(prev_alpha), len(curr_alpha)) <= 5
+            and len(prev_alpha) + len(curr_alpha) >= 4
+        )
+
+        if should_try:
+            merged_alpha = prev_alpha + curr_alpha
+            hyph = hyph_de.inserted(merged_alpha, hyphen='-')
+            if '-' in hyph:
+                result[-1] = prev + curr
+                i += 1
+                continue
+
+        result.append(curr)
+        i += 1
+
+    return ' '.join(result)
+
+
+def _syllabify_text(text: str, hyph_de, hyph_en) -> str:
+    """Syllabify all significant words in a text string.
+
+    1. Strip existing | dividers
+    2. Merge pipe-gap spaces where possible
+    3. Apply pyphen to each word >= 3 alphabetic chars
+    4. Words pyphen doesn't recognize stay as-is (no bad guesses)
+    """
+    if not text:
+        return text
+
+    # Skip cells that contain IPA transcription characters outside brackets.
+    # Bracket content like [bɪltʃøn] is programmatically inserted and should
+    # not block syllabification of the surrounding text.
+    text_no_brackets = re.sub(r'\[[^\]]*\]', '', text)
+    if _IPA_RE.search(text_no_brackets):
+        return text
+
+    # Phase 1: strip existing pipe dividers for clean normalization
+    clean = text.replace('|', '')
+
+    # Phase 2: merge pipe-gap spaces (OCR fragments from pipe splitting)
+    clean = _try_merge_pipe_gaps(clean, hyph_de)
+
+    # Phase 3: tokenize and syllabify each word
+    # Split on whitespace and comma/semicolon/colon sequences, keeping separators
+    tokens = re.split(r'(\s+|[,;:]+\s*)', clean)
+
+    result = []
+    for tok in tokens:
+        if not tok or re.match(r'^[\s,;:]+$', tok):
+            result.append(tok)
+            continue
+
+        # Strip trailing/leading punctuation for pyphen lookup
+        m = re.match(r'^([^a-zA-ZäöüÄÖÜßẞ]*)(.*?)([^a-zA-ZäöüÄÖÜßẞ]*)$', tok)
+        if not m:
+            result.append(tok)
+            continue
+        lead, word, trail = m.group(1), m.group(2), m.group(3)
+
+        if len(word) < 3 or not re.search(r'[a-zA-ZäöüÄÖÜß]', word):
+            result.append(tok)
+            continue
+
+        hyph = _hyphenate_word(word, hyph_de, hyph_en)
+        if hyph:
+            result.append(lead + hyph + trail)
+        else:
+            result.append(tok)
+
+    return ''.join(result)
+

 def insert_syllable_dividers(
    zones_data: List[Dict],
    img_bgr: np.ndarray,
    session_id: str,
+    *,
+    force: bool = False,
+    col_filter: Optional[set] = None,
 ) -> int:
-    """Insert pipe syllable dividers into dictionary cells where CV confirms them.
+    """Insert pipe syllable dividers into dictionary cells.

-    For each cell on a dictionary page:
-      1. Check if ANY word_box has CV-detected pipe lines
-      2. If yes, apply pyphen to EACH word (≥4 chars) in the cell
-      3. Try DE hyphenation first, then EN
+    For dictionary pages: process all content column cells, strip existing
+    pipes, merge pipe-gap spaces, and re-syllabify using pyphen.
+
+    Pre-check: at least 1% of content cells must already contain ``|`` from
+    OCR.  This guards against pages with zero pipe characters (the primary
+    guard — article_col_index — is checked at the call site).
+
+    Args:
+        force: If True, skip the pipe-ratio pre-check and syllabify all
+            content words regardless of whether the original has pipe dividers.
+        col_filter: If set, only process cells whose col_type is in this set.
+            None means process all content columns.

    Returns the number of cells modified.
    """
-    try:
-        import pyphen
-    except ImportError:
+    hyph_de, hyph_en = _get_hyphenators()
+    if hyph_de is None:
        logger.warning("pyphen not installed — skipping syllable insertion")
        return 0

-    _hyph_de = pyphen.Pyphen(lang='de_DE')
-    _hyph_en = pyphen.Pyphen(lang='en_US')
-    img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
+    # Pre-check: count cells that already have | from OCR.
+    # Real dictionary pages with printed syllable dividers will have OCR-
+    # detected pipes in many cells.  Pages without syllable dividers will
+    # have zero — skip those to avoid false syllabification.
+    if not force:
+        total_col_cells = 0
+        cells_with_pipes = 0
+        for z in zones_data:
+            for cell in z.get("cells", []):
+                if cell.get("col_type", "").startswith("column_"):
+                    total_col_cells += 1
+                    if "|" in cell.get("text", ""):
+                        cells_with_pipes += 1
+
+        if total_col_cells > 0:
+            pipe_ratio = cells_with_pipes / total_col_cells
+            if pipe_ratio < 0.01:
+                logger.info(
+                    "build-grid session %s: skipping syllable insertion — "
+                    "only %.1f%% of cells have existing pipes (need >=1%%)",
+                    session_id, pipe_ratio * 100,
+                )
+                return 0

    insertions = 0
    for z in zones_data:
@@ -108,48 +506,27 @@ def insert_syllable_dividers(
            ct = cell.get("col_type", "")
            if not ct.startswith("column_"):
                continue
+            if col_filter is not None and ct not in col_filter:
+                continue
            text = cell.get("text", "")
-            if not text or "|" in text:
-                continue
-            if _IPA_RE.search(text):
+            if not text:
                continue

-            # CV gate: check if ANY word_box in this cell has pipe lines
-            wbs = cell.get("word_boxes") or []
-            if not any(_word_has_pipe_lines(img_gray, wb) for wb in wbs):
+            # In auto mode (force=False), only normalize cells that already
+            # have | from OCR (i.e. printed syllable dividers on the original
+            # scan).  Don't add new syllable marks to other words.
+            if not force and "|" not in text:
                continue

-            # Apply pyphen to each significant word in the cell
-            tokens = re.split(r'(\s+|[,;]+\s*)', text)
-            new_tokens = []
-            changed = False
-            for tok in tokens:
-                # Skip whitespace/punctuation separators
-                if re.match(r'^[\s,;]+$', tok):
-                    new_tokens.append(tok)
-                    continue
-                # Only hyphenate words ≥ 4 alpha chars
-                clean = re.sub(r'[().\-]', '', tok)
-                if len(clean) < 4 or not re.search(r'[a-zA-ZäöüÄÖÜß]', clean):
-                    new_tokens.append(tok)
-                    continue
-                # Try DE first, then EN
-                hyph = _hyph_de.inserted(tok, hyphen='|')
-                if '|' not in hyph:
-                    hyph = _hyph_en.inserted(tok, hyphen='|')
-                if '|' in hyph and hyph != tok:
-                    new_tokens.append(hyph)
-                    changed = True
-                else:
-                    new_tokens.append(tok)
-            if changed:
-                cell["text"] = ''.join(new_tokens)
+            new_text = _syllabify_text(text, hyph_de, hyph_en)
+            if new_text != text:
+                cell["text"] = new_text
                insertions += 1

    if insertions:
        logger.info(
-            "build-grid session %s: inserted syllable dividers in %d cells "
-            "(CV-validated)",
+            "build-grid session %s: syllable dividers inserted/normalized "
+            "in %d cells (pyphen)",
            session_id, insertions,
        )
    return insertions
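
Review note on `_autocorrect_piped_word`: the first two strategy steps (strip `|`, then try deleting one pipe-like character) can be sketched against a small lexicon. `LEXICON` and `fix_piped` are hypothetical names; the real code validates against the German `spellchecker` word list and additionally falls back to `spell.correction()`, which this sketch leaves out.

```python
from typing import Optional, Set

# Hypothetical mini-lexicon; the real code asks the German spellchecker.
LEXICON: Set[str] = {"zelle", "zeppelin", "kaffee"}
# Characters OCR tends to produce for a printed vertical divider.
PIPE_LIKE = frozenset("lI1it")


def fix_piped(word: str, lexicon: Set[str] = LEXICON) -> Optional[str]:
    """Strip '|' and, if needed, delete one pipe-like character."""
    stripped = word.replace("|", "")
    if len(stripped) < 3:
        return stripped  # too short to validate
    if stripped.lower() in lexicon:
        return stripped
    # Try removing each pipe-like character once, left to right.
    for idx, ch in enumerate(stripped):
        if ch not in PIPE_LIKE:
            continue
        candidate = stripped[:idx] + stripped[idx + 1:]
        if len(candidate) >= 3 and candidate.lower() in lexicon:
            return candidate
    return None  # no fix found in this simplified version
```

The two examples from the docstring behave as described: `Zel|le` only needs the pipe stripped, while `Ze|plpe|lin` also needs the stray `l` removed.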
--- a/klausur-service/backend/cv_vocab_types.py
+++ b/klausur-service/backend/cv_vocab_types.py
@@ -65,6 +65,38 @@ if os.path.exists(_britfone_path):
 else:
    logger.info("Britfone not found — British IPA disabled")

+# --- German IPA Dictionary (CC-BY-SA, Wiktionary) ---
+
+DE_IPA_AVAILABLE = False
+_de_ipa_dict: Dict[str, str] = {}
+
+_de_ipa_path = os.path.join(os.path.dirname(__file__), 'data', 'de_ipa.tsv')
+if os.path.exists(_de_ipa_path):
+    try:
+        with open(_de_ipa_path, 'r', encoding='utf-8') as f:
+            for line in f:
+                parts = line.rstrip('\n').split('\t', 1)
+                if len(parts) == 2:
+                    _de_ipa_dict[parts[0]] = parts[1]
+        DE_IPA_AVAILABLE = True
+        logger.info(f"German IPA loaded — {len(_de_ipa_dict)} entries (CC-BY-SA, Wiktionary)")
+    except Exception as e:
+        logger.warning(f"Failed to load German IPA: {e}")
+else:
+    logger.info("German IPA not found — German IPA disabled")
+
+# --- epitran German fallback (MIT license) ---
+
+_epitran_de = None
+try:
+    import epitran as _epitran_module
+    _epitran_de = _epitran_module.Epitran('deu-Latn')
+    logger.info("epitran loaded — German rule-based IPA fallback enabled")
+except ImportError:
+    logger.info("epitran not installed — German IPA fallback disabled")
+except Exception as e:
+    logger.warning(f"Failed to init epitran: {e}")
+
 # --- Language Detection Constants ---

 GERMAN_FUNCTION_WORDS = {'der', 'die', 'das', 'und', 'ist', 'ein', 'eine', 'nicht',
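
Review note on the German IPA loader: the parsing rule (split each line on the first tab, silently skip lines without one) is easy to check in isolation. The sketch below assumes a hypothetical helper `load_ipa_tsv` that reads from any text handle; the two sample rows are illustrative and are not taken from `de_ipa.tsv`.

```python
import io
from typing import Dict, TextIO


def load_ipa_tsv(handle: TextIO) -> Dict[str, str]:
    """Parse 'word<TAB>ipa' lines; lines without a tab are skipped."""
    table: Dict[str, str] = {}
    for line in handle:
        parts = line.rstrip("\n").split("\t", 1)
        if len(parts) == 2:
            table[parts[0]] = parts[1]
    return table


# Illustrative sample input (assumption, not real file content).
SAMPLE = "Haus\thaʊ̯s\nmalformed line\nKatze\tˈkat͡sə\n"
entries = load_ipa_tsv(io.StringIO(SAMPLE))
```

Keys are stored exactly as they appear in the file, so whether lookups are case-sensitive depends on the caller, which this diff does not show.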
--- a/klausur-service/backend/cv_words_first.py
+++ b/klausur-service/backend/cv_words_first.py
@@ -35,9 +35,15 @@ def _cluster_columns(
    words: List[Dict],
    img_w: int,
    min_gap_pct: float = 3.0,
+    max_columns: Optional[int] = None,
 ) -> List[Dict[str, Any]]:
    """Cluster words into columns by finding large horizontal gaps.

+    Args:
+        max_columns: If set, caps the number of columns by keeping only
+            the (max_columns - 1) widest gaps as split points.
+            Prevents phantom columns from degraded OCR.
+
    Returns a list of column dicts:
        [{'index': 0, 'type': 'column_1', 'x_min': ..., 'x_max': ...}, ...]
    sorted left-to-right.
@@ -57,17 +63,28 @@ def _cluster_columns(

    # Find X-gap boundaries between consecutive words (sorted by X-center)
    # For each word, compute right edge; for next word, compute left edge
-    boundaries: List[float] = []  # X positions where columns split
+    # Collect gaps with their sizes for max_columns enforcement
+    gaps: List[Tuple[float, float]] = []  # (gap_size, split_x)
    for i in range(len(sorted_w) - 1):
        right_edge = sorted_w[i]['left'] + sorted_w[i]['width']
        left_edge = sorted_w[i + 1]['left']
        gap = left_edge - right_edge
        if gap > min_gap_px:
-            # Split point is midway through the gap
-            boundaries.append((right_edge + left_edge) / 2)
+            split_x = (right_edge + left_edge) / 2
+            gaps.append((gap, split_x))
+
+    # If max_columns is set, keep only the (max_columns - 1) largest gaps
+    if max_columns and len(gaps) >= max_columns:
+        gaps.sort(key=lambda g: g[0], reverse=True)
+        removed, gaps = len(gaps) - (max_columns - 1), gaps[:max_columns - 1]
+        logger.info(
+            f"_cluster_columns: limited to {max_columns} columns "
+            f"(removed {removed} smallest gaps)"
+        )
+
+    boundaries = sorted(g[1] for g in gaps)

    # Build column ranges from boundaries
-    # Column ranges: (-inf, boundary[0]), (boundary[0], boundary[1]), ..., (boundary[-1], +inf)
    col_edges = [0.0] + boundaries + [float(img_w)]
    columns = []
    for ci in range(len(col_edges) - 1):
@@ -302,6 +319,7 @@ def build_grid_from_words(
    img_h: int,
    min_confidence: int = 30,
    box_rects: Optional[List[Dict]] = None,
+    max_columns: Optional[int] = None,
 ) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Build a cell grid bottom-up from Tesseract word boxes.

@@ -359,8 +377,9 @@ def build_grid_from_words(
            return [], []

    # Step 1: cluster columns
-    columns = _cluster_columns(words, img_w)
-    logger.info("build_grid_from_words: %d column(s) detected", len(columns))
+    columns = _cluster_columns(words, img_w, max_columns=max_columns)
+    logger.info("build_grid_from_words: %d column(s) detected%s",
+                len(columns), f" (max={max_columns})" if max_columns else "")

    # Step 2: cluster rows
    rows = _cluster_rows(words)
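Review note: the `max_columns` cap in `_cluster_columns` can be sketched in isolation as below. This is a minimal illustration, not the module's code; `select_boundaries` is a hypothetical helper name, and the `(gap_size, split_x)` tuple layout is taken from the hunk above.

```python
from typing import List, Optional, Tuple


def select_boundaries(
    gaps: List[Tuple[float, float]],
    max_columns: Optional[int] = None,
) -> List[float]:
    """Return split positions, keeping only the widest gaps when capped.

    `gaps` holds (gap_size, split_x) pairs. `select_boundaries` is a
    hypothetical name used only for this sketch.
    """
    if max_columns and len(gaps) >= max_columns:
        # N columns need N - 1 boundaries, so keep the N - 1 widest gaps.
        gaps = sorted(gaps, key=lambda g: g[0], reverse=True)[:max_columns - 1]
    # Boundaries must be left-to-right regardless of gap size ordering.
    return sorted(split_x for _, split_x in gaps)


example_gaps = [(40.0, 120.0), (12.0, 300.0), (55.0, 480.0), (9.0, 610.0)]
capped = select_boundaries(example_gaps, max_columns=3)
uncapped = select_boundaries(example_gaps)
```

With `max_columns=3` only the two widest gaps survive, so narrow gaps caused by degraded OCR no longer create extra columns. Without the cap all four split positions are returned.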
--- a/klausur-service/backend/data/de_ipa.tsv
+++ b/klausur-service/backend/data/de_ipa.tsv
--- a/klausur-service/backend/grid_build_core.py
+++ b/klausur-service/backend/grid_build_core.py
--- a/klausur-service/backend/grid_editor_api.py
+++ b/klausur-service/backend/grid_editor_api.py
--- a/klausur-service/backend/grid_editor_helpers.py
+++ b/klausur-service/backend/grid_editor_helpers.py
@@ -15,12 +15,155 @@ from typing import Any, Dict, List, Optional, Tuple
 import cv2
 import numpy as np

+from cv_vocab_types import PageZone
 from cv_words_first import _cluster_rows, _build_cells
 from cv_ocr_engines import _text_has_garbled_ipa

 logger = logging.getLogger(__name__)


+# ---------------------------------------------------------------------------
+# Cross-column word splitting
+# ---------------------------------------------------------------------------
+
+_spell_cache: Optional[Any] = None
+_spell_loaded = False
+
+
+def _is_recognized_word(text: str) -> bool:
+    """Check if *text* is a recognized German word.
+
+    Uses the spellchecker library (same as cv_syllable_detect.py).
+    Returns True for real words like "oder", "Kabel", "Zeitung".
+    Returns False for OCR merge artifacts like "sichzie", "dasZimmer".
+    """
+    global _spell_cache, _spell_loaded
+    if not text or len(text) < 2:
+        return False
+
+    if not _spell_loaded:
+        _spell_loaded = True
+        try:
+            from spellchecker import SpellChecker
+            _spell_cache = SpellChecker(language="de")
+        except Exception as e:
+            logger.warning("spellchecker unavailable, all words count as unrecognized: %s", e)
+
+    if _spell_cache is None:
+        return False
+
+    return text.lower() in _spell_cache
+
+
+def _split_cross_column_words(
+    words: List[Dict],
+    columns: List[Dict],
+) -> List[Dict]:
+    """Split word boxes that span across column boundaries.
+
+    When OCR merges adjacent words from different columns (e.g. "sichzie"
+    spanning Col 1 and Col 2, or "dasZimmer" crossing the boundary),
+    split the word box at the column boundary so each piece is assigned
+    to the correct column.
+
+    Only splits when:
+    - The word has significant overlap (>15% of its width) on both sides
+    - AND the word is not a recognized real word (OCR merge artifact), OR
+      the word contains a case transition (lowercase→uppercase) near the
+      boundary indicating two merged words like "dasZimmer".
+    """
+    if len(columns) < 2:
+        return words
+
+    # Column boundaries = midpoints between adjacent column edges
+    boundaries = []
+    for i in range(len(columns) - 1):
+        boundary = (columns[i]["x_max"] + columns[i + 1]["x_min"]) / 2
+        boundaries.append(boundary)
+
+    new_words: List[Dict] = []
+    split_count = 0
+    for w in words:
+        w_left = w["left"]
+        w_width = w["width"]
+        w_right = w_left + w_width
+        text = (w.get("text") or "").strip()
+
+        if not text or len(text) < 4 or w_width < 10:
+            new_words.append(w)
+            continue
+
+        # Find the first boundary this word straddles significantly
+        split_boundary = None
+        for b in boundaries:
+            if w_left < b < w_right:
+                left_part = b - w_left
+                right_part = w_right - b
+                # Both sides must have at least 15% of the word width
+                if left_part > w_width * 0.15 and right_part > w_width * 0.15:
+                    split_boundary = b
+                    break
+
+        if split_boundary is None:
+            new_words.append(w)
+            continue
+
+        # Compute approximate split position in the text.
+        left_width = split_boundary - w_left
+        split_ratio = left_width / w_width
+        approx_pos = len(text) * split_ratio
+
+        # Strategy 1: look for a case transition (lowercase→uppercase) near
+        # the approximate split point — e.g. "dasZimmer" splits at 'Z'.
+        split_char = None
+        search_lo = max(1, int(approx_pos) - 3)
+        search_hi = min(len(text), int(approx_pos) + 2)
+        for i in range(search_lo, search_hi):
+            if text[i - 1].islower() and text[i].isupper():
+                split_char = i
+                break
+
+        # Strategy 2: if no case transition, only split if the whole word
+        # is NOT a real word (i.e. it's an OCR merge artifact like "sichzie").
+        # Real words like "oder", "Kabel", "Zeitung" must not be split.
+        if split_char is None:
+            clean = re.sub(r"[,;:.!?]+$", "", text)  # strip trailing punct
+            if _is_recognized_word(clean):
+                new_words.append(w)
+                continue
+            # Not a real word — use floor of proportional position
+            split_char = max(1, min(len(text) - 1, int(approx_pos)))
+
+        left_text = text[:split_char].rstrip()
+        right_text = text[split_char:].lstrip()
+
+        if len(left_text) < 2 or len(right_text) < 2:
+            new_words.append(w)
+            continue
+
+        right_width = w_width - round(left_width)
+        new_words.append({
+            **w,
+            "text": left_text,
+            "width": round(left_width),
+        })
+        new_words.append({
+            **w,
+            "text": right_text,
+            "left": round(split_boundary),
+            "width": right_width,
+        })
+        split_count += 1
+        logger.info(
+            "split cross-column word %r → %r + %r at boundary %.0f",
+            text, left_text, right_text, split_boundary,
+        )
+
+    if split_count:
+        logger.info("split %d cross-column word(s)", split_count)
+    return new_words
+
+
 def _filter_border_strip_words(words: List[Dict]) -> Tuple[List[Dict], int]:
    """Remove page-border decoration strip words BEFORE column detection.

@@ -137,8 +280,27 @@ def _cluster_columns_by_alignment(
        median_gap = sorted_gaps[len(sorted_gaps) // 2]
        heights = [w["height"] for w in words if w.get("height", 0) > 0]
        median_h = sorted(heights)[len(heights) // 2] if heights else 25
-        # Column boundary: gap > 3× median gap or > 1.5× median word height
+
+        # For small word counts (boxes, sub-zones): PaddleOCR returns
+        # multi-word blocks, so ALL inter-word gaps are potential column
+        # boundaries.  Use a low threshold based on word height — any gap
+        # wider than ~1x median word height is a column separator.
+        if len(words) <= 60:
+            gap_threshold = max(median_h * 1.0, 25)
+            logger.info(
+                "alignment columns (small zone): gap_threshold=%.0f "
+                "(median_h=%.0f, %d words, %d gaps: %s)",
+                gap_threshold, median_h, len(words), len(sorted_gaps),
+                [int(g) for g in sorted_gaps[:10]],
+            )
+        else:
+            # Standard approach for large zones (full pages)
            gap_threshold = max(median_gap * 3, median_h * 1.5, 30)
+            # Cap at 25% of zone width
+            max_gap = zone_w * 0.25
+            if gap_threshold > max_gap > 30:
+                logger.info("alignment columns: capping gap_threshold %.0f → %.0f (25%% of zone_w=%d)", gap_threshold, max_gap, zone_w)
+                gap_threshold = max_gap
    else:
        gap_threshold = 50

@@ -232,13 +394,17 @@ def _cluster_columns_by_alignment(
    used_ids = {id(c) for c in primary} | {id(c) for c in secondary}
    sig_xs = [c["mean_x"] for c in primary + secondary]

-    MIN_DISTINCT_ROWS_TERTIARY = max(MIN_DISTINCT_ROWS + 1, 4)
-    MIN_COVERAGE_TERTIARY = 0.05  # at least 5% of rows
+    # Tertiary: clusters that are clearly to the LEFT of the first
+    # significant column (or RIGHT of the last).  If words consistently
+    # start at a position left of the established first column boundary,
+    # they MUST be a separate column — regardless of how few rows they
+    # cover.  The only requirement is a clear spatial gap.
+    MIN_COVERAGE_TERTIARY = 0.02  # at least 1 row effectively
    tertiary = []
    for c in clusters:
        if id(c) in used_ids:
            continue
-        if c["distinct_rows"] < MIN_DISTINCT_ROWS_TERTIARY:
+        if c["distinct_rows"] < 1:
            continue
        if c["row_coverage"] < MIN_COVERAGE_TERTIARY:
            continue
@@ -906,13 +1072,42 @@ def _detect_heading_rows_by_single_cell(
            text = (cell.get("text") or "").strip()
            if not text or text.startswith("["):
                continue
+            # Continuation lines start with "(" — e.g. "(usw.)", "(TV-Serie)"
+            if text.startswith("("):
+                continue
+            # A single cell beyond the second content column is likely a
+            # continuation/overflow line, not a heading.  Real headings
+            # ("Theme 1", "Unit 3: ...") appear in the first or second
+            # content column.
+            first_content_col = col_indices[0] if col_indices else 0
+            if cell.get("col_index", 0) > first_content_col + 1:
+                continue
            # Skip garbled IPA without brackets (e.g. "ska:f – ska:vz")
            # but NOT text with real IPA symbols (e.g. "Theme [θˈiːm]")
            _REAL_IPA_CHARS = set("ˈˌəɪɛɒʊʌæɑɔʃʒθðŋ")
            if _text_has_garbled_ipa(text) and not any(c in _REAL_IPA_CHARS for c in text):
                continue
+            # Guard: dictionary section headings are short (1-4 alpha chars
+            # like "A", "Ab", "Zi", "Sch").  Longer text that starts
+            # lowercase is a regular vocabulary word (e.g. "zentral") that
+            # happens to appear alone in its row.
+            alpha_only = re.sub(r'[^a-zA-ZäöüÄÖÜßẞ]', '', text)
+            if len(alpha_only) > 4 and text[0].islower():
+                continue
            heading_row_indices.append(ri)

+        # Guard: if >25% of eligible rows would become headings, the
+        # heuristic is misfiring (e.g. sparse single-column layout where
+        # most rows naturally have only 1 content cell).
+        eligible_rows = len(non_header_rows) - 2  # minus first/last excluded
+        if eligible_rows > 0 and len(heading_row_indices) > eligible_rows * 0.25:
+            logger.debug(
+                "Skipping single-cell heading detection for zone %s: "
+                "%d/%d rows would be headings (>25%%)",
+                z.get("zone_index"), len(heading_row_indices), eligible_rows,
+            )
+            continue
+
        for hri in heading_row_indices:
            header_cells = [c for c in cells if c.get("row_index") == hri]
            if not header_cells:
@@ -1023,6 +1218,130 @@ def _detect_header_rows(
    return headers


+def _detect_colspan_cells(
+    zone_words: List[Dict],
+    columns: List[Dict],
+    rows: List[Dict],
+    cells: List[Dict],
+    img_w: int,
+    img_h: int,
+) -> List[Dict]:
+    """Detect and merge cells that span multiple columns (colspan).
+
+    A word-block (PaddleOCR phrase) that extends significantly past a column
+    boundary into the next column indicates a merged cell.  This replaces
+    the incorrectly split cells with a single cell spanning multiple columns.
+
+    Works for both full-page scans and box zones.
+    """
+    if len(columns) < 2 or not zone_words or not rows:
+        return cells
+
+    from cv_words_first import _assign_word_to_row
+
+    # Column boundaries (midpoints between adjacent columns)
+    col_boundaries = []
+    for ci in range(len(columns) - 1):
+        col_boundaries.append((columns[ci]["x_max"] + columns[ci + 1]["x_min"]) / 2)
+
+    def _cols_covered(w_left: float, w_right: float) -> List[int]:
+        """Return list of column indices that a word-block covers."""
+        covered = []
+        for col in columns:
+            col_mid = (col["x_min"] + col["x_max"]) / 2
+            # Word covers a column if it extends past the column's midpoint
+            if w_left < col_mid < w_right:
+                covered.append(col["index"])
+            # Also include column if word starts within it
+            elif col["x_min"] <= w_left < col["x_max"]:
+                covered.append(col["index"])
+        return sorted(set(covered))
+
+    # Group original word-blocks by row
+    row_word_blocks: Dict[int, List[Dict]] = {}
+    for w in zone_words:
+        ri = _assign_word_to_row(w, rows)
+        row_word_blocks.setdefault(ri, []).append(w)
+
+    # For each row, check if any word-block spans multiple columns
+    rows_to_merge: Dict[int, List[Dict]] = {}  # row_index → list of spanning word-blocks
+
+    for ri, wblocks in row_word_blocks.items():
+        spanning = []
+        for w in wblocks:
+            w_left = w["left"]
+            w_right = w_left + w["width"]
+            covered = _cols_covered(w_left, w_right)
+            if len(covered) >= 2:
+                spanning.append({"word": w, "cols": covered})
+        if spanning:
+            rows_to_merge[ri] = spanning
+
+    if not rows_to_merge:
+        return cells
+
+    # Merge cells for spanning rows
+    new_cells = []
+    for cell in cells:
+        ri = cell.get("row_index", -1)
+        if ri not in rows_to_merge:
+            new_cells.append(cell)
+            continue
+
+        # Check if this cell's column is part of a spanning block
+        ci = cell.get("col_index", -1)
+        is_part_of_span = False
+        for span in rows_to_merge[ri]:
+            if ci in span["cols"]:
+                is_part_of_span = True
+                # Only emit the merged cell for the FIRST column in the span
+                if ci == span["cols"][0]:
+                    # Use the ORIGINAL word-block text (not the split cell texts
+                    # which may have broken words like "euros a" + "nd cents")
+                    orig_word = span["word"]
+                    merged_text = orig_word.get("text", "").strip()
+                    all_wb = [orig_word]
+
+                    # Compute merged bbox
+                    if all_wb:
+                        x_min = min(wb["left"] for wb in all_wb)
+                        y_min = min(wb["top"] for wb in all_wb)
+                        x_max = max(wb["left"] + wb["width"] for wb in all_wb)
+                        y_max = max(wb["top"] + wb["height"] for wb in all_wb)
+                    else:
+                        x_min = y_min = x_max = y_max = 0
+
+                    new_cells.append({
+                        "cell_id": cell["cell_id"],
+                        "row_index": ri,
+                        "col_index": span["cols"][0],
+                        "col_type": "spanning_header",
+                        "colspan": len(span["cols"]),
+                        "text": merged_text,
+                        "confidence": cell.get("confidence", 0),
+                        "bbox_px": {"x": x_min, "y": y_min,
+                                    "w": x_max - x_min, "h": y_max - y_min},
+                        "bbox_pct": {
+                            "x": round(x_min / img_w * 100, 2) if img_w else 0,
+                            "y": round(y_min / img_h * 100, 2) if img_h else 0,
+                            "w": round((x_max - x_min) / img_w * 100, 2) if img_w else 0,
+                            "h": round((y_max - y_min) / img_h * 100, 2) if img_h else 0,
+                        },
+                        "word_boxes": all_wb,
+                        "ocr_engine": cell.get("ocr_engine", ""),
+                        "is_bold": cell.get("is_bold", False),
+                    })
+                    logger.info(
+                        "colspan detected: row %d, cols %s → merged %d cells (%r)",
+                        ri, span["cols"], len(span["cols"]), merged_text[:50],
+                    )
+                break
+        if not is_part_of_span:
+            new_cells.append(cell)
+
+    return new_cells
+
+
 def _build_zone_grid(
    zone_words: List[Dict],
    zone_x: int,
@@ -1091,9 +1410,24 @@ def _build_zone_grid(
            "header_rows": [],
        }

+    # Split word boxes that straddle column boundaries (e.g. "sichzie"
+    # spanning Col 1 + Col 2).  Must happen after column detection and
+    # before cell assignment.
+    # Keep original words for colspan detection (split destroys span info).
+    original_zone_words = zone_words
+    if len(columns) >= 2:
+        zone_words = _split_cross_column_words(zone_words, columns)
+
    # Build cells
    cells = _build_cells(zone_words, columns, rows, img_w, img_h)

+    # --- Detect colspan (merged cells spanning multiple columns) ---
+    # Uses the ORIGINAL (pre-split) words to detect word-blocks that span
+    # multiple columns.  _split_cross_column_words would have destroyed
+    # this information by cutting words at column boundaries.
+    if len(columns) >= 2:
+        cells = _detect_colspan_cells(original_zone_words, columns, rows, cells, img_w, img_h)
+
    # Prefix cell IDs with zone index
    for cell in cells:
        cell["cell_id"] = f"Z{zone_index}_{cell['cell_id']}"
@@ -1288,29 +1622,42 @@ def _filter_footer_words(
    img_h: int,
    log: Any,
    session_id: str,
-) -> None:
+) -> Optional[Dict]:
    """Remove isolated words in the bottom 5% of the page (page numbers).

-    Modifies *words* in place.
+    Modifies *words* in place and returns a page_number metadata dict
+    if a page number was extracted, or None.
    """
    if not words or img_h <= 0:
-        return
+        return None
    footer_y = img_h * 0.95
    footer_words = [
        w for w in words
        if w["top"] + w.get("height", 0) / 2 > footer_y
    ]
    if not footer_words:
-        return
+        return None
    # Only remove if footer has very few words (≤ 3) with short text
    total_text = "".join((w.get("text") or "").strip() for w in footer_words)
    if len(footer_words) <= 3 and len(total_text) <= 10:
+        # Extract page number metadata before removing
+        page_number_info = {
+            "text": total_text.strip(),
+            "y_pct": round(footer_words[0]["top"] / img_h * 100, 1),
+        }
+        # Try to parse as integer
+        digits = "".join(c for c in total_text if c.isdigit())
+        if digits:
+            page_number_info["number"] = int(digits)
+
        footer_set = set(id(w) for w in footer_words)
        words[:] = [w for w in words if id(w) not in footer_set]
        log.info(
-            "build-grid session %s: removed %d footer words ('%s')",
-            session_id, len(footer_words), total_text,
+            "build-grid session %s: extracted page number '%s' and removed %d footer words",
+            session_id, total_text, len(footer_words),
        )
+        return page_number_info
+    return None


 def _filter_header_junk(
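Review note: the page-number extraction added to `_filter_footer_words` can be sketched on its own as follows. This is an illustration under assumptions, not the module's code: `extract_page_number` is a hypothetical name, the caller is assumed to have already selected the footer words, and the thresholds (at most 3 words, at most 10 characters) are copied from the hunk above.

```python
from typing import Dict, List, Optional


def extract_page_number(footer_words: List[Dict], img_h: int) -> Optional[Dict]:
    """Build page-number metadata from a short footer, or return None.

    Hypothetical helper for illustration; mirrors the thresholds in the
    diff (<= 3 words, <= 10 characters of combined text).
    """
    if not footer_words or img_h <= 0:
        return None
    total_text = "".join((w.get("text") or "").strip() for w in footer_words)
    if len(footer_words) > 3 or len(total_text) > 10:
        return None  # too much text to be an isolated page number
    info: Dict = {
        "text": total_text,
        "y_pct": round(footer_words[0]["top"] / img_h * 100, 1),
    }
    # Keep only the digits so that "S.42" still yields the number 42.
    digits = "".join(c for c in total_text if c.isdigit())
    if digits:
        info["number"] = int(digits)
    return info


footer = [{"text": "S.", "top": 1940, "height": 20},
          {"text": "42", "top": 1940, "height": 20}]
page_info = extract_page_number(footer, img_h=2000)
```

The `number` key is only present when the footer text contains digits, so callers should read it with `.get("number")`.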
--- a/klausur-service/backend/main.py
+++ b/klausur-service/backend/main.py
@@ -46,6 +46,7 @@ from ocr_pipeline_api import router as ocr_pipeline_router, _cache as ocr_pipeli
 from grid_editor_api import router as grid_editor_router
 from orientation_crop_api import router as orientation_crop_router, set_cache_ref as set_orientation_crop_cache
 from ocr_pipeline_session_store import init_ocr_pipeline_tables
+from ocr_kombi.router import router as ocr_kombi_router
 try:
    from handwriting_htr_api import router as htr_router
 except ImportError:
@@ -186,6 +187,7 @@ if htr_router:
    app.include_router(htr_router)            # Handwriting HTR (Klausur)
 if dsfa_rag_router:
    app.include_router(dsfa_rag_router)   # DSFA RAG Corpus Search
+app.include_router(ocr_kombi_router)      # OCR Kombi Pipeline (modular)


 # =============================================