feat(ocr): Add Grid Detection v4 tests, docs, and SBOM update

- Add comprehensive tests for grid_detection_service.py (31 tests)
  - mm coordinate conversion tests
  - Deskew calculation tests
  - Column detection tests
  - Integration tests for vocabulary tables
- Add OCR-Compare documentation (OCR-Compare.md)
  - mm coordinate system documentation
  - Deskew correction documentation
  - Worksheet Editor integration guide
  - API endpoints documentation
- Add TypeScript tests for ocr-integration.ts
  - mm to pixel conversion tests
  - OCR export format tests
  - localStorage operations tests
- Update SBOM to v1.5.0
  - Add OCR Grid Detection System section
  - Document Fabric.js (MIT) for Worksheet Editor
  - Document NumPy and OpenCV usage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 21:31:35 -08:00
commit baee45b861
4 changed files with 2421 additions and 0 deletions
@@ -0,0 +1,366 @@
+# OCR Compare Tool - Documentation
+
+**Status:** Production
+**Version:** 4.0
+**Last updated:** 2026-02-08
+**URL:** https://macmini:3002/ai/ocr-compare
+
+---
+
+## Overview
+
+The OCR Compare Tool automatically analyzes scanned vocabulary tables. It provides:
+- Grid-based OCR recognition
+- Automatic column detection (English/German/example)
+- An mm coordinate system for precise positioning
+- Deskew correction for skewed scans
+- Export to the Worksheet Editor
+
+---
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│                    Frontend (admin-v2)                              │
+│  /admin-v2/app/(admin)/ai/ocr-compare/page.tsx                     │
+│  - Image upload                                                     │
+│  - Grid overlay visualization                                       │
+│  - Cell-Edit Popup                                                  │
+│  - Export to the Worksheet Editor                                   │
+└─────────────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────────────┐
+│                 klausur-service (FastAPI)                           │
+│  Port 8086 - /klausur-service/backend/                             │
+│  - /api/v1/ocr/analyze-grid (grid analysis)                         │
+│  - services/grid_detection_service.py (v4)                         │
+└─────────────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+┌─────────────────────────────────────────────────────────────────────┐
+│                    PaddleOCR Service                                │
+│  Port 8088 - OCR recognition                                        │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Features (Version 4)
+
+### 1. mm Coordinate System
+
+All coordinates are reported in millimeters on an A4 page (210 × 297 mm):
+
+| Field | Description |
+|-------|-------------|
+| `x_mm` | X position in mm (0-210) |
+| `y_mm` | Y position in mm (0-297) |
+| `width_mm` | Width in mm |
+| `height_mm` | Height in mm |
+
+**Conversion:**
+```typescript
+// Percent to mm
+const x_mm = (x_percent / 100) * 210
+const y_mm = (y_percent / 100) * 297
+
+// mm to pixels (for a canvas at 96 DPI)
+const MM_TO_PX = 3.7795275591
+const x_px = x_mm * MM_TO_PX
+```
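For checking values on the backend side, the same conversion can be sketched in Python. The function names `percent_to_mm` and `mm_to_px` are illustrative and not taken from `grid_detection_service.py`; the factor 96 / 25.4 reproduces the `MM_TO_PX` constant above.

```python
A4_WIDTH_MM = 210.0
A4_HEIGHT_MM = 297.0

# 96 pixels per inch, 25.4 mm per inch: about 3.7795275591 px per mm
MM_TO_PX = 96 / 25.4


def percent_to_mm(x_percent, y_percent):
    """Convert page-relative percentages to millimeters on an A4 page."""
    return (x_percent / 100.0) * A4_WIDTH_MM, (y_percent / 100.0) * A4_HEIGHT_MM


def mm_to_px(value_mm):
    """Convert millimeters to canvas pixels at 96 DPI."""
    return value_mm * MM_TO_PX


# A cell at x = 10 %, y = 15 % lies at roughly (21.0 mm, 44.55 mm).
x_mm, y_mm = percent_to_mm(10.0, 15.0)
x_px, y_px = mm_to_px(x_mm), mm_to_px(y_mm)
```

Deriving the factor from the DPI value rather than hard-coding it makes it easy to target a different canvas resolution.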
+
+### 2. Deskew Correction
+
+Skewed scans are straightened automatically, using the first column as the reference:
+
+1. **Detection:** All words in the first column (x < 33%) are analyzed
+2. **Calculation:** Linear regression on the left edges
+3. **Correction:** All coordinates are rotated by the calculated angle
+4. **Limit:** The correction is capped at ±5°
+
+```python
+# Deskew angle in the response
+{
+  "deskew_angle_deg": -1.2,  # Negative value = tilted to the left
+  ...
+}
+```
+
+### 3. Column Detection with 1 mm Margin
+
+Columns are detected automatically; each one starts 1 mm before its first word:
+
+```json
+{
+  "detected_columns": [
+    {
+      "column_type": "english",
+      "x_start": 9.52,      // percent
+      "x_end": 35.0,
+      "x_start_mm": 20.0,   // mm (1 mm before the first word)
+      "x_end_mm": 73.5,
+      "word_count": 15
+    },
+    {
+      "column_type": "german",
+      "x_start_mm": 74.0,
+      "x_end_mm": 140.0,
+      "word_count": 15
+    },
+    {
+      "column_type": "example",
+      "x_start_mm": 141.0,
+      "x_end_mm": 200.0,
+      "word_count": 12
+    }
+  ]
+}
+```
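How the 1 mm margin relates to the two `x_start` fields can be illustrated with a small sketch. The helper `column_start` is hypothetical, and clamping at the page edge is an assumption; only the margin and the A4 width come from the documentation above.

```python
A4_WIDTH_MM = 210.0
COLUMN_MARGIN_MM = 1.0  # the column starts 1 mm before its first word


def column_start(word_x_mm_values):
    """Return (x_start_mm, x_start_percent) for one column's words."""
    first_word_x_mm = min(word_x_mm_values)
    # Assumption: never let the margin push the column off the page.
    x_start_mm = max(0.0, first_word_x_mm - COLUMN_MARGIN_MM)
    x_start_percent = x_start_mm / A4_WIDTH_MM * 100.0
    return round(x_start_mm, 2), round(x_start_percent, 2)


# Leftmost word at 21.0 mm: the column starts at 20.0 mm, i.e. 9.52 %.
start_mm, start_percent = column_start([21.0, 21.4, 22.1])
```

With the leftmost word at 21.0 mm, this reproduces the `x_start_mm` of 20.0 and `x_start` of 9.52 shown for the `english` column.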
+
+### 4. Cell Status
+
+| Status | Description |
+|--------|-------------|
+| `empty` | No OCR result in this cell |
+| `recognized` | Text recognized with confidence ≥ 50% |
+| `problematic` | Text recognized with confidence < 50% |
+| `manual` | Manually corrected |
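
The status table can be read as a small decision function. The sketch below is one plausible reading; the function name, its parameters and the precedence of `manual` over the other states are assumptions rather than the service's actual logic.

```python
def classify_cell(text, confidence, manually_corrected=False, min_confidence=0.5):
    """Map a cell's OCR result to one of the four documented status values."""
    if manually_corrected:
        return "manual"       # assumption: a manual fix overrides the OCR result
    if not text or not text.strip():
        return "empty"        # no OCR result in this cell
    if confidence >= min_confidence:
        return "recognized"   # confidence of at least 50 % by default
    return "problematic"      # text found, but below the threshold


status = classify_cell("house", 0.95)
```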
+
+---
+
+## API Endpoints
+
+### POST /api/v1/ocr/analyze-grid
+
+Analyzes an image and detects the structure of the vocabulary table.
+
+**Request:**
+```json
+{
+  "image_base64": "data:image/jpeg;base64,...",
+  "min_confidence": 0.5,
+  "padding": 2.0
+}
+```
+
+**Response:**
+```json
+{
+  "cells": [
+    [
+      {
+        "row": 0,
+        "col": 0,
+        "x": 10.0,
+        "y": 15.0,
+        "width": 25.0,
+        "height": 3.0,
+        "x_mm": 21.0,
+        "y_mm": 44.55,
+        "width_mm": 52.5,
+        "height_mm": 8.91,
+        "text": "house",
+        "confidence": 0.95,
+        "status": "recognized",
+        "column_type": "english",
+        "logical_row": 0,
+        "logical_col": 0
+      }
+    ]
+  ],
+  "detected_columns": [...],
+  "page_dimensions": {
+    "width_mm": 210.0,
+    "height_mm": 297.0,
+    "format": "A4"
+  },
+  "deskew_angle_deg": -0.5,
+  "statistics": {
+    "total_cells": 45,
+    "recognized_cells": 42,
+    "problematic_cells": 3,
+    "empty_cells": 0
+  }
+}
+```
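A client might prepare the request body and summarize the response along these lines. This is a sketch only: `build_analyze_grid_request` and `count_cell_statuses` are hypothetical helpers, the HTTP transport is left out, and only the field names are taken from the request and response shown above.

```python
import base64
from collections import Counter


def build_analyze_grid_request(image_bytes, mime_type="image/jpeg",
                               min_confidence=0.5, padding=2.0):
    """Build the JSON body for POST /api/v1/ocr/analyze-grid."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "image_base64": f"data:{mime_type};base64,{encoded}",
        "min_confidence": min_confidence,
        "padding": padding,
    }


def count_cell_statuses(cells):
    """Count the status values in the nested cells list (rows of cells)."""
    return dict(Counter(cell["status"] for row in cells for cell in row))


request_body = build_analyze_grid_request(b"abc")
counts = count_cell_statuses([
    [{"status": "recognized"}, {"status": "problematic"}],
    [{"status": "recognized"}],
])
```

Recounting the statuses from `cells` gives a simple cross-check against the `statistics` object in the response.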
+
+---
+
+## Frontend Components
+
+### GridOverlay.tsx
+
+Renders the detected cells as a color-coded overlay on top of the image.
+
+**Props:**
+```typescript
+interface GridOverlayProps {
+  cells: GridCell[][]
+  imageWidth: number
+  imageHeight: number
+  showLabels?: boolean
+  onCellClick?: (cell: GridCell) => void
+}
+```
+
+**Color coding:**
+- Green: `recognized` (recognized well)
+- Yellow: `problematic` (low confidence)
+- Gray: `empty`
+- Blue: `manual` (manually corrected)
+
+### CellEditPopup.tsx
+
+Popup for editing a single cell.
+
+**Features:**
+- Edit the text
+- Change the column type (English/German/Example)
+- Show the confidence
+- Show the mm coordinates
+- Keyboard shortcuts: Ctrl+Enter (save), Esc (cancel)
+
+---
+
+## Worksheet Editor Integration
+
+### Export
+
+The "Zum Editor exportieren" ("Export to editor") button stores the OCR data in localStorage:
+
+```typescript
+interface OCRExportData {
+  version: '1.0'
+  source: 'ocr-compare'
+  exported_at: string
+  session_id: string
+  page_number: number
+  page_dimensions: {
+    width_mm: number
+    height_mm: number
+    format: string
+  }
+  words: OCRWord[]
+  detected_columns: DetectedColumn[]
+}
+```
+
+**localStorage keys:**
+- `ocr_export_{session_id}_{page_number}`: export data
+- `ocr_export_latest`: reference to the most recent export
+
+### Import in the Worksheet Editor
+
+1. Open the Worksheet Editor: https://macmini/worksheet-editor
+2. Click the OCR import button (green icon)
+3. The words are placed on the canvas
+
+**Conversion mm → pixels:**
+```typescript
+const MM_TO_PX = 3.7795275591
+const x_px = word.x_mm * MM_TO_PX
+const y_px = word.y_mm * MM_TO_PX
+```
+
+---
+
+## Files
+
+### Backend (klausur-service)
+
+| File | Description |
+|------|-------------|
+| `services/grid_detection_service.py` | Grid detection v4 with deskew |
+| `tests/test_grid_detection.py` | Unit tests |
+
+### Frontend (admin-v2)
+
+| File | Description |
+|------|-------------|
+| `app/(admin)/ai/ocr-compare/page.tsx` | Main UI |
+| `components/ocr/GridOverlay.tsx` | Grid visualization |
+| `components/ocr/CellEditPopup.tsx` | Cell editor |
+
+### Frontend (studio-v2)
+
+| File | Description |
+|------|-------------|
+| `lib/worksheet-editor/ocr-integration.ts` | OCR import/export utility |
+| `app/worksheet-editor/page.tsx` | Editor with OCR import |
+| `components/worksheet-editor/EditorToolbar.tsx` | Toolbar with OCR button |
+
+---
+
+## Deployment
+
+```bash
+# 1. Sync the backend
+scp grid_detection_service.py macmini:.../klausur-service/backend/services/
+
+# 2. Sync the tests
+scp test_grid_detection.py macmini:.../klausur-service/backend/tests/
+
+# 3. Rebuild klausur-service
+ssh macmini "docker compose build --no-cache klausur-service"
+
+# 4. Start the container
+ssh macmini "docker compose up -d klausur-service"
+
+# 5. Deploy the frontend (admin-v2)
+ssh macmini "docker compose build --no-cache admin-v2 && docker compose up -d admin-v2"
+```
+
+---
+
+## Open-Source Libraries Used
+
+| Library | Version | License | Usage |
+|---------|---------|---------|-------|
+| NumPy | ≥1.24 | BSD-3-Clause | Deskew calculation (polyfit) |
+| OpenCV | ≥4.8 | Apache-2.0 | Image processing (optional) |
+| PaddleOCR | 2.7 | Apache-2.0 | OCR recognition |
+| Fabric.js | 6.x | MIT | Canvas rendering (frontend) |
+
+---
+
+## Troubleshooting
+
+### Common Problems
+
+| Problem | Solution |
+|---------|----------|
+| "Grid analysieren" ("Analyze grid") never finishes loading | Check the klausur-service container |
+| No cells detected | Lower the minimum confidence |
+| Wrong column assignment | Correct it manually in the CellEditPopup |
+| Export does not work | Check the browser console for errors |
+
+### Logging
+
+```bash
+# klausur-service logs
+docker logs breakpilot-pwa-klausur-service --tail=100
+
+# Grid detection only
+docker logs breakpilot-pwa-klausur-service 2>&1 | grep "grid_detection"
+```
+
+---
+
+## Changelog
+
+| Version | Date | Changes |
+|---------|------|---------|
+| 4.0 | 2026-02-08 | Deskew correction, 1 mm column margin |
+| 3.0 | 2026-02-07 | mm coordinate system |
+| 2.0 | 2026-02-06 | Column detection |
+| 1.0 | 2026-02-05 | Initial implementation |
+
+---
+
+## References
+
+- [Worksheet Editor Architecture](Worksheet-Editor-Architecture.md)
+- [OCR Labeling Spec](OCR-Labeling-Spec.md)
+- [SBOM](/infrastructure/sbom)