OCR Compare - Block Review Feature

Status: Production
Last updated: 2026-02-08
URL: https://macmini:3002/ai/ocr-compare


Overview

The OCR Compare tool compares different OCR methods for recognizing text in scanned documents. The Block Review feature supports cell-by-cell review and correction of the OCR results.

Key Features

| Feature | Description |
| --- | --- |
| Multi-Method OCR | Comparison of Vision LLM, Tesseract, PaddleOCR, and Claude Vision |
| Grid Detection | Automatic detection of table structures |
| Block Review | Cell-by-cell review and correction |
| Session Persistence | Sessions are preserved across page changes |
| High-Resolution Display | High-resolution image display (zoom=2.0) |

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    admin-v2 (Next.js)                        │
│  /app/(admin)/ai/ocr-compare/page.tsx                       │
│  - PDF Upload & Session Management                          │
│  - Grid Visualization with SVG Overlay                      │
│  - Block Review Panel                                       │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                 klausur-service (FastAPI)                    │
│  Port 8086                                                   │
│  - /api/v1/vocab/sessions (Session CRUD)                    │
│  - /api/v1/vocab/sessions/{id}/pdf-thumbnail (image export) │
│  - /api/v1/vocab/sessions/{id}/detect-grid (grid detection) │
│  - /api/v1/vocab/sessions/{id}/run-ocr (OCR execution)      │
└─────────────────────────────────────────────────────────────┘

Components

GridOverlay

SVG overlay that visualizes the detected grid structure.

File: /admin-v2/components/ocr/GridOverlay.tsx

interface GridOverlayProps {
  grid: GridData
  imageUrl?: string
  onCellClick?: (cell: GridCell) => void
  selectedCell?: GridCell | null
  showEmpty?: boolean        // Show empty cells
  showLabels?: boolean       // Show column labels (EN, DE, Ex)
  showNumbers?: boolean      // Show block numbers
  highlightedBlockNumber?: number | null  // Currently highlighted block
  className?: string
}

Cell status colors:

| Status | Color | Meaning |
| --- | --- | --- |
| recognized | Green | Text recognized successfully |
| problematic | Orange | Low confidence value |
| manual | Blue | Manually corrected |
| empty | Transparent | Nothing recognized |
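
The status-to-color mapping above could be expressed as a lookup table. This is a minimal sketch, not the code in GridOverlay.tsx: the names `CellStatus`, `CELL_STATUS_FILL`, and `cellFill` are hypothetical, the concrete color values are assumptions, and the way `showEmpty` suppresses empty cells is only one plausible reading of that prop.

```typescript
// Status values taken from the table above.
type CellStatus = "recognized" | "problematic" | "manual" | "empty";

// Hypothetical fill colors; the real component may use different values.
const CELL_STATUS_FILL: Record<CellStatus, string> = {
  recognized: "rgba(34, 197, 94, 0.3)", // green
  problematic: "rgba(249, 115, 22, 0.3)", // orange
  manual: "rgba(59, 130, 246, 0.3)", // blue
  empty: "transparent",
};

// Returns the fill for a cell, or null when empty cells are hidden
// (assumed meaning of the showEmpty prop).
function cellFill(status: CellStatus, showEmpty: boolean): string | null {
  if (status === "empty" && !showEmpty) return null;
  return CELL_STATUS_FILL[status];
}
```

Typing the table as `Record<CellStatus, string>` makes the compiler flag any status that is added later without a color.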

BlockReviewPanel

Panel for reviewing the OCR results block by block.

File: /admin-v2/components/ocr/BlockReviewPanel.tsx

interface BlockReviewPanelProps {
  grid: GridData
  methodResults: Record<string, { vocabulary: Array<...> }>
  currentBlockNumber: number
  onBlockChange: (blockNumber: number) => void
  onApprove: (blockNumber: number, methodId: string, text: string) => void
  onCorrect: (blockNumber: number, correctedText: string) => void
  onSkip: (blockNumber: number) => void
  reviewData: Record<number, BlockReviewData>
  className?: string
}

Review status:

| Status | Description |
| --- | --- |
| pending | Not yet reviewed |
| approved | OCR result accepted |
| corrected | Manually corrected |
| skipped | Skipped |
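
To illustrate how the `onApprove`, `onCorrect`, and `onSkip` callbacks might update `reviewData`, here is a small sketch. The shape of `BlockReviewData` is not given in this document, so the fields below (`status`, `methodId`, `text`) are assumptions, as are all function names.

```typescript
type ReviewStatus = "pending" | "approved" | "corrected" | "skipped";

// Hypothetical shape; this document does not define BlockReviewData.
interface BlockReviewData {
  status: ReviewStatus;
  methodId?: string; // which OCR method was accepted (approved only)
  text?: string; // accepted or corrected text
}

type ReviewMap = Record<number, BlockReviewData>;

// Each helper returns a new map so it can be used with React state setters.
function approveBlock(data: ReviewMap, blockNumber: number, methodId: string, text: string): ReviewMap {
  return { ...data, [blockNumber]: { status: "approved", methodId, text } };
}

function correctBlock(data: ReviewMap, blockNumber: number, correctedText: string): ReviewMap {
  return { ...data, [blockNumber]: { status: "corrected", text: correctedText } };
}

function skipBlock(data: ReviewMap, blockNumber: number): ReviewMap {
  return { ...data, [blockNumber]: { status: "skipped" } };
}

// Blocks without an entry are treated as pending.
function statusOf(data: ReviewMap, blockNumber: number): ReviewStatus {
  return data[blockNumber]?.status ?? "pending";
}
```

The helper signatures mirror the callback props, so they could be wired up as, for example, `onSkip={(n) => setReviewData((d) => skipBlock(d, n))}`, assuming a `setReviewData` state setter exists.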

BlockReviewSummary

Summary of all reviewed blocks.

interface BlockReviewSummaryProps {
  reviewData: Record<number, BlockReviewData>
  totalBlocks: number
  onBlockClick: (blockNumber: number) => void
  className?: string
}
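
The counts such a summary would display can be derived from `reviewData` and `totalBlocks`. The following sketch assumes that each entry carries a `status` field and that blocks without an entry count as pending; neither assumption is confirmed by this document, and `summarizeReview` is a hypothetical name.

```typescript
type ReviewStatus = "pending" | "approved" | "corrected" | "skipped";

// Only the status field is needed here; the full BlockReviewData shape is not documented.
type ReviewEntries = Record<number, { status: ReviewStatus }>;

function summarizeReview(reviewData: ReviewEntries, totalBlocks: number): Record<ReviewStatus, number> {
  const counts: Record<ReviewStatus, number> = { pending: 0, approved: 0, corrected: 0, skipped: 0 };
  let entries = 0;
  for (const key of Object.keys(reviewData)) {
    const entry = reviewData[Number(key)];
    if (!entry) continue;
    counts[entry.status] += 1;
    entries += 1;
  }
  // Blocks that have no entry yet are counted as pending.
  counts.pending += Math.max(0, totalBlocks - entries);
  return counts;
}
```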

OCR Methods

| ID | Name | Description |
| --- | --- | --- |
| vision_llm | Vision LLM | Qwen VL 32B via Ollama |
| tesseract | Tesseract | Classic OCR (local) |
| paddleocr | PaddleOCR | PaddleOCR engine |
| claude_vision | Claude Vision | Anthropic Claude Vision API |
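
Since `methodResults` is keyed by plain strings, a typed registry of the method IDs above can help catch typos. This is an illustrative sketch only; `OCR_METHODS`, `OcrMethodId`, `isOcrMethodId`, and `methodLabel` are hypothetical names that do not appear in the codebase described here.

```typescript
// Method IDs and display names as listed in the table above.
const OCR_METHODS = {
  vision_llm: "Vision LLM",
  tesseract: "Tesseract",
  paddleocr: "PaddleOCR",
  claude_vision: "Claude Vision",
} as const;

type OcrMethodId = keyof typeof OCR_METHODS;

// Type guard for IDs that arrive as plain strings, e.g. keys of methodResults.
function isOcrMethodId(value: string): value is OcrMethodId {
  return Object.prototype.hasOwnProperty.call(OCR_METHODS, value);
}

// Falls back to the raw ID for methods this sketch does not know about.
function methodLabel(id: string): string {
  return isOcrMethodId(id) ? OCR_METHODS[id] : id;
}
```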

API Endpoints

Session Management

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/v1/vocab/upload-pdf-info | Upload a PDF |
| GET | /api/v1/vocab/sessions/{id} | Session details |
| DELETE | /api/v1/vocab/sessions/{id} | Delete a session |

Image Export

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/v1/vocab/sessions/{id}/pdf-thumbnail/{page} | Thumbnail (zoom=0.5) |
| GET | /api/v1/vocab/sessions/{id}/pdf-thumbnail/{page}?hires=true | High-res (zoom=2.0) |

Grid Detection

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/v1/vocab/sessions/{id}/detect-grid | Detect the grid structure |
| POST | /api/v1/vocab/sessions/{id}/run-ocr | Run OCR on the grid |
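
A thin client around the session endpoints listed above might look like the following sketch. It assumes that the POST endpoints need no request body and that the responses are JSON; neither is specified in this document. `createVocabClient`, `sessionPath`, and `HttpCall` are hypothetical names. The HTTP function is injected, so the global `fetch` should be usable in its place.

```typescript
type HttpMethod = "GET" | "POST" | "DELETE";

interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Minimal fetch-like signature, kept free of DOM and Node type dependencies.
type HttpCall = (url: string, init: { method: HttpMethod }) => Promise<HttpResponse>;

function sessionPath(sessionId: string, action?: "detect-grid" | "run-ocr"): string {
  const base = `/api/v1/vocab/sessions/${encodeURIComponent(sessionId)}`;
  return action ? `${base}/${action}` : base;
}

// Performs the call and turns non-2xx responses into errors.
async function request(http: HttpCall, method: HttpMethod, url: string): Promise<HttpResponse> {
  const res = await http(url, { method });
  if (!res.ok) {
    throw new Error(`${method} ${url} failed with status ${res.status}`);
  }
  return res;
}

function createVocabClient(baseUrl: string, http: HttpCall) {
  return {
    getSession: async (id: string): Promise<unknown> =>
      (await request(http, "GET", baseUrl + sessionPath(id))).json(),
    deleteSession: async (id: string): Promise<void> => {
      await request(http, "DELETE", baseUrl + sessionPath(id));
    },
    // Assumption: no request body is required for these two calls.
    detectGrid: async (id: string): Promise<unknown> =>
      (await request(http, "POST", baseUrl + sessionPath(id, "detect-grid"))).json(),
    runOcr: async (id: string): Promise<unknown> =>
      (await request(http, "POST", baseUrl + sessionPath(id, "run-ocr"))).json(),
  };
}
```

Usage could look like `const client = createVocabClient("http://localhost:8086", fetch)`, where the host is an assumption and port 8086 is taken from the architecture diagram.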

Session Persistence

The active session is stored in localStorage:

// Save
localStorage.setItem('ocr-compare-active-session', sessionId)

// Restore on page load
const lastSessionId = localStorage.getItem('ocr-compare-active-session')
if (lastSessionId) {
  // Load session data
}
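
Access to localStorage can throw, for example when storage is disabled or the quota is exceeded, so a defensive wrapper may be worthwhile. The sketch below is one possible approach, not the page's actual code. It uses the storage key shown above; the `KeyValueStore` interface and the three function names are hypothetical. `window.localStorage` satisfies `KeyValueStore` structurally.

```typescript
// Subset of the Web Storage interface that this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const ACTIVE_SESSION_KEY = "ocr-compare-active-session";

// Returns false instead of throwing when the store rejects the write.
function saveActiveSession(store: KeyValueStore, sessionId: string): boolean {
  try {
    store.setItem(ACTIVE_SESSION_KEY, sessionId);
    return true;
  } catch {
    return false;
  }
}

// Returns null when nothing is stored, the value is empty, or the store is unavailable.
function loadActiveSession(store: KeyValueStore): string | null {
  try {
    const value = store.getItem(ACTIVE_SESSION_KEY);
    return value !== null && value.length > 0 ? value : null;
  } catch {
    return null;
  }
}

// Useful when the stored session no longer exists on the server.
function clearActiveSession(store: KeyValueStore): void {
  try {
    store.removeItem(ACTIVE_SESSION_KEY);
  } catch {
    // Nothing to clean up if the store is unavailable.
  }
}
```

In the browser this would be called as `loadActiveSession(window.localStorage)`. If loading the session details then fails because the session was deleted, calling `clearActiveSession` avoids retrying a stale ID on every page load.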

Block Review Workflow

  1. Upload PDF - load the document into the system
  2. Detect grid - automatic table detection
  3. Run OCR - run all methods in parallel
  4. Start Block Review - click the "Block Review" button
  5. Review blocks - for each block:
    • Compare the results of all methods
    • Choose the best result or correct it manually
  6. Summary - overview of the corrections
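
The workflow above can be modeled as a function that derives the next step from the current state. This is only a sketch under assumptions: the `WorkflowState` fields and step names are hypothetical, and steps 4 and 5 are merged into a single "review" step.

```typescript
type WorkflowStep = "upload" | "detect-grid" | "run-ocr" | "review" | "summary";

// Hypothetical state shape; the page component may track this differently.
interface WorkflowState {
  sessionId: string | null; // set once the PDF has been uploaded
  gridDetected: boolean;
  ocrFinished: boolean;
  totalBlocks: number;
  reviewedBlocks: number; // approved + corrected + skipped
}

// Returns the first step whose precondition is not yet satisfied.
function nextStep(state: WorkflowState): WorkflowStep {
  if (state.sessionId === null) return "upload";
  if (!state.gridDetected) return "detect-grid";
  if (!state.ocrFinished) return "run-ocr";
  if (state.reviewedBlocks < state.totalBlocks) return "review";
  return "summary";
}
```

Deriving the step from state rather than storing it separately means a restored session resumes at the right place without extra bookkeeping.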

High-Resolution Images

High-resolution images are used for display:

// Thumbnail URL with the high-resolution parameter
const imageUrl = `${KLAUSUR_API}/api/v1/vocab/sessions/${sessionId}/pdf-thumbnail/${pageNumber}?hires=true`

| Parameter | Zoom | Use |
| --- | --- | --- |
| Without hires | 0.5 | Preview/thumbnails |
| With hires=true | 2.0 | Display/OCR |
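
A small helper can keep the thumbnail and high-resolution URLs consistent. The sketch below follows the endpoint pattern and zoom values from the table above; `thumbnailUrl` and `zoomFor` are hypothetical names, and the base URL is passed in as a parameter rather than read from `KLAUSUR_API`.

```typescript
// Zoom factors as documented in the table above.
function zoomFor(hires: boolean): number {
  return hires ? 2.0 : 0.5;
}

// Builds the image URL for one page; hires defaults to the low-resolution thumbnail.
function thumbnailUrl(baseUrl: string, sessionId: string, pageNumber: number, hires: boolean = false): string {
  const path = `${baseUrl}/api/v1/vocab/sessions/${encodeURIComponent(sessionId)}/pdf-thumbnail/${pageNumber}`;
  return hires ? `${path}?hires=true` : path;
}
```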

Files

Frontend (admin-v2)

| File | Description |
| --- | --- |
| app/(admin)/ai/ocr-compare/page.tsx | Main UI |
| components/ocr/GridOverlay.tsx | SVG grid overlay |
| components/ocr/BlockReviewPanel.tsx | Review panel |
| components/ocr/CellCorrectionDialog.tsx | Correction dialog |
| components/ocr/index.ts | Exports |

Backend (klausur-service)

| File | Description |
| --- | --- |
| vocab_worksheet_api.py | API router |
| hybrid_vocab_extractor.py | OCR extraction |

Changelog

| Date | Change |
| --- | --- |
| 2026-02-08 | Added Block Review feature |
| 2026-02-08 | Enabled high-resolution images |
| 2026-02-08 | Implemented session persistence |
| 2026-02-07 | Grid detection and multi-method OCR |