feat: PaddleOCR remote engine (PP-OCRv5 Latin on Hetzner x86_64)
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 31s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m7s
CI / test-python-agent-core (push) Successful in 21s
CI / test-nodejs-website (push) Successful in 21s
PaddleOCR as a new `engine=paddle` option in the OCR pipeline. Microservice on Hetzner (paddleocr-service/), async HTTP client (paddleocr_remote.py), frontend dropdown; automatically uses words_first.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -269,7 +269,7 @@ All endpoints under `/api/v1/ocr-pipeline/`.

| Parameter | Default | Description |
|-----------|---------|-------------|
| `engine` | `auto` | OCR engine: `auto`, `tesseract`, `rapid` |
| `engine` | `auto` | OCR engine: `auto`, `tesseract`, `rapid`, `paddle` |
| `pronunciation` | `british` | IPA dictionary: `british` or `american` |
| `stream` | `false` | SSE streaming (only with `grid_method=v2`) |
| `skip_heal_gaps` | `false` | Do not heal row gaps (overlay mode) |

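The parameter rules in this table, together with the commit's note that `engine=paddle` automatically uses `words_first`, can be sketched as a small normalisation step. This is only an illustration, not the service's actual code: the `OcrParams` class, the `normalize` function, the `grid_method` default and the choice to reject `stream=true` with a non-`v2` grid (rather than silently ignore it) are all assumptions.

```python
from dataclasses import dataclass, replace

# Allowed values as listed in the parameter table.
ENGINES = {"auto", "tesseract", "rapid", "paddle"}
PRONUNCIATIONS = {"british", "american"}


@dataclass(frozen=True)
class OcrParams:
    """Hypothetical container for the query parameters of the endpoint."""
    engine: str = "auto"
    pronunciation: str = "british"
    stream: bool = False
    skip_heal_gaps: bool = False
    grid_method: str = "v2"  # assumption: the real default is not shown in this diff


def normalize(params: OcrParams) -> OcrParams:
    """Validate the parameters and apply the paddle -> words_first rule."""
    if params.engine not in ENGINES:
        raise ValueError(f"unknown engine: {params.engine!r}")
    if params.pronunciation not in PRONUNCIATIONS:
        raise ValueError(f"unknown pronunciation: {params.pronunciation!r}")

    grid_method = params.grid_method
    if params.engine == "paddle":
        # PaddleOCR runs on the full page, so the grid is built from words.
        grid_method = "words_first"

    if params.stream and grid_method != "v2":
        # Assumption: reject instead of silently disabling streaming.
        raise ValueError("stream=true is only supported with grid_method=v2")

    return replace(params, grid_method=grid_method)
```

Calling `normalize(OcrParams(engine="paddle"))` returns a copy whose `grid_method` is `words_first`, regardless of what the caller passed.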
@@ -706,10 +706,32 @@ Isolated OCR of a single cell (column × row intersection):
1. **Crop:** Exact column × row boundaries with 3px internal padding
2. **Density check:** Skip empty cells (`dark_ratio < 0.005`)
3. **Upscaling:** Small crops (height < 80px) are enlarged 3×
4. **OCR:** Engine-specific (Tesseract, TrOCR, RapidOCR, LightON)
4. **OCR:** Engine-specific (Tesseract, TrOCR, RapidOCR, LightON, PaddleOCR)
5. **Fallback:** On an empty result → PSM 7 (single line) instead of PSM 6
6. **Cleanup:** `_clean_cell_text_lite()` (aggressive noise filtering)

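Steps 1 to 3 of the list above (crop, density check, upscaling) can be sketched with NumPy alone. This is a hedged illustration under stated assumptions, not the project's implementation: the function name `prepare_cell`, the darkness threshold of 128 used to compute `dark_ratio`, and nearest-neighbour upscaling via `np.repeat` are all choices made for this example; the real pipeline may use OpenCV and a different interpolation.

```python
import numpy as np

PADDING_PX = 3             # internal padding from step 1
MIN_DARK_RATIO = 0.005     # density threshold from step 2
UPSCALE_BELOW_HEIGHT = 80  # height limit from step 3
UPSCALE_FACTOR = 3         # enlargement factor from step 3
DARK_THRESHOLD = 128       # assumption: pixels below this grey value count as "dark"


def prepare_cell(gray, x0, y0, x1, y1):
    """Return a crop ready for OCR, or None if the cell looks empty.

    `gray` is a 2-D uint8 image; (x0, y0, x1, y1) are the cell boundaries.
    """
    # Cells too small to survive the inward padding are treated as empty.
    if x1 - x0 <= 2 * PADDING_PX or y1 - y0 <= 2 * PADDING_PX:
        return None

    # Step 1: crop, shrinking the cell by the padding on every side.
    crop = gray[y0 + PADDING_PX:y1 - PADDING_PX, x0 + PADDING_PX:x1 - PADDING_PX]

    # Step 2: skip cells whose share of dark pixels is below the threshold.
    dark_ratio = float(np.mean(crop < DARK_THRESHOLD))
    if dark_ratio < MIN_DARK_RATIO:
        return None

    # Step 3: enlarge small crops (nearest neighbour, chosen for simplicity).
    if crop.shape[0] < UPSCALE_BELOW_HEIGHT:
        crop = np.repeat(np.repeat(crop, UPSCALE_FACTOR, axis=0), UPSCALE_FACTOR, axis=1)

    return crop
```

A `None` result means the OCR engine is not called for that cell at all, which is the point of the density check.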
### PaddleOCR Remote Engine (`engine=paddle`)

PaddleOCR (PP-OCRv5 Latin) runs as a standalone microservice on a Hetzner x86_64 server,
because PaddlePaddle does not run on ARM64 (Mac Mini).

```
Mac Mini (klausur-service)          Hetzner (paddleocr-service)
     │  HTTPS POST + image               │
     │ ──────────────────────────▶       │  PP-OCRv5 Latin
     │                                   │  FastAPI (Port 8095)
     │  JSON word_boxes                  │  API-Key Auth
     │ ◀──────────────────────────       │
```

**Notes:**

- Automatically forces `grid_method=words_first` (full-page OCR, no cell crop)
- Async HTTP client (`paddleocr_remote.py`) with a 30 s timeout
- Coordinates are already absolute (no `content_bounds` offset needed)
- API-key authentication via the `X-API-Key` header
- Files: `paddleocr-service/main.py`, `services/paddleocr_remote.py`, `cv_ocr_engines.py:ocr_region_paddle()`

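The request/response exchange described above could look roughly like the following standard-library sketch. It is not the contents of `paddleocr_remote.py`: the `/ocr` path, the raw `image/png` request body, the JSON shape (`{"words": [...]}` with `text`, `left`, `top`, `width`, `height`, `conf`) and all function names are assumptions for illustration. Only the `X-API-Key` header and the 30 s timeout come from the notes above.

```python
import asyncio
import json
import urllib.request
from dataclasses import dataclass

TIMEOUT_S = 30  # timeout stated in the notes above


@dataclass
class WordBox:
    """One recognised word with absolute page coordinates (hypothetical schema)."""
    text: str
    left: int
    top: int
    width: int
    height: int
    conf: float


def build_request(base_url: str, api_key: str, image_bytes: bytes) -> urllib.request.Request:
    """Build the POST request; the '/ocr' path and PNG body are assumptions."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ocr",
        data=image_bytes,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "image/png"},
    )


def parse_word_boxes(body: bytes) -> list[WordBox]:
    """Turn the JSON response into WordBox objects (assumed response shape)."""
    payload = json.loads(body)
    return [
        WordBox(
            text=w["text"],
            left=int(w["left"]),
            top=int(w["top"]),
            width=int(w["width"]),
            height=int(w["height"]),
            conf=float(w.get("conf", 0.0)),
        )
        for w in payload.get("words", [])
    ]


def _post(req: urllib.request.Request) -> bytes:
    # Blocking call; raises urllib.error.URLError on failure or timeout.
    with urllib.request.urlopen(req, timeout=TIMEOUT_S) as resp:
        return resp.read()


async def ocr_remote(base_url: str, api_key: str, image_bytes: bytes) -> list[WordBox]:
    """Send the page image and return word boxes without blocking the event loop."""
    req = build_request(base_url, api_key, image_bytes)
    body = await asyncio.to_thread(_post, req)
    return parse_word_boxes(body)
```

Because the coordinates are already absolute, the caller can pass the returned boxes straight to the words-first grid builder without adding a `content_bounds` offset. A production client would more likely use an async HTTP library; `asyncio.to_thread` is used here only to keep the sketch dependency-free.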
### Flow of `build_cell_grid_v2()`

```
@@ -1063,6 +1085,7 @@ ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/brea

| Date | Version | Change |
|------|---------|--------|
| 2026-03-12 | 4.4.0 | PaddleOCR remote engine (`engine=paddle`): PP-OCRv5 Latin on Hetzner x86_64. New microservice (`paddleocr-service/`), HTTP client (`paddleocr_remote.py`), frontend dropdown option. Uses the words_first grid method. |
| 2026-03-12 | 4.3.0 | Words-first grid builder (`cv_words_first.py`): bottom-up algorithm clusters Tesseract word_boxes directly into columns/rows/cells. New `grid_method` parameter on the `/words` endpoint. Frontend toggle in StepWordRecognition. |
| 2026-03-10 | 4.2.0 | Reconstruction: overlay mode with pixel word positioning, 180° rotation, sub-session merging, usePixelWordPositions hook, box-boundary protection (box_ranges_inner) |
| 2026-03-05 | 3.1.0 | Columns: page segmentation at sub-headers, word-coverage fallback, segment-filtered validation |