PaddlePaddle 3.x + PP-OCRv5 requires >6GB RAM and has oneDNN
compatibility issues on CPU. PaddleOCR 2.x with PP-OCRv4 works
reliably with ~2-3GB RAM and has no MKLDNN issues.
- Pin paddlepaddle<3.0.0 and paddleocr<3.0.0
- Simplify main.py — single init strategy, direct 2.x result format
- Re-enable warmup (fits in memory with 2.x)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previously set FLAGS_use_mkldnn env var was ignored by
PaddlePaddle 3.x. Now using the paddle.set_flags() API and the
PaddleOCR enable_mkldnn param instead.
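
A minimal sketch of the two switches named above, assuming paddle is
installed in the image. The helper names and the lang value are
illustrative, not taken from main.py.

```python
def disable_mkldnn() -> bool:
    """Turn oneDNN off through the runtime flag API.

    Returns False when paddle is not importable, so the helper can be
    called safely in environments without the framework.
    """
    try:
        import paddle
    except ImportError:
        return False
    paddle.set_flags({"FLAGS_use_mkldnn": False})
    return True


def ocr_kwargs() -> dict:
    # Keyword arguments intended for PaddleOCR(...).
    # "lang" is an illustrative value; enable_mkldnn is the param
    # mentioned in this commit.
    return {"lang": "en", "enable_mkldnn": False}
```

Usage would be to call disable_mkldnn() once at startup and then
construct the engine with PaddleOCR(**ocr_kwargs()).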
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Set FLAGS_use_mkldnn=0 before paddle import to avoid
ConvertPirAttribute2RuntimeAttribute error
- Support both PaddleOCR 2.x (list) and 3.x (dict) result formats
- Use use_textline_orientation (3.x) instead of use_angle_cls
- Remove latin lang fallback (not supported in 3.x)
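
A hedged sketch of how both result formats could be normalised into
one list of (text, score) pairs. It assumes the 2.x ocr() output is a
list of pages holding [box, (text, score)] entries and that 3.x
predict() results are dict-like with "rec_texts" and "rec_scores"
keys. The function name extract_lines is hypothetical.

```python
from typing import Any, List, Tuple


def extract_lines(result: Any) -> List[Tuple[str, float]]:
    """Flatten a PaddleOCR result (2.x list or 3.x dict style)."""
    lines: List[Tuple[str, float]] = []
    for page in result or []:
        if page is None:
            # 2.x returns None for a page without detected text.
            continue
        if hasattr(page, "get"):
            # 3.x style: dict-like result with parallel sequences.
            texts = page.get("rec_texts")
            scores = page.get("rec_scores")
            if texts is None or scores is None:
                continue
            lines.extend((t, float(s)) for t, s in zip(texts, scores))
        else:
            # 2.x style: each entry is [box, (text, score)].
            for _box, (text, score) in page:
                lines.append((text, float(score)))
    return lines
```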
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The warmup OCR call during startup pushes memory over 6 GB and causes
OOM kills and restart loops. With warmup disabled, the first real OCR
request will be slow (JIT compilation), but the container stays stable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PaddleOCR 3.x removed show_log param and lang='latin'. Try multiple
init strategies in order until one succeeds.
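
The fallback loop can be sketched roughly as follows. The strategy
list, its order, and the stand-in factory are assumptions made for
illustration. In the service the factory would be the PaddleOCR class.

```python
from typing import Any, Callable, Dict, Iterable


def init_with_fallbacks(
    factory: Callable[..., Any],
    strategies: Iterable[Dict[str, Any]],
) -> Any:
    """Call factory(**kwargs) for each strategy until one succeeds."""
    errors = []
    for kwargs in strategies:
        try:
            return factory(**kwargs)
        except Exception as exc:  # broad on purpose: error types vary by version
            errors.append(f"{kwargs}: {exc}")
    raise RuntimeError("all init strategies failed:\n" + "\n".join(errors))


# Hypothetical order: oldest-style arguments first, most minimal last.
STRATEGIES = [
    {"lang": "latin", "show_log": False},
    {"lang": "en", "show_log": False},
    {"lang": "en"},
]


def fake_3x_factory(lang: str) -> Dict[str, str]:
    """Stand-in for a 3.x constructor: no show_log, no 'latin'."""
    if lang == "latin":
        raise ValueError("unsupported lang")
    return {"lang": lang}


engine = init_with_fallbacks(fake_3x_factory, STRATEGIES)
```

With the stand-in factory, the first two strategies fail on the
unknown show_log argument and the third one is used.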
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The import and model loading can take minutes and was blocking
the startup event, causing health checks to time out. The model now
loads in a background thread; the health endpoint returns 200
immediately with status 'loading' until the model is ready.
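
A framework-free sketch of the background-loading pattern described
above. The ModelHolder class, the 'ready' and 'error' status values,
and the loader callable are assumptions. Only 'loading' comes from
this commit.

```python
import threading
from typing import Any, Callable, Dict, Optional


class ModelHolder:
    """Loads a model in a daemon thread and reports its state."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._lock = threading.Lock()
        self._model: Any = None
        self._error: Optional[str] = None
        self._thread = threading.Thread(target=self._load, daemon=True)

    def start(self) -> None:
        # Returns immediately; the slow import/load runs in the thread.
        self._thread.start()

    def _load(self) -> None:
        try:
            model = self._loader()
        except Exception as exc:
            with self._lock:
                self._error = str(exc)
            return
        with self._lock:
            self._model = model

    def wait(self, timeout: Optional[float] = None) -> None:
        self._thread.join(timeout)

    def health(self) -> Dict[str, str]:
        # Body for a health endpoint that always answers with HTTP 200.
        with self._lock:
            if self._error is not None:
                return {"status": "error", "detail": self._error}
            if self._model is None:
                return {"status": "loading"}
            return {"status": "ready"}
```

A startup hook would call holder.start() and return, while the health
handler returns holder.health().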
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PaddleOCR 2.8.1 throws a generic Exception (not ValueError) when
ocr_version='PP-OCRv5' is used. Broadened except clause to catch
any error and fall back to lang='latin' for older versions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PaddleOCR 3.4.0 removed 'latin' language support, causing ValueError
at startup. Now uses lang='en' with ocr_version='PP-OCRv5' and falls
back to lang='latin' for older PaddleOCR versions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PaddlePaddle 3.x has a oneDNN/PIR executor bug. Reverting to 2.6.2
with the proven ocr() API instead of predict().
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PaddlePaddle needs libgomp.so.1 for inference.
lang is ignored when an explicit model_name is given.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The model is loaded at container start (not on the first request).
Health check start_period raised to 300s for the initial download.
/health returns "loading" until the model is ready.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
lang="latin" requires text_recognition_model_name in PP-OCRv5.
The new API uses predict() instead of ocr(); result format adjusted.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Microservice for PaddleOCR on Hetzner. FastAPI with /ocr and /health
endpoints, API key auth, 4GB memory limit, model cache volume.
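
One possible shape for the API key check, shown without FastAPI so it
stays self-contained. The function name is hypothetical, and how the
key reaches the service (header name, env var) is not specified here.

```python
import hmac
from typing import Optional


def api_key_is_valid(provided: Optional[str], expected: Optional[str]) -> bool:
    """Compare keys in constant time; reject missing or empty values."""
    if not provided or not expected:
        return False
    return hmac.compare_digest(provided.encode(), expected.encode())
```

A request handler would answer with 401 or 403 when this returns
False, before any OCR work starts.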
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>