Extract _build_grid_core() from the build_grid() endpoint for reuse.
Add ocr_pipeline_regression.py with endpoints to mark sessions as ground truth (GT), list them, and run regression comparisons after code changes.
Add a frontend button in StepGroundTruth.tsx to mark or update GT.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
64 lines
2.7 KiB
Python
"""
OCR Pipeline API - step-by-step page reconstruction.

Thin wrapper that assembles all sub-module routers into a single
composite router. Backward-compatible: main.py and tests can still
import ``router``, ``_cache``, and helper functions from here.

Sub-modules (each < 1 000 lines):
    ocr_pipeline_common      – shared state, cache, Pydantic models, helpers
    ocr_pipeline_sessions    – session CRUD, image serving, doc-type
    ocr_pipeline_geometry    – deskew, dewarp, structure, columns
    ocr_pipeline_rows        – row detection, box-overlay helper
    ocr_pipeline_words       – word detection (SSE), paddle-direct, word GT
    ocr_pipeline_ocr_merge   – paddle/tesseract merge helpers, kombi endpoints
    ocr_pipeline_postprocess – LLM review, reconstruction, export, validation
    ocr_pipeline_auto        – auto-mode orchestrator, reprocess
    ocr_pipeline_regression  – ground-truth marking, regression comparisons

License: Apache 2.0
PRIVACY: All processing happens locally.
"""

from fastapi import APIRouter

# ---------------------------------------------------------------------------
# Shared state (imported by main.py and orientation_crop_api.py)
# ---------------------------------------------------------------------------
from ocr_pipeline_common import (  # noqa: F401 – re-exported
    _cache,
    _BORDER_GHOST_CHARS,
    _filter_border_ghost_words,
)

# ---------------------------------------------------------------------------
# Sub-module routers
# ---------------------------------------------------------------------------
from ocr_pipeline_sessions import router as _sessions_router
from ocr_pipeline_geometry import router as _geometry_router
from ocr_pipeline_rows import router as _rows_router
from ocr_pipeline_words import router as _words_router
from ocr_pipeline_ocr_merge import (
    router as _ocr_merge_router,
    # Re-export for test backward compatibility
    _split_paddle_multi_words,  # noqa: F401
    _group_words_into_rows,  # noqa: F401
    _merge_row_sequences,  # noqa: F401
    _merge_paddle_tesseract,  # noqa: F401
)
from ocr_pipeline_postprocess import router as _postprocess_router
from ocr_pipeline_auto import router as _auto_router
from ocr_pipeline_regression import router as _regression_router

# ---------------------------------------------------------------------------
# Composite router (used by main.py)
# ---------------------------------------------------------------------------
router = APIRouter()
router.include_router(_sessions_router)
router.include_router(_geometry_router)
router.include_router(_rows_router)
router.include_router(_words_router)
router.include_router(_ocr_merge_router)
router.include_router(_postprocess_router)
router.include_router(_auto_router)
router.include_router(_regression_router)