Restructure: Move ocr_pipeline + labeling + crop into ocr/ package
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 29s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m25s
CI / test-python-agent-core (push) Successful in 19s
CI / test-nodejs-website (push) Successful in 20s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
37
klausur-service/backend/ocr/cv_pipeline.py
Normal file
@@ -0,0 +1,37 @@
"""
CV-based Document Reconstruction Pipeline for Vocabulary Extraction.

Re-export facade; all logic lives in the sub-modules:

    cv_vocab_types     Dataclasses, constants, IPA, feature flags
    cv_preprocessing   Image I/O, orientation, deskew, dewarp
    cv_layout          Document type, columns, rows, classification
    cv_ocr_engines     OCR engines, vocab post-processing, text cleaning
    cv_cell_grid       Cell grid (v2 + legacy), vocab conversion
    cv_review          LLM/spell review, pipeline orchestration

License: Apache 2.0 (commercial use permitted)
PRIVACY: All processing happens locally.
"""

from .types import *  # noqa: F401,F403
from .preprocessing.preprocessing import *  # noqa: F401,F403
from .layout.layout import *  # noqa: F401,F403
from .engines.engines import *  # noqa: F401,F403
from .cell_grid.cell_grid import *  # noqa: F401,F403
from .detect.box_detect import *  # noqa: F401,F403
from .review.review import *  # noqa: F401,F403

# Private names used by consumers (not covered by wildcard re-exports).
from .preprocessing.preprocessing import _apply_shear  # noqa: F401
from .layout.layout import (  # noqa: F401
    _detect_header_footer_gaps,
    _detect_sub_columns,
    _split_broad_columns,
)
from .engines.engines import (  # noqa: F401
    _fix_character_confusion,
    _fix_phonetic_brackets,
)
from .cell_grid.cell_grid import _cells_to_vocab_entries  # noqa: F401
from .words_first import build_grid_from_words  # noqa: F401