breakpilot-lehrer/klausur-service/backend/cv_vocab_pipeline.py
Benjamin Admin 7005b18561
feat: generic box detection for zone-based column detection
- New file cv_box_detect.py: two-stage algorithm (lines + color)
- DetectedBox/PageZone dataclasses in cv_vocab_types.py
- detect_column_geometry_zoned() in cv_layout.py
- API endpoints extended: zones/boxes_detected in column_result
- Overlay functions draw box boundaries as dashed rectangles
- Fix: "or" chaining on numpy arrays in 7 places in ocr_pipeline_api.py
- 12 unit tests for box detection and zone splitting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 15:06:23 +01:00


"""
CV-based Document Reconstruction Pipeline for Vocabulary Extraction.

Re-export facade. All logic lives in the sub-modules:

    cv_vocab_types    Dataclasses, constants, IPA, feature flags
    cv_preprocessing  Image I/O, orientation, deskew, dewarp
    cv_layout         Document type, columns, rows, classification
    cv_ocr_engines    OCR engines, vocab post-processing, text cleaning
    cv_cell_grid      Cell grid (v2 + legacy), vocab conversion
    cv_box_detect     Box detection (lines + color)
    cv_review         LLM/spell review, pipeline orchestration

License: Apache 2.0 (commercial use permitted)
PRIVACY: All processing happens locally.
"""

from cv_vocab_types import *  # noqa: F401,F403
from cv_preprocessing import *  # noqa: F401,F403
from cv_layout import *  # noqa: F401,F403
from cv_ocr_engines import *  # noqa: F401,F403
from cv_cell_grid import *  # noqa: F401,F403
from cv_box_detect import *  # noqa: F401,F403
from cv_review import *  # noqa: F401,F403

# Private names used by consumers; the wildcard re-exports above do not cover them.
from cv_preprocessing import _apply_shear  # noqa: F401
from cv_layout import (  # noqa: F401
    _detect_header_footer_gaps,
    _detect_sub_columns,
    _split_broad_columns,
)
from cv_ocr_engines import (  # noqa: F401
    _fix_character_confusion,
    _fix_phonetic_brackets,
)
from cv_cell_grid import _cells_to_vocab_entries  # noqa: F401
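
The explicit imports of underscore-prefixed names follow from how Python treats `from module import *`: when a module defines no `__all__`, names with a leading underscore are skipped. The sketch below illustrates that behavior under the assumption that the sub-modules do not list their private helpers in `__all__` (the source does not show this). `demo_submodule`, `detect_columns`, and `_split_broad` are hypothetical stand-ins, not part of this repository.

```python
import sys
import types

# Hypothetical in-memory module standing in for a sub-module such as cv_layout.
demo = types.ModuleType("demo_submodule")
exec(
    "def detect_columns():\n"
    "    return 'public'\n"
    "def _split_broad():\n"
    "    return 'private'\n",
    demo.__dict__,
)
sys.modules["demo_submodule"] = demo

# Wildcard import: with no __all__ defined, names starting with '_' are skipped.
star_ns = {}
exec("from demo_submodule import *", star_ns)

# Explicit import: a private name can still be imported by naming it.
explicit_ns = {}
exec("from demo_submodule import _split_broad", explicit_ns)

print("detect_columns" in star_ns)    # True: public name is re-exported
print("_split_broad" in star_ns)      # False: private name is skipped
print("_split_broad" in explicit_ns)  # True: explicit import works
```

If a sub-module did define `__all__`, the wildcard import would re-export exactly the names listed there, private or not, so a facade like this one would need the explicit imports only for names left out of that list.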