breakpilot-lehrer/klausur-service/backend/ocr/review/review.py
Benjamin Admin cb1be59e46
Restructure: Move 47 cv_* files into ocr/ package
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 21:03:54 +02:00

47 lines
1.2 KiB
Python

"""
Multi-pass OCR, line matching, LLM/spell review, and pipeline orchestration.
Re-export facade -- all logic lives in the sub-modules:
cv_review_pipeline Stages 6-8: OCR, line alignment, orchestrator
cv_review_spell Rule-based spell-checker OCR correction
cv_review_llm LLM-based OCR correction, prompt building, streaming
Lizenz: Apache 2.0 (kommerziell nutzbar)
DATENSCHUTZ: Alle Verarbeitung erfolgt lokal.
"""
# Re-export everything for backward compatibility
from .pipeline import (  # noqa: F401
    ocr_region,
    run_multi_pass_ocr,
    match_lines_to_vocab,
    llm_post_correct,
    run_cv_pipeline,
)
from .spell import (  # noqa: F401
    _SPELL_AVAILABLE,
    _spell_dict_knows,
    _spell_fix_field,
    _spell_fix_token,
    _try_split_merged_word,
    _normalize_page_ref,
    spell_review_entries_sync,
    spell_review_entries_streaming,
)
from .llm import (  # noqa: F401
    OLLAMA_REVIEW_MODEL,
    REVIEW_ENGINE,
    _REVIEW_BATCH_SIZE,
    _build_llm_prompt,
    _diff_batch,
    _entry_needs_review,
    _is_spurious_change,
    _parse_llm_json_array,
    _sanitize_for_json,
    llm_review_entries,
    llm_review_entries_streaming,
)
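
Because this module exists only to keep old import paths working, a small smoke test can catch a name that silently drops out of the facade after a restructure. The following is a minimal sketch, not part of the repository: `missing_exports` is a hypothetical helper, and `fake_facade` is a stand-in module built in memory so the example runs without the real package installed.

```python
import types


def missing_exports(module, names):
    """Return the names from `names` that `module` does not expose."""
    return [name for name in names if not hasattr(module, name)]


# Stand-in for the facade module (assumption: demonstration only).
# It deliberately lacks one of the expected names.
fake_facade = types.ModuleType("fake_review")
fake_facade.run_cv_pipeline = lambda *args, **kwargs: None
fake_facade.llm_review_entries = lambda *args, **kwargs: None

expected = [
    "run_cv_pipeline",
    "llm_review_entries",
    "spell_review_entries_sync",
]

# The helper reports exactly the name the stand-in does not provide.
missing = missing_exports(fake_facade, expected)
print(missing)
```

In a real test suite the stand-in would presumably be replaced by the imported facade module, with the expected list mirroring the names re-exported above.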