Phase 1 of the clean architecture refactor: replaces the 751-line ocr-overlay monolith with a modular pipeline. Each step gets its own component file.

Frontend: /ai/ocr-kombi route with 11 steps (Upload, Orientation, PageSplit, Deskew, Dewarp, ContentCrop, OCR, Structure, GridBuild, GridReview, GroundTruth). Session list supports document grouping for multi-page uploads.

Backend: New ocr_kombi/ module with multi-page PDF upload (splits PDF into N sessions with shared document_group_id). DB migration adds document_group_id and page_number columns.

Old /ai/ocr-overlay remains fully functional for A/B testing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
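The document grouping described above could look roughly like the following on the frontend. This is a minimal sketch, not the code from this commit: only the `document_group_id` and `page_number` column names come from the commit message, while `OcrSession`, `SessionGroup`, and `groupSessions` are hypothetical names introduced for illustration.

```typescript
// Hypothetical session shape; only document_group_id and page_number
// are taken from the commit message, everything else is assumed.
interface OcrSession {
  id: string
  document_group_id: string | null
  page_number: number | null
}

// Hypothetical grouping result: one entry per multi-page document,
// or one entry per standalone session (groupId === null).
interface SessionGroup {
  groupId: string | null
  pages: OcrSession[]
}

function groupSessions(sessions: OcrSession[]): SessionGroup[] {
  const byGroup = new Map<string, SessionGroup>()
  const result: SessionGroup[] = []

  for (const session of sessions) {
    // Sessions without a group id are treated as single-page documents.
    if (session.document_group_id === null) {
      result.push({ groupId: null, pages: [session] })
      continue
    }
    let group = byGroup.get(session.document_group_id)
    if (!group) {
      group = { groupId: session.document_group_id, pages: [] }
      byGroup.set(session.document_group_id, group)
      result.push(group) // keeps first-seen order of groups
    }
    group.pages.push(session)
  }

  // Order the pages of each group by page_number (missing numbers sort first).
  for (const group of result) {
    group.pages.sort((a, b) => (a.page_number ?? 0) - (b.page_number ?? 0))
  }
  return result
}
```

Grouping on the client keeps the session list endpoint flat; whether the actual implementation groups on the client or on the server is not stated in the commit message.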
31 lines
865 B
TypeScript
'use client'

import { PaddleDirectStep } from '@/components/ocr-overlay/PaddleDirectStep'

interface StepOcrProps {
  sessionId: string | null
  onNext: () => void
}

/**
 * Step 7: OCR (Kombi mode = PaddleOCR + Tesseract).
 *
 * Phase 1: Uses the existing PaddleDirectStep with kombi endpoint.
 * Phase 3 (later) will add transparent 3-phase progress + engine comparison.
 */
export function StepOcr({ sessionId, onNext }: StepOcrProps) {
  return (
    <PaddleDirectStep
      sessionId={sessionId}
      onNext={onNext}
      endpoint="paddle-kombi"
      title="Kombi-Modus"
      description="PP-OCRv5 und Tesseract laufen parallel. Koordinaten werden gewichtet gemittelt fuer optimale Positionierung."
      icon="🔀"
      buttonLabel="PP-OCRv5 + Tesseract starten"
      runningLabel="PP-OCRv5 + Tesseract laufen..."
      engineKey="kombi"
    />
  )
}
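
The component description says that coordinates from both engines are averaged with weights. The backend logic is not part of this file, so the following is only a sketch of what such a weighted average could look like; the `Box` shape, the `mergeBoxes` helper, and the default weight of 0.5 are assumptions and not taken from the repository.

```typescript
// Hypothetical bounding box shape (pixel coordinates).
interface Box {
  x: number
  y: number
  width: number
  height: number
}

// Blend two boxes for the same word. paddleWeight is the share given to the
// PaddleOCR box; the Tesseract box receives the remainder (1 - paddleWeight).
// The default of 0.5 is an assumption, not the weighting used by the backend.
function mergeBoxes(paddle: Box, tesseract: Box, paddleWeight = 0.5): Box {
  if (paddleWeight < 0 || paddleWeight > 1) {
    throw new RangeError('paddleWeight must be between 0 and 1')
  }
  const mix = (a: number, b: number): number =>
    paddleWeight * a + (1 - paddleWeight) * b

  return {
    x: mix(paddle.x, tesseract.x),
    y: mix(paddle.y, tesseract.y),
    width: mix(paddle.width, tesseract.width),
    height: mix(paddle.height, tesseract.height),
  }
}
```

A single scalar weight keeps the sketch simple; a real pipeline might instead derive the weight per word from each engine's confidence score.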