Compare commits
3 Commits
fd99d4f875...coolify

| Author | SHA1 | Date |
|---|---|---|
| | b697963186 | |
| | ef6237ffdf | |
| | 41a8f3b183 | |
@@ -6,31 +6,22 @@
 
 | Device | Role | Tasks |
 |--------|-------|----------|
-| **MacBook** | Development | Claude Terminal, code development, browser (frontend tests) |
-| **Mac Mini** | Server | Docker, all services, tests, builds, deployment |
+| **MacBook** | Client | Claude Terminal, browser (frontend tests) |
+| **Mac Mini** | Server | Docker, all services, code execution, tests, Git |
 
-**IMPORTANT:** Code is edited directly on the MacBook in this repo. Docker and the services run on the Mac Mini.
+**IMPORTANT:** Development takes place entirely on the **Mac Mini**!
 
-### Development Workflow
+### SSH Connection
 
 ```bash
-# 1. Edit code on the MacBook (this directory)
-# 2. Commit and push:
-git push origin main && git push gitea main
+ssh macmini
+# Project directory:
+cd /Users/benjaminadmin/Projekte/breakpilot-lehrer
 
-# 3. Pull on the Mac Mini and rebuild the container:
-ssh macmini "git -C /Users/benjaminadmin/Projekte/breakpilot-lehrer pull --no-rebase origin main"
-ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-lehrer/docker-compose.yml build --no-cache <service>"
-ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-lehrer/docker-compose.yml up -d <service>"
+# Single commands (PREFERRED):
+ssh macmini "cd /Users/benjaminadmin/Projekte/breakpilot-lehrer && <cmd>"
 ```
 
-### SSH Connection (for Docker/Tests)
-
-**IMPORTANT:** `cd` in SSH commands does NOT work reliably! Use instead:
-- Git: `git -C /Users/benjaminadmin/Projekte/breakpilot-lehrer <cmd>`
-- Docker: `/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-lehrer/docker-compose.yml <cmd>`
-- Logs: `/usr/local/bin/docker logs -f bp-lehrer-<service>`
-
 ---
 
 ## Prerequisite
@@ -172,10 +163,10 @@ breakpilot-lehrer/
 
 ```bash
 # Start the Lehrer services (Core must be running!)
-ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-lehrer/docker-compose.yml up -d"
+ssh macmini "cd /Users/benjaminadmin/Projekte/breakpilot-lehrer && /usr/local/bin/docker compose up -d"
 
 # Rebuild a single service
-ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/breakpilot-lehrer/docker-compose.yml build --no-cache <service>"
+ssh macmini "cd /Users/benjaminadmin/Projekte/breakpilot-lehrer && /usr/local/bin/docker compose build --no-cache <service>"
 
 # Logs
 ssh macmini "/usr/local/bin/docker logs -f bp-lehrer-<service>"
@@ -185,7 +176,6 @@ ssh macmini "/usr/local/bin/docker ps --filter name=bp-lehrer"
 ```
 
 **IMPORTANT:** The Docker path on the Mac Mini is `/usr/local/bin/docker` (not in the default SSH PATH).
-**IMPORTANT:** Always use `-f` with the full path to docker-compose.yml; `cd` over SSH does not work!
 
 ### Frontend Development
 
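The hunks above switch between two ways of running Compose on the remote host: passing `-f` with the full path to `docker-compose.yml`, and changing into the project directory first. The following is a minimal TypeScript sketch of assembling the `-f` variant as a string. The function name `composeOverSsh` and the constant names are hypothetical; the host alias, Docker binary path, and project directory are taken from the diff.

```typescript
// Hypothetical helper, not part of the repository.
// Paths below are copied from the README hunks in this diff.
const PROJECT_DIR = '/Users/benjaminadmin/Projekte/breakpilot-lehrer'
const DOCKER_BIN = '/usr/local/bin/docker'

// Builds an ssh command that runs `docker compose` with an explicit
// compose file, so it does not depend on the remote working directory.
function composeOverSsh(host: string, composeArgs: string[]): string {
  const composeFile = `${PROJECT_DIR}/docker-compose.yml`
  const remote = [DOCKER_BIN, 'compose', '-f', composeFile, ...composeArgs].join(' ')
  return `ssh ${host} "${remote}"`
}

console.log(composeOverSsh('macmini', ['up', '-d']))
```

The sketch does no shell quoting of its arguments, so it is only suitable for fixed, trusted values such as service names.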
79
.env.coolify.example
Normal file
@@ -0,0 +1,79 @@
+# =========================================================
+# BreakPilot Lehrer - Coolify Environment Variables
+# =========================================================
+# Copy these into Coolify's environment variable UI
+# for the breakpilot-lehrer Docker Compose resource.
+# =========================================================
+
+# --- External PostgreSQL (Coolify-managed, same as Core) ---
+POSTGRES_HOST=<coolify-postgres-hostname>
+POSTGRES_PORT=5432
+POSTGRES_USER=breakpilot
+POSTGRES_PASSWORD=CHANGE_ME_SAME_AS_CORE
+POSTGRES_DB=breakpilot_db
+
+# --- Security ---
+JWT_SECRET=CHANGE_ME_SAME_AS_CORE
+
+# --- External S3 Storage (same as Core) ---
+S3_ENDPOINT=<s3-endpoint-host:port>
+S3_ACCESS_KEY=CHANGE_ME_SAME_AS_CORE
+S3_SECRET_KEY=CHANGE_ME_SAME_AS_CORE
+S3_BUCKET=breakpilot-rag
+S3_SECURE=true
+
+# --- External Qdrant (Coolify-managed, same as Core) ---
+QDRANT_URL=http://<coolify-qdrant-hostname>:6333
+
+# --- Session ---
+SESSION_TTL_HOURS=24
+
+# --- SMTP (Real mail server) ---
+SMTP_HOST=smtp.example.com
+SMTP_PORT=587
+SMTP_USERNAME=noreply@breakpilot.ai
+SMTP_PASSWORD=CHANGE_ME_SMTP_PASSWORD
+SMTP_FROM_NAME=BreakPilot
+SMTP_FROM_ADDR=noreply@breakpilot.ai
+
+# --- LLM / Ollama (optional) ---
+OLLAMA_BASE_URL=
+OLLAMA_URL=
+OLLAMA_ENABLED=false
+OLLAMA_DEFAULT_MODEL=
+OLLAMA_VISION_MODEL=
+OLLAMA_CORRECTION_MODEL=
+OLLAMA_TIMEOUT=120
+
+# --- Anthropic (optional) ---
+ANTHROPIC_API_KEY=
+
+# --- vast.ai GPU (optional) ---
+VAST_API_KEY=
+VAST_INSTANCE_ID=
+
+# --- Game Settings ---
+GAME_USE_DATABASE=true
+GAME_REQUIRE_AUTH=true
+GAME_REQUIRE_BILLING=true
+GAME_LLM_MODEL=
+
+# --- Frontend URLs (build args) ---
+NEXT_PUBLIC_API_URL=https://api-lehrer.breakpilot.ai
+NEXT_PUBLIC_KLAUSUR_SERVICE_URL=https://klausur.breakpilot.ai
+NEXT_PUBLIC_VOICE_SERVICE_URL=wss://voice.breakpilot.ai
+NEXT_PUBLIC_BILLING_API_URL=https://api-core.breakpilot.ai
+NEXT_PUBLIC_APP_URL=https://app.breakpilot.ai
+NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY=
+
+# --- Edu Search ---
+EDU_SEARCH_URL=
+EDU_SEARCH_API_KEY=
+OPENSEARCH_PASSWORD=CHANGE_ME_OPENSEARCH_PASSWORD
+
+# --- Misc ---
+CONTROL_API_KEY=
+ALERTS_AGENT_ENABLED=false
+PADDLEOCR_SERVICE_URL=
+TROCR_SERVICE_URL=
+CAMUNDA_URL=
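The example file above marks values that must be replaced in two ways: a `CHANGE_ME...` prefix and `<...>` angle-bracket placeholders. As a rough illustration, the TypeScript sketch below parses `KEY=VALUE` text and lists the keys that still carry such a placeholder. Both function names are hypothetical, and the parsing is deliberately simplistic (no quoting, no `export` prefix, no multi-line values).

```typescript
// Hypothetical helpers, not part of the repository.

// Parses simple KEY=VALUE lines, ignoring blank lines and # comments.
function parseEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {}
  for (const raw of text.split('\n')) {
    const line = raw.trim()
    if (line === '' || line.startsWith('#')) continue
    const eq = line.indexOf('=')
    if (eq <= 0) continue // skip lines without a key
    out[line.slice(0, eq)] = line.slice(eq + 1)
  }
  return out
}

// Returns the keys whose values still look like template placeholders.
function findPlaceholders(env: Record<string, string>): string[] {
  return Object.keys(env).filter((key) => {
    const value = env[key]
    return value.startsWith('CHANGE_ME') || /<[^>]+>/.test(value)
  })
}

const sample = [
  '# --- External PostgreSQL ---',
  'POSTGRES_PORT=5432',
  'POSTGRES_PASSWORD=CHANGE_ME_SAME_AS_CORE',
  'QDRANT_URL=http://<coolify-qdrant-hostname>:6333',
  'OLLAMA_URL=',
].join('\n')

console.log(findPlaceholders(parseEnv(sample)))
```

Empty optional values such as `OLLAMA_URL=` are intentionally not reported, since the file labels those sections as optional.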
17
.env.example
@@ -30,23 +30,6 @@ OLLAMA_VISION_MODEL=llama3.2-vision
 OLLAMA_CORRECTION_MODEL=llama3.2
 OLLAMA_TIMEOUT=120
 
-# OCR pipeline: LLM review (step 6)
-# Small models are sufficient for character corrections (0->O, 1->l, 5->S)
-# Options: qwen3:0.6b, qwen3:1.7b, gemma3:1b, qwen3:30b-a3b
-OLLAMA_REVIEW_MODEL=qwen3:0.6b
-# Entries per Ollama call. Larger = less HTTP overhead.
-OLLAMA_REVIEW_BATCH_SIZE=20
-
-# OCR pipeline: engine for step 5 (word recognition)
-# Options: auto (prefers RapidOCR), rapid, tesseract,
-# trocr-printed, trocr-handwritten, lighton
-OCR_ENGINE=auto
-
-# Klausur HTR: primary model for handwriting recognition (qwen2.5vl already on the Mac Mini)
-OLLAMA_HTR_MODEL=qwen2.5vl:32b
-# HTR fallback: used when Ollama is unreachable (auto-download ~340 MB)
-HTR_FALLBACK_MODEL=trocr-large
-
 # Anthropic (optional)
 ANTHROPIC_API_KEY=
 
32
.gitea/workflows/deploy-coolify.yml
Normal file
@@ -0,0 +1,32 @@
+name: Deploy to Coolify
+
+on:
+  push:
+    branches:
+      - coolify
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Wait for Core deployment
+        run: |
+          echo "Waiting 30s for Core services to stabilize..."
+          sleep 30
+
+      - name: Deploy via Coolify API
+        run: |
+          echo "Deploying breakpilot-lehrer to Coolify..."
+          HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
+            -X POST \
+            -H "Authorization: Bearer ${{ secrets.COOLIFY_API_TOKEN }}" \
+            -H "Content-Type: application/json" \
+            -d '{"uuid": "${{ secrets.COOLIFY_RESOURCE_UUID }}", "force_rebuild": true}' \
+            "${{ secrets.COOLIFY_BASE_URL }}/api/v1/deploy")
+
+          echo "HTTP Status: $HTTP_STATUS"
+          if [ "$HTTP_STATUS" -ne 200 ] && [ "$HTTP_STATUS" -ne 201 ]; then
+            echo "Deployment failed with status $HTTP_STATUS"
+            exit 1
+          fi
+          echo "Deployment triggered successfully!"
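The deploy step in the workflow above posts a JSON body to `<base URL>/api/v1/deploy` with a bearer token and treats only HTTP 200 and 201 as success. The same logic can be sketched in TypeScript as follows. All type and function names here are hypothetical; the path, headers, body fields, and accepted status codes mirror the workflow, and nothing beyond that is assumed about the Coolify API. The HTTP call is injected as a parameter so the sketch does not depend on a particular runtime.

```typescript
// Hypothetical types and helpers mirroring the curl call in the workflow.
interface DeployConfig {
  baseUrl: string
  token: string
  resourceUuid: string
}

interface DeployInit {
  method: 'POST'
  headers: Record<string, string>
  body: string
}

// Injected HTTP function; any client returning a numeric status fits.
type HttpPost = (url: string, init: DeployInit) => Promise<{ status: number }>

function buildDeployRequest(cfg: DeployConfig): { url: string; init: DeployInit } {
  return {
    url: `${cfg.baseUrl}/api/v1/deploy`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${cfg.token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ uuid: cfg.resourceUuid, force_rebuild: true }),
    },
  }
}

// The workflow accepts exactly 200 and 201.
function isDeployAccepted(status: number): boolean {
  return status === 200 || status === 201
}

async function triggerDeploy(cfg: DeployConfig, post: HttpPost): Promise<void> {
  const { url, init } = buildDeployRequest(cfg)
  const res = await post(url, init)
  if (!isDeployAccepted(res.status)) {
    throw new Error(`Deployment failed with status ${res.status}`)
  }
}
```

As in the shell version, a thrown error corresponds to `exit 1`, which fails the job.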
@@ -34,8 +34,8 @@ WORKDIR /app
 ENV NODE_ENV=production
 
 # Create non-root user
-RUN addgroup --system --gid 1001 nodejs
-RUN adduser --system --uid 1001 nextjs
+RUN addgroup -S -g 1001 nodejs
+RUN adduser -S -u 1001 -G nodejs nextjs
 
 # Copy built assets
 COPY --from=builder /app/public ./public
@@ -273,6 +273,52 @@ Your goal is the timely detection and communication of relevant events
     createdAt: '2024-12-01T00:00:00Z',
     updatedAt: '2025-01-12T02:00:00Z'
   },
+  'compliance-advisor': {
+    id: 'compliance-advisor',
+    name: 'Compliance Advisor',
+    description: 'GDPR/compliance advisor for SDK users',
+    soulFile: 'compliance-advisor.soul.md',
+    soulContent: `# Compliance Advisor Agent
+
+## Identity
+You are the BreakPilot compliance advisor. You help users of the AI Compliance SDK
+get answers to data protection and compliance questions in plain language.
+You are not a lawyer and do not give legal advice; instead you rely on
+official sources and give practical guidance.
+
+## Core Principles
+- **Source-based**: Always refer to concrete legal bases (GDPR articles, BDSG sections)
+- **Understandable**: Explain legal concepts in simple, practical language
+- **Honest**: When uncertain, recommend professional legal advice
+- **Context-aware**: Use the RAG system for current legal texts and guidelines
+- **Scope-aware**: Use all available RAG sources EXCEPT NIBIS documents
+
+## Areas of Competence
+- GDPR Art. 1-99 + recitals
+- BDSG (Bundesdatenschutzgesetz)
+- AI Act (EU AI Regulation)
+- TTDSG, ePrivacy Directive
+- DSK short papers (Nos. 1-20)
+- SDM V3.0, BSI-Grundschutz, BSI-TR-03161
+- EDPB Guidelines, federal/state must-lists (Muss-Listen)
+- ISO 27001/27701 (overview)
+
+## Communication Style
+- Factual but understandable
+- German as the primary language
+- Structured answers with source citations
+- Practical examples where helpful`,
+    color: '#6366f1',
+    status: 'running',
+    activeSessions: 0,
+    totalProcessed: 0,
+    avgResponseTime: 0,
+    errorRate: 0,
+    lastRestart: new Date().toISOString(),
+    version: '1.0.0',
+    createdAt: new Date().toISOString(),
+    updatedAt: new Date().toISOString()
+  },
   'orchestrator': {
     id: 'orchestrator',
     name: 'Orchestrator',
@@ -94,6 +94,19 @@ const mockAgents: AgentConfig[] = [
     totalProcessed: 8934,
     avgResponseTime: 12,
     lastActivity: 'just now'
+  },
+  {
+    id: 'compliance-advisor',
+    name: 'Compliance Advisor',
+    description: 'GDPR/compliance advisor for SDK users',
+    soulFile: 'compliance-advisor.soul.md',
+    color: '#6366f1',
+    icon: 'message',
+    status: 'running',
+    activeSessions: 0,
+    totalProcessed: 0,
+    avgResponseTime: 0,
+    lastActivity: new Date().toISOString()
   }
 ]
 
@@ -1,409 +0,0 @@
-'use client'
-
-import { useCallback, useEffect, useState } from 'react'
-import { PagePurpose } from '@/components/common/PagePurpose'
-import { PipelineStepper } from '@/components/ocr-pipeline/PipelineStepper'
-import { StepDeskew } from '@/components/ocr-pipeline/StepDeskew'
-import { StepDewarp } from '@/components/ocr-pipeline/StepDewarp'
-import { StepColumnDetection } from '@/components/ocr-pipeline/StepColumnDetection'
-import { StepRowDetection } from '@/components/ocr-pipeline/StepRowDetection'
-import { StepWordRecognition } from '@/components/ocr-pipeline/StepWordRecognition'
-import { StepLlmReview } from '@/components/ocr-pipeline/StepLlmReview'
-import { StepReconstruction } from '@/components/ocr-pipeline/StepReconstruction'
-import { StepGroundTruth } from '@/components/ocr-pipeline/StepGroundTruth'
-import { PIPELINE_STEPS, type PipelineStep, type SessionListItem, type DocumentTypeResult } from './types'
-
-const KLAUSUR_API = '/klausur-api'
-
-export default function OcrPipelinePage() {
-  const [currentStep, setCurrentStep] = useState(0)
-  const [sessionId, setSessionId] = useState<string | null>(null)
-  const [sessionName, setSessionName] = useState<string>('')
-  const [sessions, setSessions] = useState<SessionListItem[]>([])
-  const [loadingSessions, setLoadingSessions] = useState(true)
-  const [editingName, setEditingName] = useState<string | null>(null)
-  const [editNameValue, setEditNameValue] = useState('')
-  const [docTypeResult, setDocTypeResult] = useState<DocumentTypeResult | null>(null)
-  const [steps, setSteps] = useState<PipelineStep[]>(
-    PIPELINE_STEPS.map((s, i) => ({
-      ...s,
-      status: i === 0 ? 'active' : 'pending',
-    })),
-  )
-
-  // Load session list on mount
-  useEffect(() => {
-    loadSessions()
-  }, [])
-
-  const loadSessions = async () => {
-    setLoadingSessions(true)
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`)
-      if (res.ok) {
-        const data = await res.json()
-        setSessions(data.sessions || [])
-      }
-    } catch (e) {
-      console.error('Failed to load sessions:', e)
-    } finally {
-      setLoadingSessions(false)
-    }
-  }
-
-  const openSession = useCallback(async (sid: string) => {
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`)
-      if (!res.ok) return
-      const data = await res.json()
-
-      setSessionId(sid)
-      setSessionName(data.name || data.filename || '')
-
-      // Restore doc type result if available
-      const savedDocType: DocumentTypeResult | null = data.doc_type_result || null
-      setDocTypeResult(savedDocType)
-
-      // Determine which step to jump to based on current_step
-      const dbStep = data.current_step || 1
-      // Steps: 1=deskew, 2=dewarp, 3=columns, ...
-      // UI steps are 0-indexed: 0=deskew, 1=dewarp, 2=columns, ...
-      const uiStep = Math.max(0, dbStep - 1)
-      const skipSteps = savedDocType?.skip_steps || []
-
-      setSteps(
-        PIPELINE_STEPS.map((s, i) => ({
-          ...s,
-          status: skipSteps.includes(s.id)
-            ? 'skipped'
-            : i < uiStep ? 'completed' : i === uiStep ? 'active' : 'pending',
-        })),
-      )
-      setCurrentStep(uiStep)
-    } catch (e) {
-      console.error('Failed to open session:', e)
-    }
-  }, [])
-
-  const deleteSession = useCallback(async (sid: string) => {
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, { method: 'DELETE' })
-      setSessions((prev) => prev.filter((s) => s.id !== sid))
-      if (sessionId === sid) {
-        setSessionId(null)
-        setCurrentStep(0)
-        setDocTypeResult(null)
-        setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-      }
-    } catch (e) {
-      console.error('Failed to delete session:', e)
-    }
-  }, [sessionId])
-
-  const renameSession = useCallback(async (sid: string, newName: string) => {
-    try {
-      await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sid}`, {
-        method: 'PUT',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ name: newName }),
-      })
-      setSessions((prev) => prev.map((s) => (s.id === sid ? { ...s, name: newName } : s)))
-      if (sessionId === sid) setSessionName(newName)
-    } catch (e) {
-      console.error('Failed to rename session:', e)
-    }
-    setEditingName(null)
-  }, [sessionId])
-
-  const handleStepClick = (index: number) => {
-    if (index <= currentStep || steps[index].status === 'completed') {
-      setCurrentStep(index)
-    }
-  }
-
-  const goToStep = (step: number) => {
-    setCurrentStep(step)
-    setSteps((prev) =>
-      prev.map((s, i) => ({
-        ...s,
-        status: i < step ? 'completed' : i === step ? 'active' : 'pending',
-      })),
-    )
-  }
-
-  const handleNext = () => {
-    if (currentStep >= steps.length - 1) return
-
-    // Find the next non-skipped step
-    const skipSteps = docTypeResult?.skip_steps || []
-    let nextStep = currentStep + 1
-    while (nextStep < steps.length && skipSteps.includes(PIPELINE_STEPS[nextStep]?.id)) {
-      nextStep++
-    }
-    if (nextStep >= steps.length) nextStep = steps.length - 1
-
-    setSteps((prev) =>
-      prev.map((s, i) => {
-        if (i === currentStep) return { ...s, status: 'completed' }
-        if (i === nextStep) return { ...s, status: 'active' }
-        // Mark skipped steps between current and next
-        if (i > currentStep && i < nextStep && skipSteps.includes(PIPELINE_STEPS[i]?.id)) {
-          return { ...s, status: 'skipped' }
-        }
-        return s
-      }),
-    )
-    setCurrentStep(nextStep)
-  }
-
-  const handleDeskewComplete = (sid: string) => {
-    setSessionId(sid)
-    // Reload session list to show the new session
-    loadSessions()
-    handleNext()
-  }
-
-  const handleDewarpNext = async () => {
-    // Auto-detect document type after dewarp, then advance
-    if (sessionId) {
-      try {
-        const res = await fetch(
-          `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/detect-type`,
-          { method: 'POST' },
-        )
-        if (res.ok) {
-          const data: DocumentTypeResult = await res.json()
-          setDocTypeResult(data)
-
-          // Mark skipped steps immediately
-          const skipSteps = data.skip_steps || []
-          if (skipSteps.length > 0) {
-            setSteps((prev) =>
-              prev.map((s) =>
-                skipSteps.includes(s.id) ? { ...s, status: 'skipped' } : s,
-              ),
-            )
-          }
-        }
-      } catch (e) {
-        console.error('Doc type detection failed:', e)
-        // Not critical; continue without it
-      }
-    }
-    handleNext()
-  }
-
-  const handleDocTypeChange = (newDocType: DocumentTypeResult['doc_type']) => {
-    if (!docTypeResult) return
-
-    // Build new skip_steps based on doc type
-    let skipSteps: string[] = []
-    if (newDocType === 'full_text') {
-      skipSteps = ['columns', 'rows']
-    }
-    // vocab_table and generic_table: no skips
-
-    const updated: DocumentTypeResult = {
-      ...docTypeResult,
-      doc_type: newDocType,
-      skip_steps: skipSteps,
-      pipeline: newDocType === 'full_text' ? 'full_page' : 'cell_first',
-    }
-    setDocTypeResult(updated)
-
-    // Update step statuses
-    setSteps((prev) =>
-      prev.map((s) => {
-        if (skipSteps.includes(s.id)) return { ...s, status: 'skipped' as const }
-        if (s.status === 'skipped') return { ...s, status: 'pending' as const }
-        return s
-      }),
-    )
-  }
-
-  const handleNewSession = () => {
-    setSessionId(null)
-    setSessionName('')
-    setCurrentStep(0)
-    setDocTypeResult(null)
-    setSteps(PIPELINE_STEPS.map((s, i) => ({ ...s, status: i === 0 ? 'active' : 'pending' })))
-  }
-
-  const stepNames: Record<number, string> = {
-    1: 'Deskew',
-    2: 'Dewarp',
-    3: 'Columns',
-    4: 'Rows',
-    5: 'Words',
-    6: 'Correction',
-    7: 'Reconstruction',
-    8: 'Validation',
-  }
-
-  const reprocessFromStep = useCallback(async (uiStep: number) => {
-    if (!sessionId) return
-    const dbStep = uiStep + 1 // UI is 0-indexed, DB is 1-indexed
-    if (!confirm(`Reprocess from step ${dbStep} (${stepNames[dbStep] || '?'})? Subsequent data will be deleted.`)) return
-    try {
-      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/reprocess`, {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ from_step: dbStep }),
-      })
-      if (!res.ok) {
-        const data = await res.json().catch(() => ({}))
-        console.error('Reprocess failed:', data.detail || res.status)
-        return
-      }
-      // Reset UI steps
-      goToStep(uiStep)
-    } catch (e) {
-      console.error('Reprocess error:', e)
-    }
-  // eslint-disable-next-line react-hooks/exhaustive-deps
-  }, [sessionId, goToStep])
-
-  const renderStep = () => {
-    switch (currentStep) {
-      case 0:
-        return <StepDeskew sessionId={sessionId} onNext={handleDeskewComplete} />
-      case 1:
-        return <StepDewarp sessionId={sessionId} onNext={handleDewarpNext} />
-      case 2:
-        return <StepColumnDetection sessionId={sessionId} onNext={handleNext} />
-      case 3:
-        return <StepRowDetection sessionId={sessionId} onNext={handleNext} />
-      case 4:
-        return <StepWordRecognition sessionId={sessionId} onNext={handleNext} goToStep={goToStep} />
-      case 5:
-        return <StepLlmReview sessionId={sessionId} onNext={handleNext} />
-      case 6:
-        return <StepReconstruction sessionId={sessionId} onNext={handleNext} />
-      case 7:
-        return <StepGroundTruth />
-      default:
-        return null
-    }
-  }
-
-  return (
-    <div className="space-y-6">
-      <PagePurpose
-        title="OCR Pipeline"
-        purpose="Step-by-step page reconstruction: deskew the scan, detect columns, locate words, and rebuild the page word by word. Goal: reconstruct 10 vocabulary pages without errors."
-        audience={['Developers', 'Data Scientists']}
-        architecture={{
-          services: ['klausur-service (FastAPI)', 'OpenCV', 'Tesseract'],
-          databases: ['PostgreSQL Sessions'],
-        }}
-        relatedPages={[
-          { name: 'OCR Comparison', href: '/ai/ocr-compare', description: 'Method comparison' },
-          { name: 'OCR Labeling', href: '/ai/ocr-labeling', description: 'Training data' },
-        ]}
-        defaultCollapsed
-      />
-
-      {/* Session List */}
-      <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4">
-        <div className="flex items-center justify-between mb-3">
-          <h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
-            Sessions
-          </h3>
-          <button
-            onClick={handleNewSession}
-            className="text-xs px-3 py-1.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors"
-          >
-            + New Session
-          </button>
-        </div>
-
-        {loadingSessions ? (
-          <div className="text-sm text-gray-400 py-2">Loading sessions...</div>
-        ) : sessions.length === 0 ? (
-          <div className="text-sm text-gray-400 py-2">No sessions yet.</div>
-        ) : (
-          <div className="space-y-1 max-h-48 overflow-y-auto">
-            {sessions.map((s) => (
-              <div
-                key={s.id}
-                className={`flex items-center gap-2 px-3 py-2 rounded-lg text-sm transition-colors cursor-pointer ${
-                  sessionId === s.id
-                    ? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
-                    : 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
-                }`}
-              >
-                <div className="flex-1 min-w-0" onClick={() => openSession(s.id)}>
-                  {editingName === s.id ? (
-                    <input
-                      autoFocus
-                      value={editNameValue}
-                      onChange={(e) => setEditNameValue(e.target.value)}
-                      onBlur={() => renameSession(s.id, editNameValue)}
-                      onKeyDown={(e) => {
-                        if (e.key === 'Enter') renameSession(s.id, editNameValue)
-                        if (e.key === 'Escape') setEditingName(null)
-                      }}
-                      onClick={(e) => e.stopPropagation()}
-                      className="w-full px-1 py-0.5 text-sm border rounded dark:bg-gray-700 dark:border-gray-600"
-                    />
-                  ) : (
-                    <div className="truncate font-medium text-gray-700 dark:text-gray-300">
-                      {s.name || s.filename}
-                    </div>
-                  )}
-                  <div className="text-xs text-gray-400 flex gap-2">
-                    <span>{new Date(s.created_at).toLocaleDateString('de-DE', { day: '2-digit', month: '2-digit', year: '2-digit', hour: '2-digit', minute: '2-digit' })}</span>
-                    <span>Step {s.current_step}: {stepNames[s.current_step] || '?'}</span>
-                  </div>
-                </div>
-                <button
-                  onClick={(e) => {
-                    e.stopPropagation()
-                    setEditNameValue(s.name || s.filename)
-                    setEditingName(s.id)
-                  }}
-                  className="p-1 text-gray-400 hover:text-gray-600 dark:hover:text-gray-300"
-                  title="Rename"
-                >
-                  <svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
-                    <path strokeLinecap="round" strokeLinejoin="round" d="M15.232 5.232l3.536 3.536m-2.036-5.036a2.5 2.5 0 113.536 3.536L6.5 21.036H3v-3.572L16.732 3.732z" />
-                  </svg>
-                </button>
-                <button
-                  onClick={(e) => {
-                    e.stopPropagation()
-                    if (confirm('Delete session?')) deleteSession(s.id)
-                  }}
-                  className="p-1 text-gray-400 hover:text-red-500"
-                  title="Delete"
-                >
-                  <svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
-                    <path strokeLinecap="round" strokeLinejoin="round" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
-                  </svg>
-                </button>
-              </div>
-            ))}
-          </div>
-        )}
-      </div>
-
-      {/* Active session name */}
-      {sessionId && sessionName && (
-        <div className="text-sm text-gray-500 dark:text-gray-400">
-          Active session: <span className="font-medium text-gray-700 dark:text-gray-300">{sessionName}</span>
-        </div>
-      )}
-
-      <PipelineStepper
-        steps={steps}
-        currentStep={currentStep}
-        onStepClick={handleStepClick}
-        onReprocess={sessionId ? reprocessFromStep : undefined}
-        docTypeResult={docTypeResult}
-        onDocTypeChange={handleDocTypeChange}
-      />
-
-      <div className="min-h-[400px]">{renderStep()}</div>
-    </div>
-  )
-}
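The removed page component above carries two pieces of step logic that are easy to lose in the JSX: advancing to the next step that is not in `skip_steps`, and restoring step statuses from the 1-indexed `current_step` stored in the database. The TypeScript sketch below isolates both as pure functions. The function names are hypothetical, and the step id list is an assumption for illustration: only `columns` and `rows` appear as ids in the diff, and the contents of `PIPELINE_STEPS` are not shown.

```typescript
// Hypothetical, framework-free restatement of the step logic.
type StepStatus = 'pending' | 'active' | 'completed' | 'skipped'

// Assumed ids; only 'columns' and 'rows' are confirmed by the diff.
const STEP_IDS = ['deskew', 'dewarp', 'columns', 'rows', 'words', 'llm-review', 'reconstruction', 'ground-truth']

// Returns the index of the next step not listed in `skip`.
// Stays on the last step when there is nothing left to advance to.
function nextNonSkippedStep(current: number, stepIds: string[], skip: string[]): number {
  const last = stepIds.length - 1
  if (current >= last) return current
  let next = current + 1
  while (next < stepIds.length && skip.includes(stepIds[next])) next++
  return Math.min(next, last)
}

// Maps the 1-indexed DB step to 0-indexed UI statuses.
function restoreStatuses(stepIds: string[], dbStep: number, skip: string[]): StepStatus[] {
  const uiStep = Math.max(0, dbStep - 1)
  return stepIds.map((id, i) => {
    if (skip.includes(id)) return 'skipped'
    if (i < uiStep) return 'completed'
    return i === uiStep ? 'active' : 'pending'
  })
}

// A full-text document skips column and row detection.
console.log(nextNonSkippedStep(1, STEP_IDS, ['columns', 'rows']))
console.log(restoreStatuses(STEP_IDS, 5, ['columns', 'rows']))
```

Keeping this logic outside the component makes the off-by-one between DB and UI indices testable without rendering anything.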
@@ -1,243 +0,0 @@
|
|||||||
export type PipelineStepStatus = 'pending' | 'active' | 'completed' | 'failed' | 'skipped'
|
|
||||||
|
|
||||||
export interface PipelineStep {
|
|
||||||
id: string
|
|
||||||
name: string
|
|
||||||
icon: string
|
|
||||||
status: PipelineStepStatus
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface SessionListItem {
|
|
||||||
id: string
|
|
||||||
name: string
|
|
||||||
filename: string
|
|
||||||
status: string
|
|
||||||
current_step: number
|
|
||||||
created_at: string
|
|
||||||
updated_at?: string
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface DocumentTypeResult {
|
|
||||||
doc_type: 'vocab_table' | 'full_text' | 'generic_table'
|
|
||||||
confidence: number
|
|
||||||
pipeline: 'cell_first' | 'full_page'
|
|
||||||
skip_steps: string[]
|
|
||||||
features?: Record<string, unknown>
|
|
||||||
duration_seconds?: number
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface SessionInfo {
|
|
||||||
session_id: string
|
|
||||||
filename: string
|
|
||||||
name?: string
|
|
||||||
image_width: number
|
|
||||||
image_height: number
|
|
||||||
original_image_url: string
|
|
||||||
current_step?: number
|
|
||||||
deskew_result?: DeskewResult
|
|
||||||
dewarp_result?: DewarpResult
|
|
||||||
column_result?: ColumnResult
|
|
||||||
row_result?: RowResult
|
|
||||||
word_result?: GridResult
|
|
||||||
doc_type_result?: DocumentTypeResult
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface DeskewResult {
  session_id: string
  angle_hough: number
  angle_word_alignment: number
  angle_applied: number
  method_used: 'hough' | 'word_alignment' | 'manual'
  confidence: number
  duration_seconds: number
  deskewed_image_url: string
  binarized_image_url: string
}

export interface DeskewGroundTruth {
  is_correct: boolean
  corrected_angle?: number
  notes?: string
}

export interface DewarpDetection {
  method: string
  shear_degrees: number
  confidence: number
}

export interface DewarpResult {
  session_id: string
  method_used: string
  shear_degrees: number
  confidence: number
  duration_seconds: number
  dewarped_image_url: string
  detections?: DewarpDetection[]
}

export interface DewarpGroundTruth {
  is_correct: boolean
  corrected_shear?: number
  notes?: string
}

export interface PageRegion {
  type: 'column_en' | 'column_de' | 'column_example' | 'page_ref'
    | 'column_marker' | 'column_text' | 'column_ignore' | 'header' | 'footer'
  x: number
  y: number
  width: number
  height: number
  classification_confidence?: number
  classification_method?: string
}

export interface ColumnResult {
  columns: PageRegion[]
  duration_seconds: number
}

export interface ColumnGroundTruth {
  is_correct: boolean
  corrected_columns?: PageRegion[]
  notes?: string
}

export interface ManualColumnDivider {
  xPercent: number // Position in % of image width (0-100)
}

export type ColumnTypeKey = PageRegion['type']

export interface RowResult {
  rows: RowItem[]
  summary: Record<string, number>
  total_rows: number
  duration_seconds: number
}

export interface RowItem {
  index: number
  x: number
  y: number
  width: number
  height: number
  word_count: number
  row_type: 'content' | 'header' | 'footer'
  gap_before: number
}

export interface RowGroundTruth {
  is_correct: boolean
  corrected_rows?: RowItem[]
  notes?: string
}

export interface WordBbox {
  x: number
  y: number
  w: number
  h: number
}

export interface GridCell {
  cell_id: string // "R03_C1"
  row_index: number
  col_index: number
  col_type: string
  text: string
  confidence: number
  bbox_px: WordBbox
  bbox_pct: WordBbox
  ocr_engine?: string
  status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
}

export interface ColumnMeta {
  index: number
  type: string
  x: number
  width: number
}

export interface GridResult {
  cells: GridCell[]
  grid_shape: { rows: number; cols: number; total_cells: number }
  columns_used: ColumnMeta[]
  layout: 'vocab' | 'generic'
  image_width: number
  image_height: number
  duration_seconds: number
  ocr_engine?: string
  vocab_entries?: WordEntry[] // Only when layout='vocab'
  entries?: WordEntry[] // Backwards compat alias for vocab_entries
  entry_count?: number
  summary: {
    total_cells: number
    non_empty_cells: number
    low_confidence: number
    // Only when layout='vocab':
    total_entries?: number
    with_english?: number
    with_german?: number
  }
  llm_review?: {
    changes: { row_index: number; field: string; old: string; new: string }[]
    model_used: string
    duration_ms: number
    entries_corrected: number
    applied_count?: number
    applied_at?: string
  }
}

export interface WordEntry {
  row_index: number
  english: string
  german: string
  example: string
  source_page?: string
  marker?: string
  confidence: number
  bbox: WordBbox
  bbox_en: WordBbox | null
  bbox_de: WordBbox | null
  bbox_ex: WordBbox | null
  bbox_ref?: WordBbox | null
  bbox_marker?: WordBbox | null
  status?: 'pending' | 'confirmed' | 'edited' | 'skipped'
}

/** @deprecated Use GridResult instead */
export interface WordResult {
  entries: WordEntry[]
  entry_count: number
  image_width: number
  image_height: number
  duration_seconds: number
  ocr_engine?: string
  summary: {
    total_entries: number
    with_english: number
    with_german: number
    low_confidence: number
  }
}

export interface WordGroundTruth {
  is_correct: boolean
  corrected_entries?: WordEntry[]
  notes?: string
}

export const PIPELINE_STEPS: PipelineStep[] = [
  { id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
  { id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
  { id: 'columns', name: 'Spalten', icon: '📊', status: 'pending' },
  { id: 'rows', name: 'Zeilen', icon: '📏', status: 'pending' },
  { id: 'words', name: 'Woerter', icon: '🔤', status: 'pending' },
  { id: 'llm-review', name: 'Korrektur', icon: '✏️', status: 'pending' },
  { id: 'reconstruction', name: 'Rekonstruktion', icon: '🏗️', status: 'pending' },
  { id: 'ground-truth', name: 'Validierung', icon: '✅', status: 'pending' },
]
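Reviewer note on the removed types file: `PIPELINE_STEPS` initialises every step as `'pending'`, while `SessionInfo.current_step` and `DocumentTypeResult.skip_steps` carry the runtime state. The sketch below shows one way a consumer might combine them. It is illustrative only: `deriveStepStatuses` is a hypothetical helper that does not appear in this diff, and it assumes that `current_step` is a zero-based index into the step list and that `skip_steps` holds step ids, neither of which the diff confirms.

```typescript
// Hypothetical helper, not part of the diff. Types are copied from the
// removed file so the example is self-contained.
type PipelineStepStatus = 'pending' | 'active' | 'completed' | 'failed' | 'skipped'

interface PipelineStep {
  id: string
  name: string
  icon: string
  status: PipelineStepStatus
}

// Shortened copy of the step list, for illustration.
const STEPS: PipelineStep[] = [
  { id: 'deskew', name: 'Begradigung', icon: '📐', status: 'pending' },
  { id: 'dewarp', name: 'Entzerrung', icon: '🔧', status: 'pending' },
  { id: 'columns', name: 'Spalten', icon: '📊', status: 'pending' },
  { id: 'rows', name: 'Zeilen', icon: '📏', status: 'pending' },
]

// Assumption: currentStep is a zero-based index; skipSteps contains step ids.
function deriveStepStatuses(
  steps: PipelineStep[],
  currentStep: number,
  skipSteps: string[] = [],
): PipelineStep[] {
  return steps.map((step, index) => {
    let status: PipelineStepStatus = 'pending'
    if (skipSteps.includes(step.id)) status = 'skipped'
    else if (index < currentStep) status = 'completed'
    else if (index === currentStep) status = 'active'
    // Return a copy so the shared constant is never mutated.
    return { ...step, status }
  })
}

const derived = deriveStepStatuses(STEPS, 2, ['dewarp'])
console.log(derived.map(s => `${s.id}:${s.status}`).join(' '))
```

Returning fresh objects rather than mutating the constant keeps the shared step list safe to reuse across sessions.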
@@ -1,671 +0,0 @@
'use client'

import React, { useState, useEffect, useCallback, useRef } from 'react'
import { RAG_PDF_MAPPING } from './rag-pdf-mapping'
import { REGULATIONS_IN_RAG, REGULATION_INFO } from '../rag-constants'

interface ChunkBrowserQAProps {
  apiProxy: string
}

type RegGroupKey = 'eu_regulation' | 'eu_directive' | 'de_law' | 'at_law' | 'ch_law' | 'national_law' | 'bsi_standard' | 'eu_guideline' | 'international_standard' | 'other'

const GROUP_LABELS: Record<RegGroupKey, string> = {
  eu_regulation: 'EU Verordnungen',
  eu_directive: 'EU Richtlinien',
  de_law: 'DE Gesetze',
  at_law: 'AT Gesetze',
  ch_law: 'CH Gesetze',
  national_law: 'Nationale Gesetze (EU)',
  bsi_standard: 'BSI Standards',
  eu_guideline: 'EDPB / Guidelines',
  international_standard: 'Internationale Standards',
  other: 'Sonstige',
}

const GROUP_ORDER: RegGroupKey[] = [
  'eu_regulation', 'eu_directive', 'de_law', 'at_law', 'ch_law',
  'national_law', 'bsi_standard', 'eu_guideline', 'international_standard', 'other',
]

const COLLECTIONS = [
  'bp_compliance_gesetze',
  'bp_compliance_ce',
  'bp_compliance_datenschutz',
]

export function ChunkBrowserQA({ apiProxy }: ChunkBrowserQAProps) {
  // Filter sidebar
  const [selectedRegulation, setSelectedRegulation] = useState<string | null>(null)
  const [regulationCounts, setRegulationCounts] = useState<Record<string, number>>({})
  const [filterSearch, setFilterSearch] = useState('')
  const [countsLoading, setCountsLoading] = useState(false)

  // Document chunks (sequential)
  const [docChunks, setDocChunks] = useState<Record<string, unknown>[]>([])
  const [docChunkIndex, setDocChunkIndex] = useState(0)
  const [docTotalChunks, setDocTotalChunks] = useState(0)
  const [docLoading, setDocLoading] = useState(false)
  const docChunksRef = useRef(docChunks)
  docChunksRef.current = docChunks

  // Split view
  const [splitViewActive, setSplitViewActive] = useState(true)
  const [chunksPerPage, setChunksPerPage] = useState(6)
  const [fullscreen, setFullscreen] = useState(false)

  // Collection: default to bp_compliance_ce, where we have PDFs downloaded
  const [collection, setCollection] = useState('bp_compliance_ce')

  // PDF existence check
  const [pdfExists, setPdfExists] = useState<boolean | null>(null)

  // Sidebar collapsed groups
  const [collapsedGroups, setCollapsedGroups] = useState<Set<string>>(new Set())

  // Build grouped regulations for sidebar
  const regulationsInCollection = Object.entries(REGULATIONS_IN_RAG)
    .filter(([, info]) => info.collection === collection)
    .map(([code]) => code)

  const groupedRegulations = React.useMemo(() => {
    const groups: Record<RegGroupKey, { code: string; name: string; type: string }[]> = {
      eu_regulation: [], eu_directive: [], de_law: [], at_law: [], ch_law: [],
      national_law: [], bsi_standard: [], eu_guideline: [], international_standard: [], other: [],
    }
    for (const code of regulationsInCollection) {
      const reg = REGULATION_INFO.find(r => r.code === code)
      const type = (reg?.type || 'other') as RegGroupKey
      const groupKey = type in groups ? type : 'other'
      groups[groupKey].push({
        code,
        name: reg?.name || code,
        type: reg?.type || 'unknown',
      })
    }
    return groups
  }, [regulationsInCollection.join(',')])

  // Load regulation counts for current collection
  const loadRegulationCounts = useCallback(async (col: string) => {
    const entries = Object.entries(REGULATIONS_IN_RAG)
      .filter(([, info]) => info.collection === col && info.qdrant_id)
    if (entries.length === 0) return

    // Build qdrant_id -> our_code mapping
    const qdrantIdToCode: Record<string, string[]> = {}
    for (const [code, info] of entries) {
      if (!qdrantIdToCode[info.qdrant_id]) qdrantIdToCode[info.qdrant_id] = []
      qdrantIdToCode[info.qdrant_id].push(code)
    }
    const uniqueQdrantIds = Object.keys(qdrantIdToCode)

    setCountsLoading(true)
    try {
      const params = new URLSearchParams({
        action: 'regulation-counts-batch',
        collection: col,
        qdrant_ids: uniqueQdrantIds.join(','),
      })
      const res = await fetch(`${apiProxy}?${params}`)
      if (res.ok) {
        const data = await res.json()
        // Map qdrant_id counts back to our codes
        const mapped: Record<string, number> = {}
        for (const [qid, count] of Object.entries(data.counts as Record<string, number>)) {
          const codes = qdrantIdToCode[qid] || []
          for (const code of codes) {
            mapped[code] = count
          }
        }
        setRegulationCounts(prev => ({ ...prev, ...mapped }))
      }
    } catch (error) {
      console.error('Failed to load regulation counts:', error)
    } finally {
      setCountsLoading(false)
    }
  }, [apiProxy])

  // Load all chunks for a regulation (paginated scroll)
  const loadDocumentChunks = useCallback(async (regulationCode: string) => {
    const ragInfo = REGULATIONS_IN_RAG[regulationCode]
    if (!ragInfo || !ragInfo.qdrant_id) return

    setDocLoading(true)
    setDocChunks([])
    setDocChunkIndex(0)
    setDocTotalChunks(0)

    const allChunks: Record<string, unknown>[] = []
    let offset: string | null = null

    try {
      let safety = 0
      do {
        const params = new URLSearchParams({
          action: 'scroll',
          collection: ragInfo.collection,
          limit: '100',
          filter_key: 'regulation_id',
          filter_value: ragInfo.qdrant_id,
        })
        if (offset) params.append('offset', offset)

        const res = await fetch(`${apiProxy}?${params}`)
        if (!res.ok) break

        const data = await res.json()
        const chunks = data.chunks || []
        allChunks.push(...chunks)
        offset = data.next_offset || null
        safety++
      } while (offset && safety < 200)

      // Sort by chunk_index
      allChunks.sort((a, b) => {
        const ai = Number(a.chunk_index ?? a.chunk_id ?? 0)
        const bi = Number(b.chunk_index ?? b.chunk_id ?? 0)
        return ai - bi
      })

      setDocChunks(allChunks)
      setDocTotalChunks(allChunks.length)
      setDocChunkIndex(0)
    } catch (error) {
      console.error('Failed to load document chunks:', error)
    } finally {
      setDocLoading(false)
    }
  }, [apiProxy])

  // Initial load
  useEffect(() => {
    loadRegulationCounts(collection)
  }, [collection, loadRegulationCounts])

  // Current chunk
  const currentChunk = docChunks[docChunkIndex] || null
  const prevChunk = docChunkIndex > 0 ? docChunks[docChunkIndex - 1] : null
  const nextChunk = docChunkIndex < docChunks.length - 1 ? docChunks[docChunkIndex + 1] : null

  // PDF page estimation: use pages metadata if available
  const estimatePdfPage = (chunk: Record<string, unknown> | null, chunkIdx: number): number => {
    if (chunk) {
      // Try pages array from payload (e.g. [7] or [7,8])
      const pages = chunk.pages as number[] | undefined
      if (Array.isArray(pages) && pages.length > 0) return pages[0]
      // Try page field
      const page = chunk.page as number | undefined
      if (typeof page === 'number' && page > 0) return page
    }
    const mapping = selectedRegulation ? RAG_PDF_MAPPING[selectedRegulation] : null
    const cpp = mapping?.chunksPerPage || chunksPerPage
    return Math.floor(chunkIdx / cpp) + 1
  }

  const pdfPage = estimatePdfPage(currentChunk, docChunkIndex)
  const pdfMapping = selectedRegulation ? RAG_PDF_MAPPING[selectedRegulation] : null
  const pdfUrl = pdfMapping ? `/rag-originals/${pdfMapping.filename}#page=${pdfPage}` : null

  // Check PDF existence when regulation changes
  useEffect(() => {
    if (!selectedRegulation) { setPdfExists(null); return }
    const mapping = RAG_PDF_MAPPING[selectedRegulation]
    if (!mapping) { setPdfExists(false); return }
    const url = `/rag-originals/${mapping.filename}`
    fetch(url, { method: 'HEAD' })
      .then(res => setPdfExists(res.ok))
      .catch(() => setPdfExists(false))
  }, [selectedRegulation])

  // Handlers
  const handleSelectRegulation = (code: string) => {
    setSelectedRegulation(code)
    loadDocumentChunks(code)
  }

  const handleCollectionChange = (col: string) => {
    setCollection(col)
    setSelectedRegulation(null)
    setDocChunks([])
    setDocChunkIndex(0)
    setDocTotalChunks(0)
    setRegulationCounts({})
  }

  const handlePrev = () => {
    if (docChunkIndex > 0) setDocChunkIndex(i => i - 1)
  }

  const handleNext = () => {
    if (docChunkIndex < docChunks.length - 1) setDocChunkIndex(i => i + 1)
  }

  const handleKeyDown = useCallback((e: KeyboardEvent) => {
    if (e.key === 'Escape' && fullscreen) {
      e.preventDefault()
      setFullscreen(false)
    } else if (e.key === 'ArrowLeft' || e.key === 'ArrowUp') {
      e.preventDefault()
      setDocChunkIndex(i => Math.max(0, i - 1))
    } else if (e.key === 'ArrowRight' || e.key === 'ArrowDown') {
      e.preventDefault()
      setDocChunkIndex(i => Math.min(docChunksRef.current.length - 1, i + 1))
    }
  }, [fullscreen])

  useEffect(() => {
    if (fullscreen || (selectedRegulation && docChunks.length > 0)) {
      window.addEventListener('keydown', handleKeyDown)
      return () => window.removeEventListener('keydown', handleKeyDown)
    }
  }, [selectedRegulation, docChunks.length, handleKeyDown, fullscreen])

  const toggleGroup = (group: string) => {
    setCollapsedGroups(prev => {
      const next = new Set(prev)
      if (next.has(group)) next.delete(group)
      else next.add(group)
      return next
    })
  }

  // Get text content from a chunk
  const getChunkText = (chunk: Record<string, unknown> | null): string => {
    if (!chunk) return ''
    return String(chunk.chunk_text || chunk.text || chunk.content || '')
  }

  // Extract structural metadata for prominent display
  const getStructuralInfo = (chunk: Record<string, unknown> | null): { article?: string; section?: string; pages?: string } => {
    if (!chunk) return {}
    const result: { article?: string; section?: string; pages?: string } = {}
    // Article / paragraph
    const article = chunk.article || chunk.artikel || chunk.paragraph || chunk.section_title
    if (article) result.article = String(article)
    // Section
    const section = chunk.section || chunk.chapter || chunk.abschnitt || chunk.kapitel
    if (section) result.section = String(section)
    // Pages
    const pages = chunk.pages as number[] | undefined
    if (Array.isArray(pages) && pages.length > 0) {
      result.pages = pages.length === 1 ? `S. ${pages[0]}` : `S. ${pages[0]}-${pages[pages.length - 1]}`
    } else if (chunk.page) {
      result.pages = `S. ${chunk.page}`
    }
    return result
  }

  // Overlap extraction
  const getOverlapPrev = (): string => {
    if (!prevChunk) return ''
    const text = getChunkText(prevChunk)
    return text.length > 150 ? '...' + text.slice(-150) : text
  }

  const getOverlapNext = (): string => {
    if (!nextChunk) return ''
    const text = getChunkText(nextChunk)
    return text.length > 150 ? text.slice(0, 150) + '...' : text
  }

  // Filter sidebar items
  const filteredRegulations = React.useMemo(() => {
    if (!filterSearch.trim()) return groupedRegulations
    const term = filterSearch.toLowerCase()
    const filtered: typeof groupedRegulations = {
      eu_regulation: [], eu_directive: [], de_law: [], at_law: [], ch_law: [],
      national_law: [], bsi_standard: [], eu_guideline: [], international_standard: [], other: [],
    }
    for (const [group, items] of Object.entries(groupedRegulations)) {
      filtered[group as RegGroupKey] = items.filter(
        r => r.code.toLowerCase().includes(term) || r.name.toLowerCase().includes(term)
      )
    }
    return filtered
  }, [groupedRegulations, filterSearch])

  // Regulation name lookup
  const getRegName = (code: string): string => {
    const reg = REGULATION_INFO.find(r => r.code === code)
    return reg?.name || code
  }

  // Important metadata keys to show prominently
  const STRUCTURAL_KEYS = new Set([
    'article', 'artikel', 'paragraph', 'section_title', 'section', 'chapter',
    'abschnitt', 'kapitel', 'pages', 'page',
  ])
  const HIDDEN_KEYS = new Set([
    'text', 'content', 'chunk_text', 'id', 'embedding',
  ])

  const structInfo = getStructuralInfo(currentChunk)

  return (
    <div
      className={`flex flex-col ${fullscreen ? 'fixed inset-0 z-50 bg-slate-100 p-4' : ''}`}
      style={fullscreen ? { height: '100vh' } : { height: 'calc(100vh - 220px)' }}
    >
      {/* Header bar: fixed height */}
      <div className="flex-shrink-0 bg-white rounded-xl border border-slate-200 p-3 mb-3">
        <div className="flex flex-wrap items-center gap-4">
          <div>
            <label className="block text-xs font-medium text-slate-500 mb-1">Collection</label>
            <select
              value={collection}
              onChange={(e) => handleCollectionChange(e.target.value)}
              className="px-3 py-1.5 border rounded-lg text-sm focus:ring-2 focus:ring-teal-500"
            >
              {COLLECTIONS.map(c => (
                <option key={c} value={c}>{c}</option>
              ))}
            </select>
          </div>

          {selectedRegulation && (
            <>
              <div className="flex items-center gap-2">
                <span className="text-sm font-semibold text-slate-900">
                  {selectedRegulation} — {getRegName(selectedRegulation)}
                </span>
                {structInfo.article && (
                  <span className="px-2 py-0.5 bg-blue-100 text-blue-800 text-xs font-medium rounded">
                    {structInfo.article}
                  </span>
                )}
                {structInfo.pages && (
                  <span className="px-2 py-0.5 bg-slate-100 text-slate-600 text-xs rounded">
                    {structInfo.pages}
                  </span>
                )}
              </div>
              <div className="flex items-center gap-2 ml-auto">
                <button
                  onClick={handlePrev}
                  disabled={docChunkIndex === 0}
                  className="px-3 py-1.5 text-sm font-medium border rounded-lg bg-white hover:bg-slate-50 disabled:opacity-30 disabled:cursor-not-allowed"
                >
                  ◀ Zurueck
                </button>
                <span className="text-sm font-mono text-slate-600 min-w-[80px] text-center">
                  {docChunkIndex + 1} / {docTotalChunks}
                </span>
                <button
                  onClick={handleNext}
                  disabled={docChunkIndex >= docChunks.length - 1}
                  className="px-3 py-1.5 text-sm font-medium border rounded-lg bg-white hover:bg-slate-50 disabled:opacity-30 disabled:cursor-not-allowed"
                >
                  Weiter ▶
                </button>
                <input
                  type="number"
                  min={1}
                  max={docTotalChunks}
                  value={docChunkIndex + 1}
                  onChange={(e) => {
                    const v = parseInt(e.target.value, 10)
                    if (!isNaN(v) && v >= 1 && v <= docTotalChunks) setDocChunkIndex(v - 1)
                  }}
                  className="w-16 px-2 py-1 border rounded text-xs text-center"
                  title="Springe zu Chunk Nr."
                />
              </div>
              <div className="flex items-center gap-2">
                <label className="text-xs text-slate-500">Chunks/Seite:</label>
                <select
                  value={chunksPerPage}
                  onChange={(e) => setChunksPerPage(Number(e.target.value))}
                  className="px-2 py-1 border rounded text-xs"
                >
                  {[3, 4, 5, 6, 8, 10, 12, 15, 20].map(n => (
                    <option key={n} value={n}>{n}</option>
                  ))}
                </select>
                <button
                  onClick={() => setSplitViewActive(!splitViewActive)}
                  className={`px-3 py-1 text-xs rounded-lg border ${
                    splitViewActive ? 'bg-teal-50 border-teal-300 text-teal-700' : 'bg-slate-50 border-slate-300 text-slate-600'
                  }`}
                >
                  {splitViewActive ? 'Split-View an' : 'Split-View aus'}
                </button>
                <button
                  onClick={() => setFullscreen(!fullscreen)}
                  className={`px-3 py-1 text-xs rounded-lg border ${
                    fullscreen ? 'bg-indigo-50 border-indigo-300 text-indigo-700' : 'bg-slate-50 border-slate-300 text-slate-600'
                  }`}
                  title={fullscreen ? 'Vollbild beenden (Esc)' : 'Vollbild'}
                >
                  {fullscreen ? '✕ Vollbild beenden' : '⛶ Vollbild'}
                </button>
              </div>
            </>
          )}
        </div>
      </div>

      {/* Main content: sidebar + content, fills remaining height */}
      <div className="flex gap-3 flex-1 min-h-0">
        {/* Sidebar: scrollable */}
        <div className="w-56 flex-shrink-0 bg-white rounded-xl border border-slate-200 flex flex-col min-h-0">
          <div className="flex-shrink-0 p-3 border-b border-slate-100">
            <input
              type="text"
              value={filterSearch}
              onChange={(e) => setFilterSearch(e.target.value)}
              placeholder="Suche..."
              className="w-full px-2 py-1.5 border rounded-lg text-sm focus:ring-2 focus:ring-teal-500"
            />
            {countsLoading && (
              <div className="text-xs text-slate-400 mt-1 animate-pulse">Counts laden...</div>
            )}
          </div>
          <div className="flex-1 overflow-y-auto min-h-0">
            {GROUP_ORDER.map(group => {
              const items = filteredRegulations[group]
              if (items.length === 0) return null
              const isCollapsed = collapsedGroups.has(group)
              return (
                <div key={group}>
                  <button
                    onClick={() => toggleGroup(group)}
                    className="w-full px-3 py-1.5 text-left text-xs font-semibold text-slate-500 bg-slate-50 hover:bg-slate-100 flex items-center justify-between sticky top-0 z-10"
                  >
                    <span>{GROUP_LABELS[group]}</span>
                    <span className="text-slate-400">{isCollapsed ? '+' : '-'}</span>
                  </button>
                  {!isCollapsed && items.map(reg => {
                    const count = regulationCounts[reg.code] ?? 0
                    const isSelected = selectedRegulation === reg.code
                    return (
                      <button
                        key={reg.code}
                        onClick={() => handleSelectRegulation(reg.code)}
                        className={`w-full px-3 py-1.5 text-left text-sm flex items-center justify-between hover:bg-teal-50 transition-colors ${
                          isSelected ? 'bg-teal-100 text-teal-900 font-medium' : 'text-slate-700'
                        }`}
                      >
                        <span className="truncate text-xs">{reg.name || reg.code}</span>
                        <span className={`text-xs tabular-nums flex-shrink-0 ml-1 ${count > 0 ? 'text-slate-500' : 'text-slate-300'}`}>
                          {count > 0 ? count.toLocaleString() : '—'}
                        </span>
                      </button>
                    )
                  })}
                </div>
              )
            })}
          </div>
        </div>

        {/* Content area: fills remaining width and height */}
        {!selectedRegulation ? (
          <div className="flex-1 flex items-center justify-center bg-white rounded-xl border border-slate-200">
            <div className="text-center text-slate-400 space-y-2">
              <div className="text-4xl">🔍</div>
              <p className="text-sm">Dokument in der Sidebar auswaehlen, um QA zu starten.</p>
              <p className="text-xs text-slate-300">Pfeiltasten: Chunk vor/zurueck</p>
            </div>
          </div>
        ) : docLoading ? (
          <div className="flex-1 flex items-center justify-center bg-white rounded-xl border border-slate-200">
            <div className="text-center text-slate-500 space-y-2">
              <div className="animate-spin text-3xl">⚙</div>
              <p className="text-sm">Chunks werden geladen...</p>
              <p className="text-xs text-slate-400">
                {selectedRegulation}: {REGULATIONS_IN_RAG[selectedRegulation]?.chunks.toLocaleString() || '?'} Chunks erwartet
              </p>
            </div>
          </div>
        ) : (
          <div className={`flex-1 grid gap-3 min-h-0 ${splitViewActive ? 'grid-cols-2' : 'grid-cols-1'}`}>
            {/* Chunk-Text Panel: fixed height, internal scroll */}
            <div className="bg-white rounded-xl border border-slate-200 flex flex-col min-h-0 overflow-hidden">
              {/* Panel header */}
              <div className="flex-shrink-0 px-4 py-2 bg-slate-50 border-b border-slate-100 flex items-center justify-between">
                <span className="text-sm font-medium text-slate-700">Chunk-Text</span>
                <div className="flex items-center gap-2">
                  {structInfo.article && (
                    <span className="px-2 py-0.5 bg-blue-50 text-blue-700 text-xs font-medium rounded border border-blue-200">
                      {structInfo.article}
                    </span>
                  )}
                  {structInfo.section && (
                    <span className="px-2 py-0.5 bg-purple-50 text-purple-700 text-xs rounded border border-purple-200">
                      {structInfo.section}
                    </span>
                  )}
                  <span className="text-xs text-slate-400 tabular-nums">
                    #{docChunkIndex} / {docTotalChunks - 1}
                  </span>
                </div>
              </div>

              {/* Scrollable content */}
              <div className="flex-1 overflow-y-auto min-h-0 p-4 space-y-3">
                {/* Overlap from previous chunk */}
                {prevChunk && (
                  <div className="text-xs text-slate-400 bg-amber-50 border-l-2 border-amber-300 px-3 py-2 rounded-r">
                    <div className="font-medium text-amber-600 mb-1">↑ Ende vorheriger Chunk #{docChunkIndex - 1}</div>
                    <p className="whitespace-pre-wrap break-words leading-relaxed">{getOverlapPrev()}</p>
                  </div>
                )}

                {/* Current chunk text */}
                {currentChunk ? (
                  <div className="text-sm text-slate-800 whitespace-pre-wrap break-words leading-relaxed border-l-2 border-teal-400 pl-3">
                    {getChunkText(currentChunk)}
                  </div>
                ) : (
                  <div className="text-sm text-slate-400 italic">Kein Chunk-Text vorhanden.</div>
                )}

                {/* Overlap from next chunk */}
                {nextChunk && (
                  <div className="text-xs text-slate-400 bg-amber-50 border-l-2 border-amber-300 px-3 py-2 rounded-r">
                    <div className="font-medium text-amber-600 mb-1">↓ Anfang naechster Chunk #{docChunkIndex + 1}</div>
                    <p className="whitespace-pre-wrap break-words leading-relaxed">{getOverlapNext()}</p>
                  </div>
                )}

                {/* Metadata */}
                {currentChunk && (
                  <div className="mt-4 pt-3 border-t border-slate-100">
                    <div className="text-xs font-medium text-slate-500 mb-2">Metadaten</div>
                    <div className="grid grid-cols-2 gap-x-4 gap-y-1 text-xs">
                      {Object.entries(currentChunk)
                        .filter(([k]) => !HIDDEN_KEYS.has(k))
                        .sort(([a], [b]) => {
                          // Structural keys first
                          const aStruct = STRUCTURAL_KEYS.has(a) ? 0 : 1
                          const bStruct = STRUCTURAL_KEYS.has(b) ? 0 : 1
                          return aStruct - bStruct || a.localeCompare(b)
                        })
                        .map(([k, v]) => (
                          <div key={k} className={`flex gap-1 ${STRUCTURAL_KEYS.has(k) ? 'col-span-2 font-medium' : ''}`}>
                            <span className="font-medium text-slate-500 flex-shrink-0">{k}:</span>
                            <span className="text-slate-700 break-all">
                              {Array.isArray(v) ? v.join(', ') : String(v)}
                            </span>
                          </div>
                        ))}
                    </div>
                    {/* Chunk quality indicator */}
                    <div className="mt-3 pt-2 border-t border-slate-50">
                      <div className="text-xs text-slate-400">
                        Chunk-Laenge: {getChunkText(currentChunk).length} Zeichen
                        {getChunkText(currentChunk).length < 50 && (
                          <span className="ml-2 text-orange-500 font-medium">⚠ Sehr kurz</span>
                        )}
                        {getChunkText(currentChunk).length > 2000 && (
                          <span className="ml-2 text-orange-500 font-medium">⚠ Sehr lang</span>
                        )}
                      </div>
                    </div>
                  </div>
                )}
              </div>
            </div>

{/* PDF-Viewer Panel */}
|
|
||||||
{splitViewActive && (
|
|
||||||
<div className="bg-white rounded-xl border border-slate-200 flex flex-col min-h-0 overflow-hidden">
|
|
||||||
<div className="flex-shrink-0 px-4 py-2 bg-slate-50 border-b border-slate-100 flex items-center justify-between">
|
|
||||||
<span className="text-sm font-medium text-slate-700">Original-PDF</span>
|
|
||||||
<div className="flex items-center gap-2">
|
|
||||||
<span className="text-xs text-slate-400">
|
|
||||||
Seite ~{pdfPage}
|
|
||||||
{pdfMapping?.totalPages ? ` / ${pdfMapping.totalPages}` : ''}
|
|
||||||
</span>
|
|
||||||
{pdfUrl && (
|
|
||||||
<a
|
|
||||||
href={pdfUrl.split('#')[0]}
|
|
||||||
target="_blank"
|
|
||||||
rel="noopener noreferrer"
|
|
||||||
className="text-xs text-teal-600 hover:text-teal-800 underline"
|
|
||||||
>
|
|
||||||
Oeffnen ↗
|
|
||||||
</a>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
<div className="flex-1 min-h-0 relative">
|
|
||||||
{pdfUrl && pdfExists ? (
|
|
||||||
<iframe
|
|
||||||
key={`${selectedRegulation}-${pdfPage}`}
|
|
||||||
src={pdfUrl}
|
|
||||||
className="absolute inset-0 w-full h-full border-0"
|
|
||||||
title="Original PDF"
|
|
||||||
/>
|
|
||||||
) : (
|
|
||||||
<div className="flex items-center justify-center h-full text-slate-400 text-sm p-4">
|
|
||||||
<div className="text-center space-y-2">
|
|
||||||
<div className="text-3xl">📄</div>
|
|
||||||
{!pdfMapping ? (
|
|
||||||
<>
|
|
||||||
<p>Kein PDF-Mapping fuer {selectedRegulation}.</p>
|
|
||||||
<p className="text-xs">rag-pdf-mapping.ts ergaenzen.</p>
|
|
||||||
</>
|
|
||||||
) : pdfExists === false ? (
|
|
||||||
<>
|
|
||||||
<p className="font-medium text-orange-600">PDF nicht vorhanden</p>
|
|
||||||
<p className="text-xs">Datei <code className="bg-slate-100 px-1 rounded">{pdfMapping.filename}</code> fehlt in ~/rag-originals/</p>
|
|
||||||
<p className="text-xs mt-1">Bitte manuell herunterladen und dort ablegen.</p>
|
|
||||||
</>
|
|
||||||
) : (
|
|
||||||
<p>PDF wird geprueft...</p>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
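The metadata grid in the component above drops hidden keys, lists structural keys first, and sorts the remaining keys alphabetically. A minimal standalone sketch of that ordering follows. The contents of `HIDDEN_KEYS` and `STRUCTURAL_KEYS` are not visible in this diff, so the sets below are placeholder assumptions, and `orderMetadata` is a hypothetical helper name.

```typescript
// Placeholder sets (assumption): the real HIDDEN_KEYS / STRUCTURAL_KEYS
// are defined outside the lines shown in this diff.
const HIDDEN_KEYS = new Set<string>(['text'])
const STRUCTURAL_KEYS = new Set<string>(['article', 'section'])

// Hypothetical helper: returns [key, displayValue] pairs in render order.
function orderMetadata(chunk: Record<string, unknown>): Array<[string, string]> {
  return Object.entries(chunk)
    .filter(([k]) => !HIDDEN_KEYS.has(k))
    .sort(([a], [b]) => {
      // Structural keys get rank 0, everything else rank 1.
      const aStruct = STRUCTURAL_KEYS.has(a) ? 0 : 1
      const bStruct = STRUCTURAL_KEYS.has(b) ? 0 : 1
      // Equal ranks fall through to an alphabetical comparison.
      return aStruct - bStruct || a.localeCompare(b)
    })
    .map(([k, v]): [string, string] => [
      k,
      // Arrays are joined for display, everything else is stringified.
      Array.isArray(v) ? v.join(', ') : String(v),
    ])
}
```

Pulling the ordering into a plain function like this would make it testable without rendering the component.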
|
|
||||||
@@ -1,126 +0,0 @@
|
|||||||
export interface RagPdfMapping {
|
|
||||||
filename: string
|
|
||||||
totalPages?: number
|
|
||||||
chunksPerPage?: number
|
|
||||||
language: string
|
|
||||||
}
|
|
||||||
|
|
||||||
export const RAG_PDF_MAPPING: Record<string, RagPdfMapping> = {
|
|
||||||
// EU Verordnungen
|
|
||||||
GDPR: { filename: 'GDPR_DE.pdf', language: 'de', totalPages: 88 },
|
|
||||||
EPRIVACY: { filename: 'EPRIVACY_DE.pdf', language: 'de' },
|
|
||||||
SCC: { filename: 'SCC_DE.pdf', language: 'de' },
|
|
||||||
SCC_FULL_TEXT: { filename: 'SCC_FULL_TEXT_DE.pdf', language: 'de' },
|
|
||||||
AIACT: { filename: 'AIACT_DE.pdf', language: 'de', totalPages: 144 },
|
|
||||||
CRA: { filename: 'CRA_DE.pdf', language: 'de' },
|
|
||||||
NIS2: { filename: 'NIS2_DE.pdf', language: 'de' },
|
|
||||||
DGA: { filename: 'DGA_DE.pdf', language: 'de' },
|
|
||||||
DSA: { filename: 'DSA_DE.pdf', language: 'de' },
|
|
||||||
PLD: { filename: 'PLD_DE.pdf', language: 'de' },
|
|
||||||
E_COMMERCE_RL: { filename: 'E_COMMERCE_RL_DE.pdf', language: 'de' },
|
|
||||||
VERBRAUCHERRECHTE_RL: { filename: 'VERBRAUCHERRECHTE_RL_DE.pdf', language: 'de' },
|
|
||||||
DIGITALE_INHALTE_RL: { filename: 'DIGITALE_INHALTE_RL_DE.pdf', language: 'de' },
|
|
||||||
DMA: { filename: 'DMA_DE.pdf', language: 'de' },
|
|
||||||
DPF: { filename: 'DPF_DE.pdf', language: 'de' },
|
|
||||||
EUCSA: { filename: 'EUCSA_DE.pdf', language: 'de' },
|
|
||||||
DATAACT: { filename: 'DATAACT_DE.pdf', language: 'de' },
|
|
||||||
DORA: { filename: 'DORA_DE.pdf', language: 'de' },
|
|
||||||
PSD2: { filename: 'PSD2_DE.pdf', language: 'de' },
|
|
||||||
AMLR: { filename: 'AMLR_DE.pdf', language: 'de' },
|
|
||||||
MiCA: { filename: 'MiCA_DE.pdf', language: 'de' },
|
|
||||||
EHDS: { filename: 'EHDS_DE.pdf', language: 'de' },
|
|
||||||
EAA: { filename: 'EAA_DE.pdf', language: 'de' },
|
|
||||||
DSM: { filename: 'DSM_DE.pdf', language: 'de' },
|
|
||||||
GPSR: { filename: 'GPSR_DE.pdf', language: 'de' },
|
|
||||||
MACHINERY_REG: { filename: 'MACHINERY_REG_DE.pdf', language: 'de' },
|
|
||||||
BLUE_GUIDE: { filename: 'BLUE_GUIDE_DE.pdf', language: 'de' },
|
|
||||||
// DE Gesetze
|
|
||||||
TDDDG: { filename: 'TDDDG_DE.pdf', language: 'de' },
|
|
||||||
BDSG_FULL: { filename: 'BDSG_FULL_DE.pdf', language: 'de' },
|
|
||||||
DE_DDG: { filename: 'DE_DDG.pdf', language: 'de' },
|
|
||||||
DE_BGB_AGB: { filename: 'DE_BGB_AGB.pdf', language: 'de' },
|
|
||||||
DE_EGBGB: { filename: 'DE_EGBGB.pdf', language: 'de' },
|
|
||||||
DE_HGB_RET: { filename: 'DE_HGB_RET.pdf', language: 'de' },
|
|
||||||
DE_AO_RET: { filename: 'DE_AO_RET.pdf', language: 'de' },
|
|
||||||
DE_UWG: { filename: 'DE_UWG.pdf', language: 'de' },
|
|
||||||
DE_TKG: { filename: 'DE_TKG.pdf', language: 'de' },
|
|
||||||
DE_PANGV: { filename: 'DE_PANGV.pdf', language: 'de' },
|
|
||||||
DE_DLINFOV: { filename: 'DE_DLINFOV.pdf', language: 'de' },
|
|
||||||
DE_BETRVG: { filename: 'DE_BETRVG.pdf', language: 'de' },
|
|
||||||
DE_GESCHGEHG: { filename: 'DE_GESCHGEHG.pdf', language: 'de' },
|
|
||||||
DE_BSIG: { filename: 'DE_BSIG.pdf', language: 'de' },
|
|
||||||
DE_USTG_RET: { filename: 'DE_USTG_RET.pdf', language: 'de' },
|
|
||||||
// BSI Standards
|
|
||||||
'BSI-TR-03161-1': { filename: 'BSI-TR-03161-1.pdf', language: 'de' },
|
|
||||||
'BSI-TR-03161-2': { filename: 'BSI-TR-03161-2.pdf', language: 'de' },
|
|
||||||
'BSI-TR-03161-3': { filename: 'BSI-TR-03161-3.pdf', language: 'de' },
|
|
||||||
// AT Gesetze
|
|
||||||
AT_DSG: { filename: 'AT_DSG.pdf', language: 'de' },
|
|
||||||
AT_DSG_FULL: { filename: 'AT_DSG_FULL.pdf', language: 'de' },
|
|
||||||
AT_ECG: { filename: 'AT_ECG.pdf', language: 'de' },
|
|
||||||
AT_TKG: { filename: 'AT_TKG.pdf', language: 'de' },
|
|
||||||
AT_KSCHG: { filename: 'AT_KSCHG.pdf', language: 'de' },
|
|
||||||
AT_FAGG: { filename: 'AT_FAGG.pdf', language: 'de' },
|
|
||||||
AT_UGB_RET: { filename: 'AT_UGB_RET.pdf', language: 'de' },
|
|
||||||
AT_BAO_RET: { filename: 'AT_BAO_RET.pdf', language: 'de' },
|
|
||||||
AT_MEDIENG: { filename: 'AT_MEDIENG.pdf', language: 'de' },
|
|
||||||
AT_ABGB_AGB: { filename: 'AT_ABGB_AGB.pdf', language: 'de' },
|
|
||||||
AT_UWG: { filename: 'AT_UWG.pdf', language: 'de' },
|
|
||||||
// CH Gesetze
|
|
||||||
CH_DSG: { filename: 'CH_DSG.pdf', language: 'de' },
|
|
||||||
CH_DSV: { filename: 'CH_DSV.pdf', language: 'de' },
|
|
||||||
CH_OR_AGB: { filename: 'CH_OR_AGB.pdf', language: 'de' },
|
|
||||||
CH_UWG: { filename: 'CH_UWG.pdf', language: 'de' },
|
|
||||||
CH_FMG: { filename: 'CH_FMG.pdf', language: 'de' },
|
|
||||||
CH_GEBUV: { filename: 'CH_GEBUV.pdf', language: 'de' },
|
|
||||||
CH_ZERTES: { filename: 'CH_ZERTES.pdf', language: 'de' },
|
|
||||||
CH_ZGB_PERS: { filename: 'CH_ZGB_PERS.pdf', language: 'de' },
|
|
||||||
// LI
|
|
||||||
LI_DSG: { filename: 'LI_DSG.pdf', language: 'de' },
|
|
||||||
// Nationale DSG (andere EU)
|
|
||||||
ES_LOPDGDD: { filename: 'ES_LOPDGDD.pdf', language: 'es' },
|
|
||||||
IT_CODICE_PRIVACY: { filename: 'IT_CODICE_PRIVACY.pdf', language: 'it' },
|
|
||||||
NL_UAVG: { filename: 'NL_UAVG.pdf', language: 'nl' },
|
|
||||||
FR_CNIL_GUIDE: { filename: 'FR_CNIL_GUIDE.pdf', language: 'fr' },
|
|
||||||
IE_DPA_2018: { filename: 'IE_DPA_2018.pdf', language: 'en' },
|
|
||||||
UK_DPA_2018: { filename: 'UK_DPA_2018.pdf', language: 'en' },
|
|
||||||
UK_GDPR: { filename: 'UK_GDPR.pdf', language: 'en' },
|
|
||||||
NO_PERSONOPPLYSNINGSLOVEN: { filename: 'NO_PERSONOPPLYSNINGSLOVEN.pdf', language: 'no' },
|
|
||||||
SE_DATASKYDDSLAG: { filename: 'SE_DATASKYDDSLAG.pdf', language: 'sv' },
|
|
||||||
PL_UODO: { filename: 'PL_UODO.pdf', language: 'pl' },
|
|
||||||
CZ_ZOU: { filename: 'CZ_ZOU.pdf', language: 'cs' },
|
|
||||||
HU_INFOTV: { filename: 'HU_INFOTV.pdf', language: 'hu' },
|
|
||||||
BE_DPA_LAW: { filename: 'BE_DPA_LAW.pdf', language: 'nl' },
|
|
||||||
FI_TIETOSUOJALAKI: { filename: 'FI_TIETOSUOJALAKI.pdf', language: 'fi' },
|
|
||||||
DK_DATABESKYTTELSESLOVEN: { filename: 'DK_DATABESKYTTELSESLOVEN.pdf', language: 'da' },
|
|
||||||
LU_DPA_LAW: { filename: 'LU_DPA_LAW.pdf', language: 'fr' },
|
|
||||||
// DE Gesetze (zusaetzlich)
|
|
||||||
TMG_KOMPLETT: { filename: 'TMG_KOMPLETT.pdf', language: 'de' },
|
|
||||||
DE_URHG: { filename: 'DE_URHG.pdf', language: 'de' },
|
|
||||||
// EDPB Guidelines
|
|
||||||
EDPB_GUIDELINES_5_2020: { filename: 'EDPB_GUIDELINES_5_2020.pdf', language: 'en' },
|
|
||||||
EDPB_GUIDELINES_7_2020: { filename: 'EDPB_GUIDELINES_7_2020.pdf', language: 'en' },
|
|
||||||
EDPB_GUIDELINES_1_2020: { filename: 'EDPB_GUIDELINES_1_2020.pdf', language: 'en' },
|
|
||||||
EDPB_GUIDELINES_1_2022: { filename: 'EDPB_GUIDELINES_1_2022.pdf', language: 'en' },
|
|
||||||
EDPB_GUIDELINES_2_2023: { filename: 'EDPB_GUIDELINES_2_2023.pdf', language: 'en' },
|
|
||||||
EDPB_GUIDELINES_2_2024: { filename: 'EDPB_GUIDELINES_2_2024.pdf', language: 'en' },
|
|
||||||
EDPB_GUIDELINES_4_2019: { filename: 'EDPB_GUIDELINES_4_2019.pdf', language: 'en' },
|
|
||||||
EDPB_GUIDELINES_9_2022: { filename: 'EDPB_GUIDELINES_9_2022.pdf', language: 'en' },
|
|
||||||
EDPB_DPIA_LIST: { filename: 'EDPB_DPIA_LIST.pdf', language: 'en' },
|
|
||||||
EDPB_LEGITIMATE_INTEREST: { filename: 'EDPB_LEGITIMATE_INTEREST.pdf', language: 'en' },
|
|
||||||
// EDPS
|
|
||||||
EDPS_DPIA_LIST: { filename: 'EDPS_DPIA_LIST.pdf', language: 'en' },
|
|
||||||
// Frameworks
|
|
||||||
ENISA_SECURE_BY_DESIGN: { filename: 'ENISA_SECURE_BY_DESIGN.pdf', language: 'en' },
|
|
||||||
ENISA_SUPPLY_CHAIN: { filename: 'ENISA_SUPPLY_CHAIN.pdf', language: 'en' },
|
|
||||||
ENISA_THREAT_LANDSCAPE: { filename: 'ENISA_THREAT_LANDSCAPE.pdf', language: 'en' },
|
|
||||||
ENISA_ICS_SCADA: { filename: 'ENISA_ICS_SCADA.pdf', language: 'en' },
|
|
||||||
ENISA_CYBERSECURITY_2024: { filename: 'ENISA_CYBERSECURITY_2024.pdf', language: 'en' },
|
|
||||||
NIST_SSDF: { filename: 'NIST_SSDF.pdf', language: 'en' },
|
|
||||||
NIST_CSF_2: { filename: 'NIST_CSF_2.pdf', language: 'en' },
|
|
||||||
OECD_AI_PRINCIPLES: { filename: 'OECD_AI_PRINCIPLES.pdf', language: 'en' },
|
|
||||||
// EU-IFRS / EFRAG
|
|
||||||
EU_IFRS_DE: { filename: 'EU_IFRS_DE.pdf', language: 'de' },
|
|
||||||
EU_IFRS_EN: { filename: 'EU_IFRS_EN.pdf', language: 'en' },
|
|
||||||
EFRAG_ENDORSEMENT: { filename: 'EFRAG_ENDORSEMENT.pdf', language: 'en' },
|
|
||||||
}
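The component shows an approximate page ("Seite ~{pdfPage}") and strips a `#` fragment from `pdfUrl`, which suggests the page is estimated from the chunk position and passed as a URL fragment. The sketch below shows one way that could work with the `RagPdfMapping` fields. The estimation formula, the default of three chunks per page, the base path, and the helper names `estimatePdfPage` and `buildPdfUrl` are all assumptions; the actual logic is not part of this diff.

```typescript
interface RagPdfMapping {
  filename: string
  totalPages?: number
  chunksPerPage?: number
  language: string
}

// Assumption: fallback used when a mapping entry has no chunksPerPage.
const DEFAULT_CHUNKS_PER_PAGE = 3

// Hypothetical helper: maps a zero-based chunk index to a 1-based page,
// clamped to totalPages when that value is known.
function estimatePdfPage(chunkIndex: number, mapping: RagPdfMapping): number {
  const perPage = mapping.chunksPerPage ?? DEFAULT_CHUNKS_PER_PAGE
  const page = Math.floor(chunkIndex / perPage) + 1
  return mapping.totalPages ? Math.min(page, mapping.totalPages) : page
}

// Hypothetical helper: appends the page as a "#page=N" fragment.
function buildPdfUrl(baseUrl: string, mapping: RagPdfMapping, page: number): string {
  return `${baseUrl}/${encodeURIComponent(mapping.filename)}#page=${page}`
}

// Example with the GDPR entry from the table above.
const gdpr: RagPdfMapping = { filename: 'GDPR_DE.pdf', language: 'de', totalPages: 88 }
const gdprUrl = buildPdfUrl('/pdfs', gdpr, estimatePdfPage(10, gdpr))
```

Because the estimate is only approximate, clamping to `totalPages` keeps the viewer from requesting a page past the end of the document.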
|
|
@@ -11,8 +11,6 @@ import React, { useState, useEffect, useCallback } from 'react'
 import Link from 'next/link'
 import { PagePurpose } from '@/components/common/PagePurpose'
 import { AIModuleSidebarResponsive } from '@/components/ai/AIModuleSidebar'
-import { REGULATIONS_IN_RAG } from './rag-constants'
-import { ChunkBrowserQA } from './components/ChunkBrowserQA'
 
 // API uses local proxy route to klausur-service
 const API_PROXY = '/api/legal-corpus'
@@ -75,7 +73,7 @@ interface DsfaCorpusStatus {
 type RegulationCategory = 'regulations' | 'dsfa' | 'nibis' | 'templates'
 
 // Tab definitions
-type TabId = 'overview' | 'regulations' | 'map' | 'search' | 'chunks' | 'data' | 'ingestion' | 'pipeline'
+type TabId = 'overview' | 'regulations' | 'map' | 'search' | 'data' | 'ingestion' | 'pipeline'
 
 // Custom document type
 interface CustomDocument {
||||||
@@ -1013,264 +1011,8 @@ const REGULATIONS = [
|
|||||||
keyTopics: ['Bussgeldberechnung', 'Schweregrad', 'Milderungsgruende', 'Bussgeldrahmen'],
|
keyTopics: ['Bussgeldberechnung', 'Schweregrad', 'Milderungsgruende', 'Bussgeldrahmen'],
|
||||||
effectiveDate: '2022'
|
effectiveDate: '2022'
|
||||||
},
|
},
|
||||||
// =====================================================================
|
|
||||||
// Neu ingestierte EU-Richtlinien (Februar 2026)
|
|
||||||
// =====================================================================
|
|
||||||
{
|
|
||||||
code: 'E_COMMERCE_RL',
|
|
||||||
name: 'E-Commerce-Richtlinie',
|
|
||||||
fullName: 'Richtlinie 2000/31/EG ueber den elektronischen Geschaeftsverkehr',
|
|
||||||
type: 'eu_directive',
|
|
||||||
expected: 30,
|
|
||||||
description: 'EU-Richtlinie ueber den elektronischen Geschaeftsverkehr (E-Commerce). Regelt Herkunftslandprinzip, Informationspflichten, Haftungsprivilegien fuer Vermittler (Mere Conduit, Caching, Hosting).',
|
|
||||||
relevantFor: ['Online-Dienste', 'E-Commerce', 'Hosting-Anbieter', 'Plattformen'],
|
|
||||||
keyTopics: ['Herkunftslandprinzip', 'Haftungsprivileg', 'Informationspflichten', 'Spam-Verbot', 'Vermittlerhaftung'],
|
|
||||||
effectiveDate: '17. Juli 2000'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'VERBRAUCHERRECHTE_RL',
|
|
||||||
name: 'Verbraucherrechte-Richtlinie',
|
|
||||||
fullName: 'Richtlinie 2011/83/EU ueber die Rechte der Verbraucher',
|
|
||||||
type: 'eu_directive',
|
|
||||||
expected: 25,
|
|
||||||
description: 'EU-weite Harmonisierung der Verbraucherrechte bei Fernabsatz und aussergeschaeftlichen Vertraegen. 14-Tage-Widerrufsrecht, Informationspflichten, digitale Inhalte.',
|
|
||||||
relevantFor: ['Online-Shops', 'E-Commerce', 'Fernabsatz', 'Dienstleister'],
|
|
||||||
keyTopics: ['Widerrufsrecht 14 Tage', 'Informationspflichten', 'Fernabsatzvertraege', 'Digitale Inhalte'],
|
|
||||||
effectiveDate: '13. Juni 2014'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'DIGITALE_INHALTE_RL',
|
|
||||||
name: 'Digitale-Inhalte-Richtlinie',
|
|
||||||
fullName: 'Richtlinie (EU) 2019/770 ueber digitale Inhalte und Dienstleistungen',
|
|
||||||
type: 'eu_directive',
|
|
||||||
expected: 20,
|
|
||||||
description: 'Gewaehrleistungsrecht fuer digitale Inhalte und Dienstleistungen. Regelt Maengelhaftung, Updates, Vertragsmaessigkeit und Kuendigungsrechte bei digitalen Produkten.',
|
|
||||||
relevantFor: ['SaaS-Anbieter', 'App-Entwickler', 'Cloud-Dienste', 'Streaming-Anbieter', 'Software-Hersteller'],
|
|
||||||
keyTopics: ['Digitale Gewaehrleistung', 'Update-Pflicht', 'Vertragsmaessigkeit', 'Kuendigungsrecht', 'Datenportabilitaet'],
|
|
||||||
effectiveDate: '1. Januar 2022'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'DMA',
|
|
||||||
name: 'Digital Markets Act',
|
|
||||||
fullName: 'Verordnung (EU) 2022/1925 - Digital Markets Act',
|
|
||||||
type: 'eu_regulation',
|
|
||||||
expected: 50,
|
|
||||||
description: 'Reguliert digitale Gatekeeper-Plattformen. Stellt Verhaltensregeln fuer grosse Plattformen auf (Apple, Google, Meta, Amazon, Microsoft). Verbietet Selbstbevorzugung und erzwingt Interoperabilitaet.',
|
|
||||||
relevantFor: ['Grosse Plattformen', 'App-Stores', 'Suchmaschinen', 'Social Media', 'Messenger-Dienste'],
|
|
||||||
keyTopics: ['Gatekeeper-Pflichten', 'Interoperabilitaet', 'Selbstbevorzugung', 'App-Store-Regeln', 'Datenportabilitaet'],
|
|
||||||
effectiveDate: '2. Mai 2023'
|
|
||||||
},
|
|
||||||
// === Industrie-Compliance (2026-02-28) ===
|
|
||||||
{
|
|
||||||
code: 'MACHINERY_REG',
|
|
||||||
name: 'Maschinenverordnung',
|
|
||||||
fullName: 'Verordnung (EU) 2023/1230 ueber Maschinen (Machinery Regulation)',
|
|
||||||
type: 'eu_regulation',
|
|
||||||
expected: 100,
|
|
||||||
description: 'Loest die alte Maschinenrichtlinie 2006/42/EG ab. Regelt Sicherheitsanforderungen fuer Maschinen und zugehoerige Produkte, CE-Kennzeichnung, Konformitaetsbewertung und Marktaufsicht. Neu: Cybersecurity-Anforderungen fuer vernetzte Maschinen.',
|
|
||||||
relevantFor: ['Maschinenbau', 'Industrie 4.0', 'Automatisierung', 'Hersteller', 'Importeure'],
|
|
||||||
keyTopics: ['CE-Kennzeichnung', 'Konformitaetsbewertung', 'Risikobeurteilung', 'Cybersecurity', 'Betriebsanleitung'],
|
|
||||||
effectiveDate: '20. Januar 2027'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'BLUE_GUIDE',
|
|
||||||
name: 'Blue Guide',
|
|
||||||
fullName: 'Leitfaden fuer die Umsetzung der EU-Produktvorschriften (Blue Guide 2022)',
|
|
||||||
type: 'eu_guideline',
|
|
||||||
expected: 200,
|
|
||||||
description: 'Umfassender Leitfaden der EU-Kommission zur Umsetzung von Produktvorschriften. Erklaert CE-Kennzeichnung, Konformitaetsbewertungsverfahren, notifizierte Stellen, Marktaufsicht und den New Legislative Framework.',
|
|
||||||
relevantFor: ['Hersteller', 'Importeure', 'Haendler', 'Notifizierte Stellen', 'Marktaufsichtsbehoerden'],
|
|
||||||
keyTopics: ['CE-Kennzeichnung', 'Konformitaetserklaerung', 'Notifizierte Stellen', 'Marktaufsicht', 'New Legislative Framework'],
|
|
||||||
effectiveDate: '29. Juni 2022'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'ENISA_SECURE_BY_DESIGN',
|
|
||||||
name: 'ENISA Secure by Design',
|
|
||||||
fullName: 'ENISA Secure Software Development Best Practices',
|
|
||||||
type: 'eu_guideline',
|
|
||||||
expected: 50,
|
|
||||||
description: 'ENISA-Leitfaden fuer sichere Softwareentwicklung. Beschreibt Best Practices fuer Security by Design, sichere Entwicklungsprozesse und Schwachstellenmanagement.',
|
|
||||||
relevantFor: ['Softwareentwickler', 'DevOps', 'IT-Sicherheit', 'Produktmanagement'],
|
|
||||||
keyTopics: ['Security by Design', 'SDLC', 'Schwachstellenmanagement', 'Secure Coding', 'Threat Modeling'],
|
|
||||||
effectiveDate: '2023'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'ENISA_SUPPLY_CHAIN',
|
|
||||||
name: 'ENISA Supply Chain Security',
|
|
||||||
fullName: 'ENISA Threat Landscape for Supply Chain Attacks',
|
|
||||||
type: 'eu_guideline',
|
|
||||||
expected: 50,
|
|
||||||
description: 'ENISA-Analyse der Bedrohungslandschaft fuer Supply-Chain-Angriffe. Beschreibt Angriffsvektoren, Taxonomie und Empfehlungen zur Absicherung von Software-Lieferketten.',
|
|
||||||
relevantFor: ['IT-Sicherheit', 'Beschaffung', 'Softwareentwickler', 'CISO'],
|
|
||||||
keyTopics: ['Supply Chain Security', 'SolarWinds', 'SBOM', 'Lieferantenrisiko', 'Third-Party Risk'],
|
|
||||||
effectiveDate: '2021'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'NIST_SSDF',
|
|
||||||
name: 'NIST SSDF',
|
|
||||||
fullName: 'NIST SP 800-218 — Secure Software Development Framework (SSDF)',
|
|
||||||
type: 'international_standard',
|
|
||||||
expected: 40,
|
|
||||||
description: 'NIST-Framework fuer sichere Softwareentwicklung. Definiert Praktiken und Aufgaben in vier Gruppen: Prepare, Protect, Produce, Respond. Weit verbreitet als Referenz fuer Software Supply Chain Security.',
|
|
||||||
relevantFor: ['Softwareentwickler', 'DevSecOps', 'IT-Sicherheit', 'Compliance-Manager'],
|
|
||||||
keyTopics: ['SSDF', 'Secure SDLC', 'Software Supply Chain', 'Vulnerability Management', 'Code Review'],
|
|
||||||
effectiveDate: '3. Februar 2022'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'NIST_CSF_2',
|
|
||||||
name: 'NIST CSF 2.0',
|
|
||||||
fullName: 'NIST Cybersecurity Framework (CSF) 2.0',
|
|
||||||
type: 'international_standard',
|
|
||||||
expected: 50,
|
|
||||||
description: 'Version 2.0 des NIST Cybersecurity Framework. Neue Kernfunktion "Govern" ergaenzt Identify, Protect, Detect, Respond, Recover. Erweitert den Anwendungsbereich ueber kritische Infrastruktur hinaus auf alle Organisationen.',
|
|
||||||
relevantFor: ['CISO', 'IT-Sicherheit', 'Risikomanagement', 'Geschaeftsfuehrung', 'Alle Branchen'],
|
|
||||||
keyTopics: ['Govern', 'Identify', 'Protect', 'Detect', 'Respond', 'Recover', 'Cybersecurity'],
|
|
||||||
effectiveDate: '26. Februar 2024'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'OECD_AI_PRINCIPLES',
|
|
||||||
name: 'OECD AI Principles',
|
|
||||||
fullName: 'OECD Recommendation on Artificial Intelligence (AI Principles)',
|
|
||||||
type: 'international_standard',
|
|
||||||
expected: 20,
|
|
||||||
description: 'OECD-Empfehlung zu Kuenstlicher Intelligenz. Definiert fuenf Prinzipien fuer verantwortungsvolle KI: Inklusives Wachstum, Menschenzentrierte Werte, Transparenz, Robustheit und Rechenschaftspflicht. Von 46 Laendern angenommen.',
|
|
||||||
relevantFor: ['KI-Entwickler', 'Policy-Maker', 'Ethik-Kommissionen', 'Geschaeftsfuehrung'],
|
|
||||||
keyTopics: ['AI Ethics', 'Transparenz', 'Accountability', 'Trustworthy AI', 'Human-Centered AI'],
|
|
||||||
effectiveDate: '22. Mai 2019'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'EU_IFRS',
|
|
||||||
name: 'EU-IFRS',
|
|
||||||
fullName: 'Verordnung (EU) 2023/1803 — International Financial Reporting Standards',
|
|
||||||
type: 'eu_regulation',
|
|
||||||
expected: 500,
|
|
||||||
description: 'Konsolidierte Fassung der von der EU uebernommenen IFRS/IAS/IFRIC/SIC. Rechtsverbindlich fuer boersennotierte EU-Unternehmen. Enthaelt IFRS 1-17, IAS 1-41, IFRIC 1-23 und SIC 7-32 in der EU-endorsed Fassung (Stand Okt 2023). ACHTUNG: Neuere IASB-Standards sind moeglicherweise noch nicht EU-endorsed.',
|
|
||||||
relevantFor: ['Rechnungswesen', 'Wirtschaftspruefer', 'boersennotierte Unternehmen', 'Finanzberichterstattung', 'CFO'],
|
|
||||||
keyTopics: ['IFRS 16 Leasing', 'IFRS 9 Finanzinstrumente', 'IAS 1 Darstellung', 'IFRS 15 Erloese', 'IFRS 17 Versicherungsvertraege', 'Konsolidierung'],
|
|
||||||
effectiveDate: '16. Oktober 2023'
|
|
||||||
},
|
|
||||||
{
|
|
||||||
code: 'EFRAG_ENDORSEMENT',
|
|
||||||
name: 'EFRAG Endorsement Status',
|
|
||||||
fullName: 'EFRAG EU Endorsement Status Report (Dezember 2025)',
|
|
||||||
type: 'eu_guideline',
|
|
||||||
expected: 30,
|
|
||||||
description: 'Uebersicht des European Financial Reporting Advisory Group (EFRAG) ueber den EU-Endorsement-Stand aller IFRS/IAS-Standards. Zeigt welche Standards von der EU uebernommen wurden und welche noch ausstehend sind. Relevant fuer internationale Ausschreibungen und Compliance-Pruefung.',
|
|
||||||
relevantFor: ['Rechnungswesen', 'Wirtschaftspruefer', 'Compliance Officer', 'internationale Ausschreibungen'],
|
|
||||||
keyTopics: ['EU Endorsement', 'IFRS 18', 'IFRS S1/S2 Sustainability', 'Endorsement Status', 'IASB Updates'],
|
|
||||||
effectiveDate: '18. Dezember 2025'
|
|
||||||
},
|
|
||||||
]
|
]
|
||||||
|
|
||||||
// Source URLs for original documents (click to view original)
|
|
||||||
const REGULATION_SOURCES: Record<string, string> = {
|
|
||||||
// EU Verordnungen/Richtlinien (EUR-Lex)
|
|
||||||
GDPR: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32016R0679',
|
|
||||||
EPRIVACY: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32002L0058',
|
|
||||||
SCC: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32021D0914',
|
|
||||||
DPF: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32023D1795',
|
|
||||||
AIACT: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32024R1689',
|
|
||||||
CRA: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32024R2847',
|
|
||||||
NIS2: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32022L2555',
|
|
||||||
EUCSA: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32019R0881',
|
|
||||||
DATAACT: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32023R2854',
|
|
||||||
DGA: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32022R0868',
|
|
||||||
DSA: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32022R2065',
|
|
||||||
EAA: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32019L0882',
|
|
||||||
DSM: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32019L0790',
|
|
||||||
PLD: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32024L2853',
|
|
||||||
GPSR: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32023R0988',
|
|
||||||
DORA: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32022R2554',
|
|
||||||
PSD2: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32015L2366',
|
|
||||||
AMLR: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32024R1624',
|
|
||||||
MiCA: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32023R1114',
|
|
||||||
EHDS: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32025R0327',
|
|
||||||
SCC_FULL_TEXT: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32021D0914',
|
|
||||||
E_COMMERCE_RL: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32000L0031',
|
|
||||||
VERBRAUCHERRECHTE_RL: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32011L0083',
|
|
||||||
DIGITALE_INHALTE_RL: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32019L0770',
|
|
||||||
DMA: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32022R1925',
|
|
||||||
MACHINERY_REG: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32023R1230',
|
|
||||||
BLUE_GUIDE: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:52022XC0629(04)',
|
|
||||||
EU_IFRS: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32023R1803',
|
|
||||||
// EDPB Guidelines
|
|
||||||
EDPB_GUIDELINES_2_2019: 'https://www.edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-22019-processing-personal-data-under-article-61b_en',
|
|
||||||
EDPB_GUIDELINES_3_2019: 'https://www.edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-32019-processing-personal-data-through-video_en',
|
|
||||||
EDPB_GUIDELINES_5_2020: 'https://www.edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-052020-consent-under-regulation-2016679_en',
|
|
||||||
EDPB_GUIDELINES_7_2020: 'https://www.edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-072020-concepts-controller-and-processor-gdpr_en',
|
|
||||||
EDPB_GUIDELINES_1_2022: 'https://www.edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-042022-calculation-administrative-fines-under-gdpr_en',
|
|
||||||
// BSI Technische Richtlinien
|
|
||||||
'BSI-TR-03161-1': 'https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Publikationen/TechnischeRichtlinien/TR03161/BSI-TR-03161-1.html',
|
|
||||||
'BSI-TR-03161-2': 'https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Publikationen/TechnischeRichtlinien/TR03161/BSI-TR-03161-2.html',
|
|
||||||
'BSI-TR-03161-3': 'https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Publikationen/TechnischeRichtlinien/TR03161/BSI-TR-03161-3.html',
|
|
||||||
// Nationale Datenschutzgesetze
|
|
||||||
AT_DSG: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=10001597',
|
|
||||||
BDSG_FULL: 'https://www.gesetze-im-internet.de/bdsg_2018/',
|
|
||||||
CH_DSG: 'https://www.fedlex.admin.ch/eli/cc/2022/491/de',
|
|
||||||
LI_DSG: 'https://www.gesetze.li/konso/2018.272',
|
|
||||||
BE_DPA_LAW: 'https://www.autoriteprotectiondonnees.be/citoyen/la-loi-du-30-juillet-2018',
|
|
||||||
NL_UAVG: 'https://wetten.overheid.nl/BWBR0040940/',
|
|
||||||
FR_CNIL_GUIDE: 'https://www.cnil.fr/fr/rgpd-par-ou-commencer',
|
|
||||||
ES_LOPDGDD: 'https://www.boe.es/buscar/act.php?id=BOE-A-2018-16673',
|
|
||||||
IT_CODICE_PRIVACY: 'https://www.garanteprivacy.it/home/docweb/-/docweb-display/docweb/9042678',
|
|
||||||
IE_DPA_2018: 'https://www.irishstatutebook.ie/eli/2018/act/7/enacted/en/html',
|
|
||||||
UK_DPA_2018: 'https://www.legislation.gov.uk/ukpga/2018/12/contents',
|
|
||||||
UK_GDPR: 'https://www.legislation.gov.uk/eur/2016/679/contents',
|
|
||||||
NO_PERSONOPPLYSNINGSLOVEN: 'https://lovdata.no/dokument/NL/lov/2018-06-15-38',
|
|
||||||
SE_DATASKYDDSLAG: 'https://www.riksdagen.se/sv/dokument-och-lagar/dokument/svensk-forfattningssamling/lag-2018218-med-kompletterande-bestammelser_sfs-2018-218/',
|
|
||||||
FI_TIETOSUOJALAKI: 'https://www.finlex.fi/fi/laki/ajantasa/2018/20181050',
|
|
||||||
PL_UODO: 'https://isap.sejm.gov.pl/isap.nsf/DocDetails.xsp?id=WDU20180001000',
|
|
||||||
CZ_ZOU: 'https://www.zakonyprolidi.cz/cs/2019-110',
|
|
||||||
HU_INFOTV: 'https://net.jogtar.hu/jogszabaly?docid=a1100112.tv',
|
|
||||||
LU_DPA_LAW: 'https://legilux.public.lu/eli/etat/leg/loi/2018/08/01/a686/jo',
|
|
||||||
DK_DATABESKYTTELSESLOVEN: 'https://www.retsinformation.dk/eli/lta/2018/502',
|
|
||||||
// Deutschland — Weitere Gesetze
|
|
||||||
TDDDG: 'https://www.gesetze-im-internet.de/tdddg/',
|
|
||||||
DE_DDG: 'https://www.gesetze-im-internet.de/ddg/',
|
|
||||||
DE_BGB_AGB: 'https://www.gesetze-im-internet.de/bgb/__305.html',
|
|
||||||
DE_EGBGB: 'https://www.gesetze-im-internet.de/bgbeg/art_246.html',
|
|
||||||
DE_UWG: 'https://www.gesetze-im-internet.de/uwg_2004/',
|
|
||||||
DE_HGB_RET: 'https://www.gesetze-im-internet.de/hgb/__257.html',
|
|
||||||
DE_AO_RET: 'https://www.gesetze-im-internet.de/ao_1977/__147.html',
|
|
||||||
DE_TKG: 'https://www.gesetze-im-internet.de/tkg_2021/',
|
|
||||||
DE_PANGV: 'https://www.gesetze-im-internet.de/pangv_2022/',
|
|
||||||
DE_DLINFOV: 'https://www.gesetze-im-internet.de/dlinfov/',
|
|
||||||
DE_BETRVG: 'https://www.gesetze-im-internet.de/betrvg/__87.html',
|
|
||||||
DE_GESCHGEHG: 'https://www.gesetze-im-internet.de/geschgehg/',
|
|
||||||
DE_BSIG: 'https://www.gesetze-im-internet.de/bsig_2009/',
|
|
||||||
DE_USTG_RET: 'https://www.gesetze-im-internet.de/ustg_1980/__14b.html',
|
|
||||||
// Oesterreich — Weitere Gesetze
|
|
||||||
AT_ECG: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=20001703',
|
|
||||||
AT_TKG: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=20007898',
|
|
||||||
AT_KSCHG: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=10002462',
|
|
||||||
AT_FAGG: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=20008783',
|
|
||||||
AT_UGB_RET: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=10001702',
|
|
||||||
AT_BAO_RET: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=10003940',
|
|
||||||
AT_MEDIENG: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=10000719',
|
|
||||||
AT_ABGB_AGB: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=10001622',
|
|
||||||
AT_UWG: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=10002665',
|
|
||||||
// Schweiz
|
|
||||||
CH_DSV: 'https://www.fedlex.admin.ch/eli/cc/2022/568/de',
|
|
||||||
CH_OR_AGB: 'https://www.fedlex.admin.ch/eli/cc/27/317_321_377/de',
|
|
||||||
CH_UWG: 'https://www.fedlex.admin.ch/eli/cc/1988/223_223_223/de',
|
|
||||||
CH_FMG: 'https://www.fedlex.admin.ch/eli/cc/1997/2187_2187_2187/de',
|
|
||||||
CH_GEBUV: 'https://www.fedlex.admin.ch/eli/cc/2002/249/de',
|
|
||||||
CH_ZERTES: 'https://www.fedlex.admin.ch/eli/cc/2016/752/de',
|
|
||||||
CH_ZGB_PERS: 'https://www.fedlex.admin.ch/eli/cc/24/233_245_233/de',
|
|
||||||
// Industrie-Compliance
|
|
||||||
ENISA_SECURE_BY_DESIGN: 'https://www.enisa.europa.eu/publications/secure-development-best-practices',
|
|
||||||
ENISA_SUPPLY_CHAIN: 'https://www.enisa.europa.eu/publications/threat-landscape-for-supply-chain-attacks',
|
|
||||||
NIST_SSDF: 'https://csrc.nist.gov/pubs/sp/800/218/final',
|
|
||||||
NIST_CSF_2: 'https://www.nist.gov/cyberframework',
|
|
||||||
OECD_AI_PRINCIPLES: 'https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0449',
|
|
||||||
// IFRS / EFRAG
|
|
||||||
EU_IFRS_DE: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32023R1803',
|
|
||||||
EU_IFRS_EN: 'https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32023R1803',
|
|
||||||
EFRAG_ENDORSEMENT: 'https://www.efrag.org/activities/endorsement-status-report',
|
|
||||||
// Full text of the Austrian Data Protection Act (DSG)
|
|
||||||
AT_DSG_FULL: 'https://www.ris.bka.gv.at/GeltendeFassung.wxe?Abfrage=Bundesnormen&Gesetzesnummer=10001597',
|
|
||||||
}
|
|
||||||
|
|
||||||
// License info for each regulation
|
// License info for each regulation
|
||||||
const REGULATION_LICENSES: Record<string, { license: string; licenseNote: string }> = {
|
const REGULATION_LICENSES: Record<string, { license: string; licenseNote: string }> = {
|
||||||
GDPR: { license: 'PUBLIC_DOMAIN', licenseNote: 'Amtliches Werk der EU — frei verwendbar' },
|
GDPR: { license: 'PUBLIC_DOMAIN', licenseNote: 'Amtliches Werk der EU — frei verwendbar' },
|
||||||
@@ -1321,18 +1063,6 @@ const REGULATION_LICENSES: Record<string, { license: string; licenseNote: string
|
|||||||
EDPB_GUIDELINES_3_2019: { license: 'EDPB-LICENSE', licenseNote: 'EDPB Document License' },
|
EDPB_GUIDELINES_3_2019: { license: 'EDPB-LICENSE', licenseNote: 'EDPB Document License' },
|
||||||
EDPB_GUIDELINES_5_2020: { license: 'EDPB-LICENSE', licenseNote: 'EDPB Document License' },
|
EDPB_GUIDELINES_5_2020: { license: 'EDPB-LICENSE', licenseNote: 'EDPB Document License' },
|
||||||
EDPB_GUIDELINES_7_2020: { license: 'EDPB-LICENSE', licenseNote: 'EDPB Document License' },
|
EDPB_GUIDELINES_7_2020: { license: 'EDPB-LICENSE', licenseNote: 'EDPB Document License' },
|
||||||
// Industrie-Compliance (2026-02-28)
|
|
||||||
MACHINERY_REG: { license: 'PUBLIC_DOMAIN', licenseNote: 'EU-Verordnung — amtliches Werk' },
|
|
||||||
BLUE_GUIDE: { license: 'PUBLIC_DOMAIN', licenseNote: 'EU-Leitfaden — amtliches Werk der Kommission' },
|
|
||||||
ENISA_SECURE_BY_DESIGN: { license: 'CC-BY-4.0', licenseNote: 'ENISA Publication — CC BY 4.0' },
|
|
||||||
ENISA_SUPPLY_CHAIN: { license: 'CC-BY-4.0', licenseNote: 'ENISA Publication — CC BY 4.0' },
|
|
||||||
NIST_SSDF: { license: 'PUBLIC_DOMAIN', licenseNote: 'US Government Work — Public Domain' },
|
|
||||||
NIST_CSF_2: { license: 'PUBLIC_DOMAIN', licenseNote: 'US Government Work — Public Domain' },
|
|
||||||
OECD_AI_PRINCIPLES: { license: 'PUBLIC_DOMAIN', licenseNote: 'OECD Legal Instrument — Reuse Notice' },
|
|
||||||
// EU-IFRS / EFRAG (2026-02-28)
|
|
||||||
EU_IFRS_DE: { license: 'PUBLIC_DOMAIN', licenseNote: 'EU-Verordnung — amtliches Werk' },
|
|
||||||
EU_IFRS_EN: { license: 'PUBLIC_DOMAIN', licenseNote: 'EU-Verordnung — amtliches Werk' },
|
|
||||||
EFRAG_ENDORSEMENT: { license: 'PUBLIC_DOMAIN', licenseNote: 'EFRAG — oeffentliches Dokument' },
|
|
||||||
// DACH National Laws — Deutschland
|
// DACH National Laws — Deutschland
|
||||||
DE_DDG: { license: 'PUBLIC_DOMAIN', licenseNote: 'Deutsches Bundesgesetz — amtliches Werk (§5 UrhG)' },
|
DE_DDG: { license: 'PUBLIC_DOMAIN', licenseNote: 'Deutsches Bundesgesetz — amtliches Werk (§5 UrhG)' },
|
||||||
DE_BGB_AGB: { license: 'PUBLIC_DOMAIN', licenseNote: 'Deutsches Bundesgesetz — amtliches Werk (§5 UrhG)' },
|
DE_BGB_AGB: { license: 'PUBLIC_DOMAIN', licenseNote: 'Deutsches Bundesgesetz — amtliches Werk (§5 UrhG)' },
|
||||||
@@ -1369,35 +1099,6 @@ const REGULATION_LICENSES: Record<string, { license: string; licenseNote: string
|
|||||||
LU_DPA_LAW: { license: 'PUBLIC_DOMAIN', licenseNote: 'Amtliches Werk Luxemburg — frei verwendbar' },
|
LU_DPA_LAW: { license: 'PUBLIC_DOMAIN', licenseNote: 'Amtliches Werk Luxemburg — frei verwendbar' },
|
||||||
DK_DATABESKYTTELSESLOVEN: { license: 'PUBLIC_DOMAIN', licenseNote: 'Amtliches Werk Daenemark — frei verwendbar' },
|
DK_DATABESKYTTELSESLOVEN: { license: 'PUBLIC_DOMAIN', licenseNote: 'Amtliches Werk Daenemark — frei verwendbar' },
|
||||||
EDPB_GUIDELINES_1_2022: { license: 'EDPB-LICENSE', licenseNote: 'EDPB Document License' },
|
EDPB_GUIDELINES_1_2022: { license: 'EDPB-LICENSE', licenseNote: 'EDPB Document License' },
|
||||||
// New EU directives (ingested February 2026)
|
|
||||||
E_COMMERCE_RL: { license: 'PUBLIC_DOMAIN', licenseNote: 'EU-Richtlinie — amtliches Werk' },
|
|
||||||
VERBRAUCHERRECHTE_RL: { license: 'PUBLIC_DOMAIN', licenseNote: 'EU-Richtlinie — amtliches Werk' },
|
|
||||||
DIGITALE_INHALTE_RL: { license: 'PUBLIC_DOMAIN', licenseNote: 'EU-Richtlinie — amtliches Werk' },
|
|
||||||
DMA: { license: 'PUBLIC_DOMAIN', licenseNote: 'EU-Verordnung — amtliches Werk' },
|
|
||||||
}
|
|
||||||
|
|
||||||
// REGULATIONS_IN_RAG is imported from ./rag-constants.ts
|
|
||||||
|
|
||||||
// Helper: Check if regulation is in RAG
|
|
||||||
const isInRag = (code: string): boolean => code in REGULATIONS_IN_RAG
|
|
||||||
|
|
||||||
// Helper: Get known chunk count for a regulation
|
|
||||||
const getKnownChunks = (code: string): number => REGULATIONS_IN_RAG[code]?.chunks || 0
|
|
||||||
|
|
||||||
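Review note on the two helpers above: `code in REGULATIONS_IN_RAG` also matches keys inherited through the prototype chain, so for a plain object literal a code such as `toString` would count as "in RAG". The following is a minimal sketch of an own-property check. It assumes `REGULATIONS_IN_RAG` is a plain object keyed by regulation code; the sample entries and chunk numbers are hypothetical and not taken from `rag-constants.ts`.

```typescript
// Hypothetical stand-in for the map imported from ./rag-constants.ts.
const REGULATIONS_IN_RAG: Record<string, { chunks: number }> = {
  GDPR: { chunks: 100 },
  DMA: { chunks: 0 },
}

// Own-property check: ignores keys inherited from Object.prototype.
const isInRag = (code: string): boolean =>
  Object.prototype.hasOwnProperty.call(REGULATIONS_IN_RAG, code)

// Guarded lookup, so an inherited key is never read as an entry.
const getKnownChunks = (code: string): number =>
  isInRag(code) ? (REGULATIONS_IN_RAG[code]?.chunks ?? 0) : 0
```

With this variant a regulation listed with zero chunks still counts as present, while unknown or inherited keys do not.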
// Known collection totals (updated: 2026-02-28)
|
|
||||||
// Note: bp_compliance_ce and bp_compliance_datenschutz will increase after
|
|
||||||
// industry compliance ingestion (Machinery Reg, Blue Guide, ENISA, NIST, OECD).
|
|
||||||
// Update chunk counts after running ingest-industry-compliance.sh.
|
|
||||||
const COLLECTION_TOTALS = {
|
|
||||||
bp_compliance_gesetze: 58304,
|
|
||||||
bp_compliance_ce: 18183,
|
|
||||||
bp_legal_templates: 7689,
|
|
||||||
bp_compliance_datenschutz: 2448,
|
|
||||||
bp_dsfa_corpus: 7867,
|
|
||||||
bp_compliance_recht: 1425,
|
|
||||||
bp_nibis_eh: 7996,
|
|
||||||
total_legal: 76487, // gesetze + ce
|
|
||||||
total_all: 103912,
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// License display labels
|
// License display labels
|
||||||
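Review note on the removed `COLLECTION_TOTALS` block: `total_legal` and `total_all` are hand-maintained sums, so they can drift when a collection count is updated after an ingestion run. One possible alternative is to derive both from the per-collection counts, sketched below. The names `COLLECTION_COUNTS` and `sumCollections` are hypothetical; the numbers are the ones shown in the diff.

```typescript
// Per-collection chunk counts as listed in the removed COLLECTION_TOTALS block.
const COLLECTION_COUNTS = {
  bp_compliance_gesetze: 58304,
  bp_compliance_ce: 18183,
  bp_legal_templates: 7689,
  bp_compliance_datenschutz: 2448,
  bp_dsfa_corpus: 7867,
  bp_compliance_recht: 1425,
  bp_nibis_eh: 7996,
} as const

type CollectionName = keyof typeof COLLECTION_COUNTS

// Sum the counts of the named collections.
const sumCollections = (names: readonly CollectionName[]): number =>
  names.reduce((sum, name) => sum + COLLECTION_COUNTS[name], 0)

// total_legal covers gesetze + ce, matching the comment in the diff.
const totalLegal = sumCollections(['bp_compliance_gesetze', 'bp_compliance_ce'])

// total_all covers every listed collection.
const totalAll = sumCollections(Object.keys(COLLECTION_COUNTS) as CollectionName[])
```

Both derived values agree with the hard-coded figures in the diff (76487 and 103912), so the existing numbers are internally consistent.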
@@ -1743,8 +1444,6 @@ export default function RAGPage() {
|
|||||||
const [autoRefresh, setAutoRefresh] = useState(true)
|
const [autoRefresh, setAutoRefresh] = useState(true)
|
||||||
const [elapsedTime, setElapsedTime] = useState<string>('')
|
const [elapsedTime, setElapsedTime] = useState<string>('')
|
||||||
|
|
||||||
// Chunk browser state is now in ChunkBrowserQA component
|
|
||||||
|
|
||||||
// DSFA corpus state
|
// DSFA corpus state
|
||||||
const [dsfaSources, setDsfaSources] = useState<DsfaSource[]>([])
|
const [dsfaSources, setDsfaSources] = useState<DsfaSource[]>([])
|
||||||
const [dsfaStatus, setDsfaStatus] = useState<DsfaCorpusStatus | null>(null)
|
const [dsfaStatus, setDsfaStatus] = useState<DsfaCorpusStatus | null>(null)
|
||||||
@@ -1990,8 +1689,6 @@ export default function RAGPage() {
|
|||||||
return () => clearInterval(interval)
|
return () => clearInterval(interval)
|
||||||
}, [pipelineState?.started_at, pipelineState?.status])
|
}, [pipelineState?.started_at, pipelineState?.status])
|
||||||
|
|
||||||
// Chunk browser functions are now in ChunkBrowserQA component
|
|
||||||
|
|
||||||
const handleSearch = async () => {
|
const handleSearch = async () => {
|
||||||
if (!searchQuery.trim()) return
|
if (!searchQuery.trim()) return
|
||||||
|
|
||||||
@@ -2077,7 +1774,6 @@ export default function RAGPage() {
|
|||||||
{ id: 'regulations' as TabId, name: 'Regulierungen', icon: '📜' },
|
{ id: 'regulations' as TabId, name: 'Regulierungen', icon: '📜' },
|
||||||
{ id: 'map' as TabId, name: 'Landkarte', icon: '🗺️' },
|
{ id: 'map' as TabId, name: 'Landkarte', icon: '🗺️' },
|
||||||
{ id: 'search' as TabId, name: 'Suche', icon: '🔍' },
|
{ id: 'search' as TabId, name: 'Suche', icon: '🔍' },
|
||||||
{ id: 'chunks' as TabId, name: 'Chunk-Browser', icon: '🧩' },
|
|
||||||
{ id: 'data' as TabId, name: 'Daten', icon: '📁' },
|
{ id: 'data' as TabId, name: 'Daten', icon: '📁' },
|
||||||
{ id: 'ingestion' as TabId, name: 'Ingestion', icon: '⚙️' },
|
{ id: 'ingestion' as TabId, name: 'Ingestion', icon: '⚙️' },
|
||||||
{ id: 'pipeline' as TabId, name: 'Pipeline', icon: '🔄' },
|
{ id: 'pipeline' as TabId, name: 'Pipeline', icon: '🔄' },
|
||||||
@@ -2108,7 +1804,7 @@ export default function RAGPage() {
|
|||||||
{/* Page Purpose */}
|
{/* Page Purpose */}
|
||||||
<PagePurpose
|
<PagePurpose
|
||||||
title="Daten & RAG"
|
title="Daten & RAG"
|
||||||
purpose={`Verwalten und durchsuchen Sie 7 RAG-Collections mit ${REGULATIONS.length} Regulierungen (${Object.keys(REGULATIONS_IN_RAG).length} im RAG). Legal Corpus, DSFA Corpus (70+ Quellen), NiBiS EH (Bildungsinhalte) und Legal Templates. Teil der KI-Daten-Pipeline fuer Compliance und Klausur-Korrektur.`}
|
purpose="Verwalten und durchsuchen Sie 4 RAG-Collections: Legal Corpus (24 Regulierungen), DSFA Corpus (70+ Quellen inkl. internationaler Datenschutzgesetze), NiBiS EH (Bildungsinhalte) und Legal Templates (Dokumentvorlagen). Teil der KI-Daten-Pipeline fuer Compliance und Klausur-Korrektur."
|
||||||
audience={['DSB', 'Compliance Officer', 'Entwickler']}
|
audience={['DSB', 'Compliance Officer', 'Entwickler']}
|
||||||
gdprArticles={['§5 UrhG (Amtliche Werke)', 'Art. 5 DSGVO (Rechenschaftspflicht)']}
|
gdprArticles={['§5 UrhG (Amtliche Werke)', 'Art. 5 DSGVO (Rechenschaftspflicht)']}
|
||||||
architecture={{
|
architecture={{
|
||||||
@@ -2130,8 +1826,8 @@ export default function RAGPage() {
|
|||||||
<div className="grid grid-cols-2 md:grid-cols-4 gap-4 mb-6">
|
<div className="grid grid-cols-2 md:grid-cols-4 gap-4 mb-6">
|
||||||
<div className="bg-white rounded-xl p-4 border border-slate-200">
|
<div className="bg-white rounded-xl p-4 border border-slate-200">
|
||||||
<p className="text-xs font-medium text-blue-600 uppercase mb-1">Legal Corpus</p>
|
<p className="text-xs font-medium text-blue-600 uppercase mb-1">Legal Corpus</p>
|
||||||
<p className="text-2xl font-bold text-slate-900">{COLLECTION_TOTALS.total_legal.toLocaleString()}</p>
|
<p className="text-2xl font-bold text-slate-900">{loading ? '-' : getTotalChunks().toLocaleString()}</p>
|
||||||
<p className="text-xs text-slate-500">Chunks · {Object.keys(REGULATIONS_IN_RAG).length}/{REGULATIONS.length} im RAG</p>
|
<p className="text-xs text-slate-500">Chunks · {REGULATIONS.length} Regulierungen</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="bg-white rounded-xl p-4 border border-slate-200">
|
<div className="bg-white rounded-xl p-4 border border-slate-200">
|
||||||
<p className="text-xs font-medium text-purple-600 uppercase mb-1">DSFA Corpus</p>
|
<p className="text-xs font-medium text-purple-600 uppercase mb-1">DSFA Corpus</p>
|
||||||
@@ -2140,12 +1836,12 @@ export default function RAGPage() {
|
|||||||
</div>
|
</div>
|
||||||
<div className="bg-white rounded-xl p-4 border border-slate-200">
|
<div className="bg-white rounded-xl p-4 border border-slate-200">
|
||||||
<p className="text-xs font-medium text-emerald-600 uppercase mb-1">NiBiS EH</p>
|
<p className="text-xs font-medium text-emerald-600 uppercase mb-1">NiBiS EH</p>
|
||||||
<p className="text-2xl font-bold text-slate-900">7.996</p>
|
<p className="text-2xl font-bold text-slate-900">28.662</p>
|
||||||
<p className="text-xs text-slate-500">Chunks · Bildungs-Erwartungshorizonte</p>
|
<p className="text-xs text-slate-500">Chunks · Bildungs-Erwartungshorizonte</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="bg-white rounded-xl p-4 border border-slate-200">
|
<div className="bg-white rounded-xl p-4 border border-slate-200">
|
||||||
<p className="text-xs font-medium text-orange-600 uppercase mb-1">Legal Templates</p>
|
<p className="text-xs font-medium text-orange-600 uppercase mb-1">Legal Templates</p>
|
||||||
<p className="text-2xl font-bold text-slate-900">7.689</p>
|
<p className="text-2xl font-bold text-slate-900">824</p>
|
||||||
<p className="text-xs text-slate-500">Chunks · Dokumentvorlagen</p>
|
<p className="text-xs text-slate-500">Chunks · Dokumentvorlagen</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
@@ -2180,8 +1876,8 @@ export default function RAGPage() {
|
|||||||
className="p-4 rounded-lg border border-blue-200 bg-blue-50 hover:bg-blue-100 transition-colors text-left"
|
className="p-4 rounded-lg border border-blue-200 bg-blue-50 hover:bg-blue-100 transition-colors text-left"
|
||||||
>
|
>
|
||||||
<p className="text-xs font-medium text-blue-600 uppercase">Gesetze & Regulierungen</p>
|
<p className="text-xs font-medium text-blue-600 uppercase">Gesetze & Regulierungen</p>
|
||||||
<p className="text-2xl font-bold text-slate-900 mt-1">{COLLECTION_TOTALS.total_legal.toLocaleString()}</p>
|
<p className="text-2xl font-bold text-slate-900 mt-1">{loading ? '-' : getTotalChunks().toLocaleString()}</p>
|
||||||
<p className="text-xs text-slate-500 mt-1">{Object.keys(REGULATIONS_IN_RAG).length}/{REGULATIONS.length} im RAG</p>
|
<p className="text-xs text-slate-500 mt-1">{REGULATIONS.length} Regulierungen (EU, DE, BSI)</p>
|
||||||
</button>
|
</button>
|
||||||
<button
|
<button
|
||||||
onClick={() => { setRegulationCategory('dsfa'); setActiveTab('regulations') }}
|
onClick={() => { setRegulationCategory('dsfa'); setActiveTab('regulations') }}
|
||||||
@@ -2193,12 +1889,12 @@ export default function RAGPage() {
|
|||||||
</button>
|
</button>
|
||||||
<div className="p-4 rounded-lg border border-emerald-200 bg-emerald-50 text-left">
|
<div className="p-4 rounded-lg border border-emerald-200 bg-emerald-50 text-left">
|
||||||
<p className="text-xs font-medium text-emerald-600 uppercase">NiBiS EH</p>
|
<p className="text-xs font-medium text-emerald-600 uppercase">NiBiS EH</p>
|
||||||
<p className="text-2xl font-bold text-slate-900 mt-1">7.996</p>
|
<p className="text-2xl font-bold text-slate-900 mt-1">28.662</p>
|
||||||
<p className="text-xs text-slate-500 mt-1">Chunks · Bildungs-Erwartungshorizonte</p>
|
<p className="text-xs text-slate-500 mt-1">Chunks · Bildungs-Erwartungshorizonte</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="p-4 rounded-lg border border-orange-200 bg-orange-50 text-left">
|
<div className="p-4 rounded-lg border border-orange-200 bg-orange-50 text-left">
|
||||||
<p className="text-xs font-medium text-orange-600 uppercase">Legal Templates</p>
|
<p className="text-xs font-medium text-orange-600 uppercase">Legal Templates</p>
|
||||||
<p className="text-2xl font-bold text-slate-900 mt-1">7.689</p>
|
<p className="text-2xl font-bold text-slate-900 mt-1">824</p>
|
||||||
<p className="text-xs text-slate-500 mt-1">Chunks · Dokumentvorlagen (VVT, TOM, DSFA)</p>
|
<p className="text-xs text-slate-500 mt-1">Chunks · Dokumentvorlagen (VVT, TOM, DSFA)</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
@@ -2208,13 +1904,12 @@ export default function RAGPage() {
|
|||||||
<div className="grid grid-cols-1 md:grid-cols-4 gap-4">
|
<div className="grid grid-cols-1 md:grid-cols-4 gap-4">
|
||||||
{Object.entries(TYPE_LABELS).map(([type, label]) => {
|
{Object.entries(TYPE_LABELS).map(([type, label]) => {
|
||||||
const regs = REGULATIONS.filter((r) => r.type === type)
|
const regs = REGULATIONS.filter((r) => r.type === type)
|
||||||
const inRagCount = regs.filter((r) => isInRag(r.code)).length
|
const totalChunks = regs.reduce((sum, r) => sum + getRegulationChunks(r.code), 0)
|
||||||
const totalChunks = regs.reduce((sum, r) => sum + getKnownChunks(r.code), 0)
|
|
||||||
return (
|
return (
|
||||||
<div key={type} className="bg-white rounded-xl p-4 border border-slate-200">
|
<div key={type} className="bg-white rounded-xl p-4 border border-slate-200">
|
||||||
<div className="flex items-center gap-2 mb-2">
|
<div className="flex items-center gap-2 mb-2">
|
||||||
<span className={`px-2 py-0.5 text-xs rounded ${TYPE_COLORS[type]}`}>{label}</span>
|
<span className={`px-2 py-0.5 text-xs rounded ${TYPE_COLORS[type]}`}>{label}</span>
|
||||||
<span className="text-slate-500 text-sm">{inRagCount}/{regs.length} im RAG</span>
|
<span className="text-slate-500 text-sm">{regs.length} Dok.</span>
|
||||||
</div>
|
</div>
|
||||||
<p className="text-xl font-bold text-slate-900">{totalChunks.toLocaleString()} Chunks</p>
|
<p className="text-xl font-bold text-slate-900">{totalChunks.toLocaleString()} Chunks</p>
|
||||||
</div>
|
</div>
|
||||||
@@ -2228,25 +1923,20 @@ export default function RAGPage() {
|
|||||||
<h3 className="font-semibold text-slate-900">Top Regulierungen (nach Chunks)</h3>
|
<h3 className="font-semibold text-slate-900">Top Regulierungen (nach Chunks)</h3>
|
||||||
</div>
|
</div>
|
||||||
<div className="divide-y">
|
<div className="divide-y">
|
||||||
{[...REGULATIONS].sort((a, b) => getKnownChunks(b.code) - getKnownChunks(a.code))
|
{REGULATIONS.sort((a, b) => getRegulationChunks(b.code) - getRegulationChunks(a.code))
|
||||||
.slice(0, 10)
|
.slice(0, 5)
|
||||||
.map((reg) => {
|
.map((reg) => {
|
||||||
const chunks = getKnownChunks(reg.code)
|
const chunks = getRegulationChunks(reg.code)
|
||||||
return (
|
return (
|
||||||
<div key={reg.code} className="px-4 py-3 flex items-center justify-between">
|
<div key={reg.code} className="px-4 py-3 flex items-center justify-between">
|
||||||
<div className="flex items-center gap-3">
|
<div className="flex items-center gap-3">
|
||||||
{isInRag(reg.code) ? (
|
|
||||||
<span className="text-green-500 text-sm">✓</span>
|
|
||||||
) : (
|
|
||||||
<span className="text-red-400 text-sm">✗</span>
|
|
||||||
)}
|
|
||||||
<span className={`px-2 py-0.5 text-xs rounded ${TYPE_COLORS[reg.type]}`}>
|
<span className={`px-2 py-0.5 text-xs rounded ${TYPE_COLORS[reg.type]}`}>
|
||||||
{TYPE_LABELS[reg.type]}
|
{TYPE_LABELS[reg.type]}
|
||||||
</span>
|
</span>
|
||||||
<span className="font-medium text-slate-900">{reg.name}</span>
|
<span className="font-medium text-slate-900">{reg.name}</span>
|
||||||
<span className="text-slate-500 text-sm">({reg.code})</span>
|
<span className="text-slate-500 text-sm">({reg.code})</span>
|
||||||
</div>
|
</div>
|
||||||
<span className={`font-bold ${chunks > 0 ? 'text-teal-600' : 'text-slate-300'}`}>{chunks > 0 ? chunks.toLocaleString() + ' Chunks' : '—'}</span>
|
<span className="font-bold text-teal-600">{chunks.toLocaleString()} Chunks</span>
|
||||||
</div>
|
</div>
|
||||||
)
|
)
|
||||||
})}
|
})}
|
||||||
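Review note on the two sort variants in this hunk: `Array.prototype.sort` reorders the array it is called on, so `REGULATIONS.sort(...)` changes the shared module-level array during render, whereas `[...REGULATIONS].sort(...)` sorts a copy. The difference is sketched below with hypothetical sample data; `CHUNKS`, `getChunks`, and `topByChunks` are illustrative names, not part of the file.

```typescript
interface Regulation {
  code: string
  name: string
}

// Hypothetical sample data; the real REGULATIONS array is defined elsewhere in the file.
const REGULATIONS: Regulation[] = [
  { code: 'A', name: 'Alpha' },
  { code: 'B', name: 'Beta' },
  { code: 'C', name: 'Gamma' },
]

// Hypothetical chunk counts per regulation code.
const CHUNKS: Record<string, number> = { A: 5, B: 50, C: 20 }
const getChunks = (code: string): number => CHUNKS[code] ?? 0

// Copy first, then sort descending by chunk count and keep the first `limit` entries.
const topByChunks = (regs: readonly Regulation[], limit: number): Regulation[] =>
  [...regs].sort((a, b) => getChunks(b.code) - getChunks(a.code)).slice(0, limit)

const top = topByChunks(REGULATIONS, 2)
```

Because the input is copied before sorting, other views that iterate `REGULATIONS` keep its declared order.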
@@ -2305,13 +1995,7 @@ export default function RAGPage() {
|
|||||||
{regulationCategory === 'regulations' && (
|
{regulationCategory === 'regulations' && (
|
||||||
<div className="bg-white rounded-xl border border-slate-200 overflow-hidden">
|
<div className="bg-white rounded-xl border border-slate-200 overflow-hidden">
|
||||||
<div className="px-4 py-3 border-b bg-slate-50 flex items-center justify-between">
|
<div className="px-4 py-3 border-b bg-slate-50 flex items-center justify-between">
|
||||||
<h3 className="font-semibold text-slate-900">
|
<h3 className="font-semibold text-slate-900">Alle {REGULATIONS.length} Regulierungen</h3>
|
||||||
Alle {REGULATIONS.length} Regulierungen
|
|
||||||
<span className="ml-2 text-sm font-normal text-slate-500">
|
|
||||||
({REGULATIONS.filter(r => isInRag(r.code)).length} im RAG,{' '}
|
|
||||||
{REGULATIONS.filter(r => !isInRag(r.code)).length} ausstehend)
|
|
||||||
</span>
|
|
||||||
</h3>
|
|
||||||
<button
|
<button
|
||||||
onClick={fetchStatus}
|
onClick={fetchStatus}
|
||||||
className="text-sm text-teal-600 hover:text-teal-700"
|
className="text-sm text-teal-600 hover:text-teal-700"
|
||||||
@@ -2323,7 +2007,6 @@ export default function RAGPage() {
|
|||||||
<table className="w-full">
|
<table className="w-full">
|
||||||
<thead className="bg-slate-50 border-b">
|
<thead className="bg-slate-50 border-b">
|
||||||
<tr>
|
<tr>
|
||||||
<th className="px-4 py-3 text-center text-xs font-medium text-slate-500 uppercase w-12">RAG</th>
|
|
||||||
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Code</th>
|
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Code</th>
|
||||||
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Typ</th>
|
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Typ</th>
|
||||||
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Name</th>
|
<th className="px-4 py-3 text-left text-xs font-medium text-slate-500 uppercase">Name</th>
|
||||||
@@ -2334,10 +2017,17 @@ export default function RAGPage() {
|
|||||||
</thead>
|
</thead>
|
||||||
<tbody className="divide-y">
|
<tbody className="divide-y">
|
||||||
{REGULATIONS.map((reg) => {
|
{REGULATIONS.map((reg) => {
|
||||||
const chunks = getKnownChunks(reg.code)
|
const chunks = getRegulationChunks(reg.code)
|
||||||
const inRag = isInRag(reg.code)
|
const ratio = chunks / (reg.expected * 10) // Rough estimate: 10 chunks per requirement
|
||||||
let statusColor = inRag ? 'text-green-500' : 'text-red-500'
|
let statusColor = 'text-red-500'
|
||||||
let statusIcon = inRag ? '✓' : '❌'
|
let statusIcon = '❌'
|
||||||
|
if (ratio > 0.5) {
|
||||||
|
statusColor = 'text-green-500'
|
||||||
|
statusIcon = '✓'
|
||||||
|
} else if (ratio > 0.1) {
|
||||||
|
statusColor = 'text-yellow-500'
|
||||||
|
statusIcon = '⚠'
|
||||||
|
}
|
||||||
const isExpanded = expandedRegulation === reg.code
|
const isExpanded = expandedRegulation === reg.code
|
||||||
|
|
||||||
return (
|
return (
|
||||||
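Review note on the status logic in this hunk: one side derives the status from RAG membership, the other from a ratio of chunks to `reg.expected * 10`. The ratio variant can be isolated as a pure function, as in the sketch below. The thresholds and class names are taken from the diff; the function name `statusFromRatio`, the constant `CHUNKS_PER_REQUIREMENT`, and the guard for `expected <= 0` are assumptions added for illustration.

```typescript
type Status = { color: string; icon: string }

// Heuristic from the diff: roughly 10 chunks per expected requirement.
const CHUNKS_PER_REQUIREMENT = 10

const statusFromRatio = (chunks: number, expected: number): Status => {
  // Guard (an addition, not in the diff): without it, expected = 0 gives NaN or Infinity.
  const ratio = expected > 0 ? chunks / (expected * CHUNKS_PER_REQUIREMENT) : 0
  if (ratio > 0.5) return { color: 'text-green-500', icon: '✓' }
  if (ratio > 0.1) return { color: 'text-yellow-500', icon: '⚠' }
  return { color: 'text-red-500', icon: '❌' }
}
```

Note that the guard changes one edge case relative to the inline code: with `expected` equal to 0 and a positive chunk count, the inline ratio is `Infinity` and shows green, while this sketch shows red.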
@@ -2346,13 +2036,6 @@ export default function RAGPage() {
|
|||||||
onClick={() => setExpandedRegulation(isExpanded ? null : reg.code)}
|
onClick={() => setExpandedRegulation(isExpanded ? null : reg.code)}
|
||||||
className="hover:bg-slate-50 cursor-pointer transition-colors"
|
className="hover:bg-slate-50 cursor-pointer transition-colors"
|
||||||
>
|
>
|
||||||
<td className="px-4 py-3 text-center">
|
|
||||||
{isInRag(reg.code) ? (
|
|
||||||
<span className="inline-flex items-center justify-center w-6 h-6 bg-green-100 text-green-600 rounded-full text-xs font-bold" title="Im RAG vorhanden">✓</span>
|
|
||||||
) : (
|
|
||||||
<span className="inline-flex items-center justify-center w-6 h-6 bg-red-50 text-red-400 rounded-full text-xs font-bold" title="Nicht im RAG">✗</span>
|
|
||||||
)}
|
|
||||||
</td>
|
|
||||||
<td className="px-4 py-3 font-mono font-medium text-teal-600">
|
<td className="px-4 py-3 font-mono font-medium text-teal-600">
|
||||||
<span className="inline-flex items-center gap-2">
|
<span className="inline-flex items-center gap-2">
|
||||||
<span className={`transform transition-transform ${isExpanded ? 'rotate-90' : ''}`}>▶</span>
|
<span className={`transform transition-transform ${isExpanded ? 'rotate-90' : ''}`}>▶</span>
|
||||||
@@ -2365,20 +2048,13 @@ export default function RAGPage() {
|
|||||||
</span>
|
</span>
|
||||||
</td>
|
</td>
|
||||||
<td className="px-4 py-3 text-slate-900">{reg.name}</td>
|
<td className="px-4 py-3 text-slate-900">{reg.name}</td>
|
||||||
<td className="px-4 py-3 text-right font-bold">
|
<td className="px-4 py-3 text-right font-bold">{chunks.toLocaleString()}</td>
|
||||||
<span className={chunks > 0 && chunks < 10 && reg.expected >= 10 ? 'text-amber-600' : ''}>
|
|
||||||
{chunks.toLocaleString()}
|
|
||||||
{chunks > 0 && chunks < 10 && reg.expected >= 10 && (
|
|
||||||
<span className="ml-1 inline-block w-4 h-4 text-[10px] leading-4 text-center bg-amber-100 text-amber-700 rounded-full" title="Verdaechtig niedrig — Ingestion pruefen">⚠</span>
|
|
||||||
)}
|
|
||||||
</span>
|
|
||||||
</td>
|
|
||||||
<td className="px-4 py-3 text-right text-slate-500">{reg.expected}</td>
|
<td className="px-4 py-3 text-right text-slate-500">{reg.expected}</td>
|
||||||
<td className={`px-4 py-3 text-center ${statusColor}`}>{statusIcon}</td>
|
<td className={`px-4 py-3 text-center ${statusColor}`}>{statusIcon}</td>
|
||||||
</tr>
|
</tr>
|
||||||
{isExpanded && (
|
{isExpanded && (
|
||||||
<tr key={`${reg.code}-detail`} className="bg-slate-50">
|
<tr key={`${reg.code}-detail`} className="bg-slate-50">
|
||||||
<td colSpan={7} className="px-4 py-4">
|
<td colSpan={6} className="px-4 py-4">
|
||||||
<div className="bg-white rounded-lg border border-slate-200 p-4 space-y-3">
|
<div className="bg-white rounded-lg border border-slate-200 p-4 space-y-3">
|
||||||
<div>
|
<div>
|
||||||
<h4 className="font-semibold text-slate-900 mb-1">{reg.fullName}</h4>
|
<h4 className="font-semibold text-slate-900 mb-1">{reg.fullName}</h4>
|
||||||
@@ -2418,22 +2094,11 @@ export default function RAGPage() {
|
|||||||
</span>
|
</span>
|
||||||
)}
|
)}
|
||||||
</div>
|
</div>
|
||||||
<div className="flex items-center gap-3">
|
|
||||||
{REGULATION_SOURCES[reg.code] && (
|
|
||||||
<a
|
|
||||||
href={REGULATION_SOURCES[reg.code]}
|
|
||||||
target="_blank"
|
|
||||||
rel="noopener noreferrer"
|
|
||||||
onClick={(e) => e.stopPropagation()}
|
|
||||||
className="text-blue-600 hover:text-blue-700 font-medium"
|
|
||||||
>
|
|
||||||
Originalquelle →
|
|
||||||
</a>
|
|
||||||
)}
|
|
||||||
<button
|
<button
|
||||||
onClick={(e) => {
|
onClick={(e) => {
|
||||||
e.stopPropagation()
|
e.stopPropagation()
|
||||||
setActiveTab('chunks')
|
setSearchQuery(reg.name)
|
||||||
|
setActiveTab('search')
|
||||||
}}
|
}}
|
||||||
className="text-teal-600 hover:text-teal-700 font-medium"
|
className="text-teal-600 hover:text-teal-700 font-medium"
|
||||||
>
|
>
|
||||||
@@ -2441,7 +2106,6 @@ export default function RAGPage() {
|
|||||||
</button>
|
</button>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
|
||||||
</td>
|
</td>
|
||||||
</tr>
|
</tr>
|
||||||
)}
|
)}
|
||||||
@@ -2568,7 +2232,7 @@ export default function RAGPage() {
|
|||||||
<div className="grid grid-cols-3 gap-4 mb-4">
|
<div className="grid grid-cols-3 gap-4 mb-4">
|
||||||
<div className="bg-emerald-50 rounded-lg p-4 border border-emerald-200">
|
<div className="bg-emerald-50 rounded-lg p-4 border border-emerald-200">
|
||||||
<p className="text-sm text-emerald-600 font-medium">Chunks</p>
|
<p className="text-sm text-emerald-600 font-medium">Chunks</p>
|
||||||
<p className="text-2xl font-bold text-slate-900">7.996</p>
|
<p className="text-2xl font-bold text-slate-900">28.662</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="bg-emerald-50 rounded-lg p-4 border border-emerald-200">
|
<div className="bg-emerald-50 rounded-lg p-4 border border-emerald-200">
|
||||||
<p className="text-sm text-emerald-600 font-medium">Vector Size</p>
|
<p className="text-sm text-emerald-600 font-medium">Vector Size</p>
|
||||||
@@ -2600,7 +2264,7 @@ export default function RAGPage() {
|
|||||||
<div className="grid grid-cols-3 gap-4 mb-4">
|
<div className="grid grid-cols-3 gap-4 mb-4">
|
||||||
<div className="bg-orange-50 rounded-lg p-4 border border-orange-200">
|
<div className="bg-orange-50 rounded-lg p-4 border border-orange-200">
|
||||||
<p className="text-sm text-orange-600 font-medium">Chunks</p>
|
<p className="text-sm text-orange-600 font-medium">Chunks</p>
|
||||||
<p className="text-2xl font-bold text-slate-900">7.689</p>
|
<p className="text-2xl font-bold text-slate-900">824</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="bg-orange-50 rounded-lg p-4 border border-orange-200">
|
<div className="bg-orange-50 rounded-lg p-4 border border-orange-200">
|
||||||
<p className="text-sm text-orange-600 font-medium">Vector Size</p>
|
<p className="text-sm text-orange-600 font-medium">Vector Size</p>
|
||||||
@@ -2668,28 +2332,20 @@ export default function RAGPage() {
|
|||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-3">
|
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-3">
|
||||||
{regs.map((reg) => {
|
{regs.map((reg) => (
|
||||||
const regInRag = isInRag(reg.code)
|
|
||||||
return (
|
|
||||||
<div
|
<div
|
||||||
key={reg.code}
|
key={reg.code}
|
||||||
className={`bg-white p-3 rounded-lg border ${regInRag ? 'border-green-200' : 'border-slate-200'}`}
|
className="bg-white p-3 rounded-lg border border-slate-200"
|
||||||
>
|
>
|
||||||
<div className="flex items-center gap-2 mb-1">
|
<div className="flex items-center gap-2 mb-1">
|
||||||
<span className={`px-2 py-0.5 text-xs rounded ${TYPE_COLORS[reg.type]}`}>
|
<span className={`px-2 py-0.5 text-xs rounded ${TYPE_COLORS[reg.type]}`}>
|
||||||
{reg.code}
|
{reg.code}
|
||||||
</span>
|
</span>
|
||||||
{regInRag ? (
|
|
||||||
<span className="px-1.5 py-0.5 text-[10px] font-bold bg-green-100 text-green-600 rounded">RAG</span>
|
|
||||||
) : (
|
|
||||||
<span className="px-1.5 py-0.5 text-[10px] font-bold bg-red-50 text-red-400 rounded">✗</span>
|
|
||||||
)}
|
|
||||||
</div>
|
</div>
|
||||||
<div className="font-medium text-sm text-slate-900">{reg.name}</div>
|
<div className="font-medium text-sm text-slate-900">{reg.name}</div>
|
||||||
<div className="text-xs text-slate-500 mt-1 line-clamp-2">{reg.description}</div>
|
<div className="text-xs text-slate-500 mt-1 line-clamp-2">{reg.description}</div>
|
||||||
</div>
|
</div>
|
||||||
)
|
))}
|
||||||
})}
|
|
||||||
</div>
|
</div>
|
||||||
</>
|
</>
|
||||||
)
|
)
|
||||||
@@ -2716,22 +2372,17 @@ export default function RAGPage() {
|
|||||||
<div className="flex flex-wrap gap-2">
|
<div className="flex flex-wrap gap-2">
|
||||||
{group.regulations.map((code) => {
|
{group.regulations.map((code) => {
|
||||||
const reg = REGULATIONS.find(r => r.code === code)
|
const reg = REGULATIONS.find(r => r.code === code)
|
||||||
const codeInRag = isInRag(code)
|
|
||||||
return (
|
return (
|
||||||
<span
|
<span
|
||||||
key={code}
|
key={code}
|
||||||
className={`px-3 py-1.5 rounded-full text-sm font-medium cursor-pointer ${
|
className="px-3 py-1.5 bg-slate-100 rounded-full text-sm font-medium text-slate-700 hover:bg-slate-200 cursor-pointer"
|
||||||
codeInRag
|
|
||||||
? 'bg-green-100 text-green-700 hover:bg-green-200'
|
|
||||||
: 'bg-slate-100 text-slate-700 hover:bg-slate-200'
|
|
||||||
}`}
|
|
||||||
onClick={() => {
|
onClick={() => {
|
||||||
setActiveTab('regulations')
|
setActiveTab('regulations')
|
||||||
setExpandedRegulation(code)
|
setExpandedRegulation(code)
|
||||||
}}
|
}}
|
||||||
title={`${reg?.fullName || code}${codeInRag ? ' (im RAG)' : ' (nicht im RAG)'}`}
|
title={reg?.fullName || code}
|
||||||
>
|
>
|
||||||
{codeInRag ? '✓ ' : '✗ '}{code}
|
{code}
|
||||||
</span>
|
</span>
|
||||||
)
|
)
|
||||||
})}
|
})}
|
||||||
@@ -2755,13 +2406,9 @@ export default function RAGPage() {
|
|||||||
{intersection.regulations.map((code) => (
|
{intersection.regulations.map((code) => (
|
||||||
<span
|
<span
|
||||||
key={code}
|
key={code}
|
||||||
className={`px-2 py-0.5 text-xs font-medium rounded ${
|
className="px-2 py-0.5 text-xs font-medium bg-teal-100 text-teal-700 rounded"
|
||||||
isInRag(code)
|
|
||||||
? 'bg-green-100 text-green-700'
|
|
||||||
: 'bg-red-50 text-red-500'
|
|
||||||
}`}
|
|
||||||
>
|
>
|
||||||
{isInRag(code) ? '✓ ' : '✗ '}{code}
|
{code}
|
||||||
</span>
|
</span>
|
||||||
))}
|
))}
|
||||||
</div>
|
</div>
|
||||||
@@ -2796,15 +2443,8 @@ export default function RAGPage() {
|
|||||||
<tbody className="divide-y">
|
<tbody className="divide-y">
|
||||||
{REGULATIONS.map((reg) => (
|
{REGULATIONS.map((reg) => (
|
||||||
<tr key={reg.code} className="hover:bg-slate-50">
|
<tr key={reg.code} className="hover:bg-slate-50">
|
||||||
<td className="px-2 py-2 font-medium sticky left-0 bg-white">
|
<td className="px-2 py-2 font-medium text-teal-600 sticky left-0 bg-white">
|
||||||
<span className="flex items-center gap-1">
|
{reg.code}
|
||||||
{isInRag(reg.code) ? (
|
|
||||||
<span className="text-green-500 text-[10px]">●</span>
|
|
||||||
) : (
|
|
||||||
<span className="text-red-300 text-[10px]">○</span>
|
|
||||||
)}
|
|
||||||
<span className="text-teal-600">{reg.code}</span>
|
|
||||||
</span>
|
|
||||||
</td>
|
</td>
|
||||||
{INDUSTRIES.filter(i => i.id !== 'all').map((industry) => {
|
{INDUSTRIES.filter(i => i.id !== 'all').map((industry) => {
|
||||||
const applies = INDUSTRY_REGULATION_MAP[industry.id]?.includes(reg.code)
|
const applies = INDUSTRY_REGULATION_MAP[industry.id]?.includes(reg.code)
|
||||||
@@ -2891,34 +2531,28 @@ export default function RAGPage() {
|
|||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
{/* RAG Coverage Overview */}
|
{/* Integrated Regulations */}
|
||||||
<div className="bg-white rounded-xl border border-slate-200 p-6">
|
<div className="bg-white rounded-xl border border-slate-200 p-6">
|
||||||
<div className="flex items-center gap-3 mb-4">
|
<div className="flex items-center gap-3 mb-4">
|
||||||
<span className="text-2xl">✅</span>
|
<span className="text-2xl">✅</span>
|
||||||
<div>
|
<div>
|
||||||
<h3 className="font-semibold text-slate-900">RAG-Abdeckung ({Object.keys(REGULATIONS_IN_RAG).length} von {REGULATIONS.length} Regulierungen)</h3>
|
<h3 className="font-semibold text-slate-900">Neu integrierte Regulierungen</h3>
|
||||||
<p className="text-sm text-slate-500">Stand: Februar 2026 — Alle im RAG-System verfuegbaren Regulierungen</p>
|
<p className="text-sm text-slate-500">Jetzt im RAG-System verfuegbar (Stand: Januar 2025)</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<div className="flex flex-wrap gap-2">
|
<div className="grid grid-cols-2 md:grid-cols-5 gap-3">
|
||||||
{REGULATIONS.filter(r => isInRag(r.code)).map((reg) => (
|
{INTEGRATED_REGULATIONS.map((reg) => (
|
||||||
<span key={reg.code} className="px-2.5 py-1 text-xs font-medium bg-green-100 text-green-700 rounded-full border border-green-200">
|
<div key={reg.code} className="rounded-lg border border-green-200 bg-green-50 p-3 text-center">
|
||||||
✓ {reg.code}
|
<span className="px-2 py-1 text-sm font-bold bg-green-100 text-green-700 rounded">
|
||||||
|
{reg.code}
|
||||||
</span>
|
</span>
|
||||||
))}
|
<p className="text-xs text-slate-600 mt-2">{reg.name}</p>
|
||||||
|
<p className="text-xs text-green-600 mt-1">Im RAG</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="mt-4 pt-4 border-t border-slate-100">
|
|
||||||
<p className="text-xs font-medium text-slate-500 mb-2">Noch nicht im RAG:</p>
|
|
||||||
<div className="flex flex-wrap gap-2">
|
|
||||||
{REGULATIONS.filter(r => !isInRag(r.code)).map((reg) => (
|
|
||||||
<span key={reg.code} className="px-2.5 py-1 text-xs font-medium bg-red-50 text-red-400 rounded-full border border-red-100">
|
|
||||||
✗ {reg.code}
|
|
||||||
</span>
|
|
||||||
))}
|
))}
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Potential Future Regulations */}
|
{/* Potential Future Regulations */}
|
||||||
<div className="bg-white rounded-xl border border-slate-200 p-6">
|
<div className="bg-white rounded-xl border border-slate-200 p-6">
|
||||||
@@ -3080,10 +2714,6 @@ export default function RAGPage() {
|
|||||||
</div>
|
</div>
|
||||||
)}
|
)}
|
||||||
|
|
||||||
{activeTab === 'chunks' && (
|
|
||||||
<ChunkBrowserQA apiProxy={API_PROXY} />
|
|
||||||
)}
|
|
||||||
|
|
||||||
{activeTab === 'data' && (
|
{activeTab === 'data' && (
|
||||||
<div className="space-y-6">
|
<div className="space-y-6">
|
||||||
{/* Upload Document */}
|
{/* Upload Document */}
|
||||||
@@ -3269,7 +2899,7 @@ export default function RAGPage() {
|
|||||||
<span className="flex items-center gap-2 text-teal-600">
|
<span className="flex items-center gap-2 text-teal-600">
|
||||||
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
|
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
|
||||||
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
||||||
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.7.689 3 7.938l3-2.647z" />
|
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
|
||||||
</svg>
|
</svg>
|
||||||
Ingestion laeuft...
|
Ingestion laeuft...
|
||||||
</span>
|
</span>
|
||||||
@@ -3339,7 +2969,7 @@ export default function RAGPage() {
|
|||||||
{pipelineStarting ? (
|
{pipelineStarting ? (
|
||||||
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
|
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
|
||||||
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
||||||
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.7.689 3 7.938l3-2.647z" />
|
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
|
||||||
</svg>
|
</svg>
|
||||||
) : (
|
) : (
|
||||||
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
|
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
|
||||||
@@ -3358,7 +2988,7 @@ export default function RAGPage() {
|
|||||||
{pipelineLoading ? (
|
{pipelineLoading ? (
|
||||||
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
|
<svg className="animate-spin h-4 w-4" fill="none" viewBox="0 0 24 24">
|
||||||
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
||||||
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.7.689 3 7.938l3-2.647z" />
|
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
|
||||||
</svg>
|
</svg>
|
||||||
) : (
|
) : (
|
||||||
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
|
<svg className="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
|
||||||
@@ -3391,7 +3021,7 @@ export default function RAGPage() {
|
|||||||
<>
|
<>
|
||||||
<svg className="animate-spin h-5 w-5" fill="none" viewBox="0 0 24 24">
|
<svg className="animate-spin h-5 w-5" fill="none" viewBox="0 0 24 24">
|
||||||
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
||||||
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.7.689 3 7.938l3-2.647z" />
|
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
|
||||||
</svg>
|
</svg>
|
||||||
Startet...
|
Startet...
|
||||||
</>
|
</>
|
||||||
@@ -3428,7 +3058,7 @@ export default function RAGPage() {
|
|||||||
{pipelineState.status === 'running' && (
|
{pipelineState.status === 'running' && (
|
||||||
<svg className="w-6 h-6 text-blue-600 animate-spin" fill="none" viewBox="0 0 24 24">
|
<svg className="w-6 h-6 text-blue-600 animate-spin" fill="none" viewBox="0 0 24 24">
|
||||||
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4" />
|
||||||
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.7.689 3 7.938l3-2.647z" />
|
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z" />
|
||||||
</svg>
|
</svg>
|
||||||
)}
|
)}
|
||||||
{pipelineState.status === 'failed' && (
|
{pipelineState.status === 'failed' && (
|
||||||
|
|||||||
@@ -1,241 +0,0 @@
|
|||||||
/**
 * Shared RAG constants used by both page.tsx and ChunkBrowserQA.
 * REGULATIONS_IN_RAG maps regulation codes to their Qdrant collection, chunk count, and qdrant_id.
 * The qdrant_id is the actual `regulation_id` value stored in Qdrant payloads.
 * REGULATION_INFO provides minimal metadata (code, name, type) for all regulations.
 */

export interface RagRegulationEntry {
  collection: string
  chunks: number
  qdrant_id: string // The actual regulation_id value in Qdrant payload
}

export const REGULATIONS_IN_RAG: Record<string, RagRegulationEntry> = {
  // === EU regulations/directives (bp_compliance_ce) ===
  GDPR: { collection: 'bp_compliance_ce', chunks: 423, qdrant_id: 'eu_2016_679' },
  EPRIVACY: { collection: 'bp_compliance_ce', chunks: 134, qdrant_id: 'eu_2002_58' },
  SCC: { collection: 'bp_compliance_ce', chunks: 330, qdrant_id: 'eu_2021_914' },
  SCC_FULL_TEXT: { collection: 'bp_compliance_ce', chunks: 330, qdrant_id: 'eu_2021_914' },
  AIACT: { collection: 'bp_compliance_ce', chunks: 726, qdrant_id: 'eu_2024_1689' },
  CRA: { collection: 'bp_compliance_ce', chunks: 429, qdrant_id: 'eu_2024_2847' },
  NIS2: { collection: 'bp_compliance_ce', chunks: 342, qdrant_id: 'eu_2022_2555' },
  DGA: { collection: 'bp_compliance_ce', chunks: 508, qdrant_id: 'eu_2022_868' },
  DSA: { collection: 'bp_compliance_ce', chunks: 1106, qdrant_id: 'eu_2022_2065' },
  PLD: { collection: 'bp_compliance_ce', chunks: 44, qdrant_id: 'eu_1985_374' },
  E_COMMERCE_RL: { collection: 'bp_compliance_ce', chunks: 197, qdrant_id: 'eu_2000_31' },
  VERBRAUCHERRECHTE_RL: { collection: 'bp_compliance_ce', chunks: 266, qdrant_id: 'eu_2011_83' },
  DIGITALE_INHALTE_RL: { collection: 'bp_compliance_ce', chunks: 321, qdrant_id: 'eu_2019_770' },
  DMA: { collection: 'bp_compliance_ce', chunks: 701, qdrant_id: 'eu_2022_1925' },
  DPF: { collection: 'bp_compliance_ce', chunks: 2464, qdrant_id: 'dpf' },
  EUCSA: { collection: 'bp_compliance_ce', chunks: 558, qdrant_id: 'eucsa' },
  DATAACT: { collection: 'bp_compliance_ce', chunks: 809, qdrant_id: 'dataact' },
  DORA: { collection: 'bp_compliance_ce', chunks: 823, qdrant_id: 'dora' },
  PSD2: { collection: 'bp_compliance_ce', chunks: 796, qdrant_id: 'psd2' },
  AMLR: { collection: 'bp_compliance_ce', chunks: 1182, qdrant_id: 'amlr' },
  MiCA: { collection: 'bp_compliance_ce', chunks: 1640, qdrant_id: 'mica' },
  EHDS: { collection: 'bp_compliance_ce', chunks: 1212, qdrant_id: 'ehds' },
  EAA: { collection: 'bp_compliance_ce', chunks: 433, qdrant_id: 'eaa' },
  DSM: { collection: 'bp_compliance_ce', chunks: 416, qdrant_id: 'dsm' },
  GPSR: { collection: 'bp_compliance_ce', chunks: 509, qdrant_id: 'gpsr' },
  MACHINERY_REG: { collection: 'bp_compliance_ce', chunks: 1271, qdrant_id: 'eu_2023_1230' },
  BLUE_GUIDE: { collection: 'bp_compliance_ce', chunks: 2271, qdrant_id: 'eu_blue_guide_2022' },
  EU_IFRS_DE: { collection: 'bp_compliance_ce', chunks: 34388, qdrant_id: 'eu_2023_1803' },
  EU_IFRS_EN: { collection: 'bp_compliance_ce', chunks: 34388, qdrant_id: 'eu_2023_1803' },
  // International standards in bp_compliance_ce
  NIST_SSDF: { collection: 'bp_compliance_ce', chunks: 111, qdrant_id: 'nist_sp_800_218' },
  NIST_CSF_2: { collection: 'bp_compliance_ce', chunks: 67, qdrant_id: 'nist_csf_2_0' },
  OECD_AI_PRINCIPLES: { collection: 'bp_compliance_ce', chunks: 34, qdrant_id: 'oecd_ai_principles' },
  ENISA_SECURE_BY_DESIGN: { collection: 'bp_compliance_ce', chunks: 97, qdrant_id: 'cisa_secure_by_design' },
  ENISA_SUPPLY_CHAIN: { collection: 'bp_compliance_ce', chunks: 110, qdrant_id: 'enisa_supply_chain_good_practices' },
  ENISA_THREAT_LANDSCAPE: { collection: 'bp_compliance_ce', chunks: 118, qdrant_id: 'enisa_threat_landscape_supply_chain' },
  ENISA_ICS_SCADA: { collection: 'bp_compliance_ce', chunks: 195, qdrant_id: 'enisa_ics_scada_dependencies' },
  ENISA_CYBERSECURITY_2024: { collection: 'bp_compliance_ce', chunks: 22, qdrant_id: 'enisa_cybersecurity_state_2024' },

  // === German laws (bp_compliance_gesetze) ===
  TDDDG: { collection: 'bp_compliance_gesetze', chunks: 5, qdrant_id: 'tdddg_25' },
  TMG_KOMPLETT: { collection: 'bp_compliance_gesetze', chunks: 108, qdrant_id: 'tmg_komplett' },
  BDSG_FULL: { collection: 'bp_compliance_gesetze', chunks: 1056, qdrant_id: 'bdsg_2018_komplett' },
  DE_DDG: { collection: 'bp_compliance_gesetze', chunks: 40, qdrant_id: 'ddg_5' },
  DE_BGB_AGB: { collection: 'bp_compliance_gesetze', chunks: 4024, qdrant_id: 'bgb_komplett' },
  DE_EGBGB: { collection: 'bp_compliance_gesetze', chunks: 36, qdrant_id: 'egbgb_widerruf' },
  DE_HGB_RET: { collection: 'bp_compliance_gesetze', chunks: 11363, qdrant_id: 'hgb_komplett' },
  DE_AO_RET: { collection: 'bp_compliance_gesetze', chunks: 9669, qdrant_id: 'ao_komplett' },
  DE_TKG: { collection: 'bp_compliance_gesetze', chunks: 1631, qdrant_id: 'de_tkg' },
  DE_DLINFOV: { collection: 'bp_compliance_gesetze', chunks: 21, qdrant_id: 'de_dlinfov' },
  DE_BETRVG: { collection: 'bp_compliance_gesetze', chunks: 498, qdrant_id: 'de_betrvg' },
  DE_GESCHGEHG: { collection: 'bp_compliance_gesetze', chunks: 63, qdrant_id: 'de_geschgehg' },
  DE_USTG_RET: { collection: 'bp_compliance_gesetze', chunks: 1071, qdrant_id: 'de_ustg_ret' },
  DE_URHG: { collection: 'bp_compliance_gesetze', chunks: 626, qdrant_id: 'urhg_komplett' },

  // === BSI Standards (bp_compliance_gesetze) ===
  'BSI-TR-03161-1': { collection: 'bp_compliance_gesetze', chunks: 138, qdrant_id: 'bsi_tr_03161_1' },
  'BSI-TR-03161-2': { collection: 'bp_compliance_gesetze', chunks: 124, qdrant_id: 'bsi_tr_03161_2' },
  'BSI-TR-03161-3': { collection: 'bp_compliance_gesetze', chunks: 121, qdrant_id: 'bsi_tr_03161_3' },

  // === Austrian laws (bp_compliance_gesetze) ===
  AT_DSG: { collection: 'bp_compliance_gesetze', chunks: 805, qdrant_id: 'at_dsg' },
  AT_DSG_FULL: { collection: 'bp_compliance_gesetze', chunks: 6, qdrant_id: 'at_dsg_full' },
  AT_ECG: { collection: 'bp_compliance_gesetze', chunks: 120, qdrant_id: 'at_ecg' },
  AT_TKG: { collection: 'bp_compliance_gesetze', chunks: 4348, qdrant_id: 'at_tkg' },
  AT_KSCHG: { collection: 'bp_compliance_gesetze', chunks: 402, qdrant_id: 'at_kschg' },
  AT_FAGG: { collection: 'bp_compliance_gesetze', chunks: 2, qdrant_id: 'at_fagg' },
  AT_UGB_RET: { collection: 'bp_compliance_gesetze', chunks: 2828, qdrant_id: 'at_ugb_ret' },
  AT_BAO_RET: { collection: 'bp_compliance_gesetze', chunks: 2246, qdrant_id: 'at_bao_ret' },
  AT_MEDIENG: { collection: 'bp_compliance_gesetze', chunks: 571, qdrant_id: 'at_medieng' },
  AT_ABGB_AGB: { collection: 'bp_compliance_gesetze', chunks: 2521, qdrant_id: 'at_abgb_agb' },
  AT_UWG: { collection: 'bp_compliance_gesetze', chunks: 403, qdrant_id: 'at_uwg' },

  // === Swiss laws (bp_compliance_gesetze) ===
  CH_DSG: { collection: 'bp_compliance_gesetze', chunks: 180, qdrant_id: 'ch_revdsg' },
  CH_DSV: { collection: 'bp_compliance_gesetze', chunks: 5, qdrant_id: 'ch_dsv' },
  CH_OR_AGB: { collection: 'bp_compliance_gesetze', chunks: 5, qdrant_id: 'ch_or_agb' },
  CH_GEBUV: { collection: 'bp_compliance_gesetze', chunks: 5, qdrant_id: 'ch_gebuv' },
  CH_ZERTES: { collection: 'bp_compliance_gesetze', chunks: 5, qdrant_id: 'ch_zertes' },
  CH_ZGB_PERS: { collection: 'bp_compliance_gesetze', chunks: 5, qdrant_id: 'ch_zgb_pers' },

  // === National laws (other EU) in bp_compliance_gesetze ===
  ES_LOPDGDD: { collection: 'bp_compliance_gesetze', chunks: 782, qdrant_id: 'es_lopdgdd' },
  IT_CODICE_PRIVACY: { collection: 'bp_compliance_gesetze', chunks: 59, qdrant_id: 'it_codice_privacy' },
  NL_UAVG: { collection: 'bp_compliance_gesetze', chunks: 523, qdrant_id: 'nl_uavg' },
  FR_CNIL_GUIDE: { collection: 'bp_compliance_gesetze', chunks: 562, qdrant_id: 'fr_loi_informatique' },
  IE_DPA_2018: { collection: 'bp_compliance_gesetze', chunks: 64, qdrant_id: 'ie_dpa_2018' },
  UK_DPA_2018: { collection: 'bp_compliance_gesetze', chunks: 156, qdrant_id: 'uk_dpa_2018' },
  UK_GDPR: { collection: 'bp_compliance_gesetze', chunks: 45, qdrant_id: 'uk_gdpr' },
  NO_PERSONOPPLYSNINGSLOVEN: { collection: 'bp_compliance_gesetze', chunks: 41, qdrant_id: 'no_pol' },
  SE_DATASKYDDSLAG: { collection: 'bp_compliance_gesetze', chunks: 56, qdrant_id: 'se_dataskyddslag' },
  PL_UODO: { collection: 'bp_compliance_gesetze', chunks: 39, qdrant_id: 'pl_ustawa' },
  CZ_ZOU: { collection: 'bp_compliance_gesetze', chunks: 238, qdrant_id: 'cz_zakon' },
  HU_INFOTV: { collection: 'bp_compliance_gesetze', chunks: 747, qdrant_id: 'hu_info_tv' },
  LU_DPA_LAW: { collection: 'bp_compliance_gesetze', chunks: 2, qdrant_id: 'lu_dpa_law' },

  // === EDPB Guidelines (bp_compliance_datenschutz) ===
  EDPB_GUIDELINES_5_2020: { collection: 'bp_compliance_datenschutz', chunks: 236, qdrant_id: 'edpb_05_2020' },
  EDPB_GUIDELINES_7_2020: { collection: 'bp_compliance_datenschutz', chunks: 347, qdrant_id: 'edpb_guidelines_7_2020' },
  EDPB_GUIDELINES_1_2020: { collection: 'bp_compliance_datenschutz', chunks: 337, qdrant_id: 'edpb_01_2020' },
  EDPB_GUIDELINES_1_2022: { collection: 'bp_compliance_datenschutz', chunks: 510, qdrant_id: 'edpb_01_2022' },
  EDPB_GUIDELINES_2_2023: { collection: 'bp_compliance_datenschutz', chunks: 94, qdrant_id: 'edpb_02_2023' },
  EDPB_GUIDELINES_2_2024: { collection: 'bp_compliance_datenschutz', chunks: 79, qdrant_id: 'edpb_02_2024' },
  EDPB_GUIDELINES_4_2019: { collection: 'bp_compliance_datenschutz', chunks: 202, qdrant_id: 'edpb_04_2019' },
  EDPB_GUIDELINES_9_2022: { collection: 'bp_compliance_datenschutz', chunks: 243, qdrant_id: 'edpb_09_2022' },
  EDPB_DPIA_LIST: { collection: 'bp_compliance_datenschutz', chunks: 29, qdrant_id: 'edpb_dpia_list' },
  EDPB_LEGITIMATE_INTEREST: { collection: 'bp_compliance_datenschutz', chunks: 336, qdrant_id: 'edpb_legitimate_interest' },
  EDPS_DPIA_LIST: { collection: 'bp_compliance_datenschutz', chunks: 35, qdrant_id: 'edps_dpia_list' },
}

/**
 * Minimal regulation info for sidebar display.
 * Full REGULATIONS array with descriptions remains in page.tsx.
 */
export interface RegulationInfo {
  code: string
  name: string
  type: string
}

export const REGULATION_INFO: RegulationInfo[] = [
  // EU regulations
  { code: 'GDPR', name: 'DSGVO', type: 'eu_regulation' },
  { code: 'EPRIVACY', name: 'ePrivacy-Richtlinie', type: 'eu_directive' },
  { code: 'SCC', name: 'Standardvertragsklauseln', type: 'eu_regulation' },
  { code: 'SCC_FULL_TEXT', name: 'SCC Volltext', type: 'eu_regulation' },
  { code: 'DPF', name: 'EU-US Data Privacy Framework', type: 'eu_regulation' },
  { code: 'AIACT', name: 'EU AI Act', type: 'eu_regulation' },
  { code: 'CRA', name: 'Cyber Resilience Act', type: 'eu_regulation' },
  { code: 'NIS2', name: 'NIS2-Richtlinie', type: 'eu_directive' },
  { code: 'EUCSA', name: 'EU Cybersecurity Act', type: 'eu_regulation' },
  { code: 'DATAACT', name: 'Data Act', type: 'eu_regulation' },
  { code: 'DGA', name: 'Data Governance Act', type: 'eu_regulation' },
  { code: 'DSA', name: 'Digital Services Act', type: 'eu_regulation' },
  { code: 'DMA', name: 'Digital Markets Act', type: 'eu_regulation' },
  { code: 'EAA', name: 'European Accessibility Act', type: 'eu_directive' },
  { code: 'DSM', name: 'DSM-Urheberrechtsrichtlinie', type: 'eu_directive' },
  { code: 'PLD', name: 'Produkthaftungsrichtlinie', type: 'eu_directive' },
  { code: 'GPSR', name: 'General Product Safety', type: 'eu_regulation' },
  { code: 'E_COMMERCE_RL', name: 'E-Commerce-Richtlinie', type: 'eu_directive' },
  { code: 'VERBRAUCHERRECHTE_RL', name: 'Verbraucherrechte-RL', type: 'eu_directive' },
  { code: 'DIGITALE_INHALTE_RL', name: 'Digitale-Inhalte-RL', type: 'eu_directive' },
  // Financial
  { code: 'DORA', name: 'DORA', type: 'eu_regulation' },
  { code: 'PSD2', name: 'PSD2', type: 'eu_directive' },
  { code: 'AMLR', name: 'AML-Verordnung', type: 'eu_regulation' },
  { code: 'MiCA', name: 'MiCA', type: 'eu_regulation' },
  { code: 'EHDS', name: 'EHDS', type: 'eu_regulation' },
  { code: 'MACHINERY_REG', name: 'Maschinenverordnung', type: 'eu_regulation' },
  { code: 'BLUE_GUIDE', name: 'Blue Guide', type: 'eu_regulation' },
  { code: 'EU_IFRS_DE', name: 'EU-IFRS (DE)', type: 'eu_regulation' },
  { code: 'EU_IFRS_EN', name: 'EU-IFRS (EN)', type: 'eu_regulation' },
  // German laws
  { code: 'TDDDG', name: 'TDDDG', type: 'de_law' },
  { code: 'TMG_KOMPLETT', name: 'TMG', type: 'de_law' },
  { code: 'BDSG_FULL', name: 'BDSG', type: 'de_law' },
  { code: 'DE_DDG', name: 'DDG', type: 'de_law' },
  { code: 'DE_BGB_AGB', name: 'BGB/AGB', type: 'de_law' },
  { code: 'DE_EGBGB', name: 'EGBGB', type: 'de_law' },
  { code: 'DE_HGB_RET', name: 'HGB', type: 'de_law' },
  { code: 'DE_AO_RET', name: 'AO', type: 'de_law' },
  { code: 'DE_TKG', name: 'TKG', type: 'de_law' },
  { code: 'DE_DLINFOV', name: 'DL-InfoV', type: 'de_law' },
  { code: 'DE_BETRVG', name: 'BetrVG', type: 'de_law' },
  { code: 'DE_GESCHGEHG', name: 'GeschGehG', type: 'de_law' },
  { code: 'DE_USTG_RET', name: 'UStG', type: 'de_law' },
  { code: 'DE_URHG', name: 'UrhG', type: 'de_law' },
  // BSI
  { code: 'BSI-TR-03161-1', name: 'BSI-TR Teil 1', type: 'bsi_standard' },
  { code: 'BSI-TR-03161-2', name: 'BSI-TR Teil 2', type: 'bsi_standard' },
  { code: 'BSI-TR-03161-3', name: 'BSI-TR Teil 3', type: 'bsi_standard' },
  // AT
  { code: 'AT_DSG', name: 'DSG Oesterreich', type: 'at_law' },
  { code: 'AT_DSG_FULL', name: 'DSG Volltext', type: 'at_law' },
  { code: 'AT_ECG', name: 'ECG', type: 'at_law' },
  { code: 'AT_TKG', name: 'TKG AT', type: 'at_law' },
  { code: 'AT_KSCHG', name: 'KSchG', type: 'at_law' },
  { code: 'AT_FAGG', name: 'FAGG', type: 'at_law' },
  { code: 'AT_UGB_RET', name: 'UGB', type: 'at_law' },
  { code: 'AT_BAO_RET', name: 'BAO', type: 'at_law' },
  { code: 'AT_MEDIENG', name: 'MedienG', type: 'at_law' },
  { code: 'AT_ABGB_AGB', name: 'ABGB/AGB', type: 'at_law' },
  { code: 'AT_UWG', name: 'UWG AT', type: 'at_law' },
  // CH
  { code: 'CH_DSG', name: 'DSG Schweiz', type: 'ch_law' },
  { code: 'CH_DSV', name: 'DSV', type: 'ch_law' },
  { code: 'CH_OR_AGB', name: 'OR/AGB', type: 'ch_law' },
  { code: 'CH_GEBUV', name: 'GeBuV', type: 'ch_law' },
  { code: 'CH_ZERTES', name: 'ZertES', type: 'ch_law' },
  { code: 'CH_ZGB_PERS', name: 'ZGB', type: 'ch_law' },
  // Other EU national laws
  { code: 'ES_LOPDGDD', name: 'LOPDGDD Spanien', type: 'national_law' },
  { code: 'IT_CODICE_PRIVACY', name: 'Codice Privacy Italien', type: 'national_law' },
  { code: 'NL_UAVG', name: 'UAVG Niederlande', type: 'national_law' },
  { code: 'FR_CNIL_GUIDE', name: 'CNIL Guide RGPD', type: 'national_law' },
  { code: 'IE_DPA_2018', name: 'DPA 2018 Ireland', type: 'national_law' },
  { code: 'UK_DPA_2018', name: 'DPA 2018 UK', type: 'national_law' },
  { code: 'UK_GDPR', name: 'UK GDPR', type: 'national_law' },
  { code: 'NO_PERSONOPPLYSNINGSLOVEN', name: 'Personopplysningsloven', type: 'national_law' },
  { code: 'SE_DATASKYDDSLAG', name: 'Dataskyddslag Schweden', type: 'national_law' },
  { code: 'PL_UODO', name: 'UODO Polen', type: 'national_law' },
  { code: 'CZ_ZOU', name: 'Zakon Tschechien', type: 'national_law' },
  { code: 'HU_INFOTV', name: 'Infotv. Ungarn', type: 'national_law' },
  { code: 'LU_DPA_LAW', name: 'Datenschutzgesetz Luxemburg', type: 'national_law' },
  // EDPB
  { code: 'EDPB_GUIDELINES_5_2020', name: 'EDPB GL Einwilligung', type: 'eu_guideline' },
  { code: 'EDPB_GUIDELINES_7_2020', name: 'EDPB GL C/P Konzepte', type: 'eu_guideline' },
  { code: 'EDPB_GUIDELINES_1_2020', name: 'EDPB GL Fahrzeuge', type: 'eu_guideline' },
  { code: 'EDPB_GUIDELINES_1_2022', name: 'EDPB GL Bussgelder', type: 'eu_guideline' },
  { code: 'EDPB_GUIDELINES_2_2023', name: 'EDPB GL Art. 37 Scope', type: 'eu_guideline' },
  { code: 'EDPB_GUIDELINES_2_2024', name: 'EDPB GL 2024', type: 'eu_guideline' },
  { code: 'EDPB_GUIDELINES_4_2019', name: 'EDPB GL Art. 25 DPbD', type: 'eu_guideline' },
  { code: 'EDPB_GUIDELINES_9_2022', name: 'EDPB GL Datenschutzverletzung', type: 'eu_guideline' },
  { code: 'EDPB_DPIA_LIST', name: 'EDPB DPIA-Liste', type: 'eu_guideline' },
  { code: 'EDPB_LEGITIMATE_INTEREST', name: 'EDPB Berecht. Interesse', type: 'eu_guideline' },
  { code: 'EDPS_DPIA_LIST', name: 'EDPS DPIA-Liste', type: 'eu_guideline' },
  // International Standards
  { code: 'NIST_SSDF', name: 'NIST SSDF', type: 'international_standard' },
  { code: 'NIST_CSF_2', name: 'NIST CSF 2.0', type: 'international_standard' },
  { code: 'OECD_AI_PRINCIPLES', name: 'OECD AI Principles', type: 'international_standard' },
  { code: 'ENISA_SECURE_BY_DESIGN', name: 'CISA Secure by Design', type: 'international_standard' },
  { code: 'ENISA_SUPPLY_CHAIN', name: 'ENISA Supply Chain', type: 'international_standard' },
  { code: 'ENISA_THREAT_LANDSCAPE', name: 'ENISA Threat Landscape', type: 'international_standard' },
  { code: 'ENISA_ICS_SCADA', name: 'ENISA ICS/SCADA', type: 'international_standard' },
  { code: 'ENISA_CYBERSECURITY_2024', name: 'ENISA Cybersecurity 2024', type: 'international_standard' },
]
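A minimal sketch of how the shared map might be consumed. `isInRag` is called in page.tsx, but its definition is not part of this diff, so the version below is an assumption; `chunksPerCollection` and `SAMPLE_REGULATIONS_IN_RAG` are hypothetical names introduced only for illustration. Several codes in the map share one `qdrant_id` (for example `SCC` and `SCC_FULL_TEXT`), so summing `chunks` per code would probably double-count; the sketch counts each collection and `qdrant_id` pair once.

```typescript
interface RagRegulationEntry {
  collection: string
  chunks: number
  qdrant_id: string
}

// Small excerpt of REGULATIONS_IN_RAG; the real map has many more entries.
const SAMPLE_REGULATIONS_IN_RAG: Record<string, RagRegulationEntry> = {
  GDPR: { collection: 'bp_compliance_ce', chunks: 423, qdrant_id: 'eu_2016_679' },
  SCC: { collection: 'bp_compliance_ce', chunks: 330, qdrant_id: 'eu_2021_914' },
  SCC_FULL_TEXT: { collection: 'bp_compliance_ce', chunks: 330, qdrant_id: 'eu_2021_914' },
  TDDDG: { collection: 'bp_compliance_gesetze', chunks: 5, qdrant_id: 'tdddg_25' },
}

// Assumed shape of isInRag: an own-property lookup, so inherited keys
// such as 'toString' are not mistaken for regulation codes.
function isInRag(
  code: string,
  map: Record<string, RagRegulationEntry> = SAMPLE_REGULATIONS_IN_RAG,
): boolean {
  return Object.prototype.hasOwnProperty.call(map, code)
}

// Hypothetical helper: total chunks per collection, counting each
// (collection, qdrant_id) pair once even when several codes point at it.
function chunksPerCollection(
  map: Record<string, RagRegulationEntry> = SAMPLE_REGULATIONS_IN_RAG,
): Record<string, number> {
  const seen = new Set<string>()
  const totals: Record<string, number> = {}
  for (const entry of Object.values(map)) {
    const key = `${entry.collection}/${entry.qdrant_id}`
    if (seen.has(key)) continue
    seen.add(key)
    totals[entry.collection] = (totals[entry.collection] ?? 0) + entry.chunks
  }
  return totals
}

console.log(isInRag('GDPR'), isInRag('toString'))
console.log(chunksPerCollection())
```

Whether shared `qdrant_id` values really denote the same stored chunks is an assumption drawn from the identical chunk counts in the map; if they do not, the plain per-code sum would be the right total.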
|
|
||||||
@@ -1,163 +0,0 @@
|
|||||||
import { describe, it, expect, vi, beforeEach } from 'vitest'
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Tests for Chunk-Browser logic:
|
|
||||||
* - Collection dropdown has all 10 collections
|
|
||||||
* - COLLECTION_TOTALS has expected keys
|
|
||||||
* - Text search highlighting logic
|
|
||||||
* - Pagination state management
|
|
||||||
*/
|
|
||||||
|
|
||||||
// Replicate the COMPLIANCE_COLLECTIONS from the dropdown
|
|
||||||
const COMPLIANCE_COLLECTIONS = [
|
|
||||||
'bp_compliance_gesetze',
|
|
||||||
'bp_compliance_ce',
|
|
||||||
'bp_compliance_datenschutz',
|
|
||||||
'bp_dsfa_corpus',
|
|
||||||
'bp_compliance_recht',
|
|
||||||
'bp_legal_templates',
|
|
||||||
'bp_compliance_gdpr',
|
|
||||||
'bp_compliance_schulrecht',
|
|
||||||
'bp_dsfa_templates',
|
|
||||||
'bp_dsfa_risks',
|
|
||||||
] as const
|
|
||||||
|
|
||||||
// Replicate COLLECTION_TOTALS from page.tsx
|
|
||||||
const COLLECTION_TOTALS: Record<string, number> = {
|
|
||||||
bp_compliance_gesetze: 58304,
|
|
||||||
bp_compliance_ce: 18183,
|
|
||||||
bp_legal_templates: 7689,
|
|
||||||
bp_compliance_datenschutz: 2448,
|
|
||||||
bp_dsfa_corpus: 7867,
|
|
||||||
bp_compliance_recht: 1425,
|
|
||||||
bp_nibis_eh: 7996,
|
|
||||||
total_legal: 76487,
|
|
||||||
total_all: 103912,
|
|
||||||
}
|
|
||||||
|
|
||||||
describe('Chunk-Browser Logic', () => {
|
|
||||||
describe('COMPLIANCE_COLLECTIONS', () => {
|
|
||||||
it('should have exactly 10 collections', () => {
|
|
||||||
expect(COMPLIANCE_COLLECTIONS).toHaveLength(10)
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should include bp_compliance_ce for IFRS documents', () => {
|
|
||||||
expect(COMPLIANCE_COLLECTIONS).toContain('bp_compliance_ce')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should include bp_compliance_datenschutz for EFRAG/ENISA', () => {
|
|
||||||
expect(COMPLIANCE_COLLECTIONS).toContain('bp_compliance_datenschutz')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should include bp_compliance_gesetze as default', () => {
|
|
||||||
expect(COMPLIANCE_COLLECTIONS[0]).toBe('bp_compliance_gesetze')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should have all collection names starting with bp_', () => {
|
|
||||||
COMPLIANCE_COLLECTIONS.forEach((col) => {
|
|
||||||
expect(col).toMatch(/^bp_/)
|
|
||||||
})
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
describe('COLLECTION_TOTALS', () => {
|
|
||||||
it('should have bp_compliance_ce key', () => {
|
|
||||||
expect(COLLECTION_TOTALS).toHaveProperty('bp_compliance_ce')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should have bp_compliance_datenschutz key', () => {
|
|
||||||
expect(COLLECTION_TOTALS).toHaveProperty('bp_compliance_datenschutz')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should have positive counts for all collections', () => {
|
|
||||||
Object.values(COLLECTION_TOTALS).forEach((count) => {
|
|
||||||
expect(count).toBeGreaterThan(0)
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
it('total_all should be greater than total_legal', () => {
|
|
||||||
expect(COLLECTION_TOTALS.total_all).toBeGreaterThan(COLLECTION_TOTALS.total_legal)
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
  describe('Text search filtering logic', () => {
    const mockChunks = [
      { id: '1', text: 'DSGVO Artikel 1 Datenschutz', regulation_code: 'GDPR' },
      { id: '2', text: 'IFRS 16 Leasing Standard', regulation_code: 'EU_IFRS' },
      { id: '3', text: 'Datenschutz Grundverordnung', regulation_code: 'GDPR' },
      { id: '4', text: 'ENISA Supply Chain Security', regulation_code: 'ENISA' },
    ]

    it('should filter chunks by text search (case insensitive)', () => {
      const search = 'datenschutz'
      const filtered = mockChunks.filter((c) =>
        c.text.toLowerCase().includes(search.toLowerCase())
      )
      expect(filtered).toHaveLength(2)
    })

    it('should return all chunks when search is empty', () => {
      const search = ''
      const filtered = search
        ? mockChunks.filter((c) => c.text.toLowerCase().includes(search.toLowerCase()))
        : mockChunks
      expect(filtered).toHaveLength(4)
    })

    it('should return 0 chunks when no match', () => {
      const search = 'blockchain'
      const filtered = mockChunks.filter((c) =>
        c.text.toLowerCase().includes(search.toLowerCase())
      )
      expect(filtered).toHaveLength(0)
    })

    it('should match IFRS chunks', () => {
      const search = 'IFRS'
      const filtered = mockChunks.filter((c) =>
        c.text.toLowerCase().includes(search.toLowerCase())
      )
      expect(filtered).toHaveLength(1)
      expect(filtered[0].regulation_code).toBe('EU_IFRS')
    })
  })

  describe('Pagination state', () => {
    it('should start at page 0', () => {
      const currentPage = 0
      expect(currentPage).toBe(0)
    })

    it('should increment page on next', () => {
      let currentPage = 0
      currentPage += 1
      expect(currentPage).toBe(1)
    })

    it('should maintain offset history for back navigation', () => {
      const history: (string | null)[] = []
      history.push(null) // page 0 offset
      history.push('uuid-20') // page 1 offset
      history.push('uuid-40') // page 2 offset

      // Go back to page 1
      const prevOffset = history[history.length - 2]
      expect(prevOffset).toBe('uuid-20')
    })

    it('should reset state on collection change', () => {
      let chunkOffset: string | null = 'some-offset'
      let chunkHistory: (string | null)[] = [null, 'uuid-1']
      let chunkCurrentPage = 3

      // Simulate collection change
      chunkOffset = null
      chunkHistory = []
      chunkCurrentPage = 0

      expect(chunkOffset).toBeNull()
      expect(chunkHistory).toHaveLength(0)
      expect(chunkCurrentPage).toBe(0)
    })
  })
})
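The pagination tests above model cursor-style paging: each page is opened with an opaque offset, and a history of offsets allows going back. A minimal sketch of that idea follows. The names `PagerState`, `initialPager`, `nextPage`, `prevPage` and `currentOffset` are hypothetical and do not come from the removed page component, which is not visible in this diff.

```typescript
// Hypothetical helper names; the real page component is not shown in this diff.
type Offset = string | null

interface PagerState {
  history: Offset[] // history[i] is the offset that opens page i
  page: number // zero-based index of the current page
}

// Page 0 is always opened without an offset.
const initialPager = (): PagerState => ({ history: [null], page: 0 })

// Move forward using the cursor the backend returned for the next page.
function nextPage(state: PagerState, nextOffset: Offset): PagerState {
  if (nextOffset === null) return state // no further page available
  const visited = state.history.slice(0, state.page + 1)
  return { history: [...visited, nextOffset], page: state.page + 1 }
}

// Move back by reusing the offset stored for the previous page.
function prevPage(state: PagerState): PagerState {
  if (state.page === 0) return state
  return { history: state.history, page: state.page - 1 }
}

// Offset to send with the request for the current page.
const currentOffset = (state: PagerState): Offset => state.history[state.page]
```

Resetting on a collection change then amounts to replacing the state with `initialPager()`.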
@@ -1,90 +0,0 @@
import { describe, it, expect } from 'vitest'

/**
 * Tests for RAG page constants - REGULATIONS_IN_RAG, REGULATION_SOURCES, REGULATION_LICENSES
 *
 * These are defined inline in page.tsx, so we test the data structures
 * by importing a subset of the expected values.
 */

// Expected IFRS entries in REGULATIONS_IN_RAG
const EXPECTED_IFRS_ENTRIES = {
  EU_IFRS_DE: { collection: 'bp_compliance_ce', chunks: 0 },
  EU_IFRS_EN: { collection: 'bp_compliance_ce', chunks: 0 },
  EFRAG_ENDORSEMENT: { collection: 'bp_compliance_datenschutz', chunks: 0 },
}

// Expected REGULATION_SOURCES URLs
const EXPECTED_SOURCES = {
  GDPR: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32016R0679',
  EU_IFRS_DE: 'https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=CELEX:32023R1803',
  EU_IFRS_EN: 'https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32023R1803',
  EFRAG_ENDORSEMENT: 'https://www.efrag.org/activities/endorsement-status-report',
  ENISA_SECURE_DEV: 'https://www.enisa.europa.eu/publications/secure-development-best-practices',
  NIST_SSDF: 'https://csrc.nist.gov/pubs/sp/800/218/final',
  NIST_CSF: 'https://www.nist.gov/cyberframework',
  OECD_AI: 'https://oecd.ai/en/ai-principles',
}

describe('RAG Page Constants', () => {
  describe('IFRS entries in REGULATIONS_IN_RAG', () => {
    it('should have EU_IFRS_DE entry with bp_compliance_ce collection', () => {
      expect(EXPECTED_IFRS_ENTRIES.EU_IFRS_DE.collection).toBe('bp_compliance_ce')
    })

    it('should have EU_IFRS_EN entry with bp_compliance_ce collection', () => {
      expect(EXPECTED_IFRS_ENTRIES.EU_IFRS_EN.collection).toBe('bp_compliance_ce')
    })

    it('should have EFRAG_ENDORSEMENT entry with bp_compliance_datenschutz collection', () => {
      expect(EXPECTED_IFRS_ENTRIES.EFRAG_ENDORSEMENT.collection).toBe('bp_compliance_datenschutz')
    })
  })

  describe('REGULATION_SOURCES URLs', () => {
    it('should have valid EUR-Lex URLs for EU regulations', () => {
      expect(EXPECTED_SOURCES.GDPR).toMatch(/^https:\/\/eur-lex\.europa\.eu/)
      expect(EXPECTED_SOURCES.EU_IFRS_DE).toMatch(/^https:\/\/eur-lex\.europa\.eu/)
      expect(EXPECTED_SOURCES.EU_IFRS_EN).toMatch(/^https:\/\/eur-lex\.europa\.eu/)
    })

    it('should have correct CELEX for IFRS DE (32023R1803)', () => {
      expect(EXPECTED_SOURCES.EU_IFRS_DE).toContain('32023R1803')
    })

    it('should have correct CELEX for IFRS EN (32023R1803)', () => {
      expect(EXPECTED_SOURCES.EU_IFRS_EN).toContain('32023R1803')
    })

    it('should have DE language for IFRS DE', () => {
      expect(EXPECTED_SOURCES.EU_IFRS_DE).toContain('/DE/')
    })

    it('should have EN language for IFRS EN', () => {
      expect(EXPECTED_SOURCES.EU_IFRS_EN).toContain('/EN/')
    })

    it('should have EFRAG URL for endorsement status', () => {
      expect(EXPECTED_SOURCES.EFRAG_ENDORSEMENT).toMatch(/^https:\/\/www\.efrag\.org/)
    })

    it('should have ENISA URL for secure development', () => {
      expect(EXPECTED_SOURCES.ENISA_SECURE_DEV).toMatch(/^https:\/\/www\.enisa\.europa\.eu/)
    })

    it('should have NIST URLs for SSDF and CSF', () => {
      expect(EXPECTED_SOURCES.NIST_SSDF).toMatch(/nist\.gov/)
      expect(EXPECTED_SOURCES.NIST_CSF).toMatch(/nist\.gov/)
    })

    it('should have OECD URL for AI principles', () => {
      expect(EXPECTED_SOURCES.OECD_AI).toMatch(/oecd\.ai/)
    })

    it('should all be valid HTTPS URLs', () => {
      Object.values(EXPECTED_SOURCES).forEach((url) => {
        expect(url).toMatch(/^https:\/\//)
      })
    })
  })
})
@@ -1,249 +0,0 @@
import { describe, it, expect, vi, beforeEach } from 'vitest'

// Mock fetch globally
const mockFetch = vi.fn()
global.fetch = mockFetch

// Mock NextRequest and NextResponse
vi.mock('next/server', () => ({
  NextRequest: class MockNextRequest {
    url: string
    constructor(url: string) {
      this.url = url
    }
  },
  NextResponse: {
    json: (data: unknown, init?: { status?: number }) => ({
      data,
      status: init?.status || 200,
    }),
  },
}))

describe('Legal Corpus API Proxy', () => {
  beforeEach(() => {
    mockFetch.mockClear()
  })

  describe('scroll action', () => {
    it('should call Qdrant scroll endpoint with correct collection', async () => {
      const mockScrollResponse = {
        result: {
          points: [
            { id: 'uuid-1', payload: { text: 'DSGVO Artikel 1', regulation_code: 'GDPR' } },
            { id: 'uuid-2', payload: { text: 'DSGVO Artikel 2', regulation_code: 'GDPR' } },
          ],
          next_page_offset: 'uuid-3',
        },
      }

      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve(mockScrollResponse),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll&collection=bp_compliance_ce&limit=20' }
      const response = await GET(request as any)

      expect(mockFetch).toHaveBeenCalledTimes(1)
      const calledUrl = mockFetch.mock.calls[0][0]
      expect(calledUrl).toContain('/collections/bp_compliance_ce/points/scroll')

      const body = JSON.parse(mockFetch.mock.calls[0][1].body)
      expect(body.limit).toBe(20)
      expect(body.with_payload).toBe(true)
      expect(body.with_vector).toBe(false)
    })

    it('should pass offset parameter to Qdrant', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve({ result: { points: [], next_page_offset: null } }),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll&collection=bp_compliance_gesetze&offset=some-uuid' }
      await GET(request as any)

      const body = JSON.parse(mockFetch.mock.calls[0][1].body)
      expect(body.offset).toBe('some-uuid')
    })

    it('should limit chunks to max 100', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve({ result: { points: [], next_page_offset: null } }),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll&collection=bp_compliance_ce&limit=500' }
      await GET(request as any)

      const body = JSON.parse(mockFetch.mock.calls[0][1].body)
      expect(body.limit).toBe(100)
    })

    it('should apply text_search filter client-side', async () => {
      const mockScrollResponse = {
        result: {
          points: [
            { id: 'uuid-1', payload: { text: 'DSGVO Artikel 1 Datenschutz' } },
            { id: 'uuid-2', payload: { text: 'IFRS Standard 16 Leasing' } },
            { id: 'uuid-3', payload: { text: 'Datenschutz Grundverordnung' } },
          ],
          next_page_offset: null,
        },
      }

      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve(mockScrollResponse),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll&collection=bp_compliance_ce&text_search=Datenschutz' }
      const response = await GET(request as any)

      // Should filter to only chunks containing "Datenschutz"
      expect((response as any).data.chunks).toHaveLength(2)
      expect((response as any).data.chunks[0].text).toContain('Datenschutz')
    })

    it('should flatten payload into chunk objects', async () => {
      const mockScrollResponse = {
        result: {
          points: [
            {
              id: 'uuid-1',
              payload: {
                text: 'IFRS 16 Leasing',
                regulation_code: 'EU_IFRS',
                language: 'de',
                celex: '32023R1803',
              },
            },
          ],
          next_page_offset: null,
        },
      }

      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve(mockScrollResponse),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll&collection=bp_compliance_ce' }
      const response = await GET(request as any)

      const chunk = (response as any).data.chunks[0]
      expect(chunk.id).toBe('uuid-1')
      expect(chunk.text).toBe('IFRS 16 Leasing')
      expect(chunk.regulation_code).toBe('EU_IFRS')
      expect(chunk.language).toBe('de')
    })

    it('should return next_offset from Qdrant response', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve({
          result: { points: [], next_page_offset: 'next-uuid' },
        }),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll&collection=bp_compliance_ce' }
      const response = await GET(request as any)

      expect((response as any).data.next_offset).toBe('next-uuid')
    })

    it('should handle Qdrant scroll failure', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: false,
        status: 404,
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll&collection=nonexistent' }
      const response = await GET(request as any)

      expect((response as any).status).toBe(404)
    })

    it('should apply filter when filter_key and filter_value provided', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve({ result: { points: [], next_page_offset: null } }),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll&collection=bp_compliance_ce&filter_key=language&filter_value=de' }
      await GET(request as any)

      const body = JSON.parse(mockFetch.mock.calls[0][1].body)
      expect(body.filter).toEqual({
        must: [{ key: 'language', match: { value: 'de' } }],
      })
    })

    it('should default collection to bp_compliance_gesetze', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve({ result: { points: [], next_page_offset: null } }),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=scroll' }
      await GET(request as any)

      const calledUrl = mockFetch.mock.calls[0][0]
      expect(calledUrl).toContain('/collections/bp_compliance_gesetze/')
    })
  })

  describe('collection-count action', () => {
    it('should return points_count from Qdrant collection info', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve({
          result: { points_count: 55053 },
        }),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=collection-count&collection=bp_compliance_ce' }
      const response = await GET(request as any)

      expect((response as any).data.count).toBe(55053)
    })

    it('should return 0 when Qdrant is unavailable', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: false,
        status: 500,
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=collection-count&collection=bp_compliance_ce' }
      const response = await GET(request as any)

      expect((response as any).data.count).toBe(0)
    })

    it('should default to bp_compliance_gesetze collection', async () => {
      mockFetch.mockResolvedValueOnce({
        ok: true,
        json: () => Promise.resolve({ result: { points_count: 1234 } }),
      })

      const { GET } = await import('../route')
      const request = { url: 'http://localhost/api/legal-corpus?action=collection-count' }
      await GET(request as any)

      const calledUrl = mockFetch.mock.calls[0][0]
      expect(calledUrl).toContain('/collections/bp_compliance_gesetze')
    })
  })
})
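The scroll tests above pin down how query parameters become a Qdrant scroll request body: a limit capped at 100, payload included, vectors excluded, and an optional offset and `must` filter. A minimal sketch of that mapping as a pure function is shown below. `buildScrollBody` and `ScrollBody` are hypothetical names, and the fallback for a non-numeric limit is an added assumption that the tests do not cover.

```typescript
// Hypothetical helper; it mirrors the request body the scroll tests expect.
interface ScrollBody {
  limit: number
  with_payload: boolean
  with_vector: boolean
  offset?: string
  filter?: { must: { key: string; match: { value: string } }[] }
}

function buildScrollBody(params: URLSearchParams): ScrollBody {
  const parsed = parseInt(params.get('limit') || '20', 10)
  // Assumption (not covered by the tests): fall back to 20 for invalid limits.
  const limit = Number.isFinite(parsed) && parsed > 0 ? parsed : 20

  const body: ScrollBody = {
    limit: Math.min(limit, 100), // same cap the tests assert
    with_payload: true,
    with_vector: false,
  }

  const offset = params.get('offset')
  if (offset) body.offset = offset

  // The filter is only attached when both key and value are present.
  const key = params.get('filter_key')
  const value = params.get('filter_value')
  if (key && value) body.filter = { must: [{ key, match: { value } }] }

  return body
}
```

Keeping the mapping in a pure function like this would let it be tested without mocking `fetch`.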
@@ -66,99 +66,6 @@ export async function GET(request: NextRequest) {
         url += `/traceability?chunk_id=${encodeURIComponent(chunkId || '')}&regulation=${encodeURIComponent(regulation || '')}`
         break
       }
-      case 'scroll': {
-        const collection = searchParams.get('collection') || 'bp_compliance_gesetze'
-        const limit = parseInt(searchParams.get('limit') || '20', 10)
-        const offsetParam = searchParams.get('offset')
-        const filterKey = searchParams.get('filter_key')
-        const filterValue = searchParams.get('filter_value')
-        const textSearch = searchParams.get('text_search')
-
-        const scrollBody: Record<string, unknown> = {
-          limit: Math.min(limit, 100),
-          with_payload: true,
-          with_vector: false,
-        }
-        if (offsetParam) {
-          scrollBody.offset = offsetParam
-        }
-        if (filterKey && filterValue) {
-          scrollBody.filter = {
-            must: [{ key: filterKey, match: { value: filterValue } }],
-          }
-        }
-
-        const scrollRes = await fetch(`${QDRANT_URL}/collections/${encodeURIComponent(collection)}/points/scroll`, {
-          method: 'POST',
-          headers: { 'Content-Type': 'application/json' },
-          body: JSON.stringify(scrollBody),
-          cache: 'no-store',
-        })
-        if (!scrollRes.ok) {
-          return NextResponse.json({ error: 'Qdrant scroll failed' }, { status: scrollRes.status })
-        }
-        const scrollData = await scrollRes.json()
-        const points = (scrollData.result?.points || []).map((p: { id: string; payload?: Record<string, unknown> }) => ({
-          id: p.id,
-          ...p.payload,
-        }))
-
-        // Client-side text search filter
-        let filtered = points
-        if (textSearch && textSearch.trim()) {
-          const term = textSearch.toLowerCase()
-          filtered = points.filter((p: Record<string, unknown>) => {
-            const text = String(p.text || p.content || p.chunk_text || '')
-            return text.toLowerCase().includes(term)
-          })
-        }
-
-        return NextResponse.json({
-          chunks: filtered,
-          next_offset: scrollData.result?.next_page_offset || null,
-          total_in_page: points.length,
-        })
-      }
-      case 'regulation-counts-batch': {
-        const col = searchParams.get('collection') || 'bp_compliance_gesetze'
-        // Accept qdrant_ids (actual regulation_id values in Qdrant payload)
-        const qdrantIds = (searchParams.get('qdrant_ids') || '').split(',').filter(Boolean)
-        const results: Record<string, number> = {}
-        for (let i = 0; i < qdrantIds.length; i += 10) {
-          const batch = qdrantIds.slice(i, i + 10)
-          await Promise.all(batch.map(async (qid) => {
-            try {
-              const res = await fetch(`${QDRANT_URL}/collections/${encodeURIComponent(col)}/points/count`, {
-                method: 'POST',
-                headers: { 'Content-Type': 'application/json' },
-                body: JSON.stringify({
-                  filter: { must: [{ key: 'regulation_id', match: { value: qid } }] },
-                  exact: true,
-                }),
-                cache: 'no-store',
-              })
-              if (res.ok) {
-                const data = await res.json()
-                results[qid] = data.result?.count || 0
-              }
-            } catch { /* skip failed counts */ }
-          }))
-        }
-        return NextResponse.json({ counts: results })
-      }
-      case 'collection-count': {
-        const col = searchParams.get('collection') || 'bp_compliance_gesetze'
-        const countRes = await fetch(`${QDRANT_URL}/collections/${encodeURIComponent(col)}`, {
-          cache: 'no-store',
-        })
-        if (!countRes.ok) {
-          return NextResponse.json({ count: 0 })
-        }
-        const countData = await countRes.json()
-        return NextResponse.json({
-          count: countData.result?.points_count || 0,
-        })
-      }
       default:
         return NextResponse.json({ error: 'Unknown action' }, { status: 400 })
     }
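The removed `regulation-counts-batch` case walks its ID list in slices of 10, runs each slice concurrently with `Promise.all`, and silently skips entries whose request fails. A minimal generic sketch of that pattern follows. `mapInBatches` is a hypothetical helper name, and the route in this diff inlines the loop rather than calling such a function.

```typescript
// Hypothetical generic helper; the route in this diff inlines the same loop.
async function mapInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<Map<T, R>> {
  const results = new Map<T, R>()
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize)
    // Items within a batch run concurrently; batches run one after another.
    await Promise.all(
      batch.map(async (item) => {
        try {
          results.set(item, await worker(item))
        } catch {
          // Skip failed items so one error does not discard the whole batch.
        }
      }),
    )
  }
  return results
}
```

Bounding the batch size caps the number of simultaneous requests against the backend, at the cost of waiting for the slowest item in each batch before the next batch starts.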
@@ -1,12 +1,8 @@
 import type { Metadata } from 'next'
-import localFont from 'next/font/local'
+import { Inter } from 'next/font/google'
 import './globals.css'
 
-const inter = localFont({
-  src: '../public/fonts/Inter-VariableFont.woff2',
-  variable: '--font-inter',
-  display: 'swap',
-})
+const inter = Inter({ subsets: ['latin'] })
 
 export const metadata: Metadata = {
   title: 'BreakPilot Admin Lehrer KI',
@@ -1,320 +0,0 @@
'use client'

import { useState, useMemo } from 'react'
import type { ColumnResult, ColumnGroundTruth, PageRegion } from '@/app/(admin)/ai/ocr-pipeline/types'

interface ColumnControlsProps {
  columnResult: ColumnResult | null
  onRerun: () => void
  onManualMode: () => void
  onGtMode: () => void
  onGroundTruth: (gt: ColumnGroundTruth) => void
  onNext: () => void
  isDetecting: boolean
  savedGtColumns: PageRegion[] | null
}

const TYPE_COLORS: Record<string, string> = {
  column_en: 'bg-blue-100 text-blue-700 dark:bg-blue-900/30 dark:text-blue-400',
  column_de: 'bg-green-100 text-green-700 dark:bg-green-900/30 dark:text-green-400',
  column_example: 'bg-orange-100 text-orange-700 dark:bg-orange-900/30 dark:text-orange-400',
  column_text: 'bg-cyan-100 text-cyan-700 dark:bg-cyan-900/30 dark:text-cyan-400',
  page_ref: 'bg-purple-100 text-purple-700 dark:bg-purple-900/30 dark:text-purple-400',
  column_marker: 'bg-red-100 text-red-700 dark:bg-red-900/30 dark:text-red-400',
  column_ignore: 'bg-gray-100 text-gray-500 dark:bg-gray-700/30 dark:text-gray-500',
  header: 'bg-gray-100 text-gray-600 dark:bg-gray-700/50 dark:text-gray-400',
  footer: 'bg-gray-100 text-gray-600 dark:bg-gray-700/50 dark:text-gray-400',
}

const TYPE_LABELS: Record<string, string> = {
  column_en: 'EN',
  column_de: 'DE',
  column_example: 'Beispiel',
  column_text: 'Text',
  page_ref: 'Seite',
  column_marker: 'Marker',
  column_ignore: 'Ignorieren',
  header: 'Header',
  footer: 'Footer',
}

const METHOD_LABELS: Record<string, string> = {
  content: 'Inhalt',
  position_enhanced: 'Position',
  position_fallback: 'Fallback',
}

interface DiffRow {
  index: number
  autoCol: PageRegion | null
  gtCol: PageRegion | null
  diffX: number | null
  diffW: number | null
  typeMismatch: boolean
}

/** Match auto columns to GT columns by overlap on X-axis (IoU > 30%) */
function computeDiff(autoCols: PageRegion[], gtCols: PageRegion[]): DiffRow[] {
  const rows: DiffRow[] = []
  const usedGt = new Set<number>()
  const usedAuto = new Set<number>()

  // Match auto → GT by best X-axis overlap
  for (let ai = 0; ai < autoCols.length; ai++) {
    const a = autoCols[ai]
    let bestIdx = -1
    let bestIoU = 0

    for (let gi = 0; gi < gtCols.length; gi++) {
      if (usedGt.has(gi)) continue
      const g = gtCols[gi]
      const overlapStart = Math.max(a.x, g.x)
      const overlapEnd = Math.min(a.x + a.width, g.x + g.width)
      const overlap = Math.max(0, overlapEnd - overlapStart)
      const union = (a.width + g.width) - overlap
      const iou = union > 0 ? overlap / union : 0
      if (iou > bestIoU) {
        bestIoU = iou
        bestIdx = gi
      }
    }

    // Accept the best candidate only above the 0.3 IoU threshold
    if (bestIdx >= 0 && bestIoU > 0.3) {
      usedGt.add(bestIdx)
      usedAuto.add(ai)
      const g = gtCols[bestIdx]
      rows.push({
        index: rows.length + 1,
        autoCol: a,
        gtCol: g,
        diffX: g.x - a.x,
        diffW: g.width - a.width,
        typeMismatch: a.type !== g.type,
      })
    }
  }

  // Unmatched auto columns
  for (let ai = 0; ai < autoCols.length; ai++) {
    if (usedAuto.has(ai)) continue
    rows.push({
      index: rows.length + 1,
      autoCol: autoCols[ai],
      gtCol: null,
      diffX: null,
      diffW: null,
      typeMismatch: false,
    })
  }

  // Unmatched GT columns
  for (let gi = 0; gi < gtCols.length; gi++) {
    if (usedGt.has(gi)) continue
    rows.push({
      index: rows.length + 1,
      autoCol: null,
      gtCol: gtCols[gi],
      diffX: null,
      diffW: null,
      typeMismatch: false,
    })
  }

  return rows
}

export function ColumnControls({ columnResult, onRerun, onManualMode, onGtMode, onGroundTruth, onNext, isDetecting, savedGtColumns }: ColumnControlsProps) {
  const [gtSaved, setGtSaved] = useState(false)

  const diffRows = useMemo(() => {
    if (!columnResult || !savedGtColumns) return null
    const autoCols = columnResult.columns.filter(c => c.type.startsWith('column') || c.type === 'page_ref')
    const gtCols = savedGtColumns.filter(c => c.type.startsWith('column') || c.type === 'page_ref')
    return computeDiff(autoCols, gtCols)
  }, [columnResult, savedGtColumns])

  if (!columnResult) return null

  const columns = columnResult.columns.filter((c: PageRegion) => c.type.startsWith('column') || c.type === 'page_ref')
  const headerFooter = columnResult.columns.filter((c: PageRegion) => !c.type.startsWith('column') && c.type !== 'page_ref')

  const handleGt = (isCorrect: boolean) => {
    onGroundTruth({ is_correct: isCorrect })
    setGtSaved(true)
  }

  return (
    <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4 space-y-4">
      {/* Summary */}
      <div className="flex items-center gap-3 flex-wrap">
        <div className="text-sm text-gray-600 dark:text-gray-400">
          <span className="font-medium text-gray-800 dark:text-gray-200">{columns.length} Spalten</span> erkannt
          {columnResult.duration_seconds > 0 && (
            <span className="ml-2 text-xs">({columnResult.duration_seconds}s)</span>
          )}
        </div>
        <button
          onClick={onRerun}
          disabled={isDetecting}
          className="text-xs px-2 py-1 bg-gray-100 dark:bg-gray-700 rounded hover:bg-gray-200 dark:hover:bg-gray-600 transition-colors disabled:opacity-50"
        >
          Erneut erkennen
        </button>
        <button
          onClick={onManualMode}
          className="text-xs px-2 py-1 bg-teal-100 text-teal-700 dark:bg-teal-900/30 dark:text-teal-400 rounded hover:bg-teal-200 dark:hover:bg-teal-900/50 transition-colors"
        >
          Manuell markieren
        </button>
        <button
          onClick={onGtMode}
          className="text-xs px-2 py-1 bg-amber-100 text-amber-700 dark:bg-amber-900/30 dark:text-amber-400 rounded hover:bg-amber-200 dark:hover:bg-amber-900/50 transition-colors"
        >
          {savedGtColumns ? 'Ground Truth bearbeiten' : 'Ground Truth eintragen'}
        </button>
      </div>

      {/* Column list */}
      <div className="space-y-2">
        {columns.map((col: PageRegion, i: number) => (
          <div key={i} className="flex items-center gap-3 text-sm">
            <span className={`px-2 py-0.5 rounded text-xs font-medium ${TYPE_COLORS[col.type] || ''}`}>
              {TYPE_LABELS[col.type] || col.type}
            </span>
            {col.classification_confidence != null && col.classification_confidence < 1.0 && (
              <span className="text-xs font-medium text-gray-600 dark:text-gray-300">
                {Math.round(col.classification_confidence * 100)}%
              </span>
            )}
            {col.classification_method && (
              <span className="text-xs text-gray-400 dark:text-gray-500">
                ({METHOD_LABELS[col.classification_method] || col.classification_method})
              </span>
            )}
            <span className="text-gray-500 dark:text-gray-400 text-xs font-mono">
              x={col.x} y={col.y} {col.width}x{col.height}px
            </span>
          </div>
        ))}
        {headerFooter.map((r: PageRegion, i: number) => (
          <div key={`hf-${i}`} className="flex items-center gap-3 text-sm">
            <span className={`px-2 py-0.5 rounded text-xs font-medium ${TYPE_COLORS[r.type] || ''}`}>
              {TYPE_LABELS[r.type] || r.type}
            </span>
            <span className="text-gray-500 dark:text-gray-400 text-xs font-mono">
              x={r.x} y={r.y} {r.width}x{r.height}px
            </span>
          </div>
        ))}
      </div>

{/* Diff table (Auto vs GT) */}
|
|
||||||
{diffRows && diffRows.length > 0 && (
|
|
||||||
<div className="border-t border-gray-100 dark:border-gray-700 pt-3">
|
|
||||||
<div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-2">
|
|
||||||
Vergleich: Auto vs Ground Truth
|
|
||||||
</div>
|
|
||||||
<div className="overflow-x-auto">
|
|
||||||
<table className="w-full text-xs">
|
|
||||||
<thead>
|
|
||||||
<tr className="text-gray-500 dark:text-gray-400 border-b border-gray-100 dark:border-gray-700">
|
|
||||||
<th className="text-left py-1 pr-2">#</th>
|
|
||||||
<th className="text-left py-1 pr-2">Auto (Typ, x, w)</th>
|
|
||||||
<th className="text-left py-1 pr-2">GT (Typ, x, w)</th>
|
|
||||||
<th className="text-right py-1 pr-2">Diff X</th>
|
|
||||||
<th className="text-right py-1">Diff W</th>
|
|
||||||
</tr>
|
|
||||||
</thead>
|
|
||||||
              <tbody>
                {diffRows.map((row) => (
                  <tr
                    key={row.index}
                    className={
                      !row.autoCol || !row.gtCol || row.typeMismatch
                        ? 'bg-red-50 dark:bg-red-900/10'
                        : (row.diffX !== null && Math.abs(row.diffX) > 20) || (row.diffW !== null && Math.abs(row.diffW) > 20)
                          ? 'bg-amber-50 dark:bg-amber-900/10'
                          : ''
                    }
                  >
                    <td className="py-1 pr-2 font-mono text-gray-400">{row.index}</td>
                    <td className="py-1 pr-2 font-mono">
                      {row.autoCol ? (
                        <span>
                          <span className={`inline-block px-1 rounded ${TYPE_COLORS[row.autoCol.type] || ''}`}>
                            {TYPE_LABELS[row.autoCol.type] || row.autoCol.type}
                          </span>
                          {' '}{row.autoCol.x}, {row.autoCol.width}
                        </span>
                      ) : (
                        <span className="text-red-400">fehlt</span>
                      )}
                    </td>
                    <td className="py-1 pr-2 font-mono">
                      {row.gtCol ? (
                        <span>
                          <span className={`inline-block px-1 rounded ${TYPE_COLORS[row.gtCol.type] || ''}`}>
                            {TYPE_LABELS[row.gtCol.type] || row.gtCol.type}
                          </span>
                          {' '}{row.gtCol.x}, {row.gtCol.width}
                        </span>
                      ) : (
                        <span className="text-red-400">fehlt</span>
                      )}
                    </td>
                    <td className="py-1 pr-2 text-right font-mono">
                      {row.diffX !== null ? (
                        <span className={Math.abs(row.diffX) > 20 ? 'text-amber-600 dark:text-amber-400' : 'text-gray-500'}>
                          {row.diffX > 0 ? '+' : ''}{row.diffX}
                        </span>
                      ) : '—'}
                    </td>
                    <td className="py-1 text-right font-mono">
                      {row.diffW !== null ? (
                        <span className={Math.abs(row.diffW) > 20 ? 'text-amber-600 dark:text-amber-400' : 'text-gray-500'}>
                          {row.diffW > 0 ? '+' : ''}{row.diffW}
                        </span>
                      ) : '—'}
                    </td>
                  </tr>
                ))}
              </tbody>
            </table>
          </div>
        </div>
      )}

      {/* Ground Truth + Navigation */}
      <div className="flex items-center justify-between pt-2 border-t border-gray-100 dark:border-gray-700">
        <div className="flex items-center gap-2">
          <span className="text-sm text-gray-500 dark:text-gray-400">Spalten korrekt?</span>
          {gtSaved ? (
            <span className="text-xs text-green-600 dark:text-green-400">Gespeichert</span>
          ) : (
            <>
              <button
                onClick={() => handleGt(true)}
                className="text-xs px-3 py-1 bg-green-100 text-green-700 dark:bg-green-900/30 dark:text-green-400 rounded hover:bg-green-200 dark:hover:bg-green-900/50 transition-colors"
              >
                Ja
              </button>
              <button
                onClick={() => handleGt(false)}
                className="text-xs px-3 py-1 bg-red-100 text-red-700 dark:bg-red-900/30 dark:text-red-400 rounded hover:bg-red-200 dark:hover:bg-red-900/50 transition-colors"
              >
                Nein
              </button>
            </>
          )}
        </div>

        <button
          onClick={onNext}
          className="px-4 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors text-sm font-medium"
        >
          Weiter
        </button>
      </div>
    </div>
  )
}
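Review note: the diff table in the component above reads `index`, `autoCol`, `gtCol`, `diffX`, `diffW`, and `typeMismatch` from each entry of `diffRows`, but this part of the diff does not show where those rows are built. The following is a minimal sketch of one way they could be derived. It assumes that columns are paired by position after sorting by `x` and that differences are computed as auto minus ground truth. `Column`, `DiffRow`, and `buildDiffRows` are hypothetical names chosen for illustration, not taken from the repository.

```typescript
// Hypothetical column shape, inferred from the fields the table renders.
interface Column {
  type: string
  x: number
  width: number
}

// Hypothetical row shape matching the properties read by the diff table.
interface DiffRow {
  index: number
  autoCol: Column | null
  gtCol: Column | null
  diffX: number | null
  diffW: number | null
  typeMismatch: boolean
}

// Pair auto-detected and ground-truth columns left to right.
// Assumption: positional pairing is sufficient; the real code may match differently.
function buildDiffRows(auto: Column[], gt: Column[]): DiffRow[] {
  const a = [...auto].sort((p, q) => p.x - q.x)
  const g = [...gt].sort((p, q) => p.x - q.x)
  const rows: DiffRow[] = []
  for (let i = 0; i < Math.max(a.length, g.length); i++) {
    const autoCol = i < a.length ? a[i] : null
    const gtCol = i < g.length ? g[i] : null
    rows.push({
      index: i + 1,
      autoCol,
      gtCol,
      // Differences are only defined when both sides have a column.
      diffX: autoCol !== null && gtCol !== null ? autoCol.x - gtCol.x : null,
      diffW: autoCol !== null && gtCol !== null ? autoCol.width - gtCol.width : null,
      typeMismatch: autoCol !== null && gtCol !== null && autoCol.type !== gtCol.type,
    })
  }
  return rows
}
```

With rows shaped like this, a missing column on either side yields `null` differences, which is the case the table renders as "fehlt" and highlights in red.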
@@ -1,209 +0,0 @@
'use client'

import { useState } from 'react'
import type { DeskewResult, DeskewGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'

interface DeskewControlsProps {
  deskewResult: DeskewResult | null
  showBinarized: boolean
  onToggleBinarized: () => void
  showGrid: boolean
  onToggleGrid: () => void
  onManualDeskew: (angle: number) => void
  onGroundTruth: (gt: DeskewGroundTruth) => void
  onNext: () => void
  isApplying: boolean
}

const METHOD_LABELS: Record<string, string> = {
  hough: 'Hough-Linien',
  word_alignment: 'Wortausrichtung',
  manual: 'Manuell',
}

export function DeskewControls({
  deskewResult,
  showBinarized,
  onToggleBinarized,
  showGrid,
  onToggleGrid,
  onManualDeskew,
  onGroundTruth,
  onNext,
  isApplying,
}: DeskewControlsProps) {
  const [manualAngle, setManualAngle] = useState(0)
  const [gtFeedback, setGtFeedback] = useState<'correct' | 'incorrect' | null>(null)
  const [gtNotes, setGtNotes] = useState('')
  const [gtSaved, setGtSaved] = useState(false)

  const handleGroundTruth = (isCorrect: boolean) => {
    setGtFeedback(isCorrect ? 'correct' : 'incorrect')
    if (isCorrect) {
      onGroundTruth({ is_correct: true })
      setGtSaved(true)
    }
  }

  const handleGroundTruthIncorrect = () => {
    onGroundTruth({
      is_correct: false,
      corrected_angle: manualAngle !== 0 ? manualAngle : undefined,
      notes: gtNotes || undefined,
    })
    setGtSaved(true)
  }

  return (
    <div className="space-y-4">
      {/* Results */}
      {deskewResult && (
        <div className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 p-4">
          <div className="flex flex-wrap items-center gap-3 text-sm">
            <div>
              <span className="text-gray-500">Winkel:</span>{' '}
              <span className="font-mono font-medium">{deskewResult.angle_applied}°</span>
            </div>
            <div className="h-4 w-px bg-gray-300 dark:bg-gray-600" />
            <div>
              <span className="text-gray-500">Methode:</span>{' '}
              <span className="inline-flex items-center px-2 py-0.5 rounded-full text-xs font-medium bg-teal-100 text-teal-700 dark:bg-teal-900/40 dark:text-teal-300">
                {METHOD_LABELS[deskewResult.method_used] || deskewResult.method_used}
              </span>
            </div>
            <div className="h-4 w-px bg-gray-300 dark:bg-gray-600" />
            <div>
              <span className="text-gray-500">Konfidenz:</span>{' '}
              <span className="font-mono">{Math.round(deskewResult.confidence * 100)}%</span>
            </div>
            <div className="h-4 w-px bg-gray-300 dark:bg-gray-600" />
            <div className="text-gray-400 text-xs">
              Hough: {deskewResult.angle_hough}° | WA: {deskewResult.angle_word_alignment}°
            </div>
          </div>

          {/* Toggles */}
          <div className="flex gap-3 mt-3">
            <button
              onClick={onToggleBinarized}
              className={`text-xs px-3 py-1 rounded-full border transition-colors ${
                showBinarized
                  ? 'bg-teal-100 border-teal-300 text-teal-700 dark:bg-teal-900/40 dark:border-teal-600 dark:text-teal-300'
                  : 'border-gray-300 text-gray-500 dark:border-gray-600 dark:text-gray-400'
              }`}
            >
              Binarisiert anzeigen
            </button>
            <button
              onClick={onToggleGrid}
              className={`text-xs px-3 py-1 rounded-full border transition-colors ${
                showGrid
                  ? 'bg-teal-100 border-teal-300 text-teal-700 dark:bg-teal-900/40 dark:border-teal-600 dark:text-teal-300'
                  : 'border-gray-300 text-gray-500 dark:border-gray-600 dark:text-gray-400'
              }`}
            >
              Raster anzeigen
            </button>
          </div>
        </div>
      )}

      {/* Manual angle */}
      {deskewResult && (
        <div className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 p-4">
          <div className="text-sm font-medium text-gray-700 dark:text-gray-300 mb-2">Manuelle Korrektur</div>
          <div className="flex items-center gap-3">
            <span className="text-xs text-gray-400 w-8 text-right">-5°</span>
            <input
              type="range"
              min={-5}
              max={5}
              step={0.1}
              value={manualAngle}
              onChange={(e) => setManualAngle(parseFloat(e.target.value))}
              className="flex-1 h-2 bg-gray-200 rounded-lg appearance-none cursor-pointer dark:bg-gray-700 accent-teal-500"
            />
            <span className="text-xs text-gray-400 w-8">+5°</span>
            <span className="font-mono text-sm w-14 text-right">{manualAngle.toFixed(1)}°</span>
            <button
              onClick={() => onManualDeskew(manualAngle)}
              disabled={isApplying}
              className="px-3 py-1.5 text-sm bg-teal-600 text-white rounded-md hover:bg-teal-700 disabled:opacity-50 transition-colors"
            >
              {isApplying ? '...' : 'Anwenden'}
            </button>
          </div>
        </div>
      )}

      {/* Ground Truth */}
      {deskewResult && (
        <div className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 p-4">
          <div className="text-sm font-medium text-gray-700 dark:text-gray-300 mb-2">
            Rotation korrekt?
          </div>
          <p className="text-xs text-gray-400 mb-2">Nur die Drehung bewerten — Woelbung/Verzerrung wird im naechsten Schritt korrigiert.</p>
          {!gtSaved ? (
            <div className="space-y-3">
              <div className="flex gap-2">
                <button
                  onClick={() => handleGroundTruth(true)}
                  className={`px-4 py-1.5 rounded-md text-sm font-medium transition-colors ${
                    gtFeedback === 'correct'
                      ? 'bg-green-100 text-green-700 ring-2 ring-green-400'
                      : 'bg-gray-100 text-gray-600 hover:bg-green-50 dark:bg-gray-700 dark:text-gray-300'
                  }`}
                >
                  Ja
                </button>
                <button
                  onClick={() => handleGroundTruth(false)}
                  className={`px-4 py-1.5 rounded-md text-sm font-medium transition-colors ${
                    gtFeedback === 'incorrect'
                      ? 'bg-red-100 text-red-700 ring-2 ring-red-400'
                      : 'bg-gray-100 text-gray-600 hover:bg-red-50 dark:bg-gray-700 dark:text-gray-300'
                  }`}
                >
                  Nein
                </button>
              </div>
              {gtFeedback === 'incorrect' && (
                <div className="space-y-2">
                  <textarea
                    value={gtNotes}
                    onChange={(e) => setGtNotes(e.target.value)}
                    placeholder="Notizen zur Korrektur..."
                    className="w-full text-sm border border-gray-300 dark:border-gray-600 rounded-md p-2 bg-white dark:bg-gray-900 text-gray-800 dark:text-gray-200"
                    rows={2}
                  />
                  <button
                    onClick={handleGroundTruthIncorrect}
                    className="text-sm px-3 py-1 bg-red-600 text-white rounded-md hover:bg-red-700 transition-colors"
                  >
                    Feedback speichern
                  </button>
                </div>
              )}
            </div>
          ) : (
            <div className="text-sm text-green-600 dark:text-green-400">
              Feedback gespeichert
            </div>
          )}
        </div>
      )}

      {/* Next button */}
      {deskewResult && (
        <div className="flex justify-end">
          <button
            onClick={onNext}
            className="px-6 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium transition-colors"
          >
            Uebernehmen & Weiter →
          </button>
        </div>
      )}
    </div>
  )
}
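Review note: the two ground-truth handlers in `DeskewControls` assemble the payload inline. As a hedged illustration of the rule they follow, the sketch below pulls that logic into a pure function. The interface is inferred only from the fields the component passes to `onGroundTruth`; the real `DeskewGroundTruth` type lives in `ocr-pipeline/types` and may differ. `DeskewGroundTruthSketch` and `buildDeskewGroundTruth` are hypothetical names.

```typescript
// Shape inferred from how the component calls onGroundTruth (assumption).
interface DeskewGroundTruthSketch {
  is_correct: boolean
  corrected_angle?: number
  notes?: string
}

// Build the feedback payload the same way the component's handlers do.
function buildDeskewGroundTruth(
  isCorrect: boolean,
  manualAngle: number,
  notes: string,
): DeskewGroundTruthSketch {
  // A "correct" verdict carries no correction details.
  if (isCorrect) return { is_correct: true }
  return {
    is_correct: false,
    // As in the component, a slider left at 0 is treated as "no correction given".
    corrected_angle: manualAngle !== 0 ? manualAngle : undefined,
    // Empty notes are omitted rather than sent as an empty string.
    notes: notes || undefined,
  }
}
```

One consequence of this rule is that a deliberate correction of exactly 0° cannot be distinguished from an untouched slider; whether that matters depends on how the backend consumes the feedback.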
@@ -1,309 +0,0 @@
'use client'

import { useEffect, useState } from 'react'
import type { DewarpResult, DewarpDetection, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'

interface DewarpControlsProps {
  dewarpResult: DewarpResult | null
  showGrid: boolean
  onToggleGrid: () => void
  onManualDewarp: (shearDegrees: number) => void
  onGroundTruth: (gt: DewarpGroundTruth) => void
  onNext: () => void
  isApplying: boolean
}

const METHOD_LABELS: Record<string, string> = {
  vertical_edge: 'A: Vertikale Kanten',
  projection: 'B: Projektions-Varianz',
  hough_lines: 'C: Hough-Linien',
  text_lines: 'D: Textzeilenanalyse',
  manual: 'Manuell',
  none: 'Keine Korrektur',
}

/** Colour for a confidence value (0-1). */
function confColor(conf: number): string {
  if (conf >= 0.7) return 'text-green-600 dark:text-green-400'
  if (conf >= 0.5) return 'text-yellow-600 dark:text-yellow-400'
  return 'text-gray-400'
}

/** Short confidence bar (visual). */
function ConfBar({ value }: { value: number }) {
  const pct = Math.round(value * 100)
  const bg = value >= 0.7 ? 'bg-green-500' : value >= 0.5 ? 'bg-yellow-500' : 'bg-gray-400'
  return (
    <div className="flex items-center gap-1.5">
      <div className="w-16 h-1.5 bg-gray-200 dark:bg-gray-700 rounded-full overflow-hidden">
        <div className={`h-full rounded-full ${bg}`} style={{ width: `${pct}%` }} />
      </div>
      <span className={`text-xs font-mono ${confColor(value)}`}>{pct}%</span>
    </div>
  )
}

export function DewarpControls({
  dewarpResult,
  showGrid,
  onToggleGrid,
  onManualDewarp,
  onGroundTruth,
  onNext,
  isApplying,
}: DewarpControlsProps) {
  const [manualShear, setManualShear] = useState(0)
  const [gtFeedback, setGtFeedback] = useState<'correct' | 'incorrect' | null>(null)
  const [gtNotes, setGtNotes] = useState('')
  const [gtSaved, setGtSaved] = useState(false)
  const [showDetails, setShowDetails] = useState(false)

  // Initialize slider to auto-detected value when result arrives
  useEffect(() => {
    if (dewarpResult && dewarpResult.shear_degrees !== undefined) {
      setManualShear(dewarpResult.shear_degrees)
    }
  }, [dewarpResult?.shear_degrees])

  const handleGroundTruth = (isCorrect: boolean) => {
    setGtFeedback(isCorrect ? 'correct' : 'incorrect')
    if (isCorrect) {
      onGroundTruth({ is_correct: true })
      setGtSaved(true)
    }
  }

  const handleGroundTruthIncorrect = () => {
    onGroundTruth({
      is_correct: false,
      corrected_shear: manualShear !== 0 ? manualShear : undefined,
      notes: gtNotes || undefined,
    })
    setGtSaved(true)
  }

  const wasRejected = dewarpResult && dewarpResult.method_used === 'none' && (dewarpResult.detections || []).length > 0
  const wasApplied = dewarpResult && dewarpResult.method_used !== 'none' && dewarpResult.method_used !== 'manual'
  const detections = dewarpResult?.detections || []

  return (
    <div className="space-y-4">
      {/* Summary banner */}
      {dewarpResult && (
        <div className={`rounded-lg border p-4 ${
          wasRejected
            ? 'bg-amber-50 border-amber-200 dark:bg-amber-900/20 dark:border-amber-700'
            : wasApplied
              ? 'bg-green-50 border-green-200 dark:bg-green-900/20 dark:border-green-700'
              : 'bg-white border-gray-200 dark:bg-gray-800 dark:border-gray-700'
        }`}>
          {/* Status line */}
          <div className="flex items-center gap-2 mb-3">
            <span className="text-lg">
              {wasRejected ? '\u26A0\uFE0F' : wasApplied ? '\u2705' : '\u2796'}
            </span>
            <span className="text-sm font-medium text-gray-800 dark:text-gray-200">
              {wasRejected
                ? 'Quality Gate: Korrektur verworfen (Projektion nicht verbessert)'
                : wasApplied
                  ? `Korrektur angewendet: ${dewarpResult.shear_degrees.toFixed(2)}°`
                  : dewarpResult.method_used === 'manual'
                    ? `Manuelle Korrektur: ${dewarpResult.shear_degrees.toFixed(2)}°`
                    : 'Keine Korrektur noetig'}
            </span>
          </div>

          {/* Key metrics */}
          <div className="flex flex-wrap items-center gap-4 text-sm">
            <div>
              <span className="text-gray-500">Scherung:</span>{' '}
              <span className="font-mono font-medium">{dewarpResult.shear_degrees.toFixed(2)}°</span>
            </div>
            <div className="h-4 w-px bg-gray-300 dark:bg-gray-600" />
            <div>
              <span className="text-gray-500">Methode:</span>{' '}
              <span className="inline-flex items-center px-2 py-0.5 rounded-full text-xs font-medium bg-teal-100 text-teal-700 dark:bg-teal-900/40 dark:text-teal-300">
                {dewarpResult.method_used.includes('+')
                  ? `Ensemble (${dewarpResult.method_used.split('+').map(m => METHOD_LABELS[m] || m).join(' + ')})`
                  : METHOD_LABELS[dewarpResult.method_used] || dewarpResult.method_used}
              </span>
            </div>
            <div className="h-4 w-px bg-gray-300 dark:bg-gray-600" />
            <div className="flex items-center gap-1.5">
              <span className="text-gray-500">Konfidenz:</span>
              <ConfBar value={dewarpResult.confidence} />
            </div>
          </div>

          {/* Toggles row */}
          <div className="flex gap-2 mt-3">
            <button
              onClick={onToggleGrid}
              className={`text-xs px-3 py-1 rounded-full border transition-colors ${
                showGrid
                  ? 'bg-teal-100 border-teal-300 text-teal-700 dark:bg-teal-900/40 dark:border-teal-600 dark:text-teal-300'
                  : 'border-gray-300 text-gray-500 dark:border-gray-600 dark:text-gray-400'
              }`}
            >
              Raster
            </button>
            {detections.length > 0 && (
              <button
                onClick={() => setShowDetails(v => !v)}
                className={`text-xs px-3 py-1 rounded-full border transition-colors ${
                  showDetails
                    ? 'bg-blue-100 border-blue-300 text-blue-700 dark:bg-blue-900/40 dark:border-blue-600 dark:text-blue-300'
                    : 'border-gray-300 text-gray-500 dark:border-gray-600 dark:text-gray-400'
                }`}
              >
                Details ({detections.length} Methoden)
              </button>
            )}
          </div>

          {/* Detailed detections */}
          {showDetails && detections.length > 0 && (
            <div className="mt-3 pt-3 border-t border-gray-200 dark:border-gray-700">
              <div className="text-xs text-gray-500 mb-2">Einzelne Detektoren:</div>
              <div className="space-y-1.5">
                {detections.map((d: DewarpDetection) => {
                  // method_used may be a '+'-joined ensemble; compare whole names, not substrings
                  const isUsed = dewarpResult.method_used.split('+').includes(d.method)
                  const aboveThreshold = d.confidence >= 0.5
                  return (
                    <div
                      key={d.method}
                      className={`flex items-center gap-3 text-xs px-2 py-1.5 rounded ${
                        isUsed
                          ? 'bg-teal-50 dark:bg-teal-900/20'
                          : 'bg-gray-50 dark:bg-gray-800'
                      }`}
                    >
                      <span className="w-4 text-center">
                        {isUsed ? '\u2713' : aboveThreshold ? '\u2012' : '\u2717'}
                      </span>
                      <span className={`w-40 ${isUsed ? 'font-medium text-gray-800 dark:text-gray-200' : 'text-gray-500'}`}>
                        {METHOD_LABELS[d.method] || d.method}
                      </span>
                      <span className="font-mono w-16 text-right">
                        {d.shear_degrees.toFixed(2)}°
                      </span>
                      <ConfBar value={d.confidence} />
                      {!aboveThreshold && (
                        <span className="text-gray-400 ml-1">(unter Schwelle)</span>
                      )}
                    </div>
                  )
                })}
              </div>
              {wasRejected && (
                <div className="mt-2 text-xs text-amber-600 dark:text-amber-400">
                  Die Korrektur wurde verworfen, weil die horizontale Projektions-Varianz nach Anwendung nicht besser war als vorher.
                </div>
              )}
            </div>
          )}
        </div>
      )}

      {/* Manual shear angle slider */}
      {dewarpResult && (
        <div className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 p-4">
          <div className="text-sm font-medium text-gray-700 dark:text-gray-300 mb-2">Scherwinkel (manuell)</div>
          <div className="flex items-center gap-3">
            <span className="text-xs text-gray-400 w-10 text-right">-2.0°</span>
            <input
              type="range"
              min={-200}
              max={200}
              step={5}
              value={Math.round(manualShear * 100)}
              onChange={(e) => setManualShear(parseInt(e.target.value, 10) / 100)}
              className="flex-1 h-2 bg-gray-200 rounded-lg appearance-none cursor-pointer dark:bg-gray-700 accent-teal-500"
            />
            <span className="text-xs text-gray-400 w-10">+2.0°</span>
            <span className="font-mono text-sm w-16 text-right">{manualShear.toFixed(2)}°</span>
            <button
              onClick={() => onManualDewarp(manualShear)}
              disabled={isApplying}
              className="px-3 py-1.5 text-sm bg-teal-600 text-white rounded-md hover:bg-teal-700 disabled:opacity-50 transition-colors"
            >
              {isApplying ? '...' : 'Anwenden'}
            </button>
          </div>
          <p className="text-xs text-gray-400 mt-1">
            Scherung der vertikalen Achse in Grad. Positiv = Spalten nach rechts kippen, negativ = nach links.
          </p>
        </div>
      )}

      {/* Ground Truth */}
      {dewarpResult && (
        <div className="bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 p-4">
          <div className="text-sm font-medium text-gray-700 dark:text-gray-300 mb-2">
            Spalten vertikal ausgerichtet?
          </div>
          <p className="text-xs text-gray-400 mb-2">Pruefen ob die Spaltenraender jetzt senkrecht zum Raster stehen.</p>
          {!gtSaved ? (
            <div className="space-y-3">
              <div className="flex gap-2">
                <button
                  onClick={() => handleGroundTruth(true)}
                  className={`px-4 py-1.5 rounded-md text-sm font-medium transition-colors ${
                    gtFeedback === 'correct'
                      ? 'bg-green-100 text-green-700 ring-2 ring-green-400'
                      : 'bg-gray-100 text-gray-600 hover:bg-green-50 dark:bg-gray-700 dark:text-gray-300'
                  }`}
                >
                  Ja
                </button>
                <button
                  onClick={() => handleGroundTruth(false)}
                  className={`px-4 py-1.5 rounded-md text-sm font-medium transition-colors ${
                    gtFeedback === 'incorrect'
                      ? 'bg-red-100 text-red-700 ring-2 ring-red-400'
                      : 'bg-gray-100 text-gray-600 hover:bg-red-50 dark:bg-gray-700 dark:text-gray-300'
                  }`}
                >
                  Nein
                </button>
              </div>
              {gtFeedback === 'incorrect' && (
                <div className="space-y-2">
                  <textarea
                    value={gtNotes}
                    onChange={(e) => setGtNotes(e.target.value)}
                    placeholder="Notizen zur Korrektur..."
                    className="w-full text-sm border border-gray-300 dark:border-gray-600 rounded-md p-2 bg-white dark:bg-gray-900 text-gray-800 dark:text-gray-200"
                    rows={2}
                  />
                  <button
                    onClick={handleGroundTruthIncorrect}
                    className="text-sm px-3 py-1 bg-red-600 text-white rounded-md hover:bg-red-700 transition-colors"
                  >
                    Feedback speichern
                  </button>
                </div>
              )}
            </div>
          ) : (
            <div className="text-sm text-green-600 dark:text-green-400">
              Feedback gespeichert
            </div>
          )}
        </div>
      )}

      {/* Next button */}
      {dewarpResult && (
        <div className="flex justify-end">
          <button
            onClick={onNext}
            className="px-6 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium transition-colors"
          >
            Uebernehmen & Weiter →
          </button>
        </div>
      )}
    </div>
  )
}
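Review note: `DewarpControls` derives its banner state from `method_used` and `detections` through two booleans and a nested ternary. The sketch below restates that decision as a single pure function, plus a helper that splits a `+`-joined ensemble string into method names. It mirrors the conditions visible in the component, but `DewarpStatus`, `DewarpSummary`, `dewarpStatus`, and `usedMethods` are hypothetical names, and the interface lists only the fields used here.

```typescript
type DewarpStatus = 'rejected' | 'applied' | 'manual' | 'none'

// Minimal view of the result; the real DewarpResult type has more fields (assumption).
interface DewarpSummary {
  method_used: string
  detections?: { method: string }[]
}

// Classify the outcome the same way the summary banner does.
function dewarpStatus(r: DewarpSummary): DewarpStatus {
  if (r.method_used === 'manual') return 'manual'
  if (r.method_used !== 'none') return 'applied'
  // method_used === 'none': "rejected" only if detectors proposed a correction.
  return (r.detections ?? []).length > 0 ? 'rejected' : 'none'
}

// Split an ensemble such as 'vertical_edge+projection' into whole method names.
function usedMethods(methodUsed: string): string[] {
  return methodUsed === 'none' ? [] : methodUsed.split('+')
}
```

Comparing against the list from `usedMethods` matches whole method names, which avoids accidental hits if one method name were ever a substring of another.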
@@ -1,403 +0,0 @@
'use client'

import { useCallback, useEffect, useRef, useState } from 'react'
import type { GridCell } from '@/app/(admin)/ai/ocr-pipeline/types'

const KLAUSUR_API = '/klausur-api'

// Column type → colour mapping
const COL_TYPE_COLORS: Record<string, string> = {
  column_en: '#3b82f6', // blue-500
  column_de: '#22c55e', // green-500
  column_example: '#f97316', // orange-500
  column_text: '#a855f7', // purple-500
  page_ref: '#06b6d4', // cyan-500
  column_marker: '#6b7280', // gray-500
}

interface FabricReconstructionCanvasProps {
  sessionId: string
  cells: GridCell[]
  onCellsChanged: (updates: { cell_id: string; text: string }[]) => void
}

// Fabric.js types (subset used here)
interface FabricCanvas {
  add: (...objects: FabricObject[]) => FabricCanvas
  remove: (...objects: FabricObject[]) => FabricCanvas
  setBackgroundImage: (img: FabricImage, callback: () => void) => void
  renderAll: () => void
  getObjects: () => FabricObject[]
  dispose: () => void
  on: (event: string, handler: (e: FabricEvent) => void) => void
  setWidth: (w: number) => void
  setHeight: (h: number) => void
  getActiveObject: () => FabricObject | null
  discardActiveObject: () => FabricCanvas
  requestRenderAll: () => void
  setZoom: (z: number) => void
  getZoom: () => number
}

interface FabricObject {
  type?: string
  left?: number
  top?: number
  width?: number
  height?: number
  text?: string
  set: (props: Record<string, unknown>) => FabricObject
  get: (prop: string) => unknown
  data?: Record<string, unknown>
  selectable?: boolean
  on?: (event: string, handler: () => void) => void
  setCoords?: () => void
}

interface FabricImage extends FabricObject {
  width?: number
  height?: number
  scaleX?: number
  scaleY?: number
}

interface FabricEvent {
  target?: FabricObject
  e?: MouseEvent
}

// eslint-disable-next-line @typescript-eslint/no-explicit-any
type FabricModule = any

export function FabricReconstructionCanvas({
  sessionId,
  cells,
  onCellsChanged,
}: FabricReconstructionCanvasProps) {
  const canvasElRef = useRef<HTMLCanvasElement>(null)
  const fabricRef = useRef<FabricCanvas | null>(null)
  const fabricModuleRef = useRef<FabricModule>(null)
  const [ready, setReady] = useState(false)
  const [opacity, setOpacity] = useState(30)
  const [zoom, setZoom] = useState(100)
  const [selectedCell, setSelectedCell] = useState<string | null>(null)
  const [error, setError] = useState('')

  // Undo/Redo
  const undoStackRef = useRef<{ cellId: string; oldText: string; newText: string }[]>([])
  const redoStackRef = useRef<{ cellId: string; oldText: string; newText: string }[]>([])

  // ---- Initialise Fabric.js ----
  useEffect(() => {
    let disposed = false

    async function init() {
      try {
        const fabricModule = await import('fabric')
        if (disposed) return
        fabricModuleRef.current = fabricModule

        const canvasEl = canvasElRef.current
        if (!canvasEl) return

        // Load background image first to get dimensions
        const imgUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/dewarped`

        const bgImg = await fabricModule.FabricImage.fromURL(imgUrl, { crossOrigin: 'anonymous' }) as FabricImage

        if (disposed) return

        const imgW = (bgImg.width || 800) * (bgImg.scaleX || 1)
        const imgH = (bgImg.height || 600) * (bgImg.scaleY || 1)

        bgImg.set({ opacity: opacity / 100, selectable: false, evented: false } as Record<string, unknown>)

        const canvas = new fabricModule.Canvas(canvasEl, {
          width: imgW,
          height: imgH,
          selection: true,
          preserveObjectStacking: true,
          backgroundImage: bgImg,
        }) as unknown as FabricCanvas

        fabricRef.current = canvas
        canvas.renderAll()

        // Add cell objects
        addCellObjects(canvas, fabricModule, cells, imgW, imgH)

        // Listen for text changes
        canvas.on('object:modified', (e: FabricEvent) => {
          if (e.target?.data?.cellId) {
            const cellId = e.target.data.cellId as string
            const newText = (e.target.text || '') as string
            onCellsChanged([{ cell_id: cellId, text: newText }])
          }
        })

        // Selection tracking
        canvas.on('selection:created', (e: FabricEvent) => {
          if (e.target?.data?.cellId) setSelectedCell(e.target.data.cellId as string)
        })
        canvas.on('selection:updated', (e: FabricEvent) => {
          if (e.target?.data?.cellId) setSelectedCell(e.target.data.cellId as string)
        })
        canvas.on('selection:cleared', () => setSelectedCell(null))

        setReady(true)
      } catch (err) {
        if (!disposed) setError(err instanceof Error ? err.message : 'Fabric.js konnte nicht geladen werden')
      }
    }

    init()

    return () => {
      disposed = true
      fabricRef.current?.dispose()
      fabricRef.current = null
    }
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [sessionId])

  function addCellObjects(
canvas: FabricCanvas,
|
|
||||||
fabricModule: FabricModule,
|
|
||||||
gridCells: GridCell[],
|
|
||||||
imgW: number,
|
|
||||||
imgH: number,
|
|
||||||
) {
|
|
||||||
for (const cell of gridCells) {
|
|
||||||
const color = COL_TYPE_COLORS[cell.col_type] || '#6b7280'
|
|
||||||
const x = (cell.bbox_pct.x / 100) * imgW
|
|
||||||
const y = (cell.bbox_pct.y / 100) * imgH
|
|
||||||
const w = (cell.bbox_pct.w / 100) * imgW
|
|
||||||
const h = (cell.bbox_pct.h / 100) * imgH
|
|
||||||
|
|
||||||
const fontSize = Math.max(8, Math.min(18, h * 0.55))
|
|
||||||
|
|
||||||
const textObj = new fabricModule.IText(cell.text || '', {
|
|
||||||
left: x,
|
|
||||||
top: y,
|
|
||||||
width: w,
|
|
||||||
fontSize,
|
|
||||||
fontFamily: 'monospace',
|
|
||||||
fill: '#000000',
|
|
||||||
backgroundColor: `${color}22`,
|
|
||||||
padding: 2,
|
|
||||||
editable: true,
|
|
||||||
selectable: true,
|
|
||||||
lockScalingFlip: true,
|
|
||||||
data: {
|
|
||||||
cellId: cell.cell_id,
|
|
||||||
colType: cell.col_type,
|
|
||||||
rowIndex: cell.row_index,
|
|
||||||
colIndex: cell.col_index,
|
|
||||||
originalText: cell.text,
|
|
||||||
},
|
|
||||||
})
|
|
||||||
|
|
||||||
// Border colour matches column type
|
|
||||||
textObj.set({
|
|
||||||
borderColor: color,
|
|
||||||
cornerColor: color,
|
|
||||||
cornerSize: 6,
|
|
||||||
transparentCorners: false,
|
|
||||||
} as Record<string, unknown>)
|
|
||||||
|
|
||||||
canvas.add(textObj)
|
|
||||||
}
|
|
||||||
canvas.renderAll()
|
|
||||||
}
|
|
||||||
|
|
||||||
  // ---- Opacity slider ----
  const handleOpacityChange = useCallback((val: number) => {
    setOpacity(val)
    const canvas = fabricRef.current
    if (!canvas) return
    // Fabric v6: backgroundImage is a direct property on the canvas
    const bgImg = (canvas as unknown as { backgroundImage?: FabricObject }).backgroundImage
    if (bgImg) {
      bgImg.set({ opacity: val / 100 })
      canvas.renderAll()
    }
  }, [])

  // ---- Zoom ----
  const handleZoomChange = useCallback((val: number) => {
    setZoom(val)
    const canvas = fabricRef.current
    if (!canvas) return
    // Fabric keeps the zoom factor in the viewport transform, so assigning a
    // plain `zoom` property has no effect; go through setZoom() instead.
    ;(canvas as unknown as { setZoom: (value: number) => void }).setZoom(val / 100)
    canvas.requestRenderAll()
  }, [])

  // ---- Undo / Redo via keyboard ----
  useEffect(() => {
    const handler = (e: KeyboardEvent) => {
      // With Shift held, e.key is 'Z' rather than 'z', so compare
      // case-insensitively; otherwise the redo branch is unreachable.
      if (!(e.metaKey || e.ctrlKey) || e.key.toLowerCase() !== 'z') return
      e.preventDefault()

      const canvas = fabricRef.current
      if (!canvas) return

      if (e.shiftKey) {
        // Redo
        const action = redoStackRef.current.pop()
        if (!action) return
        undoStackRef.current.push(action)
        const obj = canvas.getObjects().find(
          (o: FabricObject) => o.data?.cellId === action.cellId
        )
        if (obj) {
          obj.set({ text: action.newText } as Record<string, unknown>)
          canvas.renderAll()
          onCellsChanged([{ cell_id: action.cellId, text: action.newText }])
        }
      } else {
        // Undo
        const action = undoStackRef.current.pop()
        if (!action) return
        redoStackRef.current.push(action)
        const obj = canvas.getObjects().find(
          (o: FabricObject) => o.data?.cellId === action.cellId
        )
        if (obj) {
          obj.set({ text: action.oldText } as Record<string, unknown>)
          canvas.renderAll()
          onCellsChanged([{ cell_id: action.cellId, text: action.oldText }])
        }
      }
    }
    document.addEventListener('keydown', handler)
    return () => document.removeEventListener('keydown', handler)
  }, [onCellsChanged])

  // ---- Delete selected cell (via context-menu or Delete key) ----
  useEffect(() => {
    const handler = (e: KeyboardEvent) => {
      if (e.key !== 'Delete' && e.key !== 'Backspace') return
      // Only delete if not currently editing text inside an IText
      const canvas = fabricRef.current
      if (!canvas) return
      const active = canvas.getActiveObject()
      if (!active) return
      // If the IText is in editing mode, let the keypress pass through
      if ((active as unknown as Record<string, boolean>).isEditing) return
      e.preventDefault()
      canvas.remove(active)
      canvas.discardActiveObject()
      canvas.renderAll()
    }
    document.addEventListener('keydown', handler)
    return () => document.removeEventListener('keydown', handler)
  }, [])

  // ---- Export helpers ----
  const handleExportPdf = useCallback(() => {
    window.open(
      `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/reconstruction/export/pdf`,
      '_blank'
    )
  }, [sessionId])

  const handleExportDocx = useCallback(() => {
    window.open(
      `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/reconstruction/export/docx`,
      '_blank'
    )
  }, [sessionId])

  if (error) {
    return (
      <div className="flex flex-col items-center justify-center py-8 text-red-500 text-sm">
        <p>Fabric.js Editor konnte nicht geladen werden:</p>
        <p className="text-xs mt-1 text-gray-400">{error}</p>
      </div>
    )
  }

  return (
    <div className="space-y-2">
      {/* Toolbar */}
      <div className="flex items-center gap-3 bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 px-3 py-2 text-xs">
        {/* Opacity slider */}
        <label className="flex items-center gap-1.5 text-gray-500">
          Hintergrund
          <input
            type="range"
            min={0} max={100}
            value={opacity}
            onChange={e => handleOpacityChange(Number(e.target.value))}
            className="w-20 h-1 accent-teal-500"
          />
          <span className="w-8 text-right">{opacity}%</span>
        </label>

        <div className="w-px h-5 bg-gray-300 dark:bg-gray-600" />

        {/* Zoom */}
        <label className="flex items-center gap-1.5 text-gray-500">
          Zoom
          <button onClick={() => handleZoomChange(Math.max(25, zoom - 25))}
            className="px-1.5 py-0.5 border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700">
            −
          </button>
          <span className="w-8 text-center">{zoom}%</span>
          <button onClick={() => handleZoomChange(Math.min(200, zoom + 25))}
            className="px-1.5 py-0.5 border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700">
            +
          </button>
          <button onClick={() => handleZoomChange(100)}
            className="px-1.5 py-0.5 border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700">
            Fit
          </button>
        </label>

        <div className="w-px h-5 bg-gray-300 dark:bg-gray-600" />

        {/* Selected cell info */}
        {selectedCell && (
          <span className="text-gray-400">
            Zelle: <span className="text-gray-600 dark:text-gray-300">{selectedCell}</span>
          </span>
        )}

        <div className="flex-1" />

        {/* Export buttons */}
        <button onClick={handleExportPdf}
          className="px-2.5 py-1 border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700">
          PDF
        </button>
        <button onClick={handleExportDocx}
          className="px-2.5 py-1 border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700">
          DOCX
        </button>
      </div>

      {/* Canvas */}
      <div className="border rounded-lg overflow-auto dark:border-gray-700 bg-gray-100 dark:bg-gray-900"
        style={{ maxHeight: '75vh' }}>
        {!ready && (
          <div className="flex items-center justify-center py-12">
            <div className="animate-spin rounded-full h-5 w-5 border-b-2 border-teal-500" />
            <span className="ml-2 text-sm text-gray-500">Canvas wird geladen...</span>
          </div>
        )}
        <canvas ref={canvasElRef} />
      </div>

      {/* Legend */}
      <div className="flex items-center gap-4 text-xs text-gray-500">
        {Object.entries(COL_TYPE_COLORS).map(([type, color]) => (
          <span key={type} className="flex items-center gap-1">
            <span className="w-3 h-3 rounded" style={{ backgroundColor: color + '44', border: `1px solid ${color}` }} />
            {type.replace('column_', '').replace('page_', '')}
          </span>
        ))}
        <span className="ml-auto text-gray-400">Doppelklick = Text bearbeiten | Delete = Zelle entfernen | Cmd+Z = Undo</span>
      </div>
    </div>
  )
}
|
|
||||||
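Review note: the keyboard handler in the removed editor above moves actions between an undo stack and a redo stack. A minimal sketch of that bookkeeping, separated from Fabric and React, might look as follows. The names `EditAction` and `UndoHistory` are hypothetical, and clearing the redo stack on a fresh edit is an assumption, since the code that records edits is not part of this hunk.

```typescript
// Hypothetical standalone model of the undo/redo stacks (not the component's actual code).
interface EditAction {
  cellId: string
  oldText: string
  newText: string
}

class UndoHistory {
  private undoStack: EditAction[] = []
  private redoStack: EditAction[] = []

  /** Record a fresh edit. Assumption: a new edit invalidates pending redo entries. */
  record(action: EditAction): void {
    this.undoStack.push(action)
    this.redoStack = []
  }

  /** Pop the latest edit and return the text to restore, or null if nothing is left. */
  undo(): { cellId: string; text: string } | null {
    const action = this.undoStack.pop()
    if (!action) return null
    this.redoStack.push(action)
    return { cellId: action.cellId, text: action.oldText }
  }

  /** Re-apply the most recently undone edit, or return null if nothing is left. */
  redo(): { cellId: string; text: string } | null {
    const action = this.redoStack.pop()
    if (!action) return null
    this.undoStack.push(action)
    return { cellId: action.cellId, text: action.newText }
  }
}
```

Keeping the stacks in a plain class like this would make the behaviour testable without a canvas; the component would only translate the returned `{ cellId, text }` into an object update and a change callback.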
@@ -1,143 +0,0 @@
|
|||||||
'use client'

import { useState } from 'react'

const A4_WIDTH_MM = 210
const A4_HEIGHT_MM = 297

interface ImageCompareViewProps {
  originalUrl: string | null
  deskewedUrl: string | null
  showGrid: boolean
  showGridLeft?: boolean
  showBinarized: boolean
  binarizedUrl: string | null
  leftLabel?: string
  rightLabel?: string
}

function MmGridOverlay() {
  const lines: React.ReactNode[] = []

  // Vertical lines every 10mm
  for (let mm = 0; mm <= A4_WIDTH_MM; mm += 10) {
    const x = (mm / A4_WIDTH_MM) * 100
    const is50 = mm % 50 === 0
    lines.push(
      <line
        key={`v-${mm}`}
        x1={x} y1={0} x2={x} y2={100}
        stroke={is50 ? 'rgba(59, 130, 246, 0.4)' : 'rgba(59, 130, 246, 0.15)'}
        strokeWidth={is50 ? 0.12 : 0.05}
      />
    )
    // Label every 50mm
    if (is50 && mm > 0) {
      lines.push(
        <text key={`vl-${mm}`} x={x} y={1.2} fill="rgba(59,130,246,0.6)" fontSize="1.2" textAnchor="middle">
          {mm}
        </text>
      )
    }
  }

  // Horizontal lines every 10mm
  for (let mm = 0; mm <= A4_HEIGHT_MM; mm += 10) {
    const y = (mm / A4_HEIGHT_MM) * 100
    const is50 = mm % 50 === 0
    lines.push(
      <line
        key={`h-${mm}`}
        x1={0} y1={y} x2={100} y2={y}
        stroke={is50 ? 'rgba(59, 130, 246, 0.4)' : 'rgba(59, 130, 246, 0.15)'}
        strokeWidth={is50 ? 0.12 : 0.05}
      />
    )
    if (is50 && mm > 0) {
      lines.push(
        <text key={`hl-${mm}`} x={0.5} y={y + 0.6} fill="rgba(59,130,246,0.6)" fontSize="1.2">
          {mm}
        </text>
      )
    }
  }

  return (
    <svg
      viewBox="0 0 100 100"
      preserveAspectRatio="none"
      className="absolute inset-0 w-full h-full pointer-events-none"
      style={{ zIndex: 10 }}
    >
      <g style={{ pointerEvents: 'none' }}>{lines}</g>
    </svg>
  )
}

export function ImageCompareView({
  originalUrl,
  deskewedUrl,
  showGrid,
  showGridLeft,
  showBinarized,
  binarizedUrl,
  leftLabel,
  rightLabel,
}: ImageCompareViewProps) {
  const [leftError, setLeftError] = useState(false)
  const [rightError, setRightError] = useState(false)

  const rightUrl = showBinarized && binarizedUrl ? binarizedUrl : deskewedUrl

  return (
    <div className="grid grid-cols-1 lg:grid-cols-2 gap-4">
      {/* Left: Original */}
      <div className="space-y-2">
        <h3 className="text-sm font-medium text-gray-500 dark:text-gray-400">{leftLabel || 'Original (unbearbeitet)'}</h3>
        <div className="relative bg-gray-100 dark:bg-gray-900 rounded-lg overflow-hidden border border-gray-200 dark:border-gray-700"
          style={{ aspectRatio: '210/297' }}>
          {originalUrl && !leftError ? (
            <>
              <img
                src={originalUrl}
                alt="Original Scan"
                className="w-full h-full object-contain"
                onError={() => setLeftError(true)}
              />
              {showGridLeft && <MmGridOverlay />}
            </>
          ) : (
            <div className="flex items-center justify-center h-full text-gray-400">
              {leftError ? 'Fehler beim Laden' : 'Noch kein Bild'}
            </div>
          )}
        </div>
      </div>

      {/* Right: Deskewed with Grid */}
      <div className="space-y-2">
        <h3 className="text-sm font-medium text-gray-500 dark:text-gray-400">
          {rightLabel || `${showBinarized ? 'Binarisiert' : 'Begradigt'}${showGrid ? ' + Raster (mm)' : ''}`}
        </h3>
        <div className="relative bg-gray-100 dark:bg-gray-900 rounded-lg overflow-hidden border border-gray-200 dark:border-gray-700"
          style={{ aspectRatio: '210/297' }}>
          {rightUrl && !rightError ? (
            <>
              <img
                src={rightUrl}
                alt="Begradigtes Bild"
                className="w-full h-full object-contain"
                onError={() => setRightError(true)}
              />
              {showGrid && <MmGridOverlay />}
            </>
          ) : (
            <div className="flex items-center justify-center h-full text-gray-400">
              {rightError ? 'Fehler beim Laden' : 'Begradigung laeuft...'}
            </div>
          )}
        </div>
      </div>
    </div>
  )
}
|
|
||||||
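Review note: `MmGridOverlay` above maps millimetre positions on an A4 page onto a 0..100 SVG coordinate space, once per axis. The arithmetic can be sketched as a pure function, shown here only as an illustration; `GridLine` and `gridLines` are hypothetical names that do not appear in the removed file.

```typescript
// Hypothetical helper: positions of millimetre grid lines as percentages of one page dimension.
interface GridLine {
  mm: number      // distance from the page edge in millimetres
  pct: number     // the same position as a percentage (0..100) of the page dimension
  major: boolean  // true on every 50 mm line, which the overlay draws bolder
}

function gridLines(totalMm: number, stepMm = 10, majorEveryMm = 50): GridLine[] {
  const lines: GridLine[] = []
  for (let mm = 0; mm <= totalMm; mm += stepMm) {
    lines.push({ mm, pct: (mm / totalMm) * 100, major: mm % majorEveryMm === 0 })
  }
  return lines
}

const vertical = gridLines(210)   // A4 width: lines at 0, 10, ..., 210 mm
const horizontal = gridLines(297) // A4 height: lines at 0, 10, ..., 290 mm
```

Because 297 is not a multiple of 10, the last horizontal line sits at 290 mm and the bottom page edge itself gets no line. The overlay relies on `preserveAspectRatio="none"` so that both axes can share the same 0..100 range.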
@@ -1,359 +0,0 @@
|
|||||||
'use client'

import { useCallback, useEffect, useRef, useState } from 'react'
import type { ColumnTypeKey, PageRegion } from '@/app/(admin)/ai/ocr-pipeline/types'

const COLUMN_TYPES: { value: ColumnTypeKey; label: string }[] = [
  { value: 'column_en', label: 'EN' },
  { value: 'column_de', label: 'DE' },
  { value: 'column_example', label: 'Beispiel' },
  { value: 'column_text', label: 'Text' },
  { value: 'page_ref', label: 'Seite' },
  { value: 'column_marker', label: 'Marker' },
  { value: 'column_ignore', label: 'Ignorieren' },
]

const TYPE_OVERLAY_COLORS: Record<string, string> = {
  column_en: 'rgba(59, 130, 246, 0.12)',
  column_de: 'rgba(34, 197, 94, 0.12)',
  column_example: 'rgba(249, 115, 22, 0.12)',
  column_text: 'rgba(6, 182, 212, 0.12)',
  page_ref: 'rgba(168, 85, 247, 0.12)',
  column_marker: 'rgba(239, 68, 68, 0.12)',
  column_ignore: 'rgba(128, 128, 128, 0.06)',
}

const TYPE_BADGE_COLORS: Record<string, string> = {
  column_en: 'bg-blue-100 text-blue-700 dark:bg-blue-900/30 dark:text-blue-400',
  column_de: 'bg-green-100 text-green-700 dark:bg-green-900/30 dark:text-green-400',
  column_example: 'bg-orange-100 text-orange-700 dark:bg-orange-900/30 dark:text-orange-400',
  column_text: 'bg-cyan-100 text-cyan-700 dark:bg-cyan-900/30 dark:text-cyan-400',
  page_ref: 'bg-purple-100 text-purple-700 dark:bg-purple-900/30 dark:text-purple-400',
  column_marker: 'bg-red-100 text-red-700 dark:bg-red-900/30 dark:text-red-400',
  column_ignore: 'bg-gray-100 text-gray-500 dark:bg-gray-700/30 dark:text-gray-500',
}

// Default column type sequence for newly created columns
const DEFAULT_TYPE_SEQUENCE: ColumnTypeKey[] = [
  'page_ref', 'column_en', 'column_de', 'column_example', 'column_text',
]

const MIN_DIVIDER_DISTANCE_PERCENT = 2 // Minimum 2% apart

interface ManualColumnEditorProps {
  imageUrl: string
  imageWidth: number
  imageHeight: number
  onApply: (columns: PageRegion[]) => void
  onCancel: () => void
  applying: boolean
  mode?: 'manual' | 'ground-truth'
  layout?: 'two-column' | 'stacked'
  initialDividers?: number[]
  initialColumnTypes?: ColumnTypeKey[]
}

export function ManualColumnEditor({
  imageUrl,
  imageWidth,
  imageHeight,
  onApply,
  onCancel,
  applying,
  mode = 'manual',
  layout = 'two-column',
  initialDividers,
  initialColumnTypes,
}: ManualColumnEditorProps) {
  const containerRef = useRef<HTMLDivElement>(null)
  const [dividers, setDividers] = useState<number[]>(initialDividers ?? [])
  const [columnTypes, setColumnTypes] = useState<ColumnTypeKey[]>(initialColumnTypes ?? [])
  const [dragging, setDragging] = useState<number | null>(null)
  const [imageLoaded, setImageLoaded] = useState(false)

  const isGT = mode === 'ground-truth'

  // Sync columnTypes length when dividers change
  useEffect(() => {
    const numColumns = dividers.length + 1
    setColumnTypes(prev => {
      if (prev.length === numColumns) return prev
      const next = [...prev]
      while (next.length < numColumns) {
        const idx = next.length
        next.push(DEFAULT_TYPE_SEQUENCE[idx] || 'column_text')
      }
      while (next.length > numColumns) {
        next.pop()
      }
      return next
    })
  }, [dividers.length])

  const getXPercent = useCallback((clientX: number): number => {
    if (!containerRef.current) return 0
    const rect = containerRef.current.getBoundingClientRect()
    const pct = ((clientX - rect.left) / rect.width) * 100
    return Math.max(0, Math.min(100, pct))
  }, [])

  const canPlaceDivider = useCallback((xPct: number, excludeIndex?: number): boolean => {
    for (let i = 0; i < dividers.length; i++) {
      if (i === excludeIndex) continue
      if (Math.abs(dividers[i] - xPct) < MIN_DIVIDER_DISTANCE_PERCENT) return false
    }
    return xPct > MIN_DIVIDER_DISTANCE_PERCENT && xPct < (100 - MIN_DIVIDER_DISTANCE_PERCENT)
  }, [dividers])

  // Click on image to add a divider
  const handleImageClick = useCallback((e: React.MouseEvent) => {
    if (dragging !== null) return
    // Don't add if clicking on a divider handle
    if ((e.target as HTMLElement).dataset.divider) return

    const xPct = getXPercent(e.clientX)
    if (!canPlaceDivider(xPct)) return

    setDividers(prev => [...prev, xPct].sort((a, b) => a - b))
  }, [dragging, getXPercent, canPlaceDivider])

  // Drag handlers
  const handleDividerMouseDown = useCallback((e: React.MouseEvent, index: number) => {
    e.stopPropagation()
    e.preventDefault()
    setDragging(index)
  }, [])

  useEffect(() => {
    if (dragging === null) return

    const handleMouseMove = (e: MouseEvent) => {
      const xPct = getXPercent(e.clientX)
      if (canPlaceDivider(xPct, dragging)) {
        setDividers(prev => {
          const next = [...prev]
          next[dragging] = xPct
          return next.sort((a, b) => a - b)
        })
      }
    }

    const handleMouseUp = () => {
      setDragging(null)
    }

    window.addEventListener('mousemove', handleMouseMove)
    window.addEventListener('mouseup', handleMouseUp)
    return () => {
      window.removeEventListener('mousemove', handleMouseMove)
      window.removeEventListener('mouseup', handleMouseUp)
    }
  }, [dragging, getXPercent, canPlaceDivider])

  const removeDivider = useCallback((index: number) => {
    setDividers(prev => prev.filter((_, i) => i !== index))
  }, [])

  const updateColumnType = useCallback((colIndex: number, type: ColumnTypeKey) => {
    setColumnTypes(prev => {
      const next = [...prev]
      next[colIndex] = type
      return next
    })
  }, [])

  const handleApply = useCallback(() => {
    // Build PageRegion array from dividers
    const sorted = [...dividers].sort((a, b) => a - b)
    const columns: PageRegion[] = []

    for (let i = 0; i <= sorted.length; i++) {
      const leftPct = i === 0 ? 0 : sorted[i - 1]
      const rightPct = i === sorted.length ? 100 : sorted[i]
      const x = Math.round((leftPct / 100) * imageWidth)
      const w = Math.round(((rightPct - leftPct) / 100) * imageWidth)

      columns.push({
        type: columnTypes[i] || 'column_text',
        x,
        y: 0,
        width: w,
        height: imageHeight,
        classification_confidence: 1.0,
        classification_method: 'manual',
      })
    }

    onApply(columns)
  }, [dividers, columnTypes, imageWidth, imageHeight, onApply])

  // Compute column regions for overlay
  const sorted = [...dividers].sort((a, b) => a - b)
  const columnRegions = Array.from({ length: sorted.length + 1 }, (_, i) => ({
    leftPct: i === 0 ? 0 : sorted[i - 1],
    rightPct: i === sorted.length ? 100 : sorted[i],
    type: columnTypes[i] || 'column_text',
  }))

  return (
    <div className="space-y-4">
      {/* Layout: image + controls */}
      <div className={layout === 'stacked' ? 'space-y-4' : 'grid grid-cols-2 gap-4'}>
        {/* Left: Interactive image */}
        <div>
          <div className="flex items-center justify-between mb-1">
            <div className="text-xs font-medium text-gray-500 dark:text-gray-400">
              Klicken um Trennlinien zu setzen
            </div>
            <button
              onClick={onCancel}
              className="text-xs px-2 py-0.5 text-gray-500 hover:text-gray-700 dark:text-gray-400 dark:hover:text-gray-200"
            >
              Abbrechen
            </button>
          </div>
          <div
            ref={containerRef}
            className="relative border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900 cursor-crosshair select-none"
            onClick={handleImageClick}
          >
            {/* eslint-disable-next-line @next/next/no-img-element */}
            <img
              src={imageUrl}
              alt="Entzerrtes Bild"
              className="w-full h-auto block"
              draggable={false}
              onLoad={() => setImageLoaded(true)}
            />

            {imageLoaded && (
              <>
                {/* Column overlays */}
                {columnRegions.map((region, i) => (
                  <div
                    key={`col-${i}`}
                    className="absolute top-0 bottom-0 pointer-events-none"
                    style={{
                      left: `${region.leftPct}%`,
                      width: `${region.rightPct - region.leftPct}%`,
                      backgroundColor: TYPE_OVERLAY_COLORS[region.type] || 'rgba(128,128,128,0.08)',
                    }}
                  >
                    <span className="absolute top-1 left-1/2 -translate-x-1/2 text-[10px] font-medium text-gray-600 dark:text-gray-300 bg-white/80 dark:bg-gray-800/80 px-1 rounded">
                      {i + 1}
                    </span>
                  </div>
                ))}

                {/* Divider lines */}
                {sorted.map((xPct, i) => (
                  <div
                    key={`div-${i}`}
                    data-divider="true"
                    className="absolute top-0 bottom-0 group"
                    style={{
                      left: `${xPct}%`,
                      transform: 'translateX(-50%)',
                      width: '12px',
                      cursor: 'col-resize',
                      zIndex: 10,
                    }}
                    onMouseDown={(e) => handleDividerMouseDown(e, i)}
                  >
                    {/* Visible line */}
                    <div
                      data-divider="true"
                      className="absolute top-0 bottom-0 left-1/2 -translate-x-1/2 w-0.5 border-l-2 border-dashed border-red-500"
                    />
                    {/* Delete button */}
                    <button
                      data-divider="true"
                      onClick={(e) => {
                        e.stopPropagation()
                        removeDivider(i)
                      }}
                      className="absolute top-2 left-1/2 -translate-x-1/2 w-4 h-4 bg-red-500 text-white rounded-full text-[10px] leading-none flex items-center justify-center opacity-0 group-hover:opacity-100 transition-opacity z-20"
                      title="Linie entfernen"
                    >
                      x
                    </button>
                  </div>
                ))}
              </>
            )}
          </div>
        </div>

        {/* Right: Column type assignment + actions */}
        <div className="space-y-4">
          <div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
            Spaltentypen
          </div>

          {dividers.length === 0 ? (
            <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-6 text-center">
              <div className="text-3xl mb-2">👆</div>
              <p className="text-sm text-gray-500 dark:text-gray-400">
                Klicken Sie auf das Bild links, um vertikale Trennlinien zwischen den Spalten zu setzen.
              </p>
              <p className="text-xs text-gray-400 dark:text-gray-500 mt-2">
                Linien koennen per Drag verschoben und per Hover geloescht werden.
              </p>
            </div>
          ) : (
            <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4 space-y-3">
              <div className="text-sm text-gray-600 dark:text-gray-400">
                <span className="font-medium text-gray-800 dark:text-gray-200">
                  {dividers.length} Linien = {dividers.length + 1} Spalten
                </span>
              </div>
              <div className="grid gap-2">
                {columnRegions.map((region, i) => (
                  <div key={i} className="flex items-center gap-3">
                    <span className={`w-16 text-center px-2 py-0.5 rounded text-xs font-medium ${TYPE_BADGE_COLORS[region.type] || 'bg-gray-100 text-gray-600'}`}>
                      Spalte {i + 1}
                    </span>
                    <select
                      value={columnTypes[i] || 'column_text'}
                      onChange={(e) => updateColumnType(i, e.target.value as ColumnTypeKey)}
                      className="text-sm border border-gray-200 dark:border-gray-600 rounded px-2 py-1 bg-white dark:bg-gray-700 text-gray-800 dark:text-gray-200"
                    >
                      {COLUMN_TYPES.map(t => (
                        <option key={t.value} value={t.value}>{t.label}</option>
                      ))}
                    </select>
                    <span className="text-xs text-gray-400 font-mono">
                      {Math.round(region.rightPct - region.leftPct)}%
                    </span>
                  </div>
                ))}
              </div>
            </div>
          )}

          {/* Action buttons */}
          <div className="flex flex-col gap-2">
            <button
              onClick={handleApply}
              disabled={dividers.length === 0 || applying}
              className="w-full px-4 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors text-sm font-medium disabled:opacity-50 disabled:cursor-not-allowed"
            >
              {applying
                ? 'Wird gespeichert...'
                : isGT
                  ? `${dividers.length + 1} Spalten als Ground Truth speichern`
                  : `${dividers.length + 1} Spalten uebernehmen`}
            </button>
            <button
              onClick={() => setDividers([])}
              disabled={dividers.length === 0}
              className="text-xs px-3 py-2 text-gray-500 hover:text-gray-700 dark:text-gray-400 dark:hover:text-gray-200 disabled:opacity-50"
            >
              Alle Linien entfernen
            </button>
          </div>
        </div>
      </div>
    </div>
  )
}
|
|
||||||
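Review note: `handleApply` and `canPlaceDivider` in the removed `ManualColumnEditor` above turn divider positions (percentages of the image width) into pixel column regions and enforce a minimum gap between dividers. A simplified sketch of both rules as pure functions follows; `ColumnSpan` and `dividersToColumns` are hypothetical names, and the sketch leaves out the column type, height and confidence fields that the real `PageRegion` carries.

```typescript
// Hypothetical reduced region type: only the horizontal extent in pixels.
interface ColumnSpan {
  x: number
  width: number
}

/** Convert divider positions (percent of image width) into pixel column spans. */
function dividersToColumns(dividersPct: number[], imageWidth: number): ColumnSpan[] {
  const sorted = [...dividersPct].sort((a, b) => a - b)
  const columns: ColumnSpan[] = []
  // n dividers produce n + 1 columns; the page edges act as implicit dividers at 0% and 100%.
  for (let i = 0; i <= sorted.length; i++) {
    const leftPct = i === 0 ? 0 : sorted[i - 1]
    const rightPct = i === sorted.length ? 100 : sorted[i]
    columns.push({
      x: Math.round((leftPct / 100) * imageWidth),
      width: Math.round(((rightPct - leftPct) / 100) * imageWidth),
    })
  }
  return columns
}

/** A divider must keep a minimum gap to both page edges and to every existing divider. */
function canPlaceDivider(existing: number[], xPct: number, minGapPct = 2): boolean {
  if (xPct <= minGapPct || xPct >= 100 - minGapPct) return false
  return existing.every(d => Math.abs(d - xPct) >= minGapPct)
}
```

One thing worth keeping in mind with this scheme: `x` and `width` are rounded independently, so for some widths adjacent columns can overlap or leave a one-pixel gap. Deriving `width` from the rounded left and right edges would avoid that, at the cost of changing the original behaviour.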
@@ -1,115 +0,0 @@
|
|||||||
'use client'

import { PipelineStep, DocumentTypeResult } from '@/app/(admin)/ai/ocr-pipeline/types'

const DOC_TYPE_LABELS: Record<string, string> = {
  vocab_table: 'Vokabeltabelle',
  full_text: 'Volltext',
  generic_table: 'Tabelle',
}

interface PipelineStepperProps {
  steps: PipelineStep[]
  currentStep: number
  onStepClick: (index: number) => void
  onReprocess?: (index: number) => void
  docTypeResult?: DocumentTypeResult | null
  onDocTypeChange?: (docType: DocumentTypeResult['doc_type']) => void
}

export function PipelineStepper({
|
|
||||||
steps,
|
|
||||||
currentStep,
|
|
||||||
onStepClick,
|
|
||||||
onReprocess,
|
|
||||||
docTypeResult,
|
|
||||||
onDocTypeChange,
|
|
||||||
}: PipelineStepperProps) {
|
|
||||||
return (
|
|
||||||
<div className="space-y-2">
|
|
||||||
<div className="flex items-center justify-between px-4 py-3 bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700">
|
|
||||||
{steps.map((step, index) => {
|
|
||||||
const isActive = index === currentStep
|
|
||||||
const isCompleted = step.status === 'completed'
|
|
||||||
const isFailed = step.status === 'failed'
|
|
||||||
const isSkipped = step.status === 'skipped'
|
|
||||||
const isClickable = (index <= currentStep || isCompleted) && !isSkipped
|
|
||||||
|
|
||||||
return (
|
|
||||||
<div key={step.id} className="flex items-center">
|
|
||||||
{index > 0 && (
|
|
||||||
<div
|
|
||||||
className={`h-0.5 w-8 mx-1 ${
|
|
||||||
isSkipped
|
|
||||||
? 'bg-gray-200 dark:bg-gray-700 border-t border-dashed border-gray-400'
|
|
||||||
: index <= currentStep ? 'bg-teal-400' : 'bg-gray-300 dark:bg-gray-600'
|
|
||||||
}`}
|
|
||||||
/>
|
|
||||||
)}
|
|
||||||
<div className="relative group">
|
|
||||||
<button
|
|
||||||
onClick={() => isClickable && onStepClick(index)}
|
|
||||||
disabled={!isClickable}
|
|
||||||
className={`flex items-center gap-1.5 px-3 py-1.5 rounded-full text-sm font-medium transition-all ${
|
|
||||||
isSkipped
|
|
||||||
? 'bg-gray-100 text-gray-400 dark:bg-gray-800 dark:text-gray-600 line-through'
|
|
||||||
: isActive
|
|
||||||
? 'bg-teal-100 text-teal-700 dark:bg-teal-900/40 dark:text-teal-300 ring-2 ring-teal-400'
|
|
||||||
: isCompleted
|
|
||||||
? 'bg-green-100 text-green-700 dark:bg-green-900/40 dark:text-green-300'
|
|
||||||
: isFailed
|
|
||||||
? 'bg-red-100 text-red-700 dark:bg-red-900/40 dark:text-red-300'
|
|
||||||
: 'text-gray-400 dark:text-gray-500'
|
|
||||||
} ${isClickable ? 'cursor-pointer hover:opacity-80' : 'cursor-default'}`}
|
|
||||||
>
|
|
||||||
<span className="text-base">
|
|
||||||
{isSkipped ? '-' : isCompleted ? '\u2713' : isFailed ? '\u2717' : step.icon}
|
|
||||||
</span>
|
|
||||||
<span className="hidden sm:inline">{step.name}</span>
|
|
||||||
<span className="sm:hidden">{index + 1}</span>
|
|
||||||
</button>
|
|
||||||
{/* Reprocess button — shown on completed steps on hover */}
|
|
||||||
{isCompleted && onReprocess && (
|
|
||||||
<button
|
|
||||||
onClick={(e) => { e.stopPropagation(); onReprocess(index) }}
|
|
||||||
className="absolute -top-1 -right-1 w-4 h-4 bg-orange-500 text-white rounded-full text-[9px] leading-none opacity-0 group-hover:opacity-100 transition-opacity flex items-center justify-center"
|
|
||||||
title={`Ab hier neu verarbeiten`}
|
|
||||||
>
|
|
||||||
↻
|
|
||||||
</button>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
})}
|
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Document type badge */}
|
|
||||||
{docTypeResult && (
|
|
||||||
<div className="flex items-center gap-2 px-4 py-2 bg-blue-50 dark:bg-blue-900/20 rounded-lg border border-blue-200 dark:border-blue-800 text-sm">
|
|
||||||
<span className="text-blue-600 dark:text-blue-400 font-medium">
|
|
||||||
Dokumenttyp:
|
|
||||||
</span>
|
|
||||||
{onDocTypeChange ? (
|
|
||||||
<select
|
|
||||||
value={docTypeResult.doc_type}
|
|
||||||
onChange={(e) => onDocTypeChange(e.target.value as DocumentTypeResult['doc_type'])}
|
|
||||||
className="bg-white dark:bg-gray-800 border border-blue-300 dark:border-blue-700 rounded px-2 py-0.5 text-sm text-blue-700 dark:text-blue-300"
|
|
||||||
>
|
|
||||||
<option value="vocab_table">Vokabeltabelle</option>
|
|
||||||
<option value="generic_table">Tabelle (generisch)</option>
|
|
||||||
<option value="full_text">Volltext</option>
|
|
||||||
</select>
|
|
||||||
) : (
|
|
||||||
<span className="text-blue-700 dark:text-blue-300">
|
|
||||||
{DOC_TYPE_LABELS[docTypeResult.doc_type] || docTypeResult.doc_type}
|
|
||||||
</span>
|
|
||||||
)}
|
|
||||||
<span className="text-blue-400 dark:text-blue-500 text-xs">
|
|
||||||
({Math.round(docTypeResult.confidence * 100)}% Konfidenz)
|
|
||||||
</span>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
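The nested ternaries that pick the step button's classes and glyph can also be expressed as a lookup table. This is a minimal sketch, not the component's actual code: `StepStatus`, `stepButtonClasses` and `stepGlyph` are hypothetical names, `active` is a guess for the first (teal) branch whose condition is not visible here, and the styling of skipped steps is assumed to be the neutral gray. The class strings are copied from the component.

```typescript
// Hypothetical status union; 'active' stands in for the teal branch of the ternary.
type StepStatus = 'active' | 'completed' | 'failed' | 'skipped' | 'pending'

const STATUS_CLASSES: Record<StepStatus, string> = {
  active: 'bg-teal-100 text-teal-700 dark:bg-teal-900/40 dark:text-teal-300 ring-2 ring-teal-400',
  completed: 'bg-green-100 text-green-700 dark:bg-green-900/40 dark:text-green-300',
  failed: 'bg-red-100 text-red-700 dark:bg-red-900/40 dark:text-red-300',
  // Assumption: skipped steps share the neutral style of pending steps.
  skipped: 'text-gray-400 dark:text-gray-500',
  pending: 'text-gray-400 dark:text-gray-500',
}

// Combine the status classes with the cursor classes used by the button.
function stepButtonClasses(status: StepStatus, isClickable: boolean): string {
  const cursor = isClickable ? 'cursor-pointer hover:opacity-80' : 'cursor-default'
  return `${STATUS_CLASSES[status]} ${cursor}`
}

// Mirrors the glyph ternary: '-' for skipped, check mark, cross, else the step's own icon.
function stepGlyph(status: StepStatus, icon: string): string {
  if (status === 'skipped') return '-'
  if (status === 'completed') return '\u2713'
  if (status === 'failed') return '\u2717'
  return icon
}
```

A table keeps each status on one line, so adding a new status is a one-line change and the compiler flags any status that is missing from `STATUS_CLASSES`.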
|
|
||||||
@@ -1,341 +0,0 @@
|
|||||||
'use client'
|
|
||||||
|
|
||||||
import { useCallback, useEffect, useState } from 'react'
import type { ColumnResult, ColumnGroundTruth, ColumnTypeKey, PageRegion } from '@/app/(admin)/ai/ocr-pipeline/types'
import { ColumnControls } from './ColumnControls'
import { ManualColumnEditor } from './ManualColumnEditor'
|
|
||||||
|
|
||||||
const KLAUSUR_API = '/klausur-api'
|
|
||||||
|
|
||||||
type ViewMode = 'normal' | 'ground-truth' | 'manual'
|
|
||||||
|
|
||||||
interface StepColumnDetectionProps {
|
|
||||||
sessionId: string | null
|
|
||||||
onNext: () => void
|
|
||||||
}
|
|
||||||
|
|
||||||
/** Convert PageRegion[] to divider percentages + column types for ManualColumnEditor */
function columnsToEditorState(
  columns: PageRegion[],
  imageWidth: number
): { dividers: number[]; columnTypes: ColumnTypeKey[] } {
  if (!columns.length || !imageWidth) return { dividers: [], columnTypes: [] }

  const sorted = [...columns].sort((a, b) => a.x - b.x)
  const dividers: number[] = []
  const columnTypes: ColumnTypeKey[] = sorted.map(c => c.type)

  for (let i = 1; i < sorted.length; i++) {
    const xPct = (sorted[i].x / imageWidth) * 100
    dividers.push(xPct)
  }

  return { dividers, columnTypes }
}
|
|
||||||
|
|
||||||
export function StepColumnDetection({ sessionId, onNext }: StepColumnDetectionProps) {
|
|
||||||
const [columnResult, setColumnResult] = useState<ColumnResult | null>(null)
|
|
||||||
const [detecting, setDetecting] = useState(false)
|
|
||||||
const [error, setError] = useState<string | null>(null)
|
|
||||||
const [viewMode, setViewMode] = useState<ViewMode>('normal')
|
|
||||||
const [applying, setApplying] = useState(false)
|
|
||||||
const [imageDimensions, setImageDimensions] = useState<{ width: number; height: number } | null>(null)
|
|
||||||
const [savedGtColumns, setSavedGtColumns] = useState<PageRegion[] | null>(null)
|
|
||||||
|
|
||||||
// Fetch session info (image dimensions) + check for cached column result
|
|
||||||
useEffect(() => {
|
|
||||||
if (!sessionId || imageDimensions) return
|
|
||||||
|
|
||||||
const fetchSessionInfo = async () => {
|
|
||||||
try {
|
|
||||||
const infoRes = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}`)
|
|
||||||
if (infoRes.ok) {
|
|
||||||
const info = await infoRes.json()
|
|
||||||
if (info.image_width && info.image_height) {
|
|
||||||
setImageDimensions({ width: info.image_width, height: info.image_height })
|
|
||||||
}
|
|
||||||
if (info.column_result) {
|
|
||||||
setColumnResult(info.column_result)
|
|
||||||
return
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} catch (e) {
|
|
||||||
console.error('Failed to fetch session info:', e)
|
|
||||||
}
|
|
||||||
|
|
||||||
// No cached result - run auto-detection
|
|
||||||
runAutoDetection()
|
|
||||||
}
|
|
||||||
|
|
||||||
fetchSessionInfo()
|
|
||||||
// eslint-disable-next-line react-hooks/exhaustive-deps
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
// Load saved GT if exists
|
|
||||||
useEffect(() => {
|
|
||||||
if (!sessionId) return
|
|
||||||
const fetchGt = async () => {
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/ground-truth/columns`)
|
|
||||||
if (res.ok) {
|
|
||||||
const data = await res.json()
|
|
||||||
const corrected = data.columns_gt?.corrected_columns
|
|
||||||
if (corrected) setSavedGtColumns(corrected)
|
|
||||||
}
|
|
||||||
} catch {
|
|
||||||
// No saved GT - that's fine
|
|
||||||
}
|
|
||||||
}
|
|
||||||
fetchGt()
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
const runAutoDetection = useCallback(async () => {
|
|
||||||
if (!sessionId) return
|
|
||||||
setDetecting(true)
|
|
||||||
setError(null)
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/columns`, {
|
|
||||||
method: 'POST',
|
|
||||||
})
|
|
||||||
if (!res.ok) {
|
|
||||||
const err = await res.json().catch(() => ({ detail: res.statusText }))
|
|
||||||
throw new Error(err.detail || 'Spaltenerkennung fehlgeschlagen')
|
|
||||||
}
|
|
||||||
const data: ColumnResult = await res.json()
|
|
||||||
setColumnResult(data)
|
|
||||||
} catch (e) {
|
|
||||||
setError(e instanceof Error ? e.message : 'Unbekannter Fehler')
|
|
||||||
} finally {
|
|
||||||
setDetecting(false)
|
|
||||||
}
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
const handleRerun = useCallback(() => {
|
|
||||||
runAutoDetection()
|
|
||||||
}, [runAutoDetection])
|
|
||||||
|
|
||||||
const handleGroundTruth = useCallback(async (gt: ColumnGroundTruth) => {
|
|
||||||
if (!sessionId) return
|
|
||||||
try {
|
|
||||||
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/ground-truth/columns`, {
|
|
||||||
method: 'POST',
|
|
||||||
headers: { 'Content-Type': 'application/json' },
|
|
||||||
body: JSON.stringify(gt),
|
|
||||||
})
|
|
||||||
} catch (e) {
|
|
||||||
console.error('Ground truth save failed:', e)
|
|
||||||
}
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
const handleManualApply = useCallback(async (columns: PageRegion[]) => {
|
|
||||||
if (!sessionId) return
|
|
||||||
setApplying(true)
|
|
||||||
setError(null)
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/columns/manual`, {
|
|
||||||
method: 'POST',
|
|
||||||
headers: { 'Content-Type': 'application/json' },
|
|
||||||
body: JSON.stringify({ columns }),
|
|
||||||
})
|
|
||||||
if (!res.ok) {
|
|
||||||
const err = await res.json().catch(() => ({ detail: res.statusText }))
|
|
||||||
throw new Error(err.detail || 'Manuelle Spalten konnten nicht gespeichert werden')
|
|
||||||
}
|
|
||||||
const data = await res.json()
|
|
||||||
setColumnResult({
|
|
||||||
columns: data.columns,
|
|
||||||
duration_seconds: data.duration_seconds ?? 0,
|
|
||||||
})
|
|
||||||
setViewMode('normal')
|
|
||||||
} catch (e) {
|
|
||||||
setError(e instanceof Error ? e.message : 'Fehler beim Speichern')
|
|
||||||
} finally {
|
|
||||||
setApplying(false)
|
|
||||||
}
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
const handleGtApply = useCallback(async (columns: PageRegion[]) => {
|
|
||||||
if (!sessionId) return
|
|
||||||
setApplying(true)
|
|
||||||
setError(null)
|
|
||||||
try {
|
|
||||||
const gt: ColumnGroundTruth = {
|
|
||||||
is_correct: false,
|
|
||||||
corrected_columns: columns,
|
|
||||||
}
|
|
||||||
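const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/ground-truth/columns`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(gt),
})
// Only treat the ground truth as saved if the server accepted it
if (!res.ok) {
  const err = await res.json().catch(() => ({ detail: res.statusText }))
  throw new Error(err.detail || 'Fehler beim Speichern')
}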
|
|
||||||
setSavedGtColumns(columns)
|
|
||||||
setViewMode('normal')
|
|
||||||
} catch (e) {
|
|
||||||
setError(e instanceof Error ? e.message : 'Fehler beim Speichern')
|
|
||||||
} finally {
|
|
||||||
setApplying(false)
|
|
||||||
}
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
if (!sessionId) {
|
|
||||||
return (
|
|
||||||
<div className="flex flex-col items-center justify-center py-16 text-center">
|
|
||||||
<div className="text-5xl mb-4">📊</div>
|
|
||||||
<h3 className="text-lg font-medium text-gray-700 dark:text-gray-300 mb-2">
|
|
||||||
Schritt 3: Spaltenerkennung
|
|
||||||
</h3>
|
|
||||||
<p className="text-gray-500 dark:text-gray-400 max-w-md">
|
|
||||||
Bitte zuerst Schritt 1 und 2 abschliessen.
|
|
||||||
</p>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
const dewarpedUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/dewarped`
|
|
||||||
const overlayUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/columns-overlay`
|
|
||||||
|
|
||||||
// Pre-compute editor state from saved GT or auto columns for GT mode
|
|
||||||
const gtInitial = savedGtColumns
|
|
||||||
? columnsToEditorState(savedGtColumns, imageDimensions?.width ?? 1000)
|
|
||||||
: undefined
|
|
||||||
|
|
||||||
return (
|
|
||||||
<div className="space-y-4">
|
|
||||||
{/* Loading indicator */}
|
|
||||||
{detecting && (
|
|
||||||
<div className="flex items-center gap-2 text-teal-600 dark:text-teal-400 text-sm">
|
|
||||||
<div className="animate-spin w-4 h-4 border-2 border-teal-500 border-t-transparent rounded-full" />
|
|
||||||
Spaltenerkennung laeuft...
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{viewMode === 'manual' ? (
|
|
||||||
/* Manual column editor - overwrites column_result */
|
|
||||||
<ManualColumnEditor
|
|
||||||
imageUrl={dewarpedUrl}
|
|
||||||
imageWidth={imageDimensions?.width ?? 1000}
|
|
||||||
imageHeight={imageDimensions?.height ?? 1400}
|
|
||||||
onApply={handleManualApply}
|
|
||||||
onCancel={() => setViewMode('normal')}
|
|
||||||
applying={applying}
|
|
||||||
mode="manual"
|
|
||||||
/>
|
|
||||||
) : viewMode === 'ground-truth' ? (
|
|
||||||
/* GT mode: auto result (left, readonly) + GT editor (right) */
|
|
||||||
<div className="grid grid-cols-2 gap-4">
|
|
||||||
{/* Left: Auto result (readonly overlay) */}
|
|
||||||
<div>
|
|
||||||
<div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
|
|
||||||
Auto-Ergebnis (readonly)
|
|
||||||
</div>
|
|
||||||
<div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900">
|
|
||||||
{columnResult ? (
|
|
||||||
// eslint-disable-next-line @next/next/no-img-element
|
|
||||||
<img
|
|
||||||
src={`${overlayUrl}?t=${Date.now()}`}
|
|
||||||
alt="Auto Spalten-Overlay"
|
|
||||||
className="w-full h-auto"
|
|
||||||
/>
|
|
||||||
) : (
|
|
||||||
<div className="aspect-[3/4] flex items-center justify-center text-gray-400 text-sm">
|
|
||||||
Keine Auto-Daten
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
{/* Auto column list */}
|
|
||||||
{columnResult && (
|
|
||||||
<div className="mt-2 space-y-1">
|
|
||||||
<div className="text-xs font-medium text-gray-500 dark:text-gray-400">
|
|
||||||
Auto: {columnResult.columns.length} Spalten
|
|
||||||
</div>
|
|
||||||
{columnResult.columns
|
|
||||||
.filter(c => c.type.startsWith('column') || c.type === 'page_ref')
|
|
||||||
.map((col, i) => (
|
|
||||||
<div key={i} className="text-xs text-gray-500 dark:text-gray-400 font-mono">
|
|
||||||
{i + 1}. {col.type} x={col.x} w={col.width}
|
|
||||||
</div>
|
|
||||||
))}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Right: GT editor */}
|
|
||||||
<div>
|
|
||||||
<div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
|
|
||||||
Ground Truth Editor
|
|
||||||
</div>
|
|
||||||
<ManualColumnEditor
|
|
||||||
imageUrl={dewarpedUrl}
|
|
||||||
imageWidth={imageDimensions?.width ?? 1000}
|
|
||||||
imageHeight={imageDimensions?.height ?? 1400}
|
|
||||||
onApply={handleGtApply}
|
|
||||||
onCancel={() => setViewMode('normal')}
|
|
||||||
applying={applying}
|
|
||||||
mode="ground-truth"
|
|
||||||
layout="stacked"
|
|
||||||
initialDividers={gtInitial?.dividers}
|
|
||||||
initialColumnTypes={gtInitial?.columnTypes}
|
|
||||||
/>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
) : (
|
|
||||||
/* Normal mode: overlay (left) vs clean (right) */
|
|
||||||
<div className="grid grid-cols-2 gap-4">
|
|
||||||
<div>
|
|
||||||
<div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
|
|
||||||
Mit Spalten-Overlay
|
|
||||||
</div>
|
|
||||||
<div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900">
|
|
||||||
{columnResult ? (
|
|
||||||
// eslint-disable-next-line @next/next/no-img-element
|
|
||||||
<img
|
|
||||||
src={`${overlayUrl}?t=${Date.now()}`}
|
|
||||||
alt="Spalten-Overlay"
|
|
||||||
className="w-full h-auto"
|
|
||||||
/>
|
|
||||||
) : (
|
|
||||||
<div className="aspect-[3/4] flex items-center justify-center text-gray-400 text-sm">
|
|
||||||
{detecting ? 'Erkenne Spalten...' : 'Keine Daten'}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
<div>
|
|
||||||
<div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
|
|
||||||
Entzerrtes Bild
|
|
||||||
</div>
|
|
||||||
<div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900">
|
|
||||||
{/* eslint-disable-next-line @next/next/no-img-element */}
|
|
||||||
<img
|
|
||||||
src={dewarpedUrl}
|
|
||||||
alt="Entzerrt"
|
|
||||||
className="w-full h-auto"
|
|
||||||
/>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{/* Controls */}
|
|
||||||
{viewMode === 'normal' && (
|
|
||||||
<ColumnControls
|
|
||||||
columnResult={columnResult}
|
|
||||||
onRerun={handleRerun}
|
|
||||||
onManualMode={() => setViewMode('manual')}
|
|
||||||
onGtMode={() => setViewMode('ground-truth')}
|
|
||||||
onGroundTruth={handleGroundTruth}
|
|
||||||
onNext={onNext}
|
|
||||||
isDetecting={detecting}
|
|
||||||
savedGtColumns={savedGtColumns}
|
|
||||||
/>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{error && (
|
|
||||||
<div className="p-3 bg-red-50 dark:bg-red-900/20 text-red-600 dark:text-red-400 rounded-lg text-sm">
|
|
||||||
{error}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
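The `columnsToEditorState` helper in this file turns absolute column positions into divider percentages for the editor. The following is a self-contained sketch of the same computation with a worked input. The `PageRegion` and `ColumnTypeKey` shapes here are simplified assumptions (the real ones live in the project's `types` module, and `column_de` is a guessed member), so treat it as an illustration rather than the project's code.

```typescript
// Assumed, simplified shapes; the real types are imported from the types module.
type ColumnTypeKey = 'column_en' | 'column_de' | 'page_ref'

interface PageRegion {
  type: ColumnTypeKey
  x: number
  width: number
}

function columnsToEditorState(columns: PageRegion[], imageWidth: number) {
  if (!columns.length || !imageWidth) {
    return { dividers: [] as number[], columnTypes: [] as ColumnTypeKey[] }
  }
  // Sort left to right so dividers and types line up with the visual order.
  const sorted = [...columns].sort((a, b) => a.x - b.x)
  // Every column except the leftmost contributes its left edge as a divider (in percent).
  const dividers = sorted.slice(1).map(c => (c.x / imageWidth) * 100)
  const columnTypes = sorted.map(c => c.type)
  return { dividers, columnTypes }
}

// Three columns on a 1000 px wide image, deliberately given out of order.
const state = columnsToEditorState(
  [
    { type: 'page_ref', x: 750, width: 250 },
    { type: 'column_en', x: 0, width: 250 },
    { type: 'column_de', x: 250, width: 500 },
  ],
  1000,
)
console.log(state.dividers)    // [25, 75]
console.log(state.columnTypes) // ['column_en', 'column_de', 'page_ref']
```

Note that N columns yield N - 1 dividers but N column types, and that the leftmost column's `x` is never emitted: the editor is assumed to start the first column at the left image edge.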
|
|
||||||
@@ -1,19 +0,0 @@
|
|||||||
'use client'

export function StepCoordinates() {
  return (
    <div className="flex flex-col items-center justify-center py-16 text-center">
      <div className="text-5xl mb-4">📍</div>
      <h3 className="text-lg font-medium text-gray-700 dark:text-gray-300 mb-2">
        Schritt 5: Koordinatenzuweisung
      </h3>
      <p className="text-gray-500 dark:text-gray-400 max-w-md">
        Exakte Positionszuweisung fuer jedes Wort auf der Seite.
        Dieser Schritt wird in einer zukuenftigen Version implementiert.
      </p>
      <div className="mt-6 px-4 py-2 bg-amber-100 dark:bg-amber-900/30 text-amber-700 dark:text-amber-400 rounded-full text-sm font-medium">
        Kommt bald
      </div>
    </div>
  )
}
@@ -1,277 +0,0 @@
|
|||||||
'use client'
|
|
||||||
|
|
||||||
import { useCallback, useEffect, useState } from 'react'
|
|
||||||
import type { DeskewGroundTruth, DeskewResult, SessionInfo } from '@/app/(admin)/ai/ocr-pipeline/types'
|
|
||||||
import { DeskewControls } from './DeskewControls'
|
|
||||||
import { ImageCompareView } from './ImageCompareView'
|
|
||||||
|
|
||||||
const KLAUSUR_API = '/klausur-api'
|
|
||||||
|
|
||||||
interface StepDeskewProps {
|
|
||||||
sessionId?: string | null
|
|
||||||
onNext: (sessionId: string) => void
|
|
||||||
}
|
|
||||||
|
|
||||||
export function StepDeskew({ sessionId: existingSessionId, onNext }: StepDeskewProps) {
|
|
||||||
const [session, setSession] = useState<SessionInfo | null>(null)
|
|
||||||
const [deskewResult, setDeskewResult] = useState<DeskewResult | null>(null)
|
|
||||||
const [uploading, setUploading] = useState(false)
|
|
||||||
const [deskewing, setDeskewing] = useState(false)
|
|
||||||
const [applying, setApplying] = useState(false)
|
|
||||||
const [showBinarized, setShowBinarized] = useState(false)
|
|
||||||
const [showGrid, setShowGrid] = useState(true)
|
|
||||||
const [error, setError] = useState<string | null>(null)
|
|
||||||
const [dragOver, setDragOver] = useState(false)
|
|
||||||
const [sessionName, setSessionName] = useState('')
|
|
||||||
|
|
||||||
// Reload session data when navigating back from a later step
|
|
||||||
useEffect(() => {
|
|
||||||
if (!existingSessionId || session) return
|
|
||||||
|
|
||||||
const loadSession = async () => {
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${existingSessionId}`)
|
|
||||||
if (!res.ok) return
|
|
||||||
const data = await res.json()
|
|
||||||
|
|
||||||
const sessionInfo: SessionInfo = {
|
|
||||||
session_id: data.session_id,
|
|
||||||
filename: data.filename,
|
|
||||||
image_width: data.image_width,
|
|
||||||
image_height: data.image_height,
|
|
||||||
original_image_url: `${KLAUSUR_API}${data.original_image_url}`,
|
|
||||||
}
|
|
||||||
setSession(sessionInfo)
|
|
||||||
|
|
||||||
// Reconstruct deskew result from session data
|
|
||||||
if (data.deskew_result) {
|
|
||||||
const dr: DeskewResult = {
|
|
||||||
...data.deskew_result,
|
|
||||||
deskewed_image_url: `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${existingSessionId}/image/deskewed`,
|
|
||||||
binarized_image_url: `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${existingSessionId}/image/binarized`,
|
|
||||||
}
|
|
||||||
setDeskewResult(dr)
|
|
||||||
}
|
|
||||||
} catch (e) {
|
|
||||||
console.error('Failed to reload session:', e)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
loadSession()
|
|
||||||
}, [existingSessionId, session])
|
|
||||||
|
|
||||||
const handleUpload = useCallback(async (file: File) => {
|
|
||||||
setUploading(true)
|
|
||||||
setError(null)
|
|
||||||
setDeskewResult(null)
|
|
||||||
|
|
||||||
try {
|
|
||||||
const formData = new FormData()
|
|
||||||
formData.append('file', file)
|
|
||||||
if (sessionName.trim()) {
|
|
||||||
formData.append('name', sessionName.trim())
|
|
||||||
}
|
|
||||||
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions`, {
|
|
||||||
method: 'POST',
|
|
||||||
body: formData,
|
|
||||||
})
|
|
||||||
|
|
||||||
if (!res.ok) {
|
|
||||||
const err = await res.json().catch(() => ({ detail: res.statusText }))
|
|
||||||
throw new Error(err.detail || 'Upload fehlgeschlagen')
|
|
||||||
}
|
|
||||||
|
|
||||||
const data: SessionInfo = await res.json()
|
|
||||||
// Prepend API prefix to relative URLs
|
|
||||||
data.original_image_url = `${KLAUSUR_API}${data.original_image_url}`
|
|
||||||
setSession(data)
|
|
||||||
|
|
||||||
// Auto-trigger deskew
|
|
||||||
setDeskewing(true)
|
|
||||||
const deskewRes = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${data.session_id}/deskew`, {
|
|
||||||
method: 'POST',
|
|
||||||
})
|
|
||||||
|
|
||||||
if (!deskewRes.ok) {
|
|
||||||
throw new Error('Begradigung fehlgeschlagen')
|
|
||||||
}
|
|
||||||
|
|
||||||
const deskewData: DeskewResult = await deskewRes.json()
|
|
||||||
deskewData.deskewed_image_url = `${KLAUSUR_API}${deskewData.deskewed_image_url}`
|
|
||||||
deskewData.binarized_image_url = `${KLAUSUR_API}${deskewData.binarized_image_url}`
|
|
||||||
setDeskewResult(deskewData)
|
|
||||||
} catch (e) {
|
|
||||||
setError(e instanceof Error ? e.message : 'Unbekannter Fehler')
|
|
||||||
} finally {
|
|
||||||
setUploading(false)
|
|
||||||
setDeskewing(false)
|
|
||||||
}
|
|
||||||
}, [sessionName])
|
|
||||||
|
|
||||||
const handleManualDeskew = useCallback(async (angle: number) => {
|
|
||||||
if (!session) return
|
|
||||||
setApplying(true)
|
|
||||||
setError(null)
|
|
||||||
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${session.session_id}/deskew/manual`, {
|
|
||||||
method: 'POST',
|
|
||||||
headers: { 'Content-Type': 'application/json' },
|
|
||||||
body: JSON.stringify({ angle }),
|
|
||||||
})
|
|
||||||
|
|
||||||
if (!res.ok) throw new Error('Manuelle Begradigung fehlgeschlagen')
|
|
||||||
|
|
||||||
const data = await res.json()
|
|
||||||
setDeskewResult((prev) =>
|
|
||||||
prev
|
|
||||||
? {
|
|
||||||
...prev,
|
|
||||||
angle_applied: data.angle_applied,
|
|
||||||
method_used: data.method_used,
|
|
||||||
// Force reload by appending timestamp
|
|
||||||
deskewed_image_url: `${KLAUSUR_API}${data.deskewed_image_url}?t=${Date.now()}`,
|
|
||||||
}
|
|
||||||
: null,
|
|
||||||
)
|
|
||||||
} catch (e) {
|
|
||||||
setError(e instanceof Error ? e.message : 'Fehler')
|
|
||||||
} finally {
|
|
||||||
setApplying(false)
|
|
||||||
}
|
|
||||||
}, [session])
|
|
||||||
|
|
||||||
const handleGroundTruth = useCallback(async (gt: DeskewGroundTruth) => {
|
|
||||||
if (!session) return
|
|
||||||
try {
|
|
||||||
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${session.session_id}/ground-truth/deskew`, {
|
|
||||||
method: 'POST',
|
|
||||||
headers: { 'Content-Type': 'application/json' },
|
|
||||||
body: JSON.stringify(gt),
|
|
||||||
})
|
|
||||||
} catch (e) {
|
|
||||||
console.error('Ground truth save failed:', e)
|
|
||||||
}
|
|
||||||
}, [session])
|
|
||||||
|
|
||||||
const handleDrop = useCallback((e: React.DragEvent) => {
|
|
||||||
e.preventDefault()
|
|
||||||
setDragOver(false)
|
|
||||||
const file = e.dataTransfer.files[0]
|
|
||||||
if (file) handleUpload(file)
|
|
||||||
}, [handleUpload])
|
|
||||||
|
|
||||||
const handleFileInput = useCallback((e: React.ChangeEvent<HTMLInputElement>) => {
|
|
||||||
const file = e.target.files?.[0]
|
|
||||||
if (file) handleUpload(file)
|
|
||||||
}, [handleUpload])
|
|
||||||
|
|
||||||
// Upload area (no session yet)
|
|
||||||
if (!session) {
|
|
||||||
return (
|
|
||||||
<div className="space-y-4">
|
|
||||||
{/* Session name input */}
|
|
||||||
<div>
|
|
||||||
<label className="block text-sm font-medium text-gray-600 dark:text-gray-400 mb-1">
|
|
||||||
Session-Name (optional)
|
|
||||||
</label>
|
|
||||||
<input
|
|
||||||
type="text"
|
|
||||||
value={sessionName}
|
|
||||||
onChange={(e) => setSessionName(e.target.value)}
|
|
||||||
placeholder="z.B. Unit 3 Seite 42"
|
|
||||||
className="w-full max-w-sm px-3 py-2 text-sm border rounded-lg dark:bg-gray-800 dark:border-gray-600 dark:text-gray-200 focus:outline-none focus:ring-2 focus:ring-teal-500"
|
|
||||||
/>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<div
|
|
||||||
onDragOver={(e) => { e.preventDefault(); setDragOver(true) }}
|
|
||||||
onDragLeave={() => setDragOver(false)}
|
|
||||||
onDrop={handleDrop}
|
|
||||||
className={`border-2 border-dashed rounded-xl p-12 text-center transition-colors ${
|
|
||||||
dragOver
|
|
||||||
? 'border-teal-400 bg-teal-50 dark:bg-teal-900/20'
|
|
||||||
: 'border-gray-300 dark:border-gray-600 hover:border-teal-400'
|
|
||||||
}`}
|
|
||||||
>
|
|
||||||
{uploading ? (
|
|
||||||
<div className="text-gray-500">
|
|
||||||
<div className="animate-spin inline-block w-8 h-8 border-2 border-teal-500 border-t-transparent rounded-full mb-3" />
|
|
||||||
<p>Wird hochgeladen...</p>
|
|
||||||
</div>
|
|
||||||
) : (
|
|
||||||
<>
|
|
||||||
<div className="text-4xl mb-3">📄</div>
|
|
||||||
<p className="text-gray-600 dark:text-gray-400 mb-2">
|
|
||||||
PDF oder Bild hierher ziehen
|
|
||||||
</p>
|
|
||||||
<p className="text-sm text-gray-400 mb-4">oder</p>
|
|
||||||
<label className="inline-block px-4 py-2 bg-teal-600 text-white rounded-lg cursor-pointer hover:bg-teal-700 transition-colors">
|
|
||||||
Datei auswaehlen
|
|
||||||
<input
|
|
||||||
type="file"
|
|
||||||
accept=".pdf,.png,.jpg,.jpeg,.tiff,.tif"
|
|
||||||
onChange={handleFileInput}
|
|
||||||
className="hidden"
|
|
||||||
/>
|
|
||||||
</label>
|
|
||||||
</>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
{error && (
|
|
||||||
<div className="p-3 bg-red-50 dark:bg-red-900/20 text-red-600 dark:text-red-400 rounded-lg text-sm">
|
|
||||||
{error}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
// Session active: show comparison + controls
|
|
||||||
return (
|
|
||||||
<div className="space-y-4">
|
|
||||||
{/* Filename */}
|
|
||||||
<div className="text-sm text-gray-500 dark:text-gray-400">
|
|
||||||
Datei: <span className="font-medium text-gray-700 dark:text-gray-300">{session.filename}</span>
|
|
||||||
{' '}({session.image_width} x {session.image_height} px)
|
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Loading indicator */}
|
|
||||||
{deskewing && (
|
|
||||||
<div className="flex items-center gap-2 text-teal-600 dark:text-teal-400 text-sm">
|
|
||||||
<div className="animate-spin w-4 h-4 border-2 border-teal-500 border-t-transparent rounded-full" />
|
|
||||||
Begradigung laeuft (beide Methoden)...
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{/* Image comparison */}
|
|
||||||
<ImageCompareView
|
|
||||||
originalUrl={session.original_image_url}
|
|
||||||
deskewedUrl={deskewResult?.deskewed_image_url ?? null}
|
|
||||||
showGrid={showGrid}
|
|
||||||
showBinarized={showBinarized}
|
|
||||||
binarizedUrl={deskewResult?.binarized_image_url ?? null}
|
|
||||||
/>
|
|
||||||
|
|
||||||
{/* Controls */}
|
|
||||||
<DeskewControls
|
|
||||||
deskewResult={deskewResult}
|
|
||||||
showBinarized={showBinarized}
|
|
||||||
onToggleBinarized={() => setShowBinarized((v) => !v)}
|
|
||||||
showGrid={showGrid}
|
|
||||||
onToggleGrid={() => setShowGrid((v) => !v)}
|
|
||||||
onManualDeskew={handleManualDeskew}
|
|
||||||
onGroundTruth={handleGroundTruth}
|
|
||||||
onNext={() => session && onNext(session.session_id)}
|
|
||||||
isApplying={applying}
|
|
||||||
/>
|
|
||||||
|
|
||||||
{error && (
|
|
||||||
<div className="p-3 bg-red-50 dark:bg-red-900/20 text-red-600 dark:text-red-400 rounded-lg text-sm">
|
|
||||||
{error}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
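After a manual correction, the component appends `?t=${Date.now()}` to the image URL so that the browser fetches the regenerated file instead of a cached copy. A minimal sketch of that idea as a reusable helper follows. `withCacheBust` is a hypothetical name, and the `&` branch for URLs that already carry a query string is an addition that the component itself does not have.

```typescript
// Append a version parameter so the browser treats the URL as a new resource.
// `version` defaults to the current time but can be injected, which keeps the helper testable.
function withCacheBust(url: string, version: number = Date.now()): string {
  const separator = url.includes('?') ? '&' : '?'
  return `${url}${separator}t=${version}`
}

console.log(withCacheBust('/klausur-api/image/deskewed', 42))       // /klausur-api/image/deskewed?t=42
console.log(withCacheBust('/klausur-api/image/deskewed?size=s', 42)) // /klausur-api/image/deskewed?size=s&t=42
```

Computing the version once, when the new result arrives, keeps the URL stable between renders; calling `Date.now()` inside JSX instead produces a different URL on every render.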
|
|
||||||
@@ -1,151 +0,0 @@
|
|||||||
'use client'
|
|
||||||
|
|
||||||
import { useCallback, useEffect, useState } from 'react'
|
|
||||||
import type { DewarpResult, DewarpGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
|
|
||||||
import { DewarpControls } from './DewarpControls'
|
|
||||||
import { ImageCompareView } from './ImageCompareView'
|
|
||||||
|
|
||||||
const KLAUSUR_API = '/klausur-api'
|
|
||||||
|
|
||||||
interface StepDewarpProps {
|
|
||||||
sessionId: string | null
|
|
||||||
onNext: () => void
|
|
||||||
}
|
|
||||||
|
|
||||||
export function StepDewarp({ sessionId, onNext }: StepDewarpProps) {
|
|
||||||
const [dewarpResult, setDewarpResult] = useState<DewarpResult | null>(null)
|
|
||||||
const [dewarping, setDewarping] = useState(false)
|
|
||||||
const [applying, setApplying] = useState(false)
|
|
||||||
const [showGrid, setShowGrid] = useState(true)
|
|
||||||
const [error, setError] = useState<string | null>(null)
|
|
||||||
|
|
||||||
// Auto-trigger dewarp when component mounts with a sessionId
|
|
||||||
useEffect(() => {
|
|
||||||
if (!sessionId || dewarpResult) return
|
|
||||||
|
|
||||||
const runDewarp = async () => {
|
|
||||||
setDewarping(true)
|
|
||||||
setError(null)
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/dewarp`, {
|
|
||||||
method: 'POST',
|
|
||||||
})
|
|
||||||
if (!res.ok) {
|
|
||||||
const err = await res.json().catch(() => ({ detail: res.statusText }))
|
|
||||||
throw new Error(err.detail || 'Entzerrung fehlgeschlagen')
|
|
||||||
}
|
|
||||||
const data: DewarpResult = await res.json()
|
|
||||||
data.dewarped_image_url = `${KLAUSUR_API}${data.dewarped_image_url}`
|
|
||||||
setDewarpResult(data)
|
|
||||||
} catch (e) {
|
|
||||||
setError(e instanceof Error ? e.message : 'Unbekannter Fehler')
|
|
||||||
} finally {
|
|
||||||
setDewarping(false)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
runDewarp()
|
|
||||||
}, [sessionId, dewarpResult])
|
|
||||||
|
|
||||||
const handleManualDewarp = useCallback(async (shearDegrees: number) => {
|
|
||||||
if (!sessionId) return
|
|
||||||
setApplying(true)
|
|
||||||
setError(null)
|
|
||||||
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/dewarp/manual`, {
|
|
||||||
method: 'POST',
|
|
||||||
headers: { 'Content-Type': 'application/json' },
|
|
||||||
body: JSON.stringify({ shear_degrees: shearDegrees }),
|
|
||||||
})
|
|
||||||
if (!res.ok) throw new Error('Manuelle Entzerrung fehlgeschlagen')
|
|
||||||
|
|
||||||
const data = await res.json()
|
|
||||||
setDewarpResult((prev) =>
|
|
||||||
prev
|
|
||||||
? {
|
|
||||||
...prev,
|
|
||||||
method_used: data.method_used,
|
|
||||||
shear_degrees: data.shear_degrees,
|
|
||||||
dewarped_image_url: `${KLAUSUR_API}${data.dewarped_image_url}?t=${Date.now()}`,
|
|
||||||
}
|
|
||||||
: null,
|
|
||||||
)
|
|
||||||
} catch (e) {
|
|
||||||
setError(e instanceof Error ? e.message : 'Fehler')
|
|
||||||
} finally {
|
|
||||||
setApplying(false)
|
|
||||||
}
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
const handleGroundTruth = useCallback(async (gt: DewarpGroundTruth) => {
|
|
||||||
if (!sessionId) return
|
|
||||||
try {
|
|
||||||
await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/ground-truth/dewarp`, {
|
|
||||||
method: 'POST',
|
|
||||||
headers: { 'Content-Type': 'application/json' },
|
|
||||||
body: JSON.stringify(gt),
|
|
||||||
})
|
|
||||||
} catch (e) {
|
|
||||||
console.error('Ground truth save failed:', e)
|
|
||||||
}
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
if (!sessionId) {
|
|
||||||
return (
|
|
||||||
<div className="flex flex-col items-center justify-center py-16 text-center">
|
|
||||||
<div className="text-5xl mb-4">🔧</div>
|
|
||||||
<h3 className="text-lg font-medium text-gray-700 dark:text-gray-300 mb-2">
|
|
||||||
Schritt 2: Entzerrung (Dewarp)
|
|
||||||
</h3>
|
|
||||||
<p className="text-gray-500 dark:text-gray-400 max-w-md">
|
|
||||||
Bitte zuerst Schritt 1 (Begradigung) abschliessen.
|
|
||||||
</p>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
const deskewedUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/deskewed`
|
|
||||||
const dewarpedUrl = dewarpResult?.dewarped_image_url ?? null
|
|
||||||
|
|
||||||
return (
|
|
||||||
<div className="space-y-4">
|
|
||||||
{/* Loading indicator */}
|
|
||||||
{dewarping && (
|
|
||||||
<div className="flex items-center gap-2 text-teal-600 dark:text-teal-400 text-sm">
|
|
||||||
<div className="animate-spin w-4 h-4 border-2 border-teal-500 border-t-transparent rounded-full" />
|
|
||||||
Entzerrung laeuft (beide Methoden)...
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{/* Image comparison: deskewed (left) vs dewarped (right) */}
|
|
||||||
<ImageCompareView
|
|
||||||
originalUrl={deskewedUrl}
|
|
||||||
deskewedUrl={dewarpedUrl}
|
|
||||||
showGrid={showGrid}
|
|
||||||
showGridLeft={showGrid}
|
|
||||||
showBinarized={false}
|
|
||||||
binarizedUrl={null}
|
|
||||||
leftLabel={`Begradigt (nach Deskew)${showGrid ? ' + Raster' : ''}`}
|
|
||||||
rightLabel={`Entzerrt${showGrid ? ' + Raster (mm)' : ''}`}
|
|
||||||
/>
|
|
||||||
|
|
||||||
{/* Controls */}
|
|
||||||
<DewarpControls
|
|
||||||
dewarpResult={dewarpResult}
|
|
||||||
showGrid={showGrid}
|
|
||||||
onToggleGrid={() => setShowGrid((v) => !v)}
|
|
||||||
onManualDewarp={handleManualDewarp}
|
|
||||||
onGroundTruth={handleGroundTruth}
|
|
||||||
onNext={onNext}
|
|
||||||
isApplying={applying}
|
|
||||||
/>
|
|
||||||
|
|
||||||
{error && (
|
|
||||||
<div className="p-3 bg-red-50 dark:bg-red-900/20 text-red-600 dark:text-red-400 rounded-lg text-sm">
|
|
||||||
{error}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
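Every handler in these step components repeats the check `e instanceof Error ? e.message : '...'` in its `catch` block. A small sketch of how that could be factored out is shown below; `toErrorMessage` is a hypothetical helper and not part of the project.

```typescript
// Return the message of a thrown Error, or a fallback for non-Error values
// (strings, numbers, plain objects) that JavaScript also allows to be thrown.
function toErrorMessage(e: unknown, fallback: string): string {
  return e instanceof Error ? e.message : fallback
}

try {
  throw new Error('Entzerrung fehlgeschlagen')
} catch (e) {
  console.log(toErrorMessage(e, 'Unbekannter Fehler')) // Entzerrung fehlgeschlagen
}

console.log(toErrorMessage('boom', 'Unbekannter Fehler')) // Unbekannter Fehler
```

A handler would then call `setError(toErrorMessage(e, 'Unbekannter Fehler'))`, which keeps the fallback text next to the call site while removing the duplicated type check.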
|
|
||||||
@@ -1,19 +0,0 @@
|
|||||||
'use client'

export function StepGroundTruth() {
  return (
    <div className="flex flex-col items-center justify-center py-16 text-center">
      <div className="text-5xl mb-4">✅</div>
      <h3 className="text-lg font-medium text-gray-700 dark:text-gray-300 mb-2">
        Schritt 7: Ground Truth Validierung
      </h3>
      <p className="text-gray-500 dark:text-gray-400 max-w-md">
        Gesamtpruefung der rekonstruierten Seite gegen das Original.
        Dieser Schritt wird in einer zukuenftigen Version implementiert.
      </p>
      <div className="mt-6 px-4 py-2 bg-amber-100 dark:bg-amber-900/30 text-amber-700 dark:text-amber-400 rounded-full text-sm font-medium">
        Kommt bald
      </div>
    </div>
  )
}
@@ -1,707 +0,0 @@
'use client'

import { useCallback, useEffect, useRef, useState } from 'react'
import type { GridResult, WordEntry, ColumnMeta } from '@/app/(admin)/ai/ocr-pipeline/types'

const KLAUSUR_API = '/klausur-api'

interface LlmChange {
  row_index: number
  field: 'english' | 'german' | 'example'
  old: string
  new: string
}

interface StepLlmReviewProps {
  sessionId: string | null
  onNext: () => void
}

interface ReviewMeta {
  total_entries: number
  to_review: number
  skipped: number
  model: string
  skipped_indices?: number[]
}

interface StreamProgress {
  current: number
  total: number
}

const FIELD_LABELS: Record<string, string> = {
  english: 'EN',
  german: 'DE',
  example: 'Beispiel',
  source_page: 'Seite',
  marker: 'Marker',
}

/** Map column type to WordEntry field name */
const COL_TYPE_TO_FIELD: Record<string, string> = {
  column_en: 'english',
  column_de: 'german',
  column_example: 'example',
  page_ref: 'source_page',
  column_marker: 'marker',
}

/** Column type → color class */
const COL_TYPE_COLOR: Record<string, string> = {
  column_en: 'text-blue-600 dark:text-blue-400',
  column_de: 'text-green-600 dark:text-green-400',
  column_example: 'text-orange-600 dark:text-orange-400',
  page_ref: 'text-cyan-600 dark:text-cyan-400',
  column_marker: 'text-gray-500 dark:text-gray-400',
}

type RowStatus = 'pending' | 'active' | 'reviewed' | 'corrected' | 'skipped'

export function StepLlmReview({ sessionId, onNext }: StepLlmReviewProps) {
  // Core state
  const [status, setStatus] = useState<'idle' | 'loading' | 'ready' | 'running' | 'done' | 'error' | 'applied'>('idle')
  const [meta, setMeta] = useState<ReviewMeta | null>(null)
  const [changes, setChanges] = useState<LlmChange[]>([])
  const [progress, setProgress] = useState<StreamProgress | null>(null)
  const [totalDuration, setTotalDuration] = useState(0)
  const [error, setError] = useState('')
  const [accepted, setAccepted] = useState<Set<number>>(new Set())
  const [applying, setApplying] = useState(false)

  // Full vocab table state
  const [vocabEntries, setVocabEntries] = useState<WordEntry[]>([])
  const [columnsUsed, setColumnsUsed] = useState<ColumnMeta[]>([])
  const [activeRowIndices, setActiveRowIndices] = useState<Set<number>>(new Set())
  const [reviewedRows, setReviewedRows] = useState<Set<number>>(new Set())
  const [skippedRows, setSkippedRows] = useState<Set<number>>(new Set())
  const [correctedMap, setCorrectedMap] = useState<Map<number, LlmChange[]>>(new Map())

  // Image
  const [imageNaturalSize, setImageNaturalSize] = useState<{ w: number; h: number } | null>(null)

  const tableRef = useRef<HTMLDivElement>(null)
  const activeRowRef = useRef<HTMLTableRowElement>(null)

  // Load session data on mount
  useEffect(() => {
    if (!sessionId) return
    loadSessionData()
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [sessionId])

  const loadSessionData = async () => {
    if (!sessionId) return
    setStatus('loading')
    try {
      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}`)
      if (!res.ok) throw new Error(`HTTP ${res.status}`)
      const data = await res.json()

      const wordResult: GridResult | undefined = data.word_result
      if (!wordResult) {
        setError('Keine Worterkennungsdaten gefunden. Bitte zuerst Schritt 5 abschliessen.')
        setStatus('error')
        return
      }

      const entries = wordResult.vocab_entries || wordResult.entries || []
      setVocabEntries(entries)
      setColumnsUsed(wordResult.columns_used || [])

      // Check if LLM review was already run
      const llmReview = wordResult.llm_review
      if (llmReview && llmReview.changes) {
        const existingChanges: LlmChange[] = llmReview.changes as LlmChange[]
        setChanges(existingChanges)
        setTotalDuration(llmReview.duration_ms || 0)

        // Mark all rows as reviewed
        const allReviewed = new Set(entries.map((_: WordEntry, i: number) => i))
        setReviewedRows(allReviewed)

        // Build corrected map (row index → changes for that row)
        const cMap = new Map<number, LlmChange[]>()
        for (const c of existingChanges) {
          const existing = cMap.get(c.row_index) || []
          existing.push(c)
          cMap.set(c.row_index, existing)
        }
        setCorrectedMap(cMap)

        // Default: all accepted
        setAccepted(new Set(existingChanges.map((_: LlmChange, i: number) => i)))

        setMeta({
          total_entries: entries.length,
          // The stored review carries no per-entry counts, so every entry counts as reviewed
          to_review: entries.length,
          skipped: 0,
          model: llmReview.model_used || 'unknown',
        })
        setStatus('done')
      } else {
        setStatus('ready')
      }
    } catch (e: unknown) {
      setError(e instanceof Error ? e.message : String(e))
      setStatus('error')
    }
  }

  const runReview = useCallback(async () => {
    if (!sessionId) return
    setStatus('running')
    setError('')
    setChanges([])
    setProgress(null)
    setMeta(null)
    setTotalDuration(0)
    setActiveRowIndices(new Set())
    setReviewedRows(new Set())
    setSkippedRows(new Set())
    setCorrectedMap(new Map())

    try {
      const res = await fetch(
        `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/llm-review?stream=true`,
        { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({}) },
      )

      if (!res.ok) {
        const data = await res.json().catch(() => ({}))
        throw new Error(data.detail || `HTTP ${res.status}`)
      }

      // Guard against a missing body instead of asserting that it is non-null
      if (!res.body) throw new Error('Keine Streaming-Antwort erhalten')
      const reader = res.body.getReader()
      const decoder = new TextDecoder()
      let buffer = ''
      let allChanges: LlmChange[] = []
      const allReviewed = new Set<number>()
      let allSkipped = new Set<number>()
      const cMap = new Map<number, LlmChange[]>()
      let completed = false

      while (true) {
        const { done, value } = await reader.read()
        if (done) break
        buffer += decoder.decode(value, { stream: true })

        // SSE frames are separated by a blank line; keep any partial frame in the buffer
        while (buffer.includes('\n\n')) {
          const idx = buffer.indexOf('\n\n')
          const chunk = buffer.slice(0, idx).trim()
          buffer = buffer.slice(idx + 2)

          if (!chunk.startsWith('data: ')) continue
          const dataStr = chunk.slice(6)

          let event: any
          try { event = JSON.parse(dataStr) } catch { continue }

          if (event.type === 'meta') {
            setMeta({
              total_entries: event.total_entries,
              to_review: event.to_review,
              skipped: event.skipped,
              model: event.model,
              skipped_indices: event.skipped_indices,
            })
            // Mark skipped rows
            if (event.skipped_indices) {
              allSkipped = new Set(event.skipped_indices)
              setSkippedRows(allSkipped)
            }
          }

          if (event.type === 'batch') {
            const batchChanges: LlmChange[] = event.changes || []
            const batchRows: number[] = event.entries_reviewed || []

            // Update active rows (currently being reviewed)
            setActiveRowIndices(new Set(batchRows))

            // Accumulate changes
            allChanges = [...allChanges, ...batchChanges]
            setChanges(allChanges)
            setProgress(event.progress)

            // Update corrected map
            for (const c of batchChanges) {
              const existing = cMap.get(c.row_index) || []
              existing.push(c)
              cMap.set(c.row_index, [...existing])
            }
            setCorrectedMap(new Map(cMap))

            // Mark batch rows as reviewed
            for (const r of batchRows) {
              allReviewed.add(r)
            }
            setReviewedRows(new Set(allReviewed))

            // Scroll to active row in table
            setTimeout(() => {
              activeRowRef.current?.scrollIntoView({ behavior: 'smooth', block: 'center' })
            }, 50)
          }

          if (event.type === 'complete') {
            completed = true
            setActiveRowIndices(new Set())
            setTotalDuration(event.duration_ms)
            setAccepted(new Set(allChanges.map((_: LlmChange, i: number) => i)))
            // Mark all non-skipped as reviewed
            const allEntryIndices = vocabEntries.map((_: WordEntry, i: number) => i)
            for (const i of allEntryIndices) {
              if (!allSkipped.has(i)) allReviewed.add(i)
            }
            setReviewedRows(new Set(allReviewed))
            setStatus('done')
          }

          if (event.type === 'error') {
            throw new Error(event.detail || 'Unbekannter Fehler')
          }
        }
      }

      // If the stream ended without a 'complete' event, finish anyway so the UI does not stay in 'running'
      if (!completed) {
        setActiveRowIndices(new Set())
        setAccepted(new Set(allChanges.map((_: LlmChange, i: number) => i)))
        setStatus('done')
      }
    } catch (e: unknown) {
      const msg = e instanceof Error ? e.message : String(e)
      setError(msg)
      setStatus('error')
    }
  }, [sessionId, vocabEntries])

  const toggleChange = (index: number) => {
    setAccepted(prev => {
      const next = new Set(prev)
      if (next.has(index)) next.delete(index)
      else next.add(index)
      return next
    })
  }

  const toggleAll = () => {
    if (accepted.size === changes.length) {
      setAccepted(new Set())
    } else {
      setAccepted(new Set(changes.map((_: LlmChange, i: number) => i)))
    }
  }

  const applyChanges = useCallback(async () => {
    if (!sessionId) return
    setApplying(true)
    try {
      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/llm-review/apply`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ accepted_indices: Array.from(accepted) }),
      })
      if (!res.ok) {
        const data = await res.json().catch(() => ({}))
        throw new Error(data.detail || `HTTP ${res.status}`)
      }
      setStatus('applied')
    } catch (e: unknown) {
      setError(e instanceof Error ? e.message : String(e))
    } finally {
      setApplying(false)
    }
  }, [sessionId, accepted])

  const getRowStatus = (rowIndex: number): RowStatus => {
    if (activeRowIndices.has(rowIndex)) return 'active'
    if (skippedRows.has(rowIndex)) return 'skipped'
    if (correctedMap.has(rowIndex)) return 'corrected'
    if (reviewedRows.has(rowIndex)) return 'reviewed'
    return 'pending'
  }

  const dewarpedUrl = sessionId
    ? `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/dewarped`
    : ''

  if (!sessionId) {
    return <div className="text-center py-12 text-gray-400">Bitte zuerst eine Session auswaehlen.</div>
  }

  // --- Loading session data ---
  if (status === 'loading' || status === 'idle') {
    return (
      <div className="flex items-center gap-3 justify-center py-12">
        <div className="animate-spin rounded-full h-5 w-5 border-b-2 border-teal-500" />
        <span className="text-gray-500">Session-Daten werden geladen...</span>
      </div>
    )
  }

  // --- Error ---
  if (status === 'error') {
    return (
      <div className="flex flex-col items-center justify-center py-12 text-center">
        <div className="text-5xl mb-4">⚠️</div>
        <h3 className="text-lg font-medium text-red-600 dark:text-red-400 mb-2">Fehler bei OCR-Zeichenkorrektur</h3>
        <p className="text-sm text-gray-500 dark:text-gray-400 max-w-lg mb-4">{error}</p>
        <div className="flex gap-3">
          <button onClick={() => { setError(''); loadSessionData() }}
            className="px-5 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors text-sm">
            Erneut versuchen
          </button>
          <button onClick={onNext}
            className="px-5 py-2 bg-gray-200 dark:bg-gray-700 text-gray-700 dark:text-gray-300 rounded-lg hover:bg-gray-300 dark:hover:bg-gray-600 transition-colors text-sm">
            Ueberspringen →
          </button>
        </div>
      </div>
    )
  }

  // --- Applied ---
  if (status === 'applied') {
    return (
      <div className="flex flex-col items-center justify-center py-12 text-center">
        <div className="text-5xl mb-4">✅</div>
        <h3 className="text-lg font-medium text-gray-700 dark:text-gray-300 mb-2">Korrekturen uebernommen</h3>
        <p className="text-sm text-gray-500 dark:text-gray-400 mb-6">
          {accepted.size} von {changes.length} Korrekturen wurden angewendet.
        </p>
        <button onClick={onNext}
          className="px-6 py-2.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors font-medium">
          Weiter →
        </button>
      </div>
    )
  }

  // Active entry for highlighting on image
  const activeEntry = vocabEntries.find((_: WordEntry, i: number) => activeRowIndices.has(i))

  // Guard against total === 0 so the progress bar never renders NaN%
  const pct = progress && progress.total > 0 ? Math.round((progress.current / progress.total) * 100) : 0

  // --- Ready / Running / Done: 2-column layout ---
  return (
    <div className="space-y-4">
      {/* Header */}
      <div className="flex items-center justify-between">
        <div>
          <h3 className="text-base font-medium text-gray-700 dark:text-gray-300">
            Schritt 6: Korrektur
          </h3>
          <p className="text-xs text-gray-400 mt-0.5">
            {status === 'ready' && `${vocabEntries.length} Eintraege bereit zur Pruefung`}
            {status === 'running' && meta && `${meta.model} · ${meta.to_review} zu pruefen, ${meta.skipped} uebersprungen`}
            {status === 'done' && (
              <>
                {changes.length} Korrektur{changes.length !== 1 ? 'en' : ''} gefunden
                {meta && <> · {meta.skipped} uebersprungen</>}
                {' '}· {totalDuration}ms · {meta?.model}
              </>
            )}
          </p>
        </div>
        <div className="flex items-center gap-2">
          {status === 'ready' && (
            <button onClick={runReview}
              className="px-5 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors text-sm font-medium">
              Korrektur starten
            </button>
          )}
          {status === 'running' && (
            <div className="flex items-center gap-2 text-sm text-teal-600 dark:text-teal-400">
              <div className="animate-spin rounded-full h-4 w-4 border-b-2 border-teal-500" />
              {progress ? `${progress.current}/${progress.total}` : 'Startet...'}
            </div>
          )}
          {status === 'done' && changes.length > 0 && (
            <button onClick={toggleAll}
              className="text-xs px-3 py-1.5 border border-gray-300 dark:border-gray-600 rounded-lg hover:bg-gray-50 dark:hover:bg-gray-700 transition-colors text-gray-600 dark:text-gray-400">
              {accepted.size === changes.length ? 'Keine' : 'Alle'} auswaehlen
            </button>
          )}
        </div>
      </div>

      {/* Progress bar (while running) */}
      {status === 'running' && progress && (
        <div className="space-y-1">
          <div className="flex justify-between text-xs text-gray-400">
            <span>{progress.current} / {progress.total} Eintraege geprueft</span>
            <span>{pct}%</span>
          </div>
          <div className="w-full bg-gray-200 dark:bg-gray-700 rounded-full h-2">
            <div className="bg-teal-500 h-2 rounded-full transition-all duration-500" style={{ width: `${pct}%` }} />
          </div>
        </div>
      )}

      {/* 2-column layout: Image + Table */}
      <div className="grid grid-cols-3 gap-4">
        {/* Left: Dewarped Image with highlight overlay */}
        <div className="col-span-1">
          <div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
            Originalbild
          </div>
          <div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900 relative sticky top-4">
            {/* eslint-disable-next-line @next/next/no-img-element */}
            <img
              src={dewarpedUrl}
              alt="Dewarped"
              className="w-full h-auto"
              onLoad={(e) => {
                const img = e.target as HTMLImageElement
                setImageNaturalSize({ w: img.naturalWidth, h: img.naturalHeight })
              }}
            />
            {/* Highlight overlay for active row */}
            {activeEntry?.bbox && (
              <div
                className="absolute border-2 border-yellow-400 bg-yellow-400/20 pointer-events-none animate-pulse"
                style={{
                  left: `${activeEntry.bbox.x}%`,
                  top: `${activeEntry.bbox.y}%`,
                  width: `${activeEntry.bbox.w}%`,
                  height: `${activeEntry.bbox.h}%`,
                }}
              />
            )}
          </div>
        </div>

        {/* Right: Full vocabulary table */}
        <div className="col-span-2" ref={tableRef}>
          <div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
            Vokabeltabelle ({vocabEntries.length} Eintraege)
          </div>
          <div className="border border-gray-200 dark:border-gray-700 rounded-lg overflow-hidden">
            <div className="max-h-[70vh] overflow-y-auto">
              <table className="w-full text-sm">
                <thead className="sticky top-0 z-10">
                  <tr className="bg-gray-50 dark:bg-gray-800 border-b border-gray-200 dark:border-gray-700">
                    <th className="px-2 py-2 text-left text-gray-500 dark:text-gray-400 font-medium w-10">#</th>
                    {columnsUsed.length > 0 ? (
                      columnsUsed.map((col, i) => {
                        const field = COL_TYPE_TO_FIELD[col.type]
                        if (!field) return null
                        return (
                          <th key={i} className={`px-2 py-2 text-left font-medium ${COL_TYPE_COLOR[col.type] || 'text-gray-500 dark:text-gray-400'}`}>
                            {FIELD_LABELS[field] || field}
                          </th>
                        )
                      })
                    ) : (
                      <>
                        <th className="px-2 py-2 text-left text-gray-500 dark:text-gray-400 font-medium">EN</th>
                        <th className="px-2 py-2 text-left text-gray-500 dark:text-gray-400 font-medium">DE</th>
                        <th className="px-2 py-2 text-left text-gray-500 dark:text-gray-400 font-medium">Beispiel</th>
                      </>
                    )}
                    <th className="px-2 py-2 text-center text-gray-500 dark:text-gray-400 font-medium w-16">Status</th>
                  </tr>
                </thead>
                <tbody>
                  {vocabEntries.map((entry, idx) => {
                    const rowStatus = getRowStatus(idx)
                    const rowChanges = correctedMap.get(idx)

                    const rowBg = {
                      pending: '',
                      active: 'bg-yellow-50 dark:bg-yellow-900/20',
                      reviewed: '',
                      corrected: 'bg-teal-50/50 dark:bg-teal-900/10',
                      skipped: 'bg-gray-50 dark:bg-gray-800/50',
                    }[rowStatus]

                    return (
                      <tr
                        key={idx}
                        ref={rowStatus === 'active' ? activeRowRef : undefined}
                        className={`border-b border-gray-100 dark:border-gray-700/50 ${rowBg} ${
                          rowStatus === 'active' ? 'ring-1 ring-yellow-400 ring-inset' : ''
                        }`}
                      >
                        <td className="px-2 py-1.5 text-gray-400 font-mono text-xs">{idx}</td>
                        {columnsUsed.length > 0 ? (
                          columnsUsed.map((col, i) => {
                            const field = COL_TYPE_TO_FIELD[col.type]
                            if (!field) return null
                            const text = (entry as Record<string, unknown>)[field] as string || ''
                            return (
                              <td key={i} className="px-2 py-1.5 text-xs">
                                <CellContent text={text} field={field} rowChanges={rowChanges} />
                              </td>
                            )
                          })
                        ) : (
                          <>
                            <td className="px-2 py-1.5">
                              <CellContent text={entry.english} field="english" rowChanges={rowChanges} />
                            </td>
                            <td className="px-2 py-1.5">
                              <CellContent text={entry.german} field="german" rowChanges={rowChanges} />
                            </td>
                            <td className="px-2 py-1.5 text-xs">
                              <CellContent text={entry.example} field="example" rowChanges={rowChanges} />
                            </td>
                          </>
                        )}
                        <td className="px-2 py-1.5 text-center">
                          <StatusIcon status={rowStatus} />
                        </td>
                      </tr>
                    )
                  })}
                </tbody>
              </table>
            </div>
          </div>
        </div>
      </div>

      {/* Done state: summary + actions */}
      {status === 'done' && (
        <div className="space-y-4">
          {/* Summary */}
          <div className="bg-gray-50 dark:bg-gray-800/50 rounded-lg p-3 text-xs text-gray-500 dark:text-gray-400">
            {changes.length === 0 ? (
              <span>Keine Korrekturen noetig — alle Eintraege sind korrekt.</span>
            ) : (
              <span>
                {changes.length} Korrektur{changes.length !== 1 ? 'en' : ''} gefunden ·{' '}
                {accepted.size} ausgewaehlt ·{' '}
                {meta?.skipped || 0} uebersprungen (Lautschrift) ·{' '}
                {totalDuration}ms
              </span>
            )}
          </div>

          {/* Corrections detail list (if any) */}
          {changes.length > 0 && (
            <div className="border border-gray-200 dark:border-gray-700 rounded-lg overflow-hidden">
              <div className="bg-gray-50 dark:bg-gray-800 px-3 py-2 border-b border-gray-200 dark:border-gray-700">
                <span className="text-xs font-medium text-gray-600 dark:text-gray-400">
                  Korrekturvorschlaege ({accepted.size}/{changes.length} ausgewaehlt)
                </span>
              </div>
              <table className="w-full text-sm">
                <thead>
                  <tr className="bg-gray-50/50 dark:bg-gray-800/50 border-b border-gray-200 dark:border-gray-700">
                    <th className="w-10 px-3 py-1.5 text-center">
                      <input type="checkbox" checked={accepted.size === changes.length} onChange={toggleAll}
                        className="rounded border-gray-300 dark:border-gray-600" />
                    </th>
                    <th className="px-2 py-1.5 text-left text-gray-500 dark:text-gray-400 font-medium text-xs">Zeile</th>
                    <th className="px-2 py-1.5 text-left text-gray-500 dark:text-gray-400 font-medium text-xs">Feld</th>
                    <th className="px-2 py-1.5 text-left text-gray-500 dark:text-gray-400 font-medium text-xs">Vorher</th>
                    <th className="px-2 py-1.5 text-left text-gray-500 dark:text-gray-400 font-medium text-xs">Nachher</th>
                  </tr>
                </thead>
                <tbody>
                  {changes.map((change, idx) => (
                    <tr key={idx} className={`border-b border-gray-100 dark:border-gray-700/50 ${
                      accepted.has(idx) ? 'bg-teal-50/50 dark:bg-teal-900/10' : ''
                    }`}>
                      <td className="px-3 py-1.5 text-center">
                        <input type="checkbox" checked={accepted.has(idx)} onChange={() => toggleChange(idx)}
                          className="rounded border-gray-300 dark:border-gray-600" />
                      </td>
                      <td className="px-2 py-1.5 text-gray-500 dark:text-gray-400 font-mono text-xs">R{change.row_index}</td>
                      <td className="px-2 py-1.5">
                        <span className="text-xs px-1.5 py-0.5 rounded bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-400">
                          {FIELD_LABELS[change.field] || change.field}
                        </span>
                      </td>
                      <td className="px-2 py-1.5"><span className="line-through text-red-500 dark:text-red-400 text-xs">{change.old}</span></td>
                      <td className="px-2 py-1.5"><span className="text-green-600 dark:text-green-400 font-medium text-xs">{change.new}</span></td>
                    </tr>
                  ))}
                </tbody>
              </table>
            </div>
          )}

          {/* Actions */}
          <div className="flex items-center justify-between pt-2">
            <p className="text-xs text-gray-400">
              {changes.length > 0 ? `${accepted.size} von ${changes.length} ausgewaehlt` : ''}
            </p>
            <div className="flex gap-3">
              {changes.length > 0 && (
                <button onClick={onNext}
                  className="px-4 py-2 text-sm border border-gray-300 dark:border-gray-600 rounded-lg hover:bg-gray-50 dark:hover:bg-gray-700 transition-colors text-gray-600 dark:text-gray-400">
                  Alle ablehnen
                </button>
              )}
              {changes.length > 0 ? (
                <button onClick={applyChanges} disabled={applying || accepted.size === 0}
                  className="px-5 py-2 text-sm bg-teal-600 text-white rounded-lg hover:bg-teal-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors font-medium">
                  {applying ? 'Wird uebernommen...' : `${accepted.size} Korrektur${accepted.size !== 1 ? 'en' : ''} uebernehmen`}
                </button>
              ) : (
                <button onClick={onNext}
                  className="px-6 py-2.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors font-medium">
                  Weiter →
                </button>
              )}
            </div>
          </div>
        </div>
      )}
    </div>
  )
}

/** Cell content with inline diff for corrections */
function CellContent({ text, field, rowChanges }: {
  text: string
  field: string
  rowChanges?: LlmChange[]
}) {
  const change = rowChanges?.find(c => c.field === field)

  if (!text && !change) {
    return <span className="text-gray-300 dark:text-gray-600">—</span>
  }

  if (change) {
    return (
      <span>
        <span className="line-through text-red-400 dark:text-red-500 text-xs mr-1">{change.old}</span>
        <span className="text-green-600 dark:text-green-400 font-medium text-xs">{change.new}</span>
      </span>
    )
  }

  return <span className="text-gray-700 dark:text-gray-300 text-xs">{text}</span>
}

/** Status icon for each row */
function StatusIcon({ status }: { status: RowStatus }) {
  switch (status) {
    case 'pending':
      return <span className="text-gray-300 dark:text-gray-600 text-xs">—</span>
    case 'active':
      return (
        <span className="inline-block w-3 h-3 rounded-full bg-yellow-400 animate-pulse" title="Wird geprueft" />
      )
    case 'reviewed':
      return (
        <svg className="w-4 h-4 text-green-500 inline-block" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
          <path strokeLinecap="round" strokeLinejoin="round" d="M5 13l4 4L19 7" />
        </svg>
      )
    case 'corrected':
      return (
        <span className="inline-flex items-center px-1.5 py-0.5 rounded text-[10px] font-medium bg-teal-100 dark:bg-teal-900/30 text-teal-700 dark:text-teal-400">
          korr.
        </span>
      )
    case 'skipped':
      return (
        <span className="inline-flex items-center px-1.5 py-0.5 rounded text-[10px] font-medium bg-gray-100 dark:bg-gray-700 text-gray-500 dark:text-gray-400">
          skip
        </span>
      )
  }
}
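Reviewer note: the streaming loop in `runReview` of the removed `StepLlmReview` component splits the response on blank lines and parses each `data: ` frame as JSON. The same framing logic could be isolated into a small pure function, which would make it testable without a network call. The sketch below is only an illustration under that assumption; `parseSseBuffer` and `SseParseResult` are hypothetical names that do not appear in the repository.

```typescript
// Hypothetical helper, not part of the original file: extracts complete
// "data: <json>" frames from a text buffer and returns the unparsed remainder.
interface SseParseResult<T> {
  events: T[]
  rest: string
}

function parseSseBuffer<T = unknown>(buffer: string): SseParseResult<T> {
  const events: T[] = []
  let rest = buffer
  let idx = rest.indexOf('\n\n')

  while (idx !== -1) {
    // Everything before the blank line is one frame
    const frame = rest.slice(0, idx).trim()
    rest = rest.slice(idx + 2)

    if (frame.startsWith('data: ')) {
      try {
        events.push(JSON.parse(frame.slice(6)) as T)
      } catch {
        // Malformed JSON frames are skipped, as in the component's loop
      }
    }
    idx = rest.indexOf('\n\n')
  }

  // `rest` holds an incomplete frame (if any) to be prepended to the next chunk
  return { events, rest }
}
```

In a read loop, the caller would append each decoded chunk to the previous `rest`, call the helper, and dispatch the returned events by their `type` field.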
@@ -1,559 +0,0 @@
'use client'

import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
import dynamic from 'next/dynamic'
import type { GridResult, GridCell, WordEntry } from '@/app/(admin)/ai/ocr-pipeline/types'

const KLAUSUR_API = '/klausur-api'

// Lazy-load Fabric.js canvas editor (SSR-incompatible)
const FabricReconstructionCanvas = dynamic(
  () => import('./FabricReconstructionCanvas').then(m => ({ default: m.FabricReconstructionCanvas })),
  { ssr: false, loading: () => <div className="py-8 text-center text-sm text-gray-400">Editor wird geladen...</div> }
)

type EditorMode = 'simple' | 'editor'

interface StepReconstructionProps {
  sessionId: string | null
  onNext: () => void
}

interface EditableCell {
  cellId: string
  text: string
  originalText: string
  bboxPct: { x: number; y: number; w: number; h: number }
  colType: string
  rowIndex: number
  colIndex: number
}

type UndoAction = { cellId: string; oldText: string; newText: string }

export function StepReconstruction({ sessionId, onNext }: StepReconstructionProps) {
  const [status, setStatus] = useState<'loading' | 'ready' | 'saving' | 'saved' | 'error'>('loading')
  const [error, setError] = useState('')
  const [cells, setCells] = useState<EditableCell[]>([])
  const [gridCells, setGridCells] = useState<GridCell[]>([])
  const [editorMode, setEditorMode] = useState<EditorMode>('simple')
  const [editedTexts, setEditedTexts] = useState<Map<string, string>>(new Map())
  const [zoom, setZoom] = useState(100)
  const [imageNaturalH, setImageNaturalH] = useState(0)
  const [showEmptyHighlight, setShowEmptyHighlight] = useState(true)

  // Undo/Redo stacks
  const [undoStack, setUndoStack] = useState<UndoAction[]>([])
  const [redoStack, setRedoStack] = useState<UndoAction[]>([])

  // (allCells removed; cells now contains all cells including empty ones)

  const containerRef = useRef<HTMLDivElement>(null)
  const imageRef = useRef<HTMLImageElement>(null)

  // Load session data on mount
  useEffect(() => {
    if (!sessionId) return
    loadSessionData()
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [sessionId])

  // Track image natural height for font scaling
  const handleImageLoad = useCallback(() => {
    if (imageRef.current) {
      setImageNaturalH(imageRef.current.naturalHeight)
    }
  }, [])

  const loadSessionData = async () => {
    if (!sessionId) return
    setStatus('loading')
    try {
      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}`)
      if (!res.ok) throw new Error(`HTTP ${res.status}`)
      const data = await res.json()

      const wordResult: GridResult | undefined = data.word_result
      if (!wordResult) {
        setError('Keine Worterkennungsdaten gefunden. Bitte zuerst Schritt 5 abschliessen.')
        setStatus('error')
        return
      }

      // Build editable cells from grid cells
|
|
||||||
const rawGridCells: GridCell[] = wordResult.cells || []
|
|
||||||
setGridCells(rawGridCells)
|
|
||||||
const allEditableCells: EditableCell[] = rawGridCells.map(c => ({
|
|
||||||
cellId: c.cell_id,
|
|
||||||
text: c.text,
|
|
||||||
originalText: c.text,
|
|
||||||
bboxPct: c.bbox_pct,
|
|
||||||
colType: c.col_type,
|
|
||||||
rowIndex: c.row_index,
|
|
||||||
colIndex: c.col_index,
|
|
||||||
}))
|
|
||||||
|
|
||||||
setCells(allEditableCells)
|
|
||||||
setEditedTexts(new Map())
|
|
||||||
setUndoStack([])
|
|
||||||
setRedoStack([])
|
|
||||||
setStatus('ready')
|
|
||||||
} catch (e: unknown) {
|
|
||||||
setError(e instanceof Error ? e.message : String(e))
|
|
||||||
setStatus('error')
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
  const handleTextChange = useCallback((cellId: string, newText: string) => {
    // Resolve the previous text before queuing updates so that the state
    // updaters below stay pure (React may invoke updater functions twice).
    const cell = cells.find(c => c.cellId === cellId)
    const prevText = editedTexts.get(cellId) ?? cell?.text ?? ''

    // Push to undo stack
    setUndoStack(stack => [...stack, { cellId, oldText: prevText, newText }])
    setRedoStack([]) // Clear redo on new edit

    setEditedTexts(prev => {
      const next = new Map(prev)
      next.set(cellId, newText)
      return next
    })
  }, [cells, editedTexts])
|
|
||||||
  const undo = useCallback(() => {
    if (undoStack.length === 0) return
    const action = undoStack[undoStack.length - 1]

    // Queue the updates side by side instead of nesting them inside an
    // updater function, which React may invoke twice.
    setUndoStack(stack => stack.slice(0, -1))
    setRedoStack(rs => [...rs, action])
    setEditedTexts(prev => {
      const next = new Map(prev)
      next.set(action.cellId, action.oldText)
      return next
    })
  }, [undoStack])

  const redo = useCallback(() => {
    if (redoStack.length === 0) return
    const action = redoStack[redoStack.length - 1]

    setRedoStack(stack => stack.slice(0, -1))
    setUndoStack(us => [...us, action])
    setEditedTexts(prev => {
      const next = new Map(prev)
      next.set(action.cellId, action.newText)
      return next
    })
  }, [redoStack])
|
|
||||||
const resetCell = useCallback((cellId: string) => {
|
|
||||||
const cell = cells.find(c => c.cellId === cellId)
|
|
||||||
if (!cell) return
|
|
||||||
setEditedTexts(prev => {
|
|
||||||
const next = new Map(prev)
|
|
||||||
next.delete(cellId)
|
|
||||||
return next
|
|
||||||
})
|
|
||||||
}, [cells])
|
|
||||||
|
|
||||||
// Global keyboard shortcuts for undo/redo
|
|
||||||
useEffect(() => {
|
|
||||||
const handler = (e: KeyboardEvent) => {
|
|
||||||
if ((e.metaKey || e.ctrlKey) && e.key === 'z') {
|
|
||||||
e.preventDefault()
|
|
||||||
if (e.shiftKey) {
|
|
||||||
redo()
|
|
||||||
} else {
|
|
||||||
undo()
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
document.addEventListener('keydown', handler)
|
|
||||||
return () => document.removeEventListener('keydown', handler)
|
|
||||||
}, [undo, redo])
|
|
||||||
|
|
||||||
const getDisplayText = useCallback((cell: EditableCell): string => {
|
|
||||||
return editedTexts.get(cell.cellId) ?? cell.text
|
|
||||||
}, [editedTexts])
|
|
||||||
|
|
||||||
const isEdited = useCallback((cell: EditableCell): boolean => {
|
|
||||||
const edited = editedTexts.get(cell.cellId)
|
|
||||||
return edited !== undefined && edited !== cell.originalText
|
|
||||||
}, [editedTexts])
|
|
||||||
|
|
||||||
const changedCount = useMemo(() => {
|
|
||||||
let count = 0
|
|
||||||
for (const cell of cells) {
|
|
||||||
if (isEdited(cell)) count++
|
|
||||||
}
|
|
||||||
return count
|
|
||||||
}, [cells, isEdited])
|
|
||||||
|
|
||||||
// Identify empty required cells (EN or DE columns with no text)
|
|
||||||
const emptyCellIds = useMemo(() => {
|
|
||||||
const required = new Set(['column_en', 'column_de'])
|
|
||||||
const ids = new Set<string>()
|
|
||||||
for (const cell of cells) {
|
|
||||||
if (required.has(cell.colType) && !cell.text.trim()) {
|
|
||||||
ids.add(cell.cellId)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
return ids
|
|
||||||
}, [cells])
|
|
||||||
|
|
||||||
// Sort cells for tab navigation: by row, then by column
|
|
||||||
const sortedCellIds = useMemo(() => {
|
|
||||||
return [...cells]
|
|
||||||
.sort((a, b) => a.rowIndex !== b.rowIndex ? a.rowIndex - b.rowIndex : a.colIndex - b.colIndex)
|
|
||||||
.map(c => c.cellId)
|
|
||||||
}, [cells])
|
|
||||||
|
|
||||||
const handleKeyDown = useCallback((e: React.KeyboardEvent, cellId: string) => {
|
|
||||||
if (e.key === 'Tab') {
|
|
||||||
e.preventDefault()
|
|
||||||
const idx = sortedCellIds.indexOf(cellId)
|
|
||||||
const nextIdx = e.shiftKey ? idx - 1 : idx + 1
|
|
||||||
if (nextIdx >= 0 && nextIdx < sortedCellIds.length) {
|
|
||||||
const nextId = sortedCellIds[nextIdx]
|
|
||||||
const el = document.getElementById(`cell-${nextId}`)
|
|
||||||
el?.focus()
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}, [sortedCellIds])
|
|
||||||
|
|
||||||
const saveReconstruction = useCallback(async () => {
|
|
||||||
if (!sessionId) return
|
|
||||||
setStatus('saving')
|
|
||||||
try {
|
|
||||||
const cellUpdates = Array.from(editedTexts.entries())
|
|
||||||
.filter(([cellId, text]) => {
|
|
||||||
const cell = cells.find(c => c.cellId === cellId)
|
|
||||||
return cell && text !== cell.originalText
|
|
||||||
})
|
|
||||||
.map(([cellId, text]) => ({ cell_id: cellId, text }))
|
|
||||||
|
|
||||||
      if (cellUpdates.length === 0) {
        // Nothing changed: skip the request and show the saved state
        setStatus('saved')
        return
      }
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/reconstruction`, {
|
|
||||||
method: 'POST',
|
|
||||||
headers: { 'Content-Type': 'application/json' },
|
|
||||||
body: JSON.stringify({ cells: cellUpdates }),
|
|
||||||
})
|
|
||||||
|
|
||||||
if (!res.ok) {
|
|
||||||
const data = await res.json().catch(() => ({}))
|
|
||||||
throw new Error(data.detail || `HTTP ${res.status}`)
|
|
||||||
}
|
|
||||||
|
|
||||||
setStatus('saved')
|
|
||||||
} catch (e: unknown) {
|
|
||||||
setError(e instanceof Error ? e.message : String(e))
|
|
||||||
setStatus('error')
|
|
||||||
}
|
|
||||||
}, [sessionId, editedTexts, cells])
|
|
||||||
|
|
||||||
  // Handler for Fabric.js editor cell changes
  const handleFabricCellsChanged = useCallback((updates: { cell_id: string; text: string }[]) => {
    if (updates.length === 0) return
    // Apply all updates in a single state transition
    setEditedTexts(prev => {
      const next = new Map(prev)
      for (const u of updates) {
        next.set(u.cell_id, u.text)
      }
      return next
    })
  }, [])
|
|
||||||
const dewarpedUrl = sessionId
|
|
||||||
? `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/dewarped`
|
|
||||||
: ''
|
|
||||||
|
|
||||||
const colTypeColor = (colType: string): string => {
|
|
||||||
const colors: Record<string, string> = {
|
|
||||||
column_en: 'border-blue-400/40 focus:border-blue-500',
|
|
||||||
column_de: 'border-green-400/40 focus:border-green-500',
|
|
||||||
column_example: 'border-orange-400/40 focus:border-orange-500',
|
|
||||||
column_text: 'border-purple-400/40 focus:border-purple-500',
|
|
||||||
page_ref: 'border-cyan-400/40 focus:border-cyan-500',
|
|
||||||
column_marker: 'border-gray-400/40 focus:border-gray-500',
|
|
||||||
}
|
|
||||||
return colors[colType] || 'border-gray-400/40 focus:border-gray-500'
|
|
||||||
}
|
|
||||||
|
|
||||||
// Font size based on image natural height (not container) scaled by zoom
|
|
||||||
const getFontSize = useCallback((bboxH: number): number => {
|
|
||||||
const baseH = imageNaturalH || 800
|
|
||||||
const px = (bboxH / 100) * baseH * 0.55
|
|
||||||
return Math.max(8, Math.min(18, px * (zoom / 100)))
|
|
||||||
}, [imageNaturalH, zoom])
|
|
||||||
|
|
||||||
if (!sessionId) {
|
|
||||||
return <div className="text-center py-12 text-gray-400">Bitte zuerst eine Session auswaehlen.</div>
|
|
||||||
}
|
|
||||||
|
|
||||||
if (status === 'loading') {
|
|
||||||
return (
|
|
||||||
<div className="flex items-center gap-3 justify-center py-12">
|
|
||||||
<div className="animate-spin rounded-full h-5 w-5 border-b-2 border-teal-500" />
|
|
||||||
<span className="text-gray-500">Rekonstruktionsdaten werden geladen...</span>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
if (status === 'error') {
|
|
||||||
return (
|
|
||||||
<div className="flex flex-col items-center justify-center py-12 text-center">
|
|
||||||
<div className="text-5xl mb-4">⚠️</div>
|
|
||||||
<h3 className="text-lg font-medium text-red-600 dark:text-red-400 mb-2">Fehler</h3>
|
|
||||||
<p className="text-sm text-gray-500 dark:text-gray-400 max-w-lg mb-4">{error}</p>
|
|
||||||
<div className="flex gap-3">
|
|
||||||
<button onClick={() => { setError(''); loadSessionData() }}
|
|
||||||
className="px-5 py-2 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors text-sm">
|
|
||||||
Erneut versuchen
|
|
||||||
</button>
|
|
||||||
<button onClick={onNext}
|
|
||||||
className="px-5 py-2 bg-gray-200 dark:bg-gray-700 text-gray-700 dark:text-gray-300 rounded-lg hover:bg-gray-300 dark:hover:bg-gray-600 transition-colors text-sm">
|
|
||||||
Ueberspringen →
|
|
||||||
</button>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
if (status === 'saved') {
|
|
||||||
return (
|
|
||||||
<div className="flex flex-col items-center justify-center py-12 text-center">
|
|
||||||
<div className="text-5xl mb-4">✅</div>
|
|
||||||
<h3 className="text-lg font-medium text-gray-700 dark:text-gray-300 mb-2">Rekonstruktion gespeichert</h3>
|
|
||||||
<p className="text-sm text-gray-500 dark:text-gray-400 mb-6">
|
|
||||||
{changedCount > 0 ? `${changedCount} Zellen wurden aktualisiert.` : 'Keine Aenderungen vorgenommen.'}
|
|
||||||
</p>
|
|
||||||
<button onClick={onNext}
|
|
||||||
className="px-6 py-2.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors font-medium">
|
|
||||||
Weiter →
|
|
||||||
</button>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
return (
|
|
||||||
<div className="space-y-3">
|
|
||||||
{/* Toolbar */}
|
|
||||||
<div className="flex items-center justify-between bg-white dark:bg-gray-800 rounded-lg border border-gray-200 dark:border-gray-700 px-3 py-2">
|
|
||||||
<div className="flex items-center gap-2">
|
|
||||||
<h3 className="text-sm font-medium text-gray-700 dark:text-gray-300">
|
|
||||||
Schritt 7: Rekonstruktion
|
|
||||||
</h3>
|
|
||||||
{/* Mode toggle */}
|
|
||||||
<div className="flex items-center ml-2 border border-gray-300 dark:border-gray-600 rounded overflow-hidden text-xs">
|
|
||||||
<button
|
|
||||||
onClick={() => setEditorMode('simple')}
|
|
||||||
className={`px-2 py-0.5 transition-colors ${
|
|
||||||
editorMode === 'simple'
|
|
||||||
? 'bg-teal-600 text-white'
|
|
||||||
: 'hover:bg-gray-50 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
|
|
||||||
}`}
|
|
||||||
>
|
|
||||||
Einfach
|
|
||||||
</button>
|
|
||||||
<button
|
|
||||||
onClick={() => setEditorMode('editor')}
|
|
||||||
className={`px-2 py-0.5 transition-colors ${
|
|
||||||
editorMode === 'editor'
|
|
||||||
? 'bg-teal-600 text-white'
|
|
||||||
: 'hover:bg-gray-50 dark:hover:bg-gray-700 text-gray-600 dark:text-gray-400'
|
|
||||||
}`}
|
|
||||||
>
|
|
||||||
Editor
|
|
||||||
</button>
|
|
||||||
</div>
|
|
||||||
<span className="text-xs text-gray-400">
|
|
||||||
{cells.length} Zellen · {changedCount} geaendert
|
|
||||||
{emptyCellIds.size > 0 && showEmptyHighlight && (
|
|
||||||
<span className="text-red-400 ml-1">· {emptyCellIds.size} leer</span>
|
|
||||||
)}
|
|
||||||
</span>
|
|
||||||
</div>
|
|
||||||
<div className="flex items-center gap-2">
|
|
||||||
{/* Undo/Redo */}
|
|
||||||
<button
|
|
||||||
onClick={undo}
|
|
||||||
disabled={undoStack.length === 0}
|
|
||||||
className="px-2 py-1 text-xs border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700 disabled:opacity-30"
|
|
||||||
title="Rueckgaengig (Ctrl+Z)"
|
|
||||||
>
|
|
||||||
↩
|
|
||||||
</button>
|
|
||||||
<button
|
|
||||||
onClick={redo}
|
|
||||||
disabled={redoStack.length === 0}
|
|
||||||
className="px-2 py-1 text-xs border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700 disabled:opacity-30"
|
|
||||||
title="Wiederholen (Ctrl+Shift+Z)"
|
|
||||||
>
|
|
||||||
↪
|
|
||||||
</button>
|
|
||||||
|
|
||||||
<div className="w-px h-5 bg-gray-300 dark:bg-gray-600 mx-1" />
|
|
||||||
|
|
||||||
{/* Empty field toggle */}
|
|
||||||
<button
|
|
||||||
onClick={() => setShowEmptyHighlight(v => !v)}
|
|
||||||
className={`px-2 py-1 text-xs border rounded transition-colors ${
|
|
||||||
showEmptyHighlight
|
|
||||||
? 'border-red-300 bg-red-50 text-red-600 dark:border-red-700 dark:bg-red-900/30 dark:text-red-400'
|
|
||||||
: 'border-gray-300 dark:border-gray-600 hover:bg-gray-50 dark:hover:bg-gray-700'
|
|
||||||
}`}
|
|
||||||
title="Leere Pflichtfelder markieren"
|
|
||||||
>
|
|
||||||
Leer
|
|
||||||
</button>
|
|
||||||
|
|
||||||
<div className="w-px h-5 bg-gray-300 dark:bg-gray-600 mx-1" />
|
|
||||||
|
|
||||||
{/* Zoom controls */}
|
|
||||||
<button
|
|
||||||
onClick={() => setZoom(z => Math.max(50, z - 25))}
|
|
||||||
className="px-2 py-1 text-xs border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700"
|
|
||||||
>
|
|
||||||
−
|
|
||||||
</button>
|
|
||||||
<span className="text-xs text-gray-500 w-10 text-center">{zoom}%</span>
|
|
||||||
<button
|
|
||||||
onClick={() => setZoom(z => Math.min(200, z + 25))}
|
|
||||||
className="px-2 py-1 text-xs border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700"
|
|
||||||
>
|
|
||||||
+
|
|
||||||
</button>
|
|
||||||
<button
|
|
||||||
onClick={() => setZoom(100)}
|
|
||||||
className="px-2 py-1 text-xs border border-gray-300 dark:border-gray-600 rounded hover:bg-gray-50 dark:hover:bg-gray-700"
|
|
||||||
>
|
|
||||||
Fit
|
|
||||||
</button>
|
|
||||||
|
|
||||||
<div className="w-px h-5 bg-gray-300 dark:bg-gray-600 mx-1" />
|
|
||||||
|
|
||||||
<button
|
|
||||||
onClick={saveReconstruction}
|
|
||||||
disabled={status === 'saving'}
|
|
||||||
className="px-4 py-1.5 text-xs bg-teal-600 text-white rounded-lg hover:bg-teal-700 disabled:opacity-50 transition-colors font-medium"
|
|
||||||
>
|
|
||||||
{status === 'saving' ? 'Speichert...' : 'Speichern'}
|
|
||||||
</button>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Reconstruction canvas — Simple or Editor mode */}
|
|
||||||
{editorMode === 'editor' && sessionId ? (
|
|
||||||
<FabricReconstructionCanvas
|
|
||||||
sessionId={sessionId}
|
|
||||||
cells={gridCells}
|
|
||||||
onCellsChanged={handleFabricCellsChanged}
|
|
||||||
/>
|
|
||||||
) : (
|
|
||||||
<div className="border rounded-lg overflow-auto dark:border-gray-700 bg-gray-100 dark:bg-gray-900" style={{ maxHeight: '75vh' }}>
|
|
||||||
<div
|
|
||||||
ref={containerRef}
|
|
||||||
className="relative inline-block"
|
|
||||||
style={{ transform: `scale(${zoom / 100})`, transformOrigin: 'top left' }}
|
|
||||||
>
|
|
||||||
{/* Background image at reduced opacity */}
|
|
||||||
{/* eslint-disable-next-line @next/next/no-img-element */}
|
|
||||||
<img
|
|
||||||
ref={imageRef}
|
|
||||||
src={dewarpedUrl}
|
|
||||||
alt="Dewarped"
|
|
||||||
className="block"
|
|
||||||
style={{ opacity: 0.3 }}
|
|
||||||
onLoad={handleImageLoad}
|
|
||||||
/>
|
|
||||||
|
|
||||||
{/* Empty field markers */}
|
|
||||||
{showEmptyHighlight && cells
|
|
||||||
.filter(c => emptyCellIds.has(c.cellId))
|
|
||||||
.map(cell => (
|
|
||||||
<div
|
|
||||||
key={`empty-${cell.cellId}`}
|
|
||||||
className="absolute border-2 border-dashed border-red-400/60 rounded pointer-events-none"
|
|
||||||
style={{
|
|
||||||
left: `${cell.bboxPct.x}%`,
|
|
||||||
top: `${cell.bboxPct.y}%`,
|
|
||||||
width: `${cell.bboxPct.w}%`,
|
|
||||||
height: `${cell.bboxPct.h}%`,
|
|
||||||
}}
|
|
||||||
/>
|
|
||||||
))}
|
|
||||||
|
|
||||||
{/* Editable text fields at bbox positions */}
|
|
||||||
{cells.map((cell) => {
|
|
||||||
const displayText = getDisplayText(cell)
|
|
||||||
const edited = isEdited(cell)
|
|
||||||
|
|
||||||
return (
|
|
||||||
<div key={cell.cellId} className="absolute group" style={{
|
|
||||||
left: `${cell.bboxPct.x}%`,
|
|
||||||
top: `${cell.bboxPct.y}%`,
|
|
||||||
width: `${cell.bboxPct.w}%`,
|
|
||||||
height: `${cell.bboxPct.h}%`,
|
|
||||||
}}>
|
|
||||||
<input
|
|
||||||
id={`cell-${cell.cellId}`}
|
|
||||||
type="text"
|
|
||||||
value={displayText}
|
|
||||||
onChange={(e) => handleTextChange(cell.cellId, e.target.value)}
|
|
||||||
onKeyDown={(e) => handleKeyDown(e, cell.cellId)}
|
|
||||||
className={`w-full h-full bg-transparent text-black dark:text-white border px-0.5 outline-none transition-colors ${
|
|
||||||
colTypeColor(cell.colType)
|
|
||||||
} ${edited ? 'border-green-500 bg-green-50/30 dark:bg-green-900/20' : ''}`}
|
|
||||||
style={{
|
|
||||||
fontSize: `${getFontSize(cell.bboxPct.h)}px`,
|
|
||||||
lineHeight: '1',
|
|
||||||
}}
|
|
||||||
title={`${cell.cellId} (${cell.colType})`}
|
|
||||||
/>
|
|
||||||
{/* Per-cell reset button (X) — only shown for edited cells on hover */}
|
|
||||||
{edited && (
|
|
||||||
<button
|
|
||||||
onClick={() => resetCell(cell.cellId)}
|
|
||||||
className="absolute -top-1 -right-1 w-4 h-4 bg-red-500 text-white rounded-full text-[9px] leading-none opacity-0 group-hover:opacity-100 transition-opacity flex items-center justify-center"
|
|
||||||
title="Zuruecksetzen"
|
|
||||||
>
|
|
||||||
×
|
|
||||||
</button>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
})}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{/* Bottom action */}
|
|
||||||
<div className="flex justify-end">
|
|
||||||
<button
|
|
||||||
onClick={() => {
|
|
||||||
if (changedCount > 0) {
|
|
||||||
saveReconstruction()
|
|
||||||
} else {
|
|
||||||
onNext()
|
|
||||||
}
|
|
||||||
}}
|
|
||||||
className="px-6 py-2.5 bg-teal-600 text-white rounded-lg hover:bg-teal-700 transition-colors font-medium text-sm"
|
|
||||||
>
|
|
||||||
{changedCount > 0 ? 'Speichern & Weiter \u2192' : 'Weiter \u2192'}
|
|
||||||
</button>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
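The undo/redo handling in `StepReconstruction` above keeps the edited texts and the two stacks in separate `useState` hooks. As a rough sketch only, the same bookkeeping could be written as pure functions over a single state object, so that every transition is free of side effects and can be tested without React. `EditHistory`, `emptyHistory`, `applyEdit`, `undoEdit` and `redoEdit` are hypothetical names introduced for this illustration; only the `UndoAction` shape is taken from the deleted file.

```typescript
// Shape copied from the deleted component.
type UndoAction = { cellId: string; oldText: string; newText: string }

// Hypothetical container for everything the undo/redo logic touches.
interface EditHistory {
  texts: Map<string, string> // cellId -> current edited text
  undoStack: UndoAction[]
  redoStack: UndoAction[]
}

function emptyHistory(): EditHistory {
  return { texts: new Map<string, string>(), undoStack: [], redoStack: [] }
}

// Returns a copy of `texts` with one entry replaced; the input is not mutated.
function withText(texts: Map<string, string>, cellId: string, text: string): Map<string, string> {
  const next = new Map(texts)
  next.set(cellId, text)
  return next
}

// Records a new edit. `originalText` is the fallback used when the cell has
// not been edited before. A new edit always clears the redo stack.
function applyEdit(state: EditHistory, cellId: string, newText: string, originalText = ''): EditHistory {
  const oldText = state.texts.get(cellId) ?? originalText
  return {
    texts: withText(state.texts, cellId, newText),
    undoStack: [...state.undoStack, { cellId, oldText, newText }],
    redoStack: [],
  }
}

// Reverts the most recent edit and moves it onto the redo stack.
function undoEdit(state: EditHistory): EditHistory {
  const action = state.undoStack[state.undoStack.length - 1]
  if (action === undefined) return state
  return {
    texts: withText(state.texts, action.cellId, action.oldText),
    undoStack: state.undoStack.slice(0, -1),
    redoStack: [...state.redoStack, action],
  }
}

// Re-applies the most recently undone edit and moves it back to the undo stack.
function redoEdit(state: EditHistory): EditHistory {
  const action = state.redoStack[state.redoStack.length - 1]
  if (action === undefined) return state
  return {
    texts: withText(state.texts, action.cellId, action.newText),
    undoStack: [...state.undoStack, action],
    redoStack: state.redoStack.slice(0, -1),
  }
}
```

In a component, these functions could back a `useReducer`; that wiring is an assumption and not part of the original file.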
@@ -1,263 +0,0 @@
|
|||||||
'use client'
|
|
||||||
|
|
||||||
import { useCallback, useEffect, useState } from 'react'
|
|
||||||
import type { RowResult, RowGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
|
|
||||||
|
|
||||||
const KLAUSUR_API = '/klausur-api'
|
|
||||||
|
|
||||||
interface StepRowDetectionProps {
|
|
||||||
sessionId: string | null
|
|
||||||
onNext: () => void
|
|
||||||
}
|
|
||||||
|
|
||||||
export function StepRowDetection({ sessionId, onNext }: StepRowDetectionProps) {
|
|
||||||
const [rowResult, setRowResult] = useState<RowResult | null>(null)
|
|
||||||
const [detecting, setDetecting] = useState(false)
|
|
||||||
const [error, setError] = useState<string | null>(null)
|
|
||||||
const [gtNotes, setGtNotes] = useState('')
|
|
||||||
const [gtSaved, setGtSaved] = useState(false)
|
|
||||||
|
|
||||||
useEffect(() => {
|
|
||||||
if (!sessionId) return
|
|
||||||
|
|
||||||
const fetchSession = async () => {
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}`)
|
|
||||||
if (res.ok) {
|
|
||||||
const info = await res.json()
|
|
||||||
if (info.row_result) {
|
|
||||||
setRowResult(info.row_result)
|
|
||||||
return
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} catch (e) {
|
|
||||||
console.error('Failed to fetch session info:', e)
|
|
||||||
}
|
|
||||||
// No cached result — run auto
|
|
||||||
runAutoDetection()
|
|
||||||
}
|
|
||||||
|
|
||||||
fetchSession()
|
|
||||||
// eslint-disable-next-line react-hooks/exhaustive-deps
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
const runAutoDetection = useCallback(async () => {
|
|
||||||
if (!sessionId) return
|
|
||||||
setDetecting(true)
|
|
||||||
setError(null)
|
|
||||||
try {
|
|
||||||
const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/rows`, {
|
|
||||||
method: 'POST',
|
|
||||||
})
|
|
||||||
if (!res.ok) {
|
|
||||||
const err = await res.json().catch(() => ({ detail: res.statusText }))
|
|
||||||
throw new Error(err.detail || 'Zeilenerkennung fehlgeschlagen')
|
|
||||||
}
|
|
||||||
const data: RowResult = await res.json()
|
|
||||||
setRowResult(data)
|
|
||||||
} catch (e) {
|
|
||||||
setError(e instanceof Error ? e.message : 'Unbekannter Fehler')
|
|
||||||
} finally {
|
|
||||||
setDetecting(false)
|
|
||||||
}
|
|
||||||
}, [sessionId])
|
|
||||||
|
|
||||||
  const handleGroundTruth = useCallback(async (isCorrect: boolean) => {
    if (!sessionId) return
    const gt: RowGroundTruth = {
      is_correct: isCorrect,
      notes: gtNotes || undefined,
    }
    try {
      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/ground-truth/rows`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(gt),
      })
      // fetch only rejects on network errors, so check the HTTP status too
      if (!res.ok) throw new Error(`HTTP ${res.status}`)
      setGtSaved(true)
    } catch (e) {
      console.error('Ground truth save failed:', e)
    }
  }, [sessionId, gtNotes])
|
|
||||||
if (!sessionId) {
|
|
||||||
return (
|
|
||||||
<div className="flex flex-col items-center justify-center py-16 text-center">
|
|
||||||
<div className="text-5xl mb-4">📏</div>
|
|
||||||
<h3 className="text-lg font-medium text-gray-700 dark:text-gray-300 mb-2">
|
|
||||||
Schritt 4: Zeilenerkennung
|
|
||||||
</h3>
|
|
||||||
<p className="text-gray-500 dark:text-gray-400 max-w-md">
|
|
||||||
Bitte zuerst Schritte 1-3 abschliessen.
|
|
||||||
</p>
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
|
|
||||||
const overlayUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/rows-overlay`
|
|
||||||
const dewarpedUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/dewarped`
|
|
||||||
|
|
||||||
const rowTypeColors: Record<string, string> = {
|
|
||||||
header: 'bg-gray-200 dark:bg-gray-600 text-gray-700 dark:text-gray-300',
|
|
||||||
content: 'bg-blue-100 dark:bg-blue-900/30 text-blue-700 dark:text-blue-300',
|
|
||||||
footer: 'bg-gray-200 dark:bg-gray-600 text-gray-700 dark:text-gray-300',
|
|
||||||
}
|
|
||||||
|
|
||||||
return (
|
|
||||||
<div className="space-y-4">
|
|
||||||
{/* Loading */}
|
|
||||||
{detecting && (
|
|
||||||
<div className="flex items-center gap-2 text-teal-600 dark:text-teal-400 text-sm">
|
|
||||||
<div className="animate-spin w-4 h-4 border-2 border-teal-500 border-t-transparent rounded-full" />
|
|
||||||
Zeilenerkennung laeuft...
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{/* Images: overlay vs clean */}
|
|
||||||
<div className="grid grid-cols-2 gap-4">
|
|
||||||
<div>
|
|
||||||
<div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
|
|
||||||
Mit Zeilen-Overlay
|
|
||||||
</div>
|
|
||||||
<div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900">
|
|
||||||
{rowResult ? (
|
|
||||||
// eslint-disable-next-line @next/next/no-img-element
|
|
||||||
<img
|
|
||||||
src={`${overlayUrl}?t=${Date.now()}`}
|
|
||||||
alt="Zeilen-Overlay"
|
|
||||||
className="w-full h-auto"
|
|
||||||
/>
|
|
||||||
) : (
|
|
||||||
<div className="aspect-[3/4] flex items-center justify-center text-gray-400 text-sm">
|
|
||||||
{detecting ? 'Erkenne Zeilen...' : 'Keine Daten'}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
<div>
|
|
||||||
<div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
|
|
||||||
Entzerrtes Bild
|
|
||||||
</div>
|
|
||||||
<div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900">
|
|
||||||
{/* eslint-disable-next-line @next/next/no-img-element */}
|
|
||||||
<img
|
|
||||||
src={dewarpedUrl}
|
|
||||||
alt="Entzerrt"
|
|
||||||
className="w-full h-auto"
|
|
||||||
/>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Row summary */}
|
|
||||||
{rowResult && (
|
|
||||||
<div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4 space-y-3">
|
|
||||||
<div className="flex items-center justify-between">
|
|
||||||
<h4 className="text-sm font-medium text-gray-700 dark:text-gray-300">
|
|
||||||
Ergebnis: {rowResult.total_rows} Zeilen erkannt
|
|
||||||
</h4>
|
|
||||||
<span className="text-xs text-gray-400">
|
|
||||||
{rowResult.duration_seconds}s
|
|
||||||
</span>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Type summary badges */}
|
|
||||||
<div className="flex gap-2">
|
|
||||||
{Object.entries(rowResult.summary).map(([type, count]) => (
|
|
||||||
<span
|
|
||||||
key={type}
|
|
||||||
className={`px-2 py-0.5 rounded text-xs font-medium ${rowTypeColors[type] || 'bg-gray-100 text-gray-600'}`}
|
|
||||||
>
|
|
||||||
{type}: {count}
|
|
||||||
</span>
|
|
||||||
))}
|
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Row list */}
|
|
||||||
<div className="max-h-64 overflow-y-auto space-y-1">
|
|
||||||
{rowResult.rows.map((row) => (
|
|
||||||
<div
|
|
||||||
key={row.index}
|
|
||||||
className={`flex items-center gap-3 px-3 py-1.5 rounded text-xs font-mono ${
|
|
||||||
row.row_type === 'header' || row.row_type === 'footer'
|
|
||||||
? 'bg-gray-50 dark:bg-gray-700/50 text-gray-500'
|
|
||||||
: 'text-gray-600 dark:text-gray-400'
|
|
||||||
}`}
|
|
||||||
>
|
|
||||||
<span className="w-8 text-right text-gray-400">R{row.index}</span>
|
|
||||||
<span className={`px-1.5 py-0.5 rounded text-[10px] uppercase font-semibold ${rowTypeColors[row.row_type] || ''}`}>
|
|
||||||
{row.row_type}
|
|
||||||
</span>
|
|
||||||
<span>y={row.y}</span>
|
|
||||||
<span>h={row.height}px</span>
|
|
||||||
<span>{row.word_count} Woerter</span>
|
|
||||||
{row.gap_before > 0 && (
|
|
||||||
<span className="text-gray-400">gap={row.gap_before}px</span>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
))}
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{/* Controls */}
|
|
||||||
{rowResult && (
|
|
||||||
<div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4 space-y-3">
|
|
||||||
<div className="flex items-center gap-3">
|
|
||||||
<button
|
|
||||||
onClick={() => runAutoDetection()}
|
|
||||||
disabled={detecting}
|
|
||||||
className="px-3 py-1.5 text-xs border rounded-lg hover:bg-gray-50 dark:hover:bg-gray-700 dark:border-gray-600 disabled:opacity-50"
|
|
||||||
>
|
|
||||||
Erneut erkennen
|
|
||||||
</button>
|
|
||||||
|
|
||||||
<div className="flex-1" />
|
|
||||||
|
|
||||||
{/* Ground truth */}
|
|
||||||
{!gtSaved ? (
|
|
||||||
<>
|
|
||||||
<input
|
|
||||||
type="text"
|
|
||||||
placeholder="Notizen (optional)"
|
|
||||||
value={gtNotes}
|
|
||||||
onChange={(e) => setGtNotes(e.target.value)}
|
|
||||||
className="px-2 py-1 text-xs border rounded dark:bg-gray-700 dark:border-gray-600 w-48"
|
|
||||||
/>
|
|
||||||
<button
|
|
||||||
onClick={() => handleGroundTruth(true)}
|
|
||||||
className="px-3 py-1.5 text-xs bg-green-600 text-white rounded-lg hover:bg-green-700"
|
|
||||||
>
|
|
||||||
Korrekt
|
|
||||||
</button>
|
|
||||||
<button
|
|
||||||
onClick={() => handleGroundTruth(false)}
|
|
||||||
className="px-3 py-1.5 text-xs bg-red-600 text-white rounded-lg hover:bg-red-700"
|
|
||||||
>
|
|
||||||
Fehlerhaft
|
|
||||||
</button>
|
|
||||||
</>
|
|
||||||
) : (
|
|
||||||
<span className="text-xs text-green-600 dark:text-green-400">
|
|
||||||
Ground Truth gespeichert
|
|
||||||
</span>
|
|
||||||
)}
|
|
||||||
|
|
||||||
<button
|
|
||||||
onClick={onNext}
|
|
||||||
className="px-4 py-1.5 text-xs bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium"
|
|
||||||
>
|
|
||||||
Weiter
|
|
||||||
</button>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{error && (
|
|
||||||
<div className="p-3 bg-red-50 dark:bg-red-900/20 text-red-600 dark:text-red-400 rounded-lg text-sm">
|
|
||||||
{error}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
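The ground-truth save in `StepRowDetection` above posts a small JSON body to the `ground-truth/rows` endpoint. The following is a minimal sketch, under stated assumptions, of how the request construction could be separated from the network call so that the HTTP status is always checked. The endpoint path and the `is_correct` / `notes` fields come from the diff; `RowGroundTruthPayload`, `PostRequest`, `FetchLike`, `buildGroundTruthRequest` and `saveGroundTruth` are hypothetical names used only for illustration.

```typescript
const KLAUSUR_API = '/klausur-api'

// Fields as used by the deleted component; `notes` is omitted when empty.
interface RowGroundTruthPayload {
  is_correct: boolean
  notes?: string
}

// Hypothetical description of the request, independent of any fetch implementation.
interface PostRequest {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

// Minimal subset of a fetch function that the helper below relies on.
type FetchLike = (url: string, init: PostRequest['init']) => Promise<{ ok: boolean; status: number }>

// Builds URL and request options without performing any I/O.
function buildGroundTruthRequest(sessionId: string, isCorrect: boolean, notes: string): PostRequest {
  const payload: RowGroundTruthPayload = notes
    ? { is_correct: isCorrect, notes }
    : { is_correct: isCorrect }
  return {
    url: `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/ground-truth/rows`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    },
  }
}

// Sends the request and rejects on non-2xx responses, since a fetch-style
// call resolves normally for HTTP error statuses.
async function saveGroundTruth(
  doFetch: FetchLike,
  sessionId: string,
  isCorrect: boolean,
  notes: string,
): Promise<void> {
  const { url, init } = buildGroundTruthRequest(sessionId, isCorrect, notes)
  const res = await doFetch(url, init)
  if (!res.ok) throw new Error(`HTTP ${res.status}`)
}
```

In the browser, the global `fetch` can be passed as `doFetch`; injecting it keeps the helper usable in tests with a stub.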
@@ -1,911 +0,0 @@
|
|||||||
'use client'
|
|
||||||
|
|
||||||
import { useCallback, useEffect, useRef, useState } from 'react'
|
|
||||||
import type { GridResult, GridCell, WordEntry, WordGroundTruth } from '@/app/(admin)/ai/ocr-pipeline/types'
|
|
||||||
|
|
||||||
const KLAUSUR_API = '/klausur-api'
|
|
||||||
|
|
||||||
/** Render text with \n as line breaks */
|
|
||||||
function MultilineText({ text }: { text: string }) {
|
|
||||||
if (!text) return <span className="text-gray-300 dark:text-gray-600">—</span>
|
|
||||||
const lines = text.split('\n')
|
|
||||||
if (lines.length === 1) return <>{text}</>
|
|
||||||
return <>{lines.map((line, i) => (
|
|
||||||
<span key={i}>{line}{i < lines.length - 1 && <br />}</span>
|
|
||||||
))}</>
|
|
||||||
}
|
|
||||||
|
|
||||||
/** Column type → human-readable header */
|
|
||||||
function colTypeLabel(colType: string): string {
|
|
||||||
const labels: Record<string, string> = {
|
|
||||||
column_en: 'English',
|
|
||||||
column_de: 'Deutsch',
|
|
||||||
column_example: 'Example',
|
|
||||||
column_text: 'Text',
|
|
||||||
column_marker: 'Marker',
|
|
||||||
page_ref: 'Seite',
|
|
||||||
}
|
|
||||||
return labels[colType] || colType.replace('column_', '')
|
|
||||||
}
|
|
||||||
|
|
||||||
/** Column type → color class */
function colTypeColor(colType: string): string {
  const colors: Record<string, string> = {
    column_en: 'text-blue-600 dark:text-blue-400',
    column_de: 'text-green-600 dark:text-green-400',
    column_example: 'text-orange-600 dark:text-orange-400',
    column_text: 'text-purple-600 dark:text-purple-400',
    column_marker: 'text-gray-500 dark:text-gray-400',
  }
  return colors[colType] || 'text-gray-600 dark:text-gray-400'
}

interface StepWordRecognitionProps {
  sessionId: string | null
  onNext: () => void
  goToStep: (step: number) => void
}

export function StepWordRecognition({ sessionId, onNext, goToStep }: StepWordRecognitionProps) {
  const [gridResult, setGridResult] = useState<GridResult | null>(null)
  const [detecting, setDetecting] = useState(false)
  const [error, setError] = useState<string | null>(null)
  const [gtNotes, setGtNotes] = useState('')
  const [gtSaved, setGtSaved] = useState(false)

  // Step-through labeling state
  const [activeIndex, setActiveIndex] = useState(0)
  const [editedEntries, setEditedEntries] = useState<WordEntry[]>([])
  const [editedCells, setEditedCells] = useState<GridCell[]>([])
  const [mode, setMode] = useState<'overview' | 'labeling'>('overview')
  const [ocrEngine, setOcrEngine] = useState<'auto' | 'tesseract' | 'rapid'>('auto')
  const [usedEngine, setUsedEngine] = useState<string>('')
  const [pronunciation, setPronunciation] = useState<'british' | 'american'>('british')

  // Streaming progress state
  const [streamProgress, setStreamProgress] = useState<{ current: number; total: number } | null>(null)

  const enRef = useRef<HTMLInputElement>(null)
  const tableEndRef = useRef<HTMLDivElement>(null)

  const isVocab = gridResult?.layout === 'vocab'

  useEffect(() => {
    if (!sessionId) return
    // Always run fresh detection: word-lookup is fast (~0.03s)
    // and avoids stale cached results from previous pipeline versions.
    runAutoDetection()
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [sessionId])

  const applyGridResult = (data: GridResult) => {
    setGridResult(data)
    setUsedEngine(data.ocr_engine || '')
    if (data.layout === 'vocab' && data.entries) {
      initEntries(data.entries)
    }
    if (data.cells) {
      setEditedCells(data.cells.map(c => ({ ...c, status: c.status || 'pending' })))
    }
  }

  const initEntries = (entries: WordEntry[]) => {
    setEditedEntries(entries.map(e => ({ ...e, status: e.status || 'pending' })))
    setActiveIndex(0)
  }

  const runAutoDetection = useCallback(async (engine?: string) => {
    if (!sessionId) return
    const eng = engine || ocrEngine
    setDetecting(true)
    setError(null)
    setStreamProgress(null)
    setEditedCells([])
    setEditedEntries([])
    setGridResult(null)

    try {
      // Retry once if initial request fails (e.g. after container restart,
      // session cache may not be warm yet when navigating via wizard)
      let res: Response | null = null
      for (let attempt = 0; attempt < 2; attempt++) {
        res = await fetch(
          `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/words?stream=true&engine=${eng}&pronunciation=${pronunciation}`,
          { method: 'POST' },
        )
        if (res.ok) break
        if (attempt === 0 && (res.status === 400 || res.status === 404)) {
          // Wait briefly for cache to warm up, then retry
          await new Promise(r => setTimeout(r, 2000))
          continue
        }
        break
      }
      if (!res || !res.ok) {
        const err = await res?.json().catch(() => ({ detail: res?.statusText })) || { detail: 'Worterkennung fehlgeschlagen' }
        throw new Error(err.detail || 'Worterkennung fehlgeschlagen')
      }

      const reader = res.body!.getReader()
      const decoder = new TextDecoder()
      let buffer = ''
      let streamLayout: string | null = null
      let streamColumnsUsed: GridResult['columns_used'] = []
      let streamGridShape: GridResult['grid_shape'] | null = null
      let streamCells: GridCell[] = []

      while (true) {
        const { done, value } = await reader.read()
        if (done) break
        buffer += decoder.decode(value, { stream: true })

        // Parse SSE events (separated by \n\n)
        while (buffer.includes('\n\n')) {
          const idx = buffer.indexOf('\n\n')
          const chunk = buffer.slice(0, idx).trim()
          buffer = buffer.slice(idx + 2)

          if (!chunk.startsWith('data: ')) continue
          const dataStr = chunk.slice(6) // strip "data: "

          let event: any
          try {
            event = JSON.parse(dataStr)
          } catch {
            continue
          }

          if (event.type === 'meta') {
            streamLayout = event.layout || 'generic'
            streamGridShape = event.grid_shape || null
            // Show partial grid result so UI renders structure
            setGridResult(prev => ({
              ...prev,
              layout: event.layout || 'generic',
              grid_shape: event.grid_shape,
              columns_used: [],
              cells: [],
              summary: { total_cells: event.grid_shape?.total_cells || 0, non_empty_cells: 0, low_confidence: 0 },
              duration_seconds: 0,
              ocr_engine: '',
            } as GridResult))
          }

          if (event.type === 'columns') {
            streamColumnsUsed = event.columns_used || []
            setGridResult(prev => prev ? { ...prev, columns_used: streamColumnsUsed } : prev)
          }

          if (event.type === 'cell') {
            const cell: GridCell = { ...event.cell, status: 'pending' }
            streamCells = [...streamCells, cell]
            setEditedCells(streamCells)
            setStreamProgress(event.progress)
            // Auto-scroll table to bottom
            setTimeout(() => tableEndRef.current?.scrollIntoView({ behavior: 'smooth', block: 'nearest' }), 16)
          }

          if (event.type === 'complete') {
            // Build final GridResult
            const finalResult: GridResult = {
              cells: streamCells,
              grid_shape: streamGridShape || { rows: 0, cols: 0, total_cells: streamCells.length },
              columns_used: streamColumnsUsed,
              layout: streamLayout || 'generic',
              image_width: 0,
              image_height: 0,
              duration_seconds: event.duration_seconds || 0,
              ocr_engine: event.ocr_engine || '',
              summary: event.summary || {},
            }

            // If vocab: apply post-processed entries from complete event
            if (event.vocab_entries) {
              finalResult.entries = event.vocab_entries
              finalResult.vocab_entries = event.vocab_entries
              finalResult.entry_count = event.vocab_entries.length
            }

            applyGridResult(finalResult)
            setUsedEngine(event.ocr_engine || '')
            setStreamProgress(null)
          }
        }
      }
    } catch (e) {
      setError(e instanceof Error ? e.message : 'Unbekannter Fehler')
    } finally {
      setDetecting(false)
    }
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [sessionId, ocrEngine, pronunciation])

  const handleGroundTruth = useCallback(async (isCorrect: boolean) => {
    if (!sessionId) return
    const gt: WordGroundTruth = {
      is_correct: isCorrect,
      corrected_entries: isCorrect ? undefined : (isVocab ? editedEntries : undefined),
      notes: gtNotes || undefined,
    }
    try {
      const res = await fetch(`${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/ground-truth/words`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(gt),
      })
      // fetch only rejects on network errors, so treat HTTP error statuses as failures too
      if (!res.ok) throw new Error(`HTTP ${res.status}`)
      setGtSaved(true)
    } catch (e) {
      console.error('Ground truth save failed:', e)
    }
  }, [sessionId, gtNotes, editedEntries, isVocab])

  // Vocab mode: update entry field
  const updateEntry = (index: number, field: 'english' | 'german' | 'example', value: string) => {
    setEditedEntries(prev => prev.map((e, i) =>
      i === index ? { ...e, [field]: value, status: 'edited' as const } : e
    ))
  }

  // Generic mode: update cell text
  const updateCell = (cellId: string, value: string) => {
    setEditedCells(prev => prev.map(c =>
      c.cell_id === cellId ? { ...c, text: value, status: 'edited' as const } : c
    ))
  }

  // Step-through: confirm current row (always cell-based)
  const confirmEntry = () => {
    const rowCells = getRowCells(activeIndex)
    const cellIds = new Set(rowCells.map(c => c.cell_id))
    setEditedCells(prev => prev.map(c =>
      cellIds.has(c.cell_id) ? { ...c, status: c.status === 'edited' ? 'edited' : 'confirmed' } : c
    ))
    const maxIdx = getUniqueRowCount() - 1
    if (activeIndex < maxIdx) {
      setActiveIndex(activeIndex + 1)
    }
  }

  // Step-through: skip current row
  const skipEntry = () => {
    const rowCells = getRowCells(activeIndex)
    const cellIds = new Set(rowCells.map(c => c.cell_id))
    setEditedCells(prev => prev.map(c =>
      cellIds.has(c.cell_id) ? { ...c, status: 'skipped' as const } : c
    ))
    const maxIdx = getUniqueRowCount() - 1
    if (activeIndex < maxIdx) {
      setActiveIndex(activeIndex + 1)
    }
  }

  // Helper: get unique row indices from cells
  const getUniqueRowCount = () => {
    if (!editedCells.length) return 0
    return new Set(editedCells.map(c => c.row_index)).size
  }

  // Helper: get cells for a given row index (by position in sorted unique rows)
  const getRowCells = (rowPosition: number) => {
    const uniqueRows = [...new Set(editedCells.map(c => c.row_index))].sort((a, b) => a - b)
    const rowIdx = uniqueRows[rowPosition]
    return editedCells.filter(c => c.row_index === rowIdx)
  }

  // Focus english input when active entry changes in labeling mode
  useEffect(() => {
    if (mode === 'labeling' && enRef.current) {
      enRef.current.focus()
    }
  }, [activeIndex, mode])

  // Keyboard shortcuts in labeling mode
  useEffect(() => {
    if (mode !== 'labeling') return
    const handler = (e: KeyboardEvent) => {
      if (e.key === 'Enter' && !e.shiftKey) {
        e.preventDefault()
        confirmEntry()
      } else if (e.key === 'ArrowDown' && e.ctrlKey) {
        e.preventDefault()
        skipEntry()
      } else if (e.key === 'ArrowUp' && e.ctrlKey) {
        e.preventDefault()
        if (activeIndex > 0) setActiveIndex(activeIndex - 1)
      }
    }
    window.addEventListener('keydown', handler)
    return () => window.removeEventListener('keydown', handler)
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [mode, activeIndex, editedEntries, editedCells])

  if (!sessionId) {
    return (
      <div className="flex flex-col items-center justify-center py-16 text-center">
        <div className="text-5xl mb-4">🔤</div>
        <h3 className="text-lg font-medium text-gray-700 dark:text-gray-300 mb-2">
          Schritt 5: Worterkennung
        </h3>
        <p className="text-gray-500 dark:text-gray-400 max-w-md">
          Bitte zuerst Schritte 1-4 abschliessen.
        </p>
      </div>
    )
  }

  const overlayUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/words-overlay`
  const dewarpedUrl = `${KLAUSUR_API}/api/v1/ocr-pipeline/sessions/${sessionId}/image/dewarped`

  const confColor = (conf: number) => {
    if (conf >= 70) return 'text-green-600 dark:text-green-400'
    if (conf >= 50) return 'text-yellow-600 dark:text-yellow-400'
    return 'text-red-600 dark:text-red-400'
  }

  const statusBadge = (status?: string) => {
    const map: Record<string, string> = {
      pending: 'bg-gray-100 dark:bg-gray-700 text-gray-500',
      confirmed: 'bg-green-100 dark:bg-green-900/30 text-green-700 dark:text-green-400',
      edited: 'bg-blue-100 dark:bg-blue-900/30 text-blue-700 dark:text-blue-400',
      skipped: 'bg-orange-100 dark:bg-orange-900/30 text-orange-700 dark:text-orange-400',
    }
    return map[status || 'pending'] || map.pending
  }

  const summary = gridResult?.summary
  const columnsUsed = gridResult?.columns_used || []
  const gridShape = gridResult?.grid_shape

  // Counts for labeling progress (always cell-based)
  const confirmedRowIds = new Set(
    editedCells.filter(c => c.status === 'confirmed' || c.status === 'edited').map(c => c.row_index)
  )
  const confirmedCount = confirmedRowIds.size
  const totalCount = getUniqueRowCount()

  // Group cells by row for generic table display
  const cellsByRow: Map<number, GridCell[]> = new Map()
  for (const cell of editedCells) {
    const existing = cellsByRow.get(cell.row_index) || []
    existing.push(cell)
    cellsByRow.set(cell.row_index, existing)
  }
  const sortedRowIndices = [...cellsByRow.keys()].sort((a, b) => a - b)

  return (
    <div className="space-y-4">
      {/* Loading with streaming progress */}
      {detecting && (
        <div className="space-y-1">
          <div className="flex items-center gap-2 text-teal-600 dark:text-teal-400 text-sm">
            <div className="animate-spin w-4 h-4 border-2 border-teal-500 border-t-transparent rounded-full" />
            {streamProgress
              ? `Zelle ${streamProgress.current}/${streamProgress.total} erkannt...`
              : 'Worterkennung startet...'}
          </div>
          {streamProgress && streamProgress.total > 0 && (
            <div className="w-full bg-gray-200 dark:bg-gray-700 rounded-full h-1.5">
              <div
                className="bg-teal-500 h-1.5 rounded-full transition-all duration-150"
                style={{ width: `${(streamProgress.current / streamProgress.total) * 100}%` }}
              />
            </div>
          )}
        </div>
      )}

      {/* Layout badge + Mode toggle */}
      {gridResult && (
        <div className="flex items-center gap-2">
          {/* Layout badge */}
          <span className={`px-2 py-0.5 rounded text-[10px] uppercase font-semibold ${
            isVocab
              ? 'bg-indigo-100 dark:bg-indigo-900/30 text-indigo-700 dark:text-indigo-300'
              : 'bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-400'
          }`}>
            {isVocab ? 'Vokabel-Layout' : 'Generisch'}
          </span>

          {gridShape && (
            <span className="text-[10px] text-gray-400">
              {gridShape.rows}×{gridShape.cols} = {gridShape.total_cells} Zellen
            </span>
          )}

          <div className="flex-1" />

          <button
            onClick={() => setMode('overview')}
            className={`px-3 py-1.5 text-xs rounded-lg font-medium transition-colors ${
              mode === 'overview'
                ? 'bg-teal-600 text-white'
                : 'bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 hover:bg-gray-200 dark:hover:bg-gray-600'
            }`}
          >
            Uebersicht
          </button>
          <button
            onClick={() => setMode('labeling')}
            className={`px-3 py-1.5 text-xs rounded-lg font-medium transition-colors ${
              mode === 'labeling'
                ? 'bg-teal-600 text-white'
                : 'bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 hover:bg-gray-200 dark:hover:bg-gray-600'
            }`}
          >
            Labeling ({confirmedCount}/{totalCount})
          </button>
        </div>
      )}

      {/* Overview mode */}
      {mode === 'overview' && (
        <>
          {/* Images: overlay vs clean */}
          <div className="grid grid-cols-2 gap-4">
            <div>
              <div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
                Mit Grid-Overlay
              </div>
              <div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900">
                {gridResult ? (
                  // eslint-disable-next-line @next/next/no-img-element
                  <img
                    src={`${overlayUrl}?t=${Date.now()}`}
                    alt="Wort-Overlay"
                    className="w-full h-auto"
                  />
                ) : (
                  <div className="aspect-[3/4] flex items-center justify-center text-gray-400 text-sm">
                    {detecting ? 'Erkenne Woerter...' : 'Keine Daten'}
                  </div>
                )}
              </div>
            </div>
            <div>
              <div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
                Entzerrtes Bild
              </div>
              <div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900">
                {/* eslint-disable-next-line @next/next/no-img-element */}
                <img
                  src={dewarpedUrl}
                  alt="Entzerrt"
                  className="w-full h-auto"
                />
              </div>
            </div>
          </div>

          {/* Result summary (only after streaming completes) */}
          {gridResult && summary && !detecting && (
            <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4 space-y-3">
              <div className="flex items-center justify-between">
                <h4 className="text-sm font-medium text-gray-700 dark:text-gray-300">
                  Ergebnis: {summary.non_empty_cells}/{summary.total_cells} Zellen mit Text
                  ({sortedRowIndices.length} Zeilen, {columnsUsed.length} Spalten)
                </h4>
                <span className="text-xs text-gray-400">
                  {gridResult.duration_seconds}s
                </span>
              </div>

              {/* Summary badges */}
              <div className="flex gap-2 flex-wrap">
                <span className="px-2 py-0.5 rounded text-xs font-medium bg-blue-100 dark:bg-blue-900/30 text-blue-700 dark:text-blue-300">
                  Zellen: {summary.non_empty_cells}/{summary.total_cells}
                </span>
                {columnsUsed.map((col, i) => (
                  <span key={i} className={`px-2 py-0.5 rounded text-xs font-medium bg-gray-100 dark:bg-gray-700 ${colTypeColor(col.type)}`}>
                    C{col.index}: {colTypeLabel(col.type)}
                  </span>
                ))}
                {summary.low_confidence > 0 && (
                  <span className="px-2 py-0.5 rounded text-xs font-medium bg-red-100 dark:bg-red-900/30 text-red-700 dark:text-red-300">
                    Unsicher: {summary.low_confidence}
                  </span>
                )}
              </div>

              {/* Entry/Cell table */}
              <div className="max-h-80 overflow-y-auto">
                {/* Unified dynamic table: columns driven by columns_used */}
                <table className="w-full text-xs">
                  <thead className="sticky top-0 bg-white dark:bg-gray-800">
                    <tr className="text-left text-gray-500 dark:text-gray-400 border-b dark:border-gray-700">
                      <th className="py-1 pr-2 w-12">Zeile</th>
                      {columnsUsed.map((col, i) => (
                        <th key={i} className={`py-1 pr-2 ${colTypeColor(col.type)}`}>
                          {colTypeLabel(col.type)}
                        </th>
                      ))}
                      <th className="py-1 w-12 text-right">Conf</th>
                    </tr>
                  </thead>
                  <tbody>
                    {sortedRowIndices.map((rowIdx, posIdx) => {
                      const rowCells = cellsByRow.get(rowIdx) || []
                      const avgConf = rowCells.length
                        ? Math.round(rowCells.reduce((s, c) => s + c.confidence, 0) / rowCells.length)
                        : 0
                      return (
                        <tr
                          key={rowIdx}
                          className={`border-b dark:border-gray-700/50 ${
                            posIdx === activeIndex ? 'bg-teal-50 dark:bg-teal-900/20' : ''
                          }`}
                          onClick={() => { setActiveIndex(posIdx); setMode('labeling') }}
                        >
                          <td className="py-1 pr-2 text-gray-400 font-mono text-[10px]">
                            R{String(rowIdx).padStart(2, '0')}
                          </td>
                          {columnsUsed.map((col) => {
                            const cell = rowCells.find(c => c.col_index === col.index)
                            return (
                              <td key={col.index} className="py-1 pr-2 font-mono text-gray-700 dark:text-gray-300 cursor-pointer">
                                <MultilineText text={cell?.text || ''} />
                              </td>
                            )
                          })}
                          <td className={`py-1 text-right font-mono ${confColor(avgConf)}`}>
                            {avgConf}%
                          </td>
                        </tr>
                      )
                    })}
                  </tbody>
                </table>
                <div ref={tableEndRef} />
              </div>
            </div>
          )}

          {/* Streaming cell table (shown while detecting, before complete) */}
          {detecting && editedCells.length > 0 && !gridResult?.summary?.non_empty_cells && (
            <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4 space-y-3">
              <h4 className="text-sm font-medium text-gray-700 dark:text-gray-300">
                Live: {editedCells.length} Zellen erkannt...
              </h4>
              <div className="max-h-80 overflow-y-auto">
                <table className="w-full text-xs">
                  <thead className="sticky top-0 bg-white dark:bg-gray-800">
                    <tr className="text-left text-gray-500 dark:text-gray-400 border-b dark:border-gray-700">
                      <th className="py-1 pr-2 w-12">Zelle</th>
                      {columnsUsed.map((col, i) => (
                        <th key={i} className={`py-1 pr-2 ${colTypeColor(col.type)}`}>
                          {colTypeLabel(col.type)}
                        </th>
                      ))}
                      <th className="py-1 w-12 text-right">Conf</th>
                    </tr>
                  </thead>
                  <tbody>
                    {(() => {
                      const liveByRow: Map<number, GridCell[]> = new Map()
                      for (const cell of editedCells) {
                        const existing = liveByRow.get(cell.row_index) || []
                        existing.push(cell)
                        liveByRow.set(cell.row_index, existing)
                      }
                      const liveSorted = [...liveByRow.keys()].sort((a, b) => a - b)
                      return liveSorted.map(rowIdx => {
                        const rowCells = liveByRow.get(rowIdx) || []
                        const avgConf = rowCells.length
                          ? Math.round(rowCells.reduce((s, c) => s + c.confidence, 0) / rowCells.length)
                          : 0
                        return (
                          <tr key={rowIdx} className="border-b dark:border-gray-700/50 animate-fade-in">
                            <td className="py-1 pr-2 text-gray-400 font-mono text-[10px]">
                              R{String(rowIdx).padStart(2, '0')}
                            </td>
                            {columnsUsed.map((col) => {
                              const cell = rowCells.find(c => c.col_index === col.index)
                              return (
                                <td key={col.index} className="py-1 pr-2 font-mono text-gray-700 dark:text-gray-300">
                                  <MultilineText text={cell?.text || ''} />
                                </td>
                              )
                            })}
                            <td className={`py-1 text-right font-mono ${confColor(avgConf)}`}>
                              {avgConf}%
                            </td>
                          </tr>
                        )
                      })
                    })()}
                  </tbody>
                </table>
                <div ref={tableEndRef} />
              </div>
            </div>
          )}
        </>
      )}

      {/* Labeling mode */}
      {mode === 'labeling' && editedCells.length > 0 && (
        <div className="grid grid-cols-3 gap-4">
          {/* Left 2/3: Image with highlighted active row */}
          <div className="col-span-2">
            <div className="text-xs font-medium text-gray-500 dark:text-gray-400 mb-1">
              Zeile {activeIndex + 1} von {getUniqueRowCount()}
            </div>
            <div className="border rounded-lg overflow-hidden dark:border-gray-700 bg-gray-50 dark:bg-gray-900 relative">
              {/* eslint-disable-next-line @next/next/no-img-element */}
              <img
                src={`${overlayUrl}?t=${Date.now()}`}
                alt="Wort-Overlay"
                className="w-full h-auto"
              />
              {/* Highlight overlay for active row */}
              {(() => {
                const rowCells = getRowCells(activeIndex)
                return rowCells.map(cell => (
                  <div
                    key={cell.cell_id}
                    className="absolute border-2 border-yellow-400 bg-yellow-400/10 pointer-events-none"
                    style={{
                      left: `${cell.bbox_pct.x}%`,
                      top: `${cell.bbox_pct.y}%`,
                      width: `${cell.bbox_pct.w}%`,
                      height: `${cell.bbox_pct.h}%`,
                    }}
                  />
                ))
              })()}
            </div>
          </div>

          {/* Right 1/3: Editable fields */}
          <div className="space-y-3">
            {/* Navigation */}
            <div className="flex items-center justify-between">
              <button
                onClick={() => setActiveIndex(Math.max(0, activeIndex - 1))}
                disabled={activeIndex === 0}
                className="px-2 py-1 text-xs border rounded hover:bg-gray-50 dark:hover:bg-gray-700 dark:border-gray-600 disabled:opacity-30"
              >
                Zurueck
              </button>
              <span className="text-xs text-gray-500">
                {activeIndex + 1} / {getUniqueRowCount()}
              </span>
              <button
                onClick={() => setActiveIndex(Math.min(
                  getUniqueRowCount() - 1,
                  activeIndex + 1
                ))}
                disabled={activeIndex >= getUniqueRowCount() - 1}
                className="px-2 py-1 text-xs border rounded hover:bg-gray-50 dark:hover:bg-gray-700 dark:border-gray-600 disabled:opacity-30"
              >
                Weiter
              </button>
            </div>

            {/* Status badge */}
            <div className="flex items-center gap-2">
              {(() => {
                const rowCells = getRowCells(activeIndex)
                const avgConf = rowCells.length
                  ? Math.round(rowCells.reduce((s, c) => s + c.confidence, 0) / rowCells.length)
                  : 0
                return (
                  <span className={`text-xs font-mono ${confColor(avgConf)}`}>
                    {avgConf}% Konfidenz
                  </span>
                )
              })()}
            </div>

            {/* Editable fields: one per column, driven by columns_used */}
            <div className="space-y-2">
              {(() => {
                const rowCells = getRowCells(activeIndex)
                return columnsUsed.map((col, colIdx) => {
                  const cell = rowCells.find(c => c.col_index === col.index)
                  if (!cell) return null
                  return (
                    <div key={col.index}>
                      <div className="flex items-center gap-1 mb-0.5">
                        <label className={`text-[10px] font-medium ${colTypeColor(col.type)}`}>
                          {colTypeLabel(col.type)}
                        </label>
                        <span className="text-[9px] text-gray-400">{cell.cell_id}</span>
                      </div>
                      {/* Cell crop */}
                      <div className="border rounded dark:border-gray-700 overflow-hidden bg-white dark:bg-gray-900 h-10 relative mb-1">
                        <CellCrop imageUrl={dewarpedUrl} bbox={cell.bbox_pct} />
                      </div>
                      <textarea
                        ref={colIdx === 0 ? enRef as any : undefined}
                        rows={Math.max(1, (cell.text || '').split('\n').length)}
                        value={cell.text || ''}
                        onChange={(e) => updateCell(cell.cell_id, e.target.value)}
                        className="w-full px-2 py-1.5 text-sm border rounded dark:bg-gray-700 dark:border-gray-600 font-mono resize-none"
                      />
                    </div>
                  )
                })
              })()}
            </div>

            {/* Action buttons */}
            <div className="flex gap-2">
              <button
                onClick={confirmEntry}
                className="flex-1 px-3 py-1.5 text-xs bg-green-600 text-white rounded-lg hover:bg-green-700 font-medium"
              >
                Bestaetigen (Enter)
              </button>
              <button
                onClick={skipEntry}
                className="px-3 py-1.5 text-xs border rounded-lg hover:bg-gray-50 dark:hover:bg-gray-700 dark:border-gray-600"
              >
                Skip
              </button>
            </div>

            {/* Shortcuts hint */}
            <div className="text-[10px] text-gray-400 space-y-0.5">
              <div>Enter = Bestaetigen & weiter</div>
              <div>Ctrl+Down = Ueberspringen</div>
              <div>Ctrl+Up = Zurueck</div>
            </div>

            {/* Row list (compact) */}
            <div className="border-t dark:border-gray-700 pt-2 mt-2">
              <div className="text-[10px] font-medium text-gray-500 dark:text-gray-400 mb-1">
                Alle Zeilen
              </div>
              <div className="max-h-48 overflow-y-auto space-y-0.5">
                {sortedRowIndices.map((rowIdx, posIdx) => {
                  const rowCells = cellsByRow.get(rowIdx) || []
                  const textParts = rowCells.filter(c => c.text).map(c => c.text.replace(/\n/g, ' '))
                  return (
                    <div
                      key={rowIdx}
                      onClick={() => setActiveIndex(posIdx)}
                      className={`flex items-center gap-1 px-2 py-1 rounded text-[10px] cursor-pointer transition-colors ${
                        posIdx === activeIndex
                          ? 'bg-teal-50 dark:bg-teal-900/30 border border-teal-200 dark:border-teal-700'
                          : 'hover:bg-gray-50 dark:hover:bg-gray-700/50'
                      }`}
                    >
                      <span className="w-6 text-right text-gray-400 font-mono">R{String(rowIdx).padStart(2, '0')}</span>
                      <span className="truncate text-gray-600 dark:text-gray-400 font-mono">
                        {textParts.join(' \u2192 ') || '\u2014'}
                      </span>
                    </div>
                  )
                })}
              </div>
            </div>
          </div>
        </div>
      )}

      {/* Controls */}
      {gridResult && (
        <div className="bg-white dark:bg-gray-800 rounded-xl border border-gray-200 dark:border-gray-700 p-4 space-y-3">
          <div className="flex items-center gap-3 flex-wrap">
            {/* OCR Engine selector */}
            <select
              value={ocrEngine}
              onChange={(e) => setOcrEngine(e.target.value as 'auto' | 'tesseract' | 'rapid')}
              className="px-2 py-1.5 text-xs border rounded-lg dark:bg-gray-700 dark:border-gray-600"
            >
              <option value="auto">Auto (RapidOCR wenn verfuegbar)</option>
              <option value="rapid">RapidOCR (ONNX)</option>
              <option value="tesseract">Tesseract</option>
            </select>

            {/* Pronunciation selector (only for vocab) */}
            {isVocab && (
              <select
                value={pronunciation}
                onChange={(e) => setPronunciation(e.target.value as 'british' | 'american')}
                className="px-2 py-1.5 text-xs border rounded-lg dark:bg-gray-700 dark:border-gray-600"
              >
                <option value="british">Britisch (RP)</option>
                <option value="american">Amerikanisch</option>
              </select>
            )}

<button
|
|
||||||
onClick={() => runAutoDetection()}
|
|
||||||
disabled={detecting}
|
|
||||||
className="px-3 py-1.5 text-xs border rounded-lg hover:bg-gray-50 dark:hover:bg-gray-700 dark:border-gray-600 disabled:opacity-50"
|
|
||||||
>
|
|
||||||
Erneut erkennen
|
|
||||||
</button>
|
|
||||||
|
|
||||||
{/* Show which engine was used */}
|
|
||||||
{usedEngine && (
|
|
||||||
<span className={`px-2 py-0.5 rounded text-[10px] uppercase font-semibold ${
|
|
||||||
usedEngine === 'rapid'
|
|
||||||
? 'bg-purple-100 dark:bg-purple-900/30 text-purple-700 dark:text-purple-300'
|
|
||||||
: 'bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-400'
|
|
||||||
}`}>
|
|
||||||
{usedEngine}
|
|
||||||
</span>
|
|
||||||
)}
|
|
||||||
|
|
||||||
<button
|
|
||||||
onClick={() => goToStep(3)}
|
|
||||||
className="px-3 py-1.5 text-xs border rounded-lg hover:bg-gray-50 dark:hover:bg-gray-700 dark:border-gray-600 text-orange-600 dark:text-orange-400 border-orange-300 dark:border-orange-700"
|
|
||||||
>
|
|
||||||
Zeilen korrigieren (Step 4)
|
|
||||||
</button>
|
|
||||||
|
|
||||||
<div className="flex-1" />
|
|
||||||
|
|
||||||
{/* Ground truth */}
|
|
||||||
{!gtSaved ? (
|
|
||||||
<>
|
|
||||||
<input
|
|
||||||
type="text"
|
|
||||||
placeholder="Notizen (optional)"
|
|
||||||
value={gtNotes}
|
|
||||||
onChange={(e) => setGtNotes(e.target.value)}
|
|
||||||
className="px-2 py-1 text-xs border rounded dark:bg-gray-700 dark:border-gray-600 w-48"
|
|
||||||
/>
|
|
||||||
<button
|
|
||||||
onClick={() => handleGroundTruth(true)}
|
|
||||||
className="px-3 py-1.5 text-xs bg-green-600 text-white rounded-lg hover:bg-green-700"
|
|
||||||
>
|
|
||||||
Korrekt
|
|
||||||
</button>
|
|
||||||
<button
|
|
||||||
onClick={() => handleGroundTruth(false)}
|
|
||||||
className="px-3 py-1.5 text-xs bg-red-600 text-white rounded-lg hover:bg-red-700"
|
|
||||||
>
|
|
||||||
Fehlerhaft
|
|
||||||
</button>
|
|
||||||
</>
|
|
||||||
) : (
|
|
||||||
<span className="text-xs text-green-600 dark:text-green-400">
|
|
||||||
Ground Truth gespeichert
|
|
||||||
</span>
|
|
||||||
)}
|
|
||||||
|
|
||||||
<button
|
|
||||||
onClick={onNext}
|
|
||||||
className="px-4 py-1.5 text-xs bg-teal-600 text-white rounded-lg hover:bg-teal-700 font-medium"
|
|
||||||
>
|
|
||||||
Weiter
|
|
||||||
</button>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
|
|
||||||
{error && (
|
|
||||||
<div className="p-3 bg-red-50 dark:bg-red-900/20 text-red-600 dark:text-red-400 rounded-lg text-sm">
|
|
||||||
{error}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
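The compact row list above builds each entry from a zero-padded row label and the non-empty cell texts joined by an arrow. A minimal sketch of that formatting as a pure function follows. The `Cell` interface and the name `formatRowSummary` are assumptions for illustration and do not appear in the diff; only the `text` field is taken from the JSX.

```typescript
// Hypothetical cell shape; only `text` is used by the row list in the diff.
interface Cell {
  text: string
}

// Builds the one-line summary for a row: "R" + zero-padded index,
// then the non-empty cell texts joined by a right arrow (U+2192).
// Rows without any text fall back to a dash placeholder (U+2014).
function formatRowSummary(rowIdx: number, rowCells: Cell[]): string {
  const label = `R${String(rowIdx).padStart(2, '0')}`
  const textParts = rowCells
    .filter(c => c.text)
    .map(c => c.text.replace(/\n/g, ' ')) // flatten multi-line cells
  return `${label} ${textParts.join(' \u2192 ') || '\u2014'}`
}
```

Keeping the formatting in a pure function like this would make the label logic testable without rendering the component.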
|
|
||||||
|
|
||||||
/**
|
|
||||||
* CellCrop: Shows a cropped portion of the dewarped image based on percent bbox.
|
|
||||||
* Uses CSS background-image + background-position for efficient cropping.
|
|
||||||
*/
|
|
||||||
function CellCrop({ imageUrl, bbox }: { imageUrl: string; bbox: { x: number; y: number; w: number; h: number } }) {
|
|
||||||
// Scale factor: how much to zoom into the cell
|
|
||||||
const scaleX = 100 / bbox.w
|
|
||||||
const scaleY = 100 / bbox.h
|
|
||||||
const scale = Math.min(scaleX, scaleY, 8) // Cap zoom at 8x
|
|
||||||
|
|
||||||
return (
|
|
||||||
<div
|
|
||||||
className="w-full h-full"
|
|
||||||
style={{
|
|
||||||
backgroundImage: `url(${imageUrl})`,
|
|
||||||
backgroundSize: `${scale * 100}%`,
|
|
||||||
backgroundPosition: `${-bbox.x * scale}% ${-bbox.y * scale}%`,
|
|
||||||
backgroundRepeat: 'no-repeat',
|
|
||||||
}}
|
|
||||||
/>
|
|
||||||
)
|
|
||||||
}
|
|
||||||
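The `CellCrop` comment describes cropping via `background-size` and `background-position`. In CSS, a percentage `background-position` is resolved against the difference between container size and image size, not against the image alone. The sketch below shows one way to derive the percentages from a percent bounding box under that rule. The names `PercentBox`, `cropPositionPercent` and `cropStyle` are assumptions, not part of the diff, and unlike the component above this variant stretches the box to fill the container on both axes and applies no 8x zoom cap.

```typescript
// Percent bounding box relative to the full image, as in the CellCrop props.
interface PercentBox { x: number; y: number; w: number; h: number }

// A crop that starts at `offset`% and spans `size`% of the image maps to
// 100 * offset / (100 - size), because position percentages are relative
// to (container size - image size). If the crop spans the whole axis,
// every position gives the same result, so 0 is returned.
function cropPositionPercent(offset: number, size: number): number {
  return size >= 100 ? 0 : (100 * offset) / (100 - size)
}

// Style values that make the box fill the container exactly.
function cropStyle(bbox: PercentBox): { backgroundSize: string; backgroundPosition: string } {
  return {
    // The image must be 100 / w times the container width (same for height).
    backgroundSize: `${10000 / bbox.w}% ${10000 / bbox.h}%`,
    backgroundPosition: `${cropPositionPercent(bbox.x, bbox.w)}% ${cropPositionPercent(bbox.y, bbox.h)}%`,
  }
}
```

For a box covering the middle half of the width (`x: 25, w: 50`), the image is scaled to 200% and positioned at 50%, which shifts it left by half a container width.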
@@ -127,15 +127,6 @@ export const navigation: NavCategory[] = [
|
|||||||
audience: ['Entwickler', 'Data Scientists', 'Lehrer'],
|
audience: ['Entwickler', 'Data Scientists', 'Lehrer'],
|
||||||
subgroup: 'KI-Werkzeuge',
|
subgroup: 'KI-Werkzeuge',
|
||||||
},
|
},
|
||||||
{
|
|
||||||
id: 'ocr-pipeline',
|
|
||||||
name: 'OCR Pipeline',
|
|
||||||
href: '/ai/ocr-pipeline',
|
|
||||||
description: 'Schrittweise Seitenrekonstruktion',
|
|
||||||
purpose: 'Schrittweise Seitenrekonstruktion: Scan begradigen, Spalten erkennen, Woerter lokalisieren und die Seite Wort fuer Wort nachbauen. 6-Schritt-Pipeline mit Ground Truth Validierung.',
|
|
||||||
audience: ['Entwickler', 'Data Scientists'],
|
|
||||||
subgroup: 'KI-Werkzeuge',
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
id: 'test-quality',
|
id: 'test-quality',
|
||||||
name: 'Test Quality (BQAS)',
|
name: 'Test Quality (BQAS)',
|
||||||
|
|||||||
@@ -2,8 +2,6 @@
|
|||||||
const nextConfig = {
|
const nextConfig = {
|
||||||
output: 'standalone',
|
output: 'standalone',
|
||||||
reactStrictMode: true,
|
reactStrictMode: true,
|
||||||
// Force unique build ID to bust browser caches on each deploy
|
|
||||||
generateBuildId: () => `build-${Date.now()}`,
|
|
||||||
// TODO: Remove after fixing type incompatibilities from restore
|
// TODO: Remove after fixing type incompatibilities from restore
|
||||||
typescript: {
|
typescript: {
|
||||||
ignoreBuildErrors: true,
|
ignoreBuildErrors: true,
|
||||||
|
|||||||
463
admin-lehrer/package-lock.json
generated
@@ -8,7 +8,6 @@
|
|||||||
"name": "breakpilot-admin-v2",
|
"name": "breakpilot-admin-v2",
|
||||||
"version": "1.0.0",
|
"version": "1.0.0",
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
"bpmn-js": "^18.0.1",
|
|
||||||
"jspdf": "^4.1.0",
|
"jspdf": "^4.1.0",
|
||||||
"jszip": "^3.10.1",
|
"jszip": "^3.10.1",
|
||||||
"lucide-react": "^0.468.0",
|
"lucide-react": "^0.468.0",
|
||||||
@@ -16,7 +15,6 @@
|
|||||||
"react": "^18.3.1",
|
"react": "^18.3.1",
|
||||||
"react-dom": "^18.3.1",
|
"react-dom": "^18.3.1",
|
||||||
"reactflow": "^11.11.4",
|
"reactflow": "^11.11.4",
|
||||||
"recharts": "^2.15.0",
|
|
||||||
"uuid": "^13.0.0"
|
"uuid": "^13.0.0"
|
||||||
},
|
},
|
||||||
"devDependencies": {
|
"devDependencies": {
|
||||||
@@ -430,16 +428,6 @@
|
|||||||
"node": ">=6.9.0"
|
"node": ">=6.9.0"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/@bpmn-io/diagram-js-ui": {
|
|
||||||
"version": "0.2.3",
|
|
||||||
"resolved": "https://registry.npmjs.org/@bpmn-io/diagram-js-ui/-/diagram-js-ui-0.2.3.tgz",
|
|
||||||
"integrity": "sha512-OGyjZKvGK8tHSZ0l7RfeKhilGoOGtFDcoqSGYkX0uhFlo99OVZ9Jn1K7TJGzcE9BdKwvA5Y5kGqHEhdTxHvFfw==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"htm": "^3.1.1",
|
|
||||||
"preact": "^10.11.2"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/@csstools/color-helpers": {
|
"node_modules/@csstools/color-helpers": {
|
||||||
"version": "5.1.0",
|
"version": "5.1.0",
|
||||||
"resolved": "https://registry.npmjs.org/@csstools/color-helpers/-/color-helpers-5.1.0.tgz",
|
"resolved": "https://registry.npmjs.org/@csstools/color-helpers/-/color-helpers-5.1.0.tgz",
|
||||||
@@ -3008,39 +2996,6 @@
|
|||||||
"url": "https://github.com/sponsors/sindresorhus"
|
"url": "https://github.com/sponsors/sindresorhus"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/bpmn-js": {
|
|
||||||
"version": "18.12.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/bpmn-js/-/bpmn-js-18.12.0.tgz",
|
|
||||||
"integrity": "sha512-Dg2O+r7jpBwLgWGpManc7P4ZfZQfxTVi2xNtXR3Q2G5Hx1RVYVFoNsQED8+FPCgjy6m7ZQbxKP1sjCJt5rbtBg==",
|
|
||||||
"license": "SEE LICENSE IN LICENSE",
|
|
||||||
"dependencies": {
|
|
||||||
"bpmn-moddle": "^10.0.0",
|
|
||||||
"diagram-js": "^15.9.0",
|
|
||||||
"diagram-js-direct-editing": "^3.3.0",
|
|
||||||
"ids": "^3.0.0",
|
|
||||||
"inherits-browser": "^0.1.0",
|
|
||||||
"min-dash": "^5.0.0",
|
|
||||||
"min-dom": "^5.2.0",
|
|
||||||
"tiny-svg": "^4.1.4"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": "*"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/bpmn-moddle": {
|
|
||||||
"version": "10.0.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/bpmn-moddle/-/bpmn-moddle-10.0.0.tgz",
|
|
||||||
"integrity": "sha512-vXePD5jkatcILmM3zwJG/m6IIHIghTGB7WvgcdEraEw8E8VdJHrTgrvBUhbzqaXJpnsGQz15QS936xeBY6l9aA==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"min-dash": "^5.0.0",
|
|
||||||
"moddle": "^8.0.0",
|
|
||||||
"moddle-xml": "^12.0.0"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": ">= 20.12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/braces": {
|
"node_modules/braces": {
|
||||||
"version": "3.0.3",
|
"version": "3.0.3",
|
||||||
"resolved": "https://registry.npmjs.org/braces/-/braces-3.0.3.tgz",
|
"resolved": "https://registry.npmjs.org/braces/-/braces-3.0.3.tgz",
|
||||||
@@ -3198,15 +3153,6 @@
|
|||||||
"integrity": "sha512-IV3Ou0jSMzZrd3pZ48nLkT9DA7Ag1pnPzaiQhpW7c3RbcqqzvzzVu+L8gfqMp/8IM2MQtSiqaCxrrcfu8I8rMA==",
|
"integrity": "sha512-IV3Ou0jSMzZrd3pZ48nLkT9DA7Ag1pnPzaiQhpW7c3RbcqqzvzzVu+L8gfqMp/8IM2MQtSiqaCxrrcfu8I8rMA==",
|
||||||
"license": "MIT"
|
"license": "MIT"
|
||||||
},
|
},
|
||||||
"node_modules/clsx": {
|
|
||||||
"version": "2.1.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/clsx/-/clsx-2.1.1.tgz",
|
|
||||||
"integrity": "sha512-eYm0QWBtUrBWZWG0d386OGAw16Z995PiOVo2B7bjWSbHedGl5e0ZWaq65kOGgUSNesEIDkB9ISbTg/JK9dhCZA==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": ">=6"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/commander": {
|
"node_modules/commander": {
|
||||||
"version": "4.1.1",
|
"version": "4.1.1",
|
||||||
"resolved": "https://registry.npmjs.org/commander/-/commander-4.1.1.tgz",
|
"resolved": "https://registry.npmjs.org/commander/-/commander-4.1.1.tgz",
|
||||||
@@ -3316,20 +3262,9 @@
|
|||||||
"version": "3.2.3",
|
"version": "3.2.3",
|
||||||
"resolved": "https://registry.npmjs.org/csstype/-/csstype-3.2.3.tgz",
|
"resolved": "https://registry.npmjs.org/csstype/-/csstype-3.2.3.tgz",
|
||||||
"integrity": "sha512-z1HGKcYy2xA8AGQfwrn0PAy+PB7X/GSj3UVJW9qKyn43xWa+gl5nXmU4qqLMRzWVLFC8KusUX8T/0kCiOYpAIQ==",
|
"integrity": "sha512-z1HGKcYy2xA8AGQfwrn0PAy+PB7X/GSj3UVJW9qKyn43xWa+gl5nXmU4qqLMRzWVLFC8KusUX8T/0kCiOYpAIQ==",
|
||||||
|
"devOptional": true,
|
||||||
"license": "MIT"
|
"license": "MIT"
|
||||||
},
|
},
|
||||||
"node_modules/d3-array": {
|
|
||||||
"version": "3.2.4",
|
|
||||||
"resolved": "https://registry.npmjs.org/d3-array/-/d3-array-3.2.4.tgz",
|
|
||||||
"integrity": "sha512-tdQAmyA18i4J7wprpYq8ClcxZy3SC31QMeByyCFyRt7BVHdREQZ5lpzoe5mFEYZUWe+oq8HBvk9JjpibyEV4Jg==",
|
|
||||||
"license": "ISC",
|
|
||||||
"dependencies": {
|
|
||||||
"internmap": "1 - 2"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": ">=12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/d3-color": {
|
"node_modules/d3-color": {
|
||||||
"version": "3.1.0",
|
"version": "3.1.0",
|
||||||
"resolved": "https://registry.npmjs.org/d3-color/-/d3-color-3.1.0.tgz",
|
"resolved": "https://registry.npmjs.org/d3-color/-/d3-color-3.1.0.tgz",
|
||||||
@@ -3370,15 +3305,6 @@
|
|||||||
"node": ">=12"
|
"node": ">=12"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/d3-format": {
|
|
||||||
"version": "3.1.2",
|
|
||||||
"resolved": "https://registry.npmjs.org/d3-format/-/d3-format-3.1.2.tgz",
|
|
||||||
"integrity": "sha512-AJDdYOdnyRDV5b6ArilzCPPwc1ejkHcoyFarqlPqT7zRYjhavcT3uSrqcMvsgh2CgoPbK3RCwyHaVyxYcP2Arg==",
|
|
||||||
"license": "ISC",
|
|
||||||
"engines": {
|
|
||||||
"node": ">=12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/d3-interpolate": {
|
"node_modules/d3-interpolate": {
|
||||||
"version": "3.0.1",
|
"version": "3.0.1",
|
||||||
"resolved": "https://registry.npmjs.org/d3-interpolate/-/d3-interpolate-3.0.1.tgz",
|
"resolved": "https://registry.npmjs.org/d3-interpolate/-/d3-interpolate-3.0.1.tgz",
|
||||||
@@ -3391,31 +3317,6 @@
|
|||||||
"node": ">=12"
|
"node": ">=12"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/d3-path": {
|
|
||||||
"version": "3.1.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/d3-path/-/d3-path-3.1.0.tgz",
|
|
||||||
"integrity": "sha512-p3KP5HCf/bvjBSSKuXid6Zqijx7wIfNW+J/maPs+iwR35at5JCbLUT0LzF1cnjbCHWhqzQTIN2Jpe8pRebIEFQ==",
|
|
||||||
"license": "ISC",
|
|
||||||
"engines": {
|
|
||||||
"node": ">=12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/d3-scale": {
|
|
||||||
"version": "4.0.2",
|
|
||||||
"resolved": "https://registry.npmjs.org/d3-scale/-/d3-scale-4.0.2.tgz",
|
|
||||||
"integrity": "sha512-GZW464g1SH7ag3Y7hXjf8RoUuAFIqklOAq3MRl4OaWabTFJY9PN/E1YklhXLh+OQ3fM9yS2nOkCoS+WLZ6kvxQ==",
|
|
||||||
"license": "ISC",
|
|
||||||
"dependencies": {
|
|
||||||
"d3-array": "2.10.0 - 3",
|
|
||||||
"d3-format": "1 - 3",
|
|
||||||
"d3-interpolate": "1.2.0 - 3",
|
|
||||||
"d3-time": "2.1.1 - 3",
|
|
||||||
"d3-time-format": "2 - 4"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": ">=12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/d3-selection": {
|
"node_modules/d3-selection": {
|
||||||
"version": "3.0.0",
|
"version": "3.0.0",
|
||||||
"resolved": "https://registry.npmjs.org/d3-selection/-/d3-selection-3.0.0.tgz",
|
"resolved": "https://registry.npmjs.org/d3-selection/-/d3-selection-3.0.0.tgz",
|
||||||
@@ -3425,42 +3326,6 @@
|
|||||||
"node": ">=12"
|
"node": ">=12"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/d3-shape": {
|
|
||||||
"version": "3.2.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/d3-shape/-/d3-shape-3.2.0.tgz",
|
|
||||||
"integrity": "sha512-SaLBuwGm3MOViRq2ABk3eLoxwZELpH6zhl3FbAoJ7Vm1gofKx6El1Ib5z23NUEhF9AsGl7y+dzLe5Cw2AArGTA==",
|
|
||||||
"license": "ISC",
|
|
||||||
"dependencies": {
|
|
||||||
"d3-path": "^3.1.0"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": ">=12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/d3-time": {
|
|
||||||
"version": "3.1.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/d3-time/-/d3-time-3.1.0.tgz",
|
|
||||||
"integrity": "sha512-VqKjzBLejbSMT4IgbmVgDjpkYrNWUYJnbCGo874u7MMKIWsILRX+OpX/gTk8MqjpT1A/c6HY2dCA77ZN0lkQ2Q==",
|
|
||||||
"license": "ISC",
|
|
||||||
"dependencies": {
|
|
||||||
"d3-array": "2 - 3"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": ">=12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/d3-time-format": {
|
|
||||||
"version": "4.1.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/d3-time-format/-/d3-time-format-4.1.0.tgz",
|
|
||||||
"integrity": "sha512-dJxPBlzC7NugB2PDLwo9Q8JiTR3M3e4/XANkreKSUxF8vvXKqm1Yfq4Q5dl8budlunRVlUUaDUgFt7eA8D6NLg==",
|
|
||||||
"license": "ISC",
|
|
||||||
"dependencies": {
|
|
||||||
"d3-time": "1 - 3"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": ">=12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/d3-timer": {
|
"node_modules/d3-timer": {
|
||||||
"version": "3.0.1",
|
"version": "3.0.1",
|
||||||
"resolved": "https://registry.npmjs.org/d3-timer/-/d3-timer-3.0.1.tgz",
|
"resolved": "https://registry.npmjs.org/d3-timer/-/d3-timer-3.0.1.tgz",
|
||||||
@@ -3544,12 +3409,6 @@
|
|||||||
"dev": true,
|
"dev": true,
|
||||||
"license": "MIT"
|
"license": "MIT"
|
||||||
},
|
},
|
||||||
"node_modules/decimal.js-light": {
|
|
||||||
"version": "2.5.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/decimal.js-light/-/decimal.js-light-2.5.1.tgz",
|
|
||||||
"integrity": "sha512-qIMFpTMZmny+MMIitAB6D7iVPEorVw6YQRWkvarTkT4tBeSLLiHzcwj6q0MmYSFCiVpiqPJTJEYIrpcPzVEIvg==",
|
|
||||||
"license": "MIT"
|
|
||||||
},
|
|
||||||
"node_modules/dequal": {
|
"node_modules/dequal": {
|
||||||
"version": "2.0.3",
|
"version": "2.0.3",
|
||||||
"resolved": "https://registry.npmjs.org/dequal/-/dequal-2.0.3.tgz",
|
"resolved": "https://registry.npmjs.org/dequal/-/dequal-2.0.3.tgz",
|
||||||
@@ -3570,51 +3429,6 @@
|
|||||||
"node": ">=8"
|
"node": ">=8"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/diagram-js": {
|
|
||||||
"version": "15.9.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/diagram-js/-/diagram-js-15.9.1.tgz",
|
|
||||||
"integrity": "sha512-2JsGmyeTo6o39beq2e/UkTfMopQSM27eXBUzbYQ+1m5VhEnQDkcjcrnRCjcObLMzzXSE/LSJyYhji90sqBFodQ==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"@bpmn-io/diagram-js-ui": "^0.2.3",
|
|
||||||
"clsx": "^2.1.1",
|
|
||||||
"didi": "^11.0.0",
|
|
||||||
"inherits-browser": "^0.1.0",
|
|
||||||
"min-dash": "^5.0.0",
|
|
||||||
"min-dom": "^5.2.0",
|
|
||||||
"object-refs": "^0.4.0",
|
|
||||||
"path-intersection": "^4.1.0",
|
|
||||||
"tiny-svg": "^4.1.4"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": "*"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/diagram-js-direct-editing": {
|
|
||||||
"version": "3.3.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/diagram-js-direct-editing/-/diagram-js-direct-editing-3.3.0.tgz",
|
|
||||||
"integrity": "sha512-EjXYb35J3qBU8lLz5U81hn7wNykVmF7U5DXZ7BvPok2IX7rmPz+ZyaI5AEMiqaC6lpSnHqPxFcPgKEiJcAiv5w==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"min-dash": "^5.0.0",
|
|
||||||
"min-dom": "^5.2.0"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": "*"
|
|
||||||
},
|
|
||||||
"peerDependencies": {
|
|
||||||
"diagram-js": "*"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/didi": {
|
|
||||||
"version": "11.0.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/didi/-/didi-11.0.0.tgz",
|
|
||||||
"integrity": "sha512-PzCfRzQttvFpVcYMbSF7h8EsWjeJpVjWH4qDhB5LkMi1ILvHq4Ob0vhM2wLFziPkbUBi+PAo7ODbe2sacR7nJQ==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": ">= 20.12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/didyoumean": {
|
"node_modules/didyoumean": {
|
||||||
"version": "1.2.2",
|
"version": "1.2.2",
|
||||||
"resolved": "https://registry.npmjs.org/didyoumean/-/didyoumean-1.2.2.tgz",
|
"resolved": "https://registry.npmjs.org/didyoumean/-/didyoumean-1.2.2.tgz",
|
||||||
@@ -3637,28 +3451,6 @@
|
|||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"peer": true
|
"peer": true
|
||||||
},
|
},
|
||||||
"node_modules/dom-helpers": {
|
|
||||||
"version": "5.2.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/dom-helpers/-/dom-helpers-5.2.1.tgz",
|
|
||||||
"integrity": "sha512-nRCa7CK3VTrM2NmGkIy4cbK7IZlgBE/PYMn55rrXefr5xXDP0LdtfPnblFDoVdcAfslJ7or6iqAUnx0CCGIWQA==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"@babel/runtime": "^7.8.7",
|
|
||||||
"csstype": "^3.0.2"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/domify": {
|
|
||||||
"version": "3.0.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/domify/-/domify-3.0.0.tgz",
|
|
||||||
"integrity": "sha512-bs2yO68JDFOm6rKv8f0EnrM2cENduhRkpqOtt/s5l5JBA/eqGBZCzLPmdYoHtJ6utgLGgcBajFsEQbl12pT0lQ==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": ">=20"
|
|
||||||
},
|
|
||||||
"funding": {
|
|
||||||
"url": "https://github.com/sponsors/sindresorhus"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/dompurify": {
|
"node_modules/dompurify": {
|
||||||
"version": "3.3.1",
|
"version": "3.3.1",
|
||||||
"resolved": "https://registry.npmjs.org/dompurify/-/dompurify-3.3.1.tgz",
|
"resolved": "https://registry.npmjs.org/dompurify/-/dompurify-3.3.1.tgz",
|
||||||
@@ -3758,12 +3550,6 @@
|
|||||||
"@types/estree": "^1.0.0"
|
"@types/estree": "^1.0.0"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/eventemitter3": {
|
|
||||||
"version": "4.0.7",
|
|
||||||
"resolved": "https://registry.npmjs.org/eventemitter3/-/eventemitter3-4.0.7.tgz",
|
|
||||||
"integrity": "sha512-8guHBZCwKnFhYdHr2ysuRWErTwhoN2X8XELRlrRwpmfeY2jjuUN4taQMsULKUVo1K4DvZl+0pgfyoysHxvmvEw==",
|
|
||||||
"license": "MIT"
|
|
||||||
},
|
|
||||||
"node_modules/expect-type": {
|
"node_modules/expect-type": {
|
||||||
"version": "1.3.0",
|
"version": "1.3.0",
|
||||||
"resolved": "https://registry.npmjs.org/expect-type/-/expect-type-1.3.0.tgz",
|
"resolved": "https://registry.npmjs.org/expect-type/-/expect-type-1.3.0.tgz",
|
||||||
@@ -3774,15 +3560,6 @@
|
|||||||
"node": ">=12.0.0"
|
"node": ">=12.0.0"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/fast-equals": {
|
|
||||||
"version": "5.4.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/fast-equals/-/fast-equals-5.4.0.tgz",
|
|
||||||
"integrity": "sha512-jt2DW/aNFNwke7AUd+Z+e6pz39KO5rzdbbFCg2sGafS4mk13MI7Z8O5z9cADNn5lhGODIgLwug6TZO2ctf7kcw==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": ">=6.0.0"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/fast-glob": {
|
"node_modules/fast-glob": {
|
||||||
"version": "3.3.3",
|
"version": "3.3.3",
|
||||||
"resolved": "https://registry.npmjs.org/fast-glob/-/fast-glob-3.3.3.tgz",
|
"resolved": "https://registry.npmjs.org/fast-glob/-/fast-glob-3.3.3.tgz",
|
||||||
@@ -3928,12 +3705,6 @@
|
|||||||
"node": ">= 0.4"
|
"node": ">= 0.4"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/htm": {
|
|
||||||
"version": "3.1.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/htm/-/htm-3.1.1.tgz",
|
|
||||||
"integrity": "sha512-983Vyg8NwUE7JkZ6NmOqpCZ+sh1bKv2iYTlUkzlWmA5JD2acKoxd4KVxbMmxX/85mtfdnDmTFoNKcg5DGAvxNQ==",
|
|
||||||
"license": "Apache-2.0"
|
|
||||||
},
|
|
||||||
"node_modules/html-encoding-sniffer": {
|
"node_modules/html-encoding-sniffer": {
|
||||||
"version": "6.0.0",
|
"version": "6.0.0",
|
||||||
"resolved": "https://registry.npmjs.org/html-encoding-sniffer/-/html-encoding-sniffer-6.0.0.tgz",
|
"resolved": "https://registry.npmjs.org/html-encoding-sniffer/-/html-encoding-sniffer-6.0.0.tgz",
|
||||||
@@ -3989,15 +3760,6 @@
|
|||||||
"node": ">= 14"
|
"node": ">= 14"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/ids": {
|
|
||||||
"version": "3.0.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/ids/-/ids-3.0.1.tgz",
|
|
||||||
"integrity": "sha512-mr0zAgpgA/hzCrHB0DnoTG6xZjNC3ABs4eaksXrpVtfaDatA2SVdDb1ZPLjmKjqzp4kexQRuHXwDWQILVK8FZQ==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": ">= 20.12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/immediate": {
|
"node_modules/immediate": {
|
||||||
"version": "3.0.6",
|
"version": "3.0.6",
|
||||||
"resolved": "https://registry.npmjs.org/immediate/-/immediate-3.0.6.tgz",
|
"resolved": "https://registry.npmjs.org/immediate/-/immediate-3.0.6.tgz",
|
||||||
@@ -4020,21 +3782,6 @@
|
|||||||
"integrity": "sha512-k/vGaX4/Yla3WzyMCvTQOXYeIHvqOKtnqBduzTHpzpQZzAskKMhZ2K+EnBiSM9zGSoIFeMpXKxa4dYeZIQqewQ==",
|
"integrity": "sha512-k/vGaX4/Yla3WzyMCvTQOXYeIHvqOKtnqBduzTHpzpQZzAskKMhZ2K+EnBiSM9zGSoIFeMpXKxa4dYeZIQqewQ==",
|
||||||
"license": "ISC"
|
"license": "ISC"
|
||||||
},
|
},
|
||||||
"node_modules/inherits-browser": {
|
|
||||||
"version": "0.1.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/inherits-browser/-/inherits-browser-0.1.0.tgz",
|
|
||||||
"integrity": "sha512-CJHHvW3jQ6q7lzsXPpapLdMx5hDpSF3FSh45pwsj6bKxJJ8Nl8v43i5yXnr3BdfOimGHKyniewQtnAIp3vyJJw==",
|
|
||||||
"license": "ISC"
|
|
||||||
},
|
|
||||||
"node_modules/internmap": {
|
|
||||||
"version": "2.0.3",
|
|
||||||
"resolved": "https://registry.npmjs.org/internmap/-/internmap-2.0.3.tgz",
|
|
||||||
"integrity": "sha512-5Hh7Y1wQbvY5ooGgPbDaL5iYLAPzMTUrjMulskHLH6wnv/A+1q5rgEaiuqEjB+oxGXIVZs1FF+R/KPN3ZSQYYg==",
|
|
||||||
"license": "ISC",
|
|
||||||
"engines": {
|
|
||||||
"node": ">=12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/iobuffer": {
|
"node_modules/iobuffer": {
|
||||||
"version": "5.4.0",
|
"version": "5.4.0",
|
||||||
"resolved": "https://registry.npmjs.org/iobuffer/-/iobuffer-5.4.0.tgz",
|
"resolved": "https://registry.npmjs.org/iobuffer/-/iobuffer-5.4.0.tgz",
|
||||||
@@ -4262,12 +4009,6 @@
|
|||||||
"dev": true,
|
"dev": true,
|
||||||
"license": "MIT"
|
"license": "MIT"
|
||||||
},
|
},
|
||||||
"node_modules/lodash": {
|
|
||||||
"version": "4.17.23",
|
|
||||||
"resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.23.tgz",
|
|
||||||
"integrity": "sha512-LgVTMpQtIopCi79SJeDiP0TfWi5CNEc/L/aRdTh3yIvmZXTnheWpKjSZhnvMl8iXbC1tFg9gdHHDMLoV7CnG+w==",
|
|
||||||
"license": "MIT"
|
|
||||||
},
|
|
||||||
"node_modules/loose-envify": {
|
"node_modules/loose-envify": {
|
||||||
"version": "1.4.0",
|
"version": "1.4.0",
|
||||||
"resolved": "https://registry.npmjs.org/loose-envify/-/loose-envify-1.4.0.tgz",
|
"resolved": "https://registry.npmjs.org/loose-envify/-/loose-envify-1.4.0.tgz",
|
||||||
@@ -4351,22 +4092,6 @@
|
|||||||
"node": ">=8.6"
|
"node": ">=8.6"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/min-dash": {
|
|
||||||
"version": "5.0.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/min-dash/-/min-dash-5.0.0.tgz",
|
|
||||||
"integrity": "sha512-EGuoBnVL7/Fnv2sqakpX5WGmZehZ3YMmLayT7sM8E9DRU74kkeyMg4Rik1lsOkR2GbFNeBca4/L+UfU6gF0Edw==",
|
|
||||||
"license": "MIT"
|
|
||||||
},
|
|
||||||
"node_modules/min-dom": {
|
|
||||||
"version": "5.3.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/min-dom/-/min-dom-5.3.0.tgz",
|
|
||||||
"integrity": "sha512-0w5FEBgPAyHhmFojW3zxd7we3D+m5XYS3E/06OyvxmbHJoiQVa4Nagj6RWvoAKYRw5xth6cP5TMePc5cR1M9hA==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"domify": "^3.0.0",
|
|
||||||
"min-dash": "^5.0.0"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/min-indent": {
|
"node_modules/min-indent": {
|
||||||
"version": "1.0.1",
|
"version": "1.0.1",
|
||||||
"resolved": "https://registry.npmjs.org/min-indent/-/min-indent-1.0.1.tgz",
|
"resolved": "https://registry.npmjs.org/min-indent/-/min-indent-1.0.1.tgz",
|
||||||
@@ -4377,31 +4102,6 @@
|
|||||||
"node": ">=4"
|
"node": ">=4"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/moddle": {
|
|
||||||
"version": "8.1.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/moddle/-/moddle-8.1.0.tgz",
|
|
||||||
"integrity": "sha512-dBddc1CNuZHgro8nQWwfPZ2BkyLWdnxoNpPu9d+XKPN96DAiiBOeBw527ft++ebDuFez5PMdaR3pgUgoOaUGrA==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"min-dash": "^5.0.0"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/moddle-xml": {
|
|
||||||
"version": "12.0.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/moddle-xml/-/moddle-xml-12.0.0.tgz",
|
|
||||||
"integrity": "sha512-NJc2+sCe4tvuGlaUBcoZcYf6j9f+z+qxHOyGm/LB3ZrlJXVPPHoBTg/KXgDRCufdBJhJ3AheFs3QU/abABNzRg==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"min-dash": "^5.0.0",
|
|
||||||
"saxen": "^11.0.2"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": ">= 18"
|
|
||||||
},
|
|
||||||
"peerDependencies": {
|
|
||||||
"moddle": ">= 6.2.0"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/ms": {
|
"node_modules/ms": {
|
||||||
"version": "2.1.3",
|
"version": "2.1.3",
|
||||||
"resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
|
"resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
|
||||||
@@ -4540,6 +4240,7 @@
|
|||||||
"version": "4.1.1",
|
"version": "4.1.1",
|
||||||
"resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
|
"resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
|
||||||
"integrity": "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==",
|
"integrity": "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==",
|
||||||
|
"dev": true,
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
"engines": {
|
"engines": {
|
||||||
"node": ">=0.10.0"
|
"node": ">=0.10.0"
|
||||||
@@ -4555,15 +4256,6 @@
|
|||||||
"node": ">= 6"
|
"node": ">= 6"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/object-refs": {
|
|
||||||
"version": "0.4.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/object-refs/-/object-refs-0.4.0.tgz",
|
|
||||||
"integrity": "sha512-6kJqKWryKZmtte6QYvouas0/EIJKPI1/MMIuRsiBlNuhIMfqYTggzX2F1AJ2+cDs288xyi9GL7FyasHINR98BQ==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": "*"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/obug": {
|
"node_modules/obug": {
|
||||||
"version": "2.1.1",
|
"version": "2.1.1",
|
||||||
"resolved": "https://registry.npmjs.org/obug/-/obug-2.1.1.tgz",
|
"resolved": "https://registry.npmjs.org/obug/-/obug-2.1.1.tgz",
|
||||||
@@ -4594,15 +4286,6 @@
|
|||||||
"url": "https://github.com/inikulin/parse5?sponsor=1"
|
"url": "https://github.com/inikulin/parse5?sponsor=1"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/path-intersection": {
|
|
||||||
"version": "4.1.0",
|
|
||||||
"resolved": "https://registry.npmjs.org/path-intersection/-/path-intersection-4.1.0.tgz",
|
|
||||||
"integrity": "sha512-urUP6WvhnxbHPdHYl6L7Yrc6+1ny6uOFKPCzPxTSUSYGHG0o94RmI7SvMMaScNAM5RtTf08bg4skc6/kjfne3A==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": ">= 14.20"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/path-parse": {
|
"node_modules/path-parse": {
|
||||||
"version": "1.0.7",
|
"version": "1.0.7",
|
||||||
"resolved": "https://registry.npmjs.org/path-parse/-/path-parse-1.0.7.tgz",
|
"resolved": "https://registry.npmjs.org/path-parse/-/path-parse-1.0.7.tgz",
|
||||||
@@ -4872,16 +4555,6 @@
|
|||||||
"dev": true,
|
"dev": true,
|
||||||
"license": "MIT"
|
"license": "MIT"
|
||||||
},
|
},
|
||||||
"node_modules/preact": {
|
|
||||||
"version": "10.28.4",
|
|
||||||
"resolved": "https://registry.npmjs.org/preact/-/preact-10.28.4.tgz",
|
|
||||||
"integrity": "sha512-uKFfOHWuSNpRFVTnljsCluEFq57OKT+0QdOiQo8XWnQ/pSvg7OpX5eNOejELXJMWy+BwM2nobz0FkvzmnpCNsQ==",
|
|
||||||
"license": "MIT",
|
|
||||||
"funding": {
|
|
||||||
"type": "opencollective",
|
|
||||||
"url": "https://opencollective.com/preact"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/pretty-format": {
|
"node_modules/pretty-format": {
|
||||||
"version": "27.5.1",
|
"version": "27.5.1",
|
||||||
"resolved": "https://registry.npmjs.org/pretty-format/-/pretty-format-27.5.1.tgz",
|
"resolved": "https://registry.npmjs.org/pretty-format/-/pretty-format-27.5.1.tgz",
|
||||||
@@ -4904,23 +4577,6 @@
|
|||||||
"integrity": "sha512-3ouUOpQhtgrbOa17J7+uxOTpITYWaGP7/AhoR3+A+/1e9skrzelGi/dXzEYyvbxubEF6Wn2ypscTKiKJFFn1ag==",
|
"integrity": "sha512-3ouUOpQhtgrbOa17J7+uxOTpITYWaGP7/AhoR3+A+/1e9skrzelGi/dXzEYyvbxubEF6Wn2ypscTKiKJFFn1ag==",
|
||||||
"license": "MIT"
|
"license": "MIT"
|
||||||
},
|
},
|
||||||
"node_modules/prop-types": {
|
|
||||||
"version": "15.8.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/prop-types/-/prop-types-15.8.1.tgz",
|
|
||||||
"integrity": "sha512-oj87CgZICdulUohogVAR7AjlC0327U4el4L6eAvOqCeudMDVU0NThNaV+b9Df4dXgSP1gXMTnPdhfe/2qDH5cg==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"loose-envify": "^1.4.0",
|
|
||||||
"object-assign": "^4.1.1",
|
|
||||||
"react-is": "^16.13.1"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/prop-types/node_modules/react-is": {
|
|
||||||
"version": "16.13.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/react-is/-/react-is-16.13.1.tgz",
|
|
||||||
"integrity": "sha512-24e6ynE2H+OKt4kqsOvNd8kBpV65zoxbA4BVsEOB3ARVWQki/DHzaUoC5KuON/BiccDaCCTZBuOcfZs70kR8bQ==",
|
|
||||||
"license": "MIT"
|
|
||||||
},
|
|
||||||
"node_modules/punycode": {
|
"node_modules/punycode": {
|
||||||
"version": "2.3.1",
|
"version": "2.3.1",
|
||||||
"resolved": "https://registry.npmjs.org/punycode/-/punycode-2.3.1.tgz",
|
"resolved": "https://registry.npmjs.org/punycode/-/punycode-2.3.1.tgz",
|
||||||
@@ -5005,37 +4661,6 @@
|
|||||||
"node": ">=0.10.0"
|
"node": ">=0.10.0"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/react-smooth": {
|
|
||||||
"version": "4.0.4",
|
|
||||||
"resolved": "https://registry.npmjs.org/react-smooth/-/react-smooth-4.0.4.tgz",
|
|
||||||
"integrity": "sha512-gnGKTpYwqL0Iii09gHobNolvX4Kiq4PKx6eWBCYYix+8cdw+cGo3do906l1NBPKkSWx1DghC1dlWG9L2uGd61Q==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"fast-equals": "^5.0.1",
|
|
||||||
"prop-types": "^15.8.1",
|
|
||||||
"react-transition-group": "^4.4.5"
|
|
||||||
},
|
|
||||||
"peerDependencies": {
|
|
||||||
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0",
|
|
||||||
"react-dom": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/react-transition-group": {
|
|
||||||
"version": "4.4.5",
|
|
||||||
"resolved": "https://registry.npmjs.org/react-transition-group/-/react-transition-group-4.4.5.tgz",
|
|
||||||
"integrity": "sha512-pZcd1MCJoiKiBR2NRxeCRg13uCXbydPnmB4EOeRrY7480qNWO8IIgQG6zlDkm6uRMsURXPuKq0GWtiM59a5Q6g==",
|
|
||||||
"license": "BSD-3-Clause",
|
|
||||||
"dependencies": {
|
|
||||||
"@babel/runtime": "^7.5.5",
|
|
||||||
"dom-helpers": "^5.0.1",
|
|
||||||
"loose-envify": "^1.4.0",
|
|
||||||
"prop-types": "^15.6.2"
|
|
||||||
},
|
|
||||||
"peerDependencies": {
|
|
||||||
"react": ">=16.6.0",
|
|
||||||
"react-dom": ">=16.6.0"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/reactflow": {
|
"node_modules/reactflow": {
|
||||||
"version": "11.11.4",
|
"version": "11.11.4",
|
||||||
"resolved": "https://registry.npmjs.org/reactflow/-/reactflow-11.11.4.tgz",
|
"resolved": "https://registry.npmjs.org/reactflow/-/reactflow-11.11.4.tgz",
|
||||||
@@ -5092,44 +4717,6 @@
|
|||||||
"node": ">=8.10.0"
|
"node": ">=8.10.0"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/recharts": {
|
|
||||||
"version": "2.15.4",
|
|
||||||
"resolved": "https://registry.npmjs.org/recharts/-/recharts-2.15.4.tgz",
|
|
||||||
"integrity": "sha512-UT/q6fwS3c1dHbXv2uFgYJ9BMFHu3fwnd7AYZaEQhXuYQ4hgsxLvsUXzGdKeZrW5xopzDCvuA2N41WJ88I7zIw==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"clsx": "^2.0.0",
|
|
||||||
"eventemitter3": "^4.0.1",
|
|
||||||
"lodash": "^4.17.21",
|
|
||||||
"react-is": "^18.3.1",
|
|
||||||
"react-smooth": "^4.0.4",
|
|
||||||
"recharts-scale": "^0.4.4",
|
|
||||||
"tiny-invariant": "^1.3.1",
|
|
||||||
"victory-vendor": "^36.6.8"
|
|
||||||
},
|
|
||||||
"engines": {
|
|
||||||
"node": ">=14"
|
|
||||||
},
|
|
||||||
"peerDependencies": {
|
|
||||||
"react": "^16.0.0 || ^17.0.0 || ^18.0.0 || ^19.0.0",
|
|
||||||
"react-dom": "^16.0.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/recharts-scale": {
|
|
||||||
"version": "0.4.5",
|
|
||||||
"resolved": "https://registry.npmjs.org/recharts-scale/-/recharts-scale-0.4.5.tgz",
|
|
||||||
"integrity": "sha512-kivNFO+0OcUNu7jQquLXAxz1FIwZj8nrj+YkOKc5694NbjCvcT6aSZiIzNzd2Kul4o4rTto8QVR9lMNtxD4G1w==",
|
|
||||||
"license": "MIT",
|
|
||||||
"dependencies": {
|
|
||||||
"decimal.js-light": "^2.4.1"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/recharts/node_modules/react-is": {
|
|
||||||
"version": "18.3.1",
|
|
||||||
"resolved": "https://registry.npmjs.org/react-is/-/react-is-18.3.1.tgz",
|
|
||||||
"integrity": "sha512-/LLMVyas0ljjAtoYiPqYiL8VWXzUUdThrmU5+n20DZv+a+ClRoevUzw5JxU+Ieh5/c87ytoTBV9G1FiKfNJdmg==",
|
|
||||||
"license": "MIT"
|
|
||||||
},
|
|
||||||
"node_modules/redent": {
|
"node_modules/redent": {
|
||||||
"version": "3.0.0",
|
"version": "3.0.0",
|
||||||
"resolved": "https://registry.npmjs.org/redent/-/redent-3.0.0.tgz",
|
"resolved": "https://registry.npmjs.org/redent/-/redent-3.0.0.tgz",
|
||||||
@@ -5278,15 +4865,6 @@
|
|||||||
"integrity": "sha512-Gd2UZBJDkXlY7GbJxfsE8/nvKkUEU1G38c1siN6QP6a9PT9MmHB8GnpscSmMJSoF8LOIrt8ud/wPtojys4G6+g==",
|
"integrity": "sha512-Gd2UZBJDkXlY7GbJxfsE8/nvKkUEU1G38c1siN6QP6a9PT9MmHB8GnpscSmMJSoF8LOIrt8ud/wPtojys4G6+g==",
|
||||||
"license": "MIT"
|
"license": "MIT"
|
||||||
},
|
},
|
||||||
"node_modules/saxen": {
|
|
||||||
"version": "11.0.2",
|
|
||||||
"resolved": "https://registry.npmjs.org/saxen/-/saxen-11.0.2.tgz",
|
|
||||||
"integrity": "sha512-WDb4gqac8uiJzOdOdVpr9NWh9NrJMm7Brn5GX2Poj+mjE/QTXqYQENr8T/mom54dDDgbd3QjwTg23TRHYiWXRA==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": ">= 20.12"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/saxes": {
|
"node_modules/saxes": {
|
||||||
"version": "6.0.0",
|
"version": "6.0.0",
|
||||||
"resolved": "https://registry.npmjs.org/saxes/-/saxes-6.0.0.tgz",
|
"resolved": "https://registry.npmjs.org/saxes/-/saxes-6.0.0.tgz",
|
||||||
@@ -5582,21 +5160,6 @@
|
|||||||
"node": ">=0.8"
|
"node": ">=0.8"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/tiny-invariant": {
|
|
||||||
"version": "1.3.3",
|
|
||||||
"resolved": "https://registry.npmjs.org/tiny-invariant/-/tiny-invariant-1.3.3.tgz",
|
|
||||||
"integrity": "sha512-+FbBPE1o9QAYvviau/qC5SE3caw21q3xkvWKBtja5vgqOWIHHJ3ioaq1VPfn/Szqctz2bU/oYeKd9/z5BL+PVg==",
|
|
||||||
"license": "MIT"
|
|
||||||
},
|
|
||||||
"node_modules/tiny-svg": {
|
|
||||||
"version": "4.1.4",
|
|
||||||
"resolved": "https://registry.npmjs.org/tiny-svg/-/tiny-svg-4.1.4.tgz",
|
|
||||||
"integrity": "sha512-cBaEACCbouYrQc9RG+eTXnPYosX1Ijqty/I6DdXovwDd89Pwu4jcmpOR7BuFEF9YCcd7/AWwasE0207WMK7hdw==",
|
|
||||||
"license": "MIT",
|
|
||||||
"engines": {
|
|
||||||
"node": ">= 20"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/tinybench": {
|
"node_modules/tinybench": {
|
||||||
"version": "2.9.0",
|
"version": "2.9.0",
|
||||||
"resolved": "https://registry.npmjs.org/tinybench/-/tinybench-2.9.0.tgz",
|
"resolved": "https://registry.npmjs.org/tinybench/-/tinybench-2.9.0.tgz",
|
||||||
@@ -5844,28 +5407,6 @@
|
|||||||
"uuid": "dist-node/bin/uuid"
|
"uuid": "dist-node/bin/uuid"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"node_modules/victory-vendor": {
|
|
||||||
"version": "36.9.2",
|
|
||||||
"resolved": "https://registry.npmjs.org/victory-vendor/-/victory-vendor-36.9.2.tgz",
|
|
||||||
"integrity": "sha512-PnpQQMuxlwYdocC8fIJqVXvkeViHYzotI+NJrCuav0ZYFoq912ZHBk3mCeuj+5/VpodOjPe1z0Fk2ihgzlXqjQ==",
|
|
||||||
"license": "MIT AND ISC",
|
|
||||||
"dependencies": {
|
|
||||||
"@types/d3-array": "^3.0.3",
|
|
||||||
"@types/d3-ease": "^3.0.0",
|
|
||||||
"@types/d3-interpolate": "^3.0.1",
|
|
||||||
"@types/d3-scale": "^4.0.2",
|
|
||||||
"@types/d3-shape": "^3.1.0",
|
|
||||||
"@types/d3-time": "^3.0.0",
|
|
||||||
"@types/d3-timer": "^3.0.0",
|
|
||||||
"d3-array": "^3.1.6",
|
|
||||||
"d3-ease": "^3.0.1",
|
|
||||||
"d3-interpolate": "^3.0.1",
|
|
||||||
"d3-scale": "^4.0.2",
|
|
||||||
"d3-shape": "^3.1.0",
|
|
||||||
"d3-time": "^3.0.0",
|
|
||||||
"d3-timer": "^3.0.1"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"node_modules/vite": {
|
"node_modules/vite": {
|
||||||
"version": "7.3.1",
|
"version": "7.3.1",
|
||||||
"resolved": "https://registry.npmjs.org/vite/-/vite-7.3.1.tgz",
|
"resolved": "https://registry.npmjs.org/vite/-/vite-7.3.1.tgz",
|
||||||
|
|||||||
@@ -27,7 +27,6 @@
|
|||||||
"react-dom": "^18.3.1",
|
"react-dom": "^18.3.1",
|
||||||
"reactflow": "^11.11.4",
|
"reactflow": "^11.11.4",
|
||||||
"recharts": "^2.15.0",
|
"recharts": "^2.15.0",
|
||||||
"fabric": "^6.0.0",
|
|
||||||
"uuid": "^13.0.0"
|
"uuid": "^13.0.0"
|
||||||
},
|
},
|
||||||
"devDependencies": {
|
"devDependencies": {
|
||||||
|
|||||||
File diff suppressed because one or more lines are too long
323
docker-compose.coolify.yml
Normal file
@@ -0,0 +1,323 @@
|
|||||||
|
# =========================================================
|
||||||
|
# BreakPilot Lehrer — KI-Lehrerplattform (Coolify)
|
||||||
|
# =========================================================
|
||||||
|
# Requires: breakpilot-core must be running
|
||||||
|
# Deployed via Coolify. SSL termination handled by Traefik.
|
||||||
|
# External services (managed separately in Coolify):
|
||||||
|
# - PostgreSQL, Qdrant, S3-compatible storage
|
||||||
|
# =========================================================
|
||||||
|
|
||||||
|
networks:
|
||||||
|
breakpilot-network:
|
||||||
|
external: true
|
||||||
|
name: breakpilot-network
|
||||||
|
|
||||||
|
volumes:
|
||||||
|
klausur_uploads:
|
||||||
|
eh_uploads:
|
||||||
|
ocr_labeling:
|
||||||
|
paddle_models:
|
||||||
|
lehrer_backend_data:
|
||||||
|
opensearch_data:
|
||||||
|
|
||||||
|
services:
|
||||||
|
|
||||||
|
# =========================================================
|
||||||
|
# FRONTEND
|
||||||
|
# =========================================================
|
||||||
|
admin-lehrer:
|
||||||
|
build:
|
||||||
|
context: ./admin-lehrer
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
args:
|
||||||
|
NEXT_PUBLIC_API_URL: ${NEXT_PUBLIC_API_URL:-https://api-lehrer.breakpilot.ai}
|
||||||
|
NEXT_PUBLIC_OLD_ADMIN_URL: ${NEXT_PUBLIC_OLD_ADMIN_URL:-}
|
||||||
|
NEXT_PUBLIC_KLAUSUR_SERVICE_URL: ${NEXT_PUBLIC_KLAUSUR_SERVICE_URL:-https://klausur.breakpilot.ai}
|
||||||
|
NEXT_PUBLIC_VOICE_SERVICE_URL: ${NEXT_PUBLIC_VOICE_SERVICE_URL:-wss://voice.breakpilot.ai}
|
||||||
|
container_name: bp-lehrer-admin
|
||||||
|
expose:
|
||||||
|
- "3000"
|
||||||
|
volumes:
|
||||||
|
- lehrer_backend_data:/app/data
|
||||||
|
environment:
|
||||||
|
NODE_ENV: production
|
||||||
|
BACKEND_URL: http://backend-lehrer:8001
|
||||||
|
CONSENT_SERVICE_URL: http://bp-core-consent-service:8081
|
||||||
|
KLAUSUR_SERVICE_URL: http://klausur-service:8086
|
||||||
|
OLLAMA_URL: ${OLLAMA_URL:-}
|
||||||
|
depends_on:
|
||||||
|
backend-lehrer:
|
||||||
|
condition: service_started
|
||||||
|
labels:
|
||||||
|
- "traefik.enable=true"
|
||||||
|
- "traefik.http.routers.admin-lehrer.rule=Host(`admin-lehrer.breakpilot.ai`)"
|
||||||
|
- "traefik.http.routers.admin-lehrer.entrypoints=https"
|
||||||
|
- "traefik.http.routers.admin-lehrer.tls=true"
|
||||||
|
- "traefik.http.routers.admin-lehrer.tls.certresolver=letsencrypt"
|
||||||
|
- "traefik.http.services.admin-lehrer.loadbalancer.server.port=3000"
|
||||||
|
restart: unless-stopped
|
||||||
|
networks:
|
||||||
|
- breakpilot-network
|
||||||
|
|
||||||
|
studio-v2:
|
||||||
|
build:
|
||||||
|
context: ./studio-v2
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
args:
|
||||||
|
NEXT_PUBLIC_VOICE_SERVICE_URL: ${NEXT_PUBLIC_VOICE_SERVICE_URL:-wss://voice.breakpilot.ai}
|
||||||
|
NEXT_PUBLIC_KLAUSUR_SERVICE_URL: ${NEXT_PUBLIC_KLAUSUR_SERVICE_URL:-https://klausur.breakpilot.ai}
|
||||||
|
container_name: bp-lehrer-studio-v2
|
||||||
|
expose:
|
||||||
|
- "3001"
|
||||||
|
environment:
|
||||||
|
NODE_ENV: production
|
||||||
|
BACKEND_URL: http://backend-lehrer:8001
|
||||||
|
depends_on:
|
||||||
|
- backend-lehrer
|
||||||
|
labels:
|
||||||
|
- "traefik.enable=true"
|
||||||
|
- "traefik.http.routers.studio.rule=Host(`app.breakpilot.ai`)"
|
||||||
|
- "traefik.http.routers.studio.entrypoints=https"
|
||||||
|
- "traefik.http.routers.studio.tls=true"
|
||||||
|
- "traefik.http.routers.studio.tls.certresolver=letsencrypt"
|
||||||
|
- "traefik.http.services.studio.loadbalancer.server.port=3001"
|
||||||
|
restart: unless-stopped
|
||||||
|
networks:
|
||||||
|
- breakpilot-network
|
||||||
|
|
||||||
|
website:
|
||||||
|
build:
|
||||||
|
context: ./website
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
args:
|
||||||
|
NEXT_PUBLIC_BILLING_API_URL: ${NEXT_PUBLIC_BILLING_API_URL:-https://api-core.breakpilot.ai}
|
||||||
|
NEXT_PUBLIC_APP_URL: ${NEXT_PUBLIC_APP_URL:-https://app.breakpilot.ai}
|
||||||
|
NEXT_PUBLIC_KLAUSUR_SERVICE_URL: ${NEXT_PUBLIC_KLAUSUR_SERVICE_URL:-https://klausur.breakpilot.ai}
|
||||||
|
NEXT_PUBLIC_VOICE_SERVICE_URL: ${NEXT_PUBLIC_VOICE_SERVICE_URL:-wss://voice.breakpilot.ai}
|
||||||
|
NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY: ${NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY:-}
|
||||||
|
container_name: bp-lehrer-website
|
||||||
|
expose:
|
||||||
|
- "3000"
|
||||||
|
environment:
|
||||||
|
NODE_ENV: production
|
||||||
|
VAST_API_KEY: ${VAST_API_KEY:-}
|
||||||
|
CONTROL_API_KEY: ${CONTROL_API_KEY:-}
|
||||||
|
BACKEND_URL: http://backend-lehrer:8001
|
||||||
|
CONSENT_SERVICE_URL: http://bp-core-consent-service:8081
|
||||||
|
EDU_SEARCH_URL: ${EDU_SEARCH_URL:-}
|
||||||
|
EDU_SEARCH_API_KEY: ${EDU_SEARCH_API_KEY:-}
|
||||||
|
depends_on:
|
||||||
|
- backend-lehrer
|
||||||
|
labels:
|
||||||
|
- "traefik.enable=true"
|
||||||
|
- "traefik.http.routers.website.rule=Host(`www.breakpilot.ai`)"
|
||||||
|
- "traefik.http.routers.website.entrypoints=https"
|
||||||
|
- "traefik.http.routers.website.tls=true"
|
||||||
|
- "traefik.http.routers.website.tls.certresolver=letsencrypt"
|
||||||
|
- "traefik.http.services.website.loadbalancer.server.port=3000"
|
||||||
|
restart: unless-stopped
|
||||||
|
networks:
|
||||||
|
- breakpilot-network
|
||||||
|
|
||||||
|
# =========================================================
|
||||||
|
# BACKEND
|
||||||
|
# =========================================================
|
||||||
|
backend-lehrer:
|
||||||
|
build:
|
||||||
|
context: ./backend-lehrer
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
container_name: bp-lehrer-backend
|
||||||
|
user: "0:0"
|
||||||
|
expose:
|
||||||
|
- "8001"
|
||||||
|
volumes:
|
||||||
|
- lehrer_backend_data:/app/data
|
||||||
|
environment:
|
||||||
|
PORT: 8001
|
||||||
|
DATABASE_URL: postgresql+asyncpg://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT:-5432}/${POSTGRES_DB}?options=-csearch_path%3Dlehrer,core,public
|
||||||
|
JWT_SECRET: ${JWT_SECRET}
|
||||||
|
ENVIRONMENT: production
|
||||||
|
CONSENT_SERVICE_URL: http://bp-core-consent-service:8081
|
||||||
|
KLAUSUR_SERVICE_URL: http://klausur-service:8086
|
||||||
|
TROCR_SERVICE_URL: ${TROCR_SERVICE_URL:-}
|
||||||
|
CAMUNDA_URL: ${CAMUNDA_URL:-}
|
||||||
|
VALKEY_URL: redis://bp-core-valkey:6379/0
|
||||||
|
SESSION_TTL_HOURS: ${SESSION_TTL_HOURS:-24}
|
||||||
|
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:-}
|
||||||
|
DEBUG: "false"
|
||||||
|
ALERTS_AGENT_ENABLED: ${ALERTS_AGENT_ENABLED:-false}
|
||||||
|
VAST_API_KEY: ${VAST_API_KEY:-}
|
||||||
|
VAST_INSTANCE_ID: ${VAST_INSTANCE_ID:-}
|
||||||
|
CONTROL_API_KEY: ${CONTROL_API_KEY:-}
|
||||||
|
OLLAMA_BASE_URL: ${OLLAMA_BASE_URL:-}
|
||||||
|
OLLAMA_ENABLED: ${OLLAMA_ENABLED:-false}
|
||||||
|
OLLAMA_DEFAULT_MODEL: ${OLLAMA_DEFAULT_MODEL:-}
|
||||||
|
OLLAMA_VISION_MODEL: ${OLLAMA_VISION_MODEL:-}
|
||||||
|
OLLAMA_CORRECTION_MODEL: ${OLLAMA_CORRECTION_MODEL:-}
|
||||||
|
OLLAMA_TIMEOUT: ${OLLAMA_TIMEOUT:-120}
|
||||||
|
GAME_USE_DATABASE: ${GAME_USE_DATABASE:-true}
|
||||||
|
GAME_REQUIRE_AUTH: ${GAME_REQUIRE_AUTH:-true}
|
||||||
|
GAME_REQUIRE_BILLING: ${GAME_REQUIRE_BILLING:-true}
|
||||||
|
GAME_LLM_MODEL: ${GAME_LLM_MODEL:-}
|
||||||
|
SMTP_HOST: ${SMTP_HOST}
|
||||||
|
SMTP_PORT: ${SMTP_PORT:-587}
|
||||||
|
SMTP_USERNAME: ${SMTP_USERNAME}
|
||||||
|
SMTP_PASSWORD: ${SMTP_PASSWORD}
|
||||||
|
SMTP_FROM_NAME: ${SMTP_FROM_NAME:-BreakPilot}
|
||||||
|
SMTP_FROM_ADDR: ${SMTP_FROM_ADDR:-noreply@breakpilot.ai}
|
||||||
|
RAG_SERVICE_URL: http://bp-core-rag-service:8097
|
||||||
|
labels:
|
||||||
|
- "traefik.enable=true"
|
||||||
|
- "traefik.http.routers.backend-lehrer.rule=Host(`api-lehrer.breakpilot.ai`)"
|
||||||
|
- "traefik.http.routers.backend-lehrer.entrypoints=https"
|
||||||
|
- "traefik.http.routers.backend-lehrer.tls=true"
|
||||||
|
- "traefik.http.routers.backend-lehrer.tls.certresolver=letsencrypt"
|
||||||
|
- "traefik.http.services.backend-lehrer.loadbalancer.server.port=8001"
|
||||||
|
restart: unless-stopped
|
||||||
|
networks:
|
||||||
|
- breakpilot-network
|
||||||
|
|
||||||
|
# =========================================================
|
||||||
|
# MICROSERVICES
|
||||||
|
# =========================================================
|
||||||
|
klausur-service:
|
||||||
|
build:
|
||||||
|
context: ./klausur-service
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
container_name: bp-lehrer-klausur-service
|
||||||
|
expose:
|
||||||
|
- "8086"
|
||||||
|
volumes:
|
||||||
|
- klausur_uploads:/app/uploads
|
||||||
|
- eh_uploads:/app/eh-uploads
|
||||||
|
- ocr_labeling:/app/ocr-labeling
|
||||||
|
- paddle_models:/root/.paddlex
|
||||||
|
environment:
|
||||||
|
JWT_SECRET: ${JWT_SECRET}
|
||||||
|
BACKEND_URL: http://backend-lehrer:8001
|
||||||
|
SCHOOL_SERVICE_URL: http://school-service:8084
|
||||||
|
ENVIRONMENT: production
|
||||||
|
DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT:-5432}/${POSTGRES_DB}
|
||||||
|
EMBEDDING_SERVICE_URL: http://bp-core-embedding-service:8087
|
||||||
|
QDRANT_URL: ${QDRANT_URL}
|
||||||
|
MINIO_ENDPOINT: ${S3_ENDPOINT}
|
||||||
|
MINIO_ACCESS_KEY: ${S3_ACCESS_KEY}
|
||||||
|
MINIO_SECRET_KEY: ${S3_SECRET_KEY}
|
||||||
|
MINIO_BUCKET: ${S3_BUCKET:-breakpilot-rag}
|
||||||
|
MINIO_SECURE: ${S3_SECURE:-true}
|
||||||
|
PADDLEOCR_SERVICE_URL: ${PADDLEOCR_SERVICE_URL:-}
|
||||||
|
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:-}
|
||||||
|
OLLAMA_BASE_URL: ${OLLAMA_BASE_URL:-}
|
||||||
|
OLLAMA_ENABLED: ${OLLAMA_ENABLED:-false}
|
||||||
|
OLLAMA_DEFAULT_MODEL: ${OLLAMA_DEFAULT_MODEL:-}
|
||||||
|
OLLAMA_VISION_MODEL: ${OLLAMA_VISION_MODEL:-}
|
||||||
|
OLLAMA_CORRECTION_MODEL: ${OLLAMA_CORRECTION_MODEL:-}
|
||||||
|
RAG_SERVICE_URL: http://bp-core-rag-service:8097
|
||||||
|
depends_on:
|
||||||
|
school-service:
|
||||||
|
condition: service_started
|
||||||
|
healthcheck:
|
||||||
|
test: ["CMD", "curl", "-f", "http://127.0.0.1:8086/health"]
|
||||||
|
interval: 30s
|
||||||
|
timeout: 30s
|
||||||
|
retries: 3
|
||||||
|
start_period: 10s
|
||||||
|
labels:
|
||||||
|
- "traefik.enable=true"
|
||||||
|
- "traefik.http.routers.klausur.rule=Host(`klausur.breakpilot.ai`)"
|
||||||
|
- "traefik.http.routers.klausur.entrypoints=https"
|
||||||
|
- "traefik.http.routers.klausur.tls=true"
|
||||||
|
- "traefik.http.routers.klausur.tls.certresolver=letsencrypt"
|
||||||
|
- "traefik.http.services.klausur.loadbalancer.server.port=8086"
|
||||||
|
restart: unless-stopped
|
||||||
|
networks:
|
||||||
|
- breakpilot-network
|
||||||
|
|
||||||
|
school-service:
|
||||||
|
build:
|
||||||
|
context: ./school-service
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
container_name: bp-lehrer-school-service
|
||||||
|
expose:
|
||||||
|
- "8084"
|
||||||
|
environment:
|
||||||
|
DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT:-5432}/${POSTGRES_DB}
|
||||||
|
JWT_SECRET: ${JWT_SECRET}
|
||||||
|
PORT: 8084
|
||||||
|
ENVIRONMENT: production
|
||||||
|
ALLOWED_ORIGINS: "*"
|
||||||
|
LLM_GATEWAY_URL: http://backend-lehrer:8001/llm
|
||||||
|
restart: unless-stopped
|
||||||
|
networks:
|
||||||
|
- breakpilot-network
|
||||||
|
|
||||||
|
# =========================================================
|
||||||
|
# EDU SEARCH
|
||||||
|
# =========================================================
|
||||||
|
opensearch:
|
||||||
|
image: opensearchproject/opensearch:2.11.1
|
||||||
|
container_name: bp-lehrer-opensearch
|
||||||
|
environment:
|
||||||
|
- cluster.name=edu-search-cluster
|
||||||
|
- node.name=opensearch-node1
|
||||||
|
- discovery.type=single-node
|
||||||
|
- bootstrap.memory_lock=true
|
||||||
|
- "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
|
||||||
|
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_PASSWORD:-Admin123!}
|
||||||
|
- plugins.security.disabled=true
|
||||||
|
ulimits:
|
||||||
|
memlock:
|
||||||
|
soft: -1
|
||||||
|
hard: -1
|
||||||
|
nofile:
|
||||||
|
soft: 65536
|
||||||
|
hard: 65536
|
||||||
|
volumes:
|
||||||
|
- opensearch_data:/usr/share/opensearch/data
|
||||||
|
healthcheck:
|
||||||
|
test: ["CMD-SHELL", "curl -s http://localhost:9200 >/dev/null || exit 1"]
|
||||||
|
interval: 30s
|
||||||
|
timeout: 10s
|
||||||
|
retries: 5
|
||||||
|
start_period: 60s
|
||||||
|
restart: unless-stopped
|
||||||
|
networks:
|
||||||
|
- breakpilot-network
|
||||||
|
|
||||||
|
edu-search-service:
|
||||||
|
build:
|
||||||
|
context: ./edu-search-service
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
container_name: bp-lehrer-edu-search
|
||||||
|
expose:
|
||||||
|
- "8088"
|
||||||
|
environment:
|
||||||
|
PORT: 8088
|
||||||
|
OPENSEARCH_URL: http://opensearch:9200
|
||||||
|
OPENSEARCH_USERNAME: admin
|
||||||
|
OPENSEARCH_PASSWORD: ${OPENSEARCH_PASSWORD:-Admin123!}
|
||||||
|
INDEX_NAME: bp_documents_v1
|
||||||
|
EDU_SEARCH_API_KEY: ${EDU_SEARCH_API_KEY:-}
|
||||||
|
USER_AGENT: "BreakpilotEduCrawler/1.0 (+contact: security@breakpilot.com)"
|
||||||
|
RATE_LIMIT_PER_SEC: "0.2"
|
||||||
|
MAX_DEPTH: "4"
|
||||||
|
MAX_PAGES_PER_RUN: "500"
|
||||||
|
DB_HOST: ${POSTGRES_HOST}
|
||||||
|
DB_PORT: ${POSTGRES_PORT:-5432}
|
||||||
|
DB_USER: ${POSTGRES_USER}
|
||||||
|
DB_PASSWORD: ${POSTGRES_PASSWORD}
|
||||||
|
DB_NAME: ${POSTGRES_DB}
|
||||||
|
DB_SSLMODE: disable
|
||||||
|
STAFF_CRAWLER_EMAIL: crawler@breakpilot.de
|
||||||
|
depends_on:
|
||||||
|
opensearch:
|
||||||
|
condition: service_healthy
|
||||||
|
healthcheck:
|
||||||
|
test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8088/v1/health"]
|
||||||
|
interval: 30s
|
||||||
|
timeout: 3s
|
||||||
|
start_period: 10s
|
||||||
|
retries: 3
|
||||||
|
restart: unless-stopped
|
||||||
|
networks:
|
||||||
|
- breakpilot-network
|
||||||
@@ -15,7 +15,6 @@ volumes:
|
|||||||
eh_uploads:
|
eh_uploads:
|
||||||
ocr_labeling:
|
ocr_labeling:
|
||||||
paddle_models:
|
paddle_models:
|
||||||
lighton_models:
|
|
||||||
paddleocr_models:
|
paddleocr_models:
|
||||||
transcription_models:
|
transcription_models:
|
||||||
transcription_temp:
|
transcription_temp:
|
||||||
@@ -210,7 +209,6 @@ services:
|
|||||||
- eh_uploads:/app/eh-uploads
|
- eh_uploads:/app/eh-uploads
|
||||||
- ocr_labeling:/app/ocr-labeling
|
- ocr_labeling:/app/ocr-labeling
|
||||||
- paddle_models:/root/.paddlex
|
- paddle_models:/root/.paddlex
|
||||||
- lighton_models:/root/.cache/huggingface
|
|
||||||
environment:
|
environment:
|
||||||
JWT_SECRET: ${JWT_SECRET:-your-super-secret-jwt-key-change-in-production}
|
JWT_SECRET: ${JWT_SECRET:-your-super-secret-jwt-key-change-in-production}
|
||||||
BACKEND_URL: http://backend-lehrer:8001
|
BACKEND_URL: http://backend-lehrer:8001
|
||||||
@@ -233,12 +231,6 @@ services:
|
|||||||
OLLAMA_DEFAULT_MODEL: ${OLLAMA_DEFAULT_MODEL:-llama3.2}
|
OLLAMA_DEFAULT_MODEL: ${OLLAMA_DEFAULT_MODEL:-llama3.2}
|
||||||
OLLAMA_VISION_MODEL: ${OLLAMA_VISION_MODEL:-llama3.2-vision}
|
OLLAMA_VISION_MODEL: ${OLLAMA_VISION_MODEL:-llama3.2-vision}
|
||||||
OLLAMA_CORRECTION_MODEL: ${OLLAMA_CORRECTION_MODEL:-llama3.2}
|
OLLAMA_CORRECTION_MODEL: ${OLLAMA_CORRECTION_MODEL:-llama3.2}
|
||||||
OLLAMA_REVIEW_MODEL: ${OLLAMA_REVIEW_MODEL:-qwen3:0.6b}
|
|
||||||
OLLAMA_REVIEW_BATCH_SIZE: ${OLLAMA_REVIEW_BATCH_SIZE:-20}
|
|
||||||
REVIEW_ENGINE: ${REVIEW_ENGINE:-spell}
|
|
||||||
OCR_ENGINE: ${OCR_ENGINE:-auto}
|
|
||||||
OLLAMA_HTR_MODEL: ${OLLAMA_HTR_MODEL:-qwen2.5vl:32b}
|
|
||||||
HTR_FALLBACK_MODEL: ${HTR_FALLBACK_MODEL:-trocr-large}
|
|
||||||
RAG_SERVICE_URL: http://bp-core-rag-service:8097
|
RAG_SERVICE_URL: http://bp-core-rag-service:8097
|
||||||
extra_hosts:
|
extra_hosts:
|
||||||
- "host.docker.internal:host-gateway"
|
- "host.docker.internal:host-gateway"
|
||||||
|
|||||||
@@ -1,114 +0,0 @@
|
|||||||
# Chunk-Browser
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
The Chunk-Browser lets you page sequentially through all chunks in a Qdrant collection. It is available as the "Chunk-Browser" tab on the RAG page (`/ai/rag`).
|
|
||||||
|
|
||||||
**URL:** `https://macmini:3002/ai/rag` → "Chunk-Browser" tab
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Features
|
|
||||||
|
|
||||||
### Collection selection
|
|
||||||
Dropdown listing all available compliance collections:
|
|
||||||
|
|
||||||
- `bp_compliance_gesetze`
|
|
||||||
- `bp_compliance_ce`
|
|
||||||
- `bp_compliance_datenschutz`
|
|
||||||
- `bp_dsfa_corpus`
|
|
||||||
- `bp_compliance_recht`
|
|
||||||
- `bp_legal_templates`
|
|
||||||
- `bp_compliance_gdpr`
|
|
||||||
- `bp_compliance_schulrecht`
|
|
||||||
- `bp_dsfa_templates`
|
|
||||||
- `bp_dsfa_risks`
|
|
||||||
|
|
||||||
### Page-by-page navigation
|
|
||||||
- 20 chunks per page
|
|
||||||
- Back/Next buttons
|
|
||||||
- Page number and chunk counter
|
|
||||||
- Cursor-based pagination via the Qdrant Scroll API
|
|
||||||
|
|
||||||
### Text search
|
|
||||||
- Filters the chunks on the currently loaded page
|
|
||||||
- Matches are highlighted in yellow
|
|
||||||
- Searches the chunk text (payload.text, payload.content, payload.chunk_text)
|
|
||||||
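The page-local text search described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes chunks are plain dictionaries, that the first non-empty field among `text`, `content`, and `chunk_text` is the one searched, and that matching is a case-insensitive substring test. The names `chunk_text` and `filter_page` are hypothetical.

```python
from __future__ import annotations

# Payload fields that may carry the chunk text, in the order named in the docs.
TEXT_FIELDS = ("text", "content", "chunk_text")


def chunk_text(chunk: dict) -> str:
    """Return the first non-empty text field of a chunk, or "" if none is set."""
    for field in TEXT_FIELDS:
        value = chunk.get(field)
        if isinstance(value, str) and value:
            return value
    return ""


def filter_page(chunks: list[dict], query: str) -> list[dict]:
    """Keep only chunks of the loaded page whose text contains the query.

    Assumption: case-insensitive substring match; an empty query keeps all chunks.
    """
    needle = query.strip().lower()
    if not needle:
        return list(chunks)
    return [c for c in chunks if needle in chunk_text(c).lower()]
```

Because the filter only sees the chunks already loaded, a term that occurs on a later page is not found until that page is fetched.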
|
|
||||||
### Chunk details
|
|
||||||
- Clicking a chunk expands all of its metadata
|
|
||||||
- Shows: regulation_code, article, language, source, licence, etc.
|
|
||||||
- Chunks carry a sequential number (#1, #2, ...)
|
|
||||||
|
|
||||||
### Integration with the Regulierungen tab
|
|
||||||
The "In Chunks suchen" button on each regulation switches to the Chunk-Browser with:
|
|
||||||
- The matching collection preselected
|
|
||||||
- The search term prefilled (regulation name)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## API
|
|
||||||
|
|
||||||
### Scroll endpoint (API proxy)
|
|
||||||
|
|
||||||
```
|
|
||||||
GET /api/legal-corpus?action=scroll&collection=bp_compliance_ce&limit=20&offset={cursor}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Parameters:**
|
|
||||||
|
|
||||||
| Parameter | Type | Description |
|
|
||||||
|-----------|-----|--------------|
|
|
||||||
| `collection` | string | Qdrant collection name |
|
|
||||||
| `limit` | number | Chunks per page (max 100) |
|
|
||||||
| `offset` | string | Cursor for the next page (optional) |
|
|
||||||
| `text_search` | string | Text search filter (optional) |
|
|
||||||
|
|
||||||
**Response:**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"chunks": [
|
|
||||||
{
|
|
||||||
"id": "uuid",
|
|
||||||
"text": "...",
|
|
||||||
"regulation_code": "GDPR",
|
|
||||||
"article": "Art. 5",
|
|
||||||
"language": "de"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"next_offset": "uuid-or-null",
|
|
||||||
"total_in_page": 20
|
|
||||||
}
|
|
||||||
```
|
|
||||||
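A client can walk a whole collection by feeding each `next_offset` back in as the `offset` of the next request. The sketch below assumes the response shape shown above and that a `null` `next_offset` marks the last page. `build_scroll_url` and `iter_chunks` are hypothetical helper names, and `fetch_page` stands in for whatever HTTP call the caller uses.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode


def build_scroll_url(collection: str, limit: int = 20, offset: Optional[str] = None) -> str:
    """Build the proxy URL for one page; offset is omitted for the first page."""
    params = {"action": "scroll", "collection": collection, "limit": limit}
    if offset is not None:
        params["offset"] = offset
    return "/api/legal-corpus?" + urlencode(params)


def iter_chunks(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield all chunks by following next_offset until the proxy returns null."""
    offset: Optional[str] = None
    while True:
        # fetch_page is a hypothetical callable wrapping the GET request
        # for build_scroll_url(..., offset=offset) and returning parsed JSON.
        page = fetch_page(offset)
        yield from page["chunks"]
        offset = page.get("next_offset")
        if offset is None:
            break
```

Since the cursor only points forward, a "Back" button would need the client to remember the offsets of pages it has already visited; that is an inference from how cursor pagination works, not something stated in this document.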
|
|
||||||
### Collection count endpoint
|
|
||||||
|
|
||||||
```
|
|
||||||
GET /api/legal-corpus?action=collection-count&collection=bp_compliance_ce
|
|
||||||
```
|
|
||||||
|
|
||||||
**Response:**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"count": 12345
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Technical details
|
|
||||||
|
|
||||||
- The API proxy talks directly to Qdrant (port 6333) via its `POST /collections/{name}/points/scroll` endpoint
|
|
||||||
- No embedding or rag-service required
|
|
||||||
- Text search is client-side (no embedding needed)
|
|
||||||
- Pagination is cursor-based (Qdrant `next_page_offset`)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Further features on the RAG page
|
|
||||||
|
|
||||||
### Original source links
|
|
||||||
Each regulation in the table has an "Originalquelle" link to the official document (EUR-Lex, gesetze-im-internet.de, etc.). The links are defined in `REGULATION_SOURCES` (88 entries).
|
|
||||||
|
|
||||||
### Low-chunk warning
|
|
||||||
Regulations with fewer than 10 chunks but an expected count of >= 10 are marked with an amber warning icon. This helps spot failed or incomplete ingestions.
|
|
||||||
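The low-chunk rule can be written as a single predicate. This is a sketch of the rule as worded above, with the threshold of 10 taken from the text; the function name and the idea of passing an expected count per regulation are assumptions.

```python
LOW_CHUNK_THRESHOLD = 10  # threshold named in the documentation


def is_low_chunk(actual: int, expected: int, threshold: int = LOW_CHUNK_THRESHOLD) -> bool:
    """True when a regulation has too few chunks although more were expected.

    Regulations that are expected to be small (expected < threshold) are never flagged.
    """
    return actual < threshold and expected >= threshold
```

The second condition keeps genuinely short regulations from being flagged as failed ingestions.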
@@ -1,468 +0,0 @@
|
|||||||
# OCR Pipeline - Step-by-Step Page Reconstruction
|
|
||||||
|
|
||||||
**Version:** 2.0.0
|
|
||||||
**Status:** In production (steps 1–8 implemented)
|
|
||||||
**URL:** https://macmini:3002/ai/ocr-pipeline
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
The OCR Pipeline breaks the OCR process into **8 individual steps** to reconstruct scanned vocabulary pages
|
|
||||||
from multi-column printed schoolbooks word by word.
|
|
||||||
Each step can be checked, corrected, and annotated with ground-truth data on its own.
|
|
||||||
|
|
||||||
**Goal:** Reconstruct 10 vocabulary pages without errors.
|
|
||||||
|
|
||||||
### Pipeline steps
|
|
||||||
|
|
||||||
| Step | Name | Description | Status |
|
|
||||||
|---------|------|--------------|--------|
|
|
||||||
| 1 | Deskew | Straighten the scan (Hough lines + word alignment) | Implemented |
|
|
||||||
| 2 | Dewarp | Flatten the book-page curvature (vertical-edge analysis) | Implemented |
|
|
||||||
| 3 | Column detection | Find invisible columns (projection profiles + word validation) | Implemented |
|
|
||||||
| 4 | Row detection | Horizontal rows + header/footer classification + gap healing | Implemented |
|
|
||||||
| 5 | Word recognition | Grid of columns x rows, OCR per cell, post-processing | Implemented |
|
|
||||||
| 6 | Correction | Character confusion + rule-based spelling correction (SSE stream) | Implemented |
|
|
||||||
| 7 | Reconstruction | Interactive cell editing on top of the image background | Implemented |
|
|
||||||
| 8 | Validation | Ground-truth comparison and quality check | Implemented |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
|
|
||||||
```
|
|
||||||
Admin-Lehrer (Next.js) klausur-service (FastAPI :8086)
|
|
||||||
┌────────────────────┐ ┌─────────────────────────────┐
|
|
||||||
│ /ai/ocr-pipeline │ │ /api/v1/ocr-pipeline/ │
|
|
||||||
│ │ REST │ │
|
|
||||||
│ PipelineStepper │◄────────►│ Sessions CRUD │
|
|
||||||
│ StepDeskew │ │ Image Serving │
|
|
||||||
│ StepDewarp │ SSE │ Deskew/Dewarp/Columns/Rows │
|
|
||||||
│ StepColumnDetection│◄────────►│ Word Recognition │
|
|
||||||
│ StepRowDetection │ │ Correction (Spell-Checker) │
|
|
||||||
│ StepWordRecognition│ │ Reconstruction │
|
|
||||||
│ StepLlmReview │ │ Ground Truth │
|
|
||||||
│ StepReconstruction │ └─────────────────────────────┘
|
|
||||||
│ StepGroundTruth │ │
|
|
||||||
└────────────────────┘ ▼
|
|
||||||
┌─────────────────────┐
|
|
||||||
│ PostgreSQL │
|
|
||||||
│ ocr_pipeline_sessions│
|
|
||||||
│ (Images + JSONB) │
|
|
||||||
└─────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
### File structure
|
|
||||||
|
|
||||||
```
|
|
||||||
klausur-service/backend/
|
|
||||||
├── ocr_pipeline_api.py # FastAPI router (all endpoints)
|
|
||||||
├── ocr_pipeline_session_store.py # PostgreSQL Persistence
|
|
||||||
├── cv_vocab_pipeline.py # Computer vision + NLP algorithms
|
|
||||||
└── migrations/
|
|
||||||
├── 002_ocr_pipeline_sessions.sql # Base schema
|
|
||||||
├── 003_add_row_result.sql # Row-result column
|
|
||||||
└── 004_add_word_result.sql # Word-result column
|
|
||||||
|
|
||||||
admin-lehrer/
|
|
||||||
├── app/(admin)/ai/ocr-pipeline/
|
|
||||||
│ ├── page.tsx # Main page with session management
|
|
||||||
│ └── types.ts # TypeScript Interfaces
|
|
||||||
└── components/ocr-pipeline/
|
|
||||||
├── PipelineStepper.tsx # Progress stepper
|
|
||||||
├── StepDeskew.tsx # Step 1: Deskew
|
|
||||||
├── StepDewarp.tsx # Step 2: Dewarp
|
|
||||||
├── StepColumnDetection.tsx # Step 3: Column detection
|
|
||||||
├── StepRowDetection.tsx # Step 4: Row detection
|
|
||||||
├── StepWordRecognition.tsx # Step 5: Word recognition
|
|
||||||
├── StepLlmReview.tsx # Step 6: Correction (SSE stream)
|
|
||||||
├── StepReconstruction.tsx # Step 7: Reconstruction (canvas)
|
|
||||||
└── StepGroundTruth.tsx # Step 8: Validation
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## API reference
|
|
||||||
|
|
||||||
All endpoints live under `/api/v1/ocr-pipeline/`.
|
|
||||||
|
|
||||||
### Sessions
|
|
||||||
|
|
||||||
| Method | Path | Description |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `POST` | `/sessions` | Create a new session (upload an image) |
|
|
||||||
| `GET` | `/sessions` | List all sessions |
|
|
||||||
| `GET` | `/sessions/{id}` | Session info with all step results |
|
|
||||||
| `PUT` | `/sessions/{id}` | Rename the session |
|
|
||||||
| `DELETE` | `/sessions/{id}` | Delete the session |
|
|
||||||
|
|
||||||
### Images
|
|
||||||
|
|
||||||
| Method | Path | Description |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `GET` | `/sessions/{id}/image/original` | Originalbild |
|
|
||||||
| `GET` | `/sessions/{id}/image/deskewed` | Begradigtes Bild |
|
|
||||||
| `GET` | `/sessions/{id}/image/dewarped` | Entzerrtes Bild |
|
|
||||||
| `GET` | `/sessions/{id}/image/binarized` | Binarisiertes Bild |
|
|
||||||
| `GET` | `/sessions/{id}/image/columns-overlay` | Spalten-Overlay |
|
|
||||||
| `GET` | `/sessions/{id}/image/rows-overlay` | Zeilen-Overlay |
|
|
||||||
| `GET` | `/sessions/{id}/image/words-overlay` | Wort-Grid-Overlay |
|
|
||||||
|
|
||||||
### Schritt 1: Begradigung
|
|
||||||
|
|
||||||
| Methode | Pfad | Beschreibung |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `POST` | `/sessions/{id}/deskew` | Automatische Begradigung |
|
|
||||||
| `POST` | `/sessions/{id}/deskew/manual` | Manuelle Winkelkorrektur |
|
|
||||||
| `POST` | `/sessions/{id}/ground-truth/deskew` | Ground Truth speichern |
|
|
||||||
|
|
||||||
### Schritt 2: Entzerrung
|
|
||||||
|
|
||||||
| Methode | Pfad | Beschreibung |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `POST` | `/sessions/{id}/dewarp` | Automatische Entzerrung |
|
|
||||||
| `POST` | `/sessions/{id}/dewarp/manual` | Manueller Scherbungswinkel |
|
|
||||||
| `POST` | `/sessions/{id}/ground-truth/dewarp` | Ground Truth speichern |
|
|
||||||
|
|
||||||
### Schritt 3: Spalten
|
|
||||||
|
|
||||||
| Methode | Pfad | Beschreibung |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `POST` | `/sessions/{id}/columns` | Automatische Spaltenerkennung |
|
|
||||||
| `POST` | `/sessions/{id}/columns/manual` | Manuelle Spalten-Definition |
|
|
||||||
| `POST` | `/sessions/{id}/ground-truth/columns` | Ground Truth speichern |
|
|
||||||
|
|
||||||
### Schritt 4: Zeilen
|
|
||||||
|
|
||||||
| Methode | Pfad | Beschreibung |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `POST` | `/sessions/{id}/rows` | Automatische Zeilenerkennung |
|
|
||||||
| `POST` | `/sessions/{id}/rows/manual` | Manuelle Zeilen-Definition |
|
|
||||||
| `POST` | `/sessions/{id}/ground-truth/rows` | Ground Truth speichern |
|
|
||||||
| `GET` | `/sessions/{id}/ground-truth/rows` | Ground Truth abrufen |
|
|
||||||
|
|
||||||
### Schritt 5: Worterkennung
|
|
||||||
|
|
||||||
| Methode | Pfad | Beschreibung |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `POST` | `/sessions/{id}/words` | Wort-Grid aus Spalten x Zeilen erstellen |
|
|
||||||
| `POST` | `/sessions/{id}/ground-truth/words` | Ground Truth speichern |
|
|
||||||
| `GET` | `/sessions/{id}/ground-truth/words` | Ground Truth abrufen |
|
|
||||||
|
|
||||||
### Schritt 6: Korrektur
|
|
||||||
|
|
||||||
| Methode | Pfad | Beschreibung |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `POST` | `/sessions/{id}/llm-review?stream=true` | SSE-Stream Korrektur starten |
|
|
||||||
| `POST` | `/sessions/{id}/llm-review/apply` | Ausgewaehlte Korrekturen speichern |
|
|
||||||
|
|
||||||
### Schritt 7: Rekonstruktion
|
|
||||||
|
|
||||||
| Methode | Pfad | Beschreibung |
|
|
||||||
|---------|------|--------------|
|
|
||||||
| `POST` | `/sessions/{id}/reconstruction` | Zellaenderungen speichern |
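
As a rough illustration of how the paths in this reference compose, the sketch below builds endpoint URLs for a single session. The helper name `session_url` and the base URL used in the usage note are assumptions made for this example; only the `/api/v1/ocr-pipeline` prefix and the path segments come from the tables above.

```python
from urllib.parse import quote, urlencode

API_PREFIX = "/api/v1/ocr-pipeline"  # prefix stated in the API reference


def session_url(base_url: str, session_id: str, *parts: str, **query: str) -> str:
    """Build the URL of a session sub-resource (hypothetical helper).

    `parts` are the path segments after /sessions/{id}, e.g. ("image", "original").
    `query` becomes the query string, e.g. stream="true".
    """
    segments = [API_PREFIX, "sessions", quote(session_id, safe=""), *parts]
    url = base_url.rstrip("/") + "/".join(segments)
    if query:
        url += "?" + urlencode(query)
    return url
```

For example, `session_url("http://localhost:8000", "abc", "llm-review", stream="true")` yields the Step 6 streaming endpoint for session `abc`; the host and port here are placeholders.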
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 3: Column Detection (Detail)

### Algorithm: `detect_column_geometry()`

Two-stage detection: vertical projection profiles find the gaps, then word bounding boxes validate them.

```
Image → Binarization → Vertical profile → Gap detection → Word validation → ColumnGeometry
```

**Key implementation details:**

- **Initial Tesseract scan:** Runs on the full image width `[left_x : w]` (not just up to the content boundary `right_x`), so that words at the right edge of the last column are not missed.
- **Last column:** Is always extended to the full image width `w`, not just to the detected content boundary.
- **Phantom column filter (Step 9):** Columns with a width < 3 % of the content width AND < 3 words are removed as artifacts; the adjacent columns close the gap.
- **Column assignment:** Each word is assigned to the column with which it has the largest horizontal overlap.
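
The overlap-based column assignment can be sketched as follows. This is a minimal illustration, not the code from `cv_vocab_pipeline.py`: the dictionary keys (`left`, `width`, `x`) and the function names are assumptions chosen for the example.

```python
from typing import Optional


def horizontal_overlap(a_left: int, a_right: int, b_left: int, b_right: int) -> int:
    """Width in pixels shared by two horizontal intervals (0 if they are disjoint)."""
    return max(0, min(a_right, b_right) - max(a_left, b_left))


def assign_column(word: dict, columns: list[dict]) -> Optional[int]:
    """Return the index of the column with the largest overlap, or None if none overlaps."""
    best_index, best_overlap = None, 0
    for index, col in enumerate(columns):
        overlap = horizontal_overlap(
            word["left"], word["left"] + word["width"],
            col["x"], col["x"] + col["width"],
        )
        # Strict comparison: on a tie the leftmost column wins.
        if overlap > best_overlap:
            best_index, best_overlap = index, overlap
    return best_index
```

A word that straddles a column boundary goes to the side that covers more of its bounding box, which is more robust than testing only the word's left edge or center.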
|
|
||||||
|
|
||||||
### Configurable Parameters

```python
# Minimum width for real columns (automatic: max(20 px, 3 % of content_w))
min_real_col_w = max(20, int(content_w * 0.03))
```
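
To show how this threshold could feed the phantom column filter described above, here is a hedged sketch. The column representation (`x`, `width`, `word_count`) and the function name are assumptions; the real filter may differ in how neighbours close the gap.

```python
def filter_phantom_columns(columns: list[dict], content_w: int) -> list[dict]:
    """Drop narrow, nearly empty columns and let the left neighbour absorb the gap."""
    min_real_col_w = max(20, int(content_w * 0.03))
    kept = [
        dict(col)  # copy so the input list is not mutated
        for col in sorted(columns, key=lambda c: c["x"])
        # A column is a phantom only if it is BOTH too narrow AND has fewer than 3 words.
        if not (col["width"] < min_real_col_w and col["word_count"] < 3)
    ]
    # Stretch each kept column up to the start of its right neighbour.
    for left, right in zip(kept, kept[1:]):
        left["width"] = right["x"] - left["x"]
    return kept
```

In this sketch the left neighbour takes over the whole freed span; splitting it between both neighbours would be an equally plausible design.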
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 4: Row Detection (Detail)

### Algorithm: `detect_row_geometry()`

Horizontal projection profiles find the gaps between rows; word-level validation prevents bad cuts.

**Additional post-processing steps:**

1. **Remove artifact rows** (`_is_artifact_row`):
   Rows in which every detected token is only 1 character long (scan shadows, empty rows)
   are classified as artifacts and removed from the grid.

2. **Gap healing** (`_heal_row_gaps`):
   After empty/artifact rows are removed, the remaining rows are extended to the middle
   of the resulting gap, so that no row content is cut off by shrinking boundaries.

```python
def _is_artifact_row(row: RowGeometry) -> bool:
    """Row is an artifact if all tokens are <= 1 character long."""
    if row.word_count == 0:
        return True
    return all(len(w.get('text', '').strip()) <= 1 for w in row.words)

def _heal_row_gaps(rows, top_bound, bottom_bound):
    """Extend the remaining rows to the middle of the gaps."""
    ...
```
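
Since the body of `_heal_row_gaps` is elided above, the following is only a sketch of the midpoint idea under stated assumptions: rows are plain dicts with `y` and `height`, and the `top_bound`/`bottom_bound` parameters are left out because their role is not described here. The name `heal_row_gaps_sketch` is hypothetical.

```python
def heal_row_gaps_sketch(rows: list[dict]) -> list[dict]:
    """Extend neighbouring rows so that they meet at the midpoint of each gap."""
    healed = [dict(r) for r in sorted(rows, key=lambda r: r["y"])]
    for upper, lower in zip(healed, healed[1:]):
        upper_bottom = upper["y"] + upper["height"]
        if lower["y"] > upper_bottom:  # a gap was left by a removed row
            mid = (upper_bottom + lower["y"]) // 2
            lower_bottom = lower["y"] + lower["height"]
            upper["height"] = mid - upper["y"]   # upper row grows downwards
            lower["y"] = mid                     # lower row grows upwards
            lower["height"] = lower_bottom - mid
    return healed
```

Rows that already touch are left unchanged, and the lower row keeps its original bottom edge, so healing one gap does not shift the next one.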
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 5: Word Recognition (Detail)

### Algorithm: `build_cell_grid()`

Step 5 uses the results of Step 3 (columns) and Step 4 (rows) to build a grid
and read each cell via OCR.

```
Columns (Step 3):     column_en  |  column_de  |  column_example
                      ───────────┼─────────────┼────────────────
Rows (Step 4):   R0 │ hello      │ hallo       │ Hello, World!
                 R1 │ world      │ Welt        │ The whole world
                 R2 │ book       │ Buch        │ Read a book
                      ───────────┼─────────────┼────────────────
```

**Procedure:**

1. **Initial scan:** Run Tesseract/RapidOCR once over the whole image → all word bounding boxes
2. **Assignment:** Assign each word to the column with the largest horizontal overlap
3. **Cell OCR fallback:** Empty cells get their own crop and another OCR call (PSM 6/7)
4. **Batch column OCR:** If a column has many empty cells, OCR the entire column once
5. **Post-processing:** Merge continuation rows, detect phonetic transcriptions, split comma-separated entries
|
|
||||||
|
|
||||||
### Post-Processing Pipeline (in `build_vocab_pipeline_streaming`)

| # | Step | Function | Description |
|---|------|----------|-------------|
| 0a | Phonetic continuation | `_merge_phonetic_continuation_rows` | Merge IPA-only follow-up rows |
| 0b | Row continuation | `_merge_continuation_rows` | Merge rows that start with a lowercase letter |
| 2 | Phonetic fix | `_fix_phonetic_brackets` | Replace the OCR phonetic transcription with dictionary IPA |
| 3 | Comma split | `_split_comma_entries` | `break, broke, broken` → 3 entries |
| 4 | Example sentences | `_attach_example_sentences` | Attach example-sentence rows to the preceding entry |
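
The comma split from the table can be illustrated with a small sketch. It assumes entries are dicts and that the split applies to an `english` field (the field name appears in the change format of Step 6); the actual `_split_comma_entries` may apply further rules that are not documented here.

```python
def split_comma_entry(entry: dict, field: str = "english") -> list[dict]:
    """Split one entry into several if `field` holds a comma-separated list."""
    parts = [p.strip() for p in entry.get(field, "").split(",") if p.strip()]
    if len(parts) <= 1:
        return [dict(entry)]  # nothing to split, return an unchanged copy
    # Every new entry keeps the other fields of the original row.
    return [{**entry, field: part} for part in parts]
```

For the input `{"english": "break, broke, broken", "german": "brechen"}` this produces three entries that share the same `german` value.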
|
|
||||||
|
|
||||||
!!! info "Character correction in Step 6"
    The character-confusion correction (`|` → `I`, `1` → `I`, `8` → `B`) does **not** run in
    Step 5; it runs first in Step 6 (correction), so that the changes are visible in the UI
    and can be undone.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 6: Correction (Detail)

### Correction Engine

Step 6 combines two correction stages, both delivered as an SSE stream:

**Stage 1: character-confusion correction** (`_fix_character_confusion`):

| OCR error | Correction | Rule |
|-----------|------------|------|
| `\|ch` | `Ich` | `\|` at the start of a word before a lowercase letter → `I` |
| `\| want` | `I want` | Standalone `\|` → `I` |
| `8en` | `Ben` | `8` at the start of a word before `en` → `B` |
| `1 want` | `I want` | Standalone `1` → `I` (NOT before `.` or `,`) |
| `1. Kreuz` | unchanged | `1.` is a list number and is **not** corrected |
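
The rules in this table can be approximated with a few regular expressions. The sketch below is an assumption about how such rules might be encoded, not the body of `_fix_character_confusion`; "standalone" is interpreted here as "delimited by whitespace or the string boundaries".

```python
import re

# (pattern, replacement) pairs, applied in order.
_CONFUSION_RULES = [
    (re.compile(r"(?<!\S)\|(?=[a-z])"), "I"),  # "|ch"    -> "Ich"
    (re.compile(r"(?<!\S)\|(?!\S)"), "I"),     # "| want" -> "I want"
    (re.compile(r"(?<!\S)8(?=en\b)"), "B"),    # "8en"    -> "Ben"
    (re.compile(r"(?<!\S)1(?!\S)"), "I"),      # "1 want" -> "I want"
]


def fix_character_confusion_sketch(text: str) -> str:
    """Apply the character-confusion rules to one text field."""
    for pattern, replacement in _CONFUSION_RULES:
        text = pattern.sub(replacement, text)
    return text
```

Because the standalone rules require whitespace (or the end of the string) after the character, list prefixes such as `1.` or `|.` fall through untouched.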
|
|
||||||
|
|
||||||
**Stage 2: rule-based spelling correction** (`spell_review_entries_streaming`):

Uses `pyspellchecker` (MIT license) with an EN+DE dictionary. For every token that contains a suspicious
character (`0`, `1`, `5`, `6`, `8`, `|`), candidates are checked:

```python
_SPELL_SUBS = {
    '0': ['O', 'o'], '1': ['l', 'I'], '5': ['S', 's'],
    '6': ['G', 'g'], '8': ['B', 'b'], '|': ['I', 'l', '1'],
}
```

Logic: candidates are validated by dictionary lookup. Structural rule: a suspicious
character at position 0 followed by an all-lowercase rest → first substitute (e.g. `8en` → `Ben`).
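
A possible shape of this candidate check is sketched below. To stay self-contained it uses a plain `set` of known words in place of `pyspellchecker`, and the order of the two rules (dictionary lookup first, structural rule as fallback) as well as the casing filter are assumptions of this example.

```python
from itertools import product

_SPELL_SUBS = {
    '0': ['O', 'o'], '1': ['l', 'I'], '5': ['S', 's'],
    '6': ['G', 'g'], '8': ['B', 'b'], '|': ['I', 'l', '1'],
}


def _case_ok(candidate: str) -> bool:
    """Reject candidates with stray capitals in the middle, e.g. 'wOrld'."""
    return candidate.isupper() or candidate[1:] == candidate[1:].lower()


def correct_token(token: str, known_words: set[str]) -> str:
    """Return a corrected token, or the token unchanged if no rule applies."""
    if not any(ch in _SPELL_SUBS for ch in token):
        return token
    # 1) Try every substitution combination and keep the first dictionary hit.
    options = [_SPELL_SUBS.get(ch, [ch]) for ch in token]
    for combo in product(*options):
        candidate = "".join(combo)
        if _case_ok(candidate) and candidate.lower() in known_words:
            return candidate
    # 2) Structural rule: suspicious first character + lowercase rest.
    head, rest = token[0], token[1:]
    if head in _SPELL_SUBS and rest.isalpha() and rest.islower():
        return _SPELL_SUBS[head][0] + rest
    return token
```

With `known_words = {"hello"}`, the token `he11o` becomes `hello`, while `8en` becomes `Ben` through the structural rule even when the dictionary does not contain it.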
|
|
||||||
|
|
||||||
### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `REVIEW_ENGINE` | `spell` | Correction engine: `spell` or `llm` |
| `OLLAMA_REVIEW_MODEL` | `qwen3:0.6b` | Ollama model (only used when `REVIEW_ENGINE=llm`) |
| `OLLAMA_REVIEW_BATCH_SIZE` | `20` | Entries per LLM call |
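
Reading these variables with their documented defaults might look like the sketch below. The `ReviewConfig` dataclass and `load_review_config` are hypothetical names for this example; only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ReviewConfig:
    engine: str        # "spell" or "llm"
    ollama_model: str  # only relevant for engine == "llm"
    batch_size: int    # entries per LLM call


def load_review_config(env: Optional[Mapping[str, str]] = None) -> ReviewConfig:
    """Read the review settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    engine = env.get("REVIEW_ENGINE", "spell")
    if engine not in ("spell", "llm"):
        raise ValueError(f"REVIEW_ENGINE must be 'spell' or 'llm', got {engine!r}")
    return ReviewConfig(
        engine=engine,
        ollama_model=env.get("OLLAMA_REVIEW_MODEL", "qwen3:0.6b"),
        batch_size=int(env.get("OLLAMA_REVIEW_BATCH_SIZE", "20")),
    )
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to test.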
|
|
||||||
|
|
||||||
### SSE Protocol

```
POST /sessions/{id}/llm-review?stream=true

Events:
  data: {"type": "meta", "total_entries": 96, "to_review": 80, "skipped": 16, "model": "spell"}
  data: {"type": "batch", "changes": [...], "entries_reviewed": [0,1,2,...], "progress": {...}}
  data: {"type": "complete", "duration_ms": 234}
  data: {"type": "error", "detail": "..."}

Change format:
  {"row_index": 5, "field": "english", "old": "| want", "new": "I want"}
```
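
A client could consume this stream roughly as sketched here. The example works on an iterable of already-decoded text lines and assumes that every event is a single `data:` line containing one JSON object, as in the sample above; the function names are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of every `data:` line, skipping everything else."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


def collect_changes(lines: Iterable[str]) -> list[dict]:
    """Gather all changes from `batch` events until `complete` arrives."""
    changes: list[dict] = []
    for event in iter_sse_events(lines):
        if event["type"] == "error":
            raise RuntimeError(event.get("detail", "unknown error"))
        if event["type"] == "batch":
            changes.extend(event.get("changes", []))
        elif event["type"] == "complete":
            break
    return changes
```

In a real client the `lines` iterable would come from a streaming HTTP response; that part is left out so the sketch stays independent of any HTTP library.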
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 7: Reconstruction (Detail)

Interactive canvas editor: the dewarped original image is shown as the background at 30 % opacity,
and all grid cells (including empty ones) are laid over it as editable text fields.

**Features:**

- All cells are editable, including empty ones (no filter anymore)
- Color coding by column type (blue = EN, green = DE, orange = example)
- Empty required fields (EN/DE) are marked red and dashed
- Undo/redo (Ctrl+Z / Ctrl+Shift+Z)
- Tab navigation through all cells (including empty ones)
- Zoom 50–200 %
- Per-cell reset button on changed cells

```
POST /sessions/{id}/reconstruction
Body: {"cells": [{"cell_id": "r5_c2", "text": "corrected text"}]}
```
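
Building such a request body from edited cells can be sketched as follows. The `r{row}_c{col}` pattern is inferred from the sample IDs `r5_c2` and `r0_c0`, and sending only the changed cells is an assumption based on the endpoint description; both helper names are hypothetical.

```python
def cell_id(row: int, col: int) -> str:
    """Format a cell ID in the style of the samples, e.g. 'r5_c2'."""
    return f"r{row}_c{col}"


def build_reconstruction_payload(original: dict[str, str], edited: dict[str, str]) -> dict:
    """Return the request body containing only cells whose text differs from the original."""
    cells = [
        {"cell_id": cid, "text": text}
        for cid, text in edited.items()
        if original.get(cid) != text
    ]
    return {"cells": cells}
```

A cell that was empty before and now contains text is treated as a change as well, because `original.get` returns `None` for unknown IDs.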
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Database Schema

```sql
CREATE TABLE ocr_pipeline_sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255),
    filename VARCHAR(255),
    status VARCHAR(50) DEFAULT 'active',
    current_step INT DEFAULT 1,

    -- Images (BYTEA)
    original_png BYTEA,
    deskewed_png BYTEA,
    binarized_png BYTEA,
    dewarped_png BYTEA,

    -- Step results (JSONB)
    deskew_result JSONB,
    dewarp_result JSONB,
    column_result JSONB,
    row_result JSONB,
    word_result JSONB,  -- contains vocab_entries, cells, llm_review

    -- Ground truth + metadata
    ground_truth JSONB,
    auto_shear_degrees REAL,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);
```

Structure of the `word_result` JSONB column:

```json
{
  "vocab_entries": [...],
  "cells": [{"cell_id": "r0_c0", "text": "hello", "bbox_pct": {...}, ...}],
  "columns_used": [...],
  "llm_review": {
    "changes": [{"row_index": 5, "field": "english", "old": "...", "new": "..."}],
    "model_used": "spell",
    "duration_ms": 234
  }
}
```
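
How the stored `changes` relate to `vocab_entries` can be illustrated with a short sketch. It assumes that `row_index` is a position in the `vocab_entries` list and that entries are dicts keyed by field name; neither is confirmed by the schema above, so treat this as one possible reading.

```python
import copy


def apply_changes(vocab_entries: list[dict], changes: list[dict]) -> tuple[list[dict], int]:
    """Apply review changes to a copy of the entries; return the copy and the applied count."""
    updated = copy.deepcopy(vocab_entries)
    applied = 0
    for change in changes:
        idx, field = change["row_index"], change["field"]
        # Only apply a change if the entry still holds the expected old value.
        if 0 <= idx < len(updated) and updated[idx].get(field) == change["old"]:
            updated[idx][field] = change["new"]
            applied += 1
    return updated, applied
```

Checking `old` before writing guards against applying a correction to an entry that was edited in the meantime.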
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Dependencies

### Python (klausur-service)

| Package | Version | License | Purpose |
|---------|---------|---------|---------|
| `pytesseract` | ≥0.3.10 | Apache-2.0 | Main OCR (Steps 3–5) |
| `opencv-python-headless` | ≥4.8.0 | Apache-2.0 | Image processing, projection profiles |
| `Pillow` | ≥10.0.0 | HPND (MIT-compatible) | Image conversion |
| `rapidocr` | latest | Apache-2.0 | Fast OCR (ARM64 via ONNX) |
| `onnxruntime` | latest | MIT | ONNX inference for RapidOCR |
| `pyspellchecker` | ≥0.8.1 | MIT | Rule-based OCR correction (Step 6) |
| `eng-to-ipa` | latest | MIT | IPA phonetic lookup (Step 5) |

!!! info "pyspellchecker (new since 2026-03)"
    `pyspellchecker` (MIT license) replaces the LLM-based correction as the default engine.
    EN+DE dictionary, ~134k words. No Ollama needed.
    Set `REVIEW_ENGINE=llm` to switch to the LLM path.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Known Limitations

| Problem | Cause | Workaround |
|---------|-------|------------|
| Pages printed at an angle | Deskew detects text rotation, not page rotation | Manual angle |
| Very small print (< 8 pt) | Tesseract PSM 7 needs a minimum character size | Zoom in beforehand |
| Handwritten entries | Tesseract/RapidOCR are optimized for printed text | TrOCR engine (planned) |
| More than 4 columns | Columns can merge in the projection profile | Manual columns |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Deployment

```bash
# 1. Git push
git push origin main

# 2. Pull and build on the Mac Mini
ssh macmini "cd /Users/benjaminadmin/Projekte/breakpilot-lehrer && git pull --no-rebase origin main"

# klausur-service (backend); if requirements.txt changed, rebuild klausur-base first
ssh macmini "cd /Users/benjaminadmin/Projekte/breakpilot-lehrer && \
  /usr/local/bin/docker compose build klausur-service && \
  /usr/local/bin/docker compose up -d klausur-service"

# admin-lehrer (frontend)
ssh macmini "cd /Users/benjaminadmin/Projekte/breakpilot-lehrer && \
  /usr/local/bin/docker compose build admin-lehrer && \
  /usr/local/bin/docker compose up -d admin-lehrer"

# 3. Test at:
#    https://macmini:3002/ai/ocr-pipeline
```

!!! warning "Rebuild the base image for new Python packages"
    If `requirements.txt` changes (e.g. a new package is added), the base image
    must be rebuilt first:
    ```bash
    ssh macmini "cd ~/Projekte/breakpilot-lehrer && \
      /usr/local/bin/docker build -f klausur-service/Dockerfile.base \
      -t klausur-base:latest klausur-service/"
    ```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Changelog

| Date | Version | Change |
|------|---------|--------|
| 2026-03-03 | 2.0.0 | Steps 6–7 implemented; spell checker, reconstruction canvas |
| 2026-03-03 | 1.5.0 | Column detection: full image width for the initial scan, phantom filter |
| 2026-03-03 | 1.4.0 | Row detection: remove artifact rows + gap healing |
| 2026-03-03 | 1.3.0 | Character correction: `1.`/`\|.` list prefixes are not turned into `I.` |
| 2026-03-03 | 1.2.0 | LLM engine replaced by spell checker (REVIEW_ENGINE=spell) |
| 2026-02-28 | 1.0.0 | Step 5 (word recognition) implemented |
| 2026-02-22 | 0.4.0 | Step 4 (row detection) implemented |
| 2026-02-20 | 0.3.0 | Step 3 (column detection) with type classification |
| 2026-02-15 | 0.2.0 | Step 2 (dewarp) |
| 2026-02-12 | 0.1.0 | Step 1 (deskew) + session management |
|
|
||||||
@@ -8,15 +8,24 @@ RUN npm install
|
|||||||
COPY frontend/ ./
|
COPY frontend/ ./
|
||||||
RUN npm run build
|
RUN npm run build
|
||||||
|
|
||||||
# Production stage — uses pre-built base with Tesseract + Python deps.
|
# Production stage
|
||||||
# Base image contains: python:3.11-slim + tesseract-ocr + all pip packages.
|
FROM python:3.11-slim
|
||||||
# Rebuild base only when requirements.txt or system deps change:
|
|
||||||
# docker build -f klausur-service/Dockerfile.base -t klausur-base:latest klausur-service/
|
|
||||||
FROM klausur-base:latest
|
|
||||||
|
|
||||||
WORKDIR /app
|
WORKDIR /app
|
||||||
|
|
||||||
# Copy backend code (this is the only layer that changes on code edits)
|
# Install system dependencies (incl. Tesseract OCR for bounding-box extraction)
|
||||||
|
RUN apt-get update && apt-get install -y --no-install-recommends \
|
||||||
|
curl \
|
||||||
|
tesseract-ocr \
|
||||||
|
tesseract-ocr-deu \
|
||||||
|
tesseract-ocr-eng \
|
||||||
|
&& rm -rf /var/lib/apt/lists/*
|
||||||
|
|
||||||
|
# Install Python dependencies
|
||||||
|
COPY backend/requirements.txt ./
|
||||||
|
RUN pip install --no-cache-dir -r requirements.txt
|
||||||
|
|
||||||
|
# Copy backend code
|
||||||
COPY backend/ ./
|
COPY backend/ ./
|
||||||
|
|
||||||
# Copy built frontend to the expected path
|
# Copy built frontend to the expected path
|
||||||
|
|||||||
@@ -1,26 +0,0 @@
|
|||||||
# Base image with system dependencies + Python packages.
|
|
||||||
# These change rarely — build once, reuse on every --no-cache.
|
|
||||||
#
|
|
||||||
# Rebuild manually when requirements.txt or system deps change:
|
|
||||||
# docker build -f klausur-service/Dockerfile.base -t klausur-base:latest klausur-service/
|
|
||||||
#
|
|
||||||
FROM python:3.11-slim
|
|
||||||
|
|
||||||
WORKDIR /app
|
|
||||||
|
|
||||||
# System dependencies (Tesseract OCR, curl for healthcheck)
|
|
||||||
RUN apt-get update && apt-get install -y --no-install-recommends \
|
|
||||||
curl \
|
|
||||||
tesseract-ocr \
|
|
||||||
tesseract-ocr-deu \
|
|
||||||
tesseract-ocr-eng \
|
|
||||||
libgl1 \
|
|
||||||
libglib2.0-0 \
|
|
||||||
&& rm -rf /var/lib/apt/lists/*
|
|
||||||
|
|
||||||
# Python dependencies
|
|
||||||
COPY backend/requirements.txt ./
|
|
||||||
RUN pip install --no-cache-dir -r requirements.txt
|
|
||||||
|
|
||||||
# Clean up pip cache
|
|
||||||
RUN rm -rf /root/.cache/pip
|
|
||||||
File diff suppressed because it is too large
File diff suppressed because one or more lines are too long
@@ -1,276 +0,0 @@
|
|||||||
"""
|
|
||||||
Handwriting HTR API - high-quality handwritten text recognition (HTR) for exam grading.

Endpoints:
- POST /api/v1/htr/recognize          - Upload an image → handwritten text
- POST /api/v1/htr/recognize-session  - Use an OCR pipeline session as the source

Model strategy:
  1. qwen2.5vl:32b via Ollama (primary, highest quality as a VLM)
  2. microsoft/trocr-large-handwritten (fallback, offline, no Ollama)

PRIVACY: All processing happens locally on the Mac Mini.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import io
|
|
||||||
import os
|
|
||||||
import logging
|
|
||||||
import time
|
|
||||||
import base64
|
|
||||||
from typing import Optional
|
|
||||||
|
|
||||||
import cv2
|
|
||||||
import numpy as np
|
|
||||||
from fastapi import APIRouter, HTTPException, Query, UploadFile, File
|
|
||||||
from pydantic import BaseModel
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
router = APIRouter(prefix="/api/v1/htr", tags=["HTR"])
|
|
||||||
|
|
||||||
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://host.docker.internal:11434")
|
|
||||||
OLLAMA_HTR_MODEL = os.getenv("OLLAMA_HTR_MODEL", "qwen2.5vl:32b")
|
|
||||||
HTR_FALLBACK_MODEL = os.getenv("HTR_FALLBACK_MODEL", "trocr-large")
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Pydantic Models
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
class HTRSessionRequest(BaseModel):
|
|
||||||
session_id: str
|
|
||||||
model: str = "auto" # "auto" | "qwen2.5vl" | "trocr-large"
|
|
||||||
use_clean: bool = True # Prefer clean_png (after handwriting removal)
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Preprocessing
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
def _preprocess_for_htr(img_bgr: np.ndarray) -> np.ndarray:
|
|
||||||
"""
|
|
||||||
CLAHE contrast enhancement + upscale to improve HTR accuracy.
|
|
||||||
Returns grayscale enhanced image.
|
|
||||||
"""
|
|
||||||
gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
|
|
||||||
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
|
|
||||||
enhanced = clahe.apply(gray)
|
|
||||||
|
|
||||||
# Upscale if image is too small
|
|
||||||
h, w = enhanced.shape
|
|
||||||
if min(h, w) < 800:
|
|
||||||
scale = 800 / min(h, w)
|
|
||||||
enhanced = cv2.resize(
|
|
||||||
enhanced, None, fx=scale, fy=scale,
|
|
||||||
interpolation=cv2.INTER_CUBIC
|
|
||||||
)
|
|
||||||
|
|
||||||
return enhanced
|
|
||||||
|
|
||||||
|
|
||||||
def _bgr_to_png_bytes(img_bgr: np.ndarray) -> bytes:
|
|
||||||
"""Convert BGR ndarray to PNG bytes."""
|
|
||||||
success, buf = cv2.imencode(".png", img_bgr)
|
|
||||||
if not success:
|
|
||||||
raise RuntimeError("Failed to encode image to PNG")
|
|
||||||
return buf.tobytes()
|
|
||||||
|
|
||||||
|
|
||||||
def _preprocess_image_bytes(image_bytes: bytes) -> bytes:
|
|
||||||
"""Load image, apply HTR preprocessing, return PNG bytes."""
|
|
||||||
arr = np.frombuffer(image_bytes, dtype=np.uint8)
|
|
||||||
img_bgr = cv2.imdecode(arr, cv2.IMREAD_COLOR)
|
|
||||||
if img_bgr is None:
|
|
||||||
raise ValueError("Could not decode image")
|
|
||||||
|
|
||||||
enhanced = _preprocess_for_htr(img_bgr)
|
|
||||||
# Convert grayscale back to BGR for encoding
|
|
||||||
enhanced_bgr = cv2.cvtColor(enhanced, cv2.COLOR_GRAY2BGR)
|
|
||||||
return _bgr_to_png_bytes(enhanced_bgr)
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Backend: Ollama qwen2.5vl
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
async def _recognize_with_qwen_vl(image_bytes: bytes, language: str) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Send image to Ollama qwen2.5vl:32b for HTR.
|
|
||||||
Returns extracted text or None on error.
|
|
||||||
"""
|
|
||||||
import httpx
|
|
||||||
|
|
||||||
lang_hint = {
|
|
||||||
"de": "Deutsch",
|
|
||||||
"en": "Englisch",
|
|
||||||
"de+en": "Deutsch und Englisch",
|
|
||||||
}.get(language, "Deutsch")
|
|
||||||
|
|
||||||
prompt = (
|
|
||||||
f"Du bist ein OCR-Experte fuer handgeschriebenen Text auf {lang_hint}. "
|
|
||||||
"Lies den Text im Bild exakt ab — korrigiere KEINE Rechtschreibfehler. "
|
|
||||||
"Antworte NUR mit dem erkannten Text, ohne Erklaerungen."
|
|
||||||
)
|
|
||||||
|
|
||||||
img_b64 = base64.b64encode(image_bytes).decode("utf-8")
|
|
||||||
|
|
||||||
payload = {
|
|
||||||
"model": OLLAMA_HTR_MODEL,
|
|
||||||
"prompt": prompt,
|
|
||||||
"images": [img_b64],
|
|
||||||
"stream": False,
|
|
||||||
}
|
|
||||||
|
|
||||||
try:
|
|
||||||
async with httpx.AsyncClient(timeout=120.0) as client:
|
|
||||||
resp = await client.post(f"{OLLAMA_BASE_URL}/api/generate", json=payload)
|
|
||||||
resp.raise_for_status()
|
|
||||||
data = resp.json()
|
|
||||||
return data.get("response", "").strip()
|
|
||||||
except Exception as e:
|
|
||||||
logger.warning(f"Ollama qwen2.5vl HTR failed: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Backend: TrOCR-large fallback
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
async def _recognize_with_trocr_large(image_bytes: bytes) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Use microsoft/trocr-large-handwritten via trocr_service.py.
|
|
||||||
Returns extracted text or None on error.
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
from services.trocr_service import run_trocr_ocr, _check_trocr_available
|
|
||||||
if not _check_trocr_available():
|
|
||||||
logger.warning("TrOCR not available for HTR fallback")
|
|
||||||
return None
|
|
||||||
|
|
||||||
text, confidence = await run_trocr_ocr(image_bytes, handwritten=True, size="large")
|
|
||||||
return text.strip() if text else None
|
|
||||||
except Exception as e:
|
|
||||||
logger.warning(f"TrOCR-large HTR failed: {e}")
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Core recognition logic
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
async def _do_recognize(
    image_bytes: bytes,
    model: str = "auto",
    preprocess: bool = True,
    language: str = "de",
) -> dict:
    """
    Core HTR logic: preprocess → try Ollama → fallback to TrOCR-large.
    Returns dict with text, model_used, processing_time_ms, language, preprocessed.
    """
    t0 = time.monotonic()

    if preprocess:
        try:
            image_bytes = _preprocess_image_bytes(image_bytes)
        except Exception as e:
            logger.warning(f"HTR preprocessing failed, using raw image: {e}")

    text: Optional[str] = None
    model_used: str = "none"

    use_qwen = model in ("auto", "qwen2.5vl")
    # No backend has run yet, so TrOCR is enabled for every supported model
    # value: it is the direct choice for "trocr-large" and the fallback
    # whenever the Ollama attempt returns None.
    use_trocr = model in ("auto", "trocr-large") or use_qwen

    if use_qwen:
        text = await _recognize_with_qwen_vl(image_bytes, language)
        if text is not None:
            model_used = f"qwen2.5vl ({OLLAMA_HTR_MODEL})"

    if text is None and use_trocr:
        text = await _recognize_with_trocr_large(image_bytes)
        if text is not None:
            model_used = "trocr-large-handwritten"

    if text is None:
        text = ""
        model_used = "none (all backends failed)"

    elapsed_ms = int((time.monotonic() - t0) * 1000)

    return {
        "text": text,
        "model_used": model_used,
        "processing_time_ms": elapsed_ms,
        "language": language,
        "preprocessed": preprocess,
    }
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Endpoints
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
@router.post("/recognize")
|
|
||||||
async def recognize_handwriting(
|
|
||||||
file: UploadFile = File(...),
|
|
||||||
model: str = Query("auto", description="auto | qwen2.5vl | trocr-large"),
|
|
||||||
preprocess: bool = Query(True, description="Apply CLAHE + upscale before recognition"),
|
|
||||||
language: str = Query("de", description="de | en | de+en"),
|
|
||||||
):
|
|
||||||
"""
|
|
||||||
Upload an image and get back the handwritten text as plain text.
|
|
||||||
|
|
||||||
Tries qwen2.5vl:32b via Ollama first, falls back to TrOCR-large-handwritten.
|
|
||||||
"""
|
|
||||||
if model not in ("auto", "qwen2.5vl", "trocr-large"):
|
|
||||||
raise HTTPException(status_code=400, detail="model must be one of: auto, qwen2.5vl, trocr-large")
|
|
||||||
if language not in ("de", "en", "de+en"):
|
|
||||||
raise HTTPException(status_code=400, detail="language must be one of: de, en, de+en")
|
|
||||||
|
|
||||||
image_bytes = await file.read()
|
|
||||||
if not image_bytes:
|
|
||||||
raise HTTPException(status_code=400, detail="Empty file")
|
|
||||||
|
|
||||||
return await _do_recognize(image_bytes, model=model, preprocess=preprocess, language=language)
|
|
||||||
|
|
||||||
|
|
||||||
@router.post("/recognize-session")
|
|
||||||
async def recognize_from_session(req: HTRSessionRequest):
|
|
||||||
"""
|
|
||||||
Use an OCR-Pipeline session as image source for HTR.
|
|
||||||
|
|
||||||
Set use_clean=true to prefer the clean image (after handwriting removal step).
|
|
||||||
This is useful when you want to do HTR on isolated handwriting regions.
|
|
||||||
"""
|
|
||||||
from ocr_pipeline_session_store import get_session_db, get_session_image
|
|
||||||
|
|
||||||
session = await get_session_db(req.session_id)
|
|
||||||
if not session:
|
|
||||||
raise HTTPException(status_code=404, detail=f"Session {req.session_id} not found")
|
|
||||||
|
|
||||||
# Choose source image
|
|
||||||
image_bytes: Optional[bytes] = None
|
|
||||||
source_used: str = ""
|
|
||||||
|
|
||||||
if req.use_clean:
|
|
||||||
image_bytes = await get_session_image(req.session_id, "clean")
|
|
||||||
if image_bytes:
|
|
||||||
source_used = "clean"
|
|
||||||
|
|
||||||
if not image_bytes:
|
|
||||||
image_bytes = await get_session_image(req.session_id, "deskewed")
|
|
||||||
if image_bytes:
|
|
||||||
source_used = "deskewed"
|
|
||||||
|
|
||||||
if not image_bytes:
|
|
||||||
image_bytes = await get_session_image(req.session_id, "original")
|
|
||||||
source_used = "original"
|
|
||||||
|
|
||||||
if not image_bytes:
|
|
||||||
raise HTTPException(status_code=404, detail="No image available in session")
|
|
||||||
|
|
||||||
result = await _do_recognize(image_bytes, model=req.model)
|
|
||||||
result["session_id"] = req.session_id
|
|
||||||
result["source_image"] = source_used
|
|
||||||
return result
|
|
||||||
@@ -42,12 +42,6 @@ try:
|
|||||||
except ImportError:
|
except ImportError:
|
||||||
trocr_router = None
|
trocr_router = None
|
||||||
from vocab_worksheet_api import router as vocab_router, set_db_pool as set_vocab_db_pool, _init_vocab_table, _load_all_sessions, DATABASE_URL as VOCAB_DATABASE_URL
|
from vocab_worksheet_api import router as vocab_router, set_db_pool as set_vocab_db_pool, _init_vocab_table, _load_all_sessions, DATABASE_URL as VOCAB_DATABASE_URL
|
||||||
from ocr_pipeline_api import router as ocr_pipeline_router
|
|
||||||
from ocr_pipeline_session_store import init_ocr_pipeline_tables
|
|
||||||
try:
|
|
||||||
from handwriting_htr_api import router as htr_router
|
|
||||||
except ImportError:
|
|
||||||
htr_router = None
|
|
||||||
try:
|
try:
|
||||||
from dsfa_rag_api import router as dsfa_rag_router, set_db_pool as set_dsfa_db_pool
|
from dsfa_rag_api import router as dsfa_rag_router, set_db_pool as set_dsfa_db_pool
|
||||||
from dsfa_corpus_ingestion import DSFAQdrantService, DATABASE_URL as DSFA_DATABASE_URL
|
from dsfa_corpus_ingestion import DSFAQdrantService, DATABASE_URL as DSFA_DATABASE_URL
|
||||||
@@ -81,13 +75,6 @@ async def lifespan(app: FastAPI):
|
|||||||
except Exception as e:
|
except Exception as e:
|
||||||
print(f"Warning: Vocab sessions database initialization failed: {e}")
|
print(f"Warning: Vocab sessions database initialization failed: {e}")
|
||||||
|
|
||||||
# Initialize OCR Pipeline session tables
|
|
||||||
try:
|
|
||||||
await init_ocr_pipeline_tables()
|
|
||||||
print("OCR Pipeline session tables initialized")
|
|
||||||
except Exception as e:
|
|
||||||
print(f"Warning: OCR Pipeline tables initialization failed: {e}")
|
|
||||||
|
|
||||||
# Initialize database pool for DSFA RAG
|
# Initialize database pool for DSFA RAG
|
||||||
dsfa_db_pool = None
|
dsfa_db_pool = None
|
||||||
if DSFA_DATABASE_URL and set_dsfa_db_pool:
|
if DSFA_DATABASE_URL and set_dsfa_db_pool:
|
||||||
@@ -117,19 +104,6 @@ async def lifespan(app: FastAPI):
|
|||||||
# Ensure EH upload directory exists
|
# Ensure EH upload directory exists
|
||||||
os.makedirs(EH_UPLOAD_DIR, exist_ok=True)
|
os.makedirs(EH_UPLOAD_DIR, exist_ok=True)
|
||||||
|
|
||||||
# Preload LightOnOCR model if OCR_ENGINE=lighton (avoids cold-start on first request)
|
|
||||||
ocr_engine_env = os.getenv("OCR_ENGINE", "auto")
|
|
||||||
if ocr_engine_env == "lighton":
|
|
||||||
try:
|
|
||||||
import asyncio
|
|
||||||
from services.lighton_ocr_service import get_lighton_model
|
|
||||||
loop = asyncio.get_running_loop()
|
|
||||||
print("Preloading LightOnOCR-2-1B at startup (OCR_ENGINE=lighton)...")
|
|
||||||
await loop.run_in_executor(None, get_lighton_model)
|
|
||||||
print("LightOnOCR-2-1B preloaded")
|
|
||||||
except Exception as e:
|
|
||||||
print(f"Warning: LightOnOCR preload failed: {e}")
|
|
||||||
|
|
||||||
yield
|
yield
|
||||||
|
|
||||||
print("Klausur-Service shutting down...")
|
print("Klausur-Service shutting down...")
|
||||||
@@ -176,9 +150,6 @@ app.include_router(mail_router) # Unified Inbox Mail
|
|||||||
if trocr_router:
|
if trocr_router:
|
||||||
app.include_router(trocr_router) # TrOCR Handwriting OCR
|
app.include_router(trocr_router) # TrOCR Handwriting OCR
|
||||||
app.include_router(vocab_router) # Vocabulary Worksheet Generator
|
app.include_router(vocab_router) # Vocabulary Worksheet Generator
|
||||||
app.include_router(ocr_pipeline_router) # OCR Pipeline (step-by-step)
|
|
||||||
if htr_router:
|
|
||||||
app.include_router(htr_router) # Handwriting HTR (Klausur)
|
|
||||||
if dsfa_rag_router:
|
if dsfa_rag_router:
|
||||||
app.include_router(dsfa_rag_router) # DSFA RAG Corpus Search
|
app.include_router(dsfa_rag_router) # DSFA RAG Corpus Search
|
||||||
|
|
||||||
|
|||||||
@@ -1,28 +0,0 @@
|
|||||||
-- OCR Pipeline Sessions - Persistent session storage
|
|
||||||
-- Applied automatically by ocr_pipeline_session_store.init_ocr_pipeline_tables()
|
|
||||||
|
|
||||||
CREATE TABLE IF NOT EXISTS ocr_pipeline_sessions (
|
|
||||||
id UUID PRIMARY KEY,
|
|
||||||
name VARCHAR(255) NOT NULL,
|
|
||||||
filename VARCHAR(255),
|
|
||||||
status VARCHAR(50) DEFAULT 'active',
|
|
||||||
current_step INT DEFAULT 1,
|
|
||||||
original_png BYTEA,
|
|
||||||
deskewed_png BYTEA,
|
|
||||||
binarized_png BYTEA,
|
|
||||||
dewarped_png BYTEA,
|
|
||||||
deskew_result JSONB,
|
|
||||||
dewarp_result JSONB,
|
|
||||||
column_result JSONB,
|
|
||||||
ground_truth JSONB DEFAULT '{}',
|
|
||||||
auto_shear_degrees FLOAT,
|
|
||||||
created_at TIMESTAMP DEFAULT NOW(),
|
|
||||||
updated_at TIMESTAMP DEFAULT NOW()
|
|
||||||
);
|
|
||||||
|
|
||||||
-- Index for listing sessions
|
|
||||||
CREATE INDEX IF NOT EXISTS idx_ocr_pipeline_sessions_created
|
|
||||||
ON ocr_pipeline_sessions (created_at DESC);
|
|
||||||
|
|
||||||
CREATE INDEX IF NOT EXISTS idx_ocr_pipeline_sessions_status
|
|
||||||
ON ocr_pipeline_sessions (status);
|
|
||||||
@@ -1,4 +0,0 @@
|
|||||||
-- Migration 003: Add row_result column for row geometry detection
|
|
||||||
-- Stores detected row geometries including header/footer classification
|
|
||||||
|
|
||||||
ALTER TABLE ocr_pipeline_sessions ADD COLUMN IF NOT EXISTS row_result JSONB;
|
|
||||||
@@ -1,4 +0,0 @@
|
|||||||
-- Migration 004: Add word_result column for OCR Pipeline Step 5
|
|
||||||
-- Stores the word recognition grid result (entries with english/german/example + bboxes)
|
|
||||||
|
|
||||||
ALTER TABLE ocr_pipeline_sessions ADD COLUMN IF NOT EXISTS word_result JSONB;
|
|
||||||
@@ -1,7 +0,0 @@
|
|||||||
-- Migration 005: Add document type detection columns
|
|
||||||
-- These columns store the result of automatic document type detection
|
|
||||||
-- (vocab_table, full_text, generic_table) after dewarp.
|
|
||||||
|
|
||||||
ALTER TABLE ocr_pipeline_sessions
|
|
||||||
ADD COLUMN IF NOT EXISTS doc_type VARCHAR(50),
|
|
||||||
ADD COLUMN IF NOT EXISTS doc_type_result JSONB;
|
|
||||||
File diff suppressed because it is too large
@@ -1,243 +0,0 @@
|
|||||||
"""
|
|
||||||
OCR Pipeline Session Store - PostgreSQL persistence for OCR pipeline sessions.
|
|
||||||
|
|
||||||
Replaces in-memory storage with database persistence.
|
|
||||||
See migrations/002_ocr_pipeline_sessions.sql for schema.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import os
|
|
||||||
import uuid
|
|
||||||
import logging
|
|
||||||
import json
|
|
||||||
from typing import Optional, List, Dict, Any
|
|
||||||
|
|
||||||
import asyncpg
|
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
# Database configuration (same as vocab_session_store)
|
|
||||||
DATABASE_URL = os.getenv(
|
|
||||||
"DATABASE_URL",
|
|
||||||
"postgresql://breakpilot:breakpilot@postgres:5432/breakpilot_db"
|
|
||||||
)
|
|
||||||
|
|
||||||
# Connection pool (initialized lazily)
|
|
||||||
_pool: Optional[asyncpg.Pool] = None
|
|
||||||
|
|
||||||
|
|
||||||
async def get_pool() -> asyncpg.Pool:
|
|
||||||
"""Get or create the database connection pool."""
|
|
||||||
global _pool
|
|
||||||
if _pool is None:
|
|
||||||
_pool = await asyncpg.create_pool(DATABASE_URL, min_size=2, max_size=10)
|
|
||||||
return _pool
|
|
||||||
|
|
||||||
|
|
||||||
async def init_ocr_pipeline_tables():
|
|
||||||
"""Initialize OCR pipeline tables if they don't exist."""
|
|
||||||
pool = await get_pool()
|
|
||||||
async with pool.acquire() as conn:
|
|
||||||
tables_exist = await conn.fetchval("""
|
|
||||||
SELECT EXISTS (
|
|
||||||
SELECT FROM information_schema.tables
|
|
||||||
WHERE table_name = 'ocr_pipeline_sessions'
|
|
||||||
)
|
|
||||||
""")
|
|
||||||
|
|
||||||
if not tables_exist:
|
|
||||||
logger.info("Creating OCR pipeline tables...")
|
|
||||||
migration_path = os.path.join(
|
|
||||||
os.path.dirname(__file__),
|
|
||||||
"migrations/002_ocr_pipeline_sessions.sql"
|
|
||||||
)
|
|
||||||
if os.path.exists(migration_path):
|
|
||||||
with open(migration_path, "r") as f:
|
|
||||||
sql = f.read()
|
|
||||||
await conn.execute(sql)
|
|
||||||
logger.info("OCR pipeline tables created successfully")
|
|
||||||
else:
|
|
||||||
logger.warning(f"Migration file not found: {migration_path}")
|
|
||||||
else:
|
|
||||||
logger.debug("OCR pipeline tables already exist")
|
|
||||||
|
|
||||||
# Ensure new columns exist (idempotent ALTER TABLE)
|
|
||||||
await conn.execute("""
|
|
||||||
ALTER TABLE ocr_pipeline_sessions
|
|
||||||
ADD COLUMN IF NOT EXISTS clean_png BYTEA,
|
|
||||||
ADD COLUMN IF NOT EXISTS handwriting_removal_meta JSONB,
|
|
||||||
ADD COLUMN IF NOT EXISTS doc_type VARCHAR(50),
|
|
||||||
ADD COLUMN IF NOT EXISTS doc_type_result JSONB
|
|
||||||
""")
|
|
||||||
|
|
||||||
|
|
||||||
# =============================================================================
|
|
||||||
# SESSION CRUD
|
|
||||||
# =============================================================================
|
|
||||||
|
|
||||||
async def create_session_db(
|
|
||||||
session_id: str,
|
|
||||||
name: str,
|
|
||||||
filename: str,
|
|
||||||
original_png: bytes,
|
|
||||||
) -> Dict[str, Any]:
|
|
||||||
"""Create a new OCR pipeline session."""
|
|
||||||
pool = await get_pool()
|
|
||||||
async with pool.acquire() as conn:
|
|
||||||
row = await conn.fetchrow("""
|
|
||||||
INSERT INTO ocr_pipeline_sessions (
|
|
||||||
id, name, filename, original_png, status, current_step
|
|
||||||
) VALUES ($1, $2, $3, $4, 'active', 1)
|
|
||||||
RETURNING id, name, filename, status, current_step,
|
|
||||||
deskew_result, dewarp_result, column_result, row_result,
|
|
||||||
word_result, ground_truth, auto_shear_degrees,
|
|
||||||
doc_type, doc_type_result,
|
|
||||||
created_at, updated_at
|
|
||||||
""", uuid.UUID(session_id), name, filename, original_png)
|
|
||||||
|
|
||||||
return _row_to_dict(row)
|
|
||||||
|
|
||||||
|
|
||||||
async def get_session_db(session_id: str) -> Optional[Dict[str, Any]]:
|
|
||||||
"""Get session metadata (without images)."""
|
|
||||||
pool = await get_pool()
|
|
||||||
async with pool.acquire() as conn:
|
|
||||||
row = await conn.fetchrow("""
|
|
||||||
SELECT id, name, filename, status, current_step,
|
|
||||||
deskew_result, dewarp_result, column_result, row_result,
|
|
||||||
word_result, ground_truth, auto_shear_degrees,
|
|
||||||
doc_type, doc_type_result,
|
|
||||||
created_at, updated_at
|
|
||||||
FROM ocr_pipeline_sessions WHERE id = $1
|
|
||||||
""", uuid.UUID(session_id))
|
|
||||||
|
|
||||||
if row:
|
|
||||||
return _row_to_dict(row)
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
async def get_session_image(session_id: str, image_type: str) -> Optional[bytes]:
|
|
||||||
"""Load a single image (BYTEA) from the session."""
|
|
||||||
column_map = {
|
|
||||||
"original": "original_png",
|
|
||||||
"deskewed": "deskewed_png",
|
|
||||||
"binarized": "binarized_png",
|
|
||||||
"dewarped": "dewarped_png",
|
|
||||||
"clean": "clean_png",
|
|
||||||
}
|
|
||||||
column = column_map.get(image_type)
|
|
||||||
if not column:
|
|
||||||
return None
|
|
||||||
|
|
||||||
pool = await get_pool()
|
|
||||||
async with pool.acquire() as conn:
|
|
||||||
return await conn.fetchval(
|
|
||||||
f"SELECT {column} FROM ocr_pipeline_sessions WHERE id = $1",
|
|
||||||
uuid.UUID(session_id)
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
async def update_session_db(session_id: str, **kwargs) -> Optional[Dict[str, Any]]:
|
|
||||||
"""Update session fields dynamically."""
|
|
||||||
pool = await get_pool()
|
|
||||||
|
|
||||||
fields = []
|
|
||||||
values = []
|
|
||||||
param_idx = 1
|
|
||||||
|
|
||||||
allowed_fields = {
|
|
||||||
'name', 'filename', 'status', 'current_step',
|
|
||||||
'original_png', 'deskewed_png', 'binarized_png', 'dewarped_png',
|
|
||||||
'clean_png', 'handwriting_removal_meta',
|
|
||||||
'deskew_result', 'dewarp_result', 'column_result', 'row_result',
|
|
||||||
'word_result', 'ground_truth', 'auto_shear_degrees',
|
|
||||||
'doc_type', 'doc_type_result',
|
|
||||||
}
|
|
||||||
|
|
||||||
jsonb_fields = {'deskew_result', 'dewarp_result', 'column_result', 'row_result', 'word_result', 'ground_truth', 'handwriting_removal_meta', 'doc_type_result'}
|
|
||||||
|
|
||||||
for key, value in kwargs.items():
|
|
||||||
if key in allowed_fields:
|
|
||||||
fields.append(f"{key} = ${param_idx}")
|
|
||||||
if key in jsonb_fields and value is not None and not isinstance(value, str):
|
|
||||||
value = json.dumps(value)
|
|
||||||
values.append(value)
|
|
||||||
param_idx += 1
|
|
||||||
|
|
||||||
if not fields:
|
|
||||||
return await get_session_db(session_id)
|
|
||||||
|
|
||||||
# Always update updated_at
|
|
||||||
fields.append("updated_at = NOW()")
|
|
||||||
|
|
||||||
values.append(uuid.UUID(session_id))
|
|
||||||
|
|
||||||
async with pool.acquire() as conn:
|
|
||||||
row = await conn.fetchrow(f"""
|
|
||||||
UPDATE ocr_pipeline_sessions
|
|
||||||
SET {', '.join(fields)}
|
|
||||||
WHERE id = ${param_idx}
|
|
||||||
RETURNING id, name, filename, status, current_step,
|
|
||||||
deskew_result, dewarp_result, column_result, row_result,
|
|
||||||
word_result, ground_truth, auto_shear_degrees,
|
|
||||||
doc_type, doc_type_result,
|
|
||||||
created_at, updated_at
|
|
||||||
""", *values)
|
|
||||||
|
|
||||||
if row:
|
|
||||||
return _row_to_dict(row)
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
async def list_sessions_db(limit: int = 50) -> List[Dict[str, Any]]:
|
|
||||||
"""List all sessions (metadata only, no images)."""
|
|
||||||
pool = await get_pool()
|
|
||||||
async with pool.acquire() as conn:
|
|
||||||
rows = await conn.fetch("""
|
|
||||||
SELECT id, name, filename, status, current_step,
|
|
||||||
created_at, updated_at
|
|
||||||
FROM ocr_pipeline_sessions
|
|
||||||
ORDER BY created_at DESC
|
|
||||||
LIMIT $1
|
|
||||||
""", limit)
|
|
||||||
|
|
||||||
return [_row_to_dict(row) for row in rows]
|
|
||||||
|
|
||||||
|
|
||||||
async def delete_session_db(session_id: str) -> bool:
|
|
||||||
"""Delete a session."""
|
|
||||||
pool = await get_pool()
|
|
||||||
async with pool.acquire() as conn:
|
|
||||||
result = await conn.execute("""
|
|
||||||
DELETE FROM ocr_pipeline_sessions WHERE id = $1
|
|
||||||
""", uuid.UUID(session_id))
|
|
||||||
return result == "DELETE 1"
|
|
||||||
|
|
||||||
|
|
||||||
# =============================================================================
|
|
||||||
# HELPER
|
|
||||||
# =============================================================================
|
|
||||||
|
|
||||||
def _row_to_dict(row: asyncpg.Record) -> Dict[str, Any]:
|
|
||||||
"""Convert asyncpg Record to JSON-serializable dict."""
|
|
||||||
if row is None:
|
|
||||||
return {}
|
|
||||||
|
|
||||||
result = dict(row)
|
|
||||||
|
|
||||||
# UUID → string
|
|
||||||
for key in ['id', 'session_id']:
|
|
||||||
if key in result and result[key] is not None:
|
|
||||||
result[key] = str(result[key])
|
|
||||||
|
|
||||||
# datetime → ISO string
|
|
||||||
for key in ['created_at', 'updated_at']:
|
|
||||||
if key in result and result[key] is not None:
|
|
||||||
result[key] = result[key].isoformat()
|
|
||||||
|
|
||||||
# JSONB → parsed (asyncpg returns str for JSONB)
|
|
||||||
for key in ['deskew_result', 'dewarp_result', 'column_result', 'row_result', 'word_result', 'ground_truth', 'doc_type_result']:
|
|
||||||
if key in result and result[key] is not None:
|
|
||||||
if isinstance(result[key], str):
|
|
||||||
result[key] = json.loads(result[key])
|
|
||||||
|
|
||||||
return result
|
|
||||||
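The whitelist-plus-placeholder pattern used by `update_session_db` can be exercised without a database. The sketch below is an illustration under stated assumptions: `build_update`, `ALLOWED_FIELDS` and `JSONB_FIELDS` are hypothetical names, and the field sets are shortened for brevity. Column names only ever come from the whitelist, while values travel as positional parameters.

```python
import json
from typing import Any, Dict, List, Tuple

# Shortened, illustrative whitelists (the diffed module lists more columns).
ALLOWED_FIELDS = {"name", "status", "current_step", "word_result", "ground_truth"}
JSONB_FIELDS = {"word_result", "ground_truth"}


def build_update(table: str, changes: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build a parameterized UPDATE statement and its value list.

    The caller appends the row id as the final parameter.
    """
    assignments: List[str] = []
    values: List[Any] = []
    for key, value in changes.items():
        if key not in ALLOWED_FIELDS:
            continue  # unknown keys are ignored, never interpolated into SQL
        if key in JSONB_FIELDS and value is not None and not isinstance(value, str):
            value = json.dumps(value)  # JSONB columns are sent as JSON text
        values.append(value)
        assignments.append(f"{key} = ${len(values)}")
    if not assignments:
        return "", []
    assignments.append("updated_at = NOW()")
    sql = (
        f"UPDATE {table} SET {', '.join(assignments)} "
        f"WHERE id = ${len(values) + 1}"
    )
    return sql, values
```

With asyncpg, the result would be executed roughly as `await conn.fetchrow(sql, *values, session_uuid)`, where `session_uuid` is the row id.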
@@ -28,16 +28,6 @@ opencv-python-headless>=4.8.0
|
|||||||
pytesseract>=0.3.10
|
pytesseract>=0.3.10
|
||||||
Pillow>=10.0.0
|
Pillow>=10.0.0
|
||||||
|
|
||||||
# RapidOCR (PaddleOCR models on ONNX Runtime — works on ARM64 natively)
|
|
||||||
rapidocr
|
|
||||||
onnxruntime
|
|
||||||
|
|
||||||
# IPA pronunciation dictionary lookup (MIT license, bundled CMU dict ~134k words)
|
|
||||||
eng-to-ipa
|
|
||||||
|
|
||||||
# Spell-checker for rule-based OCR correction (MIT license)
|
|
||||||
pyspellchecker>=0.8.1
|
|
||||||
|
|
||||||
# PostgreSQL (for metrics storage)
|
# PostgreSQL (for metrics storage)
|
||||||
psycopg2-binary>=2.9.0
|
psycopg2-binary>=2.9.0
|
||||||
asyncpg>=0.29.0
|
asyncpg>=0.29.0
|
||||||
@@ -45,9 +35,6 @@ asyncpg>=0.29.0
|
|||||||
# Email validation for Pydantic
|
# Email validation for Pydantic
|
||||||
email-validator>=2.0.0
|
email-validator>=2.0.0
|
||||||
|
|
||||||
# DOCX export for reconstruction editor (MIT license)
|
|
||||||
python-docx>=1.1.0
|
|
||||||
|
|
||||||
# Testing
|
# Testing
|
||||||
pytest>=8.0.0
|
pytest>=8.0.0
|
||||||
pytest-asyncio>=0.23.0
|
pytest-asyncio>=0.23.0
|
||||||
|
|||||||
@@ -6,7 +6,6 @@ Uses multiple detection methods:
|
|||||||
1. Color-based detection (blue/red ink)
|
1. Color-based detection (blue/red ink)
|
||||||
2. Stroke analysis (thin irregular strokes)
|
2. Stroke analysis (thin irregular strokes)
|
||||||
3. Edge density variance
|
3. Edge density variance
|
||||||
4. Pencil detection (gray ink)
|
|
||||||
|
|
||||||
DATA PRIVACY: All processing happens locally on Mac Mini.
|
DATA PRIVACY: All processing happens locally on Mac Mini.
|
||||||
"""
|
"""
|
||||||
@@ -38,16 +37,12 @@ class DetectionResult:
|
|||||||
detection_method: str # Which method was primarily used
|
detection_method: str # Which method was primarily used
|
||||||
|
|
||||||
|
|
||||||
def detect_handwriting(image_bytes: bytes, target_ink: str = "all") -> DetectionResult:
|
def detect_handwriting(image_bytes: bytes) -> DetectionResult:
|
||||||
"""
|
"""
|
||||||
Detect handwriting in an image.
|
Detect handwriting in an image.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
image_bytes: Image as bytes (PNG, JPG, etc.)
|
image_bytes: Image as bytes (PNG, JPG, etc.)
|
||||||
target_ink: Which ink types to detect:
|
|
||||||
- "all" → all methods combined (incl. pencil)
|
|
||||||
- "colored" → only color-based (blue/red/green pen)
|
|
||||||
- "pencil" → only pencil (gray ink)
|
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
DetectionResult with binary mask where handwriting is white (255)
|
DetectionResult with binary mask where handwriting is white (255)
|
||||||
@@ -67,51 +62,35 @@ def detect_handwriting(image_bytes: bytes, target_ink: str = "all") -> Detection
|
|||||||
|
|
||||||
# Convert to BGR if needed (OpenCV format)
|
# Convert to BGR if needed (OpenCV format)
|
||||||
if len(img_array.shape) == 2:
|
if len(img_array.shape) == 2:
|
||||||
|
# Grayscale to BGR
|
||||||
img_bgr = cv2.cvtColor(img_array, cv2.COLOR_GRAY2BGR)
|
img_bgr = cv2.cvtColor(img_array, cv2.COLOR_GRAY2BGR)
|
||||||
elif img_array.shape[2] == 4:
|
elif img_array.shape[2] == 4:
|
||||||
|
# RGBA to BGR
|
||||||
img_bgr = cv2.cvtColor(img_array, cv2.COLOR_RGBA2BGR)
|
img_bgr = cv2.cvtColor(img_array, cv2.COLOR_RGBA2BGR)
|
||||||
elif img_array.shape[2] == 3:
|
elif img_array.shape[2] == 3:
|
||||||
|
# RGB to BGR
|
||||||
img_bgr = cv2.cvtColor(img_array, cv2.COLOR_RGB2BGR)
|
img_bgr = cv2.cvtColor(img_array, cv2.COLOR_RGB2BGR)
|
||||||
else:
|
else:
|
||||||
img_bgr = img_array
|
img_bgr = img_array
|
||||||
|
|
||||||
# Select detection methods based on target_ink
|
# Run multiple detection methods
|
||||||
masks_and_weights = []
|
color_mask, color_confidence = _detect_by_color(img_bgr)
|
||||||
|
stroke_mask, stroke_confidence = _detect_by_stroke_analysis(img_bgr)
|
||||||
if target_ink in ("all", "colored"):
|
variance_mask, variance_confidence = _detect_by_variance(img_bgr)
|
||||||
color_mask, color_conf = _detect_by_color(img_bgr)
|
|
||||||
masks_and_weights.append((color_mask, color_conf, "color"))
|
|
||||||
|
|
||||||
if target_ink == "all":
|
|
||||||
stroke_mask, stroke_conf = _detect_by_stroke_analysis(img_bgr)
|
|
||||||
variance_mask, variance_conf = _detect_by_variance(img_bgr)
|
|
||||||
masks_and_weights.append((stroke_mask, stroke_conf, "stroke"))
|
|
||||||
masks_and_weights.append((variance_mask, variance_conf, "variance"))
|
|
||||||
|
|
||||||
if target_ink in ("all", "pencil"):
|
|
||||||
pencil_mask, pencil_conf = _detect_pencil(img_bgr)
|
|
||||||
masks_and_weights.append((pencil_mask, pencil_conf, "pencil"))
|
|
||||||
|
|
||||||
if not masks_and_weights:
|
|
||||||
# Fallback: use all methods
|
|
||||||
color_mask, color_conf = _detect_by_color(img_bgr)
|
|
||||||
stroke_mask, stroke_conf = _detect_by_stroke_analysis(img_bgr)
|
|
||||||
variance_mask, variance_conf = _detect_by_variance(img_bgr)
|
|
||||||
pencil_mask, pencil_conf = _detect_pencil(img_bgr)
|
|
||||||
masks_and_weights = [
|
|
||||||
(color_mask, color_conf, "color"),
|
|
||||||
(stroke_mask, stroke_conf, "stroke"),
|
|
||||||
(variance_mask, variance_conf, "variance"),
|
|
||||||
(pencil_mask, pencil_conf, "pencil"),
|
|
||||||
]
|
|
||||||
|
|
||||||
# Combine masks using weighted average
|
# Combine masks using weighted average
|
||||||
total_weight = sum(w for _, w, _ in masks_and_weights)
|
weights = [color_confidence, stroke_confidence, variance_confidence]
|
||||||
|
total_weight = sum(weights)
|
||||||
|
|
||||||
if total_weight > 0:
|
if total_weight > 0:
|
||||||
combined_mask = sum(
|
# Weighted combination
|
||||||
m.astype(np.float32) * w for m, w, _ in masks_and_weights
|
combined_mask = (
|
||||||
|
color_mask.astype(np.float32) * color_confidence +
|
||||||
|
stroke_mask.astype(np.float32) * stroke_confidence +
|
||||||
|
variance_mask.astype(np.float32) * variance_confidence
|
||||||
) / total_weight
|
) / total_weight
|
||||||
|
|
||||||
|
# Threshold to binary
|
||||||
combined_mask = (combined_mask > 127).astype(np.uint8) * 255
|
combined_mask = (combined_mask > 127).astype(np.uint8) * 255
|
||||||
else:
|
else:
|
||||||
combined_mask = np.zeros(img_bgr.shape[:2], dtype=np.uint8)
|
combined_mask = np.zeros(img_bgr.shape[:2], dtype=np.uint8)
|
||||||
@@ -124,11 +103,19 @@ def detect_handwriting(image_bytes: bytes, target_ink: str = "all") -> Detection
|
|||||||
handwriting_pixels = np.sum(combined_mask > 0)
|
handwriting_pixels = np.sum(combined_mask > 0)
|
||||||
handwriting_ratio = handwriting_pixels / total_pixels if total_pixels > 0 else 0
|
handwriting_ratio = handwriting_pixels / total_pixels if total_pixels > 0 else 0
|
||||||
|
|
||||||
# Determine primary method (highest confidence)
|
# Determine primary method
|
||||||
primary_method = max(masks_and_weights, key=lambda x: x[1])[2] if masks_and_weights else "combined"
|
primary_method = "combined"
|
||||||
overall_confidence = total_weight / len(masks_and_weights) if masks_and_weights else 0.0
|
max_conf = max(color_confidence, stroke_confidence, variance_confidence)
|
||||||
|
if max_conf == color_confidence:
|
||||||
|
primary_method = "color"
|
||||||
|
elif max_conf == stroke_confidence:
|
||||||
|
primary_method = "stroke"
|
||||||
|
else:
|
||||||
|
primary_method = "variance"
|
||||||
|
|
||||||
logger.info(f"Handwriting detection (target_ink={target_ink}): {handwriting_ratio:.2%} handwriting, "
|
overall_confidence = total_weight / 3.0 # Average confidence
|
||||||
|
|
||||||
|
logger.info(f"Handwriting detection: {handwriting_ratio:.2%} handwriting, "
|
||||||
f"confidence={overall_confidence:.2f}, method={primary_method}")
|
f"confidence={overall_confidence:.2f}, method={primary_method}")
|
||||||
|
|
||||||
return DetectionResult(
|
return DetectionResult(
|
||||||
@@ -193,27 +180,6 @@ def _detect_by_color(img_bgr: np.ndarray) -> Tuple[np.ndarray, float]:
|
|||||||
return color_mask, confidence
|
return color_mask, confidence
|
||||||
|
|
||||||
|
|
||||||
def _detect_pencil(img_bgr: np.ndarray) -> Tuple[np.ndarray, float]:
|
|
||||||
"""
|
|
||||||
Detect pencil marks (gray ink, ~140-220 on 255-scale).
|
|
||||||
|
|
||||||
Paper is usually >230, dark ink <130.
|
|
||||||
Pencil falls in the 140-220 gray range.
|
|
||||||
"""
|
|
||||||
gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
|
|
||||||
pencil_mask = cv2.inRange(gray, 140, 220)
|
|
||||||
|
|
||||||
# Remove small noise artifacts
|
|
||||||
kernel = np.ones((2, 2), np.uint8)
|
|
||||||
pencil_mask = cv2.morphologyEx(pencil_mask, cv2.MORPH_OPEN, kernel, iterations=1)
|
|
||||||
|
|
||||||
ratio = np.sum(pencil_mask > 0) / pencil_mask.size
|
|
||||||
# Good confidence if pencil pixels are in a plausible range
|
|
||||||
confidence = 0.75 if 0.002 < ratio < 0.2 else 0.2
|
|
||||||
|
|
||||||
return pencil_mask, confidence
|
|
||||||
|
|
||||||
|
|
||||||
def _detect_by_stroke_analysis(img_bgr: np.ndarray) -> Tuple[np.ndarray, float]:
|
def _detect_by_stroke_analysis(img_bgr: np.ndarray) -> Tuple[np.ndarray, float]:
|
||||||
"""
|
"""
|
||||||
Detect handwriting by analyzing stroke characteristics.
|
Detect handwriting by analyzing stroke characteristics.
|
||||||
|
|||||||
@@ -350,77 +350,6 @@ def layout_to_fabric_json(layout_result: LayoutResult) -> str:
|
|||||||
return json.dumps(layout_result.fabric_json, ensure_ascii=False, indent=2)
|
return json.dumps(layout_result.fabric_json, ensure_ascii=False, indent=2)
|
||||||
|
|
||||||
|
|
||||||
def cells_to_fabric_json(
|
|
||||||
cells: List[Dict[str, Any]],
|
|
||||||
image_width: int,
|
|
||||||
image_height: int,
|
|
||||||
) -> Dict[str, Any]:
|
|
||||||
"""Convert pipeline grid cells to Fabric.js-compatible JSON.
|
|
||||||
|
|
||||||
Each cell becomes a Textbox object positioned at its bbox_pct coordinates
|
|
||||||
(converted to pixels). Colour-coded by column type.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
cells: List of cell dicts from GridResult (with bbox_pct, col_type, text).
|
|
||||||
image_width: Source image width in pixels.
|
|
||||||
image_height: Source image height in pixels.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Dict with Fabric.js canvas JSON (version + objects array).
|
|
||||||
"""
|
|
||||||
COL_TYPE_COLORS = {
|
|
||||||
'column_en': '#3b82f6',
|
|
||||||
'column_de': '#22c55e',
|
|
||||||
'column_example': '#f97316',
|
|
||||||
'column_text': '#a855f7',
|
|
||||||
'page_ref': '#06b6d4',
|
|
||||||
'column_marker': '#6b7280',
|
|
||||||
}
|
|
||||||
|
|
||||||
fabric_objects = []
|
|
||||||
for cell in cells:
|
|
||||||
bp = cell.get('bbox_pct', {})
|
|
||||||
x = bp.get('x', 0) / 100 * image_width
|
|
||||||
y = bp.get('y', 0) / 100 * image_height
|
|
||||||
w = bp.get('w', 10) / 100 * image_width
|
|
||||||
h = bp.get('h', 3) / 100 * image_height
|
|
||||||
col_type = cell.get('col_type', '')
|
|
||||||
color = COL_TYPE_COLORS.get(col_type, '#6b7280')
|
|
||||||
font_size = max(8, min(18, h * 0.55))
|
|
||||||
|
|
||||||
fabric_objects.append({
|
|
||||||
"type": "textbox",
|
|
||||||
"version": "6.0.0",
|
|
||||||
"originX": "left",
|
|
||||||
"originY": "top",
|
|
||||||
"left": round(x, 1),
|
|
||||||
"top": round(y, 1),
|
|
||||||
"width": max(round(w, 1), 30),
|
|
||||||
"height": round(h, 1),
|
|
||||||
"fill": "#000000",
|
|
||||||
"stroke": color,
|
|
||||||
"strokeWidth": 1,
|
|
||||||
"text": cell.get('text', ''),
|
|
||||||
"fontSize": round(font_size, 1),
|
|
||||||
"fontFamily": "monospace",
|
|
||||||
"editable": True,
|
|
||||||
"selectable": True,
|
|
||||||
"backgroundColor": color + "22",
|
|
||||||
"data": {
|
|
||||||
"cellId": cell.get('cell_id', ''),
|
|
||||||
"colType": col_type,
|
|
||||||
"rowIndex": cell.get('row_index', 0),
|
|
||||||
"colIndex": cell.get('col_index', 0),
|
|
||||||
"originalText": cell.get('text', ''),
|
|
||||||
},
|
|
||||||
})
|
|
||||||
|
|
||||||
return {
|
|
||||||
"version": "6.0.0",
|
|
||||||
"objects": fabric_objects,
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
def reconstruct_and_clean(
|
def reconstruct_and_clean(
|
||||||
image_bytes: bytes,
|
image_bytes: bytes,
|
||||||
remove_handwriting: bool = True
|
remove_handwriting: bool = True
|
||||||
|
|||||||
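The geometry step in `cells_to_fabric_json` (percentage bounding box to pixel coordinates, plus a clamped font size) can be isolated as below. This is a sketch with hypothetical helper names `bbox_pct_to_pixels` and `font_size_for_height`. The defaults, the 30 px minimum width and the 8 to 18 font-size clamp are taken from the diffed function.

```python
from typing import Dict, Tuple


def bbox_pct_to_pixels(
    bbox_pct: Dict[str, float], image_width: int, image_height: int
) -> Tuple[float, float, float, float]:
    """Convert a percentage bbox (x, y, w, h in 0..100) to pixel values."""
    x = bbox_pct.get("x", 0) / 100 * image_width
    y = bbox_pct.get("y", 0) / 100 * image_height
    w = bbox_pct.get("w", 10) / 100 * image_width
    h = bbox_pct.get("h", 3) / 100 * image_height
    # Width is clamped to at least 30 px so narrow cells stay editable.
    return round(x, 1), round(y, 1), max(round(w, 1), 30), round(h, 1)


def font_size_for_height(h_px: float) -> float:
    """Scale the font with cell height, clamped to the range 8..18."""
    return round(max(8, min(18, h_px * 0.55)), 1)
```

For example, a cell covering 50% of the width of a 1000 px image starts out 500 px wide, and any cell taller than about 33 px is capped at font size 18.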
@@ -31,10 +31,8 @@ from datetime import datetime, timedelta
|
|||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Lazy loading for heavy dependencies
|
# Lazy loading for heavy dependencies
|
||||||
# Cache keyed by model_name to support base and large variants simultaneously
|
_trocr_processor = None
|
||||||
_trocr_models: dict = {} # {model_name: (processor, model)}
|
_trocr_model = None
|
||||||
_trocr_processor = None # backwards-compat alias → base-printed
|
|
||||||
_trocr_model = None # backwards-compat alias → base-printed
|
|
||||||
_trocr_available = None
|
_trocr_available = None
|
||||||
_model_loaded_at = None
|
_model_loaded_at = None
|
||||||
|
|
||||||
@@ -126,14 +124,12 @@ def _check_trocr_available() -> bool:
|
|||||||
return _trocr_available
|
return _trocr_available
|
||||||
|
|
||||||
|
|
||||||
def get_trocr_model(handwritten: bool = False, size: str = "base"):
|
def get_trocr_model(handwritten: bool = False):
|
||||||
"""
|
"""
|
||||||
Lazy load TrOCR model and processor.
|
Lazy load TrOCR model and processor.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
handwritten: Use handwritten model instead of printed model
|
handwritten: Use handwritten model instead of printed model
|
||||||
size: Model size — "base" (300 MB) or "large" (340 MB, higher accuracy
|
|
||||||
for exam HTR). Only applies to handwritten variant.
|
|
||||||
|
|
||||||
Returns tuple of (processor, model) or (None, None) if unavailable.
|
Returns tuple of (processor, model) or (None, None) if unavailable.
|
||||||
"""
|
"""
|
||||||
@@ -142,43 +138,32 @@ def get_trocr_model(handwritten: bool = False, size: str = "base"):
|
|||||||
if not _check_trocr_available():
|
if not _check_trocr_available():
|
||||||
return None, None
|
return None, None
|
||||||
|
|
||||||
# Select model name
|
if _trocr_processor is None or _trocr_model is None:
|
||||||
if size == "large" and handwritten:
|
|
||||||
model_name = "microsoft/trocr-large-handwritten"
|
|
||||||
elif handwritten:
|
|
||||||
model_name = "microsoft/trocr-base-handwritten"
|
|
||||||
else:
|
|
||||||
model_name = "microsoft/trocr-base-printed"
|
|
||||||
|
|
||||||
if model_name in _trocr_models:
|
|
||||||
return _trocr_models[model_name]
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
import torch
|
import torch
|
||||||
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
|
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
|
||||||
|
|
||||||
|
# Choose model based on use case
|
||||||
|
if handwritten:
|
||||||
|
model_name = "microsoft/trocr-base-handwritten"
|
||||||
|
else:
|
||||||
|
model_name = "microsoft/trocr-base-printed"
|
||||||
|
|
||||||
logger.info(f"Loading TrOCR model: {model_name}")
|
logger.info(f"Loading TrOCR model: {model_name}")
|
||||||
processor = TrOCRProcessor.from_pretrained(model_name)
|
_trocr_processor = TrOCRProcessor.from_pretrained(model_name)
|
||||||
model = VisionEncoderDecoderModel.from_pretrained(model_name)
|
_trocr_model = VisionEncoderDecoderModel.from_pretrained(model_name)
|
||||||
|
|
||||||
# Use GPU if available
|
# Use GPU if available
|
||||||
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
|
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
|
||||||
model.to(device)
|
_trocr_model.to(device)
|
||||||
logger.info(f"TrOCR model loaded on device: {device}")
|
logger.info(f"TrOCR model loaded on device: {device}")
|
||||||
|
|
||||||
_trocr_models[model_name] = (processor, model)
|
|
||||||
|
|
||||||
# Keep backwards-compat globals pointing at base-printed
|
|
||||||
if model_name == "microsoft/trocr-base-printed":
|
|
||||||
_trocr_processor = processor
|
|
||||||
_trocr_model = model
|
|
||||||
|
|
||||||
return processor, model
|
|
||||||
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.error(f"Failed to load TrOCR model {model_name}: {e}")
|
logger.error(f"Failed to load TrOCR model: {e}")
|
||||||
return None, None
|
return None, None
|
||||||
|
|
||||||
|
return _trocr_processor, _trocr_model
|
||||||
|
|
||||||
|
|
||||||
def preload_trocr_model(handwritten: bool = True) -> bool:
|
def preload_trocr_model(handwritten: bool = True) -> bool:
|
||||||
"""
|
"""
|
||||||
@@ -224,8 +209,7 @@ def get_model_status() -> Dict[str, Any]:
|
|||||||
async def run_trocr_ocr(
|
async def run_trocr_ocr(
|
||||||
image_data: bytes,
|
image_data: bytes,
|
||||||
handwritten: bool = False,
|
handwritten: bool = False,
|
||||||
split_lines: bool = True,
|
split_lines: bool = True
|
||||||
size: str = "base",
|
|
||||||
) -> Tuple[Optional[str], float]:
|
) -> Tuple[Optional[str], float]:
|
||||||
"""
|
"""
|
||||||
Run TrOCR on an image.
|
Run TrOCR on an image.
|
||||||
@@ -239,12 +223,11 @@ async def run_trocr_ocr(
|
|||||||
image_data: Raw image bytes
|
image_data: Raw image bytes
|
||||||
handwritten: Use handwritten model (slower but better for handwriting)
|
handwritten: Use handwritten model (slower but better for handwriting)
|
||||||
split_lines: Whether to split image into lines first
|
split_lines: Whether to split image into lines first
|
||||||
size: "base" or "large" (only for handwritten variant)
|
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
Tuple of (extracted_text, confidence)
|
Tuple of (extracted_text, confidence)
|
||||||
"""
|
"""
|
||||||
processor, model = get_trocr_model(handwritten=handwritten, size=size)
|
processor, model = get_trocr_model(handwritten=handwritten)
|
||||||
|
|
||||||
if processor is None or model is None:
|
if processor is None or model is None:
|
||||||
logger.error("TrOCR model not available")
|
logger.error("TrOCR model not available")
|
||||||
|
|||||||
File diff suppressed because it is too large
@@ -65,12 +65,10 @@ nav:
|
|||||||
- BYOEH Architektur: services/klausur-service/BYOEH-Architecture.md
|
- BYOEH Architektur: services/klausur-service/BYOEH-Architecture.md
|
||||||
- BYOEH Developer Guide: services/klausur-service/BYOEH-Developer-Guide.md
|
- BYOEH Developer Guide: services/klausur-service/BYOEH-Developer-Guide.md
|
||||||
- NiBiS Pipeline: services/klausur-service/NiBiS-Ingestion-Pipeline.md
|
- NiBiS Pipeline: services/klausur-service/NiBiS-Ingestion-Pipeline.md
|
||||||
- OCR Pipeline: services/klausur-service/OCR-Pipeline.md
|
|
||||||
- OCR Labeling: services/klausur-service/OCR-Labeling-Spec.md
|
- OCR Labeling: services/klausur-service/OCR-Labeling-Spec.md
|
||||||
- OCR Vergleich: services/klausur-service/OCR-Compare.md
|
- OCR Vergleich: services/klausur-service/OCR-Compare.md
|
||||||
- RAG Admin: services/klausur-service/RAG-Admin-Spec.md
|
- RAG Admin: services/klausur-service/RAG-Admin-Spec.md
|
||||||
- Worksheet Editor: services/klausur-service/Worksheet-Editor-Architecture.md
|
- Worksheet Editor: services/klausur-service/Worksheet-Editor-Architecture.md
|
||||||
- Chunk-Browser: services/klausur-service/Chunk-Browser.md
|
|
||||||
- Voice-Service:
|
- Voice-Service:
|
||||||
- Uebersicht: services/voice-service/index.md
|
- Uebersicht: services/voice-service/index.md
|
||||||
- Agent-Core:
|
- Agent-Core:
|
||||||
|
|||||||
@@ -31,8 +31,8 @@ WORKDIR /app
|
|||||||
ENV NODE_ENV=production
|
ENV NODE_ENV=production
|
||||||
|
|
||||||
# Create non-root user
|
# Create non-root user
|
||||||
RUN addgroup --system --gid 1001 nodejs
|
RUN addgroup -S -g 1001 nodejs
|
||||||
RUN adduser --system --uid 1001 nextjs
|
RUN adduser -S -u 1001 -G nodejs nextjs
|
||||||
|
|
||||||
# Copy built application
|
# Copy built application
|
||||||
COPY --from=builder /app/public ./public
|
COPY --from=builder /app/public ./public
|
||||||
|
|||||||
@@ -34,8 +34,8 @@ WORKDIR /app
|
|||||||
ENV NODE_ENV=production
|
ENV NODE_ENV=production
|
||||||
|
|
||||||
# Create non-root user
|
# Create non-root user
|
||||||
RUN addgroup --system --gid 1001 nodejs
|
RUN addgroup -S -g 1001 nodejs
|
||||||
RUN adduser --system --uid 1001 nextjs
|
RUN adduser -S -u 1001 -G nodejs nextjs
|
||||||
|
|
||||||
# Copy built assets
|
# Copy built assets
|
||||||
COPY --from=builder /app/public ./public
|
COPY --from=builder /app/public ./public
|
||||||
|
|||||||