Add Ground Truth regression test system for OCR pipeline

Extract _build_grid_core() from build_grid() endpoint for reuse.
New ocr_pipeline_regression.py with endpoints to mark sessions as
ground truth, list them, and run regression comparisons after code
changes. Frontend button in StepGroundTruth.tsx to mark/update GT.
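
As an illustration of the regression comparison described above, here is a minimal sketch, not the code from this commit. It assumes a hypothetical grid shape of `{"cells": [{"row": int, "col": int, "text": str}, ...]}`, and the helper name `diff_grid_cells` is invented for this example. The actual structure stored under `build_grid_reference` may differ.

```python
from typing import Any, Dict, List


def diff_grid_cells(reference: Dict[str, Any], current: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one entry per cell whose text differs between two grid results.

    Hypothetical helper: assumes each grid looks like
    {"cells": [{"row": int, "col": int, "text": str}, ...]}.
    """
    # Index both grids by (row, col) so cells can be matched by position.
    ref_cells = {(c["row"], c["col"]): c["text"] for c in reference.get("cells", [])}
    cur_cells = {(c["row"], c["col"]): c["text"] for c in current.get("cells", [])}

    diffs: List[Dict[str, Any]] = []
    # Walk the union of positions so missing and extra cells are reported too.
    for row, col in sorted(ref_cells.keys() | cur_cells.keys()):
        expected = ref_cells.get((row, col))
        actual = cur_cells.get((row, col))
        if expected != actual:
            diffs.append({"row": row, "col": col, "expected": expected, "actual": actual})
    return diffs
```

An empty result would mean the freshly built grid matches the stored reference. Any entries point at the cells that changed after a code change.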

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Benjamin Admin
2026-03-18 13:46:48 +01:00
parent c894a0feeb
commit f655db30e4
5 changed files with 482 additions and 22 deletions


@@ -270,6 +270,26 @@ async def get_sub_sessions(parent_session_id: str) -> List[Dict[str, Any]]:
        return [_row_to_dict(row) for row in rows]


async def list_ground_truth_sessions_db() -> List[Dict[str, Any]]:
    """List sessions that have a build_grid_reference in ground_truth."""
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch("""
            SELECT id, name, filename, status, current_step,
                   document_category, doc_type,
                   ground_truth,
                   parent_session_id, box_index,
                   created_at, updated_at
            FROM ocr_pipeline_sessions
            WHERE ground_truth IS NOT NULL
              AND ground_truth::text LIKE '%build_grid_reference%'
              AND parent_session_id IS NULL
            ORDER BY created_at DESC
        """)
        return [_row_to_dict(row) for row in rows]


async def delete_session_db(session_id: str) -> bool:
    """Delete a session."""
    pool = await get_pool()