refactor: move crop after deskew/dewarp + content-based book-scan crop

New pipeline order: orientation → deskew → dewarp → crop → columns...
Crop now runs on the already straightened image, which gives better crop results.
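
A minimal sketch of the new ordering, assuming hypothetical step identifiers (the real PIPELINE_STEPS entries are not shown in this excerpt, and steps after columns are omitted). It only illustrates why crop is now "Step 4 (UI index 3)":

```python
# Hypothetical step identifiers for illustration only; the actual
# PIPELINE_STEPS definition is not part of this excerpt.
PIPELINE_ORDER = ["orientation", "deskew", "dewarp", "crop", "columns"]


def step_number(name: str) -> int:
    """Return the 1-based step number, as used in the backend comments."""
    return PIPELINE_ORDER.index(name) + 1


def ui_index(name: str) -> int:
    """Return the 0-based position, as a UI step list would index it."""
    return PIPELINE_ORDER.index(name)


if __name__ == "__main__":
    print(step_number("crop"), ui_index("crop"))
```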

page_crop.py completely replaced: adaptive threshold + 4-edge detection
(book-spine shadow on the left, ink projection for all margins) instead of
Otsu + largest contour.
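
The ink-projection idea can be sketched as follows. This is not the code from page_crop.py (which is not shown here); it is a pure-Python illustration that assumes the image has already been binarized, and the function name and the `min_fraction` parameter are hypothetical. Adaptive thresholding and spine-shadow detection are left out.

```python
from typing import List, Tuple


def ink_projection_bounds(
    mask: List[List[int]], min_fraction: float = 0.05
) -> Tuple[int, int, int, int]:
    """Return inclusive (top, bottom, left, right) bounds of the inked area.

    mask is a binary image (1 = ink, 0 = background). A row or column counts
    as content when its share of ink pixels reaches min_fraction.
    """
    height, width = len(mask), len(mask[0])

    # Project ink onto both axes: fraction of ink pixels per row and per column.
    row_ink = [sum(row) / width for row in mask]
    col_ink = [sum(mask[y][x] for y in range(height)) / height for x in range(width)]

    rows = [i for i, v in enumerate(row_ink) if v >= min_fraction]
    cols = [i for i, v in enumerate(col_ink) if v >= min_fraction]
    if not rows or not cols:
        # No content found: keep the full image (identity crop).
        return 0, height - 1, 0, width - 1
    return rows[0], rows[-1], cols[0], cols[-1]
```

Unlike a largest-contour search, this looks only at where ink accumulates along each axis, so it does not depend on the page outline being visible as a closed contour.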

Backend: step numbers, input images, and reprocess cascade adjusted.
Frontend: PIPELINE_STEPS reordered; switch cases and "before" images updated.
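
The input-image change amounts to a three-level fallback that the diff repeats in auto_crop, manual_crop, and skip_crop. A possible shared helper could look like the sketch below; `pick_crop_input` and `CROP_INPUT_KEYS` are hypothetical names that do not exist in the commit, and only the cache keys are taken from the diff.

```python
from typing import Any, Mapping, Optional

# Preference order taken from the diff: dewarped, then oriented, then original.
CROP_INPUT_KEYS = ("dewarped_bgr", "oriented_bgr", "original_bgr")


def pick_crop_input(cached: Mapping[str, Any]) -> Optional[Any]:
    """Return the first cached image that is present, or None if none is."""
    for key in CROP_INPUT_KEYS:
        image = cached.get(key)
        # Explicit None check: a NumPy array has no unambiguous truth value,
        # so `if image:` would raise for real image data.
        if image is not None:
            return image
    return None
```

Callers would still raise the HTTP 400 themselves when the helper returns None, as the three endpoints do today.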

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Benjamin Admin
2026-03-09 08:52:11 +01:00
parent eb45bb4879
commit 156a818246
7 changed files with 295 additions and 173 deletions

@@ -1,8 +1,8 @@
 """
-Orientation & Crop API - Steps 1-2 of the OCR Pipeline.
+Orientation & Crop API - Steps 1 and 4 of the OCR Pipeline.
 Step 1: Orientation detection (fix 90/180/270 degree rotations)
-Step 2: Page cropping (remove scanner borders, detect paper format)
+Step 4 (UI index 3): Page cropping (after deskew + dewarp, so the image is straight)
 These endpoints were extracted from the main pipeline to keep files manageable.
 """
@@ -161,21 +161,24 @@ async def detect_orientation(session_id: str):
 # ---------------------------------------------------------------------------
-# Step 2: Crop
+# Step 4 (UI index 3): Crop — runs after deskew + dewarp
 # ---------------------------------------------------------------------------
 @router.post("/sessions/{session_id}/crop")
 async def auto_crop(session_id: str):
-    """Auto-detect and crop scanner borders.
+    """Auto-detect and crop scanner/book borders.
-    Reads the oriented image (or original if no orientation step),
-    detects the page boundary and crops.
+    Reads the dewarped image (post-deskew + dewarp, so the page is straight).
+    Falls back to oriented → original if earlier steps were skipped.
     """
     cached = await _ensure_cached(session_id)
-    # Use oriented image if available, else original
-    oriented = cached.get("oriented_bgr")
-    img_bgr = oriented if oriented is not None else cached.get("original_bgr")
+    # Use dewarped (preferred), fall back to oriented, then original
+    img_bgr = next(
+        (v for k in ("dewarped_bgr", "oriented_bgr", "original_bgr")
+         if (v := cached.get(k)) is not None),
+        None,
+    )
     if img_bgr is None:
         raise HTTPException(status_code=400, detail="No image available for cropping")
@@ -199,7 +202,7 @@ async def auto_crop(session_id: str):
         session_id,
         cropped_png=cropped_png,
         crop_result=crop_info,
-        current_step=3,
+        current_step=5,
     )
     logger.info(
@@ -237,8 +240,11 @@ async def manual_crop(session_id: str, req: ManualCropRequest):
     """Manually crop using percentage coordinates."""
     cached = await _ensure_cached(session_id)
-    oriented = cached.get("oriented_bgr")
-    img_bgr = oriented if oriented is not None else cached.get("original_bgr")
+    img_bgr = next(
+        (v for k in ("dewarped_bgr", "oriented_bgr", "original_bgr")
+         if (v := cached.get(k)) is not None),
+        None,
+    )
     if img_bgr is None:
         raise HTTPException(status_code=400, detail="No image available for cropping")
@@ -278,7 +284,7 @@ async def manual_crop(session_id: str, req: ManualCropRequest):
         session_id,
         cropped_png=cropped_png,
         crop_result=crop_result,
-        current_step=3,
+        current_step=5,
     )
     ch, cw = cropped_bgr.shape[:2]
@@ -293,17 +299,20 @@ async def manual_crop(session_id: str, req: ManualCropRequest):
 @router.post("/sessions/{session_id}/crop/skip")
 async def skip_crop(session_id: str):
-    """Skip cropping — use oriented (or original) image as-is."""
+    """Skip cropping — use dewarped (or oriented/original) image as-is."""
     cached = await _ensure_cached(session_id)
-    oriented = cached.get("oriented_bgr")
-    img_bgr = oriented if oriented is not None else cached.get("original_bgr")
+    img_bgr = next(
+        (v for k in ("dewarped_bgr", "oriented_bgr", "original_bgr")
+         if (v := cached.get(k)) is not None),
+        None,
+    )
     if img_bgr is None:
         raise HTTPException(status_code=400, detail="No image available")
     h, w = img_bgr.shape[:2]
-    # Store the oriented image as cropped (identity crop)
+    # Store the dewarped image as cropped (identity crop)
     success, png_buf = cv2.imencode(".png", img_bgr)
     cropped_png = png_buf.tobytes() if success else b""
@@ -321,7 +330,7 @@ async def skip_crop(session_id: str):
         session_id,
         cropped_png=cropped_png,
         crop_result=crop_result,
-        current_step=3,
+        current_step=5,
     )
     return {