feat: add pass 3 text-line regression to deskew pipeline
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 26s
CI / test-python-klausur (push) Failing after 1m53s
CI / test-python-agent-core (push) Successful in 15s
CI / test-nodejs-website (push) Successful in 15s
After iterative projection (pass 1) and word-alignment (pass 2), a third pass uses Tesseract word positions + linear regression per text line to measure and correct residual rotation. This catches cases where passes 1-2 leave significant slope (e.g. 1.7° residual on heavily skewed scans).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
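
The measurement step described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the function names, the `min_words` threshold, the use of the median to combine per-line angles, and the sign convention (image coordinates with y growing downward) are all assumptions. Word centers are passed in as plain tuples so the sketch runs without Tesseract.

```python
import math
from collections import defaultdict
from statistics import median


def line_slope(points):
    """Least-squares slope dy/dx through one text line's word centers.

    Returns None when all x values coincide (slope undefined).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def residual_angle_deg(words, min_words=3):
    """Estimate residual skew in degrees from word centers.

    words: iterable of (line_key, x_center, y_center) tuples (hypothetical
    input shape). Lines with fewer than min_words words are skipped because
    a fit through one or two points is too noisy. The per-line angles are
    combined with the median (an assumption) to resist outlier lines.
    """
    lines = defaultdict(list)
    for key, x, y in words:
        lines[key].append((x, y))

    angles = []
    for points in lines.values():
        if len(points) < min_words:
            continue
        slope = line_slope(points)
        if slope is not None:
            angles.append(math.degrees(math.atan(slope)))

    # No usable line: report zero so the caller applies no correction.
    return median(angles) if angles else 0.0
```

With pytesseract, the tuples could plausibly be built from `pytesseract.image_to_data(..., output_type=Output.DICT)` by grouping on `(block_num, par_num, line_num)` and taking each word box's center from `left`, `top`, `width`, and `height`. Whether the returned angle or its negation must be passed to the rotation call depends on the rotation API's sign convention.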
@@ -1371,14 +1371,16 @@ async def _run_ocr_pipeline_for_page(
     except Exception as e:
         logger.warning(f"Could not create pipeline session in DB: {e}")
 
-    # 3. Two-pass deskew: iterative (±5°) + word-alignment residual
+    # 3. Three-pass deskew: iterative + word-alignment + text-line regression
     t0 = _time.time()
     deskewed_bgr, angle_applied, deskew_debug = deskew_two_pass(img_bgr.copy())
     angle_pass1 = deskew_debug.get("pass1_angle", 0.0)
     angle_pass2 = deskew_debug.get("pass2_angle", 0.0)
+    angle_pass3 = deskew_debug.get("pass3_angle", 0.0)
 
-    logger.info(f" deskew: pass1={angle_pass1:.2f} pass2={angle_pass2:.2f} "
-                f"total={angle_applied:.2f} ({_time.time() - t0:.1f}s)")
+    logger.info(f" deskew: p1={angle_pass1:.2f} p2={angle_pass2:.2f} "
+                f"p3={angle_pass3:.2f} total={angle_applied:.2f} "
+                f"({_time.time() - t0:.1f}s)")
 
     # 4. Dewarp
     t0 = _time.time()