fix(ocr-pipeline): pass left_x/right_x to classify_column_types in API path

The ocr_pipeline_api.py code path called classify_column_types without
left_x/right_x, so margin regions were never created on that path. Pass
both bounds through, and log the computed margins in
_build_margin_regions to aid debugging.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Benjamin Admin
2026-03-02 15:42:39 +01:00
parent 34ccdd5fd1
commit a052f73de3
2 changed files with 6 additions and 1 deletions


@@ -2062,6 +2062,10 @@ def _build_margin_regions(
 classification_method='content_bounds',
 ))
+if margins:
+    logger.info(f"Margins: {[(m.type, m.x, m.width) for m in margins]} "
+                f"(left_x={left_x}, right_x={right_x}, img_w={img_w})")
 return margins


@@ -699,7 +699,8 @@ async def detect_columns(session_id: str):
 cached["_content_bounds"] = (left_x, right_x, top_y, bottom_y)
 # Phase B: Content-based classification
-regions = classify_column_types(geometries, content_w, top_y, w, h, bottom_y)
+regions = classify_column_types(geometries, content_w, top_y, w, h, bottom_y,
+                                left_x=left_x, right_x=right_x)
 duration = time.time() - t0
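
The failure mode described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the project's code: `Region`, `build_margin_regions`, and `classify_columns` are hypothetical stand-ins, and the sketch assumes that the real `classify_column_types` treats `left_x`/`right_x` as optional keyword arguments and skips margin creation when they are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """Hypothetical stand-in for the pipeline's region type."""
    type: str
    x: int
    width: int


def build_margin_regions(left_x: int, right_x: int, img_w: int) -> List[Region]:
    """Return regions for the strips outside the content bounds (assumed logic)."""
    margins = []
    if left_x > 0:
        margins.append(Region("margin_left", 0, left_x))
    if right_x < img_w:
        margins.append(Region("margin_right", right_x, img_w - right_x))
    return margins


def classify_columns(columns: List[Region], img_w: int,
                     left_x: Optional[int] = None,
                     right_x: Optional[int] = None) -> List[Region]:
    """Hypothetical classifier: margins are added only if both bounds are given."""
    regions = list(columns)
    if left_x is not None and right_x is not None:
        regions.extend(build_margin_regions(left_x, right_x, img_w))
    return regions


columns = [Region("column_text", 40, 500)]

# Caller that omits the bounds: no margin regions are produced, and no error is raised.
without_bounds = classify_columns(columns, img_w=600)

# Caller that forwards the bounds: both margins appear.
with_bounds = classify_columns(columns, img_w=600, left_x=40, right_x=540)
```

Because the optional arguments default silently, a caller that forgets them still gets a valid result, just without margins. That is why a log line at the point where margins are built helps confirm which code path actually supplied the bounds.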