fix: only detect circles and illustrations, drop arrow/icon/line
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 30s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Failing after 2m6s
CI / test-python-agent-core (push) Successful in 18s
CI / test-nodejs-website (push) Successful in 22s
Text fragments that remain after word exclusion are indistinguishable from arrows and icons via contour metrics. Since the goal is to detect graphics, images, boxes and colors (not arrows/icons), simplify to only:

- circle/balloon (circularity > 0.55, very reliable)
- illustration (area > 3000, clearly non-text)

Boxes and colors are handled by cv_box_detect and cv_color_detect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -101,63 +101,31 @@ def _classify_shape(
 ) -> tuple:
     """Classify contour shape → (shape_name, confidence).
 
-    Uses circularity, aspect ratio, solidity, and vertex count.
-    Only classifies as arrow/circle/line if the element is large enough
-    to be a genuine graphic (not a text fragment).
+    Only detects high-confidence shapes that are clearly non-text:
+    - circle/balloon: high circularity (very reliable)
+    - illustration: large area (clearly a drawing/image)
+
+    Text fragments are classified as 'noise' and filtered out.
+    Boxes and colors are detected by separate modules.
     """
+    aspect = bw / bh if bh > 0 else 1.0
     perimeter = cv2.arcLength(contour, True)
     circularity = (4 * np.pi * area) / (perimeter * perimeter) if perimeter > 0 else 0
-
-    hull = cv2.convexHull(contour)
-    hull_area = cv2.contourArea(hull)
-    solidity = area / hull_area if hull_area > 0 else 0
-
-    # Approximate polygon
-    epsilon = 0.03 * perimeter
-    approx = cv2.approxPolyDP(contour, epsilon, True)
-    vertices = len(approx)
-
-    aspect = bw / bh if bh > 0 else 1.0
     min_dim = min(bw, bh)
-    max_dim = max(bw, bh)
 
-    # --- Circle / balloon --- (check first, most reliable)
-    # Must be reasonably large (not a dot/period) — min 15px
+    # --- Circle / balloon ---
+    # High circularity is the most reliable non-text indicator.
+    # Text characters rarely have circularity > 0.55.
     if circularity > 0.55 and 0.5 < aspect < 2.0 and min_dim > 15:
         conf = min(0.95, circularity)
         return "circle", conf
 
-    # --- Arrow detection --- (strict: must be sizable, distinct shape)
-    # Arrows must be at least 20px in both dimensions
-    if (min_dim > 20 and max_dim > 30
-            and 5 <= vertices <= 9
-            and 0.35 < solidity < 0.80
-            and circularity < 0.35):
-        hull_idx = cv2.convexHull(contour, returnPoints=False)
-        if len(hull_idx) >= 4:
-            try:
-                defects = cv2.convexityDefects(contour, hull_idx)
-                if defects is not None and len(defects) >= 2:
-                    max_depth = max(d[0][3] for d in defects) / 256.0
-                    if max_depth > min_dim * 0.25:
-                        return "arrow", min(0.75, 0.5 + max_depth / max_dim)
-            except cv2.error:
-                pass
-
-    # --- Line (decorative rule, separator) ---
-    # Must be long enough to not be a dash/hyphen
-    if (aspect > 6.0 or aspect < 1 / 6.0) and max_dim > 40:
-        return "line", 0.7
-
-    # --- Larger illustration (drawing, image) ---
+    # --- Illustration (drawing, image, large graphic) ---
+    # Large connected regions that survived word exclusion = genuine graphics.
     if area > 3000 and min_dim > 30:
         return "illustration", 0.6
 
-    # --- Generic icon (moderate size, non-text shape) ---
-    if area > 500 and min_dim > 15:
-        return "icon", 0.4
-
-    # Everything else is too small or text-like — skip
+    # Everything else is likely a text fragment — skip
     return "noise", 0.0
 
 
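The simplified decision order in this hunk can be tried out without OpenCV. The following is a minimal sketch, not the module's actual code: `classify_metrics` is a hypothetical stand-in that takes precomputed contour metrics (area, perimeter, bounding-box width and height) instead of a contour, and the sample values are made up for illustration. Only the thresholds are taken from the diff.

```python
import math


def classify_metrics(area: float, perimeter: float, bw: int, bh: int) -> tuple:
    """Hypothetical stand-in for the simplified classifier.

    Takes precomputed metrics rather than an OpenCV contour; the
    thresholds (0.55, 15, 3000, 30) are copied from the diff.
    """
    aspect = bw / bh if bh > 0 else 1.0
    circularity = (4 * math.pi * area) / (perimeter * perimeter) if perimeter > 0 else 0.0
    min_dim = min(bw, bh)

    # Circle / balloon: high circularity, roughly square bounding box, not a dot.
    if circularity > 0.55 and 0.5 < aspect < 2.0 and min_dim > 15:
        return "circle", min(0.95, circularity)

    # Illustration: large region in both area and smallest dimension.
    if area > 3000 and min_dim > 30:
        return "illustration", 0.6

    # Everything else is treated as a likely text fragment.
    return "noise", 0.0


# Illustrative inputs (assumed values, not measurements from real pages).
r = 20.0
disc = classify_metrics(math.pi * r * r, 2 * math.pi * r, 40, 40)  # ideal circle
stroke = classify_metrics(240.0, 128.0, 60, 4)                     # thin 60x4 bar
drawing = classify_metrics(5000.0, 600.0, 100, 80)                 # large ragged blob

print(disc, stroke, drawing)
```

An ideal circle has circularity 1.0, so its confidence is capped at 0.95. The thin bar scores about 0.18 and falls through to `noise` now that the `line` branch is gone, and the large ragged region is caught by the area check.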