Dictionary pages have small blue square bullets before entries, which OCR
misreads as text, producing artifacts. Three detection rules:

a) Tiny blue symbols (area < 150, conf < 85): catches ©, e, * etc.
b) X-overlapping word_boxes (>40%): remove the lower-confidence one
c) Duplicate blue text with gap < 6px: remove one copy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
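
For reference, the three rules could be sketched roughly as below. This is an illustration only, not the code from this commit. The names `WordBox`, `is_blue`, and `filter_bullet_artifacts` are hypothetical. It also assumes that area means width times height in pixels, that conf is on a 0-100 scale, that overlap is measured against the narrower box, that all boxes come from a single text line, and that the leftmost duplicate is the copy to keep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordBox:
    """Hypothetical OCR word box; field names are assumptions."""
    text: str
    x: int
    y: int
    w: int
    h: int
    conf: float    # assumed 0-100 OCR confidence
    is_blue: bool  # assumed to be set by an earlier colour check


def x_overlap_ratio(a: WordBox, b: WordBox) -> float:
    """Horizontal overlap as a fraction of the narrower box's width."""
    overlap = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    narrower = min(a.w, b.w)
    if overlap <= 0 or narrower <= 0:
        return 0.0
    return overlap / narrower


def x_gap(left: WordBox, right: WordBox) -> int:
    """Horizontal gap in pixels; negative when the boxes overlap."""
    return right.x - (left.x + left.w)


def filter_bullet_artifacts(boxes: list[WordBox]) -> list[WordBox]:
    """Apply rules a), b), c) to the word boxes of one text line."""
    # Rule a: drop tiny, low-confidence blue symbols.
    kept = [
        b for b in boxes
        if not (b.is_blue and b.w * b.h < 150 and b.conf < 85)
    ]

    # Rule b: of two boxes overlapping by more than 40% along x,
    # drop the one with lower confidence (the later one on a tie).
    dropped: set[int] = set()
    for i in range(len(kept)):
        for j in range(i + 1, len(kept)):
            if i in dropped or j in dropped:
                continue
            if x_overlap_ratio(kept[i], kept[j]) > 0.40:
                dropped.add(i if kept[i].conf < kept[j].conf else j)
    kept = [b for k, b in enumerate(kept) if k not in dropped]

    # Rule c: drop a blue box that repeats the previous blue box's
    # text with a gap under 6px; the leftmost copy is kept.
    kept.sort(key=lambda b: b.x)
    result: list[WordBox] = []
    for b in kept:
        prev = result[-1] if result else None
        if (
            prev is not None
            and prev.is_blue
            and b.is_blue
            and prev.text == b.text
            and x_gap(prev, b) < 6
        ):
            continue
        result.append(b)
    return result
```

The rules run in the order listed, so a tiny bullet symbol is gone before the overlap and duplicate checks look at the remaining boxes. Whether the real implementation orders them this way is not stated here.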