Benjamin Admin
46c8c28d34
fix: border strip pre-filter + 3-column detection for vocabulary tables
The border strip filter (Step 4e) used the LARGEST x-gap, which incorrectly
removed base words along with edge artifacts. It now uses a two-stage approach:

1. _filter_border_strip_words() pre-filters raw words BEFORE column detection,
   scanning from the page edge inward to find the FIRST significant gap (>30px).
2. Step 4e runs only as a fallback when the pre-filter did not apply.

Session 4233 now correctly detects 3 columns (base word | oder | synonyms)
instead of 2. The threshold was raised from 15% to 20% to handle pages with many
edge artifacts. All 4 ground-truth sessions pass regression.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 21:01:43 +01:00
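
The pre-filter stage described in the commit message can be sketched roughly as follows. This is a hypothetical illustration, not the code from this commit: the `Word` type, the function name `filter_left_border_strip`, the left-edge-only scan, and the reading of the 20% threshold as "the largest share of words a border strip may contain" are all assumptions. Only the 30px gap and the first-gap-from-the-edge idea come from the message itself.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    left: int   # x-coordinate of the left edge, in pixels
    width: int  # box width, in pixels


def filter_left_border_strip(words, min_gap_px=30, max_strip_fraction=0.20):
    """Drop words that sit left of the FIRST x-gap wider than min_gap_px.

    Hypothetical sketch. Returns (kept_words, applied). The strip is only
    removed when it holds at most max_strip_fraction of all words (assumed
    meaning of the 20% threshold); otherwise the input is returned unchanged.
    """
    if len(words) < 2:
        return list(words), False

    ordered = sorted(words, key=lambda w: w.left)
    right_edge = ordered[0].left + ordered[0].width

    for i in range(1, len(ordered)):
        gap = ordered[i].left - right_edge
        if gap > min_gap_px:
            # First significant gap from the edge: words before index i
            # are the strip candidate. Stop here; later gaps are ignored.
            if i / len(ordered) <= max_strip_fraction:
                strip_ids = {id(w) for w in ordered[:i]}
                return [w for w in words if id(w) not in strip_ids], True
            return list(words), False  # too many words to be an artifact
        right_edge = max(right_edge, ordered[i].left + ordered[i].width)

    return list(words), False


# One edge artifact plus three rows of (base word | oder | synonyms).
rows = [("gross", "oder", "riesig"), ("klein", "oder", "winzig"), ("schnell", "oder", "rasch")]
page = [Word("|", 2, 8)]
for base, sep, syn in rows:
    page += [Word(base, 60, 80), Word(sep, 200, 40), Word(syn, 400, 90)]

kept, applied = filter_left_border_strip(page)
```

In this toy page the largest x-gap lies between the `oder` column and the synonyms, so a largest-gap rule would cut off the base words as well. Scanning inward from the edge stops at the 50px gap right after the artifact and removes only that one word. When no strip is found, `applied` is `False`, which is the condition under which a fallback such as Step 4e would run.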