Benjamin Admin
9da45c2a59
Fix false header detection and add decorative margin/footer filters
- Remove all_colored spanning header heuristic that falsely flagged
colored vocabulary entries (Scotland, secondary school) as headers
- Add _filter_decorative_margin: removes vertical A-Z alphabet strips
along page margins (single-char words in a compact vertical strip)
- Add _filter_footer_words: removes page numbers in bottom 5% of page
- Tighten spanning header rule: require ≥3 columns spanned + ≤3 words
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 10:38:20 +01:00
..
2026-03-14 23:41:03 +01:00
2026-03-12 13:41:39 +01:00
2026-03-16 18:42:46 +01:00
2026-03-16 08:12:52 +01:00
2026-03-11 20:41:29 +01:00
2026-03-18 09:13:09 +01:00
2026-03-17 18:09:16 +01:00
2026-03-16 18:42:46 +01:00
2026-03-14 08:26:04 +01:00
2026-03-17 16:39:15 +01:00
2026-03-09 15:24:56 +01:00
2026-03-12 06:46:05 +01:00
2026-03-09 15:06:23 +01:00
2026-03-17 10:41:30 +01:00
2026-03-18 10:38:20 +01:00
2026-03-14 23:41:03 +01:00
2026-03-18 08:42:00 +01:00
2026-03-18 08:42:00 +01:00
2026-03-18 08:42:00 +01:00
2026-03-18 08:42:00 +01:00
2026-03-18 08:42:00 +01:00
2026-03-18 08:46:49 +01:00
2026-03-18 08:42:00 +01:00
2026-03-18 08:42:00 +01:00
2026-03-16 12:31:09 +01:00
2026-03-18 08:42:00 +01:00
2026-03-18 08:42:00 +01:00
2026-03-17 16:34:06 +01:00
2026-03-17 17:01:34 +01:00
2026-03-07 22:16:29 +01:00