1. Add per-cell artifact filter (4b2): removes single-word cells with
   ≤2 chars and confidence <65 (e.g. "as" from stray OCR marks)
2. Add narrow connector column normalization (4d2): when ≥60% of cells
   in a column share the same short text (e.g. "oder"), normalize
   near-match outliers like "oderb" → "oder"
3. Fix footer detection: require short text (≤20 chars) and no commas.
   Comma-separated lists like "Uhrzeit, Vergangenheit, Zukunft" are
   content continuations, not page numbers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
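
For illustration, the three heuristics above could be sketched roughly as follows. This is not the code from this commit: the function names, the `max_len=5` cutoff for "short text" in a connector column, and the `difflib` similarity threshold of 0.8 are assumptions chosen for the example. Only the ≤2 chars, <65 confidence, ≥60% share, ≤20 chars, and no-comma rules come from the description above. The footer check covers only the text condition; the actual detection presumably also uses position or other signals.

```python
from collections import Counter
from difflib import SequenceMatcher


def is_cell_artifact(text: str, confidence: float) -> bool:
    """4b2 (sketch): single-word cell with <=2 chars and confidence < 65."""
    words = text.split()
    return len(words) == 1 and len(words[0]) <= 2 and confidence < 65


def normalize_connector_column(cells, min_share=0.6, max_len=5, min_ratio=0.8):
    """4d2 (sketch): snap near-match outliers to the column's dominant short text.

    max_len and min_ratio are assumed values, not taken from the commit.
    """
    filled = [c.strip() for c in cells if c.strip()]
    if not filled:
        return list(cells)
    dominant, count = Counter(filled).most_common(1)[0]
    # Only act on narrow columns where one short text clearly dominates.
    if len(dominant) > max_len or count / len(filled) < min_share:
        return list(cells)
    result = []
    for cell in cells:
        stripped = cell.strip()
        similar = SequenceMatcher(None, stripped, dominant).ratio() >= min_ratio
        # Empty cells and clearly different texts are left untouched.
        result.append(dominant if stripped and similar else cell)
    return result


def passes_footer_text_check(text: str, max_len: int = 20) -> bool:
    """Footer text condition (sketch): short, non-empty, and free of commas."""
    stripped = text.strip()
    return bool(stripped) and len(stripped) <= max_len and "," not in stripped
```

With these assumed thresholds, a column such as `["oder", "oder", "oderb", "oder", "oder"]` would be normalized to all `"oder"`, while a column with no dominant text is returned unchanged.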