693803fb7c0aff566079324803174968f56f04d2
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 42s
CI / test-go-edu-search (push) Successful in 42s
CI / test-python-klausur (push) Failing after 2m55s
CI / test-python-agent-core (push) Successful in 37s
CI / test-nodejs-website (push) Successful in 31s
Major improvements:

- Frequency-based boundary repair: always attempts a repair and uses the word-frequency product to decide ("Pound sand" → "Pounds and": 2000x better)
- IPA bracket protection: words inside [brackets] are never modified, even when the brackets land in tokenizer separators
- Slash→l substitution: "p/" → "pl" for an italic l misread as a slash
- Abbreviation guard uses a rare-word threshold (freq < 1e-6) instead of a binary known/unknown check; this prevents "Can I" → "Ca nI" while still fixing "ats th." → "at sth."
- Tokenizer includes the / character for slash-word detection

43 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
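The frequency-product decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names `TOY_FREQ`, `UNKNOWN_FREQ`, `word_freq`, and `repair_boundary` are hypothetical, and the frequency values are invented so that the "Pound sand" case comes out about 2000x in favor of the repair. The IPA bracket protection and the abbreviation guard are not modeled here.

```python
# Toy unigram frequencies (hypothetical values, for illustration only).
TOY_FREQ = {
    "pound": 1e-5,
    "sand": 1e-5,
    "pounds": 2e-5,
    "and": 1e-2,
    "can": 1e-3,
    "i": 1e-2,
}
UNKNOWN_FREQ = 1e-9  # assumed floor for words missing from the table


def word_freq(word):
    """Look up a case-insensitive frequency, falling back to the floor."""
    return TOY_FREQ.get(word.lower(), UNKNOWN_FREQ)


def repair_boundary(left, right):
    """Return the (left, right) split with the highest frequency product.

    Candidates are the original split plus the two splits obtained by
    moving a single character across the word boundary.
    """
    candidates = [(left, right)]
    if len(left) > 1:
        # Move the last character of `left` to the front of `right`.
        candidates.append((left[:-1], left[-1] + right))
    if len(right) > 1:
        # Move the first character of `right` to the end of `left`.
        candidates.append((left + right[0], right[1:]))
    return max(candidates, key=lambda pair: word_freq(pair[0]) * word_freq(pair[1]))
```

With these toy values, `repair_boundary("Pound", "sand")` picks `("Pounds", "and")`, while `repair_boundary("Can", "I")` keeps the original split because "Ca" and "nI" both fall back to the floor frequency.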