Fix word-split tiebreaker: prefer longer first word

"taskis" was split as "ta skis" instead of "task is" because both
splits have the same DP score. Changed the comparison from > to >= so
that later candidates (which have a longer first word) win ties.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-04-12 09:05:14 +02:00
parent 7ffa4c90f9
commit 4f4e6c31fa

@@ -758,7 +758,8 @@ def _try_split_merged_word(token: str) -> Optional[str]:
                 dp[i] = (new_words, new_sq)
             else:
                 old_key = (-len(dp[i][0]), dp[i][1])
-                if new_key > old_key:
+                if new_key >= old_key:
+                    # >= so that later splits (longer first word) win ties
                     dp[i] = (new_words, new_sq)
     if dp[n] is None or len(dp[n][0]) < 2:
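
To make the tie concrete, here is a minimal, self-contained sketch of the kind of DP the hunk appears to belong to. It is not the project's actual `_try_split_merged_word`: the function name `split_merged`, the `vocab` and `later_wins_ties` parameters, the sum-of-squared-lengths score, and the ascending loop over split points are all assumptions made for illustration. Only the key shape `(-len(words), sq)` and the `>` versus `>=` comparison come from the diff.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical toy vocabulary; the real word list is not shown in the commit.
VOCAB: Set[str] = {"ta", "skis", "task", "is"}


def split_merged(token: str, vocab: Set[str], later_wins_ties: bool) -> Optional[str]:
    """Split `token` into vocabulary words, preferring fewer words and then
    a higher sum of squared word lengths (assumed scoring)."""
    n = len(token)
    # dp[i] holds the best (words, sum_of_squares) covering token[:i], or None.
    dp: List[Optional[Tuple[List[str], int]]] = [None] * (n + 1)
    dp[0] = ([], 0)
    for i in range(1, n + 1):
        # Ascending j: later candidates end with a shorter last word, so in a
        # two-word split they have a longer first word.
        for j in range(i):
            if dp[j] is None or token[j:i] not in vocab:
                continue
            word = token[j:i]
            new_words = dp[j][0] + [word]
            new_sq = dp[j][1] + len(word) ** 2
            new_key = (-len(new_words), new_sq)
            if dp[i] is None:
                dp[i] = (new_words, new_sq)
            else:
                old_key = (-len(dp[i][0]), dp[i][1])
                # Tuples compare lexicographically; equal keys are a tie.
                if later_wins_ties:
                    better = new_key >= old_key  # behaviour after this commit
                else:
                    better = new_key > old_key   # behaviour before this commit
                if better:
                    dp[i] = (new_words, new_sq)
    if dp[n] is None or len(dp[n][0]) < 2:
        return None
    return " ".join(dp[n][0])


print(split_merged("taskis", VOCAB, later_wins_ties=False))  # ta skis
print(split_merged("taskis", VOCAB, later_wins_ties=True))   # task is
```

Under these assumptions both candidates produce the key `(-2, 20)`, since 2² + 4² = 4² + 2². With `>` the first candidate found ("ta skis") is kept, and with `>=` the later one ("task is") replaces it.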