Fix IPA continuation: skip words with inline IPA, recover emptied cells
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 25s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m46s
CI / test-python-agent-core (push) Successful in 14s
CI / test-nodejs-website (push) Successful in 15s
Three fixes:

1. fix_ipa_continuation_cell: when headword has inline IPA like "beat [bˈiːt] , beat, beaten", only generate IPA for uncovered words (beaten), not words already shown (beat). When bracket is at end like "the Highlands [ˈhaɪləndz]", return inline IPA directly.
2. Step 5d: recover garbled IPA from word_boxes when Step 5c emptied the cell text (e.g. "[n, nn]" → "").
3. Added 2 tests for inline IPA behavior (35 total).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
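The rule described in fix 1 can be illustrated with a minimal sketch. This is not the project's fix_ipa_continuation_cell: the function name `sketch_continuation_ipa`, the `TOY_IPA` lexicon, and the choice to treat every word before the bracket as "already shown" are all assumptions made for illustration, and the real function also takes the garbled cell text and a `pronunciation` argument that are ignored here.

```python
import re

# Hypothetical mini-lexicon standing in for a real pronunciation dictionary.
TOY_IPA = {
    "beat": "bˈiːt",
    "beaten": "bˈiːtən",
    "the": "ðə",
    "highlands": "ˈhaɪləndz",
}

BRACKET_RE = re.compile(r"\[([^\]]+)\]")
WORD_RE = re.compile(r"[A-Za-z]+")


def sketch_continuation_ipa(headword: str, lexicon=TOY_IPA) -> str:
    """Return bracketed IPA for the words of `headword` that lack inline IPA."""
    match = BRACKET_RE.search(headword)
    if match is None:
        # No inline IPA at all: every word is a candidate.
        covered = set()
        candidates = WORD_RE.findall(headword)
    else:
        after = headword[match.end():]
        if not after.strip():
            # Bracket closes the headword: reuse the inline IPA as-is.
            return f"[{match.group(1)}]"
        # Assumption: words before the bracket are the ones already shown.
        covered = {w.lower() for w in WORD_RE.findall(headword[:match.start()])}
        candidates = WORD_RE.findall(after)

    parts = []
    for word in candidates:
        key = word.lower()
        if key in covered or key not in lexicon:
            continue  # skip words already shown or unknown to the lexicon
        covered.add(key)  # never emit the same word twice
        parts.append(lexicon[key])
    return "[" + " ".join(parts) + "]" if parts else ""


print(sketch_continuation_ipa("beat [bˈiːt] , beat, beaten"))
print(sketch_continuation_ipa("the Highlands [ˈhaɪləndz]"))
```

With the toy lexicon, the first call yields IPA only for "beaten", and the second returns the inline bracket unchanged, so "the" never receives a transcription.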
@@ -510,6 +510,23 @@ class TestGarbledIpaDetection:
         assert "klˈəʊs" in fixed  # close IPA
         assert "dˈaʊn" in fixed  # down IPA — must NOT be skipped

+    def test_continuation_skips_words_with_inline_ipa(self):
+        """'beat [bˈiːt] , beat, beaten' → continuation only for 'beaten'."""
+        fixed = fix_ipa_continuation_cell(
+            "[bi:tan]", "beat [bˈiːt] , beat, beaten", pronunciation="british",
+        )
+        # Should only have IPA for "beaten", NOT for "beat" (already inline)
+        assert "bˈiːtən" in fixed
+        assert fixed.count("bˈiːt") == 0 or fixed == "[bˈiːtən]"
+
+    def test_continuation_bracket_at_end_returns_inline(self):
+        """'the Highlands [ˈhaɪləndz]' → return inline IPA, not IPA for 'the'."""
+        fixed = fix_ipa_continuation_cell(
+            "'hailandz", "the Highlands [ˈhaɪləndz]", pronunciation="british",
+        )
+        assert fixed == "[ˈhaɪləndz]"
+        assert "ðə" not in fixed  # "the" must NOT get IPA
+
     def test_headword_with_brackets_not_continuation(self):
         """'employee [im'ploi:]' has a headword outside brackets → not garbled.