Fix SmartSpellChecker: preserve leading non-alpha text like (=

The tokenizer regex only matches alphabetic characters, so text before the first word match (like "(= " in "(= I won...") was silently dropped when reassembling the corrected text. Now preserves text[:first_match_start] as a leading prefix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:41:33 +02:00
parent 596864431b
commit 4561320e0d
1 changed file with 7 additions and 0 deletions
@@ -534,6 +534,13 @@ class SmartSpellChecker:

        # --- Pass 3: Per-word correction ---
        parts: List[str] = []
+
+        # Preserve any leading text before the first token match
+        # (e.g., "(= " before "I won and he lost.")
+        first_start = tokens[0].start() if tokens else 0
+        if first_start > 0:
+            parts.append(text[:first_start])
+
        for i, (word, sep) in enumerate(token_list):
            # Skip words inside IPA brackets (brackets land in separators)
            prev_sep = token_list[i - 1][1] if i > 0 else ""
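
The behavior described in the commit message can be reproduced with a minimal sketch. It assumes the tokenizer is a plain alphabetic regex and that each word is paired with the separator that follows it. `WORD_RE`, `reassemble`, and `keep_prefix` are hypothetical names for illustration, not part of SmartSpellChecker, and no spelling correction is applied here.

```python
import re
from typing import List, Tuple

# Hypothetical stand-in for the tokenizer regex: alphabetic runs only.
WORD_RE = re.compile(r"[A-Za-z]+")


def reassemble(text: str, keep_prefix: bool = True) -> str:
    """Split text into (word, separator) pairs and join them back together."""
    tokens = list(WORD_RE.finditer(text))

    # Pair each word with the text between it and the next word (or end of text).
    token_list: List[Tuple[str, str]] = []
    for i, match in enumerate(tokens):
        next_start = tokens[i + 1].start() if i + 1 < len(tokens) else len(text)
        token_list.append((match.group(), text[match.end():next_start]))

    parts: List[str] = []

    # Same idea as the patch: keep whatever precedes the first match.
    first_start = tokens[0].start() if tokens else 0
    if keep_prefix and first_start > 0:
        parts.append(text[:first_start])

    for word, sep in token_list:
        parts.append(word)  # a real checker would correct the word here
        parts.append(sep)
    return "".join(parts)


print(reassemble("(= I won and he lost.", keep_prefix=False))
print(reassemble("(= I won and he lost.", keep_prefix=True))
```

With `keep_prefix=False` the leading "(= " is lost because no (word, separator) pair covers it; with `keep_prefix=True` the input round-trips unchanged. In this sketch, text with no alphabetic match at all still yields an empty string; whether the real method handles that case elsewhere is not visible in this hunk.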