Benjamin Admin 87931c35e4 fix(ocr-pipeline): stop noise filter from stripping parenthesized words
_is_noise_tail_token() treated words with unbalanced parentheses, such as
"selbst)" or "(wir", as OCR noise because the parenthesis counted as
"internal noise". It now strips leading and trailing parentheses before
the noise check, so legitimate words in example sentences like
"We baked ... (wir ... selbst)" are preserved.
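
The idea of the fix can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name is_noise_tail_token_sketch and the regex-based noise rule are assumptions, since the commit message does not show the real criteria. Only the step of stripping parentheses before the check comes from the description above.

```python
import re

# Assumed noise rule (hypothetical): a clean token consists of letters only,
# optionally joined by single apostrophes or hyphens.
_WORD_RE = re.compile(r"^[^\W\d_]+(?:['-][^\W\d_]+)*$")


def is_noise_tail_token_sketch(token: str) -> bool:
    """Return True if the token looks like OCR noise under the assumed rule."""
    # Strip leading/trailing parentheses first, so "(wir" and "selbst)"
    # are judged on the word itself rather than on the parenthesis.
    core = token.strip("()")
    if not core:
        # Nothing but parentheses: no word content to preserve.
        return True
    # Any remaining non-letter character (including a parenthesis in the
    # middle of the token) still counts as internal noise.
    return _WORD_RE.match(core) is None


for tok in ["(wir", "selbst)", "w(ir", "~|"]:
    print(tok, is_noise_tail_token_sketch(tok))
```

Under this assumed rule, "(wir" and "selbst)" are kept, while a token with a parenthesis in the middle, such as "w(ir", is still treated as noise.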

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 12:51:28 +01:00
42 MiB
Languages
TypeScript 60.2%
Python 32.9%
Go 5.5%
C# 0.8%
CSS 0.2%
Other 0.3%