feat: OCR umlaut confusion correction + bold detection via stroke-width

- Add umlaut confusion rules (i→ü, a→ä, o→ö, u→ü) to _spell_fix_token for German text; fixes "iberqueren" → "überqueren" etc.
- Add _detect_bold() using OpenCV stroke-width analysis on cell crops
- Integrate bold detection in both narrow (cell-crop) and broad (word-lookup) paths
- Add is_bold field to GridCell TypeScript interface
- Render bold text in StepGroundTruth reconstruction view

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
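
For illustration only, the umlaut confusion rules listed above could be sketched roughly as follows. This is not the code from this commit: the names UMLAUT_CONFUSIONS, umlaut_candidates and fix_umlaut, and the set-based dictionary lookup, are assumptions made for the example. The real _spell_fix_token is not shown in this diff and may work differently.

```python
# Hypothetical confusion table mirroring the rules in the commit message:
# the OCR reads an umlaut as the plain letter on the left.
UMLAUT_CONFUSIONS = {"i": "ü", "a": "ä", "o": "ö", "u": "ü"}


def umlaut_candidates(token):
    """Yield variants of token with exactly one confusable letter replaced."""
    for idx, ch in enumerate(token):
        repl = UMLAUT_CONFUSIONS.get(ch.lower())
        if repl is None:
            continue
        if ch.isupper():
            repl = repl.upper()  # keep capitalisation, e.g. "I" -> "Ü"
        yield token[:idx] + repl + token[idx + 1:]


def fix_umlaut(token, dictionary):
    """Return a corrected token if a single swap yields a known word.

    `dictionary` is assumed to be a set of lowercase German words.
    Tokens that are already known, or that have no matching variant,
    are returned unchanged.
    """
    if token.lower() in dictionary:
        return token
    for candidate in umlaut_candidates(token):
        if candidate.lower() in dictionary:
            return candidate
    return token


if __name__ == "__main__":
    words = {"überqueren", "haus"}
    print(fix_umlaut("iberqueren", words))
    print(fix_umlaut("haus", words))
```

Checking the original token against the dictionary first keeps valid words such as "haus" from being rewritten, and limiting the search to single-letter swaps keeps the candidate set small.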
2026-03-05 12:06:57 +01:00
parent 40cfc1acdd
commit cd12755da6
3 changed files with 84 additions and 2 deletions
@@ -389,6 +389,7 @@ export function StepGroundTruth({ sessionId, onNext }: StepGroundTruthProps) {
                        height: `${cell.bbox_pct.h}%`,
                        color: '#1a1a1a',
                        fontSize: `${fontSize}px`,
+                        fontWeight: cell.is_bold ? 'bold' : 'normal',
                        fontFamily: "'Liberation Sans', 'DejaVu Sans', Arial, sans-serif",
                        display: 'flex',
                        alignItems: 'center',