breakpilot-lehrer

Author	SHA1	Message	Date
Benjamin Admin	2a21127f01	fix(ocr-pipeline): improve page crop spine detection and cell assignment. 1. page_crop: Score all dark runs by center-proximity × darkness × narrowness instead of picking the widest. Fixes ad810209 where a wide dark area at 35% was chosen over the actual spine at 50%. 2. cv_words_first: Replace x-center-only word→column assignment with overlap-based three-pass strategy (overlap → midpoint-range → nearest). Fixes truncated German translations like "Schal" instead of "Schal - die Schals" in session 079cd0d9. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-24 09:23:30 +01:00
Benjamin Admin	c09838e91c	Fix spine shadow false positives: require dark valley, brightness rise, trim convolution edges. The _detect_spine_shadow function was triggering on normal text content because the shadow_range > 20 threshold was too low and convolution edge artifacts created artificially low values. Now requires: range > 40, darkest < 180, narrow valley (not text plateau), and brightness rise toward page content. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 08:23:50 +01:00
Benjamin Admin	3fd6523872	Cut at spine center (darkest point) instead of shadow edge. Refactor left/right shadow detection into shared _detect_spine_shadow() that finds the darkest column (= book spine center) via argmin of smoothed brightness. Both sides now cut at the spine center, ensuring equal page sizes in double-page scans regardless of shadow position. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 07:54:33 +01:00
Benjamin Admin	e56391b0c3	Add right-edge spine shadow detection for book scans. Mirror the left-edge shadow detection for the right side: analyze brightness gradient in the right 25% to find scanner gray strips from book spines. Cuts at the last bright column before the shadow dip. Fixes cropping of book scans where the next page bleeds in. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-19 07:41:13 +01:00
Benjamin Admin	d66efdecf5	fix: NameError in detect_page_splits, 'gaps' var removed in rewrite. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 17:01:34 +01:00
Benjamin Admin	d36972b464	fix: detect spine by brightness, not ink density. The previous algorithm used binary ink projection and found false splits at normal text column gaps. The spine of a book on a scanner has a characteristic dark gray strip (scanner bed) flanked by bright white paper on both sides. New approach: column-mean brightness with heavy smoothing, looking for a dark valley (< 88% of paper brightness) in the center region that has bright paper on both sides. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 16:52:29 +01:00
Benjamin Admin	f30e526917	fix: merge nearby spine gaps + handle multi-page crop in frontend. Backend: merge gaps within 5% of image width, since the spine area may have thin ink strips splitting one physical gap into multiple detected gaps. Only use gaps >= 2% width as split points. Frontend: StepCrop now handles multi_page crop responses without crashing on missing original_size/cropped_size fields. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 16:44:32 +01:00
Benjamin Admin	902de027f4	feat: auto-detect multi-page spreads and split into sub-sessions. When a book scan (double-page spread) is detected during the crop step, the system automatically: 1. Detects vertical center gaps (spine area) via ink density projection 2. Splits into N page sub-sessions (reusing existing sub-session mechanism) 3. Individually crops each page (removing its own borders) 4. Returns sub-session IDs for downstream pipeline processing. Detection: landscape images (w > h * 1.15), vertical gap < 15% peak density in center region (25-75%), gap width >= 0.8% of image width. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 16:34:06 +01:00
Benjamin Admin	156a818246	refactor: move crop after deskew/dewarp + content-based book-scan crop. New pipeline order: orientation → deskew → dewarp → crop → columns... Crop now operates on the already straightened image, which gives better results. page_crop.py completely replaced: adaptive threshold + 4-edge detection (book-spine shadow on the left, ink projection for all margins) instead of Otsu + largest contour. Backend: step numbers, input images, and reprocess cascade adjusted. Frontend: PIPELINE_STEPS reordered, switch cases and before-images updated. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 08:52:11 +01:00
Benjamin Admin	2763631711	feat: orientation + cropping as steps 1-2 in the OCR pipeline. Two new wizard steps before deskew: - Step 1: orientation detection (0/90/180/270° via Tesseract OSD) - Step 2: page-margin detection and cropping (remove scanner borders). Backend: - orientation_crop_api.py: POST /orientation, POST /crop, POST /crop/skip - page_crop.py: detect_and_crop_page() with format detection (A4/A5/Letter) - Session store: orientation_result, crop_result fields - Pipeline uses the cropped image for deskew/dewarp. Frontend: - StepOrientation.tsx: upload + auto-orientation + before/after - StepCrop.tsx: auto-crop + format badge + skip option - Pipeline stepper: 10 steps (was 8). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 23:55:23 +01:00
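
Commit 2a21127f01 describes the word→column assignment only as "overlap → midpoint-range → nearest". The following is a minimal sketch of one way such a three-pass strategy could look, not the code from cv_words_first. The function name `assign_column`, the `(x_min, x_max)` column tuples, and the reading of "midpoint-range" as "the boundary sits halfway across the gap between two neighbouring columns" are all assumptions made for illustration. Columns are assumed to be sorted left to right and non-overlapping.

```python
from typing import List, Tuple

Column = Tuple[float, float]  # hypothetical (x_min, x_max) of one column


def assign_column(word_left: float, word_right: float, columns: List[Column]) -> int:
    """Return the index of the column a word box most plausibly belongs to."""
    if not columns:
        raise ValueError("at least one column is required")

    # Pass 1: pick the column with the largest horizontal overlap.
    best_idx, best_overlap = -1, 0.0
    for i, (c_left, c_right) in enumerate(columns):
        overlap = min(word_right, c_right) - max(word_left, c_left)
        if overlap > best_overlap:
            best_idx, best_overlap = i, overlap
    if best_idx >= 0:
        return best_idx

    # Pass 2: no overlap at all. Check whether the word center falls into a
    # column's range extended to the midpoint of the gap on either side.
    center = (word_left + word_right) / 2
    last = len(columns) - 1
    for i, (c_left, c_right) in enumerate(columns):
        lo = c_left if i == 0 else (columns[i - 1][1] + c_left) / 2
        hi = c_right if i == last else (c_right + columns[i + 1][0]) / 2
        if lo <= center < hi:
            return i

    # Pass 3: word lies outside all ranges, fall back to the nearest column center.
    return min(
        range(len(columns)),
        key=lambda i: abs(center - (columns[i][0] + columns[i][1]) / 2),
    )
```

With `columns = [(0, 100), (120, 300)]`, a word spanning 90 to 180 is resolved by overlap in pass 1, a small word sitting entirely in the gap is resolved in pass 2, and a word to the right of x = 300 falls through to pass 3.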

10 Commits
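
The spine detection described in d36972b464 and 2a21127f01 (column-mean brightness, heavy smoothing, dark valley below 88% of paper brightness, runs scored by center-proximity × darkness × narrowness) can be sketched roughly as below. This is an illustrative reconstruction under stated assumptions, not the contents of page_crop.py: the function name `find_spine_x`, the 90th-percentile estimate of paper brightness, the 2% smoothing window, and the exact form of each score factor are hypothetical. Only the 88% ratio and the three-factor product come from the commit messages.

```python
import numpy as np


def find_spine_x(gray: np.ndarray, dark_ratio: float = 0.88, smooth_frac: float = 0.02):
    """Return the x position of the most spine-like dark valley, or None."""
    _, w = gray.shape
    profile = gray.mean(axis=0).astype(float)  # column-mean brightness

    # Box smoothing with edge padding, so the borders are not pulled toward zero.
    k = max(3, int(w * smooth_frac) | 1)  # odd window size
    padded = np.pad(profile, k // 2, mode="edge")
    smooth = np.convolve(padded, np.ones(k) / k, mode="valid")

    paper = np.percentile(smooth, 90)  # assumed estimate of paper brightness
    dark = smooth < dark_ratio * paper

    # Collect contiguous dark runs as half-open intervals [start, end).
    runs, start = [], None
    for x, is_dark in enumerate(dark):
        if is_dark and start is None:
            start = x
        elif not is_dark and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, w))

    best_x, best_score = None, 0.0
    for a, b in runs:
        seg = smooth[a:b]
        x_dark = a + int(np.argmin(seg))              # darkest column in the run
        center_prox = 1.0 - abs(x_dark / w - 0.5) * 2.0
        darkness = 1.0 - seg.min() / paper
        narrowness = 1.0 - (b - a) / w
        score = center_prox * darkness * narrowness
        if score > best_score:
            best_x, best_score = x_dark, score
    return best_x
```

Scoring every run, rather than taking the widest one, is what lets a narrow valley near 50% of the width beat a broader but lighter dark area near 35%, the failure case named in 2a21127f01.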