Commit Graph

15 Commits

Author SHA1 Message Date
Benjamin Admin
b03cb0a1e6 Fix Landkarte tab crash: variable name shadowed isInRag function
Local variables named 'isInRag' shadowed the outer function, causing
"isInRag is not a function" error. Renamed to regInRag/codeInRag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 00:01:01 +01:00
Benjamin Admin
5a45cbf605 Update RAG page: Chunks/Status columns use hardcoded data, Key Intersections show RAG status
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 24s
CI / test-python-klausur (push) Failing after 1m36s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 15s
- Chunks column now uses getKnownChunks() instead of API-based getRegulationChunks()
- Status column uses isInRag() check (green/red) instead of ratio-based calculation
- Key Intersections chips show green/red with checkmark/cross based on RAG status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:53:21 +01:00
Benjamin Admin
2297f66edb feat(rag): Add RAG status indicators and 4 new EU regulations
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m39s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 23s
- Add REGULATIONS_IN_RAG Set tracking all 42 regulations currently in Qdrant
- Add 4 new regulation entries: E-Commerce-RL, Verbraucherrechte-RL,
  Digitale-Inhalte-RL, DMA (all ingested Feb 2026)
- Add RAG column to regulations table with green check/red x indicators
- Update Landkarte tab: green/x on industry cards, thematic clusters,
  and regulation matrix
- Replace old "Integrated Regulations" section with full RAG coverage overview
- Update hardcoded chunk counts (Templates: 7689, NiBiS: 7996)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:23:52 +01:00
Benjamin Admin
587b066a40 feat(ocr-pipeline): ground-truth comparison tool for column detection
Side-by-side view: auto result (readonly) vs GT editor where teacher
draws correct columns. Diff table shows Auto vs GT with IoU matching.
GT data persisted per session for algorithm tuning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 22:48:37 +01:00
Benjamin Admin
bb879a03a8 feat(ocr-pipeline): add column_ignore type for margins/empty areas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 08:51:56 +01:00
Benjamin Admin
7a3570fe46 feat(ocr-pipeline): manual column editor for Step 3
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 08:27:54 +01:00
Benjamin Admin
1393a994f9 Flexible inhaltsbasierte Spaltenerkennung (2-Phasen)
Ersetzt hardcodierte Positionsregeln durch ein zweistufiges System:
Phase A erkennt Spaltengeometrie (Clustering), Phase B klassifiziert
Typen per Inhalt (Sprache/Rolle) mit 3-stufiger Fallback-Kette.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 23:33:35 +01:00
Benjamin Admin
cf27a95308 feat(ocr-pipeline): word-based 5-column detection for vocabulary pages
Replace projection-profile layout analysis with Tesseract word bounding
box clustering to detect 5-column vocabulary layouts (page_ref, EN, DE,
markers, examples). Falls back to projection profiles when < 3 clusters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 23:08:14 +01:00
Benjamin Admin
aa06ae0f61 feat: Persistente Sessions (PostgreSQL) + Spaltenerkennung (Step 3)
Sessions werden jetzt in PostgreSQL gespeichert statt in-memory.
Neue Session-Liste mit Name, Datum, Schritt. Sessions ueberleben
Browser-Refresh und Container-Neustart. Step 3 nutzt analyze_layout()
fuer automatische Spaltenerkennung mit farbigem Overlay.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 22:16:37 +01:00
Benjamin Admin
09b820efbe refactor(dewarp): replace displacement map with affine shear correction
The old displacement-map approach shifted entire rows by a parabolic
profile, creating a circle/barrel distortion. The actual problem is
a linear vertical shear: after deskew aligns horizontal lines, the
vertical column edges are still tilted by ~0.5°.

New approach:
- Detect shear angle from strongest vertical edge slope (not curvature)
- Apply cv2.warpAffine shear to straighten vertical features
- Manual slider: -2.0° to +2.0° in 0.05° steps
- Slider initializes to auto-detected shear angle
- Ground truth question: "Spalten vertikal ausgerichtet?"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:23:04 +01:00
Benjamin Admin
9df745574b fix(ocr-pipeline): dewarp visibility, grid on both sides, session persistence
- Fix dewarp method selection: prefer methods with >5px curvature over
  higher confidence (vertical_edge 79px was being ignored for text_baseline 2px)
- Add grid overlay on left image in Dewarp step for side-by-side comparison
- Add GET /sessions/{id} endpoint to reload session data
- StepDeskew accepts sessionId prop to restore state when navigating back
- SessionInfo type extended with optional deskew_result and dewarp_result

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:29:53 +01:00
Benjamin Admin
589d2f811a feat: Dewarp-Korrektur als Schritt 2 in OCR Pipeline (7 Schritte)
Implementiert Buchwoelbungs-Entzerrung mit zwei Methoden:
- Methode A: Vertikale-Kanten-Analyse (Sobel + Polynom 2. Grades)
- Methode B: Textzeilen-Baseline (Tesseract + Baseline-Kruemmung)
Beste Methode wird automatisch gewaehlt, manueller Slider (-3 bis +3).

Backend: 3 neue Endpoints (auto/manual dewarp, ground truth)
Frontend: StepDewarp + DewarpControls, Pipeline von 6 auf 7 Schritte

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:46:41 +01:00
Benjamin Admin
d552fd8b6b feat: OCR Pipeline mit 6-Schritt-Wizard fuer Seitenrekonstruktion
All checks were successful
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 38s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Successful in 1m46s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 22s
Neue Route /ai/ocr-pipeline mit schrittweiser Begradigung (Deskew),
Raster-Overlay und Ground Truth. Schritte 2-6 als Platzhalter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 15:38:08 +01:00
Benjamin Boenisch
6a53f8d79c refactor: Remove all SDK/compliance pages and API routes from admin-lehrer
SDK/compliance content belongs exclusively in admin-compliance (port 3007).
Removed:
- All (sdk)/ pages (document-crawler, dsb-portal, industry-templates, multi-tenant, sso)
- All api/sdk/ proxy routes
- All developers/sdk/ documentation pages
- Unused lib/sdk/ modules (kept: catalog-manager + its deps for dashboard)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 09:24:36 +01:00
Benjamin Boenisch
5a31f52310 Initial commit: breakpilot-lehrer - Lehrer KI Platform
Services: Admin-Lehrer, Backend-Lehrer, Studio v2, Website,
Klausur-Service, School-Service, Voice-Service, Geo-Service,
BreakPilot Drive, Agent-Core

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 23:47:26 +01:00