Commit Graph

71 Commits

Author SHA1 Message Date
Benjamin Admin
b03cb0a1e6 Fix Landkarte tab crash: variable name shadowed isInRag function
Local variables named 'isInRag' shadowed the outer function, causing
"isInRag is not a function" error. Renamed to regInRag/codeInRag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 00:01:01 +01:00
Benjamin Admin
5a45cbf605 Update RAG page: Chunks/Status columns use hardcoded data, Key Intersections show RAG status
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 24s
CI / test-go-edu-search (push) Successful in 24s
CI / test-python-klausur (push) Failing after 1m36s
CI / test-python-agent-core (push) Successful in 16s
CI / test-nodejs-website (push) Successful in 15s
- Chunks column now uses getKnownChunks() instead of API-based getRegulationChunks()
- Status column uses isInRag() check (green/red) instead of ratio-based calculation
- Key Intersections chips show green/red with checkmark/cross based on RAG status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:53:21 +01:00
Benjamin Admin
2297f66edb feat(rag): Add RAG status indicators and 4 new EU regulations
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 26s
CI / test-go-edu-search (push) Successful in 27s
CI / test-python-klausur (push) Failing after 1m39s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 23s
- Add REGULATIONS_IN_RAG Set tracking all 42 regulations currently in Qdrant
- Add 4 new regulation entries: E-Commerce-RL, Verbraucherrechte-RL,
  Digitale-Inhalte-RL, DMA (all ingested Feb 2026)
- Add RAG column to regulations table with green check/red x indicators
- Update Landkarte tab: green/x on industry cards, thematic clusters,
  and regulation matrix
- Replace old "Integrated Regulations" section with full RAG coverage overview
- Update hardcoded chunk counts (Templates: 7689, NiBiS: 7996)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 23:23:52 +01:00
Benjamin Admin
587b066a40 feat(ocr-pipeline): ground-truth comparison tool for column detection
Side-by-side view: auto result (readonly) vs GT editor where teacher
draws correct columns. Diff table shows Auto vs GT with IoU matching.
GT data persisted per session for algorithm tuning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 22:48:37 +01:00
Benjamin Admin
bb879a03a8 feat(ocr-pipeline): add column_ignore type for margins/empty areas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 08:51:56 +01:00
Benjamin Admin
f535d3c967 fix(ocr-pipeline): manual editor layout + no re-detection on cached result
- ManualColumnEditor now uses grid-cols-2 layout (image left, controls right)
  matching the normal view size so the image doesn't zoom in
- StepColumnDetection only runs auto-detection when no cached result exists;
  revisiting step 3 loads cached columns without re-running detection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 08:45:49 +01:00
Benjamin Admin
7a3570fe46 feat(ocr-pipeline): manual column editor for Step 3
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 08:27:54 +01:00
Benjamin Admin
1393a994f9 Flexible inhaltsbasierte Spaltenerkennung (2-Phasen)
Ersetzt hardcodierte Positionsregeln durch ein zweistufiges System:
Phase A erkennt Spaltengeometrie (Clustering), Phase B klassifiziert
Typen per Inhalt (Sprache/Rolle) mit 3-stufiger Fallback-Kette.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 23:33:35 +01:00
Benjamin Admin
cf27a95308 feat(ocr-pipeline): word-based 5-column detection for vocabulary pages
Replace projection-profile layout analysis with Tesseract word bounding
box clustering to detect 5-column vocabulary layouts (page_ref, EN, DE,
markers, examples). Falls back to projection profiles when < 3 clusters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 23:08:14 +01:00
Benjamin Admin
aa06ae0f61 feat: Persistente Sessions (PostgreSQL) + Spaltenerkennung (Step 3)
Sessions werden jetzt in PostgreSQL gespeichert statt in-memory.
Neue Session-Liste mit Name, Datum, Schritt. Sessions ueberleben
Browser-Refresh und Container-Neustart. Step 3 nutzt analyze_layout()
fuer automatische Spaltenerkennung mit farbigem Overlay.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 22:16:37 +01:00
Benjamin Admin
09b820efbe refactor(dewarp): replace displacement map with affine shear correction
The old displacement-map approach shifted entire rows by a parabolic
profile, creating a circle/barrel distortion. The actual problem is
a linear vertical shear: after deskew aligns horizontal lines, the
vertical column edges are still tilted by ~0.5°.

New approach:
- Detect shear angle from strongest vertical edge slope (not curvature)
- Apply cv2.warpAffine shear to straighten vertical features
- Manual slider: -2.0° to +2.0° in 0.05° steps
- Slider initializes to auto-detected shear angle
- Ground truth question: "Spalten vertikal ausgerichtet?"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:23:04 +01:00
Benjamin Admin
ff2bb79a91 fix(dewarp): change manual slider to percentage (0-200%) instead of raw multiplier
The old -3.0 to +3.0 scale multiplied the full displacement map (up to ~79px)
directly, causing extreme distortion at values >1. New slider:
- 0% = no correction
- 100% = auto-detected correction (default)
- 200% = double correction
- Step size: 5%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:10:34 +01:00
Benjamin Admin
9df745574b fix(ocr-pipeline): dewarp visibility, grid on both sides, session persistence
- Fix dewarp method selection: prefer methods with >5px curvature over
  higher confidence (vertical_edge 79px was being ignored for text_baseline 2px)
- Add grid overlay on left image in Dewarp step for side-by-side comparison
- Add GET /sessions/{id} endpoint to reload session data
- StepDeskew accepts sessionId prop to restore state when navigating back
- SessionInfo type extended with optional deskew_result and dewarp_result

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:29:53 +01:00
Benjamin Admin
44e8c573af fix: Deskew Ground Truth Frage auf Rotation beschraenken
"Korrekt ausgerichtet?" → "Rotation korrekt?" mit Hinweis,
dass Woelbung/Verzerrung im naechsten Schritt korrigiert wird.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:16:24 +01:00
Benjamin Admin
589d2f811a feat: Dewarp-Korrektur als Schritt 2 in OCR Pipeline (7 Schritte)
Implementiert Buchwoelbungs-Entzerrung mit zwei Methoden:
- Methode A: Vertikale-Kanten-Analyse (Sobel + Polynom 2. Grades)
- Methode B: Textzeilen-Baseline (Tesseract + Baseline-Kruemmung)
Beste Methode wird automatisch gewaehlt, manueller Slider (-3 bis +3).

Backend: 3 neue Endpoints (auto/manual dewarp, ground truth)
Frontend: StepDewarp + DewarpControls, Pipeline von 6 auf 7 Schritte

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:46:41 +01:00
Benjamin Admin
d552fd8b6b feat: OCR Pipeline mit 6-Schritt-Wizard fuer Seitenrekonstruktion
All checks were successful
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-school (push) Successful in 38s
CI / test-go-edu-search (push) Successful in 29s
CI / test-python-klausur (push) Successful in 1m46s
CI / test-python-agent-core (push) Successful in 17s
CI / test-nodejs-website (push) Successful in 22s
Neue Route /ai/ocr-pipeline mit schrittweiser Begradigung (Deskew),
Raster-Overlay und Ground Truth. Schritte 2-6 als Platzhalter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 15:38:08 +01:00
Benjamin Boenisch
6a53f8d79c refactor: Remove all SDK/compliance pages and API routes from admin-lehrer
SDK/compliance content belongs exclusively in admin-compliance (port 3007).
Removed:
- All (sdk)/ pages (document-crawler, dsb-portal, industry-templates, multi-tenant, sso)
- All api/sdk/ proxy routes
- All developers/sdk/ documentation pages
- Unused lib/sdk/ modules (kept: catalog-manager + its deps for dashboard)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 09:24:36 +01:00
Benjamin Boenisch
27f1899428 feat: Sync SDK modules, API routes, blog and docs from admin-v2
- DSB Portal, Industry Templates, Multi-Tenant, SSO frontend pages
- All SDK API proxy routes (academy, crawler, incidents, vendors, whistleblower, etc.)
- Blog section with compliance articles
- BYOEH system documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 21:12:30 +01:00
Benjamin Boenisch
1bd1a0002a feat: Sync Document Crawler frontend page and API proxy (Phase 1.4)
Synced from admin-v2: 4-tab crawler page and catch-all API proxy route.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 20:35:29 +01:00
Benjamin Boenisch
8c1f7a783f refactor(admin-lehrer): Rename to Admin Lehrer KI and remove migrated categories
Remove communication, infrastructure, and development categories from
navigation (now in Admin Core on port 3008). Rename Admin v2 to
Admin Lehrer KI in sidebar, header, and browser title.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 20:23:16 +01:00
Benjamin Boenisch
5a31f52310 Initial commit: breakpilot-lehrer - Lehrer KI Platform
Services: Admin-Lehrer, Backend-Lehrer, Studio v2, Website,
Klausur-Service, School-Service, Voice-Service, Geo-Service,
BreakPilot Drive, Agent-Core

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 23:47:26 +01:00