A previous `git pull --rebase origin main` dropped 177 local commits,
losing 3400+ files across admin-v2, backend, studio-v2, website,
klausur-service, and many other services. The partial restore attempt
(660295e2) only recovered some files.
This commit restores all missing files from pre-rebase ref 98933f5e
while preserving post-rebase additions (night-scheduler, night-mode UI,
NightModeWidget dashboard integration).
Restored features include:
- AI Module Sidebar (FAB), OCR Labeling, OCR Compare
- GPU Dashboard, RAG Pipeline, Magic Help
- Klausur-Korrektur (8 files), Abitur-Archiv (5+ files)
- Companion, Zeugnisse-Crawler, Screen Flow
- Full backend, studio-v2, website, klausur-service
- All compliance SDKs, agent-core, voice-service
- CI/CD configs, documentation, scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Klausur Module - Complete Developer Specification
Version: 2.0 As of: January 2025 Author: BreakPilot Development Team
Table of Contents
- Overview
- System Architecture
- Data Models
- API Specification
- Frontend Architecture
- Security & Compliance
- Testing Strategy
- Deployment & Operations
- Development Guidelines
1. Overview
1.1 Module Description
The Klausur module is a system for digitally grading, assessing, and managing Abitur and pre-Abitur (Vorabitur) exams. It consists of the following core components:
| Component | Description | Technology |
|---|---|---|
| Klausur-Service Backend | Main service for all exam operations | Python FastAPI |
| BYOEH (Bring Your Own EH) | Erwartungshorizont (EH, the expected-answer rubric) management with RAG | Qdrant, MinIO |
| Zeugnisse module | Regulations and AI assistant | Crawler, Embeddings |
| Training module | AI model training & monitoring | Background Tasks |
| Frontend | Admin and teacher interfaces | Next.js, React |
1.2 Technology Stack
┌─────────────────────────────────────────────────────────────┐
│ Frontend Layer │
│ Next.js 15 │ React 18 │ TypeScript │ Tailwind CSS │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ API Gateway Layer │
│ Next.js API Routes │ Server-Side Proxy │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Backend Services │
│ Klausur-Service (FastAPI) │ Port 8086 │
│ ├── main.py (Klausur CRUD, BYOEH) │
│ ├── admin_api.py (NiBiS Ingestion) │
│ ├── zeugnis_api.py (Zeugnisse Crawler) │
│ ├── training_api.py (Training Management) │
│ └── metrics_db.py (PostgreSQL Operations) │
└─────────────────────────────────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌──────────────────┐ ┌──────────────┐ ┌────────────────┐
│ PostgreSQL │ │ Qdrant │ │ MinIO │
│ (Metadata) │ │ (Vectors) │ │ (Documents) │
│ Port 5432 │ │ Port 6333 │ │ Port 9000 │
└──────────────────┘ └──────────────┘ └────────────────┘
1.3 Core Features
- Exam management
  - Create and edit exams
  - Upload student submissions
  - Criteria-based grading
  - Gutachten (written assessment) generation
- BYOEH - Erwartungshorizont
  - Upload & encryption of EH documents
  - Chunking & embedding generation
  - RAG-based search
  - Tenant isolation
- Zeugnisse
  - Rights-aware crawler for regulations
  - AI assistant for teachers
  - Search specific to each federal state (Bundesland)
- Training
  - Model training with monitoring
  - Hyperparameter configuration
  - Version management
2. System Architecture
2.1 Microservice Architecture
┌───────────────────┐
│ Load Balancer │
│ (Nginx) │
└─────────┬─────────┘
│
┌─────────────────────┼─────────────────────┐
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Website │ │ Backend │ │ Klausur-Svc │
│ (Next.js) │ │ (FastAPI) │ │ (FastAPI) │
│ Port 3000 │ │ Port 8000 │ │ Port 8086 │
└───────────────┘ └───────────────┘ └───────────────┘
│ │ │
└─────────────────────┴─────────────────────┘
│
┌─────────┴─────────┐
│ Service Mesh │
│ (Docker Net) │
└─────────┬─────────┘
│
┌─────────────┬───────┴───────┬─────────────┐
▼ ▼ ▼ ▼
PostgreSQL Qdrant MinIO Mailpit
2.2 Data Flow
2.2.1 Exam Grading (Klausur-Korrektur) Flow
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Upload │────▶│ OCR │────▶│ Analyse │────▶│ Bewertung│
│ Arbeit │ │ (extern) │ │ (LLM) │ │ Kriterien│
└──────────┘ └──────────┘ └──────────┘ └──────────┘
│
▼
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Export │◀────│ Finalize │◀────│Gutachten │◀────│ RAG-EH │
│ PDF │ │ │ │ Generate │ │ Query │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
2.2.2 Zeugnis-Crawler Flow
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Seed URL │────▶│ Fetch │────▶│ Extract │────▶│ Check │
│ (Config) │ │ HTTP │ │ PDF/HTML │ │ Rights │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
│
┌───────────────┤
▼ ▼
┌──────────┐ ┌──────────┐
│ MinIO │ │ Qdrant │
│ (Store) │ │ (Index) │
└──────────┘ └──────────┘
2.3 Component Details
2.3.1 Klausur-Service (main.py)
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/klausuren | GET | List all exams |
| /api/v1/klausuren | POST | Create a new exam |
| /api/v1/klausuren/{id} | GET | Exam details |
| /api/v1/klausuren/{id} | PUT | Update an exam |
| /api/v1/klausuren/{id} | DELETE | Delete an exam |
| /api/v1/klausuren/{id}/students | POST | Add a student submission |
| /api/v1/students/{id}/criteria | PUT | Score criteria |
| /api/v1/students/{id}/gutachten | PUT | Save the Gutachten |
| /api/v1/students/{id}/gutachten/generate | POST | Generate the Gutachten |
2.3.2 BYOEH (eh_pipeline.py, qdrant_service.py)
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/eh/upload | POST | Upload an EH |
| /api/v1/eh/{id}/index | POST | Index an EH |
| /api/v1/eh/rag-query | POST | RAG search |
| /api/v1/eh/{id}/share | POST | Share an EH |
| /api/v1/eh/{id}/link-klausur | POST | Link an EH to an exam |
2.3.3 Zeugnis Module (zeugnis_api.py)
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/admin/zeugnis/sources | GET | Federal state (Bundesland) sources |
| /api/v1/admin/zeugnis/crawler/start | POST | Start the crawler |
| /api/v1/admin/zeugnis/crawler/stop | POST | Stop the crawler |
| /api/v1/admin/zeugnis/documents | GET | Retrieve documents |
| /api/v1/admin/zeugnis/stats | GET | Statistics |
2.3.4 Training Module (training_api.py)
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/admin/training/jobs | GET | List training jobs |
| /api/v1/admin/training/jobs | POST | Start training |
| /api/v1/admin/training/jobs/{id}/pause | POST | Pause a job |
| /api/v1/admin/training/jobs/{id}/resume | POST | Resume a job |
| /api/v1/admin/training/models | GET | Model versions |
3. Data Models
3.1 PostgreSQL Schema
3.1.1 Core Tables (metrics_db.py)
-- RAG Feedback
CREATE TABLE rag_search_feedback (
id SERIAL PRIMARY KEY,
result_id VARCHAR(255) NOT NULL,
query_text TEXT,
collection_name VARCHAR(100),
score FLOAT,
rating INTEGER CHECK (rating >= 1 AND rating <= 5),
notes TEXT,
user_id VARCHAR(100),
created_at TIMESTAMP DEFAULT NOW()
);
-- RAG Search Logs
CREATE TABLE rag_search_logs (
id SERIAL PRIMARY KEY,
query_text TEXT NOT NULL,
collection_name VARCHAR(100),
result_count INTEGER,
latency_ms INTEGER,
top_score FLOAT,
filters JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-- Relevance judgments for precision/recall
CREATE TABLE rag_relevance_judgments (
id SERIAL PRIMARY KEY,
query_id VARCHAR(255) NOT NULL,
query_text TEXT NOT NULL,
result_id VARCHAR(255) NOT NULL,
result_rank INTEGER,
is_relevant BOOLEAN NOT NULL,
collection_name VARCHAR(100),
user_id VARCHAR(100),
created_at TIMESTAMP DEFAULT NOW()
);
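The rag_relevance_judgments table stores one relevance label per (query, result) pair, which is enough to compute precision and recall for a query. The following is a minimal sketch, not the project's implementation: the function name and the (result_rank, is_relevant) row shape are assumptions, ranks are assumed to be 1-based, and recall is measured only against results that have actually been judged.

```python
from typing import Iterable, Tuple


def precision_recall_at_k(
    judgments: Iterable[Tuple[int, bool]], k: int
) -> Tuple[float, float]:
    """Compute precision@k and recall@k for a single query.

    `judgments` holds (result_rank, is_relevant) pairs, e.g. selected from
    rag_relevance_judgments for one query_id (hypothetical row shape).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    rows = list(judgments)
    relevant_total = sum(1 for _, relevant in rows if relevant)
    relevant_in_top_k = sum(1 for rank, relevant in rows if relevant and rank <= k)
    precision = relevant_in_top_k / k
    # Recall is relative to the judged relevant results only.
    recall = relevant_in_top_k / relevant_total if relevant_total else 0.0
    return precision, recall


# Illustrative judgments for one query: ranks 1, 3 and 5 were marked relevant.
example_judgments = [(1, True), (2, False), (3, True), (4, False), (5, True)]
```

Averaging these values over all judged queries of a collection would give a per-collection quality metric.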
3.1.2 Zeugnis Tables
-- Federal state (Bundesland) sources
CREATE TABLE zeugnis_sources (
id VARCHAR(36) PRIMARY KEY,
bundesland VARCHAR(10) NOT NULL,
name VARCHAR(255) NOT NULL,
base_url TEXT,
license_type VARCHAR(50) NOT NULL,
training_allowed BOOLEAN DEFAULT FALSE,
verified_by VARCHAR(100),
verified_at TIMESTAMP,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Seed URLs
CREATE TABLE zeugnis_seed_urls (
id VARCHAR(36) PRIMARY KEY,
source_id VARCHAR(36) REFERENCES zeugnis_sources(id),
url TEXT NOT NULL,
doc_type VARCHAR(50),
status VARCHAR(20) DEFAULT 'pending',
last_crawled TIMESTAMP,
error_message TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
-- Documents
CREATE TABLE zeugnis_documents (
id VARCHAR(36) PRIMARY KEY,
seed_url_id VARCHAR(36) REFERENCES zeugnis_seed_urls(id),
title VARCHAR(500),
url TEXT NOT NULL,
content_hash VARCHAR(64),
minio_path TEXT,
training_allowed BOOLEAN DEFAULT FALSE,
indexed_in_qdrant BOOLEAN DEFAULT FALSE,
file_size INTEGER,
content_type VARCHAR(100),
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Document versions
CREATE TABLE zeugnis_document_versions (
id VARCHAR(36) PRIMARY KEY,
document_id VARCHAR(36) REFERENCES zeugnis_documents(id),
version INTEGER NOT NULL,
content_hash VARCHAR(64),
minio_path TEXT,
change_summary TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
-- Usage Events (Audit Trail)
CREATE TABLE zeugnis_usage_events (
id VARCHAR(36) PRIMARY KEY,
document_id VARCHAR(36) REFERENCES zeugnis_documents(id),
event_type VARCHAR(50) NOT NULL,
user_id VARCHAR(100),
details JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-- Crawler Queue
CREATE TABLE zeugnis_crawler_queue (
id VARCHAR(36) PRIMARY KEY,
source_id VARCHAR(36) REFERENCES zeugnis_sources(id),
priority INTEGER DEFAULT 5,
status VARCHAR(20) DEFAULT 'pending',
started_at TIMESTAMP,
completed_at TIMESTAMP,
documents_found INTEGER DEFAULT 0,
documents_indexed INTEGER DEFAULT 0,
error_count INTEGER DEFAULT 0,
created_at TIMESTAMP DEFAULT NOW()
);
3.2 In-Memory Models (Python Dataclasses)
3.2.1 Klausur Models
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional

# StudentKlausurStatus and KlausurModus are enums defined elsewhere in the service.

@dataclass
class StudentKlausur:
    id: str
    klausur_id: str
    student_name: str
    status: StudentKlausurStatus  # Enum
    criteria_scores: Dict[str, Dict]
    gutachten: Optional[Dict]
    file_path: Optional[str]
    ocr_text: Optional[str]
    created_at: datetime
    updated_at: datetime

@dataclass
class Klausur:
    id: str
    title: str
    subject: str
    modus: KlausurModus  # LANDES_ABITUR, VORABITUR
    year: int
    semester: str
    erwartungshorizont: Optional[Dict]
    students: List[StudentKlausur]
    created_at: datetime
    tenant_id: str
3.2.2 Zeugnis Models (zeugnis_models.py)
from datetime import datetime
from enum import Enum
from typing import Optional

from pydantic import BaseModel

class LicenseType(str, Enum):
    PUBLIC_DOMAIN = "public_domain"
    CC_BY = "cc_by"
    CC_BY_SA = "cc_by_sa"
    CC_BY_NC = "cc_by_nc"
    GOV_STATUTE_FREE_USE = "gov_statute"
    ALL_RIGHTS_RESERVED = "all_rights"
    UNKNOWN_REQUIRES_REVIEW = "unknown"

class CrawlStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    PAUSED = "paused"

class ZeugnisSource(BaseModel):
    id: str
    bundesland: str
    name: str
    base_url: Optional[str]
    license_type: LicenseType
    training_allowed: bool
    verified_by: Optional[str]
    verified_at: Optional[datetime]
3.3 Qdrant Collections
3.3.1 BYOEH Collection (bp_eh)
# Collection config
collection_name = "bp_eh"
vector_size = 384  # all-MiniLM-L6-v2
distance = Distance.COSINE
# Payload schema
{
    "tenant_id": str,           # tenant isolation
    "eh_id": str,               # Erwartungshorizont ID
    "chunk_index": int,         # position in the document
    "subject": str,             # school subject
    "encrypted_content": str,   # AES-256-GCM encrypted
    "training_allowed": bool,   # ALWAYS False for EH
}
3.3.2 Zeugnis Collection (bp_zeugnis)
# Collection config
collection_name = "bp_zeugnis"
vector_size = 384  # all-MiniLM-L6-v2
distance = Distance.COSINE
# Payload schema
{
    "document_id": str,
    "chunk_index": int,
    "chunk_text": str,          # preview (max 500 chars)
    "bundesland": str,
    "doc_type": str,
    "title": str,
    "source_url": str,
    "training_allowed": bool,   # inherited from the source
    "indexed_at": str,
}
3.4 MinIO Bucket Structure
breakpilot-rag/
├── landes-daten/
│ ├── {bundesland}/
│ │ └── zeugnis/
│ │ └── {year}/
│ │ └── {filename}.pdf
│ └── klausur/
│ └── {year}/
│ └── {subject}/
│ └── {filename}.pdf
│
└── lehrer-daten/
└── {tenant_id}/
└── {teacher_id}/
└── {filename}.pdf.enc
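The layout above can be expressed as small key-building helpers. This is a sketch under assumptions: the function names are hypothetical, `filename` is taken to be the name without extension (as in the tree), and rejecting path separators is an added safeguard that the specification does not spell out.

```python
from pathlib import PurePosixPath


def _checked(part: str) -> str:
    """Reject empty components and anything that could escape its prefix."""
    if not part or "/" in part or part in (".", ".."):
        raise ValueError(f"invalid path component: {part!r}")
    return part


def teacher_object_key(tenant_id: str, teacher_id: str, filename: str) -> str:
    """Key for an encrypted teacher upload below lehrer-daten/."""
    return str(
        PurePosixPath(
            "lehrer-daten",
            _checked(tenant_id),
            _checked(teacher_id),
            f"{_checked(filename)}.pdf.enc",
        )
    )


def zeugnis_object_key(bundesland: str, year: int, filename: str) -> str:
    """Key for a crawled Zeugnis document below landes-daten/."""
    return str(
        PurePosixPath(
            "landes-daten",
            _checked(bundesland),
            "zeugnis",
            str(year),
            f"{_checked(filename)}.pdf",
        )
    )
```

Building keys in one place keeps the tenant prefix from being assembled ad hoc at each call site.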
4. API Specification
4.1 Authentication
All API endpoints require JWT authentication:
Authorization: Bearer <jwt_token>
JWT payload:
{
"sub": "user-id",
"tenant_id": "school-id",
"roles": ["teacher", "admin"],
"exp": 1704067200
}
4.2 Error Handling
Standard error format
{
  "detail": "Description of the error",
  "code": "ERROR_CODE",
  "timestamp": "2024-01-01T10:00:00Z"
}
HTTP Status Codes
| Code | Meaning | Usage |
|---|---|---|
| 200 | OK | Successful request |
| 201 | Created | Resource created |
| 400 | Bad Request | Invalid input |
| 401 | Unauthorized | Authentication missing, expired, or invalid |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Resource not found |
| 409 | Conflict | Resource conflict |
| 422 | Unprocessable | Validation error |
| 500 | Internal Error | Server error |
| 503 | Unavailable | Service unavailable |
4.3 Pagination
GET /api/v1/resource?limit=20&offset=0
Response:
{
"items": [...],
"total": 100,
"limit": 20,
"offset": 0,
"has_more": true
}
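The response envelope above can be produced by a small helper. The sketch below assumes an in-memory sequence for illustration (a real endpoint would push limit and offset into the database query); the function name is hypothetical.

```python
from typing import Any, Dict, Sequence


def paginate(items: Sequence[Any], limit: int = 20, offset: int = 0) -> Dict[str, Any]:
    """Return one page of `items` in the limit/offset envelope shown above."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset must be >= 0")
    total = len(items)
    page = list(items[offset:offset + limit])
    return {
        "items": page,
        "total": total,
        "limit": limit,
        "offset": offset,
        # More results exist if this page does not reach the end of the list.
        "has_more": offset + len(page) < total,
    }
```

For 100 items, `paginate(items, limit=20, offset=0)` reports `has_more` as true, while `offset=80` returns the last page with `has_more` false.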
4.4 Rate Limiting
| Endpoint type | Limit |
|---|---|
| Standard API | 100/min |
| RAG Query | 30/min |
| Upload | 10/min |
| Training Start | 5/hour |
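The specification does not say how these limits are enforced. One common approach is a sliding-window limiter keyed by user and endpoint type; the sketch below is an assumption in that spirit, with hypothetical class and key names, and the clock passed in explicitly so the behaviour is easy to test.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

# (max requests, window in seconds) per endpoint type, taken from the table above.
LIMITS: Dict[str, Tuple[int, float]] = {
    "standard": (100, 60.0),
    "rag_query": (30, 60.0),
    "upload": (10, 60.0),
    "training_start": (5, 3600.0),
}


class SlidingWindowLimiter:
    """Allow at most `max_requests` per key within any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

In a service, `now` would typically come from `time.monotonic()`, and a rejected request would map to an HTTP 429 response. With several worker processes, the counters would need shared storage instead of process memory.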
5. Frontend Architecture
5.1 Directory Structure
website/
├── app/
│ ├── admin/
│ │ ├── training/
│ │ │ └── page.tsx # Training Dashboard
│ │ ├── zeugnisse-crawler/
│ │ │ └── page.tsx # Crawler Admin
│ │ ├── rag/
│ │ │ └── page.tsx # RAG Admin
│ │ └── uni-crawler/
│ │ └── page.tsx # Uni Crawler
│ │
│ ├── zeugnisse/
│ │ └── page.tsx # Teacher frontend
│ │
│ └── api/
│ └── admin/
│ ├── zeugnisse-crawler/
│ │ └── route.ts # API Proxy
│ └── training/
│ └── route.ts # API Proxy
│
├── components/
│ ├── ui/ # Base components
│ └── shared/ # Shared components
│
└── lib/
├── api.ts # API client
└── utils.ts # Helper functions
5.2 Component Hierarchy
App
├── Layout
│ ├── Header
│ │ ├── Navigation
│ │ └── UserMenu
│ └── Sidebar
│
├── Training Dashboard Page
│ ├── StatsCards
│ ├── TrainingJobCard
│ │ ├── ProgressRing
│ │ ├── MetricCards
│ │ └── LossChart
│ ├── DatasetOverview
│ └── NewTrainingModal (Wizard)
│
├── Zeugnisse Crawler Page
│ ├── StatsCards
│ ├── BundeslandTable
│ ├── DocumentList
│ └── CrawlerControls
│
└── Lehrer Zeugnisse Page
├── OnboardingWizard
├── ChatInterface
│ └── MessageList
├── SearchInterface
│ └── SearchResults
└── DocumentBrowser
5.3 State Management
Local State (useState)
- UI state (modals, tabs)
- Form inputs
- Local filters
Server State (SWR/Fetch)
- API data with polling
- Caching
- Revalidation
Persisted State (localStorage)
- User preferences
- Recent searches
- Wizard status
5.4 Styling Conventions
// Tailwind CSS class names
const buttonPrimary = "px-4 py-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700 transition"
const buttonSecondary = "px-4 py-2 bg-gray-100 text-gray-700 rounded-lg hover:bg-gray-200 transition"
const card = "bg-white dark:bg-gray-800 rounded-xl shadow-lg border border-gray-200 dark:border-gray-700"
const input = "px-3 py-2 bg-gray-100 dark:bg-gray-900 border-0 rounded-lg focus:ring-2 focus:ring-blue-500"
6. Security & Compliance
6.1 Authentication & Authorization
JWT Validation
import os

import jwt  # PyJWT
from fastapi import HTTPException

JWT_SECRET = os.environ["JWT_SECRET"]

def verify_jwt(token: str) -> dict:
    """Decode and validate an HS256-signed JWT; respond with 401 on failure."""
    try:
        return jwt.decode(token, JWT_SECRET, algorithms=["HS256"])
    except jwt.ExpiredSignatureError:
        raise HTTPException(status_code=401, detail="Token expired")
    except jwt.InvalidTokenError:
        raise HTTPException(status_code=401, detail="Invalid token")
RBAC Roles
| Role | Permissions |
|---|---|
| teacher | Create exams, upload EH, use the Zeugnis assistant |
| admin | All teacher permissions, plus controlling the crawler and starting training |
| superadmin | All admin permissions, plus system configuration |
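Because each role extends the one before it, the table can be modelled as nested permission sets. The following sketch is illustrative only: the permission strings and the function name are hypothetical, and the `roles` list is assumed to come from the JWT payload's "roles" claim.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical permission names derived from the role table above.
TEACHER: FrozenSet[str] = frozenset({"klausur:create", "eh:upload", "zeugnis:assistant"})
ADMIN: FrozenSet[str] = TEACHER | {"crawler:control", "training:start"}
SUPERADMIN: FrozenSet[str] = ADMIN | {"system:configure"}

ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "teacher": TEACHER,
    "admin": ADMIN,
    "superadmin": SUPERADMIN,
}


def has_permission(roles: Iterable[str], permission: str) -> bool:
    """True if any of the user's roles grants `permission`; unknown roles grant nothing."""
    return any(permission in ROLE_PERMISSIONS.get(role, frozenset()) for role in roles)
```

An endpoint such as the crawler start route could then reject callers for whom `has_permission(payload["roles"], "crawler:control")` is false with a 403 response.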
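6.2 Data Encryption
AES-256-GCM for EH documents
import base64
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_text(plaintext: str, passphrase: str) -> tuple:
    # derive_key() and hash_key() are project helpers defined elsewhere.
    salt = os.urandom(16)  # per-document salt for key derivation
    iv = os.urandom(12)    # 96-bit GCM nonce, must never repeat for the same key
    key = derive_key(passphrase, salt)
    cipher = AESGCM(key)
    ciphertext = cipher.encrypt(iv, plaintext.encode(), None)
    # Payload layout: salt (16 bytes) + iv (12 bytes) + ciphertext with GCM tag
    return base64.b64encode(salt + iv + ciphertext).decode(), hash_key(key)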
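The helpers `derive_key` and `hash_key` are referenced but not defined in this specification. The sketch below shows one plausible shape for them using only the standard library; the choice of PBKDF2-HMAC-SHA256, the iteration count, and the SHA-256 key fingerprint are assumptions, not the project's confirmed implementation. `split_payload` is a hypothetical counterpart that undoes the salt + iv + ciphertext packing before decryption.

```python
import base64
import hashlib
from typing import Tuple

SALT_LEN = 16  # bytes, matches os.urandom(16) in encrypt_text
IV_LEN = 12    # bytes, matches os.urandom(12) in encrypt_text
TAG_LEN = 16   # AES-GCM authentication tag appended to the ciphertext


def derive_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte AES-256 key from a passphrase (assumed KDF: PBKDF2)."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations, dklen=32)


def hash_key(key: bytes) -> str:
    """Fingerprint of the key, usable to check a passphrase without storing the key."""
    return hashlib.sha256(key).hexdigest()


def split_payload(encoded: str) -> Tuple[bytes, bytes, bytes]:
    """Split a base64 payload back into (salt, iv, ciphertext_with_tag)."""
    raw = base64.b64decode(encoded)
    if len(raw) < SALT_LEN + IV_LEN + TAG_LEN:
        raise ValueError("payload too short")
    return raw[:SALT_LEN], raw[SALT_LEN:SALT_LEN + IV_LEN], raw[SALT_LEN + IV_LEN:]
```

Decryption would re-derive the key from the passphrase and the recovered salt, then call `AESGCM(key).decrypt(iv, ciphertext, None)`.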
6.3 Tenant Isolation
- All database queries filter by tenant_id
- Qdrant searches use a tenant_id filter
- MinIO paths for teacher data (lehrer-daten/) contain the tenant_id
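For Qdrant, tenant isolation comes down to attaching a mandatory payload condition to every search. The sketch below builds such a filter in the JSON shape of Qdrant's REST API (a "must" list of key/match conditions); the function name is hypothetical, and the optional subject condition is an illustrative addition based on the bp_eh payload schema.

```python
from typing import Any, Dict, List, Optional


def tenant_filter(tenant_id: str, subject: Optional[str] = None) -> Dict[str, Any]:
    """Build a Qdrant filter that restricts a search to a single tenant."""
    if not tenant_id:
        # Never fall back to an unfiltered search when the tenant is missing.
        raise ValueError("tenant_id is required")
    must: List[Dict[str, Any]] = [{"key": "tenant_id", "match": {"value": tenant_id}}]
    if subject is not None:
        must.append({"key": "subject", "match": {"value": subject}})
    return {"must": must}
```

Taking the tenant_id from the verified JWT payload rather than from request parameters keeps a client from querying another tenant's vectors.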
6.4 Audit Trail
async def log_event(event_type: str, resource_id: str, user_id: str, details: dict):
    await log_zeugnis_event(
        document_id=resource_id,
        event_type=event_type,
        user_id=user_id,
        details=details,
    )
6.5 GDPR (DSGVO) Compliance
- Data export function
- Deletion requests
- Consent tracking
- Logging of all access
7. Testing Strategy
7.1 Test Pyramid
/\
/ \
/ E2E \ <- 5% (Critical Paths)
/------\
/ Integ \ <- 25% (API, DB)
/----------\
/ Unit \ <- 70% (Functions)
/--------------\
7.2 Unit Tests (Python)
Location: klausur-service/backend/tests/
# tests/test_zeugnis_models.py
import pytest
from zeugnis_models import (
    LicenseType, get_training_allowed, get_bundesland_name
)

class TestTrainingPermissions:
    def test_niedersachsen_allows_training(self):
        assert get_training_allowed("ni")

    def test_berlin_disallows_training(self):
        assert not get_training_allowed("be")

    def test_unknown_bundesland_disallows(self):
        assert not get_training_allowed("xx")

class TestBundeslandNames:
    def test_valid_code_returns_name(self):
        assert get_bundesland_name("ni") == "Niedersachsen"

    def test_invalid_code_returns_code(self):
        assert get_bundesland_name("xx") == "xx"

# tests/test_zeugnis_crawler.py
import pytest
from zeugnis_crawler import chunk_text, compute_hash, extract_text_from_pdf

class TestChunking:
    def test_short_text_single_chunk(self):
        text = "Dies ist ein kurzer Text."
        chunks = chunk_text(text, chunk_size=100)
        assert len(chunks) == 1

    def test_long_text_multiple_chunks(self):
        text = "A" * 2000
        chunks = chunk_text(text, chunk_size=500, overlap=50)
        assert len(chunks) > 1

    def test_overlap_preserved(self):
        text = "ABCDE" * 200
        chunks = chunk_text(text, chunk_size=100, overlap=20)
        # Each chunk must start with the last 20 characters of its predecessor.
        for i in range(1, len(chunks)):
            assert chunks[i][:20] == chunks[i-1][-20:]

class TestHashing:
    def test_same_content_same_hash(self):
        content = b"Hello World"
        assert compute_hash(content) == compute_hash(content)

    def test_different_content_different_hash(self):
        assert compute_hash(b"Hello") != compute_hash(b"World")
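The chunking and hashing tests pin down the expected behaviour without showing the functions themselves. A minimal sketch that would satisfy them is given below; it is not the code in zeugnis_crawler.py. The default sizes are assumptions, and SHA-256 is assumed because its 64-character hex digest fits the content_hash VARCHAR(64) columns.

```python
import hashlib
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size chunks that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    chunks: List[str] = []
    step = chunk_size - overlap  # each chunk starts `step` characters after the previous one
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


def compute_hash(content: bytes) -> str:
    """Hex SHA-256 digest, used to detect unchanged documents between crawls."""
    return hashlib.sha256(content).hexdigest()
```

Stopping as soon as a chunk reaches the end of the text avoids emitting a final chunk that consists only of already-covered overlap.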
7.3 Integration Tests
# tests/test_zeugnis_api_integration.py
import pytest
import pytest_asyncio
from httpx import ASGITransport, AsyncClient
from main import app

@pytest_asyncio.fixture
async def client():
    # The AsyncClient(app=...) shortcut was removed in httpx 0.28;
    # route requests to the ASGI app through an explicit transport instead.
    async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as ac:
        yield ac

@pytest.mark.asyncio
class TestZeugnisAPI:
    async def test_get_sources_returns_list(self, client):
        response = await client.get("/api/v1/admin/zeugnis/sources")
        assert response.status_code == 200
        assert isinstance(response.json(), list)

    async def test_start_crawler_without_running(self, client):
        response = await client.post(
            "/api/v1/admin/zeugnis/crawler/start",
            json={"bundesland": "ni"}
        )
        assert response.status_code == 200

    async def test_start_crawler_while_running_fails(self, client):
        # First start
        await client.post("/api/v1/admin/zeugnis/crawler/start")
        # Second start should fail
        response = await client.post("/api/v1/admin/zeugnis/crawler/start")
        assert response.status_code == 409
7.4 E2E Tests (Playwright)
// tests/e2e/zeugnisse.spec.ts
import { test, expect } from '@playwright/test'
test.describe('Zeugnis-Assistent', () => {
test('onboarding wizard completes successfully', async ({ page }) => {
await page.goto('/zeugnisse')
// Step 1: Welcome
await expect(page.locator('h2')).toContainText('Willkommen')
await page.click('button:has-text("Weiter")')
// Step 2: Select Bundesland
await page.click('button:has-text("Niedersachsen")')
await page.click('button:has-text("Weiter")')
// Step 3: Select Schulform
await page.click('button:has-text("Gymnasium")')
await page.click('button:has-text("Weiter")')
// Step 4: Complete
await page.click('button:has-text("Loslegen")')
// Verify main interface
await expect(page.locator('h1')).toContainText('Zeugnis-Assistent')
})
test('chat interface responds to questions', async ({ page }) => {
// Skip wizard (set localStorage)
await page.goto('/zeugnisse')
await page.evaluate(() => {
localStorage.setItem('zeugnis-preferences', JSON.stringify({
bundesland: 'ni',
schulform: 'gymnasium',
hasSeenWizard: true,
}))
})
await page.reload()
// Send message
await page.fill('textarea', 'Wie schreibe ich Bemerkungen?')
await page.click('button[type="submit"]')
// Wait for response
await expect(page.locator('.bg-white.rounded-2xl').last())
.toContainText('Bemerkung', { timeout: 10000 })
})
})
7.5 Running the Tests
# Unit Tests (Python)
cd klausur-service/backend
pytest -v tests/
# With coverage
pytest --cov=. --cov-report=html tests/
# E2E Tests (Playwright)
cd website
npx playwright test
# All tests
./run-tests.sh
8. Deployment & Operations
8.1 Docker Compose
# docker-compose.yml (excerpt)
services:
  klausur-service:
    build:
      context: ./klausur-service
      dockerfile: Dockerfile
    ports:
      - "8086:8086"
    environment:
      - JWT_SECRET=${JWT_SECRET}
      - QDRANT_URL=http://qdrant:6333
      - MINIO_ENDPOINT=minio:9000
      # NOTE: credentials are hard-coded here; prefer ${DATABASE_URL} (see review checklist 9.3)
      - DATABASE_URL=postgres://breakpilot:breakpilot123@postgres:5432/breakpilot_db
    depends_on:
      - qdrant
      - minio
      - postgres
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8086/health"]
      interval: 30s
      timeout: 10s
      retries: 3
8.2 Dockerfile (Klausur-Service)
FROM python:3.11-slim
WORKDIR /app
# System dependencies
RUN apt-get update && apt-get install -y \
gcc \
libpq-dev \
curl \
&& rm -rf /var/lib/apt/lists/*
# Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Application code
COPY backend/ .
# Create directories
RUN mkdir -p /app/uploads /app/eh-uploads
EXPOSE 8086
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8086"]
8.3 Monitoring
Health Checks
from datetime import datetime, timezone

@app.get("/health")
async def health():
    return {
        "status": "healthy",
        "service": "klausur-service",
        "version": "2.0",
        # Timezone-aware UTC timestamp, unambiguous across hosts
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

@app.get("/health/detailed")
async def health_detailed():
    # Check dependencies
    qdrant_ok = await check_qdrant()
    postgres_ok = await check_postgres()
    minio_ok = await check_minio()
    return {
        "status": "healthy" if all([qdrant_ok, postgres_ok, minio_ok]) else "degraded",
        "dependencies": {
            "qdrant": "ok" if qdrant_ok else "error",
            "postgres": "ok" if postgres_ok else "error",
            "minio": "ok" if minio_ok else "error",
        }
    }
Prometheus Metrics
from prometheus_client import Counter, Histogram
# Metrics
request_count = Counter('klausur_requests_total', 'Total requests', ['endpoint', 'method'])
request_latency = Histogram('klausur_request_latency_seconds', 'Request latency', ['endpoint'])
training_jobs = Counter('klausur_training_jobs_total', 'Training jobs', ['status'])
8.4 Logging
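import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger("klausur-service")

# Structured logging: the `extra` fields are attached to the log record, but the
# format string above does not print them; a JSON formatter or an extended
# format string is needed to make them visible in the output.
logger.info("Training started", extra={
    "job_id": job_id,
    "bundeslaender": config.bundeslaender,
    "epochs": config.epochs,
})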
9. Development Guidelines
9.1 Code Style
Python
# Imports: standard library → third-party → local
import os
from datetime import datetime
from typing import Any, Dict, List, Optional

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from zeugnis_models import LicenseType

# Docstrings: Google style
def process_document(content: bytes, doc_type: str) -> dict:
    """Process a document for indexing.

    Args:
        content: Raw document bytes.
        doc_type: Type of document (pdf, html).

    Returns:
        Dict with extracted text and metadata.

    Raises:
        ValueError: If doc_type is not supported.
    """
    pass

# Type hints: always use them
async def get_sources(
    bundesland: Optional[str] = None,
    limit: int = 100,
) -> List[Dict[str, Any]]:
    pass
TypeScript
// Prefer interfaces over type aliases
interface TrainingJob {
  id: string
  name: string
  status: TrainingStatus
}
// Props interface for components
interface TrainingCardProps {
  job: TrainingJob
  onPause: () => void
  onResume: () => void
}
// Function components with explicit types
export function TrainingCard({ job, onPause, onResume }: TrainingCardProps) {
  return (...)
}
9.2 Git-Workflow
main
│
├── develop
│ │
│ ├── feature/zeugnis-crawler
│ ├── feature/training-dashboard
│ └── fix/crawler-retry
│
└── release/v2.0
Commit Messages
feat(zeugnis): add rights-aware crawler
- Implement PDF/HTML text extraction
- Add training_allowed flag per bundesland
- Create audit trail for document access
Closes #123
9.3 Review Checklist
- Tests present and passing
- Documentation updated
- Type hints/interfaces complete
- No hard-coded credentials
- Error handling implemented
- Logging in place
- Performance acceptable
9.4 Versioning
Semantic Versioning: MAJOR.MINOR.PATCH
- MAJOR: Breaking changes
- MINOR: New features (backward compatible)
- PATCH: Bug fixes
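The bump rules can be made concrete with a short helper. This is an illustrative sketch with hypothetical function names; it accepts only plain three-part versions, so a two-part string such as "2.0" is rejected rather than guessed at.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple that compares numerically."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(p) for p in parts)
    return major, minor, patch


def bump(version: str, level: str) -> str:
    """Return the next version for a 'major', 'minor' or 'patch' change."""
    major, minor, patch = parse_version(version)
    if level == "major":
        return f"{major + 1}.0.0"  # breaking change resets minor and patch
    if level == "minor":
        return f"{major}.{minor + 1}.0"  # new feature resets patch
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown level: {level!r}")
```

Comparing parsed tuples avoids the string-ordering trap where "2.10.0" would sort before "2.9.9".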
Appendix A: Environment Variables
# Authentication
JWT_SECRET=your-super-secret-key
# Databases
DATABASE_URL=postgres://user:pass@host:5432/db
QDRANT_URL=http://qdrant:6333
MINIO_ENDPOINT=minio:9000
MINIO_ACCESS_KEY=breakpilot
MINIO_SECRET_KEY=breakpilot123
MINIO_BUCKET=breakpilot-rag
# Embeddings
EMBEDDING_BACKEND=local # or "openai"
OPENAI_API_KEY=sk-... # only if EMBEDDING_BACKEND is "openai"
# Services
BACKEND_URL=http://backend:8000
SCHOOL_SERVICE_URL=http://school-service:8084
# Feature Flags
BYOEH_ENCRYPTION_ENABLED=true
BYOEH_CHUNK_SIZE=1000
BYOEH_CHUNK_OVERLAP=200
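Reading these variables through one typed loader surfaces misconfiguration at startup instead of during the first upload. The sketch below covers only the BYOEH feature flags; the class and function names are hypothetical, and the fallback defaults simply mirror the values listed above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ByoehSettings:
    encryption_enabled: bool
    chunk_size: int
    chunk_overlap: int


def load_byoeh_settings(env: Mapping[str, str] = os.environ) -> ByoehSettings:
    """Read and validate the BYOEH_* variables from the environment."""
    enabled_raw = env.get("BYOEH_ENCRYPTION_ENABLED", "true")
    enabled = enabled_raw.strip().lower() in {"1", "true", "yes"}
    chunk_size = int(env.get("BYOEH_CHUNK_SIZE", "1000"))
    chunk_overlap = int(env.get("BYOEH_CHUNK_OVERLAP", "200"))
    # An overlap at or above the chunk size would prevent chunking from advancing.
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("require BYOEH_CHUNK_SIZE > 0 and 0 <= BYOEH_CHUNK_OVERLAP < BYOEH_CHUNK_SIZE")
    return ByoehSettings(enabled, chunk_size, chunk_overlap)
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in unit tests.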
Appendix B: Quick Reference
API Base URLs
| Environment | URL |
|---|---|
| Local | http://localhost:8086 |
| Development | https://dev.breakpilot.app |
| Production | https://api.breakpilot.app |
Key Commands
# Start the service
docker-compose up -d klausur-service
# Show logs
docker logs -f breakpilot-pwa-klausur-service
# Run tests
docker exec klausur-service pytest tests/
# DB migration
docker exec postgres psql -U breakpilot -d breakpilot_db -f /migration.sql
Last updated: January 2025