fix: Restore all files lost during destructive rebase

A previous `git pull --rebase origin main` dropped 177 local commits, losing 3400+ files across admin-v2, backend, studio-v2, website, klausur-service, and many other services. The partial restore attempt (660295e2) only recovered some files.

This commit restores all missing files from pre-rebase ref 98933f5e while preserving post-rebase additions (night-scheduler, night-mode UI, NightModeWidget dashboard integration).

Restored features include:
- AI Module Sidebar (FAB), OCR Labeling, OCR Compare
- GPU Dashboard, RAG Pipeline, Magic Help
- Klausur-Korrektur (8 files), Abitur-Archiv (5+ files)
- Companion, Zeugnisse-Crawler, Screen Flow
- Full backend, studio-v2, website, klausur-service
- All compliance SDKs, agent-core, voice-service
- CI/CD configs, documentation, scripts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 09:51:32 +01:00
parent f7487ee240
commit bfdaf63ba9
2009 changed files with 749983 additions and 1731 deletions
@@ -0,0 +1,385 @@
+# vast.ai GPU Infrastructure API Documentation
+
+**Version:** 0.1.0
+**Base URL:** `/infra/vast`
+
+---
+
+## Overview
+
+The vast.ai Infrastructure API controls GPU instances directly from the admin panel. Features:
+
+- **Start/Stop**: Power the GPU instance on and off
+- **Auto-Shutdown**: Stop the instance automatically when it is idle (cost control)
+- **Cost tracking**: Runtime and cost per session
+- **Audit log**: Record of all actions
+
+---
+
+## Authentication
+
+All endpoints require the `CONTROL_API_KEY` in the header:
+
+```
+X-API-Key: <CONTROL_API_KEY>
+```
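+
A client could attach this header as in the following minimal sketch, which uses only the Python standard library. The backend address `http://localhost:8000` and the helper name `build_request` are assumptions made for illustration; the documentation only fixes the `/infra/vast` prefix and the header name.

```python
import os
import urllib.request

# Assumed backend address; the documentation only specifies the path prefix.
BACKEND_URL = "http://localhost:8000"


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build a request for an /infra/vast endpoint with the X-API-Key header set."""
    return urllib.request.Request(
        url=f"{BACKEND_URL}/infra/vast{path}",
        method=method,
        headers={"X-API-Key": api_key},
    )


if __name__ == "__main__":
    # The key is read from the environment so it never ends up in source code.
    key = os.environ.get("CONTROL_API_KEY", "")
    req = build_request("/status", key)
    print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; that step is left out so the sketch runs without a backend.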
+
+---
+
+## Endpoints
+
+### GET /infra/vast/status
+
+Returns the current status of the vast.ai instance.
+
+**Response (200):**
+
+```json
+{
+  "instance_id": 12345,
+  "status": "running",
+  "gpu_name": "RTX 3090",
+  "dph_total": 0.45,
+  "endpoint_base_url": "http://10.0.0.1:8001",
+  "last_activity": "2024-01-15T10:30:00Z",
+  "auto_shutdown_in_minutes": 25,
+  "total_runtime_hours": 2.5,
+  "total_cost_usd": 1.12,
+  "message": null
+}
+```
+
+**Status values:**
+
+| Status | Description |
+|--------|-------------|
+| `running` | Instance is running |
+| `stopped` | Instance stopped (disk is kept) |
+| `exited` | Instance has exited |
+| `loading` | Instance is starting |
+| `scheduling` | Waiting for GPU allocation |
+| `creating` | Instance is being created |
+| `unconfigured` | VAST_API_KEY not set |
+| `not_found` | Instance ID not found |
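+
The status values can drive a simple polling loop on the client side. The sketch below is one possible reading of the table: the grouping into "starting" and "failed" states, the function name `wait_until_running`, and the injected `fetch_status` callable are all assumptions, not part of the documented API.

```python
import time
from typing import Callable

# Grouping of the documented status values; the grouping itself is an assumption.
STARTING = {"loading", "scheduling", "creating"}
FAILED = {"unconfigured", "not_found"}


def wait_until_running(
    fetch_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the instance reports `running`.

    Returns True once running, False on timeout or when the instance is not
    starting at all (stopped/exited). Raises RuntimeError for configuration problems.
    """
    waited = 0.0
    while waited <= timeout_s:
        status = fetch_status()
        if status == "running":
            return True
        if status in FAILED:
            raise RuntimeError(f"instance cannot start: {status}")
        if status not in STARTING:
            return False  # stopped or exited: no start is in progress
        sleep(interval_s)
        waited += interval_s
    return False
```

`fetch_status` would typically wrap a `GET /infra/vast/status` call and return the `status` field; injecting it (and `sleep`) keeps the loop testable without network access.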
+
+---
+
+### POST /infra/vast/power/on
+
+Starts the vast.ai instance.
+
+**Request Body:**
+
+```json
+{
+  "wait_for_health": true,
+  "health_path": "/health",
+  "health_port": 8001
+}
+```
+
+**Parameters:**
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `wait_for_health` | boolean | true | Wait until the LLM server is reachable |
+| `health_path` | string | "/health" | Health check endpoint |
+| `health_port` | integer | 8001 | Port for the health check |
+
+**Response (200):**
+
+```json
+{
+  "status": "running",
+  "instance_id": 12345,
+  "endpoint_base_url": "http://10.0.0.1:8001",
+  "health_url": "http://10.0.0.1:8001/health",
+  "message": "Instance running and healthy"
+}
+```
+
+**Errors:**
+
+| Code | Description |
+|------|-------------|
+| 401 | Unauthorized (wrong API key) |
+| 500 | VAST_API_KEY or VAST_INSTANCE_ID not configured |
+| 502 | vast.ai API error |
+| 504 | Health check timeout |
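+
The request defaults and error codes above could be mirrored on the client roughly as follows. `PowerOnRequest`, `ERROR_HINTS`, and `explain` are hypothetical names, and the hint texts are illustrative advice rather than messages returned by the API.

```python
from dataclasses import asdict, dataclass


@dataclass
class PowerOnRequest:
    """Request body for POST /infra/vast/power/on with the documented defaults."""
    wait_for_health: bool = True
    health_path: str = "/health"
    health_port: int = 8001


# Documented error codes mapped to illustrative hints for the caller.
ERROR_HINTS = {
    401: "check the X-API-Key header",
    500: "set VAST_API_KEY and VAST_INSTANCE_ID on the backend",
    502: "vast.ai API call failed; verify the instance ID",
    504: "the health check timed out",
}


def explain(status_code: int) -> str:
    """Return a short hint for a power-on response code."""
    if status_code == 200:
        return "instance running"
    return ERROR_HINTS.get(status_code, f"unexpected status {status_code}")


if __name__ == "__main__":
    print(asdict(PowerOnRequest(wait_for_health=False)))
    print(explain(504))
```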
+
+---
+
+### POST /infra/vast/power/off
+
+Stops the vast.ai instance (the disk is kept).
+
+**Request Body:**
+
+```json
+{}
+```
+
+**Response (200):**
+
+```json
+{
+  "status": "stopped",
+  "session_runtime_minutes": 45.5,
+  "session_cost_usd": 0.34,
+  "message": "Instance stopped. Session: 45.5 min, $0.340"
+}
+```
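+
The figures in this response are consistent with runtime multiplied by the hourly price (`dph_total`, $0.45 in the status example). The sketch below reproduces them; rounding the cost to cents before formatting it with three decimals is an assumption inferred from the example message, and `session_summary` is a hypothetical helper name.

```python
def session_summary(runtime_minutes: float, dph_total: float) -> dict:
    """Compute the cost of one session from its runtime and the hourly price."""
    # Cost = hours * dollars per hour, rounded to cents.
    cost = round(runtime_minutes / 60.0 * dph_total, 2)
    return {
        "status": "stopped",
        "session_runtime_minutes": round(runtime_minutes, 1),
        "session_cost_usd": cost,
        "message": f"Instance stopped. Session: {runtime_minutes:.1f} min, ${cost:.3f}",
    }


if __name__ == "__main__":
    print(session_summary(45.5, 0.45))
```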
+
+---
+
+### POST /infra/vast/activity
+
+Records activity and postpones the auto-shutdown timer.
+
+**Usage:** The LLM gateway should call this endpoint on every request.
+
+**Response (200):**
+
+```json
+{
+  "status": "recorded",
+  "last_activity": "2024-01-15T10:30:00Z"
+}
+```
+
+---
+
+### GET /infra/vast/costs
+
+Returns cost statistics.
+
+**Response (200):**
+
+```json
+{
+  "total_runtime_hours": 12.5,
+  "total_cost_usd": 5.62,
+  "sessions_count": 5,
+  "avg_session_minutes": 150.0
+}
+```
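+
These statistics can be derived from per-session records. The following sketch assumes each session stores its runtime and the hourly price that applied; the `Session` type and `cost_stats` function are hypothetical and only illustrate the arithmetic behind the response fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    runtime_minutes: float
    dph_total: float  # hourly price in USD while the session ran


def cost_stats(sessions: List[Session]) -> dict:
    """Aggregate sessions into the fields of the /costs response."""
    total_minutes = sum(s.runtime_minutes for s in sessions)
    total_cost = sum(s.runtime_minutes / 60.0 * s.dph_total for s in sessions)
    count = len(sessions)
    return {
        "total_runtime_hours": round(total_minutes / 60.0, 2),
        "total_cost_usd": round(total_cost, 2),
        "sessions_count": count,
        # Avoid a division by zero when no session has been recorded yet.
        "avg_session_minutes": round(total_minutes / count, 1) if count else 0.0,
    }


if __name__ == "__main__":
    print(cost_stats([Session(150.0, 0.45) for _ in range(5)]))
```

Five sessions of 150 minutes each give 12.5 hours in total and an average of 150 minutes, matching the example response.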
+
+---
+
+### GET /infra/vast/audit
+
+Returns the most recent audit log entries.
+
+**Query parameters:**
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `limit` | integer | 50 | Max. number of entries |
+
+**Response (200):**
+
+```json
+[
+  {
+    "ts": "2024-01-15T10:30:00Z",
+    "event": "power_on_complete",
+    "actor": "system",
+    "meta": {
+      "instance_id": 12345,
+      "endpoint": "http://10.0.0.1:8001"
+    }
+  },
+  {
+    "ts": "2024-01-15T09:00:00Z",
+    "event": "auto_shutdown",
+    "actor": "system",
+    "meta": {
+      "inactive_minutes": 30.5
+    }
+  }
+]
+```
+
+**Event types:**
+
+| Event | Description |
+|-------|-------------|
+| `power_on_requested` | Start requested |
+| `power_on_complete` | Start completed |
+| `power_on_health_timeout` | Health check timed out |
+| `power_off_requested` | Stop requested |
+| `power_off_complete` | Stop completed |
+| `auto_shutdown` | Automatic stop due to inactivity |
+| `auto_shutdown_complete` | Auto-stop completed |
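+
One way to store and serve such entries is a JSON Lines file, one entry per line. This is a sketch under that assumption: the document does not specify the on-disk format, and `append_event` and `read_audit` are hypothetical helpers. Entries are returned newest first, as in the example response.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def append_event(path: Path, event: str, actor: str = "system", meta: Optional[dict] = None) -> None:
    """Append one audit entry as a JSON line."""
    entry = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "event": event,
        "actor": actor,
        "meta": meta or {},
    }
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def read_audit(path: Path, limit: int = 50) -> list:
    """Return up to `limit` entries, newest first."""
    if limit <= 0 or not path.exists():
        return []
    lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    # The file is append-only, so the last lines are the most recent entries.
    return [json.loads(ln) for ln in reversed(lines[-limit:])]
```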
+
+---
+
+## Auto-Shutdown
+
+The auto-shutdown mechanism stops the instance automatically when it is idle:
+
+1. `/activity` is called on every LLM request
+2. A background task checks the last activity every 60 s
+3. After `VAST_AUTO_SHUTDOWN_MINUTES` without activity, the instance is stopped
+4. Session costs are calculated and logged
+
+**Configuration:**
+
+```bash
+VAST_AUTO_SHUTDOWN=true           # enable the feature
+VAST_AUTO_SHUTDOWN_MINUTES=30     # timeout in minutes
+```
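+
The decision taken by the periodic check can be sketched as a pure function of the last activity timestamp. The function names are hypothetical, and treating a missing timestamp as "do not shut down" is an assumption; the actual background task may behave differently.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def minutes_until_shutdown(last_activity: datetime, now: datetime, timeout_minutes: int = 30) -> float:
    """Minutes left before the idle timeout expires (never negative)."""
    idle = (now - last_activity).total_seconds() / 60.0
    return max(0.0, timeout_minutes - idle)


def should_shutdown(
    last_activity: Optional[datetime],
    now: datetime,
    enabled: bool = True,
    timeout_minutes: int = 30,
) -> bool:
    """Decide on each periodic check whether the instance should be stopped."""
    if not enabled or last_activity is None:
        return False  # assumption: without recorded activity nothing is stopped
    return minutes_until_shutdown(last_activity, now, timeout_minutes) == 0.0


if __name__ == "__main__":
    last = datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc)
    print(minutes_until_shutdown(last, last + timedelta(minutes=5)))
    print(should_shutdown(last, last + timedelta(minutes=30, seconds=30)))
```

With a 30-minute timeout and activity five minutes ago, the remaining time is 25 minutes, which corresponds to the `auto_shutdown_in_minutes` value in the status example.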
+
+---
+
+## Configuration
+
+Environment variables in `.env`:
+
+```bash
+# vast.ai Credentials
+VAST_API_KEY=your-vast-api-key          # from https://cloud.vast.ai/cli/
+VAST_INSTANCE_ID=12345                   # numeric instance ID
+
+# Admin protection
+CONTROL_API_KEY=your-control-key         # generate with: openssl rand -hex 32
+
+# Health Check
+VAST_HEALTH_PORT=8001                    # port on the instance
+VAST_HEALTH_PATH=/health                 # health endpoint
+VAST_WAIT_TIMEOUT_S=600                  # startup timeout (10 min)
+
+# Auto-Shutdown
+VAST_AUTO_SHUTDOWN=true
+VAST_AUTO_SHUTDOWN_MINUTES=30
+
+# State Persistence (optional)
+VAST_STATE_PATH=./vast_state.json
+VAST_AUDIT_PATH=./vast_audit.log
+```
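+
A backend could read these variables into a typed settings object along the following lines. `VastSettings` and `load_settings` are hypothetical names, and using the example values above as fallbacks is an assumption; the real defaults are not stated in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VastSettings:
    api_key: Optional[str]
    instance_id: Optional[int]
    health_port: int
    health_path: str
    wait_timeout_s: int
    auto_shutdown: bool
    auto_shutdown_minutes: int
    state_path: str
    audit_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> VastSettings:
    """Read the documented variables; fallbacks mirror the example values (assumed)."""
    instance_id = env.get("VAST_INSTANCE_ID")
    return VastSettings(
        api_key=env.get("VAST_API_KEY") or None,
        instance_id=int(instance_id) if instance_id else None,
        health_port=int(env.get("VAST_HEALTH_PORT", "8001")),
        health_path=env.get("VAST_HEALTH_PATH", "/health"),
        wait_timeout_s=int(env.get("VAST_WAIT_TIMEOUT_S", "600")),
        auto_shutdown=env.get("VAST_AUTO_SHUTDOWN", "true").strip().lower() == "true",
        auto_shutdown_minutes=int(env.get("VAST_AUTO_SHUTDOWN_MINUTES", "30")),
        state_path=env.get("VAST_STATE_PATH", "./vast_state.json"),
        audit_path=env.get("VAST_AUDIT_PATH", "./vast_audit.log"),
    )
```

A missing `VAST_API_KEY` yields `api_key=None`, which a status endpoint could report as `unconfigured`.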
+
+---
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                      Admin Panel (Browser)                       │
+│   ┌─────────────────────────────────────────────────────────┐   │
+│   │  GPU Infra Tab                                          │   │
+│   │  [Start] [Stop] [Refresh]   Status: Running             │   │
+│   │  GPU: RTX 3090   Cost: $0.45/h   Session: 25 min        │   │
+│   │  Auto-Shutdown in: 5 min                                │   │
+│   └─────────────────────────────────────────────────────────┘   │
+└────────────────────────────┬────────────────────────────────────┘
+                             │
+                             ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                    Breakpilot Backend                            │
+│   ┌───────────────────┐   ┌───────────────────┐                 │
+│   │  /infra/vast/*    │   │  Auto-Shutdown    │                 │
+│   │  (FastAPI Router) │   │  Background Task  │                 │
+│   └─────────┬─────────┘   └─────────┬─────────┘                 │
+│             │                       │                            │
+│             ▼                       ▼                            │
+│   ┌─────────────────────────────────────────────┐               │
+│   │           VastAIClient                       │               │
+│   │   (REST API to vast.ai Console)             │               │
+│   └─────────────────────────────────────────────┘               │
+└────────────────────────────┬────────────────────────────────────┘
+                             │
+                             ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                   vast.ai Cloud API                              │
+│               https://console.vast.ai/api/v0/                    │
+└────────────────────────────┬────────────────────────────────────┘
+                             │
+                             ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                    vast.ai GPU Instance                          │
+│   ┌───────────────────────────────────────────────────────┐     │
+│   │  Docker Container: vLLM                               │     │
+│   │  - Model: Mistral-7B-Instruct                        │     │
+│   │  - Port 8000: /v1/chat/completions                   │     │
+│   │  - Port 8001: /health (nginx proxy)                  │     │
+│   └───────────────────────────────────────────────────────┘     │
+│   GPU: RTX 3090 (24GB VRAM)                                     │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## vast.ai Instance Setup
+
+### 1. Rent an instance
+
+- Type: **On-Demand** (not Contract)
+- GPU: **RTX 3090** (24GB) or **RTX 4090**
+- RAM: >= 32 GB
+- Disk: >= 150 GB
+- Interruptible: **No** (non-interruptible)
+
+### 2. vLLM with systemd autostart
+
+On the vast.ai instance:
+
+```bash
+# Create the Docker Compose project directory
+mkdir -p ~/llm-stack
+cd ~/llm-stack
+
+# Create docker-compose.yml and health-nginx.conf
+# (see vast.ai Implementierung.docx)
+
+# Create the systemd service
+sudo tee /etc/systemd/system/llm-stack.service > /dev/null <<'EOF'
+[Unit]
+Description=LLM Stack via Docker Compose
+After=docker.service
+
+[Service]
+Type=oneshot
+RemainAfterExit=yes
+# Must be the absolute path of ~/llm-stack for the user who created it
+WorkingDirectory=/home/ubuntu/llm-stack
+ExecStart=/usr/bin/docker compose up -d
+ExecStop=/usr/bin/docker compose down
+
+[Install]
+WantedBy=multi-user.target
+EOF
+
+sudo systemctl enable llm-stack.service
+sudo systemctl start llm-stack.service
+```
+
+### 3. Configure the backend
+
+```bash
+# .env
+VAST_API_KEY=vast_...
+VAST_INSTANCE_ID=12345
+CONTROL_API_KEY=your-control-key         # paste the output of: openssl rand -hex 32
+```
+
+---
+
+## Error Handling
+
+| Error | Cause | Solution |
+|-------|-------|----------|
+| 500: VAST_API_KEY not configured | Environment variable not set | Check `.env` |
+| 502: vast CLI failed | vast.ai API error | Check the instance ID |
+| 504: Health check timeout | vLLM does not start | SSH into the instance and check the logs |
+| Instance stuck in scheduling | GPU not available | Choose a different GPU |
+
+---
+
+## Cost Examples
+
+| GPU | $/hour | 1h test | 8h day |
+|-----|--------|---------|--------|
+| RTX 3090 | ~$0.45 | $0.45 | $3.60 |
+| RTX 4090 | ~$0.75 | $0.75 | $6.00 |
+
+**With Auto-Shutdown (30 min):**
+- Forgot to switch off: at most $0.23 (3090) or $0.38 (4090) extra
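+
The table values and the worst-case figures follow from rate times hours, with the forgotten instance running for at most the 30-minute timeout. The sketch below redoes that arithmetic with `Decimal`; rounding half up to cents is an assumption that happens to reproduce the listed numbers, and `cost` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def cost(rate_per_hour: str, hours: str) -> Decimal:
    """Price of `hours` of runtime at `rate_per_hour`, rounded to cents."""
    return (Decimal(rate_per_hour) * Decimal(hours)).quantize(CENT, rounding=ROUND_HALF_UP)


for gpu, rate in [("RTX 3090", "0.45"), ("RTX 4090", "0.75")]:
    # 1 h test, 8 h day, and the worst case of a forgotten instance (30 min idle).
    print(gpu, cost(rate, "1"), cost(rate, "8"), cost(rate, "0.5"))
# prints:
# RTX 3090 0.45 3.60 0.23
# RTX 4090 0.75 6.00 0.38
```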