fix: Restore all files lost during destructive rebase
A previous `git pull --rebase origin main` dropped 177 local commits,
losing 3400+ files across admin-v2, backend, studio-v2, website,
klausur-service, and many other services. The partial restore attempt
(660295e2) only recovered some files.
This commit restores all missing files from pre-rebase ref 98933f5e
while preserving post-rebase additions (night-scheduler, night-mode UI,
NightModeWidget dashboard integration).
Restored features include:
- AI Module Sidebar (FAB), OCR Labeling, OCR Compare
- GPU Dashboard, RAG Pipeline, Magic Help
- Klausur-Korrektur (8 files), Abitur-Archiv (5+ files)
- Companion, Zeugnisse-Crawler, Screen Flow
- Full backend, studio-v2, website, klausur-service
- All compliance SDKs, agent-core, voice-service
- CI/CD configs, documentation, scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New file: backend/docs/compliance_ai_integration.md (447 lines)
# Compliance AI Integration - Sprint 4

## Overview

The Compliance AI Integration provides AI-powered features for automatically interpreting regulatory requirements, suggesting controls, and assessing risk for Breakpilot modules.

## Architecture

```
Client → FastAPI Routes → AIComplianceAssistant → LLMProvider
                                                  ├─ AnthropicProvider (Claude API)
                                                  ├─ SelfHostedProvider (Ollama/vLLM)
                                                  └─ MockProvider (Testing)
```
## Components

### 1. LLM Provider Abstraction (`llm_provider.py`)

**Abstract base class**: `LLMProvider`
- `complete()`: single completion
- `batch_complete()`: batch processing with rate limiting

**Implementations**:

#### AnthropicProvider
- Uses the Claude API (https://api.anthropic.com)
- Recommended for production (best quality)
- Model: `claude-sonnet-4-20250514` (default)
- Requires: `ANTHROPIC_API_KEY`

#### SelfHostedProvider
- Supports Ollama, vLLM, LocalAI
- Auto-detects API formats (Ollama vs. OpenAI-compatible)
- Cost-effective alternative for self-hosting
- Requires: `SELF_HOSTED_LLM_URL`

#### MockProvider
- For unit tests without real API calls
- Supports predefined responses

**Factory function**: `get_llm_provider(config?: LLMConfig) -> LLMProvider`
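The provider abstraction above can be pictured roughly as follows. This is a minimal sketch, not the actual code from `llm_provider.py`: the class name `CannedProvider` and the exact signatures are illustrative assumptions.

```python
import asyncio
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Common interface every provider implements (sketch)."""

    @abstractmethod
    async def complete(self, prompt: str) -> str:
        """Return a single completion for the prompt."""

class CannedProvider(LLMProvider):
    """Stand-in for MockProvider: cycles through predefined responses."""

    def __init__(self, responses):
        self._responses = responses
        self._calls = 0

    async def complete(self, prompt: str) -> str:
        response = self._responses[self._calls % len(self._responses)]
        self._calls += 1
        return response

provider = CannedProvider(['{"summary": "Test", "risk_level": "low"}'])
print(asyncio.run(provider.complete("Interpret Art. 32 GDPR")))
```

Because every provider shares this interface, the factory can swap Anthropic, a self-hosted model, or a mock without any caller changes.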
### 2. AI Compliance Assistant (`ai_compliance_assistant.py`)

Main class for all AI features:

```python
from compliance.services.ai_compliance_assistant import get_ai_assistant

assistant = get_ai_assistant()
```

#### Methods

**interpret_requirement()**
- Translates legal requirements into technical guidance
- Identifies affected Breakpilot modules
- Assesses the risk level
- Provides implementation hints

**suggest_controls()**
- Suggests 1-3 matching controls
- Assigns a domain (priv, iam, sdlc, crypto, ops, ai, cra, gov, aud)
- Provides implementation guidance and pass criteria
- Detects controls that can be automated

**assess_module_risk()**
- Assesses the compliance risk of service modules
- Considers: PII processing, AI components, criticality
- Identifies compliance gaps
- Provides risk-mitigation recommendations

**analyze_gap()**
- Analyzes coverage between requirements and controls
- Identifies missing coverage
- Suggests concrete actions

**batch_interpret_requirements()**
- Processes multiple requirements with rate limiting
- For bulk processing of regulations
### 3. API Endpoints (`routes.py`)

All endpoints live under `/api/v1/compliance/ai/`:

#### GET `/ai/status`
Checks the status of the AI provider.

**Response**:
```json
{
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "is_available": true,
  "is_mock": false,
  "error": null
}
```
#### POST `/ai/interpret`
Interprets a requirement.

**Request**:
```json
{
  "requirement_id": "uuid",
  "force_refresh": false
}
```

**Response**:
```json
{
  "requirement_id": "uuid",
  "summary": "Short summary",
  "applicability": "How this applies to Breakpilot",
  "technical_measures": ["Measure 1", "Measure 2"],
  "affected_modules": ["consent-service", "klausur-service"],
  "risk_level": "high",
  "implementation_hints": ["Hint 1", "Hint 2"],
  "confidence_score": 0.85,
  "error": null
}
```
#### POST `/ai/suggest-controls`
Suggests controls for a requirement.

**Request**:
```json
{
  "requirement_id": "uuid"
}
```

**Response**:
```json
{
  "requirement_id": "uuid",
  "suggestions": [
    {
      "control_id": "PRIV-042",
      "domain": "priv",
      "title": "Encryption of personal data",
      "description": "...",
      "pass_criteria": "All PII is encrypted with AES-256",
      "implementation_guidance": "...",
      "is_automated": true,
      "automation_tool": "SOPS + Age",
      "priority": "high",
      "confidence_score": 0.9
    }
  ]
}
```
#### POST `/ai/assess-risk`
Assesses the risk of a module.

**Request**:
```json
{
  "module_id": "consent-service"
}
```

**Response**:
```json
{
  "module_name": "consent-service",
  "overall_risk": "high",
  "risk_factors": [
    {
      "factor": "Processes personal data",
      "severity": "high",
      "likelihood": "high"
    }
  ],
  "recommendations": ["Recommendation 1", "Recommendation 2"],
  "compliance_gaps": ["Gap 1", "Gap 2"],
  "confidence_score": 0.8
}
```
#### POST `/ai/gap-analysis`
Analyzes coverage gaps.

**Request**:
```json
{
  "requirement_id": "uuid"
}
```

**Response**:
```json
{
  "requirement_id": "uuid",
  "requirement_title": "Art. 32 GDPR - Security of processing",
  "coverage_level": "partial",
  "existing_controls": ["PRIV-001", "PRIV-002"],
  "missing_coverage": ["Missing measure 1"],
  "suggested_actions": ["Action 1", "Action 2"]
}
```
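The `coverage_level` field can be read as a classification of how much of a requirement the existing controls address. As a purely illustrative heuristic (the real classification comes from the LLM; this set-based logic is an assumption, not the documented behavior):

```python
def coverage_level(required: set[str], covered: set[str]) -> str:
    """Classify coverage as none / partial / full (illustrative heuristic)."""
    hit = required & covered
    if not hit:
        return "none"
    if hit == required:
        return "full"
    return "partial"

print(coverage_level({"encryption", "access-logging"}, {"encryption"}))  # partial
```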
#### POST `/ai/batch-interpret`
Batch interpretation of multiple requirements.

**Request**:
```json
{
  "requirement_ids": ["uuid1", "uuid2"],
  "regulation_code": "GDPR",
  "rate_limit": 1.0
}
```

**Response**:
```json
{
  "total": 10,
  "processed": 10,
  "interpretations": [...]
}
```
## Configuration

### Environment Variables

#### Base settings
```bash
# Provider selection
COMPLIANCE_LLM_PROVIDER=anthropic  # or: self_hosted, mock

# LLM parameters
COMPLIANCE_LLM_MAX_TOKENS=4096
COMPLIANCE_LLM_TEMPERATURE=0.3
COMPLIANCE_LLM_TIMEOUT=60.0
```

#### Anthropic Claude (recommended)
```bash
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514
```

#### Self-hosted alternative
```bash
COMPLIANCE_LLM_PROVIDER=self_hosted
SELF_HOSTED_LLM_URL=http://localhost:11434
SELF_HOSTED_LLM_MODEL=llama3.1:8b
SELF_HOSTED_LLM_KEY=optional-api-key
```
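At startup these variables might be picked up along the following lines. This is a sketch with assumed defaults; the actual `LLMConfig` loading code may differ.

```python
import os
from dataclasses import dataclass

@dataclass
class LLMConfig:
    provider_type: str
    max_tokens: int
    temperature: float
    timeout: float

def load_config(env=os.environ) -> LLMConfig:
    # Fall back to the documented defaults when a variable is unset.
    return LLMConfig(
        provider_type=env.get("COMPLIANCE_LLM_PROVIDER", "anthropic"),
        max_tokens=int(env.get("COMPLIANCE_LLM_MAX_TOKENS", "4096")),
        temperature=float(env.get("COMPLIANCE_LLM_TEMPERATURE", "0.3")),
        timeout=float(env.get("COMPLIANCE_LLM_TIMEOUT", "60.0")),
    )

cfg = load_config({"COMPLIANCE_LLM_PROVIDER": "mock"})
print(cfg.provider_type, cfg.max_tokens)  # mock 4096
```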
### Prompts

All prompts are in **German** and **Breakpilot-specific**:

- Account for the EdTech context (school administration, grades, report cards)
- Know the AI features (exam grading, feedback)
- Account for GDPR requirements
- Understand Breakpilot modules (consent-service, klausur-service, etc.)
## Usage

### Example 1: Interpreting a requirement

```python
from compliance.services.ai_compliance_assistant import get_ai_assistant

assistant = get_ai_assistant()

result = await assistant.interpret_requirement(
    requirement_id="req-123",
    article="Art. 32",
    title="Security of processing",
    requirement_text="The controller must...",
    regulation_code="GDPR",
    regulation_name="DSGVO"
)

print(f"Risk: {result.risk_level}")
print(f"Affected modules: {', '.join(result.affected_modules)}")
```
### Example 2: Suggesting controls

```python
suggestions = await assistant.suggest_controls(
    requirement_title="Encryption at rest and in transit",
    requirement_text="Personal data must...",
    regulation_name="DSGVO",
    affected_modules=["consent-service", "klausur-service"]
)

for control in suggestions:
    print(f"{control.control_id}: {control.title}")
    print(f"  Domain: {control.domain}")
    print(f"  Automatable: {control.is_automated}")
```
### Example 3: Batch processing

```python
requirements = [
    {"id": "req-1", "article": "Art. 32", ...},
    {"id": "req-2", "article": "Art. 33", ...},
]

results = await assistant.batch_interpret_requirements(
    requirements=requirements,
    rate_limit=1.0  # 1 second between calls
)

for r in results:
    if r.error:
        print(f"Error for {r.requirement_id}: {r.error}")
    else:
        print(f"{r.requirement_id}: {r.summary}")
```
## Rate Limiting

- **Anthropic**: 1 request/second (default)
- **Self-hosted**: 0.5 seconds (can be faster)
- Configurable via the `rate_limit` parameter
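The effect of the `rate_limit` parameter can be sketched as a fixed delay between sequential calls. This is illustrative only; `throttled_map` and `fake_call` are made-up names, and the real implementation in `llm_provider.py` may work differently.

```python
import asyncio

async def throttled_map(func, items, rate_limit: float):
    """Apply an async func to each item, waiting rate_limit seconds between calls."""
    results = []
    for i, item in enumerate(items):
        if i > 0:
            await asyncio.sleep(rate_limit)  # pause before every call but the first
        results.append(await func(item))
    return results

async def fake_call(x):
    return x * 2

out = asyncio.run(throttled_map(fake_call, [1, 2, 3], rate_limit=0.05))
print(out)  # [2, 4, 6], after roughly 0.1 s of throttling
```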
## Error Handling

All methods catch exceptions and return valid responses:

```python
result = await assistant.interpret_requirement(...)

if result.error:
    print(f"An error occurred: {result.error}")
    print(f"Confidence: {result.confidence_score}")  # 0.0 on error
else:
    print(f"Success: {result.summary}")
```
## Testing

### Unit tests with MockProvider

```python
from compliance.services.llm_provider import MockProvider, LLMConfig, LLMProviderType
from compliance.services.ai_compliance_assistant import AIComplianceAssistant

# Create a mock provider
config = LLMConfig(provider_type=LLMProviderType.MOCK)
provider = MockProvider(config)

# Predefined responses
provider.set_responses([
    '{"summary": "Test response", "risk_level": "low"}'
])

# AI assistant with the mock
assistant = AIComplianceAssistant(llm_provider=provider)
result = await assistant.interpret_requirement(...)
```
### Integration tests

```bash
# Use the mock provider
export COMPLIANCE_LLM_PROVIDER=mock

# Start the backend
cd backend && uvicorn main:app --reload

# Run the tests
pytest tests/test_compliance_ai.py -v
```
## Best Practices

### 1. Provider selection

- **Production**: Anthropic Claude (best quality)
- **Development**: self-hosted Ollama (free)
- **Tests**: MockProvider (fast, deterministic)
### 2. Caching

AI interpretations should be cached:
- Store `result.raw_response` in the DB
- Set `force_refresh=True` only when needed
- Cuts API costs substantially
### 3. Batch processing

For the initial setup of regulations:
- Use `batch_interpret_requirements()`
- Set `rate_limit` to match the provider
- Process at most 50 requirements per batch
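Splitting a large regulation into batches of at most 50 can be done with a small helper. The `chunked` function is illustrative, not part of the documented API:

```python
def chunked(items, size=50):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

requirement_ids = [f"req-{i}" for i in range(120)]

for batch in chunked(requirement_ids, size=50):
    # In real code: await assistant.batch_interpret_requirements(batch, ...)
    print(len(batch))  # prints 50, 50, 20
```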
### 4. Error handling

Always check `result.error`:
```python
if result.error:
    logger.warning(f"AI failed, using fallback: {result.error}")
    ...  # fallback logic
else:
    ...  # use the AI result
```
### 5. Monitoring

Track the following metrics:
- Response times
- Error rates
- Confidence scores
- Token usage (with Anthropic)
## Roadmap

### Already implemented (Sprint 4)
- [x] LLM provider abstraction
- [x] Anthropic + self-hosted support
- [x] AI Compliance Assistant
- [x] All API endpoints
- [x] Batch processing

### Planned (Sprint 5+)
- [ ] Response caching in the DB
- [ ] Background job queue for batch processing
- [ ] Webhook support for async processing
- [ ] Fine-tuning on Breakpilot data
- [ ] Multi-model ensemble (combines several LLMs)
- [ ] Automatic re-training based on auditor feedback
## Support

If you have questions or problems:
1. Check `/api/v1/compliance/ai/status`
2. Check the logs: `docker logs breakpilot-backend`
3. Test with the mock provider: `COMPLIANCE_LLM_PROVIDER=mock`
## License

Part of the BreakPilot Compliance Framework.
All rights reserved.