fix: Restore all files lost during destructive rebase
A previous `git pull --rebase origin main` dropped 177 local commits,
losing 3400+ files across admin-v2, backend, studio-v2, website,
klausur-service, and many other services. The partial restore attempt
(660295e2) only recovered some files.
This commit restores all missing files from pre-rebase ref 98933f5e
while preserving post-rebase additions (night-scheduler, night-mode UI,
NightModeWidget dashboard integration).
Restored features include:
- AI Module Sidebar (FAB), OCR Labeling, OCR Compare
- GPU Dashboard, RAG Pipeline, Magic Help
- Klausur-Korrektur (8 files), Abitur-Archiv (5+ files)
- Companion, Zeugnisse-Crawler, Screen Flow
- Full backend, studio-v2, website, klausur-service
- All compliance SDKs, agent-core, voice-service
- CI/CD configs, documentation, scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New file: backend/docs/compliance_ai_integration.md (447 lines)
# Compliance AI Integration - Sprint 4

## Overview

The Compliance AI Integration provides AI-powered features for automatically interpreting regulatory requirements, suggesting controls, and assessing risk for Breakpilot modules.

## Architecture

```
Client → FastAPI Routes → AIComplianceAssistant → LLMProvider
                                                  ├─ AnthropicProvider (Claude API)
                                                  ├─ SelfHostedProvider (Ollama/vLLM)
                                                  └─ MockProvider (Testing)
```
## Components

### 1. LLM Provider Abstraction (`llm_provider.py`)

**Abstract base class**: `LLMProvider`
- `complete()`: single completion
- `batch_complete()`: batch processing with rate limiting

**Implementations**:

#### AnthropicProvider
- Uses the Claude API (https://api.anthropic.com)
- Recommended for production (best quality)
- Model: `claude-sonnet-4-20250514` (default)
- Requires: `ANTHROPIC_API_KEY`

#### SelfHostedProvider
- Supports Ollama, vLLM, LocalAI
- Auto-detects API formats (Ollama vs. OpenAI-compatible)
- Cost-effective alternative for self-hosting
- Requires: `SELF_HOSTED_LLM_URL`

#### MockProvider
- For unit tests without real API calls
- Supports predefined responses

**Factory function**: `get_llm_provider(config?: LLMConfig) -> LLMProvider`
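The provider abstraction above can be pictured roughly as follows. This is a minimal sketch, not the actual code from `llm_provider.py`: the class name `CannedProvider` and the exact signatures are illustrative assumptions.

```python
import asyncio
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Common interface every provider implements (sketch)."""

    @abstractmethod
    async def complete(self, prompt: str) -> str:
        """Return a single completion for the prompt."""

class CannedProvider(LLMProvider):
    """Stand-in for MockProvider: cycles through predefined responses."""

    def __init__(self, responses):
        self._responses = responses
        self._calls = 0

    async def complete(self, prompt: str) -> str:
        response = self._responses[self._calls % len(self._responses)]
        self._calls += 1
        return response

provider = CannedProvider(['{"summary": "Test", "risk_level": "low"}'])
print(asyncio.run(provider.complete("Interpret Art. 32 GDPR")))
```

Because every provider shares this interface, the factory can swap Anthropic, a self-hosted model, or a mock without any caller changes.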
### 2. AI Compliance Assistant (`ai_compliance_assistant.py`)

Main class for all AI features:

```python
from compliance.services.ai_compliance_assistant import get_ai_assistant

assistant = get_ai_assistant()
```

#### Methods

**interpret_requirement()**
- Translates legal requirements into technical guidance
- Identifies affected Breakpilot modules
- Assesses the risk level
- Provides implementation hints

**suggest_controls()**
- Suggests 1-3 matching controls
- Assigns a domain (priv, iam, sdlc, crypto, ops, ai, cra, gov, aud)
- Provides implementation guidance and pass criteria
- Detects controls that can be automated

**assess_module_risk()**
- Assesses the compliance risk of service modules
- Considers: PII processing, AI components, criticality
- Identifies compliance gaps
- Provides risk-mitigation recommendations

**analyze_gap()**
- Analyzes coverage between requirements and controls
- Identifies missing coverage
- Suggests concrete actions

**batch_interpret_requirements()**
- Processes multiple requirements with rate limiting
- For bulk processing of regulations
### 3. API Endpoints (`routes.py`)

All endpoints live under `/api/v1/compliance/ai/`:

#### GET `/ai/status`
Checks the status of the AI provider.

**Response**:
```json
{
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "is_available": true,
  "is_mock": false,
  "error": null
}
```
#### POST `/ai/interpret`
Interprets a requirement.

**Request**:
```json
{
  "requirement_id": "uuid",
  "force_refresh": false
}
```

**Response**:
```json
{
  "requirement_id": "uuid",
  "summary": "Short summary",
  "applicability": "How this applies to Breakpilot",
  "technical_measures": ["Measure 1", "Measure 2"],
  "affected_modules": ["consent-service", "klausur-service"],
  "risk_level": "high",
  "implementation_hints": ["Hint 1", "Hint 2"],
  "confidence_score": 0.85,
  "error": null
}
```
#### POST `/ai/suggest-controls`
Suggests controls for a requirement.

**Request**:
```json
{
  "requirement_id": "uuid"
}
```

**Response**:
```json
{
  "requirement_id": "uuid",
  "suggestions": [
    {
      "control_id": "PRIV-042",
      "domain": "priv",
      "title": "Encryption of personal data",
      "description": "...",
      "pass_criteria": "All PII is encrypted with AES-256",
      "implementation_guidance": "...",
      "is_automated": true,
      "automation_tool": "SOPS + Age",
      "priority": "high",
      "confidence_score": 0.9
    }
  ]
}
```
#### POST `/ai/assess-risk`
Assesses the risk of a module.

**Request**:
```json
{
  "module_id": "consent-service"
}
```

**Response**:
```json
{
  "module_name": "consent-service",
  "overall_risk": "high",
  "risk_factors": [
    {
      "factor": "Processes personal data",
      "severity": "high",
      "likelihood": "high"
    }
  ],
  "recommendations": ["Recommendation 1", "Recommendation 2"],
  "compliance_gaps": ["Gap 1", "Gap 2"],
  "confidence_score": 0.8
}
```
#### POST `/ai/gap-analysis`
Analyzes coverage gaps.

**Request**:
```json
{
  "requirement_id": "uuid"
}
```

**Response**:
```json
{
  "requirement_id": "uuid",
  "requirement_title": "Art. 32 GDPR - Security of processing",
  "coverage_level": "partial",
  "existing_controls": ["PRIV-001", "PRIV-002"],
  "missing_coverage": ["Missing measure 1"],
  "suggested_actions": ["Action 1", "Action 2"]
}
```
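The `coverage_level` field can be read as a classification of how much of a requirement the existing controls address. As a purely illustrative heuristic (the real classification comes from the LLM; this set-based logic is an assumption, not the documented behavior):

```python
def coverage_level(required: set[str], covered: set[str]) -> str:
    """Classify coverage as none / partial / full (illustrative heuristic)."""
    hit = required & covered
    if not hit:
        return "none"
    if hit == required:
        return "full"
    return "partial"

print(coverage_level({"encryption", "access-logging"}, {"encryption"}))  # partial
```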
#### POST `/ai/batch-interpret`
Batch interpretation of multiple requirements.

**Request**:
```json
{
  "requirement_ids": ["uuid1", "uuid2"],
  "regulation_code": "GDPR",
  "rate_limit": 1.0
}
```

**Response**:
```json
{
  "total": 10,
  "processed": 10,
  "interpretations": [...]
}
```
## Configuration

### Environment Variables

#### Base settings
```bash
# Provider selection
COMPLIANCE_LLM_PROVIDER=anthropic  # or: self_hosted, mock

# LLM parameters
COMPLIANCE_LLM_MAX_TOKENS=4096
COMPLIANCE_LLM_TEMPERATURE=0.3
COMPLIANCE_LLM_TIMEOUT=60.0
```

#### Anthropic Claude (recommended)
```bash
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514
```

#### Self-hosted alternative
```bash
COMPLIANCE_LLM_PROVIDER=self_hosted
SELF_HOSTED_LLM_URL=http://localhost:11434
SELF_HOSTED_LLM_MODEL=llama3.1:8b
SELF_HOSTED_LLM_KEY=optional-api-key
```
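At startup these variables might be picked up along the following lines. This is a sketch with assumed defaults; the actual `LLMConfig` loading code may differ.

```python
import os
from dataclasses import dataclass

@dataclass
class LLMConfig:
    provider_type: str
    max_tokens: int
    temperature: float
    timeout: float

def load_config(env=os.environ) -> LLMConfig:
    # Fall back to the documented defaults when a variable is unset.
    return LLMConfig(
        provider_type=env.get("COMPLIANCE_LLM_PROVIDER", "anthropic"),
        max_tokens=int(env.get("COMPLIANCE_LLM_MAX_TOKENS", "4096")),
        temperature=float(env.get("COMPLIANCE_LLM_TEMPERATURE", "0.3")),
        timeout=float(env.get("COMPLIANCE_LLM_TIMEOUT", "60.0")),
    )

cfg = load_config({"COMPLIANCE_LLM_PROVIDER": "mock"})
print(cfg.provider_type, cfg.max_tokens)  # mock 4096
```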
### Prompts

All prompts are in **German** and **Breakpilot-specific**:

- Account for the EdTech context (school administration, grades, report cards)
- Know the AI features (exam grading, feedback)
- Account for GDPR requirements
- Understand Breakpilot modules (consent-service, klausur-service, etc.)
## Usage

### Example 1: Interpreting a requirement

```python
from compliance.services.ai_compliance_assistant import get_ai_assistant

assistant = get_ai_assistant()

result = await assistant.interpret_requirement(
    requirement_id="req-123",
    article="Art. 32",
    title="Security of processing",
    requirement_text="The controller must...",
    regulation_code="GDPR",
    regulation_name="DSGVO"
)

print(f"Risk: {result.risk_level}")
print(f"Affected modules: {', '.join(result.affected_modules)}")
```
### Example 2: Suggesting controls

```python
suggestions = await assistant.suggest_controls(
    requirement_title="Encryption at rest and in transit",
    requirement_text="Personal data must...",
    regulation_name="DSGVO",
    affected_modules=["consent-service", "klausur-service"]
)

for control in suggestions:
    print(f"{control.control_id}: {control.title}")
    print(f"  Domain: {control.domain}")
    print(f"  Automatable: {control.is_automated}")
```
### Example 3: Batch processing

```python
requirements = [
    {"id": "req-1", "article": "Art. 32", ...},
    {"id": "req-2", "article": "Art. 33", ...},
]

results = await assistant.batch_interpret_requirements(
    requirements=requirements,
    rate_limit=1.0  # 1 second between calls
)

for r in results:
    if r.error:
        print(f"Error for {r.requirement_id}: {r.error}")
    else:
        print(f"{r.requirement_id}: {r.summary}")
```
## Rate Limiting

- **Anthropic**: 1 request/second (default)
- **Self-hosted**: 0.5 seconds (can be faster)
- Configurable via the `rate_limit` parameter
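The effect of the `rate_limit` parameter can be sketched as a fixed delay between sequential calls. This is illustrative only; `throttled_map` and `fake_call` are made-up names, and the real implementation in `llm_provider.py` may work differently.

```python
import asyncio

async def throttled_map(func, items, rate_limit: float):
    """Apply an async func to each item, waiting rate_limit seconds between calls."""
    results = []
    for i, item in enumerate(items):
        if i > 0:
            await asyncio.sleep(rate_limit)  # pause before every call but the first
        results.append(await func(item))
    return results

async def fake_call(x):
    return x * 2

out = asyncio.run(throttled_map(fake_call, [1, 2, 3], rate_limit=0.05))
print(out)  # [2, 4, 6], after roughly 0.1 s of throttling
```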
## Error Handling

All methods catch exceptions and return valid responses:

```python
result = await assistant.interpret_requirement(...)

if result.error:
    print(f"An error occurred: {result.error}")
    print(f"Confidence: {result.confidence_score}")  # 0.0 on error
else:
    print(f"Success: {result.summary}")
```
## Testing

### Unit tests with MockProvider

```python
from compliance.services.llm_provider import MockProvider, LLMConfig, LLMProviderType
from compliance.services.ai_compliance_assistant import AIComplianceAssistant

# Create a mock provider
config = LLMConfig(provider_type=LLMProviderType.MOCK)
provider = MockProvider(config)

# Predefined responses
provider.set_responses([
    '{"summary": "Test response", "risk_level": "low"}'
])

# AI assistant with the mock
assistant = AIComplianceAssistant(llm_provider=provider)
result = await assistant.interpret_requirement(...)
```
### Integration tests

```bash
# Use the mock provider
export COMPLIANCE_LLM_PROVIDER=mock

# Start the backend
cd backend && uvicorn main:app --reload

# Run the tests
pytest tests/test_compliance_ai.py -v
```
## Best Practices

### 1. Provider selection

- **Production**: Anthropic Claude (best quality)
- **Development**: self-hosted Ollama (free)
- **Tests**: MockProvider (fast, deterministic)
### 2. Caching

AI interpretations should be cached:
- Store `result.raw_response` in the DB
- Set `force_refresh=True` only when needed
- Cuts API costs substantially
### 3. Batch processing

For the initial setup of regulations:
- Use `batch_interpret_requirements()`
- Set `rate_limit` to match the provider
- Process at most 50 requirements per batch
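Splitting a large regulation into batches of at most 50 can be done with a small helper. The `chunked` function is illustrative, not part of the documented API:

```python
def chunked(items, size=50):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

requirement_ids = [f"req-{i}" for i in range(120)]

for batch in chunked(requirement_ids, size=50):
    # In real code: await assistant.batch_interpret_requirements(batch, ...)
    print(len(batch))  # prints 50, 50, 20
```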
### 4. Error handling

Always check `result.error`:
```python
if result.error:
    logger.warning(f"AI failed, using fallback: {result.error}")
    ...  # fallback logic
else:
    ...  # use the AI result
```
### 5. Monitoring

Track the following metrics:
- Response times
- Error rates
- Confidence scores
- Token usage (with Anthropic)
## Roadmap

### Already implemented (Sprint 4)
- [x] LLM provider abstraction
- [x] Anthropic + self-hosted support
- [x] AI Compliance Assistant
- [x] All API endpoints
- [x] Batch processing

### Planned (Sprint 5+)
- [ ] Response caching in the DB
- [ ] Background job queue for batch processing
- [ ] Webhook support for async processing
- [ ] Fine-tuning on Breakpilot data
- [ ] Multi-model ensemble (combines several LLMs)
- [ ] Automatic re-training based on auditor feedback
## Support

If you have questions or problems:
1. Check `/api/v1/compliance/ai/status`
2. Check the logs: `docker logs breakpilot-backend`
3. Test with the mock provider: `COMPLIANCE_LLM_PROVIDER=mock`
## License

Part of the BreakPilot Compliance Framework.
All rights reserved.