# AI Content Generator Service
Automatic generation of H5P learning content with Claude AI and YouTube integration.
## Overview
The AI Content Generator analyzes uploaded learning materials and automatically generates all 8 H5P content types:
1. **Quiz** - Multiple-choice questions
2. **Interactive Video** - YouTube videos with interactions
3. **Course Presentation** - Presentation slides
4. **Flashcards** - Study cards
5. **Timeline** - Chronological overviews
6. **Drag and Drop** - Matching exercises
7. **Fill in the Blanks** - Cloze texts
8. **Memory Game** - Memory pairs
## Features
### Material Analysis
- **PDF**: Text extraction from multi-page PDFs
- **Images**: OCR text recognition (Tesseract)
- **DOCX**: Word document analysis
- **Text**: Plain-text files
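The choice of extractor per upload could be driven by the file extension. The following is a minimal sketch, not the service's actual code: the `EXTENSION_KINDS` table, the function name `detect_material_kind`, and the exact list of accepted extensions are assumptions for illustration.

```python
from pathlib import Path

# Hypothetical mapping from file extension to material kind. The extension
# list is an assumption based on the formats named in this README.
EXTENSION_KINDS = {
    ".pdf": "pdf",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".docx": "docx",
    ".txt": "text",
}


def detect_material_kind(filename: str) -> str:
    """Return the material kind for a filename, or raise for unsupported types."""
    suffix = Path(filename).suffix.lower()
    try:
        return EXTENSION_KINDS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported material type: {filename!r}") from None
```

A caller would then route `"pdf"` to the PDF text extractor, `"image"` to OCR, and so on, and reject anything else before starting a job.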
### YouTube Integration
- Automatic video search (optional)
- Transcript analysis (German/English)
- AI-generated interactions with timestamps
### Claude AI Integration
- Age-appropriate content generation
- Based on the uploaded materials
- JSON-structured output
## Installation
### Local Development
```bash
cd ai-content-generator
# Create a virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Install Tesseract OCR (for image analysis)
# macOS:
brew install tesseract tesseract-lang
# Ubuntu/Debian:
sudo apt-get install tesseract-ocr tesseract-ocr-deu
```
### Environment Variables
Copy `.env.example` to `.env`:
```bash
cp .env.example .env
```
Set your Anthropic API key:
```env
ANTHROPIC_API_KEY=your-api-key-here
```
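A service can fail fast at startup when the key is missing instead of failing on the first generation request. The sketch below is one way to do that, assuming the `.env` values have already been loaded into the process environment (for example by Docker Compose or a dotenv loader); the function name `load_api_key` is hypothetical.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return ANTHROPIC_API_KEY, raising if it is missing or still the placeholder."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    # "your-api-key-here" is the placeholder value shown in this README.
    if not key or key == "your-api-key-here":
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; copy .env.example to .env and fill it in"
        )
    return key
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.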
### Docker
```bash
# Start the content services (incl. AI Generator)
docker-compose -f docker-compose.content.yml up -d
# Rebuild only the AI Generator
docker-compose -f docker-compose.content.yml up -d --build ai-content-generator
```
## API Endpoints
### Health Check
```bash
GET /health
```
### Start Content Generation
```bash
POST /api/generate-content
Content-Type: multipart/form-data
# Form data:
- topic: string (e.g. "Das Auge")
- description: string (optional)
- target_grade: string (e.g. "5-6", "7-8")
- materials: File[] (PDFs, images, DOCX)
```
**Example:**
```bash
curl -X POST http://localhost:8004/api/generate-content \
  -F "topic=Das Auge" \
  -F "description=Biologie Thema für Klasse 7" \
  -F "target_grade=7-8" \
  -F "materials=@auge_skizze.pdf" \
  -F "materials=@auge_text.docx"
```
**Response:**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "message": "Content generation started"
}
```
### Check Generation Status
```bash
GET /api/generation-status/{job_id}
```
**Response:**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "progress": 50,
  "current_step": "Generating Quiz questions..."
}
```
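Because generation runs as a background job, a client typically polls this endpoint until the job finishes. The sketch below shows one possible polling loop. It takes the HTTP call as a `fetch_status` callable so it stays independent of any HTTP library. The statuses `pending` and `processing` come from the responses in this README; the terminal names `completed` and `failed` are assumptions and may differ in the real service.

```python
import time
from typing import Callable

# "pending" and "processing" appear in this README; the terminal status
# names below are assumptions and may differ in the real service.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status(job_id) until a terminal status or the timeout is reached."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        if status.get("status") in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish within {timeout} seconds")
        sleep(interval)
```

In practice `fetch_status` would wrap a `GET /api/generation-status/{job_id}` request and return the decoded JSON body.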
### Retrieve Generated Content
```bash
GET /api/generated-content/{job_id}
```
**Response:**
```json
{
  "topic": "Das Auge",
  "target_grade": "7-8",
  "generated_at": "2025-01-15T10:30:00Z",
  "content_types": {
    "quiz": {
      "type": "quiz",
      "title": "Quiz: Das Auge",
      "questions": [...]
    },
    "interactive_video": {
      "type": "interactive-video",
      "videoUrl": "https://youtube.com/watch?v=...",
      "interactions": [...]
    },
    "flashcards": {...},
    "timeline": {...},
    "drag_drop": {...},
    "fill_blanks": {...},
    "memory": {...},
    "course_presentation": {...}
  }
}
```
```
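A client may want to know which content types a job actually produced before rendering them. The helper below is a small sketch over the response shape shown above. The name `list_generated_types` is hypothetical, and the idea that a type which could not be generated arrives as `null` or an empty object is an assumption.

```python
def list_generated_types(content: dict) -> list[str]:
    """Return the keys under content_types whose value is non-empty, sorted."""
    content_types = content.get("content_types", {})
    # Assumption: a type that could not be generated is null or an empty object.
    return sorted(name for name, payload in content_types.items() if payload)
```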
### YouTube Video Search
```bash
POST /api/youtube-search
Content-Type: application/json
{
  "query": "Das Auge Biologie",
  "max_results": 5
}
```
## Content Type Structure
### Quiz
```json
{
  "type": "quiz",
  "title": "Quiz: Das Auge",
  "questions": [
    {
      "question": "Was ist die Funktion der Pupille?",
      "options": ["A", "B", "C", "D"],
      "correct_answer": 0,
      "explanation": "..."
    }
  ]
}
```
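Since the quiz is produced by a language model, it can be worth checking its structure before handing it to H5P. The sketch below is one possible check and not part of the service. It assumes `correct_answer` is a zero-based index into `options`, which matches the example above but is not stated explicitly; `validate_quiz` is a hypothetical name.

```python
def validate_quiz(quiz: dict) -> list[str]:
    """Return a list of problems; an empty list means the quiz looks consistent."""
    problems = []
    for index, question in enumerate(quiz.get("questions", [])):
        options = question.get("options", [])
        answer = question.get("correct_answer")
        if not question.get("question"):
            problems.append(f"question {index}: empty question text")
        if len(options) < 2:
            problems.append(f"question {index}: fewer than two options")
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if (
            not isinstance(answer, int)
            or isinstance(answer, bool)
            or not 0 <= answer < len(options)
        ):
            problems.append(f"question {index}: correct_answer is not a valid option index")
    return problems
```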
### Interactive Video
```json
{
  "type": "interactive-video",
  "videoUrl": "https://youtube.com/watch?v=xyz",
  "interactions": [
    {
      "time": "01:30",
      "seconds": 90,
      "type": "question",
      "title": "Verständnisfrage",
      "content": "Was wurde gerade erklärt?"
    }
  ]
}
```
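Each interaction carries the timestamp twice, as a display string (`time`) and as a number (`seconds`). The sketch below converts one into the other and checks that the two agree. The function names are hypothetical, and support for `HH:MM:SS` in addition to the `MM:SS` form shown above is an assumption.

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert "MM:SS" or "HH:MM:SS" into a number of seconds."""
    parts = [int(part) for part in timestamp.split(":")]
    if not 2 <= len(parts) <= 3 or any(part < 0 for part in parts):
        raise ValueError(f"Unsupported timestamp: {timestamp!r}")
    seconds = 0
    for part in parts:
        # Each further field is 60 times finer than the previous one.
        seconds = seconds * 60 + part
    return seconds


def interaction_is_consistent(interaction: dict) -> bool:
    """Check that an interaction's "time" and "seconds" fields agree."""
    return timestamp_to_seconds(interaction["time"]) == interaction["seconds"]
```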
### Flashcards
```json
{
  "type": "flashcards",
  "cards": [
    {
      "id": 1,
      "front": "Begriff",
      "back": "Definition"
    }
  ]
}
```
### Timeline
```json
{
  "type": "timeline",
  "events": [
    {
      "id": 1,
      "year": "1800",
      "title": "Ereignis",
      "description": "..."
    }
  ]
}
```
### Drag and Drop
```json
{
  "type": "drag-drop",
  "title": "Zuordnung",
  "zones": [
    {"id": 1, "name": "Kategorie 1"}
  ],
  "draggables": [
    {"id": 1, "text": "Element", "correctZoneId": 1}
  ]
}
```
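In this structure every draggable points at its target zone through `correctZoneId`. A generated exercise is only solvable if each of those ids refers to an existing zone. The following sketch checks that; `find_orphan_draggables` is a hypothetical helper, not a function of the service.

```python
def find_orphan_draggables(exercise: dict) -> list[int]:
    """Return the ids of draggables whose correctZoneId matches no zone."""
    zone_ids = {zone["id"] for zone in exercise.get("zones", [])}
    return [
        item["id"]
        for item in exercise.get("draggables", [])
        if item.get("correctZoneId") not in zone_ids
    ]
```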
### Fill in the Blanks
```json
{
  "type": "fill-blanks",
  "title": "Lückentext",
  "text": "Das Auge hat eine *Linse* und eine *Netzhaut*.",
  "hints": "Tipps..."
}
```
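In the example above the expected answers appear to be marked with asterisks inside `text`. Assuming that convention holds for all generated cloze texts, the sketch below separates the display text from the answers. The name `split_blanks` and the `_____` placeholder are illustrative choices.

```python
import re

# Matches a word or phrase wrapped in single asterisks, e.g. *Linse*.
BLANK_PATTERN = re.compile(r"\*([^*]+)\*")


def split_blanks(text: str, placeholder: str = "_____") -> tuple[str, list[str]]:
    """Return the text with blanks replaced by a placeholder, plus the expected answers."""
    answers = BLANK_PATTERN.findall(text)
    blanked = BLANK_PATTERN.sub(placeholder, text)
    return blanked, answers
```

The answers are returned in the order in which they occur in the text, so the n-th placeholder corresponds to the n-th answer.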
### Memory Game
```json
{
  "type": "memory",
  "pairs": [
    {
      "id": 1,
      "card1": "Begriff 1",
      "card2": "Zugehöriger Begriff"
    }
  ]
}
```
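To play the game, a front end has to turn the pairs into a shuffled deck in which two cards match when they come from the same pair. The sketch below shows one way to do that. The card shape (`pair_id`, `text`) and the function name `build_memory_deck` are assumptions and not part of the documented output.

```python
import random


def build_memory_deck(pairs: list[dict], seed=None) -> list[dict]:
    """Return a shuffled deck with two cards per pair, each tagged with its pair id."""
    deck = []
    for pair in pairs:
        deck.append({"pair_id": pair["id"], "text": pair["card1"]})
        deck.append({"pair_id": pair["id"], "text": pair["card2"]})
    # A dedicated Random instance keeps the shuffle reproducible when a seed is given.
    random.Random(seed).shuffle(deck)
    return deck
```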
### Course Presentation
```json
{
  "type": "course-presentation",
  "slides": [
    {
      "id": 1,
      "title": "Folie 1",
      "content": "...",
      "backgroundColor": "#ffffff"
    }
  ]
}
```
## Technology Stack
- **FastAPI**: Web framework
- **Anthropic Claude**: AI content generation
- **youtube-transcript-api**: YouTube transcript analysis
- **PyPDF2**: PDF processing
- **Pillow + Tesseract**: OCR for images
- **python-docx**: Word document processing
## Development
### Running Tests
```bash
# Unit Tests (TODO)
pytest tests/
# Integration Tests (TODO)
pytest tests/integration/
```
### Starting the Service Locally
```bash
source venv/bin/activate
uvicorn app.main:app --reload --port 8004
```
API documentation: http://localhost:8004/docs
## Limitations
- **YouTube API**: Without an API key, a fallback is used
- **OCR**: Requires a Tesseract installation
- **Anthropic API**: Paid service (Claude API key required)
- **Job Store**: In-memory (TODO: Redis backend)
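To illustrate what an in-memory job store implies, here is a minimal sketch of such a store. It is not the service's implementation; the class name `InMemoryJobStore` and its methods are hypothetical, and only the fields `job_id`, `status`, and `progress` are taken from the responses documented above.

```python
import threading
import uuid


class InMemoryJobStore:
    """Hypothetical job store that keeps job state in a dict guarded by a lock."""

    def __init__(self) -> None:
        self._jobs: dict[str, dict] = {}
        self._lock = threading.Lock()

    def create(self) -> str:
        """Register a new job in the "pending" state and return its id."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self._jobs[job_id] = {"job_id": job_id, "status": "pending", "progress": 0}
        return job_id

    def update(self, job_id: str, **fields) -> None:
        """Merge new fields (e.g. status, progress) into an existing job."""
        with self._lock:
            self._jobs[job_id].update(fields)

    def get(self, job_id: str):
        """Return a copy of the job state, or None if the id is unknown."""
        with self._lock:
            job = self._jobs.get(job_id)
            return dict(job) if job is not None else None
```

State held this way is lost when the process restarts and is not shared between worker processes, which is why an external backend such as Redis is listed as a TODO.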
## TODO
- [ ] Redis backend for the job store
- [ ] Celery for background tasks
- [ ] Rate Limiting
- [ ] Unit & Integration Tests
- [ ] API Authentication
- [ ] Webhook Notifications
- [ ] Batch Processing
- [ ] Content Quality Validation
## Support
For questions or problems:
- GitHub Issues
- Documentation: `/docs/ai-content-generator/`