This repository has been archived on 2026-02-15. You can view files and clone it. You cannot open issues or pull requests or push a commit.
Files
breakpilot-pwa/.claude/rules/vocab-worksheet.md
BreakPilot Dev 19855efacc
Some checks failed
Tests / Go Tests (push) Has been cancelled
Tests / Python Tests (push) Has been cancelled
Tests / Integration Tests (push) Has been cancelled
Tests / Go Lint (push) Has been cancelled
Tests / Python Lint (push) Has been cancelled
Tests / Security Scan (push) Has been cancelled
Tests / All Checks Passed (push) Has been cancelled
Security Scanning / Secret Scanning (push) Has been cancelled
Security Scanning / Dependency Vulnerability Scan (push) Has been cancelled
Security Scanning / Go Security Scan (push) Has been cancelled
Security Scanning / Python Security Scan (push) Has been cancelled
Security Scanning / Node.js Security Scan (push) Has been cancelled
Security Scanning / Docker Image Security (push) Has been cancelled
Security Scanning / Security Summary (push) Has been cancelled
CI/CD Pipeline / Go Tests (push) Has been cancelled
CI/CD Pipeline / Python Tests (push) Has been cancelled
CI/CD Pipeline / Website Tests (push) Has been cancelled
CI/CD Pipeline / Linting (push) Has been cancelled
CI/CD Pipeline / Security Scan (push) Has been cancelled
CI/CD Pipeline / Docker Build & Push (push) Has been cancelled
CI/CD Pipeline / Integration Tests (push) Has been cancelled
CI/CD Pipeline / Deploy to Staging (push) Has been cancelled
CI/CD Pipeline / Deploy to Production (push) Has been cancelled
CI/CD Pipeline / CI Summary (push) Has been cancelled
ci/woodpecker/manual/build-ci-image Pipeline was successful
ci/woodpecker/manual/main Pipeline failed
feat: BreakPilot PWA - Full codebase (clean push without large binaries)
All services: admin-v2, studio-v2, website, ai-compliance-sdk,
consent-service, klausur-service, voice-service, and infrastructure.
Large PDFs and compiled binaries excluded via .gitignore.
2026-02-11 13:25:58 +01:00

206 lines
5.3 KiB
Markdown

# Vokabel-Arbeitsblatt Generator - Entwicklerdokumentation
**Status:** Produktiv
**Letzte Aktualisierung:** 2026-02-08
**URL:** https://macmini/vocab-worksheet
---
## Uebersicht
Der Vokabel-Arbeitsblatt Generator ermoeglicht Lehrern:
- Schulbuchseiten (PDF/Bild) zu scannen
- Vokabeln automatisch per OCR zu extrahieren
- Druckfertige Arbeitsblaetter in verschiedenen Formaten zu generieren
---
## Architektur
```
Browser (studio-v2) klausur-service (Port 8086) PostgreSQL
│ │ │
│ POST /upload-pdf-info │ │
│ POST /process-single-page │ │
│ POST /generate │ │
│ POST /generate-nru │ ──── vocab_sessions ──────▶│
│ GET /worksheets/{id}/pdf │ ──── vocab_entries ───────▶│
│ │ ──── vocab_worksheets ────▶│
└────────────────────────────┘ │
```
---
## Arbeitsblatt-Formate
### Standard-Format
Klassisches Arbeitsblatt mit waehlbaren Uebungstypen:
- **Englisch → Deutsch**: Englische Woerter uebersetzen
- **Deutsch → Englisch**: Deutsche Woerter uebersetzen
- **Abschreibuebung**: Woerter mehrfach schreiben
- **Lueckensaetze**: Saetze mit Luecken ausfuellen
### NRU-Format (Neu: 2026-02-08)
Spezielles Format fuer strukturiertes Vokabellernen:
**Seite 1 (pro gescannter Seite): Vokabeltabelle**
| Englisch | Deutsch | Korrektur |
|----------|---------|-----------|
| word | (leer) | (leer) |
- Kind schreibt deutsche Uebersetzung
- Eltern korrigieren, Kind schreibt ggf. korrigierte Version
**Seite 2 (pro gescannter Seite): Lernsaetze**
| Deutscher Satz |
|-----------------------------------|
| (2 leere Zeilen fuer EN-Uebersetzung) |
- Deutscher Satz vorgegeben
- Kind schreibt englische Uebersetzung
**Automatische Trennung:**
- Einzelwoerter/Phrasen → Vokabeltabelle
- Saetze (enden mit `.!?` oder > 50 Zeichen) → Lernsaetze
---
## API-Endpoints
### Standard-Format
```
POST /api/v1/vocab/sessions/{session_id}/generate
Body: {
"worksheet_types": ["en_to_de", "de_to_en", "copy", "gap_fill"],
"title": "Vokabeln Unit 3",
"include_solutions": true,
"line_height": "normal" | "large" | "extra-large"
}
Response: { "id": "worksheet-uuid", ... }
```
### NRU-Format
```
POST /api/v1/vocab/sessions/{session_id}/generate-nru
Body: {
"title": "Vokabeltest",
"include_solutions": true,
"specific_pages": [1, 2] // optional, 1-indexed
}
Response: {
"worksheet_id": "uuid",
"statistics": {
"total_entries": 96,
"vocabulary_count": 75,
"sentence_count": 21,
"source_pages": [1, 2, 3],
"worksheet_pages": 6
},
"download_url": "/api/v1/vocab/worksheets/{id}/pdf",
"solution_url": "/api/v1/vocab/worksheets/{id}/solution"
}
```
### PDF-Download
```
GET /api/v1/vocab/worksheets/{worksheet_id}/pdf
GET /api/v1/vocab/worksheets/{worksheet_id}/solution
```
---
## Dateien
### Backend (klausur-service)
| Datei | Beschreibung |
|-------|--------------|
| `vocab_worksheet_api.py` | Haupt-API Router mit allen Endpoints |
| `nru_worksheet_generator.py` | NRU-Format HTML/PDF Generator |
| `vocab_session_store.py` | PostgreSQL Datenbankoperationen |
| `hybrid_vocab_extractor.py` | OCR-Extraktion (PaddleOCR + LLM) |
| `tesseract_vocab_extractor.py` | Tesseract OCR Fallback |
### Frontend (studio-v2)
| Datei | Beschreibung |
|-------|--------------|
| `app/vocab-worksheet/page.tsx` | Haupt-UI mit Template-Auswahl |
---
## Datenbank-Schema
```sql
-- Sessions
CREATE TABLE vocab_sessions (
id UUID PRIMARY KEY,
name VARCHAR(255),
status VARCHAR(50),
vocabulary_count INT,
source_language VARCHAR(10),
target_language VARCHAR(10),
created_at TIMESTAMP
);
-- Vokabeln
CREATE TABLE vocab_entries (
id UUID PRIMARY KEY,
session_id UUID REFERENCES vocab_sessions(id),
english TEXT,
german TEXT,
example_sentence TEXT,
source_page INT,
source_row INT,
source_column INT
);
-- Generierte Arbeitsblaetter
CREATE TABLE vocab_worksheets (
id UUID PRIMARY KEY,
session_id UUID REFERENCES vocab_sessions(id),
worksheet_types JSONB,
pdf_path VARCHAR(500),
solution_path VARCHAR(500),
generated_at TIMESTAMP
);
```
---
## Deployment
```bash
# 1. Backend synchronisieren
rsync -avz klausur-service/backend/ macmini:.../klausur-service/backend/
# 2. Frontend synchronisieren
rsync -avz studio-v2/app/vocab-worksheet/ macmini:.../studio-v2/app/vocab-worksheet/
# 3. Container neu bauen
ssh macmini "docker compose build --no-cache klausur-service studio-v2"
# 4. Container starten
ssh macmini "docker compose up -d klausur-service studio-v2"
```
---
## Erweiterung: Neue Formate hinzufuegen
1. **Backend**: Neuen Generator in `klausur-service/backend/` erstellen
2. **API**: Neuen Endpoint in `vocab_worksheet_api.py` hinzufuegen
3. **Frontend**: Format zu `worksheetFormats` Array in `page.tsx` hinzufuegen
4. **Doku**: Diese Datei aktualisieren
---
## Aenderungshistorie
| Datum | Aenderung |
|-------|-----------|
| 2026-02-08 | NRU-Format und Template-Auswahl hinzugefuegt |
| 2026-02-07 | Initiale Implementierung mit Standard-Format |