breakpilot-lehrer/docs-src/services/klausur-service/BYOEH-Architecture.md
Benjamin Boenisch e22019b2d5 Add CLAUDE.md, MkDocs docs, .claude/rules
2026-02-12 00:49:25 +01:00

BYOEH (Bring-Your-Own-Expectation-Horizon) - Architecture Documentation

Overview

The BYOEH module enables teachers to upload their own Erwartungshorizonte (expectation horizons/grading rubrics) and use them for RAG-assisted grading suggestions. Key design principles:

  • Tenant Isolation: Each teacher/school has an isolated namespace
  • No Training Guarantee: EH content is only used for RAG, never for model training
  • Operator Blindness: Client-side encryption ensures Breakpilot cannot view plaintext
  • Rights Confirmation: Required legal acknowledgment at upload time

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                         klausur-service (Port 8086)                      │
├─────────────────────────────────────────────────────────────────────────┤
│  ┌────────────────────┐    ┌─────────────────────────────────────────┐  │
│  │   BYOEH REST API   │    │           BYOEH Service Layer           │  │
│  │                    │    │                                         │  │
│  │ POST /api/v1/eh    │───▶│ - Upload Wizard Logic                   │  │
│  │ GET /api/v1/eh     │    │ - Rights Confirmation                   │  │
│  │ DELETE /api/v1/eh  │    │ - Chunking Pipeline                     │  │
│  │ POST /rag-query    │    │ - Encryption Service                    │  │
│  └────────────────────┘    └────────────────────┬────────────────────┘  │
└─────────────────────────────────────────────────┼────────────────────────┘
                                                  │
          ┌───────────────────────────────────────┼───────────────────────┐
          │                                       │                       │
          ▼                                       ▼                       ▼
┌──────────────────────┐   ┌──────────────────────────┐   ┌──────────────────────┐
│     PostgreSQL       │   │        Qdrant            │   │  Encrypted Storage   │
│   (Metadata + Audit) │   │   (Vector Search)        │   │   /app/eh-uploads/   │
│                      │   │                          │   │                      │
│ In-Memory Storage:   │   │ Collection: bp_eh        │   │ {tenant}/{eh_id}/    │
│ - erwartungshorizonte│   │ - tenant_id (filter)     │   │   encrypted.bin      │
│ - eh_chunks          │   │ - eh_id                  │   │   salt.txt           │
│ - eh_key_shares      │   │ - embedding[1536]        │   │                      │
│ - eh_klausur_links   │   │ - encrypted_content      │   └──────────────────────┘
│ - eh_audit_log       │   │                          │
└──────────────────────┘   └──────────────────────────┘
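The per-tenant storage layout shown in the diagram can be sketched as a small path helper. This is a minimal sketch, not the service's actual code: the function name `eh_storage_paths` and the ID validation rule are assumptions; only the `{tenant}/{eh_id}/encrypted.bin` and `salt.txt` layout comes from the diagram.

```python
import re
from pathlib import PurePosixPath

# Assumption: IDs are limited to a conservative character set so that a
# crafted tenant or EH ID cannot escape the upload directory.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def eh_storage_paths(upload_dir: str, tenant_id: str, eh_id: str) -> dict[str, PurePosixPath]:
    """Return the blob and salt paths for one EH (hypothetical helper)."""
    for value in (tenant_id, eh_id):
        if not _SAFE_ID.fullmatch(value):
            raise ValueError(f"unsafe path component: {value!r}")
    base = PurePosixPath(upload_dir) / tenant_id / eh_id
    return {"blob": base / "encrypted.bin", "salt": base / "salt.txt"}
```

Rejecting IDs such as `../etc` before building the path keeps the tenant namespace a strict subdirectory of the upload root.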

Data Flow

1. Upload Flow

Browser                          Backend                        Storage
   │                               │                               │
   │ 1. User selects PDF          │                               │
   │ 2. User enters passphrase    │                               │
   │ 3. PBKDF2 key derivation     │                               │
   │ 4. AES-256-GCM encryption    │                               │
   │ 5. SHA-256 key hash          │                               │
   │                               │                               │
   │──────────────────────────────▶│                               │
   │ POST /api/v1/eh/upload        │                               │
   │ (encrypted blob + key_hash)   │                               │
   │                               │──────────────────────────────▶│
   │                               │ Store encrypted.bin + salt    │
   │                               │◀──────────────────────────────│
   │                               │                               │
   │                               │ Save metadata to DB           │
   │◀──────────────────────────────│                               │
   │ Return EH record              │                               │
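The backend half of the upload flow might look roughly like the following. It is a sketch under stated assumptions: `store_upload`, the `EH_RECORDS` dictionary, and the record fields are hypothetical names, and a dictionary stands in for the metadata store. The point it illustrates is that the server handles only ciphertext, the salt, and the key hash.

```python
import uuid
from datetime import datetime, timezone
from pathlib import Path

# Stand-in for the metadata table (assumption: the real store may differ).
EH_RECORDS: dict[str, dict] = {}


def store_upload(upload_dir: Path, tenant_id: str, encrypted_blob: bytes,
                 salt: str, key_hash: str, rights_confirmed: bool) -> dict:
    """Persist an already-encrypted upload and return its metadata record."""
    if not rights_confirmed:
        raise PermissionError("rights confirmation is required at upload time")
    eh_id = str(uuid.uuid4())
    target = upload_dir / tenant_id / eh_id
    target.mkdir(parents=True)
    # Only ciphertext and the salt are written; the key never reaches the server.
    (target / "encrypted.bin").write_bytes(encrypted_blob)
    (target / "salt.txt").write_text(salt)
    record = {
        "id": eh_id,
        "tenant_id": tenant_id,
        "key_hash": key_hash,
        "uploaded_at": datetime.now(timezone.utc),
        "indexed": False,
    }
    EH_RECORDS[eh_id] = record
    return record
```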

2. Indexing Flow (RAG Preparation)

Browser                          Backend                        Qdrant
   │                               │                               │
   │──────────────────────────────▶│                               │
   │ POST /api/v1/eh/{id}/index    │                               │
   │ (passphrase for decryption)   │                               │
   │                               │                               │
   │                               │ 1. Verify key hash            │
   │                               │ 2. Decrypt content            │
   │                               │ 3. Extract text (PDF)         │
   │                               │ 4. Chunk text                 │
   │                               │ 5. Generate embeddings        │
   │                               │ 6. Re-encrypt each chunk      │
   │                               │──────────────────────────────▶│
   │                               │ Index vectors + encrypted     │
   │                               │ chunks with tenant filter     │
   │◀──────────────────────────────│                               │
   │ Return chunk count            │                               │
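Steps 1 and 4 of the indexing flow can be illustrated with the standard library alone. This is a sketch, not the contents of `eh_pipeline.py`: the function names, the hex encoding of the key hash, and the chunk size and overlap defaults are all assumptions.

```python
import hashlib
import hmac


def verify_key_hash(derived_key: bytes, stored_key_hash: str) -> bool:
    """Step 1: compare SHA-256(derived_key) with the hash saved at upload."""
    candidate = hashlib.sha256(derived_key).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate, stored_key_hash)


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Step 4: split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of some duplicated embedding work.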

3. RAG Query Flow

Browser                          Backend                        Qdrant
   │                               │                               │
   │──────────────────────────────▶│                               │
   │ POST /api/v1/eh/rag-query     │                               │
   │ (query + passphrase)          │                               │
   │                               │                               │
   │                               │ 1. Generate query embedding   │
   │                               │──────────────────────────────▶│
   │                               │ 2. Semantic search            │
   │                               │    (tenant-filtered)          │
   │                               │◀──────────────────────────────│
   │                               │ 3. Decrypt matched chunks     │
   │◀──────────────────────────────│                               │
   │ Return decrypted context      │                               │
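The tenant-filtered search in step 2 can be modelled in memory to show the intended behaviour. This sketch does not use the Qdrant client; `search_chunks` and the point dictionaries are hypothetical, and cosine similarity is an assumption about the distance metric.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_chunks(points: list[dict], query_vec: list[float],
                  tenant_id: str, limit: int = 3) -> list[dict]:
    """Rank only the caller's own points; other tenants are never scored."""
    own = [p for p in points if p["tenant_id"] == tenant_id]
    ranked = sorted(own, key=lambda p: cosine(p["embedding"], query_vec), reverse=True)
    return ranked[:limit]
```

Applying the tenant filter before ranking, rather than after, means a closer match belonging to another tenant cannot displace or leak into the result set.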

Security Architecture

Client-Side Encryption

┌─────────────────────────────────────────────────────────────────┐
│                    Browser (Client-Side)                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. User enters passphrase (not sent during upload)             │
│     │                                                           │
│     ▼                                                           │
│  2. Key Derivation: PBKDF2-SHA256(passphrase, salt, 100k iter)  │
│     │                                                           │
│     ▼                                                           │
│  3. Encryption: AES-256-GCM(key, iv, file_content)              │
│     │                                                           │
│     ▼                                                           │
│  4. Key-Hash: SHA-256(derived_key) → server verification only   │
│     │                                                           │
│     ▼                                                           │
│  5. Upload: encrypted_blob + key_hash + salt (NOT key!)         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
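Steps 2 and 4 of the diagram can be reproduced with Python's `hashlib` for testing or for server-side verification. This is a sketch, not the browser code in `encryption.ts`: the 16-byte salt length and the hex encoding of the key hash are assumptions. Step 3 (AES-256-GCM) is omitted because the Python standard library does not provide it.

```python
import hashlib
import os

PBKDF2_ITERATIONS = 100_000  # "100k iter" from the diagram


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Step 2: PBKDF2-SHA256 producing a 32-byte key suitable for AES-256."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt,
                               PBKDF2_ITERATIONS, dklen=32)


def key_hash(derived_key: bytes) -> str:
    """Step 4: SHA-256 of the derived key (hex encoding is an assumption)."""
    return hashlib.sha256(derived_key).hexdigest()


salt = os.urandom(16)  # assumption: a random 16-byte salt per upload
key = derive_key("example passphrase", salt)
digest = key_hash(key)  # this value, not the key, is what gets uploaded
```

Because the hash is taken over the derived key rather than the passphrase, the server can check that a later request uses the same key without being able to reverse the hash into key material.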

Security Guarantees

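| Guarantee | Implementation |
|---|---|
| No Training | training_allowed: false on all Qdrant points |
| Operator Blindness | Passphrase is not sent at upload and the server stores only the key hash; indexing and RAG queries transmit the passphrase per request for decryption (see Data Flow) |
| Tenant Isolation | Every query filtered by tenant_id |
| Audit Trail | All actions logged with timestamps |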
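The No Training and Tenant Isolation guarantees both depend on what is written into each vector point's payload. The following sketch builds such a point as a plain dictionary; `build_point` is a hypothetical helper, the `id`/`vector`/`payload` layout is an assumption about how the point is shaped, and no Qdrant client call is shown.

```python
EMBEDDING_DIM = 1536  # embedding[1536] from the architecture diagram


def build_point(point_id: str, tenant_id: str, eh_id: str,
                embedding: list[float], encrypted_content: str) -> dict:
    """Assemble one point for the bp_eh collection (illustrative layout)."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}")
    return {
        "id": point_id,
        "vector": embedding,
        "payload": {
            "tenant_id": tenant_id,          # used as the mandatory query filter
            "eh_id": eh_id,
            "encrypted_content": encrypted_content,  # ciphertext only
            "training_allowed": False,       # set unconditionally, never a parameter
        },
    }
```

Hard-coding `training_allowed` inside the builder, instead of accepting it as an argument, removes the possibility of a caller switching it on by mistake.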

Key Sharing System

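The key sharing system lets a first examiner grant second examiners, third examiners, and supervisors access to their EH. The passphrase is encrypted client-side for each recipient, so the backend stores only the encrypted form.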

Share Flow

First Examiner                   Backend                    Second Examiner
      │                             │                             │
      │ 1. Encrypt passphrase for   │                             │
      │    recipient (client-side)  │                             │
      │                             │                             │
      │─────────────────────────────▶                             │
      │ POST /eh/{id}/share         │                             │
      │ (encrypted_passphrase, role)│                             │
      │                             │                             │
      │                             │ Store EHKeyShare            │
      │◀─────────────────────────────                             │
      │                             │                             │
      │                             │                             │
      │                             │◀────────────────────────────│
      │                             │ GET /eh/shared-with-me      │
      │                             │                             │
      │                             │─────────────────────────────▶
      │                             │ Return shared EH list       │
      │                             │                             │
      │                             │◀────────────────────────────│
      │                             │ RAG query with decrypted    │
      │                             │ passphrase                  │

Data Structures

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class EHKeyShare:
    id: str
    eh_id: str
    user_id: str                    # Recipient
    encrypted_passphrase: str       # Client-encrypted for recipient
    passphrase_hint: str            # Optional hint
    granted_by: str                 # Grantor user ID
    granted_at: datetime
    role: str                       # second_examiner, third_examiner, supervisor
    klausur_id: Optional[str]       # Link to specific Klausur
    active: bool                    # False once the share has been revoked


@dataclass
class EHKlausurLink:
    id: str
    eh_id: str
    klausur_id: str
    linked_by: str                  # User ID that created the link
    linked_at: datetime
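The share, list, and revoke operations from the share flow could be sketched as follows. The dataclass repeats the `EHKeyShare` fields so the example runs on its own; `SHARES`, `share_eh`, `shared_with`, and `revoke_share` are hypothetical names, and treating revocation as setting `active` to `False` is an assumption based on the presence of that field.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

ALLOWED_ROLES = {"second_examiner", "third_examiner", "supervisor"}


@dataclass
class EHKeyShare:
    id: str
    eh_id: str
    user_id: str
    encrypted_passphrase: str
    passphrase_hint: str
    granted_by: str
    granted_at: datetime
    role: str
    klausur_id: Optional[str]
    active: bool


SHARES: list[EHKeyShare] = []  # in-memory stand-in for eh_key_shares


def share_eh(eh_id: str, granted_by: str, user_id: str, encrypted_passphrase: str,
             role: str, klausur_id: Optional[str] = None) -> EHKeyShare:
    """Store a share; the passphrase arrives already encrypted for the recipient."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role}")
    share = EHKeyShare(str(uuid.uuid4()), eh_id, user_id, encrypted_passphrase, "",
                       granted_by, datetime.now(timezone.utc), role, klausur_id, True)
    SHARES.append(share)
    return share


def shared_with(user_id: str) -> list[EHKeyShare]:
    """Return only the active shares addressed to this user."""
    return [s for s in SHARES if s.user_id == user_id and s.active]


def revoke_share(share_id: str) -> bool:
    """Deactivate a share instead of deleting it, so the record stays auditable."""
    for s in SHARES:
        if s.id == share_id and s.active:
            s.active = False
            return True
    return False
```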

API Endpoints

Core EH Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/eh/upload | Upload encrypted EH |
| GET | /api/v1/eh | List user's EH |
| GET | /api/v1/eh/{id} | Get single EH |
| DELETE | /api/v1/eh/{id} | Soft delete EH |
| POST | /api/v1/eh/{id}/index | Index EH for RAG |
| POST | /api/v1/eh/rag-query | Query EH content |
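Since DELETE is described as a soft delete, the list endpoint has to hide deleted records rather than rely on them being gone. A minimal sketch of that pairing, assuming a hypothetical `deleted_at` field and dictionary-based records:

```python
from datetime import datetime, timezone


def soft_delete(records: dict[str, dict], eh_id: str, tenant_id: str) -> bool:
    """Mark an EH as deleted; returns False if absent, foreign, or already deleted."""
    rec = records.get(eh_id)
    if rec is None or rec["tenant_id"] != tenant_id or rec.get("deleted_at"):
        return False
    rec["deleted_at"] = datetime.now(timezone.utc)  # hypothetical field name
    return True


def list_eh(records: dict[str, dict], tenant_id: str) -> list[dict]:
    """Return the tenant's EH records that have not been soft-deleted."""
    return [r for r in records.values()
            if r["tenant_id"] == tenant_id and not r.get("deleted_at")]
```

The tenant check in `soft_delete` mirrors the isolation rule: a request for another tenant's ID behaves the same as a request for an unknown ID.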

Key Sharing Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/eh/{id}/share | Share EH with examiner |
| GET | /api/v1/eh/{id}/shares | List shares (owner) |
| DELETE | /api/v1/eh/{id}/shares/{shareId} | Revoke share |
| GET | /api/v1/eh/shared-with-me | List EH shared with user |

Klausur Integration Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/eh/{id}/link-klausur | Link EH to Klausur |
| DELETE | /api/v1/eh/{id}/link-klausur/{klausurId} | Unlink EH |
| GET | /api/v1/klausuren/{id}/linked-eh | Get linked EH for Klausur |

Audit & Admin Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/eh/audit-log | Get audit log |
| GET | /api/v1/eh/rights-text | Get rights confirmation text |
| GET | /api/v1/eh/qdrant-status | Get Qdrant status (admin) |

Frontend Components

EHUploadWizard

5-step wizard for uploading Erwartungshorizonte:

  1. File Selection - Choose PDF file
  2. Metadata - Title, Subject, Niveau, Year
  3. Rights Confirmation - Legal acknowledgment
  4. Encryption - Set passphrase (2x confirmation)
  5. Summary - Review and upload

Integration Points

  • KorrekturPage: Shows EH prompt after first student upload
  • GutachtenGeneration: Uses RAG context from linked EH
  • Sidebar Badge: Shows linked EH count

File Structure

klausur-service/
├── backend/
│   ├── main.py              # API endpoints + data structures
│   ├── qdrant_service.py    # Vector database operations
│   ├── eh_pipeline.py       # Chunking, embedding, encryption
│   └── requirements.txt     # Python dependencies
├── frontend/
│   └── src/
│       ├── components/
│       │   └── EHUploadWizard.tsx
│       ├── services/
│       │   ├── api.ts       # API client
│       │   └── encryption.ts # Client-side crypto
│       ├── pages/
│       │   └── KorrekturPage.tsx # EH integration
│       └── styles/
│           └── eh-wizard.css
└── docs/
    ├── BYOEH-Architecture.md
    └── BYOEH-Developer-Guide.md

Configuration

Environment Variables

QDRANT_URL=http://qdrant:6333
OPENAI_API_KEY=sk-...              # For embeddings
BYOEH_ENCRYPTION_ENABLED=true
EH_UPLOAD_DIR=/app/eh-uploads
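One way to read these variables at startup is a small typed settings object. This is a sketch: `ByoehSettings` and `load_settings` are hypothetical names, and the fallback defaults simply reuse the example values above rather than reflecting confirmed service defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ByoehSettings:
    qdrant_url: str
    openai_api_key: Optional[str]
    encryption_enabled: bool
    upload_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> ByoehSettings:
    """Read BYOEH configuration from an environment-like mapping."""
    flag = env.get("BYOEH_ENCRYPTION_ENABLED", "true").strip().lower()
    return ByoehSettings(
        qdrant_url=env.get("QDRANT_URL", "http://qdrant:6333"),
        openai_api_key=env.get("OPENAI_API_KEY"),  # None if embeddings are unconfigured
        encryption_enabled=flag in {"1", "true", "yes"},
        upload_dir=env.get("EH_UPLOAD_DIR", "/app/eh-uploads"),
    )
```

Passing the mapping as a parameter keeps the loader testable without mutating the process environment.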

Docker Services

# docker-compose.yml
services:
  qdrant:
    image: qdrant/qdrant:v1.7.4
    ports:
      - "6333:6333"
    volumes:
      - qdrant_data:/qdrant/storage

# Named volume referenced by the qdrant service above
volumes:
  qdrant_data:

Audit Events

| Action | Description |
|---|---|
| upload | EH uploaded |
| index | EH indexed for RAG |
| rag_query | RAG query executed |
| delete | EH soft deleted |
| share | EH shared with examiner |
| revoke_share | Share revoked |
| link_klausur | EH linked to Klausur |
| unlink_klausur | EH unlinked from Klausur |
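The action names listed here map naturally onto an enumeration, which rejects misspelled actions at write time. A minimal sketch, assuming hypothetical names `AuditAction`, `AuditEntry`, `AUDIT_LOG`, and `log_event`, with an in-memory list in place of the `eh_audit_log` table:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class AuditAction(str, Enum):
    UPLOAD = "upload"
    INDEX = "index"
    RAG_QUERY = "rag_query"
    DELETE = "delete"
    SHARE = "share"
    REVOKE_SHARE = "revoke_share"
    LINK_KLAUSUR = "link_klausur"
    UNLINK_KLAUSUR = "unlink_klausur"


@dataclass(frozen=True)
class AuditEntry:
    tenant_id: str
    user_id: str
    action: AuditAction
    eh_id: str
    # Timestamps are recorded in UTC at creation time.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


AUDIT_LOG: list[AuditEntry] = []


def log_event(tenant_id: str, user_id: str, action: str, eh_id: str) -> AuditEntry:
    """Append one audit entry; an unknown action name raises ValueError."""
    entry = AuditEntry(tenant_id, user_id, AuditAction(action), eh_id)
    AUDIT_LOG.append(entry)
    return entry
```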

See Also