BreakPilot Compliance SDK - Legal Corpus
Pre-indexed legal documents for the RAG system.
EU Regulations
| Document | Chunks | Description |
|---|---|---|
| DSGVO (GDPR) | ~99 | EU General Data Protection Regulation |
| AI Act | ~85 | EU Artificial Intelligence Act |
| NIS2 | ~46 | Network and Information Security Directive |
| ePrivacy | ~32 | ePrivacy Directive (Cookie Directive) |
| CRA | ~41 | Cyber Resilience Act |
| EUCSA | ~28 | EU Cybersecurity Act |
| Data Act | ~35 | EU Data Act |
| DGA | ~25 | Data Governance Act |
| DSA | ~52 | Digital Services Act |
| DMA | ~38 | Digital Markets Act |
| EAA | ~22 | European Accessibility Act |
| SCC | ~18 | Standard Contractual Clauses |
| DPF | ~15 | EU-US Data Privacy Framework |
German Regulations
| Document | Chunks | Description |
|---|---|---|
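| TDDDG | ~28 | Telekommunikation-Digitale-Dienste-Datenschutz-Gesetz (Telecommunications and Digital Services Data Protection Act) |
| TTDSG | ~24 | Telekommunikation-Telemedien-Datenschutz-Gesetz (Telecommunications and Telemedia Data Protection Act) |
| BDSG | ~45 | Bundesdatenschutzgesetz (Federal Data Protection Act) |
| IT-SiG | ~32 | IT-Sicherheitsgesetz (IT Security Act) |
| BSI-KritisV | ~28 | BSI-Kritisverordnung (BSI Critical Infrastructure Ordinance) |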
Directory Structure
legal-corpus/
├── eu/
│   ├── dsgvo/
│   │   ├── articles/
│   │   ├── recitals/
│   │   └── metadata.json
│   ├── ai-act/
│   ├── nis2/
│   ├── eprivacy/
│   ├── cra/
│   ├── eucsa/
│   ├── data-act/
│   ├── dga/
│   ├── dsa/
│   ├── dma/
│   ├── eaa/
│   ├── scc/
│   └── dpf/
├── de/
│   ├── tdddg/
│   ├── ttdsg/
│   ├── bdsg/
│   ├── it-sig/
│   └── bsi-kritisv/
└── embeddings/
    └── (generated vector embeddings)
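To check which documents are present in a local copy of the corpus, the layout above can be walked with a few lines of Python. This is a minimal sketch: the function name `list_corpus_documents` is hypothetical, and it only assumes the `eu/` and `de/` folders shown in the tree. It does not read `metadata.json`, whose schema is not described here.

```python
from pathlib import Path


def list_corpus_documents(corpus_root: str) -> dict[str, list[str]]:
    """Map each jurisdiction folder (eu, de) to its sorted document folders."""
    root = Path(corpus_root)
    documents: dict[str, list[str]] = {}
    for jurisdiction in ("eu", "de"):
        folder = root / jurisdiction
        # Skip jurisdictions that are missing from this copy of the corpus.
        if folder.is_dir():
            documents[jurisdiction] = sorted(
                entry.name for entry in folder.iterdir() if entry.is_dir()
            )
    return documents
```

Calling `list_corpus_documents("legal-corpus")` on a complete corpus should return the thirteen EU folders under `"eu"` and the five German folders under `"de"`.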
Indexing
Documents are automatically indexed on first startup of the RAG service.
To manually re-index:
# Via CLI
breakpilot-cli index --all
# Via API
POST /api/v1/rag/index
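The API call can also be issued from a script. The sketch below builds the request with Python's standard library. The base URL is an assumption (this README does not state the service address), the request body is assumed to be empty, and any authentication headers are not covered.

```python
import urllib.request

# Assumption: address of the RAG service; adjust to your deployment.
BASE_URL = "http://localhost:8080"


def build_reindex_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that triggers a re-index."""
    url = base_url.rstrip("/") + "/api/v1/rag/index"
    # An empty body is assumed; no request payload is documented here.
    return urllib.request.Request(url, data=b"", method="POST")


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

With a running service, `send(build_reindex_request())` triggers the re-index. Keeping construction and sending separate makes the request easy to inspect without a live server.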
Adding Custom Documents
Organizations can add their own internal documents:
# Upload via CLI
breakpilot-cli upload --file policy.pdf --category internal
# Via API
POST /api/v1/rag/documents
Content-Type: multipart/form-data
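For the API route, the upload has to be encoded as `multipart/form-data`. The following sketch assembles such a request with the standard library only. The form field names `file` and `category` are assumptions that mirror the CLI flags, and the base URL is a placeholder; check the API reference for the actual field names.

```python
import urllib.request
import uuid
from pathlib import Path


def build_upload_request(
    base_url: str, file_path: str, category: str
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data upload request."""
    path = Path(file_path)
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    crlf = b"\r\n"

    body = b"".join([
        # Text part: the document category (field name is an assumption).
        delimiter + crlf,
        b'Content-Disposition: form-data; name="category"' + crlf + crlf,
        category.encode("utf-8") + crlf,
        # File part: raw bytes of the document (field name is an assumption).
        delimiter + crlf,
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"'
        .encode("utf-8") + crlf,
        b"Content-Type: application/octet-stream" + crlf + crlf,
        path.read_bytes() + crlf,
        # Closing delimiter ends the multipart body.
        delimiter + b"--" + crlf,
    ])

    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/rag/documents", data=body, method="POST"
    )
    request.add_header(
        "Content-Type", f"multipart/form-data; boundary={boundary}"
    )
    return request
```

Passing the result to `urllib.request.urlopen` sends the upload, for example `build_upload_request("http://localhost:8080", "policy.pdf", "internal")`.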
Embedding Model
Default: bge-m3 via Ollama
Supports:
- German legal terminology
- Multilingual text (DE/EN)
- High-quality semantic search
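To illustrate what semantic search over embeddings involves, the sketch below requests a `bge-m3` vector from a local Ollama instance and ranks chunk vectors by cosine similarity. It is not the SDK's retrieval code: the helper names are hypothetical, the Ollama address is its default (`http://localhost:11434`), and cosine similarity is an assumed scoring choice.

```python
import json
import math
import urllib.request

# Ollama's default listen address; adjust if your instance runs elsewhere.
OLLAMA_URL = "http://localhost:11434"


def embed(text: str, model: str = "bge-m3", base_url: str = OLLAMA_URL) -> list[float]:
    """Request one embedding vector from Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["embeddings"][0]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(query_vector: list[float], chunk_vectors: list[list[float]]) -> list[int]:
    """Return chunk indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vector, vector) for vector in chunk_vectors]
    return sorted(range(len(chunk_vectors)), key=lambda i: scores[i], reverse=True)
```

With Ollama running and the model pulled, a German query such as `embed("Recht auf Löschung")` can be ranked against chunk vectors from English or German texts, which is where the multilingual model matters.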