# BreakPilot Compliance SDK - Legal Corpus

Pre-indexed legal documents for the RAG system.

## EU Regulations

| Document | Chunks | Description |
|----------|--------|-------------|
| DSGVO (GDPR) | ~99 | EU General Data Protection Regulation |
| AI Act | ~85 | EU Artificial Intelligence Act |
| NIS2 | ~46 | Network and Information Security Directive |
| ePrivacy | ~32 | ePrivacy Directive (Cookie Directive) |
| CRA | ~41 | Cyber Resilience Act |
| EUCSA | ~28 | EU Cybersecurity Act |
| Data Act | ~35 | EU Data Act |
| DGA | ~25 | Data Governance Act |
| DSA | ~52 | Digital Services Act |
| DMA | ~38 | Digital Markets Act |
| EAA | ~22 | European Accessibility Act |
| SCC | ~18 | Standard Contractual Clauses |
| DPF | ~15 | EU-US Data Privacy Framework |

## German Regulations

| Document | Chunks | Description |
|----------|--------|-------------|
| TDDDG | ~28 | Telekommunikation-Digitale-Dienste-Datenschutz-Gesetz |
| TTDSG | ~24 | Telekommunikation-Telemedien-Datenschutz-Gesetz |
| BDSG | ~45 | Bundesdatenschutzgesetz |
| IT-SiG | ~32 | IT-Sicherheitsgesetz |
| BSI-KritisV | ~28 | BSI-Kritisverordnung |

## Directory Structure

```
legal-corpus/
├── eu/
│   ├── dsgvo/
│   │   ├── articles/
│   │   ├── recitals/
│   │   └── metadata.json
│   ├── ai-act/
│   ├── nis2/
│   ├── eprivacy/
│   ├── cra/
│   ├── eucsa/
│   ├── data-act/
│   ├── dga/
│   ├── dsa/
│   ├── dma/
│   ├── eaa/
│   ├── scc/
│   └── dpf/
├── de/
│   ├── tdddg/
│   ├── ttdsg/
│   ├── bdsg/
│   ├── it-sig/
│   └── bsi-kritisv/
└── embeddings/
    └── (generated vector embeddings)
```

## Indexing

Documents are automatically indexed on first startup of the RAG service.
To manually re-index:

```bash
# Via CLI
breakpilot-cli index --all

# Via API
POST /api/v1/rag/index
```

## Adding Custom Documents

Organizations can add their own internal documents:

```bash
# Upload via CLI
breakpilot-cli upload --file policy.pdf --category internal

# Via API
POST /api/v1/rag/documents
Content-Type: multipart/form-data
```

## Embedding Model

Default: `bge-m3` via Ollama

Supports:
- German legal terminology
- Multi-lingual (DE/EN)
- High-quality semantic search
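
To illustrate what semantic search over `bge-m3` embeddings can look like, here is a minimal Python sketch. It is not the BreakPilot implementation: it assumes an Ollama server on its default local port with the `bge-m3` model pulled, and the helper names (`embed`, `cosine_similarity`, `rank_chunks`) and the chunk identifiers in the usage note are hypothetical.

```python
import json
import math
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def embed(text, model="bge-m3", url=OLLAMA_URL):
    """Request a single embedding vector for `text` from Ollama."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["embedding"]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vector, chunk_vectors):
    """Return (chunk_id, score) pairs sorted from most to least similar.

    `chunk_vectors` maps a hypothetical chunk id to its embedding vector.
    """
    scored = [
        (chunk_id, cosine_similarity(query_vector, vector))
        for chunk_id, vector in chunk_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With a running Ollama instance, a query could be ranked against a few chunks by embedding each text with `embed(...)`, collecting the results in a dict such as `{"dsgvo-art-6": embed(chunk_text)}`, and passing that dict to `rank_chunks(embed("Rechtsgrundlage der Verarbeitung"), chunk_vectors)`. The ranking step is kept separate from the network call so it can be exercised with precomputed vectors.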