# BreakPilot Compliance SDK - Legal Corpus

Pre-indexed legal documents for the RAG system.

## EU Regulations

| Document | Chunks | Description |
|----------|--------|-------------|
| DSGVO (GDPR) | ~99 | EU General Data Protection Regulation |
| AI Act | ~85 | EU Artificial Intelligence Act |
| NIS2 | ~46 | Network and Information Security Directive |
| ePrivacy | ~32 | ePrivacy Directive (Cookie Directive) |
| CRA | ~41 | Cyber Resilience Act |
| EUCSA | ~28 | EU Cybersecurity Act |
| Data Act | ~35 | EU Data Act |
| DGA | ~25 | Data Governance Act |
| DSA | ~52 | Digital Services Act |
| DMA | ~38 | Digital Markets Act |
| EAA | ~22 | European Accessibility Act |
| SCC | ~18 | Standard Contractual Clauses |
| DPF | ~15 | EU-US Data Privacy Framework |

## German Regulations

| Document | Chunks | Description |
|----------|--------|-------------|
| TDDDG | ~28 | Telekommunikation-Digitale-Dienste-Datenschutz-Gesetz (Telecommunications and Digital Services Data Protection Act) |
| TTDSG | ~24 | Telekommunikation-Telemedien-Datenschutz-Gesetz (Telecommunications and Telemedia Data Protection Act, predecessor of the TDDDG) |
| BDSG | ~45 | Bundesdatenschutzgesetz (Federal Data Protection Act) |
| IT-SiG | ~32 | IT-Sicherheitsgesetz (IT Security Act) |
| BSI-KritisV | ~28 | BSI-Kritisverordnung (BSI Critical Infrastructure Ordinance) |

## Directory Structure

```
legal-corpus/
├── eu/
│   ├── dsgvo/
│   │   ├── articles/
│   │   ├── recitals/
│   │   └── metadata.json
│   ├── ai-act/
│   ├── nis2/
│   ├── eprivacy/
│   ├── cra/
│   ├── eucsa/
│   ├── data-act/
│   ├── dga/
│   ├── dsa/
│   ├── dma/
│   ├── eaa/
│   ├── scc/
│   └── dpf/
├── de/
│   ├── tdddg/
│   ├── ttdsg/
│   ├── bdsg/
│   ├── it-sig/
│   └── bsi-kritisv/
└── embeddings/
    └── (generated vector embeddings)
```
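
To check which regulations are present in a local copy of the corpus, the layout above can be walked with a few lines of Python. This is a minimal sketch, not part of the SDK: the function name `list_regulations` is hypothetical, and it assumes only the `eu/` and `de/` folders shown in the tree, treating `embeddings/` as generated output.

```python
from pathlib import Path

# Jurisdiction folders as shown in the tree; "embeddings/" is generated
# output and is deliberately not scanned.
JURISDICTIONS = ("eu", "de")


def list_regulations(corpus_root: Path) -> dict[str, list[str]]:
    """Map each jurisdiction folder to the sorted names of its regulation folders."""
    result: dict[str, list[str]] = {}
    for jurisdiction in JURISDICTIONS:
        base = corpus_root / jurisdiction
        if not base.is_dir():
            # A missing jurisdiction folder is reported as empty, not as an error.
            result[jurisdiction] = []
            continue
        # Only directories count as regulations; loose files are ignored.
        result[jurisdiction] = sorted(p.name for p in base.iterdir() if p.is_dir())
    return result
```

Calling `list_regulations(Path("legal-corpus"))` returns a dictionary with one list of folder names per jurisdiction, which can be compared against the tables above.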

## Indexing

Documents are automatically indexed on first startup of the RAG service.

To manually re-index:

```bash
# Via CLI
breakpilot-cli index --all

# Via API
POST /api/v1/rag/index
```
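
The API call can also be issued from a script. The sketch below uses only the Python standard library and the endpoint path given above. The base URL, the empty request body, and the absence of authentication are all assumptions, and the helper names are hypothetical.

```python
import urllib.request

INDEX_PATH = "/api/v1/rag/index"  # endpoint path taken from this README


def build_reindex_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that triggers re-indexing."""
    url = base_url.rstrip("/") + INDEX_PATH
    # Assumption: the endpoint accepts a POST without a body or auth header.
    return urllib.request.Request(url, method="POST")


def trigger_reindex(base_url: str, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    request = build_reindex_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `trigger_reindex("http://localhost:8000")` would send the request to a service on that address. The address is a placeholder; use wherever the RAG service is actually reachable.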

## Adding Custom Documents

Organizations can add their own internal documents:

```bash
# Upload via CLI
breakpilot-cli upload --file policy.pdf --category internal

# Via API
POST /api/v1/rag/documents
Content-Type: multipart/form-data
```
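
A `multipart/form-data` upload to the endpoint above might be assembled as follows with the Python standard library. Treat this as a sketch under stated assumptions: the form field names `file` and `category` simply mirror the CLI flags and are not confirmed by this README, and the helper names and base URL are hypothetical.

```python
import urllib.request
import uuid

DOCUMENTS_PATH = "/api/v1/rag/documents"  # endpoint path taken from this README


def encode_multipart(fields: dict[str, str], filename: str, content: bytes,
                     file_field: str = "file") -> tuple[bytes, str]:
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The file part carries the raw bytes after its own headers.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream'
        f"\r\n\r\n".encode() + content + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(base_url: str, filename: str, content: bytes,
                         category: str) -> urllib.request.Request:
    """Build (but do not send) the upload request."""
    # Assumption: the API expects form fields named "file" and "category".
    body, content_type = encode_multipart({"category": category}, filename, content)
    return urllib.request.Request(
        base_url.rstrip("/") + DOCUMENTS_PATH,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

The returned request can be sent with `urllib.request.urlopen(...)`. Building the request separately makes it easy to inspect the body before anything is uploaded.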

## Embedding Model

Default: `bge-m3` via Ollama

Supports:

- German legal terminology
- Multi-lingual (DE/EN)
- High-quality semantic search
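
To illustrate what semantic search over embeddings involves, the sketch below requests a vector for a piece of text from a local Ollama instance and compares two vectors by cosine similarity. It assumes Ollama is running on its default address with `bge-m3` pulled. How the RAG service itself calls the model or scores results is not documented here, so the function names and the similarity measure are illustrative only.

```python
import json
import math
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address (assumed)


def embed(text: str, model: str = "bge-m3") -> list[float]:
    """Request one embedding vector from Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode()
    request = urllib.request.Request(
        OLLAMA_URL + "/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        # The response holds a list of vectors, one per input.
        return json.load(response)["embeddings"][0]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

With the model available, `cosine_similarity(embed("Einwilligung für Cookies"), embed("cookie consent"))` gives a rough sense of how closely a German and an English phrase are placed in the embedding space.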