breakpilot-compliance/backend-compliance/requirements.txt
Benjamin Admin c52dbdb8f1
feat(rag): optimize RAG pipeline — JSON-Mode, CoT, Hybrid Search, Re-Ranking, Cross-Reg Dedup, chunk 1024
Phase 1 (LLM Quality):
- Add format=json to all Ollama payloads (obligation_extractor, control_generator, citation_backfill)
- Add Chain-of-Thought analysis steps to Pass 0a/0b system prompts
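
As an illustration of the Phase 1 items, here is a minimal Python sketch of a JSON-mode request body for Ollama's /api/generate endpoint, combined with a step-wise system prompt. The prompt wording, the function names build_payload and parse_response, and the output keys are hypothetical; the actual prompts in obligation_extractor, control_generator, and citation_backfill are not shown in this commit message.

```python
import json

# Hypothetical system prompt: the numbered analysis steps illustrate the
# Chain-of-Thought idea and are not the prompts used in the commit.
SYSTEM_PROMPT = (
    "You extract obligations from legal text.\n"
    "Work through these steps before answering:\n"
    "1. Identify who is addressed by the passage.\n"
    "2. Identify the required action.\n"
    "3. Identify conditions or deadlines.\n"
    'Return one JSON object with the keys "analysis" and "obligations".'
)


def build_payload(model: str, text: str) -> dict:
    """Build a request body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "system": SYSTEM_PROMPT,
        "prompt": text,
        "format": "json",  # ask Ollama to constrain the output to valid JSON
        "stream": False,   # return one response object instead of a stream
    }


def parse_response(body: dict) -> dict:
    """Decode the model output; Ollama returns it as a string in 'response'."""
    return json.loads(body["response"])
```

Keeping the analysis steps inside the JSON object (here under a hypothetical "analysis" key) lets the model reason first while the caller still receives parseable output.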

Phase 2 (Retrieval Quality):
- Hybrid search via Qdrant Query API with RRF fusion + automatic text index (legal_rag.go)
- Fallback to dense-only search if Query API unavailable
- Cross-encoder re-ranking with BGE Reranker v2 (RERANK_ENABLED=false by default)
- CPU-only PyTorch dependency to keep Docker image small
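
To illustrate the Phase 2 items: Reciprocal Rank Fusion (RRF) scores each document by summing 1 / (k + rank) over the ranked lists it appears in. Qdrant's Query API performs this fusion server-side; the sketch below only shows the arithmetic in Python, with k = 60 as a commonly used constant (an assumption, not taken from legal_rag.go). The maybe_rerank helper, its score callback, and the way the flag is parsed are hypothetical and show one way a RERANK_ENABLED setting could gate a cross-encoder.

```python
import os
from typing import Callable, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


def rerank_enabled() -> bool:
    """Hypothetical flag parsing: off unless RERANK_ENABLED is 'true'."""
    return os.environ.get("RERANK_ENABLED", "false").lower() == "true"


def maybe_rerank(
    query: str,
    docs: Sequence[str],
    score: Callable[[str, str], float],
) -> list[str]:
    """Re-order docs by a relevance score only when the flag is enabled.

    `score` stands in for a cross-encoder that rates a (query, doc) pair.
    """
    if not rerank_enabled():
        return list(docs)
    return sorted(docs, key=lambda d: score(query, d), reverse=True)
```

With a dense ranking ["a", "b", "c"] and a text ranking ["b", "d", "a"], documents that appear in both lists ("a" and "b") end up ahead of those that appear in only one.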

Phase 3 (Data Layer):
- Cross-regulation dedup pass (threshold 0.95) links controls across regulations
- DedupResult.link_type field distinguishes dedup_merge vs cross_regulation
- Chunk size defaults updated 512/50 → 1024/128 for new ingestions only
- Existing collections and controls are NOT affected
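
The Phase 3 items can be sketched as follows, under stated assumptions. The first function splits a token list into windows of 1024 with an overlap of 128; whether the commit counts tokens or characters is not stated, so tokens are assumed here. The second part shows one possible reading of the dedup pass: a cosine similarity of at least 0.95 links two controls, and the link type depends on whether they belong to the same regulation. LinkDecision, decide_link, and the tuple layout of existing are hypothetical and do not reproduce the repository's DedupResult.

```python
import math
from dataclasses import dataclass
from typing import Optional


def chunk(tokens: list[str], size: int = 1024, overlap: int = 128) -> list[list[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the current window already reaches the end
    return chunks


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class LinkDecision:
    matched_id: Optional[str]
    similarity: float
    link_type: Optional[str]  # "dedup_merge", "cross_regulation", or None


def decide_link(
    regulation: str,
    vector: list[float],
    existing: list[tuple[str, str, list[float]]],  # (control_id, regulation, vector)
    threshold: float = 0.95,
) -> LinkDecision:
    """Find the most similar existing control and classify the link."""
    best_id, best_reg, best_sim = None, None, 0.0
    for control_id, reg, vec in existing:
        sim = cosine(vector, vec)
        if sim > best_sim:
            best_id, best_reg, best_sim = control_id, reg, sim
    if best_id is None or best_sim < threshold:
        return LinkDecision(None, best_sim, None)
    # Assumed rule: same regulation merges, different regulation links.
    link_type = "dedup_merge" if best_reg == regulation else "cross_regulation"
    return LinkDecision(best_id, best_sim, link_type)
```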

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 11:49:43 +01:00

# BreakPilot Compliance Backend Dependencies
# Web Framework
fastapi==0.123.9
uvicorn==0.38.0
starlette==0.49.3
# HTTP Client (consent-service proxy, DSR proxy)
httpx==0.28.1
requests==2.32.5
# Validation & Types
pydantic==2.12.5
pydantic_core==2.41.5
email-validator==2.3.0
annotated-types==0.7.0
# Authentication
PyJWT==2.10.1
python-multipart>=0.0.22
# AI / Anthropic (compliance AI assistant)
anthropic==0.75.0
# Re-Ranking (cross-encoder, CPU-only PyTorch to keep image small)
--extra-index-url https://download.pytorch.org/whl/cpu
torch
sentence-transformers>=3.0.0
# PDF Generation (GDPR export, audit reports)
weasyprint>=68.0
reportlab==4.2.5
Jinja2==3.1.6
# Document Processing (Word import for consent admin)
mammoth==1.11.0
Markdown==3.9
# PDF Text Extraction (document import analysis)
PyMuPDF==1.25.3
# Utilities
python-dateutil==2.9.0.post0
# Database
asyncpg==0.30.0
SQLAlchemy==2.0.36
psycopg2-binary==2.9.10
# Cache (Valkey/Redis - rate limiter middleware)
redis==5.2.1
# Security: Pin transitive dependencies to patched versions
idna>=3.7
cryptography>=42.0.0
pillow>=12.1.1