feat(rag): optimize RAG pipeline — JSON-Mode, CoT, Hybrid Search, Re-Ranking, Cross-Reg Dedup, chunk 1024
Some checks failed
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 42s
CI/CD / test-python-backend-compliance (push) Successful in 1m38s
CI/CD / test-python-document-crawler (push) Successful in 20s
CI/CD / test-python-dsms-gateway (push) Successful in 17s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Has been skipped
Phase 1 (LLM Quality):
- Add format=json to all Ollama payloads (obligation_extractor, control_generator, citation_backfill)
- Add Chain-of-Thought analysis steps to Pass 0a/0b system prompts

Phase 2 (Retrieval Quality):
- Hybrid search via Qdrant Query API with RRF fusion + automatic text index (legal_rag.go)
- Fallback to dense-only search if Query API unavailable
- Cross-encoder re-ranking with BGE Reranker v2 (RERANK_ENABLED=false by default)
- CPU-only PyTorch dependency to keep Docker image small

Phase 3 (Data Layer):
- Cross-regulation dedup pass (threshold 0.95) links controls across regulations
- DedupResult.link_type field distinguishes dedup_merge vs cross_regulation
- Chunk size defaults updated 512/50 → 1024/128 for new ingestions only
- Existing collections and controls are NOT affected

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
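For readers unfamiliar with the RRF fusion mentioned under Phase 2, the following is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not the code from legal_rag.go, where the fusion is delegated to the Qdrant Query API; the function name `rrf_fuse`, the constant `k = 60`, and the sample document IDs are assumptions chosen for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score.
    The value k = 60 is a commonly used default, assumed here.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Break score ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense (vector) and a sparse (text) search.
dense_hits = ["a", "b", "c"]
sparse_hits = ["c", "a", "d"]
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

Documents that appear in both lists ("a" and "c") accumulate two contributions and therefore rank ahead of documents found by only one retriever, which is the property hybrid search relies on.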
@@ -100,6 +100,40 @@ class ComplianceRAGClient:
             logger.warning("RAG search failed: %s", e)
             return []
 
+    async def search_with_rerank(
+        self,
+        query: str,
+        collection: str = "bp_compliance_ce",
+        regulations: Optional[List[str]] = None,
+        top_k: int = 5,
+    ) -> List[RAGSearchResult]:
+        """
+        Search with optional cross-encoder re-ranking.
+
+        Fetches top_k*4 results from RAG, then re-ranks with cross-encoder
+        and returns top_k. Falls back to regular search if reranker is disabled.
+        """
+        from .reranker import get_reranker
+
+        reranker = get_reranker()
+        if reranker is None:
+            return await self.search(query, collection, regulations, top_k)
+
+        # Fetch more candidates for re-ranking
+        candidates = await self.search(
+            query, collection, regulations, top_k=max(top_k * 4, 20)
+        )
+        if not candidates:
+            return []
+
+        texts = [c.text for c in candidates]
+        try:
+            ranked_indices = reranker.rerank(query, texts, top_k=top_k)
+            return [candidates[i] for i in ranked_indices]
+        except Exception as e:
+            logger.warning("Reranking failed, returning unranked: %s", e)
+            return candidates[:top_k]
+
     async def scroll(
         self,
         collection: str,
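The `.reranker` module is not part of this hunk, so the contract of `reranker.rerank` can only be inferred from the call site: it takes the query, a list of candidate texts, and `top_k`, and its return value is used as indices into the candidate list. The sketch below illustrates that inferred contract with a hypothetical `StubReranker` and a toy `token_overlap` scorer. Both names are assumptions, and the token-overlap score merely stands in for a real cross-encoder such as the BGE Reranker v2 named in the commit message.

```python
from typing import Callable, List, Sequence


def token_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens that occur in the text.

    This is a placeholder for a cross-encoder score, not a real model.
    """
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


class StubReranker:
    """Hypothetical reranker exposing the interface assumed from the call site."""

    def __init__(self, score_fn: Callable[[str, str], float]) -> None:
        self._score_fn = score_fn

    def rerank(self, query: str, texts: Sequence[str], top_k: int = 5) -> List[int]:
        """Return indices into `texts`, best match first, at most `top_k` long."""
        scores = [self._score_fn(query, text) for text in texts]
        order = sorted(range(len(texts)), key=lambda i: scores[i], reverse=True)
        return order[:top_k]


query = "retention periods for personal data"
texts = [
    "Machine guards must be inspected annually.",
    "Personal data retention periods must be documented.",
    "Retention of logs is optional.",
]
ranked = StubReranker(token_overlap).rerank(query, texts, top_k=2)
print([texts[i] for i in ranked])
```

Returning indices rather than texts lets the caller map the ranking back onto its own result objects, which is how `search_with_rerank` recovers the `RAGSearchResult` entries via `candidates[i]`.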