feat/authority-rerank
5 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
49147d9497 |
feat(ai-sdk): authority-aware re-ranking for legal RAG retrieval (Phase 1)
CI / detect-changes (pull_request) Successful in 16s
CI / branch-name (pull_request) Successful in 2s
CI / guardrail-integrity (pull_request) Successful in 5s
CI / secret-scan (pull_request) Successful in 6s
CI / dep-audit (pull_request) Failing after 1m1s
CI / sbom-scan (pull_request) Failing after 1m4s
CI / build-sha-integrity (pull_request) Successful in 14s
CI / validate-canonical-controls (pull_request) Successful in 13s
CI / test-go (pull_request) Successful in 1m2s
CI / loc-budget (pull_request) Successful in 24s
CI / go-lint (pull_request) Failing after 20s
CI / python-lint (pull_request) Failing after 23s
CI / nodejs-lint (pull_request) Failing after 1m10s
CI / nodejs-build (pull_request) Successful in 3m26s
CI / iace-gt-coverage (pull_request) Successful in 16s
CI / test-python-backend (pull_request) Successful in 27s
CI / test-python-document-crawler (pull_request) Successful in 13s
CI / test-python-dsms-gateway (pull_request) Successful in 9s
Re-orders /sdk/v1/rag/search results so binding law from the matching jurisdiction and domain ranks above guidance, foreign and off-domain law — without dropping anything (guidance stays as interpretation context). Internal-only: response schema is unchanged (json:"-" fields), so every consumer benefits without a contract change. - authority.go: classifyAuthority / queryDomain / chunkDomain / scopeClass / topic ontology. Tagged payload (authority_weight/source_class/jurisdiction) wins; deterministic fallback via category + name markers for the untagged corpus. - authority_rerank.go: rerankByAuthority. final = semantic + authority + jurisdiction + domain + scope + topic; the authority score is written back to Score so the multi-collection advisor merge preserves the order. - legal_rag_client: stratified retrieval — the binding-law pool AUGMENTS the semantic pool (mergeDedupHits), then re-rank. - legal_rag_http: searchBinding (source_class filter) + shared doPointsSearch. - table-driven tests for authority/domain/scope/topic + rerank acceptance + a stratified-binding integration test. go test -race green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
c52dbdb8f1 |
feat(rag): optimize RAG pipeline — JSON-Mode, CoT, Hybrid Search, Re-Ranking, Cross-Reg Dedup, chunk 1024
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 42s
CI/CD / test-python-backend-compliance (push) Successful in 1m38s
CI/CD / test-python-document-crawler (push) Successful in 20s
CI/CD / test-python-dsms-gateway (push) Successful in 17s
CI/CD / validate-canonical-controls (push) Successful in 10s
CI/CD / Deploy (push) Has been skipped
Phase 1 (LLM Quality): - Add format=json to all Ollama payloads (obligation_extractor, control_generator, citation_backfill) - Add Chain-of-Thought analysis steps to Pass 0a/0b system prompts Phase 2 (Retrieval Quality): - Hybrid search via Qdrant Query API with RRF fusion + automatic text index (legal_rag.go) - Fallback to dense-only search if Query API unavailable - Cross-encoder re-ranking with BGE Reranker v2 (RERANK_ENABLED=false by default) - CPU-only PyTorch dependency to keep Docker image small Phase 3 (Data Layer): - Cross-regulation dedup pass (threshold 0.95) links controls across regulations - DedupResult.link_type field distinguishes dedup_merge vs cross_regulation - Chunk size defaults updated 512/50 → 1024/128 for new ingestions only - Existing collections and controls are NOT affected Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
49ce417428 |
feat: add compliance modules 2-5 (dashboard, security templates, process manager, evidence collector)
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 32s
CI/CD / test-python-backend-compliance (push) Successful in 34s
CI/CD / test-python-document-crawler (push) Successful in 23s
CI/CD / test-python-dsms-gateway (push) Successful in 21s
CI/CD / validate-canonical-controls (push) Successful in 11s
CI/CD / Deploy (push) Successful in 2s
Module 2: Extended Compliance Dashboard with roadmap, module-status, next-actions, snapshots, score-history Module 3: 7 German security document templates (IT-Sicherheitskonzept, Datenschutz, Backup, Logging, Incident-Response, Zugriff, Risikomanagement) Module 4: Compliance Process Manager with CRUD, complete/skip/seed, ~50 seed tasks, 3-tab UI Module 5: Evidence Collector Extended with automated checks, control-mapping, coverage report, 4-tab UI Also includes: canonical control library enhancements (verification method, categories, dedup), control generator improvements, RAG client extensions 52 tests pass, frontend builds clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
4d2f4f2d24 |
feat(qdrant): Migrate to hosted qdrant-dev.breakpilot.ai with API-Key auth
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-ai-compliance (push) Successful in 37s
CI / test-python-backend-compliance (push) Successful in 32s
CI / test-python-document-crawler (push) Successful in 22s
CI / test-python-dsms-gateway (push) Successful in 18s
- LegalRAGClient: QDRANT_HOST+PORT → QDRANT_URL + QDRANT_API_KEY - docker-compose: env vars updated for hosted Qdrant - AllowedCollections: added bp_compliance_gdpr, bp_dsfa_templates, bp_dsfa_risks - Migration scripts (bash + python) for data transfer Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
14a99322eb |
feat: Phase 2 — RAG integration in Requirements + DSFA Draft
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-ai-compliance (push) Successful in 35s
CI / test-python-backend-compliance (push) Successful in 26s
CI / test-python-document-crawler (push) Successful in 22s
CI / test-python-dsms-gateway (push) Successful in 19s
Add legal context enrichment from Qdrant vector corpus to the two highest-priority modules (Requirements AI assistant and DSFA drafting engine). Go SDK: - Add SearchCollection() with collection override + whitelist validation - Refactor Search() to delegate to shared searchInternal() Python backend: - New ComplianceRAGClient proxying POST /sdk/v1/rag/search (error-tolerant) - AI assistant: enrich interpret_requirement() and suggest_controls() with RAG - Requirements API: add ?include_legal_context=true query parameter Admin (Next.js): - Extract shared queryRAG() utility from chat route - Inject RAG legal context into v1 and v2 draft pipelines Tests for all three layers (Go, Python, TypeScript shared utility). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |