Commit Graph

4 Commits

Author SHA1 Message Date
Benjamin Admin 93099b2770 feat(pipeline): structural metadata end-to-end (Blocks D2-D4)
D2: RAG service stores section/section_title/paragraph/paragraph_num/page
from embedding service chunks_with_metadata into Qdrant payloads.
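
As a rough illustration of D2, the sketch below copies the five structural fields named above from a chunk into a payload dict. The chunk shape (`text` plus a `metadata` dict), the `build_payload` helper, and the `document_id` key are assumptions made for this example, not the RAG service's actual code.

```python
from typing import Any

# Field names come from the commit message; everything else here is hypothetical.
STRUCTURAL_FIELDS = ("section", "section_title", "paragraph", "paragraph_num", "page")


def build_payload(chunk: dict[str, Any], base: dict[str, Any]) -> dict[str, Any]:
    """Merge a chunk's structural metadata into a copy of the base payload."""
    payload = dict(base)
    payload["text"] = chunk["text"]
    metadata = chunk.get("metadata") or {}
    for field in STRUCTURAL_FIELDS:
        value = metadata.get(field)
        if value is not None:  # skip missing fields instead of storing nulls
            payload[field] = value
    return payload
```

Skipping `None` values keeps payloads small and lets downstream code treat "key absent" as "no metadata", which is a design choice of this sketch.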

D3: Control generator prefers section > article > section_title from
Qdrant, adds page to source_citation and generation_metadata.
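
The preference order in D3 can be sketched as a simple fallback chain. The function names and the shape of the citation dict are assumptions for illustration; only the order section > article > section_title and the addition of page are taken from the message above.

```python
from typing import Any, Optional


def pick_section_label(payload: dict[str, Any]) -> Optional[str]:
    """Return the first non-empty value among section, article, section_title."""
    for key in ("section", "article", "section_title"):
        value = payload.get(key)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return None


def build_source_citation(payload: dict[str, Any]) -> dict[str, Any]:
    """Build a hypothetical citation dict; page is added only when present."""
    citation: dict[str, Any] = {"section": pick_section_label(payload)}
    if payload.get("page") is not None:
        citation["page"] = payload["page"]
    return citation
```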

D4: Validated with real BGB §§ 312-312k text. Found and fixed a critical
bug where Phase 3 overlap destroyed the [§ ...] section prefix, causing
only the first chunk per document to have metadata. All subsequent
chunks lost section info.
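
One way to keep the prefix intact when adding overlap is to split it off, apply the overlap to the body only, and re-attach it. This is a minimal sketch under assumptions: the exact prefix format (`[§ ...]` at the start of a chunk), the character-based overlap, and the helper names are hypothetical and not the embedding service's implementation.

```python
import re

# Assumed prefix format: "[§ <anything except ']'>]" at the very start of a chunk.
PREFIX_RE = re.compile(r"^\[§ [^\]]+\]\s*")


def split_prefix(chunk: str) -> tuple[str, str]:
    """Split a chunk into (section prefix, body); the prefix may be empty."""
    match = PREFIX_RE.match(chunk)
    if match is None:
        return "", chunk
    return match.group(0), chunk[match.end():]


def add_overlap(chunks: list[str], overlap_chars: int) -> list[str]:
    """Prepend the tail of the previous body to each body, after the prefix."""
    result = []
    prev_body = ""
    for chunk in chunks:
        prefix, body = split_prefix(chunk)
        # Guard against overlap_chars == 0, where [-0:] would copy everything.
        tail = prev_body[-overlap_chars:] if overlap_chars > 0 and prev_body else ""
        joined = f"{tail} {body}" if tail else body
        result.append(prefix + joined)  # the prefix always stays at position 0
        prev_body = body
    return result
```

With this ordering, a parser that looks for the prefix at the start of each chunk still finds it on every chunk, not just the first.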

Also fixes pre-existing lint issues (unused imports, ambiguous variable
names, duplicate dict key, bare except).

456 tests passing (58 embedding + 387 pipeline + 11 rag-service).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 20:34:00 +02:00
Benjamin Admin 92ca5b7ba5 feat(rag): use Ollama for embeddings instead of embedding-service
Switch to Ollama's bge-m3 model (1024-dim) for generating embeddings,
solving the dimension mismatch with Qdrant collections. The
embedding-service is still used for chunking, reranking, and PDF extraction.
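
A minimal sketch of requesting bge-m3 embeddings from Ollama and checking the 1024-dimension expectation before writing to Qdrant. It uses Ollama's `/api/embed` endpoint; the base URL (Ollama's default port on localhost), the helper names, and the dimension check itself are assumptions for this example, not the RAG service's client.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed host; 11434 is Ollama's default port
EXPECTED_DIM = 1024  # bge-m3 dimension stated in the commit message


def build_embed_request(texts, model="bge-m3", base_url=OLLAMA_URL):
    """Build the POST request for Ollama's /api/embed endpoint."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(response_body, expected_dim=EXPECTED_DIM):
    """Extract vectors and fail early on a dimension mismatch."""
    vectors = json.loads(response_body)["embeddings"]
    for vector in vectors:
        if len(vector) != expected_dim:
            raise ValueError(f"expected {expected_dim} dims, got {len(vector)}")
    return vectors


def embed(texts):
    """Send the request; requires a running Ollama with bge-m3 pulled."""
    with urllib.request.urlopen(build_embed_request(texts), timeout=60) as resp:
        return parse_embeddings(resp.read())
```

Checking the vector length on the client side surfaces a model or collection mismatch as a clear error instead of a failed upsert later.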

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 07:46:57 +01:00
Benjamin Admin 13ba1457b0 Fix embedding client endpoint paths
The embedding-service exposes endpoints at root level (/chunk, /embed,
/extract-pdf, /rerank), not under /api/v1/. Fix the RAG service's
embedding client to use the correct paths.
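
The path fix can be pictured as a small endpoint table joined to the service's base URL. The four paths are the ones listed above; the `ENDPOINTS` mapping, the `endpoint_url` helper, and the host and port in the usage note are hypothetical.

```python
# Root-level paths as listed in the commit message (no /api/v1/ prefix).
ENDPOINTS = {
    "chunk": "/chunk",
    "embed": "/embed",
    "extract_pdf": "/extract-pdf",
    "rerank": "/rerank",
}


def endpoint_url(base_url: str, name: str) -> str:
    """Join the base URL and a known endpoint path without doubling slashes."""
    return base_url.rstrip("/") + ENDPOINTS[name]
```

For example, `endpoint_url("http://embedding-service:8087/", "extract_pdf")` yields `http://embedding-service:8087/extract-pdf`, where the host and port are placeholders.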

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 23:24:47 +01:00
Benjamin Boenisch ad111d5e69 Initial commit: breakpilot-core - Shared Infrastructure
Docker Compose with 24+ services:
- PostgreSQL (PostGIS), Valkey, MinIO, Qdrant
- Vault (PKI/TLS), Nginx (Reverse Proxy)
- Backend Core API, Consent Service, Billing Service
- RAG Service, Embedding Service
- Gitea, Woodpecker CI/CD
- Night Scheduler, Health Aggregator
- Jitsi (Web/XMPP/JVB/Jicofo), Mailpit

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 23:47:13 +01:00