breakpilot-compliance/.gitea/workflows/rag-ingest.yaml
Benjamin Admin c3654bc9ea
fix(ci): Spawn ingestion container on breakpilot-network
Instead of trying to connect the runner to breakpilot-network,
spawn a new alpine container directly on it via docker run.
Added debug output for network/container visibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 17:53:06 +01:00

# Gitea Actions: RAG Legal Corpus Ingestion
#
# Manually triggered workflow for ingesting legal texts into Qdrant.
# Trigger: Gitea UI → Actions → "RAG Ingestion" → Run
#
# Phases: gesetze, eu, templates, datenschutz, verbraucherschutz, verify, version, all
#
# Prerequisite: the RAG service and Qdrant must be running on Hetzner.
name: RAG Ingestion

on:
  workflow_dispatch:
    inputs:
      phase:
        description: 'Ingestion phase (gesetze, eu, templates, datenschutz, verbraucherschutz, verify, version, all)'
        required: true
        default: 'verbraucherschutz'

jobs:
  ingest:
    runs-on: docker
    container: docker:27-cli
    steps:
      - name: Setup
        run: |
          apk add --no-cache git curl bash python3 > /dev/null 2>&1

      - name: Checkout
        run: |
          git clone --depth 1 --branch main "${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git" .

      - name: Run Ingestion
        run: |
          set -euo pipefail
          PHASE="${{ github.event.inputs.phase }}"
          echo "=== RAG Ingestion: Phase ${PHASE} ==="
          echo ""

          # Debug: show Docker networks and running containers
          echo "--- Docker Networks ---"
          docker network ls 2>/dev/null || echo "docker network ls failed"
          echo ""
          echo "--- RAG-related containers ---"
          docker ps --filter name=rag --format "{{.Names}} {{.Status}}" 2>/dev/null || true
          docker ps --filter name=bp-core --format "{{.Names}} {{.Status}}" 2>/dev/null || true
          echo ""

          # Run the ingestion in a container attached to breakpilot-network.
          # The runner has Docker socket access and can spawn containers.
          docker run --rm \
            --network breakpilot-network \
            -v "$(pwd)/scripts:/workspace/scripts:ro" \
            -e "WORK_DIR=/tmp/rag-ingestion" \
            -e "RAG_URL=http://bp-core-rag-service:8097/api/v1/documents/upload" \
            -e "QDRANT_URL=https://qdrant-dev.breakpilot.ai" \
            -e "SDK_URL=http://bp-compliance-ai-sdk:8090" \
            alpine:3.19 \
            sh -c "
              apk add --no-cache curl bash coreutils > /dev/null 2>&1
              # BusyBox sh has no brace expansion, so each directory is listed explicitly
              mkdir -p /tmp/rag-ingestion/pdfs /tmp/rag-ingestion/repos /tmp/rag-ingestion/texts
              cd /workspace
              if [ '$PHASE' = 'all' ]; then
                bash scripts/ingest-legal-corpus.sh
              else
                bash scripts/ingest-legal-corpus.sh --only '$PHASE'
              fi
            "

          echo ""
          echo "=== Ingestion complete ==="
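#
# --- Sketch only, not part of the committed workflow ---
# The "Run Ingestion" step above pastes the phase input directly into the
# script text. A possible variant, assuming the runner honours step-level
# `env:` the way GitHub Actions does, hands the value over as an environment
# variable and forwards it into the container with `-e PHASE` (Docker copies
# the value from the calling environment when no `=value` is given). The
# input is then only ever read as data. The step name is hypothetical; all
# flags, URLs and paths are copied from the step above.
#
#      - name: Run Ingestion (env-based variant)
#        env:
#          PHASE: ${{ github.event.inputs.phase }}
#        run: |
#          set -euo pipefail
#          docker run --rm \
#            --network breakpilot-network \
#            -v "$(pwd)/scripts:/workspace/scripts:ro" \
#            -e PHASE \
#            -e "WORK_DIR=/tmp/rag-ingestion" \
#            -e "RAG_URL=http://bp-core-rag-service:8097/api/v1/documents/upload" \
#            -e "QDRANT_URL=https://qdrant-dev.breakpilot.ai" \
#            -e "SDK_URL=http://bp-compliance-ai-sdk:8090" \
#            alpine:3.19 \
#            sh -c '
#              apk add --no-cache curl bash coreutils > /dev/null 2>&1
#              mkdir -p /tmp/rag-ingestion/pdfs /tmp/rag-ingestion/repos /tmp/rag-ingestion/texts
#              cd /workspace
#              # PHASE is expanded by the inner shell, from the container environment
#              if [ "$PHASE" = "all" ]; then
#                bash scripts/ingest-legal-corpus.sh
#              else
#                bash scripts/ingest-legal-corpus.sh --only "$PHASE"
#              fi
#            '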