Compare commits
22 Commits
| SHA1 |
|---|
| 79ad95e244 |
| a6f1020b2c |
| e50892a2aa |
| 9cfe6f83b1 |
| df7966656a |
| 05d75e8039 |
| e24a551ee4 |
| f11b2e035f |
| 230dc05287 |
| b83c3e6e00 |
| a1f425d43a |
| 23c6ac6f32 |
| d82f86fc95 |
| a4d1105b3c |
| 067118b12d |
| b9c00574b1 |
| 5ff08a240b |
| 3e3644f83d |
| e809d0bc1c |
| 869e7aeb1e |
| 33085c61b4 |
| 38a347a82a |
+3
-2
@@ -130,10 +130,11 @@ rsync -avz --exclude node_modules --exclude .next --exclude .git \

**breakpilot-core MUST be running!** This project uses core services:
- Valkey (session cache)
- Vault (secrets)
- RAG service (vector search for compliance documents)
- Nginx (reverse proxy)

Secrets live in Infisical (`secrets.meghsakha.com`); the project link is in `.infisical.json`. Start locally with `infisical run --env=dev -- docker compose up` (or `make dev`). `.env`/`.env.local` are no longer used.

**External services (production):**
- PostgreSQL 17 (sslmode=require), schemas: `compliance`, `public`
- Qdrant @ `qdrant-dev.breakpilot.ai` (HTTPS, API key)
@@ -316,7 +317,7 @@ ssh macmini "/usr/local/bin/docker compose -f /Users/benjaminadmin/Projekte/brea

### 5. Sensitive files
**NEVER modify or commit:**
- `.env`, `.env.local`, Vault tokens, SSL certificates
- `.env`, `.env.local`, Infisical tokens, SSL certificates
- `*.pdf`, `*.docx`, compiled binaries, large media

---

@@ -92,7 +92,7 @@ Wenn Hochrisiko:

- [ ] **In transit:** TLS 1.3 for all connections
- [ ] **At rest:** database encryption
- [ ] **Secrets:** Vault for credentials
- [ ] **Secrets:** Infisical (`secrets.meghsakha.com`) for credentials

### Access controls

+20
-18
@@ -43,7 +43,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git bash
|
||||
git clone --depth 200 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 200 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
if [ "${GITHUB_EVENT_NAME}" = "pull_request" ]; then
|
||||
git fetch --depth 200 origin "${GITHUB_BASE_REF}" || true
|
||||
else
|
||||
@@ -87,7 +87,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git bash
|
||||
git clone --depth 20 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 20 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git fetch origin ${GITHUB_BASE_REF}:base
|
||||
- name: Require [guardrail-change] in commits touching guardrails
|
||||
run: |
|
||||
@@ -108,7 +108,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git bash
|
||||
git clone --depth 50 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 50 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Enforce 500-line hard cap
|
||||
run: |
|
||||
chmod +x scripts/check-loc.sh
|
||||
@@ -123,7 +123,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git
|
||||
git clone --depth 50 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 50 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Scan for secrets
|
||||
run: |
|
||||
gitleaks detect --source . --no-git \
|
||||
@@ -136,12 +136,14 @@ jobs:
|
||||
runs-on: docker
|
||||
needs: detect-changes
|
||||
if: github.event_name == 'pull_request' && needs.detect-changes.outputs.sdk == 'true'
|
||||
container: golangci/golangci-lint:v1.62-alpine
|
||||
container: golangci/golangci-lint:v1.64.8-alpine
|
||||
steps:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
# Full clone so `main` is a local ref — new-from-merge-base needs the merge base.
|
||||
git clone ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git checkout ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}}
|
||||
- name: Lint ai-compliance-sdk
|
||||
run: |
|
||||
[ -d "ai-compliance-sdk" ] || exit 0
|
||||
@@ -162,7 +164,7 @@ jobs:
|
||||
steps:
|
||||
- name: Checkout
|
||||
run: |
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Lint (ruff) + type-check (mypy)
|
||||
run: |
|
||||
pip install --quiet ruff mypy
|
||||
@@ -193,7 +195,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Lint + type-check
|
||||
run: |
|
||||
fail=0
|
||||
@@ -215,7 +217,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Build Next.js services
|
||||
run: |
|
||||
fail=0
|
||||
@@ -239,7 +241,7 @@ jobs:
|
||||
steps:
|
||||
- name: Checkout
|
||||
run: |
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Install Node.js + Go
|
||||
run: |
|
||||
curl -fsSL https://deb.nodesource.com/setup_20.x | bash - > /dev/null 2>&1
|
||||
@@ -282,7 +284,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git curl bash
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Install syft + grype
|
||||
run: |
|
||||
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin
|
||||
@@ -304,7 +306,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Test ai-compliance-sdk
|
||||
run: |
|
||||
[ -d "ai-compliance-sdk" ] || exit 0
|
||||
@@ -324,7 +326,7 @@ jobs:
|
||||
steps:
|
||||
- name: Checkout
|
||||
run: |
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: GT-Bremse measure-coverage report
|
||||
run: |
|
||||
python3 scripts/gt_measure_gap_analysis.py --json /tmp/gt_gap_report.json > /tmp/gt_gap_report.md
|
||||
@@ -355,7 +357,7 @@ jobs:
|
||||
steps:
|
||||
- name: Checkout
|
||||
run: |
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Test backend-compliance
|
||||
run: |
|
||||
[ -d "backend-compliance" ] || exit 0
|
||||
@@ -375,7 +377,7 @@ jobs:
|
||||
steps:
|
||||
- name: Checkout
|
||||
run: |
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Test document-crawler
|
||||
run: |
|
||||
[ -d "document-crawler" ] || exit 0
|
||||
@@ -395,7 +397,7 @@ jobs:
|
||||
steps:
|
||||
- name: Checkout
|
||||
run: |
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Test dsms-gateway
|
||||
run: |
|
||||
[ -d "dsms-gateway" ] || exit 0
|
||||
@@ -417,7 +419,7 @@ jobs:
|
||||
- name: Checkout
|
||||
run: |
|
||||
apk add --no-cache git python3 py3-yaml
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Validate every Dockerfile + compose block declares BUILD_SHA
|
||||
run: |
|
||||
python3 - <<'PY'
|
||||
@@ -456,6 +458,6 @@ jobs:
|
||||
steps:
|
||||
- name: Checkout
|
||||
run: |
|
||||
git clone --depth 1 --branch ${GITHUB_REF_NAME} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
git clone --depth 1 --branch ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}} ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}.git .
|
||||
- name: Validate controls
|
||||
run: python scripts/validate-controls.py
|
||||
|
||||
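Every checkout step in this workflow now clones `${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}}` instead of `${GITHUB_REF_NAME}`. The following is a minimal Python sketch of that shell fallback, assuming the runner leaves `GITHUB_HEAD_REF` unset or empty outside pull-request events. The function name `resolve_checkout_ref` and the sample branch names are illustrative and not part of the workflow.

```python
def resolve_checkout_ref(env: dict) -> str:
    """Mimic ${GITHUB_HEAD_REF:-${GITHUB_REF_NAME}}.

    The shell `:-` operator falls back when the variable is unset OR empty,
    so the sketch checks truthiness rather than key presence.
    """
    head_ref = env.get("GITHUB_HEAD_REF", "")
    if head_ref:
        return head_ref
    return env.get("GITHUB_REF_NAME", "")


# Pull request: the PR source branch is used (sample values are hypothetical).
pr_env = {"GITHUB_HEAD_REF": "feature/infisical", "GITHUB_REF_NAME": "12/merge"}
print(resolve_checkout_ref(pr_env))

# Push: GITHUB_HEAD_REF is empty, so the pushed branch name is used.
push_env = {"GITHUB_HEAD_REF": "", "GITHUB_REF_NAME": "main"}
print(resolve_checkout_ref(push_env))
```

With the old expression, a pull-request run would have tried to clone whatever `GITHUB_REF_NAME` held for that event rather than the PR's source branch.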
@@ -74,7 +74,7 @@ jobs:
|
||||
-e "WORK_DIR=/tmp/rag-ingestion" \
|
||||
-e "RAG_URL=http://bp-core-rag-service:8097/api/v1/documents/upload" \
|
||||
-e "QDRANT_URL=https://qdrant-dev.breakpilot.ai" \
|
||||
-e "QDRANT_API_KEY=z9cKbT74vl1aKPD1QGIlKWfET47VH93u" \
|
||||
-e "QDRANT_API_KEY=${{ secrets.QDRANT_API_KEY }}" \
|
||||
-e "SDK_URL=http://bp-compliance-ai-sdk:8090" \
|
||||
alpine:3.19 \
|
||||
sh -c "
|
||||
|
||||
-13
@@ -48,16 +48,3 @@ backups/*.backup
*.wav
ai-compliance-sdk/server
*.bak

# Build/test artifacts (2026-06-21 cleanup)
docs-site/
ux-screenshots/
**/test-results/
**/audit-reports/
admin-compliance/e2e/reports/
admin-compliance/e2e/e2e/
design/redesign/*-preview.png
admin-compliance/BreakPilot-Pitch-Submission.html
admin-compliance/shot-ds.mjs
admin-compliance/ux-shots.mjs
Neuer Ordner mit Objekten/

@@ -0,0 +1,21 @@
# gitleaks configuration.
# Keeps gitleaks' default ruleset and adds an allowlist for known FALSE POSITIVES
# that surfaced once the CI checkout was fixed (secret-scan had never actually run
# on a PR before). Real leaked credentials are removed in code, NOT allowlisted.

[extend]
useDefault = true

[allowlist]
description = "Documentation curl examples, env templates, and non-secret identifiers"
paths = [
    # API reference pages: curl examples with placeholder tokens, not real secrets
    '''developer-portal/app/api/.*''',
    '''developer-portal/app/development/.*''',
    # Template env file: placeholder dev values (e.g. breakpilot123)
    '''\.env\.example$''',
    # Seed data: "rule_key" identifiers, not credentials
    '''backend-compliance/compliance/data/template_rule_seed_data\.py$''',
    # SDK deploy template: MINIO placeholder password
    '''breakpilot-compliance-sdk/packages/cli/src/commands/deploy\.ts$''',
]
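To see which files the allowlist would exempt, the path patterns can be tried against sample paths. This is a rough sketch only: it assumes the patterns are applied as unanchored regex searches against the repository-relative path, and it uses Python's `re` in place of gitleaks' own regex engine. The helper `is_allowlisted` and the sample paths are hypothetical.

```python
import re

# Patterns copied from the `paths` list in the gitleaks allowlist.
ALLOWLIST_PATHS = [
    r"developer-portal/app/api/.*",
    r"developer-portal/app/development/.*",
    r"\.env\.example$",
    r"backend-compliance/compliance/data/template_rule_seed_data\.py$",
    r"breakpilot-compliance-sdk/packages/cli/src/commands/deploy\.ts$",
]


def is_allowlisted(path: str) -> bool:
    """Return True if any allowlist pattern matches somewhere in the path."""
    return any(re.search(pattern, path) for pattern in ALLOWLIST_PATHS)


for sample in [
    "developer-portal/app/api/auth/page.tsx",  # under an allowlisted directory
    "admin-compliance/.env.example",           # matches \.env\.example$ in any directory
    ".env",                                    # not allowlisted, still scanned
    ".env.example.bak",                        # the $ anchor excludes suffixed copies
]:
    print(sample, is_allowlisted(sample))
```

Under this reading, `\.env\.example$` is not tied to the repository root, so an `.env.example` in any subdirectory would be exempt as well.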
@@ -0,0 +1,5 @@
{
  "workspaceId": "996bda36-9e01-4071-ae8d-69a9f9ff5a23",
  "defaultEnvironment": "",
  "gitBranchToEnvironmentMapping": null
}
@@ -0,0 +1,157 @@
# Infisical Setup for Local Development

This is the per-developer onboarding for accessing the `breakpilot-compliance` secrets while developing locally. Once this is done, **everything you launch through `make dev` (or `infisical run …`) gets the dev secrets injected as environment variables**, including any Claude Code session that spawns those commands.

Secrets live in the self-hosted Infisical instance at **`secrets.meghsakha.com`**. The project link is committed in `.infisical.json`, so you don't need to know the project ID.

---

## 1. Install the Infisical CLI

**macOS (recommended):**

```bash
brew install infisical/get-cli/infisical
```

**Other platforms / manual install:**

See <https://infisical.com/docs/cli/overview>. Verify with:

```bash
infisical --version
# infisical version 0.43.x (or newer)
```

---

## 2. Log in to the self-hosted instance

```bash
infisical login --domain https://secrets.meghsakha.com
```

This opens a browser for SSO. The login is persisted to your OS keychain, so you only do this once per machine.

Sanity check:

```bash
cd ~/projects/breakpilot-compliance   # wherever you cloned the repo
infisical --domain https://secrets.meghsakha.com secrets --env=dev
```

You should see a table of secret names + values. If you get an auth error, re-run `infisical login`.

---

## 3. Verify the project link

The repo already contains `.infisical.json` pointing at the `breakpilot-compliance` project:

```bash
cat .infisical.json
# { "workspaceId": "996bda36-9e01-4071-ae8d-69a9f9ff5a23", ... }
```

If the file is missing (rare; only if you reset the repo), recreate it:

```bash
infisical init --domain https://secrets.meghsakha.com
```

Pick the `breakpilot-compliance` project from the picker.

---

## 4. Launch the stack

```bash
make dev
```

This runs `infisical --domain https://secrets.meghsakha.com run --env=dev -- docker compose up`. Every service in the compose stack sees its secrets as normal env vars; no `.env` file ever touches disk.

Other targets:

| Target | What it does |
|--------|--------------|
| `make dev-build` | Same as `make dev` but rebuilds images first |
| `make dev-down` | Stop the stack (no secrets needed) |
| `make dev-logs` | Tail logs |
| `make dev-ps` | List running containers |
| `make secrets` | Print all secrets in `dev` (read-only) |
| `make secrets-set KEY=FOO VALUE=bar` | Add or update a secret in `dev` |

To target a different environment:

```bash
make dev ENV=staging
make secrets ENV=prod
```

---

## 5. Using secrets from Claude Code

When Claude Code runs commands in this repo via its Bash tool, the commands inherit your shell's environment. Two patterns:

**Pattern A: let Claude launch the stack normally**

Claude just runs `make dev`. The Infisical CLI inside that command resolves secrets at run time and passes them to docker compose. Claude doesn't see plaintext secrets in its context, but the running services do.

**Pattern B: let Claude run a one-off script with secrets**

If Claude needs to execute a Python/Go script that requires secrets, wrap the command:

```bash
infisical run --env=dev -- python scripts/some_one_off.py
```

This works for any subprocess: pytest, alembic, go run, npm scripts. If Claude proposes a command that reads env vars and would run it raw, ask it to wrap the command in `infisical run --env=dev --` first.
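A one-off script can also check up front that it was started through `infisical run`. Below is a minimal sketch, assuming the three secret names listed under "What the dev env contains"; the helper names `missing_secrets` and `require_secrets` are hypothetical and not part of the repo.

```python
import os
import sys

# Names taken from the "What the dev env contains" list in this document.
REQUIRED_SECRETS = (
    "BREAKPILOT_DB_PASSWORD",
    "BREAKPILOT_QDRANT_API_KEY",
    "LITELLM_API_KEY",
)


def missing_secrets(env, required=REQUIRED_SECRETS):
    """Return the required names that are unset or empty (values are never returned)."""
    return [name for name in required if not env.get(name)]


def require_secrets(env=None):
    """Exit with a hint if the process was not started through `infisical run`."""
    env = os.environ if env is None else env
    missing = missing_secrets(env)
    if missing:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(
            "Missing secrets: " + ", ".join(missing)
            + ". Re-run via: infisical run --env=dev -- python <script>"
        )
```

Calling `require_secrets()` at the top of a script makes a raw `python scripts/...` invocation stop with a hint. Only variable names are reported, never values.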

**What Claude should not do:**

- `infisical export --env=dev > .env`: this defeats the whole point, even though `.gitignore` would still try to keep the file out.
- `infisical secrets get KEY --env=dev --raw` followed by pasting the value into a code edit: secrets must stay out of the repo.

If you want Claude to never accidentally dump secrets, add this to your `.claude/settings.json` permissions (project-level or user-level):

```json
{
  "permissions": {
    "deny": [
      "Bash(infisical export*)",
      "Bash(infisical secrets get*)"
    ]
  }
}
```

---

## Troubleshooting

| Symptom | Fix |
|---------|-----|
| `please either run infisical init or pass --projectId` | `.infisical.json` is missing or unreadable; re-run `infisical init` |
| `unauthorized` / `please log in` | Re-run `infisical login --domain https://secrets.meghsakha.com` |
| `make dev` says secret is empty | Check the name in `make secrets` matches what docker-compose expects, then update the service config or rename the secret in Infisical |
| Browser SSO doesn't open | Use `infisical login --domain https://secrets.meghsakha.com --method=user` and paste the URL manually |

---

## What the dev env contains

Run `make secrets` to see the live list. As of this writing the dev env includes (at minimum):

- `BREAKPILOT_DB_PASSWORD`
- `BREAKPILOT_QDRANT_API_KEY`
- `LITELLM_API_KEY`

Every other variable in `.env.example` either has a sane default in `docker-compose.yml` or needs to be added to Infisical. To add one:

```bash
make secrets-set KEY=ANTHROPIC_API_KEY VALUE=sk-ant-xxxx
```

Or via the web UI: <https://secrets.meghsakha.com>.
@@ -0,0 +1,57 @@
# breakpilot-compliance: developer workflow
#
# Secrets are managed in Infisical (secrets.meghsakha.com). The project
# link lives in .infisical.json. To get started:
#   1) infisical login --domain https://secrets.meghsakha.com   (once per machine)
#   2) make dev
#
# .env / .env.local are NOT used in this repo anymore. Anything that needs
# secrets MUST be launched through `infisical run` so the values come from
# the secrets store instead of disk.

INFISICAL ?= infisical
INFISICAL_DOMAIN ?= https://secrets.meghsakha.com
ENV ?= dev

INFISICAL_RUN := $(INFISICAL) --domain $(INFISICAL_DOMAIN) run --env=$(ENV) --
INFISICAL_SECRETS := $(INFISICAL) --domain $(INFISICAL_DOMAIN) secrets --env=$(ENV)

.PHONY: help dev dev-build dev-down dev-logs dev-ps secrets secrets-set check-loc

help:
	@echo "Targets:"
	@echo "  dev           Start the full compose stack with secrets injected from Infisical"
	@echo "  dev-build     Same as dev, but force a rebuild first"
	@echo "  dev-down      Stop the compose stack (no secrets needed)"
	@echo "  dev-logs      Tail logs from all services"
	@echo "  dev-ps        Show running containers"
	@echo "  secrets       List all secrets in the current env ($(ENV))"
	@echo "  secrets-set   Set a secret (KEY=... VALUE=...)"
	@echo "  check-loc     Run the 500-line LOC guard"

dev:
	$(INFISICAL_RUN) docker compose up

dev-build:
	$(INFISICAL_RUN) docker compose up --build

dev-down:
	docker compose down

dev-logs:
	docker compose logs -f

dev-ps:
	docker compose ps

secrets:
	$(INFISICAL_SECRETS)

secrets-set:
	@if [ -z "$(KEY)" ] || [ -z "$(VALUE)" ]; then \
		echo "Usage: make secrets-set KEY=MY_KEY VALUE=my_value"; exit 1; \
	fi
	$(INFISICAL) --domain $(INFISICAL_DOMAIN) secrets set $(KEY)=$(VALUE) --env=$(ENV)

check-loc:
	bash scripts/check-loc.sh
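The way the Makefile composes its commands from `INFISICAL`, `INFISICAL_DOMAIN` and `ENV` can be mirrored in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the defaults simply repeat the `?=` values from the Makefile.

```python
import shlex

DEFAULT_DOMAIN = "https://secrets.meghsakha.com"


def infisical_run_argv(command, env="dev", domain=DEFAULT_DOMAIN, infisical="infisical"):
    """Mirror $(INFISICAL_RUN) followed by the wrapped command, as an argv list."""
    return [infisical, "--domain", domain, "run", f"--env={env}", "--", *command]


def infisical_secrets_set_argv(key, value, env="dev", domain=DEFAULT_DOMAIN, infisical="infisical"):
    """Mirror the `secrets-set` recipe; KEY=VALUE stays a single argument."""
    return [infisical, "--domain", domain, "secrets", "set", f"{key}={value}", f"--env={env}"]


# Equivalent of `make dev`:
print(shlex.join(infisical_run_argv(["docker", "compose", "up"])))

# Equivalent of `make dev-build ENV=staging`:
print(shlex.join(infisical_run_argv(["docker", "compose", "up", "--build"], env="staging")))
```

Building an argv list keeps a value such as `two words` inside one `KEY=VALUE` argument. In the `secrets-set` recipe, `$(KEY)=$(VALUE)` is unquoted, so the shell would split a value that contains spaces.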
@@ -42,23 +42,26 @@ All containers share the external `breakpilot-network` Docker network and depend

## Quick Start

**Prerequisites:** Docker, Go 1.24+, Python 3.12+, Node.js 20+
**Prerequisites:** Docker, Go 1.24+, Python 3.12+, Node.js 20+, [Infisical CLI](https://infisical.com/docs/cli/overview)

```bash
git clone ssh://git@gitea.meghsakha.com:22222/Benjamin_Boenisch/breakpilot-compliance.git
cd breakpilot-compliance

# Copy and populate secrets (never commit .env)
cp .env.example .env
# One-time per machine: log in to the self-hosted Infisical instance
infisical login --domain https://secrets.meghsakha.com

# Start all services
docker compose up -d
# Start the full stack with secrets injected from Infisical (env=dev)
make dev
```

Secrets are pulled from Infisical (`secrets.meghsakha.com`) at runtime; `.env` files are not used. See [INFISICAL_SETUP.md](./INFISICAL_SETUP.md) for full onboarding, and `make help` for the rest of the targets (`dev-build`, `dev-down`, `secrets`, `secrets-set`).

For the Orca/Hetzner production target (x86_64), use the override:

```bash
docker compose -f docker-compose.yml -f docker-compose.hetzner.yml up -d
make dev ENV=prod   # or:
infisical run --env=prod -- docker compose -f docker-compose.yml -f docker-compose.hetzner.yml up -d
```

---

@@ -35,6 +35,25 @@ Dies ist ein **Legal RAG**. Eine falsch zitierte Fundstelle ist schlimmer als ga
- **Internal IDs** (control IDs such as SEC-xxxx, MC-/M numbers) do NOT belong in the user-facing answer
  as the main statement. State the obligation in plain language, with an ID at most appended in parentheses.

## Corpus authority & currency: the context beats your memory (CRITICAL)
Laws change after your training cutoff. The provided RAG/controls context reflects
the CURRENT legal position; your training knowledge may be outdated. This rule applies to
FACTS, not only to citations (it supplements **Quellentreue**, i.e. source fidelity).
- Legal **facts** (thresholds, deadlines, figures, whether or from when an obligation applies,
  responsibilities) are taken EXCLUSIVELY from the provided context. Your training knowledge
  serves only for language, structure, and inference, **never as a legal source**.
- If a requested fact is NOT in the context: do NOT output a number, deadline, or threshold
  recalled from memory, not even in passing in running text without a citation. Say openly that
  you cannot substantiate it from your verified sources, name the obligation/topic in general terms,
  and offer the next step (look it up specifically / verify with the DPO or a lawyer).
- **Conflict transparency**: If the context deviates from what seems "familiar" to you, the context
  ALWAYS wins. Feel free to make this transparent, e.g. "The current source states 20; a possibly
  older, familiar assumption (10) does not apply here."
- **Co-pilot tone, no robotic refusal**: phrase it as "From my verified sources I cannot substantiate
  X; I can look it up specifically, or you can clarify it with your DPO/lawyer"
  instead of a hard "no". You remain a helpful companion but do not hand the user an
  unverified legal statement as fact.

## Area of competence
- GDPR (DSGVO) Art. 1-99 + recitals
- BDSG (German Federal Data Protection Act)

@@ -80,7 +80,7 @@ export async function POST(request: NextRequest) {
|
||||
let systemContent = soulPrompt || FALLBACK_SYSTEM_PROMPT
|
||||
if (validCountry) systemContent += countryBlock(validCountry)
|
||||
if (ragContext) {
|
||||
systemContent += `\n\n## Relevanter Kontext aus dem RAG-System\n\nNutze die folgenden Quellen fuer deine Antwort. Verweise in deiner Antwort auf die jeweilige Quelle:\n\n${ragContext}`
|
||||
systemContent += `\n\n## Relevanter Kontext aus dem RAG-System (deine EINZIGEN Rechtsquellen)\n\nDies sind deine einzigen zulaessigen Rechtsquellen. Triff keine konkrete Rechtsaussage (Zahl, Frist, Schwelle, Pflicht, Fundstelle), die nicht hier oder im Controls-Block belegt ist — sonst sage offen, dass du sie aus deinen Quellen nicht belegen kannst. Verweise in deiner Antwort auf die jeweilige Quelle:\n\n${ragContext}`
|
||||
}
|
||||
if (controlsContext) systemContent += `\n\n${controlsContext}`
|
||||
systemContent += `\n\n## Aktueller SDK-Schritt\nDer Nutzer befindet sich im SDK-Schritt: ${currentStep}`
|
||||
|
||||
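The order in which the handler assembles the system prompt can be followed in a small Python rendering of the lines in this hunk. It is a sketch, not the route's code: `build_system_content` is a hypothetical name, `RAG_RULE` is a shortened stand-in for the full German instruction, and the step name `dsfa` in the usage lines is invented for illustration.

```python
RAG_HEADING = "## Relevanter Kontext aus dem RAG-System (deine EINZIGEN Rechtsquellen)"
# Shortened stand-in for the full instruction text in the template literal.
RAG_RULE = "<only make legal statements that are backed by the sources below>"


def build_system_content(base_prompt, rag_context="", controls_context="",
                         current_step="", country_block=""):
    """Concatenate the prompt parts in the same order as the handler."""
    content = base_prompt + country_block
    if rag_context:
        # The hardened heading and rule are added only when RAG context exists.
        content += f"\n\n{RAG_HEADING}\n\n{RAG_RULE}\n\n{rag_context}"
    if controls_context:
        content += f"\n\n{controls_context}"
    content += (
        "\n\n## Aktueller SDK-Schritt\n"
        f"Der Nutzer befindet sich im SDK-Schritt: {current_step}"
    )
    return content


with_rag = build_system_content("SOUL", rag_context="[1] Art. 5 DSGVO",
                                controls_context="CONTROLS", current_step="dsfa")
without_rag = build_system_content("SOUL", current_step="dsfa")
print(RAG_HEADING in with_rag, RAG_HEADING in without_rag)
```

The branch on `rag_context` means the "only legal sources" block is appended only when retrieval returned something. When it returned nothing, the restriction rests on whatever the base prompt already says.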
@@ -0,0 +1,302 @@
|
||||
'use client'
|
||||
|
||||
// Explanatory architecture diagram of the compliance-check tool. Pattern adopted from the
// CE module (/sdk/iace/.../architektur): hand-curated boxes/arrows plus a
// step accordion. The content mirrors the code path (api/agent_check/_orchestrator
// + services/specialist_agents). Deliberately static (the doc check is Python and has
// no architecture endpoint like the Go IACE module); if needed, it can later be fed
// from a backend handler.
|
||||
|
||||
import { useState, type ReactNode } from 'react'
|
||||
|
||||
function Box({ title, sub, accent }: { title: string; sub?: string; accent?: 'purple' | 'amber' | 'green' | 'gray' }) {
|
||||
const c =
|
||||
accent === 'purple'
|
||||
? 'border-purple-300 bg-purple-50/60 dark:border-purple-700 dark:bg-purple-900/20'
|
||||
: accent === 'amber'
|
||||
? 'border-amber-300 bg-amber-50/60 dark:border-amber-700 dark:bg-amber-900/20'
|
||||
: accent === 'green'
|
||||
? 'border-green-300 bg-green-50/60 dark:border-green-700 dark:bg-green-900/20'
|
||||
: 'border-gray-200 bg-white dark:border-gray-700 dark:bg-gray-800'
|
||||
return (
|
||||
<div className={`rounded-lg border ${c} px-2.5 py-1.5`}>
|
||||
<div className="text-[11px] font-medium text-gray-800 dark:text-gray-200 leading-tight">{title}</div>
|
||||
{sub && <div className="text-[10px] text-gray-500 leading-tight mt-0.5">{sub}</div>}
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
function Lane({ label, children }: { label: string; children: ReactNode }) {
|
||||
return (
|
||||
<div className="flex-1 min-w-[150px] space-y-2">
|
||||
<div className="text-[10px] font-semibold uppercase tracking-wide text-gray-400 text-center">{label}</div>
|
||||
<div className="space-y-1.5">{children}</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
function Arrow() {
|
||||
return (
|
||||
<div className="flex items-center justify-center text-gray-300 dark:text-gray-600 shrink-0 px-0.5">
|
||||
<span className="hidden lg:block text-lg">→</span>
|
||||
<span className="lg:hidden text-sm">↓</span>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
type Stage = {
|
||||
id: string
|
||||
title: string
|
||||
summary: string
|
||||
input: string
|
||||
logic: string
|
||||
source: string
|
||||
example: string
|
||||
}
|
||||
|
||||
// Mirrors run_compliance_check (phases A-F) + the specialist-agent layer.
|
||||
const STAGES: Stage[] = [
|
||||
{
|
||||
id: 'a',
|
||||
title: 'Phase A — Auflösen & Crawl',
|
||||
summary: 'URLs + hochgeladene Dokumente einsammeln, fehlende Pflichtseiten automatisch finden.',
|
||||
input: 'Start-URL, Dokument-Uploads, 8 Wizard-Felder (scan_context)',
|
||||
logic: 'Discovery (Sitemap/Heuristik) + Fetch je Seite, Text-Extraktion pro Doc-Typ',
|
||||
source: 'consent-tester /dsi-discovery, Playwright',
|
||||
example: 'Findet /impressum, /datenschutz, /agb ohne manuelle Eingabe',
|
||||
},
|
||||
{
|
||||
id: 'b',
|
||||
title: 'Phase B — Profil & Dokument-Checks',
|
||||
summary: 'Geschäftsprofil erkennen, jedes Dokument gegen seine Controls prüfen.',
|
||||
input: 'Doc-Texte je Typ + Business-Scope',
|
||||
logic: 'Regex-Runner + MC-Keyword + BGE-M3-Embedding + LLM-Verify (nur unscharf)',
|
||||
source: 'doc_check_controls (DB), mc_classification.db (Embeddings)',
|
||||
example: 'DSE: 267 Text-MCs, Keyword + semantischer Recall',
|
||||
},
|
||||
{
|
||||
id: 'agents',
|
||||
title: 'Spezialagenten (nebenläufig)',
|
||||
summary: 'Pro Dokumenttyp ein typisierter Agent → eigener Ergebnis-Tab, gefüllt per SSE.',
|
||||
input: 'Doc-Text, Scope, scan_context',
|
||||
logic: 'Impressum + AGB + DSE laufen parallel (asyncio.gather), je ein AgentOutput',
|
||||
source: 'api/agent_check/_agent_outputs._TOPIC_AGENTS',
|
||||
example: 'AGB-Tab + DSE-Tab erscheinen, sobald ihr Agent fertig ist',
|
||||
},
|
||||
{
|
||||
id: 'c',
|
||||
title: 'Phase C — Cookie-Banner',
|
||||
summary: 'Consent-Banner + gesetzte Cookies vor/nach Einwilligung live prüfen.',
|
||||
input: 'Live-Seite im Browser',
|
||||
logic: 'Consent-Tester-Scan: Banner, Vendors, Enforcement, Browser-Matrix',
|
||||
source: 'consent-tester /scan',
|
||||
example: 'Cookie vor Einwilligung gesetzt → Verstoß-Kandidat',
|
||||
},
|
||||
{
|
||||
id: 'd',
|
||||
title: 'Phase D — Vendors & Plausibilität',
|
||||
summary: 'Dritt-Dienste extrahieren + Findings auf Plausibilität prüfen.',
|
||||
input: 'Banner-/Seiten-Daten, Findings',
|
||||
logic: 'Vendor-Extraktion (+OCR-Fallback), Plausibilitäts-Check je FAIL',
|
||||
source: 'Cookie-/Vendor-Kataloge, LLM-Kaskade',
|
||||
example: 'Analytics ohne Rechtsgrundlage → bestätigtes Finding',
|
||||
},
|
||||
{
|
||||
id: 'reconcile',
|
||||
title: 'Cross-Finding-Abgleich',
|
||||
summary: 'Findings über Dokumente hinweg abgleichen — Doppel & Scheinverstöße auflösen.',
|
||||
input: 'Alle Modul-Findings',
|
||||
logic: 'Deckt ein anderes Dokument die Pflicht ab, wird das Cross-Finding unterdrückt',
|
||||
source: 'cross_doc_reconcile (B-Wirings)',
|
||||
example: '§36 VSBG im Impressum statt DSE → kein Doppel-Finding',
|
||||
},
|
||||
{
|
||||
id: 'f',
|
||||
title: 'Phase E/F — Bericht & Snapshot',
|
||||
summary: 'Ergebnis persistieren, Snapshot für die Historie speichern.',
|
||||
input: 'Konsolidiertes Ergebnis',
|
||||
logic: 'Mail-Render + DB-Persist + Snapshot (Tab-Ansicht ohne Re-Crawl)',
|
||||
source: 'compliance_check_snapshots',
|
||||
example: 'Historie erneut öffnen, ohne die Seite neu zu crawlen',
|
||||
},
|
||||
]
|
||||
|
||||
type ModuleEngine = { name: string; mechanism: string }
|
||||
const MODULES: ModuleEngine[] = [
|
||||
{ name: 'Impressum', mechanism: 'Scope-Gate + Feld-Matcher (§5 DDG / §18 MStV)' },
|
||||
{ name: 'AGB', mechanism: 'decision_method-Routing: Keyword → Geschäftsmodell-Gate → Embedding/Reference/LLM' },
|
||||
{ name: 'DSE', mechanism: '4-Layer: Regex-Boost → Keyword → BGE-M3-Recall (0.65) → Semantic-Validator' },
|
||||
{ name: 'Cookie-Banner', mechanism: 'Consent-Tester: Banner, Vendors, Enforcement, Browser-Matrix' },
|
||||
]
|
||||
|
||||
type Pruefer = { method: string; mechanism: string; deterministic: string; example: string }
|
||||
// Meta-model: each obligation maps to one checker type (decision_method). A few
// reusable checkers instead of per-control logic.
|
||||
const PRUEFER: Pruefer[] = [
|
||||
{ method: 'REGEX', mechanism: 'Kuratierte Muster / Keyword', deterministic: 'ja', example: 'Pflicht-Stichwort im Text' },
|
||||
{ method: 'EMBEDDING', mechanism: 'BGE-M3 Kosinus ≥ Schwelle', deterministic: 'ja (feste Funktion)', example: '„Recht auf Berichtigung" ≈ Umschreibung' },
|
||||
{ method: 'REFERENCE', mechanism: 'Link-/Verweis-Auflösung', deterministic: 'ja', example: 'Verweis auf die Datenschutzerklärung' },
|
||||
{ method: 'LLM', mechanism: 'Kaskade Qwen→OVH→Claude, nur unscharfe Fälle', deterministic: 'nein (eskaliert)', example: 'Speicherdauer inhaltlich erfüllt?' },
|
||||
{ method: 'BEHAVIOR', mechanism: 'Playwright: Live-Verhalten', deterministic: 'ja', example: 'Cookies vor Einwilligung gesetzt?' },
|
||||
{ method: 'SCANNER', mechanism: 'Repo-/Netzwerk-/Prozess-Scan', deterministic: 'ja', example: 'Geplant: technische Nachweise' },
|
||||
]
|
||||
|
||||
function Field({ label, value, mono }: { label: string; value: string; mono?: boolean }) {
|
||||
return (
|
||||
<div>
|
||||
<dt className="text-[10px] uppercase tracking-wide text-gray-400">{label}</dt>
|
||||
<dd className={`text-gray-600 dark:text-gray-300 ${mono ? 'font-mono text-[11px]' : ''}`}>{value}</dd>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
function StageRow({ stage, last, open, onToggle }: { stage: Stage; last: boolean; open: boolean; onToggle: () => void }) {
|
||||
return (
|
||||
<div>
|
||||
<button
|
||||
onClick={onToggle}
|
||||
className={`w-full text-left rounded-lg border p-3 transition-colors ${
|
||||
open
|
||||
? 'border-purple-300 bg-purple-50/60 dark:border-purple-700 dark:bg-purple-900/20'
|
||||
: 'border-gray-200 dark:border-gray-700 bg-white dark:bg-gray-800 hover:bg-gray-50 dark:hover:bg-gray-700/50'
|
||||
}`}
|
||||
>
|
||||
<div className="flex items-start justify-between gap-3">
|
||||
<div>
|
||||
<div className="text-sm font-semibold text-gray-800 dark:text-gray-200">{stage.title}</div>
|
||||
<div className="text-xs text-gray-500 mt-0.5">{stage.summary}</div>
|
||||
</div>
|
||||
<span className="text-gray-400 text-xs shrink-0">{open ? '▲' : '▼'}</span>
|
||||
</div>
|
||||
{open && (
|
||||
<dl className="mt-3 grid grid-cols-1 md:grid-cols-2 gap-x-6 gap-y-2 text-xs">
|
||||
<Field label="Input" value={stage.input} />
|
||||
<Field label="Logik" value={stage.logic} />
|
||||
<Field label="Datenquelle" value={stage.source} mono />
|
||||
<Field label="Beispiel" value={stage.example} />
|
||||
</dl>
|
||||
)}
|
||||
</button>
|
||||
{!last && <div className="flex justify-center text-gray-300 dark:text-gray-600 text-xs leading-none py-0.5">↓</div>}
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
export function ArchitekturView() {
|
||||
const [open, setOpen] = useState<string | null>('b')
|
||||
|
||||
return (
|
||||
<div className="space-y-8">
|
||||
<div>
|
||||
<h2 className="text-xl font-bold text-gray-900 dark:text-gray-100">Architektur & Datenfluss</h2>
|
||||
<p className="text-sm text-gray-500 dark:text-gray-400 max-w-3xl mt-1">
|
||||
Nachvollziehbar: <strong>woher jedes Finding stammt</strong> und <strong>wie es geprüft wird</strong>.
|
||||
Die Engine ist überwiegend <strong>deterministisch</strong> (Regex + Embedding); ein LLM entscheidet nur
|
||||
die unscharfen Fälle. Ergebnisse erscheinen pro Modul progressiv und werden am Ende per
|
||||
Cross-Finding-Abgleich bereinigt.
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<section className="space-y-2">
|
||||
<h3 className="text-sm font-semibold text-gray-700 dark:text-gray-300">Datenfluss (Überblick)</h3>
|
||||
<div className="rounded-xl border border-gray-200 dark:border-gray-700 bg-gray-50/50 dark:bg-gray-900/20 p-3 overflow-x-auto">
|
||||
<div className="flex flex-col lg:flex-row gap-1.5 lg:items-stretch min-w-[280px]">
|
||||
<Lane label="Eingabe">
|
||||
<Box title="Website + Dokumente" sub="Impressum · DSE · AGB · Cookies" accent="purple" />
|
||||
<Box title="Wizard-Kontext" sub="8 Felder: Shop, Drittland, Beruf…" accent="purple" />
|
||||
</Lane>
|
||||
<Arrow />
|
||||
<Lane label="Crawl + Text">
|
||||
<Box title="Discovery + Fetch" sub="consent-tester, Playwright" />
|
||||
<Box title="Doc-Text je Typ" />
|
||||
</Lane>
|
||||
<Arrow />
|
||||
<Lane label="Engine (deterministisch)">
|
||||
<div className="rounded-lg border border-gray-200 dark:border-gray-700 bg-white dark:bg-gray-800 p-1.5 space-y-1">
|
||||
{STAGES.map((s) => (
|
||||
<div key={s.id} className="text-[10px] text-gray-600 dark:text-gray-300 leading-tight">
|
||||
{s.title}
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
</Lane>
|
||||
<Arrow />
|
||||
<Lane label="Ausgaben">
|
||||
<Box title="Findings je Modul-Tab" sub="Impressum/AGB/DSE/Cookie" accent="green" />
|
||||
<Box title="Severity + Maßnahme" accent="green" />
|
||||
<Box title="Snapshot + Bericht" sub="ohne Re-Crawl" accent="green" />
|
||||
</Lane>
|
||||
</div>
|
||||
<p className="text-[10px] text-gray-400 mt-2">
|
||||
Links→rechts reproduzierbar. Embedding ist semantisch UND deterministisch (feste Funktion: gleicher
|
||||
Text → gleicher Vektor). Das LLM läuft nur für unscharfe Fälle und eskaliert mit Selbstkonfidenz.
|
||||
</p>
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<section className="space-y-2">
|
||||
<h3 className="text-sm font-semibold text-gray-700 dark:text-gray-300">Pipeline (Schritt für Schritt)</h3>
|
||||
<div className="space-y-1">
|
||||
{STAGES.map((s, i) => (
|
||||
<StageRow
|
||||
key={s.id}
|
||||
stage={s}
|
||||
last={i === STAGES.length - 1}
|
||||
open={open === s.id}
|
||||
onToggle={() => setOpen(open === s.id ? null : s.id)}
|
||||
/>
|
||||
))}
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<section className="space-y-3">
|
||||
<h3 className="text-sm font-semibold text-gray-700 dark:text-gray-300">Modul-Engines (live)</h3>
|
||||
<div className="grid grid-cols-1 sm:grid-cols-2 gap-3">
|
||||
{MODULES.map((m) => (
|
||||
<div key={m.name} className="rounded-lg border border-gray-200 dark:border-gray-700 bg-white dark:bg-gray-800 p-3">
|
||||
<div className="flex items-baseline justify-between gap-2">
|
||||
<span className="text-sm font-medium text-gray-800 dark:text-gray-200">{m.name}</span>
|
||||
<span className="inline-block rounded px-1.5 py-0.5 text-[10px] font-medium bg-green-100 text-green-700 dark:bg-green-900/40 dark:text-green-300">
|
||||
live
|
||||
</span>
|
||||
</div>
|
||||
<p className="text-xs text-gray-500 mt-1">{m.mechanism}</p>
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<section className="space-y-3">
|
||||
<h3 className="text-sm font-semibold text-gray-700 dark:text-gray-300">Prüfer-Matrix (Meta-Modell)</h3>
|
||||
<p className="text-xs text-gray-500 max-w-3xl">
|
||||
Jede Pflicht wird einem <strong>Prüfertyp</strong> zugeordnet — so braucht es nicht pro Control eigene
|
||||
Logik, sondern wenige wiederverwendbare Prüfer.
|
||||
</p>
|
||||
<div className="overflow-x-auto">
|
||||
<table className="w-full text-xs">
|
||||
<thead>
|
||||
<tr className="text-gray-500 border-b border-gray-200 dark:border-gray-700 text-left">
|
||||
<th className="py-1.5 pr-3">Prüfer</th>
|
||||
<th className="py-1.5 pr-3">Mechanismus</th>
|
||||
<th className="py-1.5 pr-3">Deterministisch</th>
|
||||
<th className="py-1.5">Beispiel</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
{PRUEFER.map((p) => (
|
||||
<tr key={p.method} className="border-b border-gray-100 dark:border-gray-700/50 align-top">
|
||||
<td className="py-1.5 pr-3">
|
||||
<code className="text-[11px] bg-gray-100 dark:bg-gray-700 rounded px-1">{p.method}</code>
|
||||
</td>
|
||||
<td className="py-1.5 pr-3 text-gray-600 dark:text-gray-300">{p.mechanism}</td>
|
||||
<td className="py-1.5 pr-3 text-gray-500">{p.deterministic}</td>
|
||||
<td className="py-1.5 text-gray-500">{p.example}</td>
|
||||
</tr>
|
||||
))}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</section>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
@@ -4,10 +4,12 @@ import React, { useState } from 'react'
|
||||
import { ComplianceCheckTab } from './_components/ComplianceCheckTab'
|
||||
import { ComplianceFAQ } from './_components/ComplianceFAQ'
|
||||
import { SnapshotHistoryList } from './_components/SnapshotHistoryList'
|
||||
import { ArchitekturView } from './_components/ArchitekturView'
|
||||
|
||||
export default function AgentPage() {
|
||||
// After a completed check, reload the history below.
|
||||
const [historyKey, setHistoryKey] = useState(0)
|
||||
const [tab, setTab] = useState<'check' | 'architektur'>('check')
|
||||
|
||||
return (
|
||||
<div className="space-y-6 max-w-4xl">
|
||||
@@ -16,11 +18,31 @@ export default function AgentPage() {
|
||||
<p className="text-gray-500 mt-1">Webseiten + Dokumente auf DSGVO-Konformität prüfen.</p>
|
||||
</div>
|
||||
|
||||
<div className="flex gap-1 border-b border-gray-200 dark:border-gray-700">
|
||||
{([['check', 'Check'], ['architektur', 'Architektur']] as const).map(([id, label]) => (
|
||||
<button
|
||||
key={id}
|
||||
onClick={() => setTab(id)}
|
||||
className={`px-3 py-2 text-sm font-medium -mb-px border-b-2 transition-colors ${
|
||||
tab === id
|
||||
? 'border-purple-500 text-purple-600 dark:text-purple-400'
|
||||
: 'border-transparent text-gray-500 hover:text-gray-700 dark:hover:text-gray-300'
|
||||
}`}
|
||||
>
|
||||
{label}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
|
||||
{tab === 'check' ? (
|
||||
<>
|
||||
<ComplianceCheckTab onComplete={() => setHistoryKey(k => k + 1)} />
|
||||
|
||||
<SnapshotHistoryList refreshKey={historyKey} />
|
||||
|
||||
<ComplianceFAQ />
|
||||
</>
|
||||
) : (
|
||||
<ArchitekturView />
|
||||
)}
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
@@ -46,6 +46,28 @@ export interface CorpusOverview {
|
||||
totals: { documents: number; catalog_sources: number }
|
||||
}
|
||||
|
||||
// --- Ingested legal-corpus structure (from the vector store, via the Go SDK).
|
||||
// Shows WHAT each eur-lex act consists of (articles/annexes/recitals), so the
|
||||
// ingested corpus is not a black box for developers. ---
|
||||
export interface LegalActStructure {
|
||||
regulation_short: string
|
||||
regulation_name: string
|
||||
articles: number
|
||||
annexes: number
|
||||
recitals: number
|
||||
chunks: number
|
||||
}
|
||||
|
||||
export interface LegalCorpus {
|
||||
regulations: LegalActStructure[]
|
||||
totals: {
|
||||
regulations: number
|
||||
articles: number
|
||||
annexes: number
|
||||
recitals: number
|
||||
}
|
||||
}
|
||||
|
||||
// --- Corpus documents: group by kind (law/guideline/standard/court ruling)
// + publisher family (DSK, EDPB, OWASP, NIST …). Deterministic, pure. ---
|
||||
interface DocCat {
|
||||
|
||||
@@ -3,6 +3,7 @@ import Link from 'next/link'
|
||||
import {
|
||||
type UseCaseRow,
|
||||
type CorpusOverview,
|
||||
type LegalCorpus,
|
||||
licenseTierBadgeClass,
|
||||
commercialBadgeClass,
|
||||
groupUseCases,
|
||||
@@ -11,28 +12,46 @@ import {
|
||||
|
||||
const BACKEND_URL =
|
||||
process.env.COMPLIANCE_BACKEND_URL || 'http://backend-compliance:8002'
|
||||
// The legal-corpus structure comes from the Go SDK (it owns the vector store).
|
||||
const SDK_URL = process.env.SDK_URL || 'http://ai-compliance-sdk:8090'
|
||||
|
||||
export const dynamic = 'force-dynamic'
|
||||
|
||||
// Fetched from the SDK and isolated in its own try/catch so a vector-store
|
||||
// hiccup degrades to "no structure shown" instead of blanking the whole page.
|
||||
async function fetchLegalCorpus(): Promise<LegalCorpus | null> {
|
||||
try {
|
||||
const res = await fetch(`${SDK_URL}/sdk/v1/rag/legal-corpus`, {
|
||||
cache: 'no-store',
|
||||
})
|
||||
return res.ok ? await res.json() : null
|
||||
} catch {
|
||||
return null
|
||||
}
|
||||
}
|
||||
|
||||
async function getData(): Promise<{
|
||||
useCases: UseCaseRow[]
|
||||
corpus: CorpusOverview | null
|
||||
legalCorpus: LegalCorpus | null
|
||||
}> {
|
||||
try {
|
||||
- const [ucRes, corpusRes] = await Promise.all([
+ const [ucRes, corpusRes, legalCorpus] = await Promise.all([
|
||||
fetch(`${BACKEND_URL}/api/compliance/v1/controls/use-cases`, {
|
||||
cache: 'no-store',
|
||||
}),
|
||||
fetch(`${BACKEND_URL}/api/compliance/v1/controls/corpus`, {
|
||||
cache: 'no-store',
|
||||
}),
|
||||
fetchLegalCorpus(),
|
||||
])
|
||||
return {
|
||||
useCases: ucRes.ok ? await ucRes.json() : [],
|
||||
corpus: corpusRes.ok ? await corpusRes.json() : null,
|
||||
legalCorpus,
|
||||
}
|
||||
} catch {
|
||||
- return { useCases: [], corpus: null }
+ return { useCases: [], corpus: null, legalCorpus: null }
|
||||
}
|
||||
}
|
||||
|
||||
@@ -46,7 +65,7 @@ function Stat({ label, value }: { label: string; value: string | number }) {
|
||||
}
|
||||
|
||||
export default async function CoveragePage() {
|
||||
- const { useCases, corpus } = await getData()
+ const { useCases, corpus, legalCorpus } = await getData()
|
||||
const groups = groupUseCases(useCases)
|
||||
const totalRelevant = useCases.reduce((s, u) => s + u.atom_relevant, 0)
|
||||
const totalAtoms = useCases.reduce((s, u) => s + u.atom_total, 0)
|
||||
@@ -221,6 +240,67 @@ export default async function CoveragePage() {
|
||||
</div>
|
||||
</section>
|
||||
|
||||
{legalCorpus?.regulations?.length ? (
|
||||
<section className="space-y-2">
|
||||
<h2 className="text-lg font-semibold text-gray-900">
|
||||
Ingestierter Rechtskorpus – Struktur ({legalCorpus.totals.regulations}{' '}
|
||||
Rechtsakte)
|
||||
</h2>
|
||||
<p className="text-xs text-gray-500">
|
||||
Woraus jeder ingestierte eur-lex-Rechtsakt tatsächlich besteht:
|
||||
Artikel (§), Anhänge, Erwägungsgründe und retrievbare Chunks — direkt
|
||||
aus dem Vektorspeicher, damit kein Black-Box-Korpus entsteht.
|
||||
</p>
|
||||
<div className="overflow-auto rounded-lg border border-gray-200">
|
||||
<table className="min-w-full divide-y divide-gray-200 text-sm">
|
||||
<thead className="bg-gray-50 text-left text-xs uppercase text-gray-500">
|
||||
<tr>
|
||||
<th className="px-4 py-2">Rechtsakt</th>
|
||||
<th className="px-4 py-2 text-right">Artikel (§)</th>
|
||||
<th className="px-4 py-2 text-right">Anhänge</th>
|
||||
<th className="px-4 py-2 text-right">Erwägungsgründe</th>
|
||||
<th className="px-4 py-2 text-right">Chunks</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody className="divide-y divide-gray-100 bg-white">
|
||||
{legalCorpus.regulations.map((r) => (
|
||||
<tr key={r.regulation_short}>
|
||||
<td className="px-4 py-2 text-gray-900">
|
||||
<span className="font-medium">{r.regulation_short}</span>
|
||||
{r.regulation_name !== r.regulation_short ? (
|
||||
<span className="ml-2 text-xs text-gray-500">
|
||||
{r.regulation_name}
|
||||
</span>
|
||||
) : null}
|
||||
</td>
|
||||
<td className="px-4 py-2 text-right font-semibold">
|
||||
{r.articles.toLocaleString('de-DE')}
|
||||
</td>
|
||||
<td className="px-4 py-2 text-right">
|
||||
{r.annexes > 0 ? (
|
||||
r.annexes.toLocaleString('de-DE')
|
||||
) : (
|
||||
<span className="text-gray-300">—</span>
|
||||
)}
|
||||
</td>
|
||||
<td className="px-4 py-2 text-right text-gray-500">
|
||||
{r.recitals > 0 ? (
|
||||
r.recitals.toLocaleString('de-DE')
|
||||
) : (
|
||||
<span className="text-gray-300">—</span>
|
||||
)}
|
||||
</td>
|
||||
<td className="px-4 py-2 text-right text-gray-500">
|
||||
{r.chunks.toLocaleString('de-DE')}
|
||||
</td>
|
||||
</tr>
|
||||
))}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</section>
|
||||
) : null}
|
||||
|
||||
{corpus?.license_catalog?.length ? (
|
||||
<section className="space-y-2">
|
||||
<h2 className="text-lg font-semibold text-gray-900">
|
||||
|
||||
@@ -55,8 +55,7 @@ linters-settings:
|
||||
rules:
|
||||
- name: exported
|
||||
arguments:
|
||||
- - checkPrivateReceivers: false
- - disableStutteringCheck: true
+ - disableStutteringCheck
|
||||
- name: error-return
|
||||
- name: increment-decrement
|
||||
- name: var-declaration
|
||||
@@ -83,6 +82,6 @@ issues:
|
||||
max-issues-per-linter: 50
|
||||
max-same-issues: 5
|
||||
|
||||
- # New code only: don't fail on pre-existing issues in files we haven't touched.
- # Remove this once a clean baseline is established.
- new: false
+ # New code only: lint lines changed vs main, so pre-existing debt doesn't fail CI.
+ # Needs the go-lint job to clone with a local `main` ref (see .gitea/workflows/ci.yaml).
+ new-from-merge-base: main
|
||||
|
||||
@@ -211,6 +211,13 @@ func (h *IACEHandler) InitializeProject(c *gin.Context) {
|
||||
}
|
||||
|
||||
for _, cat := range mp.HazardCats {
|
||||
// Native cyber/AI categories (frontend groups I+J) belong to the
|
||||
// CRA module, not the traditional CE (ISO 12100) hazard log.
|
||||
// Enforced centrally here so it holds for EVERY project.
|
||||
if isCyberSecurityCategory(cat) {
|
||||
fmt.Printf("CYBER-SKIP: cat=%s pattern=%s — routed to CRA module\n", cat, mp.PatternID)
|
||||
continue
|
||||
}
|
||||
maxForCat := categoryHazardCap(cat, len(comps))
|
||||
if catCount[cat] >= maxForCat {
|
||||
continue
|
||||
|
||||
@@ -0,0 +1,45 @@
|
||||
package handlers
|
||||
|
||||
// Safety/Security separation for the IACE hazard log.
|
||||
//
|
||||
// The traditional CE risk assessment (Maschinenrichtlinie / EN ISO 12100) and
|
||||
// the cybersecurity assessment (Cyber Resilience Act) are two distinct steps.
|
||||
// IACE owns the traditional, physical + functional-safety hazards; the CRA
|
||||
// module (/sdk/iace/{id}/cra) owns the native cyber/AI topics and re-examines
|
||||
// which safety functions a cyber attack can re-open (see iace-safety-bridge).
|
||||
//
|
||||
// The split is by the NATURE of the hazard, not by the component: a control
|
||||
// fault, bus failure or botched update is FUNCTIONAL safety (random/systematic
|
||||
// fault) and stays in CE — independent of whether the controller is a bought-in
|
||||
// CE-marked PLC or the manufacturer's own embedded control. Only the security
|
||||
// PROPERTIES against malicious actors (access control, firmware/update
|
||||
// integrity, SBOM, vulnerability handling, default passwords) are CRA.
|
||||
//
|
||||
// Functional-safety control categories (software_control, software_fault,
|
||||
// safety_function_failure, configuration_error, communication_failure,
|
||||
// update_failure, sensor_fault, …) therefore intentionally STAY in IACE — they
|
||||
// are the safety functions whose loss the CRA bridge re-examines.
|
||||
//
|
||||
// Enforced centrally in InitializeProject so it holds for EVERY project.
|
||||
var nativeCyberSecurityCategories = map[string]bool{
|
||||
// I. Cyber / network: security against malicious actors
|
||||
"unauthorized_access": true,
|
||||
"firmware_corruption": true,
|
||||
"cyber_resilience": true,
|
||||
"logging_audit_failure": true,
|
||||
"cyber_network": true,
|
||||
"sensor_spoofing": true,
|
||||
// J. AI-specific
|
||||
"ai_specific": true,
|
||||
"ai_misclassification": true,
|
||||
"false_classification": true,
|
||||
"model_drift": true,
|
||||
"data_poisoning": true,
|
||||
"unintended_bias": true,
|
||||
}
|
||||
|
||||
// isCyberSecurityCategory reports whether a hazard category is a native cyber/AI
|
||||
// topic that belongs to the CRA module rather than the traditional CE hazard log.
|
||||
func isCyberSecurityCategory(category string) bool {
|
||||
return nativeCyberSecurityCategories[category]
|
||||
}
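
The routing rule described in the file comment can be tried in isolation. The following is a minimal sketch, not the project's implementation: `splitByNature` and the shortened `cyberCats` set are hypothetical names introduced here for illustration; only the category strings are taken from the diff.

```go
package main

import "fmt"

// cyberCats is a shortened, illustrative copy of the native cyber/AI set
// (assumption: the real map in the diff holds more entries).
var cyberCats = map[string]bool{
	"unauthorized_access": true,
	"firmware_corruption": true,
	"model_drift":         true,
}

// splitByNature is a hypothetical helper. It partitions hazard categories
// into those kept in the traditional CE hazard log and those routed to the
// CRA module, preserving input order within each group.
func splitByNature(cats []string) (ce, cra []string) {
	for _, c := range cats {
		if cyberCats[c] {
			cra = append(cra, c)
			continue
		}
		ce = append(ce, c)
	}
	return ce, cra
}

func main() {
	ce, cra := splitByNature([]string{"thermal_hazard", "update_failure", "unauthorized_access"})
	fmt.Println("CE: ", ce)
	fmt.Println("CRA:", cra)
}
```

Note that `update_failure` lands in the CE group: a botched update is treated as a functional fault, while `unauthorized_access` is a security property and goes to CRA. An unknown category also falls through to CE, because a missing map key reads as `false`.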
|
||||
@@ -0,0 +1,37 @@
|
||||
package handlers
|
||||
|
||||
import "testing"
|
||||
|
||||
func TestIsCyberSecurityCategory_RoutedToCRA(t *testing.T) {
|
||||
cyber := []string{
|
||||
"unauthorized_access", "firmware_corruption", "cyber_resilience",
|
||||
"logging_audit_failure", "cyber_network", "sensor_spoofing",
|
||||
"ai_specific", "ai_misclassification", "false_classification",
|
||||
"model_drift", "data_poisoning", "unintended_bias",
|
||||
}
|
||||
for _, c := range cyber {
|
||||
if !isCyberSecurityCategory(c) {
|
||||
t.Errorf("category %q must be routed to the CRA module, not the traditional IACE log", c)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestIsCyberSecurityCategory_StaysInIACE(t *testing.T) {
|
||||
// Physical + functional-safety categories must remain in the traditional CE
|
||||
// hazard log. communication_failure (bus failure -> loss of control) and
|
||||
// update_failure (botched update -> lost safety function) are FUNCTIONAL
|
||||
// faults, not attacks, so they stay too.
|
||||
keep := []string{
|
||||
"mechanical_hazard", "electrical_hazard", "thermal_hazard",
|
||||
"pneumatic_hydraulic", "noise_vibration", "ergonomic_hazard",
|
||||
"material_environmental", "chemical_risk", "fire_explosion",
|
||||
"software_control", "software_fault", "safety_function_failure",
|
||||
"configuration_error", "sensor_fault", "hmi_error",
|
||||
"communication_failure", "update_failure",
|
||||
}
|
||||
for _, c := range keep {
|
||||
if isCyberSecurityCategory(c) {
|
||||
t.Errorf("category %q must stay in the traditional IACE log, not be routed to CRA", c)
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -78,6 +78,7 @@ func (h *RAGHandlers) Search(c *gin.Context) {
|
||||
"query": req.Query,
|
||||
"results": results,
|
||||
"count": len(results),
|
||||
"assessment": ucca.Assess(results),
|
||||
})
|
||||
}
|
||||
|
||||
@@ -206,3 +207,32 @@ func (h *RAGHandlers) HandleScrollChunks(c *gin.Context) {
|
||||
"total": len(chunks),
|
||||
})
|
||||
}
|
||||
|
||||
// LegalCorpusStructure returns the composition (distinct articles, annexes,
|
||||
// recitals + chunk count) of every ingested eur-lex legal act, so the coverage
|
||||
// page can show WHAT was ingested instead of just the act name.
|
||||
// GET /sdk/v1/rag/legal-corpus
|
||||
func (h *RAGHandlers) LegalCorpusStructure(c *gin.Context) {
|
||||
acts, err := h.ragClient.CorpusStructure(c.Request.Context())
|
||||
if err != nil {
|
||||
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to aggregate legal corpus: " + err.Error()})
|
||||
return
|
||||
}
|
||||
|
||||
arts, anns, recs := 0, 0, 0
|
||||
for _, a := range acts {
|
||||
arts += a.Articles
|
||||
anns += a.Annexes
|
||||
recs += a.Recitals
|
||||
}
|
||||
|
||||
c.JSON(http.StatusOK, gin.H{
|
||||
"regulations": acts,
|
||||
"totals": gin.H{
|
||||
"regulations": len(acts),
|
||||
"articles": arts,
|
||||
"annexes": anns,
|
||||
"recitals": recs,
|
||||
},
|
||||
})
|
||||
}
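
`CorpusStructure` itself is not part of this diff, so how it derives the per-act counts is not shown. As a rough sketch of what "distinct articles, annexes, recitals + chunk count" could mean, the example below counts every chunk but counts each (regulation, kind, reference) triple only once. The `chunk` type, its field names, and the kind values `"article"`, `"annex"`, `"recital"` are assumptions made for this illustration, and the sample data is invented.

```go
package main

import (
	"fmt"
	"sort"
)

// chunk is a hypothetical view of one vector-store point's metadata.
type chunk struct {
	Regulation string // short name of the act (sample values are made up)
	Kind       string // "article", "annex" or "recital" (assumed values)
	Ref        string // reference inside the act (assumed format)
}

// actStructure mirrors the counters that the handler sums into totals.
type actStructure struct {
	Regulation string
	Articles   int
	Annexes    int
	Recitals   int
	Chunks     int
}

// aggregate counts all chunks per act, but each distinct
// (regulation, kind, ref) combination only once.
func aggregate(chunks []chunk) []actStructure {
	type key struct{ reg, kind, ref string }
	seen := map[key]bool{}
	byReg := map[string]*actStructure{}
	for _, ch := range chunks {
		a := byReg[ch.Regulation]
		if a == nil {
			a = &actStructure{Regulation: ch.Regulation}
			byReg[ch.Regulation] = a
		}
		a.Chunks++
		k := key{ch.Regulation, ch.Kind, ch.Ref}
		if seen[k] {
			continue // same article/annex/recital, just another chunk of it
		}
		seen[k] = true
		switch ch.Kind {
		case "article":
			a.Articles++
		case "annex":
			a.Annexes++
		case "recital":
			a.Recitals++
		}
	}
	out := make([]actStructure, 0, len(byReg))
	for _, a := range byReg {
		out = append(out, *a)
	}
	// Sort for a stable response order; map iteration order is random in Go.
	sort.Slice(out, func(i, j int) bool { return out[i].Regulation < out[j].Regulation })
	return out
}

func main() {
	acts := aggregate([]chunk{
		{"GDPR", "article", "5"},
		{"GDPR", "article", "5"}, // second chunk of the same article
		{"GDPR", "recital", "39"},
	})
	fmt.Printf("%+v\n", acts)
}
```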
|
||||
|
||||
@@ -161,6 +161,7 @@ func registerRAGRoutes(v1 *gin.RouterGroup, h *handlers.RAGHandlers) {
|
||||
ragRoutes.GET("/corpus-status", h.CorpusStatus)
|
||||
ragRoutes.GET("/corpus-versions/:collection", h.CorpusVersionHistory)
|
||||
ragRoutes.GET("/scroll", h.HandleScrollChunks)
|
||||
ragRoutes.GET("/legal-corpus", h.LegalCorpusStructure)
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -0,0 +1,132 @@
|
||||
package iace
|
||||
|
||||
// GetWarewashingPatterns returns hazard patterns for commercial warewashing
|
||||
// machines (gewerbliche Geschirrspuelmaschinen / Untertisch-, Hauben-, Korb-
|
||||
// und Bandspuelmaschinen). These capture the machine-specific hazards a
|
||||
// domain expert immediately expects but that the generic library did not cover:
|
||||
// hot-water/steam scalding on door opening, hot surfaces, hot ware, corrosive
|
||||
// detergent/rinse-aid contact, door pinch and wet-floor slipping.
|
||||
//
|
||||
// Every pattern is gated by the capability tag "dom_warewashing" (emitted only
|
||||
// by warewashing narrative keywords in keyword_dictionary.go), so none of these
|
||||
// leak into unrelated machine classes.
|
||||
//
|
||||
// HP range: HP2200-HP2206. ISO 12100 Annex B section identifiers only (facts);
|
||||
// product standard EN 60335-2-58 (commercial dishwashing machines).
|
||||
func GetWarewashingPatterns() []HazardPattern {
|
||||
return []HazardPattern{
|
||||
{
|
||||
ID: "HP2200", NameDE: "Verbruehung durch Heisswasser/Dampf beim Oeffnen der Tuer", NameEN: "Scalding by hot water/steam when opening the door",
|
||||
RequiredComponentTags: []string{"dom_warewashing", "steam_emission"},
|
||||
GeneratedHazardCats: []string{"thermal_hazard"},
|
||||
SuggestedMeasureIDs: []string{"M2200", "M2201", "M2202", "M2208"},
|
||||
Priority: 94,
|
||||
ApplicableLifecycles: []string{"normal_operation", "cleaning"},
|
||||
ScenarioDE: "Beim Oeffnen der Tuer waehrend oder unmittelbar nach dem Spuelgang tritt ein Schwall aus heissem Wasser und Wrasen (Dampf) aus der Spuelkammer aus und trifft Gesicht, Haende und Arme des Bedieners.",
|
||||
TriggerDE: "Tuer wird vor Programmende oder bei noch vorhandenem Restdampf geoeffnet; Tuerverriegelung fehlt oder ist ueberbrueckt; Nachspueltemperatur ca. 85 Grad C.",
|
||||
HarmDE: "Verbruehung 1.-2. Grades an Gesicht, Haenden und Unterarmen; Augenreizung durch heissen Dampf.",
|
||||
AffectedDE: "Bedienpersonal (Spuelkraft)",
|
||||
ZoneDE: "Tuer- und Beschickungsoeffnung der Spuelkammer",
|
||||
ISO12100Section: "6.2.4",
|
||||
DefaultSeverity: 3, DefaultExposure: 4,
|
||||
},
|
||||
{
|
||||
ID: "HP2201", NameDE: "Verbrennung an heissen Oberflaechen (Boiler/Tank/Spuelkammer)", NameEN: "Burn on hot surfaces (boiler/tank/wash chamber)",
|
||||
RequiredComponentTags: []string{"dom_warewashing", "high_temperature"},
|
||||
GeneratedHazardCats: []string{"thermal_hazard"},
|
||||
SuggestedMeasureIDs: []string{"M2202", "M055", "M2208"},
|
||||
Priority: 90,
|
||||
ApplicableLifecycles: []string{"cleaning", "maintenance"},
|
||||
ScenarioDE: "Beruehrung heisser Oberflaechen von Boiler, Tankheizkoerper oder Spuelkammerwaenden bei Reinigung, Entkalkung oder Wartung fuehrt zu Kontaktverbrennungen.",
|
||||
TriggerDE: "Reinigung/Entkalkung ohne Abkuehlzeit; Eingriff in die Spuelkammer bei betriebswarmem Geraet.",
|
||||
HarmDE: "Kontaktverbrennung an Haenden und Unterarmen.",
|
||||
AffectedDE: "Reinigungspersonal, Wartungspersonal",
|
||||
ZoneDE: "Boiler, Tankheizkoerper, Spuelkammerwaende",
|
||||
ISO12100Section: "6.2.4",
|
||||
DefaultSeverity: 2, DefaultExposure: 3,
|
||||
},
|
||||
{
|
||||
ID: "HP2202", NameDE: "Verbrennung an heissem Spuelgut beim Entladen", NameEN: "Burn on hot ware when unloading",
|
||||
RequiredComponentTags: []string{"dom_warewashing", "hot_water"},
|
||||
GeneratedHazardCats: []string{"thermal_hazard"},
|
||||
SuggestedMeasureIDs: []string{"M2202", "M055", "M2208"},
|
||||
Priority: 86,
|
||||
ApplicableLifecycles: []string{"normal_operation"},
|
||||
ScenarioDE: "Geschirr, Glaeser und Bestecke sind nach dem Spuelgang durch die Heisswasser-Nachspuelung sehr heiss; beim Entladen kommt es zu Verbrennungen.",
|
||||
TriggerDE: "Sofortiges Entnehmen des Spuelguts nach Programmende ohne Abkuehl-/Trocknungszeit.",
|
||||
HarmDE: "Verbrennung an Haenden/Fingern beim Greifen heisser Teile.",
|
||||
AffectedDE: "Bedienpersonal (Spuelkraft)",
|
||||
ZoneDE: "Spuelkammer, Entnahmebereich/Korb",
|
||||
ISO12100Section: "6.2.4",
|
||||
DefaultSeverity: 2, DefaultExposure: 3,
|
||||
},
|
||||
{
|
||||
ID: "HP2203", NameDE: "Chemische Veraetzung (Haut/Augen) durch Reiniger-/Klarspueler-Konzentrat", NameEN: "Chemical burn (skin/eyes) from detergent/rinse-aid concentrate",
|
||||
RequiredComponentTags: []string{"dom_warewashing", "corrosive_chemical"},
|
||||
GeneratedHazardCats: []string{"chemical_risk"},
|
||||
SuggestedMeasureIDs: []string{"M2203", "M2204", "M2208"},
|
||||
Priority: 92,
|
||||
ApplicableLifecycles: []string{"normal_operation", "maintenance"},
|
||||
ScenarioDE: "Direkter Kontakt mit dem aetzenden (alkalischen) Reiniger- bzw. Klarspueler-Konzentrat beim Nachfuellen, Sauglanzenwechsel oder bei Leckage fuehrt zu Veraetzungen von Haut und Augen.",
|
||||
TriggerDE: "Gebinde-/Sauglanzenwechsel ohne Schutzausruestung; Umfuellen von Konzentrat; undichte Dosierleitung.",
|
||||
HarmDE: "Veraetzung von Haut und Augen (alkalische Verletzung), bleibende Augenschaeden moeglich.",
|
||||
AffectedDE: "Bedienpersonal, Reinigungspersonal beim Chemikalien-Handling",
|
||||
ZoneDE: "Dosiergeraet, Reiniger-/Klarspueler-Gebinde, Sauglanzen",
|
||||
ISO12100Section: "6.2.4",
|
||||
DefaultSeverity: 3, DefaultExposure: 3,
|
||||
ClarificationQuestionsDE: []string{
|
||||
"Liegt fuer alle eingesetzten Reiniger/Klarspueler/Entkalker ein aktuelles Sicherheitsdatenblatt (SDB) am Geraet vor?",
|
||||
"Ist ein geschlossenes Dosiersystem mit Sauglanzen vorhanden, sodass kein Umfuellen noetig ist?",
|
||||
},
|
||||
},
|
||||
{
|
||||
ID: "HP2204", NameDE: "Reizung/Veraetzung der Atemwege durch Reinigungs-Aerosole/Daempfe", NameEN: "Respiratory irritation from cleaning aerosols/vapours",
|
||||
RequiredComponentTags: []string{"dom_warewashing", "corrosive_chemical"},
|
||||
GeneratedHazardCats: []string{"chemical_risk"},
|
||||
SuggestedMeasureIDs: []string{"M2205", "M2203", "M2204"},
|
||||
Priority: 82,
|
||||
ApplicableLifecycles: []string{"normal_operation", "maintenance"},
|
||||
ScenarioDE: "Aerosole und Daempfe der Reinigungschemie (insbesondere beim Oeffnen kurz nach dem Spuelgang oder bei der Entkalkung mit Saeure) gelangen in die Atemzone und reizen Atemwege und Schleimhaeute.",
|
||||
TriggerDE: "Oeffnen bei laufender/heisser Chemie; Entkalkung mit Saeure; unzureichende Lueftung des Aufstellbereichs.",
|
||||
HarmDE: "Reizung von Atemwegen, Augen und Schleimhaeuten; bei Saeure-/Laugen-Vermischung gefaehrliche Gase.",
|
||||
AffectedDE: "Bedienpersonal, Reinigungspersonal",
|
||||
ZoneDE: "Atemzone vor der Spuelkammer, Aufstellbereich",
|
||||
ISO12100Section: "6.2.4",
|
||||
DefaultSeverity: 2, DefaultExposure: 2,
|
||||
ClarificationQuestionsDE: []string{
|
||||
"Ist der Aufstellbereich ausreichend be-/entlueftet (Kuechenlueftung)?",
|
||||
"Wird in der BA vor dem Vermischen von Reiniger und Entkalker/Saeure gewarnt?",
|
||||
},
|
||||
},
|
||||
{
|
||||
ID: "HP2205", NameDE: "Quetschen der Finger an der Tuer/Haube", NameEN: "Finger crushing at the door/hood",
|
||||
RequiredComponentTags: []string{"dom_warewashing", "access_door"},
|
||||
GeneratedHazardCats: []string{"mechanical_hazard"},
|
||||
SuggestedMeasureIDs: []string{"M2206", "M003", "M2208"},
|
||||
Priority: 78,
|
||||
ApplicableLifecycles: []string{"normal_operation"},
|
||||
ScenarioDE: "Beim Schliessen der Tuer bzw. Absenken der Haube werden Finger zwischen Tuer/Haube und Gehaeuse gequetscht.",
|
||||
TriggerDE: "Greifen in den Schliessbereich beim Schliessen; hohe Schliesskraft der Haube; scharfe Kanten.",
|
||||
HarmDE: "Quetschung und Prellung der Finger.",
|
||||
AffectedDE: "Bedienpersonal (Spuelkraft)",
|
||||
ZoneDE: "Tuer-/Haubenkante, Schliessbereich",
|
||||
ISO12100Section: "6.2.3",
|
||||
DefaultSeverity: 1, DefaultExposure: 3,
|
||||
},
|
||||
{
|
||||
ID: "HP2206", NameDE: "Ausrutschen auf nassem Boden (Wasseraustritt/Leckage)", NameEN: "Slipping on wet floor (water leakage)",
|
||||
RequiredComponentTags: []string{"dom_warewashing"},
|
||||
GeneratedHazardCats: []string{"mechanical_hazard"},
|
||||
SuggestedMeasureIDs: []string{"M2207", "M538", "M2208"},
|
||||
Priority: 76,
|
||||
ApplicableLifecycles: []string{"normal_operation", "cleaning", "maintenance"},
|
||||
ScenarioDE: "Aus der Spuelmaschine austretendes Wasser (Beschickung, Tuer oeffnen, Leckage, Tankwasserwechsel) macht den Boden im Aufstellbereich rutschig; der Bediener rutscht aus.",
|
||||
TriggerDE: "Wasseraustritt beim Oeffnen/Beschicken; undichter Ablauf; fehlender Bodenablauf.",
|
||||
HarmDE: "Sturz mit Prellungen, Knochenbruechen oder Kopfaufprall.",
|
||||
AffectedDE: "Bedienpersonal, Reinigungspersonal",
|
||||
ZoneDE: "Aufstell- und Bedienbereich der Spuelmaschine",
|
||||
ISO12100Section: "6.3.5.6",
|
||||
DefaultSeverity: 2, DefaultExposure: 3,
|
||||
},
|
||||
}
|
||||
}
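
The gating described in the doc comment (every pattern requires `dom_warewashing`) can be illustrated with a small stand-alone sketch. It assumes that a pattern fires only when all of its required tags are present; the real `PatternEngine.Match` logic is not shown in this diff, and `pattern`, `fires`, and `toSet` are hypothetical names used only here.

```go
package main

import "fmt"

// pattern is a trimmed-down, hypothetical stand-in for HazardPattern.
type pattern struct {
	ID           string
	RequiredTags []string
}

// fires reports whether every required tag is present in the tag set
// (assumed AND semantics; the real engine may apply further rules).
func fires(p pattern, tags map[string]bool) bool {
	for _, t := range p.RequiredTags {
		if !tags[t] {
			return false
		}
	}
	return true
}

// toSet builds a lookup set from a list of tags.
func toSet(tags ...string) map[string]bool {
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return set
}

func main() {
	hp := pattern{ID: "HP2201", RequiredTags: []string{"dom_warewashing", "high_temperature"}}

	// A dishwasher narrative emits the domain tag plus the functional tag.
	fmt.Println(fires(hp, toSet("dom_warewashing", "high_temperature"))) // true

	// Another hot machine lacks the domain tag, so the pattern stays silent.
	fmt.Println(fires(hp, toSet("high_temperature", "electrical_part"))) // false
}
```

Under this assumption the domain tag works as a gate: functional tags such as `high_temperature` are shared across many machine classes, so a pattern that also requires `dom_warewashing` cannot fire for a machine whose narrative never emitted that tag.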
|
||||
@@ -0,0 +1,112 @@
|
||||
package iace
|
||||
|
||||
import "testing"
|
||||
|
||||
// firedSet runs the engine for the given custom tags and returns the set of
|
||||
// fired pattern IDs.
|
||||
func firedSet(customTags []string) map[string]bool {
|
||||
engine := NewPatternEngine()
|
||||
out := engine.Match(MatchInput{CustomTags: customTags})
|
||||
fired := make(map[string]bool, len(out.MatchedPatterns))
|
||||
for _, m := range out.MatchedPatterns {
|
||||
fired[m.PatternID] = true
|
||||
}
|
||||
return fired
|
||||
}
|
||||
|
||||
// A warewashing narrative emits these capability + functional tags.
|
||||
var warewashingTags = []string{
|
||||
"dom_warewashing", "steam_emission", "hot_water", "high_temperature",
|
||||
"corrosive_chemical", "access_door", "rotating_part",
|
||||
}
|
||||
|
||||
func TestWarewashing_PatternsFireForDishwasher(t *testing.T) {
|
||||
fired := firedSet(warewashingTags)
|
||||
want := []string{"HP2200", "HP2201", "HP2202", "HP2203", "HP2204", "HP2205", "HP2206"}
|
||||
for _, id := range want {
|
||||
if !fired[id] {
|
||||
t.Errorf("expected warewashing pattern %s to fire for a dishwasher, but it did not", id)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestWarewashing_PatternsDoNotLeakIntoOtherMachines(t *testing.T) {
|
||||
// A machine with thermal + electrical + chemical capability but NOT a
|
||||
// dishwasher must never produce warewashing hazards (dom_warewashing gate).
|
||||
fired := firedSet([]string{"high_temperature", "electrical_part", "chemical_risk", "rotating_part", "moving_part"})
|
||||
for _, id := range []string{"HP2200", "HP2201", "HP2202", "HP2203", "HP2204", "HP2205", "HP2206"} {
|
||||
if fired[id] {
|
||||
t.Errorf("warewashing pattern %s leaked into a non-dishwasher machine", id)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestWarewashing_WeldingAndGlueDoNotLeakIntoDishwasher(t *testing.T) {
|
||||
// The gate-term additions must stop the welding/flame/glue burn patterns
|
||||
// from firing for a dishwasher (they previously leaked via high_temperature
|
||||
// / electrical_part). dom_welding/dom_flame/dom_glue are absent here.
|
||||
fired := firedSet(warewashingTags)
|
||||
leak := map[string]string{
|
||||
"HP530": "Lichtbogen-Verbrennung (Schweissen)",
|
||||
"HP532": "Schweissrauch",
|
||||
"HP533": "Brand durch Schweissfunken (Schweissen)",
|
||||
}
|
||||
for id, name := range leak {
|
||||
if fired[id] {
|
||||
t.Errorf("cross-domain pattern %s (%s) leaked into a dishwasher", id, name)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestWarewashing_MeasureIDsExist(t *testing.T) {
|
||||
lib := GetProtectiveMeasureLibrary()
|
||||
have := make(map[string]bool, len(lib))
|
||||
for _, m := range lib {
|
||||
have[m.ID] = true
|
||||
}
|
||||
for _, p := range GetWarewashingPatterns() {
|
||||
for _, mid := range p.SuggestedMeasureIDs {
|
||||
if !have[mid] {
|
||||
t.Errorf("pattern %s references measure %s which is not in the library", p.ID, mid)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestWarewashing_NarrativeEmitsTags(t *testing.T) {
|
||||
// Closes the loop: a realistic dishwasher description must emit the tags
|
||||
// the warewashing patterns gate on (otherwise the patterns are dead).
|
||||
narrative := "Gewerbliche Untertisch-Geschirrspuelmaschine mit Heisswasser-Boiler " +
|
||||
"und Nachspuelung ca. 85 Grad C, Spuelpumpe mit rotierenden Spuelfeldern, " +
|
||||
"Dampf-/Wrasenabgabe beim Oeffnen, Reiniger und Klarspueler ueber Dosiergeraet, " +
|
||||
"Tuer mit Sicherheitsschalter, Eingreifen in die Spuelkammer."
|
||||
res := ParseNarrative(narrative, "Gewerbliche Geschirrspuelmaschine")
|
||||
got := make(map[string]bool, len(res.CustomTags))
|
||||
for _, tag := range res.CustomTags {
|
||||
got[tag] = true
|
||||
}
|
||||
for _, want := range []string{"dom_warewashing", "steam_emission", "hot_water", "corrosive_chemical", "access_door", "rotating_part"} {
|
||||
if !got[want] {
|
||||
t.Errorf("narrative did not emit expected tag %q (got %v)", want, res.CustomTags)
|
||||
}
|
||||
}
|
||||
// And it must NOT emit any welding/flame/glue domain that would re-open leaks.
|
||||
for _, bad := range []string{"dom_welding", "dom_flame", "dom_glue"} {
|
||||
if got[bad] {
|
||||
t.Errorf("dishwasher narrative unexpectedly emitted cross-domain tag %q", bad)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestWarewashing_NewMeasuresPresent(t *testing.T) {
|
||||
lib := GetProtectiveMeasureLibrary()
|
||||
have := make(map[string]bool, len(lib))
|
||||
for _, m := range lib {
|
||||
have[m.ID] = true
|
||||
}
|
||||
for _, mid := range []string{"M2200", "M2201", "M2202", "M2203", "M2204", "M2205", "M2206", "M2207", "M2208"} {
|
||||
if !have[mid] {
|
||||
t.Errorf("expected warewashing measure %s to be registered in the library", mid)
|
||||
}
|
||||
}
|
||||
}
|
||||
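Reviewer sketch (not part of the change): the tests above rely on `dom_warewashing` acting as a gate. Assuming a pattern fires only when every one of its required tags is present (the real `PatternEngine` may apply further rules), the gating can be illustrated as follows. The `pattern` type, the `fires` and `tagSet` helpers, and the ID `HP-demo` are hypothetical and do not exist in the package.

```go
package main

import "fmt"

// pattern is a hypothetical, simplified stand-in for a hazard pattern.
// Assumption: a pattern fires only when all of its required tags are present.
type pattern struct {
	ID           string
	RequiredTags []string
}

// tagSet turns a tag slice into a set for constant-time lookups.
func tagSet(tags []string) map[string]bool {
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return set
}

// fires reports whether every required tag of p is contained in tags.
func fires(p pattern, tags map[string]bool) bool {
	for _, t := range p.RequiredTags {
		if !tags[t] {
			return false
		}
	}
	return true
}

func main() {
	// The domain tag sits next to the functional tag, so the functional tag
	// alone can never trigger the pattern.
	scald := pattern{ID: "HP-demo", RequiredTags: []string{"dom_warewashing", "hot_water"}}

	dishwasher := tagSet([]string{"dom_warewashing", "hot_water", "steam_emission"})
	otherMachine := tagSet([]string{"hot_water", "high_temperature"})

	fmt.Println("dishwasher:", fires(scald, dishwasher))
	fmt.Println("other machine:", fires(scald, otherMachine))
}
```

Under this assumption, a machine that shares `hot_water` but lacks the domain tag cannot trigger the pattern, which is the property `TestWarewashing_PatternsDoNotLeakIntoOtherMachines` checks.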
@@ -88,6 +88,21 @@ func GetKeywordDictionary() []KeywordEntry {
|
||||
{Keywords: []string{"folienwickler", "wickelmaschine", "konfektioniermaschine", "folienverpackung", "wellpappe"}, ExtraTags: []string{"dom_converting"}},
|
||||
{Keywords: []string{"bergbau", "untertage", "tunnelbau", "off-grid"}, ExtraTags: []string{"dom_remote"}},
|
||||
{Keywords: []string{"asbest", "asbestsanierung", "asbestexposition"}, ExtraTags: []string{"dom_asbestos"}},
|
||||
{Keywords: []string{"gasbrenner", "brennerbetrieb", "offene flamme", "flammhaert", "abflammen", "flammrichten"}, ExtraTags: []string{"dom_flame"}},
|
||||
{Keywords: []string{"heissleim", "heissleimanlage", "schmelzkleber", "schmelzklebstoff", "klebstoffschmelzer", "leimwerk"}, ExtraTags: []string{"dom_glue"}},
|
||||
|
||||
// ── Commercial dishwasher / warewashing ──────────────────────
|
||||
// dom_warewashing gates the warewashing-specific patterns
|
||||
// (hazard_patterns_warewashing.go) so they never leak into other
|
||||
// machine classes. The functional tags (hot_water, steam_emission,
|
||||
// corrosive_chemical, access_door) are the within-domain triggers.
|
||||
{Keywords: []string{"spuelmaschine", "geschirrspuelmaschine", "geschirrspueler", "haubenspuelmaschine", "untertischspuelmaschine", "korbspuelmaschine", "bandspuelmaschine", "glaeserspuelmaschine", "bistrospuelmaschine", "warewashing", "dishwasher"}, ExtraTags: []string{"dom_warewashing"}},
|
||||
{Keywords: []string{"heisswasser", "nachspuelung", "nachspueltemperatur", "spuelgang", "spuelzyklus", "thermostopp", "thermostop"}, ExtraTags: []string{"hot_water", "high_temperature"}},
|
||||
{Keywords: []string{"dampf", "wrasen", "schwaden", "brueden"}, ExtraTags: []string{"steam_emission", "high_temperature"}},
|
||||
{Keywords: []string{"boiler", "spuelboiler", "nachspuelboiler", "tankheiz", "boilerheiz"}, ComponentIDs: []string{"C094"}, ExtraTags: []string{"heating_element", "high_temperature"}},
|
||||
{Keywords: []string{"reiniger", "klarspueler", "spuelmittel", "reinigungsmittel", "reinigerkonzentrat", "spuelchemie", "dosiergeraet", "dosierpumpe", "sauglanze", "entkalker"}, ExtraTags: []string{"corrosive_chemical"}},
|
||||
{Keywords: []string{"spuelarm", "spuelfeld", "wascharm", "spruehfeld"}, ComponentIDs: []string{"C004"}, ExtraTags: []string{"rotating_part"}},
|
||||
{Keywords: []string{"spuelkammer", "spueltuer", "geraetetuer", "haubentuer", "klapptuer"}, ExtraTags: []string{"access_door"}},
|
||||
// Ghost closure (emit side): makes the 34 dead required tags
// emittable, each ONLY via domain-specific keywords -> the 120
// ghost patterns fire again, but only for their real machine (no
|
||||
|
||||
@@ -22,6 +22,7 @@ func GetProtectiveMeasureLibrary() []ProtectiveMeasureEntry {
|
||||
all = append(all, getGTBremseMeasures()...) // GT-Bremse-Coverage-Gaps (M483-M522)
|
||||
all = append(all, GetCRAMeasures()...) // CRA / DIN EN 40000-1-2 cyber-resilience (M540-M548)
|
||||
all = append(all, getLiftEndstopMeasures()...) // Lift/hoist endstop (M600-M604) — bridges OSHA MD library
|
||||
all = append(all, getWarewashingMeasures()...) // Commercial dishwasher (M2200-M2208) — scald/chemical/door/slip
|
||||
return all
|
||||
}
|
||||
|
||||
|
||||
@@ -0,0 +1,69 @@
|
||||
package iace
|
||||
|
||||
// getWarewashingMeasures returns protective measures for commercial warewashing
|
||||
// machines (gewerbliche Geschirrspuelmaschinen): hot-water/steam scalding,
|
||||
// hot surfaces, corrosive cleaning chemicals, door pinch and wet-floor slip.
|
||||
// They complement the generic thermal/mechanical/material measures with the
|
||||
// machine-specific controls a domain expert expects for this product class.
|
||||
//
|
||||
// M-ID range: M2200-M2208. Norm identifiers only (facts) — no norm text is
|
||||
// reproduced (DIN/Beuth license). Lead standard: EN 60335-2-58 (safety of
|
||||
// commercial electric dishwashing machines).
|
||||
func getWarewashingMeasures() []ProtectiveMeasureEntry {
|
||||
return []ProtectiveMeasureEntry{
|
||||
{ID: "M2200", ReductionType: "design", SubType: "interlock",
|
||||
Name: "Tuer-/Haubenverriegelung beendet Spuelgang vor dem Oeffnen",
|
||||
Description: "Die Tuer bzw. Haube ist so mit der Steuerung verriegelt, dass beim Oeffnen Spuelpumpe und Nachspuelung sofort abschalten und ein Oeffnen erst nach Programmende (bzw. nach Abbau des Restdampfs) freigegeben wird. Verhindert den Schwall aus Heisswasser/Wrasen und den Kontakt mit noch rotierenden Spuelfeldern.",
|
||||
HazardCategory: "thermal",
|
||||
Examples: []string{"Tuerkontaktschalter schaltet Pumpe + Heizung beim Oeffnen ab", "Rastposition mit Restdampf-Verzoegerung vor Freigabe"},
|
||||
NormReferences: []string{"EN 60335-2-58", "EN ISO 12100 — Inhaerent sichere Konstruktion"}},
|
||||
{ID: "M2201", ReductionType: "design", SubType: "thermal",
|
||||
Name: "Wrasen-/Dampfreduzierung (Kondensations- / Waermerueckgewinnungssystem)",
|
||||
Description: "Der beim Oeffnen austretende Wrasen wird durch ein Kondensations- bzw. Waermerueckgewinnungssystem reduziert, sodass beim Entnehmen kein gefaehrlicher Dampfschwall entsteht. Senkt zugleich die Restwaerme- und Feuchtebelastung am Arbeitsplatz.",
|
||||
HazardCategory: "thermal",
|
||||
Examples: []string{"Umluft-Waermerueckgewinnung reduziert austretenden Wrasen", "Kondensationshaube ueber der Spuelkammer"},
|
||||
NormReferences: []string{"EN 60335-2-58"}},
|
||||
{ID: "M2202", ReductionType: "protection", SubType: "monitoring",
|
||||
Name: "Thermostop / Temperaturueberwachung von Boiler und Tank",
|
||||
Description: "Boiler- und Tanktemperatur werden ueberwacht; ein Thermostop gibt den naechsten Schritt erst frei, wenn die Solltemperatur erreicht ist, und begrenzt die maximale Nachspueltemperatur. Schuetzt vor Verbruehung durch unkontrolliert heisses Nachspuelwasser.",
|
||||
HazardCategory: "thermal",
|
||||
Examples: []string{"Temperatursensor in Boiler und Tank mit Abschaltgrenze", "Thermostop-Funktion im Spuelprogramm"},
|
||||
NormReferences: []string{"EN 60335-2-58", "EN ISO 13732-1"}},
|
||||
{ID: "M2203", ReductionType: "design", SubType: "containment",
|
||||
Name: "Geschlossenes Dosiersystem mit Sauglanzen und Niveauueberwachung",
|
||||
Description: "Reiniger und Klarspueler werden ausschliesslich ueber ein geschlossenes Dosiersystem mit Sauglanzen aus dem Originalgebinde gefoerdert (Niveau-Ueberwachung statt Umfuellen). Direkter Haut-/Augenkontakt mit dem aetzenden Konzentrat beim Nachfuellen wird konstruktiv vermieden.",
|
||||
HazardCategory: "material_environmental",
|
||||
Examples: []string{"Sauglanze mit Leermeldung im Reiniger-Kanister", "Kein Umfuellen — Gebindewechsel ohne offenen Chemiekontakt"},
|
||||
NormReferences: []string{"EN 60335-2-58", "Verordnung (EG) Nr. 1272/2008 (CLP/GHS)"}},
|
||||
{ID: "M2204", ReductionType: "information", SubType: "ppe",
|
||||
Name: "PSA (Augen-/Hautschutz) + GHS-Kennzeichnung und Sicherheitsdatenblatt",
|
||||
Description: "Fuer Handhabung, Gebindewechsel und Entkalkung werden Augen- und Handschutz vorgeschrieben; Reiniger/Klarspueler/Entkalker sind GHS-gekennzeichnet und das Sicherheitsdatenblatt liegt am Geraet vor. Stellt die sichere Handhabung der aetzenden Konzentrate sicher.",
|
||||
HazardCategory: "material_environmental",
|
||||
Examples: []string{"Schutzbrille + chemikalienbestaendige Handschuhe bei Gebindewechsel", "GHS-Etikett und SDB im Chemikalienschrank am Geraet"},
|
||||
NormReferences: []string{"Verordnung (EG) Nr. 1272/2008 (CLP/GHS)", "TRGS 500"}},
|
||||
{ID: "M2205", ReductionType: "protection", SubType: "ventilation",
|
||||
Name: "Be-/Entlueftung bzw. geschlossene Haube gegen Chemie-Aerosole und Wrasen",
|
||||
Description: "Der Aufstellbereich ist ausreichend be- und entlueftet bzw. die Spuelkammer bleibt waehrend des Programms geschlossen, sodass Reinigungs-Aerosole und heisser Wrasen nicht in die Atemzone des Bedieners gelangen.",
|
||||
HazardCategory: "material_environmental",
|
||||
Examples: []string{"Kuechenlueftung ueber dem Spuelbereich", "Programmstart nur bei geschlossener Haube"},
|
||||
NormReferences: []string{"EN 60335-2-58", "TRGS 500"}},
|
||||
{ID: "M2206", ReductionType: "design", SubType: "geometry",
|
||||
Name: "Tuerkanten mit geringer Schliesskraft / Einklemmschutz",
|
||||
Description: "Die Tuer-/Haubenmechanik ist so gestaltet (gefuehrte Bewegung, begrenzte Schliesskraft, abgerundete Kanten), dass beim Schliessen keine Finger gequetscht werden.",
|
||||
HazardCategory: "mechanical",
|
||||
Examples: []string{"Gefuehrte Haube mit gedaempfter Schliessbewegung", "Abgerundete Tuerkanten ohne Quetschspalt"},
|
||||
NormReferences: []string{"EN 60335-2-58", "EN ISO 12100 — Geometrie und Anordnung"}},
|
||||
{ID: "M2207", ReductionType: "design", SubType: "environment",
|
||||
Name: "Rutschhemmender Bodenbelag + Ablauf/Leckagewanne im Aufstellbereich",
|
||||
Description: "Im Aufstell- und Bedienbereich der Spuelmaschine sorgen rutschhemmender Bodenbelag und ein definierter Ablauf bzw. eine Leckagewanne dafuer, dass austretendes Wasser nicht zur Sturzgefahr wird.",
|
||||
HazardCategory: "mechanical",
|
||||
Examples: []string{"Rutschhemmender Industrieboden (Bewertungsgruppe R11/R12)", "Bodenablauf bzw. Leckagewanne unter dem Geraet"},
|
||||
NormReferences: []string{"ASR A1.5/1,2", "DGUV Regel 108-003"}},
|
||||
{ID: "M2208", ReductionType: "information", SubType: "signage",
|
||||
Name: "Warnhinweis heisser Dampf/Heisswasser — Tuer erst nach Programmende oeffnen",
|
||||
Description: "Am Geraet und in der Betriebsanleitung wird vor heissem Dampf und Heisswasser gewarnt und das Oeffnen der Tuer erst nach Programmende mit vorsichtigem Anheben vorgeschrieben. Sprachneutrale Piktogramme ergaenzen den Hinweis.",
|
||||
HazardCategory: "general",
|
||||
Examples: []string{"Warnpiktogramm 'Heisser Dampf' an der Tuer", "BA-Hinweis 'Tuer nach Programmende langsam oeffnen'"},
|
||||
NormReferences: []string{"ISO 7010", "EN 60335-2-58"}},
|
||||
}
|
||||
}
|
||||
@@ -46,6 +46,20 @@ var domainGateTerms = map[string]string{
|
||||
"widerstandsschweiss": "dom_welding", "lichtbogenschweiss": "dom_welding",
|
||||
"schutzgasschweiss": "dom_welding", "punktschweiss": "dom_welding",
|
||||
"schweisselektrod": "dom_welding", "elektrodenspalt": "dom_welding",
|
||||
// Welding: surface forms that previously leaked ungated (e.g. into the
// thermal hazards of a dishwasher via high_temperature/electrical_part)
|
||||
"schweissarbeitsplatz": "dom_welding", "schweissfunke": "dom_welding",
|
||||
"schweisshelm": "dom_welding", "schweisserschutz": "dom_welding",
|
||||
"lichtbogenzone": "dom_welding", "lichtbogen-verbrennung": "dom_welding",
|
||||
"schweissrauch": "dom_welding", "schweissgeraet": "dom_welding",
|
||||
"schweisszone": "dom_welding", "schweissbrenner": "dom_welding",
|
||||
"schweissspritzer": "dom_welding", "schweissstrom": "dom_welding",
|
||||
// Open flame / burner (gas burner, flame hardening, flame treatment)
|
||||
"offene flamme": "dom_flame", "brennerbereich": "dom_flame",
|
||||
"flammenzone": "dom_flame", "gasbrenner": "dom_flame",
|
||||
// Hot glue / hot-melt adhesive
|
||||
"heissleimanlage": "dom_glue", "klebstoffschmelzer": "dom_glue",
|
||||
"heisskleber": "dom_glue", "schmelzkleber": "dom_glue",
|
||||
// Solar / PV
|
||||
"pv-modul": "dom_solar", "photovoltaik": "dom_solar", "pv-anlage": "dom_solar",
|
||||
"dc-steckverbindung": "dom_solar", "solarmodul": "dom_solar",
|
||||
|
||||
@@ -44,6 +44,7 @@ func collectAllPatterns() []HazardPattern {
|
||||
patterns = append(patterns, GetCRAPatterns()...) // HP1910-HP1918 CRA / DIN EN 40000-1-2 cyber-resilience spur
|
||||
patterns = append(patterns, GetSecondaryHarmDemoPatterns()...) // HP2000-HP2001 secondary harm chain demos (Cola splitter, Pharma)
|
||||
patterns = append(patterns, GetLiftEndstopPatterns()...) // HP2100-HP2102 lift body-part crush at endstops
|
||||
patterns = append(patterns, GetWarewashingPatterns()...) // HP2200-HP2206 commercial dishwasher (scald/chemical/door/slip)
|
||||
patterns = applyMachineTypeOverrides(patterns) // Fill MachineTypes on legacy patterns to prevent drift
|
||||
patterns = applyDomainGates(patterns) // Capability-domain gate: stop domain-specific patterns leaking cross-machine
|
||||
return patterns
|
||||
|
||||
@@ -0,0 +1,230 @@
|
||||
package ucca
|
||||
|
||||
import (
|
||||
"regexp"
|
||||
"strconv"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// authorityInfo is the normative classification of a search result, used internally
|
||||
// for re-ranking only (Phase 1 changes ordering, not the response contract).
|
||||
type authorityInfo struct {
|
||||
weight int // 100 binding, 80 technical_standard, 70 guidance, 0 foreign, 50 unknown
|
||||
sourceClass string // binding_law | technical_standard | supervisory_guidance | foreign_law | unknown
|
||||
jurisdiction string // DE | EU | CH
|
||||
}
|
||||
|
||||
var (
|
||||
guidanceMarkers = []string{
|
||||
"DSK", "EDPB", "BfDI", "BFDI", "BayLfD", "Baylfb", "ENISA", "BSI", "EUCC",
|
||||
"Standards Mapping", "Kpnr", "Orientierungshilfe", "Handreichung", "Beschluss",
|
||||
"Leitlinie", "Guidance", "Empfehlung", "OECD", "CISA", "Blue Guide",
|
||||
}
|
||||
// Technical standards / control frameworks (best-practice controls). Checked BEFORE
|
||||
// guidanceMarkers so a "BSI Grundschutz" chunk classifies as a standard, not BSI guidance.
|
||||
standardMarkers = []string{
|
||||
"NIST", "OWASP", "Grundschutz", "ISO 27001", "ISO/IEC 27001",
|
||||
"CSA CCM", "Cloud Controls Matrix", "CIS Benchmark", "CIS Control",
|
||||
}
|
||||
foreignMarkers = []string{"RevDSG", "fedlex", "(CH)"}
|
||||
deMarkers = []string{"BDSG", "DSK", "BfDI", "BFDI", "BayLfD", "Baylfb", "BSI"}
|
||||
normPattern = regexp.MustCompile(`(§|Art\.?)\s*\d`)
|
||||
bdsgParagraph = regexp.MustCompile(`§\s*(\d+)`)
|
||||
)
|
||||
|
||||
// classifyAuthority derives weight/source-class/jurisdiction. Explicitly tagged payload
|
||||
// values win; otherwise it falls back to the curated category + name markers, so the
|
||||
// not-yet-re-ingested (untagged) corpus is still classified deterministically.
|
||||
func classifyAuthority(r LegalSearchResult) authorityInfo {
|
||||
jur := r.Jurisdiction
|
||||
if jur == "" {
|
||||
jur = inferJurisdiction(r)
|
||||
}
|
||||
if r.SourceClass != "" {
|
||||
w := r.AuthorityWeight
|
||||
if w == 0 && r.SourceClass == "binding_law" {
|
||||
w = 100
|
||||
}
|
||||
return authorityInfo{weight: w, sourceClass: r.SourceClass, jurisdiction: jur}
|
||||
}
|
||||
if r.AuthorityWeight > 0 {
|
||||
return authorityInfo{weight: r.AuthorityWeight, sourceClass: sourceClassFromWeight(r.AuthorityWeight), jurisdiction: jur}
|
||||
}
|
||||
hay := r.ArticleLabel + " " + r.RegulationShort + " " + r.RegulationName + " " + r.RegulationCode
|
||||
switch {
|
||||
case containsAny(hay, foreignMarkers):
|
||||
return authorityInfo{weight: 0, sourceClass: "foreign_law", jurisdiction: "CH"}
|
||||
case r.Category == "standard" || containsAny(hay, standardMarkers):
|
||||
return authorityInfo{weight: 80, sourceClass: "technical_standard", jurisdiction: jur}
|
||||
case r.Category == "guidance" || containsAny(hay, guidanceMarkers):
|
||||
return authorityInfo{weight: 70, sourceClass: "supervisory_guidance", jurisdiction: jur}
|
||||
case r.Category == "regulation" || r.Category == "eu_recht" || normPattern.MatchString(r.ArticleLabel):
|
||||
return authorityInfo{weight: 100, sourceClass: "binding_law", jurisdiction: jur}
|
||||
default:
|
||||
return authorityInfo{weight: 50, sourceClass: "unknown", jurisdiction: jur}
|
||||
}
|
||||
}
|
||||
|
||||
func sourceClassFromWeight(w int) string {
|
||||
switch {
|
||||
case w >= 100:
|
||||
return "binding_law"
|
||||
case w >= 80:
|
||||
return "technical_standard"
|
||||
case w >= 70:
|
||||
return "supervisory_guidance"
|
||||
case w <= 0:
|
||||
return "foreign_law"
|
||||
default:
|
||||
return "unknown"
|
||||
}
|
||||
}
|
||||
|
||||
func inferJurisdiction(r LegalSearchResult) string {
|
||||
hay := r.ArticleLabel + " " + r.RegulationShort + " " + r.RegulationName
|
||||
switch {
|
||||
case containsAny(hay, foreignMarkers):
|
||||
return "CH"
|
||||
case strings.Contains(hay, "§") || containsAny(hay, deMarkers):
|
||||
return "DE"
|
||||
default:
|
||||
return "EU"
|
||||
}
|
||||
}
|
||||
|
||||
// --- Domain routing: separates same-authority but topically foreign norms ---
|
||||
|
||||
type domainDef struct {
|
||||
name string
|
||||
regs []string // regulation markers found in a chunk
|
||||
keywords []string // query keywords that signal this domain
|
||||
}
|
||||
|
||||
// Deterministic order (slice, not map) — important for stable classification + tests.
|
||||
var domains = []domainDef{
|
||||
{"data_protection",
|
||||
[]string{"DSGVO", "GDPR", "BDSG", "EDPB", "DSK", "BfDI", "BayLfD", "DPF"},
|
||||
[]string{"personenbezogen", "betroffene", "datenschutz", "datenschutzbeauftrag", "dsb",
|
||||
"datenpanne", "auskunft", "loesch", "lösch", "einwilligung", "besondere kategorien", "auftragsverarbeiter"}},
|
||||
{"cyber",
|
||||
[]string{"CRA", "NIS2", "NIS-2", "ENISA", "DORA", "EUCC"},
|
||||
[]string{"security update", "sicherheitsupdate", "sicherheitsaktualisierung", "schwachstelle", "sbom",
|
||||
"cybersicherheit", "konformit", "hersteller", "importeur", "haendler", "händler", "ikt-",
|
||||
"resilienz", "sicherheitsvorfall", "digitalen elementen"}},
|
||||
{"ai",
|
||||
[]string{"AI Act", "KI-VO", "KI-Verordnung"},
|
||||
[]string{"ki-system", "ki-modell", "hochrisiko", "kuenstliche intelligenz", "künstliche intelligenz"}},
|
||||
{"product_safety",
|
||||
[]string{"Maschinenverordnung", "MaschinenVO", "GPSR", "RED", "MDR"},
|
||||
nil},
|
||||
}
|
||||
|
||||
func queryDomain(query string) string {
|
||||
ql := strings.ToLower(query)
|
||||
for _, d := range domains {
|
||||
for _, kw := range d.keywords {
|
||||
if strings.Contains(ql, kw) {
|
||||
return d.name
|
||||
}
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func chunkDomain(r LegalSearchResult) string {
|
||||
hay := r.ArticleLabel + " " + r.RegulationShort + " " + r.RegulationCode + " " + r.RegulationName
|
||||
for _, d := range domains {
|
||||
if containsAny(hay, d.regs) {
|
||||
return d.name
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
// scopeClass flags special sub-regimes that must not win general questions —
|
||||
// BDSG Teil 3 (§§ 45-84) implements the JI directive (law enforcement), not the general regime.
|
||||
func scopeClass(r LegalSearchResult) string {
|
||||
hay := r.ArticleLabel + " " + r.RegulationShort
|
||||
if strings.Contains(hay, "BDSG") {
|
||||
if m := bdsgParagraph.FindStringSubmatch(hay); m != nil {
|
||||
if n, err := strconv.Atoi(m[1]); err == nil && n >= 45 && n <= 84 {
|
||||
return "law_enforcement"
|
||||
}
|
||||
}
|
||||
}
|
||||
return "general"
|
||||
}
|
||||
|
||||
// --- Topic ontology: amplifier only (boost), never an override ---
|
||||
|
||||
type topicDef struct {
|
||||
keywords []string
|
||||
norms []string // preferred canonical citation fragments
|
||||
}
|
||||
|
||||
var topics = []topicDef{
|
||||
{[]string{"datenschutzbeauftrag", "dsb", "benennung"}, []string{"Art. 37", "§ 38 BDSG"}},
|
||||
{[]string{"stellung des"}, []string{"Art. 38"}},
|
||||
{[]string{"aufgaben des"}, []string{"Art. 39"}},
|
||||
{[]string{"folgenabsch", "dsfa"}, []string{"Art. 35"}},
|
||||
{[]string{"besondere kategorien"}, []string{"Art. 9", "§ 22 BDSG"}},
|
||||
{[]string{"auskunft"}, []string{"Art. 15", "§ 34 BDSG"}},
|
||||
{[]string{"loesch", "lösch"}, []string{"Art. 17", "§ 35 BDSG"}},
|
||||
{[]string{"bussgeld", "geldbusse"}, []string{"Art. 83"}},
|
||||
{[]string{"security update", "sicherheitsupdate", "schwachstelle", "sbom", "cybersicherheitsanforderung"}, []string{"CRA Anhang I"}},
|
||||
{[]string{"meldepflicht", "sicherheitsvorfall"}, []string{"Art. 14 CRA"}},
|
||||
}
|
||||
|
||||
// resultMatchesTopic reports whether the result is a preferred norm of a topic the query hits.
|
||||
func resultMatchesTopic(query string, r LegalSearchResult) bool {
|
||||
ql := strings.ToLower(query)
|
||||
hay := r.ArticleLabel + " " + r.RegulationShort
|
||||
for _, t := range topics {
|
||||
if !containsAnyLower(ql, t.keywords) {
|
||||
continue
|
||||
}
|
||||
for _, n := range t.norms {
|
||||
if normMatches(hay, n) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// normMatches checks that norm appears in hay with a non-digit boundary, so "Art. 9"
// matches "Art. 9 DSGVO" but not "Art. 90". Only the first occurrence of norm in hay
// is examined.
|
||||
func normMatches(hay, norm string) bool {
|
||||
idx := strings.Index(hay, norm)
|
||||
if idx < 0 {
|
||||
return false
|
||||
}
|
||||
end := idx + len(norm)
|
||||
if end < len(hay) && hay[end] >= '0' && hay[end] <= '9' {
|
||||
return false
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
func queryIsForeign(query string) bool {
|
||||
return containsAnyLower(strings.ToLower(query),
|
||||
[]string{"schweiz", "revdsg", "fedlex", " ch ", "oesterreich", "österreich"})
|
||||
}
|
||||
|
||||
func containsAny(hay string, markers []string) bool {
|
||||
for _, m := range markers {
|
||||
if strings.Contains(hay, m) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func containsAnyLower(haylower string, markers []string) bool {
|
||||
for _, m := range markers {
|
||||
if strings.Contains(haylower, strings.ToLower(m)) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
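Reviewer sketch (not part of the change): `normMatches` above inspects only the first occurrence of the norm, so a label such as "Art. 90, Art. 9 DSGVO" would not match "Art. 9". If that case matters, one possible variant scans every occurrence. The name `normMatchesAny` is hypothetical and the function is an illustration, not the author's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// normMatchesAny is a hypothetical variant of the digit-boundary check:
// it accepts the norm if ANY occurrence in hay is not followed by a digit.
func normMatchesAny(hay, norm string) bool {
	if norm == "" {
		return false
	}
	for start := 0; start < len(hay); {
		idx := strings.Index(hay[start:], norm)
		if idx < 0 {
			return false
		}
		end := start + idx + len(norm)
		// A match counts when it ends the string or is followed by a non-digit.
		if end == len(hay) || hay[end] < '0' || hay[end] > '9' {
			return true
		}
		// Otherwise continue searching just past the start of this occurrence.
		start += idx + 1
	}
	return false
}

func main() {
	for _, hay := range []string{"Art. 9 DSGVO", "Art. 90 DSGVO", "Art. 90, Art. 9 DSGVO"} {
		fmt.Printf("%q -> %v\n", hay, normMatchesAny(hay, "Art. 9"))
	}
}
```

The boundary rule is the same as in the original function; only the search loop differs.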
@@ -0,0 +1,171 @@
|
||||
package ucca
|
||||
|
||||
import (
|
||||
"sort"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// Re-ranking coefficients (validated in the offline golden harness; Phase A, conservative).
const (
	authorityCoef     = 0.40 // * weight/100
	jurisdictionGain  = 0.05 // any result not penalized as foreign law
	foreignPenalty    = 0.60 // foreign law on a DE/EU question (demoted, not removed)
	unknownPenalty    = 0.08
	domainMatchGain   = 0.15
	offDomainPenalty  = 0.10 // off-domain binding (demoted, not removed)
	scopePenalty      = 0.25 // BDSG Teil 3 (law enforcement) on a general DP question
	topicGain         = 0.18 // amplifier only
	supersededPenalty = 0.50 // superseded legacy source (pre-eu-v1): demoted, not hidden
	intentLiftGain    = 0.10 // epsilon by which a qualifying interpretative source is lifted ABOVE the best binding
	intentLiftMargin  = 0.05 // ...only if that source is semantically competitive with binding
)
|
||||
|
||||
// guidanceIntentSignals mark a query that EXPLICITLY asks for an interpretation /
|
||||
// recommendation by a guidance body, rather than for the binding obligation. Only
|
||||
// then may a (semantically competitive) guideline outrank the binding norm.
|
||||
var guidanceIntentSignals = []string{
|
||||
"edpb", "europäischer datenschutzausschuss", "europaeischer datenschutzausschuss",
|
||||
"dsk", "enisa", "bsi", "leitlinie", "guideline", "orientierungshilfe",
|
||||
"auslegung", "empfiehlt", "empfehlung", "sagt", "laut",
|
||||
}
|
||||
|
||||
// controlIntentSignals mark a query that asks HOW to implement / which controls or
|
||||
// measures fit — rather than WHAT the binding obligation is. Only then may a
|
||||
// (semantically competitive) technical_standard outrank the binding norm.
|
||||
var controlIntentSignals = []string{
|
||||
"control", "controls", "maßnahme", "massnahme", "schutzmaßnahme",
|
||||
"best practice", "best-practice", "umsetzen", "implementier", "absicher",
|
||||
"härt", "haert", "hardening", "nist", "owasp", "grundschutz",
|
||||
"ccm", "iso 27001", "isms",
|
||||
}
|
||||
|
||||
func queryMatchesAny(query string, signals []string) bool {
|
||||
q := strings.ToLower(query)
|
||||
for _, sig := range signals {
|
||||
if strings.Contains(q, sig) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// queryWantsGuidance reports whether the query explicitly asks for guidance/interpretation.
|
||||
func queryWantsGuidance(query string) bool { return queryMatchesAny(query, guidanceIntentSignals) }
|
||||
|
||||
// queryWantsControls reports whether the query asks for implementation controls/measures.
|
||||
func queryWantsControls(query string) bool { return queryMatchesAny(query, controlIntentSignals) }
|
||||
|
||||
// bestBindingSemantic returns the highest RAW semantic score among binding-law
|
||||
// results (0 if none / no intent). Used as the guard threshold so an off-topic
|
||||
// interpretative source cannot ride the intent boost.
|
||||
func bestBindingSemantic(results []LegalSearchResult, wantsIntent bool) float64 {
|
||||
if !wantsIntent {
|
||||
return 0
|
||||
}
|
||||
best := 0.0
|
||||
for _, r := range results {
|
||||
if classifyAuthority(r).sourceClass == "binding_law" && r.Score > best {
|
||||
best = r.Score
|
||||
}
|
||||
}
|
||||
return best
|
||||
}
|
||||
|
||||
// authorityScore computes the normative relevance of a result for a query. It augments the
// semantic score with authority/jurisdiction/domain/scope/topic signals. Exposed for tests.
func authorityScore(query string, r LegalSearchResult, qDomain string, qForeign bool) float64 {
	info := classifyAuthority(r)
	score := r.Score + authorityCoef*float64(info.weight)/100.0

	if r.Superseded {
		// Legacy source (pre-eu-v1): default questions should see the eu-v1 norm. Demoted,
		// not removed, so it stays findable for history/transition questions.
		score -= supersededPenalty
	}

	if info.jurisdiction == "CH" && !qForeign {
		score -= foreignPenalty // foreign law on a DE/EU question: demoted, not deleted
	} else {
		score += jurisdictionGain
	}
	if info.sourceClass == "unknown" {
		score -= unknownPenalty
	}
	if qDomain != "" {
		switch cd := chunkDomain(r); {
		case cd == qDomain:
			score += domainMatchGain
		case cd != "":
			score -= offDomainPenalty // off-domain binding: demoted, not deleted
		}
	}
	if qDomain == "data_protection" && scopeClass(r) == "law_enforcement" {
		score -= scopePenalty
	}
	if resultMatchesTopic(query, r) {
		score += topicGain // amplifier, not an override
	}
	return score
}
|
||||
|
||||
// rerankByAuthority re-orders results so binding law from the matching jurisdiction/domain
|
||||
// ranks above guidance, foreign and off-domain law, WITHOUT dropping anything (guidance is
// kept as interpretation context). The computed score is written back to Score so downstream
// merges (e.g. the multi-collection advisor) preserve this order. Pure + deterministic.
func rerankByAuthority(query string, results []LegalSearchResult) []LegalSearchResult {
	if len(results) < 2 {
		return results
	}
	qDomain := queryDomain(query)
	qForeign := queryIsForeign(query)
	wantsGuidance := queryWantsGuidance(query)
	wantsControls := queryWantsControls(query)
	bestBindingSem := bestBindingSemantic(results, wantsGuidance)

	out := make([]LegalSearchResult, len(results))
	copy(out, results)
	for i := range out {
		out[i].Score = authorityScore(query, out[i], qDomain, qForeign)
	}
	// Explicit interpretation intent → a competitive guideline may outrank binding (lift
	// above the best binding FINAL). Explicit implementation intent → boost the CONTROL-POOL
	// (operational/procedural requirement, control standard, implementation guidance) over
	// the abstract obligation, soft-ordered by role. Norm questions (neither) stay untouched.
	if wantsGuidance {
		liftAboveBinding(out, results, bestBindingSem, "supervisory_guidance")
	}
	if wantsControls {
		applyControlRoles(out)
	}
	sort.SliceStable(out, func(a, b int) bool {
		return out[a].Score > out[b].Score
	})
	return out
}

// liftAboveBinding lifts a semantically-competitive interpretative source (the given
// sourceClass: supervisory_guidance or technical_standard) just ABOVE the best binding
// hit, ordered by semantic, so an EXPLICIT guidance/implementation question can return
// that source Top-1. A pure norm question (no intent → not called) keeps binding on top.
// Sources below the semantic margin are left untouched, so an off-topic source can never
// ride the override; the lift is from the binding FINAL score, so authority/topic/
// domain bonuses cannot edge it out.
func liftAboveBinding(out, raw []LegalSearchResult, bestBindingSem float64, sourceClass string) {
	bestBindingFinal := 0.0
	for i := range out {
		if classifyAuthority(out[i]).sourceClass == "binding_law" && out[i].Score > bestBindingFinal {
			bestBindingFinal = out[i].Score
		}
	}
	for i := range out {
		// Classify (not raw payload) so the untagged legacy corpus (e.g. NIST ingested
		// before source_class tagging) is still recognized as its interpretative class.
		if classifyAuthority(out[i]).sourceClass != sourceClass || raw[i].Score < bestBindingSem-intentLiftMargin {
			continue
		}
		lifted := bestBindingFinal + intentLiftGain + (raw[i].Score - bestBindingSem)
		if lifted > out[i].Score {
			out[i].Score = lifted
		}
	}
}
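A minimal sketch of the lift arithmetic in liftAboveBinding, reduced to one candidate. The constants `liftMargin` and `liftGain` are hypothetical stand-ins: the real `intentLiftMargin` and `intentLiftGain` are defined outside the lines shown here, and the scores in `main` are made-up inputs.

```go
package main

import "fmt"

// Hypothetical values; the real intentLiftMargin and intentLiftGain are
// defined elsewhere in the package and may differ.
const (
	liftMargin = 0.10
	liftGain   = 0.01
)

// liftedScore mirrors the per-candidate arithmetic of liftAboveBinding.
// rawSem is the candidate's semantic score, current its reranked score.
func liftedScore(rawSem, current, bestBindingSem, bestBindingFinal float64) float64 {
	if rawSem < bestBindingSem-liftMargin {
		return current // below the semantic margin: left untouched
	}
	lifted := bestBindingFinal + liftGain + (rawSem - bestBindingSem)
	if lifted > current {
		return lifted
	}
	return current
}

func main() {
	// Best binding hit: semantic 0.66, final (authority-adjusted) 0.90.
	competitive := liftedScore(0.72, 0.60, 0.66, 0.90)
	offTopic := liftedScore(0.40, 0.35, 0.66, 0.90)
	fmt.Println(competitive > 0.90) // competitive guidance ends above the binding hit
	fmt.Println(offTopic == 0.35)   // off-topic guidance keeps its score
}
```

Under these assumed constants both lines print `true`: the competitive source lands just above the best binding final score, while the source below the margin is not touched.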
@@ -0,0 +1,96 @@
package ucca

import "testing"

func bindingRes(label, reg, jur string, score float64) LegalSearchResult {
	return LegalSearchResult{ArticleLabel: label, RegulationShort: reg, SourceClass: "binding_law", AuthorityWeight: 100, Jurisdiction: jur, Score: score}
}

func guidanceRes(label, reg string, score float64) LegalSearchResult {
	return LegalSearchResult{ArticleLabel: label, RegulationShort: reg, SourceClass: "supervisory_guidance", AuthorityWeight: 70, Jurisdiction: "EU", Score: score}
}

func foreignRes(label string, score float64) LegalSearchResult {
	return LegalSearchResult{ArticleLabel: label, RegulationShort: "RevDSG", SourceClass: "foreign_law", AuthorityWeight: 0, Jurisdiction: "CH", Score: score}
}

// Acceptance criteria (Phase 1) expressed as ordering tests.
func TestRerankByAuthority_Acceptance(t *testing.T) {
	t.Run("guidance does not overtake semantically competitive binding", func(t *testing.T) {
		out := rerankByAuthority("Was gilt hier?", []LegalSearchResult{
			guidanceRes("ENISA Mapping", "ENISA", 0.72),
			bindingRes("CRA Anhang I", "CRA", "EU", 0.66),
		})
		if out[0].RegulationShort != "CRA" {
			t.Fatalf("binding must rank first over competitive guidance, got %q", out[0].RegulationShort)
		}
	})

	t.Run("foreign law demoted on DE/EU question but kept", func(t *testing.T) {
		in := []LegalSearchResult{foreignRes("RevDSG Art 1", 0.85), bindingRes("Art. 9 DSGVO", "DSGVO", "EU", 0.62)}
		out := rerankByAuthority("Welche Daten sind besonders geschuetzt?", in)
		if out[0].RegulationShort != "DSGVO" {
			t.Fatalf("binding EU must beat foreign on a DE/EU query, got %q", out[0].RegulationShort)
		}
		if len(out) != 2 {
			t.Fatalf("foreign law must be kept, got len=%d", len(out))
		}
	})

	t.Run("off-domain binding demoted but not removed", func(t *testing.T) {
		in := []LegalSearchResult{
			bindingRes("Art. 13 EU MDR", "MDR", "EU", 0.70),
			bindingRes("Art. 13 CRA", "CRA", "EU", 0.60),
		}
		out := rerankByAuthority("Welche Pflichten hat der Hersteller von Produkten mit digitalen Elementen?", in)
		if out[0].RegulationShort != "CRA" {
			t.Fatalf("on-domain CRA must beat off-domain MDR, got %q", out[0].RegulationShort)
		}
		if len(out) != 2 {
			t.Fatalf("off-domain MDR must be kept, got len=%d", len(out))
		}
	})

	t.Run("same-regime binding wins over guidance", func(t *testing.T) {
		out := rerankByAuthority("Was gilt hier?", []LegalSearchResult{
			bindingRes("Art. 13 CRA", "CRA", "EU", 0.70),
			guidanceRes("ENISA Mapping", "ENISA", 0.60),
		})
		if out[0].RegulationShort != "CRA" {
			t.Fatalf("binding must win, got %q", out[0].RegulationShort)
		}
	})

	t.Run("BDSG Teil 3 demoted below DSGVO on general DP question", func(t *testing.T) {
		in := []LegalSearchResult{
			bindingRes("§ 48 BDSG", "BDSG", "DE", 0.70), // Teil 3 (law enforcement)
			bindingRes("Art. 9 DSGVO", "DSGVO", "EU", 0.62),
		}
		out := rerankByAuthority("Was sind besondere Kategorien personenbezogener Daten?", in)
		if out[0].RegulationShort != "DSGVO" {
			t.Fatalf("DSGVO must beat BDSG Teil 3 on a general DP question, got %q", out[0].RegulationShort)
		}
	})

	t.Run("nothing is dropped and topic amplifies", func(t *testing.T) {
		in := []LegalSearchResult{
			guidanceRes("ENISA", "ENISA", 0.72),
			bindingRes("CRA Anhang I", "CRA", "EU", 0.66),
			foreignRes("RevDSG", 0.5),
		}
		out := rerankByAuthority("Anforderungen an Security Updates?", in)
		if len(out) != len(in) {
			t.Fatalf("rerank must preserve all results, got %d want %d", len(out), len(in))
		}
		if out[0].ArticleLabel != "CRA Anhang I" {
			t.Fatalf("topic+authority must lift CRA Anhang I to top, got %q", out[0].ArticleLabel)
		}
	})

	t.Run("single result returned unchanged", func(t *testing.T) {
		in := []LegalSearchResult{bindingRes("Art. 1 CRA", "CRA", "EU", 0.5)}
		if out := rerankByAuthority("x", in); len(out) != 1 {
			t.Fatalf("len=%d", len(out))
		}
	})
}
@@ -0,0 +1,129 @@
package ucca

import "testing"

func TestClassifyAuthority(t *testing.T) {
	tests := []struct {
		name    string
		result  LegalSearchResult
		wantW   int
		wantSC  string
		wantJur string
	}{
		{"tagged binding EU", LegalSearchResult{AuthorityWeight: 100, SourceClass: "binding_law", Jurisdiction: "EU"}, 100, "binding_law", "EU"},
		{"tagged guidance DE", LegalSearchResult{AuthorityWeight: 70, SourceClass: "supervisory_guidance", Jurisdiction: "DE"}, 70, "supervisory_guidance", "DE"},
		{"tagged foreign CH", LegalSearchResult{AuthorityWeight: 0, SourceClass: "foreign_law", Jurisdiction: "CH"}, 0, "foreign_law", "CH"},
		{"untagged ENISA guidance", LegalSearchResult{RegulationShort: "ENISA", ArticleLabel: "ENISA CRA Standards Mapping"}, 70, "supervisory_guidance", "EU"},
		{"untagged NIST standard", LegalSearchResult{RegulationShort: "NIST SP 800-82r3", ArticleLabel: "AU-8"}, 80, "technical_standard", "EU"},
		{"BSI Grundschutz standard beats BSI guidance", LegalSearchResult{RegulationShort: "BSI Grundschutz", ArticleLabel: "BSI Grundschutz Baustein"}, 80, "technical_standard", "DE"},
		{"weight-only 85 TRGS standard", LegalSearchResult{AuthorityWeight: 85, RegulationShort: "TRGS 529"}, 85, "technical_standard", "EU"},
		{"tagged technical_standard", LegalSearchResult{AuthorityWeight: 80, SourceClass: "technical_standard", Jurisdiction: "EU"}, 80, "technical_standard", "EU"},
		{"untagged CRA binding", LegalSearchResult{RegulationShort: "CRA", ArticleLabel: "Art. 13 CRA", Category: "regulation"}, 100, "binding_law", "EU"},
		{"untagged BDSG binding DE", LegalSearchResult{RegulationShort: "BDSG", ArticleLabel: "§ 38 BDSG"}, 100, "binding_law", "DE"},
		{"untagged RevDSG foreign", LegalSearchResult{RegulationShort: "RevDSG", ArticleLabel: "RevDSG (CH)"}, 0, "foreign_law", "CH"},
		{"untagged unknown", LegalSearchResult{RegulationShort: "", ArticleLabel: ""}, 50, "unknown", "EU"},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := classifyAuthority(tt.result)
			if got.weight != tt.wantW || got.sourceClass != tt.wantSC || got.jurisdiction != tt.wantJur {
				t.Errorf("classifyAuthority() = {%d %s %s}, want {%d %s %s}",
					got.weight, got.sourceClass, got.jurisdiction, tt.wantW, tt.wantSC, tt.wantJur)
			}
		})
	}
}

func TestQueryDomain(t *testing.T) {
	tests := []struct{ q, want string }{
		{"Welche Anforderungen an Security Updates?", "cyber"},
		{"Wer braucht einen Datenschutzbeauftragten?", "data_protection"},
		{"Was sind besondere Kategorien personenbezogener Daten?", "data_protection"},
		{"Welche Pflichten beim Hochrisiko-KI-System?", "ai"},
		{"Wie spaet ist es?", ""},
	}
	for _, tt := range tests {
		if got := queryDomain(tt.q); got != tt.want {
			t.Errorf("queryDomain(%q) = %q, want %q", tt.q, got, tt.want)
		}
	}
}

func TestChunkDomain(t *testing.T) {
	tests := []struct {
		name string
		r    LegalSearchResult
		want string
	}{
		{"CRA cyber", LegalSearchResult{RegulationShort: "CRA", ArticleLabel: "Art. 13 CRA"}, "cyber"},
		{"DSGVO dp", LegalSearchResult{RegulationShort: "DSGVO", ArticleLabel: "Art. 9 DSGVO"}, "data_protection"},
		{"AI Act ai", LegalSearchResult{RegulationShort: "AI Act", ArticleLabel: "Art. 10 AI Act"}, "ai"},
		{"MDR product", LegalSearchResult{RegulationShort: "MDR", ArticleLabel: "Art. 13 EU MDR"}, "product_safety"},
		{"unknown", LegalSearchResult{RegulationShort: "XYZ"}, ""},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := chunkDomain(tt.r); got != tt.want {
				t.Errorf("chunkDomain() = %q, want %q", got, tt.want)
			}
		})
	}
}

func TestScopeClass(t *testing.T) {
	tests := []struct {
		name string
		r    LegalSearchResult
		want string
	}{
		{"BDSG Teil 3 law enforcement", LegalSearchResult{RegulationShort: "BDSG", ArticleLabel: "§ 48 BDSG"}, "law_enforcement"},
		{"BDSG general part", LegalSearchResult{RegulationShort: "BDSG", ArticleLabel: "§ 38 BDSG"}, "general"},
		{"DSGVO general", LegalSearchResult{RegulationShort: "DSGVO", ArticleLabel: "Art. 9 DSGVO"}, "general"},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := scopeClass(tt.r); got != tt.want {
				t.Errorf("scopeClass() = %q, want %q", got, tt.want)
			}
		})
	}
}

func TestResultMatchesTopic(t *testing.T) {
	tests := []struct {
		name  string
		query string
		r     LegalSearchResult
		want  bool
	}{
		{"besondere Kategorien -> Art 9 match", "Was sind besondere Kategorien?", LegalSearchResult{ArticleLabel: "Art. 9 DSGVO"}, true},
		{"besondere Kategorien -> Art 90 no match", "Was sind besondere Kategorien?", LegalSearchResult{ArticleLabel: "Art. 90 DSGVO"}, false},
		{"security updates -> CRA Anhang I", "Anforderungen an Security Updates?", LegalSearchResult{ArticleLabel: "CRA Anhang I"}, true},
		{"no topic keyword", "Wie spaet ist es?", LegalSearchResult{ArticleLabel: "Art. 9 DSGVO"}, false},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := resultMatchesTopic(tt.query, tt.r); got != tt.want {
				t.Errorf("resultMatchesTopic() = %v, want %v", got, tt.want)
			}
		})
	}
}

func TestNormMatches(t *testing.T) {
	tests := []struct {
		hay, norm string
		want      bool
	}{
		{"Art. 9 DSGVO", "Art. 9", true},
		{"Art. 90 DSGVO", "Art. 9", false},
		{"§ 38 BDSG", "§ 38 BDSG", true},
		{"§ 380 BDSG", "§ 38", false},
		{"Art. 14 CRA", "Art. 14 CRA", true},
	}
	for _, tt := range tests {
		if got := normMatches(tt.hay, tt.norm); got != tt.want {
			t.Errorf("normMatches(%q,%q) = %v, want %v", tt.hay, tt.norm, got, tt.want)
		}
	}
}
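The TestNormMatches cases above pin down a boundary rule: "Art. 9" must match "Art. 9 DSGVO" but not "Art. 90 DSGVO". The implementation of `normMatches` is not part of this diff section, so the following is only one possible way to satisfy those cases. `normMatchesSketch` is a hypothetical name, and the "no digit directly after the match" rule is an assumption derived from the test table.

```go
package main

import (
	"fmt"
	"strings"
)

// normMatchesSketch is a hypothetical stand-in for normMatches: it reports
// whether norm occurs in hay without a digit directly after the occurrence.
func normMatchesSketch(hay, norm string) bool {
	if norm == "" {
		return false
	}
	for start := 0; ; {
		i := strings.Index(hay[start:], norm)
		if i < 0 {
			return false
		}
		end := start + i + len(norm)
		// Accept the hit when it ends the string or is not followed by a digit.
		if end == len(hay) || hay[end] < '0' || hay[end] > '9' {
			return true
		}
		start += i + 1 // keep searching after a rejected occurrence
	}
}

func main() {
	fmt.Println(normMatchesSketch("Art. 9 DSGVO", "Art. 9"))  // article 9 itself
	fmt.Println(normMatchesSketch("Art. 90 DSGVO", "Art. 9")) // article 90 must not count
}
```

With this rule the first call reports a match and the second does not, which is the distinction the "Art 9" versus "Art 90" test rows rely on.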
@@ -0,0 +1,123 @@
package ucca

import "strings"

// source_role is the FUNCTIONAL role of a chunk: WHAT must be done (obligation),
// HOW to implement it (operational/procedural requirement, control standard,
// implementation guidance), or how to READ the norm (interpretation/definition).
// It is ORTHOGONAL to source_class (legal authority): source_class decides RANK,
// source_role decides CONTROL-POOL membership for implementation questions.
// Derived deterministically from markers, so the untagged corpus needs no re-tag.
const (
	roleObligation      = "obligation"              // the abstract duty (the WHAT)
	roleOperationalReq  = "operational_requirement" // concrete binding requirement (CRA Annex I)
	roleProceduralReq   = "procedural_requirement"  // a process: notification/registration/DPIA/incident report
	roleControlStandard = "control_standard"        // best-practice control catalog (NIST/OWASP/ISO/CIS)
	roleImplGuidance    = "implementation_guidance" // advisory how-to (ENISA good practices, BSI)
	roleInterpretation  = "interpretation"          // interprets the norm's MEANING (EDPB guideline)
	roleDefinition      = "definition"              // definitions / scope / recitals
)

var (
	proceduralMarkers = []string{
		"Meldung", "Meldepflicht", "Notification", "Notifizierung", "Registrierung",
		"Registration", "Konformitätserklärung", "Declaration of Conformity", "Incident",
		"Berichterstattung", "Reporting", "Folgenabschätzung", "DSFA", "DPIA", "Anzeigepflicht",
	}
	annexMarkers       = []string{"Anhang", "Annex", "Appendix", "Anlage"}
	operationalMarkers = []string{"Anforderung", "Requirement", "essential", "wesentliche"}
	implMarkers        = []string{
		"Good Practice", "Best Practice", "Standards Mapping", "Umsetzung", "Implementation",
		"Handreichung", "Maßnahmenkatalog", "ICS", "SCADA", "Technical Guideline", "TIG",
	}
	definitionMarkers = []string{"Begriffsbestimmung", "Definition"}
)

// classifyRole derives the functional source_role from chunk metadata + the authority
// class. technical_standard is always a control_standard; guidance splits into
// implementation_guidance (how-to) vs interpretation (meaning); binding splits into
// procedural / operational requirement / definition / plain obligation.
func classifyRole(r LegalSearchResult) string {
	cls := classifyAuthority(r).sourceClass
	hay := strings.ToLower(r.ArticleLabel + " " + r.RegulationShort + " " + r.RegulationName + " " + r.Article)
	switch {
	case r.IsRecital:
		return roleDefinition
	case cls == "technical_standard":
		return roleControlStandard
	case cls == "supervisory_guidance":
		if containsAnyLower(hay, implMarkers) {
			return roleImplGuidance
		}
		return roleInterpretation
	case cls == "binding_law":
		switch {
		case containsAnyLower(hay, definitionMarkers):
			return roleDefinition
		case containsAnyLower(hay, proceduralMarkers):
			return roleProceduralReq
		case containsAnyLower(hay, annexMarkers) || containsAnyLower(hay, operationalMarkers):
			return roleOperationalReq
		default:
			return roleObligation
		}
	default:
		return roleObligation
	}
}

// controlRoleBonus is the soft intra-pool preference (User 2026-06-24):
// operational_requirement > procedural_requirement > control_standard > implementation_guidance.
var controlRoleBonus = map[string]float64{
	roleOperationalReq:  0.100,
	roleProceduralReq:   0.075,
	roleControlStandard: 0.050,
	roleImplGuidance:    0.000,
}

// controlPoolGain lifts EVERY control-pool role over the non-control roles (obligation/
// interpretation/definition) on an implementation question, so the binding abstract
// obligation does not dominate by authority alone. The obligation is not removed; it
// stays visible as "Rechtsgrundlage" context below the recommended measures.
const controlPoolGain = 0.15

// applyControlRoles boosts the control-pool (the four implementation roles) for an
// EXPLICIT implementation question, soft-ordered op_req > procedural > standard > guidance.
// Replaces the earlier "lift technical_standard above binding": controls are not only
// technical_standard, and the binding operational_requirement (e.g. CRA Annex I) should win.
func applyControlRoles(out []LegalSearchResult) {
	for i := range out {
		if bonus, ok := controlRoleBonus[classifyRole(out[i])]; ok {
			out[i].Score += controlPoolGain + bonus
		}
	}
}

// isControlPoolRole reports whether a role belongs to the control-pool surfaced on
// implementation questions (the four "how to implement" roles).
func isControlPoolRole(role string) bool {
	switch role {
	case roleOperationalReq, roleProceduralReq, roleControlStandard, roleImplGuidance:
		return true
	}
	return false
}

// controlRoleOf classifies a raw Qdrant payload into a source_role, so searchControls can
// filter its deep dense pull to the control-pool BEFORE hits are mapped to LegalSearchResult.
func controlRoleOf(payload map[string]interface{}) string {
	article := getString(payload, "article")
	if article == "" {
		article = getString(payload, "section")
	}
	return classifyRole(LegalSearchResult{
		RegulationShort: getString(payload, "regulation_short"),
		RegulationName:  getString(payload, "regulation_name_de"),
		ArticleLabel:    getString(payload, "article_label"),
		Article:         article,
		Category:        getString(payload, "category"),
		SourceClass:     getString(payload, "source_class"),
		AuthorityWeight: getInt(payload, "authority_weight"),
		IsRecital:       getBool(payload, "is_recital"),
	})
}
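To make the effect of controlPoolGain and controlRoleBonus concrete, here is a minimal standalone sketch. The gain and bonus values are copied from the file above. The `hit` type, the pre-assigned roles and the input scores are illustrative assumptions, and the sort is folded into `boost` here although the source sorts later in rerankByAuthority.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a simplified, hypothetical stand-in for LegalSearchResult with the
// role already classified.
type hit struct {
	label string
	role  string
	score float64
}

const poolGain = 0.15 // controlPoolGain in the source

// roleBonus repeats the controlRoleBonus values from the source.
var roleBonus = map[string]float64{
	"operational_requirement": 0.100,
	"procedural_requirement":  0.075,
	"control_standard":        0.050,
	"implementation_guidance": 0.000,
}

// boost adds poolGain plus the role bonus to every control-pool hit and then
// sorts by descending score. Non-control roles keep their score.
func boost(hits []hit) {
	for i := range hits {
		if b, ok := roleBonus[hits[i].role]; ok {
			hits[i].score += poolGain + b
		}
	}
	sort.SliceStable(hits, func(a, b int) bool { return hits[a].score > hits[b].score })
}

func main() {
	hits := []hit{
		{"Art. 5 DORA", "obligation", 0.80},
		{"NIST SP 800-53", "control_standard", 0.62},
		{"CRA Anhang I", "operational_requirement", 0.60},
	}
	boost(hits)
	for _, h := range hits {
		fmt.Println(h.label)
	}
}
```

With these made-up scores the abstract obligation starts on top but ends last: the operational requirement gains 0.25 and the control standard 0.20, while the obligation stays at 0.80 and remains in the list as context.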
@@ -0,0 +1,79 @@
package ucca

import "testing"

func TestClassifyRole(t *testing.T) {
	tests := []struct {
		name string
		r    LegalSearchResult
		want string
	}{
		{"NIST -> control_standard", LegalSearchResult{RegulationShort: "NIST SP 800-82r3", ArticleLabel: "AU-8"}, roleControlStandard},
		{"OWASP -> control_standard", LegalSearchResult{RegulationShort: "OWASP ASVS"}, roleControlStandard},
		{"CRA Anhang -> operational_requirement", LegalSearchResult{RegulationShort: "CRA", ArticleLabel: "CRA Anhang I", Category: "regulation"}, roleOperationalReq},
		{"CRA Meldepflicht -> procedural_requirement", LegalSearchResult{RegulationShort: "CRA", ArticleLabel: "Art. 14 CRA Meldepflicht", Category: "regulation"}, roleProceduralReq},
		{"ENISA Good Practices -> implementation_guidance", LegalSearchResult{RegulationShort: "ENISA Supply Chain Good Practices"}, roleImplGuidance},
		{"EDPB Leitlinie -> interpretation", LegalSearchResult{RegulationShort: "EDPB DPO", ArticleLabel: "WP243 Leitlinien Datenschutzbeauftragte"}, roleInterpretation},
		{"DORA article -> obligation", LegalSearchResult{RegulationShort: "DORA", ArticleLabel: "Art. 5 DORA", Category: "regulation"}, roleObligation},
		{"DSGVO Begriffsbestimmungen -> definition", LegalSearchResult{RegulationShort: "DSGVO", ArticleLabel: "Art. 4 DSGVO Begriffsbestimmungen", Category: "regulation"}, roleDefinition},
		{"recital -> definition", LegalSearchResult{RegulationShort: "CRA", IsRecital: true}, roleDefinition},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := classifyRole(tt.r); got != tt.want {
				t.Errorf("classifyRole() = %q, want %q", got, tt.want)
			}
		})
	}
}

func TestApplyControlRoles_PoolPreference(t *testing.T) {
	// op_req > procedural > control_standard > impl_guidance; non-control roles get no boost.
	roles := []struct {
		r        LegalSearchResult
		wantGain float64
	}{
		{LegalSearchResult{ArticleLabel: "CRA Anhang I", Category: "regulation"}, controlPoolGain + 0.100},
		{LegalSearchResult{ArticleLabel: "Art. 14 CRA Meldepflicht", Category: "regulation"}, controlPoolGain + 0.075},
		{LegalSearchResult{RegulationShort: "NIST SP 800-53"}, controlPoolGain + 0.050},
		{LegalSearchResult{RegulationShort: "ENISA Good Practices"}, controlPoolGain + 0.000},
		{LegalSearchResult{ArticleLabel: "Art. 5 DORA", Category: "regulation"}, 0.0}, // obligation: no boost
	}
	for _, rc := range roles {
		out := []LegalSearchResult{rc.r}
		out[0].Score = 1.0
		applyControlRoles(out)
		if got := out[0].Score - 1.0; got < rc.wantGain-1e-9 || got > rc.wantGain+1e-9 {
			t.Errorf("role %q: gain %.3f, want %.3f", classifyRole(rc.r), got, rc.wantGain)
		}
	}
}

func TestIsControlPoolRole(t *testing.T) {
	for _, r := range []string{roleOperationalReq, roleProceduralReq, roleControlStandard, roleImplGuidance} {
		if !isControlPoolRole(r) {
			t.Errorf("%q should be in the control-pool", r)
		}
	}
	for _, r := range []string{roleObligation, roleInterpretation, roleDefinition} {
		if isControlPoolRole(r) {
			t.Errorf("%q should NOT be in the control-pool", r)
		}
	}
}

func TestControlRoleOf_Payload(t *testing.T) {
	// searchControls filters its deep dense pull by classifying the raw Qdrant payload.
	nist := map[string]interface{}{"regulation_short": "NIST SP 800-82r3", "article": "AU-8"}
	if got := controlRoleOf(nist); got != roleControlStandard {
		t.Errorf("untagged NIST payload role = %q, want control_standard", got)
	}
	craAnnex := map[string]interface{}{"regulation_short": "CRA", "article": "Anhang-I", "category": "regulation"}
	if got := controlRoleOf(craAnnex); got != roleOperationalReq {
		t.Errorf("CRA Anhang payload role = %q, want operational_requirement", got)
	}
	dora := map[string]interface{}{"regulation_short": "DORA", "article_label": "Art. 5 DORA", "category": "regulation"}
	if got := controlRoleOf(dora); isControlPoolRole(got) {
		t.Errorf("DORA abstract article role = %q must be excluded from the control-pool", got)
	}
}
@@ -0,0 +1,167 @@
package ucca

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"sort"
)

// LegalActStructure is the composition of one ingested eur-lex legal act: how
// many distinct articles, annexes and recitals it consists of (plus the raw
// chunk count). Backs the coverage page so the ingested corpus is not a black
// box: a developer SEES what each act actually contains, not only its name.
type LegalActStructure struct {
	RegulationShort string `json:"regulation_short"`
	RegulationName  string `json:"regulation_name"`
	Articles        int    `json:"articles"`
	Annexes         int    `json:"annexes"`
	Recitals        int    `json:"recitals"`
	Chunks          int    `json:"chunks"`
}

const eurlexSource = "eur-lex.europa.eu"

// legalStructureCollections hold the clean eur-lex legal corpus (chunks tagged
// with chunk_scope = section | annex | recital).
var legalStructureCollections = []string{"bp_compliance_ce", "bp_compliance_datenschutz"}

// chunkScopeBucket maps a Qdrant chunk_scope to the structure field it feeds.
var chunkScopeBucket = map[string]string{"section": "articles", "annex": "annexes", "recital": "recitals"}

// CorpusStructure scrolls the eur-lex legal corpus across the legal collections
// and aggregates the per-act composition. The source filter keeps it to a few
// hundred points regardless of total corpus size. Read-only; a collection that
// fails to scroll is skipped rather than failing the whole call, so the
// returned error is always nil.
func (c *LegalRAGClient) CorpusStructure(ctx context.Context) ([]LegalActStructure, error) {
	var all []qdrantScrollPoint
	for _, coll := range legalStructureCollections {
		pts, err := c.scrollLegalCorpus(ctx, coll)
		if err != nil {
			continue
		}
		all = append(all, pts...)
	}
	return aggregateStructure(all), nil
}

// aggregateStructure counts distinct `article` payload values per (regulation, scope)
// and the raw chunk count per regulation. Pure → unit-testable without a vector store.
func aggregateStructure(points []qdrantScrollPoint) []LegalActStructure {
	distinct := map[string]map[string]map[string]struct{}{}
	names := map[string]string{}
	chunks := map[string]int{}
	order := []string{}

	for _, pt := range points {
		reg := getString(pt.Payload, "regulation_short")
		if reg == "" {
			continue
		}
		if _, seen := names[reg]; !seen {
			name := getString(pt.Payload, "regulation_name_de")
			if name == "" {
				name = reg
			}
			names[reg] = name
			distinct[reg] = map[string]map[string]struct{}{}
			order = append(order, reg)
		}
		chunks[reg]++
		bucket, ok := chunkScopeBucket[getString(pt.Payload, "chunk_scope")]
		article := getString(pt.Payload, "article")
		if !ok || article == "" {
			continue
		}
		if distinct[reg][bucket] == nil {
			distinct[reg][bucket] = map[string]struct{}{}
		}
		distinct[reg][bucket][article] = struct{}{}
	}

	out := make([]LegalActStructure, 0, len(order))
	for _, reg := range order {
		out = append(out, LegalActStructure{
			RegulationShort: reg,
			RegulationName:  names[reg],
			Articles:        len(distinct[reg]["articles"]),
			Annexes:         len(distinct[reg]["annexes"]),
			Recitals:        len(distinct[reg]["recitals"]),
			Chunks:          chunks[reg],
		})
	}
	sort.SliceStable(out, func(i, j int) bool {
		if out[i].Articles != out[j].Articles {
			return out[i].Articles > out[j].Articles
		}
		return out[i].RegulationShort < out[j].RegulationShort
	})
	return out
}

// scrollLegalCorpus pages through one collection, filtered to the eur-lex legal
// corpus, returning minimal-payload points (no text/vectors).
func (c *LegalRAGClient) scrollLegalCorpus(ctx context.Context, collection string) ([]qdrantScrollPoint, error) {
	var all []qdrantScrollPoint
	var offset interface{}
	for {
		points, next, err := c.scrollLegalPage(ctx, collection, offset)
		if err != nil {
			return nil, err
		}
		all = append(all, points...)
		if next == nil {
			break
		}
		offset = next
	}
	return all, nil
}

// scrollLegalPage fetches one page of the filtered scroll and returns the
// points plus the next-page offset (nil when exhausted).
func (c *LegalRAGClient) scrollLegalPage(ctx context.Context, collection string, offset interface{}) ([]qdrantScrollPoint, interface{}, error) {
	reqBody := map[string]interface{}{
		"limit":        500,
		"with_payload": map[string]interface{}{"include": []string{"regulation_short", "regulation_name_de", "chunk_scope", "article"}},
		"with_vectors": false,
		"filter": map[string]interface{}{
			"must": []map[string]interface{}{
				{"key": "source", "match": map[string]interface{}{"value": eurlexSource}},
			},
		},
	}
	if offset != nil {
		reqBody["offset"] = offset
	}
	jsonBody, err := json.Marshal(reqBody)
	if err != nil {
		return nil, nil, err
	}
	url := fmt.Sprintf("%s/collections/%s/points/scroll", c.qdrantURL, collection)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(jsonBody))
	if err != nil {
		return nil, nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if c.qdrantAPIKey != "" {
		req.Header.Set("api-key", c.qdrantAPIKey)
	}
	resp, err := c.httpClient.Do(req)
	if err != nil {
		return nil, nil, err
	}
	defer func() { _ = resp.Body.Close() }()
	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(resp.Body)
		return nil, nil, fmt.Errorf("qdrant returned %d: %s", resp.StatusCode, string(body))
	}
	var scrollResp qdrantScrollResponse
	if err := json.NewDecoder(resp.Body).Decode(&scrollResp); err != nil {
		return nil, nil, err
	}
	return scrollResp.Result.Points, scrollResp.Result.NextPageOffset, nil
}
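The offset loop in scrollLegalCorpus can be tried without a vector store. The sketch below swaps the HTTP call for a hypothetical in-memory `fetch` with a page size of 2. The `page` type and `fetch` are illustrative assumptions rather than part of the Qdrant client, and only the loop shape mirrors the source.

```go
package main

import "fmt"

// page is a hypothetical in-memory stand-in for one scroll response.
type page struct {
	points []string
	next   interface{} // nil when the scroll is exhausted
}

// fetch simulates one scroll request over a fixed data set, two points per page.
func fetch(data []string, offset interface{}) page {
	start := 0
	if offset != nil {
		start = offset.(int)
	}
	end := start + 2
	if end >= len(data) {
		return page{points: data[start:], next: nil}
	}
	return page{points: data[start:end], next: end}
}

// scrollAll follows next-page offsets until the scroll reports nil,
// the same loop shape as scrollLegalCorpus.
func scrollAll(data []string) []string {
	var all []string
	var offset interface{}
	for {
		p := fetch(data, offset)
		all = append(all, p.points...)
		if p.next == nil {
			break
		}
		offset = p.next
	}
	return all
}

func main() {
	got := scrollAll([]string{"a", "b", "c", "d", "e"})
	fmt.Println(len(got), got)
}
```

The offset is kept as `interface{}` in the sketch because the source passes the next-page offset through without inspecting its type.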
@@ -0,0 +1,50 @@
package ucca

import "testing"

func structPoint(reg, name, scope, article string) qdrantScrollPoint {
	return qdrantScrollPoint{Payload: map[string]interface{}{
		"regulation_short":   reg,
		"regulation_name_de": name,
		"chunk_scope":        scope,
		"article":            article,
	}}
}

func TestAggregateStructure_CountsDistinctPerScope(t *testing.T) {
	points := []qdrantScrollPoint{
		structPoint("CRA", "Cyber Resilience Act", "section", "13"),
		structPoint("CRA", "Cyber Resilience Act", "section", "13"), // duplicate article → still 1
		structPoint("CRA", "Cyber Resilience Act", "section", "14"),
		structPoint("CRA", "Cyber Resilience Act", "annex", "Anhang-I"),
		structPoint("CRA", "Cyber Resilience Act", "annex", "Anhang-VII"),
		structPoint("DORA", "", "section", "6"),  // first sighting has no name →
		structPoint("DORA", "", "section", "19"), // regulation_name falls back to short
		structPoint("DORA", "", "recital", ""),   // empty article → ignored for distinct
		structPoint("", "x", "section", "1"),     // missing regulation → skipped entirely
	}

	got := aggregateStructure(points)

	if len(got) != 2 {
		t.Fatalf("want 2 acts, got %d (%+v)", len(got), got)
	}
	// CRA has more articles → sorts first.
	cra := got[0]
	if cra.RegulationShort != "CRA" || cra.Articles != 2 || cra.Annexes != 2 || cra.Recitals != 0 || cra.Chunks != 5 {
		t.Errorf("CRA wrong: %+v", cra)
	}
	dora := got[1]
	if dora.RegulationShort != "DORA" || dora.Articles != 2 || dora.Chunks != 3 {
		t.Errorf("DORA wrong: %+v", dora)
	}
	if dora.RegulationName != "DORA" {
		t.Errorf("DORA name fallback failed: %q", dora.RegulationName)
	}
}

func TestAggregateStructure_Empty(t *testing.T) {
	if got := aggregateStructure(nil); len(got) != 0 {
		t.Errorf("want empty, got %+v", got)
	}
}
@@ -0,0 +1,134 @@
package ucca

import (
	"fmt"
	"strings"
)

const (
	assessConnectedCap    = 12   // cap connected norms surfaced in the assessment
	assessCrossRegimeTopN = 5    // window over which "cross regime" is judged
	assessReviewMargin    = 0.05 // a tighter winner gap → recommend human review
)

// Assess builds the auditable explanation layer over a ranked result set:
// primary norm, the norms it connects to (citation graph), cross-regime, a
// human-review flag, the winner margin and a short reasoning string. Pure →
// unit-testable. It EXPLAINS the ranking, it does not change it. Returns nil for
// an empty result set.
func Assess(results []LegalSearchResult) *LegalAssessment {
	if len(results) == 0 {
		return nil
	}
	// Norm-level view: collapse multiple chunks of the same article/annex so the
	// margin and cross-regime are judged between DISTINCT norms, not near-identical
	// chunks of one norm (which would make every winner margin ~0).
	norms := distinctNorms(results)
	p := norms[0]

	primary := primaryLabel(p)
	connected := dedupStrings(p.ReferencesOut, p.ReferencesIn, p.CitationUnit)
	if len(connected) > assessConnectedCap {
		connected = connected[:assessConnectedCap]
	}

	window := norms
	if len(window) > assessCrossRegimeTopN {
		window = window[:assessCrossRegimeTopN]
	}
	regimes := make(map[string]bool)
	for _, r := range window {
		if r.RegulationShort != "" {
			regimes[r.RegulationShort] = true
		}
	}
	crossRegime := len(regimes) > 1

	margin := 0.0
	if len(norms) > 1 {
		margin = norms[0].Score - norms[1].Score
	}

	primaryBinding := p.SourceClass == "binding_law"
	humanReview := margin < assessReviewMargin || crossRegime || !primaryBinding

	return &LegalAssessment{
		PrimaryNorm:       primary,
		PrimaryRegulation: p.RegulationShort,
		ConnectedNorms:    connected,
		CrossRegime:       crossRegime,
		HumanReviewFlag:   humanReview,
		WinnerMargin:      margin,
		ScoreReasoning:    assessReasoning(p, margin, crossRegime, primaryBinding),
	}
}

func primaryLabel(p LegalSearchResult) string {
	if p.CitationUnit != "" {
		return p.CitationUnit
	}
	if p.ArticleLabel != "" {
		return p.ArticleLabel
	}
	return strings.TrimSpace(p.RegulationShort + " " + p.Article)
}

// assessReasoning renders a short, human-readable justification (German).
func assessReasoning(p LegalSearchResult, margin float64, crossRegime, primaryBinding bool) string {
	label := primaryLabel(p)
	parts := make([]string, 0, 4)
	if primaryBinding {
		parts = append(parts, fmt.Sprintf("Primärtreffer %s: bindendes Recht (Autorität %d).", label, p.AuthorityWeight))
	} else {
		parts = append(parts, fmt.Sprintf("Primärtreffer %s ist keine bindende Norm (Leitlinie/Standard) — Quelle prüfen.", label))
	}
	if margin > 0 {
		parts = append(parts, fmt.Sprintf("Vorsprung %.2f vor #2.", margin))
|
||||
}
|
||||
if margin < assessReviewMargin {
|
||||
parts = append(parts, "Knapper Vorsprung — Alternativtreffer prüfen.")
|
||||
}
|
||||
if crossRegime {
|
||||
parts = append(parts, "Mehrere Regime betroffen — Querbezug prüfen.")
|
||||
}
|
||||
return strings.Join(parts, " ")
|
||||
}
|
||||
|
||||
// distinctNorms collapses results that share a citation (multiple chunks of the
|
||||
// same article/annex) to the first — i.e. highest-ranked — occurrence. Results
|
||||
// without any citation identity are each kept, since they cannot be matched.
|
||||
func distinctNorms(results []LegalSearchResult) []LegalSearchResult {
|
||||
seen := make(map[string]bool, len(results))
|
||||
out := make([]LegalSearchResult, 0, len(results))
|
||||
for _, r := range results {
|
||||
key := r.CitationUnit
|
||||
if key == "" {
|
||||
key = r.ArticleLabel
|
||||
}
|
||||
if key != "" {
|
||||
if seen[key] {
|
||||
continue
|
||||
}
|
||||
seen[key] = true
|
||||
}
|
||||
out = append(out, r)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// dedupStrings concatenates out+in, drops empties and the excluded value, and
|
||||
// returns a stable de-duplicated slice (insertion order preserved).
|
||||
func dedupStrings(out, in []string, exclude string) []string {
|
||||
seen := map[string]bool{exclude: true}
|
||||
res := make([]string, 0, len(out)+len(in))
|
||||
for _, list := range [][]string{out, in} {
|
||||
for _, s := range list {
|
||||
if s == "" || seen[s] {
|
||||
continue
|
||||
}
|
||||
seen[s] = true
|
||||
res = append(res, s)
|
||||
}
|
||||
}
|
||||
return res
|
||||
}
|
||||
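The human-review rule in `Assess` combines three triggers. The following is a minimal standalone sketch of that rule, not the package's code: `norm` and `needsReview` are hypothetical names, and only the 0.05 threshold and the `"binding_law"` class are taken from the diff.

```go
package main

import "fmt"

// reviewMargin mirrors assessReviewMargin from the diff.
const reviewMargin = 0.05

// norm is a hypothetical, trimmed-down stand-in for LegalSearchResult.
type norm struct {
	Regulation  string
	SourceClass string
	Score       float64
}

// needsReview applies the three triggers to a list of distinct norms that is
// already sorted by score: a tight winner margin, more than one regime inside
// the top window, or a non-binding primary hit.
func needsReview(norms []norm, window int) bool {
	if len(norms) == 0 {
		return false
	}
	margin := 0.0
	if len(norms) > 1 {
		margin = norms[0].Score - norms[1].Score
	}
	top := norms
	if len(top) > window {
		top = top[:window]
	}
	regimes := make(map[string]bool)
	for _, n := range top {
		if n.Regulation != "" {
			regimes[n.Regulation] = true
		}
	}
	binding := norms[0].SourceClass == "binding_law"
	return margin < reviewMargin || len(regimes) > 1 || !binding
}

func main() {
	clean := []norm{{"CRA", "binding_law", 1.05}, {"CRA", "binding_law", 0.80}}
	single := []norm{{"CRA", "binding_law", 1.05}}
	fmt.Println(needsReview(clean, 5), needsReview(single, 5))
}
```

One consequence of the rule as written: a result set with a single distinct norm has a margin of 0, which is below the threshold, so it is flagged for review. Whether that is intended is not stated in the diff.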
@@ -0,0 +1,112 @@
|
||||
package ucca
|
||||
|
||||
import "testing"
|
||||
|
||||
func ares(reg, cu, sc string, score float64, weight int, out, in []string) LegalSearchResult {
|
||||
return LegalSearchResult{
|
||||
RegulationShort: reg, CitationUnit: cu, SourceClass: sc, Score: score,
|
||||
AuthorityWeight: weight, ReferencesOut: out, ReferencesIn: in,
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssess_Empty(t *testing.T) {
|
||||
if Assess(nil) != nil {
|
||||
t.Error("empty results → nil assessment")
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssess_BindingPrimary_NoReview(t *testing.T) {
|
||||
results := []LegalSearchResult{
|
||||
ares("CRA", "Art. 13 CRA", "binding_law", 1.05, 100,
|
||||
[]string{"CRA Anhang I", "Art. 14 CRA"}, []string{"Art. 12 CRA"}),
|
||||
ares("CRA", "Art. 14 CRA", "binding_law", 0.80, 100, nil, nil),
|
||||
}
|
||||
a := Assess(results)
|
||||
if a == nil {
|
||||
t.Fatal("nil assessment")
|
||||
}
|
||||
if a.PrimaryNorm != "Art. 13 CRA" || a.PrimaryRegulation != "CRA" {
|
||||
t.Errorf("primary wrong: %+v", a)
|
||||
}
|
||||
if len(a.ConnectedNorms) != 3 { // out(2) + in(1), self excluded, deduped
|
||||
t.Errorf("connected norms: %v", a.ConnectedNorms)
|
||||
}
|
||||
if a.CrossRegime {
|
||||
t.Error("single regime must not be cross-regime")
|
||||
}
|
||||
if a.WinnerMargin < 0.24 || a.WinnerMargin > 0.26 {
|
||||
t.Errorf("margin = %v, want ~0.25", a.WinnerMargin)
|
||||
}
|
||||
if a.HumanReviewFlag {
|
||||
t.Error("clean binding + healthy margin + single regime → no review")
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssess_CrossRegimeFlagsReview(t *testing.T) {
|
||||
a := Assess([]LegalSearchResult{
|
||||
ares("CRA", "Art. 13 CRA", "binding_law", 1.05, 100, nil, nil),
|
||||
ares("DORA", "Art. 6 DORA", "binding_law", 0.70, 100, nil, nil),
|
||||
})
|
||||
if !a.CrossRegime || !a.HumanReviewFlag {
|
||||
t.Errorf("cross-regime must flag review: %+v", a)
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssess_NonBindingFlagsReview(t *testing.T) {
|
||||
a := Assess([]LegalSearchResult{
|
||||
ares("ENISA", "ENISA SBOM", "supervisory_guidance", 0.90, 70, nil, nil),
|
||||
ares("ENISA", "ENISA X", "supervisory_guidance", 0.40, 70, nil, nil),
|
||||
})
|
||||
if !a.HumanReviewFlag {
|
||||
t.Error("non-binding primary → review")
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssess_TightMarginFlagsReview(t *testing.T) {
|
||||
a := Assess([]LegalSearchResult{
|
||||
ares("CRA", "Art. 13 CRA", "binding_law", 1.00, 100, nil, nil),
|
||||
ares("CRA", "Art. 14 CRA", "binding_law", 0.98, 100, nil, nil),
|
||||
})
|
||||
if a.WinnerMargin >= 0.05 || !a.HumanReviewFlag {
|
||||
t.Errorf("tight margin → review: %+v", a)
|
||||
}
|
||||
}
|
||||
|
||||
func TestAssess_MarginIsNormLevelNotChunkLevel(t *testing.T) {
|
||||
// Two near-identical chunks of the SAME norm at the top, then a distinct norm.
|
||||
results := []LegalSearchResult{
|
||||
ares("CRA", "Art. 13 CRA", "binding_law", 1.050, 100, []string{"CRA Anhang I"}, nil),
|
||||
ares("CRA", "Art. 13 CRA", "binding_law", 1.049, 100, nil, nil), // same norm
|
||||
ares("CRA", "Art. 14 CRA", "binding_law", 0.800, 100, nil, nil),
|
||||
}
|
||||
a := Assess(results)
|
||||
if a.WinnerMargin < 0.24 || a.WinnerMargin > 0.26 { // Art.13 vs Art.14, not chunk vs chunk
|
||||
t.Errorf("margin must be norm-level (~0.25), got %v", a.WinnerMargin)
|
||||
}
|
||||
if a.HumanReviewFlag {
|
||||
t.Error("healthy norm-level margin → no review")
|
||||
}
|
||||
}
|
||||
|
||||
func TestDistinctNorms(t *testing.T) {
|
||||
got := distinctNorms([]LegalSearchResult{
|
||||
{CitationUnit: "Art. 13 CRA"},
|
||||
{CitationUnit: "Art. 13 CRA"}, // duplicate norm → collapsed
|
||||
{CitationUnit: "Art. 14 CRA"},
|
||||
{CitationUnit: ""}, // no identity → kept
|
||||
{CitationUnit: ""}, // no identity → kept
|
||||
})
|
||||
if len(got) != 4 {
|
||||
t.Errorf("want 4 (2 distinct + 2 unidentified), got %d", len(got))
|
||||
}
|
||||
}
|
||||
|
||||
func TestDedupStrings(t *testing.T) {
|
||||
got := dedupStrings([]string{"a", "b", "", "a"}, []string{"b", "c"}, "self")
|
||||
if len(got) != 3 || got[0] != "a" || got[1] != "b" || got[2] != "c" {
|
||||
t.Errorf("dedup: %v", got)
|
||||
}
|
||||
if len(dedupStrings([]string{"self"}, nil, "self")) != 0 {
|
||||
t.Error("excluded value must be dropped")
|
||||
}
|
||||
}
|
||||
@@ -20,6 +20,7 @@ type LegalRAGClient struct {
|
||||
httpClient *http.Client
|
||||
textIndexEnsured map[string]bool
|
||||
hybridEnabled bool
|
||||
graphEnabled bool
|
||||
}
|
||||
|
||||
// NewLegalRAGClient creates a new Legal RAG client using Ollama bge-m3 embeddings.
|
||||
@@ -38,6 +39,11 @@ func NewLegalRAGClient() *LegalRAGClient {
|
||||
}
|
||||
|
||||
hybridEnabled := os.Getenv("RAG_HYBRID_SEARCH") != "false"
|
||||
// Graph expansion is OPT-IN: no measured ranking benefit over the binding augmentation,
// +1 Qdrant call per search, and a flooding risk via reverse edges. It remains a recall safety net
// for gaps found later (RAG_GRAPH_EXPANSION=true). By default the graph edges are used in the
// response for reasoning/completeness, not for pool expansion.
|
||||
graphEnabled := os.Getenv("RAG_GRAPH_EXPANSION") == "true"
|
||||
|
||||
return &LegalRAGClient{
|
||||
qdrantURL: qdrantURL,
|
||||
@@ -47,6 +53,7 @@ func NewLegalRAGClient() *LegalRAGClient {
|
||||
collection: "bp_compliance_ce",
|
||||
textIndexEnsured: make(map[string]bool),
|
||||
hybridEnabled: hybridEnabled,
|
||||
graphEnabled: graphEnabled,
|
||||
httpClient: &http.Client{
|
||||
Timeout: 60 * time.Second,
|
||||
},
|
||||
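The two flags read in `NewLegalRAGClient` follow different defaults: `RAG_HYBRID_SEARCH` is on unless set to `false`, while `RAG_GRAPH_EXPANSION` is off unless set to `true`. A small sketch of the two patterns, with hypothetical helper names `optOut` and `optIn` that do not exist in the diff:

```go
package main

import (
	"fmt"
	"os"
)

// optOut is the pattern used for RAG_HYBRID_SEARCH: every value except the
// literal "false" (including an unset variable) enables the feature.
func optOut(value string) bool {
	return value != "false"
}

// optIn is the pattern used for RAG_GRAPH_EXPANSION: only the literal "true"
// enables the feature.
func optIn(value string) bool {
	return value == "true"
}

func main() {
	fmt.Println("hybrid:", optOut(os.Getenv("RAG_HYBRID_SEARCH")))
	fmt.Println("graph:", optIn(os.Getenv("RAG_GRAPH_EXPANSION")))
}
```

Both comparisons are case-sensitive, so `TRUE` does not enable graph expansion and `False` does not disable hybrid search.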
@@ -93,6 +100,29 @@ func (c *LegalRAGClient) searchInternal(ctx context.Context, collection string,
|
||||
hits = denseHits
|
||||
}
|
||||
|
||||
// Stratified: AUGMENT (do not replace) the pool with binding_law hits so the obligation source
// is always a candidate; guidance is kept as interpretation context. Best-effort:
// errors in the binding query degrade silently to the semantic pool.
|
||||
if bindingHits, bErr := c.searchBinding(ctx, collection, embedding, topK); bErr == nil {
|
||||
hits = mergeDedupHits(hits, bindingHits)
|
||||
}
|
||||
|
||||
// Control augmentation: for an explicit implementation question, pull a deep dense pool and
// keep only the control-pool roles, so NIST/CRA Annex (dense rank ~8-9, below
// the small top-K) become candidates. Re-rank/applyControlRoles order them afterwards.
|
||||
if queryWantsControls(query) {
|
||||
if controlHits, cErr := c.searchControls(ctx, collection, embedding); cErr == nil {
|
||||
hits = mergeDedupHits(hits, controlHits)
|
||||
}
|
||||
}
|
||||
|
||||
// Graph augmentation: pull the norms that the top hits cite (references_out) into the pool via
// the precise citation edge, e.g. Art. 13 CRA pulls in Anhang I (the actual obligation
// source). Reverse edges (references_in) are not expanded, see expandViaGraph.
// Pool augmentation only; re-rank + topK still apply.
|
||||
if c.graphEnabled {
|
||||
hits = c.expandViaGraph(ctx, collection, hits)
|
||||
}
|
||||
|
||||
results := make([]LegalSearchResult, len(hits))
|
||||
for i, hit := range hits {
|
||||
// Legal metadata per rag_reingest_spec.md §2: prefer the normalized fields
|
||||
@@ -121,12 +151,45 @@ func (c *LegalRAGClient) searchInternal(ctx context.Context, collection string,
|
||||
Pages: getIntSlice(hit.Payload, "pages"),
|
||||
SourceURL: getString(hit.Payload, "source"),
|
||||
Score: hit.Score,
|
||||
AuthorityWeight: getInt(hit.Payload, "authority_weight"),
|
||||
SourceClass: getString(hit.Payload, "source_class"),
|
||||
Jurisdiction: getString(hit.Payload, "jurisdiction"),
|
||||
CitationUnit: getString(hit.Payload, "citation_unit"),
|
||||
ReferencesOut: getStringSlice(hit.Payload, "references_out"),
|
||||
ReferencesIn: getStringSlice(hit.Payload, "references_in"),
|
||||
Superseded: getString(hit.Payload, "status") == "superseded",
|
||||
}
|
||||
}
|
||||
|
||||
// Authority-aware re-ranking: binding law of the matching jurisdiction/domain moves
// up, guidance/foreign law/off-domain moves down (nothing is deleted). Order only,
// response schema unchanged. Score carries the authority score so that downstream
// multi-collection merges (Advisor) preserve the order.
|
||||
results = rerankByAuthority(query, results)
|
||||
if topK > 0 && len(results) > topK {
|
||||
results = results[:topK]
|
||||
}
|
||||
|
||||
return results, nil
|
||||
}
|
||||
|
||||
// mergeDedupHits concatenates two hit lists, keeping the first occurrence of each point ID.
|
||||
func mergeDedupHits(primary, extra []qdrantSearchHit) []qdrantSearchHit {
|
||||
seen := make(map[string]bool, len(primary)+len(extra))
|
||||
out := make([]qdrantSearchHit, 0, len(primary)+len(extra))
|
||||
for _, list := range [][]qdrantSearchHit{primary, extra} {
|
||||
for _, h := range list {
|
||||
id := fmt.Sprint(h.ID)
|
||||
if seen[id] {
|
||||
continue
|
||||
}
|
||||
seen[id] = true
|
||||
out = append(out, h)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// FormatLegalContextForPrompt formats the legal context for inclusion in an LLM prompt.
|
||||
func (c *LegalRAGClient) FormatLegalContextForPrompt(lc *LegalContext) string {
|
||||
if lc == nil || len(lc.Results) == 0 {
|
||||
|
||||
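`mergeDedupHits` keys its `seen` map on `fmt.Sprint(h.ID)`. The sketch below reproduces that idea with a hypothetical `hit` type (the real `qdrantSearchHit` is not shown in this diff) to illustrate what the string key buys: an ID decoded from JSON as `float64(1)` and an `int` literal `1` collapse to the same key.

```go
package main

import "fmt"

// hit is a hypothetical stand-in for qdrantSearchHit. ID is untyped here on
// the assumption that point IDs may arrive as numbers or as UUID strings.
type hit struct {
	ID    interface{}
	Score float64
}

// mergeDedup keeps the first occurrence of each ID, so entries from primary
// win over entries from extra that share an ID.
func mergeDedup(primary, extra []hit) []hit {
	seen := make(map[string]bool, len(primary)+len(extra))
	out := make([]hit, 0, len(primary)+len(extra))
	for _, list := range [][]hit{primary, extra} {
		for _, h := range list {
			key := fmt.Sprint(h.ID)
			if seen[key] {
				continue
			}
			seen[key] = true
			out = append(out, h)
		}
	}
	return out
}

func main() {
	semantic := []hit{{ID: float64(1), Score: 0.9}, {ID: "a1b2", Score: 0.8}}
	binding := []hit{{ID: 1, Score: 0.5}, {ID: 2, Score: 0.4}}
	for _, h := range mergeDedup(semantic, binding) {
		fmt.Println(h.ID, h.Score)
	}
}
```

The same property means a numeric ID `1` and a string ID `"1"` would also be treated as duplicates. That is harmless when a collection uses one ID type consistently, which is assumed here.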
@@ -0,0 +1,162 @@
|
||||
package ucca
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
"sort"
|
||||
)
|
||||
|
||||
// Graph-augmented retrieval: when a top hit cites an annex/article (references_out),
// pull that connected norm into the candidate pool via the PRECISE citation graph
// instead of hoping semantic search surfaces it. E.g. a hit on CRA Art. 13 pulls in
// CRA Anhang I (the actual requirement). Reverse edges (references_in) are not
// expanded. Pool-augmentation only: authority re-rank + topK slice still apply, so
// the response schema is unchanged.
|
||||
const (
|
||||
graphSeedCount = 5 // only the top hits seed the expansion
|
||||
graphMaxExpand = 15 // cap connected norms pulled in (avoid pool explosion)
|
||||
graphHopPenalty = 0.05 // a one-hop neighbour ranks just below its seed
|
||||
)
|
||||
|
||||
// expandViaGraph augments hits with the norms they cite (forward edges only,
// references_out). Best-effort: on any error (or nothing to expand) the original
// hits are returned unchanged.
|
||||
func (c *LegalRAGClient) expandViaGraph(ctx context.Context, collection string, hits []qdrantSearchHit) []qdrantSearchHit {
|
||||
if len(hits) == 0 {
|
||||
return hits
|
||||
}
|
||||
present := make(map[string]bool, len(hits))
|
||||
for _, h := range hits {
|
||||
if cu := getString(h.Payload, "citation_unit"); cu != "" {
|
||||
present[cu] = true
|
||||
}
|
||||
}
|
||||
|
||||
seeds := hits
|
||||
if len(seeds) > graphSeedCount {
|
||||
seeds = seeds[:graphSeedCount]
|
||||
}
|
||||
// Forward edges only (references_out = the detail a hit explicitly points to,
|
||||
// e.g. Art. 13 → Anhang I). Reverse (references_in) has high fan-out for popular
|
||||
// annexes (Anhang I is cited by 23 articles) → pool flooding; it is surfaced as
|
||||
// connected-norm metadata in the Phase 2 response instead of expanding the pool.
|
||||
want := make(map[string]float64) // connected citation_unit -> best seeding score
|
||||
for _, h := range seeds {
|
||||
for _, cu := range getStringSlice(h.Payload, "references_out") {
|
||||
if cu == "" || present[cu] {
|
||||
continue
|
||||
}
|
||||
if s, ok := want[cu]; !ok || h.Score > s {
|
||||
want[cu] = h.Score
|
||||
}
|
||||
}
|
||||
}
|
||||
if len(want) == 0 {
|
||||
return hits
|
||||
}
|
||||
|
||||
units := topByScore(want, graphMaxExpand)
|
||||
fetched, err := c.fetchByCitationUnits(ctx, collection, units)
|
||||
if err != nil || len(fetched) == 0 {
|
||||
return hits
|
||||
}
|
||||
neighbours := make([]qdrantSearchHit, 0, len(fetched))
|
||||
for cu, pt := range fetched {
|
||||
neighbours = append(neighbours, qdrantSearchHit{ID: pt.ID, Score: want[cu] - graphHopPenalty, Payload: pt.Payload})
|
||||
}
|
||||
return mergeDedupHits(hits, neighbours)
|
||||
}
|
||||
|
||||
// topByScore returns up to n keys with the highest values. Deterministic: ties
|
||||
// broken by the key string so the cap is stable across runs.
|
||||
func topByScore(m map[string]float64, n int) []string {
|
||||
keys := make([]string, 0, len(m))
|
||||
for k := range m {
|
||||
keys = append(keys, k)
|
||||
}
|
||||
sort.Slice(keys, func(i, j int) bool {
|
||||
if m[keys[i]] != m[keys[j]] {
|
||||
return m[keys[i]] > m[keys[j]]
|
||||
}
|
||||
return keys[i] < keys[j]
|
||||
})
|
||||
if len(keys) > n {
|
||||
keys = keys[:n]
|
||||
}
|
||||
return keys
|
||||
}
|
||||
|
||||
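// fetchByCitationUnits loads one representative point per citation_unit from the
// given collection: the first point the scroll returns for that unit. The scroll is
// limited to len(units)*4 points in total, so a unit can be missing from the result
// when other units return many chunks.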
|
||||
func (c *LegalRAGClient) fetchByCitationUnits(ctx context.Context, collection string, units []string) (map[string]qdrantScrollPoint, error) {
|
||||
should := make([]map[string]interface{}, 0, len(units))
|
||||
for _, cu := range units {
|
||||
should = append(should, map[string]interface{}{"key": "citation_unit", "match": map[string]interface{}{"value": cu}})
|
||||
}
|
||||
reqBody := map[string]interface{}{
|
||||
"limit": len(units) * 4,
|
||||
"with_payload": true,
|
||||
"with_vectors": false,
|
||||
"filter": map[string]interface{}{"should": should},
|
||||
}
|
||||
jsonBody, err := json.Marshal(reqBody)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
url := fmt.Sprintf("%s/collections/%s/points/scroll", c.qdrantURL, collection)
|
||||
req, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(jsonBody))
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
req.Header.Set("Content-Type", "application/json")
|
||||
if c.qdrantAPIKey != "" {
|
||||
req.Header.Set("api-key", c.qdrantAPIKey)
|
||||
}
|
||||
resp, err := c.httpClient.Do(req)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
if resp.StatusCode != http.StatusOK {
|
||||
body, _ := io.ReadAll(resp.Body)
|
||||
return nil, fmt.Errorf("qdrant scroll returned %d: %s", resp.StatusCode, string(body))
|
||||
}
|
||||
var scrollResp qdrantScrollResponse
|
||||
if err := json.NewDecoder(resp.Body).Decode(&scrollResp); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
out := make(map[string]qdrantScrollPoint, len(units))
|
||||
for _, pt := range scrollResp.Result.Points {
|
||||
cu := getString(pt.Payload, "citation_unit")
|
||||
if cu != "" {
|
||||
if _, seen := out[cu]; !seen {
|
||||
out[cu] = pt
|
||||
}
|
||||
}
|
||||
}
|
||||
return out, nil
|
||||
}
|
||||
|
||||
// getStringSlice extracts a []string from a Qdrant payload list field
|
||||
// (references_out / references_in are stored as JSON arrays of strings).
|
||||
func getStringSlice(m map[string]interface{}, key string) []string {
|
||||
v, ok := m[key]
|
||||
if !ok {
|
||||
return nil
|
||||
}
|
||||
arr, ok := v.([]interface{})
|
||||
if !ok {
|
||||
return nil
|
||||
}
|
||||
out := make([]string, 0, len(arr))
|
||||
for _, item := range arr {
|
||||
if s, ok := item.(string); ok {
|
||||
out = append(out, s)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
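The scoring of graph neighbours in `expandViaGraph` can be sketched without Qdrant. This is a simplified illustration under stated assumptions: `seed` and `neighbourScores` are hypothetical names, the seeds double as the whole pool (the real code checks against all hits), and only the 0.05 hop penalty comes from the diff.

```go
package main

import (
	"fmt"
	"sort"
)

// hopPenalty mirrors graphHopPenalty from the diff.
const hopPenalty = 0.05

// seed is a hypothetical stand-in for a top hit with its forward references.
type seed struct {
	CitationUnit  string
	Score         float64
	ReferencesOut []string
}

// neighbourScores maps each cited norm that is not already in the pool to the
// score it would enter the pool with: the best citing seed score minus the
// hop penalty.
func neighbourScores(seeds []seed) map[string]float64 {
	present := make(map[string]bool, len(seeds))
	for _, s := range seeds {
		present[s.CitationUnit] = true
	}
	best := make(map[string]float64)
	for _, s := range seeds {
		for _, cu := range s.ReferencesOut {
			if cu == "" || present[cu] {
				continue
			}
			if old, ok := best[cu]; !ok || s.Score > old {
				best[cu] = s.Score
			}
		}
	}
	for cu := range best {
		best[cu] -= hopPenalty
	}
	return best
}

func main() {
	scores := neighbourScores([]seed{
		{"Art. 13 CRA", 0.70, []string{"CRA Anhang I", "Art. 14 CRA"}},
		{"Art. 14 CRA", 0.60, []string{"CRA Anhang I"}},
	})
	units := make([]string, 0, len(scores))
	for cu := range scores {
		units = append(units, cu)
	}
	sort.Strings(units)
	for _, cu := range units {
		fmt.Printf("%s %.2f\n", cu, scores[cu])
	}
}
```

Here `Art. 14 CRA` is skipped because it is already in the pool, and `CRA Anhang I` inherits the higher of its two citing scores, so it lands just below `Art. 13 CRA` before the authority re-rank runs.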
@@ -0,0 +1,89 @@
|
||||
package ucca
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestGetStringSlice(t *testing.T) {
|
||||
m := map[string]interface{}{
|
||||
"refs": []interface{}{"a", "b", 3, "c"}, // non-strings are skipped
|
||||
"str": "not-a-list",
|
||||
}
|
||||
got := getStringSlice(m, "refs")
|
||||
if len(got) != 3 || got[0] != "a" || got[2] != "c" {
|
||||
t.Errorf("refs: %v", got)
|
||||
}
|
||||
if getStringSlice(m, "missing") != nil {
|
||||
t.Error("missing key should be nil")
|
||||
}
|
||||
if getStringSlice(m, "str") != nil {
|
||||
t.Error("non-list should be nil")
|
||||
}
|
||||
}
|
||||
|
||||
func TestTopByScore_DeterministicCap(t *testing.T) {
|
||||
m := map[string]float64{"x": 0.5, "y": 0.9, "z": 0.5, "w": 0.7}
|
||||
got := topByScore(m, 2)
|
||||
if len(got) != 2 || got[0] != "y" || got[1] != "w" {
|
||||
t.Errorf("want [y w], got %v", got)
|
||||
}
|
||||
all := topByScore(m, 10)
|
||||
if all[2] != "x" || all[3] != "z" { // tie 0.5 broken by key string
|
||||
t.Errorf("tie-break not deterministic: %v", all)
|
||||
}
|
||||
}
|
||||
|
||||
func TestExpandViaGraph_NoSeedsOrRefs(t *testing.T) {
|
||||
c := &LegalRAGClient{} // nil httpClient → must not be called on these paths
|
||||
if out := c.expandViaGraph(context.Background(), "x", nil); out != nil {
|
||||
t.Error("empty hits should return nil")
|
||||
}
|
||||
hits := []qdrantSearchHit{{ID: 1, Score: 0.8, Payload: map[string]interface{}{"citation_unit": "Art. 1 CRA"}}}
|
||||
if out := c.expandViaGraph(context.Background(), "x", hits); len(out) != 1 {
|
||||
t.Errorf("no references → unchanged, got %d", len(out))
|
||||
}
|
||||
}
|
||||
|
||||
func TestExpandViaGraph_PullsConnectedNorm(t *testing.T) {
|
||||
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||
_ = json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"result": map[string]interface{}{
|
||||
"points": []map[string]interface{}{
|
||||
{"id": 99, "payload": map[string]interface{}{
|
||||
"citation_unit": "CRA Anhang I", "chunk_text": "Sicherheitsanforderungen",
|
||||
"source_class": "binding_law", "authority_weight": 100, "regulation_short": "CRA",
|
||||
}},
|
||||
},
|
||||
"next_page_offset": nil,
|
||||
},
|
||||
})
|
||||
}))
|
||||
defer srv.Close()
|
||||
|
||||
c := &LegalRAGClient{qdrantURL: srv.URL, httpClient: srv.Client()}
|
||||
hits := []qdrantSearchHit{
|
||||
{ID: 1, Score: 0.70, Payload: map[string]interface{}{
|
||||
"citation_unit": "Art. 13 CRA", "references_out": []interface{}{"CRA Anhang I"},
|
||||
}},
|
||||
}
|
||||
out := c.expandViaGraph(context.Background(), "bp_compliance_ce", hits)
|
||||
if len(out) != 2 {
|
||||
t.Fatalf("want 2 hits (seed + connected annex), got %d", len(out))
|
||||
}
|
||||
var found *qdrantSearchHit
|
||||
for i := range out {
|
||||
if getString(out[i].Payload, "citation_unit") == "CRA Anhang I" {
|
||||
found = &out[i]
|
||||
}
|
||||
}
|
||||
if found == nil {
|
||||
t.Fatal("connected norm CRA Anhang I was not pulled into the pool")
|
||||
}
|
||||
if found.Score < 0.64 || found.Score > 0.66 { // 0.70 seed − 0.05 hop penalty
|
||||
t.Errorf("connected score = %v, want ~0.65", found.Score)
|
||||
}
|
||||
}
|
||||
@@ -185,6 +185,55 @@ func (c *LegalRAGClient) searchDense(ctx context.Context, collection string, emb
|
||||
searchReq.Filter = &qdrantFilter{Should: conditions}
|
||||
}
|
||||
|
||||
return c.doPointsSearch(ctx, collection, searchReq)
|
||||
}
|
||||
|
||||
// searchBinding fetches the top binding_law hits (authority-stratified pool) so the
|
||||
// obligation source is always a candidate even when guidance dominates semantically.
|
||||
// It AUGMENTS the semantic pool — guidance is preserved as interpretation context.
|
||||
func (c *LegalRAGClient) searchBinding(ctx context.Context, collection string, embedding []float64, topK int) ([]qdrantSearchHit, error) {
|
||||
searchReq := qdrantSearchRequest{
|
||||
Vector: embedding,
|
||||
Limit: topK,
|
||||
WithPayload: true,
|
||||
Filter: &qdrantFilter{Must: []qdrantCondition{
|
||||
{Key: "source_class", Match: qdrantMatch{Value: "binding_law"}},
|
||||
}},
|
||||
}
|
||||
|
||||
return c.doPointsSearch(ctx, collection, searchReq)
|
||||
}
|
||||
|
||||
// controlPoolDepth is how deep the dense control pull reaches. Measured: for an EU-cyber
|
||||
// control query the relevant control sources sit at dense rank ~8-9 (NIST, CRA Annex), far
|
||||
// below the client's small top-K — so a fixed dense depth of 60 reliably surfaces them.
|
||||
const controlPoolDepth = 60
|
||||
|
||||
// searchControls fetches a DEEP dense pool and keeps only the control-pool roles, so control
|
||||
// sources that the small top-K (hybrid) search misses become candidates on an implementation
|
||||
// question. Role is derived in code (no source_role tag needed). AUGMENTS the pool — the
|
||||
// caller gates it on control-intent.
|
||||
func (c *LegalRAGClient) searchControls(ctx context.Context, collection string, embedding []float64) ([]qdrantSearchHit, error) {
|
||||
searchReq := qdrantSearchRequest{
|
||||
Vector: embedding,
|
||||
Limit: controlPoolDepth,
|
||||
WithPayload: true,
|
||||
}
|
||||
hits, err := c.doPointsSearch(ctx, collection, searchReq)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
kept := make([]qdrantSearchHit, 0, len(hits))
|
||||
for _, h := range hits {
|
||||
if isControlPoolRole(controlRoleOf(h.Payload)) {
|
||||
kept = append(kept, h)
|
||||
}
|
||||
}
|
||||
return kept, nil
|
||||
}
|
||||
|
||||
// doPointsSearch issues a POST /points/search and decodes the hits.
|
||||
func (c *LegalRAGClient) doPointsSearch(ctx context.Context, collection string, searchReq qdrantSearchRequest) ([]qdrantSearchHit, error) {
|
||||
jsonBody, err := json.Marshal(searchReq)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to marshal search request: %w", err)
|
||||
|
||||
@@ -0,0 +1,135 @@
|
||||
package ucca
|
||||
|
||||
import "testing"
|
||||
|
||||
func intentRes(reg, sourceClass string, sem float64, weight int) LegalSearchResult {
|
||||
return LegalSearchResult{
|
||||
RegulationShort: reg, SourceClass: sourceClass, Score: sem,
|
||||
AuthorityWeight: weight, Jurisdiction: "EU",
|
||||
}
|
||||
}
|
||||
|
||||
func TestQueryWantsGuidance(t *testing.T) {
|
||||
wants := []string{
|
||||
"Was empfiehlt der EDPB zum DSB?",
|
||||
"Was sagt die ENISA zu Security Updates?",
|
||||
"laut DSK ...",
|
||||
"Orientierungshilfe zur DSFA",
|
||||
"Welche BSI-Empfehlung gilt?",
|
||||
"Auslegung der Aufsichtsbehörde",
|
||||
}
|
||||
plain := []string{
|
||||
"Ab wann braucht man einen Datenschutzbeauftragten?",
|
||||
"Welche Anforderungen bestehen an Security Updates?",
|
||||
}
|
||||
for _, q := range wants {
|
||||
if !queryWantsGuidance(q) {
|
||||
t.Errorf("should detect interpretation intent: %q", q)
|
||||
}
|
||||
}
|
||||
for _, q := range plain {
|
||||
if queryWantsGuidance(q) {
|
||||
t.Errorf("should NOT detect intent (norm question): %q", q)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestRerank_NormQuestion_BindingStaysTop(t *testing.T) {
|
||||
// No intent signal → binding wins even though guidance is semantically higher.
|
||||
results := []LegalSearchResult{
|
||||
intentRes("EDPB DPO", "supervisory_guidance", 0.64, 70),
|
||||
intentRes("DSGVO", "binding_law", 0.58, 100),
|
||||
}
|
||||
out := rerankByAuthority("Ab wann braucht man einen Datenschutzbeauftragten?", results)
|
||||
if out[0].SourceClass != "binding_law" {
|
||||
t.Errorf("norm question: binding must stay Top-1, got %s", out[0].SourceClass)
|
||||
}
|
||||
}
|
||||
|
||||
func TestRerank_InterpretationQuestion_GuidanceMayWin(t *testing.T) {
|
||||
// Explicit intent + guidance semantically competitive → guidance wins.
|
||||
results := []LegalSearchResult{
|
||||
intentRes("EDPB DPO", "supervisory_guidance", 0.64, 70),
|
||||
intentRes("DSGVO", "binding_law", 0.58, 100),
|
||||
}
|
||||
out := rerankByAuthority("Was empfiehlt der EDPB zum Datenschutzbeauftragten?", results)
|
||||
if out[0].SourceClass != "supervisory_guidance" {
|
||||
t.Errorf("interpretation question: guidance should win Top-1, got %s", out[0].SourceClass)
|
||||
}
|
||||
}
|
||||
|
||||
func TestRerank_OffTopicGuidance_BlockedByGuard(t *testing.T) {
|
||||
// Intent present, but guidance semantic is far below the best binding hit →
|
||||
// the margin guard keeps binding on top (no off-topic guideline override).
|
||||
results := []LegalSearchResult{
|
||||
intentRes("EDPB DPO", "supervisory_guidance", 0.40, 70),
|
||||
intentRes("DSGVO", "binding_law", 0.58, 100),
|
||||
}
|
||||
out := rerankByAuthority("Was empfiehlt der EDPB zum Datenschutzbeauftragten?", results)
|
||||
if out[0].SourceClass != "binding_law" {
|
||||
t.Errorf("off-topic guidance must not win even with intent, got %s", out[0].SourceClass)
|
||||
}
|
||||
}
|
||||
|
||||
func TestQueryWantsControls(t *testing.T) {
|
||||
wants := []string{
|
||||
"Welche Controls passen zu Security Updates?",
|
||||
"Welche Maßnahmen sollten wir umsetzen?",
|
||||
"Wie härten wir den Server ab?",
|
||||
"Gibt es NIST-Controls dafür?",
|
||||
"OWASP Best Practice für Logging?",
|
||||
"BSI Grundschutz Bausteine",
|
||||
}
|
||||
plain := []string{
|
||||
"Welche Anforderungen bestehen an Security Updates?",
|
||||
"Ab wann braucht man einen Datenschutzbeauftragten?",
|
||||
}
|
||||
for _, q := range wants {
|
||||
if !queryWantsControls(q) {
|
||||
t.Errorf("should detect control/implementation intent: %q", q)
|
||||
}
|
||||
}
|
||||
for _, q := range plain {
|
||||
if queryWantsControls(q) {
|
||||
t.Errorf("should NOT detect control intent (norm question): %q", q)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestRerank_ControlQuestion_OperationalReqTop(t *testing.T) {
|
||||
// User priority for implementation questions: operational_requirement (binding concrete,
|
||||
// CRA Anhang I) > control_standard (NIST). Both are in the control-pool; op_req wins.
|
||||
results := []LegalSearchResult{
|
||||
{RegulationShort: "NIST SP 800-82r3", ArticleLabel: "AU-8", SourceClass: "technical_standard", AuthorityWeight: 80, Jurisdiction: "EU", Score: 0.60},
|
||||
{RegulationShort: "CRA", ArticleLabel: "CRA Anhang I", Category: "regulation", Score: 0.58},
|
||||
}
|
||||
out := rerankByAuthority("Welche Controls und Massnahmen passen zu Security Updates?", results)
|
||||
if out[0].RegulationShort != "CRA" {
|
||||
t.Errorf("operational_requirement (CRA Anhang I) should be Top-1 over control_standard, got %q", out[0].RegulationShort)
|
||||
}
|
||||
}
|
||||
|
||||
func TestRerank_NormQuestion_BindingOverStandard(t *testing.T) {
|
||||
// "Anforderungen" → no control intent → binding obligation stays Top-1 over the standard.
|
||||
results := []LegalSearchResult{
|
||||
intentRes("NIST SP 800-82", "technical_standard", 0.62, 80),
|
||||
intentRes("CRA", "binding_law", 0.58, 100),
|
||||
}
|
||||
out := rerankByAuthority("Welche Anforderungen bestehen an Security Updates?", results)
|
||||
if out[0].SourceClass != "binding_law" {
|
||||
t.Errorf("norm question: binding must stay Top-1 over standard, got %s", out[0].SourceClass)
|
||||
}
|
||||
}
|
||||
|
||||
func TestRerank_ControlQuestion_PoolBeatsBareObligation(t *testing.T) {
|
||||
// A control-pool source (NIST control_standard) outranks an abstract obligation with no
|
||||
// domain/topic advantage, because the implementation intent boosts the control-pool.
|
||||
results := []LegalSearchResult{
|
||||
{RegulationShort: "NIST SP 800-82r3", ArticleLabel: "AU-8", SourceClass: "technical_standard", AuthorityWeight: 80, Jurisdiction: "EU", Score: 0.55},
|
||||
{RegulationShort: "XYZ", ArticleLabel: "Art. 5 XYZ", Category: "regulation", Score: 0.58},
|
||||
}
|
||||
out := rerankByAuthority("Welche Controls und Massnahmen passen zu Security Updates?", results)
|
||||
if out[0].RegulationShort != "NIST SP 800-82r3" {
|
||||
t.Errorf("control_standard should beat a bare abstract obligation on a control question, got %q", out[0].RegulationShort)
|
||||
}
|
||||
}
|
||||
@@ -225,6 +225,18 @@ func getIntSlice(m map[string]interface{}, key string) []int {
|
||||
return result
|
||||
}
|
||||
|
||||
func getInt(m map[string]interface{}, key string) int {
|
||||
if v, ok := m[key]; ok {
|
||||
switch n := v.(type) {
|
||||
case float64:
|
||||
return int(n)
|
||||
case int:
|
||||
return n
|
||||
}
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
func contains(slice []string, item string) bool {
|
||||
for _, s := range slice {
|
||||
if s == item {
|
||||
|
||||
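`getInt` accepts both `float64` and `int` for a reason that is easy to miss: `encoding/json` decodes every JSON number inside a `map[string]interface{}` as `float64`, while payloads built directly in Go tests usually hold `int`. A short sketch that copies the helper from the diff and adds a hypothetical `decodePayload` function for the demonstration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// getInt mirrors the helper in the diff: float64 covers JSON-decoded payloads,
// int covers payloads constructed as Go literals.
func getInt(m map[string]interface{}, key string) int {
	if v, ok := m[key]; ok {
		switch n := v.(type) {
		case float64:
			return int(n)
		case int:
			return n
		}
	}
	return 0
}

// decodePayload is a hypothetical helper that parses a JSON object and
// returns nil on malformed input.
func decodePayload(raw string) map[string]interface{} {
	var m map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return nil
	}
	return m
}

func main() {
	decoded := decodePayload(`{"authority_weight": 100}`)
	literal := map[string]interface{}{"authority_weight": 70}
	fmt.Println(
		getInt(decoded, "authority_weight"),
		getInt(literal, "authority_weight"),
		getInt(literal, "missing"),
	)
}
```

Values of any other type, such as a numeric string, fall through to the zero default. That matches the helper's behaviour for a missing key.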
@@ -0,0 +1,30 @@
|
||||
package ucca
|
||||
|
||||
import "testing"
|
||||
|
||||
// A superseded alt-source must rank below the same result when it is NOT
|
||||
// superseded (the eu-v1 norm), but only demoted — the penalty is finite, so it
|
||||
// stays in the pool and remains findable for history/transition questions.
|
||||
func TestAuthorityScore_SupersededIsDemotedNotRemoved(t *testing.T) {
|
||||
fresh := LegalSearchResult{
|
||||
Score: 0.65, SourceClass: "binding_law", AuthorityWeight: 100,
|
||||
Jurisdiction: "EU", RegulationShort: "CRA", Article: "13",
|
||||
}
|
||||
old := fresh
|
||||
old.Superseded = true
|
||||
|
||||
sFresh := authorityScore("CRA Sicherheitsupdates Hersteller", fresh, "", false)
|
||||
sOld := authorityScore("CRA Sicherheitsupdates Hersteller", old, "", false)
|
||||
|
||||
if sOld >= sFresh {
|
||||
t.Errorf("superseded must score lower: fresh=%.3f superseded=%.3f", sFresh, sOld)
|
||||
}
|
||||
gap := sFresh - sOld
|
||||
if gap < supersededPenalty-0.001 || gap > supersededPenalty+0.001 {
|
||||
t.Errorf("demotion should equal supersededPenalty (%.2f), got %.3f", supersededPenalty, gap)
|
||||
}
|
||||
// The score must stay above -1 (finite, not collapsed) → present in the pool, not hidden.
|
||||
if sOld <= -1 {
|
||||
t.Errorf("superseded score collapsed (%.3f) — must remain findable", sOld)
|
||||
}
|
||||
}
|
||||
@@ -399,8 +399,9 @@ func TestHybridSearch_UsesQueryAPI(t *testing.T) {
|
||||
return
|
||||
}
|
||||
|
||||
// Fallback: should not reach dense search
|
||||
t.Error("Unexpected dense search call when hybrid succeeded")
|
||||
// /points/search is now the stratified binding-law augmentation query (it AUGMENTS
|
||||
// the hybrid pool, it is not a dense fallback). Return empty so the hybrid hit
|
||||
// remains the sole result for this test.
|
||||
json.NewEncoder(w).Encode(qdrantSearchResponse{Result: []qdrantSearchHit{}})
|
||||
}))
|
||||
defer qdrantMock.Close()
|
||||
@@ -446,6 +447,59 @@ func TestHybridSearch_UsesQueryAPI(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// TestSearch_StratifiedBindingRerank verifies that the binding-law pool augments the
|
||||
// semantic pool and that authority re-ranking lifts binding law above higher-semantic guidance.
|
||||
func TestSearch_StratifiedBindingRerank(t *testing.T) {
|
||||
ollamaMock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
json.NewEncoder(w).Encode(ollamaEmbeddingResponse{Embedding: make([]float64, 1024)})
|
||||
}))
|
||||
defer ollamaMock.Close()
|
||||
|
||||
qdrantMock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
if strings.Contains(r.URL.Path, "/index") {
|
||||
w.WriteHeader(http.StatusOK)
|
||||
w.Write([]byte(`{"result":{"status":"completed"}}`))
|
||||
return
|
||||
}
|
||||
if strings.Contains(r.URL.Path, "/points/query") {
|
||||
json.NewEncoder(w).Encode(qdrantQueryResponse{Result: []qdrantSearchHit{
|
||||
{ID: "g1", Score: 0.72, Payload: map[string]interface{}{
|
||||
"chunk_text": "ENISA guidance", "regulation_short": "ENISA",
|
||||
"article_label": "ENISA CRA Mapping", "source_class": "supervisory_guidance",
|
||||
"authority_weight": float64(70), "jurisdiction": "EU",
|
||||
}},
|
||||
}})
|
||||
return
|
||||
}
|
||||
// /points/search = stratified binding-law pool (source_class=binding_law)
|
||||
json.NewEncoder(w).Encode(qdrantSearchResponse{Result: []qdrantSearchHit{
|
||||
{ID: "b1", Score: 0.66, Payload: map[string]interface{}{
|
||||
"chunk_text": "CRA Anhang I requirement", "regulation_short": "CRA",
|
||||
"article_label": "CRA Anhang I", "source_class": "binding_law",
|
||||
"authority_weight": float64(100), "jurisdiction": "EU",
|
||||
}},
|
||||
}})
|
||||
}))
|
||||
defer qdrantMock.Close()
|
||||
|
||||
client := &LegalRAGClient{
|
||||
qdrantURL: qdrantMock.URL, ollamaURL: ollamaMock.URL, embeddingModel: "bge-m3",
|
||||
collection: "bp_compliance_ce", textIndexEnsured: make(map[string]bool),
|
||||
hybridEnabled: true, httpClient: http.DefaultClient,
|
||||
}
|
||||
|
||||
results, err := client.Search(context.Background(), "Was gilt hier?", nil, 5)
|
||||
if err != nil {
|
||||
t.Fatalf("search failed: %v", err)
|
||||
}
|
||||
if len(results) != 2 {
|
||||
t.Fatalf("expected 2 merged results (guidance + binding), got %d", len(results))
|
||||
}
|
||||
if results[0].RegulationShort != "CRA" {
|
||||
t.Errorf("binding CRA must rank first over higher-semantic guidance, got %q", results[0].RegulationShort)
|
||||
}
|
||||
}
|
||||
|
||||
func TestHybridSearch_FallbackToDense(t *testing.T) {
|
||||
var requestedPaths []string
|
||||
|
||||
|
||||
@@ -20,6 +20,38 @@ type LegalSearchResult struct {
|
||||
Pages []int `json:"pages,omitempty"`
|
||||
SourceURL string `json:"source_url"`
|
||||
Score float64 `json:"score"`
|
||||
|
||||
// Internal fields for authority re-ranking (Phase 1). NOT serialized
// (json:"-"), so no contract change. Populated from the Qdrant payload and used
// only for sorting in rerankByAuthority.
|
||||
AuthorityWeight int `json:"-"`
|
||||
SourceClass string `json:"-"`
|
||||
Jurisdiction string `json:"-"`
|
||||
|
||||
// Citation graph (Phase 2): internal, feeds only the assessment computation
// (connected norms, reasoning). The per-result schema stays frozen.
|
||||
CitationUnit string `json:"-"`
|
||||
ReferencesOut []string `json:"-"`
|
||||
ReferencesIn []string `json:"-"`
|
||||
|
||||
// Supersede status (status="superseded", use_for_primary=false): a legacy source
// that is demoted for default questions (not hidden; still findable for history).
|
||||
Superseded bool `json:"-"`
|
||||
}
|
||||
|
||||
// LegalAssessment is the auditable explanation layer over a ranked result set:
|
||||
// which norm is primary, which norms connect to it via the citation graph,
|
||||
// whether the answer crosses regulatory regimes, and whether a human should
|
||||
// review. Computed from the already-ranked results — it EXPLAINS retrieval, it
|
||||
// does not change it (graph edges for reasoning/completeness, not pool-expansion).
|
||||
type LegalAssessment struct {
|
||||
PrimaryNorm string `json:"primary_norm"`
|
||||
PrimaryRegulation string `json:"primary_regulation"`
|
||||
ConnectedNorms []string `json:"connected_norms"`
|
||||
CrossRegime bool `json:"cross_regime"`
|
||||
HumanReviewFlag bool `json:"human_review_flag"`
|
||||
WinnerMargin float64 `json:"winner_margin"`
|
||||
ScoreReasoning string `json:"score_reasoning"`
|
||||
}
|
||||
|
||||
// LegalContext represents aggregated legal context for an assessment.
|
||||
|
||||
@@ -13,6 +13,7 @@ the map). Once the tabs are the source of truth, B18's v1 path retires.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
|
||||
from compliance.services.specialist_agents import REGISTRY, AgentInput
|
||||
@@ -27,6 +28,8 @@ logger = logging.getLogger(__name__)
|
||||
# topic key (matches state["doc_texts"]) -> registered agent_id
|
||||
_TOPIC_AGENTS: dict[str, str] = {
|
||||
"impressum": "impressum",
|
"agb": "agb", # v2: AGBAgent with decision_method routing (71% FP -> ~0)
|
||||
"dse": "dse", # v3: 4-Layer (Regex-Boost/Keyword/BGE-M3-Recall/Semantic)
|
||||
}
|
||||
|
||||
_MIN_TEXT = 100
|
||||
@@ -112,14 +115,17 @@ async def run_agent_outputs(state: dict) -> None:
|
||||
)
|
||||
|
||||
outputs: dict[str, dict] = state.get("agent_outputs") or {}
|
||||
for topic, agent_id in _TOPIC_AGENTS.items():
|
||||
|
||||
async def _run_one(topic: str, agent_id: str):
|
"""Run one topic agent and emit its tab event immediately (interim finding).
Catches its own errors, so a single agent cannot abort the run."""
|
||||
text = (doc_texts.get(topic) or "").strip()
|
||||
if len(text) < _MIN_TEXT:
|
||||
continue
|
||||
return None
|
||||
agent = REGISTRY.get(agent_id)
|
||||
if agent is None:
|
||||
logger.warning("agent_outputs: agent '%s' not registered", agent_id)
|
||||
continue
|
||||
return None
|
||||
try:
|
||||
out = await agent.evaluate(AgentInput(
|
||||
doc_type=topic,
|
||||
@@ -128,15 +134,25 @@ async def run_agent_outputs(state: dict) -> None:
|
||||
company_name=company_name,
|
||||
origin_domain=origin_domain,
|
||||
))
|
||||
outputs[topic] = out.model_dump(mode="json")
|
||||
emit(check_id, {"type": "topic", "topic": topic,
|
||||
"output": outputs[topic]})
|
||||
dump = out.model_dump(mode="json")
|
||||
emit(check_id, {"type": "topic", "topic": topic, "output": dump})
|
||||
logger.info(
|
||||
"agent_outputs[%s]: %d findings, confidence %.2f",
|
||||
topic, len(out.findings), out.confidence,
|
||||
)
|
||||
return topic, dump
|
||||
except Exception as e: # noqa: BLE001 — best-effort, never break the run
|
||||
logger.warning("agent_outputs[%s] failed: %s", topic, e)
|
||||
return None
|
||||
|
||||
# Topic agents run CONCURRENTLY (their embedding/LLM waits overlap) and fill
# their tab via SSE as soon as they finish; no waiting for the slowest one.
|
||||
results = await asyncio.gather(
|
||||
*(_run_one(topic, agent_id) for topic, agent_id in _TOPIC_AGENTS.items())
|
||||
)
|
||||
for r in results:
|
||||
if r:
|
||||
outputs[r[0]] = r[1]
|
||||
|
||||
if outputs:
|
||||
state["agent_outputs"] = outputs
|
||||
|
||||
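The concurrency change in this hunk can be illustrated with a small standalone sketch. The agent coroutines (`ok_agent`, `failing_agent`) and the wrapper names are hypothetical stand-ins, not the project's real agents or its `emit` call; the sketch only shows the pattern of running per-topic work through `asyncio.gather` while each task catches its own errors.

```python
import asyncio


async def ok_agent(topic: str) -> dict:
    # Hypothetical agent: pretends to wait on an embedding/LLM call.
    await asyncio.sleep(0.01)
    return {"topic": topic, "findings": []}


async def failing_agent(topic: str) -> dict:
    # Hypothetical agent that always fails.
    raise RuntimeError(f"{topic} backend unavailable")


async def run_one(topic, agent):
    """Run one agent; swallow its error so sibling agents keep running."""
    try:
        return topic, await agent(topic)
    except Exception:
        return None


async def run_all(agents: dict) -> dict:
    # gather() runs the coroutines concurrently and returns results in input order.
    results = await asyncio.gather(*(run_one(t, a) for t, a in agents.items()))
    return {r[0]: r[1] for r in results if r}


outputs = asyncio.run(
    run_all({"impressum": ok_agent, "agb": failing_agent, "dse": ok_agent})
)
```

Because every task returns either a `(topic, result)` pair or `None`, the merge step after `gather` stays a plain loop and a failing topic simply leaves no entry.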
@@ -231,17 +231,6 @@ _USE_CASES: tuple[UseCase, ...] = (
|
||||
UseCase("bafin_it", "BaFin IT-Aufsicht (VAIT/BAIT)", "security",
|
||||
regulations=("VAIT", "BAIT"),
|
||||
verification_methods=("it_process", "document", "network")),
|
||||
UseCase("eidas", "eIDAS / Vertrauensdienste (VO 910/2014)", "product",
|
||||
regulations=("eIDAS",), verification_methods=("document", "it_process"),
|
||||
categories=("compliance", "security"),
|
||||
keyword_tokens=("eidas", "vertrauensdienst", "signatur", "siegel",
|
||||
"zeitstempel", "zertifikat")),
|
||||
UseCase("geschaeftsgeheimnis", "Geschäftsgeheimnisse (GeschGehG)", "cross_cutting",
|
||||
regulations=("GeschGehG",),
|
||||
verification_methods=("document", "it_process", "manual"),
|
||||
categories=("compliance", "security"),
|
||||
keyword_tokens=("geschäftsgeheimnis", "vertraulichkeit", "geheimhaltung",
|
||||
"betriebsgeheimnis")),
|
||||
)
|
||||
|
||||
|
||||
@@ -352,11 +341,6 @@ _REGULATION_RULES: tuple[tuple[str, str], ...] = (
|
||||
("bait", "bafin_it"),
|
||||
("gobd", "steuerrecht"),
|
||||
("dienstleistungs-informationspflichten", "impressum"),
|
||||
# eIDAS / trade secrets (new use cases 2026-06-17)
|
||||
("eidas", "eidas"),
|
||||
("910/2014", "eidas"),
|
||||
("geschäftsgeheim", "geschaeftsgeheimnis"),
|
||||
("geschgehg", "geschaeftsgeheimnis"),
|
||||
# Data-protection catch-alls (last)
|
||||
("nist privacy framework", "dse"),
|
||||
("dsgvo", "dse"),
|
||||
|
||||
@@ -0,0 +1,82 @@
|
"""Checker library: shared interface. See docs platform_checker_matrix.md.

A checker tests ONE control against ONE document and returns: present / missing
/ unclear (+ evidence). Modules (DSE/Impressum/AGB/...) supply only control metadata
via `ControlSpec` (verification_method + decision_method + checker-specific
config); the engine routes method-agnostically to the matching checker.

Platform goal: 14k controls -> 7 checker types -> a few checkers. A new
module thereby becomes a classification problem, not a research problem.
"""
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Any, Optional, Protocol, runtime_checkable
|
||||
|
||||
|
||||
class VerificationMethod:
|
"""Axis 1: WHICH checker type (category)."""
|
||||
FIELD = "FIELD"
|
||||
REFERENCE = "REFERENCE"
|
||||
BEHAVIOR = "BEHAVIOR"
|
||||
PRESENTATION = "PRESENTATION"
|
||||
CONTENT = "CONTENT"
|
||||
PROCESS = "PROCESS"
|
||||
TECHNICAL = "TECHNICAL"
|
||||
CONTRACTUAL = "CONTRACTUAL"
|
||||
|
||||
|
||||
class DecisionMethod:
|
"""Axis 2: HOW the decision is made (concrete mechanism)."""
|
||||
REGEX = "REGEX"
|
||||
EMBEDDING = "EMBEDDING"
|
||||
LLM = "LLM"
|
||||
LINK_RESOLVER = "LINK_RESOLVER"
|
||||
PLAYWRIGHT = "PLAYWRIGHT"
|
||||
AUDIT = "AUDIT"
|
||||
SCANNER = "SCANNER"
|
||||
|
||||
|
||||
@dataclass
|
||||
class ControlSpec:
|
"""Routing metadata + checker-specific config of a control. Modules fill in
only the fields relevant to their decision_method."""
|
||||
control_id: str
|
||||
verification_method: str
|
||||
decision_method: str
|
||||
label: str = ""
|
||||
severity: str = "MEDIUM"
|
||||
patterns: list[str] = field(default_factory=list) # FIELD/REGEX, REFERENCE
|
||||
paraphrases: list[str] = field(default_factory=list) # CONTENT (EMBEDDING/LLM)
|
||||
embed_threshold: Optional[float] = None # EMBEDDING (per-Control)
|
||||
topic_regex: str = "" # LLM: Section-Retrieval
|
||||
question: str = "" # LLM: check question
|
||||
extra: dict[str, Any] = field(default_factory=dict)
|
||||
|
||||
|
||||
@dataclass
|
||||
class DocContext:
|
"""The artifact under test. `text` = full text; `url`/`rendered` are for
PRESENTATION/BEHAVIOR (Playwright), to be added later."""
|
||||
text: str = ""
|
||||
url: str = ""
|
||||
rendered: Any = None
|
||||
|
||||
|
||||
@dataclass
|
||||
class CheckResult:
|
||||
present: Optional[bool] # True=fulfilled, False=missing, None=unclear (fail-safe)
|
||||
evidence: str = ""
|
||||
confidence: float = 0.0
|
||||
source: str = "" # which checker/tier answered
|
||||
detail: dict[str, Any] = field(default_factory=dict)
|
||||
|
||||
|
||||
@runtime_checkable
|
||||
class Checker(Protocol):
|
"""All checkers share the same signature -> the engine is method-agnostic and
routes only via ctrl.verification_method / ctrl.decision_method."""
|
||||
verification_method: str
|
||||
|
||||
async def check(self, ctrl: ControlSpec, doc: DocContext) -> CheckResult:
|
||||
...
|
||||
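To make the shared-interface idea concrete, here is a minimal sketch of a class that satisfies a `runtime_checkable` protocol without inheriting from it. `Spec`, `Result` and `KeywordChecker` are hypothetical, trimmed-down stand-ins for illustration; they are not the `ControlSpec`, `CheckResult` or any checker from this change.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@dataclass
class Spec:  # hypothetical, trimmed-down ControlSpec
    control_id: str
    patterns: tuple = ()


@dataclass
class Result:  # hypothetical, trimmed-down CheckResult
    present: Optional[bool]
    source: str = ""


@runtime_checkable
class Checker(Protocol):
    verification_method: str

    async def check(self, ctrl: Spec, text: str) -> Result: ...


class KeywordChecker:
    """Toy checker: True/False on a substring hit, None if it cannot decide."""
    verification_method = "FIELD"

    async def check(self, ctrl: Spec, text: str) -> Result:
        if not ctrl.patterns:
            return Result(present=None, source="keyword")
        hit = any(p.lower() in text.lower() for p in ctrl.patterns)
        return Result(present=hit, source="keyword")


checker = KeywordChecker()
conforms = isinstance(checker, Checker)  # structural check, no inheritance needed
found = asyncio.run(checker.check(Spec("c1", ("widerruf",)), "Ihr Widerrufsrecht ..."))
unclear = asyncio.run(checker.check(Spec("c2"), "any text"))
```

The three-valued `present` field is the part worth noticing: `None` lets a checker decline to answer, so a caller can keep whatever result it already had.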
@@ -0,0 +1,51 @@
|
"""CONTENT checker / decision_method=EMBEDDING.

Is the obligation SEMANTICALLY present in the text? Max cosine (doc chunks x control
paraphrases) >= per-control threshold. Deterministic (fixed embedding model)
and cached. Rescues recall false positives (clause present, worded differently).

If the embedding service fails, the checker returns present=None (unclear); the
caller then keeps the keyword result (no hang, no crash).
(Validated on AGB: 17 items, per-item threshold, 0 false rescues.)
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
|
||||
from .base import CheckResult, ControlSpec, DocContext, VerificationMethod
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Embed and cache the paraphrase vectors once per control.
|
||||
_PARA_CACHE: dict[str, list] = {}
|
||||
|
||||
|
||||
class EmbeddingChecker:
|
||||
verification_method = VerificationMethod.CONTENT
|
||||
|
||||
async def check(self, ctrl: ControlSpec, doc: DocContext) -> CheckResult:
|
||||
text = doc.text or ""
|
||||
paras = ctrl.paraphrases or []
|
||||
thr = ctrl.embed_threshold if ctrl.embed_threshold is not None else 0.60
|
||||
if not paras or len(text) < 100:
|
||||
return CheckResult(present=None, source="embedding")
|
||||
try:
|
||||
from compliance.services.mc_embedding_matcher import (
|
||||
DIM, _chunk_text, _cosine, _embed_texts,
|
||||
)
|
||||
if ctrl.control_id not in _PARA_CACHE:
|
||||
pv = await _embed_texts(paras)
|
||||
_PARA_CACHE[ctrl.control_id] = [v for v in pv if v and len(v) == DIM]
|
||||
pvecs = _PARA_CACHE[ctrl.control_id]
|
||||
chunks = _chunk_text(text)
|
||||
cvecs = [v for v in await asyncio.wait_for(
|
||||
_embed_texts(chunks), timeout=90.0) if v and len(v) == DIM]
|
||||
except Exception as e:  # also covers asyncio.TimeoutError (an Exception subclass)
|
||||
logger.info("embedding checker inaktiv %s: %s", ctrl.control_id, str(e)[:80])
|
||||
return CheckResult(present=None, source="embedding")
|
||||
if not pvecs or not cvecs:
|
||||
return CheckResult(present=None, source="embedding")
|
||||
best = max((_cosine(p, c) for p in pvecs for c in cvecs), default=0.0)
|
||||
return CheckResult(present=best >= thr, confidence=round(best, 3),
|
||||
source="embedding")
|
||||
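The decision rule of this checker (maximum cosine over all chunk and paraphrase pairs, compared against a per-control threshold) can be sketched without any embedding service. The 2-d vectors below are toy values standing in for real embeddings, and `cosine` / `rescue` are illustrative names, not the project's `_cosine` helper.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rescue(chunk_vecs, para_vecs, threshold=0.60):
    """Return (present, best); present is None when nothing can be compared."""
    if not chunk_vecs or not para_vecs:
        return None, 0.0
    best = max(cosine(p, c) for p in para_vecs for c in chunk_vecs)
    return best >= threshold, round(best, 3)


# Toy 2-d vectors standing in for real embeddings.
paraphrases = [[1.0, 0.0]]
chunks = [[0.0, 1.0], [0.8, 0.6]]
present, best = rescue(chunks, paraphrases, threshold=0.60)
strict, _ = rescue(chunks, paraphrases, threshold=0.90)
empty, _ = rescue([], paraphrases)
```

The same best score passes a 0.60 threshold and fails a 0.90 one, which is why the thresholds in this change are stored per control rather than globally.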
@@ -0,0 +1,106 @@
|
"""CONTENT/CONTRACTUAL checker / decision_method=LLM.

present/absent via the LLM cascade (`call_with_cascade`; prod: OVH-120b first).
Retrieval = WHOLE paragraph sections on the topic (not top-k chunks; that was the
key in the AGB validation). NO DEFECT verdict: correctness/defect checking
is a separate mode. present=None on error (fail-safe: the caller keeps the
keyword result). (Validated on AGB delivery/warranty.)
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
import re
|
||||
|
||||
from .base import CheckResult, ControlSpec, DocContext, VerificationMethod
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
_SECTION = re.compile(r"(?m)(?=^\s*(?:§\s*)?\d+[\.\)]\s)")
|
||||
_SYS = (
|
||||
"Du bist deutscher Compliance-Rechtsexperte. Entscheide, ob die genannte "
|
||||
"Pflicht in den vorgelegten Abschnitten vorhanden ist. NUR die Abschnitte "
|
||||
'zaehlen. Antworte NUR JSON: {"verdict":"ERFUELLT|FEHLT","zitat":"woertlich '
|
||||
'oder leer","begruendung":"1 Satz"}.'
|
||||
)
|
||||
|
||||
|
||||
def _sections(text: str) -> list[str]:
|
||||
return [s.strip() for s in _SECTION.split(text) if s.strip()]
|
||||
|
||||
|
||||
def _parse(txt: str) -> dict:
|
||||
out = (txt or "").strip()
|
||||
if out.startswith("```"):
|
||||
out = out.split("```", 2)[1]
|
||||
out = out[4:] if out.startswith("json") else out
|
||||
a, b = out.find("{"), out.rfind("}")
|
||||
return json.loads(out[a:b + 1] if 0 <= a < b else out)
|
||||
|
||||
|
||||
class LLMChecker:
|
||||
verification_method = VerificationMethod.CONTENT
|
||||
|
||||
async def check(self, ctrl: ControlSpec, doc: DocContext) -> CheckResult:
|
||||
text = doc.text or ""
|
||||
if len(text) < 50:
|
||||
return CheckResult(present=None, source="llm")
|
||||
# decision_method=LLM with judge='haiku': sufficiency path (validated
# P0.89/R0.91). The Qwen-first cascade is refuted as a sufficiency judge
# -> use Haiku directly here, criteria-guided subsumption.
|
||||
if (ctrl.extra or {}).get("judge") == "haiku":
|
||||
return await self._haiku(ctrl, text)
|
||||
secs = _sections(text)
|
||||
if ctrl.topic_regex:
|
||||
rel = [s for s in secs if re.search(ctrl.topic_regex, s, re.I)][:6] or secs[:6]
|
||||
else:
|
||||
rel = secs[:6]
|
||||
question = ctrl.question or f"Ist die Pflicht '{ctrl.label}' im Text vorhanden?"
|
||||
try:
|
||||
from compliance.services.llm_cascade import call_with_cascade
|
||||
r = await call_with_cascade(
|
||||
_SYS,
|
||||
json.dumps({"frage": question, "abschnitte": rel}, ensure_ascii=False),
|
||||
min_confidence=0.6, max_tokens=500,
|
||||
)
|
||||
obj = _parse(r.get("text"))
|
||||
verdict = obj.get("verdict")
|
||||
zitat = (obj.get("zitat") or "")[:120]
|
||||
if verdict not in ("ERFUELLT", "FEHLT"):
|
||||
return CheckResult(present=None, evidence=zitat, source=r.get("source", "?"))
|
||||
return CheckResult(
|
||||
present=verdict == "ERFUELLT", evidence=zitat,
|
||||
confidence=float(r.get("confidence") or 0.0),
|
||||
source=r.get("source", "llm"),
|
||||
)
|
||||
except Exception as e:
|
||||
logger.info("llm checker fail %s: %s", ctrl.control_id, str(e)[:80])
|
||||
return CheckResult(present=None, source="error")
|
||||
|
||||
async def _haiku(self, ctrl: ControlSpec, text: str) -> CheckResult:
|
"""Sufficiency via Haiku directly (validated judge). Criteria-guided:
the legal elements are in ctrl.paraphrases; reuses the
validated deep_check sufficiency prompt."""
|
||||
try:
|
||||
from compliance.services.llm_cascade import _call_anthropic
|
||||
from compliance.services.specialist_agents.dse.deep_check import (
|
||||
_JUDGE_SYS, _build_user, _parse as _parse_judge,
|
||||
)
|
||||
crit = ctrl.paraphrases or [ctrl.label or ctrl.control_id]
|
||||
user = _build_user(text, ctrl.label or ctrl.control_id, crit)
|
||||
obj = None
|
||||
for _ in range(2):
|
||||
obj = _parse_judge(await _call_anthropic(_JUDGE_SYS, user, max_tokens=400))
|
||||
if obj:
|
||||
break
|
||||
if not obj:
|
||||
return CheckResult(present=None, source="haiku")
|
||||
return CheckResult(
|
||||
present=bool(obj.get("erfuellt")),
|
||||
evidence=(obj.get("begruendung") or "")[:120],
|
||||
confidence=float(obj.get("confidence") or 0.0),
|
||||
source="haiku",
|
||||
)
|
||||
except Exception as e:
|
||||
logger.info("llm haiku checker fail %s: %s", ctrl.control_id, str(e)[:80])
|
||||
return CheckResult(present=None, source="error")
|
||||
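The section retrieval used before the LLM call can be tried in isolation. The sketch below copies the `_SECTION` pattern from this file and wraps the topic filter in a helper; the helper name `relevant` and the sample AGB text are made up for illustration.

```python
import re

# Same pattern as `_SECTION` above: split before lines that start a numbered clause.
SECTION = re.compile(r"(?m)(?=^\s*(?:§\s*)?\d+[\.\)]\s)")


def sections(text: str) -> list[str]:
    return [s.strip() for s in SECTION.split(text) if s.strip()]


def relevant(text: str, topic_regex: str = "", limit: int = 6) -> list[str]:
    """Whole sections matching the topic, else the first `limit` sections."""
    secs = sections(text)
    if topic_regex:
        hits = [s for s in secs if re.search(topic_regex, s, re.I)]
        return hits[:limit] or secs[:limit]
    return secs[:limit]


agb = (
    "Allgemeine Geschaeftsbedingungen\n"
    "§ 1. Geltungsbereich: Diese AGB gelten fuer alle Bestellungen.\n"
    "§ 2. Lieferung: Die Lieferzeit betraegt 3 bis 5 Werktage.\n"
    "3) Zahlung: Der Kaufpreis ist sofort faellig.\n"
)
all_secs = sections(agb)
delivery = relevant(agb, r"liefer")
fallback = relevant(agb, r"gewaehrleistung")
```

When the topic regex matches nothing, the helper falls back to the leading sections instead of sending an empty context, mirroring the `or secs[:6]` in the checker.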
@@ -0,0 +1,41 @@
|
"""REFERENCE checker (verification_method=REFERENCE, decision_method=LINK_RESOLVER).

Is there a clear reference to another mandatory document (+ optionally: does
the link resolve)? Deterministic. Example: 'Details in unserer Datenschutzerklaerung'.
NO LLM, no legal judgment. (Validated on AGB data_protection: 7/7.)

Actually resolving the link over HTTP is an optional runtime step
(online), not part of this deterministic text check; here the URL is
only extracted and returned in `detail['link']`.
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
|
||||
from .base import CheckResult, ControlSpec, DocContext, VerificationMethod
|
||||
|
||||
_URL = re.compile(r"https?://[^\s)\]]+", re.I)
|
||||
|
||||
|
||||
class ReferenceChecker:
|
||||
verification_method = VerificationMethod.REFERENCE
|
||||
|
||||
async def check(self, ctrl: ControlSpec, doc: DocContext) -> CheckResult:
|
||||
text = doc.text or ""
|
||||
pats = ctrl.patterns or []
|
||||
if not pats or not text:
|
||||
return CheckResult(present=False, source="reference")
|
||||
for p in pats:
|
||||
m = re.search(p, text, re.I)
|
||||
if m:
|
||||
window = text[max(0, m.start() - 40): m.end() + 200]
|
||||
url = _URL.search(window) or _URL.search(text)
|
||||
link = url.group(0) if url else None
|
||||
return CheckResult(
|
||||
present=True,
|
||||
evidence=" ".join(m.group(0).split())[:120],
|
||||
confidence=1.0,
|
||||
source="reference",
|
||||
detail={"link": link},
|
||||
)
|
||||
return CheckResult(present=False, source="reference")
|
||||
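The window logic of this checker (prefer a URL near the matched phrase, otherwise take the first URL in the document) is easy to misread, so here is a small standalone sketch. `find_reference` and the sample document are hypothetical; only the URL pattern and the window offsets are taken from the code above.

```python
import re

URL = re.compile(r"https?://[^\s)\]]+", re.I)


def find_reference(text: str, pattern: str):
    """Return (evidence, link) for the first pattern hit, or None if absent."""
    m = re.search(pattern, text, re.I)
    if not m:
        return None
    # Prefer a URL close to the hit; fall back to the first URL anywhere in the text.
    window = text[max(0, m.start() - 40): m.end() + 200]
    url = URL.search(window) or URL.search(text)
    return " ".join(m.group(0).split()), (url.group(0) if url else None)


doc = (
    "Impressum: https://example.com/impressum\n" + "x" * 100 + "\n"
    "Details in unserer Datenschutzerklaerung (https://example.com/datenschutz)."
)
hit = find_reference(doc, r"datenschutzerkl(ae|ä)rung")
miss = find_reference(doc, r"widerrufsbelehrung")
```

In the sample, the imprint URL appears first in the document but lies outside the window, so the link next to the matched phrase is the one returned.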
@@ -0,0 +1,68 @@
|
"""Checker router: method-agnostic dispatch.

control → sensor_classification (verification_method + decision_method) → checker.
A new module supplies only ControlSpecs; the router picks the checker. This turns
the "embedding finds, Claude decides" path into ONE shared CONTENT/LLM checker
instead of cookie-specific special logic. Checkers not yet built (PLAYWRIGHT/AUDIT/
SCANNER/REGEX-FIELD) → present=None (fail-safe: the caller keeps its deterministic result).
"""
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Optional
|
||||
|
||||
from .base import CheckResult, ControlSpec, DecisionMethod, DocContext
|
||||
from .embedding_checker import EmbeddingChecker
|
||||
from .llm_checker import LLMChecker
|
||||
from .reference_checker import ReferenceChecker
|
||||
|
||||
_LLM = LLMChecker()
|
||||
_EMB = EmbeddingChecker()
|
||||
_REF = ReferenceChecker()
|
||||
|
||||
# decision_method → checker. Missing mechanisms deliberately resolve to None (not built yet).
|
||||
_BY_DECISION: dict[str, Any] = {
|
||||
DecisionMethod.LLM: _LLM,
|
||||
DecisionMethod.EMBEDDING: _EMB,
|
||||
DecisionMethod.LINK_RESOLVER: _REF,
|
||||
}
|
||||
|
||||
|
||||
async def route_and_check(ctrl: ControlSpec, doc: DocContext) -> CheckResult:
|
||||
checker = _BY_DECISION.get((ctrl.decision_method or "").upper())
|
||||
if checker is None:
|
||||
return CheckResult(present=None,
|
||||
source=f"no_checker:{ctrl.decision_method}")
|
||||
return await checker.check(ctrl, doc)
|
||||
|
||||
|
||||
def build_spec(
|
||||
control_id: str,
|
||||
sensor_classification: Optional[dict[str, Any]],
|
||||
*,
|
||||
label: str = "",
|
||||
criteria: Optional[list] = None,
|
||||
question: str = "",
|
||||
patterns: Optional[list[str]] = None,
|
||||
embed_threshold: Optional[float] = None,
|
||||
) -> ControlSpec:
|
"""Builds a ControlSpec from the STORED sensor_classification
(canonical_controls.generation_metadata.sensor_classification) + the
control criteria. CONTENT/LLM → judge='haiku' (validated sufficiency
judge; default for sufficiency per the decision of 2026-06-22)."""
|
||||
sc = sensor_classification or {}
|
||||
vm = (sc.get("verification_method") or "").upper()
|
||||
dm = (sc.get("decision_method") or "").upper()
|
||||
extra: dict[str, Any] = {}
|
||||
if vm == "CONTENT" and dm == "LLM":
|
||||
extra["judge"] = "haiku"
|
||||
return ControlSpec(
|
||||
control_id=control_id,
|
||||
verification_method=vm,
|
||||
decision_method=dm,
|
||||
label=label,
|
||||
paraphrases=[str(c) for c in (criteria or []) if c],
|
||||
question=question,
|
||||
patterns=patterns or [],
|
||||
embed_threshold=embed_threshold,
|
||||
extra=extra,
|
||||
)
|
||||
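The dispatch in `route_and_check` boils down to a dictionary lookup with a fail-safe default. A minimal sketch, assuming two placeholder coroutines (`embedding_check`, `llm_check`) in place of the real checker objects:

```python
import asyncio


async def embedding_check(text: str):
    # Hypothetical stand-in for a real checker; always answers "present".
    return True


async def llm_check(text: str):
    # Hypothetical stand-in; always answers "missing".
    return False


# Only mechanisms that exist are registered; everything else falls through.
BY_DECISION = {"EMBEDDING": embedding_check, "LLM": llm_check}


async def route(decision_method, text: str):
    """Return (present, source); present is None when no checker is registered."""
    key = (decision_method or "").upper()
    checker = BY_DECISION.get(key)
    if checker is None:
        return None, f"no_checker:{decision_method}"
    return await checker(text), key.lower()


routed = asyncio.run(route("embedding", "some text"))
unbuilt = asyncio.run(route("PLAYWRIGHT", "some text"))
missing = asyncio.run(route(None, "some text"))
```

An unknown or empty method yields `None` rather than an exception, so adding a new decision method later only requires a new dictionary entry.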
@@ -142,19 +142,26 @@ async def _call_ovh(system: str, user: str, max_tokens: int = 6000) -> str:
|
||||
headers = {"Content-Type": "application/json"}
|
||||
if key:
|
||||
headers["Authorization"] = f"Bearer {key}"
|
||||
# gpt-oss-120b is a REASONING model: it spends output tokens on
|
||||
# chain-of-thought before emitting the answer. A low cap (e.g. deep_check's
|
||||
# max_tokens=400) makes it hit the length limit mid-reasoning and return
|
||||
# content=null — the whole tier then silently yields nothing. Floor the
|
||||
# budget so the reasoning AND the JSON answer fit.
|
||||
payload = {
|
||||
"model": model, "temperature": 0.05, "max_tokens": max_tokens,
|
||||
"model": model, "temperature": 0.05, "max_tokens": max(max_tokens, 2000),
|
||||
"messages": [{"role": "system", "content": system},
|
||||
{"role": "user", "content": user}],
|
||||
"response_format": {"type": "json_object"},
|
||||
}
|
||||
try:
|
||||
async with httpx.AsyncClient(timeout=45.0) as c:
|
||||
async with httpx.AsyncClient(timeout=90.0) as c:
|
||||
r = await c.post(f"{base.rstrip('/')}/v1/chat/completions",
|
||||
json=payload, headers=headers)
|
||||
r.raise_for_status()
|
||||
choice = (r.json().get("choices") or [{}])[0]
|
||||
return (choice.get("message") or {}).get("content", "") or ""
|
||||
msg = (r.json().get("choices") or [{}])[0].get("message") or {}
|
||||
# Answer is normally in content; if the model was length-capped the
|
||||
# JSON can land in reasoning_content instead — fall back to it.
|
||||
return (msg.get("content") or "") or (msg.get("reasoning_content") or "")
|
||||
except Exception as e:
|
||||
logger.warning("ovh cascade tier 2 failed: %s", e)
|
||||
return ""
|
||||
|
||||
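The two fixes in this hunk (a floor on the token budget and a fallback to `reasoning_content`) can be checked as pure functions, with no HTTP call. The helper names and the sample response dictionaries are illustrative assumptions modeled on the fields this code reads; they are not captured API output.

```python
def effective_max_tokens(requested: int, floor: int = 2000) -> int:
    """Never send a budget below the floor, so reasoning and answer both fit."""
    return max(requested, floor)


def extract_answer(response: dict) -> str:
    """Prefer message.content; fall back to reasoning_content; else empty string."""
    msg = (response.get("choices") or [{}])[0].get("message") or {}
    return (msg.get("content") or "") or (msg.get("reasoning_content") or "")


# Illustrative response shapes.
normal = {"choices": [{"message": {"content": '{"verdict":"FEHLT"}'}}]}
capped = {
    "choices": [
        {"message": {"content": None, "reasoning_content": '{"verdict":"ERFUELLT"}'}}
    ]
}
empty = {"choices": []}
```

A caller that asks for 400 tokens is raised to 2000, while a larger request such as 6000 passes through unchanged.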
@@ -0,0 +1,102 @@
|
"""AGB routing pipeline (C-lean): takes the ChecklistAgent's keyword result
and routes items that failed the keyword check, per `_routing.decision_method`, to the
reusable checker library (Embedding / Reference / LLM). The business-model
gate (applicability) runs first. Re-tiering (LOW → recommendation) and
output assembly are done by the AGBAgent; only the routing decision lives here.

Validated (7-company Opus GT): 71 % FP → ~0. agent.py stays thin; this is the
only place where the C-lean flow lives.
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
|
||||
from compliance.services.checkers.base import (
|
||||
ControlSpec,
|
||||
DecisionMethod,
|
||||
DocContext,
|
||||
VerificationMethod,
|
||||
)
|
||||
from compliance.services.checkers.embedding_checker import EmbeddingChecker
|
||||
from compliance.services.checkers.llm_checker import LLMChecker
|
||||
from compliance.services.checkers.reference_checker import ReferenceChecker
|
||||
|
||||
from . import _routing
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Checkers are stateless (heavy imports happen only in .check()) → module singletons.
|
||||
_EMB = EmbeddingChecker()
|
||||
_REF = ReferenceChecker()
|
||||
_LLM = LLMChecker()
|
||||
|
||||
|
||||
def _spec(item_id: str) -> ControlSpec:
|
"""Build a ControlSpec for an item from the AGB routing config."""
|
||||
dm = _routing.decision_method(item_id)
|
||||
if dm == _routing.REFERENCE:
|
||||
return ControlSpec(
|
||||
control_id=item_id, verification_method=VerificationMethod.REFERENCE,
|
||||
decision_method=DecisionMethod.LINK_RESOLVER,
|
||||
patterns=[_routing.REFERENCE_PATTERNS[item_id]],
|
||||
)
|
||||
if dm == _routing.LLM:
|
||||
return ControlSpec(
|
||||
control_id=item_id, verification_method=VerificationMethod.CONTENT,
|
||||
decision_method=DecisionMethod.LLM,
|
||||
paraphrases=_routing.PARAPHRASES.get(item_id, []),
|
||||
topic_regex=_routing.LLM_TOPIC.get(item_id, ""),
|
||||
question=_routing.LLM_QUESTION.get(item_id, ""),
|
||||
)
|
||||
return ControlSpec(
|
||||
control_id=item_id, verification_method=VerificationMethod.CONTENT,
|
||||
decision_method=DecisionMethod.EMBEDDING,
|
||||
paraphrases=_routing.PARAPHRASES.get(item_id, []),
|
||||
embed_threshold=_routing.EMBED_THRESHOLDS.get(item_id),
|
||||
)
|
||||
|
||||
|
||||
async def _resolves(item_id: str, text: str, skip_llm: bool):
|
"""True = clause is present after all (resolve the keyword finding). False/None =
keep the finding (fail-safe: on uncertainty/service outage, prefer to report)."""
|
||||
dm = _routing.decision_method(item_id)
|
||||
if dm == _routing.MERGED:
|
||||
return True # merged into another item → no finding of its own
|
||||
doc = DocContext(text=text)
|
||||
spec = _spec(item_id)
|
||||
if dm == _routing.REFERENCE:
|
||||
return (await _REF.check(spec, doc)).present
|
||||
if dm == _routing.LLM:
|
||||
if skip_llm:
|
||||
return None # interactive: no LLM → keep the keyword result
|
||||
return (await _LLM.check(spec, doc)).present
|
||||
return (await _EMB.check(spec, doc)).present
|
||||
|
||||
|
||||
async def run_routed(base_findings: list, text: str, context: dict | None = None):
|
"""Routes the items that failed the keyword check.

Returns (kept, resolved_ids, gated_ids):
kept = findings that remain after gate + rescue
resolved_ids = recognized as present after all via Embedding/Reference/LLM
gated_ids = not applicable for the business model (N/A)
"""
|
||||
context = context or {}
|
||||
skip_llm = bool(context.get("skip_llm"))
|
||||
model = _routing.detect_business_model(text)
|
||||
kept, resolved, gated = [], [], []
|
||||
for f in base_findings:
|
||||
item_id = f.field_id
|
||||
if not _routing.is_applicable(item_id, model):
|
||||
gated.append(item_id)
|
||||
continue
|
||||
try:
|
||||
present = await _resolves(item_id, text, skip_llm)
|
||||
except Exception as e: # noqa: BLE001 - best-effort, keep the finding
|
||||
logger.info("agb routing %s failed: %s", item_id, str(e)[:80])
|
||||
present = None
|
||||
if present is True:
|
||||
resolved.append(item_id)
|
||||
else:
|
||||
kept.append(f)
|
||||
return kept, resolved, gated
|
||||
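The three-way split that `run_routed` returns can be sketched with plain data. The function name `triage`, the applicability set and the verdict map are hypothetical inputs chosen for illustration; in the real pipeline they come from the business-model gate and the checkers.

```python
def triage(item_ids, applicable, verdicts):
    """Split failed items into (kept, resolved, gated).

    applicable: set of item ids that apply to the business model.
    verdicts:   item id -> True / False / None from a follow-up checker.
    """
    kept, resolved, gated = [], [], []
    for item_id in item_ids:
        if item_id not in applicable:
            gated.append(item_id)      # not applicable: drop without checking
        elif verdicts.get(item_id) is True:
            resolved.append(item_id)   # clause found after all
        else:
            kept.append(item_id)       # False or None: keep reporting it
    return kept, resolved, gated


kept, resolved, gated = triage(
    ["scope", "liability", "termination", "payment"],
    applicable={"scope", "liability", "payment"},
    verdicts={"scope": True, "liability": False, "payment": None},
)
```

Only an explicit `True` removes a finding; both `False` and `None` keep it, which is the fail-safe behavior the docstrings in this file describe.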
@@ -0,0 +1,144 @@
|
"""AGB routing: the verification_method / decision_method meta-model, applied
to the AGB_CHECKLIST. See docs-src/development/platform_checker_matrix.md.

Per checklist item: WHICH checker (verification_method) and HOW the decision
is made (decision_method). Single source of truth; `agb_checks.py` remains the pure
list of mandatory disclosures, this module is the additive routing overlay.

Validated 2026-06-20/21 against the 7-company Opus GT (71 % FP -> ~0):
- 17 items EMBEDDING (per-item cosine threshold; 21 recall FPs eliminated, 0 false rescues)
- 2 items LLM (delivery_timeframe, warranty_period; whole paragraph sections + strong model, present/absent)
- 1 item REFERENCE (data_protection; DSE reference + link, 7/7 deterministic)
- incorporation_clause MERGED into contract (implicit, no checker of its own)
"""
|
||||
from __future__ import annotations
|
||||
|
||||
# ── decision_method values ───────────────────────────────────────────────
|
||||
EMBEDDING = "EMBEDDING"
|
||||
LLM = "LLM"
|
||||
REFERENCE = "REFERENCE"
|
||||
MERGED = "MERGED" # in ein anderes Item aufgegangen -> kein eigener Check
|
||||
|
||||
# ── Per-Item Embedding-Rescue-Schwellen ───────────────────────────────────
|
||||
# An der 7-Firmen-GT kalibriert. BEWUSST per-Item: eine globale Schwelle trennt
|
||||
# bei juristischer Prosa nicht (PASS/FAIL ueberlappen global, trennen per-Item).
|
||||
# Vorlaeufig (FAIL n=25 klein) -> vor Prod mit mehr Firmen nachkalibrieren.
|
||||
EMBED_THRESHOLDS: dict[str, float] = {
|
||||
"scope": 0.58, "contract": 0.58, "payment": 0.60, "payment_methods": 0.58,
|
||||
"delivery": 0.57, "warranty": 0.58, "termination": 0.60,
|
||||
"termination_period": 0.60, "termination_form": 0.60, "consumer_rights": 0.55,
|
||||
"liability": 0.615, "jurisdiction": 0.585, "dispute_odr_link": 0.67,
|
||||
"choice_of_law_specific": 0.625, "payment_due_date": 0.705,
|
||||
"salvatory_clause": 0.565, "amendment_clause": 0.635,
|
||||
}
|
||||
|
||||
# ── decision_method je Item (deckt alle 21 Checklisten-IDs ab) ────────────
|
||||
DECISION_METHOD: dict[str, str] = {cid: EMBEDDING for cid in EMBED_THRESHOLDS}
|
||||
DECISION_METHOD.update({
|
||||
"delivery_timeframe": LLM,
|
||||
"warranty_period": LLM,
|
||||
"data_protection": REFERENCE,
|
||||
"incorporation_clause": MERGED, # -> contract
|
||||
})
|
||||
|
||||
# ── Applicability-Gate (VOR allen Pruefern; Geschaeftsmodell entscheidet) ──
|
||||
ABO_ONLY = {"termination", "termination_period", "termination_form"} # nur Dauerschuld
|
||||
B2C_ONLY = {"consumer_rights", "dispute_odr_link"} # nicht reines B2B
|
||||
|
||||
# ── Reference paraphrases (embedding rescue + LLM section ranking) ────────
PARAPHRASES: dict[str, list[str]] = {
    "scope": ["Diese AGB gelten fuer alle Vertraege zwischen dem Anbieter und dem Kunden.",
              "Die Angebote richten sich ausschliesslich an Verbraucher, die privat kaufen.",
              "Geltungsbereich: fuer die Geschaeftsbeziehung gelten die nachfolgenden Bedingungen."],
    "contract": ["Durch Anklicken des Bestellbuttons gibt der Kunde ein verbindliches Angebot ab.",
                 "Der Vertrag kommt mit Zugang der Bestellbestaetigung zustande.",
                 "Mit der Bestellung erkennt der Kunde diese AGB als Vertragsbestandteil an."],
    "liability": ["Die Haftung fuer leicht fahrlaessige Pflichtverletzungen ist beschraenkt.",
                  "Wir haften unbeschraenkt fuer Schaeden aus Verletzung von Leben, Koerper, Gesundheit.",
                  "Bei Verletzung wesentlicher Vertragspflichten Haftung auf vorhersehbaren Schaden begrenzt."],
    "jurisdiction": ["Es gilt das Recht der Bundesrepublik Deutschland unter Ausschluss des UN-Kaufrechts.",
                     "Gerichtsstand fuer alle Streitigkeiten ist der Sitz des Unternehmens.",
                     "Auf die Vertraege findet deutsches Recht Anwendung."],
    "dispute_odr_link": ["Die EU-Kommission stellt eine Plattform zur Online-Streitbeilegung bereit.",
                         "Zur aussergerichtlichen Streitbeilegung steht die OS-Plattform zur Verfuegung."],
    "choice_of_law_specific": ["Es gilt deutsches Recht unter Ausschluss des UN-Kaufrechts (CISG).",
                               "Anwendbar ist das Recht der Bundesrepublik Deutschland."],
    "payment": ["Die Preise sind Endpreise inklusive Mehrwertsteuer; Versandkosten gesondert ausgewiesen.",
                "Zahlungsbedingungen und Preise richten sich nach den Angaben im Bestellprozess."],
    "payment_methods": ["Zur Zahlung stehen Vorkasse, Kreditkarte, Lastschrift, Rechnung und PayPal zur Verfuegung.",
                        "Folgende Zahlungsarten werden akzeptiert: Ueberweisung, SEPA-Lastschrift, Kreditkarte."],
    "payment_due_date": ["Der Kaufpreis ist sofort mit Vertragsschluss faellig.",
                         "Die Zahlung ist bei Bestellung zu leisten.",
                         "Der Rechnungsbetrag wird mit Versand der Ware faellig.",
                         "Bei Kauf auf Rechnung ist der Betrag innerhalb von 14 Tagen zu zahlen."],
    "delivery": ["Die Lieferung erfolgt an die vom Kunden angegebene Lieferadresse.",
                 "Wir liefern innerhalb Deutschlands; die Leistung wird nach Vertragsschluss erbracht."],
    "delivery_timeframe": ["Die Lieferzeit betraegt in der Regel 3-5 Werktage.",
                           "Die Ware wird voraussichtlich innerhalb von 2 bis 4 Werktagen geliefert."],
    "warranty": ["Es gelten die gesetzlichen Maengelhaftungsrechte (Gewaehrleistung).",
                 "Bei Maengeln stehen dem Kunden die gesetzlichen Gewaehrleistungsrechte zu.",
                 "Fuer Sachmaengel haften wir nach den gesetzlichen Bestimmungen."],
    "warranty_period": ["Die Gewaehrleistungsfrist betraegt zwei Jahre ab Lieferung.",
                        "Die Verjaehrungsfrist fuer Maengelansprueche betraegt zwei Jahre."],
    "termination": ["Der Vertrag kann von beiden Parteien ordentlich gekuendigt werden.",
                    "Das Abonnement kann jederzeit zum Ende der Laufzeit gekuendigt werden."],
    "termination_period": ["Die Kuendigungsfrist betraegt einen Monat zum Vertragsende.",
                           "Der Vertrag ist mit einer Frist von vier Wochen kuendbar."],
    "termination_form": ["Die Kuendigung bedarf der Textform und kann per E-Mail erfolgen.",
                         "Eine Kuendigung ist schriftlich oder per E-Mail moeglich."],
    "salvatory_clause": ["Sollten einzelne Bestimmungen unwirksam sein, bleibt die Wirksamkeit der uebrigen unberuehrt.",
                         "Die Unwirksamkeit einzelner Klauseln beruehrt nicht die Gueltigkeit der uebrigen AGB."],
    "amendment_clause": ["Wir behalten uns vor, diese AGB mit Wirkung fuer die Zukunft zu aendern.",
                         "Aenderungen dieser Bedingungen werden dem Kunden rechtzeitig mitgeteilt."],
    "consumer_rights": ["Die gesetzlichen Rechte des Verbrauchers bleiben unberuehrt.",
                        "Zwingende Verbraucherschutzvorschriften bleiben von diesen Bedingungen unberuehrt."],
}

# ── LLM items: paragraph-section retrieval + check question ───────────────
LLM_TOPIC: dict[str, str] = {
    "delivery_timeframe": r"liefer",
    "warranty_period": r"gew(?:ä|ae)hrleist|m(?:ä|ae)ngel|sachm|verj(?:ä|ae)hr|haftungsdauer|garantie",
}
LLM_QUESTION: dict[str, str] = {
    "delivery_timeframe": ("Wird eine KONKRETE Lieferzeit/Lieferfrist genannt (z.B. '3-5 Werktage', "
                           "'innerhalb von 2 Werktagen')? Eine nur allgemeine Lieferregelung ODER ein "
                           "Verweis 'Lieferzeit im Bestellvorgang' ohne konkrete Frist zaehlt NICHT."),
    "warranty_period": ("Wird eine KONKRETE Gewaehrleistungs-/Verjaehrungsfrist als ZAHL genannt "
                        "(z.B. 'zwei Jahre', 'ein Jahr')? Ein blosser Verweis auf 'gesetzliche "
                        "Verjaehrungsfristen' ohne Zahl zaehlt NICHT."),
}

# ── REFERENCE item data_protection ────────────────────────────────────────
REFERENCE_PATTERNS: dict[str, str] = {
    "data_protection": r"datenschutz(erkl(?:ä|ae)rung|bestimmung|hinweis)",
}


def detect_business_model(text: str) -> dict[str, bool]:
    """Deterministic business-model detector for the applicability gate.
    Edge case: mixed models (web shop + financing/service) can trigger 'abo'
    -> the termination gate then does not apply; deliberately conservative
    (better one termination check too many than a real gap overlooked)."""
    tl = text.lower()
    consumer = ("widerrufsbelehrung" in tl) or ("widerrufsrecht" in tl and "verbraucher" in tl)
    b2b = (not consumer) and any(s in tl for s in (
        "geschäftskunden", "ausschließlich an unternehmer", "nur an unternehmer",
        "lieferbedingungen für geschäftskunden"))
    abo = any(s in tl for s in (
        "abonnement", "mindestlaufzeit", "vertragslaufzeit", "verlängert sich",
        "monatsabo", "jahresabo")) or ("abo" in tl and "kündig" in tl)
    return {"b2b": b2b, "abo": abo, "b2c": not b2b}


def is_applicable(item_id: str, model: dict[str, bool]) -> bool:
    """Gate: does the item apply to this business model? (False -> N/A, do not check)."""
    if item_id in ABO_ONLY and not model.get("abo"):
        return False
    if item_id in B2C_ONLY and model.get("b2b"):
        return False
    return True


def decision_method(item_id: str) -> str:
    """decision_method for an item; defaults to EMBEDDING (prose rescue)."""
    return DECISION_METHOD.get(item_id, EMBEDDING)
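The comment on `EMBED_THRESHOLDS` argues that one global threshold cannot separate PASS from FAIL, while per-item thresholds can. The sketch below illustrates that idea under stated assumptions: the actual rescue logic lives in `_pipeline._resolves`, which is not shown here, so `rescued`, the fallback `default` and the toy 2-d vectors are hypothetical. It assumes a rescue compares the best paraphrase-to-chunk cosine similarity against the item's own threshold.

```python
import math

# Two of the per-item thresholds from the table above.
THRESHOLDS = {"scope": 0.58, "payment_due_date": 0.705}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rescued(item_id, paraphrase_vecs, chunk_vecs, default=0.60):
    """True if the best paraphrase/chunk similarity reaches the item threshold."""
    best = max((cosine(p, c) for p in paraphrase_vecs for c in chunk_vecs),
               default=0.0)
    return best >= THRESHOLDS.get(item_id, default)


# Toy vectors with cosine similarity 0.6 (3-4-5 triangle).
paraphrases = [[1.0, 0.0]]
chunks = [[3.0, 4.0]]
scope_ok = rescued("scope", paraphrases, chunks)
due_date_ok = rescued("payment_due_date", paraphrases, chunks)
```

The same similarity of 0.6 rescues `scope` (threshold 0.58) but not `payment_due_date` (threshold 0.705), which is the behaviour a single global cut-off cannot express.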
@@ -1,19 +1,60 @@
 """AGBAgent: general terms and conditions (AGB, §§ 305 ff. BGB).

-Thin subclass of ChecklistAgent over the curated AGB_CHECKLIST (L1
-mandatory disclosures + L2 detail checks). NO library firehose.
+ChecklistAgent subclass: first the L1/L2 keyword pass, then **C-lean routing**. The
+items that failed the keyword pass are routed by `decision_method` to the reusable
+checker library (Embedding / Reference / LLM), preceded by the business-model
+gate (applicability) and followed by severity re-tiering (LOW → recommendation).
+Validated against the 7-company Opus GT: 71 % FP → ~0. Config in `_routing`, flow in `_pipeline`.
 """

 from __future__ import annotations

 from compliance.services.doc_checks.agb_checks import AGB_CHECKLIST

+from .._base import AgentInput, AgentOutput, lint_output
 from .._checklist_agent import ChecklistAgent
+from .._rollup import rollup
+from ._pipeline import run_routed


 class AGBAgent(ChecklistAgent):
     CHECKLIST = AGB_CHECKLIST
     agent_id = "agb"
-    agent_version = "1.0"
+    agent_version = "2.0"  # v2: decision_method routing (Embedding/Reference/LLM)
     doc_type = "agb"
     owned_mc_ids = tuple(c["id"] for c in AGB_CHECKLIST)
+
+    async def evaluate(self, agent_input: AgentInput) -> AgentOutput:
+        # 1) Base keyword pass (L1/L2). out.findings = items that failed the keyword pass.
+        out = await super().evaluate(agent_input)
+        text = (agent_input.text or "").strip()
+        if len(text) < 100 or not out.findings:
+            return out  # too short / nothing to route
+
+        # 2) Routing: gate + Embedding/Reference/LLM rescue of the keyword misses.
+        kept, resolved, gated = await run_routed(
+            out.findings, text, agent_input.context)
+        resolved_set, gated_set = set(resolved), set(gated)
+
+        # 3) Align coverage: rescued → ok, gated → na.
+        for c in out.mc_coverage:
+            if c.mc_id in resolved_set:
+                c.status, c.reason = "ok", "semantisch vorhanden (Routing)"
+            elif c.mc_id in gated_set:
+                c.status, c.reason = "na", "für Geschäftsmodell nicht anwendbar"
+
+        # 4) Severity re-tiering: HIGH/MEDIUM = findings, LOW = recommendation only.
+        out.findings = [f for f in kept if f.severity in ("HIGH", "MEDIUM")]
+        out.recommendations = rollup(kept)
+
+        # 5) Recompute the aggregate metrics (coverage has shifted).
+        cov = out.mc_coverage
+        out.mc_total = len(cov)
+        out.mc_ok = sum(1 for c in cov if c.status == "ok")
+        out.mc_na = sum(1 for c in cov if c.status == "na")
+        out.mc_high = sum(1 for c in cov if c.status == "high")
+        out.mc_medium = sum(1 for c in cov if c.status == "medium")
+        out.mc_low = sum(1 for c in cov if c.status == "low")
+        out.notes = ((out.notes + " · ") if out.notes else "") + \
+            f"routed: {len(resolved)} rescued, {len(gated)} n/a"
+        return lint_output(out)
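Steps 3 to 5 of `evaluate` (coverage alignment, severity re-tiering, recount) can be exercised without the agent framework. This is a rough sketch under assumptions: `Coverage`, `Finding` and `retier` are hypothetical stand-ins, and the `low_only` list only mirrors the comment "LOW = recommendation only", since the behaviour of the real `rollup` helper is not shown in this diff.

```python
from dataclasses import dataclass


@dataclass
class Coverage:  # hypothetical stand-in for one mc_coverage entry
    mc_id: str
    status: str


@dataclass
class Finding:  # hypothetical stand-in for a checklist finding
    field_id: str
    severity: str


def retier(kept, coverage, resolved, gated):
    """Return (findings, low_only, counts) after routing."""
    resolved, gated = set(resolved), set(gated)
    for c in coverage:
        if c.mc_id in resolved:
            c.status = "ok"   # rescued by a checker
        elif c.mc_id in gated:
            c.status = "na"   # not applicable to the business model
    findings = [f for f in kept if f.severity in ("HIGH", "MEDIUM")]
    low_only = [f for f in kept if f.severity == "LOW"]
    counts = {s: sum(1 for c in coverage if c.status == s)
              for s in ("ok", "na", "high", "medium", "low")}
    return findings, low_only, counts


coverage = [Coverage("scope", "medium"), Coverage("termination", "high"),
            Coverage("liability", "high"), Coverage("salvatory_clause", "low")]
kept = [Finding("liability", "HIGH"), Finding("salvatory_clause", "LOW")]
findings, low_only, counts = retier(kept, coverage, ["scope"], ["termination"])
```

The counts are recomputed from the coverage list after the statuses change, so rescued and gated items no longer inflate the high/medium totals.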
+78
@@ -0,0 +1,78 @@
"""Applicability gate for the cookie-policy scan.

Excludes from the cookie findings scan those controls that, according to
`compliance.control_classification`, do NOT run against a cookie policy
('COOKIE_POLICY' not in applicable_artifacts). They belong to a different
artifact/checker (banner: BEHAVIOR/Playwright; security/TOM/audit: PROCESS)
and would otherwise produce nonsense findings (e.g. 'TOMs not documented'
against a cookie policy). They are NOT deleted but returned as a
routing list.

Unlike the DSE gate there is NO needs_review exception: the artifact signal is
decisive here and backed by the inventory (2026-06-21); the 11 mis-scoped
controls have been reviewed. Fail-safe: if the table is missing or the DB is
unreachable -> empty dict -> NO filtering (no silent loss of recall).
"""

from __future__ import annotations

import logging
import os
from typing import Any

logger = logging.getLogger(__name__)


async def load_cookie_gate(db_url: str = "") -> dict[str, dict[str, Any]]:
    """Returns {control_id: meta} for the controls that must be excluded from
    the cookie findings scan (no COOKIE_POLICY artifact). Empty dict =
    no filter."""
    dsn = (db_url or os.getenv("DATABASE_URL")
           or os.getenv("COMPLIANCE_DATABASE_URL") or "")
    if not dsn:
        return {}
    try:
        import asyncpg
        conn = await asyncpg.connect(dsn)
        try:
            rows = await conn.fetch(
                """SELECT control_id, obligation_type, check_intent,
                          applicable_artifacts
                   FROM compliance.control_classification
                   WHERE is_active
                     AND NOT ('COOKIE_POLICY' = ANY(applicable_artifacts))""")
        finally:
            await conn.close()
    except Exception as e:  # table missing / DB down -> no filter
        logger.info("cookie classification gate inaktiv: %s", str(e)[:90])
        return {}
    return {
        r["control_id"]: {
            "obligation_type": r["obligation_type"],
            "check_intent": r["check_intent"],
            "applicable_artifacts": list(r["applicable_artifacts"] or []),
        }
        for r in rows if r["control_id"]
    }


def apply_gate(
    controls: list[dict[str, Any]],
    gate: dict[str, dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Splits the loaded controls into (kept, routed_out).

    kept:       run through the cookie scan as usual.
    routed_out: taken out of the scan (control_id + title + classification
                metadata for routing to banner/security/audit).
    """
    kept: list[dict[str, Any]] = []
    routed_out: list[dict[str, Any]] = []
    for c in controls:
        cid = c.get("control_id")
        meta = gate.get(cid) if cid else None
        if meta:
            routed_out.append({"control_id": cid, "title": c.get("title"), **meta})
        else:
            kept.append(c)
    return kept, routed_out
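A short usage sketch of the split performed by `apply_gate`, including the fail-safe case where `load_cookie_gate` returns an empty dict. The function body is restated in compact form so the example runs on its own; the control IDs, titles and metadata values are invented for illustration and do not come from the real classification table.

```python
def apply_gate(controls, gate):
    """Compact restatement of the split above: (kept, routed_out)."""
    kept, routed_out = [], []
    for c in controls:
        cid = c.get("control_id")
        meta = gate.get(cid) if cid else None
        if meta:
            routed_out.append({"control_id": cid, "title": c.get("title"), **meta})
        else:
            kept.append(c)
    return kept, routed_out


# Hypothetical controls and gate entries.
controls = [
    {"control_id": "CK-001", "title": "Cookie categories are listed"},
    {"control_id": "SEC-014", "title": "TOMs are documented"},
]
gate = {
    "SEC-014": {
        "obligation_type": "PROCESS",
        "check_intent": "organizational",
        "applicable_artifacts": ["TOM"],
    }
}

kept, routed_out = apply_gate(controls, gate)
# Fail-safe path: an empty gate (DB unreachable) filters nothing.
kept_all, routed_none = apply_gate(controls, {})
```

Because a missing table yields `{}`, the worst case is that out-of-scope controls are scanned again; no control silently disappears.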
+63
@@ -0,0 +1,63 @@
"""Layer-3 sufficiency judge for the cookie policy.

The embedding/boost auto-rescue (layer 0/2) is DELIBERATELY optimistic: it finds
the topic but does not prove fulfilment. Measurement (2026-06-22): 159 FN
(over-rescue) against the Opus GT, because 'topic mentioned' was waved through
as 'fulfilled'. This layer checks EXACTLY the rescued controls with the validated
Haiku judge (cohort cookie_sufficiency_v1: P0.89/R0.91), NOT the Qwen-first
cascade (the local model has been ruled out as a sufficiency judge), and retracts
'passed' when the concrete obligation is not fulfilled. 'Embedding finds, Claude decides.'

Only for the NON-skip_llm path (full check); the fast/interactive path
keeps the deterministic rescue.
"""

from __future__ import annotations

import logging
from typing import Any

logger = logging.getLogger(__name__)

_RESCUE_MARKERS = ("+embedding", "+regex_boost")


def _is_rescued(r: dict[str, Any]) -> bool:
    src = r.get("source") or ""
    # bool(...) so the function returns a real bool even when "passed" is None/missing.
    return bool(r.get("passed")) and any(m in src for m in _RESCUE_MARKERS)


async def judge_rescued(text: str, results: list[dict[str, Any]]) -> int:
    """Checks all rescued (embedding/boost) passed controls with Haiku.
    Retracts passed when the judge considers the obligation NOT fulfilled.
    Returns the number of retracted (corrected) rescues.
    """
    # Goes through the shared checker router (no cookie special case any more):
    # CONTENT/LLM → build_spec sets judge='haiku' → LLMChecker (validated
    # sufficiency judge). This makes cookie the first real router consumer.
    from compliance.services.checkers.base import DocContext
    from compliance.services.checkers.router import build_spec, route_and_check

    candidates = [r for r in results if _is_rescued(r)]
    if not candidates:
        return 0
    doc = DocContext(text=text)
    sc = {"verification_method": "CONTENT", "decision_method": "LLM"}
    corrected = 0
    for r in candidates:
        crit = r.get("_pass_criteria") or [r.get("label") or r.get("hint") or ""]
        if not isinstance(crit, list):
            crit = [str(crit)]
        label = r.get("label") or r.get("hint") or r.get("control_id") or ""
        spec = build_spec(r.get("control_id") or "", sc, label=label, criteria=crit)
        res = await route_and_check(spec, doc)
        if res.present is False:
            r["passed"] = False
            r["source"] = (r.get("source") or "") + "+llm_failed"
            r["matched_text"] = "[layer-3 sufficiency-judge: nicht erfuellt]"
            r["_judge_reason"] = (res.evidence or "")[:200]
            corrected += 1
    if corrected:
        logger.info("cookie layer-3 sufficiency-judge: %d/%d rescues zurueckgenommen",
                    corrected, len(candidates))
    return corrected
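The retraction rule of the layer-3 judge can be tried out with a stub in place of the LLM. In this sketch `stub_judge`, `retract` and the control IDs are hypothetical; the real code calls `route_and_check` from the checker router. It illustrates the three-valued verdict: only an explicit `False` retracts a rescue, while `None` (judge undecided or unavailable) leaves the optimistic result in place.

```python
import asyncio

RESCUE_MARKERS = ("+embedding", "+regex_boost")


def is_rescued(r):
    src = r.get("source") or ""
    return bool(r.get("passed")) and any(m in src for m in RESCUE_MARKERS)


async def stub_judge(control_id, text):
    """Hypothetical judge: True/False verdict, None when undecided."""
    verdicts = {"CK-2": False, "CK-3": None}
    return verdicts.get(control_id, True)


async def retract(text, results, judge=stub_judge):
    corrected = 0
    for r in results:
        if not is_rescued(r):
            continue  # keyword passes and failed controls are not re-judged
        if await judge(r["control_id"], text) is False:
            r["passed"] = False
            r["source"] = (r.get("source") or "") + "+llm_failed"
            corrected += 1
    return corrected


results = [
    {"control_id": "CK-1", "passed": True, "source": "keyword"},
    {"control_id": "CK-2", "passed": True, "source": "keyword+embedding"},
    {"control_id": "CK-3", "passed": True, "source": "keyword+regex_boost"},
    {"control_id": "CK-4", "passed": False, "source": "keyword+embedding"},
]
corrected = asyncio.run(retract("cookie policy text", results))
```

Restricting the judge to rescued controls keeps the number of LLM calls proportional to the rescues, not to the size of the control catalogue.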
@@ -96,6 +96,22 @@ class CookiePolicyAgent(BaseSpecialistAgent):
                 "Branchen-MCs entfernt"
             )

+        # Layer 3: sufficiency judge (Haiku) on the embedding/boost-rescued
+        # controls. Embedding finds the topic, Claude decides whether the obligation
+        # is concretely fulfilled. Only in the full check (not skip_llm).
+        skip_llm = bool((agent_input.context or {}).get("skip_llm"))
+        if not skip_llm:
+            try:
+                from ._sufficiency_judge import judge_rescued
+                corrected = await judge_rescued(text, results)
+                if corrected:
+                    notes_parts.append(
+                        f"layer-3 sufficiency-judge: {corrected} Rescues "
+                        "zurückgenommen"
+                    )
+            except Exception as e:
+                logger.warning("cookie layer-3 judge skipped: %s", e)
+
         seen: set[str] = set()
         for r in results:
             mc_id = r.get("control_id") or ""
@@ -45,6 +45,15 @@ async def run_v3_pipeline(
         controls = []
     _normalize_criteria(controls)
     controls, sector_dropped = _filter_sector(controls, business_scope)
+    # Artifact gate: drop controls without the COOKIE_POLICY artifact (security/TOM/audit,
+    # banner). They belong to a different checker/artifact and would otherwise produce
+    # nonsense findings. See _classification_gate.
+    routed_out: list[dict[str, Any]] = []
+    try:
+        from ._classification_gate import apply_gate, load_cookie_gate
+        controls, routed_out = apply_gate(controls, await load_cookie_gate(db_url))
+    except Exception as e:
+        logger.warning("cookie classification gate skipped: %s", e)
     results: list[dict[str, Any]] = []
     if controls:
         try:
@@ -111,6 +120,7 @@ async def run_v3_pipeline(
         "layer_0_boost_overrides": boost_overrides,
         "total_mcs": len(results),
         "sector_dropped": sector_dropped,
+        "artifact_gated": len(routed_out),
     }
     return results, telemetry

@@ -0,0 +1,80 @@
"""Applicability gate for the DSE (Datenschutzerklaerung, privacy policy) scan.

Excludes from the DSE FINDINGS scan those controls that, according to
`compliance.control_classification`, do NOT run against a DSE
('DSE' not in applicable_artifacts) AND are classified with confidence
(needs_review=false). They are NOT deleted but returned as an
*organizational checklist* (routing to VVT/TOM/audit).

Fail-safe: uncertain classifications (needs_review=true) stay in the
findings scan. Defensive: if the table is missing (e.g. prod without the
migration), the gate returns an empty dict -> NO filtering.
"""

from __future__ import annotations

import logging
import os
from typing import Any

logger = logging.getLogger(__name__)


async def load_dse_gate(db_url: str = "") -> dict[str, dict[str, Any]]:
    """Returns {control_id: meta} for the controls that must be excluded from the
    DSE findings scan (organizational with high confidence). Empty dict = no filter.
    """
    dsn = (db_url or os.getenv("DATABASE_URL")
           or os.getenv("COMPLIANCE_DATABASE_URL") or "")
    if not dsn:
        return {}
    try:
        import asyncpg
        conn = await asyncpg.connect(dsn)
        try:
            rows = await conn.fetch(
                """SELECT control_id, obligation_type, check_intent,
                          applicable_artifacts, reference_allowed
                   FROM compliance.control_classification
                   WHERE is_active AND NOT needs_review
                     AND NOT ('DSE' = ANY(applicable_artifacts))""")
        finally:
            await conn.close()
    except Exception as e:  # table missing / DB unreachable -> no filter
        logger.info("dse classification gate inaktiv: %s", str(e)[:90])
        return {}
    return {
        r["control_id"]: {
            "obligation_type": r["obligation_type"],
            "check_intent": r["check_intent"],
            "applicable_artifacts": list(r["applicable_artifacts"] or []),
            "reference_allowed": r["reference_allowed"],
        }
        for r in rows if r["control_id"]
    }


def apply_gate(
    controls: list[dict[str, Any]],
    gate: dict[str, dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Splits the loaded controls into (findings_controls, organizational).

    findings_controls: run through the DSE scan as usual.
    organizational:    taken out of the scan and emitted as a checklist
                       (control_id + title + classification metadata for routing).
    """
    kept: list[dict[str, Any]] = []
    organizational: list[dict[str, Any]] = []
    for c in controls:
        cid = c.get("control_id")
        meta = gate.get(cid) if cid else None
        if meta:
            organizational.append({
                "control_id": cid,
                "title": c.get("title"),
                **meta,
            })
        else:
            kept.append(c)
    return kept, organizational
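The selection rule in the `WHERE` clause of `load_dse_gate` can be mirrored as a plain predicate, which makes the fail-safe behaviour easy to test. This is a sketch only: `routed_out_of_dse_scan` and the sample rows are hypothetical, and it does not model arrays that contain NULL elements. It does model one SQL detail: in PostgreSQL `x = ANY(NULL)` evaluates to NULL, so a row whose `applicable_artifacts` is NULL is not selected and therefore stays in the findings scan.

```python
def routed_out_of_dse_scan(row):
    """Mirror of: is_active AND NOT needs_review AND NOT ('DSE' = ANY(artifacts))."""
    artifacts = row.get("applicable_artifacts")
    if artifacts is None:
        # SQL three-valued logic: NOT (x = ANY(NULL)) is NULL, row not selected.
        return False
    return (bool(row["is_active"])
            and not row["needs_review"]
            and "DSE" not in artifacts)


# Hypothetical classification rows.
rows = [
    {"control_id": "A", "is_active": True, "needs_review": False,
     "applicable_artifacts": ["VVT"]},          # confidently organizational
    {"control_id": "B", "is_active": True, "needs_review": True,
     "applicable_artifacts": ["VVT"]},          # uncertain, stays in scan
    {"control_id": "C", "is_active": True, "needs_review": False,
     "applicable_artifacts": ["DSE", "VVT"]},   # runs against the DSE
    {"control_id": "D", "is_active": True, "needs_review": False,
     "applicable_artifacts": None},             # unclassified, stays in scan
    {"control_id": "E", "is_active": False, "needs_review": False,
     "applicable_artifacts": ["TOM"]},          # inactive row is ignored
]
routed = [r["control_id"] for r in rows if routed_out_of_dse_scan(r)]
```

Only control `A` leaves the scan; every uncertain, unclassified or inactive row falls back to being checked, which matches the fail-safe intent stated in the module docstring.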
@@ -0,0 +1,170 @@
"""Deterministic semantic recall layer for the DSE check.

WHY: pure keyword matching has poor recall (an obligation can be phrased in
many ways). The earlier regex boost was too blunt (it over-passed on complete
documents). BGE-M3 embeddings capture the MEANING and are DETERMINISTIC:
an embedding model is a fixed function, same text → same vector → same
pass/fail at a fixed threshold. Reproducible, auditable, no keyword catalogue,
no generative LLM at check time.

Design:
  - The doc is embedded ONCE per document hash (expensive: ~37s per 64k doc); the
    per-control scores are cached (/data) → follow-up checks are instant.
  - Reachability guard: if the embedding service is unreachable, the layer
    returns empty (the deterministic keyword layer carries the check): NO hang.
  - The threshold is the only tuning knob (DSE default 0.65, calibrated on the BMW GT;
    needs multi-company calibration against overfitting, deliberately conservative).
"""

from __future__ import annotations

import asyncio
import hashlib
import json
import logging
import os
import sqlite3
from typing import Iterable

logger = logging.getLogger(__name__)

# DSE threshold: measured on the BMW Haiku GT (PASS median 0.648 / FAIL median 0.612).
# 0.65 = precision-friendly (little over-passing). Can be overridden via ENV for a
# later multi-company calibration, without a code change.
DSE_EMBED_THRESHOLD = float(os.getenv("DSE_EMBED_THRESHOLD", "0.65"))
_CACHE_PATH = os.getenv("DSE_EMBED_CACHE", "/data/dse_embed_cache.json")
_SIDECAR_DB = os.getenv("MC_CLASS_DB", "/data/mc_classification.db")


def _doc_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8", "ignore")).hexdigest()[:20]


def _load_cache() -> dict:
    try:
        with open(_CACHE_PATH, encoding="utf-8") as f:
            return json.load(f)
    except Exception:
        return {}


def _save_cache(cache: dict) -> None:
    try:
        # Cap: keep at most the 30 most recently inserted documents (scores are
        # small). Eviction follows insertion order; a cache hit does not refresh
        # an entry, so this is not a true LRU.
        if len(cache) > 30:
            for k in list(cache.keys())[:-30]:
                cache.pop(k, None)
        tmp = _CACHE_PATH + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(cache, f)
        os.replace(tmp, _CACHE_PATH)
    except Exception as e:
        logger.warning("dse embed-cache save failed: %s", e)


def _load_control_vecs(cids: Iterable[str]) -> dict[str, list[float]]:
    from compliance.services.mc_embedding_matcher import _blob_to_vec
    cid_list = [c for c in cids if c]
    if not cid_list:
        return {}
    try:
        with sqlite3.connect(_SIDECAR_DB) as c:
            ph = ",".join("?" * len(cid_list))
            rows = c.execute(
                f"SELECT control_id, embedding FROM mc_classification "
                f"WHERE control_id IN ({ph}) AND doc_type='dse' "
                f"AND check_type='text' AND embedding IS NOT NULL",
                cid_list,
            ).fetchall()
        return {cid: _blob_to_vec(b) for cid, b in rows}
    except Exception as e:
        logger.warning("dse control-vec load failed: %s", e)
        return {}


async def _embedding_reachable(timeout: float = 2.0) -> bool:
    """Fast TCP connect to the embedding service. Prevents a dead service from
    blocking the check (load from the macmini Lehrer workload used to clog the
    embedding service)."""
    url = os.getenv("EMBEDDING_URL", "http://embedding-service:8087")
    hostport = url.split("://", 1)[-1].split("/", 1)[0]
    host, _, port = hostport.partition(":")
    port = int(port or "8087")
    try:
        fut = asyncio.open_connection(host, port)
        reader, writer = await asyncio.wait_for(fut, timeout=timeout)
        writer.close()
        try:
            await writer.wait_closed()
        except Exception:
            pass
        return True
    except Exception as e:
        logger.warning("dse embedding-service nicht erreichbar (%s) - "
                       "deterministischer Layer trägt", e)
        return False


async def _compute_scores(text: str, all_cids: list[str]) -> dict[str, float]:
    """Embeds the document ONCE and returns the max cosine per control."""
    from compliance.services.mc_embedding_matcher import (
        _chunk_text, _cosine, _embed_texts, DIM,
    )
    mc_vecs = _load_control_vecs(all_cids)
    if not mc_vecs:
        return {}
    chunks = _chunk_text(text)
    if not chunks:
        return {}
    chunk_vecs = await _embed_texts(chunks)
    chunk_vecs = [v for v in chunk_vecs if v and len(v) == DIM]
    if not chunk_vecs:
        return {}
    return {
        cid: round(float(max((_cosine(mv, cv) for cv in chunk_vecs),
                             default=0.0)), 4)
        for cid, mv in mc_vecs.items()
    }


async def embedding_recall(
    text: str,
    candidate_cids: Iterable[str],
    threshold: float | None = None,
    embed_timeout: float = 90.0,
) -> set[str]:
    """Returns the candidate control_ids that occur semantically (>= threshold) in
    the doc. Deterministic + cached. Empty set if the service is down or on error.

    candidate_cids: the controls that FAILED the keyword layer (recall rescue).
    """
    cands = [c for c in candidate_cids if c]
    if not text or len(text) < 100 or not cands:
        return set()
    thr = DSE_EMBED_THRESHOLD if threshold is None else threshold

    h = _doc_hash(text)
    cache = _load_cache()
    scores = cache.get(h)

    if scores is None:
        if not await _embedding_reachable():
            return set()
        try:
            scores = await asyncio.wait_for(
                _compute_scores(text, cands), timeout=embed_timeout)
        except Exception as e:  # includes asyncio.TimeoutError
            logger.warning("dse embedding_recall skipped: %s", e)
            return set()
        if not scores:
            return set()
        cache[h] = scores
        _save_cache(cache)
        logger.info("dse embedding_recall: doc %s eingebettet (%d Scores)",
                    h, len(scores))
    else:
        logger.info("dse embedding_recall: Cache-Treffer doc %s", h)

    cand_set = set(cands)
    return {cid for cid, s in scores.items()
            if cid in cand_set and s >= thr}
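Two small mechanics of the recall layer, the capped score cache keyed by document hash and the final threshold filter, can be shown without the embedding service. The sketch below is illustrative only: `cap`, `recall` and `MAX_DOCS = 3` are hypothetical (the module above keeps 30 documents), and the scores are made up.

```python
import hashlib

MAX_DOCS = 3  # the module above keeps 30; 3 keeps the example short


def doc_hash(text):
    return hashlib.sha256(text.encode("utf-8", "ignore")).hexdigest()[:20]


def cap(cache, max_docs=MAX_DOCS):
    """Drop the oldest-inserted entries (dicts preserve insertion order)."""
    for k in list(cache)[:-max_docs]:
        cache.pop(k, None)
    return cache


def recall(scores, candidates, threshold):
    """Candidates whose cached score reaches the threshold."""
    cand = set(candidates)
    return {cid for cid, s in scores.items() if cid in cand and s >= threshold}


cache = {}
for i in range(5):
    cache[doc_hash(f"doc {i}")] = {"C1": 0.5}
    cap(cache)

hits = recall({"A": 0.66, "B": 0.64, "C": 0.90}, ["A", "B"], 0.65)
```

Control `C` scores highest but is not returned because it was not a candidate; the filter only rescues controls that failed the keyword layer. The eviction here follows insertion order, so a frequently re-checked document can still be dropped once enough newer documents arrive.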
@@ -0,0 +1,183 @@
"""Tiered 3-status evaluation for DSE controls with `tiered_criteria`.

Each criterion is evaluated according to its `decision_method`:
  - EMBEDDING (presence): deterministic (fixed model), doc embedded ONCE per scan
    → reproducible, no LLM. Carries the BULK of the work.
  - LLM (sufficiency): Haiku judge, CACHED per (doc_hash, control_id#idx,
    PROMPT_VERSION, criterion) → same scan = same result. Resolves the
    empirically measured judge variance (a live call is NOT reproducible).

The status is derived from LEGAL_MINIMUM (LM) criteria ONLY:
  ERFÜLLT (fulfilled: all LM met OR no LM) · FEHLT (missing: no LM met) ·
  TEILWEISE (partial: some LM met) · UNBESTIMMT (undetermined: LM cannot be
  evaluated, e.g. embedding service down → the caller keeps its legacy result).
BEST_PRACTICE/OPTIONAL NEVER feed into the status, only into `recommendations`.
See docs-src/development/criterion_meta_model.md.
"""
from __future__ import annotations

import asyncio
import hashlib
import logging
import os
import sqlite3
from typing import Any, Optional

logger = logging.getLogger(__name__)

PROMPT_VERSION = "dse-tier-v1"
_CACHE_DB = os.getenv("TIERED_JUDGE_CACHE", "/data/tiered_judge_cache.db")
_EMBED_THR = float(os.getenv("DSE_CRITERION_EMBED_THRESHOLD", "0.62"))
LM = "LEGAL_MINIMUM"


def _doc_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8", "ignore")).hexdigest()[:20]


def _ckey(dh: str, cid: str, idx: int, crit: str) -> str:
    ch = hashlib.sha256(crit.encode("utf-8", "ignore")).hexdigest()[:12]
    return f"{dh}|{cid}#{idx}|{PROMPT_VERSION}|{ch}"


def _cache_get(key: str) -> Optional[bool]:
    try:
        with sqlite3.connect(_CACHE_DB) as c:
            c.execute("create table if not exists judge(k text primary key, met int)")
            row = c.execute("select met from judge where k=?", (key,)).fetchone()
        return None if row is None else bool(row[0])
    except Exception:
        return None


def _cache_put(key: str, met: bool) -> None:
    try:
        with sqlite3.connect(_CACHE_DB) as c:
            c.execute("create table if not exists judge(k text primary key, met int)")
            c.execute("insert or replace into judge values(?,?)", (key, int(met)))
    except Exception as e:
        logger.warning("tiered judge cache put: %s", e)


async def prepare_doc(text: str) -> dict[str, Any]:
    """Embeds the doc ONCE per scan. Returns {hash, chunk_vecs}. On embedding
    failure: chunk_vecs=None → EMBEDDING criteria become UNBESTIMMT (fallback)."""
    ctx: dict[str, Any] = {"hash": _doc_hash(text or ""), "chunk_vecs": None}
    if not text or len(text) < 100:
        return ctx
    try:
        from compliance.services.mc_embedding_matcher import DIM, _chunk_text, _embed_texts
        vecs = await asyncio.wait_for(_embed_texts(_chunk_text(text)), timeout=90.0)
        ctx["chunk_vecs"] = [v for v in vecs if v and len(v) == DIM]
    except Exception as e:  # includes asyncio.TimeoutError
        logger.warning("tiered prepare_doc embedding inaktiv: %s", e)
    return ctx


async def _embed_present(crits: list[str], ctx: dict, thr: float) -> dict[str, Optional[bool]]:
    cvecs = ctx.get("chunk_vecs")
    if not cvecs:
        return {c: None for c in crits}
    try:
        from compliance.services.mc_embedding_matcher import DIM, _cosine, _embed_texts
        pv = await _embed_texts(crits)
        out: dict[str, Optional[bool]] = {}
        for crit, v in zip(crits, pv):
            if not v or len(v) != DIM:
                out[crit] = None
            else:
                out[crit] = max((_cosine(v, cv) for cv in cvecs), default=0.0) >= thr
        return out
    except Exception as e:
        logger.warning("tiered embed present: %s", e)
        return {c: None for c in crits}


async def _llm_met(cid: str, idx: int, crit: str, doc, dh: str) -> Optional[bool]:
    key = _ckey(dh, cid, idx, crit)
    cached = _cache_get(key)
    if cached is not None:
        return cached
    from compliance.services.checkers.router import build_spec, route_and_check
    spec = build_spec(cid, {"verification_method": "CONTENT", "decision_method": "LLM"},
                      label=crit, criteria=[crit])
    res = await route_and_check(spec, doc)
    if res.present is None:
        return None  # undecided verdicts are not cached
    _cache_put(key, bool(res.present))
    return bool(res.present)


def _status(lm_vals: list[Optional[bool]]) -> str:
|
||||
if not lm_vals:
|
||||
return "ERFÜLLT" # no legal minimum → never red
|
||||
if any(m is None for m in lm_vals):
|
||||
return "UNBESTIMMT" # caller keeps the legacy result
|
||||
n = sum(1 for m in lm_vals if m)
|
||||
if n == len(lm_vals):
|
||||
return "ERFÜLLT"
|
||||
return "FEHLT" if n == 0 else "TEILWEISE"
|
||||
|
||||
|
||||
async def evaluate_tiered(control_id: str, tiered_criteria: list[dict],
|
||||
ctx: dict, doc) -> dict[str, Any]:
|
||||
dh = ctx.get("hash") or _doc_hash(getattr(doc, "text", "") or "")
|
||||
emb_texts = [c["criterion"] for c in (tiered_criteria or [])
|
||||
if c.get("criterion")
|
||||
and (c.get("decision_method") or "EMBEDDING").upper() != "LLM"]
|
||||
emb_res = await _embed_present(emb_texts, ctx, _EMBED_THR) if emb_texts else {}
|
||||
|
||||
lm_vals: list[Optional[bool]] = []
|
||||
recs: list[dict] = []
|
||||
detail: list[dict] = []
|
||||
for idx, c in enumerate(tiered_criteria or []):
|
||||
crit = c.get("criterion") or ""
|
||||
if not crit:
|
||||
continue
|
||||
tier = (c.get("compliance_tier") or "").upper()
|
||||
if (c.get("decision_method") or "EMBEDDING").upper() == "LLM":
|
||||
met = await _llm_met(control_id, idx, crit, doc, dh)
|
||||
src = "haiku-cache"
|
||||
else:
|
||||
met = emb_res.get(crit)
|
||||
src = "embedding"
|
||||
detail.append({"criterion": crit, "tier": tier, "met": met, "source": src})
|
||||
if tier == LM:
|
||||
lm_vals.append(met)
|
||||
elif met is False:
|
||||
recs.append({"criterion": crit, "tier": tier or "OPTIONAL",
|
||||
"legal_basis": c.get("legal_basis")})
|
||||
|
||||
return {"status": _status(lm_vals), "lm_met": sum(1 for m in lm_vals if m),
|
||||
"lm_total": len(lm_vals), "recommendations": recs, "detail": detail}
|
||||
|
||||
|
||||
async def fetch_tiered_criteria(cids: list[str], db_url: str = "") -> dict[str, list]:
|
"""Load the tiered_criteria of the given controls from canonical_controls.
Returns an empty dict on error or without a DB (fallback: no tiering, the legacy result stands)."""
|
||||
cids = [c for c in cids if c]
|
||||
if not cids:
|
||||
return {}
|
||||
import json
|
||||
dsn = db_url or os.getenv("DATABASE_URL") or os.getenv("COMPLIANCE_DATABASE_URL")
|
||||
if not dsn:
|
||||
return {}
|
||||
try:
|
||||
import asyncpg
|
||||
conn = await asyncpg.connect(dsn)
|
||||
rows = await conn.fetch(
|
||||
"select control_id, generation_metadata->'tiered_criteria' tc "
|
||||
"from compliance.canonical_controls "
|
||||
"where control_id = any($1::text[]) "
|
||||
"and generation_metadata ? 'tiered_criteria'", cids)
|
||||
await conn.close()
|
||||
except Exception as e:
|
||||
logger.warning("fetch_tiered_criteria failed: %s", e)
|
||||
return {}
|
||||
out: dict[str, list] = {}
|
||||
for r in rows:
|
||||
tc = r["tc"]
|
||||
tc = json.loads(tc) if isinstance(tc, str) else tc
|
||||
if tc:
|
||||
out[r["control_id"]] = tc
|
||||
return out
|
||||
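The status rule in `_status` can be read in isolation: only `LEGAL_MINIMUM` criteria count, and a single undecided criterion makes the whole control undecided. The following is a minimal sketch that restates that rule outside the module; the function name `rollup_status` and the sample inputs are illustrative assumptions, not part of the diff.

```python
from typing import Optional


def rollup_status(lm_vals: list[Optional[bool]]) -> str:
    """Restatement of the `_status` rule from the diff, for illustration only."""
    if not lm_vals:
        return "ERFÜLLT"      # no LEGAL_MINIMUM criteria at all
    if any(v is None for v in lm_vals):
        return "UNBESTIMMT"   # at least one criterion could not be decided
    met = sum(1 for v in lm_vals if v)
    if met == len(lm_vals):
        return "ERFÜLLT"
    return "FEHLT" if met == 0 else "TEILWEISE"


# Hypothetical inputs: one list of per-criterion verdicts per control.
cases = {
    "all met": [True, True],
    "none met": [False, False],
    "mixed": [True, False],
    "undecided": [True, None],
    "no legal minimum": [],
}
for name, vals in cases.items():
    print(f"{name}: {rollup_status(vals)}")
```

Because the `None` check runs before the counting, an undecided criterion wins over any mix of met and unmet ones, which matches the "caller keeps legacy" fallback described in the module.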
@@ -1,29 +1,286 @@
|
"""DSEAgent: privacy policy / privacy notice (Art. 13/14 GDPR).
"""DSE-Agent v3: privacy policy / privacy notice (Art. 13/14
GDPR), built on doc_check_controls (267 text MCs from the DB).

Thin subclass of ChecklistAgent over the curated ART13_CHECKLIST (NO
90k-library firehose). Only specialty: third country is raised to HIGH when a
third-country transfer is documented (scan context).
Full parity with impressum/ + cookie_policy/ (user requirement 2026-06-17):
Layer 0: regex boost (curated Art. 13 patterns from mcs.py / ART13_CHECKLIST)
Layer 1: keyword match on the pass_criteria of the DSE DB MCs (deterministic)
Layer 2: BGE-M3 embedding match
Layer 3: semantic validator (LLM) for open HIGH/MEDIUM fails + auto-learning

The curated patterns are NOT lost: they boost (Layer 0) the DB
controls (e.g. a precise "keine Drittlandübermittlung" → third-country MC PASS, no
false positive). The DSE specialty remains: third country → HIGH when a transfer
is documented (scan_context).

The output layer (linter / rollup / methodology UI) stays 1:1.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from compliance.services.doc_checks.dse_checks import ART13_CHECKLIST
|
||||
import logging
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from .._base import AgentInput
|
||||
from .._checklist_agent import ChecklistAgent
|
||||
from .._base import (
|
||||
AgentInput,
|
||||
AgentOutput,
|
||||
BaseSpecialistAgent,
|
||||
EscalationLog,
|
||||
EvidenceSource,
|
||||
Finding,
|
||||
McCoverage,
|
||||
Severity,
|
||||
SourceType,
|
||||
lint_output,
|
||||
)
|
||||
from .._pattern_library import record as record_pattern
|
||||
from .._rollup import rollup
|
||||
from .._semantic_validator import build_rename_action, validate_present
|
||||
from .mcs import MC_IDS, MCS
|
||||
from .regex_boost import BOOST_KEYWORDS
|
||||
from .v3_engine import run_v3_pipeline
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class DSEAgent(ChecklistAgent):
|
||||
CHECKLIST = ART13_CHECKLIST
|
||||
_SEV_TO_ENUM = {
|
||||
"CRITICAL": Severity.HIGH,
|
||||
"HIGH": Severity.HIGH,
|
||||
"MEDIUM": Severity.MEDIUM,
|
||||
"LOW": Severity.LOW,
|
||||
"INFO": Severity.INFO,
|
||||
}
|
||||
|
||||
# Third-country vocabulary for the scan_context severity upgrade (Art. 13(1)(f)).
|
||||
_THIRD_COUNTRY_KW = tuple(set(
|
||||
BOOST_KEYWORDS.get("third_country", ())
|
||||
+ BOOST_KEYWORDS.get("third_country_mechanism", ())
|
||||
))
|
||||
|
||||
|
||||
def _build_measure(label: str, norm: str) -> str:
|
"""Measure (imperative) instead of a check question as the action."""
|
||||
base = (label or "").strip().rstrip(".")
|
||||
if not base:
|
||||
return ("Datenschutz-Pflichtangabe ergänzen und gegen Art. 13/14 "
|
||||
"DSGVO prüfen.")
|
||||
msg = f"Pflichtangabe ergänzen: {base}."
|
||||
if norm:
|
||||
msg += f" Rechtsgrundlage: {norm}."
|
||||
return msg
|
||||
|
||||
|
||||
def _is_third_country_topic(result: dict) -> bool:
|
"""Is this DB MC topically a third-country control?"""
|
||||
parts: list[str] = [str(result.get("label") or "").lower()]
|
||||
for c in (result.get("_pass_criteria") or []):
|
||||
if c:
|
||||
parts.append(str(c).lower())
|
||||
blob = " ".join(parts)
|
||||
hits = sum(1 for kw in _THIRD_COUNTRY_KW if kw in blob)
|
||||
return hits >= 1
|
||||
|
||||
|
||||
class DSEAgent(BaseSpecialistAgent):
|
||||
agent_id = "dse"
|
||||
agent_version = "1.0"
|
||||
agent_version = "3.0"
|
||||
doc_type = "dse"
|
||||
owned_mc_ids = tuple(c["id"] for c in ART13_CHECKLIST)
|
||||
owned_mc_ids = MC_IDS
|
||||
|
||||
def _severity_override(self, c: dict, agent_input: AgentInput):
|
||||
def _third_country_transfer(self, agent_input: AgentInput) -> bool:
|
||||
sc = (agent_input.context or {}).get("scan_context") or {}
|
||||
tc = str(sc.get("third_country_transfer", "")).lower() in (
|
||||
return str(sc.get("third_country_transfer", "")).lower() in (
|
||||
"yes", "true", "1", "ja")
|
||||
if tc and c["id"] in ("third_country", "third_country_mechanism"):
|
||||
return "HIGH"
|
||||
return None
|
||||
|
||||
async def evaluate(self, agent_input: AgentInput) -> AgentOutput:
|
||||
start = datetime.now(timezone.utc)
|
||||
text = (agent_input.text or "").strip()
|
||||
scope = set(agent_input.business_scope or [])
|
||||
coverage: list[McCoverage] = []
|
||||
findings: list[Finding] = []
|
||||
esc_logs: list[EscalationLog] = []
|
||||
notes_parts: list[str] = []
|
||||
|
||||
if len(text) < 100:
|
||||
for mc in MCS:
|
||||
coverage.append(McCoverage(
|
||||
mc_id=mc.mc_id, status="skipped",
|
||||
label=mc.label, reason="text too short",
|
||||
))
|
||||
return self._finalize(
|
||||
start, findings, esc_logs, coverage,
|
||||
confidence=0.0,
|
||||
notes="DSE-Text zu kurz oder leer.",
|
||||
)
|
||||
|
||||
tc_transfer = self._third_country_transfer(agent_input)
|
||||
# Embedding recall (Layer 2) ALWAYS runs: deterministic, cached
# (per doc hash → follow-up views are instant) and reachability-gated
# (no hang when the service is missing). Replaces the boost that passed too liberally.
|
||||
results, telemetry = await run_v3_pipeline(text, scope)
|
||||
notes_parts.append(
|
||||
f"v3-pipeline: {telemetry.get('total_mcs', 0)} DB-MCs · "
|
||||
f"{telemetry.get('layer_1_pass', 0)} Keyword-Treffer · "
|
||||
f"{telemetry.get('embedding_passes', 0)} semantisch (Embedding)"
|
||||
)
|
||||
if telemetry.get("sector_dropped") or telemetry.get("offtopic_dropped"):
|
||||
notes_parts.append(
|
||||
f"Scope-Filter: {telemetry.get('sector_dropped', 0)} "
|
||||
f"Branchen-MCs, {telemetry.get('offtopic_dropped', 0)} "
|
||||
"themenfremde MCs entfernt"
|
||||
)
|
||||
|
||||
seen: set[str] = set()
|
||||
for r in results:
|
||||
mc_id = r.get("control_id") or ""
|
||||
if not mc_id or mc_id in seen:
|
||||
continue
|
||||
seen.add(mc_id)
|
||||
passed = bool(r.get("passed"))
|
||||
sev = _SEV_TO_ENUM.get(
|
||||
(r.get("severity") or "MEDIUM").upper(), Severity.MEDIUM,
|
||||
)
|
||||
# DSE specialty: third country → HIGH when a transfer is documented.
|
||||
sev_reason = "db_mc_failed"
|
||||
if tc_transfer and _is_third_country_topic(r):
|
||||
sev = Severity.HIGH
|
||||
sev_reason = "db_mc_failed_third_country_transfer"
|
||||
coverage.append(McCoverage(
|
||||
mc_id=mc_id,
|
||||
status="ok" if passed else sev.value.lower(),
|
||||
reason=str(r.get("matched_text") or r.get("hint") or "")[:120],
|
||||
))
|
||||
if passed:
|
||||
continue
|
||||
label = r.get("label") or r.get("hint") or ""
|
||||
norm_str = str(r.get("regulation") or "").strip()
|
||||
if r.get("article"):
|
||||
norm_str = (norm_str + f" Art. {r.get('article')}").strip()
|
||||
if not norm_str:
|
||||
norm_str = "DSGVO Art. 13/14"
|
||||
findings.append(Finding(
|
||||
check_id=f"DSE-DBMC-{mc_id}",
|
||||
agent=self.agent_id,
|
||||
agent_version=self.agent_version,
|
||||
field_id=mc_id,
|
||||
severity=sev,
|
||||
severity_reason=sev_reason,
|
||||
title=str(label)[:200] or f"DB-MC {mc_id} nicht erfüllt",
|
||||
norm=norm_str,
|
||||
evidence="",
|
||||
action=_build_measure(str(label), norm_str)[:400],
|
||||
confidence=0.9,
|
||||
sources=[EvidenceSource(
|
||||
source_type=SourceType.MC,
|
||||
source_id=mc_id,
|
||||
detail=str(r.get("source") or "keyword_match")[:120],
|
||||
confidence=0.9,
|
||||
)],
|
||||
))
|
||||
|
||||
# Boost coverage: this agent's pattern hits (regex-boost field_ids).
|
||||
boost_ids = set(telemetry.get("layer_0_field_ids") or [])
|
||||
for mc in MCS:
|
||||
coverage.append(McCoverage(
|
||||
mc_id=mc.mc_id,
|
||||
status="ok" if mc.field_id in boost_ids else "na",
|
||||
label=mc.label,
|
||||
reason=("regex-boost hit"
|
||||
if mc.field_id in boost_ids
|
||||
else "kein Pattern-Treffer (kein Veto)"),
|
||||
))
|
||||
|
||||
if not (agent_input.context or {}).get("skip_llm"):
|
||||
await self._semantic_demote(text, findings, coverage)
|
||||
|
||||
confs = [f.confidence for f in findings if f.confidence] or [0.95]
|
||||
overall = sum(confs) / len(confs)
|
||||
|
||||
return self._finalize(
|
||||
start, findings, esc_logs, coverage,
|
||||
confidence=overall, notes=" · ".join(notes_parts),
|
||||
)
|
||||
|
||||
async def _semantic_demote(
|
||||
self, text: str, findings: list[Finding],
|
||||
coverage: list[McCoverage],
|
||||
) -> None:
|
"""LLM layer for HIGH/MEDIUM DB MCs: label-mismatch check.
On a hit: HIGH/MEDIUM → LOW plus a rename action."""
|
||||
candidates = [
|
||||
f for f in findings
|
||||
if f.severity in (Severity.HIGH.value, Severity.MEDIUM.value)
|
||||
and f.severity_reason in (
|
||||
"db_mc_failed", "db_mc_failed_third_country_transfer")
|
||||
]
|
||||
if not candidates:
|
||||
return
|
||||
result = await validate_present(
|
||||
text, [(f.field_id, f.title[:80]) for f in candidates],
|
||||
)
|
||||
if not result:
|
||||
return
|
||||
for finding in candidates:
|
||||
row = result.get(finding.field_id)
|
||||
if not row or not row.get("found"):
|
||||
continue
|
||||
if float(row.get("confidence") or 0) < 0.6: # tolerate an explicit None from the validator
|
||||
continue
|
||||
label_used = row.get("label_used") or "abweichendes Label"
|
||||
conf = float(row.get("confidence") or 0.8)
|
||||
finding.severity = Severity.LOW.value
|
||||
finding.severity_reason = "label_mismatch"
|
||||
finding.title = (
|
||||
f"Label '{label_used}' weicht von Standard ab"
|
||||
)
|
||||
finding.evidence = str(row.get("evidence") or "")[:200]
|
||||
finding.action = build_rename_action(
|
||||
finding.field_id, label_used,
|
||||
)
|
||||
finding.confidence = conf
|
||||
finding.sources.append(EvidenceSource(
|
||||
source_type=SourceType.LLM_LOCAL,
|
||||
source_id="semantic_validator",
|
||||
detail=f"LLM-confirmed: '{label_used}'",
|
||||
confidence=conf,
|
||||
))
|
||||
for c in coverage:
|
||||
if c.mc_id == finding.field_id:
|
||||
c.status = "low"
|
||||
c.reason = f"label_mismatch: '{label_used}'"
|
||||
try:
|
||||
record_pattern(
|
||||
field_id=finding.field_id,
|
||||
label_used=label_used,
|
||||
confidence=conf,
|
||||
agent_id=self.agent_id,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.warning("pattern-library record failed: %s", e)
|
||||
|
||||
def _finalize(
|
||||
self, start: datetime, findings: list[Finding],
|
||||
esc_logs: list[EscalationLog], coverage: list[McCoverage],
|
||||
confidence: float, notes: str = "",
|
||||
) -> AgentOutput:
|
||||
end = datetime.now(timezone.utc)
|
||||
recs = rollup(findings)
|
||||
out = AgentOutput(
|
||||
agent=self.agent_id,
|
||||
agent_version=self.agent_version,
|
||||
started_at=start,
|
||||
finished_at=end,
|
||||
duration_ms=int((end - start).total_seconds() * 1000),
|
||||
findings=findings,
|
||||
recommendations=recs,
|
||||
mc_coverage=coverage,
|
||||
escalation_log=esc_logs,
|
||||
confidence=confidence,
|
||||
notes=notes,
|
||||
mc_total=len(coverage),
|
||||
mc_ok=sum(1 for c in coverage if c.status == "ok"),
|
||||
mc_na=sum(1 for c in coverage if c.status == "na"),
|
||||
mc_high=sum(1 for c in coverage if c.status == "high"),
|
||||
mc_medium=sum(1 for c in coverage if c.status == "medium"),
|
||||
mc_low=sum(1 for c in coverage if c.status == "low"),
|
||||
)
|
||||
return lint_output(out)
|
||||
|
||||
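The third-country handling in the agent combines two checks: a flag parsed from `scan_context` and a keyword test on the control's label and pass criteria. A minimal sketch of that combination, using plain strings instead of the `Severity` enum; the keyword subset, the helper name `effective_severity` and the sample controls are assumptions made for illustration.

```python
from __future__ import annotations

# Small illustrative subset; the real list is built from BOOST_KEYWORDS.
THIRD_COUNTRY_KW = ("drittland", "standardvertragsklausel", "angemessenheitsbeschluss")


def third_country_transfer(context: dict | None) -> bool:
    """True when scan_context documents a third-country transfer."""
    sc = (context or {}).get("scan_context") or {}
    return str(sc.get("third_country_transfer", "")).lower() in ("yes", "true", "1", "ja")


def is_third_country_topic(result: dict) -> bool:
    """Keyword test on the label plus pass criteria of one DB control."""
    parts = [str(result.get("label") or "").lower()]
    parts += [str(c).lower() for c in (result.get("_pass_criteria") or []) if c]
    blob = " ".join(parts)
    return any(kw in blob for kw in THIRD_COUNTRY_KW)


def effective_severity(result: dict, context: dict | None) -> str:
    """Raise to HIGH only when both the flag and the topic test agree."""
    sev = (result.get("severity") or "MEDIUM").upper()
    if third_country_transfer(context) and is_third_country_topic(result):
        return "HIGH"
    return sev


ctx = {"scan_context": {"third_country_transfer": "Ja"}}
transfer_mc = {"label": "Übermittlung in ein Drittland", "severity": "medium"}
retention_mc = {"label": "Angabe der Speicherdauer", "severity": "low"}
print(effective_severity(transfer_mc, ctx))   # third-country control, flag set
print(effective_severity(retention_mc, ctx))  # unrelated control keeps its severity
```

Since the flag is compared as a lowercased string, a boolean `True` in the context is accepted as well as `"yes"` or `"ja"`.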
@@ -0,0 +1,129 @@
|
"""DSE deep check: LLM cascade on the FUZZY findings.

User architecture (2026-06-18): the deterministic engine (keyword + embedding)
triages. Clear-cut cases (very high/low embedding score) stay
deterministic. The FUZZY middle plus borderline passes go through the
cascade, because that is where both 'missed gaps' (the worst error) and
false findings (rework) arise.

Escalation on ANSWER UNCERTAINTY (not JSON validity): every tier returns
{erfuellt, confidence, begruendung}. Confidence < threshold → next tier.
Tier 1: Qwen 35B (local, fast, cheap)
Tier 2: OVH gpt-oss-120B
Tier 3: Claude, ONLY with approval (allow_claude), otherwise 'needs_freigabe'.

Judging guardrails (user requirements):
- Retention period is fulfilled only with a concrete maximum duration OR a genuine,
traceable criterion, NOT a circular one ('bis Zweck wegfällt', i.e. 'until the purpose lapses').
- Without sufficient context → lean towards not fulfilled (let nothing slip through).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
|
||||
from compliance.services.llm_cascade import (
|
||||
_call_anthropic, _call_ollama, _call_ovh,
|
||||
)
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Fuzzy embedding zone (calibrated on the 5-company GT, 2026-06-18): outside it
# the engine is confident enough, inside it the LLM decides.
|
||||
FUZZY_LO = float(os.getenv("DSE_FUZZY_LO", "0.50"))
|
||||
FUZZY_HI = float(os.getenv("DSE_FUZZY_HI", "0.72"))
|
||||
# Self-confidence threshold: below it → escalate.
|
||||
ESC_CONF = float(os.getenv("DSE_ESC_CONF", "0.75"))
|
||||
|
||||
_JUDGE_SYS = (
|
||||
"Du bist ein erfahrener DSGVO-Datenschutz-Auditor. Du prüfst, ob eine "
|
||||
"konkrete Pflicht in einer Datenschutzerklärung (DSE) ERFÜLLT ist. "
|
||||
"Sei streng wie ein Fachanwalt: lieber 'nicht erfüllt' wenn unklar — eine "
|
||||
"übersehene Lücke ist schlimmer als ein Hinweis zu viel. "
|
||||
"Speicherdauer ist NUR erfüllt bei konkreter Höchstdauer ODER einem echten, "
|
||||
"nachvollziehbaren Kriterium; zirkuläre Formeln ('bis der Zweck wegfällt') "
|
||||
"erfüllen die Pflicht NICHT. "
|
||||
'Antworte AUSSCHLIESSLICH als JSON: '
|
||||
'{"erfuellt": true|false, "confidence": 0.0-1.0, "begruendung": "kurz"}'
|
||||
)
|
||||
|
||||
|
||||
def _build_user(doc_text: str, title: str, criteria: list) -> str:
|
||||
crit = "; ".join(str(c) for c in (criteria or []) if c)[:600]
|
||||
return (
|
||||
f"PFLICHT: {title}\n"
|
||||
f"Erfüllt, wenn: {crit}\n\n"
|
||||
f"DATENSCHUTZERKLÄRUNG (Auszug):\n{doc_text[:14000]}\n\n"
|
||||
"Ist die Pflicht im Text inhaltlich erfüllt?"
|
||||
)
|
||||
|
||||
|
||||
def _parse(text: str) -> dict | None:
|
||||
if not text:
|
||||
return None
|
||||
s, e = text.find("{"), text.rfind("}")
|
||||
if s < 0 or e <= s:
|
||||
return None
|
||||
try:
|
||||
o = json.loads(text[s:e + 1])
|
||||
return {
|
||||
"erfuellt": bool(o.get("erfuellt")),
|
||||
"confidence": float(o.get("confidence") or 0.0),
|
||||
"begruendung": str(o.get("begruendung") or "")[:300],
|
||||
}
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
|
||||
async def judge_control(
|
||||
doc_text: str, title: str, criteria: list, allow_claude: bool = False,
|
||||
) -> dict:
|
"""Tiered judgment with self-confidence escalation. Returns
|
||||
{erfuellt, confidence, source, begruendung, needs_freigabe}."""
|
||||
user = _build_user(doc_text, title, criteria)
|
||||
tiers = [("qwen", _call_ollama), ("ovh_120b", _call_ovh)]
|
||||
best: dict | None = None
|
||||
for name, call in tiers:
|
||||
try:
|
||||
if name == "qwen":
|
||||
txt = await call(_JUDGE_SYS, user, max_tokens=400,
|
||||
timeout=60, think=False)
|
||||
else:
|
||||
txt = await call(_JUDGE_SYS, user, max_tokens=400)
|
||||
except Exception as e:
|
||||
logger.warning("deep_check tier %s failed: %s", name, e)
|
||||
txt = ""
|
||||
p = _parse(txt)
|
||||
if p:
|
||||
p["source"] = name
|
||||
best = p
|
||||
if p["confidence"] >= ESC_CONF:
|
||||
return {**p, "needs_freigabe": False}
|
||||
# Tier 3: Claude, only with approval
|
||||
if not allow_claude:
|
||||
if best:
|
||||
return {**best, "needs_freigabe": True}
|
||||
return {"erfuellt": False, "confidence": 0.0, "source": "none",
|
||||
"begruendung": "Unsicher — Anwalt/Claude-Freigabe nötig",
|
||||
"needs_freigabe": True}
|
||||
try:
|
||||
txt = await _call_anthropic(_JUDGE_SYS, user, max_tokens=400)
|
||||
p = _parse(txt)
|
||||
if p:
|
||||
return {**p, "source": "anthropic_claude", "needs_freigabe": False}
|
||||
except Exception as e:
|
||||
logger.warning("deep_check claude failed: %s", e)
|
||||
if best:
|
||||
return {**best, "needs_freigabe": False}
|
||||
return {"erfuellt": False, "confidence": 0.0, "source": "none",
|
||||
"begruendung": "Kein LLM-Ergebnis", "needs_freigabe": False}
|
||||
|
||||
|
||||
def is_fuzzy(score: float, kw_pass: bool) -> bool:
|
"""Fuzzy = inside the embedding gray zone AND not clearly confirmed by keyword.
A clear keyword pass stays deterministic, as do clearly high/low scores."""
|
||||
if kw_pass:
|
||||
return False
|
||||
return FUZZY_LO <= score <= FUZZY_HI
|
||||
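The escalation in `judge_control` is driven by the model's self-reported confidence rather than by whether the JSON parsed. A minimal, self-contained sketch of that loop with stubbed tiers in place of the real `_call_ollama` / `_call_ovh` calls; the stub functions, the `cascade` name and the canned answers are assumptions for illustration, and only the branch without Claude approval is modelled.

```python
from __future__ import annotations

import asyncio
import json

ESC_CONF = 0.75  # same default as DSE_ESC_CONF in the diff


def parse(text: str) -> dict | None:
    """Extract the outermost {...} span and read verdict plus confidence."""
    s, e = text.find("{"), text.rfind("}")
    if s < 0 or e <= s:
        return None
    try:
        o = json.loads(text[s:e + 1])
        return {"erfuellt": bool(o.get("erfuellt")),
                "confidence": float(o.get("confidence") or 0.0)}
    except Exception:
        return None


async def cascade(tiers, threshold: float = ESC_CONF) -> dict:
    """Walk the tiers in order; stop at the first sufficiently confident answer."""
    best = None
    for name, call in tiers:
        try:
            raw = await call()
        except Exception:
            raw = ""  # a failing tier simply yields no verdict
        p = parse(raw)
        if p:
            best = {**p, "source": name}
            if p["confidence"] >= threshold:
                return {**best, "needs_freigabe": False}
    if best:
        return {**best, "needs_freigabe": True}
    return {"erfuellt": False, "confidence": 0.0, "source": "none",
            "needs_freigabe": True}


# Hypothetical tier stubs returning canned model output.
async def low_conf() -> str:
    return '{"erfuellt": true, "confidence": 0.4}'


async def high_conf() -> str:
    return 'Answer: {"erfuellt": false, "confidence": 0.9}'


async def broken() -> str:
    raise RuntimeError("tier down")


result_escalated = asyncio.run(cascade([("qwen", low_conf), ("ovh_120b", high_conf)]))
result_unsure = asyncio.run(cascade([("qwen", low_conf), ("ovh_120b", broken)]))
print(result_escalated)
print(result_unsure)
```

In the second run the only usable answer stays below the threshold, so the sketch returns it with `needs_freigabe` set, mirroring the path where approval for the third tier is missing.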
@@ -0,0 +1,78 @@
|
"""Machine-check definitions for the DSE agent (Layer-0 regex boost).

One MC = one delimited Art. 13/14 GDPR mandatory field with deterministic
patterns. The source of the patterns is the ONE curated ART13_CHECKLIST
(doc_checks/dse_checks.py); here they are only lifted into the MC format so that
the regex boost (regex_boost.py) and the v3 engine (v3_engine.py) use the same
structure as impressum/ + cookie_policy/. NO pattern duplication: the patterns
stay in dse_checks.py, this module only compiles them.

Owner = dse-agent.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Pattern
|
||||
|
||||
from compliance.services.doc_checks.dse_checks import ART13_CHECKLIST
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class MC:
|
"""A machine-check definition (boost pattern for one DSE field)."""
|
||||
mc_id: str # DSE-MC-001 ...
|
||||
field_id: str # controller, legal_basis, third_country ...
|
||||
label: str
|
||||
norm: str
|
||||
patterns: tuple[Pattern[str], ...] = field(default_factory=tuple)
|
||||
severity_if_missing: str = "MEDIUM"
|
||||
level: int = 1
|
||||
|
||||
|
||||
_NORM_RE = re.compile(r"\((Art\.[^)]+|§\s*\d+[^)]*)\)")
|
||||
|
||||
|
||||
def _norm_of(label: str) -> str:
|
||||
m = _NORM_RE.search(label or "")
|
||||
return m.group(1).strip() if m else "Art. 13/14 DSGVO"
|
||||
|
||||
|
||||
def _compile(patterns: list[str]) -> tuple[Pattern[str], ...]:
|
||||
out: list[Pattern[str]] = []
|
||||
for p in patterns or ():
|
||||
try:
|
||||
out.append(re.compile(p, re.IGNORECASE | re.MULTILINE))
|
||||
except re.error:
|
||||
continue
|
||||
return tuple(out)
|
||||
|
||||
|
||||
def _build_mcs() -> tuple[MC, ...]:
|
"""Lifts the ART13_CHECKLIST into the MC format (boost pattern per field)."""
|
||||
mcs: list[MC] = []
|
||||
for i, c in enumerate(ART13_CHECKLIST, start=1):
|
||||
mcs.append(MC(
|
||||
mc_id=f"DSE-MC-{i:03d}",
|
||||
field_id=c["id"],
|
||||
label=c["label"],
|
||||
norm=_norm_of(c["label"]),
|
||||
patterns=_compile(c.get("patterns", [])),
|
||||
severity_if_missing=c.get("severity", "MEDIUM"),
|
||||
level=c.get("level", 1),
|
||||
))
|
||||
return tuple(mcs)
|
||||
|
||||
|
||||
MCS: tuple[MC, ...] = _build_mcs()
|
||||
|
||||
# Public list of all MC-IDs for the Registry / owned_mc_ids.
|
||||
MC_IDS: tuple[str, ...] = tuple(m.mc_id for m in MCS)
|
||||
|
||||
|
||||
def scope_matches(mc: MC, scope: set[str]) -> bool:
|
"""Art. 13/14 duties apply universally to every DSE, so there is no sector
gating at boost level (unlike Impressum with chamber professions). The
sector gate via the control_id prefix happens in the v3 engine."""
|
||||
return True
|
||||
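`_norm_of` derives the legal norm from a parenthesised reference in the checklist label, and `_compile` silently skips patterns that do not compile. A minimal sketch of both behaviours; the regular expression is copied from the diff, while the sample labels and patterns are made up for illustration and may not match the real ART13_CHECKLIST entries.

```python
import re

# Same expression as _NORM_RE in the diff: "(Art. ...)" or "(§ <number> ...)".
NORM_RE = re.compile(r"\((Art\.[^)]+|§\s*\d+[^)]*)\)")


def norm_of(label: str) -> str:
    """Return the norm inside the first matching parentheses, else a default."""
    m = NORM_RE.search(label or "")
    return m.group(1).strip() if m else "Art. 13/14 DSGVO"


def compile_patterns(patterns):
    """Compile case-insensitive patterns, dropping any that are invalid."""
    out = []
    for p in patterns or ():
        try:
            out.append(re.compile(p, re.IGNORECASE | re.MULTILINE))
        except re.error:
            continue
    return tuple(out)


# Hypothetical labels in the style of a checklist entry.
for label in ("Rechtsgrundlage (Art. 6 DSGVO)",
              "Anbieterkennzeichnung (§ 5 DDG)",
              "Stand der Erklärung"):
    print(label, "->", norm_of(label))

# The second pattern is deliberately invalid and gets skipped.
compiled = compile_patterns([r"speicherdauer", "("])
print(len(compiled))
```

Skipping invalid patterns keeps the import of the module from failing, at the cost that a typo in a checklist pattern goes unnoticed unless it is logged or tested elsewhere.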
@@ -0,0 +1,179 @@
|
"""Layer-0 regex boost for the DSE agent: the curated Art. 13/14 patterns
as a deterministic pre-stage before the keyword match from doc_check_controls.

Analogous to impressum/regex_boost.py + cookie_policy/regex_boost.py:
- run_v3_pipeline loads the 267 text MCs (doc_type='dse') and runs a
keyword match on their pass_criteria.
- This module's contribution (Layer 0): the precise Art. 13 patterns (mcs.py / from
ART13_CHECKLIST) run FIRST. If a pattern hits → the topically
matching DB MC is boosted to PASS (even if the keyword match was
unclear). Mapping: field_id → typical words in the pass_criteria of the DB MC.

This way the curated DSE patterns are not lost but boost the
DB control system (instead of replacing it).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
|
||||
from .mcs import MCS, scope_matches
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# field_id (from ART13_CHECKLIST) → words as they occur in the pass_criteria of
# the associated DSE DB MCs. If ≥2 of these words occur in the criteria of
# a DB MC, it belongs to this field → boost. The vocabulary is aligned with
# the DB criteria actually observed (DATA/SEC/AUTH controls).
|
||||
BOOST_KEYWORDS: dict[str, tuple[str, ...]] = {
|
||||
"controller": (
|
||||
"verantwortlich", "verantwortliche stelle", "verantwortlichen",
|
||||
"kontaktdaten des verantwortlichen", "name und kontaktdaten",
|
||||
"firmenname", "rechtsform", "anschrift", "ladungsfähige",
|
||||
"identität des verantwortlichen", "controller",
|
||||
),
|
||||
"dpo": (
|
||||
"datenschutzbeauftragt", "datenschutzbeauftragter",
|
||||
"datenschutzbeauftragte", "data protection officer",
|
||||
"kontaktdaten des datenschutz", "benennung", "art. 37",
|
||||
),
|
||||
"purposes": (
|
||||
"zweck", "zwecke", "verarbeitungszweck", "zweck der verarbeitung",
|
||||
"zwecke der verarbeitung", "verarbeitungstätigkeit", "purpose",
|
||||
"zweckbindung", "erhebung",
|
||||
),
|
||||
"legal_basis": (
|
||||
"rechtsgrundlage", "berechtigte interesse", "berechtigtes interesse",
|
||||
"einwilligung", "vertragserfüllung", "vertragserfuellung",
|
||||
"interessenabwägung", "interessenabwaegung", "art. 6",
|
||||
"rechtmäßigkeit", "rechtmaessigkeit", "erforderlich",
|
||||
),
|
||||
"recipients": (
|
||||
"empfänger", "empfaenger", "empfängerkategorien",
|
||||
"empfaengerkategorien", "weitergabe", "übermittlung an",
|
||||
"auftragsverarbeit", "auftragsverarbeiter", "dienstleister",
|
||||
"dritte", "drittempfänger", "kategorien von empfängern",
|
||||
),
|
||||
"third_country": (
|
||||
"drittland", "drittstaat", "drittländer", "drittlaender",
|
||||
"standardvertragsklausel", "angemessenheitsbeschluss",
|
||||
"übermittlung in ein drittland", "geeignete garantien",
|
||||
"schutzgarantien", "data privacy framework", "ewr",
|
||||
"internationale übermittlung", "transfermechanismus",
|
||||
),
|
||||
"third_country_mechanism": (
|
||||
"standardvertragsklausel", "angemessenheitsbeschluss",
|
||||
"geeignete garantien", "data privacy framework",
|
||||
"schutzgarantien", "art. 46", "art. 45",
|
||||
),
|
||||
"retention": (
|
||||
"speicherdauer", "aufbewahrungsfrist", "aufbewahrungsdauer",
|
||||
"löschfrist", "loeschfrist", "speicherbegrenzung", "löschung",
|
||||
"loeschung", "dauer der speicherung", "kriterien für die festlegung",
|
||||
"speicherfrist", "aufbewahrung",
|
||||
),
|
||||
"retention_periods": (
|
||||
"aufbewahrungsfrist", "löschfrist", "speicherdauer",
|
||||
"gesetzliche frist", "handelsrechtlich", "steuerrechtlich",
|
||||
),
|
||||
"rights": (
|
||||
"betroffenenrecht", "betroffenenrechte", "recht auf auskunft",
|
||||
"auskunft", "berichtigung", "löschung", "loeschung",
|
||||
"einschränkung", "einschraenkung", "datenübertragbarkeit",
|
||||
"datenuebertragbarkeit", "widerspruch", "widerruf",
|
||||
"rechte der betroffenen", "art. 15", "art. 17", "art. 21",
|
||||
),
|
||||
"complaint": (
|
||||
"beschwerderecht", "aufsichtsbehörde", "aufsichtsbehoerde",
|
||||
"datenschutzbehörde", "datenschutzbehoerde", "beschwerde",
|
||||
"recht auf beschwerde", "art. 77", "zuständige aufsichtsbehörde",
|
||||
),
|
||||
"rights_art22_profiling": (
|
||||
"automatisierte entscheidung", "profiling", "art. 22",
|
||||
"automatisierte einzelentscheidung", "scoring",
|
||||
),
|
||||
"dse_version_date": (
|
||||
"stand", "letzte aktualisierung", "zuletzt geändert",
|
||||
"gültig ab", "gueltig ab", "version", "versionsdatum",
|
||||
"aktualität", "nachweisbarkeit",
|
||||
),
|
||||
}
|
||||
|
||||
|
||||
def compute_regex_boosts(text: str, business_scope: set[str] | None = None) -> set[str]:
|
"""Which DSE field_ids did the curated patterns detect?

Returns the set of hit field_ids, which later decides whether a DB MC
can be passed automatically. business_scope is accepted (signature
parity with impressum) but not gated for DSE:
Art. 13 duties are universal.
|
||||
"""
|
||||
if not text or len(text) < 50:
|
||||
return set()
|
||||
scope = business_scope or set()
|
||||
hits: set[str] = set()
|
||||
for mc in MCS:
|
||||
if not scope_matches(mc, scope):
|
||||
continue
|
||||
if any(p.search(text) for p in mc.patterns):
|
||||
hits.add(mc.field_id)
|
||||
return hits
|
||||
|
||||
|
||||
def boost_matches_db_mc(
|
||||
boosts: set[str],
|
||||
pass_criteria: list,
|
||||
fail_criteria: list | None = None,
|
||||
) -> str | None:
|
"""Does a boosted field_id have ≥2 keyword overlaps with the pass/fail_criteria
of a DB MC? Returns the field_id (highest match count) or None."""
|
||||
if not boosts:
|
||||
return None
|
||||
crit_parts: list[str] = []
|
||||
for c in (pass_criteria or []):
|
||||
if c:
|
||||
crit_parts.append(str(c).lower())
|
||||
for c in (fail_criteria or []):
|
||||
if c:
|
||||
crit_parts.append(str(c).lower())
|
||||
if not crit_parts:
|
||||
return None
|
||||
crit_text = " ".join(crit_parts)
|
||||
best: tuple[int, str] | None = None
|
||||
for field_id in boosts:
|
||||
kws = BOOST_KEYWORDS.get(field_id) or ()
|
||||
match_count = sum(1 for kw in kws if kw in crit_text)
|
||||
if match_count >= 2:
|
||||
if best is None or match_count > best[0]:
|
||||
best = (match_count, field_id)
|
||||
return best[1] if best else None
|
||||
|
||||
|
||||
def criteria_on_topic(
|
||||
pass_criteria: list | None,
|
||||
fail_criteria: list | None = None,
|
||||
min_hits: int = 2,
|
||||
) -> bool:
|
"""Deterministic topic gate: does a DB MC belong to the DSE topic area
(Art. 13/14) at all? Requires ≥min_hits distinct keywords from
ANY DSE field in the combined criteria. Catches mis-tagged
MCs. Empty criteria → keep as on-topic (conservative)."""
|
||||
crit_parts: list[str] = []
|
||||
for c in (pass_criteria or []):
|
||||
if c:
|
||||
crit_parts.append(str(c).lower())
|
||||
for c in (fail_criteria or []):
|
||||
if c:
|
||||
crit_parts.append(str(c).lower())
|
||||
if not crit_parts:
|
||||
return True
|
||||
crit_text = " ".join(crit_parts)
|
||||
hits: set[str] = set()
|
||||
for kws in BOOST_KEYWORDS.values():
|
||||
for kw in kws:
|
||||
if kw in crit_text:
|
||||
hits.add(kw)
|
||||
if len(hits) >= min_hits:
|
||||
return True
|
||||
return False
|
||||
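`boost_matches_db_mc` counts keywords by substring containment in the joined criteria text. A minimal sketch of that matching with a two-field subset of `BOOST_KEYWORDS`; the function name `best_boost`, the reduced keyword map and the sample criteria are assumptions for illustration.

```python
from __future__ import annotations

# Reduced keyword map; the real BOOST_KEYWORDS has more fields and words.
BOOST_KEYWORDS = {
    "retention": ("speicherdauer", "löschfrist", "aufbewahrungsfrist"),
    "complaint": ("beschwerderecht", "aufsichtsbehörde", "beschwerde"),
}


def best_boost(boosts: set[str], criteria: list[str], min_overlap: int = 2) -> str | None:
    """Return the boosted field with the most keyword hits (at least min_overlap)."""
    text = " ".join(str(c).lower() for c in criteria if c)
    best: tuple[int, str] | None = None
    for field_id in sorted(boosts):  # sorted only to make ties deterministic here
        hits = sum(1 for kw in BOOST_KEYWORDS.get(field_id, ()) if kw in text)
        if hits >= min_overlap and (best is None or hits > best[0]):
            best = (hits, field_id)
    return best[1] if best else None


boosts = {"retention", "complaint"}
print(best_boost(boosts, ["Die Speicherdauer oder Löschfrist ist genannt"]))
print(best_boost(boosts, ["Hinweis auf Beschwerde"]))
print(best_boost(boosts, ["Beschwerderecht wird genannt"]))
```

One property worth noting: because matching is by substring, a single word such as "Beschwerderecht" also contains the keyword "beschwerde" and therefore already counts as two hits. Whether that is intended in the original is not stated in the diff; a word-boundary match would behave differently.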
@@ -0,0 +1,239 @@
|
"""v3 engine: runs the 4-layer pipeline on a DSE text (Art. 13/14 GDPR).

Layer 0: regex boost (the curated Art. 13 patterns from mcs.py)
Layer 1: MC loading + keyword match. LOADING is delegated to the main-tool
engine (rag_document_checker._load_controls, doc_type='dse'):
a single source of truth incl. P72 scope, check_type='text'
(267 of 571) and fits_doc_type/scope_requires from the sidecar.
Layer 2: BGE-M3 embedding match (mc_embedding_matcher, shared)
Layer 0 override: failed MCs whose criteria match a boosted field_id
are overridden to PASS.

Additionally at the agent boundary: a subtractive sector/topic gate (_filter_controls).
The sector gate (sector prefix GOV/FIN/MED…) discards MCs from other sectors,
the topic gate discards mis-tagged ones. Analogous to impressum/v3_engine.py.

Output: a list of result dicts compatible with rag_document_checker. The agent
converts them into Finding objects.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from typing import Any
|
||||
|
||||
from .regex_boost import (
|
||||
compute_regex_boosts,
|
||||
criteria_on_topic,
|
||||
)
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Branchen-Prefix -> erwarteter Scope-Token. Reuse aus dem Mail-V2-Scope-
|
||||
# Filter, damit Agent-Pfad und Report-Pfad dieselbe Quelle nutzen. Import
|
||||
# defensiv: faellt der Mail-Pfad weg, bleibt der Agent lauffaehig.
|
||||
try:
|
||||
from compliance.services.mail_render_v2._scope_filter import (
|
||||
SECTOR_PREFIXES,
|
||||
)
|
||||
except Exception: # pragma: no cover - defensiver Fallback
|
||||
SECTOR_PREFIXES = {}
|
||||


async def run_v3_pipeline(
    text: str,
    business_scope: set[str],
    db_url: str = "",
    skip_embedding: bool = False,
) -> tuple[list[dict[str, Any]], dict[str, Any]]:
    """Returns (results, telemetry).

    results: one dict per DB MC {control_id, passed, severity, ...}
    telemetry: counters for the frontend display (per-layer breakdown)

    skip_embedding: skip Layer 2 (BGE-M3 recall). Only for unit tests
    without an embedding service. In production the recall layer runs: it is
    cached (per doc hash) and reachability-gated, so it never blocks.
    """
    if not text or len(text) < 100:
        return [], {"reason": "text too short"}

    # Layer 0: curated Art. 13 patterns
    boosts = compute_regex_boosts(text, business_scope)
    boost_field_ids = sorted(boosts)
    logger.info("dse v3 Layer-0 boosts: %d hits — %s",
                len(boost_field_ids), boost_field_ids)

    # Layer 1: MC loading is DELEGATED to the main-tool engine (incl. scope protection).
    try:
        from compliance.services.rag_document_checker import _load_controls
        controls = await _load_controls(
            "dse", db_url, 0, business_scope,
        )
    except Exception as e:
        logger.warning("dse v3 load via main-tool engine failed: %s", e)
        controls = []
    _normalize_criteria(controls)
    # Agent-boundary backstop: sector gate (industry prefix) + topic gate.
    controls, drop_stats = _filter_controls(controls, business_scope)
    # Applicability gate: take high-confidence organizational controls
    # (per control_classification NOT DSE, needs_review=false) out of the
    # FINDINGS scan -> organizational checklist instead of a false FEHLT.
    # Fail-safe: needs_review controls stay in. Defensive: if the table is
    # missing, no filter is applied (prod-safe). See _classification_gate.
    from ._classification_gate import apply_gate, load_dse_gate
    gate = await load_dse_gate(db_url)
    organizational: list[dict[str, Any]] = []
    if gate:
        controls, organizational = apply_gate(controls, gate)
    results: list[dict[str, Any]] = []
    if controls:
        try:
            from compliance.services.rag_document_checker import (
                _check_mc_deterministic,
            )
            # Lowercase once and strip soft hyphens (U+00AD) before matching.
            text_lower = text.lower().replace("\xad", "")
            for mc in controls:
                r = _check_mc_deterministic(text_lower, mc)
                if r:
                    r["_pass_criteria"] = mc.get("pass_criteria")
                    r["_fail_criteria"] = mc.get("fail_criteria")
                    results.append(r)
        except Exception as e:
            logger.warning("layer-1 keyword check failed: %s", e)
            results = []

    layer_1_pass = sum(1 for r in results if r.get("passed"))

    # Layer 2: DETERMINISTIC semantic recall layer (BGE-M3, cached).
    # Replaces the earlier regex boost, which massively over-passed on
    # complete DSE documents (BMW: 71/94 boost overrides → 49% agreement).
    # Embedding is more accurate (BMW-GT: KW|EMB@0.65 = 75%) AND deterministic
    # (fixed function, reproducible: no keyword catalog, no LLM).
    embedding_passes = 0
    if not skip_embedding:
        failed_cids = [r.get("control_id") for r in results
                       if r and not r.get("passed") and r.get("control_id")]
        if failed_cids:
            try:
                from ._embedding_recall import embedding_recall
                semantic = await embedding_recall(text, failed_cids)
            except Exception as e:
                logger.warning("dse embedding recall failed: %s", e)
                semantic = set()
            for r in results:
                if (r.get("control_id") in semantic
                        and not r.get("passed")):
                    r["passed"] = True
                    r["matched_text"] = "[semantisch erkannt — Embedding]"
                    r["source"] = (r.get("source") or "") + "+embedding"
                    embedding_passes += 1

    # Layer 3: tiered 3-status evaluation (only controls with tiered_criteria).
    # Reproducible: EMBEDDING presence (deterministic) + CACHED Haiku judge
    # only for sufficiency. UNBESTIMMT → the legacy pass stays. Gated + fail-safe.
    tiered_evaluated = 0
    try:
        from compliance.services.checkers.base import DocContext
        from ._tiered_eval import (
            evaluate_tiered, fetch_tiered_criteria, prepare_doc,
        )
        result_cids = [r.get("control_id") for r in results if r.get("control_id")]
        tiered_map = await fetch_tiered_criteria(result_cids, db_url)
        if tiered_map:
            ctx = await prepare_doc(text)
            doc_ctx = DocContext(text=text)
            for r in results:
                tc = tiered_map.get(r.get("control_id"))
                if not tc:
                    continue
                ev = await evaluate_tiered(r["control_id"], tc, ctx, doc_ctx)
                if ev["status"] == "UNBESTIMMT":
                    continue
                r["compliance_status"] = ev["status"]
                r["recommendations"] = ev["recommendations"]
                r["tier_lm"] = f"{ev['lm_met']}/{ev['lm_total']}"
                r["passed"] = ev["status"] == "ERFÜLLT"
                tiered_evaluated += 1
    except Exception as e:
        logger.warning("dse tiered eval skipped: %s", e)

    telemetry = {
        "layer_0_field_hits": len(boost_field_ids),
        "layer_0_field_ids": boost_field_ids,
        "layer_1_pass": layer_1_pass,
        "embedding_passes": embedding_passes,
        "tiered_evaluated": tiered_evaluated,
        "total_mcs": len(results),
        "sector_dropped": drop_stats.get("sector_dropped", 0),
        "offtopic_dropped": drop_stats.get("offtopic_dropped", 0),
        "gate_excluded": len(organizational),
        "organizational_checklist": organizational,
    }
    logger.info("dse v3 telemetry: %s", telemetry)
    return results, telemetry
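The Layer 2 step in the pipeline above only ever upgrades failed results. It can be sketched as a small standalone helper. The name `apply_semantic_rescue` and the demo data are assumptions for illustration; the set of recalled control ids would come from `embedding_recall` in the real pipeline.

```python
def apply_semantic_rescue(results: list[dict], semantic: set[str]) -> int:
    """Flip failed results to passed when their control_id is in `semantic`.

    Results that already passed are left untouched. Returns the number of
    results that were flipped.
    """
    flipped = 0
    for r in results:
        if r.get("control_id") in semantic and not r.get("passed"):
            r["passed"] = True
            # Record that the pass came from the embedding layer.
            r["source"] = (r.get("source") or "") + "+embedding"
            flipped += 1
    return flipped


# Hypothetical demo data, not taken from a real run.
demo_results = [
    {"control_id": "DATA-1", "passed": False, "source": "keyword_match"},
    {"control_id": "DATA-2", "passed": False, "source": "keyword_match"},
    {"control_id": "AUTH-3", "passed": True, "source": "keyword_match"},
]
flipped = apply_semantic_rescue(demo_results, {"DATA-1", "AUTH-3"})
```

Because the step is purely additive, a failing or unreachable embedding service degrades to the Layer 1 keyword result rather than changing any verdict.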


def _filter_controls(
    controls: list[dict[str, Any]],
    business_scope: set[str],
) -> tuple[list[dict[str, Any]], dict[str, int]]:
    """Subtractive scope filter BEFORE evaluation.

    1. Sector gate: MCs whose control_id prefix denotes an industry
       (FIN/GOV/MED/INS/EDU/LEG/REL/POL) that is NOT in the business_scope
       AND that are not on-topic are discarded.
    2. Topic gate: MCs without any DSE topic overlap are discarded.

    Purely subtractive: only removes false-positive candidates.
    """
    scope_lc = {s.lower() for s in (business_scope or set())}
    kept: list[dict[str, Any]] = []
    sector_dropped = 0
    offtopic_dropped = 0
    for c in controls:
        cid = c.get("control_id") or ""
        prefix = cid.split("-")[0].upper() if "-" in cid else ""
        on_topic = criteria_on_topic(c.get("pass_criteria"),
                                     c.get("fail_criteria"))
        required = SECTOR_PREFIXES.get(prefix)
        # Sector gate only for controls that are NOT on-topic: a clearly
        # DSE-themed control (e.g. with a GOV prefix from domain detection)
        # must not fail on the industry prefix.
        if required and not (scope_lc & required) and not on_topic:
            sector_dropped += 1
            continue
        if not on_topic:
            offtopic_dropped += 1
            continue
        kept.append(c)
    if sector_dropped or offtopic_dropped:
        logger.info(
            "dse v3 scope-filter: -%d Branchen-MCs, -%d themenfremde MCs "
            "(scope=%s)", sector_dropped, offtopic_dropped,
            sorted(scope_lc) or "leer",
        )
    return kept, {
        "sector_dropped": sector_dropped,
        "offtopic_dropped": offtopic_dropped,
    }
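The interplay of the two gates is easiest to see on a few hand-made controls. This is a minimal sketch under stated assumptions: `DEMO_SECTOR_PREFIXES` is a hypothetical stand-in for `SECTOR_PREFIXES` (whose real contents are not shown here), and the `on_topic` callable replaces `criteria_on_topic` so the example needs no keyword table.

```python
# Hypothetical prefix table; the real SECTOR_PREFIXES is imported from
# mail_render_v2._scope_filter and its contents are not shown in this diff.
DEMO_SECTOR_PREFIXES: dict[str, set[str]] = {
    "FIN": {"finance"},
    "MED": {"health"},
}


def demo_filter(controls, business_scope, on_topic):
    """Apply the sector gate, then the topic gate.

    `on_topic` is a callable(control) -> bool standing in for
    criteria_on_topic. Returns (kept_controls, drop_counters).
    """
    scope_lc = {s.lower() for s in (business_scope or set())}
    kept, sector_dropped, offtopic_dropped = [], 0, 0
    for c in controls:
        cid = c.get("control_id") or ""
        prefix = cid.split("-")[0].upper() if "-" in cid else ""
        required = DEMO_SECTOR_PREFIXES.get(prefix)
        topical = on_topic(c)
        # Foreign-sector controls are only dropped here if they are also off-topic.
        if required and not (scope_lc & required) and not topical:
            sector_dropped += 1
            continue
        if not topical:
            offtopic_dropped += 1
            continue
        kept.append(c)
    return kept, {"sector_dropped": sector_dropped,
                  "offtopic_dropped": offtopic_dropped}


demo_controls = [
    {"control_id": "FIN-1-A01", "topical": False},   # foreign sector, off-topic
    {"control_id": "FIN-2-A01", "topical": True},    # foreign sector, on-topic
    {"control_id": "DATA-3-A01", "topical": False},  # no sector prefix, off-topic
    {"control_id": "DATA-4-A01", "topical": True},   # kept
]
kept, stats = demo_filter(demo_controls, {"Retail"}, lambda c: c["topical"])
```

In this variant every off-topic control is dropped either way; the sector gate only decides which counter records the drop.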


def _normalize_criteria(controls: list[dict[str, Any]]) -> None:
    """asyncpg returns JSONB columns (pass_criteria/fail_criteria) as a
    raw string. Parse them into real lists so that the sector/topic gate and
    the boost layer iterate element-wise."""
    import json
    for c in controls:
        for key in ("pass_criteria", "fail_criteria"):
            v = c.get(key)
            if isinstance(v, list):
                continue
            if isinstance(v, str):
                try:
                    parsed = json.loads(v)
                    c[key] = parsed if isinstance(parsed, list) else [v]
                except Exception:
                    c[key] = [v] if v else []
            else:
                c[key] = []
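To make the branches of the normalization concrete, here is a small sketch that applies the same rules to a single value. `normalize_value` is a hypothetical helper written for this illustration, not part of the module.

```python
import json


def normalize_value(v):
    """Coerce one criteria value to a list, mirroring the branches above."""
    if isinstance(v, list):
        return v  # already a list: leave as is
    if isinstance(v, str):
        try:
            parsed = json.loads(v)
        except Exception:
            # Not valid JSON: wrap non-empty text, drop empty strings.
            return [v] if v else []
        # Valid JSON that is not a list (object, number, string) is kept
        # as its raw text in a one-element list.
        return parsed if isinstance(parsed, list) else [v]
    return []  # None or any other type
```

Only a JSON array is unpacked; every other string survives verbatim, so no criteria text is lost even when the column holds something unexpected.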
@@ -182,18 +182,12 @@ def _filter_controls(
     for c in controls:
         cid = c.get("control_id") or ""
         prefix = cid.split("-")[0].upper() if "-" in cid else ""
-        on_topic = criteria_on_topic(c.get("pass_criteria"),
-                                     c.get("fail_criteria"))
         required = SECTOR_PREFIXES.get(prefix)
-        # Sector gate only for controls that are NOT on-topic: a clearly
-        # impressum-themed control (e.g. MStV §18(1) with a GOV prefix
-        # from the domain detection of the control generation) must not
-        # fail on the industry prefix. Topic overlap is stronger evidence
-        # of relevance than an inherited ID prefix.
-        if required and not (scope_lc & required) and not on_topic:
+        if required and not (scope_lc & required):
             sector_dropped += 1
             continue
-        if not on_topic:
+        if not criteria_on_topic(c.get("pass_criteria"),
+                                 c.get("fail_criteria")):
             offtopic_dropped += 1
             continue
         kept.append(c)
@@ -1,12 +1,27 @@
|
||||
"""AGBAgent — kuratierte §§-305-ff-BGB-Checkliste (ChecklistAgent-Subclass)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
"""AGBAgent (v2, routed). Embedding/LLM offline-gestubbt → kein Netzwerk."""
|
||||
import asyncio
|
||||
|
||||
import pytest
|
||||
|
||||
import compliance.services.specialist_agents.agb._pipeline as pipeline
|
||||
from compliance.services.checkers.base import CheckResult
|
||||
from compliance.services.specialist_agents import REGISTRY, AgentInput
|
||||
|
||||
|
||||
class _Stub:
|
||||
def __init__(self, present):
|
||||
self._p = present
|
||||
|
||||
async def check(self, ctrl, doc):
|
||||
return CheckResult(present=self._p)
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def _offline(monkeypatch):
|
||||
monkeypatch.setattr(pipeline, "_EMB", _Stub(None))
|
||||
monkeypatch.setattr(pipeline, "_LLM", _Stub(None))
|
||||
|
||||
|
||||
def _run(text: str):
|
||||
return asyncio.run(
|
||||
REGISTRY.get("agb").evaluate(AgentInput(doc_type="agb", text=text)))
|
||||
|
||||
@@ -0,0 +1,62 @@
"""AGB routed pipeline: gate, reference/embedding rescue, LLM skip, re-tiering.
Embedding + LLM are stubbed offline → deterministic, no network (reference = real regex)."""
import asyncio
from types import SimpleNamespace

import pytest

import compliance.services.specialist_agents.agb._pipeline as pipeline
from compliance.services.checkers.base import CheckResult
from compliance.services.specialist_agents._base import AgentInput
from compliance.services.specialist_agents.agb.agent import AGBAgent


class _Stub:
    def __init__(self, present):
        self._p = present

    async def check(self, ctrl, doc):
        return CheckResult(present=self._p)


@pytest.fixture(autouse=True)
def _offline(monkeypatch):
    monkeypatch.setattr(pipeline, "_EMB", _Stub(None))
    monkeypatch.setattr(pipeline, "_LLM", _Stub(None))


def _routed(field_ids, text, context=None):
    findings = [SimpleNamespace(field_id=fid) for fid in field_ids]
    return asyncio.run(pipeline.run_routed(findings, text, context or {}))


def test_gate_termination_na_for_oneoff_shop():
    text = "Widerrufsbelehrung: Sie koennen binnen 14 Tagen widerrufen. " * 5
    kept, resolved, gated = _routed(["termination", "termination_form"], text)
    assert set(gated) == {"termination", "termination_form"}
    assert kept == []


def test_reference_rescues_data_protection():
    text = "Einzelheiten zur Verarbeitung in unserer Datenschutzerklaerung. " * 5
    kept, resolved, gated = _routed(["data_protection"], text)
    assert "data_protection" in resolved and kept == []


def test_embedding_rescue_resolves(monkeypatch):
    monkeypatch.setattr(pipeline, "_EMB", _Stub(True))
    kept, resolved, gated = _routed(["scope"], "x" * 200)
    assert "scope" in resolved


def test_llm_skipped_keeps_finding():
    kept, resolved, gated = _routed(["delivery_timeframe"], "x" * 200, {"skip_llm": True})
    assert [f.field_id for f in kept] == ["delivery_timeframe"] and resolved == []


def test_evaluate_retiers_low_out_of_findings():
    text = ("Allgemeine Geschaeftsbedingungen. Vertragsschluss durch Bestellung. "
            "Haftung beschraenkt. Gerichtsstand Muenchen. ") * 6
    out = asyncio.run(AGBAgent().evaluate(AgentInput(doc_type="agb", text=text)))
    assert out.agent == "agb" and out.agent_version == "2.0"
    assert all(f.severity in ("HIGH", "MEDIUM") for f in out.findings)
@@ -0,0 +1,14 @@
"""AGB must be wired into the LIVE path (_TOPIC_AGENTS), not just via snapshot."""
from compliance.api.agent_check._agent_outputs import _TOPIC_AGENTS


def test_agb_wired_into_live_topic_agents():
    assert _TOPIC_AGENTS.get("agb") == "agb"


def test_dse_wired_into_live_topic_agents():
    assert _TOPIC_AGENTS.get("dse") == "dse"


def test_impressum_still_wired():
    assert _TOPIC_AGENTS.get("impressum") == "impressum"
@@ -0,0 +1,83 @@
"""Unit tests for the checker library. Embedding + LLM are mocked → no network."""
import asyncio

import compliance.services.llm_cascade as cascade_mod
import compliance.services.mc_embedding_matcher as emb_mod
from compliance.services.checkers.base import (
    ControlSpec,
    DecisionMethod,
    DocContext,
    VerificationMethod,
)
from compliance.services.checkers.embedding_checker import EmbeddingChecker
from compliance.services.checkers.llm_checker import LLMChecker
from compliance.services.checkers.reference_checker import ReferenceChecker


def _run(coro):
    return asyncio.run(coro)


def test_reference_present_and_absent():
    rc = ReferenceChecker()
    spec = ControlSpec("data_protection", VerificationMethod.REFERENCE,
                       DecisionMethod.LINK_RESOLVER,
                       patterns=[r"datenschutz(erkl|bestimmung|hinweis)"])
    r = _run(rc.check(spec, DocContext(
        text="Details in unserer Datenschutzerklaerung: https://x.de/datenschutz")))
    assert r.present is True
    assert r.detail.get("link", "").startswith("https://")
    r2 = _run(rc.check(spec, DocContext(text="Keine Angabe zum Datenschutz-Thema.")))
    assert r2.present is False


def test_embedding_threshold(monkeypatch):
    monkeypatch.setattr(emb_mod, "DIM", 3, raising=False)
    monkeypatch.setattr(emb_mod, "_chunk_text", lambda t: [t], raising=False)

    async def _embed(texts):
        return [[1.0, 0.0, 0.0] for _ in texts]

    monkeypatch.setattr(emb_mod, "_embed_texts", _embed, raising=False)
    ec = EmbeddingChecker()
    spec = ControlSpec("scope_t", VerificationMethod.CONTENT, DecisionMethod.EMBEDDING,
                       paraphrases=["Geltungsbereich"], embed_threshold=0.58)
    monkeypatch.setattr(emb_mod, "_cosine", lambda a, b: 0.90, raising=False)
    r = _run(ec.check(spec, DocContext(text="x" * 200)))
    assert r.present is True and r.confidence >= 0.58
    monkeypatch.setattr(emb_mod, "_cosine", lambda a, b: 0.20, raising=False)
    r2 = _run(ec.check(spec, DocContext(text="x" * 200)))
    assert r2.present is False


def test_embedding_offline_returns_none(monkeypatch):
    async def _boom(texts):
        raise ConnectionError("embedding-service down")

    monkeypatch.setattr(emb_mod, "_embed_texts", _boom, raising=False)
    ec = EmbeddingChecker()
    spec = ControlSpec("scope_off", VerificationMethod.CONTENT, DecisionMethod.EMBEDDING,
                       paraphrases=["x"], embed_threshold=0.6)
    r = _run(ec.check(spec, DocContext(text="y" * 200)))
    assert r.present is None  # fail-safe


def test_llm_present_and_absent(monkeypatch):
    lc = LLMChecker()
    spec = ControlSpec("delivery_timeframe", VerificationMethod.CONTENT, DecisionMethod.LLM,
                       topic_regex=r"liefer", question="Konkrete Lieferfrist?")
    doc = DocContext(text=("1. Lieferung\nDie Ware wird innerhalb von 2 Werktagen "
                           "geliefert.\n") * 4)

    async def _erfuellt(system, user, **kw):
        return {"text": '{"verdict":"ERFUELLT","zitat":"2 Werktagen","begruendung":"x"}',
                "source": "qwen", "confidence": 0.7}

    monkeypatch.setattr(cascade_mod, "call_with_cascade", _erfuellt, raising=False)
    assert _run(lc.check(spec, doc)).present is True

    async def _fehlt(system, user, **kw):
        return {"text": '{"verdict":"FEHLT"}', "source": "qwen"}

    monkeypatch.setattr(cascade_mod, "call_with_cascade", _fehlt, raising=False)
    assert _run(lc.check(spec, doc)).present is False
@@ -1,65 +1,153 @@
|
||||
"""DSEAgent — kuratierte Art-13/14-Checkliste (kein Library-Firehose)."""
|
||||
"""DSE-Agent v3 — DB-Controls (doc_check_controls) via run_v3_pipeline +
|
||||
kuratierter Art-13-Regex-Boost (Layer 0). Volle Parität zu impressum/cookie.
|
||||
|
||||
Die Tests prüfen die deterministischen Bausteine (regex_boost/mcs) ohne DB und
|
||||
den Agent-Pfad mit gemocktem run_v3_pipeline (CI hat keine DB).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
|
||||
import compliance.services.specialist_agents.dse.agent as dse_agent
|
||||
from compliance.services.specialist_agents import REGISTRY, AgentInput
|
||||
from compliance.services.specialist_agents.dse.mcs import MCS, MC_IDS
|
||||
from compliance.services.specialist_agents.dse.regex_boost import (
|
||||
boost_matches_db_mc,
|
||||
compute_regex_boosts,
|
||||
criteria_on_topic,
|
||||
)
|
||||
|
||||
|
||||
def _run(text: str):
|
||||
return asyncio.run(
|
||||
REGISTRY.get("dse").evaluate(AgentInput(doc_type="dse", text=text)))
|
||||
|
||||
|
||||
def test_dse_agent_registered():
|
||||
assert REGISTRY.get("dse") is not None
|
||||
|
||||
|
||||
def test_dse_detects_core_obligations():
|
||||
text = (
|
||||
"Datenschutzerklaerung. Verantwortlich im Sinne der DSGVO ist die "
|
||||
"Muster GmbH, Musterstrasse 1, 12345 Berlin. E-Mail: info@muster.de. "
|
||||
_DSE_SAMPLE = (
|
||||
"Datenschutzerklaerung. Verantwortlich im Sinne der DSGVO ist die Muster "
|
||||
"GmbH, Musterstrasse 1, 12345 Berlin. E-Mail: info@muster.de. "
|
||||
"Datenschutzbeauftragter: dsb@muster.de. Zwecke der Verarbeitung und "
|
||||
"Rechtsgrundlage Art. 6 Abs. 1. Empfaenger Ihrer Daten. Speicherdauer "
|
||||
"der Daten. Ihre Rechte: Auskunft, Loeschung, Widerspruch, Beschwerde "
|
||||
"bei der Aufsichtsbehoerde. ") * 3
|
||||
out = _run(text)
|
||||
assert out.agent == "dse"
|
||||
# 10 L1-Pflichtangaben immer + L2-Details deren Parent vorhanden ist
|
||||
# (fehlende Parents → L2 übersprungen, kein 'na'-Rauschen).
|
||||
assert 10 <= out.mc_total <= 33
|
||||
ok = [c.label for c in out.mc_coverage if c.status == "ok"]
|
||||
assert any("Verantwortlich" in lbl for lbl in ok)
|
||||
assert any("Rechtsgrundlage" in lbl for lbl in ok)
|
||||
"Rechtsgrundlage Art. 6 Abs. 1 lit. f berechtigtes Interesse. Empfaenger "
|
||||
"Ihrer Daten sind Auftragsverarbeiter. Speicherdauer der Daten richtet "
|
||||
"sich nach Aufbewahrungsfristen. Sie haben das Recht auf Auskunft, das "
|
||||
"Recht auf Berichtigung, das Recht auf Loeschung sowie ein "
|
||||
"Widerspruchsrecht. Beschwerde bei der Aufsichtsbehoerde moeglich. Stand: "
|
||||
"Januar 2026. ") * 3
|
||||
|
||||
|
||||
def test_dse_missing_obligations_are_findings():
|
||||
out = _run("Lorem ipsum dolor sit amet consectetur adipiscing elit. " * 6)
|
||||
assert out.findings
|
||||
assert any(f.severity == "HIGH" for f in out.findings)
|
||||
# ── Registrierung ────────────────────────────────────────────────────────
|
||||
def test_dse_agent_registered():
|
||||
agent = REGISTRY.get("dse")
|
||||
assert agent is not None
|
||||
assert agent.agent_version == "3.0"
|
||||
assert agent.doc_type == "dse"
|
||||
|
||||
|
||||
def test_owned_mc_ids_match_checklist():
|
||||
# owned_mc_ids = die Boost-Pattern-IDs (aus ART13_CHECKLIST gehoben).
|
||||
assert MC_IDS == tuple(m.mc_id for m in MCS)
|
||||
assert len(MC_IDS) >= 10 # mind. die 10 L1-Pflichtfelder + L2
|
||||
|
||||
|
||||
# ── Layer-0 Regex-Boost (deterministisch, ohne DB) ───────────────────────
|
||||
def test_regex_boost_detects_core_fields():
|
||||
boosts = compute_regex_boosts(_DSE_SAMPLE)
|
||||
# Die zentralen Art-13-Felder müssen erkannt werden.
|
||||
for field in ("controller", "legal_basis", "rights", "complaint",
|
||||
"retention", "dse_version_date"):
|
||||
assert field in boosts, f"{field} nicht erkannt: {sorted(boosts)}"
|
||||
|
||||
|
||||
def test_regex_boost_empty_on_short_text():
|
||||
assert compute_regex_boosts("zu kurz") == set()
|
||||
|
||||
|
||||
def test_criteria_on_topic_accepts_dse_rejects_foreign():
|
||||
dse_crit = ["Rechtsgrundlage gemäß Art. 6 DSGVO benannt",
|
||||
"Speicherdauer und Löschfrist angegeben"]
|
||||
assert criteria_on_topic(dse_crit) is True
|
||||
foreign = ["Bestellbestätigung wird per E-Mail versendet",
|
||||
"Versandkosten werden im Warenkorb angezeigt"]
|
||||
assert criteria_on_topic(foreign) is False
|
||||
# leere Kriterien → konservativ on-topic behalten
|
||||
assert criteria_on_topic([]) is True
|
||||
|
||||
|
||||
def test_boost_matches_db_mc_third_country():
|
||||
boosts = {"third_country", "controller"}
|
||||
crit = ["Standardvertragsklauseln für Drittland benannt",
|
||||
"Geeignete Garantien bei Übermittlung in ein Drittland"]
|
||||
assert boost_matches_db_mc(boosts, crit) == "third_country"
|
||||
# ohne passende Boosts → None
|
||||
assert boost_matches_db_mc(set(), crit) is None
|
||||
|
||||
|
||||
# ── Agent-Pfad mit gemocktem run_v3_pipeline ─────────────────────────────
|
||||
def _mock_v3(results, telemetry=None):
|
||||
async def _fake(text, scope, db_url="", skip_embedding=False):
|
||||
return results, (telemetry or {
|
||||
"total_mcs": len(results), "layer_0_field_hits": 0,
|
||||
"layer_0_field_ids": [], "layer_0_boost_overrides": 0,
|
||||
"sector_dropped": 0, "offtopic_dropped": 0})
|
||||
return _fake
|
||||
|
||||
|
||||
def _run(text, context=None):
|
||||
return asyncio.run(REGISTRY.get("dse").evaluate(
|
||||
AgentInput(doc_type="dse", text=text, context=context or {})))
|
||||
|
||||
|
||||
def test_dse_short_text_skips():
|
||||
out = _run("zu kurz")
|
||||
assert out.confidence == 0.0
|
||||
assert all(c.status == "skipped" for c in out.mc_coverage)
|
||||
assert out.mc_coverage and all(
|
||||
c.status == "skipped" for c in out.mc_coverage)
|
||||
|
||||
|
||||
def test_third_country_high_when_applicable_no_na_detail_short_action():
|
||||
# Text ohne Drittland-Abschnitt + Scan-Kontext drittland=ja:
|
||||
# - third_country (L1) fehlt → HIGH (nicht weiches MEDIUM)
|
||||
# - Transfermechanismus (L2) → KEIN 'na' (übersprungen, Parent deckt ab)
|
||||
# - Titel/Maßnahme kurz (kein 280-Zeichen-Hint als Recommendation-Titel)
|
||||
text = ("Datenschutz. Verantwortlich ist die Muster GmbH, info@muster.de. "
|
||||
"Zwecke und Rechtsgrundlage Art. 6. Speicherdauer. Ihre Rechte. ") * 4
|
||||
out = asyncio.run(REGISTRY.get("dse").evaluate(AgentInput(
|
||||
doc_type="dse", text=text,
|
||||
context={"scan_context": {"third_country_transfer": "yes"}})))
|
||||
tc = [f for f in out.findings if "Drittland" in f.title]
|
||||
assert tc and tc[0].severity == "HIGH"
|
||||
assert not any(c.status == "na" and "Transfermechanismus" in c.label
|
||||
for c in out.mc_coverage)
|
||||
assert all(len(f.action) < 110 for f in out.findings)
|
||||
# Detail-Begründung bleibt als evidence erhalten
|
||||
assert any(f.evidence for f in out.findings)
|
||||
def test_dse_findings_from_failed_db_mc(monkeypatch):
|
||||
results = [{
|
||||
"control_id": "DATA-525-A17", "passed": False, "severity": "HIGH",
|
||||
"label": "Berechtigte Interessen ausweisen", "regulation": None,
|
||||
"article": None, "_pass_criteria": ["berechtigtes interesse benannt"],
|
||||
"matched_text": "", "source": "keyword_match",
|
||||
}, {
|
||||
"control_id": "AUTH-2051-A11", "passed": True, "severity": "LOW",
|
||||
"label": "Prägnante Form", "regulation": None, "article": None,
|
||||
"_pass_criteria": [], "matched_text": "ok",
|
||||
}]
|
||||
monkeypatch.setattr(dse_agent, "run_v3_pipeline", _mock_v3(results))
|
||||
out = _run(_DSE_SAMPLE, context={"skip_llm": True})
|
||||
fids = {f.field_id for f in out.findings}
|
||||
assert "DATA-525-A17" in fids # failed → Finding
|
||||
assert "AUTH-2051-A11" not in fids # passed → kein Finding
|
||||
f = next(f for f in out.findings if f.field_id == "DATA-525-A17")
|
||||
assert f.severity == "HIGH"
|
||||
assert f.norm == "DSGVO Art. 13/14" # NULL-regulation → Fallback-Norm
|
||||
assert len(f.action) < 410
|
||||
|
||||
|
||||
def test_dse_third_country_override_to_high(monkeypatch):
|
||||
# MEDIUM-Drittland-MC → HIGH bei dokumentiertem Transfer (scan_context).
|
||||
results = [{
|
||||
"control_id": "DATA-900-A01", "passed": False, "severity": "MEDIUM",
|
||||
"label": "Drittlandtransfer Schutzgarantien benennen",
|
||||
"regulation": None, "article": None,
|
||||
"_pass_criteria": ["standardvertragsklauseln", "drittland garantien"],
|
||||
"matched_text": "", "source": "keyword_match",
|
||||
}]
|
||||
monkeypatch.setattr(dse_agent, "run_v3_pipeline", _mock_v3(results))
|
||||
out = _run(_DSE_SAMPLE, context={
|
||||
"skip_llm": True,
|
||||
"scan_context": {"third_country_transfer": "yes"}})
|
||||
f = next(f for f in out.findings if f.field_id == "DATA-900-A01")
|
||||
assert f.severity == "HIGH"
|
||||
assert f.severity_reason == "db_mc_failed_third_country_transfer"
|
||||
|
||||
|
||||
def test_dse_no_transfer_keeps_medium(monkeypatch):
|
||||
results = [{
|
||||
"control_id": "DATA-900-A01", "passed": False, "severity": "MEDIUM",
|
||||
"label": "Drittlandtransfer Schutzgarantien benennen",
|
||||
"regulation": None, "article": None,
|
||||
"_pass_criteria": ["standardvertragsklauseln", "drittland garantien"],
|
||||
"matched_text": "", "source": "keyword_match",
|
||||
}]
|
||||
monkeypatch.setattr(dse_agent, "run_v3_pipeline", _mock_v3(results))
|
||||
out = _run(_DSE_SAMPLE, context={"skip_llm": True})
|
||||
f = next(f for f in out.findings if f.field_id == "DATA-900-A01")
|
||||
assert f.severity == "MEDIUM"
|
||||
|
||||
@@ -0,0 +1,59 @@
"""Tests for the DSE applicability gate (_classification_gate).

Covers the pure split logic (apply_gate) and the defensive behavior of
load_dse_gate without a DB. The DB query itself is I/O and is not tested
against a real DB here (defensive path: no DSN -> empty dict)."""

import asyncio
import os

from compliance.services.specialist_agents.dse._classification_gate import (
    apply_gate,
    load_dse_gate,
)


def test_apply_gate_splits_findings_and_organizational():
    controls = [
        {"control_id": "AUTH-2051-A02", "title": "Speicherdauer nennen"},
        {"control_id": "AUTH-2049-A01", "title": "VVT fuehren"},
    ]
    gate = {
        "AUTH-2049-A01": {
            "obligation_type": "EVIDENCE",
            "check_intent": "DIRECT_EVIDENCE",
            "applicable_artifacts": ["VVT", "AUDIT"],
            "reference_allowed": "NO",
        }
    }
    kept, organizational = apply_gate(controls, gate)
    assert [c["control_id"] for c in kept] == ["AUTH-2051-A02"]
    assert len(organizational) == 1
    org = organizational[0]
    assert org["control_id"] == "AUTH-2049-A01"
    assert org["title"] == "VVT fuehren"
    assert org["applicable_artifacts"] == ["VVT", "AUDIT"]
    assert org["check_intent"] == "DIRECT_EVIDENCE"


def test_apply_gate_empty_gate_keeps_all():
    controls = [{"control_id": "X-1"}, {"control_id": "X-2"}]
    kept, organizational = apply_gate(controls, {})
    assert len(kept) == 2
    assert organizational == []


def test_load_dse_gate_without_dsn_is_defensive():
    """No DSN + no env -> empty dict (no filter), no error."""
    saved = (
        os.environ.pop("DATABASE_URL", None),
        os.environ.pop("COMPLIANCE_DATABASE_URL", None),
    )
    try:
        result = asyncio.run(load_dse_gate(""))
        assert result == {}
    finally:
        if saved[0] is not None:
            os.environ["DATABASE_URL"] = saved[0]
        if saved[1] is not None:
            os.environ["COMPLIANCE_DATABASE_URL"] = saved[1]
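The real `apply_gate` is not part of this excerpt, so the following is only a sketch of a split that would satisfy the assertions in these tests. It assumes that a control listed in the gate is moved to the organizational checklist and annotated with the gate metadata; `demo_apply_gate` is a hypothetical name.

```python
def demo_apply_gate(controls, gate):
    """Split controls into (kept, organizational).

    Controls whose control_id appears in `gate` are moved to the
    organizational checklist and merged with their gate metadata;
    all others stay in the findings scan.
    """
    kept, organizational = [], []
    for c in controls:
        meta = gate.get(c.get("control_id"))
        if meta is None:
            kept.append(c)
        else:
            # Copy so the input control dict is not mutated.
            organizational.append({**c, **meta})
    return kept, organizational


demo_controls = [
    {"control_id": "AUTH-2051-A02", "title": "Speicherdauer nennen"},
    {"control_id": "AUTH-2049-A01", "title": "VVT fuehren"},
]
demo_gate = {
    "AUTH-2049-A01": {
        "check_intent": "DIRECT_EVIDENCE",
        "applicable_artifacts": ["VVT", "AUDIT"],
    }
}
kept, organizational = demo_apply_gate(demo_controls, demo_gate)
```

An empty gate leaves every control in the findings scan, which matches the defensive behavior when the classification table is missing.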
@@ -0,0 +1,67 @@
"""DSE embedding recall: deterministic semantic layer (cached).

Tests the pure logic WITHOUT an embedding service: cache-hit path,
threshold filter, candidate intersection, reachability guard. The embedding
itself (embedding service) is integration and is validated on macmini/prod.
"""

from __future__ import annotations

import asyncio
import json

import compliance.services.specialist_agents.dse._embedding_recall as er


_TEXT = ("Datenschutzerklaerung der Muster GmbH. " * 20)  # > 100 characters


def _seed_cache(tmp_path, scores: dict[str, float]) -> str:
    p = tmp_path / "dse_embed_cache.json"
    p.write_text(json.dumps({er._doc_hash(_TEXT): scores}))
    return str(p)


def test_doc_hash_deterministic():
    # Fixed function: same text → same hash (reproducibility).
    assert er._doc_hash(_TEXT) == er._doc_hash(_TEXT)
    assert er._doc_hash("a") != er._doc_hash("b")


def test_cache_hit_threshold_filter(tmp_path, monkeypatch):
    # Cache hit: no embedding service needed. Only scores >= threshold AND
    # within the candidates are returned.
    scores = {"DATA-1": 0.71, "DATA-2": 0.60, "AUTH-3": 0.68, "SEC-4": 0.50}
    monkeypatch.setenv("DSE_EMBED_CACHE", _seed_cache(tmp_path, scores))
    monkeypatch.setattr(er, "_CACHE_PATH", str(tmp_path / "dse_embed_cache.json"))

    cands = ["DATA-1", "DATA-2", "AUTH-3", "SEC-4"]
    out = asyncio.run(er.embedding_recall(_TEXT, cands, threshold=0.65))
    # >=0.65: DATA-1 (0.71), AUTH-3 (0.68). NOT DATA-2 (0.60), SEC-4 (0.50).
    assert out == {"DATA-1", "AUTH-3"}


def test_cache_hit_candidate_intersection(tmp_path, monkeypatch):
    # Only candidates (failed controls) count; others are ignored.
    scores = {"DATA-1": 0.90, "DATA-2": 0.90}
    monkeypatch.setattr(er, "_CACHE_PATH", str(tmp_path / "c.json"))
    (tmp_path / "c.json").write_text(json.dumps({er._doc_hash(_TEXT): scores}))
    out = asyncio.run(er.embedding_recall(_TEXT, ["DATA-1"], threshold=0.65))
    assert out == {"DATA-1"}  # DATA-2 is not among the candidates


def test_empty_inputs():
    assert asyncio.run(er.embedding_recall("zu kurz", ["X"])) == set()
    assert asyncio.run(er.embedding_recall(_TEXT, [])) == set()


def test_service_down_returns_empty(tmp_path, monkeypatch):
    # No cache + service unreachable → empty (the deterministic layer carries
    # the result), NO hang.
    monkeypatch.setattr(er, "_CACHE_PATH", str(tmp_path / "none.json"))

    async def _unreachable(timeout=2.0):
        return False
    monkeypatch.setattr(er, "_embedding_reachable", _unreachable)
    out = asyncio.run(er.embedding_recall(_TEXT, ["DATA-1"]))
    assert out == set()
|
||||
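The cache-hit path exercised by these tests amounts to a threshold filter intersected with the candidate list. The following is a minimal sketch under the assumption that the cached scores are a plain mapping of control ID to similarity score; `filter_cached_scores` is a hypothetical helper for illustration, not the actual body of `embedding_recall`.

```python
def filter_cached_scores(
    scores: dict[str, float],
    candidates: list[str],
    threshold: float = 0.65,
) -> set[str]:
    """Keep candidate IDs whose cached score reaches the threshold.

    Hypothetical helper. The 0.65 default mirrors the value passed in the
    tests, not a confirmed default of the real module.
    """
    return {
        cid for cid in candidates
        if cid in scores and scores[cid] >= threshold
    }
```

Iterating over the candidates (rather than over the cached scores) keeps controls that were never candidates out of the result, which is the behavior `test_cache_hit_candidate_intersection` pins.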
@@ -1,18 +0,0 @@
-- Migration 154: control_pendants: self-written (license_rule=3) -> sourced
-- (license_rule 1/2) pendant mapping from the Phase 2 reconcile (embedding kNN +
-- Haiku adjudication, 2026-06-15). A self-written atom listed here has a
-- commercially usable source control that expresses THE SAME obligation, so
-- retrieval should prefer the licensed source control. Additive, idempotent.
-- [migration-approved]
SET search_path TO compliance, public;

CREATE TABLE IF NOT EXISTS control_pendants (
    control_uuid uuid PRIMARY KEY,
    pendant_control_uuid uuid NOT NULL,
    cosine numeric,
    method varchar(40) NOT NULL DEFAULT 'embed_haiku',
    created_at timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_control_pendants_pendant
    ON control_pendants (pendant_control_uuid);
@@ -1,87 +0,0 @@
"""Tests for the audit-walk ZIP-builder."""

import io
import json
import zipfile
from unittest.mock import patch, MagicMock

from compliance.services.audit_walk_zip_builder import (
    _readme,
    build_audit_walk_zip,
)


_FAKE_WALK = {
    "walk_id": "abc123def456",
    "url": "https://example.com/",
    "started_at": "2026-06-07T10:00:00+00:00",
    "completed_at": "2026-06-07T10:00:30+00:00",
    "video": {
        "filename": "video.webm",
        "size_bytes": 12345,
        "sha256": "a" * 64,
        "dsms": {"cid": "QmFakeCidVideo"},
    },
    "walk_json_dsms": {"cid": "QmFakeCidMeta"},
    "actions": [
        {"action": "navigate", "url": "https://example.com/dse"},
        {"action": "navigate", "url": "https://example.com/imprint"},
        {"action": "expand_accordions", "expanded": 3},
    ],
}


class TestReadme:
    def test_contains_walk_id_and_url(self):
        r = _readme(_FAKE_WALK)
        assert "abc123def456" in r
        assert "https://example.com/" in r

    def test_contains_dsms_cids(self):
        r = _readme(_FAKE_WALK)
        assert "QmFakeCidVideo" in r
        assert "QmFakeCidMeta" in r

    def test_counts_navigates_and_accordions(self):
        r = _readme(_FAKE_WALK)
        assert "2 Compliance-Seiten" in r
        assert "3 Akkordeon" in r


class TestBuildZip:
    def test_empty_walk_returns_empty(self):
        assert build_audit_walk_zip({}) == b""

    def test_zip_contains_three_entries(self):
        # Mock the video fetch to return tiny content
        with patch(
            "compliance.services.audit_walk_zip_builder.httpx.Client"
        ) as mock_client:
            instance = mock_client.return_value.__enter__.return_value
            instance.get.return_value = MagicMock(
                status_code=200, content=b"fakevideo",
            )
            zip_bytes = build_audit_walk_zip(_FAKE_WALK)
        assert zip_bytes
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
            names = set(z.namelist())
            assert {"video.webm", "walk.json", "README.txt"}.issubset(names)
            walk_content = json.loads(z.read("walk.json"))
            assert walk_content["walk_id"] == "abc123def456"
            assert z.read("video.webm") == b"fakevideo"

    def test_video_fetch_failure_still_produces_zip(self):
        # consent-tester down → no video, but ZIP still contains
        # walk.json + README so the recipient has the metadata.
        with patch(
            "compliance.services.audit_walk_zip_builder.httpx.Client"
        ) as mock_client:
            instance = mock_client.return_value.__enter__.return_value
            instance.get.side_effect = Exception("network down")
            zip_bytes = build_audit_walk_zip(_FAKE_WALK)
        assert zip_bytes
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
            names = z.namelist()
            assert "video.webm" not in names
            assert "walk.json" in names
            assert "README.txt" in names
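The behavior these tests pin (empty walk gives empty bytes, metadata is always packed, the video only when it could be fetched) can be sketched with the standard library alone. This is an illustrative sketch, not the removed `build_audit_walk_zip`: `build_zip_sketch` is a hypothetical name, the README text is a placeholder, and the video bytes are passed in instead of being fetched over HTTP.

```python
import io
import json
import zipfile
from typing import Optional


def build_zip_sketch(walk: dict, video: Optional[bytes] = None) -> bytes:
    """Bundle walk metadata (and the video, if available) into a ZIP."""
    if not walk:
        return b""  # nothing to archive
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("walk.json", json.dumps(walk, indent=2))
        # Placeholder README; the real one also lists CIDs and counts.
        z.writestr("README.txt", f"Audit walk {walk.get('walk_id', '?')}\n")
        if video is not None:
            # Only present when the fetch succeeded.
            z.writestr("video.webm", video)
    return buf.getvalue()
```

Treating the video as optional input keeps the metadata path free of network failures, which matches the intent stated in `test_video_fetch_failure_still_produces_zip`.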
@@ -0,0 +1,51 @@
"""Checker router: build_spec from sensor_classification + method-agnostic
dispatch. CONTENT/LLM -> Haiku sufficiency tier (validated), unknown
decision_methods -> fail-safe present=None."""
import pytest
from unittest.mock import AsyncMock, patch

from compliance.services.checkers.base import DocContext
from compliance.services.checkers.router import build_spec, route_and_check

_ANTHROPIC = "compliance.services.llm_cascade._call_anthropic"


def test_build_spec_content_llm_uses_haiku():
    s = build_spec("X", {"verification_method": "CONTENT", "decision_method": "LLM"},
                   label="L", criteria=["a", "b"])
    assert s.verification_method == "CONTENT" and s.decision_method == "LLM"
    assert s.extra.get("judge") == "haiku"
    assert s.paraphrases == ["a", "b"]


def test_build_spec_embedding_no_haiku():
    s = build_spec("X", {"verification_method": "CONTENT", "decision_method": "EMBEDDING"})
    assert s.extra.get("judge") is None


@pytest.mark.asyncio
async def test_route_unknown_decision_is_failsafe():
    s = build_spec("X", {"verification_method": "BEHAVIOR", "decision_method": "PLAYWRIGHT"})
    r = await route_and_check(s, DocContext(text="x" * 200))
    assert r.present is None and "no_checker" in r.source


@pytest.mark.asyncio
async def test_route_content_llm_haiku_fehlt():
    s = build_spec("X", {"verification_method": "CONTENT", "decision_method": "LLM"},
                   label="Speicherdauer", criteria=["Höchstdauer pro Kategorie"])
    fake = AsyncMock(return_value='{"erfuellt": false, "confidence": 0.9, "begruendung": "fehlt"}')
    with patch(_ANTHROPIC, new=fake):
        r = await route_and_check(s, DocContext(text="Wir nutzen Cookies. " * 30))
    assert r.present is False and r.source == "haiku"
    assert fake.call_count >= 1


@pytest.mark.asyncio
async def test_route_content_llm_haiku_erfuellt():
    s = build_spec("X", {"verification_method": "CONTENT", "decision_method": "LLM"},
                   label="L", criteria=["x"])
    fake = AsyncMock(return_value='{"erfuellt": true, "confidence": 0.8}')
    with patch(_ANTHROPIC, new=fake):
        r = await route_and_check(s, DocContext(text="text " * 40))
    assert r.present is True
@@ -0,0 +1,42 @@
"""Tests for the cookie-policy applicability gate: controls without a
COOKIE_POLICY artifact are routed out of the findings scan (not deleted),
and the gate is fail-safe (no DSN -> no filter)."""
import pytest

from compliance.services.specialist_agents.cookie_policy._classification_gate import (
    apply_gate, load_cookie_gate,
)


def test_apply_gate_splits_kept_and_routed():
    controls = [
        {"control_id": "COOK-1", "title": "Kategorien"},
        {"control_id": "TOM-1", "title": "Verschlüsselung"},
        {"control_id": "BAN-1", "title": "Consent vor Setzen"},
    ]
    gate = {
        "TOM-1": {"obligation_type": "TECHNICAL", "check_intent": "DIRECT_TECHNICAL",
                  "applicable_artifacts": ["TOM", "AUDIT"]},
        "BAN-1": {"obligation_type": "TECHNICAL", "check_intent": "DIRECT_TECHNICAL",
                  "applicable_artifacts": ["COOKIE_BANNER", "SYSTEMSCAN"]},
    }
    kept, routed = apply_gate(controls, gate)
    assert [c["control_id"] for c in kept] == ["COOK-1"]
    assert {c["control_id"] for c in routed} == {"TOM-1", "BAN-1"}
    # routed entries carry title + classification metadata for downstream routing
    tom = next(c for c in routed if c["control_id"] == "TOM-1")
    assert tom["title"] == "Verschlüsselung"
    assert tom["applicable_artifacts"] == ["TOM", "AUDIT"]


def test_apply_gate_empty_gate_keeps_all():
    controls = [{"control_id": "A"}, {"control_id": "B"}]
    kept, routed = apply_gate(controls, {})
    assert len(kept) == 2 and routed == []


@pytest.mark.asyncio
async def test_load_cookie_gate_no_dsn_is_failsafe(monkeypatch):
    monkeypatch.delenv("DATABASE_URL", raising=False)
    monkeypatch.delenv("COMPLIANCE_DATABASE_URL", raising=False)
    assert await load_cookie_gate("") == {}
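The split these tests describe can be expressed in a few lines. The sketch below is one possible reading, assuming a control is kept when it has no classification at all or when `COOKIE_POLICY` is among its `applicable_artifacts`; `apply_gate_sketch` and its `artifact` parameter are hypothetical and the real `apply_gate` may differ in detail.

```python
from typing import Any

Control = dict[str, Any]


def apply_gate_sketch(
    controls: list[Control],
    gate: dict[str, dict[str, Any]],
    artifact: str = "COOKIE_POLICY",  # assumption: the artifact this scan covers
) -> tuple[list[Control], list[Control]]:
    """Split controls into (kept, routed) based on applicable artifacts."""
    kept: list[Control] = []
    routed: list[Control] = []
    for control in controls:
        meta = gate.get(control["control_id"])
        # No classification (or an empty gate) -> fail-safe: keep the control.
        if meta is None or artifact in meta.get("applicable_artifacts", []):
            kept.append(control)
        else:
            # Routed entries keep their own fields plus the classification.
            routed.append({**control, **meta})
    return kept, routed
```

Keeping unclassified controls is the fail-safe direction: a missing or unreachable classification source then widens the scan instead of silently hiding findings.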
@@ -0,0 +1,68 @@
"""Layer-3 cookie sufficiency-judge: only embedding/boost-RESCUED passes are
re-judged by Haiku; keyword passes are untouched; a FEHLT verdict un-passes."""
import pytest
from unittest.mock import AsyncMock, patch

from compliance.services.specialist_agents.cookie_policy._sufficiency_judge import (
    judge_rescued,
)

_ANTHROPIC = "compliance.services.llm_cascade._call_anthropic"
_DOC = "Volltext der Cookie-Richtlinie mit ausreichend Inhalt. " * 4


def _r(cid, source, passed=True):
    return {"control_id": cid, "source": source, "passed": passed,
            "label": cid, "_pass_criteria": ["konkrete Angabe nötig"]}


@pytest.mark.asyncio
async def test_rescued_unpassed_when_judge_fehlt():
    results = [_r("A", "keyword+embedding")]
    fake = AsyncMock(return_value='{"erfuellt": false, "confidence": 0.9, "begruendung": "fehlt"}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 1
    assert results[0]["passed"] is False
    assert "+llm_failed" in results[0]["source"]


@pytest.mark.asyncio
async def test_rescued_kept_when_judge_erfuellt():
    results = [_r("A", "keyword+embedding")]
    fake = AsyncMock(return_value='{"erfuellt": true, "confidence": 0.9}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 0
    assert results[0]["passed"] is True


@pytest.mark.asyncio
async def test_keyword_pass_not_judged():
    """Controls that passed deterministically (keyword) are NOT judged."""
    results = [_r("A", "keyword")]
    fake = AsyncMock(return_value='{"erfuellt": false}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 0
    assert results[0]["passed"] is True
    assert fake.call_count == 0


@pytest.mark.asyncio
async def test_boost_rescue_is_judged():
    results = [_r("A", "keyword+regex_boost")]
    fake = AsyncMock(return_value='{"erfuellt": false}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 1 and results[0]["passed"] is False


@pytest.mark.asyncio
async def test_failed_controls_ignored():
    """Failed controls are not this layer's concern."""
    results = [_r("A", "keyword+embedding", passed=False)]
    fake = AsyncMock(return_value='{"erfuellt": false}')
    with patch(_ANTHROPIC, new=fake):
        n = await judge_rescued(_DOC, results)
    assert n == 0 and fake.call_count == 0
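The selection rule these tests pin can be sketched without any LLM call. The version below is a simplified, synchronous illustration: the judge is an injected callable instead of the Haiku call, and the rescue markers are inferred from the `source` strings used in the tests. `judge_rescued_sketch` and `RESCUE_MARKERS` are hypothetical names.

```python
from typing import Callable

# Assumption: a pass counts as "rescued" if its source carries one of these.
RESCUE_MARKERS = ("+embedding", "+regex_boost")


def judge_rescued_sketch(
    results: list[dict],
    judge: Callable[[dict], bool],
) -> int:
    """Re-judge rescued passes in place; return how many were un-passed."""
    unpassed = 0
    for r in results:
        rescued = any(marker in r["source"] for marker in RESCUE_MARKERS)
        if not (r["passed"] and rescued):
            continue  # failed controls and pure keyword passes are skipped
        if not judge(r):
            r["passed"] = False
            r["source"] += "+llm_failed"
            unpassed += 1
    return unpassed
```

Restricting the judge to rescued passes keeps the deterministic keyword layer reproducible and bounds the number of LLM calls to the cases where a weaker signal produced the pass.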
@@ -0,0 +1,77 @@
"""Regression tests for the OVH (gpt-oss-120b) tier of the LLM cascade.

gpt-oss-120b is a reasoning model: it spends output tokens on chain-of-thought
before the answer. Two bugs this pins:
  1. A small max_tokens (deep_check passed 400) length-caps it mid-reasoning →
     content=null → the tier silently returns nothing. _call_ovh must floor the
     budget so reasoning + the JSON answer fit.
  2. When length-capped, the JSON can land in reasoning_content, not content →
     _call_ovh must fall back to reasoning_content.
"""
import pytest
from unittest.mock import AsyncMock, MagicMock, patch

from compliance.services import llm_cascade


def _resp(data):
    r = MagicMock()
    r.raise_for_status = MagicMock()
    r.json = MagicMock(return_value=data)
    return r


def _client(resp):
    inst = AsyncMock()
    inst.post.return_value = resp
    inst.__aenter__ = AsyncMock(return_value=inst)
    inst.__aexit__ = AsyncMock(return_value=False)
    return inst


class TestCallOvhReasoning:
    @pytest.mark.asyncio
    async def test_reasoning_content_used_when_content_null(self, monkeypatch):
        monkeypatch.setenv("OVH_LLM_URL", "https://llm.example.com")
        monkeypatch.setenv("OVH_LLM_MODEL", "gpt-oss-120b")
        monkeypatch.setenv("OVH_LLM_KEY", "k")
        resp = _resp({"choices": [{"message": {
            "content": None,
            "reasoning_content": '{"erfuellt": true, "confidence": 0.9}'}}]})
        with patch("httpx.AsyncClient", return_value=_client(resp)):
            out = await llm_cascade._call_ovh("sys", "user", max_tokens=400)
        assert '"erfuellt": true' in out

    @pytest.mark.asyncio
    async def test_small_budget_is_floored(self, monkeypatch):
        monkeypatch.setenv("OVH_LLM_URL", "https://llm.example.com")
        monkeypatch.setenv("OVH_LLM_MODEL", "gpt-oss-120b")
        inst = _client(_resp({"choices": [{"message": {"content": "{}"}}]}))
        with patch("httpx.AsyncClient", return_value=inst):
            await llm_cascade._call_ovh("sys", "user", max_tokens=400)
        assert inst.post.call_args.kwargs["json"]["max_tokens"] >= 2000

    @pytest.mark.asyncio
    async def test_large_budget_is_preserved(self, monkeypatch):
        monkeypatch.setenv("OVH_LLM_URL", "https://llm.example.com")
        monkeypatch.setenv("OVH_LLM_MODEL", "gpt-oss-120b")
        inst = _client(_resp({"choices": [{"message": {"content": "{}"}}]}))
        with patch("httpx.AsyncClient", return_value=inst):
            await llm_cascade._call_ovh("sys", "user", max_tokens=6000)
        assert inst.post.call_args.kwargs["json"]["max_tokens"] == 6000

    @pytest.mark.asyncio
    async def test_content_preferred_when_present(self, monkeypatch):
        monkeypatch.setenv("OVH_LLM_URL", "https://llm.example.com")
        monkeypatch.setenv("OVH_LLM_MODEL", "gpt-oss-120b")
        resp = _resp({"choices": [{"message": {
            "content": '{"erfuellt": false}', "reasoning_content": "noise"}}]})
        with patch("httpx.AsyncClient", return_value=_client(resp)):
            out = await llm_cascade._call_ovh("sys", "user")
        assert out == '{"erfuellt": false}'

    @pytest.mark.asyncio
    async def test_unconfigured_returns_empty(self, monkeypatch):
        monkeypatch.delenv("OVH_LLM_URL", raising=False)
        monkeypatch.delenv("OVH_LLM_MODEL", raising=False)
        assert await llm_cascade._call_ovh("sys", "user") == ""
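The two fixes this test module pins (a floor on the token budget and a fallback to `reasoning_content`) are small enough to show in isolation. The sketch below is an assumption-laden illustration, not the code of `_call_ovh`: the helper names are hypothetical, and the floor of 2000 is only inferred from the `>= 2000` assertion in `test_small_budget_is_floored`.

```python
from typing import Any, Optional

# Assumption: inferred from the ">= 2000" assertion; the real constant may be higher.
MIN_REASONING_BUDGET = 2000


def effective_max_tokens(requested: int) -> int:
    """Raise small budgets to the floor; leave larger ones untouched."""
    return max(requested, MIN_REASONING_BUDGET)


def extract_answer(message: dict[str, Optional[Any]]) -> str:
    """Prefer `content`; fall back to `reasoning_content`; else empty string."""
    return message.get("content") or message.get("reasoning_content") or ""
```

Using `or` rather than a key lookup matters here because a length-capped response carries `"content": None`, so the key exists but holds no usable answer.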
@@ -0,0 +1,102 @@
"""Unit tests for the tiered 3-status evaluation (_tiered_eval).

Covers: status logic (incl. no LM → ERFÜLLT, UNBESTIMMT when not evaluable),
recommendation collection, EMBEDDING/LLM routing (mocked), and the
reproducibility cache. Embedding/LLM are mocked, so no network is needed."""
import asyncio

from compliance.services.specialist_agents.dse import _tiered_eval as te


# ---- pure status logic --------------------------------------------------
def test_status_no_lm_is_erfuellt():
    assert te._status([]) == "ERFÜLLT"


def test_status_all_met_erfuellt():
    assert te._status([True, True]) == "ERFÜLLT"


def test_status_none_met_fehlt():
    assert te._status([False, False]) == "FEHLT"


def test_status_partial_teilweise():
    assert te._status([True, False]) == "TEILWEISE"


def test_status_any_none_unbestimmt():
    assert te._status([True, None]) == "UNBESTIMMT"


# ---- evaluate_tiered (embedding/LLM mocked) -----------------------------
def _crit(text, tier, dm="EMBEDDING"):
    return {"criterion": text, "compliance_tier": tier,
            "decision_method": dm, "legal_basis": "x"}


class _Doc:
    def __init__(self, text):
        self.text = text


def test_evaluate_partial_with_recommendation(monkeypatch):
    crits = [_crit("Zwecke genannt", "LEGAL_MINIMUM"),
             _crit("Speicherdauer genannt", "LEGAL_MINIMUM"),
             _crit("tabellarisch ausgewiesen", "BEST_PRACTICE")]

    async def fake_embed(texts, ctx, thr):
        return {"Zwecke genannt": True, "Speicherdauer genannt": False,
                "tabellarisch ausgewiesen": False}

    monkeypatch.setattr(te, "_embed_present", fake_embed)
    out = asyncio.run(te.evaluate_tiered("C1", crits, {"hash": "h"}, _Doc("x" * 200)))
    assert out["status"] == "TEILWEISE"
    assert out["lm_met"] == 1 and out["lm_total"] == 2
    assert len(out["recommendations"]) == 1
    assert out["recommendations"][0]["tier"] == "BEST_PRACTICE"


def test_evaluate_no_lm_is_erfuellt_with_recs(monkeypatch):
    crits = [_crit("Bildsymbole", "OPTIONAL"), _crit("Legende", "OPTIONAL")]

    async def fake_embed(texts, ctx, thr):
        return {t: False for t in texts}

    monkeypatch.setattr(te, "_embed_present", fake_embed)
    out = asyncio.run(te.evaluate_tiered("C2", crits, {"hash": "h"}, _Doc("x" * 200)))
    assert out["status"] == "ERFÜLLT"
    assert out["lm_total"] == 0
    assert len(out["recommendations"]) == 2


def test_evaluate_llm_criterion_routed(monkeypatch):
    crits = [_crit("Speicherdauer hinreichend nachvollziehbar", "LEGAL_MINIMUM", dm="LLM")]

    async def fake_llm(cid, idx, crit, doc, dh):
        return True

    monkeypatch.setattr(te, "_llm_met", fake_llm)
    out = asyncio.run(te.evaluate_tiered("C3", crits, {"hash": "h"}, _Doc("x" * 200)))
    assert out["status"] == "ERFÜLLT" and out["lm_total"] == 1


def test_evaluate_unbestimmt_when_embed_unavailable(monkeypatch):
    crits = [_crit("Zwecke genannt", "LEGAL_MINIMUM")]

    async def fake_embed(texts, ctx, thr):
        return {t: None for t in texts}  # embedding service down

    monkeypatch.setattr(te, "_embed_present", fake_embed)
    out = asyncio.run(te.evaluate_tiered("C4", crits, {"hash": "h"}, _Doc("x" * 200)))
    assert out["status"] == "UNBESTIMMT"


# ---- reproducibility cache ----------------------------------------------
def test_cache_roundtrip(monkeypatch, tmp_path):
    monkeypatch.setattr(te, "_CACHE_DB", str(tmp_path / "cache.db"))
    assert te._cache_get("k1") is None
    te._cache_put("k1", True)
    te._cache_put("k2", False)
    assert te._cache_get("k1") is True
    assert te._cache_get("k2") is False
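The round-trip in `test_cache_roundtrip` (miss gives `None`, stored verdicts come back as `True` or `False`) could be backed by a small SQLite table. This is a sketch under the assumption that `_CACHE_DB` names a SQLite file; the table layout and the function names `cache_put` / `cache_get` are hypothetical and not taken from `_tiered_eval`.

```python
import sqlite3
from contextlib import closing
from typing import Optional


def _connect(db_path: str) -> sqlite3.Connection:
    """Open the cache database and make sure the table exists."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS verdicts (k TEXT PRIMARY KEY, v INTEGER)"
    )
    return con


def cache_put(db_path: str, key: str, value: bool) -> None:
    """Store (or overwrite) a boolean verdict under the given key."""
    with closing(_connect(db_path)) as con:
        con.execute(
            "INSERT OR REPLACE INTO verdicts (k, v) VALUES (?, ?)",
            (key, int(value)),
        )
        con.commit()


def cache_get(db_path: str, key: str) -> Optional[bool]:
    """Return the cached verdict, or None on a cache miss."""
    with closing(_connect(db_path)) as con:
        row = con.execute(
            "SELECT v FROM verdicts WHERE k = ?", (key,)
        ).fetchone()
    return None if row is None else bool(row[0])
```

Distinguishing a miss (`None`) from a cached negative verdict (`False`) is what makes the cache usable for reproducibility: a stochastic LLM verdict, once stored, is replayed rather than re-sampled.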
@@ -159,13 +159,3 @@ def test_all_regulation_rules_point_to_valid_use_cases():
    for _needle, uc in reg._REGULATION_RULES:
        assert uc in reg.REGISTRY, uc
        assert reg.REGISTRY[uc].enabled


def test_new_use_cases_eidas_geschaeftsgeheimnis():
    # Corpus gaps 2026-06-17: eIDAS (Regulation 910/2014) + GeschGehG ingested
    # and classified as separate use cases.
    assert reg.is_valid_use_case("eidas")
    assert reg.is_valid_use_case("geschaeftsgeheimnis")
    assert reg.use_case_for_regulation("eIDAS-Verordnung (EU) Nr. 910/2014") == "eidas"
    assert reg.use_case_for_regulation(
        "Gesetz zum Schutz von Geschäftsgeheimnissen") == "geschaeftsgeheimnis"
@@ -1,62 +0,0 @@
# Benchmark Archive & RC Freeze: `v1` (2026-06-19)

> **Purpose:** Reproducibility record of the doc-check calibration (DSE / Cookie / Impressum).
> This file contains **only metadata + hashes** and **no** third-party document text (copyright / database rights).
> The complete artifacts (corpora, GTs, results, scripts) live in the **internal audit archive**, separate from repo / RAG / product.

## 1. Data classes (retention decision 2026-06-19)

Three risk classes, three rules:

| Class | Rule |
|---|---|
| **RAG corpus** | Derive the control → **discard** the document. No full texts as a knowledge base. |
| **Customer data (prod)** | Store: finding · evidence · hash · version · URL · timestamp. **No** permanent full-text copy. Data minimization. |
| **Benchmark/validation** | **Keep versioned**, otherwise measurements are not reproducible. Internal, off-RAG, off-product. Treated like a test/audit archive, not like a knowledge base. |

Rationale: the risk of a small internal benchmark archive (publicly accessible documents) is lower than the risk of no longer being able to substantiate the entire validation later.

## 2. Release candidates (frozen)

| RC | doc_type | Opus GT (archive) | Test companies | FP / FN | Status |
|---|---|---|---|---|---|
| **DSE_RC_v1** | dse | `gt_opus_dse.json` (5 orig) + `gt_opus_dse_fresh.json` (3 fresh) | 8 (db, otto, ikea, ob, teamviewer + GT roster) | FP 11 %→**6 %**, FN ~7 %; fresh FP 7 % / FN 5 % | Release candidate |
| **COOKIE_RC_v1** | cookie | `gt_opus_cookie_v2.json` (multi-sampling still open) | 7 (db, ikea, lieferando, mediamarkt, ob, tchibo, teamviewer) | Prec 0.81→**0.95**, Rec 0.26→**0.44**, missed gaps→**0 %** | Wave 1 (GT noise caveat) |
| **IMPRESSUM_RC_v1** | impressum | `gt_opus_impressum.json` | 9 (db, ikea, lieferando, mediamarkt, ob, otto, tchibo, teamviewer, zalando) | Text check FP **0 %** / FN **2 %** (81 applicable, 9 fact-field controls) | Release candidate |

Detailed methodology + error map: [`platform_validation_v1.md`](platform_validation_v1.md). Per-module numbers: memory note `project_engine_quality.md`.

## 3. Archive location + index

```
macmini:~/bp-benchmark-archive/v1_2026-06-19/
├── MANIFEST.json                 # 54 files, each with SHA256 + bytes (authoritative)
├── gt_<firma>_<doctype>.txt      # corpora (third-party full text, ONLY here)
├── gt_opus_*.json                # Opus oracle GTs
├── *_candidates*.json, *_resid.json, *_falsefindings*.json
├── *_criteria_changelog.json / *_criteria_backup.json
└── scripts/                      # 46 measurement scripts (cc_*.py) = "as measured"
```

**Version-defining hashes** (truncated to 12 characters; full values in `MANIFEST.json`):

| Artifact | sha256… | Role |
|---|---|---|
| `gt_opus_dse.json` | `c5c8975afa42` | DSE GT (orig) |
| `gt_opus_dse_fresh.json` | `f3940da2e420` | DSE GT (anti-overfit) |
| `gt_opus_cookie_v2.json` | `fcb61dc9b332` | Cookie GT |
| `gt_opus_impressum.json` | `3e0f2f8d5f5f` | Impressum GT |
| `dse_criteria_changelog.json` | `d8d461527f5b` | DSE criteria diff |
| `cookie_criteria_changelog.json` | `9d29d7b515a5` | Cookie criteria diff |
| `impressum_fp_by_cause.json` | `9477f98c0577` | Impressum SCOPE/JUDGE split |

## 4. Reproduction

1. Archive = ground truth (the corpus hash attests to the document version at the time; if the company changes its document → new hash, and the old measurement remains verifiable via the archive).
2. Run the measurement scripts under `scripts/` against the GTs (pattern: `docker exec -i bp-compliance-backend python3 - < scripts/cc_engine_*.py`).
3. OVH is stochastic → numbers ± noise; RC values are means over the documented run.

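Before re-running the measurement scripts, it can help to confirm that the archive still matches its manifest. The sketch below assumes that `MANIFEST.json` is a JSON object mapping each relative file path to an entry with `sha256` and `bytes` keys; the actual manifest layout is not shown in this document, so that structure and the function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_archive(root: Path) -> list[str]:
    """Return relative paths whose size or hash differs from MANIFEST.json.

    Assumed manifest shape: {"<relative path>": {"sha256": "...", "bytes": 123}}
    """
    manifest = json.loads((root / "MANIFEST.json").read_text())
    mismatches = []
    for rel_path, meta in manifest.items():
        f = root / rel_path
        if (
            not f.is_file()
            or f.stat().st_size != meta["bytes"]
            or sha256_of(f) != meta["sha256"]
        ):
            mismatches.append(rel_path)
    return mismatches
```

An empty result means every listed file is present and unchanged; any returned path points at a file that should not be used as ground truth for a re-measurement.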
## 5. What does NOT happen

- Corpus full texts do **not** go into the repo, **not** into Qdrant/RAG, and **not** into the product.
- The archive is a read-only reference; calibration changes are reversible via the changelog artifacts.
@@ -0,0 +1,155 @@
# Criteria Meta-Model & Compliance-Tier Architecture

> **Status: FROZEN 2026-06-22.** Changes to this model are architecture
> decisions and require deliberate approval (DB owner / product owner).
> Related: [`platform_checker_matrix.md`](platform_checker_matrix.md),
> [`verification_method.md`](verification_method.md), [`platform_validation_v1.md`](platform_validation_v1.md).

## 1. Motivation

Calibrating the four website compliance modules revealed four **different**
dominant error causes:

| Module | Dominant lever |
|-------|------------------|
| Cookie policy | Sufficiency (judge) |
| Impressum | Scope / routing |
| AGB | Decision method / routing |
| DSE | **Overloaded controls + mixing of "legal minimum vs. best practice"** |

The DSE investigation (adjudication of 13 judge↔GT disagreements) showed that **85 % of the
remaining errors are catalog defects and 15 % are checker errors.** The largest single defect: a control
bundles several requirements of **differing bindingness** and is rated ERFÜLLT
only if *all* of them are met. As a result, legally compliant documents
are reported as "FEHLT" because a best-practice recommendation is missing.

This model fixes that **in the catalog**, without changing the checker and without
physically splitting controls.

## 2. Data model

A control remains **stable** (UUID, citations, GT history, calibration,
statistics). Its `pass_criteria` change from a list of strings to **atomic,
typed criterion objects**:

```
Control  (stable control_uuid, do NOT split)
 └─ criteria: Criterion[]

Criterion
 ├─ criterion            (text of the individual requirement)
 ├─ legal_basis          (e.g. "Art. 13(1)(c) DSGVO")
 ├─ verification_method  (axis 1: WHAT is checked)
 ├─ decision_method      (axis 2: HOW the decision is made)
 ├─ compliance_tier      (axis 3: HOW BINDING)
 └─ weight               (reserved for maturity, see §6; NOT gating today)
```

**Storage location:** `canonical_controls.generation_metadata->'tiered_criteria'` (jsonb).
**No schema change.** No physical control split (variant A was rejected:
new UUIDs → loss of benchmarks/calibration/citations/GT = a migration project).

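To make the shape of a criterion object concrete, here is a minimal sketch of how one entry of `tiered_criteria` could be modeled and serialized. The field names come from the data model above; the Python types, the default weight, and the helper `to_tiered_criteria_json` are assumptions for illustration, not the platform's actual code.

```python
import json
from dataclasses import asdict, dataclass
from typing import Literal

ComplianceTier = Literal["LEGAL_MINIMUM", "BEST_PRACTICE", "OPTIONAL"]


@dataclass(frozen=True)
class Criterion:
    criterion: str             # text of the individual requirement
    legal_basis: str           # e.g. "Art. 13(1)(c) DSGVO"
    verification_method: str   # axis 1, e.g. "CONTENT"
    decision_method: str       # axis 2, e.g. "EMBEDDING"
    compliance_tier: ComplianceTier  # axis 3
    weight: float = 1.0        # stored, but not used for gating (see §6)


def to_tiered_criteria_json(criteria: list[Criterion]) -> str:
    """Serialize criteria for storage under the `tiered_criteria` jsonb key."""
    return json.dumps([asdict(c) for c in criteria], ensure_ascii=False)
```

Because the criteria live inside the existing jsonb column, adding or re-tiering a criterion changes only the serialized list and leaves the control's UUID and history untouched.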
## 3. The three axes

Each criterion carries three **independent** classifications:

1. **`verification_method`** (artifact-dependent): CONTENT · FIELD · REFERENCE ·
   BEHAVIOR · PRESENTATION · PROCESS · TECHNICAL · CONTRACTUAL. See
   [`verification_method.md`](verification_method.md).
2. **`decision_method`** (which checker): REGEX · EMBEDDING · LLM · LINK_RESOLVER ·
   PLAYWRIGHT · AUDIT · SCANNER. See [`platform_checker_matrix.md`](platform_checker_matrix.md).
3. **`compliance_tier`** *(new, this document)* (bindingness):
   - **`LEGAL_MINIMUM`**: legally required. Affects the compliance status.
   - **`BEST_PRACTICE`**: recommended, not legally required. Appears as a
     recommendation. **Never** affects the status.
   - **`OPTIONAL`**: convenience / level of detail. Recommendation. **Never** affects the status.

Axes 1 and 2 are primarily **per criterion** (atomic); a control can mix criteria
that use different methods.

## 4. Status computation (3 states): gating ONLY on LEGAL_MINIMUM

Let `LM` be the set of `LEGAL_MINIMUM` criteria of a control and `met(LM)` the
number of those that are met:

```
ERFÜLLT   := |LM| > 0 and met(LM) == |LM|   (all mandatory criteria met)
TEILWEISE := 0 < met(LM) < |LM|             (at least one met, at least one missing)
FEHLT     := |LM| > 0 and met(LM) == 0      (no mandatory criterion met)
```

`BEST_PRACTICE`/`OPTIONAL` criteria do **not** enter this computation. They
are reported separately as recommendations (§5, level 2).

> **Invariant:** A fulfilled legal minimum must NEVER be pulled down to FEHLT/red
> by missing best-practice or optional criteria.

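The rules above translate into a short function. The sketch below is hypothetical (`status_from_legal_minimum` is not the engine's function name) and adds two cases the formulas leave open, taken from the `_tiered_eval` unit tests in this changeset: a control with no `LEGAL_MINIMUM` criteria is treated as ERFÜLLT, and any criterion that could not be evaluated (`None`) yields UNBESTIMMT.

```python
from typing import Optional


def status_from_legal_minimum(lm_results: list[Optional[bool]]) -> str:
    """Map LEGAL_MINIMUM verdicts (True / False / None) to a status string.

    Only LEGAL_MINIMUM verdicts are passed in; BEST_PRACTICE and OPTIONAL
    criteria never reach this function, which enforces the invariant.
    """
    if any(r is None for r in lm_results):
        return "UNBESTIMMT"  # at least one criterion could not be evaluated
    met = sum(1 for r in lm_results if r)
    if met == len(lm_results):
        return "ERFÜLLT"     # also covers the empty list (no LM criteria)
    if met == 0:
        return "FEHLT"
    return "TEILWEISE"
```

Filtering by tier before calling the function, rather than inside it, keeps the gating rule visible at the call site and makes it hard to let a recommendation influence the legal status by accident.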
## 5. Reporting: three levels

| Level | Content | Source |
|-------|--------|--------|
| **1: Compliance status (legal)** | ERFÜLLT / TEILWEISE / FEHLT | ONLY `LEGAL_MINIMUM` |
| **2: Optimization potential** | "Recommendations: N · best-practice coverage X %" | `BEST_PRACTICE` + `OPTIONAL` |
| **3: Risk maturity** *(optional, later)* | "Maturity Y %" for CRA/NIS2/ISO 27001/TOM | weighted, see §6 |

**Anti-pattern (forbidden):** no "compliance score = 72 %" when all legal
requirements are met. That provokes "which 28 % are missing?" → "actually nothing
mandatory" → the score becomes worthless.

### Color semantics (meaning, not judgment)

- **Green** = legal requirements met (obligation fulfilled)
- **Blue** = recommended improvements available (optimization possible)
- **Red** = legal requirements missing (breach of obligation)

`TEILWEISE` is visually a separate state (e.g. yellow/amber): the obligation is partially
met. This ties in with the BreakPilot tone (no panic red) and the
3-tier obligation model (Pflicht/Empfehlung/Kann, i.e. must/should/may).

## 6. `weight`

`weight` is **stored today but not used for gating** (a deliberate
decision: weights immediately provoke "why 0.3 and not 0.4?" discussions). It
is the reserve for **level 3 (maturity)**: later, a weighted
best-practice/maturity percentage can be computed from it. Guideline values: LEGAL_MINIMUM 1.0 ·
BEST_PRACTICE ~0.3 · OPTIONAL ~0.1.

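As an illustration of what a later maturity value could look like, the sketch below computes a weighted share of met criteria using the guideline weights. The weighted-average formula and the names `GUIDELINE_WEIGHTS` and `maturity_percent` are assumptions; this document only reserves `weight` for such a calculation and does not define it. The result is a maturity indicator and must not be presented as a compliance status.

```python
# Guideline values from this section; real per-criterion weights may differ.
GUIDELINE_WEIGHTS = {"LEGAL_MINIMUM": 1.0, "BEST_PRACTICE": 0.3, "OPTIONAL": 0.1}


def maturity_percent(criteria: list[tuple[str, bool]]) -> float:
    """Weighted share of met criteria, in percent.

    `criteria` is a list of (compliance_tier, met) pairs. The weighted-average
    formula is an assumption made for this sketch.
    """
    total = sum(GUIDELINE_WEIGHTS[tier] for tier, _ in criteria)
    if total == 0:
        return 0.0  # no criteria at all: nothing to measure
    achieved = sum(GUIDELINE_WEIGHTS[tier] for tier, met in criteria if met)
    return 100.0 * achieved / total
```

With one met `LEGAL_MINIMUM` criterion and one unmet criterion each of `BEST_PRACTICE` and `OPTIONAL`, the value is 1.0 / 1.4, roughly 71 %, while the legal status of the same control is still ERFÜLLT. That gap is the reason the two numbers are reported on separate levels.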
## 7. compliance_tier is a PLATFORM axis

This is not just a DSE fix. The same pattern appears everywhere: DSE (minimum vs. BP),
Cookie (disclosure vs. transparency), Impressum (mandatory vs. convenience fields), AGB
(required vs. recommended), and prospectively CRA/NIS2/Machinery Regulation.
Every individual criterion carries `compliance_tier`; the platform evaluates
**compliance / recommendations / maturity** independently of the regulation.

## 8. Validation evidence (pilot, 2026-06-22)

Written on macmini (`generation_metadata.tiered_criteria`, prod-guarded), measured against Opus-GT (ikea/ob/teamviewer):

- **5 pilot controls** (SEC-7285-A03, SEC-3257-A01, portability cluster DATA-1613/DATA-2552/COMP-2087): all **6 disagreement cases** (previously false FEHLT) move to **ERFÜLLT + recommendations**; real gaps stay correctly FEHLT, with no change to the checker.
- **TEILWEISE validation** (DATA-1445-A02, SEC-4752-A02): the third status occurs in practice (1 ERFÜLLT / 5 TEILWEISE); the splitter is consistently "retention period per purpose" (Art. 13(2)(a)).
- Lesson: even pilot criteria can mix minimum and best practice ("retention period *per purpose*"). The LM/BP line is a **product-policy decision (human)**, not an NLP problem. The model is correct; sharpening the criteria is curation work.

## 9. Invariants (do not violate)

1. The control UUID stays stable: **no** physical split.
2. Status (green/yellow/red) depends **exclusively** on `LEGAL_MINIMUM`.
3. `BEST_PRACTICE`/`OPTIONAL` produce recommendations, **never** a FEHLT status.
4. No percentage compliance score when all legal requirements are met.
5. Storage in `generation_metadata` (jsonb): no schema migration.

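The status invariants can be sketched as a small pure function. This is a minimal illustration, not the engine's code: `TieredCriterion`, `evaluate_control`, and the rule that "some but not all legal criteria met" yields `TEILWEISE` are assumptions, as is treating a control without any `LEGAL_MINIMUM` criterion as `ERFÜLLT`.

```python
from dataclasses import dataclass


@dataclass
class TieredCriterion:
    text: str
    compliance_tier: str  # "LEGAL_MINIMUM" | "BEST_PRACTICE" | "OPTIONAL"
    met: bool


def evaluate_control(criteria: list[TieredCriterion]) -> dict:
    """Derive the status from LEGAL_MINIMUM only; other tiers become recommendations."""
    legal = [c for c in criteria if c.compliance_tier == "LEGAL_MINIMUM"]
    met_count = sum(c.met for c in legal)
    if met_count == len(legal):
        status = "ERFÜLLT"    # assumption: also the result when no legal criterion exists
    elif met_count == 0:
        status = "FEHLT"
    else:
        status = "TEILWEISE"  # assumption: partial = some, but not all, legal criteria met
    # Unmet BEST_PRACTICE/OPTIONAL criteria never change the status.
    recommendations = [
        c.text for c in criteria
        if c.compliance_tier != "LEGAL_MINIMUM" and not c.met
    ]
    return {"status": status, "recommendations": recommendations}
```

Because the function returns no percentage at all, invariant 4 holds by construction.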
## 10. Rollout (after this freeze)

1. Tier **10 to 15** of the worst overloaded DSE controls (not all 49 at once).
2. Wire the 3-status logic into the live DSE engine (today it exists only in the measurement harness).
3. Re-run the benchmark: FP / FN / precision / recall + status distribution.
4. Only once the effect is stable: roll out to all 49 overloaded controls.

@@ -0,0 +1,153 @@
# DSE engine: validation & versioning `DSE_v1`

> **Status:** release candidate (qualitatively validated, not "done")
> **Date:** 2026-06-19
> **Module:** privacy policy (Datenschutzerklärung, `doc_type = dse`)
> **Purpose of this document:** to record traceably *how* the DSE engine was measured, *which* controls were corrected and *why*, and *which* reproducible process resulted, as a reference for the cookie policy, Impressum, AGB, CRA, NIS2, AI Act (KI-VO), etc.

## 1. Summary

The central message is not "FP 11% → 6%" but: **false positives were halved without losing protection against real gaps** (demonstrated deterministically, 0 swallowed gaps).

| KPI (vs Opus-GT, 5 companies, 432 applicable controls) | before | `DSE_v1` |
|---|---|---|
| False findings (engine FEHLT / GT ERFÜLLT) | 51 (11%) | **26 (6%)** |
| Missed gaps (engine ERFÜLLT / GT FEHLT) | 32 (7%) | **~31 (~7%, stable)** |
| Recall | 0.76 | **0.88** |
| Precision | 0.83 | **0.84** |

The cause of the false findings was **not** a weak LLM but ~11 controls that demanded more than the law does. Correcting the rule improves OVH, Claude, embeddings, and human auditors at the same time, which makes it the highest-ROI lever.

## 2. Test corpus (anti-overfit)

Five representative privacy policies, retrieved from the live sites, of different sizes and industries:

| Company | Character | DSE size |
|---|---|---|
| BMW | large corporation, very extensive | ~64k characters |
| Mercedes-Benz | large corporation | ~19k |
| ELLI (VW) | subsidiary / energy | ~26k |
| ETO | B2B mid-sized company | ~26k |
| SafetyKon | small B2B, thin DSE | ~5k |

The thin SafetyKon DSE is deliberately the stress test against criteria that are **too lax**.

## 3. Ground-truth method

- **Oracle:** `claude-opus-4-8` (strongest available model; NOT Haiku, which was too lax / "N/A-blind" on the 5-company DSE set). One verdict `ERFUELLT | FEHLT | NA` per `(company × control)`.
- Strict legal guardrails in the prompt (retention period: circular formulas do not satisfy it; a third party's legitimate interest ≠ one's own; a mention of the EU Commission ≠ an adequacy decision).
- 456 verdicts, of which 24 are `NA` (not applicable) → 432 applicable verdicts as the measurement base.
- Important limitation: on some of the corrected controls the Opus-GT carries **the same over-strictness**, see §7 (FN evidence). On those controls the engine is by now *closer to the law* than the GT.

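The measurement base described above can be sketched as follows: `NA` verdicts are dropped, and the two error types are counted on the remaining pairs. This is an illustration under assumptions; the function name `count_errors` and the dictionary layout keyed by `(company, control)` are hypothetical and not taken from the project's scripts.

```python
def count_errors(engine: dict, gt: dict) -> dict:
    """Compare engine verdicts with GT verdicts, both keyed by (company, control)."""
    # Only GT verdicts other than NA form the measurement base.
    applicable = [key for key, verdict in gt.items() if verdict != "NA"]
    false_findings = sum(
        1 for key in applicable
        if engine.get(key) == "FEHLT" and gt[key] == "ERFUELLT"
    )
    missed_gaps = sum(
        1 for key in applicable
        if engine.get(key) == "ERFUELLT" and gt[key] == "FEHLT"
    )
    n = len(applicable)
    return {
        "applicable": n,
        "false_findings": false_findings,
        "missed_gaps": missed_gaps,
        "false_finding_rate": false_findings / n if n else 0.0,
        "missed_gap_rate": missed_gaps / n if n else 0.0,
    }
```

Precision and recall are deliberately left out of the sketch, because the document does not state which class is treated as positive.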
## 4. Measured engine architecture

Three stages, one verdict per `(company × control)`:

1. **Keyword** (deterministic) → match ⇒ ERFÜLLT.
2. **BGE-M3 embedding recall**, score < 0.50 ⇒ FEHLT (deterministic, cached).
3. **LLM judge** (OVH `gpt-oss-120b`) on the embedding passes (score ≥ 0.50): it catches the embedding's "semantically close but not fulfilled" over-passes.

**Robustness (mandatory):** under concurrency OVH sometimes returns empty responses. An empty LLM response must **never** be counted as FEHLT. `DSE_v1` measures with retry-on-empty, concurrency 3, `max_tokens` 600; if the response stays empty → `UNSICHER` (= 4-status `INSUFFICIENT_EVIDENCE` / escalation), not FEHLT.

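A minimal sketch of the three-stage verdict with the retry-on-empty rule might look like this. The 0.50 threshold and the `UNSICHER` fallback come from the text; the retry count, the function names, and the `ask_llm` callable (standing in for the real OVH call) are assumptions made for illustration.

```python
from typing import Callable, Optional

EMBEDDING_THRESHOLD = 0.50  # threshold named in the section above
MAX_RETRIES = 3             # assumption: the text does not state a retry count


def judge_with_retry(ask_llm: Callable[[], Optional[str]]) -> str:
    """Ask the judge until it returns a recognizable verdict, else UNSICHER."""
    for _ in range(MAX_RETRIES):
        answer = (ask_llm() or "").strip().upper()
        if answer in {"ERFUELLT", "FEHLT"}:
            return answer
    return "UNSICHER"  # an empty answer is never mapped to FEHLT


def cascade_verdict(keyword_hit: bool, embedding_score: float,
                    ask_llm: Callable[[], Optional[str]]) -> str:
    if keyword_hit:                            # stage 1: deterministic keyword
        return "ERFUELLT"
    if embedding_score < EMBEDDING_THRESHOLD:  # stage 2: embedding recall
        return "FEHLT"
    return judge_with_retry(ask_llm)           # stage 3: LLM judge on embedding passes
```

In this sketch an unparseable non-empty answer is retried like an empty one; whether the real harness does the same is not stated.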
## 5. Measurement history (honest, including dead ends)

| Stage | missed gaps | false findings | Note |
|---|---|---|---|
| keyword + embedding @0.58 only | 32% | 3% | embedding alone **insufficient** (over-passes) |
| + OVH LLM (55k truncation) | 5% | 19% | LLM resolves the "nothing is missing" problem; FP apparently high |
| + full text (no 55k cut) | 6% | 14% | truncation had inflated the BMW FPs |
| + robust (retry, no empty→FEHLT) | 7% | 11% | artifact-free baseline |
| **+ criteria fix = `DSE_v1`** | **~7%** | **6%** | see §6 |

**Key finding (product-relevant):** a large share of the original false findings were **empty OVH responses** that were booked as FEHLT. That is a real pipeline bug, not a content problem. Lesson for wiring the cascade: empty/timeout response → retry → `INSUFFICIENT_EVIDENCE` / Claude escalation.

## 6. The corrected controls (legal review)

12 systematic FP controls (recurring across companies) were legally reviewed (what the law demands vs. what the control demands) and split into three classes. **11 corrected, 1 (`DATA-1611-A04`, class A) left untouched.** Full old→new diffs: `dse_criteria_changelog.json` (repo root). Backup for restore: `dse_criteria_backup.json`.

### Class B: control demanded more than the law (7)

| Control | Legal norm | Core correction |
|---|---|---|
| `DATA-2260-A01` | Art. 13(1)(c) | "*primary*/single purpose" → purpose(s) in the plural suffice |
| `AUTH-3737-A06` | Art. 13(1)(c)+(e) | no purpose/legal-basis matrix *per* transfer |
| `DATA-2992-A03` | Art. 13(1)(e) | recipients/categories + purpose; no processor (AV) distinction / confidentiality |
| `DATA-1624-A03` | Art. 13(1)(f) | safeguard + access via **link OR** contact; no description of the protective effect |
| `DATA-1619-A03` | Art. 13(1)(c) | legal basis per purpose; article citation *not* mandatory |
| `DATA-424-A09` | Art. 20 | mentioning the right suffices; the format (CSV/JSON) does *not* belong in the DSE |
| `GOV-3300-A06` | Art. 20 | same as A09 (dedupe candidate with `DATA-424-A09`) |

### Class C: control ambiguous / obligations mixed (4)

| Control | Legal norm | Core correction |
|---|---|---|
| `AI-1560-A01` | Art. 13(1)(c) vs (2)(a) | retention-period requirement removed (separate obligation) |
| `SEC-3444-A04` | Art. 13(1)(c) | title/question "*restrict*" (behavior) → "disclose" |
| `DATA-1624-A06` | Art. 13(1)(f) | description of the protective effect removed; ⚠️ near-duplicate of `DATA-1624-A03` |
| `DATA-2812-A05` | Art. 17 / §25 TTDSG | title "*implement*" → "disclose"; a pointer to the cookie settings suffices |

### Class A: control correct, LLM/artifact (1, unchanged)

`DATA-1611-A04` (Art. 13(1)(c)): the criteria are legally sound; the FPs were empty OVH responses.

## 7. FN safety evidence (no softening)

Looser criteria must not wave real gaps through. After the fix, FN apparently rose 32 → 36. Causal test (11 corrected controls × 5 companies) + **deterministic text check**:

- **0 real swallowed gaps.** All 5 "new FN" are cases where the engine is now *correct* and the **Opus-GT was too strict**, as evidenced in the DSE text (bmw names "Art. 20 DSGVO", elli "EU-Standardvertragsklauseln … USA", mercedes recipients + disclosure, eto concrete purposes).
- **11 real gaps still caught** (TN), mainly in SafetyKon's thin DSE → the criteria have not become toothless.

⇒ The true engine FN stays at ~7% (stable); the apparent +4 are GT over-strictness.

## 8. Reproducible calibration process (the actual result)

Applicable to every further module (cookie, Impressum, AGB, CRA, NIS2, AI Act, DORA, Machinery Regulation):

1. Build the **Opus-GT** per `(company × control)` across 5 representative companies.
2. **Measure the engine** (keyword → embedding → robust LLM judge) vs GT.
3. **Cluster FPs**: recurring controls instead of single findings (systematic ≠ random).
4. **Law-vs-control review** of the top clusters → classes A (LLM error) / B (too strict) / C (ambiguous).
5. **Correct the criteria** (B+C), versioned, with a legal note in the changelog.
6. **Re-measure.** Mandatory: FP decreased **and** FN stable (FN causal test against over-loosening).

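Steps 3 and 6 lend themselves to a short sketch: counting in how many companies a control produces a false positive, and gating a re-measurement on "FP down, FN stable". Function names, the `min_companies` cut-off, and the `fn_tolerance` parameter are hypothetical; the project's own scripts may work differently.

```python
from collections import Counter


def cluster_false_positives(fps: list[tuple[str, str]],
                            min_companies: int = 2) -> list[tuple[str, int]]:
    """fps holds (company, control_id) pairs; return controls that recur across companies."""
    # set() removes duplicate pairs so each company counts once per control.
    per_control = Counter(control for _company, control in set(fps))
    return [(c, n) for c, n in per_control.most_common() if n >= min_companies]


def remeasure_ok(fp_before: int, fp_after: int,
                 fn_before: int, fn_after: int,
                 fn_tolerance: int = 0) -> bool:
    """Step 6 gate: FP must drop and FN must not grow beyond the tolerance."""
    return fp_after < fp_before and fn_after <= fn_before + fn_tolerance
```

A failing gate is a trigger for the FN causal test rather than an automatic rejection, since an apparent FN increase can also stem from an over-strict ground truth.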
## 9. Known limits / open points

- **OVH is stochastic** (no seed): ±~4 findings from run to run. For hard numbers use multiple runs / majority vote.
- **GT over-strictness** on some corrected controls → "8% FN" slightly overstates (true value ~7%).
- **Dedupe open** (separate catalog step, nothing deleted): `DATA-1624-A06`↔`A03`, `DATA-424-A09`↔`GOV-3300`.
- **macmini-dev only.** Criteria changes are reversible (backup) and **not** yet on prod.
- **Remaining FP tail** (~17 outside the top 12) left at 6%: further optimization has poor ROI; the operational lever is the Claude approval tier (cascade), not rule tuning.

## 10. Artifacts & reproduction

- GT verdicts: `/tmp/gt_opus_dse.json` (container) · candidates/scores: `/tmp/multi_company_gt.json`
- Changelog (old→new + legal note): `dse_criteria_changelog.json` · restore: `dse_criteria_backup.json`
- Scripts (MacBook `/tmp`, run via `docker exec -i bp-compliance-backend python3 -`): `cc_gt_opus_dse.py` (GT) · `cc_engine_llm_dse3.py` (robust measurement) · `cc_apply_criteria.py` (correction + versioning) · `cc_check_fn.py` (FN causal test) · `cc_verify_fn.py` (deterministic text check)

@@ -0,0 +1,43 @@
# Mapping: terms of use (Nutzungsbedingungen) & shop AGB onto the checker matrix (Prüfer-Matrix)

> **Purpose:** evidence for the thesis *"new module = classification/mapping, not a research project"*. No new architecture, no new checker types: only an assignment of existing controls plus a few new items to the existing checkers.

## 1. Shop AGB = 0 new work

The AGB corpus used for calibration (Zalando, Otto, MediaMarkt, Tchibo, Lieferando) **consists of** shop AGB. "Shop AGB" is therefore not a new module but **the AGB module itself**. Effort: **zero**.

## 2. Terms of use (platform/app ToS) = reuse + a few new items

Terms of use (NB) are contract prose like AGB, just without sale-of-goods obligations and with platform-specific obligations. Mapping:

**Reused from AGB (same checkers, new paraphrases where needed):**

| Item | verification_method | decision_method |
|---|---|---|
| scope (scope of application) | CONTENT | EMBEDDING |
| liability | CONTENT | EMBEDDING |
| jurisdiction / choice_of_law | CONTENT | EMBEDDING |
| data_protection (reference to the DSE) | REFERENCE | LINK_RESOLVER |
| salvatory_clause | CONTENT | EMBEDDING |
| amendment_clause | CONTENT | EMBEDDING |
| termination (account) | CONTENT | EMBEDDING |
| consumer_rights | CONTENT | EMBEDDING |
| dispute_odr_link (for B2C) | REFERENCE | LINK_RESOLVER |

**Not applicable (scope gate, sale of goods):** `payment*`, `delivery*`, `warranty*`, contract/incorporation (different contract formation).

**New (NB-specific), all on EXISTING checker types:**

| Item | verification_method | decision_method |
|---|---|---|
| Usage rights / permitted use | CONTENT | EMBEDDING |
| Intellectual property / proprietary rights | CONTENT | EMBEDDING |
| Availability (no entitlement to uninterrupted operation) | CONTENT | EMBEDDING |
| Account / registration / user obligations | CONTENT | EMBEDDING |
| User-generated content / code of conduct | CONTENT | EMBEDDING |
| Liability for links / third-party content | CONTENT | EMBEDDING |

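The scope gate for terms of use can be illustrated with glob patterns from the standard library. The wildcard patterns mirror the "not applicable" line of the mapping; the concrete item IDs in `agb_items`, the split of "contract/incorporation" into two patterns, and the function name `in_scope` are assumptions for the sake of the example.

```python
from fnmatch import fnmatchcase

# Patterns mirror the "not applicable" line; exact item IDs are assumed.
NOT_APPLICABLE_FOR_TERMS_OF_USE = [
    "payment*", "delivery*", "warranty*", "contract", "incorporation",
]


def in_scope(item_id: str, excluded_patterns: list[str]) -> bool:
    """Return False when the item matches any excluded glob pattern."""
    return not any(fnmatchcase(item_id, p) for p in excluded_patterns)


# Hypothetical AGB item IDs, filtered down to those reusable for terms of use.
agb_items = ["scope", "liability", "payment_methods", "delivery_time",
             "warranty", "termination"]
reused = [i for i in agb_items if in_scope(i, NOT_APPLICABLE_FOR_TERMS_OF_USE)]
```

Items removed here are never sent to any checker, which is the point of running the gate first.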
## 3. Result

- **Shop AGB:** 0 new items, 0 new checkers.
- **Terms of use:** ~9 items reused from AGB + ~6 new items, **all on existing checker types** (CONTENT/EMBEDDING + REFERENCE). **0 new checker types.**

A new web document is thus a **mapping/classification** and paraphrase-writing problem (hours to days), not a measurement/research project (weeks). That is exactly the thesis of the checker matrix.

@@ -0,0 +1,87 @@
# Checker matrix (Prüfer-Matrix): meta-model of the doc-check platform

> **Status:** platform concept, **frozen 2026-06-21**. Derived from 4 calibrated modules (DSE, cookie, Impressum, AGB). Extends `verification_method.md` (5→8 classes) and adds the `decision_method` axis.
> **Core statement:** *Not every control needs the same judge.* The **control type determines the checker**; not everything is a text/LLM problem.

## 0. The architectural shift

**Before (implicit):** `Control → Embedding → LLM → Finding`.

**Now (empirically demonstrated):**
```
Control → [scope-gate] → artifact_type → verification_method → decision_method
        → matching checker → Evidence → Finding (severity-gated)
```

Four structurally different document types kept leading back to the same meta-structure. That is bigger than any single fix: it is very likely the routing principle for all ~14,000 master controls.

## 1. Empirical basis (4 modules)

| Module | dominant checker | Evidence |
|---|---|---|
| DSE | CONTENT (LLM/embedding) | criteria calibration, FP 11→6% |
| Cookie banner | BEHAVIOR | enforcement / dark pattern (Playwright) |
| Cookie policy | CONTENT + REFERENCE | content + references |
| Impressum | FIELD + PRESENTATION (+ SCOPE gate) | field matcher FP 0%, presentation re-routed |
| AGB | CONTENT (KEYWORD→EMBEDDING→LLM) + REFERENCE (+ SCOPE gate) | 71% FP → ~0; LLM for only 2/21 items |

## 2. Axis 1: `verification_method` (which checker TYPE)

| verification_method | Checker | Guiding question | Evidence | Maturity |
|---|---|---|---|---|
| **CONTENT** | embedding + LLM cascade | What does the text state (as a disclosure)? | DSE, cookie policy | calibrated |
| **FIELD** | regex / extraction (field matrix) | Which mandatory fields exist and are valid? | Impressum (HRB, USt-IdNr, address) | ✓ FP 0% |
| **REFERENCE** | link resolver | Is there a clear reference/link, and does it resolve? | AGB `data_protection` | ✓ 7/7 |
| **BEHAVIOR** | Playwright + API | Does the UI manipulate the decision? | cookie banner (Reject=Accept, pre-consent cookies) | matrix available |
| **PRESENTATION** | Playwright UI sensor | Findable / visible / reachable? | Impressum "easily recognizable" | re-routed |
| **PROCESS** | audit / evidence | Is there internal evidence? | VVT, TOM, internal policy | checklist |
| **TECHNICAL** | scanner (repo / network / config) | Is the technical measure implemented? | planned: CRA, NIS2, ISO 27001 | open |
| **CONTRACTUAL** | clause engine | Is the clause present and lawful? | AGB (delivery/warranty; defects → stage 3) | partial |

**CONTENT vs CONTRACTUAL:** CONTENT = disclosure prose (the DSE names purposes). CONTRACTUAL = contract clauses (AGB liability/delivery). Both can use embedding + LLM; what separates them is the legal nature plus the later defect check (is the clause unlawful?).

**PRESENTATION ≠ BEHAVIOR:** both use Playwright, with different legal logic. PRESENTATION = findability/visibility; BEHAVIOR = decision manipulation / dark pattern.

## 3. Axis 2: `decision_method` (HOW the decision is made within CONTENT/CONTRACTUAL)

The AGB discovery: **controls WITHIN one checker type need different deciders.** Escalate only when needed (cost discipline):

| decision_method | Mechanism | When | Evidence (AGB) |
|---|---|---|---|
| **KEYWORD** | regex match | obligation unambiguously worded | keyword layer |
| **EMBEDDING** | per-item cosine threshold (doc chunks × item paraphrases) | prose, semantically separable | 13/21 items, 0 false rescues |
| **LLM** | clause retrieval (**whole § sections**) + strong model, present/absent | semantically tight (embedding does not separate) | 2/21 items (delivery/warranty), 14/14 |

`CONTENT_SIMPLE` = KEYWORD/EMBEDDING suffices; `CONTENT_COMPLEX` = LLM required. AGB balance: **81% deterministic, 19% LLM-capable**, with the LLM actually used only on a keyword miss.

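The escalation order can be sketched as one decision function per item. This is an assumption-laden illustration: the item dictionary layout (`keywords`, `decision_method`, `paraphrase_vecs`, `threshold`), the return labels, and the plain-Python cosine are invented for the example; real embeddings would come from the embedding model, and the actual LLM call is left to the caller.

```python
import math
import re


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def decide(item: dict, text: str, chunk_vecs: list[list[float]]) -> str:
    """Return PRESENT, ABSENT or NEEDS_LLM for one item."""
    # KEYWORD: a regex hit settles the item without further cost.
    if any(re.search(p, text, re.IGNORECASE) for p in item.get("keywords", [])):
        return "PRESENT"
    # Items calibrated as LLM-only skip the embedding stage on a keyword miss.
    if item["decision_method"] == "LLM":
        return "NEEDS_LLM"
    # EMBEDDING: best cosine over doc chunks x item paraphrases,
    # compared with the item's own threshold (not a global one).
    best = max(
        (cosine(c, p) for c in chunk_vecs for p in item["paraphrase_vecs"]),
        default=0.0,
    )
    return "PRESENT" if best >= item["threshold"] else "ABSENT"
```

Keeping the threshold on the item rather than in a module-level constant is the design choice the AGB calibration argues for.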
## 4. Durable per-control metadata (the routing vocabulary)

| Field | Purpose |
|---|---|
| `artifact_type` | which artifact is checked → scanner routing |
| `obligation_type` | legal nature: mandatory / recommended / optional → tier |
| `check_intent` | what the check is meant to establish |
| `reference_allowed` | may be satisfied by reference → REFERENCE instead of CONTENT |
| `scope` / `scope_requires` | applicability gate (business model, legal form), **before** all checkers |
| `verification_method` | axis 1 (checker type) |
| `decision_method` | axis 2 (decider within CONTENT/CONTRACTUAL) |
| `severity` | HIGH / MEDIUM / LOW → finding vs recommendation |

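To make the routing vocabulary concrete, here is a minimal dispatch sketch. It uses only three of the fields from the table; the `Control` dataclass, the `route` function, the result labels, and the rule that only `LOW` severity downgrades a failure to a recommendation are hypothetical choices for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Control:
    control_id: str
    verification_method: str  # e.g. "CONTENT", "FIELD", "REFERENCE"
    severity: str = "MEDIUM"  # "HIGH" | "MEDIUM" | "LOW"
    scope_requires: set[str] = field(default_factory=set)


def route(control: Control, company_traits: set[str],
          checkers: dict[str, Callable[[Control], bool]]) -> str:
    """Return NA, OK, FINDING or RECOMMENDATION for one control."""
    # Scope gate first: controls that do not apply are never checked.
    if not control.scope_requires <= company_traits:
        return "NA"
    # Dispatch to the checker registered for this verification_method.
    passed = checkers[control.verification_method](control)
    if passed:
        return "OK"
    # Severity decides finding vs recommendation (the LOW cut-off is an assumption).
    return "RECOMMENDATION" if control.severity == "LOW" else "FINDING"
```

A registry such as `{"FIELD": field_matcher, "REFERENCE": link_resolver}` (names invented here) would let new checker types be added without touching the routing function.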
## 5. Hard-won platform principles

1. **Route, don't uniformly-LLM**: different controls, different checkers.
2. Escalate **KEYWORD → EMBEDDING → LLM only when needed** (AGB: 17/21 without LLM).
3. Embedding: **per-item thresholds** (a global threshold fails on legal prose: PASS/FAIL overlap globally but separate per item).
4. LLM judge: **whole § sections** beat top-k chunks; **pin the strong tier** (a cheap-first cascade does NOT escalate confidently wrong answers, because the confidence heuristic is blind to accuracy); keep **present/absent** separate from the defect check.
5. **REFERENCE (link) is a cheap checker of its own**: do not send a "see privacy policy" reference through an LLM.
6. **The SCOPE gate (applicability) comes before all checkers**: N/A controls are never checked.
7. **Severity → finding vs recommendation** (tier it, do not drop it).
8. *What cannot be proven in the text does not belong in the text check.*

## 6. Schema status

No DB changes (DB frozen). `verification_method` + `decision_method` live as **derived tags** in `control_classification` (derived from `artifact_type` / `obligation_type` / `check_intent` + item calibration). `canonical_controls.verification_method` exists (~4% populated, coarser enterprise taxonomy) and is **not** the doc-check routing.

## 7. Binding force

This is the **contract** to implement against. The AGB integration and the next modules (terms of use, withdrawal (Widerruf), CRA, Machinery Regulation, DORA, NIS2, ISO 27001, AI Act, VVT, TOM) build **the same** routing layer, not a module-local one. Order: **(1) freeze this matrix → (2) integrate AGB → (3) terms of use → (4) withdrawal.**

@@ -0,0 +1,64 @@
# BreakPilot: evidence & quality record (website compliance v1)

> **Status:** consolidated freeze state 2026-06-21. Evidence base from 4 calibrated modules (DSE, cookie, Impressum, AGB). Serves as (a) the technical freeze record and (b) the backbone for sales/investors.
> **Note:** numbers = *measured* validation results against Opus ground truth. For the tool/prod integration status per module see §7 (validated ≠ already live everywhere).

## 1. Core statement

Most compliance tools do: **document → LLM → finding**, one judge for everything. That produces systematic false positives and has *no* robust evidence base.

BreakPilot does: **document → control routing → specialized checker → finding.**

> We **empirically determined the optimal checker for each control type**, with real before/after numbers, not marketing.

This is reproducibly demonstrated across 4 structurally different document types, and is therefore expected to be the routing principle for all ~14,000 master controls.

## 2. The architecture (two routing axes)

Full chain: **Regulation → Obligation → Control → verification_method → decision_method → checker → evidence → finding → ticket.**

- **`verification_method`** (category / which checker type): CONTENT · FIELD · REFERENCE · BEHAVIOR · PRESENTATION · PROCESS · TECHNICAL · CONTRACTUAL.
- **`decision_method`** (concrete mechanism): REGEX · EMBEDDING · LLM · LINK_RESOLVER · PLAYWRIGHT · AUDIT · SCANNER.

Core rule: *What cannot be proven in the text does not belong in the text check.* The scope gate (applicability) runs before all checkers; severity decides finding vs. recommendation.

## 3. Evidence per module

| Module | dominant checker | measured result | Lever | Maturity |
|---|---|---|---|---|
| **DSE** | CONTENT (embedding+LLM) | false positives **11% → 6%**; validated on **8 companies**, generalization demonstrated (no overfit to a single assessor); Claude-tier path → ~2% known | criteria calibration + LLM cascade | **RC** |
| **Impressum** | FIELD + PRESENTATION (+ scope gate) | **171 false findings → 0** (scope gate); field matrix (company/address/HRB/USt-IdNr/contact) **FP 0%, recall 1.0**; 5 presentation controls re-routed to Playwright | scope gate + deterministic field matcher beats the LLM | **RC** |
| **Cookie** | BEHAVIOR + CONTENT | artifact-type separation **banner ≠ policy** validated (controls ran against the wrong artifact → re-routed); browser behavior matrix (enforcement, dark pattern, Reject=Accept) | artifact-type routing + Playwright behavior sensor | Wave-1 (GT stab. open) |
| **AGB** | CONTENT + REFERENCE + LLM | **71% FP → ~0** (7-company Opus-GT): 49 findings / 35 false → cleaned up; embedding rescue **killed 21 recall FPs, 0 false rescues**; LLM judge (whole § sections) **14/14**; reference check **7/7** | **decision_method per item** (17 EMBEDDING, 2 LLM, 1 REFERENCE) | architecture validated |

## 4. Why the numbers are robust (methodological rigor)

- **Ground truth with the strongest model** (Opus-4-8), not with cheap models.
- **Prove, don't handwave:** real FP/FN counts, before/after, no bare assertions.
- **Generalization instead of overfit:** multi-company GT (DSE 8, AGB 7) + explicit guardrails against single-assessor overfit.
- **Multi-reference validation:** 3-way for AGB (Opus-GT × Claude self-assessment × runtime cascade), which even uncovered an error in the GT itself.
- **Sample before build-out:** before every expensive classification/batch a stratified sample was checked first (several times this prevented building on a wrong foundation).

## 5. The key discovery (AGB)

Different controls **within the same module** need different judges. Evidence:

- A **global** embedding threshold fails on legal prose; **per-item thresholds** separate cleanly.
- **Whole-section retrieval** (whole § sections) clearly beats top-k chunks for the LLM judge.
- A **cheap-first cascade LLM** is unfit as a judge (it does not escalate confidently wrong answers); pin the strong tier for hard items.
- A **reference** ("see privacy policy") is a REFERENCE/link check, **not** an LLM case.

## 6. Competitive positioning

| | Typical tool | BreakPilot |
|---|---|---|
| Checking approach | one LLM for everything | control routing → specialized checker |
| False positives | systematic (LLM on non-text obligations) | minimized per control type (measured) |
| Evidence base | none | multi-company GT, reproducible numbers |
| Scaling to new regulations | from scratch every time | mapping onto the existing checker matrix |

## 7. Maturity, honesty & roadmap

- **Validated (measurement):** all 4 modules above.
- **Live in the tool:** DSE criteria (prod). Impressum scope/field matrix, cookie artifact type, and AGB-C-lean are **validated but not yet integrated into the product everywhere** → demo integration is the next step (making before/after demonstrable live).
- **Website/marketing compliance: complete** (DSE/Impressum/cookie/AGB + architecture). Remaining web document types (terms of use, shop AGB, legal notice, social media) = **mapping**, no new architecture.
- **Next major stage (after sales):** industrial compliance (CRA, Machinery Regulation, NIS2, DORA, ISO 27001, TISAX, AI Act) with new checker types TECHNICAL/PROCESS/EVIDENCE/SYSTEM; the checker matrix is reused there.

@@ -0,0 +1,74 @@
# Platform validation of the doc-check calibration: `platform_validation_v1`

> **Status:** platform methodology validated across 3 structurally different document classes (2026-06-19).
> **Purpose:** not to document one module, but the **calibration process** and the **empirical error map** of the engine, so that the *causes* are preserved (not just the measurements). Insight > metric.

## 1. What was validated here

Before this round it was unclear whether the residual error of the doc-check engine came from the **LLM**, the **embedding**, the **prompt**, the **applicability**, or the **control catalog**; everything was mixed together. After DSE + cookie + Impressum there is a **robust taxonomy of error causes**, and the **calibration process** has delivered in three very different domains. That is the real achievement, bigger than any single number.

## 2. The calibration process (reusable core)

1. **Opus-GT** per `(company × control)` across 5 to 9 representative companies (strongest model, NOT Haiku).
2. **Engine measurement** (keyword → BGE-M3 embedding → robust LLM judge) vs GT.
3. **FP clusters**: recurring controls instead of single findings (systematic ≠ random).
4. **Cause classification** per FP: `SCOPE` / `ARTIFACT_TYPE` / `CRITERIA` / `JUDGE`.
5. **Fix** the dominant cause (versioned, with a legal note).
6. **Re-measure.** Mandatory: FP↓ **and** FN stable. Plus **anti-overfit** on unseen companies.

## 3. Platform error map (core result)

| Module | Dominant cause | Lever | Result | Status |
|---|---|---|---|---|
| **DSE** | criteria too strict | criteria calibration (11 controls) | FP 11% → **6%**, FN ~7%; **generalizes** (8 companies; fresh FP 7% / FN 5%) | release candidate |
| **Cookie** | `artifact_type` (banner ≠ policy) | 31 banner controls → `COOKIE_BANNER`; 21 criteria (category instead of per-cookie, citation optional), per-cookie = best practice | precision 0.81 → **0.95**, recall 0.26 → **0.44**, missed gaps → **0%**, abs. FP 71 → 54 | Wave-1 (dev) |
| **Impressum** | **scope** (GT-NA 48%) + **field extraction** + **presentation** | scope gate (14 removed) + **field-matrix matcher** (facts) + **PRESENTATION_CHECK** re-route (5) | raw: SCOPE-FP 105 / JUDGE-FP 66 → **text-check FP 0% / FN 2%** | release candidate |

## 4. Meta findings

- **The generic architecture proves itself.** Every domain has a *different* dominant problem; `artifact_type` / `obligation_type` / `scope` carry different weight in each. A good generic architecture does not produce the same effect everywhere but solves a different problem per domain. That is exactly what happened.
- **The target architecture is domain-adaptive, not uniform.** "Embedding → OVH → Claude" is not right everywhere: for **prose** (DSE/cookie) the LLM cascade is the lever; for **structured fact documents** (Impressum) the LLM is actually weak (it misses addresses/fields that *are there*) → there the **scope gate + deterministic field matcher** beat the LLM judge.
- **Recurring anti-pattern:** "supposed judge error → actually a catalog error" (scope, presentation instead of content, mis-typing). Only AFTER the catalog fixes is the remainder a *real* judge error.

## 4b. The `verification_method` axis (synthesis, the real lesson)

Not every compliance obligation is a text problem. The 5 discovered error classes map onto **5 checker types**: a new routing-metadata axis `verification_method` that tells a control *which checker* is responsible (not everything goes to the LLM):

| `verification_method` | Checker | Question | Example | Status |
|---|---|---|---|---|
| **CONTENT** | embedding + LLM cascade (OVH→Claude) | What does it say? | DSE names purposes; cookie policy | DSE/cookie calibrated |
| **FIELD** | regex/parser (field matrix) | Which fields exist? | HRB, USt-IdNr, address | Impressum facts ✓ (FP 0%) |
| **PRESENTATION** | Playwright (visibility/reachability) | Is it findable/perceivable? | Impressum easily recognizable, permanently available; footer not covered | re-route done; check open |
| **BEHAVIOR** | Playwright + API (interaction) | Does it manipulate the decision? | Reject = Accept, consent BEFORE cookie, no dark pattern | cookie-banner matrix exists |
| **PROCESS** | audit/evidence | Is there internal evidence? | VVT, internal policy, audit decision | org checklist |

**PRESENTATION ≠ BEHAVIOR** (both Playwright, different legal logic): presentation = *findability/visibility/accessibility* (Impressum easily recognizable); behavior = *decision manipulation / dark pattern* (Reject hidden). Keep them separate.

**Playwright thereby turns from a crawler into a compliance sensor:** it checks what no LLM can: `display:none`, `font-size:4px`, a cookie layer covering the footer. The LLM sees `<a href="/impressum">` and says "fulfilled"; Playwright sees the occlusion and says "not fulfilled".

**Core rule of the architecture:** *What cannot be proven in the text does not belong in the text check.* → route by `verification_method`. Once the classes are cleanly separated, FPs drop almost automatically (Impressum: SCOPE+JUDGE 171 → text-check FP 0).

**Schema status:** `canonical_controls.verification_method` exists (only ~4% populated, different/coarser taxonomy document/code_review/tool/hybrid); `doc_check_controls` does not have it. The doc-check routing axis defined here is **derivable from `control_classification` (artifact_type/obligation_type/check_intent)** → no schema change (frozen) needed; keep it as a derived tag in `control_classification`.

## 5. Measurement discipline (prove, don't handwave)

- Build the GT (ground truth) with the strongest model (`claude-opus-4-8`), not Haiku (too lenient).
- Be robust against empty LLM responses: retry + `INSUFFICIENT_EVIDENCE`/escalation instead of FEHLT ("missing"); this was a real production bug that artificially inflated the FP count.
- **Anti-overfit:** calibrate the criteria against the law, then cross-check on *unseen* companies (DSE: 5 original + 3 fresh → stable numbers = no overfit).
- OVH is stochastic (±~noise per run) and stricter than Opus → across modules the residual FP converges on **OVH over-strictness**.
- **Circularity guardrail:** Claude = Opus GT model → a Claude-tier sim measures the *cascade reach* (whether it reaches Opus level), not an independent validation.

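The retry-and-escalate rule for empty LLM responses can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `judge_with_retry`, the `ERFUELLT` label, and the default of three attempts are assumptions; only `FEHLT` and `INSUFFICIENT_EVIDENCE` come from the text above.

```python
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    ERFUELLT = "ERFUELLT"                            # hypothetical label for "fulfilled"
    FEHLT = "FEHLT"                                  # "missing": a real negative finding
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"  # no usable answer: escalate


def judge_with_retry(
    ask_llm: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> Verdict:
    """Call the judge up to max_attempts times; never map an empty answer to FEHLT."""
    for _ in range(max_attempts):
        raw = ask_llm()
        if raw is None or not raw.strip():
            continue  # empty response: retry instead of counting it as a failure
        label = raw.strip().upper()
        if label in Verdict.__members__:
            return Verdict[label]
        # unparseable answer: treat it like an empty one and retry
    return Verdict.INSUFFICIENT_EVIDENCE
```

The point of the design is that `FEHLT` can only come from an answer the judge actually gave; silence ends in `INSUFFICIENT_EVIDENCE`, which a caller can escalate to the next tier.
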
## 6. Open items (in order)

1. **Claude-tier sim (DSE + Cookie):** quantifies the remaining **pure** judge error after all catalog fixes, the last big unknown variable. Expectation: smaller than the raw figure, because much of what looked like "judge" turned out to be catalog.
2. **Impressum fix:** wire up the legal-form scope gate (#33) + deterministic field matcher + re-measurement.
3. **Cookie Wave-2** (Cluster-E) + production re-route of the 31 banner controls (`control_classification`).
4. **Go-live** of DSE + Cookie (last; verify-first DB write).

## 7. Artifacts

- DSE: `docs-src/development/dse_v1_validation.md`, `dse_criteria_changelog.json`/`dse_criteria_backup.json`.
- Cookie: `cookie_criteria_changelog.json`/`cookie_criteria_backup.json`/`cookie_best_practice.json` (container `/tmp`), cluster map.
- Impressum: `impressum_fp_by_cause.json` (SCOPE/JUDGE split).
- Memory: `project_engine_quality.md` (details per module). Tools: `cc_gt_opus_*`, `cc_engine_*`, `cc_*_candidates*` (all on macmini `/tmp`).
- **All control changes only on macmini-dev**, versioned, reversible; production rollout pending.

@@ -0,0 +1,59 @@
|
||||
# `verification_method`: the checker-routing axis

> **Status:** architecture axis (2026-06-19), derived from the 3-module calibration (DSE / Cookie / Impressum).
> **Key statement:** *Not every compliance obligation is a text problem.* `verification_method` tells a control **which checker** is responsible, so that not everything hinges on the LLM.

## 1. Why this axis exists

Calibrating three structurally different document classes revealed three **different** dominant error causes, and all of them traced back to *choosing the wrong checker*:

- **DSE** (prose): LLM verdict too strict → criteria calibration. The checker was right (LLM); the criteria were wrong.
- **Cookie** (banner ≠ policy): controls checked against the wrong artifact → `artifact_type` re-route.
- **Impressum** (fact document): the LLM misses fields that *are present* (address, HRB) → a deterministic field matcher beats the LLM. And 5 controls were **not provable from the text at all** (reachability/availability) → they belong to Playwright, not to the text check.

**Rule:** *What cannot be proven from the text does not belong in the text check.* Once the classes are cleanly separated, the false positives drop almost automatically (evidence from Impressum: SCOPE+JUDGE 171 raw FP → text-check FP 0).

## 2. The five classes

| `verification_method` | Checker | Guiding question | Example | Maturity |
|---|---|---|---|---|
| **CONTENT** | Embedding recall + LLM cascade (OVH→Claude) | What does the text say? | DSE names processing purposes; cookie policy | DSE/Cookie calibrated |
| **FIELD** | Regex / parser (field matrix) | Which fields exist and are valid? | HRB, USt-IdNr, postal address, email + phone | Impressum facts ✓ (FP 0 %) |
| **PRESENTATION** | Playwright (rendering sensor) | Can it be found / perceived / reached? | Impressum "easily recognizable", permanently available, footer not covered | Re-route done, checker open |
| **BEHAVIOR** | Playwright + API (interaction) | Does the UI manipulate the decision? | Reject = Accept, cookies BEFORE consent, dark pattern | Cookie-banner matrix in place |
| **PROCESS** | Audit / evidence | Is there internal evidence? | VVT, internal policy, audit decision | Org checklist |

## 3. PRESENTATION ≠ BEHAVIOR

Both use Playwright but check **different legal logic**, so keep them separate:

- **PRESENTATION** = findability / visibility / accessibility. Example: Impressum link reachable, not in 4px type, not behind `display:none`, not permanently covered by the cookie layer.
- **BEHAVIOR** = decision manipulation / dark pattern. Example: "Reject" hidden, options preselected, consent technically ignored.

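A PRESENTATION check ultimately reduces to a decision over rendered properties rather than over text. The sketch below assumes a rendering sensor (for example Playwright) has already collected a few values for the Impressum link. The `RenderedLink` fields, the function name, and the 9 px threshold are illustrative assumptions, not values from the calibration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderedLink:
    """Rendering facts about one link, as a browser sensor might report them."""
    display: str          # computed CSS `display` value
    font_size_px: float   # computed font size in pixels
    covered: bool         # True if another layer (e.g. a cookie banner) sits on top


def presentation_findings(link: RenderedLink, min_font_px: float = 9.0) -> list[str]:
    """Return the reasons the link is not perceivable; an empty list means it passes."""
    findings = []
    if link.display == "none":
        findings.append("hidden via display:none")
    if link.font_size_px < min_font_px:
        findings.append(f"font size {link.font_size_px:g}px below {min_font_px:g}px")
    if link.covered:
        findings.append("covered by another layer")
    return findings
```

Returning the list of reasons instead of a bare boolean keeps the verdict explainable, which matters when the result contradicts what a text-only check would have said.
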
## 4. Playwright as a compliance sensor (not a crawler)

Playwright checks what **no** LLM can: the LLM sees `<a href="/impressum">` and rules "fulfilled"; the sensor sees that the element is covered / invisible / unreachable and rules "not fulfilled". Three technical checkers in the long run:

- **Content checker** → LLM (CONTENT)
- **Structure checker** → regex/parser (FIELD)
- **Presentation checker** → Playwright (PRESENTATION + BEHAVIOR)

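The mapping from the routing tag to the three checkers can be written as a small dispatch table. This is a sketch under assumptions: the checker identifiers are hypothetical strings, and sending PROCESS to a manual audit queue is an assumption, since the list above names no technical checker for it.

```python
from enum import Enum


class VerificationMethod(Enum):
    CONTENT = "CONTENT"
    FIELD = "FIELD"
    PRESENTATION = "PRESENTATION"
    BEHAVIOR = "BEHAVIOR"
    PROCESS = "PROCESS"


# Checker names are placeholders; PROCESS -> manual audit is an assumption of this sketch.
CHECKER_FOR = {
    VerificationMethod.CONTENT: "content-checker",
    VerificationMethod.FIELD: "structure-checker",
    VerificationMethod.PRESENTATION: "presentation-checker",
    VerificationMethod.BEHAVIOR: "presentation-checker",
    VerificationMethod.PROCESS: "manual-audit",
}


def route(method: str) -> str:
    """Map a control's verification_method tag to the checker that should run it."""
    return CHECKER_FOR[VerificationMethod(method)]
```

An unknown tag raises `ValueError` from the enum lookup, so a mis-typed control fails loudly instead of silently falling back to the LLM text check.
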
## 5. Schema status & placement

- `canonical_controls.verification_method` **exists**, but is only ~4 % populated and uses a *different*, coarser taxonomy (`document` / `code_review` / `tool` / `hybrid`: generic enterprise verification, not the doc-check routing).
- `doc_check_controls` has **no** `verification_method` column.
- → The doc-check routing axis defined here is **new**, but **derivable** from the existing `control_classification` axes (`artifact_type` / `obligation_type` / `check_intent`). **No** schema change is needed (the DB is frozen); carry it as a derived tag in `control_classification`.

Heuristic for the derivation (a starting point, not final):

| Signal | → verification_method |
|---|---|
| `artifact_type = COOKIE_BANNER`, interaction obligation | BEHAVIOR |
| Obligation concerning reachability / visibility / "permanently available" | PRESENTATION |
| Fact field (address, register, identifier) | FIELD |
| `obligation_type` process / evidence without external effect | PROCESS |
| otherwise (substantive disclosure in prose) | CONTENT |

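The heuristic table can be expressed as a first-match-wins function over the three existing classification axes. Only `COOKIE_BANNER` is a value taken from the text; the `check_intent` values `INTERACTION`, `VISIBILITY`, `FACT_FIELD` and the `obligation_type` value `PROCESS` are hypothetical stand-ins for whatever the real `control_classification` vocabulary contains.

```python
def derive_verification_method(
    artifact_type: str,
    obligation_type: str,
    check_intent: str,
) -> str:
    """Derive the routing tag from existing classification axes; the first match wins."""
    # Banner controls that require an interaction are behavioral checks.
    if artifact_type == "COOKIE_BANNER" and check_intent == "INTERACTION":
        return "BEHAVIOR"
    # Reachability / visibility / "permanently available" obligations.
    if check_intent == "VISIBILITY":
        return "PRESENTATION"
    # Fact fields such as address, register entry, identifier.
    if check_intent == "FACT_FIELD":
        return "FIELD"
    # Internal process or evidence obligations without external effect.
    if obligation_type == "PROCESS":
        return "PROCESS"
    # Everything else is substantive disclosure in prose.
    return "CONTENT"
```

Because the tag is computed from columns that already exist, it can be recomputed at any time and stored as a derived value without touching the frozen schema.
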
## 6. Why this matters beyond the 3 modules

For the next modules (CRA, Machinery Regulation, NIS2, TISAX, ISO 27001), this axis is probably almost as important as `artifact_type`: many of these obligations are **PROCESS** or **BEHAVIOR**, not text content. Attaching them to the LLM text check produces systematic false positives. That is the real insight of the calibration: **not** that DSE/Cookie/Impressum work, but that it became clear *which checker is responsible for which kind of obligation*.

@@ -0,0 +1,68 @@
|
||||
{
|
||||
"DATA-2260-A01": {
|
||||
"title": "Primären Verarbeitungszweck schriftlich und verständlich dokumentieren",
|
||||
"check_question": "Ist der primäre Verarbeitungszweck schriftlich und verständlich dokumentiert?",
|
||||
"pass_criteria": "[\"primärer Verarbeitungszweck verständlich beschrieben\", \"Zweck der Datenerhebung nachvollziehbar genannt\"]",
|
||||
"fail_criteria": "[\"Primärzweck nicht schriftlich dokumentiert\", \"Unverständliche oder zu technische Formulierung\", \"Zu allgemeine Beschreibung ohne konkrete Bezüge\"]"
|
||||
},
|
||||
"AUTH-3737-A06": {
|
||||
"title": "Zwecke von Datenübermittlungen dokumentieren",
|
||||
"check_question": "Sind die Zwecke aller Datenübermittlungen transparent und nachvollziehbar dokumentiert?",
|
||||
"pass_criteria": "[\"Explizite Zweckangabe für jede Datenübermittlung (z.B. 'Vertragserfüllung', 'Rechtliche Verpflichtung')\", \"Rechtsgrundlage für die jeweilige Übermittlung (Art. 6 DSGVO oder spezifische Norm)\", \"Empfänger und Empfängerkategorie mit Zweckbindung\", \"Dokumentation der Zwecke in verständlicher Form für Betroffene\", \"Unterscheidung zwischen verschiedenen Übermittlungszwecken\"]",
|
||||
"fail_criteria": "[\"Generische Zweckangaben wie 'geschäftliche Zwecke' ohne Konkretisierung\", \"Fehlende Rechtsgrundlage für die Übermittlung\", \"Keine Dokumentation der Zwecke oder nur mündliche Absprachen\"]"
|
||||
},
|
||||
"DATA-2992-A03": {
|
||||
"title": "Weiterübertragung an Drittparteien dokumentieren (Zweck, Rechtsgrundlage)",
|
||||
"check_question": "Dokumentiert die Datenschutzinformation für jede Weiterübertragung an Drittparteien den Zweck und die Rechtsgrundlage?",
|
||||
"pass_criteria": "[\"Für jeden Drittpartner-Transfer: Expliziter Zweck dokumentiert (z.B. 'Zahlungsabwicklung', 'Kundenservice')\", \"Rechtsgrundlage für die Weiterübertragung genannt (z.B. 'Vertragserfüllung mit Kunde', 'Einwilligung des Betroffenen')\", \"Unterscheidung zwischen Auftragsverarbeiter und eigenverantwortlichem Verantwortlicher\", \"Informationen zu Weitergabebeschränkungen oder Vertraulichkeitsverpflichtungen\"]",
|
||||
"fail_criteria": "[\"Drittparteien genannt, aber Zweck oder Rechtsgrundlage fehlt\", \"Pauschalaussage wie 'Daten werden an Partner weitergegeben' ohne Spezifizierung\", \"Keine Unterscheidung zwischen verschiedenen Weiterübertragungsszenarien\"]"
|
||||
},
|
||||
"DATA-1624-A03": {
|
||||
"title": "Verweis auf Garantien für Drittlandtransfer bereitstellen",
|
||||
"check_question": "Werden betroffene Personen über alternative Garantien für Drittlandtransfers (falls kein Angemessenheitsbeschluss) informiert und auf diese verwiesen?",
|
||||
"pass_criteria": "[\"Aufzählung der angewendeten Transfermechanismen (z.B. 'Standardvertragsklauseln', 'Binding Corporate Rules', 'Zertifizierungen')\", \"Konkrete Beschreibung jedes Mechanismus und dessen Schutzwirkung in verständlicher Sprache\", \"Angabe, wie Betroffene die Garantiedokumente einsehen können (mit Kontaktdaten oder Link)\", \"Hinweis auf Rechte der Betroffenen (z.B. Recht auf Beschwerde, Recht auf Auskunft über Schutzmaßnahmen)\"]",
|
||||
"fail_criteria": "[\"Nur Nennung von Transfermechanismen ohne Erklärung oder Zugriff auf Dokumente\", \"Unvollständige Aufzählung (z.B. nur SCCs erwähnt, aber auch BCR verwendet)\", \"Garantien werden erwähnt, sind aber nicht tatsächlich implementiert oder dokumentiert\"]"
|
||||
},
|
||||
"DATA-1619-A03": {
|
||||
"title": "Verarbeitungszwecke und Rechtsgrundlage offenlegen",
|
||||
"check_question": "Sind Verarbeitungszwecke und Rechtsgrundlagen klar und verständlich offengelegt?",
|
||||
"pass_criteria": "[\"Konkrete Verarbeitungszwecke benannt (z.B. 'Vertragserfüllung', 'Rechnungsstellung', 'Kundenservice')\", \"Spezifische Rechtsgrundlage mit Artikel genannt (z.B. 'Art. 6 Abs. 1 Buchstabe b DSGVO')\", \"Unterscheidung zwischen verschiedenen Verarbeitungszwecken mit jeweiliger Rechtsgrundlage\", \"Verständliche Sprache ohne juristische Fachbegriffe oder mit Erklärung\", \"Trennung von Pflichtangaben und freiwilligen Verarbeitungen\"]",
|
||||
"fail_criteria": "[\"Zweck nur allgemein formuliert ('geschäftliche Zwecke', 'interne Nutzung')\", \"Rechtsgrundlage fehlt oder nur 'DSGVO' ohne Artikel und Absatz\", \"Mehrere Zwecke ohne klare Zuordnung zu Rechtsgrundlagen\", \"Unverständliche juristische Formulierungen ohne Erklärung\"]"
|
||||
},
|
||||
"DATA-424-A09": {
|
||||
"title": "Datenübertragbarkeit bei Einwilligung oder Vertrag ermöglichen",
|
||||
"check_question": "Dokumentiert die Datenschutzinformation die Bereitstellung von Daten in maschinenlesbarem Format für Fälle mit Einwilligung oder Vertrag als Rechtsgrundlage?",
|
||||
"pass_criteria": "[\"Datenübertragbarkeit bei Einwilligung oder Vertrag erwähnt\", \"maschinenlesbares Format genannt\"]",
|
||||
"fail_criteria": "[\"Maschinenlesbare Formate werden nicht angeboten\", \"Keine Differenzierung nach Rechtsgrundlagen\", \"Abruf nur in unstrukturierten Formaten (z.B. PDF) möglich\"]"
|
||||
},
|
||||
"GOV-3300-A06": {
|
||||
"title": "Daten in maschinenlesbaren Formaten bei Datenportierung bereitstellen",
|
||||
"check_question": "Stellt die Datenschutzinformation sicher, dass Betroffene ihre Daten bei Datenportierungsanfragen in maschinenlesbaren Formaten erhalten?",
|
||||
"pass_criteria": "[\"Recht auf Datenübertragbarkeit erwähnt\", \"strukturiertes oder maschinenlesbares Format genannt\"]",
|
||||
"fail_criteria": "[\"Nur Bereitstellung in nicht-maschinenlesbaren Formaten (PDF, Papier)\", \"Vage Aussagen zu 'gängigen Formaten' ohne konkrete Nennung\", \"Einschränkung auf proprietäre oder nicht-standardisierte Formate\"]"
|
||||
},
|
||||
"AI-1560-A01": {
|
||||
"title": "Zwecke der Datenverwendung dokumentieren",
|
||||
"check_question": "Sind die Zwecke der Datenverwendung transparent und DSGVO-konform dokumentiert?",
|
||||
"pass_criteria": "[\"Schriftliche Dokumentation aller Verarbeitungszwecke\", \"Verständliche Darstellung für Betroffene (keine Fachjargon ohne Erklärung)\", \"Einhaltung des Zweckbindungsprinzips (Zwecke sind spezifisch und nicht beliebig erweiterbar)\", \"Dokumentation der Zwecke in der Datenschutzerklärung oder Datenschutzinformation\", \"Angabe von Speicherdauer in Bezug auf Verarbeitungszwecke\"]",
|
||||
"fail_criteria": "[\"Unklare oder mehrdeutige Zweckbeschreibungen\", \"Fehlende Dokumentation in Datenschutzerklärung\", \"Zu breite Zweckdefinitionen, die Zweckentfremdung ermöglichen\"]"
|
||||
},
|
||||
"SEC-3444-A04": {
|
||||
"title": "Sekundärverarbeitungen auf Notwendigkeit beschränken",
|
||||
"check_question": "Beschränkt die Datenschutzinformation Sekundärverarbeitungen von Adressendaten auf die ursprünglichen Zwecke und notwendige Folgemaßnahmen?",
|
||||
"pass_criteria": "[\"ursprünglicher Verarbeitungszweck benannt\", \"Zweckbindung der Daten angegeben\"]",
|
||||
"fail_criteria": "[\"Uneingeschränkte Erlaubnis zur Datennutzung für beliebige Zwecke\", \"Keine Differenzierung zwischen ursprünglichem und neuem Zweck\", \"Fehlende Nennung konkreter Folgemaßnahmen\"]"
|
||||
},
|
||||
"DATA-1624-A06": {
|
||||
"title": "Übermittlung von Drittland-Schutzgarantie-Informationen verifizieren",
|
||||
"check_question": "Informiert die Datenschutzinformation betroffene Personen über die angewendeten Schutzmechanismen bei Datenübermittlungen in Drittländer (Adequacy Decisions, SCCs, BCRs)?",
|
||||
"pass_criteria": "[\"Explizite Nennung der angewendeten Schutzmechanismen (z.B. 'Adequacy Decision der EU-Kommission', 'Standarddatenschutzklauseln', 'Binding Corporate Rules')\", \"Angabe der betroffenen Drittländer oder Regionen\", \"Beschreibung der Garantien und Schutzmaßnahmen für die Datenübermittlung\", \"Verweis auf Dokumentation oder Rechtsgrundlagen (z.B. Verträge, Entscheidungen)\", \"Information über Rechte der betroffenen Person bei Drittlandtransfers\"]",
|
||||
"fail_criteria": "[\"Nur pauschale Aussage 'Daten werden geschützt übermittelt' ohne Nennung konkreter Mechanismen\", \"Aufzählung von Drittländern ohne Angabe der Schutzmechanismen\", \"Fehlende Differenzierung zwischen verschiedenen Übermittlungsszenarien\"]"
|
||||
},
|
||||
"DATA-2812-A05": {
|
||||
"title": "Löschungsrecht für Cookies und Speicherdaten implementieren",
|
||||
"check_question": "Wird in der Datenschutzinformation das Recht auf Löschung von in Cookies und Speichermechanismen abgelegten personenbezogenen Daten beschrieben?",
|
||||
"pass_criteria": "[\"Recht auf Löschung von Cookie- oder Speicherdaten beschrieben\", \"Verwaltung oder Löschung von Cookies angesprochen\"]",
|
||||
"fail_criteria": "[\"Cookies werden nicht erwähnt oder als unvermeidbar dargestellt\", \"Keine Anleitung zur Löschung oder Verwaltung von Cookies\", \"Keine Möglichkeit zur Ablehnung oder zum Widerruf von Cookies beschrieben\"]"
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,283 @@
|
||||
[
|
||||
{
|
||||
"control_id": "DATA-2260-A01",
|
||||
"klasse": "B",
|
||||
"legal_note": "Art. 13(1)(c) DSGVO verlangt 'die Zwecke' (Plural) — keinen einzelnen 'primären' Zweck und keine Priorisierung. Mehrere genannte Zwecke erfüllen die Pflicht.",
|
||||
"changed_fields": [
|
||||
"title",
|
||||
"check_question",
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"title": "Primären Verarbeitungszweck schriftlich und verständlich dokumentieren",
|
||||
"check_question": "Ist der primäre Verarbeitungszweck schriftlich und verständlich dokumentiert?",
|
||||
"pass_criteria": "[\"primärer Verarbeitungszweck verständlich beschrieben\", \"Zweck der Datenerhebung nachvollziehbar genannt\"]",
|
||||
"fail_criteria": "[\"Primärzweck nicht schriftlich dokumentiert\", \"Unverständliche oder zu technische Formulierung\", \"Zu allgemeine Beschreibung ohne konkrete Bezüge\"]"
|
||||
},
|
||||
"new": {
|
||||
"title": "Verarbeitungszweck(e) schriftlich und verständlich dokumentieren",
|
||||
"check_question": "Sind die Verarbeitungszwecke schriftlich und verständlich genannt?",
|
||||
"pass_criteria": [
|
||||
"Verarbeitungszweck(e) verständlich beschrieben",
|
||||
"Zweck der Datenerhebung nachvollziehbar genannt"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"Verarbeitungszwecke nicht genannt",
|
||||
"Unverständliche oder zu technische Formulierung",
|
||||
"Nur pauschale Floskel ohne konkreten Zweckbezug"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "AUTH-3737-A06",
|
||||
"klasse": "B",
|
||||
"legal_note": "Art. 13(1)(c)+(e) verlangt Zwecke + Empfänger(kategorien) — keine vollständige Zweck/Rechtsgrundlage-Matrix je einzelner Übermittlung.",
|
||||
"changed_fields": [
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"pass_criteria": "[\"Explizite Zweckangabe für jede Datenübermittlung (z.B. 'Vertragserfüllung', 'Rechtliche Verpflichtung')\", \"Rechtsgrundlage für die jeweilige Übermittlung (Art. 6 DSGVO oder spezifische Norm)\", \"Empfänger und Empfängerkategorie mit Zweckbindung\", \"Dokumentation der Zwecke in verständlicher Form für Betroffene\", \"Unterscheidung zwischen verschiedenen Übermittlungszwecken\"]",
|
||||
"fail_criteria": "[\"Generische Zweckangaben wie 'geschäftliche Zwecke' ohne Konkretisierung\", \"Fehlende Rechtsgrundlage für die Übermittlung\", \"Keine Dokumentation der Zwecke oder nur mündliche Absprachen\"]"
|
||||
},
|
||||
"new": {
|
||||
"pass_criteria": [
|
||||
"Zwecke der Datenübermittlungen genannt",
|
||||
"Empfänger oder Empfängerkategorien angegeben",
|
||||
"verständliche Darstellung für Betroffene"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"keine Angabe von Übermittlungszwecken",
|
||||
"weder Empfänger noch Empfängerkategorien genannt",
|
||||
"nur pauschale Floskel ('geschäftliche Zwecke') ohne jeden Bezug"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "DATA-2992-A03",
|
||||
"klasse": "B",
|
||||
"legal_note": "Art. 13(1)(e) verlangt Empfänger/Kategorien; die DSE muss keine AV-/Verantwortlicher-Distinktion, keine Vertraulichkeitszusage und keine Rechtsgrundlage je Empfänger ausweisen.",
|
||||
"changed_fields": [
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"pass_criteria": "[\"Für jeden Drittpartner-Transfer: Expliziter Zweck dokumentiert (z.B. 'Zahlungsabwicklung', 'Kundenservice')\", \"Rechtsgrundlage für die Weiterübertragung genannt (z.B. 'Vertragserfüllung mit Kunde', 'Einwilligung des Betroffenen')\", \"Unterscheidung zwischen Auftragsverarbeiter und eigenverantwortlichem Verantwortlicher\", \"Informationen zu Weitergabebeschränkungen oder Vertraulichkeitsverpflichtungen\"]",
|
||||
"fail_criteria": "[\"Drittparteien genannt, aber Zweck oder Rechtsgrundlage fehlt\", \"Pauschalaussage wie 'Daten werden an Partner weitergegeben' ohne Spezifizierung\", \"Keine Unterscheidung zwischen verschiedenen Weiterübertragungsszenarien\"]"
|
||||
},
|
||||
"new": {
|
||||
"pass_criteria": [
|
||||
"Weitergabe an Dritte offengelegt (Empfänger oder Kategorien)",
|
||||
"Zweck der Weitergabe genannt"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"Weitergabe verschwiegen",
|
||||
"weder Empfänger/Kategorie noch Zweck genannt",
|
||||
"nur pauschal 'Daten werden an Partner weitergegeben' ohne jeden Bezug"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "DATA-1624-A03",
|
||||
"klasse": "B",
|
||||
"legal_note": "Art. 13(1)(f) verlangt Verweis auf geeignete Garantien + wie eine Kopie erhältlich/wo verfügbar ist. Ein Link genügt; eine verständliche Beschreibung der Schutzwirkung ist nicht gefordert.",
|
||||
"changed_fields": [
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"pass_criteria": "[\"Aufzählung der angewendeten Transfermechanismen (z.B. 'Standardvertragsklauseln', 'Binding Corporate Rules', 'Zertifizierungen')\", \"Konkrete Beschreibung jedes Mechanismus und dessen Schutzwirkung in verständlicher Sprache\", \"Angabe, wie Betroffene die Garantiedokumente einsehen können (mit Kontaktdaten oder Link)\", \"Hinweis auf Rechte der Betroffenen (z.B. Recht auf Beschwerde, Recht auf Auskunft über Schutzmaßnahmen)\"]",
|
||||
"fail_criteria": "[\"Nur Nennung von Transfermechanismen ohne Erklärung oder Zugriff auf Dokumente\", \"Unvollständige Aufzählung (z.B. nur SCCs erwähnt, aber auch BCR verwendet)\", \"Garantien werden erwähnt, sind aber nicht tatsächlich implementiert oder dokumentiert\"]"
|
||||
},
|
||||
"new": {
|
||||
"pass_criteria": [
|
||||
"geeignete Garantie genannt (z.B. SCC/BCR/Zertifizierung)",
|
||||
"Zugang zu den Garantien angegeben (Link ODER Kontakt)"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"Drittlandtransfer ohne Nennung einer Garantie",
|
||||
"Garantie genannt, aber keinerlei Möglichkeit sie einzusehen (kein Link und kein Kontakt)"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "DATA-1619-A03",
|
||||
"klasse": "B",
|
||||
"legal_note": "Art. 13(1)(c) verlangt die Rechtsgrundlage; die Nennung des konkreten Artikels (Art. 6 Abs. 1 lit. ...) ist gute Praxis, aber nicht zwingend; ebenso wenig die Trennung Pflicht/freiwillig.",
|
||||
"changed_fields": [
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"pass_criteria": "[\"Konkrete Verarbeitungszwecke benannt (z.B. 'Vertragserfüllung', 'Rechnungsstellung', 'Kundenservice')\", \"Spezifische Rechtsgrundlage mit Artikel genannt (z.B. 'Art. 6 Abs. 1 Buchstabe b DSGVO')\", \"Unterscheidung zwischen verschiedenen Verarbeitungszwecken mit jeweiliger Rechtsgrundlage\", \"Verständliche Sprache ohne juristische Fachbegriffe oder mit Erklärung\", \"Trennung von Pflichtangaben und freiwilligen Verarbeitungen\"]",
|
||||
"fail_criteria": "[\"Zweck nur allgemein formuliert ('geschäftliche Zwecke', 'interne Nutzung')\", \"Rechtsgrundlage fehlt oder nur 'DSGVO' ohne Artikel und Absatz\", \"Mehrere Zwecke ohne klare Zuordnung zu Rechtsgrundlagen\", \"Unverständliche juristische Formulierungen ohne Erklärung\"]"
|
||||
},
|
||||
"new": {
|
||||
"pass_criteria": [
|
||||
"konkrete Verarbeitungszwecke benannt",
|
||||
"Rechtsgrundlage je Zweck genannt (Artikel-Zitat nicht zwingend)"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"Zweck nur pauschal ('geschäftliche Zwecke')",
|
||||
"keine Rechtsgrundlage genannt"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "DATA-424-A09",
|
||||
"klasse": "B",
|
||||
"legal_note": "Art. 20 DSGVO — die DSE informiert über das Recht; das konkrete Exportformat (CSV/JSON/XML) ist Umsetzungsfrage und muss in der DSE nicht benannt werden.",
|
||||
"changed_fields": [
|
||||
"title",
|
||||
"check_question",
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"title": "Datenübertragbarkeit bei Einwilligung oder Vertrag ermöglichen",
|
||||
"check_question": "Dokumentiert die Datenschutzinformation die Bereitstellung von Daten in maschinenlesbarem Format für Fälle mit Einwilligung oder Vertrag als Rechtsgrundlage?",
|
||||
"pass_criteria": "[\"Datenübertragbarkeit bei Einwilligung oder Vertrag erwähnt\", \"maschinenlesbares Format genannt\"]",
|
||||
"fail_criteria": "[\"Maschinenlesbare Formate werden nicht angeboten\", \"Keine Differenzierung nach Rechtsgrundlagen\", \"Abruf nur in unstrukturierten Formaten (z.B. PDF) möglich\"]"
|
||||
},
|
||||
"new": {
|
||||
"title": "Recht auf Datenübertragbarkeit (Einwilligung/Vertrag) offenlegen",
|
||||
"check_question": "Informiert die Datenschutzinformation über das Recht auf Datenübertragbarkeit (bei Einwilligung oder Vertrag)?",
|
||||
"pass_criteria": [
|
||||
"Recht auf Datenübertragbarkeit erwähnt (bei Einwilligung oder Vertrag)"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"Recht auf Datenübertragbarkeit nicht erwähnt"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "GOV-3300-A06",
|
||||
"klasse": "B",
|
||||
"legal_note": "Wie DATA-424-A09 (Art. 20): Format-Nennung in der DSE nicht gefordert. Dedupe-Kandidat zu DATA-424-A09.",
|
||||
"changed_fields": [
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"pass_criteria": "[\"Recht auf Datenübertragbarkeit erwähnt\", \"strukturiertes oder maschinenlesbares Format genannt\"]",
|
||||
"fail_criteria": "[\"Nur Bereitstellung in nicht-maschinenlesbaren Formaten (PDF, Papier)\", \"Vage Aussagen zu 'gängigen Formaten' ohne konkrete Nennung\", \"Einschränkung auf proprietäre oder nicht-standardisierte Formate\"]"
|
||||
},
|
||||
"new": {
|
||||
"pass_criteria": [
|
||||
"Recht auf Datenübertragbarkeit erwähnt"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"Recht auf Datenübertragbarkeit nicht erwähnt"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "AI-1560-A01",
|
||||
"klasse": "C",
|
||||
"legal_note": "Zweck-Offenlegung (Art. 13(1)(c)) und Speicherdauer (Art. 13(2)(a)) sind verschiedene Pflichten; die Speicherdauer-Forderung gehört nicht in den Zweck-Control und wird entfernt.",
|
||||
"changed_fields": [
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"pass_criteria": "[\"Schriftliche Dokumentation aller Verarbeitungszwecke\", \"Verständliche Darstellung für Betroffene (keine Fachjargon ohne Erklärung)\", \"Einhaltung des Zweckbindungsprinzips (Zwecke sind spezifisch und nicht beliebig erweiterbar)\", \"Dokumentation der Zwecke in der Datenschutzerklärung oder Datenschutzinformation\", \"Angabe von Speicherdauer in Bezug auf Verarbeitungszwecke\"]",
|
||||
"fail_criteria": "[\"Unklare oder mehrdeutige Zweckbeschreibungen\", \"Fehlende Dokumentation in Datenschutzerklärung\", \"Zu breite Zweckdefinitionen, die Zweckentfremdung ermöglichen\"]"
|
||||
},
|
||||
"new": {
|
||||
"pass_criteria": [
|
||||
"Verarbeitungszwecke in der Datenschutzerklärung genannt",
|
||||
"verständliche Darstellung für Betroffene"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"Verarbeitungszwecke nicht genannt",
|
||||
"nur unklare/zu breite Zweckfloskeln, die jede Nutzung erlauben"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "SEC-3444-A04",
|
||||
"klasse": "C",
|
||||
"legal_note": "In der DSE wird die Zweckbindung OFFENGELEGT (Art. 13(1)(c)); ob Sekundärverarbeitung tatsächlich 'beschränkt' wird, ist eine Verhaltensfrage und aus dem Text nicht prüfbar. Titel/Frage an den Offenlegungs-Charakter angeglichen.",
|
||||
"changed_fields": [
|
||||
"title",
|
||||
"check_question",
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"title": "Sekundärverarbeitungen auf Notwendigkeit beschränken",
|
||||
"check_question": "Beschränkt die Datenschutzinformation Sekundärverarbeitungen von Adressendaten auf die ursprünglichen Zwecke und notwendige Folgemaßnahmen?",
|
||||
"pass_criteria": "[\"ursprünglicher Verarbeitungszweck benannt\", \"Zweckbindung der Daten angegeben\"]",
|
||||
"fail_criteria": "[\"Uneingeschränkte Erlaubnis zur Datennutzung für beliebige Zwecke\", \"Keine Differenzierung zwischen ursprünglichem und neuem Zweck\", \"Fehlende Nennung konkreter Folgemaßnahmen\"]"
|
||||
},
|
||||
"new": {
|
||||
"title": "Zweckbindung der Datenverarbeitung offenlegen",
|
||||
"check_question": "Legt die Datenschutzinformation den ursprünglichen Zweck und die Zweckbindung der Daten offen?",
|
||||
"pass_criteria": [
|
||||
"ursprünglicher Verarbeitungszweck benannt",
|
||||
"Zweckbindung der Daten angegeben"
|
||||
],
|
||||
"fail_criteria": [
|
||||
"kein Zweck genannt",
|
||||
"uneingeschränkte Nutzung für beliebige Zwecke erlaubt"
|
||||
]
|
||||
}
|
||||
},
|
||||
{
|
||||
"control_id": "DATA-1624-A06",
|
||||
"klasse": "C",
|
||||
"legal_note": "Art. 13(1)(f): Beschreibung der Schutzwirkung nicht gefordert. ⚠️ Near-Duplikat zu DATA-1624-A03 → Dedupe als separater Catalog-Schritt empfohlen (hier NICHT gelöscht).",
|
||||
"changed_fields": [
|
||||
"title",
|
||||
"check_question",
|
||||
"pass_criteria",
|
||||
"fail_criteria"
|
||||
],
|
||||
"old": {
|
||||
"title": "Übermittlung von Drittland-Schutzgarantie-Informationen verifizieren",
|
||||
"check_question": "Informiert die Datenschutzinformation betroffene Personen über die angewendeten Schutzmechanismen bei Datenübermittlungen in Drittländer (Adequacy Decisions, SCCs, BCRs)?",
|
||||
"pass_criteria": "[\"Explizite Nennung der angewendeten Schutzmechanismen (z.B. 'Adequacy Decision der EU-Kommission', 'Standarddatenschutzklauseln', 'Binding Corporate Rules')\", \"Angabe der betroffenen Drittländer oder Regionen\", \"Beschreibung der Garantien und Schutzmaßnahmen für die Datenübermittlung\", \"Verweis auf Dokumentation oder Rechtsgrundlagen (z.B. Verträge, Entscheidungen)\", \"Information über Rechte der betroffenen Person bei Drittlandtransfers\"]",
"fail_criteria": "[\"Nur pauschale Aussage 'Daten werden geschützt übermittelt' ohne Nennung konkreter Mechanismen\", \"Aufzählung von Drittländern ohne Angabe der Schutzmechanismen\", \"Fehlende Differenzierung zwischen verschiedenen Übermittlungsszenarien\"]"
},
"new": {
"title": "Schutzgarantien bei Drittlandübermittlung offenlegen",
"check_question": "Informiert die Datenschutzinformation über die angewendeten Schutzmechanismen bei Drittlandtransfers?",
"pass_criteria": [
"Schutzmechanismus genannt (Adäquanzbeschluss/SCC/BCR)",
"betroffene Drittländer oder Regionen angegeben",
"Zugang/Verweis zur Garantie angegeben (Link oder Kontakt)"
],
"fail_criteria": [
"Drittlandtransfer ohne Nennung eines Schutzmechanismus",
"nur pauschal 'Daten werden geschützt übermittelt'"
]
}
},
{
"control_id": "DATA-2812-A05",
"klasse": "C",
"legal_note": "DSE-Offenlegung des Lösch-/Verwaltungswegs (Art. 17 / §25 TTDSG-Kontext); ein Verweis auf Cookie-Einstellungen/Banner oder Browser genügt — eine Schritt-für-Schritt-Anleitung ist nicht gefordert. FRAGE war bereits disclosure-framed → bleibt DSE.",
"changed_fields": [
"title",
"pass_criteria",
"fail_criteria"
],
"old": {
"title": "Löschungsrecht für Cookies und Speicherdaten implementieren",
"pass_criteria": "[\"Recht auf Löschung von Cookie- oder Speicherdaten beschrieben\", \"Verwaltung oder Löschung von Cookies angesprochen\"]",
"fail_criteria": "[\"Cookies werden nicht erwähnt oder als unvermeidbar dargestellt\", \"Keine Anleitung zur Löschung oder Verwaltung von Cookies\", \"Keine Möglichkeit zur Ablehnung oder zum Widerruf von Cookies beschrieben\"]"
},
"new": {
"title": "Recht auf Löschung von Cookie-/Speicherdaten offenlegen",
"pass_criteria": [
"Recht auf Löschung/Verwaltung von Cookie- bzw. Speicherdaten beschrieben",
"Hinweis auf Verwaltungs-/Löschweg (Cookie-Einstellungen, Banner oder Browser) — Verweis/Link genügt"
],
"fail_criteria": [
"Cookies/Speicherdaten ohne jeden Hinweis auf Löschung/Verwaltung",
"Löschung/Verwaltung ausdrücklich verweigert"
]
}
}
]
@@ -18,6 +18,7 @@ Run with --dry-run to preview deletions without executing.

 import argparse
 import json
+import os
 import sys
 import time
 import requests
@@ -33,7 +34,7 @@ TARGETS = {
     },
     "production": {
         "url": "https://qdrant-dev.breakpilot.ai",
-        "api_key": "z9cKbT74vl1aKPD1QGIlKWfET47VH93u",
+        "api_key": os.environ.get("QDRANT_API_KEY"),
     },
 }

@@ -2,16 +2,23 @@
 # Emit per-service + aggregate change flags for the CI / build workflows.
 #
 # Reads:
-# BASE_SHA — diff base. Empty / unreachable → emit everything as true.
+# BASE_SHA — diff base. Empty / unreachable / diff-failure → emit everything as true.
 # HEAD_SHA — diff target. Defaults to HEAD.
 #
 # Writes key=value lines to $GITHUB_OUTPUT (defaults to /dev/stdout for local runs).
 #
+# ROBUSTNESS CONTRACT: this script must ALWAYS emit a full set of outputs. A
+# missing/empty output makes the job's `outputs:` mapping evaluate to a Go format
+# error (`%!t(string=)`) which fails detect-changes AND every job that `needs` it
+# (cascade). So we do NOT use `set -e` (an aborting git/grep must not kill the
+# script before it emits), we treat any base/diff failure as "rebuild all", and an
+# EXIT trap emits rebuild-all + forces exit 0 if we ever exit early/unexpectedly.
+#
 # Keys emitted:
 # admin, backend, sdk, portal, tts, crawler, dsms_gateway, dsms_node
 # any_python, any_node, any

-set -euo pipefail
+set -uo pipefail

 BASE_SHA="${BASE_SHA:-}"
 HEAD_SHA="${HEAD_SHA:-HEAD}"
@@ -31,17 +38,27 @@ emit_all_true() {
   done
 }

+# Safety net: never let the job end with undefined outputs. If we exit before
+# DONE=1 (any error / early termination), emit rebuild-all and exit 0 so the
+# step still succeeds — rebuild-all is the safe over-approximation.
+DONE=0
+trap '[ "$DONE" = 1 ] || emit_all_true "safety-net (unexpected exit)"; exit 0' EXIT
+
 if [ -z "$BASE_SHA" ]; then
   emit_all_true "no BASE_SHA provided"
-  exit 0
+  DONE=1; exit 0
 fi

 if ! git rev-parse --verify "${BASE_SHA}^{commit}" >/dev/null 2>&1; then
   emit_all_true "BASE_SHA ${BASE_SHA} unreachable"
-  exit 0
+  DONE=1; exit 0
 fi

+if ! changed=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA" 2>/dev/null); then
+  emit_all_true "git diff against ${BASE_SHA} failed"
+  DONE=1; exit 0
+fi
+
-changed=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA" || true)
 echo "Changed files since ${BASE_SHA}:"
 echo "${changed:-(none)}"
 echo "---"
@@ -91,3 +108,5 @@ else
   emit any false
   echo " any: false"
 fi
+
+DONE=1
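The robustness contract described in the script header (no `set -e`, an `EXIT` trap that emits rebuild-all and forces exit status 0) can be tried in isolation. The following is a minimal sketch, not the script from this diff: the bodies of `emit` and `emit_all_true` are not visible in the hunks above and are assumed here, `KEYS` is a shortened hypothetical key list, and the `detect` wrapper is a hypothetical helper that scopes the trap to a subshell so it can be called repeatedly.

```shell
#!/usr/bin/env bash
# Sketch of the "always emit a full set of outputs" pattern.
# Deliberately no `set -e`: a failing command must not abort before outputs exist.
set -uo pipefail

# Hypothetical, shortened key list; the real script emits more keys.
KEYS="admin backend any"

# Assumed implementation: append key=value to $GITHUB_OUTPUT (stdout for local runs).
emit() {
  printf '%s=%s\n' "$1" "$2" >> "${GITHUB_OUTPUT:-/dev/stdout}"
}

# Assumed implementation: log the reason, then mark every key as changed.
emit_all_true() {
  echo "rebuild-all: $1" >&2
  local key
  for key in $KEYS; do
    emit "$key" true
  done
}

# Hypothetical wrapper: runs "$@" as the detection step inside a subshell.
# The EXIT trap fires however the subshell ends. If DONE was never set to 1,
# it emits rebuild-all; in every case it forces exit status 0.
detect() (
  DONE=0
  trap '[ "$DONE" = 1 ] || emit_all_true "safety-net (unexpected exit)"; exit 0' EXIT
  "$@" || exit 1   # early exit on failure; the trap turns this into rebuild-all
  DONE=1
)
```

Calling `detect` with a step that returns non-zero writes `true` for every key and still returns 0, while a step that succeeds and emits its own values is left untouched. Over-reporting changes costs extra build time; a missing output would fail every dependent job, which is why the trap prefers the former.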
Some files were not shown because too many files have changed in this diff.