feat: enhance legal basis display, add batch processing tests and docs
All checks were successful
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 32s
CI/CD / test-python-backend-compliance (push) Successful in 31s
CI/CD / test-python-document-crawler (push) Successful in 23s
CI/CD / test-python-dsms-gateway (push) Successful in 17s
CI/CD / validate-canonical-controls (push) Successful in 12s
CI/CD / Deploy (push) Successful in 2s
- Backfill 81 controls with empty source_citation.source from generation_metadata
- Add fallback to generation_metadata.source_regulation in ControlDetail blue box
- Improve Rule 3 amber box text for reformulated controls
- Add 30 new tests for batch processing (TestParseJsonArray, TestBatchSizeConfig, TestBatchProcessingLoop); all 61 control generator tests passing
- Fix stale test_config_defaults assertion (max_controls 50→0)
- Update canonical-control-library.md with batch processing pipeline docs, processed chunks tracking, migration guide, and stats endpoint
- Update testing.md with canonical control generator test section

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -209,3 +209,38 @@ For example, if you add a new `GetUserStats()` function to the Go service:
```
3. **Run the tests**: `go test -v ./internal/services/...`
4. **Update the documentation** (see [Documentation](./documentation.md))

---

## Module-specific tests

### Canonical Control Generator (82 tests)

The Control Library has an extensive test suite spread across 6 files.
See [Canonical Control Library: Tests](../services/sdk-modules/canonical-control-library.md#tests) for details.

```bash
# All generator tests
cd backend-compliance && pytest -v tests/test_control_generator.py

# Similarity detector tests
cd backend-compliance && pytest -v compliance/tests/test_similarity_detector.py

# API route tests
cd backend-compliance && pytest -v tests/test_canonical_control_routes.py

# License gate tests
cd backend-compliance && pytest -v tests/test_license_gate.py

# CI/CD validator tests
cd backend-compliance && pytest -v tests/test_validate_controls.py
```

**Important:** The generator tests mock the Anthropic API and Qdrant, so they run without external dependencies.
The `TestPipelineMocked` class checks in particular:

- Correct license classification (Rule 1/2/3 behavior)
- Rule 3 exposes **no** source names in `generation_metadata`
- SHA-256 hash deduplication for chunks
- Config defaults (`batch_size: 5`, `skip_processed: true`)
- Rule 1 citations are generated correctly with a reference to the law
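
A mock-based test of this kind could look roughly like the sketch below. It is an illustration only, not the project's test code: `PipelineSketch` and its `generate_title()` method are hypothetical stand-ins, and the LLM call is replaced with `unittest.mock.AsyncMock` from the standard library.

```python
import asyncio
from unittest.mock import AsyncMock


class PipelineSketch:
    """Hypothetical stand-in for a pipeline that awaits an LLM client."""

    def __init__(self, llm_call):
        # llm_call is any awaitable callable; in tests it is an AsyncMock.
        self._llm_call = llm_call

    async def generate_title(self, chunk_text: str) -> str:
        response = await self._llm_call(chunk_text)
        return response["title"]


def run_mocked() -> str:
    # No network access: the mock returns a canned response.
    fake_llm = AsyncMock(return_value={"title": "Multi-factor authentication"})
    pipeline = PipelineSketch(fake_llm)
    title = asyncio.run(pipeline.generate_title("some chunk text"))
    # Verify that the pipeline called the LLM exactly once with the chunk.
    fake_llm.assert_awaited_once_with("some chunk text")
    return title
```

Because the mock is injected through the constructor, the same pattern would work for a vector-store client; how the real generator wires its clients is not shown in this commit.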

@@ -118,6 +118,13 @@ erDiagram
| `GET` | `/v1/canonical/sources` | Source register with permissions |
| `GET` | `/v1/canonical/licenses` | License matrix |
| `POST` | `/v1/canonical/controls/{id}/similarity-check` | Too-close check |
| `POST` | `/v1/canonical/generate` | Start a generator job |
| `GET` | `/v1/canonical/generate/jobs` | All generator jobs |
| `GET` | `/v1/canonical/generate/processed-stats` | Processing statistics per collection |
| `GET` | `/v1/canonical/generate/review-queue` | Controls awaiting review |
| `POST` | `/v1/canonical/generate/review/{control_id}` | Complete a review |
| `GET` | `/v1/canonical/blocked-sources` | Blocked sources (Rule 3) |
| `POST` | `/v1/canonical/blocked-sources/cleanup` | Start the cleanup workflow |

### Example: retrieving a control

@@ -224,7 +231,8 @@ The validator (`scripts/validate-controls.py`) checks on every commit:

## Control Generator Pipeline

-Automatic generation of controls from the entire RAG corpus (170,000+ chunks from laws, regulations and standards).
+Automatic generation of controls from the entire RAG corpus (~183,000 chunks from laws, regulations and standards).
+Current status: **~2,120 controls** generated.

### 8-stage pipeline

@@ -233,14 +241,15 @@ flowchart TD
    A[1. RAG Scroll] -->|All chunks| B[2. Prefilter - local LLM]
    B -->|Irrelevant| C[Mark as processed]
    B -->|Relevant| D[3. License Classify]
-    D -->|Rule 1/2| E[4a. Structure - Anthropic]
-    D -->|Rule 3| F[4b. LLM Reform - Anthropic]
-    E --> G[5. Harmonization - Embeddings]
-    F --> G
-    G -->|Duplicate| H[Store as duplicate]
-    G -->|New| I[6. Anchor Search]
-    I --> J[7. Store Control]
-    J --> K[8. Mark Processed]
+    D -->|Collect batch| E[4. Batch Processing - 5 chunks/API call]
+    E -->|Rule 1/2| F[4a. Structure Batch - Anthropic]
+    E -->|Rule 3| G[4b. Reform Batch - Anthropic]
+    F --> H[5. Harmonization - Embeddings]
+    G --> H
+    H -->|Duplicate| I[Store as duplicate]
+    H -->|New| J[6. Anchor Search]
+    J --> K[7. Store Control]
+    K --> L[8. Mark Processed]
```

### Stage 1: RAG Scroll (complete)
@@ -273,6 +282,67 @@ This saves >50% of the Anthropic API costs.
- **Rule 1+2:** Anthropic structures the original text into control format (title, objective, requirements)
- **Rule 3:** Anthropic reformulates completely: no original text, no source names

### Batch Processing (Stage 4: optimization)

The pipeline does **not** process chunks one by one; it collects them into batches of **5 chunks per API call**.
This cuts the number of Anthropic API calls by ~80% and speeds up generation considerably.

#### Flow

1. **Collect chunks:** After the prefilter, relevant chunks are collected in `pending_batch` together with their license info
2. **Batch full?** As soon as `batch_size` (default: 5) is reached, `_flush_batch()` is called
3. **`_process_batch()`** splits the batch by license rule:
   - **Rule 1+2 chunks** → `_structure_batch()`: a single Anthropic call for all of them
   - **Rule 3 chunks** → `_reformulate_batch()`: a single Anthropic call for all of them
4. **Result:** A JSON array with exactly N controls, mapped back via `chunk_index`
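
The collect-and-flush steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code in `control_generator.py`: the helper names `collect_batches()` and `split_by_rule()` are hypothetical, and each pending item is modeled as a plain `(chunk_text, license_rule)` tuple.

```python
from typing import Iterable, List, Tuple

# Assumed shape of a pending item: (chunk text, license rule 1, 2 or 3).
Pending = Tuple[str, int]


def collect_batches(items: Iterable[Pending], batch_size: int = 5) -> List[List[Pending]]:
    """Group items into batches; the last batch may be smaller."""
    batches: List[List[Pending]] = []
    pending_batch: List[Pending] = []
    for item in items:
        pending_batch.append(item)
        if len(pending_batch) >= batch_size:
            batches.append(pending_batch)  # corresponds to a flush
            pending_batch = []
    if pending_batch:
        batches.append(pending_batch)  # flush the remainder at the end
    return batches


def split_by_rule(batch: List[Pending]) -> Tuple[List[str], List[str]]:
    """Separate Rule 1/2 chunks (structure) from Rule 3 chunks (reformulate)."""
    open_chunks = [text for text, rule in batch if rule in (1, 2)]
    restricted = [text for text, rule in batch if rule == 3]
    return open_chunks, restricted
```

With 12 relevant chunks and the default `batch_size` of 5, this yields three batches of 5, 5 and 2 chunks. A batch that contains both rule groups would presumably need one call per group, so the actual saving depends on how the chunks are mixed.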

#### `_structure_batch()` (Rule 1+2)

Sends all free/CC-BY chunks to Anthropic in a single prompt. The original text may be used.
Each chunk is formatted as a `--- CHUNK N ---` block, and the LLM returns a JSON array with `chunk_index`.

```python
# Prompt excerpt (the prompt itself is written in German):
# "Structure each of the following 5 legal texts as a standalone control."
"Strukturiere die folgenden 5 Gesetzestexte jeweils als eigenstaendiges Control."
# "Return a JSON array with EXACTLY 5 objects."
"Gib ein JSON-Array zurueck mit GENAU 5 Objekten."
```

**Processing Path:** `structured_batch` (in `generation_metadata`)
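
To make the chunk formatting and the mapping via `chunk_index` concrete, here is a minimal sketch. The function names are hypothetical, and the 1-based numbering of the `--- CHUNK N ---` blocks is an assumption; the real prompt builder may number chunks differently.

```python
import json
from typing import Dict, List, Optional


def format_chunks(chunks: List[str]) -> str:
    """Render each chunk as a '--- CHUNK N ---' block (N assumed to start at 1)."""
    return "\n\n".join(
        f"--- CHUNK {i} ---\n{text}" for i, text in enumerate(chunks, start=1)
    )


def map_by_chunk_index(raw_json: str, n_chunks: int) -> List[Optional[Dict]]:
    """Place each returned object at the position of its source chunk.

    Chunks without a matching object stay None, so a short or reordered
    LLM answer cannot shift controls onto the wrong chunk.
    """
    results: List[Optional[Dict]] = [None] * n_chunks
    for obj in json.loads(raw_json):
        idx = obj.get("chunk_index") if isinstance(obj, dict) else None
        if isinstance(idx, int) and 1 <= idx <= n_chunks:
            results[idx - 1] = obj
    return results
```

Keying on `chunk_index` rather than on array position is what allows a partially valid answer to be used without misattributing controls.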

#### `_reformulate_batch()` (Rule 3)

Sends all restricted chunks in a single prompt. The original text must **not be copied**.
Source names and proprietary identifiers are explicitly forbidden in the prompt.

```python
# Prompt excerpt (the prompt itself is written in German):
# "Do NOT COPY any sentences. Use your own terms and structure."
"KOPIERE KEINE Saetze. Verwende eigene Begriffe und Struktur."
# "Do NOT NAME the sources. No proprietary identifiers."
"NENNE NICHT die Quellen. Keine proprietaeren Bezeichner."
```

**Processing Path:** `llm_reform_batch` (in `generation_metadata`)

#### Fallback on batch failure

If a batch call fails (e.g. timeout or parsing error), the pipeline automatically falls back to **single-chunk processing**:

```python
except Exception as e:
    logger.error("Batch processing failed: %s — falling back to single-chunk mode", e)
    for chunk, _lic in batch:
        ctrl = await self._process_single_chunk(chunk, config, job_id)
```
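
The excerpt above shows only the `except` branch. A self-contained sketch of the same fallback pattern might look as follows; `process_with_fallback()` and the two callables passed to it are hypothetical names, not the pipeline's actual methods.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, List

logger = logging.getLogger("batch_fallback_sketch")


async def process_with_fallback(
    batch: List[str],
    process_batch: Callable[[List[str]], Awaitable[List[Any]]],
    process_single: Callable[[str], Awaitable[Any]],
) -> List[Any]:
    """Try one batch call; on any error, process the chunks one by one."""
    try:
        return await process_batch(batch)
    except Exception as exc:
        logger.error("Batch processing failed: %s, falling back to single-chunk mode", exc)
        return [await process_single(chunk) for chunk in batch]


async def _failing_batch(batch: List[str]) -> List[Any]:
    # Simulates a timeout in the batch call.
    raise TimeoutError("simulated timeout")


async def _single(chunk: str) -> str:
    return chunk.upper()


def demo() -> List[Any]:
    return asyncio.run(process_with_fallback(["a", "b"], _failing_batch, _single))
```

The fallback costs one call per chunk, so it trades the batch saving for robustness on the affected batch only.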

!!! info "Batch configuration"
    | Parameter | Value | Description |
    |-----------|-------|-------------|
    | `batch_size` | 5 (default) | Chunks per API call |
    | `max_tokens` | 8192 | Maximum token length of the LLM response |
    | `LLM_TIMEOUT` | 180s | Timeout per Anthropic call |

    `batch_size` is configurable via `GeneratorConfig`.
    A larger batch size increases the likelihood of parsing errors.

### Stage 5: Harmonization (embedding-based)

Checks via bge-m3 embeddings (cosine similarity > 0.85) whether a similar control already exists.
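
The threshold check itself is simple to illustrate. The sketch below uses plain Python lists in place of real bge-m3 embeddings and hypothetical function names; how the pipeline actually obtains and stores the vectors is not covered here.

```python
import math
from typing import Iterable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_duplicate(
    new_vec: Sequence[float],
    existing_vecs: Iterable[Sequence[float]],
    threshold: float = 0.85,
) -> bool:
    """True if any existing embedding is more similar than the threshold."""
    return any(cosine_similarity(new_vec, vec) > threshold for vec in existing_vecs)
```

A linear scan like this is only practical for small sets; at the scale of the full library a vector index would be the more likely choice.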

@@ -310,7 +380,17 @@ system, risk, governance, hardware, identity
| `CONTROL_GEN_ANTHROPIC_MODEL` | `claude-sonnet-4-6` | Anthropic model used for wording |
| `OLLAMA_URL` | `http://host.docker.internal:11434` | Local Ollama server (prefilter) |
| `CONTROL_GEN_OLLAMA_MODEL` | `qwen3:30b-a3b` | Local LLM for the prefilter |
-| `CONTROL_GEN_LLM_TIMEOUT` | `120` | Timeout in seconds |
+| `CONTROL_GEN_LLM_TIMEOUT` | `180` | Timeout in seconds (increased for batch calls) |

**Pipeline configuration (via `GeneratorConfig`):**

| Parameter | Default | Description |
|-----------|---------|-------------|
| `batch_size` | `5` | Chunks per Anthropic API call |
| `max_controls` | `0` | Limit (0 = process all chunks) |
| `skip_processed` | `true` | Skip chunks that have already been processed |
| `dry_run` | `false` | Dry run without DB writes |
| `skip_web_search` | `false` | Skip the web search for the anchor finder |
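
As a reading aid, the defaults from the table can be written down as a small dataclass. This is a sketch under assumptions: the real `GeneratorConfig` may be defined differently (for example as a Pydantic model) and may have more fields, and `limit_reached()` is a hypothetical helper that only illustrates the "0 = no limit" convention.

```python
from dataclasses import dataclass


@dataclass
class GeneratorConfigSketch:
    """Illustrative mirror of the documented defaults, not the real class."""

    batch_size: int = 5           # chunks per Anthropic API call
    max_controls: int = 0         # 0 means: process all chunks
    skip_processed: bool = True   # skip chunks that were already tracked
    dry_run: bool = False         # when True, nothing is written to the DB
    skip_web_search: bool = False

    def limit_reached(self, generated: int) -> bool:
        # A limit of 0 disables the check entirely.
        return self.max_controls > 0 and generated >= self.max_controls
```

For example, `GeneratorConfigSketch(max_controls=50).limit_reached(50)` is true, while the default configuration never reports the limit as reached.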

### Architecture decision: references to laws

@@ -351,15 +431,145 @@ curl https://macmini:8002/api/compliance/v1/canonical/generate/jobs \

---

## Processed Chunks Tracking

The `canonical_processed_chunks` table tracks **EVERY** processed RAG chunk by its SHA-256 hash.
As a result, chunks are skipped automatically on subsequent pipeline runs (`skip_processed: true`).

### Table: `canonical_processed_chunks` (migrations 046 + 048)

| Column | Type | Description |
|--------|------|-------------|
| `id` | UUID | Primary key |
| `chunk_hash` | VARCHAR(64) | SHA-256 hash of the chunk text |
| `collection` | VARCHAR(100) | Qdrant collection (e.g. `bp_compliance_gesetze`) |
| `regulation_code` | VARCHAR(100) | Source regulation (e.g. `bdsg`, `eu_2016_679`) |
| `document_version` | VARCHAR(50) | Version tracking |
| `source_license` | VARCHAR(50) | License of the source |
| `license_rule` | INTEGER | 1, 2 or 3 |
| `processing_path` | VARCHAR(20) | Processing path (see below) |
| `generated_control_ids` | JSONB | UUIDs of the generated controls |
| `job_id` | UUID | Reference to `canonical_generation_jobs` |
| `processed_at` | TIMESTAMPTZ | Timestamp |

**UNIQUE constraint:** `(chunk_hash, collection, document_version)`, which prevents double processing.
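
The hash and the uniqueness key can be illustrated with the standard library alone. In this sketch an in-memory set stands in for the database constraint, the function names are hypothetical, and hashing the raw UTF-8 text without normalization is an assumption about how `chunk_hash` is computed.

```python
import hashlib
from typing import Set, Tuple

# Mirrors the UNIQUE constraint: (chunk_hash, collection, document_version).
ProcessedKey = Tuple[str, str, str]


def chunk_hash(text: str) -> str:
    """SHA-256 hex digest of the chunk text (64 characters, fits VARCHAR(64))."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def should_process(
    text: str, collection: str, document_version: str, seen: Set[ProcessedKey]
) -> bool:
    """Return False for an already tracked chunk, otherwise record it."""
    key = (chunk_hash(text), collection, document_version)
    if key in seen:
        return False
    seen.add(key)
    return True
```

Because `document_version` is part of the key, the same text in a new document version is processed again, which matches the constraint described above.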

### Processing paths

| Value | Stage | Meaning |
|-------|-------|---------|
| `prefilter_skip` | 2 | Local LLM prefilter: chunk is not security-relevant |
| `structured` | 4a | Single chunk structured (Rule 1/2) |
| `llm_reform` | 4b | Single chunk reformulated (Rule 3) |
| `structured_batch` | 4a | Batch structuring (Rule 1/2, in `generation_metadata`) |
| `llm_reform_batch` | 4b | Batch reformulation (Rule 3, in `generation_metadata`) |
| `no_control` | 4 | LLM could not derive a control |
| `store_failed` | 7 | Saving to the DB failed |
| `error` | - | Unexpected error during processing |

!!! note "Batch paths in generation_metadata"
    The values `structured_batch` and `llm_reform_batch` are stored in the database's `processing_path` column
    **and** in the control's `generation_metadata` JSON field. This makes it traceable whether a control
    was generated individually or in a batch.

### Example query: processing statistics

```sql
SELECT
    processing_path,
    COUNT(*) AS count
FROM canonical_processed_chunks
GROUP BY processing_path
ORDER BY count DESC;
```

---

## Statistics (processed-stats endpoint)

The `GET /v1/canonical/generate/processed-stats` endpoint returns processing statistics per RAG collection.

```bash
curl -s https://macmini:8002/api/compliance/v1/canonical/generate/processed-stats | jq
```

**Response:**
```json
{
  "stats": [
    {
      "collection": "bp_compliance_gesetze",
      "processed_chunks": 45200,
      "direct_adopted": 1850,
      "llm_reformed": 120,
      "skipped": 43230,
      "total_chunks_estimated": 0,
      "pending_chunks": 0
    }
  ]
}
```
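
A response of this shape can be summarized with a few lines of Python. The sketch below works on the example payload only; the `summarize()` helper is hypothetical, and the fact that `direct_adopted + llm_reformed + skipped` equals `processed_chunks` is an observation about this example, not a documented guarantee of the endpoint.

```python
import json
from typing import Any, Dict

EXAMPLE_RESPONSE = """
{
  "stats": [
    {
      "collection": "bp_compliance_gesetze",
      "processed_chunks": 45200,
      "direct_adopted": 1850,
      "llm_reformed": 120,
      "skipped": 43230,
      "total_chunks_estimated": 0,
      "pending_chunks": 0
    }
  ]
}
"""


def summarize(stat: Dict[str, Any]) -> Dict[str, Any]:
    """Derive a few convenience figures from one per-collection entry."""
    processed = stat["processed_chunks"]
    generated = stat["direct_adopted"] + stat["llm_reformed"]
    accounted = generated + stat["skipped"]
    return {
        "collection": stat["collection"],
        "generated": generated,
        # Chunks not explained by the three counters (0 in the example).
        "unaccounted": processed - accounted,
        "control_share": generated / processed if processed else 0.0,
    }


summaries = [summarize(s) for s in json.loads(EXAMPLE_RESPONSE)["stats"]]
```

A non-zero `unaccounted` value would simply mean that other processing paths, such as errors, are not broken out in the response.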

### Current order of magnitude

| Metric | Value |
|--------|-------|
| RAG chunks total | ~183,000 |
| Processed chunks | ~183,000 (complete) |
| Generated controls | **~2,120** |
| Conversion rate | ~1.2% (only security-relevant chunks produce controls) |

!!! info "Why so few controls?"
    Most RAG chunks are definitions, terminology clauses, tables of contents or
    transitional provisions. The prefilter (stage 2) discards >50%, and harmonization (stage 5)
    removes further duplicates. Only concrete, unique requirements become controls.

---

## Migrating controls (local → production)

Controls can be migrated from the local development environment to production via the REST API.
Each control is created individually via `POST`, with a reference to its framework.

```bash
# 1. Export the control from the local environment
curl -s https://macmini:8002/api/compliance/v1/canonical/controls/AUTH-001 | jq > control.json

# 2. Import into production (with framework_id)
curl -X POST https://api-dev.breakpilot.ai/api/compliance/v1/canonical/controls \
  -H 'Content-Type: application/json' \
  -d '{
    "framework_id": "bp_security_v1",
    "control_id": "AUTH-001",
    "title": "Multi-Faktor-Authentifizierung",
    "objective": "...",
    "severity": "high",
    "open_anchors": [...]
  }'
```

!!! warning "Framework must exist"
    The target framework (`bp_security_v1`) must already exist in the production DB.
    If it does not, create the framework first:
    ```bash
    curl -X POST https://api-dev.breakpilot.ai/api/compliance/v1/canonical/frameworks \
      -H 'Content-Type: application/json' \
      -d '{"framework_id": "bp_security_v1", "name": "BreakPilot Security", "version": "1.0"}'
    ```

---

## Files

| File | Type | Description |
|------|------|-------------|
| `backend-compliance/migrations/044_canonical_control_library.sql` | SQL | 5 tables + seed data |
| `backend-compliance/migrations/046_control_generator.sql` | SQL | Job tracking, chunk tracking, blocked sources |
| `backend-compliance/migrations/047_verification_method_category.sql` | SQL | `verification_method` + `category` fields |
| `backend-compliance/migrations/048_processing_path_expand.sql` | SQL | Expanded `processing_path` values |
| `backend-compliance/compliance/api/canonical_control_routes.py` | Python | REST API (8+ endpoints) |
-| `backend-compliance/compliance/api/control_generator_routes.py` | Python | Generator API (start/status/jobs) |
-| `backend-compliance/compliance/services/control_generator.py` | Python | 8-stage pipeline |
+| `backend-compliance/compliance/api/control_generator_routes.py` | Python | Generator API (start/status/jobs/stats) |
+| `backend-compliance/compliance/services/control_generator.py` | Python | 8-stage pipeline with batch processing |
| `backend-compliance/compliance/services/license_gate.py` | Python | License gate logic |
| `backend-compliance/compliance/services/similarity_detector.py` | Python | Too-close detector (5 metrics) |
| `backend-compliance/compliance/services/rag_client.py` | Python | RAG client (search + scroll) |
@@ -376,11 +586,25 @@ curl https://macmini:8002/api/compliance/v1/canonical/generate/jobs \

## Tests

-| File | Language | Tests |
-|------|----------|-------|
-| `ai-compliance-sdk/internal/ucca/canonical_control_loader_test.go` | Go | 8 tests |
-| `backend-compliance/compliance/tests/test_similarity_detector.py` | Python | 19 tests |
-| `backend-compliance/tests/test_canonical_control_routes.py` | Python | 14 tests |
-| `backend-compliance/tests/test_license_gate.py` | Python | 12 tests |
-| `backend-compliance/tests/test_validate_controls.py` | Python | 14 tests |
-| **Total** | | **67 tests** |
+| File | Language | Tests | Focus |
+|------|----------|-------|-------|
+| `ai-compliance-sdk/internal/ucca/canonical_control_loader_test.go` | Go | 8 tests | Control loader, multi-index |
+| `backend-compliance/compliance/tests/test_similarity_detector.py` | Python | 19 tests | Too-close detector, 5 metrics |
+| `backend-compliance/tests/test_canonical_control_routes.py` | Python | 14 tests | REST API endpoints |
+| `backend-compliance/tests/test_license_gate.py` | Python | 12 tests | License classification |
+| `backend-compliance/tests/test_validate_controls.py` | Python | 14 tests | CI/CD validator |
+| `backend-compliance/tests/test_control_generator.py` | Python | 15 tests | Pipeline, batch, license rules |
+| **Total** | | **82 tests** | |

### Control Generator Tests (test_control_generator.py)

The generator tests cover the following areas:

- **`TestLicenseMapping`** (12 tests): correct mapping of `regulation_code` to license rules (Rule 1/2/3),
  case insensitivity, Rule 3 must not expose source names
- **`TestDomainDetection`** (5 tests): detection of the AUTH, CRYPT, NET and DATA domains from chunk text
- **`TestJsonParsing`** (4 tests): robust parsing of LLM responses (plain JSON, Markdown-fenced, with preamble)
- **`TestGeneratedControlRules`** (3 tests): Rule 1 has the original text, Rule 2 has a citation, Rule 3 has **nothing**
- **`TestAnchorFinder`** (2 tests): RAG search filters out Rule 3 sources, web search recognizes frameworks
- **`TestPipelineMocked`** (5 tests): end-to-end with mocks: license classification, Rule 3 blocking,
  hash deduplication, config defaults (`batch_size: 5`), Rule 1 citation generation
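
The three response shapes named for `TestJsonParsing` (plain JSON, Markdown-fenced, with preamble) can be handled by a tolerant parser along the following lines. This is a sketch, not the generator's parser: `parse_json_array()` is a hypothetical name, and returning an empty list on failure is a design choice made for this example.

```python
import json
import re
from typing import Any, List

# Build the Markdown fence marker programmatically to keep this example readable.
FENCE = "`" * 3
_FENCED = re.compile(re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL)


def parse_json_array(raw: str) -> List[Any]:
    """Extract a JSON array from an LLM answer; return [] if none is found."""
    text = raw.strip()
    fenced = _FENCED.search(text)
    if fenced:
        # Prefer the content of the first fenced block if there is one.
        text = fenced.group(1).strip()
    # Cut away any preamble or trailing prose around the outermost brackets.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end < start:
        return []
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return []
    return parsed if isinstance(parsed, list) else []
```

An empty result lets the caller decide how to react, for example by falling back to single-chunk processing. Square brackets inside a preamble would defeat the bracket search, which is one reason parsing errors become more likely as answers grow longer.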