feat(qa): recital detection, review split, duplicate comparison
Some checks failed
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 42s
CI/CD / test-python-backend-compliance (push) Successful in 34s
CI/CD / test-python-document-crawler (push) Successful in 21s
CI/CD / test-python-dsms-gateway (push) Successful in 20s
CI/CD / validate-canonical-controls (push) Successful in 12s
CI/CD / Deploy (push) Has been skipped

Add _detect_recital() to QA pipeline — flags controls where
source_original_text contains Erwägungsgrund markers instead of
article text (28% of controls with source text affected).

- Recital detection via regex + phrase matching in QA validation
- 10 new tests (TestRecitalDetection), 81 total
- ReviewCompare component for side-by-side duplicate comparison
- Review mode split: Duplikat-Verdacht vs Rule-3-ohne-Anchor tabs
- MkDocs: recital detection documentation
- Detection script for bulk analysis (scripts/find_recital_controls.py)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Benjamin Admin
2026-03-18 08:20:02 +01:00
parent a9e0869205
commit 148c7ba3af
7 changed files with 657 additions and 28 deletions

View File

@@ -214,13 +214,13 @@ Wenn du z.B. eine neue `GetUserStats()` Funktion im Go Service hinzufuegst:
## Modul-spezifische Tests
### Canonical Control Generator (71+ Tests)
### Canonical Control Generator (81+ Tests)
Die Control Library hat eine umfangreiche Test-Suite ueber 6 Dateien.
Siehe [Canonical Control Library — Tests](../services/sdk-modules/canonical-control-library.md#tests) und [Control Generator Pipeline](../services/sdk-modules/control-generator-pipeline.md) fuer Details.
```bash
# Alle Generator-Tests (71 Tests in 10 Klassen)
# Alle Generator-Tests (81 Tests in 12 Klassen)
cd backend-compliance && pytest -v tests/test_control_generator.py
# Similarity Detector Tests
@@ -253,3 +253,4 @@ cd backend-compliance && pytest -v tests/test_validate_controls.py
| `TestBatchProcessingLoop` | 10 | Batch-Verarbeitung (Rule-Split, Mixed-Rules, Too-Close, Null-Handling) |
| `TestRegulationFilter` | 5 | regulation_filter Prefix-Matching, leere regulation_codes |
| `TestPipelineVersion` | 5 | pipeline_version=2 in DB-Writes, null-Handling in Structure/Reform |
| `TestRecitalDetection` | 10 | Erwaegungsgrund-Erkennung in Quelltexten (Regex, Phrasen, Kombiniert) |