breakpilot-compliance/zeroclaw/docs/batch-test-results-2026-05-08.md
Benjamin Admin 1b5c6bd340
docs: Batch test results for 9 websites + EUIPO analysis
Tested BMW, Stadt Koeln, BfDI, Sparkasse, Caritas, TUEV Sued,
Spiegel, ETO Gruppe, EUIPO. Key findings:

- Stadt Koeln + ETO Gruppe best (95% correctness)
- BMW, Sparkasse, Spiegel genuinely deficient (verified)
- EUIPO uses EU Regulation 2018/1725, not GDPR — needs separate checklist
- ~0-2 false positives per website after LLM verification

7 regex fixes emerged from batch testing (soft hyphens, word
insertions, numbered headings, German section names, etc.)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-08 00:41:28 +02:00


# Batch Test Results (2026-05-08)
## 9 Websites Tested
| # | Website | Type | L1 | L2 | Completeness | Correctness | Words | Rating |
|---|---------|------|----|----|--------------|-------------|-------|--------|
| 1 | Stadt Koeln | Municipality | 9/9 | 21/22 | 100% | 95% | 5910 | Exemplary |
| 2 | Caritas | Nonprofit | 9/9 | 19/22 | 100% | 86% | 9447 | Good |
| 3 | ETO Gruppe | Mid-sized company | 9/9 | 21/22 | 100% | 95% | 7312 | Exemplary |
| 4 | BfDI | Federal authority | 9/9 | 16/22 | 100% | 73% | 2014 | OK (short) |
| 5 | TUEV Sued | Testing organization | 8/9 | 15/21 | 89% | 71% | 9467 | Gaps |
| 6 | IHK Konstanz | Chamber of commerce | 9/9 | 18/22 | 100% | 82% | 6353 | Good |
| 7 | BMW | Corporation | 8/9 | 10/21 | 89% | 48% | 7207 | Deficient |
| 8 | Sparkasse | Finance | 7/9 | 10/20 | 78% | 50% | 12183 | Deficient |
| 9 | Spiegel | Media | 6/9 | 10/13 | 67% | 77% | 13698 | Deficient |
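
The completeness (Vollst.) and correctness (Korr.) columns appear to be the L1 and L2 pass ratios rounded to whole percentages. This is an inference from the numbers in the table, not something the report states. A minimal sketch that reproduces the arithmetic, where the helper name `percent` is hypothetical:

```python
def percent(fraction: str) -> int:
    """Convert a 'passed/total' string such as '21/22' to a rounded percentage."""
    passed, total = (int(part) for part in fraction.split("/"))
    return round(100 * passed / total)


# (L1, L2) values copied from three rows of the table above.
rows = {
    "Stadt Koeln": ("9/9", "21/22"),
    "TUEV Sued": ("8/9", "15/21"),
    "Spiegel": ("6/9", "10/13"),
}

for site, (l1, l2) in rows.items():
    # Assumption: completeness is derived from L1, correctness from L2.
    print(f"{site}: completeness {percent(l1)}%, correctness {percent(l2)}%")
```

Because the L2 denominator varies between rows (22, 21, 20, 13), the correctness percentages are not directly comparable across websites without also looking at the number of checks that applied.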
### Special Cases
- **EUIPO** (EU agency): 6/9 L1, 5/13 L2. It is subject to Regulation (EU) 2018/1725, not the GDPR, so a separate checklist is needed.
- **dm, Zalando, HWK**: Text extraction fails (JS-heavy SPAs, consent wall blocks access)
## Verified True Positives
BMW, Sparkasse, and Spiegel **genuinely have incomplete privacy policies (DSEs)**, verified against the original texts:
- BMW: No email address, no Art. 77 right to lodge a complaint, no article references for data subject rights
- Sparkasse: No data protection officer (DSB), no Art. 77
- Spiegel: No data protection officer (DSB), no Art. 77, no data subject rights
## False-Positive Rate
Across all 9 websites: **~0-2 FPs per website** after LLM verification.
Main cause of the remaining FPs: unusual phrasings that neither the regexes nor the LLM recognize.
## Regex Fixes That Emerged from the Batch Test
1. Soft-hyphen stripping (\xad): found on etogruppe
2. "Recht auf [adjective] Auskunft": allow an inserted word
3. "nach Fortfall" alongside "nach Ablauf": deletion policy (Loeschkonzept)
4. DSB contact spanning line breaks: [\s\S]{0,300}
5. Numbered headings ("5. Soziale Medien"): isdigit()
6. Section splitter fires only on classified headings
7. "Soziale Medien/Netzwerke" recognized as a social media heading
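
Several of the fixes listed above can be illustrated with a short Python sketch. The report does not include the actual patterns, so every name and regex below is a hypothetical reconstruction based only on the list items; the real implementation may differ.

```python
import re

SOFT_HYPHEN = "\xad"  # U+00AD, invisible in rendered HTML but present in extracted text


def strip_soft_hyphens(text: str) -> str:
    """Fix 1: remove soft hyphens so keywords are not split mid-word."""
    return text.replace(SOFT_HYPHEN, "")


# Fix 2 (assumed pattern): allow one optional word between "Recht auf" and "Auskunft".
RIGHT_OF_ACCESS = re.compile(r"Recht\s+auf\s+(?:\w+\s+)?Auskunft", re.IGNORECASE)

# Fix 3 (assumed pattern): accept "nach Fortfall" as well as "nach Ablauf".
RETENTION_END = re.compile(r"nach\s+(?:Ablauf|Fortfall)", re.IGNORECASE)

# Fix 4 (assumed pattern): [\s\S] also matches newlines, so the email address may
# sit up to 300 characters (including line breaks) after the DSB keyword.
DSB_CONTACT = re.compile(
    r"Datenschutzbeauftragte[\s\S]{0,300}?[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
    re.IGNORECASE,
)

# Fix 7 (assumed pattern): both German variants count as a social media heading.
SOCIAL_MEDIA_HEADING = re.compile(r"Soziale\s+(?:Medien|Netzwerke)", re.IGNORECASE)


def strip_heading_number(heading: str) -> str:
    """Fix 5: drop a leading number such as '5.' using str.isdigit()."""
    first, _, rest = heading.partition(" ")
    if rest and first.rstrip(".").isdigit():
        return rest.strip()
    return heading.strip()


if __name__ == "__main__":
    print(strip_soft_hyphens("Daten\xadschutz\xadbeauftragter"))
    print(bool(RIGHT_OF_ACCESS.search("Sie haben das Recht auf unentgeltliche Auskunft")))
    print(bool(DSB_CONTACT.search("Datenschutzbeauftragter\nMax Mustermann\nE-Mail: dsb@example.com")))
    print(strip_heading_number("5. Soziale Medien"))
```

This sketch handles only single-level numbering: a heading such as "5.1 Cookies" is left unchanged because "5.1" does not pass `isdigit()`. Whether the original code treats multi-level numbers differently is not stated in the report.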