686834cea0ec49c669ab73effa2b4fe671a636f7
4 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
686834cea0 |
feat: 4 remaining tasks — EU institutions, banner integration, JS-sites, Caritas fixes
Build + Deploy / build-ai-sdk (push) Failing after 36s
Build + Deploy / build-developer-portal (push) Successful in 8s
Build + Deploy / build-tts (push) Successful in 7s
Build + Deploy / build-document-crawler (push) Successful in 7s
Build + Deploy / build-admin-compliance (push) Successful in 8s
Build + Deploy / build-backend-compliance (push) Successful in 8s
CI / nodejs-build (push) Successful in 3m14s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 46s
CI / test-python-backend (push) Successful in 43s
CI / test-python-document-crawler (push) Successful in 29s
CI / test-python-dsms-gateway (push) Successful in 30s
CI / validate-canonical-controls (push) Successful in 16s
Build + Deploy / build-dsms-gateway (push) Successful in 8s
Build + Deploy / build-dsms-node (push) Successful in 8s
CI / branch-name (push) Has been skipped
Build + Deploy / trigger-orca (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 17s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
1. EU Institution Checks (Verordnung 2018/1725): - New doc_type "eu_institution" with 9 L1 + 15 L2 checks - Both German + English patterns (EU institutions are multilingual) - Auto-detection via "2018/1725", "EDSB", "EDPS" keywords - Correct article references (Art. 15 instead of 13, Art. 5 instead of 6) 2. Banner Check Integration: - banner_runner.py maps scan results to 36 L1/L2 structured checks - BannerCheckTab shows hierarchical ChecklistView with hints - 3-phase summary (cookies/scripts before/after consent) - /scan endpoint now includes structured_checks in response 3. JS-heavy Website Fixes (dm, Zalando, HWK): - dsi_helpers.py: goto_resilient (networkidle→domcontentloaded fallback) - try_dismiss_consent_banner before text extraction - PDF redirect detection (dm.de redirects to GCS PDF) 4. Caritas False Positive Fixes: - Phone regex allows parentheses: +49 (0)761 → now matches - "Recht auf Widerspruch" (3 words) + §23 KDG → matches Art. 21 - Church authorities: "Katholisches Datenschutzzentrum" recognized Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
3efc491ec5 |
fix: 5 false positives from etogruppe.com ground truth
Build + Deploy / build-tts (push) Successful in 1m38s
Build + Deploy / build-document-crawler (push) Successful in 41s
Build + Deploy / build-dsms-gateway (push) Successful in 26s
Build + Deploy / build-dsms-node (push) Successful in 12s
Build + Deploy / build-admin-compliance (push) Successful in 2m22s
Build + Deploy / build-backend-compliance (push) Successful in 3m21s
Build + Deploy / build-ai-sdk (push) Successful in 53s
Build + Deploy / build-developer-portal (push) Successful in 1m16s
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 20s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / branch-name (push) Has been skipped
CI / nodejs-build (push) Successful in 3m18s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 59s
CI / test-python-backend (push) Successful in 47s
CI / test-python-document-crawler (push) Successful in 32s
CI / test-python-dsms-gateway (push) Successful in 27s
CI / validate-canonical-controls (push) Successful in 16s
Build + Deploy / trigger-orca (push) Successful in 3m23s
1. Soft hyphens (/\xad) stripped before regex matching —
fixes "Datenübertragbarkeit" not matching
2. Art. 15/17/20: allow adjectives between "Recht auf" and keyword
("Recht auf unentgeltliche Auskunft" now matches)
3. DSB contact: regex spans up to 300 chars across newlines
(DSB section with company address between heading and email)
4. Löschkonzept: added "Fortfall", "Entfall", "Beendigung" as
deletion trigger words alongside "Ablauf"/"Wegfall"
Reduces etogruppe FPs from 5 to ~1.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|
|
293c58d0dd |
feat: Add actionable hints to all 138 compliance checks
Build + Deploy / build-admin-compliance (push) Successful in 1m40s
Build + Deploy / build-backend-compliance (push) Successful in 7s
Build + Deploy / build-ai-sdk (push) Successful in 35s
Build + Deploy / build-developer-portal (push) Successful in 8s
Build + Deploy / build-tts (push) Successful in 7s
Build + Deploy / build-document-crawler (push) Successful in 8s
Build + Deploy / build-dsms-gateway (push) Successful in 7s
Build + Deploy / build-dsms-node (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / loc-budget (push) Failing after 16s
CI / secret-scan (push) Has been skipped
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m50s
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 40s
CI / test-python-backend (push) Successful in 37s
CI / test-python-document-crawler (push) Successful in 25s
CI / test-python-dsms-gateway (push) Successful in 23s
CI / validate-canonical-controls (push) Successful in 15s
Build + Deploy / trigger-orca (push) Successful in 2m28s
Each check now has a "hint" field explaining what is missing and what the customer should do to fix it. Hints are shown in the frontend below failed checks in red text. Examples: - "Bei Verarbeitung auf Basis von Art. 6(1)(f) muss dokumentiert werden, warum Ihr berechtigtes Interesse die Rechte der Betroffenen ueberwiegt." - "Die ladungsfaehige Anschrift fehlt. Erforderlich: Strasse, Hausnummer, PLZ und Ort." Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
b363c28539 |
feat: Add 76 Level-2 regex checks for document correctness verification
Split dsi_document_checker.py (466 LOC) into doc_checks/ package (9 files). Two-pass L1→L2 logic: L1 checks "Is it mentioned?", L2 checks "Is it correct?" (e.g. controller has full address, specific Art. 6 lit., concrete time periods). 138 total checks (62 L1 + 76 L2) across 7 doc types: - DSE Art. 13: 31, Impressum §5 TMG: 16, Cookie §25 TDDDG: 15 - Widerruf §355: 15, AGB §305ff: 21, Social Media Art. 26: 20, DSFA Art. 35: 18 Frontend: hierarchical L1→L2 display with dual progress bars (green=completeness, blue=correctness). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |