feat(audit): VW cookie bug fix + P101/P102 cookie-library mismatch findings
CI / loc-budget (push) Failing after 19s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
CI / test-python-backend (push) Successful in 42s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / detect-changes (push) Successful in 11s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
VW bug B1: extract_vendors_via_llm had max_text_chars=12000, so for the VW cookie document (60k chars, 100 cookies in a table) 80% of the text was cut off and the LLM extracted only 1 vendor. Fix: max_text_chars=50000, num_predict raised from 6000 to 16000 for more vendor output, Ollama timeout raised from 120s to 420s.

P101: aggregator script (backend-compliance/scripts/cookie_library_enrich.py) iterates over all compliance_check_snapshots and extracts (cookie_name, declared_category, observed_sites). First evaluation across 8 snapshots: 101 unique cookies, 47 in the library, 54 unknown, 18 mismatches.

P102: cookie classification check as a mail block. Compares the site-declared category against the library and the vendor documentation. HIGH if the library says 'marketing' but the site declares the cookie as 'essential'/'statistics' (actual third-country/advertising processing is hidden). MEDIUM otherwise. Wired into the mail composition and the replay pipeline in agent_compliance_check_routes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
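The truncation arithmetic behind bug B1 can be checked with a small sketch. The helper name `truncation_loss` is hypothetical and not part of the codebase; it assumes the extractor simply keeps the first `max_text_chars` characters, and it treats the "60k chars" figure from the commit message as exactly 60,000.

```python
def truncation_loss(doc_chars: int, max_text_chars: int) -> float:
    """Fraction of a document dropped when only the first max_text_chars are kept.

    Hypothetical helper for illustration; assumes plain prefix truncation.
    """
    if doc_chars <= 0:
        return 0.0
    kept = min(doc_chars, max_text_chars)
    return 1 - kept / doc_chars


# Figures taken from the commit message (60k treated as exactly 60,000 chars).
old_loss = truncation_loss(60_000, 12_000)  # limit before the fix
new_loss = truncation_loss(60_000, 50_000)  # limit after the fix

print(f"before fix: {old_loss:.0%} dropped")
print(f"after fix:  {new_loss:.0%} dropped")
```

Under these assumptions the old limit drops 80% of the document, matching the commit message. The new limit would still drop roughly the last sixth of a 60,000-character document, so very long cookie tables may remain partially truncated.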
@@ -1043,11 +1043,45 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
     except Exception as e:
         logger.warning("Scope-disclaimer block skipped: %s", e)
 
+    # P102: cookie classification check (declared vs. library)
+    library_mismatch_html = ""
+    try:
+        from compliance.services.cookie_library_mismatch import (
+            detect_mismatches, build_mismatch_block_html,
+        )
+        from database import SessionLocal
+        cookie_doc_for_check = doc_texts.get("cookie") or doc_texts.get("dse") or ""
+        all_cookies_seen: list[str] = []
+        if banner_result:
+            for ph in (banner_result.get("phases") or {}).values():
+                if isinstance(ph, dict):
+                    for ck in (ph.get("cookies") or []):
+                        if isinstance(ck, str):
+                            all_cookies_seen.append(ck)
+                        elif isinstance(ck, dict) and ck.get("name"):
+                            all_cookies_seen.append(ck["name"])
+        if all_cookies_seen and cookie_doc_for_check:
+            _mm_db = SessionLocal()
+            try:
+                mismatches = detect_mismatches(
+                    _mm_db, all_cookies_seen, cookie_doc_for_check,
+                )
+                if mismatches:
+                    library_mismatch_html = build_mismatch_block_html(mismatches)
+                    logger.info(
+                        "P102: %d Cookie-Mismatches gefunden", len(mismatches)
+                    )
+            finally:
+                _mm_db.close()
+    except Exception as e:
+        logger.warning("P102 mismatch detection failed: %s", e)
+
     full_html = (
         critical_html + scope_disclaimer_html + exec_summary_html
         + cookie_arch_html + summary_html + scanned_html + profile_html
         + scorecard_html + redundancy_html
-        + providers_html + banner_deep_html + vvt_html + report_html
+        + providers_html + banner_deep_html + library_mismatch_html
+        + vvt_html + report_html
     )
 
     # Step 6: Send email — derive site name primarily from entered URL.
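The P102 severity rule from the commit message (HIGH when the library says 'marketing' but the site declares 'essential' or 'statistics', MEDIUM otherwise) can be sketched as a standalone function. This is only an illustration: the name `mismatch_severity`, the case-insensitive comparison, and returning `None` when the categories agree are assumptions, not the actual contents of `compliance.services.cookie_library_mismatch`.

```python
from typing import Optional

# Declared categories that would hide marketing use (per the commit message).
UNDERSTATED_CATEGORIES = {"essential", "statistics"}


def mismatch_severity(declared: str, library: str) -> Optional[str]:
    """Return 'HIGH', 'MEDIUM', or None if the two categories agree.

    Hypothetical sketch of the P102 rule; normalisation is an assumption.
    """
    declared_norm = declared.strip().lower()
    library_norm = library.strip().lower()

    if declared_norm == library_norm:
        return None  # no mismatch, nothing to report
    if library_norm == "marketing" and declared_norm in UNDERSTATED_CATEGORIES:
        return "HIGH"  # marketing cookie declared as something harmless
    return "MEDIUM"  # any other disagreement


print(mismatch_severity("essential", "marketing"))
print(mismatch_severity("marketing", "statistics"))
```

Keeping the rule in a pure function like this would make it straightforward to unit-test separately from the database session and the HTML block builder.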