d6b8bf87c2
CI / detect-changes (push) Successful in 9s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / test-python-backend (push) Successful in 29s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Failing after 4s
CI / validate-canonical-controls (push) Successful in 10s
CI / loc-budget (push) Successful in 13s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Has been skipped
CI / iace-gt-coverage (push) Has been skipped
(1) B22 Cross-Domain (fix #59):
Elli-Test fand AGB auf logpay.de NICHT obwohl URL in doc_entries
korrekt. Vermutete Ursache: Discovery-Phase A drops/überschreibt
Original-URL bei PDF-Fetch-Fail (word_count=0).
Fix: _collect_audit_urls() iteriert über state.doc_entries +
rejected_url + req.documents — Cross-Domain-Hosting ist
unabhängig vom Text-Inhalt. Plus Trace-Logging für künftige
Diagnose. Dedup per (doc_type, host_sld).
(2) B17 Audit-Walk-Fail-Fallback (fix #60):
BMW v5 hatte audit_walk=None ohne Mail-Hinweis. Vermutlich
180s-Timeout bei OneTrust-CMP-Banner-Tour.
Fix: Timeout 180s → 300s. Plus: Bei Fail wird ein Hinweis-
Stub mit error-Grund in state["audit_walk"] + HTML-Block
geschrieben — Reviewer sieht den Fail statt silent-skip.
(3) company_name + origin_domain im Backend (fix #61):
Frontend sendet seit ec03317 die zwei Felder — Backend ignorierte
sie.
Fix: ComplianceCheckRequest-Schema um company_name +
origin_domain erweitert. phase_e_email priorisiert User-Input
vor URL-Heuristik für site_name. Bei origin_domain ohne
ableitbare doc_entries-domain wird der User-Input als domain
übernommen.
(4) Plausibility-LLM Fallback-Modell (fix #62):
qwen3:30b-a3b liefert auf großen DSEs (BMW 122 FAIL) gehäuft
leere format='json'-Responses — Circuit-Breaker griff aber
Phase blieb nutzlos.
Fix: Default-Modell auf qwen2.5:7b umgestellt (4× kleiner,
zuverlässiger bei format=json, ausreichendes Reasoning für
PASS/MODIFY/DROP-Klassifikation). Plus Strategy-C eingeführt
— Fallback-Modell (llama3.2:3b) wenn primary leer bleibt.
BATCH_SIZE 4 → 3. ENV-Switches PLAUSIBILITY_LLM_MODEL +
PLAUSIBILITY_FALLBACK_MODEL für Tuning.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
50 lines
1.5 KiB
Python
50 lines
1.5 KiB
Python
"""Pydantic request/response schemas for the compliance-check route."""
|
|
|
|
from __future__ import annotations
|
|
|
|
from pydantic import BaseModel
|
|
|
|
|
|
class ExtractTextRequest(BaseModel):
|
|
url: str
|
|
|
|
|
|
class DocumentInput(BaseModel):
|
|
doc_type: str # dse, agb, impressum, cookie, widerruf, avv, loeschkonzept, etc.
|
|
url: str = ""
|
|
text: str = "" # text has priority over URL
|
|
|
|
|
|
class ComplianceCheckRequest(BaseModel):
|
|
documents: list[DocumentInput]
|
|
use_agent: bool = False
|
|
recipient: str = "dsb@breakpilot.local"
|
|
# P12: Override fuer TDM-Vorbehalt bei dokumentierter Kunden-Erlaubnis.
|
|
# Pflichtfeld tdm_override_reason wenn tdm_override=True
|
|
# (z.B. "Auftragsbeziehung Safetykon GmbH, Email Hr. X 18.05.2026").
|
|
tdm_override: bool = False
|
|
tdm_override_reason: str = ""
|
|
# P79: 8-Feld Pre-Scan-Wizard (Branche, B2B/B2C, Direkt-Vertrieb,
|
|
# Rechtsform, Konzern, MA, Besondere Daten, Drittland). Wird im
|
|
# Snapshot persistiert und filtert die MC-Auswertung (P72).
|
|
scan_context: dict | None = None
|
|
# Frontend-eingegebene Firma + Origin-Domain. Priorisiert vor
|
|
# LLM-extracted_profile-Inferenz. Wenn leer: Fallback auf Heuristik
|
|
# aus URL-Domains und DSE-Text.
|
|
company_name: str | None = None
|
|
origin_domain: str | None = None
|
|
|
|
|
|
class ComplianceCheckStartResponse(BaseModel):
|
|
check_id: str
|
|
status: str = "running"
|
|
|
|
|
|
class ComplianceCheckStatusResponse(BaseModel):
|
|
check_id: str
|
|
status: str
|
|
progress: str = ""
|
|
progress_pct: int = 0
|
|
result: dict | None = None
|
|
error: str = ""
|