feat(audit): overlapping evidence slices for a gapless chain of evidence
Instead of ONE full-page screenshot, the full-page capture is cut with PIL into viewport-sized slices, each overlapping the previous one by overlap_px pixels. Every cookie appears in at least one slice, and at slice boundaries even in two → dedup by name eliminates the duplicates.

Why not scroll-based slicing directly in Playwright? VW's cookie page uses scroll-snap / fixed-position: all viewport shots came back identical (header overlay). Cutting the full-page PNG with PIL bypasses the problem entirely.

VW smoke test (32 slices):
  per-slice: [0, 0, 2, 5, 5, 3, 4, 7, 4, 3, 4, 5, ...]
  103 raw cookies → 79 unique after dedup
  14 vendor records (Google 9, Adobe family 17, etc.)

Each slice has its own timestamp + SHA256 → ZIP attachment for the legal chain of evidence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
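The slicing and dedup described above could look roughly like the sketch below. It is not the code of capture_page_overlapping_slices (which is not part of this diff). The helper names slice_bounds, cut_slices and dedup_by_name are hypothetical, as is the assumption that cookies arrive as dicts with a "name" key. The only Pillow calls used are Image.open, Image.crop and Image.save, and Pillow is imported lazily so the pure arithmetic runs without it.

```python
import hashlib
import io
from datetime import datetime, timezone


def slice_bounds(total_h, viewport_h=1024, overlap_px=200, max_slices=40):
    """Return (top_y, bot_y) pairs; consecutive slices share overlap_px rows."""
    if not 0 <= overlap_px < viewport_h:
        raise ValueError("overlap_px must be >= 0 and smaller than viewport_h")
    step = viewport_h - overlap_px  # how far each slice advances
    bounds, top = [], 0
    while top < total_h and len(bounds) < max_slices:
        bot = min(top + viewport_h, total_h)
        bounds.append((top, bot))
        if bot >= total_h:  # reached the bottom of the page
            break
        top += step
    return bounds


def cut_slices(full_page_png, **kwargs):
    """Crop a full-page PNG (bytes) into overlapping slices (hypothetical helper)."""
    from PIL import Image  # third-party (Pillow), imported only when needed

    img = Image.open(io.BytesIO(full_page_png))
    width, height = img.size
    out = []
    for idx, (top, bot) in enumerate(slice_bounds(height, **kwargs)):
        buf = io.BytesIO()
        img.crop((0, top, width, bot)).save(buf, format="PNG")
        png = buf.getvalue()
        out.append({
            "idx": idx,
            "ts": datetime.now(timezone.utc).isoformat(),
            "top_y": top,
            "bot_y": bot,
            "sha256": hashlib.sha256(png).hexdigest(),  # assumed: hex digest of the PNG bytes
            "png_size": len(png),
            "png": png,
        })
    return out


def dedup_by_name(per_slice_cookies):
    """Keep the first occurrence of each cookie name across all slices."""
    seen, unique = set(), []
    for cookies in per_slice_cookies:
        for cookie in cookies:
            if cookie["name"] not in seen:
                seen.add(cookie["name"])
                unique.append(cookie)
    return unique
```

With this stepping, any element no taller than overlap_px lies completely inside at least one slice, as long as max_slices does not cut the page short. That is the property the dedup step relies on.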
+57
-1
@@ -16,7 +16,10 @@ from services.consent_scanner import run_consent_test, ConsentTestResult
 from services.authenticated_scanner import run_authenticated_test, AuthTestResult
 from services.playwright_scanner import scan_website_playwright
 from services.dsi_discovery import discover_dsi_documents, DSIDiscoveryResult
-from services.page_screenshot import capture_page_evidence
+from services.page_screenshot import (
+    capture_page_evidence,
+    capture_page_overlapping_slices,
+)
 from checks.banner_runner import map_scan_to_checks
 
 logging.basicConfig(level=logging.INFO, format="%(levelname)s:%(name)s: %(message)s")
@@ -407,6 +410,59 @@ async def capture_evidence(req: EvidenceRequest):
     )
 
 
+# ── Evidence slices (overlapping scrolling screenshots) ─────────────
+
+class EvidenceSlicesRequest(BaseModel):
+    url: str
+    check_id: str = ""
+    viewport_h: int = 1024
+    overlap_px: int = 200
+    max_slices: int = 40
+
+
+class EvidenceSliceItem(BaseModel):
+    idx: int
+    ts: str
+    top_y: int
+    bot_y: int
+    sha256: str
+    png_b64: str
+    png_size: int
+
+
+class EvidenceSlicesResponse(BaseModel):
+    url: str
+    total_height_px: int
+    width_px: int
+    accepted_banner: bool
+    expanded: int
+    slices: list[EvidenceSliceItem]
+
+
+@app.post("/capture-evidence-slices", response_model=EvidenceSlicesResponse)
+async def capture_evidence_slices(req: EvidenceSlicesRequest):
+    """Overlapping viewport screenshots for a gapless chain of evidence.
+
+    Each slice overlaps the previous one by overlap_px pixels, so every cookie
+    appears in at least one image, and at slice boundaries even in two. Dedup
+    by cookie name eliminates the duplicates in the final result.
+    """
+    logger.info("Capturing overlapping evidence slices for %s", req.url)
+    data = await capture_page_overlapping_slices(
+        req.url, check_id=req.check_id,
+        viewport_h=req.viewport_h, overlap_px=req.overlap_px,
+        max_slices=req.max_slices,
+    )
+    return EvidenceSlicesResponse(
+        url=data["url"],
+        total_height_px=data["total_height_px"],
+        width_px=data["width_px"],
+        accepted_banner=data["accepted_banner"],
+        expanded=data["expanded"],
+        slices=[EvidenceSliceItem(**s) for s in data["slices"]],
+    )
+
+
 # ── Admin: CMP discoveries (Phase E) ────────────────────────────────
 
 @app.get("/cmp-discoveries")
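
A consumer of /capture-evidence-slices might verify each slice and bundle the result into the ZIP attachment mentioned in the commit message. The sketch below is illustrative only. It assumes that sha256 is the hex digest of the decoded PNG bytes and that png_size is their byte length, neither of which the diff confirms. The function name verify_and_zip, the file naming scheme and manifest.json are hypothetical.

```python
import base64
import hashlib
import io
import json
import zipfile


def verify_and_zip(slices):
    """Check every slice against its declared hash, then pack them into a ZIP.

    `slices` is a list of dicts shaped like EvidenceSliceItem.
    Raises ValueError on the first slice that fails the integrity check.
    """
    buf = io.BytesIO()
    manifest = []
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for item in slices:
            png = base64.b64decode(item["png_b64"])
            digest = hashlib.sha256(png).hexdigest()  # assumed hash convention
            if digest != item["sha256"] or len(png) != item["png_size"]:
                raise ValueError(f"slice {item['idx']} failed integrity check")
            name = f"slice_{item['idx']:03d}.png"  # hypothetical naming scheme
            zf.writestr(name, png)
            manifest.append({
                "file": name,
                "idx": item["idx"],
                "ts": item["ts"],
                "top_y": item["top_y"],
                "bot_y": item["bot_y"],
                "sha256": item["sha256"],
            })
        # The manifest keeps timestamps and hashes next to the images.
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buf.getvalue()
```

Keeping the per-slice timestamp and hash in a manifest inside the archive means the evidence can be re-verified later without access to the original API response.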