Benjamin Admin
1784b43d72
feat(audit): screenshot + Tesseract OCR cookie extraction as vendor source C
Instead of fragile text regexes and LLM-cascade workarounds: a deterministic
pipeline. consent-tester takes a full-page screenshot of the cookie policy
(accepts the banner, expands accordions, burns in a timestamp). The backend
runs Tesseract OCR (deu, PSM 4) over it, and an anchor-based parser
extracts {name, category, purpose, duration, type} per cookie.
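A minimal sketch of the OCR call and an anchor-based parser, under stated assumptions: the German field labels in ANCHORS, the helper names ocr_image and parse_cookies, and the "label: value" line layout are hypothetical and are not taken from cookie_screenshot_ocr.py. Only the Tesseract settings (deu, PSM 4) and the five record keys come from the commit message.

```python
import re
from typing import Dict, List, Optional

# Hypothetical German labels mapped to the record keys named in the commit.
ANCHORS = {
    "name": "name",
    "kategorie": "category",
    "zweck": "purpose",
    "speicherdauer": "duration",
    "typ": "type",
}
LABEL_RE = re.compile(
    r"^\s*(%s)\s*[:|]\s*(.*)$" % "|".join(ANCHORS), re.IGNORECASE
)


def ocr_image(path: str) -> str:
    """Run Tesseract with German language data and page segmentation mode 4."""
    import pytesseract  # imported lazily so the parser is usable without it
    from PIL import Image

    return pytesseract.image_to_string(
        Image.open(path), lang="deu", config="--psm 4"
    )


def parse_cookies(text: str) -> List[Dict[str, str]]:
    """Split OCR text into cookie records; a 'Name' label starts a new record."""
    records: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    last_key: Optional[str] = None
    for line in text.splitlines():
        match = LABEL_RE.match(line)
        if match:
            key = ANCHORS[match.group(1).lower()]
            value = match.group(2).strip()
            if key == "name":
                # The name is stored exactly as OCR returned it (no correction).
                current = {"name": value}
                records.append(current)
            elif current is not None:
                current[key] = value
            last_key = key if current is not None else None
        elif current is not None and last_key not in (None, "name") and line.strip():
            # Unlabeled line: treat it as a wrapped continuation of the last field.
            current[last_key] = (current[last_key] + " " + line.strip()).strip()
    return records
```

Usage would be parse_cookies(ocr_image("policy.png")), where the file name is a placeholder. Anchoring on labels rather than on column positions keeps the parser independent of how the page happens to wrap long purpose texts.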
VW smoke test:
- Before (parse_flat): 60 cookies / 16 vendors
- Now (Tesseract): 79 cookies / 14 vendor records (~79% GT coverage)
Architecture:
- consent-tester: page_screenshot.py + /capture-evidence endpoint
- backend: cookie_screenshot_ocr.py with the Tesseract pipeline
- pipeline: runs after parse_flat as complementary stage C
- Dockerfile: tesseract-ocr + German language pack
- requirements: pytesseract
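One way a complementary stage could combine its output with the parse_flat result is sketched here. The function name merge_stage_c and the merge policy (existing values win, OCR only fills empty fields and adds unseen cookies) are assumptions for illustration; the commit does not state how the two sources are reconciled.

```python
from typing import Dict, List

Cookie = Dict[str, str]


def merge_stage_c(flat: List[Cookie], ocr: List[Cookie]) -> List[Cookie]:
    """Merge OCR records into parse_flat records, keyed by exact cookie name."""
    merged: Dict[str, Cookie] = {}
    for rec in flat:
        merged[rec["name"]] = dict(rec)  # copy so inputs are not mutated
    for rec in ocr:
        # Names are compared verbatim; no normalization or spelling correction.
        target = merged.setdefault(rec["name"], {"name": rec["name"]})
        for key, value in rec.items():
            if value and not target.get(key):
                target[key] = value
    return list(merged.values())
```

Keying on the verbatim name means an OCR misread yields a separate record instead of silently overwriting a correct one, which is the conservative choice if names must never be "corrected".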
NO text correction on cookie names (awsalb stays awsalb).
The timestamp in the screenshot is legal evidence of what we actually saw
on the site at scan time.
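Burning a timestamp into a captured image could look roughly like this. It is a sketch using Pillow; the helper names evidence_label and burn_in, the label format, and the 24-pixel banner are hypothetical, and page_screenshot.py may do this differently (for example in the browser before capture).

```python
from datetime import datetime, timezone


def evidence_label(url: str, captured_at: datetime) -> str:
    """Build the stamp text: ISO 8601 time in UTC followed by the page URL."""
    if captured_at.tzinfo is None:
        raise ValueError("captured_at must be timezone-aware")
    stamp = captured_at.astimezone(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp} | {url}"


def burn_in(src_path: str, dst_path: str, label: str) -> None:
    """Draw the label on a white strip at the top of the screenshot."""
    from PIL import Image, ImageDraw  # imported lazily; requires Pillow

    img = Image.open(src_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    draw.rectangle([0, 0, img.width, 24], fill="white")
    draw.text((4, 4), label, fill="black")  # default bitmap font
    img.save(dst_path)
```

Requiring a timezone-aware datetime avoids ambiguity about which local time the stamp refers to.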
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 23:22:35 +02:00