feat(backend): on-demand browser behavior matrix + snapshot persistence (Phase 2)
- check_snapshot: update_browser_matrix/load_browser_matrix, stored without a
migration in banner_result.browser_matrix (JSONB jsonb_set, own scanned_at)
- snapshot_check_routes: POST /snapshots/{id}/browser-behavior/run runs
/scan-matrix LIVE (re-crawl per engine, measurable only live) and persists the
result; GET /snapshots/{id}/browser-behavior returns the stored
matrix without a re-crawl. Profile set = 4 default engines + Brave/Chrome/Edge.
- consent-tester multi_browser_scanner: Semaphore(2) against OOM (7 browsers
in parallel exceeded the 2g mem_limit)
- Pydantic model with Optional[List[...]] (not `| None`) → Py3.9-safe
- Tests: _snapshot_scan_url + request defaults (5)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
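
Note on the `Optional[List[...]]` bullet: the `X | None` union syntax (PEP 604) is only evaluable at runtime from Python 3.10 on, and Pydantic evaluates field annotations when it builds a model. A minimal stdlib-only sketch of the safe spelling follows; `MatrixRequest` and its `profiles` field are hypothetical stand-ins, not the names used in this commit.

```python
from typing import List, Optional, get_type_hints


class MatrixRequest:
    """Hypothetical stand-in for the request model (names are assumptions)."""

    # Optional[List[str]] can be evaluated on Python 3.9.
    # `List[str] | None` would raise TypeError there once the annotation is evaluated.
    profiles: Optional[List[str]] = None


# Resolve the annotations the way a validation library would have to.
hints = get_type_hints(MatrixRequest)
```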
@@ -37,6 +37,11 @@ _HARD_FAIL_CAP = 55
 # Banner-Design / Dark 20%
 _WEIGHTS = {"pre_consent": 0.5, "reject_respect": 0.3, "banner_design": 0.2}
 
+# Cap concurrency: each Playwright browser needs 300-500 MB; with 7
+# profiles, launching them in parallel would exceed the container's 2g
+# mem_limit (OOM kill). 2 at a time → peak ~1 GB, wall time ~profiles/2.
+_MAX_CONCURRENCY = 2
+
 
 def _extract_dimensions(banner_result: dict) -> dict[str, float]:
     """Best-effort: derive 3 sub-scores from the existing scan output.
@@ -149,7 +154,13 @@ async def run_matrix(
             "verbal": _verbal(score),
         }
 
-    results = await asyncio.gather(*[_run_one(p) for p in profiles])
+    _sem = asyncio.Semaphore(_MAX_CONCURRENCY)
+
+    async def _bounded(prof: dict) -> dict:
+        async with _sem:
+            return await _run_one(prof)
+
+    results = await asyncio.gather(*[_bounded(p) for p in profiles])
     sorted_by_score = sorted(results, key=lambda r: r["score"])
     worst = sorted_by_score[0]
     best = sorted_by_score[-1]
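
The bounded `gather` in the second hunk can be tried in isolation. The sketch below is a self-contained approximation, not the scanner's actual code: `run_bounded`, `fake_run_one`, and the `profile-N` names are hypothetical, and a short sleep stands in for the Playwright run. It records how many jobs are in flight at once to show that the semaphore holds the peak at 2 while `asyncio.gather` still returns results in input order.

```python
import asyncio

MAX_CONCURRENCY = 2  # mirrors _MAX_CONCURRENCY from the diff


async def run_bounded(profiles, run_one, limit=MAX_CONCURRENCY):
    """Run run_one(profile) for every profile, at most `limit` at a time."""
    sem = asyncio.Semaphore(limit)

    async def bounded(profile):
        # Only `limit` coroutines may be inside this block simultaneously.
        async with sem:
            return await run_one(profile)

    # gather schedules all wrappers at once; the semaphore does the throttling.
    return await asyncio.gather(*[bounded(p) for p in profiles])


async def demo():
    active = 0  # jobs currently running
    peak = 0    # highest value `active` ever reached

    async def fake_run_one(profile):
        # Hypothetical stand-in for one browser scan.
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        return {"profile": profile, "score": len(profile)}

    profiles = [f"profile-{i}" for i in range(7)]  # placeholder names
    results = await run_bounded(profiles, fake_run_one)
    return peak, profiles, [r["profile"] for r in results]


peak, profiles, order = asyncio.run(demo())
```

With 7 profiles and a limit of 2, the jobs finish in roughly ceil(7/2) = 4 waves, which matches the "wall time ~profiles/2" estimate in the code comment.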