fix: Context-aware Impressum checks + 3 regex fixes
3 Regex fixes:
- Telefon: matches '0761 / 48 98 09 01' format (spaces around /)
- Registergericht: matches 'AG Freiburg' (not just 'Amtsgericht')
- Vertretung: matches 'Geschaeftsfuehrung:' (not just 'Geschaeftsfuehrer:')

6 checks changed from FAIL to INFO severity:
- V.i.S.d.P.: only relevant if website has editorial content
- Streitbeilegung: only relevant for B2C online shops
- Berufsrecht: only relevant for regulated professions
- Stammkapital: legally required but rarely enforced
- Aufsichtsbehoerde: only for licensed activities
- Berufshaftpflicht: only for mandatory insurance

INFO checks don't count towards completeness percentage.
They appear as hints, not findings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
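
The three regex fixes listed in the message can be illustrated with a minimal sketch. The actual pattern definitions are not part of this diff, so the names PHONE, REGISTER_COURT and REPRESENTATIVE and the patterns themselves are assumptions, written against lowercased text because the checker matches on text_lower.

```python
import re

# Hypothetical patterns; the real ones in the repository are not shown in this diff.

# Telefon: area code, optional "/" or "-" with optional spaces around it,
# then digits that may be grouped by single spaces ("0761 / 48 98 09 01").
PHONE = re.compile(r"(?:\+49|0)\d{2,5}\s*[/-]?\s*\d[\d ]{4,}\d")

# Registergericht: accept the abbreviation "ag" as well as "amtsgericht".
# Caveat: "ag" is also a company suffix (Aktiengesellschaft), so this
# simple form can produce false positives.
REGISTER_COURT = re.compile(r"\b(?:amtsgericht|ag)\s+[a-zäöüß]")

# Vertretung: accept "geschaeftsfuehrung:" next to "geschaeftsfuehrer:",
# with either transliterated or real umlauts.
REPRESENTATIVE = re.compile(r"gesch(?:ae|ä)ftsf(?:ue|ü)hr(?:er(?:in)?|ung)\s*:")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern occurs anywhere in the lowercased text."""
    return pattern.search(text.lower()) is not None


if __name__ == "__main__":
    print(matches(PHONE, "Tel.: 0761 / 48 98 09 01"))
    print(matches(REGISTER_COURT, "AG Freiburg, HRB 12345"))
    print(matches(REPRESENTATIVE, "Geschaeftsfuehrung: Max Mustermann"))
```

Each pattern only widens what was accepted before, so inputs in the older formats ('Amtsgericht', 'Geschaeftsfuehrer:') should still match.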
@@ -111,14 +111,19 @@ def check_document_completeness(
     passed_l1_ids: set[str] = set()
     all_checks: list[dict] = []
     l1_present = 0
+    l1_scoreable = 0  # Exclude INFO checks from score

     for check in l1_checks:
+        is_info = check.get("severity") == "INFO"
         match = _match_patterns(check["patterns"], text_lower)
         passed = match is not None
         if passed:
             passed_l1_ids.add(check["id"])
-            l1_present += 1
-        else:
+            if not is_info:
+                l1_present += 1
+        if not is_info:
+            l1_scoreable += 1
+        if not passed and not is_info:
             findings.append({
                 "code": f"DSI-MISSING-{check['id'].upper()}",
                 "severity": check.get("severity", "MEDIUM"),
@@ -175,7 +180,7 @@ def check_document_completeness(
         })

     # ── Summary ───────────────────────────────────────────────────────
-    l1_total = len(l1_checks)
+    l1_total = l1_scoreable  # Exclude INFO checks from percentage
     completeness_pct = round(l1_present / l1_total * 100) if l1_total else 0
     correctness_pct = round(l2_passed / l2_total * 100) if l2_total else 0
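
The scoring change above can be tried in isolation with a small self-contained sketch. The function name completeness, the hints list and the sample checks are hypothetical; only the rule that INFO checks are left out of both numerator and denominator comes from the diff. How hints are surfaced in the real report is not shown in the commit, so returning their ids here is an assumption.

```python
import re


def completeness(checks: list[dict], text_lower: str) -> tuple[int, list[str]]:
    """Return (completeness percentage, ids of unmatched INFO checks)."""
    present = 0
    scoreable = 0
    hints: list[str] = []
    for check in checks:
        is_info = check.get("severity") == "INFO"
        passed = any(re.search(p, text_lower) for p in check["patterns"])
        if is_info:
            # INFO checks never affect the score; a miss becomes a hint only.
            if not passed:
                hints.append(check["id"])
            continue
        scoreable += 1
        if passed:
            present += 1
    # Guard against division by zero when every check is INFO.
    pct = round(present / scoreable * 100) if scoreable else 0
    return pct, hints


# Hypothetical sample checks, not taken from the repository.
SAMPLE_CHECKS = [
    {"id": "telefon", "severity": "HIGH", "patterns": [r"tel"]},
    {"id": "email", "severity": "HIGH", "patterns": [r"@"]},
    {"id": "visdp", "severity": "INFO", "patterns": [r"v\.i\.s\.d\.p"]},
]

if __name__ == "__main__":
    print(completeness(SAMPLE_CHECKS, "tel.: 0761 / 48 98 09 01"))
```

With the sample input, one of the two scoreable checks matches, so the missing V.i.S.d.P. entry lowers nothing and shows up only in the hint list. Under the old `len(l1_checks)` denominator the same page would have been scored against three checks.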