fix: Add impressum keywords to dsi_discovery.py inline DSI_KEYWORDS

The inline DSI_KEYWORDS dict in dsi_discovery.py had no Impressum
keywords, so self-extraction skipped Impressum (imprint) pages and
returned the Datenschutz (privacy policy) text instead.

Added keywords: impressum, anbieterkennzeichnung, imprint,
legal notice, site notice.
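
A minimal sketch of the failure mode, assuming discovery matches keywords
as case-insensitive substrings of link text and URL. The matching logic in
dsi_discovery.py is not shown in this commit, so matching_links and the
trimmed keyword lists below are hypothetical names used for illustration
only:

```python
# Hypothetical, trimmed-down keyword lists; the real DSI_KEYWORDS is larger
# and is keyed by language.
KEYWORDS_BEFORE: list[str] = ["datenschutz", "cookie-richtlinie", "widerruf"]
KEYWORDS_AFTER: list[str] = KEYWORDS_BEFORE + [
    "impressum", "anbieterkennzeichnung",
    "imprint", "legal notice", "site notice",
]


def matching_links(links: list[tuple[str, str]], keywords: list[str]) -> list[str]:
    """Return URLs whose link text or URL contains any keyword.

    Assumption: plain case-insensitive substring matching.
    """
    hits: list[str] = []
    for text, url in links:
        haystack = f"{text} {url}".lower()
        if any(keyword in haystack for keyword in keywords):
            hits.append(url)
    return hits


links = [
    ("Datenschutz", "https://example.com/datenschutz"),
    ("Impressum", "https://example.com/impressum"),
]

# Without the new keywords only the Datenschutz page is found.
print(matching_links(links, KEYWORDS_BEFORE))
# With them, the Impressum page is found as well.
print(matching_links(links, KEYWORDS_AFTER))
```

Under this assumption, the Impressum link is never selected before the
change, which is consistent with extraction falling back to the
Datenschutz text.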

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-05-11 14:43:47 +02:00
parent 3c7ed65f86
commit fde2f551d7
+4
@@ -44,6 +44,10 @@ DSI_KEYWORDS: dict[str, list[str]] = {
         "widerruf", "rücktrittsrecht",
         # Cookie
         "cookie-richtlinie", "cookie-policy", "cookie-hinweis",
+        # Impressum
+        "impressum", "anbieterkennzeichnung",
+        # Imprint (EN)
+        "imprint", "legal notice", "site notice",
     ],
     "en": [
         "privacy policy", "privacy notice", "data protection", "data policy",