fix: Add impressum keywords to dsi_discovery.py inline DSI_KEYWORDS
The inline DSI_KEYWORDS in dsi_discovery.py was missing 'impressum'. This caused self-extraction to skip impressum pages, returning datenschutz text instead.

Added: impressum, anbieterkennzeichnung, imprint, legal notice, site notice.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -44,6 +44,10 @@ DSI_KEYWORDS: dict[str, list[str]] = {
         "widerruf", "rücktrittsrecht",
         # Cookie
         "cookie-richtlinie", "cookie-policy", "cookie-hinweis",
+        # Impressum
+        "impressum", "anbieterkennzeichnung",
+        # Imprint (EN)
+        "imprint", "legal notice", "site notice",
     ],
     "en": [
         "privacy policy", "privacy notice", "data protection", "data policy",
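
The effect described in the commit message can be illustrated with a minimal sketch. The actual matching logic in dsi_discovery.py is not part of this diff, so `matches_keywords`, the shortened keyword lists, the `"de"` key, and the sample link texts below are all assumptions made for illustration, not the project's real code.

```python
# Hypothetical sketch: the real lookup code in dsi_discovery.py is not shown
# in this commit. The keyword lists are shortened and the "de" key is assumed.
DSI_KEYWORDS_BEFORE: dict[str, list[str]] = {
    "de": ["datenschutz", "widerruf", "cookie-richtlinie"],
}
DSI_KEYWORDS_AFTER: dict[str, list[str]] = {
    "de": DSI_KEYWORDS_BEFORE["de"] + [
        "impressum", "anbieterkennzeichnung",
        "imprint", "legal notice", "site notice",
    ],
}


def matches_keywords(link_text: str, keywords: dict[str, list[str]]) -> bool:
    """Return True if any keyword occurs in the lowercased link text."""
    text = link_text.lower()
    return any(kw in text for kws in keywords.values() for kw in kws)


# Sample footer link texts (invented for this example).
links = ["Datenschutz", "Impressum", "Kontakt"]
found_before = [l for l in links if matches_keywords(l, DSI_KEYWORDS_BEFORE)]
found_after = [l for l in links if matches_keywords(l, DSI_KEYWORDS_AFTER)]

print(found_before)  # the impressum link is not recognised
print(found_after)   # the impressum link is now recognised
```

Under these assumptions, a page whose only recognised link is the datenschutz page would fall back to that text, which matches the symptom described in the commit message.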