fix: Raise full_text limit 10K→50K + combine all DSI texts for checks

Two fixes:
1. consent-tester: full_text truncation raised from 10,000 to 50,000 chars
   (IHK Internetangebot has ~50K chars, and the Beschwerderecht, i.e. the
   right to lodge a complaint, came after the 10K cutoff)
2. Backend: dse_text now combines the Playwright HTML with ALL DSI discovery
   texts for mandatory content checking. Previously it used only the first
   8K chars from one source, so the Verantwortlicher (controller) and DSB
   (data protection officer) details found in the DSI documents were missed.
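
The backend change in fix 2 is not shown in the hunk below, so the following is only a minimal sketch of the idea, not the actual implementation. The names `combine_dse_text` and `missing_mandatory`, and the keyword table passed to them, are hypothetical.

```python
def combine_dse_text(playwright_text: str, dsi_texts: list[str]) -> str:
    """Join the page text and every DSI document text into one string.

    Hypothetical helper: empty or whitespace-only sources are skipped.
    """
    parts = [playwright_text, *dsi_texts]
    return "\n\n".join(p.strip() for p in parts if p and p.strip())


def missing_mandatory(text: str, required: dict[str, list[str]]) -> list[str]:
    """Return the labels for which none of the keywords occur in text.

    The label/keyword table is an assumption for illustration only.
    """
    lowered = text.lower()
    return [
        label
        for label, keywords in required.items()
        if not any(k.lower() in lowered for k in keywords)
    ]
```

With all sources combined, a keyword that appears only in a DSI document still counts as present, which a check against a single truncated source would miss.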

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benjamin Admin
2026-05-05 16:03:56 +02:00
parent 3ac8d0cba8
commit a349111a01
2 changed files with 14 additions and 16 deletions
+1 -1
@@ -312,7 +312,7 @@ async def dsi_discovery(req: DSIDiscoveryRequest):
doc_type=d.doc_type,
word_count=d.word_count,
text_preview=d.text[:500] if d.text else "",
-full_text=d.text[:10000] if d.text else "",
+full_text=d.text[:50000] if d.text else "",
)
for d in result.documents
],
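
To illustrate why the old limit dropped content, here is a small sketch that applies the same slicing expression to a synthetic document. The helper name `truncate` and the sample text are assumptions made for the example, not part of the changed file.

```python
from typing import Optional


def truncate(text: Optional[str], limit: int) -> str:
    """Mirror the `d.text[:N] if d.text else ""` expression from the hunk."""
    return text[:limit] if text else ""


# Synthetic document: the keyword sits after the first 10,000 characters.
doc = "x" * 12_000 + " Beschwerderecht " + "y" * 30_000

old_view = truncate(doc, 10_000)  # keyword is cut off
new_view = truncate(doc, 50_000)  # keyword survives
```

Any keyword located past the cutoff is invisible to downstream checks, so the limit has to cover the longest document that is expected.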