feat(control-pipeline): production LegalActIngester for EU acts (Parser 1)
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-consent (push) Successful in 28s
CI / test-python-voice (push) Successful in 32s
CI / test-bqas (push) Successful in 30s
Add services/legal_act_ingester.py, the EU eur-lex LegalActIngester engine:

- CELLAR download (with eur-lex fallback, bypassing the HTTP 202 web block on large acts like DORA)
- parse into articles + annexes with full authority metadata + forward citation edges (references_out)
- a self-test gate before upload

Refactor scripts/ingest_eu_regulations.py to use it: parse-based, per-unit upload with a skip-by-CELEX guard (no automatic re-ingest). Recitals are intentionally left to a separate ingester (Parser 2).

Tested: parser / metadata / self-test / refs_out over a synthetic eur-lex fixture (7 tests), ruff + mypy clean, real CELLAR fetch of DORA verified end-to-end (64 articles, full authority metadata).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
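The control flow described above (skip guard, download with fallback, self-test gate, per-unit upload) might look roughly like the sketch below. This is not the code in services/legal_act_ingester.py: every name here (`FetchError`, `fetch_with_fallback`, `self_test`, `ingest`, the dict-shaped units, the injected `parse` and `upload` callables, and the CELEX-style identifier in the usage note) is hypothetical, and the self-test checks are assumed examples, not the real gate.

```python
from __future__ import annotations

from typing import Callable, Sequence


class FetchError(Exception):
    """Hypothetical error raised when a source cannot deliver the act."""


def fetch_with_fallback(
    celex: str, fetchers: Sequence[tuple[str, Callable[[str], str]]]
) -> str:
    """Try each (name, fetch) pair in order; return the first non-empty body."""
    errors: list[str] = []
    for name, fetch in fetchers:
        try:
            html = fetch(celex)
        except FetchError as exc:
            errors.append(f"{name}: {exc}")
            continue
        if html.strip():
            return html
        errors.append(f"{name}: empty body")
    raise FetchError("; ".join(errors))


def self_test(units: Sequence[dict]) -> list[str]:
    """Return a list of problems; an empty list means the gate passes.

    The checks are illustrative assumptions, not the real gate.
    """
    problems: list[str] = []
    if not any(u["kind"] == "article" for u in units):
        problems.append("no articles parsed")
    seen: set[tuple[str, str]] = set()
    for u in units:
        key = (u["kind"], u["number"])
        if key in seen:
            problems.append(f"duplicate unit {key}")
        seen.add(key)
        if not u["text"].strip():
            problems.append(f"empty text in {key}")
    return problems


def ingest(
    celex: str,
    already_ingested: set[str],
    fetchers: Sequence[tuple[str, Callable[[str], str]]],
    parse: Callable[[str], list[dict]],
    upload: Callable[[str, dict], None],
) -> str:
    """Skip known acts, otherwise fetch, parse, gate, then upload per unit."""
    if celex in already_ingested:  # skip-by-CELEX guard: no automatic re-ingest
        return "skipped"
    units = parse(fetch_with_fallback(celex, fetchers))
    problems = self_test(units)
    if problems:  # nothing is uploaded when the gate fails
        return "rejected: " + "; ".join(problems)
    for unit in units:
        upload(celex, unit)
    already_ingested.add(celex)
    return f"uploaded {len(units)} units"
```

Calling `ingest` twice with the same identifier (for example the made-up `"32099R0001"`) uploads on the first call and returns `"skipped"` on the second, which is the behaviour the skip guard is meant to provide.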
@@ -0,0 +1,17 @@
<!DOCTYPE html>
<html><body>
<p class="oj-doc-ti">VERORDNUNG (EU) 2099/1 DES TESTGEBERS</p>
<p class="oj-normal">(1) Dieser Erwaegungsgrund steht vor den Artikeln und darf NICHT als Artikel geparst werden.</p>
<p class="oj-ti-grseq-1">KAPITEL I</p>
<p class="oj-ti-art">Artikel 1</p>
<p class="oj-sti-art">Gegenstand</p>
<p class="oj-normal">Diese Verordnung legt Anforderungen fest; Einzelheiten regeln Artikel 2 und Anhang I.</p>
<p class="oj-ti-art">Artikel 2</p>
<p class="oj-sti-art">Begriffsbestimmungen</p>
<p class="oj-normal">Im Sinne dieser Verordnung bezeichnet der Ausdruck Produkt eine Sache mit digitalen Elementen.</p>
<p class="oj-doc-ti">ANHANG I</p>
<p class="oj-ti-grseq-1">GRUNDLEGENDE ANFORDERUNGEN</p>
<p class="oj-normal">Die Produkte muessen die grundlegenden Anforderungen gemaess Artikel 1 dauerhaft erfuellen.</p>
<p class="oj-doc-ti">ANHANG II</p>
<p class="oj-normal">x</p>
</body></html>
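For orientation, here is a minimal sketch of how a fixture like this could be split into articles and annexes with forward references, using only the standard library. It is an illustration under stated assumptions, not the parser added in this commit: the names `Unit`, `parse_units`, and the regular expressions are hypothetical, and the mapping from `oj-*` classes to roles is inferred from this fixture alone. The embedded `FIXTURE` is a condensed copy of the file above.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from html.parser import HTMLParser

FIXTURE = """<html><body>
<p class="oj-doc-ti">VERORDNUNG (EU) 2099/1 DES TESTGEBERS</p>
<p class="oj-normal">(1) Dieser Erwaegungsgrund steht vor den Artikeln.</p>
<p class="oj-ti-grseq-1">KAPITEL I</p>
<p class="oj-ti-art">Artikel 1</p>
<p class="oj-sti-art">Gegenstand</p>
<p class="oj-normal">Einzelheiten regeln Artikel 2 und Anhang I.</p>
<p class="oj-ti-art">Artikel 2</p>
<p class="oj-sti-art">Begriffsbestimmungen</p>
<p class="oj-normal">Produkt bezeichnet eine Sache.</p>
<p class="oj-doc-ti">ANHANG I</p>
<p class="oj-ti-grseq-1">GRUNDLEGENDE ANFORDERUNGEN</p>
<p class="oj-normal">Anforderungen gemaess Artikel 1.</p>
<p class="oj-doc-ti">ANHANG II</p>
<p class="oj-normal">x</p>
</body></html>"""


@dataclass
class Unit:
    kind: str  # "article" or "annex"
    number: str  # e.g. "1", "2", "I", "II"
    title: str = ""
    text: list[str] = field(default_factory=list)
    references_out: list[str] = field(default_factory=list)


class _ParagraphCollector(HTMLParser):
    """Collects (css_class, text) for every <p> element, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: list[tuple[str, str]] = []
        self._cls: str | None = None
        self._buf: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._cls = dict(attrs).get("class") or ""
            self._buf = []

    def handle_data(self, data):
        if self._cls is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._cls is not None:
            self.paragraphs.append((self._cls, "".join(self._buf).strip()))
            self._cls = None


ARTICLE_RE = re.compile(r"^Artikel\s+(\d+[a-z]?)$")
ANNEX_RE = re.compile(r"^ANHANG(?:\s+([IVXLC]+))?$")
REF_RE = re.compile(r"\b(Artikel\s+\d+[a-z]?|Anhang\s+[IVXLC]+)\b")


def parse_units(html: str) -> list[Unit]:
    collector = _ParagraphCollector()
    collector.feed(html)
    units: list[Unit] = []
    current: Unit | None = None
    for cls, text in collector.paragraphs:
        art = ARTICLE_RE.match(text) if cls == "oj-ti-art" else None
        annex = ANNEX_RE.match(text) if cls == "oj-doc-ti" else None
        if art:
            current = Unit("article", art.group(1))
            units.append(current)
        elif annex:
            current = Unit("annex", annex.group(1) or "")
            units.append(current)
        elif current is None:
            continue  # act title, recitals, chapter headings before Artikel 1
        elif cls == "oj-sti-art" and current.kind == "article":
            current.title = text
        elif cls == "oj-ti-grseq-1" and current.kind == "annex" and not current.title:
            current.title = text
        elif cls == "oj-normal":
            current.text.append(text)
    for unit in units:
        own = f"{'Artikel' if unit.kind == 'article' else 'Anhang'} {unit.number}"
        for ref in REF_RE.findall(" ".join(unit.text)):
            if ref != own and ref not in unit.references_out:
                unit.references_out.append(ref)
    return units
```

In this sketch the recital paragraph is dropped because no article or annex has started yet, which mirrors the fixture's stated intent that it must not be parsed as an article. `parse_units(FIXTURE)` yields four units (Artikel 1, Artikel 2, Anhang I, Anhang II), with forward references collected per unit.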