Benjamin Admin c258fbc3de
feat(control-pipeline): RecitalIngester for EU act recitals (Parser 2)
Add services/recital_ingester.py, which parses EU act recitals
(Erwägungsgründe) from the eur-lex/CELLAR preamble via the id="rct_N"
markers (the table layout that defeats a naive article parser). Recitals
are tagged as a SEPARATE interpretative source (source_class=recital,
authority_weight=60, use_for_primary=false), so they rank below binding
articles and surface only as interpretation context.
Reuses the Parser-1 download + helpers. Add scripts/ingest_recitals.py
(skip-by-existing, no auto re-ingest) + tests/fixture.
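
For readers without the diff at hand, the rct_N extraction and interpretative tagging described above can be sketched roughly as follows. This is a minimal illustration and not the code in services/recital_ingester.py: the names `parse_recitals` and `_RecitalCollector`, the output dict shape, and the synthetic `SAMPLE` markup are all assumptions. Only the id="rct_N" marker and the three metadata values come from this commit message.

```python
import re
from html.parser import HTMLParser

RCT_ID = re.compile(r"^rct_(\d+)$")          # marker named in the commit message
LEADING_MARKER = re.compile(r"^\(\d+\)\s*")  # "(1)" cell of the table layout (assumed)
VOID_TAGS = {"br", "hr", "img", "col", "input", "link", "meta"}

# Values taken from the commit message; the dict shape is an assumption.
RECITAL_METADATA = {
    "source_class": "recital",
    "authority_weight": 60,
    "use_for_primary": False,
}


class _RecitalCollector(HTMLParser):
    """Collects the text of every element whose id looks like rct_N."""

    def __init__(self):
        super().__init__()
        self.recitals = []
        self._number = None  # recital currently being read, if any
        self._depth = 0      # open-tag depth inside that recital
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:  # void tags never get a matching end tag
            return
        if self._number is None:
            match = RCT_ID.match(dict(attrs).get("id") or "")
            if match:
                self._number = int(match.group(1))
                self._depth = 1
                self._chunks = []
        else:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._number is None:
            return
        self._depth -= 1
        if self._depth == 0:  # the rct_N container just closed
            text = " ".join(" ".join(self._chunks).split())
            self.recitals.append({
                "number": self._number,
                "text": LEADING_MARKER.sub("", text),
                "metadata": dict(RECITAL_METADATA),
            })
            self._number = None

    def handle_data(self, data):
        if self._number is not None:
            self._chunks.append(data)


def parse_recitals(html):
    """Return one tagged dict per rct_N container found in html."""
    collector = _RecitalCollector()
    collector.feed(html)
    collector.close()
    return collector.recitals


# Synthetic fixture (assumed markup, not a real CELLAR download).
SAMPLE = """
<div class="eli-subdivision" id="rct_1">
  <table><tr>
    <td><p>(1)</p></td>
    <td><p>First recital text.</p></td>
  </tr></table>
</div>
<div id="rct_2"><table><tr><td><p>(2)</p></td><td><p>Second<br>recital.</p></td></tr></table></div>
<div id="art_1"><p>Article 1</p></div>
"""

recitals = parse_recitals(SAMPLE)
```

Keying on the container id rather than on paragraph order is what lets the two-cell table layout (number cell, text cell) collapse into a single recital, while article containers such as art_1 are ignored.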

Tested: 4 unit tests over a synthetic rct_N fixture, ruff + mypy clean, real
CELLAR parse of DORA verified end-to-end (106 recitals, interpretative metadata).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-24 08:49:30 +02:00