Files
breakpilot-core/control-pipeline/tests/test_standard_ingester.py
T
Benjamin Admin 3b466be140
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-consent (push) Successful in 25s
CI / test-python-voice (push) Successful in 28s
CI / test-bqas (push) Successful in 26s
feat(control-pipeline): StandardIngester engine for technical standards (Parser 4)
Add services/standard_ingester.py — tags technical standards / control frameworks
(NIST / OWASP / BSI Grundschutz / CSA CCM) source_class=technical_standard /
authority_weight=80 / bindingness=best_practice / use_for_primary=false, so a
standard ranks below binding law and guidance for obligation/interpretation
questions but surfaces for "which controls/measures fit?" (control-intent, a
follow-up retriever step). Reuses the guidance_ingester extraction helpers. The
per-source license is carried on every unit so the commercial gate can refuse a
non-commercial source (e.g. CSA CCM = CC-BY-NC).

Tested: 3 unit tests on the metadata path, ruff + mypy clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-24 11:08:20 +02:00

36 lines
1.4 KiB
Python

"""Unit tests for the StandardIngester engine (Parser 4)."""
from services.standard_ingester import StandardSpec, build_upload_unit, standard_meta
SPEC = StandardSpec(
source_id="nist_csf_2_0", short="NIST CSF 2.0",
title="NIST Cybersecurity Framework 2.0", publisher="NIST",
url="https://nist.gov/csf", license="public_domain", version_date="2024-02-26",
)
def test_standard_meta_is_best_practice_not_primary():
m = standard_meta(SPEC)
assert m["source_class"] == "technical_standard"
assert m["authority_weight"] == 80
assert m["bindingness"] == "best_practice"
assert m["use_for_primary"] is False
assert m["chunk_scope"] == "standard"
assert m["regulation_short"] == "NIST CSF 2.0"
assert m["issuer"] == "NIST"
assert m["license"] == "public_domain"
def test_build_upload_unit_tags_version_and_collection():
unit = build_upload_unit(SPEC, "A" * 300, "run9")
assert unit.document_version == "run9-nist_csf_2_0"
assert unit.collection == "bp_compliance_ce"
assert unit.filename == "nist_csf_2_0.txt"
assert unit.meta["use_for_primary"] is False
def test_noncommercial_license_is_carried_for_the_gate():
ccm = StandardSpec(source_id="csa_ccm", short="CSA CCM", title="Cloud Controls Matrix",
publisher="CSA", url="https://...", license="CC-BY-NC")
assert standard_meta(ccm)["license"] == "CC-BY-NC" # commercial gate can refuse it