feat(pipeline): F2+F3 action/object ontology — DB-backed normalization
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-consent (push) Successful in 36s
CI / test-python-voice (push) Successful in 33s
CI / test-bqas (push) Successful in 31s

Migrates ACTION_TYPES (26+8 types), _NEGATIVE_PATTERNS (22), _ACTION_SYNONYMS
(65), and _OBJECT_SYNONYMS (75) from hardcoded dicts to DB tables.

- SQL migration: 003_action_object_ontology.sql (3 tables)
- Migration scripts: f2_migrate_actions.py (34 types, 145 synonyms), f3_migrate_objects.py (75 objects)
- OntologyRegistry cache: 5min TTL, raises RuntimeError if empty (safe fallback to dicts)
- control_ontology.classify_action/get_phase delegate to DB with dict fallback
- control_dedup.normalize_action/normalize_object delegate to DB with dict fallback
- 25 new tests, 446 total pass, 0 regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Benjamin Admin
2026-05-03 23:47:53 +02:00
parent aab8eeb335
commit 652e3a65a3
7 changed files with 854 additions and 16 deletions
+25 -6
View File
@@ -126,22 +126,29 @@ _ACTION_SYNONYMS: dict[str, str] = {
def normalize_action(action: str) -> str:
"""Normalize an action verb to a canonical English form."""
"""Normalize an action verb to a canonical English form.
Delegates to DB-backed OntologyRegistry with dict fallback.
"""
try:
from .ontology_registry import get_ontology_registry
return get_ontology_registry().normalize_action(action)
except Exception:
pass
# Fallback: original logic
if not action:
return ""
action = action.strip().lower()
# Strip German infinitive/conjugation suffixes for lookup
action_base = re.sub(r"(en|t|st|e|te|tet|end)$", "", action)
# Try exact match first, then base form
if action in _ACTION_SYNONYMS:
return _ACTION_SYNONYMS[action]
if action_base in _ACTION_SYNONYMS:
return _ACTION_SYNONYMS[action_base]
# Fuzzy: check if action starts with any known verb
for verb, canonical in _ACTION_SYNONYMS.items():
if action.startswith(verb) or verb.startswith(action):
return canonical
return action # fallback: return as-is
return action
# ── Object Normalization ─────────────────────────────────────────────
@@ -237,7 +244,19 @@ _OBJECT_KEYS_SORTED = sorted(_OBJECT_SYNONYMS.keys(), key=len, reverse=True)
def normalize_object(obj: str) -> str:
"""Normalize a compliance object to a canonical token."""
"""Normalize a compliance object to a canonical token.
Delegates to DB-backed OntologyRegistry with dict fallback.
"""
# Try DB-backed registry first
try:
from .ontology_registry import get_ontology_registry
result = get_ontology_registry().normalize_object(obj)
if result != obj.strip().lower():
return result
except Exception:
pass
if not obj:
return ""
obj_lower = obj.strip().lower()