feat(pipeline): F2+F3 action/object ontology — DB-backed normalization
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-consent (push) Successful in 36s
CI / test-python-voice (push) Successful in 33s
CI / test-bqas (push) Successful in 31s
Migrates ACTION_TYPES (26+8 types), _NEGATIVE_PATTERNS (22), _ACTION_SYNONYMS (65), and _OBJECT_SYNONYMS (75) from hardcoded dicts to DB tables.

- SQL migration: 003_action_object_ontology.sql (3 tables)
- Migration scripts: f2_migrate_actions.py (34 types, 145 synonyms), f3_migrate_objects.py (75 objects)
- OntologyRegistry cache: 5min TTL, raises RuntimeError if empty (safe fallback to dicts)
- control_ontology.classify_action/get_phase delegate to DB with dict fallback
- control_dedup.normalize_action/normalize_object delegate to DB with dict fallback
- 25 new tests, 446 total pass, 0 regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
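The cache behaviour described in the commit message (5 minute TTL, RuntimeError when the tables are empty) could look roughly like the sketch below. The class name `TTLOntologyCache`, the `loader` callable, and the injectable `clock` are assumptions made for illustration; the actual OntologyRegistry is not shown in this diff and may be structured differently.

```python
import time
from typing import Callable, Optional


class TTLOntologyCache:
    """Hypothetical stand-in for the OntologyRegistry synonym cache."""

    def __init__(
        self,
        loader: Callable[[], dict],
        ttl_seconds: float = 300.0,  # 5 min, as stated in the commit message
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader  # assumed to return {synonym: canonical} from the DB
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: dict = {}
        self._loaded_at: Optional[float] = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            rows = self._loader()
            if not rows:
                # Callers catch this and use the hardcoded dicts instead.
                raise RuntimeError("ontology tables are empty")
            self._data = dict(rows)
            self._loaded_at = now

    def normalize_action(self, action: str) -> str:
        self._refresh_if_stale()
        key = action.strip().lower()
        # Unknown verbs pass through unchanged (lowercased and stripped).
        return self._data.get(key, key)
```

Raising on an empty table, rather than caching an empty mapping, is what lets the callers in the diff below treat "DB not migrated yet" the same as "DB unreachable" and fall back to the dicts.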
@@ -126,22 +126,29 @@ _ACTION_SYNONYMS: dict[str, str] = {
 
 
 def normalize_action(action: str) -> str:
-    """Normalize an action verb to a canonical English form."""
+    """Normalize an action verb to a canonical English form.
+
+    Delegates to DB-backed OntologyRegistry with dict fallback.
+    """
+    try:
+        from .ontology_registry import get_ontology_registry
+        return get_ontology_registry().normalize_action(action)
+    except Exception:
+        pass
+
+    # Fallback: original logic
     if not action:
         return ""
     action = action.strip().lower()
     # Strip German infinitive/conjugation suffixes for lookup
     action_base = re.sub(r"(en|t|st|e|te|tet|end)$", "", action)
     # Try exact match first, then base form
     if action in _ACTION_SYNONYMS:
         return _ACTION_SYNONYMS[action]
     if action_base in _ACTION_SYNONYMS:
         return _ACTION_SYNONYMS[action_base]
     # Fuzzy: check if action starts with any known verb
     for verb, canonical in _ACTION_SYNONYMS.items():
         if action.startswith(verb) or verb.startswith(action):
             return canonical
-    return action  # fallback: return as-is
+    return action
 
 
 # ── Object Normalization ─────────────────────────────────────────────
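To make the dict fallback in normalize_action easier to follow, here is a standalone sketch of the same three lookup steps (exact match, suffix-stripped base form, prefix match). The two-entry `SAMPLE_SYNONYMS` table and the function name `normalize_action_fallback` are made up for illustration; the real _ACTION_SYNONYMS has 65 entries according to the commit message and is not reproduced here.

```python
import re

# Hypothetical sample data, not the project's real synonym table.
SAMPLE_SYNONYMS = {
    "implementier": "implement",
    "dokumentieren": "document",
}


def normalize_action_fallback(action: str) -> str:
    """Mirror of the dict-based fallback path shown in the hunk above."""
    if not action:
        return ""
    action = action.strip().lower()
    # Strip one German infinitive/conjugation suffix from the end.
    action_base = re.sub(r"(en|t|st|e|te|tet|end)$", "", action)
    if action in SAMPLE_SYNONYMS:
        return SAMPLE_SYNONYMS[action]
    if action_base in SAMPLE_SYNONYMS:
        return SAMPLE_SYNONYMS[action_base]
    # Prefix match in either direction, first hit in dict order wins.
    for verb, canonical in SAMPLE_SYNONYMS.items():
        if action.startswith(verb) or verb.startswith(action):
            return canonical
    return action


print(normalize_action_fallback("Implementieren"))  # base form "implementier"
print(normalize_action_fallback("dokumentierend"))  # prefix match
print(normalize_action_fallback("archivieren"))     # unknown, returned as-is
```

One property worth knowing: because the prefix check runs in both directions, a very short input such as "d" matches the first synonym key that begins with it, so in this sample it resolves to "document".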
@@ -237,7 +244,19 @@ _OBJECT_KEYS_SORTED = sorted(_OBJECT_SYNONYMS.keys(), key=len, reverse=True)
 
 
 def normalize_object(obj: str) -> str:
-    """Normalize a compliance object to a canonical token."""
+    """Normalize a compliance object to a canonical token.
+
+    Delegates to DB-backed OntologyRegistry with dict fallback.
+    """
+    # Try DB-backed registry first
+    try:
+        from .ontology_registry import get_ontology_registry
+        result = get_ontology_registry().normalize_object(obj)
+        if result != obj.strip().lower():
+            return result
+    except Exception:
+        pass
+
     if not obj:
         return ""
     obj_lower = obj.strip().lower()
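The delegation in normalize_object differs slightly from normalize_action: a registry result equal to the lowercased input is treated as a miss, so the dict path still gets a chance. The sketch below isolates that control flow. The function `normalize_with_fallback` and the two lookup callables are placeholders invented for this example, not names from the codebase.

```python
from typing import Callable


def normalize_with_fallback(
    obj: str,
    registry_lookup: Callable[[str], str],  # stands in for the DB-backed registry
    dict_lookup: Callable[[str], str],      # stands in for the hardcoded dict path
) -> str:
    try:
        result = registry_lookup(obj)
        if result != obj.strip().lower():
            return result  # the registry mapped the input to a canonical token
    except Exception:
        pass  # registry unavailable or empty: continue with the dict path
    if not obj:
        return ""
    return dict_lookup(obj.strip().lower())


# Hypothetical sample data for both sources.
db_rows = {"firewall": "network_firewall"}
legacy = {"passwort": "password"}


def registry(o: str) -> str:
    key = o.strip().lower()
    return db_rows.get(key, key)


def failing_registry(o: str) -> str:
    raise RuntimeError("ontology tables are empty")


def legacy_lookup(key: str) -> str:
    return legacy.get(key, key)


print(normalize_with_fallback("Firewall", registry, legacy_lookup))
print(normalize_with_fallback(" Passwort ", registry, legacy_lookup))
print(normalize_with_fallback("Passwort", failing_registry, legacy_lookup))
```

A side effect of the equality check is that an object whose canonical token is identical to its lowercased input always goes through both lookups, which is harmless as long as the two sources agree.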