Replaces the M7.2-C static `SCHEDULER_TENANT_IDS` env enumeration with
a live query to the tenant-registry at every tick. New tenants get
picked up without an agent restart; the env stays as the fallback
when the registry is unreachable so the scheduler is never silenced
by a registry outage.
Resolution order
1. agent.config.tenant_registry_url → GET <url>/v1/tenants
   - 5s timeout (kept short: we'd rather fall back than block the
     tick)
   - Frozen and Archived tenants filtered out (the M7.1 status gate
     would 402/410 them anyway, no point scanning their repos)
   - Accepts either {"id"} or {"tenant_id"} for forward compatibility
     with whatever shape the registry settles on
2. SCHEDULER_TENANT_IDS env (comma-separated): fallback when the
   registry URL is unset OR the fetch fails OR the parsed response is
   empty. Each failure mode logs a warn with the url so operators see
   the problem.
3. DEFAULT_SCHEDULER_TENANT_ID ("dev"): last-ditch fallback so a
   bare `cargo run` against a clean Mongo still scans the dev tenant.
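
The resolution order above can be sketched roughly as follows. This is
an illustration, not the code in this PR: `TenantSource`,
`tenants_from_env`, and `resolve_tenants` are hypothetical names, the
HTTP fetch is replaced by a pre-computed `Option<Result<..>>`, and
trimming whitespace around ids is an assumption. The warn lines here
omit the url that the real log messages carry.

```rust
/// Which source produced the tenant list. The string forms mirror the
/// values named in the startup log ("tenant-registry" or "env").
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TenantSource {
    Registry,
    Env,
}

impl TenantSource {
    pub fn as_str(self) -> &'static str {
        match self {
            TenantSource::Registry => "tenant-registry",
            TenantSource::Env => "env",
        }
    }
}

pub const DEFAULT_SCHEDULER_TENANT_ID: &str = "dev";

/// Steps 2 and 3: parse the comma-separated env value, falling back to
/// the default tenant when the variable is unset or yields no ids.
pub fn tenants_from_env(raw: Option<&str>) -> Vec<String> {
    let ids: Vec<String> = raw
        .unwrap_or("")
        .split(',')
        .map(str::trim) // assumption: surrounding whitespace is ignored
        .filter(|id| !id.is_empty())
        .map(str::to_owned)
        .collect();
    if ids.is_empty() {
        vec![DEFAULT_SCHEDULER_TENANT_ID.to_owned()]
    } else {
        ids
    }
}

/// Step 1 with fallback. `registry` is `None` when no URL is configured,
/// `Some(Err(_))` when the fetch failed, and `Some(Ok(ids))` otherwise.
pub fn resolve_tenants(
    registry: Option<Result<Vec<String>, String>>,
    env_raw: Option<&str>,
) -> (Vec<String>, TenantSource) {
    match registry {
        Some(Ok(ids)) if !ids.is_empty() => return (ids, TenantSource::Registry),
        Some(Ok(_)) => eprintln!("warn: tenant-registry returned no active tenants"),
        Some(Err(err)) => eprintln!("warn: tenant-registry fetch failed: {err}"),
        None => {}
    }
    // Any registry miss lands here; the default tenant is reported as "env"
    // in this sketch, which is an assumption about the real log output.
    (tenants_from_env(env_raw), TenantSource::Env)
}
```

Keeping the fetch result as an input makes the ordering testable
without a network or process environment, for example
`resolve_tenants(None, Some("a,b"))` for the "URL unset" path.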
Why each tick instead of caching
- With the default scan_schedule ("0 0 */6 * * *") a tick fires every
six hours, so the registry is called 4 times a day per agent, which
is cheap.
- Caching introduces a staleness window for newly provisioned
tenants. The whole point of registry integration is to pick them
up fast.
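
As a quick check of the call-volume arithmetic, assuming the schedule
string is a six-field, seconds-first cron expression so that `*/6` is
the hour field: the helper below (a hypothetical name, not part of the
agent) lists the hours of the day that a `*/step` hour field matches.

```rust
/// Hours of the day matched by a cron hour field of the form `*/step`.
/// `step` must be non-zero; `step_by(0)` panics.
pub fn matching_hours(step: usize) -> Vec<u32> {
    (0u32..24).step_by(step).collect()
}
```

With `step = 6` the matched hours are 0, 6, 12, and 18, which is where
the figure of 4 registry calls per day per agent comes from.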
Startup log
- Includes "tenant source=tenant-registry" or "tenant source=env" so
operators can tell at a glance which mode the scheduler is in.
Test plan
- cargo fmt --all clean
- cargo clippy -p compliance-agent -- -D warnings clean
- cargo test -p compliance-agent --lib: 232 pass (+3 new):
  * filter_active_keeps_running_skips_frozen_archived
  * deserialize_registry_response_accepts_id_or_tenant_id (covers
    the {"id"|"tenant_id"} alias)
  * tenants_from_env_resolution (single test covering unset →
    default, csv → splits, "" → default; collapsed to one to
    avoid env-var test races)
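
For readers who want the shape of the status filter exercised by the
first new test, a minimal sketch follows. The type and field names
(`TenantStatus`, `RegistryTenant`, `status`) are assumptions made for
illustration; only the Running, Frozen, and Archived states and the
`filter_active` name are taken from this description.

```rust
/// Tenant lifecycle states named in this PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TenantStatus {
    Running,
    Frozen,
    Archived,
}

/// Hypothetical shape of one registry entry after deserialization.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RegistryTenant {
    pub id: String,
    pub status: TenantStatus,
}

/// Keep only tenants worth scanning: drop Frozen and Archived entries
/// and return the remaining ids in their original order.
pub fn filter_active(tenants: Vec<RegistryTenant>) -> Vec<String> {
    tenants
        .into_iter()
        .filter(|t| !matches!(t.status, TenantStatus::Frozen | TenantStatus::Archived))
        .map(|t| t.id)
        .collect()
}
```

Filtering by exclusion rather than matching only `Running` is a design
choice in this sketch: a status added to the registry later would be
scanned by default instead of silently skipped.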
Production
- Set TENANT_REGISTRY_URL in orca-infra alongside KEYCLOAK_URL when
the registry is ready to serve. Until then, scheduler keeps using
SCHEDULER_TENANT_IDS — no operator action needed.
- Future M7.4 cleanup: once tenant-registry adoption is universal,
delete SCHEDULER_TENANT_IDS env support entirely.
Stacked on #95 (admin endpoints) since that PR added
tenant_registry_url to AgentConfig. Once #95 lands this auto-
retargets to main.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>