Compare commits

2 Commits

Author SHA1 Message Date
Sharang Parnerkar 608611423b feat(m7.3): scheduler pulls tenants from registry, env as fallback
CI / Check (pull_request) Successful in 8m8s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Replaces the M7.2-C static `SCHEDULER_TENANT_IDS` env enumeration with
a live query to the tenant-registry at every tick. New tenants get
picked up without an agent restart; the env stays as the fallback
when the registry is unreachable so the scheduler is never silenced
by a registry outage.

Resolution order
1. agent.config.tenant_registry_url → GET <url>/v1/tenants
   - 5s timeout (kept short — we'd rather fall back than block the
     tick)
   - Frozen and Archived tenants filtered out (the M7.1 status gate
     would 402/410 them anyway, no point scanning their repos)
   - Accepts either {"id"} or {"tenant_id"} for forward compatibility
     with whatever shape the registry settles on
2. SCHEDULER_TENANT_IDS env (comma-separated) — fallback when the
   registry URL is unset OR the fetch fails OR the parsed response is
   empty. Each failure mode logs a warn with the url so operators see
   the problem.
3. DEFAULT_SCHEDULER_TENANT_ID ("dev") — last-ditch fallback so a
   bare `cargo run` against a clean Mongo still scans the dev tenant.
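
The resolution order above can be sketched as a pure function over an already-fetched registry result. This is a minimal illustration, not the agent's actual code: `RegistryTenant`, `TenantStatus::Running`, `TenantSource` and `resolve_tenants` are hypothetical names, the HTTP fetch and its 5s timeout are left out, and treating "empty" as "empty after filtering" is an assumption.

```rust
/// Hypothetical status values; the text only names Frozen and Archived.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TenantStatus {
    Running,
    Frozen,
    Archived,
}

/// Hypothetical shape of one registry entry after deserialization.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RegistryTenant {
    pub id: String,
    pub status: TenantStatus,
}

/// Which step of the resolution order produced the tenant list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TenantSource {
    Registry,
    Env,
    Default,
}

pub const DEFAULT_SCHEDULER_TENANT_ID: &str = "dev";

/// Step 1 filter: drop tenants the status gate would reject anyway.
pub fn filter_active(tenants: Vec<RegistryTenant>) -> Vec<String> {
    tenants
        .into_iter()
        .filter(|t| !matches!(t.status, TenantStatus::Frozen | TenantStatus::Archived))
        .map(|t| t.id)
        .collect()
}

/// `registry` is `None` when no registry URL is configured and
/// `Some(Err(_))` when the fetch or parse failed.
/// `env_ids` is the already-split SCHEDULER_TENANT_IDS value.
pub fn resolve_tenants(
    registry: Option<Result<Vec<RegistryTenant>, String>>,
    env_ids: Vec<String>,
) -> (Vec<String>, TenantSource) {
    // 1. Registry, if it answered with at least one usable tenant.
    //    Assumption: emptiness is checked after the status filter.
    if let Some(Ok(tenants)) = registry {
        let active = filter_active(tenants);
        if !active.is_empty() {
            return (active, TenantSource::Registry);
        }
    }
    // 2. Env fallback (unset URL, failed fetch, or empty response).
    if !env_ids.is_empty() {
        return (env_ids, TenantSource::Env);
    }
    // 3. Last-ditch default.
    (
        vec![DEFAULT_SCHEDULER_TENANT_ID.to_string()],
        TenantSource::Default,
    )
}
```

Returning the source alongside the list keeps the "which mode is the scheduler in" log line a by-product of resolution rather than a second code path.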

Why each tick instead of caching
- Ticks are hours apart (scan_schedule default "0 0 */6 * * *", i.e.
  every 6 hours), so at the default the registry is called 4 times a
  day per agent, which is cheap.
- Caching introduces a staleness window for newly provisioned
  tenants. The whole point of registry integration is to pick them
  up fast.

Startup log
- Includes "tenant source=tenant-registry" or "env" so operators can
  tell at a glance which mode the scheduler is in.

Test plan
- cargo fmt --all clean
- cargo clippy -p compliance-agent -- -D warnings clean
- cargo test -p compliance-agent --lib — 232 pass (+3 new):
    * filter_active_keeps_running_skips_frozen_archived
    * deserialize_registry_response_accepts_id_or_tenant_id (covers
      the {"id"|"tenant_id"} alias)
    * tenants_from_env_resolution (single test covering unset →
      default, csv → splits, "" → default — collapsed to one to
      avoid env-var test races)
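
One way to keep the env cases race-free is to split the parsing from the env read, so the three cases (unset, csv, "") are tested without touching process-global state. The sketch below assumes that split; `parse_tenant_ids` is a hypothetical helper, and trimming whitespace around ids is an assumption the commit does not state.

```rust
use std::env;

pub const DEFAULT_SCHEDULER_TENANT_ID: &str = "dev";

/// Pure parsing step: takes the raw env value (None = unset) and
/// returns the tenant ids, or the default tenant when nothing usable
/// is present. No global state, so tests can run in parallel.
pub fn parse_tenant_ids(raw: Option<&str>) -> Vec<String> {
    let ids: Vec<String> = raw
        .unwrap_or("")
        .split(',')
        .map(str::trim) // assumption: surrounding whitespace is ignored
        .filter(|s| !s.is_empty())
        .map(str::to_owned)
        .collect();
    if ids.is_empty() {
        vec![DEFAULT_SCHEDULER_TENANT_ID.to_owned()]
    } else {
        ids
    }
}

/// Thin wrapper that reads the real variable; only this function
/// depends on the process environment.
pub fn tenants_from_env() -> Vec<String> {
    let raw = env::var("SCHEDULER_TENANT_IDS").ok();
    parse_tenant_ids(raw.as_deref())
}
```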

Production
- Set TENANT_REGISTRY_URL in orca-infra alongside KEYCLOAK_URL when
  the registry is ready to serve. Until then, scheduler keeps using
  SCHEDULER_TENANT_IDS — no operator action needed.
- Future M7.4 cleanup: once tenant-registry adoption is universal,
  delete SCHEDULER_TENANT_IDS env support entirely.

Stacked on #95 (admin endpoints) since that PR added
tenant_registry_url to AgentConfig. Once #95 lands this auto-
retargets to main.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-18 13:07:11 +02:00
Sharang Parnerkar e20e7f1c6e feat(m7.3): cross-tenant admin HTTP endpoints
CI / Check (pull_request) Successful in 8m4s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Adds two cross-tenant operator endpoints on top of the M7.2-D
DatabasePool primitives:
- GET    /api/v1/admin/tenants              → list tenant DBs
- DELETE /api/v1/admin/tenants/{tenant_id}  → drop (GDPR delete)

Auth is a static bearer (ADMIN_API_TOKEN env), explicitly NOT a
Keycloak JWT — the whole point is to operate across tenants and a
customer JWT always carries a single tenant_id, which would be a
semantic conflict. Comparison is constant-time to avoid byte-level
timing probes.
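
A constant-time comparison of this kind can be sketched with an XOR accumulator. This is an illustration only: the body of the real `tokens_eq` is not shown in this commit, `bearer_matches` is a hypothetical helper, and production code would often lean on a vetted crate (for example `subtle`) because a plain loop gives no hard guarantee against compiler optimizations.

```rust
/// Compares two tokens without returning at the first mismatching
/// byte, so the time taken does not reveal how long a matching
/// prefix is.
pub fn tokens_eq(provided: &str, expected: &str) -> bool {
    let a = provided.as_bytes();
    let b = expected.as_bytes();
    // A length mismatch returns early; this reveals only the length,
    // never which bytes matched.
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate every differing bit; no data-dependent branch.
        diff |= x ^ y;
    }
    diff == 0
}

/// Hypothetical helper: checks an `Authorization` header value of the
/// form `Bearer <token>` against the configured admin token.
pub fn bearer_matches(authorization: Option<&str>, expected: &str) -> bool {
    match authorization.and_then(|h| h.strip_prefix("Bearer ")) {
        Some(token) => tokens_eq(token, expected),
        None => false,
    }
}
```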

Design
- ADMIN_API_TOKEN env on the agent. When unset, the admin routes
  aren't mounted at all (404 rather than 401), so a deployment whose
  operator hasn't opted in exposes no admin surface to fingerprint.
- Admin sub-router is built in start_api_server when the token is
  configured, then merged into the main router with its own
  require_admin_token middleware.
- compliance-core::auth gains a PUBLIC_PREFIXES list. Paths under
  /api/v1/admin/ bypass require_jwt_auth so the customer JWT path
  and the admin token path never collide.
- require_tenant_status passes through naturally — admin requests
  carry no TenantContext.
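
The prefix-aware skip can be pictured as a plain prefix check. A minimal sketch, assuming PUBLIC_PREFIXES is a list of string prefixes; `skips_jwt_auth` is a hypothetical name and the list contents beyond /api/v1/admin/ are not known from this commit.

```rust
/// Path prefixes that bypass the customer JWT middleware.
/// "Public" here means "not JWT-guarded": admin paths are still
/// protected by their own token middleware.
pub const PUBLIC_PREFIXES: &[&str] = &["/api/v1/admin/"];

/// Returns true when the JWT middleware should let the request pass
/// through untouched.
pub fn skips_jwt_auth(path: &str) -> bool {
    PUBLIC_PREFIXES.iter().any(|prefix| path.starts_with(*prefix))
}
```

Keeping the trailing slash in the prefix matters in this sketch: without it, an unrelated sibling path such as /api/v1/administrators would also bypass JWT auth.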

Files
- compliance-core/src/auth.rs — PUBLIC_PREFIXES + prefix-aware skip.
- compliance-core/src/config.rs — admin_api_token + tenant_registry_url
  fields on AgentConfig. tenant_registry_url is added now so the
  scheduler→registry PR doesn't have to bump the config shape again.
- compliance-agent/src/config.rs — env wiring for both.
- compliance-agent/src/api/handlers/admin.rs (new) — list_tenant_dbs,
  drop_tenant_db, require_admin_token middleware, tokens_eq helper
  with a small test.
- compliance-agent/src/api/server.rs — conditional admin sub-router
  + merge.
- Test harness fixtures updated for the two new config fields.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
  -- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 229 pass (+1 new for
  tokens_eq)

Production
- Set ADMIN_API_TOKEN in orca-infra (per-secret, NOT committed) when
  ready to expose these endpoints. Without the env, the routes
  literally don't exist on the binary.
- Long-term: replace the static bearer with a dedicated admin realm
  in Keycloak. Token rotation is just an env change + restart for
  now, and a leaked token cannot be revoked any faster than that.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-18 13:02:37 +02:00
+14 -6
@@ -2,6 +2,7 @@ use axum::routing::{delete, get, patch, post};
 use axum::Router;
 use crate::api::handlers;
+use crate::webhooks;
 pub fn build_router() -> Router {
     Router::new()
@@ -174,10 +175,17 @@ pub fn build_router() -> Router {
             "/api/v1/pentest/stats",
             get(handlers::pentest::pentest_stats),
         )
-        // Webhook routes live on the separate webhook server (port 3002,
-        // see crate::webhooks::server). The M7.2-C tenant-in-URL form is
-        // `/webhook/{tenant_id}/{platform}/{repo_id}` and the handlers
-        // expect a (tenant_id, repo_id) path tuple. Anything mounting
-        // them here on the API server would mismatch the handler
-        // signature, so the routes are not exported.
+        // Webhook endpoints (proxied through dashboard)
+        .route(
+            "/webhook/github/{repo_id}",
+            post(webhooks::github::handle_github_webhook),
+        )
+        .route(
+            "/webhook/gitlab/{repo_id}",
+            post(webhooks::gitlab::handle_gitlab_webhook),
+        )
+        .route(
+            "/webhook/gitea/{repo_id}",
+            post(webhooks::gitea::handle_gitea_webhook),
+        )
 }