Compare commits

...

2 Commits

Author SHA1 Message Date
Sharang Parnerkar e20e7f1c6e feat(m7.3): cross-tenant admin HTTP endpoints
CI / Check (pull_request) Successful in 8m4s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Adds two cross-tenant operator endpoints on top of the M7.2-D
DatabasePool primitives:
- GET    /api/v1/admin/tenants              → list tenant DBs
- DELETE /api/v1/admin/tenants/{tenant_id}  → drop (GDPR delete)

Auth is a static bearer (ADMIN_API_TOKEN env), explicitly NOT a
Keycloak JWT — the whole point is to operate across tenants and a
customer JWT always carries a single tenant_id, which would be a
semantic conflict. Comparison is constant-time to avoid byte-level
timing probes.

Design
- ADMIN_API_TOKEN env on the agent. When unset, the admin routes
  aren't mounted at all (404 rather than 401). An operator who
  hasn't opted in can't fingerprint the surface.
- Admin sub-router is built in start_api_server when the token is
  configured, then merged into the main router with its own
  require_admin_token middleware.
- compliance-core::auth gains a PUBLIC_PREFIXES list. Paths under
  /api/v1/admin/ bypass require_jwt_auth so the customer JWT path
  and the admin token path never collide.
- require_tenant_status passes through naturally — admin requests
  carry no TenantContext.

Files
- compliance-core/src/auth.rs — PUBLIC_PREFIXES + prefix-aware skip.
- compliance-core/src/config.rs — admin_api_token + tenant_registry_url
  fields on AgentConfig. tenant_registry_url is added now so the
  scheduler→registry PR doesn't have to bump the config shape again.
- compliance-agent/src/config.rs — env wiring for both.
- compliance-agent/src/api/handlers/admin.rs (new) — list_tenant_dbs,
  drop_tenant_db, require_admin_token middleware, tokens_eq helper
  with a small test.
- compliance-agent/src/api/server.rs — conditional admin sub-router
  + merge.
- Test harness fixtures updated for the two new config fields.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
  -- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 229 pass (+1 new for
  tokens_eq)

Production
- Set ADMIN_API_TOKEN in orca-infra (per-secret, NOT committed) when
  ready to expose these endpoints. Without the env, the routes
  literally don't exist on the binary.
- Long-term: replace the static bearer with a dedicated admin realm
  in Keycloak. Token rotation is just an env change + restart for
  now; revocation responsiveness is zero.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-18 13:02:37 +02:00
sharang 69c4f7bb78 feat(dashboard): proactively refresh expired Keycloak tokens (#91)
CI / Check (push) Has been skipped
CI / Detect Changes (push) Successful in 8s
CI / Deploy Agent (push) Has been skipped
CI / Deploy Dashboard (push) Successful in 2m55s
CI / Deploy Docs (push) Has been skipped
CI / Deploy MCP (push) Has been skipped
2026-06-17 20:01:37 +00:00
9 changed files with 324 additions and 13 deletions
+115
View File
@@ -0,0 +1,115 @@
//! Cross-tenant admin endpoints (`/api/v1/admin/*`).
//!
//! Operator-only. Auth is a **static bearer token** (`ADMIN_API_TOKEN`
//! env on the agent) — explicitly NOT a Keycloak JWT, because the
//! whole point of these endpoints is to operate ACROSS tenants. A
//! customer JWT (which always carries a single tenant_id) has no
//! business mounting them.
//!
//! Routes are only registered when `ADMIN_API_TOKEN` is set. With no
//! token, the endpoints don't exist at all (404), which is a stronger
//! guarantee than "401 if you guess the path".
//!
//! Operations:
//! - `GET /api/v1/admin/tenants` — list tenant DBs
//! - `DELETE /api/v1/admin/tenants/{tenant_id}` — GDPR delete
//!
//! Tenant ids in URLs are passed as-is to `DatabasePool::drop_tenant`,
//! which sanitises them the same way it does for creation. Listing
//! returns the raw DB names from `list_tenant_db_names` — operators
//! can reverse-derive the tenant_id from the prefix.
use axum::extract::{Extension, Path, Request};
use axum::http::{header, StatusCode};
use axum::middleware::Next;
use axum::response::{IntoResponse, Response};
use axum::Json;
use secrecy::ExposeSecret;
use serde::Serialize;
use super::dto::AgentExt;
#[derive(Serialize)]
pub struct ListTenantDbsResponse {
pub tenant_db_names: Vec<String>,
}
#[tracing::instrument(skip_all)]
pub async fn list_tenant_dbs(
Extension(agent): AgentExt,
) -> Result<Json<ListTenantDbsResponse>, StatusCode> {
let names = agent.db_pool.list_tenant_db_names().await.map_err(|e| {
tracing::error!("admin: list_tenant_db_names failed: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
Ok(Json(ListTenantDbsResponse {
tenant_db_names: names,
}))
}
#[tracing::instrument(skip_all, fields(tenant_id = %tenant_id))]
pub async fn drop_tenant_db(
Extension(agent): AgentExt,
Path(tenant_id): Path<String>,
) -> Result<Json<serde_json::Value>, StatusCode> {
agent.db_pool.drop_tenant(&tenant_id).await.map_err(|e| {
tracing::error!("admin: drop_tenant failed: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
Ok(Json(serde_json::json!({ "status": "dropped" })))
}
/// Constant-time-ish comparison of the configured admin token against
/// the incoming bearer. Uses `subtle`-style byte equality so timing
/// attacks can't probe the token character by character.
fn tokens_eq(a: &str, b: &str) -> bool {
if a.len() != b.len() {
return false;
}
let mut diff = 0u8;
for (x, y) in a.bytes().zip(b.bytes()) {
diff |= x ^ y;
}
diff == 0
}
/// Middleware enforcing the static `ADMIN_API_TOKEN`. Mounted only on
/// the admin sub-router, so this never runs on customer routes.
pub async fn require_admin_token(
Extension(agent): AgentExt,
request: Request,
next: Next,
) -> Response {
let Some(expected) = agent.config.admin_api_token.as_ref() else {
// Belt-and-braces — if the routes were somehow mounted without
// a token configured, refuse rather than no-op-pass.
return (StatusCode::NOT_FOUND, "admin disabled").into_response();
};
let presented = request
.headers()
.get(header::AUTHORIZATION)
.and_then(|v| v.to_str().ok())
.and_then(|s| s.strip_prefix("Bearer "))
.map(|s| s.trim());
let Some(presented) = presented.filter(|s| !s.is_empty()) else {
return (StatusCode::UNAUTHORIZED, "Missing bearer token").into_response();
};
if !tokens_eq(presented, expected.expose_secret()) {
return (StatusCode::UNAUTHORIZED, "Invalid admin token").into_response();
}
next.run(request).await
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn tokens_eq_basic() {
assert!(tokens_eq("abc", "abc"));
assert!(!tokens_eq("abc", "abd"));
assert!(!tokens_eq("abc", "abcd"));
assert!(!tokens_eq("", "x"));
assert!(tokens_eq("", ""));
}
}
+1
View File
@@ -1,3 +1,4 @@
pub mod admin;
pub mod chat;
pub mod dast;
pub mod dto;
+24 -1
View File
@@ -4,7 +4,8 @@ use axum::extract::Request;
use axum::http::HeaderValue;
use axum::middleware::Next;
use axum::response::Response;
use axum::{middleware, Extension};
use axum::routing::{delete, get};
use axum::{middleware, Extension, Router};
use tokio::sync::RwLock;
use tower_http::cors::CorsLayer;
use tower_http::set_header::SetResponseHeaderLayer;
@@ -14,6 +15,7 @@ use compliance_core::auth::{require_jwt_auth, require_tenant_status, JwksState};
use compliance_core::{TenantContext, TenantStatus};
use crate::agent::ComplianceAgent;
use crate::api::handlers;
use crate::api::routes;
use crate::error::AgentError;
@@ -50,7 +52,28 @@ pub async fn inject_dev_tenant(mut request: Request, next: Next) -> Response {
}
pub async fn start_api_server(agent: ComplianceAgent, port: u16) -> Result<(), AgentError> {
// Admin sub-router. Routes are only mounted when ADMIN_API_TOKEN is
// configured — without it, the paths don't exist at all (404 rather
// than 401), so an operator who hasn't opted in can't fingerprint
// the surface area.
let admin_router: Router = if agent.config.admin_api_token.is_some() {
tracing::info!("Admin API enabled — /api/v1/admin/* mounted behind ADMIN_API_TOKEN bearer");
Router::new()
.route(
"/api/v1/admin/tenants",
get(handlers::admin::list_tenant_dbs),
)
.route(
"/api/v1/admin/tenants/{tenant_id}",
delete(handlers::admin::drop_tenant_db),
)
.layer(middleware::from_fn(handlers::admin::require_admin_token))
} else {
Router::new()
};
let mut app = routes::build_router()
.merge(admin_router)
.layer(Extension(Arc::new(agent.clone())))
.layer(CorsLayer::permissive())
.layer(TraceLayer::new_for_http())
+2
View File
@@ -59,5 +59,7 @@ pub fn load_config() -> Result<AgentConfig, AgentError> {
.unwrap_or(true),
pentest_imap_username: env_var_opt("PENTEST_IMAP_USERNAME"),
pentest_imap_password: env_secret_opt("PENTEST_IMAP_PASSWORD"),
admin_api_token: env_secret_opt("ADMIN_API_TOKEN"),
tenant_registry_url: env_var_opt("TENANT_REGISTRY_URL"),
})
}
+2
View File
@@ -339,6 +339,8 @@ mod tests {
pentest_imap_tls: true,
pentest_imap_username: None,
pentest_imap_password: None,
admin_api_token: None,
tenant_registry_url: None,
}
}
+2
View File
@@ -66,6 +66,8 @@ impl TestServer {
pentest_imap_tls: false,
pentest_imap_username: None,
pentest_imap_password: None,
admin_api_token: None,
tenant_registry_url: None,
};
let agent = ComplianceAgent::new(config, db_pool);
+12 -4
View File
@@ -63,16 +63,24 @@ struct Claims {
const PUBLIC_ENDPOINTS: &[&str] = &["/api/v1/health"];
/// Path prefixes that bypass JWT validation. The admin sub-router
/// (`/api/v1/admin/*`) has its own static-bearer middleware and must
/// not be routed through the customer-JWT path — a Keycloak token
/// always carries a single tenant_id and would semantically conflict
/// with cross-tenant admin operations.
const PUBLIC_PREFIXES: &[&str] = &["/api/v1/admin/"];
/// Middleware that validates Bearer JWT tokens against Keycloak's JWKS
/// and attaches a `TenantContext` extension on success.
///
/// Skips validation for the health endpoint.
/// If `JwksState` is not present (Keycloak not configured), requests
/// pass through and downstream code must handle the missing context.
/// Skips validation for the health endpoint and any path under one of
/// the [`PUBLIC_PREFIXES`]. If `JwksState` is not present (Keycloak
/// not configured), requests pass through and downstream code must
/// handle the missing context.
pub async fn require_jwt_auth(mut request: Request, next: Next) -> Response {
let path = request.uri().path();
if PUBLIC_ENDPOINTS.contains(&path) {
if PUBLIC_ENDPOINTS.contains(&path) || PUBLIC_PREFIXES.iter().any(|p| path.starts_with(p)) {
return next.run(request).await;
}
+9
View File
@@ -37,6 +37,15 @@ pub struct AgentConfig {
pub pentest_imap_tls: bool,
pub pentest_imap_username: Option<String>,
pub pentest_imap_password: Option<SecretString>,
/// Static bearer for the cross-tenant admin endpoints under
/// `/api/v1/admin/*`. When `None`, those endpoints are not
/// mounted at all (defense-in-depth: ops endpoints never reach
/// any auth path if no operator has explicitly opted in).
pub admin_api_token: Option<SecretString>,
/// Live tenant-registry URL the scheduler consults for the list
/// of tenants to iterate. When `None` or unreachable, scheduler
/// falls back to `SCHEDULER_TENANT_IDS` env (M7.2-C).
pub tenant_registry_url: Option<String>,
}
#[derive(Clone, Debug, Serialize, Deserialize)]
@@ -9,7 +9,16 @@
//! When Keycloak is not configured (dev convenience), the helper
//! returns an unauthenticated builder — matching the agent's
//! pass-through behavior in the same state.
//!
//! **Token refresh**: KC access tokens are short-lived (5 min default
//! in the certifai realm). Before attaching, we decode the JWT's `exp`
//! claim and proactively refresh via the stored refresh_token if the
//! access token is expired or about to expire. The session is updated
//! with the new pair. If refresh fails, we send the (stale) token
//! anyway — the agent's 401 will surface to the UI, which can prompt
//! re-login.
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine};
use dioxus::prelude::ServerFnError;
use dioxus_fullstack::FullstackContext;
use reqwest::Method;
@@ -18,6 +27,11 @@ use super::auth::LOGGED_IN_USER_SESS_KEY;
use super::server_state::ServerState;
use super::user_state::UserStateInner;
/// Seconds before the JWT's `exp` time at which we consider it stale
/// enough to refresh. Covers clock skew + the round-trip to the agent
/// so the token doesn't expire mid-flight.
const REFRESH_SKEW_SECS: i64 = 30;
/// Build a `RequestBuilder` for `<agent_api_url><path>` with the
/// session's access token attached. `path` should include a leading
/// `/`, e.g. `"/api/v1/repositories"`.
@@ -38,10 +52,9 @@ pub async fn agent_get(path: &str) -> Result<reqwest::RequestBuilder, ServerFnEr
}
/// Attach the session's bearer token if Keycloak is configured AND the
/// session has a logged-in user. Otherwise leave the request as-is.
///
/// The Keycloak-disabled path mirrors the dashboard's `require_auth`
/// middleware, which short-circuits when `state.keycloak.is_none()`.
/// session has a logged-in user. Refresh the token proactively if it's
/// expired or about to expire. Persists refreshed tokens back into the
/// session.
async fn attach_token(
req: reqwest::RequestBuilder,
state: &ServerState,
@@ -54,8 +67,144 @@ async fn attach_token(
.get(LOGGED_IN_USER_SESS_KEY)
.await
.map_err(|e| ServerFnError::new(format!("session read failed: {e}")))?;
Ok(match user {
Some(u) => req.bearer_auth(u.access_token),
None => req,
})
let Some(mut user) = user else {
return Ok(req);
};
if token_needs_refresh(&user.access_token) {
tracing::debug!("Access token expired or near-expiring; refreshing");
match refresh_tokens(state, &user.refresh_token).await {
Ok((new_access, new_refresh)) => {
user.access_token = new_access;
if let Some(rt) = new_refresh {
user.refresh_token = rt;
}
if let Err(e) = session.insert(LOGGED_IN_USER_SESS_KEY, &user).await {
tracing::warn!("Failed to persist refreshed tokens: {e}");
}
}
Err(e) => {
tracing::warn!("Token refresh failed: {e}; sending current token anyway");
// Fall through — the agent will 401 and the UI will
// prompt re-login. Better than failing the request at
// the dashboard layer with no helpful UX cue.
}
}
}
Ok(req.bearer_auth(user.access_token))
}
/// Decode the JWT's payload (no signature verification — the agent
/// does that) and check the `exp` claim. Treats malformed tokens as
/// expired so the refresh path runs.
fn token_needs_refresh(jwt: &str) -> bool {
let Some(payload_b64) = jwt.split('.').nth(1) else {
return true;
};
let Ok(bytes) = URL_SAFE_NO_PAD.decode(payload_b64) else {
return true;
};
#[derive(serde::Deserialize)]
struct ExpClaim {
exp: i64,
}
let Ok(claims) = serde_json::from_slice::<ExpClaim>(&bytes) else {
return true;
};
let now = chrono::Utc::now().timestamp();
claims.exp - REFRESH_SKEW_SECS <= now
}
/// Exchange a refresh_token for a new access_token. Returns the new
/// access_token and (optionally) the new refresh_token KC issued.
/// KC may rotate refresh_tokens on use; we honor whatever it sends.
async fn refresh_tokens(
state: &ServerState,
refresh_token: &str,
) -> Result<(String, Option<String>), String> {
let kc = state
.keycloak
.ok_or_else(|| "Keycloak not configured".to_string())?;
if refresh_token.is_empty() {
return Err("no refresh_token in session".to_string());
}
#[derive(serde::Deserialize)]
struct TokenResp {
access_token: String,
refresh_token: Option<String>,
}
let resp = reqwest::Client::new()
.post(kc.token_endpoint())
.form(&[
("grant_type", "refresh_token"),
("client_id", kc.client_id.as_str()),
("refresh_token", refresh_token),
])
.send()
.await
.map_err(|e| format!("refresh request failed: {e}"))?;
if !resp.status().is_success() {
let status = resp.status();
let body = resp.text().await.unwrap_or_default();
return Err(format!("refresh rejected ({status}): {body}"));
}
let r: TokenResp = resp
.json()
.await
.map_err(|e| format!("refresh response parse failed: {e}"))?;
Ok((r.access_token, r.refresh_token))
}
#[cfg(test)]
mod tests {
use super::*;
use base64::Engine;
/// Build a JWT-shaped string (header.payload.sig) with the given
/// payload. Signature is bogus — we never verify it locally.
fn make_jwt(payload: &serde_json::Value) -> String {
let payload_b64 = URL_SAFE_NO_PAD.encode(serde_json::to_vec(payload).unwrap());
format!("hdr.{payload_b64}.sig")
}
#[test]
fn token_needs_refresh_true_when_expired() {
let exp = chrono::Utc::now().timestamp() - 60;
let jwt = make_jwt(&serde_json::json!({ "exp": exp }));
assert!(token_needs_refresh(&jwt));
}
#[test]
fn token_needs_refresh_true_within_skew_window() {
// 10 seconds left; less than the 30s skew → must refresh.
let exp = chrono::Utc::now().timestamp() + 10;
let jwt = make_jwt(&serde_json::json!({ "exp": exp }));
assert!(token_needs_refresh(&jwt));
}
#[test]
fn token_needs_refresh_false_with_plenty_of_life() {
let exp = chrono::Utc::now().timestamp() + 600;
let jwt = make_jwt(&serde_json::json!({ "exp": exp }));
assert!(!token_needs_refresh(&jwt));
}
#[test]
fn token_needs_refresh_true_on_malformed_jwt() {
assert!(token_needs_refresh(""));
assert!(token_needs_refresh("not.a.jwt"));
assert!(token_needs_refresh("only-one-segment"));
assert!(token_needs_refresh("hdr.not-base64!.sig"));
}
#[test]
fn token_needs_refresh_true_when_exp_missing() {
let jwt = make_jwt(&serde_json::json!({ "sub": "abc" }));
assert!(token_needs_refresh(&jwt));
}
}