Author SHA1 Message Date
Sharang Parnerkar e20e7f1c6e feat(m7.3): cross-tenant admin HTTP endpoints
CI / Check (pull_request) Successful in 8m4s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Adds two cross-tenant operator endpoints on top of the M7.2-D
DatabasePool primitives:
- GET    /api/v1/admin/tenants              → list tenant DBs
- DELETE /api/v1/admin/tenants/{tenant_id}  → drop (GDPR delete)

Auth is a static bearer (ADMIN_API_TOKEN env), explicitly NOT a
Keycloak JWT: these endpoints operate across tenants, and a customer
JWT always carries a single tenant_id, which would be a semantic
conflict. The token comparison is constant-time over the token bytes
to avoid byte-level timing probes; a length mismatch still returns
early, so only the token length can leak.
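
A minimal sketch of the comparison described above, using only the
standard library. `tokens_eq` follows the helper added in admin.rs;
`tokens_eq_no_early_exit` is a hypothetical variant, not part of this
commit, that folds the length check into the accumulator so the loop
count depends only on the expected token. Neither is guaranteed to be
constant-time once the optimiser is involved; a vetted crate such as
`subtle` is the usual choice where that guarantee matters.

```rust
/// Same shape as the helper in this commit: XOR every byte pair and
/// OR the results together, so a mismatch never short-circuits the loop.
/// The early return means the token *length* is not protected.
fn tokens_eq(a: &str, b: &str) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.bytes().zip(b.bytes()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Hypothetical variant (an assumption, not in the commit): no early
/// return. The loop always runs `expected.len()` times, whatever the
/// caller presented.
fn tokens_eq_no_early_exit(presented: &str, expected: &str) -> bool {
    let p = presented.as_bytes();
    let e = expected.as_bytes();
    // Starts non-zero when the lengths differ, so the result is false
    // even if every compared byte happens to match.
    let mut diff: u8 = (p.len() != e.len()) as u8;
    for (i, &y) in e.iter().enumerate() {
        // Positions past the end of `presented` compare against 0.
        let x = p.get(i).copied().unwrap_or(0);
        diff |= x ^ y;
    }
    diff == 0
}
```

Both functions agree on every input; they differ only in whether a
wrong-length guess returns before the byte loop.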

Design
- ADMIN_API_TOKEN env on the agent. When unset, the admin routes
  aren't mounted at all (404 rather than 401), so a deployment whose
  operator hasn't opted in exposes no admin surface to fingerprint.
- Admin sub-router is built in start_api_server when the token is
  configured, then merged into the main router with its own
  require_admin_token middleware.
- compliance-core::auth gains a PUBLIC_PREFIXES list. Paths under
  /api/v1/admin/ bypass require_jwt_auth so the customer JWT path
  and the admin token path never collide.
- require_tenant_status passes through naturally — admin requests
  carry no TenantContext.
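
The prefix-aware skip in the Design list can be sketched as a pure
function over the request path. The two constants are copied from
compliance-core/src/auth.rs in this commit; the function name
`skips_jwt` is hypothetical and exists only for illustration (the real
check is inlined in require_jwt_auth).

```rust
/// Exact paths that bypass JWT validation.
const PUBLIC_ENDPOINTS: &[&str] = &["/api/v1/health"];
/// Path prefixes that bypass JWT validation (admin sub-router).
const PUBLIC_PREFIXES: &[&str] = &["/api/v1/admin/"];

/// Hypothetical helper: true when the customer-JWT middleware should
/// pass the request through without validating a token.
fn skips_jwt(path: &str) -> bool {
    PUBLIC_ENDPOINTS.contains(&path) || PUBLIC_PREFIXES.iter().any(|p| path.starts_with(p))
}
```

Because the prefix ends in a slash, a path such as
`/api/v1/administrators` or a bare `/api/v1/admin` does not match and
still goes through JWT validation.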

Files
- compliance-core/src/auth.rs — PUBLIC_PREFIXES + prefix-aware skip.
- compliance-core/src/config.rs — admin_api_token + tenant_registry_url
  fields on AgentConfig. tenant_registry_url is added now so the
  scheduler→registry PR doesn't have to bump the config shape again.
- compliance-agent/src/config.rs — env wiring for both.
- compliance-agent/src/api/handlers/admin.rs (new) — list_tenant_dbs,
  drop_tenant_db, require_admin_token middleware, tokens_eq helper
  with a small test.
- compliance-agent/src/api/server.rs — conditional admin sub-router
  + merge.
- Test harness fixtures updated for the two new config fields.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
  -- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 229 pass (+1 new for
  tokens_eq)

Production
- Set ADMIN_API_TOKEN in orca-infra (per-secret, NOT committed) when
  ready to expose these endpoints. Without the env, the routes
  literally don't exist on the binary.
- Long-term: replace the static bearer with a dedicated admin realm
  in Keycloak. For now, rotating the token means changing the env and
  restarting the agent; there is no way to revoke a token without
  that restart.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-18 13:02:37 +02:00
21 changed files with 202 additions and 627 deletions
Generated
-4
@@ -676,7 +676,6 @@ dependencies = [
"jsonwebtoken",
"mongodb",
"octocrab",
"rand 0.9.2",
"regex",
"reqwest",
"secrecy",
@@ -819,15 +818,12 @@ dependencies = [
"bson",
"chrono",
"compliance-core",
"dashmap",
"dotenvy",
"hex",
"mongodb",
"rmcp",
"schemars 1.2.1",
"serde",
"serde_json",
"sha2",
"thiserror 2.0.18",
"tokio",
"tower-http",
-2
@@ -34,5 +34,3 @@ zip = { version = "2", features = ["aes-crypto", "deflate"] }
dashmap = "6"
tokio-stream = { version = "0.1", features = ["sync"] }
aes-gcm = "0.10"
rand = "0.9"
base64 = "0.22"
-1
@@ -42,7 +42,6 @@ tokio-tungstenite = { version = "0.26", features = ["rustls-tls-webpki-roots"] }
futures-core = "0.3"
dashmap = { workspace = true }
tokio-stream = { workspace = true }
rand = { workspace = true }
[dev-dependencies]
compliance-core = { workspace = true, features = ["mongodb", "axum"] }
+115
@@ -0,0 +1,115 @@
//! Cross-tenant admin endpoints (`/api/v1/admin/*`).
//!
//! Operator-only. Auth is a **static bearer token** (`ADMIN_API_TOKEN`
//! env on the agent) — explicitly NOT a Keycloak JWT, because the
//! whole point of these endpoints is to operate ACROSS tenants. A
//! customer JWT (which always carries a single tenant_id) has no
//! business mounting them.
//!
//! Routes are only registered when `ADMIN_API_TOKEN` is set. With no
//! token, the endpoints don't exist at all (404), which is a stronger
//! guarantee than "401 if you guess the path".
//!
//! Operations:
//! - `GET /api/v1/admin/tenants` — list tenant DBs
//! - `DELETE /api/v1/admin/tenants/{tenant_id}` — GDPR delete
//!
//! Tenant ids in URLs are passed as-is to `DatabasePool::drop_tenant`,
//! which sanitises them the same way it does for creation. Listing
//! returns the raw DB names from `list_tenant_db_names` — operators
//! can reverse-derive the tenant_id from the prefix.
use axum::extract::{Extension, Path, Request};
use axum::http::{header, StatusCode};
use axum::middleware::Next;
use axum::response::{IntoResponse, Response};
use axum::Json;
use secrecy::ExposeSecret;
use serde::Serialize;
use super::dto::AgentExt;
#[derive(Serialize)]
pub struct ListTenantDbsResponse {
pub tenant_db_names: Vec<String>,
}
#[tracing::instrument(skip_all)]
pub async fn list_tenant_dbs(
Extension(agent): AgentExt,
) -> Result<Json<ListTenantDbsResponse>, StatusCode> {
let names = agent.db_pool.list_tenant_db_names().await.map_err(|e| {
tracing::error!("admin: list_tenant_db_names failed: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
Ok(Json(ListTenantDbsResponse {
tenant_db_names: names,
}))
}
#[tracing::instrument(skip_all, fields(tenant_id = %tenant_id))]
pub async fn drop_tenant_db(
Extension(agent): AgentExt,
Path(tenant_id): Path<String>,
) -> Result<Json<serde_json::Value>, StatusCode> {
agent.db_pool.drop_tenant(&tenant_id).await.map_err(|e| {
tracing::error!("admin: drop_tenant failed: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
Ok(Json(serde_json::json!({ "status": "dropped" })))
}
/// Constant-time-ish comparison of the configured admin token against
/// the incoming bearer. Uses `subtle`-style byte equality so timing
/// attacks can't probe the token byte by byte. The early return on a
/// length mismatch means the token length itself is not protected.
fn tokens_eq(a: &str, b: &str) -> bool {
if a.len() != b.len() {
return false;
}
let mut diff = 0u8;
for (x, y) in a.bytes().zip(b.bytes()) {
diff |= x ^ y;
}
diff == 0
}
/// Middleware enforcing the static `ADMIN_API_TOKEN`. Mounted only on
/// the admin sub-router, so this never runs on customer routes.
pub async fn require_admin_token(
Extension(agent): AgentExt,
request: Request,
next: Next,
) -> Response {
let Some(expected) = agent.config.admin_api_token.as_ref() else {
// Belt-and-braces — if the routes were somehow mounted without
// a token configured, refuse rather than no-op-pass.
return (StatusCode::NOT_FOUND, "admin disabled").into_response();
};
let presented = request
.headers()
.get(header::AUTHORIZATION)
.and_then(|v| v.to_str().ok())
.and_then(|s| s.strip_prefix("Bearer "))
.map(|s| s.trim());
let Some(presented) = presented.filter(|s| !s.is_empty()) else {
return (StatusCode::UNAUTHORIZED, "Missing bearer token").into_response();
};
if !tokens_eq(presented, expected.expose_secret()) {
return (StatusCode::UNAUTHORIZED, "Invalid admin token").into_response();
}
next.run(request).await
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn tokens_eq_basic() {
assert!(tokens_eq("abc", "abc"));
assert!(!tokens_eq("abc", "abd"));
assert!(!tokens_eq("abc", "abcd"));
assert!(!tokens_eq("", "x"));
assert!(tokens_eq("", ""));
}
}
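
The decision logic of require_admin_token above can be sketched
without axum as a pure function from (configured token, Authorization
header) to an outcome. `AdminDecision`, `bearer` and `decide` are
hypothetical names introduced for this sketch; the real middleware
works on `Request`/`Response` and a `SecretString`.

```rust
/// Hypothetical outcome type standing in for the HTTP responses.
#[derive(Debug, PartialEq)]
enum AdminDecision {
    NotFound,     // no token configured: behave as if the route is absent
    Unauthorized, // missing, empty, or wrong bearer
    Allow,        // bearer matches the configured token
}

/// Byte-wise comparison with the same shape as the commit's helper.
fn tokens_eq(a: &str, b: &str) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.bytes().zip(b.bytes()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Extract the token from an `Authorization` header value, if any.
/// The scheme match is case-sensitive, as in the middleware.
fn bearer(header: Option<&str>) -> Option<&str> {
    header
        .and_then(|s| s.strip_prefix("Bearer "))
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
}

/// Mirror the order of checks in the middleware: configuration first,
/// then presence of a bearer, then the comparison.
fn decide(expected: Option<&str>, authorization: Option<&str>) -> AdminDecision {
    let Some(expected) = expected else {
        return AdminDecision::NotFound;
    };
    let Some(presented) = bearer(authorization) else {
        return AdminDecision::Unauthorized;
    };
    if tokens_eq(presented, expected) {
        AdminDecision::Allow
    } else {
        AdminDecision::Unauthorized
    }
}
```

Keeping the decision separate from the HTTP types would make the
401/404 split testable without spinning up a router; whether that is
worth the extra indirection here is a judgment call.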
@@ -1,186 +0,0 @@
//! `/api/v1/mcp-tokens` — per-tenant API tokens for the MCP server.
//!
//! These are opaque static bearers issued via the dashboard (or a
//! direct curl with a KC JWT) and copied into LLM clients (Claude
//! Desktop / Cursor / ChatGPT). The MCP server hashes incoming bearers
//! and looks them up in the cross-tenant `<prefix>__admin.mcp_tokens`
//! collection to derive the tenant_id for routing.
//!
//! The raw token is shown to the caller exactly once at creation; the
//! database only ever stores the SHA-256 hash. Revocation is a soft
//! delete (sets `revoked: true`) so the audit log keeps the record.
use axum::extract::{Extension, Path};
use axum::http::StatusCode;
use axum::Json;
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine as _};
use compliance_core::models::{McpToken, McpTokenView};
use compliance_core::tenant_ctx::TenantCtx;
use mongodb::bson::doc;
use rand::RngCore;
use sha2::{Digest, Sha256};
use super::dto::{AgentExt, ApiResponse};
/// Mongo collection name inside the admin DB.
const COLLECTION: &str = "mcp_tokens";
/// Token prefix the MCP server expects on every bearer.
const TOKEN_PREFIX: &str = "mcpt_";
/// Bytes of randomness behind each token. 32 → ~256 bits.
/// Encoded as URL-safe base64 without padding → 43 chars.
/// Combined with `mcpt_` → 48-char tokens.
const TOKEN_RAND_BYTES: usize = 32;
#[derive(serde::Deserialize)]
pub struct CreateMcpTokenRequest {
pub name: String,
}
/// Returned exactly once at creation. The `token` field is gone from
/// the listing endpoint — the user must save it now.
#[derive(serde::Serialize)]
pub struct CreateMcpTokenResponse {
pub token: String,
pub view: McpTokenView,
}
/// `POST /api/v1/mcp-tokens` — mint a new token for the caller's tenant.
#[tracing::instrument(skip_all)]
pub async fn create_mcp_token(
Extension(agent): AgentExt,
tenant: TenantCtx,
Json(req): Json<CreateMcpTokenRequest>,
) -> Result<Json<CreateMcpTokenResponse>, StatusCode> {
if req.name.trim().is_empty() {
return Err(StatusCode::BAD_REQUEST);
}
let raw = generate_token();
let token_hash = sha256_hex(&raw);
let token_prefix: String = raw.chars().take(12).collect();
let mut token = McpToken {
id: None,
token_hash,
token_prefix,
tenant_id: tenant.0.tenant_id.clone(),
name: req.name.trim().to_string(),
created_by: tenant.0.user_id.clone(),
created_at: chrono::Utc::now(),
last_used_at: None,
revoked: false,
};
let col = agent.db_pool.admin_db().collection::<McpToken>(COLLECTION);
let res = col.insert_one(&token).await.map_err(|e| {
tracing::error!("Failed to insert MCP token: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
token.id = res.inserted_id.as_object_id();
Ok(Json(CreateMcpTokenResponse {
view: McpTokenView::from(&token),
token: raw,
}))
}
/// `GET /api/v1/mcp-tokens` — list tokens for the caller's tenant.
/// Hash is never returned; only metadata + the 12-char prefix so the
/// user can identify which row is which.
#[tracing::instrument(skip_all)]
pub async fn list_mcp_tokens(
Extension(agent): AgentExt,
tenant: TenantCtx,
) -> Result<Json<ApiResponse<Vec<McpTokenView>>>, StatusCode> {
let col = agent.db_pool.admin_db().collection::<McpToken>(COLLECTION);
let mut cursor = col
.find(doc! { "tenant_id": &tenant.0.tenant_id })
.sort(doc! { "created_at": -1 })
.await
.map_err(|e| {
tracing::error!("Failed to list MCP tokens: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
let mut out = Vec::new();
while cursor.advance().await.map_err(|e| {
tracing::warn!("MCP tokens cursor advance failed: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})? {
match cursor.deserialize_current() {
Ok(t) => out.push(McpTokenView::from(&t)),
Err(e) => tracing::warn!("Failed to deserialize MCP token: {e}"),
}
}
Ok(Json(ApiResponse {
data: out,
total: None,
page: None,
}))
}
/// `DELETE /api/v1/mcp-tokens/{id}` — revoke (soft delete).
/// Scoped to the caller's tenant: a user can't revoke another tenant's
/// token even if they guess its id.
#[tracing::instrument(skip_all, fields(id = %id))]
pub async fn revoke_mcp_token(
Extension(agent): AgentExt,
tenant: TenantCtx,
Path(id): Path<String>,
) -> Result<Json<serde_json::Value>, StatusCode> {
let oid = mongodb::bson::oid::ObjectId::parse_str(&id).map_err(|_| StatusCode::BAD_REQUEST)?;
let col = agent.db_pool.admin_db().collection::<McpToken>(COLLECTION);
let result = col
.update_one(
doc! { "_id": oid, "tenant_id": &tenant.0.tenant_id },
doc! { "$set": { "revoked": true } },
)
.await
.map_err(|e| {
tracing::error!("Failed to revoke MCP token: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
if result.matched_count == 0 {
return Err(StatusCode::NOT_FOUND);
}
Ok(Json(serde_json::json!({ "status": "revoked" })))
}
/// 32 bytes random → URL-safe base64 → 43 chars, no padding.
/// Prefixed with `mcpt_` so the MCP server can sniff the format
/// before bothering with the DB lookup.
fn generate_token() -> String {
let mut bytes = [0u8; TOKEN_RAND_BYTES];
rand::rng().fill_bytes(&mut bytes);
format!("{TOKEN_PREFIX}{}", URL_SAFE_NO_PAD.encode(bytes))
}
fn sha256_hex(s: &str) -> String {
let mut h = Sha256::new();
h.update(s.as_bytes());
hex::encode(h.finalize())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn generated_tokens_are_unique_and_prefixed() {
let a = generate_token();
let b = generate_token();
assert_ne!(a, b);
assert!(a.starts_with(TOKEN_PREFIX));
assert!(b.starts_with(TOKEN_PREFIX));
// 5 + 43 = 48 chars
assert_eq!(a.len(), 5 + 43);
}
#[test]
fn sha256_is_stable_and_64_hex() {
let h = sha256_hex("mcpt_abc");
assert_eq!(h.len(), 64);
assert!(h.chars().all(|c| c.is_ascii_hexdigit()));
assert_eq!(sha256_hex("mcpt_abc"), h);
}
}
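
The token-length arithmetic in the removed file's comments (32 random
bytes, 43 base64 characters, 48 characters with the prefix) can be
checked with a small standard-library sketch. The two constants are
copied from the file above; `base64_no_pad_len` and `token_len` are
hypothetical helpers that only compute lengths and do not encode
anything.

```rust
const TOKEN_PREFIX: &str = "mcpt_";
const TOKEN_RAND_BYTES: usize = 32;

/// Length of unpadded base64 for `n` input bytes: each output
/// character carries 6 bits, so the length is ceil(8n / 6).
fn base64_no_pad_len(n: usize) -> usize {
    (n * 8 + 5) / 6
}

/// Total length of a token: prefix plus the encoded random bytes.
fn token_len() -> usize {
    TOKEN_PREFIX.len() + base64_no_pad_len(TOKEN_RAND_BYTES)
}
```

For 32 bytes this gives 256 bits / 6 = 42.67, rounded up to 43
characters, and 5 + 43 = 48 for the full token, matching the
assertion in `generated_tokens_are_unique_and_prefixed`.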
+1 -1
@@ -1,3 +1,4 @@
pub mod admin;
pub mod chat;
pub mod dast;
pub mod dto;
@@ -6,7 +7,6 @@ pub mod graph;
pub mod health;
pub mod help_chat;
pub mod issues;
pub mod mcp_tokens;
pub mod notifications;
pub mod pentest_handlers;
pub use pentest_handlers as pentest;
-9
@@ -47,15 +47,6 @@ pub fn build_router() -> Router {
.route("/api/v1/sbom/diff", get(handlers::sbom_diff))
.route("/api/v1/issues", get(handlers::list_issues))
.route("/api/v1/scan-runs", get(handlers::list_scan_runs))
// MCP token management (per-tenant API tokens for the MCP server)
.route(
"/api/v1/mcp-tokens",
get(handlers::mcp_tokens::list_mcp_tokens).post(handlers::mcp_tokens::create_mcp_token),
)
.route(
"/api/v1/mcp-tokens/{id}",
delete(handlers::mcp_tokens::revoke_mcp_token),
)
// Graph API endpoints
.route("/api/v1/graph/{repo_id}", get(handlers::graph::get_graph))
.route(
+24 -1
@@ -4,7 +4,8 @@ use axum::extract::Request;
use axum::http::HeaderValue;
use axum::middleware::Next;
use axum::response::Response;
use axum::{middleware, Extension};
use axum::routing::{delete, get};
use axum::{middleware, Extension, Router};
use tokio::sync::RwLock;
use tower_http::cors::CorsLayer;
use tower_http::set_header::SetResponseHeaderLayer;
@@ -14,6 +15,7 @@ use compliance_core::auth::{require_jwt_auth, require_tenant_status, JwksState};
use compliance_core::{TenantContext, TenantStatus};
use crate::agent::ComplianceAgent;
use crate::api::handlers;
use crate::api::routes;
use crate::error::AgentError;
@@ -50,7 +52,28 @@ pub async fn inject_dev_tenant(mut request: Request, next: Next) -> Response {
}
pub async fn start_api_server(agent: ComplianceAgent, port: u16) -> Result<(), AgentError> {
// Admin sub-router. Routes are only mounted when ADMIN_API_TOKEN is
// configured. Without it, the paths don't exist at all (404 rather
// than 401), so a deployment whose operator hasn't opted in exposes
// no admin surface area to fingerprint.
let admin_router: Router = if agent.config.admin_api_token.is_some() {
tracing::info!("Admin API enabled — /api/v1/admin/* mounted behind ADMIN_API_TOKEN bearer");
Router::new()
.route(
"/api/v1/admin/tenants",
get(handlers::admin::list_tenant_dbs),
)
.route(
"/api/v1/admin/tenants/{tenant_id}",
delete(handlers::admin::drop_tenant_db),
)
.layer(middleware::from_fn(handlers::admin::require_admin_token))
} else {
Router::new()
};
let mut app = routes::build_router()
.merge(admin_router)
.layer(Extension(Arc::new(agent.clone())))
.layer(CorsLayer::permissive())
.layer(TraceLayer::new_for_http())
+2
@@ -59,5 +59,7 @@ pub fn load_config() -> Result<AgentConfig, AgentError> {
.unwrap_or(true),
pentest_imap_username: env_var_opt("PENTEST_IMAP_USERNAME"),
pentest_imap_password: env_secret_opt("PENTEST_IMAP_PASSWORD"),
admin_api_token: env_secret_opt("ADMIN_API_TOKEN"),
tenant_registry_url: env_var_opt("TENANT_REGISTRY_URL"),
})
}
-19
@@ -141,25 +141,6 @@ impl DatabasePool {
&self.client
}
/// Cross-tenant admin database used by features that intentionally
/// span tenants (today: MCP bearer tokens — each token row carries
/// a `tenant_id` and the MCP server reads them to route requests).
///
/// The name `<db_prefix>__admin` (double underscore) is reserved —
/// the sanitizer never produces it for a normal tenant DB because
/// the natural format is `<db_prefix>_<sanitized_tenant_id>` (one
/// underscore) and tenant_ids would have to start with `_admin` to
/// collide. New tenant provisioning should reject such ids.
pub fn admin_db(&self) -> mongodb::Database {
self.client.database(&self.admin_db_name())
}
/// Name of the admin database — public so tests / operators can
/// drop it via the raw client.
pub fn admin_db_name(&self) -> String {
format!("{}__admin", self.db_prefix)
}
/// List every Mongo database currently belonging to this pool,
/// identified by the `<db_prefix>_` prefix. The result is the raw
/// database names — opening one for offboarding/cleanup goes
+2
@@ -339,6 +339,8 @@ mod tests {
pentest_imap_tls: true,
pentest_imap_username: None,
pentest_imap_password: None,
admin_api_token: None,
tenant_registry_url: None,
}
}
+2
@@ -66,6 +66,8 @@ impl TestServer {
pentest_imap_tls: false,
pentest_imap_username: None,
pentest_imap_password: None,
admin_api_token: None,
tenant_registry_url: None,
};
let agent = ComplianceAgent::new(config, db_pool);
+12 -4
@@ -63,16 +63,24 @@ struct Claims {
const PUBLIC_ENDPOINTS: &[&str] = &["/api/v1/health"];
/// Path prefixes that bypass JWT validation. The admin sub-router
/// (`/api/v1/admin/*`) has its own static-bearer middleware and must
/// not be routed through the customer-JWT path — a Keycloak token
/// always carries a single tenant_id and would semantically conflict
/// with cross-tenant admin operations.
const PUBLIC_PREFIXES: &[&str] = &["/api/v1/admin/"];
/// Middleware that validates Bearer JWT tokens against Keycloak's JWKS
/// and attaches a `TenantContext` extension on success.
///
/// Skips validation for the health endpoint.
/// If `JwksState` is not present (Keycloak not configured), requests
/// pass through and downstream code must handle the missing context.
/// Skips validation for the health endpoint and any path under one of
/// the [`PUBLIC_PREFIXES`]. If `JwksState` is not present (Keycloak
/// not configured), requests pass through and downstream code must
/// handle the missing context.
pub async fn require_jwt_auth(mut request: Request, next: Next) -> Response {
let path = request.uri().path();
if PUBLIC_ENDPOINTS.contains(&path) {
if PUBLIC_ENDPOINTS.contains(&path) || PUBLIC_PREFIXES.iter().any(|p| path.starts_with(p)) {
return next.run(request).await;
}
+9
@@ -37,6 +37,15 @@ pub struct AgentConfig {
pub pentest_imap_tls: bool,
pub pentest_imap_username: Option<String>,
pub pentest_imap_password: Option<SecretString>,
/// Static bearer for the cross-tenant admin endpoints under
/// `/api/v1/admin/*`. When `None`, those endpoints are not
/// mounted at all (defense-in-depth: ops endpoints never reach
/// any auth path if no operator has explicitly opted in).
pub admin_api_token: Option<SecretString>,
/// Live tenant-registry URL the scheduler consults for the list
/// of tenants to iterate. When `None` or unreachable, scheduler
/// falls back to `SCHEDULER_TENANT_IDS` env (M7.2-C).
pub tenant_registry_url: Option<String>,
}
#[derive(Clone, Debug, Serialize, Deserialize)]
-69
@@ -1,69 +0,0 @@
//! Per-tenant API tokens used by `compliance-mcp` to authenticate MCP
//! HTTP requests on behalf of LLM clients (Claude Desktop, Cursor,
//! ChatGPT, etc.) that can't run a Keycloak OIDC flow.
//!
//! Tokens are opaque strings of the form `mcpt_<43 url-safe random
//! chars>`. The raw value is shown to the user exactly once at
//! creation; the database only ever sees the SHA-256 hash. Lookups go
//! through the cross-tenant `<prefix>__admin.mcp_tokens` collection
//! and return the `tenant_id` the MCP server should route to.
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
/// Persisted token metadata. `token_hash` is the SHA-256 hex of the
/// raw token; the raw token itself is never stored.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct McpToken {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
/// SHA-256 hex of the raw token. Unique index in the collection.
pub token_hash: String,
/// First 12 chars of the raw token — purely for UI display so users
/// can identify which token is which without re-issuing.
pub token_prefix: String,
/// Routes to `<db_prefix>_<tenant_id>` on MCP requests.
pub tenant_id: String,
/// User-given label, e.g. "Claude Desktop" or "Sharang's laptop".
pub name: String,
/// Keycloak `sub` of the user who created this token, for audit.
pub created_by: String,
#[serde(with = "super::serde_helpers::bson_datetime")]
pub created_at: DateTime<Utc>,
#[serde(default, with = "super::serde_helpers::opt_bson_datetime")]
pub last_used_at: Option<DateTime<Utc>>,
/// Soft-delete flag. A revoked token doc stays around for audit
/// but never authenticates.
#[serde(default)]
pub revoked: bool,
}
/// Public projection of a token — never includes the hash.
/// Returned by `GET /api/v1/mcp-tokens`.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct McpTokenView {
pub id: String,
pub name: String,
/// `mcpt_xxxx…` so the user can identify which row is which.
pub token_prefix: String,
pub created_by: String,
#[serde(with = "super::serde_helpers::bson_datetime")]
pub created_at: DateTime<Utc>,
#[serde(default, with = "super::serde_helpers::opt_bson_datetime")]
pub last_used_at: Option<DateTime<Utc>>,
pub revoked: bool,
}
impl From<&McpToken> for McpTokenView {
fn from(t: &McpToken) -> Self {
Self {
id: t.id.map(|o| o.to_hex()).unwrap_or_default(),
name: t.name.clone(),
token_prefix: t.token_prefix.clone(),
created_by: t.created_by.clone(),
created_at: t.created_at,
last_used_at: t.last_used_at,
revoked: t.revoked,
}
}
}
-2
@@ -7,7 +7,6 @@ pub mod finding;
pub mod graph;
pub mod issue;
pub mod mcp;
pub mod mcp_token;
pub mod notification;
pub mod pentest;
pub mod repository;
@@ -29,7 +28,6 @@ pub use graph::{
};
pub use issue::{IssueStatus, TrackerIssue, TrackerType};
pub use mcp::{McpServerConfig, McpServerStatus, McpTransport};
pub use mcp_token::{McpToken, McpTokenView};
pub use notification::{CveNotification, NotificationSeverity, NotificationStatus};
pub use pentest::{
AttackChainNode, AttackNodeStatus, AuthMode, CodeContextHint, Environment, IdentityProvider,
+1 -4
@@ -4,7 +4,7 @@ version = "0.1.0"
edition = "2021"
[dependencies]
compliance-core = { workspace = true, features = ["mongodb", "axum"] }
compliance-core = { workspace = true, features = ["mongodb"] }
rmcp = { version = "0.16", features = ["server", "macros", "transport-io", "transport-streamable-http-server"] }
tokio = { workspace = true }
serde = { workspace = true }
@@ -19,6 +19,3 @@ bson = { version = "2", features = ["chrono-0_4"] }
schemars = "1.0"
axum = "0.8"
tower-http = { version = "0.6", features = ["cors"] }
sha2 = { workspace = true }
hex = { workspace = true }
dashmap = { workspace = true }
-129
@@ -1,129 +0,0 @@
//! Bearer-token authentication for incoming MCP HTTP requests.
//!
//! LLM clients (Claude Desktop / Cursor / ChatGPT / etc.) can't run
//! Keycloak OIDC, so the MCP server uses opaque static tokens minted
//! per-tenant via the agent's `POST /api/v1/mcp-tokens` endpoint.
//!
//! Flow per request:
//! 1. Extract `Authorization: Bearer <token>`. Missing → 401.
//! 2. SHA-256 hash the token.
//! 3. Look up the hash in `<prefix>__admin.mcp_tokens`. Missing or
//! revoked → 401.
//! 4. Fire-and-forget update of `last_used_at` so the dashboard can
//! show staleness without blocking the handler.
//! 5. Stash the tenant_id in [`TENANT_ID`] (a `tokio::task_local`) so
//! the MCP tool handlers can read it without modifying rmcp's
//! handler signatures.
//!
//! The `task_local` is scoped around the inner service call via
//! [`bearer_auth`], so every handler invoked downstream sees the
//! tenant_id without us having to thread it through the macro-
//! generated tool router.
use axum::body::Body;
use axum::extract::{Request, State};
use axum::http::StatusCode;
use axum::middleware::Next;
use axum::response::{IntoResponse, Response};
use mongodb::bson::doc;
use sha2::{Digest, Sha256};
use crate::database::DatabasePool;
tokio::task_local! {
/// Tenant id resolved from the bearer for this request. Set by
/// [`bearer_auth`] before the inner service runs; read by the
/// MCP tool handlers via [`current_tenant_id`].
pub static TENANT_ID: String;
}
/// Mongo collection name in `<prefix>__admin`.
const COLLECTION: &str = "mcp_tokens";
/// Returns the tenant_id set by the auth middleware. `None` outside a
/// request scope (e.g. unit tests that bypass the middleware).
pub fn current_tenant_id() -> Option<String> {
TENANT_ID.try_with(|s| s.clone()).ok()
}
/// Axum middleware: validate bearer → set [`TENANT_ID`] → call inner.
pub async fn bearer_auth(
State(pool): State<DatabasePool>,
request: Request,
next: Next,
) -> Response {
let Some(token) = extract_bearer(&request) else {
return (StatusCode::UNAUTHORIZED, "Missing bearer token").into_response();
};
if !token.starts_with("mcpt_") {
return (StatusCode::UNAUTHORIZED, "Invalid token format").into_response();
}
let token_hash = sha256_hex(&token);
let col = pool.admin_db().collection::<TokenLookup>(COLLECTION);
let found = match col
.find_one(doc! { "token_hash": &token_hash, "revoked": false })
.await
{
Ok(Some(t)) => t,
Ok(None) => {
return (StatusCode::UNAUTHORIZED, "Invalid or revoked token").into_response();
}
Err(e) => {
tracing::error!("MCP token lookup failed: {e}");
return (StatusCode::INTERNAL_SERVER_ERROR, "Token lookup error").into_response();
}
};
// Fire-and-forget last_used_at update — never block the handler.
let col2 = pool.admin_db().collection::<TokenLookup>(COLLECTION);
let hash_for_update = token_hash.clone();
tokio::spawn(async move {
let _ = col2
.update_one(
doc! { "token_hash": &hash_for_update },
doc! { "$set": { "last_used_at": mongodb::bson::DateTime::now() } },
)
.await;
});
let tenant_id = found.tenant_id;
let inner = next.run(request);
TENANT_ID.scope(tenant_id, inner).await
}
/// Bare-bones projection — we don't need the whole `McpToken` here,
/// just enough to route and confirm validity.
#[derive(serde::Deserialize)]
struct TokenLookup {
tenant_id: String,
}
fn extract_bearer(req: &Request<Body>) -> Option<String> {
req.headers()
.get(axum::http::header::AUTHORIZATION)
.and_then(|v| v.to_str().ok())
.and_then(|s| s.strip_prefix("Bearer "))
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty())
}
fn sha256_hex(s: &str) -> String {
let mut h = Sha256::new();
h.update(s.as_bytes());
hex::encode(h.finalize())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn sha256_known_value() {
// python -c 'import hashlib; print(hashlib.sha256(b"mcpt_known").hexdigest())'
assert_eq!(
sha256_hex("mcpt_known"),
"27cf6cf678a44244106863c1c031be8e57b84c2b3019d742f755f8e7afa75dfd"
);
}
}
+7 -115
@@ -1,127 +1,19 @@
//! Per-tenant Mongo broker for the MCP server.
//!
//! Mirror of the agent's `compliance_agent::database::DatabasePool` —
//! duplicated here rather than lifted into `compliance-core` to keep
//! this PR focused. If a third consumer ever needs it, lift then.
//!
//! Bearer tokens (validated by the auth middleware) carry a tenant_id
//! and the handler resolves the per-tenant database via
//! [`DatabasePool::for_tenant_id`]. The admin database
//! (`<db_prefix>__admin`) holds the cross-tenant `mcp_tokens`
//! collection that the middleware queries on every request.
use std::sync::Arc;
use dashmap::DashMap;
use mongodb::{bson::doc, Client, Collection};
use sha2::{Digest, Sha256};
use mongodb::{Client, Collection};
use compliance_core::models::*;
/// 63-byte Mongo db-name cap; same invariant as the agent's pool.
const MAX_DB_NAME_LEN: usize = 63;
/// 16-byte SHA-256 truncation, hex-encoded → 32 chars.
const HASH_HEX_LEN: usize = 32;
const MAX_PREFIX_LEN: usize = MAX_DB_NAME_LEN - 1 - HASH_HEX_LEN;
#[derive(Clone, Debug)]
pub struct DatabasePool {
client: Client,
db_prefix: String,
/// Tenants we've handed out a [`Database`] for. The MCP server
/// doesn't ensure indexes (the agent owns that side of the
/// schema), so the marker exists only to satisfy the parallel
/// shape — current code never reads it.
#[allow(dead_code)]
seen: Arc<DashMap<String, ()>>,
}
#[derive(Debug, thiserror::Error)]
pub enum DbError {
#[error("db_prefix '{prefix}' is {len} chars; max is {max} so the hash-fallback DB name fits Mongo's 63-byte cap")]
PrefixTooLong {
prefix: String,
len: usize,
max: usize,
},
#[error(transparent)]
Mongo(#[from] mongodb::error::Error),
}
impl DatabasePool {
pub async fn connect(uri: &str, db_prefix: &str) -> Result<Self, DbError> {
if db_prefix.len() > MAX_PREFIX_LEN {
return Err(DbError::PrefixTooLong {
prefix: db_prefix.to_string(),
len: db_prefix.len(),
max: MAX_PREFIX_LEN,
});
}
let client = Client::with_uri_str(uri).await?;
client
.database("admin")
.run_command(doc! { "ping": 1 })
.await?;
tracing::info!(
"MCP MongoDB cluster reachable; per-tenant pool ready (db prefix '{db_prefix}')"
);
Ok(Self {
client,
db_prefix: db_prefix.to_string(),
seen: Arc::new(DashMap::new()),
})
}
/// Read-only handle to the tenant's database. No indexes are
/// ensured here — the agent owns writes, MCP only reads.
pub fn for_tenant_id(&self, tenant_id: &str) -> Database {
let db_name = self.tenant_db_name(tenant_id);
self.seen.insert(tenant_id.to_string(), ());
Database::new(self.client.database(&db_name))
}
/// Cross-tenant admin DB — holds the `mcp_tokens` collection that
/// the auth middleware queries to map bearer → tenant_id.
pub fn admin_db(&self) -> mongodb::Database {
self.client.database(&format!("{}__admin", self.db_prefix))
}
pub fn tenant_db_name(&self, tenant_id: &str) -> String {
let sanitized = sanitize_tenant_id(tenant_id);
let natural = format!("{}_{}", self.db_prefix, sanitized);
if natural.len() <= MAX_DB_NAME_LEN {
natural
} else {
let mut h = Sha256::new();
h.update(tenant_id.as_bytes());
let digest = h.finalize();
let suffix = hex::encode(&digest[..HASH_HEX_LEN / 2]);
format!("{}_{}", self.db_prefix, suffix)
}
}
}
fn sanitize_tenant_id(tenant_id: &str) -> String {
tenant_id
.chars()
.map(|c| match c {
'/' | '\\' | '.' | '"' | '$' | ' ' | '\0' => '_',
c => c,
})
.collect()
}
/// Typed accessors for the MCP-readable collections in a tenant DB.
/// Matches the agent's `Database` shape but only exposes what the MCP
/// tool handlers actually need.
#[derive(Clone, Debug)]
pub struct Database {
inner: mongodb::Database,
}
impl Database {
pub(crate) fn new(inner: mongodb::Database) -> Self {
Self { inner }
pub async fn connect(uri: &str, db_name: &str) -> Result<Self, mongodb::error::Error> {
let client = Client::with_uri_str(uri).await?;
let db = client.database(db_name);
db.run_command(mongodb::bson::doc! { "ping": 1 }).await?;
tracing::info!("MCP server connected to MongoDB '{db_name}'");
Ok(Self { inner: db })
}
pub fn findings(&self) -> Collection<Finding> {
+8 -33
@@ -1,11 +1,10 @@
mod auth;
mod database;
mod server;
mod tools;
use std::sync::Arc;
use database::DatabasePool;
use database::Database;
use rmcp::transport::{
streamable_http_server::session::local::LocalSessionManager, StreamableHttpServerConfig,
StreamableHttpService,
@@ -25,60 +24,36 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mongo_uri =
std::env::var("MONGODB_URI").unwrap_or_else(|_| "mongodb://localhost:27017".to_string());
// MONGODB_DATABASE is reused as the per-tenant DB-name prefix —
// same convention as the agent so `<prefix>__admin.mcp_tokens`
// and `<prefix>_<tenant_id>` line up across services.
let db_prefix =
let db_name =
std::env::var("MONGODB_DATABASE").unwrap_or_else(|_| "compliance_scanner".to_string());
let pool = DatabasePool::connect(&mongo_uri, &db_prefix).await?;
let db = Database::connect(&mongo_uri, &db_name).await?;
// HTTP transport: bind a small axum router with bearer-auth in
// front of the rmcp service. `/health` stays public for orca's
// container probe.
// If MCP_PORT is set, run as Streamable HTTP server; otherwise use stdio.
if let Ok(port_str) = std::env::var("MCP_PORT") {
let port: u16 = port_str.parse()?;
tracing::info!("Starting MCP server on HTTP port {port}");
let pool_for_factory = pool.clone();
let db_clone = db.clone();
let service = StreamableHttpService::new(
move || Ok(ComplianceMcpServer::new(pool_for_factory.clone())),
move || Ok(ComplianceMcpServer::new(db_clone.clone())),
Arc::new(LocalSessionManager::default()),
StreamableHttpServerConfig::default(),
);
let router = axum::Router::new()
.route("/health", axum::routing::get(|| async { "ok" }))
.nest_service(
"/mcp",
axum::Router::new().fallback_service(service).layer(
axum::middleware::from_fn_with_state(pool.clone(), auth::bearer_auth),
),
);
.nest_service("/mcp", service);
let listener = tokio::net::TcpListener::bind(("0.0.0.0", port)).await?;
tracing::info!("MCP HTTP server listening on 0.0.0.0:{port}");
axum::serve(listener, router).await?;
} else {
// stdio transport — used when run as a local MCP server next
// to the LLM client. There's no HTTP layer to do bearer auth,
// so we synthesize a tenant_id from STDIO_TENANT_ID for local
// development. NEVER use this in production.
tracing::info!("Starting MCP server on stdio");
let synth_tenant = std::env::var("STDIO_TENANT_ID").unwrap_or_else(|_| "dev".to_string());
tracing::warn!(
tenant_id = %synth_tenant,
"stdio transport — using synthetic tenant id; DO NOT use in production"
);
let server = ComplianceMcpServer::new(pool);
let server = ComplianceMcpServer::new(db);
let transport = rmcp::transport::stdio();
use rmcp::ServiceExt;
auth::TENANT_ID
.scope(synth_tenant, async {
let handle = server.serve(transport).await?;
handle.waiting().await?;
Ok::<_, Box<dyn std::error::Error>>(())
})
.await?;
}
Ok(())
+17 -46
@@ -2,37 +2,20 @@ use rmcp::{
handler::server::wrapper::Parameters, model::*, tool, tool_handler, tool_router, ServerHandler,
};
use crate::auth::current_tenant_id;
use crate::database::{Database, DatabasePool};
use crate::database::Database;
use crate::tools::{dast, findings, pentest, sbom};
pub struct ComplianceMcpServer {
pool: DatabasePool,
db: Database,
#[allow(dead_code)]
tool_router: rmcp::handler::server::router::tool::ToolRouter<Self>,
}
impl ComplianceMcpServer {
/// Resolve the per-tenant `Database` from the bearer-set
/// `task_local`. Every tool handler calls this; missing context
/// surfaces as `internal_error` because it means the auth
/// middleware was misconfigured (handler ran without scope).
fn tenant_db(&self) -> Result<Database, rmcp::ErrorData> {
let tenant_id = current_tenant_id().ok_or_else(|| {
rmcp::ErrorData::internal_error(
"no tenant context — bearer middleware not in chain".to_string(),
None,
)
})?;
Ok(self.pool.for_tenant_id(&tenant_id))
}
}
#[tool_router]
impl ComplianceMcpServer {
pub fn new(pool: DatabasePool) -> Self {
pub fn new(db: Database) -> Self {
Self {
pool,
db,
tool_router: Self::tool_router(),
}
}
@@ -46,8 +29,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<findings::ListFindingsParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
findings::list_findings(&db, params).await
findings::list_findings(&self.db, params).await
}
#[tool(description = "Get a single finding by its ID")]
@@ -55,8 +37,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<findings::GetFindingParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
findings::get_finding(&db, params).await
findings::get_finding(&self.db, params).await
}
#[tool(description = "Get a summary of findings counts grouped by severity and status")]
@@ -64,8 +45,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<findings::FindingsSummaryParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
findings::findings_summary(&db, params).await
findings::findings_summary(&self.db, params).await
}
// ── SBOM ──────────────────────────────────────────────
@@ -77,8 +57,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<sbom::ListSbomPackagesParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
sbom::list_sbom_packages(&db, params).await
sbom::list_sbom_packages(&self.db, params).await
}
#[tool(
@@ -88,8 +67,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<sbom::SbomVulnReportParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
sbom::sbom_vuln_report(&db, params).await
sbom::sbom_vuln_report(&self.db, params).await
}
// ── DAST ──────────────────────────────────────────────
@@ -101,8 +79,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<dast::ListDastFindingsParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
dast::list_dast_findings(&db, params).await
dast::list_dast_findings(&self.db, params).await
}
#[tool(description = "Get a summary of recent DAST scan runs and finding counts")]
@@ -110,8 +87,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<dast::DastScanSummaryParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
dast::dast_scan_summary(&db, params).await
dast::dast_scan_summary(&self.db, params).await
}
// ── Pentest ─────────────────────────────────────────────
@@ -123,8 +99,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::ListPentestSessionsParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::list_pentest_sessions(&db, params).await
pentest::list_pentest_sessions(&self.db, params).await
}
#[tool(description = "Get a single AI pentest session by its ID")]
@@ -132,8 +107,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::GetPentestSessionParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::get_pentest_session(&db, params).await
pentest::get_pentest_session(&self.db, params).await
}
#[tool(
@@ -143,8 +117,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::GetAttackChainParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::get_attack_chain(&db, params).await
pentest::get_attack_chain(&self.db, params).await
}
#[tool(description = "Get chat messages from a pentest session")]
@@ -152,8 +125,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::GetPentestMessagesParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::get_pentest_messages(&db, params).await
pentest::get_pentest_messages(&self.db, params).await
}
#[tool(
@@ -163,8 +135,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::PentestStatsParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::pentest_stats(&db, params).await
pentest::pentest_stats(&self.db, params).await
}
}
@@ -178,7 +149,7 @@ impl ServerHandler for ComplianceMcpServer {
.build(),
server_info: Implementation::from_build_env(),
instructions: Some(
"Compliance Scanner MCP server. Query security findings, SBOM data, DAST results, and AI pentest sessions for your tenant."
"Compliance Scanner MCP server. Query security findings, SBOM data, DAST results, and AI pentest sessions."
.to_string(),
),
}