Compare commits

..

6 Commits

Author SHA1 Message Date
Sharang Parnerkar dcec519565 fix(dashboard): attach Keycloak token on agent API calls
CI / Check (pull_request) Successful in 8m12s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Symptom: "unable to load repositories" — agent returns
"Missing authorization header" 401 on every protected endpoint
because the dashboard's server functions were calling reqwest::get
without an Authorization header. The Keycloak OIDC flow
(auth_login / auth_callback) was already wired up and storing the
access_token in tower-sessions, but the access_token was never
threaded into outbound calls.

Fix
- New `infrastructure::agent_client` module exposes:
  - `agent_request(method, path) -> RequestBuilder`
  - `agent_get(path) -> RequestBuilder` (sugar for GET)
  Both pull the session's access_token (via FullstackContext extract)
  and attach `Authorization: Bearer <token>`. When Keycloak is not
  configured the helper short-circuits — matching the dashboard's
  require_auth middleware which short-circuits in the same state.
- Migrated every #[server] function in:
  - chat, dast, findings, graph, issues, notifications, pentest,
    repositories, sbom, scans, stats
  - 57 call sites total, all replaced.
- Left as-is:
  - `infrastructure::server::webhook_proxy` — forwards to the agent's
    separate webhook server (port 3002), which is HMAC-authenticated,
    not JWT-authenticated.
  - `infrastructure::auth::auth_callback` — performs the KC token
    exchange itself; bearer auth would be circular.

Test plan
- cargo fmt --all clean
- cargo clippy -p compliance-dashboard --features server -- -D warnings
  clean
- cargo check -p compliance-dashboard --features server clean
- cargo check -p compliance-dashboard (web target) implicit via build
- Manual: after deploy, dashboard's repositories page loads without
  401; calls now carry Authorization: Bearer header to the agent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-17 20:31:21 +02:00
Sharang Parnerkar 08c4ec4cff feat(m7.2-D): drop transitional agent.db, add admin helpers
CI / Check (pull_request) Successful in 9m27s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Final slice of M7.2. Removes the transitional single-database handle
that M7.2-A introduced alongside the pool, so the compliance-agent
now has a single source of truth for storage: every code path obtains
a tenant-scoped Database from `agent.db_pool.for_tenant_id(...)` or
`for_tenant(&ctx)`. There is no shared "default" database anywhere.

Changes
- ComplianceAgent: `db: Database` field removed. ComplianceAgent::new
  now takes only `(config, db_pool)`. Verified by an earlier grep
  during M7.2-C that no remaining call site reads `agent.db`.
- main.rs: stops constructing the legacy Database. Only the pool is
  built at startup.
- TestServer: same — drops Database::connect/ensure_indexes, builds
  only the pool. cleanup() now drops every `<db_name>_*` per-tenant
  database (no longer touches a bare `<db_name>`).
- DatabasePool::list_tenant_db_names() — lists Mongo databases
  matching the pool's prefix. For admin endpoints + scheduler tenant
  enumeration in a future M7.3 (this PR keeps SCHEDULER_TENANT_IDS
  env config — registry integration is a separate concern).
- DatabasePool::drop_tenant(&str) — idempotent tenant offboarding.
  Drops the per-tenant database and evicts the in-memory `ensured`
  marker so a later re-provision re-runs ensure_indexes.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
  -- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 6 pass
  including new `admin_helpers_list_and_drop_tenant_dbs`
- cargo test -p compliance-agent --test tenant_status_middleware
  — 6 pass

M7.2 closeout state after this lands
- M7.1 (auth + status) — done
- M7.2-A (pool) — done
- M7.2-B (handlers) — done
- M7.2-C (background paths) — done
- M7.2-D (legacy db removed, admin helpers) — done (this PR)
- Future M7.3: scheduler pulls tenants from tenant-registry instead
  of SCHEDULER_TENANT_IDS env; cross-tenant admin HTTP endpoints
  built on list_tenant_db_names / drop_tenant.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-17 15:05:27 +02:00
Sharang Parnerkar 0f6dd1135e feat(m7.2-C): migrate background paths to per-tenant pool
CI / Check (pull_request) Successful in 10m33s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Closes the loop on M7.2 isolation for paths that don't have a JWT
context: scheduler, webhooks, and the agent's `run_scan` / `run_pr_review`
helpers all now take a `tenant_id` at the boundary and resolve to a
tenant-scoped `Database` via `db_pool.for_tenant_id(...)`. Internal
orchestrators (PipelineOrchestrator, PentestOrchestrator) and pipeline
helpers were already DB-agnostic — they take `db: Database` at
construction and don't care which tenant it points to.

Changes
- DatabasePool::for_tenant_id(&str) — same as for_tenant but accepts
  a bare tenant_id. Background paths don't have a full TenantContext.
  for_tenant is now a thin wrapper that delegates.
- agent.run_scan(tenant_id, repo_id, trigger) — pulls the tenant
  database before constructing the PipelineOrchestrator. Was:
  run_scan(repo_id, trigger) reading agent.db.
- agent.run_pr_review(tenant_id, repo_id, ...) — same shape.
- Webhook routes change: /webhook/{tenant_id}/{platform}/{repo_id}.
  Tenant is part of the URL path because webhooks arrive without a
  JWT — they're authenticated via per-repo HMAC, not the tenant gate.
  The dashboard surfaces the full per-tenant URL when the repo is
  registered. All three handlers (gitea, github, gitlab) updated.
- scheduler.rs — iterates tenants from $SCHEDULER_TENANT_IDS
  (comma-separated env), or DEV_TENANT_ID's `dev` default. Both
  scan_all_repos and monitor_cves now run once per configured
  tenant. M7.2-D will replace this static config with a pull from
  the tenant-registry.
- api/handlers/repos.rs::trigger_scan now passes tenant.0.tenant_id.

What's unchanged because it didn't need to change
- PipelineOrchestrator, PentestOrchestrator: take `db: Database` at
  construction — they're tenant-DB-agnostic by design. The caller
  picks the tenant DB.
- pipeline/{dedup,graph_build,issue_creation,sbom/mod}.rs,
  pentest/{context,report/html/*}.rs, trackers/jira.rs, llm/triage.rs:
  take `&Database` or `&mongodb::Database` as args, transitively
  tenant-scoped via the caller.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
  -- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 5 pass
- cargo test -p compliance-agent --test tenant_status_middleware
  — 6 pass

What's left (PR-D)
- Drop the transitional agent.db field — no remaining call sites
  (verified by `grep -rn "agent\.db\b" compliance-agent/src`).
- main.rs / TestServer stop building the legacy Database; only the
  pool remains.
- Add cross-tenant admin helpers (list tenants, drop tenant DB) on
  the pool for offboarding flows.
- Pull tenants from the tenant-registry instead of an env var.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-17 15:00:37 +02:00
Sharang Parnerkar cdfbb62f9d feat(m7.2-B): migrate API handlers to per-tenant database pool
CI / Check (pull_request) Successful in 8m9s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Builds on PR M7.2-A. Every HTTP handler in compliance-agent/src/api/
now takes a TenantCtx extractor and pulls a tenant-scoped Database
from agent.db_pool.for_tenant(&ctx). The query bodies are unchanged —
`db.findings().find(doc! {...})` reads from the tenant's own physical
database, so the filter doc cannot leak data across tenants because
the wrong tenant's data is literally on a different db handle.

Changes
- New `dto::tenant_db(&agent, &tenant) -> Result<Database, StatusCode>`
  helper. Every migrated handler calls it at the top of the body
  instead of `let db = &agent.db;`. 500 on the rare pool failure;
  4xx auth failures are already handled by the M7.1 status gate.
- New `api::server::inject_dev_tenant` middleware mounted only when
  Keycloak is NOT configured. Synthesizes a TenantContext with
  tenant_id = $DEV_TENANT_ID (default `dev`) so `cargo run` against
  a bare Mongo + no KC still serves the API. Logged loudly as
  "DO NOT use in any environment with real customer data".
- Test harness: TestServer mounts inject_dev_tenant so existing E2E
  tests reach handlers; cleanup() now drops every <db_name>_*
  per-tenant database, not just the legacy <db_name>.

Files migrated (handler count, all pass `cargo build`):
- chat.rs (3) — also rewires RagPipeline + EmbeddingStore to the
  tenant DB's inner() so vector search is per-tenant
- dast.rs (5)
- findings.rs (5)
- graph.rs (7) — also rewires GraphStore inside trigger_build's
  spawn to the tenant DB
- health.rs (1) — stats_overview migrated; public /health stays
  un-scoped
- issues.rs (1)
- notifications.rs (5)
- pentest_handlers/session.rs (12) — both wizard + legacy paths,
  plus pause/resume/stop/get_attack_chain/get_messages/
  get_session_findings/lookup_repo. PentestOrchestrator now gets
  the tenant DB clone in its spawn.
- pentest_handlers/export.rs (1) — fans out across sessions,
  attack_chain_nodes, dast_findings, findings, sbom_entries,
  graph_nodes from a single tenant_db acquisition
- pentest_handlers/stats.rs (1)
- pentest_handlers/stream.rs (1) — SSE handler verifies session
  via the tenant DB before subscribing
- repos.rs (6)
- sbom.rs (5)
- scans.rs (1)

help_chat.rs has no DB queries and was skipped.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
  -- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 5 pass
  (driver-level isolation still holds post-handler migration)
- cargo test -p compliance-agent --test tenant_status_middleware
  — 6 pass

What's not yet migrated (PR-C / PR-D)
- scheduler.rs (6 sites), pipeline/orchestrator.rs (14),
  pentest/orchestrator.rs (13), webhooks (gitea/github/gitlab),
  trackers/jira.rs, pipeline/dedup.rs etc. — background paths
  without a JWT-derived tenant context.
- agent.db is still in the ComplianceAgent struct as a transitional
  handle for those paths. PR-D removes it once PR-C migrates the
  background paths.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-17 13:28:33 +02:00
Sharang Parnerkar 003835764e fixup(m7.2-A): validate db_prefix at connect, bump hash to 16 bytes
CI / Check (pull_request) Successful in 8m29s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Addresses review feedback on the hash-fallback path.

The original `debug_assert!(hashed.len() <= MAX_DB_NAME_LEN)` was a
runtime hack that vanished in release builds. With an 8-byte hash
truncation (~2^32 birthday-collision resistance), two tenant_ids
hashing to the same suffix would silently share a database — no
panic, no rollback, just cross-tenant data leak. Not acceptable for
a regulated-industry product.

Changes:
- Bump hash truncation 8 → 16 bytes (32 hex chars). 2^64 birthday
  resistance — collision-impossible at our scale.
- Add MAX_PREFIX_LEN (= 30) and validate db_prefix.len() at
  `DatabasePool::connect`. The runtime hash-fallback arithmetic is
  now provably within Mongo's 63-byte cap; drop the debug_assert!.
- New test `connect_rejects_overlong_db_prefix` exercises the
  inclusive bound (30 passes, 31 fails).
- Existing hash-fallback test now asserts a 32-char hex suffix +
  basic distinctness for two different inputs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-17 13:16:46 +02:00
Sharang Parnerkar e3aabe7d18 feat(m7.2-A): introduce per-tenant DatabasePool
CI / Check (pull_request) Successful in 8m40s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
First slice of the M7.2 tenant-isolation work. Adds a `DatabasePool`
that hands out per-tenant `Database` handles physically scoped to
`<prefix>_<tenant_id>` Mongo databases. Isolation is at the driver,
not at "we hope we filter" — a handle for tenant A literally cannot
see tenant B's documents because it's connected to a different db.

What's in this PR
- DatabasePool::connect — pings the cluster, prepares per-tenant lazy
  handles.
- DatabasePool::for_tenant(&TenantContext) — returns a Database scoped
  to that tenant. ensure_indexes runs once per tenant per process via
  a DashMap-backed marker; failure rolls the marker back so the next
  request retries.
- tenant_db_name — `<prefix>_<sanitized_tenant_id>` if it fits in
  Mongo's 63-byte db-name cap, else `<prefix>_<sha256-16hex>` fallback.
- Sanitizer rewrites the Mongo-disallowed chars (`/ \ . " $ <space>
  NUL`) so any future tenant_id shape works.
- ComplianceAgent gains a `db_pool: DatabasePool` field next to the
  existing `db: Database`. Handlers / pipelines / webhooks still use
  `db` — they migrate to `db_pool.for_tenant(&ctx)` in M7.2-B/C and
  `db` goes away in M7.2-D.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard -- -D warnings
  clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 4 pass
  against live mongo on 27017:
    * pool_isolates_tenants_at_driver_level — writes for acme + globex,
      reads through each tenant's handle; each sees exactly its own
      data with no filter doc anywhere.
    * for_tenant_is_idempotent_index_creation — second + third call
      for the same tenant do not error.
    * tenant_db_name_sanitizes_unsafe_characters
    * tenant_db_name_falls_back_to_hash_when_too_long — 100-byte
      tenant_id collapses to a stable 8-byte hex suffix.

Why per-tenant DB vs `tenant_id` field + filter
- Driver-level isolation; impossible to forget the filter on one of
  the 184 query call-sites in compliance-agent.
- Handlers don't change shape at migration — `agent.db.findings()`
  becomes `db.findings()` after pulling `db` from
  `agent.db_pool.for_tenant(&ctx)`.
- GDPR delete = `db.dropDatabase()`.
- On-prem deploy = the same code path, with one tenant.
- Trade-off accepted: index storage duplicated per tenant; Mongo's
  ~thousand-db ceiling is way above the 10s-100s tenants we're
  targeting.

Caveats
- Existing `agent.db` continues to point at the single legacy db.
  Handlers / pipelines that use it are unscoped until M7.2-B/C
  migrate them.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-17 11:58:24 +02:00
20 changed files with 43 additions and 1145 deletions
Generated
-4
View File
@@ -676,7 +676,6 @@ dependencies = [
"jsonwebtoken",
"mongodb",
"octocrab",
"rand 0.9.2",
"regex",
"reqwest",
"secrecy",
@@ -819,15 +818,12 @@ dependencies = [
"bson",
"chrono",
"compliance-core",
"dashmap",
"dotenvy",
"hex",
"mongodb",
"rmcp",
"schemars 1.2.1",
"serde",
"serde_json",
"sha2",
"thiserror 2.0.18",
"tokio",
"tower-http",
-2
View File
@@ -34,5 +34,3 @@ zip = { version = "2", features = ["aes-crypto", "deflate"] }
dashmap = "6"
tokio-stream = { version = "0.1", features = ["sync"] }
aes-gcm = "0.10"
rand = "0.9"
base64 = "0.22"
-1
View File
@@ -42,7 +42,6 @@ tokio-tungstenite = { version = "0.26", features = ["rustls-tls-webpki-roots"] }
futures-core = "0.3"
dashmap = { workspace = true }
tokio-stream = { workspace = true }
rand = { workspace = true }
[dev-dependencies]
compliance-core = { workspace = true, features = ["mongodb", "axum"] }
@@ -1,186 +0,0 @@
//! `/api/v1/mcp-tokens` — per-tenant API tokens for the MCP server.
//!
//! These are opaque static bearers issued via the dashboard (or a
//! direct curl with a KC JWT) and copied into LLM clients (Claude
//! Desktop / Cursor / ChatGPT). The MCP server hashes incoming bearers
//! and looks them up in the cross-tenant `<prefix>__admin.mcp_tokens`
//! collection to derive the tenant_id for routing.
//!
//! The raw token is shown to the caller exactly once at creation; the
//! database only ever stores the SHA-256 hash. Revocation is a soft
//! delete (sets `revoked: true`) so the audit log keeps the record.
use axum::extract::{Extension, Path};
use axum::http::StatusCode;
use axum::Json;
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine as _};
use compliance_core::models::{McpToken, McpTokenView};
use compliance_core::tenant_ctx::TenantCtx;
use mongodb::bson::doc;
use rand::RngCore;
use sha2::{Digest, Sha256};
use super::dto::{AgentExt, ApiResponse};
/// Mongo collection name inside the admin DB.
const COLLECTION: &str = "mcp_tokens";
/// Token prefix the MCP server expects on every bearer.
const TOKEN_PREFIX: &str = "mcpt_";
/// Bytes of randomness behind each token. 32 → ~256 bits.
/// Encoded as URL-safe base64 without padding → 43 chars.
/// Combined with `mcpt_` → 48-char tokens.
const TOKEN_RAND_BYTES: usize = 32;
#[derive(serde::Deserialize)]
pub struct CreateMcpTokenRequest {
pub name: String,
}
/// Returned exactly once at creation. The `token` field is gone from
/// the listing endpoint — the user must save it now.
#[derive(serde::Serialize)]
pub struct CreateMcpTokenResponse {
pub token: String,
pub view: McpTokenView,
}
/// `POST /api/v1/mcp-tokens` — mint a new token for the caller's tenant.
#[tracing::instrument(skip_all)]
pub async fn create_mcp_token(
Extension(agent): AgentExt,
tenant: TenantCtx,
Json(req): Json<CreateMcpTokenRequest>,
) -> Result<Json<CreateMcpTokenResponse>, StatusCode> {
if req.name.trim().is_empty() {
return Err(StatusCode::BAD_REQUEST);
}
let raw = generate_token();
let token_hash = sha256_hex(&raw);
let token_prefix: String = raw.chars().take(12).collect();
let mut token = McpToken {
id: None,
token_hash,
token_prefix,
tenant_id: tenant.0.tenant_id.clone(),
name: req.name.trim().to_string(),
created_by: tenant.0.user_id.clone(),
created_at: chrono::Utc::now(),
last_used_at: None,
revoked: false,
};
let col = agent.db_pool.admin_db().collection::<McpToken>(COLLECTION);
let res = col.insert_one(&token).await.map_err(|e| {
tracing::error!("Failed to insert MCP token: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
token.id = res.inserted_id.as_object_id();
Ok(Json(CreateMcpTokenResponse {
view: McpTokenView::from(&token),
token: raw,
}))
}
/// `GET /api/v1/mcp-tokens` — list tokens for the caller's tenant.
/// Hash is never returned; only metadata + the 12-char prefix so the
/// user can identify which row is which.
#[tracing::instrument(skip_all)]
pub async fn list_mcp_tokens(
Extension(agent): AgentExt,
tenant: TenantCtx,
) -> Result<Json<ApiResponse<Vec<McpTokenView>>>, StatusCode> {
let col = agent.db_pool.admin_db().collection::<McpToken>(COLLECTION);
let mut cursor = col
.find(doc! { "tenant_id": &tenant.0.tenant_id })
.sort(doc! { "created_at": -1 })
.await
.map_err(|e| {
tracing::error!("Failed to list MCP tokens: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
let mut out = Vec::new();
while cursor.advance().await.map_err(|e| {
tracing::warn!("MCP tokens cursor advance failed: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})? {
match cursor.deserialize_current() {
Ok(t) => out.push(McpTokenView::from(&t)),
Err(e) => tracing::warn!("Failed to deserialize MCP token: {e}"),
}
}
Ok(Json(ApiResponse {
data: out,
total: None,
page: None,
}))
}
/// `DELETE /api/v1/mcp-tokens/{id}` — revoke (soft delete).
/// Scoped to the caller's tenant: a user can't revoke another tenant's
/// token even if they guess its id.
#[tracing::instrument(skip_all, fields(id = %id))]
pub async fn revoke_mcp_token(
Extension(agent): AgentExt,
tenant: TenantCtx,
Path(id): Path<String>,
) -> Result<Json<serde_json::Value>, StatusCode> {
let oid = mongodb::bson::oid::ObjectId::parse_str(&id).map_err(|_| StatusCode::BAD_REQUEST)?;
let col = agent.db_pool.admin_db().collection::<McpToken>(COLLECTION);
let result = col
.update_one(
doc! { "_id": oid, "tenant_id": &tenant.0.tenant_id },
doc! { "$set": { "revoked": true } },
)
.await
.map_err(|e| {
tracing::error!("Failed to revoke MCP token: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})?;
if result.matched_count == 0 {
return Err(StatusCode::NOT_FOUND);
}
Ok(Json(serde_json::json!({ "status": "revoked" })))
}
/// 32 bytes random → URL-safe base64 → 43 chars, no padding.
/// Prefixed with `mcpt_` so the MCP server can sniff the format
/// before bothering with the DB lookup.
fn generate_token() -> String {
let mut bytes = [0u8; TOKEN_RAND_BYTES];
rand::rng().fill_bytes(&mut bytes);
format!("{TOKEN_PREFIX}{}", URL_SAFE_NO_PAD.encode(bytes))
}
fn sha256_hex(s: &str) -> String {
let mut h = Sha256::new();
h.update(s.as_bytes());
hex::encode(h.finalize())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn generated_tokens_are_unique_and_prefixed() {
let a = generate_token();
let b = generate_token();
assert_ne!(a, b);
assert!(a.starts_with(TOKEN_PREFIX));
assert!(b.starts_with(TOKEN_PREFIX));
// 5 + 43 = 48 chars
assert_eq!(a.len(), 5 + 43);
}
#[test]
fn sha256_is_stable_and_64_hex() {
let h = sha256_hex("mcpt_abc");
assert_eq!(h.len(), 64);
assert!(h.chars().all(|c| c.is_ascii_hexdigit()));
assert_eq!(sha256_hex("mcpt_abc"), h);
}
}
-1
View File
@@ -6,7 +6,6 @@ pub mod graph;
pub mod health;
pub mod help_chat;
pub mod issues;
pub mod mcp_tokens;
pub mod notifications;
pub mod pentest_handlers;
pub use pentest_handlers as pentest;
-9
View File
@@ -47,15 +47,6 @@ pub fn build_router() -> Router {
.route("/api/v1/sbom/diff", get(handlers::sbom_diff))
.route("/api/v1/issues", get(handlers::list_issues))
.route("/api/v1/scan-runs", get(handlers::list_scan_runs))
// MCP token management (per-tenant API tokens for the MCP server)
.route(
"/api/v1/mcp-tokens",
get(handlers::mcp_tokens::list_mcp_tokens).post(handlers::mcp_tokens::create_mcp_token),
)
.route(
"/api/v1/mcp-tokens/{id}",
delete(handlers::mcp_tokens::revoke_mcp_token),
)
// Graph API endpoints
.route("/api/v1/graph/{repo_id}", get(handlers::graph::get_graph))
.route(
-19
View File
@@ -141,25 +141,6 @@ impl DatabasePool {
&self.client
}
/// Cross-tenant admin database used by features that intentionally
/// span tenants (today: MCP bearer tokens — each token row carries
/// a `tenant_id` and the MCP server reads them to route requests).
///
/// The name `<db_prefix>__admin` (double underscore) is reserved —
/// the sanitizer never produces it for a normal tenant DB because
/// the natural format is `<db_prefix>_<sanitized_tenant_id>` (one
/// underscore) and tenant_ids would have to start with `_admin` to
/// collide. New tenant provisioning should reject such ids.
pub fn admin_db(&self) -> mongodb::Database {
self.client.database(&self.admin_db_name())
}
/// Name of the admin database — public so tests / operators can
/// drop it via the raw client.
pub fn admin_db_name(&self) -> String {
format!("{}__admin", self.db_prefix)
}
/// List every Mongo database currently belonging to this pool,
/// identified by the `<db_prefix>_` prefix. The result is the raw
/// database names — opening one for offboarding/cleanup goes
-69
View File
@@ -1,69 +0,0 @@
//! Per-tenant API tokens used by `compliance-mcp` to authenticate MCP
//! HTTP requests on behalf of LLM clients (Claude Desktop, Cursor,
//! ChatGPT, etc.) that can't run a Keycloak OIDC flow.
//!
//! Tokens are opaque strings of the form `mcpt_<44 url-safe random
//! chars>`. The raw value is shown to the user exactly once at
//! creation; the database only ever sees the SHA-256 hash. Lookups go
//! through the cross-tenant `<prefix>__admin.mcp_tokens` collection
//! and return the `tenant_id` the MCP server should route to.
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
/// Persisted token metadata. `token_hash` is the SHA-256 hex of the
/// raw token; the raw token itself is never stored.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct McpToken {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
/// SHA-256 hex of the raw token. Unique index in the collection.
pub token_hash: String,
/// First 8 chars of the raw token — purely for UI display so users
/// can identify which token is which without re-issuing.
pub token_prefix: String,
/// Routes to `<db_prefix>_<tenant_id>` on MCP requests.
pub tenant_id: String,
/// User-given label, e.g. "Claude Desktop" or "Sharang's laptop".
pub name: String,
/// Keycloak `sub` of the user who created this token, for audit.
pub created_by: String,
#[serde(with = "super::serde_helpers::bson_datetime")]
pub created_at: DateTime<Utc>,
#[serde(default, with = "super::serde_helpers::opt_bson_datetime")]
pub last_used_at: Option<DateTime<Utc>>,
/// Soft-delete flag. A revoked token doc stays around for audit
/// but never authenticates.
#[serde(default)]
pub revoked: bool,
}
/// Public projection of a token — never includes the hash.
/// Returned by `GET /api/v1/mcp-tokens`.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct McpTokenView {
pub id: String,
pub name: String,
/// `mcpt_xxxx…` so the user can identify which row is which.
pub token_prefix: String,
pub created_by: String,
#[serde(with = "super::serde_helpers::bson_datetime")]
pub created_at: DateTime<Utc>,
#[serde(default, with = "super::serde_helpers::opt_bson_datetime")]
pub last_used_at: Option<DateTime<Utc>>,
pub revoked: bool,
}
impl From<&McpToken> for McpTokenView {
fn from(t: &McpToken) -> Self {
Self {
id: t.id.map(|o| o.to_hex()).unwrap_or_default(),
name: t.name.clone(),
token_prefix: t.token_prefix.clone(),
created_by: t.created_by.clone(),
created_at: t.created_at,
last_used_at: t.last_used_at,
revoked: t.revoked,
}
}
}
-2
View File
@@ -7,7 +7,6 @@ pub mod finding;
pub mod graph;
pub mod issue;
pub mod mcp;
pub mod mcp_token;
pub mod notification;
pub mod pentest;
pub mod repository;
@@ -29,7 +28,6 @@ pub use graph::{
};
pub use issue::{IssueStatus, TrackerIssue, TrackerType};
pub use mcp::{McpServerConfig, McpServerStatus, McpTransport};
pub use mcp_token::{McpToken, McpTokenView};
pub use notification::{CveNotification, NotificationSeverity, NotificationStatus};
pub use pentest::{
AttackChainNode, AttackNodeStatus, AuthMode, CodeContextHint, Environment, IdentityProvider,
-2
View File
@@ -44,8 +44,6 @@ pub enum Route {
PentestSessionPage { session_id: String },
#[route("/mcp-servers")]
McpServersPage {},
#[route("/mcp-tokens")]
McpTokensPage {},
}
const FAVICON: Asset = asset!("/assets/favicon.svg");
@@ -9,16 +9,7 @@
//! When Keycloak is not configured (dev convenience), the helper
//! returns an unauthenticated builder — matching the agent's
//! pass-through behavior in the same state.
//!
//! **Token refresh**: KC access tokens are short-lived (5 min default
//! in the certifai realm). Before attaching, we decode the JWT's `exp`
//! claim and proactively refresh via the stored refresh_token if the
//! access token is expired or about to expire. The session is updated
//! with the new pair. If refresh fails, we send the (stale) token
//! anyway — the agent's 401 will surface to the UI, which can prompt
//! re-login.
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine};
use dioxus::prelude::ServerFnError;
use dioxus_fullstack::FullstackContext;
use reqwest::Method;
@@ -27,11 +18,6 @@ use super::auth::LOGGED_IN_USER_SESS_KEY;
use super::server_state::ServerState;
use super::user_state::UserStateInner;
/// Seconds before the JWT's `exp` time at which we consider it stale
/// enough to refresh. Covers clock skew + the round-trip to the agent
/// so the token doesn't expire mid-flight.
const REFRESH_SKEW_SECS: i64 = 30;
/// Build a `RequestBuilder` for `<agent_api_url><path>` with the
/// session's access token attached. `path` should include a leading
/// `/`, e.g. `"/api/v1/repositories"`.
@@ -52,9 +38,10 @@ pub async fn agent_get(path: &str) -> Result<reqwest::RequestBuilder, ServerFnEr
}
/// Attach the session's bearer token if Keycloak is configured AND the
/// session has a logged-in user. Refresh the token proactively if it's
/// expired or about to expire. Persists refreshed tokens back into the
/// session.
/// session has a logged-in user. Otherwise leave the request as-is.
///
/// The Keycloak-disabled path mirrors the dashboard's `require_auth`
/// middleware, which short-circuits when `state.keycloak.is_none()`.
async fn attach_token(
req: reqwest::RequestBuilder,
state: &ServerState,
@@ -67,144 +54,8 @@ async fn attach_token(
.get(LOGGED_IN_USER_SESS_KEY)
.await
.map_err(|e| ServerFnError::new(format!("session read failed: {e}")))?;
let Some(mut user) = user else {
return Ok(req);
};
if token_needs_refresh(&user.access_token) {
tracing::debug!("Access token expired or near-expiring; refreshing");
match refresh_tokens(state, &user.refresh_token).await {
Ok((new_access, new_refresh)) => {
user.access_token = new_access;
if let Some(rt) = new_refresh {
user.refresh_token = rt;
}
if let Err(e) = session.insert(LOGGED_IN_USER_SESS_KEY, &user).await {
tracing::warn!("Failed to persist refreshed tokens: {e}");
}
}
Err(e) => {
tracing::warn!("Token refresh failed: {e}; sending current token anyway");
// Fall through — the agent will 401 and the UI will
// prompt re-login. Better than failing the request at
// the dashboard layer with no helpful UX cue.
}
}
}
Ok(req.bearer_auth(user.access_token))
}
/// Decode the JWT's payload (no signature verification — the agent
/// does that) and check the `exp` claim. Treats malformed tokens as
/// expired so the refresh path runs.
fn token_needs_refresh(jwt: &str) -> bool {
let Some(payload_b64) = jwt.split('.').nth(1) else {
return true;
};
let Ok(bytes) = URL_SAFE_NO_PAD.decode(payload_b64) else {
return true;
};
#[derive(serde::Deserialize)]
struct ExpClaim {
exp: i64,
}
let Ok(claims) = serde_json::from_slice::<ExpClaim>(&bytes) else {
return true;
};
let now = chrono::Utc::now().timestamp();
claims.exp - REFRESH_SKEW_SECS <= now
}
/// Exchange a refresh_token for a new access_token. Returns the new
/// access_token and (optionally) the new refresh_token KC issued.
/// KC may rotate refresh_tokens on use; we honor whatever it sends.
async fn refresh_tokens(
state: &ServerState,
refresh_token: &str,
) -> Result<(String, Option<String>), String> {
let kc = state
.keycloak
.ok_or_else(|| "Keycloak not configured".to_string())?;
if refresh_token.is_empty() {
return Err("no refresh_token in session".to_string());
}
#[derive(serde::Deserialize)]
struct TokenResp {
access_token: String,
refresh_token: Option<String>,
}
let resp = reqwest::Client::new()
.post(kc.token_endpoint())
.form(&[
("grant_type", "refresh_token"),
("client_id", kc.client_id.as_str()),
("refresh_token", refresh_token),
])
.send()
.await
.map_err(|e| format!("refresh request failed: {e}"))?;
if !resp.status().is_success() {
let status = resp.status();
let body = resp.text().await.unwrap_or_default();
return Err(format!("refresh rejected ({status}): {body}"));
}
let r: TokenResp = resp
.json()
.await
.map_err(|e| format!("refresh response parse failed: {e}"))?;
Ok((r.access_token, r.refresh_token))
}
#[cfg(test)]
mod tests {
use super::*;
use base64::Engine;
/// Build a JWT-shaped string (header.payload.sig) with the given
/// payload. Signature is bogus — we never verify it locally.
fn make_jwt(payload: &serde_json::Value) -> String {
let payload_b64 = URL_SAFE_NO_PAD.encode(serde_json::to_vec(payload).unwrap());
format!("hdr.{payload_b64}.sig")
}
#[test]
fn token_needs_refresh_true_when_expired() {
let exp = chrono::Utc::now().timestamp() - 60;
let jwt = make_jwt(&serde_json::json!({ "exp": exp }));
assert!(token_needs_refresh(&jwt));
}
#[test]
fn token_needs_refresh_true_within_skew_window() {
// 10 seconds left; less than the 30s skew → must refresh.
let exp = chrono::Utc::now().timestamp() + 10;
let jwt = make_jwt(&serde_json::json!({ "exp": exp }));
assert!(token_needs_refresh(&jwt));
}
#[test]
fn token_needs_refresh_false_with_plenty_of_life() {
let exp = chrono::Utc::now().timestamp() + 600;
let jwt = make_jwt(&serde_json::json!({ "exp": exp }));
assert!(!token_needs_refresh(&jwt));
}
#[test]
fn token_needs_refresh_true_on_malformed_jwt() {
assert!(token_needs_refresh(""));
assert!(token_needs_refresh("not.a.jwt"));
assert!(token_needs_refresh("only-one-segment"));
assert!(token_needs_refresh("hdr.not-base64!.sig"));
}
#[test]
fn token_needs_refresh_true_when_exp_missing() {
let jwt = make_jwt(&serde_json::json!({ "sub": "abc" }));
assert!(token_needs_refresh(&jwt));
}
Ok(match user {
Some(u) => req.bearer_auth(u.access_token),
None => req,
})
}
@@ -1,90 +0,0 @@
//! Server-functions for the MCP-tokens management UI.
//!
//! These wrap the agent's `/api/v1/mcp-tokens` CRUD endpoints. The raw
//! token returned by `create_mcp_token` is only visible at creation
//! time — the agent's storage never holds the plaintext.
use dioxus::prelude::*;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct McpTokenView {
pub id: String,
pub name: String,
pub token_prefix: String,
pub created_by: String,
pub created_at: serde_json::Value,
#[serde(default)]
pub last_used_at: Option<serde_json::Value>,
#[serde(default)]
pub revoked: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct McpTokensListResponse {
pub data: Vec<McpTokenView>,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct CreateMcpTokenResponse {
/// Raw token. Shown ONCE — the user must copy it now.
pub token: String,
pub view: McpTokenView,
}
#[server]
pub async fn fetch_mcp_tokens() -> Result<McpTokensListResponse, ServerFnError> {
let resp = super::agent_client::agent_get("/api/v1/mcp-tokens")
.await?
.send()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
let body: McpTokensListResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn create_mcp_token(name: String) -> Result<CreateMcpTokenResponse, ServerFnError> {
if name.trim().is_empty() {
return Err(ServerFnError::new("Name is required"));
}
let resp = super::agent_client::agent_request(reqwest::Method::POST, "/api/v1/mcp-tokens")
.await?
.json(&serde_json::json!({ "name": name.trim() }))
.send()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
if !resp.status().is_success() {
let body = resp.text().await.unwrap_or_default();
return Err(ServerFnError::new(format!(
"Failed to create token: {body}"
)));
}
let body: CreateMcpTokenResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn revoke_mcp_token(id: String) -> Result<(), ServerFnError> {
let resp = super::agent_client::agent_request(
reqwest::Method::DELETE,
&format!("/api/v1/mcp-tokens/{id}"),
)
.await?
.send()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
if !resp.status().is_success() {
let body = resp.text().await.unwrap_or_default();
return Err(ServerFnError::new(format!(
"Failed to revoke token: {body}"
)));
}
Ok(())
}
@@ -8,7 +8,6 @@ pub mod graph;
pub mod help_chat;
pub mod issues;
pub mod mcp;
pub mod mcp_tokens;
pub mod notifications;
pub mod pentest;
#[allow(clippy::too_many_arguments)]
@@ -1,271 +0,0 @@
use dioxus::prelude::*;
use dioxus_free_icons::icons::bs_icons::*;
use dioxus_free_icons::Icon;
use crate::components::page_header::PageHeader;
use crate::components::toast::{ToastType, Toasts};
use crate::infrastructure::mcp_tokens::{
create_mcp_token, fetch_mcp_tokens, revoke_mcp_token, CreateMcpTokenResponse,
};
#[component]
pub fn McpTokensPage() -> Element {
let mut tokens = use_resource(|| async { fetch_mcp_tokens().await.ok() });
let mut toasts = use_context::<Toasts>();
// Create-form state
let mut show_form = use_signal(|| false);
let mut new_name = use_signal(String::new);
let mut submitting = use_signal(|| false);
// After creation, the raw token shows once in a banner
let mut just_created: Signal<Option<CreateMcpTokenResponse>> = use_signal(|| None);
// Revoke confirmation: (id, name)
let mut confirm_revoke: Signal<Option<(String, String)>> = use_signal(|| None);
rsx! {
PageHeader {
title: "MCP Tokens",
description: "Static bearer tokens for the MCP server. Use in your LLM client (Claude Desktop, Cursor, etc.) — one token per tool/device.",
}
// ── Just-created banner ────────────────────────────────────
if let Some(resp) = just_created() {
div { class: "card mb-4", style: "border: 1px solid var(--accent-warning); background: var(--bg-warning-subtle);",
div { class: "card-header", style: "color: var(--accent-warning);",
Icon { icon: BsExclamationTriangle, width: 14, height: 14 }
" Copy this token now — it won't be shown again"
}
div { style: "padding: 1rem;",
p { style: "margin-bottom: 0.5rem; color: var(--text-secondary);",
"Token for "
strong { "{resp.view.name}" }
}
div { class: "copyable", style: "background: var(--bg-secondary); padding: 0.75rem; border-radius: 4px;",
code { style: "font-family: var(--font-mono); word-break: break-all; flex: 1;", "{resp.token}" }
crate::components::copy_button::CopyButton { value: resp.token.clone(), small: false }
}
div { style: "margin-top: 0.75rem;",
button {
class: "btn btn-sm btn-ghost",
onclick: move |_| just_created.set(None),
"Dismiss"
}
}
}
}
}
// ── Create form ────────────────────────────────────────────
div { class: "mb-4",
button {
class: "btn btn-primary",
onclick: move |_| {
show_form.set(!show_form());
new_name.set(String::new());
},
if show_form() { "Cancel" } else {
Icon { icon: BsPlusLg, width: 14, height: 14 }
" Create Token"
}
}
}
if show_form() {
div { class: "card mb-4",
div { class: "card-header", "New MCP Token" }
div { style: "padding: 1rem;",
div { class: "form-group",
label { "Name" }
input {
r#type: "text",
placeholder: "Claude Desktop on my laptop",
value: "{new_name}",
oninput: move |e| new_name.set(e.value()),
}
small { style: "color: var(--text-secondary);", "A label so you can identify this token in the list. Not visible to LLM clients." }
}
div { style: "margin-top: 1rem;",
button {
class: "btn btn-primary",
disabled: submitting() || new_name().trim().is_empty(),
onclick: move |_| {
let name = new_name().trim().to_string();
if name.is_empty() {
return;
}
spawn(async move {
submitting.set(true);
match create_mcp_token(name).await {
Ok(resp) => {
toasts.push(ToastType::Success, "Token created. Copy it now — it won't be shown again.");
just_created.set(Some(resp));
show_form.set(false);
new_name.set(String::new());
tokens.restart();
}
Err(e) => {
toasts.push(ToastType::Error, format!("Failed to create token: {e}"));
}
}
submitting.set(false);
});
},
if submitting() { "Creating..." } else { "Create" }
}
}
}
}
}
// ── Tokens list ────────────────────────────────────────────
match &*tokens.read() {
Some(Some(resp)) => {
if resp.data.is_empty() {
rsx! {
div { class: "card",
p { style: "padding: 1rem; color: var(--text-secondary);", "No MCP tokens yet. Create one to start using the MCP server from an LLM client." }
}
}
} else {
rsx! {
div { class: "mcp-cards-grid",
for token in resp.data.iter() {
{
let id = token.id.clone();
let name = token.name.clone();
let prefix = token.token_prefix.clone();
let created_str = format_timestamp(&token.created_at);
let last_used_str = token
.last_used_at
.as_ref()
.map(format_timestamp)
.unwrap_or_else(|| "never".to_string());
let revoked = token.revoked;
rsx! {
div { class: "mcp-card", style: if revoked { "opacity: 0.55;" } else { "" },
div { class: "mcp-card-header",
div { class: "mcp-card-title",
Icon { icon: BsKey, width: 14, height: 14 }
h3 { "{name}" }
if revoked {
span { class: "mcp-card-status stopped", "revoked" }
}
}
if !revoked {
button {
class: "btn btn-sm btn-ghost btn-ghost-danger",
title: "Revoke token",
onclick: {
let id = id.clone();
let name = name.clone();
move |_| {
confirm_revoke.set(Some((id.clone(), name.clone())));
}
},
Icon { icon: BsTrash, width: 14, height: 14 }
}
}
}
div { class: "mcp-card-details",
div { class: "mcp-detail-row",
Icon { icon: BsKey, width: 13, height: 13 }
span { class: "mcp-detail-label", "Prefix" }
code { class: "mcp-detail-value", "{prefix}…" }
}
div { class: "mcp-detail-row",
Icon { icon: BsCalendar, width: 13, height: 13 }
span { class: "mcp-detail-label", "Created" }
span { class: "mcp-detail-value", "{created_str}" }
}
div { class: "mcp-detail-row",
Icon { icon: BsClockHistory, width: 13, height: 13 }
span { class: "mcp-detail-label", "Last used" }
span { class: "mcp-detail-value", "{last_used_str}" }
}
}
}
}
}
}
}
}
}
}
Some(None) => rsx! {
div { class: "card",
p { style: "padding: 1rem; color: var(--accent-danger);", "Failed to load MCP tokens." }
}
},
None => rsx! {
div { class: "card",
p { style: "padding: 1rem; color: var(--text-secondary);", "Loading..." }
}
},
}
// ── Revoke confirmation modal ──────────────────────────────
if let Some((id, name)) = confirm_revoke() {
div { class: "modal-overlay",
div { class: "modal",
h3 { "Revoke token?" }
p {
"The token "
strong { "{name}" }
" will stop working immediately. This cannot be undone. Any LLM client using it will start getting 401."
}
div { style: "display: flex; gap: 0.5rem; margin-top: 1rem; justify-content: flex-end;",
button {
class: "btn btn-ghost",
onclick: move |_| confirm_revoke.set(None),
"Cancel"
}
button {
class: "btn btn-danger",
onclick: {
let id = id.clone();
move |_| {
let id = id.clone();
spawn(async move {
match revoke_mcp_token(id).await {
Ok(()) => {
toasts.push(ToastType::Success, "Token revoked");
tokens.restart();
}
Err(e) => {
toasts.push(ToastType::Error, format!("Failed to revoke: {e}"));
}
}
confirm_revoke.set(None);
});
}
},
"Revoke"
}
}
}
}
}
}
}
/// Best-effort timestamp formatter. The agent serializes BSON DateTime
/// as `{"$date":{"$numberLong":"..."}}` in extended JSON. We accept
/// that shape, plain ISO strings, or anything else (best-effort).
fn format_timestamp(v: &serde_json::Value) -> String {
if let Some(s) = v.as_str() {
return s.to_string();
}
if let Some(ms) = v
.get("$date")
.and_then(|d| d.get("$numberLong"))
.and_then(|s| s.as_str())
.and_then(|s| s.parse::<i64>().ok())
{
return chrono::DateTime::<chrono::Utc>::from_timestamp_millis(ms)
.map(|d| d.format("%Y-%m-%d %H:%M").to_string())
.unwrap_or_else(|| ms.to_string());
}
"".to_string()
}
-2
View File
@@ -11,7 +11,6 @@ pub mod graph_index;
pub mod impact_analysis;
pub mod issues;
pub mod mcp_servers;
pub mod mcp_tokens;
pub mod overview;
pub mod pentest_dashboard;
pub mod pentest_session;
@@ -31,7 +30,6 @@ pub use graph_index::GraphIndexPage;
pub use impact_analysis::ImpactAnalysisPage;
pub use issues::IssuesPage;
pub use mcp_servers::McpServersPage;
pub use mcp_tokens::McpTokensPage;
pub use overview::OverviewPage;
pub use pentest_dashboard::PentestDashboardPage;
pub use pentest_session::PentestSessionPage;
+1 -4
View File
@@ -4,7 +4,7 @@ version = "0.1.0"
edition = "2021"
[dependencies]
compliance-core = { workspace = true, features = ["mongodb", "axum"] }
compliance-core = { workspace = true, features = ["mongodb"] }
rmcp = { version = "0.16", features = ["server", "macros", "transport-io", "transport-streamable-http-server"] }
tokio = { workspace = true }
serde = { workspace = true }
@@ -19,6 +19,3 @@ bson = { version = "2", features = ["chrono-0_4"] }
schemars = "1.0"
axum = "0.8"
tower-http = { version = "0.6", features = ["cors"] }
sha2 = { workspace = true }
hex = { workspace = true }
dashmap = { workspace = true }
-129
View File
@@ -1,129 +0,0 @@
//! Bearer-token authentication for incoming MCP HTTP requests.
//!
//! LLM clients (Claude Desktop / Cursor / ChatGPT / etc.) can't run
//! Keycloak OIDC, so the MCP server uses opaque static tokens minted
//! per-tenant via the agent's `POST /api/v1/mcp-tokens` endpoint.
//!
//! Flow per request:
//! 1. Extract `Authorization: Bearer <token>`. Missing → 401.
//! 2. SHA-256 hash the token.
//! 3. Look up the hash in `<prefix>__admin.mcp_tokens`. Missing or
//! revoked → 401.
//! 4. Fire-and-forget update of `last_used_at` so the dashboard can
//! show staleness without blocking the handler.
//! 5. Stash the tenant_id in [`TENANT_ID`] (a `tokio::task_local`) so
//! the MCP tool handlers can read it without modifying rmcp's
//! handler signatures.
//!
//! The `task_local` is scoped around the inner service call via
//! [`bearer_auth`], so every handler invoked downstream sees the
//! tenant_id without us having to thread it through the macro-
//! generated tool router.
use axum::body::Body;
use axum::extract::{Request, State};
use axum::http::StatusCode;
use axum::middleware::Next;
use axum::response::{IntoResponse, Response};
use mongodb::bson::doc;
use sha2::{Digest, Sha256};
use crate::database::DatabasePool;
tokio::task_local! {
/// Tenant id resolved from the bearer for this request. Set by
/// [`bearer_auth`] before the inner service runs; read by the
/// MCP tool handlers via [`current_tenant_id`].
pub static TENANT_ID: String;
}
/// Mongo collection name in `<prefix>__admin`.
const COLLECTION: &str = "mcp_tokens";
/// Returns the tenant_id set by the auth middleware. `None` outside a
/// request scope (e.g. unit tests that bypass the middleware).
pub fn current_tenant_id() -> Option<String> {
TENANT_ID.try_with(|s| s.clone()).ok()
}
/// Axum middleware: validate bearer → set [`TENANT_ID`] → call inner.
pub async fn bearer_auth(
State(pool): State<DatabasePool>,
request: Request,
next: Next,
) -> Response {
let Some(token) = extract_bearer(&request) else {
return (StatusCode::UNAUTHORIZED, "Missing bearer token").into_response();
};
if !token.starts_with("mcpt_") {
return (StatusCode::UNAUTHORIZED, "Invalid token format").into_response();
}
let token_hash = sha256_hex(&token);
let col = pool.admin_db().collection::<TokenLookup>(COLLECTION);
let found = match col
.find_one(doc! { "token_hash": &token_hash, "revoked": false })
.await
{
Ok(Some(t)) => t,
Ok(None) => {
return (StatusCode::UNAUTHORIZED, "Invalid or revoked token").into_response();
}
Err(e) => {
tracing::error!("MCP token lookup failed: {e}");
return (StatusCode::INTERNAL_SERVER_ERROR, "Token lookup error").into_response();
}
};
// Fire-and-forget last_used_at update — never block the handler.
let col2 = pool.admin_db().collection::<TokenLookup>(COLLECTION);
let hash_for_update = token_hash.clone();
tokio::spawn(async move {
let _ = col2
.update_one(
doc! { "token_hash": &hash_for_update },
doc! { "$set": { "last_used_at": mongodb::bson::DateTime::now() } },
)
.await;
});
let tenant_id = found.tenant_id;
let inner = next.run(request);
TENANT_ID.scope(tenant_id, inner).await
}
/// Bare-bones projection — we don't need the whole `McpToken` here,
/// just enough to route and confirm validity.
#[derive(serde::Deserialize)]
struct TokenLookup {
tenant_id: String,
}
fn extract_bearer(req: &Request<Body>) -> Option<String> {
req.headers()
.get(axum::http::header::AUTHORIZATION)
.and_then(|v| v.to_str().ok())
.and_then(|s| s.strip_prefix("Bearer "))
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty())
}
fn sha256_hex(s: &str) -> String {
let mut h = Sha256::new();
h.update(s.as_bytes());
hex::encode(h.finalize())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn sha256_known_value() {
// python -c 'import hashlib; print(hashlib.sha256(b"mcpt_known").hexdigest())'
assert_eq!(
sha256_hex("mcpt_known"),
"27cf6cf678a44244106863c1c031be8e57b84c2b3019d742f755f8e7afa75dfd"
);
}
}
+7 -115
View File
@@ -1,127 +1,19 @@
//! Per-tenant Mongo broker for the MCP server.
//!
//! Mirror of the agent's `compliance_agent::database::DatabasePool` —
//! duplicated here rather than lifted into `compliance-core` to keep
//! this PR focused. If a third consumer ever needs it, lift then.
//!
//! Bearer tokens (validated by the auth middleware) carry a tenant_id
//! and the handler resolves the per-tenant database via
//! [`DatabasePool::for_tenant_id`]. The admin database
//! (`<db_prefix>__admin`) holds the cross-tenant `mcp_tokens`
//! collection that the middleware queries on every request.
use std::sync::Arc;
use dashmap::DashMap;
use mongodb::{bson::doc, Client, Collection};
use sha2::{Digest, Sha256};
use mongodb::{Client, Collection};
use compliance_core::models::*;
/// 63-byte Mongo db-name cap; same invariant as the agent's pool.
const MAX_DB_NAME_LEN: usize = 63;
/// 16-byte SHA-256 truncation, hex-encoded → 32 chars.
const HASH_HEX_LEN: usize = 32;
const MAX_PREFIX_LEN: usize = MAX_DB_NAME_LEN - 1 - HASH_HEX_LEN;
#[derive(Clone, Debug)]
pub struct DatabasePool {
client: Client,
db_prefix: String,
/// Tenants we've handed out a [`Database`] for. The MCP server
/// doesn't ensure indexes (the agent owns that side of the
/// schema), so the marker exists only to satisfy the parallel
/// shape — current code never reads it.
#[allow(dead_code)]
seen: Arc<DashMap<String, ()>>,
}
#[derive(Debug, thiserror::Error)]
pub enum DbError {
#[error("db_prefix '{prefix}' is {len} chars; max is {max} so the hash-fallback DB name fits Mongo's 63-byte cap")]
PrefixTooLong {
prefix: String,
len: usize,
max: usize,
},
#[error(transparent)]
Mongo(#[from] mongodb::error::Error),
}
impl DatabasePool {
pub async fn connect(uri: &str, db_prefix: &str) -> Result<Self, DbError> {
if db_prefix.len() > MAX_PREFIX_LEN {
return Err(DbError::PrefixTooLong {
prefix: db_prefix.to_string(),
len: db_prefix.len(),
max: MAX_PREFIX_LEN,
});
}
let client = Client::with_uri_str(uri).await?;
client
.database("admin")
.run_command(doc! { "ping": 1 })
.await?;
tracing::info!(
"MCP MongoDB cluster reachable; per-tenant pool ready (db prefix '{db_prefix}')"
);
Ok(Self {
client,
db_prefix: db_prefix.to_string(),
seen: Arc::new(DashMap::new()),
})
}
/// Read-only handle to the tenant's database. No indexes are
/// ensured here — the agent owns writes, MCP only reads.
pub fn for_tenant_id(&self, tenant_id: &str) -> Database {
let db_name = self.tenant_db_name(tenant_id);
self.seen.insert(tenant_id.to_string(), ());
Database::new(self.client.database(&db_name))
}
/// Cross-tenant admin DB — holds the `mcp_tokens` collection that
/// the auth middleware queries to map bearer → tenant_id.
pub fn admin_db(&self) -> mongodb::Database {
self.client.database(&format!("{}__admin", self.db_prefix))
}
pub fn tenant_db_name(&self, tenant_id: &str) -> String {
let sanitized = sanitize_tenant_id(tenant_id);
let natural = format!("{}_{}", self.db_prefix, sanitized);
if natural.len() <= MAX_DB_NAME_LEN {
natural
} else {
let mut h = Sha256::new();
h.update(tenant_id.as_bytes());
let digest = h.finalize();
let suffix = hex::encode(&digest[..HASH_HEX_LEN / 2]);
format!("{}_{}", self.db_prefix, suffix)
}
}
}
fn sanitize_tenant_id(tenant_id: &str) -> String {
tenant_id
.chars()
.map(|c| match c {
'/' | '\\' | '.' | '"' | '$' | ' ' | '\0' => '_',
c => c,
})
.collect()
}
/// Typed accessors for the MCP-readable collections in a tenant DB.
/// Matches the agent's `Database` shape but only exposes what the MCP
/// tool handlers actually need.
#[derive(Clone, Debug)]
pub struct Database {
inner: mongodb::Database,
}
impl Database {
pub(crate) fn new(inner: mongodb::Database) -> Self {
Self { inner }
pub async fn connect(uri: &str, db_name: &str) -> Result<Self, mongodb::error::Error> {
let client = Client::with_uri_str(uri).await?;
let db = client.database(db_name);
db.run_command(mongodb::bson::doc! { "ping": 1 }).await?;
tracing::info!("MCP server connected to MongoDB '{db_name}'");
Ok(Self { inner: db })
}
pub fn findings(&self) -> Collection<Finding> {
+10 -35
View File
@@ -1,11 +1,10 @@
mod auth;
mod database;
mod server;
mod tools;
use std::sync::Arc;
use database::DatabasePool;
use database::Database;
use rmcp::transport::{
streamable_http_server::session::local::LocalSessionManager, StreamableHttpServerConfig,
StreamableHttpService,
@@ -25,60 +24,36 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mongo_uri =
std::env::var("MONGODB_URI").unwrap_or_else(|_| "mongodb://localhost:27017".to_string());
// MONGODB_DATABASE is reused as the per-tenant DB-name prefix —
// same convention as the agent so `<prefix>__admin.mcp_tokens`
// and `<prefix>_<tenant_id>` line up across services.
let db_prefix =
let db_name =
std::env::var("MONGODB_DATABASE").unwrap_or_else(|_| "compliance_scanner".to_string());
let pool = DatabasePool::connect(&mongo_uri, &db_prefix).await?;
let db = Database::connect(&mongo_uri, &db_name).await?;
// HTTP transport: bind a small axum router with bearer-auth in
// front of the rmcp service. `/health` stays public for orca's
// container probe.
// If MCP_PORT is set, run as Streamable HTTP server; otherwise use stdio.
if let Ok(port_str) = std::env::var("MCP_PORT") {
let port: u16 = port_str.parse()?;
tracing::info!("Starting MCP server on HTTP port {port}");
let pool_for_factory = pool.clone();
let db_clone = db.clone();
let service = StreamableHttpService::new(
move || Ok(ComplianceMcpServer::new(pool_for_factory.clone())),
move || Ok(ComplianceMcpServer::new(db_clone.clone())),
Arc::new(LocalSessionManager::default()),
StreamableHttpServerConfig::default(),
);
let router = axum::Router::new()
.route("/health", axum::routing::get(|| async { "ok" }))
.nest_service(
"/mcp",
axum::Router::new().fallback_service(service).layer(
axum::middleware::from_fn_with_state(pool.clone(), auth::bearer_auth),
),
);
.nest_service("/mcp", service);
let listener = tokio::net::TcpListener::bind(("0.0.0.0", port)).await?;
tracing::info!("MCP HTTP server listening on 0.0.0.0:{port}");
axum::serve(listener, router).await?;
} else {
// stdio transport — used when run as a local MCP server next
// to the LLM client. There's no HTTP layer to do bearer auth,
// so we synthesize a tenant_id from STDIO_TENANT_ID for local
// development. NEVER use this in production.
tracing::info!("Starting MCP server on stdio");
let synth_tenant = std::env::var("STDIO_TENANT_ID").unwrap_or_else(|_| "dev".to_string());
tracing::warn!(
tenant_id = %synth_tenant,
"stdio transport — using synthetic tenant id; DO NOT use in production"
);
let server = ComplianceMcpServer::new(pool);
let server = ComplianceMcpServer::new(db);
let transport = rmcp::transport::stdio();
use rmcp::ServiceExt;
auth::TENANT_ID
.scope(synth_tenant, async {
let handle = server.serve(transport).await?;
handle.waiting().await?;
Ok::<_, Box<dyn std::error::Error>>(())
})
.await?;
let handle = server.serve(transport).await?;
handle.waiting().await?;
}
Ok(())
+17 -46
View File
@@ -2,37 +2,20 @@ use rmcp::{
handler::server::wrapper::Parameters, model::*, tool, tool_handler, tool_router, ServerHandler,
};
use crate::auth::current_tenant_id;
use crate::database::{Database, DatabasePool};
use crate::database::Database;
use crate::tools::{dast, findings, pentest, sbom};
pub struct ComplianceMcpServer {
pool: DatabasePool,
db: Database,
#[allow(dead_code)]
tool_router: rmcp::handler::server::router::tool::ToolRouter<Self>,
}
impl ComplianceMcpServer {
/// Resolve the per-tenant `Database` from the bearer-set
/// `task_local`. Every tool handler calls this; missing context
/// surfaces as `internal_error` because it means the auth
/// middleware was misconfigured (handler ran without scope).
fn tenant_db(&self) -> Result<Database, rmcp::ErrorData> {
let tenant_id = current_tenant_id().ok_or_else(|| {
rmcp::ErrorData::internal_error(
"no tenant context — bearer middleware not in chain".to_string(),
None,
)
})?;
Ok(self.pool.for_tenant_id(&tenant_id))
}
}
#[tool_router]
impl ComplianceMcpServer {
pub fn new(pool: DatabasePool) -> Self {
pub fn new(db: Database) -> Self {
Self {
pool,
db,
tool_router: Self::tool_router(),
}
}
@@ -46,8 +29,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<findings::ListFindingsParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
findings::list_findings(&db, params).await
findings::list_findings(&self.db, params).await
}
#[tool(description = "Get a single finding by its ID")]
@@ -55,8 +37,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<findings::GetFindingParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
findings::get_finding(&db, params).await
findings::get_finding(&self.db, params).await
}
#[tool(description = "Get a summary of findings counts grouped by severity and status")]
@@ -64,8 +45,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<findings::FindingsSummaryParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
findings::findings_summary(&db, params).await
findings::findings_summary(&self.db, params).await
}
// ── SBOM ──────────────────────────────────────────────
@@ -77,8 +57,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<sbom::ListSbomPackagesParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
sbom::list_sbom_packages(&db, params).await
sbom::list_sbom_packages(&self.db, params).await
}
#[tool(
@@ -88,8 +67,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<sbom::SbomVulnReportParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
sbom::sbom_vuln_report(&db, params).await
sbom::sbom_vuln_report(&self.db, params).await
}
// ── DAST ──────────────────────────────────────────────
@@ -101,8 +79,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<dast::ListDastFindingsParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
dast::list_dast_findings(&db, params).await
dast::list_dast_findings(&self.db, params).await
}
#[tool(description = "Get a summary of recent DAST scan runs and finding counts")]
@@ -110,8 +87,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<dast::DastScanSummaryParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
dast::dast_scan_summary(&db, params).await
dast::dast_scan_summary(&self.db, params).await
}
// ── Pentest ─────────────────────────────────────────────
@@ -123,8 +99,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::ListPentestSessionsParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::list_pentest_sessions(&db, params).await
pentest::list_pentest_sessions(&self.db, params).await
}
#[tool(description = "Get a single AI pentest session by its ID")]
@@ -132,8 +107,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::GetPentestSessionParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::get_pentest_session(&db, params).await
pentest::get_pentest_session(&self.db, params).await
}
#[tool(
@@ -143,8 +117,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::GetAttackChainParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::get_attack_chain(&db, params).await
pentest::get_attack_chain(&self.db, params).await
}
#[tool(description = "Get chat messages from a pentest session")]
@@ -152,8 +125,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::GetPentestMessagesParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::get_pentest_messages(&db, params).await
pentest::get_pentest_messages(&self.db, params).await
}
#[tool(
@@ -163,8 +135,7 @@ impl ComplianceMcpServer {
&self,
Parameters(params): Parameters<pentest::PentestStatsParams>,
) -> Result<CallToolResult, rmcp::ErrorData> {
let db = self.tenant_db()?;
pentest::pentest_stats(&db, params).await
pentest::pentest_stats(&self.db, params).await
}
}
@@ -178,7 +149,7 @@ impl ServerHandler for ComplianceMcpServer {
.build(),
server_info: Implementation::from_build_env(),
instructions: Some(
"Compliance Scanner MCP server. Query security findings, SBOM data, DAST results, and AI pentest sessions for your tenant."
"Compliance Scanner MCP server. Query security findings, SBOM data, DAST results, and AI pentest sessions."
.to_string(),
),
}