LLM clients (Claude Desktop, Cursor, ChatGPT) can't run a Keycloak
OIDC flow, so the MCP server can't use JWTs for auth. This PR
introduces opaque static bearer tokens minted per-tenant via new
agent endpoints, validated by the MCP server, and used to route
incoming MCP requests to the caller's per-tenant database.
Until now, the MCP server connected to a single shared MongoDB database
with no auth and no tenant awareness — every tool (list_findings,
list_sbom_packages, etc.) returned data across all tenants. After
M7.2 made the agent per-tenant, MCP was the lone cross-tenant data
leak. This closes it.
Design summary
- Token format: `mcpt_<43 url-safe random chars>` (48 chars total).
Opaque, never embeds tenant_id, never stored in plaintext.
- Storage: cross-tenant `<prefix>__admin.mcp_tokens` collection,
keyed by SHA-256 hash. Each row carries the tenant_id, name,
created_by, created_at, last_used_at, revoked flag.
- Agent endpoints (tenant-scoped via TenantCtx):
POST /api/v1/mcp-tokens → mint (returns raw token ONCE)
GET /api/v1/mcp-tokens → list (metadata + 12-char prefix,
never the hash)
DELETE /api/v1/mcp-tokens/:id → soft revoke
- MCP middleware: extract `Authorization: Bearer mcpt_...`, sniff
the prefix, SHA-256 → lookup in admin DB → reject if missing or
revoked. Updates last_used_at fire-and-forget so it never blocks.
Sets `tokio::task_local!` TENANT_ID for the inner service call;
the rmcp tool handlers read it and resolve the per-tenant DB.
- task_local is scoped via TENANT_ID.scope(...) around next.run(req)
so the rmcp tool handlers downstream see the tenant_id without
modifying their (macro-generated) signatures.
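The middleware's accept/reject path can be sketched roughly as below. This is a stdlib-only illustration, not the code in auth.rs: `TokenRow`, `AuthError`, `extract_mcp_token` and `resolve_tenant` are hypothetical names, the `HashMap` stands in for the `mcp_tokens` collection, and the SHA-256 hex digest is assumed to be computed by the caller before the lookup.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one row of the `mcp_tokens` collection.
#[derive(Debug, Clone)]
struct TokenRow {
    tenant_id: String,
    revoked: bool,
}

#[derive(Debug, PartialEq)]
enum AuthError {
    MissingBearer,
    WrongPrefix,
    UnknownToken,
    Revoked,
}

/// Pull the raw token out of an `Authorization` header value and
/// sniff the `mcpt_` prefix before doing any database work.
fn extract_mcp_token(header: &str) -> Result<&str, AuthError> {
    let token = header
        .strip_prefix("Bearer ")
        .ok_or(AuthError::MissingBearer)?
        .trim();
    if token.starts_with("mcpt_") {
        Ok(token)
    } else {
        Err(AuthError::WrongPrefix)
    }
}

/// Map a token hash to its tenant. `token_hash` is assumed to be the
/// SHA-256 hex digest of the raw token; hashing is out of scope here.
fn resolve_tenant<'a>(
    store: &'a HashMap<String, TokenRow>,
    token_hash: &str,
) -> Result<&'a str, AuthError> {
    match store.get(token_hash) {
        None => Err(AuthError::UnknownToken),
        Some(row) if row.revoked => Err(AuthError::Revoked),
        Some(row) => Ok(row.tenant_id.as_str()),
    }
}
```

In the real middleware the resolved tenant_id would then be handed to TENANT_ID.scope(...) around next.run(req); that step needs tokio and is left out of this sketch.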
Files
- compliance-core/src/models/mcp_token.rs (new) — McpToken +
McpTokenView (public projection without the hash).
- compliance-agent/src/database.rs — DatabasePool::admin_db() +
admin_db_name(): cross-tenant access for token storage.
- compliance-agent/src/api/handlers/mcp_tokens.rs (new) — three
endpoints. Token generation: 32 random bytes → URL-safe base64,
no padding. SHA-256 hex stored.
- compliance-mcp/src/database.rs — replaced single Database with
DatabasePool. Tenant-scoped Database constructed per request.
Same sanitization + 63-byte cap + hash fallback as the agent.
- compliance-mcp/src/auth.rs (new) — bearer middleware + task_local.
Includes a SHA-256 round-trip test against a known vector.
- compliance-mcp/src/main.rs — HTTP transport: bearer middleware
layered on /mcp (not /health, so orca's container probe still
works). stdio transport: falls back to STDIO_TENANT_ID env (defaults
to "dev") so local development still works; logged loudly as
not-for-production.
- compliance-mcp/src/server.rs — each of the 12 tool handlers
resolves the per-tenant DB via task_local before calling its tool
fn. Tool fns themselves are unchanged.
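For reviewers checking the token length arithmetic (32 random bytes → 43 base64 chars → 48 with the `mcpt_` prefix), here is a minimal stdlib-only sketch. The handler itself presumably uses a base64 crate and a CSPRNG; `base64url_nopad` and `format_token` are hypothetical names, and the 32 bytes are taken as an argument so the sketch needs no randomness source.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// URL-safe base64 (RFC 4648 section 5) without `=` padding.
fn base64url_nopad(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() * 4 + 2) / 3);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let mut n: u32 = 0;
        for (i, b) in chunk.iter().enumerate() {
            n |= (*b as u32) << (16 - 8 * i);
        }
        // 1 byte -> 2 chars, 2 bytes -> 3 chars, 3 bytes -> 4 chars.
        for i in 0..=chunk.len() {
            let idx = (n >> (18 - 6 * i)) & 0x3f;
            out.push(ALPHABET[idx as usize] as char);
        }
    }
    out
}

/// Build the token string from 32 bytes that the caller is assumed
/// to have drawn from a cryptographically secure RNG.
fn format_token(random: &[u8; 32]) -> String {
    format!("mcpt_{}", base64url_nopad(random))
}
```

Ten full 3-byte chunks give 40 characters and the trailing 2 bytes give 3 more, so the encoded part is 43 characters and the whole token is 48.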
Token UX
- Generated by the dashboard (or curl + KC JWT) — user sees raw
token exactly once, copies it into their LLM client config.
- Dashboard UI for management is a follow-up; can use curl in the
meantime:
curl -X POST https://comp-dev.../api/v1/mcp-tokens \
-H "Authorization: Bearer $KC_JWT" \
-H "Content-Type: application/json" \
-d '{"name":"Claude Desktop"}'
Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
-- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 230 pass (+2 new for
token generation + sha256 stability)
- cargo test -p compliance-agent --test tenant_isolation — 6 pass
- cargo test -p compliance-mcp — 34 pass (+1 new sha256 vector)
What's deferred
- Dashboard UI for managing tokens (page + create modal + list/
revoke). Trivial once the API is live.
- Token expiry + per-tool scope (today every token grants access
to all 12 tools for its tenant).
- Lifting DatabasePool into compliance-core (duplicated for now
in compliance-mcp to keep this PR focused; lift if a third
consumer appears).
Production
- The `<prefix>__admin` DB must not collide with a tenant DB. No
current tenant_id shape (UUIDs) sanitizes to a value starting
with `_admin`; the constraint is flagged in the database.rs
docstring so tenant provisioning can reject `_admin*` ids
proactively.
- orca-infra MCP service block already has MONGODB_URI /
MONGODB_DATABASE — no new env needed. No KC creds since MCP
doesn't use Keycloak for its own auth.
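A provisioning-side guard for the `_admin` collision could look roughly like this. It is a sketch under stated assumptions: tenant DB names are assumed to be `<prefix>_<sanitized tenant_id>`, `sanitize` is a hypothetical stand-in for the real sanitizer (the 63-byte cap and hash fallback are omitted), and the prefix `compliance` in the usage is invented for illustration.

```rust
/// Hypothetical sanitizer: keep ASCII alphanumerics, map every other
/// character to `_`. The real rules live in database.rs.
fn sanitize(tenant_id: &str) -> String {
    tenant_id
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '_' })
        .collect()
}

/// Assumed naming scheme for per-tenant databases.
fn tenant_db_name(prefix: &str, tenant_id: &str) -> String {
    format!("{}_{}", prefix, sanitize(tenant_id))
}

/// Name of the cross-tenant admin database.
fn admin_db_name(prefix: &str) -> String {
    format!("{}__admin", prefix)
}

/// Reject ids whose sanitized form starts with `_admin`, since those
/// could map onto (or shadow) the admin database name.
fn is_reserved_tenant_id(tenant_id: &str) -> bool {
    sanitize(tenant_id).starts_with("_admin")
}
```

Under these assumptions a tenant_id of `_admin` (or `-admin`, which sanitizes to the same string) would resolve to the admin DB name, while UUID-shaped ids are never reserved.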
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Compliance Scanner
Autonomous security and compliance scanning agent for git repositories
About
Compliance Scanner is an autonomous agent that continuously monitors git repositories for security vulnerabilities, GDPR/OAuth compliance patterns, and dependency risks. It creates issues in external trackers (GitHub/GitLab/Jira/Gitea) with evidence and remediation suggestions, reviews pull requests with multi-pass LLM analysis, runs autonomous penetration tests, and exposes a Dioxus-based dashboard for visualization.
How it works: The agent runs as a lazy daemon -- it only scans when new commits are detected, triggered by cron schedules or webhooks. LLM-powered triage filters out false positives and generates actionable remediation with multi-language awareness.
Features
| Area | Capabilities |
|---|---|
| SAST Scanning | Semgrep-based static analysis with auto-config rules |
| SBOM Generation | Syft + cargo-audit for complete dependency inventory |
| CVE Monitoring | OSV.dev batch queries, NVD CVSS enrichment, SearXNG context |
| GDPR Patterns | Detect PII logging, missing consent, hardcoded retention, missing deletion |
| OAuth Patterns | Detect implicit grant, missing PKCE, token in localStorage, token in URLs |
| LLM Triage | Multi-language-aware confidence scoring (Rust, Python, Go, Java, Ruby, PHP, C++) |
| Issue Creation | Auto-create issues in GitHub, GitLab, Jira, or Gitea with dedup via fingerprints |
| PR Reviews | Multi-pass security review (logic, security, convention, complexity) with dedup |
| DAST Scanning | Black-box security testing with endpoint discovery and parameter fuzzing |
| AI Pentesting | Autonomous LLM-orchestrated penetration testing with encrypted reports |
| Code Graph | Interactive code knowledge graph with impact analysis |
| AI Chat (RAG) | Natural language Q&A grounded in repository source code |
| Help Assistant | Documentation-grounded help chat accessible from every dashboard page |
| MCP Server | Expose live security data to Claude, Cursor, and other AI tools |
| Dashboard | Fullstack Dioxus UI with findings, SBOM, issues, DAST, pentest, and graph |
| Webhooks | GitHub, GitLab, and Gitea webhook receivers for push/PR events |
| Finding Dedup | SHA-256 fingerprint dedup for SAST, CWE-based dedup for DAST findings |
Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│ Cargo Workspace │
├──────────────┬──────────────────┬──────────────┬──────────┬─────────────┤
│ compliance- │ compliance- │ compliance- │ complian-│ compliance- │
│ core (lib) │ agent (bin) │ dashboard │ ce-graph │ mcp (bin) │
│ │ │ (bin) │ (lib) │ │
│ Models │ Scan Pipeline │ Dioxus 0.7 │ Tree- │ MCP Server │
│ Traits │ LLM Client │ Fullstack UI │ sitter │ Live data │
│ Config │ Issue Trackers │ Help Chat │ Graph │ for AI │
│ Errors │ Pentest Engine │ Server Fns │ Embedds │ tools │
│ │ DAST Tools │ │ RAG │ │
│ │ REST API │ │ │ │
│ │ Webhooks │ │ │ │
└──────────────┴──────────────────┴──────────────┴──────────┴─────────────┘
│
MongoDB (shared)
Scan Pipeline (7 Stages)
- Change Detection -- `git2` fetch, compare HEAD SHA with last scanned commit
- Semgrep SAST -- CLI wrapper with JSON output parsing
- SBOM Generation -- Syft (CycloneDX) + cargo-audit vulnerability merge
- CVE Scanning -- OSV.dev batch + NVD CVSS enrichment + SearXNG context
- Pattern Scanning -- Regex-based GDPR and OAuth compliance checks
- LLM Triage -- LiteLLM confidence scoring, filter findings < 3/10
- Issue Creation -- Dedup via SHA-256 fingerprint, create tracker issues
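The first and sixth stages reduce to two small decisions, sketched below with stdlib Rust only. The names `needs_scan`, `Finding` and `triage` are hypothetical and not taken from the agent's source; the sketch assumes the confidence score is an integer from 0 to 10 and that findings scoring below 3 are dropped, as the list above states.

```rust
/// Stage 1: scan only when HEAD differs from the last scanned commit
/// (or when the repository has never been scanned).
fn needs_scan(head_sha: &str, last_scanned: Option<&str>) -> bool {
    last_scanned != Some(head_sha)
}

/// Hypothetical finding shape: just a title and an LLM confidence score.
struct Finding {
    title: String,
    confidence: u8, // assumed scale: 0..=10
}

/// Stage 6: keep findings scored 3/10 or higher, drop the rest.
fn triage(findings: Vec<Finding>) -> Vec<Finding> {
    findings.into_iter().filter(|f| f.confidence >= 3).collect()
}
```

This is what makes the daemon "lazy": a cron tick or webhook that finds an unchanged HEAD returns early before any scanner runs.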
Tech Stack
| Layer | Technology |
|---|---|
| Shared Library | compliance-core -- models, traits, config |
| Agent | Axum REST API, git2, tokio-cron-scheduler, Semgrep, Syft |
| Dashboard | Dioxus 0.7.3 fullstack, Tailwind CSS 4 |
| Code Graph | compliance-graph -- tree-sitter parsing, embeddings, RAG |
| MCP Server | compliance-mcp -- Model Context Protocol for AI tools |
| DAST | compliance-dast -- dynamic application security testing |
| Database | MongoDB with typed collections |
| LLM | LiteLLM (OpenAI-compatible API for chat, triage, embeddings) |
| Issue Trackers | GitHub (octocrab), GitLab (REST v4), Jira (REST v3), Gitea |
| CVE Sources | OSV.dev, NVD, SearXNG |
| Auth | Keycloak (OAuth2/PKCE, SSO) |
| Browser Automation | Chromium (headless, for pentesting and PDF generation) |
Getting Started
Prerequisites
- Rust 1.94+
- Dioxus CLI (`dx`)
- MongoDB
- Docker & Docker Compose (optional)
Optional External Tools
- Semgrep -- for SAST scanning
- Syft -- for SBOM generation
- cargo-audit -- for Rust dependency auditing
Setup
# Clone the repository
git clone <repo-url>
cd compliance-scanner
# Start MongoDB + SearXNG
docker compose up -d mongo searxng
# Configure environment
cp .env.example .env
# Edit .env with your LiteLLM, tracker tokens, and MongoDB settings
# Run the agent
cargo run -p compliance-agent
# Run the dashboard (separate terminal)
dx serve --features server --platform web
Docker Compose (Full Stack)
docker compose up -d
This starts MongoDB, SearXNG, the agent (port 3001), and the dashboard (port 8080).
REST API
The agent exposes a REST API on port 3001:
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/v1/health` | Health check |
| `GET` | `/api/v1/stats/overview` | Summary statistics and trends |
| `GET` | `/api/v1/repositories` | List tracked repositories |
| `POST` | `/api/v1/repositories` | Add a repository to track |
| `POST` | `/api/v1/repositories/:id/scan` | Trigger a manual scan |
| `GET` | `/api/v1/findings` | List findings (filterable) |
| `GET` | `/api/v1/findings/:id` | Get finding with code evidence |
| `PATCH` | `/api/v1/findings/:id/status` | Update finding status |
| `GET` | `/api/v1/sbom` | List dependencies |
| `GET` | `/api/v1/issues` | List cross-tracker issues |
| `GET` | `/api/v1/scan-runs` | Scan execution history |
| `GET` | `/api/v1/graph/:repo_id` | Code knowledge graph |
| `POST` | `/api/v1/graph/:repo_id/build` | Trigger graph build |
| `GET` | `/api/v1/dast/targets` | List DAST targets |
| `POST` | `/api/v1/dast/targets` | Add DAST target |
| `GET` | `/api/v1/dast/findings` | List DAST findings |
| `POST` | `/api/v1/chat/:repo_id` | RAG-powered code chat |
| `POST` | `/api/v1/help/chat` | Documentation-grounded help chat |
| `POST` | `/api/v1/pentest/sessions` | Create pentest session |
| `POST` | `/api/v1/pentest/sessions/:id/export` | Export encrypted pentest report |
| `POST` | `/webhook/github` | GitHub webhook (HMAC-SHA256) |
| `POST` | `/webhook/gitlab` | GitLab webhook (token verify) |
| `POST` | `/webhook/gitea` | Gitea webhook |
Dashboard Pages
| Page | Description |
|---|---|
| Overview | Stat cards, severity distribution, AI chat cards, MCP status |
| Repositories | Add/manage tracked repos, trigger scans, webhook config |
| Findings | Filterable table by severity, type, status, scanner |
| Finding Detail | Code evidence, remediation, suggested fix, linked issue |
| SBOM | Dependency inventory with vulnerability badges, license summary |
| Issues | Cross-tracker view (GitHub + GitLab + Jira + Gitea) |
| Code Graph | Interactive architecture visualization, impact analysis |
| AI Chat | RAG-powered Q&A about repository code |
| DAST | Dynamic scanning targets, findings, and scan history |
| Pentest | AI-driven pentest sessions, attack chain visualization |
| MCP Servers | Model Context Protocol server management |
| Help Chat | Floating assistant (available on every page) for product Q&A |
Project Structure
compliance-scanner/
├── compliance-core/ Shared library (models, traits, config, errors)
├── compliance-agent/ Agent daemon (pipeline, LLM, trackers, API, webhooks)
│ └── src/
│ ├── pipeline/ 7-stage scan pipeline, dedup, PR reviews, code review
│ ├── llm/ LiteLLM client, triage, descriptions, fixes, review prompts
│ ├── trackers/ GitHub, GitLab, Jira, Gitea integrations
│ ├── pentest/ AI-driven pentest orchestrator, tools, reports
│ ├── rag/ RAG pipeline, chunking, embedding
│ ├── api/ REST API (Axum), help chat
│ └── webhooks/ GitHub, GitLab, Gitea webhook receivers
├── compliance-dashboard/ Dioxus fullstack dashboard
│ └── src/
│ ├── components/ Reusable UI (sidebar, help chat, attack chain, etc.)
│ ├── infrastructure/ Server functions, DB, config, auth
│ └── pages/ Full page views (overview, DAST, pentest, graph, etc.)
├── compliance-graph/ Code knowledge graph (tree-sitter, embeddings, RAG)
├── compliance-dast/ Dynamic application security testing
├── compliance-mcp/ Model Context Protocol server
├── docs/ VitePress documentation site
├── assets/ Static assets (CSS, icons)
└── styles/ Tailwind input stylesheet
External Services
| Service | Purpose | Default URL |
|---|---|---|
| MongoDB | Persistence | mongodb://localhost:27017 |
| LiteLLM | LLM proxy (chat, triage, embeddings) | http://localhost:4000 |
| SearXNG | CVE context search | http://localhost:8888 |
| Keycloak | Authentication (OAuth2/PKCE, SSO) | http://localhost:8080 |
| Semgrep | SAST scanning | CLI tool |
| Syft | SBOM generation | CLI tool |
| Chromium | Headless browser (pentesting, PDF) | Managed via Docker |
Built with Rust, Dioxus, and a commitment to automated security compliance.