Final slice of M7.2. Removes the transitional single-database handle
that M7.2-A introduced alongside the pool, so the compliance-agent
now has a single source of truth for storage: every code path obtains
a tenant-scoped Database from `agent.db_pool.for_tenant_id(...)` or
`for_tenant(&ctx)`. There is no shared "default" database anywhere.

Changes

- ComplianceAgent: `db: Database` field removed. ComplianceAgent::new
now takes only `(config, db_pool)`. A grep during M7.2-C had already
confirmed that no remaining call site reads `agent.db`.
- main.rs: stops constructing the legacy Database. Only the pool is
built at startup.
- TestServer: same — drops Database::connect/ensure_indexes, builds
only the pool. cleanup() now drops every `<db_name>_*` per-tenant
database (no longer touches a bare `<db_name>`).
- DatabasePool::list_tenant_db_names() — lists Mongo databases
matching the pool's prefix. Intended for admin endpoints and
scheduler tenant enumeration in a future M7.3 (this PR keeps the
SCHEDULER_TENANT_IDS env config; registry integration is a separate
concern).
- DatabasePool::drop_tenant(&str) — idempotent tenant offboarding.
Drops the per-tenant database and evicts the in-memory `ensured`
marker so a later re-provision re-runs ensure_indexes.
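
The interaction between the `ensured` marker, `list_tenant_db_names` and
`drop_tenant` can be sketched with an in-memory model. This is only an
illustration under assumptions: `PoolModel` and its `existing` set are
hypothetical stand-ins for the real pool and the Mongo server's database
list, and a `Mutex<HashSet>` replaces the DashMap the pool actually uses.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical in-memory model of the pool's bookkeeping (not the real type).
/// `existing` stands in for the server's database list; `ensured` is the
/// per-process "indexes already created" marker.
pub struct PoolModel {
    prefix: String,
    existing: Mutex<HashSet<String>>,
    ensured: Mutex<HashSet<String>>,
}

impl PoolModel {
    pub fn new(prefix: &str) -> Self {
        Self {
            prefix: prefix.to_string(),
            existing: Mutex::new(HashSet::new()),
            ensured: Mutex::new(HashSet::new()),
        }
    }

    fn db_name(&self, tenant_id: &str) -> String {
        format!("{}_{}", self.prefix, tenant_id)
    }

    /// Returns the tenant database name and whether index setup ran on
    /// this call (true only the first time after creation or a drop).
    pub fn for_tenant_id(&self, tenant_id: &str) -> (String, bool) {
        let name = self.db_name(tenant_id);
        self.existing.lock().unwrap().insert(name.clone());
        let ran_ensure = self.ensured.lock().unwrap().insert(name.clone());
        (name, ran_ensure)
    }

    /// Lists databases whose name starts with `<prefix>_`, sorted.
    pub fn list_tenant_db_names(&self) -> Vec<String> {
        let wanted = format!("{}_", self.prefix);
        let mut names: Vec<String> = self
            .existing
            .lock()
            .unwrap()
            .iter()
            .filter(|n| n.starts_with(&wanted))
            .cloned()
            .collect();
        names.sort();
        names
    }

    /// Idempotent: dropping an absent tenant is a no-op. Evicting the
    /// marker makes a later re-provision run index setup again.
    pub fn drop_tenant(&self, tenant_id: &str) {
        let name = self.db_name(tenant_id);
        self.existing.lock().unwrap().remove(&name);
        self.ensured.lock().unwrap().remove(&name);
    }
}
```

In this model, calling `drop_tenant` twice for the same tenant succeeds both
times, and the next `for_tenant_id` call for that tenant reports that index
setup ran again.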

Test plan

- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
-- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 6 pass
including new `admin_helpers_list_and_drop_tenant_dbs`
- cargo test -p compliance-agent --test tenant_status_middleware
— 6 pass

M7.2 closeout state after this lands

- M7.1 (auth + status) — done
- M7.2-A (pool) — done
- M7.2-B (handlers) — done
- M7.2-C (background paths) — done
- M7.2-D (legacy db removed, admin helpers) — done (this PR)
- Future M7.3: scheduler pulls tenants from tenant-registry instead
of SCHEDULER_TENANT_IDS env; cross-tenant admin HTTP endpoints
built on list_tenant_db_names / drop_tenant.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Closes the loop on M7.2 isolation for paths that don't have a JWT
context: scheduler, webhooks, and the agent's `run_scan` / `run_pr_review`
helpers all now take a `tenant_id` at the boundary and resolve to a
tenant-scoped `Database` via `db_pool.for_tenant_id(...)`. Internal
orchestrators (PipelineOrchestrator, PentestOrchestrator) and pipeline
helpers were already DB-agnostic — they take `db: Database` at
construction and don't care which tenant it points to.

Changes

- DatabasePool::for_tenant_id(&str) — same as for_tenant but accepts
a bare tenant_id. Background paths don't have a full TenantContext.
for_tenant is now a thin wrapper that delegates.
- agent.run_scan(tenant_id, repo_id, trigger) — pulls the tenant
database before constructing the PipelineOrchestrator. Was:
run_scan(repo_id, trigger) reading agent.db.
- agent.run_pr_review(tenant_id, repo_id, ...) — same shape.
- Webhook routes change: /webhook/{tenant_id}/{platform}/{repo_id}.
Tenant is part of the URL path because webhooks arrive without a
JWT — they're authenticated via per-repo HMAC, not the tenant gate.
The dashboard surfaces the full per-tenant URL when the repo is
registered. All three handlers (gitea, github, gitlab) updated.
- scheduler.rs — iterates tenants from $SCHEDULER_TENANT_IDS
(comma-separated env), or DEV_TENANT_ID's `dev` default. Both
scan_all_repos and monitor_cves now run once per configured
tenant. M7.2-D will replace this static config with a pull from
the tenant-registry.
- api/handlers/repos.rs::trigger_scan now passes tenant.0.tenant_id.
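
The two boundary inputs described above, the scheduler's tenant list and the
tenant-bearing webhook path, can be sketched as plain functions. This is a
rough illustration, not the agent's code: `scheduler_tenants`,
`parse_webhook_path` and `WebhookTarget` are hypothetical names, trimming
whitespace and skipping empty entries is an assumption, and in the real
service the HTTP router would extract the path parameters.

```rust
/// Tenants the scheduler iterates: the comma-separated list if one is
/// configured and non-empty, otherwise the single dev default.
/// `env_value` would come from SCHEDULER_TENANT_IDS.
pub fn scheduler_tenants(env_value: Option<&str>, dev_default: &str) -> Vec<String> {
    let parsed: Vec<String> = env_value
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_string)
        .collect();
    if parsed.is_empty() {
        vec![dev_default.to_string()]
    } else {
        parsed
    }
}

/// The three path parameters of /webhook/{tenant_id}/{platform}/{repo_id}.
#[derive(Debug, PartialEq)]
pub struct WebhookTarget {
    pub tenant_id: String,
    pub platform: String,
    pub repo_id: String,
}

/// Splits a webhook path into its parts. Returns None for a wrong
/// segment count, an empty segment, or an unknown platform.
pub fn parse_webhook_path(path: &str) -> Option<WebhookTarget> {
    let mut parts = path.strip_prefix("/webhook/")?.split('/');
    let tenant_id = parts.next()?;
    let platform = parts.next()?;
    let repo_id = parts.next()?;
    let extra_segment = parts.next().is_some();
    if extra_segment || tenant_id.is_empty() || repo_id.is_empty() {
        return None;
    }
    if !matches!(platform, "gitea" | "github" | "gitlab") {
        return None;
    }
    Some(WebhookTarget {
        tenant_id: tenant_id.to_string(),
        platform: platform.to_string(),
        repo_id: repo_id.to_string(),
    })
}
```

A path in the old two-segment shape, such as `/webhook/github/repo42`, is
rejected by this sketch because the tenant segment is missing.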

What's unchanged because it didn't need to change

- PipelineOrchestrator, PentestOrchestrator: take `db: Database` at
construction — they're tenant-DB-agnostic by design. The caller
picks the tenant DB.
- pipeline/{dedup,graph_build,issue_creation,sbom/mod}.rs,
pentest/{context,report/html/*}.rs, trackers/jira.rs, llm/triage.rs:
take `&Database` or `&mongodb::Database` as args, transitively
tenant-scoped via the caller.
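
The "caller picks the tenant DB" shape can be sketched as follows. The types
here are simplified stand-ins that only reuse the names from this message:
the real `Database` wraps a Mongo handle, and the `run` method and its string
result are invented purely for illustration.

```rust
/// Stand-in for the tenant-scoped handle; only the name matters here.
#[derive(Clone, Debug, PartialEq)]
pub struct Database {
    pub name: String,
}

/// Stand-in pool that maps a tenant_id to `<prefix>_<tenant_id>`.
pub struct DatabasePool {
    prefix: String,
}

impl DatabasePool {
    pub fn new(prefix: &str) -> Self {
        Self { prefix: prefix.to_string() }
    }

    pub fn for_tenant_id(&self, tenant_id: &str) -> Database {
        Database { name: format!("{}_{}", self.prefix, tenant_id) }
    }
}

/// The orchestrator receives a Database at construction and never
/// learns which tenant it belongs to.
pub struct PipelineOrchestrator {
    db: Database,
}

impl PipelineOrchestrator {
    pub fn new(db: Database) -> Self {
        Self { db }
    }

    /// Hypothetical method: reports which database a scan would use.
    pub fn run(&self, repo_id: &str) -> String {
        format!("scan {} in {}", repo_id, self.db.name)
    }
}

/// Boundary helper: resolve the tenant database first, then build the
/// orchestrator with it.
pub fn run_scan(pool: &DatabasePool, tenant_id: &str, repo_id: &str) -> String {
    let db = pool.for_tenant_id(tenant_id);
    PipelineOrchestrator::new(db).run(repo_id)
}
```

Because tenant resolution happens only in `run_scan`, the orchestrator needs
no changes when the storage model moves from one shared database to one
database per tenant.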

Test plan

- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
-- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 5 pass
- cargo test -p compliance-agent --test tenant_status_middleware
— 6 pass

What's left (PR-D)

- Drop the transitional agent.db field — no remaining call sites
(verified by `grep -rn "agent\.db\b" compliance-agent/src`).
- main.rs / TestServer stop building the legacy Database; only the
pool remains.
- Add cross-tenant admin helpers (list tenants, drop tenant DB) on
the pool for offboarding flows.
- Pull tenants from the tenant-registry instead of an env var.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Addresses review feedback on the hash-fallback path.
The original `debug_assert!(hashed.len() <= MAX_DB_NAME_LEN)` was a
debug-only check that is compiled out of release builds, so the
length bound was unenforced in production. Separately, with an 8-byte
hash truncation (~2^32 birthday-collision resistance), two tenant_ids
hashing to the same suffix would silently share a database: no panic,
no rollback, just a cross-tenant data leak. Not acceptable for a
regulated-industry product.

Changes:

- Bump hash truncation 8 → 16 bytes (32 hex chars). 2^64 birthday
resistance; collisions are not a practical concern at our scale.
- Add MAX_PREFIX_LEN (= 30) and validate db_prefix.len() at
`DatabasePool::connect`. A 30-byte prefix, a 1-byte `_` separator and
a 32-char hex suffix add up to at most 63 bytes, exactly Mongo's
db-name cap, so the hash-fallback name always fits and the
debug_assert! is dropped.
- New test `connect_rejects_overlong_db_prefix` exercises the
inclusive bound (30 passes, 31 fails).
- Existing hash-fallback test now asserts a 32-char hex suffix +
basic distinctness for two different inputs.
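
The length arithmetic behind the prefix bound can be written down as a small
sketch. The constant names MAX_DB_NAME_LEN and MAX_PREFIX_LEN come from this
message; `HASH_HEX_LEN`, `validate_prefix` and `hashed_name_len` are
hypothetical helpers, and the error type is simplified to a `String`.

```rust
/// Mongo's database-name cap in bytes.
pub const MAX_DB_NAME_LEN: usize = 63;
/// 16 bytes of hash rendered as hex (assumed name for this sketch).
pub const HASH_HEX_LEN: usize = 32;
/// Largest prefix that still leaves room for `_` plus the hex suffix.
pub const MAX_PREFIX_LEN: usize = MAX_DB_NAME_LEN - 1 - HASH_HEX_LEN;

// Compile-time check that the derived bound is the documented 30.
const _: () = assert!(MAX_PREFIX_LEN == 30);

/// Rejects prefixes that could push a hash-fallback name past the cap.
/// The bound is inclusive: exactly MAX_PREFIX_LEN bytes is accepted.
pub fn validate_prefix(prefix: &str) -> Result<(), String> {
    if prefix.len() > MAX_PREFIX_LEN {
        Err(format!(
            "db_prefix is {} bytes, maximum is {}",
            prefix.len(),
            MAX_PREFIX_LEN
        ))
    } else {
        Ok(())
    }
}

/// Length of `<prefix>_<32 hex chars>` for a given prefix.
pub fn hashed_name_len(prefix: &str) -> usize {
    prefix.len() + 1 + HASH_HEX_LEN
}
```

Validating once at connect time means every later hash-fallback name is
within the cap by construction, which is why no runtime assertion is needed
on that path.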

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

First slice of the M7.2 tenant-isolation work. Adds a `DatabasePool`
that hands out per-tenant `Database` handles physically scoped to
`<prefix>_<tenant_id>` Mongo databases. Isolation is at the driver,
not at "we hope we filter" — a handle for tenant A literally cannot
see tenant B's documents because it's connected to a different db.

What's in this PR

- DatabasePool::connect — pings the cluster, prepares per-tenant lazy
handles.
- DatabasePool::for_tenant(&TenantContext) — returns a Database scoped
to that tenant. ensure_indexes runs once per tenant per process via
a DashMap-backed marker; failure rolls the marker back so the next
request retries.
- tenant_db_name — `<prefix>_<sanitized_tenant_id>` if it fits in
Mongo's 63-byte db-name cap, else `<prefix>_<sha256-16hex>` fallback.
- Sanitizer rewrites the Mongo-disallowed chars (`/ \ . " $ <space>
NUL`) so any future tenant_id shape works.
- ComplianceAgent gains a `db_pool: DatabasePool` field next to the
existing `db: Database`. Handlers / pipelines / webhooks still use
`db` — they migrate to `db_pool.for_tenant(&ctx)` in M7.2-B/C and
`db` goes away in M7.2-D.
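
The naming rule above can be sketched as follows. Several details are
assumptions made for illustration: the replacement character `_`, the helper
name `sanitize_tenant_id`, and the `hash_hex` parameter, which stands in for
the truncated SHA-256 hex digest so the sketch stays dependency-free.

```rust
/// Mongo's database-name cap in bytes.
pub const MAX_DB_NAME_LEN: usize = 63;

/// Replaces the characters Mongo disallows in database names
/// (`/ \ . " $`, space, NUL). Using `_` as the replacement is an assumption.
pub fn sanitize_tenant_id(tenant_id: &str) -> String {
    tenant_id
        .chars()
        .map(|c| match c {
            '/' | '\\' | '.' | '"' | '$' | ' ' | '\0' => '_',
            other => other,
        })
        .collect()
}

/// Returns `<prefix>_<sanitized_tenant_id>` when that fits in the cap,
/// otherwise `<prefix>_<hash_hex(tenant_id)>`. `hash_hex` is supplied by
/// the caller and should return a fixed-length hex digest.
pub fn tenant_db_name(
    prefix: &str,
    tenant_id: &str,
    hash_hex: impl Fn(&str) -> String,
) -> String {
    let readable = format!("{}_{}", prefix, sanitize_tenant_id(tenant_id));
    if readable.len() <= MAX_DB_NAME_LEN {
        readable
    } else {
        format!("{}_{}", prefix, hash_hex(tenant_id))
    }
}
```

Hashing the original tenant_id rather than the sanitized one keeps two ids
that sanitize to the same string distinct on the fallback path; whether the
real implementation does this is not stated here.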

Test plan

- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard -- -D warnings
clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 4 pass
against live mongo on 27017:
* pool_isolates_tenants_at_driver_level — writes for acme + globex,
reads through each tenant's handle; each sees exactly its own
data with no filter doc anywhere.
* for_tenant_is_idempotent_index_creation — second + third call
for the same tenant do not error.
* tenant_db_name_sanitizes_unsafe_characters
* tenant_db_name_falls_back_to_hash_when_too_long — 100-byte
tenant_id collapses to a stable 8-byte (16 hex char) suffix.

Why per-tenant DB vs `tenant_id` field + filter

- Driver-level isolation; impossible to forget the filter on one of
the 184 query call-sites in compliance-agent.
- Handlers don't change shape at migration — `agent.db.findings()`
becomes `db.findings()` after pulling `db` from
`agent.db_pool.for_tenant(&ctx)`.
- GDPR delete = `db.dropDatabase()`.
- On-prem deploy = the same code path, with one tenant.
- Trade-off accepted: index storage duplicated per tenant; Mongo's
~thousand-db ceiling is way above the 10s-100s tenants we're
targeting.

Caveats

- Existing `agent.db` continues to point at the single legacy db.
Handlers / pipelines that use it are unscoped until M7.2-B/C
migrate them.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add DAST scanning and code knowledge graph features across the stack:
- compliance-dast and compliance-graph workspace crates
- Agent API handlers and routes for DAST targets/scans and graph builds
- Core models and traits for DAST and graph domains
- Dashboard pages for DAST targets/findings/overview and graph explorer/impact
- Toast notification system with auto-dismiss for async action feedback
- Button click animations and disabled states for better UX

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Run cargo fmt on all crates
- Fix regex patterns using unsupported lookahead in patterns.rs
- Replace unwrap() calls with compile_regex() helper
- Fix never type fallback in GitHub tracker
- Fix redundant field name in findings page
- Allow enum_variant_names for Dioxus Route enum
- Fix &mut Vec -> &mut [T] clippy lint in sbom.rs
- Mark unused-but-intended APIs with #[allow(dead_code)]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>