08c4ec4cff4ceaf7d649fa51464acdb7f54ad98b
5 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
08c4ec4cff |
feat(m7.2-D): drop transitional agent.db, add admin helpers
CI / Check (pull_request) Successful in 9m27s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Final slice of M7.2. Removes the transitional single-database handle
that M7.2-A introduced alongside the pool, so the compliance-agent now
has a single source of truth for storage: every code path obtains a
tenant-scoped Database from `agent.db_pool.for_tenant_id(...)` or
`for_tenant(&ctx)`. There is no shared "default" database anywhere.
Changes
- ComplianceAgent: `db: Database` field removed. ComplianceAgent::new
now takes only `(config, db_pool)`. Verified by an earlier grep during
M7.2-C that no remaining call site reads `agent.db`.
- main.rs: stops constructing the legacy Database. Only the pool is
built at startup.
- TestServer: likewise drops Database::connect/ensure_indexes and
builds only the pool. cleanup() now drops every `<db_name>_*`
per-tenant database (no longer touches a bare `<db_name>`).
- DatabasePool::list_tenant_db_names(): lists Mongo databases matching
the pool's prefix. For admin endpoints + scheduler tenant enumeration
in a future M7.3 (this PR keeps SCHEDULER_TENANT_IDS env config;
registry integration is a separate concern).
- DatabasePool::drop_tenant(&str): idempotent tenant offboarding.
Drops the per-tenant database and evicts the in-memory `ensured`
marker so a later re-provision re-runs ensure_indexes.
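The two admin helpers described above can be sketched against an in-memory stand-in. This is not the PR's code: `FakeAdminPool` and its fields are hypothetical, a `BTreeSet` of names plays the role of the Mongo cluster, and the real helpers are presumably async and talk to the driver. The sketch only illustrates prefix matching and why a repeated drop is harmless.

```rust
use std::collections::{BTreeSet, HashSet};

/// Hypothetical in-memory stand-in for the cluster plus the pool's
/// `ensured` marker. Not the real DatabasePool.
struct FakeAdminPool {
    prefix: String,
    cluster_dbs: BTreeSet<String>, // every database name on the "cluster"
    ensured: HashSet<String>,      // db names whose indexes were ensured
}

impl FakeAdminPool {
    /// Lists the databases whose names start with `<prefix>_`.
    fn list_tenant_db_names(&self) -> Vec<String> {
        let wanted = format!("{}_", self.prefix);
        self.cluster_dbs
            .iter()
            .filter(|name| name.starts_with(wanted.as_str()))
            .cloned()
            .collect()
    }

    /// Drops the tenant's database and evicts its marker. Both removals
    /// are no-ops when the entry is already gone, so calling this twice
    /// has the same effect as calling it once.
    fn drop_tenant(&mut self, tenant_id: &str) {
        let name = format!("{}_{}", self.prefix, tenant_id);
        self.cluster_dbs.remove(&name);
        self.ensured.remove(&name);
    }
}
```

Evicting the marker together with the database matters for re-provisioning: if the marker survived, a later `for_tenant` call for the same tenant would skip index creation on the freshly created database.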
Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
-- -D warnings clean
- cargo test -p compliance-core --lib: 7 pass
- cargo test -p compliance-agent --lib: 228 pass
- cargo test -p compliance-agent --test tenant_isolation: 6 pass,
including new `admin_helpers_list_and_drop_tenant_dbs`
- cargo test -p compliance-agent --test tenant_status_middleware:
6 pass
M7.2 closeout state after this lands
- M7.1 (auth + status): done
- M7.2-A (pool): done
- M7.2-B (handlers): done
- M7.2-C (background paths): done
- M7.2-D (legacy db removed, admin helpers): done (this PR)
- Future M7.3: scheduler pulls tenants from tenant-registry instead of
SCHEDULER_TENANT_IDS env; cross-tenant admin HTTP endpoints built on
list_tenant_db_names / drop_tenant.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
cdfbb62f9d |
feat(m7.2-B): migrate API handlers to per-tenant database pool
CI / Check (pull_request) Successful in 8m9s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
Builds on PR M7.2-A. Every HTTP handler in compliance-agent/src/api/
now takes a TenantCtx extractor and pulls a tenant-scoped Database
from agent.db_pool.for_tenant(&ctx). The query bodies are unchanged:
`db.findings().find(doc! {...})` reads from the tenant's own physical
database, so a wrong or missing filter doc cannot leak data across
tenants. Another tenant's data lives in a different database that this
handle cannot reach.
Changes
- New `dto::tenant_db(&agent, &tenant) -> Result<Database, StatusCode>`
helper. Every migrated handler calls it at the top of the body
instead of `let db = &agent.db;`. 500 on the rare pool failure;
4xx auth failures are already handled by the M7.1 status gate.
- New `api::server::inject_dev_tenant` middleware mounted only when
Keycloak is NOT configured. Synthesizes a TenantContext with
tenant_id = $DEV_TENANT_ID (default `dev`) so `cargo run` against
a bare Mongo + no KC still serves the API. Logged loudly as
"DO NOT use in any environment with real customer data".
- Test harness: TestServer mounts inject_dev_tenant so existing E2E
tests reach handlers; cleanup() now drops every <db_name>_*
per-tenant database, not just the legacy <db_name>.
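The `$DEV_TENANT_ID` fallback mentioned above could look roughly like this. The function names are hypothetical, and treating an empty value as unset is an assumption of this sketch rather than documented behaviour of `inject_dev_tenant`.

```rust
use std::env;

/// Resolves the dev tenant id from an optional override, defaulting to
/// `dev`. Assumption: an empty override is treated the same as unset.
fn resolve_dev_tenant_id(override_value: Option<String>) -> String {
    override_value
        .filter(|v| !v.is_empty())
        .unwrap_or_else(|| "dev".to_string())
}

/// Reads `DEV_TENANT_ID` from the process environment.
fn dev_tenant_id_from_env() -> String {
    resolve_dev_tenant_id(env::var("DEV_TENANT_ID").ok())
}
```

Splitting the pure resolution step from the environment read keeps the default testable without mutating process-wide state.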
Files migrated (handler count, all pass `cargo build`):
- chat.rs (3) — also rewires RagPipeline + EmbeddingStore to the
tenant DB's inner() so vector search is per-tenant
- dast.rs (5)
- findings.rs (5)
- graph.rs (7) — also rewires GraphStore inside trigger_build's
spawn to the tenant DB
- health.rs (1) — stats_overview migrated; public /health stays
un-scoped
- issues.rs (1)
- notifications.rs (5)
- pentest_handlers/session.rs (12) — both wizard + legacy paths,
plus pause/resume/stop/get_attack_chain/get_messages/
get_session_findings/lookup_repo. PentestOrchestrator now gets
the tenant DB clone in its spawn.
- pentest_handlers/export.rs (1) — fans out across sessions,
attack_chain_nodes, dast_findings, findings, sbom_entries,
graph_nodes from a single tenant_db acquisition
- pentest_handlers/stats.rs (1)
- pentest_handlers/stream.rs (1) — SSE handler verifies session
via the tenant DB before subscribing
- repos.rs (6)
- sbom.rs (5)
- scans.rs (1)
help_chat.rs has no DB queries and was skipped.
Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
-- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 5 pass
(driver-level isolation still holds post-handler migration)
- cargo test -p compliance-agent --test tenant_status_middleware
— 6 pass
What's not yet migrated (PR-C / PR-D)
- scheduler.rs (6 sites), pipeline/orchestrator.rs (14),
pentest/orchestrator.rs (13), webhooks (gitea/github/gitlab),
trackers/jira.rs, pipeline/dedup.rs etc. — background paths
without a JWT-derived tenant context.
- agent.db is still in the ComplianceAgent struct as a transitional
handle for those paths. PR-D removes it once PR-C migrates the
background paths.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
e3aabe7d18 |
feat(m7.2-A): introduce per-tenant DatabasePool
CI / Check (pull_request) Successful in 8m40s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
First slice of the M7.2 tenant-isolation work. Adds a `DatabasePool`
that hands out per-tenant `Database` handles physically scoped to
`<prefix>_<tenant_id>` Mongo databases. Isolation is at the driver,
not at "we hope we filter" — a handle for tenant A literally cannot
see tenant B's documents because it's connected to a different db.
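To illustrate the idea with no Mongo involved, here is a minimal in-memory analogue. `FakeDb` and `FakePool` are hypothetical stand-ins, not the real `Database` / `DatabasePool`; the point is only that each tenant id maps to its own store, so a read through one handle needs no filter to exclude the other tenant.

```rust
use std::collections::HashMap;

/// Stand-in for one physical database: just a list of documents.
#[derive(Default)]
struct FakeDb {
    findings: Vec<String>,
}

/// Hypothetical in-memory analogue of a per-tenant pool.
struct FakePool {
    prefix: String,
    dbs: HashMap<String, FakeDb>,
}

impl FakePool {
    fn new(prefix: &str) -> Self {
        Self {
            prefix: prefix.to_string(),
            dbs: HashMap::new(),
        }
    }

    /// Returns the handle for `<prefix>_<tenant_id>`, creating it lazily.
    fn for_tenant_id(&mut self, tenant_id: &str) -> &mut FakeDb {
        let name = format!("{}_{}", self.prefix, tenant_id);
        self.dbs.entry(name).or_default()
    }
}
```

Writing through `for_tenant_id("acme")` and then reading through `for_tenant_id("globex")` returns only globex's documents, which mirrors what the `pool_isolates_tenants_at_driver_level` test checks against a live cluster.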
What's in this PR
- DatabasePool::connect — pings the cluster, prepares per-tenant lazy
handles.
- DatabasePool::for_tenant(&TenantContext) — returns a Database scoped
to that tenant. ensure_indexes runs once per tenant per process via
a DashMap-backed marker; failure rolls the marker back so the next
request retries.
- tenant_db_name — `<prefix>_<sanitized_tenant_id>` if it fits in
Mongo's 63-byte db-name cap, else `<prefix>_<sha256-16hex>` fallback.
- Sanitizer rewrites the Mongo-disallowed chars (`/ \ . " $ <space>
NUL`) so any future tenant_id shape works.
- ComplianceAgent gains a `db_pool: DatabasePool` field next to the
existing `db: Database`. Handlers / pipelines / webhooks still use
`db` — they migrate to `db_pool.for_tenant(&ctx)` in M7.2-B/C and
`db` goes away in M7.2-D.
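The once-per-tenant marker with rollback on failure can be sketched with the standard library alone. This is an assumption-laden sketch: a `Mutex<HashSet>` stands in for the DashMap-backed marker, `EnsuredMarker` and `ensure_once` are hypothetical names, and the real `ensure_indexes` is presumably async.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Tracks which tenants have had their indexes ensured in this process.
/// Assumption: Mutex<HashSet> stands in for the DashMap-backed marker.
struct EnsuredMarker {
    ensured: Mutex<HashSet<String>>,
}

impl EnsuredMarker {
    fn new() -> Self {
        Self {
            ensured: Mutex::new(HashSet::new()),
        }
    }

    /// Runs `ensure` at most once per tenant. If it fails, the marker is
    /// rolled back so the next call for that tenant retries.
    fn ensure_once<F>(&self, tenant_id: &str, ensure: F) -> Result<(), String>
    where
        F: FnOnce() -> Result<(), String>,
    {
        // Claim the marker first; `insert` returns false if it was
        // already present. The lock guard is dropped at the end of
        // this statement, so `ensure` runs without holding the lock.
        let newly_claimed = self.ensured.lock().unwrap().insert(tenant_id.to_string());
        if !newly_claimed {
            return Ok(());
        }
        let result = ensure();
        if result.is_err() {
            // Roll back so a later request can try again.
            self.ensured.lock().unwrap().remove(tenant_id);
        }
        result
    }
}
```

One consequence of claiming before running: a concurrent caller for the same tenant can see the marker set while index creation is still in flight and proceed anyway. Whether the real pool tolerates that window or closes it is not stated in the source.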
Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard -- -D warnings
clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 4 pass
against live mongo on 27017:
* pool_isolates_tenants_at_driver_level — writes for acme + globex,
reads through each tenant's handle; each sees exactly its own
data with no filter doc anywhere.
* for_tenant_is_idempotent_index_creation — second + third call
for the same tenant do not error.
* tenant_db_name_sanitizes_unsafe_characters
* tenant_db_name_falls_back_to_hash_when_too_long: a 100-byte
tenant_id collapses to a stable 16-hex-character suffix (8 bytes of
the digest).
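The naming rule exercised by the last two tests can be sketched as follows. Two things here are assumptions, not taken from the PR: the replacement character `_`, and the hash, which is a hand-rolled FNV-1a 64-bit used only so the example needs no external crate. The PR describes a SHA-256-derived 16-hex suffix, and FNV-1a is not a cryptographic substitute for it.

```rust
/// Maximum database-name length in bytes, per the 63-byte cap cited above.
const MAX_DB_NAME_BYTES: usize = 63;

/// Replaces the disallowed characters listed above.
/// Assumption: `_` is the replacement; the source does not say.
fn sanitize(tenant_id: &str) -> String {
    tenant_id
        .chars()
        .map(|c| match c {
            '/' | '\\' | '.' | '"' | '$' | ' ' | '\0' => '_',
            other => other,
        })
        .collect()
}

/// FNV-1a 64-bit: a std-only stand-in for the SHA-256 prefix.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// `<prefix>_<sanitized id>` when it fits, else `<prefix>_<16 hex chars>`.
fn tenant_db_name(prefix: &str, tenant_id: &str) -> String {
    let candidate = format!("{}_{}", prefix, sanitize(tenant_id));
    if candidate.len() <= MAX_DB_NAME_BYTES {
        candidate
    } else {
        // 16 hex characters encode 8 bytes of digest.
        format!("{}_{:016x}", prefix, fnv1a_64(tenant_id.as_bytes()))
    }
}
```

The fallback is deterministic, so the same over-long tenant_id always maps to the same database name. The sketch does not guard against a prefix that is itself close to the cap.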
Why per-tenant DB vs `tenant_id` field + filter
- Driver-level isolation; impossible to forget the filter on one of
the 184 query call-sites in compliance-agent.
- Handlers don't change shape at migration — `agent.db.findings()`
becomes `db.findings()` after pulling `db` from
`agent.db_pool.for_tenant(&ctx)`.
- GDPR delete = `db.dropDatabase()`.
- On-prem deploy = the same code path, with one tenant.
- Trade-off accepted: index storage is duplicated per tenant; Mongo's
~thousand-db ceiling is well above the tens to hundreds of tenants
we're targeting.
Caveats
- Existing `agent.db` continues to point at the single legacy db.
Handlers / pipelines that use it are unscoped until M7.2-B/C
migrate them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
4388e98b5b | feat: add E2E test suite with nightly CI, fix dashboard Dockerfile (#52) | ||
|
|
3bb690e5bb |
refactor: modularize codebase and add 404 unit tests (#13)
CI / Format (push) Successful in 4s
CI / Clippy (push) Successful in 4m19s
CI / Security Audit (push) Successful in 1m44s
CI / Tests (push) Successful in 5m15s
CI / Detect Changes (push) Successful in 5s
CI / Deploy Agent (push) Successful in 2s
CI / Deploy Dashboard (push) Successful in 2s
CI / Deploy Docs (push) Has been skipped
CI / Deploy MCP (push) Successful in 2s
|