feat(m7.2-A): introduce per-tenant DatabasePool #86
Summary

First slice of M7.2 tenant isolation. Introduces a `DatabasePool` that hands out per-tenant `Database` handles physically scoped to `<prefix>_<tenant_id>` Mongo databases.

Isolation is at the driver, not at "we hope we filter." A handle for tenant A literally cannot see tenant B's documents because it is connected to a different db. There is no `{tenant_id: ctx.tenant_id}` filter to forget on one of the 184 query call-sites in `compliance-agent`.

What's in this PR

- `DatabasePool::connect(uri, db_prefix)`: pings the cluster, prepares per-tenant lazy handles.
- `DatabasePool::for_tenant(&TenantContext)`: returns a `Database` scoped to that tenant. `ensure_indexes` runs once per tenant per process via a `DashMap`-backed marker; failure rolls the marker back so the next request retries.
- `DatabasePool::tenant_db_name`: `<prefix>_<sanitized_tenant_id>` if it fits Mongo's 63-byte db-name cap, else `<prefix>_<sha256-16hex>` fallback.
- Sanitizer rewrites the Mongo-disallowed chars (`/ \ . " $ <space> NUL`) so any future tenant_id shape works.
- `ComplianceAgent` gains a `db_pool: DatabasePool` field next to the existing `db: Database`. Handlers / pipelines / webhooks still use `db`; they migrate to `db_pool.for_tenant(&ctx)` in M7.2-B/C and `db` goes away in M7.2-D.

Why per-tenant database (vs `tenant_id` field + app filter)

- Handlers don't change shape at migration: `agent.db.findings()` becomes `db.findings()` after pulling `db` from `agent.db_pool.for_tenant(&ctx)`.
- GDPR delete = `db.dropDatabase()`.

Test plan

- `cargo fmt --all -- --check` clean
- `cargo clippy --workspace --exclude compliance-dashboard -- -D warnings` clean
- `cargo test -p compliance-core --lib`: 7 pass
- `cargo test -p compliance-agent --lib`: 228 pass
- `cargo test -p compliance-agent --test tenant_isolation`: 4/4 pass against live Mongo on `:27017`:
  * `pool_isolates_tenants_at_driver_level`: writes for acme + globex, reads through each tenant's handle; each sees exactly its own data with no filter doc anywhere.
  * `for_tenant_is_idempotent_index_creation`: second + third call for the same tenant do not error.
  * `tenant_db_name_sanitizes_unsafe_characters`
  * `tenant_db_name_falls_back_to_hash_when_too_long`: a 100-byte tenant_id collapses to a stable 16-character hex suffix (8 bytes of digest). (Caught a real bug during dev: the original test prefix overran the 63-byte cap.)

Caveats

- Existing `agent.db` continues to point at the single legacy db. Handlers / pipelines that use it are unscoped until M7.2-B/C migrate them. The agent is bootable and functional throughout.

What's next

- Migrate `api/handlers/*` to take `TenantCtx` and use `agent.db_pool.for_tenant(&ctx)`.
- Pass `TenantContext` through scheduler / pipeline / pentest orchestrators / webhooks / trackers.
- Remove the `agent.db` field; add cross-tenant admin helper.

🤖 Generated with Claude Code
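The once-per-tenant `ensure_indexes` behaviour listed under "What's in this PR" could look roughly like the sketch below. It is not the code from this PR. `IndexMarkers` and `ensure_once` are hypothetical names, a `Mutex<HashSet<String>>` stands in for the `DashMap` so the example needs no external crate, and the index work is a plain synchronous closure rather than an async driver call.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical stand-in for the pool's per-tenant "indexes ensured" markers.
/// The PR mentions a `DashMap`; a `Mutex<HashSet>` keeps this sketch std-only.
struct IndexMarkers {
    done: Mutex<HashSet<String>>,
}

impl IndexMarkers {
    fn new() -> Self {
        Self { done: Mutex::new(HashSet::new()) }
    }

    /// Runs `ensure` at most once per tenant while it keeps succeeding.
    /// The marker is claimed before the work starts and rolled back if the
    /// work fails, so the next call for that tenant retries.
    fn ensure_once<E>(
        &self,
        tenant_id: &str,
        ensure: impl FnOnce() -> Result<(), E>,
    ) -> Result<(), E> {
        // `insert` returns false when the marker is already present.
        let claimed = self.done.lock().unwrap().insert(tenant_id.to_owned());
        if !claimed {
            return Ok(());
        }
        // The lock guard was dropped above, so `ensure` runs without holding it.
        let result = ensure();
        if result.is_err() {
            // Roll the marker back so a later request can try again.
            self.done.lock().unwrap().remove(tenant_id);
        }
        result
    }
}
```

Claiming the marker before the work runs means a concurrent caller for the same tenant returns immediately instead of issuing duplicate index builds. The cost, in this sketch at least, is that such a caller may proceed before the first caller's index creation has finished.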
First slice of the M7.2 tenant-isolation work. Adds a `DatabasePool` that hands out per-tenant `Database` handles physically scoped to `<prefix>_<tenant_id>` Mongo databases. Isolation is at the driver, not at "we hope we filter": a handle for tenant A literally cannot see tenant B's documents because it's connected to a different db.

What's in this PR

- `DatabasePool::connect`: pings the cluster, prepares per-tenant lazy handles.
- `DatabasePool::for_tenant(&TenantContext)`: returns a `Database` scoped to that tenant. `ensure_indexes` runs once per tenant per process via a `DashMap`-backed marker; failure rolls the marker back so the next request retries.
- `tenant_db_name`: `<prefix>_<sanitized_tenant_id>` if it fits in Mongo's 63-byte db-name cap, else `<prefix>_<sha256-16hex>` fallback.
- Sanitizer rewrites the Mongo-disallowed chars (`/ \ . " $ <space> NUL`) so any future tenant_id shape works.
- `ComplianceAgent` gains a `db_pool: DatabasePool` field next to the existing `db: Database`. Handlers / pipelines / webhooks still use `db`; they migrate to `db_pool.for_tenant(&ctx)` in M7.2-B/C and `db` goes away in M7.2-D.

Test plan

- `cargo fmt --all` clean
- `cargo clippy --workspace --exclude compliance-dashboard -- -D warnings` clean
- `cargo test -p compliance-core --lib`: 7 pass
- `cargo test -p compliance-agent --lib`: 228 pass
- `cargo test -p compliance-agent --test tenant_isolation`: 4 pass against live mongo on 27017:
  * `pool_isolates_tenants_at_driver_level`: writes for acme + globex, reads through each tenant's handle; each sees exactly its own data with no filter doc anywhere.
  * `for_tenant_is_idempotent_index_creation`: second + third call for the same tenant do not error.
  * `tenant_db_name_sanitizes_unsafe_characters`
  * `tenant_db_name_falls_back_to_hash_when_too_long`: a 100-byte tenant_id collapses to a stable 16-character hex suffix (8 bytes of digest).
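The two `tenant_db_name` tests above exercise a naming rule that can be sketched as follows. This is an illustration under assumptions, not the PR's implementation. The helper names `sanitize` and `short_hash` are hypothetical. Replacing unsafe characters with `_` is a guess. A 64-bit FNV-1a hash stands in for the truncated SHA-256 so the example needs no external crate; both give a 16-character hex suffix.

```rust
/// MongoDB rejects database names of 64 bytes or more.
const MAX_DB_NAME_BYTES: usize = 63;

/// Rewrites the characters the PR lists as Mongo-disallowed.
/// Using `_` as the replacement is an assumption of this sketch.
fn sanitize(tenant_id: &str) -> String {
    tenant_id
        .chars()
        .map(|c| match c {
            '/' | '\\' | '.' | '"' | '$' | ' ' | '\0' => '_',
            other => other,
        })
        .collect()
}

/// 64-bit FNV-1a rendered as 16 hex characters.
/// Stand-in for the first 8 bytes of a SHA-256 digest.
fn short_hash(input: &str) -> String {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    for byte in input.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV-1a prime
    }
    format!("{hash:016x}")
}

/// Readable name when it fits the cap, hashed name otherwise.
/// Assumes `prefix` itself is short enough for the hashed form to fit.
fn tenant_db_name(prefix: &str, tenant_id: &str) -> String {
    let readable = format!("{prefix}_{}", sanitize(tenant_id));
    if readable.len() <= MAX_DB_NAME_BYTES {
        readable
    } else {
        // Hash the raw id, not the sanitized one, so the suffix depends
        // on every byte of the original tenant_id.
        format!("{prefix}_{}", short_hash(tenant_id))
    }
}
```

With a hypothetical prefix of `compliance`, a short id such as `acme` keeps its readable form, while a 100-byte id falls through to the fixed-width hashed form. The name length then depends only on the prefix.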
Why per-tenant DB vs `tenant_id` field + filter

- Driver-level isolation; impossible to forget the filter on one of the 184 query call-sites in compliance-agent.
- Handlers don't change shape at migration: `agent.db.findings()` becomes `db.findings()` after pulling `db` from `agent.db_pool.for_tenant(&ctx)`.
- GDPR delete = `db.dropDatabase()`.
- On-prem deploy = the same code path, with one tenant.
- Trade-off accepted: index storage duplicated per tenant; Mongo's ~thousand-db ceiling is way above the 10s-100s tenants we're targeting.

Caveats

- Existing `agent.db` continues to point at the single legacy db. Handlers / pipelines that use it are unscoped until M7.2-B/C migrate them.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Superseded: the M7.2 stack was inadvertently included in PR #90 squash-merge (5648291) on main. The dashboard PR was branched off this PR's descendant and its full diff swept into main as one squash commit. M7.2-A through M7.2-D are all live on main and in production. Closing without merging.

Pull request closed