compliance-scanner-agent

Author	SHA1	Message	Date
Sharang Parnerkar	08c4ec4cff	feat(m7.2-D): drop transitional agent.db, add admin helpers
Final slice of M7.2. Removes the transitional single-database handle that M7.2-A introduced alongside the pool, so the compliance-agent now has a single source of truth for storage: every code path obtains a tenant-scoped Database from `agent.db_pool.for_tenant_id(...)` or `for_tenant(&ctx)`. There is no shared "default" database anywhere.
Changes
- ComplianceAgent: `db: Database` field removed. ComplianceAgent::new now takes only `(config, db_pool)`. An earlier grep during M7.2-C verified that no remaining call site reads `agent.db`.
- main.rs: stops constructing the legacy Database. Only the pool is built at startup.
- TestServer: same change; drops Database::connect/ensure_indexes and builds only the pool. cleanup() now drops every `<db_name>_*` per-tenant database (it no longer touches a bare `<db_name>`).
- DatabasePool::list_tenant_db_names(): lists Mongo databases matching the pool's prefix. Intended for admin endpoints and scheduler tenant enumeration in a future M7.3 (this PR keeps the SCHEDULER_TENANT_IDS env config; registry integration is a separate concern).
- DatabasePool::drop_tenant(&str): idempotent tenant offboarding. Drops the per-tenant database and evicts the in-memory `ensured` marker so a later re-provision re-runs ensure_indexes.
Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard -- -D warnings clean
- cargo test -p compliance-core --lib: 7 pass
- cargo test -p compliance-agent --lib: 228 pass
- cargo test -p compliance-agent --test tenant_isolation: 6 pass including new `admin_helpers_list_and_drop_tenant_dbs`
- cargo test -p compliance-agent --test tenant_status_middleware: 6 pass
M7.2 closeout state after this lands
- M7.1 (auth + status): done
- M7.2-A (pool): done
- M7.2-B (handlers): done
- M7.2-C (background paths): done
- M7.2-D (legacy db removed, admin helpers): done (this PR)
- Future M7.3: scheduler pulls tenants from tenant-registry instead of SCHEDULER_TENANT_IDS env; cross-tenant admin HTTP endpoints built on list_tenant_db_names / drop_tenant.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-17 15:05:27 +02:00
Sharang Parnerkar	0f6dd1135e	feat(m7.2-C): migrate background paths to per-tenant pool
Closes the loop on M7.2 isolation for paths that don't have a JWT context: scheduler, webhooks, and the agent's `run_scan` / `run_pr_review` helpers all now take a `tenant_id` at the boundary and resolve to a tenant-scoped `Database` via `db_pool.for_tenant_id(...)`. Internal orchestrators (PipelineOrchestrator, PentestOrchestrator) and pipeline helpers were already DB-agnostic: they take `db: Database` at construction and don't care which tenant it points to.
Changes
- DatabasePool::for_tenant_id(&str): same as for_tenant but accepts a bare tenant_id. Background paths don't have a full TenantContext. for_tenant is now a thin wrapper that delegates.
- agent.run_scan(tenant_id, repo_id, trigger): pulls the tenant database before constructing the PipelineOrchestrator. Was: run_scan(repo_id, trigger) reading agent.db.
- agent.run_pr_review(tenant_id, repo_id, ...): same shape.
- Webhook routes change to /webhook/{tenant_id}/{platform}/{repo_id}. Tenant is part of the URL path because webhooks arrive without a JWT; they're authenticated via per-repo HMAC, not the tenant gate. The dashboard surfaces the full per-tenant URL when the repo is registered. All three handlers (gitea, github, gitlab) updated.
- scheduler.rs: iterates tenants from $SCHEDULER_TENANT_IDS (comma-separated env), or falls back to DEV_TENANT_ID's `dev` default. Both scan_all_repos and monitor_cves now run once per configured tenant. M7.2-D will replace this static config with a pull from the tenant-registry.
- api/handlers/repos.rs::trigger_scan now passes tenant.0.tenant_id.
What's unchanged because it didn't need to change
- PipelineOrchestrator, PentestOrchestrator: take `db: Database` at construction; they're tenant-DB-agnostic by design. The caller picks the tenant DB.
- pipeline/{dedup,graph_build,issue_creation,sbom/mod}.rs, pentest/{context,report/html/*}.rs, trackers/jira.rs, llm/triage.rs: take `&Database` or `&mongodb::Database` as args, transitively tenant-scoped via the caller.
Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard -- -D warnings clean
- cargo test -p compliance-core --lib: 7 pass
- cargo test -p compliance-agent --lib: 228 pass
- cargo test -p compliance-agent --test tenant_isolation: 5 pass
- cargo test -p compliance-agent --test tenant_status_middleware: 6 pass
What's left (PR-D)
- Drop the transitional agent.db field; no remaining call sites (verified by `grep -rn "agent\.db\b" compliance-agent/src`).
- main.rs / TestServer stop building the legacy Database; only the pool remains.
- Add cross-tenant admin helpers (list tenants, drop tenant DB) on the pool for offboarding flows.
- Pull tenants from the tenant-registry instead of an env var.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-17 15:00:37 +02:00
Sharang Parnerkar	003835764e	fixup(m7.2-A): validate db_prefix at connect, bump hash to 16 bytes
Addresses review feedback on the hash-fallback path. The original `debug_assert!(hashed.len() <= MAX_DB_NAME_LEN)` was a runtime hack that vanished in release builds. With an 8-byte hash truncation (~2^32 birthday-collision resistance), two tenant_ids hashing to the same suffix would silently share a database: no panic, no rollback, just a cross-tenant data leak. Not acceptable for a regulated-industry product.
Changes:
- Bump hash truncation 8 → 16 bytes (32 hex chars). That gives 2^64 birthday resistance, so collisions are negligible at our scale.
- Add MAX_PREFIX_LEN (= 30) and validate db_prefix.len() at `DatabasePool::connect`. The runtime hash-fallback arithmetic is now provably within Mongo's 63-byte cap (30-byte prefix + 1 underscore + 32 hex chars = 63); drop the debug_assert!.
- New test `connect_rejects_overlong_db_prefix` exercises the inclusive bound (30 passes, 31 fails).
- Existing hash-fallback test now asserts a 32-char hex suffix + basic distinctness for two different inputs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-17 13:16:46 +02:00
Sharang Parnerkar	e3aabe7d18	feat(m7.2-A): introduce per-tenant DatabasePool
First slice of the M7.2 tenant-isolation work. Adds a `DatabasePool` that hands out per-tenant `Database` handles physically scoped to `<prefix>_<tenant_id>` Mongo databases. Isolation is at the driver, not at "we hope we filter": a handle for tenant A literally cannot see tenant B's documents because it's connected to a different db.
What's in this PR
- DatabasePool::connect: pings the cluster, prepares per-tenant lazy handles.
- DatabasePool::for_tenant(&TenantContext): returns a Database scoped to that tenant. ensure_indexes runs once per tenant per process via a DashMap-backed marker; failure rolls the marker back so the next request retries.
- tenant_db_name: `<prefix>_<sanitized_tenant_id>` if it fits in Mongo's 63-byte db-name cap, else `<prefix>_<sha256-16hex>` fallback.
- Sanitizer rewrites the Mongo-disallowed chars (`/ \ . " $ <space> NUL`) so any future tenant_id shape works.
- ComplianceAgent gains a `db_pool: DatabasePool` field next to the existing `db: Database`. Handlers / pipelines / webhooks still use `db`; they migrate to `db_pool.for_tenant(&ctx)` in M7.2-B/C and `db` goes away in M7.2-D.
Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard -- -D warnings clean
- cargo test -p compliance-core --lib: 7 pass
- cargo test -p compliance-agent --lib: 228 pass
- cargo test -p compliance-agent --test tenant_isolation: 4 pass against live mongo on 27017:
  * pool_isolates_tenants_at_driver_level: writes for acme + globex, reads through each tenant's handle; each sees exactly its own data with no filter doc anywhere.
  * for_tenant_is_idempotent_index_creation: second + third call for the same tenant do not error.
  * tenant_db_name_sanitizes_unsafe_characters
  * tenant_db_name_falls_back_to_hash_when_too_long: 100-byte tenant_id collapses to a stable 8-byte (16 hex chars) suffix.
Why per-tenant DB vs `tenant_id` field + filter
- Driver-level isolation; impossible to forget the filter on one of the 184 query call-sites in compliance-agent.
- Handlers don't change shape at migration: `agent.db.findings()` becomes `db.findings()` after pulling `db` from `agent.db_pool.for_tenant(&ctx)`.
- GDPR delete = `db.dropDatabase()`.
- On-prem deploy = the same code path, with one tenant.
- Trade-off accepted: index storage duplicated per tenant; Mongo's ~thousand-db ceiling is way above the tens to hundreds of tenants we're targeting.
Caveats
- Existing `agent.db` continues to point at the single legacy db. Handlers / pipelines that use it are unscoped until M7.2-B/C migrate them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-17 11:58:24 +02:00
sharang	49d5cd4e0a	feat: hourly CVE alerting with notification bell and API (#53)	2026-03-30 10:39:39 +00:00
sharang	acc5b86aa4	feat: AI-driven automated penetration testing (#12)	2026-03-12 14:42:54 +00:00
sharang	42cabf0582	feat: rag-embedding-ai-chat (#1) Co-authored-by: Sharang Parnerkar <parnerkarsharang@gmail.com> Reviewed-on: #1	2026-03-06 21:54:15 +00:00
Sharang Parnerkar	cea8f59e10	Add DAST, graph modules, toast notifications, and dashboard enhancements
Add DAST scanning and code knowledge graph features across the stack:
- compliance-dast and compliance-graph workspace crates
- Agent API handlers and routes for DAST targets/scans and graph builds
- Core models and traits for DAST and graph domains
- Dashboard pages for DAST targets/findings/overview and graph explorer/impact
- Toast notification system with auto-dismiss for async action feedback
- Button click animations and disabled states for better UX
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 13:53:50 +01:00
Sharang Parnerkar	03ee69834d	Fix formatting and clippy warnings across workspace
- Run cargo fmt on all crates
- Fix regex patterns using unsupported lookahead in patterns.rs
- Replace unwrap() calls with compile_regex() helper
- Fix never type fallback in GitHub tracker
- Fix redundant field name in findings page
- Allow enum_variant_names for Dioxus Route enum
- Fix &mut Vec -> &mut [T] clippy lint in sbom.rs
- Mark unused-but-intended APIs with #[allow(dead_code)]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-02 17:41:03 +01:00
Sharang Parnerkar	0867e401bc	Initial commit: Compliance Scanner Agent
Autonomous security and compliance scanning agent for git repositories. Features: SAST (Semgrep), SBOM (Syft), CVE monitoring (OSV.dev/NVD), GDPR/OAuth pattern detection, LLM triage, issue creation (GitHub/GitLab/Jira), PR reviews, and Dioxus fullstack dashboard.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-02 13:30:17 +01:00
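
The database-naming rule described in e3aabe7d18 and tightened in 003835764e can be sketched roughly as follows. This is an illustration, not the repository's code: the function names `validate_prefix` and `sanitize_tenant_id`, the `_` replacement character, and the `digest_hex` parameter (standing in for a hex SHA-256 of the tenant_id that the caller computes, at least 32 characters long) are all assumptions. Only the constants 63, 30 and 32 and the list of rewritten characters come from the commit messages.

```rust
/// Mongo's database-name cap in bytes, as cited in the commit messages.
const MAX_DB_NAME_LEN: usize = 63;
/// Prefix bound introduced in 003835764e.
const MAX_PREFIX_LEN: usize = 30;
/// 16 bytes of digest rendered as hex characters.
const HASH_HEX_LEN: usize = 32;

/// Hypothetical check run once at connect time: a prefix of at most 30 bytes
/// guarantees the fallback name fits, since 30 + 1 + 32 = 63.
fn validate_prefix(prefix: &str) -> Result<(), String> {
    if prefix.len() > MAX_PREFIX_LEN {
        return Err(format!(
            "db_prefix is {} bytes, max is {}",
            prefix.len(),
            MAX_PREFIX_LEN
        ));
    }
    Ok(())
}

/// Rewrites the characters the commit lists as disallowed in Mongo db names.
/// Using '_' as the replacement is an assumption of this sketch.
fn sanitize_tenant_id(tenant_id: &str) -> String {
    tenant_id
        .chars()
        .map(|c| match c {
            '/' | '\\' | '.' | '"' | '$' | ' ' | '\0' => '_',
            other => other,
        })
        .collect()
}

/// Builds `<prefix>_<sanitized_tenant_id>` when it fits in the byte cap,
/// otherwise `<prefix>_<first 32 hex chars of digest>`.
/// `digest_hex` is assumed to be ASCII hex with at least 32 characters.
fn tenant_db_name(prefix: &str, tenant_id: &str, digest_hex: &str) -> String {
    let direct = format!("{}_{}", prefix, sanitize_tenant_id(tenant_id));
    if direct.len() <= MAX_DB_NAME_LEN {
        direct
    } else {
        format!("{}_{}", prefix, &digest_hex[..HASH_HEX_LEN])
    }
}
```

Validating the prefix up front is what lets the fallback branch skip any runtime length check: the worst case is fixed by the constants rather than by the input.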

10 Commits
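
The "ensure indexes once per tenant, roll back on failure, evict on drop" behaviour mentioned in e3aabe7d18 and 08c4ec4cff might look something like the sketch below. The commits say the real marker is DashMap-backed; this version swaps in a standard-library `Mutex<HashSet<String>>` to stay dependency-free, and the type and method names (`EnsuredMarkers`, `ensure_once`, `evict`) are hypothetical.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Tracks which tenants have had their indexes ensured in this process.
/// Stand-in for the DashMap-backed marker named in the commit message.
struct EnsuredMarkers {
    ensured: Mutex<HashSet<String>>,
}

impl EnsuredMarkers {
    fn new() -> Self {
        Self {
            ensured: Mutex::new(HashSet::new()),
        }
    }

    /// Runs `ensure` only the first time a tenant is seen. If `ensure` fails,
    /// the marker is removed again so the next call retries.
    /// Limitation of this sketch: a concurrent caller that arrives while
    /// `ensure` is still running sees the marker and returns Ok immediately.
    fn ensure_once<F>(&self, tenant_id: &str, ensure: F) -> Result<(), String>
    where
        F: FnOnce() -> Result<(), String>,
    {
        // `insert` returns false when the tenant was already marked.
        // The lock guard is a temporary and is released after this statement.
        let newly_marked = self.ensured.lock().unwrap().insert(tenant_id.to_string());
        if !newly_marked {
            return Ok(());
        }
        let result = ensure();
        if result.is_err() {
            // Roll the marker back so a later request can retry.
            self.ensured.lock().unwrap().remove(tenant_id);
        }
        result
    }

    /// Forgets a tenant, as a drop-tenant flow would, so that re-provisioning
    /// runs `ensure` again. Returns whether a marker was present.
    fn evict(&self, tenant_id: &str) -> bool {
        self.ensured.lock().unwrap().remove(tenant_id)
    }
}
```

Calling `evict` twice for the same tenant is harmless (the second call simply returns false), which matches the idempotent offboarding the commit describes.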