sharang/compliance-scanner-agent

Sharang Parnerkar cdfbb62f9d

feat(m7.2-B): migrate API handlers to per-tenant database pool

Builds on PR M7.2-A. Every HTTP handler in compliance-agent/src/api/
now takes a TenantCtx extractor and pulls a tenant-scoped Database
from agent.db_pool.for_tenant(&ctx). The query bodies are unchanged —
`db.findings().find(doc! {...})` reads from the tenant's own physical
database, so the filter doc cannot leak data across tenants because
the wrong tenant's data is literally on a different db handle.
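
A minimal, std-only sketch of the isolation idea described above. The `TenantCtx`, `Database`, `DbPool`, and `PoolError` types here are simplified stand-ins, not the project's real definitions, and the `<base>_<tenant_id>` naming and the tenant-id validation rule are assumptions made for illustration. The real helper returns a `StatusCode`; a plain `u16` stands in for it here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the JWT-derived tenant context.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TenantCtx {
    pub tenant_id: String,
}

/// Hypothetical stand-in for a handle bound to one physical database.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Database {
    pub name: String,
}

/// Illustrative pool error; the real pool's failure modes are not shown in the source.
#[derive(Debug, PartialEq, Eq)]
pub enum PoolError {
    InvalidTenantId,
}

/// Hands out one handle per tenant, named `<base>_<tenant_id>` (assumed naming).
pub struct DbPool {
    base_name: String,
    handles: HashMap<String, Database>,
}

impl DbPool {
    pub fn new(base_name: &str) -> Self {
        Self {
            base_name: base_name.to_string(),
            handles: HashMap::new(),
        }
    }

    /// Return the cached handle for this tenant, creating it on first use.
    pub fn for_tenant(&mut self, ctx: &TenantCtx) -> Result<Database, PoolError> {
        // Assumed rule: only non-empty ASCII alphanumerics and dashes are accepted.
        let valid = !ctx.tenant_id.is_empty()
            && ctx
                .tenant_id
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '-');
        if !valid {
            return Err(PoolError::InvalidTenantId);
        }
        let name = format!("{}_{}", self.base_name, ctx.tenant_id);
        let db = self
            .handles
            .entry(ctx.tenant_id.clone())
            .or_insert(Database { name });
        Ok(db.clone())
    }
}

/// Same shape as the `tenant_db` helper: any pool failure maps to HTTP 500.
pub fn tenant_db(pool: &mut DbPool, ctx: &TenantCtx) -> Result<Database, u16> {
    pool.for_tenant(ctx).map_err(|_| 500)
}
```

Because each handler only ever holds the handle returned for its own tenant, a query filter cannot reach another tenant's collections regardless of what the filter document contains.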

Changes
- New `dto::tenant_db(&agent, &tenant) -> Result<Database, StatusCode>`
  helper. Every migrated handler calls it at the top of the body
  instead of `let db = &agent.db;`. 500 on the rare pool failure;
  4xx auth failures are already handled by the M7.1 status gate.
- New `api::server::inject_dev_tenant` middleware mounted only when
  Keycloak is NOT configured. Synthesizes a TenantContext with
  tenant_id = $DEV_TENANT_ID (default `dev`) so `cargo run` against
  a bare Mongo + no KC still serves the API. Logged loudly as
  "DO NOT use in any environment with real customer data".
- Test harness: TestServer mounts inject_dev_tenant so existing E2E
  tests reach handlers; cleanup() now drops every <db_name>_*
  per-tenant database, not just the legacy <db_name>.
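
The dev-tenant fallback and the widened test cleanup listed above can be sketched as two small pure functions. The function names are hypothetical, and treating a blank `DEV_TENANT_ID` the same as an unset one is an assumption; the source only states the `dev` default and the `<db_name>_*` pattern.

```rust
/// Resolve the development tenant id from an optional `DEV_TENANT_ID` value.
/// Falls back to `dev` when the value is missing or blank (blank handling is assumed).
pub fn resolve_dev_tenant(raw: Option<&str>) -> String {
    match raw.map(str::trim) {
        Some(v) if !v.is_empty() => v.to_string(),
        _ => "dev".to_string(),
    }
}

/// Read the value from the process environment.
pub fn dev_tenant_from_env() -> String {
    resolve_dev_tenant(std::env::var("DEV_TENANT_ID").ok().as_deref())
}

/// Pick the databases a test cleanup should drop: the legacy `<db_name>`
/// plus every per-tenant `<db_name>_*`. Names that merely share the prefix
/// without the underscore separator are left alone.
pub fn databases_to_drop<'a>(db_name: &str, existing: &[&'a str]) -> Vec<&'a str> {
    let prefix = format!("{db_name}_");
    existing
        .iter()
        .copied()
        .filter(|n| *n == db_name || n.starts_with(&prefix))
        .collect()
}
```

Keeping the resolution logic separate from the environment read makes the fallback testable without mutating process-wide state.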

Files migrated (handler count, all pass `cargo build`):
- chat.rs (3) — also rewires RagPipeline + EmbeddingStore to the
  tenant DB's inner() so vector search is per-tenant
- dast.rs (5)
- findings.rs (5)
- graph.rs (7) — also rewires GraphStore inside trigger_build's
  spawn to the tenant DB
- health.rs (1) — stats_overview migrated; public /health stays
  un-scoped
- issues.rs (1)
- notifications.rs (5)
- pentest_handlers/session.rs (12) — both wizard + legacy paths,
  plus pause/resume/stop/get_attack_chain/get_messages/
  get_session_findings/lookup_repo. PentestOrchestrator now gets
  the tenant DB clone in its spawn.
- pentest_handlers/export.rs (1) — fans out across sessions,
  attack_chain_nodes, dast_findings, findings, sbom_entries,
  graph_nodes from a single tenant_db acquisition
- pentest_handlers/stats.rs (1)
- pentest_handlers/stream.rs (1) — SSE handler verifies session
  via the tenant DB before subscribing
- repos.rs (6)
- sbom.rs (5)
- scans.rs (1)

help_chat.rs has no DB queries and was skipped.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
  -- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 5 pass
  (driver-level isolation still holds post-handler migration)
- cargo test -p compliance-agent --test tenant_status_middleware
  — 6 pass

What's not yet migrated (PR-C / PR-D)
- scheduler.rs (6 sites), pipeline/orchestrator.rs (14),
  pentest/orchestrator.rs (13), webhooks (gitea/github/gitlab),
  trackers/jira.rs, pipeline/dedup.rs etc. — background paths
  without a JWT-derived tenant context.
- agent.db is still in the ComplianceAgent struct as a transitional
  handle for those paths. PR-D removes it once PR-C migrates the
  background paths.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-06-17 13:28:33 +02:00

.cargo

fix: scanner timeouts, semgrep memory cap, syft remote lookups, Script error (#78)

2026-05-12 11:27:24 +00:00

.gitea/workflows

ci: log orca webhook response so deploy steps arent silent

2026-04-08 15:09:27 +02:00

assets

docs: rewrite user-facing documentation with screenshots (#11)

2026-03-11 15:26:00 +00:00

bin

feat: opentelemetry-tracing (#3)

2026-03-07 23:51:20 +00:00

compliance-agent

feat(m7.2-B): migrate API handlers to per-tenant database pool

2026-06-17 13:28:33 +02:00

compliance-core

fix(m7.1): JWKS refresh-on-failure in auth middleware (#84)

2026-06-04 14:46:14 +00:00

compliance-dashboard

feat(dashboard): add light/dark theme with sidebar toggle (#81)

2026-05-13 11:44:22 +00:00

compliance-dast

feat: pentest onboarding — streaming, browser automation, reports, user cleanup (#16 )

2026-03-17 20:32:20 +00:00

compliance-graph

feat: add E2E test suite with nightly CI, fix dashboard Dockerfile (#52)

2026-03-30 10:04:07 +00:00

compliance-mcp

fix: check Gitea API response status and fallback for PR reviews (#47)

2026-03-25 16:26:09 +00:00

compliance-smoke

M7.1 smoke harness: lift auth to compliance-core + compliance-smoke service (#83)

2026-06-04 14:38:35 +00:00

dashboards

fix: rewrite SigNoz dashboards using correct v4 widget schema

2026-03-11 09:49:45 +01:00

deploy

fix: require TLS for IMAP auth, close port 143 (CERT-Bund compliance)

2026-03-18 09:29:34 +01:00

docs

feat: add floating help chat widget, remove settings page (#51)

2026-03-30 08:05:29 +00:00

fuzz

refactor: modularize codebase and add 404 unit tests (#13)

2026-03-13 08:03:45 +00:00

scripts

M7.1 smoke harness: lift auth to compliance-core + compliance-smoke service (#83)

2026-06-04 14:38:35 +00:00

styles

Initial commit: Compliance Scanner Agent

2026-03-02 13:30:17 +01:00

.env.example

feat: findings refinement, new scanners, and deployment tooling (#6)

2026-03-09 12:53:12 +00:00

.gitignore

feat: pentest onboarding — streaming, browser automation, reports, user cleanup (#16 )

2026-03-17 20:32:20 +00:00

AGENTS.md

Add DAST, graph modules, toast notifications, and dashboard enhancements

2026-03-04 13:53:50 +01:00

build.rs

Initial commit: Compliance Scanner Agent

2026-03-02 13:30:17 +01:00

Cargo.lock

feat(m7.1): wire compliance-agent to compliance-core auth + status gate (#85)

2026-06-17 09:36:52 +00:00

Cargo.toml

M7.1 smoke harness: lift auth to compliance-core + compliance-smoke service (#83)

2026-06-04 14:38:35 +00:00

clippy.toml

Initial commit: Compliance Scanner Agent

2026-03-02 13:30:17 +01:00

Dioxus.toml

Initial commit: Compliance Scanner Agent

2026-03-02 13:30:17 +01:00

docker-compose.yml

feat: pentest onboarding — streaming, browser automation, reports, user cleanup (#16 )

2026-03-17 20:32:20 +00:00

Dockerfile.agent

ci: trigger first orca build for all services

2026-04-08 10:10:07 +02:00

Dockerfile.dashboard

ci: trigger first orca build for all services

2026-04-08 10:10:07 +02:00

Dockerfile.docs

ci: trigger first orca build for all services

2026-04-08 10:10:07 +02:00

Dockerfile.mcp

ci: trigger first orca build for all services

2026-04-08 10:10:07 +02:00

otel-collector-config.yaml

feat: opentelemetry-tracing (#3)

2026-03-07 23:51:20 +00:00

README.md

feat: add floating help chat widget, remove settings page (#51)

2026-03-30 08:05:29 +00:00

Compliance Scanner

Autonomous security and compliance scanning agent for git repositories

About

Compliance Scanner is an autonomous agent that continuously monitors git repositories for security vulnerabilities, GDPR/OAuth compliance patterns, and dependency risks. It creates issues in external trackers (GitHub/GitLab/Jira/Gitea) with evidence and remediation suggestions, reviews pull requests with multi-pass LLM analysis, runs autonomous penetration tests, and exposes a Dioxus-based dashboard for visualization.

How it works: The agent runs as a lazy daemon. Cron schedules or webhooks trigger a check, and a scan runs only when new commits are detected. LLM-powered triage filters out false positives and generates actionable remediation with multi-language awareness.

Features

Area	Capabilities
SAST Scanning	Semgrep-based static analysis with auto-config rules
SBOM Generation	Syft + cargo-audit for complete dependency inventory
CVE Monitoring	OSV.dev batch queries, NVD CVSS enrichment, SearXNG context
GDPR Patterns	Detect PII logging, missing consent, hardcoded retention, missing deletion
OAuth Patterns	Detect implicit grant, missing PKCE, token in localStorage, token in URLs
LLM Triage	Multi-language-aware confidence scoring (Rust, Python, Go, Java, Ruby, PHP, C++)
Issue Creation	Auto-create issues in GitHub, GitLab, Jira, or Gitea with dedup via fingerprints
PR Reviews	Multi-pass security review (logic, security, convention, complexity) with dedup
DAST Scanning	Black-box security testing with endpoint discovery and parameter fuzzing
AI Pentesting	Autonomous LLM-orchestrated penetration testing with encrypted reports
Code Graph	Interactive code knowledge graph with impact analysis
AI Chat (RAG)	Natural language Q&A grounded in repository source code
Help Assistant	Documentation-grounded help chat accessible from every dashboard page
MCP Server	Expose live security data to Claude, Cursor, and other AI tools
Dashboard	Fullstack Dioxus UI with findings, SBOM, issues, DAST, pentest, and graph
Webhooks	GitHub, GitLab, and Gitea webhook receivers for push/PR events
Finding Dedup	SHA-256 fingerprint dedup for SAST, CWE-based dedup for DAST findings
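
As a rough illustration of fingerprint-based dedup for SAST findings, the sketch below drops any finding whose fingerprint has already been seen. It is not the project's implementation: the `Finding` fields and the choice of which fields feed the fingerprint are assumptions, and `std`'s `DefaultHasher` is swapped in for SHA-256 purely to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical finding shape; the real model lives in `compliance-core`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    pub repo: String,
    pub rule_id: String,
    pub file: String,
    pub snippet: String,
}

/// Stand-in fingerprint. The project uses SHA-256; `DefaultHasher` is used
/// here only so the sketch needs no external crates.
pub fn fingerprint(f: &Finding) -> u64 {
    let mut h = DefaultHasher::new();
    // Trim the snippet so whitespace-only differences do not create new findings.
    (&f.repo, &f.rule_id, &f.file, f.snippet.trim()).hash(&mut h);
    h.finish()
}

/// Keep the first finding for each fingerprint, preserving input order.
pub fn dedup(findings: Vec<Finding>) -> Vec<Finding> {
    let mut seen = HashSet::new();
    findings
        .into_iter()
        .filter(|f| seen.insert(fingerprint(f)))
        .collect()
}
```

`HashSet::insert` returns `false` for a value that is already present, which is what lets the filter drop repeats in a single pass.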

Architecture

┌──────────────────────────────────────────────────────────────────────────┐
│                          Cargo Workspace                                 │
├──────────────┬──────────────────┬──────────────┬──────────┬─────────────┤
│ compliance-  │ compliance-      │ compliance-  │ complian-│ compliance- │
│ core (lib)   │ agent (bin)      │ dashboard    │ ce-graph │ mcp (bin)   │
│              │                  │ (bin)        │ (lib)    │             │
│ Models       │ Scan Pipeline    │ Dioxus 0.7   │ Tree-    │ MCP Server  │
│ Traits       │ LLM Client       │ Fullstack UI │ sitter   │ Live data   │
│ Config       │ Issue Trackers   │ Help Chat    │ Graph    │ for AI      │
│ Errors       │ Pentest Engine   │ Server Fns   │ Embedds  │ tools       │
│              │ DAST Tools       │              │ RAG      │             │
│              │ REST API         │              │          │             │
│              │ Webhooks         │              │          │             │
└──────────────┴──────────────────┴──────────────┴──────────┴─────────────┘
                                 │
                            MongoDB (shared)

Scan Pipeline (7 Stages)

Change Detection -- git2 fetch, compare HEAD SHA with last scanned commit
Semgrep SAST -- CLI wrapper with JSON output parsing
SBOM Generation -- Syft (CycloneDX) + cargo-audit vulnerability merge
CVE Scanning -- OSV.dev batch + NVD CVSS enrichment + SearXNG context
Pattern Scanning -- Regex-based GDPR and OAuth compliance checks
LLM Triage -- LiteLLM confidence scoring; findings scoring below 3/10 are filtered out
Issue Creation -- Dedup via SHA-256 fingerprint, create tracker issues
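
Two of the stages above reduce to simple predicates, sketched here under stated assumptions: the type and function names are hypothetical, and the confidence score is modelled as an integer on a 0 to 10 scale.

```rust
/// Hypothetical triaged finding with a 0..=10 confidence score.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TriagedFinding {
    pub title: String,
    pub confidence: u8,
}

/// Findings scoring below this value are dropped (threshold taken from stage 6).
pub const MIN_CONFIDENCE: u8 = 3;

/// Stage 1: scan only when HEAD differs from the last scanned commit.
/// A repository that has never been scanned always needs a scan.
pub fn needs_scan(head_sha: &str, last_scanned: Option<&str>) -> bool {
    last_scanned != Some(head_sha)
}

/// Stage 6: keep findings at or above the confidence threshold.
pub fn triage_filter(findings: Vec<TriagedFinding>) -> Vec<TriagedFinding> {
    findings
        .into_iter()
        .filter(|f| f.confidence >= MIN_CONFIDENCE)
        .collect()
}
```

Returning early from stage 1 when `needs_scan` is false is what keeps the remaining, more expensive stages from running on unchanged repositories.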

Tech Stack

Layer	Technology
Shared Library	`compliance-core` -- models, traits, config
Agent	Axum REST API, git2, tokio-cron-scheduler, Semgrep, Syft
Dashboard	Dioxus 0.7.3 fullstack, Tailwind CSS 4
Code Graph	`compliance-graph` -- tree-sitter parsing, embeddings, RAG
MCP Server	`compliance-mcp` -- Model Context Protocol for AI tools
DAST	`compliance-dast` -- dynamic application security testing
Database	MongoDB with typed collections
LLM	LiteLLM (OpenAI-compatible API for chat, triage, embeddings)
Issue Trackers	GitHub (octocrab), GitLab (REST v4), Jira (REST v3), Gitea
CVE Sources	OSV.dev, NVD, SearXNG
Auth	Keycloak (OAuth2/PKCE, SSO)
Browser Automation	Chromium (headless, for pentesting and PDF generation)

Getting Started

Prerequisites

Rust 1.94+
Dioxus CLI (dx)
MongoDB
Docker & Docker Compose (optional)

Optional External Tools

Semgrep -- for SAST scanning
Syft -- for SBOM generation
cargo-audit -- for Rust dependency auditing

Setup

# Clone the repository
git clone <repo-url>
cd compliance-scanner

# Start MongoDB + SearXNG
docker compose up -d mongo searxng

# Configure environment
cp .env.example .env
# Edit .env with your LiteLLM, tracker tokens, and MongoDB settings

# Run the agent
cargo run -p compliance-agent

# Run the dashboard (separate terminal)
dx serve --features server --platform web

Docker Compose (Full Stack)

docker compose up -d

This starts MongoDB, SearXNG, the agent (port 3001), and the dashboard (port 8080).

REST API

The agent exposes a REST API on port 3001:

Method	Endpoint	Description
`GET`	`/api/v1/health`	Health check
`GET`	`/api/v1/stats/overview`	Summary statistics and trends
`GET`	`/api/v1/repositories`	List tracked repositories
`POST`	`/api/v1/repositories`	Add a repository to track
`POST`	`/api/v1/repositories/:id/scan`	Trigger a manual scan
`GET`	`/api/v1/findings`	List findings (filterable)
`GET`	`/api/v1/findings/:id`	Get finding with code evidence
`PATCH`	`/api/v1/findings/:id/status`	Update finding status
`GET`	`/api/v1/sbom`	List dependencies
`GET`	`/api/v1/issues`	List cross-tracker issues
`GET`	`/api/v1/scan-runs`	Scan execution history
`GET`	`/api/v1/graph/:repo_id`	Code knowledge graph
`POST`	`/api/v1/graph/:repo_id/build`	Trigger graph build
`GET`	`/api/v1/dast/targets`	List DAST targets
`POST`	`/api/v1/dast/targets`	Add DAST target
`GET`	`/api/v1/dast/findings`	List DAST findings
`POST`	`/api/v1/chat/:repo_id`	RAG-powered code chat
`POST`	`/api/v1/help/chat`	Documentation-grounded help chat
`POST`	`/api/v1/pentest/sessions`	Create pentest session
`POST`	`/api/v1/pentest/sessions/:id/export`	Export encrypted pentest report
`POST`	`/webhook/github`	GitHub webhook (HMAC-SHA256)
`POST`	`/webhook/gitlab`	GitLab webhook (token verify)
`POST`	`/webhook/gitea`	Gitea webhook
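
To make the `:id` and `:repo_id` placeholders in the table concrete, here is a small std-only matcher that extracts named path parameters. It is purely illustrative: the agent uses Axum for routing, and `match_route` is a hypothetical helper, not part of the codebase.

```rust
use std::collections::HashMap;

/// Match a concrete path against a pattern such as `/api/v1/findings/:id`.
/// Returns the captured parameters on a match, or `None` otherwise.
pub fn match_route(pattern: &str, path: &str) -> Option<HashMap<String, String>> {
    let pat: Vec<&str> = pattern.trim_matches('/').split('/').collect();
    let segs: Vec<&str> = path.trim_matches('/').split('/').collect();
    if pat.len() != segs.len() {
        return None;
    }
    let mut params = HashMap::new();
    for (p, s) in pat.iter().zip(segs.iter()) {
        if let Some(name) = p.strip_prefix(':') {
            // A placeholder must capture a non-empty segment.
            if s.is_empty() {
                return None;
            }
            params.insert(name.to_string(), s.to_string());
        } else if p != s {
            // Literal segments must match exactly.
            return None;
        }
    }
    Some(params)
}
```

For example, matching `/api/v1/repositories/:id/scan` against `/api/v1/repositories/42/scan` captures `id` as `42`.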

Dashboard Pages

Page	Description
Overview	Stat cards, severity distribution, AI chat cards, MCP status
Repositories	Add/manage tracked repos, trigger scans, webhook config
Findings	Filterable table by severity, type, status, scanner
Finding Detail	Code evidence, remediation, suggested fix, linked issue
SBOM	Dependency inventory with vulnerability badges, license summary
Issues	Cross-tracker view (GitHub + GitLab + Jira + Gitea)
Code Graph	Interactive architecture visualization, impact analysis
AI Chat	RAG-powered Q&A about repository code
DAST	Dynamic scanning targets, findings, and scan history
Pentest	AI-driven pentest sessions, attack chain visualization
MCP Servers	Model Context Protocol server management
Help Chat	Floating assistant (available on every page) for product Q&A

Project Structure

compliance-scanner/
├── compliance-core/        Shared library (models, traits, config, errors)
├── compliance-agent/       Agent daemon (pipeline, LLM, trackers, API, webhooks)
│   └── src/
│       ├── pipeline/       7-stage scan pipeline, dedup, PR reviews, code review
│       ├── llm/            LiteLLM client, triage, descriptions, fixes, review prompts
│       ├── trackers/       GitHub, GitLab, Jira, Gitea integrations
│       ├── pentest/        AI-driven pentest orchestrator, tools, reports
│       ├── rag/            RAG pipeline, chunking, embedding
│       ├── api/            REST API (Axum), help chat
│       └── webhooks/       GitHub, GitLab, Gitea webhook receivers
├── compliance-dashboard/   Dioxus fullstack dashboard
│   └── src/
│       ├── components/     Reusable UI (sidebar, help chat, attack chain, etc.)
│       ├── infrastructure/ Server functions, DB, config, auth
│       └── pages/          Full page views (overview, DAST, pentest, graph, etc.)
├── compliance-graph/       Code knowledge graph (tree-sitter, embeddings, RAG)
├── compliance-dast/        Dynamic application security testing
├── compliance-mcp/         Model Context Protocol server
├── docs/                   VitePress documentation site
├── assets/                 Static assets (CSS, icons)
└── styles/                 Tailwind input stylesheet

External Services

Service	Purpose	Default URL
MongoDB	Persistence	`mongodb://localhost:27017`
LiteLLM	LLM proxy (chat, triage, embeddings)	`http://localhost:4000`
SearXNG	CVE context search	`http://localhost:8888`
Keycloak	Authentication (OAuth2/PKCE, SSO)	`http://localhost:8080`
Semgrep	SAST scanning	CLI tool
Syft	SBOM generation	CLI tool
Chromium	Headless browser (pentesting, PDF)	Managed via Docker
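
The default URLs in the table could be captured in a small configuration type, sketched below. `ServiceUrls` and `or_default` are hypothetical names; how the real config in `compliance-core` loads and overrides these values is not shown in the source, so the override is passed in explicitly rather than read from any particular environment variable.

```rust
/// Hypothetical holder for the URL-addressed external services.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ServiceUrls {
    pub mongodb: String,
    pub litellm: String,
    pub searxng: String,
    pub keycloak: String,
}

impl Default for ServiceUrls {
    /// Defaults copied from the External Services table.
    fn default() -> Self {
        Self {
            mongodb: "mongodb://localhost:27017".to_string(),
            litellm: "http://localhost:4000".to_string(),
            searxng: "http://localhost:8888".to_string(),
            keycloak: "http://localhost:8080".to_string(),
        }
    }
}

/// Use the override when it is present and non-blank, else the default.
pub fn or_default(override_value: Option<&str>, default: &str) -> String {
    match override_value.map(str::trim) {
        Some(v) if !v.is_empty() => v.to_string(),
        _ => default.to_string(),
    }
}
```

A caller would typically build `ServiceUrls::default()` once and replace individual fields with `or_default(...)` results for any service that runs somewhere other than localhost.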

Built with Rust, Dioxus, and a commitment to automated security compliance.

Releases 1

v0.2.0 — AI-Native Security & Compliance Platform Latest

2026-03-30 13:18:47 +00:00

Languages

Rust 93.6%

CSS 5.6%

JavaScript 0.5%

Shell 0.3%