Files
compliance-scanner-agent/compliance-agent/src/webhooks/gitlab.rs
T
Sharang Parnerkar 0f6dd1135e
CI / Check (pull_request) Successful in 10m33s
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped
feat(m7.2-C): migrate background paths to per-tenant pool
Closes the loop on M7.2 isolation for paths that don't have a JWT
context: scheduler, webhooks, and the agent's `run_scan` / `run_pr_review`
helpers all now take a `tenant_id` at the boundary and resolve to a
tenant-scoped `Database` via `db_pool.for_tenant_id(...)`. Internal
orchestrators (PipelineOrchestrator, PentestOrchestrator) and pipeline
helpers were already DB-agnostic — they take `db: Database` at
construction and don't care which tenant it points to.

Changes
- DatabasePool::for_tenant_id(&str) — same as for_tenant but accepts
  a bare tenant_id. Background paths don't have a full TenantContext.
  for_tenant is now a thin wrapper that delegates.
- agent.run_scan(tenant_id, repo_id, trigger) — pulls the tenant
  database before constructing the PipelineOrchestrator. Was:
  run_scan(repo_id, trigger) reading agent.db.
- agent.run_pr_review(tenant_id, repo_id, ...) — same shape.
- Webhook routes change: /webhook/{tenant_id}/{platform}/{repo_id}.
  Tenant is part of the URL path because webhooks arrive without a
  JWT — they're authenticated via per-repo HMAC, not the tenant gate.
  The dashboard surfaces the full per-tenant URL when the repo is
  registered. All three handlers (gitea, github, gitlab) updated.
- scheduler.rs — iterates tenants from $SCHEDULER_TENANT_IDS
  (comma-separated env), or DEV_TENANT_ID's `dev` default. Both
  scan_all_repos and monitor_cves now run once per configured
  tenant. M7.2-D will replace this static config with a pull from
  the tenant-registry.
- api/handlers/repos.rs::trigger_scan now passes tenant.0.tenant_id.

What's unchanged because it didn't need to change
- PipelineOrchestrator, PentestOrchestrator: take `db: Database` at
  construction — they're tenant-DB-agnostic by design. The caller
  picks the tenant DB.
- pipeline/{dedup,graph_build,issue_creation,sbom/mod}.rs,
  pentest/{context,report/html/*}.rs, trackers/jira.rs, llm/triage.rs:
  take `&Database` or `&mongodb::Database` as args, transitively
  tenant-scoped via the caller.

Test plan
- cargo fmt --all clean
- cargo clippy --workspace --exclude compliance-dashboard
  -- -D warnings clean
- cargo test -p compliance-core --lib — 7 pass
- cargo test -p compliance-agent --lib — 228 pass
- cargo test -p compliance-agent --test tenant_isolation — 5 pass
- cargo test -p compliance-agent --test tenant_status_middleware
  — 6 pass

What's left (PR-D)
- Drop the transitional agent.db field — no remaining call sites
  (verified by `grep -rn "agent\.db\b" compliance-agent/src`).
- main.rs / TestServer stop building the legacy Database; only the
  pool remains.
- Add cross-tenant admin helpers (list tenants, drop tenant DB) on
  the pool for offboarding flows.
- Pull tenants from the tenant-registry instead of an env var.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-17 15:00:37 +02:00

133 lines
4.2 KiB
Rust

use std::sync::Arc;
use axum::body::Bytes;
use axum::extract::{Extension, Path};
use axum::http::{HeaderMap, StatusCode};
use compliance_core::models::ScanTrigger;
use crate::agent::ComplianceAgent;
pub async fn handle_gitlab_webhook(
Extension(agent): Extension<Arc<ComplianceAgent>>,
Path((tenant_id, repo_id)): Path<(String, String)>,
headers: HeaderMap,
body: Bytes,
) -> StatusCode {
// Look up the repo in the tenant's database to get its webhook secret
let oid = match mongodb::bson::oid::ObjectId::parse_str(&repo_id) {
Ok(oid) => oid,
Err(_) => return StatusCode::NOT_FOUND,
};
let db = match agent.db_pool.for_tenant_id(&tenant_id).await {
Ok(db) => db,
Err(e) => {
tracing::warn!("GitLab webhook: cannot open tenant database '{tenant_id}': {e}");
return StatusCode::NOT_FOUND;
}
};
let repo = match db
.repositories()
.find_one(mongodb::bson::doc! { "_id": oid })
.await
{
Ok(Some(repo)) => repo,
_ => {
tracing::warn!("GitLab webhook: repo {repo_id} not found in tenant '{tenant_id}'");
return StatusCode::NOT_FOUND;
}
};
// GitLab sends the secret token in X-Gitlab-Token header (plain text comparison)
if let Some(secret) = &repo.webhook_secret {
let token = headers
.get("x-gitlab-token")
.and_then(|v| v.to_str().ok())
.unwrap_or("");
if token != secret {
tracing::warn!("GitLab webhook: invalid token for repo {repo_id}");
return StatusCode::UNAUTHORIZED;
}
}
let payload: serde_json::Value = match serde_json::from_slice(&body) {
Ok(v) => v,
Err(e) => {
tracing::warn!("GitLab webhook: invalid JSON: {e}");
return StatusCode::BAD_REQUEST;
}
};
let event_type = payload["object_kind"].as_str().unwrap_or("");
match event_type {
"push" => {
let agent_clone = (*agent).clone();
let repo_id = repo_id.clone();
let tenant_id = tenant_id.clone();
tokio::spawn(async move {
tracing::info!(
"GitLab push webhook: triggering scan for {repo_id} in tenant {tenant_id}"
);
if let Err(e) = agent_clone
.run_scan(&tenant_id, &repo_id, ScanTrigger::Webhook)
.await
{
tracing::error!("Webhook-triggered scan failed: {e}");
}
});
StatusCode::OK
}
"merge_request" => handle_merge_request(agent, &tenant_id, &repo_id, &payload).await,
_ => {
tracing::debug!("GitLab webhook: ignoring event '{event_type}'");
StatusCode::OK
}
}
}
async fn handle_merge_request(
agent: Arc<ComplianceAgent>,
tenant_id: &str,
repo_id: &str,
payload: &serde_json::Value,
) -> StatusCode {
let action = payload["object_attributes"]["action"]
.as_str()
.unwrap_or("");
if action != "open" && action != "update" {
return StatusCode::OK;
}
let mr_iid = payload["object_attributes"]["iid"].as_u64().unwrap_or(0);
let head_sha = payload["object_attributes"]["last_commit"]["id"]
.as_str()
.unwrap_or("");
let base_sha = payload["object_attributes"]["diff_refs"]["base_sha"]
.as_str()
.unwrap_or("");
if mr_iid == 0 || head_sha.is_empty() || base_sha.is_empty() {
tracing::warn!("GitLab MR webhook: missing required fields");
return StatusCode::BAD_REQUEST;
}
let repo_id = repo_id.to_string();
let tenant_id = tenant_id.to_string();
let head_sha = head_sha.to_string();
let base_sha = base_sha.to_string();
let agent_clone = (*agent).clone();
tokio::spawn(async move {
tracing::info!("GitLab MR webhook: reviewing MR !{mr_iid} on {repo_id}");
if let Err(e) = agent_clone
.run_pr_review(&tenant_id, &repo_id, mr_iid, &base_sha, &head_sha)
.await
{
tracing::error!("MR review failed for !{mr_iid}: {e}");
}
});
StatusCode::OK
}