feat: deduplicate DAST findings, PR comments, and pentest reports

Two-phase DAST dedup: exact fingerprint match (title+endpoint+method)
and CWE-based related finding merge (e.g., HSTS reported as both
security_header_missing and tls_misconfiguration). Applied at insertion
time in the pentest orchestrator and at report export.
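
The two phases described above can be sketched roughly as follows. This is a
minimal illustration, not the code from this commit: the `Finding` struct, its
fields, the normalization rules, and the choice to merge on (CWE, endpoint) are
all assumptions made for the example, and the real `dedup_dast_findings` may
differ.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified stand-in for the real `DastFinding` type.
#[derive(Debug, Clone)]
pub struct Finding {
    pub title: String,
    pub endpoint: String,
    pub method: String,
    pub cwe: Option<String>,
    pub vuln_type: String,
    /// Other vulnerability types that were merged into this finding.
    pub also_reported_as: Vec<String>,
}

/// Assumed normalization: trim, drop trailing slashes, lowercase.
fn norm_endpoint(endpoint: &str) -> String {
    endpoint.trim().trim_end_matches('/').to_lowercase()
}

/// Phase 1 key: title + endpoint + method, normalized.
fn fingerprint(f: &Finding) -> String {
    format!(
        "{}|{}|{}",
        f.title.trim().to_lowercase(),
        norm_endpoint(&f.endpoint),
        f.method.trim().to_uppercase()
    )
}

pub fn dedup_findings(findings: Vec<Finding>) -> Vec<Finding> {
    // Phase 1: drop exact fingerprint duplicates, keeping the first occurrence.
    let mut seen: HashSet<String> = HashSet::new();
    let unique: Vec<Finding> = findings
        .into_iter()
        .filter(|f| seen.insert(fingerprint(f)))
        .collect();

    // Phase 2: merge findings that share a CWE on the same endpoint
    // (an assumed definition of "related"). Findings without a CWE pass through.
    let mut index: HashMap<(String, String), usize> = HashMap::new();
    let mut merged: Vec<Finding> = Vec::new();
    for f in unique {
        let cwe = match f.cwe.clone() {
            Some(cwe) => cwe,
            None => {
                merged.push(f);
                continue;
            }
        };
        let key = (cwe.to_uppercase(), norm_endpoint(&f.endpoint));
        match index.get(&key).copied() {
            Some(i) => {
                // Keep the first finding and record the other type on it.
                let kept = &mut merged[i];
                if kept.vuln_type != f.vuln_type
                    && !kept.also_reported_as.contains(&f.vuln_type)
                {
                    kept.also_reported_as.push(f.vuln_type);
                }
            }
            None => {
                index.insert(key, merged.len());
                merged.push(f);
            }
        }
    }
    merged
}
```

Under these assumptions, an HSTS issue reported once as
security_header_missing and once as tls_misconfiguration with the same CWE and
endpoint collapses into a single finding that lists the second type in
`also_reported_as`.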

PR review comments now include fingerprints and skip duplicates within
the same review run.
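
A rough sketch of how per-run comment deduplication could work is shown below.
Everything in it is hypothetical: the `ReviewComment` struct, the fields that
feed the fingerprint, the use of `DefaultHasher` (chosen only to keep the
example std-only), and the hidden-marker format are illustrative assumptions,
not the format this commit emits.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical review comment; the real type is not shown in this diff.
pub struct ReviewComment {
    pub path: String,
    pub line: u32,
    pub title: String,
    pub body: String,
}

/// Assumed fingerprint inputs: file path, line, and normalized title.
/// `DefaultHasher` is not guaranteed stable across Rust releases, so a
/// cryptographic hash would be a better fit for persisted fingerprints.
fn comment_fingerprint(c: &ReviewComment) -> String {
    let mut hasher = DefaultHasher::new();
    c.path.hash(&mut hasher);
    c.line.hash(&mut hasher);
    c.title.trim().to_lowercase().hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Returns the comment bodies to post for one review run, skipping any
/// comment whose fingerprint was already seen in this run.
pub fn prepare_comments(comments: Vec<ReviewComment>) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut out = Vec::new();
    for c in comments {
        let fp = comment_fingerprint(&c);
        if !seen.insert(fp.clone()) {
            continue; // duplicate within this run
        }
        // Embed the fingerprint as a hidden marker (assumed format).
        out.push(format!("{}\n\n<!-- fingerprint:{} -->", c.body, fp));
    }
    out
}
```

Embedding the fingerprint in the comment body is one way to make later runs
able to recognize comments that were already posted, although the commit message
only states that duplicates are skipped within the same run.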

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sharang Parnerkar
2026-03-29 22:15:48 +02:00
parent 46c7188757
commit 5da33ef882
4 changed files with 435 additions and 4 deletions


@@ -95,8 +95,8 @@ pub async fn export_session_report(
         Err(_) => Vec::new(),
     };
-    // Fetch DAST findings for this session
-    let findings: Vec<DastFinding> = match agent
+    // Fetch DAST findings for this session, then deduplicate
+    let raw_findings: Vec<DastFinding> = match agent
         .db
         .dast_findings()
         .find(doc! { "session_id": &id })
@@ -106,6 +106,14 @@ pub async fn export_session_report(
         Ok(cursor) => collect_cursor_async(cursor).await,
         Err(_) => Vec::new(),
     };
+    let raw_count = raw_findings.len();
+    let findings = crate::pipeline::dedup::dedup_dast_findings(raw_findings);
+    if findings.len() < raw_count {
+        tracing::info!(
+            "Deduped DAST findings for session {id}: {raw_count} → {}",
+            findings.len()
+        );
+    }
     // Fetch SAST findings, SBOM, and code context for the linked repository
     let repo_id = session