feat: refine all LLM system prompts for precision and reduced false positives (#49)
Some checks failed
CI / Check (push) Has been skipped
CI / Deploy Agent (push) Has been cancelled
CI / Deploy Dashboard (push) Has been cancelled
CI / Deploy Docs (push) Has been cancelled
CI / Deploy MCP (push) Has been cancelled
CI / Detect Changes (push) Has been cancelled

This commit was merged in pull request #49.
This commit is contained in:
2026-03-30 07:11:17 +00:00
parent ff088f9eb4
commit dd53132746
6 changed files with 196 additions and 63 deletions

View File

@@ -8,22 +8,46 @@ use crate::pipeline::orchestrator::GraphContext;
/// Maximum number of findings to include in a single LLM triage call.
const TRIAGE_CHUNK_SIZE: usize = 30;
const TRIAGE_SYSTEM_PROMPT: &str = r#"You are a security finding triage expert. Analyze each of the following security findings with its code context and determine the appropriate action.
const TRIAGE_SYSTEM_PROMPT: &str = r#"You are a pragmatic security triage expert. Your job is to filter out noise and keep only findings that a developer should actually fix. Be aggressive about dismissing false positives — a clean, high-signal list is more valuable than a comprehensive one.
Actions:
- "confirm": The finding is a true positive at the reported severity. Keep as-is.
- "downgrade": The finding is real but over-reported. Lower severity recommended.
- "upgrade": The finding is under-reported. Higher severity recommended.
- "dismiss": The finding is a false positive. Should be removed.
- "confirm": True positive with real impact. Keep severity as-is.
- "downgrade": Real issue but over-reported severity. Lower it.
- "upgrade": Under-reported — higher severity warranted.
- "dismiss": False positive, not exploitable, or not actionable. Remove it.
Consider:
- Is the code in a test, example, or generated file? (lower confidence for test code)
- Does the surrounding code context confirm or refute the finding?
- Is the finding actionable by a developer?
- Would a real attacker be able to exploit this?
Dismiss when:
- The scanner flagged a language idiom as a bug (see examples below)
- The finding is in test/example/generated/vendored code
- The "vulnerability" requires preconditions that don't exist in the code
- The finding is about code style, complexity, or theoretical concerns rather than actual bugs
- A hash function is used for non-security purposes (dedup, caching, content addressing)
- Internal logging of non-sensitive operational data is flagged as "information disclosure"
- The finding duplicates another finding already in the list
- Framework-provided security is already in place (e.g. ORM parameterized queries, CSRF middleware, auth decorators)
Respond with a JSON array, one entry per finding in the same order they were presented:
[{"id": "<fingerprint>", "action": "confirm|downgrade|upgrade|dismiss", "confidence": 0-10, "rationale": "brief explanation", "remediation": "optional fix suggestion"}, ...]"#;
Common false positive patterns by language (dismiss these):
- Rust: short-circuit `||`/`&&`, variable shadowing, `clone()`, `unsafe` with safety docs, `sha2` for fingerprinting
- Python: EAFP try/except, `subprocess` with hardcoded args, `pickle` on trusted data, Django `mark_safe` on static content
- Go: `if err != nil` is not "swallowed error", `crypto/rand` is secure, returning errors is not "information disclosure"
- Java/Kotlin: Spring Security annotations are valid auth, JPA parameterized queries are safe, Kotlin `!!` in tests is fine
- Ruby: Rails `params.permit` is validation, ActiveRecord finders are parameterized, `html_safe` on generated content
- PHP: PDO prepared statements are safe, Laravel Eloquent is parameterized, `htmlspecialchars` is XSS mitigation
- C/C++: `strncpy`/`snprintf` are bounds-checked, smart pointers manage memory, RAII handles cleanup
Confirm only when:
- You can describe a concrete scenario where the bug manifests or the vulnerability is exploitable
- The fix is actionable (developer can change specific code to resolve it)
- The finding is in production code that handles external input or sensitive data
Confidence scoring (0-10):
- 8-10: Certain true positive with clear exploit/bug scenario
- 5-7: Likely true positive, some assumptions required
- 3-4: Uncertain, needs manual review
- 0-2: Almost certainly a false positive
Respond with a JSON array, one entry per finding in the same order presented (no markdown fences):
[{"id": "<fingerprint>", "action": "confirm|downgrade|upgrade|dismiss", "confidence": 0-10, "rationale": "1-2 sentences", "remediation": "optional fix"}, ...]"#;
pub async fn triage_findings(
llm: &Arc<LlmClient>,