feat: refine all LLM system prompts for precision and reduced false positives (#49)

2026-03-30 07:11:17 +00:00
parent ff088f9eb4
commit dd53132746
6 changed files with 196 additions and 63 deletions
@@ -8,22 +8,46 @@ use crate::pipeline::orchestrator::GraphContext;
 /// Maximum number of findings to include in a single LLM triage call.
 const TRIAGE_CHUNK_SIZE: usize = 30;

-const TRIAGE_SYSTEM_PROMPT: &str = r#"You are a security finding triage expert. Analyze each of the following security findings with its code context and determine the appropriate action.
+const TRIAGE_SYSTEM_PROMPT: &str = r#"You are a pragmatic security triage expert. Your job is to filter out noise and keep only findings that a developer should actually fix. Be aggressive about dismissing false positives — a clean, high-signal list is more valuable than a comprehensive one.

 Actions:
- "confirm": The finding is a true positive at the reported severity. Keep as-is.
- "downgrade": The finding is real but over-reported. Lower severity recommended.
- "upgrade": The finding is under-reported. Higher severity recommended.
- "dismiss": The finding is a false positive. Should be removed.
+- "confirm": True positive with real impact. Keep severity as-is.
+- "downgrade": Real issue but over-reported severity. Lower it.
+- "upgrade": Under-reported — higher severity warranted.
+- "dismiss": False positive, not exploitable, or not actionable. Remove it.

-Consider:
- Is the code in a test, example, or generated file? (lower confidence for test code)
- Does the surrounding code context confirm or refute the finding?
- Is the finding actionable by a developer?
- Would a real attacker be able to exploit this?
+Dismiss when:
+- The scanner flagged a language idiom as a bug (see examples below)
+- The finding is in test/example/generated/vendored code
+- The "vulnerability" requires preconditions that don't exist in the code
+- The finding is about code style, complexity, or theoretical concerns rather than actual bugs
+- A hash function is used for non-security purposes (dedup, caching, content addressing)
+- Internal logging of non-sensitive operational data is flagged as "information disclosure"
+- The finding duplicates another finding already in the list
+- Framework-provided security is already in place (e.g. ORM parameterized queries, CSRF middleware, auth decorators)

-Respond with a JSON array, one entry per finding in the same order they were presented:
-[{"id": "<fingerprint>", "action": "confirm|downgrade|upgrade|dismiss", "confidence": 0-10, "rationale": "brief explanation", "remediation": "optional fix suggestion"}, ...]"#;
+Common false positive patterns by language (dismiss these):
+- Rust: short-circuit `||`/`&&`, variable shadowing, `clone()`, `unsafe` with safety docs, `sha2` for fingerprinting
+- Python: EAFP try/except, `subprocess` with hardcoded args, `pickle` on trusted data, Django `mark_safe` on static content
+- Go: `if err != nil` is not "swallowed error", `crypto/rand` is secure, returning errors is not "information disclosure"
+- Java/Kotlin: Spring Security annotations are valid auth, JPA parameterized queries are safe, Kotlin `!!` in tests is fine
+- Ruby: Rails `params.permit` is validation, ActiveRecord finders are parameterized, `html_safe` on generated content
+- PHP: PDO prepared statements are safe, Laravel Eloquent is parameterized, `htmlspecialchars` is XSS mitigation
+- C/C++: `strncpy`/`snprintf` are bounds-checked, smart pointers manage memory, RAII handles cleanup
+
+Confirm only when:
+- You can describe a concrete scenario where the bug manifests or the vulnerability is exploitable
+- The fix is actionable (developer can change specific code to resolve it)
+- The finding is in production code that handles external input or sensitive data
+
+Confidence scoring (0-10):
+- 8-10: Certain true positive with clear exploit/bug scenario
+- 5-7: Likely true positive, some assumptions required
+- 3-4: Uncertain, needs manual review
+- 0-2: Almost certainly a false positive
+
+Respond with a JSON array, one entry per finding in the same order presented (no markdown fences):
+[{"id": "<fingerprint>", "action": "confirm|downgrade|upgrade|dismiss", "confidence": 0-10, "rationale": "1-2 sentences", "remediation": "optional fix"}, ...]"#;

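Note: the action list and the confidence bands in the new prompt map naturally onto small Rust types. The following is a minimal sketch, not code from this commit: `TriageAction`, `ConfidenceBand`, `confidence_band`, and `keeps_finding` are hypothetical names, and the lenient handling of case and whitespace is an assumption about how a caller might treat model output.

```rust
/// Actions the prompt asks the model to choose from (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TriageAction {
    Confirm,
    Downgrade,
    Upgrade,
    Dismiss,
}

impl TriageAction {
    /// Parse the "action" string from one response entry.
    /// Assumption: tolerate surrounding whitespace and mixed case.
    pub fn parse(raw: &str) -> Option<Self> {
        match raw.trim().to_ascii_lowercase().as_str() {
            "confirm" => Some(Self::Confirm),
            "downgrade" => Some(Self::Downgrade),
            "upgrade" => Some(Self::Upgrade),
            "dismiss" => Some(Self::Dismiss),
            _ => None, // anything else is treated as unparseable
        }
    }
}

/// Confidence bands as described in the prompt's 0-10 scoring section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConfidenceBand {
    AlmostCertainlyFalsePositive, // 0-2
    NeedsManualReview,            // 3-4
    LikelyTruePositive,           // 5-7
    CertainTruePositive,          // 8-10
}

/// Map a raw score to its band; scores above 10 are rejected.
pub fn confidence_band(score: u8) -> Option<ConfidenceBand> {
    match score {
        0..=2 => Some(ConfidenceBand::AlmostCertainlyFalsePositive),
        3..=4 => Some(ConfidenceBand::NeedsManualReview),
        5..=7 => Some(ConfidenceBand::LikelyTruePositive),
        8..=10 => Some(ConfidenceBand::CertainTruePositive),
        _ => None,
    }
}

/// Per the prompt, only "dismiss" removes a finding from the list.
pub fn keeps_finding(action: TriageAction) -> bool {
    action != TriageAction::Dismiss
}
```

Under these assumptions, `TriageAction::parse("dismiss")` yields `Some(TriageAction::Dismiss)`, and `confidence_band(3)` lands in the "needs manual review" band. Returning `None` for unknown actions or out-of-range scores leaves the caller to decide whether to fall back to keeping the finding unchanged.
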
 pub async fn triage_findings(
    llm: &Arc<LlmClient>,