docs: rewrite user docs, fix modal scroll, webhook URL, and sccache
Some checks failed
CI / Clippy (push) Failing after 2m49s
CI / Security Audit (push) Has been skipped
CI / Tests (push) Has been skipped
CI / Detect Changes (push) Has been skipped
CI / Format (pull_request) Successful in 3s
CI / Clippy (pull_request) Failing after 2m52s
CI / Security Audit (pull_request) Has been skipped
CI / Tests (pull_request) Has been skipped
CI / Format (push) Successful in 3s
CI / Deploy Agent (push) Has been skipped
CI / Deploy Dashboard (push) Has been skipped
CI / Deploy Docs (push) Has been skipped
CI / Deploy MCP (push) Has been skipped
CI / Detect Changes (pull_request) Has been skipped
CI / Deploy Agent (pull_request) Has been skipped
CI / Deploy Dashboard (pull_request) Has been skipped
CI / Deploy Docs (pull_request) Has been skipped
CI / Deploy MCP (pull_request) Has been skipped

Rewrite all public documentation to be user-facing only:
- Remove deployment, configuration, and self-hosting sections
- Add guide pages for SBOM, issues, webhooks & PR reviews
- Add reference pages for glossary and tools/scanners
- Add 12 screenshots from live dashboard
- Explain MCP, LLM triage, false positives, human-in-the-loop

Fix edit repository modal not scrollable (max-height + overflow-y).
Show full webhook URL using window.location.origin instead of path.
Unset RUSTC_WRAPPER in agent cargo commands to avoid sccache errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sharang Parnerkar
2026-03-11 14:17:46 +01:00
parent 689daa0f49
commit c253e4ef5e
40 changed files with 872 additions and 1334 deletions


@@ -1,153 +0,0 @@
# Configuration
Compliance Scanner is configured through environment variables. Copy `.env.example` to `.env` and edit the values.
## Required Settings
### MongoDB
```bash
MONGODB_URI=mongodb://root:example@localhost:27017/compliance_scanner?authSource=admin
MONGODB_DATABASE=compliance_scanner
```
### Agent
```bash
AGENT_PORT=3001
```
### Dashboard
```bash
DASHBOARD_PORT=8080
AGENT_API_URL=http://localhost:3001
```
## LLM Configuration
The AI features (chat, remediation suggestions) use LiteLLM as a proxy to various LLM providers:
```bash
LITELLM_URL=http://localhost:4000
LITELLM_API_KEY=your-key
LITELLM_MODEL=gpt-4o
LITELLM_EMBED_MODEL=text-embedding-3-small
```
The embed model is used for the RAG/AI Chat feature to generate code embeddings.
## Git Provider Tokens
### GitHub
```bash
GITHUB_TOKEN=ghp_xxxx
GITHUB_WEBHOOK_SECRET=your-webhook-secret
```
### GitLab
```bash
GITLAB_URL=https://gitlab.com
GITLAB_TOKEN=glpat-xxxx
GITLAB_WEBHOOK_SECRET=your-webhook-secret
```
## Issue Tracker Integration
### Jira
```bash
JIRA_URL=https://your-org.atlassian.net
JIRA_EMAIL=user@example.com
JIRA_API_TOKEN=your-api-token
JIRA_PROJECT_KEY=SEC
```
When configured, new findings automatically create Jira issues in the specified project.
## Scan Schedules
Cron expressions for automated scanning:
```bash
# Scan every 6 hours
SCAN_SCHEDULE=0 0 */6 * * *
# Check for new CVEs daily at midnight
CVE_MONITOR_SCHEDULE=0 0 0 * * *
```
## Search Engine
SearXNG is used for CVE enrichment and vulnerability research:
```bash
SEARXNG_URL=http://localhost:8888
```
## NVD API
An NVD API key increases rate limits for CVE lookups:
```bash
NVD_API_KEY=your-nvd-api-key
```
Get a free key at [https://nvd.nist.gov/developers/request-an-api-key](https://nvd.nist.gov/developers/request-an-api-key).
## MCP Server
The MCP server exposes compliance data to external LLMs via the Model Context Protocol. See [MCP Server](/features/mcp-server) for full details.
```bash
# Set MCP_PORT to enable HTTP transport (omit for stdio mode)
MCP_PORT=8090
```
The MCP server shares the `MONGODB_URI` and `MONGODB_DATABASE` variables with the rest of the platform.
## Clone Path
Where the agent stores cloned repository files:
```bash
GIT_CLONE_BASE_PATH=/tmp/compliance-scanner/repos
```
## All Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `MONGODB_URI` | Yes | — | MongoDB connection string |
| `MONGODB_DATABASE` | No | `compliance_scanner` | Database name |
| `AGENT_PORT` | No | `3001` | Agent REST API port |
| `DASHBOARD_PORT` | No | `8080` | Dashboard web UI port |
| `AGENT_API_URL` | No | `http://localhost:3001` | Agent URL for dashboard |
| `LITELLM_URL` | No | `http://localhost:4000` | LiteLLM proxy URL |
| `LITELLM_API_KEY` | No | — | LiteLLM API key |
| `LITELLM_MODEL` | No | `gpt-4o` | LLM model for analysis |
| `LITELLM_EMBED_MODEL` | No | `text-embedding-3-small` | Embedding model for RAG |
| `GITHUB_TOKEN` | No | — | GitHub personal access token |
| `GITHUB_WEBHOOK_SECRET` | No | — | GitHub webhook signing secret |
| `GITLAB_URL` | No | `https://gitlab.com` | GitLab instance URL |
| `GITLAB_TOKEN` | No | — | GitLab access token |
| `GITLAB_WEBHOOK_SECRET` | No | — | GitLab webhook signing secret |
| `JIRA_URL` | No | — | Jira instance URL |
| `JIRA_EMAIL` | No | — | Jira account email |
| `JIRA_API_TOKEN` | No | — | Jira API token |
| `JIRA_PROJECT_KEY` | No | — | Jira project key for issues |
| `SEARXNG_URL` | No | `http://localhost:8888` | SearXNG instance URL |
| `NVD_API_KEY` | No | — | NVD API key for CVE lookups |
| `SCAN_SCHEDULE` | No | `0 0 */6 * * *` | Cron schedule for scans |
| `CVE_MONITOR_SCHEDULE` | No | `0 0 0 * * *` | Cron schedule for CVE checks |
| `GIT_CLONE_BASE_PATH` | No | `/tmp/compliance-scanner/repos` | Local clone directory |
| `KEYCLOAK_URL` | No | — | Keycloak server URL |
| `KEYCLOAK_REALM` | No | — | Keycloak realm name |
| `KEYCLOAK_CLIENT_ID` | No | — | Keycloak client ID |
| `REDIRECT_URI` | No | — | OAuth callback URL |
| `APP_URL` | No | — | Application root URL |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | No | — | OTLP collector endpoint |
| `OTEL_SERVICE_NAME` | No | — | OpenTelemetry service name |
| `MCP_PORT` | No | — | MCP HTTP transport port (omit for stdio) |


@@ -1,68 +1,62 @@
# Managing Findings
# Understanding Findings
Findings are security issues discovered during scans. The findings workflow lets you triage, track, and resolve vulnerabilities across all your repositories.
## Findings List
Navigate to **Findings** in the sidebar to see all findings. The table shows:
Navigate to **Findings** in the sidebar to see all findings across your repositories.
| Column | Description |
|--------|-------------|
| Severity | Color-coded badge: Critical (red), High (orange), Medium (yellow), Low (green) |
| Title | Short description of the vulnerability (clickable) |
| Type | SAST, SBOM, CVE, GDPR, or OAuth |
| Scanner | Tool that found the issue (e.g. semgrep, syft) |
| File | Source file path where the issue was found |
| Status | Current triage status |
![Findings list with severity badges, types, and filter controls](/screenshots/findings-list.png)
## Filtering
### Filtering
Use the filter bar at the top to narrow results:
Use the filter bar to narrow results:
- **Repository** — Filter to a specific repository or view all
- **Severity** -- Critical, High, Medium, Low, or Info
- **Type** -- SAST, SBOM, CVE, GDPR, OAuth
- **Status** -- Open, Triaged, Resolved, False Positive, Ignored
- **Repository** -- filter to a specific repository or view all
- **Severity** -- Critical, High, Medium, Low, or Info
- **Type** -- SAST, SBOM, CVE, GDPR, OAuth, Secrets, Code Review
- **Status** -- Open, Triaged, Resolved, False Positive, Ignored
Filters can be combined. Results are paginated with 20 findings per page.
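As a rough illustration of how combined filters and pagination interact, here is a minimal sketch. It assumes that combining filters means a finding must match all of them; the `Finding` fields and the `filter_findings` and `paginate` helpers are hypothetical names, not the platform's API. Only the page size of 20 comes from the text above.

```python
from dataclasses import dataclass

PAGE_SIZE = 20  # page size stated in the docs


@dataclass
class Finding:
    # Hypothetical record shape, for illustration only.
    repo: str
    severity: str
    kind: str
    status: str


def filter_findings(findings, repo=None, severity=None, kind=None, status=None):
    """Keep findings that match every filter that is set (assumed AND semantics)."""
    wanted = {"repo": repo, "severity": severity, "kind": kind, "status": status}
    active = {name: value for name, value in wanted.items() if value is not None}
    return [
        f for f in findings
        if all(getattr(f, name) == value for name, value in active.items())
    ]


def paginate(items, page, page_size=PAGE_SIZE):
    """Return the requested 1-indexed page of items."""
    start = (page - 1) * page_size
    return items[start:start + page_size]
```

With 45 matching findings, pages 1 and 2 hold 20 findings each and page 3 holds the remaining 5.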
### Columns
| Column | Description |
|--------|-------------|
| Severity | Color-coded badge: Critical (red), High (orange), Medium (yellow), Low (green), Info (blue) |
| Title | Short description of the vulnerability (clickable) |
| Type | SAST, SBOM, CVE, GDPR, OAuth, Secrets, or Code Review |
| Scanner | Tool that found the issue (e.g. Semgrep, Grype) |
| File | Source file path where the issue was found |
| Status | Current triage status |
## Finding Detail
Click any finding title to view its full detail page, which includes:
Click any finding title to view its full detail page.
### Metadata
- Severity level with CWE identifier and CVSS score (when available)
- Scanner tool and scan type
- File path and line number
![Finding detail page showing description, triage rationale, code evidence, remediation, and status controls](/screenshots/finding-detail.png)
The detail page is organized into these sections:
### Description
Full explanation of the vulnerability, why it's a risk, and what conditions trigger it.
A full explanation of the vulnerability: what it is, why it is a risk, and what conditions trigger it.
### AI Triage Rationale
The LLM's assessment of the finding, including why it assigned a particular severity and confidence score. This rationale considers the code context, the type of vulnerability, and the blast radius based on the code knowledge graph.
### Code Evidence
The source code snippet where the issue was found, with syntax highlighting and the file path.
The source code snippet where the issue was found, with syntax highlighting and the file path with line number.
### Remediation
Step-by-step guidance on how to fix the vulnerability.
### Suggested Fix
A code example showing the corrected implementation.
Step-by-step guidance on how to fix the vulnerability, often including a suggested code fix showing the corrected implementation.
### Linked Issue
If the finding was pushed to an issue tracker (GitHub, GitLab, Jira), a direct link to the external issue.
## Updating Status
On the finding detail page, change the finding's status using the status buttons:
| Status | When to Use |
|--------|-------------|
| **Open** | New finding, not yet reviewed |
| **Triaged** | Reviewed and confirmed as a real issue, pending fix |
| **Resolved** | Fix has been applied |
| **False Positive** | Finding is not a real vulnerability in this context |
| **Ignored** | Known issue that won't be fixed (accepted risk) |
Status changes are persisted immediately.
If the finding has been pushed to an issue tracker (GitHub, GitLab, Gitea, Jira), a direct link to the external issue appears here.
## Severity Levels
@@ -73,3 +67,77 @@ Status changes are persisted immediately.
| **Medium** | Moderate risk, exploitation requires specific conditions | Insecure deserialization, weak crypto |
| **Low** | Minor risk, limited impact | Information disclosure, verbose errors |
| **Info** | Informational, no direct security impact | Best practice recommendations |
## Finding Types
| Type | Source | Description |
|------|--------|-------------|
| **SAST** | Semgrep | Code-level vulnerabilities found through static analysis |
| **SBOM** | Syft + Grype | Vulnerable dependencies identified in your software bill of materials |
| **CVE** | NVD | Known CVEs matching your dependency versions |
| **GDPR** | Custom rules | Personal data handling and consent issues |
| **OAuth** | Custom rules | OAuth/OIDC misconfigurations and insecure token handling |
| **Secrets** | Custom rules | Hardcoded credentials, API keys, and tokens |
| **Code Review** | LLM | Architecture and security patterns reviewed by the AI engine |
## Triage Workflow
Every finding follows a lifecycle from discovery to resolution. The status indicates where a finding is in this process:
| Status | Meaning |
|--------|---------|
| **Open** | Newly discovered, not yet reviewed |
| **Triaged** | Reviewed and confirmed as a real issue, pending fix |
| **Resolved** | A fix has been applied |
| **False Positive** | Not a real vulnerability in this context |
| **Ignored** | Known issue that will not be fixed (accepted risk) |
On the finding detail page, use the status buttons to move a finding through this workflow. Status changes take effect immediately.
### Recommended Flow
1. A scan discovers a new finding -- it starts as **Open**
2. You review the AI triage rationale and code evidence
3. If it is a real issue, mark it as **Triaged** to signal that it needs a fix
4. Once the fix is deployed and a new scan confirms it, mark it as **Resolved**
5. If the AI got it wrong, mark it as **False Positive** (see below)
## False Positives
Not every finding is a real vulnerability. Static analysis tools can flag code that looks suspicious but is actually safe in context. When this happens:
1. Open the finding detail page
2. Review the code evidence and the AI triage rationale
3. If you determine the finding is not a real issue, click **False Positive**
::: tip
When you mark a finding as a false positive, you are providing a training signal to the AI. Over time, the LLM learns from your feedback and becomes better at distinguishing real vulnerabilities from false alarms in your codebase.
:::
## Human in the Loop
Certifai uses AI to triage findings, but humans make the final decisions. Here is how the process works:
1. **AI triages** -- the LLM reviews each finding, assigns a severity, generates a confidence score, and writes a rationale explaining its assessment
2. **You review** -- you read the AI's analysis alongside the code evidence and decide whether to act on it
3. **You decide** -- you set the final status (Triaged, Resolved, False Positive, or Ignored)
4. **AI learns** -- your feedback on false positives and status changes helps improve future triage accuracy
The AI provides the analysis; you provide the judgment. This approach gives you the speed of automated scanning with the accuracy of human review.
## Developer Feedback
On the finding detail page, you can provide feedback on the AI's triage. This feedback loop serves two purposes:
- **Accuracy** -- helps the platform understand which findings are actionable in your specific codebase and context
- **Context** -- lets you add notes explaining why a finding is or is not relevant, which benefits other team members reviewing the same finding
## Confidence Scores
Each AI-triaged finding includes a confidence score from 0.0 to 1.0, indicating how certain the LLM is about its assessment:
- **0.8 -- 1.0** -- High confidence. The AI is very certain this is (or is not) a real vulnerability.
- **0.5 -- 0.8** -- Moderate confidence. The finding likely warrants human review.
- **Below 0.5** -- Low confidence. The AI is uncertain and recommends manual inspection.
Use confidence scores to prioritize your review queue: start with high-severity, high-confidence findings for the greatest impact.
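The bands and the suggested review order can be sketched as follows. This is illustrative only: the function names are hypothetical, and because the ranges above list 0.8 in both the high and moderate bands, the sketch assumes a score of exactly 0.8 counts as high.

```python
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3, "Info": 4}


def confidence_band(score):
    """Map a 0.0-1.0 confidence score to a band.

    Assumption: exactly 0.8 is treated as high confidence.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence score must be between 0.0 and 1.0")
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "moderate"
    return "low"


def review_order(findings):
    """Sort (severity, confidence) pairs: most severe first, then most confident."""
    return sorted(findings, key=lambda f: (SEVERITY_RANK[f[0]], -f[1]))
```

Sorting on severity first and confidence second puts the high-severity, high-confidence findings at the top of the queue, as recommended above.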


@@ -1,55 +1,49 @@
# Getting Started
Compliance Scanner is a security compliance platform that scans your Git repositories for vulnerabilities, builds software bills of materials, performs dynamic application testing, and provides AI-powered code intelligence.
Certifai is an AI-powered security compliance platform that scans your Git repositories for vulnerabilities, builds software bills of materials, performs dynamic application testing, and provides code intelligence through an interactive knowledge graph and AI chat.
## Architecture
## What You Get
The platform consists of three main components:
When you connect a repository, Certifai runs a comprehensive scan pipeline that covers:
- **Agent** — Background service that clones repositories, runs scans, builds graphs, and exposes a REST API
- **Dashboard** — Web UI built with Dioxus (Rust full-stack framework) for viewing results and managing repositories
- **MongoDB** — Database for storing all scan results, findings, SBOM data, and graph structures
- **Static Analysis (SAST)** -- finds code-level vulnerabilities like injection flaws, insecure crypto, and misconfigurations
- **Software Bill of Materials (SBOM)** -- inventories every dependency, its version, and its license
- **CVE Monitoring** -- cross-references your dependencies against known vulnerabilities
- **Code Knowledge Graph** -- maps the structure of your codebase for impact analysis
- **AI Triage** -- every finding is reviewed by an LLM that provides severity assessment, confidence scores, and remediation guidance
- **Issue Tracking** -- automatically creates issues in your tracker for new findings
## Quick Start with Docker Compose
## Dashboard Overview
The fastest way to get running:
After logging in, you land on the Overview page, which gives you a snapshot of your security posture across all repositories.
```bash
# Clone the repository
git clone <repo-url> compliance-scanner
cd compliance-scanner
![Dashboard overview showing stats cards, severity distribution, and recent scan activity](/screenshots/dashboard-overview.png)
# Copy and configure environment variables
cp .env.example .env
# Edit .env with your settings (see Configuration)
The overview shows key metrics at a glance: total repositories, findings broken down by severity, dependency counts, CVE alerts, and tracker issues. A severity distribution chart visualizes your risk profile, and recent scan runs let you monitor scanning activity.
# Start all services
docker-compose up -d
```
## Quick Walkthrough
This starts:
- MongoDB on port `27017`
- Agent API on port `3001`
- Dashboard on port `8080`
- Chromium (for DAST crawling) on port `3003`
Here is the fastest path from zero to your first scan results:
Open the dashboard at [http://localhost:8080](http://localhost:8080).
### 1. Add a repository
## What Happens During a Scan
Navigate to **Repositories** in the sidebar and click **Add Repository**. Enter a name, the Git clone URL, and the default branch to scan.
When you add a repository and trigger a scan, the agent runs through these phases:
![Add repository dialog](/screenshots/add-repository.png)
1. **Clone** — Clones or pulls the latest code from the Git remote
2. **SAST** — Runs static analysis using Semgrep with rules for OWASP, GDPR, OAuth, and general security
3. **SBOM** — Extracts all dependencies using Syft, identifying packages, versions, licenses, and known vulnerabilities
4. **CVE Check** — Cross-references dependencies against the NVD database for known CVEs
5. **Graph Build** — Parses the codebase to construct a code knowledge graph of functions, classes, and their relationships
6. **Issue Sync** — Creates or updates issues in connected trackers (GitHub, GitLab, Jira) for new findings
### 2. Trigger a scan
Each phase produces results visible in the dashboard immediately.
Click the **Scan** button on your repository row. The scan runs in the background through all phases: cloning, static analysis, SBOM extraction, CVE checking, graph building, and issue sync.
### 3. View findings
Once the scan completes, navigate to **Findings** to see everything that was discovered. Each finding includes a severity level, description, code evidence, and AI-generated remediation guidance.
![Findings list with filters](/screenshots/findings-list.png)
## Next Steps
- [Add your first repository](/guide/repositories)
- [Understand scan results](/guide/findings)
- [Configure integrations](/guide/configuration)
- [Add and configure repositories](/guide/repositories) -- including private repos and issue tracker setup
- [Understand how scans work](/guide/scanning) -- phases, triggers, and deduplication
- [Work with findings](/guide/findings) -- triage, false positives, and developer feedback
- [Explore your SBOM](/guide/sbom) -- dependencies, licenses, and exports

docs/guide/issues.md Normal file

@@ -0,0 +1,56 @@
# Issues & Tracking
Certifai automatically creates issues in your existing issue trackers when new security findings are discovered. This integrates security into your development workflow without requiring teams to check a separate tool.
## How Issues Are Created
When a scan discovers new findings, the following happens automatically:
1. Each new finding is checked against existing issues using its fingerprint
2. If no matching issue exists, a new issue is created in the configured tracker
3. The issue includes the finding title, severity, vulnerability details, file location, and a link back to the finding in Certifai
4. The finding is updated with a link to the external issue
This means every actionable finding gets tracked in the same system your developers already use.
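The four steps above can be sketched roughly like this. It is a simplified model, not Certifai's implementation: the finding dictionaries, the `issues_by_fingerprint` lookup, and the `create_issue` callback are hypothetical, and linking a finding to an already existing issue is an assumption.

```python
def sync_issues(new_findings, issues_by_fingerprint, create_issue):
    """Create one tracker issue per unseen fingerprint and link every finding.

    new_findings: list of dicts with a "fingerprint" key (hypothetical shape).
    issues_by_fingerprint: dict mapping fingerprint -> external issue URL.
    create_issue: callable that files the issue and returns its URL.
    """
    created = []
    for finding in new_findings:
        fp = finding["fingerprint"]
        url = issues_by_fingerprint.get(fp)   # step 1: look for a matching issue
        if url is None:
            url = create_issue(finding)       # steps 2-3: file a new issue
            issues_by_fingerprint[fp] = url
            created.append(url)
        finding["issue_url"] = url            # step 4: link finding to the issue
    return created
```

Passing the tracker call in as a callback keeps the sketch independent of any particular tracker.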
## Issues List
Navigate to **Issues** in the sidebar to see all tracker issues across your repositories.
![Issues list showing tracker issues](/screenshots/issues-list.png)
The issues table shows:
| Column | Description |
|--------|-------------|
| Tracker | Badge showing GitHub, GitLab, Gitea, or Jira |
| External ID | Issue number in the external system |
| Title | Issue title |
| Status | Open, Closed, or tracker-specific status |
| Created | When the issue was created |
| Link | Direct link to the issue in the external tracker |
Click the link to go directly to the issue in your tracker.
## Supported Trackers
| Tracker | How to Configure |
|---------|-----------------|
| **GitHub Issues** | Set up in the repository's issue tracker settings with your GitHub API token |
| **GitLab Issues** | Set up with your GitLab project ID, instance URL, and API token |
| **Gitea Issues** | Set up with your Gitea repository details, instance URL, and API token |
| **Jira** | Set up with your Jira project key, instance URL, email, and API token |
Issue tracker configuration is per-repository. You set it up when [adding or editing a repository](/guide/repositories#configuring-an-issue-tracker).
## Deduplication
Issues are deduplicated using the same fingerprint hash that deduplicates findings. This means:
- If the same vulnerability appears in consecutive scans, only one issue is created
- If a finding is resolved and then reappears, the platform recognizes it and can reopen the existing issue rather than creating a duplicate
- Different findings (even if similar) get separate issues because their fingerprints differ based on file path, line number, and vulnerability type
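To make the fingerprint idea concrete, here is a minimal sketch. The docs only say that fingerprints depend on file path, line number, and vulnerability type; the use of SHA-256, the field separator, and the `fingerprint` function name are assumptions made for illustration.

```python
import hashlib


def fingerprint(file_path, line, vuln_type):
    """Derive a stable hash from the three fields named in the docs.

    Assumption: SHA-256 over the fields joined by a separator that is
    unlikely to appear in a path or rule name.
    """
    raw = "\x1f".join([file_path, str(line), vuln_type])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

The same inputs always produce the same fingerprint, so a repeated finding maps to the existing issue, while a change to any one field yields a different fingerprint and therefore a separate issue.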
## Linked Issues in Finding Detail
When viewing a [finding's detail page](/guide/findings#finding-detail), you will see a **Linked Issue** section if an issue was created for that finding. This provides a direct link to the external tracker issue, making it easy to jump between the security context in Certifai and the development workflow in your tracker.


@@ -1,26 +1,78 @@
# Adding Repositories
Repositories are the core resource in Compliance Scanner. Each tracked repository is scanned on a schedule and its results are available across all features.
Repositories are the core resource in Certifai. Each tracked repository is scanned on a schedule, and its results are available across all features -- findings, SBOM, code graph, AI chat, and issue tracking.
## Adding a Repository
1. Navigate to **Repositories** in the sidebar
2. Click **Add Repository** at the top of the page
2. Click **Add Repository**
3. Fill in the form:
- **Name** — A display name for the repository
- **Git URL** — The clone URL (HTTPS or SSH), e.g. `https://github.com/org/repo.git`
- **Default Branch** — The branch to scan, e.g. `main` or `master`
- **Name** -- a display name for the repository
- **Git URL** -- the clone URL (HTTPS or SSH), e.g. `https://github.com/org/repo.git` or `git@github.com:org/repo.git`
- **Default Branch** -- the branch to scan, e.g. `main` or `master`
4. Click **Add**
![Add repository dialog](/screenshots/add-repository.png)
The repository appears in the list immediately. It will not be scanned until you trigger a scan manually or the next scheduled scan runs.
## Public vs Private Repositories
**Public repositories** can be cloned using an HTTPS URL with no additional setup.
**Private repositories** require SSH access. When you add a repository with an SSH URL (e.g. `git@github.com:org/repo.git`), Certifai uses an SSH deploy key to authenticate.
### Getting the SSH Public Key
To grant Certifai access to a private repository:
1. Go to the **Repositories** page
2. The platform's SSH public key is available for copying
3. Add this key as a **deploy key** in your Git hosting provider:
- **GitHub**: Repository Settings > Deploy keys > Add deploy key
- **GitLab**: Project Settings > Repository > Deploy keys
- **Gitea**: Repository Settings > Deploy Keys > Add Deploy Key
::: tip
For private repositories, configure a GitHub token (`GITHUB_TOKEN`) or GitLab token (`GITLAB_TOKEN`) in your environment. The agent uses these tokens when cloning.
Deploy keys are scoped to a single repository and are read-only by default. This is the recommended approach for granting Certifai access to private code.
:::
## Configuring an Issue Tracker
You can connect an issue tracker so that new findings are automatically created as issues in your existing workflow.
When adding or editing a repository, expand the **Issue Tracker** section to configure:
![Add repository dialog with issue tracker options](/screenshots/add-repository-tracker.png)
### Supported Trackers
| Tracker | Required Fields |
|---------|----------------|
| **GitHub Issues** | Repository owner, repository name, API token |
| **GitLab Issues** | Project ID, GitLab URL, API token |
| **Gitea Issues** | Repository owner, repository name, Gitea URL, API token |
| **Jira** | Project key, Jira URL, email, API token |
Each tracker is configured per-repository, so different repositories can use different trackers.
## Editing Repository Settings
Click the **Edit** button on any repository row to modify its settings, including the issue tracker configuration.
![Edit repository modal with tracker configuration](/screenshots/edit-repository.png)
From the edit modal you can:
- Change the repository name, Git URL, or default branch
- Add, modify, or remove issue tracker configuration
- View the webhook URL and secret for this repository (see [Webhooks & PR Reviews](/guide/webhooks))
## Repository List
The repositories page shows all tracked repositories with:
The repositories page shows all tracked repositories in a table.
![Repository list table](/screenshots/repositories-list.png)
| Column | Description |
|--------|-------------|
@@ -32,7 +84,7 @@ The repositories page shows all tracked repositories with:
## Triggering a Scan
Click the **Scan** button on any repository row to trigger an immediate scan. The scan runs in the background through all phases (clone, SAST, SBOM, CVE, graph). You can monitor progress on the Overview page under recent scan runs.
Click the **Scan** button on any repository row to trigger an immediate scan. The scan runs in the background through all phases (clone, SAST, SBOM, CVE, graph, issue sync). You can monitor progress on the Overview page under recent scan runs.
## Deleting a Repository
@@ -44,19 +96,6 @@ Click the **Delete** button on a repository row. A confirmation dialog appears w
- Code graph data
- Embedding vectors (for AI chat)
- CVE alerts
- Tracker issues
This action cannot be undone.
## Automatic Scanning
Repositories are scanned automatically on a schedule configured by the `SCAN_SCHEDULE` environment variable (cron format). The default is every 6 hours:
```
SCAN_SCHEDULE=0 0 */6 * * *
```
CVE monitoring runs on a separate schedule (default: daily at midnight):
```
CVE_MONITOR_SCHEDULE=0 0 0 * * *
```

docs/guide/sbom.md Normal file

@@ -0,0 +1,111 @@
# SBOM & Licenses
The SBOM (Software Bill of Materials) feature provides a complete inventory of all dependencies across your repositories, with vulnerability tracking and license compliance analysis.
## What is an SBOM?
A Software Bill of Materials is a list of every component (library, package, framework) that your software depends on, along with version numbers, licenses, and known vulnerabilities. SBOMs are increasingly required for compliance audits, customer security questionnaires, and supply chain transparency.
Certifai generates SBOMs automatically during each scan using Syft for dependency extraction and Grype for vulnerability matching.
## Packages Tab
Navigate to **SBOM** in the sidebar to see the packages tab, which lists all dependencies discovered during scans.
![SBOM packages tab with filters and export options](/screenshots/sbom-packages.png)
### Filtering
Use the filter bar to narrow results:
- **Repository** -- select a specific repository or view all
- **Package Manager** -- npm, cargo, pip, go, maven, nuget, composer, gem
- **Search** -- filter by package name
- **Vulnerabilities** -- show all packages, only those with vulnerabilities, or only clean packages
- **License** -- filter by specific license (MIT, Apache-2.0, BSD-3-Clause, GPL-3.0, etc.)
### Package Details
Each package row shows:
| Column | Description |
|--------|-------------|
| Package | Package name |
| Version | Installed version |
| Manager | Package manager (npm, cargo, pip, etc.) |
| License | License identifier with color-coded badge |
| Vulnerabilities | Count of known vulnerabilities (click to expand) |
### Vulnerability Details
Click the vulnerability count on any package to expand inline details showing:
- Vulnerability ID (e.g. CVE-2024-1234)
- Source database
- Severity level
- Link to the advisory
## License Compliance Tab
The license compliance tab helps you understand your licensing obligations across all dependencies.
![License compliance tab with copyleft warnings and distribution chart](/screenshots/sbom-licenses.png)
### Copyleft Warnings
If any dependencies use copyleft licenses (GPL, AGPL, LGPL, MPL), a warning banner appears listing the affected packages. Copyleft licenses may impose distribution requirements on your software.
::: warning
Copyleft-licensed dependencies can require you to release your source code under the same license. Review flagged packages carefully with your legal team if you distribute proprietary software.
:::
### License Distribution
A horizontal bar chart visualizes the percentage breakdown of licenses across your dependencies, giving you a quick overview of your licensing profile.
### License Table
A detailed table lists every license found:
| Column | Description |
|--------|-------------|
| License | License identifier |
| Type | **Copyleft** or **Permissive** badge |
| Packages | List of packages using this license |
| Count | Number of packages |
**Copyleft licenses** (flagged as potentially restrictive): GPL-2.0, GPL-3.0, AGPL-3.0, LGPL-2.1, LGPL-3.0, MPL-2.0
**Permissive licenses** (generally safe for commercial use): MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, and others
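The badge logic can be approximated with a simple lookup. This sketch uses the copyleft identifiers listed above and assumes that every other license is shown as permissive; the function names and the package-to-license mapping are hypothetical.

```python
# Copyleft identifiers as listed in the docs.
COPYLEFT = {"GPL-2.0", "GPL-3.0", "AGPL-3.0", "LGPL-2.1", "LGPL-3.0", "MPL-2.0"}


def license_type(license_id):
    """Return the badge label for a license identifier.

    Assumption: anything not in the copyleft list is labeled Permissive.
    """
    return "Copyleft" if license_id in COPYLEFT else "Permissive"


def copyleft_packages(packages):
    """packages: dict of package name -> license id. Returns names to flag, sorted."""
    return sorted(name for name, lic in packages.items() if lic in COPYLEFT)
```

A non-empty result from `copyleft_packages` corresponds to the case where the warning banner would appear.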
## Export
You can export your SBOM in industry-standard formats:
1. Select a repository (or export across all repositories)
2. Choose a format:
- **CycloneDX 1.5** -- JSON format widely supported by security tools
- **SPDX 2.3** -- Linux Foundation standard for license compliance
3. Click **Export**
4. The SBOM downloads as a JSON file
::: tip
SBOM exports are useful for compliance audits, customer security questionnaires, government procurement requirements, and supply chain transparency.
:::
## Compare Tab
Compare the dependency profiles of two repositories side by side:
1. Select **Repository A** from the first dropdown
2. Select **Repository B** from the second dropdown
3. View the comparison results:
| Section | Description |
|---------|-------------|
| **Only in A** | Packages present in repo A but not in repo B |
| **Only in B** | Packages present in repo B but not in repo A |
| **Version Diffs** | Same package with different versions between repos |
| **Common** | Count of packages that match exactly |
This is useful for auditing consistency across microservices, identifying dependency drift, and planning coordinated upgrades.
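The four comparison sections map onto basic set operations. The following is a minimal sketch, assuming each repository's SBOM is reduced to a mapping of package name to version; `compare_sboms` and its result keys are hypothetical names.

```python
def compare_sboms(a, b):
    """Compare two dicts of package name -> version.

    Returns the four sections shown in the Compare tab.
    """
    shared = set(a) & set(b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        # Same package, different versions: name -> (version in A, version in B)
        "version_diffs": {n: (a[n], b[n]) for n in sorted(shared) if a[n] != b[n]},
        # Packages whose name and version match exactly
        "common": sum(1 for n in shared if a[n] == b[n]),
    }
```

For example, comparing `{"serde": "1.0.1", "tokio": "1.38"}` with `{"serde": "1.0.2", "tokio": "1.38"}` reports one version diff for `serde` and one common package.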


@@ -1,20 +1,22 @@
# Running Scans
Scans are the primary workflow in Compliance Scanner. Each scan analyzes a repository for security vulnerabilities, dependency risks, and code structure.
Scans are the primary workflow in Certifai. Each scan analyzes a repository for security vulnerabilities, dependency risks, and code structure.
## Scan Types
## What Happens During a Scan
A full scan consists of multiple phases, each producing different types of findings:
When a scan is triggered, Certifai runs through these phases in order:
| Scan Type | What It Detects | Scanner |
|-----------|----------------|---------|
| **SAST** | Code-level vulnerabilities (injection, XSS, insecure crypto, etc.) | Semgrep |
| **SBOM** | Dependency inventory, outdated packages, known vulnerabilities | Syft |
| **CVE** | Known CVEs in dependencies cross-referenced against NVD | NVD API |
| **GDPR** | Personal data handling issues, consent violations | Custom rules |
| **OAuth** | OAuth/OIDC misconfigurations, insecure token handling | Custom rules |
1. **Clone** -- pulls the latest code from the Git remote (or clones it for the first time)
2. **SAST** -- runs static analysis using Semgrep with rules covering OWASP, GDPR, OAuth, secrets, and general security patterns
3. **SBOM** -- extracts all dependencies using Syft (packages, versions, and licenses), then checks them for known vulnerabilities with Grype
4. **CVE Check** -- cross-references dependencies against the NVD database for known CVEs
5. **Graph Build** -- parses the codebase to construct a code knowledge graph of functions, classes, and their relationships
6. **AI Triage** -- new findings are reviewed by an LLM that assesses severity, considers blast radius using the code graph, and generates remediation guidance
7. **Issue Sync** -- creates or updates issues in connected trackers (GitHub, GitLab, Gitea, Jira) for new findings
## Triggering a Scan
Each phase produces results that are visible in the dashboard as soon as they complete.
## How Scans Are Triggered
### Manual Scan
@@ -24,60 +26,54 @@ A full scan consists of multiple phases, each producing different types of findi
### Scheduled Scans
Scans run automatically based on the `SCAN_SCHEDULE` cron expression. The default scans every 6 hours:
```
SCAN_SCHEDULE=0 0 */6 * * *
```
Repositories are scanned automatically on a recurring schedule. By default, scans run every 6 hours and CVE monitoring runs daily. Your administrator controls these schedules.
### Webhook-Triggered Scans
Configure GitHub or GitLab webhooks to trigger scans on push events. Set the webhook URL to:
When you configure a webhook in your Git hosting provider, scans are triggered automatically on push events. The same webhook can also trigger automated security reviews on pull requests. See [Webhooks & PR Reviews](/guide/webhooks) for setup instructions.
```
http://<agent-host>:3002/webhook/github
http://<agent-host>:3002/webhook/gitlab
```
## Scan Phases and Statuses
And configure the corresponding webhook secret:
```
GITHUB_WEBHOOK_SECRET=your-secret
GITLAB_WEBHOOK_SECRET=your-secret
```
## Scan Phases
Each scan progresses through these phases in order:
1. **Queued** — Scan is waiting to start
2. **Cloning** — Repository is being cloned or updated
3. **Scanning** — Static analysis and SBOM extraction are running
4. **Analyzing** — CVE cross-referencing and graph construction
5. **Reporting** — Creating tracker issues for new findings
6. **Completed** — All phases finished successfully
If any phase fails, the scan status is set to **Failed** with an error message.
## Viewing Scan History
The Overview page shows the 10 most recent scan runs across all repositories, including:
- Repository name
- Scan status
- Current phase
- Number of findings discovered
- Start time and duration
## Scan Run Statuses
Each scan progresses through these statuses:
| Status | Meaning |
|--------|---------|
| `queued` | Waiting to start |
| `running` | Currently executing |
| `completed` | Finished successfully |
| `failed` | Stopped due to an error |
| **Queued** | Scan is waiting to start |
| **Running** | Currently executing scan phases |
| **Completed** | All phases finished successfully |
| **Failed** | Stopped due to an error |
## Deduplication
You can monitor scan progress on the Overview page, which shows the most recent scan runs across all repositories, including the current phase, finding count, and duration.
Findings are deduplicated using a fingerprint hash based on the scanner, file path, line number, and vulnerability type. Repeated scans will not create duplicate findings for the same issue.
## Scan Types
A full scan runs multiple analysis engines, each producing different types of findings:
| Scan Type | What It Detects | Scanner |
|-----------|----------------|---------|
| **SAST** | Code-level vulnerabilities (injection, XSS, insecure crypto, etc.) | Semgrep |
| **SBOM** | Dependency inventory, outdated packages, known vulnerabilities | Syft + Grype |
| **CVE** | Known CVEs in dependencies cross-referenced against NVD | NVD API |
| **GDPR** | Personal data handling issues, consent violations | Custom rules |
| **OAuth** | OAuth/OIDC misconfigurations, insecure token handling | Custom rules |
| **Secrets** | Hardcoded credentials, API keys, tokens in source code | Custom rules |
| **Code Review** | Architecture and security patterns reviewed by AI | LLM-powered |
## Deduplication and Fingerprinting
Findings are deduplicated using a fingerprint hash based on the scanner, file path, line number, and vulnerability type. This means:
- **Repeated scans** will not create duplicate findings for the same issue
- **Tracker issues** are only created once per unique finding
- **Resolved findings** that reappear in a new scan are flagged for re-review
The fingerprint is also used to match findings to existing tracker issues, preventing duplicate issues from being created in GitHub, GitLab, Gitea, or Jira.
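To illustrate the idea, the sketch below derives a fingerprint from the four fields named above and uses it to filter out findings that have already been seen. The guide does not specify the hash function or field encoding, so SHA-256, the separator, and the dictionary keys used here are all assumptions.

```python
import hashlib


def fingerprint(scanner: str, file_path: str, line: int, vuln_type: str) -> str:
    """Hash the identifying fields of a finding into a stable hex string."""
    # A unit-separator character keeps ("a", "bc") distinct from ("ab", "c").
    joined = "\x1f".join([scanner, file_path, str(line), vuln_type])
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def dedupe(findings: list[dict], known: set[str]) -> list[dict]:
    """Return findings whose fingerprint is not in `known`, recording them as seen."""
    fresh = []
    for finding in findings:
        key = fingerprint(
            finding["scanner"], finding["file"], finding["line"], finding["type"]
        )
        if key not in known:
            known.add(key)
            fresh.append(finding)
    return fresh
```

Because the line number is part of the hash in this sketch, a finding that moves to a different line would produce a new fingerprint.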
## Interpreting Results
After a scan completes, you can explore results in several ways:
- **Findings** -- browse all discovered vulnerabilities with filters for severity, type, and status. See [Understanding Findings](/guide/findings).
- **SBOM** -- review your dependency inventory, check for vulnerable packages, and audit license compliance. See [SBOM & Licenses](/guide/sbom).
- **Overview** -- check the dashboard for a high-level summary of your security posture across all repositories.
- **Issues** -- see which findings have been pushed to your issue tracker. See [Issues & Tracking](/guide/issues).

87
docs/guide/webhooks.md Normal file

@@ -0,0 +1,87 @@
# Webhooks & PR Reviews
Webhooks let Certifai respond to events in your Git repositories automatically. When configured, pushes to your repository trigger scans, and pull requests receive automated security reviews.
## What Webhooks Enable
- **Automatic scans on push** -- every time code is pushed to your default branch, a scan is triggered automatically
- **PR security reviews** -- when a pull request is opened or updated, Certifai scans the changes and posts a review comment summarizing any security findings in the diff
## Finding the Webhook URL and Secret
Each repository in Certifai has its own webhook URL and secret:
1. Go to **Repositories**
2. Click **Edit** on the repository you want to configure
3. In the edit modal, you will find the **Webhook URL** and **Webhook Secret**
4. Copy both values -- you will need them when configuring your Git hosting provider
## Setting Up Webhooks
### Gitea
1. Go to your repository in Gitea
2. Navigate to **Settings > Webhooks > Add Webhook > Gitea**
3. Set the **Target URL** to the webhook URL from Certifai
4. Set the **Secret** to the webhook secret from Certifai
5. Under **Trigger On**, select:
- **Push Events** -- for automatic scans on push
- **Pull Request Events** -- for PR security reviews
6. Set the content type to `application/json`
7. Click **Add Webhook**
### GitHub
1. Go to your repository on GitHub
2. Navigate to **Settings > Webhooks > Add webhook**
3. Set the **Payload URL** to the webhook URL from Certifai
4. Set the **Content type** to `application/json`
5. Set the **Secret** to the webhook secret from Certifai
6. Under **Which events would you like to trigger this webhook?**, select **Let me select individual events**, then check:
- **Pushes** -- for automatic scans on push
- **Pull requests** -- for PR security reviews
7. Click **Add webhook**
### GitLab
1. Go to your project in GitLab
2. Navigate to **Settings > Webhooks**
3. Set the **URL** to the webhook URL from Certifai
4. Set the **Secret token** to the webhook secret from Certifai
5. Under **Trigger**, check:
- **Push events** -- for automatic scans on push
- **Merge request events** -- for PR security reviews
6. Click **Add webhook**
## PR Review Flow
When a pull request (or merge request) is opened or updated, the following happens:
1. Your Git provider sends a webhook event to Certifai
2. Certifai checks out the PR branch and runs a targeted scan on the changed files
3. Findings specific to the changes in the PR are identified
4. Certifai posts a review comment on the PR summarizing:
- Number of new findings introduced by the changes
- Severity breakdown
- Details for each finding including file, line, and remediation guidance
This gives developers immediate security feedback in their pull request workflow, before code is merged.
::: tip
PR reviews focus only on changes introduced in the pull request, not the entire codebase. This keeps reviews relevant and actionable.
:::
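As a rough picture of what such a review comment contains, the sketch below builds a summary with the finding count, severity breakdown, and per-finding details. The severity names, dictionary keys, and output wording are hypothetical; the actual comment posted by Certifai may look different.

```python
from collections import Counter

# Assumed severity levels, most severe first.
SEVERITY_ORDER = ("critical", "high", "medium", "low", "info")


def review_comment(findings: list[dict]) -> str:
    """Build a plain-text summary of findings introduced by a pull request."""
    if not findings:
        return "No new security findings in this pull request."
    counts = Counter(finding["severity"] for finding in findings)
    rank = {severity: index for index, severity in enumerate(SEVERITY_ORDER)}
    # Unknown severities sort after the known ones instead of being dropped.
    ordered = sorted(counts, key=lambda severity: rank.get(severity, len(rank)))
    breakdown = ", ".join(f"{counts[severity]} {severity}" for severity in ordered)
    lines = [f"{len(findings)} new finding(s): {breakdown}", ""]
    for finding in findings:
        lines.append(
            f"- {finding['file']}:{finding['line']} [{finding['severity']}] "
            f"{finding['title']}. Fix: {finding['remediation']}"
        )
    return "\n".join(lines)
```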
## Events to Select
Here is a summary of which events to enable for each feature:
| Feature | Gitea | GitHub | GitLab |
|---------|-------|--------|--------|
| Scan on push | Push Events | Pushes | Push events |
| PR reviews | Pull Request Events | Pull requests | Merge request events |
You can enable one or both depending on your workflow.
::: warning
Make sure the webhook secret matches exactly between your Git provider and Certifai. Requests with an invalid signature are rejected.
:::
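For background on why the secret must match exactly: GitHub and Gitea sign the raw request body with HMAC-SHA256 and send the result in the `X-Hub-Signature-256` and `X-Gitea-Signature` headers, while GitLab sends the secret token itself in `X-Gitlab-Token`. The sketch below shows how a receiver might check these under those provider conventions. It is not Certifai's implementation, and `is_valid` and its parameters are hypothetical names.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def is_valid(provider: str, secret: str, body: bytes, headers: dict[str, str]) -> bool:
    """Check a webhook request; header keys are assumed to be spelled exactly as below."""
    if provider == "github":
        expected = "sha256=" + sign(secret, body)
        received = headers.get("X-Hub-Signature-256", "")
    elif provider == "gitea":
        expected = sign(secret, body)
        received = headers.get("X-Gitea-Signature", "")
    elif provider == "gitlab":
        # GitLab sends the token as-is rather than a signature of the body.
        expected = secret
        received = headers.get("X-Gitlab-Token", "")
    else:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

A single differing character in the secret changes the whole signature, which is why a mismatch always results in rejection.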