docs: rewrite user docs, fix modal scroll, webhook URL, and sccache

Rewrite all public documentation to be user-facing only:
- Remove deployment, configuration, and self-hosting sections
- Add guide pages for SBOM, issues, webhooks & PR reviews
- Add reference pages for glossary and tools/scanners
- Add 12 screenshots from live dashboard
- Explain MCP, LLM triage, false positives, human-in-the-loop

Fix edit repository modal not scrollable (max-height + overflow-y).
Show full webhook URL using window.location.origin instead of path.
Unset RUSTC_WRAPPER in agent cargo commands to avoid sccache errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 14:17:46 +01:00
parent 689daa0f49
commit c253e4ef5e
40 changed files with 872 additions and 1334 deletions
@@ -1,153 +0,0 @@
-# Configuration
-
-Compliance Scanner is configured through environment variables. Copy `.env.example` to `.env` and edit the values.
-
-## Required Settings
-
-### MongoDB
-
-```bash
-MONGODB_URI=mongodb://root:example@localhost:27017/compliance_scanner?authSource=admin
-MONGODB_DATABASE=compliance_scanner
-```
-
-### Agent
-
-```bash
-AGENT_PORT=3001
-```
-
-### Dashboard
-
-```bash
-DASHBOARD_PORT=8080
-AGENT_API_URL=http://localhost:3001
-```
-
-## LLM Configuration
-
-The AI features (chat, remediation suggestions) use LiteLLM as a proxy to various LLM providers:
-
-```bash
-LITELLM_URL=http://localhost:4000
-LITELLM_API_KEY=your-key
-LITELLM_MODEL=gpt-4o
-LITELLM_EMBED_MODEL=text-embedding-3-small
-```
-
-The embed model is used for the RAG/AI Chat feature to generate code embeddings.
-
-## Git Provider Tokens
-
-### GitHub
-
-```bash
-GITHUB_TOKEN=ghp_xxxx
-GITHUB_WEBHOOK_SECRET=your-webhook-secret
-```
-
-### GitLab
-
-```bash
-GITLAB_URL=https://gitlab.com
-GITLAB_TOKEN=glpat-xxxx
-GITLAB_WEBHOOK_SECRET=your-webhook-secret
-```
-
-## Issue Tracker Integration
-
-### Jira
-
-```bash
-JIRA_URL=https://your-org.atlassian.net
-JIRA_EMAIL=user@example.com
-JIRA_API_TOKEN=your-api-token
-JIRA_PROJECT_KEY=SEC
-```
-
-When configured, new findings automatically create Jira issues in the specified project.
-
-## Scan Schedules
-
-Cron expressions for automated scanning:
-
-```bash
-# Scan every 6 hours
-SCAN_SCHEDULE=0 0 */6 * * *
-
-# Check for new CVEs daily at midnight
-CVE_MONITOR_SCHEDULE=0 0 0 * * *
-```
-
-## Search Engine
-
-SearXNG is used for CVE enrichment and vulnerability research:
-
-```bash
-SEARXNG_URL=http://localhost:8888
-```
-
-## NVD API
-
-An NVD API key increases rate limits for CVE lookups:
-
-```bash
-NVD_API_KEY=your-nvd-api-key
-```
-
-Get a free key at [https://nvd.nist.gov/developers/request-an-api-key](https://nvd.nist.gov/developers/request-an-api-key).
-
-## MCP Server
-
-The MCP server exposes compliance data to external LLMs via the Model Context Protocol. See [MCP Server](/features/mcp-server) for full details.
-
-```bash
-# Set MCP_PORT to enable HTTP transport (omit for stdio mode)
-MCP_PORT=8090
-```
-
-The MCP server shares the `MONGODB_URI` and `MONGODB_DATABASE` variables with the rest of the platform.
-
-## Clone Path
-
-Where the agent stores cloned repository files:
-
-```bash
-GIT_CLONE_BASE_PATH=/tmp/compliance-scanner/repos
-```
-
-## All Environment Variables
-
-| Variable | Required | Default | Description |
-|----------|----------|---------|-------------|
-| `MONGODB_URI` | Yes | — | MongoDB connection string |
-| `MONGODB_DATABASE` | No | `compliance_scanner` | Database name |
-| `AGENT_PORT` | No | `3001` | Agent REST API port |
-| `DASHBOARD_PORT` | No | `8080` | Dashboard web UI port |
-| `AGENT_API_URL` | No | `http://localhost:3001` | Agent URL for dashboard |
-| `LITELLM_URL` | No | `http://localhost:4000` | LiteLLM proxy URL |
-| `LITELLM_API_KEY` | No | — | LiteLLM API key |
-| `LITELLM_MODEL` | No | `gpt-4o` | LLM model for analysis |
-| `LITELLM_EMBED_MODEL` | No | `text-embedding-3-small` | Embedding model for RAG |
-| `GITHUB_TOKEN` | No | — | GitHub personal access token |
-| `GITHUB_WEBHOOK_SECRET` | No | — | GitHub webhook signing secret |
-| `GITLAB_URL` | No | `https://gitlab.com` | GitLab instance URL |
-| `GITLAB_TOKEN` | No | — | GitLab access token |
-| `GITLAB_WEBHOOK_SECRET` | No | — | GitLab webhook signing secret |
-| `JIRA_URL` | No | — | Jira instance URL |
-| `JIRA_EMAIL` | No | — | Jira account email |
-| `JIRA_API_TOKEN` | No | — | Jira API token |
-| `JIRA_PROJECT_KEY` | No | — | Jira project key for issues |
-| `SEARXNG_URL` | No | `http://localhost:8888` | SearXNG instance URL |
-| `NVD_API_KEY` | No | — | NVD API key for CVE lookups |
-| `SCAN_SCHEDULE` | No | `0 0 */6 * * *` | Cron schedule for scans |
-| `CVE_MONITOR_SCHEDULE` | No | `0 0 0 * * *` | Cron schedule for CVE checks |
-| `GIT_CLONE_BASE_PATH` | No | `/tmp/compliance-scanner/repos` | Local clone directory |
-| `KEYCLOAK_URL` | No | — | Keycloak server URL |
-| `KEYCLOAK_REALM` | No | — | Keycloak realm name |
-| `KEYCLOAK_CLIENT_ID` | No | — | Keycloak client ID |
-| `REDIRECT_URI` | No | — | OAuth callback URL |
-| `APP_URL` | No | — | Application root URL |
-| `OTEL_EXPORTER_OTLP_ENDPOINT` | No | — | OTLP collector endpoint |
-| `OTEL_SERVICE_NAME` | No | — | OpenTelemetry service name |
-| `MCP_PORT` | No | — | MCP HTTP transport port (omit for stdio) |
@@ -1,68 +1,62 @@
-# Managing Findings
+# Understanding Findings

 Findings are security issues discovered during scans. The findings workflow lets you triage, track, and resolve vulnerabilities across all your repositories.

 ## Findings List

-Navigate to **Findings** in the sidebar to see all findings. The table shows:
+Navigate to **Findings** in the sidebar to see all findings across your repositories.

-| Column | Description |
-|--------|-------------|
-| Severity | Color-coded badge: Critical (red), High (orange), Medium (yellow), Low (green) |
-| Title | Short description of the vulnerability (clickable) |
-| Type | SAST, SBOM, CVE, GDPR, or OAuth |
-| Scanner | Tool that found the issue (e.g. semgrep, syft) |
-| File | Source file path where the issue was found |
-| Status | Current triage status |
+![Findings list with severity badges, types, and filter controls](/screenshots/findings-list.png)

-## Filtering
+### Filtering

-Use the filter bar at the top to narrow results:
+Use the filter bar to narrow results:

- **Repository** — Filter to a specific repository or view all
- **Severity** — Critical, High, Medium, Low, or Info
- **Type** — SAST, SBOM, CVE, GDPR, OAuth
- **Status** — Open, Triaged, Resolved, False Positive, Ignored
+- **Repository** -- filter to a specific repository or view all
+- **Severity** -- Critical, High, Medium, Low, or Info
+- **Type** -- SAST, SBOM, CVE, GDPR, OAuth, Secrets, Code Review
+- **Status** -- Open, Triaged, Resolved, False Positive, Ignored

 Filters can be combined. Results are paginated with 20 findings per page.

+### Columns
+
+| Column | Description |
+|--------|-------------|
+| Severity | Color-coded badge: Critical (red), High (orange), Medium (yellow), Low (green), Info (blue) |
+| Title | Short description of the vulnerability (clickable) |
+| Type | SAST, SBOM, CVE, GDPR, OAuth, Secrets, or Code Review |
+| Scanner | Tool that found the issue (e.g. Semgrep, Grype) |
+| File | Source file path where the issue was found |
+| Status | Current triage status |
+
 ## Finding Detail

-Click any finding title to view its full detail page, which includes:
+Click any finding title to view its full detail page.

-### Metadata
- Severity level with CWE identifier and CVSS score (when available)
- Scanner tool and scan type
- File path and line number
+![Finding detail page showing description, triage rationale, code evidence, remediation, and status controls](/screenshots/finding-detail.png)
+
+The detail page is organized into these sections:

 ### Description
-Full explanation of the vulnerability, why it's a risk, and what conditions trigger it.
+
+A full explanation of the vulnerability: what it is, why it is a risk, and what conditions trigger it.
+
+### AI Triage Rationale
+
+The LLM's assessment of the finding, including why it assigned a particular severity and confidence score. This rationale considers the code context, the type of vulnerability, and the blast radius based on the code knowledge graph.

 ### Code Evidence
-The source code snippet where the issue was found, with syntax highlighting and the file path.
+
+The source code snippet where the issue was found, with syntax highlighting and the file path with line number.

 ### Remediation
-Step-by-step guidance on how to fix the vulnerability.

-### Suggested Fix
-A code example showing the corrected implementation.
+Step-by-step guidance on how to fix the vulnerability, often including a suggested code fix showing the corrected implementation.

 ### Linked Issue
-If the finding was pushed to an issue tracker (GitHub, GitLab, Jira), a direct link to the external issue.

-## Updating Status
-
-On the finding detail page, change the finding's status using the status buttons:
-
-| Status | When to Use |
-|--------|-------------|
-| **Open** | New finding, not yet reviewed |
-| **Triaged** | Reviewed and confirmed as a real issue, pending fix |
-| **Resolved** | Fix has been applied |
-| **False Positive** | Finding is not a real vulnerability in this context |
-| **Ignored** | Known issue that won't be fixed (accepted risk) |
-
-Status changes are persisted immediately.
+If the finding has been pushed to an issue tracker (GitHub, GitLab, Gitea, Jira), a direct link to the external issue appears here.

 ## Severity Levels

@@ -73,3 +67,77 @@ Status changes are persisted immediately.
 | **Medium** | Moderate risk, exploitation requires specific conditions | Insecure deserialization, weak crypto |
 | **Low** | Minor risk, limited impact | Information disclosure, verbose errors |
 | **Info** | Informational, no direct security impact | Best practice recommendations |
+
+## Finding Types
+
+| Type | Source | Description |
+|------|--------|-------------|
+| **SAST** | Semgrep | Code-level vulnerabilities found through static analysis |
+| **SBOM** | Syft + Grype | Vulnerable dependencies identified in your software bill of materials |
+| **CVE** | NVD | Known CVEs matching your dependency versions |
+| **GDPR** | Custom rules | Personal data handling and consent issues |
+| **OAuth** | Custom rules | OAuth/OIDC misconfigurations and insecure token handling |
+| **Secrets** | Custom rules | Hardcoded credentials, API keys, and tokens |
+| **Code Review** | LLM | Architecture and security patterns reviewed by the AI engine |
+
+## Triage Workflow
+
+Every finding follows a lifecycle from discovery to resolution. The status indicates where a finding is in this process:
+
+| Status | Meaning |
+|--------|---------|
+| **Open** | Newly discovered, not yet reviewed |
+| **Triaged** | Reviewed and confirmed as a real issue, pending fix |
+| **Resolved** | A fix has been applied |
+| **False Positive** | Not a real vulnerability in this context |
+| **Ignored** | Known issue that will not be fixed (accepted risk) |
+
+On the finding detail page, use the status buttons to move a finding through this workflow. Status changes take effect immediately.
+
+### Recommended Flow
+
+1. A scan discovers a new finding -- it starts as **Open**
+2. You review the AI triage rationale and code evidence
+3. If it is a real issue, mark it as **Triaged** to signal that it needs a fix
+4. Once the fix is deployed and a new scan confirms it, mark it as **Resolved**
+5. If the AI got it wrong, mark it as **False Positive** (see below)
+
+## False Positives
+
+Not every finding is a real vulnerability. Static analysis tools can flag code that looks suspicious but is actually safe in context. When this happens:
+
+1. Open the finding detail page
+2. Review the code evidence and the AI triage rationale
+3. If you determine the finding is not a real issue, click **False Positive**
+
+::: tip
+When you mark a finding as a false positive, you are providing training signal to the AI. Over time, the LLM learns from your feedback and becomes better at distinguishing real vulnerabilities from false alarms in your codebase.
+:::
+
+## Human in the Loop
+
+Certifai uses AI to triage findings, but humans make the final decisions. Here is how the process works:
+
+1. **AI triages** -- the LLM reviews each finding, assigns a severity, generates a confidence score, and writes a rationale explaining its assessment
+2. **You review** -- you read the AI's analysis alongside the code evidence and decide whether to act on it
+3. **You decide** -- you set the final status (Triaged, Resolved, False Positive, or Ignored)
+4. **AI learns** -- your feedback on false positives and status changes helps improve future triage accuracy
+
+The AI provides the analysis; you provide the judgment. This approach gives you the speed of automated scanning with the accuracy of human review.
+
+## Developer Feedback
+
+On the finding detail page, you can provide feedback on the AI's triage. This feedback loop serves two purposes:
+
+- **Accuracy** -- helps the platform understand which findings are actionable in your specific codebase and context
+- **Context** -- lets you add notes explaining why a finding is or is not relevant, which benefits other team members reviewing the same finding
+
+## Confidence Scores
+
+Each AI-triaged finding includes a confidence score from 0.0 to 1.0, indicating how certain the LLM is about its assessment:
+
+- **0.8 -- 1.0** -- High confidence. The AI is very certain this is (or is not) a real vulnerability.
+- **0.5 to just under 0.8** -- Moderate confidence. The finding likely warrants human review.
+- **Below 0.5** -- Low confidence. The AI is uncertain and recommends manual inspection.
+
+Use confidence scores to prioritize your review queue: start with high-severity, high-confidence findings for the greatest impact.
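
As an illustration of the prioritization described above, here is a minimal Python sketch that buckets confidence scores and orders a review queue. The `Finding` class, the `SEVERITY_RANK` table, and both function names are hypothetical and not part of Certifai. The sketch also assumes that a score of exactly 0.8 counts as high confidence.

```python
from dataclasses import dataclass

# Hypothetical ranking, most severe first; names mirror the documented levels.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}


@dataclass
class Finding:
    title: str
    severity: str      # one of the keys in SEVERITY_RANK
    confidence: float  # 0.0 to 1.0


def confidence_band(score: float) -> str:
    """Map a score to a band. Assumption: exactly 0.8 is treated as high."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be in [0.0, 1.0], got {score}")
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "moderate"
    return "low"


def review_queue(findings: list[Finding]) -> list[Finding]:
    """Order by severity first, then by descending confidence."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f.severity], -f.confidence),
    )
```

Sorting on severity before confidence keeps a moderately confident critical finding ahead of a highly confident low-severity one, which matches the advice to start with high-severity findings.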
@@ -1,55 +1,49 @@
 # Getting Started

-Compliance Scanner is a security compliance platform that scans your Git repositories for vulnerabilities, builds software bills of materials, performs dynamic application testing, and provides AI-powered code intelligence.
+Certifai is an AI-powered security compliance platform that scans your Git repositories for vulnerabilities, builds software bills of materials, performs dynamic application testing, and provides code intelligence through an interactive knowledge graph and AI chat.

-## Architecture
+## What You Get

-The platform consists of three main components:
+When you connect a repository, Certifai runs a comprehensive scan pipeline that covers:

- **Agent** — Background service that clones repositories, runs scans, builds graphs, and exposes a REST API
- **Dashboard** — Web UI built with Dioxus (Rust full-stack framework) for viewing results and managing repositories
- **MongoDB** — Database for storing all scan results, findings, SBOM data, and graph structures
+- **Static Analysis (SAST)** -- finds code-level vulnerabilities like injection flaws, insecure crypto, and misconfigurations
+- **Software Bill of Materials (SBOM)** -- inventories every dependency, its version, and its license
+- **CVE Monitoring** -- cross-references your dependencies against known vulnerabilities
+- **Code Knowledge Graph** -- maps the structure of your codebase for impact analysis
+- **AI Triage** -- every finding is reviewed by an LLM that provides severity assessment, confidence scores, and remediation guidance
+- **Issue Tracking** -- automatically creates issues in your tracker for new findings

-## Quick Start with Docker Compose
+## Dashboard Overview

-The fastest way to get running:
+After logging in, you land on the Overview page, which gives you a snapshot of your security posture across all repositories.

-```bash
-# Clone the repository
-git clone <repo-url> compliance-scanner
-cd compliance-scanner
+![Dashboard overview showing stats cards, severity distribution, and recent scan activity](/screenshots/dashboard-overview.png)

-# Copy and configure environment variables
-cp .env.example .env
-# Edit .env with your settings (see Configuration)
+The overview shows key metrics at a glance: total repositories, findings broken down by severity, dependency counts, CVE alerts, and tracker issues. A severity distribution chart visualizes your risk profile, and recent scan runs let you monitor scanning activity.

-# Start all services
-docker-compose up -d
-```
+## Quick Walkthrough

-This starts:
- MongoDB on port `27017`
- Agent API on port `3001`
- Dashboard on port `8080`
- Chromium (for DAST crawling) on port `3003`
+Here is the fastest path from zero to your first scan results:

-Open the dashboard at [http://localhost:8080](http://localhost:8080).
+### 1. Add a repository

-## What Happens During a Scan
+Navigate to **Repositories** in the sidebar and click **Add Repository**. Enter a name, the Git clone URL, and the default branch to scan.

-When you add a repository and trigger a scan, the agent runs through these phases:
+![Add repository dialog](/screenshots/add-repository.png)

-1. **Clone** — Clones or pulls the latest code from the Git remote
-2. **SAST** — Runs static analysis using Semgrep with rules for OWASP, GDPR, OAuth, and general security
-3. **SBOM** — Extracts all dependencies using Syft, identifying packages, versions, licenses, and known vulnerabilities
-4. **CVE Check** — Cross-references dependencies against the NVD database for known CVEs
-5. **Graph Build** — Parses the codebase to construct a code knowledge graph of functions, classes, and their relationships
-6. **Issue Sync** — Creates or updates issues in connected trackers (GitHub, GitLab, Jira) for new findings
+### 2. Trigger a scan

-Each phase produces results visible in the dashboard immediately.
+Click the **Scan** button on your repository row. The scan runs in the background through all phases: cloning, static analysis, SBOM extraction, CVE checking, graph building, AI triage, and issue sync.
+
+### 3. View findings
+
+Once the scan completes, navigate to **Findings** to see everything that was discovered. Each finding includes a severity level, description, code evidence, and AI-generated remediation guidance.
+
+![Findings list with filters](/screenshots/findings-list.png)

 ## Next Steps

- [Add your first repository](/guide/repositories)
- [Understand scan results](/guide/findings)
- [Configure integrations](/guide/configuration)
+- [Add and configure repositories](/guide/repositories) -- including private repos and issue tracker setup
+- [Understand how scans work](/guide/scanning) -- phases, triggers, and deduplication
+- [Work with findings](/guide/findings) -- triage, false positives, and developer feedback
+- [Explore your SBOM](/guide/sbom) -- dependencies, licenses, and exports
@@ -0,0 +1,56 @@
+# Issues & Tracking
+
+Certifai automatically creates issues in your existing issue trackers when new security findings are discovered. This integrates security into your development workflow without requiring teams to check a separate tool.
+
+## How Issues Are Created
+
+When a scan discovers new findings, the following happens automatically:
+
+1. Each new finding is checked against existing issues using its fingerprint
+2. If no matching issue exists, a new issue is created in the configured tracker
+3. The issue includes the finding title, severity, vulnerability details, file location, and a link back to the finding in Certifai
+4. The finding is updated with a link to the external issue
+
+This means every actionable finding gets tracked in the same system your developers already use.
+
+## Issues List
+
+Navigate to **Issues** in the sidebar to see all tracker issues across your repositories.
+
+![Issues list showing tracker issues](/screenshots/issues-list.png)
+
+The issues table shows:
+
+| Column | Description |
+|--------|-------------|
+| Tracker | Badge showing GitHub, GitLab, Gitea, or Jira |
+| External ID | Issue number in the external system |
+| Title | Issue title |
+| Status | Open, Closed, or tracker-specific status |
+| Created | When the issue was created |
+| Link | Direct link to the issue in the external tracker |
+
+Click the link to go directly to the issue in your tracker.
+
+## Supported Trackers
+
+| Tracker | How to Configure |
+|---------|-----------------|
+| **GitHub Issues** | Set up in the repository's issue tracker settings with your GitHub API token |
+| **GitLab Issues** | Set up with your GitLab project ID, instance URL, and API token |
+| **Gitea Issues** | Set up with your Gitea repository details, instance URL, and API token |
+| **Jira** | Set up with your Jira project key, instance URL, email, and API token |
+
+Issue tracker configuration is per-repository. You set it up when [adding or editing a repository](/guide/repositories#configuring-an-issue-tracker).
+
+## Deduplication
+
+Issues are deduplicated using the same fingerprint hash that deduplicates findings. This means:
+
+- If the same vulnerability appears in consecutive scans, only one issue is created
+- If a finding is resolved and then reappears, the platform recognizes it and can reopen the existing issue rather than creating a duplicate
+- Different findings (even if similar) get separate issues because their fingerprints differ based on file path, line number, and vulnerability type
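
The deduplication rules above can be sketched in Python as follows. This is only an illustration: the docs do not specify the hash function or the exact fields and encoding, so the SHA-256 choice, the `fingerprint` and `sync_issues` names, and the tuple layout are all assumptions.

```python
import hashlib


def fingerprint(file_path: str, line: int, vuln_type: str) -> str:
    """Hypothetical fingerprint: SHA-256 over the three fields named above."""
    # Join with a NUL byte, which is unlikely to occur inside any field,
    # so that different field combinations cannot collapse to the same string.
    raw = "\x00".join([file_path, str(line), vuln_type])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def sync_issues(findings, existing):
    """Create one issue per unseen fingerprint.

    `findings` is an iterable of (file_path, line, vuln_type, title) tuples.
    `existing` maps fingerprint -> issue title and is updated in place.
    Returns the titles of the issues created by this call.
    """
    created = []
    for file_path, line, vuln_type, title in findings:
        fp = fingerprint(file_path, line, vuln_type)
        if fp in existing:
            continue  # already tracked from an earlier scan: no duplicate
        existing[fp] = title
        created.append(title)
    return created
```

Running the same scan twice against the same `existing` map creates issues only on the first pass, while a finding on a different line produces a new fingerprint and therefore a separate issue.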
+
+## Linked Issues in Finding Detail
+
+When viewing a [finding's detail page](/guide/findings#finding-detail), you will see a **Linked Issue** section if an issue was created for that finding. This provides a direct link to the external tracker issue, making it easy to jump between the security context in Certifai and the development workflow in your tracker.
@@ -1,26 +1,78 @@
 # Adding Repositories

-Repositories are the core resource in Compliance Scanner. Each tracked repository is scanned on a schedule and its results are available across all features.
+Repositories are the core resource in Certifai. Each tracked repository is scanned on a schedule, and its results are available across all features -- findings, SBOM, code graph, AI chat, and issue tracking.

 ## Adding a Repository

 1. Navigate to **Repositories** in the sidebar
-2. Click **Add Repository** at the top of the page
+2. Click **Add Repository**
 3. Fill in the form:
-   - **Name** — A display name for the repository
-   - **Git URL** — The clone URL (HTTPS or SSH), e.g. `https://github.com/org/repo.git`
-   - **Default Branch** — The branch to scan, e.g. `main` or `master`
+   - **Name** -- a display name for the repository
+   - **Git URL** -- the clone URL (HTTPS or SSH), e.g. `https://github.com/org/repo.git` or `git@github.com:org/repo.git`
+   - **Default Branch** -- the branch to scan, e.g. `main` or `master`
 4. Click **Add**

+![Add repository dialog](/screenshots/add-repository.png)
+
 The repository appears in the list immediately. It will not be scanned until you trigger a scan manually or the next scheduled scan runs.

+## Public vs Private Repositories
+
+**Public repositories** can be cloned using an HTTPS URL with no additional setup.
+
+**Private repositories** require SSH access. When you add a repository with an SSH URL (e.g. `git@github.com:org/repo.git`), Certifai uses an SSH deploy key to authenticate.
+
+### Getting the SSH Public Key
+
+To grant Certifai access to a private repository:
+
+1. Go to the **Repositories** page
+2. The platform's SSH public key is available for copying
+3. Add this key as a **deploy key** in your Git hosting provider:
+   - **GitHub**: Repository Settings > Deploy keys > Add deploy key
+   - **GitLab**: Settings > Repository > Deploy keys
+   - **Gitea**: Repository Settings > Deploy Keys > Add Deploy Key
+
 ::: tip
-For private repositories, configure a GitHub token (`GITHUB_TOKEN`) or GitLab token (`GITLAB_TOKEN`) in your environment. The agent uses these tokens when cloning.
+Deploy keys are scoped to a single repository and are read-only by default. This is the recommended approach for granting Certifai access to private code.
 :::

+## Configuring an Issue Tracker
+
+You can connect an issue tracker so that new findings are automatically created as issues in your existing workflow.
+
+When adding or editing a repository, expand the **Issue Tracker** section to configure the integration.
+
+![Add repository dialog with issue tracker options](/screenshots/add-repository-tracker.png)
+
+### Supported Trackers
+
+| Tracker | Required Fields |
+|---------|----------------|
+| **GitHub Issues** | Repository owner, repository name, API token |
+| **GitLab Issues** | Project ID, GitLab URL, API token |
+| **Gitea Issues** | Repository owner, repository name, Gitea URL, API token |
+| **Jira** | Project key, Jira URL, email, API token |
+
+Each tracker is configured per-repository, so different repositories can use different trackers.
+
+## Editing Repository Settings
+
+Click the **Edit** button on any repository row to modify its settings, including the issue tracker configuration.
+
+![Edit repository modal with tracker configuration](/screenshots/edit-repository.png)
+
+From the edit modal you can:
+
+- Change the repository name, Git URL, or default branch
+- Add, modify, or remove issue tracker configuration
+- View the webhook URL and secret for this repository (see [Webhooks & PR Reviews](/guide/webhooks))
+
 ## Repository List

-The repositories page shows all tracked repositories with:
+The repositories page shows all tracked repositories in a table.
+
+![Repository list table](/screenshots/repositories-list.png)

 | Column | Description |
 |--------|-------------|
@@ -32,7 +84,7 @@ The repositories page shows all tracked repositories with:

 ## Triggering a Scan

-Click the **Scan** button on any repository row to trigger an immediate scan. The scan runs in the background through all phases (clone, SAST, SBOM, CVE, graph). You can monitor progress on the Overview page under recent scan runs.
+Click the **Scan** button on any repository row to trigger an immediate scan. The scan runs in the background through all phases (clone, SAST, SBOM, CVE, graph, AI triage, issue sync). You can monitor progress on the Overview page under recent scan runs.

 ## Deleting a Repository

@@ -44,19 +96,6 @@ Click the **Delete** button on a repository row. A confirmation dialog appears w
 - Code graph data
 - Embedding vectors (for AI chat)
 - CVE alerts
+- Tracker issues

 This action cannot be undone.
-
-## Automatic Scanning
-
-Repositories are scanned automatically on a schedule configured by the `SCAN_SCHEDULE` environment variable (cron format). The default is every 6 hours:
-
-```
-SCAN_SCHEDULE=0 0 */6 * * *
-```
-
-CVE monitoring runs on a separate schedule (default: daily at midnight):
-
-```
-CVE_MONITOR_SCHEDULE=0 0 0 * * *
-```
@@ -0,0 +1,111 @@
+# SBOM & Licenses
+
+The SBOM (Software Bill of Materials) feature provides a complete inventory of all dependencies across your repositories, with vulnerability tracking and license compliance analysis.
+
+## What is an SBOM?
+
+A Software Bill of Materials is a list of every component (library, package, framework) that your software depends on, along with version numbers, licenses, and known vulnerabilities. SBOMs are increasingly required for compliance audits, customer security questionnaires, and supply chain transparency.
+
+Certifai generates SBOMs automatically during each scan using Syft for dependency extraction and Grype for vulnerability matching.
+
+## Packages Tab
+
+Navigate to **SBOM** in the sidebar to see the packages tab, which lists all dependencies discovered during scans.
+
+![SBOM packages tab with filters and export options](/screenshots/sbom-packages.png)
+
+### Filtering
+
+Use the filter bar to narrow results:
+
+- **Repository** -- select a specific repository or view all
+- **Package Manager** -- npm, cargo, pip, go, maven, nuget, composer, gem
+- **Search** -- filter by package name
+- **Vulnerabilities** -- show all packages, only those with vulnerabilities, or only clean packages
+- **License** -- filter by specific license (MIT, Apache-2.0, BSD-3-Clause, GPL-3.0, etc.)
+
+### Package Details
+
+Each package row shows:
+
+| Column | Description |
+|--------|-------------|
+| Package | Package name |
+| Version | Installed version |
+| Manager | Package manager (npm, cargo, pip, etc.) |
+| License | License identifier with color-coded badge |
+| Vulnerabilities | Count of known vulnerabilities (click to expand) |
+
+### Vulnerability Details
+
+Click the vulnerability count on any package to expand inline details showing:
+
+- Vulnerability ID (e.g. CVE-2024-1234)
+- Source database
+- Severity level
+- Link to the advisory
+
+## License Compliance Tab
+
+The license compliance tab helps you understand your licensing obligations across all dependencies.
+
+![License compliance tab with copyleft warnings and distribution chart](/screenshots/sbom-licenses.png)
+
+### Copyleft Warnings
+
+If any dependencies use copyleft licenses (GPL, AGPL, LGPL, MPL), a warning banner appears listing the affected packages. Copyleft licenses may impose distribution requirements on your software.
+
+::: warning
+Copyleft-licensed dependencies can require you to release your source code under the same license. Review flagged packages carefully with your legal team if you distribute proprietary software.
+:::
+
+### License Distribution
+
+A horizontal bar chart visualizes the percentage breakdown of licenses across your dependencies, giving you a quick overview of your licensing profile.
+
+### License Table
+
+A detailed table lists every license found:
+
+| Column | Description |
+|--------|-------------|
+| License | License identifier |
+| Type | **Copyleft** or **Permissive** badge |
+| Packages | List of packages using this license |
+| Count | Number of packages |
+
+**Copyleft licenses** (flagged as potentially restrictive): GPL-2.0, GPL-3.0, AGPL-3.0, LGPL-2.1, LGPL-3.0, MPL-2.0
+
+**Permissive licenses** (generally safe for commercial use): MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, and others
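
To show how the two license groups above could drive the copyleft warning, here is a small Python sketch. The function names are hypothetical, and treating any identifier outside the two listed groups as `"unknown"` is an assumption rather than documented behavior.

```python
# License identifiers taken from the two lists above.
COPYLEFT = frozenset(
    {"GPL-2.0", "GPL-3.0", "AGPL-3.0", "LGPL-2.1", "LGPL-3.0", "MPL-2.0"}
)
PERMISSIVE = frozenset(
    {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"}
)


def classify_license(license_id: str) -> str:
    """Return "copyleft", "permissive", or "unknown" for a license identifier."""
    if license_id in COPYLEFT:
        return "copyleft"
    if license_id in PERMISSIVE:
        return "permissive"
    # Assumption: anything not listed needs a manual look.
    return "unknown"


def copyleft_packages(packages: dict[str, str]) -> list[str]:
    """Given package name -> license identifier, list the copyleft packages."""
    return sorted(
        name
        for name, license_id in packages.items()
        if classify_license(license_id) == "copyleft"
    )
```

A non-empty result from `copyleft_packages` corresponds to the case where a warning banner would list the affected packages.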
+
+## Export
+
+You can export your SBOM in industry-standard formats:
+
+1. Select a repository (or export across all repositories)
+2. Choose a format:
+   - **CycloneDX 1.5** -- JSON format widely supported by security tools
+   - **SPDX 2.3** -- Linux Foundation standard for license compliance
+3. Click **Export**
+4. The SBOM downloads as a JSON file
+
+::: tip
+SBOM exports are useful for compliance audits, customer security questionnaires, government procurement requirements, and supply chain transparency.
+:::
+
+## Compare Tab
+
+Compare the dependency profiles of two repositories side by side:
+
+1. Select **Repository A** from the first dropdown
+2. Select **Repository B** from the second dropdown
+3. View the comparison results:
+
+| Section | Description |
+|---------|-------------|
+| **Only in A** | Packages present in repo A but not in repo B |
+| **Only in B** | Packages present in repo B but not in repo A |
+| **Version Diffs** | Same package with different versions between repos |
+| **Common** | Count of packages that match exactly |
+
+This is useful for auditing consistency across microservices, identifying dependency drift, and planning coordinated upgrades.
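
The four comparison sections above map onto plain set operations. The following Python sketch assumes each repository's SBOM has been reduced to a package name -> version mapping. The `compare_sboms` function and its result keys are hypothetical.

```python
def compare_sboms(a: dict[str, str], b: dict[str, str]) -> dict:
    """Compare two package-name -> version mappings.

    Returns the same four sections the Compare tab describes.
    """
    shared = a.keys() & b.keys()
    return {
        # Packages present in one repository but not the other.
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        # Same package, different versions: name -> (version in A, version in B).
        "version_diffs": {
            name: (a[name], b[name])
            for name in sorted(shared)
            if a[name] != b[name]
        },
        # Count of packages whose name and version match exactly.
        "common": sum(1 for name in shared if a[name] == b[name]),
    }
```

For example, comparing `{"serde": "1.0.200", "tokio": "1.37.0"}` with `{"serde": "1.0.200", "tokio": "1.38.0"}` reports one common package and one version diff for `tokio`.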
@@ -1,20 +1,22 @@
 # Running Scans

-Scans are the primary workflow in Compliance Scanner. Each scan analyzes a repository for security vulnerabilities, dependency risks, and code structure.
+Scans are the primary workflow in Certifai. Each scan analyzes a repository for security vulnerabilities, dependency risks, and code structure.

-## Scan Types
+## What Happens During a Scan

-A full scan consists of multiple phases, each producing different types of findings:
+When a scan is triggered, Certifai runs through these phases in order:

-| Scan Type | What It Detects | Scanner |
-|-----------|----------------|---------|
-| **SAST** | Code-level vulnerabilities (injection, XSS, insecure crypto, etc.) | Semgrep |
-| **SBOM** | Dependency inventory, outdated packages, known vulnerabilities | Syft |
-| **CVE** | Known CVEs in dependencies cross-referenced against NVD | NVD API |
-| **GDPR** | Personal data handling issues, consent violations | Custom rules |
-| **OAuth** | OAuth/OIDC misconfigurations, insecure token handling | Custom rules |
+1. **Clone** -- pulls the latest code from the Git remote (or clones it for the first time)
+2. **SAST** -- runs static analysis using Semgrep with rules covering OWASP, GDPR, OAuth, secrets, and general security patterns
+3. **SBOM** -- extracts all dependencies using Syft, recording packages, versions, and licenses, then checks them for known vulnerabilities using Grype
+4. **CVE Check** -- cross-references dependencies against the NVD database for known CVEs
+5. **Graph Build** -- parses the codebase to construct a code knowledge graph of functions, classes, and their relationships
+6. **AI Triage** -- new findings are reviewed by an LLM that assesses severity, considers blast radius using the code graph, and generates remediation guidance
+7. **Issue Sync** -- creates or updates issues in connected trackers (GitHub, GitLab, Gitea, Jira) for new findings

-## Triggering a Scan
+Each phase produces results that are visible in the dashboard as soon as that phase completes.
+
+## How Scans Are Triggered

 ### Manual Scan

@@ -24,60 +26,54 @@ A full scan consists of multiple phases, each producing different types of findi

 ### Scheduled Scans

-Scans run automatically based on the `SCAN_SCHEDULE` cron expression. The default scans every 6 hours:
-
-```
-SCAN_SCHEDULE=0 0 */6 * * *
-```
+Repositories are scanned automatically on a recurring schedule. By default, scans run every 6 hours and CVE monitoring runs daily. Your administrator controls these schedules.

 ### Webhook-Triggered Scans

-Configure GitHub or GitLab webhooks to trigger scans on push events. Set the webhook URL to:
+When you configure a webhook in your Git hosting provider, scans are triggered automatically on push events, and pull requests can receive automated security reviews. See [Webhooks & PR Reviews](/guide/webhooks) for setup instructions.

-```
-http://<agent-host>:3002/webhook/github
-http://<agent-host>:3002/webhook/gitlab
-```
+## Scan Phases and Statuses

-And configure the corresponding webhook secret:
-
-```
-GITHUB_WEBHOOK_SECRET=your-secret
-GITLAB_WEBHOOK_SECRET=your-secret
-```
-
-## Scan Phases
-
-Each scan progresses through these phases in order:
-
-1. **Queued** — Scan is waiting to start
-2. **Cloning** — Repository is being cloned or updated
-3. **Scanning** — Static analysis and SBOM extraction are running
-4. **Analyzing** — CVE cross-referencing and graph construction
-5. **Reporting** — Creating tracker issues for new findings
-6. **Completed** — All phases finished successfully
-
-If any phase fails, the scan status is set to **Failed** with an error message.
-
-## Viewing Scan History
-
-The Overview page shows the 10 most recent scan runs across all repositories, including:
-
- Repository name
- Scan status
- Current phase
- Number of findings discovered
- Start time and duration
-
-## Scan Run Statuses
+Each scan progresses through these statuses:

 | Status | Meaning |
 |--------|---------|
-| `queued` | Waiting to start |
-| `running` | Currently executing |
-| `completed` | Finished successfully |
-| `failed` | Stopped due to an error |
+| **Queued** | Scan is waiting to start |
+| **Running** | Currently executing scan phases |
+| **Completed** | All phases finished successfully |
+| **Failed** | Stopped due to an error |

-## Deduplication
+You can monitor scan progress on the Overview page, which shows the most recent scan runs across all repositories, including the current phase, finding count, and duration.

-Findings are deduplicated using a fingerprint hash based on the scanner, file path, line number, and vulnerability type. Repeated scans will not create duplicate findings for the same issue.
+## Scan Types
+
+A full scan runs multiple analysis engines, each producing different types of findings:
+
+| Scan Type | What It Detects | Scanner |
+|-----------|----------------|---------|
+| **SAST** | Code-level vulnerabilities (injection, XSS, insecure crypto, etc.) | Semgrep |
+| **SBOM** | Dependency inventory, outdated packages, known vulnerabilities | Syft + Grype |
+| **CVE** | Known CVEs in dependencies cross-referenced against NVD | NVD API |
+| **GDPR** | Personal data handling issues, consent violations | Custom rules |
+| **OAuth** | OAuth/OIDC misconfigurations, insecure token handling | Custom rules |
+| **Secrets** | Hardcoded credentials, API keys, tokens in source code | Custom rules |
+| **Code Review** | Architecture and security patterns reviewed by AI | LLM-powered |
+
+## Deduplication and Fingerprinting
+
+Findings are deduplicated using a fingerprint hash based on the scanner, file path, line number, and vulnerability type. This means:
+
+- **Repeated scans** will not create duplicate findings for the same issue
+- **Tracker issues** are only created once per unique finding
+- **Resolved findings** that reappear in a new scan are flagged for re-review
+
+The fingerprint is also used to match findings to existing tracker issues, preventing duplicate issues from being created in GitHub, GitLab, Gitea, or Jira.
+
+## Interpreting Results
+
+After a scan completes, you can explore results in several ways:
+
+- **Findings** -- browse all discovered vulnerabilities with filters for severity, type, and status. See [Understanding Findings](/guide/findings).
+- **SBOM** -- review your dependency inventory, check for vulnerable packages, and audit license compliance. See [SBOM & Licenses](/guide/sbom).
+- **Overview** -- check the dashboard for a high-level summary of your security posture across all repositories.
+- **Issues** -- see which findings have been pushed to your issue tracker. See [Issues & Tracking](/guide/issues).
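The fingerprinting described under "Deduplication and Fingerprinting" can be sketched as follows. The docs name only the four input fields (scanner, file path, line number, vulnerability type); the use of SHA-256, the field separator, and the function names here are assumptions made for illustration, not Certifai's actual implementation.

```python
import hashlib


def fingerprint(scanner: str, file_path: str, line: int, vuln_type: str) -> str:
    """Hash the four identifying fields into a stable hex digest."""
    # Assumption: a unit-separator character keeps field boundaries unambiguous.
    key = "\x1f".join([scanner, file_path, str(line), vuln_type])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def new_findings(scan_results, known_fingerprints):
    """Return only findings whose fingerprint has not been seen before."""
    fresh = []
    for finding in scan_results:
        fp = fingerprint(
            finding["scanner"], finding["file"], finding["line"], finding["type"]
        )
        if fp not in known_fingerprints:
            known_fingerprints.add(fp)
            fresh.append(finding)
    return fresh


# The same finding reported by two consecutive scans (made-up data)
known = set()
scan = [{"scanner": "semgrep", "file": "src/auth.rs", "line": 42, "type": "hardcoded-secret"}]
first = new_findings(scan, known)
second = new_findings(scan, known)
print(len(first), len(second))  # prints: 1 0
```

Because the line number is part of the key, a finding that moves to a different line would get a new fingerprint under this scheme.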
@@ -0,0 +1,87 @@
+# Webhooks & PR Reviews
+
+Webhooks let Certifai respond to events in your Git repositories automatically. When configured, pushes to your repository trigger scans, and pull requests receive automated security reviews.
+
+## What Webhooks Enable
+
+- **Automatic scans on push** -- every push to your default branch triggers a scan
+- **PR security reviews** -- when a pull request is opened or updated, Certifai scans the changes and posts a review comment summarizing any security findings in the diff
+
+## Finding the Webhook URL and Secret
+
+Each repository in Certifai has its own webhook URL and secret:
+
+1. Go to **Repositories**
+2. Click **Edit** on the repository you want to configure
+3. In the edit modal, you will find the **Webhook URL** and **Webhook Secret**
+4. Copy both values -- you will need them when configuring your Git hosting provider
+
+## Setting Up Webhooks
+
+### Gitea
+
+1. Go to your repository in Gitea
+2. Navigate to **Settings > Webhooks > Add Webhook > Gitea**
+3. Set the **Target URL** to the webhook URL from Certifai
+4. Set the **Secret** to the webhook secret from Certifai
+5. Under **Trigger On**, select:
+   - **Push Events** -- for automatic scans on push
+   - **Pull Request Events** -- for PR security reviews
+6. Set the content type to `application/json`
+7. Click **Add Webhook**
+
+### GitHub
+
+1. Go to your repository on GitHub
+2. Navigate to **Settings > Webhooks > Add webhook**
+3. Set the **Payload URL** to the webhook URL from Certifai
+4. Set the **Content type** to `application/json`
+5. Set the **Secret** to the webhook secret from Certifai
+6. Under **Which events would you like to trigger this webhook?**, select **Let me select individual events**, then check:
+   - **Pushes** -- for automatic scans on push
+   - **Pull requests** -- for PR security reviews
+7. Click **Add webhook**
+
+### GitLab
+
+1. Go to your project in GitLab
+2. Navigate to **Settings > Webhooks**
+3. Set the **URL** to the webhook URL from Certifai
+4. Set the **Secret token** to the webhook secret from Certifai
+5. Under **Trigger**, check:
+   - **Push events** -- for automatic scans on push
+   - **Merge request events** -- for PR security reviews
+6. Click **Add webhook**
+
+## PR Review Flow
+
+When a pull request (or merge request) is opened or updated, the following happens:
+
+1. Your Git provider sends a webhook event to Certifai
+2. Certifai checks out the PR branch and runs a targeted scan on the changed files
+3. Findings specific to the changes in the PR are identified
+4. Certifai posts a review comment on the PR summarizing:
+   - Number of new findings introduced by the changes
+   - Severity breakdown
+   - Details for each finding including file, line, and remediation guidance
+
+This gives developers immediate security feedback in their pull request workflow, before code is merged.
+
+::: tip
+PR reviews focus only on changes introduced in the pull request, not the entire codebase. This keeps reviews relevant and actionable.
+:::
+
+## Events to Select
+
+Here is a summary of which events to enable for each feature:
+
+| Feature | Gitea | GitHub | GitLab |
+|---------|-------|--------|--------|
+| Scan on push | Push Events | Pushes | Push events |
+| PR reviews | Pull Request Events | Pull requests | Merge request events |
+
+You can enable one or both depending on your workflow.
+
+::: warning
+Make sure the webhook secret matches exactly between your Git provider and Certifai. Requests with an invalid signature are rejected.
+:::
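To see why the secret has to match exactly, it helps to look at how a webhook receiver typically validates a request. The sketch below follows GitHub's documented scheme, where the `X-Hub-Signature-256` header carries `sha256=` followed by the hex HMAC-SHA256 of the raw request body. Other providers use different headers, and how Certifai performs the check internally is not covered here, so treat `sign` and `is_valid` as hypothetical helpers.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Compute a GitHub-style signature header value for a raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))


# Made-up secret and payload
secret = "example-webhook-secret"
body = b'{"ref": "refs/heads/main"}'
header = sign(secret, body)

print(is_valid(secret, body, header))        # matching secret
print(is_valid(secret + " ", body, header))  # secret with a stray trailing space
```

Even a single extra character in the secret, such as a trailing space picked up when copying, produces a completely different digest, so the second check fails.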