docs: rewrite user docs, fix modal scroll, webhook URL, and sccache

Rewrite all public documentation to be user-facing only:
- Remove deployment, configuration, and self-hosting sections
- Add guide pages for SBOM, issues, webhooks & PR reviews
- Add reference pages for glossary and tools/scanners
- Add 12 screenshots from live dashboard
- Explain MCP, LLM triage, false positives, human-in-the-loop

Fix edit repository modal not scrollable (max-height + overflow-y).
Show full webhook URL using window.location.origin instead of path.
Unset RUSTC_WRAPPER in agent cargo commands to avoid sccache errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 14:17:46 +01:00
parent 689daa0f49
commit c253e4ef5e
40 changed files with 872 additions and 1334 deletions
--- a/docs/guide/scanning.md
+++ b/docs/guide/scanning.md
@@ -1,20 +1,22 @@
 # Running Scans

-Scans are the primary workflow in Compliance Scanner. Each scan analyzes a repository for security vulnerabilities, dependency risks, and code structure.
+Scans are the primary workflow in Certifai. Each scan analyzes a repository for security vulnerabilities, dependency risks, and code structure.

-## Scan Types
+## What Happens During a Scan

-A full scan consists of multiple phases, each producing different types of findings:
+When a scan is triggered, Certifai runs through these phases in order:

-| Scan Type | What It Detects | Scanner |
-|-----------|----------------|---------|
-| **SAST** | Code-level vulnerabilities (injection, XSS, insecure crypto, etc.) | Semgrep |
-| **SBOM** | Dependency inventory, outdated packages, known vulnerabilities | Syft |
-| **CVE** | Known CVEs in dependencies cross-referenced against NVD | NVD API |
-| **GDPR** | Personal data handling issues, consent violations | Custom rules |
-| **OAuth** | OAuth/OIDC misconfigurations, insecure token handling | Custom rules |
+1. **Clone** -- pulls the latest code from the Git remote (or clones it for the first time)
+2. **SAST** -- runs static analysis using Semgrep with rules covering OWASP, GDPR, OAuth, secrets, and general security patterns
+3. **SBOM** -- extracts all dependencies using Syft (packages, versions, and licenses) and checks them for known vulnerabilities using Grype
+4. **CVE Check** -- cross-references dependencies against the NVD database for known CVEs
+5. **Graph Build** -- parses the codebase to construct a code knowledge graph of functions, classes, and their relationships
+6. **AI Triage** -- new findings are reviewed by an LLM that assesses severity, considers blast radius using the code graph, and generates remediation guidance
+7. **Issue Sync** -- creates or updates issues in connected trackers (GitHub, GitLab, Gitea, Jira) for new findings

-## Triggering a Scan
+Each phase produces results that are visible in the dashboard as soon as they complete.
+
+## How Scans Are Triggered

 ### Manual Scan

@@ -24,60 +26,54 @@ A full scan consists of multiple phases, each producing different types of findi

 ### Scheduled Scans

-Scans run automatically based on the `SCAN_SCHEDULE` cron expression. The default scans every 6 hours:
-
-```
-SCAN_SCHEDULE=0 0 */6 * * *
-```
+Repositories are scanned automatically on a recurring schedule. By default, scans run every 6 hours and CVE monitoring runs daily. Your administrator controls these schedules.

 ### Webhook-Triggered Scans

-Configure GitHub or GitLab webhooks to trigger scans on push events. Set the webhook URL to:
+When you configure a webhook in your Git hosting provider, scans are triggered automatically on push events. You can also get automated PR reviews. See [Webhooks & PR Reviews](/guide/webhooks) for setup instructions.

-```
-http://<agent-host>:3002/webhook/github
-http://<agent-host>:3002/webhook/gitlab
-```
+## Scan Phases and Statuses

-And configure the corresponding webhook secret:
-
-```
-GITHUB_WEBHOOK_SECRET=your-secret
-GITLAB_WEBHOOK_SECRET=your-secret
-```
-
-## Scan Phases
-
-Each scan progresses through these phases in order:
-
-1. **Queued** — Scan is waiting to start
-2. **Cloning** — Repository is being cloned or updated
-3. **Scanning** — Static analysis and SBOM extraction are running
-4. **Analyzing** — CVE cross-referencing and graph construction
-5. **Reporting** — Creating tracker issues for new findings
-6. **Completed** — All phases finished successfully
-
-If any phase fails, the scan status is set to **Failed** with an error message.
-
-## Viewing Scan History
-
-The Overview page shows the 10 most recent scan runs across all repositories, including:
-
- Repository name
- Scan status
- Current phase
- Number of findings discovered
- Start time and duration
-
-## Scan Run Statuses
+Each scan progresses through these statuses:

 | Status | Meaning |
 |--------|---------|
-| `queued` | Waiting to start |
-| `running` | Currently executing |
-| `completed` | Finished successfully |
-| `failed` | Stopped due to an error |
+| **Queued** | Scan is waiting to start |
+| **Running** | Currently executing scan phases |
+| **Completed** | All phases finished successfully |
+| **Failed** | Stopped due to an error |

-## Deduplication
+You can monitor scan progress on the Overview page, which shows the most recent scan runs across all repositories, including the current phase, finding count, and duration.

-Findings are deduplicated using a fingerprint hash based on the scanner, file path, line number, and vulnerability type. Repeated scans will not create duplicate findings for the same issue.
+## Scan Types
+
+A full scan runs multiple analysis engines, each producing different types of findings:
+
+| Scan Type | What It Detects | Scanner |
+|-----------|----------------|---------|
+| **SAST** | Code-level vulnerabilities (injection, XSS, insecure crypto, etc.) | Semgrep |
+| **SBOM** | Dependency inventory, outdated packages, known vulnerabilities | Syft + Grype |
+| **CVE** | Known CVEs in dependencies cross-referenced against NVD | NVD API |
+| **GDPR** | Personal data handling issues, consent violations | Custom rules |
+| **OAuth** | OAuth/OIDC misconfigurations, insecure token handling | Custom rules |
+| **Secrets** | Hardcoded credentials, API keys, tokens in source code | Custom rules |
+| **Code Review** | Architecture and security patterns reviewed by AI | LLM-powered |
+
+## Deduplication and Fingerprinting
+
+Findings are deduplicated using a fingerprint hash based on the scanner, file path, line number, and vulnerability type. This means:
+
+- **Repeated scans** will not create duplicate findings for the same issue
+- **Tracker issues** are only created once per unique finding
+- **Resolved findings** that reappear in a new scan are flagged for re-review
+
+The fingerprint is also used to match findings to existing tracker issues, preventing duplicate issues from being created in GitHub, GitLab, Gitea, or Jira.
+
+## Interpreting Results
+
+After a scan completes, you can explore results in several ways:
+
+- **Findings** -- browse all discovered vulnerabilities with filters for severity, type, and status. See [Understanding Findings](/guide/findings).
+- **SBOM** -- review your dependency inventory, check for vulnerable packages, and audit license compliance. See [SBOM & Licenses](/guide/sbom).
+- **Overview** -- check the dashboard for a high-level summary of your security posture across all repositories.
+- **Issues** -- see which findings have been pushed to your issue tracker. See [Issues & Tracking](/guide/issues).
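
Note on the "Deduplication and Fingerprinting" section added above: the docs say findings are fingerprinted from the scanner, file path, line number, and vulnerability type, but do not show how. The following is a minimal sketch of that idea, not the project's actual implementation. The function names, the dictionary keys, the choice of SHA-256, and the NUL separator are all assumptions made for illustration.

```python
import hashlib


def fingerprint(scanner: str, file_path: str, line: int, vuln_type: str) -> str:
    """Hash the four fields named in the docs into a stable hex digest.

    Hypothetical helper: the real hashing scheme is not shown in the diff.
    """
    # A NUL separator keeps ("a", "bc") and ("ab", "c") from producing
    # the same key once the fields are joined.
    key = "\x00".join([scanner, file_path, str(line), vuln_type])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def new_findings(findings: list[dict], seen: set[str]) -> list[dict]:
    """Return only findings whose fingerprint is not in `seen`.

    `seen` is updated in place, so a repeated scan that reports the
    same issue yields nothing new.
    """
    fresh = []
    for f in findings:
        fp = fingerprint(f["scanner"], f["file_path"], f["line"], f["vuln_type"])
        if fp not in seen:
            seen.add(fp)
            fresh.append(f)
    return fresh
```

Under this sketch, running `new_findings` twice with the same scan output and the same `seen` set returns the findings once and then an empty list, which mirrors the "repeated scans will not create duplicate findings" behavior described in the docs. One consequence of including the line number in the key is that a finding that moves within a file gets a new fingerprint.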