feat: add compliance modules 2-5 (dashboard, security templates, process manager, evidence collector)
All checks were successful
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Successful in 32s
CI/CD / test-python-backend-compliance (push) Successful in 34s
CI/CD / test-python-document-crawler (push) Successful in 23s
CI/CD / test-python-dsms-gateway (push) Successful in 21s
CI/CD / validate-canonical-controls (push) Successful in 11s
CI/CD / Deploy (push) Successful in 2s

Module 2: Extended Compliance Dashboard with roadmap, module-status, next-actions, snapshots, score-history
Module 3: 7 German security document templates (IT-Sicherheitskonzept, Datenschutz, Backup, Logging, Incident-Response, Zugriff, Risikomanagement)
Module 4: Compliance Process Manager with CRUD, complete/skip/seed, ~50 seed tasks, 3-tab UI
Module 5: Evidence Collector Extended with automated checks, control-mapping, coverage report, 4-tab UI

Also includes: canonical control library enhancements (verification method, categories, dedup), control generator improvements, RAG client extensions

52 tests pass; frontend builds cleanly.
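The canonical-library dedup mentioned above is not visible in the hunks below; a minimal sketch of deduplication keyed on `control_id` (the `DedupControl` shape and the keep-latest-by-`updated_at` policy are assumptions, not the actual implementation) might look like:

```typescript
// Hypothetical sketch: dedupe canonical controls by control_id,
// keeping the entry with the newest updated_at timestamp.
interface DedupControl {
  control_id: string
  updated_at: string // ISO-8601, so lexicographic comparison is chronological
}

function dedupeControls<T extends DedupControl>(controls: T[]): T[] {
  const byId = new Map<string, T>()
  for (const c of controls) {
    const existing = byId.get(c.control_id)
    if (!existing || c.updated_at > existing.updated_at) {
      byId.set(c.control_id, c)
    }
  }
  return [...byId.values()]
}
```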

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Benjamin Admin
2026-03-14 21:03:04 +01:00
parent 13d13c8226
commit 49ce417428
35 changed files with 8741 additions and 422 deletions


@@ -29,11 +29,13 @@ export async function GET(request: NextRequest) {
const domain = searchParams.get('domain')
const verificationMethod = searchParams.get('verification_method')
const categoryFilter = searchParams.get('category')
const targetAudience = searchParams.get('target_audience')
const params = new URLSearchParams()
if (severity) params.set('severity', severity)
if (domain) params.set('domain', domain)
if (verificationMethod) params.set('verification_method', verificationMethod)
if (categoryFilter) params.set('category', categoryFilter)
if (targetAudience) params.set('target_audience', targetAudience)
const qs = params.toString()
backendPath = `/api/compliance/v1/canonical/controls${qs ? `?${qs}` : ''}`
break
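The hunk above forwards only whitelisted query parameters to the backend. Extracted as a pure helper (the parameter names and target path come from the diff; the function itself is illustrative), the same logic reads:

```typescript
// Build the backend path for the canonical-controls proxy route.
// Mirrors the hunk above: only parameters that are present are forwarded.
function buildControlsPath(params: {
  severity?: string | null
  domain?: string | null
  verification_method?: string | null
  category?: string | null
  target_audience?: string | null
}): string {
  const qs = new URLSearchParams()
  const keys = ['severity', 'domain', 'verification_method', 'category', 'target_audience'] as const
  for (const key of keys) {
    const value = params[key]
    if (value) qs.set(key, value)
  }
  const s = qs.toString()
  return `/api/compliance/v1/canonical/controls${s ? `?${s}` : ''}`
}
```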

File diff suppressed because it is too large.


@@ -8,7 +8,7 @@ import {
} from 'lucide-react'
import {
CanonicalControl, EFFORT_LABELS, BACKEND_URL,
SeverityBadge, StateBadge, LicenseRuleBadge, VerificationMethodBadge, CategoryBadge,
SeverityBadge, StateBadge, LicenseRuleBadge, VerificationMethodBadge, CategoryBadge, TargetAudienceBadge,
VERIFICATION_METHODS, CATEGORY_OPTIONS,
} from './helpers'
@@ -124,6 +124,7 @@ export function ControlDetail({
<LicenseRuleBadge rule={ctrl.license_rule} />
<VerificationMethodBadge method={ctrl.verification_method} />
<CategoryBadge category={ctrl.category} />
<TargetAudienceBadge audience={ctrl.target_audience} />
</div>
<h2 className="text-lg font-semibold text-gray-900 mt-1">{ctrl.title}</h2>
</div>
@@ -163,22 +164,46 @@ export function ControlDetail({
<p className="text-sm text-gray-700 leading-relaxed">{ctrl.rationale}</p>
</section>
{/* Source Info (Rule 1 + 2) */}
{/* Gesetzliche Grundlage (Rule 1 + 2) */}
{ctrl.source_citation && (
<section className="bg-blue-50 border border-blue-200 rounded-lg p-4">
<div className="flex items-center gap-2 mb-2">
<div className="flex items-center gap-2 mb-3">
<Scale className="w-4 h-4 text-blue-600" />
<h3 className="text-sm font-semibold text-blue-900">Quellenangabe</h3>
<h3 className="text-sm font-semibold text-blue-900">Gesetzliche Grundlage</h3>
{ctrl.license_rule === 1 && (
<span className="text-xs bg-blue-100 text-blue-700 px-2 py-0.5 rounded-full">Direkte gesetzliche Pflicht</span>
)}
{ctrl.license_rule === 2 && (
<span className="text-xs bg-teal-100 text-teal-700 px-2 py-0.5 rounded-full">Standard mit Zitationspflicht</span>
)}
</div>
<div className="text-xs text-blue-700 space-y-1">
{Object.entries(ctrl.source_citation).map(([k, v]) => (
<p key={k}><span className="font-medium">{k}:</span> {v}</p>
))}
<div className="flex items-start gap-3">
<div className="flex-1">
{ctrl.source_citation.source && (
<p className="text-sm font-medium text-blue-900 mb-1">{ctrl.source_citation.source}</p>
)}
{ctrl.source_citation.license && (
<p className="text-xs text-blue-600">Lizenz: {ctrl.source_citation.license}</p>
)}
{ctrl.source_citation.license_notice && (
<p className="text-xs text-blue-600 mt-0.5">{ctrl.source_citation.license_notice}</p>
)}
</div>
{ctrl.source_citation.url && (
<a
href={ctrl.source_citation.url}
target="_blank"
rel="noopener noreferrer"
className="flex items-center gap-1 text-xs text-blue-600 hover:text-blue-800 whitespace-nowrap"
>
<ExternalLink className="w-3.5 h-3.5" />Quelle
</a>
)}
</div>
{ctrl.source_original_text && (
<details className="mt-3">
<summary className="text-xs text-blue-600 cursor-pointer hover:text-blue-800">Originaltext anzeigen</summary>
<p className="text-xs text-gray-600 mt-2 p-2 bg-white rounded border border-blue-100 leading-relaxed max-h-40 overflow-y-auto">
<p className="text-xs text-gray-600 mt-2 p-2 bg-white rounded border border-blue-100 leading-relaxed max-h-40 overflow-y-auto whitespace-pre-wrap">
{ctrl.source_original_text}
</p>
</details>
@@ -186,6 +211,19 @@ export function ControlDetail({
</section>
)}
{/* Impliziter Gesetzesbezug (Rule 3 — kein Originaltext, aber ggf. Gesetzesbezug ueber Anchors) */}
{!ctrl.source_citation && ctrl.open_anchors.length > 0 && (
<section className="bg-amber-50 border border-amber-200 rounded-lg p-3">
<div className="flex items-center gap-2">
<Scale className="w-4 h-4 text-amber-600" />
<p className="text-xs text-amber-700">
Dieser Control setzt implizit gesetzliche Anforderungen um (z.B. DSGVO Art. 32, NIS2 Art. 21).
Die konkreten Massnahmen leiten sich aus den Open-Source-Referenzen unten ab.
</p>
</div>
</section>
)}
{/* Scope */}
{(ctrl.scope.platforms?.length || ctrl.scope.components?.length || ctrl.scope.data_classes?.length) ? (
<section>
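The detail view above distinguishes three license rules: Rule 1 shows a "direct statutory obligation" badge, Rule 2 a "standard with citation duty" badge, and Rule 3 falls back to the amber implicit-anchors note. A standalone summary of that branching (labels taken from the diff; the helper itself is not part of the commit) could read:

```typescript
// Illustrative mapping of license_rule to the banner shown in ControlDetail.
// Rule 3 (and anything unrecognized) renders no banner; the Rule-3 case is
// handled separately via open_anchors in the component above.
function licenseRuleBanner(rule: number | null): string | null {
  switch (rule) {
    case 1: return 'Direkte gesetzliche Pflicht'   // direct statutory obligation
    case 2: return 'Standard mit Zitationspflicht' // standard with citation duty
    default: return null
  }
}
```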


@@ -2,7 +2,7 @@
import { useState } from 'react'
import { BookOpen, Trash2, Save, X } from 'lucide-react'
import { EMPTY_CONTROL, VERIFICATION_METHODS, CATEGORY_OPTIONS } from './helpers'
import { EMPTY_CONTROL, VERIFICATION_METHODS, CATEGORY_OPTIONS, TARGET_AUDIENCE_OPTIONS } from './helpers'
export function ControlForm({
initial,
@@ -268,8 +268,8 @@ export function ControlForm({
</div>
</div>
{/* Verification Method & Category */}
<div className="grid grid-cols-2 gap-4">
{/* Verification Method, Category & Target Audience */}
<div className="grid grid-cols-3 gap-4">
<div>
<label className="block text-xs font-medium text-gray-600 mb-1">Nachweismethode</label>
<select
@@ -297,6 +297,20 @@ export function ControlForm({
))}
</select>
</div>
<div>
<label className="block text-xs font-medium text-gray-600 mb-1">Zielgruppe</label>
<select
value={form.target_audience || ''}
onChange={e => setForm({ ...form, target_audience: e.target.value || null })}
className="w-full px-3 py-2 text-sm border border-gray-300 rounded-lg"
>
<option value=""> Nicht zugewiesen </option>
{Object.entries(TARGET_AUDIENCE_OPTIONS).map(([k, v]) => (
<option key={k} value={k}>{v.label}</option>
))}
</select>
<p className="text-xs text-gray-400 mt-1">Fuer wen ist dieses Control relevant?</p>
</div>
</div>
</div>
)


@@ -44,6 +44,7 @@ export interface CanonicalControl {
customer_visible?: boolean
verification_method: string | null
category: string | null
target_audience: string | null
generation_metadata?: Record<string, unknown> | null
created_at: string
updated_at: string
@@ -96,6 +97,7 @@ export const EMPTY_CONTROL = {
tags: [] as string[],
verification_method: null as string | null,
category: null as string | null,
target_audience: null as string | null,
}
export const DOMAIN_OPTIONS = [
@@ -138,6 +140,13 @@ export const CATEGORY_OPTIONS = [
{ value: 'identity', label: 'Identitaetsmanagement' },
]
export const TARGET_AUDIENCE_OPTIONS: Record<string, { bg: string; label: string }> = {
enterprise: { bg: 'bg-cyan-100 text-cyan-700', label: 'Unternehmen' },
authority: { bg: 'bg-rose-100 text-rose-700', label: 'Behoerden' },
provider: { bg: 'bg-violet-100 text-violet-700', label: 'Anbieter' },
all: { bg: 'bg-gray-100 text-gray-700', label: 'Alle' },
}
export const COLLECTION_OPTIONS = [
{ value: 'bp_compliance_ce', label: 'CE (OWASP, ENISA, BSI)' },
{ value: 'bp_compliance_gesetze', label: 'Gesetze (EU, DE, BSI)' },
@@ -213,6 +222,13 @@ export function CategoryBadge({ category }: { category: string | null }) {
)
}
export function TargetAudienceBadge({ audience }: { audience: string | null }) {
if (!audience) return null
const config = TARGET_AUDIENCE_OPTIONS[audience]
if (!config) return null
return <span className={`inline-flex items-center px-2 py-0.5 rounded text-xs font-medium ${config.bg}`}>{config.label}</span>
}
export function getDomain(controlId: string): string {
return controlId.split('-')[0] || ''
}
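`TargetAudienceBadge` silently renders nothing for a missing or unknown audience. The lookup itself, isolated from JSX so it can be exercised directly (the options table is copied from the diff; the `audienceLabel` helper is illustrative):

```typescript
// Badge-resolution logic from TargetAudienceBadge as a pure function:
// missing or unrecognized audiences resolve to null (no badge rendered).
const TARGET_AUDIENCE_OPTIONS: Record<string, { bg: string; label: string }> = {
  enterprise: { bg: 'bg-cyan-100 text-cyan-700', label: 'Unternehmen' },
  authority: { bg: 'bg-rose-100 text-rose-700', label: 'Behoerden' },
  provider: { bg: 'bg-violet-100 text-violet-700', label: 'Anbieter' },
  all: { bg: 'bg-gray-100 text-gray-700', label: 'Alle' },
}

function audienceLabel(audience: string | null): string | null {
  if (!audience) return null
  return TARGET_AUDIENCE_OPTIONS[audience]?.label ?? null
}
```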


@@ -2,13 +2,14 @@
import { useState, useEffect, useMemo, useCallback } from 'react'
import {
Shield, Search, ChevronRight, Filter, Lock,
Shield, Search, ChevronRight, ChevronLeft, Filter, Lock,
BookOpen, Plus, Zap, BarChart3, ListChecks,
ChevronsLeft, ChevronsRight,
} from 'lucide-react'
import {
CanonicalControl, Framework, BACKEND_URL, EMPTY_CONTROL,
SeverityBadge, StateBadge, LicenseRuleBadge, VerificationMethodBadge, CategoryBadge,
getDomain, VERIFICATION_METHODS, CATEGORY_OPTIONS,
SeverityBadge, StateBadge, LicenseRuleBadge, VerificationMethodBadge, CategoryBadge, TargetAudienceBadge,
getDomain, VERIFICATION_METHODS, CATEGORY_OPTIONS, TARGET_AUDIENCE_OPTIONS,
} from './components/helpers'
import { ControlForm } from './components/ControlForm'
import { ControlDetail } from './components/ControlDetail'
@@ -32,6 +33,7 @@ export default function ControlLibraryPage() {
const [stateFilter, setStateFilter] = useState<string>('')
const [verificationFilter, setVerificationFilter] = useState<string>('')
const [categoryFilter, setCategoryFilter] = useState<string>('')
const [audienceFilter, setAudienceFilter] = useState<string>('')
// CRUD state
const [mode, setMode] = useState<'list' | 'detail' | 'create' | 'edit'>('list')
@@ -42,6 +44,10 @@ export default function ControlLibraryPage() {
const [processedStats, setProcessedStats] = useState<Array<Record<string, unknown>>>([])
const [showStats, setShowStats] = useState(false)
// Pagination
const [currentPage, setCurrentPage] = useState(1)
const PAGE_SIZE = 50
// Review mode
const [reviewMode, setReviewMode] = useState(false)
const [reviewIndex, setReviewIndex] = useState(0)
@@ -80,6 +86,7 @@ export default function ControlLibraryPage() {
if (stateFilter && c.release_state !== stateFilter) return false
if (verificationFilter && c.verification_method !== verificationFilter) return false
if (categoryFilter && c.category !== categoryFilter) return false
if (audienceFilter && c.target_audience !== audienceFilter) return false
if (searchQuery) {
const q = searchQuery.toLowerCase()
return (
@@ -91,7 +98,17 @@ export default function ControlLibraryPage() {
}
return true
})
}, [controls, severityFilter, domainFilter, stateFilter, verificationFilter, categoryFilter, searchQuery])
}, [controls, severityFilter, domainFilter, stateFilter, verificationFilter, categoryFilter, audienceFilter, searchQuery])
// Reset page when filters change
useEffect(() => { setCurrentPage(1) }, [severityFilter, domainFilter, stateFilter, verificationFilter, categoryFilter, audienceFilter, searchQuery])
// Pagination
const totalPages = Math.max(1, Math.ceil(filteredControls.length / PAGE_SIZE))
const paginatedControls = useMemo(() => {
const start = (currentPage - 1) * PAGE_SIZE
return filteredControls.slice(start, start + PAGE_SIZE)
}, [filteredControls, currentPage])
// Review queue items
const reviewItems = useMemo(() => {
@@ -413,6 +430,16 @@ export default function ControlLibraryPage() {
<option key={c.value} value={c.value}>{c.label}</option>
))}
</select>
<select
value={audienceFilter}
onChange={e => setAudienceFilter(e.target.value)}
className="text-sm border border-gray-300 rounded-lg px-3 py-2 focus:outline-none focus:ring-2 focus:ring-purple-500"
>
<option value="">Alle Zielgruppen</option>
{Object.entries(TARGET_AUDIENCE_OPTIONS).map(([k, v]) => (
<option key={k} value={k}>{v.label}</option>
))}
</select>
</div>
{/* Processing Stats */}
@@ -443,10 +470,19 @@ export default function ControlLibraryPage() {
/>
)}
{/* Pagination Header */}
<div className="px-6 py-2 bg-gray-50 border-b border-gray-200 flex items-center justify-between text-xs text-gray-500">
<span>
{filteredControls.length} Controls gefunden
{filteredControls.length !== controls.length && ` (von ${controls.length} gesamt)`}
</span>
<span>Seite {currentPage} von {totalPages}</span>
</div>
{/* Control List */}
<div className="flex-1 overflow-y-auto p-6">
<div className="space-y-3">
{filteredControls.map(ctrl => (
{paginatedControls.map(ctrl => (
<button
key={ctrl.control_id}
onClick={() => { setSelectedControl(ctrl); setMode('detail') }}
@@ -454,13 +490,14 @@ export default function ControlLibraryPage() {
>
<div className="flex items-start justify-between">
<div className="flex-1 min-w-0">
<div className="flex items-center gap-2 mb-1">
<div className="flex items-center gap-2 mb-1 flex-wrap">
<span className="text-xs font-mono text-purple-600 bg-purple-50 px-1.5 py-0.5 rounded">{ctrl.control_id}</span>
<SeverityBadge severity={ctrl.severity} />
<StateBadge state={ctrl.release_state} />
<LicenseRuleBadge rule={ctrl.license_rule} />
<VerificationMethodBadge method={ctrl.verification_method} />
<CategoryBadge category={ctrl.category} />
<TargetAudienceBadge audience={ctrl.target_audience} />
{ctrl.risk_score !== null && (
<span className="text-xs text-gray-400">Score: {ctrl.risk_score}</span>
)}
@@ -472,11 +509,14 @@ export default function ControlLibraryPage() {
<div className="flex items-center gap-2 mt-2">
<BookOpen className="w-3 h-3 text-green-600" />
<span className="text-xs text-green-700">
{ctrl.open_anchors.length} Open-Source-Referenzen:
</span>
<span className="text-xs text-gray-500">
{ctrl.open_anchors.map(a => a.framework).filter((v, i, arr) => arr.indexOf(v) === i).join(', ')}
{ctrl.open_anchors.length} Referenzen
</span>
{ctrl.source_citation?.source && (
<>
<span className="text-gray-300">|</span>
<span className="text-xs text-blue-600">{ctrl.source_citation.source}</span>
</>
)}
</div>
</div>
<ChevronRight className="w-4 h-4 text-gray-300 group-hover:text-purple-500 flex-shrink-0 mt-1 ml-4" />
@@ -492,6 +532,72 @@ export default function ControlLibraryPage() {
</div>
)}
</div>
{/* Pagination Controls */}
{totalPages > 1 && (
<div className="flex items-center justify-center gap-2 mt-6 pb-4">
<button
onClick={() => setCurrentPage(1)}
disabled={currentPage === 1}
className="p-2 text-gray-500 hover:text-purple-600 disabled:opacity-30 disabled:cursor-not-allowed"
title="Erste Seite"
>
<ChevronsLeft className="w-4 h-4" />
</button>
<button
onClick={() => setCurrentPage(p => Math.max(1, p - 1))}
disabled={currentPage === 1}
className="p-2 text-gray-500 hover:text-purple-600 disabled:opacity-30 disabled:cursor-not-allowed"
title="Vorherige Seite"
>
<ChevronLeft className="w-4 h-4" />
</button>
{/* Page numbers */}
{Array.from({ length: totalPages }, (_, i) => i + 1)
.filter(p => p === 1 || p === totalPages || Math.abs(p - currentPage) <= 2)
.reduce<(number | 'dots')[]>((acc, p, i, arr) => {
if (i > 0 && p - (arr[i - 1] as number) > 1) acc.push('dots')
acc.push(p)
return acc
}, [])
.map((p, i) =>
p === 'dots' ? (
<span key={`dots-${i}`} className="px-1 text-gray-400">...</span>
) : (
<button
key={p}
onClick={() => setCurrentPage(p as number)}
className={`w-8 h-8 text-sm rounded-lg ${
currentPage === p
? 'bg-purple-600 text-white'
: 'text-gray-600 hover:bg-purple-50 hover:text-purple-600'
}`}
>
{p}
</button>
)
)
}
<button
onClick={() => setCurrentPage(p => Math.min(totalPages, p + 1))}
disabled={currentPage === totalPages}
className="p-2 text-gray-500 hover:text-purple-600 disabled:opacity-30 disabled:cursor-not-allowed"
title="Naechste Seite"
>
<ChevronRight className="w-4 h-4" />
</button>
<button
onClick={() => setCurrentPage(totalPages)}
disabled={currentPage === totalPages}
className="p-2 text-gray-500 hover:text-purple-600 disabled:opacity-30 disabled:cursor-not-allowed"
title="Letzte Seite"
>
<ChevronsRight className="w-4 h-4" />
</button>
</div>
)}
</div>
</div>
)
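The page-number windowing in the pagination controls above (keep page 1, the last page, and pages within 2 of the current page, inserting an ellipsis wherever the kept pages are non-adjacent) can be extracted as a pure function, verbatim from the diff:

```typescript
// Page-number window with 'dots' placeholders for collapsed ranges,
// as used by the pagination controls in the control library page.
function pageWindow(totalPages: number, currentPage: number): (number | 'dots')[] {
  return Array.from({ length: totalPages }, (_, i) => i + 1)
    .filter(p => p === 1 || p === totalPages || Math.abs(p - currentPage) <= 2)
    .reduce<(number | 'dots')[]>((acc, p, i, arr) => {
      // A gap between consecutive kept pages becomes a single 'dots' entry.
      if (i > 0 && p - (arr[i - 1] as number) > 1) acc.push('dots')
      acc.push(p)
      return acc
    }, [])
}
```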


@@ -44,6 +44,7 @@ const CATEGORIES: { key: string; label: string; types: string[] | null }[] = [
{ key: 'cloud', label: 'Cloud', types: ['cloud_service_agreement'] },
{ key: 'misc', label: 'Weitere', types: ['community_guidelines', 'copyright_policy', 'data_usage_clause'] },
{ key: 'dsfa', label: 'DSFA', types: ['dsfa'] },
{ key: 'security', label: 'Sicherheitskonzepte', types: ['it_security_concept', 'data_protection_concept', 'backup_recovery_concept', 'logging_concept', 'incident_response_plan', 'access_control_concept', 'risk_management_concept'] },
]
// =============================================================================


@@ -303,8 +303,76 @@ function LoadingSkeleton() {
// MAIN PAGE
// =============================================================================
// =============================================================================
// EVIDENCE CHECK TYPES
// =============================================================================
interface EvidenceCheck {
id: string
check_code: string
title: string
description: string | null
check_type: string
target_url: string | null
frequency: string
is_active: boolean
last_run_at: string | null
next_run_at: string | null
}
interface CheckResult {
id: string
check_id: string
run_status: string
summary: string | null
findings_count: number
critical_findings: number
duration_ms: number
run_at: string
}
interface EvidenceMapping {
id: string
evidence_id: string
control_code: string
mapping_type: string
verified_at: string | null
verified_by: string | null
notes: string | null
}
interface CoverageReport {
total_controls: number
controls_with_evidence: number
controls_without_evidence: number
coverage_percent: number
}
type EvidenceTabKey = 'evidence' | 'checks' | 'mapping' | 'report'
const CHECK_TYPE_LABELS: Record<string, { label: string; color: string }> = {
tls_scan: { label: 'TLS-Scan', color: 'bg-blue-100 text-blue-700' },
header_check: { label: 'Header-Check', color: 'bg-green-100 text-green-700' },
certificate_check: { label: 'Zertifikat', color: 'bg-yellow-100 text-yellow-700' },
dns_check: { label: 'DNS-Check', color: 'bg-purple-100 text-purple-700' },
api_scan: { label: 'API-Scan', color: 'bg-indigo-100 text-indigo-700' },
config_scan: { label: 'Config-Scan', color: 'bg-orange-100 text-orange-700' },
port_scan: { label: 'Port-Scan', color: 'bg-red-100 text-red-700' },
}
const RUN_STATUS_LABELS: Record<string, { label: string; color: string }> = {
running: { label: 'Laeuft...', color: 'bg-blue-100 text-blue-700' },
passed: { label: 'Bestanden', color: 'bg-green-100 text-green-700' },
failed: { label: 'Fehlgeschlagen', color: 'bg-red-100 text-red-700' },
warning: { label: 'Warnung', color: 'bg-yellow-100 text-yellow-700' },
error: { label: 'Fehler', color: 'bg-red-100 text-red-700' },
}
const CHECK_API = '/api/sdk/v1/compliance/evidence-checks'
export default function EvidencePage() {
const { state, dispatch } = useSDK()
const [activeTab, setActiveTab] = useState<EvidenceTabKey>('evidence')
const [filter, setFilter] = useState<string>('all')
const [loading, setLoading] = useState(true)
const [error, setError] = useState<string | null>(null)
@@ -314,6 +382,17 @@ export default function EvidencePage() {
const [pageSize] = useState(20)
const [total, setTotal] = useState(0)
// Evidence Checks state
const [checks, setChecks] = useState<EvidenceCheck[]>([])
const [checksLoading, setChecksLoading] = useState(false)
const [runningCheckId, setRunningCheckId] = useState<string | null>(null)
const [checkResults, setCheckResults] = useState<Record<string, CheckResult[]>>({})
// Mappings state
const [mappings, setMappings] = useState<EvidenceMapping[]>([])
const [coverageReport, setCoverageReport] = useState<CoverageReport | null>(null)
const [seedingChecks, setSeedingChecks] = useState(false)
// Fetch evidence from backend on mount and when page changes
useEffect(() => {
const fetchEvidence = async () => {
@@ -511,8 +590,86 @@ export default function EvidencePage() {
}
}
// Load checks when tab changes
const loadChecks = async () => {
setChecksLoading(true)
try {
const res = await fetch(`${CHECK_API}?limit=50`)
if (res.ok) {
const data = await res.json()
setChecks(data.checks || [])
}
} catch { /* silent */ }
finally { setChecksLoading(false) }
}
const runCheck = async (checkId: string) => {
setRunningCheckId(checkId)
try {
const res = await fetch(`${CHECK_API}/${checkId}/run`, { method: 'POST' })
if (res.ok) {
const result = await res.json()
setCheckResults(prev => ({
...prev,
[checkId]: [result, ...(prev[checkId] || [])].slice(0, 5),
}))
loadChecks() // refresh last_run_at
}
} catch { /* silent */ }
finally { setRunningCheckId(null) }
}
const loadCheckResults = async (checkId: string) => {
try {
const res = await fetch(`${CHECK_API}/${checkId}/results?limit=5`)
if (res.ok) {
const data = await res.json()
setCheckResults(prev => ({ ...prev, [checkId]: data.results || [] }))
}
} catch { /* silent */ }
}
const seedChecks = async () => {
setSeedingChecks(true)
try {
await fetch(`${CHECK_API}/seed`, { method: 'POST' })
loadChecks()
} catch { /* silent */ }
finally { setSeedingChecks(false) }
}
const loadMappings = async () => {
try {
const res = await fetch(`${CHECK_API}/mappings`)
if (res.ok) {
const data = await res.json()
setMappings(data.mappings || [])
}
} catch { /* silent */ }
}
const loadCoverageReport = async () => {
try {
const res = await fetch(`${CHECK_API}/mappings/report`)
if (res.ok) setCoverageReport(await res.json())
} catch { /* silent */ }
}
useEffect(() => {
if (activeTab === 'checks' && checks.length === 0) loadChecks()
if (activeTab === 'mapping') { loadMappings(); loadCoverageReport() }
if (activeTab === 'report') loadCoverageReport()
}, [activeTab]) // eslint-disable-line react-hooks/exhaustive-deps
const stepInfo = STEP_EXPLANATIONS['evidence']
const evidenceTabs: { key: EvidenceTabKey; label: string }[] = [
{ key: 'evidence', label: 'Nachweise' },
{ key: 'checks', label: 'Automatische Checks' },
{ key: 'mapping', label: 'Control-Mapping' },
{ key: 'report', label: 'Report' },
]
return (
<div className="space-y-6">
{/* Hidden file input */}
@@ -556,6 +713,25 @@ export default function EvidencePage() {
</button>
</StepHeader>
{/* Tab Navigation */}
<div className="bg-white rounded-xl shadow-sm border">
<div className="flex border-b">
{evidenceTabs.map(tab => (
<button
key={tab.key}
onClick={() => setActiveTab(tab.key)}
className={`px-6 py-3 text-sm font-medium transition-colors ${
activeTab === tab.key
? 'text-purple-600 border-b-2 border-purple-600'
: 'text-gray-500 hover:text-gray-700'
}`}
>
{tab.label}
</button>
))}
</div>
</div>
{/* Error Banner */}
{error && (
<div className="p-4 bg-red-50 border border-red-200 rounded-lg text-red-700 flex items-center justify-between">
@@ -564,6 +740,11 @@ export default function EvidencePage() {
</div>
)}
{/* ============================================================ */}
{/* TAB: Nachweise (existing content) */}
{/* ============================================================ */}
{activeTab === 'evidence' && <>
{/* Controls Alert */}
{state.controls.length === 0 && !loading && (
<div className="bg-amber-50 border border-amber-200 rounded-xl p-4">
@@ -681,6 +862,250 @@ export default function EvidencePage() {
<p className="mt-2 text-gray-500">Passen Sie den Filter an oder laden Sie neue Nachweise hoch.</p>
</div>
)}
</>}
{/* ============================================================ */}
{/* TAB: Automatische Checks */}
{/* ============================================================ */}
{activeTab === 'checks' && (
<>
{/* Seed if empty */}
{!checksLoading && checks.length === 0 && (
<div className="p-4 bg-yellow-50 border border-yellow-200 rounded-lg flex items-center justify-between">
<div>
<p className="font-medium text-yellow-800">Keine automatischen Checks vorhanden</p>
<p className="text-sm text-yellow-700">Laden Sie ca. 15 Standard-Checks (TLS, Header, Zertifikate, etc.).</p>
</div>
<button onClick={seedChecks} disabled={seedingChecks}
className="px-4 py-2 bg-yellow-600 text-white rounded-lg hover:bg-yellow-700 disabled:opacity-50">
{seedingChecks ? 'Lade...' : 'Standard-Checks laden'}
</button>
</div>
)}
{checksLoading ? (
<div className="flex justify-center py-12">
<div className="animate-spin rounded-full h-8 w-8 border-b-2 border-purple-600" />
</div>
) : (
<div className="space-y-3">
{checks.map(check => {
const typeMeta = CHECK_TYPE_LABELS[check.check_type] || { label: check.check_type, color: 'bg-gray-100 text-gray-700' }
const results = checkResults[check.id] || []
const lastResult = results[0]
const isRunning = runningCheckId === check.id
return (
<div key={check.id} className="bg-white rounded-xl border border-gray-200 p-5">
<div className="flex items-start justify-between">
<div className="flex-1">
<div className="flex items-center gap-2">
<h4 className="font-medium text-gray-900">{check.title}</h4>
<span className={`px-2 py-0.5 text-xs rounded ${typeMeta.color}`}>{typeMeta.label}</span>
{!check.is_active && (
<span className="px-2 py-0.5 text-xs rounded bg-gray-100 text-gray-500">Deaktiviert</span>
)}
</div>
{check.description && <p className="text-sm text-gray-500 mt-1">{check.description}</p>}
<div className="flex items-center gap-4 mt-2 text-xs text-gray-400">
<span>Code: {check.check_code}</span>
{check.target_url && <span>Ziel: {check.target_url}</span>}
<span>Frequenz: {check.frequency}</span>
{check.last_run_at && <span>Letzter Lauf: {new Date(check.last_run_at).toLocaleDateString('de-DE')}</span>}
</div>
</div>
<div className="flex items-center gap-2 ml-4">
{lastResult && (
<span className={`px-2 py-0.5 text-xs rounded ${RUN_STATUS_LABELS[lastResult.run_status]?.color || ''}`}>
{RUN_STATUS_LABELS[lastResult.run_status]?.label || lastResult.run_status}
</span>
)}
<button
onClick={async () => { await runCheck(check.id); loadCheckResults(check.id) }}
disabled={isRunning}
className="px-3 py-1.5 text-xs bg-purple-600 text-white rounded-lg hover:bg-purple-700 disabled:opacity-50"
>
{isRunning ? 'Laeuft...' : 'Ausfuehren'}
</button>
<button
onClick={() => loadCheckResults(check.id)}
className="px-3 py-1.5 text-xs border rounded-lg hover:bg-gray-50"
>
Historie
</button>
</div>
</div>
{/* Results */}
{results.length > 0 && (
<div className="mt-3 border-t pt-3">
<p className="text-xs font-medium text-gray-500 mb-2">Letzte Ergebnisse</p>
<div className="space-y-1">
{results.slice(0, 3).map(r => (
<div key={r.id} className="flex items-center gap-3 text-xs">
<span className={`px-1.5 py-0.5 rounded ${RUN_STATUS_LABELS[r.run_status]?.color || 'bg-gray-100'}`}>
{RUN_STATUS_LABELS[r.run_status]?.label || r.run_status}
</span>
<span className="text-gray-500">{new Date(r.run_at).toLocaleString('de-DE')}</span>
<span className="text-gray-400">{r.duration_ms}ms</span>
{r.findings_count > 0 && (
<span className="text-orange-600">{r.findings_count} Findings ({r.critical_findings} krit.)</span>
)}
{r.summary && <span className="text-gray-600 truncate">{r.summary}</span>}
</div>
))}
</div>
</div>
)}
</div>
)
})}
</div>
)}
</>
)}
{/* ============================================================ */}
{/* TAB: Control-Mapping */}
{/* ============================================================ */}
{activeTab === 'mapping' && (
<>
{coverageReport && (
<div className="grid grid-cols-1 md:grid-cols-4 gap-4">
<div className="bg-white rounded-xl border border-gray-200 p-6">
<p className="text-sm text-gray-500">Gesamt Controls</p>
<p className="text-3xl font-bold text-gray-900">{coverageReport.total_controls}</p>
</div>
<div className="bg-white rounded-xl border border-green-200 p-6">
<p className="text-sm text-green-600">Mit Nachweis</p>
<p className="text-3xl font-bold text-green-600">{coverageReport.controls_with_evidence}</p>
</div>
<div className="bg-white rounded-xl border border-red-200 p-6">
<p className="text-sm text-red-600">Ohne Nachweis</p>
<p className="text-3xl font-bold text-red-600">{coverageReport.controls_without_evidence}</p>
</div>
<div className="bg-white rounded-xl border border-purple-200 p-6">
<p className="text-sm text-purple-600">Abdeckung</p>
<p className="text-3xl font-bold text-purple-600">{coverageReport.coverage_percent.toFixed(0)}%</p>
</div>
</div>
)}
<div className="bg-white rounded-xl shadow-sm border overflow-hidden">
<div className="p-4 border-b">
<h3 className="font-semibold text-gray-900">Evidence-Control-Verknuepfungen ({mappings.length})</h3>
</div>
{mappings.length === 0 ? (
<div className="p-8 text-center text-gray-500">
<p>Noch keine Verknuepfungen erstellt.</p>
<p className="text-sm mt-1">Fuehren Sie automatische Checks aus, um Nachweise automatisch mit Controls zu verknuepfen.</p>
</div>
) : (
<table className="w-full">
<thead className="bg-gray-50">
<tr>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Control</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Evidence</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Typ</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-500 uppercase">Verifiziert</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-200">
{mappings.map(m => (
<tr key={m.id} className="hover:bg-gray-50">
<td className="px-4 py-3 font-mono text-sm text-purple-600">{m.control_code}</td>
<td className="px-4 py-3 text-sm text-gray-700">{m.evidence_id.slice(0, 8)}...</td>
<td className="px-4 py-3">
<span className="px-2 py-0.5 text-xs rounded bg-blue-100 text-blue-700">{m.mapping_type}</span>
</td>
<td className="px-4 py-3 text-sm text-gray-500">
{m.verified_at ? `${new Date(m.verified_at).toLocaleDateString('de-DE')} von ${m.verified_by || '—'}` : 'Ausstehend'}
</td>
</tr>
))}
</tbody>
</table>
)}
</div>
</>
)}
{/* ============================================================ */}
{/* TAB: Report */}
{/* ============================================================ */}
{activeTab === 'report' && (
<div className="bg-white rounded-xl shadow-sm border p-6">
<h3 className="text-lg font-semibold text-gray-900 mb-6">Evidence Coverage Report</h3>
{!coverageReport ? (
<div className="flex justify-center py-12">
<div className="animate-spin rounded-full h-8 w-8 border-b-2 border-purple-600" />
</div>
) : (
<>
{/* Coverage Bar */}
<div className="mb-8">
<div className="flex items-center justify-between mb-2">
<span className="text-sm font-medium text-gray-700">Gesamt-Abdeckung</span>
<span className={`text-2xl font-bold ${
coverageReport.coverage_percent >= 80 ? 'text-green-600' :
coverageReport.coverage_percent >= 50 ? 'text-yellow-600' : 'text-red-600'
}`}>
{coverageReport.coverage_percent.toFixed(1)}%
</span>
</div>
<div className="h-4 bg-gray-200 rounded-full overflow-hidden">
<div
className={`h-full transition-all duration-500 ${
coverageReport.coverage_percent >= 80 ? 'bg-green-500' :
coverageReport.coverage_percent >= 50 ? 'bg-yellow-500' : 'bg-red-500'
}`}
style={{ width: `${coverageReport.coverage_percent}%` }}
/>
</div>
</div>
{/* Summary */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-4 mb-8">
<div className="p-4 bg-gray-50 rounded-lg text-center">
<p className="text-3xl font-bold text-gray-900">{coverageReport.total_controls}</p>
<p className="text-sm text-gray-500">Controls gesamt</p>
</div>
<div className="p-4 bg-green-50 rounded-lg text-center">
<p className="text-3xl font-bold text-green-600">{coverageReport.controls_with_evidence}</p>
<p className="text-sm text-green-600">Mit Nachweis belegt</p>
</div>
<div className="p-4 bg-red-50 rounded-lg text-center">
<p className="text-3xl font-bold text-red-600">{coverageReport.controls_without_evidence}</p>
<p className="text-sm text-red-600">Ohne Nachweis</p>
</div>
</div>
{/* Check Summary */}
<div className="border-t pt-6">
<h4 className="font-medium text-gray-900 mb-3">Automatische Checks</h4>
<div className="flex items-center gap-4 text-sm text-gray-600">
<span>{checks.length} Check-Definitionen</span>
<span>{checks.filter(c => c.is_active).length} aktiv</span>
<span>{checks.filter(c => c.last_run_at).length} mindestens 1x ausgefuehrt</span>
</div>
</div>
{/* Evidence Summary */}
<div className="border-t pt-6 mt-6">
<h4 className="font-medium text-gray-900 mb-3">Nachweise</h4>
<div className="flex items-center gap-4 text-sm text-gray-600">
<span>{displayEvidence.length} Nachweise gesamt</span>
<span className="text-green-600">{validCount} gueltig</span>
<span className="text-red-600">{expiredCount} abgelaufen</span>
<span className="text-yellow-600">{pendingCount} ausstehend</span>
</div>
</div>
</>
)}
</div>
)}
</div>
)
}

File diff suppressed because it is too large

View File

@@ -280,6 +280,7 @@ func main() {
ragRoutes.GET("/regulations", ragHandlers.ListRegulations)
ragRoutes.GET("/corpus-status", ragHandlers.CorpusStatus)
ragRoutes.GET("/corpus-versions/:collection", ragHandlers.CorpusVersionHistory)
ragRoutes.GET("/scroll", ragHandlers.HandleScrollChunks)
}
// Roadmap routes - Compliance Implementation Roadmaps

View File

@@ -2,6 +2,7 @@ package handlers
import (
"net/http"
"strconv"
"github.com/breakpilot/ai-compliance-sdk/internal/ucca"
"github.com/gin-gonic/gin"
@@ -157,3 +158,47 @@ func (h *RAGHandlers) CorpusVersionHistory(c *gin.Context) {
"count": len(versions),
})
}
// HandleScrollChunks scrolls/lists all chunks in a Qdrant collection with pagination.
// GET /sdk/v1/rag/scroll?collection=...&offset=...&limit=...
func (h *RAGHandlers) HandleScrollChunks(c *gin.Context) {
collection := c.Query("collection")
if collection == "" {
c.JSON(http.StatusBadRequest, gin.H{"error": "query parameter 'collection' is required"})
return
}
if !AllowedCollections[collection] {
c.JSON(http.StatusBadRequest, gin.H{"error": "Unknown collection: " + collection})
return
}
// Parse limit (default 100, max 500)
limit := 100
if limitStr := c.Query("limit"); limitStr != "" {
parsed, err := strconv.Atoi(limitStr)
if err != nil || parsed < 1 {
c.JSON(http.StatusBadRequest, gin.H{"error": "limit must be a positive integer"})
return
}
limit = parsed
}
if limit > 500 {
limit = 500
}
// Offset is optional (empty string = start from beginning)
offset := c.Query("offset")
chunks, nextOffset, err := h.ragClient.ScrollChunks(c.Request.Context(), collection, offset, limit)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "scroll failed: " + err.Error()})
return
}
c.JSON(http.StatusOK, gin.H{
"chunks": chunks,
"next_offset": nextOffset,
"total": len(chunks),
})
}
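A client drains a collection by repeating the request with the returned `next_offset` until the server reports no further page. A minimal sketch of that loop (Python, decoupled from HTTP: `fetch_page` stands in for a GET against `/sdk/v1/rag/scroll` and is an assumption, not part of this diff):

```python
from typing import Callable, Iterator


def scroll_all(fetch_page: Callable[[str, int], dict], limit: int = 100) -> Iterator[dict]:
    """Yield every chunk by following next_offset until it comes back empty.

    fetch_page(offset, limit) is expected to return the handler's JSON shape:
    {"chunks": [...], "next_offset": "..."}.
    """
    offset = ""  # empty offset = start from the beginning
    while True:
        page = fetch_page(offset, limit)
        yield from page.get("chunks", [])
        offset = page.get("next_offset") or ""
        if not offset:
            break
```

In practice `fetch_page` would wrap an HTTP call that passes `collection`, `offset`, and `limit` as query parameters, matching the handler above.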

View File

@@ -91,6 +91,72 @@ func TestSearch_WithCollectionParam_BindsCorrectly(t *testing.T) {
}
}
func TestHandleScrollChunks_MissingCollection_Returns400(t *testing.T) {
gin.SetMode(gin.TestMode)
handler := &RAGHandlers{}
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request, _ = http.NewRequest("GET", "/sdk/v1/rag/scroll", nil)
handler.HandleScrollChunks(c)
if w.Code != http.StatusBadRequest {
t.Errorf("Expected 400, got %d", w.Code)
}
var resp map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
if resp["error"] == nil {
t.Error("Expected error message in response")
}
}
func TestHandleScrollChunks_InvalidCollection_Returns400(t *testing.T) {
gin.SetMode(gin.TestMode)
handler := &RAGHandlers{}
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request, _ = http.NewRequest("GET", "/sdk/v1/rag/scroll?collection=bp_evil_collection", nil)
handler.HandleScrollChunks(c)
if w.Code != http.StatusBadRequest {
t.Errorf("Expected 400, got %d", w.Code)
}
}
func TestHandleScrollChunks_InvalidLimit_Returns400(t *testing.T) {
gin.SetMode(gin.TestMode)
handler := &RAGHandlers{}
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request, _ = http.NewRequest("GET", "/sdk/v1/rag/scroll?collection=bp_compliance_ce&limit=abc", nil)
handler.HandleScrollChunks(c)
if w.Code != http.StatusBadRequest {
t.Errorf("Expected 400, got %d", w.Code)
}
}
func TestHandleScrollChunks_NegativeLimit_Returns400(t *testing.T) {
gin.SetMode(gin.TestMode)
handler := &RAGHandlers{}
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request, _ = http.NewRequest("GET", "/sdk/v1/rag/scroll?collection=bp_compliance_ce&limit=-5", nil)
handler.HandleScrollChunks(c)
if w.Code != http.StatusBadRequest {
t.Errorf("Expected 400, got %d", w.Code)
}
}
func TestSearch_EmptyCollection_IsAllowed(t *testing.T) {
// Empty collection should be allowed (falls back to default in the handler)
body := `{"query":"test"}`

View File

@@ -443,6 +443,142 @@ func (c *LegalRAGClient) FormatLegalContextForPrompt(lc *LegalContext) string {
return buf.String()
}
// ScrollChunkResult represents a single chunk from the scroll/list endpoint.
type ScrollChunkResult struct {
ID string `json:"id"`
Text string `json:"text"`
RegulationCode string `json:"regulation_code"`
RegulationName string `json:"regulation_name"`
RegulationShort string `json:"regulation_short"`
Category string `json:"category"`
Article string `json:"article,omitempty"`
Paragraph string `json:"paragraph,omitempty"`
SourceURL string `json:"source_url,omitempty"`
}
// qdrantScrollRequest for the Qdrant scroll API.
type qdrantScrollRequest struct {
Limit int `json:"limit"`
Offset interface{} `json:"offset,omitempty"` // integer point ID, UUID string, or null
WithPayload bool `json:"with_payload"`
WithVectors bool `json:"with_vectors"`
}
// qdrantScrollResponse from the Qdrant scroll API.
type qdrantScrollResponse struct {
Result struct {
Points []qdrantScrollPoint `json:"points"`
NextPageOffset interface{} `json:"next_page_offset"`
} `json:"result"`
}
type qdrantScrollPoint struct {
ID interface{} `json:"id"`
Payload map[string]interface{} `json:"payload"`
}
// ScrollChunks iterates over all chunks in a Qdrant collection using the scroll API.
// Pass an empty offset to start from the beginning. Returns chunks, next offset ID, and error.
func (c *LegalRAGClient) ScrollChunks(ctx context.Context, collection string, offset string, limit int) ([]ScrollChunkResult, string, error) {
scrollReq := qdrantScrollRequest{
Limit: limit,
WithPayload: true,
WithVectors: false,
}
if offset != "" {
// Qdrant point IDs may be integers or UUIDs: try to parse the offset as an
// integer first, fall back to the raw string for UUID-based collections
var offsetInt uint64
if _, err := fmt.Sscanf(offset, "%d", &offsetInt); err == nil {
scrollReq.Offset = offsetInt
} else {
scrollReq.Offset = offset
}
}
jsonBody, err := json.Marshal(scrollReq)
if err != nil {
return nil, "", fmt.Errorf("failed to marshal scroll request: %w", err)
}
url := fmt.Sprintf("%s/collections/%s/points/scroll", c.qdrantURL, collection)
req, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader(jsonBody))
if err != nil {
return nil, "", fmt.Errorf("failed to create scroll request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
if c.qdrantAPIKey != "" {
req.Header.Set("api-key", c.qdrantAPIKey)
}
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, "", fmt.Errorf("scroll request failed: %w", err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
return nil, "", fmt.Errorf("qdrant returned %d: %s", resp.StatusCode, string(body))
}
var scrollResp qdrantScrollResponse
if err := json.NewDecoder(resp.Body).Decode(&scrollResp); err != nil {
return nil, "", fmt.Errorf("failed to decode scroll response: %w", err)
}
// Convert points to results
chunks := make([]ScrollChunkResult, len(scrollResp.Result.Points))
for i, pt := range scrollResp.Result.Points {
// Extract point ID as string
pointID := ""
if pt.ID != nil {
pointID = fmt.Sprintf("%v", pt.ID)
}
chunks[i] = ScrollChunkResult{
ID: pointID,
Text: getString(pt.Payload, "text"),
RegulationCode: getString(pt.Payload, "regulation_code"),
RegulationName: getString(pt.Payload, "regulation_name"),
RegulationShort: getString(pt.Payload, "regulation_short"),
Category: getString(pt.Payload, "category"),
Article: getString(pt.Payload, "article"),
Paragraph: getString(pt.Payload, "paragraph"),
SourceURL: getString(pt.Payload, "source_url"),
}
// Fallback: try alternate payload field names used in ingestion
if chunks[i].Text == "" {
chunks[i].Text = getString(pt.Payload, "chunk_text")
}
if chunks[i].RegulationCode == "" {
chunks[i].RegulationCode = getString(pt.Payload, "regulation_id")
}
if chunks[i].RegulationName == "" {
chunks[i].RegulationName = getString(pt.Payload, "regulation_name_de")
}
if chunks[i].SourceURL == "" {
chunks[i].SourceURL = getString(pt.Payload, "source")
}
}
// Extract next offset; Qdrant may return an integer or a UUID string point ID
nextOffset := ""
if scrollResp.Result.NextPageOffset != nil {
switch v := scrollResp.Result.NextPageOffset.(type) {
case float64:
nextOffset = fmt.Sprintf("%.0f", v)
case string:
nextOffset = v
default:
nextOffset = fmt.Sprintf("%v", v)
}
}
return chunks, nextOffset, nil
}
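The payload fallbacks above (`text` vs. `chunk_text`, `regulation_code` vs. `regulation_id`, and so on) are a first-non-empty lookup over alias chains. The same mapping can be sketched declaratively; the alias pairs are taken directly from the Go code, everything else is illustrative:

```python
def first_string(payload: dict, *keys: str) -> str:
    """Return the first non-empty string value among the given payload keys."""
    for key in keys:
        value = payload.get(key)
        if isinstance(value, str) and value:
            return value
    return ""


# Alias chains mirroring the fallbacks in ScrollChunks: the primary field
# name first, then the alternate name used during ingestion.
CHUNK_FIELD_ALIASES = {
    "text": ("text", "chunk_text"),
    "regulation_code": ("regulation_code", "regulation_id"),
    "regulation_name": ("regulation_name", "regulation_name_de"),
    "source_url": ("source_url", "source"),
}


def map_chunk_payload(payload: dict) -> dict:
    """Resolve a Qdrant payload to canonical chunk fields."""
    return {field: first_string(payload, *aliases)
            for field, aliases in CHUNK_FIELD_ALIASES.items()}
```

Keeping the aliases in one table avoids the repeated "if empty, try the other name" blocks when a third field variant appears.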
// Helper functions
func getString(m map[string]interface{}, key string) string {

View File

@@ -87,6 +87,197 @@ func TestSearchCollection_FallbackDefault(t *testing.T) {
}
}
func TestScrollChunks_ReturnsChunksAndNextOffset(t *testing.T) {
qdrantMock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if !strings.Contains(r.URL.Path, "/points/scroll") {
t.Errorf("Expected scroll endpoint, got: %s", r.URL.Path)
}
// Decode request to verify fields
var reqBody map[string]interface{}
json.NewDecoder(r.Body).Decode(&reqBody)
if reqBody["with_vectors"] != false {
t.Error("Expected with_vectors=false")
}
if reqBody["with_payload"] != true {
t.Error("Expected with_payload=true")
}
resp := map[string]interface{}{
"result": map[string]interface{}{
"points": []map[string]interface{}{
{
"id": "abc-123",
"payload": map[string]interface{}{
"text": "Artikel 35 DSGVO",
"regulation_code": "eu_2016_679",
"regulation_name": "DSGVO",
"regulation_short": "DSGVO",
"category": "regulation",
"article": "Art. 35",
"paragraph": "1",
"source_url": "https://example.com/dsgvo",
},
},
{
"id": "def-456",
"payload": map[string]interface{}{
"chunk_text": "AI Act Titel III",
"regulation_id": "eu_2024_1689",
"regulation_name_de": "KI-Verordnung",
"regulation_short": "AI Act",
"category": "regulation",
"source": "https://example.com/ai-act",
},
},
},
"next_page_offset": "def-456",
},
}
json.NewEncoder(w).Encode(resp)
}))
defer qdrantMock.Close()
client := &LegalRAGClient{
qdrantURL: qdrantMock.URL,
httpClient: http.DefaultClient,
}
chunks, nextOffset, err := client.ScrollChunks(context.Background(), "bp_compliance_ce", "", 100)
if err != nil {
t.Fatalf("ScrollChunks failed: %v", err)
}
if len(chunks) != 2 {
t.Fatalf("Expected 2 chunks, got %d", len(chunks))
}
// First chunk uses direct field names
if chunks[0].ID != "abc-123" {
t.Errorf("Expected ID abc-123, got %s", chunks[0].ID)
}
if chunks[0].Text != "Artikel 35 DSGVO" {
t.Errorf("Expected text 'Artikel 35 DSGVO', got '%s'", chunks[0].Text)
}
if chunks[0].RegulationCode != "eu_2016_679" {
t.Errorf("Expected regulation_code eu_2016_679, got %s", chunks[0].RegulationCode)
}
if chunks[0].Article != "Art. 35" {
t.Errorf("Expected article 'Art. 35', got '%s'", chunks[0].Article)
}
// Second chunk uses fallback field names (chunk_text, regulation_id, etc.)
if chunks[1].Text != "AI Act Titel III" {
t.Errorf("Expected fallback text 'AI Act Titel III', got '%s'", chunks[1].Text)
}
if chunks[1].RegulationCode != "eu_2024_1689" {
t.Errorf("Expected fallback regulation_code eu_2024_1689, got '%s'", chunks[1].RegulationCode)
}
if chunks[1].RegulationName != "KI-Verordnung" {
t.Errorf("Expected fallback regulation_name 'KI-Verordnung', got '%s'", chunks[1].RegulationName)
}
if nextOffset != "def-456" {
t.Errorf("Expected next_offset 'def-456', got '%s'", nextOffset)
}
}
func TestScrollChunks_EmptyCollection_ReturnsEmpty(t *testing.T) {
qdrantMock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
resp := map[string]interface{}{
"result": map[string]interface{}{
"points": []interface{}{},
"next_page_offset": nil,
},
}
json.NewEncoder(w).Encode(resp)
}))
defer qdrantMock.Close()
client := &LegalRAGClient{
qdrantURL: qdrantMock.URL,
httpClient: http.DefaultClient,
}
chunks, nextOffset, err := client.ScrollChunks(context.Background(), "bp_compliance_ce", "", 100)
if err != nil {
t.Fatalf("ScrollChunks failed: %v", err)
}
if len(chunks) != 0 {
t.Errorf("Expected 0 chunks, got %d", len(chunks))
}
if nextOffset != "" {
t.Errorf("Expected empty next_offset, got '%s'", nextOffset)
}
}
func TestScrollChunks_WithOffset_SendsOffset(t *testing.T) {
var receivedBody map[string]interface{}
qdrantMock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
json.NewDecoder(r.Body).Decode(&receivedBody)
resp := map[string]interface{}{
"result": map[string]interface{}{
"points": []interface{}{},
"next_page_offset": nil,
},
}
json.NewEncoder(w).Encode(resp)
}))
defer qdrantMock.Close()
client := &LegalRAGClient{
qdrantURL: qdrantMock.URL,
httpClient: http.DefaultClient,
}
_, _, err := client.ScrollChunks(context.Background(), "bp_compliance_ce", "some-offset-id", 50)
if err != nil {
t.Fatalf("ScrollChunks failed: %v", err)
}
if receivedBody["offset"] != "some-offset-id" {
t.Errorf("Expected offset 'some-offset-id', got '%v'", receivedBody["offset"])
}
if receivedBody["limit"] != float64(50) {
t.Errorf("Expected limit 50, got %v", receivedBody["limit"])
}
}
func TestScrollChunks_SendsAPIKey(t *testing.T) {
var receivedAPIKey string
qdrantMock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
receivedAPIKey = r.Header.Get("api-key")
resp := map[string]interface{}{
"result": map[string]interface{}{
"points": []interface{}{},
"next_page_offset": nil,
},
}
json.NewEncoder(w).Encode(resp)
}))
defer qdrantMock.Close()
client := &LegalRAGClient{
qdrantURL: qdrantMock.URL,
qdrantAPIKey: "test-api-key-123",
httpClient: http.DefaultClient,
}
_, _, err := client.ScrollChunks(context.Background(), "bp_compliance_ce", "", 10)
if err != nil {
t.Fatalf("ScrollChunks failed: %v", err)
}
if receivedAPIKey != "test-api-key-123" {
t.Errorf("Expected api-key 'test-api-key-123', got '%s'", receivedAPIKey)
}
}
func TestSearch_StillWorks(t *testing.T) {
var requestedURL string

View File

@@ -53,6 +53,8 @@ _ROUTER_MODULES = [
"wiki_routes",
"canonical_control_routes",
"control_generator_routes",
"process_task_routes",
"evidence_check_routes",
]
_loaded_count = 0

View File

@@ -78,6 +78,7 @@ class ControlResponse(BaseModel):
customer_visible: Optional[bool] = None
verification_method: Optional[str] = None
category: Optional[str] = None
target_audience: Optional[str] = None
generation_metadata: Optional[dict] = None
created_at: str
updated_at: str
@@ -106,6 +107,7 @@ class ControlCreateRequest(BaseModel):
customer_visible: Optional[bool] = True
verification_method: Optional[str] = None
category: Optional[str] = None
target_audience: Optional[str] = None
generation_metadata: Optional[dict] = None
@@ -130,6 +132,7 @@ class ControlUpdateRequest(BaseModel):
customer_visible: Optional[bool] = None
verification_method: Optional[str] = None
category: Optional[str] = None
target_audience: Optional[str] = None
generation_metadata: Optional[dict] = None
@@ -158,7 +161,7 @@ _CONTROL_COLS = """id, framework_id, control_id, title, objective, rationale,
evidence_confidence, open_anchors, release_state, tags,
license_rule, source_original_text, source_citation,
customer_visible, verification_method, category,
generation_metadata,
target_audience, generation_metadata,
created_at, updated_at"""
@@ -241,6 +244,7 @@ async def list_framework_controls(
release_state: Optional[str] = Query(None),
verification_method: Optional[str] = Query(None),
category: Optional[str] = Query(None),
target_audience: Optional[str] = Query(None),
):
"""List controls belonging to a framework."""
with SessionLocal() as db:
@@ -271,6 +275,9 @@ async def list_framework_controls(
if category:
query += " AND category = :cat"
params["cat"] = category
if target_audience:
query += " AND target_audience = :ta"
params["ta"] = target_audience
query += " ORDER BY control_id"
rows = db.execute(text(query), params).fetchall()
@@ -289,6 +296,7 @@ async def list_controls(
release_state: Optional[str] = Query(None),
verification_method: Optional[str] = Query(None),
category: Optional[str] = Query(None),
target_audience: Optional[str] = Query(None),
):
"""List all canonical controls, with optional filters."""
query = f"""
@@ -313,6 +321,9 @@ async def list_controls(
if category:
query += " AND category = :cat"
params["cat"] = category
if target_audience:
query += " AND target_audience = :ta"
params["ta"] = target_audience
query += " ORDER BY control_id"
@@ -384,7 +395,7 @@ async def create_control(body: ControlCreateRequest):
open_anchors, release_state, tags,
license_rule, source_original_text, source_citation,
customer_visible, verification_method, category,
generation_metadata
target_audience, generation_metadata
) VALUES (
:fw_id, :cid, :title, :objective, :rationale,
CAST(:scope AS jsonb), CAST(:requirements AS jsonb),
@@ -394,7 +405,7 @@ async def create_control(body: ControlCreateRequest):
:license_rule, :source_original_text,
CAST(:source_citation AS jsonb),
:customer_visible, :verification_method, :category,
CAST(:generation_metadata AS jsonb)
:target_audience, CAST(:generation_metadata AS jsonb)
)
RETURNING {_CONTROL_COLS}
"""),
@@ -421,6 +432,7 @@ async def create_control(body: ControlCreateRequest):
"customer_visible": body.customer_visible,
"verification_method": body.verification_method,
"category": body.category,
"target_audience": body.target_audience,
"generation_metadata": _json.dumps(body.generation_metadata) if body.generation_metadata else None,
},
).fetchone()
@@ -647,6 +659,7 @@ def _control_row(r) -> dict:
"customer_visible": r.customer_visible,
"verification_method": r.verification_method,
"category": r.category,
"target_audience": r.target_audience,
"generation_metadata": r.generation_metadata,
"created_at": r.created_at.isoformat() if r.created_at else None,
"updated_at": r.updated_at.isoformat() if r.updated_at else None,

View File

@@ -12,6 +12,7 @@ Endpoints:
POST /v1/canonical/blocked-sources/cleanup — Start cleanup workflow
"""
import asyncio
import json
import logging
from typing import Optional, List
@@ -89,9 +90,42 @@ class BlockedSourceResponse(BaseModel):
# ENDPOINTS
# =============================================================================
async def _run_pipeline_background(config: GeneratorConfig, job_id: str):
"""Run the pipeline in the background. Uses its own DB session."""
db = SessionLocal()
try:
config.existing_job_id = job_id
pipeline = ControlGeneratorPipeline(db=db, rag_client=get_rag_client())
result = await pipeline.run(config)
logger.info(
"Background generation job %s completed: %d controls from %d chunks",
job_id, result.controls_generated, result.total_chunks_scanned,
)
except Exception as e:
logger.error("Background generation job %s failed: %s", job_id, e)
# Update job as failed
try:
db.execute(
text("""
UPDATE canonical_generation_jobs
SET status = 'failed', errors = :errors, completed_at = NOW()
WHERE id = CAST(:job_id AS uuid)
"""),
{"job_id": job_id, "errors": json.dumps([str(e)])},
)
db.commit()
except Exception:
pass
finally:
db.close()
@router.post("/generate", response_model=GenerateResponse)
async def start_generation(req: GenerateRequest):
"""Start a control generation run."""
"""Start a control generation run (runs in background).
Returns immediately with job_id. Use GET /generate/status/{job_id} to poll progress.
"""
config = GeneratorConfig(
collections=req.collections,
domain=req.domain,
@@ -101,30 +135,63 @@ async def start_generation(req: GenerateRequest):
dry_run=req.dry_run,
)
if req.dry_run:
# Dry run: execute synchronously and return controls
db = SessionLocal()
try:
pipeline = ControlGeneratorPipeline(db=db, rag_client=get_rag_client())
result = await pipeline.run(config)
return GenerateResponse(
job_id=result.job_id,
status=result.status,
message=f"Dry run: {result.controls_generated} controls from {result.total_chunks_scanned} chunks",
total_chunks_scanned=result.total_chunks_scanned,
controls_generated=result.controls_generated,
controls_verified=result.controls_verified,
controls_needs_review=result.controls_needs_review,
controls_too_close=result.controls_too_close,
controls_duplicates_found=result.controls_duplicates_found,
errors=result.errors,
controls=result.controls,
)
except Exception as e:
logger.error("Dry run failed: %s", e)
raise HTTPException(status_code=500, detail=str(e))
finally:
db.close()
# Create job record first so we can return the ID
db = SessionLocal()
try:
pipeline = ControlGeneratorPipeline(db=db, rag_client=get_rag_client())
result = await pipeline.run(config)
return GenerateResponse(
job_id=result.job_id,
status=result.status,
message=f"Generated {result.controls_generated} controls from {result.total_chunks_scanned} chunks",
total_chunks_scanned=result.total_chunks_scanned,
controls_generated=result.controls_generated,
controls_verified=result.controls_verified,
controls_needs_review=result.controls_needs_review,
controls_too_close=result.controls_too_close,
controls_duplicates_found=result.controls_duplicates_found,
errors=result.errors,
controls=result.controls if req.dry_run else [],
result = db.execute(
text("""
INSERT INTO canonical_generation_jobs (status, config)
VALUES ('running', :config)
RETURNING id
"""),
{"config": json.dumps(config.model_dump())},
)
db.commit()
row = result.fetchone()
job_id = str(row[0]) if row else None
except Exception as e:
logger.error("Generation failed: %s", e)
raise HTTPException(status_code=500, detail=str(e))
logger.error("Failed to create job: %s", e)
raise HTTPException(status_code=500, detail=f"Failed to create job: {e}")
finally:
db.close()
if not job_id:
raise HTTPException(status_code=500, detail="Failed to create job record")
# Launch pipeline in background
asyncio.create_task(_run_pipeline_background(config, job_id))
return GenerateResponse(
job_id=job_id,
status="running",
message="Generation started in background. Poll /generate/status/{job_id} for progress.",
)
@router.get("/generate/status/{job_id}")
async def get_job_status(job_id: str):

View File

@@ -5,16 +5,23 @@ Endpoints:
- /dashboard: Main compliance dashboard
- /dashboard/executive: Executive summary for managers
- /dashboard/trend: Compliance score trend over time
- /dashboard/roadmap: Prioritised controls in 4 buckets
- /dashboard/module-status: Completion status of each SDK module
- /dashboard/next-actions: Top 5 most important actions
- /dashboard/snapshot: Save / query compliance score snapshots
- /score: Quick compliance score
- /reports: Report generation
"""
import logging
from datetime import datetime, timedelta
from datetime import datetime, date, timedelta
from calendar import month_abbr
from typing import Optional
from typing import Optional, Dict, Any, List
from decimal import Decimal
from fastapi import APIRouter, Depends, HTTPException, Query
from pydantic import BaseModel
from sqlalchemy import text
from sqlalchemy.orm import Session
from classroom_engine.database import get_db
@@ -34,6 +41,8 @@ from .schemas import (
DeadlineItem,
TeamWorkloadItem,
)
from .tenant_utils import get_tenant_id as _get_tenant_id
from .db_utils import row_to_dict as _row_to_dict
logger = logging.getLogger(__name__)
router = APIRouter(tags=["compliance-dashboard"])
@@ -322,6 +331,272 @@ async def get_compliance_trend(
}
# ============================================================================
# Dashboard Extended — Roadmap, Module-Status, Next-Actions, Snapshots
# ============================================================================
# Weight map for control prioritisation
_PRIORITY_WEIGHTS = {"legal": 5, "security": 3, "best_practice": 1, "operational": 2}
# SDK module definitions → DB table used for counting completion
_MODULE_DEFS: List[Dict[str, str]] = [
{"key": "vvt", "label": "VVT", "table": "compliance_vvt_activities"},
{"key": "tom", "label": "TOM", "table": "compliance_toms"},
{"key": "dsfa", "label": "DSFA", "table": "compliance_dsfa_assessments"},
{"key": "loeschfristen", "label": "Loeschfristen", "table": "compliance_loeschfristen"},
{"key": "risks", "label": "Risiken", "table": "compliance_risks"},
{"key": "controls", "label": "Controls", "table": "compliance_controls"},
{"key": "evidence", "label": "Nachweise", "table": "compliance_evidence"},
{"key": "obligations", "label": "Pflichten", "table": "compliance_obligations"},
{"key": "incidents", "label": "Vorfaelle", "table": "compliance_notfallplan_incidents"},
{"key": "vendor", "label": "Auftragsverarbeiter", "table": "compliance_vendor_assessments"},
{"key": "legal_templates", "label": "Rechtl. Dokumente", "table": "compliance_legal_templates"},
{"key": "training", "label": "Schulungen", "table": "training_modules"},
{"key": "audit", "label": "Audit", "table": "compliance_audit_sessions"},
{"key": "security_backlog", "label": "Security-Backlog", "table": "compliance_security_backlog"},
{"key": "quality", "label": "Qualitaet", "table": "compliance_quality_items"},
]
@router.get("/dashboard/roadmap")
async def get_dashboard_roadmap(
db: Session = Depends(get_db),
tenant_id: str = Depends(_get_tenant_id),
):
"""Prioritised controls in 4 buckets: Quick Wins, Must Have, Should Have, Nice to Have."""
ctrl_repo = ControlRepository(db)
controls = ctrl_repo.get_all()
today = datetime.utcnow().date()
buckets: Dict[str, list] = {
"quick_wins": [],
"must_have": [],
"should_have": [],
"nice_to_have": [],
}
for ctrl in controls:
status = ctrl.status.value if ctrl.status else "planned"
if status == "pass":
continue # already done
weight = _PRIORITY_WEIGHTS.get(ctrl.category if hasattr(ctrl, "category") else "best_practice", 1)
days_overdue = 0
if ctrl.next_review_at:
review_date = ctrl.next_review_at.date() if hasattr(ctrl.next_review_at, "date") else ctrl.next_review_at
days_overdue = (today - review_date).days
item = {
"id": str(ctrl.id),
"control_id": ctrl.control_id,
"title": ctrl.title,
"status": status,
"domain": ctrl.domain.value if ctrl.domain else "unknown",
"owner": ctrl.owner,
"next_review_at": ctrl.next_review_at.isoformat() if ctrl.next_review_at else None,
"days_overdue": max(0, days_overdue),
"weight": weight,
}
if weight >= 5 and days_overdue > 0:
buckets["quick_wins"].append(item)
elif weight >= 4:
buckets["must_have"].append(item)
elif weight >= 2:
buckets["should_have"].append(item)
else:
buckets["nice_to_have"].append(item)
# Sort each bucket by days overdue, most overdue first
for key in buckets:
buckets[key].sort(key=lambda x: x["days_overdue"], reverse=True)
return {
"buckets": buckets,
"counts": {k: len(v) for k, v in buckets.items()},
"generated_at": datetime.utcnow().isoformat(),
}
@router.get("/dashboard/module-status")
async def get_module_status(
db: Session = Depends(get_db),
tenant_id: str = Depends(_get_tenant_id),
):
"""Completion status for each SDK module based on DB record counts."""
modules = []
for mod in _MODULE_DEFS:
try:
row = db.execute(
text(f"SELECT COUNT(*) FROM {mod['table']} WHERE tenant_id = :tid"),
{"tid": tenant_id},
).fetchone()
count = int(row[0]) if row else 0
except Exception:
count = 0
# Simple heuristic: 0 = not started, 1-2 = in progress, 3+ = complete
if count == 0:
status = "not_started"
progress = 0
elif count < 3:
status = "in_progress"
progress = min(60, count * 30)
else:
status = "complete"
progress = 100
modules.append({
"key": mod["key"],
"label": mod["label"],
"count": count,
"status": status,
"progress": progress,
})
started = sum(1 for m in modules if m["status"] != "not_started")
complete = sum(1 for m in modules if m["status"] == "complete")
return {
"modules": modules,
"total": len(modules),
"started": started,
"complete": complete,
"overall_progress": round((complete / len(modules)) * 100, 1) if modules else 0,
}
@router.get("/dashboard/next-actions")
async def get_next_actions(
limit: int = Query(5, ge=1, le=20),
db: Session = Depends(get_db),
tenant_id: str = Depends(_get_tenant_id),
):
"""Top N most important actions sorted by urgency*impact."""
ctrl_repo = ControlRepository(db)
controls = ctrl_repo.get_all()
today = datetime.utcnow().date()
actions = []
for ctrl in controls:
status = ctrl.status.value if ctrl.status else "planned"
if status == "pass":
continue
days_overdue = 0
if ctrl.next_review_at:
review_date = ctrl.next_review_at.date() if hasattr(ctrl.next_review_at, "date") else ctrl.next_review_at
days_overdue = max(0, (today - review_date).days)
weight = _PRIORITY_WEIGHTS.get(ctrl.category if hasattr(ctrl, "category") else "best_practice", 1)
urgency_score = weight * 10 + days_overdue
actions.append({
"id": str(ctrl.id),
"control_id": ctrl.control_id,
"title": ctrl.title,
"status": status,
"domain": ctrl.domain.value if ctrl.domain else "unknown",
"owner": ctrl.owner,
"days_overdue": days_overdue,
"urgency_score": urgency_score,
"reason": "Ueberfaellig" if days_overdue > 0 else "Offen",
})
actions.sort(key=lambda x: x["urgency_score"], reverse=True)
return {"actions": actions[:limit]}
@router.post("/dashboard/snapshot")
async def create_score_snapshot(
db: Session = Depends(get_db),
tenant_id: str = Depends(_get_tenant_id),
):
"""Save current compliance score as a historical snapshot."""
ctrl_repo = ControlRepository(db)
evidence_repo = EvidenceRepository(db)
risk_repo = RiskRepository(db)
ctrl_stats = ctrl_repo.get_statistics()
evidence_stats = evidence_repo.get_statistics()
risks = risk_repo.get_all()
total = ctrl_stats.get("total", 0)
passing = ctrl_stats.get("pass", 0)
partial = ctrl_stats.get("partial", 0)
score = round(((passing + partial * 0.5) / total) * 100, 2) if total > 0 else 0
risks_high = sum(1 for r in risks if (r.inherent_risk.value if r.inherent_risk else "low") in ("high", "critical"))
today = date.today()
row = db.execute(text("""
INSERT INTO compliance_score_snapshots (
tenant_id, score, controls_total, controls_pass, controls_partial,
evidence_total, evidence_valid, risks_total, risks_high, snapshot_date
) VALUES (
:tenant_id, :score, :controls_total, :controls_pass, :controls_partial,
:evidence_total, :evidence_valid, :risks_total, :risks_high, :snapshot_date
)
ON CONFLICT (tenant_id, project_id, snapshot_date) DO UPDATE SET
score = EXCLUDED.score,
controls_total = EXCLUDED.controls_total,
controls_pass = EXCLUDED.controls_pass,
controls_partial = EXCLUDED.controls_partial,
evidence_total = EXCLUDED.evidence_total,
evidence_valid = EXCLUDED.evidence_valid,
risks_total = EXCLUDED.risks_total,
risks_high = EXCLUDED.risks_high
RETURNING *
"""), {
"tenant_id": tenant_id,
"score": score,
"controls_total": total,
"controls_pass": passing,
"controls_partial": partial,
"evidence_total": evidence_stats.get("total", 0),
"evidence_valid": evidence_stats.get("by_status", {}).get("valid", 0),
"risks_total": len(risks),
"risks_high": risks_high,
"snapshot_date": today,
}).fetchone()
db.commit()
return _row_to_dict(row)
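The snapshot score above gives full credit to passing controls and half credit to partial ones. Isolated as a standalone function (same formula and rounding as the endpoint):

```python
def compliance_score(total: int, passing: int, partial: int) -> float:
    """Score used by the snapshot endpoint: full credit for 'pass',
    half credit for 'partial', as a percentage of all controls."""
    if total <= 0:
        return 0.0
    return round(((passing + partial * 0.5) / total) * 100, 2)
```

So 4 passing and 2 partial controls out of 10 score 50%, the same as 5 fully passing controls.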
@router.get("/dashboard/score-history")
async def get_score_history(
months: int = Query(12, ge=1, le=36),
db: Session = Depends(get_db),
tenant_id: str = Depends(_get_tenant_id),
):
"""Get compliance score history from snapshots."""
since = date.today() - timedelta(days=months * 30)
rows = db.execute(text("""
SELECT * FROM compliance_score_snapshots
WHERE tenant_id = :tenant_id AND snapshot_date >= :since
ORDER BY snapshot_date ASC
"""), {"tenant_id": tenant_id, "since": since}).fetchall()
snapshots = []
for r in rows:
d = _row_to_dict(r)
# Convert Decimal to float for JSON
if isinstance(d.get("score"), Decimal):
d["score"] = float(d["score"])
snapshots.append(d)
return {
"snapshots": snapshots,
"total": len(snapshots),
"period_months": months,
}
# ============================================================================
# Reports
# ============================================================================

File diff suppressed because it is too large


@@ -50,6 +50,14 @@ VALID_DOCUMENT_TYPES = {
"cookie_banner",
"agb",
"clause",
# Security document templates (Migration 051)
"it_security_concept",
"data_protection_concept",
"backup_recovery_concept",
"logging_concept",
"incident_response_plan",
"access_control_concept",
"risk_management_concept",
}
VALID_STATUSES = {"published", "draft", "archived"}

File diff suppressed because it is too large


@@ -46,7 +46,7 @@ EMBEDDING_URL = os.getenv("EMBEDDING_URL", "http://embedding-service:8087")
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY", "")
ANTHROPIC_MODEL = os.getenv("CONTROL_GEN_ANTHROPIC_MODEL", "claude-sonnet-4-6")
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://host.docker.internal:11434")
OLLAMA_MODEL = os.getenv("CONTROL_GEN_OLLAMA_MODEL", "qwen3:30b-a3b")
OLLAMA_MODEL = os.getenv("CONTROL_GEN_OLLAMA_MODEL", "qwen3.5:35b-a3b")
LLM_TIMEOUT = float(os.getenv("CONTROL_GEN_LLM_TIMEOUT", "120"))
HARMONIZATION_THRESHOLD = 0.85 # Cosine similarity above this = duplicate
@@ -157,6 +157,9 @@ REGULATION_LICENSE_MAP: dict[str, dict] = {
"edpb_transfers_07_2020":{"license": "EU_PUBLIC", "rule": 1, "name": "EDPB Transfers 07/2020"},
"edpb_video_03_2019": {"license": "EU_PUBLIC", "rule": 1, "name": "EDPB Video Surveillance"},
"edps_dpia_list": {"license": "EU_PUBLIC", "rule": 1, "name": "EDPS DPIA Liste"},
"edpb_certification_01_2018": {"license": "EU_PUBLIC", "rule": 1, "name": "EDPB Certification 01/2018"},
"edpb_certification_01_2019": {"license": "EU_PUBLIC", "rule": 1, "name": "EDPB Certification 01/2019"},
"eaa": {"license": "EU_LAW", "rule": 1, "name": "European Accessibility Act"},
# WP29 (pre-EDPB) Guidelines
"wp244_profiling": {"license": "EU_PUBLIC", "rule": 1, "name": "WP29 Profiling"},
"wp251_profiling": {"license": "EU_PUBLIC", "rule": 1, "name": "WP29 Data Portability"},
@@ -240,6 +243,83 @@ DOMAIN_KEYWORDS = {
}
CATEGORY_KEYWORDS = {
"encryption": ["encryption", "cryptography", "tls", "ssl", "certificate", "hashing",
"aes", "rsa", "verschlüsselung", "kryptographie", "zertifikat", "cipher"],
"authentication": ["authentication", "login", "password", "credential", "mfa", "2fa",
"session", "oauth", "authentifizierung", "anmeldung", "passwort"],
"network": ["network", "firewall", "dns", "vpn", "proxy", "segmentation",
"netzwerk", "routing", "port", "intrusion", "ids", "ips"],
"data_protection": ["data protection", "privacy", "personal data", "datenschutz",
"personenbezogen", "dsgvo", "gdpr", "löschung", "verarbeitung", "einwilligung"],
"logging": ["logging", "monitoring", "audit trail", "siem", "alert", "anomaly",
"protokollierung", "überwachung", "nachvollziehbar"],
"incident": ["incident", "response", "breach", "recovery", "vorfall", "sicherheitsvorfall"],
"continuity": ["backup", "disaster recovery", "notfall", "wiederherstellung", "notfallplan",
"business continuity", "ausfallsicherheit"],
"compliance": ["compliance", "audit", "regulation", "certification", "konformität",
"prüfung", "zertifizierung", "nachweis"],
"supply_chain": ["supplier", "vendor", "third party", "lieferant", "auftragnehmer",
"unterauftragnehmer", "supply chain", "dienstleister"],
"physical": ["physical", "building", "access zone", "physisch", "gebäude", "zutritt",
"schließsystem", "rechenzentrum"],
"personnel": ["training", "awareness", "employee", "schulung", "mitarbeiter",
"sensibilisierung", "personal", "unterweisung"],
"application": ["application", "software", "code review", "sdlc", "secure coding",
"anwendung", "entwicklung", "software-entwicklung", "api"],
"system": ["hardening", "patch", "configuration", "update", "härtung", "konfiguration",
"betriebssystem", "system", "server"],
"risk": ["risk assessment", "risk management", "risiko", "bewertung", "risikobewertung",
"risikoanalyse", "bedrohung", "threat"],
"governance": ["governance", "policy", "organization", "isms", "sicherheitsorganisation",
"richtlinie", "verantwortlichkeit", "rolle"],
"hardware": ["hardware", "platform", "firmware", "bios", "tpm", "chip",
"plattform", "geräte"],
"identity": ["identity", "iam", "directory", "ldap", "sso", "provisioning",
"identität", "identitätsmanagement", "benutzerverzeichnis"],
}
VERIFICATION_KEYWORDS = {
"code_review": ["source code", "code review", "static analysis", "sast", "dast",
"dependency check", "quellcode", "codeanalyse", "secure coding",
"software development", "api", "input validation", "output encoding"],
"document": ["policy", "procedure", "documentation", "training", "awareness",
"richtlinie", "dokumentation", "schulung", "nachweis", "vertrag",
"organizational", "process", "role", "responsibility"],
"tool": ["scanner", "monitoring", "siem", "ids", "ips", "firewall", "antivirus",
"vulnerability scan", "penetration test", "tool", "automated"],
"hybrid": [], # Assigned when multiple methods match equally
}
def _detect_category(text: str) -> Optional[str]:
"""Detect the most likely category from text content."""
text_lower = text.lower()
scores: dict[str, int] = {}
for cat, keywords in CATEGORY_KEYWORDS.items():
scores[cat] = sum(1 for kw in keywords if kw in text_lower)
if not scores or max(scores.values()) == 0:
return None
return max(scores, key=scores.get)
def _detect_verification_method(text: str) -> Optional[str]:
"""Detect verification method from text content."""
text_lower = text.lower()
scores: dict[str, int] = {}
for method, keywords in VERIFICATION_KEYWORDS.items():
if method == "hybrid":
continue
scores[method] = sum(1 for kw in keywords if kw in text_lower)
if not scores or max(scores.values()) == 0:
return None
top = sorted(scores.items(), key=lambda x: -x[1])
# If top two are close, it's hybrid
if len(top) >= 2 and top[0][1] > 0 and top[1][1] > 0 and top[1][1] >= top[0][1] * 0.7:
return "hybrid"
return top[0][0] if top[0][1] > 0 else None
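Both detectors share the same keyword-vote pattern; `_detect_verification_method` additionally returns `hybrid` when the runner-up scores at least 70% of the winner. A standalone sketch with a toy keyword map (the real maps are `CATEGORY_KEYWORDS` / `VERIFICATION_KEYWORDS` above):

```python
from typing import Optional

# Toy keyword map for illustration only
KEYWORDS = {
    "document": ["policy", "richtlinie", "dokumentation"],
    "tool": ["scanner", "siem", "firewall"],
    "code_review": ["source code", "sast"],
}

def detect(text: str) -> Optional[str]:
    """Keyword-vote classifier; top two within 70% of each other -> 'hybrid'."""
    tl = text.lower()
    scores = {m: sum(1 for kw in kws if kw in tl) for m, kws in KEYWORDS.items()}
    top = sorted(scores.items(), key=lambda x: -x[1])
    if top[0][1] == 0:
        return None
    if len(top) >= 2 and top[1][1] > 0 and top[1][1] >= top[0][1] * 0.7:
        return "hybrid"
    return top[0][0]

# "policy" votes document, "scanner" votes tool -> tie within 70% -> hybrid
print(detect("The policy requires a vulnerability scanner"))  # hybrid
```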
def _detect_domain(text: str) -> str:
"""Detect the most likely domain from text content."""
text_lower = text.lower()
@@ -259,10 +339,11 @@ class GeneratorConfig(BaseModel):
collections: Optional[List[str]] = None
domain: Optional[str] = None
batch_size: int = 5
max_controls: int = 50
max_controls: int = 0 # 0 = unlimited (process ALL chunks)
skip_processed: bool = True
skip_web_search: bool = False
dry_run: bool = False
existing_job_id: Optional[str] = None # If set, reuse this job instead of creating a new one
@dataclass
@@ -287,6 +368,9 @@ class GeneratedControl:
source_citation: Optional[dict] = None
customer_visible: bool = True
generation_metadata: dict = field(default_factory=dict)
# Classification fields
verification_method: Optional[str] = None # code_review, document, tool, hybrid
category: Optional[str] = None # one of 17 categories
@dataclass
@@ -299,6 +383,7 @@ class GeneratorResult:
controls_needs_review: int = 0
controls_too_close: int = 0
controls_duplicates_found: int = 0
chunks_skipped_prefilter: int = 0
errors: list = field(default_factory=list)
controls: list = field(default_factory=list)
@@ -310,14 +395,68 @@ class GeneratorResult:
async def _llm_chat(prompt: str, system_prompt: Optional[str] = None) -> str:
"""Call LLM — Anthropic Claude (primary) or Ollama (fallback)."""
if ANTHROPIC_API_KEY:
logger.info("Calling Anthropic API (model=%s)...", ANTHROPIC_MODEL)
result = await _llm_anthropic(prompt, system_prompt)
if result:
logger.info("Anthropic API success (%d chars)", len(result))
return result
logger.warning("Anthropic failed, falling back to Ollama")
logger.info("Calling Ollama (model=%s)...", OLLAMA_MODEL)
return await _llm_ollama(prompt, system_prompt)
async def _llm_local(prompt: str, system_prompt: Optional[str] = None) -> str:
"""Call local Ollama LLM only (for pre-filtering and classification tasks)."""
return await _llm_ollama(prompt, system_prompt)
PREFILTER_SYSTEM_PROMPT = """Du bist ein Compliance-Analyst. Deine Aufgabe: Prüfe ob ein Textabschnitt eine konkrete Sicherheitsanforderung, Datenschutzpflicht, oder technische/organisatorische Maßnahme enthält.
Antworte NUR mit einem JSON-Objekt: {"relevant": true/false, "reason": "kurze Begründung"}
Relevant = true wenn der Text mindestens EINE der folgenden enthält:
- Konkrete Pflicht/Anforderung ("muss", "soll", "ist sicherzustellen")
- Technische Sicherheitsmaßnahme (Verschlüsselung, Zugriffskontrolle, Logging)
- Organisatorische Maßnahme (Schulung, Dokumentation, Audit)
- Datenschutz-Vorgabe (Löschpflicht, Einwilligung, Zweckbindung)
- Risikomanagement-Anforderung
Relevant = false wenn der Text NUR enthält:
- Definitionen ohne Pflichten
- Inhaltsverzeichnisse oder Verweise
- Reine Begriffsbestimmungen
- Übergangsvorschriften ohne Substanz
- Adressaten/Geltungsbereich ohne Anforderung"""
async def _prefilter_chunk(chunk_text: str) -> tuple[bool, str]:
"""Use local LLM to check if a chunk contains an actionable requirement.
Returns (is_relevant, reason).
Much cheaper than sending every chunk to Anthropic.
"""
prompt = f"""Prüfe ob dieser Textabschnitt eine konkrete Sicherheitsanforderung oder Compliance-Pflicht enthält.
Text:
---
{chunk_text[:1500]}
---
Antworte NUR mit JSON: {{"relevant": true/false, "reason": "kurze Begründung"}}"""
try:
raw = await _llm_local(prompt, PREFILTER_SYSTEM_PROMPT)
data = _parse_llm_json(raw)
if data:
return data.get("relevant", True), data.get("reason", "")
# If parsing fails, assume relevant (don't skip)
return True, "parse_failed"
except Exception as e:
logger.warning("Prefilter failed: %s — treating as relevant", e)
return True, f"error: {e}"
async def _llm_anthropic(prompt: str, system_prompt: Optional[str] = None) -> str:
"""Call Anthropic Messages API."""
headers = {
@@ -364,6 +503,8 @@ async def _llm_ollama(prompt: str, system_prompt: Optional[str] = None) -> str:
"model": OLLAMA_MODEL,
"messages": messages,
"stream": False,
"options": {"num_predict": 512}, # Limit response length for speed
"think": False, # Disable thinking for faster responses
}
try:
@@ -397,6 +538,26 @@ async def _get_embedding(text: str) -> list[float]:
return []
async def _get_embeddings_batch(texts: list[str], batch_size: int = 32) -> list[list[float]]:
"""Get embedding vectors for multiple texts in batches."""
all_embeddings: list[list[float]] = []
for i in range(0, len(texts), batch_size):
batch = texts[i:i + batch_size]
try:
async with httpx.AsyncClient(timeout=30.0) as client:
resp = await client.post(
f"{EMBEDDING_URL}/embed",
json={"texts": batch},
)
resp.raise_for_status()
embeddings = resp.json().get("embeddings", [])
all_embeddings.extend(embeddings)
except Exception as e:
logger.warning("Batch embedding failed for %d texts: %s", len(batch), e)
all_embeddings.extend([[] for _ in batch])
return all_embeddings
def _cosine_sim(a: list[float], b: list[float]) -> float:
"""Compute cosine similarity between two vectors."""
if not a or not b or len(a) != len(b):
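The hunk truncates `_cosine_sim` after its guard clause. A standard implementation consistent with that guard, shown together with the `HARMONIZATION_THRESHOLD` dedup decision from above (vectors are toy values):

```python
import math

HARMONIZATION_THRESHOLD = 0.85  # Cosine similarity above this = duplicate

def cosine_sim(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 for empty or mismatched vectors (as guarded above)."""
    if not a or not b or len(a) != len(b):
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)

# Near-parallel embeddings are flagged as duplicate controls
sim = cosine_sim([1.0, 0.2, 0.0], [0.9, 0.3, 0.1])
print(sim > HARMONIZATION_THRESHOLD)  # True
```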
@@ -464,50 +625,96 @@ class ControlGeneratorPipeline:
# ── Stage 1: RAG Scan ──────────────────────────────────────────────
async def _scan_rag(self, config: GeneratorConfig) -> list[RAGSearchResult]:
"""Load unprocessed chunks from RAG collections."""
"""Scroll through ALL chunks in RAG collections.
Uses the scroll endpoint to iterate over every chunk (not just top-K search).
Filters out already-processed chunks by hash.
"""
collections = config.collections or ALL_COLLECTIONS
all_results: list[RAGSearchResult] = []
queries = [
"security requirement control measure",
"Sicherheitsanforderung Maßnahme Prüfaspekt",
"compliance requirement audit criterion",
"data protection privacy obligation",
"access control authentication authorization",
]
# Pre-load all processed hashes for fast filtering
processed_hashes: set[str] = set()
if config.skip_processed:
try:
result = self.db.execute(
text("SELECT chunk_hash FROM canonical_processed_chunks")
)
processed_hashes = {row[0] for row in result}
logger.info("Loaded %d processed chunk hashes", len(processed_hashes))
except Exception as e:
logger.warning("Error loading processed hashes: %s", e)
if config.domain:
domain_kw = DOMAIN_KEYWORDS.get(config.domain, [])
if domain_kw:
queries.append(" ".join(domain_kw[:5]))
seen_hashes: set[str] = set()
for collection in collections:
for query in queries:
results = await self.rag.search(
query=query,
offset = None
page = 0
collection_total = 0
collection_new = 0
seen_offsets: set[str] = set() # Detect scroll loops
while True:
chunks, next_offset = await self.rag.scroll(
collection=collection,
top_k=20,
offset=offset,
limit=200,
)
all_results.extend(results)
# Deduplicate by text hash
seen_hashes: set[str] = set()
unique: list[RAGSearchResult] = []
for r in all_results:
h = hashlib.sha256(r.text.encode()).hexdigest()
if h not in seen_hashes:
seen_hashes.add(h)
unique.append(r)
if not chunks:
break
# Filter out already-processed chunks
if config.skip_processed and unique:
hashes = [hashlib.sha256(r.text.encode()).hexdigest() for r in unique]
processed = self._get_processed_hashes(hashes)
unique = [r for r, h in zip(unique, hashes) if h not in processed]
collection_total += len(chunks)
logger.info("RAG scan: %d unique chunks (%d after filtering processed)",
len(seen_hashes), len(unique))
return unique[:config.max_controls * 3] # Over-fetch to account for duplicates
for chunk in chunks:
if not chunk.text or len(chunk.text.strip()) < 50:
continue # Skip empty/tiny chunks
h = hashlib.sha256(chunk.text.encode()).hexdigest()
# Skip duplicates (same text in multiple collections)
if h in seen_hashes:
continue
seen_hashes.add(h)
# Skip already-processed
if h in processed_hashes:
continue
all_results.append(chunk)
collection_new += 1
page += 1
if page % 50 == 0:
logger.info(
"Scrolling %s: page %d, %d total chunks, %d new unprocessed",
collection, page, collection_total, collection_new,
)
# Stop conditions
if not next_offset:
break
# Detect infinite scroll loops (Qdrant mixed ID types)
if next_offset in seen_offsets:
logger.warning(
"Scroll loop detected in %s at offset %s (page %d) — stopping",
collection, next_offset, page,
)
break
seen_offsets.add(next_offset)
offset = next_offset
logger.info(
"Collection %s: %d total chunks scrolled, %d new unprocessed",
collection, collection_total, collection_new,
)
logger.info(
"RAG scroll complete: %d total unique seen, %d new unprocessed to process",
len(seen_hashes), len(all_results),
)
return all_results
def _get_processed_hashes(self, hashes: list[str]) -> set[str]:
"""Check which chunk hashes are already processed."""
@@ -568,6 +775,8 @@ Quelle: {chunk.regulation_name} ({chunk.regulation_code}), {chunk.article}"""
"url": chunk.source_url or "",
}
control.customer_visible = True
control.verification_method = _detect_verification_method(chunk.text)
control.category = _detect_category(chunk.text)
control.generation_metadata = {
"processing_path": "structured",
"license_rule": 1,
@@ -617,6 +826,8 @@ Quelle: {chunk.regulation_name}, {chunk.article}"""
"url": chunk.source_url or "",
}
control.customer_visible = True
control.verification_method = _detect_verification_method(chunk.text)
control.category = _detect_category(chunk.text)
control.generation_metadata = {
"processing_path": "structured",
"license_rule": 2,
@@ -661,6 +872,8 @@ Gib JSON zurück mit diesen Feldern:
control.source_original_text = None # NEVER store original
control.source_citation = None # NEVER cite source
control.customer_visible = False # Only our formulation
control.verification_method = _detect_verification_method(chunk.text)
control.category = _detect_category(chunk.text)
# generation_metadata: NO source names, NO original texts
control.generation_metadata = {
"processing_path": "llm_reform",
@@ -676,6 +889,10 @@ Gib JSON zurück mit diesen Feldern:
if not existing:
return None
# Pre-load all existing embeddings in batch (once per pipeline run)
if not self._existing_embeddings:
await self._preload_embeddings(existing)
new_text = f"{new_control.title} {new_control.objective}"
new_emb = await _get_embedding(new_text)
if not new_emb:
@@ -684,14 +901,7 @@ Gib JSON zurück mit diesen Feldern:
similar = []
for ex in existing:
ex_key = ex.get("control_id", "")
ex_text = f"{ex.get('title', '')} {ex.get('objective', '')}"
# Get or compute embedding for existing control
if ex_key not in self._existing_embeddings:
emb = await _get_embedding(ex_text)
self._existing_embeddings[ex_key] = emb
ex_emb = self._existing_embeddings.get(ex_key, [])
if not ex_emb:
continue
@@ -705,6 +915,20 @@ Gib JSON zurück mit diesen Feldern:
return similar if similar else None
async def _preload_embeddings(self, existing: list[dict]):
"""Pre-load embeddings for all existing controls in batches."""
texts = [f"{ex.get('title', '')} {ex.get('objective', '')}" for ex in existing]
keys = [ex.get("control_id", "") for ex in existing]
logger.info("Pre-loading embeddings for %d existing controls...", len(texts))
embeddings = await _get_embeddings_batch(texts)
for key, emb in zip(keys, embeddings):
self._existing_embeddings[key] = emb
loaded = sum(1 for emb in embeddings if emb)
logger.info("Pre-loaded %d/%d embeddings", loaded, len(texts))
def _load_existing_controls(self) -> list[dict]:
"""Load existing controls from DB (cached per pipeline run)."""
if self._existing_controls is not None:
@@ -799,10 +1023,11 @@ Gib JSON zurück mit diesen Feldern:
return str(uuid.uuid4())
def _update_job(self, job_id: str, result: GeneratorResult):
"""Update job with final stats."""
"""Update job with current stats. Sets completed_at only when status is final."""
is_final = result.status in ("completed", "failed")
try:
self.db.execute(
text("""
text(f"""
UPDATE canonical_generation_jobs
SET status = :status,
total_chunks_scanned = :scanned,
@@ -811,8 +1036,8 @@ Gib JSON zurück mit diesen Feldern:
controls_needs_review = :needs_review,
controls_too_close = :too_close,
controls_duplicates_found = :duplicates,
errors = :errors,
completed_at = NOW()
errors = :errors
{"" if not is_final else ", completed_at = NOW()"}
WHERE id = CAST(:job_id AS uuid)
"""),
{
@@ -857,14 +1082,16 @@ Gib JSON zurück mit diesen Feldern:
severity, risk_score, implementation_effort,
open_anchors, release_state, tags,
license_rule, source_original_text, source_citation,
customer_visible, generation_metadata
customer_visible, generation_metadata,
verification_method, category
) VALUES (
:framework_id, :control_id, :title, :objective, :rationale,
:scope, :requirements, :test_procedure, :evidence,
:severity, :risk_score, :implementation_effort,
:open_anchors, :release_state, :tags,
:license_rule, :source_original_text, :source_citation,
:customer_visible, :generation_metadata
:customer_visible, :generation_metadata,
:verification_method, :category
)
ON CONFLICT (framework_id, control_id) DO NOTHING
RETURNING id
@@ -890,6 +1117,8 @@ Gib JSON zurück mit diesen Feldern:
"source_citation": json.dumps(control.source_citation) if control.source_citation else None,
"customer_visible": control.customer_visible,
"generation_metadata": json.dumps(control.generation_metadata) if control.generation_metadata else None,
"verification_method": control.verification_method,
"category": control.category,
},
)
self.db.commit()
@@ -926,7 +1155,7 @@ Gib JSON zurück mit diesen Feldern:
"""),
{
"hash": chunk_hash,
"collection": "bp_compliance_ce", # Default, we don't track collection per result
"collection": chunk.collection or "bp_compliance_ce",
"regulation_code": chunk.regulation_code,
"doc_version": "1.0",
"license": license_info.get("license", ""),
@@ -946,8 +1175,11 @@ Gib JSON zurück mit diesen Feldern:
"""Execute the full 7-stage pipeline."""
result = GeneratorResult()
# Create job
job_id = self._create_job(config)
# Create or reuse job
if config.existing_job_id:
job_id = config.existing_job_id
else:
job_id = self._create_job(config)
result.job_id = job_id
try:
@@ -962,13 +1194,37 @@ Gib JSON zurück mit diesen Feldern:
# Process chunks
controls_count = 0
for chunk in chunks:
if controls_count >= config.max_controls:
break
chunks_skipped_prefilter = 0
for i, chunk in enumerate(chunks):
try:
# Progress logging every 50 chunks
if i > 0 and i % 50 == 0:
logger.info(
"Progress: %d/%d chunks processed, %d controls generated, %d skipped by prefilter",
i, len(chunks), controls_count, chunks_skipped_prefilter,
)
self._update_job(job_id, result)
# Stage 1.5: Local LLM pre-filter — skip chunks without requirements
if not config.dry_run:
is_relevant, prefilter_reason = await _prefilter_chunk(chunk.text)
if not is_relevant:
chunks_skipped_prefilter += 1
# Mark as processed so we don't re-check next time
license_info = self._classify_license(chunk)
self._mark_chunk_processed(
chunk, license_info, "prefilter_skip", [], job_id
)
continue
control = await self._process_single_chunk(chunk, config, job_id)
if control is None:
# No control generated — still mark as processed
if not config.dry_run:
license_info = self._classify_license(chunk)
self._mark_chunk_processed(
chunk, license_info, "no_control", [], job_id
)
continue
# Count by state
@@ -989,6 +1245,12 @@ Gib JSON zurück mit diesen Feldern:
license_info = self._classify_license(chunk)
path = "llm_reform" if license_info["rule"] == 3 else "structured"
self._mark_chunk_processed(chunk, license_info, path, [ctrl_uuid], job_id)
else:
# Store failed — still mark as processed
license_info = self._classify_license(chunk)
self._mark_chunk_processed(
chunk, license_info, "store_failed", [], job_id
)
result.controls_generated += 1
result.controls.append(asdict(control))
@@ -1006,6 +1268,21 @@ Gib JSON zurück mit diesen Feldern:
error_msg = f"Error processing chunk {chunk.regulation_code}/{chunk.article}: {e}"
logger.error(error_msg)
result.errors.append(error_msg)
# Mark failed chunks as processed too (so we don't retry endlessly)
try:
if not config.dry_run:
license_info = self._classify_license(chunk)
self._mark_chunk_processed(
chunk, license_info, "error", [], job_id
)
except Exception:
pass
result.chunks_skipped_prefilter = chunks_skipped_prefilter
logger.info(
"Pipeline complete: %d controls generated, %d chunks skipped by prefilter, %d total chunks",
controls_count, chunks_skipped_prefilter, len(chunks),
)
result.status = "completed"


@@ -33,6 +33,7 @@ class RAGSearchResult:
paragraph: str
source_url: str
score: float
collection: str = ""
class ComplianceRAGClient:
@@ -91,6 +92,7 @@ class ComplianceRAGClient:
paragraph=r.get("paragraph", ""),
source_url=r.get("source_url", ""),
score=r.get("score", 0.0),
collection=collection,
))
return results
@@ -98,6 +100,54 @@ class ComplianceRAGClient:
logger.warning("RAG search failed: %s", e)
return []
async def scroll(
self,
collection: str,
offset: Optional[str] = None,
limit: int = 100,
) -> tuple[List[RAGSearchResult], Optional[str]]:
"""
Scroll through ALL chunks in a collection (paginated).
Returns (chunks, next_offset). next_offset is None when done.
"""
scroll_url = self._search_url.replace("/search", "/scroll")
params = {"collection": collection, "limit": str(limit)}
if offset:
params["offset"] = offset
try:
async with httpx.AsyncClient(timeout=30.0) as client:
resp = await client.get(scroll_url, params=params)
if resp.status_code != 200:
logger.warning(
"RAG scroll returned %d: %s", resp.status_code, resp.text[:200]
)
return [], None
data = resp.json()
results = []
for r in data.get("chunks", []):
results.append(RAGSearchResult(
text=r.get("text", ""),
regulation_code=r.get("regulation_code", ""),
regulation_name=r.get("regulation_name", ""),
regulation_short=r.get("regulation_short", ""),
category=r.get("category", ""),
article=r.get("article", ""),
paragraph=r.get("paragraph", ""),
source_url=r.get("source_url", ""),
score=0.0,
collection=collection,
))
next_offset = data.get("next_offset") or None
return results, next_offset
except Exception as e:
logger.warning("RAG scroll failed: %s", e)
return [], None
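The consumer side of `scroll()` has to chase `next_offset` and guard against the offset loop the pipeline warns about. A minimal sketch with a fake client standing in for `ComplianceRAGClient` (`FakeRAG` and `scroll_all` are illustrative names, not shipped code):

```python
import asyncio

class FakeRAG:
    """Stand-in for ComplianceRAGClient: page 2 returns the same offset again (a loop)."""
    def __init__(self):
        self.pages = {None: (["a", "b"], "off1"), "off1": (["c"], "off1")}

    async def scroll(self, collection, offset=None, limit=200):
        return self.pages[offset]

async def scroll_all(rag, collection: str, limit: int = 200):
    """Drain a collection via the scroll endpoint, stopping on exhaustion or offset loops
    (the consumer pattern used by the pipeline's _scan_rag)."""
    chunks, offset, seen = [], None, set()
    while True:
        page, next_offset = await rag.scroll(collection=collection, offset=offset, limit=limit)
        chunks.extend(page)
        if not next_offset or next_offset in seen:
            break  # done, or the backend handed back an offset we already visited
        seen.add(next_offset)
        offset = next_offset
    return chunks

print(asyncio.run(scroll_all(FakeRAG(), "bp_compliance_ce")))  # ['a', 'b', 'c']
```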
def format_for_prompt(
self, results: List[RAGSearchResult], max_results: int = 5
) -> str:


@@ -14,6 +14,12 @@ from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
# Configure root logging so all modules' logger.info() etc. are visible
logging.basicConfig(
level=logging.INFO,
format="%(levelname)s:%(name)s: %(message)s",
)
logger = logging.getLogger(__name__)
# Compliance-specific API routers


@@ -0,0 +1,17 @@
-- 048: Expand processing_path CHECK constraint for new pipeline paths
-- New values: prefilter_skip, no_control, store_failed, error
ALTER TABLE canonical_processed_chunks
DROP CONSTRAINT IF EXISTS canonical_processed_chunks_processing_path_check;
ALTER TABLE canonical_processed_chunks
ADD CONSTRAINT canonical_processed_chunks_processing_path_check
CHECK (processing_path IN (
'structured', -- Rule 1/2: structured from original text
'llm_reform', -- Rule 3: LLM reformulated
'skipped', -- Legacy: skipped for other reasons
'prefilter_skip', -- Local LLM determined chunk has no requirement
'no_control', -- Processing ran but no control could be derived
'store_failed', -- Control generated but DB store failed
'error' -- Processing error occurred
));


@@ -0,0 +1,8 @@
-- 049: Add target_audience field to canonical_controls
-- Distinguishes who a control is relevant for: enterprises, authorities, providers, or all.
ALTER TABLE canonical_controls ADD COLUMN IF NOT EXISTS
target_audience VARCHAR(20) DEFAULT NULL
CHECK (target_audience IN ('enterprise', 'authority', 'provider', 'all'));
CREATE INDEX IF NOT EXISTS idx_cc_target_audience ON canonical_controls(target_audience);


@@ -0,0 +1,22 @@
-- Score Snapshots: Historical compliance score tracking
-- Migration 050
CREATE TABLE IF NOT EXISTS compliance_score_snapshots (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
project_id UUID,
score DECIMAL(5,2) NOT NULL,
controls_total INTEGER DEFAULT 0,
controls_pass INTEGER DEFAULT 0,
controls_partial INTEGER DEFAULT 0,
evidence_total INTEGER DEFAULT 0,
evidence_valid INTEGER DEFAULT 0,
risks_total INTEGER DEFAULT 0,
risks_high INTEGER DEFAULT 0,
snapshot_date DATE NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
-- NULLS NOT DISTINCT (PostgreSQL 15+): without it, snapshots with project_id IS NULL
-- never conflict, so the daily ON CONFLICT upsert would keep inserting duplicates
UNIQUE NULLS NOT DISTINCT (tenant_id, project_id, snapshot_date)
);
CREATE INDEX IF NOT EXISTS idx_score_snap_tenant ON compliance_score_snapshots(tenant_id);
CREATE INDEX IF NOT EXISTS idx_score_snap_date ON compliance_score_snapshots(snapshot_date);

File diff suppressed because it is too large


@@ -0,0 +1,53 @@
-- Process Manager: Recurring compliance tasks with audit trail
-- Migration 052
CREATE TABLE compliance_process_tasks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
project_id UUID,
task_code VARCHAR(50) NOT NULL,
title VARCHAR(500) NOT NULL,
description TEXT,
category VARCHAR(50) NOT NULL
CHECK (category IN ('dsgvo','nis2','bsi','iso27001','ai_act','internal')),
priority VARCHAR(20) NOT NULL DEFAULT 'medium'
CHECK (priority IN ('critical','high','medium','low')),
frequency VARCHAR(20) NOT NULL DEFAULT 'yearly'
CHECK (frequency IN ('weekly','monthly','quarterly','semi_annual','yearly','once')),
assigned_to VARCHAR(255),
responsible_team VARCHAR(255),
linked_control_ids JSONB DEFAULT '[]',
linked_module VARCHAR(100),
last_completed_at TIMESTAMPTZ,
next_due_date DATE,
due_reminder_days INTEGER DEFAULT 14,
status VARCHAR(20) NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending','in_progress','completed','overdue','skipped')),
completion_date TIMESTAMPTZ,
completion_result TEXT,
completion_evidence_id UUID,
follow_up_actions JSONB DEFAULT '[]',
is_seed BOOLEAN DEFAULT FALSE,
notes TEXT,
tags JSONB DEFAULT '[]',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (tenant_id, project_id, task_code)
);
CREATE TABLE compliance_process_task_history (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
task_id UUID NOT NULL REFERENCES compliance_process_tasks(id) ON DELETE CASCADE,
completed_by VARCHAR(255),
completed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
result TEXT,
evidence_id UUID,
notes TEXT,
status VARCHAR(20) NOT NULL
);
CREATE INDEX idx_process_tasks_tenant ON compliance_process_tasks(tenant_id);
CREATE INDEX idx_process_tasks_status ON compliance_process_tasks(status);
CREATE INDEX idx_process_tasks_due ON compliance_process_tasks(next_due_date);
CREATE INDEX idx_process_tasks_category ON compliance_process_tasks(category);
CREATE INDEX idx_task_history_task ON compliance_process_task_history(task_id);


@@ -0,0 +1,62 @@
-- Evidence Checks: Automated compliance verification
-- Migration 053
CREATE TABLE compliance_evidence_checks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
project_id UUID,
check_code VARCHAR(50) NOT NULL,
title VARCHAR(500) NOT NULL,
description TEXT,
check_type VARCHAR(30) NOT NULL
CHECK (check_type IN ('tls_scan','header_check','certificate_check',
'config_scan','api_scan','dns_check','port_scan')),
target_url TEXT,
target_config JSONB DEFAULT '{}',
linked_control_ids JSONB DEFAULT '[]',
frequency VARCHAR(20) DEFAULT 'monthly'
CHECK (frequency IN ('daily','weekly','monthly','quarterly','manual')),
last_run_at TIMESTAMPTZ,
next_run_at TIMESTAMPTZ,
is_active BOOLEAN DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (tenant_id, project_id, check_code)
);
CREATE TABLE compliance_evidence_check_results (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
check_id UUID NOT NULL REFERENCES compliance_evidence_checks(id) ON DELETE CASCADE,
tenant_id UUID NOT NULL,
run_status VARCHAR(20) NOT NULL DEFAULT 'running'
CHECK (run_status IN ('running','passed','failed','warning','error')),
result_data JSONB NOT NULL DEFAULT '{}',
summary TEXT,
findings_count INTEGER DEFAULT 0,
critical_findings INTEGER DEFAULT 0,
evidence_id UUID,
duration_ms INTEGER,
run_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE compliance_evidence_control_map (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
evidence_id UUID NOT NULL,
control_code VARCHAR(50) NOT NULL,
mapping_type VARCHAR(20) DEFAULT 'supports'
CHECK (mapping_type IN ('supports','partially_supports','required')),
verified_at TIMESTAMPTZ,
verified_by VARCHAR(255),
notes TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (tenant_id, evidence_id, control_code)
);
CREATE INDEX idx_evidence_checks_tenant ON compliance_evidence_checks(tenant_id);
CREATE INDEX idx_evidence_checks_type ON compliance_evidence_checks(check_type);
CREATE INDEX idx_evidence_checks_active ON compliance_evidence_checks(is_active);
CREATE INDEX idx_check_results_check ON compliance_evidence_check_results(check_id);
CREATE INDEX idx_check_results_status ON compliance_evidence_check_results(run_status);
CREATE INDEX idx_evidence_control_map_tenant ON compliance_evidence_control_map(tenant_id);
CREATE INDEX idx_evidence_control_map_control ON compliance_evidence_control_map(control_code);
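The runner behind a `header_check` is not shown in this hunk. A sketch of how a result row's `run_status` / `findings_count` might be derived from response headers (the required-header list is an illustrative assumption, not the shipped check):

```python
# Illustrative required headers; the shipped check may inspect a different set
REQUIRED_HEADERS = {
    "strict-transport-security": "HSTS missing",
    "content-security-policy": "CSP missing",
    "x-content-type-options": "MIME sniffing not disabled",
}

def evaluate_headers(headers: dict[str, str]) -> dict:
    """Map HTTP response headers onto the check-result row shape above."""
    present = {k.lower() for k in headers}
    findings = [msg for h, msg in REQUIRED_HEADERS.items() if h not in present]
    return {
        "run_status": "passed" if not findings else "failed",
        "findings_count": len(findings),
        "result_data": {"findings": findings},
    }

result = evaluate_headers({"Strict-Transport-Security": "max-age=63072000",
                           "X-Content-Type-Options": "nosniff"})
print(result["run_status"], result["findings_count"])  # failed 1  (CSP missing)
```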


@@ -0,0 +1,209 @@
"""Tests for extended dashboard routes (roadmap, module-status, next-actions, snapshots)."""
import pytest
from unittest.mock import MagicMock, patch
from fastapi.testclient import TestClient
from fastapi import FastAPI
from datetime import datetime, date, timedelta
from decimal import Decimal
from compliance.api.dashboard_routes import router
from classroom_engine.database import get_db
from compliance.api.tenant_utils import get_tenant_id
DEFAULT_TENANT_ID = "9282a473-5c95-4b3a-bf78-0ecc0ec71d3e"
# =============================================================================
# Test App Setup
# =============================================================================
app = FastAPI()
app.include_router(router)
mock_db = MagicMock()
def override_get_db():
yield mock_db
def override_tenant():
return DEFAULT_TENANT_ID
app.dependency_overrides[get_db] = override_get_db
app.dependency_overrides[get_tenant_id] = override_tenant
client = TestClient(app)
# =============================================================================
# Helpers
# =============================================================================
class MockControl:
def __init__(self, id="ctrl-001", control_id="CTRL-001", title="Test Control",
status_val="planned", domain_val="gov", owner="ISB",
next_review_at=None, category="security"):
self.id = id
self.control_id = control_id
self.title = title
self.status = MagicMock(value=status_val)
self.domain = MagicMock(value=domain_val)
self.owner = owner
self.next_review_at = next_review_at
self.category = category
class MockRisk:
def __init__(self, inherent_risk_val="high", status="open"):
self.inherent_risk = MagicMock(value=inherent_risk_val)
self.status = status
def make_snapshot_row(overrides=None):
data = {
"id": "snap-001",
"tenant_id": DEFAULT_TENANT_ID,
"project_id": None,
"score": Decimal("72.50"),
"controls_total": 20,
"controls_pass": 12,
"controls_partial": 5,
"evidence_total": 10,
"evidence_valid": 8,
"risks_total": 5,
"risks_high": 2,
"snapshot_date": date(2026, 3, 14),
"created_at": datetime(2026, 3, 14),
}
if overrides:
data.update(overrides)
row = MagicMock()
row._mapping = data
return row
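The `_mapping` attribute on these mock rows imitates SQLAlchemy 1.4+ `Row` objects, whose `._mapping` view is what route code typically converts to a plain dict before serialization. The pattern in isolation, using only `unittest.mock`:

```python
# Sketch: how route code presumably consumes the mocked _mapping
# attribute. Only the one attribute the routes touch is imitated;
# nothing else of the real SQLAlchemy Row is needed here.
from unittest.mock import MagicMock

row = MagicMock()
row._mapping = {"id": "snap-001", "score": 72.5, "controls_total": 20}

# Typical route-side conversion to a JSON-serializable dict.
payload = dict(row._mapping)
print(payload["score"])  # 72.5
```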
# =============================================================================
# Tests
# =============================================================================
class TestDashboardRoadmap:
def test_roadmap_returns_buckets(self):
"""Roadmap returns 4 buckets with controls."""
overdue = datetime.utcnow() - timedelta(days=10)
future = datetime.utcnow() + timedelta(days=30)
with patch("compliance.api.dashboard_routes.ControlRepository") as MockCtrlRepo:
instance = MockCtrlRepo.return_value
instance.get_all.return_value = [
MockControl(id="c1", status_val="planned", category="legal", next_review_at=overdue),
MockControl(id="c2", status_val="partial", category="security"),
MockControl(id="c3", status_val="planned", category="best_practice"),
MockControl(id="c4", status_val="pass"), # should be excluded
]
resp = client.get("/dashboard/roadmap")
assert resp.status_code == 200
data = resp.json()
assert "buckets" in data
assert "counts" in data
# c4 is pass, so excluded; c1 is legal+overdue → quick_wins
total_in_buckets = sum(data["counts"].values())
assert total_in_buckets == 3
def test_roadmap_empty_controls(self):
"""Roadmap with no controls returns empty buckets."""
with patch("compliance.api.dashboard_routes.ControlRepository") as MockCtrlRepo:
MockCtrlRepo.return_value.get_all.return_value = []
resp = client.get("/dashboard/roadmap")
assert resp.status_code == 200
assert all(v == 0 for v in resp.json()["counts"].values())
class TestModuleStatus:
def test_module_status_returns_modules(self):
"""Module status returns list of modules with counts."""
# Mock db.execute for each module's COUNT query
count_result = MagicMock()
count_result.fetchone.return_value = (5,)
mock_db.execute.return_value = count_result
resp = client.get("/dashboard/module-status")
assert resp.status_code == 200
data = resp.json()
assert "modules" in data
assert data["total"] > 0
assert all(m["count"] == 5 for m in data["modules"])
def test_module_status_handles_missing_tables(self):
"""Module status handles missing tables gracefully."""
mock_db.execute.side_effect = Exception("relation does not exist")
resp = client.get("/dashboard/module-status")
assert resp.status_code == 200
data = resp.json()
# All modules should have count=0 and status=not_started
assert all(m["count"] == 0 for m in data["modules"])
assert all(m["status"] == "not_started" for m in data["modules"])
mock_db.execute.side_effect = None # reset
class TestNextActions:
def test_next_actions_returns_sorted(self):
"""Next actions returns controls sorted by urgency."""
overdue = datetime.utcnow() - timedelta(days=30)
with patch("compliance.api.dashboard_routes.ControlRepository") as MockCtrlRepo:
instance = MockCtrlRepo.return_value
instance.get_all.return_value = [
MockControl(id="c1", status_val="planned", category="legal", next_review_at=overdue),
MockControl(id="c2", status_val="partial", category="best_practice"),
MockControl(id="c3", status_val="pass"), # excluded
]
resp = client.get("/dashboard/next-actions?limit=5")
assert resp.status_code == 200
data = resp.json()
assert len(data["actions"]) == 2
# c1 should be first (higher urgency due to legal + overdue)
assert data["actions"][0]["control_id"] == "CTRL-001"
class TestScoreSnapshot:
def test_create_snapshot(self):
"""Creating a snapshot saves current score."""
with patch("compliance.api.dashboard_routes.ControlRepository") as MockCtrlRepo, \
patch("compliance.api.dashboard_routes.EvidenceRepository") as MockEvRepo, \
patch("compliance.api.dashboard_routes.RiskRepository") as MockRiskRepo:
MockCtrlRepo.return_value.get_statistics.return_value = {
"total": 20, "pass": 12, "partial": 5, "by_status": {}
}
MockEvRepo.return_value.get_statistics.return_value = {
"total": 10, "by_status": {"valid": 8}
}
MockRiskRepo.return_value.get_all.return_value = [
MockRisk("high"), MockRisk("critical"), MockRisk("low")
]
snap_row = make_snapshot_row()
mock_db.execute.return_value.fetchone.return_value = snap_row
resp = client.post("/dashboard/snapshot")
assert resp.status_code == 200
data = resp.json()
assert "score" in data
def test_score_history(self):
"""Score history returns snapshots."""
rows = [make_snapshot_row({"snapshot_date": date(2026, 3, i)}) for i in range(1, 4)]
mock_db.execute.return_value.fetchall.return_value = rows
resp = client.get("/dashboard/score-history?months=3")
assert resp.status_code == 200
data = resp.json()
assert data["total"] == 3
assert len(data["snapshots"]) == 3


@@ -0,0 +1,374 @@
"""Tests for Evidence Check routes (evidence_check_routes.py)."""
import json
import pytest
from unittest.mock import MagicMock, patch, AsyncMock
from datetime import datetime
from fastapi import FastAPI
from fastapi.testclient import TestClient
from compliance.api.evidence_check_routes import router, VALID_CHECK_TYPES
from classroom_engine.database import get_db
from compliance.api.tenant_utils import get_tenant_id
# ---------------------------------------------------------------------------
# App setup with mocked DB dependency
# ---------------------------------------------------------------------------
app = FastAPI()
app.include_router(router)
DEFAULT_TENANT_ID = "9282a473-5c95-4b3a-bf78-0ecc0ec71d3e"
CHECK_ID = "ffffffff-0001-0001-0001-000000000001"
RESULT_ID = "eeeeeeee-0001-0001-0001-000000000001"
MAPPING_ID = "dddddddd-0001-0001-0001-000000000001"
EVIDENCE_ID = "cccccccc-0001-0001-0001-000000000001"
NOW = datetime(2026, 3, 14, 12, 0, 0)
def override_get_tenant_id():
return DEFAULT_TENANT_ID
app.dependency_overrides[get_tenant_id] = override_get_tenant_id
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_check_row(overrides=None):
"""Create a mock DB row for a check."""
data = {
"id": CHECK_ID,
"tenant_id": DEFAULT_TENANT_ID,
"project_id": None,
"check_code": "TLS-SCAN-001",
"title": "TLS-Scan Hauptwebseite",
"description": "Prueft TLS",
"check_type": "tls_scan",
"target_url": "https://example.com",
"target_config": {},
"linked_control_ids": [],
"frequency": "monthly",
"last_run_at": None,
"next_run_at": None,
"is_active": True,
"created_at": NOW,
"updated_at": NOW,
}
if overrides:
data.update(overrides)
row = MagicMock()
row._mapping = data
row.__getitem__ = lambda self, i: list(data.values())[i]
return row
def _make_result_row(overrides=None):
"""Create a mock DB row for a result."""
data = {
"id": RESULT_ID,
"check_id": CHECK_ID,
"tenant_id": DEFAULT_TENANT_ID,
"run_status": "passed",
"result_data": {"tls_version": "TLSv1.3"},
"summary": "TLS TLSv1.3",
"findings_count": 0,
"critical_findings": 0,
"evidence_id": None,
"duration_ms": 150,
"run_at": NOW,
}
if overrides:
data.update(overrides)
row = MagicMock()
row._mapping = data
row.__getitem__ = lambda self, i: list(data.values())[i]
return row
def _make_mapping_row(overrides=None):
data = {
"id": MAPPING_ID,
"tenant_id": DEFAULT_TENANT_ID,
"evidence_id": EVIDENCE_ID,
"control_code": "TOM-001",
"mapping_type": "supports",
"verified_at": None,
"verified_by": None,
"notes": "Test mapping",
"created_at": NOW,
}
if overrides:
data.update(overrides)
row = MagicMock()
row._mapping = data
row.__getitem__ = lambda self, i: list(data.values())[i]
return row
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
class TestListChecks:
def test_list_checks(self):
mock_db = MagicMock()
# COUNT query
count_row = MagicMock()
count_row.__getitem__ = lambda self, i: 2
# Data rows
rows = [_make_check_row(), _make_check_row({"check_code": "TLS-SCAN-002"})]
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=count_row)),
MagicMock(fetchall=MagicMock(return_value=rows)),
]
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
resp = client.get("/evidence-checks")
assert resp.status_code == 200
data = resp.json()
assert "checks" in data
assert len(data["checks"]) == 2
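The `side_effect` list above is what lets a single mocked `db.execute` serve two different queries in order: the COUNT query first, the row fetch second. The mechanism in isolation:

```python
# Sketch: MagicMock.side_effect given a list returns one item per call,
# so consecutive db.execute() calls consume the COUNT result, then the
# row-fetch result, mirroring the route's query order.
from unittest.mock import MagicMock

db = MagicMock()
db.execute.side_effect = [
    MagicMock(fetchone=MagicMock(return_value=(2,))),              # COUNT
    MagicMock(fetchall=MagicMock(return_value=["row1", "row2"])),  # data
]

total = db.execute("SELECT COUNT(*) FROM t").fetchone()[0]
rows = db.execute("SELECT * FROM t").fetchall()
print(total, len(rows))  # 2 2
```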
class TestCreateCheck:
def test_create_check(self):
mock_db = MagicMock()
mock_db.execute.return_value.fetchone.return_value = _make_check_row()
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
resp = client.post("/evidence-checks", json={
"check_code": "TLS-SCAN-001",
"title": "TLS-Scan Hauptwebseite",
"check_type": "tls_scan",
"frequency": "monthly",
})
assert resp.status_code == 201
data = resp.json()
assert data["check_code"] == "TLS-SCAN-001"
def test_create_check_invalid_type(self):
mock_db = MagicMock()
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
resp = client.post("/evidence-checks", json={
"check_code": "INVALID-001",
"title": "Invalid Check",
"check_type": "invalid_type",
})
assert resp.status_code == 400
assert "Ungueltiger check_type" in resp.json()["detail"]
class TestGetSingleCheck:
def test_get_single_check(self):
mock_db = MagicMock()
check_row = _make_check_row()
result_rows = [_make_result_row()]
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=check_row)),
MagicMock(fetchall=MagicMock(return_value=result_rows)),
]
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
resp = client.get(f"/evidence-checks/{CHECK_ID}")
assert resp.status_code == 200
data = resp.json()
assert data["check_code"] == "TLS-SCAN-001"
assert "recent_results" in data
assert len(data["recent_results"]) == 1
class TestUpdateCheck:
def test_update_check(self):
mock_db = MagicMock()
updated_row = _make_check_row({"title": "Updated Title"})
mock_db.execute.return_value.fetchone.return_value = updated_row
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
resp = client.put(f"/evidence-checks/{CHECK_ID}", json={
"title": "Updated Title",
})
assert resp.status_code == 200
assert resp.json()["title"] == "Updated Title"
class TestDeleteCheck:
def test_delete_check(self):
mock_db = MagicMock()
mock_db.execute.return_value.rowcount = 1
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
resp = client.delete(f"/evidence-checks/{CHECK_ID}")
assert resp.status_code == 204
class TestRunCheckTLS:
def test_run_check_tls(self):
mock_db = MagicMock()
check_row = _make_check_row()
result_insert_row = _make_result_row({"run_status": "running"})
result_update_row = _make_result_row({"run_status": "passed"})
mock_db.execute.side_effect = [
# Load check
MagicMock(fetchone=MagicMock(return_value=check_row)),
# Insert running result
MagicMock(fetchone=MagicMock(return_value=result_insert_row)),
# Update result
MagicMock(fetchone=MagicMock(return_value=result_update_row)),
# Update check timestamps
MagicMock(),
]
mock_db.commit = MagicMock()
tls_result = {
"run_status": "passed",
"result_data": {"tls_version": "TLSv1.3", "findings": []},
"summary": "TLS TLSv1.3, Zertifikat gueltig",
"findings_count": 0,
"critical_findings": 0,
"duration_ms": 100,
}
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
with patch("compliance.api.evidence_check_routes._run_tls_scan", new_callable=AsyncMock, return_value=tls_result):
resp = client.post(f"/evidence-checks/{CHECK_ID}/run")
assert resp.status_code == 200
data = resp.json()
assert data["run_status"] == "passed"
class TestRunCheckHeader:
def test_run_check_header(self):
mock_db = MagicMock()
check_row = _make_check_row({"check_type": "header_check"})
result_insert_row = _make_result_row({"run_status": "running"})
result_update_row = _make_result_row({"run_status": "warning"})
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=check_row)),
MagicMock(fetchone=MagicMock(return_value=result_insert_row)),
MagicMock(fetchone=MagicMock(return_value=result_update_row)),
MagicMock(),
]
mock_db.commit = MagicMock()
header_result = {
"run_status": "warning",
"result_data": {"missing_headers": ["Permissions-Policy"], "findings": []},
"summary": "5/6 Security-Header vorhanden",
"findings_count": 1,
"critical_findings": 0,
"duration_ms": 200,
}
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
with patch("compliance.api.evidence_check_routes._run_header_check", new_callable=AsyncMock, return_value=header_result):
resp = client.post(f"/evidence-checks/{CHECK_ID}/run")
assert resp.status_code == 200
data = resp.json()
assert data["run_status"] == "warning"
class TestSeedChecks:
def test_seed_checks(self):
mock_db = MagicMock()
# Each seed INSERT returns rowcount=1
mock_result = MagicMock()
mock_result.rowcount = 1
mock_db.execute.return_value = mock_result
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
resp = client.post("/evidence-checks/seed")
assert resp.status_code == 200
data = resp.json()
assert data["total_definitions"] == 15
assert data["seeded"] == 15
class TestMappingsCRUD:
def test_mappings_crud(self):
mock_db = MagicMock()
def override_db():
yield mock_db
app.dependency_overrides[get_db] = override_db
client = TestClient(app)
# Create mapping
mapping_row = _make_mapping_row()
mock_db.execute.return_value.fetchone.return_value = mapping_row
resp = client.post("/evidence-checks/mappings", json={
"evidence_id": EVIDENCE_ID,
"control_code": "TOM-001",
"mapping_type": "supports",
"notes": "Test mapping",
})
assert resp.status_code == 201
data = resp.json()
assert data["control_code"] == "TOM-001"
# List mappings
mock_db.execute.return_value.fetchall.return_value = [mapping_row]
resp = client.get("/evidence-checks/mappings")
assert resp.status_code == 200
assert "mappings" in resp.json()
# Delete mapping
mock_db.execute.return_value.rowcount = 1
resp = client.delete(f"/evidence-checks/mappings/{MAPPING_ID}")
assert resp.status_code == 204


@@ -0,0 +1,525 @@
"""Tests for compliance process task routes (process_task_routes.py)."""
import pytest
from unittest.mock import MagicMock, patch, call
from datetime import datetime, date, timedelta
from fastapi import FastAPI
from fastapi.testclient import TestClient
from compliance.api.process_task_routes import (
router,
ProcessTaskCreate,
ProcessTaskUpdate,
ProcessTaskComplete,
ProcessTaskSkip,
VALID_CATEGORIES,
VALID_FREQUENCIES,
VALID_PRIORITIES,
VALID_STATUSES,
FREQUENCY_DAYS,
)
from classroom_engine.database import get_db
from compliance.api.tenant_utils import get_tenant_id
DEFAULT_TENANT_ID = "9282a473-5c95-4b3a-bf78-0ecc0ec71d3e"
TASK_ID = "ffffffff-0001-0001-0001-000000000001"
app = FastAPI()
app.include_router(router)
# ---------------------------------------------------------------------------
# Mock helpers
# ---------------------------------------------------------------------------
class _MockRow:
"""Simulates a SQLAlchemy row with _mapping attribute."""
def __init__(self, data: dict):
self._mapping = data
def __getitem__(self, idx):
vals = list(self._mapping.values())
return vals[idx]
def _make_task_row(overrides=None):
now = datetime(2026, 3, 14, 12, 0, 0)
data = {
"id": TASK_ID,
"tenant_id": DEFAULT_TENANT_ID,
"project_id": None,
"task_code": "DSGVO-VVT-REVIEW",
"title": "VVT-Review und Aktualisierung",
"description": "Jaehrliche Ueberpruefung des VVT.",
"category": "dsgvo",
"priority": "high",
"frequency": "yearly",
"assigned_to": None,
"responsible_team": None,
"linked_control_ids": [],
"linked_module": "vvt",
"last_completed_at": None,
"next_due_date": date(2027, 3, 14),
"due_reminder_days": 14,
"status": "pending",
"completion_date": None,
"completion_result": None,
"completion_evidence_id": None,
"follow_up_actions": [],
"is_seed": False,
"notes": None,
"tags": [],
"created_at": now,
"updated_at": now,
}
if overrides:
data.update(overrides)
return _MockRow(data)
def _make_history_row(overrides=None):
now = datetime(2026, 3, 14, 12, 0, 0)
data = {
"id": "eeeeeeee-0001-0001-0001-000000000001",
"task_id": TASK_ID,
"completed_by": "admin",
"completed_at": now,
"result": "Alles in Ordnung",
"evidence_id": None,
"notes": "Keine Auffaelligkeiten",
"status": "completed",
}
if overrides:
data.update(overrides)
return _MockRow(data)
def _count_row(val):
"""Simulates a COUNT(*) row — fetchone()[0] returns the value."""
row = MagicMock()
row.__getitem__ = lambda self, idx: val
return row
@pytest.fixture
def mock_db():
db = MagicMock()
app.dependency_overrides[get_db] = lambda: db
yield db
app.dependency_overrides.pop(get_db, None)
@pytest.fixture
def client(mock_db):
return TestClient(app)
# =============================================================================
# Test 1: List Tasks
# =============================================================================
class TestListTasks:
def test_list_tasks(self, client, mock_db):
"""List tasks returns items and total."""
row = _make_task_row()
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=_count_row(1))),
MagicMock(fetchall=MagicMock(return_value=[row])),
]
resp = client.get("/process-tasks")
assert resp.status_code == 200
data = resp.json()
assert data["total"] == 1
assert len(data["tasks"]) == 1
assert data["tasks"][0]["id"] == TASK_ID
assert data["tasks"][0]["task_code"] == "DSGVO-VVT-REVIEW"
def test_list_tasks_empty(self, client, mock_db):
"""List tasks returns empty when no tasks."""
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=_count_row(0))),
MagicMock(fetchall=MagicMock(return_value=[])),
]
resp = client.get("/process-tasks")
assert resp.status_code == 200
data = resp.json()
assert data["total"] == 0
assert data["tasks"] == []
# =============================================================================
# Test 2: List Tasks with Filters
# =============================================================================
class TestListTasksWithFilters:
def test_list_tasks_with_filters(self, client, mock_db):
"""Filter by status and category."""
row = _make_task_row({"status": "overdue", "category": "nis2"})
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=_count_row(1))),
MagicMock(fetchall=MagicMock(return_value=[row])),
]
resp = client.get("/process-tasks?status=overdue&category=nis2")
assert resp.status_code == 200
data = resp.json()
assert data["tasks"][0]["status"] == "overdue"
assert data["tasks"][0]["category"] == "nis2"
def test_list_tasks_overdue_filter(self, client, mock_db):
"""Filter overdue=true adds date condition."""
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=_count_row(0))),
MagicMock(fetchall=MagicMock(return_value=[])),
]
resp = client.get("/process-tasks?overdue=true")
assert resp.status_code == 200
# Verify the SQL was called (mock_db.execute called twice: count + select)
assert mock_db.execute.call_count == 2
# =============================================================================
# Test 3: Get Stats
# =============================================================================
class TestGetStats:
def test_get_stats(self, client, mock_db):
"""Verify stat counts structure."""
stats_row = MagicMock()
stats_row._mapping = {
"total": 50,
"pending": 20,
"in_progress": 5,
"completed": 15,
"overdue": 8,
"skipped": 2,
"overdue_count": 8,
"due_7_days": 3,
"due_14_days": 7,
"due_30_days": 12,
}
cat_row1 = MagicMock()
cat_row1._mapping = {"category": "dsgvo", "cnt": 15}
cat_row2 = MagicMock()
cat_row2._mapping = {"category": "nis2", "cnt": 10}
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=stats_row)),
MagicMock(fetchall=MagicMock(return_value=[cat_row1, cat_row2])),
]
resp = client.get("/process-tasks/stats")
assert resp.status_code == 200
data = resp.json()
assert data["total"] == 50
assert data["by_status"]["pending"] == 20
assert data["by_status"]["completed"] == 15
assert data["overdue_count"] == 8
assert data["due_7_days"] == 3
assert data["due_30_days"] == 12
assert data["by_category"]["dsgvo"] == 15
assert data["by_category"]["nis2"] == 10
def test_get_stats_empty(self, client, mock_db):
"""Stats with no tasks returns zeros."""
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=None)),
MagicMock(fetchall=MagicMock(return_value=[])),
]
resp = client.get("/process-tasks/stats")
assert resp.status_code == 200
data = resp.json()
assert data["total"] == 0
assert data["by_category"] == {}
# =============================================================================
# Test 4: Create Task
# =============================================================================
class TestCreateTask:
def test_create_task(self, client, mock_db):
"""Create a valid task returns 201."""
row = _make_task_row()
mock_db.execute.return_value = MagicMock(fetchone=MagicMock(return_value=row))
resp = client.post("/process-tasks", json={
"task_code": "DSGVO-VVT-REVIEW",
"title": "VVT-Review und Aktualisierung",
"category": "dsgvo",
"priority": "high",
"frequency": "yearly",
})
assert resp.status_code == 201
data = resp.json()
assert data["id"] == TASK_ID
assert data["task_code"] == "DSGVO-VVT-REVIEW"
mock_db.commit.assert_called_once()
# =============================================================================
# Test 5: Create Task Invalid Category
# =============================================================================
class TestCreateTaskInvalidCategory:
def test_create_task_invalid_category(self, client, mock_db):
"""Invalid category returns 400."""
resp = client.post("/process-tasks", json={
"task_code": "TEST-001",
"title": "Test",
"category": "invalid_category",
})
assert resp.status_code == 400
assert "Invalid category" in resp.json()["detail"]
def test_create_task_invalid_priority(self, client, mock_db):
"""Invalid priority returns 400."""
resp = client.post("/process-tasks", json={
"task_code": "TEST-001",
"title": "Test",
"category": "dsgvo",
"priority": "super_high",
})
assert resp.status_code == 400
assert "Invalid priority" in resp.json()["detail"]
def test_create_task_invalid_frequency(self, client, mock_db):
"""Invalid frequency returns 400."""
resp = client.post("/process-tasks", json={
"task_code": "TEST-001",
"title": "Test",
"category": "dsgvo",
"frequency": "biweekly",
})
assert resp.status_code == 400
assert "Invalid frequency" in resp.json()["detail"]
# =============================================================================
# Test 6: Get Single Task
# =============================================================================
class TestGetSingleTask:
def test_get_single_task(self, client, mock_db):
"""Get existing task by ID."""
row = _make_task_row()
mock_db.execute.return_value = MagicMock(fetchone=MagicMock(return_value=row))
resp = client.get(f"/process-tasks/{TASK_ID}")
assert resp.status_code == 200
assert resp.json()["id"] == TASK_ID
def test_get_task_not_found(self, client, mock_db):
"""Get non-existent task returns 404."""
mock_db.execute.return_value = MagicMock(fetchone=MagicMock(return_value=None))
resp = client.get("/process-tasks/nonexistent-id")
assert resp.status_code == 404
# =============================================================================
# Test 7: Complete Task
# =============================================================================
class TestCompleteTask:
def test_complete_task(self, client, mock_db):
"""Complete a task: verify history insert and next_due recalculation."""
task_row = _make_task_row({"frequency": "quarterly"})
updated_row = _make_task_row({
"frequency": "quarterly",
"status": "pending",
"last_completed_at": datetime(2026, 3, 14, 12, 0, 0),
"next_due_date": date(2026, 6, 12),
})
# First call: SELECT task, Second: INSERT history, Third: UPDATE task
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=task_row)),
MagicMock(), # history INSERT
MagicMock(fetchone=MagicMock(return_value=updated_row)),
]
resp = client.post(f"/process-tasks/{TASK_ID}/complete", json={
"completed_by": "admin",
"result": "Alles geprueft",
"notes": "Keine Auffaelligkeiten",
})
assert resp.status_code == 200
data = resp.json()
assert data["status"] == "pending" # Reset for recurring
mock_db.commit.assert_called_once()
def test_complete_once_task(self, client, mock_db):
"""Complete a one-time task stays completed."""
task_row = _make_task_row({"frequency": "once"})
updated_row = _make_task_row({
"frequency": "once",
"status": "completed",
"next_due_date": None,
})
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=task_row)),
MagicMock(),
MagicMock(fetchone=MagicMock(return_value=updated_row)),
]
resp = client.post(f"/process-tasks/{TASK_ID}/complete", json={
"completed_by": "admin",
})
assert resp.status_code == 200
assert resp.json()["status"] == "completed"
def test_complete_task_not_found(self, client, mock_db):
"""Complete non-existent task returns 404."""
mock_db.execute.return_value = MagicMock(fetchone=MagicMock(return_value=None))
resp = client.post("/process-tasks/nonexistent-id/complete", json={
"completed_by": "admin",
})
assert resp.status_code == 404
# =============================================================================
# Test 8: Skip Task
# =============================================================================
class TestSkipTask:
def test_skip_task(self, client, mock_db):
"""Skip task with reason, verify next_due recalculation."""
task_row = _make_task_row({"frequency": "monthly"})
updated_row = _make_task_row({
"frequency": "monthly",
"status": "pending",
"next_due_date": date(2026, 4, 13),
})
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=task_row)),
MagicMock(), # history INSERT
MagicMock(fetchone=MagicMock(return_value=updated_row)),
]
resp = client.post(f"/process-tasks/{TASK_ID}/skip", json={
"reason": "Kein Handlungsbedarf diesen Monat",
})
assert resp.status_code == 200
assert resp.json()["status"] == "pending"
mock_db.commit.assert_called_once()
def test_skip_task_not_found(self, client, mock_db):
"""Skip non-existent task returns 404."""
mock_db.execute.return_value = MagicMock(fetchone=MagicMock(return_value=None))
resp = client.post("/process-tasks/nonexistent-id/skip", json={
"reason": "Test",
})
assert resp.status_code == 404
# =============================================================================
# Test 9: Seed Idempotent
# =============================================================================
class TestSeedIdempotent:
def test_seed_idempotent(self, client, mock_db):
"""Seed twice — ON CONFLICT ensures idempotency."""
# First seed: all inserted (rowcount=1 for each)
mock_result = MagicMock()
mock_result.rowcount = 1
mock_db.execute.return_value = mock_result
resp = client.post("/process-tasks/seed")
assert resp.status_code == 200
data = resp.json()
assert data["seeded"] == data["total_available"]
assert data["total_available"] == 50
mock_db.commit.assert_called_once()
def test_seed_second_time_no_inserts(self, client, mock_db):
"""Second seed inserts nothing (ON CONFLICT DO NOTHING)."""
mock_result = MagicMock()
mock_result.rowcount = 0
mock_db.execute.return_value = mock_result
resp = client.post("/process-tasks/seed")
assert resp.status_code == 200
data = resp.json()
assert data["seeded"] == 0
assert data["total_available"] == 50
# =============================================================================
# Test 10: Get History
# =============================================================================
class TestGetHistory:
def test_get_history(self, client, mock_db):
"""Return history entries for a task."""
task_id_row = _MockRow({"id": TASK_ID})
history_row = _make_history_row()
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=task_id_row)),
MagicMock(fetchall=MagicMock(return_value=[history_row])),
]
resp = client.get(f"/process-tasks/{TASK_ID}/history")
assert resp.status_code == 200
data = resp.json()
assert len(data["history"]) == 1
assert data["history"][0]["task_id"] == TASK_ID
assert data["history"][0]["status"] == "completed"
assert data["history"][0]["completed_by"] == "admin"
def test_get_history_task_not_found(self, client, mock_db):
"""History for non-existent task returns 404."""
mock_db.execute.return_value = MagicMock(fetchone=MagicMock(return_value=None))
resp = client.get("/process-tasks/nonexistent-id/history")
assert resp.status_code == 404
def test_get_history_empty(self, client, mock_db):
"""Task with no history returns empty list."""
task_id_row = _MockRow({"id": TASK_ID})
mock_db.execute.side_effect = [
MagicMock(fetchone=MagicMock(return_value=task_id_row)),
MagicMock(fetchall=MagicMock(return_value=[])),
]
resp = client.get(f"/process-tasks/{TASK_ID}/history")
assert resp.status_code == 200
assert resp.json()["history"] == []
# =============================================================================
# Constant / Schema Validation Tests
# =============================================================================
class TestConstants:
def test_valid_categories(self):
assert VALID_CATEGORIES == {"dsgvo", "nis2", "bsi", "iso27001", "ai_act", "internal"}
def test_valid_frequencies(self):
assert VALID_FREQUENCIES == {"weekly", "monthly", "quarterly", "semi_annual", "yearly", "once"}
def test_valid_priorities(self):
assert VALID_PRIORITIES == {"critical", "high", "medium", "low"}
def test_valid_statuses(self):
assert VALID_STATUSES == {"pending", "in_progress", "completed", "overdue", "skipped"}
def test_frequency_days_mapping(self):
assert FREQUENCY_DAYS["weekly"] == 7
assert FREQUENCY_DAYS["monthly"] == 30
assert FREQUENCY_DAYS["quarterly"] == 90
assert FREQUENCY_DAYS["semi_annual"] == 182
assert FREQUENCY_DAYS["yearly"] == 365
assert FREQUENCY_DAYS["once"] is None
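The due-date expectations in the complete/skip tests (quarterly: 2026-03-14 + 90 days = 2026-06-12; monthly: + 30 days = 2026-04-13; `once` clears the due date) follow directly from `FREQUENCY_DAYS`. A standalone sketch of the presumed recalculation, copying the values asserted above; the actual helper in `process_task_routes.py` is not shown in this diff and may differ:

```python
# Sketch (assumption: the route recalculates next_due_date as
# completion date + FREQUENCY_DAYS[frequency], with None for 'once').
from datetime import date, timedelta

FREQUENCY_DAYS = {"weekly": 7, "monthly": 30, "quarterly": 90,
                  "semi_annual": 182, "yearly": 365, "once": None}

def next_due(completed_on: date, frequency: str):
    days = FREQUENCY_DAYS[frequency]
    if days is None:  # one-time tasks have no next due date
        return None
    return completed_on + timedelta(days=days)

print(next_due(date(2026, 3, 14), "quarterly"))  # 2026-06-12
print(next_due(date(2026, 3, 14), "monthly"))    # 2026-04-13
print(next_due(date(2026, 3, 14), "once"))       # None
```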
class TestDeleteTask:
def test_delete_existing(self, client, mock_db):
mock_db.execute.return_value = MagicMock(rowcount=1)
resp = client.delete(f"/process-tasks/{TASK_ID}")
assert resp.status_code == 204
mock_db.commit.assert_called_once()
def test_delete_not_found(self, client, mock_db):
mock_db.execute.return_value = MagicMock(rowcount=0)
resp = client.delete(f"/process-tasks/{TASK_ID}")
assert resp.status_code == 404


@@ -0,0 +1,175 @@
"""Tests for security document templates (Module 3)."""
import pytest
from unittest.mock import MagicMock, patch
from fastapi.testclient import TestClient
from fastapi import FastAPI
from datetime import datetime
from compliance.api.legal_template_routes import router
from classroom_engine.database import get_db
from compliance.api.tenant_utils import get_tenant_id
DEFAULT_TENANT_ID = "9282a473-5c95-4b3a-bf78-0ecc0ec71d3e"
# =============================================================================
# Test App Setup
# =============================================================================
app = FastAPI()
app.include_router(router)
mock_db = MagicMock()
def override_get_db():
yield mock_db
def override_tenant():
return DEFAULT_TENANT_ID
app.dependency_overrides[get_db] = override_get_db
app.dependency_overrides[get_tenant_id] = override_tenant
client = TestClient(app)
SECURITY_TEMPLATE_TYPES = [
"it_security_concept",
"data_protection_concept",
"backup_recovery_concept",
"logging_concept",
"incident_response_plan",
"access_control_concept",
"risk_management_concept",
]
# =============================================================================
# Helpers
# =============================================================================
def make_template_row(doc_type, title="Test Template", content="# Test"):
row = MagicMock()
row._mapping = {
"id": "tmpl-001",
"tenant_id": DEFAULT_TENANT_ID,
"document_type": doc_type,
"title": title,
"description": f"Test {doc_type}",
"content": content,
"placeholders": ["COMPANY_NAME", "ISB_NAME"],
"language": "de",
"jurisdiction": "DE",
"status": "published",
"license_id": None,
"license_name": None,
"source_name": None,
"inspiration_sources": [],
"created_at": datetime(2026, 3, 14),
"updated_at": datetime(2026, 3, 14),
}
return row
# =============================================================================
# Tests
# =============================================================================
class TestSecurityTemplateTypes:
"""Verify the 7 security template types are accepted by the API."""
def test_all_security_types_in_valid_set(self):
"""All 7 security template types are in VALID_DOCUMENT_TYPES."""
from compliance.api.legal_template_routes import VALID_DOCUMENT_TYPES
for doc_type in SECURITY_TEMPLATE_TYPES:
assert doc_type in VALID_DOCUMENT_TYPES, (
f"{doc_type} not in VALID_DOCUMENT_TYPES"
)
def test_security_template_count(self):
"""There are exactly 7 security template types."""
assert len(SECURITY_TEMPLATE_TYPES) == 7
def test_create_security_template_accepted(self):
"""Creating a template with a security type is accepted (not 400)."""
insert_row = MagicMock()
insert_row._mapping = {
"id": "new-tmpl",
"tenant_id": DEFAULT_TENANT_ID,
"document_type": "it_security_concept",
"title": "IT-Sicherheitskonzept",
"description": "Test",
"content": "# IT-Sicherheitskonzept",
"placeholders": [],
"language": "de",
"jurisdiction": "DE",
"status": "draft",
"license_id": None,
"license_name": None,
"source_name": None,
"inspiration_sources": [],
"created_at": datetime(2026, 3, 14),
"updated_at": datetime(2026, 3, 14),
}
mock_db.execute.return_value.fetchone.return_value = insert_row
mock_db.commit = MagicMock()
resp = client.post("/legal-templates", json={
"document_type": "it_security_concept",
"title": "IT-Sicherheitskonzept",
"content": "# IT-Sicherheitskonzept\n\n## 1. Managementzusammenfassung",
"language": "de",
"jurisdiction": "DE",
})
# Should NOT be 400 (invalid type)
assert resp.status_code != 400 or "Invalid document_type" not in resp.text
def test_invalid_type_rejected(self):
"""A non-existent template type is rejected with 400."""
resp = client.post("/legal-templates", json={
"document_type": "nonexistent_type",
"title": "Test",
"content": "# Test",
})
assert resp.status_code == 400
assert "Invalid document_type" in resp.json()["detail"]
class TestSecurityTemplateFilter:
"""Verify filtering templates by security document types."""
def test_filter_by_security_type(self):
"""GET /legal-templates?document_type=it_security_concept returns matching templates."""
row = make_template_row("it_security_concept", "IT-Sicherheitskonzept")
mock_db.execute.return_value.fetchall.return_value = [row]
resp = client.get("/legal-templates?document_type=it_security_concept")
assert resp.status_code == 200
data = resp.json()
assert "templates" in data or isinstance(data, list)
class TestSecurityTemplatePlaceholders:
"""Verify placeholder structure for security templates."""
def test_common_placeholders_present(self):
"""Security templates should use standard placeholders."""
common_placeholders = [
"COMPANY_NAME", "GF_NAME", "ISB_NAME",
"DOCUMENT_VERSION", "VERSION_DATE", "NEXT_REVIEW_DATE",
]
row = make_template_row(
"it_security_concept",
content="# IT-Sicherheitskonzept\n{{COMPANY_NAME}} {{ISB_NAME}}"
)
row._mapping["placeholders"] = common_placeholders
mock_db.execute.return_value.fetchone.return_value = row
# Verify the mock has all expected placeholders
assert all(
p in row._mapping["placeholders"]
for p in ["COMPANY_NAME", "GF_NAME", "ISB_NAME"]
)


@@ -222,14 +222,149 @@ The validator (`scripts/validate-controls.py`) checks on every commit:
---
## Control Generator Pipeline
Automatic generation of controls from the entire RAG corpus (170,000+ chunks from laws, regulations, and standards).
### 8-Stage Pipeline
```mermaid
flowchart TD
A[1. RAG Scroll] -->|All chunks| B[2. Prefilter - Local LLM]
B -->|Irrelevant| C[Mark as processed]
B -->|Relevant| D[3. License Classify]
D -->|Rule 1/2| E[4a. Structure - Anthropic]
D -->|Rule 3| F[4b. LLM Reform - Anthropic]
E --> G[5. Harmonization - Embeddings]
F --> G
G -->|Duplicate| H[Store as duplicate]
G -->|New| I[6. Anchor Search]
I --> J[7. Store Control]
J --> K[8. Mark Processed]
```
### Stage 1: RAG Scroll (exhaustive)
Scrolls through **ALL** chunks in every RAG collection via the Qdrant scroll API.
There is no limit: every chunk is processed so that no legal requirement is missed.
Chunks that have already been processed are skipped via their SHA-256 hash (`canonical_processed_chunks`).
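The hash-based skip can be sketched roughly as follows; the function names are illustrative, not the pipeline's actual identifiers, and the real implementation reads the hash set from the `canonical_processed_chunks` table:

```python
import hashlib

def chunk_hash(text: str) -> str:
    """Stable identity for a RAG chunk, independent of collection or point ID."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def should_process(text: str, processed_hashes: set[str]) -> bool:
    """Skip chunks whose SHA-256 hash was already recorded."""
    return chunk_hash(text) not in processed_hashes
```

Hashing the chunk text (rather than the Qdrant point ID) means re-ingested or re-chunked collections do not cause duplicate processing as long as the text is identical.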
### Stage 2: Local LLM Prefilter (Qwen 30B)
**Cost optimization:** before a chunk is sent to the Anthropic API, the local Qwen model (`qwen3:30b-a3b` on the Mac mini) checks whether the chunk contains a concrete requirement.
- **Relevant:** obligations ("must", "shall"), technical measures, data protection requirements
- **Irrelevant:** definitions, tables of contents, terminology sections, transitional provisions
Irrelevant chunks are marked as `prefilter_skip` and are never processed again.
This saves more than 50% of the Anthropic API cost.
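A minimal sketch of such a prefilter against the Ollama HTTP API is shown below. The prompt wording and function names are assumptions; only the endpoint shape (`POST /api/generate` with `stream: false`, answer in the `response` field) follows Ollama's documented API:

```python
import json
import urllib.request

OLLAMA_URL = "http://host.docker.internal:11434"   # see CONTROL_GEN config below
MODEL = "qwen3:30b-a3b"

PROMPT = (
    "Does the following text contain a concrete compliance requirement "
    "(obligation, technical measure, data protection rule)? "
    "Answer with exactly RELEVANT or IRRELEVANT.\n\n{chunk}"
)

def parse_verdict(raw: str) -> bool:
    """Map the model's free-form answer onto a relevant/irrelevant decision."""
    words = raw.strip().upper().split()
    return bool(words) and words[0] == "RELEVANT"

def prefilter(chunk: str, timeout: int = 120) -> bool:
    """Ask the local Qwen model whether the chunk is worth sending to Anthropic."""
    body = json.dumps({
        "model": MODEL,
        "prompt": PROMPT.format(chunk=chunk),
        "stream": False,
    }).encode()
    req = urllib.request.Request(f"{OLLAMA_URL}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_verdict(json.load(resp).get("response", ""))
```

Keeping the verdict parsing separate from the HTTP call makes the decision logic unit-testable without a running Ollama server.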
### Stage 3: License Classification (3-Rule System)
| Rule | License | Original text allowed? | Example |
|-------|--------|-------------------|----------|
| **Rule 1** (free_use) | EU laws, NIST, German laws | Yes | DSGVO, BDSG, NIS2 |
| **Rule 2** (citation_required) | CC-BY, CC-BY-SA | Yes, with citation | OWASP ASVS |
| **Rule 3** (restricted) | Proprietary | No, full reformulation | BSI TR-03161 |
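The gate decision in the table above amounts to a small lookup; the license identifiers here are hypothetical placeholders, not the values used by `license_gate.py`:

```python
# Hypothetical license identifiers illustrating the three gate rules.
FREE_USE = {"eu_law", "de_law", "nist_public_domain"}
CITATION_REQUIRED = {"cc_by", "cc_by_sa"}

def classify_license(license_id: str) -> tuple[str, bool]:
    """Return (rule, original_text_allowed) for a source license."""
    if license_id in FREE_USE:
        return "rule_1", True          # original text may be used freely
    if license_id in CITATION_REQUIRED:
        return "rule_2", True          # original text allowed with citation
    return "rule_3", False             # proprietary: full LLM reformulation
```

Defaulting unknown licenses to Rule 3 is the safe choice: at worst a freely usable text gets reformulated, never the other way around.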
### Stage 4a/4b: Structuring / Reformulation
- **Rule 1+2:** Anthropic structures the original text into the control format (title, objective, requirements)
- **Rule 3:** Anthropic reformulates from scratch: no original text, no source names
### Stage 5: Harmonization (embedding-based)
Uses bge-m3 embeddings (cosine similarity > 0.85) to check whether a similar control already exists.
Embeddings are preloaded in batches (32 texts per request) for maximum throughput.
### Stages 6-8: Anchor Search, Store, Mark Processed
- **Anchor Search:** finds open-source references (OWASP, NIST, ENISA)
- **Store:** persists the control together with `verification_method` and `category`
- **Mark Processed:** marks **EVERY** chunk as processed (including skips, errors, and duplicates)
### Automatic Classification
During generation, the following fields are assigned automatically:
**Verification Method** (how compliance is evidenced):
| Method | Description |
|---------|-------------|
| `code_review` | Verifiable in the source code |
| `document` | Document/process evidence |
| `tool` | Tool-based check |
| `hybrid` | Combination of several methods |
**Category** (17 thematic categories):
encryption, authentication, network, data_protection, logging, incident,
continuity, compliance, supply_chain, physical, personnel, application,
system, risk, governance, hardware, identity
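As a rough illustration of how a verification method could be derived, here is a keyword heuristic. This is purely a sketch: the keywords and the fallback are assumptions, and the actual pipeline assigns these labels during the LLM structuring step rather than by keyword matching:

```python
# Hypothetical keyword hints; the real pipeline asks the LLM for these labels.
METHOD_HINTS = {
    "code_review": ("source code", "implementation", "input validation"),
    "tool":        ("scan", "pentest", "monitoring"),
    "document":    ("policy", "process", "training", "contract"),
}

def classify_verification(text: str) -> str:
    t = text.lower()
    matched = [m for m, hints in METHOD_HINTS.items() if any(h in t for h in hints)]
    if len(matched) > 1:
        return "hybrid"                 # several evidence types apply at once
    return matched[0] if matched else "document"   # document evidence as fallback
```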
### Configuration
| ENV variable | Default | Description |
|-------------|---------|-------------|
| `ANTHROPIC_API_KEY` | — | API key for Anthropic Claude |
| `CONTROL_GEN_ANTHROPIC_MODEL` | `claude-sonnet-4-6` | Anthropic model for wording |
| `OLLAMA_URL` | `http://host.docker.internal:11434` | Local Ollama server (prefilter) |
| `CONTROL_GEN_OLLAMA_MODEL` | `qwen3:30b-a3b` | Local LLM for the prefilter |
| `CONTROL_GEN_LLM_TIMEOUT` | `120` | Timeout in seconds |
### Architecture Decision: Legal References
Controls are derived from two kinds of sources:
1. **Direct legal obligations (Rule 1):** e.g. DSGVO Art. 32 mandates "technical and organizational measures". These controls carry a `source_citation` with the exact legal reference and the original text.
2. **Implicit implementation via best practices (Rule 2/3):** e.g. OWASP ASVS V2.7 requires MFA. That is not a legal obligation in itself, but a best practice for satisfying NIS2 Art. 21 or DSGVO Art. 32. These controls carry open-source references (anchors).
**In the frontend:**
- Rule 1/2 controls show a blue "Gesetzliche Grundlage" (legal basis) box with the law, article, and link
- Rule 3 controls show a note that they implement laws implicitly, with a pointer to the references
### API
```bash
# Start a job (runs in the background)
curl -X POST https://macmini:8002/api/compliance/v1/canonical/generate \
-H 'Content-Type: application/json' \
-H 'X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000' \
-d '{"collections": ["bp_compliance_gesetze"]}'
# Query job status
curl https://macmini:8002/api/compliance/v1/canonical/generate/jobs \
-H 'X-Tenant-ID: 550e8400-e29b-41d4-a716-446655440000'
```
### RAG Collections
| Collection | Contents | Expected rule |
|-----------|---------|----------------|
| `bp_compliance_gesetze` | German laws (BDSG, TTDSG, TKG, etc.) | Rule 1 |
| `bp_compliance_recht` | EU regulations (DSGVO, NIS2, AI Act, etc.) | Rule 1 |
| `bp_compliance_datenschutz` | Data protection guidelines | Rule 1/2 |
| `bp_compliance_ce` | CE/safety standards | Rule 1/2/3 |
| `bp_dsfa_corpus` | DPIA corpus | Rule 1/2 |
| `bp_legal_templates` | Legal templates | Rule 1 |
---
## Files
| File | Type | Description |
|-------|-----|-------------|
| `backend-compliance/migrations/044_canonical_control_library.sql` | SQL | 5 tables + seed data |
| `backend-compliance/migrations/047_verification_method_category.sql` | SQL | verification_method + category fields |
| `backend-compliance/compliance/api/canonical_control_routes.py` | Python | REST API (8+ endpoints) |
| `backend-compliance/compliance/api/control_generator_routes.py` | Python | Generator API (start/status/jobs) |
| `backend-compliance/compliance/services/control_generator.py` | Python | 8-stage pipeline |
| `backend-compliance/compliance/services/license_gate.py` | Python | License gate logic |
| `backend-compliance/compliance/services/similarity_detector.py` | Python | Too-close detector (5 metrics) |
| `backend-compliance/compliance/services/rag_client.py` | Python | RAG client (search + scroll) |
| `ai-compliance-sdk/internal/ucca/legal_rag.go` | Go | RAG search + scroll (Qdrant) |
| `ai-compliance-sdk/internal/api/handlers/rag_handlers.go` | Go | RAG HTTP handlers |
| `ai-compliance-sdk/policies/canonical_controls_v1.json` | JSON | 10 seed controls, 39 open anchors |
| `ai-compliance-sdk/internal/ucca/canonical_control_loader.go` | Go | Control loader with multi-index |
| `admin-compliance/app/sdk/control-library/page.tsx` | TSX | Control Library Browser |