feat(control-generator): 7-stage pipeline for RAG→LLM→Controls generation

Implements the Control Generator Pipeline that systematically generates
canonical security controls from 150k+ RAG chunks across all compliance
collections (BSI, NIST, OWASP, ENISA, EU laws, German laws).

Three license rules enforced throughout:
- Rule 1 (free_use): Laws/Public Domain — original text preserved
- Rule 2 (citation_required): CC-BY/CC-BY-SA — text with citation
- Rule 3 (restricted): BSI/ISO — full reformulation, no source traces
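
The three rules above could be sketched roughly as follows. This is an illustrative TypeScript sketch, not the code in control_generator.py: only the rule labels (free_use, citation_required, restricted) come from this commit message, while the license keys, the `SourceChunk` and `ControlDraft` shapes, the `ruleFor` and `applyRule` helpers, and the fallback to the strictest rule for unknown licenses are all assumptions.

```typescript
// Rule labels are taken from the commit message; everything else is hypothetical.
type LicenseRule = 'free_use' | 'citation_required' | 'restricted'

interface SourceChunk {
  text: string
  license: string      // illustrative values: 'law', 'cc-by', 'bsi', ...
  citation?: string
}

interface ControlDraft {
  rule: LicenseRule
  body: string | null      // null: text must be fully reformulated before use
  citation: string | null
}

// Assumed mapping; the real mapping in the backend may differ.
const LICENSE_RULES: Record<string, LicenseRule> = {
  law: 'free_use',
  public_domain: 'free_use',
  'cc-by': 'citation_required',
  'cc-by-sa': 'citation_required',
  bsi: 'restricted',
  iso: 'restricted',
}

function ruleFor(license: string): LicenseRule {
  // Assumption: unknown licenses fall back to the strictest rule.
  return LICENSE_RULES[license.toLowerCase()] ?? 'restricted'
}

function applyRule(chunk: SourceChunk): ControlDraft {
  const rule = ruleFor(chunk.license)
  switch (rule) {
    case 'free_use':
      // Rule 1: original text may be preserved as-is.
      return { rule, body: chunk.text, citation: null }
    case 'citation_required':
      // Rule 2: text may be kept, but only together with its citation.
      if (!chunk.citation) {
        throw new Error('citation_required source without citation')
      }
      return { rule, body: chunk.text, citation: chunk.citation }
    case 'restricted':
      // Rule 3: no source text or reference is carried over.
      return { rule, body: null, citation: null }
  }
}
```

Falling back to `restricted` for unmapped licenses is a deliberately conservative choice in this sketch: a missing mapping then produces a reformulation task rather than an accidental verbatim copy.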

New files:
- Migration 046: job tracking, chunk tracking, blocked sources tables
- control_generator.py: 7-stage pipeline (scan→classify→structure/reform→harmonize→anchor→store→mark)
- anchor_finder.py: RAG + DuckDuckGo open-source reference search
- control_generator_routes.py: REST API (generate, review, stats, blocked-sources)
- test_control_generator.py: license mapping, rule enforcement, anchor filtering tests
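
For orientation, the stage order listed for control_generator.py can be pictured as a simple sequential runner. The TypeScript sketch below is hypothetical: the stage names follow the commit message (with "structure/reform" written as `structure_or_reform`), but the `Draft` shape, the `StageFn` signature, the `runPipeline` and `passThrough` helpers, and the convention that a stage returns `null` to drop an item are assumptions, not the actual Python implementation.

```typescript
// Stage order from the commit message; all types and helpers are hypothetical.
const STAGES = [
  'scan',
  'classify',
  'structure_or_reform',
  'harmonize',
  'anchor',
  'store',
  'mark',
] as const

type Stage = (typeof STAGES)[number]

interface Draft {
  id: string
  trail: string[]   // names of the stages this item has passed through
}

// Assumption: a stage returns null to drop the item (e.g. a blocked source).
type StageFn = (draft: Draft) => Draft | null

function runPipeline(
  items: Draft[],
  handlers: Record<Stage, StageFn>,
): { results: Draft[]; dropped: number } {
  const results: Draft[] = []
  let dropped = 0
  for (const item of items) {
    let current: Draft | null = item
    for (const stage of STAGES) {
      if (current === null) break          // a previous stage dropped the item
      current = handlers[stage](current)
    }
    if (current === null) dropped += 1
    else results.push(current)
  }
  return { results, dropped }
}

// Helper for experiments: a stage that only records its own name.
function passThrough(stage: Stage): StageFn {
  return (draft) => ({ ...draft, trail: [...draft.trail, stage] })
}
```

Keeping the stage list in one ordered constant means the order is defined in exactly one place, and a dropped item never reaches the later store or mark stages.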

Modified:
- __init__.py: register control_generator_router
- route.ts: proxy generator/review/stats endpoints
- page.tsx: Generator modal, stats panel, state filter, review queue, license badges

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Benjamin Admin
2026-03-13 09:03:37 +01:00
parent c87f07c99a
commit de19ef0684
8 changed files with 2404 additions and 9 deletions

@@ -52,6 +52,45 @@ export async function GET(request: NextRequest) {
      backendPath = '/api/compliance/v1/canonical/licenses'
      break
    // Generator endpoints
    case 'generate-jobs':
      backendPath = '/api/compliance/v1/canonical/generate/jobs'
      break
    case 'generate-status': {
      const jobId = searchParams.get('jobId')
      if (!jobId) {
        return NextResponse.json({ error: 'Missing jobId' }, { status: 400 })
      }
      backendPath = `/api/compliance/v1/canonical/generate/status/${encodeURIComponent(jobId)}`
      break
    }
    case 'review-queue': {
      const state = searchParams.get('release_state') || 'needs_review'
      backendPath = `/api/compliance/v1/canonical/generate/review-queue?release_state=${encodeURIComponent(state)}`
      break
    }
    case 'processed-stats':
      backendPath = '/api/compliance/v1/canonical/generate/processed-stats'
      break
    case 'blocked-sources':
      backendPath = '/api/compliance/v1/canonical/blocked-sources'
      break
    case 'controls-customer': {
      const custSeverity = searchParams.get('severity')
      const custDomain = searchParams.get('domain')
      const custParams = new URLSearchParams()
      if (custSeverity) custParams.set('severity', custSeverity)
      if (custDomain) custParams.set('domain', custDomain)
      const custQs = custParams.toString()
      backendPath = `/api/compliance/v1/canonical/controls-customer${custQs ? `?${custQs}` : ''}`
      break
    }
    default:
      return NextResponse.json({ error: `Unknown endpoint: ${endpoint}` }, { status: 400 })
  }
@@ -95,6 +134,16 @@ export async function POST(request: NextRequest) {
  if (endpoint === 'create-control') {
    backendPath = '/api/compliance/v1/canonical/controls'
  } else if (endpoint === 'generate') {
    backendPath = '/api/compliance/v1/canonical/generate'
  } else if (endpoint === 'review') {
    const controlId = searchParams.get('id')
    if (!controlId) {
      return NextResponse.json({ error: 'Missing control id' }, { status: 400 })
    }
    backendPath = `/api/compliance/v1/canonical/generate/review/${encodeURIComponent(controlId)}`
  } else if (endpoint === 'blocked-sources-cleanup') {
    backendPath = '/api/compliance/v1/canonical/blocked-sources/cleanup'
  } else if (endpoint === 'similarity-check') {
    const controlId = searchParams.get('id')
    if (!controlId) {