Compare commits

...

2 Commits

Author SHA1 Message Date
Benjamin Admin 2e87b74749 feat(audit): P103+P104+P105 Defeat-Device-Heuristik fuer Cookies
CI / detect-changes (push) Successful in 10s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 15s
CI / nodejs-build (push) Successful in 2m35s
CI / test-go (push) Failing after 51s
CI / iace-gt-coverage (push) Successful in 27s
CI / loc-budget (push) Failing after 16s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-python-backend (push) Successful in 39s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Three related stages of 'cookie behaviour differs from what is declared',
analogous to the VW diesel scandal pattern (test bench vs. real-world operation).

P103 (stage 3), cookie_value_entropy.py:
Classifies cookie values as flag/short_id/long_token/uuid/hash/json_blob
via Shannon entropy + regex patterns. If a cookie declared as 'essential'
carries a 64-char Base64 value → MEDIUM finding 'Defeat-Device-Heuristik'.
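
The P103 classification can be illustrated with a minimal sketch. It is written in Go to match the code shown in this diff; the real module is cookie_value_entropy.py, which is not part of the visible hunks. All function names, the regexes, and the thresholds (32 characters, 4.0 bits per character) are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"strings"
)

// Hypothetical patterns; the real module's regexes are not shown in the diff.
var (
	uuidRe = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)
	hexRe  = regexp.MustCompile(`^[0-9a-fA-F]{32,}$`)
)

// shannonEntropy returns the entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	if s == "" {
		return 0
	}
	counts := map[rune]int{}
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	h := 0.0
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// classifyCookieValue maps a cookie value to one of the classes named in
// the commit message. The cut-offs are assumed, not taken from the source.
func classifyCookieValue(v string) string {
	switch {
	case v == "0" || v == "1" || strings.EqualFold(v, "true") || strings.EqualFold(v, "false"):
		return "flag"
	case uuidRe.MatchString(v):
		return "uuid"
	case hexRe.MatchString(v):
		return "hash"
	case strings.HasPrefix(v, "{") && strings.HasSuffix(v, "}"):
		return "json_blob"
	case len(v) >= 32 && shannonEntropy(v) >= 4.0:
		return "long_token"
	default:
		return "short_id"
	}
}

// defeatDeviceFinding returns "MEDIUM" when a cookie declared as essential
// carries a long high-entropy token, and "" otherwise.
func defeatDeviceFinding(declared, value string) string {
	if declared == "essential" && classifyCookieValue(value) == "long_token" {
		return "MEDIUM"
	}
	return ""
}

func main() {
	for _, v := range []string{"1", "123e4567-e89b-12d3-a456-426614174000", "abc123"} {
		fmt.Printf("%q -> %s\n", v, classifyCookieValue(v))
	}
}
```

The order of the cases matters in this sketch: structural patterns (flag, uuid, hash, json_blob) are checked before the entropy test, so a hex digest is not reported as a generic token.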

P104 (stage 4), cookie_network_tracer.py:
Compares the cookie domain with the site's main domain + known tracker vendors
(50 domains mapped: doubleclick.net, facebook.com, demdex.net, omtrdc.net,
adsrvr.org, hotjar.com, ...). If a cookie declared as 'essential' is set
by an external tracker domain → HIGH. Third-country cookies are
flagged as 'DRITTLAND US/CN/...' (a consequence of Schrems II).
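
The P104 domain comparison might look roughly like the sketch below. It is written in Go to match the code shown in this diff; cookie_network_tracer.py itself is not visible here. The vendor map is a five-entry illustrative subset, and the type and function names are hypothetical. Deriving the site's registrable main domain (which would need a public suffix list) is assumed to have happened already.

```go
package main

import (
	"fmt"
	"strings"
)

// trackerVendors is a small illustrative subset of the vendor map; the
// country codes are assumptions for this sketch.
var trackerVendors = map[string]string{
	"doubleclick.net": "US",
	"facebook.com":    "US",
	"demdex.net":      "US",
	"omtrdc.net":      "US",
	"adsrvr.org":      "US",
}

// matchesDomain reports whether host equals domain or is a subdomain of it.
// A leading dot, as found in cookie Domain attributes, is ignored.
func matchesDomain(host, domain string) bool {
	host = strings.ToLower(strings.TrimPrefix(host, "."))
	domain = strings.ToLower(domain)
	return host == domain || strings.HasSuffix(host, "."+domain)
}

// traceResult is a hypothetical result shape for one cookie.
type traceResult struct {
	Severity string // "HIGH" or empty
	Label    string // e.g. "DRITTLAND US" or empty
}

// traceCookie compares the cookie domain with the site domain and the
// known tracker vendors. First-party cookies yield an empty result.
func traceCookie(cookieDomain, siteDomain, declared string) traceResult {
	if matchesDomain(cookieDomain, siteDomain) {
		return traceResult{}
	}
	for vendor, country := range trackerVendors {
		if matchesDomain(cookieDomain, vendor) {
			res := traceResult{Label: "DRITTLAND " + country}
			if declared == "essential" {
				res.Severity = "HIGH"
			}
			return res
		}
	}
	return traceResult{}
}

func main() {
	fmt.Printf("%+v\n", traceCookie(".stats.doubleclick.net", "example.com", "essential"))
	fmt.Printf("%+v\n", traceCookie("shop.example.com", "example.com", "essential"))
}
```

Matching on a dot-prefixed suffix rather than a plain suffix keeps a host such as notdoubleclick.net from being attributed to doubleclick.net.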

P105 (stage 5), tcf_vendor_authority.py:
The ingest endpoint POST /api/compliance/agent/admin/tcf-ingest fetches the
IAB TCF v2 Global Vendor List (vendor-list.consensu.org/v3) and upserts
it into cookie_library with source='iab_tcf_v2'. cross_reference_with_tcf
fuzzy-matches cmp_vendors against the TCF list: if a vendor is listed in TCF as
marketing but the site says 'Funktional' (functional) → HIGH (an external
authority contradicts the declaration).
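
The commit message does not say how the fuzzy match in cross_reference_with_tcf works. The sketch below shows one plausible approach, normalised-name containment, purely as an assumption. It is written in Go to match the code shown in this diff; the tcfVendor type, the legal-suffix list, the minimum key length of 4, and the "functional" label are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tcfVendor is a hypothetical, reduced view of one vendor list entry.
type tcfVendor struct {
	Name      string
	Marketing bool // true when the list assigns marketing purposes
}

// normalizeVendor lowercases a name, splits on anything that is not a
// letter or digit, drops a few legal-form suffixes, and joins the rest.
func normalizeVendor(name string) string {
	fields := strings.FieldsFunc(strings.ToLower(name), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	var kept []string
	for _, f := range fields {
		switch f {
		case "inc", "ltd", "llc", "gmbh", "ag", "sa", "bv":
			continue
		}
		kept = append(kept, f)
	}
	return strings.Join(kept, "")
}

// crossReference returns "HIGH" when the first matching list entry is a
// marketing vendor while the site declares the vendor as functional.
func crossReference(cmpVendor, declared string, list []tcfVendor) string {
	key := normalizeVendor(cmpVendor)
	if len(key) < 4 {
		return "" // too short to match reliably
	}
	for _, v := range list {
		n := normalizeVendor(v.Name)
		if len(n) < 4 {
			continue
		}
		if strings.Contains(n, key) || strings.Contains(key, n) {
			if v.Marketing && declared == "functional" {
				return "HIGH"
			}
			return ""
		}
	}
	return ""
}

func main() {
	list := []tcfVendor{{Name: "Example Ads GmbH", Marketing: true}}
	fmt.Println(crossReference("example-ads", "functional", list))
}
```

Containment matching is deliberately loose and can produce false positives for short or generic names, which is why the sketch skips keys shorter than four characters.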

All three render their own mail blocks in the Cookies section (after
cookie_audit_html, before library_mismatch_html).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 00:24:07 +02:00
Benjamin Admin 94233b7c66 feat(iace): LLM gap-review (Task #7+#8) + tech-file sources appendix (#29)
Three coupled pieces of work, all landing the same PoC:

1. Backend gap-review endpoint (Task #7)
   - internal/api/handlers/iace_handler_gap_review.go:
       POST /projects/:id/llm-gap-review
       feeds Limits-Form + current hazards + current mitigations to
       the configured LLM (Qwen / Claude / OpenAI via ProviderRegistry),
       parses a JSON suggestion list, filters it and stamps a default
       confidence, and falls back to a static checklist when the LLM
       is unavailable.
   - Adopt step is NOT in this endpoint by design — the user clicks
     Adopt in the frontend which calls the existing CreateHazard /
     CreateMitigation handlers so provenance flows through the normal
     audit trail.
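
A caller of the gap-review endpoint could look like the following sketch. The URL path is the one the frontend modal in this diff uses; the reduced response structs, the fetchGapReview helper, and the stub server are hypothetical and exist only to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Reduced, hypothetical mirrors of the wire format shown in the handler.
type gapSuggestion struct {
	Kind       string `json:"kind"`
	Title      string `json:"title"`
	Confidence string `json:"confidence,omitempty"`
}

type gapReviewResponse struct {
	ProjectID   string          `json:"project_id"`
	Source      string          `json:"source"`
	Model       string          `json:"model,omitempty"`
	Suggestions []gapSuggestion `json:"suggestions"`
}

// fetchGapReview POSTs to the gap-review endpoint and decodes the reply.
func fetchGapReview(client *http.Client, baseURL, projectID string) (*gapReviewResponse, error) {
	url := fmt.Sprintf("%s/api/sdk/v1/iace/projects/%s/llm-gap-review", baseURL, projectID)
	resp, err := client.Post(url, "application/json", nil)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("gap review: unexpected status %d", resp.StatusCode)
	}
	var out gapReviewResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("gap review: decode: %w", err)
	}
	return &out, nil
}

// newStubServer stands in for the real backend and always answers with
// the static fallback, as the handler does when no LLM is available.
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(gapReviewResponse{
			ProjectID:   "demo",
			Source:      "fallback_static",
			Suggestions: []gapSuggestion{{Kind: "hazard", Title: "Beispiel", Confidence: "low"}},
		})
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()
	res, err := fetchGapReview(srv.Client(), srv.URL, "demo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(res.Source, len(res.Suggestions))
}
```

Callers should branch on the source field: a fallback_static reply means the suggestions come from the fixed checklist, not from a model.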

2. Frontend modal + button (Task #8)
   - app/sdk/iace/[projectId]/hazards/_components/LLMGapReviewModal.tsx:
       reusable modal that POSTs the gap-review endpoint, renders
       suggestions with Adopt/Reject UX, shows confidence + norm refs,
       source-stamp llm_gap_review vs fallback_static.
   - hazards/page.tsx: indigo "KI-Gap-Review" button next to the
     existing "Eigene Gefaehrdung" button + modal mount.

3. Tech-File sources appendix (Task #29, stage 4)
   - internal/iace/document_export_sources.go: new pdfSourcesAppendix
     method appended to ExportPDF. Groups cited norms by license rule
     (R1 OSHA/EU law / R3 BreakPilot patterns / R3 DIN-EN-ISO
     identifier-only) and emits the legally required statement that
     blanket notices in the Impressum (legal notice) are not sufficient.
   - extractCitedNorms() scans hazard/mitigation text for EN/ISO/IEC/
     DIN identifiers in a narrow grammar so prose isn't turned into
     spurious citations.
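
What a "narrow grammar" for norm identifiers could look like is sketched below. document_export_sources.go is not among the visible hunks, so the regex, the accepted prefixes, and the extractNormIDs name are assumptions, not the committed implementation of extractCitedNorms.

```go
package main

import (
	"fmt"
	"regexp"
)

// normRe accepts an upper-case issuer prefix followed by a 3 to 5 digit
// number and an optional part suffix such as "-1". Longer prefixes come
// first because Go's regexp prefers the leftmost alternative.
var normRe = regexp.MustCompile(`\b(?:DIN EN ISO|EN ISO|DIN EN|EN IEC|ISO|IEC|EN|DIN)\s+\d{3,5}(?:-\d{1,2})?\b`)

// extractNormIDs returns the distinct norm identifiers found in text,
// in order of first appearance.
func extractNormIDs(text string) []string {
	seen := map[string]bool{}
	var out []string
	for _, m := range normRe.FindAllString(text, -1) {
		if !seen[m] {
			seen[m] = true
			out = append(out, m)
		}
	}
	return out
}

func main() {
	text := "Schutzeinrichtung nach EN ISO 12100 und EN 13849-1; erneut EN ISO 12100. Die Enten 12 zaehlen nicht."
	for _, id := range extractNormIDs(text) {
		fmt.Println(id)
	}
}
```

Requiring an upper-case prefix and at least three digits is what keeps ordinary prose, such as "Enten 12" in the sample text, from being reported as a citation.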

Bonus refactor:
   - internal/app/routes.go reached the 500-LOC hard cap when the new
     llm-gap-review route was added. Extracted registerIACERoutes into
     routes_iace.go (136 LOC). Same wiring, no behaviour change.

Three of the four Attribution-Renderer stages (1, 2, 4) now produce
real output. Stage 3 already ships as <SourceBadge> + <LicenseModuleBanner>
(commits dfac940 + b9e3eea earlier in this branch).

The PoC is intentionally conservative: every LLM suggestion stays
non-binding until a human clicks Adopt, and Adopt is meant to go through
the existing CreateHazard/CreateMitigation flow (not yet wired in
this commit; that is a separate iteration). The endpoint, modal and provenance
chain are in place for the next iteration to wire Adopt → write path.
2026-05-22 00:21:49 +02:00
11 changed files with 1466 additions and 111 deletions
@@ -0,0 +1,218 @@
'use client'
// LLM Gap-Review Modal — Task #8.
//
// Triggers POST /projects/:id/llm-gap-review on mount and lists the
// LLM's gap suggestions with an Adopt / Reject UX. Adoption goes through
// the regular CreateHazard / CreateMitigation endpoints — the modal
// itself never mutates project state on its own.
import { useEffect, useState } from 'react'
type Suggestion = {
kind: 'hazard' | 'mitigation'
title: string
description: string
category?: string
hazard_ref?: string
pattern_ref?: string
norm_refs?: string[]
confidence?: 'high' | 'medium' | 'low'
rationale?: string
}
type Response = {
project_id: string
source: 'llm_gap_review' | 'fallback_static'
model?: string
suggestions: Suggestion[]
input_summary: {
hazard_count: number
mitigation_count: number
limits_form_fields: number
}
}
const CONF_COLOR: Record<string, string> = {
high: 'bg-emerald-100 text-emerald-800 border-emerald-200',
medium: 'bg-amber-100 text-amber-800 border-amber-200',
low: 'bg-slate-100 text-slate-600 border-slate-200',
}
interface Props {
projectId: string
onClose: () => void
onAdoptHazard?: (s: Suggestion) => Promise<void>
onAdoptMitigation?: (s: Suggestion) => Promise<void>
}
export function LLMGapReviewModal({ projectId, onClose, onAdoptHazard, onAdoptMitigation }: Props) {
const [data, setData] = useState<Response | null>(null)
const [loading, setLoading] = useState(true)
const [error, setError] = useState<string | null>(null)
const [adopted, setAdopted] = useState<Set<number>>(new Set())
const [rejected, setRejected] = useState<Set<number>>(new Set())
const [adopting, setAdopting] = useState<number | null>(null)
useEffect(() => {
setLoading(true)
fetch(`/api/sdk/v1/iace/projects/${projectId}/llm-gap-review`, { method: 'POST' })
.then((r) => (r.ok ? r.json() : Promise.reject(`HTTP ${r.status}`)))
.then(setData)
.catch((e) => setError(String(e)))
.finally(() => setLoading(false))
}, [projectId])
async function adopt(idx: number) {
if (!data) return
const s = data.suggestions[idx]
// Without a handler nothing is persisted, so the row must not be shown as adopted.
const handler =
s.kind === 'hazard' ? onAdoptHazard : s.kind === 'mitigation' ? onAdoptMitigation : undefined
if (!handler) {
setError('Adopt fehlgeschlagen: kein Handler fuer diesen Vorschlag konfiguriert.')
return
}
setAdopting(idx)
try {
await handler(s)
setAdopted((prev) => new Set(prev).add(idx))
} catch (e) {
setError(`Adopt fehlgeschlagen: ${e}`)
} finally {
setAdopting(null)
}
}
function reject(idx: number) {
setRejected((prev) => new Set(prev).add(idx))
}
return (
<div className="fixed inset-0 z-50 flex items-center justify-center bg-black/50">
<div className="bg-white rounded-xl shadow-2xl w-full max-w-3xl max-h-[90vh] overflow-hidden flex flex-col">
<div className="px-6 py-4 border-b border-gray-200 flex items-center justify-between flex-shrink-0">
<div>
<h2 className="text-lg font-semibold text-gray-900">KI-Gap-Review</h2>
<p className="text-xs text-gray-500 mt-0.5">
LLM-gestuetzte Suche nach fehlenden Gefaehrdungen und Schutzmassnahmen. Vorschlaege sind unverbindlich bis explizit uebernommen.
</p>
</div>
<button onClick={onClose} className="text-gray-400 hover:text-gray-600 text-2xl leading-none">&times;</button>
</div>
<div className="flex-1 overflow-y-auto p-6 space-y-3">
{loading && (
<div className="text-center py-12">
<div className="animate-spin rounded-full h-10 w-10 border-b-2 border-purple-600 mx-auto" />
<p className="text-sm text-gray-500 mt-3">LLM laeuft (Qwen/Claude). Das kann bis zu 30 Sekunden dauern.</p>
</div>
)}
{error && (
<div className="bg-red-50 border border-red-200 rounded-lg p-4 text-sm text-red-700">
Fehler: {error}
</div>
)}
{data && (
<>
<div className="text-xs text-gray-500 flex items-center gap-3 border-b border-gray-100 pb-2">
<span>
Eingabe: {data.input_summary.hazard_count} Gefaehrdungen,{' '}
{data.input_summary.mitigation_count} Massnahmen, {data.input_summary.limits_form_fields} Grenzen-Felder
</span>
<span className="text-gray-300">·</span>
<span>
Quelle: {data.source === 'llm_gap_review'
? `LLM (${data.model ?? 'unbekannt'})`
: 'Statische Fallback-Liste'}
</span>
</div>
{data.suggestions.length === 0 && (
<div className="text-center text-gray-500 py-12 text-sm">
Keine Lueckenvorschlaege. Die deterministische Pattern-Engine hat vermutlich bereits alle Standard-Gefaehrdungen abgedeckt.
</div>
)}
{data.suggestions.map((s, i) => {
const isAdopted = adopted.has(i)
const isRejected = rejected.has(i)
const isWorking = adopting === i
return (
<div
key={i}
className={`border rounded-lg p-3 ${
isAdopted ? 'border-emerald-200 bg-emerald-50' :
isRejected ? 'border-slate-200 bg-slate-50 opacity-50' :
'border-gray-200 bg-white'
}`}
>
<div className="flex items-start justify-between gap-3">
<div className="flex-1 min-w-0">
<div className="flex items-center gap-2 flex-wrap mb-1">
<span className={`px-1.5 py-0.5 text-[10px] rounded font-medium ${
s.kind === 'hazard' ? 'bg-red-100 text-red-700' : 'bg-blue-100 text-blue-700'
}`}>
{s.kind === 'hazard' ? 'Gefaehrdung' : 'Massnahme'}
</span>
{s.category && (
<span className="px-1.5 py-0.5 text-[10px] rounded bg-gray-100 text-gray-700">{s.category}</span>
)}
{s.confidence && (
<span className={`px-1.5 py-0.5 text-[10px] rounded border ${CONF_COLOR[s.confidence]}`}>
{s.confidence}
</span>
)}
{(s.norm_refs ?? []).map((n) => (
<span key={n} className="px-1.5 py-0.5 text-[10px] rounded bg-indigo-50 text-indigo-700 font-mono">{n}</span>
))}
{s.pattern_ref && (
<span className="px-1.5 py-0.5 text-[10px] rounded bg-purple-50 text-purple-700 font-mono">{s.pattern_ref}</span>
)}
</div>
<h3 className="text-sm font-semibold text-gray-900">{s.title}</h3>
<p className="text-xs text-gray-600 mt-1">{s.description}</p>
{s.hazard_ref && (
<p className="text-[11px] text-gray-500 mt-1">Bezogen auf: <em>{s.hazard_ref}</em></p>
)}
{s.rationale && (
<p className="text-[11px] text-gray-400 mt-1 italic">{s.rationale}</p>
)}
</div>
<div className="flex flex-col gap-1 flex-shrink-0">
{!isAdopted && !isRejected && (
<>
<button
onClick={() => adopt(i)}
disabled={isWorking}
className="px-3 py-1 text-xs bg-emerald-600 text-white rounded hover:bg-emerald-700 disabled:opacity-50"
>
{isWorking ? '…' : 'Uebernehmen'}
</button>
<button
onClick={() => reject(i)}
className="px-3 py-1 text-xs text-gray-600 border border-gray-300 rounded hover:bg-gray-50"
>
Verwerfen
</button>
</>
)}
{isAdopted && <span className="text-xs text-emerald-700 font-medium"> Uebernommen</span>}
{isRejected && <span className="text-xs text-gray-500">Verworfen</span>}
</div>
</div>
</div>
)
})}
</>
)}
</div>
<div className="px-6 py-3 border-t border-gray-200 bg-gray-50 flex items-center justify-between flex-shrink-0">
<p className="text-[11px] text-gray-500">
Hinweis: LLM-Vorschlaege sind NICHT die deterministische Engine-Output. Jede Uebernahme wird als <code>source=llm_gap_review</code> markiert.
</p>
<button onClick={onClose} className="px-3 py-1.5 text-sm border border-gray-300 rounded hover:bg-white">
Schliessen
</button>
</div>
</div>
</div>
)
}
export default LLMGapReviewModal
@@ -12,6 +12,7 @@ import type { ResidualFilter } from './_components/ResidualRiskPanel'
import { LibraryModal } from './_components/LibraryModal'
import { AutoSuggestPanel } from './_components/AutoSuggestPanel'
import { CustomHazardModal } from './_components/CustomHazardModal'
import { LLMGapReviewModal } from './_components/LLMGapReviewModal'
import { useHazards } from './_hooks/useHazards'
type ViewMode = 'list' | 'risk' | 'blocks'
@@ -22,6 +23,7 @@ export default function HazardsPage() {
const h = useHazards(projectId)
const [view, setView] = useState<ViewMode>('risk')
const [showCustomModal, setShowCustomModal] = useState(false)
const [showGapReview, setShowGapReview] = useState(false)
const [residualFilter, setResidualFilter] = useState<ResidualFilter>('all')
const [decisions, setDecisions] = useState<Record<string, boolean | null>>({})
@@ -104,6 +106,15 @@ export default function HazardsPage() {
</svg>
Eigene Gefaehrdung
</button>
<button
onClick={() => setShowGapReview(true)}
title="LLM (Qwen/Claude) prueft auf fehlende Gefaehrdungen und Massnahmen — Vorschlaege sind unverbindlich."
className="flex items-center gap-2 px-3 py-2 border border-indigo-300 text-indigo-700 rounded-lg hover:bg-indigo-50 transition-colors text-sm">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9.663 17h4.673M12 3v1m6.364 1.636l-.707.707M21 12h-1M4 12H3m3.343-5.657l-.707-.707m2.828 9.9a5 5 0 117.072 0l-.548.547A3.374 3.374 0 0014 18.469V19a2 2 0 11-4 0v-.531c0-.895-.356-1.754-.988-2.386l-.548-.547z" />
</svg>
KI-Gap-Review
</button>
<button onClick={() => h.setShowForm(true)}
className="flex items-center gap-2 px-4 py-2 bg-purple-600 text-white rounded-lg hover:bg-purple-700 transition-colors text-sm">
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor">
@@ -170,6 +181,13 @@ export default function HazardsPage() {
onClose={() => setShowCustomModal(false)} />
)}
{showGapReview && (
<LLMGapReviewModal
projectId={projectId}
onClose={() => setShowGapReview(false)}
/>
)}
{h.hazards.length > 0 ? (
view === 'risk' ? (
<>
@@ -0,0 +1,288 @@
package handlers
// LLM Gap-Review handler — Task #7.
//
// After the deterministic Pattern-Engine has generated hazards and
// mitigations for an IACE project, this endpoint asks a configured LLM
// (Qwen / Claude / OpenAI) to spot what the engine MISSED. The LLM is
// fed the Limits-Form, the current hazard and mitigation headlines
// (capped at 25 each), and the allowed categories and reference formats;
// it returns a list of suggested additional hazards or mitigations.
//
// Important guardrails:
// - Suggestions without a title or kind are dropped. A suggestion that
// names neither a pattern_id nor a norm identifier is not rejected; it
// defaults to confidence "low" when the LLM supplied no confidence.
// - The response is provenance-tagged source="llm_gap_review" so
// the frontend renders an Adopt/Reject UX rather than committing.
// - Engine output (deterministic patterns) is never overwritten by
// LLM output; the gap-review is a SUPPLEMENT, not a replacement.
import (
"context"
"encoding/json"
"fmt"
"net/http"
"strings"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
"github.com/breakpilot/ai-compliance-sdk/internal/iace"
"github.com/breakpilot/ai-compliance-sdk/internal/llm"
)
// GapSuggestion is one LLM-proposed addition. Each suggestion is
// non-binding until the user adopts it via the frontend.
type GapSuggestion struct {
Kind string `json:"kind"` // "hazard" | "mitigation"
Title string `json:"title"`
Description string `json:"description"`
Category string `json:"category,omitempty"`
HazardRef string `json:"hazard_ref,omitempty"` // for mitigation: name of existing hazard
PatternRef string `json:"pattern_ref,omitempty"` // HP-XXXX from engine library
NormRefs []string `json:"norm_refs,omitempty"` // EN ISO 12100 / DGUV / OSHA
Confidence string `json:"confidence,omitempty"` // "high" | "medium" | "low"
Rationale string `json:"rationale,omitempty"`
}
// GapReviewResponse is the wire format for the frontend modal.
type GapReviewResponse struct {
ProjectID string `json:"project_id"`
Source string `json:"source"` // "llm_gap_review" | "fallback_static"
Model string `json:"model,omitempty"`
Suggestions []GapSuggestion `json:"suggestions"`
InputSummary struct {
HazardCount int `json:"hazard_count"`
MitigationCount int `json:"mitigation_count"`
LimitsFormFields int `json:"limits_form_fields"`
} `json:"input_summary"`
}
// LLMGapReview handles POST /projects/:id/llm-gap-review.
//
// The endpoint is intentionally read-only: repeated calls do not mutate
// project state, although the suggestions may differ between calls. The
// Adopt step (user-driven) is what changes data, via the existing
// CreateHazard / CreateMitigation handlers.
func (h *IACEHandler) LLMGapReview(c *gin.Context) {
projectID, err := uuid.Parse(c.Param("id"))
if err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid project id"})
return
}
ctx := c.Request.Context()
project, err := h.store.GetProject(ctx, projectID)
if err != nil {
c.JSON(http.StatusNotFound, gin.H{"error": "project not found"})
return
}
hazards, err := h.store.ListHazards(ctx, projectID)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "list hazards: " + err.Error()})
return
}
mitigations, err := h.store.ListMitigationsByProject(ctx, projectID)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "list mitigations: " + err.Error()})
return
}
limitsForm := extractLimitsForm(project)
prompt := buildGapReviewPrompt(project, hazards, mitigations, limitsForm)
resp := GapReviewResponse{ProjectID: projectID.String()}
resp.InputSummary.HazardCount = len(hazards)
resp.InputSummary.MitigationCount = len(mitigations)
resp.InputSummary.LimitsFormFields = countLimitsFields(limitsForm)
suggestions, model, err := callLLMForGapReview(ctx, h.llmRegistry, prompt)
if err != nil {
resp.Source = "fallback_static"
resp.Suggestions = staticFallbackSuggestions(hazards)
c.JSON(http.StatusOK, resp)
return
}
resp.Source = "llm_gap_review"
resp.Model = model
resp.Suggestions = filterAndProvenance(suggestions)
c.JSON(http.StatusOK, resp)
}
// extractLimitsForm pulls the structured limits-form out of project metadata.
func extractLimitsForm(p *iace.Project) map[string]any {
if len(p.Metadata) == 0 {
return nil
}
var md map[string]any
if err := json.Unmarshal(p.Metadata, &md); err != nil {
return nil
}
lf, _ := md["limits_form"].(map[string]any)
return lf
}
func countLimitsFields(lf map[string]any) int {
n := 0
for _, v := range lf {
if s, ok := v.(string); ok && strings.TrimSpace(s) != "" {
n++
} else if arr, ok := v.([]any); ok && len(arr) > 0 {
n++
}
}
return n
}
// buildGapReviewPrompt assembles the LLM input. Kept compact — the LLM
// only needs the limits-form context, the current hazard headlines, and
// a reminder of the pattern-id naming so its suggestions can be linked
// back to engine output later.
func buildGapReviewPrompt(p *iace.Project, hz []iace.Hazard, mt []iace.Mitigation, lf map[string]any) string {
var sb strings.Builder
sb.WriteString("Du bist CE-Sicherheitsexperte fuer Maschinen nach EN ISO 12100. ")
sb.WriteString("Analysiere die folgende Risikobeurteilung und identifiziere FEHLENDE ")
sb.WriteString("Gefaehrdungen oder Schutzmassnahmen, die ein erfahrener Auditor ergaenzen wuerde.\n\n")
sb.WriteString(fmt.Sprintf("Maschine: %s (Typ: %s, Hersteller: %s)\n",
p.MachineName, p.MachineType, p.Manufacturer))
if p.CEMarkingTarget != "" {
sb.WriteString(fmt.Sprintf("CE-Ziel: %s\n", p.CEMarkingTarget))
}
sb.WriteString("\nGrenzen-Form (Limits & Verwendung):\n")
for k, v := range lf {
sb.WriteString(fmt.Sprintf("- %s: %v\n", k, truncForPrompt(v, 200)))
}
sb.WriteString(fmt.Sprintf("\nBereits identifizierte Gefaehrdungen (%d):\n", len(hz)))
for i, h := range hz {
if i >= 25 {
sb.WriteString(fmt.Sprintf("... und %d weitere\n", len(hz)-25))
break
}
sb.WriteString(fmt.Sprintf("- [%s] %s\n", h.Category, h.Name))
}
sb.WriteString(fmt.Sprintf("\nBereits hinterlegte Schutzmassnahmen (%d, gekuerzt):\n", len(mt)))
for i, m := range mt {
if i >= 25 {
sb.WriteString(fmt.Sprintf("... und %d weitere\n", len(mt)-25))
break
}
sb.WriteString(fmt.Sprintf("- [%s] %s\n", m.ReductionType, m.Name))
}
sb.WriteString("\nAufgabe: Liste max. 8 LUECKEN als JSON-Array. Jede Luecke MUSS einer der folgenden Kategorien entsprechen ")
sb.WriteString("und SOLL eine Norm- oder Pattern-Referenz nennen (HP-XXXX, EN ISO 12100, EN 13849, EN 13855, DGUV-Info, OSHA 29 CFR).\n")
sb.WriteString("Kategorien: mechanical_hazard, electrical_hazard, thermal_hazard, noise_vibration, ergonomic, ")
sb.WriteString("material_environmental, pneumatic_hydraulic, radiation_hazard.\n\n")
sb.WriteString(`Antworte NUR mit JSON, keine Erklaerung:
[
{"kind":"hazard","title":"...","description":"...","category":"...","norm_refs":["EN ISO 12100"],"confidence":"high","rationale":"..."},
{"kind":"mitigation","title":"...","description":"...","hazard_ref":"Name der bestehenden Gefahr","norm_refs":["DGUV 209-072"],"confidence":"medium","rationale":"..."}
]`)
return sb.String()
}
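// truncForPrompt renders v as text and cuts it to at most maxRunes runes.
// Counting runes instead of bytes avoids splitting a multi-byte UTF-8
// character (for example an umlaut) in the middle.
func truncForPrompt(v any, maxRunes int) string {
r := []rune(fmt.Sprintf("%v", v))
if len(r) <= maxRunes {
return string(r)
}
return string(r[:maxRunes]) + "…"
}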
// callLLMForGapReview sends the prompt and parses the JSON suggestion list.
func callLLMForGapReview(ctx context.Context, registry *llm.ProviderRegistry, prompt string) ([]GapSuggestion, string, error) {
if registry == nil {
return nil, "", fmt.Errorf("no LLM registry configured")
}
provider, err := registry.GetAvailable(ctx)
if err != nil {
return nil, "", fmt.Errorf("no LLM provider available: %w", err)
}
resp, err := provider.Chat(ctx, &llm.ChatRequest{
Messages: []llm.Message{{Role: "user", Content: prompt}},
Temperature: 0.25,
MaxTokens: 2000,
})
if err != nil {
return nil, "", fmt.Errorf("llm chat: %w", err)
}
body := strings.TrimSpace(resp.Message.Content)
// LLMs occasionally wrap JSON in ```json … ``` fences; strip them.
body = strings.TrimPrefix(body, "```json")
body = strings.TrimPrefix(body, "```")
body = strings.TrimSuffix(body, "```")
body = strings.TrimSpace(body)
// Find first '[' so any leading prose is ignored.
if i := strings.Index(body, "["); i > 0 {
body = body[i:]
}
var out []GapSuggestion
if err := json.Unmarshal([]byte(body), &out); err != nil {
return nil, "", fmt.Errorf("parse llm response: %w (body=%.200s)", err, body)
}
return out, provider.Name(), nil
}
// filterAndProvenance drops suggestions without a title or kind and gives
// every survivor that lacks a `confidence` a default: "low" when it names
// no norm or pattern reference, "medium" otherwise. A confidence value
// supplied by the LLM is kept as-is.
func filterAndProvenance(in []GapSuggestion) []GapSuggestion {
out := make([]GapSuggestion, 0, len(in))
for _, s := range in {
if strings.TrimSpace(s.Title) == "" || s.Kind == "" {
continue
}
if s.Confidence == "" {
if len(s.NormRefs) == 0 && s.PatternRef == "" {
s.Confidence = "low"
} else {
s.Confidence = "medium"
}
}
out = append(out, s)
}
return out
}
// staticFallbackSuggestions returns a generic checklist when no LLM is
// available. Conservative, all confidence="low".
func staticFallbackSuggestions(hz []iace.Hazard) []GapSuggestion {
hasMechanical := false
for _, h := range hz {
if strings.Contains(h.Category, "mechanical") {
hasMechanical = true
break
}
}
out := []GapSuggestion{
{
Kind: "hazard", Title: "Fuss-Quetschung unter absenkendem Werkstueck/Hubeinheit",
Description: "Wenn die Maschine eine Hubbewegung ausfuehrt, pruefe ob Fuesse/Beine im Verfahrbereich gequetscht werden koennen.",
Category: "mechanical_hazard", NormRefs: []string{"EN ISO 12100 6.3.5.5"},
Confidence: "low", Rationale: "Static checklist fallback — LLM nicht verfuegbar.",
},
{
Kind: "hazard", Title: "Hand-Quetschung gegen feste Strukturen beim Hochfahren",
Description: "Pruefe Mindestabstand zu festen Strukturen oberhalb der hoechsten Hubposition.",
Category: "mechanical_hazard", NormRefs: []string{"EN ISO 13854"},
Confidence: "low",
},
{
Kind: "mitigation", Title: "Kriechgeschwindigkeit am Endanschlag (Hubgeraete)",
Description: "Hubgeschwindigkeit am Ende der Verfahrbewegung auf <=15 mm/s reduzieren.",
NormRefs: []string{"OSHA 29 CFR 1910.217 (Hand-Speed-Konstante)"},
Confidence: "low",
},
}
if !hasMechanical {
// Trim if not a mechanical context
out = out[:1]
}
return out
}
@@ -355,117 +355,6 @@ func registerWhistleblowerRoutes(v1 *gin.RouterGroup, h *handlers.WhistleblowerH
}
}
func registerIACERoutes(v1 *gin.RouterGroup, h *handlers.IACEHandler) {
iaceRoutes := v1.Group("/iace")
{
iaceRoutes.GET("/hazard-library", h.ListHazardLibrary)
iaceRoutes.GET("/controls-library", h.ListControlsLibrary)
iaceRoutes.GET("/norms-library", h.ListNormsLibrary)
iaceRoutes.GET("/lifecycle-phases", h.ListLifecyclePhases)
iaceRoutes.GET("/roles", h.ListRoles)
iaceRoutes.GET("/evidence-types", h.ListEvidenceTypes)
iaceRoutes.GET("/protective-measures-library", h.ListProtectiveMeasures)
iaceRoutes.GET("/failure-modes", h.ListFailureModes)
iaceRoutes.GET("/operational-states", h.ListOperationalStates)
iaceRoutes.GET("/component-library", h.ListComponentLibrary)
iaceRoutes.GET("/energy-sources", h.ListEnergySources)
iaceRoutes.GET("/tags", h.ListTags)
iaceRoutes.GET("/hazard-patterns", h.ListHazardPatterns)
iaceRoutes.POST("/projects", h.CreateProject)
iaceRoutes.GET("/projects", h.ListProjects)
iaceRoutes.GET("/projects/:id", h.GetProject)
iaceRoutes.PUT("/projects/:id", h.UpdateProject)
iaceRoutes.DELETE("/projects/:id", h.ArchiveProject)
iaceRoutes.POST("/projects/:id/init-from-profile", h.InitFromProfile)
iaceRoutes.POST("/projects/:id/variants", h.CreateVariant)
iaceRoutes.GET("/projects/:id/variants", h.ListVariants)
iaceRoutes.GET("/projects/:id/variant-gap", h.GetVariantGap)
iaceRoutes.POST("/projects/:id/completeness-check", h.CheckCompleteness)
iaceRoutes.POST("/projects/:id/components", h.CreateComponent)
iaceRoutes.GET("/projects/:id/components", h.ListComponents)
iaceRoutes.PUT("/projects/:id/components/:cid", h.UpdateComponent)
iaceRoutes.DELETE("/projects/:id/components/:cid", h.DeleteComponent)
iaceRoutes.POST("/projects/:id/classify", h.Classify)
iaceRoutes.GET("/projects/:id/classifications", h.GetClassifications)
iaceRoutes.POST("/projects/:id/classify/:regulation", h.ClassifySingle)
iaceRoutes.POST("/projects/:id/hazards", h.CreateHazard)
iaceRoutes.GET("/projects/:id/hazards", h.ListHazards)
iaceRoutes.PUT("/projects/:id/hazards/:hid", h.UpdateHazard)
iaceRoutes.POST("/projects/:id/hazards/suggest", h.SuggestHazards)
iaceRoutes.POST("/projects/:id/match-patterns", h.MatchPatterns)
iaceRoutes.POST("/projects/:id/parse-narrative", h.ParseNarrative)
iaceRoutes.POST("/projects/:id/delta-analysis", h.DeltaAnalysis)
iaceRoutes.GET("/projects/:id/fmea/export", h.ExportFMEA)
iaceRoutes.POST("/projects/:id/components/:cid/suggest-fms", h.SuggestFailureModes)
iaceRoutes.POST("/projects/:id/apply-patterns", h.ApplyPatternResults)
iaceRoutes.POST("/projects/:id/hazards/:hid/suggest-measures", h.SuggestMeasuresForHazard)
iaceRoutes.POST("/projects/:id/mitigations/:mid/suggest-evidence", h.SuggestEvidenceForMitigation)
iaceRoutes.POST("/projects/:id/hazards/:hid/assess", h.AssessRisk)
iaceRoutes.GET("/projects/:id/risk-summary", h.GetRiskSummary)
iaceRoutes.GET("/projects/:id/suggested-norms", h.SuggestProjectNorms)
iaceRoutes.POST("/projects/:id/hazards/:hid/reassess", h.ReassessRisk)
iaceRoutes.GET("/projects/:id/mitigations", h.ListProjectMitigations)
iaceRoutes.POST("/projects/:id/hazards/:hid/mitigations", h.CreateMitigation)
iaceRoutes.DELETE("/projects/:id/mitigations/:mid", h.DeleteMitigation)
iaceRoutes.PUT("/mitigations/:mid", h.UpdateMitigation)
iaceRoutes.POST("/mitigations/:mid/verify", h.VerifyMitigation)
iaceRoutes.POST("/projects/:id/validate-mitigation-hierarchy", h.ValidateMitigationHierarchy)
iaceRoutes.POST("/projects/:id/evidence", h.UploadEvidence)
iaceRoutes.GET("/projects/:id/evidence", h.ListEvidence)
iaceRoutes.POST("/projects/:id/verification-plan", h.CreateVerificationPlan)
iaceRoutes.PUT("/verification-plan/:vid", h.UpdateVerificationPlan)
iaceRoutes.POST("/verification-plan/:vid/complete", h.CompleteVerification)
iaceRoutes.GET("/projects/:id/verifications", h.ListVerificationPlans)
iaceRoutes.POST("/projects/:id/verifications", h.CreateVerificationAlias)
iaceRoutes.DELETE("/projects/:id/verifications/:vid", h.DeleteVerificationPlan)
iaceRoutes.POST("/projects/:id/verifications/:vid/complete", h.CompleteVerificationAlias)
iaceRoutes.POST("/projects/:id/tech-file/generate", h.GenerateTechFile)
iaceRoutes.GET("/projects/:id/tech-file", h.ListTechFileSections)
iaceRoutes.PUT("/projects/:id/tech-file/:section", h.UpdateTechFileSection)
iaceRoutes.POST("/projects/:id/tech-file/:section/approve", h.ApproveTechFileSection)
iaceRoutes.POST("/projects/:id/tech-file/:section/generate", h.GenerateSingleSection)
iaceRoutes.GET("/projects/:id/tech-file/export", h.ExportTechFile)
iaceRoutes.POST("/projects/:id/monitoring", h.CreateMonitoringEvent)
iaceRoutes.GET("/projects/:id/monitoring", h.ListMonitoringEvents)
iaceRoutes.PUT("/projects/:id/monitoring/:eid", h.UpdateMonitoringEvent)
iaceRoutes.GET("/projects/:id/audit-trail", h.GetAuditTrail)
iaceRoutes.POST("/library-search", h.SearchLibrary)
iaceRoutes.GET("/ce-corpus-documents", h.ListCECorpusDocuments)
iaceRoutes.POST("/projects/:id/initialize", h.InitializeProject)
iaceRoutes.GET("/projects/:id/hazard-blocks", h.GetHazardBlocks)
iaceRoutes.POST("/projects/:id/benchmark/import-gt", h.ImportGroundTruth)
iaceRoutes.GET("/projects/:id/benchmark", h.RunBenchmark)
iaceRoutes.GET("/projects/:id/benchmark/summary", h.GetBenchmarkSummary)
iaceRoutes.GET("/projects/:id/hazards/:hid/regulatory-hints", h.EnrichHazardWithRegulations)
iaceRoutes.GET("/projects/:id/mitigations/:mid/regulatory-hints", h.EnrichMitigationWithRegulations)
iaceRoutes.GET("/projects/:id/regulatory-hints", h.EnrichProjectHazardsBatch)
iaceRoutes.POST("/projects/:id/tech-file/:section/enrich", h.EnrichTechFileSection)
// Production Lines
iaceRoutes.POST("/production-lines", h.CreateProductionLine)
iaceRoutes.GET("/production-lines", h.ListProductionLines)
iaceRoutes.GET("/production-lines/:lid/dashboard", h.GetProductionLineDashboard)
iaceRoutes.POST("/production-lines/:lid/stations", h.AddStationToLine)
iaceRoutes.DELETE("/production-lines/:lid/stations/:sid", h.RemoveStationFromLine)
// CE x Compliance Crossover
iaceRoutes.GET("/projects/:id/compliance-triggers", h.GetComplianceTriggers)
iaceRoutes.GET("/compliance-faq", h.GetComplianceFAQ)
// Clarifications — aggregated open questions per project
iaceRoutes.GET("/projects/:id/clarifications", h.ListClarifications)
iaceRoutes.GET("/projects/:id/clarifications.csv", h.ExportClarificationsCSV)
iaceRoutes.GET("/projects/:id/clarifications.html", h.ExportClarificationsHTML)
iaceRoutes.GET("/projects/:id/clarifications/:cid/detail", h.ListClarificationDetail)
iaceRoutes.POST("/projects/:id/clarifications/:cid/answer", h.AnswerClarification)
iaceRoutes.POST("/projects/:id/clarifications/:cid/comment", h.PostClarificationComment)
// Customer-Standard Reuse (migration 031): pull reusable mitigations
// across prior projects of the same customer.
iaceRoutes.GET("/projects/:id/customer-standards", h.ListCustomerStandardSuggestions)
iaceRoutes.POST("/projects/:id/customer-standards/import", h.ImportCustomerStandardSuggestion)
}
}
func registerMaximizerRoutes(v1 *gin.RouterGroup, h *handlers.MaximizerHandlers) {
m := v1.Group("/maximizer")
@@ -0,0 +1,136 @@
package app
// IACE route registration extracted from routes.go (2026-05-21) because
// routes.go hit the 500-LOC hard cap when the LLM gap-review endpoint
// (Task #7) was added. Splitting keeps every routes file under the cap
// without changing behaviour — `registerRoutes` in routes.go still
// invokes `registerIACERoutes` exactly once at the same point in the
// startup sequence.
import (
"github.com/breakpilot/ai-compliance-sdk/internal/api/handlers"
"github.com/gin-gonic/gin"
)
func registerIACERoutes(v1 *gin.RouterGroup, h *handlers.IACEHandler) {
iaceRoutes := v1.Group("/iace")
{
// Library catalogues (read-only reference data).
iaceRoutes.GET("/hazard-library", h.ListHazardLibrary)
iaceRoutes.GET("/controls-library", h.ListControlsLibrary)
iaceRoutes.GET("/norms-library", h.ListNormsLibrary)
iaceRoutes.GET("/lifecycle-phases", h.ListLifecyclePhases)
iaceRoutes.GET("/roles", h.ListRoles)
iaceRoutes.GET("/evidence-types", h.ListEvidenceTypes)
iaceRoutes.GET("/protective-measures-library", h.ListProtectiveMeasures)
iaceRoutes.GET("/failure-modes", h.ListFailureModes)
iaceRoutes.GET("/operational-states", h.ListOperationalStates)
iaceRoutes.GET("/component-library", h.ListComponentLibrary)
iaceRoutes.GET("/energy-sources", h.ListEnergySources)
iaceRoutes.GET("/tags", h.ListTags)
iaceRoutes.GET("/hazard-patterns", h.ListHazardPatterns)
// Project CRUD.
iaceRoutes.POST("/projects", h.CreateProject)
iaceRoutes.GET("/projects", h.ListProjects)
iaceRoutes.GET("/projects/:id", h.GetProject)
iaceRoutes.PUT("/projects/:id", h.UpdateProject)
iaceRoutes.DELETE("/projects/:id", h.ArchiveProject)
iaceRoutes.POST("/projects/:id/init-from-profile", h.InitFromProfile)
iaceRoutes.POST("/projects/:id/variants", h.CreateVariant)
iaceRoutes.GET("/projects/:id/variants", h.ListVariants)
iaceRoutes.GET("/projects/:id/variant-gap", h.GetVariantGap)
iaceRoutes.POST("/projects/:id/completeness-check", h.CheckCompleteness)
// Components.
iaceRoutes.POST("/projects/:id/components", h.CreateComponent)
iaceRoutes.GET("/projects/:id/components", h.ListComponents)
iaceRoutes.PUT("/projects/:id/components/:cid", h.UpdateComponent)
iaceRoutes.DELETE("/projects/:id/components/:cid", h.DeleteComponent)
// Classification + hazards.
iaceRoutes.POST("/projects/:id/classify", h.Classify)
iaceRoutes.GET("/projects/:id/classifications", h.GetClassifications)
iaceRoutes.POST("/projects/:id/classify/:regulation", h.ClassifySingle)
iaceRoutes.POST("/projects/:id/hazards", h.CreateHazard)
iaceRoutes.GET("/projects/:id/hazards", h.ListHazards)
iaceRoutes.PUT("/projects/:id/hazards/:hid", h.UpdateHazard)
iaceRoutes.POST("/projects/:id/hazards/suggest", h.SuggestHazards)
iaceRoutes.POST("/projects/:id/match-patterns", h.MatchPatterns)
iaceRoutes.POST("/projects/:id/parse-narrative", h.ParseNarrative)
iaceRoutes.POST("/projects/:id/delta-analysis", h.DeltaAnalysis)
iaceRoutes.POST("/projects/:id/llm-gap-review", h.LLMGapReview)
iaceRoutes.GET("/projects/:id/fmea/export", h.ExportFMEA)
iaceRoutes.POST("/projects/:id/components/:cid/suggest-fms", h.SuggestFailureModes)
iaceRoutes.POST("/projects/:id/apply-patterns", h.ApplyPatternResults)
iaceRoutes.POST("/projects/:id/hazards/:hid/suggest-measures", h.SuggestMeasuresForHazard)
iaceRoutes.POST("/projects/:id/mitigations/:mid/suggest-evidence", h.SuggestEvidenceForMitigation)
iaceRoutes.POST("/projects/:id/hazards/:hid/assess", h.AssessRisk)
iaceRoutes.GET("/projects/:id/risk-summary", h.GetRiskSummary)
iaceRoutes.GET("/projects/:id/suggested-norms", h.SuggestProjectNorms)
iaceRoutes.POST("/projects/:id/hazards/:hid/reassess", h.ReassessRisk)
// Mitigations + evidence + verification.
iaceRoutes.GET("/projects/:id/mitigations", h.ListProjectMitigations)
iaceRoutes.POST("/projects/:id/hazards/:hid/mitigations", h.CreateMitigation)
iaceRoutes.DELETE("/projects/:id/mitigations/:mid", h.DeleteMitigation)
iaceRoutes.PUT("/mitigations/:mid", h.UpdateMitigation)
iaceRoutes.POST("/mitigations/:mid/verify", h.VerifyMitigation)
iaceRoutes.POST("/projects/:id/validate-mitigation-hierarchy", h.ValidateMitigationHierarchy)
iaceRoutes.POST("/projects/:id/evidence", h.UploadEvidence)
iaceRoutes.GET("/projects/:id/evidence", h.ListEvidence)
iaceRoutes.POST("/projects/:id/verification-plan", h.CreateVerificationPlan)
iaceRoutes.PUT("/verification-plan/:vid", h.UpdateVerificationPlan)
iaceRoutes.POST("/verification-plan/:vid/complete", h.CompleteVerification)
iaceRoutes.GET("/projects/:id/verifications", h.ListVerificationPlans)
iaceRoutes.POST("/projects/:id/verifications", h.CreateVerificationAlias)
iaceRoutes.DELETE("/projects/:id/verifications/:vid", h.DeleteVerificationPlan)
iaceRoutes.POST("/projects/:id/verifications/:vid/complete", h.CompleteVerificationAlias)
// Tech file + monitoring + audit.
iaceRoutes.POST("/projects/:id/tech-file/generate", h.GenerateTechFile)
iaceRoutes.GET("/projects/:id/tech-file", h.ListTechFileSections)
iaceRoutes.PUT("/projects/:id/tech-file/:section", h.UpdateTechFileSection)
iaceRoutes.POST("/projects/:id/tech-file/:section/approve", h.ApproveTechFileSection)
iaceRoutes.POST("/projects/:id/tech-file/:section/generate", h.GenerateSingleSection)
iaceRoutes.GET("/projects/:id/tech-file/export", h.ExportTechFile)
iaceRoutes.POST("/projects/:id/monitoring", h.CreateMonitoringEvent)
iaceRoutes.GET("/projects/:id/monitoring", h.ListMonitoringEvents)
iaceRoutes.PUT("/projects/:id/monitoring/:eid", h.UpdateMonitoringEvent)
iaceRoutes.GET("/projects/:id/audit-trail", h.GetAuditTrail)
// Library + corpus + benchmark.
iaceRoutes.POST("/library-search", h.SearchLibrary)
iaceRoutes.GET("/ce-corpus-documents", h.ListCECorpusDocuments)
iaceRoutes.POST("/projects/:id/initialize", h.InitializeProject)
iaceRoutes.GET("/projects/:id/hazard-blocks", h.GetHazardBlocks)
iaceRoutes.POST("/projects/:id/benchmark/import-gt", h.ImportGroundTruth)
iaceRoutes.GET("/projects/:id/benchmark", h.RunBenchmark)
iaceRoutes.GET("/projects/:id/benchmark/summary", h.GetBenchmarkSummary)
// Regulatory enrichment.
iaceRoutes.GET("/projects/:id/hazards/:hid/regulatory-hints", h.EnrichHazardWithRegulations)
iaceRoutes.GET("/projects/:id/mitigations/:mid/regulatory-hints", h.EnrichMitigationWithRegulations)
iaceRoutes.GET("/projects/:id/regulatory-hints", h.EnrichProjectHazardsBatch)
iaceRoutes.POST("/projects/:id/tech-file/:section/enrich", h.EnrichTechFileSection)
// Production lines.
iaceRoutes.POST("/production-lines", h.CreateProductionLine)
iaceRoutes.GET("/production-lines", h.ListProductionLines)
iaceRoutes.GET("/production-lines/:lid/dashboard", h.GetProductionLineDashboard)
iaceRoutes.POST("/production-lines/:lid/stations", h.AddStationToLine)
iaceRoutes.DELETE("/production-lines/:lid/stations/:sid", h.RemoveStationFromLine)
// CE x Compliance crossover + clarifications + customer standards.
iaceRoutes.GET("/projects/:id/compliance-triggers", h.GetComplianceTriggers)
iaceRoutes.GET("/compliance-faq", h.GetComplianceFAQ)
iaceRoutes.GET("/projects/:id/clarifications", h.ListClarifications)
iaceRoutes.GET("/projects/:id/clarifications.csv", h.ExportClarificationsCSV)
iaceRoutes.GET("/projects/:id/clarifications.html", h.ExportClarificationsHTML)
iaceRoutes.GET("/projects/:id/clarifications/:cid/detail", h.ListClarificationDetail)
iaceRoutes.POST("/projects/:id/clarifications/:cid/answer", h.AnswerClarification)
iaceRoutes.POST("/projects/:id/clarifications/:cid/comment", h.PostClarificationComment)
iaceRoutes.GET("/projects/:id/customer-standards", h.ListCustomerStandardSuggestions)
iaceRoutes.POST("/projects/:id/customer-standards/import", h.ImportCustomerStandardSuggestion)
}
}
@@ -81,6 +81,10 @@ func (e *DocumentExporter) ExportPDF(
e.pdfClassifications(pdf, classifications)
}
// --- Sources & licenses (stage 4 of the attribution renderer, Task #29) ---
pdf.AddPage()
e.pdfSourcesAppendix(pdf, hazards, mitigations)
// --- Footer on every page ---
pdf.SetFooterFunc(func() {
pdf.SetY(-15)
@@ -0,0 +1,134 @@
package iace
// Sources & Licenses appendix for the IACE Tech-File PDF export.
// Stage 4 of the Attribution Renderer (Task #29).
//
// The IACE engine generates hazards from BreakPilot Pattern-IDs that
// themselves cite ISO 12100, EN 13849, EN ISO 13855 etc. Those norm
// identifiers are R3 (DIN/EN copyright, cited by identifier only). The
// pattern-engine output itself is also R3 (BreakPilot own work). OSHA values
// surfaced via the minimum-distance library are R1 (US federal public domain).
//
// This appendix aggregates what the Tech-File ACTUALLY cited, grouped by
// license rule, and closes with the mandatory note that a blanket notice in
// the terms of service or imprint cannot replace this per-export attribution.
import (
"sort"
"strings"
"github.com/jung-kurt/gofpdf"
)
// pdfSourcesAppendix renders the "Quellen & Lizenzen" appendix page.
// Called by ExportPDF after the regulatory classifications block.
func (e *DocumentExporter) pdfSourcesAppendix(pdf *gofpdf.Fpdf, hazards []Hazard, mitigations []Mitigation) {
pdf.SetFont("Helvetica", "B", 14)
pdf.SetTextColor(124, 58, 237)
pdf.CellFormat(0, 10, "Quellen und Lizenzen", "", 1, "L", false, 0, "")
pdf.Ln(2)
pdf.SetFont("Helvetica", "", 9)
pdf.SetTextColor(80, 80, 80)
intro := "Diese Risikobeurteilung verwendet die deterministische BreakPilot IACE " +
"Pattern-Engine sowie zitierte Sicherheitsnormen. Die folgende Aufstellung " +
"listet die konkret in diesem Dokument zitierten Quellen mit ihrer Lizenzregel."
pdf.MultiCell(0, 5, intro, "", "L", false)
pdf.Ln(3)
pdf.SetFont("Helvetica", "B", 10)
pdf.SetTextColor(0, 0, 0)
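// Core fonts such as Helvetica are cp1252-encoded in gofpdf; keep literals ASCII unless they pass through a translator.
pdf.CellFormat(0, 7, "R3 - BreakPilot Pattern-Engine (Eigenwerk, Identifier-Verweis)", "", 1, "L", false, 0, "")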
pdf.SetFont("Helvetica", "", 9)
pdf.SetTextColor(60, 60, 60)
pdf.MultiCell(0, 5,
"Alle in diesem Dokument referenzierten HP-XXXX-Identifier stammen aus der "+
"BreakPilot IACE Pattern-Library (Eigenwerk). Keine externe Lizenz-Attribution "+
"erforderlich.", "", "L", false)
pdf.Ln(3)
norms := extractCitedNorms(hazards, mitigations)
if len(norms) > 0 {
pdf.SetFont("Helvetica", "B", 10)
pdf.SetTextColor(0, 0, 0)
pdf.CellFormat(0, 7, "R3 - Sicherheitsnormen (DIN/EN/ISO/IEC, Identifier-Verweis)", "", 1, "L", false, 0, "")
pdf.SetFont("Helvetica", "", 9)
pdf.SetTextColor(60, 60, 60)
pdf.MultiCell(0, 5,
"DIN-/EN-/ISO-/IEC-Normen unterliegen dem Urheberrecht der jeweiligen "+
"Normungsorganisation. In diesem Dokument werden Normen ausschliesslich "+
"als Identifier (Norm-Nummer und Abschnitt) zitiert; kein Volltext aus "+
"diesen Normen wurde reproduziert. Konkret zitiert:", "", "L", false)
pdf.Ln(1)
for _, n := range norms {
pdf.CellFormat(0, 5, " - "+n, "", 1, "L", false, 0, "")
}
pdf.Ln(2)
}
pdf.SetFont("Helvetica", "B", 10)
pdf.SetTextColor(0, 0, 0)
pdf.CellFormat(0, 7, "R1 - Hoheitsrecht / Public Domain (woertlich uebernehmbar)", "", 1, "L", false, 0, "")
pdf.SetFont("Helvetica", "", 9)
pdf.SetTextColor(60, 60, 60)
pdf.MultiCell(0, 5,
"Soweit Werte aus US Federal Code (OSHA 29 CFR Subpart O) oder EU-Recht "+
"(Maschinenverordnung 2023/1230, AI Act 2024/1689) referenziert werden, "+
"sind diese als R1 woertlich uebernehmbar. Keine Attribution-Pflicht.", "", "L", false)
pdf.Ln(4)
pdf.SetFont("Helvetica", "I", 8)
pdf.SetTextColor(120, 120, 120)
pdf.MultiCell(0, 4,
"Hinweis: Pauschalvermerke in AGB oder Impressum reichen rechtlich nicht - "+
"die werknahe Attribution erfolgt durch diese Quellenseite. Vollstaendiges "+
"Quellenverzeichnis aller im BreakPilot-System verwendeten Quellen siehe "+
"/sdk/licenses im Web-Frontend.", "", "L", false)
}
// extractCitedNorms scans hazard and mitigation text fields for
// recognised norm identifiers. The detection is intentionally narrow:
// only well-known prefixes (EN/ISO/IEC/DIN) and only when followed by
// digits, so free-form prose is not turned into spurious citations.
func extractCitedNorms(hz []Hazard, mt []Mitigation) []string {
seen := make(map[string]bool)
consider := func(s string) {
// Split on whitespace, list punctuation and parentheses, so that
// "(ISO 12100)" yields "ISO 12100" without a trailing ")".
fields := strings.FieldsFunc(s, func(r rune) bool {
return r == ' ' || r == ',' || r == ';' || r == '\n' || r == '\t' || r == '(' || r == ')'
})
for i := 0; i < len(fields)-1; i++ {
head := strings.ToUpper(strings.TrimSpace(fields[i]))
next := strings.TrimSpace(fields[i+1])
if !(head == "EN" || head == "ISO" || head == "IEC" || head == "DIN") {
continue
}
if next == "" {
continue
}
// Accept "ISO 12100", "EN 13849-1", "DIN EN 60204-1" etc.
if next[0] >= '0' && next[0] <= '9' {
seen[head+" "+next] = true
} else if head == "DIN" && (strings.HasPrefix(strings.ToUpper(next), "EN") || strings.HasPrefix(strings.ToUpper(next), "ISO")) && i+2 < len(fields) {
third := strings.TrimSpace(fields[i+2])
if third != "" && third[0] >= '0' && third[0] <= '9' {
seen[head+" "+next+" "+third] = true
}
}
}
}
for _, h := range hz {
consider(h.Description)
consider(h.Scenario)
consider(h.PossibleHarm)
}
for _, m := range mt {
consider(m.Description)
consider(m.Name)
}
out := make([]string, 0, len(seen))
for k := range seen {
out = append(out, k)
}
sort.Strings(out)
return out
}
@@ -207,6 +207,22 @@ async def get_snapshot(snapshot_id: str):
db.close()
@router.post("/admin/tcf-ingest")
async def tcf_ingest():
    """P105: ingest or refresh the IAB TCF vendor list.
    Idempotent: fetches the current GVL and upserts it into
    compliance.cookie_library with source='iab_tcf_v2'. Calling it a few
    times per year is sufficient."""
    from database import SessionLocal
    from compliance.services.tcf_vendor_authority import (
        fetch_and_ingest_tcf_vendors,
    )
    db = SessionLocal()
    try:
        return await fetch_and_ingest_tcf_vendors(db)
    finally:
        db.close()
@router.get("/snapshots/{snapshot_id}/pdf")
async def export_snapshot_pdf(snapshot_id: str):
"""P88: PDF export of the audit mail. Returns application/pdf."""
@@ -1285,6 +1301,53 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
except Exception as e:
logger.warning("Scope-disclaimer block skipped: %s", e)
# P103 + P104: cookie value entropy + network tracing (stages 3 + 4)
entropy_html = ""
network_trace_html = ""
try:
from compliance.services.cookie_value_entropy import (
check_cookies_for_entropy_mismatch, build_entropy_block_html,
)
from compliance.services.cookie_network_tracer import (
trace_cookie_network, build_network_trace_block_html,
)
cookies_detailed = (banner_result or {}).get("cookies_detailed") or []
entropy_findings = check_cookies_for_entropy_mismatch(cookies_detailed)
if entropy_findings:
entropy_html = build_entropy_block_html(entropy_findings)
logger.info("P103 Entropy: %d Findings", len(entropy_findings))
primary_url = ""
for e_ in doc_entries:
if e_.get("url"):
primary_url = e_["url"]
break
net_findings = trace_cookie_network(cookies_detailed, primary_url)
if net_findings:
network_trace_html = build_network_trace_block_html(net_findings)
logger.info("P104 Network-Trace: %d Findings", len(net_findings))
except Exception as e:
logger.warning("P103/P104 entropy/network-trace skipped: %s", e)
# P105: IAB TCF authority cross-reference (stage 5)
tcf_authority_html = ""
try:
from compliance.services.tcf_vendor_authority import (
cross_reference_with_tcf, build_tcf_authority_block_html,
)
from database import SessionLocal as _SLtcf
_tcf_db = _SLtcf()
try:
tcf_findings = cross_reference_with_tcf(_tcf_db, cmp_vendors)
if tcf_findings:
tcf_authority_html = build_tcf_authority_block_html(tcf_findings)
logger.info(
"TCF-Authority: %d Vendor-Discrepancies gefunden",
len(tcf_findings),
)
finally:
_tcf_db.close()
except Exception as e:
logger.warning("TCF-Authority-Check skipped: %s", e)
# COOKIE COMPLIANCE AUDIT (three-source comparison). This is the
# central USP: declared in the policy vs actually loaded in the
# browser vs library match.
@@ -1524,6 +1587,9 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
+ scorecard_html + redundancy_html
+ providers_html + banner_deep_html
+ cookie_audit_html
+ tcf_authority_html
+ entropy_html
+ network_trace_html
+ library_mismatch_html
+ consistency_html + signals_html + solutions_html
+ jc_decision_html
@@ -0,0 +1,216 @@
"""
P104 cookie network tracing (stage 4).
cookies_detailed[i].domain shows which domain set the cookie via Set-Cookie.
We compare:
  * site main domain vs cookie domain   → first-party / third-party
  * cookie domain vs known vendors      → who the real recipient is
  * vendor country vs EU/third country  → third-country transfer notice
Defeat-device pattern: a cookie is declared "functional" but is set by
doubleclick.net → it is in fact a third-party tracking cookie, not a
functional first-party cookie.
"""
from __future__ import annotations
import logging
from urllib.parse import urlparse
logger = logging.getLogger(__name__)
# Vendor domain → known vendor + country
_DOMAIN_VENDORS: dict[str, tuple[str, str]] = {
".doubleclick.net": ("Google DoubleClick", "US"),
".google.com": ("Google", "US"),
".google-analytics.com": ("Google Analytics", "US"),
".googletagmanager.com": ("Google Tag Manager", "US"),
".googleadservices.com": ("Google Ads", "US"),
".gstatic.com": ("Google CDN", "US"),
".facebook.com": ("Meta / Facebook", "US"),
".facebook.net": ("Meta / Facebook", "US"),
".instagram.com": ("Meta / Instagram", "US"),
".linkedin.com": ("LinkedIn (Microsoft)", "US"),
".pinterest.com": ("Pinterest", "US"),
".pinimg.com": ("Pinterest", "US"),
".tiktok.com": ("TikTok (ByteDance)", "CN"),
".bing.com": ("Microsoft Bing", "US"),
".clarity.ms": ("Microsoft Clarity", "US"),
".criteo.com": ("Criteo", "FR"),
".adnxs.com": ("AppNexus / Xandr", "US"),
".rubiconproject.com": ("Rubicon Project", "US"),
".pubmatic.com": ("PubMatic", "US"),
".adobedtm.com": ("Adobe DTM", "US"),
".adobetarget.com": ("Adobe Target", "US"),
".demdex.net": ("Adobe Experience Cloud", "US"),
".omtrdc.net": ("Adobe Analytics", "US"),
".everesttech.net": ("Adobe Advertising Cloud", "US"),
".2o7.net": ("Adobe Analytics", "US"),
".adform.net": ("AdForm", "DK"),
".trade-desk.com": ("The Trade Desk", "US"),
".tradedesk.com": ("The Trade Desk", "US"),
".adsrvr.org": ("The Trade Desk", "US"),
".hotjar.com": ("Hotjar", "MT"),
".matomo.cloud": ("Matomo", "DE"),
".etracker.com": ("etracker", "DE"),
".etracker.de": ("etracker", "DE"),
".cloudflare.com": ("Cloudflare", "US"),
".cookielaw.org": ("OneTrust", "US"),
".cookiebot.com": ("Cookiebot (Cybot)", "DK"),
".usercentrics.eu": ("Usercentrics", "DE"),
".usercentrics.com": ("Usercentrics", "DE"),
".consensu.org": ("IAB Europe TCF", "BE"),
".datadoghq.eu": ("Datadog", "US"),
".datadoghq.com": ("Datadog", "US"),
".datadome.co": ("DataDome", "FR"),
".incapsula.com": ("Imperva Incapsula", "US"),
".imperva.com": ("Imperva", "US"),
".akamai.net": ("Akamai", "US"),
".akamaiedge.net": ("Akamai", "US"),
".salesforce.com": ("Salesforce", "US"),
".force.com": ("Salesforce", "US"),
}
_NON_EU_COUNTRIES = {"US", "CN", "RU", "IN", "JP", "BR", "AU"}
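def _registrable_domain(host: str) -> str:
    """Return the last two labels of a host, e.g. vw.de for www.vw.de,
    bla.vw.de or vw.de. Multi-label public suffixes such as co.uk are not
    handled."""
    h = (host or "").lstrip(".").lower()
    parts = h.split(".")
    if len(parts) >= 2:
        return ".".join(parts[-2:])
    return h
def _lookup_vendor_by_domain(cookie_domain: str) -> tuple[str, str] | None:
    """Suffix-match a cookie domain against _DOMAIN_VENDORS."""
    if not cookie_domain:
        return None
    cd = cookie_domain.lower()
    # Normalise to a leading dot so "doubleclick.net" and
    # ".ads.doubleclick.net" both match the ".doubleclick.net" key.
    if not cd.startswith("."):
        cd = "." + cd
    for suffix, (vendor, country) in _DOMAIN_VENDORS.items():
        if cd.endswith(suffix):
            return (vendor, country)
    return None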
def trace_cookie_network(
cookies_detailed: list[dict] | None,
site_url: str | None = None,
) -> list[dict]:
"""Returns findings for cookies that are set by an external or
third-country domain while being declared as first-party / essential."""
if not cookies_detailed:
return []
site_host = ""
if site_url:
try:
# .hostname drops any port or userinfo that .netloc would keep.
site_host = _registrable_domain(urlparse(site_url).hostname or "")
except Exception:
site_host = ""
out: list[dict] = []
for ck in cookies_detailed:
if not isinstance(ck, dict):
continue
name = (ck.get("name") or "").strip()
domain = (ck.get("domain") or "").strip()
declared = (ck.get("declared_category") or "").lower().strip()
if not name or not domain:
continue
cookie_reg = _registrable_domain(domain)
is_third_party = bool(site_host and cookie_reg != site_host)
vendor_match = _lookup_vendor_by_domain(domain)
if not vendor_match and not is_third_party:
continue
# Defeat-Device-Pattern: essential/functional + Third-Party
if declared in ("essential", "functional", "necessary") and is_third_party:
sev = "HIGH" if vendor_match else "MEDIUM"
vendor_name = vendor_match[0] if vendor_match else cookie_reg
country = vendor_match[1] if vendor_match else ""
third_country = country in _NON_EU_COUNTRIES
out.append({
"cookie": name,
"declared": declared,
"cookie_domain": domain,
"site_domain": site_host,
"vendor": vendor_name,
"vendor_country": country,
"third_country": third_country,
"severity": sev,
"label": (
f"Cookie '{name}' deklariert als '{declared}', "
f"wird aber von externer Domain "
f"<strong>{vendor_name}</strong> "
f"({domain}) gesetzt"
+ (f" — Drittland: {country}" if third_country else "")
),
})
elif vendor_match and declared in ("essential", "functional", "necessary"):
# Not classified as third-party (site URL unknown, or the site
# itself is on the vendor's domain), but the cookie domain still
# belongs to a known tracker vendor → mismatch
out.append({
"cookie": name,
"declared": declared,
"cookie_domain": domain,
"vendor": vendor_match[0],
"vendor_country": vendor_match[1],
"third_country": vendor_match[1] in _NON_EU_COUNTRIES,
"severity": "MEDIUM",
"label": (
f"Cookie '{name}' deklariert als '{declared}', "
f"Domain {domain} gehoert aber zu "
f"<strong>{vendor_match[0]}</strong> "
f"({vendor_match[1]})"
),
})
return out
def build_network_trace_block_html(findings: list[dict]) -> str:
if not findings:
return ""
n_third = sum(1 for f in findings if f.get("third_country"))
items: list[str] = []
for f in findings[:30]:
sev_color = "#dc2626" if f["severity"] == "HIGH" else "#d97706"
country_flag = ""
if f.get("third_country"):
country_flag = (
f' <span style="background:#fee2e2;color:#991b1b;'
f'padding:1px 5px;border-radius:8px;font-size:9px;'
f'font-weight:600">DRITTLAND {f.get("vendor_country","")}</span>'
)
items.append(
f'<li style="margin-bottom:6px;font-size:11px;line-height:1.5;'
f'color:{sev_color}">{f["label"]}{country_flag}</li>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fff7ed;border:1px solid #fed7aa;border-radius:8px">'
'<div style="font-size:11px;color:#9a3412;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Cookie-Netzwerk-Verhalten (Defeat-Device-Heuristik)</div>'
f'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
f'{len(findings)} Cookie{"s" if len(findings) != 1 else ""} '
f'mit Vendor-Domain-Diskrepanz'
f'{f" — davon {n_third} mit Drittland-Transfer" if n_third else ""}'
f'</h3>'
'<p style="margin:0 0 10px;font-size:11px;color:#475569;line-height:1.5">'
'Diese Cookies sind als "essential" oder "funktional" deklariert, '
'werden aber von einer externen Domain gesetzt — typisch fuer '
'getarnte Tracker. Drittland-Markierungen sind besonders kritisch: '
'sie loesen Pflichten nach Art. 44-49 DSGVO aus (SCC / Angemessen-'
'heitsbeschluss / Schrems II Folge-Massnahmen).'
'</p>'
'<ul style="margin:0 0 0 18px;padding:0">'
+ "".join(items) +
'</ul></div>'
)
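The two-label heuristic in `_registrable_domain` works for hosts such as www.vw.de, but under a multi-label public suffix (for example co.uk) it returns the suffix itself, so two unrelated sites would compare as the same site and a third-party cookie could be missed. A minimal sketch of the difference, assuming a tiny hand-written suffix set: `_MULTI_LABEL_SUFFIXES` and both function names are hypothetical and not part of this change, and a real check would consult the full Public Suffix List.

```python
# Hypothetical, deliberately tiny suffix set for illustration only.
_MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def naive_registrable(host: str) -> str:
    """Last two labels, mirroring the heuristic used in the diff."""
    parts = (host or "").lstrip(".").lower().split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else parts[0]


def suffix_aware_registrable(host: str) -> str:
    """Keep three labels when the last two form a known public suffix."""
    parts = (host or "").lstrip(".").lower().split(".")
    if len(parts) >= 3 and ".".join(parts[-2:]) in _MULTI_LABEL_SUFFIXES:
        return ".".join(parts[-3:])
    return ".".join(parts[-2:]) if len(parts) >= 2 else parts[0]


site = "shop.example.co.uk"
cookie_domain = ".tracker.ads.co.uk"

# Naive: both collapse to "co.uk", so the cookie looks first-party.
print(naive_registrable(site) == naive_registrable(cookie_domain))
# Suffix-aware: "example.co.uk" vs "ads.co.uk", so it is third-party.
print(suffix_aware_registrable(site) == suffix_aware_registrable(cookie_domain))
```

For .de, .com and similar single-label suffixes both variants agree, so the heuristic in the diff is only affected for sites under multi-label suffixes.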
@@ -0,0 +1,148 @@
"""
P103 cookie value entropy check (stage 3).
Assesses whether the cookie value fits the declared category:
  * "functional" + 2-char value ('1', 'de')  → consistent (flag)
  * "functional" + 64-char Base64            → INCONSISTENT (tracking-ID pattern)
  * "marketing" + 32+ char hash              → consistent
  * "marketing" + 2-char value               → consistent (boolean opt-out)
Defeat-device pattern: the site declares "functional" to bypass consent,
but the value looks like a pseudonymised tracking ID.
"""
from __future__ import annotations
import logging
import math
import re
logger = logging.getLogger(__name__)
def _shannon_entropy(s: str) -> float:
if not s:
return 0.0
from collections import Counter
n = len(s)
counts = Counter(s)
return -sum((c / n) * math.log2(c / n) for c in counts.values())
_BASE64_RE = re.compile(r"^[A-Za-z0-9+/=_-]{20,}$")
_HEX_RE = re.compile(r"^[a-fA-F0-9]{16,}$")
_UUID_RE = re.compile(
r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
_FLAG_VALUES = {"0", "1", "true", "false", "yes", "no",
"de", "en", "de-de", "en-us", "fr-fr",
"accept", "deny", "essential", "on", "off"}
def _classify_value_shape(value: str) -> str:
    """Returns one of: 'flag', 'short_id', 'long_token', 'uuid', 'hash',
    'json_blob'. Checks run from most to least specific."""
    if not value:
        return "flag"
    v = value.strip()
    if v.lower() in _FLAG_VALUES:
        return "flag"
    if len(v) <= 4:
        return "flag"
    if _UUID_RE.match(v):
        return "uuid"
    if _HEX_RE.match(v) and len(v) >= 32:
        return "hash"
    if _BASE64_RE.match(v) and len(v) >= 40:
        return "long_token"
    if v.startswith("{") or v.startswith("["):
        return "json_blob"
    # Fallback for tokens that match none of the patterns above.
    if len(v) >= 16 and _shannon_entropy(v) > 3.5:
        return "long_token"
    if len(v) >= 6:
        return "short_id"
    return "flag"
def check_cookies_for_entropy_mismatch(
cookies_detailed: list[dict] | None,
) -> list[dict]:
"""Returns findings for cookies whose value shape does not match the
declared category."""
out: list[dict] = []
if not cookies_detailed:
return out
for ck in cookies_detailed:
if not isinstance(ck, dict):
continue
name = (ck.get("name") or "").strip()
value = (ck.get("value") or "").strip()
declared = (ck.get("declared_category") or "").lower().strip()
if not name or not declared:
continue
shape = _classify_value_shape(value)
# Rule: 'essential' / 'functional' cookies whose value has the
# complexity of a tracking ID are suspicious.
is_low_cat = declared in ("essential", "functional", "necessary")
is_id_shape = shape in ("uuid", "hash", "long_token")
if is_low_cat and is_id_shape:
out.append({
"cookie": name,
"declared": declared,
"value_shape": shape,
"value_len": len(value),
"severity": "MEDIUM",
"label": (
f"Cookie '{name}' deklariert als '{declared}', "
f"aber Wert ist ein {shape} ({len(value)} Zeichen) — "
"typisches Tracking-ID-Pattern"
),
"detail": (
"Funktionale/notwendige Cookies speichern normalerweise "
"kurze Flags (1, true, de-DE). Ein langer Hash/UUID-Wert "
"in einem als 'essential' deklarierten Cookie ist ein "
"Indikator fuer verstecktes Tracking — vergleichbar mit "
"einem 'Defeat Device', das auf dem Pruefstand harmlos "
"aussieht aber im Realbetrieb anderes tut."
),
})
return out
def build_entropy_block_html(findings: list[dict]) -> str:
if not findings:
return ""
items: list[str] = []
for f in findings[:25]:
items.append(
f'<li style="margin-bottom:6px;font-size:11px;line-height:1.5">'
f'<strong style="color:#d97706">{f["cookie"]}</strong> '
f'<span style="color:#64748b">(deklariert: '
f'<strong>{f["declared"]}</strong>) — Wert-Shape:</span> '
f'<code style="background:#fef3c7;padding:1px 4px;border-radius:2px">'
f'{f["value_shape"]}</code> '
f'<span style="color:#64748b">({f["value_len"]} Zeichen)</span>'
f'</li>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fffbeb;border:1px solid #fde68a;border-radius:8px">'
'<div style="font-size:11px;color:#92400e;text-transform:uppercase;'
'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Cookie-Werte-Plausibilitaet (Defeat-Device-Heuristik)</div>'
f'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
f'{len(findings)} Cookie{"s" if len(findings) != 1 else ""} '
'mit verdaechtigem Wert-Pattern</h3>'
'<p style="margin:0 0 10px;font-size:11px;color:#475569;line-height:1.5">'
'Diese Cookies sind als "essential" oder "funktional" deklariert, '
'ihr tatsaechlicher Wert sieht aber wie eine Tracking-ID aus '
'(UUID, Hash, langer Base64-Token). Empfehlung: pruefen ob diese '
'Cookies wirklich nur technisch notwendig sind oder de facto '
'pseudonymisierte User-Tracker.</p>'
'<ul style="margin:0 0 0 18px;padding:0">'
+ "".join(items) +
'</ul></div>'
)
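The entropy fallback in `_classify_value_shape` flags values of 16 or more characters whose Shannon entropy exceeds 3.5 bits per character. Because a string with k distinct characters has at most log2(k) bits per character, a value needs at least 12 distinct characters to cross that threshold. The following sketch reproduces the entropy formula from the diff and checks it on a few made-up sample values (the samples are illustrative assumptions, not data from a real scan):

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character; 0.0 for the empty string."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


# Hypothetical sample values, chosen only to illustrate the 3.5-bit threshold.
samples = {
    "repeated character": "aaaaaaaaaaaaaaaa",
    "11 distinct characters": "abcdefghijk" * 2,
    "12 distinct characters": "abcdefghijkl" * 2,
    "16 distinct hex digits": "0123456789abcdef",
}
for label, value in samples.items():
    h = shannon_entropy(value)
    print(f"{label}: len={len(value)} entropy={h:.2f} flagged={len(value) >= 16 and h > 3.5}")
```

Character-level entropy says nothing about meaning: a long human-readable value with a varied alphabet can also exceed the threshold, which is one reason to treat the result as a heuristic that prompts review.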
@@ -0,0 +1,238 @@
"""
P105 IAB TCF vendor list as an external authority.
The IAB TCF v2.2 Global Vendor List (https://vendor-list.consensu.org/v3/
vendor-list.json) is used here as the GDPR reference for advertising
vendors: every listed vendor declares binding IAB purposes:
  Purpose 1   Store and/or access information (essential)
  Purpose 2   Select advertising (functional/marketing)
  Purpose 3   Personalised advertising profiles (marketing)
  Purpose 4   Personalised advertising (marketing)
  Purpose 5   Personalised content profiles (marketing/personalization)
  Purpose 6   Personalised content (marketing/personalization)
  Purpose 7   Advertising performance measurement (statistics)
  Purpose 8   Content performance measurement (statistics)
  Purpose 9   Market research (statistics)
  Purpose 10  Product improvement (statistics)
  Purpose 11  Use limited data to select content (mapped to marketing below)
If a vendor is registered in the TCF list with Purpose 3/4 and the site
declares it as "functional" → clear violation (an external authority
contradicts the declaration).
Ingest mode: idempotent fetch + upsert into compliance.cookie_library
(source='iab_tcf_v2').
Lookup mode: by_vendor_name + by_cookie_owner.
"""
from __future__ import annotations
import logging
from typing import Iterable
import httpx
from sqlalchemy import text as sa_text
from sqlalchemy.orm import Session
logger = logging.getLogger(__name__)
_TCF_URL = "https://vendor-list.consensu.org/v3/vendor-list.json"
# IAB purpose → BreakPilot category
_PURPOSE_TO_CATEGORY = {
1: "essential",
2: "marketing",
3: "marketing",
4: "marketing",
5: "personalization",
6: "personalization",
7: "statistics",
8: "statistics",
9: "statistics",
10: "statistics",
11: "marketing",
}
def _category_for_purposes(purposes: Iterable[int]) -> str:
    """Aggregates purposes to the STRICTEST category (marketing > statistics
    > personalization > essential). If a vendor uses both essential and
    marketing purposes, the legally binding category is marketing
    (consent required). Unknown purpose IDs count as marketing."""
    cats = {_PURPOSE_TO_CATEGORY.get(p, "marketing") for p in purposes}
    if "marketing" in cats:
        return "marketing"
    if "statistics" in cats:
        return "statistics"
    if "personalization" in cats:
        return "personalization"
    return "essential"
async def fetch_and_ingest_tcf_vendors(db: Session) -> dict:
    """Idempotent ingest. Avoids a schema migration: uses only the existing
    cookie_library table and marks TCF rows via source='iab_tcf_v2' and
    vendor_name='[TCF-<id>] <name>'."""
    async with httpx.AsyncClient(timeout=60.0) as client:
        resp = await client.get(_TCF_URL)
        resp.raise_for_status()
        data = resp.json()
    vendors = data.get("vendors") or {}
    if not vendors:
        return {"error": "no vendors in TCF response", "n_vendors": 0}
    inserted = 0
    skipped = 0
    for vid, v in vendors.items():
        name = (v.get("name") or "").strip()
        if not name:
            continue
        purposes = v.get("purposes") or []
        leg_purposes = v.get("legIntPurposes") or []
        all_purposes = list(set(purposes) | set(leg_purposes))
        category = _category_for_purposes(all_purposes)
        # The GVL does not list the cookie names a vendor sets, so we only
        # store one vendor entry carrying the ID and aggregated category.
        # It uses the synthetic cookie_name '_tcf_v<id>'; library lookups
        # match on vendor_name.
        marker = f"_tcf_v{vid}"
        try:
            # Savepoint per row: a failed statement must not abort the
            # surrounding transaction and take the remaining rows with it.
            with db.begin_nested():
                db.execute(sa_text(
                    """
                    INSERT INTO compliance.cookie_library
                    (cookie_name, actual_category, vendor_name, source)
                    VALUES (:n, :cat, :v, 'iab_tcf_v2')
                    ON CONFLICT (cookie_name) DO UPDATE
                    SET actual_category = EXCLUDED.actual_category,
                    vendor_name = EXCLUDED.vendor_name
                    """
                ), {"n": marker, "cat": category,
                    "v": f"[TCF-{vid}] {name}"})
            inserted += 1  # counts inserts and updates alike
        except Exception as e:
            logger.warning("TCF vendor %s insert failed: %s", vid, e)
            skipped += 1
    db.commit()
    return {"n_vendors_in_gvl": len(vendors), "inserted": inserted,
            "skipped": skipped}
def lookup_tcf_authority(
    db: Session,
    vendor_name: str | None,
) -> dict | None:
    """Returns TCF authority data for a vendor name if it is registered in
    the TCF list: {tcf_id, name, category}, otherwise None.
    Fuzzy match: 'Google' matches '[TCF-755] Google Advertising Products'.
    """
    if not vendor_name:
        return None
    nl = vendor_name.lower().strip()
    try:
        rows = db.execute(sa_text(
            """
            SELECT cookie_name, actual_category, vendor_name
            FROM compliance.cookie_library
            WHERE source = 'iab_tcf_v2'
            AND LOWER(vendor_name) LIKE :pat
            LIMIT 5
            """
        ), {"pat": f"%{nl}%"}).fetchall()
        for r in rows:
            tcf_name = r[2]  # '[TCF-755] Google ...'
            if tcf_name and "]" in tcf_name:
                # Slice off the literal prefix; str.lstrip() strips a
                # character set, not a prefix.
                tcf_id = tcf_name.split("]")[0][len("[TCF-"):]
                clean = tcf_name.split("]", 1)[1].strip()
                return {"tcf_id": tcf_id, "name": clean,
                        "category": r[1]}
    except Exception as e:
        logger.warning("TCF lookup failed: %s", e)
    return None
def cross_reference_with_tcf(
db: Session,
declared_vendors: list[dict],
) -> list[dict]:
"""Returns one finding dict per vendor with a discrepancy.
Input: list[{name, category}] from cmp_vendors.
Output: list[{vendor, declared_category, tcf_category, severity}]
"""
out: list[dict] = []
for v in (declared_vendors or []):
if not isinstance(v, dict):
continue
name = (v.get("name") or "").strip()
declared_cat = (v.get("category") or "").lower().strip()
if not name or not declared_cat:
continue
tcf = lookup_tcf_authority(db, name)
if not tcf:
continue
if tcf["category"] == declared_cat:
continue
# Marketing/Statistics vs Functional/Essential ist die kritische
# Diskrepanz. functional + personalization sind weicher.
severity = "HIGH" if (tcf["category"] == "marketing"
and declared_cat in ("essential",
"functional",
"necessary")) else "MEDIUM"
out.append({
"vendor": name,
"tcf_id": tcf["tcf_id"],
"tcf_name": tcf["name"],
"declared_category": declared_cat,
"tcf_category": tcf["category"],
"severity": severity,
})
return out
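
The severity rule used in `cross_reference_with_tcf` can be isolated as a pure function, which makes it easy to test without a database. This is a sketch only; `classify_severity` and `SOFT_DECLARATIONS` are hypothetical names introduced for illustration.

```python
from __future__ import annotations

# Declared categories that do not require consent-style treatment.
SOFT_DECLARATIONS = ("essential", "functional", "necessary")


def classify_severity(tcf_category: str,
                      declared_category: str) -> str | None:
    """Return 'HIGH', 'MEDIUM', or None when the categories agree."""
    declared = declared_category.lower().strip()
    if tcf_category == declared:
        return None
    if tcf_category == "marketing" and declared in SOFT_DECLARATIONS:
        return "HIGH"
    return "MEDIUM"


high = classify_severity("marketing", "Essential ")
medium = classify_severity("statistics", "functional")
agree = classify_severity("marketing", "marketing")
```

With these inputs `high` is `'HIGH'`, `medium` is `'MEDIUM'`, and `agree` is `None`.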


def build_tcf_authority_block_html(findings: list[dict]) -> str:
    """Render the TCF findings as an HTML block ('' if there are none).

    At most 30 findings are listed; the heading reports the full count.
    """
    from html import escape  # vendor data is external, so escape it

    if not findings:
        return ""
    items: list[str] = []
    for f in findings[:30]:
        sev_color = "#dc2626" if f["severity"] == "HIGH" else "#d97706"
        items.append(
            f'<li style="margin-bottom:6px;font-size:11px;line-height:1.5">'
            f'<strong style="color:{sev_color}">'
            f'{escape(str(f["vendor"]))}</strong> '
            f'<span style="color:#64748b">declared as</span> '
            f'<strong>{escape(str(f["declared_category"]))}</strong>, '
            f'<span style="color:#64748b">IAB TCF v2 (vendor ID '
            f'{escape(str(f["tcf_id"]))}) lists it as</span> '
            f'<strong style="color:{sev_color}">'
            f'{escape(str(f["tcf_category"]))}</strong>'
            f'</li>'
        )
    return (
        '<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
        'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
        'background:#fef2f2;border:1px solid #fecaca;border-radius:8px">'
        '<div style="font-size:11px;color:#991b1b;text-transform:uppercase;'
        'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
        'IAB TCF v2 authority check: vendor category discrepancy</div>'
        f'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
        f'{len(findings)} vendor{"s" if len(findings) != 1 else ""} '
        'whose category contradicts the official IAB list</h3>'
        '<p style="margin:0 0 10px;font-size:11px;color:#475569;'
        'line-height:1.5">'
        'The IAB Transparency &amp; Consent Framework v2 Global Vendor List '
        'is the legal authority for classifying advertising vendors in '
        'the EU. If a vendor is listed there as "Marketing", the site '
        'cannot classify it as "Functional"; this is an external, '
        'enforced classification.</p>'
        '<ul style="margin:0 0 0 18px;padding:0">'
        + "".join(items) +
        '</ul>'
        '<p style="margin:8px 0 0;font-size:10px;color:#94a3b8;'
        'font-style:italic">Source: '
        'https://vendor-list.consensu.org/v3/vendor-list.json. '
        'The TCF list is binding for all CMP tools that implement '
        'IAB TCF v2 (Cookiebot, OneTrust, Usercentrics, Sourcepoint, …).</p>'
        '</div>'
    )