Compare commits

5 Commits

Author SHA1 Message Date
Benjamin Admin 081e4f057a feat(audit): Cookie-Compliance-Audit (3-Quellen-Vergleich) + Vendor-Dedup + Block-Parser
CI / detect-changes (push) Successful in 12s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / test-go (push) Failing after 55s
CI / iace-gt-coverage (push) Successful in 25s
CI / test-python-backend (push) Successful in 44s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / validate-canonical-controls (push) Successful in 16s
CI / loc-budget (push) Failing after 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 2m43s
CORE USP: cookie_compliance_audit.py compares 3 sources
* DECLARED in the cookie policy (parse_cookie_table + parse_flat)
* ACTUALLY loaded in the browser (banner_result.phases.after_accept)
* LIBRARY metadata (cookie_library lookup)

Returns 3 lists, each with a compliance verdict:
* compliant (declared AND loaded): green block
* undeclared_in_browser (loaded but NOT declared): RED HIGH block
  → violation of Art. 13(1)(c) GDPR + § 25 TDDDG
* declared_not_loaded (declared but NOT loaded): yellow notice
  → table may be outdated
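
The three-list verdict above amounts to set arithmetic over cookie names. A minimal sketch, assuming both sources have already been reduced to sets of cookie names. `AuditLists` and `compare_cookie_sources` are hypothetical names, not the actual API of cookie_compliance_audit.py, and the library-metadata lookup is left out:

```python
from dataclasses import dataclass, field


@dataclass
class AuditLists:
    """Hypothetical container for the three verdict lists."""
    compliant: list = field(default_factory=list)
    undeclared_in_browser: list = field(default_factory=list)
    declared_not_loaded: list = field(default_factory=list)


def compare_cookie_sources(declared: set, loaded: set) -> AuditLists:
    """Split cookie names into the three buckets named in the commit message."""
    return AuditLists(
        compliant=sorted(declared & loaded),              # declared AND loaded
        undeclared_in_browser=sorted(loaded - declared),  # loaded, not declared
        declared_not_loaded=sorted(declared - loaded),    # declared, not loaded
    )


# Illustrative input only; the cookie names are examples, not fixture data.
result = compare_cookie_sources(
    declared={"_ga", "s_ecid", "old_cookie"},
    loaded={"_ga", "s_ecid", "TDID"},
)
```

Here `TDID` would land in the high-severity bucket and `old_cookie` in the "table may be outdated" bucket.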

parse_cookie_table extended with a block format (5 lines per cookie, as in
the user's copy-paste from VW). Finds 35+ cookies in pasted text instead of 0.
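
A block format with a fixed number of lines per cookie can be parsed by chunking the non-empty lines. The sketch below is an assumption about how such a parser might look: the field order in `FIELDS` and the function name `parse_cookie_blocks` are hypothetical, and the real parse_cookie_table may detect blocks differently:

```python
# Hypothetical field order; the commit only states "5 lines per cookie".
FIELDS = ("name", "vendor", "purpose", "duration", "category")


def parse_cookie_blocks(text: str, fields=FIELDS) -> list:
    """Group non-empty lines into fixed-size blocks, one dict per cookie."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n = len(fields)
    # Step through complete blocks only; an incomplete trailing block is dropped.
    return [
        dict(zip(fields, lines[i:i + n]))
        for i in range(0, len(lines) - n + 1, n)
    ]


# Illustrative sample, not taken from the VW fixture.
sample = (
    "_ga\nGoogle\nAnalytics\n2 years\nHTTP\n\n"
    "s_ecid\nAdobe\nAnalytics\n2 years\nHTTP\n"
    "stray line\n"
)
cookies = parse_cookie_blocks(sample)
```

Dropping the incomplete trailing block is a design choice of this sketch; a production parser would more likely log or flag it.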

vendor_normalizer.py: 50+ aliases (Google family, Adobe family,
Trade Desk, AdForm, ...) + garbage filter (URLs, empty strings,
'click to select', 'Mehrere OEMs'). Merges the cookies lists during dedup.
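
The alias, garbage-filter and merge steps can be sketched as follows. This assumes each vendor is a dict with a `name` and an optional `cookies` list. The dict shape, the alias subset and the name `normalize_vendors_sketch` are assumptions for illustration, not the contents of vendor_normalizer.py:

```python
import re

# Hypothetical alias subset; the real table is said to hold 50+ entries.
ALIASES = {
    "google analytics": "Google",
    "google ads": "Google",
    "adobe analytics": "Adobe",
}
# Garbage strings named in the commit message, lower-cased for comparison.
GARBAGE = {"", "click to select", "mehrere oems"}
URL_RE = re.compile(r"^(https?://|www\.)", re.IGNORECASE)


def normalize_vendors_sketch(vendors: list) -> list:
    """Drop garbage entries, map aliases to one name, merge cookie lists."""
    merged = {}
    for vendor in vendors:
        raw = (vendor.get("name") or "").strip()
        key = raw.lower()
        if key in GARBAGE or URL_RE.match(raw):
            continue  # garbage filter: empty strings, UI residue, bare URLs
        canonical = ALIASES.get(key, raw)
        entry = merged.setdefault(canonical, {"name": canonical, "cookies": []})
        for cookie in vendor.get("cookies", []):
            if cookie not in entry["cookies"]:
                entry["cookies"].append(cookie)  # merge without duplicates
    return list(merged.values())


normalized = normalize_vendors_sketch([
    {"name": "Google Analytics", "cookies": ["_ga"]},
    {"name": "Google Ads", "cookies": ["_gcl_au", "_ga"]},
    {"name": "https://example.com"},
    {"name": "click to select"},
    {"name": "   "},
])
```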

_guess_vendor extended: Adobe family (s_ecid/AMCV/demdex/mbox/...),
Trade Desk (TDID/TDCPM/TTDOptOut), AdForm (uid/cid/otsid),
Salesforce LiveAgent, etracker, Akamai, EDAA.
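
A name-based vendor guess can be sketched as a small rule table. The cookie names come from the list above; the matching strategy (prefix match for Adobe and Trade Desk, exact match for the short AdForm names) and the returned labels are assumptions, and `guess_vendor_sketch` is a hypothetical stand-in for _guess_vendor:

```python
# Prefix rules: names like "AMCV_<org id>" carry a suffix, so match by prefix.
PREFIX_RULES = [
    (("s_ecid", "AMCV", "demdex", "mbox"), "Adobe"),
    (("TDID", "TDCPM", "TTDOptOut"), "The Trade Desk"),
]
# Short generic names are matched exactly to limit false positives (assumption).
EXACT_RULES = {"uid": "AdForm", "cid": "AdForm", "otsid": "AdForm"}


def guess_vendor_sketch(cookie_name: str):
    """Return a vendor label for a cookie name, or None if no rule matches."""
    if cookie_name in EXACT_RULES:
        return EXACT_RULES[cookie_name]
    for prefixes, vendor in PREFIX_RULES:
        if cookie_name.startswith(prefixes):  # str.startswith accepts a tuple
            return vendor
    return None
```

Names as generic as `uid` or `cid` are used by many vendors, so a real implementation would presumably also consider the cookie domain before attributing them.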

audit_quality_checks: vendor-thin threshold is now dynamic, scaled by the
cookie document's word count (3k→10 / 6k→20 / 10k→30 / 15k+→40).
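
One way to read the breakpoints above is as a step function over the word count. The sketch assumes each breakpoint is a lower bound ("from 3k words on, expect at least 10 vendors") and that documents below 3k words also use the lowest threshold. Both readings are assumptions, and `vendor_thin_threshold` is a hypothetical name:

```python
from bisect import bisect_right

WORD_BREAKPOINTS = [3_000, 6_000, 10_000, 15_000]
THRESHOLDS = [10, 20, 30, 40]


def vendor_thin_threshold(word_count: int) -> int:
    """Map a cookie-document word count to the vendor-thin threshold."""
    # Number of breakpoints that are <= word_count.
    idx = bisect_right(WORD_BREAKPOINTS, word_count)
    # Assumption: below the first breakpoint, fall back to the lowest threshold.
    return THRESHOLDS[max(idx - 1, 0)]
```

Under this reading, a 7,500-word policy would be flagged as vendor-thin when it lists fewer than 20 vendors.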

VW test fixture: tests/fixtures/cookie_gt/vw_cookie_richtlinie.txt
(36-cookie sample for regression tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 23:36:45 +02:00
Benjamin Admin 16fd406c1a feat(iace): secondary-harm chain model + AllPatterns drift fix
Task #17: secondary-harm model (Folgegefahren-Modell) as a preparatory commit
(no DB schema change yet; persistence via separate [migration-approved] commit).

New:
- secondary_harms.go: SecondaryHarm struct + six canonical categories
  (consumer_safety, product_liability, food_safety, environmental,
  reputation, financial) with DE labels.
- hazard_pattern_types.go: HazardPattern extended with optional
  SecondaryHarms field — pattern library can now attach consequential-
  damage chains.
- hazard_patterns_secondary_demo.go: two worked examples
  - HP2000 glass breakage in carbonated bottling (the "Cola splitter" scenario
    from the IACE strategy discussion) with consumer_safety + food_safety
    + reputation chains
  - HP2001 Pharma fill-finish cross-contamination with consumer_safety
    + product_liability under AMG §84

Bonus fix:
- compliance_crossover.go AllPatterns() was a duplicate enumeration that
  silently drifted from collectAllPatterns() in pattern_registry.go.
  Pre-fix: 1058 patterns visible. Post-fix: 1213 patterns. The 155 invisible
  patterns included CRA, ISO12100 gaps, robot-cell, CNC extended, VDMA,
  textile-agri, GT-bremse — anything added after the original AllPatterns
  was authored. The audit suite (cmd/iace-audit) now sees the full set.

Next steps for full secondary-harm rollout:
- DB migration: hazards table + secondary_harms array column
- API: surface secondary_harms in /projects/:id/hazards response
- Frontend: collapsible secondary-harm (Folgegefahren) panel in HazardTable
2026-05-21 23:36:26 +02:00
Benjamin Admin c5c168592b feat(licenses): Task #25 — SDK module attribution rollout (11 modules)
Per project_sdk_module_attribution_matrix.md the stage-3 (Stufe 3) rollout is
prioritized by audit visibility. This batch covers steps 2-9 in one
sweep:

New reusable component:
  components/sdk/LicenseModuleBanner.tsx — single-line license banner
  placed at the top of an SDK module page. Renders rule pill (R1/R2/R3),
  source label, descriptor and link to /sdk/licenses. Replaces the
  copy-paste banner blocks I inlined in the earlier modules.

Integration points (per cluster):

  Cluster B (GDPR/EU law, R1):
    - vvt: existing "Vorlage" pill upgraded with R1 marker + tooltip
      explaining the provenance from the federal states' GDPR templates
    - dsfa: inline R1 banner citing GDPR (DSGVO) Art. 35

  Cluster C (EU AI Act / CRA, R1):
    - ai-act: inline R1 banner citing EU 2024/1689
    - cra:    inline R1 banner citing EU 2024/2847 + ENISA-Guidance

  Cluster D (Mix R2/R3):
    - isms: R3 banner + ISO/IEC 27001 reference disclaimer
    - security-backlog: R2 banner with OWASP CC-BY-SA attribution

  Cluster A (own work, R3):
    - tom-generator: R1 source (GDPR Art. 32) + R3 own-work disclaimer
    - audit-checklist: R3 banner for own audit methodology
    - document-generator: own templates R3 + cited rights R1

  Cluster E (Direct controls listing):
    - catalog-manager: System/User tag upgraded with rule classification
    - iace hazards: pattern_id pill upgraded with R3 + tooltip explaining
      BreakPilot Pattern-Engine provenance

The 11-module sweep brings audit transparency to the modules a paying
customer encounters most often. Stage 3 of the attribution renderer
is now actually visible across the platform. Previously it shipped
only the reusable <SourceBadge> component without integration points.

Pre-existing TS errors (drafting-engine constraint-enforcer, dsfa
types tests) untouched — not in scope for this licensing rollout.
2026-05-21 23:16:09 +02:00
Benjamin Admin d0274674a0 feat(licenses): Task #25 step 1 — SourceBadge in atomic-controls + correct LicenseRuleBadge labels
Per the SDK module attribution matrix (project_sdk_module_attribution_matrix.md),
the controls/atomic-controls listings render canonical_controls directly and are
the highest-audit-visibility integration point for stage 3.

Two changes:

1. atomic-controls/page.tsx: embed <SourceBadge controlUuid={ctrl.id} compact />
   next to the existing badge row in each control item. The badge fetches
   /api/compliance/licenses/source-info/{uuid} on first hover and reveals the
   source regulation, license type, and attribution text in a tooltip.

2. control-library/components/helpers.tsx: fix LicenseRuleBadge labels. The
   existing pill said "Free Use / Zitation / Reformuliert", which reflects
   exactly the inverted reading of the rules that Task #21 uncovered. Corrected to
   R1 (verbatim, sovereign law/PD), R2 (verbatim + attribution), R3 (identifier
   only). Added a native title attribute as hover explanation; the existing
   ControlListItem in control-library now shows the right semantics
   without any other code change.

Next modules per matrix: VVT (federal-state templates) and DSFA.
2026-05-21 22:42:52 +02:00
Benjamin Admin 2eb7349577 feat(licenses): sidebar footer link to /sdk/licenses
Adds a discreet "Quellen & Lizenzen" link to the SDK sidebar footer
(below the existing Export button) pointing to the /sdk/licenses page
shipped in commit dfac940.

Part of Task #24 (AGB/Impressum audit) — the legal mandate that
attribution be discoverable for every output is now satisfied at
three layers:
- platform-wide overview reachable from every SDK page (this commit)
- per-export footer in compliance PDFs (commit 07cc00d)
- inline source badge per control via <SourceBadge> (commit dfac940)
2026-05-21 22:18:26 +02:00
26 changed files with 1096 additions and 65 deletions
@@ -362,6 +362,16 @@ export default function AIActPage() {
)}
</StepHeader>
<div className="px-4 py-2 bg-emerald-50 border border-emerald-200 rounded-lg text-xs text-emerald-800 flex items-start gap-2">
<span className="font-semibold">Quellen &amp; Lizenz:</span>
<span>
Inhalte gemaess <strong>EU-Verordnung 2024/1689 (KI-Verordnung / AI Act)</strong>
Lizenzregel R1 (EU_LAW, woertlich uebernehmbar).
Risiko-Klassifizierungslogik basiert auf Anhang III der Verordnung.{' '}
<a href="/sdk/licenses" className="underline">Quellenverzeichnis</a>
</span>
</div>
{/* Tabs */}
<div className="flex items-center gap-1 bg-gray-100 p-1 rounded-lg w-fit">
{TABS.map(tab => (
@@ -13,6 +13,7 @@ import {
CATEGORY_OPTIONS,
} from '../control-library/components/helpers'
import { ControlDetail } from '../control-library/components/ControlDetail'
import { SourceBadge } from '@/components/sdk/SourceBadge'
// =============================================================================
// TYPES
@@ -310,6 +311,7 @@ export default function AtomicControlsPage() {
<TargetAudienceBadge audience={ctrl.target_audience} />
<GenerationStrategyBadge strategy={ctrl.generation_strategy} pipelineInfo={ctrl} />
<ObligationTypeBadge type={ctrl.generation_metadata?.obligation_type as string} />
<SourceBadge controlUuid={ctrl.id} compact />
</div>
<h3 className="text-sm font-medium text-gray-900 group-hover:text-violet-700">{ctrl.title}</h3>
<p className="text-xs text-gray-500 mt-1 line-clamp-2">{ctrl.objective}</p>
@@ -3,6 +3,7 @@
import React, { useState } from 'react'
import { useRouter } from 'next/navigation'
import { StepHeader, STEP_EXPLANATIONS } from '@/components/sdk/StepHeader'
import { LicenseModuleBanner } from '@/components/sdk/LicenseModuleBanner'
import { useAuditChecklist } from './_hooks/useAuditChecklist'
import { ChecklistItemCard } from './_components/ChecklistItemCard'
import { LoadingSkeleton } from './_components/LoadingSkeleton'
@@ -89,6 +90,12 @@ export default function AuditChecklistPage() {
</div>
</StepHeader>
<LicenseModuleBanner
rule={3}
sourceLabel="BreakPilot-Audit-Methodik"
detail="Eigene Audit-Checklisten und -Workflows. Zitierte Rechtsquellen (DSGVO/ISO 27001/...) jeweils mit eigener Lizenzregel."
/>
{error && (
<div className="p-4 bg-red-50 border border-red-200 rounded-lg text-red-700 flex items-center justify-between">
<span>{error}</span>
@@ -232,14 +232,25 @@ export function StateBadge({ state }: { state: string }) {
export function LicenseRuleBadge({ rule }: { rule: number | null | undefined }) {
if (!rule) return null
const config: Record<number, { bg: string; label: string }> = {
1: { bg: 'bg-green-100 text-green-700', label: 'Free Use' },
2: { bg: 'bg-blue-100 text-blue-700', label: 'Zitation' },
3: { bg: 'bg-amber-100 text-amber-700', label: 'Reformuliert' },
// Corrected labels per Task #21 LICENSE_RULES.md mapping:
// R1 = verbatim (sovereign law/Public Domain, no attribution required)
// R2 = verbatim + mandatory attribution (CC-BY, OWASP, OECD, ENISA)
// R3 = cite identifier only (DIN/ANSI/IEC/DGUV/proprietary; pipeline drops full text)
const config: Record<number, { bg: string; label: string; title: string }> = {
1: { bg: 'bg-emerald-100 text-emerald-800', label: 'R1', title: 'Woertlich uebernehmbar (Hoheitsrecht/Public Domain)' },
2: { bg: 'bg-amber-100 text-amber-800', label: 'R2', title: 'Woertlich mit Attribution (CC-BY/OWASP/OECD/ENISA)' },
3: { bg: 'bg-slate-100 text-slate-700', label: 'R3', title: 'Nur Identifier-Verweis (DIN/ANSI/IEC/proprietaer)' },
}
const c = config[rule]
if (!c) return null
return <span className={`inline-flex items-center px-2 py-0.5 rounded text-xs font-medium ${c.bg}`}>{c.label}</span>
return (
<span
className={`inline-flex items-center px-2 py-0.5 rounded text-xs font-medium ${c.bg}`}
title={c.title}
>
{c.label}
</span>
)
}
export function VerificationMethodBadge({ method }: { method: string | null }) {
@@ -99,6 +99,16 @@ export default function CRAProjectsPage() {
</p>
</div>
<div className="mb-4 px-4 py-2 bg-emerald-50 border border-emerald-200 rounded-lg text-xs text-emerald-800 flex items-start gap-2">
<span className="font-semibold">Quellen &amp; Lizenz:</span>
<span>
Inhalte gemaess <strong>EU-Verordnung 2024/2847 (Cyber Resilience Act)</strong>
Lizenzregel R1 (EU_LAW, woertlich uebernehmbar). ENISA-Implementation-Guidance
ergaenzend (R1 EU_PUBLIC).{' '}
<a href="/sdk/licenses" className="underline">Quellenverzeichnis</a>
</span>
</div>
{error && (
<div className="mb-4 bg-red-50 border border-red-200 rounded-lg p-4 text-sm text-red-700">
{error}
@@ -297,6 +297,16 @@ function DocumentGeneratorPageInner() {
tips={stepInfo.tips}
/>
<div className="px-4 py-2 bg-slate-50 border border-slate-200 rounded-lg text-xs text-slate-700 flex items-start gap-2">
<span className="font-semibold">Quellen &amp; Lizenz:</span>
<span>
Die 91 Standard-Vorlagen sind <strong>BreakPilot-Eigenwerke</strong> (Lizenzregel R3 Identifier-Verweis,
eigene Lizenz). Vorlagen mit gesetzlicher Grundlage (z.B. VVT nach Art. 30 DSGVO,
Loeschkonzept nach Art. 17 DSGVO) zitieren die jeweilige Rechtsquelle als R1.{' '}
<a href="/sdk/licenses" className="underline">Quellenverzeichnis</a>
</span>
</div>
{/* Status bar */}
<div className="grid grid-cols-3 gap-4">
<div className="bg-white rounded-xl border border-gray-200 p-5">
@@ -132,6 +132,16 @@ export default function DSFAPage() {
)}
</StepHeader>
<div className="px-4 py-2 bg-emerald-50 border border-emerald-200 rounded-lg text-xs text-emerald-800 flex items-start gap-2">
<span className="font-semibold">Quellen &amp; Lizenz:</span>
<span>
Inhalte gemaess <strong>DSGVO Art. 35</strong> (EU 2016/679) Lizenzregel R1
(Hoheitsrecht/EU_LAW, woertlich uebernehmbar). Vorlagen-Texte aus
Aufsichtsbehoerden ebenfalls R1.{' '}
<a href="/sdk/licenses" className="underline">Quellenverzeichnis</a>
</span>
</div>
{/* DSFA Requirement Check */}
{dsfaCheck.required && dsfas.length === 0 && (
<div className="bg-red-50 border border-red-200 rounded-xl p-5">
@@ -39,11 +39,19 @@ export function HazardTable({ hazards, lifecyclePhases, onDelete }: {
.map((hazard) => (
<tr key={hazard.id} className="hover:bg-gray-50 dark:hover:bg-gray-750 transition-colors">
<td className="px-4 py-3">
<div className="flex items-center gap-2">
<div className="flex items-center gap-2 flex-wrap">
<div className="text-sm font-medium text-gray-900 dark:text-white">{hazard.name}</div>
{hazard.name.startsWith('Auto:') && (
<span className="inline-flex items-center px-1.5 py-0.5 rounded text-xs font-medium bg-green-100 text-green-700">Auto</span>
)}
{(hazard as { pattern_id?: string }).pattern_id && (
<span
className="inline-flex items-center px-1.5 py-0.5 rounded text-[10px] font-mono font-medium bg-slate-100 text-slate-700 border border-slate-200 cursor-help"
title={`Quelle: BreakPilot IACE Pattern-Engine (${(hazard as { pattern_id?: string }).pattern_id}). Lizenzregel R3 — Eigenwerk, kein externer Lizenz-Footer noetig. Pattern-Definition mit Norm-Referenzen siehe Library.`}
>
{(hazard as { pattern_id?: string }).pattern_id} · R3
</span>
)}
</div>
{hazard.description && (
<div className="text-xs text-gray-500 truncate max-w-[250px]">{hazard.description}</div>
@@ -9,6 +9,7 @@ import { ObjectivesTab } from './_components/ObjectivesTab'
import { AuditsTab } from './_components/AuditsTab'
import { ReviewsTab } from './_components/ReviewsTab'
import { AssetsTab } from './_components/AssetsTab'
import { LicenseModuleBanner } from '@/components/sdk/LicenseModuleBanner'
// =============================================================================
// MAIN PAGE
@@ -38,6 +39,13 @@ export default function ISMSPage() {
<p className="text-xs text-amber-600 mt-2">
Hinweis: Basierend auf eigenen Pruefaspekten, kein ISO-Normtext. Ersetzt kein Zertifizierungsaudit.
</p>
<div className="mt-3">
<LicenseModuleBanner
rule={3}
sourceLabel="BreakPilot-ISMS-Methodik mit Verweis auf ISO/IEC 27001"
detail="ISO-Normtexte sind copyright-geschuetzt (R3 — nur Identifier-Verweise). Eigene Pruefaspekte sind BreakPilot-Eigenwerk."
/>
</div>
</div>
{/* Tabs */}
@@ -5,6 +5,7 @@ import { SecurityItemCard } from './_components/SecurityItemCard'
import { ItemModal } from './_components/ItemModal'
import { useSecurityBacklog, EMPTY_NEW_ITEM } from './_hooks/useSecurityBacklog'
import type { SecurityItem } from './_hooks/useSecurityBacklog'
import { LicenseModuleBanner } from '@/components/sdk/LicenseModuleBanner'
export default function SecurityBacklogPage() {
const [filter, setFilter] = useState<string>('all')
@@ -37,6 +38,11 @@ export default function SecurityBacklogPage() {
return (
<div className="space-y-6">
<LicenseModuleBanner
rule={2}
sourceLabel="OWASP Top 10 / ASVS / SAMM (CC-BY-SA 4.0) + NIST SP 800-53 (US PD)"
detail="OWASP-Inhalte zitiert mit Pflicht-Attribution 'OWASP Foundation, CC BY-SA 4.0'. NIST woertlich (R1)."
/>
{/* Header */}
<div className="flex items-center justify-between">
<div>
@@ -4,6 +4,7 @@ import React from 'react'
import { useRouter } from 'next/navigation'
import { useTOMGenerator } from '@/lib/sdk/tom-generator'
import { TOM_GENERATOR_STEPS } from '@/lib/sdk/tom-generator/types'
import { LicenseModuleBanner } from '@/components/sdk/LicenseModuleBanner'
/**
* TOM Generator Landing Page
@@ -45,6 +46,14 @@ export default function TOMGeneratorPage() {
</p>
</div>
<div className="mb-6">
<LicenseModuleBanner
rule={1}
sourceLabel="DSGVO Art. 32 (EU 2016/679) — TOM-Anforderungen"
detail="Generator-Logik und Vorlagen sind BreakPilot-Eigenwerk (R3); zitierte Rechtsquelle EU_LAW (R1)."
/>
</div>
{/* Progress Card */}
{hasProgress && (
<div className="bg-white rounded-xl border border-gray-200 p-6 mb-8">
@@ -350,7 +350,12 @@ function ActivityCard({ activity, onEdit, onDelete }: { activity: VVTActivity; o
<span className="px-2 py-0.5 text-xs bg-purple-100 text-purple-700 rounded-full">DSFA</span>
)}
{(activity as any).sourceTemplateId && (
<span className="px-2 py-0.5 text-xs bg-indigo-100 text-indigo-700 rounded-full">Vorlage</span>
<span
className="px-2 py-0.5 text-xs bg-indigo-100 text-indigo-700 rounded-full cursor-help"
title="Erstellt aus Bundeslaender-DSGVO-Vorlage (Art. 30 DSGVO). Lizenzregel R1 — Hoheitsrecht/DE_LAW, woertlich uebernehmbar."
>
Vorlage · R1
</span>
)}
</div>
<h3 className="text-base font-semibold text-gray-900 truncate">{activity.name || '(Ohne Namen)'}</h3>
@@ -195,12 +195,18 @@ export default function CatalogTable({
)}
<td className="px-4 py-2.5">
{entry.source === 'system' ? (
<span className="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300">
System
<span
className="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-gray-100 dark:bg-gray-700 text-gray-600 dark:text-gray-300 cursor-help"
title="System-Katalog — Quellen aus EU-Recht, BAuA, NIST u.a. Lizenzregel je Eintrag (siehe /sdk/licenses)."
>
System · R1/R2/R3
</span>
) : (
<span className="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-blue-100 dark:bg-blue-900/40 text-blue-700 dark:text-blue-300">
Benutzerdefiniert
<span
className="inline-flex items-center px-2 py-0.5 rounded text-xs font-medium bg-blue-100 dark:bg-blue-900/40 text-blue-700 dark:text-blue-300 cursor-help"
title="Benutzerdefinierter Eintrag — BreakPilot/Anwender-Eigenwerk. Lizenzregel R3 (Identifier-Verweis), keine externe Attribution noetig."
>
Benutzerdefiniert · R3
</span>
)}
</td>
@@ -0,0 +1,62 @@
'use client'
// Reusable licence-source banner placed at the top of an SDK module page.
// One-line context that tells the user (and any auditor) which sources
// the module draws on and which BreakPilot licence rule applies.
//
// Usage:
// <LicenseModuleBanner
// rule={1}
// sourceLabel="DSGVO Art. 30 (EU 2016/679)"
// />
//
// For modules that are pure BreakPilot eigenwerk:
// <LicenseModuleBanner rule={3} sourceLabel="BreakPilot-Eigenwerk" />
type Props = {
rule: 1 | 2 | 3
sourceLabel: string
/** Optional extended note shown after sourceLabel */
detail?: string
}
const RULE_META: Record<number, { bg: string; text: string; pill: string; descr: string }> = {
1: {
bg: 'bg-emerald-50 border-emerald-200',
text: 'text-emerald-800',
pill: 'bg-emerald-600 text-white',
descr: 'Hoheitsrecht/Public Domain — woertlich uebernehmbar',
},
2: {
bg: 'bg-amber-50 border-amber-200',
text: 'text-amber-800',
pill: 'bg-amber-600 text-white',
descr: 'Woertlich mit Attribution-Pflicht',
},
3: {
bg: 'bg-slate-50 border-slate-200',
text: 'text-slate-700',
pill: 'bg-slate-600 text-white',
descr: 'Identifier-Verweis / BreakPilot-Eigenwerk',
},
}
export function LicenseModuleBanner({ rule, sourceLabel, detail }: Props) {
const m = RULE_META[rule]
return (
<div className={`px-3 py-2 ${m.bg} border rounded-lg text-xs ${m.text} flex items-start gap-2`}>
<span className={`inline-flex items-center justify-center w-6 h-6 rounded-full text-[10px] font-bold ${m.pill} flex-shrink-0`}>
R{rule}
</span>
<div className="flex-1">
<span className="font-semibold">Quellen &amp; Lizenz:</span>{' '}
<span>{sourceLabel}</span>
<span className="text-slate-500"> {m.descr}.</span>
{detail && <span className="block mt-0.5 text-[11px] opacity-80">{detail}</span>}
<a href="/sdk/licenses" className="underline ml-1">Quellenverzeichnis</a>
</div>
</div>
)
}
export default LicenseModuleBanner
@@ -224,6 +224,19 @@ export function SDKSidebar({ collapsed = false, onCollapsedChange }: SDKSidebarP
<span>Exportieren</span>
</button>
)}
{!collapsed && (
<a
href="/sdk/licenses"
className="mt-2 w-full flex items-center justify-center gap-2 px-4 py-2 text-xs text-gray-500 hover:text-gray-700 hover:bg-gray-100 rounded-lg transition-colors"
title="Quellen und Lizenzen aller verwendeten Compliance-Controls"
>
<svg className="w-3.5 h-3.5" fill="none" stroke="currentColor" viewBox="0 0 24 24">
<path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12h6m-6 4h6m2 5H7a2 2 0 01-2-2V5a2 2 0 012-2h5.586a1 1 0 01.707.293l5.414 5.414a1 1 0 01.293.707V19a2 2 0 01-2 2z" />
</svg>
<span>Quellen &amp; Lizenzen</span>
</a>
)}
</div>
</aside>
)
@@ -104,39 +104,14 @@ func GetProjectComplianceTriggers(hazards []Hazard, patterns []HazardPattern) *C
}
}
// AllPatterns returns every hazard pattern from all pattern sources.
// This mirrors the aggregation in NewPatternEngine but returns just the slice.
// AllPatterns returns every registered hazard pattern. Delegates to
// collectAllPatterns() in pattern_registry.go so new pattern sources only
// need to be added in one place. Pre-2026-05-21 this function maintained
// a duplicate enumeration which silently drifted from the registry —
// CRA, ISO12100-gap, robot-cell, CNC, VDMA, textile-agri, GT-bremse and
// secondary-harm patterns were invisible to AllPatterns callers.
func AllPatterns() []HazardPattern {
p := GetBuiltinHazardPatterns()
p = append(p, GetExtendedHazardPatterns()...)
p = append(p, GetPressHazardPatterns()...)
p = append(p, GetCobotHazardPatterns()...)
p = append(p, GetOperationalHazardPatterns()...)
p = append(p, GetDGUVExtendedPatterns()...)
p = append(p, GetExtendedHazardPatterns2()...)
p = append(p, GetElevatorPatterns()...)
p = append(p, GetAGVAgriPatterns()...)
p = append(p, GetFoodProcessingPatterns()...)
p = append(p, GetPackagingPatterns()...)
p = append(p, GetLaserPatterns()...)
p = append(p, GetMedicalDevicePatterns()...)
p = append(p, GetPressureEquipmentPatterns()...)
p = append(p, GetConstructionPatterns()...)
p = append(p, GetForestryConveyorPatterns()...)
p = append(p, GetPlasticsMetalPatterns()...)
p = append(p, GetWeldingGlassTextilePatterns()...)
p = append(p, GetSpecificMachinePatterns()...)
p = append(p, GetSpecificMachinePatterns2()...)
p = append(p, GetCyberExtendedPatterns()...)
p = append(p, GetCyberExtendedPatterns2()...)
p = append(p, GetCyberExtendedPatterns3()...)
p = append(p, GetWorkshopPatterns()...)
p = append(p, GetMaintenanceExtPatterns()...)
p = append(p, GetFinalPatternsA()...)
p = append(p, GetFinalPatternsB()...)
p = append(p, GetFinalPatternsC()...)
p = append(p, GetFinalPatternsD()...)
return p
return collectAllPatterns()
}
// extractPatternIDs scans a text for "HP" followed by digits and adds
@@ -83,6 +83,12 @@ type HazardPattern struct {
// feeds into the PLr (required Performance Level) computation,
// see ComputePLr.
DefaultAvoidability int `json:"default_avoidability,omitempty"` // 1 or 2
// SecondaryHarms describes consequential damage chains beyond the
// classical IACE Hazard→Harm step: end-customer safety, product
// liability, food safety, environmental, reputation, financial.
// See secondary_harms.go and the strategy discussion (2026-05-20).
// Empty for hazards with no downstream chain.
SecondaryHarms []SecondaryHarm `json:"secondary_harms,omitempty"`
}
// ComputePLr returns the required Performance Level (PLr) per EN ISO
@@ -0,0 +1,127 @@
package iace
// Demonstration patterns showing how the SecondaryHarms field carries
// downstream-consequence information through the IACE engine.
//
// Two real-world scenarios are encoded:
//
// HP2000 — Glass-shard injection in carbonated-beverage bottling
// (the "Cola splitter" example from the IACE strategy
// discussion). Primary harm is the operator hit by flying
// shards; the secondary chain covers consumer safety for
// supermarket end-customers, food-safety recall duties and
// reputation damage.
//
// HP2001 — Cross-contamination in pharma fill-finish lines.
// Primary harm is operator exposure; secondary chain is
// patient harm + recall under §74a AMG, plus strict
// product liability under AMG §84.
//
// These two patterns are sufficient as a contract test for the
// SecondaryHarms field. Library coverage of more scenarios is a
// follow-up task once the persistence layer (DB migration) lands.
func GetSecondaryHarmDemoPatterns() []HazardPattern {
return []HazardPattern{
{
ID: "HP2000",
NameDE: "Glasbruch in Karbonisierungs-Abfueller (Hochdruck)",
NameEN: "Glass shatter in carbonated bottling line",
RequiredComponentTags: []string{"crush_point", "high_pressure"},
RequiredEnergyTags: []string{"pneumatic_pressure"},
GeneratedHazardCats: []string{"mechanical_hazard"},
Priority: 90,
MachineTypes: []string{"bottling", "food_processing", "packaging"},
ScenarioDE: "Glasflasche platzt unter CO2-Druck waehrend der Abfuellung. " +
"Splitter erreichen den Bediener und koennen ferner in nachfolgende " +
"Flaschen eingetragen werden.",
TriggerDE: "Materialfehler, ueberhoehter Innendruck, Foerderstoss",
HarmDE: "Schnittverletzung Auge/Hand des Bedieners",
AffectedDE: "Abfueller, Mitarbeiter Linie",
ZoneDE: "Karussell, Schutzkapsel, Foerderband-Auslauf",
DefaultSeverity: 4,
DefaultExposure: 3,
ISO12100Section: "6.4.5.5 Schleudernde Teile",
SecondaryHarms: []SecondaryHarm{
{
Type: SecondaryHarmConsumerSafety,
Description: "Restsplitter in der Folgeflasche erreichen ueber den Handel " +
"den Endkunden. Verletzungsrisiko Mund/Speiseroehre.",
LegalBasis: "ProdHaftG §1, VO (EU) Nr. 178/2002 Art. 14",
SuggestedMitigations: []string{
"Spueltunnel nach Abfuellung",
"Inline-Kamera mit Glasbrucherkennung",
"Sperrzone fuer 2 Folgeflaschen bei Bruchereignis",
"Glasbruchsensor an Karussell mit Linie-Stopp",
},
Owner: "product_safety",
},
{
Type: SecondaryHarmFoodSafety,
Description: "Rueckruf- und Meldepflicht bei Inverkehrbringen unsicherer " +
"Lebensmittel; Rueckverfolgbarkeit Chargen-genau erforderlich.",
LegalBasis: "VO (EU) 178/2002 Art. 18, 19; LFGB §40",
SuggestedMitigations: []string{
"Chargen-Tracking bis Endhaendler",
"Schnellwarnsystem RASFF aktiviert halten",
"Rueckruf-SOP getestet",
},
Owner: "qm",
},
{
Type: SecondaryHarmReputation,
Description: "Pressemitteilung und Aktienkurs-Reaktion bei Verbraucher-" +
"verletzungen / behoerdlichem Rueckruf.",
LegalBasis: "ISO 31000 Unternehmensrisiko",
SuggestedMitigations: []string{
"Krisenkommunikations-Plan",
"PR-Bereitschaft 24/7",
},
Owner: "enterprise_risk",
},
},
},
{
ID: "HP2001",
NameDE: "Kreuzkontamination Pharma Fill-Finish",
NameEN: "Cross-contamination pharma fill-finish",
RequiredComponentTags: []string{"chemical_risk"},
RequiredEnergyTags: []string{"pneumatic_pressure"},
GeneratedHazardCats: []string{"chemical_hazard"},
Priority: 92,
MachineTypes: []string{"pharmaceutical", "food_processing"},
ScenarioDE: "Wirkstoff-Rueckstand aus Vorcharge im Linienzwischenraum kontaminiert " +
"die Folgecharge.",
TriggerDE: "Mangelhaftes CIP, Spuelvolumen unterhalb Validierung",
HarmDE: "Bedienerexposition bei Probennahme",
AffectedDE: "Anlagenbediener, Probenehmer",
ZoneDE: "Abfuelllinie zwischen Vorlage und Filler",
DefaultSeverity: 4,
DefaultExposure: 2,
ISO12100Section: "6.4.4 Chemische und biologische Gefaehrdungen",
SecondaryHarms: []SecondaryHarm{
{
Type: SecondaryHarmConsumerSafety,
Description: "Patient erhaelt Arzneimittel mit unzulaessiger Beimischung; " +
"Wirkungsbeeintraechtigung oder unerwuenschte Wirkung moeglich.",
LegalBasis: "AMG §5 (Verkehrsfaehigkeit), §74a (Stufenplan)",
SuggestedMitigations: []string{
"CIP-Validierung mit TOC- und Conductivity-Limits",
"Dedizierte Linien fuer Hochpotente Wirkstoffe",
"Stufenplan-Meldung bei Verdacht",
},
Owner: "qm",
},
{
Type: SecondaryHarmProductLiability,
Description: "Haftung des Inverkehrbringers nach AMG §84 (Gefaehrdungshaftung " +
"bei Arzneimittelschaeden, verschuldensunabhaengig).",
LegalBasis: "AMG §84",
SuggestedMitigations: []string{
"Deckung Produkthaftpflicht ueber gesetzliches Minimum",
"Chargen-Rueckhaltemuster 12 Monate ueber MHD hinaus",
},
Owner: "legal",
},
},
},
}
}
@@ -42,5 +42,6 @@ func collectAllPatterns() []HazardPattern {
patterns = append(patterns, GetGTBremseHazardPatterns()...) // HP1710-HP1729 GT Bremse coverage gaps
patterns = append(patterns, GetISO12100GapPatterns()...) // HP1900-HP1909 ISO 12100 Annex B gaps (Vakuum, Federn, Rutsch, Hochdruckinjektion, Ersticken)
patterns = append(patterns, GetCRAPatterns()...) // HP1910-HP1918 CRA / DIN EN 40000-1-2 cyber-resilience spur
patterns = append(patterns, GetSecondaryHarmDemoPatterns()...) // HP2000-HP2001 secondary harm chain demos (Cola splitter, Pharma)
return patterns
}
@@ -0,0 +1,89 @@
package iace
// SecondaryHarm models the consequential damage chain triggered by a primary
// hazard. The classical IACE / ISO-12100 model treats Hazard -> Harm as a
// single step ("operator gets crushed"). BreakPilot extends this with a
// follow-on chain so the risk assessment can address:
//
// - consumer_safety: end customer exposed to defective product
// (e.g. glass shards in a bottled drink that reaches a supermarket)
// - product_liability: manufacturer liability under ProdHaftG / EU PLD
// - food_safety: traceability and recall obligations (VO 178/2002)
// - environmental: spill, contamination, waste-disposal consequence
// - reputation: brand damage that escalates to investor / market level
// - financial: direct cost (lawsuit, recall, fine)
//
// This struct is the data contract; persistence is deferred to a future
// migration. The pattern library can already attach SecondaryHarms to a
// HazardPattern; the API layer surfaces them on hazard generation.
//
// See memory project_attribution_strategy.md plus the "Cola splitter" worked
// example from the IACE strategy discussion (2026-05-20).
type SecondaryHarm struct {
// Type is one of the SecondaryHarmType* constants below.
Type string `json:"type"`
// Description is a single sentence describing the secondary harm
// scenario in concrete terms ("Splitter in Folgeflasche bei
// Karussell-Abfueller -> Endkunde verletzt").
Description string `json:"description"`
// LegalBasis cites the legal framework that turns the secondary harm
// into an actionable obligation (e.g. "ProdHaftG §1" or "VO 178/2002
// Art. 14"). Helps auditors trace the obligation.
LegalBasis string `json:"legal_basis,omitempty"`
// SuggestedMitigations is a free-text list of measures specific to
// the secondary chain (e.g. "Spueltunnel", "Inline-Kamera",
// "Glasbruchsensor"). Distinct from the primary-mitigations because
// they protect downstream stakeholders, not the operator.
SuggestedMitigations []string `json:"suggested_mitigations,omitempty"`
// Owner identifies the role responsible for handling this secondary
// harm in the customer organisation. Common values:
// "qm" / "product_safety" / "enterprise_risk" / "legal"
// Empty if responsibility is shared.
Owner string `json:"owner,omitempty"`
}
// SecondaryHarmType constants — kept short and stable.
const (
SecondaryHarmConsumerSafety = "consumer_safety"
SecondaryHarmProductLiability = "product_liability"
SecondaryHarmFoodSafety = "food_safety"
SecondaryHarmEnvironmental = "environmental"
SecondaryHarmReputation = "reputation"
SecondaryHarmFinancial = "financial"
)
// AllSecondaryHarmTypes returns the canonical six categories in the order
// they should appear in UI dropdowns.
func AllSecondaryHarmTypes() []string {
return []string{
SecondaryHarmConsumerSafety,
SecondaryHarmProductLiability,
SecondaryHarmFoodSafety,
SecondaryHarmEnvironmental,
SecondaryHarmReputation,
SecondaryHarmFinancial,
}
}
// SecondaryHarmLabelDE returns the human-readable German label.
func SecondaryHarmLabelDE(t string) string {
switch t {
case SecondaryHarmConsumerSafety:
return "Endkundensicherheit"
case SecondaryHarmProductLiability:
return "Produkthaftung"
case SecondaryHarmFoodSafety:
return "Lebensmittelsicherheit"
case SecondaryHarmEnvironmental:
return "Umweltschaden"
case SecondaryHarmReputation:
return "Reputation/Marke"
case SecondaryHarmFinancial:
return "Finanzieller Schaden"
}
return t
}
@@ -948,6 +948,15 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
except Exception as e:
logger.warning("Cookie-Library-Fallback skipped: %s", e)
# Vendor-Normalizer: Dedup (Google-Familie etc) + Garbage-Filter
try:
from compliance.services.vendor_normalizer import (
normalize_vendors as _norm_v,
)
cmp_vendors = _norm_v(cmp_vendors)
except Exception as e:
logger.warning("vendor_normalizer skipped: %s", e)
# P50: enrich vendors with per-vendor detail-modal-extracts
# (description, opt-out URL, privacy URL, cookies). Detail
# comes from Phase G Info-button-click-through in /scan.
@@ -1276,6 +1285,38 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
except Exception as e:
logger.warning("Scope-disclaimer block skipped: %s", e)
# COOKIE-COMPLIANCE-AUDIT (three-source comparison): this is the
# central USP: declared in the policy vs. actually loaded in the
# browser vs. library match.
cookie_audit = {}
cookie_audit_html = ""
try:
from compliance.services.cookie_compliance_audit import (
audit_cookie_compliance, build_cookie_audit_block_html,
)
from database import SessionLocal as _SLca
_ca_db = _SLca()
try:
cookie_audit = audit_cookie_compliance(
_ca_db, doc_texts.get("cookie") or doc_texts.get("dse"),
banner_result,
)
if cookie_audit and (cookie_audit.get("declared_count") or
cookie_audit.get("browser_count")):
cookie_audit_html = build_cookie_audit_block_html(cookie_audit)
logger.info(
"Cookie-Audit: %d deklariert, %d im Browser, "
"%d undokumentiert, %d compliant",
cookie_audit.get("declared_count"),
cookie_audit.get("browser_count"),
len(cookie_audit.get("undeclared_in_browser") or []),
len(cookie_audit.get("compliant") or []),
)
finally:
_ca_db.close()
except Exception as e:
logger.warning("cookie-compliance-audit skipped: %s", e)
# P102: Cookie-Klassifikations-Pruefung (deklariert vs Library)
library_mismatch_html = ""
mismatches: list[dict] = []
@@ -1481,7 +1522,9 @@ async def _run_compliance_check(check_id: str, req: ComplianceCheckRequest):
+ critical_html + scope_disclaimer_html + exec_summary_html
+ cookie_arch_html + summary_html + scanned_html + profile_html
+ scorecard_html + redundancy_html
- + providers_html + banner_deep_html + library_mismatch_html
+ providers_html + banner_deep_html
+ cookie_audit_html
+ library_mismatch_html
+ consistency_html + signals_html + solutions_html
+ jc_decision_html
+ vvt_html + report_html
@@ -67,33 +67,48 @@ def check_vendor_extract_incomplete(
cookie_doc_text: str | None,
cmp_vendors: list | None,
) -> dict | None:
- """2) Cookie-Doc gross aber wenig Vendors → Extract unvollstaendig."""
"""2) Cookie doc is large but yields few vendors → extract is incomplete.
Dynamic threshold by document size:
* 3k-6k words → at least 10 vendors expected
* 6k-10k words → at least 20 vendors
* 10k-15k words → at least 30 vendors
* 15k+ words → at least 40 vendors
"""
wc = _word_count(cookie_doc_text)
n_vendors = len(cmp_vendors or [])
- # Heuristik: Cookie-Doc >= 5000 Wörter (~30k chars) sollte zu mind. 15
- # Vendors fuehren. Wenn weniger → Vendor-Extraktion hat den Text nicht
- # vollstaendig verarbeitet.
- if wc < 5000 or n_vendors >= 15:
if wc < 3000:
return None
# Erwartete Vendor-Anzahl heuristisch nach Doc-Groesse
if wc >= 15000:
expected = 40
elif wc >= 10000:
expected = 30
elif wc >= 6000:
expected = 20
else:
expected = 10
if n_vendors >= expected:
return None
# Verhaeltniszahl bilden — je groesser das Doc, desto auffaelliger
return {
"severity": "HIGH" if wc >= 8000 else "MEDIUM",
"code": "audit_vendor_extract_thin",
"label": (
# Localize only the thousands separator. A blanket
# .replace(",", ".") would also rewrite the commas in the sentence.
"Audit-Vorbehalt: Cookie-Richtlinie hat "
f"{format(wc, ',').replace(',', '.')} Wörter, "
- f"wir konnten aber nur {n_vendors} Vendor"
- f"{'en' if n_vendors != 1 else ''} extrahieren"
f"erwartet ~{expected} Vendors, extrahiert nur {n_vendors}"
),
"area": "Vendor-Liste / VVT",
"owner": "DSB + Marketing",
"detail": (
- "Bei dieser Doc-Groesse erwarten wir typischerweise 20-50+ "
- "Vendors in einer Cookie-Richtlinie. Die niedrige extrahierte "
- "Zahl deutet auf eine Tabelle die unser LLM nicht vollstaendig "
- "parsen konnte. Empfehlung: VVT-Tabelle mit DSB / Marketing "
- "manuell abgleichen, oder die Cookie-Tabelle im Copy-Paste-Modus "
- "neu einreichen — dort parsen wir Spalten deterministisch."
- ),
f"Bei einer Cookie-Richtlinie mit {wc:,} Woertern erwarten wir "
f"typischerweise {expected}+ unique Vendors. Die extrahierte Zahl "
f"({n_vendors}) ist auffaellig niedrig — entweder hat unser "
"Parser/LLM die Tabelle nicht vollstaendig erfasst oder "
"Vendors wurden zu konservativ erkannt. Empfehlung: Cookie-"
"Tabelle im Copy-Paste-Modus einreichen (Frontend-Toggle "
"'Text einfuegen' pro Cookie-Doc-Zeile) — dort parsen wir "
"Spalten deterministisch."
).replace(",", "."),
"legal_basis": "Art. 13(1)(e) DSGVO — die Empfaengerliste muss "
"vollstaendig sein; ein unvollstaendiger Audit darf "
"nicht als vollstaendig dargestellt werden.",
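The tiered threshold described in the docstring of `check_vendor_extract_incomplete` can also be written as a small lookup table, which keeps all tiers in one place. This is only a minimal sketch: `_TIERS`, `expected_vendor_count` and `format_de` are hypothetical names that do not appear in the commit, and the tier values are copied from the docstring.

```python
from __future__ import annotations

# (minimum word count, expected vendors), highest tier first.
# Values mirror the docstring of check_vendor_extract_incomplete.
_TIERS: tuple[tuple[int, int], ...] = (
    (15000, 40),
    (10000, 30),
    (6000, 20),
    (3000, 10),
)


def expected_vendor_count(word_count: int) -> int | None:
    """Return the expected vendor count, or None below the smallest tier."""
    for min_words, expected in _TIERS:
        if word_count >= min_words:
            return expected
    return None


def format_de(n: int) -> str:
    """Format an integer with '.' as thousands separator (German style)."""
    return format(n, ",").replace(",", ".")
```

Formatting the number on its own, as `format_de` does, avoids touching commas that belong to the surrounding sentence.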
@@ -0,0 +1,221 @@
"""
Cookie compliance audit: three-source comparison.
THIS is the actual added value of the tool:
* A. What the cookie policy DECLARES (text parse)
* B. What the browser ACTUALLY LOADED (banner scan phases)
* C. What our LIBRARY knows about the cookie (vendor, category)
This yields three lists:
1. declared + loaded → compliant
2. loaded but NOT declared → HIGH finding (Art. 13(1)(c) DSGVO)
3. declared but NOT loaded → table possibly outdated (LOW)
A fourth check (declared category differs from the library category →
review trigger) is not computed here; this module only returns the
library metadata for it.
"""
from __future__ import annotations
import logging
import re
from typing import Iterable
from sqlalchemy import text as sa_text
from sqlalchemy.orm import Session
logger = logging.getLogger(__name__)
def _normalize_cookie_name(name: str) -> str:
"""Reduce wildcard names such as 'AMCV_*' or 'pm_sess_NNN' to their
prefix and lowercase the result. Concrete names are not shortened:
'_ga_GTM-XXX' stays distinct from '_ga' and only matches exactly."""
if not name:
return ""
s = name.strip()
# AMCV_*, sc_v44, etc.
s = re.sub(r"[<\[].*?[>\]]", "", s) # entferne <ID>, [...]
s = s.rstrip("*").rstrip("_")
s = re.sub(r"_NNN$|_\d+$", "", s)
return s.lower()
def _extract_declared_cookies(cookie_doc_text: str | None) -> set[str]:
"""Read cookie names from the cookie-policy text.
Runs parse_cookie_table (block/tab format) and parse_flat_cookie_text
(anchor pattern) and merges the names found by both.
"""
if not cookie_doc_text:
return set()
declared: set[str] = set()
try:
from compliance.services.cookies_table_parser import (
parse_cookie_table, parse_flat_cookie_text,
)
for v in parse_cookie_table(cookie_doc_text):
for c in (v.get("cookies") or []):
if isinstance(c, dict) and c.get("name"):
declared.add(_normalize_cookie_name(c["name"]))
for v in parse_flat_cookie_text(cookie_doc_text):
for c in (v.get("cookies") or []):
if isinstance(c, dict) and c.get("name"):
declared.add(_normalize_cookie_name(c["name"]))
except Exception as e:
logger.warning("declared-cookie-extract failed: %s", e)
return {n for n in declared if n}
def _extract_browser_cookies(banner_result: dict | None) -> set[str]:
"""Collect cookie names from banner_result.phases, covering the
after_accept, before_consent and after_reject phases."""
out: set[str] = set()
if not isinstance(banner_result, dict):
return out
phases = banner_result.get("phases") or {}
for ph_name in ("after_accept", "before_consent", "after_reject"):
ph = phases.get(ph_name) or {}
if not isinstance(ph, dict):
continue
for c in (ph.get("cookies") or []):
if isinstance(c, str):
out.add(_normalize_cookie_name(c))
elif isinstance(c, dict) and c.get("name"):
out.add(_normalize_cookie_name(c["name"]))
return {n for n in out if n}
def _lookup_library(db: Session, names: Iterable[str]) -> dict[str, dict]:
"""Return {normalized_name: {category, vendor}} from cookie_library."""
nl = [n for n in names if n]
if not nl:
return {}
try:
rows = db.execute(sa_text(
"SELECT cookie_name, actual_category, vendor_name "
"FROM compliance.cookie_library "
"WHERE LOWER(cookie_name) = ANY(:lc)"
), {"lc": nl}).fetchall()
return {r[0].lower(): {"category": r[1], "vendor": r[2]} for r in rows}
except Exception as e:
logger.warning("library lookup failed: %s", e)
return {}
def audit_cookie_compliance(
db: Session | None,
cookie_doc_text: str | None,
banner_result: dict | None,
) -> dict:
"""Main entry point: returns a dict with 3 lists, library metadata and counts."""
declared = _extract_declared_cookies(cookie_doc_text)
browser = _extract_browser_cookies(banner_result)
all_names = declared | browser
library = _lookup_library(db, all_names) if db else {}
declared_only = declared - browser
browser_only = browser - declared
both = declared & browser
return {
"declared_count": len(declared),
"browser_count": len(browser),
"library_count": len(library),
"compliant": sorted(both),
"undeclared_in_browser": sorted(browser_only),
"declared_not_loaded": sorted(declared_only),
"library_metadata": library,
"high_findings": len(browser_only),
"low_findings": len(declared_only),
}
def build_cookie_audit_block_html(audit: dict) -> str:
"""Render the three-source comparison block for the report mail."""
if not audit:
return ""
n_dec = audit.get("declared_count", 0)
n_brw = audit.get("browser_count", 0)
n_undecl = len(audit.get("undeclared_in_browser") or [])
n_dec_only = len(audit.get("declared_not_loaded") or [])
n_both = len(audit.get("compliant") or [])
sev_color = "#dc2626" if n_undecl else "#16a34a"
undecl_html = ""
if audit.get("undeclared_in_browser"):
undecl_html = (
'<div style="margin-top:10px;padding:10px 12px;background:#fee2e2;'
'border:1px solid #fecaca;border-radius:6px">'
f'<strong style="color:#991b1b">❌ {n_undecl} Cookie'
f'{"s" if n_undecl != 1 else ""} im Browser geladen, '
'aber NICHT in der Cookie-Richtlinie deklariert:</strong>'
'<div style="font-family:monospace;font-size:10px;color:#7f1d1d;'
'margin-top:6px;max-height:200px;overflow:auto">'
+ ", ".join(audit["undeclared_in_browser"][:50])
+ (f' ... +{n_undecl - 50} weitere'
if n_undecl > 50 else '') +
'</div>'
'<div style="font-size:10px;color:#7f1d1d;margin-top:4px;'
'font-style:italic">Art. 13(1)(c) DSGVO + § 25 TDDDG — '
'Zwecke und Rechtsgrundlage jeder Verarbeitung muessen genannt sein. Diese Cookies '
'sind potenziell ungenannte Verarbeitungen.</div>'
'</div>'
)
dec_only_html = ""
if audit.get("declared_not_loaded"):
dec_only_html = (
'<div style="margin-top:10px;padding:10px 12px;background:#fef3c7;'
'border:1px solid #fde68a;border-radius:6px">'
f'<strong style="color:#92400e">⚠️ {n_dec_only} Cookie'
f'{"s" if n_dec_only != 1 else ""} in der Richtlinie '
'deklariert, aber bei diesem Audit NICHT im Browser gesehen:</strong>'
'<div style="font-family:monospace;font-size:10px;color:#78350f;'
'margin-top:6px;max-height:200px;overflow:auto">'
+ ", ".join(audit["declared_not_loaded"][:50])
+ (f' ... +{n_dec_only - 50} weitere'
if n_dec_only > 50 else '') +
'</div>'
'<div style="font-size:10px;color:#78350f;margin-top:4px;'
'font-style:italic">Kein direkter Verstoss — die Cookies '
'koennen nur in bestimmten User-Journeys / Geo-Regionen / '
'eingeloggten Zustaenden geladen werden. Empfehlung: '
'pruefen ob die Cookie-Richtlinie veraltet ist.</div>'
'</div>'
)
compliant_html = ""
if audit.get("compliant"):
compliant_html = (
'<div style="margin-top:10px;padding:10px 12px;background:#dcfce7;'
'border:1px solid #bbf7d0;border-radius:6px">'
f'<strong style="color:#166534">✓ {n_both} Cookie'
f'{"s" if n_both != 1 else ""} sowohl deklariert als auch geladen '
'(compliant):</strong>'
'<div style="font-family:monospace;font-size:10px;color:#14532d;'
'margin-top:6px;max-height:150px;overflow:auto">'
+ ", ".join(audit["compliant"][:50])
+ (f' ... +{n_both - 50} weitere'
if n_both > 50 else '') +
'</div>'
'</div>'
)
return (
'<div style="font-family:-apple-system,BlinkMacSystemFont,sans-serif;'
'max-width:760px;margin:0 auto 16px;padding:14px 18px;'
'background:#fff;border:1px solid #cbd5e1;border-radius:8px">'
f'<div style="font-size:11px;color:{sev_color};text-transform:uppercase;'
f'letter-spacing:1.2px;margin-bottom:4px;font-weight:600">'
'Cookie-Compliance-Audit — 3-Quellen-Vergleich</div>'
'<h3 style="margin:0 0 6px;font-size:14px;color:#1e293b">'
f'{n_dec} in Richtlinie · {n_brw} im Browser · '
f'{n_both} compliant · {n_undecl} undokumentiert · '
f'{n_dec_only} nicht geladen</h3>'
'<p style="margin:0 0 8px;font-size:11px;color:#475569;line-height:1.5">'
'Wir vergleichen die in der Cookie-Richtlinie genannten Cookies '
'mit dem was der Browser nach Akzeptieren tatsaechlich laedt. '
'Undokumentierte Cookies im Browser sind ein direkter Verstoss '
'gegen die DSGVO-Informationspflicht.'
'</p>'
+ undecl_html + dec_only_html + compliant_html +
'</div>'
)
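For illustration: the set comparison in `audit_cookie_compliance` is an exact match on normalized names. A declared wildcard such as `_ga_*` normalizes to `_ga`, while a browser cookie `_ga_ABC123` keeps its suffix, so the two would not meet in the `compliant` list. The sketch below shows one possible prefix-aware variant. `is_wildcard` and `compare` are hypothetical names and not part of the commit; `normalize` repeats the steps of `_normalize_cookie_name`.

```python
from __future__ import annotations

import re


def normalize(name: str) -> str:
    """Same steps as _normalize_cookie_name in the module above."""
    s = re.sub(r"[<\[].*?[>\]]", "", name.strip())
    s = s.rstrip("*").rstrip("_")
    s = re.sub(r"_NNN$|_\d+$", "", s)
    return s.lower()


def is_wildcard(name: str) -> bool:
    """Hypothetical helper: True if the declared name carries a placeholder."""
    return bool(re.search(r"\*|<.*?>|\[.*?\]|_NNN$", name.strip()))


def compare(declared_raw: list[str], browser_raw: list[str]) -> dict[str, list[str]]:
    """Three-way split where wildcard declarations match by prefix."""
    exact = {normalize(n) for n in declared_raw if not is_wildcard(n)} - {""}
    prefixes = {normalize(n) for n in declared_raw if is_wildcard(n)} - {""}
    browser = {normalize(n) for n in browser_raw} - {""}

    def covered(cookie: str) -> bool:
        return cookie in exact or any(cookie.startswith(p) for p in prefixes)

    compliant = {b for b in browser if covered(b)}
    # Declarations that were matched by at least one browser cookie.
    used = {d for d in exact if d in browser}
    used |= {p for p in prefixes if any(b.startswith(p) for b in browser)}
    return {
        "compliant": sorted(compliant),
        "undeclared_in_browser": sorted(browser - compliant),
        "declared_not_loaded": sorted((exact | prefixes) - used),
    }


result = compare(
    declared_raw=["_ga_*", "s_ecid", "awsalb"],
    browser_raw=["_ga_ABC123", "s_ecid", "_fbp"],
)
```

Prefix matching has its own trade-off: a short prefix such as `_ga` would also cover unrelated names that merely start with the same characters, so a production version would probably need a delimiter check.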
@@ -79,10 +79,116 @@ def _parse_persistence(s: str) -> str:
return ""
_CATEGORY_INDICATORS = (
"funktionscookie", "tracking cookie", "trackingcookie",
"marketing", "analytics", "necessary", "notwendig",
"performance", "session cookie", "persistent cookie",
"permanent cookie", "permanent/protokoll", "sitzungs-cookie",
)
def parse_block_format(text: str) -> list[dict]:
"""Block format (browser copy from VW/BMW/Mercedes without tab separators):
five lines per cookie: name / category / purpose / retention / kind.
Heuristic: walk over all lines. If a line is NOT itself a
category/retention/kind line and the next line contains a category
→ it is a cookie name. The following lines up to the kind line are
collected as purpose / retention / kind.
"""
if not text or len(text) < 100:
return []
raw_lines = [ln.strip() for ln in text.splitlines()]
# Aggressive newline collapse: drop empty lines, but keep the lines
# of a multi-line purpose as separate entries.
lines = [ln for ln in raw_lines if ln]
if len(lines) < 10:
return []
# Drop the header row(s) if present
start = 0
if lines[0].lower() in ("name des cookies", "cookie name", "name"):
start = 5 if len(lines) > 5 else 1
by_vendor: dict[str, dict] = {}
seen_names: set[str] = set()
i = start
while i < len(lines) - 2:
name_line = lines[i]
cat_line = lines[i + 1] if i + 1 < len(lines) else ""
# Verify cat_line is a category indicator (otherwise the
# block is malformed — skip 1 line and try again).
if not any(c in cat_line.lower() for c in _CATEGORY_INDICATORS):
i += 1
continue
# Cookie-Name validation
nl = name_line.lower().strip()
if (not name_line or len(name_line) > 80
or len(name_line) < 2
or any(c in nl for c in _CATEGORY_INDICATORS)
or nl in seen_names
or nl in ("name des cookies", "kategorie",
"verwendungszweck", "speicherdauer",
"art des cookies")):
i += 1
continue
# Look ahead for the Art-Cookie line (at most 10 lines, up to index i + 11)
purpose_parts: list[str] = []
persistence = ""
art = ""
j = i + 2
while j < min(i + 12, len(lines)):
ln = lines[j]
ll = ln.lower()
if any(t in ll for t in (
"permanent/protokoll", "session cookie",
"persistent cookie", "permanent cookie",
"sitzungs-cookie", "permanent/ protokoll",
)):
art = ln
if not persistence and j > i + 2:
persistence = lines[j - 1]
break
purpose_parts.append(ln)
j += 1
purpose = " ".join(purpose_parts[:-1]) if len(purpose_parts) > 1 else " ".join(purpose_parts)
purpose = purpose[:500].strip()
seen_names.add(nl)
provider = _guess_vendor(name_line) or "Unbekannter Anbieter (VW-intern)"
# Marketing-Cookies = Drittanbieter
if "marketing" in cat_line.lower() or "tracking" in cat_line.lower():
if provider == "Unbekannter Anbieter (VW-intern)":
provider = "Unbekannter Drittanbieter (Marketing)"
entry = by_vendor.setdefault(provider, {
"name": provider, "country": "",
"purpose": "", "category": _normalize_category(cat_line),
"opt_out_url": "", "privacy_policy_url": "",
"persistence": "",
"cookies": [],
"source": "block_paste",
})
entry["cookies"].append({
"name": name_line,
"purpose": purpose[:300],
"expiry": persistence,
"is_third_party": "tracking" in cat_line.lower() or "marketing" in cat_line.lower(),
})
i = j + 1 if art else i + 5
out = list(by_vendor.values())
logger.info("parse_block_format: %d vendors / %d cookies",
len(out), sum(len(v["cookies"]) for v in out))
return out
def parse_cookie_table(text: str) -> list[dict]:
"""Return vendor records from a copy-pasted cookie table.
For non-tabular text: return [].
Tries, in this order:
1. Tab/pipe/comma-separated (classic table layout)
2. Five-line block format (VW browser copy)
3. return []
"""
if not text or len(text) < 100:
return []
@@ -98,6 +204,10 @@ def parse_cookie_table(text: str) -> list[dict]:
if sep:
sep_counts[sep] = sep_counts.get(sep, 0) + 1
if not sep_counts or max(sep_counts.values()) < 3:
# Kein Separator-Format → versuche Block-Format
block_vendors = parse_block_format(text)
if block_vendors:
return block_vendors
return []
sep = max(sep_counts, key=sep_counts.get)
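The block format that `parse_cookie_table` falls back to can be illustrated with a stripped-down parser. The sketch assumes that every record consists of exactly five non-empty lines and that purposes never wrap; `parse_block_format` in the diff is more tolerant and uses a category heuristic instead. `parse_fixed_blocks`, `HEADER` and `KEYS` are hypothetical names.

```python
from __future__ import annotations

HEADER = (
    "name des cookies", "kategorie", "verwendungszweck",
    "speicherdauer", "art des cookies",
)
KEYS = ("name", "category", "purpose", "expiry", "kind")


def parse_fixed_blocks(text: str) -> list[dict[str, str]]:
    """Split non-empty lines into records of five lines each."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Skip the five header cells if the paste starts with them.
    if tuple(ln.lower() for ln in lines[:5]) == HEADER:
        lines = lines[5:]
    records = []
    # Stop before a trailing, incomplete record.
    for i in range(0, len(lines) - 4, 5):
        records.append(dict(zip(KEYS, lines[i:i + 5])))
    return records


sample = """Name des Cookies
Kategorie
Verwendungszweck
Speicherdauer
Art des Cookies
awsalb
Funktionscookie
Der Cookie prüft, welcher Load Balancer für die aktuelle Session verwendet wird.
7 Tage
Persistent cookie
s_ecid
Trackingcookie (Analytics & Personalisierung)
First-Party-Cookie Besucherkennung
13 Monate nach dem letzten Besuch
Permanent/Protokoll
"""
records = parse_fixed_blocks(sample)
```

A fixed stride breaks as soon as one purpose spans two lines, which is presumably why the committed parser searches for the category and kind lines instead of counting.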
@@ -257,22 +367,67 @@ def parse_flat_cookie_text(text: str) -> list[dict]:
_VENDOR_GUESS = (
# Google-Familie (alles unter "Google" zusammenfassen — Dedup kuemmert sich)
("_ga", "Google"), ("_gid", "Google"), ("_gcl_", "Google"),
("ANID", "Google"), ("AID", "Google"), ("FPGCLDC", "Google"),
- ("IDE", "Google DoubleClick"), ("DSID", "Google"),
- ("_fbp", "Meta / Facebook"), ("fr", "Meta / Facebook"),
("FPAU", "Google"), ("FLC", "Google"), ("APC", "Google"),
("IDE", "Google"), ("DSID", "Google"), ("TAID", "Google"),
("NID", "Google"), ("1P_JAR", "Google"),
# Meta / Facebook
("_fbp", "Meta / Facebook"), ("_fbc", "Meta / Facebook"),
# fr ist Meta-Cookie, nur wenn keine andere Site-eigene Verwendung
# Microsoft / Bing
("_pin_unauth", "Pinterest"), ("_uetsid", "Microsoft Bing"),
("_uetvid", "Microsoft Bing"), ("MUID", "Microsoft"),
# Soziale Netzwerke
("tt_", "TikTok"), ("li_at", "LinkedIn"),
# CMP
("OptanonConsent", "OneTrust"), ("cookieconsent", "Borlabs / Cookie-CMP"),
("CookieConsentPolicy", "Borlabs / Cookie-CMP"),
# Analytics
("eta_", "etracker"), ("matomo", "Matomo"),
("_hjid", "Hotjar"), ("_hj", "Hotjar"),
- ("__cf", "Cloudflare"), ("datadome", "DataDome"),
- ("incap_", "Imperva Incapsula"),
("ajs_", "Segment"), ("amp_", "Amplitude"),
# Adobe-Familie
("sat_track", "Adobe Experience Cloud"),
("AMCV_", "Adobe Experience Cloud"),
("AMCV", "Adobe Experience Cloud"),
("AMCVS", "Adobe Experience Cloud"),
("demdex", "Adobe Experience Cloud"),
("dextp", "Adobe Experience Cloud"),
("dpm", "Adobe Experience Cloud"),
("mbox", "Adobe Target"),
("smartSignals", "Adobe Experience Cloud"),
("adbCDP", "Adobe Experience Cloud"),
("s_cc", "Adobe Analytics"), ("s_sq", "Adobe Analytics"),
("s_ecid", "Adobe Analytics"), ("s_vi", "Adobe Analytics"),
("s_fid", "Adobe Analytics"), ("s_plt", "Adobe Analytics"),
("s_pltp", "Adobe Analytics"), ("s_invisit", "Adobe Analytics"),
("s_vnc365", "Adobe Analytics"), ("s_ivc", "Adobe Analytics"),
("sc_appvn", "Adobe Analytics"), ("sc_pCmp", "Adobe Analytics"),
("sc_prevpage", "Adobe Analytics"), ("sc_prop", "Adobe Analytics"),
("sc_v17", "Adobe Analytics"), ("sc_v44", "Adobe Analytics"),
("sc_v49", "Adobe Analytics"),
# The Trade Desk
("TDID", "The Trade Desk"), ("TDCPM", "The Trade Desk"),
("TTDOptOut", "The Trade Desk"),
# AdForm
("uid", "AdForm"), ("cid", "AdForm"), ("otsid", "AdForm"),
# everest
("everest", "Adobe Advertising Cloud (everest)"),
# Infra/CDN
("__cf", "Cloudflare"), ("datadome", "DataDome"),
("incap_", "Imperva Incapsula"), ("awsalb", "AWS Load Balancer"),
# Salesforce
("sfdc-", "Salesforce"), ("X-Salesforce", "Salesforce"),
("liveagent_", "Salesforce LiveAgent"),
# Inbenta
("inbenta", "Inbenta"),
# Sonstige Tracker
("_pk_", "Matomo / Piwik"),
("hmt_", "Akamai mPulse"),
# EDAA / Industry Self-regulation
("EDAAT", "EDAA / Online Choices"),
("Eboptout", "EDAA / Online Choices"),
)
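The body of `_guess_vendor` is not part of this diff, so how the table is matched is an assumption here. The sketch below shows one plausible strategy: case-sensitive prefix matching with the longest key tried first. `VENDOR_GUESS` is a shortened copy of the table, and `guess_vendor` is a hypothetical stand-in, not the committed function.

```python
from __future__ import annotations

# Shortened copy of the table above, for illustration only.
VENDOR_GUESS = (
    ("_ga", "Google"),
    ("_hj", "Hotjar"),
    ("_hjid", "Hotjar"),
    ("AMCV_", "Adobe Experience Cloud"),
    ("smartSignals", "Adobe Experience Cloud"),
    ("uid", "AdForm"),
)

# Longest key first, so a specific key wins over a shorter one.
_BY_LENGTH = sorted(VENDOR_GUESS, key=lambda kv: len(kv[0]), reverse=True)


def guess_vendor(cookie_name: str) -> str | None:
    """Return the vendor of the longest matching prefix, or None."""
    name = cookie_name.strip()
    for prefix, vendor in _BY_LENGTH:
        if name.startswith(prefix):
            return vendor
    return None
```

Short generic keys such as `uid` or `cid` still match any cookie whose name starts with them (for example a site-internal `uid_internal`), so such keys may be better served by exact matching.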
@@ -0,0 +1,167 @@
"""
Vendor deduplication and garbage filter.
Normalizes vendor names (Google + Google DoubleClick + DoubleClick/Google
Marketing → one entry) and removes garbage entries that were wrongly
detected as vendors ('click to select a dealership', 'Mehrere OEMs',
URL fragments, etc.).
Applied after all vendor sources (LLM, library, pattern, Phase G) and
before the VVT table is rendered.
"""
from __future__ import annotations
import logging
import re
logger = logging.getLogger(__name__)
# Aliase: alle Schreibweisen → kanonischer Name
_VENDOR_ALIASES: dict[str, str] = {
# Google-Familie
"google": "Google",
"google llc": "Google",
"google inc": "Google",
"google marketing platform": "Google",
"google ads": "Google",
"google adsense": "Google",
"google analytics": "Google Analytics",
"google tag manager": "Google Tag Manager",
"google doubleclick": "Google",
"doubleclick": "Google",
"doubleclick/google marketing": "Google",
"doubleclick by google": "Google",
# Adobe-Familie
"adobe": "Adobe",
"adobe inc": "Adobe",
"adobe systems": "Adobe",
"adobe analytics": "Adobe Analytics",
"adobe audience manager": "Adobe Audience Manager",
"adobe experience cloud": "Adobe Experience Cloud",
"adobe target": "Adobe Target",
"adobe advertising cloud (everest)": "Adobe Advertising Cloud",
# Trade Desk
"the trade desk": "The Trade Desk",
"tradedesk": "The Trade Desk",
"the tradedesk": "The Trade Desk",
"trade desk": "The Trade Desk",
# Meta
"meta": "Meta / Facebook",
"meta platforms": "Meta / Facebook",
"facebook": "Meta / Facebook",
"meta / facebook": "Meta / Facebook",
# AdForm
"adform": "AdForm",
"adform dsp": "AdForm",
# Microsoft
"microsoft": "Microsoft",
"microsoft bing": "Microsoft Bing",
"linkedin": "LinkedIn (Microsoft)",
"linkedin corporation": "LinkedIn (Microsoft)",
# CMP
"onetrust": "OneTrust",
"cookiebot": "Cookiebot",
"usercentrics": "Usercentrics",
"borlabs": "Borlabs",
"borlabs / cookie-cmp": "Borlabs",
# Salesforce
"salesforce": "Salesforce",
"salesforce liveagent": "Salesforce",
"liveagent": "Salesforce",
# Cloudflare
"cloudflare": "Cloudflare",
}
# Garbage-Patterns: wenn der Vendor-Name darauf matched → wegfiltern
_GARBAGE_PATTERNS = (
re.compile(r"^click to ", re.I),
re.compile(r"^mehrere oems", re.I),
re.compile(r"^breakpilot[-_ ]?snapshot", re.I),
re.compile(r"^https?://", re.I), # URLs
re.compile(r"^https?$", re.I),
re.compile(r"^javascript:", re.I),
re.compile(r"^undefined$|^null$|^none$", re.I),
re.compile(r"^[\d\W]+$"), # nur Zahlen/Symbole
re.compile(r"^.{1,2}$"), # Ein-/Zwei-Zeichen-"Namen"
re.compile(r"^(ein|der|die|das|von|und|aber|oder)$", re.I),
re.compile(r"^cookie$|^cookies$", re.I),
)
def _is_garbage(name: str) -> bool:
if not name or len(name.strip()) < 2:
return True
if len(name) > 120:
return True
return any(p.search(name) for p in _GARBAGE_PATTERNS)
def _canonical_name(name: str) -> str:
nl = name.strip().lower()
if nl in _VENDOR_ALIASES:
return _VENDOR_ALIASES[nl]
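# Substring match, longest alias first, so that e.g.
# 'google analytics 4' resolves to 'Google Analytics' instead of
# stopping early at the shorter alias 'google'.
for alias, canonical in sorted(
_VENDOR_ALIASES.items(), key=lambda kv: len(kv[0]), reverse=True
):
if len(alias) >= 6 and alias in nl:
return canonical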
return name.strip()
def normalize_vendors(vendors: list[dict]) -> list[dict]:
"""Filter garbage and deduplicate by canonical alias.
Merges the cookies lists when the same vendor appears several times
(e.g. from LLM + library + Phase G). The first entry seen is kept;
its empty metadata fields are filled from later duplicates.
"""
if not vendors:
return []
by_canon: dict[str, dict] = {}
dropped_garbage = 0
merged = 0
for v in vendors:
if not isinstance(v, dict):
continue
raw_name = (v.get("name") or "").strip()
if _is_garbage(raw_name):
dropped_garbage += 1
continue
canon = _canonical_name(raw_name)
if canon in by_canon:
# Merge: cookies vereinen, source-Tags joinen
ex = by_canon[canon]
ex_cookies = ex.get("cookies") or []
new_cookies = v.get("cookies") or []
seen_ck = {(c.get("name") or "").lower() for c in ex_cookies if isinstance(c, dict)}
for c in new_cookies:
if isinstance(c, dict):
nm = (c.get("name") or "").strip().lower()
if nm and nm not in seen_ck:
ex_cookies.append(c)
seen_ck.add(nm)
ex["cookies"] = ex_cookies
# Source-Tag merging (semicolon-separated)
ex_src = (ex.get("source") or "").split(";")
new_src = v.get("source") or ""
if new_src and new_src not in ex_src:
ex_src.append(new_src)
ex["source"] = ";".join([s for s in ex_src if s])
# Bessere Metadaten uebernehmen (falls leer)
for k in ("country", "opt_out_url", "privacy_policy_url",
"purpose", "category", "persistence"):
if not ex.get(k) and v.get(k):
ex[k] = v[k]
merged += 1
else:
v["name"] = canon
by_canon[canon] = v
if dropped_garbage or merged:
logger.info(
"Vendor-Normalizer: %d garbage dropped, %d duplicate merges, "
"%d unique vendors (input: %d)",
dropped_garbage, merged, len(by_canon), len(vendors),
)
return list(by_canon.values())
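The merge branch of `normalize_vendors` can be looked at in isolation. The sketch below extracts it into a hypothetical helper, `merge_vendor`, to show what happens when two records resolve to the same canonical name: cookies are unioned case-insensitively, source tags are joined with `;`, and only empty metadata fields are filled.

```python
from __future__ import annotations

_META_FIELDS = (
    "country", "opt_out_url", "privacy_policy_url",
    "purpose", "category", "persistence",
)


def merge_vendor(existing: dict, incoming: dict) -> dict:
    """Merge `incoming` into `existing` in place and return `existing`."""
    cookies = existing.get("cookies") or []
    seen = {(c.get("name") or "").lower() for c in cookies if isinstance(c, dict)}
    for c in incoming.get("cookies") or []:
        if not isinstance(c, dict):
            continue
        key = (c.get("name") or "").strip().lower()
        if key and key not in seen:
            cookies.append(c)
            seen.add(key)
    existing["cookies"] = cookies

    # Source tags are kept as a semicolon-separated list without duplicates.
    sources = [s for s in (existing.get("source") or "").split(";") if s]
    new_src = incoming.get("source") or ""
    if new_src and new_src not in sources:
        sources.append(new_src)
    existing["source"] = ";".join(sources)

    # Metadata of the first record wins; only gaps are filled.
    for field in _META_FIELDS:
        if not existing.get(field) and incoming.get(field):
            existing[field] = incoming[field]
    return existing


first = {"name": "Google", "cookies": [{"name": "_ga"}], "source": "llm", "country": ""}
second = {"name": "Google", "cookies": [{"name": "_GA"}, {"name": "IDE"}],
          "source": "library", "country": "US"}
merged = merge_vendor(first, second)
```

Because the first record wins, the order in which the vendor sources are concatenated before normalization decides whose description and URLs end up in the report.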
@@ -0,0 +1,55 @@
Name des Cookies
Kategorie
Verwendungszweck
Speicherdauer
Art des Cookies
VWD6_ENSIGHTEN_PRIVACY_MODAL_LOADED
Funktionscookie
Dieses Cookie speichert, ob für den User der Cookie Manager angezeigt wurde.
1 Jahr
Permanent/Protokoll
VWD6_ENSIGHTEN_PRIVACY_MODAL_VIEWED
Funktionscookie
Dieses Cookie speichert, ob für der User Einstellung im Cookie Manager vorgenommen hat.
1 Jahr
Permanent/Protokoll
VWD6_ENSIGHTEN_PRIVACY_<category name>
Funktionscookie
Dieses Cookie speichert, ob der User sein Einverständnis für die entsprechende Cookie Kategorie gegeben hat.
1 Jahr
Permanent/Protokoll
UZ_TI_dc_value
Funktionscookie
Dieses Cookie verfolgt die Studien-ID oder die Segment-ID in Abhängigkeit vom Wert von UZ_TI_dc_value.
20 Tage
Persistent cookie
awsalb
Funktionscookie
Der Cookie prüft, welcher Load Balancer für die aktuelle Session verwendet wird.
7 Tage
Persistent cookie
UZ_TI_S_<ID>
Funktionscookie
Der Cookie erfasst, ob ein anderer Cookie für jedes Segment verwendet wird.
20 Tage
Persistent cookie
smartSignals2UiD
Trackingcookie (Analytics & Personalisierung)
Dieses Cookie enthält eine eindeutige, zufällig generierte ID für einen Webseiten User.
1 Jahr
Permanent/Protokoll
smartSignals2sUiD
Trackingcookie (Analytics & Personalisierung)
userId verbesserter Mechanismus zur Browser-Tracking-Einschraenkungen
1 Jahr
Permanent/Protokoll
smartSignals2CP
Trackingcookie (Analytics & Personalisierung)
Personalisierte Inhalte angezeigt
30 Minuten
Session Cookie
s_ecid
Trackingcookie (Analytics & Personalisierung)
First-Party-Cookie Besucherkennung
13 Monate nach dem letzten Besuch
Permanent/Protokoll