Commit Graph

17 Commits

Author SHA1 Message Date
Benjamin Admin
6a0e7c947f perf(pipeline): switch to v3 prompt for generation, v4 fields via Haiku backfill
Remove applicability/scanner_hint/evidence_type/provides_context from
Pass 0b prompt to reduce output tokens (~40% less). These 6 fields are
added via cheap Haiku backfill afterwards (~$1.50 per 10k controls).

Saves ~$200 over the remaining 160k obligations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-27 00:14:47 +02:00
Benjamin Admin
5ef039a6bc feat(pipeline): Pass 0b prompt v4 + Haiku backfill endpoint
Prompt v4 adds 6 new fields to Pass 0b output:
- applicability: condition rules (same format as dependency engine)
- check_type: expanded to 10 granular types
- scanner_hint: search_terms + negative_indicators for MCP
- manual_review_required_if: escalation conditions
- evidence_type: code/process/hybrid
- provides_context: context variables this control creates

New endpoint POST /generate/backfill-extended:
- Backfills existing 9k controls via Haiku Batch API (~$1.50)
- Adds all 6 new fields to generation_metadata
- Supports dry_run mode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-26 23:14:59 +02:00
Benjamin Admin
42ab5ead26 feat(pipeline): implement Control Dependency Engine (Block 9)
Core engine (dependency_engine.py):
- 5 dependency types: prerequisite, supersedes, compensating_control,
  conditional_requirement, scope_exclusion
- Generic condition evaluator (JSONB rules with AND/OR/NOT/field ops)
- Priority-based conflict resolution
- Cycle detection (DFS) + topological sort
- Full evaluation with MCP-compatible dependency_resolution trace
- 39 tests all passing (incl. GHV scenario from user requirements)

Automatic generator (dependency_generator.py):
- Ontology-based: same normalized_object + phase sequence -> prerequisite
- Pattern-based: define->implement, implement->monitor, etc.
- Domain packs: YAML rules for GDPR, AI Act, CRA, Security, Labor Contracts
- 14 tests all passing

API routes (dependency_routes.py):
- CRUD for dependencies
- POST /evaluate with dependency resolution
- POST /generate (auto-generation with dry_run)
- POST /validate (cycle detection)
- GET /graph (nodes + edges for visualization)

Prompt enhancement (decomposition_pass.py):
- Added dependency_hints + lifecycle_phase_order to Pass 0b prompt
- Stored in generation_metadata for post-processing

DB migration: control_dependencies + control_evaluation_results tables

126 tests total, all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-26 20:28:10 +02:00
Benjamin Admin
629b9d9ca5 feat(pipeline): store MCP fields (assertion, pass/fail criteria, check_type) in generation_metadata
- Add assertion, pass_criteria, fail_criteria, check_type to AtomicControlCandidate dataclass
- Parse MCP fields from LLM output in _process_pass0b_control
- Store MCP fields in generation_metadata JSON for later use by MCP scanner
- Fields default to empty when not present (backward-compatible with old prompts)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-26 09:32:56 +02:00
Benjamin Admin
7e3b1108e2 feat: integrate Ontology pre-LLM filter into Pass 0b submit
Obligations classified before API call:
- evidence → skipped (saves API cost)
- composite → skipped (not atomic)
- framework_container → skipped (decompose separately)
- atomic → sent to LLM

Filter stats returned in submit response.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-26 09:13:32 +02:00
Benjamin Admin
b3fbbbacfe feat(control-pipeline): Control Ontology v1 — action types, evidence/container/framework detection
Block 7.1-7.2 from masterplan:
- 26 action_types with German aliases + phase mapping
- Negative obligation patterns (exclude, prevent, enforce)
- Container detection (11 composite objects that must not become atomic)
- Evidence detection (14 indicators + "X dokumentieren" pattern)
- Framework reference detection (OWASP, NIST, BSI, CSA, ISO patterns)
- classify_obligation() routes to: atomic, composite, evidence, framework_container
- build_canonical_key() for deterministic dedup
- 36 tests covering all classification functions

Also: merge_key bug fix in _process_pass0b_control()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-26 09:06:39 +02:00
Benjamin Admin
3a100fa1f1 feat: Pass 0b prompt v3 — compound action ban, evidence-of-action rule, pflicht-vs-prozess merge
Fixes from v2 evaluation (7.9/10 avg, 28 controls):
1. COMPOUND BAN: "durchführen UND Maßnahmen ergreifen" → pick primary action only
2. EVIDENCE-OF-ACTION: "Tests dokumentieren" → evidence field, not own control
3. PFLICHT=PROZESS: "Behörden informieren" + "Verfahren etablieren" = 1 control
4. MERGE-KEY BUG: merge_key from LLM output now stored in generation_metadata

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-26 00:25:38 +02:00
Benjamin Admin
7a53f5bee1 feat: Pass 0b prompt v2 — container detection, merge-key, evidence separation, actionable titles
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-26 00:00:59 +02:00
Benjamin Admin
ea30ceb1f1 feat(control-pipeline): improved Pass 0b prompt for actionable control titles
Key changes to system prompt:
- Evidence/documentation belongs in evidence field, NOT as separate control
- SBOM = 1 control (not "maintain" + "document" separately)
- Security lifecycle phases (identify/assess/remediate/monitor) = separate controls
- Same object + same action + same actor = 1 control (merge, not split)
- Titles must contain the ACTION, not just the subject
  WRONG: "Vertraulichkeit Mitarbeiter"
  RIGHT: "Mitarbeiter zur Vertraulichkeit verpflichten"

Titles serve as MCP search queries against customer documents/code.
Bad titles = bad search results = unusable product.

All 52,566 old pass0b controls deprecated (not deleted) for full regeneration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 23:45:37 +02:00
Benjamin Admin
cd33777d75 fix: Pass 0b INSERT ON CONFLICT DO UPDATE + per-result commit/rollback
Prevents UniqueViolation from blocking entire batch. Each result
is committed individually, errors are rolled back without affecting
subsequent results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 22:15:21 +02:00
Benjamin Admin
c73a489075 fix: Pass 0b filter — skip obligations whose parent already has pass0b controls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 21:54:32 +02:00
Benjamin Admin
7ddb572f5d fix: Pass 0b batch custom_id + result handler for numeric format
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 16:08:19 +02:00
Benjamin Admin
f1359d63ba fix: handle new numeric batch custom_id format in Pass 0a result processing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 07:21:50 +02:00
Benjamin Admin
bbfcd44407 fix: use numeric batch index as custom_id (64 char limit, alphanumeric only)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 00:39:13 +02:00
Benjamin Admin
5da5a5597b fix: increase Batch API upload timeout to 600s for large payloads
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 00:31:50 +02:00
Benjamin Admin
f89ce46631 fix: Pipeline-Skalierung — 6 Optimierungen für 80k+ Controls
1. control_generator: GeneratorResult.status Default "completed" → "running" (Bug)
2. control_generator: Anthropic API mit Phase-Timeouts + Retry bei Disconnect
3. control_generator: regulation_exclude Filter + Harmonization via Qdrant statt In-Memory
4. decomposition_pass: Enrich Pass Batch-UPDATEs (400k → ~400 DB-Calls)
5. decomposition_pass: Merge Pass single Query statt N+1
6. batch_dedup_runner: Cross-Group Dedup parallelisiert (asyncio.gather)
7. canonical_control_routes: Framework Controls API Pagination (limit/offset)
8. DB-Indizes: idx_oc_parent_release, idx_oc_trigger_null, idx_cc_framework

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:09:32 +02:00
Benjamin Admin
e3ab428b91 feat: control-pipeline Service aus Compliance-Repo migriert
Control-Pipeline (Pass 0a/0b, BatchDedup, Generator) als eigenstaendiger
Service in Core, damit Compliance-Repo unabhaengig refakturiert werden kann.
Schreibt weiterhin ins compliance-Schema der shared PostgreSQL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:40:47 +02:00