breakpilot-core

Author	SHA1	Message	Date
Benjamin Admin	652e3a65a3	feat(pipeline): F2+F3 action/object ontology — DB-backed normalization CI / go-lint (push) Has been skipped Details CI / python-lint (push) Has been skipped Details CI / nodejs-lint (push) Has been skipped Details CI / test-go-consent (push) Successful in 36s Details CI / test-python-voice (push) Successful in 33s Details CI / test-bqas (push) Successful in 31s Details Migrates ACTION_TYPES (26+8 types), _NEGATIVE_PATTERNS (22), _ACTION_SYNONYMS (65), and _OBJECT_SYNONYMS (75) from hardcoded dicts to DB tables. - SQL migration: 003_action_object_ontology.sql (3 tables) - Migration scripts: f2_migrate_actions.py (34 types, 145 synonyms), f3_migrate_objects.py (75 objects) - OntologyRegistry cache: 5min TTL, raises RuntimeError if empty (safe fallback to dicts) - control_ontology.classify_action/get_phase delegate to DB with dict fallback - control_dedup.normalize_action/normalize_object delegate to DB with dict fallback - 25 new tests, 446 total pass, 0 regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 23:47:53 +02:00
Benjamin Admin	9437e029d0	feat(pipeline): F1 regulation registry — DB-backed license/source-type lookup Migrates REGULATION_LICENSE_MAP (135 entries) and SOURCE_REGULATION_CLASSIFICATION (58 entries) from hardcoded Python dicts to compliance.regulation_registry table. - SQL migration: 002_regulation_registry.sql (table + indexes + trigger) - Migration script: f1_migrate_regulation_registry.py (162 rows, --dry-run) - RegulationRegistry cache: 5min TTL, prefix fallback, graceful degradation - control_generator._classify_regulation() delegates to DB with dict fallback - source_type_classification.classify_source_regulation() delegates to DB - 34 new tests (lookup, cache, degradation, migration data consistency) - 421 total tests pass, 0 regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 23:14:06 +02:00
Benjamin Admin	93099b2770	feat(pipeline): structural metadata end-to-end (Blocks D2-D4) D2: RAG service stores section/section_title/paragraph/paragraph_num/page from embedding service chunks_with_metadata into Qdrant payloads. D3: Control generator prefers section > article > section_title from Qdrant, adds page to source_citation and generation_metadata. D4: Validated with real BGB §§ 312-312k text. Found and fixed critical bug where Phase 3 overlap destroyed the [§ ...] section prefix, causing only the first chunk per document to have metadata. All subsequent chunks lost section info. Also fixes pre-existing lint issues (unused imports, ambiguous variable names, duplicate dict key, bare except). 456 tests passing (58 embedding + 387 pipeline + 11 rag-service). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 20:34:00 +02:00
Benjamin Admin	d9c16fb914	feat(pipeline): add adversarial tests (30 cases) + regression harness Block C implementation: - adversarial_cases.yaml: 30 tricky cases in 5 categories (wrong legal basis, dark patterns, incomplete docs, similar-but-different, homonyms) - test_adversarial.py: 63 tests validating adversarial cases - test_regression.py: ontology stability, dependency engine, quality metrics - conftest.py: shared fixtures (DB session, sample controls) Total: 371 tests passing (221 existing + 150 new). Real-world benchmarks (C1) need manual ground truth creation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 13:02:29 +02:00
Benjamin Admin	42ab5ead26	feat(pipeline): implement Control Dependency Engine (Block 9) Core engine (dependency_engine.py): - 5 dependency types: prerequisite, supersedes, compensating_control, conditional_requirement, scope_exclusion - Generic condition evaluator (JSONB rules with AND/OR/NOT/field ops) - Priority-based conflict resolution - Cycle detection (DFS) + topological sort - Full evaluation with MCP-compatible dependency_resolution trace - 39 tests all passing (incl. GHV scenario from user requirements) Automatic generator (dependency_generator.py): - Ontology-based: same normalized_object + phase sequence -> prerequisite - Pattern-based: define->implement, implement->monitor, etc. - Domain packs: YAML rules for GDPR, AI Act, CRA, Security, Labor Contracts - 14 tests all passing API routes (dependency_routes.py): - CRUD for dependencies - POST /evaluate with dependency resolution - POST /generate (auto-generation with dry_run) - POST /validate (cycle detection) - GET /graph (nodes + edges for visualization) Prompt enhancement (decomposition_pass.py): - Added dependency_hints + lifecycle_phase_order to Pass 0b prompt - Stored in generation_metadata for post-processing DB migration: control_dependencies + control_evaluation_results tables 126 tests total, all passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-26 20:28:10 +02:00
Benjamin Admin	d660a45bb5	feat(pipeline): implement golden test suite + fix ontology patterns - Add test_golden_controls.py: 37 tests covering all 8 YAML categories (container, framework, evidence, negative, title, split, scope, merge_key) - Fix evidence detection: handle German feminine articles (eine/einer/etc.) - Fix framework detection: use verb stems for conjugated German verbs - Add framework patterns: OWASP API6, CCM without CSA prefix, generic category - Fix negative patterns: use "nicht übertragen/gespeichert/erscheinen" before generic "dürfen nicht" to correctly route prevent vs exclude All 73 tests passing (36 ontology + 37 golden). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-26 09:48:12 +02:00
Benjamin Admin	b3fbbbacfe	feat(control-pipeline): Control Ontology v1 — action types, evidence/container/framework detection Block 7.1-7.2 from masterplan: - 26 action_types with German aliases + phase mapping - Negative obligation patterns (exclude, prevent, enforce) - Container detection (11 composite objects that must not become atomic) - Evidence detection (14 indicators + "X dokumentieren" pattern) - Framework reference detection (OWASP, NIST, BSI, CSA, ISO patterns) - classify_obligation() routes to: atomic, composite, evidence, framework_container - build_canonical_key() for deterministic dedup - 36 tests covering all classification functions Also: merge_key bug fix in _process_pass0b_control() Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-26 09:06:39 +02:00
Benjamin Admin	fbeb93046d	docs: Pass 0b v2 evaluation — 28 controls, 7.9/10 avg, 3 findings for v3 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-26 00:19:06 +02:00
Benjamin Admin	0cce8a2011	feat: add Golden Test Suite v1 (40 regression tests for Pass 0b pipeline) 8 categories: duplicate explosion, compound split, negative obligations, container detection, framework decomposition, evidence leakage, scope dimension, title quality. Includes global quality gates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-26 00:05:08 +02:00
Benjamin Admin	3ffa3f5793	feat(control-pipeline): add Document Compliance Engine — scope detection + document requirements New service: document_scope_resolver.py with 28 document rules covering: - Base (impressum, privacy_policy) - Tracking (cookie_banner, cookie_policy) - E-Commerce (AGB, withdrawal, shipping, pricing, payment) - Digital (digital_content_terms, no_withdrawal_notice) - SaaS (ToS, service_description, DPA, SLA) - AI (transparency_notice, automated_decisions) - Hardware (warranty, return, CE, safety) - Environmental (WEEE, battery disposal) - Marketplace (seller terms, ranking transparency) - Subscription (cancellation terms) API: POST /v1/document-compliance/required Input: company flags + jurisdiction → Output: required documents + assessment Includes confidence scoring, escalation detection (e.g. ecommerce without distance_selling flag), and reasoning. 19 tests covering all business model combinations including B2B-only exclusions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-24 08:39:55 +02:00
Benjamin Admin	716bc651c4	fix(control-pipeline): remove fictional demo packages, add real DB integration tests Deleted 3 packages that were copied without validation: - applicability_demo/ (fictional control IDs, wrong API schema) - applicability_demo_sdk/ (wrong endpoint URL, fictional request format) - applicability_demo_ci/ (GitHub Actions instead of Gitea, duplicated code) Replaced with real integration in test_applicability_use_cases.py: - TestApplicabilityIntegration calls real get_applicable_controls() - Checks source_citation->source and control_id domain prefixes - Runs against actual DB when DATABASE_URL is set - 128 structure/acceptance tests pass, 24 integration tests skip without DB Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 19:59:56 +02:00
Benjamin Admin	27f12e4659	feat(control-pipeline): add CI regression suite for applicability tests Makefile + pytest + GitHub Actions workflow for automated regression: - make install / make eval / make test - pytest integration with demo_cases.yaml - Golden outputs for 6 priority cases - Report generation (JSON + Markdown) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 19:12:44 +02:00
Benjamin Admin	a7c6ffe4dd	feat(control-pipeline): add SDK endpoint demo package for applicability tests Request payloads + response contract + api_runner.py for 6 priority cases. Can be run directly against /v1/applicability/evaluate endpoint. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 19:11:44 +02:00
Benjamin Admin	ae5c5c24eb	feat(control-pipeline): add applicability demo test package with evaluator 6 priority demo cases with golden outputs, evaluator.py and run_demo.py: - CASE-001: Webshop+Stripe (anti-PSD2 false positive) - CASE-002: Bank+TAN-Generator (scope override for batteries) - CASE-004: FinTech Wallet (true positive PSD2/AML) - CASE-006: SaaS+SMS Gateway (anti-TKG false positive) - CASE-008: Software→IoT Hardware (multi-regime scope) - CASE-011: Embedded Finance (escalation case) Self-test passes 6/6 against golden outputs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 19:08:31 +02:00
Benjamin Admin	e8ec50e0fc	feat(control-pipeline): 24 demo test cases for applicability engine YAML-based test package with 4 categories (6 each): - Standard sector cases (Telko, SaaS, Energie, Automotive, Health, Law) - Scope-beats-sector (Bank+Battery, KI-Recruiting, White-Label, Payments) - False friends (Stripe!=PSD2, Hotline!=TKG, Repo-signals!=regulation) - Escalation (IoT-SIM, FinTech unclear, Treuhand, KI-Diagnose) Enforces 5 acceptance rules: no false certainty, scope>sector, repo signals insufficient, standard first, 40%+ negative tests. Scoring framework: must_include + must_not_include + reasoning + escalation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 17:42:38 +02:00
Benjamin Admin	1f8667c7da	feat(control-pipeline): replace similarity-only dedup with LLM-verified dedup in pipeline Stage 4 (Harmonization) now uses two-tier approach: - Score >= 0.92: auto-duplicate (embedding only, fast) - Score 0.85-0.92: LLM verification via local qwen3.5 (think=false, ~3s) - Score < 0.85: not a duplicate This eliminates ~44% false positives from pure embedding similarity. LLM_DEDUP_ENABLED env var controls the feature (default: true). Also adds 10 applicability use case tests (bank+TAN, webshop+Stripe, SaaS startup, energy provider, health app, automotive, law firm, etc.) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 16:57:37 +02:00
Benjamin Admin	441d5740bd	feat: Applicability Engine + API-Filter + DB-Sync + Cleanup CI / go-lint (push) Has been skipped Details CI / python-lint (push) Has been skipped Details CI / nodejs-lint (push) Has been skipped Details CI / test-go-consent (push) Successful in 35s Details CI / test-python-voice (push) Successful in 33s Details CI / test-bqas (push) Successful in 37s Details CI / Deploy (push) Failing after 2s Details - Applicability Engine (deterministisch, kein LLM): filtert Controls nach Branche, Unternehmensgroesse, Scope-Signalen - API-Filter auf GET /controls, /controls-count, /controls-meta - POST /controls/applicable Endpoint fuer Company-Profile-Matching - 35 Unit-Tests fuer Engine - Port-8098-Konflikt mit Nginx gefixt (nur expose, kein Host-Port) - CLAUDE.md: control-pipeline Dokumentation ergaenzt - 6 internationale Gesetze geloescht (ES/FR/HU/NL/SE/CZ — nur DACH) - DB-Backup-Import-Script (import_backup.py) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 21:58:17 +02:00
Benjamin Admin	e3ab428b91	feat: control-pipeline Service aus Compliance-Repo migriert Control-Pipeline (Pass 0a/0b, BatchDedup, Generator) als eigenstaendiger Service in Core, damit Compliance-Repo unabhaengig refakturiert werden kann. Schreibt weiterhin ins compliance-Schema der shared PostgreSQL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:40:47 +02:00

18 Commits