Commit Graph

8 Commits

Author SHA1 Message Date
Benjamin Admin
642a8587b5 feat(control-pipeline): add batch-dedup endpoint + source_citation JSONB migration
- Add POST /v1/canonical/generate/batch-dedup endpoint for Pass 0b
  atomic controls deduplication (Phase 1: intra-group, Phase 2: cross-group)
- source_citation column migrated from TEXT to JSONB (5,459 rows converted)
- migrate_jsonb.py script added for generation_metadata conversion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 08:44:31 +02:00
Benjamin Admin
8b7671d310 feat(control-pipeline): add repair backfill endpoint for missing title/objective/requirements
POST /v1/canonical/generate/backfill-repair uses Anthropic API to
generate missing fields from available context (source text, other
fields). Handles 1,470 controls with incomplete data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 07:06:19 +02:00
Benjamin Admin
b29dc33708 fix(control-pipeline): anchor finder uses direct Qdrant search instead of Go SDK
The Go SDK RAG proxy returns 401 (Qdrant API key mismatch). Switch
AnchorFinder to use direct Qdrant vector search + embedding service,
same approach as the main pipeline. No dependency on Go SDK anymore.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 18:13:12 +02:00
Benjamin Admin
91f4202e88 feat(control-pipeline): add anchor backfill endpoint + normalize target_audience
- Add POST /v1/canonical/generate/backfill-anchors endpoint for batch
  populating open_anchors on controls generated with skip_web_search=true
- Uses AnchorFinder Stage A (RAG search) to find OWASP/NIST/ENISA refs
- Background job with progress tracking (same pattern as other backfills)
- Promotes needs_review controls that gain anchors to draft state
- Target audience normalization (enterprise/authority/provider → JSON arrays)
  already applied via SQL

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 18:04:50 +02:00
Benjamin Admin
756d068b4f fix: skip_web_search Default auf True — 5x schnellere Pipeline
Anchor-Search (DuckDuckGo + RAG via SDK) verlangsamt Pipeline von
~50 Chunks/min auf ~10 Chunks/min. Anchors (OWASP/NIST-Referenzen)
koennen nachtraeglich in einem Batch-Job befuellt werden.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:26:01 +02:00
Benjamin Admin
f89ce46631 fix: Pipeline-Skalierung — 6 Optimierungen für 80k+ Controls
1. control_generator: GeneratorResult.status Default "completed" → "running" (Bug)
2. control_generator: Anthropic API mit Phase-Timeouts + Retry bei Disconnect
3. control_generator: regulation_exclude Filter + Harmonization via Qdrant statt In-Memory
4. decomposition_pass: Enrich Pass Batch-UPDATEs (400k → ~400 DB-Calls)
5. decomposition_pass: Merge Pass single Query statt N+1
6. batch_dedup_runner: Cross-Group Dedup parallelisiert (asyncio.gather)
7. canonical_control_routes: Framework Controls API Pagination (limit/offset)
8. DB-Indizes: idx_oc_parent_release, idx_oc_trigger_null, idx_cc_framework

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:09:32 +02:00
Benjamin Admin
441d5740bd feat: Applicability Engine + API-Filter + DB-Sync + Cleanup
Some checks failed
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / test-go-consent (push) Successful in 35s
CI / test-python-voice (push) Successful in 33s
CI / test-bqas (push) Successful in 37s
CI / Deploy (push) Failing after 2s
- Applicability Engine (deterministisch, kein LLM): filtert Controls
  nach Branche, Unternehmensgroesse, Scope-Signalen
- API-Filter auf GET /controls, /controls-count, /controls-meta
- POST /controls/applicable Endpoint fuer Company-Profile-Matching
- 35 Unit-Tests fuer Engine
- Port-8098-Konflikt mit Nginx gefixt (nur expose, kein Host-Port)
- CLAUDE.md: control-pipeline Dokumentation ergaenzt
- 6 internationale Gesetze geloescht (ES/FR/HU/NL/SE/CZ — nur DACH)
- DB-Backup-Import-Script (import_backup.py)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:58:17 +02:00
Benjamin Admin
e3ab428b91 feat: control-pipeline Service aus Compliance-Repo migriert
Control-Pipeline (Pass 0a/0b, BatchDedup, Generator) als eigenstaendiger
Service in Core, damit Compliance-Repo unabhaengig refakturiert werden kann.
Schreibt weiterhin ins compliance-Schema der shared PostgreSQL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:40:47 +02:00