feat(pipeline): Anthropic Batch API, source/regulation filter, cost optimization
Some checks failed
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 35s
CI/CD / test-python-backend-compliance (push) Successful in 34s
CI/CD / test-python-document-crawler (push) Successful in 22s
CI/CD / test-python-dsms-gateway (push) Successful in 19s
CI/CD / validate-canonical-controls (push) Successful in 11s
CI/CD / Deploy (push) Has been skipped
Some checks failed
CI/CD / go-lint (push) Has been skipped
CI/CD / python-lint (push) Has been skipped
CI/CD / nodejs-lint (push) Has been skipped
CI/CD / test-go-ai-compliance (push) Failing after 35s
CI/CD / test-python-backend-compliance (push) Successful in 34s
CI/CD / test-python-document-crawler (push) Successful in 22s
CI/CD / test-python-dsms-gateway (push) Successful in 19s
CI/CD / validate-canonical-controls (push) Successful in 11s
CI/CD / Deploy (push) Has been skipped
- Add Anthropic API support to decomposition Pass 0a/0b (prompt caching, content batching) - Add Anthropic Batch API (50% cost reduction, async 24h processing) - Add source_filter (ILIKE on source_citation) for regulation-based filtering - Add category_filter to Pass 0a for selective decomposition - Add regulation_filter to control_generator for RAG scan phase filtering (prefix match on regulation_code — enables CE + Code Review focus) - New API endpoints: batch-submit-0a, batch-submit-0b, batch-status, batch-process - 83 new tests (all passing) Cost reduction: $2,525 → ~$600-700 with all optimizations combined. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -384,6 +384,7 @@ class GeneratorConfig(BaseModel):
|
||||
skip_web_search: bool = False
|
||||
dry_run: bool = False
|
||||
existing_job_id: Optional[str] = None # If set, reuse this job instead of creating a new one
|
||||
regulation_filter: Optional[List[str]] = None # Only process chunks matching these regulation_code prefixes
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -803,6 +804,13 @@ class ControlGeneratorPipeline:
|
||||
or payload.get("regulation_code", "")
|
||||
or payload.get("source_id", "")
|
||||
or payload.get("source_code", ""))
|
||||
|
||||
# Filter by regulation_code if configured
|
||||
if config.regulation_filter and reg_code:
|
||||
code_lower = reg_code.lower()
|
||||
if not any(code_lower.startswith(f.lower()) for f in config.regulation_filter):
|
||||
continue
|
||||
|
||||
reg_name = (payload.get("regulation_name_de", "")
|
||||
or payload.get("regulation_name", "")
|
||||
or payload.get("source_name", "")
|
||||
|
||||
Reference in New Issue
Block a user