feat(cra): consume SBOM and DAST findings from the scanner MCP
CI / detect-changes (push) Successful in 8s
CI / branch-name (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 6s
CI / validate-canonical-controls (push) Successful in 10s
CI / loc-budget (push) Successful in 20s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Has been skipped
CI / test-go (push) Successful in 1m4s
CI / iace-gt-coverage (push) Successful in 15s
CI / test-python-backend (push) Successful in 24s
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Sharang's compliance-scanner-agent exposes SBOM (sbom_vuln_report) and DAST (list_dast_findings) as separate MCP tools (not via list_findings). The new fetch_all_findings(repo_id) pulls list_findings + SBOM + DAST in a SINGLE MCP session and normalizes them into the Finding schema:
- SBOM: one finding per vulnerable package (not per CVE), cwe=CWE-1395 -> deterministically CRA-AI-22 (robust against package names like "sqlite").
- DAST: cwe/endpoint/vuln_type are carried over -> mapping via cwe/keywords.

assess-from-scanner uses fetch_all_findings and returns source.breakdown (code/sbom/dast). DAST has no repo_id filter in the MCP -> dast_repo_scoped:false (deployment-wide, flagged transparently).

Real MCP data: Kitchenasty 58 code + 35 sbom + 81 dast -> 174 mapped (coverage 94.3%, all 35 SBOM -> CRA-AI-22).

Also includes the Qdrant->prod copy script (#42, verbatim macmini->prod).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
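The SBOM rule from the commit message (one finding per vulnerable package, always tagged CWE-1395) can be sketched as follows. This is only an illustration under assumptions: the row shape (`package`, `version`, `cve`), the output field names, and the helpers `normalize_sbom` and `breakdown` are hypothetical and are not taken from the actual fetch_all_findings implementation or the scanner's real response format.

```python
from collections import defaultdict

# CWE-1395: Dependency on Vulnerable Third-Party Component.
SBOM_CWE = "CWE-1395"


def normalize_sbom(rows, repo_id):
    """Collapse CVE-level rows into one finding per (package, version).

    `rows` is an ASSUMED shape: dicts with "package", "version", "cve".
    """
    grouped = defaultdict(set)
    for row in rows:
        grouped[(row["package"], row.get("version", ""))].add(row["cve"])
    return [
        {
            "repo_id": repo_id,
            "source": "sbom",
            "package": pkg,
            "version": version,
            "cwe": SBOM_CWE,       # fixed CWE, so the mapping never depends on the package name
            "cves": sorted(cves),  # the individual CVEs stay attached as evidence
        }
        for (pkg, version), cves in sorted(grouped.items())
    ]


def breakdown(findings):
    """Count findings per source, analogous to source.breakdown (code/sbom/dast)."""
    counts = {"code": 0, "sbom": 0, "dast": 0}
    for finding in findings:
        counts[finding["source"]] += 1
    return counts


rows = [
    {"package": "sqlite", "version": "3.39.0", "cve": "CVE-0000-0002"},  # placeholder ids
    {"package": "sqlite", "version": "3.39.0", "cve": "CVE-0000-0001"},
    {"package": "openssl", "version": "1.1.1", "cve": "CVE-0000-0003"},
]
findings = normalize_sbom(rows, "repo-1")
print(breakdown(findings))
```

Because the CWE is fixed rather than inferred from the package name, a keyword-based mapper never sees a name such as "sqlite", which is presumably what the commit message means by "robust".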
@@ -0,0 +1,89 @@
#!/usr/bin/env python3
"""Verbatim copy of the IACE Qdrant knowledge-base collections to another Qdrant.

There is no RAG/embedding service on prod, so the normal ingest_iace_kb.sh has no
target there. Instead we copy the already-embedded points (id + vector + payload)
1:1 from the source Qdrant (macmini) to the destination (prod). No re-embedding,
no re-chunking → the destination is byte-identical and /sdk/v1/rag/search reads it
the same way. Idempotent: same point ids → upsert overwrites, no duplicates.

Usage (run on macmini; reads local Qdrant, writes prod Qdrant):
    SRC_QDRANT=http://localhost:6333 \
    DST_QDRANT=https://qdrant-dev.breakpilot.ai \
    DST_QDRANT_KEY=<prod-api-key> \
    python3 copy_iace_collections_to_prod.py
"""
import json
import os
import urllib.error
import urllib.request

SRC = os.environ.get("SRC_QDRANT", "http://localhost:6333").rstrip("/")
DST = os.environ["DST_QDRANT"].rstrip("/")
KEY = os.environ["DST_QDRANT_KEY"]
COLLECTIONS = os.environ.get(
    "COLLECTIONS", "bp_iace_accident_stats,bp_iace_safety_kb,bp_iace_failure_kb"
).split(",")
BATCH = 128


def _req(method, url, body=None, key=None):
    data = json.dumps(body).encode() if body is not None else None
    r = urllib.request.Request(url, data=data, method=method)
    r.add_header("Content-Type", "application/json")
    if key:
        r.add_header("api-key", key)
    with urllib.request.urlopen(r, timeout=120) as resp:
        return json.loads(resp.read())


def _exists(base, col, key=None) -> bool:
    try:
        _req("GET", f"{base}/collections/{col}", key=key)
        return True
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return False
        raise


def copy_collection(col: str) -> None:
    src_cfg = _req("GET", f"{SRC}/collections/{col}")["result"]["config"]["params"]["vectors"]
    size, dist = src_cfg["size"], src_cfg["distance"]
    if _exists(DST, col, KEY):
        print(f"  {col}: dst exists — upserting into it")
    else:
        _req("PUT", f"{DST}/collections/{col}", {"vectors": {"size": size, "distance": dist}}, KEY)
        print(f"  {col}: created on dst ({size}d {dist})")

    offset, total = None, 0
    while True:
        body = {"limit": BATCH, "with_vector": True, "with_payload": True}
        if offset is not None:
            body["offset"] = offset
        res = _req("POST", f"{SRC}/collections/{col}/points/scroll", body)["result"]
        pts = res.get("points", [])
        if not pts:
            break
        upsert = [{"id": p["id"], "vector": p["vector"], "payload": p.get("payload", {})} for p in pts]
        _req("PUT", f"{DST}/collections/{col}/points?wait=true", {"points": upsert}, KEY)
        total += len(pts)
        offset = res.get("next_page_offset")
        if offset is None:
            break

    src_n = _req("POST", f"{SRC}/collections/{col}/points/count", {"exact": True})["result"]["count"]
    dst_n = _req("POST", f"{DST}/collections/{col}/points/count", {"exact": True}, KEY)["result"]["count"]
    flag = "OK" if dst_n >= src_n else "MISMATCH"
    print(f"  {col}: copied {total} | src={src_n} dst={dst_n} [{flag}]")


def main() -> None:
    print(f"Copy IACE collections {SRC} -> {DST}")
    for col in COLLECTIONS:
        copy_collection(col.strip())
    print("Done.")


if __name__ == "__main__":
    main()
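The scroll loop and the idempotency claim in the script's docstring can be exercised without a network. The sketch below is not part of the commit: `iter_points` and `fake_source` are hypothetical helpers, and the in-memory source uses a list index as `next_page_offset`, whereas a real Qdrant returns a point id there. The destination is modelled as a plain dict keyed by point id, which mimics upsert semantics only loosely.

```python
def iter_points(fetch_page, batch=128):
    """Yield points page by page until next_page_offset is None or a page is empty."""
    offset = None
    while True:
        body = {"limit": batch, "with_vector": True, "with_payload": True}
        if offset is not None:
            body["offset"] = offset
        res = fetch_page(body)
        pts = res.get("points", [])
        if not pts:
            return
        yield from pts
        offset = res.get("next_page_offset")
        if offset is None:
            return


def fake_source(points):
    """In-memory stand-in for the scroll endpoint (ASSUMPTION: offset is a list index)."""
    def fetch_page(body):
        start = body.get("offset", 0)
        end = start + body["limit"]
        return {
            "points": points[start:end],
            "next_page_offset": end if end < len(points) else None,
        }
    return fetch_page


src = [{"id": i, "vector": [0.0, 1.0], "payload": {"n": i}} for i in range(5)]
dst = {}

# Run the copy twice: writing by id overwrites, so the second run adds nothing.
for _ in range(2):
    for p in iter_points(fake_source(src), batch=2):
        dst[p["id"]] = p

print(len(dst))  # prints 5
```

Separating the pagination from the HTTP calls this way would also make the loop unit-testable, although the committed script keeps both in copy_collection.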