perf(ai-sdk): embed query once across router fan-out + fold umlauts in intent/concept matching
CI / detect-changes (push) Successful in 5s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / validate-canonical-controls (push) Successful in 4s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m0s
CI / test-go (push) Successful in 59s
CI / iace-gt-coverage (push) Successful in 17s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
CI / detect-changes (push) Successful in 5s
CI / branch-name (push) Has been skipped
CI / guardrail-integrity (push) Has been skipped
CI / secret-scan (push) Has been skipped
CI / dep-audit (push) Has been skipped
CI / sbom-scan (push) Has been skipped
CI / build-sha-integrity (push) Successful in 5s
CI / validate-canonical-controls (push) Successful in 4s
CI / loc-budget (push) Successful in 18s
CI / go-lint (push) Has been skipped
CI / python-lint (push) Has been skipped
CI / nodejs-lint (push) Has been skipped
CI / nodejs-build (push) Successful in 3m0s
CI / test-go (push) Successful in 59s
CI / iace-gt-coverage (push) Successful in 17s
CI / test-python-backend (push) Has been skipped
CI / test-python-document-crawler (push) Has been skipped
CI / test-python-dsms-gateway (push) Has been skipped
Authority Router re-embedded the query per collection (6x); on dev the embed endpoint (OVH) is remote so that was 6 round-trips = 7-12s per /retrieve. Embed once, reuse via ctx across the concurrent per-collection searches. DetectIntent + ConceptNorms now fold ae/oe/ue/ss so ASCII (Pruefe) and umlaut (Pruefe) inputs both match.
This commit is contained in:
@@ -0,0 +1,15 @@
|
||||
package ucca
|
||||
|
||||
import "strings"
|
||||
|
||||
// normalizeGerman lowercases and folds German umlauts / ß to their ASCII digraphs
|
||||
// (ä→ae, ö→oe, ü→ue, ß→ss) so keyword matching is insensitive to whether the user
|
||||
// typed "Prüfe" or "Pruefe", "Datenschutzerklärung" or "Datenschutzerklaerung".
|
||||
// Applied to BOTH the query and the keyword lists in the German-text matchers.
|
||||
func normalizeGerman(s string) string {
|
||||
return umlautFolder.Replace(strings.ToLower(s))
|
||||
}
|
||||
|
||||
var umlautFolder = strings.NewReplacer(
|
||||
"ä", "ae", "ö", "oe", "ü", "ue", "ß", "ss",
|
||||
)
|
||||
Reference in New Issue
Block a user