Benjamin_Boenisch
  • Joined on 2026-02-07
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-compliance 2026-03-19 22:30:53 +00:00
1cc34c23d9 feat(document-generator): 33 policy + module document templates
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 22:30:34 +00:00
3c7fc43f43 Fix test expectation: valid IPA in brackets also triggers detection
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 22:28:09 +00:00
6bfa9eed86 Fix garbled IPA detection for bracket-notation like [n, nn] and [1uedtX,1]
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 22:04:11 +00:00
7750b2a05f Fix ghost filter for borderless boxes + remove oversized graphic artifacts
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-compliance 2026-03-19 21:26:05 +00:00
5dd7a27336 fix(pipeline): add missing regulation codes to LICENSE_MAP
c3afa628ed feat(sdk): vendor-compliance cross-module integration — VVT, obligations, TOM, loeschfristen
4b1eede45b feat(tom): audit document, compliance checks, 25 controls, canonical control mapping
2a70441eaa feat(sdk): VVT master libraries, process templates, Loeschfristen profiling + document
Compare 4 commits »
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 12:56:15 +00:00
e3395ae8cf Fix overlay word leak, ghost filter false positive, merged zone header
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 11:22:22 +00:00
df30d4eae3 Add zone merging across images + heading detection by color/height
2e6ab3a646 Fix IPA marker split: walk back max 3 chars for onset cluster
cc5ee74921 Use OCR-recognized IPA when word not in dictionary
21d37b5da1 Fix prefix matching: use alpha-only chars, min 4-char prefix
19cbbf310a Improve garbled IPA cleanup: trailing strip, prefix match, broader guard
Compare 7 commits »
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 08:59:34 +00:00
038eaf783c Only insert IPA when garbled phonetics exist in OCR text
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 08:38:36 +00:00
432eee3694 Auto-filter decorative margin strips and header junk
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 08:19:19 +00:00
8e4cbd84c2 Invalidate grid_editor_result when exclude regions change
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 08:08:40 +00:00
f9d71d50d1 Add exclude region marking in Structure step
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 07:23:59 +00:00
c09838e91c Fix spine shadow false positives: require dark valley, brightness rise, trim convolution edges
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 06:54:36 +00:00
3fd6523872 Cut at spine center (darkest point) instead of shadow edge
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-19 06:41:24 +00:00
e56391b0c3 Add right-edge spine shadow detection for book scans
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-compliance 2026-03-18 15:28:26 +00:00
f2819b99af feat(pipeline): v3 — scoped control applicability + source_type classification
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-18 13:52:42 +00:00
a3e2a7f994 Add GT button to OCR overlay, prominent category picker, track pipeline
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-18 13:33:51 +00:00
f655db30e4 Add Ground Truth regression test system for OCR pipeline
c894a0feeb Improve IPA continuation row detection with phonetic heuristics
8ef4c089cf Remove IPA continuation rows and support hyphenated word lookup
821e5481c2 Only apply IPA correction on vocabulary tables (≥3 columns)
b98ea33a3a Strip garbled OCR phonetics after IPA insertion
Compare 22 commits »
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-compliance 2026-03-18 07:49:53 +00:00
3bb9fffab6 docs: update control library taxonomy, add provenance wiki page
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-compliance 2026-03-18 07:20:18 +00:00
148c7ba3af feat(qa): recital detection, review split, duplicate comparison
a9e0869205 feat(pipeline): pipeline_version v2, migration 062, docs + 71 tests
653aad57e3 Let Anthropic API decide chunk relevance instead of local prefilter
a7f7e57dd7 Add skip_prefilter option to control generator
567e82ddf5 Fix stale DB session after long embedding pre-load
Compare 6 commits »
Benjamin_Boenisch pushed to main at Benjamin_Boenisch/breakpilot-lehrer 2026-03-17 16:01:38 +00:00
d66efdecf5 fix: NameError in detect_page_splits — 'gaps' var removed in rewrite