OSHA 29 CFR 1910 Subpart O (1910.211-1910.219) — complete machine
guarding requirements. US federal law, public domain.
International standards mapping table: China GB/T, Korea KS, and India BIS
equivalents to ISO/EN standards. All of these bodies keep ISO copyright in
force even for identical national adoptions (IDT), so of the sources
covered only OSHA provides freely usable machinery safety content.
EU harmonised standards list (Excel) included for reference.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Email security gateways follow GET redirects automatically and were
consuming the token before the investor clicked through. The verify page
now shows an 'Access Pitch Deck' button; the token is only consumed on
explicit click, which scanners cannot trigger.
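A minimal sketch of the click-to-consume idea described above, written in Python for illustration rather than taken from the app. The class name, the return strings, and the choice of POST for the button click are assumptions; the commit only states that the token is consumed on explicit click.

```python
import secrets


class MagicLinkStore:
    """In-memory stand-in for the magic link token table (hypothetical)."""

    def __init__(self) -> None:
        self._unused = set()

    def issue(self) -> str:
        token = secrets.token_urlsafe(32)
        self._unused.add(token)
        return token

    def handle_get(self, token: str) -> str:
        # GET has no side effects: a scanner that follows the link only
        # sees the page with the button, and the token stays valid.
        return "show-button" if token in self._unused else "invalid"

    def handle_post(self, token: str) -> str:
        # Only the explicit click (assumed to be a POST) consumes the token.
        if token in self._unused:
            self._unused.remove(token)
            return "session-created"
        return "invalid"
```

Repeated GETs leave the token intact; the first POST consumes it and any later POST is rejected.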
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Behind Orca's reverse proxy, request.url resolves to http://127.0.0.1:3000,
so redirects went to the internal address instead of the public domain.
Use PITCH_BASE_URL (already set in service.toml) as the redirect base.
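The same idea as a small Python sketch (the app itself is not Python; the function name and the example domain are hypothetical). The redirect target is built from the configured base and never from the incoming request URL.

```python
import os
from typing import Mapping


def public_url(path: str, env: Mapping[str, str] = os.environ) -> str:
    """Build an absolute redirect URL from PITCH_BASE_URL.

    Raises KeyError if the variable is unset, which is preferable to
    silently falling back to the proxy-internal address.
    """
    base = env["PITCH_BASE_URL"].rstrip("/")
    return base + "/" + path.lstrip("/")
```

For example, with PITCH_BASE_URL set to a public domain, `public_url("/auth/verify?token=abc")` yields a URL on that domain regardless of what the proxy reports as the request host.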
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New pitch_short_links table stores 6-char alphanumeric codes mapped to magic link tokens
- GET /p/[code] redirects to /auth/verify?token=... (302, validates expiry)
- All magic link generation points (invite, generate-link, resend) now create a short code
- Emails (invite + resend) use the short URL — less token-like, cleaner for spam filters
- Copy-link UI shows short URL prominently with full URL as fallback
- Migration 008 added to /api/admin/migrate
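A rough Python sketch of the short-link flow listed above. The in-memory dict stands in for pitch_short_links, and the mixed-case alphabet, function names, and collision retry are assumptions; only the 6-character alphanumeric code, the expiry check, and the /auth/verify?token=... target come from the commit.

```python
import secrets
import string
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

ALPHABET = string.ascii_letters + string.digits  # assumption: mixed case

# code -> (magic link token, expiry); stand-in for pitch_short_links
_short_links: Dict[str, Tuple[str, datetime]] = {}


def create_short_code(token: str, expires_at: datetime) -> str:
    """Store a fresh 6-character code for the token, retrying on collision."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(6))
        if code not in _short_links:
            _short_links[code] = (token, expires_at)
            return code


def resolve_short_code(code: str, now: datetime) -> Optional[str]:
    """Return the redirect target for GET /p/[code], or None if unknown or expired."""
    entry = _short_links.get(code)
    if entry is None:
        return None
    token, expires_at = entry
    if now >= expires_at:
        return None
    return "/auth/verify?" + urlencode({"token": token})
```

A caller would answer with a 302 to the returned path and with an error page when the result is None.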
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add ROW_LABEL_MAP (DE→EN) covering GuV, Liquidität, Kunden, Betriebliche Aufwendungen rows
- Add FORMULA_TOOLTIPS_EN with English tooltip text for all formula-driven rows
- Add MONTH_LABELS_EN (Mrz→Mar, Mai→May, Okt→Oct)
- LabelWithTooltip now accepts `de` flag, translates display text and tooltip accordingly
- Month column headers switch between DE/EN month abbreviations
- Falls back to original German label for any row not in the map
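The fallback behaviour can be illustrated with a few lines of Python (the component itself is not Python). The ROW_LABEL_MAP entries shown here are made-up examples; the three month mappings are the ones named in the commit.

```python
# Illustrative entries only; the real map covers many more rows.
ROW_LABEL_MAP = {"Kunden": "Customers", "Liquidität": "Liquidity"}
MONTH_LABELS_EN = {"Mrz": "Mar", "Mai": "May", "Okt": "Oct"}


def display_label(label: str, de: bool) -> str:
    """German mode shows the label as stored; English mode translates it
    and falls back to the German original for unmapped rows."""
    return label if de else ROW_LABEL_MAP.get(label, label)


def month_label(abbr: str, de: bool) -> str:
    """Months whose abbreviation is identical in both languages pass through."""
    return abbr if de else MONTH_LABELS_EN.get(abbr, abbr)
```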
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- GET /api/admin/investors/:id now returns preferred_lang
- PATCH /api/admin/investors/:id accepts preferred_lang (de/en), validates value
- Investor detail page: DE/EN toggle in the Pitch Version card, instant save on click
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Version dropdown on the invite form shows all committed versions
- Selected version is assigned to the investor at creation time (no separate step needed)
- API validates version is committed before upserting
- Leaving the dropdown empty keeps any existing assignment (COALESCE behavior)
- version_id included in audit log
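The COALESCE behaviour can be reproduced with SQLite as a sketch. The two-column schema and the email key are assumptions made for the example, and the production database may use a different engine, although the upsert syntax shown is also valid in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pitch_investors (email TEXT PRIMARY KEY, version_id TEXT)"
)

# An empty dropdown arrives as NULL; COALESCE then keeps the stored value.
UPSERT = """
INSERT INTO pitch_investors (email, version_id) VALUES (?, ?)
ON CONFLICT(email) DO UPDATE
SET version_id = COALESCE(excluded.version_id, pitch_investors.version_id)
"""


def invite(email, version_id=None):
    """Upsert the investor and return the version_id stored afterwards."""
    conn.execute(UPSERT, (email, version_id))
    row = conn.execute(
        "SELECT version_id FROM pitch_investors WHERE email = ?", (email,)
    ).fetchone()
    return row[0]
```

Inviting the same address again without a version leaves the earlier assignment untouched, while passing a new version overwrites it.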
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add English email template variants (greeting, message, closing, subject, CTA copy)
- Add `preferred_lang` column to `pitch_investors` — stored per investor, deck opens in that language by default
- Invite form: DE/EN language toggle that switches email defaults and pitch language setting
- Invite form: "Send email" toggle — when off, creates investor + returns magic link without sending email (for cold outreach attachment)
- `app/page.tsx`: initializes pitch language from investor's `preferred_lang` before first render (no flash)
- Migration 007 added to `/api/admin/migrate` route for production rollout
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Splits master controls >200 members by re-clustering their object groups
with k=4-20 per group. First round: 38 groups → 325 sub-groups → 253 new MCs.
25 generic MCs remain (monitoring, procedure, etc.) — need regulation-source split.
Session summary: Block F complete, Control Generation (1,599+), Pass 0a/0b,
Production Sync, G-pre1/2/3 Object Clustering + Master Controls + API,
G1-G4 Compliance Execution Layer (Decision Trace, Commit Ledger, Decision Memory,
Pre-Deployment Enforcement).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New table: deployment_checks (verdict, blocking/warning controls, risk score)
New API:
POST /v1/deployment-checks (SDK asks: "can I deploy?")
GET /v1/deployment-checks/{id} (check result)
POST /v1/deployment-checks/{id}/override (manual override with justification)
GET /v1/deployment-checks/stats (approval/block rate)
Check logic: queries G1 decision_traces + G3 open failures per affected control.
Verdict: approved (0 blocking) or blocked (with fix recommendations).
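A simplified sketch of the verdict rule. The field names, the "compliant" status value, and the exact condition that makes a control blocking are assumptions; the commit only states that the check reads decision traces and open failures per affected control and approves when nothing blocks. Warning controls and the risk score are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ControlState:
    control_id: str
    trace_status: str        # latest decision trace status (assumed values)
    open_failures: int = 0   # unresolved failure events for this control


@dataclass
class CheckResult:
    verdict: str
    blocking: List[str] = field(default_factory=list)


def check_deployment(states: List[ControlState]) -> CheckResult:
    """Approve only if no affected control is blocking."""
    blocking = [
        s.control_id
        for s in states
        if s.trace_status != "compliant" or s.open_failures > 0
    ]
    return CheckResult("blocked" if blocking else "approved", blocking)
```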
454 tests pass, 0 regressions.
Block G complete: G1-G4 all implemented.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New table: decision_events (assessment→decision→fix→verification→failure cycle)
New API:
POST /v1/decision-events (record lifecycle event)
GET /v1/decision-events (list with filters)
GET /v1/decision-events/timeline/{control_id} (full chronological timeline)
GET /v1/decision-events/stats (failure rate, cycle times)
Each event captures input_state, output_state, actor, evidence.
454 tests pass, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New table: compliance_commits (commit hash, affected controls, risk level)
New API:
POST /v1/compliance-commits (SDK registers commit + impact)
GET /v1/compliance-commits (list with filters)
GET /v1/compliance-commits/by-control/{id} (all commits for a control)
GET /v1/compliance-commits/stats (dashboard)
GET /v1/compliance-commits/{id} (detail)
GIN index on affected_control_ids for fast @> containment queries.
454 tests pass, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New table: decision_traces (status, reason, evidence, fix plan per control)
New API:
POST/GET/PUT /v1/decision-traces (CRUD for decisions)
GET /v1/decision-traces/stats (compliance dashboard)
GET /v1/controls/{id}/full-trace (Regulation→Obligation→Control→Decision→Evidence)
454 tests pass, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
G-pre1: 144k objects clustered into 7,466 groups via Mini-Batch K-Means
on bge-m3 embeddings. Two-stage: k=5000 base + sub-cluster groups >50.
G-pre2: 5,114 Master Controls from lifecycle phase chains
(define→implement→test→monitor), linking 172,504 atomic controls.
G-pre3: REST API for Master Controls
GET /v1/master-controls (list, search, filter)
GET /v1/master-controls/stats
GET /v1/master-controls/{mc_id} (detail with phase-controls)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8 tests confirm all REGULATION_LICENSE_MAP, ACTION_TYPES, _NEGATIVE_PATTERNS,
_ACTION_SYNONYMS, and _OBJECT_SYNONYMS entries are correctly migrated to DB.
Dicts kept as fallback for DB-unavailability resilience.
Block F complete: F1-F5 all done.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Uses Ollama (qwen3.5:35b-a3b, think:false) to generate additional
German synonyms for action types and object tokens. Results stored
with source='llm' in action_synonyms/object_synonyms tables.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: init scripts ran repeatedly (on container restart) and tried
vault secrets enable / vault auth enable for already-existing paths.
Vault logged ERRORs and burned 40-84% CPU in the loop.
Fix:
- Marker file /vault/data/.init-complete skips re-initialization
- vault secrets list / vault auth list checks before enable calls
- No more "path already in use" errors on subsequent runs
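The two guards can be sketched in Python as follows. The actual init scripts are presumably shell; the function names and the injected `enable` callback are hypothetical, and no Vault CLI call is made here.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def enable_missing(
    wanted: Iterable[str],
    existing: Iterable[str],
    enable: Callable[[str], None],
) -> List[str]:
    """Call enable() only for paths that are not already mounted.

    `existing` is whatever the list command reported; trailing slashes
    are ignored so "kv/" and "kv" compare equal.
    """
    mounted = {p.rstrip("/") for p in existing}
    enabled = []
    for path in wanted:
        if path.rstrip("/") not in mounted:
            enable(path)
            enabled.append(path)
    return enabled


def run_init_once(marker: Path, init: Callable[[], None]) -> bool:
    """Run init() unless the marker file exists; write the marker on success."""
    if marker.exists():
        return False
    init()
    marker.touch()
    return True
```

With both guards in place a container restart either skips initialization entirely or, if the marker is missing, only enables what is actually absent.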
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The preview-data API was returning `fm_scenarios` but PitchDeck reads
`data.fp_scenarios`, so fpBaseScenarioId was always null and the
Finanzplan slide fell back to the global default scenario (Base Case 200k)
instead of the version's assigned scenario (e.g. 1 Mio. Euro Base).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Registers /generate/run-pass0a and /generate/pass0a-status/{job_id}
on the core control-pipeline (port 8098). Previously Pass 0a was only
available on the compliance backend, which connects to the production DB,
causing a split-brain when controls were generated locally.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AI Q&A: fetch is_showcase from DB; showcase sessions receive no financial/funding
context and have an explicit LLM guard refusing to discuss investment details.
FAQ context and financial slide IDs stripped from system prompt.
FAB: flex layout so Fullscreen button is always visible regardless of panel height.
Presenter: pass activeSlideOrder to usePresenterMode so buildSlideAudioPlan maps
slideIdx → slideId from the filtered list, not the full SLIDE_ORDER. Progress
calculation also filters to active scripts only.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
NavigationFAB and SlideOverview now accept slideNames prop and render only the
active slide list (filtered for showcase mode). Adds AI presenter start button
to the FAB footer so it's accessible even when intro-presenter slide is hidden.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds is_showcase boolean to pitch_investors; when set, filters out financials,
the ask, cap table, assumptions, finanzplan, risks, and intro-presenter slides.
Slide navigation is fully dynamic — progress bar and counts update accordingly.
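A small Python illustration of the filtering (the deck is not written in Python, and the slide IDs below are placeholders rather than the real SLIDE_ORDER). It also shows why an index has to be resolved against the filtered list, not the full one.

```python
# Placeholder IDs; the real list lives in the frontend.
SLIDE_ORDER = [
    "cover", "intro-presenter", "problem", "financials", "the-ask",
    "cap-table", "assumptions", "finanzplan", "risks", "team",
]
SHOWCASE_HIDDEN = {
    "financials", "the-ask", "cap-table", "assumptions",
    "finanzplan", "risks", "intro-presenter",
}


def active_slides(is_showcase: bool) -> list:
    """Return the slides an investor actually sees, in order."""
    if not is_showcase:
        return list(SLIDE_ORDER)
    return [s for s in SLIDE_ORDER if s not in SHOWCASE_HIDDEN]


def progress(slide_idx: int, slides: list) -> float:
    """Fraction of the visible deck reached at slide_idx (0-based)."""
    return (slide_idx + 1) / len(slides)
```

In showcase mode index 1 refers to "problem", whereas the same index in the full list would point at the hidden "intro-presenter" slide.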
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
/api/finanzplan now accepts ?scenarioId and uses it for the per-sheet
row counts (the numbers in brackets on the tab bar). FinanzplanSlide
passes fpBaseScenarioId when fetching the sheet list, so Wandeldarlehen
investors see e.g. Personalkosten (9) instead of (35).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The fm_scenarios array in each pitch version snapshot already stores the
fp_scenario IDs directly (the same pattern the 1 Mio version uses).
Wandeldarlehen snapshots were missing Bear/Bull entries; these were added
in the DB.
- /api/data: include fp_scenarios in version response (was omitted)
- PitchDeck: derive fpBaseScenarioId from data.fp_scenarios
- useFpKPIs: accept fpBaseScenarioId instead of isWandeldarlehen boolean
- AssumptionsSlide: find Bear/Base/Bull by name from fpScenarios prop
- FinanzplanSlide: initialize from fpBaseScenarioId, use version scenarios for selector
- FinancialsSlide / ExecutiveSummarySlide: pass fpBaseScenarioId to hook
- types: add FpScenarioRef + fp_scenarios field to PitchData
No UUID hardcoded in any component. Adding a new pitch version only
requires setting the correct fp_scenario IDs in its fm_scenarios snapshot.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AssumptionsSlide sends ?scenarioId=<uuid> for Bear/Base/Bull cards but
the route was silently dropping it for non-admin requests, making all
three cards return the same default Base Case data. Since fp_ financial
projections are already investor-facing, any valid scenarioId is allowed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Migrates ACTION_TYPES (26+8 types), _NEGATIVE_PATTERNS (22), _ACTION_SYNONYMS
(65), and _OBJECT_SYNONYMS (75) from hardcoded dicts to DB tables.
- SQL migration: 003_action_object_ontology.sql (3 tables)
- Migration scripts: f2_migrate_actions.py (34 types, 145 synonyms), f3_migrate_objects.py (75 objects)
- OntologyRegistry cache: 5 min TTL; raises RuntimeError if the tables are empty, so callers fall back to the dicts
- control_ontology.classify_action/get_phase delegate to DB with dict fallback
- control_dedup.normalize_action/normalize_object delegate to DB with dict fallback
- 25 new tests, 446 total pass, 0 regressions
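A minimal sketch of the cache-with-fallback pattern named above, not the actual module. The loader callback, the injectable clock, and the single fallback entry are assumptions made to keep the example self-contained.

```python
import time
from typing import Callable, Dict, Optional


class OntologyRegistry:
    """Caches a DB-backed synonym table for ttl_seconds (sketch only)."""

    def __init__(
        self,
        loader: Callable[[], Dict[str, str]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Optional[Dict[str, str]] = None
        self._loaded_at = 0.0

    def synonyms(self) -> Dict[str, str]:
        now = self._clock()
        if self._cache is None or now - self._loaded_at >= self._ttl:
            data = self._loader()
            if not data:
                # An empty result is treated as "DB not usable".
                raise RuntimeError("ontology tables are empty")
            self._cache, self._loaded_at = data, now
        return self._cache


# Illustrative entry; the real fallback dicts are much larger.
_FALLBACK_SYNONYMS = {"umsetzen": "implement"}


def normalize_action(word: str, registry: OntologyRegistry) -> str:
    """Prefer the DB table; fall back to the hardcoded dict on failure."""
    try:
        table = registry.synonyms()
    except RuntimeError:
        table = _FALLBACK_SYNONYMS
    return table.get(word, word)
```

Raising on an empty table keeps a half-migrated database from silently normalizing nothing; the caller decides to use the dict.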
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Next: F1 Regulation Registry (DB + API + Frontend + Auto-Create)
Frontend at /sdk/regulation-registry in breakpilot-compliance admin
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add case-sensitive _SINGLE_NUM_ALLCAPS_RE for "1. INTRODUCTION" style
headers (ENISA, BSI docs). _LEGAL_SECTION_RE cannot be reused here because
it is compiled with re.IGNORECASE, which would produce false positives on
mixed-case text such as "1. Erstens".
Also re-downloaded 2 corrupt PDFs from nist.gov (nistir_8259a, nist_ai_rmf)
— originals in MinIO were 263-byte XML error responses, not PDFs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_SECTION_NUMBER_RE only had patterns for §/Art/Section/Kapitel/Annex
but missed NIST-style identifiers (AC-1, GV.OC-01, 3.1, A01:2021).
This caused 0% section rate for all NIST/BSI/ENISA documents even
though sections were correctly detected — the section NUMBER wasn't
extracted from the header.
Also adds:
- reupload_legal_strategy.py: re-upload with legal chunking
- extract_and_upload_nist.py: local PDF extraction workaround
- qdrant-snapshot.sh: backup mechanism for Qdrant collections
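For the section-number fix described at the top of this commit, a sketch of a pattern that covers the four identifier styles mentioned (AC-1, GV.OC-01, 3.1, A01:2021). The regex and function name are assumptions for illustration and do not reproduce the actual _SECTION_NUMBER_RE.

```python
import re
from typing import Optional

_NIST_STYLE_RE = re.compile(
    r"^(?P<num>"
    r"[A-Z]{2}\.[A-Z]{2}-\d{2}"   # e.g. GV.OC-01
    r"|[A-Z]{2}-\d{1,2}"          # e.g. AC-1
    r"|A\d{2}:\d{4}"              # e.g. A01:2021
    r"|\d+(?:\.\d+)+"             # e.g. 3.1 or 3.1.2
    r")(?=[\s:]|$)"               # identifier must end at whitespace, colon, or EOL
)


def extract_section_number(header: str) -> Optional[str]:
    """Return the leading identifier of a detected header, if any."""
    m = _NIST_STYLE_RE.match(header.strip())
    return m.group("num") if m else None
```

The trailing lookahead keeps a plain "1. INTRODUCTION" or an ordinary word from being mistaken for an identifier.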
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Complete instructions for next session including:
- Current quality metrics per document type
- Prioritized action items (NIST fix, citation backfill, missing laws)
- Full Block E-G roadmap with details
- All critical files, DB state, test commands
- Known issues (3 lost NIST PDFs, frontend 500s, D5 script safety)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Session achieved: structural metadata end-to-end (D2-D4), overlap bug
fix, HTML stripping with charset detection, 430/436 docs re-ingested.
Remaining: ~40 EU Official Journal PDFs need HTML from EUR-Lex (broken
multi-column PDF extraction), 3 missing EDPB PDFs, 1 corrupt PDF.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The default was 'pymupdf', which does not exist as a backend, so every
extraction fell through to pypdf. With 'auto', the priority is:
unstructured > pdfplumber > pypdf.
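One way the 'auto' resolution could look, as a sketch under assumptions: the function names are hypothetical, availability is checked by module import name, and an unknown backend raises instead of falling through, which is a design choice of this example and not necessarily what the service does.

```python
import importlib.util
from typing import Callable

BACKEND_PRIORITY = ("unstructured", "pdfplumber", "pypdf")


def _is_installed(module_name: str) -> bool:
    return importlib.util.find_spec(module_name) is not None


def resolve_backend(
    requested: str = "auto",
    available: Callable[[str], bool] = _is_installed,
) -> str:
    """Pick the PDF backend; 'auto' takes the first installed one."""
    if requested != "auto":
        if requested not in BACKEND_PRIORITY:
            # Fail loudly so a typo cannot silently degrade to pypdf.
            raise ValueError(f"unknown PDF backend: {requested!r}")
        return requested
    for name in BACKEND_PRIORITY:
        if available(name):
            return name
    raise RuntimeError("no PDF backend installed")
```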
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EU Official Journal PDFs (AI Act, CRA, NIS2, DSGVO, etc.) use
multi-column layouts that pypdf breaks into fragmented words
("Ar tik el" instead of "Artikel"). pdfplumber handles these correctly.
Backend priority: unstructured > pdfplumber > pypdf (auto mode).
Also increases D5 re-ingestion timeout to 3600s for large PDFs.
58 embedding-service tests passing. pdfplumber: MIT license.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>