Apply platform-domain decision (2026-05-18). No services touched; docs/config only. Refs: M1.1
Implementation Plan — Breakpilot Platform
Companion to PLATFORM_ARCHITECTURE.md, INFRASTRUCTURE.md, and PRODUCT_INTEGRATION_SPEC.md.
This is the build plan for an AI coding agent (Claude Code, executing PRs against the listed repos). Each milestone is sized to fit in 1–3 PRs, ships independently, and leaves the system in a working state.
0. How to read this document
- Milestones are named M{phase}.{n} and grouped by phase.
- Each milestone has: Goal, Depends on, Repos/files, Deliverables, Acceptance, Tests, Gate, Effort (S = ≤1 day, M = 2–4 days, L = ≥1 week).
- "Gate" is who/what approves the PR for merge. Standard is 1 human reviewer + green CI; some milestones add a manual sign-off.
- Phases are ordered; milestones within a phase can be parallelised where dependencies allow.
- The dependency graph at §11 is the source of truth — when in doubt, read it.
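One way to see which milestones can run in parallel is to feed the "Depends on" lines into a topological sort. The sketch below uses Python's standard graphlib with the dependencies of a few Phase 0 milestones as they are listed in this plan; the helper name parallel_batches is hypothetical, and §11 remains the authoritative graph.

```python
from graphlib import TopologicalSorter

# Dependencies copied from the "Depends on" lines of a few Phase 0 milestones.
# Each key maps to the set of milestones that must be merged first.
DEPENDS_ON = {
    "M0.1": set(),
    "M0.2": {"M0.1"},
    "M1.1": {"M0.1", "M0.2"},
    "M1.2": {"M1.1"},
    "M0.3": {"M1.2"},
    "M1.3": {"M1.2"},
}


def parallel_batches(depends_on):
    """Group milestones into batches whose members do not depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


BATCHES = parallel_batches(DEPENDS_ON)
```

With this subset, M0.3 and M1.3 end up in the same batch because both wait only on M1.2.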
1. Cross-cutting conventions (apply to every PR in every repo)
1.1 Repo strategy
Polyrepo under a new Gitea org gitea.meghsakha.com/platform/. One repo per deployable unit. Existing product repos stay where they are.
Repos to create:
| Repo | Purpose | Created in |
|---|---|---|
| platform/orca-platform | IaC for VMs, Orca manifests, DNS, TLS, backups | M1.1 |
| platform/tenant-registry | Go service: tenant glue, audit, API keys | M4.1 |
| platform/portal | Next.js 15: customer area + backstage | M5.1 |
| platform/docs | Architecture, integration spec, this plan, runbooks | M0.1 |
| platform/seed-data | Demo tenant fixtures per product | M13.1 |
| platform/design-tokens | CSS variables / fonts (consumed by product web components) | M5.1 |
Existing repos that get changes (no new repos):
- benjamin_boenisch/certifai — M6.1 / M6.2 / M6.3
- benjamin_boenisch/breakpilot-compliance — M7.1 / M7.2
1.2 Per-repo scaffolding (must exist before any feature work)
Every new repo lands in M0.1 with:
    /README.md                    what this repo is, how to run, links to architecture
    /CONTRIBUTING.md              branch model, commit format, how to open a PR
    /CODEOWNERS                   at least one mandatory reviewer (us)
    /.gitea/
      /pull_request_template.md
      /issue_template/
        bug.md
        feature.md
    /.gitea/workflows/
      ci.yaml                     fmt → lint → test → build (per-language details in M0.2)
      release.yaml                on tag: build image, push to registry
    /CHANGELOG.md                 generated from conventional commits
    /LICENSE                      MIT for portal/docs; Apache-2.0 for libraries
1.3 Branch + commit conventions
- Trunk-based. main is always deployable. Feature branches: feat/<short-slug>, fix/<short-slug>, chore/<short-slug>. Max lifetime 5 days.
- Conventional Commits (feat:, fix:, chore:, docs:, refactor:, test:, breaking!:). Enforced by commitlint in CI.
- Squash-merge to main. PR title becomes the commit message.
- Direct push to main is blocked by Gitea branch protection.
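Because the PR title becomes the squash-commit message, it can help to check titles and branch names locally before CI does. The following is a minimal sketch, not the commitlint configuration itself: the regular expressions, the helper names, and the lowercase-and-hyphens slug format are assumptions made for illustration.

```python
import re

# Types listed in this section; commitlint in CI remains the real enforcement.
ALLOWED_TYPES = ("feat", "fix", "chore", "docs", "refactor", "test", "breaking!")

# "<type>(optional scope): subject"
TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+!?)(?:\((?P<scope>[^)]+)\))?: (?P<subject>\S.*)$"
)

# Assumed slug format: lowercase letters and digits separated by hyphens.
BRANCH_RE = re.compile(r"^(feat|fix|chore)/[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_title(title: str) -> bool:
    """Return True if a PR title follows the commit convention above."""
    match = TITLE_RE.match(title)
    return bool(match) and match.group("type") in ALLOWED_TYPES


def is_valid_branch(name: str) -> bool:
    """Return True if a branch name uses one of the three allowed prefixes."""
    return BRANCH_RE.match(name) is not None
```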
1.4 PR template (in every repo)
## What
<1-3 bullets>
## Why
<link to architecture section, milestone ID, or issue>
## How
<implementation notes for the reviewer>
## Test plan
- [ ] unit
- [ ] integration (if API surface changed)
- [ ] e2e (if user-facing flow changed)
- [ ] manual smoke on stage
## Risk
<what could break, blast radius, rollback plan>
## Linked milestone
M{phase}.{n}
1.5 CI checks (required before merge, configured in M0.2)
Per language defaults:
| Stack | Required checks |
|---|---|
| Go | gofmt -l . (no output), go vet, golangci-lint, go test ./..., go build |
| Rust | cargo fmt --check, cargo clippy -- -D warnings, cargo test -j8 |
| TypeScript | pnpm lint, pnpm typecheck, pnpm test, pnpm build |
| Python | ruff check, ruff format --check, mypy, pytest |
| All | commitlint, image build, container scan (trivy), SBOM upload |
1.6 Approval gates
- Standard gate (most milestones): 1 human reviewer approves + all CI checks green. Enforced by Gitea branch protection on main.
- CODEOWNERS auto-requests the right reviewer based on path.
- Production-promotion gate (release tags only): manual sign-off by @sharang on the release issue + stage soak ≥ 24h.
- Security gate (M2.x, M4.x, M14.x): security checklist in PR body completed.
1.7 Versioning + release strategy
- Semver per repo. Container images carry three tags: :sha-<short>, :v1.4.2, :env-stage / :env-prod.
- Stage auto-deploys on every merge to main (Gitea Actions → Orca apply against stage cluster).
- Production deploys only when a release tag vX.Y.Z is created. Tag creation requires the production-promotion gate.
- Rollback: orca rollout undo <service> flips back to previous image tag. RTO target ≤ 5 min for any single service.
- Database migrations are forward-only and run as an init container before the service starts. Migrations that delete columns require two releases (1: stop writing, 2: drop).
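The three-tag scheme can be derived mechanically in the release workflow. The sketch below is illustrative only: the function name, the 7-character short SHA, and treating the version tag as optional for untagged stage builds are assumptions rather than decisions recorded in this plan.

```python
from typing import List, Optional


def image_tags(commit_sha: str, env: str, version: Optional[str] = None) -> List[str]:
    """Build the tag list for one container image.

    commit_sha: full Git commit hash
    env:        "stage" or "prod"
    version:    semver without the leading "v", or None for untagged builds
    """
    if env not in ("stage", "prod"):
        raise ValueError("env must be 'stage' or 'prod'")
    tags = ["sha-" + commit_sha[:7]]  # short SHA length is an assumption
    if version is not None:
        tags.append("v" + version)
    tags.append("env-" + env)
    return tags
```

Keeping the immutable sha tag next to the moving env tag means a rollback can always point at an exact build.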
1.8 Environments
Three Orca clusters, all on the same hardware until volume justifies separation:
| Env | Cluster name | Purpose | Data | Auto-deploy? |
|---|---|---|---|---|
| dev | local | Developer machine, docker-compose | fixtures | n/a |
| stage | orca-stage | Pre-prod validation | seeded demo + synthetic customers | yes (on merge to main) |
| prod | orca-prod | Live customer traffic | real | tag + gate |
Domain pattern:
- dev: *.localhost (mkcert)
- stage: *.stage.breakpilot.com
- prod: *.breakpilot.com
1.9 Observability + audit
- SigNoz (already running at signoz.meghsakha.com) for traces, logs, metrics. Every service ships OTel SDK from day one.
- Audit events in the Retraced-shape schema (PRODUCT_INTEGRATION_SPEC.md §8.4) emitted to Tenant Registry /audit from every service. Required for every state-changing endpoint.
- Structured logs (JSON) only. No fmt.Println / console.log in committed code; CI rejects.
1.10 Secrets
- Infisical machine identity per service, path /{env}/{service}/.
- The only secret allowed in an Orca env file is the Keycloak DB URI (bootstrap exception — see INFRASTRUCTURE.md).
- CI scans for committed secrets via gitleaks. Failures block merge.
1.11 Testing policy (mandatory; see also feedback_testing_everything)
- Unit: every non-trivial function.
- Integration: every API endpoint, against real Postgres/MongoDB via testcontainers. No mock databases.
- E2E: every user-facing flow has at least one Playwright spec running against stage post-deploy.
- Regression: when a bug is fixed, a failing test is added FIRST, then the fix.
- No PR ships without tests. "Manual tested" is not acceptable except for IaC.
1.12 A/B testing (designed for, adopted later)
Every place where a future flag would gate behaviour MUST flow through a single featureFlags.evaluate(tenantId, flagKey) function. Initial implementation returns hard-coded values from manifest.yaml. Swap to Unleash/OpenFeature in M19.1 with zero call-site changes.
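The single-entry-point idea can be sketched as follows. This is an assumption-laden illustration in Python rather than the portal's actual code: the flag names, the in-memory stand-in for manifest.yaml, and the "unknown flags evaluate to False" default are all hypothetical.

```python
from typing import Any, Mapping

# Hypothetical stand-in for values that would be loaded from manifest.yaml.
MANIFEST_FLAGS = {"catalog-v2": True, "new-billing-page": False}


class FeatureFlags:
    """Single entry point for flag checks; call sites never read flags directly."""

    def __init__(self, defaults: Mapping[str, Any]) -> None:
        self._defaults = dict(defaults)

    def evaluate(self, tenant_id: str, flag_key: str) -> Any:
        # tenant_id is accepted but unused for now. Keeping it in the
        # signature lets a later provider target individual tenants
        # without touching any call site.
        return self._defaults.get(flag_key, False)
```

Swapping the backend later would then mean replacing the body of evaluate while every caller keeps the same two arguments.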
2. Phase 0 — Foundations (M0.x – M3.x)
Goal: Repos exist, CI works, infra is provisioned and observable, identity + secrets are usable. No customer-visible features yet.
M0.1 — Bootstrap repos and docs
- Depends on: nothing
- Repos: platform/docs, platform/orca-platform, platform/portal, platform/tenant-registry, platform/design-tokens, platform/seed-data
- Deliverables: create the Gitea platform org; for each repo add the §1.2 scaffolding; platform/docs ingests the existing PLATFORM_ARCHITECTURE.md, INFRASTRUCTURE.md, PRODUCT_INTEGRATION_SPEC.md, this plan.
- Acceptance: every repo has a working README.md, CONTRIBUTING.md, CODEOWNERS, PR template.
- Tests: n/a
- Gate: standard
- Effort: S
M0.2 — CI templates + branch protection
- Depends on: M0.1
- Repos: all of the above
- Deliverables: .gitea/workflows/ci.yaml per repo (matching §1.5 by stack), Gitea branch protection on main (require PR, 1 review, status checks green, no direct push), commitlint, gitleaks, trivy configured.
- Acceptance: a deliberately-broken PR is rejected by every check; a clean PR is mergeable.
- Tests: smoke PR per repo demonstrating green CI.
- Gate: standard
- Effort: S
M0.3 — Self-hosted DNS + wildcard TLS
- Depends on: M1.2 (vm-edge must exist before PowerDNS lands)
- Repos: platform/orca-platform
- Deliverables:
  - PowerDNS Authoritative on vm-edge (Orca-managed). PostgreSQL backend on same VM (small; ~100 records).
  - At the registrar (Benjamin's account): set ns1.breakpilot.com and ns2.breakpilot.com glue records pointing at vm-edge public IP; delegate the domain to those NS.
  - Zone file committed in orca-platform/dns/breakpilot.com.zone; Orca syncs into PowerDNS on apply.
  - Records: apex breakpilot.com, wildcards *.breakpilot.com + *.stage.breakpilot.com, plus auth., erp., mcp., cdn., mail., ns1., ns2., SPF/DKIM/DMARC TXT records (for M3.2).
  - Wildcard TLS via Let's Encrypt DNS-01 against PowerDNS (Lego's --dns=pdns provider); ACME credentials in Infisical at /prod/orca-proxy/PDNS_API_KEY.
  - Orca-Proxy reloads the cert via watch on the secret file; renewal cron runs at 02:00 daily.
- Acceptance: dig @1.1.1.1 anything.breakpilot.com returns an answer; curl https://anything.breakpilot.com returns 404 from Orca-Proxy (no TLS error).
- Tests: ACME renewal dry-run; PowerDNS zone-diff check in CI; reach via stage and prod subdomains; cert expiry page wired to SigNoz alert.
- Gate: standard + manual DNS-delegation check by both founders (irreversible from registrar side without 24–48h propagation)
- Effort: M (was S — registrar delegation + PowerDNS adds setup time vs. Cloudflare)
M1.1 — orca-platform repo (IaC)
- Depends on: M0.1, M0.2
- Repos: platform/orca-platform
- Deliverables: directory layout per INFRASTRUCTURE.md; one Orca manifest per VM × service; per-env overlays (overlays/dev, overlays/stage, overlays/prod); a Makefile with make plan / make apply per env.
- Acceptance: make plan ENV=stage produces a no-op diff once applied.
- Tests: orca validate runs in CI; PRs that break a manifest fail.
- Gate: standard
- Effort: M
M1.2 — Provision VMs (locked topology)
- Depends on: M1.1 (Orca manifest layout)
- Repos: platform/orca-platform
- Deliverables: the 4 VMs from INFRASTRUCTURE.md §1 provisioned on SysEleven (DUS2):
  - stage (m2.small, public IP) — runs app-plane code only, calls prod KC + Stalwart
  - vm-edge (m2.small, public IP) — Identity + Infra planes (orca-proxy, PowerDNS, Keycloak, pg-keycloak, Infisical, pg-infisical, Gitea)
  - vm-control (m2.medium) — Control plane (portal, tenant-registry, ERPNext, Frappe HD, MariaDB, Stalwart)
  - vm-data (m2.medium) — Data plane (CERTifAI, MongoDB, LiteLLM, compliance ×3, pg-app, Qdrant, MinIO)
  - Private network 10.0.0.0/16 between all four. Public ingress only via vm-edge (and stage's own IP for tester access).
  - SSH disabled; only orca exec for shell access.
- Acceptance: every VM reachable from Orca control plane; private-network connectivity verified; resource limits per service set in manifest per INFRASTRUCTURE.md §6 co-tenant notes.
- Tests: cold-start sequence from INFRASTRUCTURE.md §10 Scenario F runs successfully on stage VMs.
- Gate: standard + manual sign-off (touches infra spend and 36M commitment decision)
- Effort: M
- Cost impact: see COST_PLAN.md §3. Initial run: ~€552/mo On-Demand, dropping to ~€310/mo after 36M-upfront commit in Month 4.
M1.3 — Backups, monitoring, on-call
- Depends on: M1.2
- Repos: platform/orca-platform
- Deliverables: backup cron per VM per INFRASTRUCTURE.md §3 (Postgres pg_dump, MinIO bucket replication); SigNoz OTel collector running on every VM; alert routing to oncall@breakpilot.com; restore runbook in platform/docs/runbooks/restore.md.
- Acceptance: restore drill on stage succeeds (script in platform/orca-platform/scripts/restore-drill.sh); SigNoz shows traces from a synthetic request.
- Tests: disaster-recovery exercise per failure scenario in INFRASTRUCTURE.md §10 — at least Scenarios A, B, F validated on stage.
- Gate: standard + manual sign-off
- Effort: L
M2.1 — Keycloak deployment
- Depends on: M1.2, M1.3
- Repos: platform/orca-platform
- Deliverables: Keycloak 26 on vm-identity, Postgres backing store on vm-control, exposed at auth.breakpilot.com and auth.stage.breakpilot.com. Realm import file in orca-platform/keycloak/realm-export.json (committed, source-of-truth).
- Acceptance: master admin login works; realm breakpilot-prod exists in both envs.
- Tests: automated realm-state diff in CI (kcadm against checked-in export).
- Gate: standard + security checklist
- Effort: M
M2.2 — Realm configuration: roles + protocol mappers + Organizations
- Depends on: M2.1
- Repos: platform/orca-platform (realm config)
- Deliverables: Organizations feature enabled; realm roles BREAKPILOT_ADMIN, SUPPORT_ENGINEER, SALES_REP; org roles IT_ADMIN, CXO, FINANCE, LEGAL, USER; protocol mapper that calls Tenant Registry at token issuance for products, plan, tenant_status claims; SALES_REP guardrail policy (token only issuable with org_id = demo).
- Acceptance: a test user gets the expected JWT claims; a SALES_REP user cannot get a JWT for a non-demo org (verified by integration test).
- Tests: Keycloak integration suite in platform/tenant-registry/test/keycloak_test.go.
- Gate: standard + security checklist
- Effort: M
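The SALES_REP guardrail from M2.2 reduces to a small predicate, which is also a convenient shape for the integration test to assert against. The sketch below is only an illustration of that rule; the real enforcement lives in the Keycloak policy. The function name is hypothetical, and letting the SALES_REP restriction win when a user holds several realm roles is an assumption.

```python
from typing import Iterable

DEMO_ORG_ID = "demo"  # the only org a SALES_REP token may be issued for


def may_issue_token(realm_roles: Iterable[str], org_id: str) -> bool:
    """Return True if a token for org_id may be issued to a user with realm_roles."""
    if "SALES_REP" in set(realm_roles):
        # Assumption: the restriction applies even if other roles are present.
        return org_id == DEMO_ORG_ID
    return True
```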
M3.1 — Infisical
- Depends on: M1.2
- Repos: platform/orca-platform
- Deliverables: Infisical on vm-secrets, machine identity per service, secret paths laid out per PRODUCT_INTEGRATION_SPEC.md §9.4.
- Acceptance: a stub service can read its secrets at startup; rotating a secret in Infisical UI is picked up on next pod start.
- Tests: smoke test container reads secrets.
- Gate: standard + security checklist
- Effort: S
M3.2 — Stalwart transactional email
- Depends on: M0.3 (needs DNS records under our control), M3.1
- Repos: platform/orca-platform
- Deliverables:
  - Stalwart on vm-control (Orca-managed); reachable at mail.breakpilot.com.
  - DNS records added to the zone in M0.3: mail A record, MX → mail, SPF (v=spf1 mx -all), DKIM (Stalwart-generated public key), DMARC (p=quarantine; rua=mailto:dmarc@breakpilot.com), reverse DNS (PTR) configured at the cloud provider for the vm-control public IP — coordinate with vm-edge since outbound mail must egress from a host with a clean PTR.
  - SMTP submission service account per platform sender: noreply@, oncall@, support@, billing@, dmarc@.
  - Outbound queue and bounce handler; failed deliveries surface as audit events.
  - Webhook receiver at /inbound/postmaster for bounce/complaint feedback loops (Gmail FBL, MS SNDS).
  - IP warming plan: write a platform/docs/runbooks/email-warming.md documenting the 4–8 week ramp from low daily volumes; first 2 weeks of trial nudges (M12.2) explicitly throttled.
- Acceptance: test email from noreply@breakpilot.com to parnerkarsharang@gmail.com lands in inbox (not spam) on day 1; SPF/DKIM/DMARC all "pass" in Gmail's "show original" view; mail-tester.com score ≥ 9/10.
- Tests: automated daily mail-tester check (failure pages on-call); bounce-handling integration test.
- Gate: standard + security checklist + manual deliverability sign-off (DKIM keys are load-bearing)
- Effort: L (deliverability tuning is the long tail)
Phase 0 exit criteria:
- Stage cluster boots cold from cron-driven nightly stop/start using only INFRASTRUCTURE.md §5 ordering.
- A synthetic HTTPS request to https://hello.stage.breakpilot.com reaches a stub container.
- Restore drill on stage Postgres succeeds end-to-end.
3. Phase 1 — Control plane core (M4.x – M5.x)
Goal: Tenant Registry stores tenants; the portal authenticates a user and resolves their tenant. No products surfaced yet.
M4.1 — Tenant Registry: schema + migrations
- Depends on: M1.2, M2.2
- Repos: platform/tenant-registry
- Deliverables: Go service scaffold; golang-migrate migrations for tenants, tenant_projects, tenant_products, tenant_idp_config, api_keys, audit_log per PLATFORM_ARCHITECTURE.md §5c; the tenant.status enum + tenant.kind column from the lifecycle spec.
- Acceptance: make migrate-up on a fresh Postgres produces the documented schema.
- Tests: migration up/down round-trip via testcontainers-go.
- Gate: standard
- Effort: M
M4.2 — Tenant Registry: REST API
- Depends on: M4.1
- Repos: platform/tenant-registry
- Deliverables: OpenAPI 3.1 spec at /openapi.yaml; endpoints POST /tenants, GET /tenants/:id, POST /tenants/:id/activate, POST /tenants/:id/cancel, GET /catalog, POST /catalog/request, POST /catalog/trial-request, POST /api-keys, POST /internal/api-keys/verify, POST /audit, GET /audit.
- Acceptance: every endpoint passes the OpenAPI contract test; returns documented errors for invalid input.
- Tests: integration tests against real Postgres for every endpoint.
- Gate: standard
- Effort: L
M4.3 — Tenant Registry: Keycloak adapter
- Depends on: M4.2, M2.2
- Repos: platform/tenant-registry
- Deliverables: package internal/keycloak that creates orgs, invites IT_ADMIN users, sets realm roles, and serves the protocol-mapper claims endpoint (the URL Keycloak hits during token issuance from M2.2).
- Acceptance: creating a tenant via POST /tenants provisions a Keycloak org and one IT_ADMIN user; user receives invite email.
- Tests: integration test against the stage Keycloak.
- Gate: standard
- Effort: M
M5.1 — Portal scaffold: subdomain routing + OIDC login
- Depends on: M2.2, M4.3, M0.3
- Repos: platform/portal, platform/design-tokens
- Deliverables: Next.js 15 app on vm-control; middleware reads Host → extracts slug → calls Tenant Registry GET /tenants?slug= → injects tenant context; Keycloak OIDC login; logout; design-tokens package consumed by portal.
- Acceptance: visiting https://acme.stage.breakpilot.com redirects to Keycloak; after login, user lands on /acme/dashboard (empty page) with valid session.
- Tests: Playwright e2e: login + logout for an existing test tenant.
- Gate: standard
- Effort: M
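The "reads Host → extracts slug" step in M5.1 can be illustrated with a small pure function. This is a language-neutral sketch written in Python, not the Next.js middleware itself; the function name is hypothetical, and the reserved-label set is an assumption built from the platform hostnames listed in M0.3 plus "stage".

```python
from typing import Optional

# Longest suffix first so stage hosts are not matched by the prod domain.
BASE_DOMAINS = ("stage.breakpilot.com", "breakpilot.com")

# Platform hostnames from the M0.3 record list, plus "stage" (assumption).
RESERVED_LABELS = {"auth", "erp", "mcp", "cdn", "mail", "ns1", "ns2", "stage"}


def tenant_slug(host_header: str) -> Optional[str]:
    """Extract the tenant slug from a Host header, or None if there is none."""
    hostname = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    for base in BASE_DOMAINS:
        suffix = "." + base
        if hostname.endswith(suffix):
            label = hostname[: -len(suffix)]
            # Exactly one extra label is allowed, and it must not be reserved.
            if label and "." not in label and label not in RESERVED_LABELS:
                return label
            return None
    return None
```

A None result would let the middleware fall through to the apex site or return a 404 instead of calling Tenant Registry with a bogus slug.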
M5.2 — Portal: dashboard + backstage shells
- Depends on: M5.1
- Repos: platform/portal
- Deliverables: customer dashboard route /[slug]/dashboard (renders product tiles from JWT products claim — empty initially), backstage routes per PLATFORM_ARCHITECTURE.md §5a skeleton, RBAC enforcement (§5a "Operating principles" — hide what user can't access), session refresh.
- Acceptance: user with org_roles=[USER] cannot see settings or billing links; backstage routes return 403 for non-BREAKPILOT_ADMIN users.
- Tests: Playwright spec per role × route matrix.
- Gate: standard
- Effort: M
M5.3 — Playwright e2e harness
- Depends on: M5.2
- Repos: platform/portal
- Deliverables: Playwright config that runs against stage.breakpilot.com post-deploy; CI job e2e-stage triggered after stage deploy; failure pages on-call.
- Acceptance: breaking change to login is caught in CI within 10 min of merge.
- Tests: the suite itself.
- Gate: standard
- Effort: S
Phase 1 exit criteria:
- A tenant created via POST /tenants results in a working login flow at <slug>.stage.breakpilot.com.
- All Phase 1 routes have a passing Playwright spec running on every stage deploy.
4. Phase 2 — Existing product uplift (M6.x – M7.x, parallel)
Goal: CERTifAI and breakpilot-compliance both honour the JWT contract and surface a real product tile in the portal.
M6.1 — CERTifAI: org_id scoping at DB layer
- Depends on: M2.2
- Repos: benjamin_boenisch/certifai
- Deliverables: MongoDB middleware that requires org_id on every query; backfill script for existing collections; per-tenant collection-level role checks (IT_ADMIN → Admin, etc.).
- Acceptance: integration test attempting a cross-tenant read returns 403; existing single-tenant flows still work for tenant default.
- Tests: unit + integration; regression tests for every existing controller.
- Gate: standard + security checklist
- Effort: L (4–6 weeks per prior gap analysis)
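The "org_id on every query" rule in M6.1 can be pictured as a filter builder that every database call must pass through. The sketch below is illustrative only and does not reflect CERTifAI's actual code or language: the function and exception names are hypothetical, and it only inspects the top level of the filter document, so nested operators would need extra handling.

```python
from typing import Any, Dict, Mapping, Optional


class MissingOrgScope(Exception):
    """Raised when a query is attempted without a tenant scope."""


class CrossTenantQuery(Exception):
    """Raised when a filter targets an org other than the caller's."""


def scoped_filter(org_id: str, query: Optional[Mapping[str, Any]] = None) -> Dict[str, Any]:
    """Return a copy of a MongoDB-style filter that is pinned to org_id."""
    if not org_id:
        raise MissingOrgScope("every query needs an org_id")
    scoped = dict(query or {})
    if "org_id" in scoped and scoped["org_id"] != org_id:
        # An HTTP layer could map this to the 403 named in the acceptance test.
        raise CrossTenantQuery("filter targets a different org")
    scoped["org_id"] = org_id
    return scoped
```

The returned dictionary would then be handed to the driver's find or update call in place of the caller's raw filter.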
M6.2 — CERTifAI: JWT validation + role mapping
- Depends on: M6.1
- Repos: benjamin_boenisch/certifai
- Deliverables: Keycloak JWKS validation middleware; role mapping per PLATFORM_ARCHITECTURE.md §6; tenant_status middleware (returns 402 on writes when frozen, 410 when archived, allows demo with no metering).
- Acceptance: all four tenant.status states behave per spec; tested against a stage Keycloak.
- Tests: integration tests per status value.
- Gate: standard + security checklist
- Effort: M
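The tenant_status middleware in M6.2 boils down to a lookup from status and HTTP method to an optional rejection code. The sketch below shows one reading of that rule. The function names are hypothetical, and two points are assumptions: that "writes" means POST, PUT, PATCH, and DELETE, and that archived tenants receive 410 on every request rather than only on writes.

```python
from typing import Optional

# Assumption: these are the methods that count as writes.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def status_gate(tenant_status: str, method: str) -> Optional[int]:
    """Return an HTTP status code to reject with, or None to let the request through."""
    if tenant_status == "archived":
        return 410  # assumption: applies to reads as well as writes
    if tenant_status == "frozen" and method.upper() in WRITE_METHODS:
        return 402
    return None


def should_meter(tenant_status: str) -> bool:
    """Demo tenants are served but their usage is not metered."""
    return tenant_status != "demo"
```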
M6.3 — CERTifAI: manifest + integration assets
- Depends on: M6.2
- Repos: benjamin_boenisch/certifai
- Deliverables: product.manifest.yaml per PRODUCT_INTEGRATION_SPEC.md §10 published to cdn.breakpilot.com; OpenAPI 3.1 spec; /v1/health, /v1/usage, /v1/tenants/:id/export, DELETE /v1/tenants/:id/data, POST /v1/tenants/demo/reset; web component certifai-dashboard per §5.A.
- Acceptance: CERTifAI appears in the portal catalog; subscribed tenants can open it from the dashboard.
- Tests: contract test that manifest validates against schema; web component renders inside portal shadow-DOM host.
- Gate: standard
- Effort: L
M7.1 — Compliance: JWT validation upgrade
- Depends on: M2.2
- Repos: benjamin_boenisch/breakpilot-compliance
- Deliverables: Next.js proxy validates JWT against Keycloak JWKS (replacing today's X-Tenant-ID trust); tenant_status middleware as in M6.2.
- Acceptance: spoofing X-Tenant-ID without a JWT returns 401; valid JWT for tenant A cannot read tenant B data.
- Tests: integration tests for both auth and status states.
- Gate: standard + security checklist
- Effort: M (3–5 weeks per prior gap analysis)
M7.2 — Compliance: manifest + integration assets
- Depends on: M7.1
- Repos: benjamin_boenisch/breakpilot-compliance
- Deliverables: same endpoint set as M6.3; web component (existing React → @r2wc/react-to-web-component per §5.A); manifest with supports_projects: true (already implemented).
- Acceptance: compliance appears in portal catalog; opens from dashboard; project switching works inside the product.
- Tests: as M6.3.
- Gate: standard
- Effort: M
Phase 2 exit criteria:
- A real tenant on stage can subscribe to both products and use them through the portal.
- Cross-product audit at /[slug]/audit shows events from both products in the Retraced schema.
5. Phase 3 — Business operations (M8.x – M9.x)
Goal: ERPNext and Frappe HD run, Sales Order → tenant activate works, tickets escalate to Gitea.
M8.1 — ERPNext deployment
- Depends on: M1.2, M2.1
- Repos: platform/orca-platform
- Deliverables: Frappe + ERPNext on vm-control (separate Postgres database from tenant_registry — see INFRASTRUCTURE.md RISK-1); reached at erp.breakpilot.com; Keycloak OIDC; IP-restricted at Orca-Proxy.
- Acceptance: login works for us; a Customer record can be created manually.
- Tests: smoke test for OIDC; backup of Frappe filestore validated.
- Gate: standard + manual sign-off (touches vm-control resources)
- Effort: M
M8.2 — ERPNext customization
- Depends on: M8.1
- Repos: platform/orca-platform/erpnext-app/
- Deliverables: custom Frappe app with: tenant_id field on Customer; sales_owner field on Lead; server scripts for the Sales Order → Tenant Registry webhook; Cancel workflow that calls Tenant Registry /cancel.
- Acceptance: submitting a Sales Order in ERPNext triggers a tenant activation in stage Tenant Registry.
- Tests: server-script unit tests (Frappe test harness); integration test exercises the full webhook.
- Gate: standard
- Effort: M
M8.3 — Self-serve billing (Polar.sh)
- Depends on: M8.1, M5.2
- Repos: platform/portal, platform/tenant-registry
- Deliverables:
  - Polar.sh organization + products configured for Starter / Professional / per-seat tiers.
  - Polar Checkout embedded in portal /[slug]/billing/upgrade.
  - Webhook listener at tenant-registry /polar/webhook (HMAC-verified) handles subscription.created, subscription.updated, subscription.canceled, order.paid → flips tenant.status, mirrors the customer + invoice into ERPNext via REST.
  - Polar acts as Merchant of Record — they handle EU VAT MOSS, no per-country tax registration needed for our side.
  - Portal billing page reads invoices from ERPNext (single source of truth for accounting) but links out to Polar's customer portal for payment-method management.
- Acceptance: signing up self-serve creates a tenant, a Polar subscription, an ERPNext Customer + Invoice, and a usable login; VAT line item appears correctly on the EU customer's invoice.
- Tests: integration test against Polar sandbox; webhook replay test; tax calculation correct for at least DE, FR, NL, US.
- Gate: standard + security checklist
- Effort: L
Why Polar.sh over Stripe / Lemon Squeezy: OSS-aligned, Merchant of Record (handles EU VAT MOSS automatically), developer-first, 4% + Stripe fees vs. Lemon's 5%. Stripe direct would require us to register for VAT in 27 countries — not viable for a 2-person team. See self-hosted-oss-first.
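For the HMAC-verified webhook listener in M8.3, the core check is a constant-time comparison of a recomputed signature against the one sent with the request. The sketch below shows the generic pattern with Python's standard library only. It is not Polar's documented signature format: the hex encoding, signing the raw body alone, and the function names are assumptions, so the provider's webhook documentation decides the real header layout and encoding.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check a received signature in constant time."""
    expected = sign(secret, body).encode("utf-8")
    received = signature_hex.encode("utf-8")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, received)
```

Verification has to run on the raw bytes before any JSON parsing, since re-serialising the payload can change the byte sequence and break the signature.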
M9.1 — Frappe Helpdesk
- Depends on: M8.1
- Repos: platform/orca-platform
- Deliverables: Frappe HD on the same Frappe bench; customer portal embedded at /[slug]/support/.
- Acceptance: a customer user can submit a ticket; we receive it.
- Tests: Playwright spec for ticket submission.
- Gate: standard
- Effort: S
M9.2 — HD → Gitea escalation
- Depends on: M9.1
- Repos: platform/orca-platform/erpnext-app/
- Deliverables: server script that on a Ticket: Escalate to Engineering action creates a Gitea issue in the matching repo via Gitea REST API; reverse webhook from Gitea on issue close marks ticket resolved.
- Acceptance: the round-trip works for a test ticket on stage.
- Tests: integration test against stage Gitea.
- Gate: standard
- Effort: S
Phase 3 exit criteria:
- ERPNext is the source of truth for billing/CRM/HR.
- The full Lead → Quote → Sales Order → Tenant chain works on stage.
6. Phase 4 — Customer UX & lifecycle (M10.x – M14.x)
Goal: Every customer-facing flow from PLATFORM_ARCHITECTURE.md works end-to-end on stage.
M10.1 — Customer area: full surfaces
- Depends on: M5.2, M6.3, M7.2
- Repos: platform/portal
- Deliverables: real implementations of /[slug]/dashboard, /[slug]/products/*, /[slug]/projects, /[slug]/settings/{identity,users,api-keys,integrations}, /[slug]/billing, /[slug]/audit, /[slug]/support.
- Acceptance: every route is implemented, RBAC-gated, with empty/loading/error states.
- Tests: one Playwright spec per route × primary role.
- Gate: standard
- Effort: L
M10.2 — Cross-product audit view
- Depends on: M10.1, M4.2
- Repos: platform/portal
- Deliverables: audit page filters by product/actor/action/time; CSV + PDF export; events rendered from the Retraced-shape schema.
- Acceptance: a DPO-style query ("show me everything user X did across all products last month") returns in <2s for a tenant with 100k events.
- Tests: load test with synthetic events.
- Gate: standard + security checklist
- Effort: M
M11.1 — Catalog flow (P13)
- Depends on: M4.2, M10.1
- Repos: platform/portal, platform/tenant-registry
- Deliverables: /[slug]/catalog UI per PLATFORM_ARCHITECTURE.md P13; "Request" button creates ERPNext CRM Lead.
- Acceptance: customer requests a non-subscribed product; sales sees a Lead in ERPNext with the right sales_owner.
- Tests: Playwright e2e covering the full P13 sequence.
- Gate: standard
- Effort: M
M12.1 — Self-serve trial (P15)
- Depends on: M8.3, M11.1
- Repos: platform/portal, platform/tenant-registry
- Deliverables: public /start form; trial tenant provisioning (status=trial, trial_ends_at); banner; trial_quota enforcement (read by products from JWT).
- Acceptance: prospect signs up; trial tenant with 14-day timer exists; quota enforced.
- Tests: Playwright e2e signs up → uses → hits quota.
- Gate: standard
- Effort: M
M12.2 — Trial lifecycle cron + emails
- Depends on: M12.1, M3.2 (Stalwart must be deliverability-clean)
- Repos: platform/tenant-registry
- Deliverables: scheduler in tenant-registry that runs day-7/12/14 emails; status transitions trial → active (on payment) or trial → frozen → archived; SMTP via Stalwart at mail.breakpilot.com:587; sender noreply@breakpilot.com; HTML + plaintext templates committed under tenant-registry/templates/email/; List-Unsubscribe headers per RFC 8058.
- Acceptance: in a time-warped stage test (script that advances trial_ends_at), all transitions fire in order and all three emails land in Gmail inbox.
- Tests: integration test with time injection; deliverability spot-check at each release.
- Gate: standard
- Effort: M
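The day-7/12/14 schedule and the time-injection requirement from M12.2 fit together naturally if the scheduler takes "today" as a parameter instead of reading the clock. The sketch below illustrates that shape in Python, although the real scheduler lives in the Go tenant-registry. The function names are hypothetical, and counting "day N" as N days after the trial start is an assumption.

```python
from datetime import date, timedelta
from typing import Dict, List, Set

EMAIL_DAYS = (7, 12, 14)  # trial nudges named in M12.2


def email_schedule(trial_start: date) -> Dict[int, date]:
    """Map each nudge day to the calendar date on which it becomes due."""
    return {day: trial_start + timedelta(days=day) for day in EMAIL_DAYS}


def due_emails(trial_start: date, today: date, already_sent: Set[int]) -> List[int]:
    """Return the nudge days that are due on `today` and have not been sent.

    `today` is passed in rather than read from the clock so that a test can
    advance time without waiting.
    """
    elapsed = (today - trial_start).days
    return [day for day in EMAIL_DAYS if day <= elapsed and day not in already_sent]
```

Tracking already_sent makes the job safe to re-run after a missed night, because overdue nudges are picked up once and never duplicated.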
M13.1 — Demo tenant seeding
- Depends on: M6.3, M7.2
- Repos: platform/seed-data
- Deliverables: per-product fixture archives (certifai/seed-v1.tar.gz, compliance/seed-v1.tar.gz); publishing pipeline to cdn.breakpilot.com; catalog.demo.seed_data_url populated in product manifests.
- Acceptance: calling POST /v1/tenants/demo/reset on either product restores fixtures.
- Tests: integration test asserts fixture state after reset.
- Gate: standard
- Effort: M
M13.2 — Sales demo flow (P14)
- Depends on: M2.2, M13.1
- Repos: platform/portal, platform/tenant-registry
- Deliverables: demo tenant created in stage and prod with kind=demo, status=demo; SALES_REP role usable; backstage routes restricted to /backstage/leads and /backstage/demo; demo tenant audit events tagged {"demo": true} and hidden from real-tenant audit views.
- Acceptance: sales rep logs in at demo.breakpilot.com, walks both products live, [Request Trial] modal creates a CRM Lead with sales_owner = the rep.
- Tests: Playwright e2e for the sales walk-through.
- Gate: standard + security checklist (SALES_REP guardrail enforcement is the load-bearing piece)
- Effort: M
M13.3 — Nightly demo reset
- Depends on: M13.2
- Repos: platform/tenant-registry
- Deliverables: cron at 03:00 Europe/Berlin calls each product's reset endpoint; failures page on-call.
- Acceptance: after a deliberately-corrupted demo state, the next 03:00 reset restores fixtures.
- Acceptance: after a deliberately-corrupted demo state, the next 03:00 reset restores fixtures.
- Tests: test runs the reset manually + verifies fixture state.
- Gate: standard
- Effort: S
M14.1 — Cancel + frozen state (P16 part 1)
- Depends on: M10.1, M6.2, M7.1
- Repos: platform/portal, platform/tenant-registry
- Deliverables: cancel modal with reason + typed-confirm; status active → frozen transition; Stripe cancel_at_period_end; ERPNext Opportunity → Lost; reactivation path within 30 days.
- Acceptance: test customer cancels; portal switches to read-only; reactivate restores active status without data loss.
- Tests: Playwright e2e covering cancel + reactivate.
- Gate: standard + security checklist
- Effort: M
M14.2 — Offboarding cron + final export (P16 part 2)
- Depends on: M14.1, M6.3, M7.2
- Repos: platform/tenant-registry
- Deliverables: day-30 cron builds final export ZIP per product, emails signed URL (7-day TTL), calls DELETE /v1/tenants/:id/data on every subscribed product, archives Keycloak org, marks tenant.status = archived.
- Acceptance: time-warped test runs the full P16 sequence end-to-end on stage; export ZIP contains data from both products; second post-archive request to either product returns 410.
- Tests: integration test with time injection; GDPR-compliance regression suite added.
- Gate: standard + security checklist + manual sign-off (irreversible operation)
- Effort: L
Phase 4 exit criteria:
- Every flow P1–P16 from PLATFORM_ARCHITECTURE.md has a passing Playwright spec.
- Stage runs a full lifecycle: sign-up trial → convert → use → cancel → offboard, in an automated nightly job.
- We can hand a prospect a real demo using demo.breakpilot.com.
7. Phase 5 — Headless products (M15.x – M17.x)
Goal: Make the platform host products with no UI of their own.
M15.1 — API key infrastructure
- Depends on: M4.2, M10.1
- Repos: platform/tenant-registry, platform/portal
- Deliverables: API key CRUD per PRODUCT_INTEGRATION_SPEC.md §6.2; portal UI at /[slug]/settings/api-keys; POST /internal/api-keys/verify for products.
- Acceptance: create key in portal; product call with key succeeds; revoke kills access within 60s.
- Tests: integration tests for verify endpoint; Playwright for portal UI.
- Gate: standard + security checklist (rotation + scope enforcement)
- Effort: M
M15.2 — Webhook delivery
- Depends on: M15.1
- Repos: platform/tenant-registry, platform/portal
- Deliverables: webhook config + delivery service per PLATFORM_ARCHITECTURE.md H4; portal page /[slug]/integrations; signed payloads; 3-attempt retry with backoff; dead-letter visible at /webhooks/deliveries.
- Acceptance: test webhook to https://requestbin.com works; failed deliveries appear in dead letter.
- Tests: integration tests with a local sink.
- Gate: standard
- Effort: M
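The "3-attempt retry with backoff" in M15.2 can be sketched as a loop around an injected send function. The sketch below is a Python illustration rather than the Go delivery service: the helper name is hypothetical, exponential doubling from a 1-second base is an assumption (the plan only says "backoff"), and the sleep function is injectable so tests do not wait in real time.

```python
import time
from typing import Callable


def deliver(
    send: Callable[[], bool],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try a webhook delivery up to max_attempts times.

    Returns True on the first successful attempt, or False once all attempts
    have failed, at which point the caller would record a dead letter.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if send():
                return True
        except Exception:
            # A raised error counts as a failed attempt, same as a False result.
            pass
        if attempt < max_attempts:
            # Assumed backoff: 1s, 2s, 4s, ... No sleep after the final attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    return False
```

In a local-sink integration test, send can be a closure that posts to the sink and returns whether the response was a 2xx.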
M16.1 — First headless product reference implementation
- Depends on: M15.2
- Repos: TBD (proof-of-concept can live in `platform/docs/examples/headless-template/`)
- Deliverables: a minimal headless product (e.g., echo-bot) that implements the full §5.C contract: manifest, API, audit emit, usage emit, demo reset, GDPR endpoints.
- Acceptance: echo-bot is bookable from catalog, works end-to-end, passes the same lifecycle test as Phase 4.
- Tests: the lifecycle e2e from M14.2 extended to include echo-bot.
- Gate: standard
- Effort: M
M17.1 — MCP servers (Enterprise)
- Depends on: M6.3, M7.2
- Repos: `benjamin_boenisch/certifai`, `benjamin_boenisch/breakpilot-compliance`
- Deliverables: MCP endpoints per `PRODUCT_INTEGRATION_SPEC.md §10` `mcp:` block; gated on `plan == enterprise`; routed via `mcp.breakpilot.com`.
- Acceptance: Claude Code can connect to `mcp.breakpilot.com/certifai` with a service token and call `list_ai_agents`.
- Tests: MCP contract test using `mcp-cli`.
- Gate: standard + security checklist
- Effort: L
Phase 5 exit criteria:
- A third-party (or us) can add a new headless product by following `PRODUCT_INTEGRATION_SPEC.md` and a referenced template, with no portal code changes required.
8. Phase 6 — Enterprise + scale (M18.x – M19.x)
These ship only when a paying customer requires them.
M18.1 — Custom domains
- Depends on: M0.3, M10.1
- Repos: `platform/orca-platform`, `platform/portal`
- Deliverables: ACME on-demand TLS in Orca-Proxy; portal UI for customer to add domain; CNAME verification.
- Acceptance: `compliance.acme.com` resolves and renders the Acme portal.
- Tests: integration test with a synthetic domain.
- Gate: standard
- Effort: M
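The CNAME verification step might reduce to a check like the one below. The expected target hostname, the function names, and the convention that the injected resolver raises `LookupError` for a missing record are all assumptions; a real implementation would wrap an actual DNS client.

```python
from typing import Callable, List

EXPECTED_TARGET = "portal.breakpilot.com"  # hypothetical CNAME target


def normalise(host: str) -> str:
    """DNS names are case-insensitive and may carry a trailing root dot."""
    return host.strip().rstrip(".").lower()


def cname_verified(
    domain: str,
    lookup_cname: Callable[[str], List[str]],  # injected resolver, assumed interface
    expected: str = EXPECTED_TARGET,
) -> bool:
    """True when the customer's domain has a CNAME pointing at the expected target."""
    try:
        targets = lookup_cname(normalise(domain))
    except LookupError:  # assumed to signal "no such record"
        return False
    return normalise(expected) in {normalise(t) for t in targets}
```

Injecting the resolver is what makes the "synthetic domain" integration test cheap: the test supplies a fake lookup table instead of real DNS.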
M18.2 — Physical data isolation
- Depends on: M4.1, M6.1, M7.1
- Repos: all data-plane products + `tenant-registry`
- Deliverables: option per tenant for a dedicated Postgres / Mongo schema or database; provisioning automation; migration path from logical → physical.
- Acceptance: an enterprise tenant runs on a dedicated schema; cross-tenant queries are physically impossible.
- Tests: isolation enforcement test.
- Gate: standard + security review + manual sign-off
- Effort: L
M19.1 — A/B testing infra
- Depends on: anywhere `featureFlags.evaluate()` is called
- Repos: new `platform/feature-flags` (Unleash on `vm-control` or hosted) + portal SDK shim
- Deliverables: swap the hard-coded `evaluate()` from §1.12 to call Unleash; eval results land in audit events for reproducibility.
- Acceptance: flipping a flag in Unleash changes behaviour for the targeted tenant set within 30s; no behaviour change for other tenants.
- Tests: integration test asserts flag-driven branches.
- Gate: standard
- Effort: M
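The swap described above is easiest when `evaluate()` already sits behind an interface. A minimal sketch, assuming a hypothetical `FlagBackend` protocol, a hard-coded backend standing in for the §1.12 version, and an audit event shape invented for illustration (the real shim lives in the portal SDK, not in Python):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol, Set


class FlagBackend(Protocol):
    """Anything that can answer "is this flag on for this tenant?"."""

    def is_enabled(self, flag: str, tenant_id: str) -> bool: ...


@dataclass
class HardCodedBackend:
    """Stand-in for the hard-coded evaluation; an Unleash-backed class would replace it."""

    enabled_for: Dict[str, Set[str]] = field(default_factory=dict)

    def is_enabled(self, flag: str, tenant_id: str) -> bool:
        return tenant_id in self.enabled_for.get(flag, set())


@dataclass
class FeatureFlags:
    backend: FlagBackend
    audit_log: List[Dict[str, object]] = field(default_factory=list)

    def evaluate(self, flag: str, tenant_id: str) -> bool:
        result = self.backend.is_enabled(flag, tenant_id)
        # Record every evaluation so a past decision can be reproduced later.
        self.audit_log.append(
            {"type": "flag.evaluated", "flag": flag, "tenant_id": tenant_id, "result": result}
        )
        return result
```

Callers only ever see `evaluate()`, so moving to Unleash means adding one new backend class rather than touching call sites.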
9. Cross-cutting work (every phase, ongoing)
These are not milestones — they are commitments enforced by CI and process.
- Regression suite expansion. Every bug fix lands with a regression test FIRST. Tracked by `tests-added` label on PRs; fix-without-test PRs are rejected by reviewer.
- Security review per phase. End of each phase: dependency audit (`cargo audit`, `npm audit`, `pip-audit`), SAST scan (`semgrep`), threat model update in `platform/docs/security/`.
- Disaster-recovery drills. Once per phase on stage: pick one scenario from `INFRASTRUCTURE.md §10`, run it, document time-to-recover in the runbook.
- Doc currency. PR template requires the author to tick "docs updated" or "n/a"; CI fails on a missing tick.
- OSS swap-in readiness. When adding metering / audit / SCIM / flag eval code, use the schema/interface noted in `PRODUCT_INTEGRATION_SPEC.md §15` so swap-in stays cheap.
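The doc-currency gate could be a small script run by CI against the PR body. The checkbox syntax below (a Markdown task-list item labelled exactly "docs updated" or "n/a") is an assumption, since the actual §1.4 template wording is not reproduced here.

```python
import re

# Matches a ticked Markdown task-list item such as "- [x] docs updated".
TICKED = re.compile(
    r"^\s*[-*]\s*\[[xX]\]\s*(docs updated|n/a)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def docs_box_ticked(pr_body: str) -> bool:
    """True when the PR body ticks either "docs updated" or "n/a"."""
    return TICKED.search(pr_body) is not None
```

A CI job would exit non-zero when `docs_box_ticked` returns `False`, which is what turns the checklist item into an enforced commitment.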
10. First-PR checklist for Claude Code
When starting work, the first sequence of PRs should be:
- PR-1 (M0.1): Create `platform/docs` with copied architecture docs + this plan. Land in 1 day.
- PR-2 to PR-7 (M0.1 continued): Bootstrap each of the other five repos with §1.2 scaffolding. Land in parallel.
- PR-8 (M0.2): CI templates + branch protection per repo.
- PR-9 (M1.1): `orca-platform` directory layout + first stub manifest.
- PR-10 (M1.2): VM provisioning (vm-edge, vm-identity, vm-secrets, vm-control first; DNS and Keycloak depend on these).
- PR-11 (M0.3): PowerDNS on vm-edge + zone file + registrar NS delegation + wildcard TLS via Let's Encrypt DNS-01.
After PR-11, the dependency graph fans out and parallel work begins.
For each PR, Claude Code MUST:
- Open the PR with the §1.4 template filled in.
- Link the milestone ID in PR body (`Linked milestone: M0.1`).
- Wait for human approval (no self-merge; branch protection enforces).
- After merge: verify the stage deploy succeeds before starting the next dependent PR.
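The milestone link is easy to check mechanically. A minimal sketch, assuming the line appears on its own in the PR body in exactly the `Linked milestone: M0.1` form shown above; the function name is hypothetical.

```python
import re
from typing import Optional

# Milestone IDs follow the M{phase}.{n} naming from §0.
MILESTONE_LINE = re.compile(r"^Linked milestone:\s*(M\d+\.\d+)\s*$", re.MULTILINE)


def linked_milestone(pr_body: str) -> Optional[str]:
    """Return the milestone ID named in the PR body, or None if it is missing."""
    match = MILESTONE_LINE.search(pr_body)
    return match.group(1) if match else None
```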
11. Dependency graph
┌── M6.1 ── M6.2 ── M6.3 ──┐
│ │
┌── M2.1 ── M2.2 ────────┤ ├── M10.1 ── M10.2
│ │ │ │
M0.1 ── M0.2 ── M1.1 ──┼── M1.2 ── M0.3 ── M1.3 │ │ ├── M11.1 ── M12.1 ── M12.2
│ │ │ │ │ │
│ └── M3.1 ── M3.2 │ ├── M13.2 ── M13.3 │
│ │ │ │ │
└─────────────────────── M4.1 ── M4.2 ── M4.3 ── M5.1 ── M5.2 ── M5.3 M13.1 │
│
M8.1 ── M8.2 ── M8.3 ── M9.1 ── M9.2 ──────────────────────┤
│
M15.1 ── M15.2 ── M16.1 ── M17.1 │
│
M14.1 ── M14.2
Phase-6 (M18, M19) depends on Phase-4 completion + a paying customer.
M12.2 depends on M3.2 (Stalwart deliverability must be clean before trial emails go out).
Critical path (longest chain to first paying customer):
M0.1 → M0.2 → M1.1 → M1.2 → M0.3 → M1.3 → M2.1 → M2.2 → M4.1 → M4.2 → M4.3 → M5.1 → M5.2 → M6.2 → M6.3 → M10.1 → M11.1 → M12.1
That's 18 milestones. With one full-time agent and standard human review pacing, plan for 9–13 weeks to first paying customer flow on stage (added 1 week for the PowerDNS / DNS-delegation cycle vs. the prior Cloudflare path); +2–4 weeks for prod hardening and the Phase-4 lifecycle completion.
Note on M3.2 critical path: Stalwart IP warming (4–8 weeks) runs in parallel in the background; start it immediately after M3.1 so warming finishes before M12.2 needs it. It is NOT on the critical path for first paying customer (that customer can be onboarded by hand), but it IS on the critical path for self-serve trial volume.
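The 18-milestone count can be reproduced by a longest-chain walk over the dependency edges. The sketch below encodes only the edges stated in the critical-path line plus the M3.1 → M3.2 and M12.2 edges from the graph and notes; treating M3.1 as having no prerequisites is a simplification, and the function name is hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List

CHAIN = [
    "M0.1", "M0.2", "M1.1", "M1.2", "M0.3", "M1.3", "M2.1", "M2.2", "M4.1",
    "M4.2", "M4.3", "M5.1", "M5.2", "M6.2", "M6.3", "M10.1", "M11.1", "M12.1",
]

# Map each milestone to the milestones it depends on.
DEPS: Dict[str, List[str]] = {b: [a] for a, b in zip(CHAIN, CHAIN[1:])}
DEPS["M3.2"] = ["M3.1"]            # simplification: M3.1's own prerequisites omitted
DEPS["M12.2"] = ["M12.1", "M3.2"]  # trial emails need clean Stalwart deliverability


def longest_chain(deps: Dict[str, List[str]], target: str) -> List[str]:
    """Longest dependency chain ending at target, counted in milestones."""
    best: Dict[str, List[str]] = {}
    # static_order yields every milestone after all of its prerequisites.
    for node in TopologicalSorter(deps).static_order():
        preds = deps.get(node, [])
        longest_pred = max((best[p] for p in preds), key=len, default=[])
        best[node] = longest_pred + [node]
    return best[target]
```

Here `longest_chain(DEPS, "M12.1")` returns the 18 milestones listed above. A count-based chain ignores duration, which is why the 4–8 week warming behind M3.2 can still gate M12.2 in calendar time even though its chain is short.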
Parallelism opportunities:
- M6.x and M7.x can run fully in parallel (different repos, different stacks).
- M8.x is independent of all data-plane work once M2.2 is done.
- M15.x can begin as soon as M10.1 lands.
12. Open questions to resolve before starting
Resolved:
- Email provider → Stalwart, self-hosted on vm-control. Plan in M3.2; 4–8 week IP warming acknowledged.
- Stripe vs Lemon Squeezy → Polar.sh. Plan in M8.3.
- Cloudflare account ownership → not used; DNS is self-hosted via PowerDNS on vm-edge (M0.3). Registrar account (Benjamin's) still needs documented 2FA recovery; see new DR item below.
Still open:
- CDN host for `cdn.breakpilot.com`: self-hosted MinIO + Caddy on vm-edge is the OSS-aligned default; alternative is BunnyCDN (cheap, EU). Decide before M6.3 (manifest bundles + hero images).
- Cloud provider for port 25 outbound. Stalwart needs unblocked port 25 to send mail. Hetzner blocks by default and requires a request to unblock with proof of intent + abuse contact; OVH and Scaleway unblock on request faster. Confirm with Benjamin which provider vm-control runs on. M3.2 is blocked if port 25 cannot be unblocked; the fallback is sending via a different provider's IP with reverse DNS.
- Test data privacy. The demo tenant must contain ONLY synthetic data; confirm the seed pipeline strips real PII even if our test orgs accidentally seed from prod.
- Registrar + DNS bus-factor. Document who owns the registrar account, who has 2FA recovery codes, and the procedure to update NS records without that person available. Goes in `platform/docs/runbooks/dr.md` before M0.3 ships.
- Internal CA. `step-ca` is listed in INFRASTRUCTURE.md (vm-edge) as "optional"; decide whether inter-service mTLS is in scope for Phase 0 or deferred until Phase 4 (Enterprise tier).
End of document. Open items in §12 should be triaged before M0.1 starts; the bus-factor and port-25 items are the only hard blockers.