chore(domain): yourplatform.com → breakpilot.com

Apply the platform-domain decision (2026-05-18) to every README,
workflow, and config in this repo. 7 files updated.

Refs: M1.1
2026-05-18 22:08:05 +02:00
parent 1ed2dcee57
commit cb50fc5026
8 changed files with 80 additions and 79 deletions
@@ -120,8 +120,8 @@ Three Orca clusters, all on the same hardware until volume justifies separation:
Domain pattern:
- dev: `*.localhost` (mkcert)
- stage: `*.stage.yourplatform.com`
- prod: `*.yourplatform.com`
- stage: `*.stage.breakpilot.com`
- prod: `*.breakpilot.com`
### 1.9 Observability + audit
- **SigNoz** (already running at `signoz.meghsakha.com`) for traces, logs, metrics. Every service ships OTel SDK from day one.
@@ -172,12 +172,12 @@ Every place where a future flag would gate behaviour MUST flow through a single
- **Repos:** `platform/orca-platform`
- **Deliverables:**
- **PowerDNS Authoritative** on `vm-edge` (Orca-managed). PostgreSQL backend on same VM (small; ~100 records).
- At the registrar (Benjamin's account): set `ns1.yourplatform.com` and `ns2.yourplatform.com` glue records pointing at vm-edge public IP; delegate the domain to those NS.
- Zone file committed in `orca-platform/dns/yourplatform.com.zone`; Orca syncs into PowerDNS on apply.
- Records: apex `yourplatform.com`, wildcards `*.yourplatform.com` + `*.stage.yourplatform.com`, plus `auth.`, `erp.`, `mcp.`, `cdn.`, `mail.`, `ns1.`, `ns2.`, SPF/DKIM/DMARC TXT records (for M3.2).
- At the registrar (Benjamin's account): set `ns1.breakpilot.com` and `ns2.breakpilot.com` glue records pointing at vm-edge public IP; delegate the domain to those NS.
- Zone file committed in `orca-platform/dns/breakpilot.com.zone`; Orca syncs into PowerDNS on apply.
- Records: apex `breakpilot.com`, wildcards `*.breakpilot.com` + `*.stage.breakpilot.com`, plus `auth.`, `erp.`, `mcp.`, `cdn.`, `mail.`, `ns1.`, `ns2.`, SPF/DKIM/DMARC TXT records (for M3.2).
- Wildcard TLS via Let's Encrypt **DNS-01 against PowerDNS** (Lego's `--dns=pdns` provider); ACME credentials in Infisical at `/prod/orca-proxy/PDNS_API_KEY`.
- Orca-Proxy reloads the cert via watch on the secret file; renewal cron runs at 02:00 daily.
- **Acceptance:** `dig @1.1.1.1 anything.yourplatform.com` returns an answer; `curl https://anything.yourplatform.com` returns 404 from Orca-Proxy (no TLS error).
- **Acceptance:** `dig @1.1.1.1 anything.breakpilot.com` returns an answer; `curl https://anything.breakpilot.com` returns 404 from Orca-Proxy (no TLS error).
- **Tests:** ACME renewal dry-run; PowerDNS zone-diff check in CI; reach via stage and prod subdomains; cert expiry page wired to SigNoz alert.
- **Gate:** standard + manual DNS-delegation check by both founders (irreversible from registrar side without 24–48h propagation)
- **Effort:** M (was S — registrar delegation + PowerDNS adds setup time vs. Cloudflare)
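
The record list in the M0.3 deliverables can be expanded into fully qualified names, which is roughly what a zone-diff check in CI would compare against. A minimal sketch, assuming Python purely for illustration: `expected_names` and `missing_records` are hypothetical helpers, not part of Orca or PowerDNS, and the TXT records (SPF/DKIM/DMARC) are left out.

```python
ZONE = "breakpilot.com"
# Labels taken from the M0.3 record list; "@" stands for the zone apex.
LABELS = ["@", "*", "*.stage", "auth", "erp", "mcp", "cdn", "mail", "ns1", "ns2"]


def expected_names(zone: str = ZONE, labels=LABELS) -> set[str]:
    """Expand relative labels into fully qualified names (no trailing dot)."""
    return {zone if label == "@" else f"{label}.{zone}" for label in labels}


def missing_records(actual: set[str], zone: str = ZONE) -> set[str]:
    """Names the roadmap expects that the deployed zone does not contain."""
    normalised = {name.rstrip(".").lower() for name in actual}
    return expected_names(zone) - normalised
```

A CI job could feed `missing_records` the names exported from the live zone and fail when the result is non-empty; how the names are exported is outside this sketch.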
@@ -210,7 +210,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
### M1.3 — Backups, monitoring, on-call
- **Depends on:** M1.2
- **Repos:** `platform/orca-platform`
- **Deliverables:** backup cron per VM per `INFRASTRUCTURE.md §3` (Postgres pg_dump, MinIO bucket replication); SigNoz OTel collector running on every VM; alert routing to `oncall@yourplatform.com`; restore runbook in `platform/docs/runbooks/restore.md`.
- **Deliverables:** backup cron per VM per `INFRASTRUCTURE.md §3` (Postgres pg_dump, MinIO bucket replication); SigNoz OTel collector running on every VM; alert routing to `oncall@breakpilot.com`; restore runbook in `platform/docs/runbooks/restore.md`.
- **Acceptance:** restore drill on stage succeeds (script in `platform/orca-platform/scripts/restore-drill.sh`); SigNoz shows traces from a synthetic request.
- **Tests:** disaster-recovery exercise per failure scenario in `INFRASTRUCTURE.md §10` — at least Scenarios A, B, F validated on stage.
- **Gate:** standard + manual sign-off
@@ -219,7 +219,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
### M2.1 — Keycloak deployment
- **Depends on:** M1.2, M1.3
- **Repos:** `platform/orca-platform`
- **Deliverables:** Keycloak 26 on `vm-identity`, Postgres backing store on `vm-control`, exposed at `auth.yourplatform.com` and `auth.stage.yourplatform.com`. Realm import file in `orca-platform/keycloak/realm-export.json` (committed, source-of-truth).
- **Deliverables:** Keycloak 26 on `vm-identity`, Postgres backing store on `vm-control`, exposed at `auth.breakpilot.com` and `auth.stage.breakpilot.com`. Realm import file in `orca-platform/keycloak/realm-export.json` (committed, source-of-truth).
- **Acceptance:** master admin login works; realm `breakpilot-prod` exists in both envs.
- **Tests:** automated realm-state diff in CI (`kcadm` against checked-in export).
- **Gate:** standard + security checklist
@@ -247,20 +247,20 @@ Every place where a future flag would gate behaviour MUST flow through a single
- **Depends on:** M0.3 (needs DNS records under our control), M3.1
- **Repos:** `platform/orca-platform`
- **Deliverables:**
- **Stalwart** on `vm-control` (Orca-managed); reachable at `mail.yourplatform.com`.
- DNS records added to the zone in M0.3: `mail` A record, MX → mail, SPF (`v=spf1 mx -all`), DKIM (Stalwart-generated public key), DMARC (`p=quarantine; rua=mailto:dmarc@yourplatform.com`), reverse DNS (PTR) configured at the cloud provider for the vm-control public IP — coordinate with vm-edge since outbound mail must egress from a host with a clean PTR.
- **Stalwart** on `vm-control` (Orca-managed); reachable at `mail.breakpilot.com`.
- DNS records added to the zone in M0.3: `mail` A record, MX → mail, SPF (`v=spf1 mx -all`), DKIM (Stalwart-generated public key), DMARC (`p=quarantine; rua=mailto:dmarc@breakpilot.com`), reverse DNS (PTR) configured at the cloud provider for the vm-control public IP — coordinate with vm-edge since outbound mail must egress from a host with a clean PTR.
- SMTP submission service account per platform sender: `noreply@`, `oncall@`, `support@`, `billing@`, `dmarc@`.
- Outbound queue and bounce handler; failed deliveries surface as audit events.
- Webhook receiver at `/inbound/postmaster` for bounce/complaint feedback loops (Gmail FBL, MS SNDS).
- **IP warming plan**: write a `platform/docs/runbooks/email-warming.md` documenting the 4–8 week ramp from low daily volumes; first 2 weeks of trial nudges (M12.2) explicitly throttled.
- **Acceptance:** test email from `noreply@yourplatform.com` to `parnerkarsharang@gmail.com` lands in inbox (not spam) on day 1; SPF/DKIM/DMARC all "pass" in Gmail's "show original" view; mail-tester.com score ≥ 9/10.
- **Acceptance:** test email from `noreply@breakpilot.com` to `parnerkarsharang@gmail.com` lands in inbox (not spam) on day 1; SPF/DKIM/DMARC all "pass" in Gmail's "show original" view; mail-tester.com score ≥ 9/10.
- **Tests:** automated daily mail-tester check (failure pages on-call); bounce-handling integration test.
- **Gate:** standard + security checklist + manual deliverability sign-off (DKIM keys are load-bearing)
- **Effort:** L (deliverability tuning is the long tail)
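
The SPF and DMARC values listed in the M3.2 deliverables lend themselves to a small policy check before the manual deliverability sign-off. A minimal sketch, assuming Python for illustration; `parse_tags`, `dmarc_matches_plan` and `spf_matches_plan` are hypothetical names. The roadmap abbreviates the DMARC record, while a published record must start with `v=DMARC1` (RFC 7489), so the check requires that tag as well.

```python
def parse_tags(record: str) -> dict[str, str]:
    """Split a DMARC-style 'k=v; k=v' TXT value into a dict of tags."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_matches_plan(record: str, domain: str = "breakpilot.com") -> bool:
    """True if the record carries the policy and report address from M3.2."""
    tags = parse_tags(record)
    return (
        tags.get("v") == "DMARC1"
        and tags.get("p") == "quarantine"
        and tags.get("rua") == f"mailto:dmarc@{domain}"
    )


def spf_matches_plan(record: str) -> bool:
    """True if the record is exactly the 'v=spf1 mx -all' policy from M3.2."""
    return record.split() == ["v=spf1", "mx", "-all"]
```

The functions only compare strings; fetching the TXT records and verifying the DKIM key would still need a DNS lookup and a real test message.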
**Phase 0 exit criteria:**
- Stage cluster boots cold from cron-driven nightly stop/start using only `INFRASTRUCTURE.md §5` ordering.
- A synthetic HTTPS request to `https://hello.stage.yourplatform.com` reaches a stub container.
- A synthetic HTTPS request to `https://hello.stage.breakpilot.com` reaches a stub container.
- Restore drill on stage Postgres succeeds end-to-end.
---
@@ -300,7 +300,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
- **Depends on:** M2.2, M4.3, M0.3
- **Repos:** `platform/portal`, `platform/design-tokens`
- **Deliverables:** Next.js 15 app on `vm-control`; middleware reads `Host` → extracts slug → calls Tenant Registry `GET /tenants?slug=` → injects tenant context; Keycloak OIDC login; logout; `design-tokens` package consumed by portal.
- **Acceptance:** visiting `https://acme.stage.yourplatform.com` redirects to Keycloak; after login, user lands on `/acme/dashboard` (empty page) with valid session.
- **Acceptance:** visiting `https://acme.stage.breakpilot.com` redirects to Keycloak; after login, user lands on `/acme/dashboard` (empty page) with valid session.
- **Tests:** Playwright e2e: login + logout for an existing test tenant.
- **Gate:** standard
- **Effort:** M
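
The `Host` → slug step in the M5.1 middleware can be sketched as a pure function. This is an illustration in Python, not the Next.js middleware itself; `tenant_slug`, the `RESERVED` set and the slug pattern are assumptions. The reserved names are the service hosts mentioned elsewhere in this roadmap, which should not resolve as tenants.

```python
import re

BASE = "breakpilot.com"
# Service hosts named in the roadmap; assumed here to be reserved, not tenant slugs.
RESERVED = {"auth", "erp", "mcp", "cdn", "mail", "ns1", "ns2", "stage"}
# Assumed slug shape: a single DNS label of lowercase letters, digits and hyphens.
SLUG_RE = re.compile(r"^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$")


def tenant_slug(host: str, base: str = BASE):
    """Return (env, slug) for '<slug>.stage.<base>' or '<slug>.<base>', else None."""
    name = host.split(":")[0].lower().rstrip(".")  # drop port and trailing dot
    # Check the longer stage suffix first so it is not mistaken for a prod slug.
    for env, suffix in (("stage", f".stage.{base}"), ("prod", f".{base}")):
        if name.endswith(suffix):
            slug = name[: -len(suffix)]
            if SLUG_RE.match(slug) and slug not in RESERVED:
                return env, slug
            return None
    return None
```

With a slug in hand, the middleware would call Tenant Registry `GET /tenants?slug=` as described in the deliverables; an unknown or reserved host yields `None` and can be rejected before any lookup.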
@@ -317,14 +317,14 @@ Every place where a future flag would gate behaviour MUST flow through a single
### M5.3 — Playwright e2e harness
- **Depends on:** M5.2
- **Repos:** `platform/portal`
- **Deliverables:** Playwright config that runs against `stage.yourplatform.com` post-deploy; CI job `e2e-stage` triggered after stage deploy; failure pages on-call.
- **Deliverables:** Playwright config that runs against `stage.breakpilot.com` post-deploy; CI job `e2e-stage` triggered after stage deploy; failure pages on-call.
- **Acceptance:** breaking change to login is caught in CI within 10 min of merge.
- **Tests:** the suite itself.
- **Gate:** standard
- **Effort:** S
**Phase 1 exit criteria:**
- A tenant created via `POST /tenants` results in a working login flow at `<slug>.stage.yourplatform.com`.
- A tenant created via `POST /tenants` results in a working login flow at `<slug>.stage.breakpilot.com`.
- All Phase 1 routes have a passing Playwright spec running on every stage deploy.
---
@@ -354,7 +354,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
### M6.3 — CERTifAI: manifest + integration assets
- **Depends on:** M6.2
- **Repos:** `benjamin_boenisch/certifai`
- **Deliverables:** `product.manifest.yaml` per `PRODUCT_INTEGRATION_SPEC.md §10` published to `cdn.yourplatform.com`; OpenAPI 3.1 spec; `/v1/health`, `/v1/usage`, `/v1/tenants/:id/export`, `DELETE /v1/tenants/:id/data`, `POST /v1/tenants/demo/reset`; web component `certifai-dashboard` per §5.A.
- **Deliverables:** `product.manifest.yaml` per `PRODUCT_INTEGRATION_SPEC.md §10` published to `cdn.breakpilot.com`; OpenAPI 3.1 spec; `/v1/health`, `/v1/usage`, `/v1/tenants/:id/export`, `DELETE /v1/tenants/:id/data`, `POST /v1/tenants/demo/reset`; web component `certifai-dashboard` per §5.A.
- **Acceptance:** CERTifAI appears in the portal catalog; subscribed tenants can open it from the dashboard.
- **Tests:** contract test that manifest validates against schema; web component renders inside portal shadow-DOM host.
- **Gate:** standard
@@ -391,7 +391,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
### M8.1 — ERPNext deployment
- **Depends on:** M1.2, M2.1
- **Repos:** `platform/orca-platform`
- **Deliverables:** Frappe + ERPNext on `vm-control` (separate Postgres database from tenant_registry — see `INFRASTRUCTURE.md` RISK-1); reached at `erp.yourplatform.com`; Keycloak OIDC; IP-restricted at Orca-Proxy.
- **Deliverables:** Frappe + ERPNext on `vm-control` (separate Postgres database from tenant_registry — see `INFRASTRUCTURE.md` RISK-1); reached at `erp.breakpilot.com`; Keycloak OIDC; IP-restricted at Orca-Proxy.
- **Acceptance:** our login works; a Customer record can be created manually.
- **Tests:** smoke test for OIDC; backup of Frappe filestore validated.
- **Gate:** standard + manual sign-off (touches `vm-control` resources)
@@ -489,7 +489,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
### M12.2 — Trial lifecycle cron + emails
- **Depends on:** M12.1, M3.2 (Stalwart must be deliverability-clean)
- **Repos:** `platform/tenant-registry`
- **Deliverables:** scheduler in tenant-registry that runs day-7/12/14 emails; status transitions trial → active (on payment) or trial → frozen → archived; SMTP via Stalwart at `mail.yourplatform.com:587`; sender `noreply@yourplatform.com`; HTML + plaintext templates committed under `tenant-registry/templates/email/`; List-Unsubscribe headers per RFC 8058.
- **Deliverables:** scheduler in tenant-registry that runs day-7/12/14 emails; status transitions trial → active (on payment) or trial → frozen → archived; SMTP via Stalwart at `mail.breakpilot.com:587`; sender `noreply@breakpilot.com`; HTML + plaintext templates committed under `tenant-registry/templates/email/`; List-Unsubscribe headers per RFC 8058.
- **Acceptance:** in a time-warped stage test (script that advances `trial_ends_at`), all transitions fire in order and all three emails land in Gmail inbox.
- **Tests:** integration test with time injection; deliverability spot-check at each release.
- **Gate:** standard
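
The trial scheduler in M12.2 amounts to a small state machine plus a list of due emails. A minimal sketch, assuming Python for illustration; the function names are hypothetical. It assumes the trial ends on day 14, and `ARCHIVE_AFTER_DAYS` is a placeholder because the roadmap does not say how long a tenant stays frozen before it is archived.

```python
from datetime import date, timedelta

EMAIL_DAYS = (7, 12, 14)   # from the roadmap: day-7/12/14 emails
TRIAL_LENGTH_DAYS = 14     # assumption: the trial ends with the day-14 email
ARCHIVE_AFTER_DAYS = 30    # assumption: grace period is not specified in the roadmap


def emails_due(trial_started: date, today: date, already_sent: set[int]) -> list[int]:
    """Email days that have been reached but not yet sent, in order."""
    elapsed = (today - trial_started).days
    return [d for d in EMAIL_DAYS if d <= elapsed and d not in already_sent]


def next_status(status: str, trial_started: date, today: date, paid: bool) -> str:
    """Apply trial -> active (on payment) or trial -> frozen -> archived."""
    trial_ends_at = trial_started + timedelta(days=TRIAL_LENGTH_DAYS)
    if status == "trial":
        if paid:
            return "active"
        if today >= trial_ends_at:
            return "frozen"
    elif status == "frozen":
        if today >= trial_ends_at + timedelta(days=ARCHIVE_AFTER_DAYS):
            return "archived"
    return status
```

Passing `today` in as an argument is what makes the time-warped stage test straightforward: the test advances the date instead of waiting for the wall clock.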
@@ -498,7 +498,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
### M13.1 — Demo tenant seeding
- **Depends on:** M6.3, M7.2
- **Repos:** `platform/seed-data`
- **Deliverables:** per-product fixture archives (`certifai/seed-v1.tar.gz`, `compliance/seed-v1.tar.gz`); publishing pipeline to `cdn.yourplatform.com`; `catalog.demo.seed_data_url` populated in product manifests.
- **Deliverables:** per-product fixture archives (`certifai/seed-v1.tar.gz`, `compliance/seed-v1.tar.gz`); publishing pipeline to `cdn.breakpilot.com`; `catalog.demo.seed_data_url` populated in product manifests.
- **Acceptance:** calling `POST /v1/tenants/demo/reset` on either product restores fixtures.
- **Tests:** integration test asserts fixture state after reset.
- **Gate:** standard
@@ -508,7 +508,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
- **Depends on:** M2.2, M13.1
- **Repos:** `platform/portal`, `platform/tenant-registry`
- **Deliverables:** demo tenant created in stage and prod with `kind=demo, status=demo`; SALES_REP role usable; backstage routes restricted to `/backstage/leads` and `/backstage/demo`; demo tenant audit events tagged `{"demo": true}` and hidden from real-tenant audit views.
- **Acceptance:** sales rep logs in at `demo.yourplatform.com`, walks both products live, [Request Trial] modal creates a CRM Lead with `sales_owner = the rep`.
- **Acceptance:** sales rep logs in at `demo.breakpilot.com`, walks both products live, [Request Trial] modal creates a CRM Lead with `sales_owner = the rep`.
- **Tests:** Playwright e2e for the sales walk-through.
- **Gate:** standard + security checklist (SALES_REP guardrail enforcement is the load-bearing piece)
- **Effort:** M
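
The two M13.2 guardrails, the SALES_REP route restriction and the hiding of demo audit events, can be sketched as follows. This assumes Python for illustration; the helper names are hypothetical and the event shape (a plain dict) is an assumption, with only the `{"demo": true}` tag and the two route prefixes taken from the deliverables.

```python
SALES_REP_ROUTES = ("/backstage/leads", "/backstage/demo")


def sales_rep_may_access(path: str) -> bool:
    """Allow only the two backstage routes and their sub-paths."""
    return any(path == r or path.startswith(r + "/") for r in SALES_REP_ROUTES)


def tag_event(event: dict, tenant_kind: str) -> dict:
    """Return a copy of the event, tagged {"demo": True} for demo tenants."""
    tagged = dict(event)
    if tenant_kind == "demo":
        tagged["demo"] = True
    return tagged


def real_tenant_view(events: list[dict]) -> list[dict]:
    """Audit view for real tenants: drop anything tagged as demo."""
    return [e for e in events if not e.get("demo", False)]
```

Matching on `r + "/"` instead of a bare prefix keeps a path such as `/backstage/demo-admin` from slipping through the guard.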
@@ -543,7 +543,7 @@ Every place where a future flag would gate behaviour MUST flow through a single
**Phase 4 exit criteria:**
- Every flow P1–P16 from `PLATFORM_ARCHITECTURE.md` has a passing Playwright spec.
- Stage runs a full lifecycle: sign-up trial → convert → use → cancel → offboard, in an automated nightly job.
- We can hand a prospect a real demo using `demo.yourplatform.com`.
- We can hand a prospect a real demo using `demo.breakpilot.com`.
---
@@ -581,8 +581,8 @@ Every place where a future flag would gate behaviour MUST flow through a single
### M17.1 — MCP servers (Enterprise)
- **Depends on:** M6.3, M7.2
- **Repos:** `benjamin_boenisch/certifai`, `benjamin_boenisch/breakpilot-compliance`
- **Deliverables:** MCP endpoints per `PRODUCT_INTEGRATION_SPEC.md §10` `mcp:` block; gated on `plan == enterprise`; routed via `mcp.yourplatform.com`.
- **Acceptance:** Claude Code can connect to `mcp.yourplatform.com/certifai` with a service token and call `list_ai_agents`.
- **Deliverables:** MCP endpoints per `PRODUCT_INTEGRATION_SPEC.md §10` `mcp:` block; gated on `plan == enterprise`; routed via `mcp.breakpilot.com`.
- **Acceptance:** Claude Code can connect to `mcp.breakpilot.com/certifai` with a service token and call `list_ai_agents`.
- **Tests:** MCP contract test using `mcp-cli`.
- **Gate:** standard + security checklist
- **Effort:** L
@@ -703,7 +703,7 @@ That's 18 milestones. With one full-time agent and standard human review pacing,
- ~~Cloudflare account ownership~~ → not used; DNS is self-hosted via PowerDNS on vm-edge (M0.3). Registrar account (Benjamin's) still needs documented 2FA recovery — see new DR item below.
**Still open:**
- **CDN host** for `cdn.yourplatform.com`: self-hosted MinIO + Caddy on vm-edge is the OSS-aligned default; alternative is BunnyCDN (cheap, EU). Decide before M6.3 (manifest bundles + hero images).
- **CDN host** for `cdn.breakpilot.com`: self-hosted MinIO + Caddy on vm-edge is the OSS-aligned default; alternative is BunnyCDN (cheap, EU). Decide before M6.3 (manifest bundles + hero images).
- **Cloud provider for port 25 outbound.** Stalwart needs unblocked port 25 to send mail. Hetzner blocks by default and requires a request to unblock with proof of intent + abuse contact; OVH and Scaleway unblock on request faster. Confirm with Benjamin which provider vm-control runs on. Block M3.2 if port 25 cannot be unblocked; the fallback is sending via a different provider's IP with reverse DNS.
- **Test data privacy.** The demo tenant must contain ONLY synthetic data — confirm seed pipeline strips real PII even if our test orgs accidentally seed from prod.
- **Registrar + DNS bus-factor.** Document who owns the registrar account, who has 2FA recovery codes, and the procedure to update NS records without that person available. Goes in `platform/docs/runbooks/dr.md` before M0.3 ships.