Platform Architecture — B2B Customer Portal
Status: Design Draft
Authors: Sharang, Benjamin
Date: 2026-05-11
1. Vision
We sell CERTifAI and breakpilot-compliance as modular B2B building blocks. Customers buy one or both and operate them inside a unified customer portal — without needing to understand that they are separate products under the hood.
Each customer is a tenant: fully isolated data, their own user base, their own identity configuration. We manage all tenants from a single operator backstage.
ERPNext runs our company: CRM, sales orders, invoicing, HR. Frappe Helpdesk runs customer support. Gitea runs engineering. Everything else — Keycloak, all product services, all databases — runs on our own infrastructure managed by Orca.
2. Products in Scope
| Product | What it is |
|---|---|
| CERTifAI | Self-hosted GDPR-compliant AI admin dashboard. Manages LLMs, AI agents, MCP servers, usage analytics. Built with Rust/Dioxus. |
| breakpilot-compliance | GDPR and AI-Act compliance automation. Covers DSFA, VVT, TOM, DSR, AI Act, risk, vendor, incidents. Built with Python/FastAPI + Go AI SDK + Next.js. |
Out of scope: breakpilot-dataroom, breakpilot-lehrer, breakpilot-pitch-deck.
3. The Four Planes
╔══════════════════════════════════════════════════════════════════╗
║ PLANE 1 — IDENTITY (logical root, all auth flows through here) ║
╚══════════════════════════════════════════════════════════════════╝
↓ JWT
╔══════════════════════════════════════════════════════════════════╗
║ PLANE 2 — CONTROL (portal + ERPNext + tenant registry) ║
╚══════════════════════════════════════════════════════════════════╝
↓ tenant-scoped API calls
╔══════════════════════════════════════════════════════════════════╗
║ PLANE 3 — DATA (CERTifAI + breakpilot-compliance) ║
╚══════════════════════════════════════════════════════════════════╝
↓ everything runs on
╔══════════════════════════════════════════════════════════════════╗
║ PLANE 4 — INFRA (Orca + VMs + Gitea + Infisical + LiteLLM) ║
╚══════════════════════════════════════════════════════════════════╝
4. Plane 1 — Identity
Technology: Keycloak 26, single realm (breakpilot-prod)
Keycloak is the single source of truth for identity. Every other service validates JWTs issued here; nothing else handles auth logic.
Structure
Realm: breakpilot-prod
│
├── Organizations (one per B2B customer)
│ ├── Acme Corp → org_id: uuid-acme
│ ├── BayernAG → org_id: uuid-bayernag
│ └── ...
│
├── Organization Roles (what a user can do within their company)
│ ├── IT_ADMIN — full portal access, user management, IdP config
│ ├── CXO — dashboard, billing, audit (read)
│ ├── FINANCE — billing, invoices
│ ├── LEGAL — audit log, compliance read
│ └── USER — product access only
│
├── Realm Roles (what we, the operators, can do)
│ ├── BREAKPILOT_ADMIN — full backstage, impersonation, demo tenant edit
│ ├── SUPPORT_ENGINEER — read backstage, limited impersonation
│ └── SALES_REP — demo tenant login, CRM read, NO real-tenant access
│
└── Identity Provider Brokering (per org, optional)
├── OIDC (Okta, Google Workspace, any OIDC provider)
└── SAML (Azure AD, ADFS, any SAML 2.0 provider)
JWT Structure
Every service receives a JWT containing:
sub — user UUID
email — user email
org_id — customer tenant UUID (= Keycloak org ID)
org_name — human-readable company name
org_roles — [IT_ADMIN, USER, ...] roles within their org
realm_roles — [customer] | [BREAKPILOT_ADMIN] | [SUPPORT_ENGINEER] | [SALES_REP]
products — [certifai, compliance] entitlements (injected by protocol mapper)
plan — starter | professional | enterprise
iss — https://auth.breakpilot.com/realms/breakpilot-prod
The products and plan claims are added by a Keycloak protocol mapper that reads live entitlements from the Tenant Registry at token issuance. Products do not need to call back to the registry on every request.
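A minimal sketch of how a product might consume these claims, assuming the token's signature and expiry have already been verified against Keycloak's JWKS and the payload decoded into a dict. The names `authorize_product`, `TenantContext`, and `AccessDenied` are hypothetical and only illustrate the check.

```python
from dataclasses import dataclass
from typing import Tuple

# Issuer value taken from the claim list above.
EXPECTED_ISSUER = "https://auth.breakpilot.com/realms/breakpilot-prod"


class AccessDenied(Exception):
    """Raised when verified claims do not permit the request."""


@dataclass(frozen=True)
class TenantContext:
    user_id: str
    org_id: str
    org_roles: Tuple[str, ...]
    plan: str


def authorize_product(claims: dict, product: str) -> TenantContext:
    """Check already-verified claims for one product (hypothetical helper)."""
    if claims.get("iss") != EXPECTED_ISSUER:
        raise AccessDenied("unexpected issuer")
    user_id, org_id = claims.get("sub"), claims.get("org_id")
    if not user_id or not org_id:
        raise AccessDenied("token is missing sub or org_id")
    # Entitlements come from the token, so no registry call is needed here.
    if product not in claims.get("products", []):
        raise AccessDenied(f"tenant is not entitled to {product}")
    return TenantContext(
        user_id=user_id,
        org_id=org_id,
        org_roles=tuple(claims.get("org_roles", [])),
        plan=claims.get("plan", ""),
    )
```

Because entitlements are read at token issuance, a change in the registry would only become visible once the user's token is refreshed.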
5. Plane 2 — Control
Three distinct services. Clear separation of responsibility.
5a. Customer Portal
Technology: Next.js 15 (new service)
Deployed at: *.breakpilot.com via Orca-Proxy wildcard routing
The front door for all customers and for us. Owns no business logic — it is a routing, auth, and UI layer.
Subdomain routing:
- DNS wildcard *.breakpilot.com → Orca-Proxy
- Orca-Proxy reads the Host header → routes all traffic to the portal container
- Portal reads Host → extracts tenant slug → looks up Tenant Registry
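The slug extraction step could look roughly like the sketch below. It assumes tenants are always a single label under breakpilot.com and that the infrastructure hostnames (auth, erp, git, secrets) must never resolve to a tenant; the function name `tenant_slug_from_host` is hypothetical.

```python
from typing import Optional

BASE_DOMAIN = "breakpilot.com"
# Infrastructure hosts that are routed elsewhere and are not tenants (assumption).
RESERVED = {"auth", "erp", "git", "secrets"}


def tenant_slug_from_host(host: str) -> Optional[str]:
    """Return the tenant slug for hosts like acme.breakpilot.com, else None."""
    # Drop an optional port, normalise case, and strip a trailing dot.
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    suffix = "." + BASE_DOMAIN
    if not hostname.endswith(suffix):
        return None
    slug = hostname[: -len(suffix)]
    # Only a single label counts as a tenant; a.b.breakpilot.com is rejected.
    if not slug or "." in slug or slug in RESERVED:
        return None
    return slug
```

The returned slug is only a lookup key; whether the tenant exists is still decided by the Tenant Registry.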
Customer area (requires valid JWT for their org):
/[slug]/dashboard product tiles, usage summary, activity
/[slug]/catalog browse ALL products, subscribed and not (upgrade/upsell flow)
/[slug]/products/
/certifai CERTifAI product area (subscribed only)
/compliance breakpilot-compliance area (subscribed only)
/[slug]/projects optional sub-tenancy: dev/staging/prod separation [IT_ADMIN]
/[slug]/settings/
/identity IdP configuration [IT_ADMIN]
/users invite, roles, deactivate [IT_ADMIN]
/api-keys API keys for integrations [IT_ADMIN]
/integrations webhooks, process hooks
/[slug]/billing/ plan, usage, invoices [FINANCE, CXO, IT_ADMIN]
/[slug]/audit/ platform + product audit, filterable by product [LEGAL, IT_ADMIN]
/[slug]/support/ Frappe HD customer portal [all roles]
Operating principles (borrowed from AWS/Azure/GCP consoles):
1. Role-based UI hiding
The portal NEVER shows a button, link, or section the user cannot use.
Disabled-with-tooltip is also wrong — hide it. The customer's mental model
should be "the portal shows me what I can do," not "the portal teases me."
2. Browse before buy
/catalog shows every product available on the platform with description,
pricing tier, and a one-click "Request" CTA — even for products the
customer is not subscribed to. Drives organic upsell instead of
requiring sales touchpoints.
3. Hierarchy: Tenant → Project (optional) → Resources
A tenant can have multiple projects (e.g., "Production", "Staging").
Products that support project scoping isolate data per project.
Customers who do not need this operate as a single default project.
Mirrors GCP Project / AWS Account / Azure Resource Group pattern.
4. Cross-product activity log
/audit shows portal events AND every product's audit events filtered
by tenant. Filterable by product, actor, action, time range. One log
to satisfy DPO inquiries instead of hunting per-product.
5. Cost and usage as first-class
Billing page is not just "your invoice." Shows live usage per product,
trend over time, and projected next invoice. Removes "bill shock."
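Principle 1 (role-based UI hiding) can be sketched as a filter over the customer-area sections. The role sets below mirror the bracketed annotations in the route list; treating unannotated sections as visible to every role is an assumption, and `visible_sections` is a hypothetical helper.

```python
# Section -> roles allowed to see it. None means every role may see it
# (assumption for sections that carry no role annotation in the route list).
SECTION_ROLES = {
    "dashboard": None,
    "catalog": None,
    "projects": {"IT_ADMIN"},
    "settings/identity": {"IT_ADMIN"},
    "settings/users": {"IT_ADMIN"},
    "settings/api-keys": {"IT_ADMIN"},
    "billing": {"FINANCE", "CXO", "IT_ADMIN"},
    "audit": {"LEGAL", "IT_ADMIN"},
    "support": None,
}


def visible_sections(org_roles):
    """Return the sections to render, in menu order; everything else is hidden."""
    roles = set(org_roles)
    return [
        section
        for section, allowed in SECTION_ROLES.items()
        if allowed is None or roles & allowed
    ]
```

Hidden sections are simply absent from the result, so the UI has nothing to render as disabled. The same table would still need to be enforced server-side on each route.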
Backstage (access by realm role):
- BREAKPILOT_ADMIN: everything below
- SUPPORT_ENGINEER: read all + impersonation, no create/delete
- SALES_REP: /backstage/leads, /backstage/demo, own CRM activity only; CANNOT load any other /backstage/tenants/[id] route
/backstage/dashboard MRR, active tenants, system health
/backstage/tenants/
/new create customer
/[id]/overview health, logins, API volume
/[id]/products enable/configure products
/[id]/users view members, impersonate
/[id]/billing Stripe + ERPNext view
/[id]/support tickets for this customer
/[id]/audit full audit trail
/backstage/system/
/health all service health
/incidents incident log
/releases deployment history
5b. ERPNext
Technology: Frappe + ERPNext (self-hosted via Orca)
Access: erp.breakpilot.com — us only (IP-restricted at Orca-Proxy)
Auth: Keycloak OIDC — we log in with our existing accounts, no separate password
ERPNext is our business operations backbone. We do not build CRM, invoicing, or HR — we configure ERPNext for these.
| ERPNext Module | Used for |
|---|---|
| CRM | Leads, opportunities, deal pipeline |
| Sales | Quotations, Sales Orders (= contracts) |
| Accounts | Sales Invoices, payment tracking, DATEV export |
| Buying | Our own SaaS costs, infrastructure invoices |
| HR | Sharang + Benjamin as employees, expense claims |
| Support (Frappe HD) | Customer tickets, SLA, escalation to Gitea |
Integration with the platform:
- ERPNext Customer record has a custom field tenant_id linking to the Tenant Registry
- When a Sales Order is submitted in ERPNext → webhook → Tenant Registry /tenants/{id}/activate
- Portal billing page reads invoices from ERPNext REST API server-side; customers never log into ERPNext directly
- We (founders) create quotations, orders, and invoices inside ERPNext
5c. Tenant Registry
Technology: Go service (new), PostgreSQL schema tenant_registry
The glue between Keycloak, ERPNext, and the products. The technical source of truth for "what is this tenant, what do they have access to, how are they configured."
Key data it holds:
tenants id, slug, name, erp_customer_id, stripe_cust_id,
status, plan, trial/contract dates, sales_owner,
kind (real | demo).
status ∈ {demo, trial, active, frozen, archived}.
demo — shared demo tenant; reset nightly; no billing
trial — real customer in their N-day evaluation window
active — paid, contract or self-serve plan
frozen — read-only after cancel / non-payment (30d grace)
archived — data export window closed; only audit log retained
tenant_projects OPTIONAL sub-tenancy. id, tenant_id, name, slug,
status. Customers without need operate as a single
implicit "default" project. Products opt in via
manifest (supports_projects: true) and accept an
optional project_id parameter on tenant-scoped APIs.
Mirrors GCP Project / AWS Account pattern.
tenant_products tenant ↔ product, enabled, config (litellm_url,
max_seats, modules_enabled), expires_at
tenant_idp_config type (oidc/saml), metadata, verified
audit_log every portal AND product action: who, what, when,
from where, including impersonations. Indexed for
cross-product search (filter by tenant + product +
actor + action + time). Schema is Retraced-compatible
so we can swap implementation without changing
producers (see PRODUCT_INTEGRATION_SPEC.md §8.4).
api_keys portal-owned. tenant_id, product, scopes, name,
hash, created_by, last_used_at, revoked_at.
Headless products call /internal/api-keys/verify
to validate inbound keys. Single source of truth
across all products.
Links:
- tenant.id = Keycloak org_id (one-to-one)
- tenant.erp_customer_id = ERPNext Customer.name (one-to-one)
- tenant.stripe_cust_id = Stripe Customer ID (self-serve billing only)
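The status values on the tenants table form a small state machine. The sketch below encodes the transitions that appear in the trial and cancellation flows (P15, P16); the table is an interpretation of those flows, not a confirmed schema constraint, and `transition` is a hypothetical helper.

```python
# Allowed status transitions, read off flows P15 and P16 (interpretation).
ALLOWED_TRANSITIONS = {
    "demo": frozenset(),                         # demo never transitions
    "trial": frozenset({"active", "frozen"}),    # convert or expire
    "active": frozenset({"frozen"}),             # cancel or non-payment
    "frozen": frozenset({"active", "archived"}), # reactivate or grace period ends
    "archived": frozenset(),                     # terminal
}


class InvalidTransition(ValueError):
    """Raised when a status change is not part of the lifecycle."""


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the change is not allowed."""
    if current not in ALLOWED_TRANSITIONS:
        raise InvalidTransition(f"unknown status: {current}")
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target
```

Keeping the table in one place would let the registry reject an out-of-order webhook, for example an activation that arrives for an archived tenant.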
5d. Demo Tenant (Shared)
Slug: demo — reachable at demo.breakpilot.com
Status: demo (never transitions; never billed)
Owner: us (BREAKPILOT_ADMIN curates content; SALES_REP reads + logs in)
A single, shared tenant pre-seeded with realistic-but-fake data covering CERTifAI + breakpilot-compliance. Sales reps use it to walk prospects through the product live. Prospects do NOT log in directly — the sales rep drives the screen.
How it differs from a real tenant:
DEMO TENANT REAL TENANT
───────────────────────────────────── ───────────────────────────────────
status = demo status = trial | active
billing disabled billing active
audit emitted but not exported audit emitted and exportable
nightly reset job restores fixtures data is permanent
seed data loaded on reset: customer-owned data
product.manifest.seed_data_url
all real-tenant flows work otherwise same flows, same code paths
Why shared and not per-prospect:
- Cheap (one tenant, no Orca provisioning per prospect)
- Predictable (sales reps know exactly what's in there)
- Familiar: a shared demo is a known-quantity model that works in practice and matches our experience
- Tradeoff accepted: concurrent edits during the same day are visible across demo sessions; the nightly reset clears them within 24h
Nightly reset:
- Cron job (03:00 Europe/Berlin) calls each product's /v1/tenants/demo/reset endpoint
- Product fetches its fixtures from catalog.seed_data_url and restores
- Reset is itself an audit event; failures page the on-call
5e. Frappe Helpdesk + Gitea Issues
Technology: Frappe HD (installed on same Frappe bench as ERPNext), Gitea Issues
Support flow:
- Customer submits ticket via /[slug]/support/ (Frappe HD customer portal, embedded or linked)
- Agent (us) triages in Frappe HD agent UI at erp.breakpilot.com
- If technical: agent clicks "Escalate to Engineering" → Frappe server script creates a Gitea issue in the relevant repo via Gitea REST API → issue URL stored on ticket
- When Gitea issue is closed → Gitea webhook → Frappe HD → ticket marked "Resolved"
6. Plane 3 — Data
CERTifAI
Self-hosted GDPR-compliant AI dashboard. Once the planned tenant-scoping changes land, it is fully tenant-aware.
Multi-tenancy: All MongoDB queries scoped by org_id from JWT
Auth: Validates Keycloak JWT (JWKS endpoint), maps org_roles to product roles
LiteLLM: Shared managed instance (Starter/Professional, API key per tenant) or customer-hosted (Enterprise, URL stored in tenant_products.config)
Role mapping:
| Portal role | CERTifAI role |
|---|---|
| IT_ADMIN | Admin |
| CXO, USER | Member |
| FINANCE, LEGAL | Viewer |
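The table can be expressed as a lookup. The table does not say what happens when a user holds several portal roles, so the sketch below assumes the most privileged CERTifAI role wins; the ranking and the name `certifai_role` are hypothetical.

```python
from typing import Iterable, Optional

# Mapping copied from the role table above.
PORTAL_TO_CERTIFAI = {
    "IT_ADMIN": "Admin",
    "CXO": "Member",
    "USER": "Member",
    "FINANCE": "Viewer",
    "LEGAL": "Viewer",
}
# Privilege order used to break ties (assumption, not stated in the table).
RANK = {"Viewer": 0, "Member": 1, "Admin": 2}


def certifai_role(org_roles: Iterable[str]) -> Optional[str]:
    """Map portal org_roles to one CERTifAI role, or None if nothing maps."""
    mapped = [PORTAL_TO_CERTIFAI[r] for r in org_roles if r in PORTAL_TO_CERTIFAI]
    if not mapped:
        return None
    return max(mapped, key=RANK.__getitem__)
```

Returning None for an empty or unknown role list lets the caller deny access instead of falling back to a default role.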
breakpilot-compliance
GDPR and AI-Act compliance automation platform. After updates, tenant identity comes from validated JWT — not raw client headers.
Multi-tenancy: All PostgreSQL queries scoped by tenant_id (= org_id from JWT)
Auth: Next.js proxy validates JWT → extracts org_id → sets X-Tenant-ID
Role mapping: LEGAL can approve DSFA; IT_ADMIN is compliance admin; USER contributes to DSR/VVT workflows
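The proxy step matters because a client could otherwise send its own X-Tenant-ID. A minimal sketch of the header handling, assuming the JWT has already been verified and decoded; `forward_headers` is a hypothetical name and the real proxy is a Next.js service, so this only illustrates the rule.

```python
def forward_headers(inbound: dict, verified_claims: dict) -> dict:
    """Build upstream headers with X-Tenant-ID taken only from the verified JWT."""
    # Drop any client-supplied tenant header, whatever its casing.
    headers = {k: v for k, v in inbound.items() if k.lower() != "x-tenant-id"}
    org_id = verified_claims.get("org_id")
    if not org_id:
        raise PermissionError("verified token has no org_id")
    headers["X-Tenant-ID"] = org_id
    return headers
```

The backend can then trust X-Tenant-ID only if it is reachable exclusively through this proxy.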
7. Plane 4 — Infra
Orchestration: Orca manages all containers on Hetzner VMs
Secrets: Infisical — every service has a machine identity, pulls its own secrets at startup
CI/CD: Gitea Actions → Docker build → push to private registry → Orca redeploy webhook
Routing: Orca-Proxy handles all TLS termination and subdomain routing
Orca-Proxy routing table:
auth.breakpilot.com → Keycloak
erp.breakpilot.com → ERPNext + Frappe HD (IP-restricted)
git.breakpilot.com → Gitea
secrets.breakpilot.com → Infisical (IP-restricted)
*.breakpilot.com → customer-portal (wildcard, Host → tenant)
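The table implies that exact hostnames take precedence over the wildcard. How Orca-Proxy actually resolves this is not specified here, so the sketch below is an assumption: it checks entries in order and lets the wildcard match exactly one label. `resolve_upstream` is a hypothetical name.

```python
from typing import Optional

# Ordered as in the routing table: exact hosts first, wildcard last.
ROUTES = [
    ("auth.breakpilot.com", "keycloak"),
    ("erp.breakpilot.com", "erpnext"),
    ("git.breakpilot.com", "gitea"),
    ("secrets.breakpilot.com", "infisical"),
    ("*.breakpilot.com", "customer-portal"),
]


def resolve_upstream(host: str) -> Optional[str]:
    """Return the upstream name for a Host header value, or None if unrouted."""
    hostname = host.split(":", 1)[0].lower()
    for pattern, upstream in ROUTES:
        if pattern.startswith("*."):
            suffix = pattern[1:]  # ".breakpilot.com"
            if hostname.endswith(suffix):
                label = hostname[: -len(suffix)]
                # The wildcard covers a single label only (assumption).
                if label and "." not in label:
                    return upstream
        elif hostname == pattern:
            return upstream
    return None
```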
Services managed by Orca:
Identity & Auth
└── Keycloak 26
Business Operations
├── ERPNext (Frappe)
└── Frappe Helpdesk
Developer Tooling
├── Gitea
└── Gitea Runner
Secrets
└── Infisical
AI Inference
└── LiteLLM (shared, API key per tenant; or customer-hosted for Enterprise)
Customer Portal
├── customer-portal (new)
└── tenant-registry (new)
Products
├── certifai-dashboard
└── breakpilot-compliance stack
├── backend-compliance (Python/FastAPI)
├── ai-compliance-sdk (Go)
└── admin-compliance (Next.js)
Data Stores
├── PostgreSQL 17 [schemas: tenant_registry, compliance]
├── MongoDB [CERTifAI]
├── Qdrant [compliance RAG]
└── MinIO [compliance documents]
Infisical secret namespacing:
/prod/
/keycloak/ DB_PASS, ADMIN_PASS, REALM_KEYS
/erpnext/ DB_PASS, SMTP_PASS, OIDC_CLIENT_SECRET
/customer-portal/ KEYCLOAK_CLIENT_SECRET, ERP_API_KEY, REGISTRY_DB_URI
/tenant-registry/ POSTGRES_URI, KEYCLOAK_ADMIN_SECRET, ERP_API_KEY, STRIPE_SECRET
/certifai/ MONGODB_URI, KEYCLOAK_CLIENT_SECRET, LITELLM_MASTER_KEY
/compliance/ POSTGRES_URI, QDRANT_API_KEY, MINIO_KEYS, ANTHROPIC_API_KEY
/litellm/ OPENAI_API_KEY, ANTHROPIC_API_KEY, MASTER_KEY
/gitea-runner/ DOCKER_REGISTRY_PASS, ORCA_WEBHOOK_TOKEN
8. Process Sketches
P1 — New Customer Onboarding (Sales-Led)
US (ERPNext) TENANT REGISTRY KEYCLOAK
│ │ │
│ Lead → Opportunity │ │
│ → Quotation (PDF sent) │ │
│ → Sales Order submitted │ │
│─────── webhook ─────────────►│ │
│ │ create org ─────────►│
│ │◄──── org_id ──────────│
│ │ write tenant row │
│ │ write tenant_products│
│ │ send welcome email │
│ │ │ │
│ │ ▼ │
│ IT ADMIN receives email │
│ clicks setup link │
│ │ ┌───────┤
│ │ │set pw │
│ │ │ 2FA │
│ │ └───┬───┘
│ │ │
│ lands on /acme/dashboard │
│ │ │
P2 — User Login (Customer's Own IdP)
USER ORCA-PROXY PORTAL KEYCLOAK CUSTOMER IdP
│ │ │ │ │
│ acme.breakpilot.com │ │ │ │
│───────────────────────►│ │ │ │
│ │ Host=acme.* │ │ │
│ │───────────────►│ │ │
│ │ │ slug=acme │ │
│ │ │ lookup tenant │ │
│ │ │ → idp=acme-okta│ │
│ │ │─── redirect ──►│ │
│ │ │ kc_idp_hint │ │
│ │ │ │─── redirect ──►│
│ │ │ │ │
│ │◄─────────────────────── auth ──┤ │
│ │ │ │ issue JWT │
│ │ │◄── JWT ────────│ │
│◄── /acme/dashboard ────┤ │ │ │
P3 — User Login (Our IdP — email + password)
USER PORTAL KEYCLOAK
│ │ │
│ acme.breakpilot │ │
│──────────────────►│ │
│ │ redirect + PKCE │
│ │─────────────────►│
│◄── Keycloak login page ─────────────┤
│ enter email + password (+ TOTP) │
│─────────────────────────────────────►│
│ │◄── JWT ──────────│
│◄── /acme/dashboard┤ │
P4 — IT Admin Configures External IdP
IT ADMIN PORTAL TENANT REGISTRY KEYCLOAK
│ │ │ │
│ /settings/ │ │ │
│ identity │ │ │
│───────────────►│ │ │
│ fill OIDC/ │ │ │
│ SAML details │ │ │
│───────────────►│ │ │
│ │── PATCH idp_config►│ │
│ │ │── create IdP ────►│
│ │ │ for org │
│ │ │◄── ok ────────────│
│ │ │ verified=true │
│◄── "Test" btn ─┤ │ │
│ auth popup ───────────────────────────────────────────►│
│◄── success ────────────────────────────────────────────┤
│◄── "IdP configured" ┤ │ │
P5 — IT Admin Invites a Team Member
IT ADMIN PORTAL KEYCLOAK NEW USER
│ │ │ │
│ /settings/ │ │ │
│ users → invite│ │ │
│ email + role │ │ │
│───────────────►│ │ │
│ │ create user in org│ │
│ │──────────────────►│ │
│ │ │ send invite │
│ │ │ email ────────►│
│ │ │ │ click link
│ │ │◄── set pw ─────│
│ │ │ (+ TOTP) │
│ │ │ issue JWT │
│ │◄─── JWT ──────────│ │
│ │ │ ┌──────┘
│ │ │ lands on│
│ │ │ /acme/dashboard
│ │ │ (role-filtered view)
P6 — Customer Accesses a Product
USER PORTAL KEYCLOAK PRODUCT (e.g. CERTifAI)
│ │ │ │
│ /acme/products/ │ │ │
│ certifai │ │ │
│─────────────────►│ │ │
│ │ check JWT: │ │
│ │ products claim │ │
│ │ includes │ │
│ │ "certifai" ? │ │
│ │ │ │
│ [YES] ────────┤ │ │
│ │ pass JWT ──────────────────────── │
│ │ │ validate JWKS │
│ │ │ extract org_id │
│ │ │ scope all data │
│◄── product UI ───┤ │ │
│ │ │ │
│ [NO] ─────────┤ │ │
│◄── "Not in your plan" + upgrade CTA │
P7 — Finance User Views Billing
FINANCE USER PORTAL ERPNEXT API STRIPE API
│ │ │ │
│ /acme/billing│ │ │
│──────────────►│ │ │
│ │ role check: │ │
│ │ FINANCE → ok │ │
│ │ │ │
│ │── fetch invoices►│ │
│ │◄── invoice list ─│ │
│ │ │ │
│ │── fetch usage ────────────────────► │
│ │◄── usage data ────────────────────── │
│ │ │ │
│◄── billing page renders │ │
│ plan · usage · invoices │ │
│ [Download PDF] ──────────────►│ │
│◄── PDF streamed ─────────────────│ │
│ │ │
│ [Upgrade Plan] │ │
│──────────────►│ │ │
│ │ create CRM task in ERPNext │
│ │─────────────────►│ │
│ │ notify us (email/ERPNext task) │
│◄── "We'll be in touch" ──┤ │ │
P8 — Legal User Exports Audit Report
LEGAL USER PORTAL TENANT REGISTRY COMPLIANCE PRODUCT
│ │ │ │
│ /acme/audit │ │ │
│──────────────►│ │ │
│ │ role check: │ │
│ │ LEGAL → ok │ │
│ │ │ │
│ │── platform audit ──────────────────►
│ │ (who logged in, role changes, │
│ │ IdP changes, impersonations) │
│ │◄── audit_log rows ─────────────────┤
│ │ │ │
│ │── compliance audit ────────────────►
│ │ (DSFA approvals, DSR processing, │
│ │ TOM completions) │
│ │◄── compliance audit rows ──────────┤
│ │ │ │
│ [Export] │ │ │
│──────────────►│ │ │
│◄── ZIP: │ │ │
│ platform-audit.csv │ │
│ compliance-audit.pdf │ │
P9 — Support Ticket Escalated to Engineering
CUSTOMER FRAPPE HD US (AGENT) GITEA
│ │ │ │
│ submit ticket │ │ │
│ via /support/ │ │ │
│─────────────────►│ │ │
│ │── notify agent ─►│ │
│ │ │ triage ticket │
│ │ │ → technical bug │
│ │ │ │
│ │ │ [Escalate] │
│ │◄─────────────────│ │
│ │ server script: │ │
│ │ POST /issues ────────────────────► │
│ │ │ {title, body, │
│ │ │ labels:[bug]} │
│ │◄──── issue URL ───────────────────┤
│ │ store on ticket │ │
│◄── "Escalated to engineering, we'll update you" ─────┤
│ │ │ │
│ │ │ dev fixes it │
│ │ │ closes issue ─►│
│ │◄──────── webhook ──────────────────│
│ │ ticket → Resolved│ │
│◄── notification ─│ │ │
P10 — We Create a New Customer (Startup Flow)
US (BACKSTAGE) TENANT REGISTRY KEYCLOAK ERPNEXT IT ADMIN
│ │ │ │ │
│ /backstage/ │ │ │ │
│ tenants/new │ │ │ │
│ fill: name, │ │ │ │
│ contact, plan, │ │ │ │
│ products │ │ │ │
│──── [Create] ────►│ │ │ │
│ │── create org ───►│ │ │
│ │◄── org_id ───────│ │ │
│ │── create Customer ──────────────►│ │
│ │◄── erp_customer_id ──────────────│ │
│ │ write tenant rows │ │
│ │ send welcome email ─────────────────────────── │
│◄── tenant created ┤ │ │ │
│ "Awaiting setup"│ │ │ │
│ │ │ │ click link│
│ │ │◄── set pw ────────────────── │
│ │ │ + 2FA │ │
│ │ │ JWT issued │ │
│ │◄─────────────────────────────────────── /acme/ ─┤
P11 — We Debug a Customer Issue (Impersonation)
US (BACKSTAGE) TENANT REGISTRY KEYCLOAK PORTAL (AS CUSTOMER)
│ │ │ │
│ /backstage/ │ │ │
│ tenants/acme/ │ │ │
│ users → │ │ │
│ Impersonate Alice│ │ │
│──────────────────►│ │ │
│ │ write audit_log │ │
│ │ {action: │ │
│ │ impersonate, │ │
│ │ actor: sharang, │ │
│ │ target: alice} │ │
│ │── request token ►│ │
│ │◄── imp. token ───│ │
│◄── token ─────────┤ (30min, signed, │ │
│ │ impersonated_by │ │
│ │ claim) │ │
│ │ │
│ new tab: acme.breakpilot.com │ │
│──────────────────────────────────────────────────────────►│
│ │ [orange banner] │
│ │ Impersonating │
│ │ alice@acme.com │
│ │ 29:47 remaining │
│ reproduce issue, identify root cause │ │
│──────────────────────────────────────────────────────────►│
│ │ [Exit impersonation]
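The impersonation token in P11 carries a 30-minute lifetime and an impersonated_by claim. The sketch below shows only the claim shape and the countdown shown in the banner; in the flow above Keycloak issues and signs the token, and the helper names are hypothetical.

```python
import time

IMPERSONATION_TTL_SECONDS = 30 * 60  # 30 minutes, as in the flow above


def impersonation_claims(actor: str, target_sub: str, org_id: str, now=None) -> dict:
    """Claim set for an impersonation token (signing is out of scope here)."""
    issued = int(time.time() if now is None else now)
    return {
        "sub": target_sub,
        "org_id": org_id,
        "impersonated_by": actor,
        "iat": issued,
        "exp": issued + IMPERSONATION_TTL_SECONDS,
    }


def banner_remaining(claims: dict, now: float) -> str:
    """Format the time left as MM:SS for the impersonation banner."""
    remaining = max(0, claims["exp"] - int(now))
    return f"{remaining // 60:02d}:{remaining % 60:02d}"
```

Thirteen seconds after issuance the banner would read 29:47, matching the diagram.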
P12 — ERPNext Sales Order Activates a Tenant
US (ERPNEXT) ERPNEXT TENANT REGISTRY KEYCLOAK IT ADMIN
│ │ │ │ │
│ Sales Order │ │ │ │
│ Submit ──── ►│ │ │ │
│ │── webhook ───────►│ │ │
│ │ {order_id, │ │ │
│ │ tenant_id, │ │ │
│ │ products, │ │ │
│ │ plan, │ │ │
│ │ contract_start │ │ │
│ │ contract_end} │ │ │
│ │ │ tenant.status │ │
│ │ │ = active │ │
│ │ │ tenant_products │ │
│ │ │ enabled=true │ │
│ │ │── update claims ─►│ │
│ │ │ (protocol mapper│ │
│ │ │ picks up new │ │
│ │ │ entitlements) │ │
│ │ │── send email ─────────────────►│
│ │ │ "Subscription │ │
│ │ │ now active" │ │
P13 — Customer Browses Catalog and Requests a New Product
USER (any role) PORTAL TENANT REGISTRY ERPNEXT (CRM)
│ │ │ │
│ /acme/catalog │ │ │
│──────────────►│ │ │
│ │── GET /catalog ──────►│ │
│ │◄── product manifests +│ │
│ │ subscribed status │ │
│◄── catalog page │ │
│ • CERTifAI [✓ Subscribed] │ │
│ • Compliance [✓ Subscribed] │ │
│ • Notetaker [+ Request] │ │
│ • Classifier [+ Request] │ │
│ │ │
│ click [Request] on Notetaker │ │
│ Modal: "Why do you want this?" │ │
│ + estimated seats / volume │ │
│──────────────►│ │ │
│ │── POST /catalog/ │ │
│ │ request ─────────►│ │
│ │ {tenant, product, │ │
│ │ requested_by, note}│ │
│ │ │── create CRM Lead ──►│
│ │ │ linked to Customer │
│ │ │◄── lead_id ──────────│
│ │ │ notify sales_owner │
│ │ │ (email + ERPNext │
│ │ │ activity) │
│◄── "We'll be in touch within 1 day" ──│ │
P14 — Sales Rep Demos to a Prospect (Shared Demo Tenant)
SALES REP KEYCLOAK PORTAL DEMO TENANT
│ │ │ │
│ open Zoom with prospect, share screen │
│ │
│ demo.breakpilot.com │
│────────────────────────────────►│ │
│ │ │ Host: demo │
│ │ │ → slug = demo │
│ │ │ → tenant.kind=demo │
│ │ │ tenant.status=demo │
│ │ │ │
│ │ OIDC redirect │ │
│◄──────────────│─────────────────│ │
│ login sales@breakpilot │
│ realm_role=SALES_REP │
│──────────────►│ │ │
│ │ verify SALES_REP allowed on demo only │
│ │ issue JWT: │
│ │ org_id=demo, org_roles=[IT_ADMIN], │
│ │ realm_roles=[SALES_REP], │
│ │ products=[certifai, compliance] │
│◄──────────────│ │ │
│ │ │
│ /demo/dashboard ───────────────►│ │
│ /demo/products/certifai ─►│ load custom elt ►│
│ /demo/products/compliance ─►│ load custom elt ►│
│◄── show prospect every flow ────│ │
│ │ │
│ if prospect interested: │
│ click [Request Trial] in /demo/catalog │
│ modal: prospect email, company, est. seats │
│ → POST /catalog/trial-request │
│ creates CRM Lead in ERPNext, NOT a tenant │
│ sales_owner = the logged-in SALES_REP │
│ │
│ 03:00 nightly: │
│ cron → product /v1/tenants/demo/reset │
│ fixtures from catalog.seed_data_url restored │
│ demo is clean for next day │
Guardrails:
- Keycloak policy: SALES_REP realm role MUST NOT be issued a token with org_id ≠ demo
- Backstage policy: SALES_REP CANNOT see real-tenant data, CAN see CRM (their leads)
- Real customer support is NEVER done from a SALES_REP login
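The first two guardrails reduce to simple predicates. The sketch below assumes the demo tenant's org_id is the literal string demo, as shown in the P14 token, and that a SALES_REP's Backstage access is limited to the /backstage/leads and /backstage/demo prefixes. Both function names are hypothetical.

```python
DEMO_ORG_ID = "demo"  # as shown in the P14 token (org_id=demo)
SALES_REP_PREFIXES = ("/backstage/leads", "/backstage/demo")


def may_issue_token(realm_roles, org_id: str) -> bool:
    """A SALES_REP may only receive a token scoped to the demo org."""
    if "SALES_REP" in realm_roles and org_id != DEMO_ORG_ID:
        return False
    return True


def sales_rep_may_load(path: str) -> bool:
    """True if a SALES_REP may open this Backstage path."""
    # Match whole path segments so /backstage/demolition is not accepted.
    return any(path == p or path.startswith(p + "/") for p in SALES_REP_PREFIXES)
```

The token check is deliberately strict: holding SALES_REP blocks real-tenant tokens even if the account carries other roles.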
P15 — Self-Serve Trial → Convert or Expire
PROSPECT PORTAL TENANT REGISTRY ERPNEXT KEYCLOAK
│ │ │ │ │
│ breakpilot.com/start │ │ │
│──────────────►│ │ │ │
│ form: email, company, password │ │ │
│──────────────►│ │ │ │
│ │── POST /trials ──────►│ │ │
│ │ {email, company, │ │ │
│ │ requested_products} │ │ │
│ │ │ │ │
│ │ │ slugify(company) │ │
│ │ │ create tenant │ │
│ │ │ status=trial │ │
│ │ │ trial_ends_at = │ │
│ │ │ now + 14d │ │
│ │ │ create Customer ►│ │
│ │ │ tier=Trial │ │
│ │ │ sales_owner= │ │
│ │ │ unassigned │ │
│ │ │── create org ───────────────►│
│ │ │ + IT_ADMIN user │
│ │ │ + verify email │
│ │ │ │ │
│◄── magic link │ │ │ │
│ click link, set password │ │ │
│ land on /acme-trial/dashboard │ │ │
│ banner: "Trial: 14 days left — Add billing to keep your data" │
│ │ │ │
│ ── customer uses platform normally ── │ │
│ │ │ │
│ DAY 7 cron: trial_ends_at - 7d │ │ │
│ → email IT_ADMIN + CXO │ │ │
│ → CRM Activity: "Day-7 nudge" ►│ │ │
│ │ │ │
│ DAY 12: same, urgent tone │ │ │
│ DAY 14: trial_ends_at reached │ │ │
│ │ │ │
│ IF customer added payment: │ │ │
│ status: trial → active │ │ │
│ Stripe subscription created │ │ │
│ OR Sales Order in ERPNext signed │ │ │
│ banner removed │ │ │
│ │
│ ELSE: │
│ status: trial → frozen │ │ │
│ 30-day grace: portal read-only, products return 402 │ │
│ daily reminder email until day 44 │ │
│ │
│ DAY 44: frozen → archived │ │ │
│ GDPR export ZIP emailed to IT_ADMIN │ │ │
│ each product called: DELETE /v1/tenants/{id}/data │ │
│ 30 days later: tenant row deleted (audit_log retained 7y) │
Trial scoping:
- All paid products are available in trial mode by default unless catalog.available_on_plans excludes trial
- Usage-billed products (e.g., LiteLLM calls) get a hard cap during trial (manifest: trial_quota)
- Customer can upgrade plan mid-trial; trial timer just stops, no proration
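The P15 timeline is plain date arithmetic from the trial start. A sketch of the schedule, using the 14-day trial and 30-day grace period from the flow; the dictionary keys and the name `trial_schedule` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

TRIAL_DAYS = 14
GRACE_DAYS = 30


def trial_schedule(started_at: datetime) -> dict:
    """Compute the key dates of a trial that does not convert."""
    ends = started_at + timedelta(days=TRIAL_DAYS)
    archive = ends + timedelta(days=GRACE_DAYS)
    return {
        "day7_nudge": ends - timedelta(days=7),           # trial_ends_at - 7d
        "day12_nudge": started_at + timedelta(days=12),
        "trial_ends_at": ends,                            # trial -> frozen
        "archive_at": archive,                            # day 44: frozen -> archived
        "row_delete_at": archive + timedelta(days=30),    # tenant row deleted
    }
```

For a trial starting on 2026-06-01 this would give a freeze on 2026-06-15 and archival on 2026-07-15:

```python
schedule = trial_schedule(datetime(2026, 6, 1, tzinfo=timezone.utc))
print(schedule["archive_at"].date())
```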
P16 — Customer Cancels and Offboards
IT ADMIN PORTAL TENANT REGISTRY PRODUCTS ERPNEXT
│ │ │ │ │
│ /acme/settings/billing │ │ │
│──────────────►│ │ │ │
│ [Cancel Subscription] │ │ │
│ Modal: │ │ │
│ • reason (dropdown) │ │ │
│ • confirm typing "acme" │ │ │
│ • shows: data retained 30d, then deleted │ │
│──────────────►│ │ │ │
│ │── POST /tenants/ │ │ │
│ │ acme/cancel ─────►│ │ │
│ │ │ status: active │ │
│ │ │ → frozen │ │
│ │ │ frozen_at = now │ │
│ │ │ delete_at = │ │
│ │ │ now + 30d │ │
│ │ │── Stripe cancel │ │
│ │ │ at_period_end │ │
│ │ │── opportunity ──────────────────►│
│ │ │ stage=Lost │ │
│ │ │ reason=... │ │
│ │ │── notify sales_owner │
│ │ │ (could reach out for save) │
│◄── confirmation page │ │ │
│ "Frozen until <date>. Download your data anytime." │ │
│ │ │ │
│ frozen state: │ │ │
│ portal works READ-ONLY │ │ │
│ /export available (all products) │ │ │
│ product APIs return 402 on writes │ │ │
│ │ │ │
│ if customer changes mind within 30d: │ │
│ [Reactivate] → status: frozen → active │ │
│ no data loss │ │
│ │ │ │
│ DAY 30 cron: │ │ │
│ tenant.delete_at reached │ │ │
│ build final export ZIP per product ►│ /v1/tenants/{id}/export │
│ email ZIP link to IT_ADMIN + CXO │ │ │
│ (signed URL, 7-day TTL) │ │ │
│ for each product: │ │
│ DELETE /v1/tenants/{id}/data ────►│ │ │
│ Keycloak: org archived, users disabled │ │
│ ERPNext Customer: status=Inactive │ │
│ tenant.status = archived │ │
│ │ │
│ audit_log retained 7y per GDPR / accounting │ │
Self-serve vs. enterprise:
- Stripe-billed customers cancel in-portal; flow above
- ERPNext-billed (enterprise) customers send written notice; sales rep updates Sales Order; flow runs from /backstage/tenants/[id]/lifecycle with the same downstream effects
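The frozen state's "read-only, 402 on writes" rule can be sketched as a request gate that each product runs before handling a call. Only the frozen rule is covered; which HTTP methods count as reads is an assumption, and `gate_request` is a hypothetical name.

```python
from typing import Optional

# Methods treated as reads (assumption); everything else is a write.
READ_METHODS = {"GET", "HEAD", "OPTIONS"}


def gate_request(tenant_status: str, method: str) -> Optional[int]:
    """Return an HTTP status to reject with, or None to let the request through."""
    if tenant_status == "frozen" and method.upper() not in READ_METHODS:
        return 402  # Payment Required: tenant is read-only while frozen
    return None
```

Exports keep working under this rule as long as they are served over GET.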
Headless Product Flows
P1–P16 cover interactive products that ship a UI. Products declared as frontend.type = headless (see PRODUCT_INTEGRATION_SPEC.md §5) ship no frontend code; customers configure them through a portal-rendered UI and consume them via API/MCP from their own systems. Examples: a notetaker bot, a document classifier, a webhook router, a compliance reporter.
The portal still hosts these products end-to-end: the customer area, billing, audit, and Backstage all work the same. Only the "use the product" surface changes from a UI to API keys + webhooks.
H1 — Customer Enables a Headless Product
IT ADMIN PORTAL TENANT REGISTRY HEADLESS PRODUCT
│ │ │ │
│ /acme/products│ │ │
│ /notetaker │ │ │
│──────────────►│ │ │
│ │ load manifest │ │
│ │ frontend.type = │ │
│ │ "headless" │ │
│ │ │ │
│ │ render portal-owned │ │
│ │ config page from │ │
│ │ manifest sections: │ │
│ │ • API Keys │ │
│ │ • Webhooks │ │
│ │ • Usage chart │ │
│ │ • Docs link │ │
│ │ • Code samples │ │
│◄── page ──────│ │ │
H2 — Generate API Key for a Headless Product
IT ADMIN PORTAL TENANT REGISTRY HEADLESS PRODUCT
│ │ │ │
│ [Generate Key]│ │ │
│ name: "prod" │ │ │
│ scopes:[r,w] │ │ │
│──────────────►│ │ │
│ │── POST /api-keys ────►│ │
│ │ {tenant, scopes, │ │
│ │ product, name} │ │
│ │ │ generate raw key │
│ │ │ store HASH only │
│ │ │ bind: tenant + │
│ │ │ product + │
│ │ │ scopes │
│ │◄── raw key (once) ────│ │
│◄── show once ─│ │ │
│ "Copy now — │ │ │
│ won't show │ │ │
│ again" │ │ │
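The "store HASH only" step in H2 could be sketched as below. The flow does not name a hash function or key format, so SHA-256 and the k_ prefix (borrowed from the k_xxx example in H3) are assumptions, and both function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def generate_api_key():
    """Return (raw_key, stored_hash). The raw key is shown once and never stored."""
    raw = "k_" + secrets.token_urlsafe(32)
    stored_hash = hashlib.sha256(raw.encode()).hexdigest()
    return raw, stored_hash


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

A fast hash is a reasonable choice here only because the key is long and random; it would not be appropriate for user-chosen passwords.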
H3 — Customer's System Calls the Headless Product
CUSTOMER SYSTEM HEADLESS PRODUCT TENANT REGISTRY
│ │ │
│ POST /v1/sessions │ │
│ Auth: ApiKey k_xxx │ │
│ X-Tenant: acme │ │
│──────────────────────►│ │
│ │ validate key ───────►│
│ │ → tenant_id, │
│ │ scopes │
│ │◄─────────────────────│
│ │ enforce scope │
│ │ tenant_id in EVERY │
│ │ DB query │
│ │ process request │
│ │ emit usage ─────────►│
│ │ emit audit ─────────►│
│◄── 200 response ──────│ │
H4 — Async Result Delivered via Webhook
HEADLESS PRODUCT CUSTOMER WEBHOOK URL PORTAL (delivery log)
│ │ │
│ async job finishes │ │
│ load webhook config │ │
│ for this tenant + │ │
│ this event type │ │
│ │ │
│ POST customer URL ─────►│ │
│ Body: {event, result, │ │
│ tenant, signature} │ │
│◄── 200 ─────────────────│ │
│ log delivery ────────────────────────────────────►
│ (success/fail, ts, │ │
│ response code) │ │
│ │ │
│ if delivery fails: │ │
│ retry with backoff │ │
│ 3 attempts, then │ │
│ dead-letter │ │
│ visible in portal at │ │
│ /webhooks/deliveries │ │
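The delivery loop in H4 can be sketched with an injected `send` callable so it runs without a network. Several details are assumptions: the signature is an HMAC-SHA256 over the canonical JSON payload, "3 attempts" means three total tries, and the backoff doubles from a one-second base. The function names are hypothetical.

```python
import hashlib
import hmac
import json
import time

MAX_ATTEMPTS = 3


def sign(secret: bytes, payload: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the payload (assumption)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(secret, canonical, hashlib.sha256).hexdigest()


def deliver(send, payload: dict, secret: bytes, sleep=time.sleep, base_delay=1.0) -> dict:
    """Call send(body) -> HTTP status up to MAX_ATTEMPTS times, logging each try."""
    body = json.dumps({**payload, "signature": sign(secret, payload)}).encode()
    log = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            status = send(body)
        except Exception:
            status = None  # a transport error counts as a failed attempt
        ok = status is not None and 200 <= status < 300
        log.append({"attempt": attempt, "status": status, "ok": ok})
        if ok:
            return {"state": "delivered", "log": log}
        if attempt < MAX_ATTEMPTS:
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, then 2s
    return {"state": "dead_letter", "log": log}
```

The returned log has the same shape as the delivery record in the diagram (attempt, response code, success or failure), so it could be forwarded to the portal unchanged.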
H5 — Headless Product Tile on Customer Dashboard
USER PORTAL TENANT REGISTRY HEADLESS PRODUCT
│ │ │ │
│ /acme/dashboard │ │ │
│─────────────────►│ │ │
│ │ for each entitled │ │
│ │ product in JWT: │ │
│ │ │ │
│ │ type=interactive → │ │
│ │ render "Open" tile │ │
│ │ │ │
│ │ type=widget → │ │
│ │ load widget bundle │ │
│ │ render custom elt │ │
│ │ │ │
│ │ type=headless → │ │
│ │ GET /v1/usage ─────────────────────────────►│
│ │◄────────────────────── usage summary ────────│
│ │ render stat tile: │ │
│ │ "Notetaker │ │
│ │ 142 sessions │ │
│ │ last 30d" │ │
│ │ click → goes to │ │
│ │ /products/notetaker │ │
│◄── dashboard ────│ │ │
H6 — Backstage Operates a Headless Product
US (BACKSTAGE) PORTAL TENANT REGISTRY HEADLESS PRODUCT
│ │ │ │
│ /backstage/ │ │ │
│ tenants/acme/ │ │ │
│ products/ │ │ │
│ notetaker │ │ │
│──────────────►│ │ │
│ │ NO "Impersonate" btn │ │
│ │ (no UI to enter) │ │
│ │ │ │
│ │ shows: │ │
│ │ • Health │ │
│ │ • Usage 30/90d │ │
│ │ • API call errors │ │
│ │ • Webhook deliveries │ │
│ │ • Failed deliveries │ │
│ │ • Admin actions from │ │
│ │ manifest: │ │
│ │ [Flush queue] │ │
│ │ [Rotate keys] │ │
│ │ [Reset state] │ │
│ [Flush queue] │ │ │
│──────────────►│ │ │
│ │── service token ─────►│ │
│ │ POST /admin/flush ─────────────────────────►
│ │ │ audit event │
│ │◄─────────────────────────────────── ok ───── │
│◄── done ──────│ │ │
9. Technology Decisions (Locked)
| Decision | Choice | Rationale |
|---|---|---|
| Identity | Keycloak, single realm | Already in CERTifAI; Organizations + IdP brokering built-in |
| Tenant model | Keycloak Organization per customer | Native isolation, JWT claims, no custom multi-tenant auth code |
| Subdomain routing | Orca-Proxy, wildcard cert | Consistent with existing infra; tenant from Host header |
| Secret management | Infisical, machine identity per service | Uniform across all services; path-namespaced per service |
| Business operations | ERPNext (Frappe) | CRM + sales + invoicing + HR in one; avoids building our own |
| Customer support | Frappe Helpdesk | Same Frappe bench as ERPNext; native customer-ticket-account link |
| Engineering issues | Gitea Issues | Already running Gitea; Frappe HD → Gitea via REST API (server script) |
| Data isolation | Logical (tenant_id / org_id columns) | Sufficient for Starter/Professional; physical isolation offered for Enterprise |
| Billing — self-serve | Stripe (Starter, Professional) | Standard; portal billing page reads Stripe |
| Billing — enterprise | ERPNext Sales Invoices | Manual invoicing, DATEV export for accountant |
| Customer portal | New Next.js 15 app | Clean slate; existing admin apps have product-specific chrome |
| Tenant Registry | New Go service | Thin glue layer; owns entitlements, IdP config, audit log |
| Products scope | CERTifAI + breakpilot-compliance only | Dataroom, lehrer, and pitch-deck out of scope |
10. Open Items / Phasing
Phase 0 — Foundation (pilot-ready, one real customer)
- Orca-Proxy: wildcard TLS, subdomain routing table
- Infisical: machine identities + secrets for all existing services
- Keycloak: Organizations enabled, realm roles (incl. SALES_REP), one test org
- Tenant Registry: core schema + API (/tenants CRUD + /activate), status enum
- Backstage minimal: create tenant form, tenant list, impersonation
- Portal login: subdomain detection → Keycloak OIDC → tenant context
- CERTifAI: MongoDB-backed sessions, org_id query scoping, role enforcement
- breakpilot-compliance: JWT → X-Tenant-ID validated at Next.js proxy
- Demo tenant demo seeded; sales rep can log in and walk a screen-shared prospect
Phase 1 — Customer-Facing Portal
- Full customer dashboard, product tiles, usage summary
- User management and invite flow
- IdP configuration wizard (OIDC + SAML)
- Billing page (ERPNext invoices + Stripe usage)
- Audit log page and CSV/PDF export
- Frappe HD embedded in /[slug]/support/
Phase 2 — Business Operations
- ERPNext configured: CRM, Sales Orders, Invoicing, HR
- ERPNext → Tenant Registry webhook (Sales Order submit → tenant activate)
- Frappe HD → Gitea escalation (server script)
- Backstage health dashboard (service health, incidents)
- Keycloak protocol mapper (products + plan injected into JWT)
- Self-serve trial flow P15: /start form, 14-day timer, day-7/12/14 emails, trial → frozen → archived state machine
- Cancel + offboard flow P16: cancel modal, 30-day frozen window, automated final-export ZIP, GDPR erasure call to every product
- Demo nightly reset: cron at 03:00 Europe/Berlin calls each product's /v1/tenants/demo/reset
Phase 3 — Product API Surface
- CERTifAI: OpenAPI spec, /api/v1/health + /api/v1/usage
- breakpilot-compliance: OpenAPI spec, /api/v1/usage
- Customer-facing API keys (IT Admin generates, scoped to their org)
- LiteLLM per-tenant API key metering → usage data in portal
Phase 4 — Enterprise Tier
- Physical data isolation option (dedicated PostgreSQL schema per tenant)
- Customer-hosted LiteLLM (URL stored in tenant_products.config)
- Custom domain support (compliance.acme.com → Orca-Proxy → portal)
- MCP servers per product (CERTifAI MCP, compliance MCP)
- SLA enforcement in Frappe HD per plan tier
End of document. Updated after design review 2026-05-11.