tenant-registry/README.md
sharang fd5f8ae36f
feat(keycloak): M4.3 — Admin API adapter + claim resolver
internal/keycloak/ — Adapter interface with two implementations:
  HTTPAdapter  pgxpool-style real Admin API client with cached client-
               credentials token (auto-refresh, 401 retry).
  Mock         in-process map for unit tests + dev convenience when
               KEYCLOAK_ADMIN_URL is empty. Used by the eachStore harness.

Adapter contract (adapter.go):
  CreateOrgAndInvite(ctx, InviteInput) (*InviteResult, error)
    Creates a KC organization, an IT_ADMIN user, adds the user as a
    member, triggers VERIFY_EMAIL + UPDATE_PASSWORD execute-actions
    email. Atomic from the caller's PoV; partial failures surface as
    typed errors (ErrOrgConflict, ErrUserConflict, ErrUnauthorized,
    ErrUnavailable).
  SyncClaims(ctx, userID, Claims) error
    Pushes tenant_id / tenant_slug / org_roles / products / plan /
    tenant_status into the user's KC attributes — the same shape the
    realm's protocol mappers project into JWTs.
  Health(ctx) error
    Pings /admin/serverinfo; wired into readyz.

Wiring:
  POST /v1/tenants now accepts admin_email + admin_name. When set, the
  adapter creates the org and invites the user. Response wraps the
  tenant with the new TenantCreated{tenant, invite_url} shape so dev
  testers can use the action-token URL without waiting for the email.
  KC failures DO NOT roll the tenant back — they emit a
  keycloak.provision_failed audit event so the operator can resend.
  Successful invites emit keycloak.invite_sent.

  POST /v1/internal/keycloak/claims resolves a tenant's current claim
  bundle. Lookup chain: body.tenant_id → body.tenant_slug →
  body.user_attrs.tenant_id → body.user_attrs.tenant_slug. The realm's
  protocol mapper calls this at token issuance, or operators on demand.

Config: KEYCLOAK_ADMIN_URL / REALM / CLIENT_ID / CLIENT_SECRET; empty
URL falls back to Mock for dev.

OpenAPI: TenantCreated + Claims schemas added; /v1/internal/keycloak/claims
documented. Contract test extended to cover the new endpoint.

Tests:
  internal/keycloak/mock_test.go    Mock semantics: conflict surfacing,
                                    FailNext hook, SyncClaims persistence.
  internal/server/keycloak_test.go  KC provisioning end-to-end via
                                    eachStore: invite_url returned,
                                    mock records, invite_sent audit;
                                    failure path emits provision_failed
                                    but tenant still lands; claims
                                    endpoint resolves via tenant_id /
                                    tenant_slug / user_attrs / 404 / 400.

The real-KC integration test (against a testcontainers-spun KC 26)
lands in a follow-up — gating it behind KEYCLOAK_INTEGRATION=1 + a
slower nightly CI is cleaner than baking 30s+ of KC boot into every PR.

Refs: M4.3
2026-05-19 13:24:41 +02:00


tenant-registry

Multi-tenant glue: orgs, entitlements, API keys, audit.

Part of the Breakpilot Platform. For the big picture see platform/docs: Architecture · Infrastructure · Product Integration Spec · Implementation Plan

What this is

tenant-registry is the platform's multi-tenant glue: it manages orgs, entitlements, API keys, and audit events. It was scaffolded under milestone M4.1. See platform/docs for the full architecture context.

Plane: Control · Owner: @sharang · Status: pre-alpha · Linked milestone: M4.1

Run locally

# Prerequisites: Go 1.25+
# Dependencies (Keycloak, pg-app) come from the dev stack — see platform/orca-platform/dev.

# In one terminal — bring up dev dependencies (in the orca-platform clone):
cd /path/to/platform/orca-platform && make dev-up

# In another — run the service:
make dev          # APP_ENV=dev, listens on :8090 (Keycloak owns :8080 in the dev stack)
make test         # unit tests
make build        # compile to ./bin/tenant-registry

Env vars (override at the shell):

| Var | Default | Purpose |
| --- | --- | --- |
| APP_ENV | dev | one of dev, stage, prod |
| ADDR | :8090 | listen address (avoids Keycloak's :8080) |
| KEYCLOAK_ISSUER | http://localhost:8080/realms/breakpilot-dev | OIDC issuer URL (the JWT signer) |
| DATABASE_URL | empty (in-memory store fallback) | Postgres DSN; service uses Memory when empty |
| KEYCLOAK_ADMIN_URL | empty (Mock adapter used in dev) | KC base URL for the Admin API |
| KEYCLOAK_REALM | breakpilot-dev | Realm name for Admin API calls |
| KEYCLOAK_CLIENT_ID | empty | Service-account client id (Admin) |
| KEYCLOAK_CLIENT_SECRET | empty | Service-account client secret |
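
The defaults in the table can be read with a small loader. The following is a minimal sketch, not the service's actual config code: the `Config` struct, its field names, and `LoadConfig` are hypothetical; only the variable names and default values come from the table.

```go
package main

import "os"

// Config mirrors the env-var table. The struct and its field names are
// hypothetical; only the variable names and defaults come from the README.
type Config struct {
	AppEnv               string
	Addr                 string
	KeycloakIssuer       string
	DatabaseURL          string
	KeycloakAdminURL     string
	KeycloakRealm        string
	KeycloakClientID     string
	KeycloakClientSecret string
}

// getenv returns lookup(key), or def when the variable is unset or empty.
func getenv(lookup func(string) string, key, def string) string {
	if v := lookup(key); v != "" {
		return v
	}
	return def
}

// LoadConfig reads every variable through lookup. Injecting the lookup
// function (instead of calling os.Getenv directly) keeps it easy to test.
func LoadConfig(lookup func(string) string) Config {
	return Config{
		AppEnv:               getenv(lookup, "APP_ENV", "dev"),
		Addr:                 getenv(lookup, "ADDR", ":8090"),
		KeycloakIssuer:       getenv(lookup, "KEYCLOAK_ISSUER", "http://localhost:8080/realms/breakpilot-dev"),
		DatabaseURL:          getenv(lookup, "DATABASE_URL", ""),       // empty selects the in-memory store
		KeycloakAdminURL:     getenv(lookup, "KEYCLOAK_ADMIN_URL", ""), // empty selects the Mock adapter
		KeycloakRealm:        getenv(lookup, "KEYCLOAK_REALM", "breakpilot-dev"),
		KeycloakClientID:     getenv(lookup, "KEYCLOAK_CLIENT_ID", ""),
		KeycloakClientSecret: getenv(lookup, "KEYCLOAK_CLIENT_SECRET", ""),
	}
}

// LoadFromEnv is the convenience wrapper over the process environment.
func LoadFromEnv() Config { return LoadConfig(os.Getenv) }
```

Calling `LoadFromEnv()` at startup with nothing exported yields the dev defaults, so `make dev` needs no extra shell setup.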

Endpoints

Authoritative spec: openapi.yaml. Summary:

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /healthz | Liveness |
| GET | /readyz | Pings the store |
| POST | /v1/tenants | Create a tenant |
| GET | /v1/tenants/{id} | Read by id |
| GET | /v1/tenants/by-slug/{slug} | Read by slug (portal middleware uses this) |
| POST | /v1/tenants/{id}/activate | trial → active |
| POST | /v1/tenants/{id}/cancel | active → frozen |
| GET | /v1/entitlements?tenant_id={id} | List product entitlements |
| GET | /v1/catalog | List requestable products |
| POST | /v1/catalog/request | Customer requests a product (sales follow-up) |
| POST | /v1/catalog/trial-request | Self-serve 14-day trial |
| GET | /v1/api-keys?tenant_id={id} | List keys |
| POST | /v1/api-keys | Create key (plaintext shown once) |
| DELETE | /v1/api-keys/{id} | Revoke |
| POST | /v1/internal/api-keys/verify | Used by headless products to validate inbound keys |
| POST | /v1/internal/keycloak/claims | Resolve a tenant's current claim bundle (see Keycloak adapter) |
| POST | /v1/audit | Append an audit event |
| GET | /v1/audit | Query (cursor-paginated) |

State-changing endpoints emit audit events automatically. The OpenAPI contract test (openapi_test.go) asserts every listed path resolves against the committed spec.
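
The two lifecycle endpoints in the table (activate: trial → active, cancel: active → frozen) amount to a small state machine. The sketch below illustrates that idea only. The type names and `Apply` are hypothetical, and rejecting every other transition is an assumption, not documented behaviour.

```go
package main

import "fmt"

// Status values are taken from the endpoint table (trial, active, frozen).
type Status string

const (
	StatusTrial  Status = "trial"
	StatusActive Status = "active"
	StatusFrozen Status = "frozen"
)

// Action names mirror the two lifecycle endpoints.
type Action string

const (
	ActionActivate Action = "activate" // POST /v1/tenants/{id}/activate
	ActionCancel   Action = "cancel"   // POST /v1/tenants/{id}/cancel
)

type transition struct{ from, to Status }

// transitions lists the only moves the table documents.
var transitions = map[Action]transition{
	ActionActivate: {from: StatusTrial, to: StatusActive},
	ActionCancel:   {from: StatusActive, to: StatusFrozen},
}

// Apply returns the next status, or an error when the action is unknown or
// not allowed from the current status (ASSUMPTION: the real service may be
// more lenient, for example by treating repeated calls as no-ops).
func Apply(current Status, a Action) (Status, error) {
	t, ok := transitions[a]
	if !ok {
		return current, fmt.Errorf("unknown action %q", a)
	}
	if current != t.from {
		return current, fmt.Errorf("cannot %s a tenant in status %q (want %q)", a, current, t.from)
	}
	return t.to, nil
}
```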

Storage

The service picks its store based on DATABASE_URL:

  • empty → in-memory store, pre-seeded with the acme tenant (id: 00000000-0000-0000-0000-000000000001). Useful for portal dev without spinning Postgres.
  • set → pgx-backed Postgres. Run make migrate-up against the same DSN first.

Both implementations pass the same test harness (eachStore in internal/server/server_test.go).

Keycloak adapter (M4.3)

internal/keycloak is the seam between tenant-registry and Keycloak. The Adapter interface has two implementations:

| Implementation | When used |
| --- | --- |
| Mock | Default in dev when KEYCLOAK_ADMIN_URL is empty |
| HTTPAdapter | Real KC Admin API client; activated when KC env vars are populated |

POST /v1/tenants now accepts admin_email and admin_name. When set, the adapter creates a Keycloak organization (alias = the tenant slug), invites the user as the IT_ADMIN, and triggers the verify-email + set-password flow. The response body includes invite_url so dev testers can use it without waiting for the email — production discards it.

KC failures are non-fatal. The tenant row still lands; a keycloak.provision_failed audit event captures the error so the operator can resend the invite from the KC UI.
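
The "tenant lands, failure is audited" behaviour can be sketched as below. This is an illustration under assumptions rather than the handler's real code: `CreateTenant`, `inviteFunc`, and the struct shapes are hypothetical, while the audit action names and the `ErrUnavailable` error name come from the M4.3 commit message.

```go
package main

import "errors"

// ErrUnavailable is one of the typed adapter errors named in the commit message.
var ErrUnavailable = errors.New("keycloak: unavailable")

// AuditEvent is a simplified, hypothetical audit record.
type AuditEvent struct{ Action, TenantSlug, Detail string }

type Tenant struct{ Slug string }

// TenantCreated wraps the tenant with an optional invite URL.
type TenantCreated struct {
	Tenant    Tenant
	InviteURL string
}

// inviteFunc stands in for Adapter.CreateOrgAndInvite (hypothetical shape).
type inviteFunc func(slug, email, name string) (inviteURL string, err error)

// CreateTenant always returns the tenant. A Keycloak failure is recorded as
// an audit event and never rolls the tenant back.
func CreateTenant(slug, adminEmail, adminName string, invite inviteFunc, audit func(AuditEvent)) TenantCreated {
	out := TenantCreated{Tenant: Tenant{Slug: slug}} // the tenant row "lands" first

	if adminEmail == "" {
		return out // no admin requested: skip Keycloak provisioning entirely
	}

	url, err := invite(slug, adminEmail, adminName)
	if err != nil {
		// Non-fatal: capture the error so an operator can resend the invite.
		audit(AuditEvent{Action: "keycloak.provision_failed", TenantSlug: slug, Detail: err.Error()})
		return out
	}

	audit(AuditEvent{Action: "keycloak.invite_sent", TenantSlug: slug, Detail: adminEmail})
	out.InviteURL = url
	return out
}
```

The design trade-off is that a tenant can exist without a Keycloak organization for a while; the audit trail is what makes that state discoverable.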

POST /v1/internal/keycloak/claims resolves a tenant's current entitlement bundle (tenant_id, slug, products, plan, status). The realm's protocol mapper calls this at token-issuance time (or whenever user attributes need a refresh).

For production, provision a service-account client in the realm with the realm-management:manage-users + manage-organizations roles. Drop its credentials in Infisical at /{env}/tenant-registry/KEYCLOAK_CLIENT_*.

Schema migrations (M4.1)

# Apply all pending migrations against the dev Postgres (assumes
# `make dev-up` in platform/orca-platform is running):
make migrate-up

# Inspect current version:
make migrate-version

# Roll back the most recent migration:
make migrate-down

# Wipe everything (DESTRUCTIVE — only safe against a dev DB):
make migrate-down-all

# Create the next pair of empty migration files:
make migrate-create NAME=add_team_table

Migrations are embedded into both cmd/server and cmd/migrate via migrations/embed.go. In production, cmd/migrate ships as an Orca init container so the schema is applied before the API server starts (IMPLEMENTATION_PLAN.md §1.7: migrations are forward-only and run as an init container before the service).

The migrations package ships three integration tests (require Docker):

| Test | What it asserts |
| --- | --- |
| TestMigrate_upDownRoundTrip | up → all 6 tables + 4 enums exist; down → schema empty; up again succeeds |
| TestSeed_canInsertAndQuery | end-to-end insert across all 6 tables, FK cascade behaviour, audit_log SET-NULL on tenant delete |
| TestSlugConstraint | tenant slug regex enforced (rejects too-short / leading dash / uppercase / underscore) |

Run them with make test. Use make test-short in environments without Docker.
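
For callers that want to reject bad slugs before hitting the database, a client-side check along the lines of TestSlugConstraint might look like this. The pattern is an assumption chosen to satisfy the four documented rejections (too short, leading dash, uppercase, underscore). The real constraint lives in the migrations and may differ, in particular on the minimum and maximum length.

```go
package main

import "regexp"

// slugPattern is an ASSUMED approximation of the database constraint:
// lowercase letters, digits and dashes; must not start with a dash;
// 3 to 63 characters. The authoritative regex is in the migrations.
var slugPattern = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{2,62}$`)

// ValidSlug reports whether s passes the assumed slug pattern.
func ValidSlug(s string) bool {
	return slugPattern.MatchString(s)
}
```

A pre-check like this only improves error messages; the database constraint remains the source of truth.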

Deployment

| Env | URL | How |
| --- | --- | --- |
| dev | http://localhost:8090 | make dev |
| stage | https://tenant-registry.stage.breakpilot.com | auto on merge to main |
| prod | https://tenant-registry.breakpilot.com | manual: tag vX.Y.Z + sign-off |

Rollback: orca rollout undo tenant-registry --env={{env}}.

Observability

  • Traces, logs, metrics: SigNoz — service name tenant-registry
  • Audit events: Tenant Registry /audit (Retraced-shape schema)
  • On-call: oncall@breakpilot.com · runbook at platform/docs/runbooks/tenant-registry.md

Contributing

See CONTRIBUTING.md. TL;DR: branch from main, open a PR, 1 review + green CI, squash-merge.

License

Proprietary — all rights reserved. Copyright (c) 2026 Sharang Parnerkar and Benjamin Boenisch. See LICENSE.