feat(dashboard): proactively refresh expired Keycloak tokens #91
Summary
The dashboard stored a `refresh_token` in the session at login (`auth.rs`) but never used it. Once the `access_token`'s 5-minute lifespan ran out, every subsequent agent call failed with 401 `ExpiredSignature` and the UI showed "unable to load X" until the user manually logged out and back in.
Fix
Before attaching the bearer in `agent_client::attach_token`:
- Decode the JWT's `exp` claim (no signature verification; the agent does that).
- If the token is expired or within `REFRESH_SKEW_SECS` (30s) of expiry, exchange the `refresh_token` for a fresh pair via the realm's token endpoint.
- Write the new `access_token` + (possibly rotated) `refresh_token` back into the session.
If the refresh fails (`refresh_token` also expired or rejected), fall through with the stale token. The agent's 401 then surfaces to the UI, which can prompt re-login. That is a better UX cue than failing silently at the dashboard layer.
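The expiry check described above can be sketched with the standard library only. This is a minimal illustration, not the code in `agent_client`. The helper names (`base64url_decode`, `extract_exp`, `jwt_exp`, `needs_refresh`, `now_secs`), the hand-rolled decoder, the string scan for `exp` (a real implementation would more likely use a base64 crate and a JSON parser), and the inclusive boundary at exactly `REFRESH_SKEW_SECS` are all assumptions.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Skew window named in the PR: refresh when within 30s of expiry.
const REFRESH_SKEW_SECS: u64 = 30;

/// Decode base64url (the JWT segment encoding) with std only.
/// Hand-rolled for illustration; returns None on any invalid character.
fn base64url_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(input.len() * 3 / 4);
    let mut buf: u32 = 0;
    let mut bits: u32 = 0;
    for c in input.bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'-' => 62,
            b'_' => 63,
            b'=' => break, // tolerate optional padding
            _ => return None,
        };
        buf = (buf << 6) | v as u32;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1; // keep only the leftover low bits
        }
    }
    Some(out)
}

/// Pull the integer value that follows an `"exp"` key in the payload JSON.
/// Assumption: a plain string scan is enough for a sketch; it is not a JSON parser.
fn extract_exp(payload_json: &str) -> Option<u64> {
    let key_pos = payload_json.find("\"exp\"")?;
    let rest = payload_json[key_pos + 5..].trim_start();
    let rest = rest.strip_prefix(':')?.trim_start();
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    digits.parse().ok()
}

/// Read `exp` from the payload segment WITHOUT verifying the signature.
fn jwt_exp(token: &str) -> Option<u64> {
    let payload_b64 = token.split('.').nth(1)?;
    let bytes = base64url_decode(payload_b64)?;
    let json = String::from_utf8(bytes).ok()?;
    extract_exp(&json)
}

/// Current Unix time in seconds (0 if the clock is before the epoch).
fn now_secs() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}

/// True when the token should be refreshed before it is attached.
fn needs_refresh(token: &str, now: u64) -> bool {
    match jwt_exp(token) {
        // Expired, or expiring within the skew window.
        Some(exp) => exp <= now.saturating_add(REFRESH_SKEW_SECS),
        // Malformed token or missing `exp`: refresh defensively.
        None => true,
    }
}
```

With this sketch, a token whose payload is `{"exp":1000}` would be refreshed at `now = 980` (inside the 30s window) but not at `now = 900`. A caller would pass `now_secs()` as the second argument.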
Why proactive, not retry-on-401
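- Saves a wasted round-trip on every agent call once the token has aged past 5 min.
- Doesn't require cloning `reqwest::RequestBuilder` bodies for retry (some bodies aren't cloneable).
- Same end state: a fresh token reaches the agent.
Test plan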
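- `cargo test -p compliance-dashboard --features server --no-default-features infrastructure::agent_client::tests` (5 pass):
  * expired JWT → refresh
  * near-expiry within skew window → refresh
  * fresh JWT → no refresh
  * malformed/empty JWT → refresh (defensive)
  * JWT without `exp` claim → refresh (defensive)
- Manual after deploy: dashboard works past the 5-min token lifespan without manual re-login.
Note on the other error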
While diagnosing the original "unable to load repositories" symptom, the agent log surfaced two distinct failure modes:
- `JWT validation failed: ExpiredSignature`: what this PR fixes.
- `JWT validation failed: JWT is missing tenant_id claim`: a Keycloak realm config issue (the user logging in lacks the M7.1 attributes that the protocol mappers consume). Being fixed separately by switching both services to the `breakpilot-dev` realm in `orca-infra`.
🤖 Generated with Claude Code
The dashboard stored a refresh_token in the session at login (auth.rs) but never used it. Once the access_token's 5-minute lifespan ran out, every subsequent agent call failed with 401 ExpiredSignature. The UI showed "unable to load X" until the user logged out and back in.

Fix: before attaching the bearer, decode the JWT's `exp` claim and proactively refresh via the stored refresh_token if the token is expired or within REFRESH_SKEW_SECS (30s) of expiry. Updates the session with the new access_token (and rotated refresh_token if KC sends one). Refresh failures fall through with the stale token so the agent's 401 surfaces to the UI rather than failing the request at the dashboard layer.

Why "proactive" instead of "retry on 401"
- Saves a wasted round-trip on every agent call once the token has aged past 5 min.
- Doesn't require cloning RequestBuilder bodies for retry.
- Same end state: fresh token reaches the agent.

Test plan
- cargo test -p compliance-dashboard --features server --no-default-features infrastructure::agent_client::tests (5 pass):
  * expired JWT → refresh
  * near-expiry within skew window → refresh
  * fresh JWT → no refresh
  * malformed/empty JWT → refresh (defensive)
  * JWT without exp claim → refresh (defensive)
- Manual after deploy: dashboard works past the 5-min token lifespan without manual re-login.

Note
- The refresh code addresses the ExpiredSignature failure mode. The separate "JWT is missing tenant_id claim" 401 is a Keycloak realm config issue (the user logging in lacks the M7.1 attributes that the protocol mappers consume) and is fixed by realm/attribute config, not by this PR.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
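The session update and fall-through behaviour described above might look roughly like the sketch below. Everything named here is hypothetical: the `Session` and `TokenResponse` shapes, the `bearer_for_request` function, and the `refresh` closure that stands in for the HTTP call to the realm's token endpoint. The real code presumably runs async inside `agent_client::attach_token`; this synchronous version only illustrates the control flow.

```rust
/// Hypothetical session shape; the real session type is not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
struct Session {
    access_token: String,
    refresh_token: String,
}

/// Hypothetical token-endpoint response. `refresh_token` is optional because
/// the identity provider may or may not rotate it.
struct TokenResponse {
    access_token: String,
    refresh_token: Option<String>,
}

/// Returns the bearer to attach to the agent request.
/// `expired` stands in for the `exp` check; `refresh` stands in for the
/// token-endpoint call and receives the stored refresh token.
fn bearer_for_request<F>(session: &mut Session, expired: bool, refresh: F) -> String
where
    F: FnOnce(&str) -> Result<TokenResponse, String>,
{
    if expired {
        match refresh(&session.refresh_token) {
            Ok(resp) => {
                // Store the fresh access token.
                session.access_token = resp.access_token;
                // Store the refresh token only if a rotated one was sent.
                if let Some(rotated) = resp.refresh_token {
                    session.refresh_token = rotated;
                }
            }
            Err(_) => {
                // Fall through with the stale token so the agent's 401
                // reaches the UI instead of failing here.
            }
        }
    }
    session.access_token.clone()
}
```

Passing the refresh call in as a closure keeps the sketch free of HTTP dependencies and makes the three outcomes (rotated, not rotated, failed) easy to exercise in isolation.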