IIQ Provisioning Recovery Console
An operator console that turns SailPoint IdentityIQ's scattered failed‑provisioning state into a single triage surface — classified, age‑ranked, recoverable, and audit‑logged.
| Field | Value |
|---|---|
| Type | App |
| Theme | Recover (primary, NIST CSF) + Automation (secondary) |
| Platform | SailPoint IdentityIQ 8.4 |
| Date | 2026-05-11 |
| Status | Concept (browser-runnable mock, synthetic data) |
The problem (in IIQ-shaped form)
When provisioning fails in IIQ, the evidence lands in three places that do not talk to each other:
- ProvisioningTransaction (PTO) objects: one per attempted operation per target system, with `status = Success | Failed | Retry | Pending` and a free-text `statusMessage`. Configured via `enableProvisioningTransactionLog` and `provisioningTransactionLogLevel = Failure` (see `WEB-INF/config/init.xml`, lines 2109–2111).
- IdentityRequest rows: the user-facing access request, with `executionStatus = Verifying | Executing | Terminated | Completed`. Failures roll up as messages, but the linkage from a specific Failed PTO back to the IdentityRequest is a click-through chain, not a query.
- Workflow case: the BPM case for `LCM Provisioning` carries `isProvisioningFailed` and the retry loop controlled by `enableRetryRequest` (see `WEB-INF/config/lcmworkflows.xml`, lines 255–257). A stuck case sits in the Process Monitor with no aggregated view of what failed across the fleet.
OOB IIQ surfaces these in three different grids — the Provisioning Transactions debug page, the Identity Request list, and the Process Monitor. There is no single answer to the question an IAM operator actually asks at 9 AM:
"What's broken in provisioning right now, why, who's affected, and what's the fastest recovery?"
This console answers that question.
What the console shows
Top KPI bar (5 tiles):
- Open: total non-`Success` ProvisioningTransactions
- Auto-retry eligible: `status = Retry`, ready for the retry loop
- Past SLA: `> 24h` old and still not resolved (configurable)
- Hot app: application carrying the largest share of open failures
- Terminated identities affected: failures where `Identity.inactive = true` (the highest-risk subset, because it likely means a Leaver event didn't fully execute)
Filter rail (left):
- Application multi-select (AD, Okta, Workday, ServiceNow, Salesforce, GitHub Enterprise, SAP HR, …)
- Status: Failed / Retry / Pending / Manual
- Failure category, derived from `statusMessage` by a heuristic classifier:
  - Connector unreachable (`Connection timed out`, `Host unreachable`)
  - Auth failure (`401`, `Invalid credentials`, `expired token`)
  - Schema / attribute validation (`Invalid attribute`, `required field`, `enum mismatch`)
  - Role / policy precondition (`SoD violation`, `role requires`)
  - Form pending (`Provisioning form awaiting input`)
  - Manual work item (`Routed to manual fulfillment`)
  - Duplicate (`already exists`, `409`)
  - Other
- Age bucket: < 1h / 1–4h / 4–24h / > 24h
- Identity type: Active / Terminated
Failure grid (center):
| Time | Identity | App | Operation | Native identity | Category | Status | Retries | Message |
|---|---|---|---|---|---|---|---|---|
| 2026-05-11 08:42 | `jdoe` | AD - NA | Modify | `CN=jdoe,OU=…` | Connector unreachable | Retry (3/5) | 3 | Connection timed out after 30s |
Click a row → right-side drawer opens with:
- Full `statusMessage`
- The compiled provisioning plan diff (what was supposed to change)
- The parent `IdentityRequest` ID and `executionStatus`
- The parent workflow case ID
- Recovery actions: Retry now, Mark abandoned (reason required), Escalate to manual, Send to retry queue with backoff, Open in IIQ
Bulk select supports the same actions across many PTOs.
Recovery Audit Log (bottom):
Every action the operator takes is recorded in a chronological panel:
2026-05-11 09:14:02 mike.s retry-now PTO-8842 (Connector unreachable) reason: connector restored 08:58
2026-05-11 09:14:05 mike.s retry-now PTO-8841 (Connector unreachable) reason: connector restored 08:58
2026-05-11 09:18:30 mike.s mark-abandoned PTO-8801 (Schema validation) reason: app retired in CMDB
This is the Recover artifact — the defensible trail of what was triaged, restored, abandoned, and learned during an incident.
Why this matters
| Pain | Today | With this console |
|---|---|---|
| 9 AM stand-up: "What's broken in provisioning?" | Three browser tabs, manual cross-reference, gut-feel triage | One screen, KPI tiles, category heatmap |
| A connector outage drops 400 PTOs into `Retry` | Operator clicks each row in the debug page | One filter, multi-select, one "Retry now" |
| Auditor asks for evidence of how a Q1 outage was handled | Mailbox archaeology + screenshots | Recovery Audit Log export, signed by timestamp + operator |
| Failed Leaver leaves a terminated user with live access | Discovered weeks later, if at all | KPI tile: Terminated identities affected — visible from minute one |
| Same `statusMessage` recurs across applications, signaling a config drift | Hard to spot in raw log | Failure-category heatmap: drift becomes a contiguous column of red |
NIST CSF — Recover (primary)
- RC.RP-1 Recovery plan is executed during or after an event — the audit log is the executed plan.
- RC.IM-1 Recovery plans incorporate lessons learned — categorized failure data lets the team update connectors, retry policies, and role definitions.
- RC.IM-2 Recovery strategies are updated — the heatmap surfaces systemic vs. one-off failures.
- RC.CO-3 Recovery activities are communicated — bulk reason fields and the export feed comms templates.
Automation (secondary)
- Heuristic classifier reduces categorization clicks.
- Bulk actions reduce per-row clicks (400 PTOs → 1 action).
- Retry queue with backoff means the operator doesn't babysit transient outages.
What's in this folder
| File | Purpose |
|---|---|
| `index.html` | Single-page console: open in any modern browser, no build, no network. |
| `style.css` | Operator-console aesthetic: zinc-950 background, emerald accents, dense typography. |
| `script.js` | Renders the grid, KPIs, filters, drawer, and audit log from `sample-data.json`. Includes the failure-category classifier. |
| `sample-data.json` | 42 synthetic ProvisioningTransactions across 7 IIQ applications and 18 identities (3 terminated, 15 active). |
| `requirements.md` | Functional requirements, IIQ object model, out-of-scope. |
| `metadata.md` | Provenance, model, IIQ pain points targeted, NIST CSF mapping. |
| `cover-image.png` | 16:10 concept art (nanobanana). |
How to run
Double-click index.html. That's it. No server, no API key, no IIQ instance required — the data is synthetic and lives in sample-data.json.
If this were wired into a real IIQ 8.4 deployment, the data layer would be:
- Read ProvisioningTransaction objects via the SailPoint API: `GET /identityiq/rest/provisioningTransactions?filter=status.in("Failed","Retry","Pending")`
  - Optionally use the `iiq` search index for free-text search on `statusMessage`.
- Read parent IdentityRequest via `IdentityRequest.getId()` from the PTO.
- Write recovery actions back as:
  - Retry → call `Provisioner.retry(pto.getId())` from a custom Rule, or relaunch the parent workflow case.
  - Mark abandoned → set a custom `IIQ_RECOVERY_STATE = "abandoned"` on the PTO via `setAttribute`, then prune via the next `Provisioning Transaction Pruner` task.
  - Audit log → write to a dedicated `IIQRecoveryAuditLog` SailPointObject (custom class) or to the OOB `AuditEvent` table with `action = "provisioning_recovery_*"`.
The `examplerules.xml` and `lcmworkflows.xml` files under `Resources/IIQ_Repo_V8.4/WEB-INF/config/` show the BeanShell shapes for the rules and workflow steps that would back this UI.
Sources
- Local: `My-Library/Apps/IAM-Ideas/Resources/IIQ_Repo_V8.4/WEB-INF/config/init.xml` (PTO logging config), `lcmworkflows.xml` (LCM Provisioning workflow + retry loop), `workflowRules.xml`, `authorization.xml` (PTO authz scopes), `tasksCommon.xml`.
- Local: `My-Library/Apps/IAM-Ideas/Resources/IIQ_Documentation/8.4/identityiq-doc-8.4/` (8.4 subject PDFs, esp. Provisioning, Lifecycle Manager, Tasks).
- Web: `serper` MCP server was unavailable this run (HTTP 400 from `google.serper.dev`); regenerate next run for current SailPoint Compass / community discussion on PTO triage and recovery patterns.
Requirements
Requirements — IIQ Provisioning Recovery Console
Functional requirements
F1. KPI bar
- Display 5 tiles, computed from the loaded `ProvisioningTransaction` set:
  - Open: `count(status != "Success")`
  - Auto-retry eligible: `count(status == "Retry")`
  - Past SLA: `count(ageHours > 24 && status != "Success")` (SLA is 24h, fixed in the mock)
  - Hot app: `application` with the largest `count(status != "Success")`
  - Terminated identities affected: `count(distinct identityName where identity.inactive == true && status != "Success")`
- Each tile is keyboard-focusable and shows a sparkline of the last 7 days when present in the data.
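The five tile formulas above can be sketched as one pure function over the loaded array. This is a minimal sketch, not the contents of `script.js`: the function name `computeKpis`, the returned property names, and the injected `now` parameter are assumptions made for illustration, and the record fields follow the object-model table in this document.

```javascript
// Hypothetical sketch of the F1 KPI computation over an in-memory PTO array.
// SLA is fixed at 24h, matching the mock.
const SLA_HOURS = 24;
const MS_PER_HOUR = 60 * 60 * 1000;

function ageInHours(pto, now) {
  return (now.getTime() - new Date(pto.created).getTime()) / MS_PER_HOUR;
}

function computeKpis(ptos, now = new Date()) {
  // "Open" means any transaction whose status is not Success.
  const open = ptos.filter((p) => p.status !== "Success");

  // Count open failures per application to find the hot app.
  const perApp = new Map();
  for (const p of open) {
    perApp.set(p.application, (perApp.get(p.application) ?? 0) + 1);
  }
  let hotApp = null;
  let hotCount = 0;
  for (const [app, count] of perApp) {
    if (count > hotCount) {
      hotApp = app;
      hotCount = count;
    }
  }

  // Distinct identity names, so one terminated user with many failures counts once.
  const terminated = new Set(
    open.filter((p) => p.identityInactive).map((p) => p.identityName)
  );

  return {
    open: open.length,
    autoRetryEligible: open.filter((p) => p.status === "Retry").length,
    pastSla: open.filter((p) => ageInHours(p, now) > SLA_HOURS).length,
    hotApp,
    terminatedIdentitiesAffected: terminated.size,
  };
}
```

Passing `now` explicitly keeps the function deterministic, which makes the Past SLA tile easy to check against fixed sample data. Ties for the hot app resolve to whichever application was seen first; a real implementation might prefer a stable tiebreak such as alphabetical order.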
F2. Filter rail
- Application: multi-select chip list, sourced from `distinct application` in the data.
- Status: toggle row (Failed / Retry / Pending / Manual).
- Failure category: multi-select chip list, sourced from the classifier output (see F4).
- Age: 4 buckets (`<1h`, `1–4h`, `4–24h`, `>24h`).
- Identity type: Active / Terminated toggle.
- All filters AND together; "Reset filters" clears them.
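One way to read "all filters AND together" is that each filter is a set of selected values, an empty set means no constraint, and a row must satisfy every non-empty set. The sketch below assumes that reading. The names `ageBucket`, `matchesFilters`, `applyFilters`, the shape of the `filters` object, and a precomputed `category` field on each row are all hypothetical; the bucket boundaries (for example, exactly 24h falling in `4–24h`) are also an assumption, chosen to agree with the `ageHours > 24` rule in F1.

```javascript
// Hypothetical sketch of the F2 filter rail logic.
const HOUR_MS = 60 * 60 * 1000;

// Map an age in hours to one of the four bucket labels from F2.
function ageBucket(hours) {
  if (hours < 1) return "<1h";
  if (hours < 4) return "1–4h";
  if (hours <= 24) return "4–24h";
  return ">24h";
}

function matchesFilters(pto, filters, now) {
  const hours = (now.getTime() - new Date(pto.created).getTime()) / HOUR_MS;
  const checks = [
    [filters.applications, pto.application],
    [filters.statuses, pto.status],
    [filters.categories, pto.category],
    [filters.ageBuckets, ageBucket(hours)],
    [filters.identityTypes, pto.identityInactive ? "Terminated" : "Active"],
  ];
  // A missing or empty Set imposes no constraint; every non-empty Set must match.
  return checks.every(
    ([selected, value]) => !selected || selected.size === 0 || selected.has(value)
  );
}

function applyFilters(ptos, filters, now = new Date()) {
  return ptos.filter((p) => matchesFilters(p, filters, now));
}

// "Reset filters" is then just replacing the filter state with an empty object.
function resetFilters() {
  return {};
}
```

Within a single filter the selected values behave as OR (any selected application matches), while the filters combine as AND, which is the usual behavior for chip-style filter rails.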
F3. Grid
- Columns: Time (relative + absolute on hover), Identity (displayName + name), Application, Operation (Create / Modify / Delete / Enable / Disable / SetAttribute), Native identity (truncated, full on hover), Category (badge), Status (badge), Retries (`n/max`), Message (truncated 120 chars).
- Sortable by Time, Application, Status, Retries, Category.
- Row click opens the right-side drawer.
- Multi-select via checkbox + shift-range; multi-select reveals the bulk-action toolbar.
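The sorting and shift-range behavior described above can be kept independent of the DOM. The following is a sketch under assumptions: `sortRows`, `selectRange`, and the `SORT_KEYS` table are hypothetical names, and `category` is assumed to be a precomputed field on each row.

```javascript
// Hypothetical sketch of F3 sorting and shift-range selection.
// Each sortable column maps to a function that extracts a comparable value.
const SORT_KEYS = {
  time: (p) => new Date(p.created).getTime(),
  application: (p) => p.application,
  status: (p) => p.status,
  retries: (p) => p.retryCount,
  category: (p) => p.category,
};

function sortRows(rows, key, direction = "asc") {
  const pick = SORT_KEYS[key];
  if (!pick) throw new Error(`Unsupported sort key: ${key}`);
  const sign = direction === "desc" ? -1 : 1;
  // Copy first so the caller's array is left untouched.
  return [...rows].sort((a, b) => {
    const x = pick(a);
    const y = pick(b);
    if (x < y) return -sign;
    if (x > y) return sign;
    return 0;
  });
}

// Shift-click: add every visible row between the anchor and the clicked row
// (inclusive) to the current selection, regardless of click direction.
function selectRange(visibleIds, anchorId, clickedId, selected = new Set()) {
  const from = visibleIds.indexOf(anchorId);
  const to = visibleIds.indexOf(clickedId);
  const next = new Set(selected);
  if (from === -1 || to === -1) return next;
  const [lo, hi] = from <= to ? [from, to] : [to, from];
  for (const id of visibleIds.slice(lo, hi + 1)) next.add(id);
  return next;
}
```

`selectRange` works on the currently visible row order, so a range selected after sorting or filtering covers what the operator actually sees rather than the underlying data order.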
F4. Failure-category classifier
- Pure function `classifyMessage(statusMessage): Category` implemented client-side.
- Categories: `Connector unreachable`, `Auth failure`, `Schema/Attribute`, `Role/Policy`, `Form pending`, `Manual work item`, `Duplicate`, `Other`.
- Heuristic: lowercase scan + first-match wins. Patterns are illustrative, not exhaustive:
  - `Connector unreachable` ← `connection refused`, `timed out`, `unreachable`, `no route to host`
  - `Auth failure` ← `401`, `403`, `invalid credentials`, `unauthorized`, `expired token`
  - `Schema/Attribute` ← `invalid attribute`, `required field`, `not a valid enum`, `must match`
  - `Role/Policy` ← `sod violation`, `policy violation`, `requires role`, `precondition`
  - `Form pending` ← `provisioning form`, `awaiting input`
  - `Manual work item` ← `manual fulfillment`, `routed to`, `work item created`
  - `Duplicate` ← `already exists`, `409`, `duplicate`
  - else → `Other`.
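The lowercase scan with first-match-wins can be sketched as an ordered pattern table plus a short loop. The function name `classifyMessage` and the patterns come from the requirement above; the table name `CATEGORY_PATTERNS` and the handling of `null` or `undefined` input are assumptions.

```javascript
// Sketch of the F4 classifier. Order matters: the first category whose
// pattern appears in the lowercased message wins.
const CATEGORY_PATTERNS = [
  ["Connector unreachable", ["connection refused", "timed out", "unreachable", "no route to host"]],
  ["Auth failure", ["401", "403", "invalid credentials", "unauthorized", "expired token"]],
  ["Schema/Attribute", ["invalid attribute", "required field", "not a valid enum", "must match"]],
  ["Role/Policy", ["sod violation", "policy violation", "requires role", "precondition"]],
  ["Form pending", ["provisioning form", "awaiting input"]],
  ["Manual work item", ["manual fulfillment", "routed to", "work item created"]],
  ["Duplicate", ["already exists", "409", "duplicate"]],
];

function classifyMessage(statusMessage) {
  // Treat a missing message as empty text so the function never throws.
  const text = String(statusMessage ?? "").toLowerCase();
  for (const [category, patterns] of CATEGORY_PATTERNS) {
    if (patterns.some((pattern) => text.includes(pattern))) return category;
  }
  return "Other";
}
```

Because matching is plain substring search, numeric patterns such as `401` or `409` can also match unrelated digits inside an ID or a port number. If that turns out to be a problem on real messages, word-boundary regular expressions would be a natural refinement.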
F5. Detail drawer
- Triggered by row click; closeable via Esc or close button.
- Shows: full `statusMessage`, the provisioning plan diff (synthetic, displayed as a 3-line JSON-ish change set), `IdentityRequest.id` + `executionStatus`, parent workflow case ID, retry history (n entries with timestamps), recovery actions (5 buttons, see F6).
F6. Recovery actions (single + bulk)
- Retry now: sets `status = "Retry"`, increments `retryCount`, appends an audit row.
- Mark abandoned: opens a modal requiring a non-empty reason; on confirm sets a virtual `IIQ_RECOVERY_STATE = "abandoned"` and removes the row from the open grid.
- Escalate to manual: sets `status = "Manual"`, audit row captures the assignee (defaults to current user).
- Send to retry queue with backoff: sets `status = "Retry"` but suppresses the row for 1h (visual only) and audits the deferral.
- Open in IIQ: visual no-op in the mock; in production would deep-link to `/identityiq/identityRequest.jsf?id=<id>`.
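The four state-changing actions above can be modeled as pure transitions that return an updated record plus an audit entry, with the bulk variant mapping the same transition over the selected IDs. This is a sketch under assumptions: the action keys `retry-now` and `mark-abandoned` mirror the sample audit log, while `escalate-manual`, `retry-backoff`, the `assignee` and `suppressedUntil` fields, and the function names are hypothetical.

```javascript
// Hypothetical sketch of the F6 recovery actions as pure state transitions.
const BACKOFF_MS = 60 * 60 * 1000; // the mock suppresses the row for 1h

function applyRecoveryAction(pto, action, { operator, reason = "", now = new Date() } = {}) {
  const next = { ...pto }; // never mutate the caller's record
  switch (action) {
    case "retry-now":
      next.status = "Retry";
      next.retryCount = (pto.retryCount ?? 0) + 1;
      break;
    case "mark-abandoned":
      if (!reason.trim()) throw new Error("mark-abandoned requires a non-empty reason");
      next.IIQ_RECOVERY_STATE = "abandoned"; // virtual flag, as in the mock
      break;
    case "escalate-manual":
      next.status = "Manual";
      next.assignee = operator; // defaults to the current user
      break;
    case "retry-backoff":
      next.status = "Retry";
      next.suppressedUntil = new Date(now.getTime() + BACKOFF_MS).toISOString();
      break;
    default:
      throw new Error(`Unknown recovery action: ${action}`);
  }
  const audit = {
    timestamp: now.toISOString(),
    operator,
    action,
    ptoId: pto.id,
    category: pto.category,
    reason,
  };
  return { pto: next, audit };
}

// Bulk variant: apply one action to every selected PTO and collect the audit rows.
function applyBulkAction(ptos, selectedIds, action, options) {
  const audits = [];
  const updated = ptos.map((p) => {
    if (!selectedIds.has(p.id)) return p;
    const result = applyRecoveryAction(p, action, options);
    audits.push(result.audit);
    return result.pto;
  });
  return { ptos: updated, audits };
}
```

The open grid would then hide rows where `IIQ_RECOVERY_STATE === "abandoned"` or where `suppressedUntil` is still in the future. Keeping the transitions pure means the single-row drawer and the bulk toolbar can share one code path.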
F7. Recovery Audit Log
- Bottom panel, scrollable, always visible.
- Each entry: `timestamp operator action pto-id category reason`.
- "Export CSV" button downloads the visible log.
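The "Export CSV" step can be sketched as a small serializer over the visible audit entries. The column order follows the entry format above; the function names, the entry property names (such as `ptoId`), and the choice of RFC 4180 style quoting are assumptions.

```javascript
// Hypothetical sketch of the F7 CSV export.
// Each pair is [CSV header, property name on the audit entry].
const AUDIT_COLUMNS = [
  ["timestamp", "timestamp"],
  ["operator", "operator"],
  ["action", "action"],
  ["pto-id", "ptoId"],
  ["category", "category"],
  ["reason", "reason"],
];

// Quote a cell only when it contains a comma, quote, or line break;
// embedded quotes are doubled.
function csvCell(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function auditLogToCsv(entries) {
  const lines = [AUDIT_COLUMNS.map(([header]) => header).join(",")];
  for (const entry of entries) {
    lines.push(AUDIT_COLUMNS.map(([, prop]) => csvCell(entry[prop])).join(","));
  }
  return lines.join("\r\n");
}
```

In the browser, the resulting string could be wrapped in a `Blob` and offered through a temporary link created with `URL.createObjectURL`, which needs no server and so stays within N1. Free-text reasons are the field most likely to contain commas or quotes, which is why the quoting step matters here.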
F8. Theme
- Dark by default (zinc-950 / emerald). Light toggle in the header (`zinc-50` / forest-green).
- Respects `prefers-color-scheme` on first load.
Non-functional requirements
- N1. Opens directly from disk — no server, no API, no network calls.
- N2. No build step. Plain HTML + CSS + JS, ES2020. No framework, no transpiler.
- N3. No external fonts or assets at runtime (uses system font stack).
- N4. Renders ≥ 500 PTOs at 60fps on a 2020-era laptop (current sample is 42).
- N5. All identity data is synthetic; no resemblance to real persons is intended.
- N6. Accessible: keyboard-navigable filters and grid; ARIA labels on KPIs and badges.
IIQ object model assumptions
| Object | Field | Shape |
|---|---|---|
| `ProvisioningTransaction` | `id` | string (e.g. `PTO-08842`) |
| | `operation` | enum: `Create` \| `Modify` \| `Delete` \| `Enable` \| `Disable` \| `SetAttribute` |
| | `status` | enum: `Failed` \| `Retry` \| `Pending` \| `Manual` |
| | `statusMessage` | free text from the connector |
| | `application` | string app name |
| | `identityName` | string (the IIQ `Identity` name) |
| | `identityDisplayName` | string |
| | `identityInactive` | boolean (joined from Identity for the KPI) |
| | `nativeIdentity` | string (DN, sAMAccountName, email, etc.) |
| | `retryCount` | int |
| | `maxRetries` | int (config; mock uses 5) |
| | `created` | ISO-8601 |
| | `identityRequestId` | string (e.g. `IR-12340`) |
| | `executionStatus` | enum: `Verifying` \| `Executing` \| `Terminated` \| `Completed` |
| | `workflowCaseId` | string (e.g. `WC-87654`) |
| | `planDiff` | array of `{op, attr, from, to}` |
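To make the table concrete, here is what one record of that shape might look like, along with a small shape check. This is illustrative only: the display name and the `planDiff` values are made up, the truncated `nativeIdentity` is kept exactly as it appears in the grid example, and `validatePto` with its lookup tables is a hypothetical helper rather than part of `script.js`.

```javascript
// Illustrative record matching the object-model table (values are synthetic).
const examplePto = {
  id: "PTO-08842",
  operation: "Modify",
  status: "Retry",
  statusMessage: "Connection timed out after 30s",
  application: "AD - NA",
  identityName: "jdoe",
  identityDisplayName: "J. Doe", // made-up display name
  identityInactive: false,
  nativeIdentity: "CN=jdoe,OU=…",
  retryCount: 3,
  maxRetries: 5,
  created: "2026-05-11T08:42:00Z",
  identityRequestId: "IR-12340",
  executionStatus: "Executing",
  workflowCaseId: "WC-87654",
  planDiff: [{ op: "Set", attr: "department", from: "Sales", to: "Marketing" }], // made-up diff
};

const PTO_ENUMS = {
  operation: ["Create", "Modify", "Delete", "Enable", "Disable", "SetAttribute"],
  status: ["Failed", "Retry", "Pending", "Manual"],
  executionStatus: ["Verifying", "Executing", "Terminated", "Completed"],
};

const PTO_FIELD_TYPES = {
  id: "string", statusMessage: "string", application: "string",
  identityName: "string", identityDisplayName: "string",
  identityInactive: "boolean", nativeIdentity: "string",
  retryCount: "number", maxRetries: "number",
  identityRequestId: "string", workflowCaseId: "string",
};

// Returns a list of human-readable problems; an empty list means the shape is fine.
function validatePto(record) {
  const problems = [];
  for (const [field, type] of Object.entries(PTO_FIELD_TYPES)) {
    if (typeof record[field] !== type) problems.push(`${field}: expected ${type}`);
  }
  for (const [field, allowed] of Object.entries(PTO_ENUMS)) {
    if (!allowed.includes(record[field])) problems.push(`${field}: unexpected value ${record[field]}`);
  }
  if (Number.isNaN(Date.parse(record.created))) problems.push("created: not a parseable timestamp");
  if (!Array.isArray(record.planDiff)) problems.push("planDiff: expected an array");
  return problems;
}
```

Running a check like this once when `sample-data.json` is loaded would catch malformed synthetic rows early. Note that the `status` enum in the table does not list `Success`, so this validator flags it; if resolved transactions are ever loaded alongside open ones, the enum would need widening.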
Out of scope
- Live SailPoint API calls.
- Persistence beyond page lifetime (audit log is in-memory; refreshing the page resets).
- Multi-tenant scoping / SPRight enforcement.
- Mobile / narrow-viewport layout (operator console assumes ≥ 1280px wide).