IWRI Methodology

Last updated: 7 May 2026

The Invisible Work Risk Index (IWRI) is a screening tool. It analyses sprint and issue data already in Jira to surface Scrum teams whose workflow patterns suggest invisible work, and it points you to where to start a conversation: which team, which sprint, which signal. The goal is to help you ask better questions about effort that is not showing up in the numbers.

IWRI rests on three principles:

Relative, not absolute. Every score is measured against the population on your own site, not an industry average.
Screening, not diagnosis. A flag indicates a pattern worth investigating. It is not a verdict.
Conversation starter. Flags are designed to start conversations, not drive performance decisions.

1. Where the seven signals came from

Every team does work that never makes it into Jira: mentoring juniors, responding to incidents, fixing things before they become tickets, answering questions in Slack. That “invisible work” drains capacity, skews estimates, and burns people out. The research question was: can you detect it from the data teams already have?

We studied hundreds of Scrum teams across multiple organisations in banking, telecommunications, and other sectors. Each team rated their own volume of invisible work through structured surveys. Team leads were then interviewed to map what kinds of invisible work they experienced and where it showed up in their workflow. Over 20 candidate indicators (patterns in sprint data, ticket behaviour, and activity logs) were tested against this self-reported data. Indicators that did not show a consistent, repeatable pattern were removed.

Seven indicators survived. Each captures a different facet of how invisible work manifests in team data, and each has a documented split-half reliability score between 53 and 86 percent.

2. Pipeline at a glance

IWRI transforms raw Jira data into risk signals through a six-stage pipeline. Each stage builds on the previous, ensuring that the final risk assessment is grounded in measurable, standardised data rather than subjective judgement.

Discover projects on the Jira site.
Apply eligibility filters (see the Eligibility page).
Pull sprint and issue data for each eligible project.
Compute the seven raw indicators per team.
Standardise each indicator into a z-score against the site population.
Translate z-scores into flags and risk levels.

3. What we analyse

IWRI examines completed sprint data for eligible projects. The analysis window and filters ensure statistical relevance while capturing recent team behaviour.

Parameter	Value
Sprint window	Last 8 completed sprints within 90 days
Minimum sprint duration	7 days per sprint
Issue types	Story, Task, Bug (not sub-tasks or epics)
Commitment window	2 calendar days from sprint start
Population minimum	20 eligible projects required for scoring
Team size	3 or more distinct assignees
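
The sprint-level rows of this table can be read as a simple filter. The sketch below is illustrative only: the `Sprint` record and `select_sprints` function are hypothetical names, and it assumes that "within 90 days" refers to the sprint end date and that the 8 most recent qualifying sprints are kept.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Sprint:
    """Hypothetical sprint record; field names are assumptions, not the Jira schema."""
    name: str
    start: date
    end: date
    completed: bool


def select_sprints(sprints, today, max_sprints=8, window_days=90, min_duration_days=7):
    """Keep completed sprints of at least 7 days that ended within the window,
    then return the most recent ones, newest first."""
    cutoff = today - timedelta(days=window_days)
    eligible = [
        s for s in sprints
        if s.completed
        and s.end >= cutoff  # assumption: the window is measured on the end date
        and (s.end - s.start).days >= min_duration_days
    ]
    eligible.sort(key=lambda s: s.end, reverse=True)
    return eligible[:max_sprints]
```

Project-level criteria such as team size and the population minimum are not covered by this sketch.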

4. Eligibility

Only projects that meet every eligibility criterion are scored. Projects that fail one or more are excluded from the scored population. They still appear in filter dropdowns (greyed out, marked “Excluded”) but do not contribute to or receive scores.

The full criteria set, including the population minimum of 20 eligible projects, is documented on the Eligibility page.

5. The seven indicators

Each indicator measures a specific dimension of team workflow behaviour. The first three are averaged into a composite called Completing Less Than Planned. The remaining four are assessed independently, and each powers its own flag. In the formulas, CV denotes the coefficient of variation (standard deviation divided by the mean).

Sprint-delivery indicators (feed Completing Less Than Planned)

Indicator	What it measures	Formula
Velocity Stability	How much more a team’s completed work varies than its total demand does. A high value means velocity swings more than demand, a sign of capacity instability.	`CV(velocity) − CV(demand)`
Commitment Gaps	The average shortfall between what a team commits to and what they deliver. Over-delivery is clamped to zero; this only measures under-delivery.	`mean( max(0, (committed − completed) / committed) )`
Work Left Undone	The proportion of committed work that was never started by the time the sprint closed. A high value suggests overcommitment or displacement by unplanned work.	`mean( notStartedCount / committedCount )`
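
As a rough illustration, the three sprint-delivery formulas can be written as below. This is a sketch under stated assumptions, not the product's implementation: inputs are hypothetical per-sprint lists, the standard deviation inside CV is taken as the population form, and sprints with zero committed work are skipped to avoid division by zero.

```python
from statistics import mean, pstdev


def cv(values):
    """Coefficient of variation: standard deviation divided by the mean.
    Assumption: population standard deviation; returns 0.0 when the mean is 0."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


def velocity_stability(velocity, demand):
    """CV(velocity) - CV(demand), one value per sprint in each list."""
    return cv(velocity) - cv(demand)


def commitment_gaps(committed, completed):
    """Mean per-sprint shortfall; over-delivery is clamped to zero.
    Raises statistics.StatisticsError if no sprint has committed work."""
    return mean(
        max(0.0, (c - d) / c)
        for c, d in zip(committed, completed)
        if c > 0  # assumption: sprints with nothing committed are skipped
    )


def work_left_undone(not_started_count, committed_count):
    """Mean share of committed issues never started by sprint close."""
    return mean(
        n / c
        for n, c in zip(not_started_count, committed_count)
        if c > 0  # assumption: sprints with nothing committed are skipped
    )
```

For example, a team that committed 10 and 10 and completed 8 and 12 has per-sprint gaps of 0.2 and 0, so its Commitment Gaps value is 0.1.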

Independent signals (each fires its own flag)

Indicator	What it measures	Formula
Tracking per Person	Issues tracked per person per week. Teams tracking fewer items per developer than peers may have significant work happening outside Jira. Sign-flipped: low values indicate higher risk.	`mean( totalIssues / teamSize / sprintWeeks )`
Daily Jira Activity	The proportion of working days with zero Jira activity. Higher values suggest potential disengagement or work happening elsewhere.	`mean( zeroActivityDays / totalWorkingDays )`
Ticket Descriptions	The proportion of completed tickets with empty descriptions. Sparse descriptions often indicate rushed creation or verbal-only requirements.	`mean( emptyDescriptionCount / completedCount )`
Mid-Sprint Changes	How unpredictable the volume of mid-sprint additions is across sprints. High volatility suggests inconsistent sources of interruption that destabilise planning.	`CV( midSprintAdditions )`
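
The independent signals follow the same pattern. The sketch below covers three of them, again with hypothetical per-sprint input lists and the assumption that CV uses the population standard deviation; how the underlying counts are extracted from Jira is not shown.

```python
from statistics import mean, pstdev


def tracking_per_person(total_issues, team_size, sprint_weeks):
    """Mean issues tracked per person per week, one entry per sprint in each list."""
    return mean(
        issues / size / weeks
        for issues, size, weeks in zip(total_issues, team_size, sprint_weeks)
    )


def daily_jira_activity(zero_activity_days, total_working_days):
    """Mean share of working days with no Jira activity."""
    return mean(z / t for z, t in zip(zero_activity_days, total_working_days))


def mid_sprint_changes(mid_sprint_additions):
    """CV of the number of issues added mid-sprint.
    Assumption: returns 0.0 when no issues were ever added mid-sprint."""
    m = mean(mid_sprint_additions)
    return pstdev(mid_sprint_additions) / m if m else 0.0
```

Ticket Descriptions has the same shape as `daily_jira_activity`, with empty-description and completed-ticket counts as inputs.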

6. Z-score standardisation

Raw indicator values are measured in different units: percentages, ratios, coefficients of variation. Z-score standardisation makes them comparable by expressing each value as standard deviations from the population mean.

z = (raw_value − population_mean) / population_stddev

A z-score of 0 means the team sits exactly at the population mean. A z-score of 1.5 means the team is 1.5 standard deviations above the population mean, in the direction of higher risk.

Band	Range	Interpretation
Normal	`z ≤ 0`	At or below the population average. No flag.
Low concern	`0 < z ≤ 1`	Above average but below the flag threshold. Informational only.
Moderate	`1 < z ≤ 2`	Flag fires. The team is meaningfully above the population average.
High	`z > 2`	Flag fires. The team is in the top few percent on this indicator.

Note on Tracking per Person: this is the only indicator whose z-score is sign-flipped. Fewer tracked issues per person means higher risk (work may be happening outside Jira), so the z-score is multiplied by −1 to maintain the convention that positive z = higher risk.
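
A minimal sketch of the standardisation and banding described in this section follows. The function names and the dict-of-teams input are illustrative assumptions, as is the choice to return 0.0 for every team when the population has no spread.

```python
from statistics import mean, pstdev


def z_scores(raw_by_team, sign_flipped=False):
    """Standardise one indicator across the scored population.
    Set sign_flipped=True for Tracking per Person so that positive z = higher risk."""
    values = list(raw_by_team.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # Assumption: with no variation, every team is treated as average.
        return {team: 0.0 for team in raw_by_team}
    sign = -1.0 if sign_flipped else 1.0
    return {team: sign * (value - mu) / sigma for team, value in raw_by_team.items()}


def band(z):
    """Map a z-score to the bands in the table above."""
    if z <= 0:
        return "Normal"
    if z <= 1:
        return "Low concern"
    if z <= 2:
        return "Moderate"
    return "High"
```

With raw values of 1, 2, and 3 for three teams, the lowest team gets a z-score of about −1.22. With `sign_flipped=True` the same team gets about +1.22 and falls in the Moderate band.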

7. How Completing Less Than Planned is computed

Completing Less Than Planned is the only signal built from a composite of three z-scores: Velocity Stability, Commitment Gaps, and Work Left Undone. The three are averaged with equal weight. The same value serves two purposes: it appears as an indicator on the team drill-down, and it fires the Capacity Drain flag when it exceeds 1.0.

CompletingLessThanPlanned = mean( z_velocityStability, z_commitmentGaps, z_workLeftUndone )

Partial computation: if only one or two of the three components are available (for example, due to insufficient sprint data for one indicator), Completing Less Than Planned is computed from the available components. The UI shows “N of 3 components” so you can interpret accordingly.
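
The equal-weight average with partial computation might look like the sketch below. Representing a missing component as `None`, and returning `None` when no component is available, are assumptions made for illustration.

```python
from statistics import mean


def completing_less_than_planned(
    z_velocity_stability=None,
    z_commitment_gaps=None,
    z_work_left_undone=None,
):
    """Average the available component z-scores with equal weight.
    Returns (score, components_used); score is None if nothing is available."""
    components = [
        z
        for z in (z_velocity_stability, z_commitment_gaps, z_work_left_undone)
        if z is not None
    ]
    if not components:
        return None, 0
    return mean(components), len(components)
```

Calling it with two components, for example `completing_less_than_planned(1.2, None, 0.8)`, yields a score of 1.0 from 2 of 3 components. That score does not exceed the 1.0 flag threshold, so no flag would fire.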

8. The five risk flags

Flags are binary signals derived from z-scores. A flag is activated when the relevant z-score exceeds 1.0, meaning the team is more than one standard deviation above the population average for that risk dimension (roughly the 84th percentile if the indicator is approximately normally distributed). The threshold is deliberately strict: requiring a team to be well above the mean, not just above it, ensures that multiple converging flags produce a meaningful risk signal rather than noise.

Flag	Trigger	What it suggests
Capacity Drain	Completing Less Than Planned > 1	The team consistently completes less than planned, through some combination of unstable velocity, missed commitments, and planned work left unstarted.
Low Ticket Volume per Person	z(Tracking per Person) > 1 (sign-flipped)	The team tracks fewer items per developer than peers, suggesting significant work may be happening outside Jira.
Low Day-to-Day Jira Activity	z(Daily Jira Activity) > 1	The team has more zero-activity days than peers. May indicate disengagement, burnout, or work in other tools.
Tickets Lacking Descriptions	z(Ticket Descriptions) > 1	A higher proportion of completed tickets lack descriptions, suggesting rushed creation or verbal-only requirements.
Unpredictable Mid-Sprint Changes	z(Mid-Sprint Changes) > 1	The volume of unplanned work added mid-sprint varies unpredictably, undermining sprint planning.
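
The trigger column can be expressed as a lookup plus a threshold check. In this sketch the dictionary keys are hypothetical identifiers, the Tracking per Person z-score is assumed to be sign-flipped already, and a signal that could not be computed is assumed not to fire.

```python
FLAG_THRESHOLD = 1.0

# Flag name -> key of the z-score that drives it (keys are illustrative).
FLAG_SOURCES = {
    "Capacity Drain": "completing_less_than_planned",
    "Low Ticket Volume per Person": "tracking_per_person",  # already sign-flipped
    "Low Day-to-Day Jira Activity": "daily_jira_activity",
    "Tickets Lacking Descriptions": "ticket_descriptions",
    "Unpredictable Mid-Sprint Changes": "mid_sprint_changes",
}


def active_flags(z_by_signal):
    """Return the names of flags whose z-score strictly exceeds the threshold.
    Missing or None values never fire a flag (an assumption of this sketch)."""
    fired = []
    for flag, key in FLAG_SOURCES.items():
        z = z_by_signal.get(key)
        if z is not None and z > FLAG_THRESHOLD:
            fired.append(flag)
    return fired
```

Note the strict comparison: a z-score of exactly 1.0 falls in the Low concern band and does not fire a flag.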

9. Risk classification

Risk level is determined solely by the number of active flags. The more flags triggered simultaneously, the less likely all of them are caused by benign confounders. Converging signals increase confidence that invisible work patterns are present.

Active flags	Risk level
0	Healthy
1	Watch
2	Elevated
3	Concerning
4	At Risk
5	High Risk
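
Because the risk level depends only on the flag count, the mapping reduces to a list lookup. The function name below is an illustrative assumption.

```python
RISK_LEVELS = ["Healthy", "Watch", "Elevated", "Concerning", "At Risk", "High Risk"]


def risk_level(active_flag_count):
    """Map the number of active flags (0 to 5) to a risk level."""
    if not 0 <= active_flag_count < len(RISK_LEVELS):
        raise ValueError("active_flag_count must be between 0 and 5")
    return RISK_LEVELS[active_flag_count]
```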

10. Known limitations

IWRI is designed for transparency. These are the known limitations and assumptions built into the methodology.

Limitation	What it means
Screening, not diagnosing	Every indicator has alternative explanations. A flag indicates a pattern worth investigating. It does not confirm invisible work is present.
Relative, not absolute	Z-scores are relative within your scored population. A team with z = 1.5 is above average for your organisation, not the industry. Cross-organisation comparison is not supported.
Adapted teams may be missed	Teams that have already accommodated invisible work into their practices (for example by inflating estimates) may not trigger flags, because their patterns appear “normal” within the population.
Persistence, not prediction	Split-half reliability shows that patterns persist across sprint halves. It does not establish predictive or construct validity.
Uniform sizing assumption	Issue-count-based indicators assume approximately uniform issue sizing. Teams with highly variable story point distributions may see skewed results.
Team-size confound	Daily Jira Activity correlates with team size. Smaller teams naturally have more zero-activity days. This known confound is flagged in the UI but not adjusted for.
Internal comparison only	IWRI scores are meaningful within a single organisation’s scored population. Comparing scores across different organisations or Jira instances is not valid.