Predicting Employee Turnover: What the Research Actually Shows


You can inspect every tree and still misunderstand the forest.  

Much of today’s employee turnover prediction does exactly that. By focusing on individual flight-risk scores, organizations attempt to identify who leaves next. The assumption: put together enough data about a person and the answer is revealed. 

It isn’t. 

In two independent studies conducted with the Pattern Recognition Lab at TU Delft and the Department of Business Analytics at the University of Amsterdam, we rigorously tested how well individual employee turnover can be predicted using only core HRIS data—the kind found in systems like Workday, Oracle, or SAP SuccessFactors. 

The conclusion was clear: with core HR data alone, it is not possible to produce reliable individual-level flight risk predictions. 

This doesn’t mean turnover can’t be anticipated. It means that individual flight-risk scoring promises more precision than the data can support. So what to do? Stop staring at trees and step back to look at the forest. 

The Allure of Attrition Prediction 

Few workforce outcomes are as visible, disruptive, and closely scrutinized as employees leaving, which is why the pressure to predict them persists.  

Depending on role complexity, replacing an employee can cost anywhere between 40% and 200% of their annual salary, according to a Gallup estimate. 

Say you run a 5,000-employee company, with a 15% voluntary turnover rate (about 750 people leave each year). If the average fully-loaded cost per employee is $75k, replacing those who quit could cost you roughly $42 million annually (assuming ~75% of salary as the replacement cost factor). 
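That back-of-envelope math can be checked directly. The figures come from the example above; the 75% replacement-cost factor is an assumption sitting inside Gallup's 40%-200% range:

```python
# Back-of-envelope turnover cost estimate (illustrative figures from the text).
headcount = 5_000
voluntary_turnover_rate = 0.15   # 15% voluntary turnover per year
avg_fully_loaded_cost = 75_000   # USD per employee
replacement_cost_factor = 0.75   # assumed; Gallup's range is 40%-200% of salary

leavers_per_year = headcount * voluntary_turnover_rate  # 750 people
annual_replacement_cost = leavers_per_year * avg_fully_loaded_cost * replacement_cost_factor

print(f"{leavers_per_year:.0f} leavers -> ${annual_replacement_cost / 1e6:.1f}M per year")
```

With these inputs the estimate lands at roughly $42 million per year, matching the example; moving the replacement-cost factor anywhere in Gallup's range scales the figure proportionally.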

And this math isn’t the whole of it. When productive employees leave, organizations also lose output, continuity, team stability, and institutional knowledge. 

At the same time, there is a stronger-than-ever promise: analytics capabilities, increasingly powered by AI, have expanded. Leaders are accustomed to predictive dashboards in finance and operations, so it feels reasonable to expect similar foresight in HR. 

If every employee had a reliable flight-risk score, paired with a clear estimate of business impact, workforce planning would become more targeted. Managers would know where to focus, HR could prioritize interventions, and resources could be allocated more precisely. 

We don’t need to look far to grasp the appeal of employee turnover prediction: if you could identify risk early, you could intervene early. The real question is whether individual exits can be predicted reliably enough to support decisions. 

What the Research Actually Showed 

In our first study with TU Delft, we tested how far individual employee turnover prediction can go using only monthly HRIS data from a large employer. We engineered dozens of structural features and evaluated multiple modeling approaches using precision-recall metrics rather than headline accuracy. 

Even with substantial feature engineering and careful validation, the models hit a ceiling: the best-performing turnover prediction models achieved only modest precision. 

When measured through largely static HRIS attributes, future leavers and stayers are statistically difficult to distinguish. Core fields change infrequently, while exits can occur at any time. The data simply does not move fast enough to support reliable individual separation. 

In our second study with the University of Amsterdam, we shifted the question. Instead of asking who will leave, we forecast turnover rates across meaningful segments, defined by location, function, tenure band, and similar attributes. At that level, signal stabilized and insights became materially more useful. 

This pattern mirrors broader academic findings. A comprehensive meta-analysis by Rubenstein and colleagues identified 17 validated predictors of voluntary turnover. Only two of those typically exist in core HCM systems. The strongest predictors—intent to quit, job satisfaction, commitment, job search behaviour—live in engagement and climate data, not in static payroll tables. 

[Chart: 17 Validated Predictors of Voluntary Turnover, ranked by effect size (ρ) within variable category. Source: Rubenstein, Eberly, Lee & Mitchell (2017).]

How to read the correlation (ρ): The correlation coefficient shows how strongly each factor relates to turnover. A negative value means the higher the score, the lower the turnover — a protective factor. A positive value means the opposite — a risk factor.

For example, Other satisfaction (ρ = −0.43) — which captures how satisfied employees are with their career and life overall — is one of the strongest protective factors found: employees who score high on this are meaningfully less likely to quit. By contrast, Work-life conflict (ρ = +0.19) works in the opposite direction: the more employees feel their work interferes with their personal life, the more likely they are to leave.

All effects shown are statistically significant and corrected for measurement error. Effect sizes were derived from a meta-analysis of 316 studies covering over 1,800 effect sizes. Source: Rubenstein, Eberly, Lee & Mitchell (2017).

Why Individual Turnover Prediction Crumbles 

As the studies showed, individual-level turnover prediction is constrained less by modeling technique and more by the structure of the data itself. 

Sparse and Slow-Moving Data 

HR datasets can appear large in volume. Yes, there are thousands of employees across multiple years, generating millions of data points.  

But most of the data points change infrequently. For example, salary and grade updates typically happen once a year, and engagement data are collected once a quarter. At the same time, employees can leave on any given day.  

Over time, this creates relatively static histories with limited variation at the individual level. And machine learning models require meaningful variation to learn predictive patterns. When features remain flat while outcomes occur unpredictably, signal strength is inherently constrained. 

Imbalance Between Leavers and Stayers 

Even in organizations with significant turnover, far more employees stay than leave in any given period. This imbalance makes individual-level prediction statistically challenging, as the models learn from a relatively small number of leavers, which limits precision at an individual level. A model may correctly predict many stayers and appear strong on overall accuracy, yet still fail to reliably identify actual leavers. 

The Accuracy Illusion 

Accuracy is a seductive but misleading metric in employee turnover prediction. 

If 90% of employees stay, a model that predicts “no one will leave” achieves 90% accuracy while providing zero actionable insight. This is why serious evaluation of employee turnover prediction must focus on precision, recall, and out-of-sample validation rather than headline accuracy. 

When evaluated properly using precision-recall curves and out-of-sample testing, individual-level attrition prediction based on core HR data rarely achieves precision strong enough to justify high-stakes, person-specific decisions. 

Why accuracy can be misleading

To see why accuracy can be misleading in employee turnover prediction, consider a simplified example. 

Imagine a company with 1,000 employees and a 10% annual voluntary turnover rate. That means 100 employees leave in a year. 

Now imagine a turnover prediction model that flags 40 employees as “high risk.” 

  • Of those 40, only 10 actually leave. 
  • The other 30 stay. 
  • Meanwhile, the model fails to flag 90 of the 100 employees who do resign. 

In this scenario: 

  • Precision = 10 correct predictions out of 40 flagged = 25%. 
  • Recall = 10 identified out of 100 actual leavers = 10%. 

That model would still report roughly 88% accuracy, while failing to identify most future leavers. 
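The arithmetic behind this worked example can be verified from the confusion matrix it implies:

```python
# Confusion-matrix arithmetic for the worked example above.
total = 1_000            # employees
actual_leavers = 100     # 10% annual voluntary turnover
flagged = 40             # employees the model marks "high risk"

tp = 10                        # flagged and actually left
fp = flagged - tp              # 30 flagged but stayed
fn = actual_leavers - tp       # 90 leavers the model missed
tn = total - tp - fp - fn      # 870 stayers correctly left unflagged

precision = tp / (tp + fp)     # 10 / 40  = 0.25
recall = tp / (tp + fn)        # 10 / 100 = 0.10
accuracy = (tp + tn) / total   # 880 / 1000 = 0.88

print(f"precision={precision:.0%}, recall={recall:.0%}, accuracy={accuracy:.0%}")
```

High accuracy here is almost entirely driven by the 870 correctly unflagged stayers, which is exactly why accuracy alone hides the model's failure on leavers.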

Prediction Is a Governance Question 

Because these models influence real workforce decisions, they cannot be treated as abstract analytics exercises. If organizations choose to use individual flight-risk scores, minimum governance guardrails should include: 

Validation on future time periods. 
Test the model on data it has never seen before — ideally on later time periods, not just reshuffled historical splits. 

Evaluation using precision and recall, not accuracy alone. 
Accuracy can look strong even when a model fails to identify most future leavers. Precision and recall provide a more honest assessment of usefulness. 

Monitoring for performance drift. 
Workforce conditions change. Model performance must be reviewed regularly to ensure it does not degrade over time. 

Bias audits across demographic groups. 
Assess whether certain groups are disproportionately flagged due to structural patterns in historical data. 

Clear decision policies and human oversight. 
Define explicitly how predictions may inform decisions — and where they may not. Scores should support managerial judgment, not automate action. 
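As a minimal sketch of the first guardrail, validating on future time periods means holding out the most recent snapshots rather than reshuffling history. The record layout below is hypothetical:

```python
from datetime import date

# Hypothetical monthly snapshots: (employee_id, snapshot_month, left_within_next_quarter)
records = [
    ("e1", date(2023, 1, 1), 0),
    ("e2", date(2023, 6, 1), 1),
    ("e3", date(2024, 2, 1), 0),
    ("e4", date(2024, 5, 1), 1),
]

cutoff = date(2024, 1, 1)

# Train only on snapshots strictly before the cutoff; evaluate on later periods.
# A random shuffle would leak future information into training.
train = [r for r in records if r[1] < cutoff]
test = [r for r in records if r[1] >= cutoff]

print(f"train={len(train)} snapshots, test={len(test)} snapshots")
```

The same temporal cutoff should also separate feature construction from the outcome window, so nothing known only after the cutoff leaks into the training features.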

Why does this matter? 

Because weak signals, when presented as confident scores, can easily be over-interpreted. For example, a manager may treat a “high risk” label as fact rather than probability, and certain groups may be disproportionately flagged due to structural bias in historical data. 

Over time, this can erode trust in workforce analytics and expose organizations to governance and legal scrutiny. The more predictive models affect people, the higher the standard for transparency, explainability, and ongoing monitoring must be. 

Group-Level Forecasting: An Alternative That Holds Up 

As you’ve seen, individual exits are shaped by personal circumstances, labor market conditions, leadership context, and psychological factors. And many of the strongest turnover predictors are not captured in static HRIS data. 

At the aggregate level, however, patterns become more stable. When turnover is examined across defined groups, such as a role family, tenure band, or performance segment, consistent trends emerge, and time-series forecasting methods such as ARIMA can model them. 

Instead of asking, “Who will leave?”, the question becomes: 

  • How much turnover should we expect? 
  • Where is it likely to concentrate? 
  • When are trends accelerating or stabilizing? 

Individual vs. Group-Level Predictions 

At its core, predictive turnover analytics forces a structural choice: 

| Approach | What you ask | What you get | What it supports | Risk profile |
| --- | --- | --- | --- | --- |
| Individual prediction | “Who will leave?” | Names + risk scores (often fragile) | Targeted intervention (only if signal + governance are strong enough) | High false positives + governance sensitivity |
| Group-level forecasting | “Where and when will turnover rise?” | Forecast ranges + trends (statistically more stable) | Workforce planning, prioritization, early warning without naming people | Lower volatility + planning-grade stability |

If the question is about a person, the signal must be exceptionally strong. If it concerns a population, forecasting often provides more reliable ground. In practice, that shift produces outputs like this: “Out of our 250 high-performing salespeople, between 35 and 45 are likely to leave next year — within a defined confidence range.” 

This kind of forecast supports planning, prioritization, and credible conversations with leadership, without naming names or anchoring decisions in fragile individual predictions. 
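A minimal sketch of how such a range might be produced from a segment's history, using a simple mean-plus-band heuristic on made-up annual leaver counts (a real pipeline would use a proper time-series model such as ARIMA):

```python
import statistics

# Hypothetical annual leaver counts for a 250-person high-performer sales segment.
leavers_by_year = [36, 42, 38, 44, 40]

mean = statistics.mean(leavers_by_year)   # central estimate from the history
sd = statistics.stdev(leavers_by_year)    # sample standard deviation

# Rough ~90% range, assuming roughly normal year-to-year variation (an assumption).
low = round(mean - 1.64 * sd)
high = round(mean + 1.64 * sd)

print(f"Expected leavers next year: {low}-{high} of 250")
```

With these made-up counts the band comes out at 35-45 leavers, the kind of statement the text illustrates; the width of the band, not a single point estimate, is what makes it honest enough for planning.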

You can then complement the picture by using driver analysis to identify which factors move with changes in turnover—not as proof of causality, but as signals that help HR teams decide where to look more closely.  

Perspective Over Prediction 

The question is not whether we can build turnover prediction models. We can. The question is whether those models are strong enough to act on at the individual level without introducing risk. 

When expectations are realistic, analytics becomes a tool for judgment rather than false precision. Group-level forecasting, combined with thoughtful driver analysis, offers a stable and ethically defensible way to anticipate workforce trends. 

Individual flight-risk scores may promise clarity. Patterns, trends, and confidence ranges deliver something more valuable: perspective. 

A Practical Model Checklist 

Not all turnover prediction models are equal. If a solution, whether from a vendor or internal, promises individual flight-risk scores, it should withstand scrutiny. 

Before accepting claims, ask: 

  • What is the model’s precision at a fixed recall level?
  • How does performance hold up on future time periods, not just historical splits? 
  • What is the baseline comparison (e.g., predicting no one leaves)? 
  • How is class imbalance handled? 
  • What is the expected false-positive rate at the recommended threshold?
  • How stable are results across segments (location, function, tenure, demographic groups)?
  • How is the score explained to managers?
  • What governance guardrails are recommended?
  • How is model drift monitored over time?
  • Is there evidence that using the model actually improves retention outcomes?

If these questions cannot be answered clearly, the model may not be ready for high-stakes, person-specific decisions. 
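The first checklist question can be made concrete. A sketch of "precision at a fixed recall level" in pure Python, using made-up scores and outcomes (any vendor tool would compute the same quantity from its own scores):

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the tightest threshold whose recall reaches target_recall.

    scores: model risk scores (higher = more at risk)
    labels: 1 if the employee actually left, 0 if they stayed
    """
    total_leavers = sum(labels)
    tp = fp = 0
    # Walk thresholds from the highest score down, flagging one more person each step.
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label:
            tp += 1
        else:
            fp += 1
        if tp / total_leavers >= target_recall:
            return tp / (tp + fp)
    return 0.0

# Made-up example: 10 employees, 3 actual leavers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1,   0,   1,   0,   0,   1,   0,   0,   0,   0]

print(precision_at_recall(scores, labels, target_recall=0.66))
```

Here catching two of the three leavers requires flagging three people, so precision at that recall is about 0.67; asking a vendor for this number at a stated recall is far more revealing than asking for accuracy.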

Key Takeaways

  • Individual turnover prediction using core HRIS data rarely achieves decision-grade precision.
  • Accuracy alone can hide weak predictive performance.
  • Governance and validation are essential before acting on flight-risk scores.
  • Group-level forecasting provides more stable planning insights.

Employee Turnover Prediction FAQ (Research-Based Answers)

Can employee turnover be predicted accurately at the individual level?

Research shows that predicting individual employee turnover using only core HRIS data is highly unreliable. Static HR attributes change infrequently, while resignations can occur at any time, making future leavers statistically difficult to distinguish from stayers.

Why is individual flight-risk scoring often inaccurate?

Individual turnover prediction struggles due to slow-moving HR data, class imbalance between leavers and stayers, and missing behavioral factors such as job satisfaction or intent to quit. The strongest predictors typically live outside core HR systems.

Why is accuracy a misleading metric in turnover prediction?

Accuracy can appear high even when a model fails to identify future leavers. For example, if 90% of employees stay, predicting that no one will leave achieves 90% accuracy while providing no actionable insight. Precision and recall better measure usefulness.

What is a better alternative to individual turnover prediction?

Group-level forecasting analyzes turnover trends across segments such as tenure bands, roles, or locations. At this level, patterns stabilize and forecasting methods can produce more reliable planning insights.

What governance safeguards should organizations apply to turnover prediction models?

Organizations should validate models on future time periods, evaluate performance using precision and recall, monitor model drift, audit bias across demographic groups, and ensure human oversight in workforce decisions.

What data improves turnover prediction reliability?

Research shows that engagement data, job satisfaction, commitment levels, and intent-to-quit indicators are stronger predictors of voluntary turnover than static payroll or demographic HRIS data.

What is employee turnover prediction?

Employee turnover prediction uses workforce data and analytics models to estimate future resignations. Approaches range from individual flight-risk scoring to group-level forecasting that identifies trends across teams, roles, or locations.

What research supports group-level turnover forecasting?

Studies conducted with the Pattern Recognition Lab at TU Delft and the University of Amsterdam found that individual turnover prediction using core HRIS data achieved limited precision, while forecasting turnover across workforce segments produced more stable and decision-useful insights.

How to Assess Whether a Turnover Prediction Model Is Decision-Ready



Predictive turnover analytics can influence real workforce decisions, which makes model evaluation a governance responsibility rather than a technical exercise. Use this checklist to assess whether a turnover prediction model produces reliable, decision-grade insight before acting on individual flight-risk scores.

  1. Validate the model on future data

    Test turnover prediction models using future time periods rather than reshuffled historical datasets to assess real-world performance.

  2. Evaluate precision and recall

    Assess whether the model correctly identifies future leavers using precision and recall metrics instead of relying on accuracy alone.

  3. Check baseline comparisons

    Compare results against simple baselines such as predicting that no employees will leave to understand true added value.

  4. Assess class imbalance handling

    Confirm how the model manages the imbalance between many stayers and relatively few leavers.

  5. Review false-positive rates

    Understand how many employees may be incorrectly flagged as high risk at the recommended decision threshold.

  6. Audit bias across groups

    Evaluate whether demographic or organizational groups are disproportionately flagged due to historical data patterns.

  7. Confirm governance guardrails

    Ensure decision policies, transparency standards, and human oversight are clearly defined before using predictions operationally.

  8. Monitor performance drift

    Establish ongoing monitoring to detect performance degradation as workforce conditions change.

  9. Review business impact evidence

    Look for proof that using the model improves retention outcomes rather than simply generating risk scores.