A Systematic Literature Review: Cognitive Workload Assessment in Human Factors Research

Abstract

Cognitive workload, the mental effort required to complete a task, is an ever-growing research area, especially as technology and systems continue to develop. This paper reviews published articles on cognitive workload, multitasking, the Multi-Attribute Task Battery, physiological measurements of cognitive workload, and the Improved Performance Research Integration Tool before gathering quantitative data about the cognitive workload field through a bibliometric analysis. The literature review was conducted to gain a deeper understanding of the cognitive workload field and to synthesize the published literature within it. The bibliometric analysis was conducted using Web of Science and four VOSviewer trials to provide quantitative evidence and generate predictions about the field. The reviews conducted in this paper lead to the finding that physiological measures of cognitive workload (particularly electroencephalography), together with virtual reality and machine learning, are on the rise within the human factors field. This review also found that the Improved Performance Research Integration Tool, a powerful tool used to model and predict workload levels, is less prevalent in recent research. Recommendations for future cognitive workload studies are also highlighted in this review.

Similar Papers
  • Single Report
  • Citations: 4
  • 10.21236/ada626356
A Procedure for Collecting Mental Workload Data During an Experiment That Is Comparable to IMPRINT Workload Data
  • Oct 1, 2009
  • Diane K Mitchell + 3 more

U.S. Army Research Laboratory (ARL) analysts use the Improved Performance Research Integration Tool (IMPRINT) to predict the mental workload and performance of Soldiers operating the Future Combat System (FCS) manned ground vehicles. IMPRINT is a human-performance-modeling tool that analysts use to build models representing Soldiers interacting with equipment to accomplish a mission. The models contain tasks, task sequences, task times, and workload estimates that allow the software to calculate estimates of mental workload and mission performance. One of the key outputs from the IMPRINT models is the combination of tasks likely to contribute to high Soldier workload. Evaluators of FCS equipment can include the potentially high workload task combinations in their evaluations to be sure that they evaluate the tasks most likely to contribute to mental overload. The U.S. Army Aberdeen Test Center (ATC) is one of the groups responsible for testing and evaluating FCS equipment and concepts. They must identify any issues that might degrade mission performance, including Soldier mental overload. To ensure that the ATC evaluations include tasks relevant to Soldier mental workload, they can include high workload task combinations identified by IMPRINT in their test plans. However, to be compatible with IMPRINT, they must evaluate workload with a methodology compatible with the IMPRINT technique. This report outlines the methodology ATC and ARL developed within the Automated Communications Analysis of Situation Awareness/IMPRINT/Joint Warfighter Test and Training Capability test conducted in May 2008.

  • Research Article
  • 10.1177/10711813241272121
Verification and Validation of Cognitive Workload Models for Adaptive Automation Tasks
  • Sep 1, 2024
  • Proceedings of the Human Factors and Ergonomics Society Annual Meeting
  • Charles P Rowan

Technological advances that seek to address future operational challenges abound. While advanced capabilities are being developed, there is an important space for human design considerations, including cognitive workload. One proposed solution to improve cognitive workload management is adaptive automation (AA). This research used a novel, model-based approach to assess the impacts of AA on cognitive workload. This assessment modeled the tasks in NASA’s Multi-attribute Task Battery-II (MATB-II) using the Improved Performance Research Integration Tool (IMPRINT). The effort sought to investigate the relationship of cognitive workload, situation awareness, and performance through three human-in-the-loop studies with 120 participants using MATB-II. The research also attempted to validate cognitive workload models from IMPRINT. The IMPRINT models were representative of the MATB tasks with statistically significant differences between workload conditions, which mirrored the models’ predictions. The results demonstrated that AA system task models can be developed using IMPRINT to provide design recommendations.

  • Research Article
  • Citations: 3
  • 10.1177/1071181311551181
Modeling Performance Measures and Self-Ratings of Workload in a Visual Scanning Task
  • Sep 1, 2011
  • Proceedings of the Human Factors and Ergonomics Society Annual Meeting
  • D N Cassenti + 3 more

Mental workload is the amount of demand on an individual’s limited mental resources and thus is an important consideration in human factors research. This research focuses on workload from two primary methods of measuring it: self-ratings of workload and performance. An experiment to test workload involved manipulating the number of tasks to be performed at once and the time available to respond to the task or tasks. The results show that performance shifts between ceiling, linear decrease, and floor levels as workload increases. SWAT ratings of workload followed the same pattern. We conclude that the IMPRINT (Improved Performance Research Integration Tool; Archer & Adkins, 1999) modeling system should maintain its existing method of modeling self-ratings of workload, but may make use of a new algorithm based on these data to model performance as workload changes.
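The ceiling, linear-decrease, floor pattern the authors describe can be sketched as a simple piecewise function. A minimal sketch; the thresholds and scale below are hypothetical placeholders, not values from the study:

```python
def performance(workload, ceiling=1.0, floor=0.2,
                onset=3.0, saturation=8.0):
    """Illustrative ceiling -> linear decrease -> floor model.

    Performance holds at `ceiling` until workload reaches `onset`,
    falls linearly, and bottoms out at `floor` beyond `saturation`.
    All parameter values are invented for illustration.
    """
    if workload <= onset:
        return ceiling
    if workload >= saturation:
        return floor
    # Linear interpolation between the ceiling and floor regimes.
    frac = (workload - onset) / (saturation - onset)
    return ceiling - frac * (ceiling - floor)
```

Fitting the `onset` and `saturation` breakpoints to observed data is what an algorithm like the one the authors propose would have to do; this sketch only shows the shape of the hypothesized function.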

  • Research Article
  • Citations: 14
  • 10.1177/154193121005401968
Modeling the Workload-Performance Relationship
  • Sep 1, 2010
  • Proceedings of the Human Factors and Ergonomics Society Annual Meeting
  • Daniel N Cassenti + 2 more

Human factors research is often focused on the mental workload that is required to perform a task or set of tasks with the goal of reducing workload to make systems easier to manage. The Improved Performance Research Integration Tool (IMPRINT) includes an algorithm to predict mental workload. The algorithm was developed using subject matter expert ratings of workload tasks. We aimed to enhance this capability by developing algorithms using data from four new studies investigating change in performance as demands on mental resources increase. The results indicate three task types of similar difficulty and one task type of much greater difficulty. We then map these to our hypothesized workload function. Finally, we propose a way forward in modeling performance as a function of workload in IMPRINT.

  • Conference Article
  • 10.4050/f-0078-2022-17511
Task Analysis and Predictive Workload Modeling for Autonomous Aircraft
  • May 10, 2022
  • Margaret Lampazzi + 1 more

The desire to transition to single-pilot operations (SPO) has led to research and development of autonomous technologies that can take over tasks normally handled by two pilots and create a new paradigm that supports SPO. To safely achieve SPO of an existing dual-crew aircraft, the workload split between the two pilots needs to be analyzed and candidate tasks for offloading identified. Simulation is a valuable tool to model different task allocation strategies for such systems. This paper presents the methodology that was used to analyze shared tasks between a two-pilot crew and identify candidate tasks that could be handled by the autonomous system. A simulation tool called the Improved Performance Research Integration Tool (IMPRINT), developed by the U.S. Army, was used as part of the design process for an autonomous flight control system. IMPRINT was used to guide cognitive walk-throughs and model pilot workload to inform task allocation between the autonomy and the human operator. Advantages and disadvantages of this method will be discussed as well as recommendations for future work.

  • Single Report
  • Citations: 95
  • 10.21236/ada377300
Mental Workload and ARL Workload Modeling Tools
  • Apr 1, 2000
  • Diane K Mitchell

The author of this report provides an overview of mental workload theory and mental workload measurement. She also describes the development, application, and validation of the mental workload modeling tools developed by the Human Research and Engineering Directorate of the U.S. Army Research Laboratory (ARL). These ARL tools, the VACP (visual, auditory, cognitive, psychomotor) option in the Improved Performance Research Integration Tool (IMPRINT) and WinCrew, can help designers of military systems assess the mental workload associated with different configurations of soldiers and equipment involved in the performance of a mission. System designers can conduct this assessment in the concept development phase of system design and reduce the need to build costly system mock-ups.

  • Research Article
  • 10.1177/1541931213601168
Performing System Tradeoff Analyses Using Human Performance Modeling
  • Sep 1, 2016
  • Proceedings of the Human Factors and Ergonomics Society Annual Meeting
  • Michael E Watson + 3 more

Humans perform critical functions in nearly every system, making them vital to consider during system development. Human Systems Integration (HSI) would ideally permit the human’s impact on system performance to be effectively accounted for during the systems engineering (SE) process, but effective processes are often not applied, especially in the early design phases. Failure to properly account for human capabilities and limitations during system design may lead to unreasonable expectations of the human. The result is a system design that makes unrealistic assumptions about the human, leading to an overestimation of the human’s performance and thus the system’s performance. This research proposes a method of integrating HSI with SE that allows human factors engineers to apply Systems Modeling Language (SysML) and human performance simulation to describe and communicate human and system performance. Using these models, systems engineers can more fully understand the system’s performance to facilitate design decisions that account for the human. A scenario is applied to illustrate the method, in which a system developer seeks to redesign an example system, Vigilant Spirit, by incorporating system automation to improve overall system performance. The example begins by performing a task analysis through physical observation and analysis of human subjects’ data from 12 participants employing Vigilant Spirit. This analysis is depicted in SysML Activity and Sequence Diagrams. A human-in-the-loop experiment is used to study performance and workload effects of humans applying Vigilant Spirit to conduct simulated remotely-piloted aircraft surveillance and tracking missions. The results of the task analysis and human performance data gathered from the experiment are used to build a human performance model in the Improved Performance Research Integration Tool (IMPRINT). 
IMPRINT allows the analyst to represent a mission in terms of functions and tasks performed by the system and human, and then run a discrete event simulation of the system and human accomplishing the mission to observe the effects of defined variables on performance and workload. The model was validated against performance data from the human-subjects’ experiment. In the scenario, six different scan algorithms, which varied in terms of scan accuracy and speed, were simulated. These algorithms represented different potential system trades, as factors such as various technologies and hardware architectures could influence algorithm accuracy and speed. These automation trades were incorporated into the system’s block definition diagram (BDD), requirements, and parametric SysML diagrams. These diagrams were modeled from a systems engineer’s perspective; therefore, they originally placed less emphasis on the human. The BDD portrayed the structural aspect of Vigilant Spirit, to include the operator, automation, and system software. The requirements diagram levied a minimum system-level performance requirement. The parametric diagram further defined the performance and specification requirements, along with the automation’s scan settings, through the use of constraints. It was unclear from studying the SysML diagrams which automation setting would produce the best results, or if any could meet the performance requirement. Existing system models were insufficient by themselves to evaluate these trades; thus, IMPRINT was used to perform a trade study to determine the effects of each of the automation options on overall system performance. The results of the trade study revealed that all six automation conditions significantly improved performance scores from the baseline, but only two significantly improved workload. Once the trade study identified the preferred alternative, the results were integrated into existing system diagrams.
Originally system-focused, SysML diagrams were updated to reflect the results of the trade analysis. The result is a set of integrated diagrams that accounts for both the system and human, which may then be used to better inform system design. Using human performance- and workload-modeling tools such as IMPRINT to perform tradeoff analyses, human factors engineers can attain data about the human subsystem early in system design. These data may then be integrated into existing SysML diagrams applied by systems engineers. In so doing, additional insights into the whole system can be gained that would not be possible if human factors and systems engineers worked independently. Thus, the human is incorporated into the system’s design and the total system performance may be predicted, achieving a successful HSI process.
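As a rough illustration of the discrete-event idea behind IMPRINT-style workload modeling, the sketch below sums overlapping task demands to find the peak concurrent load. The task list and demand numbers are invented, and real IMPRINT models carry far more structure (task networks, VACP channels, contention rules):

```python
def peak_workload(tasks):
    """Peak concurrent workload over (start, duration, demand) tasks.

    A toy stand-in for a discrete-event workload simulation: each task
    raises the load while active, and we track the maximum total load.
    """
    events = []
    for start, duration, demand in tasks:
        events.append((start, demand))              # task begins
        events.append((start + duration, -demand))  # task ends
    # Sort by time; at equal times, process task ends (negative
    # deltas) before starts so back-to-back tasks do not overlap.
    events.sort(key=lambda e: (e[0], e[1]))
    load = peak = 0.0
    for _, delta in events:
        load += delta
        peak = max(peak, load)
    return peak

# Hypothetical mission: a monitoring task overlaps a radio call.
mission = [(0.0, 10.0, 3.0), (4.0, 3.0, 4.5), (12.0, 2.0, 2.0)]
```

Identifying when the peak exceeds a threshold is, in miniature, how high-workload task combinations like those mentioned above are flagged.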

  • Research Article
  • Citations: 14
  • 10.1177/154193121005400701
Improved Performance Research Integration Tool (IMPRINT): Human Performance Modeling for Improved System Design
  • Sep 1, 2010
  • Proceedings of the Human Factors and Ergonomics Society Annual Meeting
  • Charneta Samms

Identifying human factors issues before systems are built, and demonstrating the importance of those issues to decision makers outside the human factors community, is difficult. Modeling and simulation (M&S) has become a critical part of tackling this challenge, but the true impact of M&S is made when the results can be translated into predictions that matter to decision makers, such as future system performance. Through the use of the Improved Performance Research Integration Tool (IMPRINT), analysts are able to quantify the effect of human operator performance on system performance. IMPRINT is a task network modeling tool designed to help assess the influence of the human operator on system performance throughout the system lifecycle. This demonstration will provide a brief overview of IMPRINT and its capabilities and highlight new features that provide enhanced modeling capabilities and better results visualization. It will also feature a look at the new Multimodal Interface Design Support (MIDS) tool plug-in, which provides users with analysis-specific multimodal design guidelines that can be implemented into the system design to minimize mental overload and improve system performance.

  • Research Article
  • Citations: 30
  • 10.1016/j.jss.2022.07.010
The Measurement of Cognitive Workload in Surgery Using Pupil Metrics: A Systematic Review and Narrative Analysis
  • Aug 26, 2022
  • Journal of Surgical Research
  • Ravi Naik + 4 more

Introduction: Increased cognitive workload (CWL) is a well-established entity that can impair surgical performance and increase the likelihood of surgical error. Pupil and gaze tracking data are increasingly being used to measure CWL objectively in surgery. The aim of this review is to summarize and synthesize the existing evidence surrounding this. Methods: A systematic review was undertaken in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A search of OVID MEDLINE, IEEE Xplore, Web of Science, Google Scholar, APA PsycINFO, and EMBASE was conducted for articles published in English between 1990 and January 2021. In total, 6791 articles were screened and 32 full-text articles were selected based on the inclusion criteria. A narrative analysis was undertaken in view of the heterogeneity of studies. Results: Seventy-eight percent of selected studies were deemed high quality. The most frequent surgical environment and task studied were surgical simulation (75%) and performance of laparoscopic skills (56%), respectively. The results demonstrated that the current literature can be broadly categorized into pupil, blink, and gaze metrics used in the assessment of CWL. These can be further categorized according to their use in the context of CWL: (1) direct measurement of CWL (n = 16), (2) determination of expertise level (n = 14), and (3) predictors of performance (n = 2). Conclusions: Eye-tracking data provide a wealth of information; however, there is marked study heterogeneity. Pupil diameter and gaze entropy demonstrate promise in CWL assessment. Future work will entail the use of artificial intelligence in the form of deep learning and the use of a multisensor platform to accurately measure CWL.
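Gaze entropy, one of the metrics this review highlights, is essentially Shannon entropy over where fixations land. A minimal sketch, assuming fixations have already been labeled with areas of interest (AOIs); real pipelines work from raw eye-tracker samples:

```python
import math
from collections import Counter

def gaze_entropy(fixation_aois):
    """Shannon entropy (bits) of fixations over areas of interest.

    Higher entropy indicates more dispersed scanning; zero entropy
    means all fixations fell on a single AOI. Sketch only.
    """
    counts = Counter(fixation_aois)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in counts.values())
```

For example, fixations split evenly across two AOIs give 1 bit of entropy; across four AOIs, 2 bits.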

  • Research Article
  • Citations: 101
  • 10.1177/0018720819830553
Cardiac Measures of Cognitive Workload: A Meta-Analysis.
  • Mar 1, 2019
  • Human Factors: The Journal of the Human Factors and Ergonomics Society
  • Ashley M Hughes + 4 more

We aimed to provide an assessment of the impact of workload manipulations on various cardiac measurements. We further sought to determine the most effective measurement approaches of cognitive workload as well as quantify the conditions under which these measures are most effective for interpretation. Cognitive workload affects human performance, particularly when load is relatively high (overload) or low (underload). Despite ongoing interest in assessing cognitive workload through cardiac measures, it is currently unclear which cardiac-based assessments best indicate cognitive workload. Although several quantitative studies and qualitative reviews have sought to provide guidance, no meta-analytic integration of cardiac assessment(s) of cognitive workload exists to date. We used Morris and DeShon's meta-analytic procedures to quantify the changes in cardiac measures due to task load conditions. Sample-weighted Cohen's d values suggest that several metrics of cardiac activity demonstrate sensitivity in response to cognitive workload manipulations. Heart rate variability measures show sensitivity to task load, conditions of event rate, and task duration. Authors of future work should seek to quantify the utility of leveraging multiple metrics to understand workload. Results suggest that assessment of cognitive workload can be done using various cardiac activity indicators. Further, given the number of valid and reliable measures available, researchers and practitioners should base their selection of a psychophysiological measure on the experimental and practical concerns inherent to their task/protocol. Findings bear implications for future assessment of cognitive workload within basic and applied settings. Future research should seek to validate conditions under which measurements are best interpreted, including but not limited to individual differences.
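As a toy illustration of sample-weighted effect sizes (a drastic simplification of the Morris and DeShon procedures the authors actually used), a sample-size-weighted mean of Cohen's d values can be computed as:

```python
def weighted_cohens_d(effects):
    """Sample-size-weighted mean of Cohen's d values.

    `effects` is a list of (d, n) pairs; the numbers used in the test
    are hypothetical, not the meta-analysis's data. Full meta-analytic
    procedures also correct for bias and model between-study variance.
    """
    total_n = sum(n for _, n in effects)
    return sum(d * n for d, n in effects) / total_n
```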

  • Research Article
  • Citations: 3
  • 10.3389/fnrgo.2025.1566431
One size does not fit all: a support vector machine exploration of multiclass cognitive state classifications using physiological measures
  • Jun 18, 2025
  • Frontiers in Neuroergonomics
  • Jonathan Vogl + 2 more

Introduction: This study aims to develop and evaluate support vector machine (SVM) learning models for predicting cognitive workload (CWL) from physiological data. The objectives include creating robust binary classifiers, expanding these to multiclass models for nuanced CWL prediction, and exploring the benefits of individualized models for enhanced accuracy. Cognitive workload assessment is critical for operator performance and safety in high-demand domains like aviation. Traditional CWL assessment methods rely on subjective reports or isolated metrics, which lack real-time applicability. Machine learning offers a promising solution for integrating physiological data to monitor and predict CWL dynamically. SVMs provide transparent and auditable decision-making pipelines, making them particularly suitable for safety-critical environments. Methods: Physiological data, including electrocardiogram (ECG) and pupillometry metrics, were collected from three participants performing tasks with varying demand levels in a low-fidelity aviation simulator. Binary and multiclass SVMs were trained to classify task demand and subjective CWL ratings, with models tailored to individual and combined-subject datasets. Feature selection approaches evaluated the impact of streamlined input variables on model performance. Results: Binary SVMs achieved accuracies of 70.5% and 80.4% for task demand and subjective workload predictions, respectively, using all features. Multiclass models demonstrated comparable discrimination (AUC-ROC: 0.75-0.79), providing finer resolution across CWL levels. Individualized models outperformed combined-subject models, showing a 13% average improvement in accuracy. SVMs effectively predict CWL from physiological data, with individualized multiclass models offering superior granularity and accuracy. Discussion: These findings emphasize the potential of tailored machine learning approaches for real-time workload monitoring in fields that can justify the added time and expense required for personalization. The results support the development of adaptive automation systems in aviation and other high-stakes domains, enabling dynamic interventions to mitigate cognitive overload and enhance operator performance and safety.
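A minimal sketch of the binary-classification setup described here, using scikit-learn's SVC on fabricated heart-rate and pupil-diameter features (not the study's data, features, or evaluation protocol):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical features: [heart rate (bpm), pupil diameter (mm)]
# for low-demand (label 0) and high-demand (label 1) task blocks.
low = rng.normal([70.0, 3.0], 0.5, size=(20, 2))
high = rng.normal([90.0, 4.5], 0.5, size=(20, 2))
X = np.vstack([low, high])
y = np.array([0] * 20 + [1] * 20)

# Binary SVM classifying task demand from physiological features.
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
train_acc = clf.score(X, y)
```

On real data the classes overlap far more than these well-separated synthetic clusters, and held-out evaluation (not training accuracy) is what yields figures like the 70.5% and 80.4% reported above.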

  • Research Article
  • Citations: 35
  • 10.1111/bjet.13503
A systematic review of immersive technologies for education: Learning performance, cognitive load and intrinsic motivation
  • Jul 16, 2024
  • British Journal of Educational Technology
  • Matisse Poupard + 3 more

Immersive technologies are assumed to have many benefits for learning due to their potential positive impact on optimizing learners' cognitive load and fostering intrinsic motivation. However, despite promising results, the findings regarding the actual impact on learning remain inconclusive, raising questions about the determinants of efficacy. To address these gaps, we conducted a PRISMA systematic review to investigate the contributions and limitations of virtual reality (VR) and augmented reality (AR) in learning, specifically by examining their effects on cognitive load and intrinsic motivation. Through the application of an analytical grid, we systematically classified the impact of VR/AR on the causal relationship between learning performance (ie, objective learning improvement) and cognitive load or motivation, while respecting the fundamental assumptions of the main theories related to these factors. Analysing 36 studies, the findings reveal that VR, often causing extraneous load, hinders learning, particularly among novices. In contrast, AR optimizes cognitive load, proving beneficial for novice learners but demonstrating less effectiveness for intermediate learners. The effects on intrinsic motivation remain inconclusive, likely due to variations in measurement methods. The review underscores the need for detailed, sophisticated evaluations and comprehensive frameworks that consider both cognitive load and intrinsic motivation to improve understanding of the impact of immersive technologies on learning. Practitioner notes. What is known: Virtual and augmented reality show promise for education, but findings are inconsistent. Existing studies suggest that augmented reality optimizes learners' cognitive load. The literature often asserts that VR and AR are expected to enhance learning motivation. What this paper adds: Adding VR introduces unnecessary cognitive load, while AR proves effective for learning performance and cognitive load, particularly for novice learners. The impact of AR and VR on motivation to learn is unclear. Our analytical grid offers a comprehensive framework for assessing the effects of AR and VR on learning outcomes. Implications: AR is more suitable than VR for education concerning cognitive load. The cost/benefit balance of VR should be carefully considered before implementation, especially for novice learners. Rigorous studies on motivation to learn in AR and VR contexts are essential.

  • Research Article
  • Citations: 18
  • 10.1007/s10111-014-0276-0
Evaluation of the driver’s mental workload: a necessity in a perspective of in-vehicle system design for road safety improvement
  • May 20, 2014
  • Cognition, Technology & Work
  • Annie Pauzié

Human factors research and debate related to mental workload have been going on for decades since the 1960s (McKenzie et al. 1966) and are still ongoing (Finomore et al. 2013). The issues raised are: is it useful (do we need the concept of mental workload or do we have to banish it for good? Leplat 2002), is it scientifically credible (Dekker et al. 2010), and, in case of a positive answer, how do we measure it (Jex 1988)? In the context of the driving task, human factors research is abundant and diversified, aiming at a better understanding of driver behavior and functional capacities in terms of perception, cognition, and motor processes in order to improve road safety (Lee 2008), drivers' mental workload being an important issue to consider in this framework (de Waard 1996). This issue is even more crucial since the deployment of onboard Intelligent Transport Systems in vehicles (Carsten and Nilsson 2001), as human factors specialists have the responsibility to evaluate whether these innovative systems really support the driving task or, on the contrary, lead to distraction and increased mental workload, with potentially dramatic consequences for road safety. So, since the beginning of research in this area, the objective has been to establish methods of assessing fluctuations in mental workload that are sensitive to the various aspects of attentional processing requirements in relation to both external environmental conditions, such as traffic density, and in-vehicle conditions, such as competing visual and auditory displays (Pauzié and Amditis 2010). Workload can be defined as a hypothetical construct that represents the cost incurred by a human operator to achieve a particular level of performance (Hart 1986). Driving performance and the driver's mental workload are both relevant and complementary parameters to consider, knowing that they can vary independently (Yeh and Wickens 1988). Indeed, if the complexity of the task increases, the driver is able to maintain stable performance to a certain degree by increasing effort.

  • Research Article
  • Citations: 18
  • 10.3389/fphys.2024.1408242
MATB for assessing different mental workload levels.
  • Jul 23, 2024
  • Frontiers in physiology
  • Anaïs Pontiggia + 7 more

Multi-Attribute Task Battery (MATB) is a computerized flight simulator for aviation-related tasks, suitable for non-pilots and available in many versions, including open source. MATB requires the individual or simultaneous execution of four sub-tasks: system monitoring (SYSMON), tracking (TRACK), communications (COMM), and resource management (RESMAN). Fully customizable, design choices such as test duration, number of sub-tasks used, event rates, response times, and overlap create different levels of mental load. MATB can be combined with an additional auditory attention (oddball) task, or with physiological constraints (i.e., sleep loss, exercise, hypoxia). We aimed to assess the main characteristics of MATB designs used for assessing the response to different workload levels. We identified and reviewed 19 articles in which the effects of low and high workload were analyzed. Although MATB has shown promise in detecting performance degradation due to increased workload, studies have yielded conflicting or unclear results regarding MATB configurations. Increased event rates, number of sub-tasks (multitasking), and overlap are associated with increased perceived workload scores (e.g., NASA-TLX), decreased performance (especially tracking), and neurophysiological responses, while no effect of time-on-task is observed. The median test duration is 20 min (range 12-60) with a level duration of 10 min (range 4-15). To assess mental workload, the median number of stimuli is 3 events/min (range 0.6-17.2) for the low and 23.5 events/min (range 9-65) for the high workload level. In this review, we give some recommendations for standardizing MATB design, configuration, description, and training in order to improve reproducibility and comparison between studies, a challenge for future research as human-machine interaction and digital influx increase for pilots. We also open the discussion on the possible use of MATB in the context of aeronautical/operational constraints in order to assess their effects combined with changes in mental workload levels. Thus, with appropriate levels of difficulty, MATB can be used as a suitable simulation tool to study the effects of changes on the mental workload of aircraft pilots under different operational and physiological constraints.
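The low and high event rates the review reports can be turned into a concrete stimulus schedule with a simple Poisson-process sketch. The sampling scheme is our illustration, not part of any MATB specification:

```python
import random

def event_schedule(rate_per_min, duration_min, seed=0):
    """Poisson-process event times (in minutes) for a MATB-style level.

    `rate_per_min` can follow the review's reported medians (e.g.,
    3/min for low and 23.5/min for high workload); exponential
    inter-arrival times give the memoryless spacing of a Poisson
    process. Sketch only.
    """
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_per_min)
        if t >= duration_min:
            return times
        times.append(t)
```

Over a 20-minute level, a 3/min rate yields roughly 60 events and a 23.5/min rate roughly 470, making the contrast between the two workload conditions concrete.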

  • Research Article
  • Citations: 65
  • 10.3758/s13428-020-01364-w
OpenMATB: A Multi-Attribute Task Battery promoting task customization, software extensibility and experiment replicability
  • Mar 5, 2020
  • Behavior Research Methods
  • J Cegarra + 4 more

OpenMATB is an open-source variant of the Multi-Attribute Task Battery (MATB) and is available under a free software license. MATB consists of a set of tasks representative of those performed in aircraft piloting. It is used, in particular, to study the effect of automation on decision-making, mental workload, and vigilance. Since the publication of MATB 20 years ago, the subject of automation has grown considerably in importance. After introducing the task battery, this article highlights three main requirements for an up-to-date implementation of MATB. First, there is a need for task customization, to make it possible to change the values, appearance or integrated components (such as rating scales) of the tasks. Second, researchers need software extensibility to enable them to integrate specific features, such as synchronization with psychophysiological devices. Third, to achieve experiment replicability, it is necessary that the source code and the scenario files are easily available and auditable. In the present paper, we explain how these aspects are implemented in OpenMATB by presenting the software architecture and features, while placing special emphasis on the crucial role of the plugin system and the simplicity of the format used in the script files. Finally, we present a number of general trends for the future study of automation in human factors research and ergonomics.
