Human–Robot Interactive Performance Analysis with Enhanced Metacognition
ABSTRACT This study investigates integrating metacognition into human–robot interaction (HRI) systems to improve reliability, efficiency, and trust. As HRI deployments expand, performance regulation under uncertainty becomes critical. Prior work has examined metacognition largely through behavioral observation, with limited system-level models that quantify operational impact. We address this gap by proposing a representative Markov chain model to evaluate metacognitive effects under deterministic and stochastic conditions. Using statistical analysis and probabilistic modeling, we measure impacts on machine accuracy and efficiency over extended operation. Experiments show metacognitive-enabled systems achieve a 59.29% reduction in errors and inefficiencies, with an 18.67% increase in execution time, indicating a favorable performance tradeoff. We further conceptualize metacognition as a quantifiable decision-regulation layer rather than an interface-level construct. The framework offers a scalable analytical basis for designing more dependable and trustworthy HRI systems in complex, information-intensive environments.
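As a rough illustration of how a Markov chain can quantify a tradeoff of this kind, the sketch below compares two hypothetical two-state chains (nominal operation vs. error/inefficiency). The transition probabilities and per-step time costs are invented for illustration only; they are assumptions, not values from the study, and the study's actual state space is not given in the abstract.

```python
def stationary(P, iters=1000):
    """Approximate the stationary distribution of a row-stochastic matrix
    by repeatedly applying pi <- pi * P (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# States: 0 = nominal operation, 1 = error/inefficiency.
# All numbers below are hypothetical placeholders.
baseline = [[0.90, 0.10],   # without metacognition: errors are entered more often...
            [0.50, 0.50]]   # ...and recovered from more slowly
metacog = [[0.95, 0.05],    # with metacognitive regulation (assumed effect)
           [0.70, 0.30]]

STEP_TIME_BASELINE = 1.0    # assumed time units per step
STEP_TIME_METACOG = 1.2     # assumed monitoring overhead per step

err_base = stationary(baseline)[1]   # long-run share of steps in the error state
err_meta = stationary(metacog)[1]

error_reduction = 1.0 - err_meta / err_base
time_overhead = STEP_TIME_METACOG / STEP_TIME_BASELINE - 1.0

print(f"long-run error share: {err_base:.4f} -> {err_meta:.4f}")
print(f"error reduction: {error_reduction:.1%}, time overhead: {time_overhead:.1%}")
```

For a two-state chain the long-run error share is P01 / (P01 + P10), so the loop can be checked by hand against that closed form.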
- Conference Article
1
- 10.1109/icppw.2012.35
- Sep 1, 2012
Power consumption is a growing concern in the design of computing platforms, particularly large-scale, HPC systems or computing platforms that assist battlefield operations. Accordingly, Intel® recently introduced a new development platform, Software Development Platform S2R2 Family with Intel® Node Manager technology, that is capable of real-time dynamic power monitoring and capping. Although targeted at data centers, such a tool can be applied to one compute node. This development opens up new possibilities for managing payloads in fielded computing platforms, which typically have limited power budgets. Towards this end, this paper presents a preliminary study of the effect of node power capping on the execution time of two applications of interest to the U.S. Army. We executed the applications under a range of power caps and employed performance counters and a program that strides through memory invoking different levels of the hierarchy to capture execution time as well as other performance metrics that help explain the increase in execution times with heightened restrictions on power consumption. Confirming results of earlier work, we show that, in general, time-to-solution and energy consumption increase as the power cap decreases. In addition, our data indicates that: (1) for fielded systems there is a range of power caps that may result in acceptable increases in execution time and (2) although power capping is achieved mainly by dynamic voltage and frequency scaling (DVFS), when executing at lower power caps other techniques are being employed to reduce power consumption.
- Research Article
- 10.1016/j.dib.2025.112234
- Nov 4, 2025
- Data in Brief
Human-Robot Interaction (HRI) systems are becoming increasingly integral to collaborative industrial and service environments. However, understanding human performance within such settings, particularly in Programming by Demonstration (PbD) frameworks, remains a challenge due to the limited availability of comprehensive datasets. This study presents a multimodal dataset designed to assess human performance in HRI applications. The dataset includes both objective and subjective measures collected using various tools within a PbD framework. The data were collected to assess overall system performance, with a focus on the effectiveness of PbD methods in improving the robot learning process. The performance attributes encompass programming efficiency, cognitive workload, usability, and ergonomic assessment of human posture in physical HRI settings. Using tools such as a robot manipulator (UR10e), a vision-based motion tracking system, eye-tracking glasses, the NASA Task Load Index (NASA-TLX), and the System Usability Scale (SUS) questionnaire, we recorded interactions between human and robot comprising robot trajectories, human motion trajectories, participant eye-tracking metrics, and participant subjective responses. The data were collected in an experimental setting involving two kinaesthetic robot teaching tasks under four feedback conditions that provide performance-related insights and actionable instructions to improve participant performance. A total of N=28 participants performed three trials of each task using the developed Human-Machine Interface (HMI). This dataset is valuable for advancing human-robot teaching interaction and is applicable to developing adaptive HRI systems.
The dataset can serve as a significant resource for researchers and machine learning engineers aiming to develop advanced HRI systems, improve PbD algorithms, train HRI-based machine learning models, and facilitate further studies in safety and ergonomic assessment applications in robot learning and collaboration.
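For readers who want to work with the subjective measures named above, the following sketch shows the standard scoring rules for the SUS questionnaire and the unweighted ("raw") NASA-TLX. The function names and the input layout (plain lists of responses) are assumptions for illustration; the dataset's actual file format is not described in the abstract, and it may use the weighted NASA-TLX variant instead.

```python
def sus_score(responses):
    """Standard SUS scoring: ten items rated 1-5.
    Odd-numbered items contribute (rating - 1), even-numbered items (5 - rating);
    the sum is scaled by 2.5 to give a 0-100 score."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item responses")
    total = 0
    for index, rating in enumerate(responses):
        if not 1 <= rating <= 5:
            raise ValueError("SUS ratings must be between 1 and 5")
        # index 0, 2, 4, ... are items 1, 3, 5, ... (the positively worded ones)
        total += (rating - 1) if index % 2 == 0 else (5 - rating)
    return total * 2.5


def raw_tlx(subscales):
    """Raw (unweighted) NASA-TLX: mean of the six subscale ratings (0-100 each)."""
    if len(subscales) != 6:
        raise ValueError("NASA-TLX has six subscales")
    return sum(subscales) / 6.0


# Hypothetical responses from one participant in one trial.
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))
print(raw_tlx([55, 20, 40, 30, 60, 25]))
```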
- Conference Article
- 10.1109/iembs.2011.6091144
- Aug 1, 2011
Methods for decoding movements from neural spike counts using adaptive filters often rely on minimizing the mean-squared error. However, for non-Gaussian distribution of errors, this approach is not optimal for performance. Therefore, rather than using probabilistic modeling, we propose an alternate non-parametric approach. In order to extract more structure from the input signal (neuronal spike counts) we propose using minimum error entropy (MEE), an information-theoretic approach that minimizes the error entropy as part of an iterative cost function. However, the disadvantage of using MEE as the cost function for adaptive filters is the increase in computational complexity. In this paper we present a comparison between the decoding performance of the analytic Wiener filter and a linear filter trained with MEE, which is then mapped to a parallel architecture in reconfigurable hardware tailored to the computational needs of the MEE filter. We observe considerable speedup from the hardware design. The adaptation of filter weights for the multiple-input, multiple-output linear filters, necessary in motor decoding, is a highly parallelizable algorithm. It can be decomposed into many independent computational blocks with a parallel architecture readily mapped to a field-programmable gate array (FPGA) and scales to large numbers of neurons. By pipelining and parallelizing independent computations in the algorithm, the proposed parallel architecture has sublinear increases in execution time with respect to both window size and filter order.
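To make the MEE idea concrete, the sketch below computes a commonly used Parzen-window estimate of Rényi's quadratic entropy of the error samples; minimizing this entropy is equivalent to maximizing the "information potential". This is an illustrative estimator under assumed settings (Gaussian kernel, hypothetical kernel width `sigma`), not necessarily the exact cost function used in the paper. The double sum over all error pairs also shows where the extra computational cost relative to mean-squared error comes from.

```python
import math


def gaussian(x, sigma):
    """Zero-mean Gaussian kernel evaluated at x."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def information_potential(errors, sigma=1.0):
    """V(e) = (1/N^2) * sum_i sum_j G(e_i - e_j), with kernel width sigma*sqrt(2).
    The O(N^2) pairwise sum is independent per pair, so it parallelizes well."""
    n = len(errors)
    width = sigma * math.sqrt(2.0)
    return sum(gaussian(ei - ej, width) for ei in errors for ej in errors) / (n * n)


def quadratic_entropy(errors, sigma=1.0):
    """Renyi quadratic entropy estimate: H2 = -log V(e)."""
    return -math.log(information_potential(errors, sigma))


def mse(errors):
    return sum(e * e for e in errors) / len(errors)


# Hypothetical error samples: tightly clustered vs. widely spread.
tight = [0.0, 0.01, -0.01, 0.02]
spread = [-2.0, 0.0, 2.0, 4.0]
print(quadratic_entropy(tight), quadratic_entropy(spread))
print(mse(tight), mse(spread))
```

Because the estimate depends only on differences between error samples, it is unchanged by a constant offset in the errors, which is one practical difference from the mean-squared error criterion.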
- Conference Article
1
- 10.5591/978-1-57735-516-8/ijcai11-468
- Jul 16, 2011
Human Robot Interaction (HRI) is an active field that integrates and embeds different techniques in artificial intelligence. This paper describes my research topic: Control of Robotic Systems for Safe Interaction with Human Operators. It consists of online motion generation for robotic manipulators interacting with dynamic obstacles and humans using a moving horizon scheme, modeling and long-term prediction of human motion using probabilistic models and reachability analysis, and development of an HRI demonstration platform.
- Conference Article
2
- 10.1109/spdp.1990.143655
- Dec 2, 1990
The paper considers the mapping of communicating modules of a parallelized task to the processing elements of a parallel computer when precedence relationships among the modules are available. The goal of the mapping is to minimize the total execution time of the task, including both processing and communications time, within a processor network of limited size. This paper presents a method for contracting complete binary precedence trees with n nodes to trees with (n+1)/2 nodes with no increase in execution time. The authors then provide methods for embedding these trees into hypercubes and m-dimensional meshes. When embedded into the hypercube of dimension log((n+1)/2), or into meshes with dimension m ≥ (log((n+1)/2))/2, the contracted tree is embedded with unit dilation and with no increase in execution time. For meshes with dimension m >
- Research Article
15
- 10.1002/spe.3139
- Aug 14, 2022
- Software: Practice and Experience
In the article we propose an automatic power capping software tool DEPO that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning engaging one of the two exploration algorithms—linear search (LS) and golden section search (GSS), finds a power cap optimizing a given metric and sets it for the remaining computations. The considered metrics include energy (E), energy‐delay sum, energy‐delay product. We present experimental results obtained for a set of benchmarks that differ in compute and memory intensity—parallel custom built OpenMP implementations of: numerical integration, heat distribution simulation (HEAT), fast Fourier transform (FFT), and additionally NAS parallel benchmarks: CG, MG, BT, SP, and LU. Tests were performed using multi‐core CPUs that are representatives of modern servers and the desktop family: 2 Intel Xeon E5‐2670 v3 CPU (Haswell‐EP) and Intel i7‐9700K CPU (Coffee Lake). The results show that our approach enabled considerable improvements for the tested metrics, for example, for HEAT and Coffee Lake we minimized energy by 50% at the cost of a 15% increase in execution time (LS), for FFT energy was minimized by 40% at a 25.5% increase in execution time (GSS), for SP and Haswell energy was minimized by 25% at the cost of an 18.5% time increase and for Coffee Lake energy was decreased by 56% with a 12% time increase.
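The golden section search mentioned above can be sketched as follows. This is a generic textbook GSS over a unimodal function, not DEPO's implementation; the `energy` function is a made-up analytic stand-in (an assumption) for what would in practice be a measurement of the running application under a given power cap, and the cap range of 40 to 120 W is likewise hypothetical.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, about 0.618


def golden_section_search(f, lo, hi, tol=0.5):
    """Minimize a unimodal function f on [lo, hi].
    Each iteration shrinks the interval by 1/phi and needs only ONE new
    evaluation of f, which matters when an evaluation is a timed measurement."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


def energy(cap_watts):
    """Hypothetical model: runtime grows as the cap tightens,
    so energy = power * time has an interior minimum."""
    runtime_s = 10.0 + 40000.0 / (cap_watts * cap_watts)
    return cap_watts * runtime_s


best_cap = golden_section_search(energy, 40.0, 120.0, tol=0.01)
print(f"best cap: {best_cap:.2f} W, energy: {energy(best_cap):.1f} J")
```

A linear search would instead step through the cap range at a fixed increment and keep the best value seen; it needs more evaluations for the same resolution but does not rely on the metric being unimodal.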
- Research Article
18
- 10.1145/355611.362537
- Nov 1, 1973
- Communications of the ACM
Most graphics systems using a raster scan output device (CRT or hardcopy) maintain a display file in the XY or random scan format. Scan converters, hardware or software, must be provided to translate the picture description from the XY format to the raster format. Published scan conversion algorithms which are fast will reserve a buffer area large enough to accommodate the entire screen. On the other hand, those which use a small buffer area are slow because they require multiple passes through the XY display file. The scan conversion algorithm described here uses a linked list data structure to process the lines of the drawing in strips corresponding to groups of scan lines. A relatively small primary memory buffer area is used to accumulate the binary image for a group of scan lines. When this portion of the drawing has been plotted, the buffer is reused for the next portion. Because of the list processing procedures used, only a single pass through the XY display file is required when generating the binary image and only a slight increase in execution time over the fully buffered core results. Results show that storage requirements can be reduced by more than 80 percent while causing less than a 10 percent increase in execution time.
- Research Article
- 10.3390/electronics14244862
- Dec 10, 2025
- Electronics
Human–robot cooperative tasks require physical human–robot interaction (pHRI) systems that can adapt to individual human behaviors while ensuring robustness and stability. This paper presents a dual-loop control framework combining an admittance outer loop and a neural adaptive inner loop based on the Robust Integral of the Sign of the Error (RISE) approach. The outer loop reshapes the manipulator trajectory according to interaction forces, ensuring compliant motion and user safety. The inner-loop Adaptive RISE–RBFNN controller compensates for unknown nonlinear dynamics and bounded disturbances through online neural learning and robust sign-based correction, guaranteeing semi-global asymptotic convergence. Quantitative results demonstrate that the proposed adaptive RISE controller with neural-network error compensation (ARINNSE) achieves superior performance in the Joint-1 tracking task, reducing the root-mean-square tracking error by approximately 51.7% and 42.3% compared to conventional sliding mode control and standard RISE methods, respectively, while attaining the smallest maximum absolute error and maintaining control energy consumption comparable to that of RISE. Under human–robot interaction scenarios, the controller preserves stable, bounded control inputs and rapid error convergence even under time-varying disturbances. These results confirm that the proposed admittance-based RISE–RBFNN framework provides enhanced robustness, adaptability, and compliance, making it a promising approach for safe and efficient human–robot collaboration.
- Book Chapter
- 10.1007/978-3-540-87442-3_109
- Jan 1, 2008
Based on an analysis of information fusion methods for human-robot interaction, an information feedback structure for a human-robot interaction platform is proposed. The source feedback information is handled by a multi-layer processing structure, and the processed multimodal information is integrated using interactive rules and knowledge from an interaction knowledge database. A method for expressing and fusing feedback information is presented for the information integration process: the abstract information from the multimodal feedback is expressed in terms of the state of task execution and the stability of each feedback modality; this information is then fused according to the result of task execution; finally, the fusion result is delivered to the user through the most stable feedback modality. Experiments were carried out with the proposed methods in a human-robot interaction system, and some of the experimental results show that this semantic information fusion and feedback method can decrease the user's cognitive load and also achieve high work efficiency in human-robot interaction.
- Conference Article
10
- 10.1109/hora55278.2022.9799820
- Jun 9, 2022
The role and job descriptions of the new generation of industrial robots that will operate in smart factories are being shaped by the Industry 4.0 (I4.0) process, which has evolved with digital transformation and advanced production procedures. Human-robot interaction is a new industry trend and a key component of the I4.0 strategy. The main objective of this new solution is to improve the safety, ergonomics, productivity, and quality of the process. This solution aims to bridge the gap between manual production and fully automated production. In this way, the employee combines the advantages of both humans and robots by sharing the workspace with the robot in non-ergonomic, repetitive, uncomfortable, and dangerous operations. This also means that the inclusion of robots in manufacturing processes does not devalue the human component; on the contrary, it shows that the increase in productivity is due to human-robot cooperation. As the level of human-robot cooperation increases, some production capacity must be sacrificed because the robots are necessarily slowed down, and risk assessment according to relevant standards becomes more important. It is also clear that risk analysis of human-robot interaction systems involves a mixture of quantitative and qualitative data based on human evaluations, hesitancy, and process uncertainties. In general, risk assessment approaches rely on the expertise and experience of specialists. Fuzzy set theory (FST) is therefore well suited to the risk assessment of such systems. This study aims to contribute to improving human-robot collaboration and safety in an industrial setting through FST-based risk assessment. Additionally, the z-number, which is an ordered pair of fuzzy numbers, is integrated into the proposed methodology to reflect the uncertainties of the risk assessment stage.
Within the scope of the study, a new fuzzy-based risk assessment methodology is proposed to provide a safe workplace where humans and robots collaborate on a typical task. The proposed methodology consists of DELPHI, DEMATEL, ANP, and VIKOR, which are multi-criteria decision-making (MCDM) methods, here based on z-numbers so that the uncertainty of the data and the hesitancy of the experts can be taken into account.
- Research Article
9
- 10.1108/jdal-11-2022-0010
- Nov 13, 2023
- Journal of Defense Analytics and Logistics
Purpose: This paper presents a survey of research into interactive robotic systems for the purpose of identifying the state-of-the-art capabilities as well as the extant gaps in this emerging field. Communication is multimodal. Multimodality is a representation of many modes chosen from rhetorical aspects for their communication potential. The author seeks to define the available automation capabilities in communication using multimodalities that will support a proposed Interactive Robot System (IRS) as an AI-mounted robotic platform to advance the speed and quality of military operational and tactical decision making.
Design/methodology/approach: This review begins by presenting key developments in the robotic interaction field with the objective of identifying essential technological developments that set conditions for robotic platforms to function autonomously. After surveying the key aspects of Human Robot Interaction (HRI), Unmanned Autonomous System (UAS), visualization, Virtual Environment (VE) and prediction, the paper proceeds to describe the gaps in the application areas that will require extension and integration to enable the prototyping of the IRS. A brief examination of other work in HRI-related fields concludes with a recapitulation of the IRS challenge that will set conditions for future success.
Findings: Using insights from a balanced cross section of sources from the government, academic, and commercial entities that contribute to HRI, a multimodal IRS in military communication is introduced. A multimodal IRS (MIRS) in military communication has yet to be deployed.
Research limitations/implications: A multimodal robotic interface for the MIRS is an interdisciplinary endeavour. It is not realistic for one person to master all the expert and related knowledge and skills needed to design and develop such a multimodal interactive robotic interface. In this brief preliminary survey, the author has discussed extant AI, robotics, NLP, CV, VDM, and VE applications that are directly related to multimodal interaction. Each mode of this multimodal communication is an active research area. Multimodal human/military robot communication is the ultimate goal of this research.
Practical implications: A multimodal autonomous robot in military communication using speech, images, gestures, VST and VE has yet to be deployed. Autonomous multimodal communication is expected to open wider possibilities for all armed forces. Given the density of the land domain, the army is in a position to exploit the opportunities for human–machine teaming (HMT) exposure. Naval and air forces will adopt platform-specific suites for specially selected operators to integrate with and leverage this emerging technology. The possession of a flexible means of communication that readily adapts to virtual training will enhance planning and mission rehearsals tremendously.
Social implications: A multimodal communication system based on interaction, perception, cognition and visualization is still missing. Options to communicate, express and convey information in an HMT setting with multiple options, suggestions and recommendations will certainly enhance military communication, strength, engagement, security, cognition, perception as well as the ability to act confidently for a successful mission.
Originality/value: The objective is to develop a multimodal autonomous interactive robot for military communications. This survey reports the state of the art, what exists and what is missing, what can be done and possibilities of extension that support the military in maintaining effective communication using multimodalities. Progress is ongoing separately in areas such as machine-enabled speech, image recognition, tracking, visualizations for situational awareness, and virtual environments. At this time, there is no integrated approach for multimodal human robot interaction that offers flexible and agile communication. The report briefly introduces the research proposal about a multimodal interactive robot in military communication.
- Research Article
11
- 10.1109/mra.2025.3543957
- Jan 1, 2025
- IEEE Robotics & Automation Magazine
Translating human intent into robot commands is crucial for the future of service robots in an aging society. Existing human‒robot interaction (HRI) systems relying on gestures or verbal commands are impractical for the elderly, due to difficulties with complex syntax or sign language. To address the challenge, this article introduces a multimodal interaction framework that combines voice and deictic posture information to create a more natural HRI system. Visual cues are first processed by the object detection model to gain a global understanding of the environment, and then bounding boxes are estimated based on depth information. By using a large language model (LLM) with voice-to-text commands and temporally aligned selected bounding boxes, robot action sequences can be generated, while key control syntax constraints are applied to avoid potential LLM hallucination issues. The system is evaluated on real-world tasks with varying levels of complexity, using a Universal Robots UR3e manipulator. Our method demonstrates significantly better HRI performance in terms of accuracy and robustness. To benefit the research community and the general public, we made our code and design open source.
- Research Article
4
- 10.21518/ms2023-207
- Aug 19, 2023
- Meditsinskiy sovet = Medical Council
Introduction. The study of the spectrum of neurocognitive disorders in patients with arterial hypertension (AH) in order to create an effective therapeutic and rehabilitation strategy is an urgent direction of modern medicine.
Aim. To study neuropsychological characteristics in middle-aged and elderly patients with hypertension.
Materials and methods. 357 middle-aged and elderly patients with hypertension were examined. All patients underwent neuropsychological examination: Montreal Cognitive Assessment (MoCA), Schulte Table Test, Verbal Association Test, Trail Making Test (TMT), asthenia self-report questionnaire (MFI-20), O. Kopina Reader Adaptation Test, questionnaire on the level of life exhaustion, and Hospital Anxiety and Depression Scale (HADS).
Results. The analysis of the results of general neuropsychological testing showed a deviation from the reference values in the majority of participants. In the Schulte test, an increase in execution time was noted in 50% of elderly patients and in 21% of middle-aged patients. In the TMT, an increase in execution time was noted for part A in 88% of elderly patients and 58% of middle-aged patients, and for part B in 97% and 88% of patients, respectively. The MoCA test demonstrated pronounced cognitive impairment in 16% of middle-aged patients and in 35% of elderly patients. More than 97% of elderly and 88% of middle-aged patients showed a high level of asthenia in the MFI-20 test; life exhaustion was noted in 56% and 45%, and anxiety and depression in more than 50% of elderly and 35% of middle-aged patients, respectively.
Conclusion. In the studied groups of patients with hypertension, there was a decrease in the integral index of cognitive functions, as well as changes in indicators in tests characterizing the state of executive (control) functions, attention, speed of thought processes and semantic memory, with more pronounced deviations in the elderly. The described cognitive impairments were combined with a high level of psychoemotional tension, anxiety, depression and asthenia.
- Research Article
10
- 10.2208/jscej.2000.660_125
- Oct 20, 2000
- Doboku Gakkai Ronbunshu
Cost and time are essential aspects of completing construction works. Depending on the circumstances, construction works are analyzed under deterministic or stochastic conditions. Deterministic conditions are assumed when random factors can be sufficiently eliminated; otherwise, stochastic conditions must be considered. The requirements of construction works execution are described using models of construction technology and resources. Scheduling of the works is carried out by solving mixed linear programming problems under these requirements. In this paper, these methods are applied to cost-time scheduling of a small bridge erection as an example under stochastic conditions.
- Conference Article
- 10.1145/3569902.3569906
- Nov 21, 2022
Reduced Instruction Set Computer (RISC) architectures optimize a complex ISA by implementing only the most frequently used instructions in hardware; however, the application execution time increases significantly when heavily used instructions are executed in software. One technique that optimizes the trade-off between implementation cost and execution time is the use of a Multiprocessor System-on-Chip (MPSoC), in which RISC processors extend their ISA by sharing coprocessors that implement lesser-used instructions. This article analyses the impact of shared coprocessor failures on two RISC-V MPSoC architectures. We evaluated these architectures using two image processing applications and four failure rates in terms of power dissipation, energy consumption, area consumption, maximum operating frequency, and execution time. The experiments show a 16% maximum increase in execution time for the application with a low percentage of instructions executed. In contrast, for the application with the highest rate of coprocessor use, considering a one-fault scenario, the execution time does not increase significantly in one of the architectural configurations proposed for the MPSoC.