Uncomfortable by Design: The Future of Teams Research

Abstract

Teams research has never been for those who prefer easy answers. The future of the field will belong to those willing to ask harder questions. We must reduce our reliance on comfortable simplifications and treat variability as substantive, context as fundamental, and artificial intelligence as an active participant in collaboration. The future of teams research will depend on scholars willing to embrace messiness without sacrificing rigor.

Similar Papers
  • Research Article
  • Cite Count: 19
  • 10.58600/eurjther1719
We Asked ChatGPT About the Co-Authorship of Artificial Intelligence in Scientific Papers
  • Jul 22, 2023
  • European Journal of Therapeutics
  • Ayşe Balat + 1 more

A few weeks ago, we published an editorial discussion on whether artificial intelligence applications should be authors of academic articles [1]. We were delighted to receive more than one interesting reply letter to this editorial in a short time [2, 3]. We hope that opinions on this

  • Research Article
  • Cite Count: 9
  • 10.1016/j.dss.2018.08.002
Do less active participants make active participants more active? An examination of Chinese Wikipedia
  • Aug 8, 2018
  • Decision Support Systems
  • Yan Lin + 1 more

  • Conference Article
  • 10.46254/eu08.20250434
A Structural Path Analysis of Employee Perceptions on Artificial Intelligence Integration at Work
  • Jul 2, 2025
  • Gerald O Semifrania + 1 more

Artificial Intelligence (AI) technologies are becoming more integrated into organizational systems across the Philippines. This study is grounded in the idea that to make AI work effectively in the workplace, we must first understand how ready organizations are and how employees perceive the use of AI. We used a quantitative research design and applied multivariate analysis to examine the structural relationships between key factors influencing AI adoption. Specifically, we examined how organizational support, AI-enabled collaboration, employee awareness, and workplace productivity interact. The data came from various sectors—education, government, BPO, banking and finance, manufacturing, and others—capturing a broad view of current workplace dynamics. Through structural path analysis, we focused on five core dimensions: Organizational Support for AI Adoption (OSA), AI Participation in Collaboration and the Work Environment (ECWE), Awareness of AI (AIA), Dependence and Workplace Productivity (DWP), and the Impact on Employment and Job Security (IEJS). Findings reveal that strong organizational support is critical in creating the conditions for successful AI use, particularly regarding teamwork and collaboration. Environments that foster collaborative work tend to generate better outcomes from AI efforts. At the same time, when employees are more aware and knowledgeable about AI, they are more likely to use it effectively and benefit from its capabilities. This study highlights the need for organizations to strengthen their internal support systems, promote collaborative practices, and invest in employee upskilling to make the most out of AI. For institutions planning to adopt or expand AI technologies, the results point to a clear takeaway: institutional readiness must go hand in hand with employee involvement and awareness.

  • Research Article
  • Cite Count: 2
  • 10.7759/cureus.90212
Comparing Artificial Intelligence Large Language Models in Medical Training: A Performance Analysis of ChatGPT and DeepSeek on United States Medical Licensing Examination (USMLE) Style Questions
  • Aug 16, 2025
  • Cureus
  • Runze Zhang + 4 more

Introduction: The integration of artificial intelligence (AI) into medical education is reshaping how students prepare for standardized examinations. Prior studies have shown that AI models can achieve high accuracy on United States Medical Licensing Examination (USMLE) questions, highlighting their potential for examination preparation. ChatGPT (GPT), especially the 4o model, is one of the most widely used AI models; however, its accessibility is limited by subscription costs and regional censorship. DeepSeek (DS), a newer AI model, offers free access and has demonstrated comparable performance in general tasks. In this study, we compared the performance of GPT-4o and DS DeepThink R1 on the AMBOSS medical board preparation question bank to evaluate their potential and limitations as supplementary tools in medical education. Methods: We extracted 1,079 USMLE-style multiple-choice questions from the AMBOSS question bank. Questions were categorized by USMLE Step 1 and Step 2 examinations and further grouped by topic, resulting in 36 categories. Each question was assigned a difficulty level (easy, intermediate, or hard) based on AMBOSS grading criteria. To ensure balanced representation, we randomly selected 10 questions per difficulty level per category. Questions and answer choices were copied verbatim from the AMBOSS website and input into GPT-4o and DS R1 without any modification. Model responses were scored as correct or incorrect, and correctness rates were compared across GPT-4o, DS R1, and AMBOSS user performance. Results: Both GPT and DS outperformed AMBOSS users, with overall accuracies of 88.79%, 78.68%, and 56.98%, respectively. Comparing GPT and DS, GPT performed significantly better overall (t=7.90, p<0.0001). When stratified by examination type, GPT achieved significantly higher accuracy than DS in both Step 1 (0.89 vs. 0.78, p < 0.0001) and Step 2 (0.88 vs. 0.80, p < 0.0001). GPT consistently showed higher accuracy than DS at all three difficulty levels.
However, when further stratified by examination type, statistically significant differences were observed only in intermediate (p = 0.0002) and hard (p = 0.0021) questions in both Step 1 and Step 2. Conclusion: Our findings demonstrated that both AI models outperformed human learners, with GPT-4o showing superior accuracy, particularly in intermediate and hard questions. While DS underperformed relative to GPT, its free accessibility and competitive accuracy in easy questions suggest that it may serve as a viable alternative, particularly in resource-limited settings.

  • Abstract
  • Cite Count: 5
  • 10.1136/bmjebm-2019-pod.77
65 Ethical, legal and social implications of artificial intelligence systems for screening and diagnosis
  • Dec 1, 2019
  • BMJ Evidence-Based Medicine
  • Stacy Carter + 5 more

Forms of artificial intelligence (AI), including machine learning-based systems, are rapidly making their way into healthcare. The recent rise of machine learning—especially the development of deep neural networks—has rapidly improved...

  • Research Article
  • Cite Count: 4
  • 10.1007/s44163-025-00233-9
Limitations of risk-based artificial intelligence regulation: a structuration theory approach
  • Feb 12, 2025
  • Discover Artificial Intelligence
  • Lily Ballot Jones + 2 more

Artificial Intelligence (AI) is transforming the way we live and work. The disruptive impact and risks of Generative AI have accelerated the global transition from voluntary AI ethics guidelines to mandatory AI regulation. The European Union AI Act is the world’s first horizontal and standalone law governing AI that came into force in August 2024, just as other jurisdictions, countries and states, are navigating possible modes of regulation. Starting with the EU AI Act, most of the current regulatory effort follows a risk-based classification approach. While this is prescriptive and application-focused, it overlooks the complex circular impacts of AI and the inherent limitations of measurement of risk, overemphasis on high-risk classification, perceived trustworthiness of AI and the geopolitical power imbalance of AI. This article contributes an overview of the current landscape of AI regulation, followed by a detailed assessment of the limitations and potential means of addressing these limitations through a structuration theory approach. Summarily, this approach can be used to recognise AI systems as agents that actively participate in the duality of structure, and the subsequent shaping of society. It acknowledges the direct negotiation of agency granted to machines alongside their ability to determine an understanding from given inputs, which then qualifies AI as an active participant in the recursive structuration of society. This agentic view of AI in the structuration theory approach complements ongoing efforts to develop a comprehensive and balanced AI regulation.

  • Research Article
  • Cite Count: 11
  • 10.1287/ijds.2023.0007
How Can IJDS Authors, Reviewers, and Editors Use (and Misuse) Generative AI?
  • Apr 1, 2023
  • INFORMS Journal on Data Science
  • Galit Shmueli + 7 more

  • Supplementary Content
  • Cite Count: 5
  • 10.1016/j.isci.2021.102136
Building an interdisciplinary team set on bringing the sense of smell to computers
  • Feb 16, 2021
  • iScience
  • Alexander B Wiltschko

  • Research Article
  • 10.28963/8.1.5
Emergent Paradoxes: Integrating AI into Zöe and Systemic Thinking through Creativity and Disruption
  • Mar 11, 2025
  • Murmurations: Journal of Transformative Systemic Practice
  • Hugh Palmer

As part of the broader system of zoe, AI cannot be reduced to an object of control. Rather, it is part of the living, relational systems that sustain life. This paper moves beyond the binary of human and non-human, exploring AI as an active participant in the continuous flows of connection that define life itself. The paper explores the integration of artificial intelligence (AI) into systemic thinking and the broader context of zoe (life beyond the human) through a co-authored experiment between a human (Hugh Palmer) and ChatGPT 4o, an AI model developed for language generation. Drawing on interdisciplinary perspectives, including Gregory Bateson’s cybernetics (Bateson, 1972), Rosi Braidotti’s posthumanism (Braidotti, 2019), and Indigenous knowledge systems (Kimmerer, 2013; Cajete, 2000), the paper reimagines AI not as a tool for human control but as a co-evolving participant in dynamic, living systems. However, this raises a series of emergent paradoxes: How does AI enhance connection while simultaneously disrupting relationality? Can AI truly integrate into zoe while being a product of capitalist infrastructures (Braidotti, 2019; Parisi, 2018)? Does treating AI as a participant in systemic flows risk anthropomorphising it, thereby reinforcing the very binaries we seek to overcome (Barad, 2007)? These questions underscore some of the complexities of AI’s role within systemic practice. The concept of relational ethics is central to this exploration, as the paper argues for an ethical AI development grounded in mutual influence, flow, and the principles of second-order cybernetics (Bateson, 1972; Maturana & Varela, 1980). By incorporating the notion of autopoiesis, the self-generating capacity of systems (Maturana & Varela, 1980), the paper challenges dualistic thinking and presents a framework for AI to support self-sustaining systems rather than disrupt them.
Through a systemic lens, the paper considers the implications of AI for therapy and community work, encouraging systemic practitioners to engage with AI in ways that honour complexity, ethics, and relationality (Simon, 2014). The authors call for an adaptive, responsible approach to AI, one that is guided by systemic wisdom and grounded in the web of life.

  • Research Article
  • 10.17072/2078-7898/2025-3-317-328
Ontology in Dialogue: The Birth of Language and Meaning at the Intersection of Human and Artificial Intelligence
  • Jan 1, 2025
  • Вестник Пермского университета. Философия. Психология. Социология
  • Vladimir I Arshinov + 2 more

This article deals with a polyphonic dialogue developing at both a conference and ongoing interdisciplinary seminars dedicated to the ontology of artificial intelligence. We take as a premise the fundamental epistemological shift: the neural network ceases to be a passive object of study and becomes an active participant in the communicative act, capable of self-reflection. This gives rise to a unique dual dialogue: on the one hand, between researchers, holding diverse, sometimes opposing, positions (a philosopher being a proponent of synergetics, a pragmatic engineer, and an IT architect), and on the other hand, between a research team and artificial intelligence itself. The paper demonstrates that within this tense interaction, a new ontological reality is born — an «in-between» reality, which can be reduced to neither human consciousness nor machine computation. This reality is constituted by a special hybrid language where technical terms acquire existential depth and philosophical concepts gain operational specificity. The main conclusion of the research is that the ontology of AI does not precede our dialogue with it but arises directly from it as its emergent property. Therefore, the very act of investigation becomes an integral part of the investigated phenomenon.

  • Book Chapter
  • 10.69635/978-1-0690482-4-0-ch4
ARTIFICIAL INTELLIGENCE AS A LEGAL CATEGORY
  • Jun 23, 2025
  • Olena Kharytonova + 3 more

The study is devoted to the definition of artificial intelligence as a legal category. At the same time, the peculiarity of the term and its general social significance determine the fact that in order to find an answer to the question of the legal nature of artificial intelligence, it is relevant to analyze not only purely legal scientific ideas, but also philosophical, psychological, social, religious and other aspects of understanding artificial intelligence and the impact of this phenomenon on various spheres of public life. In order to define the concept of artificial intelligence, the author examined the attempts to define it that have already been used in legislation. In particular, the author analyzed the Artificial Intelligence Act (AI Act), which entered into force in the EU on August 1, 2024, and concluded that this law does not answer the question of the subjectivity or lack thereof of artificial intelligence. Considerable attention is paid to the legal framework of Ukraine, in particular, the Resolution of the Cabinet of Ministers of Ukraine No. 1556-p of December 2, 2020, which approved the Concept of Artificial Intelligence Development in Ukraine. This policy document uses the basic principles of the Guidelines of the Organization for Economic Cooperation and Development (OECD) on Artificial Intelligence (Recommendation of the Council on Artificial Intelligence), which Ukraine joined in 2019. The definition of artificial intelligence as a legal category in legal doctrine encounters a number of fundamental problems, but they all have a common denominator in identifying the legal nature of the bearers of such intelligence. 
The analysis of modern legal doctrine has revealed general approaches to understanding the relationship between the use of artificial intelligence and its liability for actions, namely: 1) positioning of robots with artificial intelligence as an object of social relations (under this approach, robots with artificial intelligence are perceived only as possible assistance in social relations where the subjects are individuals and legal entities) 2) positioning of artificial intelligence robots as separate subjects of legal relations (under this approach, artificial intelligence robots are perceived as separate independent subjects of social relations with the ability to realize and assess the significance of their actions and actions of other persons relatively independently and to a sufficient extent). Based on the study, the author offers her own solution to the problem of determining the legal status of artificial intelligence (robot, intellectual agent). First of all, the author proves the need to gradually prepare the legal system for the emergence of a new subject - an electronic person. It is timely to discuss the practical possibility of granting in the future the status of a quasi-legal entity to the most advanced artificial intelligence systems with a high degree of autonomy, since it does not fall under the category of a legal entity (which is also a fiction). It is substantiated that electronic persons should be understood as powerful artificial intelligence systems endowed with the status of a "quasi-legal" person having an appropriate scope of special legal capacity depending on their functional purpose and capabilities. The study draws historical and legal parallels with the experience of Ancient Rome in terms of using (involving) phenomena which were not originally subjects of law and then acquired such a possibility (municipalities, institutions, slaves, etc.) to participate in civil circulation. 
Recognition of artificial intelligence robots ("electronic persons", "intellectual agents") as a quasi-legal entity will entail a number of additional legal consequences. In particular, there is a need to include "cyber capacity" in the list of types of legal personality of a legal entity, i.e. the ability to be an active participant in relations in the IT sphere (to enter into contracts as a user, to be a member of social networks, to participate in interactive campaigns, etc.) Cyber capacity can be realized through both transactions and legal acts. As with legal entities, electronic entities should be subject to mandatory state registration in the relevant electronic registers. At the same time, a system of licensing the types of activities of such entities and establishing standards and norms to which such entities must comply, depending on the type of activity, is also possible. In addition to the development of the necessary regulations and standards, it is very important to use soft law. In this case, ethical standards of artificial intelligence are its value basis - they must be observed by all participants in legal relations: both private companies and executive authorities. The author argues that an electronic person can be defined as a set of technologies which are recognized by law as a participant in property and non-property relations. Such a person has the legal status of a quasi-legal entity, is registered in accordance with the procedure established by law, and has a special legal personality depending on the functional purpose (field of activity).

  • Research Article
  • 10.47408/jldhe.vi37.1746
Unveiling higher education students’ experiences of using artificial intelligence: a cross-institutional qualitative study
  • Sep 30, 2025
  • Journal of Learning Development in Higher Education
  • Lina Petrakieva + 5 more

Higher Education (HE) has yet to fully embrace the potential of artificial intelligence (AI), likely due to lack of funding, a general reticence to take risks or adopt innovations, limited empirical research and theoretical groundings, together with an emerging understanding of the role of such technology in HE (Wheeler, 2019; McGrath et al., 2024). Lack of digital literacy (such as AI literacy) among educators and students also poses a significant barrier (Lincoln and Kearney, 2019 cited in Essien, Bukoye, O’Dea & Kremantzis, 2024; Mah & Groß, 2024; Tully et al., 2025). Those who use AI in education may fail to recognise the constructivist and developmental nature of learning, imposing instead behaviourism-based teaching methods and an objectivist epistemology (Bates et al., 2020). Research on AI in education is developing as AI technology evolves (McGrath et al., 2024). There is a tendency to focus on the negative implications of AI in learning and teaching, but there are calls for greater consideration of its strengths (Bates et al., 2020). Research tends to favour positivist paradigms (Budhathoki et al., 2024; Zhao et al., 2024) over understanding students’ subjective experiences of engaging with AI, which offers important insights into its potential impact in enhancing and hindering learning. Consequently, a team of researchers from four UK-based HE institutions are exploring students’ experiences of using AI in their studies. Following delivery of a learning-development-themed AI workshop, used partly as a recruitment strategy, we are using a qualitative approach that allows for sensitivity to the social processes in which experiences are embedded (Creswell, 2009). Thematic analysis will give rise to themes that capture how students are using AI, possible barriers to accessing it, and affective dimensions that may hinder/facilitate engagement.
By sharing these themes, we hope to provide a more granular perspective, unearthing nuanced and authentic insights from students from multiple institutions into how they are (or are not) using AI. The findings will have implications for how learning developers can best support the use of AI to enhance learning while addressing accessibility, inclusivity, and affective considerations.

  • Research Article
  • 10.30560/sdr.v7n2p148
From Tool to Subject: AI's Participation in Film Production
  • Jun 22, 2025
  • Sustainable Development Research
  • Tao Xuancheng

Artificial intelligence (AI) has been applied to a variety of industries, and the film industry is no stranger to it. AI was once considered a supporting actor in areas such as CGI rendering and modification. Now, however, it is becoming an active participant in every stage of film production. From writing the script to directing, and from casting to visual effects and audience engagement, AI has gone from merely assisting to being part of the creative process, and in some cases even leading it. This paper examines how AI is transforming the filmmaking process, focusing on AI's shift from object to subject as it becomes a semi-autonomous creative collaborator. We then look at actual applications such as AI-generated scripts, machine-learning-based visual enhancements, virtual actors, and predictive analytics for marketing. We also consider the philosophical and moral questions raised by AI's involvement in work usually made by humans, to give a broad sense of how the rising power of AI is altering authorship, creation, and efficiency in the filmmaking arts. The study concludes with reflections on how AI will affect filmmaking in the future and offers advice to human filmmakers on how to collaborate with AI.

  • Research Article
  • Cite Count: 3
  • 10.33407/itlt.v104i6.5890
TEACHERS’ AND STUDENTS’ ATTITUDES TOWARDS THE USE OF ARTIFICIAL INTELLIGENCE: ALL-UKRAINIAN RESEARCH
  • Dec 30, 2024
  • Information Technologies and Learning Tools
  • Stanislav Dovgyi + 3 more

The steady rise of artificial intelligence (AI) across multiple domains, particularly education, marks a transformative period for the field. Understanding the essential role of teachers and students as active participants in this transformation, as well as the factors influencing their perceptions and attitudes toward AI, is critical. In Ukraine, the emergence and rapid spread of accessible AI tools occurred during a time of full-scale military conflict, bringing about drastic disruptions to traditional educational processes. This study provides insights into these impacts by analyzing the results of a nationwide survey conducted in 2023 on AI's role in education, gathering perspectives from two main groups: educators (N = 1734) and students in grades 8–11 (N = 1448). The data reveal distinct differences in teachers' and students' attitudes towards integrating AI into education. While many teachers recognize AI's potential to aid in tasks like test creation, creative task development, and student progress tracking, they also express concerns about ethical implications and the risk of academic dishonesty. In contrast, a substantial portion of students view AI as a valuable tool for enhancing learning and promoting self-directed education. Additionally, the study identifies an inverse relationship between the duration of a teacher's professional experience and their frequency of AI use, suggesting that younger educators may be more inclined to adopt these technologies. Among students, however, a positive correlation exists between their year of study and the frequency of AI tool utilization, indicating a gradual increase in AI engagement with advancing grade levels. Based on the results, it can be concluded that AI is currently an additional option for educational activities that will become a necessity in the near future. Therefore, retraining and upskilling teachers and providing them with appropriate quality tools is an essential and urgent task.

  • Research Article
  • Cite Count: 159
  • 10.1002/ail2.61
DARPA's explainable AI (XAI) program: A retrospective
  • Dec 1, 2021
  • Applied AI Letters
  • David Gunning + 3 more

Summary of Defense Advanced Research Projects Agency's (DARPA) explainable artificial intelligence (XAI) program from the program managers' and evaluator's perspective. Defense Advanced Research Projects Agency (DARPA) formulated the explainable artificial intelligence (XAI) program in 2015 with the goal to enable end users to better understand, trust, and effectively manage artificially intelligent systems. In 2017, the 4-year XAI research program began. Now, as XAI comes to an end in 2021, it is time to reflect on what succeeded, what failed, and what was learned. This article summarizes the goals, organization, and research progress of the XAI program. Dramatic success in machine learning has created an explosion of new AI capabilities. Continued advances promise to produce autonomous systems that perceive, learn, decide, and act on their own. These systems offer tremendous benefits, but their effectiveness will be limited by the machine's inability to explain its decisions and actions to human users. This issue is especially important for the United States Department of Defense (DoD), which faces challenges that require the development of more intelligent, autonomous, and reliable systems. XAI will be essential for users to understand, appropriately trust, and effectively manage this emerging generation of artificially intelligent partners. The problem of explainability is, to some extent, the result of AI's success. In the early days of AI, the predominant reasoning methods were logical and symbolic. These early systems reasoned by performing some form of logical inference on (somewhat) human readable symbols. Early systems could generate a trace of their inference steps, which could then become the basis for explanation. As a result, there was significant work on how to make these systems explainable.1-5 Yet, these early AI systems were ineffective; they proved too expensive to build and too brittle against the complexities of the real world. 
Success in AI came as researchers developed new machine learning techniques that could construct models of the world using their own internal representations (eg, support vectors, random forests, probabilistic models, and neural networks). These new models were much more effective, but necessarily more opaque and less explainable. The year 2015 was an inflection point in the need for XAI. Data analytics and machine learning had just experienced a decade of rapid progress.6 The deep learning revolution had just begun, following the breakthrough ImageNet demonstration in 2012.6, 7 The popular press was alive with animated speculation about Superintelligence8 and the coming AI Apocalypse.9, 10 Everyone wanted to know how to understand, trust, and manage these mysterious, seemingly inscrutable, AI systems. 2015 also saw the emergence of initial ideas for providing explainability. Some researchers were exploring deep learning techniques, such as the use of deconvolutional networks to visualize the layers of convolutional networks.11 Other researchers were pursuing techniques to learn more interpretable models, such as Bayesian Rule Lists.12 Others were developing model-agnostic techniques that could experiment with a machine learning model—as a black box—to infer an approximate, explainable model, such as LIME.13 Yet, others were evaluating the psychological and human-computer interaction aspects of the explanation interface.13, 14 DARPA spent a year surveying researchers, analyzing possible research strategies, and formulating the goals and structure of the program. In August 2016, DARPA released DARPA-BAA-16-53 to call for proposals. The stated goal of explainable artificial intelligence (XAI) was to create a suite of new or modified machine learning techniques that produce explainable models that, when combined with effective explanation techniques, enable end users to understand, appropriately trust, and effectively manage the emerging generation of AI systems. 
The target of XAI was an end user who depends on decisions or recommendations produced by an AI system, or actions taken by it, and therefore needs to understand the system's rationale. For example, an intelligence analyst who receives recommendations from a big data analytics system needs to understand why it recommended certain activity for further investigation. Similarly, an operator who tasks an autonomous system needs to understand the system's decision-making model to appropriately use it in future missions. The XAI concept was to provide users with explanations that enable them to understand the system's overall strengths and weaknesses; convey an understanding of how it will behave in future/different situations; and perhaps permit users to correct the system's mistakes. The XAI program assumed an inherent tension between machine learning performance (eg, predictive accuracy) and explainability, a concern that was consistent with the research results at the time. Often the highest performing methods (eg, deep learning) were the least explainable and the most explainable (eg, decision trees) were the least accurate. The program hoped to create a portfolio of new machine learning and explanation techniques to provide future practitioners with a wider range of design options covering the performance-explainability trade space. If an application required higher performance, the XAI portfolio would include more explainable, high-performing, deep learning techniques. If an application required more explainability, XAI would include higher performing, interpretable models. 
The program was organized into three major technical areas (TAs), as illustrated in Figure 1: (a) the development of new XAI machine learning and explanation techniques for generating effective explanations; (b) understanding the psychology of explanation by summarizing, extending and applying psychological theories of explanation; and (c) evaluation of the new XAI techniques in two challenge problem areas: data analytics and autonomy. The original program schedule consisted of two phases: phase 1, Technology Demonstrations (18 months); and phase 2, Comparative Evaluations (30 months). During phase 1, developers were asked to demonstrate their technology against their own test problems. During phase 2, the original plan was to have developers test their technology against one of two common problems (Figure 2) defined by the government evaluator. At the end of phase 2, the developers were expected to contribute prototype software to an open source XAI toolkit. In May 2017, XAI development began. Eleven research teams were selected to develop the Explainable Learners (TA1) and one team was selected to develop the Psychological Models of Explanation. Evaluation was provided by the Naval Research Lab. The following summarizes those developments and the final state of this work at the end of the program. An interim summary of the XAI developments at the end of 2018 is given in Gunning and Aha.15 The program anticipated that researchers would examine the training process, model representations, and, importantly, explanation interfaces. Three general approaches were envisioned for model representations. Interpretable model approaches would seek to develop ML models that were inherently more explainable and more introspectable for machine learning experts. Deep explanation approaches would leverage deep learning or hybrid deep learning approaches to produce explanations in addition to predictions. 
Finally, model induction techniques would create approximate explainable models from more opaque, black-box models. Explanation interfaces were expected to be a critical element of XAI, connecting a user to the model to enable them to understand and interact with the decision-making process. As the research progressed, 11 XAI teams explored a number of machine learning approaches, such as tractable probabilistic models16 and causal models, and explanation techniques such as state machines generated by reinforcement learning algorithms,17 Bayesian teaching,18 visual saliency maps,19-24 and network and GAN dissection.24-26 Perhaps the most challenging and most unique contributions came from the combination of machine learning and explanation techniques27 to conduct well-designed psychological experiments to evaluate explanation effectiveness.28-31 As the program progressed, we also gained a more refined understanding of the spectrum of users and development timeline (Figure 3). The program structure anticipated the need for a grounded psychological understanding of explanation. One team was selected to summarize current psychological theories of explanation to assist the XAI developers and the evaluation team. This work began with an extensive literature survey on the psychology of explanation and previous work on explainability in AI.32 Originally, this team was asked to (a) produce a summary of current theories of explanation, (b) develop a computational model of explanation from those theories, and (c) validate the computational model against the evaluation results from the XAI developers. Developing computational models proved to be a bridge too far, but the team did gain a deep understanding of the area and successfully produced descriptive models. These descriptive models were critical to supporting effective evaluation approaches, which involved carefully designed user studies, carried out in accordance with DoD human subject research guidelines. 
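As an illustration of the model induction approach described above (creating an approximate explainable model from an opaque one), the following minimal sketch fits a one-rule decision stump to the outputs of a stand-in black box and reports how often the two agree. Everything here is an assumption made for illustration: the `black_box` scoring rule, the two-feature inputs, the sample sizes, and the helper names `fit_stump`, `stump_predict`, and `fidelity` are hypothetical and are not taken from any XAI program system.

```python
import random


def black_box(x):
    """Stand-in for an opaque model; returns class 1 or 0 for a 2-feature input."""
    # Hypothetical nonlinear scoring rule, chosen only for illustration.
    score = 1.7 * x[0] - 0.4 * x[1] ** 2 + 0.3 * x[0] * x[1]
    return 1 if score > 0.5 else 0


def fit_stump(samples, labels):
    """Try every (feature, threshold, polarity) rule and keep the one
    that agrees most often with the black-box labels."""
    best_agree, best_rule = -1, None
    for feature in range(len(samples[0])):
        for threshold in sorted({s[feature] for s in samples}):
            for label_above in (0, 1):
                agree = sum(
                    (label_above if s[feature] > threshold else 1 - label_above) == y
                    for s, y in zip(samples, labels)
                )
                if agree > best_agree:
                    best_agree = agree
                    best_rule = {
                        "feature": feature,
                        "threshold": threshold,
                        "label_above": label_above,
                    }
    return best_rule


def stump_predict(stump, x):
    """Apply the induced rule: one feature compared with one threshold."""
    above = stump["label_above"]
    return above if x[stump["feature"]] > stump["threshold"] else 1 - above


def fidelity(stump, samples):
    """Fraction of samples on which the stump matches the black box."""
    matches = sum(stump_predict(stump, s) == black_box(s) for s in samples)
    return matches / len(samples)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
train = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(300)]
held_out = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]

# The surrogate is trained on the black box's own predictions, not on ground truth.
stump = fit_stump(train, [black_box(s) for s in train])

print(f"Induced rule: predict {stump['label_above']} when "
      f"feature {stump['feature']} > {stump['threshold']:.3f}")
print(f"Fidelity on held-out inputs: {fidelity(stump, held_out):.2f}")
```

Note that fidelity here measures agreement with the black box rather than accuracy against ground truth. A surrogate of this kind is readable precisely because it is simpler than the model it approximates, so some disagreement is expected.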
Figure 4 illustrates a top-level descriptive model of the XAI explanation process. Evaluation was originally envisioned to be based on a common set of problems, within the data analytics and autonomy domains. However, it quickly became clear that it would be more valuable to explore a variety of approaches across a breadth of problem domains. In order to evaluate the performance in the final year of the program, the evaluation team, led by Eric Vorm, PhD, of the US Naval Research Laboratory (NRL), developed an explanation scoring system (ESS). This scoring system provided a quantitative mechanism for assessing the designs of XAI user studies in terms of technical and methodological appropriateness and robustness. The ESS enabled the assessments of multiple elements of each user study, including the task, domain, explanations, explanation interface, users, hypothesis, data collection, and analysis to ensure that each study met the high standards of human subject research. XAI evaluation measures are shown in Figure 5, and include functional measures, learning performance measures, and explanation effectiveness measures. The DARPA XAI program demonstrated definitively the importance of carefully designing user studies in order to accurately evaluate the effectiveness of explanations in ways that directly enhance appropriate use and trust by human users, and appropriately support human-machine teaming. Often, multiple types of measures (ie, performance, functionality, and explanation effectiveness) will be necessary to evaluate the performance of an XAI algorithm. XAI user study design can be tricky, and the DARPA XAI program discovered that the most effective research teams were those with diverse, cross-disciplinary expertise (eg, computer science combined with human-computer interaction and/or experimental psychology). The XAI program explored many approaches, as shown in Table 1. 
Table 1 (entries surviving from the flattened table): Interactive debugger interface for visualizing poisoned training datasets. Work is applied on the IARPA TrojAI dataset.33; Establishing objective/quantitative criteria to assess value of explanations for ML models34; CNN-based one-shot detector, using network dissection to identify the most salient features41; Explanations produced by heat maps and text explanations42; Human-machine common ground modeling; Indoor navigation with a robot (in collaboration with GA Tech); Video Q&A; Human-assisted one-shot classification system by identifying the most salient features.

Three major evaluations were conducted during the program: one during phase 1 and two during phase 2. In order to evaluate the effectiveness of XAI techniques, researchers on the program designed and executed user studies. User studies are still the gold standard for assessing explanations. There were approximately 12 700 participants in user studies carried out by XAI researchers, including approximately 1900 supervised participants, where the individual was guided through the experiment by the research team (eg, in person or on Zoom) and 10 800 unsupervised participants, where the individual self-guided through the experiment and was not actively guided by the research team (eg, Amazon Mechanical Turk). In accordance with policy for all US DoD funded human subjects research, each research protocol was reviewed by a local Institutional Review Board (IRB) and then a DoD human research protection office reviewed the protocol and the local IRB findings. As mentioned earlier, there seemed to be a natural tension between learning performance and explainability. However, throughout the course of the program, we found evidence that explainability can improve performance. 
From an intuitive perspective, training a system to produce explanations provides additional supervision, via additional loss functions, training data, or other mechanisms, that encourages a system to learn more effective representations of the world. While this may not be true in all cases and significant work remains to characterize when explainable techniques will be more performant, it provides hope that future XAI systems can be more performant than current systems while meeting user needs for explanations. There currently is no universal solution to XAI. As discussed earlier, different user types require different types of explanations. This is no different from what we face interacting with other humans. Consider, for example, a doctor needing to explain a diagnosis to a fellow doctor, a patient, or a medical review board. Perhaps future XAI systems will be able to automatically calibrate and communicate explanations to a specific user within a large range of user types, but that is still significantly beyond the current state of the art. One of the challenges in developing XAI is measuring the effectiveness of an explanation. DARPA's XAI effort has helped develop foundational technology in this area, but much more needs to be done, including drawing more from the human factors and psychology communities. Measures of explanation effectiveness need to be well established, well understood, and easily implemented by the developer community in order for effective explanations to become a core capability of ML systems. UC Berkeley's result21 demonstrating that advisability, the ability for an AI system to take advice from a user, improves user trust beyond explanations is intriguing. Certainly, users will likely prefer systems where they can quickly correct the behavior of a system in the same ways that humans can provide feedback to each other. 
Such advisable AI systems that can both produce and consume explanations will be key to enabling closer collaborations between humans and AI systems. Close collaboration is required across multiple disciplines including computer science, machine learning, artificial intelligence, human factors, and psychology, among others, in order to effectively develop XAI techniques. This can be particularly challenging, as researchers tend to focus on a single domain and often need to be pushed to work across domains. Perhaps in the future an XAI-specific research discipline will be created at the intersection of multiple current disciplines. Toward this end, we have worked to create an XAI Toolkit, which collects the various program artifacts (eg, code, papers, reports, etc.) and lessons learned from the 4-year DARPA XAI program into a central, publicly accessible location (https://xaitk.org/).48 We believe the toolkit will be of broad interest to anyone who deploys AI capabilities in operational settings and needs to validate, characterize, and trust AI performance across a wide range of real-world conditions and application areas. Today, we have a more nuanced, less dramatic, and, perhaps, more accurate understanding of AI than we had in 2015. We certainly have a more accurate understanding of the possibilities and the limitations of deep learning. The AI apocalypse has faded from an imminent danger to a distant curiosity. Similarly, the XAI program has produced a more nuanced, less dramatic, and, perhaps, more accurate understanding of XAI. The program certainly acted as a catalyst to stimulate XAI research (both inside and outside of the program). The results have produced a more nuanced understanding of XAI uses and users, the psychology of XAI, the challenges of measuring explanation effectiveness, as well as producing a new portfolio of XAI ML and HCI techniques. 
There is certainly more work to be done, especially as new AI techniques are developed that will continue to need explanation. XAI will continue as an active research area for some time. The authors believe that the XAI program has made a significant contribution by providing the foundation to launch that endeavor. David Gunning (now retired) is a three-time DARPA program manager, who created and managed the XAI program from its inception in 2016 to its mid-point in 2019. His portfolio of DARPA research programs made significant contributions to the development of AI over the past 25 years. He led the Personalized Assistant that Learns (PAL) program, which produced the technologies behind Apple's Siri. His Command Post of the Future (CPoF) program was later adopted by the US Army as their Command and Control system for use during the Iraq and Afghanistan conflicts. Between DARPA tours, David served in senior positions at Facebook AI, Palo Alto Research Center, Vulcan Inc, Cycorp and co-founded SET Corp. Eric Vorm, PhD, is a cognitive systems engineer and serves as the Deputy Director for the Laboratory for Autonomous Systems Research at the US Naval Research Laboratory in Washington, DC. Dr Vorm led the evaluation team for the DARPA Explainable AI program, and led the development of the first comprehensive criteria for the evaluation of explanations generated by machine learning. Dr Vorm's research focuses on the design of intelligent systems to achieve ideal human-machine teaming, with special emphasis on the role of transparency and explainability in supporting appropriate trust, safety, and reliability in high-risk, time-sensitive operational domains. Jennifer Yunyan Wang, PhD, is a computational neuroscientist with a special focus on AI. As a Systems Engineering and Technical Assistance (SETA) contractor to DARPA, she provided technical support and expertise to several programs including XAI, L2M, GARD, and AIE RED. 
After finishing postdoctoral fellowships in experimental neuroscience at Johns Hopkins University and the Food and Drug Administration, Jennifer joined Quantitative Scientific Solutions in 2018 as a consultant for government R&D and think tanks including IARPA and Center for Security and Emerging Technology. Matt Turek, PhD, joined DARPA's Information Innovation Office (I2O) as a program manager in July 2018 and took over as program manager of the XAI program in 2019. His portfolio also includes the Media Forensics (MediFor), Semantic Forensics (SemaFor), and Machine Common Sense (MCS) programs, as well as the Reverse Engineering of Deceptions (RED) AI Exploration. His research interests include computer vision, machine learning, artificial intelligence, and their application to problems with significant societal impact. Prior to his position at DARPA, Turek led a team at Kitware Inc developing computer vision technologies including large scale behavior recognition and modeling, object detection and tracking, activity recognition, normalcy modeling, and anomaly detection. Data sharing is not applicable to this article as no new data were created or analyzed in this editorial.
