Articles published on Dialogue Tasks
482 Search results
- Research Article
- 10.1016/j.eswa.2025.130494
- Mar 1, 2026
- Expert Systems with Applications
- Jongmoon Jun + 4 more
Aspect-augmented distillation of task-oriented dialogues to small language models
- Research Article
- 10.1016/j.ipm.2025.104317
- Mar 1, 2026
- Information Processing & Management
- Dongning Rao + 2 more
Leveraging dynamic few-shot prompting and ensemble method for task-oriented dialogue with subjective knowledge
- Research Article
- 10.3390/sym18020372
- Feb 17, 2026
- Symmetry
- Shenghui Bao + 1 more
Task-oriented dialogue systems face a tension between comprehensive constraint elicitation (task adequacy) and conversational efficiency (minimizing turns). Current preference learning frameworks treat preferences as static and cannot capture how interaction states evolve over the course of a dialogue. We present Dual-DPO, a framework that embeds multi-objective preferences into data construction via turn-aware scoring. Our approach decouples objective balancing from policy updates through offline preference scalarization, addressing the optimization instability of online multi-objective reinforcement learning. Experiments on MultiWOZ 2.4 demonstrate a 28–35% reduction in dialogue turns while maintaining Joint Goal Accuracy > 89% (p<0.001). Pareto frontier analysis shows 94% coverage with hypervolume HV=0.847. Independent expert evaluation by 10 PhD-level researchers (n=300 assessments, inter-rater agreement α=0.78) confirms a 32% improvement in user satisfaction (p<0.001). Theoretical analysis demonstrates that offline scalarization achieves 3.2× lower gradient variance than online multi-reward optimization by eliminating sampling stochasticity, which correlates with improved optimization stability. These results highlight a symmetric and balanced treatment of competing objectives through Pareto-optimal trade-offs.
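To make the idea of offline preference scalarization more concrete, the following is a minimal sketch of how two objectives (task adequacy and turn efficiency) could be collapsed into one score before preference pairs are built. It is not the Dual-DPO scoring function: the `Candidate` fields, the weighted-sum form, the `weight` and `max_turns` parameters, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    """One candidate dialogue for the same user goal (hypothetical structure)."""
    text: str
    adequacy: float  # assumed task-adequacy score in [0, 1]
    turns: int       # number of dialogue turns used


def scalarize(c: Candidate, weight: float, max_turns: int) -> float:
    """Weighted sum of adequacy and turn efficiency (an assumed scalarization)."""
    efficiency = 1.0 - min(c.turns, max_turns) / max_turns
    return weight * c.adequacy + (1.0 - weight) * efficiency


def build_pair(cands: List[Candidate], weight: float = 0.5,
               max_turns: int = 20) -> Tuple[Candidate, Candidate]:
    """Return (chosen, rejected): the best and worst candidates by scalar score."""
    if len(cands) < 2:
        raise ValueError("need at least two candidates to form a preference pair")
    ranked = sorted(cands, key=lambda c: scalarize(c, weight, max_turns),
                    reverse=True)
    return ranked[0], ranked[-1]


# Illustrative values only, not results from the paper.
short = Candidate("short", adequacy=0.90, turns=6)
long_ = Candidate("long", adequacy=0.95, turns=14)
chosen, rejected = build_pair([short, long_])
print(chosen.text, rejected.text)
```

Because the balancing happens when the pairs are constructed, the resulting (chosen, rejected) data can be passed to a standard preference-optimization step without multiple reward signals at training time.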
- Research Article
- 10.1007/s11704-025-41285-5
- Feb 12, 2026
- Frontiers of Computer Science
- Yuan Ren + 4 more
Zero-shot reinforcement learning for multi-domain task-oriented dialogue policy
- Research Article
- 10.59429/esp.v11i2.4492
- Feb 11, 2026
- Environment and Social Psychology
- Xu Han + 2 more
Despite sustained pedagogical emphasis on communicative competence, oral fluency remains a persistent challenge for EFL learners, particularly in task-based speaking contexts involving time pressure. While previous research has examined the effects of task planning on linguistic performance, less attention has been paid to learners’ psychological experiences during task execution and how these experiences interact with cognitive and linguistic resources over time. This study investigates how three online task planning conditions—pressured online planning (POP), unpressured online planning (UOP), and hybrid online planning (HOP)—shape learners’ use of formulaic sequences (FSs), with a focus on both frequency and variation, as well as their relationship with working memory (WM). Ninety Chinese EFL undergraduates participated in an eight-week longitudinal intervention and completed dialogic narrative tasks at pre-test, post-test, and delayed post-test. Quantitative analyses examined changes in FS deployment and exploratory associations with WM capacity. To complement these analyses, stimulated recall interviews were conducted to capture learners’ perceived pressure, emotional responses, and strategic decision-making under different planning conditions. The results reveal condition-sensitive and time-dependent patterns in FS use, with distinct profiles emerging across planning conditions. Interview data suggest that these patterns are closely associated with learners’ psychological regulation under task constraints, particularly their tendency toward risk avoidance, reliance on familiar expressions, and prioritization of fluency under pressure. Together, the findings highlight the importance of integrating psychological perspectives into the interpretation of task-based language performance and offer implications for the design of planning conditions in L2 speaking instruction.
- Research Article
- 10.61227/arji.v8i1.708
- Feb 5, 2026
- Action Research Journal Indonesia (ARJI)
- Salem Aladi + 1 more
Generative AI is used by students as a study and writing tool, which could reshape how they read, argue, and evaluate ideas. This change is important in philosophy courses, where learning depends on dialogic inquiry, close reading, and evidence-based reasoning. This study investigates how AI influences and reshapes the teaching and learning of philosophical inquiry and critical thinking among digital-native university students. Using a mixed-methods design that combines pre-test and post-test quantitative assessment with interviews and classroom observations, the research examines the extent to which AI-supported dialogic tasks influence argument identification, fallacy detection, inferential reasoning, and conceptual clarity. A sample of 100 students was divided into an AI-supported group and a traditionally taught control group, and quantitative data were analyzed by comparing pre–post gains within and between the two groups. Post-test findings show significantly higher gains for the AI group (d = 2.91), particularly in inferential reasoning and conceptual clarity. Qualitative data from 20 student and 10 instructor interviews show that learners view AI as a cognitive partner that enhances explanatory clarity, expands interpretive possibilities, and increases confidence in constructing arguments, though concerns appear regarding overreliance. Instructor observations corroborate these patterns, indicating shifts in classroom dynamics. The results show greater engagement and question diversity, but diminished persistence when confronting challenging primary texts. The study concludes that while AI can meaningfully improve critical-thinking development in philosophy classrooms, its pedagogical success depends on structured instructor mediation to preserve deep reading, intellectual struggle, and reflective judgment.
- Research Article
- 10.3389/feduc.2026.1725080
- Jan 30, 2026
- Frontiers in Education
- Miguel Naranjo-Toro + 3 more
Digital internationalization has moved beyond “mobility-at-a-distance” toward course-embedded designs that can scale intercultural competence (IC) development. Using PRISMA, we mapped digital technologies and pedagogical strategies used to teach IC in higher education, outcome measures, direction of effects, and equity-related implementation conditions, and we added mechanism-oriented coding linking modality affordances to learning processes and IC outcomes. We searched Scopus, Web of Science, and ERIC and screened 133 records. Twenty-four studies (total n = 1,787) met eligibility criteria. We combined descriptive bibliometrics with an adapted JBI quality appraisal (median quality 78.6%, IQR 57.1–85.7), a SWiM synthesis of effect direction by strategy, and intervention-level mechanism coding. Evidence clusters around LMS/forums, videoconferencing and collaboration suites, immersive VR/360°, telepresence/telesimulation, MOOCs, and guided inquiry. Across modalities, the most consistent IC gains occur when interventions instantiate structured intercultural contact, facilitation and feedback, authentic co-production, and guided reflection/debriefing. Scenario-based and immersive formats are most effective when situated performance is paired with structured debriefing and feedback. By contrast, content-centric LMS/MOOC formats show mixed results unless instructors engineer dialogic tasks and active facilitation to convert exposure into interaction. Most studies rely on standardized self-report measures; performance or observational evidence is less common and concentrates in simulation and role-play contexts. Reported constraints include connectivity, language and time-zone barriers, and facilitation workload, underscoring the role of institutional support. 
Future research should strengthen comparative and longitudinal designs, report implementation fidelity and dosage, validate instruments across languages, and triangulate self-report with performance evidence and downstream academic/employability outcomes.
- Research Article
- 10.1109/tnnls.2026.3659341
- Jan 1, 2026
- IEEE Transactions on Neural Networks and Learning Systems
- Di Wu + 2 more
Dialog state tracking (DST) is an essential component of task-oriented dialog (ToD) systems. Existing few-shot DST models face challenges in effectively leveraging semantically related information and exhibit limited adaptability to dialog scenarios. To address these challenges, the dual-teacher and dual-prompt pool model for few-shot DST (DDP-DST) is proposed. Specifically, by enhancing key semantic information and syntactic structure, dual teacher models are constructed to generate pseudolabels from complementary perspectives. Self-training is employed to further improve state value generation. In addition, considering the intrinsic symmetry between the slot and state value generation tasks, a dual-prompt fine-tuning strategy is designed. A dynamic prompt pool is constructed to adaptively generate prompts. Reconstruction errors (REs) are fed back into the DDP-DST model, leading to improved accuracy in DST. Experimental results on the MultiWOZ 2.1 dataset demonstrate that DDP-DST outperforms baseline models such as SM2-3b, DS2, and SVAG, with average improvements of 4.3%, 2.4%, and 2.0%, respectively, in joint goal accuracy (JGA). Notably, with fewer than 1 billion parameters, DDP-DST achieves competitive performance in few-shot settings, even outperforming models with up to 10 billion parameters.
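For readers unfamiliar with the metric, joint goal accuracy counts a turn as correct only when every slot-value pair in the predicted dialog state matches the reference state. The sketch below shows that computation on toy data. The slot names and values are made up for illustration, and real evaluations on MultiWOZ typically apply value normalization that is omitted here.

```python
from typing import Dict, List

# A dialog state maps slot names to values, e.g. {"hotel-area": "north"}.
State = Dict[str, str]


def joint_goal_accuracy(predicted: List[State], gold: List[State]) -> float:
    """Fraction of turns whose predicted state equals the gold state exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold:
        return 0.0
    exact = sum(1 for p, g in zip(predicted, gold) if p == g)
    return exact / len(gold)


# Toy example: the second turn has one wrong slot value, so it counts as wrong.
gold = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
]
pred = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},
]
print(joint_goal_accuracy(pred, gold))  # 0.5
```

A single wrong slot makes the whole turn count as an error, which is why JGA is a strict metric and why gains of a few points are meaningful.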
- Research Article
- 10.1007/s12369-025-01335-1
- Jan 1, 2026
- International Journal of Social Robotics
- Ke Xu + 3 more
Robots are now pervasive, leveraging their automation capabilities to assist humans across a diverse range of tasks. Nevertheless, end-users may have a limited understanding of the robot’s operation and typically assume a passive role when interacting with a robot performing a particular task. In this study, we address the critical need for effective explainability in human-robot interaction by comparing different methods of explaining robotic scenario information to end-users. The proposed methods use a labelled property graph-based chatbot that adheres to the IEEE Robotics Ontology Standards. We designed two virtual robotic scenarios and simulated their information flow using the Robot Operating System. A between-subjects experiment was conducted in which participants engaged with the system through various interaction methods to understand the two scenarios. These methods included real-time Linux Command Line Interface outputs, querying a chatbot, exploring knowledge graphs, or a combination of chatbot and knowledge graphs. The findings suggest that both the knowledge graphs and the chatbot significantly enhance the system’s explainability compared to simple Linux terminal output. Moreover, using knowledge graphs alongside the chatbot received better subjective evaluations on metrics such as clarity, usability, and robustness. This research contributes to the development of standardised labelled property graphs for representing scenario information in language-based human-robot interaction. The experiment design and evaluations also provide a way to assess the explainability of task-oriented dialogue systems both subjectively and objectively.
- Research Article
- 10.32604/cmc.2026.075777
- Jan 1, 2026
- Computers, Materials & Continua
- Ksenia Kharitonova + 3 more
Automating the Initial Development of Intent-Based Task-Oriented Dialog Systems Using Large Language Models: Experiences and Challenges
- Research Article
- 10.1177/29498732261419317
- Jan 1, 2026
- Neurosymbolic Artificial Intelligence
- Vasile Ionut Remus Iga + 1 more
Recent studies have demonstrated that large language models can perform various knowledge graph-related tasks, including knowledge graph construction, even in zero- and few-shot settings. However, they are prone to hallucinating information and producing non-deterministic outputs, which can result in flawed reasoning, even when the answers appear to meet user expectations. This unpredictability limits their integration into automated natural language processing pipelines, such as those used in chatbots or task-oriented dialogue systems. To explore the potential and limitations of large language models in knowledge graph tasks, we evaluate three prominent models, namely Mixtral-8x7b-Instruct-v0.1, GPT-3.5-Turbo-0125, and GPT-4o, on constructing static knowledge graphs. Our approach uses prompts based on the TELeR taxonomy in zero- and one-shot scenarios, within the context of a task-oriented dialogue system. Additionally, we propose a flexible evaluation framework that captures all usable information generated by the models, alongside traditional strict metrics, and introduce TODSet, a dataset tailored to gauge the performance of large language models on knowledge graph-related tasks. Our findings suggest that, with well-designed prompts containing sufficient detail and examples, large language models can effectively contribute to knowledge graph construction tasks.
- Research Article
- 10.1016/j.xcrm.2025.102547
- Jan 1, 2026
- Cell Reports Medicine
- Xinti Sun + 12 more
Model confrontation and collaboration: A debate intelligence framework for enhancing medical reasoning in large language models
- Research Article
- 10.1016/j.softx.2025.102409
- Dec 1, 2025
- SoftwareX
- María Jesús Rodríguez-Sánchez + 3 more
Spec2chat: A Python library for task-oriented dialogue generation from OpenAPI specifications
- Research Article
- 10.1016/j.neucom.2025.131058
- Nov 1, 2025
- Neurocomputing
- Shaghayegh Saffari + 2 more
Graph representation based reward shaping approach for addressing reward sparsity in task-oriented dialogue systems
- Research Article
- 10.1016/j.engappai.2025.111793
- Nov 1, 2025
- Engineering Applications of Artificial Intelligence
- Yasaman Saffari + 1 more
A Graph-based State Representation Learning for episodic reinforcement learning in task-oriented dialogue systems
- Research Article
- 10.1145/3771090
- Oct 30, 2025
- ACM Computing Surveys
- Zihao Yi + 6 more
This survey provides a comprehensive review of research on multi-turn dialogue systems, with a particular focus on those based on large language models (LLMs). It aims to (a) summarize existing LLMs and approaches for adapting LLMs to downstream tasks; (b) elaborate on recent advances in multi-turn dialogue systems, covering both LLM-based open-domain dialogue (ODD) and task-oriented dialogue (TOD) systems, along with datasets and evaluation metrics; and (c) discuss future directions and recent research problems arising from the development of LLMs and the increasing demands on multi-turn dialogue systems.
- Research Article
- 10.70102/afts.2025.1833.253
- Oct 30, 2025
- Archives for Technical Sciences
- Srikanth Reddy Keshireddy
The research focuses on incorporating Retrieval-Augmented Generation (RAG) methods into Oracle APEX to enhance the context, semantics, and accuracy of responses given by AI assistants in enterprise applications. We developed a fully integrated, low-latency RAG system tailored for Oracle’s low-code framework by embedding dense semantic search through FAISS vector stores and a hybrid BM25 keyword filter with transformer embedding retrieval pipelines. The system integrates with GPT-style language models through RESTful APIs, drawing on domain-specific corpora within Oracle databases to ground the generative process. Cross-functional domain experiments, including multi-turn interactions in HR, IT support, and finance, demonstrated improvements over non-RAG configurations, including a 21% increase in BLEU scores, 25% in ROUGE-L, and 34% in user satisfaction. Context Relevance Scores (CRS) were particularly high for multi-turn technical queries, underscoring the critical impact of retrieval accuracy on grounding generative outputs. The hybrid retriever also demonstrated strong performance in minimizing token overhead while maintaining contextual precision. These results illustrate how Oracle APEX can scale as a secure host environment for sophisticated AI-driven feedback systems and how the RAG architecture presented in this work can serve as a generic enhancement blueprint for task-oriented dialogue systems in low-code enterprise applications.
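One common way to combine a keyword filter with dense retrieval is a two-stage pipeline: shortlist documents by BM25 score, then rerank the shortlist by embedding similarity. The sketch below illustrates that pattern on precomputed scores. It is an assumption about how such a hybrid retriever might be arranged, not the system described in the abstract, and the function name, parameters, document IDs, and scores are all hypothetical.

```python
from typing import Dict, List


def hybrid_rank(bm25: Dict[str, float], dense: Dict[str, float],
                keep: int = 3, top_k: int = 2) -> List[str]:
    """Two-stage retrieval over precomputed scores (illustrative only)."""
    # Stage 1: keyword filter. Keep the `keep` documents with the highest BM25 score.
    shortlist = sorted(bm25, key=lambda d: bm25[d], reverse=True)[:keep]
    # Stage 2: rerank the shortlist by dense similarity; missing scores count as 0.
    reranked = sorted(shortlist, key=lambda d: dense.get(d, 0.0), reverse=True)
    return reranked[:top_k]


# Hypothetical scores for four documents and one query.
bm25_scores = {"hr_policy": 7.1, "vpn_setup": 5.4,
               "expense_form": 0.3, "leave_rules": 6.0}
dense_scores = {"hr_policy": 0.62, "vpn_setup": 0.20,
                "expense_form": 0.90, "leave_rules": 0.81}
print(hybrid_rank(bm25_scores, dense_scores))  # ['leave_rules', 'hr_policy']
```

Here `expense_form` has the highest dense score but is dropped by the keyword filter, which shows the trade-off of filtering first: fewer candidates reach the generator, at the risk of discarding semantically relevant documents with little lexical overlap.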
- Research Article
- 10.1145/3770857
- Oct 13, 2025
- ACM Transactions on Autonomous and Adaptive Systems
- Alexandre Yukio Ichida + 1 more
Building conversational agents to help humans in domain-specific tasks is challenging since the agent needs to understand natural language and act on it while accessing domain expert knowledge. Modern natural language processing techniques have led to an expansion of conversational agents, with recent pretrained language models achieving increasingly accurate language recognition results using ever-larger open datasets. However, the black-box nature of such pretrained language models obscures the agent's reasoning and its motivations when responding, leading to unexplained dialogues. In this work, we develop a belief-desire-intention (BDI) agent as a task-oriented dialogue system to introduce human-like mental attitudes that describe the agent's behavior during a dialogue. We compare the BDI model with a pipeline task-oriented dialogue system architecture by leveraging existing components from dialogue systems and developing the agent's intention selection as a dialogue policy. We show that combining traditional agent modelling approaches, such as BDI, with more recent learning techniques can result in efficient and scrutable dialogue systems.
- Research Article
- 10.1017/s1366728925100564
- Oct 7, 2025
- Bilingualism: Language and Cognition
- Jianmin Gao + 1 more
We explored the relationships between L2 utterance fluency and cognitive fluency in monologic and dialogic tasks. The study involved 136 Chinese university-level English learners. Utterance fluency was measured through speed, breakdown, and repair fluency aspects. Cognitive fluency was indicated by L2 lexical and syntactic processing efficiency measures. Stepwise regression models, including metrics of L2-specific cognitive fluency, L2 knowledge, and L1 utterance fluency as predictors, targeted L2 utterance fluency as the dependent variable. We found that L2 cognitive fluency predicted limited variance in utterance fluency, with its influence more evident in monologues. L2 lexical processing efficiency was as important as syntactic processing efficiency in the monologic task but more important in the dialogic task. Moreover, L2 processing speed had a more significant impact on utterance fluency than processing stability across both contexts. We suggest that cognitive fluency is not the sole determinant of utterance fluency; L2 knowledge and L1 utterance fluency play non-negligible roles.
- Research Article
- 10.48084/etasr.12403
- Oct 6, 2025
- Engineering, Technology & Applied Science Research
- Ussen Kimanuka + 3 more
As Artificial Intelligence (AI) advances, conversational agents are increasingly used across sectors, including humanitarian response. However, current systems and datasets mainly support high-resource languages and open-domain tasks, resulting in significant limitations in addressing low-resource, domain-specific needs. This study addresses this gap by focusing on a Congolese Swahili corpus collected from Short Message Service (SMS) messages and call-center humanitarian questions to develop an effective conversational agent for low-resource languages that supports communication during humanitarian crises. The goal of this research is to develop an effective Task-Oriented Dialogue System (ToDS) to assist displaced persons seeking humanitarian information in Congolese Swahili. We built a pipeline-based ToDS that converts natural language into SPARQL by utilizing a trained Named Entity Recognition (NER) model and a Dual Intent and Entity Transformer (DIET) classifier. This ToDS includes a humanitarian-specific ontology and dynamically queries a local triple store with data derived from the Humanitarian Data Exchange (HDX). The preliminary results indicate high accuracy in entity recognition and intent classification, which enables precise and timely information responses. The agent effectively provides context-relevant answers to humanitarian questions in crisis interactions. The findings demonstrate that applying Natural Language Understanding (NLU) methods in a low-resource, crisis-based context is viable and impactful. This ToDS offers a scalable solution for improving information accessibility in humanitarian emergencies and during forced internal displacements.