A Primer on Seq2Seq Models for Generative Chatbots

Vincenzo Scotti, Licia Sbattella, Roberto Tedesco

doi:10.1145/3604281

Abstract

The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the field of Natural Language Processing (NLP). The approach has evolved quickly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks (such as tokenisation or POS tagging) to high-level, complex applications like machine translation and chatbots. This article examines recent trends in the development of open-domain, data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only can these architectures directly output the next turn in a conversation but, to some extent, they also allow control over the style or content of the response. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights into possible future directions, given the current state of research.


Journal: ACM Computing Surveys
Publication Date: Oct 6, 2023
Citations: 5
License type: CC BY


