Abstract

Chatbots, dialogue systems, and intelligent assistants are increasingly found in everyday devices and are used in the technical support of commercial organizations and in entertainment services. Systems for the English language already rest on a solid foundation. However, the process of natural language “recognition” is associated with a number of difficulties: it requires a large initial database of dialogues, exploration of various neural network architectures, and solutions to problems of the perception and morphology of the Russian language. The purpose of this study is therefore to develop a neural network model for natural Russian language processing that can serve as an open platform for building specialized dialogue systems. To this end, the design and training of neural network dialogue models based on modifications of the Transformer architecture are proposed. Custom parsers were developed to extract and post-process natural Russian dialogues from the Otvet@mail.Ru portal and from public chat rooms in the Telegram messenger for use in training the neural networks. The data set prepared with these parsers, now publicly available on the Internet, contains more than 22.5 million question-answer pairs in natural Russian. The data set was used in various configurations to train a number of neural network models designed by modifying the Sequence2Sequence, Transformer, and text2text architectures. The final version of the developed model generates answers to any user message of up to 200 characters and is integrated into a dialogue system with a client-server architecture through which users interact with the chatbot.
