Published Scholarly Research Articles from Tashkent University of Information Technology

Demand for customer support call centers has surged across many sectors because of the pandemic. However, the difficulty of staffing human services around the clock and fluctuating wait times make it hard to fully meet customer needs. This creates a growing need for automated customer service systems that can give domain-specific responses in customers' native languages, particularly in developing nations such as Uzbekistan, where call center usage is rising. Our system, “UzAssistant,” is designed to recognize user voices, present customer issues accurately in standardized Uzbek, and vocalize the responses to voice queries. It uses feature extraction and recurrent neural network (RNN)-based models for automatic speech recognition, achieving 96.4% accuracy in real-time tests with 56 participants. The system also incorporates a sentence similarity assessment method and a text-to-speech (TTS) synthesis component for the Uzbek language, which uses the WaveNet architecture to convert Uzbek text into speech.
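The abstract does not say which sentence similarity method UzAssistant uses. As a rough illustration only, the sketch below assumes a simple bag-of-words cosine similarity to match a recognized query against a list of known questions; the function names `cosine_similarity` and `best_match` and the sample questions are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the tokens that the two sentences share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def best_match(query: str, known_questions: list[str]) -> str:
    """Return the known question most similar to the query (assumed matching rule)."""
    return max(known_questions, key=lambda q: cosine_similarity(query, q))


if __name__ == "__main__":
    # Hypothetical placeholder questions, not data from the paper.
    questions = ["how do I check my balance", "how do I change my tariff plan"]
    print(best_match("check balance", questions))
```

A production system would more likely compare learned sentence embeddings, but the matching step has the same shape: score the query against each candidate and keep the highest-scoring one.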

Facial emotion recognition (FER) is of great importance in the field of human–machine interfaces. Because human facial expressions are intricate and images vary widely in facial pose and lighting conditions, FER remains a challenging task for computer-based models. Vision transformer (ViT) models have recently attained state-of-the-art results across various computer vision tasks, including image classification, object detection, and segmentation. Correcting data imbalance is also one of the most important aspects of building strong machine learning models: to avoid biased predictions and obtain reliable findings, the class distribution of the training dataset must be kept balanced. In this work, we have chosen two widely used open-source datasets, RAF-DB and FER2013. To resolve the imbalance problem, we also present a new, balanced dataset, created by applying data augmentation techniques and removing poor-quality images from the FER2013 dataset. We then conduct a comprehensive evaluation of thirteen different ViT models on these three datasets. Our investigation concludes that ViT models are a promising approach for FER tasks. Among these ViT models, Mobile ViT and Tokens-to-Token ViT models appear to be the most effective, followed by PiT and Cross Former models.
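The balancing idea described above can be illustrated with a minimal sketch. It assumes plain random oversampling, in which minority-class samples are duplicated until every class reaches the size of the largest one. The paper instead builds its balanced dataset with image augmentation and data cleaning, so this is a stand-in technique rather than the authors' procedure; the function name `oversample` and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies at random (with replacement) to reach the target size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


if __name__ == "__main__":
    # Toy file names and emotion labels, not taken from RAF-DB or FER2013.
    files = ["h1", "h2", "h3", "h4", "s1", "d1", "d2"]
    emotions = ["happy"] * 4 + ["sad"] + ["disgust"] * 2
    _, balanced = oversample(files, emotions)
    print(Counter(balanced))
```

In an image pipeline, each duplicated sample would typically be passed through a random transform such as a flip, crop, or brightness change, so the added examples are not exact copies.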

Voice-Controlled Intelligent Personal Assistant for Call-Center Automation in the Uzbek Language

Digital Signal and Image Processing Using Neural Networks

Comparative Analysis of Vision Transformer Models for Facial Emotion Recognition Using Augmented Balanced Datasets

Methodology of Teaching Subjects Based on Mobile Technologies in Higher Education Institutions

Developing Rule-Based and Gazetteer Lists for Named Entity Recognition in Uzbek Language: Geographical Names

Multi-Level Approach in Organizing the Energy Supply System in Telecommunication Networks

Analysis of Mathematical Modeling Model in Power Supply Systems

Application of a Genetic Algorithm in Planning the Optimal Route of Unmanned Aerial Vehicles Used for Large Area Monitoring

Automating the Transition from Dialectal to Literary Forms in Uzbek Language Texts: An Algorithmic Perspective

Utilizing Lexicographic Resources for Sentiment Classification in Uzbek Language
