- Research Article
- 10.1142/s2196888825400056
- Aug 25, 2025
- Vietnam Journal of Computer Science
- Irakli Kardava
This paper presents a study investigating the optimization of well-known NLP algorithms and approaches for the Georgian language, known for its unique linguistic features. Standard methods effective for well-resourced languages, including pretrained models like mBERT and embedding methods such as FastText, may lack flexibility and efficiency when applied to Georgian, often resulting in increased complexity and effort. To address these challenges, we propose a novel approach that leverages Georgian’s rich morphology, including case inflections, extensive suffixation, verb agreement, and conjugation patterns. This method refines algorithms such as Minimum Edit Distance, Text Classification, Language Modeling, and word-level semantic similarity by incorporating language-specific characteristics. Our approach reduces data sparsity and model complexity while preserving accuracy. Although developed for Georgian, it is also relevant for other fusional and agglutinative languages and contributes to reducing dependence on large corpora, supporting the creation of more human-like text.
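For orientation, the baseline that such work starts from can be sketched as below. This is the textbook Levenshtein dynamic program with unit costs, not the authors' morphology-aware refinement, which the abstract does not specify; the function name `edit_distance` is illustrative.

```python
def edit_distance(source: str, target: str) -> int:
    """Textbook Levenshtein distance: unit cost for insert, delete, substitute."""
    # previous[j] holds the distance between the processed prefix of `source`
    # and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # deleting i characters turns the prefix into ""
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
```

In a richly inflected language, two forms of the same lemma can sit several edits apart under this uniform cost model, which is presumably one reason a language-specific variant is attractive.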
- Research Article
- 10.1142/s2196888825300029
- Jul 31, 2025
- Vietnam Journal of Computer Science
- Houari Horkous
Nowadays, interpreting human emotions through speech has attracted great attention in human–computer interaction and artificial intelligence, and speech emotion recognition (SER) has become a significant field of research. SER, one of the interesting directions in speech processing, aims to predict the emotional state expressed by a speaker. SER systems encounter numerous challenges, such as the availability of appropriate emotional databases, the identification of suitable speech features, and the choice of an appropriate classification method. SER systems are mostly implemented for English, French, German, Indian, and Chinese languages, whereas SER for the Arabic language is still in a growing phase. This work presents a literature review of SER in Arabic in terms of emotional databases, speech features, and classification algorithms. This review contributes to filling the gap in the works on emotion recognition available in the Arabic language and constitutes a valuable resource for researchers in this field.
- Research Article
- 10.1142/s2196888825300030
- Jul 22, 2025
- Vietnam Journal of Computer Science
- Nooshin Yousefzadeh + 2 more
The use of graphs enables the systematic modeling, analysis, and optimization of complex systems in various real-world domains. When multiple types of relationships or interactions exist among entities, whether homogeneous or heterogeneous, graphs can be structured into multiple layers to model context-specific interdependencies and more effectively capture the complexity of these interactions. This survey introduces a novel and comprehensive taxonomy that categorizes the diverse spectrum of multi-layer graph embedding methods into three main groups: algorithmic, machine learning, and deep learning approaches. This survey aims to serve as a guide for the research community in navigating the graph embedding methods for multi-layer graphs by providing a structured summary, analysis, and comparison within and across different categories that highlight their respective strengths, limitations, and suitability for various application domains. Furthermore, we examine key factors that influence the selection of appropriate methods, including graph structure, inherent properties, application domain, learning paradigm, and computational constraints. Finally, we outline several promising research directions to advance this rapidly evolving field.
- Research Article
- 10.1142/s2196888825500174
- Jul 22, 2025
- Vietnam Journal of Computer Science
- Tram B.t Tran + 2 more
Aggregation methods for single-valued neutrosophic elements (SVNE) play a critical role in multi-criteria decision-making (MCDM); however, existing approaches often lack the flexibility and representational capacity required to model uncertainty, indeterminacy, and periodic variations inherent in real-world data. To address these limitations, this study proposes a novel class of trigonometric aggregation operators based on t-norm and t-conorm operations. By incorporating trigonometric functions such as [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text], the proposed operators for SVNE capture smooth transitions and periodic behaviors. Weighted algebraic and geometric trigonometric mean operators are introduced to improve the aggregation process under conditions of partial weight information. An MCDM framework is constructed using these operators and validated through a practical case study. Comparative experiments demonstrate that the proposed method outperforms existing approaches in terms of accuracy and expressiveness, while sensitivity analysis confirms its robustness and adaptability. These findings suggest that the proposed model offers a promising solution for decision-making tasks involving incomplete, uncertain, or periodic data.
- Research Article
- 10.1142/s2196888825500149
- Jul 10, 2025
- Vietnam Journal of Computer Science
- Dao Thanh Huyen + 3 more
This paper presents a study on the applicability of color-based emotional expression models for robots, originally developed for Japanese users, to a Vietnamese demographic. Our research introduces a newly collected dataset capturing Vietnamese participants’ interpretations of robot-emitted colors and gradients, addressing a significant gap in culturally adapted robotic communication. Through two experimental conditions, we evaluate the effectiveness of (1) solid colors and (2) color gradients in conveying eight primary emotions. A total of 330 Vietnamese participants engaged in the study, providing subjective ratings on emotion recognition accuracy. Our findings indicate that while some emotions (e.g. anger and trust) align well with color representations, others (e.g. surprise and disgust) exhibit cultural variations in perception. A one-way ANOVA test confirms statistically significant differences in how emotions are recognized across color models. These results underscore the importance of cultural adaptation in robot emotion expression and highlight potential directions for refining color-based models for cross-cultural applications. Future research will explore neural network-based emotion adaptation to enhance real-time, culturally responsive robotic interactions.
- Research Article
- 10.1142/s2196888825500162
- Jul 4, 2025
- Vietnam Journal of Computer Science
- Tiziano Labruna + 1 more
Task-oriented dialogue systems often face challenges when operating in dynamic environments where domains change frequently, affecting their performance. This paper introduces the Domain Change Simulator (DCS), a novel framework designed to simulate domain changes and evaluate their impact on dialogue systems. The simulator allows controlled experimentation with various types and magnitudes of domain shifts, providing valuable insights for system developers. In addition to simulating domain changes, the framework integrates Generative Dialogue Domain Adaptation (G-DDA), utilizing large language models to dynamically generate slot-value substitutions. This approach enhances the system’s adaptability to new domain contexts without requiring extensive retraining. Through a series of experiments on the MultiWOZ dataset, we demonstrate how the DCS enables precise predictions of system performance under evolving domains, offering a robust tool for improving the resilience of task-oriented dialogue agents. Our results highlight the potential of generative models in maintaining system coherence and domain adherence, even in the face of substantial domain shifts.
- Research Article
- 10.1142/s2196888825500113
- Jun 28, 2025
- Vietnam Journal of Computer Science
- Ha Minh Tan + 4 more
Monaural speech separation has been adopted in a range of applications, e.g. paralinguistics, hearing aids, online video conferences, human–machine interactions, and speech recognition. In recent years, deep learning has replaced previous methods for utterance separation tasks, e.g. Gaussian mixture models, hidden Markov models, independent component analysis, and non-negative matrix factorization. Time-frequency speech separation that uses masks as the training target offers state-of-the-art performance. In this paper, we propose a multi-mask learning and vector training method. First, a network is trained to learn deep embedding vectors. These embedding vectors are then used as the input features for another backbone network with the multi-mask target. Knowledge is accumulated by learning many time-frequency masks and deep embedding vectors. Experimental results show that our multi-mask learning and vector training model achieves higher performance than both the training approach with a single mask and the multi-mask learning approach without embedding feature vectors.
- Research Article
- 10.1142/s2196888825500137
- Jun 28, 2025
- Vietnam Journal of Computer Science
- Tzvetomir Ivanov Vassilev
One of the major drawbacks of JPEG compression is that it cannot produce a good compression ratio for small mean square error (MSE). This paper presents a new method for lossy image compression for storing and transmitting images over the internet, which overcomes this weakness. The method comprises the following steps. The image is first converted to the YUV color space and then partitioned into [Formula: see text] pixel blocks. The blocks are divided into four groups and principal component analysis (PCA) is applied to each group. The original pixel data is transformed into the new space, i.e. mode coefficients are calculated. The parameters of the PCA models and the coefficients are converted to unsigned bytes and saved in a binary file. Only 6 bits are used for storing the coefficients, and the range of the eigenvectors’ coordinates is adjusted according to the variance in each vector direction. Then Lempel–Ziv–Markov chain algorithm (LZMA) lossless compression is applied to obtain the final encoded file. The results show that the proposed method produces a much better compression ratio than JPEG for small MSEs. Two types of encoding schemes are evaluated and performance results are shown at the end of the paper.
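The last two steps of that pipeline (6-bit storage of coefficients, then LZMA) can be illustrated with a minimal sketch. It assumes plain uniform quantization over a known range and one code per byte; the paper's actual bit packing and variance-based range adjustment are not described in the abstract, and the names `quantize` and `dequantize` are hypothetical.

```python
import lzma


def quantize(values, lo, hi, bits=6):
    """Map floats in [lo, hi] to integer codes in [0, 2**bits - 1] (assumed uniform)."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    # Clamp so that out-of-range inputs still produce a valid code.
    return [min(levels, max(0, round((v - lo) / step))) for v in values]


def dequantize(codes, lo, hi, bits=6):
    """Invert quantize() up to half a quantization step of error."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + code * step for code in codes]


# Toy PCA coefficients, assumed to lie in [-1, 1] for illustration only.
coefficients = [-1.0, -0.4, 0.0, 0.25, 1.0]
codes = quantize(coefficients, lo=-1.0, hi=1.0)

# Each 6-bit code is stored in one unsigned byte, then compressed losslessly.
payload = lzma.compress(bytes(codes))
restored = dequantize(list(lzma.decompress(payload)), lo=-1.0, hi=1.0)
```

The only lossy step here is the quantization; LZMA recovers the byte stream exactly, so the reconstruction error is bounded by half a quantization step per coefficient.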
- Research Article
- 10.1142/s2196888825500150
- Jun 20, 2025
- Vietnam Journal of Computer Science
- Jaroslaw Watrobski + 3 more
Intelligent decision support systems (DSSs) are widely used in sustainability assessments across various domains. This paper presents a DSS that enables multi-criteria evaluation while accounting for the temporal dynamics of the assessed alternatives. The system is based on a newly developed method, Temporal EDAS (Temporal Evaluation based on Distance from Average Solution), which supports temporal multi-criteria decision analysis (MCDA). This method evaluates alternatives using multiple criteria and aggregates performance over time into clear assessment scores and rankings. The DSS is demonstrated through a case study assessing the implementation of Sustainable Development Goal 7 (SDG 7) in selected European countries. The analysis showed significant differences in country rankings depending on the selected measure of variability and the application of the Temporal EDAS method. The highest potential for improvement was observed in the case of Lithuania, which advanced to 13th place in the overall ranking due to a substantial increase in scores in the final year of analysis, despite previously low positions. In contrast, Cyprus and Romania showed the highest downward variability, which contributed to their low final rankings. Countries occupying the top positions, such as Norway, Iceland, and Sweden, demonstrated score stability and low variability. The study confirmed that using measures such as entropy, standard deviation, and statistical variance leads to more consistent results compared to the Gini coefficient and the coefficient of variation, which tend to favor countries with high potential for score improvement. The proposed framework enables fast, automated, and objective temporal MCDA, delivering unambiguous, interpretable results. The findings confirm the DSS’s effectiveness in evaluating sustainable development strategies, particularly those involving a sustainable energy mix.
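As background, the classical single-period EDAS score that the temporal method builds on can be sketched as follows. This is only the standard EDAS appraisal score under the assumption of strictly positive criterion averages; the temporal aggregation and variability measures are the authors' contribution and are not reproduced here. The function name `edas_scores` and the toy data are illustrative.

```python
def edas_scores(matrix, weights, benefit):
    """Classical EDAS appraisal scores (higher is better).

    matrix[i][j]: performance of alternative i on criterion j
    weights[j]:   criterion weight (assumed to sum to 1)
    benefit[j]:   True for benefit criteria, False for cost criteria
    Assumes every criterion average is strictly positive.
    """
    n_alt, n_crit = len(matrix), len(weights)
    averages = [sum(row[j] for row in matrix) / n_alt for j in range(n_crit)]

    sp, sn = [], []  # weighted positive / negative distances from average
    for row in matrix:
        pos = neg = 0.0
        for j in range(n_crit):
            diff = (row[j] - averages[j]) / averages[j]
            if not benefit[j]:
                diff = -diff  # for cost criteria, below average is good
            pos += weights[j] * max(0.0, diff)
            neg += weights[j] * max(0.0, -diff)
        sp.append(pos)
        sn.append(neg)

    # Guard against division by zero when all alternatives equal the average.
    max_sp = max(sp) or 1.0
    max_sn = max(sn) or 1.0
    return [0.5 * (p / max_sp + 1.0 - n / max_sn) for p, n in zip(sp, sn)]


scores = edas_scores(
    matrix=[[1, 1], [2, 2], [3, 3]],
    weights=[0.5, 0.5],
    benefit=[True, True],
)
```

A temporal variant would compute such scores per period and then combine them, but how the paper weights periods and variability is not stated in the abstract.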
- Research Article
- 10.1142/s2196888825500101
- Jun 18, 2025
- Vietnam Journal of Computer Science
- Thi-Thu-Hong Phan
This study introduces a novel sparse one-dimensional dense convolutional network-like (1D-DenseNet) architecture, optimized explicitly for raisin variety classification. By combining advanced deep learning techniques with a concise set of seven pre-extracted morphological features, our method significantly enhances classification accuracy. The 1D-DenseNet-like model employs a consistent number of filters across its Dense and Transition blocks, optimizing performance while minimizing complexity. Experimental results demonstrate the superiority of our approach, achieving a remarkable average accuracy of 92.55%. This significant improvement outperforms the best models in previous studies, including Competitive Layer Neural Network (CLNN), as well as top traditional machine learning models like Random Forest and Support Vector Machine, which achieved accuracies of 88.34%, 85.22%, and 86.44%, respectively. These findings highlight the effectiveness of our method in addressing the complexities of raisin variety classification and emphasize the potential of combining deep learning with morphological features for more accurate and efficient solutions in this domain.