- Research Article
- 10.1142/s2196888824500179
- Oct 17, 2024
- Vietnam Journal of Computer Science
- Thinh Vinh Le + 2 more
In the dynamic field of fog computing, there is a clear trend toward exploiting local, resource-rich nodes to bypass traditional cloud infrastructure limitations. This study introduces an innovative method for enhancing the reliability of Wi-Fi systems, crucial in fog computing, by integrating quality-focused user feedback. Our approach significantly enhances the assessment of Wi-Fi system trustworthiness by emphasizing user perspectives. While traditional metrics such as Availability, Performance, and Security Parameters are crucial for defining Quality of Service (QoS), our research also integrates user feedback as a key, albeit secondary, factor in assessing Wi-Fi node trustworthiness. Designed to improve the evaluation process, this method combines system-generated QoS metrics with user feedback to subtly increase trust assessment objectivity. This not only connects technical service quality with user experiences but also strengthens trust and reliability in fog computing. Utilizing sophisticated cloud theory techniques, our model employs both backward and forward cloud generators — the backward generator converts QoS data into qualitative insights, while the forward generator combines these insights with user feedback to thoroughly evaluate Wi-Fi node service quality. The integration of user feedback allows for a more dynamic and responsive system evaluation, addressing the limitations of previous models by providing a comprehensive assessment that aligns technical service quality with user experiences. This enhancement in trust evaluation is achieved by skillfully blending QoS metrics and user feedback to create a more objective trust value.
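The backward and forward cloud generators mentioned above can be sketched as follows. This is a minimal illustration that assumes the standard normal cloud model with the three numerical characteristics Ex (expectation), En (entropy), and He (hyper-entropy); the function names and the sample QoS scores are hypothetical, and the abstract does not state how the authors weight user feedback, so that step is not shown.

```python
import math
import random


def backward_cloud(samples):
    """Estimate (Ex, En, He) from numeric samples; needs at least two samples.

    Uses the common backward generator that does not require certainty degrees.
    """
    n = len(samples)
    ex = sum(samples) / n
    mean_abs_dev = sum(abs(x - ex) for x in samples) / n
    en = math.sqrt(math.pi / 2) * mean_abs_dev
    variance = sum((x - ex) ** 2 for x in samples) / (n - 1)
    # abs() guards against a slightly negative difference on small samples.
    he = math.sqrt(abs(variance - en ** 2))
    return ex, en, he


def forward_cloud(ex, en, he, n_drops, rng):
    """Generate cloud drops (x, mu) from the characteristics (Ex, En, He)."""
    drops = []
    for _ in range(n_drops):
        en_prime = abs(rng.gauss(en, he))   # En' ~ N(En, He^2)
        z = rng.gauss(0.0, 1.0)
        x = ex + en_prime * z               # equivalent to x ~ N(Ex, En'^2)
        mu = math.exp(-z * z / 2)           # exp(-(x - Ex)^2 / (2 En'^2))
        drops.append((x, mu))
    return drops


# Hypothetical normalized QoS scores for one Wi-Fi node.
qos_scores = [0.8, 0.9, 0.7, 0.85, 0.75]
ex, en, he = backward_cloud(qos_scores)
drops = forward_cloud(ex, en, he, n_drops=100, rng=random.Random(0))
```

Here the backward step compresses raw measurements into a qualitative concept, and the forward step expands that concept back into drops with a certainty degree, which is the general mechanism the abstract describes.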
- Research Article
- 10.1142/s2196888824500167
- Aug 31, 2024
- Vietnam Journal of Computer Science
- Dinh Tai Pham + 2 more
Healthcare Recommendation Systems (HRSs) primarily aim to offer advice, recommendations, or suggestions related to human healthcare. As with other information systems, the datasets used affect HRSs’ efficiency. Larger datasets can make the information more diverse and complete, so recommendation systems built on them can be more accurate and perform better. In addition, several recent studies have argued that improving system accuracy requires a shift from a model-centric to a data-centric approach. Accordingly, several datasets have been introduced over the past few years to assess the effectiveness of HRSs. However, building new datasets and selecting suitable existing ones to evaluate HRSs remain open issues. Another problem is that no published article currently surveys these datasets or states the requirements that a dataset used to evaluate HRSs should meet. These gaps motivated this research. To address them, we conduct a comprehensive, systematic survey of the datasets used to evaluate HRSs published in recent years. Some of the primary contributions of this study are as follows: (i) surveying 34 datasets commonly used in methods for building and developing HRSs; (ii) extracting 10 common characteristics and comparing the datasets on those characteristics; (iii) providing 13 requirements for a dataset used in HRSs; (iv) comparing the surveyed datasets and state-of-the-art HRSs against the 13 requirements; and (v) discussing five main challenges and setting out some directions for building new datasets in the future. The results of this analysis should help new researchers choose or construct suitable datasets for evaluating the HRSs they build, now and in the future.
- Research Article
- 10.1142/s2196888824400025
- Aug 28, 2024
- Vietnam Journal of Computer Science
- Phuoc-Loc Nguyen + 6 more
Mangoes are among the tropical fruits that provide high export value. Therefore, grading them based on physical and external features is required to satisfy the exporting criteria. An image acquisition system has recently been developed to capture the mango’s 360° views. This study developed a computer vision system (CVS) to extend this system towards automatic grade classification of mango fruits based on the mango’s weight and imperfect skin area. The front-view images of four major faces could be identified with a higher accuracy of 97.98% by applying an average filter on the skin area. The mango region in these major-face images was then detected so that the skin defects could be measured, and the mango’s 3D dimensions and area in pixel numbers could be calculated to estimate the mango’s weight with an average error of about 2.5% using a ridge regression model. Experimental results with 120 identified major-face images with artificial blemishes showed an average percentage error of 7.28%. At the controlled conveyor speed of 41 mm/s for favorable image capture, the system could sequentially analyze about 1452 mangoes per hour for blemish area calculation. With such high throughput, the proposed CVS could be tailored to build an automatic mango grading system based on the mango’s weight and skin imperfections.
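To illustrate the weight-estimation step, the sketch below fits a ridge regression with a single feature and reports a mean absolute percentage error. It is a simplified stand-in: the study uses several features (3D dimensions and pixel area), whereas this example assumes one hypothetical feature and made-up numbers, and it leaves the intercept unpenalized.

```python
def fit_ridge_1d(xs, ys, lam):
    """Fit y ~ slope * x + intercept, penalizing lam * slope**2.

    Closed form for one feature: slope = Sxy / (Sxx + lam).
    Requires Sxx + lam > 0.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return slope, intercept


def mean_abs_percentage_error(actual, predicted):
    """Average of |predicted - actual| / |actual|, in percent."""
    return 100.0 * sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical data: pixel area (in thousands) versus weight in grams.
areas = [50.0, 60.0, 70.0, 80.0]
weights = [310.0, 365.0, 420.0, 480.0]
slope, intercept = fit_ridge_1d(areas, weights, lam=1.0)
estimates = [slope * a + intercept for a in areas]
error_pct = mean_abs_percentage_error(weights, estimates)
```

A larger `lam` shrinks the slope toward zero, which trades a little bias for stability when features are noisy or correlated.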
- Research Article
- 10.1142/s2196888824500155
- Aug 26, 2024
- Vietnam Journal of Computer Science
- Kien Cao-Van + 4 more
Prostate cancer is increasingly common among men. However, the process of diagnosing malignant disease is relatively complicated and time-consuming. Identifying benign or malignant tumors early can assist medical professionals in choosing appropriate treatment methods. Consequently, we introduce a soft-voting ensemble model comprising several single machine learning models such as Logistic Regression, Random Forest, XGBoost, LGBM, and Support Vector Machine for the classification task with the prostate cancer dataset. The dataset was divided into two parts for training and testing with a ratio of 67:33. The confusion matrix was used to evaluate the performance of both the individual and ensemble models. Experimental results show that ensemble models achieve performance ranging from 87.88% to 96.97%, which is 3% to 9% better than individual models, surpassing recent research. Integrating the strengths of individual models helps minimize errors, resulting in optimal classification with high accuracy and overall performance in the field of machine learning.
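The core of soft voting can be shown in a few lines. The sketch below assumes each base model has already produced a class-probability vector for one sample; the function name, the optional weights, and the probabilities are hypothetical and are not taken from the paper.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-model class probabilities and return (label, averaged_probs)."""
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=averaged.__getitem__)
    return label, averaged


# Hypothetical [P(benign), P(malignant)] outputs from three base models.
model_probs = [
    [0.55, 0.45],
    [0.55, 0.45],
    [0.10, 0.90],
]
label, averaged = soft_vote(model_probs)
```

In this example two of the three models lean slightly toward class 0, so a majority (hard) vote would choose class 0, while soft voting chooses class 1 because the third model is far more confident. That sensitivity to confidence is the usual motivation for soft voting.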
- Research Article
- 10.1142/s2196888824500143
- Jul 26, 2024
- Vietnam Journal of Computer Science
- Praveen Kumar Gundaram
The security of digital communication and information systems depends largely on block ciphers. ARX-based ciphers are widely used due to their simplicity and efficiency. This paper provides an exhaustive cryptanalysis of a subset of ARX-based block ciphers, with particular emphasis on SIMON, SPECK, and IDEA. We examine these ciphers for weaknesses in algebraic attack resistance and in cryptographic properties such as key sensitivity. In addition, we assess the resource utilization and speed of these ciphers, both of which are critical for practical implementation. The Satisfiability Modulo Theories (SMT) framework is used to tackle constraint satisfaction problems based on first-order logic. SMT solvers are applied to the following cryptographic tasks: differential cryptanalysis, collision attack, pre-image attack, modular root-finding, and cryptographic primitive verification. We show that SMT solvers can solve block cipher cryptanalysis constraints. In a cryptanalytic attack, we encode the block cipher’s Boolean equations in Z3py. The proposed cryptanalysis method evaluates ARX cipher performance. This method recovers the partial secret key using plaintext and ciphertext pairs, partial key bits, and a predetermined number of rounds. To determine whether SIMON, SPECK, or IDEA is appropriate for distinct security requirements, we conducted a comparative analysis of the results and presented them in tabulated form. This research builds a better understanding of ARX-based block ciphers and allows us to develop more robust and efficient cryptographic algorithms to protect sensitive data in modern communication systems.
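For readers unfamiliar with the ARX structure, the sketch below shows one add-rotate-xor round in the style of SPECK with 16-bit words, its inverse, and a toy recovery of a single round key from one plaintext/ciphertext pair. It does not use an SMT solver and is not the paper's attack: with only one round the key follows directly from the round equation, which is the kind of constraint a solver such as Z3 would be asked to satisfy over many rounds. The rotation amounts 7 and 2 are the ones used for the 16-bit word size; the helper names are hypothetical.

```python
WORD_BITS = 16
MASK = (1 << WORD_BITS) - 1
ALPHA, BETA = 7, 2  # rotation amounts for 16-bit words


def ror(value, r):
    """Rotate a WORD_BITS-wide value right by r bits."""
    return ((value >> r) | (value << (WORD_BITS - r))) & MASK


def rol(value, r):
    """Rotate a WORD_BITS-wide value left by r bits."""
    return ((value << r) | (value >> (WORD_BITS - r))) & MASK


def arx_round(x, y, k):
    """One round: modular addition, rotation, and XOR with the round key."""
    x = ((ror(x, ALPHA) + y) & MASK) ^ k
    y = rol(y, BETA) ^ x
    return x, y


def arx_round_inverse(x, y, k):
    """Undo arx_round for the same round key."""
    y = ror(y ^ x, BETA)
    x = rol(((x ^ k) - y) & MASK, ALPHA)
    return x, y


def recover_single_round_key(plaintext, ciphertext):
    """Solve the one-round equation for k given one known pair (toy case only)."""
    x, y = plaintext
    cx, _ = ciphertext
    return ((ror(x, ALPHA) + y) & MASK) ^ cx


secret_key = 0x1A2B          # hypothetical round key
pt = (0x6574, 0x694C)        # arbitrary 32-bit block split into two words
ct = arx_round(pt[0], pt[1], secret_key)
recovered = recover_single_round_key(pt, ct)
```

With more rounds the unknown intermediate states prevent this direct solution, which is why the equations are handed to a solver instead.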
- Research Article
- 10.1142/s2196888824400013
- Jul 25, 2024
- Vietnam Journal of Computer Science
- Luong Vuong Nguyen
Clustering methodologies are pivotal in enhancing the recommendation systems powered by collaborative filtering (CF). These systems commonly rely on CF approaches to generate recommendations based on similarities. While conventional user clustering methods are prevalent, there is a growing need to explore bio-inspired clustering techniques to improve the recommendation generation process. This paper introduces a novel ensemble method termed Bio-Inspired Clustering Collaborative Filtering (BICCF), designed explicitly for recommendation systems. By harnessing swarm intelligence, this approach aims to refine the precision of recommendations within user-based CF frameworks. The study conducts experiments using real-world datasets sourced from MovieLens to assess the efficacy of the proposed method. The findings reveal marked enhancements in accuracy and efficiency, as evaluated through metrics such as Recall, Precision, MAE, and RMSE, surpassing the performance of established baseline methods.
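Two of the metrics named above, MAE and RMSE, compare predicted ratings with held-out true ratings. The following sketch uses their standard definitions; the ratings are hypothetical and not drawn from MovieLens.

```python
import math


def mean_absolute_error(actual, predicted):
    """MAE: average absolute difference between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """RMSE: square root of the average squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


true_ratings = [4.0, 3.0, 5.0, 2.0]        # hypothetical held-out ratings
predicted_ratings = [3.5, 3.0, 4.0, 3.0]   # hypothetical CF predictions
mae = mean_absolute_error(true_ratings, predicted_ratings)
rmse = root_mean_squared_error(true_ratings, predicted_ratings)
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of prediction quality.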
- Research Article
- 10.1142/s2196888824500131
- Jun 22, 2024
- Vietnam Journal of Computer Science
- Tien Do + 5 more
Key information extraction and recognition from rich text images are crucial for various applications. There are two main tasks involved in this process: Line Item Recognition (LIR) and Key Information Localization and Extraction (KILE). LIR aims at identifying and interpreting data line items in a document. The essential information in each line item is then classified or extracted, a task known as KILE. A widely used approach for this problem is sequence-based, which relies on the generalization of a language model and requires a significant amount of training time. We present an effective and reliable solution to the problem by using RoBERTa, a transformer model trained on a large corpus, along with the LION optimizer to improve the training process. A comprehensive evaluation was conducted on two different benchmarks covering two languages, English and Vietnamese. Experimental results on DocILE indicate that the proposed framework significantly improves the KILE task with a 7.24% increase in accuracy compared to the baseline and also enhances the correct recognition rate at the LIR stage. On MCOCR, the method achieved a Character Error Rate (CER) of 28.6%, which is competitive with the state-of-the-art on this dataset.
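The Character Error Rate reported above is commonly defined as the character-level edit distance between the recognized text and the reference, divided by the reference length. The sketch below assumes that definition; the exact normalization used on MCOCR is not stated in the abstract.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete ref_char
                current[j - 1] + 1,      # insert hyp_char
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


cer = character_error_rate("kitten", "sitting")
```

Because insertions count as errors, CER can exceed 1.0 when the hypothesis is much longer than the reference.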
- Research Article
- 10.1142/s219688882450012x
- May 14, 2024
- Vietnam Journal of Computer Science
- Marwa Abderrahim + 2 more
Due to the diversity of image sources, content-based multi-source image fusion and retrieval have shown promising capabilities in computer vision tasks, especially when applied in Computer-Aided Diagnosis (CAD) to automate and improve the accuracy of medical image analysis. The combination of computer vision and CAD systems has the potential to revolutionize healthcare by augmenting the expertise of clinicians, improving overall diagnostic accuracy, and helping experts in the clinical decision-making process by classifying and retrieving annotated clinical images similar to a given query. In the context of multi-view mammography interpretation, multi-view feature fusion has recently been studied to improve retrieval performance while effectively guaranteeing the complementarity of the MLO and CC views. However, conventional multi-view feature fusion produces long descriptors and fails to take into account the relationship between descriptors. To deal with this issue, we propose two hierarchical multi-view feature fusion methods for multi-view mammogram retrieval, based on Canonical Correlation Analysis (CCA), which is the most commonly used multivariate parametric test. Specifically, we have adapted CCA to determine the relationship between two descriptors by processing latent correlation factors. Moreover, after extracting descriptors for each view, a comparative study of texture and shape fusion descriptors is carried out in order to identify the more discriminative features for multi-view mammogram retrieval. Then, a query-dependent distance metric preserving both visual resemblance and semantic similarity is applied to dynamically determine the most appropriate distance measure for each query image.
Extensive experiments on the challenging Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM) have demonstrated the effectiveness of the proposed hierarchical multi-view feature fusion for mammogram retrieval, which outperforms both conventional fused information and single-view information. To improve the transparency of our paper, the source code of the proposed method and the related dataset (including readme files) are publicly accessible through the following GitHub link: https://github.com/ABDERRAHIMMAR/Multi-View-Feature-Fusion-for-Mammogram-Retrieval . This open-access resource enables researchers and practitioners to examine our methodology in more depth, fostering collaboration and advancements in the field of computer-aided diagnosis and medical image analysis.
- Research Article
- 10.1142/s2196888824500064
- May 2, 2024
- Vietnam Journal of Computer Science
- Simon Tobias Lund + 1 more
As the complexity of software systems is ever increasing, so is the need for practical tools for formal verification. Among these are automatic theorem provers, capable of solving various reasoning problems automatically, and proof assistants, capable of deriving more complex results when guided by a mathematician/programmer. In this paper we consider using the latter to build the former. In the proof assistant Isabelle/HOL we combine functional programming and logical program verification to build a theorem prover for propositional logic. We also consider how such a prover can be used to solve a reasoning task without much mental labor. The development is extended with a formalized proof system for writing machine-checked sequent calculus proofs. We consider how this can be used to teach computer science students about logic, automated reasoning and proof assistants.
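To give a feel for what a sequent-calculus prover for propositional logic does, here is a small sketch in plain Python. It is an unverified analogue written for illustration, not the Isabelle/HOL development described above, and the tuple encoding of formulas and the helper names are assumptions. It applies invertible left and right rules until only atoms remain, then checks whether some atom appears on both sides.

```python
def var(name):
    return ("var", name)


def neg(a):
    return ("not", a)


def conj(a, b):
    return ("and", a, b)


def disj(a, b):
    return ("or", a, b)


def imp(a, b):
    return ("imp", a, b)


def provable(left, right):
    """Decide whether the sequent left |- right is derivable."""
    for i, f in enumerate(left):
        if f[0] == "var":
            continue
        rest = left[:i] + left[i + 1:]
        if f[0] == "not":
            return provable(rest, right + [f[1]])
        if f[0] == "and":
            return provable(rest + [f[1], f[2]], right)
        if f[0] == "or":
            return provable(rest + [f[1]], right) and provable(rest + [f[2]], right)
        if f[0] == "imp":
            return provable(rest, right + [f[1]]) and provable(rest + [f[2]], right)
        raise ValueError(f"unknown connective: {f[0]}")
    for i, f in enumerate(right):
        if f[0] == "var":
            continue
        rest = right[:i] + right[i + 1:]
        if f[0] == "not":
            return provable(left + [f[1]], rest)
        if f[0] == "and":
            return provable(left, rest + [f[1]]) and provable(left, rest + [f[2]])
        if f[0] == "or":
            return provable(left, rest + [f[1], f[2]])
        if f[0] == "imp":
            return provable(left + [f[1]], rest + [f[2]])
        raise ValueError(f"unknown connective: {f[0]}")
    # Only atoms remain: the sequent is an axiom iff both sides share an atom.
    return any(f in right for f in left)


def is_tautology(formula):
    return provable([], [formula])


p, q = var("p"), var("q")
peirce = imp(imp(imp(p, q), p), p)
```

Each rule removes one connective, so the search always terminates. What the Python version lacks is exactly what a proof assistant adds: a machine-checked proof that the procedure is sound and complete.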
- Research Article
- 10.1142/s219688882450009x
- Apr 27, 2024
- Vietnam Journal of Computer Science
- Piotr Karwaczyński + 6 more
Implementing parallel applications is always a challenge, as it involves many distinct design decisions. This paper presents issues of parallel processing with .NET applications and popular Database Management Systems (DBMSes). Four design dilemmas are addressed: how efficient is the auto-parallelism implemented in the .NET TPL library, how do popular DBMSes differ in serving parallel requests, what is the optimal size of data chunks in the data parallelism scheme, and how does the TPL auto-parallelism behave in public clouds. They are analyzed in the context of a typical, practical business case originating from IT solutions dedicated to energy market participants. The paper presents the results of experiments conducted in controlled on-premises and cloud environments. The experiments made it possible to compare the performance of the TPL auto-parallelism with a wide range of manually set numbers of worker threads. They also helped to evaluate four DBMSes (Oracle, MySQL, PostgreSQL, and MSSQL) in the scenario of serving parallel queries. Finally, they showed the impact of data chunk sizes on the overall performance.
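The two tuning knobs discussed above, chunk size and worker count, can be illustrated with a small data-parallel sketch. It is written in Python with `concurrent.futures` as a rough analogue and does not reproduce the .NET TPL API or the paper's workload; the function names and the sum-of-squares task are hypothetical. Passing `max_workers=None` lets the library choose its own default, loosely comparable to automatic parallelism, while an integer sets the worker count manually.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Stand-in for real per-chunk work, such as one batched database query."""
    return sum(value * value for value in chunk)


def parallel_sum_of_squares(items, chunk_size, max_workers=None):
    """Process chunks on a thread pool and combine the partial results."""
    chunks = split_into_chunks(items, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(process_chunk, chunks))


data = list(range(1000))
auto_result = parallel_sum_of_squares(data, chunk_size=100)
manual_result = parallel_sum_of_squares(data, chunk_size=25, max_workers=4)
```

Small chunks improve load balancing but add scheduling overhead, while large chunks do the opposite, which is why the optimal size has to be measured. In CPython, threads mainly help with I/O-bound work such as database calls, so this sketch shows the structure rather than a CPU speedup.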