Optimizing AIGC Technology for IoT Devices with Deep Learning

Abstract

This article explores how a deep learning model can be applied to improve the graphic-recognition capability of AI-generated content (AIGC) technology within the IoT ecosystem. Objectives: This research pursues two key objectives: first, compressing the model to reduce its size and computational cost for on-device deployment on resource-constrained IoT devices, and second, achieving better adaptability through data augmentation and regularization techniques. Methods/Analysis: A purpose-built CNN was designed and trained to address IoT-specific constraints. Model compression techniques such as weight pruning and quantization were used to reduce resource requirements. To improve generalization, data augmentation techniques such as rotation, shear, and zoom were applied, along with regularization techniques such as dropout to avoid overfitting. The work was carried out on the standard MNIST and CIFAR-10 datasets using TensorFlow as the deep learning framework. Results: The pattern-recognition accuracies achieved on the MNIST and CIFAR-10 datasets were 99.5% and 89.2%, respectively. Moreover, recognition speed improved by around 30%, as parallel processing reduced the computational cost of the DL algorithm and hence the processing time. The compressed model avoids heavy computational complexity, making it better suited to resource-limited IoT devices. Novelty/Improvement: A new methodology is presented that integrates CNN optimization and model compression with sophisticated regularization techniques to develop a solution suited to the peculiarities of the IoT landscape.
Ultimately, by addressing universal problems such as limited resources and real-time processing, this research strengthens the technological and theoretical support for practical IoT applications and accelerates the practical implementation of AIGC performance optimization across industries such as smart homes, smart transportation, and smart security.
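The compression pipeline the abstract names (weight pruning plus quantization) can be sketched framework-independently. The sparsity level, the symmetric int8 scheme, and the function names below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.5):
    """Weight pruning: zero out the smallest-magnitude fraction of weights."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_int8(weights):
    """Symmetric post-training quantization of float32 weights to int8."""
    scale = max(np.abs(weights).max() / 127.0, 1e-12)
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale  # recover approximate weights with q * scale
```

Pruning yields sparse tensors that compress well, while int8 storage alone cuts weight memory by roughly 4x versus float32, which is the kind of saving that matters on resource-limited IoT hardware.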

Similar Papers
  • Research Article
  • Cited by: 7
  • 10.14569/ijacsa.2023.0140127
Data Augmentation for Deep Learning Algorithms that Perform Driver Drowsiness Detection
  • Jan 1, 2023
  • International Journal of Advanced Computer Science and Applications
  • Ghulam Masudh Mohamed + 2 more

Driver drowsiness is one of the main causes of driver-related motor vehicle collisions, as it impairs a person's concentration whilst driving. With the enhancements of computer vision and deep learning (DL), driver drowsiness detection systems have previously been developed in an attempt to improve road safety. These systems experienced performance degradation under real-world testing due to factors such as driver movement and poor lighting. This study proposed to improve the training of DL models for driver drowsiness detection by applying data augmentation (DA) techniques that model these real-world scenarios. This paper studies six DL models for driver drowsiness detection: four configurations of a Convolutional Neural Network (CNN), namely two custom configurations and the architectures designed by the Visual Geometry Group (VGG16 and VGG19); a Generative Adversarial Network (GAN); and a Multi-Layer Perceptron (MLP). These DL models were trained using two datasets of eye images, where the state of the eye (open or closed) is used to determine driver drowsiness. The performance of the DL models was measured with respect to accuracy, F1-score, precision, negative-class precision, recall, and specificity. When comparing the performance of DL models trained on datasets with and without DA in aggregate, it was found that all metrics improved. After removing outliers from the results, the average improvement in both accuracy and F1-score due to DA was +4.3%. Furthermore, the extent to which the DA techniques improve DL model performance is shown to be correlated with the inherent model performance. For DL models with accuracy and F1-score ≤ 90%, results show that the DA techniques studied should improve performance by at least +5%.
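The real-world factors the study models (driver movement, poor lighting) map naturally onto simple array-level augmentations. The sketch below, with illustrative ranges and a NumPy array standing in for a grayscale eye image, is an assumption about what such a DA step could look like, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def augment_eye_image(image):
    """Mimic real-world capture conditions on a 2-D grayscale eye image."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:             # mirror: head turned the other way
        out = out[:, ::-1]
    shift = int(rng.integers(-2, 3))   # small horizontal drift from movement
    out = np.roll(out, shift, axis=1)
    out = out * rng.uniform(0.5, 1.0)  # poor lighting: dim the whole frame
    return out
```

Applying such transforms at training time exposes the model to conditions absent from a clean lab-collected dataset, which is the mechanism behind the reported metric gains.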

  • Research Article
  • 10.1158/1538-7445.am2021-184
Abstract 184: The utility of deep metric learning for breast cancer identification on mammographic images
  • Jul 1, 2021
  • Cancer Research
  • Justin Du + 8 more

Purpose: Although deep learning (DL) models have shown increasing ability to accurately classify diagnostic images in oncology, significantly large amounts of well-curated data are often needed to match human-level performance. Given the relative paucity of imaging datasets for less prevalent cancer types, there is an increasing need for methods that can improve the performance of deep learning models trained using limited diagnostic images. Deep metric learning (DML) is a potential method for improving the accuracy of deep learning models trained on limited datasets. Leveraging a triplet-loss function, DML dramatically increases the effective training data compared to a traditional DL model. In this study, we investigated the utility of DML to improve the accuracy of DL models trained to classify cancerous lesions found on screening mammograms. Methods: Using a dataset of 2620 lesions found on routine screening mammograms, we trained both a traditional DL model and a DML model to classify suspicious lesions as cancerous or benign. The VGG16 architecture was used as the basis for the DL and DML models. Model performance was compared by calculating model accuracy, sensitivity, and specificity on a blinded test set of 378 lesions. In addition to individual model performance, we also measured agreement accuracy when the DL and DML models were combined. Sub-analyses were conducted to identify phenotypes best suited for each model type. Both models underwent hyperparameter optimization to identify the ideal batch size, learning rate, and regularization to prevent overfitting. Results: We found that the combination of the traditional DL model with the DML model resulted in the highest overall accuracy (78.7%), representing a 7.1% improvement compared to the traditional DL model (p<.001). Alone, the traditional DL model had an improved accuracy compared to the DML model (71.4% vs 66.4%).
The traditional DL model had a higher sensitivity (94.8% vs 73.6%) but lower specificity (34.7% vs 55.1%) compared to the DML model. Sub-analyses suggested the traditional DL model was more accurate on higher-density breasts, whereas the DML model was more accurate on lower-density breasts. Additionally, the traditional DL model had the highest accuracy on oval-shaped lesions, compared to the DML model, which was most accurate on irregularly shaped breast lesions. Conclusion: Our study suggests that combining DML models with traditional DL models can improve diagnostic image classification performance in cancer. Our results suggest DML models may provide increased specificity and help with classification of unique populations often misclassified by traditional DL models. Further studies investigating the utility of DML on other cancer imaging tasks are necessary to successfully build more robust DL models in cancer imaging. Citation Format: Justin Du, Sachin Umrao, Enoch Chang, Marina Joel, Aidan Gilson, Guneet Janda, Rachel Choi, Yongfeng Hui, Sanjay Aneja. The utility of deep metric learning for breast cancer identification on mammographic images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 184.
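The triplet-loss objective at the core of DML pulls an anchor embedding toward a same-class "positive" and pushes it away from a different-class "negative". A minimal NumPy version (squared Euclidean distance and a unit margin are standard choices assumed here, not the paper's exact settings) looks like:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_pos = float(np.sum((anchor - positive) ** 2))
    d_neg = float(np.sum((anchor - negative) ** 2))
    return max(0.0, d_pos - d_neg + margin)
```

Because every valid (anchor, positive, negative) combination is a training example, the number of triplets grows combinatorially with dataset size, which is how DML stretches a limited imaging dataset.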

  • Research Article
  • Cited by: 114
  • 10.5455/aim.2019.27.327-332
Deep Transfer Learning Models for Medical Diabetic Retinopathy Detection.
  • Jan 1, 2019
  • Acta Informatica Medica
  • Nour Khalifa + 3 more

Introduction: Diabetic retinopathy (DR) is the most common diabetic eye disease worldwide and a leading cause of blindness. The number of diabetic patients will increase to 552 million by 2034, as per the International Diabetes Federation (IDF). Aim: With advances in computer science techniques, such as artificial intelligence (AI) and deep learning (DL), opportunities for the detection of DR at the early stages have increased. This means that the chances of recovery will increase and the possibility of vision loss in patients will be reduced in the future. Methods: In this paper, deep transfer learning models for medical DR detection were investigated. The DL models were trained and tested on the Asia Pacific Tele-Ophthalmology Society (APTOS) 2019 dataset. According to literature surveys, this research is considered one of the first studies to use the APTOS 2019 dataset, as it was freshly published in the second quarter of 2019. The selected deep transfer models in this research were AlexNet, ResNet18, SqueezeNet, GoogleNet, VGG16, and VGG19. These models were selected because they consist of a small number of layers compared to larger models, such as DenseNet and InceptionResNet. Data augmentation techniques were used to render the models more robust and to overcome the overfitting problem. Results: The testing accuracy and performance metrics, such as precision, recall, and F1 score, were calculated to prove the robustness of the selected models. The AlexNet model achieved the highest testing accuracy at 97.9%. In addition, the achieved performance metrics strengthened our results. Moreover, AlexNet has a minimal number of layers, which decreases the training time and the computational complexity.

  • Research Article
  • 10.47857/irjms.2024.v05i03.0829
IoT Generated Multi-Modality Data Analysis Using a Deep Learning Framework for Managing Sustainability in Smart Environments
  • Jan 1, 2024
  • International Research Journal of Multidisciplinary Scope
  • Doraswamy B + 2 more

Incorporating Internet of Things (IoT) systems into smart environments can reinforce stability and sustainability. IoT-deployed smart-environment monitoring devices support energy monitoring, water-consumption tracking, and other resource-utilization reporting. Several earlier research works have focused on monitoring energy, water consumption, or other sustainability parameters but cannot retain and manage sustainability. This paper aims to maintain and manage the sustainability of smart environments by creating a Deep Learning Framework (DLF) to analyze the multi-modality data generated by IoT devices. The DLF integrates an IoT dashboard, an IoT network, deep and reinforcement learning algorithms, an optimization algorithm, and a smart regulator. The proposed DLF monitors multiple smart environments using IoT devices that generate multi-modality data. The IoT devices are clustered into networks that monitor different smart environments, generate data, and send it to cloud data centers. Before that, the Particle Swarm Optimization algorithm is used to preprocess and optimize the data. Based on the data modality, deep learning algorithms are activated automatically to process and predict the conditions of the smart environment with respect to abnormalities and utility consumption. Smart metering, building, fleet, and air-quality monitoring and prediction are some of the additional solutions provided by the DLF. It incorporates cloud and IoT dashboards to improve overall monitoring performance and reduce costs. The performance of the DLF is verified by simulating and comparing its output with other state-of-the-art methods, and the proposed DLF is found to outperform them.
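Particle Swarm Optimization, which the framework uses as its optimization step, maintains a swarm of candidate solutions whose velocities blend inertia with attraction toward each particle's personal best and the swarm's global best. A generic minimizer sketch follows; the inertia and acceleration coefficients, the search bounds, and the sphere-function demo are standard textbook choices, not the paper's settings:

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=20, iters=150, seed=0):
    """Minimal PSO: velocities blend inertia, personal best, and global best."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n_particles, dim))   # particle positions
    v = np.zeros_like(x)                             # particle velocities
    pbest = x.copy()
    pbest_val = np.array([objective(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        vals = np.array([objective(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, float(pbest_val.min())
```

For example, minimizing the sphere function `sum(p**2)` drives the swarm toward the origin; in the DLF the objective would instead score candidate preprocessing configurations.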

  • Research Article
  • Cited by: 4
  • 10.15575/join.v8i2.1073
The Impact of Data Augmentation Techniques on the Recognition of Script Images in Deep Learning Models
  • Dec 28, 2023
  • Jurnal Online Informatika
  • Wulan Sapitri + 3 more

Deep learning technology is widely used for recognizing character images, including various regional characters and diverse ancient scripts. Deep learning models require large data sets to recognize images accurately. However, creating a dataset has limitations in terms of quantity, including the Komering script dataset used in this study. Data augmentation techniques can be applied to expand the dataset by modifying existing images to increase data diversity. This study aims to investigate the impact of augmentation techniques on the performance of deep learning models in the case of Komering script recognition. The dataset consists of 500 images for five classes of Komering script characters. Three augmentation techniques, namely random rotation, height shift, and width shift, were applied to the five characters, which were then used to test the model trained to recognize characters in the Komering dataset. This research contributes to providing insights into the effect of augmentation techniques on robust confidence prediction of deep learning models for recognizing newly augmented data. The results demonstrate that the deep learning model can recognize modified data using augmentation techniques with an average accuracy of 80.05%.

  • Research Article
  • Cited by: 115
  • 10.1109/access.2020.2988854
Fog-Based Attack Detection Framework for Internet of Things Using Deep Learning
  • Jan 1, 2020
  • IEEE Access
  • Ahmed Samy + 2 more

The number of cyber-attacks and data breaches has increased immensely across different enterprises, companies, and industries as a result of the exploitation of weaknesses in securing Internet of Things (IoT) devices. The increasing number of devices connected to the IoT, with their different protocols, has led to a growing volume of zero-day attacks. Deep learning (DL) has demonstrated its superiority in big-data fields and cyber-security. Recently, DL has been used in cyber-attack detection because of its capability to extract and learn deep features of known attacks and to detect unknown attacks without manual feature engineering. However, DL cannot be implemented on IoT devices with limited resources because it requires extensive computation, power, and storage. This paper presents a comprehensive, distributed, and robust attack detection framework with a high detection rate for detecting several IoT cyber-attacks using DL. The proposed framework implements the attack detector on fog nodes because of their distributed nature, high computational capacity, and proximity to edge devices. Six DL models are compared to identify the one with the best performance. All DL models are evaluated using five different datasets, each of which involves various attacks. Experiments show that the long short-term memory model outperforms the five other DL models. The proposed framework is effective in terms of response time and detection accuracy and can detect several types of cyber-attacks with a 99.97% detection rate and 99.96% detection accuracy in binary classification, and 99.65% detection accuracy in multi-class classification.
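The headline numbers report two distinct metrics: detection rate is recall on the attack class, while detection accuracy counts all correct decisions. A small pure-Python helper makes the distinction explicit (the label convention 1 = attack is an assumption for illustration):

```python
def detection_metrics(y_true, y_pred):
    """Detection rate = recall on attacks; accuracy = overall correctness."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(y_true)
    return detection_rate, accuracy
```

On heavily imbalanced IoT traffic the two can diverge sharply, which is why intrusion-detection papers report both.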

  • Research Article
  • Cited by: 4
  • 10.4108/eetsis.4056
Whale Optimization based Deep Residual Learning Network for Early Rice Disease Prediction in IoT
  • Oct 3, 2023
  • ICST Transactions on Scalable Information Systems
  • M Sri Lakshmi + 4 more

Disease detection on a farm requires laborious and time-consuming observation of individual plants, which becomes more difficult when the farm is large and many different plants are farmed. To address these problems, cutting-edge technologies, AI, and Deep Learning (DL) are employed to provide more accurate disease predictions. When it comes to smart farming and precision agriculture, IoT opens up exciting new possibilities. To a certain extent, the goal of smart farming is to increase productivity and efficiency in agricultural processes. Smart farming is an approach to agriculture in which Internet of Things (IoT) devices are interconnected and new technologies are used to optimize existing methods, aiding more informed decision making. In many parts of the world, rice is the staple food, so early detection of rice plant diseases using automated techniques and IoT devices is essential. Creating and deploying DL models in agriculture may help grow rice yields and profits. Here we introduce DRL, a deep residual learning framework that has been trained on photos of rice leaves to recognize one of four classes. The proposed model is called WO-DRL, and the hyper-parameter tuning of the DRL is performed with the Whale Optimization algorithm. The outcomes demonstrate the efficacy of our approach in directing the WO-DRL model to learn important characteristics. The findings of this study will pave the way for the agriculture sector to more quickly diagnose and treat plant diseases using AI.

  • Research Article
  • 10.25147/ijcsr.2017.001.1.224
Empirical Analysis of the State-of-the-Art Models for Handling Polarity Shifts Due to Implicit Negation in Mobile Phone Reviews
  • Jan 1, 2025
  • International Journal of Computing Sciences Research
  • Millicent Murithi + 2 more

Purpose – This paper presents a comprehensive empirical analysis focusing on sentiment flux within state-of-the-art models designed for handling polarity shifts due to implicit negation in Amazon mobile phone reviews. Method – The research evaluates diverse models across traditional machine learning (ML), deep learning (DL), and hybrid categories combining both approaches. Various feature extraction, feature selection, and data augmentation techniques are tested on the Amazon mobile phone reviews dataset. BERT and LSTM are used for deep learning, while SVM and Naive Bayes are used for traditional ML. ANOVA is used to identify the presence or absence of significant differences and interactions among these entities. Results – DL shows superior performance compared to traditional ML models. ANOVA analysis shows significant performance differences between conventional ML and DL models. Traditional ML models interact significantly with feature extraction and selection techniques, while DL models do not. Traditional ML models do not interact significantly with data augmentation methods, while DL models do. FastText extraction outperforms word2vec; back translation outperforms synonym replacement, while recursive feature elimination (RFE) surpasses TF-IDF (Term Frequency-Inverse Document Frequency). BERT and LSTM exhibit some of the strongest performance. Conclusion – The study concludes that DL models are more effective. Data augmentation techniques significantly impact the performance of DL models, with back translation showing superior performance over synonym replacement. This provides a leverage point in developing an improved model in the future. Recommendations – Future research should focus on developing a hybrid model for enhanced polarity-shift management of mobile phone reviews using contextual back translation augmented by Seq2seq perturbations.
This aims to leverage contextual back translation and Seq2seq perturbations to generate diverse interpretations that consequently improve the model's ability to handle nuanced expressions of sentiment due to implicit negation with enhanced accuracy, generalizability, robustness to polarity shifts, and contextual understanding. Research Implications – The findings provide valuable insights into the development of state-of-the-art models, offering a promising direction for further research in sentiment analysis. Keywords – empirical analysis, hybrid, perturbations, implicit negation, sentiment flux

  • Research Article
  • Cited by: 21
  • 10.1109/access.2021.3071393
Deep Sentiment Analysis: A Case Study on Stemmed Turkish Twitter Data
  • Jan 1, 2021
  • IEEE Access
  • Harisu Abdullahi Shehu + 7 more

Sentiment analysis using stemmed Twitter data from various languages is an emerging research topic. In this paper, we address three data augmentation techniques, namely Shift, Shuffle, and Hybrid, to increase the size of the training data; we then use three key types of deep learning (DL) models, namely the recurrent neural network (RNN), convolutional neural network (CNN), and hierarchical attention network (HAN), to classify the stemmed Turkish Twitter data for sentiment analysis. The performance of these DL models has been compared with existing traditional machine learning (TML) models. The performance of the TML models was affected negatively by the stemmed data, but the performance of the DL models improved greatly with the use of the augmentation techniques. Based on simulation, experimental, and statistical analysis of identical datasets, it is concluded that the TML models outperform the DL models with respect to both training-time (TTM) and runtime (RTM) complexities of the algorithms, but the DL models outperform the TML models with respect to the most important performance factors as well as the average performance rankings.
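The Shift, Shuffle, and Hybrid augmentations operate on token order. One plausible reading of the three names (the paper's exact definitions may differ) is a rotation, a seeded permutation, and their composition:

```python
import random

def shift(tokens, k=1):
    """Rotate the token sequence left by k positions."""
    k %= len(tokens)
    return tokens[k:] + tokens[:k]

def shuffle(tokens, seed=0):
    """Return a seeded random permutation of the tokens."""
    out = list(tokens)
    random.Random(seed).shuffle(out)
    return out

def hybrid(tokens, k=1, seed=0):
    """Compose the two: shift first, then shuffle."""
    return shuffle(shift(tokens, k), seed)
```

Each variant keeps the tweet's vocabulary intact while perturbing word order, so every original tweet can yield several labeled training examples.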

  • Conference Article
  • Cited by: 2
  • 10.1109/icses52305.2021.9633781
A Quantitative Analysis of Basic vs. Deep Learning-based Image Data Augmentation Techniques
  • Sep 24, 2021
  • Mohammed Ehsan Ur Rahman + 4 more

This work presents a quantitative analysis of various basic image-manipulation techniques used as augmentation processes for image data, evaluated on the accuracy of the deep learning task of handwritten-digit recognition on the MNIST dataset. The paper also presents a detailed comparison of parameters such as computational burden, model storage requirements, accuracy, and loss value for basic image-manipulation augmentation techniques against data augmentation mechanisms rooted in deep learning. The results obtained on the MNIST dataset without data augmentation are an accuracy of 97.80% and a loss of 0.320, whereas the highest accuracy was achieved by adjusting brightness as the data augmentation technique, with 98.57% accuracy and a 0.301 loss value. In view of our results, we recommend that basic image manipulation-based data augmentation techniques be used to address overfitting instead of memory- or computationally expensive deep learning-based image augmentation techniques. This strategy also helps enhance the performance of various image-based deep learning pipelines and makes these models more robust.
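Brightness adjustment, the best-performing basic augmentation in the study, is a one-line pixel operation. The clip-to-uint8 sketch below (the scaling-factor interface is an assumption) shows why it is so much cheaper than a learned, DL-based augmenter:

```python
import numpy as np

def adjust_brightness(image, factor):
    """Scale 8-bit pixel intensities by `factor`, clipping to [0, 255]."""
    out = image.astype(np.float32) * factor
    return np.clip(out, 0, 255).astype(np.uint8)
```

A single multiply-and-clip per pixel, with no model to store or train, which is exactly the memory and compute argument the paper makes against GAN-style augmentation.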

  • Research Article
  • Cited by: 4
  • 10.52339/tjet.v42i2.853
Deep Learning Model Compression Techniques: Advances, Opportunities, and Perspective
  • Jun 30, 2023
  • Tanzania Journal of Engineering and Technology
  • Hubert Msuya + 1 more

Recently, deep learning (DL) models have excelled in a wide range of fields. All of these successes are built on intricate DL models, whose achievement is largely due to hundreds of millions or even billions of parameters and high-performance computing on graphical processing units or tensor processing units. The integration of DL models into real-time devices with tight latency limits, limited memory, and power-constrained requirements is the key driving force behind the investigation of DL model compression techniques. Also, increasing data availability encourages multimodal fusion in DL models to boost predictive accuracy. To create compact DL models for memory- and computation-efficient deployment, the information contained in the network parameters is compressed as much as possible, leaving only the bits necessary to carry out the task. A good trade-off between compression rate and accuracy loss must be established so that model acceleration and compression do not severely reduce the model's performance. In this paper, we examine various DL model compression techniques used for both single-modality and multi-modal deep learning tasks. We review numerous DL model compression methods that have advanced across a number of applications. We then summarize the benefits and drawbacks of various compression and acceleration methods, such as ineffectiveness in compressing more complicated networks with dimensionality-dependent complex structures, and finally discuss the field's future prospects.

  • Research Article
  • Cited by: 2
  • 10.3390/jimaging9110238
Constraints on Optimising Encoder-Only Transformers for Modelling Sign Language with Human Pose Estimation Keypoint Data
  • Nov 2, 2023
  • Journal of Imaging
  • Luke T Woods + 1 more

Supervised deep learning models can be optimised by applying regularisation techniques to reduce overfitting, which can prove difficult when fine-tuning the associated hyperparameters. Not all hyperparameters are equal, and understanding the effect each hyperparameter and regularisation technique has on the performance of a given model is of paramount importance in research. We present the first comprehensive, large-scale ablation study for an encoder-only transformer to model sign language using the improved Word-level American Sign Language dataset (WLASL-alt) and human pose estimation keypoint data, with a view to putting constraints on the potential to optimise the task. We measure the impact a range of model parameter regularisation and data augmentation techniques have on sign classification accuracy. We demonstrate that, within the quoted uncertainties, none of the regularisation techniques we employ other than parameter regularisation have an appreciable positive impact on performance, which we find to be in contradiction to results reported by other similar, albeit smaller-scale, studies. We also demonstrate that the model architecture is bounded more by the small dataset size for this task than by the choice of model parameter regularisation and common or basic dataset augmentation techniques. Furthermore, using the base model configuration, we report a new maximum top-1 classification accuracy of 84% on 100 signs, thereby improving on the previous benchmark result for this model architecture and dataset.

  • Conference Article
  • 10.1115/imece2024-142963
Enhancing Defective Casting Classification in Manufacturing Using Deep Learning: A Case Study With VGG16
  • Nov 17, 2024
  • Sathish Gurupatham + 3 more

In the manufacturing industry, the detection of defective products, such as castings, is crucial for ensuring product quality and preventing potential safety hazards. Deep learning techniques, particularly convolutional neural networks (CNNs), have emerged as powerful tools for automating the detection and classification of defects in manufacturing processes. This study explores the application of deep learning, specifically utilizing the VGG16 model, for classifying defective castings from normal castings in the manufacturing field. The dataset comprises a collection of images depicting both normal and defective castings. Initially, the VGG16 model was trained on a dataset consisting of 4100 training images, 1200 validation images, and 500 testing images. The model achieved a moderate accuracy of approximately 60% on the testing set, indicating its ability to classify defective castings to some extent. However, the model’s performance in terms of precision, recall, and F1 score was suboptimal. To enhance the model’s performance, several strategies were employed. First, the VGG16 layers were fine-tuned by selectively unfreezing and training certain layers of the pre-trained model. This process allowed the model to adapt more effectively to the nuances of the specific dataset. Additionally, regularization techniques, such as dropout and L2 regularization, were incorporated to prevent overfitting and improve generalization performance. Moreover, the learning rate of the model was adjusted, and different values were experimented with to optimize its performance. Learning rate schedules and adaptive optimizers, such as the Adam optimizer, were utilized to dynamically adjust the learning rate during training. Furthermore, the training dataset was augmented by applying various data augmentation techniques, such as rotation, flipping, and zooming, to increase the diversity of the images. 
By balancing the dataset to include an equal number of training, validation, and testing images, consisting of 4100, 1200, and 500 samples respectively, a significant improvement in the model’s performance was observed. After implementing these enhancements, the model achieved impressive results, with an accuracy of 89.6%, precision of 91.4%, recall of 89.6%, and F1 score of 89.5% on the testing set. These findings underscore the effectiveness of deep learning approaches, particularly when combined with fine-tuning, regularization, and data augmentation techniques, in accurately detecting and classifying defective castings in the manufacturing industry. Such advancements have the potential to streamline quality control processes, reduce production costs, and enhance overall product reliability and safety in manufacturing operations.
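Two of the regularizers used above are easy to state precisely: inverted dropout zeroes a fraction of activations and rescales the survivors so the expected activation is unchanged, while an L2 penalty adds the scaled sum of squared weights to the loss. A NumPy sketch (the rate and lambda values are illustrative, not the study's tuned settings):

```python
import numpy as np

rng = np.random.default_rng(seed=7)

def dropout(x, rate=0.5, training=True):
    """Inverted dropout: zero units at `rate`, rescale the rest by 1/(1-rate)."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def l2_penalty(weight_arrays, lam=1e-4):
    """L2 regularization term to be added to the training loss."""
    return lam * sum(float(np.sum(w ** 2)) for w in weight_arrays)
```

The 1/(1-rate) rescaling is what lets the same network run at inference time with dropout simply switched off, and the L2 term is what discourages the large weights that drive overfitting on a small casting dataset.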

  • Research Article
  • Cited by: 12
  • 10.1038/s41598-024-66481-4
Explainable artificial intelligence (XAI) for predicting the need for intubation in methanol-poisoned patients: a study comparing deep and machine learning models
  • Jul 8, 2024
  • Scientific Reports
  • Khadijeh Moulaei + 14 more

The need for intubation in methanol-poisoned patients, if not predicted in time, can lead to irreparable complications and even death. Artificial intelligence (AI) techniques like machine learning (ML) and deep learning (DL) greatly aid in accurately predicting intubation needs for methanol-poisoned patients. So, our study aims to assess Explainable Artificial Intelligence (XAI) for predicting intubation necessity in methanol-poisoned patients, comparing deep learning and machine learning models. This study analyzed a dataset of 897 patient records from Loghman Hakim Hospital in Tehran, Iran, encompassing cases of methanol poisoning, including those requiring intubation (202 cases) and those not requiring it (695 cases). Eight established ML (SVM, XGB, DT, RF) and DL (DNN, FNN, LSTM, CNN) models were used. Techniques such as tenfold cross-validation and hyperparameter tuning were applied to prevent overfitting. The study also focused on interpretability through SHAP and LIME methods. Model performance was evaluated based on accuracy, specificity, sensitivity, F1-score, and ROC curve metrics. Among DL models, LSTM showed superior performance in accuracy (94.0%), sensitivity (99.0%), specificity (94.0%), and F1-score (97.0%). CNN led in ROC with 78.0%. For ML models, RF excelled in accuracy (97.0%) and specificity (100%), followed by XGB with sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%). Overall, RF and XGB outperformed other models, with accuracy (97.0%) and specificity (100%) for RF, and sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%) for XGB. ML models surpassed DL models across all metrics, with accuracies from 93.0% to 97.0% for DL and 93.0% to 99.0% for ML. Sensitivities ranged from 98.0% to 99.37% for DL and 93.0% to 99.0% for ML. DL models achieved specificities from 78.0% to 94.0%, while ML models ranged from 93.0% to 100%. F1-scores for DL were between 93.0% and 97.0%, and for ML between 96.0% and 98.27%. 
DL models scored ROC between 68.0% and 78.0%, while ML models ranged from 84.0% to 96.08%. Key features for predicting intubation necessity include GCS at admission, ICU admission, age, longer folic acid therapy duration, elevated BUN and AST levels, VBG_HCO3 at initial record, and hemodialysis presence. This study showcases XAI's effectiveness in predicting intubation necessity in methanol-poisoned patients. ML models, particularly RF and XGB, outperform their DL counterparts, underscoring their potential for clinical decision-making.

  • Research Article
  • 10.1145/3719293
Urdu Word Sense Disambiguation: Leveraging Contextual Stacked Embedding, Siamese Transformer Encoder 1DCNN-BiLSTM, and Gloss Data Augmentation
  • Apr 23, 2025
  • ACM Transactions on Asian and Low-Resource Language Information Processing
  • Anil Ahmed + 3 more

Word Sense Disambiguation (WSD) in Natural Language Processing (NLP) is crucial for discerning the correct meaning of words with multiple senses in various contexts. Recent advancements in this field, particularly Deep Learning (DL) and sophisticated language models such as BERT and GPT, have significantly improved WSD performance. However, challenges persist, especially with languages like Urdu, which are known for their linguistic complexity and limited digital resources compared to English. This study addresses the challenge of advancing WSD in Urdu by developing and applying tailored Data Augmentation (DA) techniques. We introduce an innovative approach, Prompt Engineering with Retrieval Augmented Generation (RAG), leveraging GPT-3.5-turbo to generate context-sensitive Gloss Definitions (GD). Additionally, we employ sentence-level and word-level DA techniques, including Back Translation (BT) and Masked Word Prediction (MWP). To enhance sentence understanding, we combine three BERT embedding models: mBERT, mDistilBERT, and Roberta_Urdu, facilitating a more nuanced comprehension of sentences and improving word disambiguation in complex linguistic contexts. Furthermore, we propose a novel network architecture merging Transformer Encoder (TE)-CNN and TE-BiLSTM models with Multi-Head Self-Attention (MHSA), One-Dimensional Convolutional Neural Network (1DCNN), and Bidirectional Long Short-Term Memory (BiLSTM). This architecture is tailored to address polysemy and capture short- and long-range dependencies critical for effective WSD in Urdu. Empirical evaluations on Lexical Sample (LS) and All Word (AW) tasks demonstrate the effectiveness of our approach, achieving an 88.9% F1 score on the LS and a 79.2% F1 score on AW tasks. These results underscore the importance of language-specific approaches and the potential of DA and advanced modeling techniques in overcoming challenges associated with WSD in languages with limited resources.
