- Research Article
- 10.1142/s0218213026400063
- Mar 14, 2026
- International Journal on Artificial Intelligence Tools
- Ali Khebizi + 1 more
Missing data (MD) is an inherent challenge that arises for various reasons across different domains. It often results in incomplete datasets, which can compromise the quality of information processing and, in turn, reduce the reliability and relevance of the resulting decisions. Although the literature offers a wide range of methods and techniques to address MD challenges, many approaches still lack accuracy and fail to capture the underlying correlations and patterns within datasets. Furthermore, existing solutions have yet to fully exploit recent advances in artificial intelligence (AI) to enhance the effectiveness of MD handling. In this paper, we present a comprehensive framework for effectively addressing MD. In the proposed approach, missing values in structured datasets are imputed using the Random Forest (RF) algorithm, while sequential and time-series data are handled through the Long Short-Term Memory (LSTM) recurrent neural network (RNN), which captures both temporal dependencies and long-range patterns. The framework also allows users to define customized strategies for enhanced MD handling, where a user-defined strategy is formulated as a composition of existing imputation techniques combined with the proposed predictive models. To support practical adoption, the system has been implemented in a dedicated software tool that offers an efficient and flexible solution for diverse dataset formats with varying levels of missingness. Experiments on the London Weather (LWD) and PhysioNet Intensive Care Unit (ICU) benchmark datasets demonstrate the effectiveness of the proposed method, revealing clear improvements over traditional methods.
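The structured-data branch described above can be sketched with off-the-shelf components. Below is a minimal illustration of random-forest-based imputation using scikit-learn's IterativeImputer with a RandomForestRegressor estimator; it is not the authors' implementation, and the toy columns are hypothetical placeholders for a dataset such as the London Weather data.

```python
# Minimal sketch: random-forest-based imputation for a tabular dataset.
# Assumes scikit-learn; column names and values are hypothetical.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

# Toy frame with missing entries (illustrative only).
df = pd.DataFrame({
    "temp":     [12.1, np.nan, 14.3, 13.0, np.nan],
    "humidity": [0.71, 0.65, np.nan, 0.60, 0.58],
    "pressure": [1012, 1009, 1011, np.nan, 1014],
})

# Each column with missing values is regressed on the other columns,
# iterating until the imputed values stabilize.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=100, random_state=0),
    max_iter=10,
    random_state=0,
)
df_imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
print(df_imputed)
```

For sequential data, the framework instead relies on an LSTM that predicts a missing value from its temporal context; that branch is not shown here.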
- Research Article
- 10.1142/s0218213026400014
- Feb 11, 2026
- International Journal on Artificial Intelligence Tools
- Prateek Goel + 1 more
Advancement in any field requires approaches for measurement, and failure to build such approaches inhibits improvement within the field. In the context of interpretability in Artificial Intelligence (AI), the lack of widely adopted evaluation and measurement approaches holds back progress. While some approaches in the literature propose ways to measure interpretability, there is no consensus on how to measure it objectively. To advance the state of the art, a clear understanding of these approaches is essential. This paper presents a systematic review of existing approaches that propose to measure or quantify interpretability and its aspects. We found that no approach directly measures interpretability; instead, existing work quantifies aspects associated with it. From this review, we identify the aspects that are important to consider when measuring interpretability.
- Research Article
- 10.1142/s0218213026400026
- Feb 11, 2026
- International Journal on Artificial Intelligence Tools
- Brandon D Hines + 2 more
Machine learning (ML) is capable of aiding and improving medical diagnostics for a wide variety of pathologies. When used to process data from smart implantable medical devices, ML can offer timely and automated diagnostic tools to improve patient care. However, diagnostic ML models that influence healthcare decisions must have outputs that can be understood and trusted. Currently, ML-based medical diagnostic research focuses mainly on accuracy, with little investigation of interpretability, explainability, and trust in the models – fundamental principles of diagnostic ML. To address this gap, this study seeks to improve explainability in ML models trained to diagnose aseptic tibial loosening in smart piezoelectric total knee replacements (TKRs). Specifically, a pathway to explainability is presented by applying the Local Interpretable Model-agnostic Explanations (LIME) method to interpret previously trained k-nearest neighbor (KNN), support vector machine (SVM), and discriminant analysis (DA) models that classify cement damage from electromechanical impedance signatures of piezoelectric-instrumented simulated TKRs. Two simple yet novel feature engineering techniques are proposed to align the models with domain knowledge of piezoelectric impedance-based structural health monitoring (SHM), thus improving explainability. These feature engineering techniques are broadly applicable to data types in which adjacent features are inherently related (e.g., time series, spectra, and images). The original KNN, SVM, and DA models demonstrated explainability scores of 52%, 64.4%, and 42.1%, respectively. The first feature engineering technique improved the KNN and SVM scores to 84.6% and 87.6%, respectively, with the DA score falling to 29.5%. The second feature engineering technique improved the KNN and SVM scores to 74.5% and 81.1%, respectively, with the DA score falling to 33.5%. The pathway to explainability and the feature engineering techniques presented in this study yield models with improved explainability that are better aligned with domain knowledge, while either improving (SVM) or maintaining (KNN) the original accuracy.
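As an illustration of the LIME step described above, the following sketch applies lime's LimeTabularExplainer to a KNN classifier; the synthetic data and class names are placeholders for the paper's impedance-signature features and damage labels.

```python
# Minimal sketch: LIME explanation of a trained tabular classifier.
# Assumes the `lime` and scikit-learn packages; data are synthetic placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from lime.lime_tabular import LimeTabularExplainer

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
model = KNeighborsClassifier(n_neighbors=5).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=[f"f{i}" for i in range(X.shape[1])],
    class_names=["undamaged", "damaged"],
    discretize_continuous=True,
)

# Explain one prediction: LIME perturbs the instance locally and fits a sparse
# linear surrogate whose weights rank the feature contributions.
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(exp.as_list())
```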
- Research Article
- 10.1142/s021821302640004x
- Feb 11, 2026
- International Journal on Artificial Intelligence Tools
- Alaa O Khadidos + 5 more
In this paper, the significance of supply chain management in healthcare logistics is emphasized by addressing the challenges of fluctuating demand and controlled drug production. To overcome these issues, the proposed system monitors the quantity of manufactured drugs through a combination of centralized and individual hub establishments that manage product distribution efficiently. An Artificial Intelligence (AI)-based automatic product clustering mechanism is integrated to analyze demand expectations and associated risk factors. The clustered products are then systematically arranged, and the internal connectivity between local suppliers is optimized to ensure minimal demand imbalance. Furthermore, to enhance the stability of the healthcare supply chain, proportional connections among hubs are evaluated, enabling data-driven and optimized decision-making. The performance of the proposed AI model is validated using four case studies, demonstrating its capability to achieve high connectivity and scalability. The model can be implemented in real time with a minimal operational cost of approximately 1,635 USD, confirming its practicality and cost-effectiveness.
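The product-clustering step could, for instance, be approximated with a standard clustering routine. The sketch below uses scikit-learn's KMeans on hypothetical demand and risk features; it illustrates the general idea rather than the paper's specific mechanism.

```python
# Minimal sketch: clustering drug products by demand and risk attributes.
# Assumes scikit-learn; the feature columns are hypothetical placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# One row per drug: [expected monthly demand, demand variability, supply-risk score]
products = rng.random((40, 3)) * [1000, 200, 1]

X = StandardScaler().fit_transform(products)
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# Hubs can then be assigned per cluster so that demand imbalance stays small.
for c in range(4):
    print(f"cluster {c}: {np.sum(clusters == c)} products")
```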
- Research Article
- 10.1142/s0218213026400051
- Feb 11, 2026
- International Journal on Artificial Intelligence Tools
- Lynn Vonderhaar + 2 more
While metrics such as precision, recall, and related measures are useful for evaluating object detection models, they provide a limited perspective on model behavior. Even with carefully prepared training data and robust optimization, there is no guarantee regarding what features a model actually learns. In practice, a model may associate certain background elements, i.e., scene-level objects, with the presence of target classes, resulting in unintended contextual dependencies. Conventional performance metrics, however, do not reveal this issue. To address this gap, this paper introduces a black-box explainability approach that evaluates object detection models by quantifying the influence of scene-level objects on class identification. By comparing Average Precision (AP) on test data with and without specific scene elements, the method highlights the extent to which those objects contribute to model performance. Three experiments are presented to demonstrate the method’s utility. The findings provide both quantitative and global explanations of model behavior, yielding a more complete picture of object detection performance.
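The core comparison can be illustrated with a small sketch: compute AP for a target class on test samples where the scene-level object is present and where it is absent, then report the gap. The detector scores below are synthetic placeholders; a real evaluation would use per-image detections and a COCO-style AP rather than a simple score ranking.

```python
# Minimal sketch: quantify a contextual dependency as the AP difference
# between test subsets with and without a scene-level object.
# Assumes scikit-learn; all data below are synthetic.
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)

def synthetic_detections(n, score_shift):
    """Ground-truth labels and detector confidences for n test samples."""
    y_true = rng.integers(0, 2, size=n)
    y_score = np.clip(y_true * 0.5 + score_shift + rng.normal(0, 0.2, n), 0, 1)
    return y_true, y_score

# Subset A: images that include the scene-level object (e.g., road near cars).
y_a, s_a = synthetic_detections(200, score_shift=0.3)
# Subset B: same target class, but the scene-level object is masked or absent.
y_b, s_b = synthetic_detections(200, score_shift=0.1)

ap_with = average_precision_score(y_a, s_a)
ap_without = average_precision_score(y_b, s_b)
print(f"AP with scene object:    {ap_with:.3f}")
print(f"AP without scene object: {ap_without:.3f}")
print(f"Contextual dependency (delta AP): {ap_with - ap_without:.3f}")
```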
- Research Article
- 10.1142/s0218213026020021
- Feb 11, 2026
- International Journal on Artificial Intelligence Tools
- Sheikh Rabiul Islam + 3 more
The rapid deployment of Artificial Intelligence (AI) in high-stakes domains necessitates robust approaches to transparency, fairness, and trustworthiness. Current advancements in AI performance are outpacing our understanding and ability to govern these systems. This special issue presents research addressing explainability, fairness, and trust as interconnected socio-technical challenges. Accepted papers demonstrate novel techniques for revealing hidden model dependencies, aligning explanations with domain expertise, rigorously benchmarking model classes for explanation robustness, and refining methods for measuring interpretability. We synthesize these contributions, situate them within current policy and standardization (EU AI Act [1]; NIST AI RMF [2, 3]; ISO/IEC 23894 and 42001 [4, 5]), and connect them to emerging evaluation science in XAI (e.g., BEExAI [9], Saliency-Bench [10], F-Fidelity [11]). Finally, we outline a forward-looking agenda emphasizing multi-aspect evaluation, context-sensitive trust, and the development of governance-ready AI systems.
- Research Article
- 10.1142/s0218213026400038
- Feb 11, 2026
- International Journal on Artificial Intelligence Tools
- Alina Lazar + 2 more
The goal of this study was to evaluate the performance of traditional gradient boosting (GB) and neural network models on diverse tabular datasets that differ in scale, class balance, and feature composition (numerical, categorical, or mixed). We focused on six representative datasets: adult census income, bank marketing, credit card fraud, breast cancer diagnosis, diabetes, and in-vehicle coupon recommendation, each with distinct challenges related to dimensionality, sample size, and heterogeneity. We benchmark the predictive performance of the gradient boosting models XGBoost and LightGBM against Multilayer Perceptrons (MLP), Tabular Transformers, and the Tabular Prior-Data Fitted Network (TabPFN), using metrics such as accuracy, F1 score, ROC-AUC, and log loss. To ensure transparency and interpretability, we applied SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) to all models and evaluated explanation quality using stability, fidelity, and consistency criteria. Our findings confirm that gradient boosting models consistently achieve the best balance of performance, calibration, and interpretability across heterogeneous and imbalanced datasets. SHAP-based insights show that GB models provide more stable and interpretable feature attributions, making them well suited for high-stakes domains such as finance and healthcare. These results emphasize the practical advantages of gradient boosting methods for structured data tasks and highlight the interpretability limitations of deep learning models when applied to tabular datasets. Future work will explore hybrid architectures and pretraining strategies to close this performance gap.
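One cell of such a benchmark might look like the sketch below, assuming the xgboost, shap, and scikit-learn packages; the breast-cancer dataset stands in for the six datasets in the study, and the hyperparameters are illustrative rather than the paper's settings.

```python
# Minimal sketch: one benchmarking cell (XGBoost + standard metrics + SHAP).
import shap
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, log_loss

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = xgb.XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
model.fit(X_tr, y_tr)

proba = model.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)
print("accuracy:", accuracy_score(y_te, pred))
print("F1:      ", f1_score(y_te, pred))
print("ROC-AUC: ", roc_auc_score(y_te, proba))
print("log loss:", log_loss(y_te, proba))

# SHAP attributions for the tree model: one additive contribution per feature.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_te)
print("mean |SHAP| of strongest feature:", abs(shap_values).mean(axis=0).max())
```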
- Research Article
- 10.1142/s021821302650003x
- Feb 6, 2026
- International Journal on Artificial Intelligence Tools
- Mohammed El-Amine Meziane
The flexible job shop scheduling problem with automated guided vehicles (FJSP-AGV) couples production and transport decisions, making scheduling and energy management computationally challenging. Conventional genetic algorithms apply a single decoder throughout the search and thus cannot adapt when instance characteristics or battery constraints change. We propose a portfolio-island decoding genetic algorithm (PID-NSGA-II) that shifts the focus from modifying evolutionary operators to learning which decoding strategies work best. Five heterogeneous decoders run in parallel on separate islands, and an upper-confidence-bound multi-armed bandit measures each island’s contribution to makespan improvement and adaptively reallocates population resources, automatically balancing exploration and exploitation. The framework is tested under two settings: pure makespan minimization and energy-aware scheduling with AGV battery considerations. Experiments on benchmark datasets show that PID-NSGA-II consistently improves solution quality and stability compared with single-decoder genetic algorithms, achieving up to a 25% makespan reduction and substantial improvements in AGV battery levels across small, medium, and large problem instances, with greater gains when energy constraints are present. Adaptive learning of decoders thus delivers more robust scheduling decisions for complex FJSP-AGV environments and provides a scalable platform for smart manufacturing applications.
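The bandit component can be illustrated in isolation. The following self-contained sketch applies a UCB1 policy to five hypothetical decoder islands with synthetic rewards; in PID-NSGA-II the reward would instead be the measured makespan improvement contributed by each island, coupled to NSGA-II populations.

```python
# Minimal sketch: UCB1 allocation of evaluations across decoder "islands".
# Decoder names and reward distributions are hypothetical placeholders.
import math
import random

DECODERS = ["greedy", "earliest-AGV", "energy-aware", "random-key", "priority"]
counts = [0] * len(DECODERS)     # pulls per island
totals = [0.0] * len(DECODERS)   # cumulative normalized improvement

def ucb_select(t):
    # Try every island once, then balance exploitation and exploration.
    for i, c in enumerate(counts):
        if c == 0:
            return i
    return max(
        range(len(DECODERS)),
        key=lambda i: totals[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i]),
    )

random.seed(0)
true_quality = [0.3, 0.5, 0.7, 0.4, 0.6]  # hidden mean improvement per decoder

for t in range(1, 501):
    i = ucb_select(t)
    reward = max(0.0, random.gauss(true_quality[i], 0.1))  # observed improvement
    counts[i] += 1
    totals[i] += reward

for name, c in zip(DECODERS, counts):
    print(f"{name:14s} selected {c:3d} times")
```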
- Research Article
- 10.1142/s0218213026500028
- Jan 30, 2026
- International Journal on Artificial Intelligence Tools
- Mamta Suyog Bhamare + 1 more
Personality is becoming a prominent topic in Natural Language Processing (NLP), as it is the most straightforward way to translate emotions and internal thoughts into a form that others can recognize. More attention has been paid recently to cognitive-based sentiment analysis of online social media language, with an emphasis on automatically detecting user behavior, including personality traits. However, the long training times associated with sequential inputs and the limited capacity of current deep learning approaches to capture the real (semantic) meaning of words pose problems that can compromise prediction accuracy. To address these issues, this research presents a novel social media personality prediction model, based on textual data from Twitter, that integrates preprocessing, feature extraction, and prediction stages. Initially, the input text is preprocessed to reduce input complexity and enable more meaningful feature extraction. Features are then extracted from the preprocessed texts using polarity scores and semantic similarity estimates; these methods provide compact and informative representations that increase the accuracy of personality prediction. To predict personality traits, the derived features are learned by a hybrid technique that combines a Long Short-Term Memory (LSTM) network and a Recurrent Neural Network (RNN). To further improve prediction accuracy, the LSTM weights are optimally tuned using a new Sea Lion Updated Shark Smell Optimization (SUSSO) algorithm that combines the Sea Lion Optimization (SLnO) and Shark Smell Optimization (SSO) methods. This parametric tuning ensures that the proposed personality trait prediction method works as intended. Finally, the efficacy of the presented model is assessed against existing models using various performance metrics. The proposed hybrid classifier + SUSSO model achieves accuracies of 0.92633 and 0.9344 on Dataset 1 and Dataset 2, respectively, while conventional models obtain lower scores. The proposed model thus offers promising results and paves the way for more personalized and psychologically informed interactions on social media platforms.
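A minimal sketch of the hybrid LSTM + RNN classifier described above is given below, assuming TensorFlow/Keras; the layer sizes, feature dimension, and trait count are placeholders, and the SUSSO weight tuning is not reproduced here.

```python
# Minimal sketch: hybrid LSTM + simple-RNN stack for trait prediction.
# Assumes TensorFlow/Keras; all dimensions are hypothetical placeholders.
import tensorflow as tf

n_features = 32   # e.g., polarity + semantic-similarity features per time step
n_traits = 5      # e.g., Big Five personality traits

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, n_features)),        # variable-length sequence
    tf.keras.layers.LSTM(64, return_sequences=True), # long-range dependencies
    tf.keras.layers.SimpleRNN(32),                    # lightweight recurrent head
    tf.keras.layers.Dense(n_traits, activation="sigmoid"),  # one score per trait
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```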
- Research Article
- 10.1142/s0218213025400135
- Jan 20, 2026
- International Journal on Artificial Intelligence Tools
- Dimitris Karpontinis + 1 more
Patch-based Transformer models have gained widespread adoption, achieving state-of-the-art performance across domains that involve multi-dimensional spatio-temporal data, such as vision tasks. Recently, they have emerged as a promising alternative for multivariate time-series forecasting, where each univariate series is treated as a separate channel while sharing the same embedding and Transformer weights. In this work, we further explore the capabilities of patch-based Transformers in the context of forecasting a single time series, focusing specifically on energy consumption prediction. Our primary interest lies in long-term forecasting, a relatively under-explored area in the literature. To this end, we evaluate Transformer-based models on two energy consumption datasets, one public and one private, and assess their performance. We argue that leveraging patches or patching-like techniques can significantly enhance model efficiency. Lastly, we discuss the current limitations of Transformer-based architectures and propose potential solutions.
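The patching idea at the heart of such models can be shown in a few lines. The sketch below (assuming PyTorch) splits a univariate series into overlapping patches and embeds each patch, in the spirit of PatchTST-style tokenization; the lengths, stride, and model dimension are illustrative.

```python
# Minimal sketch: turn a univariate series into patch tokens for a Transformer.
import torch

series = torch.randn(1, 1, 512)   # (batch, channels, time): one energy series
patch_len, stride = 16, 8

# unfold yields (batch, channels, num_patches, patch_len)
patches = series.unfold(dimension=-1, size=patch_len, step=stride)
print(patches.shape)              # torch.Size([1, 1, 63, 16])

# Each patch is linearly embedded into the model dimension before attention.
d_model = 128
embed = torch.nn.Linear(patch_len, d_model)
tokens = embed(patches)           # (batch, channels, num_patches, d_model)
print(tokens.shape)
```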