Memetic multilabel feature selection using pruned refinement process

With the growing complexity of data structures, which include high-dimensional and multilabel datasets, the significance of feature selection has become more emphasized. Multilabel feature selection endeavors to identify a subset of features that concurrently exhibit relevance across multiple labels. Owing to the impracticality of performing exhaustive searches to obtain the optimal feature subset, conventional approaches in multilabel feature selection often resort to a heuristic search process. In this context, memetic multilabel feature selection has received considerable attention because of its superior search capability; the fitness of the feature subset created by the stochastic search is further enhanced through a refinement process predicated on the employed multilabel feature filter. Thus, it is imperative to employ an effective refinement process that frequently succeeds in improving the target feature subset to maximize the benefits of hybridization. However, the refinement process in conventional memetic multilabel feature selection often overlooks potential biases in feature scores and compatibility issues between the multilabel feature filter and the subsequent learner. Consequently, conventional methods may not effectively identify the optimal feature subset in complex multilabel datasets. In this study, we propose a new memetic multilabel feature selection method that addresses these limitations by incorporating the pruning of features and labels into the refinement process. The effectiveness of the proposed method was demonstrated through experiments on 14 multilabel datasets.

Open Access
Unlocking the potential of Naive Bayes for spatio temporal classification: a novel approach to feature expansion

Prediction processes in areas ranging from climate and disease spread to disasters and air pollution rely heavily on spatial–temporal data. Understanding and forecasting the distribution patterns of disease cases and climate change phenomena has become a focal point of researchers around the world. Machine learning models for prediction can generally be classified into two groups: those based on previous patterns, such as LSTM, and those based on causal factors, such as Naive Bayes and other classifiers. The main drawback of models such as Naive Bayes is that they cannot predict future trends because they only make predictions for the present time. In this study, we propose a novel approach that makes the Naive Bayes classifier capable of predicting future classifications. The dimension of the feature matrix is expanded using historical data from several previous time periods to obtain a long-term classification prediction model based on Naive Bayes. The case studies used are the prediction of the distribution of the annual number of dengue fever cases in Bandung City and the distribution of monthly rainfall in Java Island, Indonesia. Through rigorous testing, we demonstrate the effectiveness of this Time-Based Feature Expansion approach in Naive Bayes in accurately predicting the distribution of annual dengue fever cases in 30 sub-districts in Bandung City and monthly rainfall in Java Island, Indonesia, with both accuracy and F1-score exceeding 97%.
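
The abstract does not spell out how the feature matrix is expanded, so the following is only a minimal sketch of one plausible reading: the features of the preceding periods are concatenated into a single row, and that row is paired with the class label of the following period. The function name `expand_with_lags`, the parameter `n_lags`, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
def expand_with_lags(features, labels, n_lags):
    """Pair each period's label with the features of the n_lags periods before it.

    features: list of per-period feature lists, ordered by time.
    labels:   list of per-period class labels, same length as features.
    Returns (expanded_features, future_labels).
    Hypothetical helper: the paper's exact construction may differ.
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    if not 0 < n_lags < len(features):
        raise ValueError("n_lags must be between 1 and len(features) - 1")

    expanded, targets = [], []
    for t in range(n_lags, len(features)):
        row = []
        # Concatenate the feature vectors of periods t - n_lags .. t - 1.
        for past in features[t - n_lags:t]:
            row.extend(past)
        expanded.append(row)
        # The target is the class observed in period t, i.e. one step ahead.
        targets.append(labels[t])
    return expanded, targets


# Toy data: four periods, two features per period (illustrative values only).
features = [[1, 10], [2, 20], [3, 30], [4, 40]]
labels = ["low", "low", "high", "high"]
X_exp, y_future = expand_with_lags(features, labels, n_lags=2)
```

Under this reading, the expanded rows can be passed to any standard classifier, including Naive Bayes, so that the model learns to map past observations to a future class rather than to the class of the current period.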

Open Access
Sentiment-based predictive models for online purchases in the era of marketing 5.0: a systematic review

The convergence of artificial intelligence (AI), big data (DB), and Internet of Things (IoT) in Society 5.0 has given rise to Marketing 5.0, revolutionizing personalized customer experiences. In this study, a systematic literature review was conducted to examine the integration of predictive modelling and sentiment analysis within the Marketing 5.0 domain. Unlike previous research, this study addresses both aspects within a single context, emphasizing the need for a sentiment-based predictive approach to the buyers’ journey. This review explores how predictive and sentiment models enhance customer experience, inform business decisions, and optimize marketing processes. This study contributes to the literature by identifying areas of improvement in predictive modelling and emphasizes the role of a sentiment-based approach in Marketing 5.0. The sentiment-based model assists businesses in understanding customer preferences, offering personalized products, and enabling customers to receive relevant advertisements during their purchase journey. The paper’s structure covers the evolution of traditional marketing to digital marketing, AI’s role in digital marketing, predictive modelling in marketing, and the significance of analyzing customer sentiments in their reviews. The PRISMA-P methodology, research questions, and suggestions for future work and limitations provide a comprehensive overview of the scope and contributions of this review.

Open Access
Advancing cybersecurity: a comprehensive review of AI-driven detection techniques

As the number and cleverness of cyber-attacks keep increasing rapidly, it's more important than ever to have good ways to detect and prevent them. Recognizing cyber threats quickly and accurately is crucial because they can cause severe damage to individuals and businesses. This paper takes a close look at how we can use artificial intelligence (AI), including machine learning (ML) and deep learning (DL), alongside metaheuristic algorithms to detect cyber-attacks better. We've thoroughly examined over sixty recent studies to measure how effective these AI tools are at identifying and fighting a wide range of cyber threats. Our research includes a diverse array of cyber-attacks such as malware attacks, network intrusions, spam, and others, showing that ML and DL methods, together with metaheuristic algorithms, significantly improve how well we can find and respond to cyber threats. We compare these AI methods to find out what they're good at and where they could improve, especially as we face new and changing cyber-attacks. This paper presents a straightforward framework for assessing AI methods in cyber threat detection. Given the increasing complexity of cyber threats, enhancing AI methods and regularly ensuring strong protection is critical. We evaluate the effectiveness and the limitations of currently proposed ML and DL models, in addition to the metaheuristic algorithms. Recognizing these limitations is vital for guiding future enhancements. We're pushing for smart and flexible solutions that can adapt to new challenges. The findings from our research suggest that the future of protecting against cyber-attacks will rely on continuously updating AI methods to stay ahead of hackers' latest tricks.

Open Access
Interpolation-split: a data-centric deep learning approach with big interpolated data to boost airway segmentation performance

The morphology and distribution of airway tree abnormalities enable diagnosis and disease characterisation across a variety of chronic respiratory conditions. In this regard, airway segmentation plays a critical role in the production of the outline of the entire airway tree to enable estimation of disease extent and severity. However, the segmentation of a complete airway tree is challenging as the intensity, scale/size and shape of airway segments and their walls change across generations. The existing classical techniques provide either an undersegmented or an oversegmented airway tree, and manual intervention is required for optimal airway tree segmentation. The recent development of deep learning methods provides a fully automatic way of segmenting airway trees; however, these methods usually require high GPU memory usage and are difficult to implement in low computational resource environments. Therefore, in this study, we propose a data-centric deep learning technique with big interpolated data, Interpolation-Split, to boost the segmentation performance of the airway tree. The proposed technique utilises interpolation and image split to improve data usefulness and quality. Then, an ensemble learning strategy is implemented to aggregate the segmented airway segments at different scales. In terms of average segmentation performance (dice similarity coefficient, DSC), our method (A) achieves 90.55%, 89.52%, and 85.80%; (B) outperforms the baseline models by 2.89%, 3.86%, and 3.87% on average; and (C) produces maximum segmentation performance gain by 14.11%, 9.28%, and 12.70% for individual cases when (1) nnU-Net with instance normalisation and leaky ReLU; (2) nnU-Net with batch normalisation and ReLU; and (3) modified dilated U-Net are used respectively. Our proposed method outperformed the state-of-the-art airway segmentation approaches. Furthermore, our proposed technique has low RAM and GPU memory usage and is highly flexible, enabling it to be deployed on any 2D deep learning model.
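
For readers unfamiliar with the metric reported above, the dice similarity coefficient of a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The snippet below is a minimal sketch of that standard definition on flattened binary masks; the function name `dice_coefficient`, the convention for two empty masks, and the toy masks are assumptions made for illustration, not the evaluation code used in the study.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient of two flattened binary masks.

    pred, truth: equal-length sequences of 0/1 (or False/True) voxel labels.
    Returns a value in [0, 1]; 1.0 means perfect overlap.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled as airway in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total airway voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Both masks are empty; treating this as perfect agreement is an
        # assumed convention, not something stated in the source.
        return 1.0
    return 2.0 * intersection / total


# Toy masks (illustrative only): 2 shared voxels, 3 voxels in each mask.
pred = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 1, 1]
dsc = dice_coefficient(pred, truth)
```

Multiplying the result by 100 gives the percentage form in which the scores are quoted in the abstract.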

Open Access
An adaptive composite time series forecasting model for short-term traffic flow

Short-term traffic flow forecasting is a hot issue in the field of intelligent transportation. The research field of traffic forecasting has evolved greatly in past decades. With the rapid development of deep learning and neural networks, a series of effective methods have been proposed to address the short-term traffic flow forecasting problem, which makes it possible to examine and forecast traffic situations more accurately than ever. Unlike linear methods, deep learning based methods achieve traffic flow forecasting by exploring the complex nonlinear relationships in traffic flow. Most existing methods use only a single framework for feature extraction and forecasting. These approaches treat all traffic flow equally and assume that it shares the same attributes. However, the traffic flow from different time spots or roads may carry distinct attribute information (such as congested and uncongested). A simple single framework usually ignores the different attributes embedded in different distributions of data, which decreases the accuracy of traffic forecasting. To tackle these issues, we propose an adaptive composite framework, named Long-Short-Combination (LSC). In the proposed method, two data forecasting modules (L and S) are designed for short-term traffic flow with different attributes, respectively. Furthermore, we integrate an attribute forecasting module (C) to forecast the traffic attributes for each time point in the future time series. The proposed framework has been assessed on real-world datasets. The experimental results demonstrate that the proposed model has excellent forecasting performance.

Open Access