Improved Marine Predators Algorithm and Extreme Gradient Boosting (XGBoost) for shipment status time prediction

Shipment Status Time Prediction (STP) is a complex problem that requires expertise in several disciplines, including Machine Learning (ML) and logistics management, to develop effective solutions. Estimating every possible shipment step before the shipment is created plays a vital role in e-commerce sales channels. From this point of view, a novel STP approach is proposed to predict shipment status times. The proposed approach involves two phases. The first phase builds an ML model for estimating shipment statuses using a dataset acquired from a real-world application. The Extreme Gradient Boosting (XGB) algorithm was first compared with popular ML algorithms on the shipment status classification task. XGB performed best among the compared algorithms, with 99.92% and 96.16% accuracy on the training and test sets, respectively. The ML algorithms were also run on the public New York City taxi trip dataset, where XGB again exhibited the best performance, with training and test accuracies of 97.40% and 97.33%, respectively. The second phase is shipment status time estimation, which is designed as an optimization problem. The Marine Predators Algorithm (MPA) is a recently proposed optimization algorithm for numerical function optimization. An improved MPA for STP (STPMPA) is proposed in this study. The performance of STPMPA was first evaluated on numerical benchmark problems, where it outperformed all the other algorithms in the experiment. Then, the optimizers searched for the most feasible shipment status times using the XGB model. The proposed STPMPA outperformed the compared optimization algorithms on the STP problem. Overall, the experimental studies show that the proposed STP approach can generate efficient estimations in reasonable times for real-time systems.

Just Published
A business process network efficiency model for handling conflicting information

In today's big data era, where a significant volume of business data is generated daily, managing conflicting information within business process networks is crucial for maintaining operational efficiency. This paper addresses this challenge by proposing an efficiency model for business process networks tailored to handle conflict information, drawing on queuing theory and evidence theory. Firstly, we introduce a novel approach for measuring conflict information based on evidence theory and Pignistic probability transformation theory. Next, we tailor efficiency models for the four fundamental structures found in business process networks: sequential, selective, parallel, and loop structures, using queuing theory to manage conflict information effectively in each scenario. We further extend this approach by conceptualizing virtual business activities, allowing us to view the entire business process network as a sequential structure of virtual business activities, facilitating efficiency measurement across the network. Utilizing these measurements, we formulate the queuing service of the business process network as a nonlinear programming problem aimed at minimizing time, thus determining the optimal service rate for business process activities. Finally, we demonstrate the applicability and effectiveness of our proposed model through an experimental analysis focused on the railway intermodal transportation business process. The experimental results indicate that our model significantly reduces the impact of conflicting information, leading to a measurable improvement in the efficiency of the business process network. Specifically, the model achieves a notable enhancement in the coordination and execution of intermodal transportation activities, thereby streamlining operations and reducing decision-making uncertainties. 
This structured approach not only addresses the challenge of managing conflicting information within business process networks but also provides a clear framework for understanding and optimizing network efficiency.
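
The abstract above names the Pignistic probability transformation from evidence theory without showing it. As a rough illustration only, the standard transformation spreads the mass of each focal set evenly over its elements, BetP(x) = Σ_{A∋x} m(A) / (|A| (1 − m(∅))). The sketch below assumes that textbook definition; the function name `pignistic` and the example masses are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet


def pignistic(masses: Dict[FrozenSet[str], float]) -> Dict[str, float]:
    """Convert a mass function into pignistic probabilities (textbook form)."""
    empty_mass = masses.get(frozenset(), 0.0)
    if empty_mass >= 1.0:
        raise ValueError("all mass is on the empty set; BetP is undefined")

    betp: Dict[str, float] = {}
    for focal_set, mass in masses.items():
        if not focal_set:
            continue  # the empty set only contributes through the normalisation
        # Each element of a focal set receives an equal share of its mass.
        share = mass / (len(focal_set) * (1.0 - empty_mass))
        for element in focal_set:
            betp[element] = betp.get(element, 0.0) + share
    return betp


# Hypothetical evidence about one business activity (illustrative values only).
example_masses = {
    frozenset({"on_time"}): 0.5,
    frozenset({"delayed"}): 0.2,
    frozenset({"on_time", "delayed"}): 0.3,  # mass left undecided between both
}
example_betp = pignistic(example_masses)
```

Here the undecided mass of 0.3 is split equally, so the two outcomes receive 0.65 and 0.35. How the paper turns such probabilities into a conflict measure is not stated in the abstract and is not attempted here.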

Just Published
Utilizing domain knowledge: Robust machine learning for building energy performance prediction with small, inconsistent datasets

Machine learning (ML) applications often require large datasets, a requirement that can pose a major challenge in fields where data is sparse or inconsistent. To address this issue, we propose a novel approach that combines prior knowledge with data-driven methods to significantly reduce data dependency. This study represents disentangled knowledge of system compositionality using Component-Based Machine Learning (CBML) in the context of energy-efficient building engineering. In this way, CBML incorporates semantic domain knowledge within the structure of a data-driven model. To understand the advantage of CBML, we conducted a case experiment to assess the effectiveness of this knowledge-encoded ML approach, alongside several typical ML methods, in scenarios with sparse data input (sampling rates from 1% down to 0.0125%). Our findings reveal three key advantages of this approach over traditional ML methods: 1) it significantly improves the robustness of ML models when dealing with extremely small and inconsistent datasets; 2) it allows for efficient utilization of data from diverse record collections; 3) it can handle incomplete data while maintaining high interpretability and reducing training time. These features offer a promising solution to the challenges associated with deploying data-intensive methods and contribute to more efficient real-world data usage. Additionally, we outline four essential prerequisites for successfully integrating prior knowledge and ML generalization in target scenarios, and we have open-sourced the code and dataset for community reproduction.

Open Access
Few-shot fault diagnosis of turnout switch machine based on flexible semi-supervised meta-learning network

The safety of train operations hinges on the reliability of the signal system, and the switch machine is a pivotal component within it. Consequently, fault diagnosis of switch machines is of paramount importance. However, obtaining a substantial amount of fault data is challenging in practice, and labeled data is even scarcer, which leaves fault diagnosis models for switch machines with low diagnostic accuracy and poor generalization ability. To address these problems, this paper proposes a flexible semi-supervised meta-learning network (FSMN) for the fault diagnosis of switch machines. Firstly, a dual-channel hetero-convolution kernel feature extractor (DHKFE) is proposed to efficiently extract switch machine fault features at different levels from few-shot samples. Secondly, a flexible distance prototype corrector is employed to adaptively modify the distance function. It does so by rapidly identifying similarities among fault samples and using unlabeled data to fine-tune prototype positions, which enhances prototype stability and generalization and leads to more accurate fault classification. Finally, A-phase current data collected in a real scene during the transition between the two states of the switch machine are used to validate FSMN, alongside a comparative assessment against five other methods. The results show that the accuracy of FSMN reaches 97.35% in the forward-reverse transition and 92.72% in the reverse-forward transition, which indicates that FSMN is superior in few-shot fault diagnosis and can be applied to various switch machines.
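
The prototype idea mentioned in the abstract above can be illustrated with the plain prototypical-network baseline: each class prototype is the mean of its few labeled feature vectors, and a query is assigned to the nearest prototype. This is a minimal sketch of that baseline only, assuming Euclidean distance and precomputed features. It is not the paper's flexible distance prototype corrector or its DHKFE extractor, and every name and value below is hypothetical.

```python
import math
from typing import Dict, List, Sequence


def class_prototypes(support: Dict[str, List[Sequence[float]]]) -> Dict[str, List[float]]:
    """Average the support feature vectors of each class into one prototype."""
    prototypes: Dict[str, List[float]] = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(vector[i] for vector in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def nearest_prototype(query: Sequence[float], prototypes: Dict[str, List[float]]) -> str:
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical 2-D features for a two-class, two-shot task (illustrative only).
support_set = {
    "normal": [[0.0, 0.0], [0.0, 2.0]],
    "fault": [[4.0, 4.0], [6.0, 4.0]],
}
prototypes = class_prototypes(support_set)
prediction_a = nearest_prototype([1.0, 1.0], prototypes)
prediction_b = nearest_prototype([5.0, 5.0], prototypes)
```

With so few samples per class, a single outlier can shift a mean-based prototype noticeably, which is consistent with the abstract's motivation for refining prototype positions with unlabeled data.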

BDCore: Bidirectional Decoding with Co-graph Representation for Joint Entity and Relation Extraction

Relation extraction has become a crucial step in the automatic construction of Knowledge Graphs (KGs). Recently, researchers have leveraged Sequence-to-Sequence (Seq2Seq) models for Joint Entity and Relation Extraction (JERE). Nevertheless, traditional decoding methods generate the target sequence incrementally from left to right, without the ability to revise earlier predictions when errors occur. This limitation becomes evident when decoding errors arise before the current decoding step. Furthermore, the triplets originating from the same sentence are strongly correlated, a fact that has been overlooked. In this paper, we propose Bidirectional Decoding with Co-graph representation (BDCore) to address these issues. Specifically, we first introduce a backward decoder to decode the target sequence in reverse order. Then, the forward decoder uses two attention mechanisms to simultaneously consider the hidden states of the encoder and the backward decoder. Thus, the backward decoding information helps to alleviate the negative impact of forward decoding errors. In addition, we construct a relation co-occurrence graph (Co-graph) and exploit a Graph Convolutional Network (GCN) to capture the correlation among relations. Extensive experiments demonstrate the benefits of the proposed bidirectional decoding and co-graph representation for relation extraction. Our approach significantly outperforms the baselines on the NYT benchmark.

Improving dense retrieval models with LLM augmented data for dataset search

Data augmentation for training supervised models has achieved great results in different areas. With the popularity of Large Language Models (LLMs), a research area has emerged that focuses on applying LLMs to text data augmentation. This approach is particularly beneficial for low-resource tasks, where labeled data is very scarce. Dataset search is an information retrieval task that aims to retrieve relevant datasets based on user queries. However, the lack of labeled data tailored explicitly to this task makes it challenging to develop accurate retrieval models. In this paper, we use LLMs to create training examples for retrieval models in the dataset search task. Specifically, we propose a new pipeline that generates synthetic queries from dataset descriptions using LLMs. The query-description pairs, which we treat as soft matches for our task, are used to fine-tune dense retrieval approaches for re-ranking. We evaluated our pipeline using fine-tuned embedding models for semantic search over dataset search benchmarks (NTCIR and ACORDAR). We tuned these models on the dataset search task using the synthetic data generated by our solution and compared their performance with the original models. The results show that the models tuned on the synthetic data statistically outperform the baselines at different normalized discounted cumulative gain levels.
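
The evaluation above is reported in terms of normalized discounted cumulative gain (NDCG) at several cutoffs. For readers unfamiliar with the metric, the sketch below shows one common formulation, DCG@k = Σ rel_i / log2(i + 1) divided by the DCG of the ideal ordering. It assumes linear gains and computes the ideal ordering from the retrieved list only; the abstract does not say which variant the paper used, and the function names are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the first k results (ranks start at 1)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG@k normalised by the DCG@k of the same results in ideal order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Graded relevance of the ranked datasets returned for one hypothetical query.
perfect_ranking = ndcg_at_k([3, 2, 1], k=3)
swapped_ranking = ndcg_at_k([0, 1], k=2)
```

A ranking that already lists results in decreasing relevance scores 1.0, while placing the only relevant dataset second instead of first lowers the score to 1 / log2(3), roughly 0.63.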

Personalized Image Aesthetics Assessment based on Graph Neural Network and Collaborative Filtering

Personalized image aesthetics assessment aims to capture individual aesthetic preferences, which are influenced by image aesthetic attributes and user demographic attributes. The interaction of these attributes helps determine users' aesthetic preferences for images. We therefore define two forms of attribute interactions: external-interactions and internal-interactions. Existing models do not consider these two types of attribute interactions. To address this drawback, we propose a personalized image aesthetics assessment method based on a graph neural network and collaborative filtering, which models and aggregates the two types of attribute interactions in a graph structure to predict personalized image aesthetics scores. Firstly, we design an image aesthetic feature extraction phase that obtains aesthetic attributes and distributions based on the aesthetic assessment of mass images. Secondly, we propose an aesthetic prior model-building phase with two basic processes: learning the aesthetic features of images and users' aesthetic viewpoints, and learning users' preferences for images. This phase is accomplished through internal-interactions (using the graph's message passing mechanism) and external-interactions (using collaborative filtering). Finally, we fuse the post-interaction features and the image aesthetic distribution features for personalized image aesthetics assessment. Experimental results show that our method outperforms the state-of-the-art methods. Further studies verify the accuracy and validity of our model in providing improved prediction of users' aesthetic preferences.
