Abstract

Genome-scale metabolic models (GEMs) are powerful tools for predicting cellular metabolic and physiological states. However, there are still missing reactions in GEMs due to incomplete knowledge. Recent gaps filling methods suggest directly predicting missing responses without relying on phenotypic data. However, they do not differentiate between substrates and products when constructing the prediction models, which affects the predictive performance of the models. In this paper, we propose a hyperedge prediction model that distinguishes substrates and products based on dual-scale fused hypergraph convolution, DSHCNet, for inferring the missing reactions to effectively fill gaps in the GEM. First, we model each hyperedge as a heterogeneous complete graph and then decompose it into three subgraphs at both homogeneous and heterogeneous scales. Then we design two graph convolution-based models to, respectively, extract features of the vertices in two scales, which are then fused via the attention mechanism. Finally, the features of all vertices are further pooled to generate the representative feature of the hyperedge. The strategy of graph decomposition in DSHCNet enables the vertices to engage in message passing independently at both scales, thereby enhancing the capability of information propagation and making the obtained product and substrate features more distinguishable. The experimental results show that the average recovery rate of missing reactions obtained by DSHCNet is at least 11.7% higher than that of the state-of-the-art methods, and that the gap-filled GEMs based on our DSHCNet model achieve the best prediction performance, demonstrating the superiority of our method.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call