The complexity, scale, and uncertainty of regulatory networks (e.g., gene regulatory networks and microbial networks) introduce substantial uncertainty into their models. These uncertainties often cannot be fully reduced using the limited and costly data acquired under the normal operating conditions of such systems. Moreover, regulatory networks often suffer from non-identifiability, i.e., scenarios in which the true underlying network model cannot be clearly distinguished from other candidate models. Perturbation, or excitation, is a well-established technique in systems biology for acquiring targeted data that reveal the complex underlying mechanisms of regulatory networks and overcome non-identifiability. We consider a general class of Boolean network models that capture the activation and inactivation of components and their complex interactions. Assuming partial prior knowledge of the interactions among network components, this paper formulates the inference process through the maximum a posteriori (MAP) criterion. We develop a Bayesian lookahead policy that systematically perturbs regulatory networks to maximize the performance of MAP inference on the perturbed data. This is achieved by formulating the perturbation process in a reinforcement learning context and deriving a scalable deep reinforcement learning policy that computes a near-optimal Bayesian perturbation policy. The proposed method learns the perturbation policy through planning, without the need for any real data. The high performance of the proposed approach is demonstrated through comprehensive numerical experiments on the well-known mammalian cell cycle and gut microbial community networks.
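To make the setting concrete, the following is a minimal sketch of MAP inference for a Boolean network under perturbation. Everything here is an illustrative assumption, not the paper's actual networks or policy: a two-node network with synchronous updates, two hypothetical candidate rules for node 0 (node 1's rule is taken as known prior knowledge), an assumed per-node observation flip probability `p`, and a single hand-picked perturbation that forces the network into a state where the candidates disagree.

```python
import math

def step(state, model):
    """Synchronous Boolean update: apply each node's update rule to the state."""
    return tuple(f(state) for f in model)

# Hypothetical candidate models for node 0's regulation; node 1's rule
# (x1 <- x0) is assumed known.
candidates = {
    "x0 <- NOT x1": (lambda s: 1 - s[1], lambda s: s[0]),
    "x0 <- x1":     (lambda s: s[1],     lambda s: s[0]),
}

def trajectory(model, init, T):
    """Generate T synchronous transitions starting from the initial state."""
    states = [init]
    for _ in range(T):
        states.append(step(states[-1], model))
    return states

p = 0.05  # assumed probability that an observed node value is flipped

def log_likelihood(model, observed):
    """Log-probability of the observed transitions under a candidate model."""
    ll = 0.0
    for t in range(len(observed) - 1):
        pred = step(observed[t], model)
        for pr, ob in zip(pred, observed[t + 1]):
            ll += math.log(1 - p) if pr == ob else math.log(p)
    return ll

# Perturbation: force the network into state (1, 0), where the two candidates
# predict different successors, then observe 5 transitions generated
# (noise-free, for clarity) by the true model.
true_model = candidates["x0 <- NOT x1"]
obs = trajectory(true_model, (1, 0), 5)

# MAP criterion: combine a uniform prior over candidates with the
# data log-likelihood and pick the highest-scoring model.
log_prior = math.log(1 / len(candidates))
scores = {name: log_prior + log_likelihood(m, obs)
          for name, m in candidates.items()}
map_model = max(scores, key=scores.get)
```

The perturbation choice is what makes the data informative: starting from a state where all candidates predict the same successor would leave them indistinguishable, which is exactly the non-identifiability the proposed policy is designed to break.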