Abstract
Many real-world applications with graph data require both solving a regression task and identifying the subgraphs that are relevant to it. In these cases, graphs are commonly represented as high-dimensional binary vectors of subgraph indicators. However, because the dimensionality of such indicator vectors can be very large even for small datasets, traditional regression algorithms become intractable, and past approaches preselected a feasible subset of subgraphs. A different approach was recently proposed in the form of a Lasso-type method, in which the optimization of an objective function over a large number of variables is reformulated as a dual mathematical programming problem with a small number of variables but a large number of constraints. The dual problem is then solved by column generation, where the subgraphs corresponding to the most violated constraints are found by weighted subgraph mining. This paper extends that method to a Bayesian approach in which the regression parameters are treated as random variables and integrated out of the model likelihood, yielding a posterior distribution over the target variable rather than a point estimate. We focus on a linear regression model with a Gaussian prior on the parameters. We evaluate our approach on several molecular graph datasets and analyze whether the uncertainty in the target estimate, given by the variance of the target posterior distribution, can be used to improve model performance and therefore provides useful additional information.
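To make the Bayesian step concrete, the following is a minimal sketch of the textbook posterior predictive for linear regression with a zero-mean isotropic Gaussian prior, applied to binary subgraph-indicator vectors. It is an illustration under stated assumptions, not the paper's implementation: the function name `posterior_predictive`, the prior precision `alpha`, the noise precision `beta`, and the hand-written toy indicator matrix are all hypothetical, and the column generation and weighted subgraph mining steps are not shown.

```python
import numpy as np


def posterior_predictive(X, y, X_new, alpha=1.0, beta=25.0):
    """Predictive mean and variance for Bayesian linear regression.

    Assumed model (not taken from the paper): w ~ N(0, I / alpha) and
    y = X w + noise, with Gaussian noise of precision beta.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    d = X.shape[1]

    # Posterior over the weights: N(m, S), with S^{-1} = alpha I + beta X^T X.
    S = np.linalg.inv(alpha * np.eye(d) + beta * X.T @ X)
    m = beta * S @ X.T @ y

    # Integrating the weights out gives a Gaussian over each new target:
    # mean x^T m and variance 1/beta + x^T S x.
    mean = X_new @ m
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", X_new, S, X_new)
    return mean, var


# Toy data: rows are graphs, columns are indicators for three subgraphs.
# The third subgraph never occurs in the training graphs.
X = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 0],
              [1, 0, 0]])
y = np.array([1.0, 3.0, 2.0, 1.1])

# First test graph contains a well-observed subgraph, second only the unseen one.
X_new = np.array([[1, 0, 0],
                  [0, 0, 1]])
mean, var = posterior_predictive(X, y, X_new)
```

In this toy setting the second test graph, which contains only a subgraph never seen during training, receives a larger predictive variance than the first. This is the kind of uncertainty signal the abstract proposes to examine.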