Fast and interpretable genomic data analysis using multiple approximate kernel learning.

Ayyüce Begüm Bektaş, Çiğdem Ak, Mehmet Gönen

doi:10.1093/bioinformatics/btac241

Open Access

https://doi.org/10.1093/bioinformatics/btac241

Abstract

Motivation: Dataset sizes in computational biology have increased drastically thanks to improved data collection tools and growing patient cohorts. Earlier kernel-based machine learning algorithms, proposed for increased interpretability, began to fail at large sample sizes because they do not scale. To overcome this problem, we propose a fast and efficient multiple kernel learning (MKL) algorithm, intended particularly for large-scale data, that integrates kernel approximation and group Lasso formulations into a conjoint model. Our method extracts significant and meaningful information from genomic data while conjointly learning a model for out-of-sample prediction. It scales with increasing sample size by approximating distinct kernel matrices instead of calculating them.

Results: To test our computational framework, Multiple Approximate Kernel Learning (MAKL), we ran experiments on three cancer datasets and showed that MAKL can outperform the baseline algorithm while using only a small fraction of the input features. We also report the selection frequencies of the approximated kernel matrices associated with feature subsets (i.e. gene sets/pathways), which helps to show their relevance for the given classification task. Our fast and interpretable MKL algorithm, which produces sparse solutions, is promising for computational biology applications given its scalability and the highly correlated structure of genomic datasets, and it can be used to discover new biomarkers and new therapeutic guidelines.

Availability and implementation: MAKL is available at https://github.com/begumbektas/makl together with the scripts that replicate the reported experiments. MAKL is also available as an R package at https://cran.r-project.org/web/packages/MAKL.

Supplementary information: Supplementary data are available at Bioinformatics online.
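The combination described in the abstract, one approximated kernel per feature subset with a group Lasso penalty over those subsets, can be illustrated with a small sketch. This is not the authors' implementation (the R package linked above is the reference). It assumes random Fourier features as the kernel approximation, a Gaussian kernel, a logistic loss and a plain proximal gradient solver, none of which the abstract specifies. The function names, the toy data and the `gene_sets` index arrays are hypothetical.

```python
import numpy as np

def random_fourier_features(X, n_components, gamma, rng):
    """Map X to Z so that Z @ Z.T approximates exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_components))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_components)
    return np.sqrt(2.0 / n_components) * np.cos(X @ W + b)

def group_soft_threshold(w, threshold):
    """Proximal operator of threshold * ||w||_2: shrink the block or zero it out."""
    norm = np.linalg.norm(w)
    if norm <= threshold:
        return np.zeros_like(w)
    return (1.0 - threshold / norm) * w

def fit_group_lasso_logistic(Z_blocks, y, lam, n_iter=300):
    """Logistic regression with a group Lasso penalty, one group per block.

    Assumption: labels y are in {0, 1}; the solver is proximal gradient descent.
    """
    Z = np.hstack(Z_blocks)
    n = Z.shape[0]
    sizes = [B.shape[1] for B in Z_blocks]
    edges = np.cumsum([0] + sizes)
    w = np.zeros(Z.shape[1])
    b = 0.0
    # Step size 1/L, where L bounds the curvature of the mean logistic loss
    # (intercept column included in the bound).
    A = np.hstack([Z, np.ones((n, 1))])
    step = 4.0 * n / (np.linalg.norm(A, 2) ** 2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        w = w - step * (Z.T @ (p - y) / n)
        b = b - step * np.mean(p - y)
        # The block-wise shrinkage is what removes whole feature subsets.
        for g, size in enumerate(sizes):
            s = slice(edges[g], edges[g + 1])
            w[s] = group_soft_threshold(w[s], step * lam * np.sqrt(size))
    return w, b, edges

# Toy data (hypothetical): 30 features in 3 "gene sets"; only the first is informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
y = (np.sin(X[:, 0]) + X[:, 1] ** 2 > 1.0).astype(float)
gene_sets = [np.arange(0, 10), np.arange(10, 20), np.arange(20, 30)]

blocks = [random_fourier_features(X[:, idx], 50, 0.1, rng) for idx in gene_sets]
w, b, edges = fit_group_lasso_logistic(blocks, y, lam=0.02)
group_norms = [np.linalg.norm(w[edges[g]:edges[g + 1]]) for g in range(len(gene_sets))]
print(group_norms)  # a norm of exactly 0 means that gene set was dropped
```

Each block has a fixed number of columns regardless of sample size, so no n-by-n kernel matrix is ever formed. The per-group weight norms play the role of the kernel selection described in the abstract. How many groups survive depends on `lam` and on the data.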

Journal: Bioinformatics	Publication Date: Jun 24, 2022
Citations: 1	License type: CC BY 4.0
