A Pathway-Based Kernel Boosting Method for Sample Classification Using Genomic Data.

Li Zeng, Zhaolong Yu, Hongyu Zhao

doi:10.3390/genes10090670

Abstract

The analysis of cancer genomic data has long suffered from “the curse of dimensionality.” Sample sizes for most cancer genomic studies are a few hundred at most, while tens of thousands of genomic features are studied. Various methods have been proposed to leverage prior biological knowledge, such as pathways, to analyze cancer genomic data more effectively. Most of these methods focus on testing the marginal significance of the associations between pathways and clinical phenotypes. They can identify informative pathways but do not involve predictive modeling. In this article, we propose a Pathway-based Kernel Boosting (PKB) method for integrating gene pathway information for sample classification, where we use kernel functions calculated from each pathway as base learners and learn the weights through iterative optimization of the classification loss function. We apply PKB and several competing methods to three cancer studies with pathological and clinical information, including tumor grade, stage, tumor site and metastasis status. Our results show that PKB outperforms other methods and identifies pathways relevant to the outcome variables.

Highlights

High-throughput genomic technologies have enabled cancer researchers to study the associations between genes and clinical phenotypes of interest.
We propose a Pathway-based Kernel Boosting (PKB) method for sample classification.
Motivated by Nonparametric Pathway-based Regression (NPR) and GAR, we propose the PKB model, in which we employ kernel functions as base learners, optimize the loss function with a second-order approximation [19], which gives Newton-like descent speed, and incorporate regularization in the selection of pathways in each boosting iteration.
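
The boosting scheme summarized in the highlights can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the RBF kernel, the ridge-type penalty on each increment, the fixed learning rate, the logistic loss, the synthetic data, and all names (`fit_pkb_sketch`, `predict_pkb_sketch`, `lam`, `lr`) are choices made for this example only. At each iteration it fits one kernel base learner per pathway to a second-order approximation of the loss and keeps the pathway that lowers the penalized approximation the most.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def log_loss(y, F):
    """Mean logistic loss for labels y in {-1, +1} and real-valued scores F."""
    return float(np.mean(np.logaddexp(0.0, -y * F)))

def fit_pkb_sketch(X, y, pathways, n_iter=20, lam=1.0, lr=0.1):
    """Hypothetical sketch: pathways maps a pathway name to a list of gene columns."""
    n = X.shape[0]
    # One kernel per pathway, computed only from that pathway's genes.
    kernels = {m: rbf_kernel(X[:, idx], X[:, idx], 1.0 / len(idx))
               for m, idx in pathways.items()}
    F = np.zeros(n)                 # current score for each training sample
    steps = []                      # (pathway name, scaled coefficients) per iteration
    losses = [log_loss(y, F)]
    for _ in range(n_iter):
        g = -y / (1.0 + np.exp(y * F))          # first derivative of the loss in F
        p = 1.0 / (1.0 + np.exp(-F))
        h = np.maximum(p * (1.0 - p), 1e-6)     # second derivative of the loss in F
        best = None
        for m, K in kernels.items():
            # Minimize g'Kb + 0.5 b'KHKb + lam b'b over b (assumed ridge penalty).
            A = K @ (h[:, None] * K) + 2.0 * lam * np.eye(n)
            Kg = K @ g
            beta = -np.linalg.solve(A, Kg)
            q = 0.5 * float(Kg @ beta)          # value of the objective at its minimum
            if best is None or q < best[0]:
                best = (q, m, beta)
        _, m, beta = best
        F = F + lr * (kernels[m] @ beta)        # damped, Newton-like update
        steps.append((m, lr * beta))
        losses.append(log_loss(y, F))
    return {"X": X, "pathways": pathways, "steps": steps, "losses": losses}

def predict_pkb_sketch(model, X_new):
    """Sum the selected base learners, evaluated between new and training samples."""
    scores = np.zeros(X_new.shape[0])
    for m, beta in model["steps"]:
        idx = model["pathways"][m]
        K_new = rbf_kernel(X_new[:, idx], model["X"][:, idx], 1.0 / len(idx))
        scores += K_new @ beta
    return scores

# Synthetic example: only the "signal" pathway carries class information.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = np.where(X[:, :3].sum(axis=1) > 0, 1.0, -1.0)
pathways = {"signal": [0, 1, 2], "noise": [3, 4, 5]}
model = fit_pkb_sketch(X[:150], y[:150], pathways)
scores = predict_pkb_sketch(model, X[150:])
selected = [m for m, _ in model["steps"]]
accuracy = float(np.mean(np.sign(scores) == y[150:]))
print(selected[0], model["losses"][0], model["losses"][-1], accuracy)
```

Counting how often each pathway appears in `selected` gives a rough indication of which pathways the fitted model relies on, which is the kind of interpretability the pathway-level base learners are meant to provide.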

Summary

Introduction

High-throughput genomic technologies have enabled cancer researchers to study the associations between genes and clinical phenotypes of interest. A large number of cancer genomic data sets have been collected with both genomic and clinical information from the patients. The analyses of these data have yielded valuable insights into cancer mechanisms, subtypes, prognosis and treatment response. Although many methods have been developed to identify genes informative of clinical phenotypes and to build prediction models from these data, it is often difficult to interpret the results of single-gene focused approaches, as one gene is often involved in multiple biological processes, and the results are not robust when the signals from individual genes are weak. A pathway can be considered as a set of genes that are involved in the same biological process or molecular function. There are many pathway databases available, such as the Kyoto

Journal: Genes	Publication Date: Aug 31, 2019
Citations: 3	License type: CC BY 4.0

Similar Papers

Abstract 2779: Rapid drug target ranking system developed from a systematic analysis of cancer genomic data from the Oncomine™ knowledgebase identifies an oncogenic role for the NFE2L2 pathway in multiple cancer types
Sean Eddy ... Mary Ellen Urick
Cancer Research | VOL. 74
30 Sep 2014

Abstract 250: UCSC Xena for the visualization and analysis of cancer genomics data
Mary J Goldman ... David Haussler
01 Jul 2021

Identification of novel prognostic indicators for triple-negative breast cancer patients through integrative analysis of cancer genomics data and protein interactome data.
Fan Zhang ... Hengqiang Zhao
Oncotarget | VOL. 7
27 Sep 2016

A Tensor Robust Model Based on Enhanced Tensor Nuclear Norm and Low-Rank Constraint for Multi-view Cancer Genomics Data
Qian Qiao ... Jin-Xing Liu
01 Jan 2021