Abstract

The objectives of this Perspective paper are to review some recent advances in sparse feature selection for regression and classification, as well as compressed sensing, and to discuss how these might be used to develop tools to advance personalized cancer therapy. As an illustration of the possibilities, a new algorithm for sparse regression is presented and is applied to predict the time to tumour recurrence in ovarian cancer. A new algorithm for sparse feature selection in classification problems is presented, and its validation in endometrial cancer is briefly discussed. Some open problems are also presented.

Highlights

  • One of the motivations for writing this paper is to present a broad picture of some recent advances in machine learning to the more mathematically inclined members of the cancer biology community, and to apply some of these techniques to two problems

Summary

Introduction

The objectives of this Perspective paper are to review some recent advances in sparse feature selection for regression and classification, and to discuss how these might be used in the computational biology of cancer. There are four major types of breast cancer, known as luminal A, luminal B, non-luminal and basal type. These subtypes are defined by whether the expression levels of the genes for oestrogen receptor, progesterone receptor and HER2 (also known as ERBB2) are high or low. In the TCGA data, molecular measurements are available for almost all tumours, and clinical annotations are available for many tumours. With such a wealth of data becoming freely available, researchers in the machine learning community can aspire to make useful contributions to cancer biology without the need to undertake any experimentation themselves. Throughout this paper, it is assumed that $X \in \mathbb{R}^{m \times n}$, that is, that each measurement is a real number. Binary measurements, such as the presence or absence of mutations, are usually handled by partitioning the sample set into two groups, corresponding to the two labels. The objective is to find a function $f : \mathbb{R}^n \to \mathbb{R}$ or $f : \mathbb{R}^n \to \{-1, 1\}$ such that $y_i$ is well approximated by $f(x_i)$.
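
The regression form of this objective can be illustrated with a minimal sketch. The paper's own algorithm is not described in this summary, so the example below uses the standard LASSO ($\ell_1$-penalized least squares), solved by proximal gradient descent, purely as a stand-in. The function names `soft_threshold` and `lasso_ista`, the penalty weight `lam`, and the synthetic data are all assumptions made for illustration. The rows of `X` are samples and the columns are features, matching $X \in \mathbb{R}^{m \times n}$ with far fewer samples than features.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=2000):
    """Minimise (1/(2m)) * ||y - X w||_2^2 + lam * ||w||_1 by proximal gradient steps.

    Illustrative stand-in only; this is not the algorithm proposed in the paper.
    """
    m, n = X.shape
    w = np.zeros(n)
    # Step size 1/L, where L bounds the curvature of the least-squares term.
    L = np.linalg.norm(X, 2) ** 2 / m
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / m          # gradient of the smooth term
        w = soft_threshold(w - grad / L, lam / L)
    return w


# Hypothetical synthetic data: m = 50 samples, n = 200 features, 3 relevant ones.
rng = np.random.default_rng(0)
m, n = 50, 200
X = rng.standard_normal((m, n))
w_true = np.zeros(n)
w_true[[3, 47, 120]] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.01 * rng.standard_normal(m)

w_hat = lasso_ista(X, y, lam=0.1)
selected = np.flatnonzero(np.abs(w_hat) > 1e-6)
print("selected feature indices:", selected)
```

The fitted predictor is $f(x) = w^\top x$, and the features with nonzero weight are the selected ones. In practice `lam` would be chosen by cross-validation rather than fixed in advance.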

Regression methods
Compressed sensing
Classification methods
[Table: 2×2 contingency table. Row M1: TP, FN. Row M2: FP, TN. Only the column label C2 survives.]
Findings
Some topics for further research