Abstract

It is known that all agents that cause cancer (carcinogens) also cause a change in the DNA sequence. To identify such changes, which are often subtle, we attempt to integrate multiple molecular profile data sets released by the International Cancer Genome Consortium (ICGC). The data sets include matched gene and microRNA expression profiles, somatic copy number variation, DNA methylation, and protein expression profiles for lung adenocarcinoma patients receiving treatment. We consider both unsupervised and supervised learning techniques (clustering and penalized regression, respectively) to identify, within each type of –omics profile, molecular markers that can differentiate patients. Associations between important markers of two types are also studied. We present an adaptive ensemble binary regression model that uses the entirety of the available –omics profiles, leading to a more accurate clinical prognosis for the patients in the given sample. This integrated study provides a more com...
