Abstract
The ability to accurately classify disease subtypes is of vital importance, especially in oncology, where this capability could have a life-saving impact. Here we report a classification between two subtypes of non-small cell lung cancer, namely adenocarcinoma versus squamous cell carcinoma. The data consist of approximately 20,000 gene expression values for each of 104 patients. The data were curated from Kuner (Geo lung cancer data set - gse10245, 2009, http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE10245 ) and Golumbic (Geo lung cancer data set - gse18842, 2010, http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE18842 ). We used a combination of classical and quantum machine learning models to classify these patients. For feature selection, we used methods based on univariate statistics in addition to XGBoost (Guestrin and Chen in Xgboost: a scalable tree boosting system, 2007, https://dl.acm.org/citation.cfm?id=1273596 ). We also used QCrush, a novel, proprietary data representation method developed by one of the authors, which was designed to incorporate as much information as possible under the size constraints of the D-Wave quantum annealing computer. The machine learning itself was performed by a quantum Boltzmann machine. This paper reports our results, the classical methods, and the quantum machine learning approach we used.
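As a rough illustration of the univariate feature selection step mentioned above, the following minimal sketch ranks genes by a two-class test statistic and keeps the top few. It is not the authors' pipeline: the abstract does not say which statistic was used, so Welch's t-statistic is an assumption, the function names `welch_t` and `top_k_genes` are hypothetical, and the data are synthetic stand-ins rather than the GEO data sets. The XGBoost, QCrush, and quantum Boltzmann machine stages are not shown.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic between two samples (unequal variances allowed)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def top_k_genes(expr, labels, k):
    """Rank genes by |t| between the two classes; return the top-k gene indices.

    expr:   list of per-patient expression vectors (patients x genes)
    labels: list of 0/1 class labels, one per patient
    """
    n_genes = len(expr[0])
    scores = []
    for g in range(n_genes):
        class0 = [row[g] for row, y in zip(expr, labels) if y == 0]
        class1 = [row[g] for row, y in zip(expr, labels) if y == 1]
        scores.append(abs(welch_t(class0, class1)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


# Synthetic stand-in data (assumption, NOT the GEO data): 104 patients and
# 200 genes, where genes 0-4 are shifted upward in class 1.
rng = random.Random(0)
labels = [i % 2 for i in range(104)]
informative = {0, 1, 2, 3, 4}
expr = [
    [rng.gauss(3.0 if (g in informative and y == 1) else 0.0, 1.0)
     for g in range(200)]
    for y in labels
]

selected = top_k_genes(expr, labels, k=5)
print(sorted(selected))
```

In a setting like the one described, a filter of this kind would reduce roughly 20,000 expression values per patient to a much smaller set before any further representation or training step; the cutoff `k` here is arbitrary.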