Abstract

Breast cancer, one of the most common diseases threatening women's lives, has attracted serious attention from clinical and biomedical researchers worldwide. Genome-based studies, along with their registered GEO datasets, are frequent in the literature. Because several methodologies have been developed for analyzing and identifying gene biomarkers, it is necessary to evaluate their robustness. In this study, three well-known biomarker identification methods (ClusterOne, MCODE, and BioDiscML) were employed to identify potential biomarkers. The methods were then evaluated and ranked using nonlinear classification models built on the identified sets of biomarkers. A combined breast cancer (BC) microarray dataset consisting of GSE124647, GSE124646, and GSE15852 was used as the training set, and two test datasets, GSE15852 and GSE25066, were used to measure the performance of the trained models. The models were validated internally (leave-one-out, fivefold, and tenfold cross-validation, random sampling, and testing on the training set) and externally (testing on the test sets). The results showed that ClusterOne, MCODE, and BioDiscML ranked first, second, and third, respectively, based on the area under the curve (AUC), accuracy, F1 score, precision, and recall. Overall, validating a biomarker identification approach should take two things into account simultaneously: the descriptive value of the identified gene biomarkers in terms of their biological aspects, and the predictive power of the models developed from those biomarkers.
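
For readers unfamiliar with the five ranking metrics named above, the following is a minimal sketch of how they can be computed for a binary classifier. It is not the study's code: the function names (`confusion_counts`, `classification_metrics`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 score from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from any GEO dataset).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = roc_auc(y_true, scores)
print(metrics)
```

Note that AUC is computed from the continuous scores, whereas the other four metrics depend on the chosen threshold, so a method can rank differently on AUC than on the threshold-dependent metrics.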
