Abstract
Cancer has a serious impact on people's lives, but state-of-the-art machine learning approaches have the potential to lower cancer death rates by informing prevention and treatment strategies. Supervised learning methods have been used successfully to give early warning of cancer from gene expression data, but the most prominent challenge is the shortage of labeled samples in biological data, especially in cancer datasets, which can cause the trained model to over-fit. Semi-supervised approaches such as self-training are therefore designed to improve model performance by exploiting unlabeled samples, which outnumber the labeled samples in some biological datasets. Under the curriculum learning paradigm, high-quality pseudo-labeled samples are selected to improve the model, and self-training meets this requirement: high-quality samples are pseudo-labeled and then added to the training set to increase the number of available labeled samples, and the cycle continues until all unlabeled samples are annotated. Ensemble learning is a widely used method that improves classification performance by combining multiple weak classifiers, and it can help self-training learn the initial classifiers from the labeled samples. In this study, an ensemble self-training learning (ESTL) method is therefore proposed to select high-quality unlabeled samples more effectively and to improve the robustness of the model. Compared with other traditional classification algorithms, the proposed ESTL approach improved classification performance by about 5% on a real cancer dataset.
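The loop described above (train an ensemble on the labeled samples, pseudo-label the most confident unlabeled samples, add them to the training set, repeat until none remain) can be sketched as follows. This is only a minimal illustration and not the authors' ESTL implementation: the abstract does not specify the base classifiers, the confidence measure, or the batch size, so the nearest-centroid weak learner, the per-class bootstrap, the vote-share confidence, and every function and parameter name here (`fit_ensemble`, `vote`, `ensemble_self_training`, `n_models`, `batch`) are assumptions chosen for brevity.

```python
import random
from collections import Counter

def fit_centroids(X, y):
    """Assumed weak learner: one mean vector per class (nearest centroid)."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def predict_centroid(model, x):
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(model[c], x))
    return min(model, key=dist2)

def fit_ensemble(X, y, n_models, rng):
    """Train n_models weak learners, each on a per-class bootstrap resample."""
    by_class = {}
    for i, yi in enumerate(y):
        by_class.setdefault(yi, []).append(i)
    models = []
    for _ in range(n_models):
        # Resample with replacement inside each class so every learner
        # sees every class, even when labeled data is scarce.
        idx = [rng.choice(members)
               for members in by_class.values() for _ in members]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models

def vote(models, x):
    """Majority vote; confidence is the share of learners that agree."""
    votes = Counter(predict_centroid(m, x) for m in models)
    label, count = votes.most_common(1)[0]
    return label, count / len(models)

def ensemble_self_training(X_lab, y_lab, X_unlab, n_models=11, batch=2, seed=0):
    """Pseudo-label the unlabeled pool, most confident samples first."""
    rng = random.Random(seed)
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    while pool:
        models = fit_ensemble(X_lab, y_lab, n_models, rng)
        scored = [(vote(models, x), i) for i, x in enumerate(pool)]
        # Curriculum-style selection: take the `batch` most confident samples.
        scored.sort(key=lambda t: t[0][1], reverse=True)
        chosen = scored[:batch]
        for (label, _), i in chosen:
            X_lab.append(pool[i])
            y_lab.append(label)
        taken = {i for _, i in chosen}
        pool = [x for i, x in enumerate(pool) if i not in taken]
    # Final ensemble trained on the original plus pseudo-labeled samples.
    return fit_ensemble(X_lab, y_lab, n_models, rng), X_lab, y_lab
```

Taking a fixed number of top-ranked samples per round, rather than applying a confidence threshold, is a design choice made here so that the loop always terminates with every unlabeled sample annotated, as the abstract describes. A threshold-based variant would need a separate stopping rule for rounds in which no sample qualifies.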