Abstract

Identifying valuable features in complex omics data is of great significance for disease diagnosis. This paper proposes a new feature selection algorithm based on sample networks (FS-SN) to mine important information from omics data. The sample network is constructed from the neighbor relationships among samples at the molecular (feature) expression level, and the discriminating ability of a feature is evaluated from the topology of the sample network. A sample network built on a feature with strong discriminating ability tends to have many edges between samples of the same group and few edges between samples of different groups. In addition, FS-SN removes redundant features according to the gravitational interaction between features. To validate FS-SN, it was compared on ten public datasets with ERGS, mRMR, ReliefF, ATSD-DN, and INDEED, which are efficient methods for omics data analysis. Experimental results show that FS-SN outperformed the compared methods in accuracy, sensitivity, and specificity in most cases. Hence, FS-SN, which makes use of the topology of the sample network, is effective for analyzing omics data: it can identify key features that reflect the occurrence and development of diseases and reveal the underlying biological mechanisms.
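The idea that a discriminating feature yields a sample network with mostly same-group edges can be illustrated with a minimal sketch. The abstract does not specify how neighbors are defined or how the topology is scored, so everything below is an assumption made for illustration: a k-nearest-neighbor network built on a single feature's values, and a score equal to the fraction of edges that join samples of the same group. The function names `sample_network_edges` and `same_group_edge_ratio` and the parameter `k` are hypothetical, and the gravitational redundancy-removal step is not covered.

```python
import numpy as np


def sample_network_edges(values, k=2):
    """Build an undirected k-nearest-neighbour network over samples.

    Assumption: two samples are neighbours when one is among the k closest
    samples to the other in terms of a single feature's expression value.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Pairwise absolute differences in expression for this one feature.
    dist = np.abs(values[:, None] - values[None, :])
    np.fill_diagonal(dist, np.inf)  # exclude self-loops
    edges = set()
    for i in range(n):
        # Stable sort so ties are broken by sample index.
        neighbours = np.argsort(dist[i], kind="stable")[:k]
        for j in neighbours:
            j = int(j)
            edges.add((min(i, j), max(i, j)))
    return edges


def same_group_edge_ratio(values, labels, k=2):
    """Fraction of network edges that connect samples of the same group.

    Assumption: this ratio stands in for the topology-based score; the
    scoring actually used by FS-SN may differ.
    """
    edges = sample_network_edges(values, k)
    same = sum(1 for i, j in edges if labels[i] == labels[j])
    return same / len(edges)


# Toy data: six samples, two groups of three.
labels = [0, 0, 0, 1, 1, 1]
separating_feature = [0.1, 0.2, 0.3, 5.1, 5.2, 5.3]  # tracks the groups
mixed_feature = [0.1, 5.1, 0.2, 5.2, 0.3, 5.3]       # ignores the groups

print(same_group_edge_ratio(separating_feature, labels))
print(same_group_edge_ratio(mixed_feature, labels))
```

Under these assumptions, a feature whose values separate the two groups receives a higher ratio than one whose values are unrelated to the group labels, so features could be ranked by this score before any redundancy filtering.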
