Abstract

Abstract In clinical medicine, multidimensional time series data can be used to find the rules of disease progress by data mining technology, such as classification and prediction. However, in multidimensional time series data mining problems, the excessive data dimension causes the inaccuracy of probability density distribution to increase the computational complexity. Besides, information redundancy and irrelevant features may lead to high computational complexity and over-fitting problems. The combination of these two factors can reduce the classification performance. To reduce computational complexity and to eliminate information redundancies and irrelevant features, we improved upon a multidimensional time series feature selection method to achieve dimension reduction. The improved method selects features through the combination of the Kozachenko–Leonenko (K–L) information entropy estimation method for feature extraction based on mutual information and the feature selection algorithm based on class separability. We performed experiments on the Electroencephalogram (EEG) dataset for verification and the non-small cell lung cancer (NSCLC) clinical dataset for application. The results show that with the comparison of CLeVer, Corona and AGV, respectively, the improved method can effectively reduce the dimensions of multidimensional time series for clinical data.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.