Abstract

Identifying a good feature representation for high-dimensional datasets is a challenge of prime importance. Many feature selection techniques and distance metrics exist, however, which makes it difficult to identify the one best suited to a given task. This paper provides an algorithm for designing high-order distance metrics over a sparse selection of features, dedicated to classification. Our approach is based on Conditional Random Field (CRF) energy minimization and Dual Decomposition, which make it efficient and highly flexible in the features it can consider. The optimization technique keeps high-dimensional problems with hundreds of features and samples tractable. We evaluate our approach on synthetic data as well as on Covid-19 patient stratification. Comparing the classification results of our method against state-of-the-art baselines demonstrates the relevance of the learned metric.

