Abstract

Identifying a good feature representation for high-dimensional datasets is a challenge of prime importance. However, the abundance of feature selection techniques and distance metrics makes it difficult to identify the one best suited to a given task. This paper presents an algorithm for designing high-order distance metrics over a sparse selection of features, dedicated to classification. Our approach is based on Conditional Random Field (CRF) energy minimization and Dual Decomposition, which provide efficiency and great flexibility in the features considered. The optimization technique keeps high-dimensional problems with hundreds of features and samples tractable. Our approach is evaluated on synthetic data as well as on Covid-19 patient stratification. Comparisons of classification results between our proposed method and state-of-the-art baselines demonstrate the relevance of the learned metric.

