Abstract
Representing the categorical values in mixed data as numerical vectors that capture their intrinsic features is an important and challenging task; it requires revealing the complex coupling relationships among categorical values, attributes, and samples. Most existing studies either explore a single coupling relationship in depth or fuse multiple coupling relationships through shallow, kernel-based learning. The former may fail to fully mine the essential features of categorical data. The latter typically suffers from limitations such as difficulty in expanding the spatial structure and in determining the optimal kernel function. This paper therefore proposes Multi-view Deep Metric Learning for Categorical Representation on mixed data (MvDML-CR). First, following the principle of information complementarity, multiple coupled views are extracted from the complex interactions of the categorical data. Then, in each coupled view, a new proxy loss function is designed to build a strongly separable deep metric learning sub-model, which represents the categorical values as discriminative numerical vectors. Finally, the Hilbert–Schmidt independence criterion is employed to maximize the dependency between views, and the sub-models trained in the different views are fused to enhance the complementarity and consistency of the categorical representations. Extensive experiments on 34 mixed datasets with diverse characteristics demonstrate that MvDML-CR achieves significantly better classification performance than state-of-the-art competitors.
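To make the dependency measure mentioned above concrete, the following is a minimal sketch of the standard biased empirical estimator of the Hilbert–Schmidt independence criterion, HSIC = trace(KHLH) / (n - 1)^2, applied to two sets of embeddings. It is not the paper's implementation: the abstract does not specify the kernel, the estimator, or how the term enters the training objective, so the Gaussian kernel, the bandwidth `sigma`, and the function names `rbf_gram`, `center`, and `hsic` are all assumptions made for illustration.

```python
import math
import random


def rbf_gram(points, sigma=1.0):
    """Gaussian (RBF) Gram matrix for a list of equal-length vectors.

    The kernel choice and bandwidth are assumptions, not taken from the paper.
    """
    def k(a, b):
        d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-d2 / (2.0 * sigma ** 2))
    return [[k(a, b) for b in points] for a in points]


def center(gram):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(view_a, view_b, sigma=1.0):
    """Biased empirical HSIC between two views holding the same n samples."""
    n = len(view_a)
    kc = center(rbf_gram(view_a, sigma))
    lc = center(rbf_gram(view_b, sigma))
    # H is idempotent, so trace(K H L H) equals trace((H K H)(H L H)).
    total = sum(kc[i][j] * lc[j][i] for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


# Hypothetical embeddings of the same 60 samples in two views.
rng = random.Random(0)
view_1 = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)]
view_2 = [[a + 0.1 * rng.gauss(0, 1), b] for a, b in view_1]  # depends on view_1
unrelated = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)]

print("dependent views:", hsic(view_1, view_2))
print("unrelated views:", hsic(view_1, unrelated))
```

Larger values indicate stronger statistical dependence between the two sets of embeddings, so a multi-view model that aims to maximize cross-view dependency would typically add the negative of such a term to its loss. The pure-Python loops are quadratic in the number of samples and are meant only for readability.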