Abstract

Cross-species or cross-platform data classification is a challenging problem in bioinformatics: the goal is to classify data samples from one species or platform using labeled samples from another. Traditional classification methods cannot be applied directly in this setting, because samples from the two species or platforms may have different feature spaces or follow different statistical distributions. Domain adaptation is a new strategy that can be used to deal with this problem. A major challenge in domain adaptation is how to reduce the difference and correct the drift between the source and target domains in the heterogeneous case, where the feature spaces of the two domains differ. It has been shown theoretically that probability divergences between the two domains, such as maximum mean discrepancy (MMD), play an important role in the generalization bound for domain adaptation. However, they are rarely used for heterogeneous domain adaptation because the domains have different feature spaces. In this work, we propose a heterogeneous domain adaptation approach based on MMD, which measures the probability divergence in an embedded low-dimensional common subspace. Our proposed discriminative heterogeneous MMD approach (DMMD) finds new representations of the samples in a common subspace by minimizing the domain probability divergence while preserving the known discriminative information. A conjugate gradient algorithm on a Grassmann manifold is applied to solve the nonlinear DMMD model. Our experiments on both simulated and benchmark machine learning datasets show that our approach outperforms other state-of-the-art approaches for heterogeneous domain adaptation. Finally, we apply our approach to a cross-platform dataset and a cross-species dataset, and the results show its effectiveness.
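
To make the idea of measuring MMD in a common subspace concrete, the following is a minimal sketch and not the DMMD model itself. It assumes two fixed linear projections with orthonormal columns (one per domain), an RBF kernel, and the biased empirical estimate of squared MMD. The names `rbf_kernel`, `mmd2`, `ws`, `wt`, and `gamma` are illustrative assumptions. The discriminative term and the conjugate gradient optimization on the Grassmann manifold described in the abstract are not shown.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances, clipped at 0 to absorb round-off.
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(xs, xt, ws, wt, gamma=1.0):
    """Biased empirical squared MMD between projected source and target samples.

    xs: (n_s, d_s) source samples, xt: (n_t, d_t) target samples.
    ws: (d_s, k) and wt: (d_t, k) projections into a shared k-dimensional
    subspace (assumed fixed here; a learned model would optimize them).
    """
    zs = xs @ ws  # source samples in the common subspace
    zt = xt @ wt  # target samples in the common subspace
    k_ss = rbf_kernel(zs, zs, gamma).mean()
    k_tt = rbf_kernel(zt, zt, gamma).mean()
    k_st = rbf_kernel(zs, zt, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st
```

As a usage sketch, projections with orthonormal columns can be drawn with `np.linalg.qr` on a random matrix of shape `(d, k)`. A source domain with 6 features and a target domain with 4 features can then be compared in a 2-dimensional subspace even though their original feature spaces differ.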
