Abstract

Eigendecomposition is the factorization of a matrix into its canonical form, in which the matrix is represented in terms of its eigenvalues and eigenvectors. A common preprocessing step in machine learning is the reduction of the data to a kernel matrix, also known as a Gram matrix. A significant drawback of kernel methods is the computational complexity associated with manipulating kernel matrices. This paper demonstrates that the leading eigenvectors derived from singular value decomposition (SVD) and the Nyström approximation can be used for classification tasks without the need to construct Gram matrices. Experiments were conducted on 14 biomedical datasets to compare classifier performance when the input to the classifier was: 1) the matrix of leading eigenvectors produced by each approximation method, and 2) the patient-by-patient Gram matrix. The results support the main hypothesis of this paper, namely that using the leading eigenvectors as classifier input significantly (p < 0.05) improves classifier performance, in terms of both accuracy and time, compared to using Gram matrices. Further experiments were carried out on large multi-modal mHealth time-series datasets from ten subjects with diverse profiles performing several physical activities; these experiments used a sequential deep learning model. The significance of the proposed approach is that it makes feature extraction methods more accessible for the large-scale unimodal and multi-modal data that are becoming common in many applications.
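
As a minimal sketch of the comparison described above (not the paper's exact pipeline: the dataset, RBF kernel, classifier and number of components below are illustrative assumptions), the two kinds of classifier input can be contrasted in Python with scikit-learn, using TruncatedSVD and Nystroem features on one hand and a precomputed Gram matrix on the other:

```python
# Sketch only: compare a classifier trained on leading components from
# truncated SVD / Nystroem features with one trained on the full
# precomputed RBF Gram matrix. Dataset, kernel and rank are assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.decomposition import TruncatedSVD
from sklearn.kernel_approximation import Nystroem
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
k = 20  # assumed number of leading components

# (1a) leading components from a truncated SVD of the data matrix
svd = TruncatedSVD(n_components=k, random_state=0)
Z_tr, Z_te = svd.fit_transform(X_tr), svd.transform(X_te)
acc_svd = SVC().fit(Z_tr, y_tr).score(Z_te, y_te)

# (1b) Nystroem features approximating the RBF kernel feature map
nys = Nystroem(kernel="rbf", n_components=k, random_state=0)
N_tr, N_te = nys.fit_transform(X_tr), nys.transform(X_te)
acc_nys = SVC().fit(N_tr, y_tr).score(N_te, y_te)

# (2) full sample-by-sample Gram matrix fed to a kernel classifier
K_tr, K_te = rbf_kernel(X_tr, X_tr), rbf_kernel(X_te, X_tr)
acc_gram = SVC(kernel="precomputed").fit(K_tr, y_tr).score(K_te, y_te)

print(acc_svd, acc_nys, acc_gram)
```

In setups (1a) and (1b) the classifier only ever sees an n-by-k matrix, whereas setup (2) requires constructing and storing the n-by-n Gram matrix, which is the cost the paper seeks to avoid.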

Highlights

  • Low-rank matrix decompositions are important in the application of kernel methods to large-scale learning problems

  • A significant drawback of kernel methods is the computational complexity associated with manipulating kernel matrices

  • This paper demonstrates that leading eigenvectors derived from singular value decomposition (SVD) and Nyström methods for reducing the dimensionality of data can be utilised for classification tasks without the need to construct Gram matrices

Introduction

Low-rank matrix decompositions are important in the application of kernel methods to large-scale learning problems. High-dimensional data, represented in more than two or three dimensions, can be difficult to manipulate and interpret. One approach to dealing with high-dimensional data is to assume that the data of interest lie on a non-linear manifold embedded within the higher-dimensional space. If the manifold is of low enough dimensionality, the data can be visualised in a low-dimensional space; this approach, known as manifold learning, is a form of non-linear dimensionality reduction. Large matrices consist of thousands to millions of entries, and performing even simple operations on these matrices quickly becomes computationally expensive.
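
As a rough illustration of how a low-rank decomposition sidesteps this cost (the RBF kernel, data and landmark count below are assumptions for the sketch, not the paper's setup), the Nyström method builds a rank-m approximation of an n-by-n Gram matrix from only n-by-m kernel evaluations:

```python
# Illustrative Nystroem low-rank approximation: sample m landmark points
# so that only an n x m block of the kernel is ever evaluated.
import numpy as np

def rbf(A, B, gamma=0.1):
    # pairwise RBF kernel between rows of A and rows of B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 10))   # n = 1000 samples (synthetic)
m = 50                                # landmark subset size (assumed)
idx = rng.choice(len(X), m, replace=False)

C = rbf(X, X[idx])                    # n x m cross-kernel block
W = C[idx, :]                         # m x m landmark kernel block
# Rank-m approximation K ~= C W^+ C^T; materialised here only for
# demonstration -- in practice one keeps the factors C and W^+.
K_approx = C @ np.linalg.pinv(W) @ C.T

# The leading eigenvectors of K_approx can be recovered from the small
# eigendecomposition of W lifted through C, so the cost stays linear in n.
```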
