Abstract
Finite mixture models (FMMs) are a ubiquitous tool for the analysis of heterogeneous data across a broad range of fields including agriculture, bioinformatics, botany, cell biology, economics, fisheries research, genetics, genomics, geology, machine learning, medicine, palaeontology, psychology, and zoology, among many others. Due to their flexibility, FMMs can be used to cluster data, classify data, and estimate densities, and increasingly, they are also being used to conduct regression analysis and to analyze regression outcomes. There is now an expansive literature on the usage of FMMs for regression, as well as a broad demand for the development of such methods for the analysis of new and complex data. This thesis begins with a summary of the current literature on FMMs and their applications to regression problems. Here, the mixture of regression models (MRMs), cluster-weighted models (CWMs), mixtures of experts (MoEs), and mixtures of linear mixed effects models (MLMMs), as well as other variants of FMMs for regression analysis, are introduced. Various properties such as denseness and identifiability, as well as maximum likelihood (ML) estimation techniques such as the expectation--maximization (EM) and minorization--maximization (MM) algorithms, are discussed, and a review is presented regarding asymptotic inference and model selection in FMMs. A new result on the characterization of a t linear CWM (LCWM) is also presented. Some new applications of FMMs to regression problems are then discussed. Firstly, a series of models based on FMMs is presented for the clustering and classification of sparsely sampled bivariate functional data. These methods are named mixture of spatial spline regression (MSSR) and MSSR discriminant analysis (MSSRDA). MSSR is constructed using the theory of MLMMs and spatial splines, and an EM algorithm for the ML estimation of the model is presented.
MSSRDA is then constructed by combining MSSR with the mixture discriminant analysis framework for classification. The methods are tested on their ability to cluster and classify simulated data. An example application to handwritten digit recognition is then presented. Here, it is shown that MSSR and MSSRDA perform comparably to currently available methods, and outperform said methods in missing data scenarios. Secondly, an FMM is used to produce a false discovery rate (FDR) control procedure for magnetic resonance imaging (MRI) data. In MRI data analysis, millions of hypotheses are often tested simultaneously, resulting in inflated numbers of false positive results. Many of the available FDR techniques for MRI data either do not take into account the spatial structure or rely on difficult-to-verify assumptions and user-specified parameters. To address these shortcomings, the Markov random field (MRF) FDR (MRF-FDR) technique is presented. MRF-FDR uses a Gaussian mixture model (GMM) to perform FDR control based on empirical-Bayesian principles. An MRF is then used to make the outcome of the GMM spatially coherent. The MRF is fitted using maximum pseudolikelihood estimation, and the pseudolikelihood information criterion is used to automatically specify the MRF model. MRF-FDR is shown to perform favorably in simulations against some currently used methods. An application to the PATH study data is presented, which shows that MRF-FDR generates inference that is clinically consistent with the available literature on brain aging. Thirdly, a new mixture of linear experts (MoLE) model is constructed using the Laplace error model; this is named the Laplace MoLE (LMoLE) model. An MM algorithm for the ML estimation of the LMoLE model is constructed, which can be generalized for the monotonic likelihood maximization of any MoE.
Theoretical properties such as identifiability and consistency are proven for the LMoLE, and connections are drawn between the LMoLE and the least absolute deviation regression criterion. Through simulations, the consistency of the ML estimator for the LMoLE is demonstrated. Results regarding the robustness of the LMoLE model against the more popular Gaussian MoLE are also provided. An application to a climate science data set is used to demonstrate the utility of the model. Finally, the Gaussian LCWM is extended via the linear regression characterization (LRC) of the GMM. The LRC is shown to be equivalent to the Gaussian LCWM. An MM algorithm that requires no matrix operations is constructed for the ML estimation of the LRC GMM. The MM algorithm is shown to monotonically increase the likelihood and to converge to a stationary point of the likelihood function. The ML estimator of the LRC GMM is proven to be consistent and asymptotically normal, thus providing an alternative proof of these properties for GMMs. A simple procedure for the mitigation of singularities in ML estimation via the LRC is discussed. Simulations are used to provide evidence that the MM algorithm based on the LRC may improve upon the performance of the traditional EM algorithm for GMMs in some practically relevant scenarios.
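As a brief illustration of the connection between the Laplace error model and the least absolute deviation (LAD) criterion mentioned above, consider the simplest case of a single linear regression with Laplace errors rather than a full mixture of experts. The notation below ($y_i$, $x_i$, $\beta$, $\lambda$, $n$) is assumed here for illustration and is not taken from the thesis.

```latex
% Laplace density for response y given covariate x (assumed notation):
\[
  f(y \mid x; \beta, \lambda)
  = \frac{1}{2\lambda}
    \exp\!\left( -\frac{\lvert y - x^{\top}\beta \rvert}{\lambda} \right),
  \qquad \lambda > 0 .
\]
% Log-likelihood of an independent sample (x_1, y_1), ..., (x_n, y_n):
\[
  \ell(\beta, \lambda)
  = -n \log(2\lambda)
    - \frac{1}{\lambda} \sum_{i=1}^{n} \lvert y_i - x_i^{\top}\beta \rvert .
\]
% For any fixed \lambda > 0, maximizing \ell over \beta is equivalent to
\[
  \min_{\beta} \; \sum_{i=1}^{n} \lvert y_i - x_i^{\top}\beta \rvert ,
\]
% which is the LAD regression criterion.
```

In this single-component sketch the ML estimate of $\beta$ coincides with the LAD estimate; how this carries over to the component experts of the LMoLE is developed in the thesis itself.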