Functional Cluster Method Research Articles

Multisensor data that track system operating behaviors are now widely available from engineering systems. Measurements from each sensor over time form a curve and can be viewed as functional data. Clustering these multivariate functional curves is important for studying the operating patterns of systems. One complication in such applications is the possible presence of sensors whose data contain no relevant information. Hence, it is desirable for the clustering method to be equipped with an automatic sensor selection procedure. Motivated by a real engineering application, we propose a functional data clustering method that simultaneously removes noninformative sensors and groups functional curves into clusters using the informative sensors. Functional principal component analysis is used to transform the multivariate functional data into a coefficient matrix for data reduction. We then model the transformed data by a Gaussian mixture distribution to perform model-based clustering with variable selection. Three types of penalties (individual, variable, and group) are considered to achieve automatic variable selection. Extensive simulations are conducted to assess the clustering and variable selection performance of the proposed methods. The application of the proposed methods to an engineering system with multiple sensors shows the promise of the methods and reveals interesting patterns in the sensor data.
History: Kwok Tsui served as the senior editor for this article.
Funding: The research by J. Min and Y. Hong was partially supported by the National Science Foundation [Grant CMMI-1904165] to Virginia Tech. The work by Y. Hong was partially supported by the Virginia Tech College of Science Research Equipment Fund.
Data Ethics & Reproducibility Note: The original dataset is proprietary and cannot be shared. The full code to replicate the results in this paper, based on summary statistics of the original data, is available at https://github.com/jiem3/MultiFuncClustering. Code applied to a simplified version is available at https://codeocean.com/capsule/4041000/tree/v1 and at https://doi.org/10.1287/ijds.2022.0034; to reduce computation time, it covers the data analysis and part of the simulation scenarios, with a single dataset under each scenario and a fixed set of hyperparameters.
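The data-reduction step described in the abstract above (functional principal component analysis turning multivariate curves into a coefficient matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes every curve is sampled on a common, equally spaced grid, approximates the principal components by a plain SVD of the centered samples, and uses synthetic data. The function names `fpca_scores` and `coefficient_matrix` and the three simulated sensors are hypothetical.

```python
import numpy as np


def fpca_scores(curves, n_components):
    """FPC scores for one sensor.

    curves: array of shape (n_units, n_timepoints), one discretized curve
    per row, all sampled on the same grid (an assumption of this sketch).
    """
    centered = curves - curves.mean(axis=0)  # subtract the mean function
    # Rows of vt are the discretized eigenfunctions of the sample covariance.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Scores are the projections of each centered curve onto the
    # leading eigenfunctions, which equals u * s column by column.
    return u[:, :n_components] * s[:n_components]


def coefficient_matrix(sensor_curves, n_components=2):
    """Stack per-sensor scores: shape (n_units, n_sensors * n_components)."""
    return np.hstack([fpca_scores(c, n_components) for c in sensor_curves])


# Synthetic example (hypothetical): 40 units in two groups, three sensors,
# the third of which carries only noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
labels = np.repeat([0, 1], 20)
noise = lambda: 0.1 * rng.standard_normal((40, 50))

sensor1 = np.where(labels[:, None] == 0,
                   np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)) + noise()
sensor2 = labels[:, None] * t + noise()
sensor3 = noise()  # noninformative sensor

scores = coefficient_matrix([sensor1, sensor2, sensor3], n_components=2)
print(scores.shape)  # (40, 6)
```

The resulting matrix has one row per unit and one block of columns per sensor, so it could be passed to a standard Gaussian mixture fit (for example scikit-learn's `GaussianMixture`). The individual, variable, and group penalties that the abstract uses for sensor selection are not implemented here.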


Background: Technological advances have allowed more frequent monitoring of biomarkers. The resulting data involve far more follow-ups than traditional longitudinal studies, where the number of follow-ups is often small. Such data allow exploration of the role of intra-person variability in understanding disease etiology and characterizing disease processes. A specific example is to characterize the pathogenesis of bacterial vaginosis (BV) using weekly vaginal microbiota Nugent assay scores collected over 2 years in post-menarcheal women from Rakai, Uganda, and to identify risk factors for each vaginal microbiota pattern to inform epidemiological and etiological understanding of the pathogenesis of BV.
Methods: We use a fully data-driven approach to characterize the longitudinal patterns of vaginal microbiota by treating the densely sampled Nugent scores as random functions over time and performing dimension reduction by functional principal components. Extending a current functional data clustering method, we use a hierarchical functional clustering framework that considers multiple data features to help identify clinically meaningful patterns of vaginal microbiota fluctuations. Additionally, multinomial logistic regression was used to identify risk factors for each vaginal microbiota pattern.
Results: Using weekly Nugent scores over 2 years from 211 sexually active, post-menarcheal women in Rakai, four patterns of vaginal microbiota variation were identified: persistent BV state (high Nugent scores); persistently normal-range Nugent scores; large fluctuations of Nugent scores that are predominantly in the BV state; and large fluctuations of Nugent scores that are predominantly in the normal state. A higher Nugent score at the start of an interval, age under 20 years, an unprotected source of bathing water, having an uncircumcised partner, and use of injectable/Norplant hormonal contraceptives for family planning were associated with higher odds of persistent BV.
Conclusion: The hierarchical functional data clustering method can be used for fully data-driven, unsupervised clustering of densely sampled longitudinal data to identify clinically informative clusters and the risk factors associated with each cluster.



Articles published on Functional Cluster Method

Distance-based Clustering of Functional Data with Derivative Principal Component Analysis

Multivariate Functional Clustering with Variable Selection and Application to Sensor Data from Engineering Systems

Synchronies and asynchronies in the development of COVID-19 pandemic in Italy

Functional Data Clustering Method Based on Shape Information and Functional Mahalanobis Distance

Functional Data Clustering Via Functional Mahalanobis Distance

Functional data analysis to characterize disease patterns in frequent longitudinal data: application to bacterial vaginal microbiota patterns using weekly Nugent scores and identification of pattern-specific risk factors

Interval-valued functional clustering based on the improved Euclidean distance with application to air quality index

Review of Clustering Methods for Functional Data

Analysis of the Influencing Factors of Crystalline Blockages in Mountain Tunnel Drainage Systems Based on Decision Analysis Methods

Quantile-based Clustering for Functional Data via Modelling Functional Principal Components Scores

Functional clustering on a sphere via Riemannian functional principal components

Using Functional Clustering to Diagnose Person Misfit

Conditional functional clustering for longitudinal data with heterogeneous nonlinear patterns

Interval-valued functional clustering based on the Wasserstein distance with application to stock data

Clustering-based simultaneous forecasting of life expectancy time series through Long-Short Term Memory Neural Networks

Functional clustering methods for resistance spot welding process data in the automotive industry

Ambulatory blood pressure profile and stroke recurrence

Functional clustering methods for longitudinal data with application to electronic health records.

Green efficiency performance analysis of the logistics industry in China: based on a kind of machine learning methods

A generalization of functional clustering for discrete multivariate longitudinal data.
