Abstract

In this paper, we cluster profiles of longitudinal data using a penalized regression method. Specifically, we allow heterogeneous variation of longitudinal patterns for each subject, and apply a pairwise-grouping penalty to the coefficients of nonparametric B-spline models to form subgroups. Consequently, we identify clusters based on different patterns of the predicted longitudinal curves. One advantage of the proposed method is that there is no need to pre-specify the number of clusters; instead, the number of clusters is selected automatically through a model selection criterion. Our method is also applicable to unbalanced data, where different subjects may have measurements at different time points. To implement the proposed method, we develop an alternating direction method of multipliers (ADMM) algorithm with a desirable convergence property. In theory, we establish consistency of the approximated nonparametric function estimates and of the estimated subgroup memberships. In addition, we show that our method outperforms existing competing approaches in our simulation studies and a real data example.

Highlights

  • In longitudinal data studies, distinguishing patterns of longitudinal trajectories is useful in many practical applications

  • One advantage of the proposed approach is that a pre-specification of the number of clusters is not required; instead we select the number of clusters automatically through a model selection criterion

  • We propose a nonparametric pairwise-grouping approach to cluster longitudinal trajectories over time


Summary

Introduction

In longitudinal data studies, distinguishing patterns of longitudinal trajectories is useful in many practical applications. We propose a regression-based approach which partitions observations into subgroups by penalizing pairwise distances between the B-spline coefficient vectors. One advantage of the proposed approach is that the number of clusters need not be pre-specified; instead, we select the number of clusters automatically through a model selection criterion. This allows us to estimate the model and subgroup the subjects simultaneously. Another advantage is that the proposed method can characterize longitudinal trajectories from unbalanced longitudinal data, where subjects are measured at different time points. Our simulation studies and real data analysis confirm that the proposed method performs well in identifying subgroups compared to other existing approaches.
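
The idea of penalizing pairwise distances between subject-wise B-spline coefficient vectors can be illustrated with a minimal sketch. It is not the authors' implementation: this summary does not state the exact penalty function or the ADMM updates, so the unweighted Euclidean-norm penalty, the knot vector, the toy data, and the names `bspline_basis` and `penalized_objective` are all assumptions made for illustration. The sketch only evaluates such an objective on unbalanced data and shows that subjects sharing a curve have nearly identical coefficient vectors.

```python
import numpy as np

def bspline_basis(t, knots, degree):
    """Evaluate all B-spline basis functions at t (Cox-de Boor recursion).

    Returns an array of shape (len(t), len(knots) - degree - 1).
    """
    t = np.asarray(t, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Degree 0: indicators of the half-open knot intervals.
    B = np.zeros((t.size, knots.size - 1))
    for j in range(knots.size - 1):
        B[:, j] = (knots[j] <= t) & (t < knots[j + 1])
    # Assign the right boundary point to the last non-empty interval.
    last = np.max(np.nonzero(knots[:-1] < knots[-1])[0])
    B[t == knots[-1], last] = 1.0
    for d in range(1, degree + 1):
        B_new = np.zeros((t.size, knots.size - d - 1))
        for j in range(knots.size - d - 1):
            left_den = knots[j + d] - knots[j]
            right_den = knots[j + d + 1] - knots[j + 1]
            if left_den > 0:
                B_new[:, j] += (t - knots[j]) / left_den * B[:, j]
            if right_den > 0:
                B_new[:, j] += (knots[j + d + 1] - t) / right_den * B[:, j + 1]
        B = B_new
    return B

def penalized_objective(times, responses, betas, knots, degree, lam):
    """Least-squares fit term plus lam * sum over pairs i<j of ||beta_i - beta_j||.

    The plain Euclidean norm is an assumed stand-in for the paper's penalty.
    """
    fit = 0.0
    for t_i, y_i, b_i in zip(times, responses, betas):
        resid = np.asarray(y_i) - bspline_basis(t_i, knots, degree) @ b_i
        fit += 0.5 * np.sum(resid ** 2)
    n = len(betas)
    pen = sum(np.linalg.norm(betas[i] - betas[j])
              for i in range(n) for j in range(i + 1, n))
    return fit + lam * pen

# Hypothetical setup: cubic splines on [0, 1] with one interior knot (5 basis functions).
degree = 3
knots = np.array([0, 0, 0, 0, 0.5, 1, 1, 1, 1], dtype=float)

# Toy unbalanced data: each subject is observed on its own time grid.
# Subjects 0 and 1 share one underlying curve; subject 2 follows another.
beta_a = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
beta_b = np.array([2.0, 0.0, -1.0, 0.0, 2.0])
times = [np.linspace(0, 1, 8), np.linspace(0.05, 0.95, 10), np.linspace(0, 1, 7)]
truth = [beta_a, beta_a, beta_b]
responses = [bspline_basis(t, knots, degree) @ b for t, b in zip(times, truth)]

# Unpenalized subject-wise least-squares fits of the coefficient vectors.
betas = [np.linalg.lstsq(bspline_basis(t, knots, degree), y, rcond=None)[0]
         for t, y in zip(times, responses)]

# Pairwise distances between coefficient vectors: near zero within a group.
dist = np.array([[np.linalg.norm(bi - bj) for bj in betas] for bi in betas])
print(np.round(dist, 3))
print(penalized_objective(times, responses, betas, knots, degree, lam=0.5))
```

Because every subject's curve is expressed in the same basis, subjects can be compared through their coefficient vectors even when their measurement times differ, which is what makes a pairwise penalty on coefficients usable for unbalanced data.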

A Subject-wise Model for Longitudinal Data
A Nonparametric Pairwise-Grouping Approach
Asymptotic Properties
Simulation Study
An Application to Drosophila Life Cycle Gene Expression Data
Findings
Discussion