Functional data analysis (FDA) is a modern paradigm for handling infinite-dimensional data. A key task in FDA is clustering, which identifies subgroups based on the shapes of measured curves. Because derivatives provide additional information about the shapes of functions, we propose a novel L² distance between two random functions that incorporates both the functions and their derivatives, so that the dissimilarity of curves is determined under a unified scheme for dense observations. The Karhunen–Loève expansion is used to approximate the curves and their derivatives. The cluster membership of each curve is predicted by minimizing, over all candidate clusters, the new distance between the observed curve and its prediction obtained through subspace projection. We provide consistent estimators for the curves, curve derivatives, and the proposed distance. Identifiability issues of the clustering procedure are also discussed. The utility of the proposed method is illustrated via simulation studies and applications to two real datasets. The proposed method can considerably improve clustering performance compared with existing functional clustering methods. Supplementary materials for the article are available online.
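To make the idea of a derivative-aware distance concrete, the following is a minimal sketch and not the estimator developed in the article. It assumes two curves observed on a common dense grid, approximates derivatives by finite differences rather than by a Karhunen–Loève expansion, and combines the two squared L² terms with a weight `omega`. The function names, the weight `omega`, and the additive form of the distance are all hypothetical choices made for illustration.

```python
import numpy as np


def trapezoid(values, grid):
    """Trapezoidal-rule integral of sampled values over a (possibly uneven) grid."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)


def derivative_aware_l2(x, y, grid, omega=1.0):
    """Illustrative distance between two densely observed curves.

    Computes sqrt( int (x - y)^2 dt + omega * int (x' - y')^2 dt ).
    `omega` is a hypothetical weight; the additive form is an assumption,
    not the definition used in the article.
    """
    # Finite-difference derivatives on the observation grid (assumption:
    # the grid is dense enough for these to be reasonable).
    dx = np.gradient(x, grid)
    dy = np.gradient(y, grid)
    level_term = trapezoid((x - y) ** 2, grid)
    slope_term = trapezoid((dx - dy) ** 2, grid)
    return float(np.sqrt(level_term + omega * slope_term))


# Toy example: x(t) = t versus y(t) = 0 on [0, 1].
grid = np.linspace(0.0, 1.0, 201)
x = grid.copy()
y = np.zeros_like(grid)

d_with_derivative = derivative_aware_l2(x, y, grid, omega=1.0)
d_level_only = derivative_aware_l2(x, y, grid, omega=0.0)
```

In this toy case the level term is the integral of t² over [0, 1] and the slope term is the integral of 1, so setting `omega` to zero recovers the ordinary L² distance while a positive `omega` also penalizes differences in slope.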