Abstract

In this paper, we consider the estimation of a change-point for possibly high-dimensional data in a Gaussian model, using a maximum likelihood method. We are interested in how dimension reduction can affect the performance of the method. We provide an estimator of the change-point that attains the minimax rate of convergence, up to a logarithmic factor. The minimax rate is in fact composed of a fast rate, which does not depend on the dimension, and a slow rate, which increases with the dimension. Moreover, we prove that, for sparse data with Sobolev regularity, there is a bound on the separation between the regimes above which an optimal choice of dimension reduction exists and yields the fast rate of estimation. We propose an adaptive dimension reduction procedure based on Lepski’s method and show that the resulting estimator attains the fast rate of convergence. Our results are then illustrated by a simulation study; in particular, practical strategies are suggested for performing dimension reduction.
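
As an illustration, the following is a minimal sketch of a maximum-likelihood (least-squares) change-point estimator of the kind described above, assuming i.i.d. Gaussian noise with identity covariance and a crude dimension reduction that keeps only the first p coordinates; the function and variable names are ours, not the paper's notation.

```python
import numpy as np

def ml_change_point(X, p=None):
    """Estimate the index after which the mean of the rows of X shifts.

    X : array of shape (n, d), one d-dimensional observation per row.
    p : number of leading coordinates kept (dimension reduction);
        p=None keeps the full data.
    """
    n, _ = X.shape
    Z = X[:, :p] if p is not None else X            # crude dimension reduction
    best_k, best_stat = 1, -np.inf
    for k in range(1, n):
        m1 = Z[:k].mean(axis=0)
        m2 = Z[k:].mean(axis=0)
        # CUSUM-type statistic; maximising it over k is equivalent to
        # minimising the Gaussian negative log-likelihood over both segments.
        stat = k * (n - k) / n * np.sum((m1 - m2) ** 2)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k

# Toy example: mean shift after observation 60 in dimension d = 200.
rng = np.random.default_rng(0)
n, d, tau = 100, 200, 60
mu = np.zeros(d)
mu[:10] = 0.8                                       # sparse mean shift
X = rng.normal(size=(n, d))
X[tau:] += mu
print(ml_change_point(X, p=20), ml_change_point(X))  # reduced vs full data
```

In this toy setting the reduced estimator (p = 20) and the full-data estimator can be compared directly, which is exactly the trade-off the paper quantifies through its fast and slow rates.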

Highlights

  • We address a framework in which the change between classes occurs along a time scale, which casts the problem as one of change-point estimation

  • We do not have this opportunity, which adds to the difficulty of the problem. A closely related reference is [59], which proposes a two-stage procedure: a projection step followed by a univariate change-point estimation algorithm applied to the projected data, together with rates of convergence for the estimator of the change-point location (a sketch of such a two-stage scheme is given after this list)

  • We present the problem of dimension reduction and the maximum likelihood estimator of the change-point
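
Below is a hedged sketch of such a two-stage scheme: project the d-dimensional observations onto a single direction, then locate the change point on the resulting univariate series. The projection direction used here (the difference of the two half-sample means) is purely illustrative and is not the projection analysed in [59].

```python
import numpy as np

def two_stage_change_point(X):
    """Two-stage estimate: projection, then univariate change-point detection."""
    n, d = X.shape
    # Stage 1: choose a projection direction from a crude split of the sample.
    direction = X[n // 2:].mean(axis=0) - X[: n // 2].mean(axis=0)
    norm = np.linalg.norm(direction)
    direction = direction / norm if norm > 0 else np.ones(d) / np.sqrt(d)
    y = X @ direction                       # univariate projected series
    # Stage 2: univariate CUSUM-type change-point estimate on y.
    stats = [k * (n - k) / n * (y[:k].mean() - y[k:].mean()) ** 2
             for k in range(1, n)]
    return int(np.argmax(stats)) + 1
```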


Summary

The model

An important problem in the vast domain of statistical learning is the unsupervised classification of high-dimensional data. For high-dimensional data, from a computational point of view, there is an obvious need for dimension reduction when estimating the change-point τ: without such a step, the segmentation algorithm may be unstable or may not work at all. We consider the dimension reduction problem from a theoretical point of view (as opposed to the algorithmic one). From this point of view, one might suspect that it is always better to keep the whole data in order to get the best precision on the estimation of the change-point; we show that this intuition is not correct. Addressing this dimension reduction problem can require sophisticated tools directly connected to smoothing questions in nonparametric estimation. A further question is whether on-line (signal-by-signal) dimension reduction performs as well as off-line reduction, which relies on a preprocessing step involving all the signals.
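
The abstract proposes to choose the reduced dimension adaptively with Lepski’s method. The sketch below is a generic instance of a Lepski-type selection rule, written under the assumption that the stochastic error of the change-point estimator grows with p while the bias decreases with p; the dimension grid and the threshold function are placeholders, not the paper's constants.

```python
def lepski_select_p(X, p_grid, estimator, threshold):
    """Pick the smallest p whose estimate agrees with all larger dimensions.

    estimator(X, p) -> change-point estimate using the first p coordinates.
    threshold(q)    -> admissible deviation when comparing with dimension q.
    """
    p_grid = sorted(p_grid)
    estimates = {p: estimator(X, p) for p in p_grid}
    for i, p in enumerate(p_grid):
        # The largest p is always admissible, so the loop always returns.
        if all(abs(estimates[p] - estimates[q]) <= threshold(q)
               for q in p_grid[i + 1:]):
            return p, estimates[p]

# Illustrative use with the maximum-likelihood estimator sketched earlier:
# p_hat, tau_hat = lepski_select_p(X, [5, 10, 20, 50], ml_change_point,
#                                  threshold=lambda q: 2 + q ** 0.5)
```

The commented call shows how this rule could be combined with the maximum-likelihood estimator sketched earlier; the threshold 2 + √q is only an illustrative choice.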

Literature review and related work
Outline of the paper
Change-point model and assumptions
Condition on the means
Dimension reduction for the estimation of τ
Minimax convergence rate under sparsity condition
Fast rate of convergence
Lepski’s procedure
Preprocessing
Adaptive convergence rate
Rate of convergence
Selection of p
Proof of Proposition 1
Findings
Proof of Theorem 2
