Abstract

This paper investigates the problem of detecting relevant change points in the mean vector, say $\mu_{t}=(\mu_{t,1},\ldots ,\mu_{t,d})^{T}$, of a high dimensional time series $(Z_{t})_{t\in \mathbb{Z}}$. While the recent literature on testing for change points in this context considers hypotheses for the equality of the means $\mu_{h}^{(1)}$ and $\mu_{h}^{(2)}$ before and after the change points in the different components, we are interested in a null hypothesis of the form
\begin{equation*}
H_{0}: |\mu^{(1)}_{h}-\mu^{(2)}_{h}| \leq \Delta_{h} \quad \text{for all } h=1,\ldots ,d,
\end{equation*}
where $\Delta_{1},\ldots ,\Delta_{d}$ are given thresholds such that a difference of the means in the $h$-th component no larger than $\Delta_{h}$ is considered non-relevant. This formulation of the testing problem is motivated by the fact that in many applications a modification of the statistical analysis might not be necessary if the differences between the parameters before and after the change points in the individual components are small. The problem is of particular relevance in high dimensional change point analysis, where a small change in only one component can yield a rejection by the classical procedure although all components change only in a non-relevant way. We propose a new test for this problem based on the maximum of squared and integrated CUSUM statistics and investigate its properties as the sample size $n$ and the dimension $d$ both converge to infinity. In particular, using Gaussian approximations for the maximum of a large number of dependent random variables, we show that at certain points of the boundary of the null hypothesis a standardized version of the maximum converges weakly to a Gumbel distribution. This result is used to construct a consistent asymptotic level $\alpha$ test, and a multiplier bootstrap procedure is proposed, which improves the finite sample performance of the test. The finite sample properties of the test are investigated by means of a simulation study, and we also illustrate the new approach with data from hydrology.

Highlights

  • In the context of high dimensional time series it is typically unrealistic to assume stationarity

  • A simple form of non-stationarity, motivated by financial time series in which large panels of asset returns routinely display break points, is to assume structural breaks at different times in the individual components

  • One goal of statistical inference is to correctly estimate these “change points” such that the original data can be partitioned into shorter stationary segments. This field is called change point analysis in the statistical literature and since the seminal papers of Page (1954, 1955) numerous authors have worked on the problem of detecting structural breaks or change points in various statistical models [see Aue and Horvath (2013) for a recent review]


Summary

Introduction

In the context of high dimensional time series it is typically unrealistic to assume stationarity. In the simplest case of one structural break in each component, many authors approach the problem of detecting the change point by means of hypothesis testing. Jirak (2015a) investigates the hypothesis of no structural break in a high-dimensional time series. In this paper we use a different approach and test for a relevant structural break in any of the components of a high dimensional time series. It turns out that, in contrast to the classical change point problem, the analysis of the test for no relevant structural breaks is substantially harder, because the null hypothesis does not correspond to a stationary process (non-relevant changes in the means are allowed). Some of the technical details are deferred to the appendix.
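
To make the idea of a "squared and integrated CUSUM statistic" concrete, the following is a minimal sketch, not the paper's procedure. It assumes one change per component and uses the elementary fact that, for a mean jumping from $\mu^{(1)}_{h}$ to $\mu^{(2)}_{h}$ at time fraction $t_{h}$, the limit of the CUSUM process $U_{h}(s)$ satisfies $\int_{0}^{1}U_{h}^{2}(s)\,ds = (\mu^{(1)}_{h}-\mu^{(2)}_{h})^{2}\,t_{h}^{2}(1-t_{h})^{2}/3$, so rescaling the integral by $3/(t_{h}(1-t_{h}))^{2}$ recovers the squared jump. The function names, the argmax estimator of the change location, and the unstandardized comparison with $\Delta_{h}^{2}$ are illustrative assumptions; the long-run variance normalisation, the Gumbel critical values and the multiplier bootstrap described in the abstract are deliberately left out.

```python
import numpy as np


def cusum_process(z):
    """Return U_h(k/n) = (S_{k,h} - (k/n) * S_{n,h}) / n for k = 1..n.

    z is an (n, d) array: n time points, d components.
    """
    n = z.shape[0]
    partial_sums = np.cumsum(z, axis=0)        # S_{k,h}
    frac = np.arange(1, n + 1)[:, None] / n    # k/n as a column vector
    return (partial_sums - frac * partial_sums[-1]) / n


def squared_jump_estimates(z):
    """Estimate the squared mean jump in each component (hypothetical helper).

    The change location is estimated by the argmax of |U_h| (an assumption
    of this sketch), and the integral of U_h^2 by a Riemann sum.
    """
    n = z.shape[0]
    u = cusum_process(z)
    k_hat = np.argmax(np.abs(u), axis=0) + 1
    # Keep the estimated fraction away from 0 and 1 to avoid dividing by zero.
    t_hat = np.clip(k_hat / n, 1.0 / n, 1.0 - 1.0 / n)
    integral = np.mean(u ** 2, axis=0)
    return 3.0 * integral / (t_hat * (1.0 - t_hat)) ** 2, t_hat


def max_excess(z, delta):
    """Largest value of (estimated squared jump - delta_h^2) over components."""
    est, _ = squared_jump_estimates(z)
    return float(np.max(est - np.asarray(delta, dtype=float) ** 2))


# Noise-free toy data: n = 200 observations, d = 3 components.
n = 200
z = np.zeros((n, 3))
z[100:, 0] = 2.0   # jump of size 2 at time fraction 0.5
z[:, 1] = 1.0
z[50:, 1] = 1.5    # jump of size 0.5 at time fraction 0.25
                   # component 2 has no change at all
est, t_hat = squared_jump_estimates(z)
```

On this toy input the estimates are close to the squared jumps 4, 0.25 and 0. With thresholds $\Delta_{h}=1$ for every component the maximal excess is positive, because the first component changes by more than its threshold; raising $\Delta_{1}$ to 3 makes every excess negative. A real test would replace the naive comparison with zero by a properly standardized statistic and calibrated critical values.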

Relevant changes in high dimensions - basic principles
Asymptotic properties
Assumptions
Asymptotic properties - known variances and locations
Estimation of long-run variances and change point locations
Weak convergence
Relevant changes in high dimensional time series
Bootstrap
Simulation study
Data example
A Proofs of the results in Sections 2 and 3
Findings
C Proofs of the results in Section 5