A Reality Check for Data Snooping

Halbert White

doi:10.1111/1468-0262.00152

Abstract

Data snooping occurs when a given set of data is used more than once for purposes of inference or model selection. When such data reuse occurs, there is always the possibility that any satisfactory results obtained may simply be due to chance rather than to any merit inherent in the method yielding the results. This problem is practically unavoidable in the analysis of time-series data, as typically only a single history measuring a given phenomenon of interest is available for analysis. It is widely acknowledged by empirical researchers that data snooping is a dangerous practice to be avoided, but in fact it is endemic. The main problem has been a lack of sufficiently simple practical methods capable of assessing the potential dangers of data snooping in a given situation. Our purpose here is to provide such methods by specifying a straightforward procedure for testing the null hypothesis that the best model encountered in a specification search has no predictive superiority over a given benchmark model. This permits data snooping to be undertaken with some degree of confidence that one will not mistake results that could have been generated by chance for genuinely good results.
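
The abstract does not spell out the mechanics of the test, so the following is only a minimal sketch of how such a check might be run, under stated assumptions. It assumes the statistic is the scaled mean relative performance of the best model, and that its null distribution is approximated with a stationary bootstrap of the time indices. The function names, the `perf` input layout (one row per model, holding that model's performance relative to the benchmark at each date), and the defaults `n_boot`, `mean_block`, and `seed` are all hypothetical choices made for illustration.

```python
import math
import random


def stationary_bootstrap_indices(n, mean_block, rng):
    """Draw n time indices in random-length blocks (stationary bootstrap)."""
    p = 1.0 / mean_block  # probability of starting a new block at each step
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))    # start a new block at a random date
        else:
            idx.append((idx[-1] + 1) % n)   # continue the block, wrapping around
    return idx


def reality_check_pvalue(perf, n_boot=500, mean_block=10, seed=0):
    """Bootstrap p-value for H0: no model beats the benchmark in expectation.

    perf[k][t] is the performance of model k relative to the benchmark at
    time t (for example, benchmark loss minus model loss), so positive
    values favour the model. This input layout is an assumption.
    """
    rng = random.Random(seed)
    n = len(perf[0])
    means = [sum(row) / n for row in perf]
    # Observed statistic: scaled mean performance of the best model found.
    stat = max(math.sqrt(n) * m for m in means)
    exceed = 0
    for _ in range(n_boot):
        idx = stationary_bootstrap_indices(n, mean_block, rng)
        # Recentre each resampled mean at its sample mean to impose the null,
        # then take the maximum over all models searched.
        boot_stat = max(
            math.sqrt(n) * (sum(row[t] for t in idx) / n - m)
            for row, m in zip(perf, means)
        )
        if boot_stat > stat:
            exceed += 1
    return exceed / n_boot
```

Because the maximum is taken over every model in the search within each bootstrap draw, the p-value reflects the whole specification search rather than only the winning model. A small p-value suggests that the best model's edge over the benchmark is unlikely to be an artifact of the search alone.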

Journal: Econometrica
Publication Date: Sep 1, 2000
Citations: 1600

Similar Papers

Do Google Searches Help in Nowcasting Private Consumption
25 May 2010

Efficient statistical significance approximation for local similarity analysis of high-throughput time series data
Li C Xia ... Jed A Fuhrman
Bioinformatics | VOL. 29
23 Nov 2012

Confirmatory and exploratory analysis applied to pharmaco-EEG and related study data: contradiction or useful enrichment?
U Ferner ... G Neff
Neuropsychobiology | VOL. 9
01 Jan 1982

Choosing Factors in a Multifactor Asset Pricing Model when Returns are Nonnormal
Johan Parmler ... Sune Karlsson
SSRN Electronic Journal
28 Feb 2005
