Abstract

Data collection at ultra-high frequency in financial markets requires the manipulation of complex databases, and possibly the correction of errors present in the data. The New York Stock Exchange is chosen to provide evidence of the problems affecting ultra-high-frequency data sets. Standard filters can be applied to remove bad records from the trades and quotes data. A method for outlier detection is proposed to remove data that do not correspond to plausible market activity. Several methods for aggregating the data are suggested, from which time series of interest for econometric analysis can be constructed. As an example of the relevance of the procedure, the autoregressive conditional duration model is estimated on price durations. Failure to purge the data of “wrong” ticks is likely to shorten the financial durations between substantial price movements and to alter the autocorrelation profile of the series. The estimated coefficients and overall model diagnostics are considerably altered in the absence of appropriate data-cleaning steps. Overall, the difference in the coefficients is larger between the dirty series and the clean series than among series filtered with different algorithms.
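
As an illustration of the kind of outlier detection proposed, a tick can be discarded whenever it lies too far from a trimmed mean of the surrounding prices. The following is a minimal Python sketch of such a neighbourhood-based rule; the window size k, the trimming fraction, and the granularity cushion gamma are illustrative assumptions, not the paper's actual specification.

import numpy as np

def trimmed_stats(window, trim=0.10):
    # Mean and standard deviation after dropping a fraction of the
    # most extreme observations on each side of the sorted window.
    w = np.sort(window)
    cut = int(len(w) * trim)
    trimmed = w[cut:len(w) - cut] if cut > 0 else w
    return trimmed.mean(), trimmed.std(ddof=1)

def filter_outliers(prices, k=40, gamma=0.02, trim=0.10):
    # Keep price p_i only if it lies within three trimmed standard
    # deviations (plus the cushion gamma, which keeps zero-variance
    # windows of identical prices from flagging everything) of the
    # trimmed mean of its k nearest neighbours. All parameter values
    # here are illustrative.
    prices = np.asarray(prices, dtype=float)
    n = len(prices)
    keep = np.ones(n, dtype=bool)
    half = k // 2
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, lo + k + 1)
        lo = max(0, hi - k - 1)
        neighbours = np.delete(prices[lo:hi], i - lo)  # exclude p_i itself
        mean, std = trimmed_stats(neighbours, trim)
        if abs(prices[i] - mean) >= 3.0 * std + gamma:
            keep[i] = False
    return keep

Trimming the window before computing the mean and standard deviation keeps a burst of consecutive bad ticks from masking itself, which is the main reason such rules are preferred to a plain k-sigma filter on raw neighbourhood statistics.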
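
The autoregressive conditional duration (ACD) model referred to above, due to Engle and Russell (1998), specifies each duration as x_i = psi_i * eps_i, where the eps_i are i.i.d. positive innovations with unit mean and the conditional expected duration follows psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1}. Below is a minimal quasi-maximum-likelihood estimation sketch with exponential innovations; the initialization of psi, the starting values, and the choice of optimizer are assumptions made for the example.

import numpy as np
from scipy.optimize import minimize

def acd_neg_loglik(params, x):
    # Negative exponential quasi-log-likelihood of the ACD(1,1) model
    # psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1}.
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return np.inf  # enforce positivity and stationarity
    psi = np.empty_like(x)
    psi[0] = x.mean()  # start the recursion at the sample mean (an assumption)
    for i in range(1, len(x)):
        psi[i] = omega + alpha * x[i - 1] + beta * psi[i - 1]
    return np.sum(np.log(psi) + x / psi)

def fit_acd(durations):
    # Quasi-maximum-likelihood estimates of (omega, alpha, beta)
    # for a series of positive durations.
    x = np.asarray(durations, dtype=float)
    start = np.array([0.1 * x.mean(), 0.1, 0.8])  # illustrative starting values
    return minimize(acd_neg_loglik, start, args=(x,), method="Nelder-Mead").x

Fitting such a model to price durations computed from the raw series and again from the cleaned series makes the abstract's point concrete: spurious ticks create artificially short durations, so the estimates of omega, alpha, and beta, along with the residual diagnostics, shift once the outliers are removed.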
