Abstract

The ability to make predictions is central to the artificial intelligence problem. While machine learning algorithms have difficulty learning to predict events with hundreds of time-step dependencies, animals can learn event timing within tens of trials across a broad spectrum of time scales. This strongly suggests the need for new perspectives on the forecasting problem. This paper focuses on binary time series that can be predicted within some temporal precision. We demonstrate that the sum of squared errors (SSE) calculated at every time step is not appropriate for this problem. Next, we examine the advantages and shortcomings of using a dynamic time warping (DTW) cost function. We then propose the squared timing error (STE), which uses DTW on the event space and applies SSE to the timing error instead of at each time step. We evaluate all three cost functions under different types of timing error, such as phase shifts, warping, and missing events, on synthetic and real-world binary time series (heartbeats, finance, and music). The results show that STE provides more information about timing error, is differentiable, and can be computed online efficiently. Finally, we devise a gradient descent algorithm for STE on a simplified recurrent neural network and compare the performance of the STE-based algorithm to SSE- and logit-based gradient descent algorithms on the same network architecture. The results on real-world binary time series show that the STE algorithm generally outperforms all the other cost functions considered.
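The following is a minimal sketch of how an STE-style cost might be computed, assuming events are the time steps at which a binary series equals 1 and using a standard DTW recursion over the two event-time sequences; the function names and the handling of empty event sets are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def event_times(x):
    """Indices at which a binary series fires an event (x[t] == 1)."""
    return np.flatnonzero(np.asarray(x) == 1)

def squared_timing_error(pred, target):
    """Sketch of STE: align the event times of two binary series with DTW,
    then sum squared timing differences along the optimal alignment path
    (SSE applied to timing error, not to each time step)."""
    p, q = event_times(pred), event_times(target)
    if len(p) == 0 or len(q) == 0:
        # Assumption: treat an empty event set as maximally costly.
        return float("inf")
    n, m = len(p), len(q)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = float(p[i - 1] - q[j - 1]) ** 2  # squared timing error of this pairing
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Example: a prediction whose events each lag the target by one step.
target = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
pred   = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1]
print(squared_timing_error(pred, target))  # three events, each off by 1 step -> 3.0
```

In this example each of the three predicted events lags its target by one step, so the sketch yields an STE of 3; per-step SSE, by contrast, would count all six mismatched time steps as full errors regardless of how close the event timings are, which illustrates why the abstract argues SSE is not appropriate for this problem.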
