Understanding how price-volume information determines future price movement is important for market makers, who frequently place orders on both the buy and sell sides, and for traders who split meta-orders to reduce price impact. Given the complex, non-linear nature of the problem, we use machine learning tools to predict the direction of mid-price movement on an option order book. The applicability of such tools to the options market has not yet been studied. On an intraday tick-level dataset of options on an exchange-traded fund from the Chinese market, we apply a variety of machine learning methods, including decision trees, random forests, logistic regression, and long short-term memory neural networks. More complex machine learning models can extract deeper hidden relationships from input features, relationships that classic market microstructure models struggle to capture. We find that the price movement is predictable, that deep neural networks with time-lagged features outperform all of the simpler models, and that this predictive ability is universal and shared across assets. Using a model-agnostic interpretability tool, we find that features from the first two levels of the order book are the most important for prediction. These findings encourage researchers and practitioners to explore more sophisticated models and to use more relevant features.