Predicting Bitcoin prices is crucial because they reflect trends in the overall cryptocurrency market. Owing to the market's short history and high price volatility, previous research has focused on the factors influencing Bitcoin price fluctuations. Although previous studies used sentiment analysis or diversified input features, this study's novelty lies in its utilization of data classified into more than five major categories. Moreover, the use of data spanning more than 2,000 days adds novelty to this study. With this extensive dataset, the authors aimed to predict Bitcoin prices across various timeframes using time series analysis. The authors incorporated a broad spectrum of inputs, including technical indicators, sentiment analysis from social media, news sources, and Google Trends. In addition, this study integrated macroeconomic indicators, on-chain Bitcoin transaction details, and traditional financial asset data. The primary objective was to evaluate extensive machine learning and deep learning frameworks for time series prediction, determine optimal window sizes, and enhance Bitcoin price prediction accuracy by leveraging diverse input features. Consequently, employing the bidirectional long short-term memory (Bi-LSTM) yielded significant results even without excluding the COVID-19 outbreak as a black swan outlier. Specifically, using a window size of 3, Bi-LSTM achieved a root mean squared error of 0.01824, mean absolute error of 0.01213, mean absolute percentage error of 2.97%, and an R-squared value of 0.98791. Additionally, to ascertain the importance of input features, gradient importance was examined to identify which variables specifically influenced prediction results. Ablation test was also conducted to validate the effectiveness and validity of input features. The proposed methodology provides a varied examination of the factors influencing price formation, helping investors make informed decisions regarding Bitcoin-related investments, and enabling policymakers to legislate considering these factors.
Read full abstract