Abstract

This paper investigates the soybean futures price prediction problem from a new perspective and proposes a prediction model, the Two‐Stage Hybrid Long Short‐Term Memory (TSH‐LSTM), that uses text data from social media. First, the unstructured text is transformed into structured data through sentiment analysis and text classification. An improved sentiment score is computed by combining the sentiment dictionary method with the degree centrality of sentiment words, and the price fluctuation characteristics expressed in the texts are learned with a text Recurrent Convolutional Neural Network. Second, the relationship between social media features and the soybean futures price is assessed through stepwise regression, and the statistically significant factors are selected as input variables of the prediction model. Finally, the TSH‐LSTM prediction model is designed, and the final prediction is obtained by combining the prediction results of each stage using the error reciprocal method. The empirical results indicate that incorporating social media text features improves forecasting performance. Specifically, the proposed TSH‐LSTM is more accurate than univariate LSTM, multivariate LSTM, and eXtreme Gradient Boosting.
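The error reciprocal combination mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common form of the method, in which each stage's forecast is weighted in proportion to the reciprocal of its error, so the stage with the smaller error receives the larger weight. The abstract does not state which error measure is used or how it is estimated; the function names, the choice of error measure, and the numbers below are hypothetical and serve only as an illustration.

```python
def error_reciprocal_weights(errors):
    """Return weights proportional to 1/error, normalized to sum to 1.

    `errors` holds one positive error value per stage (for example a
    validation MSE; the error measure is an assumption, not taken from
    the paper).
    """
    if not errors or any(e <= 0 for e in errors):
        raise ValueError("errors must be a non-empty list of positive numbers")
    reciprocals = [1.0 / e for e in errors]
    total = sum(reciprocals)
    return [r / total for r in reciprocals]


def combine_forecasts(forecasts, errors):
    """Combine per-stage forecast series into one weighted series.

    `forecasts` is a list of equally long lists, one per stage.
    """
    if len(forecasts) != len(errors):
        raise ValueError("need exactly one error value per forecast series")
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("all forecast series must have the same length")
    weights = error_reciprocal_weights(errors)
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts))
        for t in range(horizon)
    ]


# Illustrative, made-up numbers (not results from the paper).
stage1 = [100.0, 104.0]   # hypothetical stage-1 forecasts
stage2 = [108.0, 100.0]   # hypothetical stage-2 forecasts
stage_errors = [1.0, 3.0]  # hypothetical validation errors per stage

weights = error_reciprocal_weights(stage_errors)
combined = combine_forecasts([stage1, stage2], stage_errors)
print(weights, combined)
```

With two stages this reduces to weighting stage 1 by e2 / (e1 + e2) and stage 2 by e1 / (e1 + e2), where e1 and e2 are the two stage errors.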