With the rapid growth of e-commerce and social networks, textual emotion recognition has attracted increasing interest in recent years. The task is central to applications such as content recommendation and human-robot interaction. Models such as CNN, LSTM, and RoBERTa have been widely used in natural language processing to capture the subtleties of textual emotion. Although RoBERTa achieves high accuracy, its large number of parameters makes training slow. To address this issue, we propose a hybrid approach that integrates Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. We apply this hybrid model to emotion classification on a dataset named "Emotion" and compare its performance with that of the individual RoBERTa, CNN, and LSTM models. The hybrid model has a more compact architecture while maintaining relatively high accuracy, reaching 91%. It thus offers a good compromise between efficiency and performance. This research contributes to the ongoing exploration of efficient and accurate models for textual emotion recognition and emphasizes the importance of balancing model complexity and training speed in practical applications.
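To make the compactness claim concrete, the following is a minimal sketch of how the trainable parameters of a CNN-LSTM text classifier could be counted. The layer layout (embedding, one 1-D convolution, one unidirectional LSTM, one linear classifier) and every hyperparameter value are hypothetical choices for illustration, not the configuration used in this work. The figure of roughly 125 million parameters is the commonly cited size of the RoBERTa-base variant and is used here only as an assumed point of comparison, since the text does not state which RoBERTa variant was evaluated.

```python
# Rough parameter count for a hypothetical CNN-LSTM text classifier.
# All layer sizes below are illustrative assumptions, not the paper's settings.

def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry.
    return vocab_size * embed_dim

def conv1d_params(in_channels, out_channels, kernel_size):
    # One kernel of shape (in_channels, kernel_size) per output channel, plus a bias each.
    return in_channels * out_channels * kernel_size + out_channels

def lstm_params(input_size, hidden_size):
    # Four gates, each with input weights, recurrent weights, and one bias vector.
    # (Some libraries store two bias vectors per gate; this uses the textbook form.)
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + hidden_size)

def linear_params(in_features, out_features):
    # Weight matrix plus bias.
    return in_features * out_features + out_features

def cnn_lstm_params(vocab_size, embed_dim, conv_channels, kernel_size,
                    hidden_size, num_classes):
    # Embedding -> Conv1d -> LSTM -> Linear, summed layer by layer.
    return (embedding_params(vocab_size, embed_dim)
            + conv1d_params(embed_dim, conv_channels, kernel_size)
            + lstm_params(conv_channels, hidden_size)
            + linear_params(hidden_size, num_classes))

# Hypothetical configuration: 20k-word vocabulary, 128-dim layers, 6 emotion classes.
HYBRID_TOTAL = cnn_lstm_params(vocab_size=20_000, embed_dim=128, conv_channels=128,
                               kernel_size=3, hidden_size=128, num_classes=6)

# Assumed reference size: RoBERTa-base is commonly reported at about 125M parameters.
ROBERTA_BASE_APPROX = 125_000_000

RATIO = HYBRID_TOTAL / ROBERTA_BASE_APPROX

if __name__ == "__main__":
    print(f"hybrid parameters: {HYBRID_TOTAL:,}")
    print(f"fraction of assumed RoBERTa-base size: {RATIO:.3%}")
```

Under these assumed settings the embedding table dominates the count, so the vocabulary size, rather than the convolutional or recurrent layers, would largely determine how compact such a hybrid model is.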