When a neural network performs a classification task, only the label corresponding to the maximum value of the fully connected layer is used as the output category. When that maximum value is close to the second-largest value, a classification that considers only the maximum and ignores the second-largest value is not fully reliable. To reduce this noise and improve classification accuracy, this paper applies the principles of fuzzy reasoning to integrate all outputs of the fully connected layer with the dictionary-based emotional tendency of the text, establishing a multi-modal fuzzy recognition emotion enhancement model. The proposed model considers how negation words, degree adverbs, exclamation marks, and question marks, identified on the basis of the smallest subtree, modify the emotion of emotional words, and it defines a corpus-based global emotional membership function for emojis. A comparison of results for CNN, LSTM, BiLSTM, and GRU on Weibo and Douyin shows that the proposed model can effectively improve text emotion recognition when the neural network output is ambiguous, especially for long texts.
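
The core idea, falling back on dictionary evidence when the top two network outputs are nearly tied, can be illustrated with a minimal sketch. It is not the paper's actual model: the function names, the margin threshold, the lexicon weight, and the simple linear fusion rule are all assumptions chosen for illustration, and the lexicon membership values are taken as given rather than computed from negation words, degree adverbs, punctuation, or emojis.

```python
from typing import Sequence


def top_two_margin(scores: Sequence[float]) -> float:
    """Gap between the largest and second-largest score."""
    ordered = sorted(scores, reverse=True)
    return ordered[0] - ordered[1]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def fuse_prediction(
    net_scores: Sequence[float],
    lexicon_membership: Sequence[float],
    margin_threshold: float = 0.1,   # assumed value, not from the paper
    lexicon_weight: float = 0.5,     # assumed value, not from the paper
) -> int:
    """Return a class index, consulting the lexicon only for unclear outputs.

    net_scores: per-class network outputs (e.g. softmax probabilities).
    lexicon_membership: per-class dictionary-based membership degrees in [0, 1].
    """
    if len(net_scores) != len(lexicon_membership) or len(net_scores) < 2:
        raise ValueError("need two or more classes and equal-length inputs")

    # Clear case: the network's top class is well separated, so keep it.
    if top_two_margin(net_scores) >= margin_threshold:
        return argmax(net_scores)

    # Unclear case: blend network scores with lexicon memberships
    # (hypothetical linear fusion, used here only as a stand-in).
    fused = [
        (1.0 - lexicon_weight) * p + lexicon_weight * m
        for p, m in zip(net_scores, lexicon_membership)
    ]
    return argmax(fused)
```

With network scores `[0.48, 0.52]` and lexicon memberships `[0.9, 0.1]`, the margin is below the assumed threshold, so the lexicon tips the decision to class 0; with scores `[0.1, 0.9]` the network output is kept unchanged.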