In video lectures, instructors often use spontaneous emotional expressions (facial expressions, tone of voice) and visual cues (underlining, circling) to guide students' attention toward key instructional information. While previous research has confirmed the benefits of visual cues in guiding attention and supporting the processing of specific information, the role of emotional expression in this context remains poorly understood. Moreover, little research has examined how the two behaviors are designed in combination (whether they emphasize the same instructional information) and how this affects students. This study addressed these gaps in two experiments. Experiment 1 first confirmed the guiding effect of an instructor's emotional expression on students, establishing the foundation for our research. It also explored the impact of consistency versus inconsistency between facial expressions and tone of voice on student motivation, cognitive load, and learning performance. Results revealed that consistent positive emotional expressions benefited motivation and transfer performance, whereas consistent negative emotional expressions benefited retention performance. Furthermore, tone of voice was a key factor in guiding students, while facial expressions were associated with students' immediate memory. Building on Experiment 1, Experiment 2 introduced visual cues to investigate the combined impact of these two guiding behaviors on students. The results regarding emotional expressions were replicated, confirming the positive effects of both. Moreover, irrelevant visual cues weakened the guiding influence of emotional expression on students, leading to the loss of relevant information. Therefore, we suggest encouraging instructors to convey positive emotions to enhance the learning experience while emphasizing key information through negative emotional expressions accompanied by visual cues.
Additionally, when the content being delivered goes beyond what is visually represented, it is advisable to minimize or conceal visual cues whenever possible.