This quasi-experimental study aimed to determine the relationship between (i) oral language ability and emotions as reflected in facial expressions, and (ii) modality of assessment (audio versus video) and the sentiment embedded in each modality. Sixty university students watched and/or listened to four selected audio-visual stimuli and orally answered follow-up comprehension questions. The stimuli were designed to evoke either happiness or sadness. Participants' facial emotions while answering were measured using FaceReader technology. In addition, four trained raters assessed the participants' responses. Analysis of the FaceReader data showed significant main and interaction effects of sentiment and modality on participants' facial emotional expression. Notably, the amount of facial emotion evoked differed significantly between (i) the happy and sad sentiment videos and (ii) the video and audio modalities. In contrast, neither the sentiment embedded in the stimuli nor the modality had a significant effect on participants' measured speaking performance. Nevertheless, we found a number of significant correlations between participants' test scores and some of the facial emotions evoked by the stimuli. Implications of these findings for the assessment of oral communication are discussed.