Interference by distractors has repeatedly been associated with diminished visual and auditory working memory (WM) performance. Negative emotional distractors in particular have detrimental effects on WM. However, these associations have only been observed when the distractors and the items to be maintained in WM come from the same sensory modality. In this study, we investigate cross-modal interference on WM. We invited 20 participants to complete a visual change-detection task assessing visual WM (VWM) while they heard emotional (fearful) and neutral auditory distractors. Electrophysiological activity was recorded to measure contralateral delay activity (CDA) and the auditory P2 event-related potential (ERP), indexing WM maintenance and distractor salience, respectively. At the behavioral level, fearful prosody did not significantly decrease WM accuracy compared to neutral prosody. Regarding ERPs, fearful distractors evoked a greater P2 amplitude than neutral distractors. Correlations between the two ERP components indicated that the difference in P2 amplitude between the two types of prosody was associated with the difference in CDA amplitude between fearful and neutral trials. This association suggests that the cognitive resources required to process fearful prosody detrimentally impact VWM maintenance. This result provides additional evidence that negative emotional stimuli produce greater interference than neutral stimuli and that the cognitive resources used to process stimuli from different modalities come from a common pool.