Cross-modal conflicts arise when information from different sensory modalities is incongruent. Most previous studies of audiovisual cross-modal conflict have focused on visual targets with auditory distractors, and only a few have focused on auditory targets with visual distractors. Moreover, no study has investigated how the impact of visual cross-modal conflict differs between semantic and nonsemantic competition, or the neural basis of that difference. This cross-sectional study aimed to characterize the impact of 2 types of visual cross-modal conflict, with semantic and nonsemantic distractors, using a working memory task and the associated brain activity. The participants were 33 healthy, right-handed, young male adults. The paced auditory serial addition test was performed under 3 conditions: a no-distractor condition and 2 visual distractor conditions (nonsemantic and semantic). Symbols and numbers were used as nonsemantic and semantic distractors, respectively. The oxygenated hemoglobin (Oxy-Hb) concentration in the frontoparietal regions, namely the bilateral ventrolateral prefrontal cortex (VLPFC), dorsolateral prefrontal cortex, and inferior parietal cortex (IPC), was measured during the task under each condition. Paced auditory serial addition test performance was significantly lower in both distractor conditions than in the no-distractor condition, with no significant difference between the 2 distractor conditions. For brain activity, a significantly increased Oxy-Hb concentration in the right VLPFC was observed only in the nonsemantic distractor condition (corrected P = .015; Cohen d = .46). The changes in Oxy-Hb in the bilateral IPC were positively correlated with changes in task performance in both visual cross-modal distractor conditions. Visual cross-modal conflict significantly impairs auditory working memory task performance, regardless of whether the distractors are semantic or nonsemantic.
The right VLPFC may be crucial for inhibiting visual nonsemantic information in cross-modal conflict situations, and the bilateral IPC may be closely linked with the inhibition of visual cross-modal distractors, regardless of whether they are semantic or nonsemantic.