Using the high temporal resolution of event-related potentials (ERPs), we compared the time course of processing incongruent color versus 3D-depth information. Participants were asked to judge whether the color of a food (color condition) or its 3D structure (3D-depth condition) was congruent or incongruent with their prior knowledge and experience. The behavioral results showed that reaction times were longer in the 3D-depth condition than in the color condition, for both congruent and incongruent stimuli. The ERP results showed that incongruent color stimuli induced a larger N270, a larger P300, and a smaller N400 in the fronto-central region than congruent color stimuli. Incongruent 3D-depth stimuli induced a smaller N1 in the occipital region, and a larger P300 and a smaller N400 in the parietal-occipital region, than congruent 3D-depth stimuli. The time–frequency analysis showed that incongruent color stimuli induced larger theta-band activation (360–580 ms) in the fronto-central region than congruent color stimuli. Incongruent 3D-depth stimuli induced larger alpha- and beta-band activation (240–350 ms) in the parietal region than congruent 3D-depth stimuli. Our results suggest that the human brain processes violations of general color knowledge and of general depth knowledge along different time courses. We speculate that resolving the depth perception conflict relies mainly on visual processing, whereas resolving the color perception conflict relies mainly on processing the semantic violation.