Complex auditory scenes pose a challenge to attentive listening, rendering listeners slower and more uncertain in their perceptual decisions. How can we explain such behaviors from the dynamics of cortical networks that pertain to the control of listening behavior? We here follow up on the hypothesis that human adaptive perception in challenging listening situations is supported by modular reconfiguration of auditory-control networks in a sample of N=40 participants (13 males) who underwent resting-state and task functional magnetic resonance imaging (fMRI). Individual titration of a spatial selective auditory attention task maintained an average accuracy of ∼70% but yielded considerable inter-individual differences in listeners' response speed and reported confidence in their own perceptual decisions. Whole-brain network modularity increased from rest to task by reconfiguring auditory, cinguloopercular, and dorsal attention networks. Specifically, interconnectivity between the auditory network and cinguloopercular network decreased during the task relative to the resting state. Additionally, interconnectivity between the dorsal attention network and cinguloopercular network increased. These interconnectivity dynamics were predictive of individual differences in response confidence, the degree of which was more pronounced after incorrect judgments. Our findings uncover the behavioral relevance of functional crosstalk between auditory and attentional-control networks during metacognitive assessment of one's own perception in challenging listening situations and suggest two functionally dissociable cortical networked systems that shape the considerable metacognitive differences between individuals in adaptive listening behavior.

Significance Statement

The ability to communicate in challenging listening situations varies not only objectively between individuals but also in terms of their subjective perceptual confidence.
Using fMRI and a challenging auditory task, we demonstrate that this variability in the metacognitive aspect of listening behavior is reflected at the cortical level through the modular reconfiguration of brain networks. Importantly, task-related modulation of interconnectivity between the cinguloopercular network and each of the auditory and dorsal attention networks can explain individual differences in response confidence. This suggests two dissociable cortical networked systems that shape the individual evaluation of one's own perception during listening, promising new opportunities to better understand and intervene in deficits of auditory perception such as age-related hearing loss or auditory hallucinations.