Auditory evoked potentials (AEPs) play an important role in evaluating hearing in infants and others who are unable to participate reliably in behavioral testing. Discriminating the AEP from the much larger background activity, however, can be challenging and time-consuming, especially when several AEP measurements are needed, as is the case for audiogram estimation. This task is usually entrusted to clinicians, who visually inspect the AEP waveforms to determine whether a response is present or absent. The drawback is that this introduces a subjective element to the test, compromising quality control of the examination. Various objective methods have therefore been developed to aid clinicians with response detection. In recent work, the authors introduced Gaussian processes (GPs) with active learning for hearing threshold estimation using auditory brainstem responses (ABRs). The GP is attractive for this task, as it can exploit the correlation structure underlying AEP waveforms across different stimulus levels and frequencies, which is often overlooked by conventional detection methods. GPs with active learning previously proved effective for ABR hearing threshold estimation in simulations, but have not yet been evaluated for audiogram estimation in subject data. The present work evaluates GPs with active learning for ABR audiogram estimation in a sample of normal-hearing and hearing-impaired adults. This involves introducing an additional dimension to the GP (i.e., stimulus frequency) along with real-time implementations and active learning rules for automated stimulus selection. The GP's accuracy was evaluated using the "hearing threshold estimation error," defined as the difference between the GP-estimated hearing threshold and the behavioral hearing threshold for the same stimuli. Test time was evaluated using the number of preprocessed and artifact-free epochs (i.e., the sample size) required for locating the hearing threshold at each frequency.
Comparisons were drawn with visual inspection by examiners who followed strict guidelines provided by the British Society of Audiology. Twenty-two normal-hearing and nine hearing-impaired adults were tested (one ear per subject). For each subject, the audiogram was estimated three times: once using the GP approach, once using visual inspection by examiners, and once using a standard behavioral hearing test. The GP's median estimation error was approximately 0 dB hearing level (dB HL), demonstrating unbiased test performance relative to the behavioral hearing thresholds. The GP additionally reduced test time by approximately 50% relative to the examiners. The hearing thresholds estimated by the examiners were 5 to 15 dB HL higher than the behavioral thresholds, which is consistent with the literature. Further testing is still needed to determine the extent to which these results generalize to the clinic. GPs with active learning enable automatic, real-time ABR audiogram estimation with relatively low test time and high accuracy. The GP could be used to automate ABR audiogram estimation or to guide clinicians in this task; clinicians may override the GP's decisions if deemed necessary. Results suggest that GPs hold potential for next-generation ABR hearing threshold and audiogram-seeking devices.
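As an illustration of the accuracy metric described above, the following minimal sketch computes the hearing threshold estimation error as the estimated threshold minus the behavioral threshold, and then takes the median across measurements. The function name `estimation_errors` and all threshold values are hypothetical and chosen only for illustration; they are not data from the study, and the sketch does not reproduce the authors' analysis pipeline.

```python
from statistics import median


def estimation_errors(estimated_db_hl, behavioral_db_hl):
    """Return per-measurement errors: estimated minus behavioral threshold."""
    if len(estimated_db_hl) != len(behavioral_db_hl):
        raise ValueError("threshold lists must have the same length")
    return [est - beh for est, beh in zip(estimated_db_hl, behavioral_db_hl)]


# Hypothetical thresholds in dB HL (illustrative values, not study data).
gp_thresholds = [20, 35, 10, 50, 25]
behavioral_thresholds = [15, 35, 15, 45, 30]

errors = estimation_errors(gp_thresholds, behavioral_thresholds)
median_error = median(errors)

print("errors:", errors)
print("median error:", median_error)
```

Under this sign convention, a positive error means the estimated threshold lies above the behavioral threshold, and a median near zero indicates no systematic offset in either direction.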