The aim of this study is to provide developmental standards on a variety of temporal, spectral, and binaural psychoacoustic (auditory processing [AP]) tests in typically developing children, including immediate and delayed retest reliability, and comparisons of individual listeners' performance across different tests. This study also informs the selection of tests for clinical evaluation of hearing and listening (e.g., for auditory processing disorder). This is a laboratory-based study of AP thresholds and their variability in 75 children, aged 6 to 11 yrs, and 21 young adults with normal audiometry, recruited from local schools and colleges. Data were gathered in clinic-like conditions, without training, across two sessions. Eleven individual tests (e.g., simultaneous masking and backward masking [BM], amplitude modulation [AM], and frequency modulation [FM] detection) and three derived measures (temporal integration, frequency resolution, masking level difference) were embedded within a suite of computer games, each employing a three-interval, three-alternative (odd-one-out) forced choice response paradigm and a staircase adaptive method. AP measures generally showed lower thresholds and reduced variance with increasing age. At 6 to 7 yrs, performance was markedly poorer than in the older groups; 35% of the children could not do the test of frequency discrimination (FD). However, on all the tasks, some children in the same age group performed at near-adult levels. The distribution of performance between individuals varied widely across tasks, with clustered performance on tests of tone detection (with or without a simultaneous masker) and AM detection, and scattered performance on BM, FM detection, and FD. Threshold maturity was achieved at different rates across tests and by 10 to 11 yrs of age on all tests except FD. Masking level difference (MLD) performance did not change with age.
Retest reliability was mostly high within test sessions but, again, was poorer for some of the younger children. Between test sessions separated by one to several weeks, reliability varied from poor (for FM detection) to high (for long tone detection in quiet, BM, and FD). Correlations between thresholds on different tests were generally low. The data suggest that the perception of different auditory stimuli occurs and develops using rather independent mechanisms, even for tasks that are closely related in procedure. While individual children can perform reliably on several distinct tasks, differences between individuals on the same tasks can be large. Because some of the youngest children perform reliably across time, at or near adult levels, the immaturity between 6 and 11 yrs of age seen in group statistics reflects poor performance by some individual children rather than obligate, age-related deficits in AP. While several of the tests used were found to have potential clinical applicability, because of their reliability and ability to distinguish between individuals, it is currently unclear how performance on such tests relates to everyday listening skills.
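For readers unfamiliar with the procedure, the three-interval, three-alternative forced choice staircase mentioned above can be illustrated with a minimal sketch. The details here are assumptions made for illustration and are not taken from the study: a 3-down, 1-up rule (which, in general, converges near 79% correct), the starting level, step sizes, number of reversals, and the logistic model of a simulated listener. The names `make_listener` and `run_staircase` are hypothetical.

```python
import math
import random
from typing import Callable, List


def make_listener(threshold: float, slope: float,
                  rng: random.Random) -> Callable[[float], bool]:
    """Hypothetical simulated listener for a 3-alternative odd-one-out task.

    Assumption: detection follows a logistic function of stimulus level,
    and a listener who does not detect the target guesses (chance = 1/3).
    """
    def respond(level: float) -> bool:
        p_detect = 1.0 / (1.0 + math.exp(-(level - threshold) / slope))
        p_correct = 1.0 / 3.0 + (2.0 / 3.0) * p_detect
        return rng.random() < p_correct
    return respond


def run_staircase(respond: Callable[[float], bool],
                  start_level: float = 40.0,
                  step: float = 4.0,
                  min_step: float = 1.0,
                  n_down: int = 3,
                  max_reversals: int = 8,
                  max_trials: int = 200) -> float:
    """Assumed n-down, 1-up adaptive track; returns a threshold estimate.

    The level drops after n_down consecutive correct responses and rises
    after any incorrect response. The step is halved at each reversal
    (down to min_step). The estimate is the mean of the last four
    reversal levels, or the final level if no reversal occurred.
    """
    level = start_level
    correct_run = 0
    direction = 0  # -1 descending, +1 ascending, 0 before the first move
    reversals: List[float] = []

    for _ in range(max_trials):
        if len(reversals) >= max_reversals:
            break
        if respond(level):
            correct_run += 1
            if correct_run < n_down:
                continue  # stay at this level until the run is complete
            correct_run = 0
            new_direction = -1
        else:
            correct_run = 0
            new_direction = 1
        if direction != 0 and new_direction != direction:
            reversals.append(level)
            step = max(min_step, step / 2.0)
        direction = new_direction
        level += new_direction * step

    if not reversals:
        return level
    tail = reversals[-4:]
    return sum(tail) / len(tail)


# Example run with a seeded, simulated listener (all values are illustrative).
listener = make_listener(threshold=20.0, slope=2.0, rng=random.Random(1))
estimate = run_staircase(listener)
print(f"estimated threshold: {estimate:.1f}")
```

Because each track ends after a fixed number of reversals, the same function can be run twice within a session or again in a later session, so estimates can be compared in the way the retest analyses require.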