Abstract

Background: The accurate measurement of serum steroid hormone levels is important in many endocrinologic diseases, including but not limited to congenital adrenal hyperplasia (CAH). Screening for CAH with 17-OH-progesterone immunoassays is known to yield many false-positive results. Furthermore, the differentiation between classical CAH and non-classical CAH subtypes requires complex steroid profiling. We developed a simultaneous multiplex LC-MS/MS assay for 20 steroid hormones, evaluated its analytical performance, and established preliminary pediatric reference intervals for the evaluated hormones.

Methods: Briefly, sample preparation consists of mixing the serum sample with internal standard and acetonitrile, followed by lipid extraction with methyl-tert-butyl ether. Derivatization is conducted using hydroxylamine. The LC-MS/MS system consists of an ACQUITY UPLC I-Class System (Waters Corporation, Milford, MA, USA) equipped with a Waters ACQUITY UPLC BEH C18 column (1.0×50 mm, 1.7 µm; Waters Corporation) and a Xevo TQ-XS (Waters Corporation) MS/MS system. Linearity, within- and between-run precision, carryover, lower limits of detection (LOD) and quantification (LOQ), and matrix effect/recovery were evaluated for aldosterone, cortisol, cortisone, 21-deoxycortisol, corticosterone, 11β-OH-testosterone, 11-ketotestosterone, 11-ketoandrostenedione, 11β-OH-androstenedione, estrone, 11-deoxycortisol, 17-OH-pregnenolone, 11β-OH-progesterone, dehydroepiandrosterone (DHEA), androstenedione, 11-deoxycorticosterone, testosterone, 17-OH-progesterone, 5α-dihydrotestosterone, pregnenolone, and progesterone. Depending on the analyte, approximately 40 residual pediatric samples (age <9 years) were used to calculate the 2.5th and 97.5th percentiles in order to establish reference intervals.

Results: Linearity evaluation showed correlation coefficients (R²) of >0.9900 for 19/20 hormones, and 0.9853 for corticosterone.
Within-run precision (%CV) was 2.24-13.04%, 3.30-12.27%, and 2.63-9.67% for low, medium, and high steroid hormone levels, respectively. Between-run precision (%CV) was 3.91-23.86%, 2.97-27.35%, and 3.95-23.67% for low, medium, and high steroid hormone levels, respectively. Carryover was <5% for 18/20 hormones, 6.61% for pregnenolone, and 8.24% for progesterone. LOD ranged from 0.005 to 0.25 ng/mL, and LOQ ranged from 0.01 to 0.5 ng/mL for all 20 hormones. Matrix effect ranged from 99.0% to 110.6% for all 20 hormones, and recovery was 69.4-88.1% for 19/20 hormones, and 59.7% for aldosterone. Preliminary reference intervals (2.5th-97.5th percentiles) were successfully established for 18/20 evaluated hormones (ng/mL): aldosterone (0.010-0.186), cortisol (23.4-80.8), cortisone (5.82-24.8), 21-deoxycortisol (0.130-0.226), corticosterone (0.088-1.501), 11β-OH-testosterone (0.001-0.021), 11-ketotestosterone (0.031-0.390), 11-ketoandrostenedione (0.031-0.390), 11β-OH-androstenedione (0.024-0.279), 11-deoxycortisol (0.040-0.701), 17-OH-pregnenolone (0.112-1.994), 11β-OH-progesterone (0.001-0.316), DHEA (0.018-1.452), androstenedione (0.010-0.160), 11-deoxycorticosterone (0.016-0.175), testosterone (0.015-0.096), 17-OH-progesterone (0.020-0.251), pregnenolone (0.031-0.215), and progesterone (0.010-0.048).

Conclusions: The LC-MS/MS-based multiplex steroid hormone assay showed robust analytical performance for all evaluated parameters, and can be used to reliably measure 20 steroid hormones with good sensitivity and specificity. This assay can be utilized in the clinical laboratory for the accurate diagnosis of various endocrinologic diseases, including rare subtypes of CAH, which can ultimately lead to better clinical outcomes.
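
Two of the quantities reported above, precision as %CV and reference intervals as the 2.5th and 97.5th percentiles, can be illustrated with a minimal Python sketch. The abstract does not state which percentile estimator or software was used, so the nonparametric linear-interpolation method here is an assumption, and the function names `percent_cv` and `reference_interval` are hypothetical. The replicate values in the usage lines are made up for illustration and are not study data.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    mean = statistics.fmean(values)
    return 100.0 * statistics.stdev(values) / mean


def reference_interval(values, lower=2.5, upper=97.5):
    """Nonparametric percentile limits (assumed estimator, not from the source).

    Each percentile is obtained by linear interpolation between the two
    nearest order statistics of the sorted data.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")

    def percentile(p):
        rank = (len(data) - 1) * p / 100.0   # fractional index into sorted data
        below = int(rank)                    # index of the lower neighbour
        above = min(below + 1, len(data) - 1)
        return data[below] + (data[above] - data[below]) * (rank - below)

    return percentile(lower), percentile(upper)


# Illustrative usage with made-up replicate measurements (ng/mL).
replicates = [0.95, 1.02, 1.00, 0.98, 1.05]
cv = percent_cv(replicates)
low_limit, high_limit = reference_interval(replicates)
```

With only about 40 samples per analyte, the extreme percentiles rest on very few observations, which is consistent with the abstract describing the intervals as preliminary.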