Past studies identified rare copy number variants (CNVs) as risk factors for neurodevelopmental disorders (NDDs), including autism spectrum disorder and schizophrenia. However, NDD CNVs remain clinically undercharacterized in population cohorts unselected for neuropsychiatric disorders and in cohorts of diverse ancestry. This study aimed to identify individuals harboring NDD CNVs in a multiancestry biobank and to assess their enrichment for select neuropsychiatric disorders as well as their association with multiple medical disorders. In a series of phenotypic enrichment and association analyses, NDD CNVs were clinically characterized among 24 877 participants in the BioMe biobank, an electronic health record-linked biobank derived from the Mount Sinai Health System, New York, New York. Participants have been recruited into the biobank since September 2007 across diverse ancestry and medical and neuropsychiatric specialties. For the current analyses, electronic health record data were analyzed from May 2004 through May 2019. NDD CNVs were identified using a consensus of 2 CNV calling algorithms, based on whole-exome sequencing and genotype array data, followed by novel in silico clinical assessments. Of 24 877 participants, 14 586 (58.7%) were female; self-reported ancestry categories included 5965 (24.0%) who were of African ancestry, 7892 (31.7%) who were of European ancestry, and 8536 (34.3%) who were of Hispanic ancestry; and the mean (SD) age was 50.5 (17.3) years. Among 24 877 individuals, the prevalence of 64 NDD CNVs was 2.5% (n = 627), with prevalence varying by locus, corroborating the presence of some relatively highly prevalent NDD CNVs (eg, 15q11.2 deletion/duplication). An aggregate set of NDD CNVs was enriched for congenital disorders (odds ratio, 2.0; 95% CI, 1.1-3.5; P = .01) and major depressive disorder (odds ratio, 1.5; 95% CI, 1.1-2.0; P = .01).
In a meta-analysis of medical diagnoses (n = 195 hierarchically clustered diagnostic codes), NDD CNVs were significantly associated with several medical outcomes, including essential hypertension (z score = 3.6; P = 2.8 × 10⁻⁴), kidney failure (z score = 3.3; P = 1.1 × 10⁻³), and obstructive sleep apnea (z score = 3.4; P = 8.1 × 10⁻⁴); in another analysis, they were also associated with morbid obesity (z score = 3.8; P = 1.3 × 10⁻⁴). Further, NDD CNVs were associated with increased body mass index in a multiancestry analysis (β = 0.19; 95% CI, 0.10-0.31; P = .003). For 36 common serum tests, there was no association with NDD CNVs. Clinical features of individuals harboring NDD CNVs were elucidated in a large-scale, multiancestry biobank, identifying enrichments for congenital disorders and major depressive disorder as well as associations with several medical outcomes, including hypertension, kidney failure, obesity, and obesity-related phenotypes, specifically obstructive sleep apnea and increased body mass index. The association between NDD CNVs and obesity outcomes indicates further potential pleiotropy of NDD CNVs beyond the neurodevelopmental outcomes previously reported. Future clinical genetic investigations may lead to insights into at-risk individuals and to therapeutic strategies targeting specific genetic variants. These findings underscore the importance of diverse inclusion within biobanks and of considering the effect of rare genetic variants in a multiancestry context.
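As a reading aid for the enrichment results above, the following minimal sketch shows how an odds ratio and a Wald 95% CI can be computed from a 2 × 2 table of carrier status by diagnosis. The abstract does not state which estimator the study used, so the Wald interval is an assumption made here for illustration. The function name `odds_ratio_wald_ci` and the cell counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: carriers with the diagnosis      b: carriers without it
    c: noncarriers with the diagnosis   d: noncarriers without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts chosen only so that the odds ratio equals 2.0;
# they are NOT data from the BioMe biobank.
or_, lo, hi = odds_ratio_wald_ci(a=20, b=80, c=100, d=800)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An interval whose lower bound exceeds 1.0, as in the reported enrichments, indicates that the odds of the diagnosis are higher among carriers at the chosen confidence level.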