Background: Autism Spectrum Disorder (ASD) affects 1% of the world's population and has become a pressing medical and social problem worldwide. As a paradigmatic complex genetic disease, ASD has been intensively studied, and thousands of gene mutations have been reported. Because these variants rarely recur, major challenges remain: 1) evaluating the disease association of individual variants; 2) pinpointing driver (pathogenic) events within a huge pool of passengers (random mutations); 3) replicating independent studies; and 4) verifying their results systematically. Important findings have been made, yet these studies have not produced a systematic and precise understanding of autism biology.

Methods: To address these challenges, we employed a holographic approach and analyzed these data in the hierarchical context of multiple annotation levels, i.e. variant, gene and pathway. Previously, mutations/variants were divided into one-dimensional groups according to their molecular effect (e.g. loss of function, missense, synonymous), their target gene (e.g. CHD8, GRIN2B, SCN2A) or their functional annotation (e.g. FMRP targets, synaptic genes or schizophrenia-related genes). Compared with this classical one-level approach, the multi-level approach provides a more sophisticated and relevant classification and prioritization of de novo (DN) mutations, which makes new and more powerful analyses possible with these rare events. Note that the multi-level approach is a generic strategy, which is used in several types of analysis in this study, e.g. variant prioritization, association analysis and function analysis.

Results: We analyzed thousands of ASD whole-exome mutation profiles. These mutations do not recur or replicate at the variant level, but they do so significantly and increasingly at the gene and pathway levels. Genetic association analysis reveals a novel gene+pathway dual-hit model, in which the mutation burden becomes less relevant.
In multiple independent analyses, hundreds of variants or genes repeatedly converge on several canonical pathways, some novel and some literature-supported. These pathways define relevant, recurrent and systematic ASD biology, distinct from previously reported gene groups or networks. They also present a catalog of novel ASD risk factors, including 118 variants and 72 genes. At the sub-pathway level, most variants disrupt the pathway-related functions of their genes, and variants in the same gene tend to hit residues that are extremely close together and in the same domain. Multiple interacting variants spotlight key modules, e.g. the cAMP second-messenger system and the regulation of mGluR signaling by GRKs (G protein-coupled receptor kinases). At the super-pathway level, distinct pathways further interconnect and converge on three biological themes, i.e. synaptic function, morphology and plasticity.

Discussion: First, we inferred ASD genetic causes using a novel sequential prioritization procedure, which can be applied to other complex diseases. Second, we performed a multi-level recurrence analysis, revealing that ASD genetics may be systematically replicated across independent cohorts even though the mutations themselves rarely recur. Third, we dissected ASD genetic association across multiple levels and built a more inclusive gene+pathway dual-hit model. Finally, we reconstructed replicable, systematic and interpretable molecular mechanisms for ASD with relevant context and details. The full manuscript, including all figures, tables and supplementary data, is posted online at http://biorxiv.org/content/early/2016/05/11/052878.
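
The multi-level recurrence idea can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the cohort names, variant and gene identifiers, the gene-to-pathway map and the helper functions below are all hypothetical placeholders, chosen only to show how mutations that never recur at the variant level can still replicate across cohorts at the gene and pathway levels.

```python
# Hypothetical toy data: one (cohort, variant_id, gene) tuple per de novo mutation.
MUTATIONS = [
    ("cohort_A", "var1", "GENE1"),
    ("cohort_A", "var2", "GENE2"),
    ("cohort_B", "var3", "GENE1"),
    ("cohort_B", "var4", "GENE3"),
    ("cohort_B", "var5", "GENE4"),
]

# Hypothetical gene-to-pathway annotation (placeholder names, not real pathways).
GENE_TO_PATHWAY = {
    "GENE1": "pathway_X",
    "GENE2": "pathway_X",
    "GENE3": "pathway_X",
    "GENE4": "pathway_Y",
}


def unit_of(mutation, level):
    """Map a mutation to its annotation unit at the requested level."""
    _cohort, variant, gene = mutation
    if level == "variant":
        return variant
    if level == "gene":
        return gene
    if level == "pathway":
        return GENE_TO_PATHWAY.get(gene)  # None if the gene is unannotated
    raise ValueError(f"unknown level: {level}")


def recurrent_units(mutations, level):
    """Return the units at this level that are hit in more than one cohort."""
    cohorts_by_unit = {}
    for mutation in mutations:
        unit = unit_of(mutation, level)
        if unit is None:
            continue
        cohorts_by_unit.setdefault(unit, set()).add(mutation[0])
    return {unit for unit, cohorts in cohorts_by_unit.items() if len(cohorts) > 1}


def replicated_fraction(mutations, level):
    """Fraction of mutations that fall in a unit replicated across cohorts."""
    replicated = recurrent_units(mutations, level)
    hits = sum(1 for m in mutations if unit_of(m, level) in replicated)
    return hits / len(mutations)


if __name__ == "__main__":
    for level in ("variant", "gene", "pathway"):
        print(level, sorted(recurrent_units(MUTATIONS, level)),
              replicated_fraction(MUTATIONS, level))
```

In this toy set no variant is seen twice, one gene is hit in both cohorts, and four of the five mutations fall in a pathway hit in both cohorts. A real analysis would also need a statistical test against a background mutation model, which this sketch omits.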