Abstract
Background: Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms. It is difficult to predict which tandem repeats may cause a disease. One hypothesis is that changeable tandem repeats are the source of genetic diseases, because disease-causing repeats are polymorphic in healthy individuals. However, it is not clear whether disease-causing repeats are more polymorphic than other repeats.
Methods: We performed a genome-wide survey of the millions of human tandem repeats using publicly available long-read genome sequencing data from 21 humans. We measured tandem repeat copy number changes using tandem-genotypes. Length variation of known disease-associated repeats was compared to that of other repeat loci.
Results: We found that known Mendelian disease-causing or disease-associated repeats, especially CAG and 5′UTR GGC repeats, are relatively long and polymorphic in the general population. We also show that repeat lengths of two disease-causing tandem repeats, in ATXN3 and GLS, are correlated with nearby GWAS SNP genotypes.
Conclusions: We provide a catalog of polymorphic tandem repeats across a variety of repeat unit lengths and sequences, from long-read sequencing data. This method, especially if used in genome-wide association studies, may indicate new candidate pathogenic or biologically important tandem repeats in human genomes.
Highlights
Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms
There are more than 30 rare Mendelian diseases caused by tandem repeat expansions in human genomes [1]
We found that disease-causing repeats show a different distribution from other, non-disease repeats (Additional file 2: Fig. S1A–C)
Summary
Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms. It is difficult to predict which tandem repeats may cause a disease. One hypothesis is that changeable tandem repeats are the source of genetic diseases, because disease-causing repeats are polymorphic in healthy individuals. There are more than 30 rare Mendelian diseases caused by tandem repeat expansions in human genomes [1]. Although tandem repeats are highly mutable and can affect phenotype, they are rarely considered in genome-wide association studies (GWAS). Because the rapid evolution of tandem repeats gives them only weak association with nearby polymorphisms, we may hypothesize that the repeats themselves explain such phenotypes, as proposed in previous studies [6, 7]. To the best of our knowledge, there has been no study that characterizes the genotypic variation of disease-causing and other tandem repeats using only long reads.