Abstract

Background: Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms. It is difficult to predict which tandem repeats may cause a disease. One hypothesis is that changeable tandem repeats are the source of genetic diseases, because disease-causing repeats are polymorphic in healthy individuals. However, it is not clear whether disease-causing repeats are more polymorphic than other repeats.

Methods: We performed a genome-wide survey of the millions of human tandem repeats using publicly available long-read genome sequencing data from 21 humans. We measured tandem repeat copy number changes using tandem-genotypes. Length variation of known disease-associated repeats was compared to that of other repeat loci.

Results: We found that known Mendelian disease-causing or disease-associated repeats, especially CAG and 5′UTR GGC repeats, are relatively long and polymorphic in the general population. We also show that repeat lengths of two disease-causing tandem repeats, in ATXN3 and GLS, are correlated with nearby GWAS SNP genotypes.

Conclusions: We provide a catalog of polymorphic tandem repeats across a variety of repeat unit lengths and sequences, derived from long-read sequencing data. This method, especially if used in genome-wide association studies, may indicate new candidate pathogenic or biologically important tandem repeats in human genomes.

Highlights

  • Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms

  • There are more than 30 rare Mendelian diseases caused by tandem repeat expansions in human genomes [1]

  • We found that disease-causing repeats show a different distribution from non-disease repeats (Additional file 2: Fig S1A–C)

Summary

Introduction

Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms. It is difficult to predict which tandem repeats may cause a disease. One hypothesis is that changeable tandem repeats are the source of genetic diseases, because disease-causing repeats are polymorphic in healthy individuals. There are more than 30 rare Mendelian diseases caused by tandem repeat expansions in human genomes [1]. Although tandem repeats are highly mutable and can affect phenotype, they are rarely considered in genome-wide association studies (GWAS). Because the rapid evolution of tandem repeats causes them to have only weak association with nearby polymorphisms, we may hypothesize that repeats explain these phenotypes, as represented in previous studies [6, 7]. To the best of our knowledge, no study has characterized the genotypic variation of disease-causing and other tandem repeats using only long reads.

