Short tandem repeats (STRs) and variable-number tandem repeats (VNTRs) are repetitive genomic sequences found throughout the genome. Expansions of these repeats are currently known to cause ∼60 diseases, and expansions at new loci linked to rare diseases continue to be discovered. Genome sequencing is an important tool for detecting disease-causing variants, and several computational tools have been developed to analyze tandem repeats from genomic data, enabling genotyping and the identification of expanded alleles. However, guidelines for conducting the analysis of these repeats and, more importantly, for assessing the findings are lacking. Understanding the tools and their technical limitations is important for accurately interpreting the results. This article provides detailed, step-by-step instructions for three key use cases in STR analysis from short-read genome sequencing data, which are also applicable to smaller VNTRs. First, we demonstrate an approach for genotyping known pathogenic loci and identifying clinically significant expansions. Second, we offer guidance on defining tandem repeat loci and conducting genome-wide genotyping studies, which is also applicable to diploid organisms other than humans. Third, we provide instructions on how to find novel expansions at loci not previously known to be associated with disease, aiding in the discovery of new pathogenic loci. Moreover, we introduce newly developed helper tools that address gaps in current methods and enable a complete, streamlined tandem repeat analysis protocol. All three protocols are compatible with the human hg19, hg38, and latest telomere-to-telomere (hs1) reference genomes. Additionally, this article provides an overview and discussion of how to interpret genotyping results. © 2024 The Author(s). Current Protocols published by Wiley Periodicals LLC.
Basic Protocol 1: Genotyping known pathogenic tandem repeat loci
Alternate Protocol: Genotyping known pathogenic tandem repeat loci with STRipy
Support Protocol 1: Installation of tools and ExpansionHunter catalog modification
Basic Protocol 2: Performing genome-wide genotyping of tandem repeats
Basic Protocol 3: Discovering de novo tandem repeat expansions
Support Protocol 2: Compiling ExpansionHunter Denovo from source code and generating STR profiles