Clinical short-read exome and genome sequencing have improved diagnostic testing for rare diseases. Yet technical limitations of short reads hamper the detection of disease-associated variation in complex regions of the genome. Long-read sequencing (LRS) technologies may overcome these challenges, potentially qualifying as a first-tier test for all rare diseases. To test this hypothesis, we performed LRS (30× high-fidelity [HiFi] genomes) on 100 samples carrying 145 known clinically relevant germline variants that are challenging to detect using short-read sequencing and that necessitate a broad range of complementary test modalities in diagnostic laboratories. We show that relevant variant callers readily re-identified the majority of variants (120/145, 83%), including ∼90% of structural variants, single-nucleotide variants (SNVs) and insertions or deletions (indels) in homologous sequences, and expansions of short tandem repeats. Another 10% (n = 14) were visually apparent in the data but not automatically detected. Our analyses also identified systematic challenges for the remaining 7% (n = 11) of variants, such as the detection of AG-rich repeat expansions. Titration analysis showed that 90% of all automatically called variants could also be identified at 15-fold coverage. Long-read genomes thus identified 93% of the challenging pathogenic variants in our dataset, either automatically or by visual inspection. Even with reduced coverage, the vast majority of variants remained detectable, which may enable more cost-effective diagnostic implementation. Most importantly, we show the potential of a single technology to accurately identify all types of clinically relevant variants.