Abstract
Congenital heart disease (CHD) is a major cause of childhood mortality and of lifelong morbidity. CHD has a strong genetic basis: exome and short-read whole-genome sequencing (srWGS) identify genetic contributions to CHD risk in around 45% of affected individuals, while no specific genetic risk has been identified in the remaining 55%. Variation in tandem repeats (TRs), defined as repeated sequences of base pairs, has been implicated in several diseases and can affect gene expression. TR polymorphism in coding or regulatory regions of CHD-related genes can alter function and expression, thereby modifying the risk of CHD. Identifying variants in TRs has been a challenge because of multiple variants at each site and the difficulty of resolving long expansions with short-read sequencing technologies. New bioinformatic tools have enabled identification of high-confidence short-TR (STR) variants in srWGS. Additionally, long-read whole-genome sequencing (lrWGS), with read lengths of 10-20 kilobases, enables genotyping of longer TRs that are missed by srWGS. We hypothesized that patients with CHD would have a higher burden of de novo and transmitted TR variants near known CHD genes. We identified de novo and transmitted STR variants near known CHD genes in srWGS data from 1,900 CHD probands and their parents in the Pediatric Cardiac Genomics Consortium (PCGC), and from 1,611 non-CHD probands and their parents. Transmitted STR variants in CHD probands were enriched in 5' UTRs, introns, and promoters of CHD genes compared to non-CHD probands. We identified large de novo repeats near known CHD genes such as NOTCH2 and RBFOX2. Additionally, we used PacBio HiFi lrWGS on 150 PCGC probands without an established etiology for CHD. Combining these analyses, we identified 3,243 repeat regions near known CHD genes that had low read depth in srWGS and therefore could not be resolved accurately, and we assessed length polymorphisms in the lrWGS data using Tandem Repeats Finder.
TRs that harbor large variation in length were identified within intronic regions of CHD genes including FLT4, NFATC1, and PKD1. We suggest that polymorphic repeat regions provide new candidate variants for altering gene transcription or splicing and thereby contribute to CHD.
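
To make the notion of a TR length polymorphism concrete, the following is a minimal Python sketch that finds perfect tandem repeats in a sequence and compares repeat copy number between two alleles. It is only an illustration and not the method used in this study: Tandem Repeats Finder also detects approximate (imperfect) repeats, which this sketch does not. The function names, the parameter defaults, and both toy sequences are hypothetical.

```python
import re


def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    Illustrative only: perfect repeats of a short unit, found with a regex
    backreference. Imperfect repeats are not detected.
    """
    # Lazy unit match so the shortest repeating unit is reported.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


def copy_number_difference(allele_a, allele_b):
    """Difference in copies of the first repeat found in each allele."""
    a = find_tandem_repeats(allele_a)
    b = find_tandem_repeats(allele_b)
    if not a or not b or a[0][1] != b[0][1]:
        raise ValueError("alleles do not share a comparable repeat")
    return b[0][2] - a[0][2]


# Hypothetical toy alleles: the same flanks around 4 vs. 9 copies of CAG.
short_allele = "AT" + "CAG" * 4 + "TT"
long_allele = "AT" + "CAG" * 9 + "TT"

print(find_tandem_repeats(short_allele))
print(find_tandem_repeats(long_allele))
print(copy_number_difference(short_allele, long_allele))
```

In this toy case the longer allele carries five additional CAG units, a 15 bp expansion. Real expansions can exceed the length of a short read, which is why long reads are needed to size them.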