Structure and transcription of integrated HPV DNA in vulvar carcinomas

HPV infections are associated with a fraction of vulvar cancers. Through hybridization capture and DNA sequencing, HPV DNA was detected in five of thirteen vulvar cancers. HPV16 DNA was integrated into human DNA in three of the five. The insertions were in introns of human NCKAP1, C5orf67, and LRP1B. Integrations in NCKAP1 and C5orf67 were flanked by short direct repeats in the human DNA, consistent with HPV DNA insertions at sites of abortive, staggered, endonucleolytic incisions. The insertion in C5orf67 was present as a 36 kbp, human-HPV-hetero-catemeric DNA as either an extrachromosomal circle or a tandem repeat within the human genome. The human circularization/repeat junction was defined at single nucleotide resolution. The integrated viral DNA segments all retained an intact upstream regulatory region and the adjacent viral E6 and E7 oncogenes. RNA sequencing revealed that the only HPV genes consistently transcribed from the integrated viral DNAs were E7 and E6*I. The other two HPV DNA+ tumors had coinfections, but no evidence for integration. HPV-positive and HPV-negative vulvar cancers exhibited contrasting human, global gene expression patterns partially overlapping with previously observed differences between HPV-positive and HPV-negative cervical and oropharyngeal cancers. A substantial fraction of the differentially expressed genes involved immune system function. Thus, transcription and HPV DNA integration in vulvar cancers resemble those in other HPV-positive cancers. This study emphasizes the power of hybridization capture coupled with DNA and RNA sequencing to identify a broad spectrum of HPV types, determine human genome integration status of viral DNAs, and elucidate their structures.
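The flanking direct repeats mentioned above can be illustrated with a minimal sketch. It assumes that the human sequence immediately to the left and right of an insertion is already available as two strings; the function name `flanking_direct_repeat` and the `max_len` cutoff are hypothetical and are not taken from the study's pipeline. The idea is that a staggered cut followed by fill-in leaves the same short human sequence on both sides of the insert, so the end of the left flank matches the start of the right flank.

```python
def flanking_direct_repeat(left_flank: str, right_flank: str, max_len: int = 50) -> str:
    """Return the longest sequence that ends left_flank and also begins right_flank.

    left_flank:  human bases immediately upstream of the inserted DNA (assumed input)
    right_flank: human bases immediately downstream of the inserted DNA (assumed input)
    max_len:     hypothetical upper bound on the repeat length to consider
    An empty string means no direct repeat was found.
    """
    left = left_flank.upper()
    right = right_flank.upper()
    limit = min(len(left), len(right), max_len)
    # Try the longest candidate first so the first match is the longest repeat.
    for k in range(limit, 0, -1):
        if left[-k:] == right[:k]:
            return left[-k:]
    return ""


if __name__ == "__main__":
    # Toy sequences, not data from the study.
    print(flanking_direct_repeat("GGCATACGT", "ACGTTTGA"))
```

This exact-match check is only illustrative. Real junction calls would also need to handle sequencing errors and microhomology between viral and human DNA.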

Open Access Just Published
Pharmacogenomics of coronary artery response to intravenous gamma globulin in Kawasaki disease

Kawasaki disease (KD) is a multisystem inflammatory illness of infants and young children that can result in acute vasculitis. The mechanism of coronary artery aneurysms (CAA) in KD despite intravenous gamma globulin (IVIG) treatment is not known. We performed a Whole Genome Sequencing (WGS) association analysis in a racially diverse cohort of KD patients treated with IVIG, with diagnosis and treatment both following AHA guidelines. We defined coronary aneurysm (CAA) (N = 234) as coronary z ≥ 2.5 and large coronary aneurysm (CAA/L) (N = 92) as z ≥ 5.0. We fitted logistic regression models to examine the association of genetic variants with CAA/L during acute KD and with persistence >6 weeks, using an additive model between cases and 238 controls with no CAA. We adjusted for age, gender and three principal components of genetic ancestry. The most significant variants associated with CAA/L were in intergenic regions (rs62154092 was the most significant, p < 6.32E–08). Variants in SMAT4, LOC100127, PTPRD, TCAF2 and KLRC2 were the most significant non-intergenic SNPs. Functional mapping and annotation (FUMA) analysis identified 12 genomic risk loci with eQTL or chromatin interactions mapped to 48 genes. Of these, NDUFA5 has been implicated in KD CAA, and MICU and ZMAT4 have potential functional implications. A genetic risk score using these 12 genomic risk loci yielded an area under the receiver operating characteristic curve (AUC) of 0.86. This pharmacogenomics study provides insights into the pathogenesis of CAA/L in IVIG-treated KD and shows that genomics can help define the cause of CAA/L to guide management and improve risk stratification of KD patients.
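As a rough illustration of two quantities in this abstract, the sketch below applies the stated z-score cutoffs and computes an AUC for a generic risk score. The function names are hypothetical, the "no CAA" label for z < 2.5 is an assumption made for this example, and the rank-based AUC shown here is a standard formulation rather than the authors' confirmed procedure.

```python
def classify_coronary(z: float) -> str:
    """Label a coronary z-score using the cutoffs quoted in the abstract."""
    if z >= 5.0:
        return "CAA/L"   # large coronary aneurysm
    if z >= 2.5:
        return "CAA"     # coronary aneurysm
    return "no CAA"      # assumption: everything below 2.5 is treated as no CAA


def auc(case_scores, control_scores) -> float:
    """Probability that a random case scores higher than a random control.

    Ties count as one half. This equals the area under the ROC curve.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Toy numbers, not data from the study.
    print(classify_coronary(5.2), classify_coronary(3.0), classify_coronary(1.1))
    print(auc([0.9, 0.7, 0.6], [0.65, 0.3, 0.2]))
```

The pairwise loop is quadratic in cohort size, which is acceptable for a few hundred patients. A sorting-based rank formula would be the usual choice for larger data.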

Open Access
Analysis of cell free DNA to predict outcome to bevacizumab therapy in colorectal cancer patients

To predict outcome to combination bevacizumab (BVZ) therapy, we employed cell-free DNA (cfDNA) to determine chromosomal instability (CIN), nucleosome footprints (NF) and methylation profiles in metastatic colorectal cancer (mCRC) patients. Low-coverage whole-genome sequencing (LC-WGS) was performed on matched tumor and plasma samples, collected from 74 mCRC patients from the AC-ANGIOPREDICT Phase II trial (NCT01822444), and analysed for CIN and NFs. A validation cohort of plasma samples from the University Medical Center Mannheim (UMM) was similarly profiled. 61 AC-ANGIOPREDICT plasma samples collected before and following BVZ treatment were selected for targeted methylation sequencing. Using cfDNA CIN profiles, AC-ANGIOPREDICT samples were subtyped with 92.3% accuracy into low and high CIN clusters, with good concordance observed between matched plasma and tumor. Improved survival was observed in CIN-high patients. Plasma-based CIN clustering was validated in the UMM cohort. Methylation profiling identified differences between CIN-low and CIN-high samples (AUC = 0.87). Moreover, a significant decrease in methylation score following BVZ was associated with improved outcome (p = 0.013). Analysis of CIN, NFs and methylation profiles from cfDNA in plasma samples facilitates stratification into CIN clusters that inform patient response to treatment.

Open Access
An efficient molecular genetic testing strategy for incontinentia pigmenti based on single-tube long fragment read sequencing

Incontinentia pigmenti (IP) is a rare X-linked dominant neuroectodermal dysplasia that primarily affects females. The only known causative gene is IKBKG, and the most common genetic cause is the recurrent IKBKG△4–10 deletion resulting from recombination between two MER67B repeats. Detection of variants in IKBKG is challenging due to the presence of a highly homologous non-pathogenic pseudogene IKBKGP1. In this study, we successfully identified four pathogenic variants in four IP patients using a strategy based on single-tube long fragment read (stLFR) sequencing with a specialized analysis pipeline. Three frameshift variants (c.519-3_519dupCAGG, c.1167dupC, and c.700dupT) were identified and subsequently validated by Sanger sequencing. Notably, c.519-3_519dupCAGG was found in both IKBKG and IKBKGP1, whereas the other two variants were only detected in the functional gene. The IKBKG△4–10 deletion was identified and confirmed in one patient. These results demonstrate that the proposed strategy can identify potential pathogenic variants and distinguish whether they are derived from IKBKG or its pseudogene. Thus, this strategy can be an efficient genetic testing method for IKBKG. By providing a comprehensive understanding of the whole genome, it may also enable the exploration of other genes potentially associated with IP. Furthermore, the strategy may also provide insights into other diseases with detection challenges due to pseudogenes.

Open Access
Structure-based network analysis predicts pathogenic variants in human proteins associated with inherited retinal disease

Advances in gene sequencing technologies have accelerated the identification of genetic variants, but better tools are needed to understand which are causal of disease. This would be particularly useful in fields where gene therapy is a potential therapeutic modality for a disease-causing variant, such as inherited retinal disease (IRD). Here, we apply structure-based network analysis (SBNA), which has been successfully utilized to identify variant-constrained amino acid residues in viral proteins, to identify residues that may cause IRD if subject to missense mutation. SBNA is based entirely on structural first principles and is not fit to specific outcome data, which makes it distinct from other contemporary missense prediction tools. In 4 well-studied human disease-associated proteins (BRCA1, HRAS, PTEN, and ERK2) with high-quality structural data, we find that SBNA scores correlate strongly with deep mutagenesis data. When applied to 47 IRD genes with available high-quality crystal structure data, SBNA scores reliably identified disease-causing variants according to phenotype definitions from the ClinVar database. Finally, we applied this approach to 63 patients at Massachusetts Eye and Ear (MEE) with IRD but for whom no genetic cause had been identified. Untrained models built using SBNA scores and BLOSUM62 scores for IRD-associated genes successfully predicted the pathogenicity of novel variants (AUC = 0.851), allowing us to identify likely causative disease variants in 40 IRD patients. Model performance was further augmented by incorporating orthogonal data from EVE scores (AUC = 0.927), which are based on evolutionary multiple sequence alignments. In conclusion, SBNA can be used to successfully identify variants as causal of disease in human proteins and may help predict variants causative of IRD in an unbiased fashion.

Open Access
Consensus reporting guidelines to address gaps in descriptions of ultra-rare genetic conditions

Genome-wide sequencing and genetic matchmaker services are propelling a new era of genotype-driven ascertainment of novel genetic conditions. The degree to which reported phenotype data in discovery-focused studies address informational priorities for clinicians and families is unclear. We identified reports published from 2017 to 2021 in 10 genetics journals of novel Mendelian disorders. We adjudicated the quality and detail of the phenotype data via 46 questions pertaining to six priority domains: (I) Development, cognition, and mental health; (II) Feeding and growth; (III) Medication use and treatment history; (IV) Pain, sleep, and quality of life; (V) Adulthood; and (VI) Epilepsy. For a subset of articles, all subsequent published follow-up case descriptions were identified and assessed in a similar manner. A modified Delphi approach was used to develop consensus reporting guidelines, with input from content experts across four countries. In total, 200 of 3243 screened publications met inclusion criteria. Relevant phenotypic details across each of the 6 domains were rated superficial or deficient in >87% of papers. For example, less than 10% of publications provided details regarding neuropsychiatric diagnoses and “behavioural issues”, or about the type/nature of feeding problems. Follow-up reports (n = 95) rarely contributed this additional phenotype data. In summary, phenotype information relevant to clinical management, genetic counselling, and the stated priorities of patients and families is lacking for many newly described genetic diseases. The PHELIX (PHEnotype LIsting fiX) reporting guideline checklists were developed to improve phenotype reporting in the genomic era.

Open Access