Abstract

Rett syndrome (RTT) is a rare neurological disorder mostly caused by a genetic variation in MECP2. Making new MECP2 variants and the related phenotypes available provides data for a better understanding of disease mechanisms and faster identification of variants for diagnosis. This is, however, currently hampered by the lack of interoperability between genotype-phenotype databases. Here, using MECP2 in RTT as an example, we demonstrate that making genotype-phenotype data more Findable, Accessible, Interoperable, and Reusable (FAIR) facilitates the prioritization and analysis of variants. In total, 10,968 MECP2 variants were successfully integrated. Among these, 863 unique confirmed RTT-causing variants and 209 unique confirmed benign variants were found. This dataset was used to compare pathogenicity prediction tools, to examine protein consequences, and to identify ambiguous variants. Prediction tools generally recognised the RTT-causing and benign variants; however, there was a broad range of overlap. Nineteen variants were identified that were annotated as both disease-causing and benign, suggesting that additional factors contribute to disease development in these cases.

Highlights

  • Rett syndrome (RTT) is a rare neurological disorder, first described in 1966 by Andreas Rett, that occurs predominantly in females[1]

  • We investigated the status of RTT genotype-phenotype databases and the methods that different resources use to share newly identified genetic variants[18]

  • In a recent study[18], we identified thirteen genotype-phenotype databases containing RTT-specific MECP2 variation data

Background and Summary

Rett syndrome (RTT) is a rare neurological disorder, first described in 1966 by Andreas Rett, that occurs predominantly in females[1]. We investigated the status of RTT genotype-phenotype databases and the methods that different resources use to share newly identified genetic variants[18]. Thirteen genotype-phenotype databases were identified that are used to collect and share genetic variants annotated with observed or predicted effects. We show how to integrate the available RTT genetic and phenotypic data across multiple databases and how to use the integrated data for further analysis of RTT, namely to investigate variant abundance and distribution and to test variant effect prediction algorithms. The dataset created and used in this study is currently the largest available collection of annotated disease-causing and benign MECP2 variants, and may help researchers investigate and test disease models.
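
The integration idea described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the record layout, the database names (`db_a`, `db_b`, `db_c`), the variant identifiers, and the two effect labels are all hypothetical placeholders. The sketch only shows the general pattern of grouping annotations by variant across sources and flagging variants whose annotations disagree.

```python
from collections import defaultdict

# Hypothetical records: (source database, variant identifier, annotated effect).
# Names and values are placeholders, not data from the study.
records = [
    ("db_a", "variant_1", "pathogenic"),
    ("db_b", "variant_1", "pathogenic"),
    ("db_a", "variant_2", "benign"),
    ("db_b", "variant_3", "pathogenic"),
    ("db_c", "variant_3", "benign"),
]


def integrate(records):
    """Collect the set of annotated effects for each variant across all sources."""
    merged = defaultdict(set)
    for _source, variant, effect in records:
        merged[variant].add(effect)
    return dict(merged)


def classify(merged):
    """Split variants into consistently pathogenic, consistently benign,
    and ambiguous (any other combination of annotations)."""
    groups = {"pathogenic": [], "benign": [], "ambiguous": []}
    for variant, effects in sorted(merged.items()):
        if effects == {"pathogenic"}:
            groups["pathogenic"].append(variant)
        elif effects == {"benign"}:
            groups["benign"].append(variant)
        else:
            groups["ambiguous"].append(variant)
    return groups


groups = classify(integrate(records))
```

In practice, the variant identifier would need to be normalised to a common nomenclature and reference sequence before grouping, since the same variant may be described differently in different databases.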
