Abstract

Base editing is an enhanced gene editing approach that enables the precise transformation of single nucleotides and has the potential to cure rare diseases. However, designing base editors is labour-intensive, and editing outcomes are not easily predictable. For any clinical use, base editing has to be accurate and efficient, so bystander mutations have to be minimized. In recent years, computational models to predict base editing outcomes have been developed, but their overall robustness and performance are limited. One way to improve performance is to train models on a large, diverse, and feature-rich dataset, which does not exist for the base editing field. Hence, we develop BE-dataHIVE, a MySQL database that covers over 460,000 gRNA target combinations. The current version of BE-dataHIVE consists of data from five studies and is enriched with melting temperatures and energy terms. Furthermore, multiple data structures for machine learning were computed and are directly available. The database can be accessed via our website https://be-datahive.com/ or via an API and is therefore suitable for both practitioners and machine learning researchers.
