Abstract
Introduction: The human heart expresses ~10,000 genes, but only a subset is most relevant to cardiac disease and medicine. To maximize the impact of new technological developments on cardiac research, resources should be allocated so that the genes most relevant to the cardiac community are prioritized, e.g., to ensure that high-quality research reagents for these genes are available and to promote biocuration efforts to annotate their functions. Hypothesis: A relatively small number of genes account for the majority of cardiac research efforts and can be identified by analyzing publicly available literature records. Method: We developed a novel data science method to analyze over one million peer-reviewed cardiac-related publications and the cardiac genes referenced in these articles. We then derived a metric that measures the importance of a gene to cardiac research, based on the normalized semantic distance between the gene and the articles on various cardiac diseases in which it is referenced. The cardiac genes with the smallest semantic distance to cardiac publications were retrieved. Results: We identified 50 of the most highly investigated genes in the heart, accounting for ~15% of all cardiac research, and constructed gene networks from their publication records and functional annotations. The analysis also allowed us to visualize several major clusters of cardiac research focus, centered on the contractile machinery, ion channels, BNP signaling, and PKA signaling. In addition, we identified ~50 proximal genes that are functionally related to critical cardiac processes but are currently under-studied, and which we predict will become important research targets in the near future. Conclusion: We demonstrate a big data-driven approach to catalogue and predict the focus and trends of cardiac research.
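The abstract does not give the formula for the semantic distance metric. As a rough illustration of the general idea only, the sketch below ranks genes by a Normalized Google Distance-style co-occurrence measure, which is a substitute technique and not necessarily the one used in the study. The function names, gene labels, and article counts are all hypothetical.

```python
import math


def normalized_distance(n_gene, n_topic, n_both, n_total):
    """Normalized Google Distance-style measure between a gene and a topic.

    n_gene  : number of articles that reference the gene
    n_topic : number of articles on the topic (e.g. a cardiac disease)
    n_both  : number of articles that do both
    n_total : size of the whole corpus
    Smaller values mean the gene and the topic co-occur more tightly.
    """
    if n_both == 0:
        return math.inf  # never co-mentioned: treat as infinitely distant
    log_g, log_t = math.log(n_gene), math.log(n_topic)
    log_b, log_n = math.log(n_both), math.log(n_total)
    return (max(log_g, log_t) - log_b) / (log_n - min(log_g, log_t))


def rank_genes(gene_counts, topic_count, co_counts, n_total):
    """Return gene names ordered from smallest to largest distance."""
    scores = {
        gene: normalized_distance(count, topic_count, co_counts.get(gene, 0), n_total)
        for gene, count in gene_counts.items()
    }
    return sorted(scores, key=scores.get)


# Hypothetical toy counts, not data from the study.
gene_counts = {"GENE_A": 1000, "GENE_B": 1000, "GENE_C": 1000}
co_counts = {"GENE_A": 800, "GENE_B": 10}  # GENE_C never co-occurs
ranking = rank_genes(gene_counts, topic_count=5000, co_counts=co_counts, n_total=1_000_000)
print(ranking)
```

In this toy setup, a gene whose articles overlap heavily with the topic ranks ahead of one that is rarely or never co-mentioned. A real analysis would also need a mapping from articles to genes and to disease terms, which the abstract does not specify.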