Abstract

The war on cancer is progressing globally but slowly as researchers around the world continue to seek and discover more innovative and effective ways of curing this catastrophic disease. Biocuration, that is, organizing biological information, representing it, and making it accessible, is an important aspect of biomedical research and discovery. However, because maintaining sophisticated biocuration is highly resource-dependent, it continues to lag behind the continual generation of biomedical data. Another critical aspect of cancer research, pathway analysis, has proven to be an efficient method for gaining insight into the underlying biology associated with cancer. We propose a deep-learning-based model, Stacked Denoising Autoencoder Multi-Label Learning (SdaMLL), for facilitating gene multi-function discovery and pathway completion. SdaMLL can capture intermediate representations that are robust to partial corruption of the input pattern and can generate low-dimensional codes superior to those of conditional dimension reduction tools. Experimental results indicate that SdaMLL outperforms existing classical multi-label algorithms. Moreover, we found some gene functions, such as Fused in Sarcoma (FUS, which may be part of transcriptional misregulation in cancer) and p27 (which we expect will become a member of viral carcinogenesis), that can be used to complete the related pathways. We provide a visual tool (https://www.keaml.cn/gpvisual) to view the new gene functions in cancer pathways.
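
To make the idea of "representations robust to partial corruption" and "low-dimensional codes" concrete, the following is a minimal NumPy sketch of a generic stacked denoising autoencoder. It is not the SdaMLL implementation: the masking-noise corruption, tied weights, cross-entropy loss, layer sizes, learning rate, and the function names `train_layer`, `train_stack`, and `encode_stack` are all assumptions made for illustration, and the multi-label classification stage is omitted entirely.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def corrupt(x, rate, rng):
    """Masking noise: set a random fraction `rate` of the entries to zero."""
    return x * (rng.random(x.shape) >= rate)


def train_layer(x, n_hidden, rate=0.3, lr=0.5, epochs=200, seed=0):
    """Train one denoising autoencoder layer with tied weights (assumed setup)."""
    rng = np.random.default_rng(seed)
    n_samples, n_visible = x.shape
    w = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
    b_hidden = np.zeros(n_hidden)
    b_visible = np.zeros(n_visible)
    eps = 1e-9
    losses = []
    for _ in range(epochs):
        x_noisy = corrupt(x, rate, rng)            # corrupt the input
        h = sigmoid(x_noisy @ w + b_hidden)        # encode the corrupted input
        x_hat = sigmoid(h @ w.T + b_visible)       # decode with tied weights
        # Reconstruction is scored against the CLEAN input, not the noisy one.
        loss = -np.mean(np.sum(x * np.log(x_hat + eps)
                               + (1 - x) * np.log(1 - x_hat + eps), axis=1))
        losses.append(loss)
        # Backpropagation for sigmoid output + cross-entropy loss.
        d_out = (x_hat - x) / n_samples
        d_hidden = (d_out @ w) * h * (1 - h)
        grad_w = x_noisy.T @ d_hidden + d_out.T @ h
        w -= lr * grad_w
        b_hidden -= lr * d_hidden.sum(axis=0)
        b_visible -= lr * d_out.sum(axis=0)
    return w, b_hidden, b_visible, losses


def train_stack(x, layer_sizes, **kwargs):
    """Greedy layer-wise training: each layer is fit on the previous layer's codes."""
    layers, codes = [], x
    for n_hidden in layer_sizes:
        w, b_hidden, _, _ = train_layer(codes, n_hidden, **kwargs)
        layers.append((w, b_hidden))
        codes = sigmoid(codes @ w + b_hidden)      # clean (uncorrupted) encoding
    return layers


def encode_stack(x, layers):
    """Map inputs to the low-dimensional code of the top layer."""
    codes = x
    for w, b_hidden in layers:
        codes = sigmoid(codes @ w + b_hidden)
    return codes
```

Under these assumptions, `train_stack(x, [8, 3])` would compress each row of a binary feature matrix `x` into a 3-dimensional code via `encode_stack`; in a multi-label setting, such codes would then be passed to a separate classifier that predicts several labels per gene.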

Highlights

  • The war on cancer is progressing globally but slowly as researchers around the world continue to seek and discover more innovative and effective ways of curing this catastrophic disease

  • Of the 22 novel drugs approved by the U.S. Food and Drug Administration (FDA), six were designed for treating or diagnosing cancer[2]

  • Beginning with the Human Genome Project (HGP) and microarray expression analysis, investments in large-scale sequencing centres and high-throughput analytical facilities have increased sharply, leading to exponential growth of biological data

Summary

Introduction

The war on cancer is progressing globally but slowly as researchers around the world continue to seek and discover more innovative and effective ways of curing this catastrophic disease. Scientists from various fields, such as biology, statistics, and computer science, are using a vast array of approaches in their effort to win the war against cancer worldwide. Among these approaches, biocuration, which involves organizing, representing, and providing biological information for humans and computers, is an essential part of biomedical discovery and research[6]. The goal of providing more timely manual annotation to keep pace with increased data acquisition, while relying purely on human labour, creates a virtually insurmountable dilemma[8]. This is because the approach depends entirely on well-trained professional biocurators who can analyse and extract categorized information from the published literature. The tool should link gene expression mentions of biological entities identified in the text with their referents in biological databases, and link them to the appropriate ontological terms.

