Abstract

Background: Understanding how genes are expressed specifically in particular tissues is a fundamental question in developmental biology. Many tissue-specific genes are involved in the pathogenesis of complex human diseases. However, experimental identification of tissue-specific genes is time-consuming and difficult. Accurate prediction of tissue-specific gene targets could provide useful information for biomarker development and drug target identification.

Results: In this study, we developed a machine learning approach for predicting human tissue-specific genes from microarray expression data. Lists of known tissue-specific genes for different tissues were collected from the UniProt database, and the expression data for the listed genes, retrieved from a previously compiled dataset, were used to encode the input vectors. Random Forests (RFs) and Support Vector Machines (SVMs) were used to construct classifiers. The RF classifiers were found to outperform the SVM models for tissue-specific gene prediction. The results suggest that the candidate genes predicted for brain- or liver-specific expression can provide valuable information for further experimental studies. Our approach was also applied to identify tissue-selective gene targets for different types of tissues.

Conclusions: A machine learning approach has been developed for accurately identifying candidate genes for tissue-specific or tissue-selective expression. The approach provides an efficient way to select genes of interest for developing new biomedical markers and to improve our knowledge of tissue-specific expression.
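The classifier comparison described in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the expression matrix is synthetic, and the helper names (`make_toy_expression`, `compare_classifiers`), the hyperparameters, the 5-fold cross-validation, and the ROC AUC metric are all assumptions chosen for illustration. Only the choice of Random Forest and SVM classifiers trained on per-gene expression vectors comes from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_toy_expression(n_genes=200, n_tissues=20, target_tissue=3, seed=0):
    """Build a synthetic genes x tissues matrix (hypothetical stand-in, not real data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(loc=5.0, scale=1.0, size=(n_genes, n_tissues))
    y = np.zeros(n_genes, dtype=int)
    y[: n_genes // 2] = 1              # label the first half as "tissue-specific"
    X[y == 1, target_tissue] += 3.0    # raise their expression in the target tissue
    return X, y


def compare_classifiers(X, y, folds=5):
    """Return the mean cross-validated ROC AUC for an RF and an SVM classifier."""
    models = {
        "RF": RandomForestClassifier(n_estimators=200, random_state=0),
        # SVMs are sensitive to feature scale, so standardize inside the pipeline.
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=folds, scoring="roc_auc").mean())
        for name, model in models.items()
    }


if __name__ == "__main__":
    X, y = make_toy_expression()
    for name, auc in compare_classifiers(X, y).items():
        print(f"{name}: mean ROC AUC = {auc:.3f}")
```

On real data, each row would be the expression profile of one gene across the profiled tissues, with positive labels taken from a curated list of known tissue-specific genes. Which model performs better on this toy matrix says nothing about the ranking reported in the paper.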

Highlights

  • Understanding how genes are expressed in particular tissues is a fundamental question in developmental biology

  • The results suggest that expression data selected according to our lists of known tissue-specific genes can provide useful information for constructing classifiers with machine learning methods

  • The results suggest that Random Forest (RF) classifiers achieved better predictive performance than Support Vector Machine (SVM) models (Table 1 and Figure 3)

Introduction

Understanding how genes are expressed in particular tissues is a fundamental question in developmental biology. Some genes are highly expressed in one tissue and weakly expressed or not expressed in others. Many tissue-selective genes are involved in the pathogenesis of complex human diseases [1], including insulin signaling pathways in diabetes [2] and tumor-host interactions in cancer [3]. Identifying tissue-specific genes could help biologists elucidate the molecular mechanisms of tissue development and provide valuable information for identifying candidate biomarkers and drug targets.
