Abstract

The protein products of different genes localize to different subcellular compartments. The diversity of a gene's protein localization may serve as a gene characteristic that reflects gene essentiality from a subcellular perspective. To measure this diversity, we introduced the Subcellular Diversity Index (SDI), which is based on the Gene Ontology-Cellular Component Ontology (GO-CCO) and a semantic similarity measure of GO terms. Analyses revealed that the SDI of human genes correlated well with several known measures of gene essentiality, including protein–protein interaction (PPI) network topology measurements, dN/dS ratio, homologous gene number, expression level and tissue specificity. In addition, SDI performed well in predicting human essential genes (AUC = 0.702) and drug target genes (AUC = 0.704), and drug targets with higher SDI scores tended to cause more side effects. These results suggest that SDI could be used to identify novel drug targets and to guide the selection of drug targets with fewer potential side effects. Finally, we developed a user-friendly online database for querying the SDI scores of genes across eight species and the SDI-based predicted probabilities that human genes are drug targets. The online database is available at: http://www.cuilab.cn/sdi.

Highlights

  • After a large-scale screen yields candidate genes of interest, the next step is to determine which genes merit further research

  • Because the distribution of the Subcellular Diversity Index (SDI) differed slightly across species, we examined the total GO-CC term count of each gene in all eight species and calculated, for each species, the percentage of genes annotated with one to five GO-CC terms (Table S2)

  • Gene annotations in the Gene Ontology-Cellular Component Ontology (GO-CCO) are mainly used to query a specific gene or to perform enrichment analysis of a gene set
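
The per-species term-count percentages mentioned in the second highlight could be computed roughly as follows. This is only a minimal sketch, not the authors' pipeline: the function name `share_of_genes_by_term_count`, the `(species, gene, term)` input layout, and the toy gene names are all hypothetical, and the denominator is assumed to be the genes that have at least one GO-CC annotation.

```python
from collections import Counter


def share_of_genes_by_term_count(annotations, max_terms=5):
    """Return {species: {k: percent of genes with exactly k GO-CC terms}}.

    `annotations` is an iterable of (species, gene, go_cc_term) tuples.
    This layout is an assumption made for illustration only.
    """
    # Collect the distinct GO-CC terms annotated to each (species, gene) pair.
    terms_per_gene = {}
    for species, gene, term in annotations:
        terms_per_gene.setdefault((species, gene), set()).add(term)

    # Per species: total annotated genes, and a histogram of term counts.
    genes_per_species = Counter()
    histogram = {}
    for (species, _gene), terms in terms_per_gene.items():
        genes_per_species[species] += 1
        histogram.setdefault(species, Counter())[len(terms)] += 1

    # Convert counts for k = 1..max_terms into percentages of annotated genes.
    return {
        species: {
            k: 100.0 * histogram[species][k] / genes_per_species[species]
            for k in range(1, max_terms + 1)
        }
        for species in genes_per_species
    }


# Toy input with made-up gene names; real input would come from GO annotation files.
toy = [
    ("human", "GENE_A", "GO:0005634"),
    ("human", "GENE_A", "GO:0005737"),
    ("human", "GENE_B", "GO:0005634"),
    ("mouse", "GENE_C", "GO:0005886"),
]
result = share_of_genes_by_term_count(toy)
```

Using sets per gene means duplicate annotation rows (for example, the same term supported by several evidence codes) are counted once, which seems the more natural reading of "GO-CC term counts" but is likewise an assumption.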


Introduction

After a large-scale screen yields candidate genes of interest, the next step is to determine which genes merit further research. Characteristics of the gene itself can serve as references for assessing, along different dimensions, where each gene of interest stands within the entire genome. Many essential genes have been identified at genome scale by gene deletion technologies across organisms and cell types (Wang et al, 2015; Evers et al, 2016; Morgens et al, 2016). Computational methods have revealed that gene essentiality is correlated with other measurements of a gene, such

