Hierarchical ensemble methods for protein function prediction.

Protein function prediction is a complex multiclass multilabel classification problem, characterized by multiple issues such as the incompleteness of the available annotations, the integration of multiple sources of high dimensional biomolecular data, the imbalance of several functional classes, and the difficulty of univocally determining negative examples. Moreover, the hierarchical relationships between functional classes that characterize both the Gene Ontology and FunCat taxonomies motivate the development of hierarchy-aware prediction methods, which have shown significantly better performance than hierarchy-unaware “flat” prediction methods. In this paper, we provide a comprehensive review of hierarchical methods for protein function prediction based on ensembles of learning machines. According to this general approach, a separate learning machine is trained to learn a specific functional term and then the resulting predictions are assembled in a “consensus” ensemble decision, taking into account the hierarchical relationships between classes. The main hierarchical ensemble methods proposed in the literature are discussed in the context of existing computational methods for protein function prediction, highlighting their characteristics, advantages, and limitations. Open problems of this exciting research area of computational biology are finally considered, outlining novel perspectives for future research.
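
The "consensus" step described in the abstract can be illustrated with a minimal sketch. It assumes a tree-shaped hierarchy (as in FunCat; the Gene Ontology is a DAG and would need a multi-parent variant) and one simple consistency rule: a term's score may not exceed its parent's corrected score. The function name `top_down_consensus`, the term names, and the scores are hypothetical and are not taken from the reviewed paper.

```python
from typing import Dict, Optional


def top_down_consensus(
    scores: Dict[str, float], parent: Dict[str, Optional[str]]
) -> Dict[str, float]:
    """Clip each term's flat score to its parent's corrected score.

    Assumption: the hierarchy is a tree, so every term has at most one parent.
    """
    corrected: Dict[str, float] = {}

    def resolve(term: str) -> float:
        # Reuse the corrected score if this term was already processed.
        if term in corrected:
            return corrected[term]
        p = parent.get(term)
        # Root terms keep their flat score; others are capped by the parent.
        value = scores[term] if p is None else min(scores[term], resolve(p))
        corrected[term] = value
        return value

    for term in scores:
        resolve(term)
    return corrected


# Hypothetical per-term scores, as if produced by independent "flat" classifiers.
flat_scores = {
    "root": 0.9,
    "metabolism": 0.4,
    "lipid_metabolism": 0.7,  # inconsistent: higher than its parent
    "transport": 0.8,
}
parent = {
    "root": None,
    "metabolism": "root",
    "lipid_metabolism": "metabolism",
    "transport": "root",
}

consensus = top_down_consensus(flat_scores, parent)
print(consensus)
```

In this toy input only `lipid_metabolism` changes: its score is lowered to that of `metabolism`, so that predicting a child term never implies more confidence than predicting its ancestor.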

Open Access
Construction and Analysis of the Cell Surface’s Protein Network for Human Sperm-Egg Interaction

Sperm-egg interaction is one of the most impressive processes in sexual reproduction, and understanding its molecular mechanism is crucial for solving problems of infertility and failed in vitro fertilization. The main purpose of this study is to map the sperm-egg interaction network between cell-surface proteins and to perform an interaction analysis on this new network. We built the first protein interaction network of human sperm-egg binding and fusion proteins, which consists of 84 protein nodes and 112 interactions. The gene ontology analysis identified a number of functional clusters that may be involved in the sperm-egg interaction. These include the G-protein coupled receptor protein signaling pathway, cellular membrane fusion, and single fertilization. The PPI network was highly interconnected and identified a set of candidate proteins (ADAM-ZP3, ZP3-CLGN, IZUMO1-CD9, and ADAM2-IZUMO1) that may have an important role in sperm-egg interaction. The results showed that ADAM2 may mediate the interaction between two essential factors, CD9 and IZUMO1. The KEGG analysis showed 12 statistically significant pathways, with 10 proteins associated with cancer, suggesting a common pathway between tumor fusion and sperm-egg fusion. We believe that the availability of this map will assist future research on the fertilization mechanism and will also facilitate biological interpretation of sperm-egg interaction.

Open Access
A Computational Approach towards the Understanding of Plasmodium falciparum Multidrug Resistance Protein 1

The emergence of drug resistance in Plasmodium falciparum has severely affected chemotherapy worldwide, while the wide distribution of chloroquine-resistant strains in most endemic areas has further complicated the treatment of malaria. The situation has been worsened by the lack of a molecular mechanism explaining the resistance conferred by Plasmodium species. Recent studies have suggested an association between antimalarial resistance and P. falciparum multidrug resistance protein 1 (PfMDR1), an ATP-binding cassette (ABC) transporter and a homologue of human P-glycoprotein 1 (P-gp1). The present study describes the development of a computational model of PfMDR1 and a model of substrate transport across PfMDR1, with insights derived from conformations relative to the inward- and outward-facing topologies that switch the transport system on and off. The ATP docked positions and the binding properties of its structural motifs were found to be similar to those of other ATPases, thereby contributing to dimerization of the NBD domains, a unique structural agreement noticed in Mus musculus Pgp and the Escherichia coli MDR transporter homolog (MsbA). The interaction of leading antimalarials and phytochemicals within the active pocket of both wild-type and mutant PfMDR1 demonstrated the mode of binding and gave insight into the reduced binding affinity that contributes to the parasite's resistance mechanism.

Open Access
Exploiting identifiability and intergene correlation for improved detection of differential expression.

Accurate differential analysis of microarray data strongly depends on effective treatment of intergene correlation. Such dependence is ordinarily accounted for in terms of its effect on significance cutoffs. In this paper, it is shown that correlation can, in fact, be exploited to share information across tests and reorder expression differentials for increased statistical power, regardless of the threshold. Significantly improved differential analysis is the result of two simple measures: (i) adjusting test statistics to exploit information from identifiable genes (the large subset of genes represented on a microarray that can be classified a priori as nondifferential with very high confidence), but (ii) doing so in a way that accounts for linear dependencies among identifiable and nonidentifiable genes. A method is developed that builds upon the widely used two-sample t-statistic approach and uses analysis in Hilbert space to decompose the nonidentified gene vector into two components that are correlated and uncorrelated with the identified set. In the application to data derived from a widely studied prostate cancer database, the proposed method outperforms some of the most highly regarded approaches published to date. Algorithms in MATLAB and in R are available for public download.
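
The decomposition mentioned in the abstract can be pictured, under a strong simplifying assumption, as an orthogonal projection: a vector is split into a part lying in the span of the identified set and a residual orthogonal to that span. The sketch below is only an illustration of that general idea, not the paper's algorithm (the published MATLAB and R code should be consulted for that); the function name `orthogonal_decomposition` and the toy vectors are hypothetical.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_decomposition(
    y: Sequence[float], basis_vectors: Sequence[Sequence[float]], tol: float = 1e-12
) -> Tuple[List[float], List[float]]:
    """Split y into (projection onto span(basis_vectors), orthogonal residual)."""
    # Gram-Schmidt: build an orthogonal basis for the span of the given vectors.
    ortho: List[List[float]] = []
    for v in basis_vectors:
        w = list(v)
        for q in ortho:
            c = dot(w, q) / dot(q, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        if dot(w, w) > tol:  # skip vectors that are linearly dependent
            ortho.append(w)

    # Accumulate the projection of y onto each orthogonal direction.
    projected = [0.0] * len(y)
    for q in ortho:
        c = dot(y, q) / dot(q, q)
        projected = [pi + c * qi for pi, qi in zip(projected, q)]

    residual = [yi - pi for yi, pi in zip(y, projected)]
    return projected, residual


# Toy data: two "identified" vectors and one vector to decompose.
identified = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
y = [3.0, 2.0, 5.0, -1.0]

projected, residual = orthogonal_decomposition(y, identified)
print(projected, residual)
```

By construction the two parts sum back to `y`, and the residual has zero inner product with every vector in the identified set, which is the sense in which it is "uncorrelated" in this simplified picture.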

Open Access