In-Pero: Exploiting Deep Learning Embeddings of Protein Sequences to Predict the Localisation of Peroxisomal Proteins.

Marco Anteghini,Edoardo Saccenti,Vitor Martins Dos Santos

doi:10.3390/ijms22126409

Open Access

https://doi.org/10.3390/ijms22126409

Abstract

Peroxisomes are ubiquitous membrane-bound organelles, and aberrant localisation of peroxisomal proteins contributes to the pathogenesis of several disorders. Many computational methods assign protein sequences to subcellular compartments, but no tools are tailored specifically to the sub-localisation (matrix vs. membrane) of peroxisomal proteins. We present here In-Pero, a new method for predicting sub-peroxisomal protein localisation. In-Pero combines standard machine learning approaches with recently proposed multi-dimensional deep-learning representations of the protein amino-acid sequence. It showed a classification accuracy above 0.9 in predicting peroxisomal matrix and membrane proteins. The method is trained and tested using a double cross-validation approach on a curated data set comprising 160 peroxisomal proteins with experimental evidence for sub-peroxisomal localisation. We further show that the proposed approach can be easily adapted (In-Mito) to the prediction of mitochondrial protein localisation, obtaining performance superior to that of existing tools for certain classes of proteins (matrix and inner-membrane).

Highlights

In eukaryotes, there are ten main subcellular localisations, which can be further subdivided into intra-organellar compartments.
This has led to the hypothesis that each protein has evolved to function optimally in a given subcellular compartment, and to the idea that the information encoded in the sequence can be used to predict the subcellular localisation.
We compared four commonly used machine learning approaches (Logistic Regression, Partial Least Squares Discriminant Analysis, Random Forest and Support Vector Machines) in combination with different protein sequence encodings and embeddings to select the best classification strategy to predict the sub-localisation of peroxisomal proteins.
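
The double cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic vectors standing in for sequence embeddings, the classifier is a plain k-nearest-neighbour vote rather than any of the four methods listed above, the folds are not stratified, and every function name here is hypothetical. The point is only the structure: an inner loop selects a hyperparameter using the training portion alone, and an outer loop scores that choice on samples that played no part in the selection.

```python
import random
from statistics import mean

def make_toy_data(n_per_class=40, dim=8, seed=0):
    """Synthetic vectors standing in for sequence embeddings (two classes)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0) for _ in range(dim)])
            y.append(label)
    return X, y

def k_folds(indices, n_folds, rng):
    """Shuffle indices and split them into n_folds roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]

def knn_predict(X, y, train_idx, query, k):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(X[i], query)), y[i]) for i in train_idx
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def accuracy(X, y, train_idx, test_idx, k):
    """Fraction of test samples whose predicted label matches the true one."""
    hits = sum(knn_predict(X, y, train_idx, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)

def double_cross_validation(X, y, ks=(1, 3, 5), outer=5, inner=4, seed=0):
    """Return one held-out accuracy per outer fold."""
    rng = random.Random(seed)
    outer_scores = []
    for held_out in k_folds(list(range(len(X))), outer, rng):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        # Inner loop: choose k using only the outer-training samples.
        inner_folds = k_folds(train, inner, rng)

        def inner_score(k):
            scores = []
            for val in inner_folds:
                val_set = set(val)
                fit = [i for i in train if i not in val_set]
                scores.append(accuracy(X, y, fit, val, k))
            return mean(scores)

        best_k = max(ks, key=inner_score)
        # Outer loop: score the chosen k on data unseen during selection.
        outer_scores.append(accuracy(X, y, train, held_out, best_k))
    return outer_scores

X, y = make_toy_data()
scores = double_cross_validation(X, y)
print(f"mean outer accuracy: {mean(scores):.2f}")
```

In practice one would more likely rely on an established library for the classifiers and the fold handling; the hand-rolled version above only keeps the sketch free of dependencies.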

Summary

Introduction

There are ten main subcellular localisations, which can be further subdivided into intra-organellar compartments (see Figure 1A). These organelles perform one or more specific, and often complementary, tasks in the cellular machinery. It has been observed that proteins from different organelles show signatures in their amino acid composition that are associated with their subcellular localisation [4]. This has led to the hypothesis that each protein has evolved to function optimally in a given subcellular compartment, and to the idea that the information encoded in the sequence can be used to predict the subcellular localisation.
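
As a small illustration of what an amino acid composition signature looks like as a feature vector, the sketch below computes the fraction of each of the 20 standard residues in a sequence. It is an assumption-laden toy: the function name and the example sequence are hypothetical, and this simple encoding is not the deep-learning embedding that In-Pero uses.

```python
from collections import Counter
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Return the fraction of each standard amino acid in a sequence."""
    seq = sequence.upper()
    # Non-standard or ambiguous symbols (e.g. X, B, Z) are ignored.
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

# Hypothetical toy sequence, not a real peroxisomal protein.
toy = "MKTAYIAKQRSKL"
comp = amino_acid_composition(toy)
print({aa: round(frac, 3) for aa, frac in comp.items() if frac > 0})
```

Each sequence maps to a fixed-length vector of 20 fractions regardless of its length, which is what makes such encodings usable as input to standard classifiers.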

Journal: International journal of molecular sciences	Publication Date: Jun 15, 2021
Citations: 19	License type: CC BY 4.0
