Membrane proteins constitute essential biomolecules attached to or integrated into cellular and organelle membranes, playing diverse roles in cellular processes. Their precise localization is crucial for understanding their functions. Existing protein subcellular localization predictors are predominantly trained on globular proteins; their performance diminishes for membrane proteins, explicitly via deep learning models. To address this challenge, the proposed study segregates membrane proteins into three distinct locations, including the plasma membrane, internal membrane, and membrane of the organelle, using deep learning algorithms including recurrent neural networks (RNN) and Long Short-Term Memory (LSTM). A redundancy-curtailed dataset of 3000 proteins from the MemLoci approach is selected for the investigation, along with incorporating pseudo amino acid composition (PseAAC). PseAAC is an exemplary technique for extracting protein information hidden in the amino acid sequences. After extensive testing, the results show that the accuracy for LSTM and RNN is 83.4% and 80.5%, respectively. The results show that the LSTM model outperforms the RNN and is most commonly employed in proteomics.
Read full abstract