Knowing a protein's exact cellular location is often essential to understanding its function. Despite advances in computational approaches, protein localization prediction still faces major obstacles, such as limited interpretability and the difficulty of handling multiple localization sites. In this research, a novel approach, the Squirrel Search Optimized Dynamic Visual Geometry Group Network (SSO-DVGG), is proposed to improve protein subcellular localization prediction by using spatial metrology models to tackle these problems. With its simplified architecture, SSO-DVGG can explain a protein's targeting to particular cellular sites and identify important sequence components such as sorting motifs or localization signals. The model provides a confidence estimate for each prediction, which lets users choose an acceptable error level, and it highlights the sequence properties responsible for localization, which makes its predictions interpretable. Furthermore, SSO-DVGG uses a probabilistic methodology and integrates a large amount of data from dual-targeted proteins, which enables it to accurately predict multiple localization sites per protein. When tested on several independent datasets, SSO-DVGG outperforms the best existing predictors and shows a superior capacity to predict multiple localizations. By providing a clear and accurate picture of protein distribution and function, this method promotes the application of spatial metrology models to cellular molecular localization and functional prediction.
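The way a per-prediction confidence estimate can support both a user-chosen error level and multiple localization sites per protein can be illustrated with a minimal sketch. It assumes the predictor emits an independent probability for each site; the abstract does not specify SSO-DVGG's output format, so the function name `select_sites`, the `min_confidence` parameter, and the example probabilities below are all hypothetical and are not results from the paper.

```python
from typing import Dict, List, Tuple


def select_sites(
    site_probs: Dict[str, float], min_confidence: float = 0.5
) -> List[Tuple[str, float]]:
    """Return every site whose probability meets the threshold, highest first.

    Assumes independent per-site probabilities (hypothetical output format),
    so a dual-targeted protein can pass the threshold for more than one site.
    """
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must lie in [0, 1]")
    chosen = [(site, p) for site, p in site_probs.items() if p >= min_confidence]
    # Sort by descending probability so the most confident call comes first.
    return sorted(chosen, key=lambda pair: pair[1], reverse=True)


# Illustrative values only, invented for this sketch.
example_probs = {
    "nucleus": 0.08,
    "mitochondrion": 0.91,
    "chloroplast": 0.77,
    "cytoplasm": 0.12,
}

if __name__ == "__main__":
    # A permissive threshold keeps both sites of a dual-targeted protein.
    print(select_sites(example_probs, min_confidence=0.7))
    # A strict threshold trades coverage for fewer false positives.
    print(select_sites(example_probs, min_confidence=0.95))
```

Raising `min_confidence` corresponds to accepting a lower error level at the cost of leaving more proteins without a confident call; how well such a threshold tracks real error rates depends on how well calibrated the underlying probabilities are.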