Abstract

The 3D structure of a protein is closely related to its function, so analyzing the structural similarity between proteins can help reveal their functions. However, two problems arise in this analysis: proteins with similar sequences may have different structures, and proteins with similar structures may have different sequences. Traditional methods that rely on the spatial feature distribution or on the geometric or topological features of proteins still struggle with both cases. In this paper, a Tile-CNN network is proposed to analyze the similarity of 3D protein structures. To capture both the global and the local features of a 3D structure, the method projects each 3D protein model into 2D images from different views and then cuts these projected images using a tile strategy. After the Tile-CNN is trained on these images, a test protein model is expressed as an analysis matrix, and the similarity between 3D structures is computed as the root mean square distance (RMSD) between the benchmark matrix and the analysis matrix. The experimental results indicate that the proposed algorithm is more robust in analyzing the similarity of 3D protein structures and performs satisfactorily on the two problems described above.
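The RMSD comparison between a benchmark matrix and an analysis matrix can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes both matrices have the same shape and that RMSD is taken element-wise over all entries. The function name `rmsd` and the sample values are hypothetical.

```python
import math


def rmsd(benchmark, analysis):
    """Root mean square distance between two equally shaped matrices.

    Both arguments are lists of rows. The element-wise definition used
    here is an assumption; the paper may normalize differently.
    """
    if len(benchmark) != len(analysis):
        raise ValueError("matrices must have the same number of rows")

    total = 0.0
    count = 0
    for row_b, row_a in zip(benchmark, analysis):
        if len(row_b) != len(row_a):
            raise ValueError("rows must have the same length")
        for b, a in zip(row_b, row_a):
            total += (b - a) ** 2
            count += 1

    if count == 0:
        raise ValueError("matrices must not be empty")
    return math.sqrt(total / count)


# Hypothetical probability matrices (rows: views or tiles, columns: categories).
benchmark_matrix = [[0.9, 0.1], [0.8, 0.2]]
analysis_matrix = [[0.7, 0.3], [0.6, 0.4]]
distance = rmsd(benchmark_matrix, analysis_matrix)
```

Under this definition, a smaller distance means the test protein's analysis matrix lies closer to the benchmark, which is read here as higher structural similarity.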

Highlights

  • Bioinformatics is an interdisciplinary subject that analyzes biological information from the perspectives of computer science, biology, physics and mathematics [1], [2]

  • This paper proposes a similarity analysis method for the 3D structures of proteins based on the neural network

  • The method is divided into three stages: data set construction, which generates 2D images from 3D proteins using the multi-view and tile strategy; Tile-CNN training, which yields the probability matrix for each category; and testing, which computes the root mean square distance (RMSD) between the benchmark matrix and the analysis matrix
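The tile step of data set construction can be illustrated with a short sketch. It assumes that a projected 2D image is a rectangular grid of pixel values and that the tile strategy cuts it into a regular, non-overlapping grid; the page does not state the actual tile layout, so the function `cut_tiles` and its `rows` and `cols` parameters are hypothetical.

```python
def cut_tiles(image, rows, cols):
    """Cut a 2D image (list of pixel rows) into rows x cols tiles.

    Tiles are returned in row-major order. Pixels left over when the
    image size is not divisible by the grid are dropped (an assumption).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    tile_h, tile_w = height // rows, width // cols
    if tile_h == 0 or tile_w == 0:
        raise ValueError("image is too small for the requested grid")

    tiles = []
    for r in range(rows):
        for c in range(cols):
            # Slice the pixel rows of this tile, then the columns within each row.
            band = image[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in band])
    return tiles


# A hypothetical 4 x 4 projected image cut into a 2 x 2 grid of tiles.
projected = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
tiles = cut_tiles(projected, rows=2, cols=2)
```

Each tile then carries the local features of one region, while the uncut projection carries the overall shape.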

Summary

INTRODUCTION

Bioinformatics is an interdisciplinary subject that analyzes biological information from the perspectives of computer science, biology, physics and mathematics [1], [2]. By computing the degree of deviation of these shape features, proteins can be analyzed for their structural similarity. This method is suitable only for the analysis of local protein chains. Hu and Peng [5] proposed a volume fractal dimensionality method to analyze the similarity of 3D protein structures, which preserves rotation and translation invariance. This method adapts well when amino acids mutate without functional change. A local diameter (LD) was constructed as the analysis vector by extracting the skeleton of the 3D protein model, and the LD values of different proteins were compared to determine their similarity. Both of the above-mentioned methods are premised on the local shape of the 3D structures of proteins and leave the global features of proteins out of consideration. As demonstrated by the experimental results, the proposed algorithm is capable of eliminating the impact of invalid features and of achieving satisfactory performance in the similarity analysis of 3D protein structures.

DATA SET CONSTRUCTION
SIMILARITY MEASUREMENT
Findings
CONCLUSION