Abstract
Accurately predicting protein structure properties, such as secondary structure, solvent accessibility, and dihedral angles, is essential for analyzing the structure and function of a protein. A multiple sequence alignment (MSA), which aligns multiple homologous sequences with the target protein, is widely used in protein structure property prediction. The most popular strategy for exploiting an MSA is to convert it into a position-specific scoring matrix (PSSM) and then feed the PSSM into the relevant prediction network. A PSSM is obtained by simply counting the frequency of the amino acids present at each position in the corresponding MSA, which means that each sequence in the MSA has the same weight with respect to the target protein. However, assigning the same weight to every homologous sequence cannot sufficiently model the complex relationships between them. Moreover, some sequences within the MSA are redundant, which raises a tantalizing question: can we generate a different weight for each sequence in the MSA and use the resulting weighted PSSM to improve protein structure property prediction? To help answer this question, we present the WeightAln framework, which, to our knowledge, is the first attempt to generate learnable MSA weights for protein prediction tasks. We demonstrate the effectiveness of our method through extensive experiments on three protein structure property prediction tasks.
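To make the contrast between uniform and per-sequence weighting concrete, the following is a minimal sketch of a frequency-based profile computed from an MSA, with optional sequence weights. It is not the WeightAln implementation: the function name `weighted_frequency_matrix`, the toy alignment, and the hand-picked weights are assumptions for illustration only, whereas the framework described above learns the weights. The sketch also stops at per-position frequencies and omits the log-odds scoring against background frequencies that PSSM tools typically apply.

```python
import numpy as np

# The 20 standard amino acids; gaps and other symbols are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def weighted_frequency_matrix(msa, weights=None):
    """Return an (L, 20) matrix of per-position amino-acid frequencies.

    msa     : list of aligned sequences, all of the same length L
    weights : one non-negative weight per sequence (hypothetical input;
              None means every sequence counts equally)
    """
    n_seqs, length = len(msa), len(msa[0])
    if weights is None:
        weights = np.ones(n_seqs)
    weights = np.asarray(weights, dtype=float)

    counts = np.zeros((length, len(AMINO_ACIDS)))
    for seq, w in zip(msa, weights):
        for pos, aa in enumerate(seq):
            idx = AA_INDEX.get(aa)
            if idx is not None:  # skip gaps ('-') and unknown residues
                counts[pos, idx] += w

    # Normalize each position; all-gap positions stay as zero rows.
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)


# Toy alignment of three sequences (illustrative only).
msa = ["ACD", "ACE", "GCD"]
uniform = weighted_frequency_matrix(msa)
weighted = weighted_frequency_matrix(msa, weights=[0.5, 0.25, 0.25])
```

In this toy alignment, alanine at the first position has frequency 2/3 under uniform weights and 0.75 once the first sequence is given half of the total weight, which shows how per-sequence weights reshape the profile that a downstream predictor receives.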