Abstract
Intrinsically disordered proteins perform a variety of important biological functions, which makes their accurate prediction useful for a wide range of applications. We develop a scheme for predicting intrinsically disordered proteins that employs 35 features: eight structural properties, seven physicochemical properties and 20 pieces of evolutionary information. In particular, the scheme includes a preprocessing procedure that greatly reduces the number of input features. Using two different windows, the preprocessed data capture both the properties of the surroundings of the target residue and the properties specific to the target residue itself, and are fed into a multi-layer perceptron neural network as its inputs. During training, the Adam algorithm is used for back propagation and dropout is used to avoid overfitting. Both training and testing of our procedure are performed on the dataset DIS803 from the DisProt database. The simulation results show that the performance of our scheme is competitive with that of ESpritz and IsUnstruct.
Highlights
The intrinsically disordered proteins (IDPs) have at least one region lacking a unique 3D structure [1]
The preprocessed features capture the properties of the surroundings of the target residue through the long window and the properties specific to the target residue through the short window
Since the input of our scheme combines the information obtained from a short window and a long window, we vary both window sizes to study their impact on the performance of our scheme on the training set
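The two-window idea described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing: it assumes that each window is summarised by averaging every per-residue feature over the residues inside it, and that windows are truncated at the sequence ends. The function names `window_mean` and `two_window_inputs` and the default half-widths are hypothetical; the text here does not state the actual window sizes or the reduction used.

```python
from typing import List, Sequence


def window_mean(features: Sequence[Sequence[float]],
                center: int, half_width: int) -> List[float]:
    """Average each feature over the residues within half_width of center.

    Assumption: windows are simply truncated at the sequence ends.
    """
    lo = max(0, center - half_width)
    hi = min(len(features), center + half_width + 1)
    n_feat = len(features[0])
    return [sum(features[i][j] for i in range(lo, hi)) / (hi - lo)
            for j in range(n_feat)]


def two_window_inputs(features: Sequence[Sequence[float]],
                      short_half: int = 2,
                      long_half: int = 10) -> List[List[float]]:
    """Build one input vector per residue: short-window means followed by
    long-window means (hypothetical window sizes, for illustration only)."""
    return [window_mean(features, i, short_half)
            + window_mean(features, i, long_half)
            for i in range(len(features))]
```

Under these assumptions, a sequence with 35 features per residue yields 70 inputs per residue regardless of the window lengths, which is one way a preprocessing step could keep the network input small while still describing both the target residue and its wider surroundings.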
Summary
Intrinsically disordered proteins (IDPs) have at least one region lacking a unique 3D structure [1]. They exist as conformational ensembles without equilibrium values for their atomic positions and bond angles [2]. Their flexibility and structural instability are encoded by their amino acid sequences [3]. They play a crucial role in a variety of important biological functions [4]. It is therefore essential to predict IDPs through computational approaches.