Abstract

This paper introduces a new graphical representation of protein sequences. Nine main physicochemical properties of amino acids were reduced by principal component analysis to obtain a 2D discrete point set for each protein sequence. A fractal method was then used to interpolate the discrete points and construct a graphical representation (a protein curve) of the sequence. The fractal dimension of the protein curve was used to analyze the similarity of protein sequences by comparing the distances between vectors representing segments of the sequences. The Jeffrey's and Matusita distance was modified to allow similarity comparison of protein sequences with different lengths. Nicotinamide adenine dinucleotide (NADH) dehydrogenase 5 (ND5) protein sequences from nine different species were tested as an example to demonstrate the method. A linear correlation and significance analysis was then used to compare our results with those of other graphical representations, taking the ClustalW result as the reference. To confirm the validity of the method, NADH dehydrogenase 6 (ND6) protein sequences from eight species and beta-globin protein sequences from twenty-seven species were also analyzed. Experimental results show that the proposed method is effective for the similarity analysis of proteins.
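
The first step described above, reducing nine property values per amino acid to a 2D point by principal component analysis, might look roughly like the following sketch. It is an illustration under stated assumptions, not the paper's implementation: the property matrix is filled with random placeholder numbers (the abstract does not list the nine properties or their values), the helper names `pca_2d` and `sequence_to_points` are hypothetical, and the simple per-residue lookup is only one plausible way to turn a sequence into a point set.

```python
import numpy as np

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def pca_2d(props):
    """Project the rows of `props` (items x features) onto the first two principal components."""
    # Standardize each property column so that no single scale dominates.
    z = (props - props.mean(axis=0)) / props.std(axis=0)
    # SVD of the standardized matrix; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:2].T

def sequence_to_points(seq, scores):
    """Look up the 2D PCA score of each residue in `seq` (hypothetical mapping)."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    return np.array([scores[index[aa]] for aa in seq])

# PLACEHOLDER data: random numbers standing in for the nine
# physicochemical property values of each amino acid (20 x 9).
rng = np.random.default_rng(0)
placeholder_props = rng.normal(size=(20, 9))

scores = pca_2d(placeholder_props)           # one 2D point per amino acid
points = sequence_to_points("MKLV", scores)  # one 2D point per residue
print(points.shape)  # (4, 2)
```

With real property values in place of the placeholder matrix, `points` would be the discrete point set that the fractal interpolation step takes as input.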
