High-risk types of human papillomavirus (HPV) are the leading cause of cervical cancer, the second most common tumor of the female reproductive system, and the HPV E6 protein is a viral oncoprotein expressed in almost all HPV-positive cancers. Distinguishing HPV risk types from the properties of E6 is therefore useful and important for the diagnosis and treatment of cervical cancer. Each sample is represented by the pseudo amino acid (PseAA) composition of the protein, which incorporates a large amount of sequence-pattern information with the aim of improving the prediction accuracy of risk-type classification. In this article, we propose a new method, the protein mean value matrix image (MVMI), built from the hydrophobicity, hydrophilicity, and side-chain mass values of the sequence, to predict HPV risk types from E6 protein sequences. Two geometric moments derived from the protein MVMI are computed for each protein sequence and used to form its PseAA composition. The jackknife cross-validation test gives an overall success rate of 90.93%. The results indicate that theory-based bioinformatics methods can simplify experimental studies and make them more intuitive.
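
The jackknife test named in the abstract is a leave-one-out procedure: each sample is held out once, predicted from all remaining samples, and the success rate is the fraction of correct predictions. The sketch below illustrates only that procedure. The abstract does not say which classifier was used, so the 1-nearest-neighbour rule here is an assumption, and `nearest_label`, `jackknife_success_rate`, and the toy feature vectors are hypothetical names and made-up numbers, not the study's method or data.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_label(query, samples):
    """Return the label of the sample closest to `query` (1-NN, assumed classifier)."""
    return min(samples, key=lambda s: dist(query, s[0]))[1]


def jackknife_success_rate(dataset):
    """Leave-one-out test: hold out each sample once and predict it from the rest."""
    correct = 0
    for i, (features, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]  # every sample except the i-th
        if nearest_label(features, rest) == label:
            correct += 1
    return correct / len(dataset)


# Toy two-dimensional feature vectors standing in for per-sequence descriptors.
# These values are invented for illustration only.
toy = [
    ((0.10, 0.20), "high"),
    ((0.15, 0.25), "high"),
    ((0.12, 0.18), "high"),
    ((0.80, 0.90), "low"),
    ((0.85, 0.95), "low"),
    ((0.82, 0.88), "low"),
]

rate = jackknife_success_rate(toy)
print(f"jackknife success rate: {rate:.2%}")
```

Because every sample is tested exactly once and never appears in its own training set, the jackknife rate is deterministic for a given dataset, which is one reason it is commonly reported for small protein benchmark sets.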