We describe an automatic method for segmenting the polyproteins of viruses belonging to the family Potyviridae. It uses machine learning techniques to predict the cleavage sites that define the segments into which these polyproteins are cut during functional maturation. The segmentation application is publicly available on a website and can also be accessed through a web service interface. The prediction models have an average sensitivity of 0.79 and an average Matthews correlation coefficient of 0.23. The method correctly predicts the segmentation of sequences from the genera Potyvirus and Rymovirus, where "correctly" means coinciding with previously published segmentations. However, prediction accuracy degrades for atypical sequences and for viruses belonging to less common genera of the Potyviridae family. Future work will focus on making the method more flexible in these cases.
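For readers unfamiliar with the two reported metrics, the following minimal sketch shows how sensitivity and the Matthews correlation coefficient are computed from the counts of a binary confusion matrix, using their standard definitions. The function names and the example counts are illustrative assumptions and are not taken from the study's data or code.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true cleavage sites that were predicted: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when any marginal sum is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts (NOT from the study): few true sites, many non-sites.
    tp, fn, fp, tn = 8, 2, 90, 900
    print("sensitivity:", sensitivity(tp, fn))
    print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
```

Because cleavage sites are rare compared with non-cleavage positions, even a modest false-positive rate yields many false positives in absolute terms. A high sensitivity can therefore coexist with a much lower Matthews correlation coefficient, as the hypothetical counts above illustrate.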