Abstract

Various approaches have been developed to classify unknown protein sequences into their classes or families with a certain accuracy. Feature extraction from the protein sequence is a key step in all of these approaches, and N-gram encoding is a popular feature extraction method. To keep computational time low and classification accuracy high, however, an upper limit must be fixed for ‘N’ in the N-gram encoding method. In addition, the standard deviation of a protein sequence is one of the important feature values extracted by N-gram encoding. This feature can be computed in two ways: using the standard mean value or using the floating mean value. It is therefore also important to identify the proper method for calculating the standard deviation. This paper presents an experimental investigation to find the upper limit of N for the N-gram encoding method and to identify the proper technique for calculating the standard deviation as a feature extracted from an unknown protein sequence.
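As a rough illustration of the kind of feature discussed above, the following Python sketch counts overlapping N-grams in a sequence and takes the standard deviation of those counts about their ordinary arithmetic mean. The abstract does not define the exact feature or the "floating mean" variant, so everything here is an assumption: the function names `ngram_counts` and `ngram_count_stdev`, the use of overlapping windows, the choice of population standard deviation, and the example sequence are all hypothetical, and the floating-mean calculation is not attempted.

```python
from collections import Counter
from statistics import pstdev


def ngram_counts(sequence, n):
    """Count overlapping n-grams (windows of length n) in a sequence.

    Assumption: windows overlap and advance one residue at a time.
    """
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def ngram_count_stdev(sequence, n):
    """Population standard deviation of the n-gram counts.

    Assumption: deviations are taken about the ordinary arithmetic mean
    of the observed counts; this may differ from the paper's definition.
    """
    counts = list(ngram_counts(sequence, n).values())
    return pstdev(counts) if counts else 0.0


# Hypothetical example sequence, not taken from the paper.
seq = "MKVLAAGIVMKVL"
for n in (1, 2, 3):
    print(n, len(ngram_counts(seq, n)), ngram_count_stdev(seq, n))
```

With an alphabet of 20 amino acids there are up to 20^N possible N-grams, so the feature space grows quickly with N, which is one plausible reason an upper limit on N matters for computation time.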
