Abstract

We generalize the familiar position-specific score matrix (PSSM), also known as a weight matrix, by considering a log-odds score for (nonadjacent) k-tuple frequencies, with each k-tuple score weighted by the product of its mutual information and its statistical significance, the latter measured by a point estimator for the p-value of the mutual information. The performance of this new approach, along with that of other variants of generalized PSSM and profile methods, is measured by receiver operating characteristic (ROC) curves for the specific problem of signal peptide cleavage site recognition. The comparison additionally includes Vert's recent support vector machine string kernel, Brown's joint probability approximation algorithm, and the WAM method. Similar, though less extensive, algorithm comparisons are made for disulfide bond recognition. For signal peptide cleavage site recognition, the monoresidue PSSM is essentially competitive, within the limits of statistical significance, even with Vert's support vector machine kernel. For disulfide bond recognition, by contrast, diresidue and triresidue PSSM methods show improved performance over the monoresidue PSSM.
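To make the k = 2 (diresidue) case concrete, the following is a minimal Python sketch of a mutual-information-weighted log-odds score over pairs of window positions. It is an illustration under stated assumptions, not the scoring function used in the paper: the function names, the add-one pseudocounts, and the caller-supplied background frequencies are all hypothetical choices, and the significance factor derived from the p-value estimator is left as an optional `significance` table because the abstract does not specify its form.

```python
import math
from collections import Counter
from itertools import combinations


def pair_freqs(seqs, i, j, alphabet, pseudo=1.0):
    """Smoothed joint frequencies of the residues at positions i and j.

    The pseudocount scheme is an assumption made for this sketch.
    """
    counts = Counter((s[i], s[j]) for s in seqs)
    total = len(seqs) + pseudo * len(alphabet) ** 2
    return {(a, b): (counts[(a, b)] + pseudo) / total
            for a in alphabet for b in alphabet}


def mutual_information(pij, alphabet):
    """Mutual information (in bits) of a joint table, using its own marginals."""
    pi = {a: sum(pij[(a, b)] for b in alphabet) for a in alphabet}
    pj = {b: sum(pij[(a, b)] for a in alphabet) for b in alphabet}
    return sum(p * math.log2(p / (pi[a] * pj[b])) for (a, b), p in pij.items())


def train(seqs, alphabet, background, pseudo=1.0, significance=None):
    """Build a diresidue model from aligned, equal-length training windows.

    `significance` is a hypothetical dict mapping (i, j) to a weight factor;
    when omitted, each pair is weighted by its mutual information alone.
    """
    length = len(seqs[0])
    model = {}
    for i, j in combinations(range(length), 2):
        pij = pair_freqs(seqs, i, j, alphabet, pseudo)
        weight = mutual_information(pij, alphabet)
        if significance is not None:
            weight *= significance[(i, j)]
        # Log-odds of the observed pair frequency against the background.
        log_odds = {(a, b): math.log2(p / (background[a] * background[b]))
                    for (a, b), p in pij.items()}
        model[(i, j)] = (weight, log_odds)
    return model


def score(model, window):
    """Weighted sum of pair log-odds scores for one candidate window."""
    return sum(w * lo[(window[i], window[j])]
               for (i, j), (w, lo) in model.items())
```

In this sketch, position pairs that vary independently in the training windows receive a weight near zero, so only pairs that carry joint information contribute to the score of a candidate window.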
