Abstract

Reverse vaccinology aims to accelerate subunit vaccine design by rapidly predicting which proteins in a pathogenic bacterial proteome are putative protective antigens. Support vector machine classification is a machine learning approach that has been applied to numerous classification problems in the biological sciences but has not previously been incorporated into a reverse vaccinology approach. A training data set of 136 bacterial protective antigens paired with 136 non-antigens was constructed, and bioinformatic tools were used to annotate these data with predicted protein features, many of which are associated with antigenicity (i.e. extracellular localization, signal peptides and B-cell epitopes). The annotation was used to train support vector machine classifiers that exhibited a maximum accuracy of 92% for discriminating protective antigens from non-antigens, as assessed by a leave-tenth-out cross-validation approach. These accuracies were superior to those achieved when the training data were annotated with auto and cross covariance transformations of z-descriptors for hydrophobicity, molecular size and polarity, or when classification was performed using regression methods. To further validate the support vector machine classifiers, they were used to rank all the proteins in six bacterial proteomes by predicted antigenicity. Protective antigens from the training data were significantly recalled (enriched) in the top 75 ranked proteins for all six proteomes, as assessed by Fisher's exact test (p < 0.05). This paper describes a reverse vaccinology workflow that outperformed the alternative annotation and regression methods tested, and provides a benchmark training data set that can be used to evaluate future methodological improvements.
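
The enrichment check mentioned in the abstract can be illustrated with a minimal sketch of a one-sided Fisher's exact test, computed as the upper tail of a hypergeometric distribution. This is not the authors' code. The function name `enrichment_p_value` and all counts in the usage example (proteome size, number of known antigens, number recalled) are hypothetical and chosen only for illustration; only the cut-off of 75 top-ranked proteins comes from the abstract.

```python
from math import comb


def enrichment_p_value(proteome_size, known_antigens, top_n, antigens_in_top):
    """One-sided Fisher's exact test for enrichment (hypothetical helper).

    Returns the probability of finding at least `antigens_in_top` known
    antigens among `top_n` proteins drawn at random, without replacement,
    from a proteome of `proteome_size` proteins that contains
    `known_antigens` known antigens.
    """
    # Number of ways to choose any top_n proteins from the proteome.
    total = comb(proteome_size, top_n)
    # The count of antigens in the top set cannot exceed either bound.
    upper = min(known_antigens, top_n)
    # Sum the hypergeometric probabilities for k = antigens_in_top .. upper.
    tail = sum(
        comb(known_antigens, k) * comb(proteome_size - known_antigens, top_n - k)
        for k in range(antigens_in_top, upper + 1)
    )
    return tail / total


# Illustrative numbers only (assumptions, not results from the paper):
# a 2000-protein proteome with 10 known protective antigens, 4 of which
# appear among the 75 top-ranked proteins.
p = enrichment_p_value(proteome_size=2000, known_antigens=10,
                       top_n=75, antigens_in_top=4)
print(f"one-sided p-value: {p:.3g}")
```

Under these assumed counts, a p-value below 0.05 would indicate that the known antigens are recalled in the top-ranked set more often than expected by chance. Exact integer arithmetic with `math.comb` avoids the rounding problems that factorial-based formulas can introduce for proteome-sized inputs.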
