Abstract
Previous work has shown that binding of target proteins to a sparse, unbiased sample of all possible peptide sequences is sufficient to train a machine learning model that can then predict, with statistically high accuracy, target binding to any possible peptide sequence of similar length. Here, highly sequence-specific molecular recognition is explored by measuring binding of 8 monoclonal antibodies (mAbs) with specific linear cognate epitopes to an array containing 121,715 near-random sequences about 10 residues in length. Network models trained on resulting sequence-binding values are used to predict the binding of each mAb to its cognate sequence and to an in silico generated one million random sequences. The model always ranks the binding of the cognate sequence in the top 100 sequences, and for 6 of the 8 mAbs, the cognate sequence ranks in the top ten. Practically, this approach has potential utility in selecting highly specific mAbs for therapeutics or diagnostics. More fundamentally, this demonstrates that very sparse random sampling of a large amino acid sequence spaces is sufficient to generate comprehensive models predictive of highly specific molecular recognition.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.