Abstract

Motivation: Canonical forms of the antibody complementarity-determining regions (CDRs) were first described in 1987 and have been redefined on multiple occasions since. The canonical forms are often used to approximate the antibody binding site shape as they can be predicted from sequence. A rapid predictor would facilitate the annotation of CDR structures in the large amounts of repertoire data now becoming available from next generation sequencing experiments.

Results: SCALOP annotates CDR canonical forms for antibody sequences, supported by an auto-updating database to capture the latest cluster information. Its accuracy is comparable to that of a standard structural predictor but it is 800 times faster. The auto-updating nature of SCALOP ensures that it always attains the best possible coverage.

Availability and implementation: SCALOP is available as a web application and for download under a GPLv3 license at opig.stats.ox.ac.uk/webapps/scalop.

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • Antibodies are proteins of the immune system that bind to foreign molecules

  • The binding site is largely formed of six complementarity-determining regions (CDRs): three on each of the heavy and light chains

  • Canonical forms have been redefined in the literature many times, but each update has been a static snapshot of the available data

Introduction

Antibodies are proteins of the immune system that bind to foreign molecules. The binding site is largely formed of six complementarity-determining regions (CDRs): three on each of the heavy and light chains. Canonical forms have been redefined in the literature many times, but each update has been a static snapshot of the available data. These constant renewals illustrate how the growth of structural data continuously modifies our understanding of CDR loop structures, with 10 canonical forms in 1987 (Chothia and Lesk, 1987) and 26 by 2016 (Nowak et al., 2016). The most recently published method used a Gradient Boosting Machine to annotate CDR backbone conformations with up to 85.1% accuracy (Long et al., 2018). None of these tools uses an auto-updating database, and none provides both a web interface and a freely available software package for large-scale sequence analysis.

Each PSSM element score is defined as $M_{k,j} = \log\left(p_{k,j} / b_k\right)$, where $M_{k,j}$ is the element score, $p_{k,j}$ is the probability of observing an amino acid $k$ at the ANARCI-numbered position $j$ within the cluster and $b_k$ is the background probability of $k$ (Supplementary Material).
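The log-odds element score described above can be illustrated with a minimal Python sketch. It is not SCALOP's implementation: the function names, the uniform background, the pseudocount of 1 and the use of the natural logarithm are all assumptions made for illustration, and the toy sequences are invented rather than taken from any real cluster. The sketch also assumes that the input loops are already aligned to the same numbered positions, have equal length and contain no gap characters.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sequences, background=None, pseudocount=1.0):
    """Return one dict per position mapping amino acid k to log(p_kj / b_k).

    Assumptions (not from the source): uniform background when none is
    given, an additive pseudocount, and the natural logarithm.
    """
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    # Denominator for the smoothed frequency at every position.
    total = len(sequences) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for j in range(length):
        counts = Counter(seq[j] for seq in sequences)
        column = {}
        for aa in AMINO_ACIDS:
            p_kj = (counts[aa] + pseudocount) / total  # smoothed p_{k,j}
            column[aa] = math.log(p_kj / background[aa])  # M_{k,j}
        pssm.append(column)
    return pssm


def score(sequence, pssm):
    """Sum the element scores of a sequence against one cluster's PSSM."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence length does not match the PSSM")
    return sum(pssm[j][aa] for j, aa in enumerate(sequence))


# Invented toy "cluster" of three aligned loops, for illustration only.
toy_cluster = ["ARG", "ARG", "AKG"]
toy_pssm = build_pssm(toy_cluster)
best = max(["ARG", "WWW"], key=lambda s: score(s, toy_pssm))
```

In this sketch a query loop would be scored against the PSSM of every candidate cluster and assigned to the highest-scoring one; whether and how a score threshold is applied is not specified in this section, so none is shown.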

Cluster assignment
Algorithm
Benchmark
Findings
Building the PSSM
