Improving antibody language models with native pairing

Sarah M Burbach, Bryan Briney

doi:10.1016/j.patter.2024.100967

Abstract

Existing antibody language models are limited by their use of unpaired antibody sequence data. A recently published dataset of ∼1.6 × 10⁶ natively paired human antibody sequences offers a unique opportunity to evaluate how antibody language models are improved by training with native pairs. We trained three baseline antibody language models (BALM), using natively paired (BALM-paired), randomly paired (BALM-shuffled), or unpaired (BALM-unpaired) sequences from this dataset. To address the paucity of paired sequences, we additionally fine-tuned ESM (evolutionary scale modeling)-2 with natively paired antibody sequences (ft-ESM). We provide evidence that training with native pairs allows the model to learn immunologically relevant features that span the light and heavy chains, which cannot be simulated by training with random pairs. We additionally show that training with native pairs improves model performance on a variety of metrics, including the ability of the model to classify antibodies by pathogen specificity.
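
The three training conditions named in the abstract (natively paired, randomly paired, and unpaired) can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy sequences, the `SEP` separator token, and the function names `native_paired`, `shuffled_paired`, and `unpaired` are all assumptions made for illustration only.

```python
import random

# Hypothetical toy records: (heavy_chain, light_chain) fragments.
# Real inputs would be full-length antibody variable-region sequences.
pairs = [
    ("EVQLVESGG", "DIQMTQSPS"),
    ("QVQLQESGP", "EIVLTQSPG"),
    ("QVQLVQSGA", "QSVLTQPPS"),
]

SEP = "<sep>"  # assumed separator token, not taken from the paper


def native_paired(pairs):
    """Join each heavy chain with its own (native) light chain."""
    return [heavy + SEP + light for heavy, light in pairs]


def shuffled_paired(pairs, seed=0):
    """Join each heavy chain with a light chain drawn by random permutation.

    A plain shuffle can leave some native pairs intact by chance; a stricter
    control would reject permutations that contain fixed points.
    """
    rng = random.Random(seed)
    lights = [light for _, light in pairs]
    rng.shuffle(lights)
    return [heavy + SEP + light for (heavy, _), light in zip(pairs, lights)]


def unpaired(pairs):
    """Treat every heavy and light chain as a separate training example."""
    return [heavy for heavy, _ in pairs] + [light for _, light in pairs]
```

All three functions draw on the same underlying chains, so a comparison between models trained on their outputs isolates the effect of pairing information rather than of sequence content.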


Journal: Patterns (New York, N.Y.)	Publication Date: Apr 4, 2024
Citations: 9	License type: cc-by
