Abstract

Spider silks are remarkable materials characterized by superb mechanical properties such as strength, extensibility, and low weight. Yet, to date, few models are available to fully explore sequence‐property relationships for analysis and design. Here a custom generative large‐language model is proposed to enable the design of novel spider silk protein sequences that meet complex combinations of target mechanical properties. The model, pretrained on a large set of protein sequences, is fine‐tuned on ≈1,000 major ampullate spidroin (MaSp) sequences for which associated fiber‐level mechanical properties exist, to yield an end‐to‐end forward and inverse generative approach that is applied in a multi‐agent strategy. Performance is assessed through: 1) a novelty analysis and protein type classification for generated spidroin sequences through Basic Local Alignment Search Tool (BLAST) searches, 2) property evaluation and comparison with similar sequences, 3) comparison of resulting molecular structures, and 4) detailed sequence motif analyses. This work generates silk sequences with property combinations that do not exist in nature and develops a deeper understanding of the mechanistic roles of sequence patterns in achieving key mechanical properties (elastic modulus (E), strength, toughness, failure strain). The model provides an efficient approach to expand the silkome dataset, facilitating further sequence‐structure analyses of silks, and establishes a foundation for synthetic silk design and optimization.

