Nucleobase binding is a fundamental molecular recognition event central to modern biological and bioinspired supramolecular research. Underpinning this recognition is a deceptively simple hydrogen-bonding code, primarily based on the canonical nucleobases in DNA and RNA. Inspired by these biotic structures, chemists and biologists have designed abiotic hydrogen-bonding motifs that can interact with, augment, and reshape native molecular recognition, for applications ranging from genetic code expansion and nucleic acid recognition to supramolecular materials utilizing mono- and bifacial nucleobases. However, as the number of nucleobase-inspired motifs expands, the absence of a standard vocabulary to describe hydrogen bond (HB) patterns has led to a haphazard mixture of shorthand descriptors that are confusing and inconsistent. Alternative notations that specify individual HB sites (such as DAD for donor-acceptor-donor) are cumbersome for biological and supramolecular constructs that contain many such patterns. This situation creates a barrier to sharing and interpreting nucleobase-related research across sub-disciplines, hindering collaboration and innovation. In this perspective, we aim to initiate discourse on this issue by considering what would be needed to formulate a concise one-letter code for the HB patterns associated with synthetic nucleobases. We first summarize some of the issues caused by the current absence of a consistent naming scheme. Subsequently, we discuss some key considerations in designing a coherent naming system. Finally, we leverage chemical rationale and pedagogical mnemonic considerations to propose a succinct and intuitive one-letter code for supramolecular two- and three-HB motifs. We hope that this discussion will spark conversations within our interdisciplinary community, thereby facilitating collaboration and easing communication among researchers engaged in synthetic nucleobase design.