Abstract

A Coding DNA Sequence (CDS) is a fraction of DNA whose nucleotides are grouped into consecutive triplets called codons, each one encoding an amino acid. Because most amino acids can be encoded by more than one codon, the same amino acid chain can be obtained by a very large number of different CDSs. These synonymous CDSs show different features that, also depending on the organism the transcript is expressed in, could affect translational efficiency and yield. The identification of optimal CDSs with respect to given transcript indicators is in general a challenging task, but it has been observed in recent literature that integer linear programming (ILP) can be a very flexible and efficient way to achieve it. In this article, we add evidence to this observation by proposing a new ILP model that simultaneously optimizes different well-grounded indicators. With this model, we efficiently find solutions that dominate those returned by six existing codon optimization heuristics.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call