Abstract
Proteins are fundamental to life and exhibit a wide diversity of activities, some of which are toxic. Therefore, assessing whether a specific protein is safe for consumption in foods and feeds is critical. Simple BLAST searches may reveal homology to a known toxin even when the protein poses no real danger. A further challenge in making this assessment is the lack of curated databases containing a representative set of experimentally validated toxins. Here we have systematically analyzed over 10,000 manually curated toxin sequences using sequence clustering, network analysis, and protein domain classification. We also developed a functional sequence signature method to distinguish toxic from non-toxic proteins. The current database, combined with motif analysis, can be used by researchers and regulators in a hazard screening capacity to assess the potential of a protein to be toxic at early stages of development. Identifying key signatures of toxicity can also aid in redesigning proteins so that they maintain their desirable functions while posing a lower risk of health hazards.
Highlights
Proteins are fundamental to life and exhibit a wide diversity of activities, some of which are toxic
The amino acid sequence determines the three-dimensional structure and the biochemical function of the protein, but in many cases the specific determinants of the pathogenic effect are not known
The extensive data sets of amino acid sequences, three-dimensional structures, biochemical and biological functions of gene products in publicly available databases can be the basis for bioinformatics approaches to determine the potential risk of toxicity
Summary
Proteins are fundamental to life and exhibit a wide diversity of activities, some of which are toxic. The extensive data sets of amino acid sequences, three-dimensional structures, and biochemical and biological functions of gene products in publicly available databases can be the basis for bioinformatics approaches to determine the potential risk of toxicity. Proper cataloguing of these data, by discriminating the small proportion of proteins that are known toxins, is one part of an overall “weight-of-evidence” evaluation for the safety of GE products[4,5,15,16]. We show further that sequence alignments of the clustered toxins can establish structural and sequential motifs[18,19,20] for use in distinguishing toxins from their non-toxic homologues in the same PFAM class. Extending this classification and motif analysis to all known toxic proteins can aid in identifying possible mechanisms of toxicity during the first tier of hazard screening, and prevent potentially problematic proteins from entering the developmental pipeline.
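The idea of deriving a motif from aligned toxins and using it to separate toxins from non-toxic homologues can be illustrated with a minimal sketch. This is not the authors' signature method, which is not specified here. It assumes the sequences are already aligned to equal length, takes fully conserved alignment columns as the "signature", and scores a candidate by the fraction of those columns it matches. The function names, the `min_fraction` parameter, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def conserved_positions(aligned_seqs, min_fraction=1.0):
    """Return {column: residue} for alignment columns where a single
    non-gap residue occurs in at least min_fraction of the sequences."""
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")

    signature = {}
    for col in range(length):
        column = Counter(seq[col] for seq in aligned_seqs)
        residue, count = column.most_common(1)[0]
        # Skip columns dominated by gaps or without enough agreement.
        if residue != "-" and count / len(aligned_seqs) >= min_fraction:
            signature[col] = residue
    return signature


def signature_score(signature, aligned_candidate):
    """Fraction of signature columns at which the candidate carries
    the conserved residue (0.0 if the signature is empty)."""
    if not signature:
        return 0.0
    hits = sum(
        1 for col, residue in signature.items()
        if aligned_candidate[col] == residue
    )
    return hits / len(signature)


# Toy, made-up aligned sequences standing in for one toxin cluster.
toxin_cluster = ["ACDKCG-HW", "ACEKCGTHW", "ACDRCG-HW"]
signature = conserved_positions(toxin_cluster)

# A candidate sharing every conserved residue, and a hypothetical
# homologue that lacks two of them.
toxin_like = signature_score(signature, "ACDKCGTHW")
homologue = signature_score(signature, "ASDKAG-HW")
```

In this toy setting a candidate that retains all conserved residues scores higher than a homologue that has lost some of them. Any threshold for flagging a protein would need to be calibrated against experimentally validated toxins and non-toxic members of the same family, and a low score would be only one input to a weight-of-evidence evaluation.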