Abstract

Though the rhesus monkey is one of the most valuable non-human primate animal models for various human diseases because of its manageable size and its genetic and proteomic similarities with humans, proteomic research using rhesus monkeys remains challenging due to the lack of a complete protein sequence database and an effective strategy. To investigate the most effective and high-throughput proteomic strategy, comparative data analysis was performed employing various protein databases and search engines. The UniProt databases of monkey, human, bovine, rat and mouse were used for the comparative analysis, and a universal database with all protein sequences from all available species was also tested. At the same time, de novo sequencing was compared to the SEQUEST search algorithm to identify an optimal workflow for monkey proteomics. Employing the most effective strategy, proteomic profiling of monkey organs identified 3,481 proteins at 0.5% FDR from 9 male and 10 female tissues in an automated, high-throughput manner. Data are available via ProteomeXchange with identifier PXD001972. Based on the success of this alternative interpretation of MS data, the list of proteins identified from 12 organs of male and female subjects will benefit future rhesus monkey proteome research.

Highlights

  • Since the human genome project was completed in 2003, proteomics has become a powerful tool for understanding the large and global characteristics of proteins within a broad range of biomedical research platforms [1,2,3]

  • The Universal Protein Resource (UniProt) databases of human and Macaca mulatta were tested using the SEQUEST search algorithm (Fig 2A). Since the annotated FASTA database of Macaca mulatta has only 358 entries, the TrEMBL database was used for monkey

  • The TrEMBL UniProt database of Macaca mulatta contains over 70,000 entries. The SEQUEST search with the UniProt monkey database returned matches to 819 proteins, of which 488 were “uncharacterized proteins” because most of the entries have not yet been annotated
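
The annotation gap described in the highlights above (488 of 819 matches, roughly 60%, reported as “uncharacterized proteins”) can be measured directly from the FASTA headers of the matched entries. The following is a minimal sketch, not the authors' pipeline. It assumes headers in the standard UniProt layout (`>db|accession|entry_name Protein name OS=...`). The function names and the accessions in the usage example are hypothetical.

```python
from typing import Iterable


def protein_name(header: str) -> str:
    """Return the protein-name field of a UniProt-style FASTA header.

    Assumed layout: '>db|accession|entry_name Protein name OS=Organism ...'
    """
    # Drop the leading '>' and the 'db|accession|entry_name' token.
    _, _, description = header.lstrip(">").partition(" ")
    # The protein name runs up to the first ' OS=' (organism) field.
    name, _, _ = description.partition(" OS=")
    return name.strip()


def uncharacterized_fraction(headers: Iterable[str]) -> float:
    """Fraction of headers whose protein name starts with 'Uncharacterized protein'."""
    names = [protein_name(h) for h in headers]
    if not names:
        return 0.0
    hits = sum(1 for n in names if n.lower().startswith("uncharacterized protein"))
    return hits / len(names)


# Hypothetical headers, for illustration only (accessions are made up).
example_headers = [
    ">tr|X0000A|X0000A_MACMU Uncharacterized protein OS=Macaca mulatta OX=9544 PE=4 SV=1",
    ">tr|X0000B|X0000B_MACMU Uncharacterized protein OS=Macaca mulatta OX=9544 GN=ABC1 PE=4 SV=1",
    ">tr|X0000C|X0000C_MACMU Serum albumin OS=Macaca mulatta OX=9544 GN=ALB PE=2 SV=1",
]
fraction = uncharacterized_fraction(example_headers)
```

Running the same count over the headers of the proteins matched in a search would give a quick measure of how much of a result set lacks usable annotation, which is the motivation for searching against better-annotated databases of related species.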

Summary

Introduction

Since the human genome project was completed in 2003, proteomics has become a powerful tool for understanding the large and global characteristics of proteins within a broad range of biomedical research platforms [1,2,3]. The current approach to overcoming the challenge of incomplete or uncertain protein databases is to use redundant whole-proteome databases from the National Center for Biotechnology Information (NCBI) and/or the Universal Protein Resource (UniProt). This approach requires an extensive amount of time and a high level of computational performance to deal with comparative MS data interpretation of over 3 million protein sequence entries. The de novo peptide sequencing strategy was introduced as a promising methodology for interpretation of LC-MS/MS data from unknown species. Current software such as DeNoS [8], Lutefisk [9] and PEAKS [10] does not yet support a fully automated search function, so it eventually requires much more time than automated database search engines.

