A Biomedically Enriched Collection of 7000 Human ORF Clones

Andreas Rolfs, Munira M A Baqui, Daniel A Jepson, Lars Ebert, Yanhui Hu, Bernhard Korn, Joshua Labaer, Dietmar Hoffmann, Binghua Shen, Dongmei Zuo, Elena Taycher, Joseph Pearlberg, Craig Deloughery, Jacob Raphael, Niro Ramachandran, Andreas Hoerlein, Fontina Kelley, Seamus Mccarron

doi:10.1371/journal.pone.0001528


Open Access

https://doi.org/10.1371/journal.pone.0001528


Abstract

We report the production and availability of over 7000 fully sequence-verified plasmid ORF clones representing over 3400 unique human genes. These ORF clones were derived using the human MGC collection as template and were produced in two formats: with and without stop codons. Thus, this collection supports the production of either native protein or proteins with fusion tags added to either or both ends. The template clones used to generate this collection were enriched in three ways. First, gene redundancy was removed. Second, clones were selected to represent the best available GenBank reference sequence. Finally, a literature-based software tool was used to evaluate the list of target genes to ensure that it broadly reflected biomedical research interests. The target gene list was compared with 4000 human diseases and over 8500 biological and chemical MeSH classes in ~15 million publications recorded in PubMed at the time of analysis. The analysis revealed that, relative to the genome and the MGC collection, this collection is enriched for genes with published associations with a wide range of diseases and biomedical terms, without displaying a particular bias towards any single disease or concept. Thus, this collection is likely to be a powerful resource for researchers who wish to study protein function in a set of genes with documented biomedical significance.

Highlights

The study of protein function often demands high-quality plasmid clones that contain the relevant open reading frames (ORFs) in a format compatible with protein expression.
To make the most useful ORF clone set from the Mammalian Gene Collection (MGC) clones, we wished to select an enriched set of genes that is of particular interest to both medicine and biology.
The result of this query was compared with queries using either all unique genes represented in MGC or all ~33,000 human genes listed at the time in LocusLink (2004; now Entrez Gene [15]).
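
The comparison described above (the target gene list against all MGC genes and against all genes in the genome) can be illustrated with a minimal sketch. This is not the literature-based software tool used in the study. It is a toy calculation, assuming that each gene has already been mapped to the set of disease or MeSH terms it co-occurs with in the literature. The gene names, term names, and the function names `association_fraction` and `fold_enrichment` are hypothetical and chosen only for illustration.

```python
def association_fraction(genes, gene_to_terms):
    """Return the fraction of genes linked to at least one term.

    `gene_to_terms` maps a gene symbol to a set of associated terms;
    genes missing from the mapping count as having no association.
    """
    unique_genes = set(genes)
    if not unique_genes:
        return 0.0
    hits = sum(1 for gene in unique_genes if gene_to_terms.get(gene))
    return hits / len(unique_genes)


def fold_enrichment(subset, background, gene_to_terms):
    """Ratio of the subset's association fraction to the background's."""
    background_fraction = association_fraction(background, gene_to_terms)
    if background_fraction == 0:
        raise ValueError("background has no associated genes")
    return association_fraction(subset, gene_to_terms) / background_fraction


# Hypothetical toy data; not taken from the paper.
gene_to_terms = {
    "GENE_A": {"Disease 1"},
    "GENE_B": {"Disease 2", "Disease 3"},
    "GENE_C": set(),
    "GENE_D": {"Disease 1"},
}
genome = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"]
mgc = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
collection = ["GENE_A", "GENE_B"]

for name, background in (("genome", genome), ("MGC", mgc)):
    ratio = fold_enrichment(collection, background, gene_to_terms)
    print(f"collection vs {name}: {ratio:.2f}-fold")
```

A ratio above 1 means the subset contains proportionally more literature-associated genes than the background. Checking for the absence of bias towards a single disease would require repeating the calculation per term, which this sketch does not do.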

Summary

Introduction

The study of protein function often demands high-quality plasmid clones that contain the relevant open reading frames (ORFs) in a format compatible with protein expression. High-throughput methods have created the demand for clones that encode a class of proteins of interest or the entire proteome of a species. To avoid erroneous or ambiguous results regarding the expressed proteins, it is important that the plasmids are clonal isolates that are fully sequence verified. For many eukaryotic species, including humans, the number of protein-coding sequences exceeds 15,000 genes, making the production of comprehensive sequence-verified ORF clone collections daunting and expensive. One strategy is for researchers to focus on one or more meaningful subsets of genes for functional studies relevant to the biological questions they wish to address. For a human ORF collection, the criteria for selecting genes are mostly driven by researchers’ interests and clone availability, often resulting in either collections of special interest [4] [5] or more ‘random’ lists of genes in collections (RZPD, Invitrogen).


Journal: PLoS ONE	Publication Date: Jan 30, 2008
Citations: 44	License type: CC BY 4.0
