Publishing re-usable phylogenetic trees, in theory and practice

Brian O’Meara, Jamie Whitacre, Dan Rosauer, Ross Mounce, Arlin Stoltzfus, Rutger Vos

doi:10.1038/npre.2011.6048.1

Abstract

Sharing and re-use of data are essential to the progressive and self-correcting nature of science. In recognition of this principle, journals and funding agencies have adopted policies to encourage sharing of information ('data'), including empirical data as well as computed inferences such as phylogenetic trees. Here we summarize an ongoing analysis of 1) current practices for sharing phylogenetic trees and associated data; 2) current barriers to effective sharing and re-use of such data; and 3) prospects for reducing these barriers to promote more widespread sharing and re-use. Currently, the technical infrastructure is available to support (with some limitations) rudimentary archiving in conjunction with manuscript publication. Yet, most published trees are not archived, and there is no community standard governing the recommended format or content to ensure a re-usable phylogenetic record. Without a shift in emphasis toward re-usability, along with technology and standards to support such a shift, the value of trees (whether disseminated via public archives, or by other means) will be limited. Interviews with actual or potential secondary consumers of phylogenetic results suggest that there is a considerable market for re-use, but that most attempts end in disappointment.
Phylogenetic results available via author requests, journal web sites, archival repositories and project web sites rarely include the critical information that secondary consumers seek, such as unique identifiers for biological sources (including species sources and accession numbers), indicators of quality, and documentation of the analytical methods used to obtain the results. Based on the analysis presented here, we suggest that enabling effective re-use entails a commitment by the research community to several changes from current practice: 1) using globally unique identifiers (GUIDs) to reference informational and material entities; 2) developing and using technology for documenting and exchanging the metadata that facilitate re-use; and 3) supporting development and use of a minimal reporting standard that indicates what data and metadata are considered essential for a re-usable phylogenetic record. We suggest that re-use may be catalyzed most rapidly by identifying and targeting (with appropriate technology) the most promising circumstances for re-use. These might include the extraction of sub-trees from large trees (for use in reconciliation, classification, and comparative analysis); the re-use of seed alignments, sub-alignments and homologized characters; the linking of phylogenies to geographic information (for use in ecology, phylogeography and biogeography); and the construction of supertrees and supermatrices.

Highlights

We learned some fascinating things from talking to scientific users directly about their experiences with data re-use (list)

Summary


What we have learned from users suggests that most attempts to discover, access and re-use comparative data and trees end in disappointment. My co-authors and I are part of a loose network of people, in which NESCent plays a major role, interested in facilitating re-use of comparative data and trees. We’ve been doing several things to try to understand the cycle of re-use and how to enhance it (list). The results of this are available in a draft report. What I’m going to talk about today is assessing user needs and practices: the human aspects of re-use, rather than the technology development aspect. Today I’m going to talk about an analysis of data re-use and archiving done by Brian O’Meara and myself. We read each paper and looked for generation, re-use, or archiving of comparative data and trees. Of 40 recent papers with “phylogeny” in the title that created new trees: Archiving of phylogenies

Journal TreeBase

Future practice?

Signs of hope

Journal: Nature Precedings
Publication Date: Jun 22, 2011
License type: CC BY 3.0
