Abstract
A wide variety of data sets produced by individual investigators are now synthesized to address ecological questions that span a range of spatial and temporal scales. It is important to facilitate such syntheses so that "consumers" of data sets can be confident that both input data sets and synthetic products are reliable. Necessary documentation to ensure the reliability and validation of data sets includes both familiar descriptive metadata and formal documentation of the scientific processes used (i.e., process metadata) to produce usable data sets from collections of raw data. Such documentation is complex and difficult to construct, so it is important to help "producers" create reliable data sets and to facilitate their creation of required metadata. We describe a formal representation, an "analytic web," that aids both producers and consumers of data sets by providing complete and precise definitions of scientific processes used to process raw and derived data sets. The formalisms used to define analytic webs are adaptations of those used in software engineering, and they provide a novel and effective support system for both the synthesis and the validation of ecological data sets. We illustrate the utility of an analytic web as an aid to producing synthetic data sets through a worked example: the synthesis of long-term measurements of whole-ecosystem carbon exchange. Analytic webs are also useful validation aids for consumers because they support the concurrent construction of a complete, Internet-accessible audit trail of the analytic processes used in the synthesis of the data sets. Finally, we describe our early efforts to evaluate these ideas through the use of a prototype software tool, SciWalker. We indicate how this tool has been used to create analytic webs tailored to specific data-set synthesis and validation activities, and suggest extensions to it that will support additional forms of validation.
The process metadata created by SciWalker is readily adapted for inclusion in Ecological Metadata Language (EML) files.
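To make the idea of process metadata and an audit trail concrete, the following is a minimal sketch, not SciWalker's actual representation. It assumes that each derived data set is recorded together with the process step and the input data sets that produced it, so that the full chain of processing can be traced back to the raw data. The class names (`ProcessStep`, `DerivationGraph`), the step names (`quality_filter`, `gap_fill`, `annual_sum`), and the data-set names are all hypothetical and chosen only to echo the carbon-exchange example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProcessStep:
    """One recorded application of a process (hypothetical record layout)."""
    name: str      # name of the process that was applied
    inputs: tuple  # names of the data sets it consumed
    output: str    # name of the data set it produced


@dataclass
class DerivationGraph:
    """Maps each derived data set to the step that produced it."""
    steps: dict = field(default_factory=dict)

    def record(self, name, inputs, output):
        # Each derived data set may have only one recorded derivation.
        if output in self.steps:
            raise ValueError(f"{output!r} already has a recorded derivation")
        self.steps[output] = ProcessStep(name, tuple(inputs), output)

    def audit_trail(self, dataset):
        """Return the steps that led to `dataset`, upstream steps first."""
        trail, seen = [], set()

        def visit(ds):
            step = self.steps.get(ds)
            if step is None or ds in seen:  # raw data set, or already visited
                return
            seen.add(ds)
            for parent in step.inputs:
                visit(parent)
            trail.append(step)

        visit(dataset)
        return trail


# Illustrative (invented) processing chain for a carbon-exchange synthesis.
graph = DerivationGraph()
graph.record("quality_filter", ["raw_flux"], "filtered_flux")
graph.record("gap_fill", ["filtered_flux", "raw_weather"], "filled_flux")
graph.record("annual_sum", ["filled_flux"], "annual_carbon_exchange")

trail_names = [s.name for s in graph.audit_trail("annual_carbon_exchange")]
print(trail_names)  # ['quality_filter', 'gap_fill', 'annual_sum']
```

A consumer who receives `annual_carbon_exchange` could call `audit_trail` to see every step between the raw measurements and the synthetic product; raw data sets, which have no recorded derivation, return an empty trail.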
Highlights
Examining complex questions, integrating information from a variety of disciplines, and testing hypotheses at multiple spatial and temporal scales account for an increasing proportion of ecological and environmental research (e.g., Michener et al. 2001, Andelman et al. 2004).
We can imagine a wide range of possible details that could be incorporated into process metadata, and so we offer no hard-and-fast requirements for them.
Because Ecological Metadata Language (EML) currently lacks formal specifications for describing analytical processes, we suggest that augmenting EML with process metadata will significantly enhance the reliability of documented ecological data sets.
Summary
Examining complex questions, integrating information from a variety of disciplines, and testing hypotheses at multiple spatial and temporal scales account for an increasing proportion of ecological and environmental research (e.g., Michener et al. 2001, Andelman et al. 2004). Such syntheses can help to identify and address the "big" ecological questions (Lubchenco et al. 1991, Belovsky et al. 2004) and contribute to the setting of local, regional, national, and global environmental policies (e.g., Schemske et al. 1994, IPCC 2001, Kareiva 2002).